
Page 1: CSE3601 CSE 360: Introduction to Computer Systems Course Notes Bettina Bair (bbair@cse.osu.edu)  bbair.


CSE 360: Introduction to Computer Systems

Course Notes

Bettina Bair (bbair@cse.osu.edu)

http://carmen.osu.edu

http://www.cse.ohio-state.edu/~bbair

Copyright © 1998-2006 by Rick Parent, Todd Whittaker, Bettina Bair, Pete Ware, Wayne Heym


Section Details

MTWF 9:30 & 2:30, DL 305
Bettina Bair (bbair@cse.osu.edu)
Homepage:
– http://www.cse.ohio-state.edu/~bbair
Office: Dreese Labs 493
Hours: MW 10:30, TF 1:30
– or by appointment
Phone: 292-2565
Grader:
– Hamid Ettefagh ([email protected])
– John Colvin ([email protected])


Topics of Discussion

Course description
Required texts
Policies
Syllabus
Expectations


Description:

Introduction to computer architecture at the machine language and assembly language level; assembly language programming and lab.

Prerequisites: CSE 214 or 222 or H222


Text:

1. Computer Systems: Architecture, Organization, and Programming, Arthur B. Maccabe, Irwin, 1993.

2. Sparc Architecture, Assembly Language Programming & C, Richard Paul, Prentice Hall – a good reference, if you are interested

3. Class handouts

4. Material online at http://carmen.osu.edu


Grading Policy:

An assigned grader will grade all homeworks and labs; your lecturer will grade all exams.
Missed assignments or tests without prior approval will receive a grade of zero.
Reasonable excuses must be given in writing to me one week prior to the due date or test date, at which time the circumstances will be evaluated, and approval granted or rejected.
No late homeworks or labs will be accepted.
Exams are closed book, closed notes, and cover all of the material up to that point.


Grading Weights:

Homeworks (6) 25% as assigned

Labs (3) 25% as assigned

Midterm 20% around the 6th week

Final 30% as indicated in master schedule

Grading Scale - to be determined


Students with Disabilities

If you need an accommodation based on the impact of a disability, please contact me to arrange an appointment as soon as possible.

Office for Disability Services
– Verifies the need for accommodations
– Helps develop accommodation strategies.

If you have not previously contacted the Office for Disability Services, I encourage you to do so.


Academic Misconduct

Academic misconduct is defined as any activity which tends to compromise the academic integrity of the institution, or subvert the educational process.

University policy requires that all cases of suspected academic misconduct be submitted to the Committee for Academic Misconduct for a hearing and evaluation.
– Any academic misconduct will be dealt with via the appropriate University authorities.


Academic Misconduct

Homework, lab assignments, and exams are to be your own work.

High-level discussion of assignments is encouraged, but the more specific your discussion, the closer you come to cheating.
– The policy on collaboration with others is fairly liberal, but please don't be tempted to test its limits.


Academic Misconduct

You may not write or otherwise record any part of your solution to an assignment while someone is helping you.

You may not take a physical or electronic copy of any part of a solution to an assignment from anyone.

You may not give a physical or electronic copy of any part of a solution to an assignment to anyone.


Academic Misconduct

You are encouraged to talk with others (especially others in the class) about the design, logic, and implementation of a program.
– Do not give anyone or take from anyone written or recorded material
– Do write up your own solution without assistance.

Professional ethics:
– You may not turn in an assignment solution from a previous quarter's offering of the course


Expectations

Read your e-mail
Read, reply to the class discussion group on Carmen
Attend class (it’s correlated to results!)
Complete homeworks and labs on time
Read the assigned pages from the text


Can I change my section?

Not until Brutus updates – at the end of the first week

– only if there are seats available.

Priority will be given to:
– CSE Majors that are Graduating Seniors

– CSE Majors

– People who attend class the first week


Can I work on assignments from home?

Submission via Carmen “dropbox”
HW: MS Word, PDF, or text format
Labs:
– Submitted as text formatted source (*.s) file
– Require access to ISEM application
Available through your CSE account: stdsun.cse.ohio-state.edu
– SSH, telnet and file transfer (ftp) protocols are useful
– Read more about remote access on Carmen
ISEM may also be available online – where? How? I don’t know.


Who do I approach if I have a problem with grading?

For labs and homework, contact your grader first
– See me if not resolved

For exams, contact me


The Carmen Discussion Group

carmen.osu.edu
It’s a place for students to discuss issues related to course work.
Post any questions you might have.
Use discretion when making a posting.
Look out for important announcements.
Instructors/Graders answer questions whenever they can.


Course Objectives

Principles of Computer Organization and Architecture
– Basic Machine Representation of Signed Integers, Character Strings, Arrays, Stacks, Records, Linked Lists;
Assembly Language Programming.
– Fundamentals of Computer Instruction Set Architectures;
– Low Level Algorithms for Data Manipulation and Conversion and Parameter Passing


150+ Years of Amazing Computers

Sherman, set the WABAC Machine to the year 1822…


Babbage’s Difference Engine, 1822

Babbage's difference engine No. 2, finally built in 1991

Could hold 8 numbers of 31 decimal digits

Could tabulate 7th degree polynomials


Ada Lovelace, the first programmer

Mathematician, Patron

Wrote a program for Babbage’s (theoretical) Analytical Engine to calculate the Bernoulli sequence, in 1843

In 1979, a contemporary programming language was named Ada in her honour.


1890: Hollerith Tabulating System

Census Counter
Hollerith Tabulating System Was A System Of Machines
– Punch
– Tabulator
– Sorting Box

Hollerith's Business Joined A Firm That Later Became IBM.


1943-45: Eniac

Electronic Numerical Integrator And Computer
Built To Compute Ballistics Tables For U.S. Army Artillery During World War II.
– 1,000 Times Faster Than Any Existing Device.
External Plug Wires Used To Program The Machine
Principal Designers, J. Presper Eckert And John Mauchly
Cost, About $400,000


Vacuum Tubes

ENIAC
– Used Some 18,000 Vacuum Tubes.

– 30 Feet By 50 Feet

– Weighed 30 Tons

The ENIAC was a decimal machine!


Programming the Eniac


Original Eniac Programmers


The Bug

In 1947, engineers found a moth stuck in one of the components.
Taped it in their logbook
Labeled it "first actual case of bug being found."


Grace Hopper (1906-1992)

1953: Invented The Compiler
– Translates English Language Instructions Into Language Of The Target Computer
– "Lazy" And Hoped That "The Programmer May Return To Being A Mathematician."

Led To The Development Of The Business Language Cobol.

Retired From The U.S. Navy As A Rear Admiral.


IAS (1946-1952)

Institute For Advanced Study In Princeton, New Jersey.

Designed And Directed By John Von Neumann. 

Cost: Several Hundred Thousand Dollars. 

Used externally stored programs that could be loaded and executed.


1949: Core Memory

A Small Ring, Or Core, Of Ferrite (A Ferromagnetic Ceramic) Can Be Magnetized In Either Of Two Opposite Directions.

A Core Can Be Used For Storing One Bit Of Information.

For Almost 15 Years, 'Core' Was The Most Important Memory Device.

The Invention Of Core Memory Was A Leap Forward In Cost-effectiveness And Reliability.


1950s Assembly Programming Class

This would be so much easier with a computer…


1965: PDP8

Programmed Data Processor
50,000+ Sold
Cost: $18,000.
Speed: 1.5 Micro-second Cycle Time
Primary Memory: 4K
– 12-bit Word Core Memory
Power: 780 Watts

What does cycle time mean?


1960s/70s Card Reader

Card is pre-printed with FORTRAN field layouts


1977: Trs-80

Radio Shack "Trash-80," 4K Of Memory
Could Not Handle Lowercase Letters
Only Three Error Messages:
– "HOW?" Whenever The User Tried To Perform An Illegal Function
– "What" When A Syntax Error Occurred
– "Sorry" When The Available Memory Ran Out
Cost Only $400!
Some 55,000 Machines Sold In First Year


1979: Vic-20

Processor Speed: 1.0227 MHz.
ROM: 16 KB
RAM: 5 KB (3.5 KB User Memory)
– Expandable To 32 KB.
Screen: 22 Columns By 23 Rows.
– Character Dot Matrix: 8 By 8 Or 8 By 16 (User Programmable).
– Screen Dot Matrix: 176 By 184 With Up To 16 Colors.

Sound: 3 Voices Plus White Noise.

Media: Tape Drive


1984: Macintosh

Revolutionary Graphical User Interface (GUI).
– A Device Called A Mouse

– Pictorial Symbols (Icons) On The Screen.

– Select Commands, Call Up Files, Start Programs, Etc.

Original Selling Price: $2,495


What if you had to build your own computer – from scratch?


Course Objectives

Understanding the architecture (how the computer executes assembly language instructions) is the more important aspect of a course at this level.

The fundamental concept to understand is that everything in the computer is represented by ones and zeros (by electric current flowing or not flowing at a specific place, or by something being magnetized one direction or the other, etc.).


Course Objectives

At the lowest level, this course will cover various binary formats of assembly language instructions and various ways in which data can be represented using ones and zeros and how these can be organized into a program.

At high levels, assembly language programming techniques will be studied and a specific assembly language will be used to illustrate these techniques.


Homework #0-0

Log into Carmen
See if you can find the following:

– Contact information for your instructor.

– Course policy on late assignments

– Course notes (slides)

– Reading assignment for the second class-meeting

– Dropbox and deadline for first homework

– Story of Mel, A Real Programmer in the discussion group


Homework #0-1

Purchase the textbook written by Maccabe.
Read the assigned material for the week
Pledge to do the reading assignment before each class meeting.


Homework #0-10

Log in to your CS unix account, on stdsun.cse.ohio-state.edu.
Your default password is the last four digits of your social security number followed by your first and last initials.

– For example, Luke Skywalker, whose social security number is 123-45-6789, has a password of 6789ls.

In a CSE laboratory room, you will have to log in to the Windows PC first.

– Your initial password there is the same as for UNIX except that it has an additional exclamation mark (‘!’) at the end. Luke Skywalker’s initial Windows password is 6789ls!


Make a Table on an Index Card

Show Different Representations of Numeric Values.  – Column Headings Should be:

Decimal Octal Hexadecimal Binary


One Row for Each Numeric Value. 

Show, in Increasing Order,
– Representations for 0, 1, 2, 3, 4, … 20
– Then, $2^5$, $2^6$, … $2^{16}$
– Finally $2^{20}$, $2^{30}$, $2^{31}$, $2^{32}$


For Example,

Decimal  Octal  Hex  Binary  Note      Roman  Nat’l Lang
0        0      0    0                        zero
1        1      1    1       $2^0$     I      one
2        2      2    10      $2^1$     II     two
And so on.
20       24     14   10100             XX     Twenty
32       40     20   100000  $2^5$     XXXII  ..
And so on.
                             $2^{16}$
                             $2^{20}$
                             $2^{30}$
                             $2^{31}$
                             $2^{32}$


Information Representation 1

Positional Number Systems: position of character in string indicates a power of the base (radix). Common bases: 2, 8, 10, 16. (What base are we using to express the names of these bases?)
– Base ten (decimal): digits 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 form the alphabet of the decimal system.
E.g., $316_{10}$ =
– Base eight (octal): digits 0, 1, 2, 3, 4, 5, 6, 7 form the alphabet.
E.g., $474_8$ =


Information Representation 2

– Base 16 (hexadecimal): digits 0-9 and A-F.
E.g., $13C_{16}$ =
– Base 2 (binary): digits (called “bits”) 0, 1 form the alphabet.
E.g., 100110 =
– In general, radix r representations use the first r chars in {0…9, A...Z} and have the form $d_{n-1}d_{n-2}\ldots d_1d_0$. Summing $d_{n-1}r^{n-1} + d_{n-2}r^{n-2} + \cdots + d_0r^0$ will convert to base 10. Why to base 10?
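The positional sum can be sketched in a few lines of Python. This is only an illustration: the names `DIGITS` and `to_decimal` are made up for this example, and the loop uses Horner's rule, which computes the same sum without forming each power of r separately.

```python
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def to_decimal(numeral: str, radix: int) -> int:
    """Evaluate d_{n-1} r^{n-1} + ... + d_0 r^0 for the digits of `numeral`."""
    if not 2 <= radix <= len(DIGITS):
        raise ValueError("radix must be between 2 and 36")
    value = 0
    for ch in numeral.upper():
        digit = DIGITS.find(ch)
        if digit < 0 or digit >= radix:
            raise ValueError(f"{ch!r} is not a digit in base {radix}")
        # Horner's rule: shift the running value one position left, then add the digit.
        value = value * radix + digit
    return value


print(to_decimal("474", 8))   # 316
print(to_decimal("13C", 16))  # 316
```

Python's built-in `int("474", 8)` performs the same conversion; the loop is spelled out here only to mirror the formula.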


Information Representation 3

Base Conversions
– Convert to base 10 by multiplication of powers
E.g., $10012_5 = (\quad)_{10}$
– Convert from base 10 by repeated division
E.g., $632_{10} = (\quad)_8$
– Converting base x to base y: convert base x to base 10 then convert base 10 to base y
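Repeated division can be sketched as follows. The helper names `from_decimal` and `convert` are hypothetical; `convert` leans on Python's built-in `int(text, base)` for the "to base 10" half.

```python
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def from_decimal(value: int, radix: int) -> str:
    """Convert a non-negative integer to `radix` by repeated division."""
    if value < 0:
        raise ValueError("only non-negative values are handled in this sketch")
    if not 2 <= radix <= len(DIGITS):
        raise ValueError("radix must be between 2 and 36")
    if value == 0:
        return "0"
    remainders = []
    while value > 0:
        value, remainder = divmod(value, radix)
        remainders.append(DIGITS[remainder])  # least significant digit comes out first
    return "".join(reversed(remainders))


def convert(numeral: str, from_radix: int, to_radix: int) -> str:
    """Base x to base y, going through an ordinary integer in between."""
    return from_decimal(int(numeral, from_radix), to_radix)


print(from_decimal(632, 8))     # 1170
print(convert("10012", 5, 8))   # 1170
```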


Information Representation 4

– Special case: converting among binary, octal, and hexadecimal is easier

Go through the binary representation, grouping in sets of 3 or 4.
E.g., $11011001_2$ = 11 011 001 = $331_8$
$11011001_2$ = 1101 1001 = $D9_{16}$
E.g., $C3B_{16} = (\quad)_8$
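The grouping shortcut might look like this in Python. The function names `regroup` and `hex_to_octal` are invented for the sketch; bits are padded on the left so that groups are always formed from the right-hand end.

```python
def regroup(bits: str, group: int) -> str:
    """Binary string to octal (group=3) or hexadecimal (group=4) by grouping bits."""
    if group not in (3, 4):
        raise ValueError("group must be 3 (octal) or 4 (hexadecimal)")
    bits = bits.replace(" ", "")
    bits = "0" * (-len(bits) % group) + bits  # pad on the left to a whole number of groups
    digits = "0123456789ABCDEF"
    return "".join(digits[int(bits[i:i + group], 2)] for i in range(0, len(bits), group))


def hex_to_octal(hex_digits: str) -> str:
    """Expand each hex digit to 4 bits, then regroup the bits in threes."""
    bits = "".join(format(int(h, 16), "04b") for h in hex_digits)
    return regroup(bits, 3).lstrip("0") or "0"


print(regroup("11011001", 3))  # 331
print(regroup("11011001", 4))  # D9
print(hex_to_octal("C3B"))     # 6073
```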


Information Representation 5 What is special about binary?

– The basic component of a computer system is a transistor (transfer resistor): a two state device which switches between logical “1” and “0” (actually represented as voltages on the range 5V to 0V).

– Octal and hexadecimal are bases in powers of 2, and are used as a shorthand way of writing binary. A hexadecimal digit represents 4 bits, half of a byte. 1 byte = 8 bits. A bit is a binary digit.

– Get comfortable converting among decimal, binary, octal, hexadecimal. Converting from decimal to hexadecimal (or binary) is easier going through octal.


Information Representation 6

Binary Hex Decimal Binary Hex Decimal

0000 0 0 1000 8 8

0001 1 1 1001 9 9

0010 2 2 1010 A 10

0011 3 3 1011 B 11

0100 4 4 1100 C 12

0101 5 5 1101 D 13

0110 6 6 1110 E 14

0111 7 7 1111 F 15


Information Representation 7

Ranges of values
– Q: Given k positions in base n, how many values can you represent?
– A: $n^k$ values over the range $(0 \ldots n^k-1)_{10}$
n=10, k=3: $10^3=1000$, range is $(0 \ldots 999)_{10}$
n=2, k=8: $2^8=256$, range is $(0 \ldots 255)_{10}$
n=16, k=4: $16^4=65536$, range is $(0 \ldots 65535)_{10}$

– Q: How are negative numbers represented?


Information Representation 8

Integer representation:
– Value and representation are distinct. E.g., 12 may be represented as XII, $C_{16}$, $12_{10}$, and $1100_2$. Note: -12 may be represented as $-C_{16}$, $-12_{10}$, and $-1100_2$.
– Simple and efficient use of hardware implies using a specific number of bits, e.g., a 32-bit string, in a binary encoding. Such an encoding is “fixed width.”
– Four methods: (fixed-width) simple binary, signed magnitude, binary coded decimal, and 2’s complement.
– Simple binary: as seen before, all numbers are assumed to be positive, e.g., 8-bit representation of $66_{10} = 0100\,0010_2$ and $194_{10} = 1100\,0010_2$


Information Representation 9

– Signed magnitude: simple binary with leading sign bit. 0 = positive, 1 = negative. E.g., 8-bit signed mag.:
$66_{10} = 0100\,0010_2$
$-66_{10} = 1100\,0010_2$

What ranges of numbers may be expressed in 8 bits?

Largest:

Smallest:

Extend 1100 0010 to 12 bits:
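One way to experiment with these questions is a small signed-magnitude helper. This is a sketch with made-up function names; it treats bit patterns as strings of '0' and '1' for readability.

```python
def to_signed_magnitude(value: int, width: int) -> str:
    """Sign bit followed by the magnitude in width-1 bits."""
    limit = 2 ** (width - 1) - 1  # largest magnitude that fits
    if abs(value) > limit:
        raise ValueError(f"{value} does not fit in {width}-bit signed magnitude")
    sign = "1" if value < 0 else "0"
    return sign + format(abs(value), f"0{width - 1}b")


def from_signed_magnitude(bits: str) -> int:
    magnitude = int(bits[1:], 2)
    return -magnitude if bits[0] == "1" else magnitude


def extend_signed_magnitude(bits: str, new_width: int) -> str:
    """Keep the sign bit in front and pad the magnitude with zeros."""
    return bits[0] + bits[1:].rjust(new_width - 1, "0")


print(to_signed_magnitude(66, 8))               # 01000010
print(to_signed_magnitude(-66, 8))              # 11000010
print(extend_signed_magnitude("11000010", 12))  # 100001000010
```

With 8 bits the magnitude has 7 bits, so this sketch accepts values from -127 to +127. Note that extension inserts zeros after the sign bit, not copies of the sign bit.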


Information Representation 10

Problems: (1) Compare the signed magnitude numbers 1000 0000 and 0000 0000. (2) Must have “subtraction” hardware in addition to “addition” hardware.
– Binary Coded Decimal (BCD): use a 4 bit pattern to express each digit of a base 10 number
0000 = 0   0001 = 1   0010 = 2   0011 = 3
0100 = 4   0101 = 5   0110 = 6   0111 = 7
1000 = 8   1001 = 9   1010 = +   1011 = -
E.g.,
 123 : 0000 0001 0010 0011
+123 : 1010 0001 0010 0011
-123 : 1011 0001 0010 0011
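The BCD table can be applied digit by digit. The sketch below uses the sign codes given above (1010 for +, 1011 for -); the function name `to_bcd` is an assumption of this example, and no padding to a fixed width is attempted.

```python
SIGN_CODES = {"+": "1010", "-": "1011"}


def to_bcd(text: str) -> str:
    """Encode a decimal string, optionally signed, one 4-bit group per character."""
    groups = []
    for ch in text:
        if ch in SIGN_CODES:
            groups.append(SIGN_CODES[ch])
        elif ch in "0123456789":
            groups.append(format(int(ch), "04b"))
        else:
            raise ValueError(f"cannot encode {ch!r} in BCD")
    return " ".join(groups)


print(to_bcd("+123"))  # 1010 0001 0010 0011
print(to_bcd("-123"))  # 1011 0001 0010 0011
```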


Information Representation 11

BCD Disadvantages:
– Takes more memory. 32 bit simple binary can represent more than 4 billion discrete values. 32 bit BCD can hold a sign and 7 digits (or 8 digits for unsigned values) for a maximum of 110 million values, a 97% reduction.
– More difficult to do arithmetic. Essentially, we must force the Base 2 computer to do Base 10 arithmetic.

BCD Advantages:
– Used in business machines and languages, e.g., in COBOL for precise decimal math.

– Can have arrays of BCD numbers for essentially arbitrary precision arithmetic.


Information Representation 12

– Two’s Complement
Used by most machines and languages to represent integers. Fixes the -0 in the signed magnitude, and simplifies machine hardware arithmetic.
Divides bit patterns into a positive half and a negative half (with zero considered positive); n bits creates a range of $[-2^{n-1} \ldots 2^{n-1}-1]$.

Code  Simple  Signed  2’s comp
0000  0       +0      0
0001  1       1       1
0010  2       2       2
0011  3       3       3
0100  4       4       4
0101  5       5       5
0110  6       6       6
0111  7       7       7
1000  8       -0      -8
1001  9       -1      -7
1010  10      -2      -6
1011  11      -3      -5
1100  12      -4      -4
1101  13      -5      -3
1110  14      -6      -2
1111  15      -7      -1


Information Representation 13

– Representation in 2’s complement; i.e., represent i in n-bit 2’s complement, where $-2^{n-1} \le i \le +2^{n-1}-1$
Positive numbers: same as simple binary
Negative numbers:
– Obtain the n-bit simple binary equivalent of | i |
– Obtain its negation as follows:
• Invert the bits of that representation
• Add 1 to the result

Ex.: convert $-320_{10}$ to 16-bit 2’s complement
Ex.: extend the 12-bit 2’s complement number 1101 0111 1000 to 16 bits.
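The invert-and-add-one recipe can be followed literally in code. This is a minimal sketch with hypothetical function names; it works on bit strings so each step stays visible.

```python
def to_twos_complement(value: int, width: int) -> str:
    """Encode `value` in `width`-bit 2's complement using invert-and-add-one."""
    low, high = -(1 << (width - 1)), (1 << (width - 1)) - 1
    if not low <= value <= high:
        raise ValueError(f"{value} does not fit in {width} bits")
    if value >= 0:
        return format(value, f"0{width}b")       # same as simple binary
    magnitude = format(-value, f"0{width}b")     # simple binary of |i|
    inverted = "".join("1" if b == "0" else "0" for b in magnitude)
    return format((int(inverted, 2) + 1) % (1 << width), f"0{width}b")


def from_twos_complement(bits: str) -> int:
    value = int(bits, 2)
    return value - (1 << len(bits)) if bits[0] == "1" else value


def sign_extend(bits: str, new_width: int) -> str:
    """Widen a 2's complement pattern by repeating its sign bit on the left."""
    return bits[0] * (new_width - len(bits)) + bits


print(to_twos_complement(-320, 16))     # 1111111011000000
print(sign_extend("110101111000", 16))  # 1111110101111000
```

Sign extension leaves the value unchanged: both the 12-bit and the 16-bit patterns in the second example decode to the same integer.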


Information Representation 14

Binary Arithmetic
– Addition and subtraction only for now
– Rules: similar to standard addition and subtraction, but only working with 0 and 1.
0 + 0 = 0     0 - 0 = 0
1 + 0 = 1     1 - 0 = 1
0 + 1 = 1     1 - 1 = 0
1 + 1 = 10    10 - 1 = 1
– Must be aware of possible overflow.
Ex.: 8-bit signed magnitude 0101 0110 + 0110 0011 =
Ex.: 8-bit signed magnitude 0101 0110 - 0110 0011 =


Information Representation 15

2’s Complement binary arithmetic
– Addition and subtraction are the same operation
– Still must be aware of overflow.
Ex.: 8 bit 2’s complement: $23_{10} + 45_{10}$ =
Ex.: 8 bit 2’s complement: $23_{10} - 45_{10}$ =
Ex.: 8 bit 2’s complement: $100_{10} + 45_{10}$ =


Information Representation 16

– 2’s Complement overflow
Opposite signs on operands can’t overflow
If operand signs are same, but result’s sign is different, must have overflow
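The overflow rule can be checked mechanically. The sketch below (function name and return shape are assumptions of this example) adds two integers as fixed-width 2's complement patterns, discards the carry out of the top bit, and applies the sign test stated above. Subtraction is handled by adding the negated operand.

```python
def add_twos_complement(a: int, b: int, width: int = 8):
    """Add a and b as `width`-bit 2's complement values.

    Returns (result as a signed integer, result bit pattern, overflow flag).
    """
    mask = (1 << width) - 1
    sign_bit = 1 << (width - 1)
    ua, ub = a & mask, b & mask      # bit patterns of the operands
    total = (ua + ub) & mask         # carry out of the top bit is discarded
    same_operand_signs = (ua & sign_bit) == (ub & sign_bit)
    overflow = same_operand_signs and (total & sign_bit) != (ua & sign_bit)
    result = total - (1 << width) if total & sign_bit else total
    return result, format(total, f"0{width}b"), overflow


print(add_twos_complement(23, 45))    # (68, '01000100', False)
print(add_twos_complement(23, -45))   # (-22, '11101010', False)
print(add_twos_complement(100, 45))   # (-111, '10010001', True)
```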


Information Representation 17

Characters and Strings
– EBCDIC, Extended Binary Coded Decimal Interchange Code
Used by IBM in mainframes (360 architecture and descendants).
Earliest system
– ASCII, American Standard Code for Information Interchange.
Most common system
– Unicode, http://www.unicode.org
New international standard
Variable length encoding scheme with either 8- or 16-bit minimum
“a unique number for every character, no matter what the platform, no matter what the program, no matter what the language.”


Information Representation 18

ASCII
– see table 1.7 on pg. 18.
In Unix, run “man ascii”.
– 7 bit code
Printable characters for human interactions
Control characters for non-human communication (computer-computer, computer-peripheral, etc.)
– 8-bit code: most significant bit may be set
Extended ASCII (IBM), includes graphical symbols and lines
ISO 8859, several international standards
Unicode’s UTF-8, variable length code with 8-bit minimum


ASCII

Easy to decode
– But takes up a predictable amount of space
Upper and lower case characters are 0x20 ($32_{10}$) apart
ASCII representation of ‘3’ is not the same as the binary representation of 3.
– To convert ASCII to binary (an integer), ‘3’-‘0’ = 3
Line feed (LF) character
– $000\,1010_2$ = 0x0a = $10_{10}$
– ‘\n’ = 0xa

Character  ASCII Binary  ASCII Hex
‘ ’        010 0000      0x20
‘A’        100 0001      0x41
‘a’        110 0001      0x61
‘R’        101 0010      0x52
‘r’        111 0010      0x72
‘0’        011 0000      0x30
‘3’        011 0011      0x33
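These ASCII properties are easy to try out with Python's `ord` and `chr`. The two helper names are invented for this sketch, and `toggle_case` is only meaningful for the letters A-Z and a-z.

```python
def digit_value(ch: str) -> int:
    """'3' - '0' = 3: subtract the code of '0' from the code of the digit."""
    if not "0" <= ch <= "9":
        raise ValueError(f"{ch!r} is not an ASCII digit")
    return ord(ch) - ord("0")


def toggle_case(letter: str) -> str:
    """Upper and lower case letters differ only in the 0x20 bit."""
    return chr(ord(letter) ^ 0x20)


print(hex(ord("A")), hex(ord("a")))  # 0x41 0x61
print(digit_value("3"))              # 3
print(toggle_case("R"))              # r
print(ord("\n"))                     # 10
```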


Information Representation 19

Decode:

1000001, 1010011, 1000011, 1001001, 1001001, 0100000, 1101001, 1110011, 0100000, 1100101, 1100001, 1110011, 1111001, 0000000

– Or (in hex):

41 53 43 49 49 20 69 73 20 65 61 73 79 00

How many bytes is this? What’s the use of the ’00’?

Character  ASCII Binary  ASCII Hex
‘ ’        010 0000      0x20
‘A’        100 0001      0x41
‘a’        110 0001      0x61
‘R’        101 0010      0x52
‘r’        111 0010      0x72
‘0’        011 0000      0x30
‘3’        011 0011      0x33

String definition is programming language dependent.

C, C++: strings are arrays of characters terminated by a null byte.
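The hex string in the decoding exercise can be read the way C reads a string: take bytes up to, but not including, the first null byte. The function name below is hypothetical; `bytes.fromhex` ignores the spaces between byte pairs.

```python
def decode_c_string(hex_bytes: str) -> str:
    """Decode hex-encoded ASCII bytes, stopping at the first null terminator."""
    data = bytes.fromhex(hex_bytes)
    end = data.index(0)  # raises ValueError if there is no null byte
    return data[:end].decode("ascii")


message = "41 53 43 49 49 20 69 73 20 65 61 73 79 00"
print(decode_c_string(message))      # ASCII is easy
print(len(bytes.fromhex(message)))   # 14 bytes, counting the terminator
```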


Information Representation 20

Simple data compression
– ASCII codes are fixed length.
– Huffman codes are variable length and based on statistics of the data to be transmitted.

Assign the shortest encoding to the most common character.
– In English, the letter ‘e’ is the most common.

– Either establish a Huffman code for an entire class of messages,

– Or create a new Huffman code for each message, sending/storing both the coding scheme and the message.

“a widely used and very effective technique for compressing data; savings of 20% to 90% are typical, depending on the characteristics of the file being compressed.” (Cormen, p. 337)


ECL - Expected Code Length

Char  Fixed len encoding  Freq  Var len encoding  # bits  Expected # bits
      00                  .5    1                 1       .5
      01                  .25   01                2       .5
      10                  .15   001               3       .45
      11                  .10   000               3       .3
Avg len  2                                                1.75
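The expected code length is the frequency-weighted average of the code word lengths. A short sketch, with an assumed function name and the table's values typed in as (frequency, code word) pairs:

```python
def expected_code_length(table) -> float:
    """Sum of frequency * code length over all symbols."""
    return sum(freq * len(code) for freq, code in table)


variable = [(0.5, "1"), (0.25, "01"), (0.15, "001"), (0.10, "000")]
fixed = [(0.5, "00"), (0.25, "01"), (0.15, "10"), (0.10, "11")]

print(round(expected_code_length(variable), 2))  # 1.75
print(round(expected_code_length(fixed), 2))     # 2.0
```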


Information Representation 21

Huffman Tree for “a man a plan a canal panama”
– Determine frequencies of letters (example ignores spaces)
– Create a forest of single node trees.
Choose the two trees having the smallest total frequencies (the two “smallest” trees)
Merge them together (lesser frequency as the left subtree).
Continue merging until only one tree remains.

      Count  Frequency
‘a’   10     0.476190
‘c’   1      0.047619
‘l’   2      0.095238
‘m’   2      0.095238
‘n’   4      0.190476
‘p’   2      0.095238


Information Representation 22

Huffman Tree for "a man a plan a canal panama"

Tree (each node shown with its total frequency; left child listed first):

1.0
  'a' .4762
  .5238
    'n' .1905
    .3333
      .1428
        'c' .0476
        'l' .0952
      .1905
        'm' .0952
        'p' .0952

Reading a ‘1’ calls for following the left branch.
Reading a ‘0’ calls for following the right branch.

Decoding using the tree: To decode ‘0001’, start at root and follow r_child, r_child, r_child, l_child, revealing encoded ‘m’.


Information Representation 23

Comparison of Huffman and 3-bit code example
– 3-bit: 000 011000100 000 101010000100 000 001000100000010 101000100000011000 = 63 bits
– Huffman: 1 0001101 1 00000010101 1 001110110010 0000101100011 = 46 bits
– Savings of 17 bits, or 27% of original message

        3-bit code  Huffman Code  Count  H length  3 length
‘a’     000         1             10     10        30
‘c’     001         0011          1      4         3
‘l’     010         0010          2      8         6
‘m’     011         0001          2      8         6
‘n’     100         01            4      8         12
‘p’     101         0000          2      8         6
Totals                                   46        63
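The merging procedure can be sketched with a priority queue. This is an illustration under stated assumptions, not the construction used for the slides: the function name is invented, only code lengths are tracked (not the bit patterns), and ties between equal frequencies are broken by insertion order, so individual code lengths may differ from the table above. The total number of bits is the same for any Huffman tree built from these counts.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_code_lengths(text: str):
    """Return (letter counts, code length per letter), ignoring spaces."""
    counts = Counter(ch for ch in text if ch != " ")
    tie_breaker = count()  # keeps heap entries comparable when counts are equal
    heap = [(n, next(tie_breaker), {ch: 0}) for ch, n in counts.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        n1, _, depths1 = heapq.heappop(heap)  # the two "smallest" trees
        n2, _, depths2 = heapq.heappop(heap)
        # Merging puts every leaf of both trees one level deeper.
        merged = {ch: d + 1 for ch, d in {**depths1, **depths2}.items()}
        heapq.heappush(heap, (n1 + n2, next(tie_breaker), merged))
    return counts, heap[0][2]


counts, lengths = huffman_code_lengths("a man a plan a canal panama")
huffman_bits = sum(counts[ch] * lengths[ch] for ch in counts)
fixed_bits = 3 * sum(counts.values())
print(huffman_bits, fixed_bits)  # 46 63
```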

Tree for: ABE DEFACED A FADED BED

19/19
  9/19
    A 4/19
    5/19
      F 2/19
      3/19
        B 2/19
        C 1/19
  10/19
    D 5/19
    E 5/19

     freq
A    4/19
B    2/19
C    1/19
D    5/19
E    5/19
F    2/19


ECL for: ABE DEFACED A FADED BED

     freq   code   ecl
A    4/19   11     8/19
B    2/19   1000   8/19
C    1/19   1001   4/19
D    5/19   01     10/19
E    5/19   00     10/19
F    2/19   101    6/19

ecl = 2.42

Use the same encodings to decode
11 10000011010001 11100100 1001111000
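Because no code word is a prefix of another, the bit string can be decoded greedily from left to right. The sketch below uses the code table from this slide; the function name and the treatment of spaces as word separators are assumptions of the example.

```python
CODES = {"A": "11", "B": "1000", "C": "1001", "D": "01", "E": "00", "F": "101"}


def decode_prefix_code(bits: str, codes: dict) -> str:
    """Decode a string of '0'/'1' characters; spaces pass through as word breaks."""
    lookup = {code: letter for letter, code in codes.items()}
    decoded, pending = [], ""
    for bit in bits:
        if bit == " ":
            if pending:
                raise ValueError("word ended in the middle of a code word")
            decoded.append(" ")
            continue
        pending += bit
        if pending in lookup:  # a complete code word has been read
            decoded.append(lookup[pending])
            pending = ""
    if pending:
        raise ValueError("input ended in the middle of a code word")
    return "".join(decoded)


print(decode_prefix_code("11 10000011010001 11100100 1001111000", CODES))
# A BEADED ACE CAB
```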


Parity: Simple error detection

Data transmission, aging media, static interference, dust on media, etc. demand the ability to detect errors.
– Ex.: send ASCII ‘S’: send 1010011, but receive 1010010 (‘R’)?
Single bit errors detected by using parity checking.
Parity, here, is “the state of being odd or even.”


Information Representation 24

How to detect a 1-bit error:

– Add a 1-bit parity to each byte so that the number of 1 bits per byte is odd (odd parity) or even (even parity).

– Parity bit is stripped by hardware after checking.

– Sender/receiver both agree to odd or even parity.

– 2 flipped bits in the same encoding are not detected.

              ‘S’         ‘E’
ASCII         101 0011    100 0101
Even parity   0101 0011   1100 0101
Odd parity    1101 0011   0100 0101

What if parity bit is flipped?
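A small sketch of parity generation and checking, assuming the parity bit is placed in bit 7 above the 7-bit ASCII code as in the table. The function names are made up for illustration.

```python
def add_parity(code7, even=True):
    """Put a parity bit in bit 7 so the total count of 1 bits is even (or odd)."""
    ones = bin(code7).count("1")
    parity = ones % 2 if even else 1 - ones % 2
    return (parity << 7) | code7


def check_parity(byte, even=True):
    """True if the received byte has the agreed parity."""
    ones = bin(byte).count("1")
    return ones % 2 == (0 if even else 1)


s_even = add_parity(ord("S"))
print(format(s_even, "08b"))
print(check_parity(s_even))                # unchanged byte passes
print(check_parity(s_even ^ 0b00000001))   # one flipped data bit fails the check
print(check_parity(s_even ^ 0b10000000))   # a flipped parity bit also fails the check
print(check_parity(s_even ^ 0b00000011))   # two flipped bits pass unnoticed
```

A flipped parity bit looks the same to the receiver as any other single-bit error: the byte is rejected, even though the 7 data bits are intact.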


Information Representation 25

Two meanings for Hamming distance.

1. Specific. A count of the number of bits different in two encodings.
   E.g., dist(1100, 1001) =
         dist(0101, 1101) =

2. General. The minimum over all distinct pairs in an entire code.
   The ASCII encoding scheme has a Hamming distance of 1.
   A simple parity encoding scheme has a Hamming distance of 2.

Hamming distance serves as a measure of the robustness of error checking (as a measure of the redundancy of the encoding).
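Both meanings can be sketched in a few lines. The code below assumes 7-bit ASCII and an even-parity bit in bit 7; the function names are illustrative only.

```python
from itertools import combinations


def dist(a, b):
    """Specific meaning: number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def code_distance(codewords):
    """General meaning: minimum distance over all distinct pairs in the code."""
    return min(dist(a, b) for a, b in combinations(codewords, 2))


def with_even_parity(code7):
    """Add an even-parity bit in bit 7 (an assumed layout)."""
    return ((bin(code7).count("1") % 2) << 7) | code7


ascii_code = list(range(128))
parity_code = [with_even_parity(c) for c in ascii_code]

print(dist(ord("S"), ord("R")))
print(code_distance(ascii_code), code_distance(parity_code))
```

With distance 1, a single flipped bit turns one valid code into another valid code, so nothing can be detected. With distance 2, a single flip always lands on an invalid code.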

Basic Components 1

[Figure: D flip-flop with Data In, Clock, and Data Out lines; timing diagram marking one clock cycle]

Terminology from Ch. 2:

– Flip flop: basic storage device that holds 1 bit

– D flip flop: special flip flop that outputs the last value that was input to it (a data signal).

– Clock: two different meanings: (1) a control signal that oscillates (low to high voltage) every x nanoseconds; (2) the “write select” line for a flip flop.


Basic Components 2

– Register: collection of flip flops with parallel load. Clock (or “write select”) signal controlled. Stores instructions, addresses, operands, etc.

– Bus: Collection of related data lines (wires).

[Figure: 8-bit register built from flip flops d7 to d0, with an 8-line Input Bus, an 8-line Output Bus, and a Clock line]


Basic Components 3

– Combinational circuits: implement Boolean functions. No feedback in the circuit, output is strictly a function of input.

Gates: and, or, not, xor

E.g., xy + z

[Figure: gate symbols for AND, OR, NOT, XOR; example circuit with inputs x, y, z and output f]


Basic Components 4

– Gates can be used in combination to implement a simple (half) adder.

Addition creates a value, plus a carry-out.

Z = X XOR Y

CO = X AND Y

X   Y   Z   CO
0   0   0   0
0   1   1   0
1   0   1   0
1   1   0   1

[Figure: half-adder circuit with inputs X, Y and outputs Z, CO]
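The truth table can be reproduced with Python's bitwise operators standing in for the gates. This is only a sketch of the logic, not of the circuit timing.

```python
def half_adder(x, y):
    """Return (Z, CO): the sum bit is X XOR Y, the carry-out is X AND Y."""
    return x ^ y, x & y


for x in (0, 1):
    for y in (0, 1):
        z, co = half_adder(x, y)
        print(x, y, z, co)
```

Reading CO and Z together as a 2-bit number gives the arithmetic sum of X and Y.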


Basic Components 5

– Sequential Circuits: introduce feedback into the circuit. Outputs are functions of input and current state.

– Multiplexers: combinational circuits that use n bits to select an output from 2^n input lines.

[Figure: 4-to-1 MUX with inputs i0 to i3, select lines s0 and s1, and output f; D flip flop with inputs D and C and output Q]


Basic Components 6

Von Neumann Architecture

– Can access either instructions or data from memory in each cycle.

– One path to memory (von Neumann bottleneck)

– Stored program system. No distinction between programs and data

[Figure: von Neumann machine with Main Memory System, Operational Registers, Program Counter, Arithmetic and Logic Unit, Control Unit, and Input/Output System, connected by an Address Pathway and a Data and Instruction Pathway]


Basic Components 7

Examples of Von Neumann architecture to be explored in this course:

– SAM: tiny, good for learning architecture

– MIPS: text’s example assembly language

– SPARC: labs

– M68HC11: used in ECE 567 (taken by CSE majors)

Roughly, the order of presentation in this course is as follows:

– A couple of days on the Main Memory System

– Weeks on the Central Processing Unit (CPU)

– Finish the course with the I/O System

Memory Subsystem – the busses

[Figure: memory array with rows at addresses 000 to 101, each row n bits wide (n-bit addressable), a k-bit Address Bus, and an n-bit Data Bus]

The number of elements depends on the size of the address bus.

• If k=3, how many addresses?

• If k=4, how many addresses?

# Addresses = 2^k

Memory Subsystem – the busses

[Figure: memory array with rows at addresses 000 to 101, each row n bits wide (n-bit addressable), a k-bit Address Bus, and an n-bit Data Bus]

Capacity depends on how many bits are in each element, or the size of the data bus.

• If n=1 and k=3, how many bits? If n=2?

• If n=8 and k=3, how many Bytes?

Bit capacity = 2^k * n
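The two formulas are easy to tabulate. A minimal sketch, where k is the address-bus width, n is the element width in bits, and the function names are illustrative:

```python
def num_addresses(k):
    """Number of distinct addresses reachable with a k-bit address bus."""
    return 2 ** k


def bit_capacity(k, n):
    """Total bits stored: one n-bit element at each of the 2^k addresses."""
    return num_addresses(k) * n


for k in (3, 4):
    print(k, num_addresses(k))

# Example: 3 address bits and 5-bit elements.
print(bit_capacity(3, 5))
```

Adding one address line doubles the capacity; widening the element only scales it linearly.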

Memory Element & Address Sizes

• If a machine’s memory is 5-bit addressable, then, at each distinct address, 5 bits are stored. The contents at each address are represented by 5 bits.

• If 3 bits are used to represent memory addresses, then the memory can have at most 2^3 = 8 distinct addresses.

• Such a memory can store at most 8 × 5 = 40 bits of data.

• If the data bus is 10 bits wide, then up to 10 bits at a time can be transferred between memory and processor; this is a 10-bit word.

Address (Decimal)   Address (Binary)   Contents
0                   000                00011
1                   001                01111
2                   010                01110
3                   011                10100
4                   100                00101
5                   101                01110
6                   110                10100
7                   111                10011

Memory Subsystem - Addressability

Addressability is the size of the memory element

The size of the element may be smaller than the size of the data bus.

– If n=8, only 1 Byte Addressable

– If n=16, 1 or 2 Byte Addressable

[Figure: memory array with rows at addresses 000 to 101 (n-bit addressable), a k-bit Address Bus, and an n-bit Data Bus]

How does Addressability affect capacity?

Memory Subsystem - Addressing

Memory may be organized into banks, with bit labels

The GLOBAL address of each addressable element would be: [relative address] & [bank address]

[Figure: memory with rows 000 to 101 split into Bank 0 and Bank 1, with Address Bus and Data Bus]

Row   Bank 0   Bank 1
000   000 0    000 1
001   001 0    001 1
010   010 0    010 1
011   011 0    011 1
100   100 0    100 1
101   101 0    101 1

See the pattern that forms?

Memory Subsystem - Alignment

Data bus is 4x the size of addressable element.

So, you may read (or write) one or more Bytes at a time…

But only from/to the same row of memory!

Okay to read/write 2 Bytes from 10010? 2B from 01011? 4B from 01100? 4B from 00101?

[Figure: memory with rows 000 to 101 and four 8-bit banks, a 32-bit Data Bus, and an Address Bus]

Row   Bank 00   Bank 01   Bank 10   Bank 11
000   000 00    000 01    000 10    000 11
001   001 00    001 01    001 10    001 11
010   010 00    010 01    010 10    010 11
011   011 00    011 01    011 10    011 11
100   100 00    100 01    100 10    100 11
101   101 00    101 01    101 10    101 11

Memory Subsystem - Alignment

[Figure: memory with rows 000 to 101 and four 8-bit banks (Bank 00, Bank 01, Bank 10, Bank 11), a 32-bit Data Bus, and an Address Bus]

Where are operands of various sizes positioned?

– 1 Byte: aligned on any address

– 2 Byte: aligned on “halfword” boundary (addresses divisible by 2; end in hex 0,2,4,6,8,A,C,E)

– 4 Byte: aligned on “word” boundary (addresses divisible by 4; end in hex 0,4,8,C)
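The alignment rules reduce to a divisibility test. The sketch below assumes byte addresses and a memory row of 4 bytes (the four 8-bit banks); the function names and sample addresses are illustrative.

```python
def is_aligned(address, size):
    """True if an operand of `size` bytes (1, 2, or 4) may start at `address`."""
    return address % size == 0


def same_row(address, size, row_bytes=4):
    """True if all `size` bytes starting at `address` lie in one memory row."""
    return address // row_bytes == (address + size - 1) // row_bytes


print(is_aligned(0b00110, 2), is_aligned(0b00111, 2))
print(is_aligned(0b01000, 4), is_aligned(0b01010, 4))

# An aligned operand never straddles two rows of a 4-byte-wide memory.
print(all(same_row(a, s) for a in range(32) for s in (1, 2, 4) if is_aligned(a, s)))
```

The low two address bits are the bank number, so a 4-byte operand is aligned exactly when it starts in the first bank.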

Basic Components 11

Byte ordering: how numeric data is stored in memory

– Ex.: 247896511 (base 10) = 0EC699BF (base 16)

– Stored at address 0

Address   Big Endian   Little Endian
0         0E           BF
1         C6           99
2         99           C6
3         BF           0E

Big Endian: High order (big end) is at byte 0

Little Endian: Low order (little end) is at byte 0

Contrast with bit ordering:

Bit     7 6 5 4 3 2 1 0
Value   1 0 1 1 1 1 1 1
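Python's built-in int.to_bytes can show both byte orders for the example value. This is only a sketch for checking the table by hand.

```python
value = 0x0EC699BF

big = value.to_bytes(4, "big")        # high-order byte at the lowest address
little = value.to_bytes(4, "little")  # low-order byte at the lowest address

print(value)
for address in range(4):
    print(address, format(big[address], "02X"), format(little[address], "02X"))
```

Only the order of whole bytes changes; the bits inside each byte keep the same order in both layouts.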

Basic Components 12

Read/Write operations: must know the address to read or write. (read = fetch = load, write = store)

– CPU puts address on address bus

– CPU sends read signal (R/W=1, CS=1)
  (Read/don’t Write, Chip Select)

– Wait

– Memory puts data on data bus

– reset (CS=0)

[Figure: memory chip with data lines D0 to D(n-1), address lines A0 to A(m-1), and CS and R/W control inputs]

Basic Components 13

Types of memory:

– ROM: Read Only Memory: non-volatile (doesn’t get erased when powered down; it’s a combinational circuit!)

– PROM: Programmable ROM: use a ROM burner to write data to it initially. Can’t be re-written.

– EPROM: Erasable PROM. Uses UV light to erase.

– EEPROM: Electrically Erasable PROM.

– RAM: Random access memory. Can efficiently read/write any location (unlike sequential access memory). Used for main memory.

Many variations (types) of RAM, all volatile

– SDRAM, DDR SDRAM

– RDRAM

– www.tomshardware.com


Instructional Sparc Emulator - ISEM

Editing, Assembling, Linking, and Loading

– There are three components to the Instructional SPARC Emulator (ISEM) package that we use for this class:

the assembler, the linker, and the emulator/debugger.

Instructional Sparc Emulator - ISEM

Editing

– There are a number of programs that you can use to create your source files. Emacs is probably the most popular; vi is also available, but its command syntax is difficult to learn and use; using the pine program, you can use the pico editor, which combines many features of Emacs into a simple menu-driven facility.

– Start Emacs by “xemacs sourcefile.s &”, which creates the file called sourcefile.s.

– Use the tutorial, accessed by typing "Ctrl-H Ctrl-H t".

– For other editors, you are on your own.

Example Sparc Assembly Language Instructions

% type xmp0.s
        .data                     ! Assembler directive: data starts here. A_m, B_m, and
A_m:    .word   '?'               ! C_m are symbolic constants. Furthermore, each
B_m:    .word   0x30              ! is an address of a certain-sized chunk of memory. Here,
C_m:    .word   0                 ! each chunk is four bytes (one word) long. When the
                                  ! program gets loaded, each of these chunks stores a
                                  ! number in 2's complement encoding, as follows: At
                                  ! address C_m, zero; at B_m, 48; at A_m, 0x3F = 077 = 63.
        .text                     ! Assembler directive, instructions start here
start:                            ! Label (symbolic constant) for this address
        set     A_m, %r2          ! Put address A_m into register 2
        ld      [%r2], %r2        ! Use r2 as an indirect address for a load (read)
        set     B_m, %r3          ! Put address B_m into register 3
        ld      [%r3], %r3        ! Read from B_m and replace r3 w/ value at addr B_m
        sub     %r2, %r3, %r2     ! Subtract r3 from r2, save in r2
        set     C_m, %r4          ! Put address C_m into register 4
        st      %r2, [%r4]        ! Store (write) r2 to memory at address C_m
terminate:                        ! Label for address where 'ta 0' instruction stored
        ta      0                 ! Stop the program
beyond_end:                       ! Label for address beyond the end of this program

Instructional Sparc Emulator - ISEM

Assembling

– The assembler is called "isem-as", and is the GNU Assembler (GAS), configured to cross-assemble to a SPARC object format.

– It is used to take your source code, and produce object code that may be linked and run on the ISEM emulator.

– The syntax for invoking the assembler is:

    isem-as [-a[ls]] sourcefile.s -o objectfile.o

– The input is read from sourcefile.s, and the output is written to objectfile.o.

– The option "-a" tells the assembler to produce a listing file. The sub-options "l" and "s" tell the assembler to include the assembly source in the listing file and produce a symbol table, respectively.


Instructional Sparc Emulator - ISEM

The listing file

– Will identify all the syntactic errors in your program, and it will warn you if it identifies "suspicious" behavior in your source file.

– Column 1 identifies a line number in your source file.

– Column 2 is an offset for where this instruction or data resides in memory.

– Column 3 is the image of what is put in memory, either the machine instructions or the representation of the data.

– The final column is the source code that produced the line.

– At the bottom of the file you will find the symbol table.

– Again, the symbols are represented as offsets that are relocated when the program is loaded into memory.

isem-as -als labn.s -o labn.o >! labn.lst

   1                    .data
   2 0000 0000003F      A_m: .word '?'
   3 0004 00000030      B_m: .word 0x30
   4 0008 00000000      C_m: .word 0
   5 000c 00000000      .text
   6                    start:
   7 0000 05000000      set A_m, %r2
   7      8410A000
   8 0008 C4008000      ld [%r2], %r2
   9 000c 07000000      set B_m, %r3
   9      8610E000
  10 0014 C600C000      ld [%r3], %r3
  11 0018 84208003      sub %r2, %r3, %r2
  12 001c 09000000      set C_m, %r4
  12      88112000
  13 0024 C4210000      st %r2, [%r4]
  14                    terminate:
  15 0028 91D02000      ta 0
  16 002c 01000000      beyond_end:

DEFINED SYMBOLS
  xmp0.s:2   .data:00000000 A_m
  xmp0.s:3   .data:00000004 B_m
  xmp0.s:4   .data:00000008 C_m
  xmp0.s:6   .text:00000000 start
  xmp0.s:14  .text:00000028 terminate
  xmp0.s:16  .text:0000002c beyond_end

NO UNDEFINED SYMBOLS

Callouts on the listing:

– Column 1: Line in source file (.s)

– Column 2: Offset to address in memory

– Column 3: Contents at address in memory

– Symbol table: Labels are symbolic offsets

Instructional Sparc Emulator - ISEM

Linking

– Linking turns a set of raw object file(s) into an executable program.

– From the manual page, "ld combines a number of object and archive files, relocates their data and ties up symbol references. Often the last step in building a new compiled program to run is a call to ld."

– Several object files are combined into one executable using ld; the separate files could reference symbols from one another.

– The output of the linker is an executable program.

– The syntax for the linker is as follows:

    isem-ld objectfile.o [-o execfile]

Examples

    % isem-ld foo.o -o foo      Links foo.o into the executable foo.
    % isem-ld foo.o             Links foo.o into the executable a.out.


Instructional Sparc Emulator - ISEM

Loading/Running

– Execute the program and test it in the emulation environment.

– The program "isem" is used to do this, and the majority of its features are covered in your lab manual.

– Invoke isem as follows

isem [execfile]

Examples

    % isem foo      Invokes the emulator, loads the program foo
    % isem          Invokes the emulator, no program is loaded

– Once you are in the emulator, you can run your program by typing "run" at the prompt.

ISEM Debugging Tools 1

    % isem xmp0
    Instructional SPARC Emulator
    Copyright 1993 - Computer Science Department University of New Mexico
    ISEM comes with ABSOLUTELY NO WARRANTY
    ISEM Ver 1.00d : Mon Jul 27 16:29:45 EDT 1998
    Loading File: xmp0
    2000 bytes loaded into Text region at address 8:2000
    2000 bytes loaded into Data region at address a:4000
    PC: 08:00002020 nPC: 00002024 PSR: 0000003e N:0 Z:0 V:0 C:0
    start : sethi 0x10, %g2
    ISEM> run
    Program exited normally.

Assembly language programs are not notoriously chatty.

ISEM Debugging Tools 2

reg

– Gives values of all 32 general registers

– Also PC

symb

– Shows the resolved values of all symbolic constants

dump [addr]

– Either symbol or hex address

– Gives the values stored in memory

ISEM> reg

----0--- ----1--- ----2--- ----3--- ----4--- ----5--- ----6--- ----7---

G 00000000 00000000 0000000f 00000030 00004008 00000000 00000000 00000000

O 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

L 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

I 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

PC: 08:0000204c nPC: 00002050 PSR: 0000003e N:0 Z:0 V:0 C:0

beyond_end : sethi 0x0, %g0

ISEM> symb

Symbol List

A_m : 00004000

B_m : 00004004

.

.

.

terminate : 00004028

ISEM> dump A_m

0a:00004000 00 00 00 3f 00 00 00 30 00 00 00 0f 00 00 00 00 ...?...0........

0a:00004010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................

0a:00004020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................

ISEM Debugging Tools

break [addr]

– Set breakpoints in execution

– Once execution is stopped, you can look at the contents of registers and memory.

trace

– Causes one (or more) instruction(s) to be executed

– Registers are displayed

– Handy for sneaking up on an error when you’re not sure where it is.

ISEM Debugging Tools

For the all-time “most wanted” list of errors (and their fixes)

ISEM Debugging

If you still need help

– Print a fresh copy of your source

– Make good notes describing the error

– Visit your lecturer or grader

– Post a question to the discussion board

Basic Components 14

CPU: executes instructions -- primitive operations that the computer can perform.

– E.g., arithmetic      A+B
        data movement   A := B
        control         if expr goto label
        logical         AND, OR, XOR…

Instructions specify both the operation and the operands. An encoded operand is often a location in memory where the value of interest may be found (address of value of interest).

Basic Components 15

– Instruction set: all instructions for a machine. Instruction format specifies number and type of operands.

Ex.: Could have an instruction like

    ADD A, B, R

Where A, B, and R are the addresses of operands in memory. The result is R := A+B.

[Figure: memory table with Addr and Label columns; addresses 0, 4, 8, C; labels A, B, R; contents 8, 9, 17]

Basic Components 16

– Actually, the “instruction” might be represented in a source file as:

    0x 41 44 44 20 41 2C 20 42 2C 20 52 0A
        A  D  D     A  ,     B  ,     R

  As such, it is an assembly language instruction.

– An assembler might translate it to, say, 0x504C, the machine’s representation of the instruction. As such, it is a machine language instruction.
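To see that the long hex string really is just the source text, it can be decoded as ASCII. A minimal sketch using only built-in Python calls:

```python
# The hex string from the slide, without the 0x prefix.
source_bytes = bytes.fromhex("41444420412C20422C20520A")

text = source_bytes.decode("ascii")
print(repr(text))
print(len(source_bytes), "bytes of source text")
```

The source line takes 12 bytes, while the slide's hypothetical machine encoding 0x504C takes 2.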

A Simple Instruction Set 1

Simple instruction set: the Accumulator machine.

– Simplify instruction set by only allowing one operand. Accumulator implied to be the second operand.

– Accumulator is a special register. Similar to a simple calculator.

Instruction    Effect
ADD addr       ACC <- ACC + M[addr]
SUB addr       ACC <- ACC - M[addr]
MPY addr       ACC <- ACC * M[addr]
DIV addr       ACC <- ACC / M[addr]
LOAD addr      ACC <- M[addr]
STORE addr     M[addr] <- ACC

A Simple Instruction Set 2

Ex.: C = AB + CD

LOAD 20    ! Acc<-M[20]
MPY 21     ! Acc<-Acc*M[21]
STORE 30   ! M[30]<-Acc
LOAD 22    ! Acc<-M[22]
MPY 23     ! Acc<-Acc*M[23]
ADD 30     ! Acc<-Acc+M[30]
STORE 22   ! M[22]<-Acc

Address   Symbolic   Contents
20        A          0001
21        B          0010
22        C          0011
23        D          0100
30        temp

Accumulator (initially 0000) after each instruction: 0001, 0010, 0010, 0011, 1100, 1110, 1110

Try C=2A+B

Try C=A+2
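The program can be traced with a tiny interpreter for the six accumulator instructions. This is a sketch, not the SAM hardware: it assumes unbounded Python integers, integer division for DIV, and an initial value of 0 for temp; the function name run is made up.

```python
def run(program, memory):
    """Execute (mnemonic, address) pairs; return the accumulator after each step."""
    acc = 0
    trace = []
    for op, addr in program:
        if op == "LOAD":
            acc = memory[addr]
        elif op == "STORE":
            memory[addr] = acc
        elif op == "ADD":
            acc += memory[addr]
        elif op == "SUB":
            acc -= memory[addr]
        elif op == "MPY":
            acc *= memory[addr]
        elif op == "DIV":
            acc //= memory[addr]  # assumption: integer division
        else:
            raise ValueError("unknown instruction: " + op)
        trace.append(acc)
    return trace


# Memory contents from the slide: A=1, B=2, C=3, D=4; temp is assumed to start at 0.
memory = {20: 1, 21: 2, 22: 3, 23: 4, 30: 0}
program = [("LOAD", 20), ("MPY", 21), ("STORE", 30), ("LOAD", 22),
           ("MPY", 23), ("ADD", 30), ("STORE", 22)]

trace = run(program, memory)
print([format(v, "04b") for v in trace])
print(memory[22])
```

The temp location is needed because the single accumulator cannot hold AB while CD is being computed.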

An Instruction (Encoding) Format

Machine language: Converting from assembly language to machine language is called assembling.

Assume 8-bit architecture. Each instruction may be 8 bits. 3 bits hold the op-code and 5 bits hold the operand.

How much memory can we address? How many op-codes can we have?

Bits     7 to 5     4 to 0
Field    op-code    operand

Operation   Code
ADD         000
SUB         001
MPY         010
DIV         011
LOAD        100
STORE       101

A Simple Instruction Set 4

Convert the mnemonic op-codes into binary codes. Hand assemble our program. Instructions are stored in consecutive memory:

Addr   Memory      Mnemonic
0      100 10100   LOAD A
1      010 10101   MPY B
2      101 11110   STORE temp
3      100 10110   LOAD C
4      010 10111   MPY D
5      000 11110   ADD temp
6      101 10110   STORE C
…      …
20     4           A
21     5           B
22     6           C
23     7           D
…      …
30     20          temp
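Hand assembly of this format is a shift and an OR: the 3-bit op-code goes in bits 7 to 5 and the 5-bit address in bits 4 to 0. A minimal sketch, using the op-code table from the encoding-format slide and the symbol addresses from the program; the names OPCODES, SYMBOLS, and assemble are illustrative.

```python
OPCODES = {"ADD": 0b000, "SUB": 0b001, "MPY": 0b010,
           "DIV": 0b011, "LOAD": 0b100, "STORE": 0b101}
SYMBOLS = {"A": 20, "B": 21, "C": 22, "D": 23, "temp": 30}


def assemble(mnemonic, operand):
    """Pack a 3-bit op-code and a 5-bit address into one 8-bit instruction."""
    addr = SYMBOLS[operand]
    if not 0 <= addr < 32:
        raise ValueError("address does not fit in 5 bits")
    return (OPCODES[mnemonic] << 5) | addr


program = [("LOAD", "A"), ("MPY", "B"), ("STORE", "temp"), ("LOAD", "C"),
           ("MPY", "D"), ("ADD", "temp"), ("STORE", "C")]

for address, (mnemonic, operand) in enumerate(program):
    word = format(assemble(mnemonic, operand), "08b")
    print(address, word[:3], word[3:], mnemonic, operand)
```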

Simple Accumulator Machine

[Figure: SAM datapath with registers MAR, MDR, IR, ACC, and PC; Decode (Addr, Op), ALU, INC, and two 2-to-1 MUXes; Timing and Control; Memory; all connected by the Bus. Control signals are numbered 0 to 14.]

Simple Accumulator Machine (SAM)

REGISTERS

– ACC - Accumulator, stores program values

– IR - Instruction Register, holds the instruction during interpretation

– MAR - Memory Address Register, stores the address to read from or write to

– MDR - Memory Data Register, stores data being written to or read from memory

– PC - Program Counter, stores the address of the next instruction

Simple Accumulator Machine (SAM)

Combinational Circuits

– ALU - Arithmetic and logic unit, implements the operations (e.g., +,-,*,/)

– Decode - Instruction decoder, splits off the opcode and operands

– INC - Incrementer, increments the PC

– MUX - Multiplexer, controls inputs to PC and ACC

Simple Accumulator Machine (SAM)

Sequential Circuit

– Timing and control - asserts control signals, clock
  Combination of flip-flops, circuits and capacitors

– Memory – stores instructions and data


A Simple Instruction Set 6

– Control signals: control functional units to determine order of operations, access to bus, loading of registers, etc.

Number   Operation      Number   Operation
0        ACC -> bus     8        ALU -> ACC
1        load ACC       9        INC -> PC
2        PC -> bus      10       ALU operation
3        load PC        11       ALU operation
4        load IR        12       Addr -> bus
5        load MAR       13       CS
6        MDR -> bus     14       R/W
7        load MDR

A Simple Instruction Set 7

[Figure: fetch/execute state diagram for SAM, with the states grouped into Fetch and Execute phases. The states and branches are:]

State 0: PC to bus, load MAR, INC to PC, load PC
State 1: CS, R/W
State 2: MDR to bus, load IR
State 3: Addr to bus, load MAR
OP=store?
    Y: State 4: ACC to bus, load MDR
       State 5: CS
    N: State 6: CS, R/W
       OP=load?
           Y: State 7: MDR to bus, load ACC
           N: State 8: MDR to bus, ALU to ACC, ALU op, load ACC
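The state diagram can be read as a function from op-code to the sequence of states visited. The sketch below encodes the branches of the diagram and the control signals listed for each state on the slides that follow; the names COMMON, SIGNALS, and state_sequence are illustrative, and "10/11" stands for whichever ALU operation signals the op-code selects.

```python
# States every instruction passes through before the first decision.
COMMON = [0, 1, 2, 3]

# Control signals asserted in each state, as listed on the per-state slides.
SIGNALS = {
    0: (2, 5, 9, 3),         # PC to bus, load MAR, INC to PC, load PC
    1: (13, 14),             # CS, R/W
    2: (6, 4),               # MDR to bus, load IR
    3: (12, 5),              # Addr to bus, load MAR
    4: (0, 7),               # ACC to bus, load MDR
    5: (13,),                # CS
    6: (13, 14),             # CS, R/W
    7: (6, 1),               # MDR to bus, load ACC
    8: (6, 8, "10/11", 1),   # MDR to bus, ALU to ACC, ALU op, load ACC
}


def state_sequence(op):
    """Return the states visited while fetching and executing one instruction."""
    if op == "STORE":
        return COMMON + [4, 5]
    if op == "LOAD":
        return COMMON + [6, 7]
    return COMMON + [6, 8]  # ADD, SUB, MPY, DIV


for op in ("STORE", "LOAD", "ADD"):
    print(op, [(state, SIGNALS[state]) for state in state_sequence(op)])
```

Every instruction takes six states in this design, whichever branch it follows.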

State 0: Control Signals 2, 5, 9, 3

[Figure: SAM datapath (registers MAR, MDR, IR, ACC, PC; Decode, ALU, INC, two 2-to-1 MUXes; Timing and Control; Memory; Bus; control signals 0 to 14) shown beside the fetch/execute state diagram]

Put the address of the next instruction in the Addr Register and Inc. PC.

State 1: Control Signals 13, 14

[Figure: SAM datapath (registers MAR, MDR, IR, ACC, PC; Decode, ALU, INC, two 2-to-1 MUXes; Timing and Control; Memory; Bus; control signals 0 to 14) shown beside the fetch/execute state diagram]

Fetch the word of memory at Address, and load into Data Register.

State 2: Control Signals 6, 4

[Figure: SAM datapath (registers MAR, MDR, IR, ACC, PC; Decode, ALU, INC, two 2-to-1 MUXes; Timing and Control; Memory; Bus; control signals 0 to 14) shown beside the fetch/execute state diagram]

Send the word from the Data Register to the Instruction Register.

State 3: Control Signals 12, 5

[Figure: SAM datapath (registers MAR, MDR, IR, ACC, PC; Decode, ALU, INC, two 2-to-1 MUXes; Timing and Control; Memory; Bus; control signals 0 to 14) shown beside the fetch/execute state diagram]

Put the address from the instruction in the Address Register.


After State 3, what values are now stored in each register?

PC MAR MDR IR ACC

State 4: Control Signals 0, 7

[Figure: SAM datapath (registers MAR, MDR, IR, ACC, PC; Decode, ALU, INC, two 2-to-1 MUXes; Timing and Control; Memory; Bus; control signals 0 to 14) shown beside the fetch/execute state diagram]

Take the value from the ACCumulator and store it in the Data Register.

State 5: Control Signal 13

[Figure: SAM datapath (registers MAR, MDR, IR, ACC, PC; Decode, ALU, INC, two 2-to-1 MUXes; Timing and Control; Memory; Bus; control signals 0 to 14) shown beside the fetch/execute state diagram]

Write the data from the Data Register to the address stored in the MAR.

State 6: Control Signals 13, 14

[Figure: SAM datapath (registers MAR, MDR, IR, ACC, PC; Decode, ALU, INC, two 2-to-1 MUXes; Timing and Control; Memory; Bus; control signals 0 to 14) shown beside the fetch/execute state diagram]

Load the word at the Address from the Addr Reg into the Data Register.


After State 6, what values are now stored in each register?

PC MAR MDR IR ACC

State 7: Control Signals 6, 1

[Figure: SAM datapath (registers MAR, MDR, IR, ACC, PC; Decode, ALU, INC, two 2-to-1 MUXes; Timing and Control; Memory; Bus; control signals 0 to 14) shown beside the fetch/execute state diagram]

Load the word from Data Register into the ACCumulator.

State 8: Control Signals 6, 8, 10/11, 1

[Figure: SAM datapath (registers MAR, MDR, IR, ACC, PC; Decode, ALU, INC, two 2-to-1 MUXes; Timing and Control; Memory; Bus; control signals 0 to 14) shown beside the fetch/execute state diagram]

Use word from the Data Register for Arith Op and put result in ACC.

New Instruction

• What is necessary to implement a new instruction?
    • New states?
    • New control signals?
    • New fetch/execute cycle?

• An Example: SWAP
    Exchange value in Accumulator with value at Address

• SWAP addr ! Acc <- #M[addr], M[addr] <- #Acc

Page 132: CSE3601 CSE 360: Introduction to Computer Systems Course Notes Bettina Bair (bbair@cse.osu.edu)  bbair.

CSE360 132

New Instruction

What changes to the fetch/execute cycle?
– The fetch part of the cycle usually remains the same.
– Recall the values stored in the registers after each state. E.g., after State 6, what values are in each register?
   PC
   MAR
   MDR
   IR
   ACC
– Handy to have #M[addr] in MDR.
– Start after state 6, then…

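Fetch/execute state diagram (control signals asserted in each state, in order):

   PC to bus, load MAR, INC to PC, load PC
   CS, R/W
   MDR to bus, load IR
   Addr to bus, load MAR
   If OP = store:
      ACC to bus, load MDR
      CS
   Otherwise:
      CS, R/W
      If OP = load:
         MDR to bus, load ACC
      Otherwise:
         MDR to bus, ALU to ACC, ALU op, load ACC

The diagram groups these states into a Fetch part followed by an Execute part.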


New State 9: Control Signals 6, 4

[Figure: accumulator machine datapath (MAR, MDR, IR, Decode, ACC, PC, INC, ALU, two 2-to-1 MUXes, Timing and Control, Memory, Bus) with control signals 0-14.]

Save the data value from the MDR in the Instruction Register (IR).

MDR -> bus, load IR


New State 10: Control Signals 0, 7

[Figure: accumulator machine datapath (MAR, MDR, IR, Decode, ACC, PC, INC, ALU, two 2-to-1 MUXes, Timing and Control, Memory, Bus) with control signals 0-14.]

Send the ACCumulator value to the Data Register.

ACC -> bus, load MDR


New State 11: Control Signals 15?, 1

[Figure: accumulator machine datapath (MAR, MDR, IR, Decode, ACC, PC, INC, ALU, two 2-to-1 MUXes, Timing and Control, Memory, Bus) with control signals 0-14.]

Put the saved value from the IR into the ACCumulator.

IR -> bus, load ACC

Note: the current architecture has no control signal that is the opposite of 4 (load IR), so we would have to create a new control signal (IR to bus) in addition to creating these new states.


New State 12 (Old 5): Control Signals 13

[Figure: accumulator machine datapath (MAR, MDR, IR, Decode, ACC, PC, INC, ALU, two 2-to-1 MUXes, Timing and Control, Memory, Bus) with control signals 0-14.]

Write the data from the Data Register to the address stored in the MAR.

CS


New Instruction Solution

Changes to States: added 9 thru 12
Changes to Signals: added 15: IR -> bus
Changes to Fetch/Execute: new register transfer language (RTL)

   PC -> bus, load MAR, INC -> PC, load PC
   CS, R/W
   MDR -> bus, load IR
   Addr -> bus, load MAR
   CS, R/W
   MDR -> bus, load IR
   ACC -> bus, load MDR
   IR -> bus, load ACC
   CS

What if we had added MAR -> bus instead of IR -> bus?
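
The effect of the SWAP states can be checked with a small trace. The Python sketch below models only the execute phase of SWAP; the dictionary-based registers and memory and the name `swap_execute` are assumptions made for illustration, not part of the course's machine definition.

```python
def swap_execute(regs, mem, addr):
    """Apply the execute-phase register transfers of SWAP addr, in order."""
    regs["MAR"] = addr               # Addr -> bus, load MAR
    regs["MDR"] = mem[regs["MAR"]]   # CS, R/W: read M[addr] into MDR
    regs["IR"] = regs["MDR"]         # new state 9:  MDR -> bus, load IR
    regs["MDR"] = regs["ACC"]        # new state 10: ACC -> bus, load MDR
    regs["ACC"] = regs["IR"]         # new state 11: IR -> bus, load ACC
    mem[regs["MAR"]] = regs["MDR"]   # new state 12: CS, write MDR to M[MAR]


regs = {"MAR": 0, "MDR": 0, "IR": 0, "ACC": 7}   # hypothetical starting values
mem = {100: 42}
swap_execute(regs, mem, 100)
print(regs["ACC"], mem[100])   # prints: 42 7
```

The IR serves as the temporary here only because the address has already been copied into the MAR, so overwriting the IR loses nothing the remaining states need.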


Instruction Set Architectures 1

RISC vs. CISC
– Complex Instruction Set Computer (CISC):
   Many, powerful instructions.
   High code density to address the Von Neumann Bottleneck.
   Instructions have varying lengths, number of operands, formats, and clock cycles in execution.
– Reduced Instruction Set Computer (RISC):
   Fewer, less powerful, optimized instructions.
   Requires simpler, faster hardware.
   Instructions have fixed length, number of operands, formats, and similar number of clock cycles in execution.


Instruction Set Architectures 2

Motivation: memory is comparatively slow.
– 10x to 20x slower than the processor.
– Need to minimize the number of trips to memory.
   Provide faster storage in the processor -- registers.
   Registers (16, 32, 64 bits wide) are used for intermediate storage for calculations, or repeated operands.

Accumulator machine
– One data register -- ACC.
– 2 memory accesses per instruction -- one for the instruction and one for the operand.

Add more registers (R0, R1, R2, …, Rn)


Instruction Set Architectures 3

How many addresses to specify?
– With binary operations, need to know two source operands, a destination, and the operation.
   E.g., op (dest_operand) (src_op1) (src_op2)
– Based on number of operands, could have:
   3 addr. machine: both sources and dest are named.
   2 addr. machine: both sources named, dest is a source.
   1 addr. machine: one source named, other source and dest. is the accumulator.
   0 addr. machine: all operands implicit and available on the stack.


Instruction Set Architectures 4

1-address architecture: a := a*b + c*d*e

– Memory only

   Code          # mem refs
   LOAD  100     2
   MPY   104     2
   STORE 100     2
   LOAD  108     2
   MPY   112     2
   MPY   116     2
   ADD   100     2
   STORE 100     2

– Using registers

   Code          # mem refs
   LOAD  100     2
   MPY   104     2
   STORE R2      1
   LOAD  108     2
   MPY   112     2
   MPY   116     2
   ADD   R2      1
   STORE 100     2

1½-address architecture: at least one operand must always be a register. (The ½ address is the register, the 1 address is the memory operand: LOAD 100, R1.)
– Like an accumulator machine, but with many accumulators.


Instruction Set Architectures 5

3-address architecture: a := a*b + c*d*e

– Using memory only:

   Code                                   # mem refs
   MPY 100, 100, 104    ; a := a*b
   MPY 200, 108, 112    ; t := c*d
   MPY 200, 116, 200    ; t := e*t
   ADD 100, 200, 100    ; a := t + a

– Using registers:

   Code                                   # mem refs
   MPY R2, 100, 104     ; t1 := a*b
   MPY R3, 108, 112     ; t2 := c*d
   MPY R3, 116, R3      ; t2 := e*t2
   ADD 100, R3, R2      ; a := t1 + t2

Memory

   100 (a)
   104 (b)
   108 (c)
   112 (d)
   116 (e)
   ...
   200 (t)

What about instruction size?


Instruction Set Architecture

How does instruction size affect addressing?
– 16-bit instruction, 3 address, 6 instructions
   Opcode = 3 bits (2^3 = 8)
   Operand = (size - opcode) / #addr = (16 - 3) / 3 = 13 / 3 = 4 bits (rounded down; 1 bit is left over)
– How many addresses will be supported?
– What if the instruction were 32 bit?
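
The field-size arithmetic can be sketched in a few lines of Python. The function name `field_sizes` and its parameters are hypothetical, and the sketch assumes the opcode takes the smallest number of bits that can distinguish all instructions, with the remaining bits split evenly among the operands.

```python
def field_sizes(instr_bits, n_addr, n_instr):
    """Return (opcode_bits, operand_bits) for a fixed-length instruction."""
    # Smallest number of bits that can encode n_instr distinct opcodes.
    opcode_bits = (n_instr - 1).bit_length()
    # Remaining bits are divided evenly; any remainder is unused.
    operand_bits = (instr_bits - opcode_bits) // n_addr
    return opcode_bits, operand_bits


for size in (16, 32):
    opcode_bits, operand_bits = field_sizes(size, n_addr=3, n_instr=6)
    print(size, opcode_bits, operand_bits, 2 ** operand_bits)
# prints: 16 3 4 16
#         32 3 9 512
```

Under these assumptions a 16-bit, 3-address instruction can name only 16 distinct operand locations, while a 32-bit instruction can name 512.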


Instruction Set Architectures 6

2-address architecture: a := a*b + c*d*e

– Using memory only:

   Code                              # mem refs
   MPY  100, 104    ; a := a*b       4
   MOVE 200, 108    ; t := c         3
   MPY  200, 112    ; t := t*d       4
   MPY  200, 116    ; t := t*e       4
   ADD  100, 200    ; a := t + a     4

– Using registers:

   Code                              # mem refs
   MPY  100, 104    ; a := a*b       4
   MOVE R2, 108     ; R2 := c        2
   MPY  R2, 112     ; R2 := R2*d     2
   MPY  R2, 116     ; R2 := R2*e     2
   ADD  100, R2     ; a := t + a     3

Memory

   100 (a)
   104 (b)
   108 (c)
   112 (d)
   116 (e)
   ...
   200 (t)

Most CISC architectures work this way, making 1 operand implicit.


Instruction Set Architectures 7

0-address architecture: a := a*b + c*d*e
– Stack machine: all operands are implicit. Only push and pop touch memory. All other operands are pulled from the top of the stack, and the result is pushed on top. E.g., HP calculators.

   Code       # mem refs    Stack (top, after the step)
   PUSH A     2             A
   PUSH B     2             B
   MPY        1             A*B
   PUSH C     2             C
   PUSH D     2             D
   PUSH E     2             E
   MPY        1             D*E
   MPY        1             C*D*E
   ADD        1             A*B + C*D*E
   POP A      2
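
The stack program can be traced with a short Python sketch. The interpreter `run_stack_program`, the tuple encoding of instructions, and the sample values for A through E are assumptions made for illustration; memory references are counted the way the slide counts them (one per instruction fetch, plus one when PUSH or POP touches an operand in memory).

```python
def run_stack_program(program, memory):
    """Run a 0-address program and return the number of memory references."""
    stack = []
    mem_refs = 0
    for op, *operand in program:
        mem_refs += 1                        # fetching the instruction
        if op == "PUSH":
            stack.append(memory[operand[0]])
            mem_refs += 1                    # reading the operand
        elif op == "POP":
            memory[operand[0]] = stack.pop()
            mem_refs += 1                    # writing the result
        elif op == "MPY":
            stack.append(stack.pop() * stack.pop())
        elif op == "ADD":
            stack.append(stack.pop() + stack.pop())
        else:
            raise ValueError(f"unknown operation: {op}")
    return mem_refs


program = [("PUSH", "A"), ("PUSH", "B"), ("MPY",),
           ("PUSH", "C"), ("PUSH", "D"), ("PUSH", "E"),
           ("MPY",), ("MPY",), ("ADD",), ("POP", "A")]
memory = {"A": 2, "B": 3, "C": 4, "D": 5, "E": 6}   # sample values only
refs = run_stack_program(program, memory)
print(memory["A"], refs)   # prints: 126 16
```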


Instruction Set Architectures 8

Load/Store Architectures -- RISC

Use of registers is simple and efficient. Therefore, the only instructions that can access memory are load and store. All others reference registers.

   Code                                             # mem refs
   LOAD  R2, 100       ; R2 <- a                    2
   LOAD  R3, 104       ; R3 <- b                    2
   LOAD  R4, 108       ; R4 <- c                    2
   LOAD  R5, 112       ; R5 <- d                    2
   LOAD  R6, 116       ; R6 <- e                    2
   MPY   R2, R2, R3    ; R2 <- a*b                  1
   MPY   R3, R4, R5    ; R3 <- c*d                  1
   MPY   R3, R3, R6    ; R3 <- (c*d)*e              1
   ADD   R2, R2, R3    ; R2 <- a*b + (c*d)*e        1
   STORE 100, R2       ; a <- a*b + (c*d)*e         2


Instruction Set Architectures 9

Why load/store architectures?
– Number of instructions (hence, memory references to fetch them) is high, but can work without waiting on memory.
   CISC machines tend to need to have their more complex instructions interpreted in microcode.
– More room in CPU for registers and memory cache.
– Easier to overlap instruction execution through pipelining.

[Figure: four "Fetch … Execute" sequences illustrating overlapped (pipelined) instruction execution.]


Instruction Set Architectures 9

Side effects
– Register interlock: delaying execution until a memory read completes. The machine waits when necessary, to avoid erroneous results.

      ld  [%r1], %r2
      add %r2, 100, %r3

– Branch delays: the instruction after a branch is always executed.

Instruction scheduling
– Rearranging instructions to maximize efficiency of pipelining
   To prevent register interlock (loads on SPARC)
   To use branch delay slots (branches on SPARC)


SPARC Assembly Language 1

SPARC (Scalable Processor ARChitecture)
– Used in Sun workstations, descended from RISC-II developed at UC Berkeley
– General characteristics:
   32-bit word size (integer, address, register size, etc.)
   Byte-addressable memory
   RISC load/store architecture, 32-bit instructions, few addressing modes
   Many registers (32 general purpose, 32 floating point, various special purpose registers)
– ISEM: Instructional SPARC Emulator - nicer than a real machine for learning to write assembly language programs.


SPARC Assembly Language 2

Structure
– Line oriented: 4 types of lines
   Blank - ignored
   Labeled
      – Any line may be labeled. Creates a symbol in the listing. Labels must begin with a letter (other than 'L'), then any alphanumeric characters. A label must end with a colon ":". A label just assigns a name to an address.
   Assembler directives - e.g., .data, .word, .text, etc.
   Instructions
– Comments start after the "!" character and go to the end of the line.

           .data
   x_m:    .word 0x42
   y_m:    .word 0x20
   z_m:    .word 0
           .text
   start:
           set x_m, %r2      ! Load x into reg 2
           ld  [%r2], %r2
           set y_m, %r3      ! Load y into reg 3
           ld  [%r3], %r3


SPARC Assembly Language 3

Directives: instructions to the assembler
– Not executed by the machine
   .data -- following section contains declarations
      – Each declaration reserves and initializes a certain number of bits of storage for each of zero or more operands in the declaration.
         • .word -- 32 bits
         • .half -- 16 bits
         • .byte -- 8 bits
      E.g.,
                 .data
         w:      .half 27000
         x:      .byte 8
         y:      .byte 'm', 0x6e, 0x0, 0, 0
         z:      .word 0x3C5F
   .text -- following section contains executable instructions


SPARC Assembly Language 11

– More assembler directives (.asciz and .ascii):
   The following two declarations are equivalent to each other:
      msg01: .asciz "a phrase"
      msg01: .byte 'a', ' ', 'p', 'h', 'r'
             .byte 'a', 's', 'e', 0
   Note that .asciz generates one byte for each character between the quote (") marks in the operand, plus a null byte at the end.
   The .ascii directive does not generate that extra byte. The following three declarations are equivalent to each other:
      digits: .ascii "0123456789"
      digits: .byte '0', '1', '2', '3', '4', '5'
              .byte '6', '7', '8', '9'
      digits: .byte 0x30, 0x31, 0x32, 0x33, 0x34
              .byte 0x35, 0x36, 0x37, 0x38, 0x39
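
The difference between the two directives can be mimicked in Python. This is only a sketch of the byte sequences involved; the helper names `ascii_bytes` and `asciz_bytes` are hypothetical and are not assembler features.

```python
def ascii_bytes(text):
    """Bytes an .ascii declaration would reserve: one per character."""
    return text.encode("ascii")


def asciz_bytes(text):
    """Bytes an .asciz declaration would reserve: characters plus a null."""
    return text.encode("ascii") + b"\x00"


print(list(asciz_bytes("a phrase")))
print([hex(b) for b in ascii_bytes("0123456789")])
```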


SPARC Assembly Language

Memory alignment: .align 4
– Used when mixing allocations of bytes, words, halfwords, etc. and word boundary alignment is needed

Reserve bytes of space: .skip 20
– Useful for allocating large amounts of space (e.g., arrays)

Create a symbolic constant: .set mask, 0x0f
– Can now use the word "mask" anywhere we could use the constant 0x0f previously


SPARC Assembly Language 4

Registers -- 32 bits wide
– 32 general purpose integer registers, known by several names to the assembler
   %r0-%r7 also known as %g0-%g7, global registers -- note, %r0 always contains the value 0.
   %r8-%r15 also known as %o0-%o7, output registers
   %r16-%r23 also known as %l0-%l7, local registers
   %r24-%r31 also known as %i0-%i7, input registers
   Use the %r0-%r31 names for now. The other names are used in procedure calls.
– 32 floating point registers %f0-%f31. Each register is single precision. Double precision uses register pairs.


SPARC Assembly Language 5

Assembly language
– 3-address operations - format different from book
      op src1, src2, dest        ! opposite of text
   E.g.,
      add %r1, %r2, %r3          ! %r3 <- %r1 + %r2
      or  %r2, 0x0004, %r2       ! %r2 <- %r2 bitwise-or 0x0004
– Contrast SPARC with MIPS (used in the book)
   indirect address notation: @addr vs. [addr]
   operand order, especially the destination register
   register notation: R2 vs. %r2
   branches


SPARC Assembly Language 6

– 2-address operations: load and store
      ld [addr], %r2        ! %r2 <- M[addr]
      st %r2, [addr]        ! M[addr] <- %r2
– Use set to put an address (a label, a symbolic constant) into a register, followed by ld to load the data itself.
      set x_m, %r1          ! put addr x_m into %r1
      ld  [%r1], %r2        ! use addr in %r1 to load %r2


SPARC Assembly Language 7

Immediate values: the operand is not an address, but a value
   E.g., add %rs, siconst13, %rd    ! %rd <- %rs + const

Immediate value coded as 13-bit 2's complement. The range is, then, -2^12 … 2^12 - 1, or -4096 to 4095.

Immediate values can be specified in decimal, hexadecimal, octal, or binary. E.g., add %r2, 0x1A, %r2

The constant is coded into the instruction itself, and is therefore available after fetching the instruction (no extra trip to memory for an operand).

On SPARC, there is no special notation for differentiating constants from addresses because there is no ambiguity in a load/store architecture.
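
The 13-bit range can be checked mechanically. The Python sketch below uses hypothetical helper names (`fits_simm13`, `encode_simm13`) and shows only the two's-complement arithmetic, not a real assembler interface.

```python
def fits_simm13(value):
    """True if value is representable as a 13-bit two's-complement number."""
    return -(1 << 12) <= value <= (1 << 12) - 1


def encode_simm13(value):
    """Return the 13-bit field holding value, or raise if it does not fit."""
    if not fits_simm13(value):
        raise ValueError(f"{value} does not fit in 13 bits")
    return value & 0x1FFF   # keep the low 13 bits (two's complement)


print(fits_simm13(-4096), fits_simm13(4095), fits_simm13(4096))
# prints: True True False
print(hex(encode_simm13(-1)), hex(encode_simm13(0x1A)))
# prints: 0x1fff 0x1a
```

A constant such as 0x4004 falls outside this range, which is why larger values need more than one machine instruction.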


SPARC Assembly Language 8

Synthetic instructions: the assembler translates one "instruction" into one or more machine instructions.
– set: used to load a 32-bit signed integer constant into a register. Has 2 operands - a 32-bit value and a register number. How does that fit into a 32-bit instruction?
   E.g., set iconst32, %rd
      set -10, %r3
      set x_m, %r4
      set '=', %r8
– clr %rd: used to set all bits in a register to 0. How?
– mov %rs, %rd: copies a register.
– neg %rs, %rd: copies the negation of a register.


SPARC Assembly Language 9

– Operand sizes
   double word = 8 bytes, word = 4 bytes, half word = 2 bytes, byte = 8 bits. Recall memory alignment issues.
      set  x_m, %r2        ! put addr x_m in %r2
      ld   [%r2], %r1      ! load word
      ldsb [%r2], %r1      ! load byte, sign extended
      ldub [%r2], %r1      ! load byte, extend with 0's
      st   %r1, [%r2]      ! store word, addr is mult of 4
      stb  %r1, [%r2]      ! store byte, any address
      sth  %r1, [%r2]      ! store half word, address is even
– Characters use 8 bits
   ldub to load a character
   stb to store a character


SPARC Assembly Language 10

– Traps: provide initial help with I/O, also used in operating systems programming.
   ta 0 : terminate program
   ta 1 : output ASCII character from %r8
   ta 2 : input ASCII character into %r8
   ta 4 : output integer from %r8 in unsigned hexadecimal
   ta 5 : input integer into %r8, can be decimal, octal, or hex
   E.g.,
      set '=', %r8      ! put '=' in %r8
      ta  1             ! output the '='
      ta  5             ! read in value into %r8
      mov %r8, %r1      ! copy %r8 into %r1
      set 0x0a, %r8     ! load a newline into %r8
      ta  1             ! output the newline


SPARC Assembly Language 12

– Quick review of instructions so far:
      ld  [addr], %rd            ! %rd <- M[addr]
      st  %rd, [addr]            ! M[addr] <- %rd
      op  %rs1, %rs2, %rd        ! op is ALU op
      op  %rs, siconst13, %rd    ! %rd <- %rs op const
      set siconst32, %rd         ! %rd <- const
      ta  #                      ! trap signal
– Have actually seen many more variants, e.g., ldub, ldsb, sth, clr, mov, neg, add, sub, smul, sdiv, umul, udiv, etc. Can evaluate just about any simple arithmetic expression.


Review: SPARC Loads, Stores

           .data
   x_m:    .word 0xa1b2c3d4
           .skip 12
           .text
           set  x_m, %r2
           ld   [%r2], %r3
           ldsb [%r2], %r4
           ldub [%r2], %r5
           st   %r3, [%r2+4]
           sth  %r3, [%r2+8]
           stb  %r3, [%r2+12]
           ta   0

After this runs, what values are in %r2-5, and in the memory locations starting at byte address x_m?
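
One way to check an answer to the review question is to model the 16 bytes at x_m in Python. This is a sketch under stated assumptions: SPARC's big-endian byte order, the 12 bytes reserved by .skip starting out as zero, and offsets measured from x_m (the actual address placed in %r2 depends on where the assembler puts the data). The helper names are hypothetical.

```python
MASK32 = 0xFFFFFFFF

# 16 bytes starting at x_m: the .word followed by the 12 skipped bytes.
mem = bytearray(16)
mem[0:4] = (0xA1B2C3D4).to_bytes(4, "big")


def ld(off):            # load word
    return int.from_bytes(mem[off:off + 4], "big")


def ldsb(off):          # load byte, sign extended to 32 bits
    return int.from_bytes(mem[off:off + 1], "big", signed=True) & MASK32


def ldub(off):          # load byte, zero extended
    return mem[off]


def st(value, off):     # store word
    mem[off:off + 4] = (value & MASK32).to_bytes(4, "big")


def sth(value, off):    # store the low half word
    mem[off:off + 2] = (value & 0xFFFF).to_bytes(2, "big")


def stb(value, off):    # store the low byte
    mem[off] = value & 0xFF


r3, r4, r5 = ld(0), ldsb(0), ldub(0)
st(r3, 4)
sth(r3, 8)
stb(r3, 12)
print(hex(r3), hex(r4), hex(r5))   # prints: 0xa1b2c3d4 0xffffffa1 0xa1
print(mem.hex())                   # prints: a1b2c3d4a1b2c3d4c3d40000d4000000
```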


Flow of Control 1

In addition to sequential execution, need the ability to repeatedly and conditionally execute program fragments.
– High level language has: while, for, do, repeat, case, if-then-else, etc.
– Assembler has if, goto.
– Compare: high level vs. pseudo-assembler, implementation of f = n!

   High level:

      f = 1;
      i = 2;
      while (i <= n)
      {
         f = f * i;
         i = i + 1;
      }

   Pseudo-assembler:

            f = 1
            i = 2
      loop: if (i > n) goto done
            f = f * i
            i = i + 1
            goto loop
      done: ...


Flow of Control 2

– Branch -- put a new address in the program counter. The next instruction comes from the new address: effectively, a "goto".
– Unconditional branch
      (book)   BRANCH addr     ! PC <- addr
      (SPARC)  ba addr         ! PC <- addr
– Conditional branch
      (book)   BRcc R1, R2, target
   "if R1 cc R2 then PC <- target", where cc is a comparison operation (e.g., LT is <, GE is >=, etc.)


Flow of Control 4

Other conditions (from the text, very similar to MIPS)

      BRLT Rn, Rm, target    ; if Rn <  Rm then PC <- target
      BRLE Rn, Rm, target    ; if Rn <= Rm then PC <- target
      BREQ Rn, Rm, target    ; if Rn =  Rm then PC <- target
      BRNE Rn, Rm, target    ; if Rn != Rm then PC <- target
      BRGE Rn, Rm, target    ; if Rn >= Rm then PC <- target
      BRGT Rn, Rm, target    ; if Rn >  Rm then PC <- target

Can implement high level control structures now.
– Factorial example, using the book's assembly language:

            LOAD   R1, #1           ; R1 = f = 1
            LOAD   R2, #2           ; R2 = i = 2
            LOAD   R3, n            ; R3 = n
      loop: BRGT   R2, R3, done     ; branch if i > n
            MPY    R1, R1, R2       ; f = f * i
            ADD    R2, R2, #1       ; i = i + 1
            BRANCH loop             ; goto loop
      done: STORE  f, R1            ; f = n!


Flow of Control 3

Evaluating conditional branches
– Evaluate the condition
– If the condition is true, then PC <- target, else PC <- PC+1

[Figure: fetch/execute flowchart with decision nodes OP=BRANCH, OP=BRcc, and Cond=T (Yes/No exits) and actions "Addr to bus, load PC" and "PC to bus, etc."]

Consider changes to the fetch-execute cycle given earlier for the accumulator machine.
• Do data paths need to change?
• New control paths?
• New opcodes?
• New instruction formats?


Flow of Control 5

Condition codes
– The book's assembly language has 3-address branches. SPARC uses 1-address branches. Must use condition codes.
– Non-MIPS machines use condition codes to evaluate branches. The Condition Code Register (CCR) holds these bits. SPARC has a 4-bit CCR:
      N Z V C
– N: Negative, Z: Zero, V: Overflow, C: Carry. All are shown in a trace, or in the reg command under ISEM.
– Condition codes are not changed by normal ALU instructions. Must use special instructions ending with cc, e.g., addcc.


ALU Hardware 1

Recall the half-adder
– A full-adder adds three single-digit binary numbers. Results in a sum and a carry out.

[Figure: full-adder (FA) block and circuit with inputs x, y, cin and outputs Sum, cout.]

   Cin  X  Y  |  Sum  Cout
    0   0  0  |   0    0
    0   0  1  |   1    0
    0   1  0  |   1    0
    0   1  1  |   0    1
    1   0  0  |   1    0
    1   0  1  |   0    1
    1   1  0  |   0    1
    1   1  1  |   1    1


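ALU Hardware 2

Now cascade the full adder hardware.

[Figure: ripple-carry adder: a chain of FA blocks adding register x and register y into register z, with a 0 carry input at one end and cout at the other.]

How are the CCR bits set? (Above is a ripple-carry adder.)
– C-bit = Cout
– V-bit = Cout XOR C(n-1), where C(n-1) is the carry into the most significant bit
– Z-bit = NOT (rz(n-1) OR rz(n-2) OR rz(n-3) OR ... OR rz(0))
– N-bit = rz(n-1)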
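
The cascade and the four CCR bits can be simulated bit by bit. The Python sketch below is an illustration only: `full_adder` follows the truth table for Sum and Cout, while `ripple_add` and the flag dictionary are hypothetical names, and the word size of 32 bits is an assumption matching SPARC registers.

```python
def full_adder(x, y, cin):
    """One-bit full adder: returns (sum, carry_out)."""
    s = x ^ y ^ cin
    cout = (x & y) | (cin & (x ^ y))
    return s, cout


def ripple_add(x, y, n=32):
    """Add two n-bit values with cascaded full adders; return (result, flags)."""
    carry = 0                 # carry into bit 0 is 0
    carry_into_msb = 0
    result = 0
    for i in range(n):
        if i == n - 1:
            carry_into_msb = carry
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    flags = {
        "N": (result >> (n - 1)) & 1,    # sign bit of the result
        "Z": int(result == 0),           # NOR of all result bits
        "V": carry ^ carry_into_msb,     # Cout XOR carry into the top bit
        "C": carry,                      # Cout of the top full adder
    }
    return result, flags


print(ripple_add(0x7FFFFFFF, 1))
# prints: (2147483648, {'N': 1, 'Z': 0, 'V': 1, 'C': 0})
print(ripple_add(0xFFFFFFFF, 1))
# prints: (0, {'N': 0, 'Z': 1, 'V': 0, 'C': 1})
```

The first call overflows as a signed addition (V set) without a carry out; the second carries out of the top bit without signed overflow.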


Flow of Control 6

           .text
   start:  set   1, %r2
           set   0xFFFFFFFE, %r1     ! -2 in 32-bit 2's comp
   cc_set: subcc %r1, %r2, %r3       ! r3 <= -2-1
   end:    ta    0

ISEM> reg
   ----0--- ----1--- ----2--- ----3--- ----4--- ----5--- ----6--- ----7---
G  00000000 fffffffe 00000001 00000000 00000000 00000000 00000000 00000000
O  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
L  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
I  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
PC: 08:00002028 nPC: 0000202c PSR: 0000003e N:0 Z:0 V:0 C:0
cc_set : subcc %g1, %g2, %g3
ISEM> trace
   ----0--- ----1--- ----2--- ----3--- ----4--- ----5--- ----6--- ----7---
G  00000000 fffffffe 00000001 fffffffd 00000000 00000000 00000000 00000000
O  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
L  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
I  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
PC: 08:0000202c nPC: 00002030 PSR: 00b0003e N:1 Z:0 V:0 C:0


Flow of Control 7

– Setting the condition codes
   Regular ALU operations don't set condition codes.
   Use addcc, subcc, smulcc, sdivcc, etc., to set condition codes.
– Consider
      subcc %r1, %r2, %r0

      %r1  %r2  |  N  Z  V  C
       1    0   |
       0    1   |
       1    1   |

   Do the values in the CCR tell us anything about the relationship between %r1 and %r2?


Flow of Control 8

– Branches use logic to evaluate CCR (SPARC)

   Operation                        Assembler Syntax   Branch Condition
   Branch always                    ba target          1 (always)
   Branch never                     bn target          0 (never)
   Branch not equal                 bne target         not Z
   Branch equal                     be target          Z
   Branch greater                   bg target          not (Z or (N xor V))
   Branch less or equal             ble target         Z or (N xor V)
   Branch greater or equal          bge target         not (N xor V)
   Branch less                      bl target          N xor V
   Branch greater, unsigned         bgu target         not (C or Z)
   Branch less or equal, unsigned   bleu target        C or Z
   Branch carry clear               bcc target         not C
   Branch carry set                 bcs target         C
   Branch positive                  bpos target        not N
   Branch negative                  bneg target        N
   Branch overflow clear            bvc target         not V
   Branch overflow set              bvs target         V
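
How a subcc followed by a branch decides a comparison can be sketched in Python. The names `subcc_flags`, `CONDITIONS`, and `branch_taken` are hypothetical; the sketch assumes 32-bit registers and that C after a subtraction records a borrow (the first operand is smaller as an unsigned number), and it covers only a subset of the branches.

```python
MASK32 = 0xFFFFFFFF


def subcc_flags(a, b):
    """Condition codes after subcc a, b on 32-bit values."""
    a &= MASK32
    b &= MASK32
    r = (a - b) & MASK32
    sa, sb, sr = a >> 31, b >> 31, r >> 31
    return {
        "N": sr,
        "Z": int(r == 0),
        "V": int(sa != sb and sr != sa),   # signed overflow on subtraction
        "C": int(a < b),                   # borrow (unsigned a < b)
    }


CONDITIONS = {
    "be":   lambda f: f["Z"],
    "bne":  lambda f: not f["Z"],
    "bg":   lambda f: not (f["Z"] or (f["N"] ^ f["V"])),
    "ble":  lambda f: f["Z"] or (f["N"] ^ f["V"]),
    "bge":  lambda f: not (f["N"] ^ f["V"]),
    "bl":   lambda f: f["N"] ^ f["V"],
    "bgu":  lambda f: not (f["C"] or f["Z"]),
    "bleu": lambda f: f["C"] or f["Z"],
}


def branch_taken(mnemonic, flags):
    return bool(CONDITIONS[mnemonic](flags))


flags = subcc_flags(0xFFFFFFFF, 1)   # -1 vs 1 signed, 4294967295 vs 1 unsigned
print(branch_taken("bl", flags), branch_taken("bgu", flags))   # prints: True True
```

The same bit pattern is "less" under the signed test and "greater" under the unsigned test, which is why the table carries both bg/bl and bgu/bleu forms.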


Flow of Control 9

– Setting condition codes (continued)
   Synthetic instruction cmp %rs1, %rs2
      – Sets CCR, but doesn't modify any registers.
      – Implemented as subcc %rs1, %rs2, %g0
   Back to the factorial example (SPARC)

            set  1, %r1           ! %r1 = f = 1
            set  2, %r2           ! %r2 = i = 2
            set  n, %r3           ! Get loc of n
            ld   [%r3], %r3       ! Put n in %r3
      loop: cmp  %r2, %r3         ! Set CCR (i ? n)
            bg   done             ! i > n: done
            nop                   ! Branch delay
            umul %r1, %r2, %r1    ! f = f * i
            add  %r2, 1, %r2      ! i = i + 1
            ba   loop             ! Goto loop
            nop                   ! Branch delay
      done: set  f, %r3           ! Get loc of f
            st   %r1, [%r3]       ! f = n!


Flow of Control 10

– Branch delay slots: unique to RISC architecture
   Non-technical explanation: the processor is running so fast, it can't make a quick turn.
      – The instruction following a branch is always executed.
   Technical explanation: the efficiency advantage of pipelining is greater if the following instruction, which has almost completed execution, is allowed to complete.
   Compilers take advantage of branch delay slots by putting a useful instruction there if possible.
   For our purposes, use the nop (no operation) instruction to fill branch delay slots.

Beware! Forgetting the nop will be a large source of errors in your programs!


High Level Control Structures 1

Converting high level control structures
– You get to be the "compiler".
   Some compilers convert the source language (C, Pascal, Modula 2, etc.) into assembly language and then assemble the result to an object file. GNU C and C++ do this, targeting GAS (the GNU Assembler).
– if-then-else, while-do, and repeat-until can all be created in a structured way in assembly language.


High Level Control Structures 2

General guidelines
– Break down into independent (or nested) logical units
– Convert to if/goto pseudo-code.
– Mechanical, step-by-step, non-creative process

      f = 1;
      for (i=2; i<=n; i++)
         f = f * i;

            f = 1
            i = 2
      loop: if (i>n) goto done
            f = f*i
            i = i+1
            goto loop
      done: ...


High Level Control Structures 3

if-then-else

      if (a<b)
         c = d + 1;
      else
         c = 7;

if/goto

            if (a >= b) goto else
            c = d + 1
            goto end
      else: c = 7
      end:

      init: set a, %r2          ! get &a into r2
            ld  [%r2], %r2      ! get a into r2
            set b, %r3          ! get &b into r3
            ld  [%r3], %r3      ! get b into r3
      if:   cmp %r2, %r3        ! a ?? b (want >=)
            bge else            ! a >= b, do else
            nop
            set d, %r5          ! get &d into r5
            ld  [%r5], %r5      ! get d into r5
            add %r5, 1, %r4     ! r4 <- d+1
            ba  end
            nop
      else: set 7, %r4          ! get 7 into r4
      end:  set c, %r5          ! get &c into r5
            st  %r4, [%r5]      ! c <- r4


High Level Control Structures 4

while loops:

      while (a<b)
         a = a+1;
      c = d;

if/goto:

      whle: if (a>=b) goto done
      body: a = a+1
            goto whle
      done: c = d

      init: set a, %r4          ! get &a into r4
            ld  [%r4], %r2      ! get a into r2
            set b, %r3          ! get &b into r3
            ld  [%r3], %r3      ! get b into r3
      whle: cmp %r2, %r3        ! a ?? b (want >=)
            bge done            ! a >= b, skip body
            nop
      body: add %r2, 1, %r2     ! r2 = a + 1
            st  %r2, [%r4]      ! a = a + 1
            ba  whle            ! repeat loop body
            nop
      done: set c, %r5          ! get &c into r5
            ...


High Level Control Structures 5

repeat-until loops:

      repeat
         …
      until (a>b)

if/goto:

      repeat: …
              if (a<=b) goto repeat

      rpt:  ...
            ...
            set a, %r2          ! get &a into r2
            ld  [%r2], %r2      ! get a into r2
            set b, %r3          ! get &b into r3
            ld  [%r3], %r3      ! get b into r3
            cmp %r2, %r3        ! a <= b?
            ble rpt             ! do body again
            nop


High Level Control Structures 6

Complex conditions

   if ((a<b) and (b>=c)) …

   Primitive language:

            if (a>=b) then goto skip
            if (b<c) then goto skip
      body: ...
            ...
      skip: ...

   if ((a<b) or (b>=c)) …

   Primitive language:

            if (a<b) then goto body
            if (b<c) then goto skip
      body: ...
            ...
      skip: ...

These can be combined and used in if/else or while loops.


Flow of Control 11

– Optimizing code: change order of instructions, combine instructions, take advantage of branch delay slots.
   Factorial example again. (for i:=n downto 1 do…)

            set   1, %r1           ! %r1 = f = 1
            set   n, %r2           ! Get loc of n
            ld    [%r2], %r2       ! Put n in %r2
      loop: umul  %r1, %r2, %r1    ! f = f*n
            subcc %r2, 1, %r2      ! Decrement n
            bg    loop             ! Repeat
            nop                    ! Branch delay
            set   f, %r3           ! Get loc of f
            st    %r1, [%r3]       ! f = n!

   Reduced 7 instructions in loop to just 4. (You gain no advantage if you optimize code in your labs.)


Synthetic Instructions

Remember lab0?

           .data
   x_m:    .word 0x42
   y_m:    .word 0x20
   z_m:    .word 0
           .text
   start:
           set x_m, %r2
           ld  [%r2], %r2
           set y_m, %r3
           ld  [%r3], %r3
           and so on…

Suppose you gave this command to ISEM (after loading):

   ISEM> dump start
   start 05 00 00 10 84 10 a0 00 c4 00 80 00 07 00 00 10

Could you find the set instruction?


Instruction Encodings 1

First, instruction encoding is how instructions are assembled.
– All instructions must fit into 32 bits.

Register-register: op=10, i=0

   Field:  op      rd      op3     rs1     i    asi    rs2
   Bits:   31-30   29-25   24-19   18-14   13   12-5   4-0

Register-immediate: op=10, i=1

   Field:  op      rd      op3     rs1     i    simm13
   Bits:   31-30   29-25   24-19   18-14   13   12-0

Floating point: op=10, i=0

   Field:  op   rd   op3   rs1   i   opf   rs2


Instruction Encodings 2

Call instructions: op=01

   Field:  op      disp30
   Bits:   31-30   29-0

Branch instructions: op=00, op2=010

   Field:  op      a    cond    op2     disp22
   Bits:   31-30   29   28-25   24-22   21-0

SETHI instructions: op=00, op2=100

   Field:  op   rd   op2   imm22

Ex.: add %r2, %r3, %r4

   Field:  op      rd      op3      rs1     i + asi     rs2
   Bits:   31-30   29-25   24-19    18-14   13-5        4-0
   Value:  10      00100   000000   00010   000000000   00011

   in hexadecimal: 88008003
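
The add example can be reproduced by packing the fields with shifts. The Python sketch below uses a hypothetical helper `encode_format3`; the op3 value for add comes from the example above, and the op3 value 000010 for or is taken from the bit pattern of the `or %r3, 4, %r3` encoding shown later in these notes.

```python
OP3_ADD = 0b000000
OP3_OR = 0b000010


def encode_format3(op3, rd, rs1, rs2=None, simm13=None):
    """Pack an op=10 instruction; pass rs2 for register form, simm13 for immediate."""
    word = (0b10 << 30) | (rd << 25) | (op3 << 19) | (rs1 << 14)
    if simm13 is not None:
        word |= (1 << 13) | (simm13 & 0x1FFF)   # i=1, 13-bit immediate
    else:
        word |= rs2                             # i=0, asi=0, rs2 in bits 4-0
    return word


print(f"{encode_format3(OP3_ADD, rd=4, rs1=2, rs2=3):08x}")     # prints: 88008003
print(f"{encode_format3(OP3_OR, rd=3, rs1=3, simm13=4):08x}")   # prints: 8610e004
```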


Decoding an Instruction

   05 00 00 10 (hex) = 0000 0101 0000 0000 0000 0000 0001 0000 (binary)

Instruction Group (bits 30:31) = 00

Destination Register (bits 25:29) = 00010

Op Code (bits 22:24) = 100

Constant (bits 0:21) = 0000000000000000010000

Meaning: sethi 0x10, %r2

%r2 <-- 00000000000000000100000000000000 (0x4000)
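
The same decoding can be done with shifts and masks. This Python sketch uses a hypothetical function `decode_sethi` and handles only the SETHI format (op=00, op2=100); anything else raises an error.

```python
def decode_sethi(word):
    """Split a SETHI instruction word into (rd, imm22, value placed in rd)."""
    op = (word >> 30) & 0x3        # instruction group, bits 31:30
    rd = (word >> 25) & 0x1F       # destination register, bits 29:25
    op2 = (word >> 22) & 0x7       # op code, bits 24:22
    imm22 = word & 0x3FFFFF        # constant, bits 21:0
    if op != 0b00 or op2 != 0b100:
        raise ValueError("not a SETHI instruction")
    # SETHI puts imm22 in the most significant 22 bits and clears the low 10.
    return rd, imm22, (imm22 << 10) & 0xFFFFFFFF


rd, imm22, value = decode_sethi(0x05000010)
print(rd, hex(imm22), hex(value))   # prints: 2 0x10 0x4000
```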


Understanding SET Synthetic

Usually used to put the value of an address in memory into a register.

For example, set 0x4004, %r3
   Can do neither 'add %r0, 0x4004, %r3' nor 'or %r0, 0x4004, %r3'. Why not?

SET is a synthetic instruction which may be implemented in two steps.

#1  sethi 0x10, %r3        ! Puts 0x10 in the most significant 22 bits

                   bits 31 ... 0                               hex value
    %r3 (before)   0001 0010 0100 1000 0001 0010 0100 1000     0x12481248
    0x10           0000 0000 0000 0000 0100 00xx xxxx xxxx     0x10
    %r3 (after)    0000 0000 0000 0000 0100 0000 0000 0000     0x4000

#2  or %r3, 0x0004, %r3    ! Puts 0x0004 in the least significant bits

                   bits 31 ... 0                               hex value
    %r3 (before)   0000 0000 0000 0000 0100 0000 0000 0000     0x4000
    0x0004         0000 0000 0000 0000 0000 0000 0000 0100     0x00000004
    %r3 (after)    0000 0000 0000 0000 0100 0000 0000 0100     0x4004

Machine language encoding for 'set 0x4004, %r3'

    sethi 0x10, %r3    0000 0111 0000 0000 0000 0000 0001 0000     0x 07 00 00 10
    or %r3, 4, %r3     1000 0110 0001 0000 1110 0000 0000 0100     0x 86 10 E0 04


SET Synthetic Instruction

set iconst, rd

is assembled as one of the following:

sethi %hi(iconst), rd
or rd, %lo(iconst), rd

--or--

sethi %hi(iconst), rd

--or--

or %g0, iconst, rd
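A rough Python sketch of how an assembler might choose among the three expansions. The selection rule here is an assumption for illustration: a single `or` when the constant fits in the signed 13-bit immediate, a single `sethi` when the low 10 bits are zero, both instructions otherwise. The helper name `expand_set` is hypothetical.

```python
def expand_set(iconst, rd):
    """Return the instructions that could implement `set iconst, rd`."""
    if -4096 <= iconst <= 4095:          # fits in simm13: one `or` is enough
        return [f"or %g0, {iconst}, {rd}"]
    value = iconst & 0xFFFFFFFF
    hi = value >> 10                     # %hi(iconst): most significant 22 bits
    lo = value & 0x3FF                   # %lo(iconst): least significant 10 bits
    if lo == 0:                          # nothing left for the `or` to add
        return [f"sethi {hi:#x}, {rd}"]
    return [f"sethi {hi:#x}, {rd}", f"or {rd}, {lo:#x}, {rd}"]


for constant in (0x4004, 0x4000, 100):
    print(f"set {constant:#x}, %r3 ->", expand_set(constant, "%r3"))
```

The first case reproduces the two-step expansion of set 0x4004, %r3. It also shows why a single add or or cannot load 0x4004: the constant does not fit in 13 signed bits.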


SPARC Assembly Language

Memory alignment: .align 4
– Used when mixing allocations of bytes, words, halfwords, etc. and word boundary alignment is needed

Reserve bytes of space: .skip 20
– Useful for allocating large amounts of space (e.g., arrays)

Create a symbolic constant: .set mask, 0x0f
– Can now use the word "mask" anywhere we could use the constant 0x0f previously


SET and Symbolic Addresses

C-style example of pointer data type

char x;        // object of type character
char * ptr;    // pointer to character type
ptr = &x;      // ptr has address of x (points to x)
*ptr = 'a';    // store 'a' at address in ptr

Assembly language equivalent

        .data
x_m:    .byte 0          ! reserve character space; x_m = &x; [x_m] = x
        .align 4         ! align to word boundary
ptr_m:  .word 0          ! pointer variable; [ptr_m] = ptr

        .text
        set x_m, %r1     ! get address x_m into %r1
        set ptr_m, %r2   ! get address ptr_m into %r2
        st %r1, [%r2]    ! make [ptr_m] point to [x_m]
        set 'a', %r3     ! put character 'a' into r3
        set ptr_m, %r2   ! get address ptr_m into %r2
        ld [%r2], %r1    ! get address [ptr_m], i.e. x_m, into %r1
        stb %r3, [%r1]   ! store 'a' at address [ptr_m], i.e., ptr

[Figure: memory and registers after execution. Memory: x_m holds 'a'; ptr_m holds x_m, i.e., addr of x. Registers: r1 = x_m, r2 = ptr_m, r3 = 'a'.]


Bitwise Operations 1

Bit Manipulation Instructions
– Bitwise logical operations

and %rs1, %rs2, %rd
    10010011… (32 bits)
    01111001…

    x y | xy
    0 0 | 0
    0 1 | 0
    1 0 | 0
    1 1 | 1

or %rs1, %rs2, %rd
    10010011… (32 bits)
    01111001…

    x y | x+y
    0 0 | 0
    0 1 | 1
    1 0 | 1
    1 1 | 1

xor %rs1, %rs2, %rd
    10010011… (32 bits)
    01111001…

    x y | x⊕y
    0 0 | 0
    0 1 | 1
    1 0 | 1
    1 1 | 0


Bitwise Operations 2

(In the tables below, y' denotes NOT y.)

andn %rs1, %rs2, %rd
    10010011… (32 bits)
    01111001…

    x y | xy'
    0 0 | 0
    0 1 | 0
    1 0 | 1
    1 1 | 0

orn %rs1, %rs2, %rd
    10010011… (32 bits)
    01111001…

    x y | x+y'
    0 0 | 1
    0 1 | 0
    1 0 | 1
    1 1 | 1

not %rs, %rd
    10010011… (32 bits)

    x | x'
    0 | 1
    1 | 0

Recall the cc operations, so andcc, orcc, etc. are available. (However, there is no notcc; use xnorcc.)
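As a quick check of the truth tables, here is a small Python sketch that applies each operation to the two example bit patterns. It assumes only the eight bits shown on the slides (the real registers are 32 bits wide), and the dictionary name `results` is just for this example.

```python
MASK8 = 0xFF                    # assumption: work with the 8 bits shown, not 32

a = 0b10010011                  # first operand (%rs1)
b = 0b01111001                  # second operand (%rs2)

results = {
    "and":  a & b,
    "or":   a | b,
    "xor":  a ^ b,
    "andn": a & ~b & MASK8,     # a AND (NOT b)
    "orn":  (a | ~b) & MASK8,   # a OR (NOT b)
    "not":  ~a & MASK8,
    "xnor": ~(a ^ 0) & MASK8,   # xnor with zero behaves like not
}

for name, value in results.items():
    print(f"{name:5s} {value:08b}")
```

The last entry illustrates why xnorcc can stand in for the missing notcc: xnor against a zero operand inverts every bit.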


Bitwise Operations 3

For what kinds of things are these bit level operations used?

Recall the synthetic operations clr and mov.

    Synthetic        Implemented as
    clr %r2          or %r0, %r0, %r2
    mov %r2, %r3     or %r0, %r2, %r3

Masking operations: Want to select a bit or group of bits from a set of 32. E.g., convert lower (or upper) to upper case:

‘a’ in binary is 01100001

‘A’ in binary is 01000001

All we need to do is “turn off” the bit in position 5.

and %r1, 0b11011111, %r1 will turn off that bit!

What if we subtract 32 (0b100000) from %r1?
What about converting upper to lower case?
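The masking idea can be tried out in Python. This is a sketch that only considers ASCII letters; the function names are invented for the example.

```python
def to_upper_mask(ch):
    """Clear bit 5, as `and %r1, 0b11011111, %r1` does."""
    return chr(ord(ch) & 0b11011111)


def to_upper_sub(ch):
    """Subtract 32 instead of masking."""
    return chr(ord(ch) - 32)


def to_lower_mask(ch):
    """Set bit 5 with an OR mask to go from upper to lower case."""
    return chr(ord(ch) | 0b00100000)


for ch in "aA":
    print(ch, to_upper_mask(ch), to_upper_sub(ch), to_lower_mask(ch))
```

Masking is idempotent, so applying it to a letter that is already upper case leaves it unchanged. Subtracting 32 from an upper case letter yields a different character altogether.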


Bitwise Operations 4

– Bitwise shifting operations

Shift logical left: sll %rs1, %rs2, %rd
    %rs1: data to be shifted
    %rs2: shift count
    %rd: destination register

E.g.,
    set 0xABCD1234, %r2
    sll %r2, 3, %r3

    %r2: 1010 1011 1100 1101 0001 0010 0011 0100
    %r3: 0101 1110 0110 1000 1001 0001 1010 0000

sll is equivalent to multiplying by a power of 2 (barring overflow). (In the decimal system, what’s a shortcut for multiplying by a power of ten?)
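A minimal Python model of sll, assuming a 32-bit register and that only the low five bits of the shift count are used. The helper name `sll` simply mirrors the instruction.

```python
MASK32 = 0xFFFFFFFF


def sll(value, count):
    """Shift left logical on a 32-bit register; bits shifted out are lost."""
    return (value << (count & 31)) & MASK32


print(f"{sll(0xABCD1234, 3):#010x}")   # the slide's example
print(sll(5, 4))                       # 5 * 2**4, no overflow
```

The first call reproduces the %r3 bit pattern above. The second shows the multiply-by-a-power-of-2 reading when no bits are lost.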


Bitwise Operations 5

Shift Logical Right: srl %rs1, %rs2, %rd
– Shifts right instead of left, inserting zeros.

Arithmetic shifts: propagate the sign bit when shifting right, e.g., sra. (Left shift doesn't change.)
– Almost equivalent to dividing by a power of 2.

Rotating shifts: Bits that would have gone into the bit bucket are shifted in instead. (E.g., rr, rl)
– Rotate not implemented in SPARC

[Figure: Rotate Right and Rotate Left]
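The right shifts and the missing rotates can be modeled in Python as follows. This is a sketch under the assumption of 32-bit registers. The rotate helpers show one way to build a rotate from two shifts and an OR; they are not SPARC instructions.

```python
MASK32 = 0xFFFFFFFF


def srl(value, count):
    """Shift right logical: zeros enter from the left."""
    return (value & MASK32) >> (count & 31)


def sra(value, count):
    """Shift right arithmetic: copies of the sign bit enter from the left."""
    value &= MASK32
    if value & 0x80000000:               # reinterpret as a negative number
        value -= 1 << 32
    return (value >> (count & 31)) & MASK32


def rotl(value, count):
    """Rotate left, built from two shifts and an OR."""
    count &= 31
    value &= MASK32
    return ((value << count) | (value >> (32 - count))) & MASK32


def rotr(value, count):
    """Rotate right, expressed as a rotate left by the complementary count."""
    return rotl(value, 32 - (count & 31))


minus_seven = -7 & MASK32
print(f"{srl(0x80000000, 4):#010x} {sra(0x80000000, 4):#010x}")
print(f"sra(-7, 1) = {sra(minus_seven, 1):#010x}, int(-7 / 2) = {int(-7 / 2)}")
print(f"{rotl(0xABCD1234, 4):#010x} {rotr(0xABCD1234, 4):#010x}")
```

The -7 case illustrates the "almost": the arithmetic shift rounds toward negative infinity, while truncating division rounds toward zero.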


Addressing Modes 1

Addressing Modes
– How do we specify operand values?
    In a register: location is encoded in the instruction.
    As a constant: immediate value is in the instruction.
    In memory: operand is somewhere in memory; location may only be known at runtime.

– Memory operands:
    Effective address: actual location of operand in memory. This may be calculated implicitly (e.g., by a displacement in the instruction) or may be calculated by the programmer in code.


Addressing Modes 2

– Summary of addressing modes:

Mode               | Example                | Loc. of Operand               | Suitable for            | SPARC?
Immediate          | add %r1, 100, %r1      | instruction                   | Constants               | Yes
Register Direct    | add %r1, %r2, %r1      | %r2                           | Integers, constants     | Yes
Memory Direct      | add %r1, [2000], %r2   | mem[2000]                     | Integers, constants     | No
Memory Indirect    | add %r1, [[2000]], %r2 | mem[mem[2000]]                | Pointers                | No
Register Indirect  | ld [%r1], %r2          | mem[%r1]                      | Pointers                | Yes
Register Indexed   | st %r1, [%r2+%r3]      | mem[%r2+%r3]                  | Arrays                  | Yes
Register Displaced | st %r1, [%r2+x]        | mem[%r2+x]                    | Records                 | Yes
Post Increment     | ld [%r1]+, %r2         | mem[%r1], then increment %r1  | Arrays, strings, stacks | No
Pre Decrement      | ld -[%r1], %r2         | decrement %r1, then mem[%r1]  | Arrays, strings, stacks | No


Addressing Modes 3

– Memory Direct addressing
    Entire address is in the instruction (not in SPARC).
    E.g., accumulator machine: each instruction had an opcode and a hard address in memory.
    – Can't be done on SPARC because an address is 32 bits, which is the length of an instruction. No room for opcodes, etc. Can be done in CISC because multi-word instructions are permitted.

– Memory Indirect addressing
    Pointer to operand is in memory. Instruction specifies location of pointer. Requires three memory fetches (one each for instruction, pointer, and data). Not in RISC machines because instruction is too slow; such an instruction would cause its own register interlock!


Addressing Modes 4

Register Indirect addressing
– Register has address of operand (a pointer). Instruction specifies register number; effective address is contents of register.

        .data
n_m:    .word 5          ! initialize n to 5

        .text
        set n_m, %r1     ! %r1 has n_m, pointer to n
        ld [%r1], %r3    ! fetch n into %r3

– Simulating Register Indirect addressing on SPARC
    SPARC doesn't truly have register indirect addressing. Assembler converts 'st %r2, [%r1]' into 'st %r2, [%r1+%r0]'


Addressing Modes 5

Ex.: sum up array of integers:

        .data
n_m:    .word 5              ! Size of array
a_m:    .word 4,2,5,8,3      ! 5 word array
sum_m:  .word 0              ! Sum of elements
b_m:    .skip 5*4            ! another 5 word array

        .text
        clr %r2              ! r2 will hold sum
        set n_m, %r3         ! r3 points to n
        ld [%r3], %r3        ! r3 gets array size
        set a_m, %r4         ! r4 points to array a
loop:   ld [%r4], %r5        ! Load element of a into r5
        add %r5, %r2, %r2    ! sum = sum + element
        add %r4, 4, %r4      ! Incr ptr by word size
        subcc %r3, 1, %r3    ! Decrement counter
        bg loop              ! Loop until count = 0
        nop                  ! Branch delay slot
        set sum_m, %r1       ! r1 points to sum
        st %r2, [%r1]        ! Store sum
        ta 0                 ! done

[Figure: memory layout (n_m = 5; a_m through a_m+16 = 4, 2, 5, 8, 3; sum_m) and a trace table of r2, r3, r4, r5 across loop iterations: r3 counts down 5, 4, 3, 2, 1 while r4 steps through a_m, a_m+4, a_m+8, a_m+12, a_m+16.]


Register Indexed & Displaced

Recall these Assembler directives
    Reserve bytes of space: .skip 20
    Create a symbolic constant: .set offset, 0x16

Register Indexed and Displaced addressing modes help us work with pointers, arrays, and records in assembly language.


Addressing Modes 7

– Register Indexed addressing
    Suitable for accessing successive elements of the same type in a data structure.

Ex.: Swap elements A[i] and A[k] in array

        .data
A:      .skip 24*4           ! reserve array[0..23] of int

        ! assume i is in %r2 and k is in %r3
        .text
        set A, %r4           ! beginning of array ptr.
        sll %r2, 2, %r2      ! "multiply" i by 4
        sll %r3, 2, %r3      ! "multiply" k by 4
        ld [%r2+%r4], %r7    ! r7 <- a[i]      (effective address calculations!)
        ld [%r3+%r4], %r8    ! r8 <- a[k]
        st %r8, [%r2+%r4]    ! a[i] <- r8
        st %r7, [%r3+%r4]    ! a[k] <- r7

[Figure: array A in memory (A, A+4, A+8, A+12) and registers r2, r3, r4, r7, r8. After the sll instructions, r2 goes from 001 to 100 and r3 from 0010 to 1000; r4 holds A.]


Addressing Modes 8

Array mapping functions: used by compilers to determine addresses of array elements.
– Must know upper bound, lower bound, and size of elements of array.
    Total storage = (upper - lower + 1)*element_size
    Address offset for element at index k = (k - lower)*element_size

Address (byte) offset for A[3] = (3-0)*4 = 12

This is for 1 dimensional arrays only!
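The two formulas translate directly into Python. This is a sketch; the function and parameter names are chosen for the example.

```python
def total_storage(lower, upper, element_size):
    """Bytes needed for a 1D array with index range lower..upper."""
    return (upper - lower + 1) * element_size


def element_offset(k, lower, element_size):
    """Byte offset of element k from the start of the array."""
    return (k - lower) * element_size


print(element_offset(3, 0, 4))     # A[3] in a zero-based array of words
print(total_storage(0, 23, 4))     # array[0..23] of int
print(element_offset(5, 1, 4))     # a one-based array, for comparison
```

With lower = 0 and a 4-byte element, the offset reduces to 4k, which is exactly what an sll by 2 computes.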


Addressing Modes 9

1D array mapping functions: Want an array of n elements, each element is 4 bytes in size, array starts at address arr.
– Total storage is 4n bytes
– First element is at arr+0
– Last element is at arr+4(n-1)
– kth (k can range from 0…n-1) element is at arr+4k. Array uses zero-based indexing.

    k=0     k=1     k=2     k=3      k=4      k=5
    arr+0   arr+4   arr+8   arr+12   arr+16   arr+20

    array of 6 elements, 4 bytes each


Addressing Modes 10

2D array mapping functions: must linearize the 2D concept; i.e., map the 2D structure into 1D memory.

3 Rows (0...2), 5 Columns (0...4):

           0     1     2     3     4
     0    0,0   0,1   0,2   0,3   0,4
     1    1,0   1,1   1,2   1,3   1,4
     2    2,0   2,1   2,2   2,3   2,4

– Convert into 1D array in memory

    0,0  0,1  0,2  0,3  0,4  1,0  1,1  .....  2,3  2,4


Addressing Modes 11

2 ways to convert to 1D
– Row major order (Pascal, C, Modula-2) stores first by rows, then by columns. E.g.,

    0,0  0,1  0,2  0,3  0,4  1,0  1,1  .....  2,3  2,4

– Column major order (FORTRAN) stores first by columns then by rows. E.g.,

    0,0  1,0  2,0  0,1  1,1  2,1  0,2  .....  1,4  2,4


Addressing Modes

Row major 2D array mapping function: Given an array starting at address arr, that is x rows by y columns, each element is m bytes in size, and indices start at zero, then element (i, j) may be found at location:

    arr + (y*i + j)*m

For the 3-row by 5-column array (rows 0...2, columns 0...4):

    Offset to A(0,2) = (5 * 0 + 2) * element size
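A Python sketch of the row major mapping function. The column major counterpart is not given on the slide; it is added here for comparison, using the standard formula arr + (x*j + i)*m. Function names are illustrative.

```python
def row_major_addr(arr, i, j, y, m):
    """Address of element (i, j) when rows are stored one after another."""
    return arr + (y * i + j) * m


def col_major_addr(arr, i, j, x, m):
    """Address of element (i, j) when columns are stored one after another."""
    return arr + (x * j + i) * m


# 3 rows by 5 columns of 4-byte elements, array placed at address 0
print(row_major_addr(0, 0, 2, 5, 4))   # A(0,2): third element of row 0
print(row_major_addr(0, 1, 0, 5, 4))   # A(1,0): first element after row 0
print(col_major_addr(0, 0, 1, 3, 4))   # A(0,1): first element after column 0
```

Note that the row major formula needs the number of columns y but not the number of rows x, and the reverse holds for column major.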


Addressing Modes 12

– 3D array mapping function: natural extension of 2D function. Store by row, then column, then depth.
– Array starting at arr with x rows, y columns, depth z, m element size. Element (i, j, k) is found at location:

    arr + (z*(y*i + j) + k)*m

[Figure: 3 Rows, 5 Columns, 2 Depth]

    Depth k=1:
        0,0,1  0,1,1  0,2,1  0,3,1  0,4,1
        1,0,1  1,1,1  1,2,1  1,3,1  1,4,1
        2,0,1  2,1,1  2,2,1  2,3,1  2,4,1

    Depth k=0:
        0,0,0  0,1,0  0,2,0  0,3,0  0,4,0
        1,0,0  1,1,0  1,2,0  1,3,0  1,4,0
        2,0,0  2,1,0  2,2,0  2,3,0  2,4,0

    Element offsets labeled in the figure:
        row 0, k=0:   +0   +2   +4   +6   +8
        row 0, k=1:   +1   +3   +5   +7   +9
        row 1, k=0:   +10  +12  +14  +16  +18
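The 3D mapping function can be checked against the offsets labeled in the figure. This is a Python sketch; the element size is set to 1 so that the results are element offsets rather than byte addresses, and the function name is illustrative.

```python
def addr_3d(arr, i, j, k, y, z, m):
    """Address of element (i, j, k): row, then column, then depth."""
    return arr + (z * (y * i + j) + k) * m


# 3 rows, 5 columns, depth 2; arr = 0 and m = 1 give plain element offsets
Y, Z = 5, 2
for i, j, k in [(0, 0, 0), (0, 0, 1), (0, 1, 0), (1, 0, 0), (1, 4, 0)]:
    print((i, j, k), "+%d" % addr_3d(0, i, j, k, Y, Z, 1))
```

As in the 2D case, the size of the first dimension (x rows) never appears in the formula.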


Addressing Modes 15

– Displacement Addressing
    Suitable for accessing the individual fields of record data structures. Each field can be of a different type.
    Use .set directive to establish offsets to fields within records. Then use displacement addressing to access those fields.

Logical view of a record:

    Name    20 Characters
    Age     Integer
    DOB     Integer

Actual layout of record in memory:

    person+0          person+20        person+24
    Name (20 bytes)   Age (4 bytes)    DOB (4 bytes)


Addressing Modes 16

Ex.: Add 1 to the age field in a person record

Problem: alignment in memory. May have to waste some space in the person record in order to have the integer fields align on a word boundary.

        .data
        .set name, 0         ! offset to name field
        .set age, 20         ! offset to age field
        .set dob, 24         ! offset to date of birth
person: .skip 28             ! size of a person record

        .text
        ....
        set person, %r1      ! get addr of person record
        ld [%r1+age], %r2    ! get the age of the person
        add %r2, 1, %r2      ! increment age by 1
        st %r2, [%r1+age]    ! store back to record
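The same record layout can be modeled in Python with the standard `struct` module, treating a `bytearray` as the 28 bytes reserved by .skip. This is a sketch: the sample field values are made up, and big-endian packing is used because SPARC stores words big-endian.

```python
import struct

# Field offsets, mirroring the .set directives
NAME, AGE, DOB = 0, 20, 24
RECORD_SIZE = 28

person = bytearray(RECORD_SIZE)                  # person: .skip 28
struct.pack_into(">20s", person, NAME, b"Ada")   # sample data (assumed)
struct.pack_into(">i", person, AGE, 41)
struct.pack_into(">i", person, DOB, 19650412)


def increment_age(record):
    """Load the word at record+AGE, add 1, and store it back."""
    (age,) = struct.unpack_from(">i", record, AGE)   # ld [%r1+age], %r2
    struct.pack_into(">i", record, AGE, age + 1)     # st %r2, [%r1+age]


increment_age(person)
print(struct.unpack_from(">i", person, AGE)[0])
```

Because the name field is 20 bytes long, the two integer fields already fall on word boundaries. A 21-byte name would need three bytes of padding.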


Addressing Modes 17

– Auto-increment and Auto-decrement addressing
    SPARC does not support these modes. They may be simulated using register indirect addressing followed by an add or subtract of the size of the element on that register.
    Useful for traversing arrays forward (auto-increment) and backward (auto-decrement). Also useful for stacks and queues of data elements.


Subroutines 1

Subroutine (also function, method, procedure, or subprogram)
– a portion of code within a larger program, which performs a specific task and can be relatively independent of the remaining code.

Advantages of subroutines
– reducing the duplication of code in a program
– enabling reuse of code across multiple programs
– decomposing complex problems into simpler pieces
– improving readability of a program
– hiding or regulating part of the program

Requires little hardware support, mostly protocols and conventions to handle parameters.


Subroutines 2

Terminology

– Caller: the code (which could be a subroutine itself) which invokes the subroutine of interest

– Callee: the subroutine being invoked by the caller

– Function: subroutine that returns one or more values back to the caller and exactly one of these values is distinguished as the return value

– Return value: the distinguished value returned by a function


Subroutines 3

Terminology (continued)

– Procedure: a subroutine that may return values to the caller (through the subroutine's parameter(s)), but none of these values is distinguished as the return value

– Return address: address of the subroutine call instruction

– Parameters: information passed to/from a subroutine (a.k.a. arguments)

– Subroutine linkage: a protocol for passing parameters between the caller and the callee


Subroutines 4

Calling a subroutine
– Assembly language syntax for calling a subroutine

    call label
    nop

– Must change the program counter (as in a branch instruction); however, we must also keep track of where to resume execution after the subroutine finishes. Call instruction handles this atomically (i.e., without interruption) by:

    %r15 ← PC
    (PC ← nPC)
    nPC ← label


Subroutines 4

Returning from a subroutine
– Assembly language syntax for returning from a subroutine

    retl
    nop

Again, must change the program counter to return to an instruction after the one that called the subroutine. The address of the instruction that called it was saved in %r15, and we must skip over the branch delay slot as well. So, this is accomplished by:

    nPC ← %r15+8
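The register transfers for call and retl can be traced with a toy Python model of PC, nPC, and %r15. This is a sketch only: the addresses 0x100 and 0x200 are made up, the subroutine body is assumed to be just retl followed by its delay-slot nop, and `step` stands for any ordinary instruction such as a nop.

```python
def call(state, label):
    """call label: %r15 <- PC, PC <- nPC, nPC <- label."""
    state["r15"] = state["pc"]
    state["pc"], state["npc"] = state["npc"], label


def retl(state):
    """retl: PC <- nPC, nPC <- %r15 + 8."""
    state["pc"], state["npc"] = state["npc"], state["r15"] + 8


def step(state):
    """Any ordinary instruction: PC <- nPC, nPC <- nPC + 4."""
    state["pc"], state["npc"] = state["npc"], state["npc"] + 4


# Hypothetical layout: call at 0x100, its delay slot at 0x104, subroutine at 0x200
state = {"pc": 0x100, "npc": 0x104, "r15": 0}
trace = []
for action in (lambda s: call(s, 0x200), step, retl, step):
    action(state)
    trace.append(state["pc"])

print([hex(pc) for pc in trace])
```

The trace visits the call's delay slot, the subroutine, the retl's delay slot, and finally the instruction 8 bytes past the call. That is why the return adds 8 rather than 4.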


Subroutines 5

Parameter passing: 2 approaches

– Register based linkage: pass parameters solely through registers. Has the advantage of speed, but can only pass a few parameters, and it won't support nested subroutine calls. Such a subroutine is called a leaf subroutine.

– Stack based linkage: pass parameters through the run-time stack. Not as fast, but can pass more parameters and have nested subroutine calls (including recursion).


Register-based Linkage 1

– Subroutine linkage:
    Startup Sequence: load parameters and return address into registers, branch to subroutine.
    Prologue: if non-leaf procedure then save return address to memory, save registers used by callee.
    Epilogue: place return parameters into registers, restore registers saved in prologue, restore saved return address, return.
    Cleanup Sequence: work with returned values

[Figure: the Caller runs the Startup Sequence, then call; the Callee runs Prologue, Body, Epilogue, then retl; the Caller finishes with the Cleanup Sequence.]


Register-based Linkage 2

– Example: Print subroutine.

        .text
main:   set 1, %r1           ! Initialize r1 and r2
        set 3, %r2
        mov %r1, %r8         ! Print %r1
        call print
        nop
        mov %r2, %r8         ! Print %r2
        call print
        nop
        add %r1, %r2, %r8    ! Do our calculation
        call print           ! Print the result (expect '4')
        nop
        ta 0

print:  set '0', %r1         ! Ascii value of zero
        or %r8, %r1, %r2     ! Treat r8 as parameter
        mov %r2, %r8         ! Move into output register
        ta 1                 ! Output character
        mov '\n', %r8
        ta 1                 ! Output end of line (newline)
        retl                 ! Return
        nop

What’s wrong with the above code?


Register-based Linkage 3

– Which registers can leaf subroutines change?

Convention for optimized leaf procedures:

Register(s)      | Use                | Mentionable?
%r0              | Zero               | Yes
%r1              | Temporary          | Yes
%r2-%r7          | Caller's variables | No
%r8              | Return value       | Yes
%r8-%r13         | Parameters         | Yes
%r14             | Stack pointer      | No
%r15             | Return address     | Yes, but preserve
%r30             | Frame pointer      | No
%r16-%r29, %r31  | Caller's variables | No

The subroutine must not use the value in any other register except to save it to memory somewhere and restore it before returning to the caller.

Problem: how can a subroutine call another subroutine? How can a subroutine call itself?


Register-based Linkage 4

– Example: procedure to print linked list of ints.

        .data
        .set dta, 0          ! offset in record to data
        .set ptr, 4          ! offset in record to next pointer
head:   .word 0

        .text
main:   ....                 ! does all init and allocation of list
        set head, %r8        ! prepare parameter to traverse proc
        ld [%r8], %r8        ! follow head pointer to first node
        call trav            ! call subroutine
        nop                  ! branch delay
        ....

trav:   mov %r8, %r1         ! copy pointer to %r1
loop:   cmp %r1, 0           ! check for null pointer
        be done              ! null pointer means we are done
        nop                  ! branch delay
        ld [%r1+dta], %r8    ! follow pointer and get data field
        ta 4                 ! print data field
        ld [%r1+ptr], %r1    ! get pointer to next record
        ba loop
        nop                  ! branch delay
done:   retl
        nop

[Figure: linked list. head -> 5 -> 7 -> 4 -> 1 -> nil]


Parameter Passing 1

– Review of parameter passing mechanisms:
    Pass by value copy: parameters to subroutine are copies upon which the subroutine acts.
    Pass by result copy: parameters are copies of results produced by the subroutine.
    Pass by reference copy: parameters to subroutine are (copies of) addresses of values upon which the subroutine acts. Callee is responsible for saving each result to memory at the location referred to by the appropriate parameter.
    Hybrid: some parameters passed by value copy, some by result copy, and/or some by reference copy. Callee is responsible for saving results for reference parameters.


Parameter Passing 2

– Parameter passing notes:
    Array or record parameters typically are passed by reference copy (efficiency reasons). Primitive data types may be passed either way.
    Conventions among languages allow any language to call functions in any other language:
    – Pascal: VAR parameters are passed by reference copy; all others are passed by value copy.
    – C: all parameters are passed by value copy. Must explicitly pass a pointer if you want a reference parameter.
    – C++: like Pascal, can pass by value or reference copy.
    – FORTRAN: all things passed by reference copy (even constants).
    – ADA: pass by value/result copy.


Parameter Passing 3

        .text                ! Example 10.1 of Lab Manual
! pr_str – print a null terminated string
! Parameters:  %r8 – pointer to string (initially)
!
! Temporaries: %r8 – the character to be printed
!              %r9 – pointer to string
!
pr_str: mov %r8, %r9         ! we need %r8 for the "ta 1" below
pr_lp:  ldub [%r9], %r8      ! load character
        cmp %r8, 0           ! check for null
        be pr_dn
        nop
        ta 1                 ! print character
        ba pr_lp
        inc %r9              ! increment the pointer (in branch delay slot)
pr_dn:  retl
        nop


Parameter Passing 4

Summary from text (p. 220)

– Pass by value copy: For small “in” parameters. Subroutines cannot alter the originals whose copies are passed as parameters.

– Pass by value/result copy: For small “in/out” parameters. Caller’s cleanup sequence stores values of any “in/out” parameters.

– Pass by reference copy: for “in/out” parameters of all sizes, and large “in” parameters. “Out” values are provided by changing memory at those addresses. (Note: pass by reference copy is passing an address by value copy.)


Parameter Passing 5

– Write Sparc code for the caller and callee for the following subroutine using register based parameter passing

! global_function Integer subchr (A, B, C)
! Substitutes character C for each B in string [A],
! and returns count of changes.
!
!                                    // In comments, "[A+index]" is denoted by "ch".
!       index = 0
!       count = 0
! LOOP: if [A+index]=0 go to END     // while (ch != 0) {
!       if [A+index]≠B go to INC     //   if (ch == B) {
!       [A+index]=C                  //     ch = C;
!       count=count+1                //     count++; }
! INC:  index=index+1                //   index++;
!       go to LOOP                   // }
! END:

Assume

        .data                        ! data section
C_m:    .byte 'I'                    ! parameter C
B_m:    .byte 'i'                    ! parameter B
A_m:    .asciz "i will tip"          ! parameter A
        .align 4
R_m:    .word 0                      ! for storing result count
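Before writing the SPARC version, it can help to run the pseudocode as a reference model. The Python sketch below follows the pseudocode line by line. A `bytearray` stands in for the memory holding the null-terminated string, and the names `A` and `R` echo the data section above.

```python
def subchr(a, b, c):
    """Replace every byte b with c in the null-terminated string a; return the count."""
    index = 0
    count = 0
    while a[index] != 0:            # LOOP: if [A+index]=0 go to END
        if a[index] == b:           #       if [A+index] != B go to INC
            a[index] = c            #       [A+index]=C
            count += 1              #       count=count+1
        index += 1                  # INC:  index=index+1
    return count                    # END:


A = bytearray(b"i will tip\0")      # .asciz appends the terminating zero byte
R = subchr(A, ord("i"), ord("I"))   # A by reference, B and C by value
print(R, A[:-1].decode())
```

The string is modified in place because the caller hands over its address, while the count comes back as an ordinary return value.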


Stack-based Linkage 1

Stack based linkage
– Advantages
    Permits subroutines to call others.
    Allows a larger number of parameters to be passed.
    Permits records and arrays to be passed by value copy.
    Saving of registers by callee is "built-in".
    A way for callee to reserve memory for other uses is "built-in", too.
– Disadvantages
    Slower than register based
    More complex protocol
– Why a stack?
    Subroutine calls and returns happen in a last-in first-out order (LIFO).

Also known as a runtime stack, parameter stack, or subroutine stack.


Stack-based Linkage 2

Items "saved" on the stack in one activation record
– Parameters to the subroutine
– Old values of registers used in the subroutine
– Local memory variables used in subroutine
– Return value and return address

Say A() calls B(), B() calls C(), and C() calls A()

[Figure: Runtime Stack holding the 1st stack frame for A, 1st stack frame for B, 1st stack frame for C, and 2nd stack frame for A. Expanded View of one frame: local variables, saved general purpose registers, return addresses, return values, parameters.]


Stack-based Linkage 3

– Stack based linkage parameter passing convention
    Startup sequence:
    – Push parameters
    – Push space for return value
    Prologue
    – Push registers that are changed (including return address)
    – Allocate space for local variables
    Epilogue
    – Restore general purpose registers
    – Free local variable space
    – Use return address to return
    Cleanup Sequence
    – Pop and save returned values
    – Pop parameters

[Figure: the Caller runs the Startup Sequence, then call; the Callee runs Prologue, Body, Epilogue, then retl; the Caller finishes with the Cleanup Sequence.]


Stack-based Linkage 4

– Stack based parameter passing example:
    Register %r14 = %sp, stack pointer
    – Invariant: Always indicates the top of the stack (it has the address in memory of the last item on stack, usually a word).
    – Moved when items are "pushed" onto the stack.
    – Due to interruptions (system interrupts (I/O) and exceptions), values stored above %sp (at addresses less than %sp) can change at any time! Hence, any access above %sp is unsafe!
    Register %r30 = %fp, frame pointer
    – Indicates the previous stack pointer. Activation record is from (some subroutine-specific number of words before) the %fp to the %sp.
    – Invariant: %fp is constant within a subroutine (after prologue).


Stack-based Linkage 5

– Stack based parameter passing example:
    Want to implement the following subroutine (also a caller):

! global_function Integer subchr (A, B, C)
! Substitutes character C for all B in string A,
! and returns count of changes.
!
!                                    // In comments, "*(A+index)" is denoted by "ch".
!       index = 0
!       count = 0
! LOOP: if *(A+index)=0 go to END    // while (ch != 0) {
!       if *(A+index)≠B go to INC    //   if (ch == B) {
!       *(A+index)=C                 //     ch = C;
!       count=count+1                //     count++; }
! INC:  index=index+1                //   index++;
!       go to LOOP                   // }
! END:

        .data                        ! data section
C_m:    .byte 'I'                    ! parameter C
B_m:    .byte 'i'                    ! parameter B
A_m:    .asciz "i will tip"          ! parameter A
        .align 4
R_m:    .word 0                      ! for storing result count


Stack-based Linkage 6

        .data                    ! data section
C_m:    .word 'I'                ! parameter C
B_m:    .word 'i'                ! parameter B
A_m:    .asciz "i will tip"      ! parameter A
        .align 4                 ! align to word address
stack:  .skip 250*4              ! allocate 250 word stack
bstak:                           ! point to bottom of stack
R_m:    .word 0                  ! reserve for count

        .text
! Program's one-time initialization
start:  set bstak, %sp           ! set initial stack ptr
        mov %sp, %fp             ! set initial frame ptr
! STARTUP SEQUENCE to call subchr()
        sub %sp, 16, %sp         ! move stack ptr
        set A_m, %r1             ! A is passed by reference
        st %r1, [%sp+4]          ! push address on stack
        set B_m, %r1             ! B is passed by value
        ld [%r1], %r1            ! get value of B
        st %r1, [%sp+8]          ! push parameter B on stack
        set C_m, %r1             ! C is passed by value
        ld [%r1], %r1            ! get value of C
        st %r1, [%sp+12]         ! push parameter C on stack
! SUBROUTINE CALL
        call subchr              ! make subroutine call
        nop                      ! branch delay slot
! CLEANUP SEQUENCE
        ld [%sp], %r1            ! pop return value off stack
        add %sp, 16, %sp         ! pop stack
        set R_m, %r2             ! get address of R
        st %r1, [%r2]            ! store R
        . . .                    ! the rest of the program

[Figure: stack after the startup sequence, from %sp toward %fp: Return value, addr (a), b, c.]


Stack-based Linkage 7

! SUBROUTINE PROLOGUE
subchr: sub %sp, 32, %sp         ! open 8 words on stack
        st %fp, [%sp+28]         ! Save old frame pointer
        add %sp, 32, %fp         ! old sp is new fp
        st %r15, [%fp-8]         ! save return address
        st %r8, [%fp-12]         ! Save gen. Register
        …                        ! Save r9-r13, omitted
! SUBROUTINE BODY
ld_reg: ld [%fp+4], %r8          ! "pop" (load) addr of A
        ld [%fp+8], %r9          ! "pop" (load) value of B
        ld [%fp+12], %r10        ! "pop" (load) value of C
        clr %r12                 ! count
        clr %r13                 ! index
loop:   ldub [%r8+%r13], %r11    ! load a string chr
        cmp %r11, 0x0            ! is chr=null?
        be done                  ! then go to done
        cmp %r11, %r9            ! is chr<>B? (branch delay)
        bne inc                  ! then go to inc
        nop                      ! branch delay slot
        stb %r10, [%r8+%r13]     ! change chr to C
        add %r12, 1, %r12        ! increment count
inc:    add %r13, 1, %r13        ! increment index
        ba loop                  ! do next chr
        nop                      ! branch delay slot
done:   st %r12, [%fp+0]         ! "push" (store) count on stack
! EPILOGUE
        …                        ! Restore r9-r13, omitted
        ld [%fp-12], %r8         ! Restore r8
        ld [%fp-8], %r15         ! get saved return address
        ld [%fp-4], %fp          ! Get old value of frame ptr
        add %sp, 32, %sp         ! Restore stack pointer
        retl                     ! return to caller
        nop                      ! branch delay slot

[Figure: stack frame inside subchr. From %sp upward: ..., %r9, %r8, return addr, old frame ptr. %fp points at Return value, followed by addr (a), b, c.]


Stack-based Linkage 8

General Guidelines

– Keep Startups, Cleanups, Prologues, and Epilogues standard (but not necessarily identical); easy to cut, paste, and modify.

– Caller: leave space for return value on the TOP of the stack.

– Callee: always save and restore locally used registers.

– Pass data structures and arrays by reference, all others by value (efficiency).


Our Fourth Example Architecture

Motorola M68HC11
Called "HC11" for short
Used in ECE 567, a course required of CSE majors
References:
– Data Acquisition and Process Control with the M68HC11 Microcontroller, 2nd Ed., by F. F. Driscoll, R. F. Coughlin, and R. S. Villanucci, Prentice-Hall, 2000.
– M68HC11 Processor Manual, on Carmen


Another Reference

Late in an academic term (such as now), you can hope to access on-line lecture notes from the Electrical and Computer Engineering course, ECE 265.

Visit http://www.ece.osu.edu

Under “Academic Program”, click on the link “ECE Course Listings”.

Find 265 and click on the link “Syllabus of this quarter”.


HC11 compared with Sparc (1)

HC11 | Sparc
CISC | RISC, Load/Store
Instruction encoding lengths vary (8 to 32 bits) | Instruction encoding lengths constant (32 bits)
About 316 instructions | About 175 instructions
4 16-bit user registers, one of which is divided into two 8-bit registers | 32 32-bit user integer registers


HC11 compared with Sparc (2)

HC11 | Sparc
8-bit data bus | 32-bit data bus
16-bit address bus | 32-bit address bus
8-bit addressable | 8-bit addressable
Instruction execution not overlapped | Instruction execution overlapped in a pipeline


HC11 compared with Sparc (3)

A Strange Fact: The HC11 architecture “allows accessing an operand from an external memory location with no execution-time penalty.” [p. 27, M68HC11 Processor Manual]

Reason: The HC11 requirements state that the CPU cycle must be kept long enough to accommodate a memory access within one cycle. This seeming miracle is accomplished by keeping processor speed slow enough.


HC11 Programmer’s Model (1)

Accumulator A (bits 7-0) and Accumulator B (bits 7-0), which together form Accumulator D (bits 15-0)

X Index Register (bits 15-0)

Y Index Register (bits 15-0)

Stack Pointer (SP) (bits 15-0)

Program Counter (PC) (bits 15-0)


HC11 Programmer’s Model (2)

Condition Code Register (CCR)

Bit 7: S, Stop

Bit 6: X, X Interrupt Mask

Bit 5: H, Half-Carry

Bit 4: I, I Interrupt Mask

Bit 3: N, Negative

Bit 2: Z, Zero

Bit 1: V, Overflow

Bit 0: C, Carry/Borrow


HC11 Assembly Language Format (1)

Like Sparc, it is line-oriented. A line may:

– Be blank (containing no printable characters),

– Be a comment line, the first printable character being either a semicolon (‘;’) or an asterisk (‘*’), or

– Have the following format (“[]” means an optional field):

[Label] Operation [Operand field] [Comment field]


HC11 Assembly Language Format (2)

Label:

– Begins in column 1, ending either with a space or a colon (‘:’)

– Contains 1 to 15 characters

– Case sensitive

– The first character may not be a decimal digit (0-9)

– Characters may be upper- or lowercase letters, digits 0-9, period (‘.’), dollar sign (‘$’), or underscore (‘_’)


HC11 Assembly Language Format (3)

Operation:

– Cannot begin in column 1

– Contains: Instruction mnemonic, Assembler directive, or Macro call (we haven’t studied macro expansion in this course)

Operand field:

– Terminated by a space or tab character,

– So multiple operands are separated by commas (‘,’) without using any spaces or tabs


HC11 Assembly Language Format (4)

Comment field:

– Begins with the first space character following the operand field (or following the operation, if there is no operand field)

– So no special printable character is required to begin a comment field

– But it appears to be conventional to begin a comment field with a semicolon (‘;’)
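To make the field rules concrete, here is a minimal Python sketch that splits one source line into its four fields. The function name `parse_hc11_line` is hypothetical. One simplifying assumption: the comment field is taken to start at a semicolon (the convention noted above), because without an opcode table a parser cannot tell an operand from a comment that follows an operand-less instruction.

```python
def parse_hc11_line(line: str):
    """Split one source line into (label, operation, operand, comment).

    Returns None for blank lines and whole-line comments.
    Assumption: a comment field starts with ';'.
    """
    stripped = line.strip()
    if not stripped or stripped[0] in ";*":
        return None                       # blank line or comment line
    code, sep, comment = line.partition(";")
    comment = (sep + comment).strip()
    label = ""
    if not code[0].isspace():             # a label must begin in column 1
        fields = code.split(None, 1)      # label ends at the first space...
        label = fields[0]
        code = fields[1] if len(fields) > 1 else ""
        if ":" in label:                  # ...or at a colon
            label, _, rest = label.partition(":")
            code = rest + " " + code
    tokens = code.split()                 # operand field holds no spaces or tabs
    operation = tokens[0] if tokens else ""
    operand = tokens[1] if len(tokens) > 1 else ""
    return label, operation, operand, comment
```

For example, `parse_hc11_line("L1: JSR DELAY_1MS")` yields the label `L1`, the operation `JSR`, and the operand `DELAY_1MS`.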


Prefixes for Numeric Constants

Encoding | HC11 | Sparc
Decimal | No symbol | No symbol
Hexadecimal | $ | 0x
Octal | @ | 0
Binary | % | 0b
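The HC11 column of the prefix table can be exercised with a short Python sketch. The names `HC11_PREFIX_BASE` and `parse_hc11_constant` are hypothetical; character constants and labels are not handled.

```python
# Prefix character -> number base, from the HC11 column of the table.
HC11_PREFIX_BASE = {"$": 16, "@": 8, "%": 2}


def parse_hc11_constant(text: str) -> int:
    """Convert an HC11-style numeric constant to an integer."""
    base = HC11_PREFIX_BASE.get(text[0])
    if base is None:
        return int(text, 10)        # no prefix means decimal
    return int(text[1:], base)
```

With this sketch, `$1C` and `%11100` both evaluate to 28, while `@17` evaluates to 15.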


Assembler Directives (1)

Meaning | HC11 | Sparc
Set location counter (origin) | ORG | .data or .text
End of source | END | Doesn’t have
Equate symbol to a value | EQU | .set
Form constant byte | FCB | .byte


Assembler Directives (2)

Meaning | HC11 | Sparc
Form double byte | FDB | .half
Form character string constant | FCC | .ascii
Reserve memory byte or bytes | RMB | .skip


HC11 Addressing Modes

Immediate (IMM)

Extended (EXT)

Direct (DIR)

Inherent (INH)

Relative (REL)

Indexed (INDX, INDY)


Immediate (IMM)

Assembler interprets the # symbol to mean the immediate addressing mode

Examples

– LDAA #10

– LDAA #$1C

– LDAA #@17

– LDAA #%11100

– LDAA #'C'

– LDAA #LABEL


Extended (EXT)

Lack of # symbol indicates extended or direct addressing mode. These are forms of memory direct addressing, like SAM.

“Extended” means full 16-bit address, whereas “Direct” means directly to a low address, specified using only the least significant 8 bits of the address.

Examples

– LDAA $2025

– LDAA LABEL


Direct (DIR)

Examples

– LDAA $C2

– LDAA LABEL


Inherent (INH)

All operands are implicit (i.e., inherent in the instruction)

Examples: ABA, SBA, DAA

ABA means add the contents of register B to the contents of A, placing the sum in A (A + B → A)

SBA means A – B → A

DAA means to adjust the sum that got placed in A by the previous instruction to the correct BCD result; e.g., $09 + $26 yields $2F in A, then DAA changes this to $35.
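The DAA example can be reproduced with a short Python sketch. The rule coded here is the usual BCD correction after an 8-bit addition (add $06 when the low nibble exceeds 9 or H is set; add $60 when the value exceeds $99 or C is set). That rule is an assumption of this sketch rather than a transcription of the Processor Manual, and the function name `daa` is hypothetical.

```python
def daa(a: int, h: bool, c: bool):
    """Adjust accumulator A to packed BCD after an 8-bit addition.

    a is the binary sum left in A; h and c are the H and C flags from
    that addition. Returns (adjusted A, new C flag).
    """
    correction = 0
    if h or (a & 0x0F) > 9:       # low digit overflowed past 9
        correction |= 0x06
    if c or a > 0x99:             # high digit overflowed past 9
        correction |= 0x60
        c = True
    return (a + correction) & 0xFF, c
```

For the case in the notes, `daa(0x2F, False, False)` adds $06 and leaves $35 in A.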


Relative (REL)

Used only for branch instructions

Relative to the address of the following instruction (the new value of the PC)

Signed offset from -128 to +127 bytes

Examples

– BGE -18

– BHS 27

– BGT LABEL
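A minimal Python sketch of how a relative branch target could be computed. It assumes a two-byte branch instruction (one opcode byte plus one offset byte), so the "following instruction" sits two bytes past the branch; the function name `branch_target` is hypothetical.

```python
def branch_target(branch_addr: int, offset_byte: int) -> int:
    """Target address of a two-byte relative branch located at branch_addr."""
    # Reinterpret the unsigned offset byte as a signed value in -128..+127.
    offset = offset_byte - 256 if offset_byte >= 128 else offset_byte
    next_pc = branch_addr + 2             # address of the following instruction
    return (next_pc + offset) & 0xFFFF    # HC11 addresses are 16 bits wide
```

An offset of -18 is stored as the byte $EE, so a branch at $E010 with that offset lands on $E000.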


Indexed (INDX, INDY)

Uses the contents of either the X or Y register and adds it to a (positive, unsigned) offset contained in the instruction to calculate the effective address

Example

– LDAA 4,X


Interrupts

When an interrupt is acknowledged, the CPU’s hardware saves the registers’ contents on the stack. An interrupt service routine ends with an RTI instruction. This instruction automatically restores the CPU register values from the copies on the stack.


Condition Code Register (CCR)

It’s reasonably safe to say that every instruction that changes a register (A, B, D, X, Y, SP) affects the CCR appropriately. Unlike Sparc, there are no arithmetic instructions that do not set condition codes.

There do exist instructions that compare a register to a memory location by subtracting the memory contents from the register and throwing the result away, but setting the CCR (CMPA, CMPB, CPD, CPX, CPY).


HC11 Condition Code Register

The H bit is turned on by an 8-bit addition operation when there is a carry from the lower-order nibble into the higher-order nibble, that is to say, from bit 3 into bit 4. 

  0000 1111
+ 0000 1000
-----------
  0001 0111

Carries out of bits 3, 2, 1, 0: 1 0 0 0 (the carry out of bit 3 turns H on)


HC11 Condition Code Register

The Z bit is turned on when the result is zero. Example: 0000 0000

The N bit is turned on when the result is negative according to the appropriately-sized 2's complement encoding scheme. Example: 1010 1010


HC11 Condition Code Register

The V bit is turned on when, under the appropriately-sized 2's complement interpretation of the two source operands and the result, the result is wrong. 

  0100
+ 1100
------
  0000

2’s complement interpretation: (+4) + (-4) = 0. Correct, so V-bit is off.

Simple binary interpretation: 4 + 12 = 0?? Incorrect, so C-bit is on.


HC11 Condition Code Register

The C bit is turned on when, under the simple binary interpretation of the two source operands and the result, the result is wrong. 

  0111
+ 0111
------
  1110

2’s complement interpretation: (+7) + (+7) = -2?? Incorrect, so V-bit is on.

Simple binary interpretation: 7 + 7 = 14. Correct, so C-bit is off.
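The flag definitions for addition can be checked with a small Python sketch. The function name `add_flags` and the `bits` parameter are hypothetical; `bits=4` is there only so that 4-bit worked examples can be replayed, and H is meaningful only for 8-bit operands.

```python
def add_flags(a: int, b: int, bits: int = 8):
    """Add two unsigned bit patterns; return (result, flags)."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    total = a + b
    result = total & mask
    flags = {
        # H: carry from bit 3 into bit 4
        "H": ((a & 0x0F) + (b & 0x0F)) > 0x0F,
        # N: sign bit of the result
        "N": bool(result & sign),
        # Z: result is zero
        "Z": result == 0,
        # V: operands share a sign, but the result's sign differs
        "V": bool(~(a ^ b) & (a ^ result) & sign),
        # C: unsigned sum does not fit in the register
        "C": total > mask,
    }
    return result, flags
```

For instance, `add_flags(0b0100, 0b1100, bits=4)` reports C on and V off, and `add_flags(0b0111, 0b0111, bits=4)` reports V on and C off.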


Example HC11 Program

Problem: Produce the following waveforms on the three least significant bits (LSBs) of parallel 8-bit output Port B (mapped to $1004), where we name the bits X, Y, and Z in increasing order of significance (X is bit 0; Y is bit 1; Z is bit 2).

[Figure: waveforms for X, Y, and Z, with pulse widths labelled 10 ms, 20 ms, and 15 ms.]


Example Source File, p. 1

STACK:  EQU  $00FF      ; set stack pointer
PORTB:  EQU  $1004      ; set address of Port B
        ORG  0
DELAY1: FCB  10         ; set the waveform times
DELAY2: FCB  20         ; for X, Y, and Z
DELAY3: FCB  15


Example Source File, p. 2

        ORG  $E000      ; program starts at $E000
MAIN:   LDS  #STACK     ; initialize stack pointer
L0:     LDAA #1         ; set X on Port B to 1
        STAA PORTB
        LDAB DELAY1     ; delay for 10 ms
L1:     JSR  DELAY_1MS
        DECB
        BNE  L1


Example Source File, p. 3

        LDAA #%00000010 ; set Y on Port B to 1
        STAA PORTB
        LDAB DELAY2     ; delay for 20 ms
L2:     JSR  DELAY_1MS
        DECB
        BNE  L2
        LDAA #%00000100 ; set Z on Port B to 1
        STAA PORTB
        LDAB DELAY3     ; delay for 15 ms
L3:     JSR  DELAY_1MS
        DECB
        BNE  L3
        BRA  L0         ; continue to cycle


Example Source File, p. 4

DELAY_1MS: PSHB         ; subr. to delay for 1 ms
        LDAB #198
DELAY:  DECB
        BRN  DELAY
        NOP
        BNE  DELAY
        PULB
RETURN: RTS

        ORG  $FFFE      ; initialize reset vector
RESET:  FDB  MAIN
        END
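Why 198 passes? A rough Python sketch of the cycle budget for one call to DELAY_1MS. The per-instruction cycle counts in `CYCLES` and the 2 MHz clock are assumptions of this sketch (they are not stated in these notes) and should be checked against the M68HC11 Processor Manual; the function name `delay_1ms_cycles` is hypothetical.

```python
# Assumed cycle counts per instruction (verify against the Processor Manual).
CYCLES = {"JSR": 6, "PSHB": 3, "LDAB": 2, "DECB": 2,
          "BRN": 3, "NOP": 2, "BNE": 3, "PULB": 4, "RTS": 5}

ASSUMED_CLOCK_HZ = 2_000_000   # assumed 2 MHz clock


def delay_1ms_cycles(loop_count: int = 198) -> int:
    """Total cycles for one JSR DELAY_1MS, including call and return."""
    setup = CYCLES["JSR"] + CYCLES["PSHB"] + CYCLES["LDAB"]
    per_pass = CYCLES["DECB"] + CYCLES["BRN"] + CYCLES["NOP"] + CYCLES["BNE"]
    teardown = CYCLES["PULB"] + CYCLES["RTS"]
    return setup + loop_count * per_pass + teardown
```

Under these assumptions the total is 2000 cycles, which is 1 ms at 2 MHz. BRN (branch never) and NOP do no work; they only pad each pass of the loop.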


Traps and Exceptions 1

Traps, Exceptions, and Extended Operations

– Other side of low level programming -- the interface between applications and peripherals

– OS provides access and protocols


Traps and Exceptions 2

– BIOS: Basic Input/Output System

Subroutines that control I/O

No need for you to write them as application programmer

OS interfaces application with BIOS through traps (extended operations (XOPs))

[Figure: Applications software, BIOS, and devices (Keyboard, Screen, Mouse, Disk)]


Traps and Exceptions 3

– Where are OS traps kept? Two approaches:

Transient monitor: traps kept in a library that is copied into the application at link-time

Resident monitor: always keep OS in main memory; applications share the trap routines.

OS routines monitor devices. Frequently used routines kept resident; others loaded as needed.

[Figure: transient monitor: Appl 1 through Appl 4, each with its own copy of the OS routines. Resident monitor: one copy of the OS routines shared by Appl 1 through Appl 6.]


Traps and Exceptions 4

– (Assuming a resident monitor) How to find I/O routines?

Store routines in memory, and make a call to a hard address. E.g., call 256

– When new OS is released, need to recompile all application programs to use different addresses.

Use a dispatcher

– Dispatcher is a subroutine that takes a parameter (the trap number). Dispatcher knows where all routines actually are in memory, and makes the branch for you. Dispatcher subroutine must always exist in the same location.

[Figure: the Application calls the Dispatcher, which branches to BIOS 1, BIOS 2, ..., BIOS n]
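The dispatcher idea can be sketched in a few lines of Python. The routine names, the trap numbers 2 and 6, and the table `TRAP_ROUTINES` are all hypothetical; the point is only that callers name a trap number while the dispatcher alone knows where each routine lives.

```python
def read_keyboard():
    """Stand-in for an OS keyboard routine."""
    return "keyboard routine ran"


def write_screen():
    """Stand-in for an OS screen routine."""
    return "screen routine ran"


# Only the dispatcher consults this table; applications never do.
TRAP_ROUTINES = {2: read_keyboard, 6: write_screen}


def dispatcher(trap_number: int):
    """Look up the routine for trap_number and branch to it."""
    routine = TRAP_ROUTINES.get(trap_number)
    if routine is None:
        raise ValueError(f"no routine installed for trap {trap_number}")
    return routine()
```

A new OS release can move or replace a routine by updating the table; applications that call `dispatcher(2)` do not need to be recompiled.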


Traps and Exceptions 5

Use vectored linking

– Branch table exists at a well known location. The address of each trap subroutine is stored in the table, indexed by the trap number.

– On RISC, usually about 4 words reserved in the table. If the trap routine is larger than 4 words, can call the actual routine.

[Figure: branch table holding Addr of trap 0, Addr of trap 1, Addr of trap 2, ..., Addr of trap n. With one-word entries the entries sit at addresses 100, 104, 108, ..., 100+4n; with four-word entries they sit at 100, 116, 132, ..., 100+16n.]


Traps and Exceptions 6

– Levels of privilege

Supervisor mode - can access every resource

User mode - limited access to resources

OS routines operate in supervisor mode, access is determined by bit in PSW (processor status word).

XOP (book’s notation) can always be executed, sets privilege to supervisor mode (ta)

RTX (book’s notation) can only be executed by the OS, and returns privilege to user mode (rett)


Traps and Exceptions 7

– Exceptions

Caused by invalid use of resource. E.g., divide by zero, invalid address, illegal operation, protection violation, etc.

Control transferred automatically to exception handler routine. Similar to trap or XOP transfer.

Exceptions vs. XOPs

– XOPs explicit in code, exceptions are implicit

– XOPs service request and return to application; exceptions print message and abort (unless masked).

– On SPARC, trap table has 256 entries. 0-127 are reserved for exceptions and external interrupts. 128-255 are used for XOPs. Trap table begins at address 0x0000. Each entry is 4 instructions (16 bytes) long.


Traps and Exceptions 8

– Trap example: non-blocking read

ta 3

If there is nothing in the keyboard buffer, return with a message that nothing is there. Otherwise, put the character into register 8.

– Status of the keyboard is kept in a memory location, as is the (one-character) keyboard buffer. Memory mapped devices.

! ta 3 returns character if one is there, otherwise
! it returns 0x8000000 into %r8
        set   0x8000000, %r8    ! set default return val
        set   KbdStatus, %r1    ! KbdStatus is memory loc
        ld    [%r1], %r1        ! read status (1 is ready)
        andcc %r1, 1, %r1       ! check status
        be    rtn               ! can't read anything
        nop                     ! branch delay slot
        set   KbdBuff, %r1      ! KbdBuff is memory loc
        ld    [%r1], %r8        ! get character
rtn:    rett                    ! return to caller


Traps and Exceptions 9

– Trap execution: ta 3

Calculate trap address: 3 * 16 + 0x0800 = 16 * (3 + 0x080)

Save nPC and PSW to memory

– SPARC uses register windows

– Assumes local registers are available

Set privilege level to supervisor mode

Update PC with trap address (and make nPC = PC + 4) (jumps to trap table)

Trap table has instruction ba ta3_handler

rett

– Restores PC (from saved nPC value) and PSW (resets to user mode)

– Returns to application program
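The trap-address arithmetic can be replayed in Python. The constants follow the SPARC trap-table description in these notes (table at address 0x0000, 16-byte entries, XOPs starting at entry 128); the names `TRAP_TABLE_BASE`, `ENTRY_BYTES`, `FIRST_XOP_ENTRY`, and `xop_trap_address` are hypothetical.

```python
TRAP_TABLE_BASE = 0x0000   # trap table begins at address 0x0000
ENTRY_BYTES = 16           # each entry holds 4 instructions
FIRST_XOP_ENTRY = 128      # entries 128-255 are used for XOPs


def xop_trap_address(xop_number: int) -> int:
    """Address of the trap-table entry reached by 'ta xop_number'."""
    return TRAP_TABLE_BASE + (FIRST_XOP_ENTRY + xop_number) * ENTRY_BYTES
```

For `ta 3` this gives 0x830, the same value as 3 * 16 + 0x0800, because entry 128 sits 128 * 16 = 0x0800 bytes into the table.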


Programmed I/O 1

Programmed I/O

– Early approach: Isolated I/O

Special instructions to do input and output, using two operands: a register and an I/O address.

CPU puts device address on address bus, and issues an I/O instruction to load from or store to the device.


Programmed I/O 2

[Figure: Isolated I/O. The CPU connects to Memory through one set of addr bus, data bus, and read/write lines, and to I/O through a separate set of addr bus, data bus, and read/write lines.]


Memory Mapped I/O

No special I/O instructions. Treat the I/O device like a memory address. Hardware checks to see if the memory address is in the I/O device range, and makes the adjustment.

Use high addresses (not “real” memory) for I/O memory maps. E.g., 0xFFFF0000 through 0xFFFFFFFF.

[Figure: Memory mapped I/O. CPU, Memory, and I/O share one addr bus, data bus, and read/write line. The address space is divided into memory, unused, I/O, and unused regions.]
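A minimal Python sketch of the address check behind memory mapped I/O. The class `Bus` and its methods are hypothetical; `IO_BASE` uses the example range from the notes, with every address at or above 0xFFFF0000 routed to device registers instead of RAM.

```python
IO_BASE = 0xFFFF0000   # start of the example I/O range


class Bus:
    """Routes ordinary loads and stores to RAM or to device registers."""

    def __init__(self):
        self.ram = {}        # address -> value for real memory
        self.devices = {}    # address -> value for device registers

    def _select(self, addr: int) -> dict:
        # The "hardware check": is the address in the I/O device range?
        return self.devices if addr >= IO_BASE else self.ram

    def store(self, addr: int, value: int) -> None:
        self._select(addr)[addr] = value

    def load(self, addr: int) -> int:
        return self._select(addr).get(addr, 0)
```

The program uses the same `load` and `store` for both cases, which is why no special I/O instructions are needed.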


Programmed I/O 3

– Advantages of each

Memory mapped: reduced instruction set, reduced redundancy in hardware.

Isolated: don’t have to give up memory address space on machines with little memory


Programmed I/O - UARTs

UARTs

– Universal Asynchronous Receiver Transmitter

– Asynchronous = not on the same clock.

– Handshake coordinates communication between two devices.

– A kind of programmed I/O.

[Figure: the Keyboard sends bits serially (e.g., 01101010) to the UART, which passes them in parallel to the CPU.]


UARTs 1

UART registers

– Control: set up at init, speed, parity, etc.

– Status: transmit empty, receive ready, etc.

– Transmit: output data

– Receive: input data

– All four needed for bi-directional communications

– Status/control, transmit/receive often combined. Why?

[Figure: UART with Control Reg, Status Reg, Transmit Reg (with Transmit Logic), and Receive Reg (with Receive Logic), attached to the control bus, address bus, and data bus.]


UARTs 2

Memory mapped UARTs

– Both memory and I/O “listen” to the address bus. The appropriate device will act based on the addresses.

– Keyboards and Printers require three addresses (when addresses are not combined).

– Modems require four.

– (why?)

Address | Register
FFFF 0000 | UART 1 data
FFFF 0004 | UART 1 status
FFFF 0008 | UART 1 control
FFFF 000C | UART 2 xmit
FFFF 0010 | UART 2 recv
FFFF 0014 | UART 2 status
FFFF 0018 | UART 2 control
FFFF 001C | UART 3 xmit
and so on

[Figure: CPU, Memory, UART1, and UART2 attached to the control bus, address bus, and data bus]


Programmed I/O 4

Programmed I/O Characteristics:

– Used to determine if device is ready (can it be read or written).

– Each device has a status register in addition to the data register.

– Like previous trap example, must check status before getting data.

– Involves polling loops.


Programmed I/O – Polling

Ex.: ta 2 handler (blocking keyboard input)

ta_2_handler:
        set   KbdBuff, %r1      ! get addr of kbd buffer
        set   KbdStatus, %r9    ! get addr of kbd status
wait:   ld    [%r9], %r10       ! get status
        andcc %r10, 1, %r10     ! check if ready
        be    wait              ! loop until ready
        nop                     ! branch delay
        ld    [%r1], %r8        ! get data
        rett                    ! return from trap

Can’t afford to wait like this. Computer is millions of times faster than a typist. Also, multi-tasking operating systems can’t wait.

Special purpose computers can wait. E.g., microwave oven controllers.

Must have a better way! Interrupts are the answer!

[Figure: cartoon of a polling exchange. “Are you ready? ... Are you ready now? ... How about NOW? ...” is answered by “Nope ... Not yet ... Hang on ...”]
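The cost of busy waiting shows up even in a toy model. This Python sketch mimics the wait loop of the ta 2 handler; the function name `blocking_read` is hypothetical, and the list of status values stands in for successive reads of the keyboard status register.

```python
def blocking_read(status_reads, kbd_buff: int):
    """Poll until bit 0 of the status is set; return (character, polls made).

    status_reads is an iterable of successive status register values.
    """
    polls = 0
    for status in status_reads:
        polls += 1
        if status & 1:                 # andcc %r10, 1, %r10 / be wait
            return kbd_buff, polls     # ld [%r1], %r8
    raise RuntimeError("device never became ready")
```

Every poll that finds the device not ready is wasted work; with a human typist on the other end, the number of wasted polls would be enormous.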


Interrupts and DMA transfers 1

Programmed (polled) I/O used busy waiting.

– Advantages: simpler hardware

– Disadvantages: wastes time

Interrupts (IRQs on PCs)

– I/O device “requests” service from CPU.

– CPU can execute program code until interrupted. Solves busy waiting problems.

– Interrupt handlers are run (like traps) whenever an interrupt occurs. Current application program is suspended.


Interrupts and DMA transfers 2

Servicing an interrupt

– I/O controller generates interrupt, sets request line “high”.

– CPU detects interrupt at beginning of fetch/execute cycle (for interrupts “between” instructions).

– CPU saves state of running program, invokes interrupt handler.

– Handler services request; sets the request line “low”.

– Control is returned to the application program.

[Figure: the application program runs until an interrupt is detected; the interrupt handler services the request and clears the interrupt; the application program then resumes.]


Interrupts and DMA transfers 3

Changes to fetch/execute cycle

[Figure: flowchart. Interrupt pending? If yes: save PC, save PSW, PSW = new PSW, PC = handler_addr. Then (or if no) the normal fetch proceeds: PC -> bus, load MAR, INC to PC, load PC.]

Problems

– Requires additional hardware in Timing & Control.

– Queuing of interrupts

– Interrupting an interrupt handler (solution: priorities and maskable interrupts)

– Interrupts that must be serviced within an instruction

– How to find address of interrupt handler
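The modified fetch/execute cycle can be sketched as a toy Python simulator. Everything here is hypothetical scaffolding: the instruction names ("nop", "rett", "halt"), the 4-byte instruction size, and the function `run`. The PSW is reduced to a single privilege string, and taking the interrupt is treated as clearing the request.

```python
def run(memory: dict, pc: int, handler_addr: int, irq_steps: set) -> list:
    """Run until 'halt' and return the list of fetched addresses.

    memory maps addresses to instruction names; irq_steps is the set of
    step numbers at which the device raises its request line.
    """
    psw = "user"
    saved = []          # saved (PC, PSW) pairs
    trace = []
    pending = False
    step = 0
    while True:
        if step in irq_steps:
            pending = True
        # Interrupt pending? Checked only between instructions, and
        # held off (masked) while a handler is already running.
        if pending and psw == "user":
            saved.append((pc, psw))              # save PC, save PSW
            psw, pc = "supervisor", handler_addr
            pending = False
        instr = memory[pc]                       # normal fetch
        trace.append(pc)
        pc += 4                                  # increment PC
        if instr == "rett":
            pc, psw = saved.pop()                # restore PC and PSW
        elif instr == "halt":
            return trace
        step += 1
```

With a three-instruction program at addresses 0, 4, 8 and a handler at 100, an interrupt raised at step 1 produces the fetch sequence 0, 100, 104, 4, 8: the application is suspended after its first instruction and resumes where it left off.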


Interrupts and DMA transfers 4

Example: interrupt driven string output

– Want to print a string without busy waiting.

– Want to return to the application as fast as possible

[Figure: speech bubble: “I’m ready!”]


Trap handler implementation

Install trap handler into trap table

– Buffer is like circular queue

– Only outputs, at most, one character

disp_buf:  .skip 256     ! buffers string to print
disp_frnt: .byte 0       ! offset to front of queue
disp_bck:  .byte 0       ! offset to back of queue

ta_6_handler:
        ! Copy str from mem[%r8] to mem[disp_buf+disp_bck]
        ! disp_bck = (disp_bck+len(str)) mod 256
        ! If display is ready
        !   If first char is not null, then output it
        !   disp_frnt = (disp_frnt+1) mod 256
        rett             ! Return from trap

[Figure: disp_buf as a circular queue. Labels: disp_frnt, disp_bck, newest byte, oldest undisplayed byte.]


Interrupt handler implementation

This too outputs only one character at most, but when display becomes ready again, it generates another interrupt which invokes this routine!

display_IRQ_handler:
        ! Save any registers used
        ! If disp_frnt != disp_bck (queue is not empty)
        !   Get char at mem[disp_buf+disp_frnt]
        !   If char is not null, then output it
        !   disp_frnt = (disp_frnt+1) mod 256
        ! Restore registers and set the request line "low"
        rett             ! Return from trap

Uses the UART for transmission.

[Figure: CPU and Memory, with a speech bubble: “I’m ready!”]
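The interplay between the trap handler and the interrupt handler can be tried out with a small Python model. The classes `Display` and `DisplayQueue` are hypothetical stand-ins for the device and for disp_buf with its two offsets. The sketch does not guard against overflowing the 256-byte buffer, and it sends every queued byte rather than stopping at a null.

```python
class Display:
    """Toy display: a ready flag plus a log of the bytes it was given."""

    def __init__(self):
        self.ready = True
        self.shown = []

    def output(self, byte: int) -> None:
        self.shown.append(byte)
        self.ready = False        # busy until it raises its next interrupt


class DisplayQueue:
    """Circular queue shared by the trap handler and the IRQ handler."""

    def __init__(self, display: Display):
        self.buf = bytearray(256)   # disp_buf
        self.frnt = 0               # disp_frnt
        self.bck = 0                # disp_bck
        self.display = display

    def _output_one(self) -> None:
        if self.frnt != self.bck:                    # queue is not empty
            self.display.output(self.buf[self.frnt])
            self.frnt = (self.frnt + 1) % 256

    def ta_6_handler(self, text: bytes) -> None:
        for byte in text:                            # copy string to the back
            self.buf[self.bck] = byte
            self.bck = (self.bck + 1) % 256
        if self.display.ready:                       # at most one character
            self._output_one()

    def irq_handler(self) -> None:
        self.display.ready = True                    # device signalled ready
        self._output_one()                           # at most one character
```

A typical run: `ta_6_handler(b"Hi!")` queues three bytes and outputs only the first, then three device interrupts drain the rest while the application keeps running in between.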


Interrupts and DMA transfers 5

Problems with interrupt driven I/O

CPU is involved with each interrupt

Each interrupt corresponds to transfer of a single byte

Lots of overhead for large amounts of data (blocks of 512 bytes)

[Figure: Memory, CPU, and Device Controller. Labels: interrupt; transfer one byte of data; execute 10s or 100s of instructions per byte; transfer one word of data.]


Interrupts and DMA transfers 6

DMA (Direct Memory Access)

Want I/O without CPU intervention

Want larger than one byte data transfers

Solution: add a new device that can talk to both I/O devices and memory without the CPU; a “specialized” CPU strictly for data transfers.

[Figure: Memory, CPU, Device Controller, and DMA Controller]


Interrupts and DMA transfers 7

Steps to a DMA transfer

– CPU specifies a memory address, the operation (read/write), byte count, and disk block location to the DMA controller (or specify other I/O device).

– DMA controller initiates the I/O, and transfers the data to/from memory directly

– DMA controller interrupts the CPU when the entire block transfer is completed.

Problem

– Conflicts accessing memory. Can either arbitrate access or get a more expensive dual ported memory system.
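To compare the two schemes, here is a toy Python sketch that counts how many times the CPU is interrupted while one block is read into memory. Both function names are hypothetical, and memory arbitration is ignored.

```python
def interrupt_driven_read(memory: bytearray, mem_addr: int,
                          device_block: bytes, byte_count: int) -> int:
    """CPU copies one byte per interrupt; returns interrupts taken."""
    interrupts = 0
    for i in range(byte_count):
        memory[mem_addr + i] = device_block[i]   # done by the CPU's handler
        interrupts += 1
    return interrupts


def dma_read(memory: bytearray, mem_addr: int,
             device_block: bytes, byte_count: int) -> int:
    """DMA controller copies the whole block; returns interrupts taken."""
    # The controller moves the data to memory directly, without the CPU.
    memory[mem_addr:mem_addr + byte_count] = device_block[:byte_count]
    return 1    # a single interrupt when the entire block is done
```

For a 512-byte block, the interrupt-driven version interrupts the CPU 512 times, while the DMA version interrupts it once.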