Embedded System Design Center ARM7TDMI Sai Kumar Devulapalli.

The Birth of ARM.

• Acorn could not find a processor on the market that met its needs, so it decided to design a new one.

• But doesn't designing a new processor require great investment and experience?

• Luckily, the papers on the Berkeley RISC I design had been published.

• After some custom modifications by Acorn, a new RISC processor was born!

• The ARM (originally Acorn RISC Machine, later Advanced RISC Machine).

History of ARM

Acorn - a Computer Manufacturer

1983:

• Acorn Limited: dominant position in the UK personal computer market with the Rockwell 6502 (8-bit) CPU.

1983:

• 16-bit CISC CPUs were slower than standard memory parts and had long interrupt latencies.

1983-85:

• Acorn designed the first commercial RISC CPU: the Acorn RISC Machine (ARM).

1990:

• Advanced RISC Machines was formed to broaden the market beyond Acorn's product range.

History of ARM (continued)

1990:

• Startup with 12 engineers and 1 CEO

• No patents, no customers, very little money

Mid- 1990s:

• Texas Instruments (TI) licensed the ARM7

• Incorporated into a chip for mobile phones

IPO Spring 1998

• 13 millionaires

What is RISC/CISC?

Reduced Instruction Set Computer

• Fewer Addressing modes.

• Fewer Instructions available.

• For example, ARM, NEC VR series.

Complex Instruction Set Computer

• More Instructions available

• Many addressing modes.

• For example, Intel x86.

Advantages of RISC?

• Smaller die size

• Simple instructions: a simple processor requires fewer transistors.

• Shorter development time

• A simple processor takes less effort to design.

• Higher performance?

• Disadvantages:

• Complex compiler

The ARM programmer's model

• ARM is a Reduced Instruction Set Computer (RISC).

• It has:

• a large, regular register file

– any register can be used for any purpose

• a load-store architecture

– instructions which reference memory just move data; they do no processing

– processing uses values in registers only

• fixed-length instructions

– 32-bit ARM instruction set

– 16-bit Thumb instruction set

Main Features

• A large set of general purpose registers

• A load-store architecture

• 3-address instructions

• Conditional execution for every instruction

• Inclusion of very powerful load/store multiple register instructions

• Ability to perform a general shift and a general ALU operation in one instruction that executes in one clock cycle

ARM7TDMI

• The ARM7TDMI is the current low-end ARM core.

• It is widely used across a range of applications, notably in digital mobile telephones.

The origin of the name ARM7TDMI:

• ARM7: a 3 volt compatible rework of the ARM6 32-bit integer core.

• T: the Thumb 16-bit compressed instruction set.

• D: on-chip Debug support, enabling the processor to halt in response to a debug request.

• M: an enhanced Multiplier, with higher performance than its predecessors and yielding a full 64-bit result.

– Four extra instructions are provided, which perform 32 * 32 -> 64 multiplications and 32 * 32 + 64 -> 64 multiply-accumulates.

• I: EmbeddedICE hardware to give on-chip breakpoint and watchpoint support.
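The 64-bit multiply behaviour described above can be modelled in a few lines of Python. This is only a sketch of the unsigned case: the function names `umull` and `umlal` are borrowed from the ARM mnemonics for illustration, and the `(RdLo, RdHi)` return order is a choice made here, not something the slide specifies.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def umull(rm, rs):
    """Unsigned 32 x 32 -> 64 multiply; returns (low word, high word)."""
    product = (rm & MASK32) * (rs & MASK32)
    return product & MASK32, (product >> 32) & MASK32

def umlal(rd_lo, rd_hi, rm, rs):
    """Unsigned 32 x 32 + 64 -> 64 multiply-accumulate; returns (low, high)."""
    accumulator = ((rd_hi & MASK32) << 32) | (rd_lo & MASK32)
    total = (accumulator + (rm & MASK32) * (rs & MASK32)) & MASK64
    return total & MASK32, total >> 32
```

The result is split across two 32-bit registers because a single ARM register cannot hold the full 64-bit product.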

DATA TYPES

Byte (8-bit):

placed on any byte boundary.

Half-word (16-bit):

aligned to two-byte boundaries.

Word (32-bit):

aligned to four-byte boundaries.

Processor Modes

* The ARM has six operating modes:

• User (unprivileged mode under which most tasks run)

• Fast interrupt request Mode-FIQ (entered when a high priority (fast) interrupt is raised)

• Interrupt Mode-IRQ (entered when a low priority (normal) interrupt is raised)

• Supervisor Mode-SVC (entered on reset and when a Software Interrupt instruction is executed)

• Abort Mode- ABT (used to handle memory access violations)

• Undefined Mode-UND (used to handle undefined instructions)

* ARM Architecture Version 4 adds a seventh mode:

• System Mode-SYS (privileged mode using the same registers as user mode)

ARM programming model

[Figure: the user-visible registers r0 to r14, r15 (PC), and the 32-bit CPSR (bits 31 to 0) with the N, Z, C and V flags in its top bits]

Endianness

Relationship between bit and byte/word ordering defines endianness:

• little-endian (bit 31 down to bit 0): byte 3 | byte 2 | byte 1 | byte 0

• big-endian (bit 31 down to bit 0): byte 0 | byte 1 | byte 2 | byte 3
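As a quick illustration of the two orderings, the sketch below stores the same 32-bit word both ways using Python's standard `struct` module. The example value `0x11223344` is arbitrary and chosen only so each byte is recognisable.

```python
import struct

WORD = 0x11223344  # arbitrary example value

# "<I" packs an unsigned 32-bit word little-endian, ">I" big-endian.
little = struct.pack("<I", WORD)
big = struct.pack(">I", WORD)

def byte_at(memory, offset):
    """Return the byte stored at the given byte offset within the word."""
    return memory[offset]
```

In the little-endian layout byte 0 holds the least significant bits (7 to 0) of the word; in the big-endian layout byte 0 holds the most significant bits (31 to 24).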

The Instruction Pipeline

The ARM uses a pipeline in order to increase the speed of the flow of instructions to the processor.

• Allows several operations to be undertaken simultaneously, rather than serially.

Rather than pointing to the instruction being executed, the PC points to the instruction being fetched.

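Pipeline stages in ARM state, with the address of the instruction in each stage relative to the PC:

• FETCH (PC): instruction fetched from memory

• DECODE (PC - 4): decoding of registers used in the instruction

• EXECUTE (PC - 8): register(s) read from the register bank, shift and ALU operation, register(s) written back to the register bank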

CPU Pipeline Stages

* Fetch

• Instruction is fetched from memory and placed in the instruction pipeline

• In data transfer instructions, the address is sent to the address register

* Decode

• Instruction is decoded

• Datapath control signals are prepared for the next cycle

• Instruction owns the decode logic but not the datapath

• In data transfer instructions, the ALU holds the address component to compute the auto-indexing modification if required

* Execute

• Instruction owns the datapath

• Register bank is read

• An operand is shifted

• ALU result is generated

• Result is written back into the destination register

ARM7TDMI core

The Registers

• ARM has 37 registers in total, all of which are 32 bits long:

– 30 general-purpose registers

– 5 dedicated saved program status registers

– 1 dedicated program counter

– 1 dedicated current program status register

• However, these are arranged into several banks, with the accessible bank being governed by the processor mode. Each mode can access:

– a particular set of r0-r12 registers

– a particular r13 (the stack pointer) and r14 (link register)

– r15 (the program counter)

– cpsr (the current program status register)

• Privileged modes can also access:

– a particular spsr (saved program status register)
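The count of 37 can be checked with a small model of the banking scheme. This is a sketch under stated assumptions: the mode abbreviations (`usr`, `sys`, `fiq`, `irq`, `svc`, `abt`, `und`) and the `rN_mode` naming are chosen here for illustration, and the banking pattern (FIQ banks r8 to r14, the other exception modes bank r13 and r14) follows the register organisation shown in this deck.

```python
MODES = ["usr", "sys", "fiq", "irq", "svc", "abt", "und"]

# First banked register number in each exception mode.
# User and System share one unbanked set, so they are absent here.
BANKED_FROM = {"fiq": 8, "irq": 13, "svc": 13, "abt": 13, "und": 13}

def visible_registers(mode):
    """Names of the registers a given mode can access."""
    first_banked = BANKED_FROM.get(mode, 15)
    regs = [f"r{i}" if i < first_banked else f"r{i}_{mode}" for i in range(15)]
    regs += ["r15", "cpsr"]
    if mode in BANKED_FROM:          # only exception modes have an SPSR
        regs.append(f"spsr_{mode}")
    return regs

def physical_registers():
    """Set of distinct physical registers across all modes."""
    names = set()
    for mode in MODES:
        names.update(visible_registers(mode))
    return names
```

Counting the distinct names gives 30 general-purpose registers, one program counter, one CPSR and five SPSRs, which matches the total of 37 stated above.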

30 general-purpose, 32-bit registers

• Fifteen general-purpose registers are visible at any one time, depending on the current processor mode, as r0, r1, ... ,r13, r14.

• By convention, r13 is used as a stack pointer (sp) in ARM assembly language. The C and C++ compilers always use r13 as the stack pointer.

• In User mode, r14 is used as a link register (lr) to store the return address when a subroutine call is made. It can also be used as a general-purpose register if the return address is stored on the stack.

• In the exception handling modes, r14 holds the return address for the exception, or a subroutine return address if subroutine calls are executed within an exception. r14 can be used as a general-purpose register if the return address is stored on the stack.

Saved Program Status Registers (SPSRs)

• The SPSRs are used to store the CPSR when an exception is taken. One SPSR is accessible in each of the exception-handling modes.

• User mode and System mode do not have an SPSR because they are not exception handling modes.

The program counter (pc)

• The program counter is accessed as r15 (or pc). It is incremented by one word (four bytes) for each instruction in ARM state, or by two bytes in Thumb state.

• Branch instructions load the destination address into the program counter. You can also load the program counter directly using data operation instructions. For example, to return from a subroutine, you can copy the link register into the program counter using:

– MOV pc,lr

• During execution, r15 does not contain the address of the currently executing instruction. The address of the currently executing instruction is typically pc - 8 for ARM, or pc - 4 for Thumb.
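The offset between the executing instruction and the value read from r15 can be written down as a pair of helper functions. This is a minimal sketch of the typical case stated above; the function names are hypothetical and the example address is arbitrary.

```python
def visible_pc(instruction_address, thumb=False):
    """Value typically read from r15 while the given instruction executes."""
    return instruction_address + (4 if thumb else 8)

def executing_address(pc, thumb=False):
    """Address of the executing instruction, given the value read from r15."""
    return pc - (4 if thumb else 8)
```

For an ARM-state instruction at 0x8000, reading the pc typically yields 0x8008, because two further instructions have already entered the pipeline.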

The Current Program Status Register (CPSR)

* The CPSR holds:

– copies of the Arithmetic Logic Unit (ALU) status flags

– the current processor mode

– interrupt disable flags.

* The ALU status flags in the CPSR are used to determine whether conditional instructions are executed or not.

* On Thumb-capable processors, the CPSR also holds the current processor state (ARM or Thumb).

ARM Register Organisation

ARM General registers and Program Counter

• User32 / System: r0-r12, r13 (sp), r14 (lr), r15 (pc)

• FIQ32: r0-r7, r8_fiq-r12_fiq, r13_fiq, r14_fiq, r15 (pc)

• Supervisor32: r0-r12, r13_svc, r14_svc, r15 (pc)

• Abort32: r0-r12, r13_abt, r14_abt, r15 (pc)

• IRQ32: r0-r12, r13_irq, r14_irq, r15 (pc)

• Undefined32: r0-r12, r13_undef, r14_undef, r15 (pc)

ARM Program Status Registers

• User32 / System: cpsr

• FIQ32: cpsr, spsr_fiq

• Supervisor32: cpsr, spsr_svc

• Abort32: cpsr, spsr_abt

• IRQ32: cpsr, spsr_irq

• Undefined32: cpsr, spsr_undef

* Registers with a mode suffix (shaded in the original figure) are banked registers.

Accessing Registers using ARM Instructions

• No breakdown of currently accessible registers.

• All instructions can access r0-r14 directly.

• Most instructions also allow use of the PC.

• Specific instructions to allow access to CPSR and SPSR.

The Program Status Registers (CPSR and SPSRs)

Bit layout (bit 31 down to bit 0): N Z C V in bits [31:28], I F T in bits [7:5], Mode in bits [4:0].

• Condition code flags: copies of the ALU status flags (latched if the instruction has the "S" bit set).

– N = Negative result from ALU

– Z = Zero result from ALU

– C = ALU operation Carried out

– V = ALU operation oVerflowed

• Interrupt disable bits:

– I = 1 disables the IRQ

– F = 1 disables the FIQ

• T bit (Architecture v4T only):

– T = 0: processor in ARM state

– T = 1: processor in Thumb state

• Mode bits M[4:0] define the processor mode.
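The fields of a program status register can be picked apart with shifts and masks. The sketch below assumes the standard ARM bit positions (N, Z, C, V in bits 31 to 28; I, F, T in bits 7, 6, 5; mode in bits 4 to 0). The mode encodings in `MODE_NAMES` come from the ARM architecture definition rather than from this slide, and `decode_cpsr` is a hypothetical helper name.

```python
# M[4:0] encodings as defined by the ARM architecture (not listed on the slide).
MODE_NAMES = {
    0b10000: "User",
    0b10001: "FIQ",
    0b10010: "IRQ",
    0b10011: "Supervisor",
    0b10111: "Abort",
    0b11011: "Undefined",
    0b11111: "System",
}

def decode_cpsr(value):
    """Split a 32-bit CPSR/SPSR value into its named fields."""
    return {
        "N": (value >> 31) & 1,
        "Z": (value >> 30) & 1,
        "C": (value >> 29) & 1,
        "V": (value >> 28) & 1,
        "I": (value >> 7) & 1,   # 1 = IRQ disabled
        "F": (value >> 6) & 1,   # 1 = FIQ disabled
        "T": (value >> 5) & 1,   # 0 = ARM state, 1 = Thumb state
        "mode": MODE_NAMES.get(value & 0x1F, "reserved"),
    }
```

For example, `decode_cpsr(0x600000D3)` reports Z and C set, both interrupts disabled, ARM state, and Supervisor mode.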

Condition Flags

• Negative (N = '1')

– Logical instruction: no meaning

– Arithmetic instruction: bit 31 of the result has been set; indicates a negative number in signed operations

• Zero (Z = '1')

– Logical instruction: result is all zeroes

– Arithmetic instruction: result of operation was zero

• Carry (C = '1')

– Logical instruction: after a shift operation, '1' was left in the carry flag

– Arithmetic instruction: result was greater than 32 bits

• oVerflow (V = '1')

– Logical instruction: no meaning

– Arithmetic instruction: result was greater than 31 bits; indicates a possible corruption of the sign bit in signed numbers
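To make the arithmetic column concrete, here is a small model of how a flag-setting 32-bit addition could derive N, Z, C and V. It is a sketch only: the function name `adds` is borrowed from the ARM mnemonic, and the overflow test uses the common "operands share a sign that the result does not" formulation.

```python
MASK32 = 0xFFFFFFFF

def adds(a, b):
    """32-bit add that also returns the (N, Z, C, V) flags."""
    a &= MASK32
    b &= MASK32
    full = a + b                  # unbounded Python integer, up to 33 bits
    result = full & MASK32
    n = result >> 31              # bit 31 of the result
    z = int(result == 0)          # result is all zeroes
    c = int(full > MASK32)        # result did not fit in 32 bits
    # Signed overflow: both operands have a sign bit that differs from the result's.
    v = ((a ^ result) & (b ^ result)) >> 31
    return result, (n, z, c, v)
```

Adding 1 to 0x7FFFFFFF sets N and V (the sign bit was corrupted), while adding 1 to 0xFFFFFFFF sets Z and C (the result wrapped to zero).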

The Program Counter (R15)

• When the processor is executing in ARM state:

– All instructions are 32 bits in length

– All instructions must be word aligned

– Therefore the PC value is stored in bits [31:2] with bits [1:0] equal to zero (as an instruction cannot be halfword or byte aligned).

• R14 is used as the subroutine link register (LR) and stores the return address when Branch with Link operations are performed, calculated from the PC.

• Thus, to return from a linked branch:

– MOV r15,r14

or

– MOV pc,lr

Internal Organization of ARM

• Two main blocks: datapath and decoder

• Register bank (r0 to r15)

• Two read ports to A-bus/B-bus

• One write port from ALU-bus

• Additional read/write ports for program counter r15

• Barrel shifter - shift/rotate 2nd operand by any number of bits

• ALU performs arithmetic/logic functions

• Address registers/incrementer holds either PC address (with increment) or operand address

Datapath activity during data processing instruction

• SUB r0, r1, #128 ; r0 := r1 - 128

• Subtract instruction: one operand is a constant

• The constant 128 encoded in the instruction passes through the barrel shifter unchanged (a rotation of 0), producing 128

• ALU operates on the operands and writes the result back to register r0

• PC value in address register is incremented and copied back to r15 and the address register

Internal Organization

• Data register holds read/write data from/to memory

• Instruction decoder decodes machine code instructions to produce control signals to datapath

• In single-cycle data processing instructions, data values are read on the A-bus and B-bus, and the result from the ALU is written back into the register bank

• PC value in address register is incremented and copied back to r15 and the address register – this allows fetching new instructions ahead of time (instruction pre-fetch)

ARM7TDMI Microprocessor: Data Processing Instructions

Data processing Instructions

• Largest family of ARM instructions, all sharing the same instruction format.

• Contains:

– Arithmetic operations

– Comparisons (no results, just set condition codes)

– Logical operations

– Data movement between registers

• Remember, this is a load/store architecture

– These instructions only work on registers, NOT memory.

• They each perform a specific operation on one or two operands.

– First operand is always a register: Rn

– Second operand is sent to the ALU via the barrel shifter.

• We will examine the barrel shifter shortly.

Arithmetic and Logical Instructions: General Format

Opcode{Cond}{S} Rd, Rn, Operand2

{Cond} - conditional execution of the instruction

– E.g. GT = greater than, LT = less than

{S} - set the condition flags in the status register after execution.

Operand2 - takes various forms: immediate, register, or shifted register

• You can easily check all the combinations in the ARM quick reference.

Arithmetic Operations

• Operations are:

– ADD operand1 + operand2

– ADC operand1 + operand2 + carry

– SUB operand1 - operand2

– SBC operand1 - operand2 + carry - 1 <Subtract with carry>

– RSB operand2 - operand1 <Reverse subtract>

– RSC operand2 - operand1 + carry - 1 <Reverse subtract with carry>

• Syntax:

– <Operation>{<cond>}{S} Rd, Rn, Operand2

• Examples:

– ADD r0, r1, r2

– SUBGT r3, r3, #1

– RSBLES r4, r5, #5
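The six formulas above translate directly into code. The following sketch evaluates each operation on 32-bit values; the `arith` helper and its argument names are illustrative, and condition codes and flag updates are deliberately left out.

```python
MASK32 = 0xFFFFFFFF

def arith(op, operand1, operand2, carry=0):
    """Evaluate one ARM arithmetic operation, wrapping the result to 32 bits."""
    results = {
        "ADD": operand1 + operand2,
        "ADC": operand1 + operand2 + carry,
        "SUB": operand1 - operand2,
        "SBC": operand1 - operand2 + carry - 1,
        "RSB": operand2 - operand1,
        "RSC": operand2 - operand1 + carry - 1,
    }
    return results[op] & MASK32
```

Note that for SBC and RSC a carry of 1 means "no borrow", so `arith("SBC", 5, 3, carry=1)` gives 2 while `carry=0` gives 1.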

Register operand

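• I = 0 indicates that the second operand is specified in a register, which can also be shifted.

• Immediate shift length format (bits 11 to 0):

– #shift [11:7] | Sh [6:5] | 0 [4] | Rm [3:0]

• Register shift length format (bits 11 to 0):

– Rs [11:8] | 0 [7] | Sh [6:5] | 1 [4] | Rm [3:0]

• Fields:

– #shift: immediate shift length

– Rs: register holding the shift length

– Sh: shift type

– Rm: register used to hold the second operand

Ex (immediate shift length):

ADD R0, R1, R2, LSL #3

Ex (register shift length):

ADD R0, R1, R2, LSL R3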

Shift operations

Guided by “Sh” field in the format

• Sh = 00 Logical Shift Left : LSL Operation

• Sh = 01 Logical Shift Right : LSR Operation

• Sh = 10 Arithmetic Shift Right : ASR Operation

• Sh = 11 Rotate Right : ROR Operation

• Sh = 11 with #shift = 00000 (which would otherwise be ROR #0) is used to encode the RRX (rotate right extended) operation.
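The shift types listed above can be modelled on 32-bit values as follows. This is a simplified sketch: it covers shift amounts from 0 to 31 only, ignores the carry-out, and treats RRX as a separate named operation that shifts the carry flag into bit 31. The function name and string-based `kind` argument are choices made for illustration.

```python
MASK32 = 0xFFFFFFFF

def barrel_shift(value, kind, amount=0, carry_in=0):
    """Apply LSL, LSR, ASR, ROR (amount 0..31) or RRX to a 32-bit value."""
    value &= MASK32
    if kind == "RRX":                      # rotate right by one through carry
        return ((carry_in & 1) << 31) | (value >> 1)
    if not 0 <= amount <= 31:
        raise ValueError("this sketch only handles shift amounts 0..31")
    if amount == 0:
        return value
    if kind == "LSL":
        return (value << amount) & MASK32
    if kind == "LSR":
        return value >> amount
    if kind == "ASR":                      # sign bit is copied into vacated bits
        signed = value - (1 << 32) if value & 0x80000000 else value
        return (signed >> amount) & MASK32
    if kind == "ROR":
        return ((value >> amount) | (value << (32 - amount))) & MASK32
    raise ValueError(f"unknown shift type: {kind}")
```

With this model, `ADD R0, R1, R2, LSL #3` corresponds to `(r1 + barrel_shift(r2, "LSL", 3)) & MASK32`, that is, R1 plus eight times R2.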

Using the Barrel Shifter: The Second Operand

[Figure: Operand 1 feeds the ALU directly; Operand 2 passes through the barrel shifter before reaching the ALU, which produces the result]

Operand 2 can be one of the following:

Register, optionally with a shift operation applied. The shift value can be either:

• a 5-bit unsigned integer

• specified in the bottom byte of another register.

Immediate value:

• 8-bit number

• Can be rotated right through an even number of positions.

• The assembler will calculate the rotate for you from the constant.
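The search an assembler performs for an immediate can be sketched as a loop over the sixteen even rotations. This is an illustration under assumptions: `encode_immediate` is a hypothetical helper, it returns the pair (rotate field, 8-bit value) where the rotate field is half the actual rotation, and it returns `None` when the constant cannot be expressed this way.

```python
MASK32 = 0xFFFFFFFF

def ror32(value, amount):
    """Rotate a 32-bit value right by amount bits."""
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & MASK32

def encode_immediate(constant):
    """Find (rotate_field, imm8) with constant == ror32(imm8, 2 * rotate_field)."""
    constant &= MASK32
    for rotate in range(0, 32, 2):
        # Rotating left by `rotate` undoes a right rotation by `rotate`.
        imm8 = ror32(constant, 32 - rotate)
        if imm8 <= 0xFF:
            return rotate // 2, imm8
    return None  # not expressible as a rotated 8-bit value
```

For example, 128 encodes with no rotation, 0xFF000000 encodes as 0xFF rotated right by 8, and 0x101 has no encoding because its set bits span more than eight positions.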

Logical Operations

• Operations are:

• AND operand1 AND operand2

• EOR operand1 EOR operand2

• ORR operand1 OR operand2

• BIC operand1 AND NOT operand2 [ie bit clear]

• Syntax:

• <Operation>{<cond>}{S} Rd, Rn, Operand2

• Examples:

• AND r0, r1, r2

• BICEQ r2, r3, #7

• EORS r1, r3, r0

Comparisons

• The only effect of the comparisons is to update the condition flags, so there is no need to set the S bit.

• Operations are:

• CMP operand1 - operand2, but result not written

• CMN operand1 + operand2, but result not written

• TST operand1 AND operand2, but result not written

• TEQ operand1 EOR operand2, but result not written

• Syntax:

• <Operation>{<cond>} Rn, Operand2

Comparisons

Examples:

• CMP r0, r1

• CMP R1,Operand2 e.g. CMP R1,R2

– [R1] - [R2]

– Set the N Z C V in CPSR register.

• TSTEQ r2, #5

• TST R1, Operand2 e.g. TST R1,R2

– [R1] AND [R2]

Data Movement

• Operations are:

• MOV Rd, operand2

• MVN Rd, (NOT) operand2

Note that these make no use of operand1.

• Syntax:

• <Operation>{<cond>}{S} Rd, Operand2

• Examples:

• MOV r0, r1

• MVN r0, r1

• MOVS r2, #10

• MVNEQ r1, #0

Quiz

[Flowchart: Start -> test "r0 = r1?". If yes, Stop. If no, test "r0 > r1?": if yes, r0 = r0 - r1; if no, r1 = r1 - r0. Then loop back to the first test.]

* Convert the GCD algorithm given in this flowchart into

1)“Normal” assembler,where only branches can be conditional.

2)ARM assembler, where all instructions are conditional, thus improving code density.

* The only instructions you need are CMP, B and SUB.

Quiz - Sample Solutions

“Normal” Assembler

gcd     cmp   r0, r1        ;reached the end?
        beq   stop
        blt   less          ;if r0 < r1
        sub   r0, r0, r1    ;subtract r1 from r0
        bal   gcd
less    sub   r1, r1, r0    ;subtract r0 from r1
        bal   gcd
stop

ARM Conditional Assembler

gcd     cmp   r0, r1        ;if r0 > r1
        subgt r0, r0, r1    ;subtract r1 from r0
        sublt r1, r1, r0    ;else subtract r0 from r1
        bne   gcd           ;reached the end?
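For checking either solution by hand, the same subtraction-based algorithm can be written as a short Python function. This is a sketch that mirrors the conditional ARM version line by line; the name `gcd_subtract` and the guard against non-positive inputs (which would never terminate) are additions made here.

```python
def gcd_subtract(r0, r1):
    """Greatest common divisor by repeated subtraction, as in the flowchart."""
    if r0 <= 0 or r1 <= 0:
        raise ValueError("both inputs must be positive")
    while r0 != r1:        # bne gcd
        if r0 > r1:        # subgt r0, r0, r1
            r0 -= r1
        else:              # sublt r1, r1, r0
            r1 -= r0
    return r0
```

The conditional version needs only one branch per loop iteration, which is where the code density improvement mentioned in the quiz comes from.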