[Slides]

Kumar R., Singhania A., Castner A., Kohler E. Proceedings of Design Automation Conference, Pages: 218 - 223, June 2007. 111/07/03

Transcript of [Slides]

Page 1: [Slides]

Kumar R., Singhania A., Castner A., Kohler E.

Proceedings of Design Automation Conference, Pages: 218 - 223, June 2007

112/04/13

Page 2: [Slides]

Many embedded systems contain resource constrained microcontrollers where applications, operating system components and device drivers reside within a single address space with no form of memory protection. Programming errors in one application can easily corrupt the state of the operating system and other applications on the microcontroller. In this paper we propose a system that provides memory protection in tiny embedded processors. Our system consists of a software run-time working with minimal low-cost architectural extensions to the processor core that prevents corruption of state by buggy applications. We restrict memory accesses and control flow of applications to protection domains within the address space. The software run-time consists of a memory map: a flexible and efficient data structure that records ownership and layout information of the entire address space.

Abstract

Page 3: [Slides]

Memory map checks are done for store instructions by hardware accelerators that significantly improve the performance of our system. We preserve control flow integrity by maintaining a safe stack that stores return addresses in a protected memory region. Cross domain function calls are redirected through a software based jump table. Enhancements to the microcontroller call and return instructions use the jump table to track the current active domain. We have implemented our scheme on a VHDL model of ATMEGA103 microcontroller. Our evaluations show that embedded applications can enjoy the benefits of memory protection with minimal impact on performance and a modest increase in the area of the microcontroller.

Abstract (cont.)

Page 4: [Slides]

Memory corruption on tiny embedded processor

What’s the Problem

Microcontroller Address Space

Single address space CPU

Shared by apps, drivers and OS

Buggy applications can easily corrupt the state of OS and other applications

Memory protection is an enabling technology for building robust embedded software

Memory is accessible to all SW modules via a single address space

[Figure: Program1, Program2, Program3 and the OS share a single address space]

Page 5: [Slides]

MMU can provide protection domains
However, there is no MMU in embedded microcontrollers
- MMU hardware requires a lot of RAM
- Increases area and power consumption
- Poor performance: high context switch overhead

Memory Protection Unit (MPU)
Static partition of the address space into segments
However, not suited for complex embedded software (such as an OS)
- Supports only two domains (user mode and supervisor mode)
- Protects the kernel from applications, but not the applications from one another

Software-based Fault Isolation
Run-time checks ensure that all memory accesses of a module reside within the segment allocated to it
The run-time checks are introduced through the compiler or binary rewriting
- However, binary rewriting is quite error prone

Related Work

[Figure: MPU with supervisor and user modes; programs P1, P2, P3]

Page 6: [Slides]

Memory protection suited for low-end microcontrollers
Solves memory write protection
- “store”, “call”, and “return” instructions

The Proposed Memory Protection Method

[Figure: Program1, Program2, Program3 and the OS are assigned to Domain A, Domain B, Domain C, Domain D, ..., Domain N. Other labels: Memory Map Table, HW extension, Memory Map Checker, Control Flow Manager, Jump Table, Safe Stack, Software Routine]

Hardware/software co-design approach to memory protection

Page 7: [Slides]

Protection Domain

Domains: logical partitions of the address space
- Every software module stores its state in its own protection domain
- Protect each domain from corruption by other domains

Modules are restricted from writing to memory outside their domain through run-time checks

There is one single trusted domain in the system that is allowed to access all memory

Page 8: [Slides]

Memory Map Data Structure

Fine-grained layout and ownership information

[Figure labels: User Domain, Kernel Domain]

Partition address space into blocks

Memory is allocated to domains as segments (sets of contiguous blocks)

Store encoded information for every block
- Ownership: domain ID
- Layout: start of a logical segment

A domain can be allocated multiple segments

Efficiently encoded using 4 bits per block
- xxx0: start block of a segment
- xxx1: later block of a segment
- xxx is the 3-bit domain ID
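The 4-bit record described on this slide can be sketched in a few lines. This is only an illustrative model: the function names are hypothetical, and it assumes the 3-bit domain ID sits in the upper three bits of the record with the start/later flag in the lowest bit, as the xxx0 / xxx1 notation suggests.

```python
def encode_record(domain_id, is_start):
    """Build one 4-bit memory map record: xxx0 for a start block, xxx1 otherwise."""
    if not 0 <= domain_id <= 7:
        raise ValueError("domain ID must fit in 3 bits")
    return (domain_id << 1) | (0 if is_start else 1)


def decode_record(record):
    """Return (domain_id, is_start) for a 4-bit record."""
    return record >> 1, (record & 1) == 0


def encode_segment(domain_id, num_blocks):
    """Records for one segment: a start block followed by later blocks."""
    return [encode_record(domain_id, i == 0) for i in range(num_blocks)]
```

For example, a three-block segment owned by domain 2 would be recorded as 0100, 0101, 0101. The start flag is what lets two adjacent segments of the same domain be told apart.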

Page 9: [Slides]

Functional unit that validates store operations
- Programs can write only into their own domain
- Invoked before every write access

Memory Map Checker

[Figure: the Memory Map Checker sits between the CPU and RAM. Signals: DATA_BUS, CPU_ADDR, CPU_WR_EN, CPU_STALL, MMC_ADDR, MMC_WR_EN, ST_INSTR]

Triggered on a store instruction

Operations performed by the checker
- Look up the memory map for the issued write address
- Retrieve the permission from the memory map and validate the store
- Verify that the current executing domain is the block owner

Page 10: [Slides]

Assuming a block size of 8 bytes, the nine most significant bits of the 12-bit address represent the block number

Permissions are packed into bytes
- If the encoded information is stored in four bits, each byte contains the information of two contiguous blocks
- The last bit of the block number represents the offset of the permission within the byte
- The remaining bits index into the memory map table

Address to Memory Map Lookup

[Figure: assuming a block size of 8 bytes, the Address (bits 11-0) splits into Block Number (bits 11-3) and Byte Offset (bits 2-0). The Memory Map Offset (bits 11-4) together with mem_map_base indexes the Memory Map Table; 1 byte has 2 memory map records]
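The address split on this slide can be written out as a small sketch. The function name and the default `mem_map_base` of 0 are hypothetical, and which half of the byte holds the even or odd block is not stated on the slide, so the returned selector is simply the last bit of the block number.

```python
def locate_record(addr, mem_map_base=0):
    """Map a 12-bit write address to (memory map byte address, record selector).

    Assumes 8-byte blocks and two 4-bit records per memory map byte.
    """
    addr &= 0xFFF                 # address bits 11-0
    block_number = addr >> 3      # bits 11-3: drop the 3-bit byte offset
    map_offset = block_number >> 1    # bits 11-4: index into the table
    selector = block_number & 1       # bit 3: which of the two records
    return mem_map_base + map_offset, selector
```

With these assumptions, addresses 0x000 to 0x007 and 0x008 to 0x00F share memory map byte 0 and differ only in the selector, and the full 4KB range needs 256 table bytes.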

Page 11: [Slides]

In cycle 2
- The MMC stalls processor execution and takes control of the address bus to memory
- It performs address translation to look up the memory map for the issued write address
- It reads the memory map table to retrieve the permission

In cycle 3
- It compares the ownership information in the retrieved permission to the current executing domain ID
- If the check is successful, the MMC issues the write operation to data memory
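A behavioural sketch of the check may help here. It is a software model, not the hardware: the names are hypothetical, the trusted domain ID of 7 is an assumption, the packing (even block in the low nibble) is an assumption, and raising an exception on a failed check stands in for whatever the hardware does, which these slides do not specify.

```python
class ProtectionFault(Exception):
    """Raised when a store targets a block the current domain does not own."""


TRUSTED_DOMAIN = 7  # assumed ID of the single trusted domain


def checked_store(ram, mem_map, current_domain, addr, value):
    """Model of a protected store with 8-byte blocks and 4-bit records."""
    # Cycle 2: translate the write address and read the packed record byte.
    block = (addr & 0xFFF) >> 3
    packed = mem_map[block >> 1]
    # Assumed packing: even block in the low nibble, odd block in the high one.
    record = (packed >> 4) & 0xF if block & 1 else packed & 0xF
    # Cycle 3: compare the owner field with the current domain ID.
    owner = record >> 1
    if current_domain != TRUSTED_DOMAIN and owner != current_domain:
        raise ProtectionFault(f"domain {current_domain} wrote to {addr:#05x}")
    # Check passed: perform the write to data memory.
    ram[addr & 0xFFF] = value & 0xFF
```

In this model a store by domain 3 to a block owned by domain 3 goes through, while the same store to a neighbouring block owned by another domain raises `ProtectionFault` and leaves RAM unchanged.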

Operations Performed by Memory Map Checker (MMC)

[Figure: timing diagram over Cycle 1, Cycle 2 and Cycle 3, comparing Regular Mode and Protected Mode. Signals: CLK, CPU_ADDR, CPU_WR_EN, MMC_ADDR, MMC_WR_EN, CPU_STALL; address values CPU_WR_ADDR and MMC_RD_ADDR]

Page 12: [Slides]

The software library manages all the available memory
- Ensures the memory map accurately reflects current ownership and layout
- Provides “malloc”, “free” and “change_own” calls that automatically update the memory map data structure

Only the block owner is permitted to free a block or change its ownership
- To enforce this condition, the software library reads the current active domain ID

The memory map is set up in a protected region
- This prevents corruption of the memory map data structure

The library initializes the MMC with the proper block size, number of protection domains and the range of protected address space
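The ownership rule for “free” and “change_own” can be illustrated with a toy model. Everything here is a sketch under assumptions: the class and method signatures are hypothetical (the real calls would read the current domain ID from the hardware rather than take it as an argument), the map is kept as unpacked per-block lists instead of 4-bit records, and the first-fit search is just one simple allocation policy.

```python
class MemMapAllocator:
    """Toy block allocator that keeps ownership and layout per block."""

    def __init__(self, num_blocks):
        self.owner = [None] * num_blocks   # None marks a free block
        self.start = [False] * num_blocks  # True on the first block of a segment

    def malloc(self, domain, nblocks):
        """First-fit: claim nblocks contiguous free blocks for domain."""
        if nblocks < 1:
            raise ValueError("nblocks must be at least 1")
        run = 0
        for i, o in enumerate(self.owner):
            run = run + 1 if o is None else 0
            if run == nblocks:
                first = i - nblocks + 1
                for b in range(first, i + 1):
                    self.owner[b] = domain
                    self.start[b] = (b == first)
                return first
        return None  # no contiguous run of free blocks is large enough

    def _segment(self, domain, first):
        """Blocks of the segment starting at first, if domain owns it."""
        if not (0 <= first < len(self.owner)) or not self.start[first]:
            raise ValueError("not the start of a segment")
        if self.owner[first] != domain:
            raise PermissionError("only the owner may free or transfer")
        end = first + 1
        while end < len(self.owner) and self.owner[end] == domain \
                and not self.start[end]:
            end += 1
        return range(first, end)

    def free(self, domain, first):
        for b in self._segment(domain, first):
            self.owner[b], self.start[b] = None, False

    def change_own(self, domain, first, new_domain):
        for b in self._segment(domain, first):
            self.owner[b] = new_domain
```

The start flag is what bounds a segment when walking it, which is why the memory map records layout as well as ownership.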

Memory Map Software Library

Page 13: [Slides]

Control flow can become corrupt at run-time
- Example: returns on a corrupted stack (return addresses are stored on the stack)
- The memory map can’t prevent such internal memory corruption: programming errors can cause a module to corrupt its own state

The control flow manager ensures that control can never flow out of a domain, except
- Via calls to functions exported by other domains
- Via returns to calls from other domains

The current executing domain also needs to be tracked
- Required by the memory map checker to validate write accesses

Control Flow Manager

Preserve control flow integrity through the safe stack that stores return addresses

Page 14: [Slides]

Each domain has its own jump table in flash memory, containing the set of functions exported by that domain

The jump table can’t be corrupted, because modules are not allowed to write to flash memory

Each entry in the jump table is a jump instruction to a valid exported function

Cross domain calls are redirected through the jump table to functions exported by a domain

Cross Domain Linking

[Figure: Program Memory. Domain A executes "call fooJT"; the Domain B Jump Table holds "fooJT: jmp foo"; Domain B contains "foo: ... ret". Labels: Cross Domain Call, Verify call into jump table, Compute callee domain ID, Jump exception]

Page 15: [Slides]

The jump tables of all domains are stored at a fixed location in flash memory
- This simplifies verifying the target address of a call
- A valid target address has to reside in the jump table

The ID of the called domain can be easily determined
- First, compute the offset of the target address from the base address of the jump table
- Then, divide it by the size of one domain's jump table

The cross domain call state machine
- Pushes the current domain ID onto the stack during a cross domain call
- Restores the previous domain ID and transfers control back to the caller’s domain during a cross domain return
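The bounds check and the domain ID computation above can be modelled roughly as follows. The class, its parameters and the example addresses are hypothetical; the sketch assumes every domain's jump table has the same size and that the tables are laid out back to back from the base address.

```python
class JumpException(Exception):
    """Raised here when a cross domain call does not land in a jump table."""


class DomainTracker:
    def __init__(self, jmp_tbl_base, tbl_size, num_domains, current_domain):
        self.base = jmp_tbl_base
        self.tbl_size = tbl_size                  # size of one domain's table
        self.upper = jmp_tbl_base + tbl_size * num_domains
        self.current = current_domain
        self.saved = []                           # stands in for the stack

    def in_jump_table(self, call_addr):
        # base <= call_addr AND call_addr < upper bound
        return self.base <= call_addr < self.upper

    def cross_domain_call(self, call_addr):
        if not self.in_jump_table(call_addr):
            raise JumpException(hex(call_addr))
        self.saved.append(self.current)           # push current domain ID
        self.current = (call_addr - self.base) // self.tbl_size
        return self.current

    def cross_domain_return(self):
        self.current = self.saved.pop()           # restore previous domain ID
        return self.current
```

If the table size is a power of two, the division reduces to a shift, which is presumably why a fixed, uniform layout is attractive in hardware.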

Domain Tracking

[Figure: call_addr is compared with jmp_tbl_base_address (<=) and with jmp_tbl_upper_bound (<); the two comparison results are ANDed]

Page 16: [Slides]

Single stack shared by all domains

Protection model
- Prevent corruption of the stack belonging to a domain by any module belonging to a different domain

Bounds set during cross domain call
- The processor copies the current stack pointer into a stack_bound register

Enforced by the MMC before all writes
- No writes beyond the stack bound
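As a rough sketch of the stack bound rule: the model below assumes a downward-growing stack (as on AVR), so caller frames lie at higher addresses than the bound and the callee may only write at or below it. The class name, the inclusive comparison and the example addresses are assumptions made for illustration.

```python
class StackGuard:
    """Tracks stack_bound across cross domain calls (descending stack)."""

    def __init__(self, stack_base):
        self.stack_bound = stack_base   # initially the whole stack is writable
        self.saved_bounds = []

    def on_cross_domain_call(self, stack_ptr):
        # Save the old bound, then copy the current stack pointer into it.
        self.saved_bounds.append(self.stack_bound)
        self.stack_bound = stack_ptr

    def on_cross_domain_return(self):
        self.stack_bound = self.saved_bounds.pop()

    def write_allowed(self, addr):
        # Frames above the bound belong to the caller's domain.
        return addr <= self.stack_bound
```

After a cross domain call made with the stack pointer at 0x0FE0, the callee can write at 0x0FD0 but not at 0x0FF0; the caller regains access once the call returns.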

Run-Time Stack Protection

[Figure: Run Time Stack from Stack Base to Stack Ptr.; Stack_Bound separates the Caller Domain Stack Frame from the Callee Domain Stack Frame]

Prevent cross domain corruption of stack

Page 17: [Slides]

The stack is protected from corruption by modules in other domains
- However, programming errors can cause a module to corrupt its own stack

Therefore, maintain an extra stack in protected memory
- Return addresses are stored in a separate stack that resides in a different protection domain

Set up the safe stack at the end of all global data and make it grow up toward the run-time stack
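The layout can be pictured with a small model of the safe stack. The names are hypothetical, return addresses are assumed to be two bytes wide, and the collision check against the run-time stack pointer is an assumption about how overflow might be detected, not something these slides describe.

```python
class SafeStack:
    """Upward-growing stack of 2-byte return addresses in protected memory."""

    def __init__(self, base):
        self.base = base      # first address after the global data
        self.ptr = base       # next free address; grows upward
        self.mem = {}

    def push(self, ret_addr, run_time_sp):
        # The two bytes at ptr and ptr + 1 must stay below the run-time stack.
        if self.ptr + 1 >= run_time_sp:
            raise MemoryError("safe stack would collide with run-time stack")
        self.mem[self.ptr] = ret_addr & 0xFF
        self.mem[self.ptr + 1] = (ret_addr >> 8) & 0xFF
        self.ptr += 2

    def pop(self):
        if self.ptr == self.base:
            raise IndexError("safe stack is empty")
        self.ptr -= 2
        return self.mem[self.ptr] | (self.mem[self.ptr + 1] << 8)
```

Because the return addresses never sit in the module's own stack frames, a buffer overrun on the run-time stack cannot redirect a return.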

Safe Stack

[Figure: memory layout with HEAP and GLOBALS, SAFE STACK and RUN-TIME STACK]

Safe Stack and Run-Time Stack approach one another

Page 18: [Slides]

Performance Overhead (CPU Cycles) Introduced by the Memory Protection Mechanism

Compared with a software based approach through binary rewriting

Superior performance of run-time checks in HW

High overhead of the software based memory map checker
- It requires bit shift operations to translate the write address to a memory map lookup

Cross domain call and return have an overhead of five cycles
- The call pushes the “current domain ID”, “stack bound” and “return address” to the stack
- Five bytes of information need to be pushed to the stack, and one byte can be written every cycle
- The return restores the values read from the stack

Saving and restoring return addresses doesn’t introduce added overhead
- The processor simply redirects the store of the return address to the safe stack when it pushes the return address to the run-time stack

Unit: CPU Cycles

Page 19: [Slides]

Overhead introduced in the memory map software library
- The memory map needs to be updated during allocation, free and transfer of memory

Higher overheads of the free and change_own calls
- Due to additional checks that prevent illegal freeing or ownership transfer of memory by non-owners

Performance Overhead (CPU Cycles) of Software Library Introduced by the Protection Mechanism

Compare overhead of memory allocation routines in the presence and absence of the protection mechanism

Unit: CPU Cycles

Page 20: [Slides]

Code and Data Memory Usage of the Software Library

Memory map size is 256 bytes for multi-domain protection
- This represents an overhead of 6.25% (256 bytes / 4KB)

Flexible data structure: trade off RAM for protection
- The size of the memory map can be reduced by restricting the portion of the address space that requires memory map protection

The total code memory usage of the software library is 3674 bytes, an overhead of 2.8% (3674 bytes / 128KB)
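The figures on this slide can be reproduced with a short calculation. The helper names are hypothetical; the 8-byte block size and 4 bits per block are taken from the memory map slides, and the 2KB case is only an illustration of protecting part of the address space.

```python
def memory_map_bytes(protected_bytes, block_size=8, bits_per_block=4):
    """RAM needed for the memory map covering protected_bytes of address space."""
    blocks = protected_bytes // block_size
    return (blocks * bits_per_block + 7) // 8   # round up to whole bytes


def overhead_percent(used_bytes, total_bytes):
    return 100.0 * used_bytes / total_bytes


ram_map = memory_map_bytes(4 * 1024)                    # 512 blocks, 256 bytes
ram_overhead = overhead_percent(ram_map, 4 * 1024)      # 6.25
code_overhead = overhead_percent(3674, 128 * 1024)      # about 2.8
half_map = memory_map_bytes(2 * 1024)                   # 128 bytes for 2KB
```

Halving the protected range halves the map, which is the RAM-for-protection trade-off the slide refers to.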

Page 21: [Slides]

Most of the additions to the core area are in the memory map decoder
- It supports an arbitrary bit shift in a single cycle
- We can eliminate this overhead for a fixed block size and number of protection domains

32% overall increase in the core area
- This represents a modest increase in the overall chip area, as the core occupies only a small fraction of the overall area

Hardware Overhead of the Memory Protection Mechanism

Page 22: [Slides]

HW/SW co-design approach for memory protection
- Enabling technology for reliable embedded software systems
- Combines flexibility of software with efficiency of hardware

Building blocks for memory protection
- Memory map checker
- Control flow manager

Practical system with widespread applications
- Low resource utilization
- Minimal performance overhead
- Binary compatible with existing software and tool-chains: the software library provides a standard programming interface, and the instruction set architecture of the processor is not modified

Conclusions

Page 23: [Slides]

Memory protection suited for low-end microcontrollers

Doesn’t statically partition the address space
- Relies on a memory map data structure
- Records ownership and layout information of the entire address space

Doesn’t rewrite binaries to introduce run-time checks
- Enhances the “store”, “call”, and “return” instructions to perform run-time checks in hardware

Hardware/software co-design approach to memory protection

The Proposed Memory Protection Method

[Figure labels: Hardware Extensions, Software Routine, Memory Map Checker (STORE instruction extension), Memory Map, Domain Tracker (CALL instruction extension), Domain Tracker (RETURN instruction extension), Jump Table, Safe Stack]

Low cost architecture extensions and a software library work together to isolate software modules from one another