Introduction to Embedded Software Development
School of Software Engineering
2005
6. Windows CE System Architecture
Overview
- System Architecture
- NK.EXE
- FILESYS.EXE
- DEVICE.EXE
- GWES.EXE
- SERVICES.EXE
- Thread Migration
Windows CE System Architecture
[Diagram: system architecture. Application(s) sit on COREDLL, above NK.EXE and the OAL. GWES.EXE covers touch, display and keyboard. FILESYS.EXE contains the Object Store, ROM FS and Storage Manager. DEVICE.EXE hosts DevMgr.dll with the block device, serial, USB (function), PC Card and custom drivers. SERVICES.EXE hosts FTP, HTTPD and TELNETD. Hardware layer: CPU, RAM, ROM/flash, timer, INTC.]
NK.EXE
- NK.LIB + OAL.LIB = NK.EXE
- Kernel is hardware-architecture agnostic but CPU instruction-set specific
  - Designed to keep the OAL as small as possible
- Microsoft provides NK.LIB as a pre-built library
  - Most of the source is available via the Shared Source License
  - More is available via the Premium Shared Source Program
- Provides
  - Memory management
  - Scheduler
  - Protected Server Library (PSL) call mechanism for the micro-kernel architecture
  - Base Win32 function implementation
[Diagram: NK.EXE and the OAL over the CPU, RAM, ROM/flash, timer and INTC]
Protected Server Libraries (PSL)
- System process that implements a system API set
- Mechanism for implementing OS functionality in isolated processes
- PSL calls run through the kernel (NK.EXE)
- Not end-user extensible
  - You can’t just create a new PSL and plug it in
GWES.EXE
- Graphical Windowing and Events System (GWES)
  - NLED driver removed in V5.0, so GWES now builds separately from the BSP
- Manages the entire graphical user interface and all input devices
- Desktop USER32 + GDI32 as a single PSL process
[Diagram: GWES.EXE over the touch, display and keyboard hardware]
DEVICE.EXE
- Device Manager
  - Battery driver removed from GWES in V4.1; it is now a stream driver in device.exe
    - Works on headless devices!
  - Core separated into a DLL so that drivers can make faster calls to Device Manager APIs
- Provides all driver-related APIs to the system
- Uses the registry to load bus drivers at boot time
[Diagram: DEVICE.EXE hosting DevMgr.dll with the block device, serial and custom drivers over the hardware]
Services.exe
- Host process for services
  - Separated from Device.exe for greater isolation
- FTP, TELNET, HTTPD (Web), UPnP, SMB, etc.
- Custom services
- Command-line utility for starting, stopping and restarting services
- Programmatic APIs for manipulating services
[Diagram: SERVICES.EXE hosting FTP, HTTPD and TELNETD]
File System
- All file system functions and APIs are managed by FileSys.exe
- Has a single root “\”, with NO drive names such as “C:”
- Has 3 components:
  - Object Store
  - Storage Manager
  - ROM File System
File System Overview
Object Store
- A heap managed by FileSys.exe
- Includes:
  - Registry
  - Database
  - RAM File System
- The RAM File System usually uses the root directly
  - Ex: “\myfile.txt” is in RAM
ROM File System
- Mapped as the “\Windows” directory
- All files in “\Windows” are read-only
- Usually the contents of the OS image (nk.bin or nk.nb0)
Storage Manager
- Responsible for:
  - Storage device drivers
  - Partition device drivers
  - File system device drivers
  - File system filters
Thread Migration
[Diagram: the system architecture figure repeated (Application(s), COREDLL, NK.EXE, OAL, GWES.EXE, FILESYS.EXE, DEVICE.EXE, SERVICES.EXE, hardware), annotated with a CreateFile(…) call made by an application]
Overview
- Processes
- Threads
- Virtual Memory
Windows CE Kernel Features
- Multiple processes
  - Supports a maximum of 32 separate processes
- Multiple threads
  - Supports 256 thread priorities
- Fibers
  - Unit of execution that must be manually scheduled by the application
- Synchronization objects
  - Critical sections, mutexes, semaphores, events, message queues
- Memory model
  - Virtual memory; code sections are paged; no backing store for data sections
Processes
- Static context within which one or more threads run
  - Processes aren’t scheduled to run; threads are
- The number of simultaneous processes is limited to 32 because:
  - It is a reasonable limit for most embedded devices, as multithreading is recommended over multiple processes
  - The architectures of some supported CPUs have fixed MMU mappings
- Windows CE uses the same loading and unloading mechanism as Windows XP (and other desktop Win32 versions of Windows)
- Support for console applications
  - But not the same API as desktop Win32
- Call CreateProcess() to start a process
Threads
- Unit of execution in Win32
- Scheduled by the OS based on priority
- Higher-priority threads pre-empt lower-priority threads when ready to run
- Threads at the same priority are scheduled in round-robin fashion
- Default quantum is 100 ms, configurable by the OEM in the OAL
  - Can also be programmed per thread at run time
Thread Priority
- Thread A has the highest priority and runs until it blocks or completes
- Threads B and C run in “round-robin” as long as thread A is blocked
- In round-robin, each thread runs for a specific amount of time, called a quantum
- The lower the priority number, the higher the priority
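The selection rule on this slide can be sketched in portable C. This is only an illustrative model, not the Windows CE scheduler: the names `sim_thread`, `pick_next` and `NO_THREAD` are hypothetical, and blocking is reduced to a `ready` flag.

```c
#include <stddef.h>

#define NO_THREAD ((size_t)-1)

/* Hypothetical model of a schedulable thread. */
typedef struct {
    unsigned char priority; /* 0 = highest priority, 255 = lowest */
    int ready;              /* non-zero when the thread is able to run */
} sim_thread;

/* Returns the index of the thread to run for the next quantum, or NO_THREAD
   if nothing is ready. `last` is the index that ran in the previous quantum
   (or NO_THREAD). The scan starts just after `last`, so threads that share
   the best priority take turns (round-robin). */
size_t pick_next(const sim_thread *t, size_t n, size_t last)
{
    size_t best = NO_THREAD;
    size_t start = (last == NO_THREAD || n == 0) ? 0 : (last + 1) % n;

    for (size_t k = 0; k < n; ++k) {
        size_t i = (start + k) % n;
        if (!t[i].ready)
            continue;
        /* Lower number wins; on a tie the earlier thread in scan order stays. */
        if (best == NO_THREAD || t[i].priority < t[best].priority)
            best = i;
    }
    return best;
}
```

With A at priority 100 and B and C at 251, `pick_next` keeps returning A while it is ready, and alternates between B and C once A's `ready` flag is cleared.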
Thread Priority Map (Example)
Priority   Component
0-19       Open - Real Time Above Drivers
20         Graphics Vertical Retrace
99         Power management Resume Thread
100-108    USB OHCI UHCI, Serial
109-129    IRSIR1, NDIS, Touch
130        KITL
131        VMini
132        CxPort
145        PS2 Keyboard
148        IRComm
150        TAPI
248        Power Management
249        WaveDev, Mouse, PnP, Power
250        WaveAPI
251        Normal
252-255    Open - Applications
Priority Inversion
- Avoid priority inversion by keeping all threads that wait for the same resource at the same priority
- Example: Thread 1 is blocked waiting for a resource owned by Thread 3, causing priority inversion
[Diagram: timeline of Thread 1, Thread 2 and Thread 3 across the high, medium and low priority levels, with Thread 3 as the resource owner. Labelled events: Preempt, Preempt, Blocked, Priority Inversion, Priority Restored.]
Thread API
- Thread creation
  - CreateThread - creates a new thread at normal priority
- Thread priority
  - GetThreadPriority - current priority level of a thread
  - SetThreadPriority - change the priority level of a thread from normal (251)
  - CeGetThreadPriority - current priority level of a real-time thread
  - CeSetThreadPriority - change the priority level of a real-time thread
- Thread suspend
  - Sleep(0) - relinquish the remainder of the quantum to other threads at the same priority
  - Sleep(n) - suspend execution for n milliseconds
  - Sleep(INFINITE) - suspend execution until thread termination or resume
  - SleepTillTick - suspend execution until the next system tick
  - SuspendThread - increments the suspend count to stop a user-mode thread
  - ResumeThread - decrements the suspend count
Process & Thread
- A Windows CE process does NOT support environment variables
  _wfopen(L"%WINDOWS%\\a.txt", L"w"); // error: %WINDOWS% is not expanded
- A Windows CE process does NOT support a current directory
  _wfopen(L"a.txt", L"w"); // error: there is no current directory; the root directory is searched first, then the \Windows directory
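Since there is no current directory, one common workaround is to always pass absolute paths. The sketch below shows one way to build such a path in standard C; `make_abs_path` is a hypothetical helper written for this note, not a Windows CE API, and it assumes the caller already knows which directory it wants.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: joins an absolute directory (starting with '\') and a
   file name into `out`. Returns 0 on success, -1 if the arguments are invalid
   or the result does not fit in `cap` wide characters. */
int make_abs_path(wchar_t *out, size_t cap, const wchar_t *dir, const wchar_t *name)
{
    size_t dlen, nlen, need_sep;

    if (out == NULL || dir == NULL || name == NULL)
        return -1;
    if (dir[0] != L'\\' || name[0] == L'\0')
        return -1;                      /* directory must be absolute */

    dlen = wcslen(dir);
    nlen = wcslen(name);
    need_sep = (dir[dlen - 1] != L'\\') ? 1 : 0;

    if (dlen + need_sep + nlen + 1 > cap)
        return -1;                      /* would overflow the buffer */

    wmemcpy(out, dir, dlen);
    if (need_sep)
        out[dlen++] = L'\\';
    wmemcpy(out + dlen, name, nlen + 1); /* copies the terminating L'\0' too */
    return 0;
}
```

A caller would then hand the resulting string to `_wfopen` instead of a bare file name, for example after `make_abs_path(buf, 64, L"\\Windows", L"a.txt")`.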
Synchronization Objects
- Thread
  - Requests a synchronization object and blocks while the object is not in the “Signaled” state
  - Resumes when the object is in the “Signaled” state
- Synchronization object types
  - Critical Section
  - Mutex
  - Semaphore
  - Event
- Can also use Interlocked functions and point-to-point message queues
Synchronization (Critical Sections)
- Overview
  - Allows multiple threads shared access to the same data
  - Protects a section of code with mutually exclusive access
  - Other threads are blocked until ownership is released
  - Each critical section is an application-provided data structure that is used by the OS
  - Only useful within a single process, but more efficient than a mutex
- Functions
  - InitializeCriticalSection - initializes the CRITICAL_SECTION structure for a critical section object
  - EnterCriticalSection - calls block until the owner thread calls LeaveCriticalSection
  - TryEnterCriticalSection - non-blocking version of EnterCriticalSection
  - LeaveCriticalSection - releases ownership of a critical section object
  - DeleteCriticalSection - releases resources allocated by InitializeCriticalSection
Synchronization Objects (Mutexes)
- Overview
  - Only one thread can own a mutex at a time
  - Global named mutex objects permit inter-process synchronization
  - Signaled state when not owned by a thread
  - Non-signaled state when owned by a thread
- Functions
  - CreateMutex
    - Creates a named or unnamed mutex object if it doesn’t already exist
    - Non-blocking, with return status for already exists or abandoned
  - WaitForSingleObject or WaitForMultipleObjects
    - Calls block until the current owner releases the specified mutex object
    - Calls do not block when waiting for a mutex object the thread already owns
  - ReleaseMutex
    - Called once per call returned from a Wait function
    - Abandoned state if not called before the owner thread terminates
  - CloseHandle
    - Releases and destroys the mutex object when the last handle is closed
Synchronization Objects (Semaphores)
- Overview
  - Limits the number of threads using a protected resource
  - Global named semaphore objects for inter-process synchronization
  - Signaled state when its count is greater than zero
  - Non-signaled state when its count is zero
- Functions
  - CreateSemaphore
    - Creates a named or unnamed semaphore object if it doesn’t already exist
    - Multiple processes can use the same named semaphore object
  - WaitForSingleObject or WaitForMultipleObjects
    - Calls block until the semaphore count is non-zero
  - ReleaseSemaphore
    - Increments the semaphore count by the specified amount
  - CloseHandle
    - Destroys a semaphore object when its last handle is closed
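The signaled and non-signaled rule can be made concrete with a small single-threaded model. This is a sketch of the counting behaviour only; `sim_sem` and its functions are hypothetical names, nothing here blocks or is thread-safe, and the maximum count is assumed to be fixed at creation as in Win32.

```c
#include <stddef.h>

/* Hypothetical model of a counting semaphore's state. */
typedef struct {
    long count; /* current count; the object is signaled while count > 0 */
    long max;   /* upper bound fixed when the semaphore is created */
} sim_sem;

/* Returns 0 on success, -1 if the arguments are inconsistent. */
int sim_sem_init(sim_sem *s, long initial, long max)
{
    if (s == NULL || max <= 0 || initial < 0 || initial > max)
        return -1;
    s->count = initial;
    s->max = max;
    return 0;
}

int sim_sem_signaled(const sim_sem *s)
{
    return s->count > 0;
}

/* Models a wait with a zero timeout: takes one unit if available.
   Returns 1 if the wait was satisfied, 0 if the caller would have blocked. */
int sim_sem_try_wait(sim_sem *s)
{
    if (s->count == 0)
        return 0;
    s->count--;
    return 1;
}

/* Models a release of n units. Fails (returns 0) without changing the count
   if n is not positive or the count would exceed the maximum. */
int sim_sem_release(sim_sem *s, long n, long *previous)
{
    if (n <= 0 || n > s->max - s->count)
        return 0;
    if (previous != NULL)
        *previous = s->count;
    s->count += n;
    return 1;
}
```

A semaphore initialized with a count of 2 lets two waits succeed; the third would block until some thread releases a unit.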
Synchronization Objects (Events)
- Overview
  - Local unnamed event objects are used within a process context
  - Global named event objects permit inter-process synchronization
  - Signaled state when the event occurs
  - Non-signaled state when the event has not occurred
- Functions
  - CreateEvent - creates a named or unnamed event object
  - SetEvent - sets the event object to signaled
  - ResetEvent - sets the event object to nonsignaled
  - PulseEvent - sets the event object to signaled and then resets it to nonsignaled after releasing the specified number of threads
  - WaitForSingleObject or WaitForMultipleObjects - calls block until the specified event is signaled
  - CloseHandle - destroys an event object when its last handle is closed
Synchronization (Interlocked Functions)
- Overview
  - Synchronize access to a variable shared between multiple threads
  - Prevent a thread from being pre-empted while accessing the shared variable
  - Interlocked atomic actions provide mutually exclusive calls between threads
- Functions
  - InterlockedIncrement - increment a shared variable and check the resulting value
  - InterlockedDecrement - decrement a shared variable and check the resulting value
  - InterlockedExchange - exchange values of specified variables
  - InterlockedTestExchange - exchange values when a variable matches
  - InterlockedCompareExchange - atomic exchange based on compare
  - InterlockedCompareExchangePointer - exchange values on atomic compare
  - InterlockedExchangePointer - atomic exchange of a pair of values
  - InterlockedExchangeAdd - atomic increment of an Addend variable
Synchronization (Point-to-Point Message Queues)
- Overview
  - Allows multiple readers of a user-defined message queue
  - High-priority and alert messages
- Functions
  - CreateMsgQueue - creates or opens a message queue
  - OpenMsgQueue - opens a handle to an existing message queue
  - CloseMsgQueue - closes an open message queue
  - ReadMsgQueue - reads a single message from a message queue
  - WriteMsgQueue - writes a single message into a message queue
  - GetMsgQueueInfo - returns information about a message queue
Memory Management
[Diagram: memory layers, top to bottom]
- Application
- C Runtime (malloc, new…)
- Logical Memory (heap, stack)
- Virtual Memory
- Physical Memory   * Storage Device
* Exists only in desktop Windows
Memory Architecture
- Physical memory
  - Actual RAM/ROM and memory-mapped devices, with addresses as they appear on the external (or internal) bus
- Virtual memory
  - Memory system that runs addresses through a Memory Management Unit (MMU), which translates a “virtual” address into a physical one
  - Allows code to be paged into memory as needed
Virtual Memory
- Virtual memory management
  - Windows CE provides only one virtual address space of 4 GB for all applications to use
  - The system still maintains protection between processes
  - Allows faster inter-process thread migration
- Using virtual memory
  - Allocate large blocks of memory
  - Windows CE manages virtual memory in 64 KB blocks
- Using the local heap
  - A region of reserved virtual memory space that the kernel manages for your application
- Using the stack
  - The storage area for variables that are referenced in a function
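[Diagram: the 4 GB address space. Upper 2 GB: reserved. Lower 2 GB: memory mapping (shared) at the top, then Slot 32: Process 32 down through Slot 1: XIP DLL code and Slot 0: active process. Each slot is 32 MB.]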
Overview
- Virtual Memory Model
- Static Mapped Virtual Addresses
- Process Model
- Process Memory
- Processes
- Modules
- Heaps
- Stack
Virtual Memory Model
- Virtual memory
  - Single 32-bit (4 GB) flat virtual memory address space
  - Permits efficient use of physical memory with protection
- Virtual addressing
  - The Memory Management Unit (MMU) “owns” physical memory
  - Virtual addresses are translated to physical addresses by the MMU
  - A valid virtual address must map to a physical address
  - Statically or dynamically mapped virtual addressing
- Physical addressing
  - Only used by the CPU before the MMU is activated during power-up
Virtual Memory Model
- Privilege modes
  - Virtual memory space is split between kernel mode and user mode
  - All processes share the same flat virtual memory address space
  - Kernel mode manages user-mode process protection via the MMU
- Kernel space
  - Used only by kernel-mode code with privileged access (Kmode)
  - Mostly statically mapped virtual addresses (never page faults)
- User space
  - Organized as 64 slots of 32 MB (2^25 bytes) each
  - Mostly dynamically mapped virtual addresses
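Because every slot is 2^25 bytes, the slot number is simply the upper bits of the virtual address. A minimal sketch of that arithmetic follows; `slot_base`, `slot_of` and `is_user_slot` are hypothetical names chosen for this note, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

#define SLOT_SHIFT 25u   /* each slot covers 2^25 bytes = 32 MB */
#define USER_SLOTS 64u   /* 64 slots x 32 MB = 2 GB of user space */

/* Base virtual address of a slot. Slot numbers of 128 and above would not
   fit in a 32-bit address, so they are rejected. */
uint32_t slot_base(unsigned slot)
{
    assert(slot < 128u);
    return (uint32_t)slot << SLOT_SHIFT;
}

/* Slot number that a virtual address falls into. */
unsigned slot_of(uint32_t va)
{
    return (unsigned)(va >> SLOT_SHIFT);
}

/* Non-zero when the slot lies in the 2 GB user space. */
int is_user_slot(unsigned slot)
{
    return slot < USER_SLOTS;
}
```

For example, slot 1 starts at 0200 0000 and slot 2 at 0400 0000, and the last user-space address, 7FFF FFFF, falls in slot 63.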
Virtual Memory Model
Total 4 GB virtual space: 2 GB kernel space above 2 GB user space

Address range            Region
Kernel space (2 GB)
E000 0000 - FFFF FFFF    Kernel addresses: KPAGE, trap area, others
C400 0000 - DFFF FFFF    Unused
C200 0000 - C3FF FFFF    Slot 97: NK.EXE
C000 0000 - C1FF FFFF    Unused
A000 0000 - BFFF FFFF    Statically mapped virtual addresses: un-cached
8000 0000 - 9FFF FFFF    Statically mapped virtual addresses: cached
User space (2 GB)
4200 0000 - 7FFF FFFF    Slots 33-63: object store and memory-mapped files
0400 0000 - 41FF FFFF    Slots 2-32: processes
0200 0000 - 03FF FFFF    Slot 1: XIP DLL code
0000 0000 - 01FF FFFF    Slot 0: current process
Static Mapped Virtual Addresses
[Diagram: address translation from physical memory (32 MB flash, 64 MB RAM) to virtual memory. In kernel space the flash and RAM appear twice: once in the 512 MB cached region starting at 8000 0000 and once in the 512 MB uncached region starting at A000 0000. The 2 GB user space lies below 8000 0000. Labelled addresses: 0000 0000, 0400 0000, 8000 0000, 8200 0000, A000 0000, C000 0000, FFFF FFFF.]
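The cached and uncached regions are two 512 MB windows onto the same physical memory, so moving between the two views of a statically mapped address is a fixed offset. The sketch below illustrates that arithmetic only; the function and macro names are hypothetical, and the base addresses are the ones labelled in the diagram.

```c
#include <stdint.h>

#define CACHED_BASE   UINT32_C(0x80000000) /* start of the cached window */
#define UNCACHED_BASE UINT32_C(0xA0000000) /* start of the uncached window */
#define WINDOW_SIZE   UINT32_C(0x20000000) /* 512 MB per window */

/* Non-zero when va lies in the cached statically mapped window. */
int in_cached_window(uint32_t va)
{
    return va >= CACHED_BASE && va < CACHED_BASE + WINDOW_SIZE;
}

/* Non-zero when va lies in the uncached statically mapped window. */
int in_uncached_window(uint32_t va)
{
    return va >= UNCACHED_BASE && va < UNCACHED_BASE + WINDOW_SIZE;
}

/* Uncached alias of a cached static address; other addresses are returned
   unchanged. */
uint32_t to_uncached(uint32_t va)
{
    return in_cached_window(va) ? va + WINDOW_SIZE : va;
}

/* Cached alias of an uncached static address; other addresses are returned
   unchanged. */
uint32_t to_cached(uint32_t va)
{
    return in_uncached_window(va) ? va - WINDOW_SIZE : va;
}
```

Driver code typically wants the uncached view for device registers, so that reads and writes are not satisfied from the CPU cache.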
Process Model
- Virtual address slots
  - 32 MB (2^25 bytes) of virtual address space per slot
  - Slot space is shared by the process, its DLLs, and virtual allocations
  - Fast context switching between process slots (swap page tables)
  - The current thread executes in slot 0
- Management granularity
  - Regions of virtual address space are allocated with 64 KB granularity
  - Pages are committed to physical memory with 4 KB granularity
- Allocation order
  - DLL allocations start at a high address and grow down
  - Process allocations start at a low address and grow up
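The two granularities mean that a small request still consumes a whole 64 KB region of address space, but only as many 4 KB pages of physical memory as it touches. A minimal sketch of the rounding, with hypothetical helper names and assuming the request is small enough that the addition does not overflow 32 bits:

```c
#include <stdint.h>

#define REGION_GRANULARITY UINT32_C(0x10000) /* 64 KB: reservation unit */
#define COMMIT_PAGE_SIZE   UINT32_C(0x1000)  /* 4 KB: commit unit */

/* Rounds n up to the next multiple of g, where g is a power of two.
   Assumes n + g - 1 does not overflow. */
uint32_t round_up_pow2(uint32_t n, uint32_t g)
{
    return (n + g - 1u) & ~(g - 1u);
}

/* Virtual address space consumed when `bytes` are reserved. */
uint32_t reserved_bytes(uint32_t bytes)
{
    return round_up_pow2(bytes, REGION_GRANULARITY);
}

/* Physical memory consumed when `bytes` are committed. */
uint32_t committed_bytes(uint32_t bytes)
{
    return round_up_pow2(bytes, COMMIT_PAGE_SIZE);
}
```

Reserving 5000 bytes therefore takes 65536 bytes of a process's 32 MB slot, while committing them takes 8192 bytes of RAM. Many small separate reservations exhaust the slot's address space long before they exhaust physical memory.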
Lesson: Process Memory
[Diagram: the user-space slot map with the 32 MB process space of Slot 0 expanded (0000 0000 to 01FF FFFF, with 0001 0000 marked). Slot 0: current process. Slot 1: XIP ROM DLLs. Slots 2, 3, 4, 5 … 30, 31, 32 start at 0400 0000 and step by 0200 0000 (0600 0000, 0800 0000, 0A00 0000, 0C00 0000 … 3C00 0000, 3E00 0000, 4000 0000). The range from 4200 0000 up to Slot 63 (7E00 0000 to 8000 0000) holds resource DLLs and free virtual space. Slot 97 spans C200 0000 to C400 0000. Processes shown: nk.exe, filesys.exe, shell.exe, device.exe, gwes.exe.]
Modules
- Modules
  - Standard Win32 Portable Executable file format
  - Standard Win32 tools (symbols, digital signature, etc.)
- Dynamic Link Library (DLL)
  - Loadable library with imports/exports to processes
  - Same physical copy executed with different instance data
  - Activate/deactivate control by the owner process
- On-demand paging
  - Commits/copies pages from storage into RAM for execution
  - Execute-In-Place (XIP) of non-compressed ROM-based modules
  - Decompresses ROM-based modules into RAM on demand
System API Calling Mechanism
- Coredll.dll
  - Located at the top of every process slot
  - Fields system API calls from user-mode threads
  - Implements some system API calls directly
  - Causes an exception (trap) to pass on a system API request
- Kernel
  - Catches system API request exception traps
  - Dispatches to a system EXE to fulfill the request
  - The user-mode thread is migrated to the system EXE's process space
  - Access rights of the user-mode thread inherit the current process rights
System API Calling Mechanism
[Diagram: call flow. A user-mode thread in App.exe makes a function call into the Win32 API thunks in Coredll.dll; a kernel trap enters the Win32 API dispatch in Nk.exe; the kernel jumps to the function code in the system EXE; the call returns through the kernel call return.]
Heap
- Usage
  - Memory allocation with per-byte granularity
  - Processor-independent (hides memory paging)
  - Automatically allocates memory and commits pages on demand
  - Non-movable (pages reclaimed when the entire heap is free)
  - Managed via a singly linked list of heap blocks using a first-fit algorithm
  - Works best with allocations of same-sized objects
- Local heap
  - Reserves 192 KB of virtual memory at process load time
  - Commits physical pages upon allocation by the process
- Private heap
  - Reserves an initial fixed-size or expandable (disjoint) heap space
  - Serialization for mutual exclusion of multiple threads
- Shared heaps
  - Writable by the owner process and read-only to other processes
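The first-fit strategy named above can be illustrated with a toy allocator over a fixed array. This is a teaching sketch under simplifying assumptions, not the Windows CE heap: all names (`ff_init`, `ff_alloc`, `ff_free`, `block`) are hypothetical, sizes are rounded to whole header-sized units, there is no page commit or serialization, and freed blocks are merged only with the free blocks that follow them.

```c
#include <stddef.h>

#define ARENA_UNITS 64

/* Header of one heap block; the payload follows it directly. */
typedef struct block {
    struct block *next; /* next block in address order (singly linked) */
    size_t units;       /* payload size, in units of sizeof(block) */
    int used;           /* non-zero while allocated */
} block;

static block arena[ARENA_UNITS];
static block *head = NULL;

/* Turns the whole arena into one free block. */
void ff_init(void)
{
    head = &arena[0];
    head->next = NULL;
    head->units = ARENA_UNITS - 1;
    head->used = 0;
}

/* Walks the list from the start and takes the FIRST free block that is
   large enough, splitting off the remainder when it can hold a header plus
   at least one unit. Returns NULL when nothing fits. */
void *ff_alloc(size_t bytes)
{
    size_t need;

    if (bytes > (ARENA_UNITS - 1) * sizeof(block))
        return NULL;
    need = (bytes + sizeof(block) - 1) / sizeof(block);
    if (need == 0)
        need = 1;

    for (block *b = head; b != NULL; b = b->next) {
        if (b->used || b->units < need)
            continue;
        if (b->units > need + 1) {
            block *rest = b + 1 + need;
            rest->next = b->next;
            rest->units = b->units - need - 1;
            rest->used = 0;
            b->next = rest;
            b->units = need;
        }
        b->used = 1;
        return b + 1; /* payload starts right after the header */
    }
    return NULL;
}

/* Marks the block free and absorbs any free blocks directly after it. */
void ff_free(void *p)
{
    block *b;

    if (p == NULL)
        return;
    b = (block *)p - 1;
    b->used = 0;
    while (b->next != NULL && !b->next->used) {
        b->units += 1 + b->next->units;
        b->next = b->next->next;
    }
}
```

The sketch also hints at why same-sized objects work best: a freed block is an exact fit for the next request, so the list does not fragment into holes that are too small to reuse.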
Stack
- Usage
  - Stores temporary data referenced within a function
  - Stores the state of processor registers during exception handling
  - Default stack allocated for each thread at creation
  - Committed on demand
- Sizing
  - CPU-dependent default stack size
  - Default thread stack size can be overridden with the /STACK linker switch
  - All threads of a process have the same stack size by default
- Call stack
  - Stack checking detects buffer overruns with the /GS compiler switch
  - GetThreadCallStack - retrieves the call stack frames of a thread