Windows Kernel Internals II Processes, Threads, VirtualMemory

38
Windows Kernel Internals II Processes, Threads, VirtualMemory University of Tokyo – July 2004* Dave Probert, Ph.D. Advanced Operating Systems Group Windows Core Operating Systems Division Microsoft Corporation © Microsoft Corporation 2004 1

Transcript of Windows Kernel Internals II Processes, Threads, VirtualMemory

Page 1: Windows Kernel Internals II Processes, Threads, VirtualMemory

Windows Kernel Internals IIProcesses, Threads,

VirtualMemoryUniversity of Tokyo – July 2004*

Dave Probert, Ph.D.Advanced Operating Systems Group

Windows Core Operating Systems DivisionMicrosoft Corporation

© Microsoft Corporation 2004 1

Page 2: Windows Kernel Internals II Processes, Threads, VirtualMemory

© Microsoft Corporation 2004 2

Windows Architecture

User-mode

Kernel-mode Trap interface / LPC

ntdll / run-time library

Win32 GUIProcs & threads

Kernel run-time / Hardware Adaptation Layer

Virtual memoryIO ManagerSecurity refmon

Cache mgr

File filtersFile systemsVolume mgrsDevice stacks

Scheduler

Kernel32 User32 / GDI

DLLs

Applications

System Services

Object Manager / Configuration Management

FS run-time

exec synchr

Subsystemservers

Login/GINA

Critical services

Page 3: Windows Kernel Internals II Processes, Threads, VirtualMemory

ProcessContainer for an address space and threadsAssociated User-mode Process Environment Block (PEB)Primary Access TokenQuota, Debug port, Handle Table etcUnique process IDQueued to the Job, global process list and Session listMM structures like the WorkingSet, VAD tree, AWE etc

© Microsoft Corporation 2004 3

Page 4: Windows Kernel Internals II Processes, Threads, VirtualMemory

ThreadFundamental schedulable entity in the systemRepresented by ETHREAD that includes a KTHREADQueued to the process (both E and K thread)IRP listImpersonation Access TokenUnique thread IDAssociated User-mode Thread Environment Block (TEB)User-mode stackKernel-mode stackProcessor Control Block (in KTHREAD) for cpu state when

not running

© Microsoft Corporation 2004 4

Page 5: Windows Kernel Internals II Processes, Threads, VirtualMemory

JobContainer for multiple processesQueued to global job list, processes and jobs in the job setSecurity token filters and job tokenCompletion portsCounters, limits etc

© Microsoft Corporation 2004 5

Page 6: Windows Kernel Internals II Processes, Threads, VirtualMemory

Process/Thread structure

ObjectManager

Any HandleTable

ProcessObject

Process’Handle Table

VirtualAddress

Descriptors

Thread

Thread

Thread

Thread

Thread

Thread

Files

Events

Devices

Drivers

© Microsoft Corporation 2004 6

Page 7: Windows Kernel Internals II Processes, Threads, VirtualMemory

KPROCESS fieldsDISPATCHER_HEADER HeaderULPTR DirectoryTableBase[2]KGDTENTRY LdtDescriptorKIDTENTRY Int21DescriptorUSHORT IopmOffsetUCHAR Ioplvolatile KAFFINITY ActiveProcessorsULONG KernelTimeULONG UserTimeLIST_ENTRY ReadyListHeadSINGLE_LIST_ENTRY SwapListEntryLIST_ENTRY ThreadListHeadKSPIN_LOCK ProcessLock

KAFFINITY AffinityUSHORT StackCountSCHAR BasePrioritySCHAR ThreadQuantumBOOLEAN AutoAlignmentUCHAR StateBOOLEAN DisableBoostUCHAR PowerStateBOOLEAN DisableQuantumUCHAR IdealNode

© Microsoft Corporation 2004 7

Page 8: Windows Kernel Internals II Processes, Threads, VirtualMemory

EPROCESS fieldsKPROCESS PcbEX_PUSH_LOCK ProcessLockLARGE_INTEGER CreateTimeLARGE_INTEGER ExitTimeEX_RUNDOWN_REF

RundownProtectHANDLE UniqueProcessIdLIST_ENTRY ActiveProcessLinksQuota FeldsSIZE_T PeakVirtualSizeSIZE_T VirtualSizeLIST_ENTRY SessionProcessLinksPVOID DebugPortPVOID ExceptionPortPHANDLE_TABLE ObjectTableEX_FAST_REF TokenPFN_NUMBER WorkingSetPage

KGUARDED_MUTEX AddressCreationLock

KSPIN_LOCK HyperSpaceLockstruct _ETHREAD *ForkInProgressULONG_PTR HardwareTrigger;PMM_AVL_TABLE

PhysicalVadRootPVOID CloneRootPFN_NUMBER

NumberOfPrivatePagesPFN_NUMBER

NumberOfLockedPagesPVOID Win32Processstruct _EJOB *JobPVOID SectionObjectPVOID SectionBaseAddressPEPROCESS_QUOTA_BLOCK

QuotaBlock© Microsoft Corporation 2004 8

Page 9: Windows Kernel Internals II Processes, Threads, VirtualMemory

EPROCESS fieldsPPAGEFAULT_HISTORY

WorkingSetWatchHANDLE Win32WindowStationHANDLE InheritedFromUniqueProcessIdPVOID LdtInformationPVOID VadFreeHintPVOID VdmObjectsPVOID DeviceMapPVOID SessionUCHAR ImageFileName[ 16 ]LIST_ENTRY JobLinksPVOID LockedPagesListLIST_ENTRY ThreadListHeadULONG ActiveThreadsPPEB PebIO Counters

PVOID AweInfoMMSUPPORT VmProcess FlagsNTSTATUS ExitStatusUCHAR PriorityClassMM_AVL_TABLE VadRoot

© Microsoft Corporation 2004 9

Page 10: Windows Kernel Internals II Processes, Threads, VirtualMemory

KTHREAD fieldsDISPATCHER_HEADER HeaderLIST_ENTRY MutantListHeadPVOID InitialStack, StackLimitPVOID KernelStackKSPIN_LOCK ThreadLockULONG ContextSwitchesvolatile UCHAR StateKIRQL WaitIrqlKPROC_MODE WaitModePVOID TebKAPC_STATE ApcStateKSPIN_LOCK ApcQueueLockLONG_PTR WaitStatusPRKWAIT_BLOCK WaitBlockListBOOLEAN Alertable, WaitNextUCHAR WaitReasonSCHAR Priority

UCHAR EnableStackSwapvolatile UCHAR SwapBusyLIST_ENTRY WaitListEntryNEXT SwapListEntryPRKQUEUE QueueULONG WaitTimeSHORT KernelApcDisableSHORT SpecialApcDisableKTIMER TimerKWAIT_BLOCK WaitBlock[N+1]LIST_ENTRY QueueListEntryUCHAR ApcStateIndexBOOLEAN ApcQueueableBOOLEAN PreemptedBOOLEAN ProcessReadyQueueBOOLEAN KernelStackResident

© Microsoft Corporation 2004 10

Page 11: Windows Kernel Internals II Processes, Threads, VirtualMemory

KTHREAD fields cont.UCHAR IdealProcessorvolatile UCHAR NextProcessorSCHAR BasePrioritySCHAR PriorityDecrementSCHAR QuantumBOOLEAN SystemAffinityActiveCCHAR PreviousModeUCHAR ResourceIndexUCHAR DisableBoostKAFFINITY UserAffinityPKPROCESS ProcessKAFFINITY AffinityPVOID ServiceTablePKAPC_STATE ApcStatePtr[2]KAPC_STATE SavedApcStatePVOID CallbackStackPVOID Win32Thread

PKTRAP_FRAME TrapFrameULONG KernelTime, UserTimePVOID StackBaseKAPC SuspendApcKSEMAPHORE SuspendSemaPVOID TlsArrayLIST_ENTRY ThreadListEntryUCHAR LargeStackUCHAR PowerStateUCHAR IoplCCHAR FreezeCnt, SuspendCntUCHAR UserIdealProcvolatile UCHAR DeferredProcUCHAR AdjustReasonSCHAR AdjustIncrement

© Microsoft Corporation 2004 11

Page 12: Windows Kernel Internals II Processes, Threads, VirtualMemory

ETHREAD fieldsKTHREAD tcbTimestampsLPC locks and linksCLIENT_ID CidImpersonationInfoIrpListpProcessStartAddressWin32StartAddressThreadListEntryRundownProtectThreadPushLock

© Microsoft Corporation 2004 12

Page 13: Windows Kernel Internals II Processes, Threads, VirtualMemory

Process Synchronization

ProcessLock – Protects thread list, tokenRundownProtect – Cross process address space,

image section and handle table referencesToken, Prefetch – Uses fast referencingToken, Job – Torn down at last process

dereference without synchronization

© Microsoft Corporation 2004 13

Page 14: Windows Kernel Internals II Processes, Threads, VirtualMemory

© Microsoft Corporation 2004 14

Thread scheduling

states

Page 15: Windows Kernel Internals II Processes, Threads, VirtualMemory

Thread scheduling states

© Microsoft Corporation 2004 15

• Main quasi-states:– Ready – able to run– Running – current thread on a processor– Waiting – waiting an event

• For scalability Ready is three real states:– DeferredReady – queued on any processor– Standby – will be imminently start Running– Ready – queue on target processor by priority

• Goal is granular locking of thread priority queues

• Red states related to swapped stacks and processes

Page 16: Windows Kernel Internals II Processes, Threads, VirtualMemory

Process LifetimeCreated as an empty shellAddress space created with only ntdll and the main image

unless forkedHandle table created empty or populated via duplication

from parentProcess is partially destroyed on last thread exitProcess totally destroyed on last dereference

© Microsoft Corporation 2004 16

Page 17: Windows Kernel Internals II Processes, Threads, VirtualMemory

Thread LifetimeCreated within a process with a CONTEXT recordStarts running in the kernel but has a trap frame to return to

use modeKernel queues user APC to do ntdll initializationTerminated by a thread calling NtTerminateThread/Process

© Microsoft Corporation 2004 17

Page 18: Windows Kernel Internals II Processes, Threads, VirtualMemory

Summary: Native NT Process APIs

NtCreateProcess()NtTerminateProcess()NtQueryInformationProcess()NtSetInformationProcess()NtGetNextProcess()NtGetNextThread()NtSuspendProcess()NtResumeProcess()

NtCreateThread()NtTerminateThread()NtSuspendThread()NtResumeThread()NtGetContextThread()NtSetContextThread()NtQueryInformationThread()NtSetInformationThread()NtAlertThread()NtQueueApcThread()

© Microsoft Corporation 2004 18

Page 19: Windows Kernel Internals II Processes, Threads, VirtualMemory

Virtual Memory ManagerFeatures

Provides 4 GB flat virtual address space (IA32)Manages process address spaceHandles pagefaultsManages process working setsManages physical memoryProvides memory-mapped filesAllows pages shared between processesFacilities for I/O subsystem and device driversSupports file system cache manager

© Microsoft Corporation 2004 19

Page 20: Windows Kernel Internals II Processes, Threads, VirtualMemory

Virtual Memory ManagerNT Internal APIs

© Microsoft Corporation 2004 20

NtCreatePagingFileNtAllocateVirtualMemory (Proc, Addr, Size, Type,

Prot)Process: handle to a processProtection: NOACCESS, EXECUTE, READONLY,

READWRITE, NOCACHEFlags: COMMIT, RESERVE, PHYSICAL, TOP_DOWN,

RESET, LARGE_PAGES, WRITE_WATCH

NtFreeVirtualMemory(Process, Address, Size, FreeType)FreeType: DECOMMIT or RELEASE

NtQueryVirtualMemoryNtProtectVirtualMemory

Page 21: Windows Kernel Internals II Processes, Threads, VirtualMemory

Virtual Memory ManagerNT Internal APIs

Pagefault

NtLockVirtualMemory, NtUnlockVirtualMemory– locks a region of pages within the working set list– requires PROCESS_VM_OPERATION on target

process and SeLockMemoryPrivilegeNtReadVirtualMemory, NtWriteVirtualMemory (

Proc, Addr, Buffer, Size)NtFlushVirtualMemory

© Microsoft Corporation 2004 21

Page 22: Windows Kernel Internals II Processes, Threads, VirtualMemory

Virtual Memory ManagerNT Internal APIs

NtCreateSection– creates a section but does not map it

NtOpenSection– opens an existing section

NtQuerySection– query attributes for section

NtExtendSectionNtMapViewOfSection (Sect, Proc, Addr, Size, …)NtUnmapViewOfSection

© Microsoft Corporation 2004 22

Page 23: Windows Kernel Internals II Processes, Threads, VirtualMemory

Virtual Memory ManagerNT Internal APIs

© Microsoft Corporation 2004 23

APIs to support AWE (Address Windowing Extensions)– Private memory only– Map only in current process– Requires LOCK_VM privilege

NtAllocateUserPhysicalPages (Proc, NPages, &PFNs[])NtMapUserPhysicalPages (Addr, NPages, PFNs[])NtMapUserPhysicalPagesScatterNtFreeUserPhysicalPages (Proc, &NPages, PFNs[])

NtResetWriteWatchNtGetWriteWatch

Read out dirty bits for a section of memory since last reset

Page 24: Windows Kernel Internals II Processes, Threads, VirtualMemory

Allocating kernel memory (pool)• Tightest x86 system resource is KVA

Kernel Virtual Address space• Pool allocates in small chunks:

< 4KB: 8B granulariy>= 4KB: page granularity

• Paged and Non-paged poolPaged pool backed by pagefile

• Special pool used to find corruptors• Lots of support for debugging/diagnosis

© Microsoft Corporation 2004 24

Page 25: Windows Kernel Internals II Processes, Threads, VirtualMemory

8000000080000000

© Microsoft Corporation 2004 25

System code, initial non-paged poolSession space (win32k.sys)

Sysptes overflow, cache overflowPage directory self-map and page tables

Hyperspace (e.g. working set list)

System cachePaged pool

Reusable system VA (sysptes)Non-paged pool expansionCrash dump information

HAL usage

A0000000A0000000

A4000000A4000000

C0000000C0000000

x86C0400000C0400000

C0800000C0800000 Unused – no accessSystem working set list

C1000000C1000000

C0C00000C0C00000

E1000000E1000000

E8000000E8000000

FFBE0000FFBE0000

FFC00000FFC00000

Page 26: Windows Kernel Internals II Processes, Threads, VirtualMemory

Valid x86 Hardware PTEsReservedGlobalDirtyAccessedCache disabledWrite throughOwnerWrite

Pageframe 1R R R G WtR D A Cd O W05 4 17 239 631 12 11 10 8

© Microsoft Corporation 2004 26

Page 27: Windows Kernel Internals II Processes, Threads, VirtualMemory

Virtual Address Translation

0000 0000 0000 0000 0000 0000 0000 0000

CR3

PD PT page DATADATA

1024 PDEs

1024 PTEs

4096 bytes

© Microsoft Corporation 2004 27

Page 28: Windows Kernel Internals II Processes, Threads, VirtualMemory

Self-mapping page tables• Page Table Entries (PTEs) and Page Directory Entries

(PDEs) contain Physical Frame Numbers (PFNs)– But Kernel runs with Virtual Addresses

• To access PDE/PTE from kernel use the self-map for the current process:PageDirectory[0x300] uses PageDirectory as

PageTable– GetPdeAddress(va): 0xc0300000[va>>20]

– GetPteAddress(va): 0xc0000000[va>>10]

• PDE/PTE formats are compatible!• Access another process VA via thread ‘attach’

© Microsoft Corporation 2004 28

Page 29: Windows Kernel Internals II Processes, Threads, VirtualMemory

Self-mapping page tablesVirtual Access to PageDirectory[0x300]

0000 0000 0000 0000 0000 0000 0000 0000

CR3

PD

1100 0000 0011 0000 0000 1100 0000 0000

CR3

PD

© Microsoft Corporation 2004 29

PTEPTE

Phys: PD[0xc0300000>>22] = PD

Virt: *((0xc0300c00) == PD

0x300

Page 30: Windows Kernel Internals II Processes, Threads, VirtualMemory

Self-mapping page tablesVirtual Access to PTE for va 0xe4321000

0000 0000 0000 0000 0000 0000 0000 0000

CR3

PD

1100 0000 0011 1001 0000 1100 1000 0100

CR3

PD PT

© Microsoft Corporation 2004 30

0x300

GetPteAddress:0xe4321000 => 0xc0390c84

0x321

PTEPTE0x390

Page 31: Windows Kernel Internals II Processes, Threads, VirtualMemory

x86 Invalid PTEs

© Microsoft Corporation 2004 31

Page filePage file offset Protection PFN 0

31 12 11 10 9 5 4 1 0

0

TransitionPrototype

TransitionPage file offset Protection HW ctrl 0

31 12 11 10 9 5 4 1 0

1

TransitionPrototype

Cache disableWrite through

OwnerWrite

Page 32: Windows Kernel Internals II Processes, Threads, VirtualMemory

x86 Invalid PTEs

Demand zero: Page file PTE with zero offset and PFN

Unknown: PTE is completely zero or Page Table doesn’t exist yet. Examine VADs.

Pointer to Prototype PTE

pPte bits 7-27 pPte bits 0-6 0014578931 12 11 10

© Microsoft Corporation 2004 32

Page 33: Windows Kernel Internals II Processes, Threads, VirtualMemory

Prototype PTEs

• Kept in array in the segment structure associated with section objects

• Six PTE states:– Active/valid– Transition– Modified-no-write– Demand zero– Page file– Mapped file

© Microsoft Corporation 2004 33

Page 34: Windows Kernel Internals II Processes, Threads, VirtualMemory

Shared Memory Data Structures

© Microsoft Corporation 2004 34

Page 35: Windows Kernel Internals II Processes, Threads, VirtualMemory

© Microsoft Corporation 2004 35

Physical Memory Management

Process/SystemWorking Set

ModifiedList

StandbyList

FreeList

ZeroList

ModifiedPage-writer

MM LowMemory

ZeroThread

DeletePage

SoftFault

TrimDirty

TrimClean

SoftFault

Hardfault(DISK)

Zerofault(FILL)

Physical Page State Changes

Page 36: Windows Kernel Internals II Processes, Threads, VirtualMemory

Paging OverviewWorking Sets: list of valid pages for each process

(and the kernel)Pages ‘trimmed’ from working set on lists

Standby list: pages backed by diskModified list: dirty pages to push to diskFree list: pages not associated with diskZero list: supply of demand-zero pages

Modify/standby pages can be faulted back into a working set w/o disk activity (soft fault)

Background system threads trim working sets, write modified pages and produce zero pages based on memory state and config parameters

© Microsoft Corporation 2004 36

Page 37: Windows Kernel Internals II Processes, Threads, VirtualMemory

Managing Working SetsAging pages: Increment age counts for pages

which haven't been accessedEstimate unused pages: count in working set and

keep a global count of estimateWhen getting tight on memory: replace rather

than add pages when a fault occurs in a working set with significant unused pages

When memory is tight: reduce (trim) working sets which are above their maximum

Balance Set Manager: periodically runs Working Set Trimmer, also swaps out kernel stacks of long-waiting threads

© Microsoft Corporation 2004 37

Page 38: Windows Kernel Internals II Processes, Threads, VirtualMemory

Discussion

© Microsoft Corporation 2004 38