Breaking Kernel Address Space Layout Randomization (KASLR) with Intel TSX

Posted on 16-Apr-2017

DrK: Breaking Kernel Address Space Layout Randomization with Intel TSX

Yeongjin Jang, Sangho Lee, and Taesoo Kim
Georgia Institute of Technology, August 3, 2016

Outline
• KASLR Background
• TLB Side Channel Attack on KASLR
• Attacking TLB Side Channel with Intel TSX
• Attacking various OSes
• Root Cause Analysis
• Discussions
• Conclusion

Kernel Address Space Layout Randomization (KASLR)
• A statistical mitigation for memory corruption exploits
  • Randomizes the address layout at each boot
  • Efficient (<5% overhead)
• Attackers must guess where code/data are located to build an exploit.
  • In Windows, the chance of a successful guess is 1/8192.

Example: Linux
• To escalate privilege to root through a kernel exploit, attackers want to call commit_creds(prepare_kernel_cred(0)).

Example: Linux
• KASLR changes kernel symbol addresses at every boot.
• Kernel symbols are hidden from non-root users.
[Figure: kernel symbol addresses on the 1st boot vs. the 2nd boot]

Example: tpwn - OS X 10.10.5 Kernel Privilege Escalation Vulnerability
• [CVE-2015-5864] IOAudioFamily allows a local user to obtain sensitive kernel memory-layout information via unspecified vectors.

Bypassing KASLR is required…

KASLR Makes Attacks Harder
• KASLR raises the bar for exploits
  • An attacker must also find an information leak vulnerability
• Both attackers and defenders aim to detect info leak vulnerabilities.

Without KASLR: Pr[∃ memory corruption vuln]

With KASLR: Pr[∃ information leak] × Pr[∃ memory corruption vuln]

Popular OSes Adopted KASLR

Is there any other way than an info leak?
• Practical Timing Side Channel Attacks Against Kernel Space ASLR (Hund et al., Oakland 2013)
  • A hardware-level side channel attack against KASLR
  • No information leak vulnerability in the OS is required

TLB Timing Side Channel
• If a kernel address is accessed from user space:
  • Mapped address: access violation, page fault
  • Unmapped address: invalid address, page fault

TLB Timing Side Channel
• If an unmapped kernel address is accessed:
  1. Try to get the page table entry through a page table walk
  2. No page table entry is found; generate a page fault!

TLB Timing Side Channel
• If a mapped kernel address is accessed:
  1. Try to get the page table entry through a page table walk
  2. Cache the entry in the TLB
  3. Check the page privilege level (ring 3 is less privileged than ring 0); generate a page fault!

TLB Timing Side Channel
[Figure: a virtual address is looked up in the TLB, with separate hit and miss paths]
• Mapped address returns quicker!
• Unmapped address takes ~40 cycles more for the page table walk

TLB Timing Side Channel
• Measuring the time in an exception handler
  1. User code triggers a page fault
  2. CPU generates the page fault
  3. OS handles the page fault
  4. OS calls the exception handler

TLB Timing Side Channel
• Result: a TLB hit took less than 4050 cycles,
• while a TLB miss took more than that…
• Limitation: too noisy
  • <1% time difference (~40 within 4000 cycles)
  • OS exception handling is too slow
• Is there any better way?

A More Practical TLB Side Channel Attack on KASLR
• DrK attack: we present a very practical side channel attack on KASLR
  • De-randomizing Kernel ASLR (this is where DrK comes from)
• Exploits Intel TSX for OS-free exception fallback
  • Accurate: 99%-100%
  • Fast: <1 second
  • OS independent: Linux, Windows, OS X
  • Stealthy: no OS execution path
  • Cloud: tested in Amazon EC2

Starting From a PoC Example in the Wild

Rafal Wojtczuk, https://labs.bromium.com/2014/10/27/tsx-improves-timing-attacks-against-kaslr/

Less noisy

TSX Gives Better Precision on Timing Attack
• Access to a mapped address in TSX: 172 clk
• Access to an unmapped address in TSX: 200 clk
  • 28 clk (>15%) timing difference
• Access to a mapped address in __try: 2172 clk
• Access to an unmapped address in __try: 2192 clk
  • <1% timing difference

• Why?
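As a quick check on the percentages quoted on this slide, the sketch below recomputes the relative gap from the four cycle counts. The function name `relative_gap` is made up for this example, and measuring the gap relative to the mapped-access time is an assumption about how the slide's percentages were derived.

```python
def relative_gap(mapped_cycles: int, unmapped_cycles: int) -> float:
    """Timing gap as a fraction of the mapped-access time (assumed baseline)."""
    return (unmapped_cycles - mapped_cycles) / mapped_cycles

# Cycle counts quoted on the slide.
tsx_gap = relative_gap(172, 200)      # access inside a TSX region
seh_gap = relative_gap(2172, 2192)    # access inside __try/__except

print(f"TSX: {tsx_gap:.1%}")   # TSX: 16.3%
print(f"SEH: {seh_gap:.1%}")   # SEH: 0.9%
```

The absolute gap is similar in both cases (28 vs. 20 clk); what changes is how much handler overhead it is buried under.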

Transactional Synchronization Extensions (Intel TSX)
• Traditional lock
  1. Block until the lock is acquired
  2. Atomic region (100% success)
  3. Release the lock (finishes the atomic region)

Transactional Synchronization Extensions (Intel TSX)
• TSX: a relaxed but faster way of handling synchronization
  1. Do not block, do not use a lock
  2. Try the atomic operation (can fail)
  3. If it failed, handle the failure with an abort handler (retry, fall back to a traditional lock, etc.)

Transaction Aborts If Any Conflict Exists
• Conditions of conflict
  • Thread races
  • Cache eviction
  • Interrupt
    • Context switch (timer)
    • Syscalls
  • Exceptions
    • Page fault
    • General protection
    • Debugging
    • …
[Code figure annotation: run if transaction aborts]

Abort Handler Suppresses Exceptions
• Abort handler of TSX
  • Suppresses all synchronous exceptions
    • E.g., page fault
  • Does not notify the OS
    • Just jumps into abort_handler()
• No exception delivery to the OS! (returns quicker, so less noisy than __try/__except)
[Code figure annotation: run if transaction aborts]

Exploiting TSX as an Exception Handler
• How to use TSX as an exception handler?
  1. Timestamp at the beginning
  2. Access kernel memory within the TSX region (always aborts)
  3. Measure timing at the abort handler
• No OS handling path is involved

Measuring Timing Side Channel
• Access mapped/unmapped kernel addresses
  • Attempt READ access within the TSX region
    • mov [rax], 1

Measuring Timing Side Channel
• Access executable/non-executable addresses
  • Attempt JUMP access within the TSX region
    • jmp rax

Demo 1: Timing Difference on M/U and X/NX

Measuring Timing Side Channel
• Mapped/unmapped kernel addresses
  • Ran 1000 iterations of the probing, minimum clock over 10 runs
• Much faster than an OS exception handler!
  • 209 versus 4000 cycles
  • Significant time difference: ~15%

Processor               Mapped Page   Unmapped Page
i7-6700K (4.0 GHz)      209           240 (+31)
i5-6300HQ (2.3 GHz)     164           188 (+24)
i7-5600U (2.6 GHz)      149           173 (+24)
E3-1271v3 (3.6 GHz)     177           195 (+18)
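To illustrate how such a gap can be turned into a mapped/unmapped decision, here is a minimal sketch that classifies a measured minimum cycle count against a per-CPU threshold. The midpoint threshold and the names `MEASURED`, `threshold`, and `classify` are assumptions made for this example; the slides do not state how the cut-off was chosen.

```python
# Minimum cycle counts from the table: cpu -> (mapped, unmapped).
MEASURED = {
    "i7-6700K": (209, 240),
    "i5-6300HQ": (164, 188),
    "i7-5600U": (149, 173),
    "E3-1271v3": (177, 195),
}

def threshold(cpu: str) -> float:
    """Midpoint between the mapped and unmapped timings (an assumed rule)."""
    mapped, unmapped = MEASURED[cpu]
    return (mapped + unmapped) / 2

def classify(cpu: str, min_cycles: int) -> str:
    """Label a probe as mapped ('M') or unmapped ('U')."""
    return "M" if min_cycles < threshold(cpu) else "U"

print(threshold("i7-6700K"))       # 224.5
print(classify("i7-6700K", 211))   # M
print(classify("i7-6700K", 238))   # U
```

Because the timings differ per processor, a threshold of this kind would have to be calibrated on the machine being probed.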

Measuring Timing Side Channel
• Executable/non-executable kernel addresses
  • Ran 1000 iterations of the probing, minimum clock over 10 runs

Processor               Executable Page   Non-exec Page
i7-6700K (4.0 GHz)      181               226 (+45)
i5-6300HQ (2.3 GHz)     142               178 (+36)
i7-5600U (2.6 GHz)      134               164 (+30)
E3-1271v3 (3.6 GHz)     159               189 (+30)

Clear Timing Channel

Clear separation between different mapping statuses!

TSX vs SEH

Clear separation between different mapping statuses!

Attack on Various OSes
• Demo targets
  • Full attack
    • Linux, Windows, and Linux in Amazon EC2
    • Probe each page of the kernel/drivers (>6,000 in Linux, >34,000 in Windows)
    • Compare each probed permission to the page table to measure accuracy
  • Detecting module locations
    • Based on section size (X/NX/U), detect the exact location of a kernel module
  • Finding the ASLR slide
    • OS X
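The size-signature idea can be sketched as follows: run-length encode the probed per-page permissions, then look for an executable run followed by a non-executable run whose lengths match a known module. Everything concrete here is hypothetical (the module names, their page counts, the probed string, and the helper names `runs` and `locate_modules`); it is a toy illustration of the matching step, not the authors' implementation.

```python
from itertools import groupby

def runs(perm_map: str):
    """Run-length encode a per-page permission string into (start, kind, length)."""
    out, pos = [], 0
    for kind, group in groupby(perm_map):
        length = len(list(group))
        out.append((pos, kind, length))
        pos += length
    return out

def locate_modules(perm_map: str, signatures: dict) -> dict:
    """Match each X-run followed by an N-run against known (x_pages, nx_pages) sizes."""
    by_size = {}
    for name, sig in signatures.items():
        by_size.setdefault(sig, []).append(name)
    found = {}
    r = runs(perm_map)
    for (start, kind, length), (_, next_kind, next_length) in zip(r, r[1:]):
        if kind == "X" and next_kind == "N":
            names = by_size.get((length, next_length), [])
            if len(names) == 1:   # only a unique signature identifies a module
                found[names[0]] = start
    return found

# Hypothetical module sizes in pages: name -> (executable pages, non-executable pages).
SIGNATURES = {"mod_a": (3, 2), "mod_b": (1, 4), "mod_c": (1, 4), "mod_d": (2, 1)}
# Hypothetical probe result: U = unmapped, X = executable, N = non-executable.
probed = "UU" + "XXXNN" + "U" + "XNNNN" + "UUU" + "XXN" + "U"

print(locate_modules(probed, SIGNATURES))   # {'mod_a': 2, 'mod_d': 16}
```

In this sketch, `mod_b` and `mod_c` share a signature and therefore cannot be told apart, so only modules with a unique size signature are reported.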

Attack on Linux
• OS settings
  • Kernel 4.6.0, running with Ubuntu 16.04 LTS
  • Added boot arg ‘kaslr’
  • Enabled CONFIG_X86_PTDUMP=y (just for ground truth)
• Available slots
  • Kernel: 64 slots
    • 0xffffffff80000000 - 0xffffffffc0000000 (2 MB pages)
  • Module: 1,024 slots
    • 0xffffffffc0000000 - 0xffffffffc0400000 (4 KB pages)
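The slot counts above fix how far apart the candidate base addresses are. The sketch below derives that stride from the address ranges and slot counts on the slide; the helper name `stride` is made up, and the resulting 16 MiB kernel stride is derived arithmetic rather than something the slide states.

```python
KIB, MIB = 1 << 10, 1 << 20

def stride(start: int, end: int, slots: int) -> int:
    """Bytes between adjacent candidate base addresses."""
    return (end - start) // slots

kernel_stride = stride(0xffffffff80000000, 0xffffffffc0000000, 64)
module_stride = stride(0xffffffffc0000000, 0xffffffffc0400000, 1024)

print(kernel_stride // MIB, "MiB per kernel slot")   # 16 MiB per kernel slot
print(module_stride // KIB, "KiB per module slot")   # 4 KiB per module slot

# Candidate kernel base addresses an attacker would have to consider.
kernel_bases = [0xffffffff80000000 + i * kernel_stride for i in range(64)]
print(hex(kernel_bases[1]))   # 0xffffffff81000000
```

With only 64 and 1,024 candidates, probing every slot is cheap once a single probe takes a few hundred cycles.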

Demo 2: Full Attack on Linux

Result
• Achieved 100% accuracy across 3 different CPUs
  • Took 0.45-0.67 s to probe 6,147 pages
• Detecting modules
  • From the size signature, detected 29 modules among 80 modules

Attack on Windows
• OS settings
  • Windows 10, 10.0.10586
• Available slots
  • Kernel: 8,192 slots
    • 0xfffff80000000000 - 0xfffff80400000000 (2 MB pages)
  • Drivers: 8,192 slots
    • 0xfffff80000000000 - 0xfffff80400000000 (4 KB pages, aligned to 2 MB)

Result
• 100% accuracy for the kernel (ntoskrnl.exe)
• 100% accuracy for detecting M/U for the drivers
• 99.28% accuracy for detecting X/NX for the drivers
  • Some areas in drivers are dynamically deallocated
  • Misses some ‘inactive’ pages
• Detecting modules
  • From the size signature, detected 97 drivers among 141 drivers

Attack on OS X
• OS settings
  • OS X El Capitan 10.11.4
• Available slots
  • Kernel: 256 slots
    • 0xffffff8000000000 - 0xffffff8020000000 (2 MB pages)
• Result
  • Took 31 ms to find the ASLR slide (100% accuracy over 10 runs)
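Finding the slide amounts to scanning the 256 candidate slots for the first one that probes as mapped. The sketch below shows that scan with a stand-in for the timing probe: `fake_probe`, the secret slot number, and the kernel's extent are invented for the example, and "slide" is taken here to mean the offset from the start of the range, which is an assumption.

```python
from typing import Callable, Optional

BASE = 0xffffff8000000000      # start of the range from the slide
SLOT = 2 * 1024 * 1024         # 2 MB pages
SLOTS = 256

def find_slide(is_mapped: Callable[[int], bool]) -> Optional[int]:
    """Return the offset of the first slot that probes as mapped, if any."""
    for i in range(SLOTS):
        if is_mapped(BASE + i * SLOT):
            return i * SLOT
    return None

# Stand-in for the TSX timing probe: pretend the kernel starts at slot 37
# and spans 8 slots (both values are made up for this example).
SECRET_SLOT = 37

def fake_probe(addr: int) -> bool:
    slot = (addr - BASE) // SLOT
    return SECRET_SLOT <= slot < SECRET_SLOT + 8

print(hex(find_slide(fake_probe)))   # 0x4a00000
```

At most 256 probes are needed, which is consistent with the search finishing in milliseconds.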

Attack on Amazon EC2
• OS settings
  • Kernel 4.4.0, running with Ubuntu 14.04 LTS
  • Added boot arg ‘kaslr’
  • Enabled CONFIG_X86_PTDUMP
• Available slots
  • Kernel: 64 slots
    • 0xffffffff80000000 - 0xffffffffc0000000 (2 MB pages)
  • Module: 1,024 slots
    • 0xffffffffc0000000 - 0xffffffffc0400000 (4 KB pages)

Result Summary
• Linux: 100% accuracy in around 0.5 second
• Windows: 100% for M/U in 5 sec, 99.28% for X/NX in 45 sec
• OS X: 100% for detecting the ASLR slide, in 31 ms
• Linux on Amazon EC2: 100% accuracy in 3 seconds

Timing Side Channel (M/U)
• For mapped/unmapped addresses
  • Measured performance counters (over 1,000,000 probes)
• dTLB hits on mapped pages, but not on unmapped pages.
  • The timing channel is generated by dTLB hit/miss

Perf. Counter       Mapped Page   Unmapped Page   Description
dTLB-loads          3,021,847     3,020,243
dTLB-load-misses    84            2,000,086       TLB miss on U
Observed Timing     209 (fast)    240 (slow)
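Dividing the counters by the number of probes makes the contrast easier to see. The sketch below does only that arithmetic on the table's values; the helper name `misses_per_probe` is made up for this example.

```python
PROBES = 1_000_000

# page type -> (dTLB-loads, dTLB-load-misses), as listed in the table
COUNTERS = {
    "mapped": (3_021_847, 84),
    "unmapped": (3_020_243, 2_000_086),
}

def misses_per_probe(page: str) -> float:
    """Average number of dTLB load misses caused by one probe."""
    _, misses = COUNTERS[page]
    return misses / PROBES

for page in COUNTERS:
    print(f"{page:9s} {misses_per_probe(page):.4f} dTLB misses per probe")
```

Each probe of an unmapped page costs about two dTLB misses, while probes of a mapped page almost never miss.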

Path for an Unmapped Page
• On the first access
[Figure: kernel address access → dTLB miss → page table walk (PML4 → PML3 → PML2 → PML1 → PTE) → page fault!]

Path for an Unmapped Page
• On the second access
[Figure: kernel address access → dTLB miss → page table walk (PML4 → PML3 → PML2 → PML1 → PTE) → page fault!]
• Always does a page table walk (slow)

Path for a Mapped Page
• On the first access
[Figure: kernel address access → dTLB miss → page table walk (PML4 → PML3 → PML2 → PML1 → PTE) → PTE cached in the dTLB → page fault!]

Path for a Mapped Page
• On the second access
[Figure: kernel address access → dTLB hit on the cached PTE → page fault!]
• No page table walk on the second access (fast)

Root Cause of Timing Side Channels (M/U)
• For mapped/unmapped addresses

Fast Path (Mapped), elapsed cycles: 209
1. Access a kernel address
2. dTLB hits
3. Page fault!

Slow Path (Unmapped), elapsed cycles: 240
1. Access a kernel address
2. dTLB misses
3. Walks through the page table
4. Page fault!

• Caching at the dTLB generates the timing side channel

Timing Side Channel (X/NX)
• For executable/non-executable addresses
  • Measured performance counters (over 1,000,000 probes)

Perf. Counter       Exec Page    Non-exec Page   Unmapped Page
iTLB-loads (hit)    590          1,000,247       272
iTLB-load-misses    31           12              1,000,175
Observed Timing     181 (fast)   226 (slow)      226 (slow)

• Point #1: the iTLB hits on non-exec pages, but the access is still slow (226). Why?
  • The iTLB is not the origin of the side channel.

• Point #2: the iTLB does not even hit on exec pages, while NX pages do hit the iTLB
  • The iTLB is not involved in the fast path

Intel Cache Architecture
• L1 instruction cache
  • Virtually-indexed, physically-tagged cache (requires TLB access)
  • Caches the actual opcode/data content of memory
• Decoded i-cache
  • An instruction is decoded into micro-ops (RISC-like instructions)
  • The decoded i-cache stores micro-ops
  • Virtually-indexed, virtually-tagged cache (no TLB access)
[Figure from the patent US20100138608A1, registered by Intel Corporation]

Path for an Unmapped Page
• On the second access, 226 cycles
[Figure: kernel address access → iTLB miss → page table walk (PML4 → PML3 → PML2 → PML1 → PTE) → page fault!]
• Always does a page table walk (slow)

Path for an Executable Page
• On the first access
[Figure: kernel address access → decoded i-cache miss → iTLB miss → page table walk → TLB entry and decoded instructions (uops) cached → insufficient privilege, fault!]

Path for an Executable Page
• On the second access, 181 cycles
[Figure: kernel address access → decoded i-cache hit (uops) → insufficient privilege, fault!]
• No TLB access, no page table walk (fast)

Path for a Non-executable but Mapped Page
• On the first access
[Figure: kernel address access → decoded i-cache miss → iTLB miss → page table walk → PTE cached in the iTLB → NX, page fault!]

Path for a Non-executable but Mapped Page
• On the second access, 226 cycles
[Figure: kernel address access → decoded i-cache miss → iTLB hit on the cached PTE → page fault!]
• If there were no page table walk, it should be faster than unmapped (but it is not!)

Cache Coherence and TLB
• The TLB is not a coherent cache in Intel architecture
[Figure: Core 1 TLB holds 0xff01 -> 0x0010, NX; Core 2 TLB holds 0xff01 -> 0x0010, X]
  1. Core 1 sets 0xff01 as non-executable memory
  2. Core 2 sets 0xff01 as executable memory
     No coherency: the TLB in Core 1 is not updated/invalidated
  3. Core 1 tries to execute at 0xff01 -> page fault by NX
  4. Core 1 must walk through the page table
     The page table entry is X: update the TLB, then execute!
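The four steps on this slide can be replayed with a toy model: two cores share one page table but keep private TLBs, and a cached NX entry forces a fresh walk before a fault is raised. This is a deliberately simplified sketch; the `Core` class, its methods, and the walk counter are invented for illustration and do not model real hardware in any detail.

```python
class Core:
    """One core with a private TLB in front of a shared page table (toy model)."""

    def __init__(self, page_table):
        self.page_table = page_table  # shared: virtual page -> (frame, executable)
        self.tlb = {}                 # private copy of entries, can go stale
        self.walks = 0                # number of page table walks performed

    def _walk(self, vpage):
        self.walks += 1
        self.tlb[vpage] = self.page_table[vpage]
        return self.tlb[vpage]

    def set_executable(self, vpage, executable):
        frame, _ = self.page_table[vpage]
        self.page_table[vpage] = (frame, executable)
        self.tlb[vpage] = (frame, executable)  # only the local TLB is refreshed

    def execute(self, vpage):
        entry = self.tlb.get(vpage)
        if entry is None or not entry[1]:
            # TLB miss, or a cached NX entry that may be stale: walk before faulting.
            entry = self._walk(vpage)
        return "execute" if entry[1] else "page fault (NX)"


page_table = {0xff01: (0x0010, False)}
core1, core2 = Core(page_table), Core(page_table)

core1.set_executable(0xff01, False)   # step 1: Core 1 caches the NX entry
core2.set_executable(0xff01, True)    # step 2: Core 1's TLB is now stale
stale = core1.tlb[0xff01]
result = core1.execute(0xff01)        # steps 3-4: the NX entry forces a walk
print(stale, result, core1.walks)     # (16, False) execute 1
```

In this model a page that really is NX pays for a page table walk on every execution attempt, even though its entry sits in the TLB, which is the behavior the slides use to explain the slow NX path.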

Path for a Non-executable but Mapped Page
• On the second access, 226 cycles
[Figure: kernel address access → decoded i-cache miss → iTLB hit on the cached PTE (NX, cannot execute!) → page table walk → NX, page fault!]

Root Cause of Timing Side Channel (X/NX)
• For executable/non-executable addresses

Fast Path (X), cycles: 181
1. Jmp into the kernel addr
2. Decoded i-cache hits
3. Page fault!

Slow Path (NX), cycles: 226
1. Jmp into the kernel addr
2. iTLB hit
3. Protection check fails, page table walk
4. Page fault!

Slow Path (U), cycles: 226
1. Jmp into the kernel addr
2. iTLB miss
3. Walks through the page table
4. Page fault!

• The decoded i-cache generates the timing side channel

Analysis Summary
• dTLB caching makes the fault faster on a mapped address
  • Mapped: PTE cached in the dTLB
  • Unmapped: PTE is not cached in the dTLB, requires a page table walk
• The decoded i-cache makes the fault faster on an executable address
  • Executable: decoded i-cache hits, no iTLB access, no page table walk
  • Non-executable: iTLB hits, but requires a page table walk
  • Unmapped: always requires a page table walk

Discussions: Controlling Noise
• Dynamic frequency scaling (SpeedStep, TurboBoost) changes the return value of rdtscp().
  • Run busy loops to make the CPU run at full throttle
• Hardware interrupts and cache conflicts also abort TSX.
  • Probe multiple times (e.g., 2-100) and take the minimum
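Taking the minimum works when noise only ever adds cycles, so the smallest sample is the one closest to the true cost. The simulation below illustrates that under an assumed noise model (small jitter plus occasional large spikes); the model, its parameters, and the function names are invented for this sketch, and only the 209/240 cycle figures come from the slides.

```python
import random

def noisy_probe(true_cycles: int, rng: random.Random) -> int:
    """Simulated measurement: true cost plus non-negative noise, with rare spikes."""
    noise = rng.randint(0, 10)
    if rng.random() < 0.2:            # e.g. an interrupt aborted the transaction
        noise += rng.randint(100, 2000)
    return true_cycles + noise

def min_of(n: int, true_cycles: int, rng: random.Random) -> int:
    """Probe n times and keep the minimum, as suggested on the slide."""
    return min(noisy_probe(true_cycles, rng) for _ in range(n))

rng = random.Random(0)                # fixed seed so the run is repeatable
mapped = min_of(100, 209, rng)
unmapped = min_of(100, 240, rng)
print(mapped, unmapped)
```

A single sample can be off by thousands of cycles in this model, while the minimum over 100 samples stays within the small jitter of the true value, so the mapped and unmapped cases remain separable.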

Discussions: Increasing Covertness
• The OS never sees page faults
  • TSX suppresses the exception
• Possible traces: performance counters
  • High count on dTLB/iTLB misses
    • Normal programs that sequentially access huge memory could behave similarly.
  • High count on tx-aborts or CPU time
    • Attackers could slow down the probing rate (e.g., 5 min, still fast)

Discussions: Countermeasures?
• Modifying the CPU to eliminate timing channels
  • Difficult to realize :(
• Using separate page tables for kernel and user processes
  • High performance overhead (~30%) due to frequent TLB flushes
• Fine-grained randomization
  • Difficult to implement, and degrades performance
• Coarse-grained timer?
  • Always suggested, but no one adopts it.

Conclusion
• TSX can break KASLR of commodity OSes.
  • Ensures accuracy, speed, and covertness
• The timing side channel is caused by hardware, independent of the OS.
  • dTLB (for mapped & unmapped)
  • Decoded i-cache (for executable / non-executable)
• We consider potential countermeasures against this attack.