52 Semantics Based Methods
-
8/12/2019 52 Semantics Based Methods
1/69
The Case for Semantics-Based Methods in Reverse Engineering
Rolf Rolles, Funemployed
RECON 2012 Keynote
2/69
The Point of This Keynote
Demonstrate the utility of academic program analysis towards solving real-world reverse engineering problems
3/69
Definitions
Syntactic methods consider only the encoding rather than the meaning of a given object, e.g., sequences of machine-code bytes or assembly language instructions, perhaps with wildcards
Semantic methods consider the meaning of the object, e.g., the effects of one or more instructions
4/69
Syntax vs. Semantics
Syntactic methods
tend to be fast, but are limited in power
work well in some cases, and poorly in others
are incapable of solving certain types of problems
Semantic methods
tend to be slower, but are more powerful
some analyses might produce approximate information (i.e. "maybe" instead of "yes" or "no")
5/69
Syntax-Based Methods
Are employed in cases such as
Packer entrypoint signatures
FLIRT signatures
Methods to locate functionality, e.g. FindCrypt
Anti-virus byte-level signatures
Deobfuscation of pattern-obfuscated code
6/69
Syntactic Methods: Strengths
Syntactic methods work well when the essential feature of the object lives in a restricted syntactic universe
FLIRT signatures in the case where the library is actually statically-distributed and not recompiled
Packer EP signatures when the packer always generates the same entrypoint
There is only one instance of some malicious software
Obfuscators with a limited vocabulary
7/69
FLIRT Signatures: Good Scenario
Library statically-linked, not recompiled
8/69
Syntactic Methods: Weaknesses
They do not work well when there are a variety of ways to encode the same property
FLIRT signatures when the library is recompiled
Packer EP signatures when the packer generates the EP polymorphically
AV signatures for polymorphic malware, or malware distributed in source form
Complex obfuscators
Making many signatures to account for the variation is not a good solution either
9/69
FLIRT Signatures: Bad Scenario
Library was recompiled
10/69
Semantics-Based Methods
Numerous applications in RE, including:
Automated key generator generation
Semi-generic deobfuscation
Automated bug discovery
Switch-as-binary-search case recovery
Stack tracking
This keynote attacks these problems via abstract interpretation and theorem proving
11/69
Exposing the Semantics
The right-hand side is the Intermediate Language translation (or IR).
12/69
Design of a Semantics Translator
1. Programming language-theoretic decisions
Tree-based? Three-address form?
2. Which behaviors to model?
Exceptions? Low-level details, e.g. segmentation?
3. ... 0000000), or (result < 0)?
Carry/overflow flags: model them as bit hacks a la Bochs, or as conditionals a la Relational REIL?
4.
13/69
Act I
Old-School Program Analysis
Abstract Interpretation
14/69
Abstract Interpretation: Signs Analysis
AI is complicated, but the basic ideas are not
Ex: determine each variable's sign at each point
Replaced the
concrete state with an abstract state
concrete semantics with an abstract semantics
15/69
Concept: Abstract the State
Different abstract interpretations use different abstract states. For the signs analysis, each variable could be
Unknown: either positive or negative (+/-)
Positive: x >= 0 (0+)
Negative: x <= 0 (0-)
Zero (0)
Uninitialized (?)
Ignore all other information, e.g., the actual values of variables.
16/69
Concept: Abstract the Semantics (*)
Abstract multiplication follows the well-known "rule of signs" from grade school
A positive times a positive is positive
A negative times a negative is positive
A negative times a positive is negative
Note: these remarks refer to mathematical integers; machine integers are subject to overflow
*     ?     0     0+    0-    +/-
?     +/-   0     +/-   +/-   +/-
0     0     0     0     0     0
0+    +/-   0     0+    0-    +/-
0-    +/-   0     0-    0+    +/-
+/-   +/-   0     +/-   +/-   +/-
17/69
Concept: Abstract the Semantics (+)
Positive + positive = positive. Negative + negative = negative.
Negative + positive = unknown:
-5 + 5. Concretely, the result is 0.
-6 + 5. Concretely, the result is -1.
-5 + 6. Concretely, the result is 1.
+     ?     0     0+    0-    +/-
?     +/-   +/-   +/-   +/-   +/-
0     +/-   0     0+    0-    +/-
0+    +/-   0+    0+    +/-   +/-
0-    +/-   0-    +/-   0-    +/-
+/-   +/-   +/-   +/-   +/-   +/-
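The two tables can be written down as a pair of small functions. This is a minimal sketch, assuming the five abstract values are represented as the strings used on the slides; the names `abs_mul`, `abs_add`, and `alpha` are illustrative and do not come from the talk.

```python
# Abstract values of the signs domain, named as on the slides.
UNINIT, ZERO, POS, NEG, UNKNOWN = "?", "0", "0+", "0-", "+/-"

def abs_mul(a, b):
    """Abstract multiplication over mathematical integers (rule of signs)."""
    if ZERO in (a, b):
        return ZERO                      # zero times anything is zero
    if UNINIT in (a, b) or UNKNOWN in (a, b):
        return UNKNOWN
    return POS if a == b else NEG        # like signs give 0+, unlike signs give 0-

def abs_add(a, b):
    """Abstract addition over mathematical integers."""
    if UNINIT in (a, b) or UNKNOWN in (a, b):
        return UNKNOWN
    if a == ZERO:
        return b
    if b == ZERO:
        return a
    return a if a == b else UNKNOWN      # 0+ plus 0- could have either sign

def alpha(n):
    """Abstraction function: map a concrete integer to its sign."""
    return ZERO if n == 0 else (POS if n > 0 else NEG)

print(abs_add(alpha(-5), alpha(5)))      # the slide's example: prints +/-
print(abs_mul(alpha(-3), alpha(-4)))     # prints 0+
```

As on the slides, this models mathematical integers; a version for machine integers would have to account for overflow.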
18/69
Example: Sparse Switch Table Recovery
Use abstract interpretation to infer case labels for switches compiled via binary search.
Abstract domain: intervals.
19/69
Switch Tables: Contiguous, Indexed
20/69
Switch Tables: Sparsely-Populated
Switch cases are sparsely-distributed.
Cannot implement efficiently with a table.
One option is to replace the construct with a series of if-statements.
This works, but takes O(N) time.
Instead, compilers generate decision trees that take O(log(N)) time, as shown on the next slide.
21/69
Decision Trees for Sparse Switches
22/69
Assembly Language Reification
Additional, slight complication: red instructions modify EAX throughout the decision tree.
23/69
Assembly Language Reification, Graphical
24/69
The Abstraction
Insight: we care about what range of values leads to a terminal case
Data abstraction: Intervals [l,u], where l <= u
Insight: construct implemented via sub, dec, cmp instructions (all are actually subtractions) and conditional branches
Semantics abstraction: Preservation of subtraction, bifurcation upon branching
25/69
3nalysis Results
Beginning (ith no information a)out arg#, each paththrough the decision tree induces a constraint upon itsrange of possi)le values, (ith single values or simple
ranges at case la)els+
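The interval analysis can be sketched on a toy decision tree. The tree encoding below (tuples tagged `"br"`, `"sub"`, `"case"`) and the case values 1, 10, 100, 1000 are hypothetical stand-ins for the disassembly shown in the figures. A `"sub"` node models the instructions that modify EAX, by tracking how far the register has drifted from the original argument.

```python
INF = float("inf")

def refine(lo, hi, op, k):
    """Split [lo, hi] on the test `arg op k`; returns (taken, fallthrough)."""
    if op == "==":
        taken = (max(lo, k), min(hi, k))
        # An interval cannot express a hole, so only a boundary value is trimmed.
        fall = (lo + 1 if lo == k else lo, hi - 1 if hi == k else hi)
        return taken, fall
    if op == ">":
        return (max(lo, k + 1), hi), (lo, min(hi, k))
    raise ValueError(f"unsupported comparison: {op}")

def walk(node, lo=-INF, hi=INF, offset=0, out=None):
    """Collect the interval of the ORIGINAL argument that reaches each leaf."""
    out = {} if out is None else out
    if lo > hi:
        return out                       # infeasible path
    if node[0] == "case":
        out[node[1]] = (lo, hi)
    elif node[0] == "sub":               # register := register - k
        walk(node[2], lo, hi, offset + node[1], out)
    else:                                # ("br", op, k, taken, fallthrough)
        _, op, k, taken, fall = node
        t, f = refine(lo, hi, op, k + offset)
        walk(taken, *t, offset, out)
        walk(fall, *f, offset, out)
    return out

tree = ("br", ">", 10,
        ("br", "==", 100, ("case", "case 100"),
         ("br", "==", 1000, ("case", "case 1000"), ("case", "default (high)"))),
        ("br", "==", 10, ("case", "case 10"),
         ("sub", 1, ("br", "==", 0, ("case", "case 1"), ("case", "default (low)")))))

for label, interval in walk(tree).items():
    print(label, interval)
```

The case labels come out as single-value intervals, while the default paths keep over-approximate ranges. The analysis stays sound even where intervals are imprecise.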
26/69
Example: Generic Deobfuscation
Use abstract interpretation to remove superfluous basic blocks from control flow graphs.
Abstract domain: three-valued bitvectors.
27/69
Anti-Tracing Control Obfuscation
This code is an anti-tracing check. First it pushes the flags, rotates the trap flag into the zero flag position, restores the flags, and then jumps if the zero flag (i.e., the previous trap flag) is set.
The [?]0 MB binary contains 10k-100k of these checks.
28/69
Obfuscated Control Flow Graph
Left: control flow graph with obfuscation of the type on the previous slide.
Right: the same control flow graph with the bogus jumps removed by the analysis that we are about to present.
29/69
A Semantic Pattern for This Check
A bit in a quantity (e.g., the TF bit resulting from a pushf instruction) is declared to be a constant (e.g., zero), and then the bit is used in further manipulations of that quantity.
Abstractly similar to constant propagation, except instead of entire quantities, we work on the bit level.
30/69
Problem: Unknown Bits
We only know that certain bits are constant; how do we handle non-constant ones?
What happens if we and, adc, add, cmp, dec, div, idiv, imul, inc, mul, neg, not, or, rcl, rcr, rol, ror, sar, shl, shr, sbb, setcc, sub, test, xor quantities that contain unknown bits?
    ? ? ? 1 ? ? ? 0
*   1 ? ? ? ? 1 ? ?
=   ? ? ? ? ? ? ? ?
31/69
Abstract Domain: Three-Valued Bitvectors
Abstract bits as having three values instead of two: 0, 1, ½ (½ = unknown: could be 0 or 1)
Model registers as vectors of three-valued bits
Model memory as arrays of three-valued bytes
32/69
Abstract Semantics: AND
Standard concrete semantics for AND:
AND  0  1
0    0  0
1    0  1
What happens when we introduce ½ bits?
½ AND 0 = 0 AND ½ = 0 (0 AND anything = 0)
½ AND 1 = 1 AND ½:
If ½ = 0, then 0 AND 1 = 0
If ½ = 1, then 1 AND 1 = 1
The two cases conflict, therefore ½ AND 1 = ½.
Similarly ½ AND ½ = ½.
Final three-valued truth table:
AND  0  ½  1
0    0  0  0
½    0  ½  ½
1    0  ½  1
33/69
Abstract Semantics: Bitwise Operators
AND  0  ½  1
0    0  0  0
½    0  ½  ½
1    0  ½  1
OR   0  ½  1
0    0  ½  1
½    ½  ½  1
1    1  1  1
XOR  0  ½  1
0    0  ½  1
½    ½  ½  ½
1    1  ½  0
NOT  0  ½  1
     1  ½  0
These operators follow the same pattern as the derivation on the previous slide, and work exactly how you would expect
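The four truth tables fit in a few lines if the unknown bit is encoded as the number 0.5, so that AND is `min`, OR is `max`, and NOT is `1 - x`. This encoding is a choice made for this sketch and is not taken from the talk. The usage example assumes a 16-bit flags word in which only the trap flag (bit 8) is known to be clear, in the spirit of the anti-tracing check.

```python
H = 0.5  # stands for the unknown bit, written ½ on the slides

def t_and(a, b):
    return min(a, b)

def t_or(a, b):
    return max(a, b)

def t_not(a):
    return 1 - a

def t_xor(a, b):
    return H if H in (a, b) else a ^ b

def from_int(value, width):
    """MSB-first list of fully known bits."""
    return [(value >> i) & 1 for i in reversed(range(width))]

def bv_and(x, y):
    """Bitwise AND of two MSB-first three-valued bitvectors."""
    return [t_and(a, b) for a, b in zip(x, y)]

flags = [H] * 16            # nothing is known about the flags word...
flags[16 - 1 - 8] = 0       # ...except that bit 8 (TF) is assumed clear
masked = bv_and(flags, from_int(0x0100, 16))
print(masked)               # every bit is a known 0
```

Because every bit of `masked` is known, a jump that depends on it being zero can be resolved statically even though 15 of the 16 input bits are unknown.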
34/69
Abstract Semantics: Shift Operators
½ 0 1 ½ 0 1 ½ 0    Some three-valued bitvector, call it BV
0 ½ 0 1 ½ 0 1 ½    BV >> 1
35/69
Concrete Semantics: Addition
36/69
Abstract Semantics: Addition
Abstractly, A[i], B[i], and the carry-in are three-valued, so there are 3^3 possibilities at each position.
The derivation is straightforward but tedious.
Notice that the system automatically determines that the sum of two N-bit integers is at most N+1 bits.
Carry-Out  0  0  0  ½  ½  ½
A[i]       0  0  0  ½  ½  ½
B[i]       0  0  0  ½  ½  ½
Carry-In   0  0  ½  ½  ½  0
Result     0  0  ½  ½  ½  ½
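Rather than writing out all 27 rows by hand, a three-valued full adder can be derived by enumerating the concrete values each input might take and joining the outcomes. The following sketch assumes the unknown bit is encoded as 0.5 and bitvectors are MSB-first lists; the helper names are illustrative.

```python
from itertools import product

H = 0.5  # the unknown bit, ½ on the slides

def concretize(bit):
    """All concrete values a three-valued bit may stand for."""
    return (0, 1) if bit == H else (bit,)

def join(values):
    """A single possible outcome is known; differing outcomes become ½."""
    values = set(values)
    return values.pop() if len(values) == 1 else H

def full_adder(a, b, cin):
    combos = list(product(concretize(a), concretize(b), concretize(cin)))
    s = join((x + y + c) & 1 for x, y, c in combos)
    cout = join((x + y + c) >> 1 for x, y, c in combos)
    return s, cout

def bv_add(x, y, carry_in=0):
    """Ripple-carry addition; returns (sum_bits, carry_out)."""
    result, carry = [], carry_in
    for a, b in zip(reversed(x), reversed(y)):
        s, carry = full_adder(a, b, carry)
        result.append(s)
    return result[::-1], carry

a = [0, 0, 0, H, H, H]
b = [0, 0, 0, H, H, H]
print(bv_add(a, b))  # reproduces the slide's Result row: 0 0 ½ ½ ½ ½
```

Two unknown 3-bit quantities yield a sum whose top two bits are known to be zero, which is the "at most N+1 bits" observation. Treating each position independently is sound but can lose correlations between bits.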
37/69
Abstract Semantics: Negation, Subtraction
Neg(x) = Not(x) + 1
Sub(x,y) = Add(x,~y) where the initial carry-in for the addition is set to one instead of zero.
Therefore, these operators can be implemented based upon what we presented already.
38/69
Unsigned Multiplication
Consider B = A * 0x123; 0x123 = 0001 0010 0011 = 2^8 + 2^5 + 2^1 + 2^0
B = A * (2^8 + 2^5 + 2^1 + 2^0) (substitution)
B = A * 2^8 + A * 2^5 + A * 2^1 + A * 2^0 (distributivity: * over +)
B = (A << 8) + (A << 5) + (A << 1) + (A << 0) (definition of <<)
Whence unsigned multiplication reduces to previously-solved problems
Signed multiplication is trickier, but similar
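The reduction can be checked on concrete integers. This sketch assumes 32-bit wraparound arithmetic; `mul_by_shifts` is an illustrative name. It multiplies by a known constant using only the shift and add operations whose abstract semantics were already defined.

```python
def mul_by_shifts(a, constant, width=32):
    """Multiply `a` by `constant` using only shifts and adds, modulo 2**width."""
    mask = (1 << width) - 1
    total = 0
    for bit in range(constant.bit_length()):
        if (constant >> bit) & 1:          # one shifted copy of `a` per set bit
            total = (total + (a << bit)) & mask
    return total

a = 0xDEADBEEF
assert mul_by_shifts(a, 0x123) == (a * 0x123) & 0xFFFFFFFF
assert mul_by_shifts(a, 0x123) == ((a << 8) + (a << 5) + (a << 1) + (a << 0)) & 0xFFFFFFFF
```

In the abstract setting the same loop would run over three-valued bitvectors, replacing `<<` and `+` with their abstract counterparts.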
39/69
Abstract Semantics: Conditionals
For equality, if any concrete bits mismatch, then A != B is true, and A == B is false.
For A < B, compute B-A and take the carry-out as the result
For A <= B, compute (A < B) | (A == B).
A = ½ 1 ½ ½ ½ 0 ½ ½
B = ½ 0 ½ ½ ½ 0 ½ ½
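The equality rule is short enough to state directly. A minimal sketch, again assuming the unknown bit is encoded as 0.5; the function name `bv_eq` is illustrative.

```python
H = 0.5  # the unknown bit, ½ on the slides

def bv_eq(x, y):
    """Three-valued A == B over equal-length bitvectors."""
    pairs = list(zip(x, y))
    if any(H not in (a, b) and a != b for a, b in pairs):
        return 0        # some pair of known bits differs: definitely unequal
    if all(H not in (a, b) for a, b in pairs):
        return 1        # every bit is known and they all match
    return H            # no known mismatch, but unknown bits remain

A = [H, 1, H, H, H, 0, H, H]
B = [H, 0, H, H, H, 0, H, H]
print(bv_eq(A, B))      # prints 0: the second bit is 1 in A and 0 in B
```

One known mismatching position is enough to decide the comparison, even though six of the eight bits are unknown in both operands.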
40/69
Deobfuscation Procedure
Generate control flow graph
1. Apply the analysis to each basic block
2. If any conditional jump becomes unconditional, remove the false edge from the graph
3. Prune all vertices with no incoming edges (DFS)
4. Merge all vertices with a sole successor, whose successor has a sole predecessor
5. Iterate back to #1 until the graph stops changing
Stupid algorithm, could be majorly improved
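The graph surgery in steps 2 through 5 can be sketched on a dictionary-based control flow graph. The per-block analysis is not reproduced here: its verdicts are assumed to arrive as a precomputed set `dead_edges` of (block, successor) pairs, and all block names are made up. In the real procedure the analysis would be re-run on the merged blocks at every iteration.

```python
def reachable(cfg, entry):
    """Depth-first search from the entry block."""
    seen, stack = set(), [entry]
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(cfg[v])
    return seen

def deobfuscate(cfg, entry, dead_edges):
    # Step 2: drop edges of conditional jumps the analysis proved one-sided.
    cfg = {v: [s for s in succs if (v, s) not in dead_edges]
           for v, succs in cfg.items()}
    changed = True
    while changed:                                   # step 5: iterate to a fixed point
        changed = False
        live = reachable(cfg, entry)                 # step 3: prune unreachable vertices
        if len(live) != len(cfg):
            cfg = {v: cfg[v] for v in live}
            changed = True
        preds = {v: [] for v in cfg}
        for v, succs in cfg.items():
            for s in succs:
                preds[s].append(v)
        for v in list(cfg):                          # step 4: merge straight-line pairs
            succs = cfg[v]
            if len(succs) == 1:
                s = succs[0]
                if s != v and s != entry and preds[s] == [v]:
                    cfg[v] = cfg.pop(s)              # v absorbs s and inherits its successors
                    changed = True
                    break                            # preds is stale; recompute next pass
    return cfg

cfg = {"A": ["B", "junk1"], "junk1": ["junk2"], "junk2": ["B"], "B": ["C"], "C": []}
print(deobfuscate(cfg, "A", {("A", "junk1")}))       # prints {'A': []}
```

Removing one bogus edge makes the junk blocks unreachable, after which the remaining chain collapses into a single vertex, mirroring the progressive shrinking described in the talk.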
41/69
Progressive Deobfuscation
Original graph: 232 vertices
Deobfuscation round #1: five vertices
Deobfuscation round #2, final: one vertex
42/69
Example: Tracking ESP
We explore and generalize Ilfak's work on stack tracking.
Abstract domains: convex polyhedra and friends in the relational domain family.
43/69
Stack Tracking, Ilfak 2006
44/69
Stack Tracking, Ilfak 2006
Want to know the differential of ESP between function begin and every point in the function.
Problem: indirect calls with unknown calling conventions.
45/69
Stack Tracking
Generate a convex polyhedron, defined by:
Two variables for every block: inesp, outesp.
One equality for each initial and terminal block.
One equality for each edge (#i,#j): outesp_i = inesp_j
One inequality (not shown) for each block #n, relating inesp_n to outesp_n, based on the semantics (ESP modifications: calls, pushes, pops) of the block.
Solve the equation system for an assignment to the ESP-related variables.
46/69
Stack Tracking: Inequalities
This block pushes 6 DWORDs (24 bytes) on the stack, and it is unknown whether the call removes them. Therefore, the inequality generated for this block is:
outesp_5 - inesp_5 <= 24
47/69
Alternative Formulations
Ilfak's solution uses polyhedra, which is potentially computationally expensive
Note: all equations are of the form v_i - v_j <= c_ij, which can be solved in O(|V|*|E|) time with Bellman-Ford (or other PTIME solutions)
Figure stolen from Antoine Miné's Ph.D. thesis due to lack of time. Sorry.
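Systems of constraints of the form v_i - v_j <= c_ij can be solved by running Bellman-Ford on a graph with one edge per constraint. The sketch below is a generic textbook formulation and not the code behind the talk; variable names are arbitrary, and an equality is expressed as a pair of opposite inequalities.

```python
def solve_difference_constraints(variables, constraints):
    """constraints: list of (i, j, c), each meaning v_i - v_j <= c.

    Returns a satisfying assignment as a dict, or None if none exists.
    """
    # Starting every distance at 0 is equivalent to adding a virtual source
    # with a zero-weight edge to each variable.
    dist = {v: 0 for v in variables}
    for _ in range(len(variables)):
        changed = False
        for i, j, c in constraints:
            if dist[j] + c < dist[i]:      # relax the edge j -> i of weight c
                dist[i] = dist[j] + c
                changed = True
        if not changed:
            return dist
    return None  # still relaxing after |V| passes: negative cycle, unsatisfiable

constraints = [("x", "y", 3), ("y", "z", -2), ("z", "x", -1)]
solution = solve_difference_constraints(["x", "y", "z"], constraints)
print(solution)
print(all(solution[i] - solution[j] <= c for i, j, c in constraints))  # prints True
```

Each pass touches every constraint once and at most |V| passes are needed, which gives the O(|V|*|E|) bound quoted above.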
48/69
Random Concept: Reduced Product
Instead of performing analyses separately, allow them to interact => increased precision
Suppose we perform several analyses, and the results for variable x at some point are:
[-10,6] (Interval)
0+ (Sign)
Odd (Parity)
Using the other domains, we can refine the interval abstraction:
Reduced product of ([-10,6],0+) = ([0,6],0+)
Reduced product of ([0,6],Odd) = ([1,5],Odd)
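The two refinements above can be sketched as a single function that tightens an interval using the sign and parity facts. This is a one-directional simplification, since a full reduced product would also refine the sign and parity components from the interval. The string encodings `"0+"`, `"0-"`, `"odd"`, and `"even"` are assumptions of this sketch.

```python
def reduce_interval(interval, sign=None, parity=None):
    """Tighten [lo, hi] using facts from the sign and parity domains."""
    lo, hi = interval
    if sign == "0+":
        lo = max(lo, 0)                 # x >= 0
    elif sign == "0-":
        hi = min(hi, 0)                 # x <= 0
    if parity == "odd":
        if lo % 2 == 0:
            lo += 1                     # smallest odd value not below lo
        if hi % 2 == 0:
            hi -= 1                     # largest odd value not above hi
    elif parity == "even":
        if lo % 2 == 1:
            lo += 1
        if hi % 2 == 1:
            hi -= 1
    return (lo, hi)

print(reduce_interval((-10, 6), sign="0+"))                  # prints (0, 6)
print(reduce_interval((0, 6), parity="odd"))                 # prints (1, 5)
print(reduce_interval((-10, 6), sign="0+", parity="odd"))    # prints (1, 5)
```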
49/69
Act II
New-School Program Analysis
SMT Solving
50/69
Concept: Input Crafting via Theorem Proving
Idea: convert portions of code into logical formulas, and use mathematically precise techniques to prove properties about them
Example: what value must EAX have at the beginning of this snippet in order for EAX to be 0x12345678 after the snippet executes?
51/69
IR to SMT Formula
Part of the IR translation of the x86 snippet given on the previous slide.
A slightly simplified (read: incorrect) SMT QF_EUFBV translation of the IR from the left.
52/69
Ask a Question
Given the SMT formula, initial EAX unspecified, is it possible that this postcondition is true?
assert(T175d == 0x12345678); (T175d is final EAX)
The SMT solver outputs a model that satisfies the constraints.
The first red line says that the formula is satisfiable, i.e., the answer is yes.
The final red line says that the initial value of EAX must be 1450744509 or 0x56789ABD
53/69
Automated Key Generator Generation
As before, generate an execution trace (statically) and convert to IR. Then convert the IR to an SMT formula.
Precondition:
aActivationCode[0] = X AND aActivationCode[1] = Y AND aActivationCode[2] = Z, where X = regcode[0], Y = regcode[1], Z = regcode[2], ...
Postcondition:
Stringderived[0] = '0' AND Stringderived[1] = 'h' AND Stringderived[2] = 'o' ...
54/69
Example: Equivalence Checking for Error Discovery
We employ a theorem prover (SMT solver) towards the problem of finding situations in which virtualization obfuscators produce incorrect translations of the input.
55/69
Concept: Equivalence Checking
Iterative bit-tests
Sequential ternary operator
Population counting, naïvely. Count the number of one-bits set.
56/69
Population Count via Bit
57/69
8-Bit Population Count via Bit
58/69
Equivalence of Naïve and Bit
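The question put to the solver is whether any input makes the two implementations disagree. For an 8-bit input that question can be answered by exhaustive enumeration, which is what the sketch below does as a stand-in for the SMT query. The bit-manipulation variant shown is the well-known parallel-sum population count and is an assumption, since the transcript does not preserve the code from the slides.

```python
def popcount_naive(x):
    """Test each of the 8 bits in turn and count the ones."""
    count = 0
    for i in range(8):
        count += (x >> i) & 1
    return count

def popcount_parallel(x):
    """Sum adjacent 1-bit, then 2-bit, then 4-bit fields in parallel."""
    x = (x & 0x55) + ((x >> 1) & 0x55)
    x = (x & 0x33) + ((x >> 2) & 0x33)
    x = (x & 0x0F) + ((x >> 4) & 0x0F)
    return x

# "Is there an input on which the outputs differ?" An empty list plays the
# role of the solver answering UNSAT, i.e. the two functions are equivalent.
counterexamples = [x for x in range(256)
                   if popcount_naive(x) != popcount_parallel(x)]
print(counterexamples)
```

Enumeration stops being practical at 32 or 64 bits, which is where posing the same disagreement query to an SMT solver over bitvectors becomes the appropriate tool.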
59/69
Verification of Deobfuscation
Given some deobfuscation procedure, we want to ensure that the output is equivalent to the input
60/69
Is this (1 of 2)
61/69
Is this ... (2 of 2)
62/69
Equivalent to This?
Theorem prover says: YES, if we ignore the values below terminal ESP
63/69
Inequivalence #1
These sequences are INEQUIVALENT: the obfuscated version modifies the carry flag (with the add and sub instructions) before the inc takes place, and the inc instruction does not modify that flag.
Obfuscated version of inc dword handler.
Deobfuscated handler.
64/69
Inequivalence #2
The sar instruction does not change the flags if the shiftand is zero, whereas the obfuscated handler does change the flags via the add instructions.
Obfuscated version of sar dword handler.
Deobfuscated handler.
65/69
Inequivalence #3
Can't show obfuscated version due to it being 82 instructions long.
Obfuscated version writes to stack whereas deobfuscated version does not; therefore, the memory read on the last line could read a value below the stack pointer, which would be different in the obfuscated and deobfuscated version.
66/69
Warning:
I tried to make my presentation friendly; the literature does not make any such attempt
67/69
References
A program analysis reading list that I compiled
http://www.reddit.com/r/ReverseEngineering/comments/smf4u/reverser_wanting_to_develop_mathematically/c4fa6yl
Rolles: Switch as Binary Search
https://www.openrce.org/blog/view/1319/
https://www.openrce.org/blog/view/1320/
Rolles: Control Flow Deobfuscation via Abstract Interpretation
https://www.openrce.org/blog/view/1672/
Rolles: Finding Bugs in VMs with a Theorem Prover
https://www.openrce.org/blog/view/1963/
Rolles: Semi-Automated Input Crafting
https://www.openrce.org/blog/view/2049/
Ilfak: Simplex Method in IDA Pro
http://www.hexblog.com/?p=42
68/69
Questions?
69/69
Jamie Gamble, Sean