52 Semantics Based Methods


  • 8/12/2019 52 Semantics Based Methods

    The Case for Semantics-Based Methods in Reverse Engineering

    Rolf Rolles, Funemployed
    RECON 2012 Keynote

    The Point of This Keynote

    Demonstrate the utility of academic program analysis towards solving real-world reverse engineering problems.

    Definitions

      • Syntactic methods consider only the encoding rather than the meaning of a given object, e.g., sequences of machine-code bytes or assembly language instructions, perhaps with wildcards.
      • Semantic methods consider the meaning of the object, e.g., the effects of one or more instructions.
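To make the "bytes with wildcards" idea concrete, here is a minimal sketch of a syntactic matcher. The pattern syntax (`??` for a wildcard byte) and the function names are assumptions made for illustration and are not taken from any particular tool. The two byte strings hold the same prologue, with `mov ebp, esp` encoded two different ways.

```python
from typing import List, Optional

def parse_signature(text: str) -> List[Optional[int]]:
    """Turn '55 8B EC ??' into [0x55, 0x8B, 0xEC, None]; None is a wildcard byte."""
    return [None if tok == "??" else int(tok, 16) for tok in text.split()]

def find_signature(data: bytes, sig: List[Optional[int]]) -> List[int]:
    """Return every offset at which the pattern matches byte-for-byte."""
    hits = []
    for off in range(len(data) - len(sig) + 1):
        if all(s is None or data[off + i] == s for i, s in enumerate(sig)):
            hits.append(off)
    return hits

# Hypothetical pattern: push ebp; mov ebp, esp; sub esp, imm8
SIG = parse_signature("55 8B EC 83 EC ??")

original = bytes.fromhex("90558bec83ec10c3")    # mov ebp, esp encoded as 8B EC
recompiled = bytes.fromhex("905589e583ec10c3")  # same instruction encoded as 89 E5

print(find_signature(original, SIG))    # the pattern is found
print(find_signature(recompiled, SIG))  # the pattern is missed
```

The second search comes back empty even though both buffers behave identically, which is the gap between encoding and meaning that the definitions describe.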

    Syntax vs. Semantics

    Syntactic methods
      • tend to be fast, but are limited in power
      • work well in some cases, and poorly in others
      • are incapable of solving certain types of problems

    Semantic methods
      • tend to be slower, but are more powerful
      • some analyses might produce approximate information (i.e., "maybe" instead of "yes" or "no")

    Syntax-Based Methods

    Are employed in cases such as
      • Packer entrypoint signatures
      • FLIRT signatures
      • Methods to locate functionality, e.g., FindCrypt
      • Anti-virus byte-level signatures
      • Deobfuscation of pattern-obfuscated code

    Syntactic Methods: Strengths

    Syntactic methods work well when the essential feature of the object lives in a restricted syntactic universe
      • FLIRT signatures in the case where the library is actually statically-distributed and not recompiled
      • Packer EP signatures when the packer always generates the same entrypoint
      • There is only one instance of some malicious software
      • Obfuscators with a limited vocabulary

    FLIRT Signatures: Good Scenario

    Library statically-linked, not recompiled

    Syntactic Methods: Weaknesses

    They do not work well when there are a variety of ways to encode the same property
      • FLIRT signatures when the library is recompiled
      • Packer EP signatures when the packer generates the EP polymorphically
      • AV signatures for polymorphic malware, or malware distributed in source form
      • Complex obfuscators

    Making many signatures to account for the variation is not a good solution either

    FLIRT Signatures: Bad Scenario

    Library was recompiled

    Semantics-Based Methods

    Numerous applications in RE, including:
      • Automated key generator generation
      • Semi-generic deobfuscation
      • Automated bug discovery
      • Switch-as-binary-search case recovery
      • Stack tracking

    This keynote attacks these problems via abstract interpretation and theorem proving

    Exposing the Semantics

    The right-hand side is the Intermediate Language translation (or IR).

    Design of a Semantics Translator

    1. Programming language-theoretic decisions
         Tree-based? Three-address form?
    2. Which behaviors to model?
         Exceptions? Low-level details, e.g., segmentation?
    3. ...0000000), or (result < 0)?
         Carry/overflow flags: model them as bit hacks a la Bochs, or as conditionals a la Relational REIL?
    4. ...

    Act I

    Old-School Program Analysis: Abstract Interpretation

    Abstract Interpretation: Signs Analysis

    AI is complicated, but the basic ideas are not

    Ex: determine each variable's sign at each point

    Replaced the
      • concrete state with an abstract state
      • concrete semantics with an abstract semantics

    Concept: Abstract the State

    Different abstract interpretations use different abstract states. For the signs analysis, each variable could be
      • Unknown: either positive or negative (+/-)
      • Positive: >= 0 (0+)
      • Negative: <= 0 (0-)
      • Zero (0)
      • Uninitialized (?)

    Ignore all other information, e.g., the actual values of variables.

    Concept: Abstract the Semantics (*)

    Abstract multiplication follows the well-known "rule of signs" from grade school
      • A positive times a positive is positive
      • A negative times a negative is positive
      • A negative times a positive is negative

    Note: these remarks refer to mathematical integers; machine integers are subject to overflow

       *  |  ?    0    0+   0-   +/-
      ----+--------------------------
       ?  | +/-   0   +/-  +/-  +/-
       0  |  0    0    0    0    0
       0+ | +/-   0    0+   0-  +/-
       0- | +/-   0    0-   0+  +/-
      +/- | +/-   0   +/-  +/-  +/-

    Concept: Abstract the Semantics (+)

      • Positive + positive = positive.
      • Negative + negative = negative.
      • Negative + positive = unknown:
          -5 + 5. Concretely, the result is 0.
          -6 + 5. Concretely, the result is -1.
          -5 + 6. Concretely, the result is 1.

       +  |  ?    0    0+   0-   +/-
      ----+--------------------------
       ?  | +/-  +/-  +/-  +/-  +/-
       0  | +/-   0    0+   0-  +/-
       0+ | +/-   0+   0+  +/-  +/-
       0- | +/-   0-  +/-   0-  +/-
      +/- | +/-  +/-  +/-  +/-  +/-
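The signs domain is small enough to write out directly. The following is a minimal sketch, assuming the five abstract values are spelled as the strings used in the tables; the names `abs_mul`, `abs_add` and `alpha` are hypothetical helpers chosen for this illustration.

```python
# Abstract values of the signs domain, spelled as in the tables.
UNINIT, ZERO, POS, NEG, UNKNOWN = "?", "0", "0+", "0-", "+/-"

def abs_mul(a: str, b: str) -> str:
    """Abstract multiplication: the rule of signs."""
    if a == ZERO or b == ZERO:
        return ZERO                      # zero times anything is zero
    if UNINIT in (a, b) or UNKNOWN in (a, b):
        return UNKNOWN
    return POS if a == b else NEG        # like signs give 0+, unlike signs give 0-

def abs_add(a: str, b: str) -> str:
    """Abstract addition: mixed signs lose all information."""
    if UNINIT in (a, b) or UNKNOWN in (a, b):
        return UNKNOWN
    if a == ZERO:
        return b
    if b == ZERO:
        return a
    return a if a == b else UNKNOWN

def alpha(n: int) -> str:
    """Most precise abstraction of one concrete (mathematical) integer."""
    return ZERO if n == 0 else (POS if n > 0 else NEG)

print(abs_mul(NEG, NEG))   # 0+
print(abs_add(NEG, POS))   # +/-
```

A quick soundness check is to compare against concrete arithmetic: for any integers `x` and `y`, `abs_mul(alpha(x), alpha(y))` should be either `alpha(x * y)` or the unknown value, and likewise for addition. As the slide notes, this holds for mathematical integers only; machine integers can overflow.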

    Example: Sparse Switch Table Recovery

    Use abstract interpretation to infer case labels for switches compiled via binary search.

    Abstract domain: intervals.

    Switch Tables: Contiguous, Indexed

    Switch Tables: Sparsely-Populated

      • Switch cases are sparsely-distributed.
      • Cannot implement efficiently with a table.
      • One option is to replace the construct with a series of if-statements.
      • This works, but takes O(N) time.
      • Instead, compilers generate decision trees that take O(log(N)) time, as shown on the next slide.

    Decision Trees for Sparse Switches

    Assembly Language Reification

    Additional, slight complication: red instructions modify EAX throughout the decision tree.

    Assembly Language Reification, Graphical

    The Abstraction

      • Insight: we care about what range of values leads to a terminal case
      • Data abstraction: Intervals [l,u], where l <= u
      • Insight: construct implemented via sub, dec, cmp instructions (all are actually subtractions) and conditional branches
      • Semantics abstraction: Preservation of subtraction, bifurcation upon branching

    Analysis Results

    Beginning with no information about arg0, each path through the decision tree induces a constraint upon its range of possible values, with single values or simple ranges at case labels.
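The interval analysis can be sketched on a toy decision tree. Everything here is an assumption made for illustration: the tree is hand-built rather than lifted from assembly, only signed "less than" and "equal" branches are modeled, and wraparound is ignored. `bias` accumulates the constants subtracted from the register, so that intervals are always stated in terms of the original switch value.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

INT_MIN, INT_MAX = -2**31, 2**31 - 1

@dataclass
class Leaf:                 # a terminal case of the switch
    label: str

@dataclass
class Sub:                  # models "sub eax, k" (dec is the case k == 1)
    k: int
    next: "Node"

@dataclass
class Branch:               # models "cmp eax, c" followed by a conditional jump
    op: str                 # "lt" (jump if eax < c) or "eq" (jump if eax == c)
    c: int
    taken: "Node"
    fallthrough: "Node"

Node = Union[Leaf, Sub, Branch]

def walk(node: Node, lo: int = INT_MIN, hi: int = INT_MAX,
         bias: int = 0) -> List[Tuple[str, int, int]]:
    """Return (label, lo, hi) triples; [lo, hi] bounds the ORIGINAL value."""
    if lo > hi:
        return []                               # infeasible path
    if isinstance(node, Leaf):
        return [(node.label, lo, hi)]
    if isinstance(node, Sub):
        return walk(node.next, lo, hi, bias + node.k)   # preserve the subtraction
    pivot = node.c + bias                       # comparison restated on the original value
    if node.op == "lt":
        taken, fall = (lo, min(hi, pivot - 1)), (max(lo, pivot), hi)
    else:                                       # "eq": a point can only be cut off an end
        taken = (max(lo, pivot), min(hi, pivot))
        fall = (lo + 1, hi) if pivot == lo else (lo, hi - 1) if pivot == hi else (lo, hi)
    # bifurcate: each outgoing edge gets its own refined interval
    return walk(node.taken, *taken, bias) + walk(node.fallthrough, *fall, bias)

# Hypothetical sparse switch with cases 1, 50, 300 and 301.
tree = Branch("lt", 300,
              Branch("eq", 1, Leaf("case 1"),
                     Branch("eq", 50, Leaf("case 50"), Leaf("default"))),
              Sub(300,
                  Branch("eq", 0, Leaf("case 300"),
                         Branch("eq", 1, Leaf("case 301"), Leaf("default")))))

for label, lo, hi in walk(tree):
    print(label, lo, hi)
```

Each case label ends up with a single-value interval, while the default leaves receive the leftover ranges. Intervals cannot represent holes, so the fallthrough of an "eq" test in the middle of a range is deliberately over-approximated.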

    Example: Generic Deobfuscation

    Use abstract interpretation to remove superfluous basic blocks from control flow graphs.

    Abstract domain: three-valued bitvectors.

    Anti-Tracing Control Obfuscation

    This code is an anti-tracing check. First it pushes the flags, rotates the trap flag into the zero flag position, restores the flags, and then jumps if the zero flag (i.e., the previous trap flag) is set.

    The binary (tens of megabytes in size) contains 10k-100k of these checks.

    Obfuscated Control Flow Graph

    Left: control flow graph with obfuscation of the type on the previous slide.

    Right: the same control flow graph with the bogus jumps removed by the analysis that we are about to present.

    A Semantic Pattern for This Check

    A bit in a quantity (e.g., the TF bit resulting from a pushf instruction) is declared to be a constant (e.g., zero), and then the bit is used in further manipulations of that quantity.

    Abstractly similar to constant propagation, except instead of entire quantities, we work on the bit level.

    Problem: Unknown Bits

    We only know that certain bits are constant; how do we handle non-constant ones?

    What happens if we
      and, adc, add, cmp, dec, div, idiv, imul, inc, mul, neg, not, or, rcl, rcr, rol, ror, sar, shl, shr, sbb, setcc, sub, test, xor
    quantities that contain unknown bits?

          ? ? ? 1 ? ? ? 0
      *   1 ? ? ? ? 1 ? ?
      =   ? ? ? ? ? ? ? ?

    Abstract Domain: Three-Valued Bitvectors

      • Abstract bits as having three values instead of two: 0, 1, ½ (½ = unknown: could be 0 or 1)
      • Model registers as vectors of three-valued bits
      • Model memory as arrays of three-valued bytes

    Abstract Semantics: AND

    Standard concrete semantics for AND:

      AND | 0  1
      ----+------
       0  | 0  0
       1  | 0  1

    What happens when we introduce ½ bits?
      • ½ AND 0 = 0 AND ½ = 0 (0 AND anything = 0)
      • ½ AND 1 = 1 AND ½ = ?
          If ½ = 0, then 0 AND 1 = 0
          If ½ = 1, then 1 AND 1 = 1
          The two cases conflict, therefore ½ AND 1 = ½.
      • Similarly ½ AND ½ = ½.

    Final three-valued truth table:

      AND | 0  ½  1
      ----+---------
       0  | 0  0  0
       ½  | 0  ½  ½
       1  | 0  ½  1

    Abstract Semantics: Bitwise Operators

      AND | 0  ½  1        OR  | 0  ½  1
      ----+---------       ----+---------
       0  | 0  0  0         0  | 0  ½  1
       ½  | 0  ½  ½         ½  | ½  ½  1
       1  | 0  ½  1         1  | 1  1  1

      XOR | 0  ½  1        NOT | 0  ½  1
      ----+---------       ----+---------
       0  | 0  ½  1            | 1  ½  0
       ½  | ½  ½  ½
       1  | 1  ½  0

    These operators follow the same pattern as the derivation on the previous slide, and work exactly how you would expect
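The truth tables translate almost line for line into code. The following is a minimal sketch, assuming a three-valued bit is represented as the Python integer 0 or 1, or the string `"½"` for unknown; the helper names are hypothetical. The example at the end mirrors the anti-tracing pattern: once the trap flag bit is declared to be zero, masking the flags with that bit yields a fully known zero even though every other flag bit is unknown.

```python
HALF = "½"  # the unknown bit: could be 0 or 1

def t_not(a):
    return HALF if a == HALF else 1 - a

def t_and(a, b):
    if a == 0 or b == 0:
        return 0                 # 0 AND anything = 0, even if the other bit is unknown
    return 1 if (a, b) == (1, 1) else HALF

def t_or(a, b):
    if a == 1 or b == 1:
        return 1                 # 1 OR anything = 1
    return 0 if (a, b) == (0, 0) else HALF

def t_xor(a, b):
    return HALF if HALF in (a, b) else a ^ b

def from_int(value, width):
    """Fully known bitvector, most significant bit first."""
    return [(value >> i) & 1 for i in reversed(range(width))]

def vec_and(x, y):
    return [t_and(a, b) for a, b in zip(x, y)]

# Flags image after pushf (16 bits for brevity): all unknown except TF (bit 8),
# which is ASSUMED to be zero, as in the semantic pattern described earlier.
flags = [HALF] * 16
flags[15 - 8] = 0
masked = vec_and(flags, from_int(0x0100, 16))
print(masked)  # every bit is a known 0
```

Because the masked value is entirely known, a conditional jump that tests it has only one feasible outcome, which is what lets the deobfuscator delete the bogus edge.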

    Abstract Semantics: Shift Operators

      ½ 0 1 ½ 0 1 ½ 0    Some three-valued bitvector, call it BV
      0 ½ 0 1 ½ 0 1 ½    BV SHR 1

    Concrete Semantics: Addition

    Abstract Semantics: Addition

      • Abstractly, A[i], B[i], and the carry-in are three-valued, so there are 3^3 = 27 possibilities at each position.
      • The derivation is straightforward but tedious.
      • Notice that the system automatically determines that the sum of two N-bit integers is at most N+1 bits.

      Carry-Out   0 0 0 ½ ½ ½
      A[i]        0 0 0 ½ ½ ½
      B[i]        0 0 0 ½ ½ ½
      Carry-In    0 0 ½ ½ ½ 0
      Result      0 0 ½ ½ ½ ½
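Rather than writing out all 27 cases by hand, a sketch can derive the abstract full adder by enumerating the concrete possibilities behind each three-valued bit and joining the outcomes. This is one possible way to obtain the table, not necessarily how the author implemented it; `full_add` and `t_add` are hypothetical names, and bitvectors are lists with the most significant bit first.

```python
HALF = "½"  # the unknown bit: could be 0 or 1

def full_add(a, b, c):
    """Abstract full adder: returns (sum bit, carry-out), each 0, 1 or ½."""
    def choices(t):
        return (0, 1) if t == HALF else (t,)
    sums, carries = set(), set()
    for x in choices(a):
        for y in choices(b):
            for z in choices(c):
                sums.add((x + y + z) & 1)
                carries.add((x + y + z) >> 1)
    def join(s):
        return s.pop() if len(s) == 1 else HALF   # disagreement means unknown
    return join(sums), join(carries)

def t_add(A, B, carry_in=0):
    """Ripple-carry addition of two equal-length three-valued bitvectors."""
    result, carry = [], carry_in
    for a, b in zip(reversed(A), reversed(B)):    # least significant bit first
        s, carry = full_add(a, b, carry)
        result.append(s)
    return result[::-1], carry

# The example from the table: two values whose low three bits are unknown.
A = [0, 0, 0, HALF, HALF, HALF]
B = [0, 0, 0, HALF, HALF, HALF]
print(t_add(A, B))
```

On this input the sketch reproduces the Result row of the table: the sum has four unknown low bits and known zeros above them, which is the "at most N+1 bits" observation.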

    Abstract Semantics: Negation, Subtraction

      • Neg(x) = Not(x) + 1
      • Sub(x,y) = Add(x,~y) where the initial carry-in for the addition is set to one instead of zero.

    Therefore, these operators can be implemented based upon what we presented already.

    Unsigned Multiplication

    Consider B = A * 0x123; 0x123 = 0001 0010 0011 = 2^8 + 2^5 + 2^1 + 2^0

      B = A * (2^8 + 2^5 + 2^1 + 2^0)                  (substitution)
      B = A * 2^8 + A * 2^5 + A * 2^1 + A * 2^0        (distributivity: * over +)
      B = (A << 8) + (A << 5) + (A << 1) + (A << 0)    (definition of <<)

    Whence unsigned multiplication reduces to previously-solved problems

    Signed multiplication is trickier, but similar
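The shift-and-add identity is easy to check on concrete values. The sketch below works on ordinary Python integers truncated to a fixed width; it illustrates only the decomposition, not the three-valued version. `mul_by_shifts` is a hypothetical helper name.

```python
def mul_by_shifts(a: int, k: int, width: int = 32) -> int:
    """Multiply a by the constant k using only shifts and additions (mod 2**width)."""
    mask = (1 << width) - 1
    total, shift = 0, 0
    while k:
        if k & 1:                                   # this power of two is present in k
            total = (total + (a << shift)) & mask
        k >>= 1
        shift += 1
    return total

# 0x123 has bits 8, 5, 1 and 0 set, so this computes (A<<8)+(A<<5)+(A<<1)+(A<<0).
for a in (0, 1, 0x1234, 0xDEADBEEF, 0xFFFFFFFF):
    assert mul_by_shifts(a, 0x123) == (a * 0x123) & 0xFFFFFFFF
```

Since each step is a shift or an addition, the same loop could in principle be run over abstract bitvectors using the abstract shift and addition operators.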

    Abstract Semantics: Conditionals

      • For equality, if any concrete bits mismatch, then A != B is true, and A == B is false.
      • For A < B, compute B-A and take the carry-out as the result
      • For A <= B, compute (A < B) | (A == B).

      A = ½ 1 ½ ½ ½ 0 ½ ½
      B = ½ 0 ½ ½ ½ 0 ½ ½

    Deobfuscation Procedure

    Generate control flow graph

    1. Apply the analysis to each basic block
    2. If any conditional jump becomes unconditional, remove the false edge from the graph
    3. Prune all vertices with no incoming edges (DFS)
    4. Merge all vertices with a sole successor, whose successor has a sole predecessor
    5. Iterate back to #1 until the graph stops changing

    Stupid algorithm, could be majorly improved

    Progressive Deobfuscation

      • Original graph: 232 vertices
      • Deobfuscation round #1: five vertices
      • Deobfuscation round #2, final: one vertex

    Example: Tracking ESP

    We explore and generalize Ilfak's work on stack tracking.

    Abstract domains: convex polyhedra and friends in the relational domain family.


    Stack Tracking, Ilfak 2006

    Want to know the differential of ESP between function begin and every point in the function.

    Problem: indirect calls with unknown calling conventions.

    Stack Tracking

    Generate a convex polyhedron, defined by:
      • Two variables for every block: inesp, outesp.
      • One equality for each initial and terminal block.
      • One equality for each edge (#i, #j): outesp_i = inesp_j
      • One inequality (not shown) for each block #n, relating inesp_n to outesp_n, based on the semantics (ESP modifications: calls, pushes, pops) of the block.

    Solve the equation system for an assignment to the ESP-related variables.

    Stack Tracking: Inequalities

    This block pushes 6 DWORDs (24 bytes) on the stack, and it is unknown whether the call removes them. Therefore, the inequality generated for this block is:

      outesp_5 - inesp_5 <= 24

    Alternative Formulations

      • Ilfak's solution uses polyhedra, which is potentially computationally expensive
      • Note: all equations are of the form v_i - v_j <= c_ij, which can be solved in O(|V|*|E|) time with Bellman-Ford (or other PTIME solutions)

    Figure stolen from Antoine Miné's Ph.D. thesis due to lack of time. Sorry.
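A system of constraints of the form v_i - v_j <= c can be solved as a shortest-path problem, where each constraint becomes an edge from j to i with weight c. The sketch below uses Bellman-Ford on a three-block function invented for illustration. The block layout, the sign convention (values count bytes pushed since function entry), the lower bound of 0 on the call block, and the anchor variable `z` that pins constants are all assumptions, not details from the talk.

```python
def solve_difference_constraints(variables, constraints):
    """constraints: (i, j, c) triples meaning v_i - v_j <= c.
    Returns one satisfying assignment, or None if the system is infeasible."""
    dist = {v: 0 for v in variables}        # implicit source with 0-weight edges to all
    for _ in range(len(variables)):
        changed = False
        for i, j, c in constraints:
            if dist[j] + c < dist[i]:       # relax the edge j -> i
                dist[i] = dist[j] + c
                changed = True
        if not changed:
            return dist
    return None                             # still relaxing: negative cycle

def eq(a, b, c):
    """a - b == c, expressed as two inequalities."""
    return [(a, b, c), (b, a, -c)]

variables = ["z", "in1", "out1", "in2", "out2", "in3", "out3"]
constraints = (
    eq("in1", "z", 0)            # entry block starts with nothing pushed
    + eq("out1", "in1", 4)       # block 1 pushes one DWORD
    + eq("in2", "out1", 0)       # edge 1 -> 2
    + [("out2", "in2", 24),      # block 2 pushes 24 bytes of arguments, then makes
       ("in2", "out2", 0)]       # an indirect call that may or may not remove them
    + eq("in3", "out2", 0)       # edge 2 -> 3
    + eq("out3", "in3", -4)      # block 3 pops one DWORD
    + eq("out3", "z", 0)         # terminal block ends balanced
)

dist = solve_difference_constraints(variables, constraints)
solution = {v: d - dist["z"] for v, d in dist.items() if v != "z"}
print(solution)
```

In this toy system the boundary conditions force `out2 - in2` to be 0, i.e. the solver infers that the indirect call must have removed its own arguments. A return value of `None` would indicate contradictory stack constraints.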

    Random Concept: Reduced Product

    Instead of performing analyses separately, allow them to interact => increased precision

    Suppose we perform several analyses, and the results for variable x at some point are:
      • [-10,6] (Interval)
      • 0+ (Sign)
      • Odd (Parity)

    Using the other domains, we can refine the interval abstraction:
      • Reduced product of ([-10,6], 0+) = ([0,6], 0+)
      • Reduced product of ([0,6], Odd) = ([1,5], Odd)
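The refinement in this example can be written as a small function. This sketch only tightens the interval component using the sign and parity facts; a full reduced product would also let the interval refine the other domains. The function name and the string spellings of the sign and parity values are assumptions.

```python
def reduce_interval(interval, sign=None, parity=None):
    """Tighten [lo, hi] using facts from the sign and parity domains.
    An empty result (lo > hi) means the combined facts are contradictory."""
    lo, hi = interval
    if sign == "0+":
        lo = max(lo, 0)
    elif sign == "0-":
        hi = min(hi, 0)
    elif sign == "0":
        lo, hi = max(lo, 0), min(hi, 0)
    if parity in ("odd", "even"):
        want = 1 if parity == "odd" else 0
        if lo % 2 != want:
            lo += 1          # smallest value of the right parity at or above lo
        if hi % 2 != want:
            hi -= 1          # largest value of the right parity at or below hi
    return lo, hi

print(reduce_interval((-10, 6), sign="0+"))                 # (0, 6)
print(reduce_interval((-10, 6), sign="0+", parity="odd"))   # (1, 5)
```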

    Act II

    New-School Program Analysis: SMT Solving

    Concept: Input Crafting via Theorem Proving

    Idea: convert portions of code into logical formulas, and use mathematically precise techniques to prove properties about them

    Example: what value must EAX have at the beginning of this snippet in order for EAX to be 0x12345678 after the snippet executes?

    IR to SMT Formula

    Part of the IR translation of the x86 snippet given on the previous slide.

    A slightly simplified (read: incorrect) SMT QF_EUFBV translation of the IR from the left.

    Ask a Question

    Given the SMT formula, initial EAX unspecified, is it possible that this postcondition is true?

      assert(T175d = 0x12345678);    (T175d is final EAX)

    The SMT solver outputs a model that satisfies the constraints.

    The first red line says that the formula is satisfiable, i.e., the answer is yes.

    The final red line says that the initial value of EAX must be 1450744509 or 0x56789ABD

    Automated Key Generator Generation

    As before, generate an execution trace (statically) and convert to IR. Then convert the IR to an SMT formula.

    Precondition:
      aActivationCode[0] = X && aActivationCode[1] = Y && aActivationCode[2] = Z
      where X = regcode[0], Y = regcode[1], Z = regcode[2], ...

    Postcondition:
      Stringderived[0] = '0' && Stringderived[1] = 'h' && Stringderived[2] = 'o' ...

    Example: Equivalence Checking for Error Discovery

    We employ a theorem prover (SMT solver) towards the problem of finding situations in which virtualization obfuscators produce incorrect translations of the input.

    Concept: Equivalence Checking

    Population counting, naïvely. Count the number of one-bits set.
      • Iterative bit-tests
      • Sequential ternary operator

    Population Count via Bit

    8-Bit Population Count via Bit

    Equivalence of Naïve and Bit
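For an 8-bit input, equivalence of two population-count routines can be illustrated by brute force. In the sketch below, exhaustive enumeration stands in for the theorem prover; this is a swapped-in technique that only scales to tiny widths, whereas an SMT solver reasons about the formula symbolically. Both routines are textbook formulations written for this example and are not copied from the slides.

```python
def popcount_naive(x: int) -> int:
    """Iterative bit-tests: examine each of the 8 bits in turn."""
    count = 0
    for i in range(8):
        if x & (1 << i):
            count += 1
    return count

def popcount_parallel(x: int) -> int:
    """Parallel summation: add neighbouring 1-, 2-, then 4-bit fields."""
    x = (x & 0x55) + ((x >> 1) & 0x55)
    x = (x & 0x33) + ((x >> 2) & 0x33)
    x = (x & 0x0F) + ((x >> 4) & 0x0F)
    return x

def find_counterexample(f, g, width: int = 8):
    """Return an input on which f and g disagree, or None if they are equivalent."""
    for x in range(1 << width):
        if f(x) != g(x):
            return x
    return None

print(find_counterexample(popcount_naive, popcount_parallel))  # None: equivalent
```

The query has the same shape as the solver-based one: ask for an input on which the two versions differ, and treat the absence of any such input as a proof of equivalence.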

    Verification of Deobfuscation

    Given some deobfuscation procedure, we want to ensure that the output is equivalent to the input

    Is this ... (1 of 2)

    Is this ... (2 of 2)

    ... Equivalent to This?

    Theorem prover says: YES, if we ignore the values below terminal ESP

    Inequivalence #1

    These sequences are INEQUIVALENT: the obfuscated version modifies the carry flag (with the add and sub instructions) before the inc takes place, and the inc instruction does not modify that flag.

      • Obfuscated version of inc dword handler.
      • Deobfuscated handler.
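The carry-flag discrepancy can be reproduced with a tiny model of machine state. This sketch assumes an 8-bit register plus a single carry flag, and the obfuscated sequence is a hypothetical `add k` / `sub k` / `inc` pattern, not the actual handler from the slide. Exhaustive search again stands in for the theorem prover.

```python
def op_add(state, imm):
    value, _cf = state
    total = value + imm
    return (total & 0xFF, int(total > 0xFF))     # add sets CF on unsigned overflow

def op_sub(state, imm):
    value, _cf = state
    return ((value - imm) & 0xFF, int(value < imm))   # sub sets CF on borrow

def op_inc(state):
    value, cf = state
    return ((value + 1) & 0xFF, cf)              # inc leaves CF untouched

def deobfuscated(state):
    return op_inc(state)

def obfuscated(state, k=0x5A):
    # Hypothetical junk: add k / sub k cancels out on the value but clobbers CF.
    return op_inc(op_sub(op_add(state, k), k))

def first_difference():
    """Search every (value, CF) input for one where the two versions disagree."""
    for value in range(256):
        for cf in (0, 1):
            if obfuscated((value, cf)) != deobfuscated((value, cf)):
                return (value, cf)
    return None

print(first_difference())
```

The two versions agree on the register value for every input and differ only in the flag, which is why such a bug is easy to miss by inspection and straightforward for an equivalence check to expose.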

    Inequivalence #2

    The sar instruction does not change the flags if the shiftand is zero, whereas the obfuscated handler does change the flags via the add instructions.

      • Obfuscated version of sar dword handler.
      • Deobfuscated handler.

    Inequivalence #3

    Can't show obfuscated version due to it being 82 instructions long.

    Obfuscated version writes to stack whereas deobfuscated version does not; therefore, the memory read on the last line could read a value below the stack pointer, which would be different in the obfuscated and deobfuscated version.

    Warning:

    I tried to make my presentation friendly; the literature does not make any such attempt

    References

    A program analysis reading list that I compiled
      http://www.reddit.com/r/ReverseEngineering/comments/smf4u/reverser_wanting_to_develop_mathematically/c4fa6yl

    Rolles: Switch as Binary Search
      https://www.openrce.org/blog/view/1319/
      https://www.openrce.org/blog/view/1320/

    Rolles: Control Flow Deobfuscation via Abstract Interpretation
      https://www.openrce.org/blog/view/1672/

    Rolles: Finding Bugs in VMs with a Theorem Prover
      https://www.openrce.org/blog/view/1963/

    Rolles: Semi-Automated Input Crafting
      https://www.openrce.org/blog/view/2049/

    Ilfak: Simplex Method in IDA Pro
      http://www.hexblog.com/?p=42

    Questions?

    Jamie Gamble, Sean