Finding Application Errors and Security Flaws Using PQL: A Program Query Language Michael Martin,...
-
date post
19-Dec-2015 -
Category
Documents
-
view
216 -
download
3
Transcript of Finding Application Errors and Security Flaws Using PQL: A Program Query Language Michael Martin,...
Finding Application Errors and Security Flaws Using PQL: A Program Query Language
Michael Martin, Benjamin Livshits,Michael Martin, Benjamin Livshits,
Monica S. LamMonica S. Lam
Stanford UniversityStanford University
OOPSLA 2005OOPSLA 2005
The Problem
Lots of bug-finding researchLots of bug-finding research Null dereferencesNull dereferences Buffer overrunsBuffer overruns Data racesData races
Many bugs application-specificMany bugs application-specific Misuse of librariesMisuse of libraries Violation of application logicViolation of application logic
Solution: Division of Labor
ProgrammerProgrammer Knows target programKnows target program Doesn’t know analysisDoesn’t know analysis
Program Analysis SpecialistsProgram Analysis Specialists Knows analysis Knows analysis Doesn’t know specific bugsDoesn’t know specific bugs
Give the programmer a usable analysisGive the programmer a usable analysis
Program Query Language
Queries operate on Queries operate on program tracesprogram traces Sequence of events representing a runSequence of events representing a run Refers to object Refers to object instancesinstances, not variables, not variables Events matched may be widely spacedEvents matched may be widely spaced
Patterns resemble Java codePatterns resemble Java code Like a small matching code snippetLike a small matching code snippet No references to compiler internalsNo references to compiler internals
System Architecture
Question
instrumenterProgram
static analyzer
PQL Query
PQL Engine
Instrumented Program
Optimized Instrumented
ProgramStatic Results
Complementary Approaches
Dynamic analysis: finds matches at run timeDynamic analysis: finds matches at run time After a match:After a match:
Can execute user codeCan execute user codeCan fix code by replacing instructionsCan fix code by replacing instructions
Static analysis: finds all possible matchesStatic analysis: finds all possible matches Conservative: can prove lack of matchConservative: can prove lack of match Results can optimize dynamic analysisResults can optimize dynamic analysis
Experimental Results Explored a wide range of PQL queriesExplored a wide range of PQL queries
Bad session stores (API violations)Bad session stores (API violations) SQL injections (security flaws)SQL injections (security flaws) Mismatched calls (API violations)Mismatched calls (API violations) Lapsed listeners (memory leaks)Lapsed listeners (memory leaks)
Automatically repaired and prevented many bugs at runtimeAutomatically repaired and prevented many bugs at runtime Fixed memory leaks, prevented security flawsFixed memory leaks, prevented security flaws
Runtime overhead is reasonableRuntime overhead is reasonable Overhead in the 9-125% rangeOverhead in the 9-125% range Static optimization removed 82-99% of instr. pointsStatic optimization removed 82-99% of instr. points
Found 206 bugs in 6 real-life Java applicationsFound 206 bugs in 6 real-life Java applications Eclipse, popular web applicationsEclipse, popular web applications 60,000 classes combined60,000 classes combined
Question
instrumenterProgram
static analyzer
PQL Query
PQL Engine
Instrumented Program
Optimized Instrumented
ProgramStatic Results
System Architecture
Running Example: SQL Injection
Unvalidated user input passed to a Unvalidated user input passed to a database back-end for executiondatabase back-end for execution
If SQL in embedded in the input, If SQL in embedded in the input, attacker can take over databaseattacker can take over database Unauthorized data readsUnauthorized data reads Unauthorized data modificationsUnauthorized data modifications
One of the top web security flawsOne of the top web security flaws
SQL Injection 1
HttpServletRequest req = /* ... */;HttpServletRequest req = /* ... */;
java.sql.Connection conn = /* ... */;java.sql.Connection conn = /* ... */;
String q = req.getParameter(“QUERY”);String q = req.getParameter(“QUERY”);
conn.execute(q);conn.execute(q);
11 CALLCALL oo11.getParameter(.getParameter(oo22))
22 RETRET oo22
33 CALLCALL oo33.execute(.execute(oo22))
44 RETRET oo44
SQL Injection 2String read() {String read() {
HttpServletRequest req = /* ... */;HttpServletRequest req = /* ... */;
return req.getParameter(“QUERY”);return req.getParameter(“QUERY”);
}}
/* ... *//* ... */
java.sql.Connection conn = /* ... */;java.sql.Connection conn = /* ... */;
conn.execute(read());conn.execute(read());
11 CALLCALL read()read()
22 CALLCALL oo11.getParameter(.getParameter(oo22))
33 RETRET oo33
44 RETRET oo33
55 CALLCALL oo44.execute(.execute(oo33))
66 RETRET oo55
Essence of Pattern the Same
1 1 CALLCALL oo11.getParameter(.getParameter(oo22))
2 2 RETRET oo33
3 3 CALLCALL oo44.execute.execute((oo33))
4 4 RETRET oo55
1 1 CALLCALL read()read()
2 2 CALLCALL oo11.getParameter(.getParameter(oo22))
3 3 RETRET oo33
4 4 RETRET oo33
5 5 CALLCALL oo44.execute(.execute(oo33))
6 6 RETRET oo55
The object The object returned by returned by getParametergetParameter is then is then argument 1 to argument 1 to executeexecute
Translates Directly to PQL
query main()query main()
uses String x;uses String x;matches {matches { x = HttpServletRequest.getParameter(_); x = HttpServletRequest.getParameter(_);
Connection.execute(x);Connection.execute(x);}}
Query variables Query variables heap objects heap objects Instructions need not be adjacent in Instructions need not be adjacent in
tracetrace
Alternation
query main()query main()
uses String x;uses String x;matches {matches { x = HttpServletRequest.getParameter(_) x = HttpServletRequest.getParameter(_)
| x = HttpServletRequest.getHeader(_);| x = HttpServletRequest.getHeader(_);
Connection.execute(x);Connection.execute(x);}}
SQL Injection 3
HttpServletRequest req = /* ... */;HttpServletRequest req = /* ... */;n = getParameter(“NAME”);n = getParameter(“NAME”);p = getParameter(“PASSWORD”);p = getParameter(“PASSWORD”);conn.execute(conn.execute(
““SELECT * FROM logins WHERE name=” + SELECT * FROM logins WHERE name=” + n +n +“ “ AND passwd=” + AND passwd=” + pp
););
Compiler translates string concatenation into Compiler translates string concatenation into operations on operations on StringString and and StringBufferStringBuffer objects objects
SQL Injection 311 CALLCALL oo11.getParameter(.getParameter(oo22))
22 RETRET oo33
33 CALLCALL oo11.getParameter(.getParameter(oo44))
44 RETRET oo55
55 CALLCALL StringBuffer.<init>(StringBuffer.<init>(oo66))
66 RETRET oo77
77 CALLCALL oo77.append(.append(oo88))
88 RETRET oo77
99 CALLCALL oo77.append(.append(oo33))
1010 RETRET oo77
1111 CALLCALL oo77.append(.append(oo99))
1212 RETRET oo77
1313 CALLCALL oo77.append(.append(oo55))
1414 RETRET oo77
1515 CALLCALL oo77.toString().toString()
1616 RETRET oo1010
1717 CALLCALL oo1111.execute(.execute(oo1010))
1818 RETRET oo1212
Old Pattern Doesn’t Work
11 CALLCALL oo11.getParameter(.getParameter(oo22))
22 RETRET oo33
33 CALLCALL oo11.getParameter(.getParameter(oo44))
44 RETRET oo55
1717 CALLCALL oo1111.execute(.execute(oo1010))
oo1010 is neither is neither oo33 nor nor oo55, so no match, so no match
Instance of Tainted Data Problem
User-controlled input must be trapped and User-controlled input must be trapped and validated before being passed to databasevalidated before being passed to database
Objects derived from an initial input must also Objects derived from an initial input must also be considered user controlledbe considered user controlled
Generalizes to many security problems: Generalizes to many security problems: cross-site scripting, path traversal, response cross-site scripting, path traversal, response splitting, format string attacks...splitting, format string attacks...
SensitiveComponent
o1 o2 o3
Pattern Must Catch Derived Strings
11 CALLCALL oo11.getParameter(.getParameter(oo22))
22 RETRET oo33
33 CALLCALL oo11.getParameter(.getParameter(oo44))
44 RETRET oo55
99 CALLCALL oo77.append(.append(oo33))
1010 RETRET oo77
1111 CALLCALL oo77.append(.append(oo99))
1212 RETRET oo77
1515 CALLCALL oo77.toString().toString()
1616 RETRET oo1010
1717 CALLCALL oo1111.execute(.execute(oo1010))
Pattern Must Catch Derived Strings
11 CALLCALL oo11.getParameter(.getParameter(oo22))
22 RETRET oo33
33 CALLCALL oo11.getParameter(.getParameter(oo44))
44 RETRET oo55
99 CALLCALL oo77.append(.append(oo33))
1010 RETRET oo77
1111 CALLCALL oo77.append(.append(oo99))
1212 RETRET oo77
1515 CALLCALL oo77.toString().toString()
1616 RETRET oo1010
1717 CALLCALL oo1111.execute(.execute(oo1010))
Pattern Must Catch Derived Strings
11 CALLCALL oo11.getParameter(.getParameter(oo22))
22 RETRET oo33
33 CALLCALL oo11.getParameter(.getParameter(oo44))
44 RETRET oo55
99 CALLCALL oo77.append(.append(oo33))
1010 RETRET oo77
1111 CALLCALL oo77.append(.append(oo99))
1212 RETRET oo77
1515 CALLCALL oo77.toString().toString()
1616 RETRET oo1010
1717 CALLCALL oo1111.execute(.execute(oo1010))
Derived String Query
query derived (Object x)query derived (Object x)
uses Object temp;uses Object temp;
returns Object d;returns Object d;
matches {matches {
{ temp.append(x); d := derived(temp); }{ temp.append(x); d := derived(temp); }
| { temp = x.toString(); d := derived(temp); }| { temp = x.toString(); d := derived(temp); }
| { d := x; }| { d := x; }
}}
New Main Query
query main()query main()uses String xuses String x, final, final;;matches {matches { x = HttpServletRequest.getParameter(_) x = HttpServletRequest.getParameter(_)
| x = HttpServletRequest.getHeader(_);| x = HttpServletRequest.getHeader(_);
final := derived(x);final := derived(x);
Connection.execute(final);Connection.execute(final);}}
Defending Against Attacks
query main()query main()uses String x, final;uses String x, final;matches {matches { x = HttpServletRequest.getParameter(_) x = HttpServletRequest.getParameter(_)
| x = HttpServletRequest.getHeader(_);| x = HttpServletRequest.getHeader(_);
final := derived(x);final := derived(x);
}}
replaces Connection.execute(final) withreplaces Connection.execute(final) with
SQLUtil.safeExecute(x, final);SQLUtil.safeExecute(x, final);
Sanitizes user-derived inputSanitizes user-derived input Dangerous data cannot reach the databaseDangerous data cannot reach the database
Other PQL Constructs
Partial orderPartial order { x.a(), x.b(), x.c(); }{ x.a(), x.b(), x.c(); } Match calls to Match calls to aa, , bb, and , and cc on on xx in in anyany
order.order. Forbidden EventsForbidden Events
Example: double-lockExample: double-lockx.lock(); ~x.unlock(); x.lock();x.lock(); ~x.unlock(); x.lock();
Single statements onlySingle statements only
Expressiveness
Concatenation + alternation = Loop-free Concatenation + alternation = Loop-free regexpregexp
+ Subqueries = CFG+ Subqueries = CFG + Partial Order = CFG + Intersection+ Partial Order = CFG + Intersection Quantified over heapQuantified over heap
Each subquery independentEach subquery independent Existentially quantifiedExistentially quantified
Question
instrumenterProgram
static analyzer
PQL Query
PQL Engine
Instrumented Program
Optimized Instrumented
ProgramStatic Results
System Architecture
Dynamic Matcher
Subquery Subquery state machine state machine Call to subquery Call to subquery new instance of new instance of
machinemachine States carry “bindings” with themStates carry “bindings” with them
Query variables Query variables heap objects heap objects Bindings are acquired as the Bindings are acquired as the
variables are referenced for the first variables are referenced for the first time in a matchtime in a match
Query to Translate
query main()query main()uses Object x, final;uses Object x, final;matches {matches { x = getParameter(_) | x = getHeader(); x = getParameter(_) | x = getHeader();
f := derived (x); execute (f);f := derived (x); execute (f);}}
query derived(Object x)query derived(Object x)uses Object t;uses Object t;returns Object y;returns Object y;matches { matches {
{ y := x; }{ y := x; } || { t = x.toString(); y := derived(t); }{ t = x.toString(); y := derived(t); } | { t.append(x); y := derived(t); }| { t.append(x); y := derived(t); }}}
main() Query Machine
*
* *
x = getParameter(_) x = getHeader(_)
f := derived(x)
execute(f)
derived() Query Machine
t=x.toString()
t.append(x)
y := x
y := derived(t)
y := derived(t)
*
*
Example Program Trace
oo1 1 = getHeader(= getHeader(oo22))
oo33.append(.append(oo11))
oo33.append(.append(oo44))
oo55 = = execute(execute(oo33))
main(): Top Level Match
*
* *
x = getParameter(_) x = getHeader(_)
f := derived(x)
execute(f)
{ }
{ } { }
{ x=o1 }
{ x=o1 }1
oo11 = getHeader(o = getHeader(o22))
derived(): call 1
t=x.toString()
t.append(x)
y := x
y := derived(t)
y := derived(t)
*
*{x=o1}
{x=o1}
{x=o1}
{x=o1} {x=y=o1}{x=y=o1}
oo11 = getHeader(o = getHeader(o22))
main(): Top Level Match
*
* *
x = getParameter(_) x = getHeader(_)
f := derived(x)
execute(f)
{ }
{ } { }
{ x=o1 }
{ x=o1 }1
oo11 = getHeader(o = getHeader(o22)) {x=o1,f=o1}oo33.append(o.append(o11))
derived(): call 1
t=x.toString()
t.append(x)
y := x
y := derived(t)
y := derived(t)
*
*{x=o1}
{x=o1}
{x=o1}
{x=o1}
oo11 = getHeader(o = getHeader(o22))oo33.append(o.append(o11))
{x=o1, t=o3}2
{x=y=o1} {x=y=o1}
derived(): call 2
t=x.toString()
t.append(x)
y := x
y := derived(t)
y := derived(t)
*
*
oo11 = getHeader(o = getHeader(o22))oo33.append(o.append(o11))
{x=o3}
{x=o3}
{x=o3}
{x=o3}
{x=y=o3}{x=y=o3}
derived(): call 1
t=x.toString()
t.append(x)
y := x
y := derived(t)
y := derived(t)
*
*{x=o1}
{x=o1}
{x=o1}
{x=o1}
oo11 = getHeader(o = getHeader(o22))oo33.append(o.append(o11))
{x=o1, t=o3}2
{x=y=o1}
{x=y=o1}
{x=o1, y=t=o3}
{x=o1, y=t=o3}
main(): Top Level Match
*
* *
x = getParameter(_) x = getHeader(_)
f := derived(x)
execute(f)
{ }
{ } { }
{ x=o1 }
{ x=o1 }1
oo11 = getHeader(o = getHeader(o22)) {x=o1,f=o1}oo33.append(o.append(o11))oo33.append(o.append(o44))
oo55 = execute(o = execute(o33)) {x=o1,f=o3}
, {x=o1,f=o3}
Find Relevance Fast
Hash map for each transitionHash map for each transition Every call-instance combinedEvery call-instance combined
Key on known-used variablesKey on known-used variables All used variables known-used All used variables known-used one one
lookup per transitionlookup per transition
Question
instrumenterProgram
static analyzer
PQL Query
PQL Engine
Instrumented Program
Optimized Instrumented
ProgramStatic Results
System Architecture
Static Analysis
““Can this program match this query?”Can this program match this query?” Undecidable in generalUndecidable in general We give a conservative approximationWe give a conservative approximation No matches found = None possibleNo matches found = None possible
Static Analysis
PQL query automatically translated to query on PQL query automatically translated to query on pointer analysis resultspointer analysis results
Pointer analysis is Pointer analysis is sound sound and and context-sensitivecontext-sensitive 10101414 contexts in a good-sized application contexts in a good-sized application Exponential space represented with BDDsExponential space represented with BDDs Analyses given in DatalogAnalyses given in Datalog
Whaley/Lam, PLDI 2004 (Whaley/Lam, PLDI 2004 (bddbddbbddbddb) for details) for details
Static Results
Sets of objects and events that could Sets of objects and events that could represent a matchrepresent a match
OROR Program points that could participate in Program points that could participate in
a matcha match
No results = no match possible!No results = no match possible!
Question
instrumenterProgram
static analyzer
PQL Query
PQL Engine
Instrumented Program
Optimized Instrumented
ProgramStatic Results
System Architecture
Optimizing the Dynamic Analysis
Static results conservativeStatic results conservative So, point not in result So, point not in result point never point never
in any matchin any match So, no need to instrumentSo, no need to instrument
Usually more than 90% reductionUsually more than 90% reduction
Experiment Topics
Domain of Java Web applicationsDomain of Java Web applications Serialization errorsSerialization errors SQL injectionSQL injection
Domain of Eclipse IDE pluginsDomain of Eclipse IDE plugins API violationsAPI violations Memory leaksMemory leaks
Experiment Summary
NameName ClassesClasses Inst PtsInst Pts BugsBugs
webgoatwebgoat 1,0211,021 6969 22
personalblogpersonalblog 5,2365,236 3636 22
road2hibernateroad2hibernate 7,0627,062 779779 11
snipsnapsnipsnap 10,85110,851 543543 88
rollerroller 16,35916,359 00 11
EclipseEclipse 19,43919,439 18,15218,152 192192
TOTALTOTAL 59,96859,968 19,57919,579 206206
Session Serialization Errors
Very common bug in Web applicationsVery common bug in Web applications Server tries to persist non-persistent objectsServer tries to persist non-persistent objects
Only manifests under heavy loadOnly manifests under heavy load Hard to find with testingHard to find with testing
One-line query in PQLOne-line query in PQL HttpSession.setAttribute(_,!Serializable);HttpSession.setAttribute(_,!Serializable);
Solvable purely staticallySolvable purely statically Dynamic confirmation possibleDynamic confirmation possible
SQL Injection
Our running exampleOur running example Static optimizes greatlyStatic optimizes greatly
92%-99.8% reduction of points92%-99.8% reduction of points 2-3x speedup2-3x speedup
4 injections, 2 exploitable4 injections, 2 exploitable Blocked both exploitsBlocked both exploits
Further applications and an improved Further applications and an improved static analysis in Usenix Security ’05static analysis in Usenix Security ’05
Full derived() Queryquery StringProp (Object * x)query StringProp (Object * x)returns Object y;returns Object y;uses Object z;uses Object z;matches {matches { y.append(x, ...)y.append(x, ...)| x.getChars(_, _, y, _)| x.getChars(_, _, y, _)| y.insert(_, x)| y.insert(_, x)| y.replace(_, _, x)| y.replace(_, _, x)| y = x.substring(...)| y = x.substring(...)| y = new java.lang.String(x)| y = new java.lang.String(x)| y = new java.lang.StringBuffer(x)| y = new java.lang.StringBuffer(x)| y = x.toString()| y = x.toString()| y = x.getBytes(...)| y = x.getBytes(...)| y = _.copyValueOf(x)| y = _.copyValueOf(x)| y = x.concat(_)| y = x.concat(_)| y = _.concat(x)| y = _.concat(x)| y = new java.util.StringTokenizer(x)| y = new java.util.StringTokenizer(x)| y = x.nextToken()| y = x.nextToken()| y = x.next()| y = x.next()| y = new java.lang.Number(x)| y = new java.lang.Number(x)| y = x.trim()| y = x.trim()| { z = x.split(...); y = z[]; }| { z = x.split(...); y = z[]; }| y = x.toLowerCase(...)| y = x.toLowerCase(...)| y = x.toUpperCase(...)| y = x.toUpperCase(...)| y = _.replaceAll(_, x)| y = _.replaceAll(_, x)| y = _.replaceFirst(_, x);| y = _.replaceFirst(_, x);}}
query derived (Object x)query derived (Object x)returns Object y;returns Object y;uses Object temp;uses Object temp;matches {matches { y := xy := x| { temp = StringProp(x);| { temp = StringProp(x); y := derived(temp); }y := derived(temp); }}}
Eclipse
IDE for JavaIDE for Java Very large (tens of MB of bytecode)Very large (tens of MB of bytecode)
Too large for our static analysisToo large for our static analysis Purely interactivePurely interactive
Unoptimized dynamic overhead Unoptimized dynamic overhead acceptableacceptable
Queries on Eclipse
Paired MethodsPaired Methods register/deregisterregister/deregister createWidget/destroyWidgetcreateWidget/destroyWidget install/uninstallinstall/uninstall startup/shutdownstartup/shutdown
Lapsed ListenersLapsed Listeners
Eclipse Results
All paired methods queries were run All paired methods queries were run simultaneouslysimultaneously 56 mismatches detected56 mismatches detected
Lapsed listener query was run aloneLapsed listener query was run alone 136 lapsed listeners136 lapsed listeners Can be automatically fixedCan be automatically fixed
Current Status
Open source and hosted on Open source and hosted on SourceForgeSourceForge
httphttp://pql.sf.://pql.sf.netnet – standalone dynamic – standalone dynamic implementationimplementation
Related work
PQL is a PQL is a query languagequery language JQueryJQuery
on on program tracesprogram traces Partiqle, Dalek, ...Partiqle, Dalek, ...
Observing Observing behaviorbehavior and and finding bugsfinding bugs Metal, Daikon, PREfix, Clouseau, ...Metal, Daikon, PREfix, Clouseau, ...
and and automatically add code to fix themautomatically add code to fix them AspectJAspectJ
Conclusions PQL – a Program Query LanguagePQL – a Program Query Language
Match histories of sets of objects on a program traceMatch histories of sets of objects on a program trace Targeting application developersTargeting application developers
Found many bugsFound many bugs 206 application bugs and security flaws206 application bugs and security flaws 6 large real-life applications6 large real-life applications
PQL provides a bridge to powerful analysesPQL provides a bridge to powerful analyses Dynamic matcherDynamic matcher
Point-and-shoot even for unknown applicationsPoint-and-shoot even for unknown applications Automatically repairs program on the flyAutomatically repairs program on the fly
Static matcherStatic matcher Proves absence of bugsProves absence of bugs Can reduce runtime overhead to production-acceptableCan reduce runtime overhead to production-acceptable