Enhancing Security of Real-World Systems with a Better Understanding of the Threats
Shuo Chen
Candidate of Ph.D. in Computer Science
Center for Reliable and High Performance Computing
Coordinated Science Laboratories
University of Illinois at Urbana-Champaign

Security Threat Analysis and Mitigations in Real-World Systems
– Investigate the impact of hardware memory errors on the security of Internet servers and firewalls.
    • Simulate random hardware memory errors
    • Stochastic model to estimate the probability of security violations.
– Analyze and model a wide spectrum of software security vulnerabilities reported by CERT and Bugtraq.
    • Decompose each vulnerability into many primitive operations.
    • Introduce formalism into the reasoning about and description of real vulnerabilities.
    • Interesting outcome: discovered a new security bug in an HTTP server, now published in Bugtraq.
– Construct non-traditional methods to attack major Internet server programs without being detected by most current defense techniques. This represents a new challenge for defense research.
– Develop techniques to provide better security protection for real-world systems
    • A theorem proving based code analysis
    • A processor architecture level runtime defense
[Slide labels marking groups of items: Focus of this talk; My Dissertation; Earlier work]

PART I: Analyzing and Identifying Security Threats on Real-World Software

Significance of Memory Vulnerabilities
CERT Advisories: 66% of vulnerabilities are low level memory errors in software.
Widely exploited by attackers, worms and viruses.
[Pie chart: CERT advisories by vulnerability type]
    Buffer Overflow    44%
    Heap Corruption     8%
    Format String       7%
    Integer Overflow    6%
    Globbing            2%
    Other              33%

Widely Understood Threats of Memory Corruptions
Once a memory error is found, it is straightforward to take control of the victim system by control-hijacking attacks.
– First, overwrite control data, such as return addresses, function pointers, GOT entries or DTOR entries.
– Program control is hijacked to execute code with malicious purposes.
– The malicious code is able to make system calls with the privilege of the victim process and do real damage to the system.

Current Techniques to Defeat Memory Corruption Attacks
Control hijacking is the most dominant form of memory corruption attacks (CERT and Microsoft Security Bulletin)
Accordingly, many current defense techniques are designed to enforce program control flow integrity in order to provide software security. This research area has been active for many years.
A common justification: attacks not hijacking program control flow (i.e., non-control-hijacking attacks) are rare against real-world software.
Important question:
– How confidently can we rely on this justification to build defenses?
– Is it possible that people currently underestimate the real threats of memory corruption attacks?
– Specifically, does the dominance of control-hijacking attacks reflect attackers' incapability or their lack of incentive to mount non-control-hijacking attacks?

Our Claim: General Applicability of Non-control-hijacking Attacks
Our previous papers suggest an initial doubt
– Even random hardware memory errors can subvert the security of real-world systems with a non-negligible probability. None of the compromises is due to control hijacking.
– Software vulnerabilities are more deterministic and more amenable to attacks. Why would attackers be incapable of mounting non-control-hijacking attacks against real-world systems?
We make a hypothetical claim:
– Many real-world software applications are susceptible to non-control-hijacking attacks;
– The severity of the attack consequences is equivalent to that due to control hijacking attacks.
If the claim is indeed true, it represents a new challenge to defense techniques.

Goal: Empirical Validation of the Claim
Investigate many "representative software applications". Try to break into them using non-control-hijacking attacks.
Choose representative software applications
– We did a quick survey of the most recent four years of CERT advisories. Over 1/3 of the vulnerabilities are in FTP, SSH, Telnet and HTTP servers.
Construct non-control-hijacking attacks to compromise these servers. Each attack results in the root compromise of the victim server.

Non-control-hijacking attack on WU-FTP Server (via a format string bug)
    int x;
    FTP_service(...) {
      authenticate();
      x = user ID of the authenticated user;
      seteuid(x);
      while (1) {
        get_FTP_command(...);
        if (a data command?)
          getdatasock(...);
      }
    }
    getdatasock(...) {
      seteuid(0);
      setsockopt(...);
      seteuid(x);
    }
State of x and the effective user ID (EUID) as the attack proceeds:
– Before authentication: x uninitialized, run as EUID 0.
– After x is assigned: x=109, run as EUID 0.
– After seteuid(x): x=109, run as EUID 109. Lose the root privilege!
– Get a special SITE EXEC command. Exploit a format string vulnerability. x=0, still run as EUID 109.
– Get a data command (e.g., PUT).
– In getdatasock(), after seteuid(0): x=0, run as EUID 0.
– In getdatasock(), after seteuid(x): x=0, run as EUID 0.
– When returning to the service loop, the server still runs as EUID 0 (root). This allows me to upload /etc/passwd, so I can grant myself the root privilege!
Only corrupt an integer, not control hijacking.
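
The privilege sequence on this slide can be replayed in a small simulation. The sketch below is an illustration only, not WU-FTP code: `sim_euid`, `sim_seteuid`, `sim_getdatasock` and `run_session` are hypothetical names, no real privileges are changed, and the format string write is modeled as a plain assignment to `x`.

```c
/* Simulated process state; no real privileges are touched. */
static int sim_euid = 0;   /* the server starts as root (EUID 0) */
static int x;              /* cached user ID: the non-control data under attack */

/* Hypothetical stand-in for seteuid(): only updates the simulated EUID. */
static void sim_seteuid(int uid) { sim_euid = uid; }

/* Mirrors getdatasock() on the slide: raise to root, then restore from x. */
static void sim_getdatasock(void) {
    sim_seteuid(0);
    /* setsockopt(...) would run here with root privilege */
    sim_seteuid(x);
}

/* Runs one session and returns the EUID the service loop ends up with.
 * If attack is nonzero, x is overwritten with 0 between authentication
 * and the data command, standing in for the SITE EXEC format string write. */
int run_session(int authenticated_uid, int attack) {
    sim_euid = 0;
    x = authenticated_uid;
    sim_seteuid(x);        /* drop the root privilege */
    if (attack)
        x = 0;             /* the single corrupted integer */
    sim_getdatasock();     /* data command such as PUT */
    return sim_euid;
}
```

Calling `run_session(109, 0)` leaves the simulated server at EUID 109, while `run_session(109, 1)` leaves it at EUID 0, even though no return address or function pointer was modified.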

Non-control-hijacking attack on NULL-HTTP Server (via a heap overflow bug)
Attack the configuration string of CGI-BIN path.
Mechanism of CGI
– Suppose server name = www.foo.com, CGI-BIN = /usr/local/httpd/exe
– Requested URL = http://www.foo.com/cgi-bin/bar
– The server executes /usr/local/httpd/exe/bar
Our attack
– Exploit a heap overflow vulnerability to overwrite CGI-BIN to /bin
– Request URL http://www.foo.com/cgi-bin/sh
– The server executes /bin/sh
The server gives me a root shell! Only overwrite four characters in the CGI-BIN string. Not control hijacking.
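
To make the mechanism concrete, the following sketch models the request buffer and the CGI-BIN setting as two regions of one byte array, so that an over-long copy stays inside memory the program owns. It is not NULL-HTTP code: the `server` struct, the buffer sizes and all function names are assumptions chosen for illustration.

```c
#include <stdio.h>
#include <string.h>

#define REQ_SIZE 16   /* hypothetical size of the request buffer */
#define CFG_SIZE 32   /* hypothetical size of the CGI-BIN setting */

/* One arena stands in for two adjacent heap objects: bytes
 * [0, REQ_SIZE) hold request data, the rest holds the CGI-BIN path. */
struct server {
    char arena[REQ_SIZE + CFG_SIZE];
};

void server_init(struct server *s) {
    memset(s->arena, 0, sizeof s->arena);
    strcpy(s->arena + REQ_SIZE, "/usr/local/httpd/exe");
}

/* Buggy copy: trusts len instead of REQ_SIZE, so a long request
 * spills into the configuration string. The clamp only keeps the
 * simulation inside the arena and preserves its final NUL byte. */
void receive_request(struct server *s, const char *data, size_t len) {
    if (len > sizeof s->arena - 1)
        len = sizeof s->arena - 1;
    memcpy(s->arena, data, len);
}

/* Builds the path of the program the server would execute. */
void resolve_cgi(const struct server *s, const char *name,
                 char *out, size_t outsz) {
    snprintf(out, outsz, "%s/%s", s->arena + REQ_SIZE, name);
}
```

With an untouched configuration, resolving `bar` gives `/usr/local/httpd/exe/bar`. After a 21-byte request made of sixteen filler bytes followed by `/bin` and a terminating NUL, resolving `sh` gives `/bin/sh`, with program control flow unchanged.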

Non-control-hijacking attack on SSH Communications SSH Server (via an integer overflow bug)
    void do_authentication(char *user, ...) {
      int auth = 0;
      ...
      while (!auth) {
        /* Get a packet from the client */
        type = packet_read();
        switch (type) {
          ...
          case SSH_CMSG_AUTH_PASSWORD:
            if (auth_password(user, password))
              auth = 1;
          case ...
        }
        if (auth) break;
      }
      /* Perform session preparation. */
      do_authenticated(...);
    }
Value of auth as the attack proceeds:
– auth = 0
– auth = 0
– auth = 1
– Password incorrect, but auth = 1
– auth = 1
– Logged in without correct password

More non-control-hijacking attacks
Against NetKit Telnet server (default Telnet server of Redhat Linux)
– Exploit a heap overflow bug
– Overwrite two strings:
    /bin/login -h foo.com -p   (normal scenario)
    /bin/sh -h -p -p           (attack scenario)
– The server runs /bin/sh when it tries to authenticate the user.
Against GazTek HTTP server
– Exploit a stack buffer overflow bug
    • Send a legitimate URL http://www.foo.com/cgi-bin/bar
    • The server checks that "/.." is not embedded in the URL
    • Exploit the bug to change the URL to http://www.foo.com/cgi-bin/../../../../bin/sh
    • The server executes /bin/sh

Implications of Non-Control-Hijacking Attacks
Control flow integrity is not a sufficiently accurate approximation to software security.
– Given a memory bug in real software, attackers' behaviors can be very diversified.
Although non-control-hijacking attacks are specific to application semantics, there are many types of non-control data critical to software security
– E.g., user identity data, configuration data, user input data and decision-making Booleans.
Once attackers have the incentive, they are likely to succeed in non-control-hijacking attacks.

Re-Examining Current Defense Techniques
They were mainly tested against control-hijacking attacks. Need to re-examine the effectiveness.
– Many of them are based on control flow integrity
    • Monitor system call sequence
    • Protect control data
    • Non-executable stack and heap
– Pointer encryption (PointGuard)
    • Need to encrypt pointers in libraries to be effective (challenging because of insufficient type information, frequent type casting, and performance).
– Address space randomization
    • Good idea. In each run of the program, memory layout is different.
    • Challenging to deploy on all program segments.
    • Even if every segment is randomized, a recent paper shows the deployment on 32-bit address space doesn't provide enough entropy.
– StackGuard, Libsafe and FormatGuard
    • They are specific to defeating stack smashing attacks and format string attacks. Not generic solutions.
Building a generic and secure defense technique to defeat memory corruption attacks is still an open problem.
Future defense research should consider non-control-hijacking attacks more seriously.

PART II: Pointer Taintedness Detection: Towards a Better Security Protection for Real-World Systems

Pointer Taintedness
Pointer Taintedness: a pointer value, including a return address, is derived from user input.
Most memory corruption attacks are due to pointer taintedness.
– It allows attackers to specify the memory locations to read, write or transfer control to. Usually a pathological program behavior.
Pointer taintedness provides a unifying perspective for reasoning about a significant number of security vulnerabilities.

Most Memory Corruption Attacks are Due to Pointer Taintedness
Format string attack
– Taint an argument pointer of functions such as printf, fprintf, sprintf and syslog.
Stack buffer overflow (stack smashing)
– Taint a function frame pointer or a return address.
Heap corruption
– Taint the free-chunk doubly-linked list of the heap.
Glibc globbing attack
– User input resides in a location that is used as a pointer by the parent function of glob().

Stack Buffer Overflow
Vulnerable code:
    char buf[100];
    strcpy(buf, user_input);
[Stack diagram, from high to low addresses (the stack grows toward low addresses): Return addr, Frame pointer, buf[99], ..., buf[1], buf[0]; user_input is copied into buf]
Frame pointer or return address can be tainted.
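
For contrast with the unchecked `strcpy` above, here is a minimal sketch of a length-checked copy. The helper name `copy_input` and its return convention are assumptions made for this illustration; they do not come from the talk.

```c
#include <string.h>

/* Copies src into dst only if it fits, including the terminating NUL.
 * Returns 0 on success. Returns -1 otherwise, leaving dst as an empty
 * string when dst has room for at least one byte. */
int copy_input(char *dst, size_t dstsz, const char *src) {
    size_t n = strlen(src);
    if (dstsz == 0)
        return -1;
    if (n >= dstsz) {
        dst[0] = '\0';
        return -1;
    }
    memcpy(dst, src, n + 1);   /* n + 1 also copies the NUL terminator */
    return 0;
}
```

With `char buf[100]`, a call such as `copy_input(buf, sizeof buf, user_input)` rejects any input of 100 characters or more, so nothing is written past `buf[99]` and the saved frame pointer and return address stay untainted.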

Format String Attack
Vulnerable code:
    recv(buf);
    printf(buf);   /* should be printf("%s", buf) */
Attacker input: \xdd \xcc \xbb \xaa %d %d %d %n
In vfprintf(), if (fmt points to "%n") then **ap = (character count)
[Stack diagram, from high to low addresses (the stack grows toward low addresses): ..., %n, %d, %d, %d, 0xaabbccdd; fmt is the format string pointer and ap is the argument pointer]
*ap is a tainted value.
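
The fix noted in the code comment above, passing user data as an argument rather than as the format, can be sketched as follows. The wrapper name `log_user_text` is hypothetical; `snprintf` is the standard C library function.

```c
#include <stdio.h>

/* Formats untrusted text as data: the fixed "%s" format means any
 * directives inside user_text (such as %d or %n) are copied verbatim
 * and never interpreted. Returns the value returned by snprintf. */
int log_user_text(char *out, size_t outsz, const char *user_text) {
    return snprintf(out, outsz, "%s", user_text);
}
```

Given the input `%d %d %d %n`, the output buffer holds exactly those eleven characters, and the argument pointer is never advanced over attacker-controlled stack contents.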

Heap Corruption Attack
Vulnerable code:
    buf = malloc(1000);
    recv(sock, buf, 1024);
    free(buf);
[Heap diagram: Free chunk A; Free chunk B (fd=A, bk=C); Allocated buffer buf (receives user input); Free chunk C]
In free():
    B->fd->bk = B->bk;
    B->bk->fd = B->fd;
When B->fd and B->bk are tainted, the effect of free() is to write a user specified value to a user specified address.
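
The two assignments in `free()` can be exercised in isolation. The sketch below uses a simplified `struct chunk` with only the two link fields, which is an assumption for illustration; real allocator chunk headers carry more fields, so the offsets an attacker must account for differ.

```c
#include <stddef.h>

/* Simplified free-chunk header with forward and backward links. */
struct chunk {
    struct chunk *fd;
    struct chunk *bk;
};

/* The two assignments shown for free(): remove b from its list.
 * Nothing here checks that b->fd and b->bk really are neighbours of b. */
void unlink_chunk(struct chunk *b) {
    b->fd->bk = b->bk;
    b->bk->fd = b->fd;
}
```

In a well-formed list A, B, C, unlinking B leaves `A.bk == &C` and `C.fd == &A`. If instead `B.fd` and `B.bk` have been overwritten to point at two arbitrary objects, the same two statements store into those objects, which is the "user specified value to a user specified address" write described above.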

Building Defense Techniques based on Pointer Taintedness
Static code analysis: analyze the source code to extract the conditions under which the possibility of pointer taintedness exists.
– To uncover potential vulnerabilities
Runtime detection: monitor at runtime whether a tainted value is dereferenced as a pointer.
– To defeat memory corruption attacks (both control-hijacking and non-control-hijacking attacks)

Project A
Formal Reasoning about Pointer Taintedness: To Extract Security Specifications of Library Functions

Project Overview
Our analysis on CERT advisories shows
– A significant portion of vulnerabilities (33.6%) are due to errors in library functions or incorrect invocations of library functions.
– Need a more rigorous reasoning on library function specifications.
Library function specifications are currently ad hoc. Many of them are specified after real attacks are discovered.
– printf(fmt,…): fmt cannot be a user-specified string
– strcpy(d,s): the length of string s should not exceed the size of buffer d, and d and s cannot overlap.
– d = savestr(s): do not free d if this is not the first invocation of savestr.
– free(p): p must be a pointer obtained from a previous malloc; p cannot have been freed before.
– glob(p): p cannot be a string starting with '~' and ending with '{'.
What is a unified reason why these specifications are required?
– Answer: they are required to eliminate the possibility of pointer taintedness.
Extraction of security specifications of a function is reduced to a theorem proving problem: under which conditions can a function eliminate the possibility of pointer taintedness.
– I develop an equational logic based theorem proving approach to extract security specifications.

Extracting Function Specifications by Theorem Prover
[Flow diagram, in order]
– C source code of a library function
– Automatically translated to formal semantic representation
– Theorem generation: for each pointer dereference in an assignment, generate a theorem stating that the pointer is not tainted
– Theorem proving
– A set of sufficient conditions that imply the validity of the theorems. They are the security specifications of the analyzed function.

Example: vfprintf()
    int vfprintf (FILE *s, const char *format, va_list ap)
    {
      char *p, *q;
      int done, data, n, state;
      char buf[10];
      p = format; done = 0;
      if (p == NULL) return 0;
      state = NO_PENDING;
      while (*p != 0) {
        if (state == NO_PENDING) {
          if (*p == '%') state = PENDING;
          else outchar(s, *p);
        }
        else {
          switch (*p) {
            case '%': outchar(s, '%');
                      break;
            case 'd': data = va_arg(ap, int);
                      if (data < 0) { outchar(s, '-'); data = -data; }
                      n = 0;
                      while (data > 0 && n < 10) {
                        buf[n] = data % 10 + '0';   /* Theorem1: buf+n should not be a tainted value */
                        data /= 10;
                        n++;
                      }
                      while (n > 0) { n--; outchar(s, buf[n]); }
                      break;
            case 's': q = va_arg(ap, char *);
                      if (q == NULL) break;
                      while (*q != 0) {
                        outchar(s, *q);
                        q++;
                      }
                      break;
            case 'n': q = va_arg(ap, void *);
                      *(int *) q = done;            /* Theorem2: q should not be a tainted value */
                      break;
            default:  outchar(s, *p);
          }
          state = NO_PENDING;
        }
        p++;
      }
      return done;
    }

Extracting the Specifications of vfprintf()
Try to prove the two theorems
– Initially, the theorem prover cannot complete the proof, because the theorems are only valid under certain preconditions.
– Add these preconditions as axioms to the theorem prover.
– Repeat the above step until the theorems are proved.
Finally, the following four preconditions are added, which are the specifications of vfprintf (FILE *s, const char *format, va_list ap)
– ap never points to any location within the current function frame.
– *ap never points to the location of variable ap, i.e., *ap ≠ &ap
– Suppose the memory segment that ap sweeps over is called ap_activitiy_range, then *ap never points to any location within ap_activitiy_range.
– No locations within ap_activitiy_range are tainted before vfprintf() is called.
[Callout: Suggest the scenario of format string vulnerability]

Other Studied Examples
Function strcpy()
– Four security specifications indicating buffer overflow, buffer overlapping and buffer underflow scenarios causing pointer taintedness.
Function free() of a heap management system
– Seven security specifications are extracted, including several specifications indicating heap corruption vulnerabilities.
Socket read functions of Apache HTTPD and NULL HTTPD
– The Apache function is proven to be free of pointer taintedness.
– Two (known) vulnerabilities are exposed in the theorem proving process of NULL HTTPD function.

Project B
Runtime Pointer Taintedness Detection: To Defeat Memory Corruption Attacks

Project Overview
We propose a processor architectural level mechanism to detect pointer taintedness
– Implemented on SimpleScalar simulator
– An extended memory system with a taintedness bit attached to every byte
– Enhanced load, store and ALU instructions to track taintedness bits in memory
– Detecting security attacks when tainted data are dereferenced.
Evaluation
– It detects both control hijacking and non-control-hijacking attacks against real-world software.
– No known false positive: no alarm during normal executions of network servers and SPEC benchmarks. Fully compatible with existing applications.
– Transparent to applications. We can run precompiled binaries on the architecture.
– Some potential false negative scenarios. They are rare and not defeated by current generic detection techniques either.

Conclusions
Our analysis shows that real-world software can be compromised by corrupting non-control data. Non-control-hijacking attacks represent a realistic threat.
– It is insufficient to rely on control flow integrity for software security.
Pointer taintedness is a common characteristic of most memory corruption attacks, including control hijacking and non-control-hijacking attacks.
A theorem proving based code analysis approach is designed to reason about possibilities of pointer taintedness.
– E.g., to formally extract security specifications of library functions.
A runtime pointer taintedness detection mechanism is designed. It can effectively detect most memory corruption attacks.

Summary of My Research Methodology
Analysis-centric approach
– Analyzed the impact of hardware faults on security (fault injection + stochastic modeling)
– Analyzed Bugtraq and CERT vulnerability databases
– Analyzed application source code, attacks and current defense techniques
– Analysis results motivate
    • Exposing new security threats
    • Proposing new defense techniques
I like doing analysis of real data and incidents
– Tedious? Sometimes, but it is a crucial step toward a lot of fun.
– Rewarding? Definitely. Analysis is especially important for systems research.
– Goal: strongly motivate research topics that solve real problems.

Backup Slides

Static and Dynamic Approaches
Static approaches (avoid producing memory vulnerabilities in programs)
– Writing code with type safe language
– Compiler techniques to uncover memory vulnerabilities
– Compiler instruments source code according to program annotations.
– Challenges: legacy code and low level code, compatibility and performance.
– Fact: Memory vulnerabilities are still constantly discovered and exploited.
Intrusion detection techniques (defeat attacks, given the existence of vulnerabilities)
– Specialized techniques
    • Defeat stack buffer overflow and format string attacks.
– Generic defense techniques
    • Most techniques are designed to defeat control-hijacking attacks. Host intrusion detection system and control flow integrity protection techniques. Very active research area.
    • Others have constraints and difficulties in their deployments (pointer encryption and address randomization).
3535
One-Slide Intro to Equational Logic
– Use term rewriting to establish proofs of theorems.
– Natural number addition expressed in the Maude system.
0 : Natural .
s_ : Natural -> Natural .
_+_ : Natural Natural -> Natural .
vars N M : Natural
Axiom: N + 0 = N .
Axiom: N + s M = s (N + M) .
(s s s 0) + (s s 0)
= s ((s s s 0) + (s 0))
= s (s ((s s s 0) + 0))
= s (s (s s s 0))
= s s s s s 0
Intuitively, this is a proof of “3 + 2 = 5” in natural number algebra.
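The rewriting proof above can be replayed mechanically. The following is a minimal Python sketch, not Maude itself: the tuple encoding of terms and the helper names (`rewrite_step`, `normalize`, `numeral`) are assumptions made for illustration. It applies the two addition axioms as left-to-right rewrite rules until no rule applies.

```python
# Terms: "0" is zero, ("s", t) is the successor of t, ("+", a, b) is a + b.
ZERO = "0"

def s(t):
    return ("s", t)

def plus(a, b):
    return ("+", a, b)

def rewrite_step(t):
    """Apply one axiom somewhere in t; return None if t is in normal form."""
    if t == ZERO:
        return None
    if t[0] == "s":
        inner = rewrite_step(t[1])
        return None if inner is None else s(inner)
    _, a, b = t                      # t is a sum a + b
    if b == ZERO:                    # Axiom: N + 0 = N
        return a
    if b[0] == "s":                  # Axiom: N + s M = s (N + M)
        return s(plus(a, b[1]))
    return plus(a, rewrite_step(b))  # b is itself a sum: reduce it first

def normalize(t):
    """Return the list of terms visited while rewriting t to normal form."""
    trace = [t]
    while (nxt := rewrite_step(trace[-1])) is not None:
        trace.append(nxt)
    return trace

def numeral(n):
    """Build the term s s ... s 0 with n successors."""
    t = ZERO
    for _ in range(n):
        t = s(t)
    return t

trace = normalize(plus(numeral(3), numeral(2)))
for term in trace:
    print(term)
```

Each printed term corresponds to one line of the derivation, and the last one is the numeral for 5.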
Axioms of Eval and ExpT operations
Eval(S, I) = I // I is an integer constant
Eval(S, ^ E1) = Ftch(S, Eval(S, E1))
Eval(S, E1 + E2) = Eval(S, E1) + Eval(S, E2)
Eval(S, E1 - E2) = Eval(S, E1) - Eval(S, E2)
…
ExpT(S, I) = false
ExpT(S, ^ E1) = LocT(S, Eval(S, E1))
ExpT(S, E1 + E2) = ExpT(S, E1) or ExpT(S, E2)
ExpT(S, E1 - E2) = ExpT(S, E1) or ExpT(S, E2)
…
E.g., is the expression (^100) - 2 tainted in store S?
ExpT(S, (^100) - 2) = ExpT(S, ^100) or ExpT(S, 2) = LocT(S, 100) or false = LocT(S, 100)
Note: ^ is the dereference operator; ^100 gives the content of location 100.
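The Eval and ExpT axioms read directly as two recursive functions. The sketch below is an illustrative Python rendering under stated assumptions: the `Store` class, the tuple encoding of expressions, and the choice that an unwritten location has content 0 are not from the slides. The names `ftch`, `loc_t`, `eval_expr`, and `exp_t` stand for Ftch, LocT, Eval, and ExpT.

```python
class Store:
    """A snapshot of memory: content and taintedness per location (assumed layout)."""
    def __init__(self, content=None, tainted=None):
        self.content = dict(content or {})   # address -> integer content
        self.tainted = set(tainted or ())    # addresses whose taintedness is true

def ftch(S, A):
    return S.content.get(A, 0)               # assumption: unwritten locations hold 0

def loc_t(S, A):
    return A in S.tainted

# Expressions: an int constant, ("^", E), ("+", E1, E2) or ("-", E1, E2).
def eval_expr(S, E):
    if isinstance(E, int):
        return E                                        # Eval(S, I) = I
    if E[0] == "^":
        return ftch(S, eval_expr(S, E[1]))              # Eval(S, ^E1) = Ftch(S, Eval(S, E1))
    if E[0] == "+":
        return eval_expr(S, E[1]) + eval_expr(S, E[2])
    if E[0] == "-":
        return eval_expr(S, E[1]) - eval_expr(S, E[2])
    raise ValueError(f"unknown expression: {E!r}")

def exp_t(S, E):
    if isinstance(E, int):
        return False                                    # ExpT(S, I) = false
    if E[0] == "^":
        return loc_t(S, eval_expr(S, E[1]))             # ExpT(S, ^E1) = LocT(S, Eval(S, E1))
    if E[0] in ("+", "-"):
        return exp_t(S, E[1]) or exp_t(S, E[2])
    raise ValueError(f"unknown expression: {E!r}")

expr = ("-", ("^", 100), 2)                   # the expression (^100) - 2
clean = Store(content={100: 7})
dirty = Store(content={100: 7}, tainted={100})
print(eval_expr(dirty, expr), exp_t(clean, expr), exp_t(dirty, expr))
```

As in the worked example, the taintedness of (^100) - 2 reduces to the taintedness of location 100.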
Taintedness-Aware Memory Model
• A store represents a snapshot of the memory state at a point in the program execution.
• For each memory location, we can evaluate two properties: content and taintedness (true/false).
• Operations on memory locations:
  • The fetch operation Ftch(S,A) gives the content of the memory address A in store S.
  • The location-taintedness operation LocT(S,A) gives the taintedness of the location A in store S.
• Operations on expressions:
  • The evaluation operation Eval(S,E) evaluates expression E in store S.
  • The expression-taintedness operation ExpT(S,E) computes the taintedness of expression E in store S.
Semantics of Language L
The following instructions are defined:
– mov [Exp1] <- Exp2
– branch (Condition) Label
– call FuncName(Exp1,Exp2,…)
Axioms defining mov instruction semantics
– Specify the effects of applying the mov instruction on a store
– Allow taintedness to propagate from Exp2 to [Exp1].
Axioms defining the semantics of recv (similarly, scanf, recvfrom: user input functions)
– Specify the memory locations tainted by the recv call.
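One plausible reading of the mov and recv semantics is sketched below in Python. This is an assumption-laden illustration, not the actual axioms: the `Store` class, the helper names, and the simplified `recv(S, buf_addr, data)` signature (which is not the socket API) are hypothetical. `mov` writes the value of Exp2 to the address Eval(S, Exp1) and copies the taintedness of Exp2 to that location; `recv` marks every location it writes as tainted.

```python
class Store:
    """A snapshot of memory: content and taintedness per location (assumed layout)."""
    def __init__(self, content=None, tainted=None):
        self.content = dict(content or {})
        self.tainted = set(tainted or ())

def ftch(S, A):
    return S.content.get(A, 0)      # assumption: unwritten locations hold 0

def loc_t(S, A):
    return A in S.tainted

def eval_expr(S, E):
    if isinstance(E, int):
        return E
    if E[0] == "^":
        return ftch(S, eval_expr(S, E[1]))
    left, right = eval_expr(S, E[1]), eval_expr(S, E[2])
    return left + right if E[0] == "+" else left - right

def exp_t(S, E):
    if isinstance(E, int):
        return False
    if E[0] == "^":
        return loc_t(S, eval_expr(S, E[1]))
    return exp_t(S, E[1]) or exp_t(S, E[2])

def mov(S, exp1, exp2):
    """mov [Exp1] <- Exp2: return the store that results from the instruction."""
    addr, value, taint = eval_expr(S, exp1), eval_expr(S, exp2), exp_t(S, exp2)
    S2 = Store(S.content, S.tainted)          # stores are snapshots, so copy
    S2.content[addr] = value
    if taint:
        S2.tainted.add(addr)                  # taintedness propagates from Exp2
    else:
        S2.tainted.discard(addr)              # an untainted value cleans the location
    return S2

def recv(S, buf_addr, data):
    """Hypothetical user-input call: writes data at buf_addr and taints it."""
    S2 = Store(S.content, S.tainted)
    for i, byte in enumerate(data):
        S2.content[buf_addr + i] = byte
        S2.tainted.add(buf_addr + i)
    return S2

S0 = Store()
S1 = recv(S0, 200, b"hi")                     # locations 200 and 201 become tainted
S2 = mov(S1, 300, ("^", 200))                 # mov [300] <- ^200
S3 = mov(S2, 300, 0)                          # mov [300] <- 0
print(loc_t(S2, 300), loc_t(S3, 300))
```

Returning a fresh `Store` from each instruction mirrors the view of a store as a snapshot at one point of the execution, so earlier stores such as S1 stay available for comparison.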
Example: strcpy()

char * strcpy (char * dst, char * src) {
    char * res;
0:  res = dst;
    while (*src != 0) {
1:      *dst = *src;
        dst++;
        src++;
    }
2:  *dst = 0;
    return res;
}
0: mov [res] <- ^ dst
lbl(#while#6)
branch (^ ^ src is 0) #ex#while#6
1: mov [^ dst] <- ^ ^ src
mov [dst] <- (^ dst) + 1
mov [src] <- (^ src) + 1
branch true #while#6
lbl(#ex#while#6)
2: mov [^ dst] <- 0
mov [ret] <- ^ res
Translate to formal semantics
a) Suppose S1 is the store before Line L1, then LocT(S1,dst) = false
b) If S0 is the store before Line L0, and S2 is the store after Line L1, then
   I < Eval(S0, ^dst) or Eval(S0, ^dst + dstsize) <= I => LocT(S2,I) = LocT(S0,I)
c) Suppose S3 is the store before Line L2, then LocT(S3,dst) = false
Theorem generation
Theorem proving
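Condition (b) can be checked by simulation on concrete stores, which is much weaker than proving it but shows what the condition says. The Python sketch below is an illustration under assumptions: the addresses `DST`, `SRC`, `SRC_BUF`, `DST_BUF` and the function names are hypothetical, and only the loop of the translated listing (the instructions at label 1) is executed. It then compares the taintedness of every location outside the dst buffer before and after the loop.

```python
DST, SRC = 10, 11              # hypothetical addresses of the pointer variables dst and src
SRC_BUF, DST_BUF = 200, 300    # hypothetical addresses of the two buffers

def run_copy_loop(content, tainted):
    """Execute the loop of the translated strcpy listing on a mutable store."""
    def mov(addr, value, taint):
        content[addr] = value
        if taint:
            tainted.add(addr)
        else:
            tainted.discard(addr)

    while content.get(content[SRC], 0) != 0:       # branch (^ ^ src is 0) #ex#while#6
        dst_ptr, src_ptr = content[DST], content[SRC]
        # 1: mov [^ dst] <- ^ ^ src   (taintedness is LocT of the source location)
        mov(dst_ptr, content.get(src_ptr, 0), src_ptr in tainted)
        mov(DST, dst_ptr + 1, DST in tainted)      # mov [dst] <- (^ dst) + 1
        mov(SRC, src_ptr + 1, SRC in tainted)      # mov [src] <- (^ src) + 1

def outside_taint_preserved(srclen, dstsize):
    """True if no location outside [DST_BUF, DST_BUF + dstsize) changes taintedness."""
    content = {DST: DST_BUF, SRC: SRC_BUF}
    tainted = set()
    for i in range(srclen):                        # a tainted user-input string
        content[SRC_BUF + i] = ord("A")
        tainted.add(SRC_BUF + i)
    content[SRC_BUF + srclen] = 0                  # string terminator
    before = set(tainted)
    run_copy_loop(content, tainted)
    outside = [a for a in set(content) | before | tainted
               if a < DST_BUF or a >= DST_BUF + dstsize]
    return all((a in before) == (a in tainted) for a in outside)

for srclen in (3, 4, 5):
    print(srclen, outside_taint_preserved(srclen, dstsize=4))
```

In this layout the check holds for a 3- or 4-character string copied into a 4-byte buffer and fails for a 5-character string, because the fifth tainted byte lands just past the end of the dst buffer.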
Specifications Extracted
Specifications that are extracted by the theorem proving approach
– srclen <= dstsize
– The buffers src and dst do not overlap in such a way that the buffer dst covers the string terminator of the src string.
– The buffers dst and src do not cover the function frame of strcpy.
– Initially, dst is not tainted
Documented in Linux man page
Not documented
Suppose when function strcpy() is called, the size of destination buffer (dst) is dstsize, the length of user input string (src) is srclen