Source Code Analysis for Security through LLVM
Static Code Analyzer for Security
Static Code Analyzer for Security (HP Fortify SCA)
[Diagram labels: C/C++, Java, Vulnerabilities]
LLVM Language-independent Services
[Diagram labels: C/C++, Objective-C, Swift, 22nd]
Bitcode for Source Analysis?
[Diagram labels: C/C++, Objective-C, Swift, 22nd, Vulns]
HP Fortify SCA for Objective-C
[Diagram labels: C/C++, Objective-C, Swift, 22nd, Vulns; compiler invocations: clang -gsrc, clang -g]
Bitcode with Enhanced Source Info
[Diagram labels: C/C++, Objective-C, Swift, Vulns; compiler invocations: clang -g, clang -gsrc, swift -gsrc, frontend -gsrc]
cross‐language analysis
Why can't we do this today?
[Diagram labels: C/C++, Objective-C, Swift, Vulns; compiler invocation: clang -g]
Objective-C Static Taint Analyzer
    @implementation HtmlViewController
    - (void)viewDidLoad {
        if (_content) {
            …
        } else {
            // Display the "About iGoat" splash screen as a default.
            …
            NSString *fileContents = [[NSString alloc] initWithContentsOfFile:filePath encoding:NSUTF8StringEncoding error:&error];
            NSString *version = [[[NSBundle mainBundle] infoDictionary] objectForKey:@"CFBundleShortVersionString"];
            [self.webView loadHTMLString:[NSString stringWithFormat:fileContents, version] baseURL:baseURL];
        }
    }
    …
    @end
taint source by API doc
taint sink by API doc
Taint source: initWithContentsOfFile:encoding:error: (the file contents it returns)
Taint sink: loadHTMLString:baseURL:
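To make the flow in the viewDidLoad example concrete, here is a minimal sketch of forward taint propagation over a straight-line list of calls. It is not how Fortify SCA is implemented: the `Call` struct, `findTaintedSinks`, `analyzeViewDidLoad`, and the temporary name `html` are hypothetical, and the sketch assumes every non-source call simply passes taint from its arguments to its result.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// One simplified call: result = callee(args...). An empty result means "unused".
struct Call {
    std::string result;
    std::string callee;
    std::vector<std::string> args;
};

// Returns the callee names of sink calls that receive a tainted argument.
std::vector<std::string> findTaintedSinks(const std::vector<Call>& calls,
                                          const std::set<std::string>& sources,
                                          const std::set<std::string>& sinks) {
    std::set<std::string> tainted;   // variables known to carry tainted data
    std::vector<std::string> hits;
    for (const Call& c : calls) {
        bool argTainted = false;
        for (const std::string& a : c.args)
            if (tainted.count(a)) argTainted = true;
        if (sinks.count(c.callee) && argTainted) hits.push_back(c.callee);
        // A source taints its result; any other call passes taint through.
        if (!c.result.empty() && (sources.count(c.callee) || argTainted))
            tainted.insert(c.result);
    }
    return hits;
}

// The else-branch of viewDidLoad, flattened into calls (hypothetical encoding).
std::vector<std::string> analyzeViewDidLoad() {
    std::vector<Call> calls = {
        {"fileContents", "initWithContentsOfFile:encoding:error:", {"filePath"}},
        {"version", "objectForKey:", {"infoDictionary"}},
        {"html", "stringWithFormat:", {"fileContents", "version"}},
        {"", "loadHTMLString:baseURL:", {"html", "baseURL"}},
    };
    return findTaintedSinks(calls,
                            {"initWithContentsOfFile:encoding:error:"},
                            {"loadHTMLString:baseURL:"});
}
```

Calling `analyzeViewDidLoad()` reports `loadHTMLString:baseURL:`, because `fileContents` is tainted by the file read and reaches the web view through `stringWithFormat:`.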
Objective-C Static Taint Analyzer
• Each taint source or taint sink is declared as a rule, which the analyzer matches against method signatures.
    NodeType: TaintSource
    ClassName: NSArray | NSString | NSData | NSConstantString
    MethodSig: arrayWithContentsOfFile: | (string|init)WithContentsOfFile:(usedE|e)ncoding:error: | initWithContentsOfFile: | (data|init)WithContentsOfFile:(options:error:)?
    Output: return
    TaintFlags: FILE_SYSTEM,XSS
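As an illustration of matching such a rule against a method signature, the sketch below treats the ClassName and MethodSig fields as ordinary regular expressions and requires a full match. That reading is an assumption, as is the choice to ignore whitespace in the patterns; the `TaintRule` struct and the function names are hypothetical and are not Fortify's rule API.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <regex>
#include <string>

// Hypothetical in-memory form of the rule shown on the slide.
struct TaintRule {
    std::string nodeType;
    std::string classNames;   // alternation of class names
    std::string methodSigs;   // alternation of selector patterns
    std::string output;
    std::string taintFlags;
};

// Assumption: whitespace in a rule pattern is layout only, so strip it
// before compiling the pattern as an ECMAScript regular expression.
static std::regex compilePattern(std::string pattern) {
    pattern.erase(std::remove_if(pattern.begin(), pattern.end(),
                                 [](unsigned char c) { return std::isspace(c) != 0; }),
                  pattern.end());
    return std::regex(pattern);
}

// True when both the class name and the selector match the rule completely.
bool ruleMatches(const TaintRule& rule, const std::string& className,
                 const std::string& selector) {
    return std::regex_match(className, compilePattern(rule.classNames)) &&
           std::regex_match(selector, compilePattern(rule.methodSigs));
}

// The file-read taint source rule, copied field by field from the slide.
TaintRule fileReadSourceRule() {
    return {"TaintSource",
            "NSArray | NSString | NSData | NSConstantString",
            "arrayWithContentsOfFile: | "
            "(string|init)WithContentsOfFile:(usedE|e)ncoding:error: | "
            "initWithContentsOfFile: | "
            "(data|init)WithContentsOfFile:(options:error:)?",
            "return",
            "FILE_SYSTEM,XSS"};
}
```

With this reading, `ruleMatches(fileReadSourceRule(), "NSString", "initWithContentsOfFile:encoding:error:")` is true, while the same rule does not match `loadHTMLString:baseURL:`.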
A Source-friendly IR
• A method signature
    public class NSString extends NSObject {
        public virtual NSString* initWithContentsOfFile$encoding$error$(NSString* this, …);
    }
From Bitcode to Source
    int convert(unsigned u) { return 0; }

    define i32 @convert(i32 %u) #0 {
    entry:
      ret i32 0
    }

    !4 = metadata !{i32 786478, metadata !1, metadata !5, metadata !"convert", metadata !"convert", ...} ; [ DW_TAG_subprogram ] [line 25] [def] [convert]
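From Bitcode to Source
    // Walk every compile unit listed under llvm.dbg.cu (LLVM 3.3-era DebugInfo API).
    NamedMDNode *M_Nodes = M->getNamedMetadata("llvm.dbg.cu");
    for (unsigned i = 0, e = M_Nodes->getNumOperands(); i != e; ++i) {
      DICompileUnit CU(M_Nodes->getOperand(i));  // one compile unit per operand
      DIArray SPs = CU.getSubprograms();
      for (unsigned i2 = 0, e2 = SPs.getNumElements(); i2 != e2; ++i2) {
        DISubprogram DISP(SPs.getElement(i2));
        DICompositeType DIC(DISP.getType());
        DIArray Tys = DIC.getTypeArray();
        // Tys[0] is the return type; the remaining elements are parameter types.
      }
    }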
No Metadata for Declarations
extern int convert(unsigned u);
declare i32 @convert(i32 %u) #2;
No metadata describing @convert.
Metadata emission is a step within code emission: no code generation, no metadata.
Generate Bitcode with Rich Source Info
• Decouple metadata emission and code generation.
• Control rich metadata emission by using -gsrc
    $ clang -gsrc -O0 -c -emit-llvm -S HtmlViewController.m
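The effect of decoupling can be pictured with a toy model. This is only a sketch of the idea, not Clang's code: `ToyFunction` and `describedFunctions` are hypothetical names, and the `richSourceInfo` flag stands in for the `-gsrc` option from the slide.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for a function seen by the frontend (hypothetical type).
struct ToyFunction {
    std::string name;
    bool hasBody;   // false for a declaration such as `extern int convert(unsigned u);`
};

// Names of the functions that would get a subprogram descriptor.
// Without rich source info, metadata is tied to code generation, so only
// functions with a body are described; with it, declarations are described too.
std::vector<std::string> describedFunctions(const std::vector<ToyFunction>& fns,
                                            bool richSourceInfo) {
    std::vector<std::string> out;
    for (const ToyFunction& f : fns)
        if (f.hasBody || richSourceInfo) out.push_back(f.name);
    return out;
}
```

For a module with a defined `viewDidLoad` and a declared-only `convert`, the model describes only `viewDidLoad` when the flag is off and both functions when it is on.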
Bitcode with Rich Source Info
    declare extern_weak i8* @"-[NSString initWithContentsOfFile:encoding:error:]"(%1*, i8*, %1*, i64, %3**)

    !1538 = metadata !{i32 786478, metadata !4, metadata !302, metadata !"-[NSString initWithContentsOfFile:encoding:error:]", ...} ; [ DW_TAG_subprogram ] ...
Type signature: (NSString*, objc_selector*, NSString*, NSStringEncoding, NSError**) -> NSString*
typedef: NSStringEncoding, NSUInteger, long unsigned int
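The two lines above can be reproduced from a type array in which element 0 is the return type and the rest are parameter types. The sketch below works on plain strings rather than LLVM debug-info descriptors; `renderSignature`, `resolveTypedef`, and the typedef table are hypothetical helpers for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// tys[0] is the return type; tys[1..] are the parameter types.
std::string renderSignature(const std::vector<std::string>& tys) {
    if (tys.empty()) return "";
    std::string out = "(";
    for (std::size_t i = 1; i < tys.size(); ++i) {
        if (i > 1) out += ", ";
        out += tys[i];
    }
    out += ") -> " + tys[0];
    return out;
}

// Follows a typedef chain to the underlying type name.
// The step bound guards against a cyclic table.
std::string resolveTypedef(std::string name,
                           const std::map<std::string, std::string>& typedefs) {
    for (std::size_t steps = 0; steps <= typedefs.size(); ++steps) {
        std::map<std::string, std::string>::const_iterator it = typedefs.find(name);
        if (it == typedefs.end()) return name;
        name = it->second;
    }
    return name;
}
```

Given the return type `NSString*` followed by the five parameter types, `renderSignature` yields the type signature shown above, and resolving `NSStringEncoding` through a table holding the two typedef steps ends at `long unsigned int`.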
A Source-friendly IR
    public class NSString extends NSObject {
        public virtual NSString* initWithContentsOfFile$encoding$error$(NSString* this, …);
    }
• NST
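One way to get from the bitcode symbol `-[NSString initWithContentsOfFile:encoding:error:]` to the method name used in the IR above is to split the symbol into class and selector and then replace each `:` with `$`. This is a sketch inferred from the two names on the slides; `ObjCMethod`, `parseObjCSymbol`, and `toIRMethodName` are hypothetical, and the real translation may handle more cases.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Parts of an Objective-C method symbol such as "-[Class selector:]".
struct ObjCMethod {
    bool isInstance;        // '-' instance method, '+' class method
    std::string className;
    std::string selector;
};

// Splits "-[Class selector]" or "+[Class selector]"; returns false otherwise.
bool parseObjCSymbol(const std::string& symbol, ObjCMethod& out) {
    if (symbol.size() < 3) return false;
    if ((symbol[0] != '-' && symbol[0] != '+') || symbol[1] != '[' ||
        symbol[symbol.size() - 1] != ']')
        return false;
    const std::string body = symbol.substr(2, symbol.size() - 3);
    const std::size_t space = body.find(' ');
    if (space == std::string::npos || space == 0 || space + 1 == body.size())
        return false;
    out.isInstance = (symbol[0] == '-');
    out.className = body.substr(0, space);
    out.selector = body.substr(space + 1);
    return true;
}

// Assumed mapping from selector to IR method name: every ':' becomes '$'.
std::string toIRMethodName(std::string selector) {
    std::replace(selector.begin(), selector.end(), ':', '$');
    return selector;
}
```

Parsing the symbol from the declaration gives class `NSString` and selector `initWithContentsOfFile:encoding:error:`, which maps to `initWithContentsOfFile$encoding$error$`.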
Bitcode with Enhanced Source Info
[Diagram labels: C/C++, Objective-C, Swift, Vulns, taint analysis; compiler invocations: clang -gsrc, clang]
Small Modification, Big Opportunity
• The entire patch to Clang/LLVM 3.3 is 543 lines (git diff).
• Upgrading to 3.5
Small Modification, Big Opportunity
• All frontends should implement this feature
[Diagram labels: C/C++, Objective-C, Swift, Vulns, taint analysis; compiler invocations: clang -gsrc, swift -gsrc, frontend -gsrc]