1
SpyAware: Investigating the Privacy Leakage Signatures in App Execution Traces
Hui Xu, Yangfan Zhou, Cuiyun Gao, Yu Kang, Michael R. Lyu
2
Private Data Is Valuable
Big Data
Machine Learning
Recommendation
3
Is a Leakage Legitimate?
Depends on:
† User Preference
† Software Functionality
4
How to Handle the Leakage?
Principle: Privacy Awareness
† Users should be informed when a leakage happens.
† Disposing of the app as if it were malware is inappropriate.
Your SMS has been leaked!!!
Maybe I should remove the app.
5
Privacy Leakage Definition
[Figure: privacy-sensitive data flows from a source (read behavior) to a sink (send behavior); the read and the send together constitute a privacy leakage.]
6
Industrial Solutions
http://getandroidstuff.com/best-free-android-permission-management-apps-privacy-control/
They only control read behaviors!
7
Research Solutions
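[Figure: two panels. "Taint Analysis": sensitive data is read into Variable1, the taint propagates through the instructions to Variable2, and the data in Variable2 is sent, so a leakage is known to happen. "Without Taint Analysis": the same read and send are observed, but the flow between Variable1 and Variable2 is unknown, so whether a leakage happened is uncertain.]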
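The difference between the two cases can be illustrated with a toy interpreter. This is a minimal sketch and not how any particular taint tracker is implemented: the tuple-based instruction format, the `run` function, and the variable names are all assumptions made for illustration.

```python
def run(program, track_taint=True):
    """Run a toy program and report whether sensitive data reaches a sink.

    Instructions (hypothetical format):
      ("read", dst)       load sensitive data into dst (source)
      ("copy", dst, src)  dst = src
      ("const", dst)      dst = some non-sensitive constant
      ("send", src)       transmit src (sink)

    Returns True (leak), False (no leak), or None (cannot tell).
    """
    tainted = set()
    saw_read = False
    for ins in program:
        op = ins[0]
        if op == "read":
            saw_read = True
            tainted.add(ins[1])
        elif op == "copy":
            _, dst, src = ins
            # Taint propagation: dst inherits the taint status of src.
            if src in tainted:
                tainted.add(dst)
            else:
                tainted.discard(dst)
        elif op == "const":
            tainted.discard(ins[1])
        elif op == "send":
            if not track_taint:
                # Only the read and the send are visible, not the data flow.
                return None if saw_read else False
            if ins[1] in tainted:
                return True
    return False


leaky = [("read", "variable1"), ("copy", "variable2", "variable1"), ("send", "variable2")]
benign = [("read", "variable1"), ("const", "variable2"), ("send", "variable2")]

print(run(leaky), run(benign))                                        # with taint tracking
print(run(leaky, track_taint=False), run(benign, track_taint=False))  # without
```

With tracking enabled the two programs are told apart; without it both look the same (a read followed by a send), which is the ambiguity the second panel points at.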
8
TaintDroid
Approach: dynamic taint analysis (tracks the data flow during runtime)
Usability Issues: portability (a new OS), overhead
W. Enck, et al. TaintDroid: an information-flow tracking system for realtime privacy monitoring on smartphones[J]. ACM Transactions on Computer Systems (TOCS), 2014
9
Our Inspiration & Hypothesis
[Figure: app execution traces are the observable phenomenon and spyware behavior is the hidden incident. Example traces interleave instructions (Pre I1, Pre I2, Pre I3, Pos I1, Pos I2, Pos I3) around Read and Send events; the hidden states S1 to S4 are labeled spyware or benign.]
Hypothesis: Some correlation exists between privacy leakage behaviors and app execution traces.
Approach of Data Analytics: Transform data to insight.
10
What Instructions Are Helpful?
System Call: widely used on the Linux platform
† Pros: It contains all the information about program execution.
† Cons: It is low level and difficult to interpret.
Binder Call: newly introduced in the Android OS
† Pros: It is semantic and can be easily interpreted.
† Cons: It only traces inter-process communication.
11
Trace the Instructions with a Profiler
† To trace system calls: strace
† To trace binder calls:
a) Inject a payload into the target app process with ptrace.
b) The custom payload decodes binder calls.
12
Overall Framework
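[Figure: statistical pattern recognition framework. Training phase: apps are run under the profiler (binder calls and system calls) with TaintDroid as the leakage indicator; the feature extractor produces benign samples and spyware samples, and the trainer builds models from them. Detection phase: an app is run under the profiler, the feature extractor turns the profile into a sample, and the classifier outputs the result.]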
13
Android Binder Call
# Type Data
1 BC_TRANSACTION ****android.app.IActivityManager**********%*com.sec.multiwindow.MW_TOUCH_DETECTED***********************`*********mw_x****e*****mw_action*********mw_y*********
2 BR_REPLY ****
3 BC_TRANSACTION ****android.content.IContentProvider****GET_system****sound_effects_enabled***
4 BR_REPLY ****
5 BC_TRANSACTION **"*android.gui.DisplayEventConnection**
6 BR_REPLY **$*********value*****0*
7 BC_TRANSACTION ****android.app.IActivityManager************com.android.contacts****
8 BR_REPLY **$*********value*****0*
9 BC_TRANSACTION ****android.content.IContentProvider****'*content://com.android.contacts/contacts*****_id***********************
10 BR_REPLY ****0*com.android.providers.contacts.ContactsProvider2****com.android.providers.contacts******************com.android.providers.contacts****************com.android.providers.contacts******android.process.acore*************#*/system/app/SecContactsProvider.apk*#*/system/app/SecContactsProvider.apk*-*/data/data/com.android.providers.contacts/lib*****!*/system/framework/sec_feature.jar*+*/data/user/0/com.android.providers.contacts*********************android.process.acore*********contacts;com.android.contacts***android.permission.READ_CONTACTS**!*android.permission.WRITE_CONTACTS*********.***********
11 BR_REPLY ****************_id*********B***************************************f***
Access Contacts
Binder Instance Details
14
Detect Read Behaviors
Signature                                                 Data Type
android.os.IServiceManager****iphonesubinfo               IMEI, ICCID
content://com.android.contacts/                           Contact List
android.content.IContentProvider + com.android.contacts   Contact List
content://sms/                                            SMS
content://call_log/                                       Call History
content://browser/bookmarks                               Browser History
android.account.IAccountManager                           Account
android.os.IServiceManager****location                    Location
android.location.ILocationManager***gps                   Location
android.location.ILocationManager***network               Location
android.location.ILocationManager***passive               Location
android.media.IMediaRecorder                              Mic
android.gui.Sensor                                        Accelerometer
android.hardware.Camera                                   Camera
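Matching such signatures against decoded binder data can be sketched as a substring check. This is only an illustration under stated assumptions: the `READ_SIGNATURES` structure and the `detect_read_behaviors` function are hypothetical names, only a subset of the signatures is included, and the `*` filler bytes between tokens are handled by requiring each part of a signature to appear somewhere in the data.

```python
# Each entry: (parts that must all appear in the binder data, data type).
# Subset of the signature list, for illustration only.
READ_SIGNATURES = [
    (("android.os.IServiceManager", "iphonesubinfo"), "IMEI, ICCID"),
    (("content://com.android.contacts/",), "Contact List"),
    (("android.content.IContentProvider", "com.android.contacts"), "Contact List"),
    (("content://sms/",), "SMS"),
    (("content://call_log/",), "Call History"),
    (("content://browser/bookmarks",), "Browser History"),
    (("android.location.ILocationManager", "gps"), "Location"),
]


def detect_read_behaviors(binder_data):
    """Return the data types whose signature matches, without duplicates."""
    found = []
    for parts, data_type in READ_SIGNATURES:
        if all(part in binder_data for part in parts) and data_type not in found:
            found.append(data_type)
    return found


contacts_query = ("****android.content.IContentProvider****'*"
                  "content://com.android.contacts/contacts*****_id****")
settings_query = ("****android.content.IContentProvider****GET_system"
                  "****sound_effects_enabled***")

print(detect_read_behaviors(contacts_query))
print(detect_read_behaviors(settings_query))
```

The two example strings are shortened versions of rows 9 and 3 of the binder call listing; only the first one touches the contacts provider.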
15
Binder Call-based Features
Approach:
a) Use BR_TRANSACTION; discard BR_REPLY.
b) Strip details and retain the destination instance name.
c) Choose discriminative instances.
† Leakage happens automatically when starting a new activity: android.app.IActivityManager
† Network communications are generally performed in a standalone thread: android.app.IApplicationThread
† Apps may check the current network connection status before communicating: android.net.IConnectivityManager, android.net.wifi.IWifiManager
† Messenger is a common method to pass events or values between threads: android.os.IMessenger
† Leakages may happen when an app is querying the server: com.android...view.IInputMethodManager
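Steps a) to c) can be sketched as follows. This is a hedged illustration rather than the authors' extractor: the record format `(type, data)`, the function names, and the use of per-instance counts as feature values are assumptions. The sample trace labels transactions BC_TRANSACTION while the approach names BR_TRANSACTION, so the sketch accepts any type ending in `_TRANSACTION`. The instance whose name is abbreviated on the slide is left out of the list.

```python
import re
from collections import Counter

# Fully spelled-out instances from the list above.
SELECTED_INSTANCES = [
    "android.app.IActivityManager",
    "android.app.IApplicationThread",
    "android.net.IConnectivityManager",
    "android.net.wifi.IWifiManager",
    "android.os.IMessenger",
]

# First dotted identifier in the payload, e.g. "android.app.IActivityManager".
_DOTTED_NAME = re.compile(r"[A-Za-z_]\w*(?:\.[A-Za-z_]\w*)+")


def destination_instance(data):
    """Strip the details and keep only the destination instance name."""
    match = _DOTTED_NAME.search(data)
    return match.group(0) if match else None


def binder_features(records):
    """Count transactions per selected instance; replies are discarded."""
    counts = Counter()
    for record_type, data in records:
        if not record_type.endswith("_TRANSACTION"):
            continue
        name = destination_instance(data)
        if name is not None:
            counts[name] += 1
    return [counts[name] for name in SELECTED_INSTANCES]


records = [
    ("BC_TRANSACTION", "****android.app.IActivityManager**********%*com.sec.multiwindow.MW_TOUCH_DETECTED***"),
    ("BR_REPLY", "****"),
    ("BC_TRANSACTION", "****android.content.IContentProvider****GET_system****sound_effects_enabled***"),
    ("BR_REPLY", "****"),
    ("BC_TRANSACTION", "****android.app.IActivityManager************com.android.contacts****"),
]
print(binder_features(records))
```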
16
System Call-based Features
Approach:
a) Strip the parameters and retain the name.
b) Calculate the document frequency (DF) of system calls.
High DF: not discriminative
Low DF: rarely occurs
Features: 13 system calls with DF ranging from 0.06 to 0.22
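The two steps can be sketched on strace-style lines of the form `name(args) = result`. This is a minimal sketch under assumptions: a "document" is taken to be the trace of one sample, the function names are hypothetical, and the DF band is a parameter rather than a claim about how the 13 features were actually chosen.

```python
import re
from collections import Counter

# A system call line starts with its name followed by "(".
_CALL = re.compile(r"^\s*(\w+)\(")


def syscall_name(line):
    """Strip the parameters and keep only the system call name."""
    match = _CALL.match(line)
    return match.group(1) if match else None


def document_frequency(documents):
    """Fraction of documents (traces) in which each system call appears."""
    df = Counter()
    for lines in documents:
        names = {syscall_name(line) for line in lines}
        names.discard(None)  # signal lines and other non-call output
        df.update(names)
    total = len(documents)
    return {name: count / total for name, count in df.items()}


def select_features(df, low, high):
    """Keep calls that are neither too common nor too rare."""
    return sorted(name for name, value in df.items() if low <= value <= high)


documents = [
    ['read(3, "abc", 3) = 3', 'write(4, "x", 1) = 1'],
    ['read(3, "", 0) = 0'],
    ['read(5, "", 0) = 0', 'sendto(6, "hi", 2, 0, NULL, 0) = 2'],
    ['read(7, "z", 1) = 1', "--- SIGCHLD {si_signo=SIGCHLD} ---"],
]
df = document_frequency(documents)
print(df)
print(select_features(df, 0.2, 0.5))
```

In the toy data `read` appears in every trace (high DF) and is dropped, which mirrors the "not discriminative" case above.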
17
Extract Features for Each Sample
Terms:
† A sample: We separate the sequence of instructions into samples according to touch operations.
† A suspicious sample: a sample that includes at least one read behavior according to the binder calls.
Steps:
a) Judge whether a sample is suspicious.
b) Discard the sample if it is not suspicious.
c) Extract features only for suspicious samples.
Reason: Android apps are UI oriented.
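The segmentation and filtering steps can be sketched as below. How a touch operation or a read behavior is recognized in the trace is not specified here, so both are passed in as predicates; `is_touch`, `is_read`, and the string events are hypothetical stand-ins.

```python
def split_into_samples(events, is_touch):
    """Start a new sample at every touch operation."""
    samples = []
    current = []
    for event in events:
        if is_touch(event) and current:
            samples.append(current)
            current = []
        current.append(event)
    if current:
        samples.append(current)
    return samples


def suspicious_samples(samples, is_read):
    """Keep only samples that contain at least one read behavior."""
    return [sample for sample in samples if any(is_read(e) for e in sample)]


events = ["TOUCH", "draw", "read contacts", "TOUCH", "draw",
          "TOUCH", "read sms", "send"]
samples = split_into_samples(events, is_touch=lambda e: e == "TOUCH")
kept = suspicious_samples(samples, is_read=lambda e: e.startswith("read"))
print(len(samples), len(kept))
```

Features would then be extracted only for the samples in `kept`; the middle sample has no read behavior and is discarded.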
18
Experimental Settings
Goal: Discriminate whether a suspicious sample indicates a privacy leakage.
Baseline: TaintDroid
App set: 100 top-ranking apps from Google Play
Method: We manually run each app for a few minutes; we don’t use Monkey because of registration issues.
[Figure: each suspicious profile is classified as Leak or No Leak.]
19
Experimental Apps
App DevID Location | App DevID Location | App DevID Location
com.wochacha Leak Leak | com.starbucks.hk Read 0 | com.trello 0 Leak
jp.naver.line.android Read 0 | org.coursera.android Leak 0 | sg.bigo Leak Leak
cn.com.fetion Leak 0 | com.wonder Leak Leak | com.axonlabs.hkbus 0 Read
com.chinamobile.contacts.im Leak Leak | com.babytree.apps.lama Leak 0 | com.tranzmate Read Leak
com.tencent.pb Read 0 | com.skyscape.android.ui Leak 0 | org.wikipedia 0 Leak
com.sina.weibo Leak 0 | com.epocrates Leak 0 | com.ijinshan.kbatterydoctor_en Leak 0
com.airbnb.android 0 Leak | com.ebay.mobile Read Leak | com.groupon Leak Leak
com.booking 0 Leak | com.sirma.mobile.bible.android Leak Leak | com.coupons.ciapp 0 Leak
com.tripadvisor.tripadvisor 0 Leak | com.sinyee.babybus.feeling Leak Leak | com.nextmedia Leak Leak
com.musixmatch.android.lyrify Read Read | com.etermax.preguntados.lite 0 Read | com.Qunar Leak Leak
com.soundcloud.android Read 0 | com.ss.android.article.news Leak Leak | cn.kuwo.kwmusichd Leak 0
de.motain.iliga 0 Read | com.dianping.v1 Leak Leak | com.banjo.android Leak Leak
com.sankuai.meituan Leak Leak | com.yahoo...im Read Leak | com.kayak.android Leak Leak
com.easygame.marblelegend Read 0 | com.dolphin.browser.express.web Read Leak | net.skyscanner.android.main 0 Read
com.zillow.android.zillowmap Leak Read | org.mozilla.firefox 0 Read | com.ik.flightherofree 0 Leak
com.evernote Read Read | com.ksmobile.cb Read 0 | com.flightview.flightview_free Leak Leak
com.eico.weico Leak Read | com.droidware.uninstallmaster Read Leak | cn.bluesky.chinesechess Leak Leak
com.netease.newsreader Leak Leak | com.lingualeo.android 0 Read | com.happiplay.baccarat Leak 0
com.zhaopin.social Leak Leak | com.baidu.news Leak Leak | me.soundwave.soundwave 0 Leak
com.sohu.newsclient Leak Leak | com.tencent.news Leak 0 | com.thefancy.app 0 0
com.cubic.autohome Leak Leak | com.ifeng.news2 Leak 0 | com.wanelo.android Read Leak
com.soufun.app Leak 0 | com.wumo Leak Leak | com.mobilesrepublic.appy Read Leak
com.yahoo...weather Read Leak | com.quanleimu.activity Leak Leak | com.nytimes.android Read 0
com.moji.mjweather Leak Leak | com.pccw.finance Leak 0 | com.bigduckgames.flowbridges 0 Leak
DevID Leakage: 347 suspicious profiles from 56 apps, 139 spyware behaviors
Location Leakage: 171 suspicious profiles from 51 apps, 51 spyware behaviors
20
Experimental Result
The results support the existence of a correlation between spyware behaviors and app execution traces.
Using Support Vector Machine and Cross Validation
                 Positive  Negative  Total  Accuracy
Dev ID    True   59        175       234    67.4%
          False  33        80        113
Location  True   21        113       134    78.4%
          False  7         30        37
Naïve guesser with prior distribution knowledge
Dev ID          Accuracy  F1-Measure
Naïve Guesser   59.6%     0%
SVM             67.4%     50.6%
Location        Accuracy  F1-Measure
Naïve Guesser   70.2%     0%
SVM             78.4%     53.1%
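The accuracy figures can be recomputed from the confusion counts, reading the table as true/false positives and negatives (Dev ID: TP 59, TN 175, FP 33, FN 80; Location: TP 21, TN 113, FP 7, FN 30). The sketch below uses the standard definitions of accuracy and F1; the function names are illustrative. Computed from these pooled counts, accuracy matches the reported 67.4% and 78.4%, while F1 comes out near 51.1% and 53.2%, slightly off the reported 50.6% and 53.1%. One possible explanation, which is an assumption, is that the reported F1 was averaged over cross-validation folds.

```python
def accuracy(tp, tn, fp, fn):
    """Share of samples classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def f1_measure(tp, fp, fn):
    """Harmonic mean of precision and recall; 0 if nothing is predicted positive."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


results = {
    "Dev ID": dict(tp=59, tn=175, fp=33, fn=80),
    "Location": dict(tp=21, tn=113, fp=7, fn=30),
}
for name, c in results.items():
    acc = accuracy(c["tp"], c["tn"], c["fp"], c["fn"])
    f1 = f1_measure(c["tp"], c["fp"], c["fn"])
    print(f"{name}: accuracy={acc:.1%} f1={f1:.1%}")
```

A guesser that always answers "no leak" never produces a true positive, which is why its F1-measure is 0% regardless of its accuracy.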
21
Summary
† Spyware awareness is an appropriate way to combat privacy leakage.
† Detecting privacy leakage precisely is difficult: the dynamic taint analysis approach suffers from portability and overhead issues.
† We propose to discriminate privacy leakage events through app execution traces, which include binder calls and system calls.
† We design a set of tools and demonstrate the correlation between privacy leakage events and app execution traces through real-world experiments.
22
Future Work
† Improve the performance by:
• Investigating in-app signatures
• Trying more complicated features
† Analyze the insights from the results:
• Understand more about the traces.
† Improve our profiler and method by:
• Considering multi-process
• Considering cross-app leakage
† Develop and deploy such a tool for real-world usage.