8/12/2019 Android App Clones
Attack of the Clones: Detecting Cloned Applications on Android Markets
Jonathan Crussell (1, 2), Clint Gibler (1), and Hao Chen (1)
(1) University of California, Davis
(2) Sandia National Labs
Source: ESORICS 2012
http://clintgibler.com/
http://www.cs.ucdavis.edu/~hchen/
http://www.iit.cnr.it/esorics2012/
Outline
Introduction
Background
Threat Model
Clone Detection Approaches and Related Work
Methodology
Evaluation
Case Studies
Discussion
Conclusion
Introduction
Much of the user experience of Android relies on third-party apps.
Android has numerous marketplaces.
Protect users from malicious apps.
Protect developers from plagiarists.
Introduction
Developers can charge directly for their apps.
Or they can offer free apps that are ad-supported or contain in-game billing.
Some apps have two versions.
Paid app: cracked and released for free.
Free app: cloned with changed ad libraries.
Introduction
Background
Android Markets
Android Application Structure
Threat Model: Definition of Clone
Clones occur when two applications have similar code but have different ownership.
Ignore:
Third-party libraries
Multiple versions of the same application if they have the same ownership.
Resistance to Evasion Techniques.
High level modifications
Method Restructurings
Control Flow Alterations
Addition/Deletion
Reordering
Non Goals
Find cloning in native code.
Determine which applications are the victims and which are
Clone Detection Approaches: Feature Based
Feature based approaches analyze a program and extract a set of features.
Number or size of classes, methods, loops, or variables to in[...] libraries.
Low detection rate or high false positive rate.
Clone Detection Approaches: Structure Based
Structure based systems convert programs into a stream of tokens and then compare the streams between two programs.
More robust than feature based systems.
Examples: JPLAG, Winnowing, and MOSS.
Comparing DEX byte code streams could be a quick way to find exactly or near exactly copied code.
But byte code streams contain no higher level semantic knowledge about the code.
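To make the structure based idea concrete, here is a minimal sketch of comparing two token streams by fingerprinting. It hashes every run of k tokens and keeps the minimum hash of each sliding window, in the spirit of winnowing. The function names, the parameters `k` and `window`, and the opcode-like tokens in the usage note are illustrative assumptions, not the actual implementation of JPLAG, MOSS, or any tool named on the slide.

```python
import zlib


def kgram_hashes(tokens, k):
    """Hash every run of k consecutive tokens (CRC32 keeps results repeatable)."""
    return [
        zlib.crc32(" ".join(tokens[i:i + k]).encode("utf-8"))
        for i in range(len(tokens) - k + 1)
    ]


def winnow(hashes, window):
    """Keep the minimum hash of every sliding window of the given width."""
    if not hashes:
        return set()
    if len(hashes) < window:
        return {min(hashes)}
    return {min(hashes[i:i + window]) for i in range(len(hashes) - window + 1)}


def stream_similarity(tokens_a, tokens_b, k=3, window=4):
    """Jaccard similarity of the two fingerprint sets, a value in [0, 1]."""
    prints_a = winnow(kgram_hashes(tokens_a, k), window)
    prints_b = winnow(kgram_hashes(tokens_b, k), window)
    if not prints_a and not prints_b:
        return 0.0
    return len(prints_a & prints_b) / len(prints_a | prints_b)
```

Calling `stream_similarity(ops, ops)` on any stream of at least `k` tokens gives 1.0. Note that the score depends only on token order, which is why inserting or reordering statements can lower it even when the program's meaning is unchanged.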
Clone Detection Approaches: PDG Based
Program Dependence Graph:
each node is a statement
each edge shows a dependency between statements
two types of dependencies: data and control
A data dependency edge between statements $s_1$ and $s_2$ exists if there is a variable in $s_2$ whose value depends on $s_1$.
A control dependency between two statements exists if the value of the first statement controls whether the second statement executes.
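The two edge types can be illustrated with a small sketch. It assumes a toy statement representation (`Stmt`, with hand-written `defines`, `uses`, and `guard` fields) and a simplification in which the most recent writer of a variable is its only reaching definition. A real PDG builder such as WALA derives all of this from the bytecode, so treat the names and the representation here as hypothetical.

```python
from typing import NamedTuple, Optional, Tuple


class Stmt(NamedTuple):
    text: str                    # source text, for readability only
    defines: Optional[str]       # variable written by the statement, if any
    uses: Tuple[str, ...]        # variables read by the statement
    guard: Optional[int] = None  # index of the condition controlling it


def build_pdg(statements):
    """Return (data_edges, control_edges) as sets of (from_index, to_index)."""
    last_writer = {}
    data_edges, control_edges = set(), set()
    for index, stmt in enumerate(statements):
        for var in stmt.uses:
            if var in last_writer:
                # The value read here was produced by an earlier statement.
                data_edges.add((last_writer[var], index))
        if stmt.guard is not None:
            # The guard's outcome decides whether this statement executes.
            control_edges.add((stmt.guard, index))
        if stmt.defines is not None:
            last_writer[stmt.defines] = index
    return data_edges, control_edges


program = [
    Stmt("x = read_input()", "x", ()),
    Stmt("if x > 0:", None, ("x",)),
    Stmt("y = x * 2", "y", ("x",), guard=1),
    Stmt("print(y)", None, ("y",), guard=1),
]
data_edges, control_edges = build_pdg(program)
```

For this program the data edges are (0, 1), (0, 2), and (2, 3), and the control edges are (1, 2) and (1, 3).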
Related Work
Androguard, DEXCD, and DroidMOSS.
All these approaches are structure based or structure based approximations.
None of these tools use any semantic information to aid in detecting plagiarism.
Methodology
Selecting Potentially Cloned Applications
The goal of an application plagiarist is to entice unwary users to choose her cloned application instead of the original.
Name and description.
Determining Application Similarity Based on Attributes
We use Solr to mimic the search engines on Android markets.
Attributes of the apps:
name, package, market, owner, and description
http://lucene.apache.org/solr/
Constructing PDGs
dex2jar: Convert both apps' code from the DEX format to a JAR.
WALA: Construct PDGs for each method in every class of the applications.
Only data dependency edges: More robust against statement reordering, insertion, and deletion.
https://code.google.com/p/dex2jar/
http://wala.sourceforge.net/wiki/index.php/Main_Page
Comparing PDGs: Excluding Common Libraries
Examples: the Admob ad library, the Facebook API, etc.
Dumped both the package name and SHA-1 hash of known files and recorded the most frequent SHA-1 hashes for each
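A minimal sketch of the hash-based exclusion step, assuming the class files of an app are available as a mapping from file name to raw bytes and that a set of SHA-1 digests of known library files has already been collected. The function names and the data layout are assumptions made for illustration, not DNADroid's actual code.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """Hex SHA-1 digest of a file's raw bytes."""
    return hashlib.sha1(data).hexdigest()


def drop_known_library_files(files, known_hashes):
    """Return the names of files whose digest is not a known library digest.

    files: dict mapping file name -> bytes
    known_hashes: set of hex SHA-1 digests of known library files
    """
    return sorted(
        name for name, data in files.items()
        if sha1_hex(data) not in known_hashes
    )
```

Because the comparison is on exact digests, this only excludes byte-identical copies of a library file. A library that has been recompiled or modified would not match and would stay in the comparison.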
Lossless and Lossy Filters
Lossless filter: Removes PDGs from consideration that are smaller than a specified size (< 10 nodes).
Lossy filter: Calculate a frequency vector for each of the methods in the pair.
This vector counts how many times a specific node type occurs in the PDG.
Compare these two vectors using hypothesis testing (G-test).
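The two filters can be sketched as follows. The lossless filter is a plain size check. For the lossy filter, the sketch computes the standard G statistic, G = 2 * sum(O_i * ln(O_i / E_i)), treating one method's node-type counts as observed and scaling the other's to get expected counts. The cutoff `max_g` and the node type names are illustrative assumptions; the slides do not state the threshold or significance level used.

```python
import math


def passes_lossless(node_count, min_nodes=10):
    """Keep only PDGs with at least min_nodes nodes."""
    return node_count >= min_nodes


def g_statistic(observed, reference):
    """G statistic of observed node-type counts against a reference vector.

    Both arguments map node type -> count. Expected counts are the reference
    proportions scaled to the observed total.
    """
    total_observed = sum(observed.values())
    total_reference = sum(reference.values())
    g = 0.0
    for node_type, count in observed.items():
        if count == 0:
            continue  # a zero count contributes nothing to the sum
        ref = reference.get(node_type, 0)
        if ref == 0:
            return math.inf  # type seen in one method, absent in the other
        expected = total_observed * ref / total_reference
        g += count * math.log(count / expected)
    return 2.0 * g


def passes_lossy(freq_a, freq_b, max_g=10.0):
    """Keep the pair only if the vectors are statistically close (assumed cutoff)."""
    return g_statistic(freq_a, freq_b) <= max_g
```

Identical vectors give G = 0 and pass. Vectors with very different proportions give a large G and are dropped before the more expensive graph comparison. The filter is lossy because a genuine match can occasionally be rejected.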
Subgraph Isomorphism
Find a mapping between the nodes of one PDG and the nodes of the other PDG.
Subgraph isomorphism is NP-complete.
VF2 algorithm.
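To show what such a mapping looks like, here is a brute-force backtracking sketch. It is not VF2: it has none of VF2's pruning rules, and it checks only that every edge of the smaller graph maps onto an edge of the larger one and that node labels agree (an edge-preserving embedding). The graph representation, with edge lists plus label dictionaries, is an assumption made for illustration.

```python
def find_embedding(pattern_edges, pattern_labels, target_edges, target_labels):
    """Return a dict pattern node -> target node, or None if none exists.

    Edges are (source, destination) pairs; labels map node -> node type.
    """
    target_edge_set = set(target_edges)
    pattern_nodes = list(pattern_labels)

    def consistent(mapping, p, t):
        if pattern_labels[p] != target_labels[t]:
            return False
        for u, v in pattern_edges:
            if u == p and v == p:
                if (t, t) not in target_edge_set:
                    return False
            elif u == p and v in mapping:
                if (t, mapping[v]) not in target_edge_set:
                    return False
            elif v == p and u in mapping:
                if (mapping[u], t) not in target_edge_set:
                    return False
        return True

    def extend(mapping, used):
        if len(mapping) == len(pattern_nodes):
            return dict(mapping)
        p = pattern_nodes[len(mapping)]
        for t in target_labels:
            if t not in used and consistent(mapping, p, t):
                mapping[p] = t
                used.add(t)
                result = extend(mapping, used)
                if result is not None:
                    return result
                del mapping[p]  # undo the choice and try the next candidate
                used.discard(t)
        return None

    return extend({}, set())
```

The search is exponential in the worst case, which is consistent with the problem being NP-complete and is why cheap filters are applied before this step.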
Computing Similarity Scores
For each method $f$ (excluding the methods in known libraries) in application $A$, let $|f|$ be the number of nodes in this method's PDG.
Find the best match of this PDG in $B$'s PDGs and denote it as $m(f)$.
Similarity score: $\mathrm{sim}_A(B) = \frac{\sum_{f \in A} |m(f)|}{\sum_{f \in A} |f|}$
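A worked sketch of the score, under the assumption that it is the total number of matched PDG nodes divided by the total number of PDG nodes over all non-library methods of the first application. The method names and node counts below are made up for the example.

```python
def similarity_score(method_sizes, best_match_sizes):
    """Fraction of one app's PDG nodes that are matched in the other app.

    method_sizes: method name -> number of nodes in that method's PDG
    best_match_sizes: method name -> number of nodes in its best match
                      (a missing entry means no match was found)
    """
    total = sum(method_sizes.values())
    if total == 0:
        return 0.0
    matched = sum(
        # A match cannot cover more nodes than the method itself has.
        min(best_match_sizes.get(name, 0), size)
        for name, size in method_sizes.items()
    )
    return matched / total


sizes = {"onCreate": 20, "render": 30, "save": 10}
matches = {"onCreate": 20, "render": 15}
score = similarity_score(sizes, matches)
```

Here 35 of 60 nodes are matched, so the score is 35/60, roughly 0.58. The score is not symmetric: swapping the roles of the two applications generally gives a different value, because the denominator changes.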
Evaluation
75,000 free apps from 13 Android markets.
Randomly selected 9,400 pairs from the potential clones.
Hadoop: parallelize DNADroid.
HDFS: share data across a small cluster.
The average throughput of DNADroid on this small cluster is
application pairs per minute.
Similarity between Applications
Clustering Cloned Applications
Filter Performance
Visual and Behavioral Verification
Case Studies
Benign Cloning
DNADroid found 30 pairs that both have a 100% similarity score.
Translation.
Changes to Advertising Libraries
We can see when an application has most likely been cloned for monetary gain.
Example: XWind Downloader
For the 141 apps, we found that 91 (65%) of these pairs had differing libraries, all of which included changes to advertising libraries.
Malware Added to an Application
HippoSMS is a malicious application that requires 10 permissions.
It shares the same package name as a Chinese video player that requires 11 permissions.
6 permissions that the video player doesn't use.
Two Variants of the Same Malware
Two malicious apps that are identified by VirusTotal as being members of the BaseBridge malware family.
Both applications have been stripped of meaningful class and [...] names.
DNADroid found coverages of 35% and 28% between the two apps.
Use of Freeware Cracking Tool in the Wild: AntiLVL
Decompiles an app with baksmali.
Inserts a new file: SmaliHook.class.
Hides AntiLVL's modifications from the app itself by returning the original file size, MD5, and signatures.
Android License Verification Library (LVL), Amazon Appstore DRM, and Verizon DRM.
189 of 310 applications contained SmaliHook.class.
235 of 310 contained references to AntiLVL in their signature file.
Only 8% of our total apps were acquired from Chinese markets, [...] apps including AntiLVL traces were from Chinese markets.
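One of the traces above, the inserted SmaliHook.class file, is easy to check for once an app has been converted to a JAR. The sketch below assumes that conversion has already happened, as in the Constructing PDGs step, and that the JAR is available as raw bytes. The function names and the package path used in the usage note are hypothetical.

```python
import io
import zipfile


def contains_smalihook(jar_bytes: bytes) -> bool:
    """True if any entry in the archive is a class file named SmaliHook.class."""
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as archive:
        return any(
            name.split("/")[-1] == "SmaliHook.class"
            for name in archive.namelist()
        )


def make_jar(entry_names):
    """Build a small in-memory archive with empty entries, for demonstration."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in entry_names:
            archive.writestr(name, b"")
    return buffer.getvalue()
```

For example, `contains_smalihook(make_jar(["com/example/Main.class", "hook/SmaliHook.class"]))` returns True. A check like this finds only unmodified traces, since a cracked app whose hook class was renamed would not be flagged.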
Discussion
False Positive
Since it is a serious allegation to claim an application is a clone, we design DNADroid to have a very low false positive rate.
False Negative
Cloned applications often have similar attributes to the original.
There exist advanced program transformations that can evade PDG based clone detection.
Comparison to Other Approaches
Androguard: misses 18%.
DEXCD had problems running on the pairs DNADroid identified.
DroidMOSS is not currently publicly available.
Performance
DNADroid is more expensive but results in fewer false positives and false negatives.
Conclusion
DNADroid is a tool for finding clones on a large scale.
We evaluated DNADroid on applications crawled from 13 Android markets.
Identified at least 141 apps that have been cloned.
An additional 310 apps that were cracked with AntiLVL.
We describe five case studies.
DNADroid has a very low false positive rate
DNADroid is an effective tool.