Lessons learned from Android malware experiments

Jean-François Lalande

Security Days, Rennes, 9-10th January 2019

Slides: people.rennes.inria.fr/Jean-Francois.Lalande/talks/... (retrieved 2019-11-15)



Context

Google Play Store: 3.5 million applications (2017)

Malware is uploaded to the Play Store and to third-party markets

Research efforts:
- Detection, classification
- Payload extraction, unpacking, reverse engineering
- Execution, triggering

Difficulties for experimenting with Android malware samples?


Example: the MobiDash adware

Displays ads after some days... To observe the payload:
- Launch the application
- Modify com.cardgame.durak_preferences.xml
- Reboot the device
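The waiting period can be short-circuited by editing the stored timestamp directly. A minimal sketch of the idea, assuming a hypothetical preference key name (the real file is com.cardgame.durak_preferences.xml, pulled from and pushed back to the device with adb):

```python
import xml.etree.ElementTree as ET

def backdate_timestamp(prefs_xml: str, key: str, new_value: int) -> str:
    """Rewrite a <long> entry of an Android shared-preferences file.

    MobiDash waits several days after install before showing ads;
    backdating its stored timestamp lets an analyst trigger the
    payload without waiting.
    """
    root = ET.fromstring(prefs_xml)
    for entry in root.findall("long"):
        if entry.get("name") == key:
            entry.set("value", str(new_value))
    return ET.tostring(root, encoding="unicode")

# The key name "firstLaunchTime" is an assumption for illustration.
prefs = '<map><long name="firstLaunchTime" value="1546300800000" /></map>'
patched = backdate_timestamp(prefs, "firstLaunchTime", 0)
print(patched)  # the timestamp is now 0, i.e. far in the past
```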


Researchers

Usually they do:

- You have an idea
- You develop
- You take a dataset
- You evaluate

We also do this :)

We = Valérie Viet Triem Tong, Mourad Leslous (PhD), Pierre Graux (PhD), Cédric Herzog (PhD) and many other master students...


Designing an experiment from scratch

[Pipeline diagram: collect samples → check that they are malware → find the payload (with the help of manual decompilation and static analysis of the APK) → execution → monitoring actions → results]

We have no time for these manual steps! We want an automatic process...


Lessons learned

About datasets of malware: is there any dataset?

About analysis of malware: is it reliable and reproducible? How much does it cost?


1 Introduction

2 Datasets

3 Designing an experiment

4 Conclusion


Example from the state of the art

About papers that work on Android malware. . .

CopperDroid [Tam et al. 2015]: 1365 samples, 3% of payloads executed

IntelliDroid [Wong et al. 2016]: 10 samples, 90% of payloads executed

GroddDroid [us! 2015]: 100 samples, 24% of payloads executed

...

Is it easy to get these figures (and to reproduce them)? Are these results relevant?


Papers with Android malware experiments

use extracts of reference datasets:
- The Genome project (stopped!) [Zhou et al. 12]
- Contagio mobile dataset [Mila Parkour]
- Hand-crafted malicious apps (DroidBench [Arzt et al. 14])
- Some security challenges' apps

need to be significant:
- Tons of apps (e.g. 1.3 million for PhaLibs [Chen et al. 16])
- Some apps (e.g. 11 for TriggerScope [Fratantonio et al. 16])

Lessons learned 1 and 2: a well-documented dataset does not exist! Online services give poor information!


Some hope: new recent datasets

AndroZoo [Allix et al. 2016]: 3 million apps, with pairs of applications (repackaged?)

The AMD dataset [Wei et al. 2017]: 24,650 samples, with contextual information (classes, actions, ...)

We need more contextual information!
- Where is the payload?
- How to trigger the payload?
- Which device do I need?


Difficulties

1 Is this apk a malware?

2 Where is the payload?
- Locating the payload ≠ classifying malware/goodware
- What does the payload do?

3 Is the static analysis possible?
- What is the nature of the code?
- Is there any countermeasure?

4 How to execute the malware automatically?
- How to handle the GUI?
- How to find entry points?
- How to monitor the execution?


Check that a sample is a malware?

Manually... OK for 10 samples, but for more?

Ask VirusTotal!

- ~45 antivirus engines
- Use a threshold to decide (e.g. 20 antiviruses)
- Free upload API (a few samples / day)
- Used by others in papers

Is it a good idea?
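The decision rule boils down to a threshold sweep over per-sample detection counts. A minimal sketch with invented numbers (a real run would use the VirusTotal report of each sample):

```python
def flagged_at(detections, threshold):
    """Fraction of samples that at least `threshold` engines flag."""
    return sum(d >= threshold for d in detections) / len(detections)

# Toy per-sample detection counts standing in for a real scan campaign
# (the numbers are invented for illustration, not measured).
detections = [0, 1, 3, 5, 12, 18, 22, 30, 41]

for threshold in (1, 10, 20, 30):
    # The "malware ratio" depends heavily on the chosen threshold.
    print(threshold, flagged_at(detections, threshold))
```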


An experiment with 683 fresh samples

Threshold of x antiviruses recognizing a sample?


Check that a sample is a malware?

Not solved: using VirusTotal, for fresh new samples

Solved: for old well-known samples, by many learning papers (detection rate ≥ 90%)
- e.g. Milosevic et al.: precision of 87% with Random Forests
- e.g. Zhu et al.: precision of 88% with Rotation Forests

Lesson learned 3: reliable detection is an important sub-problem.


Where is the payload?

Seminal paper: "DroidAPIMiner: Mining API-Level Features for Robust Malware Detection in Android", Aafer et al. (2013)
⇒ Extract relevant features from API analysis. This enables to:
- give more meaning to the payload
- classify apps with more accuracy

Results from Aafer et al. (2013): detection accuracy, permission-based vs. API-based

[Table extracted from "DroidAPIMiner: Mining API-Level Features for Robust Malware Detection in Android", Aafer et al.]
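The feature idea can be sketched in a few lines (the API list below is a toy stand-in for the features DroidAPIMiner actually mines from decompiled code):

```python
# Toy sensitive-API list: illustrative only, not the paper's feature set.
SENSITIVE_APIS = (
    "android.telephony.SmsManager.sendTextMessage",
    "android.telephony.TelephonyManager.getDeviceId",
    "dalvik.system.DexClassLoader.loadClass",
)

def api_features(called_apis):
    """Binary feature vector over sensitive API calls.

    Unlike permission-based features, API-level features also hint at
    *where* the payload lives: each 1 maps back to concrete call sites.
    """
    calls = set(called_apis)
    return [int(api in calls) for api in SENSITIVE_APIS]

app_calls = [
    "android.telephony.SmsManager.sendTextMessage",
    "java.lang.String.length",
]
print(api_features(app_calls))  # [1, 0, 0]
```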


Analyzing malware

Main analysis methods are:

- static analysis: try to recognize known characteristics of malware in the code/resources of the studied applications

- dynamic analysis: try to execute the malware


Countermeasures against static analysis: reflection, obfuscation, dynamic loading, encryption, native code

Countermeasures against dynamic analysis: logic bomb, time bomb, remote server


Solving attacker’s countermeasures

Implemented / possible solutions against attackers' countermeasures:

Problem               Solution
malformed files       ignore them if possible
reflection            execute it
dynamic loading       execute it
logic/time bomb       force conditions
native code           watch it from the kernel
packing               ???
(dead) remote server  ???
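The "force conditions" line is the trickiest to picture. A toy sketch of the idea, in Python rather than the Dalvik bytecode the real tools rewrite:

```python
import datetime

FORCE_BRANCHES = True  # analysis mode: force suspicious guards

def guard(cond: bool) -> bool:
    """Stand-in for an instrumented conditional: branch-forcing tools
    (GroddDroid-style) rewrite the code so that a guard protecting a
    suspicious call always takes the branch leading to the payload."""
    return True if FORCE_BRANCHES else cond

triggered = []

def time_bomb(now: datetime.date) -> None:
    # A classic logic bomb: only fire after a hard-coded date.
    if guard(now > datetime.date(2030, 1, 1)):
        triggered.append("payload")

time_bomb(datetime.date(2019, 1, 9))
print(triggered)  # the payload runs even though the date check is false
```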

Lesson learned 4: dynamic analysis requires a lot of effort to be automated.


Reliability

Some malware crash (and people don't care...)

Crash ratio (at launch time):
- AMD dataset [Yang et al. 2017]: 5%
- Our native dataset: 20%

Lesson learned 5: we need to know the reasons behind the crashes.


Performances

Time evaluation (average):

For one app and one payload:

- Flashing the device: 60 s
- Static analysis: 7 s
- Dynamic analysis (execution): 4 m
- Total: 5 m

From the AMD dataset: 135 samples, 100 payloads per app, all 8 Android OS versions

Lesson learned 6: we need 1 year of experiments (with 1 device)...
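The arithmetic behind lesson 6, as a quick back-of-the-envelope check:

```python
samples = 135          # from the AMD dataset
payloads_per_app = 100
android_versions = 8
minutes_per_run = 5    # flash (60 s) + static (7 s) + dynamic (4 m), rounded

total_minutes = samples * payloads_per_app * android_versions * minutes_per_run
total_days = total_minutes / 60 / 24
print(total_days)  # 375.0 -> roughly one year on a single device
```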


Conclusion

Designing experiments on Android malware is a difficult challenge!

Lessons learned

- A well-documented dataset does not exist!

- Online services give poor information!

- Reliable detection is an important sub-problem.

- Dynamic analysis requires a lot of effort to be automated.

- We need to know the reasons behind the crashes.

- We need 1 year of experiments (with 1 device)...

http://kharon.gforge.inria.fr
http://kharon.gforge.inria.fr/dataset


Android’s Future

Evolution of the platform:

- Apps can be developed in Kotlin
- Fuchsia could become the new underlying OS

Android is everywhere:

- Wear 2+
- Android Automotive
- Android Things

Will malware exist?


© Inria / C. Morel

Thank you!


References

H. J. Zhu, Z. H. You, Z. X. Zhu, W. L. Shi, X. Chen, and L. Cheng, "DroidDet: Effective and robust detection of Android malware using static analysis along with rotation forest model," Neurocomputing, vol. 272, pp. 638–646, 2018.

N. Milosevic, A. Dehghantanha, and K.-K. R. Choo, "Machine learning aided Android malware classification," Comput. Electr. Eng., vol. 61, pp. 266–274, Jul. 2017.

Y. Aafer, W. Du, and H. Yin, "DroidAPIMiner: Mining API-Level Features for Robust Malware Detection in Android," Secur. Priv. Commun. Networks, vol. 127, pp. 86–103, 2013.

L. Li et al., "Understanding Android App Piggybacking: A Systematic Study of Malicious Code Grafting," IEEE Trans. Inf. Forensics Secur., vol. 12, no. 6, pp. 1269–1284, Jun. 2017.

W. Yang, D. Kong, T. Xie, and C. A. Gunter, "Malware Detection in Adversarial Settings: Exploiting Feature Evolutions and Confusions in Android Apps," 2017, pp. 288–302.

A. Abraham, R. Andriatsimandefitra, A. Brunelat, J.-F. Lalande, and V. Viet Triem Tong, "GroddDroid: A gorilla for triggering malicious behaviors," in 10th International Conference on Malicious and Unwanted Software (MALWARE 2015), 2016, pp. 119–127.

M. Leslous, V. Viet Triem Tong, J.-F. Lalande, and T. Genet, "GPFinder: Tracking the Invisible in Android Malware," in 12th International Conference on Malicious and Unwanted Software, 2017, pp. 39–46.

K. Allix, T. F. Bissyandé, J. Klein, and Y. Le Traon, "AndroZoo: Collecting Millions of Android Apps for the Research Community," in Mining Software Repositories (MSR), 2016.