ProfilingCryptography Developers - Portalscg.unibe.ch/download/softwarecomposition/2020-07-28... ·...

26
Profiling Cryptography Developers Bachelor Project of Said Ali supervised by Mohammadreza Hazhirpasand

Transcript of ProfilingCryptography Developers - Portalscg.unibe.ch/download/softwarecomposition/2020-07-28... ·...

  • Profiling CryptographyDevelopers

    Bachelor Project of Said Alisupervised by Mohammadreza Hazhirpasand

  • Research Question

    2

    Correlation between crypto developer activity on Stack Overflow

    andcrypto developer contribution on GitHub

  • Pipeline

    Crypto tag

    • Heuristic 1• Heuristic 2

    3

    Crypto user

    • Stack Exchange

    Account

    • Scraping• Manual search

    Crypto file

    • GH repo API• GH code API• Scraping

    Crypto contributor

    • Git Blame• File author• File committer

  • Results

    4

    478 380 66 76

    No GitHub GitHub - Scraping GitHub - Links manual search GitHub - Google manual search

    52.2% Match

  • 5

    {23‘633 Repositories}

    478 522

    # User without gitHub # User with gitHub

    Results

  • 6

    0

    1000

    2000

    3000

    4000

    5000

    6000

    7000

    Shell JavaScr ipt HTML Python Ruby CSS C Perl C++ Java C# PHP Objective-C Haskell Rust

    0

    1000

    2000

    3000

    4000

    5000

    6000

    7000

    Shell

    JavaS

    cr ipt

    HTML

    Pytho

    nRu

    by CSS C

    Perl

    C++

    Java C# PH

    P

    Objec

    tive-C

    Haske

    llRu

    st

    # Repositories

    Results

  • 7

    0

    1000

    2000

    3000

    4000

    5000

    6000

    7000

    Shell JavaScr ipt HTML Python Ruby CSS C Perl C++ Java C# PHP Objective-C Haskell Rust

    0

    1000

    2000

    3000

    4000

    5000

    6000

    7000

    Shell

    JavaS

    cr ipt

    HTML

    Pytho

    nRu

    by CSS C

    Perl

    C++

    Java C# PH

    P

    Objec

    tive-C

    Haske

    llRu

    st

    # Repositories

    Results

  • 8

    Python Ruby C C++ Java C# Javascript Rust

    from passlib. require 'rbnacl' include "tomcrypt_ include

  • ResultsGit Blame

    0

    1000

    2000

    3000

    4000

    5000

    6000

    7000

    Shell JavaScr ipt HTML Python Ruby CSS C Perl C++ Java C# PHP Objective-C Haskell Rust

    Non crypto reposi torie Crypto related repositorie

    9

  • Crypto contributors

    10

    5

    18

    33

    37

    40

    42

    49

    RUST

    C++

    PYTHON

    RUBY

    C#

    C

    JAVA

    # Developer contributed to crypto file

    522

    { email, fullname and username }

  • Data Analysis

    11

  • Overall Analysis

    • Crypto contributors vs no crypto contributors• {reputation, crypto score and crypto accepted answers}

    • Crypto contributors• {crypto score, crypto accepted answers, # crypto file contributions}• {crypto score-reputation-crypto accepted answers}

    332189

    ReputationCrypto score

    Crypto Accepted answers

    Crypto contributors No cypto contributors

    12

  • Mann-Whitney U test

    • Compares probability to get higher value • H0 : difference between two groups is not statistically significant

    • accepted: p-value > α=0.05• medians of the two samples are identical.

    • H1 : difference between two groups is statistically significant

    • Non normal distribution• Independent variable of two categories• Independence of observations

    13

  • Visual analysis

    Mean 173.5 113

    Median 69 56

    Standard Deviation 341 193.9

    39.6 31.8

    19 18.5

    60.5 45.7

    67223.6 72215.9

    40738 46877

    71989.5 73956.87

    0

    50

    100

    150

    200

    250

    300

    350

    400

    450

    Score Score

    0

    10

    20

    30

    40

    50

    60

    Answer Answer

    0

    50000

    100000

    150000

    200000

    250000

    300000

    Reputation Reputation

    14

  • Mann-Whitney U test

    • Crypto contributors vs no crypto contributors

    • Score• H0 accepted: p-value = 0.25 > α

    • Reputation• H0 accepted: p-value = 0.29 > α

    • Answers• H0 accepted: p-value = 0.458 > α

    No significant difference

    15

  • Visual analysis

    Crypto accepted answers of crypto contributors

    Mean 39.6

    Median 19

    Standard Deviation 60.5

    LOW

    HIGH

    0

    10

    20

    30

    40

    50

    60

    Answer

    16

  • Mann-Whitney U test

    • Accepted answers of crypto contributors

    • # Crypto file contribution• H0 accepted: p-value = 0.14 > α

    =HIGH LOW

    17

  • Visual analysis

    Mean 173.5

    Median 69

    Standard Deviation 341

    Score of crypto contributors

    LOW

    HIGH

    0

    50

    100

    150

    200

    250

    300

    350

    400

    450

    Score

    18

  • Mann-Whitney U test

    • Score of crypto contributors

    • # Crypto file contribution• H0 accepted: p-value = 0.645668 > α

    =HIGH LOW

    19

  • Visual analysis

    Mean 14.8

    Median 3

    Standard Deviation 39.3

    Number of crypto file contribution

    0

    5

    10

    15

    20

    25

    30

    35

    #Crypto file contribution

    LOW

    HIGH

    20

  • Mann-Whitney U test

    • Number of crypto file contribution

    • Score• H0 accepted: p-value = 0.7462 > α

    • Reputation• H0 accepted: p-value = 0.7422 > α

    • Answers• H0 accepted: p-value = 0.2613 > α

    =HIGH LOW

    21

  • Visual analysis

    Mean 173.5

    Median 69

    Standard Deviation 341

    39.6

    19

    60.5

    67223.6

    40738

    71989.5

    Crypto activities of crypto contributors

    0

    50

    100

    150

    200

    250

    300

    350

    400

    450

    Score

    0

    10

    20

    30

    40

    50

    60

    Answer

    0

    50000

    100000

    150000

    200000

    250000

    Reputation

    22

  • Mann-Whitney U test

    • Crypto activites of crypto contributors

    • # Crypto file contribution• H0 accepted: p-value = 0.39 > α

    =HIGH LOW

    23

  • Results

    • No significant difference between crypto developer activity andcrypto developer contributon

    HIGH LOW

    crypto activity = crypto file contribution

    24

    Reputation Crypto score

    Crypto accepted answers

  • Future Work

    • Only GitHub• Add additional Q&A and development platforms

    • Only one idenitity linkage• Email MD5• Identity linkage Software (TBIL)

    • Other programming languages• Crypto libraries

    25

  • Conclusion

    26