Automated code audits

Post on 28-Jun-2015

179 views 0 download

Tags:

description

Even nowadays, PHP code is mostly manually audited. Expert pore over actual code, in search for bugs or code smells. Actually, it is possible to have PHP do this work itself ! Strengthened with the internal Tokenizer, bolstered by the manual, it is able to scan thousands of lines of code, without getting bored, and bringing pragmatic pieces of wisdom: official manual recommendations, version migration, code pruning and security. In the end, it deliver a global overview of the code, without reading it.

Transcript of Automated code audits

Automatic code audits

Rotterdam, Nederland, October 9th 010PHP

Definition

A kind of code analysis where the code is reviewed without running it.

Just like we would do ourselves!

Who is speaking?

Damien Seguy

CTO at exakat

Phather of the plush toy elePHPant

Back from China

Stepping up automated code audit services

Internals

Code

Analyze Report

AST

<?php!function x($a) {         return $a;} x(1, 2); ?>

Found

Dead code

Undefined structures

Unused structures

Illogical exp.

Slow code

Bad practices

Unsafe code

Maintainability

Bug issue

Ancient style

Uninitialized vars

Taint propagation

<?phpswitch ($this->consume()){     case "\x09":    case "\x0A":    case "\x0B":    case "\x0B":    case "\x0C":    case "\x20":    case "\x3C":    case "\x26":    case false:        break;    case "\x23":        switch ($this->consume())        {            case "\x78":            case "\x58":                $range = '0123456789ABCDEFabcdef';                $hex = true;                break;        }    }{?>

    protected function openString($end, &$out=null, $nestingOpen, $rejectStrs = null) {         $nestingLevel = $count = 0;        $content = array();        while ($this->match($patt, $m, false)) {             $tok = $m[2];            if ($tok == "@{" && $this->interpolation($inter)) {                 $content[] = $inter;                continue;            }            if (!empty($rejectStrs) && in_array($tok, $rejectStrs)) {                 $ount = null;                break;            }            $content[] = $tok;            $count += strlen($tok);        }        $this->eatWhiteDefault = $oldWhite;        if (count($content) == 0) return false;        $out = array("string", "", $content);        return true;    }

Spot bugs early

Code Test PreProd Production

Run it at commit Run it as audit

Static audit vs Unit test

No running

100% of the code

Symbolic testing

Little configuration

Has false positive

Mostly internal

Needs dedicated servers

Will only scan a part

Test only provided data

Write scenario

Has false negative

Can be handed to users

When does it help

Help port to a new system

Search for weak code fragments

Audit external libraries

Hint at refactoring

Report

Bugs

Useless code

Suggestions

Bugs

<?php !!  if($content = file_get_contents($file))    {        $content = trim($content);        $content = substr($content, -2) == '?>' ? substr($content, 0, -2) : $content;    } ! return true;!?> !

Useless code<?php!!// inside a legit class $this->module->xmlRequest;$_G['setting']['debug'];$post['usesig'] ? ($_G['setting']['sigviewcond'] ?  (strlen($post['message']) > $_G['setting']['sigviewcond'] ?!! ! ! !  $post['signature'] : '') !! ! ! ! ! : $post['signature']) : '';?> !

Suggestions

<?php !// Nested ternary should be turned into if then structures     $operation == 'ENCODE' ? sprintf('%010d', $expiry ? $expiry + time( ) : 0) . substr(md5($string . $egiskeys), 0, 16) . $string : base64_decode(substr($string, $key_length))    // Multiply by one is useless     SetCache($prefix, $key, $row, 60*60*1);

// Portable syntax     $bits = split('.', $string);    $y = $bits[0];     ! // Modern syntax     $y = split('.', $string)[0];?> !

Where it doesn’t help

Unit tests

Architecture

Old traditions that won’t change

Semantic errors

Architecture

No framework context

Conception is done before coding

Of course!

Static audit will report standards, not norms

Old traditions

<?php    $pna = explode(')(', $pn);    while (list($k, $v) = each($pna)) {        $lst = explode('"', $v);        if (isset($lst[3])) {            $pn[$lst[1]] = $lst[3];        } else {            $pn[$lst[1]] = '';        }    } ?>

10 % of current applications uses this instead of foreach()

Semantic errors

<?php     $babycarriage = new carriage();     $wheel1 = new Racingwheel();     $wheel2 = new Racingwheel();     $wheel3 = new Racingwheel();     $wheel4 = new Rhinoceros();         $babycarriage->installWheels($wheel1, ## # # # # # # # # # # # $wheel2, ## # # # # # # # # # $wheel3, ## # # # # # # # # # $wheel4);     ?>

Undefined classes : Vehicle, Racingwheel, Rhinoceros

Semantic errors

Available analyzers

PHP code sniffer

PHP MD

Scrutinizer-ci

Fortify

insight from Sensio

damien.seguy@gmail.com

http://www.slideshare.net/dseguy

THE END

http://010php.nl/

http://www.meetup.com/010PHP/