... now write an interpreter (PHPem 2016)

James Titcumb (@asgrim), PHPem Unconference 2016

Page 1: ... now write an interpreter (PHPem 2016)

@asgrim

… and write an interpreter

James Titcumb

PHPem Unconference 2016

Page 2: ... now write an interpreter (PHPem 2016)

Who is this guy?

James Titcumb

www.jamestitcumb.com

www.roave.com

www.phphants.co.uk

www.phpsouthcoast.co.uk

@asgrim

Page 3: ... now write an interpreter (PHPem 2016)

Let’s write an interpreter

In three easy steps…

Page 4: ... now write an interpreter (PHPem 2016)

Warning: do not use in production

Page 5: ... now write an interpreter (PHPem 2016)

View > Source

https://github.com/asgrim/basic-maths-compiler

Page 6: ... now write an interpreter (PHPem 2016)

Define the language

Tokens

● T_ADD (+)
● T_SUBTRACT (-)
● T_MULTIPLY (*)
● T_DIVIDE (/)
● T_INTEGER (\d+)
● T_WHITESPACE (\s+)
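The token list above could be backed by a small value object. The sketch below is an assumption rather than the class from the linked repository: the constant values and method bodies are made up for illustration, and only the method names (getToken, getLexeme, isOperator) are taken from calls that appear on later slides.

```php
<?php
// Hypothetical Token value object; constant values and bodies are assumptions.
final class Token
{
    const T_ADD = 'T_ADD';
    const T_SUBTRACT = 'T_SUBTRACT';
    const T_MULTIPLY = 'T_MULTIPLY';
    const T_DIVIDE = 'T_DIVIDE';
    const T_INTEGER = 'T_INTEGER';
    const T_WHITESPACE = 'T_WHITESPACE';

    private $token;
    private $lexeme;

    public function __construct(string $token, string $lexeme)
    {
        $this->token = $token;   // one of the T_* constants
        $this->lexeme = $lexeme; // the exact text that was matched
    }

    public function getToken() : string
    {
        return $this->token;
    }

    public function getLexeme() : string
    {
        return $this->lexeme;
    }

    public function isOperator() : bool
    {
        return in_array(
            $this->token,
            [self::T_ADD, self::T_SUBTRACT, self::T_MULTIPLY, self::T_DIVIDE],
            true
        );
    }
}

$plus = new Token(Token::T_ADD, '+');
$five = new Token(Token::T_INTEGER, '5');
var_dump($plus->isOperator()); // bool(true)
var_dump($five->isOperator()); // bool(false)
```

Keeping the lexeme next to the token type is what lets the lexer advance its offset by the length of the matched text.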

Page 7: ... now write an interpreter (PHPem 2016)

Step 1: Writing a simple lexer

Page 8: ... now write an interpreter (PHPem 2016)

Using regular expressions

private static $matches = [
    '/^(\+)/' => Token::T_ADD,
    '/^(-)/' => Token::T_SUBTRACT,
    '/^(\*)/' => Token::T_MULTIPLY,
    '/^(\/)/' => Token::T_DIVIDE,
    '/^(\d+)/' => Token::T_INTEGER,
    '/^(\s+)/' => Token::T_WHITESPACE,
];

Page 9: ... now write an interpreter (PHPem 2016)

Step through the input string

public function __invoke(string $input) : array
{
    $tokens = [];
    $offset = 0;

    while ($offset < strlen($input)) {
        // Match against the unconsumed remainder of the input
        $focus = substr($input, $offset);
        $result = $this->match($focus);
        $tokens[] = $result;

        // Advance past the text the token consumed
        $offset += strlen($result->getLexeme());
    }

    return $tokens;
}

Page 10: ... now write an interpreter (PHPem 2016)

The matching method

private function match(string $input) : Token
{
    // Try each pattern in turn; the first one that matches wins
    foreach (self::$matches as $pattern => $token) {
        if (preg_match($pattern, $input, $matches)) {
            return new Token($token, $matches[1]);
        }
    }

    throw new \RuntimeException(sprintf(
        'Unmatched token, next 15 chars were: %s',
        substr($input, 0, 15)
    ));
}
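To try the lexing step without the surrounding class, the loop and the matching method can be folded into one function. This is a minimal sketch under assumptions: the function name lex is hypothetical, and tokens are represented as plain [type, lexeme] arrays instead of Token objects.

```php
<?php
// Sketch only: a standalone lexer returning [type, lexeme] pairs.
function lex(string $input) : array
{
    $patterns = [
        '/^(\+)/'  => 'T_ADD',
        '/^(-)/'   => 'T_SUBTRACT',
        '/^(\*)/'  => 'T_MULTIPLY',
        '/^(\/)/'  => 'T_DIVIDE',
        '/^(\d+)/' => 'T_INTEGER',
        '/^(\s+)/' => 'T_WHITESPACE',
    ];

    $tokens = [];
    $offset = 0;
    $length = strlen($input);

    while ($offset < $length) {
        $focus = substr($input, $offset); // unconsumed remainder
        $matched = false;

        foreach ($patterns as $pattern => $type) {
            if (preg_match($pattern, $focus, $m)) {
                $tokens[] = [$type, $m[1]];
                $offset += strlen($m[1]); // every pattern consumes at least one char
                $matched = true;
                break;
            }
        }

        if (!$matched) {
            throw new \RuntimeException(sprintf(
                'Unmatched token, next 15 chars were: %s',
                substr($focus, 0, 15)
            ));
        }
    }

    return $tokens;
}

$types = array_map(function (array $token) {
    return $token[0];
}, lex('1 + 2'));

echo implode(' ', $types), "\n";
// T_INTEGER T_WHITESPACE T_ADD T_WHITESPACE T_INTEGER
```

Whitespace comes back as ordinary tokens here, so a later stage would need to skip or filter them before parsing.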

Page 11: ... now write an interpreter (PHPem 2016)

Step 2: Parsing the tokens

Page 12: ... now write an interpreter (PHPem 2016)

Order tokens by operator precedence

/**
 * Higher number is higher precedence.
 * @var int[]
 */
private static $operatorPrecedence = [
    Token::T_SUBTRACT => 0,
    Token::T_ADD => 1,
    Token::T_DIVIDE => 2,
    Token::T_MULTIPLY => 3,
];

Page 13: ... now write an interpreter (PHPem 2016)

Order tokens by operator precedence

/** @var Token[] $stack */
$stack = [];

/** @var Token[] $operators */
$operators = [];

while (false !== ($token = current($tokens))) {
    if ($token->isOperator()) {
        // ...
    }

    $stack[] = $token;
    next($tokens);
}

Page 17: ... now write an interpreter (PHPem 2016)

Order tokens by operator precedence

if ($token->isOperator()) {
    $tokenPrecedence = self::$operatorPrecedence[$token->getToken()];

    while (
        count($operators)
        && self::$operatorPrecedence[$operators[count($operators) - 1]->getToken()]
            > $tokenPrecedence
    ) {
        $higherOp = array_pop($operators);
        $stack[] = $higherOp;
    }

    $operators[] = $token;
    next($tokens);
    continue;
}

Page 21: ... now write an interpreter (PHPem 2016)

Order tokens by operator precedence

// Clean up by moving any remaining operators onto the token stack
while (count($operators)) {
    $stack[] = array_pop($operators);
}

return $stack;

Page 22: ... now write an interpreter (PHPem 2016)

Order tokens by operator precedence

1 + 2 * 3

Output stack: (empty)

Operator stack: (empty)

Page 23: ... now write an interpreter (PHPem 2016)

Order tokens by operator precedence

1 + 2 * 3

Output stack: 1

Operator stack: (empty)

Page 24: ... now write an interpreter (PHPem 2016)

Order tokens by operator precedence

1 + 2 * 3

Output stack: 1

Operator stack: +

Page 25: ... now write an interpreter (PHPem 2016)

Order tokens by operator precedence

1 + 2 * 3

Output stack: 1 2

Operator stack: +

Page 26: ... now write an interpreter (PHPem 2016)

Order tokens by operator precedence

1 + 2 * 3

Output stack: 1 2

Operator stack: + *

Page 27: ... now write an interpreter (PHPem 2016)

Order tokens by operator precedence

1 + 2 * 3

Output stack: 1 2 3

Operator stack: + *

Page 28: ... now write an interpreter (PHPem 2016)

Order tokens by operator precedence

1 + 2 * 3

Output stack: 1 2 3 *

Operator stack: +

Page 29: ... now write an interpreter (PHPem 2016)

Order tokens by operator precedence

1 + 2 * 3

Output stack: 1 2 3 * +

Operator stack: (empty)
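The walk-through above can be reproduced with a short standalone function. This is a sketch under stated assumptions, not the parser from the talk: the name toPostfix is hypothetical, it works on plain lexeme strings with whitespace already removed, and it deliberately differs from the precedence table on the slide. Here + and - share one level, * and / share another, and the comparison is >= rather than >, which is the usual way to make operators of equal precedence apply left to right.

```php
<?php
// Sketch: reorder infix lexemes into postfix (reverse Polish) order.
// Assumption: equal precedence for +/- and for * and /, compared with >=.
function toPostfix(array $lexemes) : array
{
    $precedence = ['-' => 0, '+' => 0, '/' => 1, '*' => 1];
    $output = [];
    $operators = [];

    foreach ($lexemes as $lexeme) {
        if (!isset($precedence[$lexeme])) {
            $output[] = $lexeme; // operand: goes straight to the output stack
            continue;
        }

        // Pop operators that bind at least as tightly as the incoming one
        while ($operators && $precedence[end($operators)] >= $precedence[$lexeme]) {
            $output[] = array_pop($operators);
        }

        $operators[] = $lexeme;
    }

    // Clean up: move any remaining operators onto the output stack
    while ($operators) {
        $output[] = array_pop($operators);
    }

    return $output;
}

echo implode(' ', toPostfix(['1', '+', '2', '*', '3'])), "\n"; // 1 2 3 * +
echo implode(' ', toPostfix(['8', '/', '2', '*', '2'])), "\n"; // 8 2 / 2 *
```

The second call shows why the comparison matters: with >= the division is emitted before the multiplication, so the expression is read as (8 / 2) * 2.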

Page 30: ... now write an interpreter (PHPem 2016)

Create AST

while ($ip < count($tokenStack)) {
    $token = $tokenStack[$ip++];

    if ($token->isOperator()) {
        // (figure out $nodeType)
        $right = array_pop($astStack);
        $left = array_pop($astStack);
        $astStack[] = new $nodeType($left, $right);
        continue;
    }

    $astStack[] = new Node\Scalar\IntegerValue((int)$token->getLexeme());
}

Page 34: ... now write an interpreter (PHPem 2016)

Create AST

Node\BinaryOp\Add (
    Node\Scalar\IntegerValue(1),
    Node\BinaryOp\Multiply (
        Node\Scalar\IntegerValue(2),
        Node\Scalar\IntegerValue(3)
    )
)
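The tree above falls out of reading the postfix sequence 1 2 3 * + left to right. A minimal sketch of that step follows, assuming plain arrays in place of the Node\... classes; the function name buildAst and the array keys (type, left, right, value) are hypothetical.

```php
<?php
// Sketch: build a nested-array AST from postfix lexemes.
// Arrays stand in for the Node\BinaryOp\* and Node\Scalar\IntegerValue classes.
function buildAst(array $postfix) : array
{
    $nodeTypes = ['+' => 'Add', '-' => 'Subtract', '*' => 'Multiply', '/' => 'Divide'];
    $astStack = [];

    foreach ($postfix as $lexeme) {
        if (isset($nodeTypes[$lexeme])) {
            if (count($astStack) < 2) {
                throw new \RuntimeException('Operator is missing an operand');
            }

            // The right operand was pushed last, so it is popped first
            $right = array_pop($astStack);
            $left = array_pop($astStack);
            $astStack[] = ['type' => $nodeTypes[$lexeme], 'left' => $left, 'right' => $right];
            continue;
        }

        $astStack[] = ['type' => 'IntegerValue', 'value' => (int) $lexeme];
    }

    if (count($astStack) !== 1) {
        throw new \RuntimeException('Expression did not reduce to a single node');
    }

    return $astStack[0];
}

$ast = buildAst(['1', '2', '3', '*', '+']);
echo $ast['type'], ' / ', $ast['right']['type'], "\n"; // Add / Multiply
```

The two operand-count checks are additions for the sketch; they turn malformed input such as a lone operator into an exception instead of a silent null operand.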

Page 35: ... now write an interpreter (PHPem 2016)

Step 3: Executing the AST

Page 36: ... now write an interpreter (PHPem 2016)

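Compile & execute AST

private function compileNode(NodeInterface $node)
{
    if ($node instanceof Node\BinaryOp\AbstractBinaryOp) {
        return $this->compileBinaryOp($node);
    }

    if ($node instanceof Node\Scalar\IntegerValue) {
        return $node->getValue();
    }

    // Fail loudly instead of silently returning null for an unknown node
    throw new \InvalidArgumentException(sprintf(
        'Unsupported node type: %s',
        get_class($node)
    ));
}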

Page 37: ... now write an interpreter (PHPem 2016)

Compile & execute AST

private function compileBinaryOp(Node\BinaryOp\AbstractBinaryOp $node)
{
    // Evaluate both operands first, then apply the operator
    $left = $this->compileNode($node->getLeft());
    $right = $this->compileNode($node->getRight());

    switch (get_class($node)) {
        case Node\BinaryOp\Add::class:
            return $left + $right;
        case Node\BinaryOp\Subtract::class:
            return $left - $right;
        case Node\BinaryOp\Multiply::class:
            return $left * $right;
        case Node\BinaryOp\Divide::class:
            return $left / $right;
    }

    // Fail loudly instead of silently returning null for an unknown operator
    throw new \InvalidArgumentException(sprintf(
        'Unsupported binary operation: %s',
        get_class($node)
    ));
}
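The same recursive evaluation can be tried in isolation. The sketch below assumes nested arrays in place of the Node classes, with hypothetical keys (type, left, right, value) and a hypothetical function name, evaluate. The explicit division-by-zero check is an addition and does not appear on the slides.

```php
<?php
// Sketch: recursively evaluate a nested-array AST.
function evaluate(array $node)
{
    if ($node['type'] === 'IntegerValue') {
        return $node['value'];
    }

    // Evaluate both children first, then apply this node's operator
    $left = evaluate($node['left']);
    $right = evaluate($node['right']);

    switch ($node['type']) {
        case 'Add':
            return $left + $right;
        case 'Subtract':
            return $left - $right;
        case 'Multiply':
            return $left * $right;
        case 'Divide':
            if ($right == 0) {
                throw new \DivisionByZeroError('Division by zero');
            }
            return $left / $right;
    }

    throw new \InvalidArgumentException('Unknown node type: ' . $node['type']);
}

// The AST for 1 + 2 * 3
$ast = [
    'type' => 'Add',
    'left' => ['type' => 'IntegerValue', 'value' => 1],
    'right' => [
        'type' => 'Multiply',
        'left' => ['type' => 'IntegerValue', 'value' => 2],
        'right' => ['type' => 'IntegerValue', 'value' => 3],
    ],
];

var_dump(evaluate($ast)); // int(7)
```

Because the tree already encodes precedence, the evaluator needs no knowledge of operator ordering; it only walks the children depth first.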

Page 38: ... now write an interpreter (PHPem 2016)

Any questions?

James Titcumb @asgrim