YOLT

Yuan Zheng, Omar Ahmed, Lukas Dudkowski, T. Mark Kuba


Transcript of YOLT

Page 1: YOLT

YOLT

Yuan Zheng
Omar Ahmed
Lukas Dudkowski
T. Mark Kuba

Page 2: YOLT

Overview of YOLT

● Simple scripting language
● Easy for coding and maintenance
● Regular expression support
● := and @
● "Web-scraping" uses
    ● Natural Language Processing
    ● Generating RSS Feeds
    ● Reformatting HTML for other uses (XML, etc.)

Page 3: YOLT

A Useful YOLT Program

Page 4: YOLT

Semantics

The YOLT semantic checker is extremely simple. It serves a few main tasks:

● Make sure that functions are declared properly, i.e. function declarations match functions, and function calls match the declarations.
● Make sure that variables are initialized before they are used (or, in some cases, un-initialized).
● (Redundant) Make sure that the tree is properly formed (i.e. make sure that an if-then-else node has exactly three children, etc.).

Note: there was once basic type-checking, but no longer.
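
The three tasks above can be illustrated with a minimal sketch. It is written in Python rather than the project's Java, and the `Node` class, the node kinds ("assign", "var", "call", "if") and the error messages are all hypothetical names chosen for this example, not YOLT's actual AST. It also ignores control flow: a variable assigned in any branch counts as initialized afterwards.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    kind: str                      # hypothetical kinds: "block", "assign", "var", "call", "if", "string"
    value: str = ""                # variable name, function name, or literal text
    children: list = field(default_factory=list)

def check(node, functions, initialized, errors):
    """Walk the tree once, appending a message to errors for each problem found.

    functions maps a declared function name to its argument count;
    initialized is the set of variable names assigned so far.
    """
    if node.kind == "if" and len(node.children) != 3:
        errors.append("if-then-else node must have exactly three children")
    if node.kind == "assign":
        if len(node.children) != 2:
            errors.append("assignment node must have exactly two children")
            return
        target, expr = node.children
        check(expr, functions, initialized, errors)   # the right-hand side is read first
        initialized.add(target.value)                 # only then is the target defined
        return
    if node.kind == "var" and node.value not in initialized:
        errors.append(f"variable {node.value} used before initialization")
    if node.kind == "call":
        arity = functions.get(node.value)
        if arity is None:
            errors.append(f"call to undeclared function {node.value}")
        elif arity != len(node.children):
            errors.append(f"function {node.value} expects {arity} argument(s)")
    for child in node.children:
        check(child, functions, initialized, errors)

# Usage: $a is assigned, $b is not, and show() is declared with one argument.
program = Node("block", children=[
    Node("assign", children=[Node("var", "$a"), Node("string", "hello")]),
    Node("call", "show", [Node("var", "$a"), Node("var", "$b")]),
])
errors = []
check(program, {"show": 1}, set(), errors)
```

After this run, `errors` holds one message for the argument-count mismatch and one for the uninitialized `$b`.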

Page 5: YOLT

Semantics Lessons Learned

● It is very easy to do too much in semantic checking.
● Either there are types, or no types (NO MIDDLE GROUND).
● Scripting languages are an enormous relief to a semantic checker: they take away the biggest hassles.
● The tree walker should know EXACTLY what the structure of the AST will look like and cannot make ANY assumptions. Things, as evident, can break down when you least expect them to.

Page 6: YOLT

Code Generation

● Written in Java
● Input: correct AST
● Output: Perl program

Diagram: AST -> Code generator (Java) -> Perl program

Page 7: YOLT

Implementation

● Walk the AST
● According to the information of the node, generate code or go down to the child node

e.g.:

        :=
       /  \
     $a    http://www.columbia.edu

● Go down the tree at node ":="
● Generate code at nodes "$a" and "http://www.columbia.edu"
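
A minimal sketch of this kind of walk, in Python rather than the project's Java: interior nodes recurse into their children and leaf nodes emit text. The tuple-based node shape and the node kinds are assumptions made for the example. It handles a plain "=" assignment and "echo" only; the ":=" operator expands into several Perl statements and is not covered here.

```python
def perl_string(text):
    """Quote text as a Perl double-quoted literal, escaping characters Perl would interpret."""
    for ch in ("\\", '"', "$", "@"):
        text = text.replace(ch, "\\" + ch)
    return '"' + text + '"'

def generate(node):
    """Return Perl source text for one AST node given as a tuple (kind, *parts)."""
    kind = node[0]
    if kind == "var":        # leaf: emit the variable name as-is, e.g. $a
        return node[1]
    if kind == "string":     # leaf: emit a quoted literal
        return perl_string(node[1])
    if kind == "=":          # interior: go down to both children, then join the results
        return generate(node[1]) + " = " + generate(node[2]) + ";"
    if kind == "echo":       # YOLT echo becomes Perl print
        return "print " + generate(node[1]) + ";"
    if kind == "block":      # a sequence of statements, one per line
        return "\n".join(generate(child) for child in node[1:])
    raise ValueError("unknown node kind: " + kind)

# Usage: an assignment followed by an echo of the same variable.
tree = ("block",
        ("=", ("var", "$a"), ("string", "http://www.columbia.edu")),
        ("echo", ("var", "$a")))
perl_source = generate(tree)
```

Raising on an unknown node kind follows the spirit of the lesson that the tree walker cannot make any assumptions about the AST.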

Page 8: YOLT

Implementation (tricks)

● The httpget operator :=
    ● Invoke the UNIX command "wget" (through a system call) to download the web page into a temp file
    ● Read the file line by line and store the lines in a Perl array
    ● Invoke another UNIX command, "rm", to remove the temp file
    ● Keep the web address in a Perl scalar
● Scalars and arrays use the same syntax
    ● The compiler (code generator) "guesses" whether the variable is a scalar or an array
    ● Arrays can only appear in certain places (e.g. foreach)
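
A minimal sketch of how a generator might expand ":=" into the steps listed above, in Python rather than the project's Java. The emitted Perl lines follow the shape of the generated code shown in the integration test later in the deck; the function names and the rule of deriving the temp file name from the variable name are assumptions for illustration.

```python
def gen_httpget(var, url):
    """Return the Perl statements for `var := url` as a list of lines.

    Note: the URL is pasted into the shell command unquoted, as in the
    deck's generated code; a production generator would need to quote it.
    """
    name = var.lstrip("$")           # "$toothpaste" -> "toothpaste"
    tmp = name + ".txt"              # assumed temp file naming rule
    return [
        f"system('wget -q -O - {url} > {tmp}');",   # download the page into a temp file
        f'open INFILE, "{tmp}";',
        f"@{name}=<INFILE>;",                       # read all lines into a Perl array
        "close INFILE;",
        f"system ('rm {tmp}');",                    # remove the temp file
        f'${name} = "{url}";',                      # keep the web address in a Perl scalar
    ]

def perl_name(var, in_foreach_list=False):
    """Guess the Perl sigil: an array where a list is required, a scalar otherwise."""
    name = var.lstrip("$")
    return ("@" if in_foreach_list else "$") + name

# Usage with the URL from the deck's example.
lines = gen_httpget("$toothpaste", "http://www.toothpastefordinner.com/archives-sum02.php")
```

Because the same YOLT name ends up as both `@toothpaste` (the page lines) and `$toothpaste` (the address), the position of the variable is the only thing that tells the generator which one to emit.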

Page 9: YOLT

Documentation and Testing
Lexer/Parser - Semantic Checker

[Diagram: test flow]
● Test Cases: Good, Bad
● Lexer/Parser
● Semantic Checker
● Reference File: what I think it should produce
● Diff
● Log result: Good should be good. Bad should be bad.
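
A minimal sketch of the diff-against-reference step, in Python rather than the team's actual tooling. It assumes each test case yields some text output from the lexer/parser and semantic checker, and that a reference file holds the expected text; `run_case` and the sample messages are hypothetical.

```python
import difflib

def run_case(actual_output, reference_output):
    """Compare actual output with the reference and return (status, diff lines)."""
    diff = list(difflib.unified_diff(
        reference_output.splitlines(),
        actual_output.splitlines(),
        fromfile="reference",
        tofile="actual",
        lineterm="",
    ))
    status = "PASS" if not diff else "FAIL"   # an empty diff means the outputs agree
    return status, diff

# Usage: a good case that matches, and a bad case where the checker
# accepted a program that the reference says should be rejected.
good_status, good_diff = run_case("ok\n", "ok\n")
bad_status, bad_diff = run_case("ok\n", "error: $b uninitialized\n")
```

Logging the status per case gives the "good should be good, bad should be bad" record; the diff lines show where an unexpected result departs from the reference.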

Page 10: YOLT

Integration Testing

Goal: display any comics that have the word "hamster" in the URL of www.toothpastefordinner.com, Summer 2002 archive.

Trying little YOLT programs to see functionality, code generation, etc. Working out bugs in implementation & design.

Example:

Yolt Program

begin
    $toothpaste_home="http://www.toothpastefordinner.com/";
    $toothpaste:="http://www.toothpastefordinner.com/archives-sum02.php";

    $tags="<a href=\"(.*)\">.*hamster.*</a>";

    $elements = $tags @ $toothpaste;

    foreach $x in $elements {
        echo "<img src=\"".$toothpaste_home.$x."\""."><br>";
        echo "\n";
    }
end

Generated Perl

$toothpaste_home ="http://www.toothpastefordinner.com/";
system('wget -q -O - http://www.toothpastefordinner.com/archives-sum02.php > toothpaste.txt');
open INFILE, "toothpaste.txt";
@toothpaste=<INFILE>;
close INFILE;
system ('rm toothpaste.txt');
$toothpaste = "http://www.toothpastefordinner.com/archives-sum02.php";
$tags ="<a href=\"(.*)\">.*hamster.*</a>";
@tmp1=();
foreach ( @toothpaste) {
    if ($_=~m/($tags)/i){
        push @tmp1, $2
    }
}
@elements = @tmp1;
foreach $x ( @elements ) {
    print "<img src=\"".$toothpaste_home.$x."\""."><br>";
    print "\n";
}

Resultant HTML

<img src="http://www.toothpastefordinner.com/072802/hamster-table-tennis.gif"><br>
<img src="http://www.toothpastefordinner.com/072502/even-hamsters.gif"><br>
<img src="http://www.toothpastefordinner.com/060602/hamsters-are-the-best.gif"><br>
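
The behavior of the @ operator in this example can be sketched as follows, in Python rather than Perl. It mirrors the generated loop: the pattern is wrapped in an extra pair of parentheses and matched case-insensitively against each line, and the second group (the pattern's own first group) is collected. The function name and the sample lines are hypothetical, and Python's regular expression syntax is assumed to be close enough to Perl's for this pattern.

```python
import re

def match_all(pattern, lines):
    """Collect the pattern's first capture group from every matching line."""
    # The outer parentheses make the whole match group 1, so the
    # pattern's own first group becomes group 2, like $2 in the Perl.
    wrapped = re.compile("(" + pattern + ")", re.IGNORECASE)
    found = []
    for line in lines:
        m = wrapped.search(line)
        if m:
            # Assumes the pattern contains at least one group;
            # otherwise group(2) raises IndexError.
            found.append(m.group(2))
    return found

# Usage with made-up archive lines in the shape the pattern expects.
sample_lines = [
    '<a href="072802/hamster-table-tennis.gif">hamster table tennis</a>',
    '<a href="072602/something-else.gif">something else</a>',
]
elements = match_all('<a href="(.*)">.*hamster.*</a>', sample_lines)
```

Only the first sample line contains "hamster", so only its href value is collected.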

Page 11: YOLT

The Result

[Screenshots: the source site; the end result]

Page 12: YOLT

Lessons Learned

● Develop and test incrementally
● There are ALWAYS bugs, you just haven't found them yet
● CLIC is not designed to be lived in

Page 13: YOLT

One More Example