Hacker 102 - regexes w/Javascript, Python

hacker 102

code4lib 2010 preconference

Asheville, NC, USA 2010-02-21

iv. regular expressions

JavaScript

if all language looked like “aabaaaabbbabaababa” it’d be easy to parse

parsing “aabaaaabbbabaababa”

• there are two elements, “a” and “b”

• either may occur in any order

• /([ab]+)/

• [] denotes “elements” or “class”

• // demarcates regex

• + denotes “one or more of previous thing”

• () denotes “remember this matched group”

• /[ab]/ # an ‘a’ or a ‘b’

• /[ab]+/ # one or more ‘a’s or ‘b’s

• /([ab]+)/ # a group of one or more ‘a’s or ‘b’s
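A minimal sketch of the same pattern in Python (the language used later in this session) rather than in the Firebug console. The second input string is made up for illustration.

```python
import re

text = "aabaaaabbbabaababa"

# [ab]   : one 'a' or one 'b'
# [ab]+  : one or more of them in a row
# (...)  : remember what matched as group 1
pattern = re.compile(r"([ab]+)")

m = pattern.match(text)
whole = m.group(1)      # the entire string, since it is all a's and b's

# With other characters mixed in, the group stops at the first non-a/b.
m2 = pattern.match("abba-cadabra")
prefix = m2.group(1)

print(whole)
print(prefix)
```

Python has no // delimiters; the pattern is passed as a raw string (r"...") instead.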

to firebug!

• [a-z] is any lowercase character from a to z

• [0-9] is any digit

• + is one or more of previous thing

• ? is zero or one of previous thing

• | is or, e.g. (a|b) is ‘a’ or ‘b’; inside [] a | is just a literal character

• * is zero to many of previous thing

• . matches any character (except a newline, by default)

• [^a-z] is anything *but* [a-z]

• [a-zA-Z0-9] is any of a-z, A-Z, 0-9

• {5} matches exactly 5 of the preceding thing

• {2,} matches at least 2 of the preceding thing

• {2,6} matches from 2 to 6 of preceding thing

• [\d] is like [0-9] (any digit)

• [\S] is any non-whitespace
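One way to check each piece of syntax in the list above is to run it against a small string and look at the first match. The sketch below does that in Python. The sample strings are invented for illustration, and `first_match` is a hypothetical helper, not part of any library.

```python
import re

# (pattern, made-up sample text, what the first match should be)
checks = [
    (r"[a-z]+",       "Code4Lib",      "ode"),
    (r"[0-9]+",       "Feb 21, 2010",  "21"),
    (r"colou?r",      "color",         "color"),
    (r"cat|dog",      "hot dog",       "dog"),
    (r"ab*",          "abbbc",         "abbb"),
    (r"h.t",          "hat",           "hat"),
    (r"[^a-z]+",      "abc123def",     "123"),
    (r"[a-zA-Z0-9]+", "C4L!",          "C4L"),
    (r"[0-9]{5}",     "zip 28801",     "28801"),
    (r"o{2,}",        "wooooo",        "ooooo"),
    (r"[0-9]{2,6}",   "12345678",      "123456"),
    (r"\d+",          "year 2010",     "2010"),
    (r"\S+",          "  hello world", "hello"),
]

def first_match(pattern, text):
    """Return the first substring of text that pattern matches, or None."""
    m = re.search(pattern, text)
    return m.group() if m else None

results = [first_match(p, t) for p, t, _ in checks]
for (p, t, expected), got in zip(checks, results):
    print(f"{p!r:16} on {t!r:18} -> {got!r}")
```

Quantifiers are greedy: {2,6} takes six digits when six are available, and + and * take as much as they can.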

• visit any web page

• open firebug console

• title = window.document.title

• try regexes to match parts of the title
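The same exercise can be sketched outside the browser. The title string below is hypothetical; in Firebug it would come from window.document.title.

```python
import re

# Hypothetical page title, standing in for window.document.title.
title = "Code4Lib 2010 - Asheville, NC"

# Four digits in a row: the year.
year = re.search(r"[0-9]{4}", title).group()

# Everything up to the first whitespace: the first word.
first_word = re.match(r"\S+", title).group()

# A comma, a space, then two capitals at the very end: the state.
state = re.search(r", ([A-Z]{2})$", title).group(1)

print(year, first_word, state)
```

`re.match` only looks at the start of the string, while `re.search` scans for the first place the pattern fits.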

try this

most every language has regex support

try unix “grep”
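What grep does can be approximated in a few lines: keep only the lines where the pattern matches somewhere. This is a rough sketch, not a reimplementation. The `grep` function here is a hypothetical helper, and Python's regex syntax differs from grep's in some details.

```python
import re

def grep(pattern, lines):
    """Return the lines that contain a match for pattern."""
    rx = re.compile(pattern)
    return [line for line in lines if rx.search(line)]

# Sample lines taken from the opening slide of this session.
sample = [
    "hacker 102",
    "code4lib 2010 preconference",
    "Asheville, NC, USA 2010-02-21",
]

# Lines containing a date shaped like YYYY-MM-DD.
hits = grep(r"[0-9]{4}-[0-9]{2}-[0-9]{2}", sample)
print(hits)
```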

v. glue it together

Python

problem: Carol’s data

TITLE: ABA journal. BD. HOLDINGS: Vol. 70 (1984) - Vol. 94 (2008)
CURRENT VOL.: Vol. 95 (2009) -
OTHER LIBRARIES: Miami:v. 68 (1982) - USDC: v. 88 (2002) - Birm.:v. 89 (2003) -
(Formerly: American Bar Association Journal)
(Bound and on Hein)

TITLE: Administrative law review. BD. HOLDINGS: Vol. 22 (1969/1970) - Vol. 60 (2008)
CURRENT VOL.: Vol. 61 (2009) - (Bound and on Hein)
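One piece of these records that a regex can pull out directly is the volume and year pairs. A minimal sketch, assuming holdings are always written as "Vol. N (YEAR)" the way they are in the two sample records. `re_vol` is a name chosen here for illustration.

```python
import re

# Text taken from the two sample records above.
aba = "TITLE: ABA journal. BD. HOLDINGS: Vol. 70 (1984) - Vol. 94 (2008)"
alr = "BD. HOLDINGS: Vol. 22 (1969/1970) - Vol. 60 (2008)"

# "Vol. " then digits (the volume), then a year in parentheses.
# [0-9/]+ also allows split years such as 1969/1970.
re_vol = re.compile(r"Vol\. ([0-9]+) \(([0-9/]+)\)")

# findall returns one (volume, year) tuple per match.
aba_pairs = re_vol.findall(aba)
alr_pairs = re_vol.findall(alr)

print(aba_pairs)
print(alr_pairs)
```

The lowercase "v. 68 (1982)" entries under OTHER LIBRARIES are deliberately not matched by this pattern; they would need their own.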

starter code for you

    #!/usr/bin/env python
    import re

    # a tag is a run of capitals, spaces and dots ending in a colon, e.g. "BD. HOLDINGS:"
    re_tag = re.compile(r'([A-Z \.]+):')
    # everything after "TITLE: " is the title
    re_title = re.compile('TITLE: (.*)')

    for line in open('journals-carol-bean.txt'):
        line = line.strip()
        if line == "":
            continue
        m1 = re_tag.match(line)
        m2 = re_title.match(line)
        print("\n->", line, "<-")
        if m1 or m2:
            print("MATCH")
        if m1:
            print('tag:', m1.groups())
        if m2:
            print('title:', m2.groups())
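One possible next step, sketched under assumptions: split a record into tag and value pairs. The tag list below comes only from the two sample records, so a real file may need more. `re_tags` and `fields` are names chosen here, and the record is joined onto a single line.

```python
import re

# Assumed tag names, taken from the sample records only.
re_tags = re.compile(r"(TITLE|BD\. HOLDINGS|CURRENT VOL\.|OTHER LIBRARIES):")

record = ("TITLE: Administrative law review. "
          "BD. HOLDINGS: Vol. 22 (1969/1970) - Vol. 60 (2008)"
          "CURRENT VOL.: Vol. 61 (2009) - (Bound and on Hein)")

# Splitting on a pattern with a group keeps the matched tags in the result:
# ['', tag1, value1, tag2, value2, ...]
parts = re_tags.split(record)

# Pair each tag with the text that follows it, trimming whitespace.
fields = dict(zip(parts[1::2], (v.strip() for v in parts[2::2])))

for tag, value in fields.items():
    print(tag, "=>", value)
```

Naming the tags explicitly avoids false hits from the looser `[A-Z \.]+` pattern, which would also match things like "USDC:" or the "." before the colon in "Birm.:".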
