Parsing for Fun and Profit (but mainly fun)
PatchSpace Ltd
Saturday, 23 February 13
What?
Parsing
Adding structure and meaning to text
Parsing Human Languages
Jake stretched his legs
“Jake”, “stretched”, “his”, “legs”
“Jake”<noun>, “stretched”<verb, past>, “his”<possessive pronoun>, “legs”<noun>
“Jake”<noun, subject>, “stretched”, (“his”, “legs”)<noun phrase, object>
Parsing Computer Languages
“foo = bar + 123”
“foo”, “=”, “bar”, “+”, “123”
“foo”<var>, “=”<assignment_op>, “bar”<var>, “+”<op_plus>, “123”<int_literal>
Why?
Not just compiling!
Compilers breathe fire.
Pretty Printing
gofmt
http://gofmt.com/
Code Smell Detectors
https://rubygems.org/gems/reek
Other ideas
Code metrics
Bug detectors
Domain-specific languages
Language translators (e.g. Ruby -> PHP)
Code obfuscators
Alternative syntaxes (e.g. CoffeeScript)
Refactoring tools
How?
Step 1
3-year computer science degree
Lexing/Tokenising
if x > 100 then return “big” else return “small”
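As a rough illustration of the lexing step, here is a minimal sketch using Ruby's standard `StringScanner`. The token names and the `TOKEN_PATTERNS` table are assumptions made up for this toy language, not part of the talk's code, and the sketch uses straight quotes for string literals.

```ruby
require "strscan"

# Hypothetical token table for the toy language on the slide.
# Order matters: keywords are tried before general identifiers.
TOKEN_PATTERNS = [
  [:keyword,    /(?:if|then|else|return)\b/],
  [:identifier, /[a-z_]\w*/],
  [:integer,    /\d+/],
  [:string,     /"[^"]*"/],
  [:operator,   /[><=+\-]/]
].freeze

# Split source text into [type, text] pairs, skipping whitespace.
def tokenise(source)
  scanner = StringScanner.new(source)
  tokens = []
  until scanner.eos?
    next if scanner.skip(/\s+/)
    # scan only advances the scanner when the pattern matches here
    matched = TOKEN_PATTERNS.find { |_type, pattern| scanner.scan(pattern) }
    raise ArgumentError, "unexpected input at offset #{scanner.pos}" unless matched
    tokens << [matched.first, scanner.matched]
  end
  tokens
end

tokenise('if x > 100 then return "big" else return "small"').each do |type, text|
  puts "#{type}: #{text}"
end
```

The same function also handles the earlier `foo = bar + 123` example, yielding identifier, operator, identifier, operator, integer.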
Tree Building
if x > 100 then return “big” else return a + b
[Parse tree diagram with nodes: if, x, >, 100, then, return, “big”, else, return, a, +, b]
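One way to picture tree building for the statement on this slide is a small recursive-descent builder that turns a flat token list into nested arrays. This is only a sketch: the `TreeBuilder` class, its grammar, and the nested-array node shape are all assumptions, and the tokens are supplied pre-split to keep it short.

```ruby
# Minimal recursive-descent tree builder for the slide's toy statement.
# Grammar (an assumption made for this sketch):
#   statement  := "if" comparison "then" statement "else" statement
#               | "return" sum
#   comparison := sum ">" sum
#   sum        := atom ("+" atom)*
#   atom       := any single token (no validation, to keep the sketch small)
class TreeBuilder
  def initialize(tokens)
    @tokens = tokens.dup
  end

  def statement
    case @tokens.first
    when "if"
      expect("if")
      condition = comparison
      expect("then")
      then_branch = statement
      expect("else")
      else_branch = statement
      [:if, condition, then_branch, else_branch]
    when "return"
      expect("return")
      [:return, sum]
    else
      raise ArgumentError, "unexpected token #{@tokens.first.inspect}"
    end
  end

  private

  def comparison
    left = sum
    expect(">")
    [:>, left, sum]
  end

  def sum
    node = atom
    while @tokens.first == "+"
      expect("+")
      node = [:+, node, atom]
    end
    node
  end

  def atom
    @tokens.shift or raise ArgumentError, "unexpected end of input"
  end

  def expect(text)
    token = @tokens.shift
    raise ArgumentError, "expected #{text.inspect}, got #{token.inspect}" unless token == text
  end
end

tokens = %w[if x > 100 then return "big" else return a + b]
tree = TreeBuilder.new(tokens).statement
p tree
```

Each nested array is one node: its first element names the node and the rest are its children, so `a + b` ends up as a subtree under the second `return`.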
Parsing Expression Grammars
Like regular expressions, but can handle recursion, e.g. HTML
Not actually that much harder to use
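To make the recursion point concrete, here is a hand-written matcher in the style of a PEG rule that calls itself to handle nested tags. It does not use Treetop; the `parse_element` function and the tiny HTML-like subset it accepts are assumptions for illustration only.

```ruby
require "strscan"

# Recursive matcher for a tiny HTML-like subset (assumed for this sketch):
#   element := "<" name ">" content "</" name ">"
#   content := (text | element)*
# Returns nested arrays such as [:b, "hi ", [:i, "there"]],
# or nil if no opening tag starts at the current position.
def parse_element(scanner)
  return nil unless scanner.scan(/<([a-z]+)>/)
  name = scanner[1]
  children = []
  loop do
    if (text = scanner.scan(/[^<]+/))
      children << text
    elsif (child = parse_element(scanner)) # the rule refers to itself here
      children << child
    else
      break
    end
  end
  scanner.scan(%r{</#{name}>}) or raise ArgumentError, "missing </#{name}>"
  [name.to_sym, *children]
end

p parse_element(StringScanner.new("<b>hi <i>there</i></b>"))
```

Because the closing tag must match the name captured on the way in, mismatched nesting such as `<b><i>x</b></i>` raises an error at any depth.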
Regexes and HTML
Treetop PEG grammar
Doing Sums
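The live demo used a Treetop grammar, which is in the linked repository rather than on the slides. As a stand-in, here is a plain-Ruby sketch of the same idea: one method per grammar rule, with precedence encoded by which rule calls which. The `SumParser` class and its grammar are assumptions, not the talk's code.

```ruby
require "strscan"

# Hand-rolled calculator, one method per rule (grammar assumed for this sketch):
#   expression := term (("+" | "-") term)*
#   term       := factor (("*" | "/") factor)*
#   factor     := integer | "(" expression ")"
class SumParser
  def initialize(text)
    @scanner = StringScanner.new(text)
  end

  def parse
    value = expression
    skip_space
    raise ArgumentError, "unexpected input at offset #{@scanner.pos}" unless @scanner.eos?
    value
  end

  private

  def expression
    value = term
    while (op = token(/[+-]/))
      value = op == "+" ? value + term : value - term
    end
    value
  end

  def term
    value = factor
    while (op = token(%r{[*/]}))
      # Integer division, because operands are parsed with to_i
      value = op == "*" ? value * factor : value / factor
    end
    value
  end

  def factor
    if (digits = token(/\d+/))
      digits.to_i
    elsif token(/\(/)
      value = expression
      token(/\)/) or raise ArgumentError, "expected )"
      value
    else
      raise ArgumentError, "expected a number or ( at offset #{@scanner.pos}"
    end
  end

  # Skip leading whitespace, then try to match the pattern at the current position.
  def token(pattern)
    skip_space
    @scanner.scan(pattern)
  end

  def skip_space
    @scanner.skip(/\s*/)
  end
end

puts SumParser.new("1 + 2 * 3").parse
puts SumParser.new("(1 + 2) * 3").parse
```

Multiplication binds tighter than addition only because `expression` is built out of `term`, and `term` out of `factor`; brackets work because `factor` calls back into `expression`.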
Switch to Sublime Text, idiot
Code is now available: https://github.com/patchspace/parsing_for_fun_and_profit/
A Ruby Syntax Highlighter
What
A tool to read in simple Ruby source and output syntax highlighted HTML
Why
Because I thought it would be fun
It was
Because I thought it would be easy…
How
Build a parse tree of the Ruby source
Walk the tree and spit out a <span> element for each bit of text
Oh yes, make sure each line goes in <div> and <pre> tags
Wrap it in <html>
And for bonus points, do some fancy method highlighting
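The walking-and-wrapping steps above might look roughly like this. The tree here is written by hand and its shape (a list of lines, each a list of `[css_class, text]` leaves) is an assumption; the real highlighter in the repository builds its tree with the parser, and the class names are invented for this sketch.

```ruby
# Hypothetical parse tree: one entry per source line,
# each line a list of [css_class, text] leaves.
TREE = [
  [[:keyword, "def"], [:space, " "], [:method_name, "greet"]],
  [[:space, "  "], [:identifier, "puts"], [:space, " "], [:string, '"<hi>"']],
  [[:keyword, "end"]]
].freeze

HTML_ESCAPES = { "&" => "&amp;", "<" => "&lt;", ">" => "&gt;", '"' => "&quot;" }.freeze

# Escape characters that would otherwise be read as markup.
def escape_html(text)
  text.gsub(/[&<>"]/, HTML_ESCAPES)
end

# One <span> per leaf, with the whole line wrapped in <div> and <pre>.
def highlight_line(leaves)
  spans = leaves.map do |css_class, text|
    %(<span class="#{css_class}">#{escape_html(text)}</span>)
  end
  "<div><pre>#{spans.join}</pre></div>"
end

# Walk every line of the tree and wrap the result in <html>.
def highlight(tree)
  body = tree.map { |leaves| highlight_line(leaves) }.join("\n")
  "<html><body>\n#{body}\n</body></html>"
end

puts highlight(TREE)
```

Giving method names their own class, as with `method_name` here, is one way the "fancy method highlighting" could hook into a stylesheet.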
Switch to Chrome, idiot
Switch to Sublime Text again, idiot
Code is now available: https://github.com/patchspace/parsing_for_fun_and_profit/
We’re doing this the hard way
Ruby’s grammar is too complex and undefined to easily implement as a PEG
Tools for parsing Ruby already exist
Ripper (Ruby 1.9.3)
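For a taste of Ripper, which ships with Ruby's standard library, this short sketch lexes and parses the assignment example from earlier in the talk. The sample source string is an assumption; `Ripper.lex` and `Ripper.sexp` are the standard entry points.

```ruby
require "ripper"

source = "foo = bar + 123\n"

# Ripper.lex returns one entry per token. The first three fields are
# [line, column], the event name and the token text; newer Rubies
# append a lexer state, which the block simply ignores.
tokens = Ripper.lex(source).map { |_position, event, text| [event, text] }
tokens.each { |event, text| puts "#{event}: #{text.inspect}" }

# Ripper.sexp returns the parse tree as nested arrays rooted at :program.
sexp = Ripper.sexp(source)
p sexp
```

Since the lexer keeps whitespace and newlines as tokens, joining the token texts gives back the original source, which is exactly what a highlighter needs.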
Learn more!
Skip theoretical physics, start by playing with Lego
Do more
Ideas you might like to try:
CSV parser
JSON parser (return arrays & hashes)
XML parser
JSON highlighter
A simple JavaScript minifier (just kill whitespace)
Thank you
PatchSpace Ltd