SWP - A Generic Language Parser




This talk was part tongue in cheek, part serious, but entirely fun. It was given twice as a lightning talk: once at EuroPython and once at ACCU Python UK 05. It presents a generic Python-like language parser which does actually work. Think of it as an alternative to brackets in Lisp!


Page 1: SWP - A Generic Language Parser


A Generic Language Parser (Gloop?)
(SWP == Semantic Whitespace Parser, for want of a better name)

Michael Sparks

Page 2: SWP - A Generic Language Parser

Parse Anything

● Got bored of seeing “use Prothon”... “no”
● Hacking python to add a keyword whilst trivial wasn't trivial enough
● Got bored of seeing “use prothon's replacement”
● Thought it might be a fun thing to try
● Got very bored of seeing “use the replacement for prothon's replacement”
● etc

Page 3: SWP - A Generic Language Parser

Parse Anything

Parse this:

   def displayResult(result,quiet):
      if not quiet:
         print "The result of parsing your program:"
         print result
         print
         if not result:
            print "Rule match/evaluation order"
            for rule in r:
               print " ", rule
            end
         end
      else:
         if result is None:
            print "Parse failed"
         else:
            print "Success"
         end
      end
   end

Page 4: SWP - A Generic Language Parser

Parse Anything

Parse this:

   #
   # Sample logo like language using the parser
   #

   shape square:
      pen down
      repeat 4:
         forward 10
         rotate 90
      end
      pen up
   end

   repeat (360/5):
      square()
      rotate 5
   end

Page 5: SWP - A Generic Language Parser

Parse Anything

Parse this:

   #
   # Example based on defining grammars for L-Systems.
   #
   OBJECT tree L_SYSTEM:
      ROOT G
      RULES:
         G -> T { G } { A G } { B G } { C G } (0.00 .. 0.15)
         G -> T { A B G } { B A G } { C A G } (0.15 .. 0.30)
         G -> T { A C G } { B B G } { C B G } (0.30 .. 0.45)
         G -> T { A A G } { B C G } { C C G } (0.45 .. 0.60)
         G -> T { A G } { C G } (0.70 .. 0.80)
         G -> T { A G } { B G } (0.80 .. 0.95)
         G -> T { A G } (0.95 .. 1.00)
         T -> T (0.00 .. 0.75)
      ENDRULES
   ENDOBJECT

Page 6: SWP - A Generic Language Parser

Parse Anything

Parse this:

   #
   # An SML-like language using this parser.
   #
   structure Stk = struct :
      exception EmptyStack_exception
      datatype 'x stack = EmptyStack | push of ('x * 'x stack)
      fun pop(push(x,y)) = y
      fun pop EmptyStack = raise EmptyStack_exception
      fun top(push(x,y)) = x
      fun top EmptyStack = raise EmptyStack_exception
   end

Page 7: SWP - A Generic Language Parser

Parse Anything, etc

   EXPORT OBJECT person:
      PRIVATE:
         flat
         name, telephone
         address::PTR TO LONG
         telephone
      ENDATTRS
   ENDOBJECT

   PROC compare_address(address1::PTR TO LONG, address2::PTR TO LONG):
      # Returns *TRUE* if the address2 exists _inside address1
      DEF result=TRUE, f
      FOR f:=0 TO 5:
         IF address2[f]:
            IF Not(((StrLen address2[f])==0) AND ((StrLen address1[f])==0)):
               # The following line incorrectly(?) says that a
               # NULL string does not exist inside a NULL string.
               # The IF above corrects this
               result:=result AND ( ((InStr address1[f],address2[f])<>-1) OR ((StrLen address2[f])==0) )
            ENDIF
         ENDIF
      ENDFOR
   ENDPROC result

Page 8: SWP - A Generic Language Parser

Parse This?!

   OBJECT tree L_SYSTEM:
      ROOT G
      structure Stk = struct :
         exception EmptyStack_exception
         datatype 'x stack = EmptyStack | push of ('x * 'x stack)
         shape square:
            repeat 4:
               forward 10
               rotate 90
            end
         end
      end
      RULES:
         G -> T { A G } { C G } (0.70 .. 0.80)
         G -> T { A G } { B G } (0.80 .. 0.95)
         G -> T { A G } (0.95 .. 1.00)
      ENDRULES
   ENDOBJECT

   if (__name__ == "__main__"):
      import sys
      assign lexonly False
      assign trace False
      for fields in using query:
         SELECT fname, lname, t.phone, tsite.name :
            FROM tcontact, tsite
            WHERE table_contact.objid = "CONTID"
            AND table_site.objid = "SITEID"
         ENDSELECT
      endfor
      if sys.argv[1]:
         assign source open(sys.argv[1]).read()
      else:
         assign source "junk"
      end
   end

Page 9: SWP - A Generic Language Parser

Parsed!

● ['program', ['block', ['statement_list', ['exprstatement', ['explist', ['functioncall', ['ID', 'OBJECT'], ['factorlist', ['factorlist', ['factorlist', ['ID', 'tree']], ['trailedfactor', ['ID',
'L_SYSTEM'], ['blocktrailer', ['block', ['statement_list', ['exprstatement', ['explist', ['functioncall', ['ID', 'ROOT'], ['factorlist', ['ID', 'G']]]]], ['statement_list', ['assignment', '=', ['explist', ['functioncall', ['ID', 'structure'], ['factorlist', ['ID', 'Stk']]]], ['explist', ['functioncall', ['trailedfactor', ['ID', 'struct'], ['blocktrailer', ['block', ['statement_list', ['exprstatement', ['explist', ['functioncall', ['ID', 'exception'], ['factorlist', ['ID', 'EmptyStack_exception']]]]], ['statement_list', ['assignment', '=', ['explist', ['functioncall', ['ID', 'datatype'], ['factorlist', ['factorlist', ['ID', "'x"]], ['ID', 'stack']]]], ['explist', ['infixepr', '|', ['ID', 'EmptyStack'], ['explist', ['functioncall', ['ID', 'push'], ['factorlist', ['factorlist', ['ID', 'of']], ['bracketedexpression', ['bracketedexpression', ['explist', ['infixepr', '*', ['ID', "'x"], ['explist', ['functioncall', ['ID', "'x"], ['factorlist', ['ID', 'stack']]]]]]]]]]]]]], ['statement_list', ['exprstatement', ['explist', ['functioncall', ['ID', 'shape'], ['factorlist', ['factorlist', ['trailedfactor', ['ID', 'square'], ['blocktrailer', ['block', ['statement_list', ['exprstatement', ['explist', ['functioncall', ['ID', 'repeat'], ['factorlist', ['factorlist', ['trailedfactor', ['number', 4], ['blocktrailer', ['block', ['statement_list', ['exprstatement', ['explist', ['functioncall', ['ID', 'forward'], ['factorlist', ['number', 10]]]]], ['statement_list', ['exprstatement', ['explist', ['functioncall', ['ID', 'rotate'], ['factorlist', ['number', 90]]]]]]]]]]], ['ID', 'end']]]]]]]]]], ['ID', 'end']]]]]]]]]]], ['factorlist', ['ID', 'end']]]]], ['statement_list', ['exprstatement', ['explist', ['functioncall', ['trailedfactor', ['ID', 'RULES'], ['blocktrailer', ['block', ['statement_list', ['exprstatement', ['explist', ['infixepr', '->', ['ID', 'G'], ['explist', ['functioncall', ['ID', 'T'], ['factorlist', ['factorlist', ['factorlist', ['constructorexpression', 
['constructorexpression', ['explist', ['functioncall', ['ID', 'A'], ['factorlist', ['ID', 'G']]]]]]], ['constructorexpression', ['constructorexpression', ['explist', ['functioncall', ['ID', 'C'], ['factorlist', ['ID', 'G']]]]]]], ['bracketedexpression', ['bracketedexpression', ['explist', ['infixepr', '..', ['dottedfactor', ['number', 0], ['attribute', ['number', 70]]], ['explist', ['expression', ['dottedfactor', ['number', 0], ['attribute', ['number', 80]]]]]]]]]]]]]]], ['statement_list', ['exprstatement', ['explist', ['infixepr', '->', ['ID', 'G'], ['explist', ['functioncall', ['ID', 'T'], ['factorlist', ['factorlist', ['factorlist', ['constructorexpression', ['constructorexpression', ['explist', ['functioncall', ['ID', 'A'], ['factorlist', ['ID', 'G']]]]]]], ['constructorexpression', ['constructorexpression', ['explist', ['functioncall', ['ID', 'B'], ['factorlist', ['ID', 'G']]]]]]], ['bracketedexpression', ['bracketedexpression', ['explist', ['infixepr', '..', ['dottedfactor', ['number', 0], ['attribute', ['number', 80]]], ['explist', ['expression', ['dottedfactor', ['number', 0], ['attribute', ['number', 95]]]]]]]]]]]]]]], ['statement_list', ['exprstatement', ['explist', ['infixepr', '->', ['ID', 'G'], ['explist', ['functioncall', ['ID', 'T'], ['factorlist', ['factorlist', ['constructorexpression', ['constructorexpression', ['explist', ['functioncall', ['ID', 'A'], ['factorlist', ['ID', 'G']]]]]]], ['bracketedexpression', ['bracketedexpression', ['explist', ['infixepr', '..', ['dottedfactor', ['number', 0], ['attribute', ['number', 95]]], ['explist', ['expression', ['dottedfactor', ['number', 1], ['attribute', ['number', 0]]]]]]]]]]]]]]]]]]]]], ['factorlist', ['ID', 'ENDRULES']]]]]]]]]]]], ['ID', 'ENDOBJECT']]]]], ['statement_list', ['exprstatement', ['explist', ['functioncall', ['ID', 'if'], ['factorlist', ['factorlist', ['trailedfactor', ['bracketedexpression', ['bracketedexpression', ['explist', ['infixepr', '==', ['ID', '__name__'], ['explist', 
['expression', ['string', '__main__']]]]]]], ['blocktrailer', ['block', ['statement_list', ['exprstatement', ['explist', ['functioncall', ['ID', 'import'], ['factorlist', ['ID', 'sys']]]]], ['statement_list', ['exprstatement', ['explist', ['functioncall', ['ID', 'assign'], ['factorlist', ['factorlist', ['ID', 'lexonly']], ['ID', 'False']]]]], ['statement_list', ['exprstatement', ['explist', ['functioncall', ['ID', 'assign'], ['factorlist', ['factorlist', ['ID', 'trace']], ['ID', 'False']]]]], ['statement_list', ['exprstatement', ['explist', ['functioncall', ['ID', 'for'], ['factorlist', ['factorlist', ['factorlist', ['factorlist', ['factorlist', ['ID', 'fields']], ['ID', 'in']], ['ID', 'using']], ['trailedfactor', ['ID', 'query'], ['blocktrailer', ['block', ['statement_list', ['exprstatement', ['explist', ['functioncall', ['ID', 'SELECT'], ['factorlist', ['ID', 'first_name']]], ['explist', ['expression', ['ID', 'last_name']], ['explist', ['expression', ['dottedfactor', ['ID', 'table_contact'], ['attribute', ['ID', 'phone']]]], ['explist', ['expression', ['ID', 'e_mail']], ['explist', ['functioncall', ['dottedfactor', ['ID', 'table_site'], ['attribute', ['trailedfactor', ['ID', 'name'], ['blocktrailer', ['block', ['statement_list', ['exprstatement', ['explist', ['functioncall', ['ID', 'FROM'], ['factorlist', ['ID', 'table_contact']]], ['explist', ['expression', ['ID', 'table_site']]]]], ['statement_list', ['assignment', '=', ['explist', ['functioncall', ['ID', 'WHERE'], ['factorlist', ['dottedfactor', ['ID', 'table_contact'], ['attribute', ['ID', 'objid']]]]]], ['explist', ['expression', ['string', '<CASECONTACTID>']]]], ['statement_list', ['assignment', '=', ['explist', ['functioncall', ['ID', 'AND'], ['factorlist', ['dottedfactor', ['ID', 'table_site'], ['attribute', ['ID', 'objid']]]]]], ['explist', ['expression', ['string', '<CASESITEID>']]]]]]]]]]]], ['factorlist', ['ID', 'ENDSELECT']]]]]]]]]]]]]], ['ID', 'endfor']]]]], ['statement_list', ['exprstatement', 
['explist', ['functioncall', ['ID', 'if'], ['factorlist', ['factorlist', ['factorlist', ['dottedfactor', ['ID', 'sys'], ['attribute', ['trailedfactor', ['trailedfactor', ['ID', 'argv'], ['bracketedtrailer', ['explist', ['expression', ['number', 1]]]]], ['blocktrailer', ['block', ['statement_list', ['exprstatement', ['explist', ['functioncall', ['ID', 'assign'], ['factorlist', ['factorlist', ['factorlist', ['ID', 'source']], ['ID', 'open']], ['dottedfactor', ['bracketedexpression', ['bracketedexpression', ['explist', ['expression', ['dottedfactor', ['ID', 'sys'], ['attribute', ['trailedfactor', ['ID', 'argv'], ['bracketedtrailer', ['explist', ['expression', ['number', 1]]]]]]]]]]], ['methodcall', 'read', ['bracketedexpression', None]]]]]]]]]]]]]], ['trailedfactor', ['ID', 'else'], ['blocktrailer', ['block', ['statement_list', ['exprstatement', ['explist', ['functioncall', ['ID', 'assign'], ['factorlist', ['factorlist', ['ID', 'source']], ['string', 'junk']]]]]]]]]], ['ID', 'end']]]]]]]]]]]]]], ['ID', 'end']]]]]]]]
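The dump above is hard to read by eye, so here is a minimal sketch of one way to skim it: walk the nested lists and report the name at the head of every 'functioncall' node. The helper name called_names and the FRAGMENT variable are hypothetical, not part of SWP; the fragment is a trimmed copy of two statements from the output above. Calls whose head is not a plain ID (for example a 'trailedfactor' such as RULES:) are skipped in this sketch.

```python
def called_names(node):
    """Yield the name at the head of every 'functioncall' node, depth first."""
    if not isinstance(node, list):
        return  # leaves are strings, numbers or None
    if node and node[0] == "functioncall":
        head = node[1]
        if isinstance(head, list) and head[0] == "ID":
            yield head[1]
    # Recurse into every child after the node's own tag.
    for child in node[1:]:
        yield from called_names(child)


# Trimmed from the parse output above: "import sys" and "assign lexonly False".
FRAGMENT = ['statement_list',
            ['exprstatement', ['explist', ['functioncall', ['ID', 'import'],
                                           ['factorlist', ['ID', 'sys']]]]],
            ['statement_list',
             ['exprstatement', ['explist', ['functioncall', ['ID', 'assign'],
                                            ['factorlist',
                                             ['factorlist', ['ID', 'lexonly']],
                                             ['ID', 'False']]]]]]]

names = list(called_names(FRAGMENT))
print(names)
```

Even keywords such as import and assign show up as ordinary call heads, which is the point the later slides make.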

Page 10: SWP - A Generic Language Parser

Grammar (SLR)

   program -> block
   block -> BLOCKSTART statement_list BLOCKEND
   statement_list -> statement*
   statement -> (expression | expression ASSIGNMENT expression | ) EOL
   expression -> oldexpression (COMMA expression)*
   oldexpression -> (factor [factorlist] | factor INFIXOPERATOR expression )
   factorlist -> factor* factor
   factor -> ( bracketedexpression | constructorexpression | NUMBER | STRING | ID | factor DOT dotexpression | factor trailer | factor trailertoo )
   dotexpression -> (ID bracketedexpression | factor )
   bracketedexpression -> BRA [ expression ] KET
   constructorexpression -> BRA3 [ expression ] KET3
   trailer -> BRA2 expression KET2
   trailertoo -> COLON EOL block
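The grammar relies on BLOCKSTART and BLOCKEND tokens, but the slides do not show how the lexer produces them. A minimal sketch, assuming they are derived from indentation in the same spirit as Python's INDENT/DEDENT handling, might look like this. The function name block_tokens and the LINE token are hypothetical simplifications: a real lexer would split each line into ID, NUMBER, COLON and so on.

```python
def block_tokens(source):
    """Yield (kind, text) pairs: BLOCKSTART, BLOCKEND, LINE or EOL.

    Assumption: blocks are delimited purely by leading spaces. Tabs and
    dedents to a width that was never opened are not handled here.
    """
    indents = [0]
    yield ("BLOCKSTART", "")              # program -> block
    for raw in source.splitlines():
        stripped = raw.strip()
        if not stripped or stripped.startswith("#"):
            continue                      # skip blank lines and comments
        width = len(raw) - len(raw.lstrip(" "))
        if width > indents[-1]:
            indents.append(width)
            yield ("BLOCKSTART", "")
        while width < indents[-1]:
            indents.pop()
            yield ("BLOCKEND", "")
        yield ("LINE", stripped)
        yield ("EOL", "")
    while len(indents) > 1:               # close blocks still open at EOF
        indents.pop()
        yield ("BLOCKEND", "")
    yield ("BLOCKEND", "")                # closes the program-level block


sample = "repeat 4:\n   forward 10\n   rotate 90\nend\n"
kinds = [kind for kind, _ in block_tokens(sample)]
print(kinds)
```

In this sketch the "end" line arrives straight after the BLOCKEND of the indented body, which fits the grammar's "factor trailertoo" followed by further factors on the same statement.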

Page 11: SWP - A Generic Language Parser


● Just uses a slightly modified PLY (1.5)
● All of the examples are parseable by the same parser – no changes to the lexer or parser.
● Just spits out a syntax tree
● Treats everything as a function

Page 12: SWP - A Generic Language Parser

Everything's a function

● This is a function:

   if bar(bibble=>baz):
      bla bla bla
      bingle bongle
   else:
      babble babble
      this = bing

● Parsed as: call function “if” with the arguments:
   bar(bibble=>baz), codeblock, “else”, codeblock, “endif”
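Since every statement comes out as "call this name with these arguments", a back end can be little more than a table of Python callables. The sketch below is not part of SWP; it assumes the node shapes seen in the parse output earlier in the talk (left-nested 'factorlist', right-nested 'statement_list'), and the names call, run_block, FUNCTIONS and log are hypothetical. Each function receives its arguments unevaluated, so "repeat" can decide for itself how often to run its code block.

```python
def flatten_factorlist(node):
    """['factorlist', inner, f] or ['factorlist', f] -> flat list of factors."""
    if len(node) == 3:
        return flatten_factorlist(node[1]) + [node[2]]
    return [node[1]]


def statements(node):
    """['statement_list', stmt, rest] -> flat list of statements."""
    out = []
    while node is not None:
        out.append(node[1])
        node = node[2] if len(node) > 2 else None
    return out


def call(node, functions):
    """Look up the head of a 'functioncall' node and apply it to its args."""
    name = node[1][1]                        # node[1] is ['ID', name]
    args = flatten_factorlist(node[2])
    return functions[name](args, functions)


def run_block(block, functions):
    """Run each ['exprstatement', ['explist', functioncall]] in a block."""
    for stmt in statements(block[1]):
        call(stmt[1][1], functions)


log = []

def forward(args, functions):
    log.append(("forward", args[0][1]))      # args[0] is ['number', n]

def rotate(args, functions):
    log.append(("rotate", args[0][1]))

def repeat(args, functions):
    # args[0]: ['trailedfactor', ['number', n], ['blocktrailer', block]]
    # args[1]: ['ID', 'end'], ignored in this sketch
    count, block = args[0][1][1], args[0][2][1]
    for _ in range(count):
        run_block(block, functions)

FUNCTIONS = {"forward": forward, "rotate": rotate, "repeat": repeat}

# The "repeat 4:" subtree, rebuilt piecewise from the parse output.
FORWARD = ['exprstatement', ['explist', ['functioncall', ['ID', 'forward'],
                                         ['factorlist', ['number', 10]]]]]
ROTATE = ['exprstatement', ['explist', ['functioncall', ['ID', 'rotate'],
                                        ['factorlist', ['number', 90]]]]]
BODY = ['block', ['statement_list', FORWARD, ['statement_list', ROTATE]]]
REPEAT_CALL = ['functioncall', ['ID', 'repeat'],
               ['factorlist',
                ['factorlist', ['trailedfactor', ['number', 4],
                                ['blocktrailer', BODY]]],
                ['ID', 'end']]]

call(REPEAT_CALL, FUNCTIONS)
print(log)
```

Passing raw nodes rather than evaluated values is a design choice borrowed from Lisp-style special forms; a fuller back end would also need to handle infix expressions, assignments and dotted factors.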

Page 13: SWP - A Generic Language Parser


● http://www.cerenity.org/SWP-0.0.0.tar.gz
● http://www.cerenity.org/SWP/
● I'd be curious to see someone put a lisp back end on it :-)
   – Actually no, don't do that, someone might use this
