PetrKerzum (Yandex) @ CodeCamp2011



  • 1. 0 () A0 1 0 0 0 1 1 () 1 1 1 0 1 0

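2. , ,
3. .... ... *(|)*((|)*(|)*)* ::= ( )* =*(|)* =|
   (state diagram with states 0 and 1)
4. Finite automaton: Q, the set of states; q0, the initial state; F, the set of final (accepting) states; Σ, the input alphabet; δ, the transition function
5.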

  • Q = { 0, 1 }, q0 = 0, F = { 1 }, Σ = { , , }, δ = { (0, ) -> 1, (0, ) -> 0, (0, ) -> 1,
      • (1, A) -> 1, (1, ) -> 0, (1, ) -> 0 }
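The automaton on slide 5 can also be run from a transition table instead of hand-written branches. The following is only a minimal sketch under stated assumptions: the three input letters did not survive in this transcript, so `'a'`, `'b'` and `'c'` are hypothetical stand-ins for the first, second and third symbol of the alphabet, and the names `delta`, `symbol_index` and `dfa_accepts` are illustrative rather than taken from the talk.

```c
#include <stdbool.h>

enum { NUM_STATES = 2, NUM_SYMBOLS = 3 };

/* delta[state][symbol] mirrors the six transitions listed on slide 5:
   state 0: 1st symbol -> 1, 2nd -> 0, 3rd -> 1
   state 1: 1st symbol -> 1, 2nd -> 0, 3rd -> 0 */
static const int delta[NUM_STATES][NUM_SYMBOLS] = {
    {1, 0, 1},
    {1, 0, 0},
};

/* Map an input character to a column of the table.
   'a', 'b', 'c' are assumed placeholders for the lost letters. */
static int symbol_index(char c) {
    switch (c) {
    case 'a': return 0;
    case 'b': return 1;
    case 'c': return 2;
    default:  return -1; /* not in the alphabet */
    }
}

/* Run the automaton from q0 = 0 and accept if it stops in F = { 1 }. */
bool dfa_accepts(const char *input) {
    int state = 0; /* q0 */
    for (const char *p = input; *p != '\0'; ++p) {
        int sym = symbol_index(*p);
        if (sym < 0) {
            return false; /* reject characters outside the alphabet */
        }
        state = delta[state][sym];
    }
    return state == 1; /* state in F */
}
```

With these placeholder letters, `dfa_accepts("bc")` is true (0 -> 0 -> 1) and `dfa_accepts("ab")` is false (0 -> 1 -> 0). Keeping the transitions in a table means a different automaton needs new data only, not new control flow.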

(state diagram with states 0 and 1)
6.
    int state = 0; // q0
    while (char c = getkey()) {
        switch (state) {
        case 0: goto st0;
        case 1: goto st1;
        }
    st0:
        switch (c) {
        case '': state = 1; goto end;
        case '': goto end;
        case '': state = 1; goto end;
        }
    st1:
        switch (c) {
        case '': goto end;
        case '': state = 0; goto end;
        case '': state = 0; goto end;
        }
    end: ; // a label must be followed by a statement
    }
    if (state == 1) exit_ok(); // state in F
    else exit_fail();
7. 0 () A0 1 0 0 0 1 1 () 1 1 1 0 1 0
    *(|)*((|)*(|)*)*
    int state = q0;
    while (char c = getkey()) {
        switch (state) {
        case 0: goto st0;
        case 1: goto st1;
        }
    st0:
        switch (c) {
        case '': state = 1; goto end;
        case '': goto end;
        case '': state = 1; goto end;
        }
    st1:
        switch (c) {
        case '': goto end;
        case '': state = 0; goto end;
        case '': state = 0; goto end;
        }
    end: ;
    }
    if (state in F) exit_ok();
    else exit_fail();
    (state diagram with states 0 and 1)
8.

  • ()

9. 10. 11. () 12. ()

  • FSM Finite state machine

13. DFA Deterministic finite automaton 14. machine 15. language 16. expression 17.

  • DFA (FSM) Deterministic finite automata
  • Ragel

18. GREP 19. Lex 20. AWK

  • NFA Non-deterministic finite automaton
  • Perl

21. PCRE
22. POSIX
23. Python, PHP, ...
24. ([a-zA-Z]+)\s+\1
    s/ ([a-zA-Z]+)\s+\1 / \1 /g
    perl -pi -e 's/ ([a-zA-Z]+)\s+\1 / \1 /g' book.txt
    !!!
25. PERL NFA
    print "YES\n" if ( $ARGV[0] =~ /( a (?{ print "a1 "; }) b (?{ print "b1 "; }) cX ) | ( a (?{ print "a2 "; }) b (?{ print "b2 "; }) cd )/x );
    # ./prog.pl abcd
    a1 b1 a2 b2 YES
    RAGEL DFA
    # ./prog_rl abcd
    a1 a2 b1 b2 YES
26. Ragel http://www.complang.org/ragel/
    Adrian D. Thurston. "Parsing Computer Languages with an Automaton Compiled from a Single Regular Expression." In 11th International Conference on Implementation and Application of Automata (CIAA 2006), Lecture Notes in Computer Science, volume 4094, pp. 285-286, Taipei, Taiwan, August 2006.
27. Ragel
    fsm := ( 'a' %{ print(a1); } 'b' %{ print(b1); } 'cX' )
         | ( 'a' %{ print(a2); } 'b' %{ print(b2); } 'cd' );
28. Ragel
    #include <string.h>
    %%{
        machine example;
        fsm := ( 'a' %{ print(a1); } 'b' %{ print(b1); } 'cX' )
             | ( 'a' %{ print(a2); } 'b' %{ print(b2); } 'cd' );
        write data;
    }%%
    int main(int, char** argv) {
        const char *p = argv[1]; // the input string, e.g. 'abcd'
        const char *pe = p + strlen(p) + 1;
        int cs;
        %% write init;
        %% write exec;
        if (cs == example_error) return 2;
        return cs < example_first_final;
    }
29. Ragel
    # mcedit example.rl
    # ragel6 example.rl -C -o example.cpp
    # gcc -o example example.cpp
    # ./example 'abcd'
    a1 a2 b1 b2 yes
30. Ragel host language
    host language:
      -C     The host language is C, C++, Obj-C or Obj-C++ (default)
      -D     The host language is D
      -J     The host language is Java
      -R     The host language is Ruby
      -A     The host language is C#
31. Ragel generated code style
    code style: (C/D/Java/Ruby/C#)
      -T0    Table driven FSM (default)
    code style: (C/D/Ruby/C#)
      -T1    Faster table driven FSM
      -F0    Flat table driven FSM
      -F1    Faster flat table-driven FSM
    code style: (C/D/C#)
      -G0    Goto-driven FSM
      -G1    Faster goto-driven FSM
    code style: (C/D)
      -G2    Really fast goto-driven FSM
      -P<N>  N-Way Split really fast goto-driven FSM
32.
Ragel parse float
    action dgt      { printf("DGT: %c\n", fc); }
    action dec      { printf("DEC: .\n"); }
    action exp      { printf("EXP: %c\n", fc); }
    action exp_sign { printf("SGN: %c\n", fc); }
    action number   { /*NUMBER*/ }
    number = (
        [0-9]+ $dgt ( '.' @dec [0-9]+ $dgt )?
        ( [eE] ( [+-] $exp_sign )? [0-9]+ $exp )?
    ) %number;
    main := ( number '\n' )*;
33. Ragel parse float
    st0: if ( ++p == pe ) goto out0; if ( 48