STRINGS AND AUTOMATA MODULO THEORIES
Margus Veanes
July 18, 2015, SMT'15, San Francisco
MOTIVATION
• Symbolic execution
– Path feasibility analysis involving string constraints
– Regular expression matching
• Security vulnerabilities ([OWASP] top 1, 3 culprits)
– SQL injection attacks
– XSS attacks
– DoS attacks
• e.g. regex injection
– Directory traversal attacks, e.g.
http://foo.bar.system/scripts/..%c1%1c../winnt/system32/cmd.exe?/c+dir+c:\
– …
• Data processing
– Parallelization
– Deforestation
• Malware detection
“EARLY” WORK RELATED TO STRING ANALYSIS
• Tools
– Mona: Henriksen-Jensen-Jørgensen-Klarlund-Paige-Rauhe-Sandholm, TACAS’95
• Built on BRICS automata library
– JSA: Christensen-Møller-Schwartzbach, SAS’03 (uses BRICS)
– Haderach: Shannon-Hajra-Lee-Zhan-Khurshid, MUTATION’07 (uses BRICS)
• Theory
– Bjørner, PhD Thesis’98 (decision procedure for queues)
– Blumensath-Grädel, LICS’00 (automatic structures)
– Benedikt-Libkin-Schwentick-Segoufin, LICS’01 (regular string relations)
– Khoussainov-Nies-Rubin-Stephan, LICS’04 (automatic Boolean algebras)
– Bala, STACS’04 (regular term matching)
– Kunc, DLT’07 (complexity of language equations)
THE RISE OF THE STRING ANALYZERS
• String theory encodings in SMT:
– Pex-LL: Bjørner-Tillmann-Voronkov, TACAS’09 (strings + SMT)
– Reggae: Li-Xie-Tillmann-deHalleux-Schulte, ASE’09 (symbolic exploration of regex code)
– Z3-str: Zheng-Zhang-Ganesh, ESEC/FSE 2013 (plugin to Z3)
– CVC4-str: Liang-Reynolds-Tinelli-Barrett-Deters, CAV’14 (DPLL(TSLRp))
– S3: Trinh-Chu-Jaffar, CCS’14 (uses Z3-str-star)
• Automata related:
– Stranger: Yu-Alkhalaf-Bultan-Ibarra-Cova, SPIN’08, TACAS’09, TACAS’10 (automata based)
– DPRLE: Hooimeijer-Weimer, PLDI’09 (subset checking)
– Hampi: Kiezun-Ganesh-Guo-Hooimeijer-Ernst, ISSTA’09 (best paper award) (reduction to BV)
– Kaluza (in Kudzu): Saxena-Akhawe-Hanna-Mao-McCamant-Song, Oakland’10 (Hampi + mult. var.)
– Rex: Veanes-deHalleux-Tillmann-Bjørner-deMoura, ICST’10, LPAR’10 (language acceptors)
– Bek: Hooimeijer-Livshits-Molnar-Saxena-Veanes-Bjørner, USENIX Security'11, POPL’12 (transducers)
– Bex: D’Antoni-Veanes, VMCAI’13, CAV’13 (lookahead)
– PASS: Li-Ghosh, HVC 2013 (best paper award) (array based)
– SMC: Luu-Shinde-Saxena-Demsky, PLDI’14 (model counting)
• CAV’15:
– ABC: Aydin-Bang-Bultan (automata based counting, using Stranger and BRICS)
– NORN: Abdulla-Atig-Chen-Holik-Rezine-Rümmer-Stenman, also CAV’14 (Horn clauses, BRICS)
– Z3-str+: Zheng-Ganesh-Subramanian-Tripp-Dolby-Zhang (string + regex + length)
TWO QUESTIONS
• What are characters?
• What are strings?
smileycipher(“hello world”) = “😧😤😫😫😮 😶😮😱😫😣”
Is this a string function?
WHAT ARE CHARACTERS?
1. Elements of a finite alphabet Σ?
– Only primitive operation is =: Σ × Σ → Bool
– What about Unicode, e.g., 😀 😁 (http://unicode.org/charts/PDF/U1F600.pdf)
• |Σ| = 1,112,064
– For succinctness allow total order ≺: Σ × Σ → Bool and ranges [a-b] (denotes {x | a ≼ x ≼ b})
• This affects the notion of automaton over Σ!
• Why not other operations as well?
2. Bit-vectors, say char (BV16)?
– With primitive operations like &: char × char → char
– “😀” = “\uD83D\uDE00” (UTF-16 surrogate pair)
• Σ has its own theory, namely bv theory!
3. Integers (code points)?
– 😀 = 0x1F600 = 128512
– e.g. 😀 + 1 = 😁 = 0x1F601
• Σ has its own theory, namely int theory!
…
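The bit-vector and integer views of 😀 can be checked directly in C#. The following is a minimal sketch; the class and method names (`CodePointDemo`, `CodePointOf`, `Utf16Units`) are illustrative and do not come from the talk.

```csharp
using System;

static class CodePointDemo
{
    // View 3: a character as an integer code point.
    public static int CodePointOf(string s) => char.ConvertToUtf32(s, 0);

    // View 2: the same character as a sequence of 16-bit units (UTF-16).
    public static string Utf16Units(string s)
    {
        var parts = new string[s.Length];
        for (int i = 0; i < s.Length; i++)
            parts[i] = ((int)s[i]).ToString("X4");
        return string.Join(" ", parts);
    }

    static void Main()
    {
        string grin = char.ConvertFromUtf32(0x1F600);       // 😀
        Console.WriteLine(CodePointOf(grin));                // 128512
        Console.WriteLine(Utf16Units(grin));                 // D83D DE00
        // Integer arithmetic on code points: 😀 + 1 = 😁
        string next = char.ConvertFromUtf32(CodePointOf(grin) + 1);
        Console.WriteLine(CodePointOf(next) == 0x1F601);     // True
    }
}
```

Under the BV16 view the string has length 2, under the code-point view it is a single character, which is one reason the choice of character theory matters.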
WHAT ARE STRINGS?
• Finite sequences of characters (char)
– CVC4-str
– Singleton string = char
• Restricted arrays of int to char
– Pex-LL, PASS
– array<int,char> ≠ char, so singleton string ≠ char
• Finite lists of characters
– Pex-Rex
– list<char> ≠ char, so singleton string ≠ char
• Finite queues
– transducers
The answer depends on the context and the required operations.
– First, Last, Rest, Append, Substring, Length, …
ANALYSIS TASKS
• Consider character type C, string type S<C>, and regular expression type R<C>.
– When is DPLL(T_C, T_S<C>, T_R<C>) possible/feasible?
• What about (finite state) transducers?
– Regular transformations of type S<T_in> → S<T_out>
– Typically T_in = T_out = bit-vectors
– Many string transformations are such:
• sanitizers, encoders
HTML ENCODER
[Figure: HTML encoder code, annotated “Arithmetic operations on characters”]
FOR EACH DOMAIN SPECIFIC TASK
Design a language that
• only has the features required by the task
• is simple to use
• enables automatic reasoning about what the programs do
• compiles into efficient code
THE REST OF THE TALK
• Symbolic Automata and Transducers
• BEK and string sanitizers
• BEX and string encoders
• Data parallel BEK/BEX for string processing
SYMBOLIC FINITE AUTOMATA
SYMBOLIC FINITE AUTOMATON (SFA)
• Labels are predicates
• One symbolic transition: p --(λx. 'a' ≤ x ≤ 'd')--> q
• It denotes many concrete transitions: p --x--> q for x ∈ 〚λx. 'a' ≤ x ≤ 'd'〛 = {'a', 'b', 'c', 'd'}
SFA EXECUTION EXAMPLE
• States p and q; every transition is guarded by λx. x mod 2 = 0 or λx. x mod 2 = 1
• Input: 1 2 5 3
• Run: p p q p p (p stays in p on odd input, moves to q on even input; q returns to p on odd input)
• p is final ⇒ accept the input
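The run in the execution example can be replayed with a small interpreter whose transition labels are C# predicates. This is a sketch under assumptions: the type names are made up for illustration, and the q-to-q move on even input is a guess, since the run on the slide never exercises it.

```csharp
using System;
using System.Collections.Generic;

// One symbolic transition: source state, guard predicate, target state.
sealed class Move
{
    public readonly string From, To;
    public readonly Func<int, bool> Guard;
    public Move(string from, Func<int, bool> guard, string to)
    { From = from; Guard = guard; To = to; }
}

static class SfaDemo
{
    static readonly Move[] Moves =
    {
        new Move("p", x => x % 2 != 0, "p"),
        new Move("p", x => x % 2 == 0, "q"),
        new Move("q", x => x % 2 != 0, "p"),
        new Move("q", x => x % 2 == 0, "q"),   // assumed, not exercised by the run
    };

    // Returns the visited states, or null if some symbol enables no transition.
    public static List<string> Run(string start, IEnumerable<int> input)
    {
        var trace = new List<string> { start };
        string state = start;
        foreach (int x in input)
        {
            Move taken = null;
            foreach (var m in Moves)
                if (m.From == state && m.Guard(x)) { taken = m; break; }
            if (taken == null) return null;
            state = taken.To;
            trace.Add(state);
        }
        return trace;
    }

    // The input is accepted when the run ends in the final state p.
    public static bool Accepts(IEnumerable<int> input)
    {
        var trace = Run("p", input);
        return trace != null && trace[trace.Count - 1] == "p";
    }

    static void Main()
    {
        var input = new[] { 1, 2, 5, 3 };
        Console.WriteLine(string.Join(" ", Run("p", input)));   // p p q p p
        Console.WriteLine(Accepts(input));                       // True
    }
}
```

Plain lambdas are enough for execution, but they cannot be checked for satisfiability, which is why the algorithms later in the talk need an effective Boolean algebra instead.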
SYMBOLIC FINITE AUTOMATA
What is the alphabet?
ALPHABET IS AN EFFECTIVE BOOLEAN ALGEBRA
(D, P, 〚_〛, ⊥, ⊤, ∨, ∧, ¬)
• D: domain
• P: predicates
• 〚_〛: P → 2^D
ALPHABET EXAMPLE
2^{a,b} = (D, P, 〚_〛, ⊥, ⊤, ∨, ∧, ¬) = ({a,b}, {∅,{a},{b},{a,b}}, id, ∅, {a,b}, ∪, ∩, ᶜ)
SFA over 2^{a,b}: states p and q, with transitions labeled {a}, {b} and {a,b}
regex: a*b(a|b)*
ALPHABET EXAMPLE: 2^BVk
• D = {n | 0 ≤ n < 2^k}
• P = BDDs of depth k
• Boolean operations are BDD operations
Below, 〚β_i〛 = {n ∈ D | i'th bit of n is 1}
β_i has fixed size independent of i
ALPHABET EXAMPLE: SMT^INT
• D = integers
• P = integer linear arithmetic formulas (with one fixed free variable)
• 〚φ ∧ ψ〛 = 〚φ〛 ∩ 〚ψ〛
• 〚⊥〛 = ∅, 〚¬φ〛 = D \ 〚φ〛
• Satisfiability: 〚φ〛 ≠ ∅
BOOLEAN ALGEBRA INTERFACE IN C#
public interface IBoolAlg<P>
{
    P Top { get; }
    P Bot { get; }
    P Not(P pred);
    P Or(P pred1, P pred2);
    P And(P pred1, P pred2);
    bool IsSat(P predicate);
}
public interface IBoolAlgExt<P,D> : IBoolAlg<P>
{
    IEnumerable<D> Den(P pred);   // elements denoted by pred
    P One(D element);             // predicate for a single element
}
UNIT ALPHABET EXAMPLE IN C#
One-letter alphabet:
class A1 : IBoolAlg<bool>
{
    public bool Top { get { return true; } }
    public bool Bot { get { return false; } }
    public bool Not(bool pred) { return !pred; }
    public bool Or(bool pred1, bool pred2) { return pred1 || pred2; }
    public bool And(bool pred1, bool pred2) { return pred1 && pred2; }
    public bool IsSat(bool pred) { return pred; }
}
ANOTHER ALPHABET EXAMPLE IN C#
16-letter alphabet:
// UInt16 operators promote to int, so results are cast back explicitly.
class A16 : IBoolAlg<UInt16>
{
    public UInt16 Top { get { return 0xFFFF; } }
    public UInt16 Bot { get { return 0; } }
    public UInt16 Not(UInt16 pred) { return unchecked((UInt16)~pred); }
    public UInt16 Or(UInt16 pred1, UInt16 pred2) { return (UInt16)(pred1 | pred2); }
    public UInt16 And(UInt16 pred1, UInt16 pred2) { return (UInt16)(pred1 & pred2); }
    public bool IsSat(UInt16 pred) { return pred != 0; }
}
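To see the 16-letter algebra in use, the sketch below treats each predicate as a bitmask in which bit i stands for letter i. The interface and `A16` are restated so the block compiles on its own; `One` and `A16Demo` are hypothetical additions for illustration and are not part of the slides.

```csharp
using System;

public interface IBoolAlg<P>
{
    P Top { get; }
    P Bot { get; }
    P Not(P pred);
    P Or(P pred1, P pred2);
    P And(P pred1, P pred2);
    bool IsSat(P pred);
}

// Predicates are 16-bit masks: bit i is set iff letter i satisfies the predicate.
public class A16 : IBoolAlg<ushort>
{
    public ushort Top => 0xFFFF;
    public ushort Bot => 0;
    public ushort Not(ushort p) => unchecked((ushort)~p);
    public ushort Or(ushort p1, ushort p2) => (ushort)(p1 | p2);
    public ushort And(ushort p1, ushort p2) => (ushort)(p1 & p2);
    public bool IsSat(ushort p) => p != 0;

    // Hypothetical helper: the predicate satisfied only by letter i (0 <= i < 16).
    public ushort One(int i) => (ushort)(1 << i);
}

public static class A16Demo
{
    public static void Main()
    {
        var alg = new A16();
        ushort p = alg.Or(alg.One(0), alg.One(1));   // {0, 1}
        ushort q = alg.Or(alg.One(1), alg.One(2));   // {1, 2}
        Console.WriteLine(alg.And(p, q) == alg.One(1));         // True
        Console.WriteLine(alg.IsSat(alg.And(p, alg.Not(p))));   // False
        Console.WriteLine(alg.Or(p, alg.Not(p)) == alg.Top);    // True
    }
}
```

Every operation here is a single machine instruction, so satisfiability checks in this algebra are essentially free.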
ALPHABET TRANSFORMATIONS
• Effective Boolean algebras can be extended
– e.g. disjoint union
• Effective Boolean algebras can be restricted
– e.g. restriction wrt. a given predicate
DISJOINT UNION OF ALPHABETS IN C#
public class PairAlg<S, T> : IBoolAlg<Pair<S, T>>
{
    IBoolAlg<S> A;
    IBoolAlg<T> B;
    public Pair<S,T> Bot { get { return new Pair<S,T>(A.Bot, B.Bot); } }
    …
    public Pair<S,T> Or(Pair<S,T> a, Pair<S,T> b)
    {
        return new Pair<S,T>(A.Or(a[0], b[0]), B.Or(a[1], b[1]));
    }
    // A pair is satisfiable when either component is.
    public bool IsSat(Pair<S,T> p)
    {
        return A.IsSat(p[0]) || B.IsSat(p[1]);
    }
}
SFA VS. CLASSICAL AUTOMATA?
• SFAs can support infinite alphabets
• For some cases SFAs are exponentially more succinct than NFAs
Example (recall the BDDs β_i from before, where 〚β_i〛 = {n | i'th bit of n is 1}):
Equivalent NFA requires 2^k transitions.
SYMBOLIC FINITE AUTOMATA
Algorithms over SFAs.
ALGORITHMS OVER SFAS
• Language intersection
– Uses product of automata
• Language complementation
– Requires determinization
• Minimization
– Extensions of Moore/Hopcroft [POPL’14]
• Regex → SFA construction
– Uses BDDs to represent Unicode character sets
– Requires conversions between BDDs and interval sets
• May cause exponential blowup: recall the bit-test BDDs β_i (i'th bit is 1)
LANGUAGE INTERSECTION
• Uses DFS and product of transitions
• From A: p1 --φ--> q1 and B: p2 --ψ--> q2, build A×B: (p1,p2) --φ∧ψ--> (q1,q2)
• Delete the product transition when φ ∧ ψ is unsat
INTERSECTION EXAMPLE
let φ_k(x) ≝ ((x mod k) = 0)
[Figure: SFA A over states a1, a2 and SFA B over states b1, b2, with guards built from φ_2, φ_3 and φ_6; the product A×B over state pairs such as (a1,b1), (a2,b2), (a1,b2), with conjoined guards such as φ_2 ∧ φ_3 and φ_6 ∧ φ_3; one product transition is deleted (marked X).]
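The product construction can be sketched over the 16-letter bitmask algebra, where satisfiability is a non-zero test. The automata below are invented for illustration and only loosely follow the slide, which uses guards of the form φ_k(x) = (x mod k = 0); all type and method names are assumptions.

```csharp
using System;
using System.Collections.Generic;

// A symbolic move whose guard is a bitmask over the letters 0..15.
sealed class Move
{
    public readonly string From, To;
    public readonly ushort Guard;
    public Move(string from, ushort guard, string to)
    { From = from; Guard = guard; To = to; }
}

static class ProductDemo
{
    // Bitmask of the letters x in 0..15 with x mod k == 0.
    public static ushort Div(int k)
    {
        int mask = 0;
        for (int x = 0; x < 16; x++)
            if (x % k == 0) mask |= 1 << x;
        return (ushort)mask;
    }

    // DFS from (startA, startB); product moves with unsatisfiable guards are dropped.
    public static List<Move> Product(Move[] a, string startA, Move[] b, string startB)
    {
        var moves = new List<Move>();
        var seen = new HashSet<string> { startA + "," + startB };
        var stack = new Stack<(string, string)>();
        stack.Push((startA, startB));
        while (stack.Count > 0)
        {
            var (p, q) = stack.Pop();
            foreach (var ma in a)
            {
                if (ma.From != p) continue;
                foreach (var mb in b)
                {
                    if (mb.From != q) continue;
                    ushort guard = (ushort)(ma.Guard & mb.Guard);
                    if (guard == 0) continue;              // unsat: delete
                    string target = ma.To + "," + mb.To;
                    moves.Add(new Move(p + "," + q, guard, target));
                    if (seen.Add(target)) stack.Push((ma.To, mb.To));
                }
            }
        }
        return moves;
    }

    // Hypothetical automata A and B, not the ones drawn on the slide.
    public static List<Move> Example()
    {
        ushort notDiv2 = unchecked((ushort)~Div(2));
        var a = new[] { new Move("a1", Div(2), "a2"), new Move("a1", notDiv2, "a1") };
        var b = new[] { new Move("b1", Div(3), "b2"), new Move("b1", (ushort)0x0002, "b1") };
        return Product(a, "a1", b, "b1");
    }

    static void Main()
    {
        foreach (var m in Example())
            Console.WriteLine(m.From + " --0x" + m.Guard.ToString("X4") + "--> " + m.To);
    }
}
```

Of the four candidate product moves from (a1,b1), the one pairing "x mod 2 = 0" with the singleton {1} has an empty guard and is deleted, so only three survive.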
LANGUAGE COMPLEMENTATION
First determinize, then swap final and nonfinal states.
[Figure: determinizing an SFA with states p, q, r yields subset states {p}, {q}, {q,r}, {r}; transitions with unsat guards are deleted.]
MINIMIZATION (SYMBOLIC MOORE)
D := (F × (Q\F)) ∪ ((Q\F) × F)
foreach (p’,q’) ∈ D, (p,q) ∉ D
    if (IsSat(guard(p,p’) ∧ guard(q,q’)))
        add (p,q) to D
Given p --φ--> p’ and q --ψ--> q’ with p’ and q’ distinguishable: p and q are distinguishable if IsSat(φ ∧ ψ)
REGEX → SFA
• Classical algorithm extended to work with predicates
– First produces εSFA (SFA with ε-moves)
– Then ε-moves are eliminated using the standard ε-elimination algorithm
– Requires interval-set → BDD algorithm for converting character classes
Example: [\0x0-\0xFF] = BDD whose bits in pos. > 7 are 0
ONLINE SFA ALGORITHM EXAMPLES
• http://www.rise4fun.com/Bex/zE
SYMBOLIC FINITE TRANSDUCERS
SYMBOLIC FINITE TRANSDUCER (SFT)
• Labels are guarded transformation functions
• Symbolic transition between p and q: λx. 80₁₆ ≤ x ≤ 7FF₁₆ / [C0₁₆ | x[10:6], 80₁₆ | x[5:0]]
– the part before “/” is the guard; the part after it uses bitvector operations (x[i:j] is the bit slice of x from bit i down to bit j)
• It denotes 1920 concrete transitions: ‘\x80’/“\xC2\x80”, …, ‘\x7FF’/“\xDF\xBF”
SFT EXECUTION EXAMPLE
• States p and q; transition labels: x mod 2 = 0 / [x, x], x mod 2 = 1 / [x-1], x mod 2 = 0 / [], x mod 2 = 1 / [x-1]
• Input tape: 1 2 5 3
• Run: p p q p p
• Output tape: 0 2 2 4 2
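The output tape of the execution example can be reproduced with a small SFT interpreter. As before this is only a sketch: the names are illustrative, and the target of the q move on even input (the one that outputs nothing) is assumed, because the run on the slide never takes it.

```csharp
using System;
using System.Collections.Generic;

// One SFT rule: guard / output function, from one state to another.
sealed class Rule
{
    public readonly string From, To;
    public readonly Func<int, bool> Guard;
    public readonly Func<int, int[]> Output;
    public Rule(string from, Func<int, bool> guard, Func<int, int[]> output, string to)
    { From = from; Guard = guard; Output = output; To = to; }
}

static class SftDemo
{
    static readonly Rule[] Rules =
    {
        new Rule("p", x => x % 2 == 0, x => new[] { x, x }, "q"),
        new Rule("p", x => x % 2 != 0, x => new[] { x - 1 }, "p"),
        new Rule("q", x => x % 2 == 0, x => new int[0], "q"),     // target state assumed
        new Rule("q", x => x % 2 != 0, x => new[] { x - 1 }, "p"),
    };

    // Feeds the input through the SFT and collects everything it yields.
    public static List<int> Transduce(string start, IEnumerable<int> input)
    {
        var output = new List<int>();
        string state = start;
        foreach (int x in input)
        {
            Rule fired = null;
            foreach (var r in Rules)
                if (r.From == state && r.Guard(x)) { fired = r; break; }
            if (fired == null)
                throw new InvalidOperationException("no enabled transition");
            output.AddRange(fired.Output(x));
            state = fired.To;
        }
        return output;
    }

    static void Main()
    {
        var result = Transduce("p", new[] { 1, 2, 5, 3 });
        Console.WriteLine(string.Join(" ", result));   // 0 2 2 4 2
    }
}
```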
SYMBOLIC FINITE TRANSDUCERS
Properties and algorithms
WHY SFTS?
• They have good algebraic properties (POPL'12)
– SFTs are closed under composition
– Equivalence is decidable in the single-valued case
– The domain of an SFT is an SFA
• SFAs are closed under Boolean operations
• Useful for various analysis tasks
SFT COMPOSITION
A∘B = λx. B(A(x))
A: a1 --(x>0 / [x+1, x+2])--> a2
B: b1 --(x<5 / [])--> b2 --(x<4 / [x, x])--> b3
A∘B: (a1,b1) --(x>0 ∧ x+1<5 ∧ x+2<4 / [x+2, x+2])--> (a2,b3)
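The composed guard and output in this example can be sanity-checked by brute force: run A and then B on concrete inputs and compare with the single composed transition. The sketch below hard-codes the three moves from the slide as plain functions; the function names are illustrative, and a real implementation would build the composed guard symbolically and ask an SMT solver about it rather than enumerate inputs.

```csharp
using System;
using System.Linq;

static class ComposeDemo
{
    // SFT A, single move a1 -> a2: guard x > 0, output [x+1, x+2]; null if the guard fails.
    public static int[] A(int x) => x > 0 ? new[] { x + 1, x + 2 } : null;

    // SFT B reads two symbols: b1 --(y<5 / [])--> b2 --(z<4 / [z, z])--> b3.
    public static int[] B(int y, int z) => (y < 5 && z < 4) ? new[] { z, z } : null;

    // Run A, then feed its two output symbols to B.
    public static int[] Sequential(int x)
    {
        var mid = A(x);
        return mid == null ? null : B(mid[0], mid[1]);
    }

    // The composed move: x>0 ∧ x+1<5 ∧ x+2<4 / [x+2, x+2].
    public static int[] Composed(int x) =>
        (x > 0 && x + 1 < 5 && x + 2 < 4) ? new[] { x + 2, x + 2 } : null;

    // True when both versions agree on every x in [lo, hi].
    public static bool AgreeOn(int lo, int hi)
    {
        for (int x = lo; x <= hi; x++)
        {
            var s = Sequential(x);
            var c = Composed(x);
            if ((s == null) != (c == null)) return false;
            if (s != null && !s.SequenceEqual(c)) return false;
        }
        return true;
    }

    static void Main()
    {
        Console.WriteLine(AgreeOn(-100, 100));              // True
        Console.WriteLine(string.Join(",", Composed(1)));   // 3,3
    }
}
```

Over the integers the composed guard is satisfied only by x = 1, so the composed transition is enabled for exactly one input symbol.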
SFT ALGORITHMS
• Composition: the pipeline in → SFT A → SFT B → out is a single SFT A∘B
• Equiv. checking for single-valued SFTs (undecidable in general): given SFT A and SFT B, either A ≡ B or an “input string” shows that A and B are not equivalent
Algorithms use SMT for satisfiability checking of character formulas
PROPERTY ANALYSIS (USENIX SEC'11)
• Does it matter if a sanitizer is applied twice? Idempotence: A∘A ≡ A, otherwise an “input string” shows that A is not idempotent
• Does order of sanitizers matter? Commutativity: A∘B ≡ B∘A, otherwise an “input string” shows that A and B are not commutative
APPLICATIONS
APPLICATIONS OF SFAS/SFTS
• SFAs:
– Regex support in parameterized unit testing
– Fuzz testing of regexes
– Password generation
• SFTs:
– Analysis of string encoders/decoders
– Security analysis of sanitizers
APPLICATION 1: REGEXES IN PARAMETERIZED UNIT TESTING
• Rex component in Pex
• Generate values for s that reach the return branches
– s is a string of Unicode characters (16-bit bit-vectors)
bool IsValidEmail(string s)
{
    string r1 = @"^[A-Za-z0-9]+@(([A-Za-z0-9\-])+\.)+([A-Za-z\-])+$";
    string r2 = @"^\d.*$";
    if (System.Text.RegularExpressions.Regex.IsMatch(s, r1))
        if (System.Text.RegularExpressions.Regex.IsMatch(s, r2))
            return false; //branch 1
        else
            return true; //branch 2
    else
        return false; //branch 3
}
Branch 1: solve s ∈ L(r1) ∩ L(r2) [e.g. s = “[email protected]”]
Branch 2: solve s ∈ L(r1) \ L(r2) [e.g. s = “[email protected]”]
Branch 3: solve s ∉ L(r1) [e.g. s = “[email protected]”]
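The three language constraints can be checked against concrete inputs with the .NET regex engine. In the sketch below the sample strings are hand-picked for illustration (they are not the witnesses from the slide), and `Branch` is a hypothetical helper that reports which return statement of `IsValidEmail` an input reaches.

```csharp
using System;
using System.Text.RegularExpressions;

static class EmailDemo
{
    const string R1 = @"^[A-Za-z0-9]+@(([A-Za-z0-9\-])+\.)+([A-Za-z\-])+$";
    const string R2 = @"^\d.*$";

    // Returns which branch (1, 2 or 3) of IsValidEmail the input reaches.
    public static int Branch(string s)
    {
        if (Regex.IsMatch(s, R1))
            return Regex.IsMatch(s, R2) ? 1 : 2;
        return 3;
    }

    static void Main()
    {
        Console.WriteLine(Branch("1a@b.c"));       // 1: in L(r1) and L(r2)
        Console.WriteLine(Branch("a@b.c"));        // 2: in L(r1) but not L(r2)
        Console.WriteLine(Branch("no-at-sign"));   // 3: not in L(r1)
    }
}
```

A tool such as Rex works in the opposite direction: it starts from the branch condition and synthesizes a string, instead of testing strings a human wrote.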
APPLICATION 2: PASSWORD GENERATION
Given constraints:
• Length is k: "^[\x21-\x7E]{k}$"
• Contains 2 capital letters: "[A-Z].*[A-Z]"
• Contains a digit: "\d"
• Contains a non-word character: "\W"
Generate random instances with uniform distribution that match all the above conditions.
k=4: http://www.rise4fun.com/Rex/4nE
http://www.rise4fun.com/Bek/c3j
APPLICATION 3: SAFETY ANALYSIS
Example: suppose good output = “NoEars"
NoEars = [^\uDE38-\uDE40]*
bad output: WithEars = Complement(NoEars)
Does there exist an input x that causes “ears" in the output?
∃x (smileycipher(x) ∈ WithEars) ?
{x | smileycipher(x) ∈ WithEars}
http://www.rise4fun.com/Bek/5sHO
EXTENSIONS
EXTENSIONS OF SFAS AND SFTS
• ESFT
– SFAs/SFTs with look-ahead [CAV'13]
– BEX language
• STT
– Symbolic automata/transducers over trees
– FAST language [PLDI’14]
• k-SFT
– SFT with lookback [POPL’15]
ESFAS AND ESFTS
• Unlike in the classical case, look-ahead breaks many properties
– e.g. equivalence of ESFAs is undecidable
Example ESFT transition on state q (base64 encoder):
x1≤FF ∧ x2≤FF ∧ x3≤FF / [x1>>2, ((x1&3)<<4)|(x2>>4), ((x2&0xF)<<2)|(x3>>6), x3&0x3F]
The above ESFT reads 3 and writes 4 symbols, e.g. M a n ↦ T W F u
http://www.rise4fun.com/Bex/tutorial/guide
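The output functions of the base64 transition can be evaluated on the example input "Man". The sketch below applies the four bit-vector expressions and maps the resulting 6-bit digits through the standard base64 alphabet; that final mapping step and all names are assumptions for illustration, since the transition shown on the slide stops at the digits.

```csharp
using System;
using System.Text;

static class Base64Demo
{
    const string Alphabet =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    // The ESFT move: reads three bytes, writes four 6-bit digits.
    public static int[] Step(int x1, int x2, int x3)
    {
        if (!(x1 <= 0xFF && x2 <= 0xFF && x3 <= 0xFF))
            throw new ArgumentException("guard failed");
        return new[]
        {
            x1 >> 2,
            ((x1 & 3) << 4) | (x2 >> 4),
            ((x2 & 0xF) << 2) | (x3 >> 6),
            x3 & 0x3F
        };
    }

    // Encodes a string of exactly three ASCII characters.
    public static string Encode3(string s)
    {
        var sb = new StringBuilder();
        foreach (int digit in Step(s[0], s[1], s[2]))
            sb.Append(Alphabet[digit]);
        return sb.ToString();
    }

    static void Main()
    {
        Console.WriteLine(string.Join(" ", Step('M', 'a', 'n')));   // 19 22 5 46
        Console.WriteLine(Encode3("Man"));                           // TWFu
        // Cross-check against the framework's own encoder.
        Console.WriteLine(Encode3("Man") ==
            Convert.ToBase64String(Encoding.ASCII.GetBytes("Man"))); // True
    }
}
```

The look-ahead is what makes this an ESFT: the second and third output digits each depend on two adjacent input symbols, which a plain SFT could only express by storing the earlier byte in its state.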
FAST (TREE TRANSDUCERS)
• Trees are common input/output data structures
– XML query, type-checking, etc…
– Natural language translators (from parse tree to parse tree)
– Compilers/optimizers (from parse tree to parse tree)
– Tree manipulating programs: data structures algorithms, ontologies, etc…
– Augmented reality
– http://www.rise4fun.com/Fast/tutorial/guide
OUR RECIPE FOR EACH TASK
• DSL, for example:
s := iter(c in t)[b := false;] {
    case (!b && c in "[\"\\]"):
        b := false; yield('\\', c);
    case (c == '\\'):
        b := !b; yield(c);
    case (true):
        b := false; yield(c);
};
• DSL → Transducer Model (Automata-.NET, Z3)
• Transducer Model + analysis question → Transformation Analysis: does it do the right thing?
• Code Gen → C#, JavaScript, C
Automata-.NET will be open source on GitHub under MIT license
Some references:
BEK
• Fast and precise sanitizer analysis with BEK. Hooimeijer, Livshits, Molnar, Saxena, Veanes, USENIX Security'11
• Symbolic finite state transducers: algorithms and applications. Veanes, Hooimeijer, Livshits, Molnar, Bjørner, POPL’12
BEX
• Static analysis of string encoders and decoders. D’Antoni, Veanes, VMCAI’13
• Equivalence of extended symbolic finite transducers. D’Antoni, Veanes, CAV’13
• Data parallel string manipulating programs. Veanes, Mytkowicz, Molnar, Livshits, POPL’15
QUESTIONS?
Links to related online tutorials:
– Bek: http://rise4fun.com/Bek/tutorial
– Bex: http://rise4fun.com/Bex/tutorial
– Rex: http://rise4fun.com/rex/
– Fast: http://rise4fun.com/Fast/tutorial