
LANGUAGE AND GRAMMARS

© University of Liverpool COMP 319


Contents
• Languages and Grammars
• Formal languages
• Formal grammars
• Generative grammars
• Analytic grammars
• Context-free grammars
• LL parsers
• LR parsers
• Rewrite systems
• L-systems


Software Engineering Foundation

Software engineering may be summarised by saying that it concerns the construction of programs to solve problems, and that it has three parts:

- Construction/engineering, and methods
- Problems, and problem solving, and
- Programs


Languages and grammar
• Languages are spoken and written (linguistics)
• To be effective they must be based on a shared set of rules – a grammar
• Grammars are introspective: they are based on and couched in language
• Natural language grammars are constantly shifting and locally negotiated
• A grammar is a formal language in which the rules of discourse are discussed and are the aim


Formal language concepts
• The concept emerges because of the need to define rules (for language)
• Formally, formal languages are collections of words composed of smaller, atomic units
• Issues of concern are
  - the number and nature of the atomic units,
  - the precision level required,
  - the completeness of the formalism


Examples of formal languages

• The set of all words over {a, b}
• The set {a^n : n is a prime number}
• The set of syntactically correct programs in a given computer programming language
• The set of inputs upon which a certain Turing machine halts


Formal language specification

There are many ways in which a formal language can be specified, e.g.
• strings produced by a formal grammar
• strings produced by regular expressions
• strings accepted by automata
• logic and other formalisms


Language Production Operations

• Concatenation of strings drawn from the two languages
• Intersection (strings common to both languages) or union (strings in either language)
• Complement of one language
• Right quotient of one by the other
• Kleene star operation on one language
• Reverse of a language
• Shuffle combination of languages
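
For small finite languages, several of these operations can be tried out directly on Python sets of strings. The sketch below is illustrative only: the function names and the sample languages L1 and L2 are assumptions, the Kleene star is cut off after a fixed number of repetitions because the true result is infinite, and complement and shuffle are left out (complement needs a fixed universe of words to subtract from).

```python
from itertools import product

def concat(l1, l2):
    """All words u + v with u taken from l1 and v taken from l2."""
    return {u + v for u, v in product(l1, l2)}

def reverse(lang):
    """Every word of the language written backwards."""
    return {w[::-1] for w in lang}

def kleene_star(lang, max_repeats):
    """Finite approximation of L*: words built from at most max_repeats factors."""
    result = {""}
    layer = {""}
    for _ in range(max_repeats):
        layer = concat(layer, lang)
        result |= layer
    return result

def right_quotient(l1, l2):
    """L1 / L2: every x such that x + y is in l1 for some y in l2."""
    return {w[:len(w) - len(y)] for w in l1 for y in l2 if w.endswith(y)}

# Hypothetical sample languages, chosen only for illustration.
L1 = {"a", "ab"}
L2 = {"b", "ab"}

union = L1 | L2          # strings in either language
intersection = L1 & L2   # strings common to both languages
```

Union and intersection are simply the built-in set operators; the other operations need a few lines each.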


Formal Grammars

• Noam Chomsky
  - Linguist, philosopher at MIT
  - 1956, papers on information and grammar
• Types of formal grammar
  - Generative grammar
  - Analytic grammar


Generative formal grammars

• Generative grammars: a set of rules by which all possible strings in the language being described can be generated by successively rewriting strings, starting from a designated start symbol.

In effect, such a grammar formalises an algorithm that generates the strings of the language.


Analytic formal grammars

• Analytic grammars: a set of rules that takes an arbitrary string as input and successively reduces or analyses it to yield a final boolean “yes/no” answer, indicating whether the string is a member of the language described by the grammar.

In effect, a parser or recogniser for a language.
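
A recogniser in this sense can be very small. The sketch below uses the language of n copies of a followed by n copies of b as an illustrative choice; the function name and the single reduction step (strip one leading a together with one trailing b) are assumptions made for the example, not a general parsing method.

```python
def accepts_anbn(word):
    """Reduce the input step by step; answer yes only if nothing is left."""
    while word.startswith("a") and word.endswith("b"):
        word = word[1:-1]   # one reduction step: a w b  =>  w
    return word == ""
```

For instance, `accepts_anbn("aabb")` reduces "aabb" to "ab" and then to the empty string, so the answer is yes, while "abab" gets stuck at "ba" and is rejected.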


Generative grammar components

Chomsky’s definition was devised essentially for linguistics but suits formal computing grammars well. A grammar consists of the following components:

- A finite set N of nonterminal symbols
- A finite set Σ of terminal symbols, disjoint from N
- A finite set P of production rules, where a rule is of the form: string in (Σ ∪ N)* → string in (Σ ∪ N)*
- A symbol S in N that is identified as the start symbol


Generative grammar definition

• The language of a formal grammar G = (N, Σ, P, S) is denoted by L(G)
• It is defined as all those strings over Σ that can be generated by starting from the symbol S and then applying the rules in P until no more nonterminal symbols are present


A generative formal grammar
• Given the terminals {a, b} and the nonterminals {S, A, B}, where S is the special start symbol
• Productions:
    S → ABS
    S → ε (the empty string)
    BA → AB
    BS → b
    Bb → bb
    Ab → ab
    Aa → aa

Defines all the words of the form a^n b^n (i.e. n copies of a followed by n copies of b)
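
One way to check this claim for short words is to apply the productions mechanically. The sketch below does a breadth-first search over sentential forms, starting from S and rewriting every occurrence of every left-hand side. The function name and the length bound are assumptions of the example; the bound relies on the observation that forms in this particular grammar never shrink by more than the single S they contain.

```python
from collections import deque

# The seven productions from the slide; "" stands for the empty string.
PRODUCTIONS = [
    ("S", "ABS"),
    ("S", ""),
    ("BA", "AB"),
    ("BS", "b"),
    ("Bb", "bb"),
    ("Ab", "ab"),
    ("Aa", "aa"),
]

def generate(max_len):
    """Return every terminal string of length <= max_len derivable from S."""
    seen = {"S"}
    queue = deque(["S"])
    words = set()
    while queue:
        form = queue.popleft()
        if all(ch in "ab" for ch in form):
            words.add(form)          # no nonterminals left: a word of L(G)
            continue
        for lhs, rhs in PRODUCTIONS:
            pos = form.find(lhs)
            while pos != -1:         # rewrite each occurrence of lhs in turn
                new = form[:pos] + rhs + form[pos + len(lhs):]
                # Longer forms cannot lead back to a word within the bound.
                if len(new) <= max_len + 1 and new not in seen:
                    seen.add(new)
                    queue.append(new)
                pos = form.find(lhs, pos + 1)
    return words
```

Derivations that apply the rules in an unhelpful order (for example turning Ab into ab before the B symbols have been sorted to the right) simply get stuck with nonterminals left over, so they contribute no words.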


Context Free Grammars

• Theoretical basis of most programming languages.

• Easy to generate a parser using a compiler compiler.

• Two main approaches exist: top-down parsing e.g. LL parsers, and bottom-up parsing e.g. LR parsers.


LL parser
• Table-based, top-down parser for a subset of the context-free grammars (LL grammars).
• Parsing is Left to right, and constructs a Leftmost derivation of the sentence.
• LL(k) parsers use k tokens of look-ahead to parse the LL(k) grammar sentence.
• LL(1) grammars are popular, and their parsers fast, because only the next token is considered in parsing decisions.


Table based LL parsing

    Input buffer: <null>
                    |
                    |
             +-------------+
    Stack    |             |
      S <--- |   Parser    | --> Output
      $      |             |
             +-------------+
                    ^
                    |
              +-----------+
              |  Parsing  |
              |   table   |
              +-----------+

Architecture

• Consider the grammar
    1. S → F
    2. S → ( S + F )
    3. F → 1
• This has the parsing table

         (    )    1    +    $
    S    2    -    1    -    -
    F    -    -    3    -    -

  e.g. input 1 with S on top of the stack implies rule 1, i.e. S on the stack is replaced with F and the rule number 1 is output
  Stack top and input the same = delete both
  Stack top and input different = error
• Example input
    ( 1 + 1 ) $


    input   stack        action            output
    (       S$           parse ( S : 2     2
    (       (S + F)$     ( ( delete        2
    1       S + F)$      parse 1 S : 1     21
    1       F + F)$      parse 1 F : 3     213
    1       1 + F)$      1 1 delete        213
    +       + F)$        + + delete        213
    1       F)$          parse 1 F : 3     2133
    1       1)$          1 1 delete        2133
    )       )$           ) ) delete        2133
    $       $            stop              2133
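
The trace above can be reproduced with a short table-driven loop. This is a minimal sketch, assuming single-character tokens and using dictionary and function names (RULES, TABLE, ll1_parse) invented for the example; the grammar and the table entries are the ones given on the slide.

```python
# Grammar rules by number: left-hand side and right-hand side symbols.
RULES = {
    1: ("S", ["F"]),
    2: ("S", ["(", "S", "+", "F", ")"]),
    3: ("F", ["1"]),
}

# Parsing table: (nonterminal on the stack, next input token) -> rule number.
TABLE = {
    ("S", "("): 2,
    ("S", "1"): 1,
    ("F", "1"): 3,
}

NONTERMINALS = {"S", "F"}

def ll1_parse(tokens):
    """Return the list of rule numbers applied, or raise SyntaxError."""
    tokens = list(tokens) + ["$"]
    stack = ["$", "S"]              # the top of the stack is the end of the list
    output = []
    pos = 0
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top in NONTERMINALS:
            rule = TABLE.get((top, look))
            if rule is None:
                raise SyntaxError(f"no rule for {top!r} on input {look!r}")
            output.append(rule)
            # Push the right-hand side so its first symbol ends up on top.
            stack.extend(reversed(RULES[rule][1]))
        elif top == look:
            pos += 1                # stack and input the same: delete both
        else:
            raise SyntaxError(f"expected {top!r} but found {look!r}")
    return output
```

Calling `ll1_parse("(1+1)")` applies rules 2, 1, 3, 3 in that order, matching the output column of the trace.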


Parse Tree


LR (Left-to-right, Rightmost derivation) parser
• Bottom-up parser for context-free grammars, used by many programming language compilers
• Parsing is Left to right, and produces a Rightmost derivation (in reverse).
• LR(k) parsers use k tokens of look-ahead.
• LR(1) is the most common type of parser, used by many programming languages. It is almost always generated using a parser generator, which constructs the parsing table; e.g. Simple LR (SLR), Look-Ahead LR (LALR, e.g. Yacc), Canonical LR.


LR parser example

• Rules
    (1) E → E * B
    (2) E → E + B
    (3) E → B
    (4) B → 0
    (5) B → 1
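
To show how a table-driven LR parser handles these rules, here is a minimal shift-reduce sketch. The ACTION and GOTO tables are an assumption: they are not taken from the slides but were worked out for this example with the standard LR(0) item-set construction, and the state numbering is arbitrary. Tokens are assumed to be single characters.

```python
# Rule number -> (left-hand side, number of symbols on the right-hand side).
RULES = {
    1: ("E", 3),   # E -> E * B
    2: ("E", 3),   # E -> E + B
    3: ("E", 1),   # E -> B
    4: ("B", 1),   # B -> 0
    5: ("B", 1),   # B -> 1
}

TERMINALS = "*+01$"

# ACTION[state][token] is ("s", next_state), ("r", rule) or ("acc", None).
ACTION = {
    0: {"0": ("s", 1), "1": ("s", 2)},
    1: {t: ("r", 4) for t in TERMINALS},
    2: {t: ("r", 5) for t in TERMINALS},
    3: {"*": ("s", 5), "+": ("s", 6), "$": ("acc", None)},
    4: {t: ("r", 3) for t in TERMINALS},
    5: {"0": ("s", 1), "1": ("s", 2)},
    6: {"0": ("s", 1), "1": ("s", 2)},
    7: {t: ("r", 1) for t in TERMINALS},
    8: {t: ("r", 2) for t in TERMINALS},
}

# GOTO[state][nonterminal] is the state entered after a reduction.
GOTO = {0: {"E": 3, "B": 4}, 5: {"B": 7}, 6: {"B": 8}}

def lr_parse(tokens):
    """Return the rule numbers in the order they are reduced."""
    tokens = list(tokens) + ["$"]
    states = [0]
    output = []
    pos = 0
    while True:
        kind, arg = ACTION[states[-1]].get(tokens[pos], ("err", None))
        if kind == "s":                      # shift: consume one token
            states.append(arg)
            pos += 1
        elif kind == "r":                    # reduce by rule number arg
            lhs, length = RULES[arg]
            del states[-length:]
            states.append(GOTO[states[-1]][lhs])
            output.append(arg)
        elif kind == "acc":
            return output
        else:
            raise SyntaxError(f"unexpected {tokens[pos]!r} at position {pos}")
```

For the input `1+1` the reductions come out as 5, 3, 5, 2, which is the rightmost derivation E → E + B → E + 1 → B + 1 → 1 + 1 read backwards.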



Re-writing
• Rewriting is a general process involving strings and alphabets. It is classified according to what is rewritten, e.g. strings, terms, graphs, etc.
• A rewrite system is a set of equations that characterises a system of computation. Based on the use of rewrite rules, it provides one method of automating theorem proving.
• Examples of practical systems that use this approach include the software Mathematica.


Re-writing logic example

• !!A = A // eliminate double negation

• !(A AND B) = !A OR !B // De Morgan's law
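
Read left to right, the two equations can be used as rewrite rules on expression trees. The sketch below assumes a representation chosen only for this example: a variable is a string, and a compound term is a tuple such as ("not", x), ("and", x, y) or ("or", x, y). The function names are likewise hypothetical.

```python
def rewrite_root(term):
    """Try both rules at the root of the term; return (term, changed)."""
    if isinstance(term, tuple) and term[0] == "not":
        inner = term[1]
        if isinstance(inner, tuple) and inner[0] == "not":
            return inner[1], True                                      # !!A = A
        if isinstance(inner, tuple) and inner[0] == "and":
            return ("or", ("not", inner[1]), ("not", inner[2])), True  # De Morgan
    return term, False

def normalise(term):
    """Rewrite until neither rule applies anywhere in the term."""
    if isinstance(term, str):
        return term
    term, changed = rewrite_root(term)
    if changed:
        return normalise(term)
    # No rule applies at the root, so normalise the sub-terms.
    return (term[0],) + tuple(normalise(arg) for arg in term[1:])
```

For example, `normalise(("not", ("and", ("not", "A"), "B")))` first applies De Morgan's law and then removes the double negation, giving `("or", "A", ("not", "B"))`.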


Re-writing in Mathematica (Wolfram)


L-systems

• Named after Aristid Lindenmayer (1925-1989), a Hungarian theoretical biologist and botanist who worked at the University of Utrecht (Netherlands)

• They are a formal grammar used to model the growth and morphology of plants and animals

• In plant and animal modelling a special form, the parametric L-system, is used – based on rewriting.

• Because of their recursive, parallel, and unlimited nature they lead to concepts of self-similarity, fractional dimension and fractal-like forms.


L-system structure
• The basic system is identical to formal grammars: G = {V, S, Ω, P}
• where
    G is the grammar defined
    V (the alphabet) is a set of symbols that can be replaced (variables)
    S is a set of symbols that remain fixed (constants; written C in the examples)
    Ω (start, axiom or initiator) is a string of symbols from V, the initial state
    P is a set of rules or productions defining the ways variables can be replaced by constants and other variables. Each rule consists of a LHS (predecessor) and a RHS (successor)


Example 1: Fibonacci numbers

• V: A B
• C: none
• Ω: A
• P: p1: A → B
     p2: B → AB

N=0   A
N=1 → B
N=2 → AB
N=3 → BAB
N=4 → ABBAB
N=5 → BABABBAB
N=6 → ABBABBABABBAB
N=7 → BABABBAB...

Counting lengths we get: 1, 1, 2, 3, 5, 8, 13, 21, ...
The Fibonacci numbers
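
The generations above can be recomputed with a few lines of code. The sketch assumes that each symbol is a single character and that every symbol is rewritten at the same time in each step, which is what distinguishes an L-system from the one-rule-at-a-time derivations of an ordinary grammar; symbols without a rule are treated as constants and copied unchanged. The function names are illustrative.

```python
def rewrite(word, rules):
    """One generation: replace every symbol in parallel."""
    return "".join(rules.get(symbol, symbol) for symbol in word)

def generations(axiom, rules, n):
    """Return the list of words for N = 0 .. n."""
    words = [axiom]
    for _ in range(n):
        words.append(rewrite(words[-1], rules))
    return words

# Productions p1 and p2 of the Fibonacci example.
FIBONACCI_RULES = {"A": "B", "B": "AB"}

words = generations("A", FIBONACCI_RULES, 7)
lengths = [len(w) for w in words]
```

Here `lengths` comes out as 1, 1, 2, 3, 5, 8, 13, 21. The same two functions work for any other rule set by passing a different axiom and dictionary of productions.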


Example 2: Algal growth

• V: A B
• C: none
• Ω: A
• P: p1: A → AB
     p2: B → A

N=0   A
N=1 → AB
N=2 → ABA
N=3 → ABAAB
N=4 → ABAABABA


COMP319 Software Engineering II

Example 3: Koch snowflake

• V: F
• C: + - (never rewritten)
• Ω: F
• P: p1: F → F+F-F-F+F

N=0   F
N=1 → F+F-F-F+F
N=2 → F+F-F-F+F+F...
N=3 etc.


Example 4: 3D Hilbert curve


Example 5: Branching
