Programming Languages: Design, Specification, and Implementation

Programming Languages: Design, Specification, and Implementation G22.2210-001 Rob Strom December 7, 2006


Page 1: Programming Languages: Design, Specification, and Implementation

Programming Languages: Design, Specification, and Implementation

G22.2210-001, Rob Strom

December 7, 2006

Page 2: Programming Languages: Design, Specification, and Implementation

Programming Languages Core Exam

• Syntactic issues: regular expressions, context-free grammars (CFG), BNF.
• Imperative languages: program organization, control structures, exceptions.
• Types in imperative languages: strong typing, type equivalence, unions and discriminated types in C and Ada.
• Block structure, visibility and scoping issues, parameter passing.
• Systems programming and weak typing: exposing machine characteristics, type coercion, pointers & arrays in C.
• Run-time organization of block-structured languages: static scoping, activation records, dynamic and static chains, displays.
• Programming in the large: abstract data types, modules, packages and namespaces in Ada, Java, and C++.
• Functional programming: list structures, higher order functions, lambda expressions, garbage collection, metainterpreters in Lisp and Scheme. Type inference and ML.
• Object-Oriented programming: classes, inheritance, polymorphism, dynamic dispatching. Constructors, destructors and multiple inheritance in C++, interfaces in Java.
• Generic programming: parametrized units and classes in C++, Ada and Java.
• Concurrent programming: threads and tasks, communication, race conditions and deadlocks, protected methods and types in Ada and Java.

Page 3: Programming Languages: Design, Specification, and Implementation

Readings (optional)

• Regular expressions: http://www.regular-expressions.info/

• Prolog: Giannesini et al, “Prolog”, Addison-Wesley 1986.

• SQL: C.J. Date, “An Introduction to Database Systems”, Addison-Wesley 2000, chapters 3, 4.

• Hermes: Strom et al: “Hermes: A Language for Distributed Computing”, Prentice-Hall, 1991.

• Java RMI: http://java.sun.com/docs/books/tutorial/rmi/overview.html

Page 4: Programming Languages: Design, Specification, and Implementation

Guava’s Component Model

[Figure: component diagram with nodes labeled M1, M2, V1, V2, V11, and O1 to O10, plus two X marks.]

Page 5: Programming Languages: Design, Specification, and Implementation

Java and Guava types

Java:

Values
• built in – primitive types
• no classes/methods
• no references/sharing
• copy by value

Referenced Objects
• classes-methods
• access by reference
• synchronized or not

Guava:

Referenced Monitors
• + always synchronized

Referenced Objects
• + never synchronized
• – restricted references

Values
• + may be user-definable
• + classes/methods
• + move, copy-by-value

Page 6: Programming Languages: Design, Specification, and Implementation

Guava changes: summary

– Annotations
• synchronized
• volatile

+ Annotations
• [read], update
• [local], global
• lent, kept, in Region, new

[Figure: class hierarchy with nodes Instance, Object, Monitor, Local, Reference, Mobile, and Value; method labels toString, clone, hashCode, equals, getClass, wait, notify, notifyAll, finalize, and copy.]

Page 7: Programming Languages: Design, Specification, and Implementation

Monitors, Values, Objects

An object has at most one owning monitor/value.

[Figure: M1, M2, and M3; arrays X: BucketList[] and Y: BucketList[], each with elements [0] and [1] pointing to bucket chains (A 3, D 9, E) and (B 3, C 8); a node Z; buckets labeled G 1.]

Page 8: Programming Languages: Design, Specification, and Implementation

Region Analysis: lent, kept

class M2type extends Monitor {
    . . .
    void foo (lent Bucket P1);
}
class Ztype extends Object {
    . . .
    void bar (lent Bucket P1, kept Bucket P2);
}

• lent = An unknown region (default for parameters of non-objects)
• kept = Same region as this (default for parameters of objects)

[Figure: M1 and M2 with X, Y, and Z, shown around the call M2.foo(X). Statements annotated in the diagram: this.N = P1; Z.bar(P1,Y); this.A = P1; this.A = P2; P1.op(); P2.N = this; Z.bar(Y,P1);]

Page 9: Programming Languages: Design, Specification, and Implementation

Region Analysis: new, in R

class M2type extends Monitor {
    . . .
    new Bucket m (Bucket P1 in R1, Bucket P2 in R1,
                  Bucket P3 in R2, Bucket P4 in R2);
}

• new = No region
• in R = Same region as other parameters labeled in R

[Figure: M1 and M2 with A, B, C, D, and E, shown around the call E = M2.m(A,B,C,D). Statements annotated in the diagram: P1.N = P2; P4.N = P3; P2.N = P4; return new Bucket(3, null);]

Page 10: Programming Languages: Design, Specification, and Implementation

Other paradigms

• String processing – lex, yacc, SableCC, regular expressions
• Logic programming – Prolog
• Transactional programming – CLU, SQL, XQuery
• Distributed – Hermes, Java (and other) RMI tools

Page 11: Programming Languages: Design, Specification, and Implementation

Regular expressions

Distinguish between the syntax of the pattern and the semantics. Different engines will have slightly different syntax.

A regular expression is a
• pattern that you apply to
• a text, in order to
• determine a match, and sometimes to
• parse the components of the match (e.g. for search/replacement)

A regular language is one that can be recognized by a regular expression.

Page 12: Programming Languages: Design, Specification, and Implementation

Patterns

• Exact character: a
• Any character: .
• Zero or more repetitions of a pattern: * (patterns can be grouped with parentheses)
• Concatenation: (pat1)(pat2)
• Alternation: (pat1)|(pat2)
• Zero or 1: x? (same as (x)|())
• One or more: x+ (same as x(x)*)
• Any character (not) in the set: [a-z] [^a-z]
• Other special matches, that match positions rather than characters: e.g. ^ (start of line), $ (end of line), \b (word boundary)
• Modern extensions: {x,y} between x and y occurrences

Page 13: Programming Languages: Design, Specification, and Implementation

Examples

• .*\.txt matches a filename ending with .txt
• \b[A-Z0-9._%-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b matches an email address, like [email protected] (the character classes list only A-Z, so lowercase addresses need case-insensitive matching)
• <[A-Za-z][A-Za-z0-9]*> matches a tag like <Foo>

Page 14: Programming Languages: Design, Specification, and Implementation

More examples

<.+> applied to the string
• Here is a test: <test> hello </test> now what?
• Matches <test> hello </test>

That’s because + and * are greedy. Suppose you wanted to match only <test>. You should do one of the following:
• Select lazy matching: <.+?>
• Replace dot with something more restrictive, e.g. <[^<>]+>
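The example patterns from these slides can be tried directly with java.util.regex. The following is a minimal sketch: the class name RegexSlides, the helper firstMatch, and the address user@example.com are illustrative assumptions, not part of the slides. Backslashes are doubled because the patterns are written as Java string literals.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexSlides {
    // Patterns copied from the slides (class and field names are illustrative).
    static final Pattern TXT = Pattern.compile(".*\\.txt");
    static final Pattern TAG = Pattern.compile("<[A-Za-z][A-Za-z0-9]*>");
    // The email pattern lists only A-Z, so it is compiled case-insensitively.
    static final Pattern EMAIL = Pattern.compile(
            "\\b[A-Z0-9._%-]+@[A-Z0-9.-]+\\.[A-Z]{2,4}\\b", Pattern.CASE_INSENSITIVE);

    // Returns the first substring of text matched by regex, or null if there is none.
    static String firstMatch(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(TXT.matcher("notes.txt").matches());          // true
        System.out.println(TAG.matcher("<Foo>").matches());              // true
        System.out.println(EMAIL.matcher("user@example.com").matches()); // true

        String text = "Here is a test: <test> hello </test> now what?";
        System.out.println(firstMatch("<.+>", text));      // greedy: <test> hello </test>
        System.out.println(firstMatch("<.+?>", text));     // lazy:   <test>
        System.out.println(firstMatch("<[^<>]+>", text));  // restricted class: <test>
    }
}
```

The restricted character class is often the safer choice of the two fixes, since it cannot run past a closing bracket even when the engine has to backtrack. now what?";
assert RegexSlides.firstMatch("<.+>", text).equals("<test> hello </test>");
assert RegexSlides.firstMatch("<.+?>", text).equals("<test>");
assert RegexSlides.firstMatch("<[^<>]+>", text).equals("<test>");
assert RegexSlides.TXT.matcher("notes.txt").matches();
assert !RegexSlides.TXT.matcher("notes.txt.bak").matches();
assert RegexSlides.TAG.matcher("<Foo>").matches();
assert !RegexSlides.TAG.matcher("<1x>").matches();
assert RegexSlides.EMAIL.matcher("user@example.com").matches();
</test>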

Page 15: Programming Languages: Design, Specification, and Implementation

Logic Programming – Prolog (from Gelernter et al.)

Facts, relationships that are true:
• female(annette).
• female(marilyn).
• female(audrey).
• parents(annette, fred, marilyn).
• parents(audrey, fred, marilyn).
• parents(marilyn, john, liz).

Rules, inferences that can be made:
• sister(X,Y) :- female(X), female(Y), parents(X,M,F), parents(Y,M,F). – two females having the same parents are sisters (as written, nothing requires X and Y to differ, so every female with recorded parents is also her own sister)

Queries:
• ? :- sister(annette, audrey). – are annette and audrey sisters?
• ? :- sister(_, audrey). – does audrey have a sister?
• ? :- sister(X, audrey). – for what X are X and audrey sisters (i.e., who are audrey’s sisters)

Page 16: Programming Languages: Design, Specification, and Implementation

Examples:

Data types: integers, strings, lists. Append example:

• append([], L, L). – an empty list appended by L yields L

• append([X|L1], L2, [X|L3]) :- append(L1,L2,L3). – if L3 is L1 appended by L2, then X followed by L3 is (X followed by L1) appended by L2

• ? :- append([a,b,c],[d,e,f],X). asks what is [a,b,c] appended by [d,e,f] – answer is [a,b,c,d,e,f].

• ? :- append([a,b,c], X, [a,b,c,d,e,f]). asks what should I append [a,b,c] by to get [a,b,c,d,e,f]. Notice the query is just as easy to express.

Page 17: Programming Languages: Design, Specification, and Implementation

How does it work?

Our old friend, unification!
• Remember, that means, find a substitution of variables that makes two expressions the same.

A goal is satisfied if:
• A fact unifies with the goal
• A rule’s head unifies with the goal and then each clause in that rule’s body is satisfied (these clauses then recursively become goals).
• If there are multiple possible unifications, then they must all be tried, backtracking on failure.
• To speed up search, and to inhibit multiple paths, Prolog introduces a cut operator. The trouble with “cut” is that it spoils the abstraction of Prolog: the user has to know the order in which goals are evaluated.
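The unification step can be sketched in a few lines of Java. This is a minimal illustration under stated assumptions, not how a real Prolog system is built: the Term class and the var/fn/walk/unify helpers are hypothetical names, there is no occurs check, and backtracking over alternative clauses is left out, so it only shows how one goal is matched against one fact or rule head.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class UnifyDemo {
    // A term is either a variable (args == null) or a functor applied to arguments.
    // An atom such as annette is a functor with zero arguments.
    static final class Term {
        final String name;
        final List<Term> args;   // null marks a variable
        Term(String name, List<Term> args) { this.name = name; this.args = args; }
        boolean isVar() { return args == null; }
    }
    static Term var(String n) { return new Term(n, null); }
    static Term fn(String n, Term... args) { return new Term(n, Arrays.asList(args)); }

    // Follow bindings until reaching an unbound variable or a non-variable term.
    static Term walk(Term t, Map<String, Term> subst) {
        while (t.isVar() && subst.containsKey(t.name)) t = subst.get(t.name);
        return t;
    }

    // Try to extend subst so that a and b become the same term.
    // On failure, subst may be left partially extended (a real engine would undo this).
    static boolean unify(Term a, Term b, Map<String, Term> subst) {
        a = walk(a, subst);
        b = walk(b, subst);
        if (a.isVar() && b.isVar() && a.name.equals(b.name)) return true;
        if (a.isVar()) { subst.put(a.name, b); return true; }
        if (b.isVar()) { subst.put(b.name, a); return true; }
        if (!a.name.equals(b.name) || a.args.size() != b.args.size()) return false;
        for (int i = 0; i < a.args.size(); i++) {
            if (!unify(a.args.get(i), b.args.get(i), subst)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // Unify the goal sister(X, audrey) with the term sister(annette, Y).
        Map<String, Term> subst = new HashMap<>();
        boolean ok = unify(fn("sister", var("X"), fn("audrey")),
                           fn("sister", fn("annette"), var("Y")), subst);
        System.out.println(ok + " X=" + walk(var("X"), subst).name
                + " Y=" + walk(var("Y"), subst).name);   // true X=annette Y=audrey
    }
}
```

A full interpreter would call unify once per candidate clause, on a fresh copy of the substitution, and move on to the next clause whenever it returns false. That is the backtracking described above.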

Page 18: Programming Languages: Design, Specification, and Implementation

Transactional Programming with SQL

Actually, transactions and SQL are logically independent:
• A transaction executes a set of operations (reads and/or updates) atomically.
• Remember that atomically means that if T1 includes [a, b, c] and T2 includes [d, e, f], the only possible results are [a,b,c,d,e,f] and [d,e,f,a,b,c], with no interleavings.
• SQL, by contrast, means using a relational model of data, whether transactional or not.
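To get a feel for how much atomicity rules out, the sketch below enumerates every interleaving of T1 = [a, b, c] and T2 = [d, e, f] that keeps each transaction's own order, then counts the serial ones. The class and method names are illustrative assumptions. Operations are modeled as single characters purely for brevity.

```java
import java.util.ArrayList;
import java.util.List;

class SerialSchedules {
    // All merges of t1 and t2 that preserve the order of operations within each.
    static List<String> interleavings(String t1, String t2) {
        List<String> out = new ArrayList<>();
        merge(t1, t2, "", out);
        return out;
    }

    private static void merge(String a, String b, String prefix, List<String> out) {
        if (a.isEmpty() && b.isEmpty()) { out.add(prefix); return; }
        if (!a.isEmpty()) merge(a.substring(1), b, prefix + a.charAt(0), out);
        if (!b.isEmpty()) merge(a, b.substring(1), prefix + b.charAt(0), out);
    }

    // A schedule is serial when one whole transaction runs before the other.
    static boolean isSerial(String schedule, String t1, String t2) {
        return schedule.equals(t1 + t2) || schedule.equals(t2 + t1);
    }

    public static void main(String[] args) {
        List<String> all = interleavings("abc", "def");
        long serial = all.stream().filter(s -> isSerial(s, "abc", "def")).count();
        System.out.println(all.size() + " interleavings, " + serial + " serial");
        // prints "20 interleavings, 2 serial"
    }
}
```

Of the 20 possible orderings, a transactional system must behave as if only abcdef or defabc occurred.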

Page 19: Programming Languages: Design, Specification, and Implementation

What’s a relational database?

A collection of tables, each representing a relation. Each table’s schema defines
• The names and types of each column, e.g. DeptName (String), Budget (Float), Manager (String)
• Integrity constraints, e.g. each department name has exactly one Budget and one Manager.

A table is viewed as a collection of rows, each row having one entry in each column. Each row stands for a fact, e.g. “The Marketing Department has a budget of $20M and is managed by Slick”.

Departments
DeptName       Budget  Manager
Marketing      $20M    Slick
Manufacturing  $12M    Grind
Research       $.1M    Geek

Employees
Employee  DeptName   Salary
Chen      Research   $100K
Jones     Marketing  $80K
Super     Marketing  $160K

Page 20: Programming Languages: Design, Specification, and Implementation

What is the SQL query language?

A declarative way of extracting derived facts, e.g.
• SELECT Employee, Salary FROM Departments JOIN Employees USING (DeptName) WHERE Manager = 'Slick'
  (Tell me the names and salaries of all employees who work for Slick.)

It hides:
• How data is organized (hashtables, indexes, trees, etc.)
• How the database is navigated to answer the query

SQL has a query subset which is purely declarative, plus imperative operations that can be used within transactions.

Operations:
• Select (also called “restrict”) – filter rows from a table
• Project – select certain columns from a table
• Join – combines facts from multiple tables based upon some common column(s)
• Others, e.g. top-K
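The three operations can be imitated in memory with Java streams. This is only an analogy under stated assumptions: the RelationalDemo, Dept, and Emp names are hypothetical, the data is the two small tables from the previous slide, and a real database would choose its own access paths rather than the nested scan used here.

```java
import java.util.List;
import java.util.stream.Collectors;

class RelationalDemo {
    static final class Dept {
        final String deptName; final double budgetMillions; final String manager;
        Dept(String deptName, double budgetMillions, String manager) {
            this.deptName = deptName; this.budgetMillions = budgetMillions; this.manager = manager;
        }
    }
    static final class Emp {
        final String employee; final String deptName; final int salaryK;
        Emp(String employee, String deptName, int salaryK) {
            this.employee = employee; this.deptName = deptName; this.salaryK = salaryK;
        }
    }

    static final List<Dept> DEPARTMENTS = List.of(
            new Dept("Marketing", 20, "Slick"),
            new Dept("Manufacturing", 12, "Grind"),
            new Dept("Research", 0.1, "Geek"));
    static final List<Emp> EMPLOYEES = List.of(
            new Emp("Chen", "Research", 100),
            new Emp("Jones", "Marketing", 80),
            new Emp("Super", "Marketing", 160));

    // Names and salaries of all employees whose department is managed by `manager`.
    static List<String> worksFor(String manager) {
        return EMPLOYEES.stream()
                .flatMap(e -> DEPARTMENTS.stream()
                        .filter(d -> d.deptName.equals(e.deptName))       // join over DeptName
                        .filter(d -> d.manager.equals(manager))           // select (restrict)
                        .map(d -> e.employee + " $" + e.salaryK + "K"))   // project Employee, Salary
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(worksFor("Slick"));   // [Jones $80K, Super $160K]
    }
}
```

Unlike the SQL query, this code fixes the navigation order by hand, which is exactly the detail the declarative form hides.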

Page 21: Programming Languages: Design, Specification, and Implementation

Distributed Languages

These are languages dealing with multiple relatively independent systems that interact
• Sometimes by message passing
• Sometimes by remote procedure calls or remote method invocations

What they have in common is local-remote transparency
• This means that the syntax and semantics for an interaction between two modules is the same regardless of whether the communicating modules are located on the same machine/process or on different machines/processes.

Page 22: Programming Languages: Design, Specification, and Implementation

Hermes

Modular unit is the process. A process behaves like an instance of an Ada task type, but
• Processes never leave the module
• They communicate by
  • First, establishing connections between their output ports and another process’s input ports
  • Then they either send messages on their output ports – they get queued up at the input port and eventually received
  • Or else, they send call messages on their output ports – these behave like Ada rendezvous calls or Java synchronized method calls; they queue up and get accepted one at a time, but the caller blocks until the call message is returned.

There are no references. Only processes, values, and ports. Everything is passed by value. Inout parameters of calls are passed by value/result.

There are no global variables. The only way a process can talk to something outside itself is via a port. It starts out getting any ports its parent passed it in the constructor, and it exports back to its parent a connection to itself.

Ports are first-class. For example, I can call a service (e.g. a file factory) and receive back a port that lets me send things to the file.

The implementation of Hermes hides from the user whether the process at the other end of an output port is actually running in the same machine or in a different machine. Only the performance is different.

As in all remote procedure call systems, remote ports are implemented via proxy objects. In Hermes these proxies are managed 100% by the runtime.
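The call-message behavior can be approximated in Java with blocking queues standing in for ports. This is an analogy, not Hermes: the PortDemo, Call, startDoubler, and call names are hypothetical, and a thread plays the role of a process. Each call message carries its own reply port, which also illustrates ports being passed around as first-class values.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class PortDemo {
    // A call message: an argument plus the port on which the reply is expected.
    static final class Call {
        final int arg;
        final BlockingQueue<Integer> replyPort = new ArrayBlockingQueue<>(1);
        Call(int arg) { this.arg = arg; }
    }

    // Starts a "process" that owns an input port and accepts calls one at a time.
    // Returns the port so that callers can connect to it.
    static BlockingQueue<Call> startDoubler() {
        BlockingQueue<Call> inputPort = new LinkedBlockingQueue<>();
        Thread process = new Thread(() -> {
            try {
                while (true) {
                    Call c = inputPort.take();      // calls queue up at the input port
                    c.replyPort.put(c.arg * 2);     // returning the call unblocks the caller
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        process.setDaemon(true);   // let the JVM exit while the process is still waiting
        process.start();
        return inputPort;
    }

    // Sends a call message and blocks until it is returned.
    static int call(BlockingQueue<Call> outputPort, int arg) {
        Call c = new Call(arg);
        try {
            outputPort.put(c);
            return c.replyPort.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        BlockingQueue<Call> port = startDoubler();
        System.out.println(call(port, 21));   // 42
    }
}
```

The caller never touches the doubler's state directly. Everything it knows about the other side is the port, so the same code shape would apply if the queue were replaced by a network proxy.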

Page 23: Programming Languages: Design, Specification, and Implementation

Java RMI

Almost a transparent system, but not quite! The most important differences:
• You have to declare interfaces remote, and the operations have to throw RemoteException
• Classes being passed must implement Serializable
• References not declared remote are passed by deep copying; those declared remote are passed by reference; fields declared static or transient are not passed at all!
• Parameters passed by value cannot be returned!
• You need to compile stubs
• Servers and clients are asymmetric, and servers need to run security managers.

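package compute;

import java.rmi.Remote;
import java.rmi.RemoteException;

// Remote interface: it extends Remote, and every method declares RemoteException.
public interface Compute extends Remote {
    // Runs the task on the server and returns its result to the caller.
    <T> T executeTask(Task<T> t) throws RemoteException;
}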

This is a problem with any language that has pointers. If an object refers to another, and I pass an object, do I mean the object and everything it points to (which might be the world) or just those parts of it that represent its value?
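RMI marshals non-remote arguments with Java object serialization, so the deep-copy rule can be observed without a network by writing an object to a byte array and reading it back. The sketch below does that. The DeepCopyDemo and Node names and the roundTrip helper are illustrative assumptions, and no actual remote call takes place.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class DeepCopyDemo {
    static class Node implements Serializable {
        private static final long serialVersionUID = 1L;
        String label;
        Node next;                // everything reachable through this reference is copied too
        transient String cache;   // transient fields are skipped during serialization
        Node(String label, Node next, String cache) {
            this.label = label; this.next = next; this.cache = cache;
        }
    }

    // Serializes n and reads it back, which is roughly what happens to a by-value argument.
    static Node roundTrip(Node n) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(n);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Node) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Node original = new Node("head", new Node("tail", null, "t"), "h");
        Node copy = roundTrip(original);
        System.out.println(copy != original);   // true: the receiver gets a different object
        System.out.println(copy.next.label);    // tail: the referenced node was copied as well
        System.out.println(copy.cache);         // null: the transient field was not passed
    }
}
```

Because the receiver works on a copy, any change it makes to copy.next is invisible to the sender, which is one way to read the slide's warning about parameters passed by value.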