1 Introduction
Why Study Programming Languages?
• Choosing the right language for the job
• Designing a better language
• Languages we know determine how we think about
programming
“A language that doesn’t affect the way you
think about programming is not worth
knowing.” — Alan Perlis
1
8 Languages in an hour
We look briefly at some of the variety we find in
programming languages
Just to show variety; don’t worry if you don’t understand
most of the programs
• Fortran
• Cobol
• Lisp
• APL
• Forth
• Eiffel
• Bison
• Mercury
2
1.1 Fortran
• (FORmula TRANslator)
• Designed in the mid 1950s
• Subroutines, but no recursion or nesting
• Control flow by goto, conditional, and bounded
iteration
• Commonly used in engineering and science applications
• Fortran 90 and 95 are much improved over the
original versions; work is under way on Fortran 2000
3
Fortran example
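      integer I, MX, MN, A(100)
      real RS
C     read the 100 array elements with an implied do loop
      read *, (A(I), I = 1, 100)
      MX = A(1)
      MN = A(1)
      do 10 I = 2, 100
         if (A(I).gt.MX) MX = A(I)
         if (A(I).lt.MN) MN = A(I)
10    continue
C     divide by 2.0 so the midrange is not truncated to an integer
      RS = (MN + MX)/2.0
      print *, RS
      end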
4
1.2 Cobol
• (COmmon Business Oriented Language)
• Designed around 1959
• For processing large amounts of data
• Verbose, for readability
• Powerful notion of file
• Supports goto, conditional goto, for loop
• Program consists of 4 divisions: Identification,
Environment, Data, and Procedure
• Still commonly used for business applications
5
Cobol example (simplified)
data division
file section
FD STFILE
01 STUDENT
   02 STUDENT-NAME picture A(15)
   02 COURSE occurs 30 times
      03 COURSE-NAME picture AAAA999
      03 SCORE picture 99
   02 STUDENT-ID picture 99999
working-storage section
01 TOTAL picture 999999
6
Cobol example (simplified) (2)
procedure division
init.   open input STFILE. move zero to TOTAL.
sum.    read STFILE; at end go to fin.
        perform adding
            varying J from 1 by 1 until J > 30.
        go to sum.
adding. read COURSE(J). add SCORE to TOTAL.
fin.    display TOTAL. close STFILE.
7
1.3 Lisp (LISt Processing)
• Designed in the late 1950s, inspired by the lambda
calculus.
• Still in active use, with several dialects surviving
• Functional, that is, mainly based on function
application
• Functions are first-class objects, even if higher-order
and/or recursive
• Typeless: S-expression is the only type
8
Lisp (LISt Processing) (2)
• S-expression is number, atom (symbol), or list
[Figure: binary tree with leaves a, b, c, and d]
This binary tree represents the S-expression
(((a.nil).(b.c)).(d.nil))
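The tree structure of an S-expression can be sketched outside Lisp as well. The following is a minimal Python model, assuming a dotted pair is represented as a 2-tuple and nil as None; the helper names cons, car, cdr, and show are chosen for illustration and are not part of the slides.

```python
# Assumption: a dotted pair (a . b) is modelled as a 2-tuple, nil as None.
NIL = None

def cons(head, tail):
    """Build one pair, i.e. one internal node of the binary tree."""
    return (head, tail)

def car(pair):
    """Left branch of a pair."""
    return pair[0]

def cdr(pair):
    """Right branch of a pair."""
    return pair[1]

def show(sexp):
    """Render a structure in fully dotted notation."""
    if sexp is NIL:
        return "nil"
    if isinstance(sexp, tuple):
        return f"({show(car(sexp))}.{show(cdr(sexp))})"
    return str(sexp)  # an atom

tree = cons(cons(cons("a", NIL), cons("b", "c")), cons("d", NIL))
print(show(tree))  # (((a.nil).(b.c)).(d.nil))
```

Every internal node of the tree is one cons call, and the leaves are atoms or nil.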
9
Lisp example
(defun intersect (m n)
  (cond
    ((null m) nil)
    ((member (car m) n)
     (cons (car m)
           (intersect (cdr m) n)))
    (t (intersect (cdr m) n))))
This function returns the intersection of two lists
10
1.4 APL (A Programming Language)
• Designed in the early 1960s
• Extremely compact programs
• Based on multidimensional arrays
• Powerful array operations, elementwise, cumulative,
. . .
• Many special symbols, uses a special character set
• Used in scientific and engineering applications
11
APL example
A program that generates the first N Fibonacci numbers:
∇ FIB N
[1] A←1 1
[2] →2×N>⍴A←A,+/¯2↑A
∇
12
APL example (2)
A program that generates prime numbers up to N :
(2 = +/[1] 0 = S ∘.| S) / S ← ⍳N
• With experience this becomes readable (?)
• Some call APL a “write-only” language
• Often easier to rewrite than modify a function
• “concise” ≠ “readable”!
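For readers who do not know APL, the prime-number expression can be unpacked step by step. The sketch below is a rough Python transliteration under the usual reading of the expression (build a divisibility table, count divisors per number, keep those with exactly two); the function name primes_up_to is an assumption made for this example.

```python
def primes_up_to(n):
    """Spell out the array steps of the APL one-liner with explicit loops."""
    s = range(1, n + 1)  # S <- iota N
    # Outer product with residue: divides[i][j] is True when s[i] divides s[j].
    divides = [[b % a == 0 for b in s] for a in s]
    # Sum along the first axis: the number of divisors of each candidate.
    counts = [sum(row[j] for row in divides) for j in range(n)]
    # Compress: keep the numbers with exactly two divisors.
    return [x for x, c in zip(s, counts) if c == 2]

print(primes_up_to(20))  # [2, 3, 5, 7, 11, 13, 17, 19]
```

The Python version needs several lines and named intermediates for what APL states in one expression, which is the trade-off the bullets above describe.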
13
1.5 Forth
• Designed in the early 1970s for use on small computers
• Stack based: operations take their operands from the
(single) stack and place their result(s) on the stack
• No named parameters or local variables; data is
handled by stack manipulation
• Forth is the basis for the Postscript page description
language
14
Forth example
: sqr dup * ;           \ n => n*n
: dosum swap 1 +        \ n, s => s, (n+1)
        swap over       \ => (n+1), s, (n+1)
        sqr + ;         \ => (n+1), (s+(n+1)^2)
: sumsqr 0 swap         \ n => 0, n
         0 swap 0       \ => 0, 0, n, 0
         do dosum loop  \ => n, (sum i=0 to n of i^2)
         ;
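The stack discipline can be traced with a small simulation. This is a minimal Python sketch, not a Forth implementation: it assumes a program is a list of words, models only the five primitives the example uses, and replaces the do ... loop construct with an ordinary Python loop. The names PRIMITIVES, WORDS, run_words, and sum_squares are hypothetical.

```python
# Each primitive takes its operands from the single data stack and
# leaves its result there.
PRIMITIVES = {
    "dup":  lambda s: s.append(s[-1]),
    "swap": lambda s: s.extend([s.pop(), s.pop()]),
    "over": lambda s: s.append(s[-2]),
    "+":    lambda s: s.append(s.pop() + s.pop()),
    "*":    lambda s: s.append(s.pop() * s.pop()),
}

# User-defined words, mirroring sqr and dosum from the example.
WORDS = {
    "sqr":   ["dup", "*"],
    "dosum": ["swap", 1, "+", "swap", "over", "sqr", "+"],
}

def run_words(words, stack):
    """Execute a list of words against the shared data stack."""
    for w in words:
        if isinstance(w, int):
            stack.append(w)                # a literal pushes itself
        elif w in WORDS:
            run_words(WORDS[w], stack)     # expand a user-defined word
        else:
            PRIMITIVES[w](stack)
    return stack

def sum_squares(n):
    """Stand-in for sumsqr: start with counter 0 and sum 0, apply dosum n times."""
    stack = [0, 0]
    for _ in range(n):
        run_words(["dosum"], stack)
    return stack

print(run_words([5, "sqr"], []))  # [25]
print(sum_squares(3))             # [3, 14]
```

No word names its arguments; all data flow is expressed by the order of items on the stack.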
15
1.6 Eiffel
• Designed in the 1980s
• Object oriented: definitions of operations and data are
encapsulated together.
• Uses inheritance to define new data structures and
operations in terms of others
• Design by contract: operations specify the initial
conditions they require and the final conditions they
will ensure
• Polymorphic: can define operations and types that can
work on objects of any type
16
Eiffel example
class STACK[T] export
    push, pop, empty, full
feature
    implementation: ARRAY[T]
    max_size: INTEGER
    nb_elements: INTEGER
    Create(n: INTEGER) is
        do if n>0 then max_size := n end;
            implementation.Create(1, max_size)
        end;
17
Eiffel example (2)
    empty: BOOLEAN is
        do Result := (nb_elements = 0)
        end;
    pop: T is
        require not empty
        do Result :=
                implementation.entry(nb_elements);
            nb_elements := nb_elements - 1;
        ensure not full;
            nb_elements = old nb_elements - 1
        end;
    ...
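Design by contract can be approximated in languages that lack require and ensure clauses. The following Python sketch, assuming plain assert statements stand in for the contract clauses, mirrors the pop operation above; the class layout and the push contract are illustrative guesses, since the slides elide the rest of the Eiffel class.

```python
class Stack:
    """Bounded stack whose operations check their contracts with assert."""

    def __init__(self, max_size):
        self.max_size = max_size
        self.items = []

    def empty(self):
        return len(self.items) == 0

    def full(self):
        return len(self.items) == self.max_size

    def push(self, x):
        assert not self.full(), "require: not full"      # assumed contract
        self.items.append(x)
        assert not self.empty(), "ensure: not empty"

    def pop(self):
        assert not self.empty(), "require: not empty"    # precondition
        old_count = len(self.items)                      # plays the role of 'old'
        result = self.items.pop()
        assert not self.full(), "ensure: not full"       # postconditions
        assert len(self.items) == old_count - 1
        return result

s = Stack(2)
s.push(1)
s.push(2)
print(s.pop())  # 2
```

Unlike Eiffel, Python does not treat these checks as part of the interface, and they are skipped entirely when the interpreter runs with -O.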
18
1.7 Bison
• Originally developed in the 1980s as a free
replacement for the older YACC language
• Special-purpose language, designed for writing parsers
• Translates input into C source code
• Usually used with a scanner generator such as FLEX
• Usually used for only a small part of an application
19
Bison example
%{
#define YYSTYPE double
#include <math.h>
#include <stdio.h>
%}
%token NUM
%left '-' '+'
%left '*' '/'
%left NEG /* negation--unary minus */
%right '^' /* exponentiation */
%%
20
Bison example (2)
input: /* empty string */ | input line ;
line: '\n' | exp '\n' {
        printf ("\t%.10g\n", $1); };
exp: NUM { $$ = $1; }
   | exp '+' exp { $$ = $1 + $3; }
   | exp '-' exp { $$ = $1 - $3; }
   | exp '*' exp { $$ = $1 * $3; }
   | exp '/' exp { $$ = $1 / $3; }
   | '-' exp %prec NEG { $$ = -$2; }
   | exp '^' exp { $$ = pow ($1, $3); }
   | '(' exp ')' { $$ = $2; }
   ;
%%
21
1.8 Mercury
• Originated in the early 1990s at the University of Melbourne
• Logic/functional programming language
• A program consists of clauses, that is, facts and rules
that allow new facts to be deduced from old
• Purely declarative
• A program can be regarded as a knowledge-base (what
to compute, rather than how)
• Strong types and modes
• Nondeterministic: some queries have multiple solutions
• Control flow by backtracking as well as invocation
• Parameter passing by unification (bidirectional)
22
Mercury example
:- type list(T) ---> [] ; [T | list(T)].
:- pred append(list(T), list(T), list(T)).
:- mode append(in, in, out) is det.
:- mode append(in, out, in) is semidet.
:- mode append(out, out, in) is multi.
append([], C, C).
append([A|B], C, [A|BC]) :- append(B, C, BC).
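The three modes of append correspond to three different computations that most languages would need separate functions for. The sketch below is a rough Python analogue, assuming Python lists stand in for Mercury lists; the function names are invented for illustration and say nothing about how Mercury compiles each mode.

```python
def append_in_in_out(front, back):
    """det: exactly one answer, the concatenation."""
    return front + back

def append_in_out_in(front, whole):
    """semidet: one answer if front is a prefix of whole, otherwise failure (None)."""
    n = len(front)
    return whole[n:] if whole[:n] == front else None

def append_out_out_in(whole):
    """multi: one or more answers, every way of splitting whole in two."""
    for i in range(len(whole) + 1):
        yield whole[:i], whole[i:]

print(append_in_in_out([1], [2, 3]))      # [1, 2, 3]
print(append_in_out_in([1], [1, 2, 3]))   # [2, 3]
print(list(append_out_out_in([1, 2])))    # [([], [1, 2]), ([1], [2]), ([1, 2], [])]
```

In Mercury the single two-clause definition covers all three cases; the mode declarations tell the compiler which arguments are inputs and how many solutions to expect.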
23
24
2 Abstraction
We all know that the only mental tool by means
of which a very finite piece of reasoning can cover
a myriad cases is called “abstraction”; as a result
the effective exploitation of his powers of
abstraction must be regarded as one of the most
vital activities of a competent programmer.
. . . The purpose of abstracting is not to be vague,
but to create a new semantic level in which one
can be absolutely precise.
— Edsger Dijkstra
25
Abstraction (2)
From the ultralingua.net dictionary:
1. The process of formulating general concepts by
abstracting common properties of instances;
generalization.
2. A general concept formed by extracting common
features from specific examples.
26
Abstraction (3)
• Good programmers continually look for better
abstractions for what they are doing
• Often find them in new functions or datatypes
• Occasionally they can only be found in a different
programming language or paradigm, or by using a
language preprocessor
• Once in a while, one must invent a new preprocessor
or language or even paradigm
27
2.1 Abstraction in programming
Some abstractions developed in computer science:
• assembly language abstracted instruction numbers and
formats
• FORTRAN and other high-level languages abstracted
the details of the actual machine
• Operating Systems abstracted interaction with
external entities
• functions and procedures abstract a sequence of
operations
28
2.1.1 Machine Language
• Why do we have programming languages?
• A (fragment of a) stored executable program
ultimately looks like this:
00000010101111001010
00000010111111001000
00000011001110101000
and initially, this was what programmers wrote (or
toggled into a computer’s front panel)
• Each line is a command; command names, register
numbers, etc, are encoded as numbers
29
2.1.2 Assembly Language
• In assembly language the above might be written
LOAD I
ADD J
STORE K
“Take the value stored in I’s cell, add the value stored
in J’s cell, and put the sum in K’s cell”
• I.e., calculate K = I + J
• Abstracts numeric opcodes to symbols, addresses to
names
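What the three instructions do can be made concrete with a toy interpreter. This is a minimal Python sketch, assuming a machine with a single accumulator register and memory cells addressed by name; the machine model and the function name execute are assumptions, since the slides do not specify the architecture.

```python
def execute(program, memory):
    """Run (opcode, cell) pairs on a one-accumulator machine."""
    acc = 0
    for op, cell in program:
        if op == "LOAD":
            acc = memory[cell]        # copy a cell into the accumulator
        elif op == "ADD":
            acc += memory[cell]       # add a cell to the accumulator
        elif op == "STORE":
            memory[cell] = acc        # copy the accumulator into a cell
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory

program = [("LOAD", "I"), ("ADD", "J"), ("STORE", "K")]
memory = execute(program, {"I": 2, "J": 3, "K": 0})
print(memory["K"])  # 5
```

An assembler performs the reverse mapping: it replaces the symbolic opcodes and cell names with the numbers the hardware expects.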
30
2.1.3 More intelligible languages
• Historic trend towards more abstract, understandable
programming notation
• The language should help us write programs that are
easy to read, easy to understand, easy to modify
• This generally means higher levels of abstraction, and
sometimes specialization for certain programming tasks
• Different languages provide different abstractions; no
one-size-fits-all language
31
2.1.4 Data and Control Abstraction
• The two most important kinds of abstractions made in
programming languages are data and control
abstraction
• Discussed in detail later
• Data abstractions abstract the things the program
manipulates
• Control abstractions abstract the operations performed
32
2.2 Binding Time
• Binding time: when a particular decision is made
• Many possible binding times, but the most interesting
are:
1. Run time
2. Compile time (or link time)
3. Coding time (programmer decides)
4. Language implementation time
5. Language definition time
33
Binding Time (2)
Some decisions whose binding time is worth considering:
• variable type
• possible variable values
• variable value
• data structure deallocation
• procedure invoked
• conditional branch taken
34
Binding Time (3)
• Trade-off between flexibility and efficiency
• Later binding times often mean simpler and more
powerful facilities, earlier often mean better
performance
• For example, deciding about storage reclamation at
runtime (GC) makes programming easier and more
robust; deciding at coding time is more efficient
• Program analysis may allow some reclamation at
compile-time, with the rest done at runtime.
• This is typical: often some runtime decisions can be
made at compile-time by program analysis
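One of the decisions listed earlier, which procedure is invoked, shows the difference between binding times in a few lines. The Python sketch below contrasts a call fixed when the code is written with one chosen from data at run time; the shape functions and the HANDLERS table are hypothetical names for this illustration.

```python
import math

def area_circle(radius):
    return math.pi * radius * radius

def area_square(side):
    return side * side

# Bound at coding time: the programmer has fixed which procedure is called.
fixed = area_square(3)

# Bound at run time: the procedure is looked up from a value that is only
# known while the program executes.
HANDLERS = {"circle": area_circle, "square": area_square}

def area(shape, size):
    return HANDLERS[shape](size)

print(fixed)              # 9
print(area("square", 3))  # 9
```

The late-bound version is more flexible, since new shapes can be added to the table without touching the caller, but every call pays for a lookup that the early-bound version avoids.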
35
2.3 Issues
• What are your favourite programming languages?
• Why? What do you like about them?
• What language misfeatures do you dislike?
• What makes a language powerful? Safe? Easy to
code? Easy to debug?
36
2.4 Syntax
Some syntax issues in language design:
• readability
• consistency
• orthogonality
• simplicity
• substitutivity
• familiarity
37
2.4.1 Readability
The C syntax for declarations aims to be consistent with
the syntax for object use. For example,
int *n;
makes it clear that *n denotes an integer. However,
char (*(*x())[])();
does not exactly make it clear that x is a function, and its
type (a function returning a pointer to an array of pointers
to functions returning char) would be a mystery even to
seasoned programmers.
38
2.4.2 Consistency
• C syntax deliberately blurs the distinction between
pointers and arrays (and strings).
• Often misleading: pointers and arrays are semantically
very different
• Many (but not all!) operations can be applied equally
well to arrays and pointers, leading to errors
• Can refer to the same object as f[3], *(f+3), or even
3[f].
• Is that really what we want?
39
2.4.3 Orthogonality
• Pre-1977 Fortran had a number of strange restrictions
on syntax.
• E.g., 5 * X was allowed, but not X * 5.
• Also, something like F(100 - X) was illegal. For a
subroutine call like that, one would have to do
Y = 100 - X
F(Y)
40
2.4.4 Simplicity
• The syntax of languages like PL/I and Ada is
criticized as being too complex. There is much to
remember, making it difficult to program without a
reference manual at hand
• Lisp has a very simple syntax: everything is a list,
written surrounded with parentheses
• Many Lisp hackers appreciate the simple syntax,
finding that it makes code layout simple
• Lisp detractors find Lisp programs unreadable, however,
insisting that LISP stands for “Lots of
Incomprehensible Silly Parentheses.”
• Familiarity issue? Visual complexity?
41
2.4.5 Substitutivity
The old debate about the semicolon:
• In PL/I: a statement terminator
• In Pascal: a statement separator
• In C: a terminator for some statements, not blocks
• Pascal programmers have a problem editing: to add a
statement after y := 7, one must first add a semicolon
to that line
begin
x := 5;
y := 7
end
42
Substitutivity (2)
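• In C, macros cause trouble with the semicolon.
#define swap(p,q) \
{t=p; p=q; q=t;}
• Now
if (x>3) swap(x,y);
is fine, but we have a problem with
if (x>3) swap(x,y);
else x=y;
because the expansion is {...}; and the extra semicolon
ends the if statement, leaving the else unattached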
43
Substitutivity (3)
• It is not easy to see how to avoid this.
• The expert C hacker will write:
#define swap(p,q) \
do {t=p; p=q; q=t;} while (0)
• This doesn’t even declare t — a bigger problem
• Even a simple definition like:
#define square(x) x*x
goes badly wrong
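Why the square macro fails can be seen by imitating the preprocessor's textual substitution. The sketch below does this in Python rather than C, assuming a macro body is expanded by replacing the parameter name with the argument text; the helper expand is hypothetical and far simpler than a real C preprocessor.

```python
import re

def expand(body, param, arg):
    """Replace each whole-word occurrence of param in body with the text arg."""
    return re.sub(rf"\b{re.escape(param)}\b", arg, body)

naive = expand("x*x", "x", "a+1")         # body of: #define square(x) x*x
safe = expand("((x)*(x))", "x", "a+1")    # body with defensive parentheses

a = 3
print(naive, "=", eval(naive))  # a+1*a+1 = 7
print(safe, "=", eval(safe))    # ((a+1)*(a+1)) = 16
```

Because the substitution works on text, not on expressions, operator precedence regroups the naive expansion. The parenthesized body fixes the grouping, although an argument with side effects would still be evaluated twice.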
44
2.4.6 Familiarity
• Sometimes language designers break conventions from
mathematical notation or other programming
languages, making life harder for programmers
• In C, the assignment operator is =, the usual notation
for identity, a symmetric relation
• Made worse by the fact that an assignment has a
value, so that if (x = 3) ... is syntactically correct
• Other operators (e.g., <- or :=) preferable? But now =
is familiar from FORTRAN, C, etc!
45
Familiarity (2)
• The Z specification language uses LaTeX to write
programs, so Z uses rich notation including symbols
such as ∪, ∩, ⊆, etc.
• Prolog has an “if then” construct, p -> q, which looks
like classical implication, but false -> false returns
false!
• A different syntax would be less misleading.
(NU-Prolog differs: the query would yield true.)
46
2.4.7 Conclusion
These are some broad principles one can apply to
programming language syntax. Unfortunately, how to
weight these aspects is unclear, and even how to apply
these principles is not always clear.
Syntax decisions are influenced by:
• what kinds of problems are targeted
• background of likely practitioners
• programming environments available
• taste
47
48