
CSCI 270: Introduction to Algorithms and Theory of Computing
Fall 2017
Prof: Leonard Adleman
Scribe: Joseph Bebel

We will now discuss computer programs, a concrete manifestation of what we’ve been calling algorithms. Computer programs are written in specific programming languages. Here, for simplicity, we will consider a simplified version of BASIC that we call SBASIC. It will have the usual arithmetic operations and usual decision logic that most programming languages have (if/then/else, for loops, goto, etc). One major simplification we will make is to assume that the input and output of an SBASIC program are natural numbers, and that input can only occur in the first line of a program and output can only occur on the last line.

We say that an SBASIC program halts on a given input if, when executed on that input, the program reaches this last line and produces some output. Note that some SBASIC programs may halt on some inputs and not halt on other inputs.

One important concept is that each SBASIC program has an index. Let us put the SBASIC programs in lexicographical order based on their source code (shorter programs first, with ties broken alphabetically, so that every program appears at a finite position). Then there is a 1st SBASIC program, a 2nd SBASIC program, and so on (ad infinitum). Since there are an infinite number of SBASIC programs, each SBASIC program has a unique natural number as its index, and each natural number corresponds to a unique SBASIC program. You can also think of the index as the natural number (binary string) corresponding to that SBASIC program’s source code.
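The indexing idea can be sketched in Python. This is only an illustration under stated assumptions: it enumerates all strings over a two-character alphabet (`ALPHABET`, a stand-in for ASCII) rather than only valid SBASIC programs, and it orders them by length first and alphabetically within each length. The names `strings_in_order` and `index_of` are made up for this sketch.

```python
from itertools import count, islice, product

ALPHABET = "01"  # assumption: a tiny alphabet standing in for ASCII


def strings_in_order(alphabet=ALPHABET):
    """Yield every nonempty string: shorter first, ties broken alphabetically."""
    for length in count(1):
        for chars in product(sorted(alphabet), repeat=length):
            yield "".join(chars)


def index_of(target, alphabet=ALPHABET):
    """Return the 1-based position of target in the enumeration above."""
    if not target or any(ch not in alphabet for ch in target):
        raise ValueError("target must be a nonempty string over the alphabet")
    for i, s in enumerate(strings_in_order(alphabet), start=1):
        if s == target:
            return i


if __name__ == "__main__":
    print(list(islice(strings_in_order(), 6)))
    print(index_of("10"))
```

A real indexing of SBASIC programs would additionally skip every string that is not a syntactically valid program, which only changes which strings get counted, not the idea.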

We can ask, for a set of natural numbers, if there exists an SBASIC program that decides if a number is in that set or not:

The Decision Problem for S ⊆ N
INPUT: n, a natural number
OUTPUT: 1 if n ∈ S
        0 if n ∉ S

For example, consider the set E = {0, 2, 4, 6, 8, 10, . . .} of even numbers. The decision problem for E is then: on input an even number, output 1, otherwise output 0.

DEF: 1. A subset S ⊆ N is decidable iff there exists an SBASIC program that solves the decision problem for S.

DEF: 2. A subset S ⊆ N is undecidable iff there does not exist an SBASIC program that solves the decision problem for S.

Note that the set E is decidable; the SBASIC program just needs to look at one bit of the input to decide if the input is even or odd.
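As a small sketch, here is what a decider for E might look like in Python rather than SBASIC. The function name `decide_even` is an assumption of this example.

```python
def decide_even(n: int) -> int:
    """Solve the decision problem for E: return 1 if n is even, else 0."""
    # n & 1 extracts the lowest bit of n; it is 0 exactly when n is even.
    return 1 if n & 1 == 0 else 0


if __name__ == "__main__":
    print([decide_even(n) for n in range(5)])
```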


We can now pose the following question: given an SBASIC program and an input, does it halt on that input? We can more precisely ask this question about SBASIC programs that are given their own index as input. It turns out that there is no SBASIC program that can solve that decision problem:

Theorem 1 (Undecidability of the Halting Problem). K = {i | Bi on input i halts} is undecidable.

Proof. Assume there exists an SBASIC program P that solves the decision problem for K. Let Bc be the following program:

    INPUT G
    execute P on input G
    if P outputs 1:
        A: GOTO A
    else if P outputs 0:
        PRINT 0

Let c be the index of Bc. Is c ∈ K?

• if c ∈ K, then since P solves the decision problem for K, P outputs 1 on input c. Therefore, Bc on input c goes into an infinite loop. Therefore, c ∉ K, which is a contradiction

• if c ∉ K, then since P solves the decision problem for K, P outputs 0 on input c. Therefore, Bc on input c prints 0. Therefore, c ∈ K, which is a contradiction

Since we reach a contradiction in both cases, our original assumption must be false, and there does not exist an SBASIC program that solves the decision problem for K, and K is undecidable.
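The diagonal construction in this proof can be imitated in Python. The sketch below is a toy under loud assumptions: programs are modeled as Python generator functions (each `yield` counts as one step), a program refers to itself by its function object instead of a numeric index, and "does not halt" is only observed up to a step budget. The names `make_b_c` and `halts_within` are hypothetical. The point is that whatever a claimed decider `p` answers about the program built from it, that program does the opposite.

```python
def make_b_c(p):
    """Build the program B_c from the proof, given a claimed decider p for K."""
    def b_c():
        if p(b_c) == 1:
            while True:   # corresponds to "A: GOTO A"
                yield     # each yield marks one step of the loop
        else:
            return 0      # corresponds to "PRINT 0", after which B_c halts
    return b_c


def halts_within(program, max_steps):
    """Run a generator-style program; report whether it halted within max_steps."""
    gen = program()
    for _ in range(max_steps):
        try:
            next(gen)
        except StopIteration:
            return True
    return False


if __name__ == "__main__":
    # Two (obviously wrong) candidate deciders, used only for illustration.
    for name, p in [("always 1", lambda prog: 1), ("always 0", lambda prog: 0)]:
        b_c = make_b_c(p)
        print(name, "claims", p(b_c), "observed halting:", halts_within(b_c, 1000))
```

The same reversal happens for any candidate `p`, which is exactly the contradiction used in the proof; the two candidates here are merely the simplest ones to run.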

The Decision Problem for K
INPUT: n, a natural number
OUTPUT: 1 if n ∈ K
        0 if n ∉ K


So have we proven that the decision problem for K is not solvable? Not quite; we’ve shown that there does not exist an SBASIC program that solves the decision problem for K, but maybe you can solve it in some other programming language like C++.

We make the claim that, actually, every program in another programming language can be “translated” or “compiled” into an SBASIC program with equivalent behavior. This concept has been explored over the years and has resulted in something we might call “Church’s Thesis”:

Church’s Thesis: If an algorithm has an input/output behavior then there is an SBASIC program with the same input/output behavior.

Note that Church’s Thesis is philosophy, not science or mathematics. It says that every machine that you can conceivably build in the real world, whatever its input/output is, can be simulated by some equivalent SBASIC program.

Church’s Thesis is more frequently stated with Turing machines rather than SBASIC programs. But Turing machines are horrible to program, and in principle it doesn’t matter; SBASIC programs (or C++ programs, or Java programs, etc) are all equivalent in this regard. But we cannot prove philosophy, so we have to accept that whatever we prove about SBASIC programs will apply to all computers and computer programs which might exist in the future.

So the tool (the computer) which (presumably) we are dedicating our lives to studying and using is fundamentally flawed; in fact, most problems are undecidable, and most problems which are decidable are not solvable in polynomial time. These are the fundamental limits of the tool.

Another point to make about the Undecidability of the Halting Problem: it depends greatly on “self-reference”, or what is sometimes called diagonalization. It is like the sentence “This sentence is false”; it cannot be true, because then it would imply its own falsehood, and it cannot be false, because then it would be true. While such logical contradictions are more mathematical toys than anything else, the basic concept of self-reference is critical to the Undecidability of the Halting Problem.

Let’s consider another thought experiment. Can we build a robot that can assemble exact copies of itself? Let’s say that we build such a robot and leave it in a big warehouse full of its parts. Assuming you build a sufficiently sophisticated robot, maybe it can assemble the parts into a copy of itself. But then there remains the question of how the new robot gets its software. Somehow, it has to be copied from the old robot. Maybe this is possible to do by ensuring that the software is in some readable form on the disk of the robot. But actually, that is not necessary.

Kleene Recursion Theorem: put simply, every program can compute its own index. That is, every program can have a line of code which performs the operation “j = MYINDEX”.
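A familiar small-scale illustration of this kind of self-reference is a quine, a program that prints its own source code without reading it from disk. The Python sketch below is not a proof of the recursion theorem, only an example in its spirit; the helper `run_and_capture` and the names `TEMPLATE` and `QUINE_SOURCE` are assumptions of this example.

```python
import contextlib
import io


def run_and_capture(source: str) -> str:
    """Execute Python source text and return everything it prints."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})
    return buffer.getvalue()


# A two-line program: it stores a template of itself in s, then prints the
# template filled in with its own quoted form (%r), which is its full source.
TEMPLATE = 's = %r\nprint(s %% s)'
QUINE_SOURCE = TEMPLATE % TEMPLATE

if __name__ == "__main__":
    # The program's output equals its own source (plus print's final newline).
    print(run_and_capture(QUINE_SOURCE) == QUINE_SOURCE + "\n")
```

Once a program can reproduce its own source text, it can in principle compute the corresponding index, which is the role MYINDEX plays in the proofs.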

So a program, mid-way through its execution, can somehow (we don’t say how) obtain a reference or a copy of itself. Why is this important? Consider the following set and the decision problem for that set:

S = {i | Bi on input 0 outputs 1}

At first glance, it may not be clear whether this problem is decidable or undecidable. On one hand, it seems like just as difficult a problem as the decision problem for K. On the other hand, it is not immediately clear how to obtain self-reference; for the decision problem for K, we only talk about the behavior of each SBASIC program on input its own index, so in the counterexample we just assume the input is the index. However, using Kleene’s recursion theorem, we can show this set is also undecidable:

Theorem 2. S is undecidable.

Proof. Assume there exists an SBASIC program P that solves the decision problem for S. Let Bc be the following program:

    INPUT G
    N ← MYINDEX
    execute P on input N
    if P outputs 1:
        A: GOTO A
    else:
        PRINT 1

Let c be the index of Bc. Is c ∈ S?


• if c ∈ S, then since P solves the decision problem for S, P outputs 1 on input c. Therefore, Bc on input 0 goes into an infinite loop. Therefore, c ∉ S, which is a contradiction

• if c ∉ S, then since P solves the decision problem for S, P outputs 0 on input c. Therefore, Bc on input 0 prints 1. Therefore, c ∈ S, which is a contradiction

Since we reach a contradiction in both cases, our original assumption must be false, and there does not exist an SBASIC program that solves the decision problem for S, and S is undecidable.

T = {i | (∀n)[Bi on input n halts]}

Theorem 3. T is undecidable.

Proof. Assume there exists an SBASIC program P that solves the decision problem for T. Let Bc be the following program:

    INPUT G
    N ← MYINDEX
    execute P on input N
    if P outputs 1:
        A: GOTO A
    else:
        PRINT 0

Let c be the index of Bc. Is c ∈ T?

• if c ∈ T, then since P solves the decision problem for T, P outputs 1 on input c. Therefore, Bc on all inputs goes into an infinite loop. Therefore, c ∉ T, which is a contradiction

• if c ∉ T, then since P solves the decision problem for T, P outputs 0 on input c. Therefore, Bc on all inputs prints 0. Therefore, c ∈ T, which is a contradiction

Since we reach a contradiction in both cases, our original assumption must be false, and there does not exist an SBASIC program that solves the decision problem for T, and T is undecidable.


So there are decision problems that cannot be solved by SBASIC programs. Can we still say something (computationally) about these undecidable sets, or are computers totally useless here? We will introduce the concept of recursively enumerable/listable; to do so, we relax the definition of SBASIC program to allow output at arbitrary lines (and to continue executing after producing output).

DEF: 3. A set S ⊆ N is recursively enumerable, or listable, iff there exists an SBASIC program P that (given no input) prints a sequence of numbers such that:

1. P never prints a number not in S

2. for all elements n ∈ S, P will eventually output n (if given enough time)

Note that if the set S has infinitely many elements, then a program that lists S must run forever.

Every decidable set S is also listable: just loop over all numbers n, run the program that solves the decision problem for S to decide if n is in the set, and if so output n; otherwise go to the next number. But is every listable set also decidable?
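The construction just described, turning a decider into a lister, can be sketched as a Python generator. The names `list_from_decider` and `decide_even` are assumptions of this sketch, with the even numbers used as the example set.

```python
from itertools import count, islice


def list_from_decider(decide):
    """List a decidable set: try n = 0, 1, 2, ... and output n when decide(n) is 1."""
    for n in count(0):
        if decide(n) == 1:
            yield n


def decide_even(n):
    """Example decider: 1 if n is even, 0 otherwise."""
    return 1 if n % 2 == 0 else 0


if __name__ == "__main__":
    # The lister never stops on its own, so only a finite prefix is taken here.
    print(list(islice(list_from_decider(decide_even), 5)))
```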

Theorem 4. K is listable.

Proof. The following program lists K:

    INPUT G
    for i ← 1 to infinity:
        for j ← 1 to i:
            execute Bj on input j for i steps
            if Bj halts within i steps, output j.
            otherwise, continue to next j

We need to show that the output follows the definition of listable. Assume a number n is output by this program: then that must have meant that Bn was run on input n for some number of steps, and Bn halted. Therefore n ∈ K and the program correctly output n.

Does it output every element of K? Assume n ∈ K. Then it must be that Bn halts on input n after t steps for some number t. Since i increases without bound and j ranges over every number from 1 to i, there must be an iteration of the loop where j = n and i ≥ max(n, t), in which case Bn will be run on input n for at least t steps, halt, and n is output.

Therefore, K is listable.
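The interleaving trick in this proof (often called dovetailing) can be sketched in Python on a toy family of programs. Everything here is an assumption made for illustration: `toy_program(j)` stands in for Bj and is defined to halt after j steps when j is even and to loop forever when j is odd, programs are generators whose `yield` marks one step, and the lister skips numbers it has already output, which the original program does not bother to do.

```python
from itertools import islice


def toy_program(j):
    """Hypothetical stand-in for B_j: halts after j steps if j is even, else loops."""
    def run(x):
        if j % 2 == 0:
            for _ in range(j):
                yield
            return
        while True:
            yield
    return run


def halts_within(run, x, steps):
    """Run the program on input x and report whether it halts within the step budget."""
    gen = run(x)
    for _ in range(steps + 1):
        try:
            next(gen)
        except StopIteration:
            return True
    return False


def list_k(program_for):
    """For i = 1, 2, ...: run B_j on input j for i steps, for every j from 1 to i."""
    seen = set()
    i = 1
    while True:
        for j in range(1, i + 1):
            if j not in seen and halts_within(program_for(j), j, i):
                seen.add(j)
                yield j
        i += 1


if __name__ == "__main__":
    print(list(islice(list_k(toy_program), 4)))
```

No single simulation is ever allowed to run unboundedly, so a program that loops forever cannot block the lister from reaching the programs after it.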

Are there any unlistable sets?

Theorem 5. For all X ⊆ N, if X is undecidable, then either X is not listable or its complement X̄ = N \ X is not listable.

Proof. Assume P lists X and P′ lists the complement X̄. Consider the following program:


    INPUT G
    for i ← 1 to infinity:
        run P until it outputs i numbers. If the i’th number is G, then output 1.
        run P′ until it outputs i numbers. If the i’th number is G, then output 0.

This program decides X. Note that every number n must be in either X or its complement X̄. Therefore, either P or P′ will eventually output n. We run both listers until one of them outputs n, then we have decided whether n ∈ X. If X is undecidable, then it must be the case that either X or X̄ is not listable.
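A minimal Python sketch of this argument, assuming both listers produce infinitely many outputs: alternate between the lister for the set and the lister for its complement until one of them outputs the input. The names `decide_from_listers`, `list_evens`, and `list_odds` are hypothetical, and the even and odd numbers serve as the example pair of sets.

```python
from itertools import count


def list_evens():
    """Example lister P for X = the even numbers."""
    return (n for n in count(0) if n % 2 == 0)


def list_odds():
    """Example lister P' for the complement of X."""
    return (n for n in count(0) if n % 2 == 1)


def decide_from_listers(list_x, list_co_x, g):
    """Decide whether g is in X by taking outputs from both listers in turn."""
    p, p_prime = list_x(), list_co_x()
    while True:
        if next(p) == g:        # the i'th output of P is G: answer 1
            return 1
        if next(p_prime) == g:  # the i'th output of P' is G: answer 0
            return 0


if __name__ == "__main__":
    print(decide_from_listers(list_evens, list_odds, 10))
    print(decide_from_listers(list_evens, list_odds, 7))
```

The loop always terminates because the input must appear in exactly one of the two lists.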

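Theorem 6. K̄, the complement of K, is not listable.

Proof. K is undecidable (Theorem 1) and listable (Theorem 4), so by Theorem 5 its complement K̄ cannot be listable.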

Is it possible for both a set and its complement to be unlistable? Yes:

Theorem 7. T is not listable.

Proof. Assume there exists an SBASIC program P that lists T. Let Bc be the following program:

    INPUT G
    N ← MYINDEX
    execute P until it outputs G numbers
    if the G’th output of P is N:
        A: GOTO A
    else:
        PRINT 0


Let c be the index of Bc. Is c ∈ T?

• if c ∈ T, then since P lists T, P eventually outputs c. Therefore, for some input G, Bc will compute the G’th output of P, find it is c, and go into an infinite loop. Therefore, c ∉ T, which is a contradiction

• if c ∉ T, then since P lists T, on all possible inputs G, the G’th output of P will never be c. Therefore, Bc on all inputs prints 0. Therefore, c ∈ T, which is a contradiction

Since we reach a contradiction in both cases, our original assumption must be false, and there does not exist an SBASIC program that lists T, and T is unlistable.

Theorem 8. T̄, the complement of T, is not listable.

Proof. Assume there exists an SBASIC program P that lists T̄, the complement of T. Let Bc be the following program:

    INPUT G
    N ← MYINDEX
    for i ← 1 to infinity:
        execute P until it outputs i numbers
        if the i’th output of P is N:
            GOTO A
        else
            next i
    A: PRINT 0


Let c be the index of Bc. Is c in the complement T̄?

• if c ∈ T̄, then since P lists T̄, P eventually outputs c. Therefore, for some value of i, Bc will compute the i’th output of P, find it is c, and goto the PRINT 0 line. So Bc halts on every input, that is, c ∈ T. Therefore, c ∉ T̄, which is a contradiction

• if c ∉ T̄, then since P lists T̄, for all values of i, the i’th output of P will never be c. Therefore, Bc will never exit the loop, always incrementing i, and thus never halt. Therefore, c ∈ T̄, which is a contradiction

Since we reach a contradiction in both cases, our original assumption must be false, and there does not exist an SBASIC program that lists T̄, and T̄ is unlistable.

Therefore T and its complement T̄ are both undecidable and both unlistable.

Is unlistable the worst thing that a set can be?

Shortest Program:

Consider the string 0000000000000000000000. Would you think that this string was generated randomly (e.g. by flipping a coin and writing ‘0’ if you get heads and ‘1’ if you get tails)? Of course it is possible to get this sequence by coin flips, but it’s not likely.

If our concern is getting roughly an equal number of ‘0’s and ‘1’s, then what about the string 0101010101010101010101? Is this likely to come from coin flips? Again, it is possible to get this string randomly, but there is “something” about it that suggests it was not random.

It can sometimes be difficult to tell. For example, consider the string 0110111001011101111000. This might begin to look more random, but in fact it is the numbers 0 through 8, each written in binary, concatenated together.
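The claim about this string is easy to check mechanically. In the short Python sketch below, the function name `concat_binary` is an assumption of the example.

```python
def concat_binary(limit):
    """Concatenate the binary representations of 0, 1, ..., limit."""
    return "".join(format(n, "b") for n in range(limit + 1))


if __name__ == "__main__":
    print(concat_binary(8) == "0110111001011101111000")
```

The generating program is far shorter than the pattern it produces once `limit` grows, which is the kind of regularity the discussion of shortest programs is about.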

One problem with trying to call these strings “random” or “not random” is that they are all equally likely to occur as any other 22-bit string. That is: you get each possible string with probability 2^(−22), including 0000000000000000000000 and 1101110110101101010101 (an “actual” string generated randomly). So why does one string seem more “random” than the other?

Kolmogorov gave a computational answer to this question. He observed that “non-random” strings have short programs that output them. For example, consider a 1-million bit long string of only ‘0’s. There is a program of length approximately 1 million bits that outputs that string: simply hard-code the string into the program, and output it. But there is a much shorter program, which simply has a loop that counts from 1 to 1 million and outputs 0 each iteration. This program is much shorter than the hard-coding program, yet produces the same output.


Similar short programs can output the other “non-random” strings we’ve discussed. However, for a string generated randomly, with high probability it does not have a shorter program than the one that outputs a hard-coded string.
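To make the comparison concrete, the following Python sketch builds two program texts that print the same string of one million ‘0’s, one hard-coded and one using a loop, and compares their lengths in characters. It uses Python source instead of SBASIC, and the names `run_and_capture`, `HARD_CODED`, and `LOOPED` are assumptions of this example.

```python
import contextlib
import io


def run_and_capture(source: str) -> str:
    """Execute Python source text and return everything it prints."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})
    return buffer.getvalue()


N = 1_000_000

# Program 1: the whole string is written out inside the source code.
HARD_CODED = 'print("' + "0" * N + '", end="")'

# Program 2: a loop that counts to one million and outputs 0 each iteration.
LOOPED = 'for _ in range(1000000): print("0", end="")'

if __name__ == "__main__":
    same_output = run_and_capture(HARD_CODED) == run_and_capture(LOOPED)
    print("same output:", same_output)
    print("hard-coded length:", len(HARD_CODED))
    print("looped length:", len(LOOPED))
```

Neither program is claimed to be a shortest one; the sketch only shows that the hard-coded version is nowhere near optimal for such a regular string.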

DEF: 4. For all numbers n, an SBASIC program P is a shortest program for n iff:

1. On input 0, P outputs n

2. There does not exist another program P′ that on input 0 outputs n, and the length of P′ is less than the length of P.

The length of a program P is the number of ASCII characters in its source code.

Sh = {i | (∃n)[Bi is a shortest program for n]}.

DEF: 5. For all X ⊆ N, X is immune iff

1. X is infinite

2. X has no infinite subset that is listable.

Theorem 9. Sh is immune.

Proof. Assume there exists an SBASIC program P that lists some infinite subset of Sh. Let Bc be the following program:

    INPUT G
    N ← MYINDEX
    for i ← 1 to infinity
        execute P until it produces its i’th output, k
        if Bk is an SBASIC program longer than BN:
            execute Bk on input 0
            PRINT whatever Bk outputs
        else
            next i

What is the output of Bc on input 0? It will compute its own index first, then run P until it outputs a shortest program that is longer than Bc. Note that this must occur if P lists an infinite subset of Sh, since there are only finitely many programs shorter than or equal in length to Bc (assuming SBASIC programs are written in ASCII). So after outputting that finite number of elements of Sh, P must eventually output the index of an SBASIC program Bk longer than Bc.

When Bc now executes that other SBASIC program, Bc will produce the same output as Bk. However, Bc is shorter than Bk, contradicting the claim that Bk was a shortest program and that k ∈ Sh. Therefore, P does not list an infinite subset of Sh. Since Sh is itself infinite (every number n has a shortest program, and a program outputs at most one number on input 0), Sh is immune.


The study of shortest programs has implications for data compression. Consider a PNG image, which is a lossless compression of a bitmap. Some images are very compressible; they are very large images with very short representations. The compressed form can be considered to be a “short program” that outputs that image. Yet a random bitmap image is, with high probability, incompressible; there does not exist a representation that is much shorter than the raw data.
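This contrast can be observed with an off-the-shelf compressor. The Python sketch below uses `zlib` (an implementation of DEFLATE, the method PNG is based on) as a practical stand-in; it is not a shortest-program computation, and a general-purpose compressor only gives an upper bound on how short a description can be. The name `compressed_size` and the choice of 100,000 bytes are assumptions of this example.

```python
import os
import zlib


def compressed_size(data: bytes) -> int:
    """Return the number of bytes zlib needs to represent data losslessly."""
    return len(zlib.compress(data, 9))


SIZE = 100_000
REGULAR = bytes(SIZE)           # 100,000 zero bytes, like an all-black bitmap
RANDOM_LIKE = os.urandom(SIZE)  # 100,000 bytes from the OS random source

if __name__ == "__main__":
    print("regular data:", SIZE, "->", compressed_size(REGULAR))
    print("random data: ", SIZE, "->", compressed_size(RANDOM_LIKE))
```

The regular input shrinks to a tiny fraction of its size, while the random input typically does not shrink at all.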
