Why are Humans so Smart? Basic idea of runaway evolution due to Ronald Fisher (whom we will later...

Post on 18-Jan-2018


Why are Humans so Smart?

• Basic idea of runaway evolution due to Ronald Fisher (whom we will later meet as the inventor of the linear classifier)

• Application to human intelligence mostly due to Geoffrey Miller

• Geoffrey Miller The Mating Mind: How Sexual Choice Shaped Human Nature (2000)

• This is a theory, but very plausible.

Ronald Fisher

2 inch tail

2.5 inch tail

1.5 inch tail

I like guys with a two inch tail..

• Sexual selection is a big driver of evolution.

• The tails of a bird are controlled by genes but, critically, so is the bird's behavior.

• Female choice is an example of a behavior. If a female that likes 2-inch tails has daughters, it is likely that the daughters will also like 2-inch tails.

• Some female choice may be rational: females may choose for strong beaks, or for good nest-making ability, or….

• Some (perhaps most) female choices could be arbitrary.


• Under certain conditions, the genes for having a trait, and the genes for choosing the same trait, can “take hold” in a population.

• If that happens, the population is unlikely to ever break out of the cycle. A male that has a longer or shorter tail simply will not be able to find a mate.

2 inch tail

2.5 inch tail

1.5 inch tail

• Suppose the female choice is not for a certain length tail, but for a tail that is longer than average….

I like guys whose tail is longer than average….


• Now every generation has longer and longer tails….

2.6 inch tail

• Once the genes for liking longer-than-average-tails reach a critical mass, we have a positive feedback cycle….

2.8 inch tail

2.9 inch tail

I like guys whose tail is longer than average….

5.9 inch tail

• …and the tails get longer and longer

• Note that the male would be much better off without the long tail. It costs energy to grow it, it makes it very hard to fly, it makes it hard to avoid detection by predators, and hard to evade them… But any male without a long tail will not find a mate. This seems like a theoretical model, but….

6.2 inch tail

5.1 inch tail

I like guys whose tail is longer than average….

The pin-tailed whydah (Vidua macroura)

IQ = 96

IQ = 50

IQ = 90

• Sometime in the last few hundred thousand years, human female choice started to select for guys who were smarter than average (how did the females know who was smart?)

• Human intelligence is just a by-product of this arbitrary accident.

I like guys who are smarter than average….

Humans are not that Smart

Converting a continuous problem to a discrete problem

• The problems we considered thus far are intrinsically discrete.

• What happens if our problem space is continuous?

Curiosity was launched from Cape Canaveral on November 26, 2011, and successfully landed on Aeolis Palus in Gale Crater on Mars on August 6, 2012. The Bradbury Landing site was less than 2.4 km (1.5 mi) from the center of the rover's touchdown target after a 563,000,000 km journey.

When Mars is closest to the Earth, it takes light three minutes to travel between the two planets.

Mars is usually a lot farther away than that. At its greatest distance it is about 22 light minutes away.

Path Planning Search

target

Path Planning

What is the state space?

This example is by Jean-Claude Latombe

Formulation #1

Cost of one horizontal/vertical step = 1
Cost of one diagonal step = √2

Optimal Solution?

This path is the shortest in the discretized state space, but not in the original continuous space. (Trade-off: the smaller the grid… the larger the grid…)
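The gap between the grid path and the true shortest path can be checked with a little arithmetic. The sketch below assumes an obstacle-free grid, a horizontal/vertical step cost of 1, and a diagonal step cost of √2; the function names are made up for this illustration.

```python
import math

def grid_path_cost(dx: int, dy: int) -> float:
    """Cheapest cost between two cells of an obstacle-free grid, assuming
    horizontal/vertical steps cost 1 and diagonal steps cost sqrt(2)."""
    dx, dy = abs(dx), abs(dy)
    diagonal_steps = min(dx, dy)             # move diagonally while both offsets remain
    straight_steps = max(dx, dy) - diagonal_steps
    return diagonal_steps * math.sqrt(2) + straight_steps

def continuous_cost(dx: int, dy: int) -> float:
    """Straight-line distance in the original continuous space."""
    return math.hypot(dx, dy)

# Target is 2 cells across and 1 cell up from the start.
print(grid_path_cost(2, 1))    # one diagonal step plus one straight step
print(continuous_cost(2, 1))   # the straight line is shorter
```

The two costs agree only when the target lies exactly along a row, a column, or a diagonal of the grid.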

Formulation #2: sweep-line

States

Operator: Visit center of adjacent region

Solution Path

A path-smoothing post-processing step is usually needed to shorten the path further

Formulation #3

Cost of one step: length of segment


Visibility graph

Solution Path

The shortest path in this state space is also the shortest in the original continuous space

A description of the desired state of the world (goal state), could be implicit or explicit.

explicit

implicit

FWDC

XXX 21

X= Xor1=1or2=2

Search to Solve Word Ladders

Before we consider Heuristic Search, let us review a little first

We will begin by considering a new problem space, for Word Ladders….

DOG → DOT → POT → POP → MOP → MAP → CAP → CAT

Example of a word ladder

Change DOG to CAT. Change ONLY one letter at a time and form a new legal word at each step.

DOG → ? → ? → CAT

What is the depth of the solution?

Since all three letters are different in the two words, and we have to change only one letter at a time, it is clear that we have to take at least three steps.

DOG → DOT → COT → CAT

As it happens, there is a solution at depth three
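Both observations can be checked with a short sketch: the number of differing letters gives a lower bound on the depth, and a breadth-first search finds a ladder that meets it. The tiny word list `WORDS` is an assumption made for this example (it holds only the words from the two ladders shown above), not a real dictionary.

```python
from collections import deque
from string import ascii_uppercase

def hamming(a: str, b: str) -> int:
    """Number of positions where the two words differ (a lower bound on depth)."""
    return sum(x != y for x, y in zip(a, b))

def word_ladder(start: str, goal: str, words: set):
    """Breadth-first search; returns a shortest ladder as a list, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        word = path[-1]
        if word == goal:
            return path
        # Operators: change one letter at a time, keeping only legal words.
        for i in range(len(word)):
            for letter in ascii_uppercase:
                candidate = word[:i] + letter + word[i + 1:]
                if candidate in words and candidate not in seen:
                    seen.add(candidate)
                    queue.append(path + [candidate])
    return None

# Hypothetical mini-dictionary, just the words used on the slides.
WORDS = {"DOG", "DOT", "POT", "POP", "MOP", "MAP", "CAP", "CAT", "COT"}

print(hamming("DOG", "CAT"))             # 3
print(word_ladder("DOG", "CAT", WORDS))  # ['DOG', 'DOT', 'COT', 'CAT']
```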

What is the diameter of the Word Ladder problem in general?

In other words, what two English words could you give me, that would require the deepest tree?

The two words are “charge” and “comedo”, and the tree is of depth 49.

(Note: this is using a standard English dictionary, no plurals or verb conjugations)

{“charge”, “change”, “chance”, “chancy”, “chanty”, “shanty”, “shanny”, “shinny”, “whinny”, “whiney”, “whiner”, “shiner”, “shiver”, “shaver”, “sharer”, “scarer”, “scaler”, “sealer”, “healer”, “header”, “reader”, “render”, “renter”, “ranter”, “ranker”, “hanker”, “hacker”, “hackee”, “hackle”, “heckle”, “deckle”, “decile”, “defile”, “define”, “refine”, “repine”, “rapine”, “ravine”, “raving”, “roving”, “roping”, “coping”, “coming”, “homing”, “hominy”, “homily”, “homely”, “comely”, “comedy”, “comedo”}

APE → ? → MAN

Invented by Lewis Carroll (1832–1898)

He suggested: APE → ARE → ERE → ERR → EAR → MAR → MAN

But we can do: APE → APT → OPT → OAT → MAT → MAN

which takes one fewer move…

Review point: A problem may have multiple solutions. Different solutions may have different costs. We generally want the cheapest solution, the optimal solution.

UNSPORTSMANLIKE

Some problems may have no solutions

ADVENTUROUSNESS

This problem has no solution

FROG

TOAD

Can you change FROG to TOAD?

FROG

TOAD

This time we are given the tree depth, here it is six.

The branching factor is the number of letters in the word (4), times 25.

A branching factor of 100 is huge, but the fairly limited tree depth means the search might still be tractable.
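The branching factor can be confirmed by generating every one-letter change of FROG. This is a minimal sketch; `successors` is a name invented here, and it does not yet check whether the results are legal words.

```python
from string import ascii_uppercase

def successors(word: str) -> list:
    """All strings that differ from `word` in exactly one position."""
    result = []
    for i, original in enumerate(word):
        for letter in ascii_uppercase:
            if letter != original:   # 25 alternatives per position
                result.append(word[:i] + letter + word[i + 1:])
    return result

children = successors("FROG")
print(len(children))   # 100, i.e. 4 letters times 25 alternatives
print(children[:4])    # ['AROG', 'BROG', 'CROG', 'DROG']
print(100 ** 6)        # upper bound on leaves at depth 6
```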

We can use depth-limited search, L = 6

FROG

AROG BROG CROG DROG

TOAD

GOAD

But how do we know the legal paths?

In other words, how do we know what are legal words?

It would be best if we had a dictionary of legal words, but if needed…

AROG BROG CROG FROM ::::

AROG has 1,060,000 hits; FROM has 25,270,000,000 hits

Actually 25,270,000,000 is the max Google allows

FROG FROM PROM PRAM GRAD GOAD TOAD

Number of Google hits

FROG  262,000,000
FROM  25,270,000,000
PROM  224,000,000
PRAM  23,000,000
GRAD  225,000,000
GOAD  5,380,000
TOAD  50,400,000

FROG
AROG  1,060,000
BROG  3,080,000
CROG  724,000
DROG

The “Google hits” strategy suggests a way to do greedy search, or hill-climbing search.

We can expand the nodes with the highest count first

That way we are very unlikely to waste time exploring subtrees such as FROG → CROG → …
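The ordering idea can be sketched in a few lines. The hit counts are the ones quoted on the slides; the dictionary `HITS` and the function name are assumptions made for this illustration, and no real search-engine API is called.

```python
# Hit counts copied from the slides (DROG is omitted because no count is given).
HITS = {
    "AROG": 1_060_000,
    "BROG": 3_080_000,
    "CROG": 724_000,
    "FROM": 25_270_000_000,
}

def expansion_order(candidates, hits):
    """Greedy ordering: most frequently seen strings are expanded first."""
    return sorted(candidates, key=lambda word: hits.get(word, 0), reverse=True)

print(expansion_order(["AROG", "BROG", "CROG", "FROM"], HITS))
# ['FROM', 'BROG', 'AROG', 'CROG']
```

The real word FROM is tried first, and the unlikely string CROG is tried last.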

A

B C

D E

H I K L M

F G

J

Assume we have this tree. We have two goal states (highlighted). To understand all our algorithms, we can ask: “in what order do the nodes get REMOVE-FRONT (dequeued) for a given algorithm?”


I am going to do Depth First Search (enqueue nodes in LIFO (last-in, first-out) order). You should do all algorithms (except perhaps bi-directional search) and make up new trees etc.


Depth First Search

Nodes: A

Before entering the loop, the initial state A is enqueued

We now enter the loop for the first time (next slide)

Depth First Search

Nodes: (empty)

The front of Nodes is dequeued, it was A

We ask A, are you the goal? Since the answer is no, we expand all A’s children (do every operator) and enqueue them in Nodes.

Nodes: C B (the front of the queue is on the right)

We now jump back to the top of the loop…


Depth First Search

Nodes: C

The front of Nodes is dequeued, it was B

We ask B, are you the goal? Since the answer is no, we expand all B’s children (do every operator) and enqueue them in Nodes.

Nodes: C E D

We now jump back to the top of the loop…


Depth First Search

Nodes: C E

The front of Nodes is dequeued, it was D

We ask D, are you the goal? Since the answer is no, we expand all D’s children (do every operator) and enqueue them in Nodes.

Nodes: C E I H

We now jump back to the top of the loop…


Depth First Search

Nodes: C E I

The front of Nodes is dequeued, it was H

We ask H, are you the goal? Since the answer is no, we expand all H’s children. As it happens, there are none.

Nodes: C E I

We now jump back to the top of the loop…


Depth First Search

Nodes: C E

The front of Nodes is dequeued, it was I

We ask I, are you the goal? Since the answer is no, we expand all I’s children. As it happens, there are none.

Nodes: C E

We now jump back to the top of the loop…


Depth First Search

Nodes: C

The front of Nodes is dequeued, it was E

We ask E, are you the goal? Since the answer is no, we expand all E’s children (do every operator) and enqueue them in Nodes.

Nodes: C J

We now jump back to the top of the loop…


Depth First Search

Nodes: C

The front of Nodes is dequeued, it was J

We ask J, are you the goal? Since the answer is yes, we report success!
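The walkthrough can be replayed with a short sketch. The left side of the tree (A, B, D, E, H, I, J) follows the trace above; the children assigned to C, F and G are an assumption, since the figure layout is not fully recoverable, and only J is used as a goal because it is the one the trace reaches.

```python
def depth_first_search(tree, start, goals):
    """LIFO queue: the end of the list plays the role of the front of Nodes."""
    nodes = [start]
    dequeue_order = []
    while nodes:
        node = nodes.pop()                 # REMOVE-FRONT
        dequeue_order.append(node)
        if node in goals:
            return node, dequeue_order
        # Push children in reverse so the leftmost child is dequeued first.
        nodes.extend(reversed(tree.get(node, [])))
    return None, dequeue_order

TREE = {
    "A": ["B", "C"],
    "B": ["D", "E"],
    "D": ["H", "I"],
    "E": ["J"],
    "C": ["F", "G"],   # assumed
    "F": ["K", "L"],   # assumed
    "G": ["M"],        # assumed
}

found, order = depth_first_search(TREE, "A", goals={"J"})
print(found, order)   # J ['A', 'B', 'D', 'H', 'I', 'E', 'J']
```

Node C stays in the queue for the whole run and is never dequeued, just as in the slides.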


Heuristic Search

The search techniques we have seen so far...

• Breadth first search
• Uniform cost search
• Depth first search
• Depth limited search
• Iterative Deepening
• Bi-directional Search

...are all too slow for most real world problems

(uninformed search, also called blind search)

Sometimes we can tell that some states appear better than others...

1 2 3
4 5 6
7 8

7 8 4
3 5 1
6 2

FWD

C FW C

D

...we can use this knowledge of the relative merit of states to guide search

Heuristic Search (informed search)

A heuristic is a function that, when applied to a state, returns a number that is an estimate of the merit of the state, with respect to the goal.

In other words, the heuristic tells us approximately how far the state is from the goal state*.

Note we said “approximately”. Heuristics might underestimate or overestimate the distance to the goal. But for reasons which we will see, heuristics that never overestimate are very desirable, and are called admissible.

*I.e., smaller numbers are better

Heuristics for 8-puzzle I

•The number of misplaced tiles (not including the blank)

Current State:
1 2 3
4 5 6
7 _ 8

Goal State:
1 2 3
4 5 6
7 8 _

Misplaced? (tile by tile)
N N N
N N N
N Y

In this case, only “8” is misplaced, so the heuristic function evaluates to 1. In other words, the heuristic is telling us that it thinks a solution might be available in just 1 more move.

Notation: h(n); h(current state) = 1
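A minimal sketch of this heuristic follows. States are written as flat tuples read row by row, with 0 standing for the blank; that encoding, the goal layout with the blank in the bottom-right corner, and the current state with “8” one square to the right of its goal square are assumptions consistent with the description above.

```python
GOAL = (1, 2, 3,
        4, 5, 6,
        7, 8, 0)   # 0 stands for the blank

def misplaced_tiles(state, goal=GOAL):
    """Count tiles (not the blank) that are not on their goal square."""
    return sum(1 for tile, target in zip(state, goal)
               if tile != 0 and tile != target)

current = (1, 2, 3,
           4, 5, 6,
           7, 0, 8)

print(misplaced_tiles(current))   # 1
```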

Heuristics for 8-puzzle II

•The Manhattan Distance (not including the blank)

In this case, only the “3”, “8” and “1” tiles are misplaced, by 2, 3, and 3 squares respectively, so the heuristic function evaluates to 8. In other words, the heuristic is telling us that it thinks a solution is available in just 8 more moves.

Current State:
3 2 8
4 5 6
7 1 _

Goal State:
1 2 3
4 5 6
7 8 _

Tile 3: 2 spaces
Tile 8: 3 spaces
Tile 1: 3 spaces
Total: 8

Notation: h(n); h(current state) = 8
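The same total can be reproduced with a short sketch. As before, the flat-tuple encoding with 0 for the blank is an assumption of this example, and the blank is placed in the bottom-right corner of both states, which is the only placement consistent with the distances 2, 3 and 3 quoted above.

```python
GOAL = (1, 2, 3,
        4, 5, 6,
        7, 8, 0)   # 0 stands for the blank

def manhattan_distance(state, goal=GOAL, width=3):
    """Sum over tiles (not the blank) of row offset plus column offset."""
    total = 0
    for index, tile in enumerate(state):
        if tile == 0:
            continue
        goal_index = goal.index(tile)
        total += abs(index // width - goal_index // width)   # rows apart
        total += abs(index % width - goal_index % width)     # columns apart
    return total

current = (3, 2, 8,
           4, 5, 6,
           7, 1, 0)

print(manhattan_distance(current))   # 8
```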

[Figure: hill-climbing search tree of 8-puzzle states, each labeled with its Manhattan distance h(n). Level by level, the h(n) labels read 5; 6, 4; 3; 4, 2; 1, 3, 3; 0, 2.]

We can use heuristics to guide “hill climbing” search.

In this example, the Manhattan Distance heuristic helps us quickly find a solution to the 8-puzzle.

But hill climbing has a problem...

h(n)

[Figure: search tree of 8-puzzle states, each labeled with h(n). Level by level, the h(n) labels read 6; 7, 5; 6, 6.]

In this example, hill climbing does not work!

All the nodes on the fringe are taking a step “backwards” (local minima)

Note that this puzzle is solvable in just 12 more steps.

h(n)

We have seen two interesting algorithms.

Uniform Cost “looks backwards, how far have I come”
• Measures the cost to each node.
• Is optimal and complete!
• Can be very slow.

Hill Climbing “looks forwards, how far to go”
• Estimates how far away the goal is.
• Is neither optimal nor complete.
• Can be very fast.

Can we combine them to create an optimal and complete algorithm that is also very fast?

Uniform Cost Search: enqueue nodes in order of cost (distance to the start)

Intuition: Expand the cheapest node. Where the cost is the path cost g(n)

Example: two fringe nodes with g(n) = 2 and g(n) = 5

Hill Climbing Search: enqueue nodes in order of estimated distance to goal

Intuition: Expand the node you think is nearest to goal. Where the estimate of distance to goal is h(n)

Example: the same two nodes have h(n) = 19 and h(n) = 17

A* Search: enqueue nodes in order of f(n) = g(n) + h(n)

Example: f(n) = 2 + 19 and f(n) = 5 + 17

The A* Algorithm (“A-Star”)

Enqueue nodes in order of estimated cost to goal, f(n)

g(n) is the cost to get to a node.
h(n) is the estimated distance to the goal.

f(n) = g(n) + h(n)

We can think of f(n) as the estimated cost of the cheapest solution that goes through node n

Note that we can use the general search algorithm we used before. All that we have changed is the queuing strategy.
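A minimal sketch of that queuing strategy, using a priority queue ordered by f(n) = g(n) + h(n). The small graph, its edge costs, and the heuristic values are invented for this illustration; the heuristic is chosen so that it never overestimates.

```python
import heapq

def a_star(start, goal, neighbors, h):
    """neighbors(n) returns (next_node, step_cost) pairs; h(n) estimates cost to goal."""
    frontier = [(h(start), 0, start, [start])]   # entries are (f, g, node, path)
    best_g = {start: 0}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)   # smallest f(n) first
        if node == goal:
            return path, g
        if g > best_g.get(node, float("inf")):
            continue                                 # stale entry, a cheaper path was found
        for nxt, cost in neighbors(node):
            new_g = g + cost
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                heapq.heappush(frontier, (new_g + h(nxt), new_g, nxt, path + [nxt]))
    return None, float("inf")

# Hypothetical graph: S reaches the goal G through A (cost 2 + 6) or B (cost 5 + 1).
GRAPH = {"S": [("A", 2), ("B", 5)], "A": [("G", 6)], "B": [("G", 1)], "G": []}
H = {"S": 5, "A": 5, "B": 1, "G": 0}   # never overestimates the true remaining cost

path, cost = a_star("S", "G", lambda n: GRAPH[n], lambda n: H[n])
print(path, cost)   # ['S', 'B', 'G'] 6
```

Uniform cost search would expand A first because g(A) = 2 is smaller than g(B) = 5; here B is preferred because f(B) = 6 is smaller than f(A) = 7.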

If the heuristic is optimistic, that is to say, it never overestimates the distance to the goal, then…

A* is optimal and complete!

Informal proof outline of A* completeness

• Assume that every operator has some minimum positive cost, epsilon.

• Assume that a goal state exists, therefore some finite set of operators lead to it.

• Expanding nodes produces paths whose actual costs increase by at least epsilon each time. Since the algorithm will not terminate until it finds a goal state, it must expand a goal state in finite time.

Informal proof outline of A* optimality

• When A* terminates, it has found a goal state.

• All remaining nodes have an estimated cost to goal (f(n)) greater than or equal to that of the goal we have found.

• Since the heuristic function was optimistic, the actual cost to goal for these other paths can be no better than the cost of the one we have already found.

How fast is A*?

A* is optimally efficient. That is, for any given heuristic, no other optimal algorithm that uses the same heuristic is guaranteed to expand fewer nodes than A*.

How fast is it? It depends on the quality of the heuristic.

• If the heuristic is useless (i.e., h(n) is hardcoded to equal 0), the algorithm degenerates to uniform cost.

•If the heuristic is perfect, there is no real search, we just march down the tree to the goal.

Generally we are somewhere in between the two situations above. The time taken depends on the quality of the heuristic.

What is A*’s space complexity?

A* has worst case O(b^d) space complexity, but an iterative deepening version is possible (IDA*)

A Worked Example: Maze Traversal

[Figure: a 5 × 5 maze; columns are numbered 1 to 5 and rows are lettered A to E]

Problem: To get from square A3 to square E2, one step at a time, avoiding obstacles (black squares).

Operators (in order):
• go_left(n)
• go_down(n)
• go_right(n)
Each operator costs 1.

Heuristic: Manhattan distance
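The h values used in this walkthrough can be recomputed with a short sketch. It assumes squares are named by a row letter (A to E) followed by a column number (1 to 5), with the goal at E2; the function name is made up for this example.

```python
def manhattan(square: str, goal: str = "E2") -> int:
    """Rows apart plus columns apart, for squares named like 'A3'."""
    def row(s):
        return ord(s[0]) - ord("A")
    def col(s):
        return int(s[1:])
    return abs(row(square) - row(goal)) + abs(col(square) - col(goal))

for square in ["A2", "B3", "A4", "A1", "C3", "B4", "B1", "B5"]:
    print(square, manhattan(square))
# A2 4, B3 4, A4 6, A1 5, C3 3, B4 5, B1 4, B5 6
```

Obstacles are ignored by the heuristic, which is why it can only underestimate the true number of steps.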

A3 is expanded, generating A2, B3 and A4:

g(A2) = 1, h(A2) = 4

g(B3) = 1, h(B3) = 4

g(A4) = 1, h(A4) = 6

Next, A1 is generated:

g(A1) = 2, h(A1) = 5

Next, C3 and B4 are generated:

g(C3) = 2, h(C3) = 3

g(B4) = 2, h(B4) = 5

Next, B1 is generated:

g(B1) = 3, h(B1) = 4

Next, B5 is generated:

g(B5) = 3, h(B5) = 6

Here is a larger version of the previous example

The black square is the initial state

The white square is the goal state.

The blue squares are the barriers

In this version, diagonal moves are allowed.

Here are the first four states it expands….

Success!

Please watch the video “A* Pathfinding Algorithm Visualization”: https://www.youtube.com/watch?v=19h1g22hby8