10/7/2014 Constrainedness of Search Toby Walsh NICTA and UNSW tw.

04/11/23

Constrainedness of Search

Toby Walsh

NICTA and UNSW

http://cse.unsw.edu.au/~tw

Motivation

Will a problem be satisfiable or unsatisfiable?

Will it be hard or easy?

How can we develop heuristics for a new problem?

Take home messages

Hard problems often associated with a phase transition
  Under-constrained, easy
  Critically constrained, hard
  Over-constrained, easier

Provide definition of constrainedness
  Predict location of such phase transitions
  Can be measured during search
  Observe beautiful “knife-edge”
  Build heuristics to get off this knife-edge

Let’s start with the mother of all NP-complete problems!

3-SAT

Where are the hard 3-SAT problems?

Sample randomly generated 3-SAT
  Fix number of clauses, l
  Number of variables, n
  By definition, each clause has 3 variables
  Generate all possible clauses with uniform probability
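The sampling scheme above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the signed-integer literal encoding and the brute-force satisfiability check are assumptions made for the example, not something taken from the slides.

```python
import random


def random_3sat(n, l, rng=None):
    """Return l random 3-clauses over variables 1..n.

    Each clause picks 3 distinct variables uniformly at random and
    negates each one with probability 1/2. Literals are signed ints.
    """
    rng = rng or random.Random()
    clauses = []
    for _ in range(l):
        variables = rng.sample(range(1, n + 1), 3)
        clauses.append(tuple(v if rng.random() < 0.5 else -v for v in variables))
    return clauses


def is_satisfiable(n, clauses):
    """Brute-force check over all 2^n assignments; only sensible for small n."""
    for bits in range(2 ** n):
        # Bit (v - 1) of `bits` holds the truth value of variable v.
        if all(
            any(((bits >> (abs(lit) - 1)) & 1) == (lit > 0) for lit in clause)
            for clause in clauses
        ):
            return True
    return False
```

Sweeping l/n for a fixed small n and averaging is_satisfiable over many samples gives a rough picture of the satisfiability curve, although small n blurs the transition.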

Random 3-SAT

Which are the hard instances?
  Around l/n = 4.3

What happens with larger problems?

Why are some dots red and others blue?

This is a so-called “phase transition”

Random 3-SAT

Varying problem size, n

Complexity peak appears to be largely invariant of algorithm
  Complete algorithms like Davis-Putnam
  Incomplete methods like local search

What’s so special about 4.3?

Random 3-SAT

Complexity peak coincides with satisfiability transition

l/n < 4.3 problems under-constrained and SAT

l/n > 4.3 problems over-constrained and UNSAT

l/n=4.3, problems on “knife-edge” between SAT and UNSAT

Where did this all start?

At least as far back as the 60s, with Erdős & Rényi thresholds in random graphs

Late 80s: pioneering work by Karp, Purdom, Kirkpatrick, Huberman, Hogg …

Flood gates burst: Cheeseman, Kanefsky & Taylor’s IJCAI-91 paper

What do we know about this phase transition?

Its shape
  Step function in limit [Friedgut 98]

Its location
  Theory puts it in interval: 3.42 < l/n < 4.506
  Experiment puts it at: l/n = 4.2

3SAT phase transition

Lower bounds (hard)
  Analyse algorithm that almost always solves problem
  Backtracking is hard to reason about, so typically without backtracking
  Complex branching heuristics needed to ensure success
  But these are complex to reason about

Upper bounds (easier)
  Typically by estimating count of solutions
  E.g. Markov (or 1st moment) method

For any statistic X
  prob(X>=1) <= E[X]
  No assumptions about the distribution of X except non-negative!

Let X be the number of satisfying assignments for a 3SAT problem

The expected value of X can be easily calculated
  E[X] = 2^n * (7/8)^l

prob(SAT) = prob(X>=1), so if E[X] < 1 then prob(SAT) < 1

E[X] < 1 when
  2^n * (7/8)^l < 1
  n + l log2(7/8) < 0
  l/n > 1/log2(8/7) = 5.19…

To get tighter bounds than 5.19, can refine the counting argument
  E.g. not count all solutions but just those maximal under some ordering
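The first moment calculation is easy to check numerically. The sketch below works with log2 of E[X] to avoid overflow. The function names are made up for the example, and the generalisation from 3 to k literals per clause is an assumption that goes beyond the slides (a random k-clause rules out a fraction 2^-k of assignments).

```python
import math


def log2_expected_solutions(n, l):
    """log2 of E[X] = 2^n * (7/8)^l for random 3-SAT."""
    return n + l * math.log2(7 / 8)


def first_moment_threshold(k=3):
    """Ratio l/n above which E[X] < 1 for random k-SAT.

    Assumes each clause is satisfied by a fraction (1 - 2^-k)
    of all assignments, as for k = 3 in the derivation above.
    """
    return 1 / math.log2(2 ** k / (2 ** k - 1))


print(round(first_moment_threshold(), 2))  # 5.19
```

For n = 100, log2_expected_solutions is still positive at l = 430 but negative at l = 520, which is the gap between the observed transition and this upper bound.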

Random 2-SAT

2-SAT is in P
  Linear time algorithm

Random 2-SAT displays “classic” phase transition
  l/n < 1, almost surely SAT
  l/n > 1, almost surely UNSAT
  Complexity peaks around l/n = 1

x1 v x2, -x2 v x3, -x1 v x3, …

Phase transitions in P

2-SAT
  l/n = 1

Horn SAT
  Transition not “sharp”

Arc-consistency
  Rapid transition in whether problem can be made AC
  Peak in (median) checks

Phase transitions above NP

PSpace-complete
  QSAT (SAT of QBF)
    x1 x2 x3 . x1 v x2 & -x1 v x3
  Stochastic SAT
  Modal SAT

PP-complete
  Polynomial-time probabilistic Turing machines
  Counting problems
  #SAT(>= 2^n/2)

Exact phase boundaries in NP

Random 3-SAT is only known within bounds
  3.42 < l/n < 4.506

Are there any NP phase boundaries known exactly?

Exact NP phase boundaries are known:
  1-in-k SAT at l/n = 2/(k(k-1))

Backbone

Variables which take fixed values in all solutions
  Alias unit prime implicates

Let fk be fraction of variables in backbone in random 3-SAT
  l/n < 4.3, fk vanishing (otherwise adding clause could make problem unsat)
  l/n > 4.3, fk > 0
  Discontinuity at phase boundary!
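For small formulas the backbone can be found by exhaustive enumeration, which makes the definition concrete. The following is a minimal sketch, assuming clauses are tuples of signed integers over variables 1..n. The function name and the choice to return None for unsatisfiable formulas are illustrative, and the exponential enumeration is nothing like how backbones are measured at scale.

```python
from itertools import product


def backbone(n, clauses):
    """Return {variable: value} for variables fixed in every solution.

    Returns None if the formula has no solution at all.
    Enumerates all 2^n assignments, so only usable for small n.
    """
    fixed = None
    for assignment in product([False, True], repeat=n):
        satisfied = all(
            any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
            for clause in clauses
        )
        if not satisfied:
            continue
        if fixed is None:
            # First solution found: every variable is a backbone candidate.
            fixed = {v: assignment[v - 1] for v in range(1, n + 1)}
        else:
            # Drop candidates that take a different value in this solution.
            fixed = {v: val for v, val in fixed.items() if assignment[v - 1] == val}
    return fixed
```

Dividing the size of the returned dictionary by n gives the backbone fraction fk for that instance.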

Backbone

Search cost correlated with backbone size
  If fk non-zero, then can easily assign variable “wrong” value
  Such mistakes costly if at top of search tree

One source of “thrashing” behaviour
  Can tackle with randomization and rapid restarts

Can we adapt algorithms to offer more robust performance guarantees?

Backbone

Backbones observed in structured problems
  Quasigroup completion problems (QCP)

Backbones also observed in optimization and approximation problems
  Coloring, TSP, blocks world planning …

Can we adapt algorithms to identify and exploit the backbone structure of a problem?

2+p-SAT

Morph between 2-SAT and 3-SAT
  Fraction p of 3-clauses
  Fraction (1-p) of 2-clauses

2-SAT is polynomial (linear)
  Phase boundary at l/n = 1
  But no backbone discontinuity here!

2+p-SAT maps from P to NP
  p>0, 2+p-SAT is NP-complete

2+p-SAT phase transition

[Figure: phase boundary, l/n against p]

Lower bound
  Are the 2-clauses (on their own) UNSAT?
  n.b. 2-clauses are much more constraining than 3-clauses

p <= 0.4
  Transition occurs at lower bound
  3-clauses are not contributing!

2+p-SAT backbone

fk becomes discontinuous for p>0.4
  But NP-complete for p>0!

Search cost shifts from linear to exponential at p=0.4

Similar behavior seen with local search algorithms

[Figure: search cost against n]

2+p-SAT trajectories

Input 3-SAT to a SAT solver like Davis-Putnam
  REPEAT assign variable
    Simplify all unit clauses
    Leaving subproblem with a mixture of 2- and 3-clauses

For a number of branching heuristics (e.g. random, …)
  Assume subproblems sample uniformly from 2+p-SAT space
  Can use to estimate runtimes!

2+p-SAT trajectories

[Figure: UNSAT and SAT regions]

Beyond 2+p-SAT

Optimization
  MAX-SAT

Other decision problems
  2-COL to 3-COL
  Horn-SAT to 3-SAT
  XOR-SAT to 3-SAT
  1-in-2-SAT to 1-in-3-SAT
  NAE-2-SAT to NAE-3-SAT
  …

COL

Graph colouring
  Can we colour graph so that neighbouring nodes have different colours?

In k-COL, only allowed k colours
  3-COL is NP-complete
  2-COL is P

Random COL

Sample graphs uniformly
  n nodes and e edges

Observe colourability phase transition
  Random 3-COL is "sharp", e/n approx 2.3
  BUT random 2-COL is not "sharp"

As n->oo
  prob(2-COL @ e/n=0) = 1
  prob(2-COL @ e/n=0.45) approx 0.5
  prob(2-COL @ e/n=1) = 0

2+p-COL

Morph from 2-COL to 3-COL
  Fraction p of 3 colourable nodes
  Fraction (1-p) of 2 colourable nodes

Like 2+p-SAT
  Maps from P to NP
  NP for any fixed p>0

Unlike 2+p-SAT
  Maps from coarse to sharp transition

2+p-COL

2+p-COL sharpness

[Figure: p=0.8]

2+p-COL search cost

2+p-COL

Sharp transition for p>0.8

Transition has coarse and sharp regions for 0<p<0.8

Problem hardness appears to increase from polynomial to exponential at p=0.8

2+p-COL behaves like 2-COL for p<0.8

NB sharpness alone is not cause of complexity since 2-SAT has a sharp transition!

Location of phase boundary

For sharp transitions, like 2+p-SAT:
  As n->oo,
    if l/n = c+epsilon, then UNSAT
    if l/n = c-epsilon, then SAT

For transitions like 2+p-COL that may be coarse, we identify the start and finish:
  delta_2+p = sup{e/n | prob(2+p-colourable) = 1}
  gamma_2+p = inf{e/n | prob(2+p-colourable) = 0}

Basic properties

Monotonicity: delta <= gamma

Sharp transition iff delta = gamma

Simple bounds:
  delta_2+p = 0 for all p<1
  gamma_2 <= gamma_2+p <= min(gamma_3, gamma_2/(1-p))

2+p-COL phase boundary

XOR-SAT

XOR-SAT
  Replace or by xor
  XOR k-SAT is in P for all k

Phase transition
  XOR 3-SAT has sharp transition
  0.8894 <= l/n <= 0.9278 [Creignou et al 2001]
  Statistical mechanics gives l/n = 0.918 [Franz et al 2001]

XOR-SAT to SAT

Morph from XOR-SAT to SAT
  Fraction (1-p) of XOR clauses
  Fraction p of OR clauses

NP-complete for all p>0

Phase transition occurs at:
  0.92 <= l/n <= min(0.92/(1-p), 4.3)

Upper bound appears loose for all p>0

Polynomial subproblem does not dominate!
  3-SAT contributes (cf 2+p-SAT, 2+p-COL)

Other morphs between P and NP

NAE 2+p-SAT
  NAE = not all equal
  NAE 2-SAT is P, NAE 3-SAT is NP-complete

1-in-2+p-SAT
  1-in-k SAT = exactly one in k literals true
  1-in-2 SAT is P, 1-in-3 SAT is NP-complete

…

NAE to SAT

Morph between two NP-complete problems
  Fraction (1-p) of NAE 3-SAT clauses
  Fraction p of 3-SAT clauses

Each NAE 3-SAT clause is equivalent to two 3-SAT clauses
  NAE(a,b,c) = or(a,b,c) & or(-a,-b,-c)

NAE 3-SAT phase transition occurs around l/n = 2.1
  Tantalisingly close to half of 4.2

Can we ignore many of the correlations that this encoding of NAE SAT into SAT introduces?

NAE to SAT

Compute “effective” clause size
  Consider (1-p)l NAE 3-SAT clauses and pl 3-SAT clauses
  These behave like 2(1-p)l 3-SAT clauses and pl 3-SAT clauses
  That is, (2-p)l 3-SAT clauses

Hence, effective clause to variable ratio is (2-p)l/n

Plot prob(satisfiable) and search cost against (2-p)l/n

NAE to SAT
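The encoding NAE(a,b,c) = or(a,b,c) & or(-a,-b,-c) and the effective clause to variable ratio (2-p)l/n are both simple enough to write down directly. In this sketch the function names and the signed-integer literal encoding are assumptions chosen for the example.

```python
def nae_to_sat(clause):
    """Encode one NAE 3-SAT clause as two ordinary 3-SAT clauses.

    NAE(a, b, c) holds iff at least one literal is true and at
    least one is false, i.e. or(a, b, c) & or(-a, -b, -c).
    """
    a, b, c = clause
    return [(a, b, c), (-a, -b, -c)]


def effective_ratio(l, n, p):
    """Effective clause to variable ratio (2 - p) * l / n.

    (1 - p) * l NAE clauses count double, p * l plain clauses count once.
    """
    return (2 - p) * l / n
```

With p = 0 (all NAE clauses) a ratio of l/n = 2.1 maps to an effective ratio of 4.2, which is the coincidence the slides point at.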

The real world isn’t random?

Very true!

Can we identify structural features common in real world problems?

Consider graphs met in real world situations
  Social networks
  Electricity grids
  Neural networks
  ...

Real versus Random

Real graphs tend to be sparse
  Dense random graphs contain lots of (rare?) structure

Real graphs tend to have short path lengths
  As do random graphs

Real graphs tend to be clustered
  Unlike sparse random graphs

L, average path length

C, clustering coefficient (fraction of neighbours connected to each other, cliqueness measure)

mu, proximity ratio, is C/L normalized by that of random graph of same size and density

Small world graphs

Sparse, clustered, short path lengths

Six degrees of separation
  Stanley Milgram’s famous 1967 postal experiment
  Recently revived by Watts & Strogatz
  Shown applies to:
    Actors database
    US electricity grid
    Neural net of a worm
    ...

An example

1994 exam timetable at Edinburgh University
  59 nodes, 594 edges so relatively sparse
  But contains 10-clique

Less than 10^-10 chance in a random graph
  Assuming same size and density

Clique totally dominated cost to solve problem

Small world graphs

To construct an ensemble of small world graphs
  Morph between regular graph (like ring lattice) and random graph
  With prob p include edge from ring lattice, 1-p from random graph

Real problems often contain similar structure and stochastic components?
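One way to read the morph is sketched below: each ring lattice edge is kept with probability p, and every dropped edge is replaced by a uniformly random edge so the edge count stays fixed. That reading, along with all the function names, is an assumption made for illustration. The clustering coefficient C and average path length L follow the definitions given earlier in the slides.

```python
import random
from collections import deque


def ring_lattice(n, k):
    """Edges joining each node to its k nearest neighbours on each side."""
    return {(min(i, (i + d) % n), max(i, (i + d) % n))
            for i in range(n) for d in range(1, k + 1)}


def morph(n, k, p, rng=None):
    """Keep each lattice edge with prob p, else substitute a random edge."""
    rng = rng or random.Random()
    edges, pending = set(), 0
    for e in sorted(ring_lattice(n, k)):
        if rng.random() < p:
            edges.add(e)
        else:
            pending += 1
    while pending:  # assumes the graph is far from complete
        u, v = rng.sample(range(n), 2)
        e = (min(u, v), max(u, v))
        if e not in edges:
            edges.add(e)
            pending -= 1
    return edges


def adjacency(n, edges):
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def clustering(n, edges):
    """C: mean fraction of a node's neighbour pairs that are themselves linked."""
    adj = adjacency(n, edges)
    total = 0.0
    for v in range(n):
        nb = list(adj[v])
        d = len(nb)
        if d < 2:
            continue  # nodes with fewer than 2 neighbours contribute 0
        links = sum(1 for i in range(d) for j in range(i) if nb[j] in adj[nb[i]])
        total += links / (d * (d - 1) / 2)
    return total / n


def average_path_length(n, edges):
    """L: mean shortest path length over connected ordered pairs (BFS)."""
    adj = adjacency(n, edges)
    total = pairs = 0
    for s in range(n):
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else float("inf")
```

Comparing clustering and average_path_length at p = 1 against intermediate values of p shows how a few random edges shorten paths while leaving most of the clustering in place.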

Small world graphs

Ring lattice is clustered but has long paths

Random edges provide shortcuts without destroying clustering

Small world graphs

Colouring small world graphs

Small world graphs

Other bad news
  Disease spreads more rapidly in a small world

Good news
  Cooperation breaks out quicker in iterated Prisoner’s dilemma

Other structural features

It’s not just small world graphs that have been studied

High degree graphs
  Barabási et al’s power-law model

Ultrametric graphs
  Hogg’s tree based model

Numbers following Benford’s Law
  1 is much more common than 9 as a leading digit!
  prob(leading digit=i) = log10(1+1/i)
  Such clustering makes number partitioning much easier
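Benford’s Law is quick to tabulate. The sketch below assumes the logarithm is base 10, which is what makes the nine leading-digit probabilities sum to one; the function name is illustrative.

```python
import math


def benford(digit):
    """Probability that the leading decimal digit equals `digit` (1..9)."""
    return math.log10(1 + 1 / digit)


for d in range(1, 10):
    print(d, round(benford(d), 3))
```

The leading digit 1 comes out at roughly 30% and 9 at under 5%.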

High degree graphs

Degree = number of edges connected to node

Directed graph
  Edges have a direction
  E.g. web pages = nodes, links = directed edges

In-degree, out-degree
  In-degree = links pointing to page
  Out-degree = links pointing out of page

In-degree of World Wide Web

Power law distribution
  Pr(in-degree = k) = ak^-2.1

Some nodes of very high in-degree
  E.g. google.com, …

Out-degree of World Wide Web

Power law distribution
  Pr(out-degree = k) = ak^-2.7

Some nodes of very high out-degree
  E.g. people in SAT

High degree graphs

World Wide Web

Electricity grid

Citation graph
  633,391 out of 783,339 papers have < 10 citations
  64 have > 1000 citations
  1 has 8907 citations

Actors graph
  Robert Wagner, Donald Sutherland, …

High degree graphs

Power law in degree distribution
  Pr(degree = k) = ak^-b where b typically around 3

Compare this to random graphs
  Gnm model
    n nodes, m edges chosen uniformly at random
  Gnp model
    n nodes, each edge included with probability p
  In both, Pr(degree = k) is (approximately) a Poisson distribution tightly clustered around mean

Random v high degree graphs

Generating high-degree graphs

Grow graph

Preferentially attach new nodes to old nodes according to their degree
  Prob(attach to node j) proportional to degree of node j

Gives Prob(degree = k) = ak^-3
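A minimal sketch of the growth process follows. The slides only fix the attachment rule, so the seed graph (a small clique), the number m of edges added per new node, and the function name are all assumptions made for the example.

```python
import random
from collections import Counter


def preferential_attachment(n, m=2, rng=None):
    """Grow a graph to n nodes, attaching each new node to m old nodes.

    Old nodes are chosen with probability proportional to their degree.
    Starts from a clique on m + 1 nodes (an assumed seed graph).
    """
    rng = rng or random.Random()
    edges = [(i, j) for i in range(m + 1) for j in range(i)]
    # Each node appears in `targets` once per incident edge, so a uniform
    # draw from `targets` is a draw proportional to degree.
    targets = [v for e in edges for v in e]
    for new in range(m + 1, n):
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(targets))
        for old in chosen:
            edges.append((new, old))
            targets.extend((new, old))
    return edges


edges = preferential_attachment(1000, rng=random.Random(0))
degree = Counter(v for e in edges for v in e)
print(max(degree.values()), sum(degree.values()) / len(degree))
```

The printed maximum degree sits far above the mean, in contrast to the tightly clustered degrees of the Gnm and Gnp models.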

High-degree = small world?

Preferential attachment model
  n=16, mu=1
  n=64, mu=1.35
  n=256, mu=2.12
  …

Small world topology thus for large n!

Search on high degree graphs

Random
  Uniformly hard

Small world
  A few long runs

High degree
  More uniform
  Easier than random

What about numbers?

So far, we’ve looked at structural features of graphs

Many problems contain numbers
  Do we see phase transitions here too?

Number partitioning

What’s the problem?
  Dividing a bag of numbers into two so their sums are as balanced as possible

What problem instances?
  n numbers, each uniformly chosen from (0, l]
  Other distributions work (Poisson, …)

Number partitioning

Identify a measure of constrainedness
  More numbers => less constrained
  Larger numbers => more constrained
  Could try some measures out at random (l/n, log(l)/n, log(l)/sqrt(n), …)

Better still, use kappa!
  (Approximate) theory about constrainedness
  Based upon some simplifying assumptions
    E.g. ignores structural features that cluster solutions together

Theory of constrainedness

Consider state space searched
  See 10-d hypercube opposite of 2^10 possible partitions of 10 numbers into 2 bags

Compute expected number of solutions, <Sol>
  Independence assumptions often useful and harmless!

Theory of constrainedness

Constrainedness given by:
  kappa = 1 - log2(<Sol>)/n
  where n is dimension of state space

kappa lies in range [0,infty)
  kappa=0, <Sol>=2^n, under-constrained
  kappa=infty, <Sol>=0, over-constrained
  kappa=1, <Sol>=1, critically constrained phase boundary

Phase boundary

Markov inequality
  prob(Sol) <= <Sol>

Now, kappa > 1 implies <Sol> < 1

Hence, kappa > 1 implies prob(Sol) < 1

Phase boundary typically at values of kappa slightly smaller than kappa=1
  Skew in distribution of solutions (e.g. 3-SAT)
  Non-independence

Examples of kappa

3-SAT
  kappa = l/(5.2n)
  Phase boundary at kappa=0.82

3-COL
  kappa = e/(2.7n)
  Phase boundary at kappa=0.84

Number partitioning
  kappa = log2(l)/n
  Phase boundary at kappa=0.96
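The constants 5.2 and 2.7 can be recovered from the definition kappa = 1 - log2(<Sol>)/n. The sketch below assumes the usual independence estimates <Sol> = 2^n (7/8)^l for 3-SAT and <Sol> = 3^n (2/3)^e for 3-COL, with a state space of n log2(3) bits for colouring. Those estimates and the function names are assumptions for illustration; the number partitioning formula is taken directly from the slide.

```python
import math


def kappa(log2_sol, bits):
    """kappa = 1 - log2(<Sol>) / (dimension of state space in bits)."""
    return 1 - log2_sol / bits


def kappa_3sat(n, l):
    # Assumed estimate: <Sol> = 2^n * (7/8)^l over an n-bit state space.
    return kappa(n + l * math.log2(7 / 8), n)


def kappa_3col(n, e):
    # Assumed estimate: <Sol> = 3^n * (2/3)^e over n * log2(3) bits.
    bits = n * math.log2(3)
    return kappa(bits + e * math.log2(2 / 3), bits)


def kappa_partition(n, l):
    # Formula as given on the slide: kappa = log2(l) / n.
    return math.log2(l) / n


print(round(kappa_3sat(100, 430), 2))  # 0.83
print(round(kappa_3col(100, 230), 2))  # 0.85
```

At the observed thresholds l/n = 4.3 and e/n = 2.3 these come out close to the phase boundary values of kappa quoted above.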

Number partition phase transition

[Figure: prob(perfect partition) against kappa]

Finite-size scaling

Simple “trick” from statistical physics
  Around critical point, problems indistinguishable except for change of scale given by simple power-law

Define rescaled parameter
  gamma = ((kappa - kappac)/kappac) * n^(1/v)

Estimate kappac and v empirically
  E.g. for number partitioning, kappac=0.96, v=1
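The rescaling is a one-line transformation. This sketch assumes the standard form gamma = ((kappa - kappac)/kappac) * n^(1/v) and uses the number partitioning estimates kappac = 0.96 and v = 1 as defaults; the function and parameter names are made up for the example.

```python
def rescale(kappa, n, kappa_c=0.96, nu=1.0):
    """Rescaled parameter gamma = ((kappa - kappa_c) / kappa_c) * n^(1/nu)."""
    return (kappa - kappa_c) / kappa_c * n ** (1 / nu)
```

Plotting prob(perfect partition) against rescale(kappa, n) rather than kappa is what should make curves for different n collapse onto one another.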

Rescaled phase transition

[Figure: prob(perfect partition) against gamma]

Rescaled search cost

[Figure: optimization cost against gamma]

Easy-Hard-Easy?

Search cost only easy-hard here?
  Optimization not decision search cost!
  Easy if (large number of) perfect partitions
  Otherwise little pruning (search scales as 2^(0.85n))

Phase transition behaviour less well understood for optimization than for decision
  Sometimes optimization = sequence of decision problems (e.g. branch & bound)
  BUT lots of subtle issues lurking?

Looking inside search

[Figure: clauses/variables down search branch]

Looking inside search

[Figure: clause length down search branch]

Constrainedness knife-edge

[Figure: kappa down search branch]

[Figure: real world register allocation graph colouring problem]

[Figure: optimisation problems too (number partitioning)]

Exploiting the knife-edge

Get off the knife-edge asap
  Aka minimize constrainedness

Many existing heuristics can be viewed in this light
  E.g. fail first heuristic in CSPs
  E.g. KK heuristic for number partitioning
  …

Good way to design new heuristics
  Branch into subproblem with minimal kappa
  Challenge: to compute this efficiently!
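As a concrete instance of such a heuristic, here is a minimal sketch of Karmarkar-Karp (KK) differencing for number partitioning. It commits the two largest numbers to opposite bags by replacing them with their difference, which shrinks the numbers and so lowers kappa = log2(l)/n. The function name is illustrative, and the sketch returns only the final difference, not the partition itself.

```python
import heapq


def karmarkar_karp(numbers):
    """Return the partition difference found by KK differencing."""
    heap = [-x for x in numbers]  # max-heap via negation
    heapq.heapify(heap)
    while len(heap) > 1:
        a = -heapq.heappop(heap)  # largest remaining number
        b = -heapq.heappop(heap)  # second largest
        heapq.heappush(heap, -(a - b))  # place them in opposite bags
    return -heap[0] if heap else 0


print(karmarkar_karp([8, 7, 6, 5, 4]))  # 2
```

On this input the heuristic leaves a difference of 2 although a perfect partition exists (8 + 7 = 6 + 5 + 4), so it is a heuristic rather than an exact method.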

The future?

What open questions remain?

Where to next?

Open questions

Prove random 3-SAT transition occurs at l/n = 4.3
  Random 2-SAT proved to be at l/n = 1
  Random 3-SAT transition proved to be in range 3.42 < l/n < 4.506

2+p-COL
  Prove problem changes around p=0.8
  What happens to colouring backbone?

Open questions

Does phase transition behaviour give insights to help answer P=NP?
  It certainly identifies hard problems!
  Problems like 2+p-SAT and ideas like backbone also show promise

But problems away from phase boundary can be hard to solve
  Over-constrained 3-SAT region has exponential resolution proofs
  Under-constrained 3-SAT region can throw up occasional hard problems (early mistakes?)

Summary

That’s nearly all from me!

Conclusions

Phase transition behaviour ubiquitous
  Decision/optimization/...
  NP/PSpace/P/…
  Random/real

Phase transition behaviour/constrainedness gives insight into problem hardness
  Suggests new branching heuristics
  Ideas like the backbone help understand branching mistakes

Conclusions

AI becoming more of an experimental science?
  Theory and experiment complement each other well
  Increasing use of approximate/heuristic theories to keep theory in touch with rapid experimentation

Phase transition behaviour is FUN
  Lots of nice graphs as promised
  And it is teaching us lots about complexity and algorithms!

Very partial bibliography

Cheeseman, Kanefsky, Taylor, Where the really hard problems are, Proc. of IJCAI-91
Gent et al, The Constrainedness of Search, Proc. of AAAI-96
Gent & Walsh, The TSP Phase Transition, Artificial Intelligence, 88:349-358, 1996
Gent & Walsh, Analysis of Heuristics for Number Partitioning, Computational Intelligence, 14 (3), 1998
Gent & Walsh, Beyond NP: The QSAT Phase Transition, Proc. of AAAI-99
Gent et al, Morphing: combining structure and randomness, Proc. of AAAI-99
Hogg & Williams (eds), special issue of Artificial Intelligence, 88 (1-2), 1996
Mitchell, Selman, Levesque, Hard and Easy Distributions of SAT problems, Proc. of AAAI-92
Monasson et al, Determining computational complexity from characteristic ‘phase transitions’, Nature, 400, 1999
Walsh, Search in a Small World, Proc. of IJCAI-99
Walsh, Search on High Degree Graphs, Proc. of IJCAI-2001
Walsh, From P to NP: COL, XOR, NAE, 1-in-k, and Horn SAT, Proc. of AAAI-2001
Watts & Strogatz, Collective dynamics of small world networks, Nature, 393, 1998

Some blatant adverts!

2nd International Optimisation Summer School
  Jan 12th to 18th, Kioloa, NSW
  Will also cover local search
  http://go.to/optschool

NICTA Optimisation Research Group
  We love having visitors stop by to give talks, or for longer (week, month or sabbaticals!)
