Post on 14-Dec-2015
Outline
- Discrepancy theory: what is it, applications, basic results (non-constructive)
- SDP connection: algorithms, lower bounds
Discrepancy: What is it?
Study of irregularities in approximating the continuous by the discrete.

Historical motivation: numerical integration / sampling. How well can you approximate a region by discrete points?
Distributing points in a grid

Problem: How uniformly can you distribute n points in an n^{1/2} × n^{1/2} grid?
"Uniform": for every axis-parallel rectangle R, |(# points in R) - (area of R)| should be low.

Discrepancy: max over rectangles R of |(# points in R) - (area of R)|
Three placements (n = 64 points):
- Uniform grid: n^{1/2} discrepancy
- Random: n^{1/2} (log log n)^{1/2} discrepancy
- Van der Corput set: O(log n) discrepancy!
Quasi-Monte Carlo Methods

n random samples (Monte Carlo): error proportional to 1/n^{1/2}
Quasi-Monte Carlo methods*: error proportional to discrepancy/n

Extensive research area.

*Different constant of proportionality
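As a small illustration (mine, not from the talk), the sketch below compares plain Monte Carlo with a quasi-Monte Carlo estimate built on the base-2 Van der Corput sequence; the integrand f(x) = x² and all names are illustrative choices.

```python
import random

def van_der_corput(i, base=2):
    """Radical inverse of i: reflect the base-`base` digits of i about
    the radix point, yielding a low-discrepancy point in [0, 1)."""
    x, denom = 0.0, 1.0
    while i > 0:
        denom *= base
        i, digit = divmod(i, base)
        x += digit / denom
    return x

def estimate(points):
    """Estimate the integral of f(x) = x^2 over [0, 1] (true value 1/3)."""
    return sum(p * p for p in points) / len(points)

n = 1024
random.seed(0)
mc_err = abs(estimate([random.random() for _ in range(n)]) - 1 / 3)
qmc_err = abs(estimate([van_der_corput(i) for i in range(1, n + 1)]) - 1 / 3)
# Both errors are small, but the low-discrepancy points typically do
# markedly better than the ~1/sqrt(n) Monte Carlo error.
```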
Discrepancy: Example 2
Input: n points placed arbitrarily in a grid.
Color them red/blue such that each axis-parallel rectangle is colored as evenly as possible.

Discrepancy: max over rectangles R of | # red in R - # blue in R |

Continuous: color each element 1/2 red and 1/2 blue (0 discrepancy)
Discrete: random gives about O(n^{1/2} log^{1/2} n); can achieve O(log^{2.5} n)

Why do we care?
Combinatorial Discrepancy
Universe: U = {1,…,n}   Subsets: S_1, S_2, …, S_m

Color elements red/blue so each set is colored as evenly as possible.

Find χ: [n] → {-1,+1} to minimize disc(χ) = max_S | Σ_{i∈S} χ(i) |
Example: U = {1,2,3} with sets S_1, …, S_4; for one such system disc = 0, for another disc = 2.
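To make the definition concrete, here is a tiny brute-force computation of disc (my illustration; exponential in n, so only for toy instances, and the example system is an illustrative choice).

```python
from itertools import product

def disc(n, sets):
    """Combinatorial discrepancy by brute force:
    min over chi: [n] -> {-1,+1} of max_S |sum_{i in S} chi(i)|."""
    return min(
        max(abs(sum(chi[i] for i in S)) for S in sets)
        for chi in product([-1, 1], repeat=n)
    )

# Universe {0, 1, 2} with every nonempty subset as a set:
# any +/-1 coloring gives two elements the same sign, so that pair sums to +/-2.
all_subsets = [[0], [1], [2], [0, 1], [0, 2], [1, 2], [0, 1, 2]]
print(disc(3, all_subsets))  # 2
```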
Applications

CS: computational geometry, combinatorial optimization, Monte Carlo simulation, machine learning, complexity, pseudo-randomness, …

Math: dynamical systems, combinatorics, mathematical finance, number theory, Ramsey theory, algebra, measure theory, …
Hereditary Discrepancy
Discrepancy is a useful measure of the complexity of a set system.

Hereditary discrepancy:

herdisc(U, S) = max_{U' ⊆ U} disc(U', S|_{U'})
A robust version of discrepancy.

Discrepancy itself is not so robust: given any system A_1, …, A_m on elements 1, …, n, add a duplicate copy A'_i on copies 1', …, n' and let S_i = A_i ∪ A'_i. Coloring each element +1 and its copy -1 gives discrepancy 0, while the hereditary discrepancy (restrict to the original elements) is unchanged.
Rounding

Lovasz-Spencer-Vesztergombi'86: Given any matrix A and any x ∈ R^n, one can round x to an integer vector x̃ such that |Ax - Ax̃|_∞ < herdisc(A).

Proof: Round the bits of x one by one.
x_1: blah.0101101
x_2: blah.1101010
…
x_n: blah.0111101

At each step, take the elements whose current lowest bit is 1 and find a low-discrepancy ±1 coloring of them; adding the coloring, scaled by the bit value, clears that bit. Error ≤ herdisc(A) · (1/2 + 1/4 + …) < herdisc(A).

Key point: a low-discrepancy coloring guides our updates!
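The bit-by-bit argument can be sketched in code. This is my toy illustration: a brute-force search stands in for the low-discrepancy coloring, and the names `best_coloring` and `lsv_round` are made up.

```python
from fractions import Fraction
from itertools import product

def best_coloring(A, cols):
    """Brute-force +/-1 coloring of the given columns minimizing
    max_i |sum_{j in cols} A[i][j] * chi_j| (stands in for a real
    low-discrepancy coloring; exponential in |cols|)."""
    best, best_chi = None, None
    for signs in product([-1, 1], repeat=len(cols)):
        d = max(abs(sum(A[i][j] * s for j, s in zip(cols, signs)))
                for i in range(len(A)))
        if best is None or d < best:
            best, best_chi = d, dict(zip(cols, signs))
    return best_chi

def lsv_round(A, x, bits=16):
    """Clear the fractional bits of x from least significant up; at each
    bit, color the elements whose bit is set and add the scaled coloring.
    Row-wise error is bounded by herdisc(A) * (1/2 + 1/4 + ...)."""
    x = [Fraction(v).limit_denominator(1 << bits) for v in x]
    for k in range(bits, 0, -1):
        step = Fraction(1, 1 << k)
        cols = [j for j, v in enumerate(x) if (v / step) % 2 == 1]
        if cols:
            chi = best_coloring(A, cols)
            x = [v + step * chi[j] if j in chi else v for j, v in enumerate(x)]
    return x

A = [[1, 1, 0], [0, 1, 1]]   # interval-like system with small herdisc
x = [0.5, 0.5, 0.5]
x_rounded = lsv_round(A, x)  # integer vector with |A(x - x_rounded)| small
```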
Rounding
The LSV'86 result guarantees the existence of a good rounding.

How to find it efficiently? Nothing was known until recently.

Thm [B'10]: Can round efficiently with error at most the hereditary discrepancy of A.
Discrepancy and optimization
Corollary (LSV'86): If A is an integer matrix with herdisc(A) = 1, then A is TU.
(Totally unimodular: the polytope Ax ≤ b is integral for all integer vectors b.)

Compare: the Ghouila-Houri test for TU matrices.

Open: Can you characterize matrices with herdisc(A) = 2?

Bin Packing: OPT ≤ LP + O(1)?
[Eisenbrand, Palvolgyi, Rothvoss'11]: Yes, for constant item sizes, if the k-permutation conjecture is true. (Recently, Newman-Nikolov'11 disproved the k-permutation conjecture.)
Refined further by Rothvoss'12 (entropy rounding method).
Dynamic Data Structures
N weighted points in a 2-d region.
Weights updated over time.
Query: Given an axis-parallel rectangle R, determine the total weight of the points in R.
Goal: Preprocess (in a data structure)
1) Low query time
2) Low update time (upon weight change)
Example: line, interval queries

Trivial: query time = O(n), update time = 1
Table of all n^2 entries W[a,b]: query time = 1, update time = O(n^2)
Storing W[a,b] = W[0,b] - W[0,a]: query time = 2, update time = O(n)
Partial sums over dyadic intervals: query = O(log n), update = O(log n)

Recurse for 2-d.
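The O(log n)/O(log n) point on this tradeoff can be sketched with a Fenwick (binary indexed) tree, which maintains dyadic partial sums; a minimal version (my illustration):

```python
class FenwickTree:
    """Point update and range query in O(log n) each, via dyadic partial sums."""
    def __init__(self, n):
        self.n = n
        self.tree = [0] * (n + 1)

    def add(self, i, delta):
        """Add delta to the weight of point i (1-indexed)."""
        while i <= self.n:
            self.tree[i] += delta
            i += i & (-i)

    def prefix(self, i):
        """Total weight of points 1..i."""
        s = 0
        while i > 0:
            s += self.tree[i]
            i -= i & (-i)
        return s

    def query(self, a, b):
        """Total weight on the interval [a, b] (cf. W[a,b] = W[0,b] - W[0,a])."""
        return self.prefix(b) - self.prefix(a - 1)
```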
What about other queries?

Circles, arbitrary rectangles, aligned triangles: it turns out that the query time or the update time must be polynomially large.

Reason: the set system S formed by the query sets and the points has large discrepancy (polynomial in n).
[Larsen'11]
General set system
What is the discrepancy of a general system on m sets?
Useful fact: after n coin tosses,
E[# heads] = n/2
# heads = n/2 ± O(n^{1/2}) with high probability

In general: for a sum X of n independent "nice" random variables,
X = E[X] ± O(n^{1/2}) with high probability.
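A quick empirical check of this fact (illustrative parameters, my sketch):

```python
import random

random.seed(1)
n, trials = 10_000, 200
deviations = []
for _ in range(trials):
    heads = sum(random.getrandbits(1) for _ in range(n))
    deviations.append(abs(heads - n / 2))

# Fraction of trials within 3 * n^{1/2} of the mean n/2 (6 standard
# deviations, since the std of # heads is n^{1/2} / 2): essentially all.
within = sum(d <= 3 * n ** 0.5 for d in deviations) / trials
```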
(Previous) Best Algorithm
Random: color each element i independently x(i) = +1 or -1 with prob. 1/2 each.

Thm: Discrepancy = O((n log m)^{1/2})
Pf: For each set, expect O(n^{1/2}) discrepancy.
Standard tail bounds: Pr[ |Σ_{i∈S} x(i)| ≥ c n^{1/2} ] ≈ e^{-c^2}
Union bound + choose c ≈ (log m)^{1/2}.

Tight: random cannot do better.
For the m = n case: random gives Θ((n log n)^{1/2}).
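A sketch checking the random-coloring bound empirically (random sets, illustrative sizes, my example):

```python
import math
import random

random.seed(7)
n = 400
# m = n random sets, each containing each element with prob. 1/2.
sets = [[i for i in range(n) if random.random() < 0.5] for _ in range(n)]

x = [random.choice([-1, 1]) for _ in range(n)]   # random coloring
disc = max(abs(sum(x[i] for i in S)) for S in sets)
bound = math.sqrt(n * math.log(n))               # (n log m)^{1/2} scale, m = n
# disc lands on the order of `bound`, as the theorem predicts.
```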
Better Colorings Exist!
[Spencer '85] (six standard deviations suffice):
Any system with n sets has discrepancy ≤ 6 n^{1/2}.
(In general, for arbitrary m, discrepancy = O(n^{1/2} log^{1/2}(m/n)).)
Tight: for m = n, cannot beat 0.5 n^{1/2} (Hadamard matrix).

Inherently non-constructive proof (counting); the powerful entropy method.

Question: Can we find such a coloring algorithmically?
Certain natural algorithms do not work [Spencer].
Conjecture [Alon-Spencer]: May not be possible.
Results
Thm: Can get Spencer's bound constructively, i.e., O(n^{1/2}) discrepancy for m = n sets.

Thm: For any set system, can find a coloring with discrepancy ≤ hereditary discrepancy.
Corollary: rounding with error ≤ herdisc(A).

General technique: also applies to the k-permutation problem [Spencer, Srinivasan, Tetali], geometric problems, the Beck-Fiala setting (Srinivasan's bound), …
SDPs
Vector program view:
Variables: vectors v_1, …, v_n (in arbitrary dimension)
Constraints: arbitrary linear constraints on the inner products v_i · v_j,
e.g. |v_i|^2 = 1, |Σ_{i∈S} v_i|^2 ≤ n.
Relaxations: LPs and SDPs
Not clear how to use. A linear program is useless: it can color each element 1/2 red and 1/2 blue, so the discrepancy of each set is 0!

In general, if x is a good coloring then so is -x, and their average (the all-zeros point) is LP-feasible with zero discrepancy.

SDPs: |Σ_{i∈S} v_i|^2 ≤ n for all S
|v_i|^2 = 1 for all i

Intended solution: v_i = (+1,0,…,0) or (-1,0,…,0).
Trivially feasible: v_i = e_i (all v_i's orthogonal).
Yet, SDPs will be a major tool.
Punch line
The SDP is very helpful if we want "tighter" bounds (λ ≪ n^{1/2}) for some sets.
But why does it work for Spencer's setting? An additional idea is needed.

The algorithm constructs the coloring over time, using several SDPs.
Algorithm (at high level)
Cube: {-1,+1}^n. Each dimension: an element. Each vertex: a coloring.

Algorithm: a "sticky" random walk from the center ("start") to a vertex ("finish"). Each step is generated by rounding a suitable SDP; moves in the various dimensions are correlated, e.g. γ^t_1 + γ^t_2 ≈ 0.

Analysis: few steps are needed to reach a vertex (the walk has high variance), while disc(S_i) does a random walk with low variance.
An SDP
Hereditary discrepancy λ ⇒ the following SDP is feasible:

SDP (low discrepancy):
|Σ_{i∈S_j} v_i|^2 ≤ λ^2 for each set S_j
|v_i|^2 = 1 for each element i

Solving it, we obtain vectors v_i ∈ R^n. Perhaps v_i can guide how we update element i?

Trouble: v_i is a vector; we need a real number. Perhaps project onto some vector g? (I.e., for each i, consider γ_i = g · v_i.) Seems promising.
Idea
Which vector g to project on? Pick a random Gaussian vector g = (g_1, …, g_n) in R^n, each g_i i.i.d. N(0,1).

Lemma: If g ∈ R^n is a random Gaussian, then for any v ∈ R^n, g · v is distributed as N(0, |v|^2).

Pf: N(0,a^2) + N(0,b^2) = N(0, a^2 + b^2), so g · v = Σ_i v(i) g_i ∼ N(0, Σ_i v(i)^2).
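An empirical check of the lemma (illustrative vector v, my sketch):

```python
import math
import random

random.seed(0)
v = [0.6, 0.8, 0.0]                  # |v| = 1
norm_v = math.sqrt(sum(x * x for x in v))

samples = []
for _ in range(20000):
    g = [random.gauss(0, 1) for _ in v]                    # g_i i.i.d. N(0,1)
    samples.append(sum(gi * vi for gi, vi in zip(g, v)))   # g . v

mean = sum(samples) / len(samples)
std = math.sqrt(sum((s - mean) ** 2 for s in samples) / len(samples))
# Empirically g . v ~ N(0, |v|^2): mean close to 0, std close to |v|.
```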
Properties of Rounding
Lemma: If g ∈ R^n is a random Gaussian, then for any v ∈ R^n, g · v is distributed as N(0, |v|^2).

Recall: γ_i = g · v_i, where the SDP gives |v_i|^2 = 1 and |Σ_{i∈S_j} v_i|^2 ≤ λ^2. Hence:
1. Each γ_i ∼ N(0,1)
2. For each set S, Σ_{i∈S} γ_i = g · (Σ_{i∈S} v_i) ∼ N(0, σ^2) with σ ≤ λ (std deviation ≤ λ)

The γ's will guide our updates to x.
Algorithm Overview

Construct the coloring iteratively.
Initially: start with the coloring x_0 = (0,0,…,0) at t = 0.
At time t: update the coloring as x_t = x_{t-1} + ε(γ^t_1, …, γ^t_n)   (ε tiny: 1/n suffices)

So x_t(i) = ε(γ^1_i + γ^2_i + … + γ^t_i).

Color of element i: does a random walk over time with steps ≈ ε N(0,1); it is fixed once it reaches -1 or +1.

Set S: x_t(S) = Σ_{i∈S} x_t(i) does a random walk with steps ε N(0, ≤ λ^2).
Analysis

Consider time T = O(1/ε^2).
Claim 1: With prob. 1/2, an element reaches -1 or +1.
Pf: Each element does a random walk (martingale) with step size ≈ ε. Recall: a random walk with step size 1 is ≈ t^{1/2} away after t steps.

Claim 2: Each set has O(λ) discrepancy in expectation.
Pf: For each S, x_t(S) does a random walk with step size ≈ ελ.

At time T = O((log n)/ε^2):
1. Prob. that an element is still floating < 1/(10n).
2. Expected discrepancy of a set = O(λ log^{1/2} n). (By Chernoff, w.h.p. all sets have discrepancy O(λ log n).)
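A stripped-down sketch of these dynamics (my simplification: i.i.d. N(0,1) steps instead of SDP-correlated ones, so it shows only how elements get fixed and roughly how long that takes):

```python
import random

random.seed(0)
n, eps = 64, 0.05
x = [0.0] * n
alive = set(range(n))

t = 0
while alive:
    t += 1
    for i in list(alive):
        x[i] += eps * random.gauss(0, 1)   # one step of element i's walk
        if abs(x[i]) >= 1:                 # sticky boundary: color is fixed
            x[i] = 1.0 if x[i] > 0 else -1.0
            alive.discard(i)

# All coordinates end at +/-1; the number of steps is on the order of
# (log n) / eps^2, matching the analysis.
```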
Recap
At each step of the walk, formulate an SDP on the floating variables; the SDP solution guides the walk.

Properties of the walk:
High variance → quick convergence.
Low variance for the discrepancy of the sets → low discrepancy.
Refinements
Spencer's six standard deviations result:
Recall: we want O(n^{1/2}) discrepancy, but a random coloring gives n^{1/2} (log n)^{1/2}.

The previous approach seems useless: the expected discrepancy of a set is O(n^{1/2}), but some of the random walks will deviate by up to a (log n)^{1/2} factor.

Fix: tune down the variance of the dangerous sets (there are not too many); the entropy method shows the SDP is still feasible.
(Danger thresholds at discrepancy 20 n^{1/2}, 30 n^{1/2}, 35 n^{1/2}, … define danger classes 1, 2, 3, …)
Further Developments
Can be derandomized [Bansal-Spencer'11].

Our algorithm still uses the entropy method, so it gives no new proof of Spencer's result.
Is there a purely constructive proof?
Lovett-Meka'12: Yes, via Gaussian random walks + linear algebra.
Matousek Lower Bound
Thm (Lovasz-Spencer-Vesztergombi'86): herdisc(A) ≥ (1/2) detlb(A), where

detlb(A) = max_k max_{k×k submatrices B of A} |det(B)|^{1/k}

Conjecture (LSV'86): herdisc ≤ O(1) · detlb.
Remark: For TU matrices, herdisc(A) = 1 and detlb = 1
(every square submatrix has det -1, 0, or +1).
Gaps between herdisc and detlb:
Hoffman: a gap of about 2.
Palvolgyi'11: a larger gap.
Matousek'11: herdisc(A) ≤ O(log n) · detlb(A).

Idea: our algorithm ⇒ the SDP relaxation is not too weak.
SDP duality ⇒ a dual witness for large herdisc(A).
Dual witness ⇒ a submatrix with large determinant.

Other implications:
In Conclusion

Various basic questions remain open in discrepancy.

Algorithmic questions:
– Conjecture (Matousek'11): disc(A) ≤ O(hervecdisc(A))
  (would imply the tight bound herdisc(A) = O(log m) · detlb(A))
– A constructive Banaszczyk bound for Beck-Fiala.
– Approximating the hereditary discrepancy?

Various other non-constructive methods: counting, topological, fixed points, …
What can be made constructive is not so well understood.