A Dependent LP-Rounding Approach for the k-Median Problem Moses Charikar 1 Shi Li 1 1 Department of...


A Dependent LP-Rounding Approach for the k-Median Problem

Moses Charikar¹, Shi Li¹

¹Department of Computer Science

Princeton University

ICALP 2012, Warwick, UK


Outline
• Introduction
• Linear Programming Relaxation
• Simple Pseudo-Approx. for k-median
• Our Algorithm for k-median


k-Median as a Clustering Problem

• Given: metric (X, d), k
• Partition X into k clusters
• Select a center for each cluster
• Minimize the sum of distances to the centers
• Quantifies how well a set can be divided into k partitions
[Figure: example clustering with k = 4]


k-Median in Operations Research
• Given metric (F ∪ C, d), k
  o F : set of facilities
  o C : set of clients
• Open k facilities
• Connect each client to its nearest open facility
• Minimize total connection cost
[Figure: example instance with k = 4]
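The objective on this slide can be sketched in a few lines of Python. This is only an illustration: the function name k_median_cost, the dist callback, and the toy points on a line are assumptions, not part of the talk.

```python
def k_median_cost(clients, open_facilities, dist):
    """Total connection cost when every client uses its nearest open facility."""
    open_list = list(open_facilities)
    if not open_list:
        raise ValueError("at least one facility must be open")
    # Each client pays the distance to the closest open facility.
    return sum(min(dist(i, j) for i in open_list) for j in clients)


def line_dist(a, b):
    """Hypothetical metric for the toy example: points on the real line."""
    return abs(a - b)


# Toy instance: four clients, two open facilities.
example_cost = k_median_cost([0, 1, 10, 11], [0, 10], line_dist)
```

Evaluating every size-k subset of F with this function is exponential in k, which is why the talk turns to LP rounding.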


Related Problem: Facility Location Problem
• Given metric (F ∪ C, d)
  o F : set of facilities
  o C : set of clients
  o {f_i ≥ 0 : i ∈ F} : f_i is the facility cost of opening i
• Open a set F' ⊆ F of facilities
• Connect each client to its nearest open facility
• Minimize the sum of facility cost and connection cost
[Figure: example instance]


Known Results

                     Approx.           Hardness of approx.
facility location    1.488 [Li11]      1.463 [GK98, Sri02]
k-median             3+ε* [AGK+01]     1+2/e+ε [JMS02]

• *Local search: if switching p facilities cannot improve a solution, then the solution is a (3+2/p)-approximation
• Integrality gap of the natural linear programming relaxation is between 2 and 3
  o the proof of the upper bound 3 is non-constructive


Our Results
• An LP-rounding approach for k-median
  o prove 3.25 approximation ratio
  o thus give a constructive proof of the upper bound 3.25 on the integrality gap
  o faster running time compared to the local search algorithm
  o potential to improve the 3+ε approximation
    • the upper bound 3.25 is not tight
    • our algorithm may already give an approximation ratio smaller than 3


Our Results

                       prev. best approx. ratio    our approx. ratio
k-facility location    [Zha06]                     3.25
matroid median         16 [KKN+11]                 9
knapsack median        ≥ 1000 [Kum12]              34

• k-facility location: facility location problem with constraint that at most k facilities can be open

• matroid median: the set of open facilities must be an independent set of a given matroid

• knapsack median problem: each facility has a cost, and the total cost of open facilities cannot exceed a budget B


Outline
• Introduction
• Linear Programming Relaxation
• Simple Pseudo-Approx. for k-median
• Our Algorithm for k-median


Natural LP Relaxation
• y_i ∈ {0,1}, i ∈ F : whether facility i is open
• x_{i,j} ∈ {0,1}, i ∈ F, j ∈ C : whether client j is connected to facility i
• Every client j must be connected to 1 facility
• Client j can only be connected to an open facility
• We can open at most k facilities
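The formulas that these three annotations label did not survive extraction. A standard way to write the natural LP relaxation, using the variables defined on this slide, is sketched here; it is the textbook form and is assumed, not copied, to match the slide.

```latex
\begin{align*}
\min \quad & \sum_{i \in F} \sum_{j \in C} d(i, j)\, x_{i,j} \\
\text{s.t.} \quad
  & \sum_{i \in F} x_{i,j} = 1 && \forall j \in C
    && \text{(every client is connected to 1 facility)} \\
  & x_{i,j} \le y_i && \forall i \in F,\ j \in C
    && \text{(only open facilities can be used)} \\
  & \sum_{i \in F} y_i \le k
    && && \text{(at most $k$ facilities are open)} \\
  & x_{i,j},\ y_i \in [0, 1] && \forall i \in F,\ j \in C
    && \text{(relaxation of $\{0,1\}$)}
\end{align*}
```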


Canonical Instance

• km facilities
• every client j is connected to its nearest m facilities
• in the LP solution, y_i = 1/m, x_{i,j} ∈ {0, 1/m}
[Figure: facilities and clients; client j]


Canonical Instance

• F_j : the set of m facilities that j is connected to
• d_av(j) : average distance from j to F_j
• d_max(j) : maximum distance from j to F_j
• LP value = Σ_{j ∈ C} d_av(j)
[Figure: facilities and clients; client j]
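A small Python sketch of these quantities may help. It assumes, as the previous slide states for the canonical instance, that F_j is simply the m facilities nearest to j; the function name canonical_stats and the dist callback are illustrative choices, not from the talk.

```python
def canonical_stats(clients, facilities, m, dist):
    """Return (F, d_av, d_max, lp_value) for a canonical instance.

    Assumption: F[j] is the list of the m facilities nearest to client j,
    each used with x_{i,j} = 1/m.
    """
    F, d_av, d_max = {}, {}, {}
    for j in clients:
        nearest = sorted(facilities, key=lambda i: dist(i, j))[:m]
        F[j] = nearest
        d_av[j] = sum(dist(i, j) for i in nearest) / m   # average distance
        d_max[j] = max(dist(i, j) for i in nearest)      # maximum distance
    # Each client contributes sum_i d(i, j) * (1/m) = d_av(j) to the LP value.
    lp_value = sum(d_av.values())
    return F, d_av, d_max, lp_value
```

For example, with facilities at 0, 2, 10, 12 on a line, clients at 1 and 4, and m = 2, the client at 4 gets F_j = [2, 0], d_av = 3 and d_max = 4.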


Outline
• Introduction
• Linear Programming Relaxation
• Simple Pseudo-Approx. for k-median
• Our Algorithm for k-median


Pseudo-Approximation
• An (α, c)-pseudo-approximation is a solution that opens at most αk facilities and whose connection cost is at most c times the optimal cost
• A warm-up: a (1 + ε, O(1/ε))-pseudo-approximation for k-median


Pseudo-Approximation

• Let m' = m/(1+ε), y'_i = (1+ε)y_i = 1/m'
• Every client only needs to connect to m' facilities
• We fractionally open km(1/m') = (1+ε)k facilities
• Define F'_j, d'_av(j), d'_max(j) similarly
[Figure: facilities and clients; client j]


Pseudo-Approximation

• Two clients j and j' conflict if F'_j ∩ F'_{j'} ≠ ∅
• Select a set C' of clients such that no two clients in C' conflict with each other
[Figure: facilities and clients; clients j and j']


Pseudo-Approximation

• Greedily construct C' ⊆ C with no conflicts
  o while C ≠ ∅
    • select j ∈ C with the minimum d_av(j)
    • add j to C'
    • remove j and all clients that conflict with j from C
[Figure: facilities and clients]
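The greedy loop above can be sketched as follows. The representation is an assumption made for illustration: F_prime maps each client to the set F'_j, d_av maps each client to the value used for the greedy order, and ties are broken arbitrarily.

```python
def greedy_conflict_free(F_prime, d_av):
    """Pick clients in increasing d_av order, dropping every conflicting client.

    Two clients conflict when their facility sets intersect.
    """
    remaining = set(F_prime)
    chosen = []
    while remaining:
        j = min(remaining, key=lambda c: d_av[c])
        chosen.append(j)
        # Keep only clients whose facility sets are disjoint from F'_j.
        remaining = {
            c for c in remaining
            if c != j and not (F_prime[c] & F_prime[j])
        }
    return chosen
```

By construction the sets F'_j of the returned clients are pairwise disjoint, and every dropped client shares a facility with a chosen client that was picked no later than it.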


Pseudo-Approximation

• Open facilities
  o For every j ∈ C', randomly open 1 of the m' facilities in F'_j
  o For any facility i that is not inside ∪_{j ∈ C'} F'_j, open i with probability 1/m'
• Connect each client to its nearest open facility
Fact: every facility is open with probability 1/m'
[Figure: facilities and clients]
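A minimal sketch of this rounding step, assuming m' is an integer, every F'_j has exactly m' members, and the sets of the chosen clients are pairwise disjoint. The name pseudo_round and the rng parameter are hypothetical conveniences for reproducible runs.

```python
import random


def pseudo_round(facilities, F_prime, chosen, m_prime, rng=random):
    """Open one uniformly random facility per chosen client, plus
    each unclaimed facility independently with probability 1/m'."""
    open_set = set()
    covered = set()
    for j in chosen:
        members = sorted(F_prime[j])
        open_set.add(rng.choice(members))   # each member opens w.p. 1/m'
        covered.update(members)
    for i in facilities:
        if i not in covered and rng.random() < 1.0 / m_prime:
            open_set.add(i)
    return open_set
```

Under the stated assumptions, each facility is opened with probability 1/m', which matches the fact on the slide; the number of open facilities is (1+ε)k only in expectation.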


Pseudo-Approximation

Lemma. E[C_j] ≤ O(1/ε)·d_av(j), where C_j is the connection cost of j

Proof. Enough to assume j ∉ C'
• ∃ j' ∈ C' s.t.
  o F'_j ∩ F'_{j'} ≠ ∅ and d'_av(j') ≤ d'_av(j)
• E[C_j] ≤ E[C_{j'}] + d(j, j')
         ≤ E[C_{j'}] + d'_max(j) + d'_max(j')
         ≤ d'_av(j') + (1/ε)d'_av(j') + (1/ε)d'_av(j')
         ≤ (1+2/ε)d'_av(j) ≤ (1+2/ε)d_av(j)
[Figure: facilities and clients; clients j and j' with facility sets F'_j and F'_{j'}]


Outline
• Introduction
• Linear Programming Relaxation
• Simple Pseudo-Approx. for k-median
• Our Algorithm for k-median


Barrier to Obtaining a True Approximation

• If ε = 0, then F'_j = F_j
• d_max(j) >> d_av(j) is possible
• With non-zero prob., j will be connected to facilities in F_{j'}
• The expected connection cost of j is unbounded compared to d_av(j)
[Figure: facilities and clients; clients j and j' with facility sets F_j and F_{j'}]


Remove the Barrier
• Solution: j only "claims" close facilities in F_j
• Let U_j be the set of claimed facilities
• Use U_j to replace F_j in the algorithm
• New Barrier: |U_j| < m might happen
• cannot guarantee that a facility in U_j is always open
[Figure: client j with sets F_j and U_j]


Remove the New Barrier

• can guarantee |U_j| ≥ m/2
• |U_j ∪ U_{j'}| ≥ m if U_j and U_{j'} are disjoint
• pair the clients in C'
• always open 1 facility (possibly 2 facilities) in U_j ∪ U_{j'} for a matched pair (j, j')
[Figure: clients j and j' with sets U_j and U_{j'}]


Remove the New Barrier

• How to open facilities for a matched pair?
• m boxes in a line
• Permute the facilities in U_j and put them in the leftmost |U_j| boxes
• Permute the facilities in U_{j'} and put them in the rightmost |U_{j'}| boxes
• Open the facilities in a randomly selected box
[Figure: a row of m boxes, with U_j filling boxes from the left and U_{j'} from the right]
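The box procedure can be sketched in Python as below. The function name open_for_pair and the rng parameter are illustrative assumptions. The sketch requires |U_j| ≤ m, |U_{j'}| ≤ m and |U_j| + |U_{j'}| ≥ m, which the previous slide guarantees for disjoint claimed sets of size at least m/2.

```python
import random


def open_for_pair(U_j, U_jp, m, rng=random):
    """Return the facilities to open for a matched pair (j, j')."""
    left, right = list(U_j), list(U_jp)
    if len(left) > m or len(right) > m or len(left) + len(right) < m:
        raise ValueError("need |U_j|, |U_j'| <= m and |U_j| + |U_j'| >= m")
    rng.shuffle(left)    # random permutation of U_j
    rng.shuffle(right)   # random permutation of U_j'
    boxes = [[] for _ in range(m)]
    for pos, i in enumerate(left):
        boxes[pos].append(i)            # leftmost |U_j| boxes
    for pos, i in enumerate(right):
        boxes[m - 1 - pos].append(i)    # rightmost |U_j'| boxes
    # Every box holds 1 or 2 facilities; open the content of a random box.
    return boxes[rng.randrange(m)]
```

Each facility sits in exactly one box, so it is opened with probability 1/m, and a box holding two facilities always contains one from each of the two sets.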


The Algorithm
• Filtering
  o 2 clients j and j' conflict if d(j, j') ≤ 4·max{d_av(j), d_av(j')}
  o while C ≠ ∅
    • select j ∈ C that minimizes d_av(j)
    • add j to C'
    • remove j and all clients that conflict with j from C
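A short sketch of the filtering phase, assuming d_av is given as a dictionary and dist is a metric on clients. The name filter_clients is hypothetical, and ties in d_av are broken by input order.

```python
def filter_clients(clients, d_av, dist):
    """Greedy filtering: returns the list C' of surviving clients."""
    remaining = list(clients)
    chosen = []
    while remaining:
        j = min(remaining, key=lambda c: d_av[c])
        chosen.append(j)
        # Drop j and every client c with d(c, j) <= 4 * max{d_av(c), d_av(j)}.
        remaining = [
            c for c in remaining
            if c != j and dist(c, j) > 4 * max(d_av[c], d_av[j])
        ]
    return chosen
```

Any two clients in the result are more than 4·max{d_av(j), d_av(j')} apart, and each removed client lies within 4·d_av of a kept client whose d_av is no larger.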


The Algorithm
• Filtering
• Claiming
  o For any j ∈ C', let 2R_j be the distance between j and its nearest neighbor in C'
  o A facility i is claimed by j if
    • i ∈ F_j and
    • d(i, j) ≤ R_j
    i.e., U_j = F_j ∩ Ball(j, R_j)
Fact: any client j ∈ C' will claim at least m/2 and at most m facilities.
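The claiming phase might look like this in code. The name claim_facilities is hypothetical, and the case of a single client in C' is not covered by the slide: letting that client claim all of F_j is an assumption made here.

```python
def claim_facilities(chosen, F, dist):
    """Return (R, U): the radius R_j and claimed set U_j for each j in C'."""
    R, U = {}, {}
    for j in chosen:
        others = [dist(j, jp) for jp in chosen if jp != j]
        if not others:
            # Assumption (not from the slide): a lone client claims all of F_j.
            R[j] = float("inf")
            U[j] = set(F[j])
            continue
        R[j] = min(others) / 2          # 2 * R_j = distance to nearest neighbor
        U[j] = {i for i in F[j] if dist(i, j) <= R[j]}   # F_j ∩ Ball(j, R_j)
    return R, U
```

Because two clients of C' are at distance at least 2·max{R_j, R_{j'}}, their balls meet in at most a boundary point, which is what keeps the claimed sets essentially disjoint.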


The Algorithm
• Filtering
• Claiming
• Matching
  o while there are at least 2 unmatched clients in C'
    • select 2 unmatched clients j and j' that minimize d(j, j')
    • match j and j'
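The greedy matching can be sketched as follows. The name greedy_match is hypothetical, and the quadratic scan over all pairs is chosen for clarity, not speed.

```python
from itertools import combinations


def greedy_match(chosen, dist):
    """Repeatedly match the closest pair of unmatched clients.

    Returns (pairs, leftover), where leftover is the single unmatched
    client when |C'| is odd, and None otherwise.
    """
    unmatched = list(chosen)
    pairs = []
    while len(unmatched) >= 2:
        j, jp = min(combinations(unmatched, 2),
                    key=lambda p: dist(p[0], p[1]))
        pairs.append((j, jp))
        unmatched = [c for c in unmatched if c not in (j, jp)]
    leftover = unmatched[0] if unmatched else None
    return pairs, leftover
```

With clients at 0, 10, 11, 30 and 100 on a line, the pair (10, 11) is matched first, then (0, 30), and 100 is left unmatched.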


The Algorithm
• Filtering
• Claiming
• Matching
• Rounding
  o For each matched pair (j, j'), open 1 or 2 facilities in U_j ∪ U_{j'}
  o If there is an unmatched client j, open 0 or 1 facility in U_j
  o For each facility i that is not inside any U_j, open i with probability 1/m
  o Connect each client to its nearest open facility


Proof of Constant Approx. Ratio

Lemma. E[C_j] ≤ O(1)·d_av(j), where C_j is the connection cost of j

Proof.
• it is enough to assume j ∈ C'
  o Assume j ∉ C'; then there exists a client j' ∈ C' such that d(j', j) ≤ 4d_av(j) and d_av(j') ≤ d_av(j)
  o Assume E[C_{j'}] ≤ α·d_av(j')
  o E[C_j] ≤ d(j, j') + E[C_{j'}] ≤ 4d_av(j) + α·d_av(j') ≤ (4+α)d_av(j)
• W.L.O.G., assume d_av(j) = 1


Proof of Constant Approx. Ratio

[Figure: clients j, j1, j2 with claimed sets U_j, U_{j1}, U_{j2} of radii R_j, R_{j1}, R_{j2}; j1 is the nearest neighbor of j in C'; j2 is matched with j1; distance labels 2R_j and 2R_{j1} ≤ 2R_j]

• There is always 1 facility open in U_{j1} ∪ U_{j2}
• Any facility in U_{j1} ∪ U_{j2} is at most 2R_j + 2R_{j1} + R_{j2} ≤ 5R_j away from j
• |U_j| ≥ m(1 − 1/R_j)
• with prob. 1 − 1/R_j, connect to a random facility in U_j
• only with prob. 1/R_j, connect to a facility that is 5R_j away
• E[C_j] ≤ 5
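The bound |U_j| ≥ m(1 − 1/R_j) is a counting (Markov-type) argument. A worked version under the normalisation d_av(j) = 1 is sketched here; the symbol t, the number of facilities of F_j farther than R_j from j, is introduced only for this derivation.

```latex
% Assumes d_av(j) = 1; t = |{ i in F_j : d(i, j) > R_j }| is a helper symbol.
\begin{align*}
m \cdot d_{av}(j) = \sum_{i \in F_j} d(i, j) \;\ge\; t \cdot R_j
  \quad &\Longrightarrow \quad
  t \;\le\; \frac{m \, d_{av}(j)}{R_j} = \frac{m}{R_j}, \\
|U_j| = |F_j \cap \mathrm{Ball}(j, R_j)| = m - t
  \;&\ge\; m \left( 1 - \frac{1}{R_j} \right), \\
\underbrace{\frac{1}{R_j}}_{\text{prob. of the far case}} \cdot
\underbrace{5 R_j}_{\text{distance in the far case}} \;&=\; 5 .
\end{align*}
```

The same count is consistent with the earlier fact that every client in C' claims at least m/2 facilities: two clients of C' do not conflict, so 2R_j > 4·d_av(j), which gives R_j > 2·d_av(j) and hence t < m/2.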


Proof of 3.25 approx. ratio

• complicated, details omitted
• rough idea: for a client j ∉ C'
  o j1 ∈ C' is the client that conflicts with j and removes it in the filtering phase
  o j2 ∈ C' is the nearest neighbor of j1 in C'
  o j3 ∈ C' is the client matched with j2
  o Consider the nearest open facility of j in F_j ∪ F_{j1} ∪ U_{j2} ∪ U_{j3}
• Our algorithm opens k facilities in expectation
• Can be easily transformed so that it always opens k facilities
• Algorithm naturally extends to the k-FL problem


Ongoing Work
• Joint work with Svensson improves on the best known approximation ratio (3+ε) for k-median


Summary
• We introduced an LP-rounding algorithm for the k-median problem
  o proved a 3.25 approximation ratio for the problem
  o it has the potential to improve the decade-old 3+ε approximation
• Improved approximation algorithms for the following problems
  o k-facility location problem: 3.25
  o Matroid median problem: 9
  o Knapsack median problem: 34


Thanks