LINEAR PROGRAMMING AND CIRCUIT IMBALANCES

László Végh
IPCO Summer School, Georgia Tech, May 2021
Slides available at https://personal.lse.ac.uk/veghl/ipco

Transcript of the lecture slides.
Linear Programming (LP) in Operations Research
min c^T x   s.t.   Ax = b,  x ≥ 0
§ L: total bit-complexity of the rational input (A, b, c)
§ Simplex method: Dantzig, 1947
§ Interior point method: Karmarkar, 1984
Weakly vs strongly polynomial algorithms for LP
§ n variables, m constraints, total encoding length L.
§ Strongly polynomial algorithm: the number of arithmetic operations is polynomial in n and m, independent of L.
§ PSPACE: the bit-lengths of the numbers appearing during the algorithm remain polynomially bounded in the size of the input.
§ Can also be defined in the real model of computation.
min c^T x   s.t.   Ax = b,  x ≥ 0
Is there a strongly polynomial algorithm for Linear Programming?
Strongly polynomial algorithms for some classes of Linear Programs
§ Systems of linear inequalities with at most two nonzero variables per inequality: Megiddo ’83
§ Network flow problems
§ Min-cost flow: Tardos ’85, Fujishige ’86, Goldberg-Tarjan ’89, Orlin ’93, …
§ Generalized flow: V ’17, Olver-V ’20
§ Discounted Markov Decision Processes: Ye ’05, Ye ’11, …
Dependence on the constraint matrix only
§ Algorithms with running time dependent only on A, but not on b and c.
§ Combinatorial LPs: integer matrix A ∈ ℤ^{m×n}. Δ_A = max{ |det(B)| : B submatrix of A }
Tardos ’86: poly(n, m, log Δ_A) black box LP algorithm
§ Layered-least-squares (LLS) interior point method, Vavasis-Ye ’96: poly(n, m, log χ̄_A) LP algorithm in the real model of computation. χ̄_A: condition number
§ Dadush-Huiberts-Natura-V ’20: poly(n, m, log χ̄*_A). χ̄*_A: optimized version of χ̄_A
min c^T x   s.t.   Ax = b,  x ≥ 0
Outline of the lectures
1. Tardos's algorithm for min-cost flows
2. The circuit imbalance measure κ and the condition measure χ̄
3. Solving LPs: from approximate to exact
4. Optimizing circuit imbalances
5. Interior point methods: basic concepts
6. Layered-least-squares interior point methods
§ Dadush-Huiberts-Natura-V ’20: A scaling-invariant algorithm for linear programming whose running time depends only on the constraint matrix
§ Dadush-Natura-V ’20: Revisiting Tardos’s framework for linear programming: Faster exact solutions using approximate solvers
Part 1
Tardos's algorithm for min-cost flows: circuits, proximity, and variable fixing
The minimum-cost flow problem
§ Directed graph G = (V, E), node demands b: V → ℝ with Σ_{i∈V} b_i = 0, costs c: E → ℝ.
min c^T f   s.t.   Σ_{j:(j,i)∈E} f_{ji} − Σ_{j:(i,j)∈E} f_{ij} = b_i  ∀i ∈ V,   f ≥ 0
§ The form with arc capacities can be reduced to this form.
§ The constraint matrix is totally unimodular (TU).
The minimum-cost flow problem: optimality
§ Directed graph G = (V, E), node demands b: V → ℝ with Σ_{i∈V} b_i = 0, costs c: E → ℝ.
min c^T f   s.t.   Σ_{j:(j,i)∈E} f_{ji} − Σ_{j:(i,j)∈E} f_{ij} = b_i  ∀i ∈ V,   f ≥ 0
§ Optimality: f_{ij} > 0 ⟹ π_j − π_i = c_{ij}

Dual solutions: potentials
max b^T π   s.t.   π_j − π_i ≤ c_{ij}  ∀(i,j) ∈ E
§ Residual cost: c^π_{ij} = c_{ij} + π_i − π_j ≥ 0
§ Residual graph: E_f = E ∪ {(j,i) : f_{ij} > 0}, with c_{ji} = −c_{ij}
LEMMA: A primal feasible f is optimal ⟺ ∃π: c^π_{ij} ≥ 0 for all (i,j) ∈ E and c^π_{ij} = 0 if f_{ij} > 0 ⟺ ∃π: c^π_{ij} ≥ 0 for all (i,j) ∈ E_f.
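The lemma can also be checked computationally: f is optimal exactly if the residual graph E_f carries no negative-cost cycle, in which case shortest-path distances serve as the potentials π. Below is a minimal Python sketch of this check (not from the lecture; the dict-based graph encoding is a made-up convention), using Bellman-Ford.

```python
# A minimal sketch: test optimality of a feasible flow via the residual graph.
# If no negative residual cycle exists, Bellman-Ford distances are potentials.

def optimality_certificate(nodes, arcs, cost, f):
    """arcs: list of (i, j); cost, f: dicts keyed by (i, j).
    Returns (True, pi) with c_ij + pi_i - pi_j >= 0 on E_f, or (False, None)."""
    # Residual arcs: all forward arcs, plus reversed arcs where f_ij > 0.
    res = [(i, j, cost[(i, j)]) for (i, j) in arcs]
    res += [(j, i, -cost[(i, j)]) for (i, j) in arcs if f[(i, j)] > 0]

    # Bellman-Ford from a virtual source connected to every node at cost 0.
    d = {v: 0.0 for v in nodes}
    for _ in range(len(nodes)):
        updated = False
        for (i, j, c) in res:
            if d[i] + c < d[j] - 1e-12:
                d[j] = d[i] + c
                updated = True
        if not updated:
            return True, d        # pi = d certifies optimality
    return False, None            # negative residual cycle: f is not optimal

# Tiny example: pushing flow around the 3-cycle is profitable, so f = 0 is
# not optimal.
nodes = [0, 1, 2]
arcs = [(0, 1), (1, 2), (2, 0)]
cost = {(0, 1): 1.0, (1, 2): 1.0, (2, 0): -3.0}
f = {a: 0.0 for a in arcs}
print(optimality_certificate(nodes, arcs, cost, f))   # (False, None)
```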
Variable fixing by proximity
§ If for some (i,j) ∈ E we can show that f*_{ij} = 0 in every optimal solution, then we can remove (i,j) from the graph.
§ Overall goal: in a strongly polynomial number of steps, guarantee that we can infer this for at least one arc.
PROXIMITY THEOREM: Let π̃ be an optimal dual potential for the costs c̃, and f* an optimal primal solution for the original costs c. Then
c̃^π̃_{ij} > n·‖c − c̃‖_∞   ⟹   f*_{ij} = 0
Circulations and cycle decompositions
§ For the node-arc incidence matrix A, ker(A) ⊆ ℝ^E is the set of circulations:
in-flow = out-flow at every node
§ LEMMA: Let f and f′ be two feasible flows for the same demand vector b. Then we can write
f′ = f + Σ_{k=1}^h λ_k·χ_{C_k},   λ_k > 0
for sign-consistent directed cycles C_k:
§ If f′_{ij} > f_{ij}, then cycles may only contain (i,j) but not (j,i).
§ If f_{ij} > f′_{ij}, then cycles may only contain (j,i) but not (i,j).
§ If f_{ij} = f′_{ij}, then no cycle contains (i,j) or (j,i).
Every cycle is moving from f towards f′.
Circulations and cycle decompositions
PROOF:
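The proof peels cycles off the difference of the two flows. The following Python sketch implements that peeling under the lemma's assumptions (the dict-based flow encoding is hypothetical); it relies on the fact that the difference is a circulation, so any walk along its support eventually closes a cycle.

```python
# A minimal sketch of the peeling argument: orient each arc in the direction
# flow must move from f1 towards f2, then repeatedly extract a cycle and
# subtract its bottleneck weight.

def cycle_decomposition(arcs, f1, f2):
    """arcs: list of (i, j); f1, f2: flow dicts with the same demands.
    Returns [(cycle_arcs, weight)] with f2 = f1 + sum of weighted cycles."""
    d = {}
    for (i, j) in arcs:
        if f2[(i, j)] > f1[(i, j)]:
            d[(i, j)] = f2[(i, j)] - f1[(i, j)]   # only (i, j) may be used
        elif f1[(i, j)] > f2[(i, j)]:
            d[(j, i)] = f1[(i, j)] - f2[(i, j)]   # only (j, i) may be used
    cycles = []
    while d:
        # Walk along positive arcs until a node repeats; conservation of the
        # circulation d guarantees the walk can always continue.
        out = {}
        for (i, j) in d:
            out[i] = j
        walk, seen, v = [], {}, next(iter(out))
        while v not in seen:
            seen[v] = len(walk)
            walk.append(v)
            v = out[v]
        cyc_nodes = walk[seen[v]:]
        cyc = [(cyc_nodes[k], cyc_nodes[(k + 1) % len(cyc_nodes)])
               for k in range(len(cyc_nodes))]
        w = min(d[a] for a in cyc)
        cycles.append((cyc, w))
        for a in cyc:                              # peel the cycle off
            d[a] -= w
            if d[a] == 0:
                del d[a]
    return cycles

arcs = [(0, 1), (1, 2), (2, 0)]
f1 = {a: 0 for a in arcs}
f2 = {a: 2 for a in arcs}
print(cycle_decomposition(arcs, f1, f2))   # one directed cycle of weight 2
```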
PROXIMITY THEOREM: Let π̃ be an optimal dual potential for the costs c̃, and f* an optimal primal solution for the original costs c. Then
c̃^π̃_{ij} > n·‖c − c̃‖_∞   ⟹   f*_{ij} = 0
PROOF IDEA: Let f̃ be an optimal flow for c̃, and decompose f* = f̃ + Σ_k λ_k·χ_{C_k} into sign-consistent cycles. If f*_{ij} > 0 for an arc with c̃^π̃_{ij} > n·‖c − c̃‖_∞, then f̃_{ij} = 0 by complementary slackness, so some cycle C_k uses (i,j). The cycle lies in the residual graph of f̃, where c̃^π̃ ≥ 0, so its c̃-cost is at least c̃^π̃_{ij}; since the c-cost and the c̃-cost of a cycle of length ≤ n differ by at most n·‖c − c̃‖_∞, its c-cost is still positive. Cancelling C_k from f* would then decrease c^T f*, a contradiction.
Rounding the costs
§ Rescale c̃ such that ‖c̃‖_∞ = n²
§ Round the costs as c̄_{ij} = ⌊c̃_{ij}⌋
§ For c̄ we can find optimal primal and dual solutions in strongly polynomial time, e.g. with the Out-of-Kilter method of Ford and Fulkerson, 1962.
§ For the optimal dual π̄, fix to 0 all arcs that have
c̄^π̄_{ij} > n ≥ n·‖c̃ − c̄‖_∞
§ QUESTION: Why would such an arc exist?
Minimum-norm projections
§ Residual cost: c^π_{ij} = c_{ij} + π_i − π_j
§ The cost vectors C = {c^π : π ∈ ℝ^V} ⊂ ℝ^E form an affine subspace.
§ For any feasible flow f and any residual cost c^π,
(c^π)^T f = c^T f − b^T π,
i.e. the two objectives differ by a constant independent of f.
§ Solving the problem for c and for c^π is equivalent.
§ If 0 ∈ C, i.e. ∃π: c^π ≡ 0, then every feasible flow is optimal.
§ IDEA: Replace the input c by the min-norm projection to the affine subspace C:
c̃ = argmin{ ‖c′‖₂ : c′ ∈ C }
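Computing this projection is a linear least squares problem: c^π = c + Mπ, where M encodes the potential differences. A small numpy sketch on a made-up toy instance:

```python
# A minimal sketch of the min-norm idea: project the cost vector onto the
# affine subspace C = {c^pi : pi} by least squares.
import numpy as np

arcs = [(0, 1), (1, 2), (2, 0), (0, 2)]
c = np.array([1.0, 1.0, -3.0, 2.0])
n_nodes = 3

# c^pi = c + M pi: row (i, j) of M has +1 in column i and -1 in column j.
M = np.zeros((len(arcs), n_nodes))
for k, (i, j) in enumerate(arcs):
    M[k, i], M[k, j] = 1.0, -1.0

pi, *_ = np.linalg.lstsq(M, -c, rcond=None)   # argmin ||c + M pi||_2
c_tilde = c + M @ pi
print(c_tilde)   # same min-cost flow problem, smallest-norm equivalent costs
```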
Rounding the costs
§ Assume c is chosen as a min-norm projection:
‖c^π‖₂ ≥ ‖c‖₂   ∀π ∈ ℝ^V
§ Rescale c such that ‖c‖_∞ = n²
§ Round the costs as c̄_{ij} = ⌊c_{ij}⌋
§ For the optimal dual π̄, fix to 0 all arcs that have c̄^π̄_{ij} > n ≥ n·‖c − c̄‖_∞
§ LEMMA: There exists at least one such arc.
PROOF SKETCH: By optimality, c̄^π̄ ≥ 0 on the residual arcs, and by the min-norm choice, ‖c^π̄‖ ≥ ‖c‖. Since ‖c‖_∞ = n² and rounding changes every coordinate by less than 1, some coordinate of c̄^π̄ must still be much larger than n ≥ n·‖c − c̄‖_∞.
Summary of Tardos's algorithm
§ Variable fixing based on proximity, which can be shown via cycle decomposition.
§ Round to small integer costs c̄.
§ Find an optimal dual π̄ for c̄ with a simple classical method.
§ Identify a variable with f*_{ij} = 0 as one where c̄^π̄_{ij} is large, and remove all such arcs.
§ Iterate.
Outline of the lectures
1. Tardos's algorithm for min-cost flows
2. The circuit imbalance measure κ and the condition measure χ̄
3. Solving LPs: from approximate to exact
4. Optimizing circuit imbalances
5. Interior point methods: basic concepts
6. Layered-least-squares interior point methods
Part 2
The circuit imbalance measure κ and the condition measure χ̄
The circuit imbalance measure
§ The matrix A ∈ ℝ^{m×n} defines a linear matroid on [n] = {1, 2, …, n}: a set S ⊆ [n] is independent if the columns {A_i : i ∈ S} are linearly independent.
§ C ⊆ [n] is a circuit if {A_i : i ∈ C} is a linearly dependent set, minimal for containment. C_A denotes the set of circuits of A.
§ For a circuit C, there exists a vector g^C ∈ ℝ^C, unique up to a scalar multiplier, such that
Σ_{i∈C} g^C_i·A_i = 0
§ The circuit imbalance measure is defined as
κ_A = max{ |g^C_j| / |g^C_i| : C ∈ C_A, i, j ∈ C }
§ This measure depends only on the linear subspace W = ker(A): if ker(A) = ker(A′), then κ_A = κ_{A′}.
§ We will use κ_W = κ_A for W = ker(A).
Connection to subdeterminants
§ For an integer matrix A ∈ ℤ^{m×n}, Δ_A = max{ |det(B)| : B submatrix of A }
§ For a circuit C ∈ C_A with |C| = k, let B = A_{R,C} ∈ ℝ^{(k−1)×k} be a submatrix with linearly independent rows.
§ Let B^{(i)} ∈ ℝ^{(k−1)×(k−1)} denote B with the i-th column removed. By Cramer's rule, up to scaling,
g^C = ± ( det B^{(1)}, −det B^{(2)}, …, (−1)^{k−1}·det B^{(k)} )
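A quick numeric check of this formula on a toy integer matrix, assuming the circuit is C = {1, 2, 3} (all three columns); the sign pattern is fixed only up to a global scalar:

```python
# Worked check: circuit vector from subdeterminants vs. the kernel of A.
import numpy as np
from itertools import permutations

A = np.array([[1.0, 1.0, 2.0],
              [0.0, 1.0, 1.0]])   # rows already linearly independent
k = A.shape[1]

# g_i = (-1)^i det(B^(i)), where B^(i) = A with the i-th column removed.
g = np.array([(-1) ** i * np.linalg.det(np.delete(A, i, axis=1))
              for i in range(k)])
print(g)        # [-1. -1.  1.] up to scaling
print(A @ g)    # ~[0, 0]: a circuit vector in ker(A)

# Circuit imbalance contributed by this single circuit:
print(max(abs(g[i] / g[j]) for i, j in permutations(range(k), 2)))
```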
Properties of κ_A
§ LEMMA: For an integer matrix A ∈ ℤ^{m×n}, κ_A ≤ Δ_A. For a totally unimodular matrix A, κ_A = 1.
§ EXERCISE:
i. If A is the node-edge incidence matrix of an undirected graph, then κ_A ∈ {1, 2}.
ii. For the incidence matrix of the complete undirected graph on n nodes, Δ_A ≥ 2^{⌊n/3⌋}.
THEOREM (Cederbaum, 1958): If A ∈ ℝ^{m×n} is a TU matrix, then κ_A = 1. Conversely, if κ_W = 1 for a linear subspace W ⊂ ℝ^n, then there exists a TU matrix A such that W = ker(A).
PROOF (Ekbatani & Natura):
Duality of circuit imbalances
κ_W = κ_{W^⊥}
Circuits in optimization
§ Appear in various LP algorithms, directly or indirectly
§ IPCO summer school 2020: Laura Sanità’s lectures discussed circuit augmentation algorithms and circuit diameter
§ …
The condition number χ̄_A
χ̄_A = sup{ ‖A^T (A D A^T)^{-1} A D‖ : D positive diagonal matrix }
§ Measures the norm of oblique projections
§ Introduced by Dikin 1967, Stewart 1989, Todd 1990
§ THEOREM (Vavasis-Ye 1996): There exists a poly(n, m, log χ̄_A) LP algorithm for min c^T x, Ax = b, x ≥ 0, A ∈ ℝ^{m×n}
§ LEMMA:
i. If A is an integer matrix with bit encoding length L, then χ̄_A ≤ 2^{O(L)}
ii. χ̄_A = max{ ‖B^{-1}A‖ : B nonsingular m×m submatrix of A }
iii. χ̄_A only depends on the subspace W = ker(A)
iv. χ̄_W = χ̄_{W^⊥}
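Characterization ii. suggests a brute-force computation of χ̄_A for tiny matrices; a numpy sketch (exponential in n, for illustration only):

```python
# Brute-force chi-bar: maximize ||B^{-1} A|| over nonsingular m x m column
# submatrices B, following characterization (ii) above.
import numpy as np
from itertools import combinations

def chi_bar(A, tol=1e-9):
    m, n = A.shape
    best = 0.0
    for cols in combinations(range(n), m):
        B = A[:, cols]
        if abs(np.linalg.det(B)) > tol:            # nonsingular submatrix
            best = max(best, np.linalg.norm(np.linalg.solve(B, A), 2))
    return best

A = np.array([[1.0, 0.0, 1.0, 2.0],
              [0.0, 1.0, 1.0, -1.0]])
print(chi_bar(A))
```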
The lifting operator
§ For a linear subspace W ⊂ ℝ^n and index set I ⊆ [n], we let
π_I : ℝ^n → ℝ^I
denote the coordinate projection, and π_I(W) = {x_I : x ∈ W}.
§ The lifting operator L^W_I : π_I(W) → ℝ^n is defined as
L^W_I(p) = argmin{ ‖z‖₂ : z ∈ W, z_I = p }
§ This is a linear operator; we can efficiently compute a matrix M ∈ ℝ^{n×I} such that L^W_I(p) = M·p.
§ LEMMA:
κ_W = max_{I⊆[n]} max{ ‖(L^W_I(p))_{[n]∖I}‖_∞ / ‖p‖₁ : p ∈ π_I(W) ∖ {0} }
The lifting operator
LEMMA: κ_W = max_{I⊆[n]} max{ ‖(L^W_I(p))_{[n]∖I}‖_∞ / ‖p‖₁ : p ∈ π_I(W) ∖ {0} }
PROOF:
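The lift itself is a constrained least squares problem, so it can be computed with standard linear algebra. A numpy sketch (V is any matrix whose columns span W; the function name and encoding are ad hoc):

```python
# Min-norm lifting: z = argmin ||z||_2 over z in W with z_I = p.
import numpy as np

def lift(V, I, p):
    """V: columns span W; p must lie in pi_I(W)."""
    VI = V[I, :]
    w0, *_ = np.linalg.lstsq(VI, p, rcond=None)      # some preimage of p
    # Null space of VI: directions that leave z_I unchanged.
    _, sv, Vt = np.linalg.svd(VI)
    rank = int((sv > 1e-10).sum())
    N = Vt[rank:, :].T
    if N.shape[1] > 0:
        u, *_ = np.linalg.lstsq(V @ N, -V @ w0, rcond=None)
        w0 = w0 + N @ u                              # minimize ||V w||_2
    return V @ w0

# Toy subspace W = span of two vectors in R^4; I = {0, 1}.
V = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, -1.0]])
I = [0, 1]
x = V @ np.array([1.0, 2.0])        # some x in W
z = lift(V, I, x[I])
print(z)                            # min-norm z in W with z_I = x_I
```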
The condition numbers χ̄_A and κ_A
Approximability of κ_A and χ̄_A:
LEMMA (Tunçel 1999): It is NP-hard to approximate χ̄_A by a factor better than 2^{poly(rank(A))}.
THEOREM: For every matrix A ∈ ℝ^{m×n} with n ≥ 2,
√(1 + κ_A²) ≤ χ̄_A ≤ n·κ_A
Outline of the lectures
1. Tardos's algorithm for min-cost flows
2. The circuit imbalance measure κ and the condition measure χ̄
3. Solving LPs: from approximate to exact
4. Optimizing circuit imbalances
5. Interior point methods: basic concepts
6. Layered-least-squares interior point methods
Part 3
Solving LPs: from approximate to exact
Fast approximate LP algorithms
min c^T x   s.t.   Ax = b,  x ≥ 0
§ n variables, m equality constraints
§ Approximately optimal: c^T x ≤ OPT + ε
§ Finding an approximate solution with log(1/ε) running time dependence implies a weakly polynomial exact algorithm.
§ Randomized vs. deterministic
§ Significant recent progress:
§ Õ((nnz(A) + n²)·√n·log(1/ε)): Lee-Sidford ’13–’19
§ Cohen, Lee, Song ’19
§ van den Brand ’20
§ van den Brand, Lee, Liu, Saranurak, Sidford, Song, Wang ’21
Some important techniques:
§ weighted and stochastic central paths
§ fast approximate linear algebra
§ efficient data structures
Fast exact LP algorithms with χ̄_A dependence
min c^T x   s.t.   Ax = b,  x ≥ 0
§ n variables, m equality constraints
THEOREM (Dadush, Natura, V. ’20): There exists a poly(n, m, log χ̄_A) algorithm for solving LP exactly.
§ Feasibility: m calls to an approximate solver
§ Optimization: mn calls to an approximate solver, with ε = 1/poly(n, χ̄_A). Using van den Brand ’20, this gives a deterministic exact O(mn^{ω+1}·log²(n)·log(χ̄_A + n)) time LP optimization algorithm.
§ Generalization of Tardos ’86 for real constraint matrices, working directly with approximate solvers.
§ Main difference: the arguments in Tardos ’86 rely heavily on integrality assumptions.
Hoffman’s proximity theorem Polyhedron = ∈ ;: ≤ , point G ∉ , norms . H, . I
THEOREM (Hoffman, 1952): There exists a constant H,I() such that
∃ ∈ : − G H ≤ H,I() G − F I

$

§ Subspace form: W = ker , ∈ ; s.t. =
min D = ≥ 0
max D D + =
≥ 0
max D( − ) ∈ E + ≥ 0
Proximity theorem with κ_W
THEOREM: For A ∈ ℝ^{m×n}, d ∈ ℝ^n, consider the system
x ∈ d + W,  x ≥ 0.
There exists a feasible solution x such that
‖x − d‖_∞ ≤ κ_W·‖d⁻‖₁
(d⁻ denotes the negative part: d⁻_i = max{0, −d_i}.)
PROOF:
The linear feasibility algorithm
Linear feasibility problem:
x ∈ d + W,  x ≥ 0
§ Recursive algorithm, using a stronger problem formulation:
find x with  x ∈ d + W,  x ≥ 0,  ‖x − d‖_∞ ≤ c′·κ_W²·‖d⁻‖₁
§ Black box oracle for ε = 1/poly(n, κ_W): returns x with
x ∈ d + W,  ‖x − d‖_∞ ≤ c′·‖d⁻‖₁,  ‖x⁻‖_∞ ≤ ε·‖d⁻‖₁
The linear feasibility algorithm
1. Call the black box solver to find a solution x for ε = 1/poly(n, κ_W).
2. Set J = {i ∈ [n] : x_i < κ_W·‖d⁻‖₁}; assume J ≠ [n].
3. Recursively obtain x̃ ∈ ℝ^J, x̃ ≥ 0, from the instance (π_J(W), x_J).
4. Return x + L^W_J(x̃ − x_J).
§ If J = [n], then we replace d by its projection to W^⊥.
§ Bound of m on the number of recursive calls; combined with van den Brand ’20, this yields an Õ(m·n^ω·log(κ_W + n)) feasibility algorithm.
The linear feasibility algorithm
§ In case of infeasibility, we return an exact Farkas certificate.
§ κ_W is hard to approximate within a factor 2^{O(m)}; Tunçel 1999.
§ We use an estimate κ̂ in the algorithm.
§ The algorithm may fail if ‖L^W_J(x̃ − x_J)‖_∞ > κ̂·‖x̃ − x_J‖₁
§ In this case, we restart with
κ̂ ← max{ 2κ̂, ‖L^W_J(x̃ − x_J)‖_∞ / ‖x̃ − x_J‖₁ }
§ Our estimate never overshoots κ_W by much, but can be significantly better.
Proximity for optimization
THEOREM: Let s ∈ c + W^⊥, s ≥ 0 be a feasible dual solution, and assume the primal is also feasible. Then there exists a primal optimal x* ∈ d + W, x* ≥ 0 such that
‖x* − d‖_∞ ≤ κ_W·( ‖d⁻‖₁ + ‖d_{supp(s)}‖₁ )
min c^T x,  x ∈ d + W,  x ≥ 0   |   max d^T(c − s),  s ∈ c + W^⊥,  s ≥ 0
The optimization algorithm
§ ≤ n Outer Loops, each comprising ≤ m Inner Loops.
§ Each Outer Loop finds c̃ with ‖c − c̃‖ “small”, and (x̃, s̃) primal and dual optimal solutions to
min c̃^T x   s.t.   x ∈ d + W,  x ≥ 0
§ Using proximity, we can conclude that s*_J > 0, and hence x*_J = 0, for a certain variable set J ⊆ [n], and recurse.
min c^T x,  x ∈ d + W,  x ≥ 0   |   max d^T(c − s),  s ∈ c + W^⊥,  s ≥ 0
Outline of the lectures
1. Tardos's algorithm for min-cost flows
2. The circuit imbalance measure κ and the condition measure χ̄
3. Solving LPs: from approximate to exact
4. Optimizing circuit imbalances
5. Interior point methods: basic concepts
6. Layered-least-squares interior point methods
Part 4
Optimizing circuit imbalances
Diagonal rescaling of LP
For a positive diagonal matrix D, substituting x = Dx′ and s′ = Ds:

min c^T x,  Ax = b,  x ≥ 0        max b^T y,  A^T y + s = c,  s ≥ 0
⟺
min (Dc)^T x′,  ADx′ = b,  x′ ≥ 0        max b^T y′,  (AD)^T y′ + s′ = Dc,  s′ ≥ 0

§ A natural symmetry of LPs and of many LP algorithms.
§ The central path is invariant under diagonal rescaling.
§ Most “standard” interior point methods are invariant.
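The invariance is easy to verify numerically: if (x, y, s) satisfies Ax = b, A^T y + s = c and x_i·s_i = μ (the central path conditions of Part 5), then (D^{-1}x, y, Ds) satisfies the same conditions for the rescaled data (AD, b, Dc). A numpy check on synthetic data, where the point is reverse-engineered to lie exactly on the central path:

```python
# Checking diagonal-rescaling invariance of the central path conditions.
import numpy as np

rng = np.random.default_rng(0)
m, n, mu = 2, 5, 0.7
A = rng.standard_normal((m, n))
x = rng.uniform(1.0, 2.0, n)       # choose x > 0 ...
s = mu / x                         # ... and s with x_i s_i = mu
y = rng.standard_normal(m)
b, c = A @ x, A.T @ y + s          # reverse-engineer b, c: (x, y, s) = w(mu)

D = np.diag(rng.uniform(0.5, 3.0, n))
AD, cD = A @ D, D @ c
x2, s2 = np.linalg.solve(D, x), D @ s
print(np.allclose(AD @ x2, b),
      np.allclose(AD.T @ y + s2, cD),
      np.allclose(x2 * s2, mu))    # True True True
```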
Dependence on the constraint matrix only
§ Algorithms with running time dependent only on A, but not on b and c.
§ Combinatorial LPs: integer matrix A ∈ ℤ^{m×n}. Δ_A = max{ |det(B)| : B submatrix of A }
Tardos ’86: poly(n, m, log Δ_A) LP algorithm
§ Layered-least-squares (LLS) interior point method, Vavasis-Ye ’96: poly(n, m, log χ̄_A) LP algorithm in the real model of computation. χ̄_A: condition number
§ Dadush-Huiberts-Natura-V ’20: poly(n, m, log χ̄*_A). χ̄*_A: optimized version of χ̄_A
min c^T x   s.t.   Ax = b,  x ≥ 0
Optimizing χ̄ and κ by rescaling
D = set of n×n positive diagonal matrices
χ̄*_A = inf{ χ̄_{AD} : D ∈ D },  κ*_A = inf{ κ_{AD} : D ∈ D }
§ A scaling-invariant algorithm with χ̄_A dependence automatically yields χ̄*_A dependence.
§ Recall: √(1 + κ_A²) ≤ χ̄_A ≤ n·κ_A.
THEOREM (Tunçel 1999): It is NP-hard to approximate χ̄_A (and thus κ_A) by a factor better than 2^{poly(rank(A))}.
THEOREM (Dadush-Huiberts-Natura-V ’20): Given A ∈ ℝ^{m×n}, in O(n²m² + n³) time, one can
• approximate the value of κ*_A within a factor n·(κ*_A)², and
• compute a rescaling D ∈ D satisfying κ_{AD} ≤ n·(κ*_A)³.
Approximating κ*_A
D = set of n×n positive diagonal matrices,  κ*_A = inf{ κ_{AD} : D ∈ D }
§ EXAMPLE: Let ker(A) = span( (0,1,1,M), (1,0,M,1) )
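The example can be explored computationally: enumerate the circuits of W = ker(A) by brute force and take the worst entry ratio. In the sketch below, M is a free parameter, and the rescaling D at the end is an arbitrary choice to experiment with, not a claimed optimizer:

```python
# Brute-force kappa of W = ker(A): circuits are minimal dependent column
# sets of A; exponential in n, for tiny examples only.
import numpy as np
from itertools import combinations, permutations

def kappa(A, tol=1e-9):
    m, n = A.shape
    best = 1.0
    for k in range(2, n + 1):
        for C in combinations(range(n), k):
            B = A[:, C]
            # circuit: rank k-1, and every proper column subset independent
            if np.linalg.matrix_rank(B, tol) != k - 1:
                continue
            if any(np.linalg.matrix_rank(np.delete(B, d, axis=1), tol) < k - 1
                   for d in range(k)):
                continue
            _, _, Vt = np.linalg.svd(B)
            g = Vt[-1]                     # null vector supported on C
            best = max(best, max(abs(g[i] / g[j])
                                 for i, j in permutations(range(k), 2)))
    return best

M = 100.0
# W = span{(0,1,1,M), (1,0,M,1)}; A spans the orthogonal complement.
Wbasis = np.array([[0.0, 1.0, 1.0, M], [1.0, 0.0, M, 1.0]])
_, _, Vt = np.linalg.svd(Wbasis)
A = Vt[2:]                                 # ker(A) = W
print(kappa(A))
D = np.diag([1.0, 1.0, 1.0, 1.0 / M])
print(kappa(A @ D))                        # imbalance of ker(AD) = D^{-1} W
```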
Pairwise circuit imbalances
§ For a circuit C, there exists a vector g^C ∈ ℝ^C, unique up to a scalar multiplier, such that
Σ_{i∈C} g^C_i·A_i = 0
§ Pairwise circuit imbalance:
κ_{ij} = max{ |g^C_j| / |g^C_i| : C ∈ C_A, i, j ∈ C }
§ The circuit imbalance measure is
κ_A = max_{i,j∈[n]} κ_{ij}

LEMMA: For any directed cycle H on {1, 2, …, n},
(κ*_A)^{|H|} ≥ ∏_{(i,j)∈H} κ_{ij}

Circuit imbalance min-max formula
THEOREM: κ*_A = max{ ( ∏_{(i,j)∈H} κ_{ij} )^{1/|H|} : H a directed cycle on [n] }
§ BUT: Computing the κ_{ij} values is NP-complete…
§ LEMMA: For any circuit C ∈ C_A s.t. i, j ∈ C,
|g^C_j| / |g^C_i| ≥ κ_{ij} / (κ*_A)²
Outline of the lectures
1. Tardos's algorithm for min-cost flows
2. The circuit imbalance measure κ and the condition measure χ̄
3. Solving LPs: from approximate to exact
4. Optimizing circuit imbalances
5. Interior point methods: basic concepts
6. Layered-least-squares interior point methods
Part 5
Interior point methods: basic concepts
Primal and dual LP
§ Matrix form: A ∈ ℝ^{m×n}, b ∈ ℝ^m, c ∈ ℝ^n
min c^T x,  Ax = b,  x ≥ 0   |   max b^T y,  A^T y + s = c,  s ≥ 0
§ Subspace form: W = ker(A), d ∈ ℝ^n s.t. Ad = b
min c^T x,  x ∈ d + W,  x ≥ 0   |   max d^T(c − s),  s ∈ c + W^⊥,  s ≥ 0
§ Complementary slackness: primal and dual solutions (x, s) are optimal iff x^T s = 0: for each i ∈ [n], either x_i = 0 or s_i = 0.
§ Optimality gap: c^T x − d^T(c − s) = x^T s.
The central path
§ For each μ > 0, there exists a unique solution w(μ) = (x(μ), y(μ), s(μ)) to the primal and dual systems such that
x_i(μ)·s_i(μ) = μ   ∀i ∈ [n]
the central path element for μ.
§ The central path is the algebraic curve formed by {w(μ) : μ > 0}
§ For μ → 0, the central path converges to an optimal solution w* = (x*, y*, s*).
§ The optimality gap at w(μ) is x(μ)^T s(μ) = n·μ.
§ Interior point algorithms: walk down along the central path with μ decreasing geometrically.
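For a tiny instance, a central path point can be computed by Newton's method on the system Ax = b, A^T y + s = c, x_i·s_i = μ. A minimal sketch (damped to keep x, s > 0; an illustration, not an efficient interior point method):

```python
# Newton's method for the central path point w(mu) on a toy LP.
import numpy as np

def central_path_point(A, b, c, mu, x, y, s, iters=50):
    m, n = A.shape
    for _ in range(iters):
        # Newton system for F(x, y, s) = (Ax - b, A^T y + s - c, x*s - mu).
        K = np.block([
            [A,                np.zeros((m, m)), np.zeros((m, n))],
            [np.zeros((n, n)), A.T,              np.eye(n)],
            [np.diag(s),       np.zeros((n, m)), np.diag(x)],
        ])
        F = np.concatenate([A @ x - b, A.T @ y + s - c, x * s - mu])
        d = np.linalg.solve(K, -F)
        dx, dy, ds = d[:n], d[n:n + m], d[n + m:]
        # Damped step: never lose strict positivity of x and s.
        t = min(1.0, 0.9 / max(1e-12, np.max(-dx / x), np.max(-ds / s)))
        x, y, s = x + t * dx, y + t * dy, s + t * ds
    return x, y, s

A = np.array([[1.0, 1.0, 1.0]])
b = np.array([3.0])
c = np.array([1.0, 2.0, 4.0])
x, y, s = central_path_point(A, b, c, 0.1, np.ones(3), np.zeros(1), np.ones(3))
print(x * s)   # all products close to mu = 0.1
```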
The Mizuno-Todd-Ye Predictor-Corrector algorithm
§ Start from a point w⁰ = (x⁰, y⁰, s⁰) ‘near’ the central path at some μ⁰ > 0.
§ Alternate between
§ Predictor steps: ‘shoot down’ the central path, decreasing μ by a factor of at least 1 − β/√n. May move slightly ‘farther’ from the central path.
§ Corrector steps: do not change the parameter μ, but move back ‘closer’ to the central path.
Within O(√n) iterations, μ decreases by a factor of 2.
The predictor step
§ Step direction Δw = (Δx, Δy, Δs) obtained from:
AΔx = 0
A^TΔy + Δs = 0
s_iΔx_i + x_iΔs_i = −x_i·s_i   ∀i ∈ [n]
§ Pick the largest α ∈ [0,1] such that w′ is still “close enough” to the central path:
w′ = w + αΔw = (x + αΔx, y + αΔy, s + αΔs)
§ Long step: α can be large if |Δx_i·Δs_i| is small for every i ∈ [n]
§ The new optimality gap is (1 − α)·x^T s.
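A self-contained numpy sketch of the predictor direction: it solves the three equations above at a synthetic central path point (b and c are left implicit; all data is made up) and verifies that the gap shrinks by exactly the factor 1 − α, since Δx^T Δs = 0:

```python
# One predictor (affine-scaling) step at a synthetic central path point.
import numpy as np

rng = np.random.default_rng(1)
m, n, mu = 1, 3, 1.0
A = rng.standard_normal((m, n))
x = rng.uniform(1.0, 2.0, n)       # x > 0, and s = mu / x puts (x, y, s)
s = mu / x                         # exactly on the central path at mu
y = rng.standard_normal(m)         # b = Ax and c = A^T y + s are implicit

K = np.block([
    [A,                np.zeros((m, m)), np.zeros((m, n))],
    [np.zeros((n, n)), A.T,              np.eye(n)],
    [np.diag(s),       np.zeros((n, m)), np.diag(x)],
])
rhs = np.concatenate([np.zeros(m), np.zeros(n), -x * s])
d = np.linalg.solve(K, rhs)
dx, dy, ds = d[:n], d[n:n + m], d[n + m:]

alpha = 0.5                        # step length alpha in [0, 1]
x2, s2 = x + alpha * dx, s + alpha * ds
# dx lies in ker(A) and ds in range(A^T), so dx^T ds = 0 and the gap
# shrinks by exactly (1 - alpha):
print(x2 @ s2, (1 - alpha) * (x @ s))
```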
The predictor step: the subspace view
§ Assume the current point w = (x, y, s) is on the central path. The steps can be found as minimum-norm projections in the (1/x)- and (1/s)-rescaled norms:
Δx = argmin_{Δx∈W} Σ_{i=1}^n ( (x_i + Δx_i)/x_i )²
Δs = argmin_{Δs∈W^⊥} Σ_{i=1}^n ( (s_i + Δs_i)/s_i )²
s_iΔx_i + x_iΔs_i = −x_i·s_i   ∀i ∈ [n]
Some recent progress on interior point methods
§ Tremendous recent progress on fast approximate variants: LS’14–’19, CLS’19, vdB’20, vdBLSS’20, vdBLLSSSW’21
§ Fast approximate algorithms for combinatorial problems (flows, matching and MDPs): DS’08, M’13, M’16, CMSV’17, AMV’20, vdBLNPTSSW’20, vdBLLSSSW’21
Outline of the lectures
1. Tardos's algorithm for min-cost flows
2. The circuit imbalance measure κ and the condition measure χ̄
3. Solving LPs: from approximate to exact
4. Optimizing circuit imbalances
5. Interior point methods: basic concepts
6. Layered-least-squares interior point methods
Part 6
Layered-least-squares interior point methods
Layered-least-squares (LLS) interior point methods: dependence on the constraint matrix only
χ̄*_A = inf{ χ̄_{AD} : D ∈ D }
§ Monteiro-Tsuchiya ’03: O( n^{3.5}·log(χ̄*_A + n) + n²·log log(1/ε) ) iterations
§ Lan-Monteiro-Tsuchiya ’09: O( n^{3.5}·log(χ̄*_A + n) ) iterations, but the running time of the iterations depends on b and c
§ Dadush-Huiberts-Natura-V ’20: scaling-invariant LLS method with O( n^{2.5}·log(n)·log(χ̄*_A + n) ) iterations
Near monotonicity of the central path
LEMMA: For w = (x, y, s) on the central path, and for any feasible solution w′ = (x′, y′, s′) s.t. x′^T s′ ≤ x^T s, we have
Σ_{i=1}^n ( x′_i/x_i + s′_i/s_i ) ≤ 2n
§ The IPM gradually learns improved upper bounds on the optimal solution.
Variable fixing… or not?
LEMMA: After every iteration, there exist variables x_i and s_j such that
1/poly(n) ≤ x_i/x*_i ≤ poly(n)   and   1/poly(n) ≤ s_j/s*_j ≤ poly(n)
for the optimal (x*, y*, s*). Thus, x_i and s_j have “converged” to their final values.
§ PROOF: Can be shown using the form of the predictor step:
Δx = argmin_{Δx∈W} Σ_{i=1}^n ( (x_i + Δx_i)/x_i )²
Δs = argmin_{Δs∈W^⊥} Σ_{i=1}^n ( (s_i + Δs_i)/s_i )²
and bounds on the step size.
Variable fixing… or not?
LEMMA: After every iteration, there exist variables x_i and s_j such that 1/poly(n) ≤ x_i/x*_i, s_j/s*_j ≤ poly(n).
Thus, x_i and s_j have “converged” to their final values.
§ We cannot identify these indices; we can only show their existence.
Layered least squares methods
§ Instead of the standard predictor step, split the variables into layers.
§ Force new primal and dual variables that must have converged.
Recap: the lifting operator and κ_W
§ For a linear subspace W ⊂ ℝ^n and index set I ⊆ [n], we let π_I: ℝ^n → ℝ^I denote the coordinate projection.
§ The lifting operator L^W_I: π_I(W) → ℝ^n is defined as
L^W_I(p) = argmin{ ‖z‖₂ : z ∈ W, z_I = p }
§ LEMMA: For every p ∈ π_I(W), the lift z = L^W_I(p) satisfies z_I = p and
‖z_{[n]∖I}‖_∞ ≤ κ_W·‖p‖₁
PROOF:
Motivating the layering idea: the final rounding step in standard IPMs
min c^T x,  Ax = b,  x ≥ 0   |   max b^T y,  A^T y + s = c,  s ≥ 0
§ Limit optimal solution (x*, y*, s*), and optimal partition [n] = B ∪ N s.t. B = supp(x*), N = supp(s*).
§ Given (x, y, s) near the central path with ‘small enough’ μ = x^T s / n, such that for every i ∈ [n], either x_i or s_i is very small.
§ Assume that we can correctly guess B = {i : x_i > s_i} and N = {i : s_i > x_i}.
§ Goal: move to x′ = x + Δx, y′ = y + Δy, s′ = s + Δs s.t. supp(x′) ⊆ B, supp(s′) ⊆ N. Then x′^T s′ = 0: an optimal solution.
§ Choice: Δx = −L^W_N(x_N),  Δs = −L^{W^⊥}_B(s_B)
Layered-least-squares step
§ Assume we have a partition B, N with
i ∈ B: x_i > τ, s_i < μ/τ;   i ∈ N: x_i < μ/τ, s_i > τ
(for a suitable threshold τ).
§ Standard primal predictor step:
Δx = argmin_{Δx∈W} Σ_{i=1}^n ( (x_i + Δx_i)/x_i )²
§ Layered step: solve first on N only:
Δx_N = argmin{ Σ_{i∈N} ( (x_i + Δx_i)/x_i )² : Δx_N ∈ π_N(W) }

Layered-least-squares step, Vavasis-Ye ’96
§ Order the variables decreasingly as x_1 ≥ x_2 ≥ … ≥ x_n
§ Arrange the variables into layers (J_1, J_2, …, J_p); start a new layer when x_i > g·x_{i+1}, for a parameter g depending polynomially on n and χ̄_A
§ Primal step direction obtained from least squares problems solved backwards, layer by layer
Not scaling invariant!
Progress measure: crossover events, Vavasis-Ye ’96
§ DEFINITION: The variables x_i and x_j cross over between μ and μ′, μ > μ′, if
§ g^n·x_j(μ) ≥ x_i(μ), and
§ g^n·x_j(μ″) < x_i(μ″) for any μ″ ≤ μ′.
§ LEMMA: In the Vavasis-Ye algorithm, a crossover event happens every O(n^{1.5}·log(χ̄_A + n)) iterations, totalling O(n^{3.5}·log(χ̄_A + n)).
Scaling invariant layering, DHNV ’20
§ Instead of the ratios x_i/x_j, we consider the rescaled circuit imbalance measures κ_{ij}·x_i/x_j
§ Layers: strongly connected components of the arc set
{ (i, j) : κ_{ij}·x_i/x_j > 1 }
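A small Python sketch of this layering rule: build the digraph with an arc (i, j) whenever κ_{ij}·x_i/x_j > 1, and return its strongly connected components (here via Kosaraju's algorithm) in topological order. The κ_{ij} values below are made up; in the algorithm they come from circuit-based estimates as in Part 4.

```python
# Layers as strongly connected components of the rescaled-imbalance digraph.
import numpy as np

def scc_layers(kap, x):
    n = len(x)
    adj = [[j for j in range(n) if j != i and kap[i][j] * x[i] / x[j] > 1.0]
           for i in range(n)]
    seen, order = [False] * n, []
    def dfs(u, graph, out):           # iterative DFS recording finish order
        stack = [(u, iter(graph[u]))]
        seen[u] = True
        while stack:
            v, it = stack[-1]
            nxt = next(it, None)
            if nxt is None:
                stack.pop()
                out.append(v)
            elif not seen[nxt]:
                seen[nxt] = True
                stack.append((nxt, iter(graph[nxt])))
    for u in range(n):                # Kosaraju pass 1: finish times
        if not seen[u]:
            dfs(u, adj, order)
    radj = [[] for _ in range(n)]
    for u in range(n):
        for v in adj[u]:
            radj[v].append(u)
    seen, layers = [False] * n, []
    for u in reversed(order):         # pass 2: components on reverse graph
        if not seen[u]:
            comp = []
            dfs(u, radj, comp)
            layers.append(sorted(comp))
    return layers

kap = np.array([[1, 2, 1], [1, 1, 8], [1, 1, 1]], dtype=float)
x = np.array([1.0, 0.9, 0.01])
print(scc_layers(kap, x))             # [[0], [1], [2]]
```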
Scaling invariant crossover events, cf. Vavasis-Ye ’96
§ DEFINITION: The variables x_i and x_j cross over between μ and μ′, μ > μ′, if
§ g^n·x_j(μ) ≥ κ_{ij}·x_i(μ), and
§ g^n·x_j(μ″) < κ_{ij}·x_i(μ″) for any μ″ ≤ μ′.
§ Amortized analysis, resulting in an improved O(n^{2.5}·log(n)·log(χ̄*_A + n)) iteration bound.
Limitation of IPMs
§ THEOREM (Allamigeon–Benchimol–Gaubert–Joswig ‘18): No standard path following method can be strongly polynomial.
§ Proof using tropical geometry: studies the tropical limit of a family of parametrized linear programs.
Future directions
§ Circuit imbalance measure: a key parameter for strongly polynomial solvability.
§ LP classes for which the existence of a strongly polynomial algorithm is open:
§ LPs with 2 nonzeros per column in the constraint matrix; equivalently: min-cost generalized flows
§ Undiscounted Markov Decision Processes
§ Extend the theory of circuit imbalances more generally, to convex programming and integer programming.
Thank you!