Minimizing Flow Time on Multiple Machines Nikhil Bansal IBM Research, T.J. Watson.

Minimizing Flow Time on Multiple Machines Nikhil Bansal

IBM Research, T.J. Watson

Scheduling

Collection of m machines, n jobs

Arrival time or release time (r_j)

Service requirement or size (p_j)

[Figure: timeline for m=1 starting at t=0, showing release times r1, r2, r3 and completion times C1, C2, C3; one job is preempted.]

Scheduling

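Flow Time = Time a job spends in the system = Completion time – release time = Waiting + Processing

[Figure: m=1 timeline starting at t=0 with release times r1, r2, r3 and completion times C1, C2, C3; the flow time of job 2 is marked.]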

Scheduling

[Figure: m=1 timeline starting at t=0 with release times r1, r2, r3 and completion times C1, C2, C3; the flow time of job 3 is marked.]

Minimize total flow time

Total Flow Time (Another View)

Imagine each job costs $1 per unit time.

Cost of a job = Its flow time

Total cost = Total flow time

Total cost = Σ_t cost at time t = Σ_t # jobs at time t

Total Flow Time (Single Machine)

Total cost = Σ_t # jobs at time t

Processor has a “to do” list of jobs

Goal: Minimize number of jobs on list

Work on the job it can finish earliest.

Shortest remaining processing time (SRPT): Optimal algorithm
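
The SRPT rule can be made concrete with a small simulation. The following is a minimal sketch, not taken from the talk: the function name and the (release, size) input format are assumptions chosen for illustration. It runs preemptive SRPT on one machine and returns the total flow time.

```python
import heapq

def srpt_total_flow_time(jobs):
    """Simulate preemptive SRPT on a single machine.

    jobs: list of (release_time, size) pairs (hypothetical input format),
    with non-negative release times and positive sizes.
    Returns the sum over jobs of (completion time - release time).
    """
    pending = sorted(jobs)      # jobs ordered by release time
    active = []                 # min-heap of (remaining time, release time)
    t = 0.0
    total_flow = 0.0
    i = 0
    while i < len(pending) or active:
        if not active:
            t = max(t, pending[i][0])   # machine idles until the next release
        # Admit every job released by time t.
        while i < len(pending) and pending[i][0] <= t:
            release, size = pending[i]
            heapq.heappush(active, (size, release))
            i += 1
        # Pick the job with the shortest remaining processing time.
        remaining, release = heapq.heappop(active)
        # Run it until it finishes or the next job is released.
        next_release = pending[i][0] if i < len(pending) else float("inf")
        run = min(remaining, next_release - t)
        t += run
        if run < remaining:
            heapq.heappush(active, (remaining - run, release))  # preempted
        else:
            total_flow += t - release                           # completed
    return total_flow

print(srpt_total_flow_time([(0, 3), (1, 1)]))
```

In this example the size-3 job is preempted at t=1 by the size-1 job. The number of unfinished jobs is 1 on [0,1), 2 on [1,2) and 1 on [2,4), which sums to the same total flow time the function returns, matching the "jobs at time t" view of the cost.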

Flow Time on Multiple Machines (m ≥ 2)

NP-Hard

Breakthrough: O(log n) competitive [Leonardi, Raz 97]

Works for arbitrary # of machines (m)

Any online algorithm: Ω(log n) competitive

Improvements: No migrations [Awerbuch et al 99]

Immediate dispatch [Avrahami and Azar 03]

Flow Time on Multiple Machines

What about approximation algorithms?

O(log n) best known, even for m=2

Lower bounds: NP-Hard, APX-Hard ?

Flow Time on Multiple Machines

Main Result: A (1+ε) approximation scheme

Running Time = n^O(m log n)

Or, n^O(log n) for m=O(1)

Suggests: PTAS likely for O(1) machines

Basic Idea

Rounding: Simplify the input without losing quality too much

Search: Dynamic Programming over some reasonable space of schedules

Related Problem

Minimizing total completion time (Σ_i c_i, or equivalently Σ_i (r_i + f_i))

Same as flow time wrt optimality

But easier for approximation

PTASes known with runtime poly(n,m) [Afrati et al 99]

Techniques not applicable to flow time

Rounding for Flow Time

Flow Time is quite sensitive

Suppose we round sizes to powers of (1+ε)

Cannot distinguish between:

Jobs of size 1 arriving at t=1,2,…,n

Jobs of size 1+ε arriving at t=1,2,…,n

Very different: Θ(n) vs. Θ(n²) total flow time!
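
The gap between the two instances can be checked numerically. This is a small sketch under stated assumptions: the function name is hypothetical, and jobs are served in arrival order, which coincides with SRPT here because all jobs have equal size.

```python
def total_flow_equal_jobs(n, size):
    """Total flow time on one machine when one job of the given size
    arrives at each of t = 1, 2, ..., n and jobs run in arrival order."""
    t = 0.0       # time at which the machine becomes free
    total = 0.0
    for release in range(1, n + 1):
        start = max(t, release)   # wait for the machine or for the release
        t = start + size          # completion time of this job
        total += t - release      # flow time of this job
    return total

n = 100
print(total_flow_equal_jobs(n, 1))     # size 1: every job finishes before the next arrives
print(total_flow_equal_jobs(n, 1.5))   # size 1 + eps with eps = 0.5: a backlog builds up
```

With size 1+ε, job k finishes at time 1 + k(1+ε), so its flow time is 1 + kε and the total is n + ε·n(n+1)/2, which grows quadratically in n.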

Rounding for Flow Time

Can show: Let B be the largest size. Rounding r_i, p_i to multiples of εB/n² is fine.

Proof: Each job is affected by ≤ εB/n, and Opt ≥ B, so the total change is ≤ εB ≤ ε·Opt.

Implies: Sizes ∈ [1, n²/ε], events at [1, n³/ε] (in units of εB/n²)
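
The rounding step might look as follows. This is only a sketch: the function name is hypothetical, ε is written `eps`, and rounding up (rather than down or to nearest) is an assumption, since the slide does not state the direction.

```python
import math

def round_instance(jobs, eps):
    """Round release times and sizes to multiples of eps * B / n**2.

    jobs: list of (release, size) pairs (hypothetical input format).
    Returns (unit, rounded_jobs), where each rounded value is an integer
    count of grid units; rounding up is an assumption of this sketch.
    """
    n = len(jobs)
    B = max(size for _, size in jobs)   # largest job size
    unit = eps * B / n**2               # grid spacing
    rounded = [(math.ceil(r / unit), math.ceil(p / unit)) for r, p in jobs]
    return unit, rounded

unit, rounded = round_instance([(0, 4.0), (1.3, 1.0)], eps=0.5)
print(unit, rounded)
```

In grid units the largest size becomes n²/ε, so after rounding every size is an integer in a polynomially bounded range.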

Still bad for exhaustive search over all schedules.

Restricting possible schedules

Jobs assigned to a machine, worked in SRPT order.

Given a machine, which jobs are assigned to it? (2^n possibilities)

Approx state under SRPT in O(log² n) bits of info. Store for each machine.

Dynamic program: For (state, t), what's the best flow time achievable?

State

Properties

1) Enough information: State at t+1 computable from that at time t.

2) Gives number of jobs to within a 1+ε factor

Property of SRPT

At any time, among jobs with size ∈ [a,b], at most one has remaining processing time < a.

Property of SRPT

At any time, among jobs with size ∈ [a,b], at most one has remaining processing time < a.

Proof: [Figure: two jobs with sizes between a and b; the blue job has been partly processed.] Once one job (blue) drops below a, any other job in the class still has remaining time ≥ a, so under SRPT it is not executed until blue finishes. Hence both cannot be < a at the same time.

Property of SRPT

At any time, among jobs with size ∈ [a,b], at most one has remaining processing time < a.

Suppose a = (1+ε)^i, b = (1+ε)^(i+1)

Given: total remaining size (x) of jobs s.t. p_i ∈ [a,b]

Estimate: x/b ≤ # of jobs ≤ x/a + 1
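
The estimate follows because every job in the class has remaining time at most b, and all but at most one have remaining time at least a. A minimal sketch of the resulting bounds (the function name is hypothetical; integer rounding is added here because the count is a whole number):

```python
import math

def job_count_bounds(x, a, b):
    """Bounds on the number of unfinished jobs with original size in [a, b],
    given their total remaining work x, assuming SRPT order on the machine."""
    lower = math.ceil(x / b)        # each job has at most b remaining
    upper = math.floor(x / a) + 1   # all but one job have at least a remaining
    return lower, upper

# Example: remaining times 5, 5, 5, 3 for jobs with sizes in [4, 5]; x = 18.
print(job_count_bounds(18, 4, 5))
```

The true count in the example is 4, which lies within the returned bounds. When b/a = 1+ε the two bounds differ by roughly a 1+ε factor plus one job.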

Configuration on a machine

Consider O(log n / ε) size classes [(1+ε)^i, (1+ε)^(i+1)]

For each class, store:

Total remaining processing time (x)

The 1/ε largest remaining processing times

x/(1+ε)^(i+1) ≤ # of jobs ≤ x/(1+ε)^i + 1

Class 1: (Total_1, x_1, x_2, …, x_{1/ε})

…

Class k: (Total_k, y_1, y_2, …, y_{1/ε})

k = O(log n)

In all O(log² n) bits
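
As an illustration of what such a per-machine configuration could contain, here is a sketch. The function name, the (size, remaining) input format and the dictionary layout are assumptions made for this example, not the paper's data structure, and ε is written `eps`.

```python
import math
from collections import defaultdict

def machine_configuration(jobs, eps):
    """Summarize the unfinished jobs on one machine by size class.

    jobs: list of (original_size, remaining_time) pairs (hypothetical format).
    Returns {class index i: (total remaining time, largest remaining times)},
    where class i holds sizes in [(1+eps)^i, (1+eps)^(i+1)) and at most
    ceil(1/eps) remaining times are kept per class.
    """
    keep = math.ceil(1 / eps)
    by_class = defaultdict(list)
    for size, remaining in jobs:
        i = math.floor(math.log(size, 1 + eps))   # index of the size class
        by_class[i].append(remaining)
    return {
        i: (sum(rs), sorted(rs, reverse=True)[:keep])
        for i, rs in by_class.items()
    }

cfg = machine_configuration([(2, 2.0), (2.2, 1.0), (1.6, 1.5), (3, 3.0)], eps=0.5)
print(cfg)
```

The summary size depends only on the number of classes and on 1/ε, not on how many jobs are waiting. Note that the floating-point logarithm can misclassify sizes that are exact powers of 1+ε; a careful implementation would work with the rounded integer sizes instead.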

Updating a configuration

At most O(m log² n) bits of information

Gives number of jobs to within 1+ε

How to update, as time passes?

Class 1: (Total_1, x_1, x_2, …, x_{1/ε})

…

Class j: (Total_j, y_1, y_2, …, y_{1/ε})

On arrival: guess the machine and update the state (m branches)

Updating a configuration

Working step: For each machine, guess the class containing the job with the smallest remaining time [(log n)^m choices]

Fitting it all together

At any time, O(m log² n / ε²) total bits of info.

Know how to update.

Dynamic program over all possible states.

Weighted Flow Time (Σ_i w_i f_i)

NP-Hard for m=1, No o(n) approximation known, even for m=1

m=1: (1+ε) approx, time n^O(log B log W) [Chekuri, Khanna 02]

B: max/min size, W: max/min weight

This paper: Extend to m=O(1), time n^O(m log Bn log Wn)

Hardness: Exponential dependence on m likely

(1+ε) approx with running time 2^O(polylog(n,m,W,B)) ⇒ NP ⊆ DTIME(n^polylog(n))

Open Problems

1) PTAS or O(1) approx for minimizing flow time on O(1) machines? [Our QPTAS => PTAS likely]

2) For arbitrary number of machines. PTAS or APX-Hard?