Reconnect ‘04 Introduction to PICO

Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy under contract DE-AC04-94AL85000. Reconnect ‘04: Introduction to PICO. Cynthia Phillips, Sandia National Laboratories. Joint work with: Jonathan Eckstein, Rutgers; William E. Hart, Sandia National Laboratories.

Page 1: Reconnect ‘04 Introduction to PICO

Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy under contract DE-AC04-94AL85000.

Reconnect ‘04 Introduction to PICO

Cynthia Phillips, Sandia National Laboratories

Joint work with: Jonathan Eckstein, Rutgers

William E. Hart, Sandia National Laboratories

Slide 2

Parallel Computing Systems

• A set of processors (from 2 up to tens of thousands) working together on a problem

• communicating by messages (even if hidden from user)

Architectures:

• Grid

• Network of workstations (LAN)

• Beowulf cluster

• Tightly-coupled system

Slide 3

Parallelism in Branch and Bound

Two sources of parallelism in B&B:

• Within subproblems

• Across subproblems

Warning:

• Can solve problems otherwise unsolvable but

– A constant-factor increase in # processors (even 10,000) cannot overcome exponential growth.

– We still have to be clever
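
To put the warning in numbers, here is a rough sketch. It assumes an idealized search tree whose size multiplies by a fixed branching factor with each extra level, and perfect speedup across processors; both are simplifying assumptions for illustration, not claims from the slides. Under those assumptions, P processors buy only about log P extra levels of tree depth in the same wall-clock time.

```python
import math

def extra_depth(processors: int, branching: int = 2) -> float:
    """Extra tree levels affordable in the same wall-clock time.

    Assumes perfect speedup and a full tree with the given branching
    factor (both idealizations chosen for this sketch).
    """
    return math.log(processors, branching)

gain = extra_depth(10_000)
print(f"10,000 processors buy about {gain:.1f} extra binary decisions")
```

In other words, even a very large machine only moves the wall back by a dozen or so branching decisions, which is why bounding and heuristics still matter.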

Slide 4

Parallelism Issues for Branch and Bound

• In the best of cases, all the processors are busy all the time doing useful, independent work

– Overhead (coordination, exchange of data)

– Load balancing

• What to do when the tree is small?

• Tree shape depends on order of node evaluation

– Can lead to slowdown anomalies

– Try to emulate a good serial ordering

– We’d do a lot better with a single processor 1000x faster

Slide 5

Parallel Experimental Algorithmics/Engineering Issues

• Inherent nondeterminism

• Parallel random number generators

– e.g. for randomized algorithms

• Debugging

Slide 6

Solution Options for Integer Programming

• Commercial codes (ILOG’s cplex)

– Good and getting better

– Expensive

– Serial (or modest SMP)

• Free serial codes (ABACUS, MINTO, BCP)

• Modest-level parallel codes (Symphony)

• Grid parallelism (FATCOP)

• In development: ALPS/BiCePs/BLIS

• Massive parallelism: PICO (Parallel Integer and Combinatorial Optimizer)

Note: Parallel B&B for simple bounding: PUBB, BoB/BOB++, PPBB-lib, Mallba, Zram

Slide 7

Parallel Integer and Combinatorial Optimizer (PICO)

Distributed memory (MPI), C++

• Massively parallel (scalable)

• General parallel Branch & Bound environment

• Portable, flexible

– Serial, small LAN, Cplant, ASCI Red, Red Storm

• Allows exploitation of problem-specific knowledge/structure

• Open Source release

– Always support a free LP solver

Slide 8

PICO Features for Efficient Parallel B&B

• Efficient processor use during ramp-up

• Integration of heuristics to generate good solutions early

• Efficient work storage/distribution

• Load balancing

• Non-preemptive proportional-share “thread” scheduler

• Flexible hub/worker interaction

• Subproblem states with flexible search strategy

• Correct termination

• Early output

Slide 9

What To Do With 9000 Processors and One Subproblem?

Option 1: Presplitting

Make log P branching choices and expand all ways (P problems)

P = # processors

BAD!

Expands many problems that would be fathomed in a serial solution.
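
A minimal sketch of presplitting, assuming binary branching and hypothetical variable names (`x1`, `x2`, ...): the first ceil(log2 P) variables are fixed in every possible way, giving one subproblem per processor. Nothing here checks a bound, which is exactly the objection raised above: many of these subproblems might never have been created in a serial search.

```python
import math
from itertools import product

def presplit(branch_vars, num_processors):
    """Fix the first ceil(log2 P) binary variables in all possible ways."""
    depth = math.ceil(math.log2(num_processors))
    chosen = branch_vars[:depth]
    # One dict of variable fixings per generated subproblem.
    return [dict(zip(chosen, values))
            for values in product((0, 1), repeat=depth)]

subproblems = presplit(["x1", "x2", "x3", "x4"], num_processors=8)
print(len(subproblems))
print(subproblems[0])
```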

Slide 10

PICO MIP Ramp-up

• Serialize tree growth

– All processors work in parallel on a single node

• Parallelize

– LP bounding

– Preprocessing

– Cutting plane generation

– Incumbent Heuristics

– Pseudocost (gradient) initialization

• Work division by processor ID/rank

• Crossover to parallel with perfect load balance

– When there are enough subproblems to keep the processors busy

– When single subproblems cannot effectively use parallelism
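
Work division by rank can be sketched as follows. The idea is that every processor holds the same node and the same list of work items (for example, variables needing pseudocost initialization), and each one picks its share from its own rank alone, with no communication. The function name and the round-robin rule are assumptions made for this illustration; in an MPI program the rank and processor count would come from calls such as `MPI_Comm_rank` and `MPI_Comm_size`.

```python
def my_share(items, rank, num_procs):
    """Round-robin split: processor `rank` takes every num_procs-th item."""
    return items[rank::num_procs]

# Hypothetical work list: variables whose pseudocosts must be initialized.
fractional_vars = [f"x{i}" for i in range(10)]
num_procs = 4
shares = [my_share(fractional_vars, r, num_procs) for r in range(num_procs)]
for r, share in enumerate(shares):
    print(r, share)
```

Because every processor applies the same rule to the same list, the shares are disjoint and together cover all the work.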

Slide 11

Parallel Incumbent Search

• Genetic algorithms

• Decomposition-based methods (general)

• Pivot, cut, and dive general heuristic

• Custom Methods

Slide 12

Interior-Point Method for Solving the Root Problem

• Mehrotra’s predictor-corrector (primal-dual) method

• Iterative method where the computational core of each iteration is the solution of a linear system with matrix $A D^2 A^T$:

– A is the original LP constraint matrix.

– D is a diagonal matrix that changes each iteration.

• Direct Cholesky Solvers OK for moderate parallelism

• Iterative methods

– Preconditioning is a big issue

– Support theory can help if the matrix has network structure
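
The linear system at the core of each iteration, with matrix A D^2 A^T, can be sketched on a tiny dense example. This is a serial, pure-Python illustration of a direct Cholesky solve; the matrix, diagonal, and right-hand side are made-up numbers, and a real solver would use sparse parallel factorization or a preconditioned iterative method instead.

```python
import math

def normal_matrix(A, d):
    """Form M = A D^2 A^T, where D = diag(d)."""
    m, n = len(A), len(A[0])
    return [[sum(A[i][k] * d[k] ** 2 * A[j][k] for k in range(n))
             for j in range(m)] for i in range(m)]

def cholesky(M):
    """Lower-triangular L with L L^T = M (M symmetric positive definite)."""
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = M[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve(M, rhs):
    """Solve M x = rhs using forward then backward substitution."""
    n = len(M)
    L = cholesky(M)
    y = [0.0] * n
    for i in range(n):                       # L y = rhs
        y[i] = (rhs[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):             # L^T x = y
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

A = [[1.0, 1.0, 0.0],
     [0.0, 1.0, 1.0]]
d = [1.0, 2.0, 0.5]          # D changes every interior-point iteration
rhs = [1.0, 2.0]
M = normal_matrix(A, d)
x = solve(M, rhs)
print(M, x)
```

Since D changes at every iteration, the factorization has to be redone each time, which is why the cost and parallel scalability of this step dominate.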

Slide 13

Resolving LP on Subproblems

• Dual simplex is much faster than starting over

• Need parallel dual simplex!

[Figure: the original LP feasible region, its LP optimal solution, the integer optimal point, and a cutting plane (valid inequality or branch constraint)]

Slide 14

Hubs and Workers

Each hub controls some number of workers (and can also act as a worker itself)

• By setting parameters, control can range from fully centralized to fully distributed

Subproblem pools at both the hub and workers

• Heap (best-first), stack (depth-first), queue (breadth-first), custom

• Hubs only keep tokens
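
The three standard pool disciplines can be sketched with Python's standard containers. The class names and the `(bound, name)` representation of a subproblem are assumptions for this illustration only; they are not PICO's classes. The point is that the search order is determined entirely by the pool, so swapping the pool changes the strategy without touching the rest of the search.

```python
import heapq
from collections import deque

class HeapPool:
    """Best-first: the subproblem with the smallest bound comes out first."""
    def __init__(self):
        self._items = []
    def push(self, bound, name):
        heapq.heappush(self._items, (bound, name))
    def pop(self):
        return heapq.heappop(self._items)[1]

class StackPool:
    """Depth-first: the most recently added subproblem comes out first."""
    def __init__(self):
        self._items = []
    def push(self, bound, name):
        self._items.append((bound, name))
    def pop(self):
        return self._items.pop()[1]

class QueuePool:
    """Breadth-first: the oldest subproblem comes out first."""
    def __init__(self):
        self._items = deque()
    def push(self, bound, name):
        self._items.append((bound, name))
    def pop(self):
        return self._items.popleft()[1]

def drain(pool):
    """Insert three toy subproblems, then list the order they come out."""
    for bound, name in [(5.0, "a"), (3.0, "b"), (4.0, "c")]:
        pool.push(bound, name)
    return [pool.pop() for _ in range(3)]

print(drain(HeapPool()), drain(StackPool()), drain(QueuePool()))
```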

Slide 15

Subproblem Movement

Hub → Worker

• When worker has low load or low-quality local pool

Worker → Hub

• Draw back when hub out of work and cluster unbalanced

• Send new subproblem tokens to hub (probabilistically) depending on load

• Probabilistically scatter tokens to a random hub. If load in cluster is high relative to others, scatter probability increases.

Setting parameters, go from pure master-slave to local

• Tradeoffs: Communication, Processor utilization, approximation of serial search order

Slide 16

Subproblem Movement/Data Storage

[Figure: diagram of a worker (pool of subproblems, SP Server, SP Receiver) and a hub (pool of tokens, SP Server); SP = subproblem, T = token]

Slide 17

Load Balancing

• Hub pullback

• Random scattering

• Rendezvous

– Hubs determine load (function of quantity and quality)

– Use binary tree of hubs

– Determine donors and receivers, match them, exchange
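
The donor/receiver matching step can be sketched as below. This is a simplified, centralized illustration: it treats each hub's load as a single number, uses the average load as the target, and matches greedily. Those choices, and the hub names, are assumptions for the sketch; the slides only say that load is a function of quantity and quality and that a binary tree of hubs is used, which this code does not model.

```python
def match_donors_receivers(loads):
    """Pair overloaded hubs with underloaded ones.

    Returns a list of (donor, receiver, amount) transfers that bring
    every hub to the average load.
    """
    target = sum(loads.values()) / len(loads)
    surplus = {h: l - target for h, l in loads.items() if l > target}
    deficit = {h: target - l for h, l in loads.items() if l < target}
    donors = sorted(surplus, key=surplus.get, reverse=True)
    receivers = sorted(deficit, key=deficit.get, reverse=True)

    transfers = []
    i = j = 0
    while i < len(donors) and j < len(receivers):
        d, r = donors[i], receivers[j]
        amount = min(surplus[d], deficit[r])
        transfers.append((d, r, amount))
        surplus[d] -= amount
        deficit[r] -= amount
        if surplus[d] <= 1e-12:      # donor has nothing left to give
            i += 1
        if deficit[r] <= 1e-12:      # receiver is now at the target
            j += 1
    return transfers

loads = {"hub0": 12, "hub1": 2, "hub2": 7, "hub3": 3}
transfers = match_donors_receivers(loads)
print(transfers)
```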

Slide 18

Non-Preemptive Scheduler is Sufficient for PICO

• Processes are cooperating

• Control is returned voluntarily so data structures left in clean state

– No memory access conflicts, no locks

• PICO has its own “thread” scheduler

– High priority, short threads are round robin and done first

• Hub communications, incumbent broadcasts, sending subproblems

• If these are delayed by long tasks could lead to

– Idle processors

– Processors working on low-quality work

– Compute threads are proportional share (stride) scheduling

• Adjust during computation (e.g. between lower and upper-bounding)
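
Stride scheduling, the proportional-share scheme named above, can be sketched in a few lines. Each task gets a number of tickets; its stride is a large constant divided by its tickets; the task with the smallest accumulated "pass" value runs next and then advances its pass by its stride. The task names and ticket counts below are hypothetical, and this is a generic textbook version of the algorithm rather than PICO's scheduler code.

```python
import heapq

STRIDE1 = 1 << 20   # large constant; stride = STRIDE1 // tickets

def run_stride(tickets, steps):
    """Return the order in which tasks run under stride scheduling."""
    strides = {name: STRIDE1 // t for name, t in tickets.items()}
    # Heap of (pass value, task name); the lowest pass value runs next.
    heap = [(strides[name], name) for name in tickets]
    heapq.heapify(heap)
    order = []
    for _ in range(steps):
        pass_value, name = heapq.heappop(heap)
        order.append(name)          # the task runs one slice, then yields
        heapq.heappush(heap, (pass_value + strides[name], name))
    return order

# Hypothetical compute threads: bounding gets 3 shares, a heuristic gets 1.
order = run_stride({"bound": 3, "heuristic": 1}, steps=8)
print(order.count("bound"), order.count("heuristic"))
```

Changing the ticket counts during the run shifts effort between threads, which is one way the adjustment between lower- and upper-bounding could be expressed.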

Slide 19

Slide 20

Subproblem States

Boundable

Being Bounded

Bounded

Being Separated

Separated

Dead

Handlers: lazy, eager, hybrid, build your own
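
The six states can be written as a small state machine. The transition rule used here (advance one state at a time in the listed order, with fathoming sending a subproblem straight to dead) is an assumption for illustration; the slide lists the states but does not spell out the allowed transitions.

```python
from enum import Enum

class State(Enum):
    BOUNDABLE = "boundable"
    BEING_BOUNDED = "being bounded"
    BOUNDED = "bounded"
    BEING_SEPARATED = "being separated"
    SEPARATED = "separated"
    DEAD = "dead"

ORDER = list(State)   # Enum members iterate in definition order

def advance(state, fathomed=False):
    """Move a subproblem to its next state (assumed linear progression)."""
    if fathomed or state is State.DEAD:
        return State.DEAD
    return ORDER[ORDER.index(state) + 1]

# Walk one subproblem through its whole life cycle.
state, path = State.BOUNDABLE, [State.BOUNDABLE]
while state is not State.DEAD:
    state = advance(state)
    path.append(state)
print([s.value for s in path])
```

A handler in this picture is the policy that decides when to call `advance` on which subproblem, for example bounding children immediately (eager) or only when they are selected (lazy).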

Slide 21

Early Output

• Problem: If you have to abort a long run, want to know variable settings for the incumbent

– May be good enough to stop

– Otherwise seed new search with the incumbent value

• PICO will save a new incumbent if

– It is a strict improvement over the last saved value (or is the first)

– A sufficient time has passed since the last write

• Requires a new message-triggered thread in parallel

– Hub, incumbent holder, I/O processor
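
The save rule above can be sketched as a small class. It assumes a minimization problem and reads the two conditions as both being required; the class name, the `min_interval` parameter, and passing the clock in explicitly are choices made for this illustration, not PICO's interface.

```python
class IncumbentWriter:
    """Decide when a new incumbent is worth writing to disk."""

    def __init__(self, min_interval):
        self.min_interval = min_interval   # seconds between writes
        self.last_value = None             # value of the last saved incumbent
        self.last_write = None             # time of the last write

    def offer(self, value, now):
        """Return True if this incumbent should be saved (minimization)."""
        improved = self.last_value is None or value < self.last_value
        waited = (self.last_write is None
                  or now - self.last_write >= self.min_interval)
        if improved and waited:
            self.last_value, self.last_write = value, now
            return True
        return False

writer = IncumbentWriter(min_interval=60)
decisions = [
    writer.offer(100, now=0),     # first incumbent: saved
    writer.offer(90, now=10),     # better, but too soon after the last write
    writer.offer(90, now=70),     # better and enough time has passed: saved
    writer.offer(90, now=200),    # not a strict improvement
]
print(decisions)
```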

Slide 22

Serial Class Structure - Inheritance

• Branching classes - control search

• Branchsub classes - subproblems (tree nodes)

• Problem data classes - derived only

[Figure: inheritance diagram. Box labels: PICO Core; Knapsack; Nonlinear Branch&Prune; PICO MIP; MIP Application; AMPL Interface (optional); PICO B&C; CORE]

Slide 23

Required Methods for Derived Node Class

• bGlobal( ) - subproblem pointer to branching (search control)

• setRootComputation( ) - create the root of the search tree

• boundComputation( ) - compute subproblem lower bound (for min)

• splitComputation( ) - determine how to partition the subproblem

• makeChild( ) - create a child subproblem from a split parent

• candidateSolution( ) - determine whether a proposed solution is a viable candidate for optimality
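
To show how these methods fit together, here is a toy Python analogue for a 0/1 knapsack problem, written as a minimization of negative value. It borrows the method names from the list above, but everything else (the signatures, the `Knapsack` and `Node` classes, the `solve` driver, the depth-first stack) is invented for this sketch and is not PICO's C++ API.

```python
class Knapsack:
    """Problem data: (value, weight) items sorted by value density."""
    def __init__(self, items, capacity):
        self.items = sorted(items, key=lambda vw: vw[0] / vw[1], reverse=True)
        self.capacity = capacity

class Node:
    def __init__(self, problem):
        self.problem = problem
        self.fixed = ()                 # 0/1 decisions for the first items

    def bGlobal(self):
        return self.problem             # access to the shared search data

    def setRootComputation(self):
        self.fixed = ()                 # root: nothing decided yet

    def boundComputation(self):
        """Lower bound on -value: fixed items plus a greedy fractional fill."""
        p = self.bGlobal()
        taken = [(v, w) for (v, w), x in zip(p.items, self.fixed) if x]
        room = p.capacity - sum(w for _, w in taken)
        if room < 0:
            return float("inf")         # infeasible subproblem
        value = sum(v for v, _ in taken)
        for v, w in p.items[len(self.fixed):]:
            take = min(1.0, room / w)
            value += take * v
            room -= take * w
        return -value

    def splitComputation(self):
        return 2                        # branch on the next item: take or skip

    def makeChild(self, which):
        child = Node(self.problem)
        child.fixed = self.fixed + (1 - which,)
        return child

    def candidateSolution(self):
        return len(self.fixed) == len(self.bGlobal().items)

def solve(problem):
    """Serial depth-first branch and bound driven only by the node methods."""
    root = Node(problem)
    root.setRootComputation()
    best, stack = float("inf"), [root]
    while stack:
        node = stack.pop()
        bound = node.boundComputation()
        if bound >= best:
            continue                    # fathomed: cannot beat the incumbent
        if node.candidateSolution():
            best = bound                # all items decided: bound is exact
            continue
        for which in range(node.splitComputation()):
            stack.append(node.makeChild(which))
    return -best

best_value = solve(Knapsack([(60, 10), (100, 20), (120, 30)], capacity=50))
print(best_value)
```

The driver never looks inside a node; it only calls the listed methods, which is what lets a framework supply the search (and, in PICO's case, the parallelism) around application-specific node code.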

Slide 24

Optional Customizations

• Incumbent heuristic

• Incumbent representation/update

• Solution output

• Solution validation

• Preprocessing

• Override default parameters

In MIP:

• Custom cutting planes

• Adjust branching priorities

– Plan to add more complex branching strategies

Slide 25

All PICO’s Parallelism Comes (Almost) For Free

User must

• Define serial application (debug in serial)

• Describe how to pack/unpack data (using a generic packing tool)

C++ inheritance gives parallel management

User may add threads to

• Share global data

• Exploit problem-specific parallelism

– MIP: pseudocosts

[Figure: class diagram with boxes PICO serial Core, PICO parallel Core, Serial application, Parallel application]
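
What "describe how to pack/unpack data" means in practice can be sketched with Python's `struct` module: the application says how a subproblem is flattened into bytes for a message and how it is rebuilt on the receiving side. The field layout (bound, depth, list of 0/1 fixings) and the function names are hypothetical and chosen for this sketch; PICO's own packing tool is a C++ facility whose interface is not shown in these slides.

```python
import struct

HEADER = "<dii"   # little-endian: double bound, int depth, int count

def pack_subproblem(bound, depth, fixed):
    """Serialize a subproblem into bytes suitable for a message."""
    header = struct.pack(HEADER, bound, depth, len(fixed))
    return header + bytes(fixed)      # each 0/1 fixing takes one byte

def unpack_subproblem(buffer):
    """Rebuild (bound, depth, fixed) from a packed buffer."""
    bound, depth, count = struct.unpack_from(HEADER, buffer, 0)
    offset = struct.calcsize(HEADER)
    fixed = list(buffer[offset:offset + count])
    return bound, depth, fixed

message = pack_subproblem(-12.5, 3, [1, 0, 1])
print(len(message), unpack_subproblem(message))
```

Once the serial application can round-trip its data this way, moving subproblems between processors needs no further application code.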

Slide 26

Utilib

• Predates STL

• Abstract data types: arrays, heaps, hash tables, balanced trees

• Random number generators

• Hash tables

– work well for doubles very close in value

• Arrays offer

– Protected access (bounds checking)

– Sharing

• PackBuffer methods facilitate parallelization

Slide 27

Pieces of PICO

PICO requires

• utilib (for data structures, math, etc)

• COIN (an IBM-sponsored optimization interface standard)

– Base interface to LP solvers

• We add more PICO-specific functionality

– Cut generation library

• An LP solver

– Currently support cplex, soplex, CLP

Slide 28

Using A Math Programming Language

• How easily can one bring up applications?

– In our world, applications are a moving target; need agility

Slide 29

A Mathematical Programming Language (AMPL)

• AMPL builds the matrix.

• Nice cross between programming language and LaTeX (math view)

[Figure: model files and data files feed AMPL, which passes the problem to a solver (e.g. cplex)]

Slide 30

AMPL-PICO Interface

• Write cutting-plane and approximate-solution code using AMPL variables

• Mapping transparent

[Figure: model files and data files feed AMPL, with PICO as the solver. Box labels: Exact; Compute Approximate Solution; IP; LP; Cutting Planes]

Slide 31

AMPL-PICO Interface

[Figure: two workflow diagrams, labels in extraction order.

Standard AMPL interfaces: AMPL Software, AMPL Model File, AMPL Problem Specification Files, PICO Executable, AMPL Solution Specification Files, AMPL Software, AMPL Solver Output.

Customized PICO Interface: Unix Scripts, AMPL Model File, Tailored PICO C++ Files, C++ Compiler, Tailored PICO Executable, PICO Output.]

Slide 32

Availability

• PICO will be free under the GNU Lesser General Public License (LGPL)

– MIP Requires serial LP solver

• Cplex is expensive, but many companies/universities have it

• CLP is free (through COIN)

• Part of ACRO (A Common Repository for Optimizers)

• http://software.sandia.gov/Acro/

[Need password for CVS checkout; otherwise tarballs]

Slide 33

Open Problems (Wish List)

Tools we’d like to see:

• Parallel matrix generation from a math-programming interface

• Parallel (sparse) dual simplex solver for linear programming

Open algorithms questions:

• Ramp up management: multiple subproblems in parallel

Slide 34

Development Team

Core Team

• Jonathan Eckstein (RUTCOR): PICO core

• Bill Hart (Sandia): scheduler, utilib, AMPL interface, design, etc

• Cindy Phillips (Sandia): MIP layer, MIP applications

Other Developers

• Harvey Greenberg (UCD): preprocessor design

• Vitus Leung (Sandia): preprocessor

• Tod Morrison (UCD, student): soplex interface, porting

• Mikhail Nediak (RUTCOR student, now McMaster): MIP heuristic

• Konrad Borys (RUTCOR student): core templatization, heuristic integration

• Mike Eldred (Sandia): DAKOTA optimization framework

• Ojas Parekh (ex-CMU student, Sandia): soplex interface

• Mario Alleva (Sandia): porting