Presentation 15 condor-v1

15 Condor – A Distributed Job Scheduler Todd Tannenbaum, Derek Wright, Karen Miller, and Miron Livny Beowulf Cluster Computing with Linux, Thomas Sterling, editor, Oct. 2001. Summarized by Simon Kim

Transcript of Presentation 15 condor-v1

Page 1: Presentation 15 condor-v1

15 Condor – A Distributed Job Scheduler

Todd Tannenbaum, Derek Wright, Karen Miller, and Miron Livny

Beowulf Cluster Computing with Linux, Thomas Sterling, editor, Oct. 2001.

Summarized by Simon Kim

Page 2

Contents

• Introduction to Condor
• Using Condor
• Condor Architecture
• Installing Condor under Linux
• Configuring Condor
• Administration Tools
• Cluster Setup Scenarios

Page 3

Introduction to Condor

• Distributed Job Scheduler
• Condor Research Project at the University of Wisconsin-Madison Department of Computer Sciences
• Changed name to HTCondor in 2012
  – http://research.cs.wisc.edu/htcondor

Page 4

Introduction to Condor

[Diagram: the user hands Condor a job and a policy; the job enters the queue, then runs on the nodes, moving between Run and Idle states while Condor monitors its progress and reports back, until the job is complete.]

Page 5

Introduction to Condor

• Workload Management System
• Job Queuing Mechanism
• Scheduling Policy
• Priority Scheme
• Resource Monitoring and Management

Page 6

Condor Features

• Distributed Submission
• User/Job Priorities
• Job Dependency – DAG
• Multiple Job Models – Serial/Parallel Jobs
• ClassAds – Job : Machine Matchmaking
• Job Checkpoint and Migration
• Remote System Calls – Seamless I/O Redirection
• Grid Computing – Interaction with Globus Resources

Page 7

ClassAds and Matchmaking

• Job ClassAd
  – Looking for a Machine
  – Requirements: Intel, Linux, Disk Space, …
  – Rank: Memory, Kflops, …

• Machine ClassAd
  – Looking for a Job
  – Requirements
  – Rank
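As a sketch of how matchmaking works, a job ClassAd and a machine ClassAd might look like the following (the attribute values are illustrative, not taken from the presentation):

```
# Job ClassAd (travels with the job)
MyType       = "Job"
Requirements = (Arch == "INTEL") && (OpSys == "LINUX") && (Disk >= 10000)
Rank         = Memory + KFlops

# Machine ClassAd (advertised by an execute node)
MyType       = "Machine"
Arch         = "INTEL"
OpSys        = "LINUX"
Memory       = 512
Disk         = 20000
Requirements = (LoadAvg < 0.3)
Rank         = (Owner == "smith")
```

The matchmaker pairs a job ad with a machine ad only when each side's Requirements expression is satisfied by the other's attributes, preferring matches with higher Rank.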

Page 8

Using Condor

• Roadmap to Using Condor
• Submitting a Job
• User Commands
• Universes
• Standard Universe
  – Process Checkpointing
  – Remote System Calls
  – Relinking
  – Limitations
• Data File Access
• DAGMan Scheduler

Page 9

Using Condor

Prepare a job (a serial or parallel batch job with STDIN, STDOUT, and STDERR), write a submit description file, and submit it:

$ condor_submit

The universe named in the submit description selects the runtime environment: Standard, Vanilla, PVM, MPI, Grid, or Scheduler (a meta-scheduler).

Example submit description:

universe = vanilla
executable = foo
log = foo.log
input = input.data
output = output.data
queue

Page 10

Status of Submitted Jobs

• $ condor_status -submitters

Page 11

• All Jobs in the Queue
  – $ condor_q

• Removing a Job
  – $ condor_rm 350.0

• Changing Job Priority: -20 ~ +20 (high), default: 0
  – $ condor_prio -p -15 350.1

Page 12

Universes

• Execution Environment – Universe

• Vanilla Universe
  – Serial Jobs
  – Binary Executables and Scripts

• MPI Universe
  – MPI Programs
  – Parallel Jobs
  – Only on Dedicated Resources

# Submit Description
universe = mpi
…
machine_count = 8
queue

Page 13

Universes

• PVM Universe
  – Master-Worker Style Parallel Programs
    • Written for the Parallel Virtual Machine Interface
  – Both Dedicated and Non-dedicated Resources (workstations)
  – Condor Acts as Resource Manager for the PVM Daemon (via pvm_addhosts())
  – Dynamic Node Allocation

# Submit Description
universe = pvm
…
machine_count = 1..75
queue

Page 14

Universes

• Scheduler Universe
  – Meta-Scheduler
  – DAGMan Scheduler
    • Complex Interdependencies Between Jobs

Job sequence: A -> B and C -> D (B and C are executed in parallel)
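The diamond-shaped dependency on this slide can be expressed as a DAGMan input file. A sketch, assuming each node has its own submit description file (a.sub, b.sub, c.sub, and d.sub are placeholder names):

```
# diamond.dag - DAGMan input file for: A, then B and C in parallel, then D
JOB A a.sub
JOB B b.sub
JOB C c.sub
JOB D d.sub
PARENT A CHILD B C
PARENT B C CHILD D
```

The DAG would then be submitted with condor_submit_dag, which runs DAGMan itself as a scheduler-universe job.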

Page 15

Universes

• Standard Universe
  – Serial Jobs
  – Process Checkpoint, Restart, and Migration
  – Remote System Calls
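A minimal standard-universe submit description might look like the following sketch (file names are illustrative; the executable must first be relinked with condor_compile, as covered on the Remote System Calls slide):

```
# Hypothetical standard-universe submit description
universe   = standard
executable = myprog        # relinked with condor_compile
log        = myprog.log
input      = in.data
output     = out.data
queue
```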

Page 16

Process Checkpointing

• Checkpoint
  – Snapshot of the Program's Current State
  – Preemptive-Resume Scheduling
  – Periodic Checkpoints – Fault Tolerance
  – No Program Source Code Changes

• Relinking with the Condor System Call Library
  – Signal Handler
    • Process State Written to a Local/Network File
    • Stack/Data Segments, CPU State, Open Files, Signal Handlers, and Pending Signals
  – Optional Checkpoint Server
    • Checkpoint Repository

Page 17

Remote System Calls

• Redirects File I/O
  – open(), read(), write() -> Network Socket I/O
  – Sent to the 'condor_shadow' Process on the Submit Machine
    • Handles the Actual File I/O
    • Note That the Job Runs on a Remote Machine

• Relinking with the Condor Remote System Call Library
  – $ condor_compile cc myprog.o -o myprog

Page 18

Standard Universe Limitations

• No Multi-Process Jobs
  – fork(), exec(), system()

• No IPC
  – Pipes, Semaphores, and Shared Memory

• Only Brief Network Communication
  – Long Connections -> Delayed Checkpoints and Migration

• No Kernel-Level Threads
  – User-Level Threads Are Allowed

• File Access: Read-Only or Write-Only
  – Read-Write Access Is Hard to Roll Back to an Old Checkpoint

• On Linux, Must Be Statically Linked

Page 19

Data Access from a Job

• Remote System Calls – Standard Universe
• Shared Network File System
• What About Non-dedicated Machines (Desktops)?
  – Condor File Transfer
  – Before the Run, Input Files Are Transferred to the Remote Machine
  – On Completion, Output Files Are Transferred Back to the Submit Machine
  – Requested in the Submit Description File
    • transfer_input_files = <…>, transfer_output_files = <…>
    • transfer_files = <ONEXIT | ALWAYS | NEVER>
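Putting these options together, a vanilla-universe submit description using Condor file transfer might look like this sketch (file names are illustrative; the syntax follows the presentation's era of Condor, and newer HTCondor releases express the same idea with should_transfer_files/when_to_transfer_output):

```
# Hypothetical vanilla-universe job using Condor file transfer
universe             = vanilla
executable           = foo
input                = input.data
output               = output.data
transfer_input_files = table1.dat,table2.dat
transfer_files       = ONEXIT
queue
```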

Page 20

Condor Architecture

[Diagram: the Central Manager machine runs the Negotiator and Collector daemons; each of Machine 1 through Machine N runs a Startd and a Schedd.]

Page 21

Condor Architecture

[Diagram: the same pool, with Machine 1 as the submit machine and Machine N as the execute machine. The Schedd on the submit machine spawns a Shadow process; the Startd on the execute machine spawns a Starter, which runs the Job. The job's Condor remote system calls are serviced by the Shadow on the submit machine.]

Page 22

Cluster Setup Scenarios

• Uniformly Owned, Dedicated Cluster
  – MPI Jobs on Dedicated Nodes

• Cluster of Multi-Processor Nodes
  – 1 VM per Processor

• Cluster of Distributively Owned Nodes
  – Jobs from the Owner Preferred

• Desktop Submission to the Cluster
  – Submit-Only Node Setup

• Non-Dedicated Computing Resources
  – Opportunistic Scheduling and Matchmaking with Process Checkpointing, Migration, Suspend, and Resume
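As a sketch of the desktop-submission scenario, a submit-only node can be set up by running only the master and schedd daemons in its Condor configuration (the host name is a placeholder; exact settings depend on the Condor version):

```
# condor_config fragment for a hypothetical submit-only desktop node
CONDOR_HOST = central-manager.example.edu   # the pool's central manager
DAEMON_LIST = MASTER, SCHEDD                # no STARTD: submits jobs, never runs them
```

Leaving the STARTD out of DAEMON_LIST means the machine advertises no execution resources, so it can queue jobs into the cluster without ever having jobs scheduled onto it.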

Page 23

Conclusion

• Distinct Features
  – Matchmaking with Job and Machine ClassAds
  – Preemptive Scheduling and Migration with Checkpointing
  – Condor Remote System Calls

• A Powerful Tool for Distributed Job Scheduling
  – Within and Beyond Beowulf Clusters

• A Unique Combination of Dedicated and Opportunistic Scheduling