Libra Design

download Libra Design

of 42

Transcript of Libra Design

  • 8/7/2019 Libra Design

    1/42

    Sproj3

    Libra: An Economy-Driven Cluster

    SchedulerDesign Documentation

    Version

  • 8/7/2019 Libra Design

    2/42

    Libra: An Economy-Driven Cluster Scheduler Version:

    Design Documentation Date:

    First draft

    Revision HistoryDate Version Description People

    First draft Project Owner and

    Client: RajkumarBuyya

    Faculty Advisor: Dr.

    Arif Zaman

    Project Group:Jahanzeb Sherwani,

    Nosheen Ali,

    Nausheen Lotia,

    Zahra Hayat

    Confidential Sproj3, 2001 Page 2

  • 8/7/2019 Libra Design

    3/42

    Libra: An Economy-Driven Cluster Scheduler Version:

    Design Documentation Date:

    First draft

    Table of Contents

    1. Introduction 4

    2. Design 5

    3. Architecture 6

    4. Interfaces 7

    4.1 QMonitor 7

    4.2 QMonitor Job Control (Pending Jobs) 8

    4.3 QMonitor Job Control (Finished Jobs) 9

    4.4 Submit Job 10

    4.5 Queue Control 11

    4.6 Queue Configuration: Modify 12

    4.7 Host Configuration (Administration Host) 13

    4.8 Complex Configuration 14

    4.9 Cluster Configuration 15

    4.10 Scheduler Configuration 16

    5. Data Definitions 17

    6. Architectural and Procedural Design 22

    6.1 Job Input Controller 24

    6.2 Job Acceptance Controller 25

    6.3 Job Initialization Controller 26

    6.4 Job Assignment Controller 27

    6.5 Execution Host and Queue Determination Controller 28

    6.6 Job Execution Controller 29

    6.7 Job Query Controller 30

    6.8 Job Modification Controller 31

    7. Program Design Language (Low-Level Design) 32

    7.1 Log User In and Input Job 32

    7.2 Accept / Reject Job 33

    7.3 Compute Job finish Time 34

    7.4 Determine Execution Host and Queue 34

    7.5 Schedule according to deadline and budget 35

    7.6 Implement proportional scheduling model according to allocated tickets 36

    7.7 Select Job (according to stride scheduling algorithm) 38

    7.8 Query Job 397.9 Delete Job 40

    7.10 Change Job Details 41

    8. References 42

    Confidential Sproj3, 2001 Page 3

  • 8/7/2019 Libra Design

    4/42

    Libra: An Economy-Driven Cluster Scheduler Version:

    Design Documentation Date:

    First draft

    Design Documentation

    1. Introduction

    This document contains the high-level and low-level design specifications for the Software

    Requirements Specifications (SRS) submitted on 23rd December 2001. Additionally, the software

    architecture, and user interfaces for each of the deliverables are described as well.

    Confidential Sproj3, 2001 Page 4

  • 8/7/2019 Libra Design

    5/42

    Libra: An Economy-Driven Cluster Scheduler Version:

    Design Documentation Date:

    First draft

    2. Design

    For the high-level design, a call-return architecture has been specified using structure charts. For

    procedural design, the high-level design modules have been explained using PDL.

    Confidential Sproj3, 2001 Page 5

  • 8/7/2019 Libra Design

    6/42

    Libra: An Economy-Driven Cluster Scheduler Version:

    Design Documentation Date:

    First draft

    3. Architecture

    The Libra Scheduler architecture is as follows:

    The fileserver stores the

    workspaces of all processes, as

    well as the binary executables

    of the scheduler itself.

    Communications daemons

    (commd) are running on each of

    the host on a specified TCP port

    (786), and form the means of

    communication between hosts.

    Execution hosts contain

    execution queues, each with their own scheduling policies, on which jobs are sent to execute.

    The master host contains the central queue managing daemon, qmaster, and the scheduling

    daemon, which makes the actual scheduling decisions regarding which job is to be assigned to

    which execution queue. Submit hosts are the entry-points for the users. A submit host submits

    jobs using the qsub command, which submits the job along with its corresponding details to the

    master host. The master host summons the scheduler, which has access to the Queue State Table

    which in turn contains information on each of the execution hosts. Based on the scheduling

    algorithm implemented via Libra, the scheduler will choose an execution queue, which the

    qmaster will subsequently dispatch the job to.

    Confidential Sproj3, 2001 Page 6

  • 8/7/2019 Libra Design

    7/42

    Libra: An Economy-Driven Cluster Scheduler Version:

    Design Documentation Date:

    First draft

    4. Interfaces

    4.1 QMonitor

    This is the central control screen for the Sun Grid Engine within which the Libra Scheduler is to

    be implemented. From here, the options (from top left to bottom right) are:

    1. Job Control

    2. Queue Control

    3. Job Submission

    4. Complexes Configuration

    5. Host Configuration

    6. Cluster Scheduler

    7. Scheduler Configuration

    8. Calendar Configuration

    9. User Configuration

    10. Parallel Environment Configuration

    11. Checkpoint Configuration

    12. Browser

    13. Exit

    Confidential Sproj3, 2001 Page 7

  • 8/7/2019 Libra Design

    8/42

    Libra: An Economy-Driven Cluster Scheduler Version:

    Design Documentation Date:

    First draft

    4.2 QMonitor Job Control (Pending Jobs)

    Once a job is submitted, it goes to the master host which places it in the pending jobs queue.

    Once there, its ID, Priority, Name, Owner, Status and Queue (to which it will be sent) are stored

    until the job is actually dispatched to the queue. Every job initially goes to the pending queue

    while the scheduling daemon runs the selected scheduling algorithm to determine the execution

    queue to which the job is to proceed. Once this is done, the job goes into the Running Jobs tab

    (not shown in the screen images), after which it ends up in the finished jobs queue.

    Confidential Sproj3, 2001 Page 8

  • 8/7/2019 Libra Design

    9/42

    Libra: An Economy-Driven Cluster Scheduler Version:

    Design Documentation Date:

    First draft

    4.3 QMonitor Job Control (Finished Jobs)

    The jobs in the finished jobs queue are those that have been scheduled, executed, and completed.

    It is essentially a list of all the hitherto completed jobs that have been submitted and outputs

    returned.

    While job status may be viewed from the Job Control window, new jobs may be submitted by

    clicking on the Submit button, which opens up the Job Submit dialog window.

    Confidential Sproj3, 2001 Page 9

  • 8/7/2019 Libra Design

    10/42

    Libra: An Economy-Driven Cluster Scheduler Version:

    Design Documentation Date:

    First draft

    4.4 Submit Job

    Jobs are submitted using this dialog window. The job script specifies the location of the file

    containing the actual executable commands (for example, sleeper.sh). Job args takes in the

    arguments for the job, including deadline, budget, and expected exeuction time. Output may be

    directed to different files, and is specified in the stdout and stderr input boxes. Only those hosts

    that are specified submit hosts on the master host have privileges to submit jobs, however.

    Confidential Sproj3, 2001 Page 10

  • 8/7/2019 Libra Design

    11/42

    Libra: An Economy-Driven Cluster Scheduler Version:

    Design Documentation Date:

    First draft

    4.5 Queue Control

    This window lists the currently configured queues for the execution hosts in the cluster. In the

    above example, execution hosts mspc29 and mspc36 each have a queue (called q) running on

    each of them, and each of these queues has 0 out of a maximum of 1 job running on them.

    Further, more queues may be added to the specified execution hosts, and existing queues may be

    modified or deleted.

    Confidential Sproj3, 2001 Page 11

  • 8/7/2019 Libra Design

    12/42

    Libra: An Economy-Driven Cluster Scheduler Version:

    Design Documentation Date:

    First draft

    4.6 Queue Configuration: Modify

    This is the window from which queue configurations may be changed. For instance, different

    load/suspend thresholds, which specify the ceiling of maximum allowed load for the queue,

    beyond which it will begin to suspend jobs running on it till the threshold is satisfied. Also,

    complexes may be attached or detached from the queue, making queues cater to different

    resource requirements (in our case, deadline and budget). Checkpointing (the process of saving

    the current status of any job, for later restarting, or for job migration) can also be configured for

    each queue individually, to set the conditions under which checkpointing and/or job migration

    occur. User access specifies which users/hosts are to be given access, and which are to be denied.

    Subordinates includes those queues that are to follow suit with whatever instructions are sent to

    this queue; for instance, if a queue is suspended, then its subordinates will be suspended as well.

    Limits specifies the hard or soft limits imposed on jobs running on the queue (for example, no

    job may take more than 10 minutes of CPU time, otherwise it is to be suspended or killed).

    Confidential Sproj3, 2001 Page 12

  • 8/7/2019 Libra Design

    13/42

    Libra: An Economy-Driven Cluster Scheduler Version:

    Design Documentation Date:

    First draft

    4.7 Host Configuration (Administration Host)

    In this window, hosts are configured. The three type of hosts recognized are Administration,

    Submit, and Execution hosts. Administration hosts are those hosts that are allowed to view and

    edit the configuration for the cluster as a whole, for hosts individually, for queues within specific

    hosts, as well as for jobs running on specific queues. Submit hosts are those that are allowed to

    submit jobs into the cluster. Execution hosts are those that have at least one queue set up where

    jobs scheduled by the cluster may be sent to actually execute.

    Confidential Sproj3, 2001 Page 13

  • 8/7/2019 Libra Design

    14/42

    Libra: An Economy-Driven Cluster Scheduler Version:

    Design Documentation Date:

    First draft

    4.8 Complex Configuration

    Complexes are entities representing some aspect of the cluster that the cluster administrator

    defines. For instance, CPU usage, Virtual Memory usage, Disk Storage usage, are all complexes

    that the administrator can define and then attach onto specific queues or hosts (as shown in the

    queue configuration modification screenshot). By attaching a complex to a queue, and defining

    an associated limit for that complex, the administrator can effectively define a policy for the

    queue. For instance, one queue may be configured such that no process that requires more than

    10 Mb of disk space may be run. Alternatively, another queue may be defined such that the total

    amount of memory used by processes does not exceed the physical RAM on the host, such that

    no virtual memory is used, thus speeding execution times. In our case, a complex defining user

    budget, and another defining user deadline can both be used concurrently to define the behavior

    for a queue that can effectively ensure that user deadlines and budgets are accounted for during

    queue execution. This can act as a sentinel in case the scheduling algorithm inaccurately

    calculates the scheduling policy for the cluster.

    Confidential Sproj3, 2001 Page 14

  • 8/7/2019 Libra Design

    15/42

    Libra: An Economy-Driven Cluster Scheduler Version:

    Design Documentation Date:

    First draft

    4.9 Cluster Configuration

    This window allows the administrator to define the local policies for hosts in the cluster as well

    as the global policies for the entire cluster as a whole. Global policies are defined which are

    inherited by hosts, and in turn inherited by queues in those hosts. Alternatively, hosts may define

    their individual policies that override the global policies as required. Finally, policies may also

    be defined for individual queues that override both the global as well as host-level policies. Thus,

    there is a great degree of customizability for the specific needs of the cluster.

    Confidential Sproj3, 2001 Page 15

  • 8/7/2019 Libra Design

    16/42

    Libra: An Economy-Driven Cluster Scheduler Version:

    Design Documentation Date:

    First draft

    4.10 Scheduler Configuration

    This is the window from which the scheduling algorithm (as described by the high-level design

    diagrams) is selected for the master hosts scheduling daemon. This is effectively the master

    scheduling policy for the cluster, which selects the hosts and queues that are to be allocated to

    each job. This algorithm effectively schedules jobs based on the user constraints of deadline and

    budget, within which it maximizes as much as possible system-centric constraints such as CPU

    utilization, etc.

    Confidential Sproj3, 2001 Page 16

  • 8/7/2019 Libra Design

    17/42

    Libra: An Economy-Driven Cluster Scheduler Version:

    Design Documentation Date:

    First draft

    5. Data Definitions

    Name: Cluster Information stored in Cluster State Table

    Aliases: Updated Cluster InformationWhere used/how used: Schedule Job (input)

    Accept/Reject Job (input)

    Content Description: Cluster Information = CPU Load + Node Status + Remaining

    Time of Pending Jobs + Available Memory

    CPU Load = numeric

    Node Status = [full | overloaded | partially-full]

    full = string

    overloaded = string

    partially-full = string

    Remaining Time of Pending Jobs = (hours) + (minutes) + (seconds)

    Hours = double

    Minutes = double

    Seconds = double

    Available Memory = numeric

    Supplementary Information: This is the information maintained by the master host. It is

    collected periodically from all the execution hosts and also

    communicated to the scheduler to enable it to make its schedulingdecisions.

    Name: Job DetailsAliases: Job Parameters, Job Specifications

    Where used/how used: Get Job Details (output)Schedule Job (output)

    Accept/Reject Job (input)

    Determine Execution Host (input)Implement Scheduling Policy (input)

    Content Description: Job Details = Job ID + Job Name + Job Type + Standalone

    Execution Time + Location of Executable and Input Data Sets +

    Location of Output + System Type + Budget + Deadline

    Confidential Sproj3, 2001 Page 17

  • 8/7/2019 Libra Design

    18/42

    Libra: An Economy-Driven Cluster Scheduler Version:

    Design Documentation Date:

    First draft

    Job ID = integer

    Job Name = string

    Job Type = [Sequential Job | Embarrassingly Parallel Job]

    Sequential Job = char

    Embarrassingly Parallel Job = char

    Job Category = [Deadline | Budget | Deadline and Budget]

    This explains what is provided as the criterion for job

    Job Start Time = integer

    Job Submit Time = integer

    Standalone Execution Time = Job Duration

    Job Duration = (hours) + (minutes) + (seconds)

    Hours = double

    Minutes = double

    Seconds = double

    Location of Execution and Input Data Sets = string

    Location of Output = string

    System Type = string

    Budget = rupees

    Rupees = double

    Deadline = (hours) + (minutes) + (seconds)

    Hours = double

    Minutes = double

    Seconds = double

    Supplementary Information: none

    Name: Scheduling Information

    Aliases: Scheduling Policy ImplementationWhere used/how used: Schedule Job (output)

    Execute Job (input)

    Confidential Sproj3, 2001 Page 18

  • 8/7/2019 Libra Design

    19/42

    Libra: An Economy-Driven Cluster Scheduler Version:

    Design Documentation Date:

    First draft

    Implement Scheduling Policy (output)

    Content Description: Scheduling Information = Tickets + Stride + Pass

    Tickets = integer

    Stride = integer

    Pass = integer

    Supplementary Information: Tickets represent the clients resource allocation, relative to other

    clients.

    Stride represents the interval between selection for execution, measured in passes, and isthe inverse of tickets.

    Pass represents the virtual time index for the clients next selection.

    Name: Job ResultsAliases: Job Output

    Where used/how used: Execute Job (output)

    Content Description: Job Results = (string) + (char) + (numeric)Supplementary Information: These are the results of the execution of the job, and are

    communicated to the user on the submit host from the execution

    host via the master host.

    Name: Required Budget/DeadlineAliases: none

    Where used/how used: Accept/Reject Job (output)

    Content Description: Required Budget/Deadline = Deadline + BudgetBudget = rupees

    Rupees = double

    Deadline = (hours) + (minutes) + (seconds)

    Hours = double

    Minutes = double

    Seconds = double

    Supplementary Information: This data object represents the response of the system to the

    submission of a job that cannot be completed with the requested

    parameters. The system then notifies the user, and demands adifferent budget and/or deadline.

    Name: Chosen Execution HostAliases: Chosen Execution Node

    Where used/how used: Determine Execution Host (output)

    Dispatch Job to Execution Host (input)Dispatch Job to Execution Host (output)

    Confidential Sproj3, 2001 Page 19

  • 8/7/2019 Libra Design

    20/42

    Libra: An Economy-Driven Cluster Scheduler Version:

    Design Documentation Date:

    First draft

    Update Cluster Status (input)

    Content Description: Chosen Execution Host = (Host Name) + Host IP Address

    Host Name = string

    Host IP Address = numeric

    Supplementary Information: This is decided by the scheduler, based on the Cluster Information

    available to it.

    Name: Chosen Execution QueueAliases: noneWhere used/how used: Determine Execution Host (output)

    Dispatch Job to Execution Host (input)

    Dispatch Job to Execution Host (output)

    Update Cluster Status (input)Content Description: Chosen Execution Queue = (Queue Name) + Queue Number

    Queue Name = string

    Queue Number = numeric

    Supplementary Information: This is decided by the scheduler, based on the Cluster Information

    available to it.

    Name: Scheduled Job

    Aliases: Dispatched JobWhere used/how used: Schedule Job (output)

    Execute Job: Stride Scheduling (input)Dispatch Job to Execution Host (output)

    Content Description: Scheduled Job = Job Details + Chosen Execution Host + Chosen

    Execution Queue

    Job Details = Job ID + Job Name + Job Type + Standalone Execution Time + Location

    of Executable and Input Data Sets + Location of Output + System

    Type + Budget + Deadline

    Job ID = integer

    Job Name = string

    Job Type = [Sequential Job | Embarrassingly Parallel Job]

    Sequential Job = char

    Embarrassingly Parallel Job = char

    Standalone Execution Time = Job Duration

    Job Duration = (hours) + (minutes) + (seconds)

    Confidential Sproj3, 2001 Page 20

  • 8/7/2019 Libra Design

    21/42

    Libra: An Economy-Driven Cluster Scheduler Version:

    Design Documentation Date:

    First draft

    Hours = double

    Minutes = double

    Seconds = double

    Location of Execution and Input Data Sets = string

    Location of Output = string

    System Type = string

    Budget = rupees

    Rupees = double

    Deadline = (hours) + (minutes) + (seconds)

    Hours = double

    Minutes = double

    Seconds = double

    Chosen Execution Host = (Host Name) + Host IP Address

    Host Name = string

    Host IP Address = numeric

    Chosen Execution Queue = (Queue Name) + Queue Number

    Queue Name = string

    Queue Number = numeric

    Supplementary Information: This data object represents a job once it has been accepted by thesystem, and its execution node and execution queue.

    Confidential Sproj3, 2001 Page 21

  • 8/7/2019 Libra Design

    22/42

    Libra: An Economy-Driven Cluster Scheduler Version:

    Design Documentation Date:

    First draft

    6. Architectural and Procedural Design

    Confidential Sproj3, 2001 Page 22

    Job AssignmentController

    Job ExecutionController

    Job QueryingController

    Libra Scheduler

    Job ModificationController

    Job SubmissionController

    job

    details

    Job InitializationController

    jobdetai

    ls,jobacce

    pteds

    ignal

    update

    dpen

    ding

    joblis

    t

    sc

    hed

    uli

    ng

    info

    ,

    sc

    hed

    ule

    d

    job

    sched

    uling

    info,

    sche

    duledjo

    b

    updatedclusterstatus

  • 8/7/2019 Libra Design

    23/42

    Libra: An Economy-Driven Cluster Scheduler Version:

    Design Documentation Date:

    First draft

    Confidential Sproj3, 2001 Page 23

    Job SubmissionController

    Job InputController

    Job AcceptanceController

    Job AssignmentController

    SchedulingController

    Job ExecutionControl

    Job SelectionController

    QuantumAllocationController

    selec

    tedjo

    b

    Execution Hostand Queue

    DeterminationController

    Cluster UpdateController

    jobdetails, jobd

    etails

    job

    deta

    ils

    job

    details

    jobacce

    pted

    signaljo

    bdetails

    updatedclu

    stersta

    tus

    sele

    cted

    job

    sch

    edulin

    gi

    nfo

    ,

    sch

    edule

    dj

    ob

  • 8/7/2019 Libra Design

    24/42

    Libra: An Economy-Driven Cluster Scheduler Version:

    Design Documentation Date:

    First draft

    6.1 Job Input Controller

    Confidential Sproj3, 2001 Page 24

    Job InputController

    Get JobParameters

    Forward JobDetails To Master

    Host

    Log User In

    Get User ID Get UserPassword

    Authenticate User

    Input JobParameters

    Authenticate JobParamters

    user

    id

    use

    rid

    password

    userid,

    password

    authentication

    signal

    userid,pas

    sword

    userid

    jobdetails

    userid

    job

    details jo

    b

    deta

    ils

    ver

    ified

    job

    deta

    ils

    userid,jobdetails

  • 8/7/2019 Libra Design

    25/42

    Libra: An Economy-Driven Cluster Scheduler Version:

    Design Documentation Date:

    First draft

    6.2 Job Acceptance Controller

    Confidential Sproj3, 2001 Page 25

    Job AcceptanceController

    Assess ClusterState

    Inform UserAssess Job Finish

    Time

    Retrieve RelevantJob Details

    Compute JobFinish TIme

    Compare JobFinish Time to Job

    Deadline

    Query ClusterState Table

    Query ClusterResource List

    c,t,e

    ,b,d

    job

    id jobfinihstime,

    jobdeadline

    withindeadline

    jobaccepted

    signal

    queu

    elo

    adavaila

    ble

    resources

    jobacceptedsignal

    clusterread

    ysignal,av

    ailableres

    ources

    jobid,

    available

    reso

    urces

    c,

    t,e,

    b,

    d,

    availableresources

    jobfinishtime

    useracceptedsignal

    c = category (deadline only, budget only,deadline and budget)

    t = type (sequential, emb parallel)e = execution timeb = budget

    d = deadline

  • 8/7/2019 Libra Design

    26/42

    Libra: An Economy-Driven Cluster Scheduler Version:

    Design Documentation Date:

    First draft

    6.3 Job Initialization Controller

    Confidential Sproj3, 2001 Page 26

    Job InitializationController

    Retrieve JobDetails

    job

    details

    Set Job Details

    jobdetails

    Update PendingJob List

    job

    id,details

    updatedpen

    ding

    joblist

    upd

    atedpendin

    g

    joblist

    Assign Job Id

    jobi

    d

  • 8/7/2019 Libra Design

    27/42

    Libra: An Economy-Driven Cluster Scheduler Version:

    Design Documentation Date:

    First draft

    6.4 Job Assignment Controller

    Confidential Sproj3, 2001 Page 27

    Job AssignmentController

    Cluster UpdateController

    SchedulingController

    RetrieveScheduled

    Job LIst

    CalculateScheduling

    Information

    Execution Hostand Queue

    Determination

    Controller

    UpdatCluste

    State Ta

    sched

    ulingi

    nfo

    job

    detail

    s

    sch

    edule

    d

    job

    lis

    t

    ReserveResources

    on ExecHost

    Update ExecHost Queue

    Status

    CalculatePass

    CalculateStride

    AllocateTickets

    jobdetails,c

    lusterin

    fo

    job

    detail

    s,

    ch

    osen

    exec

    host,

    queue

    execqueue,

    jobdetails

    chosenexechost,

    jobdetails

    chosen

    exechosta

    ndqueue,

    scheduledjo

    b

    jobdetails,chosenexechost,que

    ue

    updatedclusterstatus

    job

    details

    allo

    cate

    d

    tick

    ets

    allocatedtickets

    stride

    strid

    e

    calculatedpass

    jobdetails

    scheduling

    info

    Get Cluster Info

    clu

    ste

    rinfo

  • 8/7/2019 Libra Design

    28/42

    Libra: An Economy-Driven Cluster Scheduler Version:

    Design Documentation Date:

    First draft

    6.5 Execution Host and Queue Determination Controller

    Confidential Sproj3, 2001 Page 28

    Execution Hostand Queue

    DeterminationController

    Sort Hostsby Load

    Choose MinLoaded Host

    SelectAppropriate

    Queue

    Set SubmitTime

    job

    details

    ,clu

    ster

    info

    host

    load

    list

    hostlo

    adli

    st

    sort

    edh

    ostlo

    adli

    st s

    ortedhos

    tloadlist

    minloadedhost

    jobdetails,

    selecte

    dexechost

    selectedqueue

    chosenselectedhost,queu

    e

    DetermineQueue SlotAvailability

    QueryQueue State

    Table

    sele

    cte

    de

    xec

    host,

    job

    detail

    s

    query

    info

    selectedqueue

    queryin

    fo

    schedule

    djob

    DispatchJob

    scheduledjob

    updatedscheduledjob

    Gather HostLoad Info

  • 8/7/2019 Libra Design

    29/42

    Libra: An Economy-Driven Cluster Scheduler Version:

    Design Documentation Date:

    First draft

    6.6 Job Execution Controller

    Confidential Sproj3, 2001 Page 29

    Job SelectionController

    Get CurrentlyExecuting Job

    List

    Sort Job List byPass Value

    Select MinimumPass Value Job

    Job ExecutionControl

    QuantumAllocation

    Controller

    selectedjob,schedulinginfo

    curr

    ently

    exe

    cuting

    job

    list

    joblist

    sortedjoblist s

    orte

    djo

    blis

    t

    selectedjo

    b

    selectedj

    ob

    sele

    cted

    job

    selectedjob,

    schedulinginfo

    updated

    schedulinginfo

    selectedjo

    b

    schedulin

    ginfo

    pass

    ,stride

    pass,str id

    e

    new

    pas

    svalue

    newpassvalue

    updated

    schedulingin

    fo

    Allot Quantum/sto Job and Run

    Calculate andSet New Pass

    Value

    Insert Job inCurrently

    Executing JobsList

    Get Present Passand Stride

    Values

    Add Stride toPresent Pass

    Value

    Set New PassValue

  • 8/7/2019 Libra Design

    30/42

    Libra: An Economy-Driven Cluster Scheduler Version:

    Design Documentation Date:

    First draft

    6.7 Job Query Controller

    Confidential Sproj3, 2001 Page 30

    Job QueryingController

    Log User In Get Job ID

    Get Job Details Display Job Details

    user

    id,

    passw

    ord

    jobid

    scheduledjob

    details

    Prompt User forJob ID

    Authenticate JobID

    job

    id

    jobid

    authentic

    atio

    n

    signal

    scheduledjob

    details

    Retrieve DetailsBased on User's

    ChoiseDisplay Options

    jobid,

    choice

    c

    hoic

    e

    scheduledjo

    b

    details

    Get Info from User

    userid

    ,

    password

    jobid

  • 8/7/2019 Libra Design

    31/42

    Libra: An Economy-Driven Cluster Scheduler Version:

    Design Documentation Date:

    First draft

    6.8 Job Modification Controller

    Confidential Sproj3, 2001 Page 31

    Job ModificationController

    Get Info from User

    Display Change

    Job/Delete JobOption

    userid,

    passwo

    rd,jobi

    d

    jobid

    Delete Job Change Job

    Determine JobExecution Host

    and Queue

    UpdateCluster

    State Table

    Update ExecHost Queue

    Status

    GetScheduledJob Details

    jobid

    jobid

    jobid

    scheduled

    jobdetails

    exechost,q

    ueue

    ,

    job

    details

    jobid,exec

    host,queue

    exechost,queue,

    jobdetails

    updated

    clusterstatus

    exec

    host,

    queue

    exechost,q

    ueue,

    jobdetails

    jobid

    choice

    ch

    oic

    e

    aut

    he

    nt

    icat

    ion

    si

    gna

    l jobid

    updatedclus

    terstatus

    updated

    clusters

    tatus

    Delete JobFrom Queue

    Update ClusterStatus

    DisplayChangeableParameters

    and Get User'sChoice

    AuthenticateChoice

    Update JoInfo

  • 8/7/2019 Libra Design

    32/42

    Libra: An Economy-Driven Cluster Scheduler Version:

    Design Documentation Date:

    First draft

    7. Program Design Language (Low-Level Design)

    7.1 Log User In and Input Job

    Output: userID

    UserID = getUserIdFromUser

    Password = getPasswordFromUser

    If UserID and Password match UserID and Password in database

    Send authentication signal

    Request Job details

    Verify Job Details

    Assign Job Id and Initialize Job

    Else

    Display error and ask for UserID and Password again

    Return userID

    Confidential Sproj3, 2001 Page 32

  • 8/7/2019 Libra Design

    33/42

    Libra: An Economy-Driven Cluster Scheduler Version:

    Design Documentation Date:

    First draft

    7.2 Accept / Reject Job

    Input: jobDetails

    Output: jobAcceptedSignal

    qLoadList = clusterStateTable.QueueList

    clusRes = clusterResourceList

    send clusterReady signal

    c = jobDetails.category

    t = jobDetails.type

    e = jobDetails.execTime

    b = jobDetails.budget

    d = jobDetails.deadline

    Loop through qLoadList

    If current.rate

  • 8/7/2019 Libra Design

    34/42

    Libra: An Economy-Driven Cluster Scheduler Version:

    Design Documentation Date:

    First draft

    7.3 Compute Job finish Time

    Input: queueLoad, jobExecTime

    Output: finishTime

    Loop through the queueLoad timeSlots until jobExecTime is zero

    If currentTimeSlot is free

    jobExecTime = jobExecTime - size of currentTimeSlot

    Return currentTimeSlot

    7.4 Determine Execution Host and Queue

    Input: jobDetails, ClusterInfo

    Output: chosenExecQueue, chosenHost

    HostLoadList = List of the Loads on each Host

    Set initial load as minimum

    Loop though list of loads

    If current load < minimum so far

    minimum = current number

    chosenHost = HostLoadList(minimum)

    queueList = QueueStateTable [HostLoadList(chosenHost)]

    Set first queue as chosenExecQueue

    Confidential Sproj3, 2001 Page 34

  • 8/7/2019 Libra Design

    35/42

    Libra: An Economy-Driven Cluster Scheduler Version:

    Design Documentation Date:

    First draft

    Loop through the queuelist

    If there exist slots in current queue that fulfill jobdetails.execTime

    choose current queue as chosenExecQueue

    ScheduledJobDetails.startTime = current time

    Return (chosenExecQueue,chosenHost)

    7.5 Schedule according to deadline and budget

    Algorithm:

    M - Resources, N - Jobs, D - deadline

    Note: Cost of any Ri is less than any of Ri+1 Or Rm

    RL: Resource List need to be maintained in increasing order of cost

    Ct - Time when accessed (Time now)

    Ti - Job runtime (average) on Resource i (Ri) [updated periodically]

    Ti is acts as a load profiling parameter.

    Ai - number of jobs assigned to Ri , where:

    Ai = Min (No.of Unassigned Jobs, No. Jobs Ri can complete by remaining deadline)

    No.UnAssignedJobsi = Diff( N, (A1++Ai-1))

    JobsRi consume = RemainingTime (D- Ct) DIV Ti

    ALG: Invoke Job Assignment() periodically until all jobs done.

    Job Assignment():

    Establish ( RL, Ct , Ti , Ai ) dynamically.

    For all resources (I = 1 to M) { Assign Ai Jobs to Ri , if required }

    Confidential Sproj3, 2001 Page 35

  • 8/7/2019 Libra Design

    36/42

    Libra: An Economy-Driven Cluster Scheduler Version:

    Design Documentation Date:

    First draft

    7.6 Implement proportional scheduling model according to allocated tickets

    We are using the stride scheduling algorithm to allocate quantums between jobs, to implement

    their priorities according to their budget and deadline. The tickets represent their priority and are

    determined according to prioritizing sorted lists of jobs according to budget and sorted lists

    according to deadline, and determining their combined priorities to figure out the eventual

    priorities of these jobs.

    The Algorithm is as follows:

    /* per-client state */

    typedef struct {

    int tickets, stride, pass, remain;

    } *client t;

    /* quantum in real time units (e.g. 1M cycles) */

    const int quantum = (1

  • 8/7/2019 Libra Design

    37/42

    Libra: An Economy-Driven Cluster Scheduler Version:

    Design Documentation Date:

    First draft

    /* update global tickets and stride to reflect change */

    void global tickets update(int delta)

    {global tickets += delta;

    global stride = stride1 / global tickets;}

    /* initialize client with specified allocation */

    void client init(client t c, int tickets){

    /* stride is inverse of tickets, whole stride remains */

    c->tickets = tickets;c->stride = stride1 / tickets;

    c->remain = c->stride;

    }

    /* join competition for resource */

    void client join(client t c, queue t q)

    {/* compute pass for next allocation */

    global pass update();

    c->pass = global_pass + c->remain;

    /* add to queue */

    global tickets update(c->tickets);

    queue insert(q, c);

    }

    /* leave competition for resource */void client leave(client t c, queue t q)

    {

    /* compute remainder of current stride */global pass update();

    c->remain = c->pass - global_pass;

    /* remove from queue */

    global tickets update(-c->tickets);

    queue remove(q, c);}

    Confidential Sproj3, 2001 Page 37

  • 8/7/2019 Libra Design

    38/42

    Libra: An Economy-Driven Cluster Scheduler Version:

    Design Documentation Date:

    First draft

    /* proportional-share resource allocation */

    void allocate(queue t q)

    {int elapsed;

    /* select client with minimum pass value */current = queue remove min(q);

    /* use resource, measuring elapsed real time */

    elapsed = use resource(current);/* compute next pass using quantum-adjusted stride */

    current->pass +=

    (current->stride * elapsed) / quantum;queue insert(q, current);

    }

    7.7 Select Job (according to stride scheduling algorithm)

    Output: selectedJob

    currJobList = currentlyExecutingJobList

    Set first job as minimum

    Loop through currJobList

    If currentJob.pass < minimum.pass

    minimum = currentJob

    selectedJob = currJobList(minimum)

    Return selectedJob

    Confidential Sproj3, 2001 Page 38

  • 8/7/2019 Libra Design

    39/42

    Libra: An Economy-Driven Cluster Scheduler Version:

    Design Documentation Date:

    First draft

    7.8 Query Job

    Output: Required queries displayed on screen

    userID = getUserIdFromUser

    password = GetPasswordFromUser

    If UserID and Password match UserID and Password in database

    Send authentication signal

    Else display error and ask for UserID and Password again

    jobID = getJobIDFromUser

    If jobID and UserID tally

    Send authentication signal

    Else display error message and ask user for jobID again

    Display job attirbutes that user may view

    If remainingTime is selected

    call procedure calculateRemainingTime(jobID)

    If startTime is selected

    display jobDetails.startTime

    If budget is selected

    display jobDetails.budget

    If Deadline is selected

    display jobDetails.deadline

    If Execution Host and Queue is selected

    display jobDetails.execHost and jobDetails.execQueue

    Confidential Sproj3, 2001 Page 39

  • 8/7/2019 Libra Design

    40/42

    Libra: An Economy-Driven Cluster Scheduler Version:

    Design Documentation Date:

    First draft

    7.9 Delete Job

    Input: jobID

    Output: updatedClusterStatus

    eHost = jobDetails.execHost

    eQueue = jobDetails.execQueue

    eQueue.delete(jobID)

    clusterStateTable.host(eHost).resources -= calculateRemainingTime(jobID)

    Confidential Sproj3, 2001 Page 40

  • 8/7/2019 Libra Design

    41/42

    Libra: An Economy-Driven Cluster Scheduler Version:

    Design Documentation Date:

    First draft

    7.10 Change Job Details

    Input: jobID, choice

    If choice = name

    newName = readNewNameFromUser()

    If newName is blank

    Display error message and ask user to enter new name again

    Else

    jobDetails.name = newName

    If choice = location of execution and input data sets

    currJobList = currentlyExecutingJobList

    Loop through currJobList

    If currentJob.jobID = jobID

    Display error message

    Return

    newLoc = readNewLocationFromUser()

    jobDetails.inputLocation = newLoc

    If choice = location of output

    newLoc = readNewLocationFromUser()

    jobDetails.outputLocation=newLoc

    Confidential Sproj3, 2001 Page 41

  • 8/7/2019 Libra Design

    42/42

    Libra: An Economy-Driven Cluster Scheduler Version:

    Design Documentation Date:

    First draft

    8. References

    [1] R. Buyya, D. Abramson, and J. Giddy, Nimrod/G: An Architecture for a Resource

    Management and Scheduling System in a Global Computational Grid, HPC ASIA2000, China,

    IEEE CS Press, USA, 2000.

    [2] R. Buyya, D. Abramson, J. Giddy,An Economy Driven Resource Management Architecture

    for Global Computational Power Grids, The 2000 International Conference on Parallel and

    Distributed Processing Techniques and Applications (PDPTA 2000), Las Vegas, USA, June 26-

    29, 2000.

    [3] R. Buyya, D. Abramson, and J. Giddy, An Economy Grid Architecture for Service-Oriented

    Grid Computing, 10th IEEE International Heterogeneous Computing Workshop (HCW 2001),

    with IPDPS 2001, SF, California, USA, April 2001.

    [4] Rajkumar Buyya, Heinz Stockinger, Jonathan Giddy, and David Abramson, Economic

    Models for Management of Resources in Peer-to-Peer and Grid Computing, Monash University.

    [5] B.N. Chun and D.E. Culler, Market-based proportional resource sharing for clusters.

    Submitted for publication, September 1999.

    [6] Sun Microsystems, Inc., Sun Grid Engine 5.2.3 Manual. July 2001