
Page 1:

Fall 2008, CS 668
Parallel Computing

Prof. Fred Annexstein
fred.annexstein@uc.edu

Office Hours: 11-1 MW or by appointment
Tel: 513-556-1807

Page 2:

Lecture 1: Welcome

• Goals of this course
• Syllabus, policies, grading
• Blackboard resources
• LINC Linux cluster
• Introduction/Motivation for HPPC
• Scope of the problems in parallel computing

Page 3:

Goals

• Primary:
– Provide an introduction to the computing systems, programming approaches, and common numerical and algorithmic methods used for high-performance parallel computing

• Secondary:
– Offer a course meeting the competency requirements of RRSCS
– Provide hands-on parallel programming experience

Page 4:

• Official Syllabus Available on Blackboard

• Textbook

Parallel Programming in C with MPI and OpenMP, Michael J. Quinn

• Other Recommended Texts
– Parallel Programming with MPI, Peter Pacheco
– Introduction to Parallel Computing: Design and Analysis of Algorithms, Ananth Grama, Anshul Gupta, George Karypis, Vipin Kumar
– Using MPI, 2nd Edition: Portable Parallel Programming with the Message Passing Interface, William Gropp

Page 5:

Workload/Grading

• Exams (1 or 2)
– 30% of grade

• Written exercises (3-4)
– May or may not be graded

• Programming assignments (3-4)
– May be done in groups of at most 2
– MPI programming, performance measurement

• Research papers (1)
– Discussion of research questions, strengths, weaknesses, interesting points, contemporary bibliography

• Final project (1)
– Individual or group programming project and report

Page 6:

Policies

• Missed Exams:
– Missed exams cannot be made up unless pre-approved. Please see the instructor as soon as possible in the event of a conflict.

• Academic Honesty:
– Plagiarism on assignments, quizzes, or exams will not be tolerated. See the student code of conduct (http://www.uc.edu/conduct/Code_of_Conduct.html) for more on the consequences of academic misconduct. There are no “small” offenses.

Page 7:

Blackboard

• Syllabus and my contact info

• Announcements

• Lecture slides

• Assignment handouts

• Web resources relevant to the course

• Discussion board

• Grades

Page 8:

What is the Ralph Regula School?

• The Ralph Regula School of Computational Science is a statewide, virtual school focused on computational science. It is a collaborative effort of the Ohio Board of Regents, Ohio Supercomputer Center, Ohio Learning Network and Ohio's colleges and universities. With funding from NSF, the school acts as a coordinating entity for a variety of computational science education activities aimed at making education in computational science available to students across Ohio, as well as to workers seeking continuing education about this technology.

• Website: http://www.rrscs.org

Page 9:

CS LINC Cluster

• Michal Kouril’s links
– http://www.ececs.uc.edu/~kourilm/clusters/
– See the README file for instructions on running MPI code on beowulf.linc.uc.edu

• Accounts
– ECE/CS students should already have an account
– I can request accounts for non-ECE/CS students

• Access
– Remote access only; the cluster is in the ECE/CS server/machine room on the 8th floor of Rhodes, visible through windows in the 890s hallway

Page 10:

Why HPPC?

• Who needs a roomful of computers anyway?

• My PC and Xbox run at GFLOPS rates (billions of floating-point operations per second)

NCSA TeraGrid IA-64 Linux Cluster (http://www.ncsa.uiuc.edu/UserInfo/Resources/Hardware/TGIA64LinuxCluster/)

Page 11:

Needed by people who solve science and engineering problems:
• Materials / Superconductivity
• Fluid Flow
• Weather / Climate
• Structural Deformation
• Genetics / Protein Interactions
• Seismic

Many Research Projects in Natural Sciences and Engineering cannot exist without HPPC

Page 12:

Applications

• Videos – Applications in Physics and Geology

• Simulation of Large-Scale Structure of Universe http://www.youtube.com/watch?v=8C_dnP2fvxk

• Stability Simulation – http://www.youtube.com/watch?v=ZCMiLJOXrpc

• Super Volcano Movie - Show first 1:00 minute http://www.youtube.com/watch?v=unGODG7N1Bs

Page 13:

Why are the problems so large?

• 3-Dimensional
– If you want to increase the level of resolution by a factor of 10, the problem size increases by a factor of 10^3

• Many length scales (in both time and space)
– If you want to observe the interactions between very small, local phenomena and larger, more global phenomena

• The number of relationships between data items grows quadratically
– Example: the human genome has about 3.2 G base pairs, so the number of pairwise relations is roughly (3.2 × 10^9)^2 / 2 ≈ 5 × 10^18, i.e., about 5,000,000,000,000,000,000

Page 14:

How can you solve these problems?

• Take advantage of parallelism
– Large problems generally have many operations which can be performed concurrently

• Parallelism can be exploited at many levels by the computer hardware
– Within the CPU core: multiple functional units, pipelining
– Within the chip: many cores
– On a node: multiple chips
– In a system: many nodes

Page 15:

However….

• Parallelism has overheads
– At the core and chip level, the cost is complexity and money
– Most applications get only a fraction of peak performance (10%-20%)
– At the chip and node level, the memory bus can get saturated if there are too many cores
– Between nodes, the communication infrastructure is typically much slower than the CPU

Page 16:

Necessity Yields Modest Success

• The power of CPUs keeps growing exponentially

• Parallel programming environments are changing very slowly – parallel programming remains much harder than sequential programming

• Two standards have emerged
– MPI library, for processes that do not share memory
– OpenMP directives, for processes that do share memory

Page 17:

Why MPI?

• MPI = “Message Passing Interface”

• Standard specification for message-passing libraries

• Very portable

• Libraries available on virtually all parallel computers

• Free libraries also available for networks of workstations or commodity clusters
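
As a concrete taste of the programming model (not from the original slides; a minimal, standard MPI “hello world” sketch in C, with the message text chosen only for illustration):

```c
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[]) {
    int rank, size;

    MPI_Init(&argc, &argv);                  /* start the MPI runtime */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);    /* this process's id (0..size-1) */
    MPI_Comm_size(MPI_COMM_WORLD, &size);    /* total number of processes */

    printf("Hello from process %d of %d\n", rank, size);

    MPI_Finalize();                          /* shut down the MPI runtime */
    return 0;
}
```

On a typical installation this is compiled with mpicc and launched with mpirun (or mpiexec), specifying the number of processes; every process runs the same program but sees a different rank.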

Page 18:

Why OpenMP?

• OpenMP is an application programming interface (API) for shared-memory systems

• Based on a model of creating and scheduling multi-threaded computations

• Supports high-performance parallel programming of symmetric multiprocessors
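
Again purely as an illustration (not from the slides): a minimal OpenMP sketch in C; the array size, contents, and variable names are invented for the example.

```c
#include <stdio.h>
#include <omp.h>

#define N 1000000

int main(void) {
    static double a[N], b[N];
    double sum = 0.0;

    /* The directive splits the loop iterations across threads;
       reduction(+:sum) gives each thread a private partial sum and
       combines them when the loop finishes. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < N; i++) {
        a[i] = 0.5 * i;
        b[i] = 2.0 * i;
        sum += a[i] * b[i];
    }

    printf("dot product = %g (max threads: %d)\n", sum, omp_get_max_threads());
    return 0;
}
```

Compiled with an OpenMP-aware compiler (for example, gcc -fopenmp), the loop is spread across the available cores without any explicit thread management in the source.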

Page 19:

What are the Costs?

Commercial Parallel Systems
• Relatively costly per processor
• Primitive programming environments
• Scientists looked for an alternative

Beowulf Concept circa 1994
• NASA project (developed by Sterling and Becker)
• Commodity processors
• Commodity interconnect
• Linux operating system
• Message Passing Interface (MPI) library
• High performance/$ for certain applications

Page 20:

How are they Programmed?

Task Dependence Graph
• Begin with a directed graph
• Vertices = tasks; edges = dependences
• Edges are removed as tasks complete

Data Parallelism
• Independent tasks apply the same operation to different elements of a data set (see the sketch below)

Functional Parallelism
• Independent tasks apply different operations to different data elements

Pipelining
• Divide a process into stages
• Produce and consume several items simultaneously
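
A toy sketch (not from the slides) contrasting the first two patterns using OpenMP constructs; the array contents and variable names are made up for illustration:

```c
#include <stdio.h>
#include <omp.h>

#define N 8

int main(void) {
    double v[N] = {3, 1, 4, 1, 5, 9, 2, 6};
    double sum = 0.0, max;

    /* Data parallelism: every thread applies the same operation
       (scaling by 2) to different elements of the data set. */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        v[i] *= 2.0;

    max = v[0];

    /* Functional parallelism: two independent operations (a sum and a
       maximum) run as separate concurrent sections over the data. */
    #pragma omp parallel sections
    {
        #pragma omp section
        { for (int i = 0; i < N; i++) sum += v[i]; }

        #pragma omp section
        { for (int i = 0; i < N; i++) if (v[i] > max) max = v[i]; }
    }

    printf("sum = %.1f, max = %.1f\n", sum, max);
    return 0;
}
```

Note that sum and max are each written by only one section, so no synchronization is needed beyond the implicit barrier at the end of the sections region.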

Page 21:

Why not just use a Compiler?

• Parallelizing compiler: detects parallelism in a sequential program and produces a parallel executable program

Advantages
• Can leverage millions of lines of existing serial programs
• Saves time and labor; requires no retraining of programmers
• Sequential programming is easier than parallel programming

Disadvantages
• Parallelism may be irretrievably lost when programs are written in sequential languages
• Simple example: computing all partial sums in an array (see the sketch below)
• Performance of parallelizing compilers on a broad range of applications is still up in the air
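
A minimal sketch of the partial-sums example (not from the slides; the array contents are invented): written in the natural sequential style, each iteration depends on the previous one, so the parallelism is not visible to the compiler.

```c
#include <stdio.h>

#define N 8

int main(void) {
    double a[N] = {1, 2, 3, 4, 5, 6, 7, 8};
    double p[N];   /* p[i] = a[0] + a[1] + ... + a[i] */

    p[0] = a[0];
    for (int i = 1; i < N; i++)
        p[i] = p[i - 1] + a[i];   /* loop-carried dependence on p[i-1] */

    for (int i = 0; i < N; i++)
        printf("p[%d] = %.0f\n", i, p[i]);
    return 0;
}
```

A parallel formulation does exist (the classic logarithmic-step parallel prefix/scan), but recovering it automatically from this loop is beyond what most parallelizing compilers do reliably.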

Page 22:

Can we Extend Existing Languages?

Programmers can give directives or clues to the compiler about how to parallelize

Advantages
• Easiest, quickest, and least expensive approach
• Allows existing compiler technology to be leveraged
• New libraries can be ready soon after new parallel computers are available

Disadvantages
• Lack of compiler support to catch errors
• Easy to write programs that are difficult to debug

Page 23:

Or Create New Parallel Languages?

Advantages
• Allows the programmer to communicate parallelism to the compiler directly
• Improves the probability that the executable will achieve high performance

Disadvantages
• Requires development of new compilers
• New languages may not become standards
• Programmer resistance

Page 24:

Where are we in 2008?

• Performance concerns make low-level approaches popular

• Augment existing languages with low-level parallel constructs and directives

• MPI and OpenMP are prime examples

Advantages
• Efficiency
• Portability

Disadvantages
• More difficult to program and debug

Page 25:

Programming Assignment #1

• Log into beowulf.linc.uc.edu and run some simple sample programs.

Page 26:

Reading Assignment #1 on Blackboard