CS160 – Spring 2000
http://www-cse.ucsd.edu/classes/sp00/cse160
Prof. Fran Berman - CSE
Dr. Philip Papadopoulos - SDSC

Page 2: Two Instructors/One Class

• We are team-teaching the class
• Lectures will be split about 50-50 along topic lines. (We'll keep you guessing as to who will show up next lecture.)

• TA is Derrick Kondo. He is responsible for grading homework and programs

• Exams will be graded by Papadopoulos/Berman

Page 3: Prerequisites

• Know how to program in C

• CSE 100 (Data Structures)

• CSE 141 (Computer Architecture) would be helpful but not required.

Page 4: Grading

• 25% Homework

• 25% Programming assignments

• 25% Midterm

• 25% Final

Homework and programming assignments are due at the beginning of section.

Page 5: Policies

• Exams are closed book, closed notes
• No Late Homework
• No Late Programs
• No Makeup Exams
• All assignments are to be your own original work.
• Cheating/copying from anyone/anyplace will be dealt with severely.

Page 6: Office Hours (Papadopoulos)

• My office is SSB 251 (Next to SDSC)

• Hours will be TuTh 2:30 – 3:30 or by appointment.

• My email is [email protected]

• My campus phone is 822-3628

Page 7: Course Materials

• Book: Parallel Programming: Techniques and Applications using Networked Workstations and Parallel Computers, by B. Wilkinson and Michael Allen.

• Web site: Will try to make lecture notes available before class

• Handouts: As needed.

Page 8: Computers/Programming

• Please see the TA about getting an account for the undergrad APE lab.

• We will use PVM for programming on workstation clusters (a minimal sketch follows this slide).

• A word of advice: with the web, you can probably find almost-complete source code somewhere. Don't do this. Write the code yourself. You'll learn more. See the policy on copying.
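
To give a flavor of the message-passing style used with PVM, here is a minimal master/worker sketch in C against the PVM 3 interface. It is illustrative only: the executable name "hello" and the worker count are hypothetical, and error checking is omitted; the actual assignments will specify their own structure.

/* hello.c - minimal PVM 3 master/worker sketch (illustrative only) */
#include <stdio.h>
#include "pvm3.h"

#define NWORKERS 4                    /* hypothetical worker count */

int main(void)
{
    int mytid = pvm_mytid();          /* enroll in PVM and get my task id */

    if (pvm_parent() == PvmNoParent) {
        /* Master: spawn copies of this program, then collect one int from each. */
        int tids[NWORKERS], who, i;
        pvm_spawn("hello", NULL, PvmTaskDefault, "", NWORKERS, tids);
        for (i = 0; i < NWORKERS; i++) {
            pvm_recv(-1, 1);          /* receive from any task, message tag 1 */
            pvm_upkint(&who, 1, 1);   /* unpack the worker's task id */
            printf("master t%x heard from worker t%x\n", mytid, who);
        }
    } else {
        /* Worker: pack my task id and send it back to the parent. */
        pvm_initsend(PvmDataDefault);
        pvm_pkint(&mytid, 1, 1);
        pvm_send(pvm_parent(), 1);
    }

    pvm_exit();                       /* leave the virtual machine before exiting */
    return 0;
}

Compile against libpvm3 (e.g. cc hello.c -lpvm3) and run it under a virtual machine started from the pvm console; the spawned copies must be findable in PVM's search path.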

Page 9: Any Other Administrative Questions?

Page 10: Introduction to Parallel Computing

• Topics to be covered (see the online syllabus for full details)
  – Machine architecture and history
  – Parallel machine organization
  – Parallel algorithm paradigms
  – Parallel programming environments and tools
  – Heterogeneous computing
  – Evaluating performance
  – Grid computing

• Parallel programming and project assignments

Page 11: What IS Parallel Computing?

• Applying multiple processors to solve a single problem

• Why?
  – Increased performance for rapid turnaround time (wall-clock time)
  – More available memory on multiple machines
  – Natural progression of the standard von Neumann architecture

Page 12: World's 10th Fastest Machine (as of November 1999) @ SDSC

1152 Processors

Page 13: Are There Really Problems that Need O(1000) Processors?

• Grand Challenge Codes
  – First Principles Materials Science
  – Climate modeling (ocean, atmosphere)
  – Soil Contamination Remediation
  – Protein Folding (gene sequencing)

• Hydrocodes – Simulated nuclear device detonation

• Code breaking (No Such Agency)

Page 14: There must be problems with the approach

• Scaling with efficiency (speedup)
• Unparallelizable portions of code (Amdahl's law) – see the formulas after this list
• Reliability
• Programmability
• Algorithms
• Monitoring
• Debugging
• I/O
• …

– These and more keep the field interesting
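
For the speedup and Amdahl's-law items above, the standard definitions (not specific to any machine discussed in this course) are:

\[
S(p) = \frac{T_1}{T_p}, \qquad E(p) = \frac{S(p)}{p}, \qquad
S(p) \;\le\; \frac{1}{f + (1-f)/p} \;\longrightarrow\; \frac{1}{f} \ \text{as } p \to \infty ,
\]

where T_1 is the best single-processor time, T_p the time on p processors, and f the fraction of the work that cannot be parallelized. For example, with f = 0.05 the speedup can never exceed 20, no matter how many processors are used.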

Page 16: Basic Measurement Yardsticks

• Peak Performance (AKA guaranteed never to exceed) = nprocs × FLOPS/proc (see the worked example after this list)

• NAS Parallel Benchmarks

• Linpack benchmark for the TOP500

• Later in the course, we will look at how to "fool the masses" and at valid ways to measure performance
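
As a purely illustrative example of the peak-performance arithmetic (hypothetical numbers, not those of any machine mentioned here):

\[
\text{Peak} = 64\ \text{procs} \times 500\ \text{MFLOPS/proc} = 32\ \text{GFLOPS}.
\]

A real application typically sustains only a fraction of this figure, which is why benchmarks such as Linpack and the NAS suite are quoted alongside it.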

Page 17: Illiac IV (1966 – 1970)

• $100 Million of 1990 Dollars

• Single instruction multiple data (SIMD)

• 32 - 64 Processing elements

• 15 Megaflops

• Ahead of its time

Page 18: ICL DAP (1979)

• Distributed Array Processor (also SIMD)

• 1K – 4K bit-serial processors

• Connected in a mesh

• Required an ICL mainframe to front-end the main processor array

• Never caught on in the US

Page 19: Goodyear MPP (late 1970s)

• 16K bit-serial processors (SIMD)

• Goddard Space Flight Center – NASA

• Only a few sold. Similar to the ICL DAP

• About 100 Mflops (roughly a 100 MHz Pentium)

Page 20: Cray-1 (1976)

• Seymour Cray, Designer

• NOT a parallel machine

• Single processor machine with vector registers

• Largely regarded as starting the modern supercomputer revolution

• 80 MHz Processor (80 MFlops)

Page 21: Denelcor HEP (Heterogeneous Element Processor, early 80's)

• Burton Smith, Designer
• Multiple Instruction, Multiple Data (MIMD)
• Fine-grain (instruction-level) and large-grain parallelism (16 processors)
  – Instructions from different programs ran in per-processor hardware queues (128 threads/proc)
• Precursor to the Tera MTA (Multithreaded Architecture)
• Full-empty bit for every memory location allowed fast synchronization
• Important research machine

Page 22: Caltech Cosmic Cube - 1983

• Chuck Seitz (Founded Myricom) and Geoffrey Fox (Lattice gauge theory)

• First hypercube interconnection network
• 8086/8087-based machine with Eugene Brooks' Crystalline Operating System (CrOS)
• 64 processors by 1983
• About 15x cheaper than a VAX 11/780
• Begat nCUBE, Floating Point Systems, Ametek, Intel Supercomputers (all dead companies)
• 1987 – vector coprocessor system achieved 500 MFlops

Page 23: Cray X-MP (1983) and Cray-2 (1985)

• Up to 4-Way shared memory machines

• This was the first supercomputer at SDSC
  – Best Performance (600 Mflop Peak)
  – Best Price/Performance of the time

Page 24: Late 1980's

• Proliferation of (now dead) parallel computers
• CM-2 (SIMD) (Danny Hillis)
  – 64K bit-serial processors, 2048 Vector Coprocessors
  – Achieved 5.2 Gflops on Linpack (LU Factorization)
• Intel iPSC/860 (MIMD - MPP)
  – 128 processors
  – 1.92 Gigaflops (Linpack)
• Cray Y-MP (Vector Super)
  – 8 processors (333 Mflops/proc peak)
  – Achieved 2.1 Gigaflops (Linpack)

• BBN Butterfly (Shared memory)

• Many others (long since forgotten)

Page 25: Early 90's

• Intel Touchstone Delta and Paragon (MPP)
  – Follow-on to the iPSC/860
  – 13.2 Gflops on 512 processors
  – 1024 nodes delivered to ORNL in 1993 (150 GFLOPS peak)
• Cray C-90 (Vector Super)
  – 16-processor update of the Y-MP
  – Extremely popular, efficient, and expensive
• Thinking Machines CM-5 (MPP)
  – Up to 16K processors
  – 1024-node system at Los Alamos National Lab

Page 26: More 90's

• Distributed Shared Memory
  – KSR-1 (Kendall Square Research)
    • COMA (Cache Only Memory Architecture)
  – University Projects
    • Stanford DASH Processor (Hennessy)
    • MIT Alewife (Agarwal)
• Cray T3D/T3E: fast processor mesh with up to 512 Alpha CPUs

Page 27: What Can you Buy Today? (not an exhaustive list)

• IBM SP
  – Large MPP or Cluster
• SGI Origin 2000
  – Large Distributed Shared Memory Machine
• Sun HPC 10000
  – 64-Processor True Shared Memory
• Compaq Alpha Cluster
• Tera MTA
  – Multithreaded architecture (only one in existence)
• Cray SV-1 Vector Processor
• Fujitsu and Hitachi Vector Supers

Page 28: Clusters

• Poor man’s Supercomputer?

• A pile-of-PC’s

• Ethernet or high-speed (e.g., Myrinet) network

• Likely to be the dominant high-end architecture.

• Essentially a build-it-yourself MPP.

Page 29: Next Time …

• Flynn’s Taxonomy

• Bit-Serial, Vector, Pipelined Processors

• Interconnection Networks
  – Routing techniques
  – Embedding
  – Cluster interconnects

• Network Bisection