
Page 1: Title

CIS017-6 Distributed and Parallel Architectures

Revision (Parallel Architectures)

Jon Hitchcock

Page 2: Exam questions

• Past papers are available from the university library catalogue
  – Search for the unit code: CIS017-6
• Questions are selected from a pool
  – Each exam paper does not cover every topic
  – Each year questions are updated
  – New questions are added
• Newer questions tend to be broader
  – Less focus on remembering lists of facts
  – More opportunity to show what you have learnt and that you have a wide understanding of the subject
• This year there is no choice of questions
  – You should answer all the questions on the paper

Page 3: Possible wording of questions

• Compare THIS and THAT and critically discuss their advantages and disadvantages.

• Give an example of SOMETHING.

• Illustrate your answer with a diagram.

• Consider a SITUATION and suggest what can be done about it.

Page 4: Types of parallelism

• Parallelism in applications
  – Data parallelism
  – Task parallelism
• Parallelism in hardware
  – Bit-level parallelism
  – Instruction-level parallelism
  – Vector architectures and GPUs
  – Thread-level parallelism
  – Request-level parallelism
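
The application-level pair can be sketched in C with OpenMP (a minimal illustration, not taken from the slides; the function names scale and analyse are invented for this example): scale applies one operation to every array element (data parallelism), while analyse runs two different computations at the same time (task parallelism).

    #include <stdio.h>
    #include <omp.h>

    #define N 8

    /* Data parallelism: the same operation applied to different elements. */
    void scale(double *a, int n) {
        #pragma omp parallel for      /* each thread handles a slice of the array */
        for (int i = 0; i < n; i++)
            a[i] *= 2.0;
    }

    /* Task parallelism: different operations running at the same time. */
    void analyse(const double *a, int n, double *sum, double *max) {
        #pragma omp parallel sections
        {
            #pragma omp section       /* one thread computes the sum */
            {
                double s = 0.0;
                for (int i = 0; i < n; i++) s += a[i];
                *sum = s;
            }
            #pragma omp section       /* another thread finds the maximum */
            {
                double m = a[0];
                for (int i = 1; i < n; i++) if (a[i] > m) m = a[i];
                *max = m;
            }
        }
    }

    int main(void) {
        double a[N] = {1, 2, 3, 4, 5, 6, 7, 8}, sum, max;
        scale(a, N);
        analyse(a, N, &sum, &max);
        printf("sum=%.1f max=%.1f\n", sum, max);
        return 0;
    }

Compile with gcc -fopenmp; without the flag the pragmas are ignored and the program runs serially, which is part of OpenMP's design.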

Page 5: Flynn’s taxonomy

• SISD: Single Instruction, Single Data
• SIMD: Single Instruction, Multiple Data
• MIMD: Multiple Instruction, Multiple Data
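
As a rough illustration (not part of the slides, and assuming an x86 processor with SSE), the same array addition can be written in SISD and SIMD style in C; MIMD corresponds to independent instruction streams, as in the threaded and MPI examples later in this deck.

    #include <immintrin.h>   /* x86 SSE intrinsics */
    #include <stdio.h>

    #define N 8

    int main(void) {
        float a[N] = {1, 2, 3, 4, 5, 6, 7, 8};
        float b[N] = {8, 7, 6, 5, 4, 3, 2, 1};
        float c[N];

        /* SISD: each instruction operates on one data element. */
        for (int i = 0; i < N; i++)
            c[i] = a[i] + b[i];

        /* SIMD: one instruction (_mm_add_ps) operates on four floats at once. */
        for (int i = 0; i < N; i += 4) {
            __m128 va = _mm_loadu_ps(&a[i]);
            __m128 vb = _mm_loadu_ps(&b[i]);
            _mm_storeu_ps(&c[i], _mm_add_ps(va, vb));
        }

        for (int i = 0; i < N; i++) printf("%.0f ", c[i]);
        printf("\n");
        return 0;
    }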

Page 6: Parallel Computer Implementation

• Parallel computers can be roughly classified according to the level at which the hardware supports parallelism.
  – Multicore processor
  – Shared memory multiprocessor
  – Cluster
• These classes are not mutually exclusive
• Clusters of multicore processors are common
• Low-level implementations
  – Vector processor
  – Graphics processing unit
  – Spatial computing
• GPUs are often used for general-purpose parallel computation

Page 7: Memory Organisation

• Shared memory
  – Multicore processor
  – Shared memory multiprocessor
  – Parallel Random Access Machine (PRAM) model
    • Concurrent read and/or write access
    • Exclusive access
• Distributed memory
  – Cluster
  – Warehouse-scale computer
  – Passes messages to transfer data
  – Bulk Synchronous Parallel (BSP) model

Page 8: Parallel Programming

• Shared-state concurrency
  – Java
    • Threads
  – C++
    • Threads
    • Task-based concurrency
  – OpenMP
  – GPU programming
• Message-passing concurrency
  – Erlang
  – MPI
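
A minimal sketch of shared-state concurrency (the slides name Java and C++ threads; POSIX threads in C are used here to stay in one language, but the idea is the same): several threads update one shared counter, and a mutex makes the update safe.

    #include <pthread.h>
    #include <stdio.h>

    #define NTHREADS   4
    #define INCREMENTS 100000

    /* Shared state: all threads read and write the same counter. */
    static long counter = 0;
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    static void *worker(void *arg) {
        (void)arg;
        for (int i = 0; i < INCREMENTS; i++) {
            pthread_mutex_lock(&lock);    /* without this lock, updates are lost */
            counter++;
            pthread_mutex_unlock(&lock);
        }
        return NULL;
    }

    int main(void) {
        pthread_t t[NTHREADS];
        for (int i = 0; i < NTHREADS; i++)
            pthread_create(&t[i], NULL, worker, NULL);
        for (int i = 0; i < NTHREADS; i++)
            pthread_join(t[i], NULL);
        printf("counter = %ld (expected %d)\n", counter, NTHREADS * INCREMENTS);
        return 0;
    }

Compile with gcc -pthread. Removing the mutex makes the final count nondeterministic, which is exactly the hazard that shared-state concurrency must manage.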

Page 9: Performance Analysis of Parallel Systems

• Scalability
  – Speedup
    • Linear
    • Superlinear
  – Efficiency
    • Amdahl’s law
    • Gustafson’s law
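
The standard definitions behind these terms are worth memorising for the exam (standard results, not printed on the slide, written here in LaTeX; f denotes the serial fraction of the work and p the number of processors):

    % Speedup and efficiency, for serial time T_1 and time T_p on p processors:
    S(p) = \frac{T_1}{T_p}, \qquad E(p) = \frac{S(p)}{p}

    % Amdahl's law (fixed problem size): speedup is bounded by the serial fraction
    S(p) = \frac{1}{f + (1 - f)/p} \;\le\; \frac{1}{f}

    % Gustafson's law (problem size grows with p): scaled speedup
    S(p) = f + (1 - f)\,p = p - f\,(p - 1)

Linear speedup means S(p) = p, so E(p) = 1; superlinear speedup, S(p) > p, can occur in practice, for example when the partitioned data starts to fit into per-processor caches.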

Page 10: OpenMP

• Parallel computing on shared memory systems
• Directives
  – Control Structures
    • parallel
  – Work Sharing
    • sections
    • for
  – Synchronisation
    • barrier
• Regions and loops
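
The directives combine into a minimal C sketch (an illustration, not taken from the unit materials): parallel opens a region, for shares loop iterations among the threads, and barrier makes every thread wait before continuing.

    #include <omp.h>
    #include <stdio.h>

    #define N 16

    int main(void) {
        int a[N], b[N];

        #pragma omp parallel            /* control structure: start a parallel region */
        {
            #pragma omp for             /* work sharing: split iterations over threads */
            for (int i = 0; i < N; i++)
                a[i] = i * i;

            /* An "omp for" already ends with an implicit barrier; the explicit
               directive is shown for illustration. */
            #pragma omp barrier         /* synchronisation: wait for all threads */

            #pragma omp for
            for (int i = 0; i < N; i++)
                b[i] = a[N - 1 - i];    /* safe: reads values written by other threads */
        }

        for (int i = 0; i < N; i++) printf("%d ", b[i]);
        printf("\n");
        return 0;
    }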

Page 11: MPI

• Message passing (a distributed computing style of communication) used for parallel computing
• Collective communications
  – One-to-many (broadcast, scatter)
  – Many-to-one (reduce, gather)
  – Many-to-many (prefix sum, total exchange, circular shift)
• Synchronous message passing
  – Three-way signalling process
    • Request to send
    • Ready-to-accept acknowledgement
    • Message transfer
  – “Blocking” Functions in MPI
    • MPI_Send and MPI_Recv return when it is safe to continue

Page 12: See also

• The essential reading from Hennessy and Patterson (2012)
  – in BREO weeks 2, 3 and 4
• The essential reading from Greaves (2015)
  – in BREO week 3
• Tutorial questions
  – in BREO weeks 2 and 4