Parallel Computer Architecture
Taylor Hearn, Fabrice Bokanya, Beenish Zafar, Mathew Simon, Tong Chen

Transcript of the presentation:

  • Slide 1
  • Parallel Computer Architecture Taylor Hearn, Fabrice Bokanya, Beenish Zafar, Mathew Simon, Tong Chen
  • Slide 2
  • What is Parallel Computing? Definition: Parallel computing is a type of computing architecture in which several processors execute or process an application or computation simultaneously. Two ingredients are involved: the compute resources and the computational problem.
  • Slide 3
  • Why Parallel Computing? Parallelism is found throughout the natural world, so parallel computing is well suited to modeling real-world phenomena. Reasons for using parallel computing: save time and/or money; solve larger or more complex problems; provide concurrency; take advantage of non-local resources; achieve cost savings; overcome memory constraints.
  • Slide 4
  • Types of Parallelism: bit-level (ALU) parallelism, based on increasing processor word size; instruction-level parallelism (pipelining), the execution of several instructions simultaneously; thread-level parallelism, splitting a program into parts and running them side by side.
  • Slide 5
  • Hardware Architecture
  • Slide 6
  • Flynn's Classical Taxonomy: SISD (Single Instruction, Single Data), MISD (Multiple Instruction, Single Data), SIMD (Single Instruction, Multiple Data), MIMD (Multiple Instruction, Multiple Data). Proposed by Michael J. Flynn in 1966, it classifies parallel computer architectures by the number of instruction and data streams available: a machine can have one or multiple instruction streams and one or multiple data streams.
  • Slide 7
  • SISD (Single Instruction, Single Data): uniprocessors. Simple to design, but not as flexible as MIMD.
  • Slide 8
  • SIMD (Single Instruction, Multiple Data): array processors. All processing elements must execute the same instruction simultaneously, each on its own data, as in the loop sketched below.
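As an illustration (not part of the original slides; the function and array names are made up), a loop of the following shape is what a SIMD machine, or a modern CPU's vector unit, can execute as a single instruction applied to several data elements at once:

    #include <stddef.h>

    /* Same operation on every element: with SIMD hardware (and, for example,
     * GCC's -O3 auto-vectorization) several additions happen per instruction. */
    void vector_add(const float *a, const float *b, float *c, size_t n)
    {
        for (size_t i = 0; i < n; i++)
            c[i] = a[i] + b[i];
    }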
  • Slide 9
  • MISD (Multiple Instruction, Single Data): multiple instruction streams operate on a single data stream; very few practical machines use this organization.
  • Slide 10
  • MIMD (Multiple Instruction, Multiple Data): each processor has its own independent instruction and data stream. Can be further separated into shared memory and distributed memory machines.
  • Slide 11
  • Flynn's Classical Taxonomy. Advantages: it is the most widely accepted classification. Disadvantages: there are very few applications of MISD machines; it assumes parallelism is homogeneous; and it gives no consideration to how processors are connected or how they view memory within the MIMD category.
  • Slide 12
  • Memory Architectures
  • Slide 13
  • Shared Memory. General characteristics: shared memory parallel computers vary widely, but they generally share the ability for all processors to access all memory as a global address space. Multiple processors can operate independently while sharing the same memory resources, and changes that one processor makes to a memory location are visible to all other processors. Historically, shared memory machines have been classified as UMA (Uniform Memory Access) and NUMA (Non-Uniform Memory Access), based on memory access times. UMA is most commonly found in Symmetric Multiprocessor (SMP) machines: identical processors with equal access and equal access times to memory. NUMA machines are often made by physically linking two or more SMPs, so that one SMP can directly access the memory of another; not all processors have equal access time to all memories, and memory access across the link is slower.
  • Slide 14
  • Distributed Memory. General characteristics: distributed memory systems require a communication network to connect inter-processor memory. Each processor has its own local memory, and memory addresses in one processor do not map to another processor, so there is no concept of a global address space across all processors. Because each processor has its own local memory, it operates independently; changes it makes to its local memory have no effect on the memory of other processors. Advantages: memory is scalable with the number of processors; increase the number of processors and the size of memory increases proportionately. Disadvantages: the programmer is responsible for many of the details of data communication between processors, and it may be difficult to map existing data structures, based on global memory, to this memory organization.
  • Slide 15
  • Hybrid Distributed-Shared Memory. The largest and fastest computers in the world today employ both shared and distributed memory architectures. The distributed memory component is the networking of multiple shared memory/GPU machines, which know only about their own memory, not the memory on another machine; network communications are therefore required to move data from one machine to another. Advantages and disadvantages: whatever applies to shared and distributed memory architectures individually applies here as well; increased scalability is an important advantage, and increased programmer complexity is an important disadvantage.
  • Slide 16
  • Parallel Programming Models
  • Slide 17
  • Shared Memory Model. 1. In the shared-memory programming model, tasks share a common address space, which they read and write asynchronously. 2. Various mechanisms such as locks and semaphores may be used to control access to the shared memory, as in the sketch below. 3. An advantage of this model from the programmer's point of view is that the notion of data "ownership" is lacking, so program development can often be simplified. 4. An important disadvantage in terms of performance is that it becomes more difficult to understand and manage data locality.
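As a minimal sketch of points 1 and 2 (not from the slides; the names counter, worker, and N_THREADS are illustrative), the C fragment below lets several POSIX threads update one shared variable, with a mutex controlling access to it:

    #include <pthread.h>
    #include <stdio.h>

    #define N_THREADS 4

    static long counter = 0;                        /* lives in the shared address space */
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    static void *worker(void *arg)
    {
        (void)arg;
        for (int i = 0; i < 100000; i++) {
            pthread_mutex_lock(&lock);              /* the lock controls access ...      */
            counter++;                              /* ... to the shared memory location */
            pthread_mutex_unlock(&lock);
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t t[N_THREADS];
        for (int i = 0; i < N_THREADS; i++)
            pthread_create(&t[i], NULL, worker, NULL);
        for (int i = 0; i < N_THREADS; i++)
            pthread_join(t[i], NULL);
        printf("counter = %ld\n", counter);         /* 400000 with the lock in place     */
        return 0;
    }

(Compile with cc -pthread. Without the mutex, the concurrent increments would race and the final count would be unpredictable.)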
  • Slide 18
  • Threads Model. 1. In the threads model of parallel programming, a single process can have multiple, concurrent execution paths. 2. Threads are commonly associated with shared memory architectures and operating systems. 3. Unrelated standardization efforts have resulted in two very different implementations of threads: POSIX Threads and OpenMP (see the sketch below). 4. Microsoft has its own threads implementation, which is related to neither the UNIX POSIX standard nor OpenMP.
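For the OpenMP side of point 3, a hedged sketch (array size and names are illustrative): a single process, where the compiler directive asks the runtime to split the loop iterations across a team of threads.

    #include <omp.h>
    #include <stdio.h>

    int main(void)
    {
        double a[1000];

        /* One process, multiple concurrent execution paths: the loop
         * iterations are divided among the threads in the team.       */
        #pragma omp parallel for
        for (int i = 0; i < 1000; i++)
            a[i] = 2.0 * i;

        printf("a[999] = %.1f, max threads = %d\n", a[999], omp_get_max_threads());
        return 0;
    }

(Compile with cc -fopenmp.)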
  • Slide 19
  • Message Passing Model. 1. In a message passing model, parallel tasks exchange data by passing messages to one another; these communications can be asynchronous or synchronous. 2. The Communicating Sequential Processes (CSP) formalization of message passing employed communication channels to 'connect' processes and led to a number of important languages such as Joyce, Occam, and Erlang. 3. The Message Passing Interface (MPI) is now the de facto industry standard for message passing, replacing virtually all other message passing implementations used for production work; most, if not all, popular parallel computing platforms offer at least one implementation of MPI, and a few offer a full implementation of MPI-2. (A minimal send/receive sketch follows below.)
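A minimal MPI sketch (assuming an MPI implementation such as MPICH or Open MPI is available; the message value and tag are illustrative): rank 0 sends one integer to rank 1 through a matching send/receive pair.

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, value = 0;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            value = 42;
            MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);     /* to rank 1   */
        } else if (rank == 1) {
            MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);                            /* from rank 0 */
            printf("rank 1 received %d\n", value);
        }

        MPI_Finalize();
        return 0;
    }

(Typically built with mpicc and launched with mpirun -np 2 ./a.out.)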
  • Slide 20
  • Data Parallel Model. The address space is treated globally. The model focuses on performing operations on a data set: a set of tasks works simultaneously, each on a different partition of the same data structure, and each task performs the same operation on its partition, as sketched below.
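A sketch of the data parallel idea, using OpenMP as one possible vehicle (names and sizes are illustrative): every thread applies the same operation, squaring, to its own partition of one shared array.

    #include <stdio.h>

    #define N 8

    int main(void)
    {
        double data[N] = {1, 2, 3, 4, 5, 6, 7, 8};

        /* The iteration space is split into contiguous partitions, one per
         * thread; every thread performs the same operation on its chunk.   */
        #pragma omp parallel for schedule(static)
        for (int i = 0; i < N; i++)
            data[i] = data[i] * data[i];

        for (int i = 0; i < N; i++)
            printf("%g ", data[i]);
        printf("\n");
        return 0;
    }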
  • Slide 21
  • References: Blaise Barney, "Introduction to Parallel Computing," Lawrence Livermore National Laboratory, https://computing.llnl.gov/tutorials/parallel_comp/ (sections #WhyUse, #MemoryArch, #ModelsData). Linda Null and Julia Lobur (2012), "Computer Organization and Architecture," Third Edition, Alternative Architectures, 505-541. http://en.wikipedia.org/wiki/Parallel_programming_model. cosy.univ-reims.fr/~fnolot/.../introduction_to_parallel_computing.ppt