CS 600.416 Transaction Processing Lecture 18 Parallelism.

Transcript of CS 600.416 Transaction Processing, Lecture 18: Parallelism.

Page 1: CS 600.416 Transaction Processing Lecture 18 Parallelism.

CS 600.416 Transaction Processing

Lecture 18

Parallelism

Page 2: CS 600.416 Transaction Processing Lecture 18 Parallelism.

Motivation for Parallel Databases

• Extremely large data sets
  – Special application needs: computer-aided design, the World Wide Web

• Queries that have large data requirements
  – Decision support systems, statistical analysis

• Inherent parallelism in data
  – Set-oriented nature of relations

• Commoditization of parallel computers
  – 2- and 4-processor SMPs are commonplace
  – Clustering software for multiple SMPs is freely available
  – Weak point in the argument in light of mainframe OSes

Page 3: CS 600.416 Transaction Processing Lecture 18 Parallelism.

Motivation for Parallel Databases

• Two major reasons for parallel databases, from the previous slide

• Large data sets, applications, and queries
  – Because we need it

• Parallel computers and a feasible application domain
  – Because we can

Page 4: CS 600.416 Transaction Processing Lecture 18 Parallelism.

Motivation Reality Check

• We have always needed parallel DBs
  – DBs have always stretched the capabilities of computer architectures
  – Enterprises have always grown to match a DB's capabilities

• Distribution cannot really solve the problem
  – Replication and latency concerns
    • As we learned from the paper last week
  – Isolation problems
    • Fault and performance isolation

• One big computer is more powerful than two equivalent small computers
  – Parallel machines look like one big computer from the outside

Page 5: CS 600.416 Transaction Processing Lecture 18 Parallelism.

Parallelism

Theoretically, executing a task T on an n-processor system Pn should be n times faster than executing it on a single processor P1.

[Figure: a task T split into n equal-sized subtasks T1, T2, …, Tn, one per processor P1, P2, …, Pn of an n-processor system]

Page 6: CS 600.416 Transaction Processing Lecture 18 Parallelism.

Parallelism

• Hardware Parallelism
  – Parallelism "available" as a result of the existing resources
  – E.g., multiprocessors, RAID arrays, etc.

• Software Parallelism
  – Parallelism that can be "discovered" in an application
  – E.g., parallel algorithms, programming style, compiler optimization

Page 7: CS 600.416 Transaction Processing Lecture 18 Parallelism.

Speedup, Efficiency, and Scaleup

• Definition:
  – T(p, N) = time to solve a problem of size N on p processors

• Speedup:
  – S(p, N) = T(1, N) / T(p, N)
  – Compute the same problem with more processors in a shorter time

• Efficiency:
  – E(p, N) = S(p, N) / p

• Scaleup:
  – Sc(p, N) = N / n, where n is chosen so that T(1, n) = T(p, N)
  – Compute a larger problem with more processors in the same time

• Problems:
  – Is S(p, N) close to p, or far less? Far less means sublinear speedup (see the sketch below)
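
To make the definitions concrete, here is a minimal Python sketch (not from the lecture; the timing and size numbers are invented for illustration) that computes speedup, efficiency, and scaleup from hypothetical measurements:

```python
# Hypothetical illustration of the speedup/efficiency/scaleup definitions above.
# The numbers are invented; in practice T(p, N) would come from measurement.

def speedup(t_1, t_p):
    """S(p, N) = T(1, N) / T(p, N): same problem, more processors, shorter time."""
    return t_1 / t_p

def efficiency(t_1, t_p, p):
    """E(p, N) = S(p, N) / p: fraction of ideal linear speedup achieved."""
    return speedup(t_1, t_p) / p

def scaleup(big_n, small_n):
    """Sc(p, N) = N / n, where T(1, n) = T(p, N): how much larger a problem
    the p-processor system solves in the same time 1 processor needs for size n."""
    return big_n / small_n

# Example: 1 processor takes 100 s on a problem of size N; 8 processors take 16 s.
print(speedup(100.0, 16.0))        # 6.25     (sublinear: less than p = 8)
print(efficiency(100.0, 16.0, 8))  # 0.78125
# Example: in the same 100 s, 1 processor handles 1 GB while 8 processors handle 7 GB.
print(scaleup(7.0, 1.0))           # 7.0      (near-linear scaleup)
```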

Page 8: CS 600.416 Transaction Processing Lecture 18 Parallelism.

Scaleup

• Two kinds:
  – Batch scaleup
    • The size of the task increases
    • E.g., the size of the database increases, and the cost of a sequential scan increases proportionately
  – Transaction scaleup
    • The rate of submission of tasks increases
    • Each task may still be short-lived

Page 9: CS 600.416 Transaction Processing Lecture 18 Parallelism.

Scaleup, Speedup

[Figure: two plots. Left: speed (Ts/Tl) versus resources, contrasting linear speedup with sublinear speedup. Right: scaleup versus problem size, contrasting linear scaleup with sublinear scaleup]

Page 10: CS 600.416 Transaction Processing Lecture 18 Parallelism.

Factors against parallelism

• Startup
  – Thousands of processes can make startup costs significant

• Interference
  – Synchronization, communication
  – Even 1% contention limits speedup to 37x

• Skew
  – Efficient load balancing is hard
  – At fine granularity, the variance can exceed the mean time to finish one parallel step (see the sketch below)
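
A parallel step finishes only when its slowest task does, so the maximum load, not the mean, determines elapsed time. A small illustrative Python sketch of this point, with invented per-processor task times:

```python
# Illustration of skew: the elapsed time of a parallel step is the maximum
# per-processor load, not the mean. Task times below are invented.

task_times = [1.0, 1.1, 0.9, 1.0, 4.0, 1.0, 0.95, 1.05]  # one task per processor

mean_time = sum(task_times) / len(task_times)   # 1.375
step_time = max(task_times)                     # 4.0: the skewed task dominates

ideal_speedup = len(task_times)                 # 8 if perfectly balanced
actual_speedup = sum(task_times) / step_time    # 2.75 with this skew
print(mean_time, step_time, ideal_speedup, actual_speedup)
```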

Page 11: CS 600.416 Transaction Processing Lecture 18 Parallelism.

When is parallelism available?

• Good if:
  – Operations access a significant amount of data, e.g., joins of large tables, bulk inserts, aggregation, copying, large queries, etc.
  – Symmetric multiprocessors
  – Sufficient I/O bandwidth, underutilized or intermittently used CPUs

• Bad if:
  – Query executions/transactions are short-lived
  – CPU, memory, and I/O resources are heavily utilized

• Software parallelism should utilize hardware parallelism

Page 12: CS 600.416 Transaction Processing Lecture 18 Parallelism.

Parallel Architectures

• Stonebraker's simple taxonomy for parallel architectures:
  – Shared memory: processors share common memory
  – Shared disk / clusters: processors share a common set of disks
  – Shared nothing: processors share only the network
  – Hierarchical: a hybrid of the architectures above

Page 13: CS 600.416 Transaction Processing Lecture 18 Parallelism.

Shared Memory

[Figure: several processors P connected to a shared memory M]

Processors share common memory

Common in SMP systems

Page 14: CS 600.416 Transaction Processing Lecture 18 Parallelism.

Shared Nothing

• Pros
  – Cost
    • Inexpensive computers can be used to build such a system
  – Extensibility
    • Promotes incremental growth
  – Availability
    • Redundancy can be introduced by replicating data

• Cons
  – Complexity
    • Distributed database concepts in a parallel setup
  – Difficult to achieve load balancing
    • Relies on software parallelism

Page 15: CS 600.416 Transaction Processing Lecture 18 Parallelism.

Shared Disk

[Figure: processors P, each with its own memory M, connected to a common set of disks]

Processors share a common set of disks

Common in clusters

Network-attached I/O protocols make this architecture more readily available

Page 16: CS 600.416 Transaction Processing Lecture 18 Parallelism.

Shared Disk

• Features
  – Shared disk access but exclusive memory access
  – Global locking protocols are needed

• Pros
  – Cost
    • Lower, as standard I/O interconnects can be used
  – Extensibility
    • Interference is minimized by each processor's exclusive memory and cache
  – Availability
    • A degree of fault tolerance in both the processor subsystem and the disks

• Cons
  – Highly complex
  – The shared disk is a potential bottleneck

Page 17: CS 600.416 Transaction Processing Lecture 18 Parallelism.

Shared Nothing

[Figure: nodes, each with its own processor P and memory M, connected only by a network]

Network sharing only

Parallelism available without any hardware support

Page 18: CS 600.416 Transaction Processing Lecture 18 Parallelism.

Shared Memory

• Pros
  – Fast processor-to-processor communication
    • No software primitives required
  – Simplicity
    • Metadata and control information are shared by all processors

• Cons
  – Cost
    • Expensive interconnect
  – Limited extensibility
    • The shared memory soon becomes a bottleneck
    • Limited to 10-20 processors
  – Cache coherency
  – Low availability
    • Availability depends on the robustness of the shared memory

Page 19: CS 600.416 Transaction Processing Lecture 18 Parallelism.

So far…

• Parallelism and its measures
• Problems with parallelism…
• Parallel architectures…

Page 20: CS 600.416 Transaction Processing Lecture 18 Parallelism.

I/O and Databases

• What's important about I/O
  – Reminder: the performance measure for all DBs is the number of I/Os
  – For the most part, it is the only thing that matters

• Why is I/O inherently parallel?
  – Even a machine with one processor has multiple disks
  – Placement of data on these disks greatly affects performance

• What does this tell us about parallel DBs?
  – Parallelism is not necessarily about supercomputers; it occurs at many levels in computer systems
  – Every system has some degree of parallelism, even if it is only scheduling the different processing units within a CPU

Page 21: CS 600.416 Transaction Processing Lecture 18 Parallelism.

I/O or Disk Parallelism

• Partition data onto multiple disks
  – Most frequently horizontal partitioning
  – Conduct I/O to all disks at the same time

• Techniques (see the sketch below)
  – Round-robin: send the i-th tuple to disk i mod n in an n-disk system
  – Hash partitioning: send tuple t to disk h(t), where h is a hash function that distributes tuples uniformly
  – Range partitioning: break tuples up into contiguous ranges of keys; requires a key that can be ordered linearly
  – Multi-dimensional partitioning strategies: used for spatial data, images, and other multi-dimensional data sets; much recent work
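
A minimal Python sketch of the round-robin and hash placement rules above (the 4-disk setup and the toy relation are invented; range partitioning is sketched under the Range partitioning slide below):

```python
# Round-robin and hash partitioning over n disks; a toy illustration of the
# placement rules above (the relation and n = 4 disks are invented).

NUM_DISKS = 4

def round_robin_disk(i: int) -> int:
    """Send the i-th tuple to disk i mod n."""
    return i % NUM_DISKS

def hash_disk(partitioning_value) -> int:
    """Send a tuple to disk h(v), where v is its partitioning-attribute value."""
    return hash(partitioning_value) % NUM_DISKS

tuples = [("alice", 100), ("bob", 250), ("carol", 175), ("dave", 90), ("erin", 300)]

for i, (name, balance) in enumerate(tuples):
    print(name, "round-robin ->", round_robin_disk(i), "hash ->", hash_disk(name))
```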

Page 22: CS 600.416 Transaction Processing Lecture 18 Parallelism.

Workloads

• Several important/expected workloads
  – Scanning the entire relation
  – Locating a tuple (identity query)
  – Locating a set of tuples based on an attribute value
    • Range query, e.g., 100 < a < 200
    • Find all people whose names start with A
      – Note this is not an identity query

Page 23: CS 600.416 Transaction Processing Lecture 18 Parallelism.

Range partitioning

• Partitioning requires a partitioning attribute A, usually the primary key
• A partitioning vector of n-1 values divides A into n ranges
  – Vector {v0, v1, …, vn-2}

• Each tuple t goes into (see the sketch below):
  – Partition 0 if t[A] < v0
  – Partition k if vk-1 <= t[A] < vk, for 1 <= k <= n-2
  – Partition n-1 if t[A] >= vn-2

• Simple range partitioning: #partitions = #disks
• Combined with round-robin: #partitions = k * #disks
  – Has the benefit of avoiding variance within any one partition
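
A minimal Python sketch of the placement rule above, using the standard-library bisect module to find a tuple's partition from the partitioning vector (the example vector and values are invented):

```python
import bisect

# Range partitioning: a vector [v0, v1, ..., v_{n-2}] defines n partitions.
# bisect_right returns how many boundary values are <= t[A], which is exactly
# the partition index under the rule on this slide.

def range_partition(value, vector):
    return bisect.bisect_right(vector, value)

vector = [100, 200, 300]          # invented boundaries: 4 partitions
for v in [50, 100, 150, 250, 999]:
    print(v, "->", range_partition(v, vector))
# 50 -> 0, 100 -> 1, 150 -> 1, 250 -> 2, 999 -> 3
```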

Page 24: CS 600.416 Transaction Processing Lecture 18 Parallelism.

Some Practicalities

• Disk blocks are what we partition
  – Block size is generally a tradeoff between I/O performance and space utilization
    • Bigger blocks are better for performance: more data per I/O
    • Bigger blocks fragment data, leading to poor space utilization
  – Blocks are generally set to the page size
    • Bigger than we would like
    • Often lots of space is fragmented (> 50% in file systems)

• What is the problem with larger blocks?
  – Small relations don't get placed on as many disks, so less parallelism

• What is the problem with small blocks?
  – Pages are what OSes read
  – Performance suffers

• Some applications with known large data use larger block sizes
  – Particularly scientific applications

Page 25: CS 600.416 Transaction Processing Lecture 18 Parallelism.

Workloads: Round Robin

• Ups
  – Good for scans: sequential, parallel, entirely load balanced
  – What about unfairness in the tail (if you always start on the same block)?
    • Randomize the start block
    • Use a next-block policy

• Downs
  – Identity queries search n blocks (n/2 on average if the item always exists and is a key; n blocks if it is not a key, or to establish that it is not found)
  – Range queries search n blocks; there is no relationship between key value and placement

Page 26: CS 600.416 Transaction Processing Lecture 18 Parallelism.

Workloads: Hash Partition

• Ups
  – Good for identity queries
    • Isolates the query to a single disk
  – Good for sequential scans
    • Low variance in hashing: Ω(log t) for relations with cardinality t
    • Essentially d (the number of disks) times speedup over a single-disk system (actually d / (1 + Ω(log t)))

• Downs
  – Bad for range queries: search n blocks
  – Bad for identity queries on non-partitioning attributes
    • E.g., partition/hash on SS# and look up by last name

Page 27: CS 600.416 Transaction Processing Lecture 18 Parallelism.

Workloads: Range Partition

• Ups
  – Good for identity queries
    • Isolates the query to a single "data" disk/block
    • Must generally read another block to get the range information, which hash partitioning does not require
      – Indices can be large

• Ambiguous
  – Range queries
    • Good performance when queries access few items
      – Isolates queries to one or a few disks
      – Allows other queries to run in parallel on the other disks
    • Bad when accessing lots of data items
      – Can localize traffic to a few disks, creating a hot spot
    • Really, the good outweighs the bad here

Page 28: CS 600.416 Transaction Processing Lecture 18 Parallelism.

Handling Skew

• Attribute-value skew: when lots of tuples are clustered around the same (or nearly the same) value
  – Occurs in range partitioning
  – Imagine a relation with 2 distinct values of the partitioning attribute and k disks
    • Only two disks will be used (see the sketch below)

• Partition skew: load imbalance even when there is no attribute-value skew
  – O(log t) for t tuples in hash partitioning, so not a problem there
  – Arises from a poorly constructed range vector
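
A toy Python illustration of the attribute-value-skew example above: when the partitioning attribute takes only two distinct values, hash partitioning can place tuples on at most two of the k disks (the data and k = 8 are invented):

```python
from collections import Counter

# Attribute-value skew: a relation whose partitioning attribute takes only two
# distinct values can occupy at most two of the k disks. Data below is invented.

K_DISKS = 8
values = ["yes"] * 900 + ["no"] * 100     # only two distinct attribute values

placement = Counter(hash(v) % K_DISKS for v in values)
print(placement)                          # at most two disks receive any tuples
print("disks used:", len(placement), "of", K_DISKS)
```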

Page 29: CS 600.416 Transaction Processing Lecture 18 Parallelism.

Constructing a Range Vector

• A balanced range-partitioning vector can be constructed by (see the sketch below):
  – Sorting the existing tuples – but this incurs I/O costs for the sort and does not keep the partitioning balanced as new inserts arrive
  – Using a B-tree – but this limits the occupancy of tuples in disk blocks, which ultimately limits I/O performance
  – Statistics (histograms) – keep counts of values in buckets of values, but this has problems of attribute-value skew within buckets and of estimation
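
A minimal Python sketch of the first approach: sorting the existing attribute values and cutting at equal-count positions to get a balanced vector (the attribute values are invented; as noted above, the balance degrades as new tuples are inserted):

```python
# Build a balanced range-partitioning vector by sorting the existing attribute
# values and cutting at equal-count positions (the "sort" approach above).

def build_range_vector(values, num_partitions):
    ordered = sorted(values)                      # this is where the I/O cost lives
    step = len(ordered) // num_partitions
    # Boundaries v0 .. v_{n-2}: one cut point between each pair of partitions.
    return [ordered[step * k] for k in range(1, num_partitions)]

values = [7, 42, 3, 99, 15, 23, 8, 61, 54, 30, 12, 77]   # invented attribute values
print(build_range_vector(values, 4))   # [12, 30, 61]: 3 tuples land in each range
```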

Page 30: CS 600.416 Transaction Processing Lecture 18 Parallelism.

Virtual Processor Technique

• Create many virtual processors and map ranges to virtual processors

• Assign the virtual processors to real processors (see the sketch below)
  – This eliminates skew, because each real processor handles many virtual processors, whose combined load is more likely to be close to the mean

• Allows a system to use a "poor" range partition and not have problems with skew
  – Generally, DBs use histograms together with virtual processors
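
A minimal Python sketch of the virtual-processor idea: many fine-grained range partitions (virtual processors) are assigned, here simply round-robin, to a few real processors, so each real processor's total load lands near the mean even though individual ranges are skewed (all loads are invented):

```python
# Virtual processors: map many (possibly skewed) range partitions onto a few
# real processors so that per-processor loads average out. Loads are invented.

NUM_REAL = 4
virtual_loads = [5, 80, 7, 6, 60, 8, 9, 5, 6, 7, 70, 6, 5, 8, 7, 90]  # skewed ranges

real_loads = [0] * NUM_REAL
for vp, load in enumerate(virtual_loads):
    real_loads[vp % NUM_REAL] += load    # simple round-robin assignment (an assumption)

print(real_loads)                        # [76, 103, 93, 107] for these loads
print(sum(virtual_loads) / NUM_REAL)     # 94.75: each real processor is near the mean
```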

Page 31: CS 600.416 Transaction Processing Lecture 18 Parallelism.

Lessons Learned

• Parallelism is important, even for single machines

• Disk-based parallelism is the most important kind of parallelism
  – I/O is the bottleneck in databases
  – Not entirely true: networking is starting to become the bottleneck in distributed TP applications

• Know thy data