Ordered Record Collection

Chris Douglas, Yahoo!

Transcript of Ordered Record Collection

Sort of Vinyl: Ordered Record Collection
Chris Douglas
01.18.2010

Obligatory MapReduce Flow Slide

hdfs://host:8020/input/data (HDFS) → Split 0 / Split 1 / Split 2 → Map 0 / Map 1 / Map 2 → Combine* → Reduce 0 / Reduce 1 → HDFS → hdfs://host:8020/output/data

Map Output Collection

Overview

Hadoop (∞, 0.10)    Hadoop [0.10, 0.17)    Hadoop [0.17, 0.22)
Lucene              HADOOP-331             HADOOP-2919
Cretaceous          Jurassic               Triassic

Awesome!

Problem Description

map(K1,V1)
  *collect(K2,V2)
    p0 = partition(key0, val0)                              (int)
    Serialization:
      K2.write(DataOutput) → *write(byte[], int, int)       (key0: byte[])
      V2.write(DataOutput) → *write(byte[], int, int)       (val0: byte[])

For all calls to collect(K2 keyn, V2 valn):
• Store result of partition(K2 keyn, V2 valn)
• Ordered set of write(byte[], int, int) for keyn
• Ordered set of write(byte[], int, int) for valn

Challenges:
• Size of key/value unknown a priori
• Records must be grouped for efficient fetch from reduce
• Sort occurs after the records are serialized
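
The contract above can be made concrete with a small sketch. This is illustrative Java, not Hadoop's classes: Writable, Partitioner, and NaiveCollector are hypothetical stand-ins showing how each collect() call partitions the record, streams the key and value through write(DataOutput) into a shared byte buffer, and remembers where each serialized key and value landed (their sizes are only known after serialization).

    import java.io.ByteArrayOutputStream;
    import java.io.DataOutputStream;
    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;

    // Hypothetical stand-ins for Hadoop's Writable and Partitioner interfaces.
    interface Writable { void write(java.io.DataOutput out) throws IOException; }
    interface Partitioner<K, V> { int getPartition(K key, V value, int numPartitions); }

    class NaiveCollector<K extends Writable, V extends Writable> {
      // One growable byte buffer holds all serialized records back to back.
      private final ByteArrayOutputStream buf = new ByteArrayOutputStream();
      private final DataOutputStream out = new DataOutputStream(buf);
      // Per-record metadata: (partition, key start, value start, record end).
      private final List<int[]> meta = new ArrayList<>();
      private final Partitioner<K, V> partitioner;
      private final int numPartitions;

      NaiveCollector(Partitioner<K, V> partitioner, int numPartitions) {
        this.partitioner = partitioner;
        this.numPartitions = numPartitions;
      }

      void collect(K key, V value) throws IOException {
        int p = partitioner.getPartition(key, value, numPartitions);
        int keyStart = out.size();
        key.write(out);       // the key's ordered set of write(byte[], int, int) calls
        int valStart = out.size();
        value.write(out);     // the value's ordered set of write(byte[], int, int) calls
        // Lengths are only known after serialization: record where everything landed.
        meta.add(new int[] { p, keyStart, valStart, out.size() });
      }
    }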

Overview

Hadoop (∞, 0.10)    Hadoop [0.10, 0.17)    Hadoop [0.17, 0.22)

Lucene HADOOP-331 HADOOP-2919

Cretaceous Jurassic Triassic

Hadoop (∞, 0.10)

map(K1,V1)
  *collect(K2,V2)
    p0 = partition(key0, val0)
    SequenceFile::Writer[p0].append(key0, val0)
      key0.write(localFS)
      val0.write(localFS)

Not necessarily true: the SequenceFile may buffer a configurable amount of data to effect block compression, stream buffering, etc.
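
A rough sketch of this shape of collect(): one writer per partition, each append going more or less straight to local storage (modulo the buffering noted above). PartitionFileWriter and OldCollector are illustrative stand-ins, not the actual SequenceFile API; Writable and Partitioner are the stand-ins from the earlier sketch.

    import java.io.DataOutputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;

    // Illustrative stand-in for SequenceFile::Writer: appends key/value pairs to one local file.
    class PartitionFileWriter {
      private final DataOutputStream out;
      PartitionFileWriter(String path) throws IOException {
        out = new DataOutputStream(new FileOutputStream(path));
      }
      void append(Writable key, Writable value) throws IOException {
        key.write(out);     // key0.write(localFS)
        value.write(out);   // val0.write(localFS)
      }
      void close() throws IOException { out.close(); }
    }

    class OldCollector<K extends Writable, V extends Writable> {
      private final PartitionFileWriter[] writers;   // one open file per reduce partition
      private final Partitioner<K, V> partitioner;

      OldCollector(Partitioner<K, V> partitioner, int numReduces, String dir) throws IOException {
        this.partitioner = partitioner;
        writers = new PartitionFileWriter[numReduces];
        for (int p = 0; p < numReduces; p++) {
          writers[p] = new PartitionFileWriter(dir + "/part-" + p);
        }
      }

      void collect(K key, V value) throws IOException {
        int p = partitioner.getPartition(key, value, writers.length);
        writers[p].append(key, value);   // SequenceFile::Writer[p0].append(key0, val0)
      }
    }

Note the cost the Con slide calls out below: a job with 7k reducers keeps 7k writers, and later 7k files, per map.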

Hadoop (∞, 0.10)

With a combiner, collect(K2,V2) does not append directly; each record is cloned and buffered by key, and flush() runs the combiner before appending:

map(K1,V1)
  *collect(K2,V2)
    p0 = partition(key0, val0)
    clone(key0, val0)            → buffered under key0, key1, key2, …
  flush()
    reduce(keyn, val*)
    SequenceFile::Writer[p0].append(keyn’, valn’)

The Combiner may change the partition and ordering of input records. This is no longer supported.
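
One plausible reading of this slide as code, again with hypothetical types rather than Hadoop's: every collected record is cloned into a sorted in-memory map, and flush() runs the combiner over each group before handing its output downstream to be re-partitioned and appended, which is why these old semantics tolerated a combiner that changed partition or ordering.

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map;
    import java.util.TreeMap;

    class CombiningCollector<K extends Comparable<K>, V> {
      interface Combiner<K, V> { void reduce(K key, List<V> values, Output<K, V> out); }
      interface Output<K, V> { void collect(K key, V value); }
      interface Cloner<T> { T clone(T obj); }

      private final Map<K, List<V>> groups = new TreeMap<>();   // keeps buffered keys sorted
      private final Cloner<K> keyCloner;
      private final Cloner<V> valCloner;
      private final Combiner<K, V> combiner;
      private final Output<K, V> downstream;   // e.g. partition, then append to the writer

      CombiningCollector(Cloner<K> kc, Cloner<V> vc, Combiner<K, V> c, Output<K, V> downstream) {
        this.keyCloner = kc; this.valCloner = vc; this.combiner = c; this.downstream = downstream;
      }

      void collect(K key, V value) {
        // clone(key0, val0): the caller may reuse its objects, so buffer copies.
        groups.computeIfAbsent(keyCloner.clone(key), k -> new ArrayList<>())
              .add(valCloner.clone(value));
      }

      void flush() {
        for (Map.Entry<K, List<V>> e : groups.entrySet()) {
          combiner.reduce(e.getKey(), e.getValue(), downstream);   // reduce(keyn, val*)
        }
        groups.clear();
      }
    }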

Hadoop (∞, 0.10)

TaskTracker
Reduce 0 … Reduce k

Hadoop (∞, 0.10)

Reduce 0

sort/merge localFS

Hadoop (∞, 0.10)

Pro:
• Complexity of sort/merge encapsulated in SequenceFile, shared between MapTask and ReduceTask
• Very versatile Combiner semantics (change sort order, partition)

Con:
• Copy/sort can take a long time for each reduce (lost opportunity to parallelize the sort)
• Job cleanup is expensive (e.g. a 7k reducer job must delete 7k files per map on that TT)
• Combiner is expensive to use and its memory usage is difficult to track
• OOMExceptions from untracked memory in buffers, particularly when using compression (HADOOP-570)

Overview

Hadoop (∞, 0.10)    Hadoop [0.10, 0.17)    Hadoop [0.17, 0.22)

Lucene HADOOP-331 HADOOP-2919

Cretaceous Jurassic Triassic

Hadoop [0.10, 0.17)

map(K1,V1)
  *collect(K2,V2)
    p0 = partition(key0, val0)
    K2.write(DataOutput)
    V2.write(DataOutput)
    BufferSorter[p0].addKeyValue(recOff, keylen, vallen)     (one BufferSorter per partition 0, 1, …, k-1, k)

Keep offset into buffer, length of key, value.

sortAndSpillToDisk()

Add memory used by all BufferSorter implementations and keyValBuffer. If the spill threshold is exceeded, then spill contents to disk.
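
A minimal sketch of that accounting, assuming the Writable and Partitioner stand-ins from the first sketch; BufferingCollector and its fields are illustrative, not the real BufferSorter API. Key and value are serialized into one shared buffer, a (recOff, keylen, vallen) entry is added to the partition's sorter, and total memory use is checked against a spill threshold on every collect().

    import java.io.ByteArrayOutputStream;
    import java.io.DataOutputStream;
    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;

    // Writable and Partitioner are the stand-in interfaces from the first sketch.
    class BufferingCollector<K extends Writable, V extends Writable> {
      private final ByteArrayOutputStream keyValBuffer = new ByteArrayOutputStream();
      private final DataOutputStream out = new DataOutputStream(keyValBuffer);
      // sorters.get(p) plays the role of BufferSorter[p]: (recOff, keylen, vallen) per record.
      private final List<List<long[]>> sorters = new ArrayList<>();
      private final Partitioner<K, V> partitioner;
      private final long spillThresholdBytes;

      BufferingCollector(Partitioner<K, V> partitioner, int numPartitions, long spillThresholdBytes) {
        this.partitioner = partitioner;
        this.spillThresholdBytes = spillThresholdBytes;
        for (int p = 0; p < numPartitions; p++) sorters.add(new ArrayList<>());
      }

      void collect(K key, V value) throws IOException {
        int p = partitioner.getPartition(key, value, sorters.size());
        long recOff = out.size();
        key.write(out);                    // K2.write(DataOutput)
        long keyLen = out.size() - recOff;
        value.write(out);                  // V2.write(DataOutput)
        long valLen = out.size() - recOff - keyLen;
        sorters.get(p).add(new long[] { recOff, keyLen, valLen });  // addKeyValue(recOff, keylen, vallen)

        // Add memory used by all the per-partition entries and keyValBuffer;
        // if the spill threshold is exceeded, spill contents to disk.
        long used = keyValBuffer.size() + metadataBytes();
        if (used >= spillThresholdBytes) {
          sortAndSpillToDisk();
        }
      }

      private long metadataBytes() {
        long n = 0;
        for (List<long[]> s : sorters) n += s.size() * 3L * Long.BYTES;
        return n;
      }

      private void sortAndSpillToDisk() {
        // Sort each partition's entries and write them out; see the spill sketch that follows.
      }
    }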

Hadoop [0.10, 0.17)

sortAndSpillToDisk()

Sort permutes offsets into (offset, keylen, vallen). Once ordered, each record is output into a SequenceFile and the partition offsets recorded:

  K2.readFields(DataInput)
  V2.readFields(DataInput)
  SequenceFile::append(K2,V2)

<< Combiner >> If defined, the combiner is now run during the spill, separately over each partition. Values emitted from the combiner are written directly to the output partition.
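
A sketch of one partition's spill under those rules. The interfaces here (RawKeyComparator, WritableRecord, SpillAppender) are illustrative stand-ins: sorting permutes only the metadata entries, and each record is then read back through readFields() and appended, either to the spill file directly or through the combiner when one is defined.

    import java.io.ByteArrayInputStream;
    import java.io.DataInput;
    import java.io.DataInputStream;
    import java.io.IOException;
    import java.util.List;

    class Spiller {
      // Illustrative stand-ins: compare serialized keys in place, read records back, append them.
      interface RawKeyComparator {
        int compare(byte[] buf1, int off1, int len1, byte[] buf2, int off2, int len2);
      }
      interface WritableRecord { void readFields(DataInput in) throws IOException; }
      interface SpillAppender {
        void append(WritableRecord key, WritableRecord val) throws IOException;
      }

      static void spillPartition(byte[] keyValBuffer, List<long[]> entries,
                                 RawKeyComparator cmp, WritableRecord key, WritableRecord val,
                                 SpillAppender appender) throws IOException {
        // The sort permutes only the (recOff, keylen, vallen) entries; serialized bytes never move.
        entries.sort((a, b) -> cmp.compare(keyValBuffer, (int) a[0], (int) a[1],
                                           keyValBuffer, (int) b[0], (int) b[1]));
        for (long[] e : entries) {
          int recOff = (int) e[0], keyLen = (int) e[1], valLen = (int) e[2];
          DataInputStream in = new DataInputStream(
              new ByteArrayInputStream(keyValBuffer, recOff, keyLen + valLen));
          key.readFields(in);          // K2.readFields(DataInput)
          val.readFields(in);          // V2.readFields(DataInput)
          appender.append(key, val);   // SequenceFile::append(K2,V2), or the combiner's input
        }
      }
    }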

Hadoop [0.10, 0.17)

Each spill writes partitions 0, 1, …, k in order to a single spill file; repeated spills leave several such files on disk.

Hadoop [0.10, 0.17)

mergeParts()

The spill files, each holding partitions 0, 1, …, k, are merged partition by partition into one output file.
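
A conceptual sketch of that merge step, independent of the on-disk format: for each partition, the sorted runs contributed by the spill files are merged with a small heap. Everything here is illustrative; the real mergeParts() works over SequenceFiles on disk rather than in-memory lists.

    import java.util.ArrayList;
    import java.util.Comparator;
    import java.util.Iterator;
    import java.util.List;
    import java.util.PriorityQueue;

    class SpillMerger {
      // Merge several sorted runs (one per spill file, for one partition) into one sorted list.
      static <T> List<T> merge(List<Iterator<T>> sortedRuns, Comparator<T> cmp) {
        // Heap entry: the next element of a run, plus the run it came from.
        class Head {
          final T value; final Iterator<T> run;
          Head(T value, Iterator<T> run) { this.value = value; this.run = run; }
        }
        PriorityQueue<Head> heap = new PriorityQueue<>((a, b) -> cmp.compare(a.value, b.value));
        for (Iterator<T> run : sortedRuns) {
          if (run.hasNext()) heap.add(new Head(run.next(), run));
        }
        List<T> merged = new ArrayList<>();
        while (!heap.isEmpty()) {
          Head h = heap.poll();
          merged.add(h.value);                            // emit the globally smallest element
          if (h.run.hasNext()) heap.add(new Head(h.run.next(), h.run));
        }
        return merged;
      }
    }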

Hadoop [0.10, 0.17)

TaskTracker
Reduce 0 … Reduce k each fetch their partition of the merged file (0, 1, …, k).

Hadoop [0.10, 0.17)

Pro:
• Distributes the sort/merge across all maps; reducer need only merge its inputs
• Much more predictable memory footprint
• Shared, in-memory buffer across all partitions w/ efficient sort
• Combines over each spill, defined by memory usage, instead of record count
• Running the combiner doesn’t require storing a clone of each record (fewer serializations)
• In 0.16, spill was made concurrent with collection (HADOOP-1965)

Con:
• Expanding buffers may impose a performance penalty; used memory is calculated on every call to collect(K2,V2)
• MergeSort copies indices on each level of recursion
• Deserializing the key/value before appending to the SequenceFile is avoidable
• Combiner weakened by requiring sort order and partition to remain consistent
• Though tracked, BufferSorter instances take non-negligible space (HADOOP-1698)

Overview

Hadoop (∞, 0.10)    Hadoop [0.10, 0.17)    Hadoop [0.17, 0.22)

Lucene HADOOP-331 HADOOP-2919

Cretaceous Jurassic Triassic

Hadoop [0.17, 0.22)

map(K1,V1)
  *collect(K2,V2)
    p0 = partition(key0, val0)
    Serialization:
      KS.serialize(K2)
      VS.serialize(V2)

Buffer sizing: io.sort.mb in total, of which io.sort.mb * io.sort.record.percent is reserved for per-record metadata.

Instead of explicitly tracking the space used by record metadata, allocate a configurable amount of space at the beginning of the task.
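
The arithmetic behind that allocation, as a sketch. The property names come from the slide; the default values (100 MB, 5%) and the 16 bytes of metadata per record are stated here as assumptions for illustration.

    class SortBufferSizing {
      public static void main(String[] args) {
        int ioSortMb = 100;               // io.sort.mb (assumed default, for illustration)
        float recordPercent = 0.05f;      // io.sort.record.percent (assumed default)
        int metaBytesPerRecord = 16;      // assumed: one kvoffsets int + three kvindices ints

        long total = ioSortMb * 1024L * 1024L;
        long metaBytes = (long) (total * recordPercent);   // space for record metadata
        long dataBytes = total - metaBytes;                // kvbuffer: serialized keys and values
        long maxRecords = metaBytes / metaBytesPerRecord;  // metadata also caps records per spill

        System.out.printf("data %d bytes, metadata %d bytes, at most %d records before a spill%n",
                          dataBytes, metaBytes, maxRecords);
      }
    }

With these numbers, a map that emits many small records exhausts the metadata region long before the data region fills, which is exactly the io.sort.record.percent tuning problem listed under Con later.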

Hadoop [0.17, 0.22)

kvbuffer (io.sort.mb): serialized key/value bytes, tracked by bufstart, bufend, bufindex, bufmark
kvoffsets, kvindices (io.sort.mb * io.sort.record.percent): record metadata, tracked by kvstart, kvend, kvindex

Partition is no longer implicitly tracked: store (partition, keystart, valstart) for every record collected.
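
A sketch of those three structures and what gets written per record. The array names, pointer names, and the (partition, keystart, valstart) layout follow the slide; the class itself and recordMetadata() are hypothetical.

    class MapOutputBufferSketch {
      final byte[] kvbuffer;    // serialized key/value bytes, written by the serializers
      final int[] kvoffsets;    // one int per record: index of that record's kvindices entry
      final int[] kvindices;    // three ints per record: partition, keystart, valstart

      int kvindex;              // next free metadata slot
      int bufindex;             // next free byte in kvbuffer

      MapOutputBufferSketch(int dataBytes, int maxRecords) {
        kvbuffer = new byte[dataBytes];
        kvoffsets = new int[maxRecords];
        kvindices = new int[maxRecords * 3];
      }

      // Called after a record's key and value have been serialized into kvbuffer.
      void recordMetadata(int partition, int keystart, int valstart) {
        kvoffsets[kvindex] = kvindex * 3;         // the sort permutes kvoffsets only
        kvindices[kvindex * 3]     = partition;   // partition is stored, not implied by layout
        kvindices[kvindex * 3 + 1] = keystart;
        kvindices[kvindex * 3 + 2] = valstart;
        kvindex++;
      }
    }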

Hadoop [0.17, 0.22)

As records are collected, KS.serialize(K2) and VS.serialize(V2) advance bufindex through kvbuffer, bufmark marks the end of the last complete record, and kvindex advances as (p0, keystart, valstart) is recorded for each record.

A spill is triggered when either region fills past io.sort.spill.percent.
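
A sketch of that trigger, assuming io.sort.spill.percent is applied to both regions: the spill starts once either the serialization buffer or the record-metadata space passes the configured fraction of its capacity, so collection can continue in the remainder while the spill proceeds. Names here are illustrative.

    class SpillTrigger {
      // Spill once either region passes the configured fraction of its capacity,
      // leaving the rest available for collection while the spill runs.
      static boolean shouldSpill(int bufUsedBytes, int bufCapacityBytes,
                                 int recordsCollected, int recordCapacity,
                                 float spillPercent /* io.sort.spill.percent */) {
        boolean dataSoftLimitHit = bufUsedBytes >= (int) (bufCapacityBytes * spillPercent);
        boolean metaSoftLimitHit = recordsCollected >= (int) (recordCapacity * spillPercent);
        return dataSoftLimitHit || metaSoftLimitHit;
      }
    }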

Hadoop [0.17, 0.22)

When the spill starts, kvend and bufend capture the range to be written (kvstart..kvend, bufstart..bufend) while collection continues to advance kvindex, bufindex, and bufmark through the remaining space. When the spill completes, kvstart and bufstart move up to kvend and bufend.

Hadoop [0.17, 0.22)

The RawComparator interface requires that the key be contiguous in the byte[]. If a serialized key wraps past the end of kvbuffer, it is rewritten so that it is contiguous; the invalid segment left in the serialization buffer is marked by bufvoid.
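
One way to picture that fix-up, as a hypothetical sketch rather than the actual MapTask code: when a key straddles the physical end of the buffer, the tail segment is abandoned (bufvoid is pulled back to the last complete record) and the key's bytes are copied so the whole key sits contiguously at the front, where a RawComparator can see it.

    class KeyWrapFixup {
      byte[] kvbuffer;
      int bufmark;    // end of the last fully collected record
      int bufvoid;    // first invalid byte at the tail of the serialization buffer

      // headLen key bytes were written at the tail of the buffer (starting at keyStart)
      // and tailLen key bytes wrapped around to offset 0. Make the key contiguous at 0.
      int makeKeyContiguous(int keyStart, int headLen, int tailLen) {
        bufvoid = bufmark;   // the tail segment past the last record is now invalid
        // Shift the wrapped portion out of the way, then copy the head in front of it.
        System.arraycopy(kvbuffer, 0, kvbuffer, headLen, tailLen);
        System.arraycopy(kvbuffer, keyStart, kvbuffer, 0, headLen);
        return 0;            // new, contiguous keystart
      }
    }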

Hadoop [0.17, 0.22)

Pro:
• Predictable memory footprint; collection (though not spill) is agnostic to the number of reducers. Most memory used for the sort is allocated up front and maintained for the full task duration.
• No resizing of buffers, no copying of serialized record data or metadata
• Uses SequenceFile::appendRaw to avoid a deserialization/serialization pass
• Effects record compression in-place (removed in 0.18 with improvements to the intermediate data format, HADOOP-2095)

Other Performance Improvements:
• Improved performance, no metadata copying, using QuickSort (HADOOP-3308)
• Caching of spill indices (HADOOP-3638)
• Run the combiner during the merge (HADOOP-3226)
• Improved locking and synchronization (HADOOP-{5664,3617})

Con:
• Complexity; new code responsible for several bugs in 0.17 (HADOOP-{3442,3550,3475,3603})
• io.sort.record.percent is obscure, critical to performance, and awkward
• While predictable, memory usage is arguably too restricted
• Really? io.sort.record.percent? (MAPREDUCE-64)

Hadoop [0.22]

kvoffsets and kvindices information is interlaced into metadata blocks in a single buffer. The sort is effected in a manner identical to 0.17, but metadata is allocated per-record rather than a priori.

Serialized record bytes grow forward from the equator (bufindex, bufmark) while 16-byte metadata entries grow backward from it (kvindex); bufstart/bufend and kvstart/kvend bound the range being spilled, and bufvoid marks the end of valid serialized data.

Initial state (1 MB buffer):
  equator 0, bufstart 0, bufend 0, bufindex 0, bufmark 0, bufvoid 1048576
  kvstart 1048560, kvend 1048560, kvindex 1048560

After collecting roughly 300 KB of records:
  bufindex 300030, bufmark 300030, kvindex 968560

When a spill is triggered, a new equator is chosen in the free space (e.g. equator 736020) and collection continues from it, bufindex/bufmark and kvindex advancing on its far side, while the spill range (bufstart..bufend, kvstart..kvend) is written out. When the spill completes, bufstart/bufend and kvstart/kvend move up to the new equator and the spilled space becomes available again.
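
A sketch of that layout, with field names from the slide and the rest assumed for illustration: a single byte[] holds serialized records growing forward from the equator and 16-byte metadata entries growing backward from it, so neither region's size has to be fixed up front.

    class EquatorBufferSketch {
      static final int METASIZE = 16;   // assumed: partition, keystart, valstart, vallen (4 ints)

      final byte[] kvbuffer;   // serialized records and metadata share this one array
      int equator;             // boundary chosen at task start and again at each spill
      int bufindex;            // serialized data grows forward from the equator
      int kvindex;             // metadata grows backward from the equator (wrapping)

      EquatorBufferSketch(int capacity) {
        kvbuffer = new byte[capacity];
        setEquator(0);
      }

      void setEquator(int pos) {
        equator = pos;
        bufindex = pos;
        kvindex = (pos - METASIZE + kvbuffer.length) % kvbuffer.length;   // first slot "below" pos
      }

      // Record the metadata for one collected record; its key/value bytes were
      // already written at bufindex by the serializers.
      void recordMetadata(int partition, int keystart, int valstart, int vallen) {
        writeInt(kvindex, partition);
        writeInt(kvindex + 4, keystart);
        writeInt(kvindex + 8, valstart);
        writeInt(kvindex + 12, vallen);
        kvindex = (kvindex - METASIZE + kvbuffer.length) % kvbuffer.length;
      }

      private void writeInt(int off, int v) {
        for (int i = 0; i < 4; i++) {
          kvbuffer[(off + i) % kvbuffer.length] = (byte) (v >>> (24 - 8 * i));
        }
      }
    }

With a 1 MB buffer and equator 0, the first metadata slot is (0 - 16) mod 1048576 = 1048560, matching the initial kvindex, kvstart, and kvend shown above.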

Questions?