2017 High Performance Database with Scala, Akka, Spark


Transcript of 2017 High Performance Database with Scala, Akka, Spark

  • Building a High-Performance Database with Scala, Akka, and Spark

    Evan Chan

    November 2017

  • Who am I?

    User and contributor to Spark since 0.9, Cassandra since 0.6

    Created Spark Job Server and FiloDB

    Talks at Spark Summit, Cassandra Summit, Strata, Scala Days, etc.

    http://velvia.github.io/
    http://github.com/spark-jobserver/spark-jobserver
    http://github.com/tuplejump/FiloDB

  • Why Build a New Streaming Database?

  • Needs

    Ingest HUGE streams of events (IoT, etc.)

    Real-time, low latency, and somewhat flexible queries

    Dashboards, quick answers on new data

    Flexible schemas and query patterns

    Keep your streaming pipeline super simple

    Streaming = hardest to debug. Simplicity rules!

  • (Architecture diagram) Events -> Message Queue -> Stream Processing Layer -> State / Database -> Happy Users

  • Spark + HDFS Streaming

    (Diagram) Kafka -> Spark Streaming -> many small files (microbatches) -> dedup/consolidate job -> larger, efficient files. High latency.

    Big impedance mismatch between streaming systems and a file system designed for big blobs of data

  • Cassandra?

    Ingest HUGE streams of events (IoT, etc.)

    C* is not efficient for writing raw events

    Real-time, low latency, and somewhat flexible queries

    C* is real-time, but only low latency for simple lookups. Add Spark => much higher latency

    Flexible schemas and query patterns

    C* only handles simple lookups

  • Introducing FiloDB

    A distributed, columnar time-series/event database. Built for streaming.

    http://www.github.com/filodb/FiloDB

  • (Architecture diagram) Events -> Message Queue -> Spark Streaming -> Cassandra (short-term storage, K-V) + FiloDB (events, ad-hoc, batch) -> Spark (ad-hoc, SQL, ML) -> Dashboards, maps

  • 100% Reactive Scala

    Akka Cluster

    Spark

    Monix / Reactive Streams

    Typesafe Config for all configuration

    Scodec, Ficus, Enumeratum, Scalactic, etc.

    Even most of the performance critical parts are written in Scala :)
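
    As a small illustration of the Typesafe Config point above, here is a hedged sketch of loading settings; the key names are hypothetical, not FiloDB's real configuration:

    import com.typesafe.config.ConfigFactory

    object ConfigSketch extends App {
      // Loads application.conf / reference.conf from the classpath
      val config = ConfigFactory.load()

      // Hypothetical keys, for illustration only
      val flushInterval = config.getDuration("filodb.memtable.flush-interval")
      val maxChunkSize  = config.getInt("filodb.column-store.max-chunk-size")

      println(s"flush every $flushInterval, chunks up to $maxChunkSize rows")
    }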

  • Scala, Akka, and Spark for Database

  • Why use Scala and Akka?

    Akka Cluster!

    Just the right abstractions - streams, futures, Akka, type safety.

    Failure handling and supervision are critical for databases (see the sketch below)

    All the pattern matching and immutable goodness :)
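
    Since failure handling and supervision are called out above, here is a minimal Akka supervision sketch in that spirit; the worker and supervisor classes and the error policy are illustrative, not FiloDB's actual actors:

    import akka.actor.{Actor, OneForOneStrategy, Props}
    import akka.actor.SupervisorStrategy.{Restart, Stop}
    import scala.concurrent.duration._

    // Hypothetical ingestion worker; failures surface as exceptions
    class IngestionWorker extends Actor {
      def receive = {
        case records: Seq[_] => // write the records to the column store
      }
    }

    // Parent restarts the worker on transient I/O errors, stops it on bad input
    class IngestionSupervisor extends Actor {
      override val supervisorStrategy =
        OneForOneStrategy(maxNrOfRetries = 10, withinTimeRange = 1.minute) {
          case _: java.io.IOException      => Restart
          case _: IllegalArgumentException => Stop
        }

      private val worker = context.actorOf(Props[IngestionWorker], "worker")

      def receive = { case msg => worker forward msg }
    }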

  • Scala Big Data Projects

    Spark

    GeoMesa

    Khronus - Akka time-series DB

    Sirius - Akka distributed KV Store

    FiloDB!

  • Actors vs Futures vs Observables

  • One FiloDB Node

    NodeCoordinatorActor (NCA)

    DatasetCoordinatorActor (DsCA)

    DatasetCoordinatorActor (DsCA)

    Active MemTable

    Flushing MemTable

    Reprojector

    ColumnStore

    Data, commands

  • Akka vs Futures

    NodeCoordinatorActor (NCA)

    DatasetCoordinatorActor (DsCA)

    DatasetCoordinatorActor (DsCA)

    Active MemTable

    Flushing MemTable

    Reprojector

    ColumnStore

    Data, commands

    Akka - control flow

    Core I/O - Futures/Observables

  • Akka vs Futures

    Akka Actors:

    External FiloDB node API (remote + cluster)

    Async messaging with clients

    Cluster/distributed state management

    Futures and Observables:

    Core I/O

    Columnar data processing / ingestion

    Type-safe processing stages
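
    To make the split above concrete, here is a hedged sketch of an actor that keeps control flow and client messaging while delegating I/O to a Future; the NodeCoordinator class and DropDataset command are illustrative, ColumnStore stands for whatever trait exposes the call, and dropDataset is the column-store method shown on the next slide:

    import akka.actor.Actor
    import akka.pattern.pipe

    // Hypothetical command; the real FiloDB protocol is richer
    case class DropDataset(dataset: DatasetRef)

    class NodeCoordinator(columnStore: ColumnStore) extends Actor {
      import context.dispatcher   // ExecutionContext for the Future and pipeTo

      def receive = {
        case DropDataset(ref) =>
          // Control flow stays in the actor; the I/O runs as a Future whose
          // Response is piped back to the requester as a message
          columnStore.dropDataset(ref).pipeTo(sender())
      }
    }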

  • Futures for Single Actions

    /**
     * Clears all data from the column store for that given projection, for all versions.
     * More like a truncation, not a drop.
     * NOTE: please make sure there are no reprojections or writes going on before calling this
     */
    def clearProjectionData(projection: Projection): Future[Response]

    /**
     * Completely and permanently drops the dataset from the column store.
     * @param dataset the DatasetRef for the dataset to drop.
     */
    def dropDataset(dataset: DatasetRef): Future[Response]

    /**
     * Appends the ChunkSets and incremental indices in the segment to the column store.
     * @param segment the ChunkSetSegment to write / merge to the columnar store
     * @param version the version # to write the segment to
     * @return Success. Future.failure(exception) otherwise.
     */
    def appendSegment(projection: RichProjection,
                      segment: ChunkSetSegment,
                      version: Int): Future[Response]
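
    A hedged sketch of composing these Future-returning single actions; the rewriteSegments helper, the columnStore handle, and the use of version 0 are assumptions for illustration, not code from the talk:

    import scala.concurrent.ExecutionContext.Implicits.global
    import scala.concurrent.Future

    // Clear a projection, then append a batch of segments; any failure
    // short-circuits the chain and surfaces in the returned Future
    def rewriteSegments(columnStore: ColumnStore,
                        projection: Projection,
                        richProjection: RichProjection,
                        segments: Seq[ChunkSetSegment]): Future[Seq[Response]] =
      for {
        _         <- columnStore.clearProjectionData(projection)
        responses <- Future.sequence(segments.map(seg => columnStore.appendSegment(richProjection, seg, version = 0)))
      } yield responses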

  • Monix / Reactive Streams

    http://monix.io

    "observable sequences that are exposed as asynchronous streams, expanding on the observer pattern, strongly inspired by ReactiveX and by Scalaz, but designed from the ground up for back-pressure and made to cleanly interact with Scala's standard library, compatible out-of-the-box with the Reactive Streams protocol"

    Much better than Future[Iterator[_]] (see the sketch below)

    https://en.wikipedia.org/wiki/Observer_pattern
    http://reactivex.io/
    http://scalaz.org/
    http://www.reactive-streams.org/
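
    A brief, hedged sketch of why an Observable beats Future[Iterator[_]]: elements stream through asynchronously with back-pressure instead of materializing behind a single Future (the data and batch size here are made up):

    import monix.execution.Scheduler.Implicits.global
    import monix.reactive.Observable
    import scala.concurrent.Await
    import scala.concurrent.duration.Duration

    object StreamSketch extends App {
      // A million "events" streamed lazily, never all in memory at once
      val events: Observable[Int] = Observable.fromIterable(1 to 1000000)

      val done = events
        .map(_ * 2)            // per-element transformation
        .bufferTumbling(1000)  // collect into fixed-size batches
        .foreach(batch => println(s"flushing ${batch.size} records"))

      Await.result(done, Duration.Inf)
    }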

  • Monix / Reactive Streams

    def readChunks(projection: RichProjection,
                   columns: Seq[Column],
                   version: Int,
                   partMethod: PartitionScanMethod,
                   chunkMethod: ChunkScanMethod = AllChunkScan): Observable[ChunkSetReader] = {
      scanPartitions(projection, version, partMethod)
        // Partitions to pipeline of single chunks
        .flatMap { partIndex =>
          stats.incrReadPartitions(1)
          readPartitionChunks(projection.datasetRef, version, columns, partIndex, chunkMethod)
        }
        // Collate single chunks to ChunkSetReaders
        .scan(new ChunkSetReaderAggregator(columns, stats)) { _ add _ }
        .collect { case agg: ChunkSetReaderAggregator if agg.canEmit => agg.emit() }
    }

  • Functional Reactive Stream Processing

    Ingest stream merged with flush commands

    Built-in async/parallel tasks via mapAsync

    Notify on end of stream, errors

    val combinedStream = Observable.merge(stream.map(SomeData), flushStream)

    combinedStream.map {
      case SomeData(records)   => shard.ingest(records)
                                  None
      case FlushCommand(group) => shard.switchGroupBuffers(group)
                                  Some(FlushGroup(shard.shardNum, group, shard.latestOffset))
    }.collect { case Some(flushGroup) => flushGroup }
      .mapAsync(numParallelFlushes)(shard.createFlushTask _)
      .foreach { x => }
      .recover { case ex: Exception => errHandler(ex) }

  • Akka Cluster and Spark

  • Spark/Akka Cluster Setup

    (Diagram) Driver: NodeClusterActor, Client

    Executor: NCA, DsCA1, DsCA2

    Executor: NCA, DsCA1, DsCA2

  • Adding one executor

    (Diagram) Driver: NodeClusterActor, Client; state: Executors -> (executor1)

    executor1: NCA, DsCA1, DsCA2

    MemberUp; the driver resolves an ActorSelection to an ActorRef

  • Adding second executor

    (Diagram) Driver: NodeClusterActor, Client; state: Executors -> (executor1, executor2)

    executor1: NCA, DsCA1, DsCA2

    executor2: NCA, DsCA1, DsCA2

    MemberUp; ActorSelection resolved to ActorRef

  • Sending a command

    (Diagram) Driver: NodeClusterActor, Client

    Executor: NCA, DsCA1, DsCA2

    Executor: NCA, DsCA1, DsCA2

    Flush() sent from the driver to the executors

  • Yes, Akka in Spark

    Columnar ingestion is stateful - need stickiness of state. This is inherently difficult in Spark.

    Akka (cluster) gives us a separate, asynchronous control channel to talk to FiloDB ingestors

    Spark only gives data flow primitives, not async messaging

    We need to route incoming records to the correct ingestion node. Sorting data is inefficient and forces all nodes to wait for sorting to be done.
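
    A hedged sketch of that routing idea: hash each record's partition key to the node that owns it instead of sorting; the PartitionMap structure and message type here are illustrative, not FiloDB's actual code:

    import akka.actor.ActorRef

    // Hypothetical message carrying raw rows to an ingestion node
    final case class IngestRecords(rows: Seq[Array[Byte]])

    // Hypothetical: maps a hash bucket to the NodeCoordinatorActor owning it
    final case class PartitionMap(buckets: Vector[ActorRef]) {
      def nodeFor(partitionKey: String): ActorRef =
        buckets((partitionKey.hashCode & Int.MaxValue) % buckets.size)
    }

    // Route incoming records to their owning ingestion nodes; no global sort,
    // no waiting on other nodes
    def routeRecords(records: Seq[(String, Array[Byte])], map: PartitionMap): Unit =
      records.groupBy { case (key, _) => map.nodeFor(key) }.foreach {
        case (node, recs) => node ! IngestRecords(recs.map(_._2))
      }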

  • Data Ingestion Setup

    (Diagram) Executor: NCA, DsCA1, DsCA2; task0 and task1, each with a Row Source Actor

    Executor: NCA, DsCA1, DsCA2; task0 and task1, each with a Row Source Actor

    Node Cluster Actor with Partition Map

  • FiloDB separate nodes

    (Diagram) FiloDB Node: NCA, DsCA1, DsCA2

    FiloDB Node: NCA, DsCA1, DsCA2

    Executor: task0 and task1, each with a Row Source Actor

    Executor: task0 and task1, each with a Row Source Actor

    Node Cluster Actor with Partition Map

  • Testing Akka Cluster

    MultiNodeSpec / sbt-multi-jvm (see the sketch below)

    NodeClusterSpec

    Tests joining of different cluster nodes and partition map updates

    Is the partition map updated properly if a cluster node goes down? Inject network failures

    Lessons
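
    Returning to MultiNodeSpec / sbt-multi-jvm above, a hedged sketch of the standard Akka multi-node test layout; role names, spec names, and the membership check are illustrative, not the actual NodeClusterSpec:

    import akka.remote.testkit.{MultiNodeConfig, MultiNodeSpec}
    import org.scalatest.{BeforeAndAfterAll, Matchers, WordSpecLike}

    // Two-node test configuration
    object ClusterJoinConfig extends MultiNodeConfig {
      val first  = role("first")
      val second = role("second")
    }

    // sbt-multi-jvm runs each of these in its own JVM
    class ClusterJoinSpecMultiJvmNode1 extends ClusterJoinSpec
    class ClusterJoinSpecMultiJvmNode2 extends ClusterJoinSpec

    abstract class ClusterJoinSpec extends MultiNodeSpec(ClusterJoinConfig)
      with WordSpecLike with Matchers with BeforeAndAfterAll {

      def initialParticipants = roles.size
      override def beforeAll(): Unit = multiNodeSpecBeforeAll()
      override def afterAll(): Unit  = multiNodeSpecAfterAll()

      "A two-node cluster" must {
        "agree on membership after both nodes join" in {
          enterBarrier("startup")              // wait until both JVMs reach this point
          runOn(ClusterJoinConfig.first) {
            // e.g. assert the partition map now covers both members
          }
          enterBarrier("finished")
        }
      }
    }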

  • Kamon Tracing

    http://kamon.io

    One trace can encapsulate multiple Future steps all executing on different threads

    Tunable tracing levels

    Summary stats and histograms for segments

    Super useful for production debugging of reactive stack


  • Kamon Tracing

    def appendSegment(projection: RichProjection,
                      segment: ChunkSetSegment,
                      version: Int): Future[Response] =
      Tracer.withNewContext("append-segment") {
        val ctx = Tracer.currentContext
        stats.segmentAppend()
        if (segment.chunkSets.isEmpty) {
          stats.segmentEmpty()
          return(Future.successful(NotApplied))
        }
        for { writeChunksResp ... responses.head }  // for-comprehension truncated in the transcript
      }

  • Kamon Metrics

    Uses HDRHistogram for much finer and more accurate buckets

    Built-in metrics for Akka actors, Spray, Akka-Http, Play, etc. etc.

    Example trace output: KAMON trace name=append-segment n=2863 min=765952 p50=211