Building a High-Performance Database with Scala, Akka, and Spark

  • Building a High-Performance Database with Scala, Akka, and Spark

    Evan Chan

  • Who am I

    User and contributor to Spark since 0.9, Cassandra since 0.6

    Created Spark Job Server and FiloDB

    Talks at Spark Summit, Cassandra Summit, Strata, Scala Days, etc.

    http://velvia.github.io/

    http://github.com/spark-jobserver/spark-jobserver

    http://github.com/filodb/FiloDB

  • Streaming is now King

  • Message Queue

    Events

    Stream Processing Layer

    State / Database

    Happy Users

  • Why are Updates Important?

    Appends

    Streaming workloads. Add new data continuously.

    Real data is *always* changing. Queries on live real-time data have business benefits.

    Updates

    Idempotency = really simple ingestion pipelines

    Simpler streaming: update late events (see Spark 2.0 Structured Streaming)
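
    To make the idempotency point concrete, here is a toy sketch (the KeyValueStore and Event types are hypothetical, not FiloDB code): because writes are keyed upserts, replaying the same events under at-least-once delivery converges to the same state, so the ingestion pipeline needs no dedup logic.

    // Hypothetical sketch: keyed upserts make at-least-once ingestion replay-safe.
    import scala.collection.mutable

    final case class Event(id: String, value: Double)

    // A toy key-value "table"; a real system would be Cassandra, FiloDB, etc.
    class KeyValueStore {
      private val rows = mutable.Map.empty[String, Event]

      // Upsert: the last write for a key wins, so re-ingesting an event is a no-op.
      def upsert(event: Event): Unit = rows(event.id) = event

      def size: Int = rows.size
    }

    object IdempotentIngestExample extends App {
      val store  = new KeyValueStore
      val events = Seq(Event("a", 1.0), Event("b", 2.0))

      // Simulate at-least-once delivery that replays the whole batch.
      (events ++ events).foreach(store.upsert)

      assert(store.size == 2)   // duplicates collapse; no upstream dedup logic needed
    }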

  • Introducing FiloDB

    A distributed, versioned, columnar analytics database. With updates. Built for streaming.

    http://www.github.com/filodb/FiloDB

  • Fast Analytics Storage

    Scan speeds competitive with Apache Parquet

    In-memory version significantly faster

    Flexible filtering along two dimensions

    Much more efficient and flexible partition key filtering

    Efficient columnar storage using dictionary encoding and other techniques

    Updatable

    Spark SQL for easy BI integration
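
    As a rough illustration of the dictionary encoding mentioned above (a generic sketch, not FiloDB's actual storage format): each distinct string is stored once, the column itself becomes an array of small integer codes, and equality filters reduce to integer comparisons over those codes.

    // Generic dictionary-encoding sketch for one string column (not FiloDB's real format).
    object DictionaryEncodingExample extends App {
      val column = Seq("nyc", "sf", "nyc", "nyc", "la", "sf")

      // Dictionary: each distinct value gets a small integer code.
      val dictionary: Map[String, Int] = column.distinct.zipWithIndex.toMap
      val reverse: Array[String]       = dictionary.toSeq.sortBy(_._2).map(_._1).toArray

      // The encoded column is just an array of codes.
      val encoded: Array[Int] = column.map(dictionary).toArray

      // The filter "city = 'nyc'" becomes an integer comparison over the codes.
      val nycCode  = dictionary("nyc")
      val matching = encoded.indices.filter(i => encoded(i) == nycCode)
      assert(matching == Seq(0, 2, 3))

      // Decoding a row back to its string is a single array lookup.
      assert(reverse(encoded(4)) == "la")
    }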

  • Message Queue

    Events

    Spark Streaming

    Short term storage, K-V

    Ad-hoc, SQL, ML

    Cassandra

    FiloDB: Events, ad-hoc, batch

    Spark

    Dashboards, maps

  • 100% Reactive Scala

    Akka Cluster

    Spark

    Typesafe Config for all configuration

    Scodec, Ficus, Enumeratum, Scalactic, etc.

    Even most of the performance-critical parts are written in Scala :)
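
    To make the Typesafe Config point above concrete, here is a minimal sketch; the my-db.memtable keys and the MemtableSettings case class are made up for illustration, not FiloDB's real settings.

    // Minimal Typesafe Config sketch; keys and the settings class are illustrative only.
    import com.typesafe.config.{Config, ConfigFactory}

    final case class MemtableSettings(maxRows: Int, flushIntervalSecs: Int)

    object ConfigExample extends App {
      // application.conf / reference.conf are merged automatically by ConfigFactory.load();
      // here a parsed string stands in for an application.conf override.
      val config: Config = ConfigFactory.parseString(
        """
          |my-db.memtable {
          |  max-rows = 100000
          |  flush-interval-secs = 60
          |}
        """.stripMargin).withFallback(ConfigFactory.load())

      val memtable = config.getConfig("my-db.memtable")
      val settings = MemtableSettings(
        maxRows           = memtable.getInt("max-rows"),
        flushIntervalSecs = memtable.getInt("flush-interval-secs"))

      println(settings)
    }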

  • Scala, Akka, and Spark

    Akka - eliminate shared mutable state

    Akka Remote and Cluster make building distributed client-server architectures easy

    Backpressure and at-least-once delivery are easy to build

    Failure handling and supervision are critical for databases

    Spark for SQL, DataFrames, ML, interfacing
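
    Since failure handling and supervision are called out above as critical, here is a minimal classic-Akka supervision sketch; the actors, failure types, and restart policy are illustrative, not FiloDB's actual hierarchy.

    // Illustrative classic-Akka supervision sketch (not FiloDB's actual actors).
    import java.io.IOException
    import scala.concurrent.duration._
    import akka.actor.{Actor, ActorSystem, OneForOneStrategy, Props, SupervisorStrategy}
    import akka.actor.SupervisorStrategy.{Escalate, Restart}

    class WriterActor extends Actor {
      def receive: Receive = {
        case row: String if row.isEmpty => throw new IOException("simulated write failure")
        case row: String                => // write the row somewhere
      }
    }

    class CoordinatorActor extends Actor {
      // Restart the writer on transient I/O errors; escalate anything unexpected.
      override val supervisorStrategy: SupervisorStrategy =
        OneForOneStrategy(maxNrOfRetries = 3, withinTimeRange = 1.minute) {
          case _: IOException => Restart
          case _: Exception   => Escalate
        }

      private val writer = context.actorOf(Props[WriterActor](), "writer")

      def receive: Receive = { case row: String => writer ! row }
    }

    object SupervisionExample extends App {
      val system      = ActorSystem("supervision-example")
      val coordinator = system.actorOf(Props[CoordinatorActor](), "coordinator")
      coordinator ! "some,row,data"   // routed to the supervised writer child
    }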

  • One FiloDB Node

    NodeCoordinatorActor (NCA)

    DatasetCoordinatorActor (DsCA)

    DatasetCoordinatorActor (DsCA)

    Active MemTable

    Flushing MemTable

    Reprojector

    ColumnStore

    Data, commands
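
    A simplified sketch of that hierarchy: one NodeCoordinatorActor per node creates one DatasetCoordinatorActor child per dataset and routes dataset commands to it. The message types and routing below are hypothetical stand-ins for the real protocol.

    // Hypothetical sketch of the one-NCA / many-DsCA hierarchy from the diagram.
    import akka.actor.{Actor, ActorRef, Props}

    final case class IngestRows(dataset: String, rows: Seq[String])
    final case class Flush(dataset: String)

    // One per dataset: owns the active and flushing MemTables (elided here).
    class DatasetCoordinatorActor(dataset: String) extends Actor {
      def receive: Receive = {
        case IngestRows(_, rows) => // append rows to the active MemTable
        case Flush(_)            => // swap MemTables; hand off to the Reprojector / ColumnStore
      }
    }

    // One per node: creates DsCA children lazily and routes commands by dataset name.
    class NodeCoordinatorActor extends Actor {
      private var children = Map.empty[String, ActorRef]

      private def childFor(dataset: String): ActorRef =
        children.getOrElse(dataset, {
          val child = context.actorOf(Props(new DatasetCoordinatorActor(dataset)), s"dsca-$dataset")
          children += dataset -> child
          child
        })

      def receive: Receive = {
        case cmd @ IngestRows(ds, _) => childFor(ds) forward cmd
        case cmd @ Flush(ds)         => childFor(ds) forward cmd
      }
    }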

  • Akka vs Futures

    NodeCoordinatorActor (NCA)

    DatasetCoordinatorActor (DsCA)

    DatasetCoordinatorActor (DsCA)

    Active MemTable

    Flushing MemTable

    Reprojector

    ColumnStore

    Data, commands

    Akka - control flow

    Core I/O - Futures

  • Akka vs Futures

    Akka Actors:

    External FiloDB node API (remote + cluster)

    Async messaging with clients

    State management and scheduling (flushing)

    Futures:

    Core I/O

    Columnar data processing / ingestion

    Type-safe processing stages
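
    The division above is the usual "actors for control flow, Futures for core I/O" pattern; here is a minimal sketch (the ColumnStore trait and message type are stand-ins, not FiloDB's API) in which the actor never blocks and simply pipes the Future result back to the requester.

    // Sketch of the actor-for-control-flow / Future-for-I/O split (stand-in types).
    import akka.actor.Actor
    import akka.pattern.pipe
    import scala.concurrent.Future

    sealed trait Response
    case object Success extends Response

    final case class WriteSegment(segmentId: String, bytes: Array[Byte])

    // Core I/O lives behind plain Future-returning methods.
    trait ColumnStore {
      def appendSegment(segmentId: String, bytes: Array[Byte]): Future[Response]
    }

    class DatasetActor(store: ColumnStore) extends Actor {
      import context.dispatcher   // ExecutionContext for the Future and the pipe

      def receive: Receive = {
        case WriteSegment(id, bytes) =>
          // No blocking in the actor: the Future result is piped back to the sender.
          store.appendSegment(id, bytes) pipeTo sender()
      }
    }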

  • Akka for Control Flow

    Driver

    Client

    Executor

    NCA

    DsCA1 DsCA2

    Executor

    NCA

    DsCA1 DsCA2

    Flush()

    NodeClusterActor

    SingletonClusterProxy

  • Yes, Akka in Spark

    Columnar ingestion is stateful - it needs stickiness of state, which is inherently difficult in Spark.

    Akka (cluster) gives us a separate, asynchronous control channel to talk to FiloDB ingestors

    Spark only gives data flow primitives, not async messaging

    We need to route incoming records to the correct ingestion node. Sorting data is inefficient and forces all nodes to wait for sorting to be done.

    On failure, can control state recovery and moving state
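
    A rough sketch of that routing idea (the hash scheme and PartitionMap shape are hypothetical, not FiloDB's actual partitioning): each record's partition key hashes to a shard, and the partition map says which ingestion actor owns that shard, so records go straight to the right node with no sort or shuffle.

    // Hypothetical partition-map routing sketch: records go straight to the owning node.
    import akka.actor.ActorRef

    final case class Record(partitionKey: String, payload: String)

    // shard -> ingestion actor on the node that owns it (kept up to date by the cluster actor).
    final case class PartitionMap(numShards: Int, shardOwners: Map[Int, ActorRef]) {

      private def shardFor(key: String): Int =
        (key.hashCode & Int.MaxValue) % numShards

      // No sorting, no waiting: routing is a hash plus a map lookup per record.
      def route(record: Record): Unit =
        shardOwners(shardFor(record.partitionKey)) ! record
    }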

  • Data Ingestion Setup

    Executor

    NCA

    DsCA1 DsCA2

    task0 task1

    Row Source Actor

    Row Source Actor

    Executor

    NCA

    DsCA1 DsCA2

    task0 task1

    Row Source Actor

    Row Source Actor

    Node Cluster Actor

    Partition Map

  • FiloDB separate nodes

    FiloDB Node

    FiloDB Node

    Executor

    NCA

    DsCA1 DsCA2

    task0 task1

    Row Source Actor

    Row Source Actor

    Executor

    NCA

    DsCA1 DsCA2

    task0 task1

    Row Source Actor

    Row Source Actor

    Node Cluster Actor

    Partition Map

  • Akka wire protocol

  • Backpressure

    Assumes receiver is OK, starts sending rows

    Allows a configurable number of unacked messages before it stops sending

    Acking is the receiver's way of rate-limiting

    Automatic retries for at-least-once

    NACK for when receiver must stop (out of memory or MemTable full)
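
    A simplified sketch of this ack-based flow control; the message names and window handling are illustrative, not the actual RowSource wire protocol.

    // Illustrative ack-based backpressure sketch (not the real RowSource protocol).
    import akka.actor.{Actor, ActorRef}

    final case class Row(seqNo: Long, data: String)
    final case class Ack(seqNo: Long)
    case object Nack   // receiver must stop us, e.g. out of memory or MemTable full

    class BackpressuredSender(receiver: ActorRef,
                              rows: Iterator[Row],
                              maxUnacked: Int) extends Actor {
      private var unacked = Set.empty[Long]

      override def preStart(): Unit = fill()

      // Send rows until the configured number of unacked messages is in flight.
      private def fill(): Unit =
        while (unacked.size < maxUnacked && rows.hasNext) {
          val row = rows.next()
          unacked += row.seqNo
          receiver ! row
        }

      def receive: Receive = {
        case Ack(seqNo) =>   // acking is the receiver's way of rate-limiting the sender
          unacked -= seqNo
          fill()
        case Nack =>
          // a real implementation would pause and later retry the unacked rows here
      }
    }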

  • Testing Akka Cluster

    MultiNodeSpec / sbt-multi-jvm

    AWESOME

    Test multi-node message routing

    Test cluster membership and subscription

    Inject network failures
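
    A bare-bones sketch of what such a test looks like, assuming akka-multi-node-testkit plus sbt-multi-jvm; the roles, ping/pong actor, and assertions are illustrative, not FiloDB's actual test suite.

    // Minimal MultiNodeSpec sketch; sbt-multi-jvm runs each *MultiJvmNodeX class in its own JVM.
    import akka.actor.{Actor, Props}
    import akka.remote.testkit.{MultiNodeConfig, MultiNodeSpec}
    import akka.testkit.ImplicitSender
    import org.scalatest.BeforeAndAfterAll
    import org.scalatest.matchers.should.Matchers
    import org.scalatest.wordspec.AnyWordSpecLike

    object TwoNodeConfig extends MultiNodeConfig {
      val node1 = role("node1")
      val node2 = role("node2")
    }

    // Picked up by naming convention: one concrete class per JVM.
    class RoutingSpecMultiJvmNode1 extends RoutingSpec
    class RoutingSpecMultiJvmNode2 extends RoutingSpec

    class Ponger extends Actor {
      def receive: Receive = { case "ping" => sender() ! "pong" }
    }

    abstract class RoutingSpec extends MultiNodeSpec(TwoNodeConfig)
      with AnyWordSpecLike with Matchers with BeforeAndAfterAll with ImplicitSender {

      import TwoNodeConfig._

      override def initialParticipants: Int = roles.size
      override def beforeAll(): Unit = multiNodeSpecBeforeAll()
      override def afterAll(): Unit  = multiNodeSpecAfterAll()

      "Two test nodes" must {
        "route a message from node1 to an actor running on node2" in {
          runOn(node2) {
            system.actorOf(Props[Ponger](), "ponger")
          }
          enterBarrier("deployed")   // every node waits here before proceeding

          runOn(node1) {
            system.actorSelection(node(node2) / "user" / "ponger") ! "ping"
            expectMsg("pong")
          }
          enterBarrier("finished")
        }
      }
    }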

  • Core: All Futures

    /**
     * Clears all data from the column store for that given projection, for all versions.
     * More like a truncation, not a drop.
     * NOTE: please make sure there are no reprojections or writes going on before calling this
     */
    def clearProjectionData(projection: Projection): Future[Response]

    /**
     * Completely and permanently drops the dataset from the column store.
     * @param dataset the DatasetRef for the dataset to drop.
     */
    def dropDataset(dataset: DatasetRef): Future[Response]

    /**
     * Appends the ChunkSets and incremental indices in the segment to the column store.
     * @param segment the ChunkSetSegment to write / merge to the columnar store
     * @param version the version # to write the segment to
     * @return Success. Future.failure(exception) otherwise.
     */
    def appendSegment(projection: RichProjection,
                      segment: ChunkSetSegment,
                      version: Int): Future[Response]

  • Kamon Tracing

    def appendSegment(projection: RichProjection,
                      segment: ChunkSetSegment,
                      version: Int): Future[Response] =
      Tracer.withNewContext("append-segment") {
        val ctx = Tracer.currentContext
        stats.segmentAppend()
        if (segment.chunkSets.isEmpty) {
          stats.segmentEmpty()
          return(Future.successful(NotApplied))
        }
        for { writeChunksResp responses.head } } }
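
    The transcript of the snippet above is truncated, so here is a self-contained sketch of the same pattern, assuming the Kamon 0.6-era Tracer API shown in the slide; the write methods and trace handling are stand-ins, not FiloDB's real code.

    // Hedged Kamon tracing sketch: one trace context spans several Future steps.
    import kamon.Kamon
    import kamon.trace.Tracer
    import scala.concurrent.Future
    import scala.concurrent.ExecutionContext.Implicits.global

    object TracedAppend {
      Kamon.start()

      // Stand-ins for the real chunk / index writes, each returning a Future.
      def writeChunks(segmentId: String): Future[String] = Future { "chunks-ok" }
      def writeIndex(segmentId: String): Future[String]  = Future { "index-ok" }

      def appendSegment(segmentId: String): Future[String] =
        Tracer.withNewContext("append-segment") {
          val ctx = Tracer.currentContext
          val result = for {
            _    <- writeChunks(segmentId)   // step 1, possibly on another thread
            resp <- writeIndex(segmentId)    // step 2, still under the same trace
          } yield resp
          result.andThen { case _ => ctx.finish() }   // close the trace when the chain ends
        }
    }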

  • Kamon Tracing http://kamon.io

    One trace can encapsulate multiple Future steps all executing on different threads

    Tunable tracing levels

    Summary stats and histograms for segments

    Super useful for production debugging of a reactive stack


  • Kamon Metrics

    Uses HDRHistogram for much finer and more accurate buckets

    Built-in metrics for Akka actors, Spray, Akka-Http, Play, etc. etc.

    KAMON trace name=append-segment n=2863 min=765952 p50=2113536 p90=3211264 p95=3981312 p99=9895936 p999=16121856 max=19529728
    KAMON trace-segment name=write-chunks n=2864 min=436224 p50=1597440 p90=2637824 p95=3424256 p99=9109504 p999=15335424 max=18874368
    KAMON trace-segment name=write-index n=2863 min=278528 p50=432128 p90=544768 p95=598016 p99=888832 p999=2260992 max=8355840

  • Validation: Scalactic

    private def getColumnsFromNames(allColumns: Seq[Column],
                                    columnNames: Seq[String]): Seq[Column] Or BadSchema = {
      if (columnNames.isEmpty) {
        Good(allColumns)
      } else {
        val columnMap = allColumns.map { c => c.name -> c }.toMap
        val missing = columnNames.toSet -- columnMap.keySet
        if (missing.nonEmpty) {
          Bad(MissingColumnNames(missing.toSeq, "projection"))
        } else {
          Good(columnNames.map(columnMap))
        }
      }
    }

    for { computedColumns
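
    The for-comprehension above is cut off in the transcript, so here is a self-contained sketch of how Or values compose in a for expression; the BadSchema variants and validation helpers are illustrative stand-ins, not FiloDB's real validation code.

    // Hedged sketch: composing Scalactic Or results; the first Bad short-circuits the chain.
    import org.scalactic.{Bad, Good, Or}

    object SchemaValidation {
      sealed trait BadSchema
      final case class MissingColumnNames(names: Seq[String], where: String) extends BadSchema
      final case class EmptyKey(where: String)                               extends BadSchema

      final case class Column(name: String)
      final case class Projection(key: Seq[Column], columns: Seq[Column])

      def validateKey(names: Seq[String]): Seq[Column] Or BadSchema =
        if (names.isEmpty) Bad(EmptyKey("row key")) else Good(names.map(Column))

      def validateColumns(names: Seq[String]): Seq[Column] Or BadSchema =
        if (names.contains("")) Bad(MissingColumnNames(Seq(""), "projection"))
        else Good(names.map(Column))

      // Typed errors instead of exceptions: the first Bad stops the comprehension.
      def makeProjection(keyNames: Seq[String], colNames: Seq[String]): Projection Or BadSchema =
        for {
          key  <- validateKey(keyNames)
          cols <- validateColumns(colNames)
        } yield Projection(key, cols)
    }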

  • Machine-Speed Scala

    http://github.com/velvia/filo

    https://github.com/filodb/FiloDB/blob/new-storage-format/core/src/main/scala/filodb.core/binaryrecord/BinaryRecord.scala


  • Filo: High Performance Binary Vectors

    Designed for NoSQL, not a file format

    random or linear access

    on or off heap

    missing value support

    Scala only, but cross-platform support possible

    http://github.com/velvia/filo is a binary data vector library designed for extreme read performance with minimal deserialization costs.


  • Billions of Ops / Sec

    JMH benchmark: 0.5ns per FiloVector element access / add

    2 Billion adds per second - single threaded

    Who said Scala cannot be fast?

    Spark API (row-based) limits performance significantly

    val randomInts = (0 until numValues).map(i => util.Random.nextInt)
    val randomIntsAray = randomInts.toArray
    val filoBuffer = VectorBuilder(randomInts).toFiloBuffer
    val sc = FiloVector[Int](filoBuffer)

    @Benchmark
    @BenchmarkMode(Array(Mode.AverageTime))
    @OutputTimeUnit(TimeUnit.MICROSECONDS)
    def sumAllIntsFiloApply(): Int = {
      var total = 0
      for { i <- 0 until numValues } { total += sc(i) }
      total
    }

  • Thank you Scala OSS!