Spark and Online Analytics: Spark Summit East talk by Shubham Chopra

© 2017 Bloomberg Finance L.P. All rights reserved.

February 9, 2017

Shubham Chopra
Software Engineer

Spark and Online Analytics
Spark Summit East 2017


Agenda
• Data and Analytics at Bloomberg
• The role of Spark
• The Bloomberg Spark Server
• Spark for online use cases


Data and Analytics are our Business


Analytics at Bloomberg
• Human-time, interactive analytics
• Scalability
  • Handle increasingly sophisticated client analytic workflows
  • Ad-hoc and cross-domain aggregations, filtering
• Heterogeneous data stores
  • Analytics often requires data from multiple stores
• Low-latency updates, in addition to queries


Spark for Bloomberg Analytics
• Distributed compute scales well for:
  • Large security universes
  • Multi-universe cross-domain queries
• Abstract away heterogeneous data sources and present a consistent interface for efficient data access
• Spark as a tool for systems integration
• Connectors and primitives to deal with incoming streams
• Cache intermediate compute for fast queries


Spark as a Service?
• Stand-alone Spark Apps on isolated clusters pose challenges:
  • Redundancy in:
    • Crafting and managing RDDs/DFs
    • Coding of the same or similar types of transforms/actions
    • Management of clusters, replication of data, etc.
  • Analytics are confined to specific content sets, making cross-asset analytics much harder
  • Need to handle real-time ingestion in each App


Bloomberg Spark Server
• A single long-running Spark application
• Analytics deployed as Request Processors and served via a REST API
• Can be deployed on YARN or Mesos or standalone
• Ingest-time transforms to load data in Spark from a backing store
• Query-time transforms to run analytics on the ingested data in Spark
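
The split between ingest-time and query-time transforms can be illustrated with a small plain-Python sketch. This is not the Bloomberg Spark Server API: the class `SparkServerSketch`, its methods, the `/avg_px` route and the sample rows are all hypothetical, a list of dicts stands in for a DataFrame, and a direct method call stands in for a REST request.

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins: a list of dict rows plays the role of a DataFrame.
Row = Dict[str, object]
Dataset = List[Row]


class SparkServerSketch:
    """One long-lived process that holds ingested data and registered analytics."""

    def __init__(self) -> None:
        self._datasets: Dict[str, Dataset] = {}
        self._processors: Dict[str, Callable[[Dataset, dict], object]] = {}

    def ingest(self, name: str, backing_store: Dataset,
               transform: Callable[[Dataset], Dataset]) -> None:
        # Ingest-time transform: runs once, result is kept for later queries.
        self._datasets[name] = transform(backing_store)

    def register(self, route: str,
                 processor: Callable[[Dataset, dict], object]) -> None:
        # A request processor is just a named function over ingested data.
        self._processors[route] = processor

    def handle(self, route: str, dataset: str, params: dict) -> object:
        # Query-time transform: runs per request on already-ingested data.
        return self._processors[route](self._datasets[dataset], params)


def drop_missing_prices(rows: Dataset) -> Dataset:
    return [r for r in rows if r["px"] is not None]


def avg_px(rows: Dataset, params: dict) -> float:
    prices = [r["px"] for r in rows if r["ticker"] == params["ticker"]]
    return sum(prices) / len(prices)


store = [
    {"ticker": "AAA", "px": 10.0},
    {"ticker": "BBB", "px": None},
    {"ticker": "AAA", "px": 12.0},
]
server = SparkServerSketch()
server.ingest("prices", store, drop_missing_prices)
server.register("/avg_px", avg_px)
result = server.handle("/avg_px", "prices", {"ticker": "AAA"})
```

Because ingestion happens once inside the long-running process, each request only pays for the query-time function.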


Bloomberg Spark Server


Spark Server: Content Caching
• Data access has long tail characteristics
• High value data sub-setted within Spark
• Specified as a filter predicate at time of registration
• Seamless unification of data in Spark and backing store
• Reliability?
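
One way to picture predicate-based caching is the sketch below. It is only an illustration under assumptions: `CachedContent`, the `year` field and the sample rows are made up, the "cache" is an in-memory list rather than Spark, and the cold path simply scans the backing store where a real system would push the query down to it.

```python
class CachedContent:
    """Keeps the high-value subset in memory; the rest stays in the backing store."""

    def __init__(self, backing_store, predicate):
        self.backing_store = backing_store
        # Filter predicate supplied at registration time.
        self.predicate = predicate
        # High-value subset, selected once when the content is registered.
        self.cached = [row for row in backing_store if predicate(row)]

    def query(self, row_filter):
        # Hot path: rows already held in the cache.
        hot = [row for row in self.cached if row_filter(row)]
        # Cold path: rows the predicate excluded, fetched from the backing store.
        cold = [row for row in self.backing_store
                if not self.predicate(row) and row_filter(row)]
        # The caller sees one unified result, regardless of where rows came from.
        return hot + cold


store = [
    {"ticker": "AAA", "year": 2016},
    {"ticker": "AAA", "year": 2009},
    {"ticker": "BBB", "year": 2017},
]
content = CachedContent(store, predicate=lambda row: row["year"] >= 2016)
rows = content.query(lambda row: row["ticker"] == "AAA")
```

Here only the recent rows are cached, yet the query for ticker AAA still returns the 2009 row from the backing store.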


Spark HA: State of the World
• Execution lineage in Driver
  • Recovery from lost RDDs
• RDD Replication
  • Low latency, even with lost executors
  • Support for “MEMORY_ONLY”, “MEMORY_ONLY_2”, “MEMORY_ONLY_SER”, “MEMORY_ONLY_SER_2” modes for in-memory persistence. Easily extensible to more replicas if needed.
• Speculative execution
  • Minimizing performance hit from stragglers
• Off-heap data
  • Minimizing GC stalls
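
Two of the features above are switched on through Spark configuration. The sketch below turns a small settings dict into `spark-submit --conf` arguments. The keys `spark.speculation`, `spark.memory.offHeap.enabled` and `spark.memory.offHeap.size` are standard Spark settings, but the values and the helper `to_submit_args` are assumptions for illustration, and relating the off-heap bullet to these particular keys is an interpretation, not something the slide states. Replicated storage levels such as MEMORY_ONLY_2 are chosen in application code when persisting and are not shown here.

```python
# Standard Spark configuration keys; the values are illustrative assumptions.
ha_conf = {
    "spark.speculation": "true",                       # re-launch slow tasks
    "spark.memory.offHeap.enabled": "true",            # allow off-heap memory
    "spark.memory.offHeap.size": str(4 * 1024 ** 3),   # 4 GiB, given in bytes
}


def to_submit_args(conf):
    """Hypothetical helper: render a settings dict as spark-submit arguments."""
    args = []
    for key in sorted(conf):
        args += ["--conf", f"{key}={conf[key]}"]
    return args


submit_args = to_submit_args(ha_conf)
```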


Spark Architecture


RDD Block Replication
[Sequence diagram. Participants: Driver, Executor-1, Executor-2. Message labels: Compute RDD; Computation complete; Get peers for replication; List of peers; Replicate block to peer; Block stored locally; Results of computation.]


RDD Block Replication: Challenges
• Lost RDD partitions costly to recover
  • Data replenished at query time
• RDD replicated to random executors
  • On YARN, multiple executors can be brought up on the same node in different containers
  • Hence multiple replicas possible on the same node/rack, susceptible to node/rack failure
• Lost block replicas not recovered proactively


Topology Aware Replication (SPARK-15352)
• Making peer selection for replication pluggable
  • Driver gets topology information for executors
  • Executors informed about this topology information
  • Executors use prioritization logic to order peers for block replication
  • Pluggable TopologyMapper and BlockReplicationPrioritizer
  • Default implementation replicates current Spark behavior
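
The two pluggable pieces named above can be sketched in plain Python. This is not Spark's implementation: `Peer`, `map_topology` and `prioritize` are hypothetical names, the host-to-rack table is made up, and the ordering rule (other rack first, then other host, then same host) is one possible policy chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Peer:
    executor_id: str
    host: str
    rack: Optional[str] = None


def map_topology(executor_id: str, host: str,
                 host_to_rack: Dict[str, str]) -> Peer:
    """Mapper role: attach rack information to an executor (None if unknown)."""
    return Peer(executor_id, host, host_to_rack.get(host))


def prioritize(source: Peer, peers: List[Peer]) -> List[Peer]:
    """Prioritizer role: order candidate peers for replicating a block."""
    def rank(peer: Peer) -> int:
        if peer.rack is not None and peer.rack != source.rack:
            return 0  # survives a rack failure
        if peer.host != source.host:
            return 1  # survives a node failure
        return 2      # same node, e.g. another container on the same host
    # sorted() is stable, so peers with equal rank keep their input order.
    return sorted(peers, key=rank)


racks = {"h1": "r1", "h2": "r1", "h3": "r2"}
source = map_topology("e1", "h1", racks)
candidates = [
    map_topology("e2", "h1", racks),
    map_topology("e3", "h2", racks),
    map_topology("e4", "h3", racks),
]
ordered_ids = [p.executor_id for p in prioritize(source, candidates)]
```

With this ordering, the executor sharing a host with the source is tried last, which addresses the same-node replica problem described for YARN.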


Topology Aware Replication (SPARK-15352)
• Customizable prioritization strategies to suit different deployments
• Variety of replication objectives – ReplicateToDifferentHost, ReplicateBlockWithinRack, ReplicateBlockOutsideRack
• Optimizer to find a minimum number of peers to meet the objectives
  • Replicate to these peers with a higher priority
• Proactive replenishment of lost replicas
  • BlockManagerMasterEndpoint triggered replenishment when an executor failure is detected.
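
A minimal sketch of the optimizer and of proactive replenishment follows, under stated assumptions. The objective names come from the slide, but their definitions here are an interpretation (in particular, "within rack" is taken to mean same rack on a different host). `minimal_peers` and `replenish` are hypothetical helpers, and the search is brute force, which is adequate only for small peer lists.

```python
from itertools import combinations

# Objective names from the slide; the predicates are assumed interpretations.
OBJECTIVES = {
    "ReplicateToDifferentHost":
        lambda src, p: p["host"] != src["host"],
    "ReplicateBlockWithinRack":
        lambda src, p: p["rack"] == src["rack"] and p["host"] != src["host"],
    "ReplicateBlockOutsideRack":
        lambda src, p: p["rack"] != src["rack"],
}


def minimal_peers(src, peers, objectives):
    """Smallest set of peers such that every objective is met by some peer."""
    for size in range(1, len(peers) + 1):
        for combo in combinations(peers, size):
            if all(any(obj(src, p) for p in combo)
                   for obj in objectives.values()):
                return list(combo)
    return []  # the objectives cannot all be met with these peers


def replenish(block_locations, failed, live, target=2):
    """After an executor failure, plan which executors should receive copies."""
    plan = {}
    for block, holders in block_locations.items():
        remaining = holders - {failed}
        missing = target - len(remaining)
        if remaining and missing > 0:
            candidates = [e for e in live if e not in remaining and e != failed]
            plan[block] = candidates[:missing]
    return plan


src = {"id": "e1", "host": "h1", "rack": "r1"}
peers = [
    {"id": "e2", "host": "h1", "rack": "r1"},
    {"id": "e3", "host": "h2", "rack": "r1"},
    {"id": "e4", "host": "h3", "rack": "r2"},
]
chosen_ids = [p["id"] for p in minimal_peers(src, peers, OBJECTIVES)]

locations = {"rdd_0_0": {"e1", "e3"}, "rdd_0_1": {"e3", "e4"}}
plan = replenish(locations, failed="e3", live=["e1", "e2", "e4"])
```

In this example no single peer satisfies all three objectives, so the search settles on two peers. The replenishment plan restores each block to two replicas without waiting for a query to hit the missing data.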


Spark HA: Challenges
• High Availability of Spark Driver
  • High bootstrap cost of reconstructing cluster and cached state
  • Naïve HA models (such as multiple active clusters) surface query inconsistency
• High Availability and Low Tail Latency closely related


Spark HA – A Strawman
• Multiple Spark Servers in Leader-Standby configuration
• Each Spark Server backed by a different Spark Cluster
• Each Spark Server refreshed with up-to-date data
• Queries to standbys redirected to leader
  • Only leader responds to queries, for data consistency
• RDD Partition loss in the leader still a concern
  • Performance still gated by slowest executor in leader
• Resource usage amplified by the number of Spark Servers
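
The routing rule of the strawman can be sketched as below. `ServerNode` and `LeaderStandbyGroup` are hypothetical names, leader election is reduced to "first node in the list", and a Python list stands in for each server's cached data.

```python
class ServerNode:
    """Stand-in for one Spark Server backed by its own cluster."""

    def __init__(self, name):
        self.name = name
        self.data = []

    def refresh(self, rows):
        self.data = list(rows)


class LeaderStandbyGroup:
    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.leader = self.nodes[0]  # assumption: first node starts as leader

    def refresh_all(self, rows):
        # Every server is kept up to date, so a standby can take over quickly.
        for node in self.nodes:
            node.refresh(rows)

    def query(self, entry_node, analytic):
        # A standby never answers from its own copy; the leader always does.
        redirected = entry_node is not self.leader
        return self.leader.name, redirected, analytic(self.leader.data)

    def fail_over(self):
        # Drop the failed leader and promote the next node.
        self.nodes.remove(self.leader)
        self.leader = self.nodes[0]
        return self.leader


a, b = ServerNode("server-a"), ServerNode("server-b")
group = LeaderStandbyGroup([a, b])
group.refresh_all([1, 2, 3])
served_by, redirected, total = group.query(b, sum)
group.fail_over()
served_by_after, redirected_after, total_after = group.query(b, sum)
```

The sketch also makes the cost visible: every node holds a full copy of the data, yet only one of them serves queries at any time.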


Spark Driver State
• Spark Driver is an arbitrary Java application
• Only a subset of the state is interesting or expensive to reconstruct
• For online use cases, only RDDs/DFs created during ingestion are of interest
• Expressing ingestion using DFs has better decoupling of data/state than RDDs


Spark Driver State*
• BlockManagerMasterEndpoint holds Block <-> Executor assignment
• CacheManager holds LogicalPlan and DataFrame references
  • Used to short-circuit queries with pre-cached query plans, if possible
• JobScheduler
  • Keeps track of various stages and tasks being scheduled
• Executor information
  • Hostname and ports of live executors

*Illustrative, not exhaustive


Externalizing Driver State
Benefits:
• Quicker recoveries
• No need to restart executors
• State accessible from multiple Active-Active drivers

Solutions:
• Off-heap storage for RDDs
• Residual book-keeping driver state externalized to ZooKeeper
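
The recovery benefit can be illustrated with a small sketch in which a second driver picks up book-keeping state left by the first. Everything here is hypothetical: `ExternalStore` is an in-memory stand-in for a coordination service such as ZooKeeper (no ZooKeeper client is used), and `DriverSketch`, the path `/driver/block-locations` and the block and executor ids are invented for the example.

```python
import json


class ExternalStore:
    """In-memory stand-in for an external coordination service."""

    def __init__(self):
        self._nodes = {}

    def put(self, path, value):
        # Values are serialized, as they would be when leaving the driver JVM.
        self._nodes[path] = json.dumps(value)

    def get(self, path, default=None):
        raw = self._nodes.get(path)
        return default if raw is None else json.loads(raw)


class DriverSketch:
    STATE_PATH = "/driver/block-locations"  # hypothetical path

    def __init__(self, store):
        self.store = store
        # On start-up, recover any state a previous driver left behind.
        self.block_locations = store.get(self.STATE_PATH, {})

    def record_block(self, block_id, executor_id):
        holders = self.block_locations.setdefault(block_id, [])
        if executor_id not in holders:
            holders.append(executor_id)
        # Write through, so the external copy is never behind the local one.
        self.store.put(self.STATE_PATH, self.block_locations)


store = ExternalStore()
first = DriverSketch(store)
first.record_block("rdd_0_0", "e1")
first.record_block("rdd_0_0", "e2")

# Simulate a driver restart: a new driver attaches to the same store.
second = DriverSketch(store)
recovered = second.block_locations
```

The second driver learns where the blocks live without recomputing anything, so executors holding the data would not need to be restarted.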


Quorum of Drivers


THANK YOU
