Spark and Online Analytics: Spark Summit East talk by Shubham Chopra
Transcript of Spark and Online Analytics: Spark Summit East talk by Shubham Chopra
© 2017 Bloomberg Finance L.P. All rights reserved.
February 9, 2017
Shubham Chopra, Software Engineer
Spark and Online Analytics
Spark Summit East 2017
Agenda
• Data and Analytics at Bloomberg
• The role of Spark
• The Bloomberg Spark Server
• Spark for online use cases
Data and Analytics are our Business
Analytics at Bloomberg
• Human-time, interactive analytics
• Scalability
  • Handle increasingly sophisticated client analytic workflows
  • Ad-hoc and cross-domain aggregations, filtering
• Heterogeneous data stores
  • Analytics often requires data from multiple stores
• Low-latency updates, in addition to queries
Spark for Bloomberg Analytics
• Distributed compute scales well for:
  • Large security universes
  • Multi-universe cross-domain queries
• Abstract away heterogeneous data sources and present a consistent interface for efficient data access
• Spark as a tool for systems integration
• Connectors and primitives to deal with incoming streams
• Cache intermediate compute for fast queries
Spark as a Service?
• Stand-alone Spark Apps on isolated clusters pose challenges:
  • Redundancy in:
    • Crafting and managing RDDs/DFs
    • Coding of the same or similar types of transforms/actions
  • Management of clusters, replication of data, etc.
  • Analytics are confined to specific content sets, making cross-asset analytics much harder
  • Need to handle real-time ingestion in each App
Bloomberg Spark Server
• A single long-running Spark application
• Analytics deployed as Request Processors and served via a REST API
• Can be deployed on YARN, Mesos, or standalone
• Ingest-time transforms to load data into Spark from a backing store
• Query-time transforms to run analytics on the ingested data in Spark
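The split between ingest-time and query-time transforms can be illustrated with a small sketch in plain Python, with no Spark dependency. Everything here is hypothetical and not taken from the talk: the class `SparkServerSketch`, its methods, the `/max_px` route, and the sample rows. A list of dicts stands in for a cached DataFrame, and a method call stands in for a REST request.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


class SparkServerSketch:
    """Hypothetical stand-in for a long-running server that holds ingested data."""

    def __init__(self) -> None:
        self._datasets: Dict[str, List[Row]] = {}
        self._processors: Dict[str, Callable[[dict], object]] = {}

    def register_dataset(
        self,
        name: str,
        load: Callable[[], List[Row]],
        ingest_transform: Callable[[Row], Row],
    ) -> None:
        # Ingest time: read from the backing store once, transform, keep in memory.
        self._datasets[name] = [ingest_transform(row) for row in load()]

    def register_processor(
        self,
        route: str,
        dataset: str,
        query_transform: Callable[[List[Row], dict], object],
    ) -> None:
        # Query time: each request runs a transform over the already-ingested rows.
        self._processors[route] = lambda params: query_transform(
            self._datasets[dataset], params
        )

    def handle(self, route: str, params: dict) -> object:
        # Stand-in for the REST layer: dispatch a request to its processor.
        return self._processors[route](params)


server = SparkServerSketch()
server.register_dataset(
    "prices",
    load=lambda: [{"ticker": "AAA", "px": "10.5"}, {"ticker": "BBB", "px": "20.0"}],
    ingest_transform=lambda r: {"ticker": r["ticker"], "px": float(r["px"])},
)
server.register_processor(
    "/max_px",
    "prices",
    lambda rows, params: max(r["px"] for r in rows if r["px"] >= params["min_px"]),
)
result = server.handle("/max_px", {"min_px": 11.0})
```

In this sketch the string-to-float parsing happens once at ingestion, so each request only pays for the query-time transform.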
Bloomberg Spark Server
Spark Server: Content Caching
• Data access has long-tail characteristics
• High-value data sub-setted within Spark
• Specified as a filter predicate at time of registration
• Seamless unification of data in Spark and backing store
• Reliability?
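One way to picture the caching bullets above is a lookup that serves the high-value subset from memory and falls through to the backing store for the long tail. This is a sketch under stated assumptions, not the server's implementation: `CachedSubset`, the keyed dict standing in for the backing store, and the volume-based predicate are all hypothetical.

```python
from typing import Callable, Dict, Optional

Row = Dict[str, object]


class CachedSubset:
    """Hypothetical cache of the rows selected by a registration-time predicate."""

    def __init__(
        self, backing_store: Dict[str, Row], predicate: Callable[[str, Row], bool]
    ) -> None:
        self._store = backing_store
        # At registration, only rows matching the filter predicate are cached.
        self._cache = {k: v for k, v in backing_store.items() if predicate(k, v)}
        self.store_reads = 0

    def get(self, key: str) -> Optional[Row]:
        # Callers see one interface; the source of the row is hidden from them.
        if key in self._cache:
            return self._cache[key]
        self.store_reads += 1
        return self._store.get(key)


store = {"AAA": {"volume": 900}, "BBB": {"volume": 5}}
subset = CachedSubset(store, lambda key, row: row["volume"] > 100)

hot = subset.get("AAA")    # served from the cached subset
reads_after_hot = subset.store_reads
cold = subset.get("BBB")   # falls through to the backing store
reads_after_cold = subset.store_reads
```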
Spark HA: State of the World
• Execution lineage in Driver
  • Recovery from lost RDDs
• RDD Replication
  • Low latency, even with lost executors
  • Support for “MEMORY_ONLY”, “MEMORY_ONLY_2”, “MEMORY_ONLY_SER”, “MEMORY_ONLY_SER_2” modes for in-memory persistence. Easily extensible to more replicas if needed.
• Speculative execution
  • Minimizing performance hit from stragglers
• Off-heap data
  • Minimizing GC stalls
Spark Architecture
RDD Block Replication
[Sequence diagram. Participants: Driver, Executor-1, Executor-2. Message labels: Compute RDD; Computation complete; Get peers for replication; List of peers; Replicate block to peer; Block stored locally; Results of computation]
RDD Block Replication: Challenges
• Lost RDD partitions costly to recover
  • Data replenished at query time
• RDD replicated to random executors
  • On YARN, multiple executors can be brought up on the same node in different containers
  • Hence multiple replicas possible on the same node/rack, susceptible to node/rack failure
• Lost block replicas not recovered proactively
Topology Aware Replication (SPARK-15352)
• Making peer selection for replication pluggable
  • Driver gets topology information for executors
  • Executors informed about this topology information
  • Executors use prioritization logic to order peers for block replication
  • Pluggable TopologyMapper and BlockReplicationPrioritizer
  • Default implementation replicates current Spark behavior
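The pluggable pieces listed above can be modelled in a few lines of plain Python. This is only an illustration of the idea, not Spark's code: the class and function names, the `topology_for_host` method, and the host and rack labels are all hypothetical. The default prioritizer shuffles peers, mirroring the random peer selection described earlier, while the rack-aware one uses the mapper's topology information.

```python
import random
from typing import Dict, List, Optional


class DefaultTopologyMapper:
    """Knows nothing about the cluster layout: every host maps to no topology."""

    def topology_for_host(self, host: str) -> Optional[str]:
        return None


class StaticTopologyMapper:
    """Hypothetical mapper backed by an operator-supplied host -> rack table."""

    def __init__(self, host_to_rack: Dict[str, str]) -> None:
        self._host_to_rack = host_to_rack

    def topology_for_host(self, host: str) -> Optional[str]:
        return self._host_to_rack.get(host)


def random_prioritizer(source_host: str, peer_hosts: List[str], mapper) -> List[str]:
    # Default behaviour in this sketch: ignore topology and shuffle the peers.
    peers = list(peer_hosts)
    random.shuffle(peers)
    return peers


def rack_aware_prioritizer(source_host: str, peer_hosts: List[str], mapper) -> List[str]:
    # Prefer peers on a different rack from the source. sorted() is stable and
    # False sorts before True, so off-rack peers keep their order and come first.
    source_rack = mapper.topology_for_host(source_host)
    return sorted(peer_hosts, key=lambda h: mapper.topology_for_host(h) == source_rack)


mapper = StaticTopologyMapper({"h1": "r1", "h2": "r1", "h3": "r2"})
ordered = rack_aware_prioritizer("h1", ["h2", "h3"], mapper)
shuffled = random_prioritizer("h1", ["h2", "h3"], DefaultTopologyMapper())
```

Because the prioritizer only receives a mapper, swapping either piece changes placement without touching the replication path.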
Topology Aware Replication (SPARK-15352)
• Customizable prioritization strategies to suit different deployments
• Variety of replication objectives
  – ReplicateToDifferentHost, ReplicateBlockWithinRack, ReplicateBlockOutsideRack
• Optimizer to find a minimum number of peers to meet the objectives
• Replicate to these peers with a higher priority
• Proactive replenishment of lost replicas
• BlockManagerMasterEndpoint triggered replenishment when an executor failure is detected.
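A rough sketch of the optimizer idea: given the three objectives named above, search for the smallest set of peers that jointly satisfies them and put those peers first. The objective names come from the slide, but their exact semantics are assumptions here (in particular, "within rack" is read as same rack on a different host). `Peer`, `prioritize`, and the exhaustive search are hypothetical and chosen for clarity; this is not the SPARK-15352 implementation.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Peer:
    executor_id: str
    host: str
    rack: str


# Each objective answers: does replicating from `src` to `p` satisfy it?
# These readings of the slide's objective names are assumptions.
OBJECTIVES: Dict[str, Callable[[Peer, Peer], bool]] = {
    "ReplicateToDifferentHost": lambda src, p: p.host != src.host,
    "ReplicateBlockWithinRack": lambda src, p: p.rack == src.rack and p.host != src.host,
    "ReplicateBlockOutsideRack": lambda src, p: p.rack != src.rack,
}


def prioritize(source: Peer, peers: List[Peer], objectives: List[str]) -> List[Peer]:
    """Order peers so that a smallest set meeting the objectives comes first."""
    checks = [OBJECTIVES[name] for name in objectives]
    # Ignore objectives no peer can meet, so a satisfying set always exists.
    satisfiable = [c for c in checks if any(c(source, p) for p in peers)]
    # Try subsets in increasing size; the first hit is a minimum-size set.
    for size in range(len(peers) + 1):
        for subset in combinations(peers, size):
            if all(any(c(source, p) for p in subset) for c in satisfiable):
                rest = [p for p in peers if p not in subset]
                return list(subset) + rest
    return list(peers)


source = Peer("e1", "h1", "r1")
peers = [Peer("e2", "h1", "r1"), Peer("e3", "h2", "r1"), Peer("e4", "h3", "r2")]
ordered_ids = [p.executor_id for p in prioritize(source, peers, list(OBJECTIVES))]
```

Here no single peer meets all three objectives, so the pair on a different host in the same rack (`e3`) and on another rack (`e4`) is ranked ahead of the peer that shares the source's host. The exhaustive search is exponential in the number of peers and is only meant to make the "minimum number of peers" idea concrete.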
Spark HA: Challenges
• High Availability of Spark Driver
• High bootstrap cost of reconstructing cluster and cached state
• Naïve HA models (such as multiple active clusters) surface query inconsistency
• High Availability and Low Tail Latency closely related
Spark HA – A Strawman
• Multiple Spark Servers in Leader-Standby configuration
• Each Spark Server backed by a different Spark Cluster
• Each Spark Server refreshed with up-to-date data
• Queries to standbys redirected to leader
• Only leader responds to queries: data consistency
• RDD partition loss in the leader still a concern
• Performance still gated by slowest executor in leader
• Resource usage amplified by the number of Spark Servers
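The strawman above can be sketched as a toy model: every server ingests every update, but only the leader answers, and a standby takes over on failure. The names `SparkServerNode` and `LeaderStandbyCluster` are hypothetical, leader election is reduced to "first node in the list", and the sketch does not model the Spark cluster behind each server.

```python
from typing import Dict, List


class SparkServerNode:
    """Hypothetical server with its own copy of the ingested data."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.data: Dict[str, float] = {}
        self.queries_served = 0


class LeaderStandbyCluster:
    def __init__(self, nodes: List[SparkServerNode]) -> None:
        self.nodes = list(nodes)
        self.leader = self.nodes[0]  # election is out of scope for this sketch

    def refresh(self, key: str, value: float) -> None:
        # Every server, leader or standby, is refreshed with the same update.
        for node in self.nodes:
            node.data[key] = value

    def query(self, receiving_node: SparkServerNode, key: str) -> dict:
        # Whichever node receives the request, only the leader answers it.
        self.leader.queries_served += 1
        return {
            "value": self.leader.data[key],
            "served_by": self.leader.name,
            "redirected": receiving_node is not self.leader,
        }

    def fail_leader(self) -> None:
        # The next standby takes over; it already holds up-to-date data.
        self.nodes.remove(self.leader)
        self.leader = self.nodes[0]


a, b = SparkServerNode("server-a"), SparkServerNode("server-b")
cluster = LeaderStandbyCluster([a, b])
cluster.refresh("px", 10.5)
before = cluster.query(b, "px")  # sent to the standby, answered by the leader
cluster.fail_leader()
after = cluster.query(b, "px")   # the former standby now answers directly
```

The sketch also makes the last bullet visible: both nodes hold the full data set even though only one of them serves queries at a time.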
Spark Driver State
• Spark Driver is an arbitrary Java application
• Only a subset of the state is interesting or expensive to reconstruct
• For online use cases, only RDDs/DFs created during ingestion are of interest
• Expressing ingestion using DFs has better decoupling of data/state than RDDs
Spark Driver State*
• BlockManagerMasterEndpoint holds Block <-> Executor assignment
• CacheManager holds LogicalPlan and DataFrame references
  • Used to short-circuit queries with pre-cached query plans, if possible
• JobScheduler
  • Keeps track of various stages and tasks being scheduled
• Executor information
  • Hostname and ports of live executors
*Illustrative, not exhaustive
Externalizing Driver State
Benefits:
• Quicker recoveries
• No need to restart executors
• State accessible from multiple Active-Active drivers
Solutions:
• Off-heap storage for RDDs
• Residual book-keeping driver state externalized to ZooKeeper
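To make the benefit concrete, the sketch below keeps a driver's block-to-executor book-keeping in an external store, so that a second driver can pick it up without recomputation. It is an assumption-laden toy: `ExternalStateStore` is an in-memory stand-in for a coordination service such as ZooKeeper and does not use any ZooKeeper client API, and `DriverSketch`, the path `/driver/block-locations`, and the block and executor ids are hypothetical.

```python
import json
from typing import Dict, List


class ExternalStateStore:
    """In-memory stand-in for an external coordination service."""

    def __init__(self) -> None:
        self._nodes: Dict[str, str] = {}

    def put(self, path: str, value: object) -> None:
        # Serialize, as a real external store would hold bytes rather than objects.
        self._nodes[path] = json.dumps(value)

    def get(self, path: str, default: object = None) -> object:
        raw = self._nodes.get(path)
        return default if raw is None else json.loads(raw)


class DriverSketch:
    BLOCKS_PATH = "/driver/block-locations"  # hypothetical path

    def __init__(self, store: ExternalStateStore) -> None:
        self._store = store
        # A new or additional driver starts from whatever was externalized.
        self.block_locations: Dict[str, List[str]] = store.get(self.BLOCKS_PATH, {})

    def record_block(self, block_id: str, executors: List[str]) -> None:
        # Write through: update local book-keeping and the external copy together.
        self.block_locations[block_id] = sorted(executors)
        self._store.put(self.BLOCKS_PATH, self.block_locations)


store = ExternalStateStore()
first_driver = DriverSketch(store)
first_driver.record_block("rdd_0_0", ["exec-2", "exec-1"])

# A second driver sees the same assignments without touching the executors.
second_driver = DriverSketch(store)
recovered = second_driver.block_locations
```

The second driver in this sketch reads state only at start-up; keeping several active drivers in sync afterwards would need change notifications from the store, which this toy does not model.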
Quorum of Drivers
THANK YOU