Big Data, Big Opportunity: A Primer for Understanding The Big Data Frontier
Sanjai Marimadaiah (@SanjaiM1), Michael Harer (@MikeHarer), Hiren Mandalia (@hiren0210)
CA Technologies Product Management, Office of the CTO, Big Data Management
Mainframe
Session MFX01E, #CAWorld
© 2015 CA. ALL RIGHTS RESERVED. @CAWORLD #CAWORLD
Abstract
Big Data environments are now business-critical for any organization. Learn the basics of Big Data and some of the emerging technologies targeting the Big Data space.
Agenda
1. What is Big Data?
2. Big Data use cases
3. Hadoop basics
4. NoSQL basics
5. Cassandra basics
6. MongoDB basics
How do I deliver a flawless experience every time an application touches the mainframe?
In the application economy it's all about your customers. You need to think about your mainframe reframed.
• Connect mobile-to-mainframe applications
• Create mainframe infrastructure flexibility for the future
• Unleash the power of data on the mainframe
What is Big Data?
Data sets whose volume, velocity, variety and complexity exceed the ability of commonly used software tools to capture, process, store, manage, and analyze them.
Information sources: mobile, transactional data, search, texts, CRM/SCM/ERP, images, email, social media, IT ops, audio, video
Dimensions of Big Data: velocity, volume, variety, complexity
Evolution of Data Management Solutions
Relational databases are not suited for Big Data
Timeline, 1960 to 2010:
• Hierarchical data models: IBM IMS
• Relational data models: Sybase, Informix, Oracle, IBM
• Document data models: Google, Hadoop
Data handled: structured data, then unstructured data
State of Database Workloads
Big Data workloads enable broader OLAP workloads
• Database (RDBMS): online transaction processing
• Data warehouse: online analytical processing
• Big Data: Big Data workloads
How the tiers feed each other:
• Better analytics for higher-value transactions
• Collect historical transactional data for analytics
• Adding more complete data enhances analytics
• Enhanced insights from operational workloads & information access applications
Big Data inputs:
• Multimedia
• Web logs
• Social data
• Sensor data: images
• RFID
• Text data: emails
What is Driving Big Data Solutions
Cost efficiency and a standardized platform are fostering innovation
Scale-out architecture (100's of inexpensive servers):
• Protects investment: just add more servers to expand capacity
• Lower cost of infrastructure: less expensive commodity servers (x86 based)
Open-source software (Hadoop, Cassandra):
• Standardization leads to innovation: a common programming interface is enabling innovation up the software stack
• Lower software cost: open-source software is lowering software cost
Adoption of Big Data Solutions
• 2X increase in the number of organizations that have deployed/implemented data-driven projects since 2014
• 40% of organizations are still planning to implement data projects
• 30% of organizations are still planning to implement data projects
Key trends:
• Greater priority on structured data initiatives
• Top vendor criteria: integration with existing infrastructure, security, ease of use
• Necessary skill sets: business analysts, data architects, data analysts & data visualizers
Source: 2015 CA Sponsored Research: Vanson Bourne Global Big Data User Survey
Overall Big Data Market
• The Big Data market was $27.36B in 2014, up from $19.6B in 2013.
• 89% of business leaders believe Big Data will revolutionize business ops the same way the Internet did.
• 83% have pursued Big Data projects in order to seize a competitive edge.
Wikibon projects the Big Data market will top $84B in 2026, attaining a 17% Compound Annual Growth Rate (CAGR) for the forecast period 2011 to 2026.
Source: 2015 Wikibon Big Data Market Forecast
Database for Big Data
Overall Big Data database market is projected to grow at 33% CAGR until 2017
Source: © Wikibon Big Data Model 2011-2017, Big Data Market Database Projection, 2011-2017 ($US billions)
• Big Data database market will grow at approx. 60% from 2011-2017 (6-year)
• Market for NoSQL databases was $0.2B in 2012, growing to $1.6B in 2017.
• Technology progression in Data-in-DRAM-Memory and Data-in-Flash-Memory will improve the scalability of SQL databases.
• Applications are easier to program and require lower maintenance if SQL is used; NoSQL has greater scalability and lower technology costs for very large big-data applications.
Source: 2015 Wikibon Big Data Model 2011-2017
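As a rough arithmetic check on the growth figures quoted on these market slides, the sketch below computes the compound annual growth rate implied by two endpoint values. The helper name `cagr` is hypothetical (it is not taken from the Wikibon material), and the calculation assumes growth is compounded once per year.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.10 means 10% per year)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


# NoSQL database market figures quoted on the slide (US$ billions).
nosql_cagr = cagr(0.2, 1.6, 2017 - 2012)

# Overall Big Data market figures quoted on the market slide (US$ billions).
market_growth_2014 = cagr(19.6, 27.36, 1)

print(f"NoSQL 2012-2017 implied CAGR: {nosql_cagr:.1%}")
print(f"Big Data market 2013-2014 growth: {market_growth_2014:.1%}")
```

Under these assumptions the NoSQL endpoints imply roughly 52% per year, and the 2013 to 2014 market figures imply roughly 40% growth.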
Vendor Landscape – Broader Participants
BIG DATA MARKET SEGMENT
HARDWARE
• Servers (chips): HP, Dell, Intel
• Storage: EMC/Dell, NetApp, Fusion-io
• Networking: Cisco, Arista Networks, Infineta Systems
SOFTWARE
• Hadoop: Hortonworks, Cloudera, MapR, Hadapt, EMC Greenplum
• NoSQL: Cassandra, MongoDB, Couchbase, DataStax, 10gen
• NGDW*: HP Vertica, EMC Greenplum, Teradata Aster, IBM Netezza, SAP
• Analytics & BI: Digital Reasoning, Revolution Analytics, Jaspersoft, Dataeet, Pentaho
• Management solutions: CA Big Data Control Center, Informatica, VMware, IBM BigInsights, HP HAVEn, Zettaset, BlueData EPIC, Syncsort, StackIQ, BMC Control-M
SERVICES
• Cloud services: Amazon, Google, MapR, IBM, Microsoft
• Technical services: Hortonworks, Cloudera, Cloudwick, EMC, IBM
• Professional services: Think Big Analytics, IBM, EMC, Accenture, Deloitte
* NGDW = Next Generation Data Warehouse
Core infrastructure: Hadoop, Cassandra, MongoDB, Amazon Big Data, MapR, Elasticsearch
Big Data Use Case Studies
Media & Entertainment Use Case
PROBLEM
• A company's streaming business has expanded from thousands of members watching occasionally to millions of members watching over two billion hours every month.
• A collection of events describing what is being viewed must be gathered. Given that viewing is what members spend most of their time doing, what's needed is a robust and scalable architecture to manage and process this.
• Certain things will break the architecture that processes billions of viewing-related events per day.
SOLUTION
• Focus on the minimum viable set of use cases
• Availability over consistency: our primary use cases can tolerate eventually consistent data, so design from the start favoring availability rather than strong consistency in the face of failures.
POTENTIAL BENEFITS
• By focusing on the minimum viable set of use cases, rather than building a generic all-encompassing solution, we have been able to build a simple architecture that scales.
• The company's viewing data architecture is designed for a variety of use cases, ranging from user experiences to data analytics. The following are three key use cases, all of which affect the user experience:
Health Care Use Case
PROBLEM
• Relapses in cardiac patients
• "One size fits all" treatment
• Medicare readmission penalties
• Sensitive patient data on z Systems VSAM files
• No efficient way to offload
SOLUTION
• Identify risk factors by analyzing patient data*
• Factors used to predict likely outcomes
POTENTIAL BENEFITS
• Reduction in readmissions
• Savings from no penalty fees
• No manual intervention
• No increase in staffing
* System z VSAM database requires special skills to access without vStorm Connect Data Streaming for Big Data
Retail Use Case
PROBLEM
• Streams of user data not correlated, e.g. store purchases, website usage pattern, card usage, historical customer data
• Historical customer data is System z VSAM & DB2 based, with no efficient, secure offload
SOLUTION
• HDFS securely populated with historical customer data, card usage, store purchases, website logs
• Splunk scores customers based on the various data streams
• High-scoring customers offered coupons, special deals on website
POTENTIAL BENEFITS
• Increase in online sales in the middle of retail slowdown
• Improved conversion rate of website browsing customers (shopping cart to sales)
• Elimination of data silos: since analytics now cover all data, no more reliance on multiple reports/formats
Hadoop Basics
What is Hadoop?
Hadoop is open-source software designed to be highly scalable, fault tolerant and highly distributed.
Key elements:
1. Distributed processing of Big Data (e.g. MapReduce)
2. Distributed storage (Hadoop Distributed File System, or HDFS)
Hadoop 1.0 stack:
• MapReduce (resource management & data processing)
• HDFS (distributed reliable storage)
Hadoop 2.0 stack:
• MapReduce (distributed programming) and Spark (in memory)
• YARN (resource management)
• HDFS (distributed reliable storage)
Components on top of the stack: HBase (NoSQL store), Hive (query), Pig (scripting), Oozie (workflow)
MapReduce – Core Hadoop
A distributed computing framework
Hadoop's MapReduce framework involves two phases:
1. Map phase: distributes the data set among multiple servers and operates on the data locally.
2. Reduce phase: recombines the partial results.
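The two phases can be sketched in plain Python with the classic word-count example. This is a single-process illustration only, not the Hadoop API: the function names `map_phase`, `shuffle` and `reduce_phase` are hypothetical, and each input string stands in for the slice of data held by one server.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_phase(chunk: str) -> List[Tuple[str, int]]:
    """Run on one server against its local chunk: emit a (word, 1) pair per word."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group intermediate values by key so that each key reaches a single reducer."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped: Dict[str, List[int]]) -> Dict[str, int]:
    """Recombine the partial results: sum the counts collected for each word."""
    return {key: sum(values) for key, values in grouped.items()}


# Each string stands in for the slice of the data set held by one server.
chunks = ["big data big opportunity", "big data frontier"]
intermediate = [pair for chunk in chunks for pair in map_phase(chunk)]
word_counts = reduce_phase(shuffle(intermediate))
print(word_counts)
```

In a real cluster the map calls run in parallel on the nodes that hold the data, and the grouping step moves data across the network between the two phases.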
MapReduce – Core Hadoop
A distributed computing framework
Processes large jobs in parallel across many nodes and combines the results.
• JobTracker: one of the core Hadoop services; it manages the jobs and the resources in the cluster (TaskTrackers). The JobTracker tries to schedule a "map" as close as possible to the actual data being processed.
• TaskTracker: deployed on the data nodes and responsible for running the map and reduce tasks as instructed by the JobTracker.
Diagram: the JobTracker on the master node dispatches Job-1 through Job-5 as MapReduce (MR) tasks to the TaskTrackers on the data nodes (slave nodes); the five data nodes hold blocks 2-4-5, 1-2-5, 1-3-4, 2-3-5 and 1-3-4.
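The data-locality idea (schedule a map close to its data) can be sketched as a small greedy assignment. The block layout is read from the slide's diagram, where the five data nodes are labelled 245, 125, 134, 235 and 134; the node names, the one-slot-per-node setup and the function `schedule_map_tasks` are assumptions for illustration and do not reproduce the JobTracker's actual algorithm.

```python
from typing import Dict, Set

# Block placement read from the slide's diagram: node -> ids of the blocks it stores.
# The node names are hypothetical.
BLOCKS_ON_NODE: Dict[str, Set[int]] = {
    "node-a": {2, 4, 5},
    "node-b": {1, 2, 5},
    "node-c": {1, 3, 4},
    "node-d": {2, 3, 5},
    "node-e": {1, 3, 4},
}


def schedule_map_tasks(
    blocks_on_node: Dict[str, Set[int]], free_slots: Dict[str, int]
) -> Dict[int, str]:
    """Give each block's map task to a node that stores the block when one has a free slot."""
    slots = dict(free_slots)
    assignment: Dict[int, str] = {}
    for block in sorted(set().union(*blocks_on_node.values())):
        local = [
            node
            for node, held in blocks_on_node.items()
            if block in held and slots.get(node, 0) > 0
        ]
        # Fall back to any node with capacity if no local node is free.
        candidates = local or [node for node, free in slots.items() if free > 0]
        if not candidates:
            raise RuntimeError("no free task slots left in the cluster")
        # Prefer the least loaded candidate; the name breaks ties deterministically.
        chosen = max(candidates, key=lambda node: (slots[node], node))
        assignment[block] = chosen
        slots[chosen] -= 1
    return assignment


assignment = schedule_map_tasks(BLOCKS_ON_NODE, {name: 1 for name in BLOCKS_ON_NODE})
print(assignment)
```

With one free slot per node, every map task in this example lands on a node that already stores its block, so no block has to cross the network before processing.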
Hadoop Distributed File System (HDFS)
Self-healing, high-bandwidth clustered storage
HDFS breaks incoming files into blocks and stores them redundantly across the cluster.
• NameNode: one of the core Hadoop services; it maintains the namespace, knows where data is and manages blocks on the data nodes.
• DataNode: servers that actually store the data on their local disks.
• Secondary NameNode: performs a periodic checkpoint of the primary NameNode's metadata; the checkpoint can serve as a backup for recovery in case of failure.
Diagram: primary and secondary NameNode on the master node, linked by the periodic checkpoint; data nodes and TaskTrackers on the slave nodes, holding blocks 2-4-5, 1-2-5, 1-3-4, 2-3-5 and 1-3-4.
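A minimal sketch of the "break into blocks, store redundantly" idea follows. The 128 MB block size and replication factor of 3 are illustrative values (they match common Hadoop 2 defaults), the data node names are hypothetical, and the round-robin placement is a deliberate simplification: real HDFS placement is rack-aware.

```python
from typing import Dict, List

MB = 1024 * 1024


def split_into_blocks(file_size: int, block_size: int) -> List[int]:
    """Return the sizes in bytes of the blocks a file of file_size bytes is cut into."""
    full_blocks, remainder = divmod(file_size, block_size)
    return [block_size] * full_blocks + ([remainder] if remainder else [])


def place_replicas(num_blocks: int, nodes: List[str], replication: int) -> Dict[int, List[str]]:
    """Round-robin placement: block i is stored on `replication` consecutive nodes."""
    if replication > len(nodes):
        raise ValueError("replication factor cannot exceed the number of data nodes")
    return {
        block: [nodes[(block + offset) % len(nodes)] for offset in range(replication)]
        for block in range(num_blocks)
    }


block_sizes = split_into_blocks(300 * MB, 128 * MB)
placement = place_replicas(len(block_sizes), ["dn1", "dn2", "dn3", "dn4", "dn5"], replication=3)
for block, replicas in placement.items():
    print(block, block_sizes[block] // MB, "MB", replicas)
```

Because every block lives on three different nodes, the loss of any single data node leaves at least two copies of each block available.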
YARN (Yet Another Resource Negotiator)
Resource management moves into YARN
YARN is:
• Resource management
• Next-generation MapReduce (MRv2)
• Splits the JobTracker into:
– Resource manager
– Scheduling / monitoring
What does YARN do?
• Provides a cluster-level resource manager for improved resource management & scaling
• Forms the new system for managing applications in a distributed manner
• Provides slots for jobs other than Map/Reduce
• Improves resource utilization
HBase
A non-relational (NoSQL) database that runs on top of HDFS
What is it?
• A Hadoop open-source (Java) NoSQL database
• Provides real-time read/write access to large data sets
• Distributed with automatic failover
Why use it?
• Provides a natural data storage mechanism for all kinds of data (especially unstructured)
• For random, real-time access to data in Hadoop
• When the project goal is to host very large tables, i.e. billions of rows and millions of columns
• Combines data sources that use a wide variety of different structures and schemas
• Great for storing semi-structured data like log data
Figure: logical view of customer contact information in HBase
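The figure itself is not recoverable from the transcript, so the following is only a sketch of how a logical HBase view of customer contact information is commonly pictured: row key, then column family, then column qualifier, then value. The row keys, family names and values are invented for illustration, `put` and `get` are hypothetical helpers rather than the HBase client API, and per-cell timestamps (versions) are omitted.

```python
from typing import Dict, Optional

# Logical layout: row key -> column family -> column qualifier -> value.
Table = Dict[str, Dict[str, Dict[str, str]]]


def put(table: Table, row_key: str, family: str, qualifier: str, value: str) -> None:
    """Write one cell, creating the row and column family entries on demand."""
    table.setdefault(row_key, {}).setdefault(family, {})[qualifier] = value


def get(table: Table, row_key: str, family: str, qualifier: str) -> Optional[str]:
    """Read one cell; a missing row, family or qualifier simply yields None."""
    return table.get(row_key, {}).get(family, {}).get(qualifier)


customers: Table = {}
put(customers, "cust-0001", "name", "first", "Ada")
put(customers, "cust-0001", "contact", "email", "ada@example.com")
put(customers, "cust-0002", "name", "first", "Grace")
# Rows in the same table do not need the same set of columns.
put(customers, "cust-0002", "contact", "phone", "555-0100")

print(get(customers, "cust-0001", "contact", "email"))
print(get(customers, "cust-0001", "contact", "phone"))
```

The sparse layout is what lets one table hold "millions of columns": a column only costs storage in the rows that actually set it.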
Hive
A data warehouse infrastructure built on Hadoop
What is it?
• A query engine wrapper built on MapReduce
• Treated as a data warehouse tool for the Hadoop ecosystem
• Primarily for users with SQL skills
• Provides HiveQL (similar to SQL)
• Stores data in HDFS
Why use it?
• Data analysis and reporting purposes
• Hides Hadoop complexity from end users
• Can be used within an ELT function, i.e. to convert Structured Query Language to MapReduce jobs that run on a Hadoop cluster
• Good for: batch processing tasks (logs, text mining, document indexing, customer BI)
• Not good for: online transaction processing, real-time queries
Cross-Industry Use Case – Apache Hadoop ELT
PROBLEM
• Traditional data warehousing resources are EXPENSIVE (e.g. transactional mainframe systems)
• Need to reduce costs associated with storage, CPU capacity and 3rd-party ETL tools
• Current systems cannot scale (i.e. process
• Lack efficient tools
• Tools typically only handle structured data (RDBMS), but Big Data insight is derived from all types of data (structured, unstructured, semi-structured)
SOLUTION
• Apache Hadoop tools to:
1. Perform ETL functions
2. Handle all of the specific data types
3. Shift away from traditional ETL to ELT (extract, load, and transform). This shift is mainly driven by big data, which follows the "store first, analyze later" model that is becoming the new standard.
BENEFITS
• Compared to traditional transactional systems, Hadoop provides fast, low-cost processing
• New value can be derived from the ability to handle structured and non-structured data
• Greater flexibility & choice: e.g. the Transform function can use MapReduce, Hive, Pig, R, shell scripts, Java, etc.
• Vast support model: open-source developer community
Traditional ETL process: sources (Web, CRM, ERP) -> Extract -> Transform -> Load -> DWH -> data mining, reporting, OLAP analysis
Hadoop ELT process: structured sources (Web, CRM, ERP) and unstructured sources (social media, sensor logs) -> Extract/Load (Flume, Sqoop) -> HDFS (Hadoop Distributed File System) -> Transform (Pig, MapReduce, Hive) -> data mining, reporting, analytics
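The "store first, analyze later" model can be sketched as follows: records of any shape are loaded untouched, and a transform is written later against whatever was stored. A Python list stands in for HDFS here, and the record fields (`store`, `amount`) and function names are hypothetical.

```python
import json
from typing import Dict, List


def extract_and_load(raw_lines: List[str], raw_store: List[str]) -> None:
    """EL: append records to storage untouched, whatever their shape."""
    raw_store.extend(raw_lines)


def transform(raw_store: List[str]) -> Dict[str, float]:
    """T: run later, on demand; parse what is parseable and total the sales per store."""
    totals: Dict[str, float] = {}
    for line in raw_store:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # unstructured line: it stays in the store, this transform skips it
        if isinstance(record, dict) and "store" in record and "amount" in record:
            totals[record["store"]] = totals.get(record["store"], 0.0) + float(record["amount"])
    return totals


raw_store: List[str] = []  # stands in for HDFS
extract_and_load(
    [
        '{"store": "north", "amount": 12.5}',
        "sensor-7 temp=21.4 ok",
        '{"store": "north", "amount": 7.5}',
        '{"store": "south", "amount": 3.0, "coupon": true}',
    ],
    raw_store,
)
sales_by_store = transform(raw_store)
print(sales_by_store)
```

Nothing is discarded at load time, so a different transform (for example one that parses the sensor line) can be added later without re-extracting from the sources.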
Commercial Distributions of Hadoop
Pure open source – Open core – Compatible
Shown: Cloudera, Hortonworks, MapR, Apache Hadoop (HDFS, Oozie)
The Evolving Hadoop Ecosystem
Components: Description
• Mahout, R: data mining / machine learning tools used against Hadoop data to detect patterns and trends
• Pig: scripting language for analyzing large data sets. Compiles to MapReduce jobs
• MapReduce, YARN: programming model for processing large data sets. YARN performs overall resource management
• Oozie: a workflow scheduler tool to manage Hadoop MapReduce jobs
• Sqoop, Hive, Drill: enable SQL for Hadoop data. Sqoop: data transfer between Hadoop and structured data stores. Hive: data warehouse for Hadoop. Drill: open-source, low-latency SQL query engine for Hadoop and NoSQL
• ZooKeeper: coordination of configuration data, naming and synchronization of Hadoop projects
• BigTop: packaging services for Hadoop projects to ease testing and deployment
• HBase: a non-relational, distributed database that runs on top of HDFS
• Thrift / Avro: schema-based data serialization system using RPC calls
• Solr, Elasticsearch: indexing and search tools for data stored in HDFS for Hadoop
• Kafka / Flume: collect, aggregate, and move streaming data from multiple sources into Hadoop
• Spark: app dev tool for Hadoop apps combining batch, streaming, and interactive analytics
• Ambari, Chukwa: monitoring & management of Hadoop clusters and nodes
NoSQL Basics
NoSQL Databases Overview
• Far better at handling semi-structured and unstructured data
• Database consistency is compromised for availability and ease of partitioning
• Supports object-oriented programming that is easy to use and flexible
• Efficient, scale-out architecture instead of expensive, monolithic architecture
NoSQL Types
Type: Database examples
• Column data model: HBase, Cassandra, Accumulo
• Document data model: MongoDB
• Key-value data model: OpenTSDB, Redis
• Graph data model: Neo4j, ArangoDB
Cassandra Basics
Cassandra – History
• 2006: BigTable (Google)
• 2007: Dynamo (Amazon)
• 2008: Open source
• Dec 2011: Cassandra DSE (DataStax)
Cassandra is Ideal For…
• Massive, linear scaling
• Extremely heavy writes
• High availability
Users shown: CERN, Barracuda, Cisco, BlueMountain, Comcast, Netflix, SoundCloud
Cassandra – Data Model
Benefits of the Cassandra data model:
• Easily add new columns without downtime
• Schema-free / schemaless database
• Compression permits columnar operations (MIN, MAX, SUM, etc.) rapidly
Figure: a column family (similar to an RDBMS table) shown next to the same column family in JSON format
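The figure is not recoverable from the transcript, so the sketch below only illustrates the two listed benefits with an in-memory stand-in: a column family modeled as row key to column map, a new column added to one row without touching the others, and a MIN/MAX/SUM taken over one column. The column family name `users`, its columns and the helper `column_values` are hypothetical. It follows the slide's schemaless description; tables defined through CQL declare their columns up front.

```python
import json
from typing import Any, Dict, List

# Column family "users": row key -> {column name: value}.
users: Dict[str, Dict[str, Any]] = {
    "alice": {"city": "Austin", "visits": 12},
    "bob": {"city": "Boston", "visits": 7},
}

# Add a new column to one row; no table-wide change, other rows are untouched.
users["bob"]["plan"] = "premium"


def column_values(column_family: Dict[str, Dict[str, Any]], column: str) -> List[Any]:
    """Collect one column across all rows, skipping rows that do not have it."""
    return [row[column] for row in column_family.values() if column in row]


visits = column_values(users, "visits")
summary = {"min": min(visits), "max": max(visits), "sum": sum(visits)}

print(summary)
print(json.dumps(users, indent=2, sort_keys=True))
```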
Cassandra Architecture
• All nodes are the same
• Data is partitioned among all nodes in the cluster
• Each node communicates with other nodes using the Gossip protocol
• A commit log is used on each node to capture write activity for data durability
Diagram: a client connected to the cluster. Storage: Cassandra File System. Processing: Cassandra Query Language (CQL).
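How data can be partitioned among identical nodes is often explained with a token ring: each node owns a position on a hash ring and a row goes to the first node clockwise from the hash of its partition key. The sketch below assumes one token per node and uses SHA-256 from the standard library as a stand-in hash (Cassandra's default partitioner is Murmur3-based); the class `TokenRing` and the node names are hypothetical.

```python
import bisect
import hashlib
from typing import List


class TokenRing:
    """Consistent-hashing ring with one token per node (a simplification)."""

    def __init__(self, nodes: List[str]) -> None:
        if not nodes:
            raise ValueError("the ring needs at least one node")
        self._ring = sorted((self._token(node), node) for node in nodes)
        self._tokens = [token for token, _ in self._ring]

    @staticmethod
    def _token(value: str) -> int:
        # Stand-in hash: first 8 bytes of SHA-256 interpreted as an integer.
        digest = hashlib.sha256(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def replicas_for(self, partition_key: str, count: int = 1) -> List[str]:
        """Walk clockwise from the key's token and return the next `count` nodes."""
        if count > len(self._ring):
            raise ValueError("cannot place more replicas than there are nodes")
        start = bisect.bisect_right(self._tokens, self._token(partition_key)) % len(self._ring)
        return [self._ring[(start + step) % len(self._ring)][1] for step in range(count)]


ring = TokenRing(["node-a", "node-b", "node-c", "node-d"])
owners = ring.replicas_for("member:42", count=3)
print(owners)
```

Any node can compute the same owners for a key, which is why a client can contact whichever node it likes and no master is needed to look up where data lives.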
Cassandra – Key Features
• No single point of failure
• Multi-datacenter and zone support
• Pure peer-to-peer cluster setup
• Allows for "tunable consistency"
• Cassandra Query Language (CQL)
• Cassandra File System (CFS)
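"Tunable consistency" is usually explained with a simple overlap rule: with N replicas, a read that waits for R replicas is guaranteed to see the latest write acknowledged by W replicas when R + W > N. The sketch below checks that rule for a few combinations of Cassandra's ONE, QUORUM and ALL consistency levels; the function names are hypothetical and N = 3 is an assumed replication factor.

```python
from typing import Dict, Tuple


def quorum(replication_factor: int) -> int:
    """Smallest majority of the replicas."""
    return replication_factor // 2 + 1


def reads_see_latest_write(replication_factor: int, write_acks: int, read_acks: int) -> bool:
    """True when any read set must overlap any write set: R + W > N."""
    return read_acks + write_acks > replication_factor


N = 3  # assumed replication factor
settings: Dict[str, Tuple[int, int]] = {
    # name: (replicas acknowledging a write, replicas answering a read)
    "write ONE / read ONE": (1, 1),
    "write QUORUM / read QUORUM": (quorum(N), quorum(N)),
    "write ALL / read ONE": (N, 1),
}
results = {name: reads_see_latest_write(N, w, r) for name, (w, r) in settings.items()}
for name, consistent in results.items():
    print(f"{name}: {'strong' if consistent else 'eventual'}")
```

The first setting trades consistency for latency and availability, which matches the "availability over consistency" choice described in the streaming use case.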
Cassandra at Netflix
Use cases:
• What titles have I watched?
• What titles are recommended for me?
• Where did I leave off last?
• What else is being watched?
• Measure member engagement
• Inform product & content decisions
Solution:
• Capture all 'view' events in scalable Cassandra clusters
Challenges:
• Ability to scale to a billion write events/day
• Provide a responsive title browsing experience
Source: techblog.netflix.com
MongoDB Basics
Timeline:
• 2007: founded (10gen)
• 2009: MongoDB 1.0, open-sourced (10gen)
• 2012: MongoDB 2.0 (10gen)
• 2013: MongoDB Inc.
• 2015: MongoDB 3.0 (MongoDB)
MongoDB is Ideal For…
• RDBMS replacement for web applications
• Semi-structured content management
• Real-time analytics and high-speed logging
• Caching and high scalability
Industries: Web 2.0, media, SaaS, gaming; health care, finance, telecom, government
Not so great for: high transactional databases
Users shown: Disney, Eventbrite, Intuit, IGN, Craigslist
MongoDB – Data Model
RDBMS vs. document-oriented
Benefits of a document-oriented DBMS:
• Database schema is optional
• Flexible in dealing with change and optional values
Present:
{"streetnum": "123", "streetname": "Main St.", "unit": "456", "City": "Mountain View", "State": "California", "zip": "65432"}
Future:
{"streetnum": "123", "streetname": "Main St.", "unit": "456", "City": "Mountain View", "State": "California", "County": "Santa Clara", "zip": "65432"}
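The present/future pair can be replayed in a few lines: both document shapes sit side by side in one collection, and readers treat the newer field as optional. A plain Python list stands in for a MongoDB collection here, and the helper `county_of` and its "unknown" default are hypothetical.

```python
import json
from typing import Any, Dict, List

# The "present" address document from the slide.
present: Dict[str, Any] = json.loads(
    '{"streetnum": "123", "streetname": "Main St.", "unit": "456", '
    '"City": "Mountain View", "State": "California", "zip": "65432"}'
)

# The "future" document adds a County field; no schema migration is involved.
future: Dict[str, Any] = dict(present, County="Santa Clara")

# One collection holding both document shapes.
addresses: List[Dict[str, Any]] = [present, future]


def county_of(document: Dict[str, Any]) -> str:
    """Read the optional field, tolerating older documents that lack it."""
    return document.get("County", "unknown")


print([county_of(doc) for doc in addresses])
```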
MongoDB – Sharding
Sharded Production Cluster Setup
• Shards store the data. To provide high availability and data consistency, in a production sharded cluster, each shard is a replica set.
• Replica set: a cluster of MongoDB servers that implements master-slave replication and automated failover.
• Query routers, or mongos instances, interface with client applications and direct operations to the appropriate shard or shards.
• Config servers store the cluster's metadata. This data contains a mapping of the cluster's data set to the shards.
Image source: mongodb.org
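The division of labor between the three roles can be sketched with in-memory stand-ins: a chunk map playing the config servers' metadata, a router that consults it, and dictionaries playing the shards. This assumes range-based sharding on a numeric key; the classes `ConfigMetadata` and `QueryRouter`, the `posting_id` field and the chunk boundaries are hypothetical and do not mirror the mongos implementation.

```python
import bisect
from typing import Callable, Dict, List, Optional, Tuple


class ConfigMetadata:
    """Chunk map: each chunk covers shard-key values below its exclusive upper bound."""

    def __init__(self, chunks: List[Tuple[float, str]]) -> None:
        # chunks must be sorted by upper bound and end with float("inf").
        self._upper_bounds = [bound for bound, _ in chunks]
        self._shards = [shard for _, shard in chunks]

    def shard_for(self, key: int) -> str:
        return self._shards[bisect.bisect_right(self._upper_bounds, key)]


class QueryRouter:
    """Routes operations to one shard when the shard key is known, else to all shards."""

    def __init__(self, metadata: ConfigMetadata, shard_names: List[str]) -> None:
        self._metadata = metadata
        self.shards: Dict[str, Dict[int, dict]] = {name: {} for name in shard_names}

    def insert(self, document: dict) -> str:
        shard = self._metadata.shard_for(document["posting_id"])
        self.shards[shard][document["posting_id"]] = document
        return shard

    def find_by_key(self, posting_id: int) -> Optional[dict]:
        # Targeted query: only the owning shard is consulted.
        return self.shards[self._metadata.shard_for(posting_id)].get(posting_id)

    def find_where(self, predicate: Callable[[dict], bool]) -> List[dict]:
        # Scatter-gather query: no shard key given, so every shard is asked.
        return [doc for shard in self.shards.values() for doc in shard.values() if predicate(doc)]


metadata = ConfigMetadata([(1000, "shard-a"), (2000, "shard-b"), (float("inf"), "shard-c")])
router = QueryRouter(metadata, ["shard-a", "shard-b", "shard-c"])
placed = [router.insert({"posting_id": pid, "title": f"post {pid}"}) for pid in (5, 1000, 2500)]
print(placed)
```

Queries that include the shard key touch a single shard, while queries without it fan out to every shard, which is one reason the choice of shard key matters for performance.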
MongoDB – Key Features
• Scalable, high-performance, open-source, document-oriented database
• Built for speed
• Rich document format allows for easy readability
• Full index support for high performance
• Replication and failover for high availability
• Auto-sharding for easy scalability
• Map/Reduce for aggregation
MongoDB at Craigslist
Use cases:
• Create new posts
• Browse all my posts
• Allow for post classification
• Search relevant posts
Solution:
• Migrate from MySQL to MongoDB
Challenges:
• Archive billions of records in multiple formats
• Query/report on archives at runtime
• Need continuous availability mandated for regulatory compliance
• Support 700 sites in 70 different countries
Craigslist environment:
• 5 billion documents
• Avg size: 2 KB
• 3 replica sets / 3 servers each
• 2 data centers
• Sharding key: posting ID
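A back-of-the-envelope sizing follows from the environment figures. The calculation below is only a sketch under stated assumptions: 1 KB is taken as 1,000 bytes, each replica set is treated as one shard holding an equal share of the documents, every server in a replica set keeps a full copy of its shard, and indexes and storage overhead are ignored.

```python
DOCUMENTS = 5_000_000_000   # 5 billion documents (from the slide)
AVG_DOC_BYTES = 2_000       # 2 KB average size, decimal kilobytes assumed
REPLICA_SETS = 3            # from the slide; each treated as one shard (assumption)
SERVERS_PER_SET = 3         # from the slide
TB = 10**12

# Size of one logical copy of the whole data set.
logical_tb = DOCUMENTS * AVG_DOC_BYTES / TB

# Even split across shards; every replica set member stores its shard in full.
per_server_tb = logical_tb / REPLICA_SETS

# Total bytes on disk across all nine servers.
total_stored_tb = per_server_tb * REPLICA_SETS * SERVERS_PER_SET

print(f"one logical copy: {logical_tb:.1f} TB")
print(f"per server:       {per_server_tb:.2f} TB")
print(f"all servers:      {total_stored_tb:.1f} TB")
```

Under these assumptions one logical copy is about 10 TB, each server holds a little over 3 TB, and the cluster stores about 30 TB in total.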
Closing
CA Big Data Control Center – Vision
End-to-end management of Big Data environments for the application economy
• Bring efficiency to root-cause analysis at all levels of the Big Data solution stack
• Simplify management by abstracting the complexities of underlying Big Data technologies
• Holistically meet the needs of DevOps by managing the lifecycle of applications, data and services
Personas (primary and secondary): LOB / biz analysts, app dev. / data sci., data eng. / data admin, IT ops / IT mgmt., Big Data / sys admin
Diagram labels: Big Data technologies, application, data, services, data sources, IT solutions, CA Big Data Control Center
Manage Big Data With a Unified View
• Job monitoring
• Heterogeneous system management
• Intelligent alert management
• Resource reporting
• Cluster / job / node management
Unified View – Details
Recommended Sessions
SESSION# | TITLE | DATE/TIME
MFT05S | Big Iron + Big Data = BIG DEAL! Unlock The Power of Your Mainframe Data | 11/18/2015 at 2:00pm, Location: Mainframe Theater
MFX15S | Predicting When Your Applications Will Go Off the Rails! Managing DB2 Application Performance using Analytics | 11/18/2015 at 4:30pm, Location: Breakers I
MFT15T | New Mainframe IT Analytics: Actionable Insight into Root Cause Analysis of Performance Issues | 11/18/2015 at 3:45pm, Location: Mainframe Area Tech Talk
MFX06S | CA's Strategy and Vision for Mainframe Data Management and Analytics | 11/18/2015 at 1:00pm, Location: Breakers I
MFT01S | The Big Data, Big Picture: Can You See It? | 11/19/2015 at 3:45pm, Location: Mainframe Theater
Must See Demos
• See the Future of Big Data Management: CA Big Data Control Center. App Economy Area, Station: APPECN001
• Unleash the Power of Mainframe Data: vStorm Connect Data Streaming for Big Data. Mainframe Area, Station: MNFSE001
• Maximize Your Mainframe Database Value: CA IDMS / CA Datacom. Mainframe Area, Station: MNFSE002
• Performance Analytics for DB2: DB2 Analytics. Mainframe Area, Station: MNFSE004
Follow On Conversations At…
• Smart Bar: DB2 Tools and Performance Analytics. Mainframe Area on Expo Floor
• Tech Talks: Five Steps to Powerful Database Experience. Mainframe Area on Expo Floor
Influencing Our Roadmap
Winning with CA
Take the opportunity to influence our product development. Help ensure that what we deliver is what you need and want.
CA Communities Ideation
• Submit your ideas on communities.ca.com
• Vote & comment on ideas that are important to you
• CA Product Management reviews ideas and updates status as they move through the lifecycle
• "Currently Planned" idea status indicates inclusion in Agile Backlog or Product Roadmap
Agile Development
Customer Validation
• Register to participate in:
– Live demos / end-of-sprint reviews
– Private, members-only online community
– Pre-release onsite testing and support (beta)
– Upgrade support from SWAT team
• How to register: https://validate.ca.com
Agile Development Transformation
Driving significant business value for our customers!
Speed, Quality, Performance
• UK customer Standard Life benefits from CA agile process
• 251 unique customers participated in 56 product releases during a year
• 99.5% reduction in cost; 98% reduction in month-end cycle time
• 45 products released against zero-defect policy; 20% decrease in support issues
For Informational Purposes Only
Terms of this Presentation
© 2015 CA. All rights reserved. All trademarks referenced herein belong to their respective companies. The presentation provided at CA World 2015 is intended for information purposes only and does not form any type of warranty. Some of the specific slides with customer references relate to customer's specific use and experience of CA products and solutions so actual results may vary.
Certain information in this presentation may outline CA's general product direction. This presentation shall not serve to (i) affect the rights and/or obligations of CA or its licensees under any existing or future license agreement or services agreement relating to any CA software product; or (ii) amend any product documentation or specifications for any CA software product. This presentation is based on current information and resource allocations as of November 18, 2015, and is subject to change or withdrawal by CA at any time without notice. The development, release and timing of any features or functionality described in this presentation remain at CA's sole discretion.
Notwithstanding anything in this presentation to the contrary, upon the general availability of any future CA product release referenced in this presentation, CA may make such release available to new licensees in the form of a regularly scheduled major product release. Such release may be made available to licensees of the product who are active subscribers to CA maintenance and support, on a when and if-available basis. The information in this presentation is not deemed to be incorporated into any contract.
Q&A