
Post on 24-Jul-2020


Big-Data Analytics Architecture for Businesses: Open-source Perspective

Mert Onuralp Gökalp, Kerem Kayabay, Mohamed Zaki

Cambridge Service Alliance (CSA), Institute for Manufacturing, University of Cambridge

January 2018

Why study open-source tools for Big Data?

• Open-source tools have become the standard Big Data processing platforms*
• The gap: Study the open-source tools considering both managerial and technical perspectives
• Some questions
  • Why do people prefer commercial solutions rather than open-source?
  • Do we need commercial Big Data solutions?

The Big Tools Era

• Many tools continue to emerge at a fast pace to deal with big data
  • Characteristics: Volume, speed, diversity
  • Problems: Processing, storage, manipulation, aggregation, visualization
• Some tools only aim to analyse data in a certain domain
  • Internet of Things, Edge Computing
• Just by reviewing open-source tools, we have come across 6500 such tools and filtered down to 241

Open-source big data tools

• Typically supported by companies that provide services over the Internet
  • Google, Yahoo, Twitter, LinkedIn
• To provide better services to their users and third-party customers
• The tools are made available to the IT industry as open-source tools
• They have become the standard big data processing platforms

Some example tools

Big Data Processing    Big Data Characteristic    Tools and Technologies
Batch processing       Volume                     Hadoop, Spark, Flink
Stream processing      Velocity                   Storm, Samza, S4, Spark Streaming, Flink Streaming

Big Data Storage       Big Data Characteristic    Tools and Technologies
NoSQL                  Variety                    MongoDB, Cassandra, HBase, Redis
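
The batch and stream rows differ mainly in when the data becomes available to the computation. The following is a minimal sketch of that difference in plain Python, not code for any of the listed tools; the function names and the sample `events` list are assumptions made for illustration only.

```python
from collections import Counter
from typing import Iterable, Iterator


def batch_word_count(records: Iterable[str]) -> Counter:
    """Batch style: the whole data set is read, then one final result is returned."""
    counts: Counter = Counter()
    for line in records:
        counts.update(line.split())
    return counts


def stream_word_count(records: Iterable[str]) -> Iterator[Counter]:
    """Stream style: running state is updated per record and a snapshot is emitted each time."""
    counts: Counter = Counter()
    for line in records:
        counts.update(line.split())
        yield counts.copy()


# Hypothetical sample input; each string stands for one incoming record.
events = ["spark flink", "storm spark", "flink spark"]

batch_result = batch_word_count(events)
stream_snapshots = list(stream_word_count(iter(events)))
```

The batch function answers once, after all records are seen. The stream function produces an intermediate answer after every record, and its last snapshot equals the batch result.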

Choosing the right toolset

• Choices depend on the characteristics of data and domain of operation
• Businesses incur costs trying to adopt new technologies: technical debt
  • Training the workforce
  • Changing existing source code to run on newer versions
  • Changing the underlying toolset

Why do firms outsource?

• Tracking the developments in this domain is hard
• Most of the tools are unknown to the business world
• Not to lag behind the hype
• BUT commercial solution providers:
  • Rely on a subset of available open-source tools
  • Do not have the domain-specific expertise
  • Do not solve technical and soft challenges

Aim of this study

• Systematically review the open-source tools in the big data domain
• Establish a method for tracking the developments for the open-source tools
• Build a reference open-source big data analytics architecture
• Analyse firms to give directions to businesses using the proposed architecture

Tool selection process

Some figures

Open-source Architecture

Distribution of Open-Source Tools

How to choose a big data tool?

• We need to come up with criteria
• We can look at (113) real-world use cases, solution briefs, white papers & blog posts from a wide range of industries
  • Telecommunication, healthcare, banking & finance, manufacturing, transportation, energy
• Secondary data sets support the proposed big-data reference architecture

Secondary use-case company distribution

How to choose a big data tool?

• Timing requirement: Batch vs stream processing
• Data size: In-memory vs on-disk processing
• Platform independence: Interoperability of a big data tool
• Data storage model: Graph-based, key-value-based, document-based, time-series-based
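
The criteria above can be read as filters over a catalogue of candidate tools. The sketch below shows one possible encoding of three of them (timing, data size, storage model); platform independence is left out because it is harder to reduce to a single field. The `Tool` class, the `shortlist` function and the catalogue entries are assumptions for illustration, and the entries are deliberately simplified rather than a verified classification.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Tool:
    name: str
    timing: frozenset                      # subset of {"batch", "stream"}; empty for storage tools
    in_memory: bool                        # True if processing is primarily in-memory
    storage_model: Optional[str] = None    # e.g. "document", "key-value"; None for processing tools


# Hypothetical, simplified catalogue; a real one would need verified attributes.
CATALOGUE = [
    Tool("Hadoop MapReduce", frozenset({"batch"}), in_memory=False),
    Tool("Spark", frozenset({"batch", "stream"}), in_memory=True),
    Tool("Flink", frozenset({"batch", "stream"}), in_memory=True),
    Tool("Storm", frozenset({"stream"}), in_memory=True),
    Tool("MongoDB", frozenset(), in_memory=False, storage_model="document"),
    Tool("Redis", frozenset(), in_memory=True, storage_model="key-value"),
]


def shortlist(catalogue: List[Tool],
              timing: Optional[str] = None,
              in_memory: Optional[bool] = None,
              storage_model: Optional[str] = None) -> List[str]:
    """Return names of tools matching every criterion that is not None."""
    names = []
    for tool in catalogue:
        if timing is not None and timing not in tool.timing:
            continue
        if in_memory is not None and tool.in_memory != in_memory:
            continue
        if storage_model is not None and tool.storage_model != storage_model:
            continue
        names.append(tool.name)
    return names


stream_tools = shortlist(CATALOGUE, timing="stream")
on_disk_batch = shortlist(CATALOGUE, timing="batch", in_memory=False)
key_value_stores = shortlist(CATALOGUE, storage_model="key-value")
```

A filter like this only narrows the field. The remaining choice still depends on the maturity of each tool and on the domain-specific and firm-specific factors discussed in the deck.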

Problems of architecture development in big data

• Choosing the best tool
  • Abundance of tools
  • No single best tool
  • Maturity of a tool
• Domain-specific challenges
  • The gap between domain-specific knowledge and data science
• Firm-specific soft challenges
  • Managerial skills deep-rooted in an organization
  • Lack of data-driven organizational culture
  • Customers may not be able to perceive the value of big data

To sum up

• Newer tools never cease to emerge in this domain
• We can foresee where the industry will focus research efforts
• Organizations should try to build their own big data architecture
  • Rely on open-source tools instead of imposed commercial solutions
  • It is rewarding
  • Capture domain-specific knowledge
  • The process would build a data-driven culture and develop the right managerial skills
  • Better decision-making

Thank you!

• Q&A

Forthcoming Webinars

Date (14:30 BST)    Topic                                                                                  Invited speaker
2018
Jan 15th            Big Data Analytics Architecture for Business                                           Mert/Kareem/Mohamed
Feb 12th            Digital Business Transformation and Strategy: What do we know so far?                  Mariam Helmy Ismail Abdelaal
Mar 12th            Does buyers’ dependence translate into financial performance? An empirical analysis    Ornella Benedettini
                    of manufacturer-service provider relationships