Apache Spark in Cloud and Hybrid: Why Security and Governance Become More Important: keynote by Arun...

Post on 12-Feb-2017



© Hortonworks Inc. 2011–2017. All Rights Reserved

Arun C. Murthy, Hortonworks Co-Founder, @acmurthy


Data Center

Cloud

Modern Data Applications


Connected Data Powers Modern Data Applications

Exceptions that Drive Predictive Maintenance

360 View of Operations, Equipment Failure Analytics, Recalls, etc.

Deep Historical Analysis

DATA CENTER

Stream Analytics

Mfg Line Data

Connected City, Infrastructure, and Devices

Machine Learning

CLOUD

Sensors, Infotainment, Control Systems

Edge Analytics

Time-based Route Optimization


Prescient Traveler Is a Modern Data Application

“We know that when we define a high-threat area in a given area of the world, that it is underpinned by very specific data sources. It’s data-driven, and we can point to those sources, if ever asked, and say, ‘Here’s why.’”

Mike Bishop, Chief Systems Architect, Prescient


Prescient Harnesses Machine Learning for Traveler Safety Warnings

SITUATION

Could only produce one assessment every 3-4 days

Performs risk management

Uses humans to identify false positives

Needed efficient way to store raw data for analytics

49,500 Data Sources: ingested by HDF into HDP

700% Productivity Improvement: for geospatial analysts

5 Petabytes of Data: stored in HDP connected EMC

Hybrid Architecture: HDF connects data center to cloud

ETL OFFLOAD: Sensor Data Ingest

DATA DISCOVERY: Threat Assessments

SINGLE VIEW: Global Threat Map

PREDICTIVE ANALYTICS: Threat-Proximity Mobile Alerts

ACTIVE ARCHIVE: Streaming Threat Archive

DATA ENRICHMENT: Provenance Metadata

“We know that when we define a high-threat area in a given area of the world, that it is underpinned by very specific data sources. It’s data-driven, and we can point to those sources, if ever asked, and say, ‘Here’s why.’”

Mike Bishop, Chief Systems Architect


Symantec: Data Science Speeds Time to Cyber Security Protection

From 4 hours to 2 seconds: threat detection latency

5000x: improved time-to-protection

10s of PBs: of historical data for machine learning

Cloud Flexibility: to meet peak demand for analysis

SITUATION

3-4 hr processing latencies to analyze digital threats

Network has +57M attack sensors in 157 countries

Data streams from 75M users on 120M devices

Long open windows of exposure to cyber attacks

DATA DISCOVERY: Security Logs

DATA DISCOVERY: Threat Detection

DATA DISCOVERY: Attacker Identification

PREDICTIVE ANALYTICS: Unified Security

PREDICTIVE ANALYTICS: Protective Safeguard

ETL OFFLOAD: Device Data Ingest

ACTIVE ARCHIVE: Threat Archive

ETL OFFLOAD: Greenplum Offload

ETL OFFLOAD: Metadata Capture

PREDICTIVE ANALYTICS: Threat Predictions

“On any given day, we’ll be processing 40 billion messages into our system… It used to be that queues would back up. We would see times to analysis on the order of 4 hours. On average, we’ve gotten that down to two or two and a half seconds.”

David Lin, Senior Director of Engineering, Symantec Cloud Platform


Progressive: Reward Safe Drivers and Improve Traffic Safety

ETL OFFLOAD: Sensor Data Ingest

DATA DISCOVERY: Web Log Analysis

ACTIVE ARCHIVE: Individual Driving Histories

100% in 2-3 Days: driving detail captured from Snapshot, in HDF

+12 Billion: miles driven stored

Web App-Enabled: customers see driving detail and improve safety

$2.6 Billion: in 2014 premiums

SITUATION

Existing Data Systems Did Not Scale Efficiently

Usage-Based “Snapshot” Insurance Program

In-Car Sensor Captures IoT Data

~7 Days to Transform Only 25% of UBI Data

DATA DISCOVERY: Online Ad Placement

DATA DISCOVERY: Claim Notes Mining

PREDICTIVE ANALYTICS: Usage-Based Insurance (UBI)

“We’re looking at data sets that we never dreamed we could look at… It’s joining dots that in the past we didn’t even know we could join.”

Pawan Divakarla, Data & Analytics Business Leader


HDC Shared Services: Secure & Govern All Modern Data Applications

Centralized Management

• Data in Motion or at Rest

• Across Data Center and Cloud

• Manage end-to-end data lifecycle

Cloud

Data Center

Security

Governance

Data Lifecycle

Shared Services


Demo


Thank You