Admiral Group

Page 1: Admiral Group

Speakers

Simon Elliston Ball – Solutions Architect, Hortonworks

Adam Morton – Enterprise Data Architect, Admiral Group plc

• Over 10 years' experience in Data Warehousing, Business Intelligence and Analytics

• Working at Admiral for the past 2 years delivering a greenfield Enterprise Data Warehouse as part of an overall Data Architecture modernisation programme

Page 2: Admiral Group

The Admiral Group

Admiral Group has grown from a small start-up to one of the largest car insurance providers in the UK, with a presence in seven countries.

Our strategy is simple: To continue to progress in the UK Car Insurance market whilst taking what we do well to new markets and products: keep doing what we’re doing and do it better year after year.

Page 3: Admiral Group

Admiral – International Operations

Admiral employs more than 7,000 people at its offices in the UK, Spain, Italy, France, USA, Canada and India.

"People who like what they do, do it better"

Page 4: Admiral Group

R&D at Admiral

• Strong history of using data to drive innovation which needs to be continued

• New function aimed at testing and learning through technology

• Time-boxed iterative efforts of no more than 4-6 weeks

• Fail-fast approach; success or failure can end the PoC early

• Understand ‘Big Data’ and trial Hadoop ecosystem projects

Page 5: Admiral Group

Why Telematics?

• Scalability – A product with large potential and potentially huge volumes

• Timeliness – Data and scoring were processed in batch; how quickly can this be done?

• Granularity – Suppliers provide aggregated data; could map matching be improved?

• Event Notification – Can we respond quickly to near-real-time (NRT) events in the data?

• Data Enrichment – Opportunity to uncover further insights by integrating with interesting data sources

Page 6: Admiral Group

Objectives of the Telematics PoC

• Scalability – Prove that data storage and high-performance analytics can be accomplished cost-effectively on large data sets

• Timeliness – Reduce scoring time

• Data Enrichment

• NRT data processing – acting on events such as proximity to an airport

• Improve stability and flexibility

• Test the viability of a cloud solution

• Data Visualisation
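
The NRT objective above mentions acting on events such as proximity to an airport. A minimal sketch of such a check follows, assuming each telematics point arrives as a latitude/longitude pair. The airport entry, its approximate coordinates, the 5 km radius and the function names are illustrative assumptions, not details from the PoC.

```python
import math

EARTH_RADIUS_KM = 6371.0

# Hypothetical point of interest: (code, latitude, longitude), approximate values.
AIRPORT = ("CWL", 51.3967, -3.3433)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def near_airport(lat, lon, radius_km=5.0):
    """True if the point lies within radius_km of the (assumed) airport location."""
    _, a_lat, a_lon = AIRPORT
    return haversine_km(lat, lon, a_lat, a_lon) <= radius_km
```

In a streaming job, a predicate like this would be applied to each incoming point, with an event emitted the first time it becomes true for a journey.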

Page 7: Admiral Group

Technical Challenges – Networking and Security

• Privacy Sensitive

• Third Party Sources

• Real-time data

Page 8: Admiral Group

© Hortonworks Inc. 2011 – 2016. All Rights Reserved

There’s a VPN, it will be fine!

[Diagram labels: Admiral vNET, Third Party vNET, Telematics Provider, DC, External Users, Internal Users]

Page 9: Admiral Group

Kafka SSL

[Diagram labels: Admiral vNET, Telematics Provider, DC, External Users, Internal Users, K (Kafka), SSL]

Page 10: Admiral Group

Ingest with NiFi

[Diagram labels: Admiral vNET, Telematics Provider, DC, External Users, Internal Users, K (Kafka), HDF, Other Providers (x3)]

Page 11: Admiral Group

Real-time Scoring

• Clean-up done in NiFi

– Basic data correctness

– Format changes

• Fed to Kafka

• Spark Streaming

– Near-real-time requirement

– Mixing Scala RDD and DataFrame code

– Integrating with the map matching library

• Output fed into Kafka

– Kafka-to-WebSockets bridge for real-time visualization
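
The slide describes the clean-up step only as "basic data correctness" and "format changes". The sketch below illustrates what such checks might look like for one JSON-encoded GPS record. It is plain Python rather than a NiFi flow, and the field names (`device_id`, `ts`, `lat`, `lon`) and the epoch-seconds timestamp are assumptions made for illustration.

```python
import json
from datetime import datetime, timezone


def clean_record(raw_line):
    """Return a normalised dict, or None if the record fails basic checks."""
    try:
        rec = json.loads(raw_line)
        device_id = str(rec["device_id"])
        lat = float(rec["lat"])
        lon = float(rec["lon"])
        # Format change: epoch seconds -> ISO 8601 string in UTC.
        timestamp = datetime.fromtimestamp(float(rec["ts"]), tz=timezone.utc).isoformat()
    except (ValueError, KeyError, TypeError, OverflowError, OSError):
        # Malformed JSON, missing fields, or values that cannot be converted.
        return None
    # Basic correctness: coordinates must be within valid ranges (also rejects NaN).
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        return None
    return {"device_id": device_id, "timestamp": timestamp, "lat": lat, "lon": lon}
```

Records that return `None` would be dropped or routed to an error queue before the remainder is published to Kafka.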

Page 12: Admiral Group

Batch Scoring

More Spark!

Zeppelin for ease of use, interaction

Productionized into batch Spark Jobs

Page 13: Admiral Group

SAS on Hive

• Spark as ETL engine

• Hive for large-scale processing

• SAS connector using Hive

• ORC as a file format

– Significantly smaller than JSON

– So much faster to process

Page 14: Admiral Group

Technical Challenges – Map Matching

• GPS data is messy

• Open Data sources based on roads

• Nearest road is fast, but not very good

• Hidden Markov Models. Know where you’re going, and where you’ve been.

• Open source to the rescue…
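
The contrast between nearest-road matching and a Hidden Markov Model can be illustrated with a toy example. This is a minimal sketch, not Barefoot's implementation. It assumes planar coordinates in metres, roads modelled as single straight segments, a Gaussian emission score on point-to-road distance, and a flat penalty for switching roads in place of a proper route-distance transition model. `ROADS`, `POINTS`, `sigma` and `switch_penalty` are all hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def nearest_road(point, roads):
    """Baseline: pick the closest road for each point independently."""
    return min(roads, key=lambda name: point_segment_distance(point, *roads[name]))


def viterbi_match(points, roads, sigma=10.0, switch_penalty=4.0):
    """Most likely road sequence under a simplified HMM (log-space Viterbi)."""
    names = list(roads)

    def emission(p, name):
        d = point_segment_distance(p, *roads[name])
        return -0.5 * (d / sigma) ** 2  # log of an unnormalised Gaussian

    def transition(prev, cur):
        return 0.0 if prev == cur else -switch_penalty

    score = {n: emission(points[0], n) for n in names}
    back = []
    for p in points[1:]:
        new_score, pointers = {}, {}
        for n in names:
            best_prev = max(names, key=lambda m: score[m] + transition(m, n))
            new_score[n] = score[best_prev] + transition(best_prev, n) + emission(p, n)
            pointers[n] = best_prev
        score = new_score
        back.append(pointers)
    # Backtrack from the best final state.
    state = max(names, key=lambda n: score[n])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Two parallel roads 20 m apart; the third GPS fix drifts towards road B.
ROADS = {"A": ((0, 0), (100, 0)), "B": ((0, 20), (100, 20))}
POINTS = [(10, 2), (30, 4), (50, 12), (70, 3), (90, 1)]
```

With this data, `[nearest_road(p, ROADS) for p in POINTS]` jumps to road B for the drifting third point, while `viterbi_match(POINTS, ROADS)` keeps the whole trace on road A because a single noisy fix does not outweigh the cost of switching roads twice.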

Page 15: Admiral Group

Barefoot – Map Matching

• https://github.com/bmwcarit/barefoot

• Docker based service

• PostGIS map server loaded from OSM data

• Serializable map, distributed in Spark

Page 16: Admiral Group

Next Steps

Completing knowledge transfer workshops with Hortonworks

How to move from a PoC to production-ready?

Establishing an in-house R&D function

Deciding on the tools and frameworks to use within a PoC environment in the future