Lessons learned from the proverbial battlefield - Hortonworks roadshow
Lessons learned from the proverbial battlefield
Suhail Shergill, Scotiabank
Who Am I
Suhail Shergill (@suhailshergill)
• Computer Science background (Programming Languages and Machine Learning)
• create and run skunkworks teams focused on data science and technology
• technical advisor to startups
• organizer of a few technical meetups
• leading the Data Science & Model Innovation group in GRM at Scotiabank.
Objective
What’s in scope
• What is “Big Data”
• What are the challenges of “Big Data”
• How can some of these challenges be addressed – lessons learned
• What are we doing in Scotia
“Big Data” and Hadoop
Hadoop
Challenges of Big Data
Feedback loops
• Very “big”
• Getting “bigger” at a faster rate
• Long-term solutions need to have exponential/logarithmic characteristics
From data to insights
No free lunches / silver bullets
The challenges of “Big Data”
We have a very “big” problem. How do we solve it?
How to solve it
Lessons learned
Data quality is paramount
Build tools
Teach enough to question
Rotations and harmonics
Open doors
Faster and shorter feedback loops
Summary
No silver bullet
Data quality is paramount
Build tools
Teach enough to question
Rotations and harmonics
Open doors
Faster & shorter feedback loops
What we’re doing in Scotia
Scotiabank’s Enterprise Data Lake Initiative
Scotiabank’s 2015 business strategy focuses on these priorities:
• Improving the customer experience;
• Enhancing leadership capabilities throughout the organization; and
• Improving operational efficiency and effectiveness.
• A key component of the digital strategy supporting these priorities is to leverage big data analytics in order to better understand and address customer needs and preferences.
• To this end, Scotiabank is making material investments in the Hadoop technology used to support big data analytics across a wide spectrum of companies and industries.
Scotiabank’s Enterprise Data Lake – Next Steps
1. EDL 1.0:
• Initial cluster 1PB (Jan-2016) rapidly growing to accommodate more tenants
• A very good start with consistent and commoditized stack
• A review of areas we can further optimize and identify gaps
• A review of areas where we require higher level flexibility & portability
• A review of what made sense to be directed where to achieve scale, yet preserve consistency
• A review of where are the limiting factors: agile and repeatable periodically every 2-3 months
2. EDL 2.0:
• Need to drive velocity: refactor engineered infrastructure environment
• Need flexibility on workload: decouple compute & data
• Need workload portability: next gen hybrid architecture & cloud
Scotiabank’s Enterprise Data Lake – Highlights
1. What we got out of EDL 1.0:
• Regulatory & Risk Reporting (RDARR)
• Consolidation of divisional data repositories
• Capability for Anti-Money Laundering
• Capability for Asset Liability Management
• Consolidation of International Banking Data Warehouses
• M&A and Credit Card data acquisition and analysis
Thank you