HBASE AT BLOOMBERG //
THE EVOLUTION OF BLOOMBERG DATA SYSTEMS
MEDIUM DATA NEEDS FOR THE FINANCIAL INDUSTRY
MAY 07, 2015
BLOOMBERG
Leading Data and Analytics provider to the financial industry
DATA IS OUR BUSINESS
DATA FOR GOOD EXCHANGE: GOVERNMENT INNOVATION, PUBLIC HEALTH, ENVIRONMENT, EDUCATION
• September 28: Full Workshop at Bloomberg
• September 30: Showcase at Strata Hadoop
• Call for papers at: bloomberglabs.com/data-science
DATA MANAGEMENT TODAY
• We have a “medium data” problem…
• Speed and availability are paramount
• Hundreds of thousands of users with expensive requests
We’ve built many systems to address these needs
DATA MANAGEMENT CHALLENGES
• Single security analytics on Big Iron
• Replication of Systems and Data
• Complexity kills
Top 500 Supercomputer list, 2013
>96% Linux. 100% of top 40.
DATA MANAGEMENT TOMORROW
• Simplicity and performance
• Benefit from external developments
• Retain our independence
• Details matter
THE PREMISE
• We can apply big data techniques to our medium data problem by addressing gaps in existing open systems
• HBase is a good bet
  – Part of a broader whole
  – The biggest community wins
CHALLENGES
Our requirements from HBase are:
• Read performance – fast with low variability
• High availability
• Operational simplicity
• Efficient use of good hardware
• Expressive power
Bloomberg has been investing in all these aspects of HBase
WE’VE MADE THAT BET
WE’RE NOT THE ONLY ONES
Google Cloud Bigtable
AIMING HIGHER
We can make things better by working together
Let’s be the gold standard
CALL TO ACTION
FURTHER BOLSTER RELIABILITY
Great strides such as HBASE-10070 but more to do
• Improved reconciliation of state between Master, META and ZK
• More determinism in Admin/Master operations
BENEFIT FROM MODERN HARDWARE
• 32 cores, 256 GB RAM, SSD: untapped potential
• CPU load peaks at 20%, yet throughput is inadequate
• Multi-RS (multiple RegionServers per host) is administratively painful
• Much better story with memory
IMPROVE MULTI-TENANCY
• Mixed workloads are challenging
  – interactive vs batch
  – read vs write
  – different read access patterns
• Many solutions in progress
• Administrative simplicity is key
SPARK INTEGRATION
• Analytical frameworks need a distributed database
• Columnar file format != column database
• Integrate with HBase to move towards the universal database
ANALYTICS: EFFICIENCY
• Choice of row and columnar storage engines
• Expose primitives for efficiency:
  – Column pruning
  – Predicate pushdowns
  – Data locality
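A minimal sketch of two of these primitives, column pruning and predicate pushdown, on a toy in-memory table. This is not HBase or Spark code: the `scan` function, the `TABLE` contents, and the `cells` helper are hypothetical names invented for illustration, and data locality is not modeled. The point is only that filtering and projecting at the storage side returns the same answer while moving far fewer cells to the caller.

```python
from typing import Callable, Dict, Iterator, List, Optional

Row = Dict[str, object]

# Toy "table": tickers and values are made-up illustration data.
TABLE: List[Row] = [
    {"ticker": "AAA", "price": 10.0, "volume": 100, "note": "x" * 50},
    {"ticker": "BBB", "price": 25.0, "volume": 300, "note": "y" * 50},
    {"ticker": "CCC", "price": 40.0, "volume": 200, "note": "z" * 50},
]


def scan(
    table: List[Row],
    columns: Optional[List[str]] = None,
    predicate: Optional[Callable[[Row], bool]] = None,
) -> Iterator[Row]:
    """Hypothetical storage-side scan.

    `predicate` is evaluated before a row leaves storage (predicate pushdown);
    `columns` limits which cells are returned (column pruning).
    """
    for row in table:
        if predicate is not None and not predicate(row):
            continue  # filtered row never reaches the caller
        if columns is None:
            yield dict(row)
        else:
            yield {name: row[name] for name in columns}


def cells(rows: List[Row]) -> int:
    """Count cells transferred, as a crude stand-in for I/O cost."""
    return sum(len(row) for row in rows)


# Without the primitives: fetch everything, then filter and project client-side.
naive_transfer = list(scan(TABLE))
naive_result = [
    {"ticker": r["ticker"], "price": r["price"]}
    for r in naive_transfer
    if r["price"] > 20
]

# With the primitives: the scan itself filters rows and prunes columns.
pushed_result = list(
    scan(TABLE, columns=["ticker", "price"], predicate=lambda r: r["price"] > 20)
)

print(naive_result == pushed_result)            # same answer either way
print(cells(naive_transfer), cells(pushed_result))  # cells moved in each case
```

In this toy, both approaches return the same rows, but the pushed-down scan hands back only the cells the query actually needs.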
THE FUTURE IS BRIGHT
• The state of the “Hadoop Database” union is strong
  – Increasing adoption
  – Strong foundation
  – Great community
• Prominent role in the data & analytics platform of the future
• Let’s go create the future
THANK YOU