
  • HBASE AT BLOOMBERG //

    THE EVOLUTION OF BLOOMBERG DATA SYSTEMS

    MEDIUM DATA NEEDS FOR THE FINANCIAL INDUSTRY

    MAY // 07 // 2015

  • BLOOMBERG

    Leading Data and Analytics provider to the financial industry

  • DATA IS OUR BUSINESS

  • DATA FOR GOOD EXCHANGE: GOVERNMENT INNOVATION, PUBLIC HEALTH, ENVIRONMENT, EDUCATION

    September 28: Full Workshop at Bloomberg

    September 30: Showcase at Strata Hadoop

    Call for papers at: bloomberglabs.com/data-science

  • DATA MANAGEMENT TODAY

    • We have a “medium data” problem…

    • Speed and availability are paramount

    • Hundreds of thousands of users with expensive requests

    We’ve built many systems to address

  • DATA MANAGEMENT CHALLENGES

    • Single security analytics on Big Iron

    • Replication of Systems and Data

    • Complexity kills

    Top 500 Supercomputer list, 2013

    >96% Linux. 100% of top 40.

  • DATA MANAGEMENT TOMORROW

    • Simplicity and performance

    • Benefit from external developments

    • Retain our independence

    • Details matter

  • THE PREMISE

    • Can apply big data techniques to our medium data problem, by addressing gaps in existing open systems

    • HBase is a good bet

    • Part of a broader whole

    • The biggest community wins

  • CHALLENGES

    Our requirements from HBase are:

    • Read performance – fast with low variability

    • High availability

    • Operational simplicity

    • Efficient use of good hardware

    • Expressive power

    Bloomberg has been investing in all these aspects of HBase

  • WE’VE MADE THAT BET

  • WE’RE NOT THE ONLY ONES

    Google Cloud Bigtable

  • AIMING HIGHER

    We can make things better by working together

    Let’s be the gold standard

  • CALL TO ACTION

  • FURTHER BOLSTER RELIABILITY

    Great strides, such as HBASE-10070, but more to do

    • Improved reconciliation of state between Master, META and ZK

    • More determinism in Admin/Master operations

  • BENEFIT FROM MODERN HARDWARE

    • 32 cores, 256 GB RAM, SSD: untapped potential

    • CPU load max 20%, inadequate throughput

    • Multi-RS administratively painful

    • Much better story with memory

  • IMPROVE MULTI-TENANCY

    • Mixed workloads challenging

      • interactive vs batch

      • read vs write

      • different read access patterns

    • Many solutions in progress

    • Administrative simplicity is key

  • SPARK INTEGRATION

    • Analytical frameworks need a distributed database

    • Columnar file format != column database

    • Integrate with HBase to move towards the universal database

  • ANALYTICS: EFFICIENCY

    • Choice of row and columnar storage engines

    • Expose primitives for efficiency:

      • Column pruning

      • Predicate pushdowns

      • Data locality
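    To make two of the primitives named on this slide concrete, here is a minimal Python sketch of column pruning and predicate pushdown. It is a toy in-memory model, not HBase or Spark code: `ToyRegion`, `scan`, `cells_shipped`, and the sample rows are all hypothetical names and data invented for illustration, and data locality is not modeled.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional

Row = Dict[str, Any]


class ToyRegion:
    """Hypothetical stand-in for one server-side shard of a table (not an HBase API)."""

    def __init__(self, rows: List[Row]) -> None:
        self.rows = rows
        self.cells_shipped = 0  # number of values sent back to the caller

    def scan(
        self,
        columns: Optional[List[str]] = None,
        predicate: Optional[Callable[[Row], bool]] = None,
    ) -> Iterator[Row]:
        for row in self.rows:
            # Predicate pushdown: drop non-matching rows before shipping them.
            if predicate is not None and not predicate(row):
                continue
            # Column pruning: ship only the columns the caller asked for.
            out = row if columns is None else {c: row[c] for c in columns if c in row}
            self.cells_shipped += len(out)
            yield out


# Made-up sample data, for illustration only.
rows = [
    {"ticker": "AAA", "price": 10.0, "volume": 100, "note": "x"},
    {"ticker": "BBB", "price": 25.0, "volume": 200, "note": "y"},
    {"ticker": "CCC", "price": 40.0, "volume": 300, "note": "z"},
]

# Naive client: fetch everything, then filter and project locally.
naive = ToyRegion(rows)
result_naive = [
    {"ticker": r["ticker"], "price": r["price"]}
    for r in naive.scan()
    if r["price"] > 20
]

# Pushdown client: hand the filter and the column list to the storage side.
pushed = ToyRegion(rows)
result_pushed = list(
    pushed.scan(columns=["ticker", "price"], predicate=lambda r: r["price"] > 20)
)

print(result_naive == result_pushed)               # same answer either way
print(naive.cells_shipped, pushed.cells_shipped)   # values shipped by each approach
```

    Both clients get the same rows back, but the second ships far fewer values because the filtering and projection happen where the data is stored. That reduction is the kind of efficiency these primitives are meant to expose to an analytical framework.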

  • THE FUTURE IS BRIGHT

    • The state of the “Hadoop Database” union is strong

      – Increasing adoption

      – Strong foundation

      – Great community

    • Prominent role in the data & analytics platform of the future

    • Let’s go create the future

  • THANK YOU
