Apache Superset The history and anatomy of Council... · 2019-04-18 · 2 Maxime Beauchemin Open...

23
The history and anatomy of Apache Superset

Transcript of Apache Superset The history and anatomy of Council... · 2019-04-18 · 2 Maxime Beauchemin Open...

Page 1: Apache Superset The history and anatomy of Council... · 2019-04-18 · 2 Maxime Beauchemin Open source leader & community builder Creator of Apache Superset Creator of Apache Airflow

The history and anatomy of

Apache Superset

Page 2: Apache Superset The history and anatomy of Council... · 2019-04-18 · 2 Maxime Beauchemin Open source leader & community builder Creator of Apache Superset Creator of Apache Airflow

2

Maxime Beauchemin

● Open source leader & community builder○ Creator of Apache Superset○ Creator of Apache Airflow

● Digital artist● Influencer in the data engineering space● 15+ years of experience in data & analytics● Entrepreneur

Page 3: Apache Superset The history and anatomy of Council... · 2019-04-18 · 2 Maxime Beauchemin Open source leader & community builder Creator of Apache Superset Creator of Apache Airflow

Apache SupersetA data visualization and exploration platform

3

Page 4: Apache Superset The history and anatomy of Council... · 2019-04-18 · 2 Maxime Beauchemin Open source leader & community builder Creator of Apache Superset Creator of Apache Airflow

4

Page 5: Apache Superset The history and anatomy of Council... · 2019-04-18 · 2 Maxime Beauchemin Open source leader & community builder Creator of Apache Superset Creator of Apache Airflow

5

Page 6: Apache Superset The history and anatomy of Council... · 2019-04-18 · 2 Maxime Beauchemin Open source leader & community builder Creator of Apache Superset Creator of Apache Airflow

6

Page 7: Apache Superset The history and anatomy of Council... · 2019-04-18 · 2 Maxime Beauchemin Open source leader & community builder Creator of Apache Superset Creator of Apache Airflow

7

Page 8: Apache Superset The history and anatomy of Council... · 2019-04-18 · 2 Maxime Beauchemin Open source leader & community builder Creator of Apache Superset Creator of Apache Airflow

8

Page 9: Apache Superset The history and anatomy of Council... · 2019-04-18 · 2 Maxime Beauchemin Open source leader & community builder Creator of Apache Superset Creator of Apache Airflow

● Easy-to-use & fast “time-to-dashboard”● Enterprise-ready (RBAC) & cloud-native● Richest set of visualizations (50+)

○ Solid geospatial visualization● Lightweight semantic layer● Works with a wide array of databases● Deep integration with Druid● A thriving and growing community

Apache SupersetA data visualization and exploration platform

9

Page 10: Apache Superset The history and anatomy of Council... · 2019-04-18 · 2 Maxime Beauchemin Open source leader & community builder Creator of Apache Superset Creator of Apache Airflow

10

Panoramix Caravel

The early days

Page 11: Apache Superset The history and anatomy of Council... · 2019-04-18 · 2 Maxime Beauchemin Open source leader & community builder Creator of Apache Superset Creator of Apache Airflow

11

Page 12: Apache Superset The history and anatomy of Council... · 2019-04-18 · 2 Maxime Beauchemin Open source leader & community builder Creator of Apache Superset Creator of Apache Airflow

The Superset Project

● Thriving & accelerating open source community● Most promising open source BI solution● 1500 WAU at Airbnb (replaced Tableau), 400 WAU at Lyft● 12 + committed engineers at 3 leading tech companies

12

Page 13: Apache Superset The history and anatomy of Council... · 2019-04-18 · 2 Maxime Beauchemin Open source leader & community builder Creator of Apache Superset Creator of Apache Airflow

StackES6 Javascript Frontend

● React / Redux● webpack / eslint / jest● Broken down as many packages @supserset-ui/*● nvd3, data-ui (VX), blocks, ...

Python Backend

● Flask.* + Flask App Builder● Pandas● SQLAlchemy (ORM + SQL Toolkit)● Many utility libs (sqlparse, dateutils, ...)

13

Page 14: Apache Superset The history and anatomy of Council... · 2019-04-18 · 2 Maxime Beauchemin Open source leader & community builder Creator of Apache Superset Creator of Apache Airflow

Async Infra [optional]

14

metadataMySQL, Postgres, MariaDB, ...

Message Queue

Redis, RabbitMQ, ...

WSGI Web Server(s)

Celery Worker(s)

Chart Cache (optional)Redis, Memcached, ... Results Cache

[optional] S3, HDFS,...Analytics DatabasesDruid, Presto, Hive, Redshift, BigQuery, MySQL, Postgres,

Snowflake, MemCached, ...

Architecture

Page 15: Apache Superset The history and anatomy of Council... · 2019-04-18 · 2 Maxime Beauchemin Open source leader & community builder Creator of Apache Superset Creator of Apache Airflow

Challenges

15

Page 16: Apache Superset The history and anatomy of Council... · 2019-04-18 · 2 Maxime Beauchemin Open source leader & community builder Creator of Apache Superset Creator of Apache Airflow

Challenge: a fast pace repo

16

Page 17: Apache Superset The history and anatomy of Council... · 2019-04-18 · 2 Maxime Beauchemin Open source leader & community builder Creator of Apache Superset Creator of Apache Airflow

Challenge: a huge dependency tree

17

Javascript:

● 88 production packages● 68 dev packages● ls node_modules/ | wc -l == 1242

Python

● 35 direct dependencies● ~66 leaves in the dep tree

Page 18: Apache Superset The history and anatomy of Council... · 2019-04-18 · 2 Maxime Beauchemin Open source leader & community builder Creator of Apache Superset Creator of Apache Airflow

Challenge: Release Management

18

Page 19: Apache Superset The history and anatomy of Council... · 2019-04-18 · 2 Maxime Beauchemin Open source leader & community builder Creator of Apache Superset Creator of Apache Airflow

Challenge: Coordination

19

Page 20: Apache Superset The history and anatomy of Council... · 2019-04-18 · 2 Maxime Beauchemin Open source leader & community builder Creator of Apache Superset Creator of Apache Airflow

Challenge: ASF bureaucracy

20

Page 21: Apache Superset The history and anatomy of Council... · 2019-04-18 · 2 Maxime Beauchemin Open source leader & community builder Creator of Apache Superset Creator of Apache Airflow

Roadmap● Steady Apache-approved releases● Quality & polish ++● Thumbnails + cards!● A formal data access layer API● Embeddable components● Schedule simple data pipelines

21

Page 22: Apache Superset The history and anatomy of Council... · 2019-04-18 · 2 Maxime Beauchemin Open source leader & community builder Creator of Apache Superset Creator of Apache Airflow

What’s next!?● Automated root cause analysis & anomaly detection● Assisted dashboard generation● Collaborative workspaces & social features● Mobile!● Data governance & auditing● Integrated notebooks● Storytelling● Specialized visualization packages● ML models introspection● Alerts, notifications, email/mobile delivery

22

Page 23: Apache Superset The history and anatomy of Council... · 2019-04-18 · 2 Maxime Beauchemin Open source leader & community builder Creator of Apache Superset Creator of Apache Airflow

Conclusion

23

● I’m looking to help companies onboard!● Interested in working on Superset!?

[email protected]

github.com/apache/incubator-superset

apache-superset.slack.com