Embracing the Monolith
Doubling down on Python to move fast without breaking things.
Embracing the Monolith in Small Teams
Leon Sasson @leonsasson
PyData Chicago 2016
Rise Science
Product Goals
• Sleep Improvement • User Enjoyment
Iterate Fast
Young company, timeline of weeks and days.
Data is core to the product
No data = 😩
Development Cycle
Hypothesis
Exploration
Experiment
Productizing
Evaluate & Analyze
Easy, right?
😓
Obstacles
• Data Silos • Different Tooling: moving from phase to phase requires different tools • People • "It works on staging"
• Testing data products is hard • Garbage in → Garbage out • Capacity problems
Extended Product Cycles
How do we start?
Descriptive, visuals, basic summaries
Step Back
What the organization needs.
Understand the problem before getting into solutions
Solution First
Focus is on tech trade-off
vs.
Problem First
Focus is on making progress for the org
Business Optimality
Technical Optimality
What's the least I can do to solve the problem?
You need an architecture compatible with this mindset
Monolithic Architecture
© Martin Fowler: http://martinfowler.com/articles/microservices.html
A monolithic application puts all its functionality into a single process...
... and scales by replicating the monolith on multiple servers
A microservices architecture puts each element of functionality into a separate service...
... and scales by distributing these services across servers, replicating as needed
Django. The Good Things
Reuse Libraries
IPython Notebooks
Reuse your ORM when accessing data.
Pandas, django-pandas
Instrumentation
People
The Problem of Toil
"... manual, repetitive, automatable, tactical, devoid of enduring value, and scales linearly as a service grows ..."
Toil-induced negative data culture
Self-Serve Analytics
Breaking Data Silos
Why do Data Silos Happen?
Row
person id   date         duration
1           2016-08-01   450
2           2016-08-01   426
1           2016-08-02   438
Columnar
person id   1            2            1
date        2016-08-01   2016-08-01   2016-08-02
duration    450          426          438
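The row and columnar layouts above can be sketched in plain Python. This is only an illustration of the idea, not how any particular database stores data: the variable and function names are made up for the example, and the values are the three records from the slide.

```python
# Row layout: each record is stored together.
rows = [
    {"person_id": 1, "date": "2016-08-01", "duration": 450},
    {"person_id": 2, "date": "2016-08-01", "duration": 426},
    {"person_id": 1, "date": "2016-08-02", "duration": 438},
]

# Columnar layout: all values of one column are stored together.
columns = {name: [row[name] for row in rows] for name in rows[0]}


def mean_duration_from_rows(rows):
    """Aggregate over the row layout: every whole record is visited."""
    total = 0
    for row in rows:
        total += row["duration"]
    return total / len(rows)


def mean_duration_from_columns(columns):
    """Aggregate over the columnar layout: only one column is read."""
    durations = columns["duration"]
    return sum(durations) / len(durations)


row_mean = mean_duration_from_rows(rows)
column_mean = mean_duration_from_columns(columns)
```

Both functions return the same number. The difference is what has to be read: the row version walks every record, while the columnar version touches only the `duration` values, which is why aggregations suit a columnar store.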
Centralizing Data
Segment.com
Backend DB
ETL
Redshift
Redshift is fast for aggregations
Out-of-the-box compatible with Postgres
(Mostly...)
Bring data to the people
Positive Feedback Loop on Data Culture
Non-technical people can access data whenever they need it
The data team can focus on bigger problems and act as enablers
Be scrappy.
Thanks!