Big Data and Machine Learning at Zalando BIG DATA AT ZALANDO Business Intelligence Machine Learning


  • date post

    05-Jun-2020

Transcript of Big Data and Machine Learning at Zalando BIG DATA AT ZALANDO Business Intelligence Machine Learning

  • Big Data and Machine Learning at Zalando

    Kshitij Kumar

    Vice President, Data Infrastructure

    10-03-2019

  • 2

    WE LOVE FASHION

  • 3

    WHAT STARTED AS A SIMPLE ONLINE SHOP…

  • 4

    …HAS BECOME THE LEADING EUROPEAN ONLINE PLATFORM FOR FASHION

  • 5

    WE OFFER A SUCCESSFUL AND CURATED ASSORTMENT

    HIGHLY EXPERIENCED category management

    CURATED SHOPPING with Zalon

    > 500 designers & stylists

    > 300,000 articles from ~ 2,000 international brands

    11 private labels

    LOCALIZATION of the assortment

  • 6

    PLATFORM STRATEGY

    BRANDS CONSUMERS

    ENABLER

  • 7

    WE DRESS CODE

  • 8

    WE ARE CONSTANTLY INNOVATING

    CLOUD-BASED, CUTTING-EDGE & SCALABLE technology solutions

    > 2,000 employees at 8 international tech locations

    HQs in Berlin

    help our brand to WIN ONLINE

  • 9

    BIG DATA AT ZALANDO

    Business Intelligence

    Machine Learning

    Data Governance

    Data at the core of everything we do

  • 10

    A TYPICAL BIG DATA INFRASTRUCTURE

    ML Platform

    • Explore

    • Train

    • Serve

    • Observe

    Data Platform

    • Ingestion

    • Metadata

    • Store

    • Process

    Business Intelligence

    • Data Warehousing

    • Visual KPIs

    • Trusted datasets

    Data Governance

    • Data Catalog

    • Privacy

    • GDPR

  • 11

    SOME ML USE CASES AT AN ONLINE RETAILER

  • 12

    AN ML-DRIVEN CUSTOMER EXPERIENCE

  • 13

    ML-driven real-time reco engine

    People who browsed this style also browsed these other styles…
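The "also browsed" idea on this slide can be illustrated with a simple session co-occurrence count. This is only a minimal sketch under stated assumptions: the slide does not say how the real-time engine works, and the function names, the style identifiers, and the toy sessions below are all hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations


def build_co_browse_counts(sessions):
    """For each style, count how often every other style appears in the same session."""
    counts = defaultdict(Counter)
    for session in sessions:
        # Use a set so repeated views of one style within a session count once.
        for a, b in permutations(set(session), 2):
            counts[a][b] += 1
    return counts


def also_browsed(counts, style, k=3):
    """Return up to k styles most often co-browsed with `style`."""
    co_counts = counts.get(style, Counter())
    # Sort by descending count, then by name so ties are deterministic.
    ranked = sorted(co_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:k]]


# Hypothetical browsing sessions, each a list of style identifiers.
sessions = [
    ["sneaker-a", "sneaker-b", "jacket-c"],
    ["sneaker-a", "sneaker-b"],
    ["sneaker-a", "dress-d"],
]
counts = build_co_browse_counts(sessions)
print(also_browsed(counts, "sneaker-a"))
```

A production engine would presumably update such counts incrementally and blend in other signals; the sketch only shows the co-occurrence core.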

  • 14

    COMPLETE THE LOOK

    • Multi-dimensional ML driven product placement

    • Search • Recommended products • Complementary items • Size (fit) • Delivery promise

  • 15

    THE ML JOURNEY

    Explore

    Fetch

    Prepare

    Train Model

    Evaluate Model

    Deploy to production

    Monitor/ Evaluate

    Ready the data

    Serve the models

  • 16

    ACHIEVING THE BALANCE TO RUN ML AT SCALE

    Exploding new With the needs

  • 17

    THE ML PIPELINE – FOR A SINGLE USE CASE

    ML Use Case Notebook/UI

    creates workflows

    Fetch Data

    Extract Features

    Prepare Data

    Train Model

    Deploy Model

    Serve

    Monitor

    Evaluate and Feedback
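The stages on this slide can be sketched as an ordered workflow in which each stage reads and extends a shared context. Everything here is an illustrative assumption rather than the tooling the talk describes: the `Workflow` class, the stage functions, the toy price data, and the threshold "model" are hypothetical stand-ins.

```python
from typing import Any, Callable, Dict, List, Tuple

Context = Dict[str, Any]
Stage = Callable[[Context], Context]


class Workflow:
    """Runs named stages in order, passing a shared context between them."""

    def __init__(self) -> None:
        self.stages: List[Tuple[str, Stage]] = []

    def add(self, name: str, stage: Stage) -> "Workflow":
        self.stages.append((name, stage))
        return self

    def run(self) -> Context:
        ctx: Context = {"history": []}
        for name, stage in self.stages:
            ctx = stage(ctx)
            ctx["history"].append(name)  # record which stages have completed
        return ctx


def fetch_data(ctx: Context) -> Context:
    # Toy (price, bought) rows; a real job would read from the data platform.
    ctx["rows"] = [(10, 0), (20, 0), (30, 0), (60, 1), (70, 1), (80, 1)]
    return ctx


def extract_features(ctx: Context) -> Context:
    ctx["examples"] = [(float(price), label) for price, label in ctx["rows"]]
    return ctx


def prepare_data(ctx: Context) -> Context:
    # Hold out every third example for evaluation.
    examples = ctx["examples"]
    ctx["test"] = examples[2::3]
    ctx["train"] = [e for i, e in enumerate(examples) if i % 3 != 2]
    return ctx


def train_model(ctx: Context) -> Context:
    # The "model" is a threshold halfway between the two class means.
    neg = [x for x, y in ctx["train"] if y == 0]
    pos = [x for x, y in ctx["train"] if y == 1]
    ctx["threshold"] = (sum(neg) / len(neg) + sum(pos) / len(pos)) / 2
    return ctx


def deploy_model(ctx: Context) -> Context:
    threshold = ctx["threshold"]
    ctx["endpoint"] = lambda x: int(x >= threshold)
    return ctx


def serve(ctx: Context) -> Context:
    ctx["predictions"] = [ctx["endpoint"](x) for x, _ in ctx["test"]]
    return ctx


def monitor(ctx: Context) -> Context:
    ctx["requests_served"] = len(ctx["predictions"])
    return ctx


def evaluate(ctx: Context) -> Context:
    labels = [y for _, y in ctx["test"]]
    hits = sum(p == y for p, y in zip(ctx["predictions"], labels))
    ctx["accuracy"] = hits / len(labels)
    return ctx


workflow = (
    Workflow()
    .add("fetch_data", fetch_data)
    .add("extract_features", extract_features)
    .add("prepare_data", prepare_data)
    .add("train_model", train_model)
    .add("deploy_model", deploy_model)
    .add("serve", serve)
    .add("monitor", monitor)
    .add("evaluate", evaluate)
)
result = workflow.run()
print(result["history"], result["threshold"], result["accuracy"])
```

Keeping every stage behind the same signature is one way to let a notebook or UI compose workflows without knowing what each stage does internally.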

  • 18

    WHAT HAPPENS – WITH A COUPLE OF USE CASES

  • 19

    AND THE MESS THAT COMES WITH MANY USE CASES

  • 20

    TACKLING THE ML SCALING CHALLENGE

    1. With speed

    The ability to compose training jobs, tuning jobs and endpoints with ease, at the call of an API, and with algorithms available out of the box.

    2. With safety

    The ability to understand metadata at every stage of the ML journey by just describing a training job at the call of an API.

    3. With cost efficiency

    The ability to run hundreds of training jobs that are “serverless”. Trainings produce models and infrastructure is automatically shut down.
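The three points above can be sketched together: a job is created from a declarative spec, its metadata can be retrieved with a describe call, and the compute it ran on is torn down automatically. This is a toy in-memory model, not the API of any particular platform; `TrainingJobSpec`, `TrainingService`, `FakeCluster`, and the dataset URI are hypothetical names chosen for illustration.

```python
import contextlib
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class TrainingJobSpec:
    """Declarative description of a training job (hypothetical schema)."""
    algorithm: str
    dataset_uri: str
    hyperparameters: dict = field(default_factory=dict)
    instance_count: int = 1


class FakeCluster:
    """Stand-in for compute that exists only while a job runs."""

    def __init__(self, size: int) -> None:
        self.size = size
        self.running = True

    def shutdown(self) -> None:
        self.running = False


@contextlib.contextmanager
def ephemeral_cluster(size: int):
    cluster = FakeCluster(size)
    try:
        yield cluster
    finally:
        # Runs whether training succeeds or fails, so nothing is left running.
        cluster.shutdown()


class TrainingService:
    def __init__(self) -> None:
        self._jobs = {}

    def create_training_job(self, spec: TrainingJobSpec) -> str:
        job_id = f"job-{len(self._jobs) + 1}"
        with ephemeral_cluster(spec.instance_count) as cluster:
            # Placeholder for the actual training step.
            model = {"algorithm": spec.algorithm, "trained_on": spec.dataset_uri}
        self._jobs[job_id] = {
            "spec": asdict(spec),
            "model": model,
            "cluster_running": cluster.running,
        }
        return job_id

    def describe_training_job(self, job_id: str) -> dict:
        return self._jobs[job_id]


service = TrainingService()
spec = TrainingJobSpec(
    algorithm="xgboost",
    dataset_uri="s3://example-bucket/train.csv",  # placeholder URI
    hyperparameters={"max_depth": 6},
    instance_count=2,
)
job_id = service.create_training_job(spec)
print(service.describe_training_job(job_id))
```

Because the cluster is released in a `finally` clause, a failed training run cannot leave infrastructure behind, which is the cost-efficiency property the slide describes.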

  • 21

    END TO END ML PIPELINE (real-life use case)

  • 22

    Productionizing ML: Speed, with simplicity

  • 23

    SAFE AND MONITORABLE ML

    How is the model endpoint performing?

  • WHERE WOULD WE LIKE TO BE IN 2020?

    A Scalable, Cost-efficient, Flexible Data Infrastructure

    Shared Data, Models, Features

    Safe, secure data usability, with privacy

    Open source, Inner source, best-of-breed vendor tools

  • We’re hiring!

  • Big Data and Machine Learning at Zalando

    Kshitij Kumar

    Vice President, Data Infrastructure

    kshitij.kumar@zalando.de

    10-03-2019

  • 27

    ML pipelines should be safe and understandable

    What training job resulted in the deployment?

    Which model(s) was deployed?

    On what instances are the model(s) deployed?

    How much traffic is routed to which model?
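The four questions above can be answered from a small amount of deployment metadata. The sketch below is one possible shape for that metadata, assuming weighted model variants behind an endpoint; the `Variant` and `Endpoint` classes, the field names, and the example values are hypothetical and not taken from the talk.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Variant:
    """One deployed model behind an endpoint (hypothetical schema)."""
    model_name: str
    training_job: str
    instance_type: str
    instance_count: int
    weight: float


@dataclass
class Endpoint:
    name: str
    variants: List[Variant]

    def traffic_split(self) -> Dict[str, float]:
        """Fraction of traffic routed to each model, from relative weights."""
        total = sum(v.weight for v in self.variants)
        return {v.model_name: v.weight / total for v in self.variants}

    def lineage(self) -> Dict[str, str]:
        """Which training job produced each deployed model."""
        return {v.model_name: v.training_job for v in self.variants}

    def instances(self) -> Dict[str, str]:
        """Which instances each model runs on."""
        return {
            v.model_name: f"{v.instance_count} x {v.instance_type}"
            for v in self.variants
        }


# Illustrative values only.
endpoint = Endpoint(
    name="reco-endpoint",
    variants=[
        Variant("reco-v1", "job-41", "cpu-large", 4, weight=9.0),
        Variant("reco-v2", "job-42", "cpu-large", 1, weight=1.0),
    ],
)
print(endpoint.lineage())
print(endpoint.instances())
print(endpoint.traffic_split())
```

Storing the training job identifier on each variant is what makes the first question answerable after the fact; without it, the link between a deployment and its training run has to be reconstructed by hand.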