Story of the algorithms behind Deezer Flow
RecSysFr, Paris, March 23rd, 2016
B. Mathieu, Data Architect
T. Bouabca, Data Scientist
/01 Context
/02 Initial system
/03 Content tagging system
/04 Live adaptive algorithms
/05 Conclusion
/01 Context
Deezer overview
● Music streaming service
● 6M paying users
● 40M tracks
● 180+ countries
● Up to 200+ tracks / user / day
Deezer Flow: Initial pitch
The magic play button
Adapt tracklist to:
● Music tastes
● Localization
● Activity
● Mood
● Time & day
● Discovery preferences
Interesting debate:
Should we ask the user questions, or let data science do the magic?
/02 Initial system
Initial system (2014)
Available data:
● User likes (artists, albums, tracks)
● User stream logs
● Album recommendation algorithm (collaborative filtering)
Strategy:
● Tracklist computed offline
● Tracks from library / listening habits
● Tracks from recommended albums
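As a rough illustration of the offline strategy, the sketch here mixes tracks from a user's library with tracks taken from recommended albums. The function name, the `discovery_ratio` parameter and the data layout are assumptions made for this example; the slides do not describe how the two sources were actually blended.

```python
import random


def build_offline_tracklist(library_tracks, recommended_album_tracks,
                            size=20, discovery_ratio=0.3, seed=None):
    """Mix familiar tracks with tracks from recommended albums.

    discovery_ratio is a hypothetical knob: the share of slots given to
    tracks the user does not already have in their library.
    """
    rng = random.Random(seed)
    known = set(library_tracks)
    # Keep only recommended tracks that are new to the user, without duplicates.
    fresh = [t for t in dict.fromkeys(recommended_album_tracks) if t not in known]
    n_fresh = min(len(fresh), round(size * discovery_ratio))
    n_known = min(len(known), size - n_fresh)
    picks = rng.sample(sorted(known), n_known) + rng.sample(fresh, n_fresh)
    rng.shuffle(picks)
    return picks
```

Because the list is computed offline, it cannot react to what the user does during a session, which is the limitation the later slides address.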
Cold start problem: addressing new users
1. New users are asked to select some musical genres and some artists
2. Build the tracklist from liked artists and similar artists
3. Fall back to the top tracks in the user's country
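The three cold-start steps can be sketched as follows. All inputs (`similar_artists`, `artist_top_tracks`, `country_top_tracks`) are hypothetical lookups introduced for this example, not Deezer data structures.

```python
def cold_start_tracklist(selected_artists, similar_artists, artist_top_tracks,
                         country_top_tracks, size=20):
    """Build a first tracklist for a user with no listening history.

    similar_artists:   dict artist -> list of similar artists (assumed input)
    artist_top_tracks: dict artist -> list of tracks (assumed input)
    """
    # Step 2: start from the artists picked at sign-up, then their neighbours.
    artists = list(selected_artists)
    for artist in selected_artists:
        artists.extend(similar_artists.get(artist, []))

    tracklist, seen = [], set()
    for artist in dict.fromkeys(artists):  # de-duplicate, keep order
        for track in artist_top_tracks.get(artist, []):
            if track not in seen:
                seen.add(track)
                tracklist.append(track)

    # Step 3: fill any remaining slots with the country's top tracks.
    for track in country_top_tracks:
        if len(tracklist) >= size:
            break
        if track not in seen:
            seen.add(track)
            tracklist.append(track)
    return tracklist[:size]
```

A user who skips the selection step simply gets the country chart, since steps 1 and 2 contribute nothing.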
First Flow limitations
● Tracklist only fits the user’s tastes
● Tracklist does not fit the user’s mood, activity, time of day ...
To reach this goal:
● Immediately take into account the user’s last interactions
● Refresh the tracklist more often
● Gain insights into the content of a track
Need a more content-based approach
/03 Content tagging system
Building a content tagging system
How to consolidate such data?
● Heterogeneous sources
● Millions of songs, artists, playlists or albums to tag every day
Quality assessment:
● Monitoring every source
● Benchmarking
● Studying new metrics
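One possible way to consolidate tags from heterogeneous sources is a weighted vote, sketched here. The slides do not say which method was used; the source names, the per-source weights and the `min_score` threshold are all assumptions for illustration.

```python
from collections import defaultdict


def consolidate_tags(tags_by_source, source_weights, min_score=0.5):
    """Merge per-source tag confidences into one score per tag.

    tags_by_source: {source: {tag: confidence in [0, 1]}}
    source_weights: {source: trust weight}  (hypothetical, e.g. set by benchmarking)
    Returns {tag: weighted average confidence} for tags scoring >= min_score.
    """
    total_weight = sum(source_weights.get(s, 0.0) for s in tags_by_source)
    if total_weight == 0:
        return {}
    scores = defaultdict(float)
    for source, tags in tags_by_source.items():
        weight = source_weights.get(source, 0.0)
        for tag, confidence in tags.items():
            # A source that does not mention a tag contributes 0 for it.
            scores[tag] += weight * confidence
    return {tag: s / total_weight
            for tag, s in scores.items()
            if s / total_weight >= min_score}
```

With this scheme, monitoring a source amounts to tracking how often its tags survive consolidation, and a degraded source can be down-weighted without touching the others.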
Architecture overview
Content data:
- Tags
- Popularity
User data:
- Taste model
- Hot tracks
- Behaviors
Build tracklist
- Data cache
- User action history

- Update user models
- Consolidate tags data
- Build indexes

Action logs
/04 Live adaptive algorithms
The live Flow (2015)
● Generated user profile
● User history analyzed offline
● Recently played tracks
● Recent actions
● Querying tracks from an Elasticsearch index
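To make the querying step concrete, here is a minimal sketch that turns a tag profile into an Elasticsearch Query DSL body, built as a plain dictionary. The field names `tags` and `track_id`, and the idea of using tag weights as boosts, are assumptions for this example; the slides do not show the real index mapping or query.

```python
def build_flow_query(tag_weights, recently_played_ids, size=20):
    """Build a bool query: prefer tracks matching the user's tags,
    and exclude tracks that were played recently.

    tag_weights: {tag: weight} taken from the user profile (assumed shape)
    """
    should = [
        {"term": {"tags": {"value": tag, "boost": weight}}}
        for tag, weight in tag_weights.items()
        if weight > 0
    ]
    return {
        "size": size,
        "query": {
            "bool": {
                "should": should,
                # At least one profile tag must match.
                "minimum_should_match": 1,
                "must_not": [
                    {"terms": {"track_id": list(recently_played_ids)}}
                ],
            }
        },
    }
```

The returned dictionary can be serialized to JSON and sent to a search endpoint. Because it is rebuilt on each request, recent plays and actions affect the very next batch of tracks.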
We can be more precise!
Flat tag profiles can lead to mistakes
● Tag clustering
● Querying ES with different tag queries
● Serving tracks according to cluster proportion
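Serving tracks according to cluster proportion can be sketched as a slot-allocation problem: given a weight per tag cluster, decide how many of the next tracks each cluster's query should supply. The largest-remainder rounding used here is an assumption chosen for the example, not a method stated in the slides.

```python
def tracks_per_cluster(cluster_weights, n_tracks):
    """Split n_tracks between clusters in proportion to their weights.

    cluster_weights: {cluster: non-negative weight in the user's profile}
    """
    total = sum(cluster_weights.values())
    if total <= 0 or n_tracks <= 0:
        return {c: 0 for c in cluster_weights}
    exact = {c: n_tracks * w / total for c, w in cluster_weights.items()}
    counts = {c: int(x) for c, x in exact.items()}
    leftover = n_tracks - sum(counts.values())
    # Hand the leftover slots to the clusters with the largest remainders.
    by_remainder = sorted(exact, key=lambda c: exact[c] - counts[c], reverse=True)
    for c in by_remainder[:leftover]:
        counts[c] += 1
    return counts
```

For example, `tracks_per_cluster({"rock": 6, "jazz": 3, "classical": 1}, 10)` returns `{"rock": 6, "jazz": 3, "classical": 1}`. Each cluster then gets its own query, so a user who likes both metal and classical is not served tracks tagged with an average of the two.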
Live evaluation: A/B testing
Different metrics to follow:
● Listening time
● Satisfaction
● User interaction (skipped / liked)
● Reconnection to Flow
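As one example of how an A/B comparison on the skip metric might be evaluated, the sketch here runs a two-sided two-proportion z-test on skip rates. The choice of test and the function name are assumptions; the slides only list the metrics, not the statistics applied to them.

```python
import math


def skip_rate_z_test(skips_a, plays_a, skips_b, plays_b):
    """Compare the skip rates of variants A and B.

    Returns (z, p_value) for a two-sided two-proportion z-test.
    """
    p_a = skips_a / plays_a
    p_b = skips_b / plays_b
    pooled = (skips_a + skips_b) / (plays_a + plays_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / plays_a + 1 / plays_b))
    if se == 0:
        return 0.0, 1.0
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

A positive `z` means variant A is skipped more often than B. The same comparison would need a different test for continuous metrics such as listening time.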
/05 Conclusion
Where are we now?
● Collaborative filtering combined with a content-based approach (coming soon)
● More adaptation to the context
What’s next?
● Fitting the user’s mood
● Increased performance on a user’s first days
We are hiring!
● Data scientist
● Data architect
● Search scientist
https://www.deezer.com/jobs
Thanks for your attention
Questions?