Real-time Personal Trainer on the SMACK Stack

Page 1: Real-time Personal Trainer on the SMACK Stack

Real-time personal trainer on the SMACK stack

@honzam399 Jan Machacek

@anirvan_c Anirvan Chakraborty

Page 2: Real-time Personal Trainer on the SMACK Stack

© 2016 Cake Solutions Limited CC BY-NC-SA 4.0

Automated personal trainer - muvr

• Suggests the sequence of exercise sessions
• Suggests exercises in a session, including exercise parameters (e.g. weight, repetitions, …)
• Provides tips on proper exercise form
• With additional hardware (smartwatch, smart clothes), muvr provides
  • A completely unobtrusive exercise experience
  • More accurate tips on proper exercise form
• With over-fitting, it is usable for physiotherapy

Page 3: Real-time Personal Trainer on the SMACK Stack

Architecture

Page 4: Real-time Personal Trainer on the SMACK Stack

Privacy

Page 5: Real-time Personal Trainer on the SMACK Stack

The technologies: iOS

• Learns the users’ behaviour
  • Exercise sessions
  • Exercises within an exercise session
  • Short-term prediction of [scalar] labels for the exercises
• Performs the real-time analysis of the incoming sensor data
  • Advised by the expected behaviour
  • Signal processing to compute repetitions / strokes
  • Forward-propagation to label the exercise
• Submits all recorded sensor data and confirmed (!) labels per session
  • Handles offline / travel modes
  • Synchronises the data across the user’s devices using iCloud
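
The slide does not say which signal-processing method computes the repetitions. As a minimal sketch, assume that one repetition shows up as one excursion of a smoothed single-axis acceleration signal above a threshold; the function names, the threshold, and the synthetic signal are all hypothetical.

```python
import math


def moving_average(samples, window):
    """Smooth the signal with a simple trailing moving average."""
    smoothed = []
    for i in range(len(samples)):
        start = max(0, i - window + 1)
        chunk = samples[start:i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def count_repetitions(samples, threshold, window=5):
    """Count upward crossings of `threshold` in the smoothed signal.

    Assumption: one repetition produces one excursion above the threshold.
    """
    smoothed = moving_average(samples, window)
    repetitions = 0
    above = False
    for value in smoothed:
        if value > threshold and not above:
            repetitions += 1
            above = True
        elif value <= threshold:
            above = False
    return repetitions


# Synthetic example: 3 full sine periods sampled at 50 Hz, 1 s per period.
signal = [math.sin(2 * math.pi * t / 50) for t in range(150)]
```

With the synthetic signal above, `count_repetitions(signal, threshold=0.5)` counts the three periods. Real sensor data would need a threshold derived from the signal itself rather than a fixed constant.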

Page 6: Real-time Personal Trainer on the SMACK Stack

The technologies: Akka

• Reactive services for user profiles, model parameters, and sensor data
• CQRS/ES implementation, which helps to
  • Handle peaks in load
  • Handle failures of individual nodes
  • Reason about the scope of the mutable state we keep
• Uses Cassandra for its journal and snapshot stores
  • The written values are binary “blobs”
• Writes the sensor data to Cassandra
  • Writes the sensor data in “readable” form; it can be read outside the Akka / Scala world
• Reads the model and exercise parameters from Cassandra
  • It selects the best / newest model parameters to serve to the mobile app
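
To illustrate why event sourcing helps with node failures, here is a small language-neutral sketch in Python. It is not Akka Persistence code and not the muvr implementation: the event, state, and journal classes are hypothetical, and the in-memory list merely stands in for the Cassandra journal.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ExerciseRecorded:
    """Hypothetical event: the user confirmed one exercise set."""
    user_id: str
    exercise: str
    repetitions: int


@dataclass
class ProfileState:
    """State that is rebuilt purely from events."""
    totals: dict = field(default_factory=dict)

    def apply(self, event):
        self.totals[event.exercise] = (
            self.totals.get(event.exercise, 0) + event.repetitions
        )
        return self


class Journal:
    """Append-only in-memory stand-in for the Cassandra journal."""

    def __init__(self):
        self._events = []

    def append(self, event):
        self._events.append(event)

    def replay(self, user_id):
        return [e for e in self._events if e.user_id == user_id]


def handle_record(journal, user_id, exercise, repetitions):
    """Command side: validate, then persist an event."""
    if repetitions <= 0:
        raise ValueError("repetitions must be positive")
    journal.append(ExerciseRecorded(user_id, exercise, repetitions))


def recover(journal, user_id):
    """Recovery after a node failure: fold the journal into fresh state."""
    state = ProfileState()
    for event in journal.replay(user_id):
        state.apply(event)
    return state
```

Because the only mutable state is derived from the journal, a replacement node can call `recover` and continue where the failed node stopped.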

Page 7: Real-time Personal Trainer on the SMACK Stack

The technologies: Spark

• Distributed computation framework
  • “Big data” tasks
  • Integrates extremely well with Cassandra
• Reads and processes the profiles and sensor data
  • Identifies clusters of users based on their profile information
  • Slices the sensor inputs by sensor types
  • Writes the results to another store
• Runs in batches
  • Executes on a schedule (typically once a day)

Page 8: Real-time Personal Trainer on the SMACK Stack

The technologies: neon

• A machine learning framework, including
  • “The usual” suspects in tensor algebra
  • Signal processing
  • Different ML approaches
• Training and evaluation programs
  • Both programs terminate either upon discovering the perfect model or when their budget is up
  • Reads clustered training and testing data from the Spark job
  • Writes the model parameters and evaluation result to Cassandra
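
The termination rule (stop on a perfect model or when the budget is spent) can be sketched without any neon API. The loop below is a generic illustration; `train_until_done`, the toy threshold model, and the epoch-based budget are assumptions, not the actual training program.

```python
def train_until_done(step, evaluate, max_epochs, target=1.0):
    """Run `step` until `evaluate` reaches `target` or the budget is spent.

    Returns (epochs_used, final_score).
    """
    score = evaluate()
    epochs = 0
    while score < target and epochs < max_epochs:
        step()
        epochs += 1
        score = evaluate()
    return epochs, score


# Toy model: a single threshold on a 1-D feature, nudged upwards each epoch.
data = [(0.1, 0), (0.4, 0), (0.6, 1), (0.9, 1)]
model = {"threshold": 0.0}


def evaluate():
    """Fraction of examples the current threshold classifies correctly."""
    correct = sum((x > model["threshold"]) == bool(y) for x, y in data)
    return correct / len(data)


def step():
    """One hypothetical training epoch."""
    model["threshold"] += 0.25
```

A real budget could equally be wall-clock time or cost; only the stopping condition matters here.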

Page 9: Real-time Personal Trainer on the SMACK Stack

The technologies: Cassandra

• Underpins the entire platform
  • Journal and snapshot store for Akka
  • Sensor data store
  • Model parameter store
  • “Summary” store
• High availability
  • No single point of failure
  • High read and write throughput
  • Replication factor
  • Tuneable consistency level
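
The interplay of replication factor and tuneable consistency comes down to simple arithmetic: a quorum is floor(RF / 2) + 1 replicas, and a read is guaranteed to see the latest write when the read and write replica counts together exceed the replication factor. The helper names below are illustrative only.

```python
def quorum(replication_factor):
    """Number of replicas that form a quorum: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def is_strongly_consistent(read_replicas, write_replicas, replication_factor):
    """True when R + W > RF, so the read and write replica sets overlap."""
    return read_replicas + write_replicas > replication_factor
```

For example, with a replication factor of 3, reading and writing at quorum (2 replicas each) overlaps, whereas reading and writing a single replica each does not.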

Page 10: Real-time Personal Trainer on the SMACK Stack

Page 11: Real-time Personal Trainer on the SMACK Stack

Spark & Cassandra

• Group the sensor data into n clusters by user profile with biometric ID
• Expand the sensor data
  • Slices of the sensor data by combinations of accelerometer, gyroscope, heart rate, targeted muscle group strain gauges, …
  • 1 user = 1 MiB from one sensor per hour; but 4 sensors expand into 4! MiB
• Trivial tasks
  • The most popular user-contributed exercises
  • The most popular exercise sessions and exercises within the sessions
  • The most effective (by overall fitness improvement, weight loss, muscle mass gain, …) exercise sessions
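
Slicing by sensor combinations can be sketched in plain Python, leaving Spark aside. The sensor names and the dictionary-shaped sample are assumptions, and the sketch only enumerates the slices; it does not attempt to reproduce the storage estimate on the slide.

```python
from itertools import combinations

# Hypothetical sensor names, taken from the examples listed on the slide.
SENSORS = ["accelerometer", "gyroscope", "heart_rate", "strain_gauge"]


def sensor_slices(sensors):
    """Yield every non-empty combination of sensors, smallest first."""
    for size in range(1, len(sensors) + 1):
        for combo in combinations(sensors, size):
            yield combo


def slice_sample(sample, combo):
    """Project one multi-sensor sample (a dict) onto the chosen sensors."""
    return {name: sample[name] for name in combo}
```

Each combination yields its own training slice, so a model can be fitted for whichever set of sensors a given user actually wears.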

Page 12: Real-time Personal Trainer on the SMACK Stack

Production ML

Take the data from Cassandra (written there by the Spark jobs) and:

• Split into training and test datasets
• Fit models for various sensor types
• Save model parameters
• Evaluate the newly fitted models, and re-evaluate old data
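
The first step, splitting into training and test datasets, might look like the sketch below. The 80/20 ratio, the fixed seed, and the function name are assumptions; the slides do not state how the split is made.

```python
import random


def train_test_split(examples, test_fraction=0.2, seed=0):
    """Shuffle deterministically, then cut into (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    # Always hold out at least one example for testing.
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]
```

A fixed seed keeps the split reproducible, which matters when older models are later re-evaluated against comparable data.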

Page 13: Real-time Personal Trainer on the SMACK Stack

Production ML

• We are using a convolutional network
  • 2 seconds of sensor data as the input (e.g. a @ 50 Hz for accelerometer; a, g @ 50 Hz for accelerometer + gyroscope; u, l @ 10 Hz for smart clothes)
  • The exercise classes as the outputs
• The training program
  • CNN in neon
  • Loads the mini-batches from Cassandra
  • Fits the model; evaluates the fitted model
  • Saves the model parameters into Cassandra
• The re-evaluation program
  • Re-evaluates past n models against the latest training dataset, computing accuracy, precision, recall, F1
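
The four metrics named for the re-evaluation program follow their standard definitions. The sketch below computes them for one exercise class treated as the positive class; the function name and the label lists are hypothetical.

```python
def binary_metrics(actual, predicted, positive):
    """Accuracy, precision, recall and F1 for one class treated as positive."""
    pairs = list(zip(actual, predicted))
    if not pairs:
        raise ValueError("need at least one labelled example")
    tp = sum(a == positive and p == positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    correct = sum(a == p for a, p in pairs)
    # Guard against division by zero when a class is never predicted or seen.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

Running this per exercise class for each of the past n models gives a comparable table from which the best model can be picked.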

Page 14: Real-time Personal Trainer on the SMACK Stack

Having code is jolly good

Page 15: Real-time Personal Trainer on the SMACK Stack

Running it

• Simplicity
• Ease of orchestration
• Ease of development
• Support for polyglot frameworks and components
• Cost-effective resource utilisation

Page 16: Real-time Personal Trainer on the SMACK Stack

Docker

• Deploy reliably & consistently
• Execution is fast and lightweight
• Simplicity
• Developer-friendly workflow
• Fantastic community

Page 17: Real-time Personal Trainer on the SMACK Stack

Dockerize Cassandra Dev Environment

• Super low memory settings in cassandra-env.sh
  • MAX_HEAP_SIZE="128M"
  • HEAP_NEWSIZE="24M"
• Remove caches in dev mode in cassandra.yaml
  • key_cache_size_in_mb: 0
  • reduce_cache_sizes_at: 0
  • reduce_cache_capacity_to: 0

Page 18: Real-time Personal Trainer on the SMACK Stack

Dockerize Cassandra Production

• Use host networking (--net=host) for better network performance
• Put data, commitlog and saved_caches in volumes mounted from the underlying host
• Run Cassandra in the foreground (-f)
• Tune the JVM heap for optimal size
• Tune the JVM garbage collector for your workload
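
The first three points can be put together as a `docker run` invocation. The sketch below only builds the argument list; the image name and host directory are placeholders, and the container paths assume Cassandra's default locations under /var/lib/cassandra, which a custom image may change.

```python
def cassandra_docker_command(image, host_root):
    """Build a `docker run` argument list for a production Cassandra node."""
    args = ["docker", "run", "-d", "--net=host"]
    # Keep data, commit log and saved caches on the host so they survive
    # container restarts; the container paths are assumed defaults.
    for folder in ("data", "commitlog", "saved_caches"):
        args += ["-v", f"{host_root}/{folder}:/var/lib/cassandra/{folder}"]
    # `cassandra -f` keeps the process in the foreground so the container
    # stays alive for as long as Cassandra runs.
    args += [image, "cassandra", "-f"]
    return args
```

The resulting list can be passed to `subprocess.run` or joined into a shell command. Heap and garbage-collector tuning would still live in cassandra-env.sh inside the image.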

Page 19: Real-time Personal Trainer on the SMACK Stack

Mesos

• Distributed systems kernel
• Scales to 10,000s of nodes
• Depends on ZooKeeper for fault tolerance and high availability
• Creates a highly available, scalable single resource pool
• Automatic failover
• Ease of management
• Simple to operate
• Support for Docker containers

Page 20: Real-time Personal Trainer on the SMACK Stack

Mesos architecture

image source: https://assets.digitalocean.com/articles/mesosphere/mesos_architecture.png

Page 21: Real-time Personal Trainer on the SMACK Stack

Cassandra on Mesos

• Running Cassandra as Docker containers
• Custom Dockerfile and entry-point script to control Cassandra configuration
• Marathon to initialize and control
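
As a rough idea of what Marathon is given, the sketch below assembles an app definition for a Dockerised Cassandra node. The field names follow the Marathon app definition format of that era as best understood here and should be checked against the Marathon version in use; the app id, image name, and resource figures are placeholders, not the deck's actual configuration.

```python
import json


def marathon_app(app_id, image, instances, cpus, mem_mb):
    """Return a Marathon-style app definition that runs a Docker image."""
    return {
        "id": app_id,
        "instances": instances,
        "cpus": cpus,
        "mem": mem_mb,
        "container": {
            "type": "DOCKER",
            # Host networking, matching the production advice for Cassandra.
            "docker": {"image": image, "network": "HOST"},
        },
    }


# Placeholder values for illustration only.
app_json = json.dumps(marathon_app("/cassandra", "example/cassandra", 3, 2.0, 4096))
```

The JSON document would then be submitted to Marathon, which starts the requested number of instances and restarts them on failure.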

Page 22: Real-time Personal Trainer on the SMACK Stack

Cost-effective resources in AWS

• Embrace AWS spot instances
  • About 50-60% cheaper than on-demand instances
  • Can be reclaimed without notice if outbid
• Run dev and staging on spot instances
• Run Spark jobs on spot instances

Page 23: Real-time Personal Trainer on the SMACK Stack

Page 24: Real-time Personal Trainer on the SMACK Stack

Thanks!

Twitter: @cakesolutions
Tel: 0845 617 1200
Email: [email protected]
Jobs: http://www.cakesolutions.net/careers