Using Task Queues and D3.js to build an analytics product on App Engine

"Using Task Queues and D3.js to build an analytics product on App Engine" by Warren Edwards, Founder, Waizee

Transcript of Using Task Queues and D3.js to build an analytics product on App Engine

Page 1: Using Task Queues and D3.js to build an analytics product on App Engine

Using Task Queues and D3.js to build an analytics product on App Engine

Warren Edwards, Founder, Waizee

Page 2

TODAY

How Do We Handle All The Data?!?

Look at the Product

Why App Engine? Why D3.js?

Task Queues in App Engine

Code Samples

A Quiz!

Wrap Up

Page 3

All of the data stored in the world currently totals about 4 zettabytes, according to EMC.

At current growth rates, that data will surpass a yottabyte in 2030.

If each bit were represented by a grain of sand, a yottabyte would be about five percent of the mass of the Moon.

Data surpasses human readability.
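The projection above, from about 4 zettabytes to a yottabyte by 2030, implies a steady yearly growth rate that is easy to check. The following is a minimal sketch, not a calculation taken from the slides. It assumes that "currently" means 2013 (so 2030 is 17 years away) and that growth compounds at a constant rate; the function names are hypothetical.

```python
import math

ZETTABYTE = 10 ** 21  # bytes
YOTTABYTE = 10 ** 24  # bytes


def implied_annual_growth(start_bytes, end_bytes, years):
    """Constant yearly growth rate that turns start_bytes into end_bytes."""
    return (float(end_bytes) / start_bytes) ** (1.0 / years) - 1.0


def years_to_reach(start_bytes, end_bytes, annual_growth):
    """Years needed to grow from start_bytes to end_bytes at a fixed rate."""
    ratio = float(end_bytes) / start_bytes
    return math.log(ratio) / math.log(1.0 + annual_growth)


# Assumption: the 4 ZB figure is for 2013, so 2030 is 17 years later.
growth = implied_annual_growth(4 * ZETTABYTE, YOTTABYTE, years=17)
print("Implied growth: {:.1%} per year".format(growth))

# Cross-check: if data doubled every two years, how long would it take?
doubling_every_two_years = math.sqrt(2.0) - 1.0
print("Years at that pace: {:.1f}".format(
    years_to_reach(4 * ZETTABYTE, YOTTABYTE, doubling_every_two_years)))
```

Under these assumptions the data volume has to grow by a bit less than 40 percent per year, which is in the same range as a doubling every two years.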

Page 4

Software analyzes the data so humans can focus on [ interpretation, judgement, action ];

Page 5

[ Show Product Demo ]

Page 6

Why App Engine?

Platforms compared: AWS, App Engine (AE), Others

● Team Experience
● Automated scaling / spin-up
● Automated load-balancing
● Automated security
● Task queue functionality

Honorable mention: AppScale path from AE

Page 7

Google App Engine builds on Bigtable

Page 8

Why D3.js?

Options compared: Canvas Directly, D3, PolyChart, Matplotlib, Superconductor

● Team experience ( NONE )
● Development effort
● Maturity of library
● Developer traction

Honorable mention: xCharts builds on D3.js with prepackaged charts

PHOTO CREDIT: http://www.flickr.com/photos/cusegoyle/

Page 9

Introducing Task Queues (TQ)

                                HTTP Request                      Task Queue
Full access to data store?      YES                               YES
Atomic transactions?            YES                               YES
Write in Python / Java / Go?    YES                               YES
Maximum request lifetime        30-60 sec                         Up to 10 min
Timing of execution             Immediate                         Up to 30 days
Retry                           Can be done manually, no policy   Automatic by policy
Concurrent requests             No set limit                      Limited by policy

TQ build on App Engine's HTTP Request but liberate server use from user interaction

Page 10

Appeal of Task Queues

Well suited to number crunching
● Long-lived jobs for running analytics
● Full access to data store and messaging to build on your existing App Engine knowledge
● Save state back to Data Store and call the next task

Task Queues allow you to crunch data "while you wait"

Page 11

Cascade Tasks in Queue for Multipass Processing

[ Diagram: USER UPLOADS DATA → Task #1 → Task #2 → . . . → Task #N → RESULT. Arrow labels: Pass Parameters, Save to Data Store. Legend: Program Flow, Data Flow. ]
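The cascade pattern can be tried out without App Engine. The following is a minimal in-memory sketch, not the product's code: `datastore`, `queue`, `add_task`, and `run_queue` are hypothetical stand-ins for the Data Store, a push queue, `taskqueue.add`, and the queue service. Each pass saves its state under a key and hands only that key to the next task.

```python
from collections import deque

datastore = {}   # stand-in for the App Engine Data Store
queue = deque()  # stand-in for a push queue such as 'analyze'


def add_task(handler, params):
    """Stand-in for taskqueue.add(): enqueue a handler with its parameters."""
    queue.append((handler, params))


def first_pass(params):
    key = params['key']
    record = datastore[key]
    record['total'] = sum(record['values'])  # pass 1: aggregate
    add_task(second_pass, {'key': key})      # cascade: enqueue the next task


def second_pass(params):
    key = params['key']
    record = datastore[key]
    # Pass 2 builds on the state that pass 1 saved.
    record['mean'] = record['total'] / float(len(record['values']))
    record['done'] = True


def run_queue():
    """Drain the queue in order, as the queue service would over time."""
    while queue:
        handler, params = queue.popleft()
        handler(params)


# The user's upload is saved first, then the first task is queued.
datastore['upload-1'] = {'values': [3, 5, 10]}
add_task(first_pass, {'key': 'upload-1'})
run_queue()
print(datastore['upload-1'])
```

Passing only the key keeps task parameters small; the data itself travels through the store, which matches the split between program flow and data flow in the diagram.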

Page 12

Process Tasks in Parallel Queues

[ Diagram: USER UPLOADS DATA → Task #1, Task #2, . . . Task #N side by side, each connected to the DATA STORE → RESULT. Legend: Program Flow, Data Flow. ]
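The parallel layout amounts to a fan-out followed by a merge. The following is a minimal sketch of that idea in plain Python 3, using a thread pool as a stand-in for several queue workers; it is an assumption-laden illustration, not App Engine code. `analyze_chunk`, `split`, and `analyze_in_parallel` are hypothetical names, and the "analysis" is just a count and a sum.

```python
from concurrent.futures import ThreadPoolExecutor


def analyze_chunk(chunk):
    """One independent task: summarize a slice of the uploaded data."""
    return {'count': len(chunk), 'total': sum(chunk)}


def split(data, n_chunks):
    """Split data into at most n_chunks contiguous slices of similar size."""
    size = -(-len(data) // n_chunks)  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def analyze_in_parallel(data, n_tasks=4):
    if not data:
        raise ValueError("data must not be empty")
    chunks = split(data, n_tasks)
    # Fan out: each chunk is handled by its own worker.
    with ThreadPoolExecutor(max_workers=n_tasks) as pool:
        partials = list(pool.map(analyze_chunk, chunks))
    # Merge: combine the partial results into the final RESULT.
    count = sum(p['count'] for p in partials)
    total = sum(p['total'] for p in partials)
    return {'count': count, 'mean': total / float(count)}


result = analyze_in_parallel(list(range(1, 101)))
print(result)
```

This only works when the partial results can be combined afterwards, as counts and sums can; analyses that need the whole data set at once fit the cascade pattern better.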

Page 13

Let's Look at a Code Sample

Add Task into Queue

    from google.appengine.api import taskqueue

    taskqueue.add(queue_name='analyze', url='/work1', params={'key': keyID})

Define Task with Retrieval of Parameters

    import webapp2

    class WorkerTheFirst(webapp2.RequestHandler):
        def post(self):
            keyID = self.request.get('key')

Associate Task with Handler

    app = webapp2.WSGIApplication([('/', StartPage),
                                   ('/work1', WorkerTheFirst),
                                   ('.*', ErrPage)])

Page 14

One Task Calls Another

    class WorkerTheFirst(webapp2.RequestHandler):
        def post(self):
            keyID = self.request.get('key')
            #
            # First pass of work here
            #
            taskqueue.add(queue_name='analyze', url='/work2', params={'key': keyID})

    class WorkerTheSecond(webapp2.RequestHandler):
        def post(self):
            keyID = self.request.get('key')
            #
            # Second pass of work here
            #

    app = webapp2.WSGIApplication([('/', StartPage),
                                   ('/work1', WorkerTheFirst),
                                   ('/work2', WorkerTheSecond),
                                   ('.*', ErrPage)])

Page 15

Setting up Task Queues in queue.yaml

    total_storage_limit: 120G  # Max for free apps is 500M

    queue:
    # Queue for analyzing the incoming data
    - name: analyze
      rate: 35/s
      retry_parameters:
        task_retry_limit: 5
        task_age_limit: 2h

    # Queue for user behavior heuristics
    - name: heuristic
      rate: 5/s

    # Queue for doing maintenance on the data store or site
    - name: kickoff
      rate: 5/s
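The two retry_parameters on the 'analyze' queue (task_retry_limit: 5 and task_age_limit: 2h) combine in a way that is easy to misread. The following is a minimal sketch, assuming the behavior described in the queue.yaml reference that a task keeps being retried until both limits are reached. `should_retry` is a hypothetical helper for illustration and is not part of the App Engine API.

```python
RETRY_LIMIT = 5                  # task_retry_limit from the 'analyze' queue
AGE_LIMIT_SECONDS = 2 * 60 * 60  # task_age_limit: 2h


def should_retry(retries_so_far, age_seconds,
                 retry_limit=RETRY_LIMIT, age_limit=AGE_LIMIT_SECONDS):
    """Return True while at least one of the two limits is not yet reached.

    Assumption: with both limits set, retries stop only once the retry
    count AND the task age have both hit their limits.
    """
    return retries_so_far < retry_limit or age_seconds < age_limit


# A task that failed 5 times within a minute is still young enough to retry.
print(should_retry(retries_so_far=5, age_seconds=60))
# After 5 retries and 2 hours, the task is given up.
print(should_retry(retries_so_far=5, age_seconds=2 * 60 * 60))
```

Under that assumption, a task that fails quickly is not dropped after five attempts; it can keep retrying until it is also two hours old.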

Page 16

Core of the D3 Code

    svg = d3.select('body')
        .append('svg')
        .attr('class', 'circles')
        .attr('width', width)
        .attr('height', height)

    svg.append('g').selectAll('circle')
        .data(data)
        .enter()
        .append('circle')
        // pan allows moving the whole graph slightly
        // to account for long text labels
        .attr('transform', 'translate(' + pan + ', 0)')

    svg.selectAll('circle')
        .attr('cx', function(d) {return x(d.v2)})
        .attr('cy', function(d) {return y(d.v1)})
        .attr('r', dot_out)  // dot_out scales the size
        .attr('fill', function(d) {return d.v4})

Page 17

Program Writes Its Own Code!

[ Same D3 code as on the previous page ]

Axes are set heuristically by server software
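The slides do not say which heuristics the server uses for the axes. As a purely illustrative sketch, and not the product's method, one simple rule is to pad the observed data range so that dots do not sit on the edge of the chart. `axis_domain` and `pad_fraction` are hypothetical names.

```python
def axis_domain(values, pad_fraction=0.05):
    """Return (low, high) covering values with a margin on each side."""
    low, high = min(values), max(values)
    span = high - low
    if span == 0:
        # Constant data: fall back to the magnitude of the value (or 1.0)
        # so that the axis never has zero width.
        span = abs(low) or 1.0
    pad = span * pad_fraction
    return (low - pad, high + pad)


# The server could compute this per column and write the two numbers
# into the generated script as the domain of the x or y scale.
print(axis_domain([0, 10, 20]))
print(axis_domain([5, 5]))
```

A real heuristic would likely also round the bounds to "nice" tick values; that step is left out here to keep the sketch short.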

Page 18

Titles Are Set By Heuristic Text Analysis

In Django template

    var title = '{{ pagetitle }}'
    var subtitle = '{{ pagesubtitle }}'

Rendered in JavaScript to the browser

    var title = 'Google+ Rating More Important Metric Than Star Rating'
    var subtitle = 'Survey of Productivity Apps in the Chrome Web Store, Nov 2012'

Labels were pulled heuristically from input - not hard coded!
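Because the titles come from the input data rather than from the developer, they may contain apostrophes or backslashes that would break a quoted JavaScript string. The following is a minimal sketch of one way to guard against that, using only the Python standard library instead of the Django template on the slide. `JS_TEMPLATE` and `render_titles` are hypothetical names; in Django itself, the built-in `escapejs` filter serves a similar purpose.

```python
import json
from string import Template

# The placeholders receive complete JavaScript string literals,
# including their surrounding quotes.
JS_TEMPLATE = Template("var title = $title;\nvar subtitle = $subtitle;")


def render_titles(title, subtitle):
    """Render JavaScript declarations with the titles encoded as literals."""
    return JS_TEMPLATE.substitute(title=json.dumps(title),
                                  subtitle=json.dumps(subtitle))


print(render_titles(
    "Google+ Rating More Important Metric Than Star Rating",
    "Survey of Productivity Apps in the Chrome Web Store, Nov 2012"))

# A title with an apostrophe and a double quote still yields valid script.
print(render_titles("Users' \"Favorite\" Apps", "Nov 2012"))
```

`json.dumps` emits a double-quoted literal with quotes and backslashes escaped. It does not neutralize a literal `</script>` inside an inline script block, so this sketch assumes the generated code is served as a separate script file or is escaped further.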

Page 19

QUIZ !

Page 20

Choose the Correct Task Queue Call

A. taskqueue.add(queue_name='analyze', url='/work2', param={key: keyID})

B. queue.add(queue='analyze', url='/work2', params={'key': keyID})

C. taskqueue.add(queue='analyze', url='/work2', param={'key': keyID})

D. taskqueue.add(queue_name='analyze', url='/work2', params={'key': keyID})

E. taskqueue.add(queue='analyze', url='/work2', params={key: keyID})

Page 21

ANSWER IS ...

Page 22

Choose the Correct Task Queue Call

Correct Answer is D

    taskqueue.add(queue_name='analyze', url='/work2', params={'key': keyID})

It is the only option that combines the taskqueue module, the queue_name and params keyword arguments, and a quoted 'key' string.

Page 23

TQ Open a World of Possibilities

You can send tasks to different versions of your app
→ Automated test of new version of app before Go Live

You can access the queue’s usage data
→ Your app can monitor its own consumption of tasks through QueueStatistics class

Task Queue + Crowdsource = ???
→ Software application instructing humans!

Page 24

Task Queues allow "while you wait" processing
● Allow server task to run autonomously
● Cascade tasks for multistep processing
● Flexible functionality to create great products

Task Queues provide a great tool for automating the understanding of data

D3.js offers flexible, stable tool for viz of data
● Works nicely with automated scripting
● Lush visualizations but not pre-packaged
● Leverage huge traction in San Francisco

D3.js is best platform for visualization using automated processing of data

Page 25

Questions?

Do you have a passion for analytics? Let’s talk!

[email protected]

@campbellwarren

Page 26

We are your number cruncher in the cloud that understands your data and shows you only what is most important.

Page 27

Data Sources

The Digital Universe in 2020: Big Data, Bigger Digital Shadows, and Biggest Growth in the Far East. IDC, sponsored by EMC. December 2012

Diameter and volume of the Moon: Wolfram Alpha

Yottabyte representation: Waizee calculations

2013 Gartner Magic Quadrant for Business Intelligence and Analytics Platforms

Page 28

Photo Credits

http://www.flickr.com/photos/pcoin/2066230078/sizes/m/in/photostream/

http://en.wikipedia.org/wiki/File:Visicalc.png

http://www.todayscampus.com/rememberthis/load.aspx?art=348

http://commons.wikimedia.org/wiki/File:Chocolate_chip_cookie.jpg

http://www.wpclipart.com/signs_symbol/checkmarks/checkmarks_3/checkmark_Bold_Brush_Green.png.html

http://www.flickr.com/photos/cusegoyle/