[Research] Protocols and Structures for Inference: A RESTful API for Machine Learning - James...

Page 1: [Research] Protocols and Structures for Inference: A RESTful API for Machine Learning - James Montgomery

psikit.net, github.com/psi-project

Protocols and Structures for Inference

A RESTful API for Machine Learning

Work conducted at ANU in collaboration with CISRA, with additional support from an Amazon AWS Education Grant

James Montgomery, @jamesatbond

Mark Reid, @mdreid

Barry Drake

http://...

http://...

http://...

Page 2: An ‘ecosystem’ of ML services

Prediction API

Microsoft Azure Machine Learning

Amazon Machine Learning

…and many others

The problem is not that these are bad (they’re all very good) nor that there is competition (also good)

But this ecosystem doesn’t encourage service composition or provide a way for ML practitioners of all sizes to share their data and algorithms

Page 3: Goals of PSI

A web service API

• that is standardised, yet

• sufficiently flexible to support a wide range of ML techniques

Select your ML du jour

Flexible

Federated

http://...

http://...

http://...

Page 4: Protocols and Structures for Inference

An API specification for ML web services

Communication via JSON

Set of common ML-related resources that describe their differences using a schema language (based on JSON schema)

Support for data transformation, training, prediction and updating/online learning

with points for extension and customisation

and support for data formats beyond JSON

Iris Versicolor by Danielle Langlois / CC-BY-SA-3.0

Page 5: Catalogue of PSI resources

• Relations (datasets) of instances and their attributes (@)

• Learners

• Predictors

• Transformers, f(x)

• Collections of PSI resources

Page 6: Structure of a PSI service

[Diagram: resource tree of a PSI service, rooted at “PSI start”]

• Schema collection: integer, string, ..., schema, ...

• Relations collection: relation, attribute, sub-attribute, ...

• Learners collection: learner, ...

• Predictors collection: predictor, update, ...

• Transformers collection: transformer, ...

Legend: required resources, optional resources, resource instances

Structured attributes (arrays, objects) can be decomposed and new attributes created

Page 7: This is also a PSI service

[Diagram: the PSI service resource tree from Page 6, repeated]

An organisation or individual could choose to provide access to one or more datasets

The ‘root’ and collection resources are very lightweight

Page 8: …and so is this

[Diagram: the PSI service resource tree from Page 6, repeated]

ML researchers could present their just-published learning algorithm as a resource

Page 9: …and this, etc.

[Diagram: the PSI service resource tree from Page 6, repeated]

Or even a single predictor

Page 10: Is it RESTful? Is that important?

100% RESTful is not a reasonable aim

But RESTful design can improve interoperability and ease the development of clients

• Discoverable namespace

• Extensible through links entry in resource representations (similar to HTML link element & part of JSON Hyper-schema standard)

Client must still know it’s using a PSI service (no PSI media types)…

• but each resource does provide information about how to use it through schema
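
A client can follow the links entry to discover related resources rather than hard-coding URIs. The Python sketch below assumes a representation whose `links` array holds JSON Hyper-Schema style objects with `rel` and `href` members; the example URIs, the `rel` names and the helper `find_link` are hypothetical and not taken from the PSI specification.

```python
import json
from urllib.parse import urljoin

def find_link(representation, rel, base_uri):
    """Return the absolute URI of the first link with the given rel, or None."""
    for link in representation.get("links", []):
        if link.get("rel") == rel:
            # href may be relative, so resolve it against the resource's own URI
            return urljoin(base_uri, link["href"])
    return None

# Hypothetical representation of a PSI resource (member names are assumptions)
example = json.loads("""
{
  "uri": "http://example.org/psi/relation/iris",
  "links": [
    {"rel": "attributes", "href": "iris/attributes"},
    {"rel": "up", "href": "/psi/relation"}
  ]
}
""")

attributes_uri = find_link(example, "attributes", example["uri"])
```

Resolving each `href` against the resource URI lets a service use relative or absolute links without the client caring which.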

Page 11: Schema describes…

• The data format of attributes (@)

• Form of learning tasks that learners can process

• The domain and range of transformers, f(x)

• Service-specific queries supported by relations

Page 12: Common workflows

[Diagram: three workflows, Train, Predict and Update, drawn as chains of PSI resources. Recoverable labels: a relation resource whose attribute resources (Attribute 1, Attribute 2, ..., Attribute n) emit instance representations; a learner resource (“requires”); a predictor resource (“accepts”, “emits”, “update”) producing a predicted value; a transformer resource (“accepts”, “emits”). Legend: resource, instance, schema, other data source, JSON.]

Page 13: Common workflows

[Diagram: the Train, Predict and Update workflows from Page 12, repeated]

Page 14: Training

[Diagram: four numbered steps; resource icons lost in extraction]

• Discover attributes; reshape as needed; compose with transformers

• GET the learner: its representation includes the task schema (resources: a vector attribute and a nominal attribute; plus any other training parameters)

• POST a task: resources (a vector attribute, a nominal attribute) and parameter values (n = 5, l = 0.5), given as JSON representations of attributes (not their values) or URI references

• Response: 201 Created / 202 Accepted, with the URI of the resulting resource (icon lost in extraction)
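
The training exchange can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the member names `source` and `target`, the example URIs, and the helpers `build_task` and `interpret_training_response` are hypothetical. The slide only establishes that a task names a vector attribute and a nominal attribute, carries parameters such as n = 5 and l = 0.5, and is answered with 201 Created or 202 Accepted plus a URI.

```python
import json

def build_task(vector_attribute_uri, nominal_attribute_uri, **params):
    """Assemble a learning task that references attributes by URI (not their values)."""
    task = {
        "resources": {
            "source": vector_attribute_uri,   # hypothetical member name
            "target": nominal_attribute_uri,  # hypothetical member name
        }
    }
    task.update(params)  # training parameters such as n and l
    return task

def interpret_training_response(status, location):
    """Map the two success codes shown on the slide to a client-side state."""
    if status == 201:
        return ("ready", location)    # the new resource exists now
    if status == 202:
        return ("pending", location)  # accepted; processing has not finished yet
    raise ValueError(f"unexpected status {status}")

task = build_task(
    "http://example.org/psi/relation/iris/flowerFeatures",
    "http://example.org/psi/relation/iris/species",
    n=5,
    l=0.5,
)
body = json.dumps(task)  # this string would be the body of the POST to the learner
```

Passing attribute URIs rather than attribute values keeps the request small and lets the learner fetch the data itself.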

Page 15: What’s in the schema

Algorithm requires an attribute that produces feature vectors of numbers

JSON schema required to describe a PSI attribute’s representation and enforce that it produces correct values:

{"type":"object","properties":{"responseType":{"enum":["attribute#description"],"required":true},"uri":{"type":"string","required":true},"schema":{"type":"object","properties":{"type":{"enum":["array"],"required":true},"items":{"type":"array","items":{"type":"object","properties":{"type":{"enum":["integer","number"],"required":true}}},"required":true}},"required":true},"description":{"type":"string"},"provenance":{"type":["string","object"]},"relation":{"type":"string"},"subattributes":{"type":"array","items":{"type":"string"}}}}

$X \in \mathbb{R}^n$

This is really the only change, but this is still very complicated
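
To make the constraint concrete, the following Python sketch checks by hand what the schema above requires of an attribute description: a `responseType` of `attribute#description`, a string `uri`, and a `schema` that declares an array whose items are all typed `integer` or `number`. It is not a JSON Schema validator; the function name and the example descriptions (including their URIs) are made up for illustration.

```python
NUMERIC_TYPES = {"integer", "number"}

def describes_numeric_vector(description):
    """True if an attribute description declares an array of numeric items."""
    if description.get("responseType") != "attribute#description":
        return False
    if not isinstance(description.get("uri"), str):
        return False
    schema = description.get("schema")
    if not isinstance(schema, dict) or schema.get("type") != "array":
        return False
    items = schema.get("items")
    if not isinstance(items, list):
        return False
    # Every declared item must itself be typed as integer or number
    return all(
        isinstance(item, dict) and item.get("type") in NUMERIC_TYPES
        for item in items
    )

iris_features = {
    "responseType": "attribute#description",
    "uri": "http://example.org/psi/relation/iris/flowerFeatures",  # hypothetical
    "schema": {"type": "array", "items": [{"type": "number"}] * 4},
}
species = {
    "responseType": "attribute#description",
    "uri": "http://example.org/psi/relation/iris/species",  # hypothetical
    "schema": {"type": "string"},
}
```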

Page 16: Pre-defined PSI schema eases burden

Algorithm requires an attribute that produces feature vectors of numbers, $X \in \mathbb{R}^n$

PSI schema:

"$arrayAttribute": {"allItems": "$numberSchema"}

Algorithm requires an attribute that produces feature vectors of real numbers, integers or strings, $X \in (\mathbb{R} \cup \Sigma^*)^n$

PSI schema:

"$arrayAttribute": {"allItems": "$atomicValueSchema"}
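
One way to picture how such shorthand might relate to plain JSON Schema is as template expansion. The Python sketch below is purely illustrative: the expansion table, the function `expand` and the JSON Schema fragments it produces are assumptions made here, not the PSI specification's actual definitions of `$arrayAttribute`, `$numberSchema` or `$atomicValueSchema`.

```python
# Hypothetical expansions of the pre-defined names used on the slide
PREDEFINED = {
    "$numberSchema": {"type": "number"},
    "$atomicValueSchema": {"type": ["number", "integer", "string"]},
}

def expand(psi_schema):
    """Expand a {"$arrayAttribute": {"allItems": name}} shorthand into a plain schema."""
    args = psi_schema["$arrayAttribute"]
    item_schema = PREDEFINED[args["allItems"]]  # look up the named item schema
    return {"type": "array", "items": item_schema}

numbers = expand({"$arrayAttribute": {"allItems": "$numberSchema"}})
mixed = expand({"$arrayAttribute": {"allItems": "$atomicValueSchema"}})
```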

Page 17: Common workflows

[Diagram: the Train, Predict and Update workflows from Page 12, repeated]

Page 18: Prediction

Simple prediction

• GET ?value=[5.1,3.5,1.4,0.2] on the predictor returns: setosa

Or join predictor with attribute to predict on whole relation

• Join request with URI of an attribute (@)

• Response: 201 Created, URI of the new attribute (@)

• GET ?instance=all on the new attribute returns: [setosa,setosa,setosa,…,virginica]
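
A simple prediction request can be assembled in Python roughly as follows. The predictor URI and the helper `prediction_url` are hypothetical; only the `value` query parameter and the example feature vector come from the slide. The vector is JSON-encoded and then percent-encoded so that it survives as a query-string value.

```python
import json
from urllib.parse import urlencode, parse_qs, urlsplit

def prediction_url(predictor_uri, value):
    """Build the GET URL that asks a predictor for a single prediction."""
    # Compact JSON encoding, then percent-encoding for use in a query string
    query = urlencode({"value": json.dumps(value, separators=(",", ":"))})
    return f"{predictor_uri}?{query}"

# Hypothetical predictor URI
url = prediction_url("http://example.org/psi/predictor/iris_knn", [5.1, 3.5, 1.4, 0.2])

# Decoding the query again recovers the original vector
decoded = json.loads(parse_qs(urlsplit(url).query)["value"][0])
```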

Page 19: Beyond JSON data types

• PSI rich values support data of any media type (PSI schema can still be used for data type validation)

• Rich value is either an HTTP URI or Data URI

Iris Versicolor by Danielle Langlois / CC-BY-SA-3.0

data:image/jpeg;base64,/9j/4AAQSkZJRgABAQEASABIAAD/2wBDAAYEBQYFBAYGBQYHBwYIChAKCgkJChQODwwQFxQYGBcUFhYaHSUfGhsjHBYWICwgIyYnKSop
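
As a small illustration of the two forms a rich value can take, the Python sketch below builds a data URI from raw bytes and classifies a value as an HTTP URI or a data URI. The helper names are introduced here, and treating `https://` as an HTTP URI is an assumption; the data URI layout (`data:<media type>;base64,<payload>`) follows RFC 2397.

```python
import base64

def to_data_uri(data, media_type):
    """Encode raw bytes as an RFC 2397 data URI."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{media_type};base64,{encoded}"

def rich_value_kind(value):
    """Classify a rich value as an HTTP URI or a data URI."""
    lowered = value.lower()
    if lowered.startswith(("http://", "https://")):  # https is an assumption
        return "http"
    if lowered.startswith("data:"):
        return "data"
    raise ValueError("rich values must be HTTP URIs or data URIs")

jpeg_header = bytes([0xFF, 0xD8, 0xFF, 0xE0])  # first four bytes of a JPEG file
uri = to_data_uri(jpeg_header, "image/jpeg")
```

Base64 payloads that start with `/9j/` are JPEG data, which is a quick sanity check on the declared media type.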

Page 20: Proofs of concept

Demonstration PSI service at poseidon.cecs.anu.edu.au

Play 1.2-based service that exposes some classification and regression algorithms from scikit-learn, plus a simple ranking algorithm using scikit-learn

Demonstration JavaScript client at psi.cecs.anu.edu.au/demo

• HTML forms generated from PSI schema

• Predictor evaluation and comparison

An HTML to bag-of-words transformer in Python will be on GitHub soon

Page 21: Querying relations

Schema defines the elements of the query, their data type and can even include descriptions (which become hints here)
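
The idea of generating a query form from schema can be sketched in Python as below. The query schema shown is hypothetical (its property names are made up), and `form_fields` is a name introduced here; the sketch only relies on the standard JSON Schema members `properties`, `type` and `description`.

```python
def form_fields(query_schema):
    """Turn a JSON-schema object description into (name, type, hint) form fields."""
    fields = []
    for name, spec in query_schema.get("properties", {}).items():
        # The schema's description, when present, doubles as the form hint
        fields.append((name, spec.get("type", "any"), spec.get("description", "")))
    return fields

# Hypothetical query schema for a relation
query_schema = {
    "type": "object",
    "properties": {
        "species": {"type": "string", "description": "Only return instances of this species"},
        "limit": {"type": "integer", "description": "Maximum number of instances"},
    },
}
fields = form_fields(query_schema)
```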

Page 22: Constructing a learning task

Page 23: Evaluation via client-service interactions

Page 24: Future

Amazon AWS AMI of Play-based service planned, thanks to an Amazon AWS Education Grant

PSI provides the core of a flexible ML API that can be freely implemented

Security & authentication can be built on top

Can be offered as alternative interface to existing ML web services