REST::Neo4p - Talk @ DC Perl Mongers

1
REST::Neo4p
A Perl Driver for Neo4j
Mark A. Jensen
SRA International, Inc.
https://github.com/majensen/rest-neo4p.git

2
• Perler since 2000
• CPAN contributor (MAJENSEN) since 2009
• BioPerl Core Developer
• Scientific Project Director, The Cancer Genome Atlas Data Coordinating Center
• @thinkinator, LinkedIn

3
Motivation
• TCGA: Biospecimen, Clinical, Genomic Data
  – complex
  – growing
  – evolving technologies
  – evolving policies
  – need for precise accounting
• Customer suggested Neo4j
  – I wanted to play with it, but in Perl

4
Feeping Creaturism
• 2012 - Some Perl experiments out there, nothing complete
• Got excited
• People started using it
• Sucked into the open-source attractor
Maintenance
Glory

5
Dog Food I

6
Dog Food II

7
Woof!
Neo4p Classes

8
Design Goals
• "OGM" – Perl 5 objects backed by the graph
• User should never have to deal with a REST endpoint*
  *Unless she wants to.
• User should never have to deal with a Cypher query†
  †Unless he wants to.
• Robust enough for production code
  – System should approach complete coverage of the REST service
  – System should be robust to REST API changes and server backward-compatible (or at least version-aware)
• Take advantage of the self-describing features of the API

9
REST::Neo4p core objects
• Are Node, Relationship, Index
  – Index objects represent legacy (v1.0) indexes
  – v2.0 “background” indexes handled in Schema
• Are blessed scalar refs: "Inside-out object" pattern
  – the scalar value is the item ID (or index name)
  – For any object $obj, $$obj (the ID) is exactly what you need for constructing the API calls
• Are subclasses of Entity
  – Entity does the object table handling, JSON-to-object conversion and HTTP agent calls
  – Isolates most of the kludges necessary to handle the few API inconsistencies that exist(ed)
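The blessed-scalar-ref idea can be sketched in a few lines of plain Perl. This is only an illustration of the pattern, not the module's actual code: the Sketch::* package names, the %TABLE layout, the entity_type() helper and the base URL are all assumptions made for the example.

```perl
use strict;
use warnings;

package Sketch::Entity;

# Class-level object table: class => ID => per-object data.
# (Hypothetical layout, for illustration only.)
my %TABLE;

sub new {
    my ($class, $id, %data) = @_;
    $TABLE{$class}{$id} = {%data};
    # The object itself is just a blessed reference to a scalar holding the ID.
    return bless \$id, $class;
}

sub id { my $self = shift; return $$self }

sub get_property {
    my ($self, $key) = @_;
    return $TABLE{ ref $self }{$$self}{$key};
}

# $$self drops straight into the URL; the base URL is a placeholder.
sub url {
    my $self = shift;
    return join '/', 'http://localhost:7474/db/data', $self->entity_type, $$self;
}

package Sketch::Node;
our @ISA = ('Sketch::Entity');
sub entity_type { 'node' }

package main;

my $node = Sketch::Node->new(42, name => 'Fred');
print $node->url, "\n";    # http://localhost:7474/db/data/node/42
```

Because the object carries nothing but its ID, all other state lives in the class-level table, which is where a base class like Entity can manage it in one place.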

10
Auto-accessors
• You can treat properties as object fields if desired
• Caveat: this may not make sense for your application (not every node needs to have the same properties, but currently every object gets the accessors)
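One way such accessors could be generated is by installing subs into the class's symbol table the first time a property name is seen. The sketch below is self-contained and does not use REST::Neo4p; Sketch::Node and its hash-based storage are assumptions made for the example. It also shows the caveat: the accessor appears on every object of the class, not only the one that has the property.

```perl
use strict;
use warnings;

package Sketch::Node;

sub new { my ($class, %props) = @_; return bless { props => {%props} }, $class }

sub get_property { $_[0]{props}{ $_[1] } }

sub set_property {
    my ($self, $props) = @_;
    for my $name (keys %$props) {
        $self->{props}{$name} = $props->{$name};
        # Install accessors on the CLASS the first time a property name is
        # seen; never overwrite a method that already exists.
        no strict 'refs';
        next if defined &{ __PACKAGE__ . "::$name" };
        *{ __PACKAGE__ . "::$name" }     = sub { $_[0]->get_property($name) };
        *{ __PACKAGE__ . "::set_$name" } = sub { $_[0]->set_property({ $name => $_[1] }) };
    }
    return $self;
}

package main;

my $quark  = Sketch::Node->new;
my $lepton = Sketch::Node->new;

$quark->set_property({ flavor => 'strange' });
print $quark->flavor, "\n";                          # strange
print( $lepton->can('flavor') ? "yes\n" : "no\n" );  # yes: the accessor is class-wide
```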

11
Batch Calls
• Certain situations (e.g., database loading) make sense to batch: do many things in one API call rather than many single calls
• REST API provides this functionality
• How to make it "natural" in the context of working with objects?
  – Use Perl prototyping sugar to create a "batch block"

12
Example:
Rather than call the server for every line, you can mix in REST::Neo4p::Batch, and then use a batch {} block:
Calls within the block are collected and deferred

13
You can execute more complex logic within the batch block, and keep the objects beyond it:

14
But miracles are not yet implemented:
Object here doesn't really exist yet…

15
How does that work?
• Agent module isolates all bona fide calls
  – very few kludges to core object modules req'd
• batch() puts the agent into “batch mode” and executes wrapped code
  – agent stores incoming calls as JSON in a queue
• After wrapped code is executed, batch() switches agent back to normal mode and has it call the batch endpoint with the queue contents
• Batch processes the response and creates objects if requested
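The mechanism described above can be sketched without a server. The code is an illustration under stated assumptions, not the REST::Neo4p::Batch source: Sketch::Agent, its request() and flush_batch() methods and the @sent log are invented for the example, and no HTTP is performed. The job keys (id, method, to, body) follow the shape that the Neo4j REST batch endpoint accepts.

```perl
use strict;
use warnings;
use JSON::PP ();

package Sketch::Agent;

my $batch_mode = 0;
my @queue;     # deferred calls, in order
our @sent;     # what this sketch "sent to the server"

sub batch_mode { shift; $batch_mode = shift if @_; return $batch_mode }

sub request {
    my ($class, $method, $path, $body) = @_;
    if ($batch_mode) {
        # Batch mode: remember the call instead of making it.
        push @queue, { id => scalar @queue, method => $method, to => $path, body => $body };
        return;
    }
    push @sent, "$method $path";
    return;
}

sub flush_batch {
    my $class = shift;
    return unless @queue;
    my $payload = JSON::PP->new->canonical->encode([@queue]);
    @queue = ();
    push @sent, "POST /batch $payload";    # one call carrying every deferred job
    return;
}

package main;

# The (&@) prototype is the "sugar": callers can write  batch { ... } @options;
sub batch (&@) {
    my ($code, @opts) = @_;
    Sketch::Agent->batch_mode(1);
    my $ok  = eval { $code->(); 1 };
    my $err = $@;
    Sketch::Agent->batch_mode(0);          # always return to normal mode
    die $err unless $ok;
    Sketch::Agent->flush_batch;
    return;
}

batch {
    Sketch::Agent->request(POST => '/node', { name => "node_$_" }) for 1 .. 3;
};

print scalar(@Sketch::Agent::sent), "\n";    # 1
```

Three node creations inside the block become a single call. Turning the batch response back into objects is left out of the sketch.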

16
Batch Profiling
• Used Devel::NYTProf, nytprofhtml
  – Flame graph:
    Vertical: unique call stack (call on top is on CPU)
    Horizontal: relative time spent in that call stack configuration
    Color: makes it look like a flame

17
Batch Profiling

18
Batch Profiling
batch keep: 1.1 of 1.2 s
batch discard: 1.0 of 1.1 s
no batch: 13.0 of 13.9 s

19
Batch Profiling
Batch/keep
No batch

20
HTTP Agent

21
Agent
• Is transparent
  – But can always see it with REST::Neo4p->agent
  – Agent module alone meant to be useful and independent
• Elicits and uses the API self-discovery feature on connect()
• Isolates all HTTP requests and responses
• Captures and distinguishes API and HTTP errors
  – emits REST::Neo4p::Exceptions objects
• [Instance] Is a subclass of a "real" user agent:
  – LWP::UserAgent
  – Mojo::UserAgent, or
  – HTTP::Thin

22
Working within API Self-Description
Get first level of actions
Register actions
Get ‘data’ level of actions
Register more actions
Kludge around missing actions

23
Working within API Self-Description
• Get the list of actions with
  – $agent->available_actions
• And AUTOLOAD will provide (see pod for args):
  – $agent->get_<action>()
  – $agent->put_<action>()
  – $agent->post_<action>()
  – $agent->delete_<action>()
• Other accessors, e.g. node(), return the appropriate URL for your server
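How AUTOLOAD can turn a table of discovered actions into methods is sketched below. This is a stand-alone illustration, not the agent's real code: Sketch::Agent, the hard-coded action table (standing in for what discovery would return) and the treatment of arguments as URL path segments are assumptions, and no HTTP request is made. For the real argument conventions, see the pod.

```perl
use strict;
use warnings;

package Sketch::Agent;

our $AUTOLOAD;

sub new {
    my ($class, %actions) = @_;
    # action name => URL, as would be learned from the server's self-description
    return bless { actions => {%actions} }, $class;
}

sub available_actions { my $self = shift; return sort keys %{ $self->{actions} } }

sub AUTOLOAD {
    my ($self, @args) = @_;
    (my $name = $AUTOLOAD) =~ s/.*:://;
    return if $name eq 'DESTROY';

    # get_node, post_node, ...: split into HTTP verb and registered action.
    if (my ($verb, $action) = $name =~ /^(get|put|post|delete)_(\w+)$/) {
        my $url = $self->{actions}{$action}
            or die "No such action '$action'\n";
        # A real agent would issue the HTTP request here; the sketch only
        # returns a description of it.
        return join ' ', uc($verb), join('/', $url, @args);
    }

    # Bare action name: return the URL the server advertised.
    return $self->{actions}{$name} if exists $self->{actions}{$name};
    die "Unknown method '$name'\n";
}

package main;

my $agent = Sketch::Agent->new(
    node   => 'http://localhost:7474/db/data/node',
    cypher => 'http://localhost:7474/db/data/cypher',
);

print join(',', $agent->available_actions), "\n";   # cypher,node
print $agent->node, "\n";                           # http://localhost:7474/db/data/node
print $agent->get_node(42), "\n";                   # GET http://localhost:7474/db/data/node/42
```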

24
Agent Profiling
lwp: 2.5 of 2.7 s
mojo: 3.3 of 3.6 s
thin: 2.4 of 2.6 s

25
App-level Constraints

26
Use Case
You start out with a set of well-categorized things that have some well-defined relationships. Each thing will be represented as a node, that's fine. But,
You want to guarantee (to your client, for example) that
1. You can classify every node you add or read unambiguously into a well-defined group;
2. You never relate two nodes belonging to particular groups in a way that doesn't make sense according to your well-defined relationships.

27
Constrain/Constraint
• Now, v2.0 allows integrated Labels and unique constraints and prevents deletion of connected nodes, but…
• REST::Neo4p::Constrain - An add-in for constraining (or validating)
  – property values
  – connections (relationships) based on node properties
  – relationship types
  according to flexible specifications

28
Constrain/Constraint
• Multiple modes:
  – Automatic (throws exception if constraint violated)
  – Manual (validation function returns false if constraint violated)
  – Suspended (lift constraint processing when desired)
• Freeze/Thaw (in JSON) constraint specifications for reuse
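The three modes can be illustrated with a toy validator. Nothing here is the REST::Neo4p::Constrain interface: the Sketch::Constrain package, the $MODE variable, the %ALLOWED table and the 'group' key are all hypothetical, chosen only to show how one check can throw, return false, or be skipped.

```perl
use strict;
use warnings;

package Sketch::Constrain;

our $MODE = 'automatic';    # 'automatic' | 'manual' | 'suspended'

# Hypothetical specification: allowed (from-group, type, to-group) triples.
my %ALLOWED = map { $_ => 1 } ( 'person|OWNS|pet', 'pet|EATS|food' );

sub validate_relationship {
    my ($from, $type, $to) = @_;
    return 1 if $MODE eq 'suspended';                  # checks lifted
    my $key = join '|', $from->{group}, $type, $to->{group};
    return 1 if $ALLOWED{$key};
    die sprintf("Constraint violated: %s -[%s]-> %s\n",
                $from->{group}, $type, $to->{group})
        if $MODE eq 'automatic';                       # automatic: throw
    return 0;                                          # manual: report false
}

package main;

my ($alice, $rex) = ( { group => 'person' }, { group => 'pet' } );

print Sketch::Constrain::validate_relationship($alice, 'OWNS', $rex), "\n";   # 1

$Sketch::Constrain::MODE = 'manual';
print Sketch::Constrain::validate_relationship($rex, 'OWNS', $alice), "\n";   # 0
```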

30
Cypher Queries
• REST::Neo4p::Query takes a familiar, DBI-like approach
  – Prepare, execute, fetch
  – "rows" returned are arrays containing scalars, Node objects, and/or Relationship objects
  – If a query returns a path, a Path object (a simple container) is returned

31

32
Cypher Queries
• Prepare and execute with parameter substitutions
Do This!
Not This!
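The difference between the two approaches shows up in what would be sent to the server. The sketch below only builds request bodies with JSON::PP and does not call REST::Neo4p::Query; the query text and the variable names are made up for the example. It assumes the {name} placeholder syntax of Neo4j 2.0 Cypher and the query/params body of the REST Cypher endpoint.

```perl
use strict;
use warnings;
use JSON::PP ();

my $json = JSON::PP->new->canonical;
my $name = q{O'Brien" OR 1=1};    # hostile, or merely awkward, input

# Do this: the query text is constant; values travel separately as parameters.
my $good = $json->encode({
    query  => 'MATCH (n) WHERE n.name = {name} RETURN n',
    params => { name => $name },
});

# Not this: the value is spliced into the query text itself.
my $bad = $json->encode({
    query => "MATCH (n) WHERE n.name = '$name' RETURN n",
});

print "$good\n$bad\n";
```

With parameters the query string never changes, so quoting problems and injection through the value cannot alter the query.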

33
Cypher Queries
• Transactions are supported when you have v2.0.1 server or greater
  – started with REST::Neo4p->begin_work()
  – committed with REST::Neo4p->commit()
  – canceled with REST::Neo4p->rollback()
  (here, the class looks like the database handle in DBI, in fact…)

34
DBI – DBD::Neo4p
• Yes, you can really do this:

35
Glory!
Maintenance.

36
Future Directions/Contribution Ideas
• Get it onto GitHub https://github.com/majensen/rest-neo4p.git
• Make batch response parsing more efficient
  – e.g., don't stream if response is not huge
• Beautify and deodorize
• Completely touch-free testing
• Add traversal functionality
• Could Neo4p play together with DBIx::Class? (i.e., could it be a real OGM?)