REST::Neo4p - Talk @ DC Perl Mongers

Posted: 27-Aug-2014

Updated Slides for DC PerlMongers meetup - REST::Neo4p Perl driver for Neo4j graph database

Transcript of REST::Neo4p - Talk @ DC Perl Mongers

  • REST::Neo4p: A Perl Driver for Neo4j. Mark A. Jensen, SRA International, Inc. https://github.com/majensen/rest-neo4p.git
  • Perler since 2000; CPAN contributor (MAJENSEN) since 2009; BioPerl Core Developer; Scientific Project Director, The Cancer Genome Atlas Data Coordinating Center; @thinkinator, LinkedIn
  • Motivation
      • TCGA: biospecimen, clinical, and genomic data
          • complex
          • growing
          • evolving technologies
          • evolving policies
          • need for precise accounting
      • Customer suggested Neo4j
      • I wanted to play with it, but in Perl
  • Feeping Creaturism
      • 2012: some Perl experiments out there, nothing complete
      • Got excited
      • People started using it
      • Sucked into the open-source attractor: Maintenance, Glory
  • Dog Food I
  • Dog Food II
  • Woof! Neo4p Classes
  • Design Goals
      • "OGM": Perl 5 objects backed by the graph
      • User should never have to deal with a REST endpoint (unless she wants to)
      • User should never have to deal with a Cypher query (unless he wants to)
      • Robust enough for production code
      • System should approach complete coverage of the REST service
      • System should be robust to REST API changes and server backward-compatible (or at least version-aware)
      • Take advantage of the self-describing features of the API
  • REST::Neo4p core objects
      • Are Node, Relationship, Index
          • Index objects represent legacy (v1.0) indexes
          • v2.0 background indexes are handled in Schema
      • Are blessed scalar refs: "inside-out object" pattern
          • The scalar value is the item ID (or index name)
          • For any object $obj, $$obj (the ID) is exactly what you need for constructing the API calls
      • Are subclasses of Entity
          • Entity does the object table handling, JSON-to-object conversion, and HTTP agent calls
          • Isolates most of the kludges necessary to handle the few API inconsistencies that exist(ed)
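The blessed-scalar-ref idea can be illustrated with a minimal, self-contained sketch. The class name Sketch::Node, its methods, and the URL layout are assumptions made for illustration; this is not REST::Neo4p's actual code.

```perl
use strict;
use warnings;

package Sketch::Node;    # hypothetical class, not part of REST::Neo4p

# Inside-out storage: data lives in a class-level table keyed by ID,
# not inside the object itself.
my %TABLE;

sub new {
    my ($class, $id, $props) = @_;
    $TABLE{$id} = { %{ $props || {} } };
    # The object is just a blessed reference to a scalar holding the ID.
    return bless \$id, $class;
}

sub id { my $self = shift; return $$self }

sub get_property {
    my ($self, $key) = @_;
    return $TABLE{$$self}{$key};
}

# $$self is all that is needed to build a URL for an API call.
# The base URL and path layout are assumptions for this sketch.
sub url {
    my ($self, $base) = @_;
    return "$base/node/$$self";
}

package main;

my $n = Sketch::Node->new(42, { name => 'Fred' });
print $$n, "\n";                                         # the ID itself
print $n->get_property('name'), "\n";
print $n->url('http://127.0.0.1:7474/db/data'), "\n";
```

Because the object carries nothing but its ID, two references to the same ID always see the same property table entry.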
  • Auto-accessors
      • You can treat properties as object fields if desired
      • Caveat: this may not make sense for your application (not every node needs to have the same properties, but currently every object will possess the accessors)
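The caveat is easier to see in a small sketch. The following assumes a hypothetical Sketch::Entity class that installs a getter into the class the first time a property is set; it is not the module's implementation, only an illustration of why every object ends up with the accessor.

```perl
use strict;
use warnings;

package Sketch::Entity;    # hypothetical, for illustration only

my %PROPS;                 # object ID => property hash

sub new {
    my ($class, $id) = @_;
    $PROPS{$id} = {};
    return bless \$id, $class;
}

sub set_property {
    my ($self, %kv) = @_;
    for my $name (keys %kv) {
        $PROPS{$$self}{$name} = $kv{$name};
        # Install a getter named after the property, once per class.
        next if $self->can($name);
        no strict 'refs';
        *{ ref($self) . "::$name" } = sub { $PROPS{ ${ $_[0] } }{$name} };
    }
    return $self;
}

package main;

my $fred = Sketch::Entity->new(1);
my $rock = Sketch::Entity->new(2);
$fred->set_property(name => 'Fred');

print $fred->name, "\n";
# $rock never had a name set, yet the method exists for it too.
my $state = defined($rock->name) ? 'defined' : 'undef';
print "rock name is $state\n";
```

The accessor is attached to the class, not to the individual node, which is the behavior the caveat warns about.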
  • Batch Calls
      • Certain situations (database loading, e.g.) make sense to batch: do many things in one API call rather than many single calls
      • REST API provides this functionality
      • How to make it "natural" in the context of working with objects?
      • Use Perl prototyping sugar to create a "batch block"
  • Example: rather than call the server for every line, you can mix in REST::Neo4p::Batch and then use a batch {} block. Calls within the block are collected and deferred.
  • You can execute more complex logic within the batch block, and keep the objects beyond it.
  • But miracles are not yet implemented: an object created inside the block doesn't really exist yet.
  • How does that work?
      • Agent module isolates all bona fide calls
          • very few kludges to core object modules required
      • batch() puts the agent into batch mode and executes wrapped code
          • agent stores incoming calls as JSON in a queue
      • After wrapped code is executed, batch() switches agent back to normal mode and has it call the batch endpoint with the queue contents
      • Batch processes the response and creates objects if requested
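The mechanism described above can be sketched offline. Everything here is an assumption for illustration: Sketch::Agent is a stand-in that records requests instead of doing HTTP, and this batch() ignores its second argument, whereas the real module uses it to decide whether to keep objects.

```perl
use strict;
use warnings;
use JSON::PP;

package Sketch::Agent;     # hypothetical stand-in for the HTTP agent

my $BATCH_MODE = 0;
my @QUEUE;                 # deferred calls, stored as JSON strings
my @SENT;                  # requests that "reached the server" in this demo

sub batch_mode    { my ($class, $on) = @_; $BATCH_MODE = $on }
sub requests_sent { return scalar @SENT }

sub post {
    my ($class, $path, $body) = @_;
    my $call = { method => 'POST', to => $path, body => $body };
    if ($BATCH_MODE) {
        $call->{id} = scalar @QUEUE;
        push @QUEUE, JSON::PP->new->canonical->encode($call);
        # Placeholder only: the server-side object does not exist yet.
        return '{' . $call->{id} . '}';
    }
    push @SENT, [$call];   # normal mode: one request per call
    return scalar @SENT;
}

sub flush_batch {
    my ($class) = @_;
    my @jobs = map { JSON::PP->new->decode($_) } @QUEUE;
    push @SENT, \@jobs if @jobs;   # a single request carrying every job
    @QUEUE = ();
    return scalar @jobs;
}

package main;

# The (&@) prototype lets callers write: batch { ... } 'some_option';
sub batch (&@) {
    my ($code, $option) = @_;      # $option is ignored in this sketch
    Sketch::Agent->batch_mode(1);
    my $ok  = eval { $code->(); 1 };
    my $err = $@;
    Sketch::Agent->batch_mode(0);  # always switch back to normal mode
    die $err unless $ok;
    return Sketch::Agent->flush_batch;
}

my $count = batch {
    Sketch::Agent->post('/node', { name => $_ }) for qw(alice bob carol);
} 'discard_objs';

print "jobs: $count, requests: ", Sketch::Agent->requests_sent(), "\n";
```

Three calls inside the block result in one recorded request, which is the saving the profiling slides measure.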
  • Batch Profiling
      • Used Devel::NYTProf, nytprofhtml
      • Flame graph:
          • Vertical: unique call stack (call on top is on CPU)
          • Horizontal: relative time spent in that call stack configuration
          • Color: makes it look like a flame
  • Batch Profiling (figure)
  • Batch Profiling
      • batch keep: 1.1 of 1.2 s
      • batch discard: 1.0 of 1.1 s
      • no batch: 13.0 of 13.9 s
  • Batch Profiling (figure: batch/keep vs. no batch)
  • HTTP Agent
  • Agent
      • Is transparent
          • But can always see it with REST::Neo4p->agent
          • Agent module alone meant to be useful and independent
      • Elicits and uses the API self-discovery feature on connect()
      • Isolates all HTTP requests and responses
      • Captures and distinguishes API and HTTP errors
          • emits REST::Neo4p::Exceptions objects
      • [Instance] Is a subclass of a "real" user agent: LWP::UserAgent, Mojo::UserAgent, or HTTP::Thin
  • Working within API Self-Description
      • Get first level of actions
      • Register actions
      • Get data level of actions
      • Register more actions
      • Kludge around missing actions
  • Working within API Self-Description
      • Get the list of actions with $agent->available_actions
      • And AUTOLOAD will provide (see pod for args):
          • $agent->get_<action>()
          • $agent->put_<action>()
          • $agent->post_<action>()
          • $agent->delete_<action>()
      • Other accessors, e.g. node(), return the appropriate URL for your server
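How AUTOLOAD can turn a table of discovered actions into verb_action methods is sketched below. Sketch::DiscoveryAgent, its hard-coded action table, and the string it returns are assumptions; the real agent gets its table from the server and issues HTTP requests.

```perl
use strict;
use warnings;

package Sketch::DiscoveryAgent;   # hypothetical; the real agent's internals may differ

our $AUTOLOAD;

sub new {
    my ($class, $root) = @_;
    # In a real agent these would come from the server's self-description;
    # here they are hard-coded assumptions.
    my %actions = (
        node         => "$root/node",
        relationship => "$root/relationship",
        cypher       => "$root/cypher",
    );
    return bless { actions => \%actions }, $class;
}

sub available_actions {
    my ($self) = @_;
    return sort keys %{ $self->{actions} };
}

sub AUTOLOAD {
    my ($self, @args) = @_;
    my $name = $AUTOLOAD;
    $name =~ s/.*:://;
    return if $name eq 'DESTROY';

    # Plain action name: return the URL for this server.
    return $self->{actions}{$name} if exists $self->{actions}{$name};

    # verb_action: look the action up and build the request.
    my ($verb, $action) = $name =~ /^(get|put|post|delete)_(\w+)$/
        or die "No such method: $name\n";
    my $url = $self->{actions}{$action}
        or die "Action '$action' is not offered by the server\n";

    # Return the request line instead of doing HTTP, so the sketch runs offline.
    return join ' ', uc($verb), join('/', $url, @args);
}

package main;

my $agent = Sketch::DiscoveryAgent->new('http://127.0.0.1:7474/db/data');
print join(', ', $agent->available_actions), "\n";
print $agent->get_node(42), "\n";
print $agent->node, "\n";
```

A method for an action the server never advertised fails at call time, which keeps the client honest about what the connected server version supports.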
  • Agent Profiling
      • lwp: 2.5 of 2.7 s
      • mojo: 3.3 of 3.6 s
      • thin: 2.4 of 2.6 s
  • App-level Constraints
  • Use Case
      • You start out with a set of well-categorized things that have some well-defined relationships. Each thing will be represented as a node; that's fine.
      • But you want to guarantee (to your client, for example) that:
          1. You can classify every node you add or read unambiguously into a well-defined group;
          2. You never relate two nodes belonging to particular groups in a way that doesn't make sense according to your well-defined relationships.
  • Constrain/Constraint
      • Now, v2.0 allows integrated Labels and unique constraints and prevents deletion of connected nodes, but...
      • REST::Neo4p::Constrain: an add-in for constraining (or validating), according to flexible specifications:
          • property values
          • connections (relationships) based on node properties
          • relationship types
  • Constrain/Constraint
      • Multiple modes:
          • Automatic (throws exception if constraint violated)
          • Manual (validation function returns false if constraint violated)
          • Suspended (lift constraint processing when desired)
      • Freeze/Thaw (in JSON) constraint specifications for reuse
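The three modes and the JSON freeze/thaw can be illustrated with a toy validator. The specification layout, the function names classify() and validate_relationship(), and the $mode variable are all hypothetical; REST::Neo4p::Constrain has its own interface, described in its POD.

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical specification: node groups defined by property values, and
# the relationship types allowed between groups.
my %spec = (
    groups    => { owner => { species => 'human' }, pet => { species => 'dog' } },
    relations => [ { type => 'OWNS', from => 'owner', to => 'pet' } ],
);

my $mode = 'manual';    # 'automatic', 'manual', or 'suspended'

# Classify a property hash into exactly one group, or return undef.
sub classify {
    my ($props) = @_;
    my @hits;
    GROUP: for my $group (sort keys %{ $spec{groups} }) {
        my $want = $spec{groups}{$group};
        for my $key (keys %$want) {
            next GROUP
                unless defined($props->{$key}) && $props->{$key} eq $want->{$key};
        }
        push @hits, $group;
    }
    return @hits == 1 ? $hits[0] : undef;
}

sub validate_relationship {
    my ($from, $type, $to) = @_;
    return 1 if $mode eq 'suspended';
    my ($f, $t) = (classify($from), classify($to));
    my $ok = 0;
    if (defined $f && defined $t) {
        $ok = grep { $_->{type} eq $type && $_->{from} eq $f && $_->{to} eq $t }
            @{ $spec{relations} };
    }
    die "constraint violated: $type\n" if !$ok && $mode eq 'automatic';
    return $ok ? 1 : 0;
}

my %alice = (species => 'human', name => 'Alice');
my %rex   = (species => 'dog',   name => 'Rex');

print validate_relationship(\%alice, 'OWNS', \%rex), "\n";    # manual mode: allowed
print validate_relationship(\%rex, 'OWNS', \%alice), "\n";    # manual mode: refused

$mode = 'automatic';
eval { validate_relationship(\%rex, 'OWNS', \%alice) };
print "caught: $@";

# Freeze the specification to JSON and thaw it again for reuse.
my $frozen = JSON::PP->new->canonical->encode(\%spec);
my $thawed = decode_json($frozen);
print "thawed relation type: $thawed->{relations}[0]{type}\n";
```

Keeping the specification as plain data is what makes the freeze/thaw step trivial: the same rules can be stored alongside the application and reloaded later.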
  • Open the POD now, HAL.
  • Cypher Queries
      • REST::Neo4p::Query takes a familiar, DBI-like approach: prepare, execute, fetch
      • "Rows" returned are arrays containing scalars, Node objects, and/or Relationship objects
      • If a query returns a path, a Path object (a simple container) is returned
  • Cypher Queries
      • Prepare and execute with parameter substitutions (slide examples labeled "Do This!" and "Not This!")
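The slide's two code examples are not in the transcript. As a rough sketch of the usual reasoning, the following builds the JSON body that a client might send to a Cypher endpoint in both styles. The helper names are hypothetical, and the {name} placeholder syntax and the query/params keys are assumed to match the Neo4j 2.x-era REST interface.

```perl
use strict;
use warnings;
use JSON::PP;

my $json = JSON::PP->new->canonical;

# Parameterized: the query text is constant; values travel separately.
sub payload_with_params {
    my ($name) = @_;
    return $json->encode({
        query  => 'MATCH (n) WHERE n.name = {name} RETURN n',
        params => { name => $name },
    });
}

# Interpolated: the value becomes part of the query text itself.
sub payload_interpolated {
    my ($name) = @_;
    return $json->encode({
        query => "MATCH (n) WHERE n.name = '$name' RETURN n",
    });
}

my $hostile = "x' OR '1'='1";
print payload_with_params($hostile), "\n";
print payload_interpolated($hostile), "\n";
```

With parameters, the query text stays identical from call to call, so the server can reuse its plan and the value cannot alter the query's structure. With interpolation, the hostile value rewrites the WHERE clause.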
  • Cypher Queries
      • Transactions are supported when you have a v2.0.1 server or greater
          • started with REST::Neo4p->begin_work()
          • committed with REST::Neo4p->commit()
          • canceled with REST::Neo4p->rollback()
      • (Here, the class looks like the database handle in DBI, in fact.)
  • DBI: DBD::Neo4p. Yes, you can really do this.
  • Glory! Maintenance.
  • Future Directions/Contribution Ideas
      • Get it onto GitHub: https://github.com/majensen/rest-neo4p.git
      • Make batch response parsing more efficient (e.g., don't stream if response is not huge)
      • Beautify and deodorize
      • Completely touch-free testing
      • Add traversal functionality
      • Could Neo4p play together with DBIx::Class? (i.e., could it be a real OGM?)
  • Thanks! https://github.com/majensen/rest-neo4p.git