HypergraphDB

description

A gentle introduction to the HypergraphDB database.

Transcript of HypergraphDB

Page 1: HypergraphDB

HypergraphDB

NDBI040

Jan Drozen

http://www.ms.mff.cuni.cz/~drozenj

Page 2: HypergraphDB

HypergraphDB (HGDB)

• open-source

• graph-oriented database

• embedded

• higher-order relationships

• queries and traversals

• indices

• transactions

• distribution

Page 3: HypergraphDB

Hypergraph

• a hypergraph "is a family of sets over a universal set of vertices V"

• a generalization of an undirected graph in which an edge can connect any number of vertices

Page 4: HypergraphDB

Data model

• basic unit is an atom

• each atom has an associated tuple of atoms called its target set

• the size of the target set is the atom's arity

• atoms of arity 0 are nodes, all other atoms are links

• for an atom x, the set of atoms that have x in their target set is the incidence set of x

• i.e. the set of links pointing to x

• each atom has a value

• each value has a type
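The definitions above can be sketched in a few lines of plain Java. This is only an illustrative in-memory model, not the HypergraphDB API: the class names `AtomSketch` and `Atom` and their members are assumptions made for this example, and the Java class of the value stands in for the atom's type.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical in-memory model of the atom concept; not the HypergraphDB API. */
final class AtomSketch {

    static final class Atom {
        final Object value;                                 // each atom has a value
        final List<Atom> targets;                           // target set, kept as an ordered tuple
        final Set<Atom> incidence = new LinkedHashSet<>();  // links pointing to this atom

        Atom(Object value, Atom... targets) {
            this.value = value;
            this.targets = List.of(targets);
            for (Atom t : targets) {
                t.incidence.add(this);                      // register this link with every target
            }
        }

        int arity() { return targets.size(); }              // size of the target set

        boolean isNode() { return arity() == 0; }           // arity 0 means node, otherwise link

        Class<?> type() { return value.getClass(); }        // assumption: Java class stands in for the type
    }

    public static void main(String[] args) {
        Atom alice = new Atom("Alice");
        Atom bob = new Atom("Bob");
        Atom carol = new Atom("Carol");
        Atom friends = new Atom("friendship", alice, bob, carol); // one link, three targets
        System.out.println(friends.arity());
        System.out.println(alice.incidence.contains(friends));
    }
}
```

Because a link is itself an atom, it can appear in the target set of another link, which is how higher-order relationships arise.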

Page 5: HypergraphDB

Storage architecture

• independent of the physical storage

• needs a key-value store with indexing

• uses BerkeleyDB

• two layers

• primitive storage layer

• model layer

Page 6: HypergraphDB

Primitive storage layer

• low-level storage

• graph of identities and raw data

• consists of two key-value stores

• LinkStore: ID->List<ID>

• DataStore: ID->List<Byte>

• an ID is a cryptographically strong unique identifier (UID)

• makes collisions practically impossible

• a type 4 (random) UUID
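A minimal sketch of the two key-value stores, assuming plain in-memory maps in place of BerkeleyDB. The class and method names (`PrimitiveStoreSketch`, `storeData`, `storeLink`, and so on) are made up for illustration. `java.util.UUID.randomUUID()` produces a type 4 UUID from a cryptographically strong random source.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

/** Hypothetical stand-in for the primitive storage layer; not the HypergraphDB API. */
final class PrimitiveStoreSketch {
    private final Map<UUID, List<UUID>> linkStore = new HashMap<>(); // LinkStore: ID -> List<ID>
    private final Map<UUID, byte[]> dataStore = new HashMap<>();     // DataStore: ID -> raw bytes

    /** Stores raw bytes under a fresh type 4 UUID and returns that ID. */
    UUID storeData(byte[] raw) {
        UUID id = UUID.randomUUID();
        dataStore.put(id, raw.clone());
        return id;
    }

    /** Stores a list of IDs under a fresh type 4 UUID and returns that ID. */
    UUID storeLink(List<UUID> ids) {
        UUID id = UUID.randomUUID();
        linkStore.put(id, List.copyOf(ids));
        return id;
    }

    /** Returns a copy of the stored bytes, or null if the ID is unknown. */
    byte[] getData(UUID id) {
        byte[] raw = dataStore.get(id);
        return raw == null ? null : raw.clone();
    }

    /** Returns the stored ID list, or null if the ID is unknown. */
    List<UUID> getLink(UUID id) {
        return linkStore.get(id);
    }
}
```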

Page 7: HypergraphDB

Model layer

• atoms, type system, caching, indexing, queries

• formalizes the layout of the primitive storage

• AtomID -> [type, value, {target set}]

• ValueID -> List<ID> | List<Byte>

• ValueID can form complex structures

• core indices needed: UUID -> SortedSet<UUID>

• IncidenceIndex

• maps a hypergraph atom to the set of all links pointing to it

• TypeIndex

• maps a type atom to the set of all its instance atoms

• ValueIndex

• maps a top-level value structure to the set of atoms with this value
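The atom layout and the three core indices can be sketched as follows. This is a simplified illustration, not HypergraphDB code: `ModelLayerSketch` and `addAtom` are hypothetical names, and the value index here is keyed by the value's ID, which is an assumption made to keep the example short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.UUID;

/** Hypothetical sketch of the model-layer layout and core indices. */
final class ModelLayerSketch {
    final Map<UUID, List<UUID>> linkStore = new HashMap<>();           // AtomID -> [type, value, targets...]
    final Map<UUID, SortedSet<UUID>> incidenceIndex = new HashMap<>(); // atom -> links pointing to it
    final Map<UUID, SortedSet<UUID>> typeIndex = new HashMap<>();      // type atom -> instance atoms
    final Map<UUID, SortedSet<UUID>> valueIndex = new HashMap<>();     // value -> atoms with this value

    /** Writes the atom layout and updates all three core indices. */
    UUID addAtom(UUID typeId, UUID valueId, UUID... targets) {
        UUID atomId = UUID.randomUUID();
        List<UUID> layout = new ArrayList<>(List.of(typeId, valueId));
        layout.addAll(Arrays.asList(targets));
        linkStore.put(atomId, layout);
        index(typeIndex, typeId, atomId);
        index(valueIndex, valueId, atomId);
        for (UUID target : targets) {
            index(incidenceIndex, target, atomId);
        }
        return atomId;
    }

    /** Returns the incidence set of an atom (empty for atoms no link points to). */
    SortedSet<UUID> incidenceSet(UUID atomId) {
        return incidenceIndex.getOrDefault(atomId, Collections.emptySortedSet());
    }

    private static void index(Map<UUID, SortedSet<UUID>> idx, UUID key, UUID atomId) {
        idx.computeIfAbsent(key, k -> new TreeSet<>()).add(atomId);
    }
}
```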

Page 8: HypergraphDB

Architecture

Page 9: HypergraphDB

Types

• programming-language neutral

• a type maps data values to and from permanent storage

• a type is an atom too

• a type can store instances in storage, construct them from storage, and remove them from storage

• subtype/supertype relationships

Page 10: HypergraphDB

Type system

• is bootstrapped from basic types

• predefined numbers, strings, records, lists, maps

• HGAtomType interface

• implemented by every type atom

• has an Object make(…) method

• a type constructor is a type atom whose make method returns an HGAtomType instance

• the record type constructor manages records

• the parts of a single record are managed recursively

• as atoms

• as values
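To illustrate how a type stores, constructs, and removes instances, and how a record's parts are handled recursively, here is a minimal sketch. The `AtomType` interface below is a simplified, hypothetical stand-in and does not reproduce the signatures of the real HGAtomType; `Store`, `STRING`, and `RecordType` are likewise made up for this example, and creating a `RecordType` plays the role of the record type constructor.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

/** Hypothetical, simplified type system sketch; not the HypergraphDB API. */
final class TypeSketch {

    static final class Store {
        final Map<UUID, byte[]> data = new HashMap<>();       // raw values
        final Map<UUID, List<UUID>> links = new HashMap<>();  // composite value layouts
    }

    interface AtomType {
        UUID store(Object instance, Store s);  // write an instance, return its value ID
        Object make(UUID valueId, Store s);    // construct an instance from storage
        void release(UUID valueId, Store s);   // remove an instance from storage
    }

    /** A primitive type: strings stored as UTF-8 bytes. */
    static final AtomType STRING = new AtomType() {
        public UUID store(Object instance, Store s) {
            UUID id = UUID.randomUUID();
            s.data.put(id, ((String) instance).getBytes(StandardCharsets.UTF_8));
            return id;
        }
        public Object make(UUID id, Store s) {
            return new String(s.data.get(id), StandardCharsets.UTF_8);
        }
        public void release(UUID id, Store s) {
            s.data.remove(id);
        }
    };

    /** A record type: each part is stored through its own type, recursively. */
    static final class RecordType implements AtomType {
        private final LinkedHashMap<String, AtomType> parts;

        RecordType(LinkedHashMap<String, AtomType> parts) { this.parts = parts; }

        public UUID store(Object instance, Store s) {
            Map<?, ?> record = (Map<?, ?>) instance;
            List<UUID> partIds = new ArrayList<>();
            for (Map.Entry<String, AtomType> p : parts.entrySet()) {
                partIds.add(p.getValue().store(record.get(p.getKey()), s));
            }
            UUID id = UUID.randomUUID();
            s.links.put(id, partIds);
            return id;
        }

        public Object make(UUID id, Store s) {
            Map<String, Object> record = new LinkedHashMap<>();
            Iterator<UUID> ids = s.links.get(id).iterator();
            for (Map.Entry<String, AtomType> p : parts.entrySet()) {
                record.put(p.getKey(), p.getValue().make(ids.next(), s));
            }
            return record;
        }

        public void release(UUID id, Store s) {
            Iterator<UUID> ids = s.links.remove(id).iterator();
            for (AtomType partType : parts.values()) {
                partType.release(ids.next(), s);
            }
        }
    }
}
```

Since `RecordType` is itself an `AtomType`, a record part may be another record, which gives the recursive management of parts.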

Page 11: HypergraphDB

Java typing

Page 12: HypergraphDB

Indices

• additional indices can be created

• maintained at the primitive layer

• handled by the type implementation

• and at the model layer too

• always associated with atom types (and sub-types)

• interface HGIndexer

• instances are atoms

• produces a key for a given atom

• predefined indexers

• ByPartIndexer, ByTargetIndexer, CompositeIndexer, LinkIndexer, TargetToTargetIndexer
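The idea of an indexer that "produces a key for a given atom" can be sketched with a plain function. This is not the HGIndexer interface: `IndexerSketch` and its methods are hypothetical, and atoms are represented as simple maps for brevity. Indexing by one field of a record is loosely analogous to what the name ByPartIndexer suggests.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.function.Function;

/** Hypothetical index driven by a key-producing function; not the HypergraphDB API. */
final class IndexerSketch<A, K extends Comparable<K>> {
    private final Function<A, K> indexer;                        // produces a key for a given atom
    private final SortedMap<K, Set<A>> entries = new TreeMap<>(); // key -> atoms with that key

    IndexerSketch(Function<A, K> indexer) {
        this.indexer = indexer;
    }

    /** Called when an atom is added to the graph. */
    void add(A atom) {
        entries.computeIfAbsent(indexer.apply(atom), k -> new LinkedHashSet<>()).add(atom);
    }

    /** Called when an atom is removed from the graph; drops empty entries. */
    void remove(A atom) {
        K key = indexer.apply(atom);
        Set<A> atoms = entries.get(key);
        if (atoms != null && atoms.remove(atom) && atoms.isEmpty()) {
            entries.remove(key);
        }
    }

    /** Returns all indexed atoms with the given key. */
    Set<A> find(K key) {
        return entries.getOrDefault(key, Collections.emptySet());
    }
}
```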

Page 13: HypergraphDB

Queries

• traversal

• depth-first (DF) or breadth-first (BF)

• adjacency

• depends on atom type and traversal direction

• predicate match

• matched atoms are not necessarily linked

• pattern matching of graph structures

• needs a special query language (such as SPARQL)
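A depth-first or breadth-first traversal over a hypergraph can be sketched as below. The sketch makes a simplifying assumption about adjacency: two nodes are adjacent whenever they share a link, regardless of atom type or direction. `TraversalSketch` and `traverse` are hypothetical names, and links are plain lists of node names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical BF/DF traversal over a list of hyperedges; not the HypergraphDB API. */
final class TraversalSketch {

    /** Returns nodes in visiting order, starting from (and including) start. */
    static List<String> traverse(List<List<String>> links, String start, boolean breadthFirst) {
        Deque<String> frontier = new ArrayDeque<>();
        Set<String> visited = new LinkedHashSet<>();
        frontier.addLast(start);
        while (!frontier.isEmpty()) {
            // queue behaviour for BF, stack behaviour for DF
            String current = breadthFirst ? frontier.pollFirst() : frontier.pollLast();
            if (!visited.add(current)) {
                continue; // already visited through another link
            }
            for (List<String> link : links) {
                if (link.contains(current)) {
                    for (String neighbour : link) {
                        if (!visited.contains(neighbour)) {
                            frontier.addLast(neighbour);
                        }
                    }
                }
            }
        }
        return new ArrayList<>(visited);
    }
}
```

A filter on link type or on the position of the current node within the link could be added inside the loop to model adjacency that depends on atom type and direction.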

Page 14: HypergraphDB

Predicate match

• set-oriented queries

• set of query primitives:

• eq(x), lt(x), eq("name", x): compare the atom's value

• target(LinkID): the atom belongs to the target set of LinkID

• incident(TargetID): the atom points to TargetID

• arity(n): the arity of the atom is n

• and, or, not

• …

• lazy evaluation
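The query primitives compose like ordinary predicates. The sketch below mimics a few of them with `java.util.function.Predicate` and uses a Java stream for lazy evaluation: nothing is tested until a terminal operation pulls results. The class `QuerySketch` and its methods only borrow the names from the list above; they are assumptions for illustration and not the HypergraphDB query API.

```java
import java.util.Collection;
import java.util.List;
import java.util.Objects;
import java.util.function.Predicate;
import java.util.stream.Stream;

/** Hypothetical predicate-style queries over an in-memory atom collection. */
final class QuerySketch {

    static final class Atom {
        final Object value;
        final List<Atom> targets;

        Atom(Object value, Atom... targets) {
            this.value = value;
            this.targets = List.of(targets);
        }
    }

    /** The atom's value equals x. */
    static Predicate<Atom> eq(Object x) {
        return a -> Objects.equals(a.value, x);
    }

    /** The arity of the atom is n. */
    static Predicate<Atom> arity(int n) {
        return a -> a.targets.size() == n;
    }

    /** The atom belongs to the target set of the given link. */
    static Predicate<Atom> target(Atom link) {
        return a -> link.targets.contains(a);
    }

    /** The atom points to the given target atom. */
    static Predicate<Atom> incident(Atom targetAtom) {
        return a -> a.targets.contains(targetAtom);
    }

    /** Lazy result set: the condition runs only as results are consumed. */
    static Stream<Atom> find(Collection<Atom> graph, Predicate<Atom> condition) {
        return graph.stream().filter(condition);
    }
}
```

The and, or, and not combinators correspond to `Predicate.and`, `Predicate.or`, and `Predicate.negate`, for example `find(graph, incident(x).and(arity(2)))`.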

Page 15: HypergraphDB

Transactions

• multiversion consistency check

• ACI by default (atomicity, consistency, isolation, but no durability)

• upon failure, committed data may be lost

• transaction nesting

• auto-transactions (for updates)
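One common way to realize a version-based consistency check is optimistic: a transaction remembers the version it read and commits only if that version is still current, retrying otherwise. The sketch below shows that general technique on a single value. It is an assumption about the flavour of the check, not a description of HypergraphDB's internals, and `TxSketch` with its methods is hypothetical.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

/** Hypothetical optimistic, version-checked cell; not the HypergraphDB API. */
final class TxSketch<T> {

    static final class Versioned<V> {
        final long version;
        final V value;

        Versioned(long version, V value) {
            this.version = version;
            this.value = value;
        }
    }

    private final AtomicReference<Versioned<T>> cell;

    TxSketch(T initial) {
        cell = new AtomicReference<>(new Versioned<>(0, initial));
    }

    /** Takes a snapshot of the current version and value. */
    Versioned<T> read() {
        return cell.get();
    }

    /** Commits only if nobody committed since the snapshot was taken. */
    boolean tryCommit(Versioned<T> snapshot, T newValue) {
        return cell.compareAndSet(snapshot, new Versioned<>(snapshot.version + 1, newValue));
    }

    /** Runs the update against a snapshot and retries on conflict. */
    T transact(UnaryOperator<T> update) {
        while (true) {
            Versioned<T> snapshot = read();
            T result = update.apply(snapshot.value);
            if (tryCommit(snapshot, result)) {
                return result;
            }
        }
    }
}
```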

Page 16: HypergraphDB

Distribution

• implemented at model layer

• peer-to-peer

• Agent Communication Language

• propose, accept, inform, request, query,…

• total availability is not guaranteed

• eventually consistent

• upon startup, each agent broadcasts its interest in certain atoms (sending subscribe)

• each peer listens to atom events; after an update, addition, or removal, it notifies interested peers (sending inform)

• local transactions are linearly ordered by a version number and logged (this ensures consistency and lets them reach all interested peers)

• a peer that receives a transaction notification must acknowledge it and decide whether to enact the transaction locally
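The subscribe and inform flow can be sketched with in-process objects in place of networked peers. Everything here is hypothetical (`PeerSketch`, `Peer`, `subscribeTo`, `update`, `inform`); real peers exchange messages asynchronously, while this sketch calls methods directly to show the order of steps.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

/** Hypothetical in-process model of subscribe/inform between peers. */
final class PeerSketch {

    static final class Peer {
        final String name;
        final Predicate<String> accepts;                            // whether to enact a remote change
        final Map<String, Object> atoms = new HashMap<>();          // local atom values by ID
        final Map<String, Set<Peer>> interested = new HashMap<>();  // atom ID -> subscribed peers
        final List<String> log = new ArrayList<>();                 // ordered log of local transactions
        private long version = 0;

        Peer(String name, Predicate<String> accepts) {
            this.name = name;
            this.accepts = accepts;
        }

        /** "subscribe": announce interest in an atom to another peer. */
        void subscribeTo(Peer other, String atomId) {
            other.interested.computeIfAbsent(atomId, k -> new LinkedHashSet<>()).add(this);
        }

        /** Local change: version it, log it, then "inform" interested peers. Returns the ack count. */
        int update(String atomId, Object value) {
            atoms.put(atomId, value);
            version++;
            log.add(version + ":" + atomId);
            int acks = 0;
            for (Peer peer : interested.getOrDefault(atomId, Set.of())) {
                if (peer.inform(atomId, value)) {
                    acks++;
                }
            }
            return acks;
        }

        /** Always acknowledges; enacts the change locally only if this peer accepts it. */
        boolean inform(String atomId, Object value) {
            if (accepts.test(atomId)) {
                atoms.put(atomId, value);
            }
            return true;
        }
    }
}
```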

Page 17: HypergraphDB

DEMO

• assume the following situation:

• a library contains books, every book has an author, people can borrow books, and there can be friendships between people

Schema (from the slide diagram):

• Book: Name, Page count

• Human

• Author: First name, Last name, Nationality

• Reader: First name, Last name

• links: lent, friendship, written by

Page 18: HypergraphDB

Queries

• we can now query the database:

• set-oriented queries:

• find all books by an author X

• find all books currently lent to a friend of a person X

• traversal-oriented:

• get all people connected to me through my friends
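The three example queries can be tried out on a tiny in-memory model of the library. This sketch does not use HypergraphDB itself: `LibraryDemoSketch`, its `Link` class, and its methods are hypothetical, atoms are identified by plain strings, and the link kinds follow the demo schema (written by, lent, friendship).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical in-memory version of the library demo; not the HypergraphDB API. */
final class LibraryDemoSketch {

    static final class Link {
        final String kind;
        final List<String> targets;

        Link(String kind, String... targets) {
            this.kind = kind;
            this.targets = List.of(targets);
        }
    }

    private final List<Link> links = new ArrayList<>();

    void writtenBy(String book, String author) { links.add(new Link("written by", book, author)); }
    void lent(String book, String reader) { links.add(new Link("lent", book, reader)); }
    void friendship(String a, String b) { links.add(new Link("friendship", a, b)); }

    /** Set-oriented: all books of an author. */
    Set<String> booksOf(String author) {
        Set<String> result = new TreeSet<>();
        for (Link l : links) {
            if (l.kind.equals("written by") && l.targets.get(1).equals(author)) {
                result.add(l.targets.get(0));
            }
        }
        return result;
    }

    /** Direct friends of a person (friendship is treated as undirected). */
    Set<String> friendsOf(String person) {
        Set<String> result = new TreeSet<>();
        for (Link l : links) {
            if (l.kind.equals("friendship") && l.targets.contains(person)) {
                for (String t : l.targets) {
                    if (!t.equals(person)) result.add(t);
                }
            }
        }
        return result;
    }

    /** Set-oriented: all books currently lent to a friend of a person. */
    Set<String> booksLentToFriendsOf(String person) {
        Set<String> friends = friendsOf(person);
        Set<String> result = new TreeSet<>();
        for (Link l : links) {
            if (l.kind.equals("lent") && friends.contains(l.targets.get(1))) {
                result.add(l.targets.get(0));
            }
        }
        return result;
    }

    /** Traversal-oriented: everyone reachable through chains of friendships. */
    Set<String> connectedViaFriends(String person) {
        Set<String> seen = new TreeSet<>();
        Deque<String> queue = new ArrayDeque<>(List.of(person));
        while (!queue.isEmpty()) {
            for (String friend : friendsOf(queue.poll())) {
                if (!friend.equals(person) && seen.add(friend)) {
                    queue.add(friend);
                }
            }
        }
        return seen;
    }
}
```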

Page 19: HypergraphDB

References

• http://www.hypergraphdb.org

• official website

• http://code.google.com/p/hypergraphdb/

• official Google code repository and Wiki

Page 20: HypergraphDB

Thank you!