The Entity Registry System @ Verisign Labs, 2013
![Page 1: The Entity Registry System @ Verisign Labs, 2013](https://reader035.fdocuments.net/reader035/viewer/2022062703/554e8d56b4c905fc368b4a57/html5/thumbnails/1.jpg)
The Promise of a Better Connected Digital World: Data Registry Systems Without the Web
Philippe Cudré-Mauroux, eXascale Infolab, University of Fribourg, Switzerland
Christophe Guéret, VU University / DANS, The Netherlands
Verisign Labs Distinguished Speakers Series, Verisign Labs, Reston, USA
December 13, 2013
![Page 2: The Entity Registry System @ Verisign Labs, 2013](https://reader035.fdocuments.net/reader035/viewer/2022062703/554e8d56b4c905fc368b4a57/html5/thumbnails/2.jpg)
Entities
![Page 3: The Entity Registry System @ Verisign Labs, 2013](https://reader035.fdocuments.net/reader035/viewer/2022062703/554e8d56b4c905fc368b4a57/html5/thumbnails/3.jpg)
Entity Data
• Semi-structured, interlinked descriptions of shared instances
– Persons
– Objects
– Software
– Locations
– Sensors
– …
![Page 4: The Entity Registry System @ Verisign Labs, 2013](https://reader035.fdocuments.net/reader035/viewer/2022062703/554e8d56b4c905fc368b4a57/html5/thumbnails/4.jpg)
Entities as Mediation
• Rising paradigm
– Store information at the entity granularity
– Integrate information by inter-linking entities
• Advantages?
– Coarser granularity compared to keywords
  • More natural, e.g., brain functions similarly (or is it the other way around?)
– Denormalized information compared to RDBMSs
  • Schema-later, heterogeneity, sparsity
  • Pre-computed joins, “Semantic” linking
• Drawbacks?
![Page 5: The Entity Registry System @ Verisign Labs, 2013](https://reader035.fdocuments.net/reader035/viewer/2022062703/554e8d56b4c905fc368b4a57/html5/thumbnails/5.jpg)
Prominence of Entity-Powered Apps
– Collaborative Editing (Wikipedia’s Wikidata)
– Social Networks (Facebook’s Open Graph)
– Serious Networks (LinkedIn’s Business Graph)
– Web Search (Google’s Knowledge Graph)
– Software Integration (Yahoo!’s WOO)
– Question Answering (IBM’s Watson)
– Dynamic Websites (BBC’s London Olympics)
– Open Data (data.gov.uk, linkeddata.org)
– Most of our own applications (exascale.info)
– etc.
![Page 6: The Entity Registry System @ Verisign Labs, 2013](https://reader035.fdocuments.net/reader035/viewer/2022062703/554e8d56b4c905fc368b4a57/html5/thumbnails/6.jpg)
Problem: Limited Access to Entities (1)
• 70+% of the world’s population has no or very limited access to the Web
[Ahmed Shams 2013]
![Page 7: The Entity Registry System @ Verisign Labs, 2013](https://reader035.fdocuments.net/reader035/viewer/2022062703/554e8d56b4c905fc368b4a57/html5/thumbnails/7.jpg)
Problem: Limited Access to Entities (2)
• Even in developed countries, deploying collaborative entity-editing platforms is technically exceedingly challenging
– Local/Global QoS to serve arbitrary entity data
  • Performance, scale-out
– Collaborative aspects
  • Transactions, versioning, integration
– Offline / mobile concerns
  • Caching / replication / serializability
![Page 8: The Entity Registry System @ Verisign Labs, 2013](https://reader035.fdocuments.net/reader035/viewer/2022062703/554e8d56b4c905fc368b4a57/html5/thumbnails/8.jpg)
Potential Building Blocks?
• … for a hybrid online/offline, collaborative entity registry:
– DNS3 (never meant for entity data)
– DOA (awkward Web integration, limited features)
– RDBMSs (ACID? Impedance mismatch, limited perf.)
– P2P / decentralized CDNs (performance issues)
– Native RDF Stores (too expressive; scalability / perf. issues)
– (Structured) Inverted Indices (no transactions; slow updates)
– NoSQL key-value / document stores (wrong PACELC trade-offs; (some) performance issues)
[Iliya Enchev 2012; ISWC 2013]
![Page 9: The Entity Registry System @ Verisign Labs, 2013](https://reader035.fdocuments.net/reader035/viewer/2022062703/554e8d56b4c905fc368b4a57/html5/thumbnails/9.jpg)
Our Solution: ERS, the Entity Registry System
• Three-tier solution to deploy entity-powered apps
– Flexible
  • Seamlessly reconcile entities in local / ad-hoc / global modes
– Collaborative
  • Transactional consistency, data versioning
– Scalable
  • Bridges, scale-out servers, tunable consistency
– Open-source
  • https://github.com/ers-devs
![Page 10: The Entity Registry System @ Verisign Labs, 2013](https://reader035.fdocuments.net/reader035/viewer/2022062703/554e8d56b4c905fc368b4a57/html5/thumbnails/10.jpg)
ERS Architecture (1)
• Contributors: Contributors read and edit the contents of the registry. They may create and delete entities, look for entities, and contribute to the entities’ descriptions.
• Bridges: Bridges do not directly contribute to the contents of the registry. They are used to connect isolated closed networks and improve the availability of the descriptions shared by the contributors.
• Aggregators: Some use cases may require global servers that contain a copy of all the data provided individually by the contributors. Such a global server provides a single entry point to the registry.
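The three roles can be pictured with a small in-memory sketch. The class and method names (`Contributor`, `Bridge`, `Aggregator`, `publish`, `relay`, `receive`, `lookup`) and the `urn:example:` identifiers are hypothetical and are not the ERS API. The sketch only illustrates the division of labour: contributors author descriptions, bridges cache and forward without authoring, and an aggregator holds a copy of everything it receives.

```python
from collections import defaultdict


class Aggregator:
    """Global server: keeps a copy of everything it receives (hypothetical)."""

    def __init__(self):
        self.store = defaultdict(dict)  # entity id -> {property: value}

    def receive(self, entity, description):
        self.store[entity].update(description)

    def lookup(self, entity):
        # Single entry point: return the merged description of an entity.
        return dict(self.store.get(entity, {}))


class Bridge:
    """Connects an isolated network; caches and forwards, never authors."""

    def __init__(self, aggregator=None):
        self.cache = defaultdict(dict)
        self.aggregator = aggregator  # None models a network with no uplink

    def relay(self, entity, description):
        self.cache[entity].update(description)
        if self.aggregator is not None:
            self.aggregator.receive(entity, description)


class Contributor:
    """Reads and edits entity descriptions; publishes them via a bridge."""

    def __init__(self, name, bridge):
        self.name = name
        self.bridge = bridge
        self.local = defaultdict(dict)

    def publish(self, entity, prop, value):
        self.local[entity][prop] = value          # local copy stays usable offline
        self.bridge.relay(entity, {prop: value})  # best-effort propagation


# Two contributors on separate networks describe the same entity.
agg = Aggregator()
alice = Contributor("alice", Bridge(agg))
bob = Contributor("bob", Bridge(agg))
alice.publish("urn:example:sensor42", "type", "Sensor")
bob.publish("urn:example:sensor42", "location", "Amsterdam")
merged = agg.lookup("urn:example:sensor42")
```

Passing `aggregator=None` to a `Bridge` models the ad-hoc case in which an isolated network has no global server to reach.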
![Page 11: The Entity Registry System @ Verisign Labs, 2013](https://reader035.fdocuments.net/reader035/viewer/2022062703/554e8d56b4c905fc368b4a57/html5/thumbnails/11.jpg)
ERS Architecture (2)
[Figure: ERS architecture diagram]
![Page 12: The Entity Registry System @ Verisign Labs, 2013](https://reader035.fdocuments.net/reader035/viewer/2022062703/554e8d56b4c905fc368b4a57/html5/thumbnails/12.jpg)
Sample Deployment [Videos]
![Page 13: The Entity Registry System @ Verisign Labs, 2013](https://reader035.fdocuments.net/reader035/viewer/2022062703/554e8d56b4c905fc368b4a57/html5/thumbnails/13.jpg)
ERS Data & API
• Data: flexible RDF quads serialized as
– JSON documents (contributors, bridges)
– Key-value pairs (aggregators)
• Atomic & serializable operations through various locking granularities
– Insert entity (IE), Insert property (IP), Update property (UP), Delete property (DP), Delete entity (DE), Shallow entity copy (SC), Deep entity copy (DC), Insert link between two entities (IL), Delete link between two entities (DL)
=> Consistency
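As a rough illustration of this data model, the sketch below keeps one JSON-like document per entity and implements a subset of the listed operations (IE, IP, UP, DP, DE, IL). The `EntityStore` class, its method names, the document layout (`id`, `graph`, `ref` keys), and the `urn:example:` identifiers are assumptions made for the example, not the ERS API. A single coarse lock stands in for the finer locking granularities mentioned above.

```python
import json
import threading


class EntityStore:
    """Toy in-memory store: one JSON-like document per entity (assumed layout)."""

    def __init__(self):
        self._docs = {}                # entity id -> {property: [values]}
        self._lock = threading.Lock()  # one coarse lock, for simplicity

    def insert_entity(self, entity):                    # IE
        with self._lock:
            if entity in self._docs:
                raise KeyError(f"{entity} already exists")
            self._docs[entity] = {}

    def insert_property(self, entity, prop, value):     # IP
        with self._lock:
            self._docs[entity].setdefault(prop, []).append(value)

    def update_property(self, entity, prop, old, new):  # UP
        with self._lock:
            values = self._docs[entity][prop]
            values[values.index(old)] = new

    def delete_property(self, entity, prop):            # DP
        with self._lock:
            del self._docs[entity][prop]

    def delete_entity(self, entity):                    # DE
        with self._lock:
            del self._docs[entity]

    def insert_link(self, source, prop, target):        # IL
        with self._lock:
            if target not in self._docs:
                raise KeyError(f"unknown target {target}")
            self._docs[source].setdefault(prop, []).append({"ref": target})

    def to_json(self, entity, graph="urn:example:graph"):
        # The graph name plays the role of the fourth element of each quad.
        with self._lock:
            doc = {"id": entity, "graph": graph, **self._docs[entity]}
        return json.dumps(doc, sort_keys=True)


store = EntityStore()
store.insert_entity("urn:example:alice")
store.insert_entity("urn:example:bob")
store.insert_property("urn:example:alice", "name", "Alice")
store.update_property("urn:example:alice", "name", "Alice", "Alice A.")
store.insert_link("urn:example:alice", "knows", "urn:example:bob")
doc = json.loads(store.to_json("urn:example:alice"))
```

Each method takes the lock for its whole body, so every operation is atomic with respect to the others; a real deployment would presumably lock at the entity or property level instead.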
![Page 14: The Entity Registry System @ Verisign Labs, 2013](https://reader035.fdocuments.net/reader035/viewer/2022062703/554e8d56b4c905fc368b4a57/html5/thumbnails/14.jpg)
Unique Technical Features
I. Seamless, best-effort entity synchronization
– Local, ad-hoc, global modes
II. Fault-tolerance and decentralization
– Property replication, no single point of failure
III. Built-in versioning and provenance
– Collaborative entity editing made easy
IV. Linear scalability
– Tunable consistency levels
![Page 15: The Entity Registry System @ Verisign Labs, 2013](https://reader035.fdocuments.net/reader035/viewer/2022062703/554e8d56b4c905fc368b4a57/html5/thumbnails/15.jpg)
Performance: Distributed Locking (1)
• Decentralized, multi-granular locking protocol for transactional consistency on top of persistency layers
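The slide does not detail the protocol, so the following is only a single-process sketch of the general idea of multi-granular locking, using the textbook intention-lock scheme (IS, IX, S, X) over a two-level hierarchy of entities and their properties. The `LockTable` class and its methods are hypothetical, the calls are non-blocking, and nothing here is decentralized or fault-tolerant.

```python
# Textbook compatibility matrix for multi-granularity locking:
# COMPATIBLE[(requested, held)] says whether both may coexist.
COMPATIBLE = {
    ("IS", "IS"): True, ("IS", "IX"): True, ("IS", "S"): True, ("IS", "X"): False,
    ("IX", "IS"): True, ("IX", "IX"): True, ("IX", "S"): False, ("IX", "X"): False,
    ("S", "IS"): True, ("S", "IX"): False, ("S", "S"): True, ("S", "X"): False,
    ("X", "IS"): False, ("X", "IX"): False, ("X", "S"): False, ("X", "X"): False,
}


class LockTable:
    """Non-blocking lock table; not thread-safe, for illustration only."""

    def __init__(self):
        self._held = {}  # resource tuple -> list of (tx, mode)

    def _try(self, tx, resource, mode):
        holders = self._held.setdefault(resource, [])
        for other_tx, other_mode in holders:
            if other_tx != tx and not COMPATIBLE[(mode, other_mode)]:
                return False
        holders.append((tx, mode))
        return True

    def lock_entity(self, tx, entity, exclusive=True):
        # Coarse granularity: lock the whole entity.
        return self._try(tx, (entity,), "X" if exclusive else "S")

    def lock_property(self, tx, entity, prop, exclusive=True):
        # Fine granularity: intention lock on the entity, real lock on the property.
        intent, mode = ("IX", "X") if exclusive else ("IS", "S")
        if not self._try(tx, (entity,), intent):
            return False
        if not self._try(tx, (entity, prop), mode):
            self._held[(entity,)].remove((tx, intent))  # undo the intention lock
            return False
        return True

    def release_all(self, tx):
        for holders in self._held.values():
            holders[:] = [(t, m) for t, m in holders if t != tx]


table = LockTable()
a = table.lock_property("tx1", "e1", "name")   # granted
b = table.lock_property("tx2", "e1", "email")  # granted: different property
c = table.lock_property("tx3", "e1", "name")   # refused: tx1 holds X on e1/name
d = table.lock_entity("tx4", "e1")             # refused: others hold IX on e1
table.release_all("tx1")
table.release_all("tx2")
e = table.lock_entity("tx4", "e1")             # granted once e1 is free
```

Two transactions can edit different properties of the same entity concurrently, while a whole-entity operation such as a delete has to wait until no property-level lock remains.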
![Page 16: The Entity Registry System @ Verisign Labs, 2013](https://reader035.fdocuments.net/reader035/viewer/2022062703/554e8d56b4c905fc368b4a57/html5/thumbnails/16.jpg)
Performance: Distributed Locking (2)
• Fault-tolerant, though Paxos-like algorithms limit horizontal scalability
![Page 17: The Entity Registry System @ Verisign Labs, 2013](https://reader035.fdocuments.net/reader035/viewer/2022062703/554e8d56b4c905fc368b4a57/html5/thumbnails/17.jpg)
Performance: Optimistic Concurrency (1)
• ERS typically operates on insert-heavy, low-conflict workloads
– Most of the time new entities are inserted and properties added
• Goal: separate validation from write operations
– Per-worker TX management
– Distributed ID generator for consistent commits
![Page 18: The Entity Registry System @ Verisign Labs, 2013](https://reader035.fdocuments.net/reader035/viewer/2022062703/554e8d56b4c905fc368b4a57/html5/thumbnails/18.jpg)
Tunable Consistency in ERS
• Weak Writes: For each TX, a CID is acquired and a new record is written; the write is not validated
• Strong Writes: For each TX, a CID is acquired and a new record is written. After the write, a read verifies its visibility; if the record is not visible, the write is performed again
• Write validation: Forward chaining of records based on the highest CID; last writer wins
[Figure: Inserting 111 is possible using weak writes, but the write cannot be validated]
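The difference between the two write modes can be sketched as follows. Everything in this example is an assumption made for illustration: `FlakyStore` is a stand-in persistence layer that drops its first few writes, a local counter replaces the distributed ID generator, and the strong write reuses the same CID when it retries, which the slide does not specify.

```python
import itertools


class FlakyStore:
    """Stand-in persistence layer that silently drops its first few writes."""

    def __init__(self, drop_first=0):
        self.records = {}  # key -> {cid: value}
        self._to_drop = drop_first

    def put(self, key, cid, value):
        if self._to_drop > 0:  # simulate a write that never becomes visible
            self._to_drop -= 1
            return
        self.records.setdefault(key, {})[cid] = value

    def visible(self, key, cid):
        return cid in self.records.get(key, {})

    def read(self, key):
        versions = self.records.get(key)
        if not versions:
            return None
        return versions[max(versions)]  # highest CID wins (last writer wins)


# Local counter standing in for the distributed ID generator.
_cids = itertools.count(1)


def weak_write(store, key, value):
    cid = next(_cids)
    store.put(key, cid, value)  # fire and forget: no validation
    return cid


def strong_write(store, key, value, max_retries=5):
    cid = next(_cids)
    for _ in range(max_retries):
        store.put(key, cid, value)
        if store.visible(key, cid):  # read-after-write check
            return cid
    raise RuntimeError("write never became visible")


# A dropped weak write goes unnoticed.
store_a = FlakyStore(drop_first=1)
weak_write(store_a, "e1/name", "Alice")
lost = store_a.read("e1/name")

# A strong write is retried until visible; a later write then supersedes it.
store_b = FlakyStore(drop_first=1)
strong_write(store_b, "e1/name", "Alice")
weak_write(store_b, "e1/name", "Alice B.")
latest = store_b.read("e1/name")
```

The weak path costs one write per transaction, while the strong path adds at least one read, which is the trade-off the tunable consistency levels expose.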
![Page 19: The Entity Registry System @ Verisign Labs, 2013](https://reader035.fdocuments.net/reader035/viewer/2022062703/554e8d56b4c905fc368b4a57/html5/thumbnails/19.jpg)
OC – Execution Stack
• Breakdown of a single write operation with tunable consistency
![Page 20: The Entity Registry System @ Verisign Labs, 2013](https://reader035.fdocuments.net/reader035/viewer/2022062703/554e8d56b4c905fc368b4a57/html5/thumbnails/20.jpg)
Performance: Optimistic Concurrency (2)
=> Linear scalability even for write-heavy workloads
![Page 21: The Entity Registry System @ Verisign Labs, 2013](https://reader035.fdocuments.net/reader035/viewer/2022062703/554e8d56b4c905fc368b4a57/html5/thumbnails/21.jpg)
Ongoing Deployments (1)
• Swiss-Dutch local/global social messaging
– Cloud-hosted aggregator
– Bridge in Amsterdam (VUA)
– Bridge in Switzerland (Exascale)
– Contributors on VUA internal network
– Contributors on Exascale Infolab internal network
![Page 22: The Entity Registry System @ Verisign Labs, 2013](https://reader035.fdocuments.net/reader035/viewer/2022062703/554e8d56b4c905fc368b4a57/html5/thumbnails/22.jpg)
Ongoing Deployments (2)
• Test deployments on new affordable devices
– Wandboard (ultra low power computer)
– SmilePlug (cloud-based learning)
– Earl (backcountry survival tablet)
![Page 23: The Entity Registry System @ Verisign Labs, 2013](https://reader035.fdocuments.net/reader035/viewer/2022062703/554e8d56b4c905fc368b4a57/html5/thumbnails/23.jpg)
Ongoing Deployments (3)
• Entity-powered apps for the Sugar Learning Platform
![Page 24: The Entity Registry System @ Verisign Labs, 2013](https://reader035.fdocuments.net/reader035/viewer/2022062703/554e8d56b4c905fc368b4a57/html5/thumbnails/24.jpg)
Ongoing Deployments (4)
• ERS for Ambient Assisted Living of elderly persons in tropical environments [AAL research group @ VU]
![Page 25: The Entity Registry System @ Verisign Labs, 2013](https://reader035.fdocuments.net/reader035/viewer/2022062703/554e8d56b4c905fc368b4a57/html5/thumbnails/25.jpg)
Conclusions
• The Web is becoming entity-centric
– Land of opportunities for new registries
– Urgent needs for developing countries
• ERS is a unique, open-source entity registry solution supporting
– Local / ad-hoc / global modes
– Collaborative editing and entity versioning
– Tunable consistency levels
– Linear scalability
• Series of ongoing deployments
– Stay tuned for more results and lessons learnt
![Page 26: The Entity Registry System @ Verisign Labs, 2013](https://reader035.fdocuments.net/reader035/viewer/2022062703/554e8d56b4c905fc368b4a57/html5/thumbnails/26.jpg)
Big Thanks to the whole ERS Team
• Dutch team @ DANS
– Dr. Christophe Guéret
– Dr. Marat Charlaganov
– C. Dinu & Pepijn Kroes
• Swiss team @ XI
– Prof. Dr. Philippe Cudré-Mauroux
– Dr. Martin Grund
– Teodor Macicas
… and to our MSc students:
– Iliya Enchev and Ahmed Shams
![Page 27: The Entity Registry System @ Verisign Labs, 2013](https://reader035.fdocuments.net/reader035/viewer/2022062703/554e8d56b4c905fc368b4a57/html5/thumbnails/27.jpg)
And Special Thanks to…
• Scott Hollenbeck, Debra Anderson, Allison Mankin & the Internet Infrastructures Grant team
• Dr. Burt Kaliski and his team
• Vincenzo Russo, Benoit Perroud, Romain Cholat and the whole Verisign Fribourg office
… for their continued support
![Page 28: The Entity Registry System @ Verisign Labs, 2013](https://reader035.fdocuments.net/reader035/viewer/2022062703/554e8d56b4c905fc368b4a57/html5/thumbnails/28.jpg)
References
• P. Cudré-Mauroux, G. Demartini, D.E. Difallah, A.E. Mostafa, V. Russo, and M. Thomas: A Demonstration of DNS3: a Semantic-Aware DNS Service. ISWC 2011.
• P. Cudré-Mauroux, G. Demartini, I. Enchev, C. Guéret, and B. Perroud: Downscaling Entity Registries for Ad-Hoc Environments. Downscale 2012.
• M. Charlaganov, P. Cudré-Mauroux, C. Dinu, C. Guéret, M. Grund, and T. Macicas: Demonstrating The Entity Registry System: Implementing 5-Star Linked Data Without the Web. ISWC 2013.
• P. Cudré-Mauroux, I. Enchev, S. Fundatureanu, P.T. Groth, A. Haque, A. Harth, F. Keppmann, D.P. Miranker, J. Sequeda, and M. Wylot: NoSQL Databases for RDF: An Empirical Evaluation. ISWC 2013.
• M. Charlaganov, P. Cudré-Mauroux, C. Dinu, C. Guéret, M. Grund, and T. Macicas: The Entity Registry System: Implementing 5-Star Linked Data Without the Web. CoRR abs 2013.
• M. Charlaganov, P. Cudré-Mauroux, C. Dinu, C. Guéret, M. Grund, P. Kroes, and T. Macicas: Collaboratively Editing an Entity Registry in Poorly Connected Environments. CAiSE 2014 [submitted].
![Page 29: The Entity Registry System @ Verisign Labs, 2013](https://reader035.fdocuments.net/reader035/viewer/2022062703/554e8d56b4c905fc368b4a57/html5/thumbnails/29.jpg)
Further Entity Research @ XI
• R. Prokofyev, G. Demartini, and P. Cudré-Mauroux: Effective Named Entity Recognition for Idiosyncratic Web Collections. WWW 2014.
• G. Demartini, D.E. Difallah, and P. Cudré-Mauroux: Large-scale linked data integration using probabilistic reasoning and crowdsourcing. The VLDB Journal, 2013.
• A. Tonon, M. Catasta, G. Demartini, P. Cudré-Mauroux, and K. Aberer: TRank: Ranking Entity Types Using the Web of Data. ISWC 2013.
• A. Tonon, G. Demartini, and P. Cudré-Mauroux: Combining inverted indices and structured search for ad-hoc object retrieval. SIGIR 2012.
• G. Demartini, D.E. Difallah, and P. Cudré-Mauroux: ZenCrowd: leveraging probabilistic reasoning and crowdsourcing techniques for large-scale entity linking. WWW 2012.
![Page 30: The Entity Registry System @ Verisign Labs, 2013](https://reader035.fdocuments.net/reader035/viewer/2022062703/554e8d56b4c905fc368b4a57/html5/thumbnails/30.jpg)
Thanks a lot for your attention
http://exascale.info