Data-driven Applications with conStruct

Michael K. Bergman, Structured Dynamics LLC. Presented at the Semantic Technologies ’09 Conference, San Jose, CA, June 16, 2009: “BKN: Building Knowledge through Communities, and Communities through Knowledge”

description

The first unveiling of conStruct, a structured content system that enables Drupal to be driven by structured (RDF) data. conStruct is also based on the platform-independent structWSF Web services framework, which provides dataset collaboration over the Web. The presentation is from SemTech 2009.

Transcript of Data-driven Applications with conStruct

Page 1: Data-driven Applications with conStruct

Michael K. Bergman, Structured Dynamics LLC

presented at

Semantic Technologies ’09 Conference, San Jose, CA

June 16, 2009

“BKN: Building Knowledge through Communities, and Communities through Knowledge”

Page 2: Data-driven Applications with conStruct

Bibliographic Knowledge Network, SemTech ’09

Presentation Outline

I. Overview of BKN

II. Demonstration of a BKN Node

III. More on Design

IV. Architecture and Web Services Framework

V. Collaboration Demonstration

VI. Design Benefits

VII. conStruct and Next Plans

VIII. Contact Info

Page 3: Data-driven Applications with conStruct

I. Overview of BKN

Page 4: Data-driven Applications with conStruct

Introduction

BibKN or BKN: the Bibliographic Knowledge Network

Develop tools and services for virtual organizations (“VOs”):
  Scientific communities of varied sizes, interests
  Conference groups
  Departmental research groups
  Students, teachers and faculty alike

Select, filter and enhance bibliographic data for each VO:
  Multiple input citation and bibliographic formats

Apply new and existing methods to these bibliographic collections:
  Bibliometric analysis
  Machine learning
  Statistical visualization

Research, provide authoring tools for relevant structured data

First math and statistics, then any knowledge community

Page 5: Data-driven Applications with conStruct

The Project

Two-year effort

Began October 2008

Joint consortium effort of:
  University of California, Berkeley
  Harvard University
  Stanford University
  American Institute of Mathematics (AIM)

Director: Dr. Jim Pitman, Professor of Statistics and Mathematics, UC Berkeley

Funded by NSF (Grant No. 0835851)

All software and data to be open source

Page 6: Data-driven Applications with conStruct

Some Points for this Talk

Four kinds of network “nodes”:
  1. VO Nodes – collaboration portals
  2. Gateways – connections to existing external content
  3. Hubs – aggregate suppliers of useful datasets
  4. Individual dataset contributors and clients

Data formats and models:
  BibJSON: human readable, exchange format
  Ingest of various existing citation/biblio formats
  RDF: “canonical” internal data model

VO Nodes the emphasis of this talk:
  CMS: Drupal
  RDF triplestore: Virtuoso
  Full-text, faceted search: Solr

Page 7: Data-driven Applications with conStruct

Preview of Benefits

Comprehensive toolset for structured data (semantic Web) conversion, use and exposure

Data-driven via ontologies; easily scoped, tailored

Naïve data formats and RDF APIs via RESTful Web services

Web services framework can mix-and-match:
  Standalone
  Integrated with any CMS
  External tools access

Web-wide user and dataset access and permissions

Available as a distro of the Drupal CMS

Open source and extensible

Page 8: Data-driven Applications with conStruct

conStruct SCS

A ‘data-driven app’ (ontologies + structured data)

Datasets + permissions

Context-sensitive display templates

Use and exposure of linked data

Hosted by Drupal

Page 9: Data-driven Applications with conStruct

II. Demonstration of a BKN Node

Page 10: Data-driven Applications with conStruct

BibKN Welcome Screen

Page 11: Data-driven Applications with conStruct

III. More on Design

Page 12: Data-driven Applications with conStruct

Data Formats and Exchange

Ingest standard structured and bibliographic data:
  BibTeX
  Bibsonomy
  RePEc
  etc.

Standard ‘naïve’ data format: BibJSON
  Attribute-value pair orientation
  Human readable
  Easily authored and edited

RDF as the internal, ‘canonical’ data model
  Also can be used for exchange
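
The relationship between a simple attribute-value record and the canonical RDF model can be sketched as follows. This is only an illustration under stated assumptions: the record's field names are made up in the spirit of BibJSON (they are not taken from the BibJSON format itself), the `http://example.org/bib/` base URI is hypothetical, and the choice of Dublin Core terms as predicates is an assumption rather than the mapping BKN actually uses.

```python
# Sketch: turn an attribute-value record into RDF triples (N-Triples syntax).
# Field names, base URI and predicate mapping are illustrative assumptions.

DCTERMS = "http://purl.org/dc/terms/"
BASE = "http://example.org/bib/"  # hypothetical namespace for record URIs

# Assumed mapping from record attributes to RDF predicates.
FIELD_TO_PREDICATE = {
    "title": DCTERMS + "title",
    "author": DCTERMS + "creator",
    "year": DCTERMS + "date",
}


def escape_literal(value):
    """Escape backslashes, quotes and newlines for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def record_to_ntriples(record):
    """Return one N-Triples line per attribute value in the record."""
    subject = "<" + BASE + record["id"] + ">"
    triples = []
    for field, predicate in FIELD_TO_PREDICATE.items():
        if field not in record:
            continue
        values = record[field]
        if not isinstance(values, list):
            values = [values]  # treat single values and lists uniformly
        for value in values:
            literal = escape_literal(str(value))
            triples.append(subject + " <" + predicate + '> "' + literal + '" .')
    return triples


if __name__ == "__main__":
    example = {
        "id": "example-1",
        "title": "An Example Article",
        "author": ["A. Author", "B. Author"],
        "year": "2009",
    }
    for line in record_to_ntriples(example):
        print(line)
```

Each attribute-value pair becomes one triple about the record, which is why a human-editable format and an RDF store can hold the same information.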

Page 13: Data-driven Applications with conStruct

Data Formats and Exchange

Page 14: Data-driven Applications with conStruct

Four Kinds of ‘Nodes’

VO Nodes – collaboration portals

Gateways – connections to existing external content

Hubs – aggregate suppliers of useful datasets

Individual dataset contributors and clients

The primary construct for data exchange and use is the dataset

Page 15: Data-driven Applications with conStruct

Multiple Deployment Options

Page 16: Data-driven Applications with conStruct

Illustrative BKN Network

Page 17: Data-driven Applications with conStruct

The Backend Structure

Drupal:
  User interface and theming
  User and group management
  Content management system
  Many third-party tools and modules

Virtuoso:
  RDF triple store
  SPARQL and endpoints; linked data exposure
  Some structured data conversions

Solr:
  Full-text indexing
  Faceting and aggregating (counts)
  Innovative complete RDF search
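
To make the faceting role concrete, here is a minimal sketch of how a full-text query with facet counts might be expressed against Solr. The request parameters `q`, `rows`, `wt`, `facet` and `facet.field` are standard Solr query parameters; the base URL and the field names `type` and `dataset` are hypothetical, and nothing here is claimed to be how conStruct or structWSF actually calls Solr.

```python
# Sketch: build a Solr select URL that asks for facet counts.
# Base URL and field names are assumptions for illustration only.
from urllib.parse import urlencode


def build_facet_query(base_url, text, facet_fields, rows=10):
    """Return a query URL requesting matching documents plus facet counts."""
    params = [
        ("q", text),        # full-text query string
        ("rows", str(rows)),
        ("wt", "json"),     # ask for a JSON response
        ("facet", "true"),  # switch faceting on
    ]
    # One facet.field parameter per field to aggregate counts over.
    params += [("facet.field", name) for name in facet_fields]
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    url = build_facet_query(
        "http://localhost:8983/solr/select",  # hypothetical Solr endpoint
        "markov chain",
        ["type", "dataset"],
    )
    print(url)
```

The response to such a query would carry, alongside the hits, a count of matching records per value of each requested field, which is what drives a faceted browse interface.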

Page 18: Data-driven Applications with conStruct

The Backend Structure

Page 19: Data-driven Applications with conStruct

IV. Architecture and Web Services Framework

Page 20: Data-driven Applications with conStruct

Overall Architecture

Page 21: Data-driven Applications with conStruct

structWSF (Web services framework)

Standard functionality:
  Data management: Create, Read, Update, Delete
  Browse and search
  Import/export
  Display templates

Dataset registration w/ permissions

Ontology registration

User authorization and access (IP-based initially)

RESTful design
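
The combination of dataset registration, IP-based access and CRUD operations listed on this slide can be sketched in a few lines. This is an in-memory toy model under stated assumptions, not structWSF's API: the class and method names (`DatasetRegistry`, `register`, `create` and so on) are hypothetical, and the mapping of CRUD operations to HTTP verbs is the common REST convention rather than something confirmed by the slides.

```python
# Sketch: datasets registered with per-IP permissions, guarding CRUD calls.
# All names here are hypothetical; this is not the structWSF interface.
from dataclasses import dataclass, field

# Conventional REST mapping of CRUD operations to HTTP verbs (an assumption).
CRUD_TO_HTTP = {"create": "POST", "read": "GET", "update": "PUT", "delete": "DELETE"}


@dataclass
class Dataset:
    uri: str
    # Maps a client IP address to the set of CRUD operations it may perform.
    permissions: dict = field(default_factory=dict)
    records: dict = field(default_factory=dict)


class DatasetRegistry:
    def __init__(self):
        self._datasets = {}

    def register(self, uri, permissions):
        """Register a dataset together with its per-IP permissions."""
        allowed = {ip: set(ops) for ip, ops in permissions.items()}
        self._datasets[uri] = Dataset(uri, allowed)

    def _authorize(self, uri, client_ip, operation):
        """Return the dataset if the client may perform the operation."""
        dataset = self._datasets[uri]
        if operation not in dataset.permissions.get(client_ip, set()):
            raise PermissionError(f"{client_ip} may not {operation} on {uri}")
        return dataset

    def create(self, uri, client_ip, record_id, attributes):
        dataset = self._authorize(uri, client_ip, "create")
        dataset.records[record_id] = dict(attributes)

    def read(self, uri, client_ip, record_id):
        dataset = self._authorize(uri, client_ip, "read")
        return dict(dataset.records[record_id])

    def update(self, uri, client_ip, record_id, attributes):
        dataset = self._authorize(uri, client_ip, "update")
        dataset.records[record_id].update(attributes)

    def delete(self, uri, client_ip, record_id):
        dataset = self._authorize(uri, client_ip, "delete")
        del dataset.records[record_id]
```

Keeping the permission check at the dataset level, as in this sketch, matches the idea that the dataset is the primary unit of exchange: a client is granted operations on a whole dataset, not on individual records.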

Page 22: Data-driven Applications with conStruct

structWSF Schema

Page 23: Data-driven Applications with conStruct

V. Collaboration Demonstration

Page 24: Data-driven Applications with conStruct

Topology of the Collaboration Demo

Page 25: Data-driven Applications with conStruct

VI. Design Benefits

Page 26: Data-driven Applications with conStruct

Comprehensive Toolset

Standard functionality:
  Data management: Create, Read, Update, Delete
  Browse and search
  Import/export
  Display templates

Dataset registration w/ permissions

Ontology registration

User authorization and access (via Drupal OG)

Drupal modules are lightweight wrappers over structWSF

Page 27: Data-driven Applications with conStruct

Data-driven via Ontologies

Ontologies (OWL) guide:
  Instance records and attributes
  Data relationships
  Data presentation: trees/hierarchies, display templates
  User interface, e.g.: autocompletion, contextual dropdown list selections, new record entry, new attribute entry, etc.

Addition or swap out of ontologies drives generic tools
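
The idea that a generic tool is driven by whatever ontology is loaded can be sketched as follows. A plain Python dictionary stands in for an OWL ontology here; the class names, property names and the `form_fields` helper are all hypothetical, and real OWL parsing is deliberately left out.

```python
# Sketch: a generic "new record" form derived from an ontology description.
# The dictionary is a stand-in for OWL; all names are illustrative assumptions.

ONTOLOGY = {
    "classes": {
        "Document": {"subclass_of": None, "properties": ["title", "language"]},
        "Article": {"subclass_of": "Document", "properties": ["journal"]},
    },
    "properties": {
        "title": {},
        "language": {"allowed_values": ["en", "fr"]},  # enumerated values
        "journal": {},
    },
}


def form_fields(ontology, class_name):
    """List the input fields for a class, inherited properties first.

    Assumes the subclass chain is acyclic.
    """
    names = []
    current = class_name
    while current is not None:
        cls = ontology["classes"][current]
        names = cls.get("properties", []) + names  # superclass fields go first
        current = cls.get("subclass_of")
    fields = []
    for name in names:
        choices = ontology["properties"][name].get("allowed_values", [])
        fields.append({
            "name": name,
            # Enumerated properties become dropdowns, the rest free text.
            "widget": "dropdown" if choices else "text",
            "choices": list(choices),
        })
    return fields


if __name__ == "__main__":
    for entry in form_fields(ONTOLOGY, "Article"):
        print(entry)
```

Because `form_fields` reads only the ontology description, passing in a different ontology changes the form without any change to the tool itself.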

Page 28: Data-driven Applications with conStruct

RESTful structWSF

Independent, standalone piece

Naïve data formats and RDF APIs via RESTful Web services

Web services framework can mix-and-match:
  Standalone
  Integrated with any CMS
  External tools access

Web-wide user and dataset access and permissions

As described

Page 29: Data-driven Applications with conStruct

Open Source Drupal Distro

Provided as the conStruct module

See conclusion

Page 30: Data-driven Applications with conStruct

Re-cap of Benefits

Comprehensive toolset for structured data (semantic Web) conversion, use and exposure

Data-driven via ontologies; easily scoped, tailored

Naïve data formats and RDF APIs via RESTful Web services

Web services framework can mix-and-match:
  Standalone
  Integrated with any CMS
  External tools access

Web-wide user and dataset access and permissions

Available as a distro of the Drupal CMS

Open source and extensible

Page 31: Data-driven Applications with conStruct

VII. conStruct and Next Plans

Page 32: Data-driven Applications with conStruct

conStruct SCS

conStruct SCS is a structured content system

Open source version of the BKN project that runs on Drupal

Drupal components:
  Drupal
  conStruct module (core functionality)
  structDisplay module (display templating)
  Required additional modules (e.g., Organic Groups, WYSIWYG, etc.)

Structured Dynamics:
  structWSF (Web services framework)

Third parties:
  Virtuoso RDF triple store
  Solr full-text faceted search
  Smarty templating system

Page 33: Data-driven Applications with conStruct

The conStruct System

Page 34: Data-driven Applications with conStruct

conStruct Distribution Options

Initially:
  Downloads from multiple sites
  Download, Install & Configuration manual

Next option: Single-click Amazon EC2 AMI install

Planned: Complete, packaged Drupal distro

Hoped: Multiple CMS versions

Page 35: Data-driven Applications with conStruct

conStruct Plans and Schedule

conStructSCS.com now open
  Online demo available

OpenStructs.org now open
  Also the distribution point for open source display templates, converters and extractors

conStruct alpha distro to be available June 30

structWSF alpha distro to be available June 30

Additional conStruct supporting modules planned on an ongoing basis

Page 36: Data-driven Applications with conStruct

VIII. Contact Info

Page 37: Data-driven Applications with conStruct

Contacts & Information

Bibliographic Knowledge Network

Dr. Jim [email protected]

Nitin Borwankar, Project [email protected]

Web Site: www.BibKN.org

Structured Dynamics

Michael K. [email protected]: www.mkbergman.com

Web Sites: conStructSCS.com, www.structureddynamics.com, www.umbel.org
