New York City and Baltimore Semantic Web Meetups 20130221/20120226


description

Slides for a talk at the New York City and Baltimore Semantic Web Meetups on the Linked Data book being published by Manning.

Transcript of New York City and Baltimore Semantic Web Meetups 20130221/20120226

Page 2: New York City and Baltimore Semantic Web Meetups 20130221/20120226

Discount code: 13ldev

at manning.com

Page 3: New York City and Baltimore Semantic Web Meetups 20130221/20120226

Our Rapidly Changing Internet

35 hours of video uploaded per minute

51% of Internet traffic is non-human

>2.3 billion Internet users, >1 billion in Asia

Page 6: New York City and Baltimore Semantic Web Meetups 20130221/20120226

08 Oct 2007 07 Nov 2007 10 Nov 2007 28 Feb 2008 31 Mar 2008

18 Sep 2008 05 Mar 2009 27 Mar 2009 14 Jul 2009 22 Sep 2010

Page 11: New York City and Baltimore Semantic Web Meetups 20130221/20120226

Luke Ruth is a Linked Data developer supporting the Callimachus Project (http://callimachusproject.org).

David Wood has co-founded several Open Source Software projects related to the Semantic Web, including Persistent URLs, Mulgara and the Callimachus Project. He is co-chair of the W3C’s RDF Working Group.

Marsha Zaidman is Associate Professor Emerita of Computer Science at the University of Mary Washington.

Michael Hausenblas is Chief Data Engineer at MapR. He formerly led the Linked Data Research Centre in Galway, Ireland.

Page 13: New York City and Baltimore Semantic Web Meetups 20130221/20120226

http://www.manning.com/dwood/

Page 14: New York City and Baltimore Semantic Web Meetups 20130221/20120226

Linked Data about the book!

Page 15: New York City and Baltimore Semantic Web Meetups 20130221/20120226

The first chapter is free

Page 16: New York City and Baltimore Semantic Web Meetups 20130221/20120226

Manning Early Access Program (MEAP)

• Concept:

• Give away the first chapter

• Sell a low-resolution PDF to early readers

• Readers get PDF updates and a print copy when it becomes available

Page 17: New York City and Baltimore Semantic Web Meetups 20130221/20120226

Author Forum: http://www.manning-sandbox.com/forum.jspa?forumID=838

Page 21: New York City and Baltimore Semantic Web Meetups 20130221/20120226

Success Criteria

• Sell 100 copies in MEAP in one month

• 598 copies in one month

• Lots of interest in Linked Data!

Page 22: New York City and Baltimore Semantic Web Meetups 20130221/20120226

What’s Inside

• What Linked Data is

• Find Linked Data you can reuse

• Use Linked Data in your applications

• Create your own Linked Data

• Build Linked Data applications using standard Web techniques

Page 23: New York City and Baltimore Semantic Web Meetups 20130221/20120226

Coding Examples

Callimachus

Page 24: New York City and Baltimore Semantic Web Meetups 20130221/20120226

1. Linked Data to the rescue (Available)

2. RDF - the data model for Linked Data (Available)

3. Consuming Linked Data (Available)

4. Creating Linked Data (Available)

5. Querying Linked Data (Available)

6. Enhancing results from search engines (Available)

7. Collecting Linked Data (Available)

8. Datasets (Completed)

9. Callimachus - a Linked Data management system (In draft)

10. Building a read-write Linked Data application (In draft)

Page 25: New York City and Baltimore Semantic Web Meetups 20130221/20120226

Callimachus: http://callimachusproject.org

Page 27: New York City and Baltimore Semantic Web Meetups 20130221/20120226

Callimachus

Partners

Sesame

Sesame (in progress)

Page 30: New York City and Baltimore Semantic Web Meetups 20130221/20120226

From Wikipedia

From EPA

Open Street Map

Page 31: New York City and Baltimore Semantic Web Meetups 20130221/20120226

Subject

Object (Predicate is defined in a template)

Page 32: New York City and Baltimore Semantic Web Meetups 20130221/20120226

RDF “Describe” View

Subject

Predicate Object

Page 36: New York City and Baltimore Semantic Web Meetups 20130221/20120226

HTTP-accessible endpoints capable of returning XML or textual content

Convert XML or textual results to RDF

Render RDF to HTML via template

User resolves a single URI to an Active PURL

Multiple targets queried independently

1

David Wood1 and Tom [email protected], [email protected]

Active PURLs for Clinical Study Aggregation

The problem: No coordinated view of clinical study information. Information is distributed across departments, subsidiaries and government data sources.

The solution: Gather, convert, aggregate and format for display

Challenges

Next steps

How semantic technologies help

3 Round Stones and AstraZeneca created a system to allow coordinated views of distributed clinical trial information. The system extended the Callimachus Project, an Open Source management system for Linked Data. Persistent URLs, or PURLs, were used to provide globally unique and resolvable identifiers for each clinical study. The PURL concept was extended to enable PURLs to have multiple targets and for the results of each target to undergo arbitrary transformation. PURLs which have such capabilities are called Active PURLs. Information sources relevant to clinical studies were identified, regardless of whether their location was internal or external to the pharmaceutical company's network. Active PURLs were used to resolve data sources having HTTP endpoints capable of returning XML or textual results. Each information source is dynamically transformed into Resource Description Framework (RDF) formats and all sources' results are then merged into a single, temporary graph of RDF data. Information is rendered to end users as coordinated HTML descriptions regarding each clinical trial using the Callimachus template engine. Machine-readable versions of the data are also available.
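The fetch, convert, merge and render steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Callimachus implementation: the function names, the `http://example.org/vocab/` namespace, the example URLs and the flat XML records are all hypothetical, and canned responses stand in for real HTTP endpoints so the sketch runs offline.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical vocabulary namespace, used only for this illustration.
VOCAB = "http://example.org/vocab/"


def xml_to_triples(subject, xml_text):
    """Turn each child element of a flat XML record into one (s, p, o) triple."""
    root = ET.fromstring(xml_text)
    return {
        (subject, VOCAB + child.tag, (child.text or "").strip())
        for child in root
    }


def resolve_active_purl(purl, targets, fetch):
    """Query every target independently and merge the results into one graph.

    `fetch` is any callable that maps a target URL to XML text, so the
    sketch can run without network access.
    """
    graph = set()  # temporary graph, discarded after rendering
    for url in targets:
        graph |= xml_to_triples(purl, fetch(url))
    return graph


def render_html(purl, graph):
    """Render the merged triples for one PURL as a simple HTML table."""
    rows = "".join(
        f"<tr><td>{html.escape(p)}</td><td>{html.escape(o)}</td></tr>"
        for s, p, o in sorted(graph)
        if s == purl
    )
    return f"<table>{rows}</table>"


# Canned responses standing in for an internal and an external source.
FAKE_SOURCES = {
    "http://internal.example/study/42": "<study><phase>II</phase></study>",
    "http://external.example/trial/42": "<trial><status>Recruiting</status></trial>",
}

purl = "http://purl.example/study/42"
graph = resolve_active_purl(purl, list(FAKE_SOURCES), FAKE_SOURCES.__getitem__)
page = render_html(purl, graph)
```

Passing `fetch` in as a parameter keeps the merge logic separate from the transport, so a real HTTP client could be swapped in without touching the conversion or rendering steps.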

Linked Data techniques can help to address both the availability of clinical trial information and provide a means to build effective information systems using it. Linked Data techniques allow for "cooperation without coordination". Publishers of data provide context for use by third parties in other portions of a distributed enterprise. Users of Linked Data can combine information from multiple sources. Subsequent publication can create a virtuous circle of positive feedback, allowing researchers, informaticists and support staff to collaboratively and distributively build a reusable knowledge base.

Distributed queries have many known limitations, such as the introduction of multiple single points of failure in any given PURL resolution. HTTP timeouts, authentication/authorization errors or other network failures can slow or stop a pipeline from returning correctly. Similarly, distributed queries can result in variable query-time performance due to complex network and endpoint performance variances. Proactive caching and cache management strategies can improve runtime performance and protect end users from the limitations inherent in a distributed query architecture. Caching of intermediate results from endpoints has not yet been implemented.
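One possible shape for the caching strategy mentioned above is a small time-to-live cache wrapped around the per-target fetch, combined with per-target error handling so that one failing endpoint does not stop the whole resolution. The source states that this caching has not been implemented, so the sketch below is purely hypothetical: `TTLCache`, `fetch_all`, the 60-second lifetime and the example URLs are assumptions made for illustration.

```python
import time


class TTLCache:
    """Cache fetched bodies per URL for a fixed number of seconds."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # url -> (expiry time, body)

    def get(self, url):
        now = self._clock()
        entry = self._entries.get(url)
        if entry and entry[0] > now:
            return entry[1]  # still fresh, no network call
        try:
            body = self._fetch(url)
        except OSError:
            if entry:
                return entry[1]  # endpoint failed: fall back to the stale copy
            raise
        self._entries[url] = (now + self._ttl, body)
        return body


def fetch_all(urls, cache):
    """Fetch each target independently; collect failures instead of aborting."""
    results, failures = {}, {}
    for url in urls:
        try:
            results[url] = cache.get(url)
        except OSError as exc:  # timeouts and connection errors derive from OSError
            failures[url] = str(exc)
    return results, failures


# Simulated endpoints: one healthy, one that always times out.
calls = []


def flaky_fetch(url):
    calls.append(url)
    if "down" in url:
        raise TimeoutError("endpoint timed out")
    return f"<ok from='{url}'/>"


now = [0.0]  # manually advanced clock so the example is deterministic
cache = TTLCache(flaky_fetch, ttl_seconds=60, clock=lambda: now[0])
urls = ["http://a.example/up", "http://b.example/down"]
first, failed = fetch_all(urls, cache)
second, _ = fetch_all(urls, cache)  # healthy target is served from the cache
```

Falling back to a stale copy on failure trades freshness for availability; whether that trade is acceptable for clinical study data is a design decision this sketch does not settle.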

We intend to continue to address

References

1. Callimachus Project,

User experience

1. Users resolve a URL that provides a unique identifier for a clinical study, drug, chemical or other concept managed by this system. The user may be presented with the URL on HTML pages, search for it via full-text techniques or discover it via semantic search.

2. Users are presented with a dynamically generated Web page representing aggregated clinical study information. Users are isolated from the complex and distributed information environment.

Page 37: New York City and Baltimore Semantic Web Meetups 20130221/20120226

Credit: Bradley P. Allen, Elsevier Labs

XHTML 5

DocBook 5

ePub 3

Page 38: New York City and Baltimore Semantic Web Meetups 20130221/20120226

http://www.manning.com/dwood/

Page 39: New York City and Baltimore Semantic Web Meetups 20130221/20120226

Discount code: 13ldev

at manning.com