Web 3.0 Emerging

Tetherless World Constellation Web 3.0 emerges… Jim Hendler Tetherless World Professor of Computer and Cognitive Science Assistant Dean of Information Technology and Web Science Rensselaer Polytechnic Institute http://www.cs.rpi.edu/~hendler

description

A talk given at CSHALS 2010, an overview of the current state of linked data, Semantic Web, and Web 3.0 efforts.

Transcript of Web 3.0 Emerging

Page 1: Web 3.0 Emerging

Tetherless World Constellation

Web 3.0 emerges…

Jim Hendler

Tetherless World Professor of Computer and Cognitive Science

Assistant Dean of Information Technology and Web Science

Rensselaer Polytechnic Institute

http://www.cs.rpi.edu/~hendler

Page 2: Web 3.0 Emerging

Page 3: Web 3.0 Emerging

The Semantic Web (ca. 2001)

(Berners-Lee, Hendler, Lassila; 2001)

Page 4: Web 3.0 Emerging

Semantic Web ca. 2010

• Semantic Web finding success even in tough market
  – Lots of small companies in the market: Altova… Zepheira (e.g. C&P, Franz, Intellidimension, Intellisophic, Ontology Works, Siderean, SandPiper, SiberLogic, TopQuadrant …)
  – Web 3.0 new buzzword: Garlik, Twine, Freebase, Bintro, Siri, Talis, …
  – Semantic Search taking off: Powerset bought by Microsoft for over $100,000,000, hakia, bing, T2, tiptop, …
  – Semantic match: classifieds (bintro), clinical studies (TrialX.com)
• Bigger players buying in
  – 2009 announcements at SemTech (June): Google, New York Times, Oracle, IBM, Yahoo, MS Live Labs, Siri, …
  – 2008: Gartner identifies Corporate Semantic Web as one of three "High impact" Web technologies
  – Tool market forming: AllegroGraph, TopBraid, Pellet2, …
  – O’Reilly “Programming the Semantic Web”
• Government projects in and across agencies
  – Recent open data announcements by UK and US
    – UK in linked data format, US 3rd party to linked data (5B triples so far)
  – Projects/demos in EU, Japan, Korea, China, India…
  – SKOS update in govt (and private) libraries
• Several "verticals" heavily using Semantic Web technologies
  – Health Care and Life Sciences
  – Financial services
  – Human Resources
  – Publishing/New Media
  – Sciences other than Life Science
    • Virtual observatory, Geo ontology, …

Page 5: Web 3.0 Emerging

Two very different sorts of use cases

• cf. US National Center for Biotechnology Information, "Oncology Metathesaurus"
  – 50,000+ classes, ~8 people supporting full time, monthly updates, mandated for use by NIH-funded cancer researchers
    • OWL DL rigorously followed
    • Provably consistent
• cf. Friend of a Friend (FOAF)
  – 30+ classes, Dan Brickley and Libby Miller made it, maintained by consensus in a small community of developers
    • Violates DL rules (undecidable)
    • Used inconsistently

Page 6: Web 3.0 Emerging

Widely varying use

• NCBI Oncology Ontology
  – High use in medical community
  – Very "trusted" information (provenance from NCBI)
  – Primarily terminological (relationships between cancer-related concepts), not data-oriented
• FOAF
  – ~60M FOAF people (not necessarily distinct individuals)
  – Used by a number of large providers
    • If you use LiveJournal, you have a FOAF file
      – Also flickr, ecademy, tribe, joost, …
      – And you can export FOAF from Facebook and many other social networking sites
  – Becoming de facto standard for open social networking

Page 7: Web 3.0 Emerging

Why?

• NCBI view: Formal properties
  – Based on a decidable subset of KR
    • Description logics
  – For which much scaling research has been happening
    • Ca. 2000: 10,000 axioms, no facts, 1 day
    • Ca. 2008: 50,000 axioms, million facts, 10 min.
  – Not just faster computers (but Moore's Law helps), significant research into optimization, "average case"
  – Moving to parallel (Web server)
  – With some new ways of linking to larger data sets
    • SHER, IBM, "reduced Abox"
    • OWL-Prime, Oracle, "materialized views"
    • OWL 2 QL (?)

In this view OWL is a formal knowledge representation standard

Page 8: Web 3.0 Emerging

The argument for this seems compelling

• When "folksonomy" isn't enough…

Which one do you want your doctor to use?

Page 9: Web 3.0 Emerging

But the cost is high

• Formal modeling finds its use cases in verticals and enterprises
  – Where the vocabulary can be controlled
  – Where finding things in the data is important
• Example
  – Drug discovery from data
    • Model the molecule (site, chemical properties, etc.) as faithfully and expressively as possible
    • Use "Realization" to categorize data assets against the ontology
      – Bad or missed answers are money down the drain
• But the modeling is very expensive and the return on investment must be very high!
  – Which is part of why the "expert systems revolution" wasn't one
  – Became part of the technology tool kit, a useful niche in the programming pantheon, but didn't change the world

Page 10: Web 3.0 Emerging

The alternative

• OWL is based on RDF, a language designed for the (Semantic) Web
  – Built with Web architecture in mind
    • Exploits Web infrastructure, respects W3C TAG recommendations
      – Internationalization, accessibility, extensibility
  – Fits the Web culture
    • Open and extensible, supports communities of interest
      – If you don't like my ontology, extend it, change it, or build your own
    • Fits the Web application development paradigm
      – Scales like "databases"
  – With some new ways of linking to formal models
    • Heavy use of a small amount of OWL
    • Generally used "like it sounds" not like the formal model
      – Example "owl:sameAs" debate

“linked data” often used to describe this low semantics Semantic Web (slogan: a little semantics goes a long way)

Page 11: Web 3.0 Emerging

Semantic Web Applications

• ~2006: Web app developers discover the Semantic Web

[Diagram labels: RDF Triple Store, Dynamic Content Engine, HTML, HTTP, RDF, Web App (w/ SPARQL), RDF Triple Store]

2008 examples include sites from "regular" Web players such as Dow Jones, Reuters and Yahoo!
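
The pattern named on this slide, a Web app asking questions of an RDF triple store, can be illustrated with a toy example. The sketch below is not SPARQL and not any particular store's API; it is a minimal in-memory matcher that joins triple patterns roughly the way a SPARQL basic graph pattern does. All `ex:` identifiers and the function names `match` and `query` are made up for illustration.

```python
def match(pattern, triple, bindings):
    """Extend bindings so that pattern matches triple; return None on failure."""
    result = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # Variable: bind it, or check it agrees with an earlier binding.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            # Constant: must match exactly.
            return None
    return result


def query(store, patterns):
    """Join a list of triple patterns against the store, one pattern at a time."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for bindings in solutions:
            for triple in store:
                extended = match(pattern, triple, bindings)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


# Hypothetical data: news stories that mention companies, plus company sectors.
store = [
    ("ex:story1", "ex:mentions", "ex:AcmeCorp"),
    ("ex:story2", "ex:mentions", "ex:OtherCo"),
    ("ex:AcmeCorp", "ex:sector", "ex:Energy"),
    ("ex:OtherCo", "ex:sector", "ex:Retail"),
]

# "Which stories mention a company in the energy sector?"
rows = query(store, [
    ("?story", "ex:mentions", "?company"),
    ("?company", "ex:sector", "ex:Energy"),
])
```

A real deployment would send an equivalent SPARQL query to the triple store over HTTP; the nested loops here only show what the join means, not how a store would execute it at scale.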

Page 12: Web 3.0 Emerging

cf. Yahoo mixes RDF with other technologies: at Web scale

Dave Beckett, SemTech 08
http://www.semantic-conference.com/session/733/

Page 13: Web 3.0 Emerging

Linked Data + Semantics

• "Linked Data" approach finds its use cases in Web Applications (at Web scales)
  – A lot of data, a little semantics
  – Finding anything in the mess can be a win!
• Example
  – Declare simple inferable relationships and apply, at scale, to large, heterogeneous data collections
    • e.g. use InverseFunctional triangulation to find the entities that can be inferred to be the same
  – These are "heuristics": not every answer must be right (qua Google)
  – But remember time = money!
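
The InverseFunctional triangulation mentioned above can be sketched in a few lines. This is an illustrative toy, not the method used by any specific system: it assumes triples are plain Python tuples, treats `foaf:mbox` and `foaf:homepage` as the inverse functional properties of interest, and uses made-up `ex:` subjects. Two subjects that share a value for such a property are merged, and merges chain (a = b through a mailbox, b = c through a homepage, so a = c).

```python
from collections import defaultdict

# Properties assumed, for this sketch, to be inverse functional:
# a given value identifies at most one subject.
INVERSE_FUNCTIONAL = {"foaf:mbox", "foaf:homepage"}


def find(parent, x):
    """Return the representative of x's group, shortening the path as we go."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def smush(triples):
    """Group subjects that can be inferred to denote the same entity."""
    parent = {}
    first_subject = {}  # (property, value) -> first subject seen with it
    for s, p, o in triples:
        parent.setdefault(s, s)
        if p not in INVERSE_FUNCTIONAL:
            continue
        key = (p, o)
        if key in first_subject:
            # Same identifying value: merge the two subjects' groups.
            parent[find(parent, s)] = find(parent, first_subject[key])
        else:
            first_subject[key] = s
    groups = defaultdict(set)
    for s in parent:
        groups[find(parent, s)].add(s)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical FOAF-style data gathered from different sites.
triples = [
    ("ex:a", "foaf:mbox", "mailto:jim@example.org"),
    ("ex:b", "foaf:mbox", "mailto:jim@example.org"),
    ("ex:b", "foaf:homepage", "http://example.org/jim"),
    ("ex:c", "foaf:homepage", "http://example.org/jim"),
    ("ex:d", "foaf:name", "Jim"),
    ("ex:e", "foaf:name", "Jim"),
]
same_entities = smush(triples)
```

Note that `ex:d` and `ex:e` stay separate even though they share a name, because `foaf:name` is not inverse functional. In keeping with the slide, the result is a heuristic: a shared or mistyped mailbox would merge people who are in fact distinct.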

Page 14: Web 3.0 Emerging

Example: the linked open data cloud now has tens of billions of triples and is growing rapidly

The data is out there

Page 15: Web 3.0 Emerging

Government Data on the Web

Page 16: Web 3.0 Emerging

Moving data.gov to linked data (UK)

• Built around linked data with top-down push from “Number 10”

Page 17: Web 3.0 Emerging

Moving data.gov to linked data (US)

• Third parties (like RPI) translate the govt data into Sem Web forms and link to sources

• Plans for a semantic.data.gov in OGD implementation plans, but unfunded
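
As a rough illustration of what "translate the govt data into Sem Web forms and link to sources" can mean, the sketch below turns rows of a CSV file into N-Triples lines, each row pointing back to its source with the Dublin Core `dcterms:source` property. It is not the actual RPI conversion pipeline: the `http://example.org/dataset/` namespace, the dataset id, the sample data, and the function names are all assumptions, and column names are assumed to be safe to use inside an IRI.

```python
import csv
import io

# Hypothetical namespace for minted row and column identifiers.
BASE = "http://example.org/dataset/"
DCTERMS_SOURCE = "http://purl.org/dc/terms/source"


def escape(text):
    """Escape backslashes and double quotes for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def rows_to_ntriples(csv_text, dataset_id, source_url):
    """Emit one subject per CSV row, one triple per cell, plus a source link."""
    lines = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for index, row in enumerate(reader, start=1):
        subject = f"<{BASE}{dataset_id}/row/{index}>"
        # Link every row back to the original government dataset.
        lines.append(f"{subject} <{DCTERMS_SOURCE}> <{source_url}> .")
        for column, value in row.items():
            predicate = f"<{BASE}{dataset_id}/column/{column}>"
            lines.append(f'{subject} {predicate} "{escape(value)}" .')
    return lines


# Made-up sample data standing in for a downloaded dataset.
csv_text = "state,facilities\nNY,12\nCA,30\n"
lines = rows_to_ntriples(csv_text, "1234", "http://www.data.gov/")
```

Every cell becomes a plain string literal here. A more careful conversion would add datatypes and map columns onto shared vocabularies, which is where the linking described on these slides starts to pay off.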

Page 18: Web 3.0 Emerging

Adding Meta-data

Page 19: Web 3.0 Emerging

Pump through to Google Viz for demos

Page 20: Web 3.0 Emerging

Data.gov + epa.gov

Page 21: Web 3.0 Emerging

Adding some Web magic

Web Analytics

Social Data Networks

External Links

Page 22: Web 3.0 Emerging

Mashup w/Web content

Page 23: Web 3.0 Emerging

Linked Data (RDF, SPARQL)

Semantic Web (RDFS, owl)

Web 3.0

Web 2.0

Web 3.0 extends current Web applications using Semantic Web technologies and graph-based, open data.

Page 24: Web 3.0 Emerging

Semantic Search

IEEE Computer, Jan 2010; IEEE Computing Now, Feb 2010 (free)

Page 25: Web 3.0 Emerging

Semantic Search examples

T2 (twine.com) TipTop (feeltiptop.com)

Page 26: Web 3.0 Emerging

Web 3.0 examples

Semantic classified (bintro.com)

Page 27: Web 3.0 Emerging

Trialx.com

Page 28: Web 3.0 Emerging

Web 3.0 examples

Page 29: Web 3.0 Emerging

Web 3.0 examples

Social database (freebase.com)

Page 30: Web 3.0 Emerging

Tiptop health

Ok, so we have a way to go on this one

Page 31: Web 3.0 Emerging

Web 3.0 excitement (hype?)

• Significant and growing commercial interest…
  – Web: Google, Amazon, Travelocity…
  – Web 2.0: Facebook, Wikipedia, YouTube, Twitter…
  – Web 3.0: the big ones are still out there

Page 32: Web 3.0 Emerging

1B computers (mostly owned by a few large companies)

3B cell phones (most are Web enabled)

Page 33: Web 3.0 Emerging

Sem Web going mobile?

(Add social contexts)

Page 34: Web 3.0 Emerging

Summary

• The Semantic Web is going just fine, thank you
  – People asking “how,” not “why”
• So far the commercial driver has been “weak semantics”
  – In the enterprise
• Web 3.0 adds semantics as a value add to regular Web functionality
  – Semantic search
  – Semantic match
  – Semantic etc.
• The big one is still out there

Page 35: Web 3.0 Emerging

What’s next?

Still, like many researchers working in this area, Hendler is already looking beyond emerging Semantic Web strategies and related technologies that are now collectively called Web 3.0. “This stuff is new and exciting,” he says. “But I look at it this way: I started playing with the Semantic Web back in the 1990s. As a researcher, I’m not content to sit around and exploit Web 3.0; my job is to help create Web 4.0.”

“Engineering the Web’s Third Decade,” CACM March, 2010