Big Data on the Web – What We Will Do

Big Data on the Web – What We Will Do: Jump into Action. Haklae Kim, PhD, April 2012


Transcript of Big Data on the Web – What We Will Do

Page 1: Big Data on the Web – What We Will Do

Big Data on the Web – What We Will Do Jump into Action

Haklae Kim, PhD, April 2012

Page 2: Big Data on the Web – What We Will Do

This Presentation

Today

• Introduction
• Open Data and Open Government Data
• The Semantic Web & Linked Data
• What We Will Do

Page 3: Big Data on the Web – What We Will Do

Web in Transition

Let’s Start

“a steady progression from a document-centric Web to one that is data-centric, including the mediation of semantics” (Source: Mike, 2007)

Page 4: Big Data on the Web – What We Will Do


Page 5: Big Data on the Web – What We Will Do

Big Data

Let’s Start

“data that becomes large enough that it cannot be processed using conventional methods”

“Big Data is like Sex in High School–Lots of people are talking about it, but few are having it.”

-Eric Hansen, SiteSpect founder and CEO

Page 6: Big Data on the Web – What We Will Do

This Presentation

Today

• Introduction
• Open Data and Open Government Data
• The Semantic Web & Linked Data
• What We Will Do

Page 7: Big Data on the Web – What We Will Do

Data on the Web

Overview

• Data is information about things
• Data is relations among things
• Data is something machines can process
• Data drives applications (e.g. web sites, mobile services)

Page 8: Big Data on the Web – What We Will Do

What is Open (Government) Data?

Definition

“Open”: material (data) is open if it can be freely used, reused and redistributed by anyone.

“Government data”: data and information produced or commissioned by government or government-controlled entities.

Source: Open Knowledge Foundation, 2010

Page 9: Big Data on the Web – What We Will Do

• Transparency

• Participation

• Collaboration

“My administration is committed to creating an unprecedented level of openness in Government.” – Barack Obama, “Memorandum for the Heads of Executive Departments and Agencies – Transparency and Open Government”, Jan 2009

Page 10: Big Data on the Web – What We Will Do

http://www.practicalparticipation.co.uk/odi/wp-content/uploads/2010/06/Open-Data-Impacts-Timeline-Draft-0.1.png

Page 11: Big Data on the Web – What We Will Do

The State of Open Government Data

Public Sector Dataset

Source: http://tinyurl.com/44rub56

Page 13: Big Data on the Web – What We Will Do

This Presentation

Today

• Introduction
• Open Data and Open Government Data
• The Semantic Web & Linked Data
• What We Will Do

Page 14: Big Data on the Web – What We Will Do


Source: http://www.scientificamerican.com/article.cfm?id=the-semantic-web-overview

Page 15: Big Data on the Web – What We Will Do

The Web as a Global Data Platform

Let’s Start

“... a system of interlinked hypertext documents accessed via the Internet”

Page 16: Big Data on the Web – What We Will Do

The Semantic Web is not about links between web pages.

[Diagram: DATA linked to DATA, covering all data including documents, services, people ...]

Page 17: Big Data on the Web – What We Will Do

Linked Data & The Semantic Web

Overview

“The Semantic Web isn't just about putting data on the web. It is about making links, so that a person or machine can explore the web of data. With linked data, when you have some of it, you can find other, related, data” - TBL.

5 Stars Open Linked Data

★ Make your stuff available on the Web
★★ Make it available as structured data
★★★ Use open, standard formats (instead of Excel)
★★★★ Use an open data format – URLs, descriptions
★★★★★ Link your data to other people’s data
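The five-star scheme above reads as cumulative: each star presupposes the ones before it. A minimal Python sketch of that reading follows. The `Dataset` fields and the `star_rating` function are hypothetical names chosen for illustration, not part of any standard tool.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical description of a published dataset (illustrative fields only)."""
    on_web: bool = False            # 1 star: available on the Web
    structured: bool = False        # 2 stars: available as structured data
    open_format: bool = False       # 3 stars: open, standard format (e.g. CSV instead of Excel)
    uses_urls: bool = False         # 4 stars: things are identified and described via URLs
    links_to_others: bool = False   # 5 stars: linked to other people's data


def star_rating(ds: Dataset) -> int:
    """Count stars cumulatively: a star only counts if all earlier ones are earned."""
    criteria = [ds.on_web, ds.structured, ds.open_format, ds.uses_urls, ds.links_to_others]
    stars = 0
    for met in criteria:
        if not met:
            break
        stars += 1
    return stars


pdf_report = Dataset(on_web=True)
csv_table = Dataset(on_web=True, structured=True, open_format=True)
print(star_rating(pdf_report))  # 1
print(star_rating(csv_table))   # 3
```

Stopping at the first unmet criterion is a design choice of this sketch: a dataset that uses an open format but is not structured still earns only one star.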

Page 18: Big Data on the Web – What We Will Do

Growth of Interlinks

Overview

… Linked Data provides the means to reach the goal of the Semantic Web – “the emergence of a Web of Data”

[Figure: growth of interlinks, snapshots dated 2007-05-01, 2007-10-08, 2007-11-10, 2008-02-28, 2008-03-31, 2008-09-18, 2009-03-05, 2009-03-27, 2009-07-14, 2010-09-22]

Page 19: Big Data on the Web – What We Will Do

19 October, 2011

295 interlinked datasets, approximately 31 billion triples

• DBpedia: Structured Wikipedia
• BBC: Multimedia Content
• Best Buy: Commercial Product
• UK Gov: Government Data
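To make "interlinked datasets" and "triples" concrete, here is a small Python sketch that models triples as (subject, predicate, object) tuples and follows a link from one dataset into another. The two toy datasets and every identifier in them (such as `bbc:programme/1`) are invented for illustration; they are not real data from DBpedia or the BBC.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Two toy datasets; all identifiers are made up for illustration.
encyclopedia = {
    Triple("dbp:London", "rdf:type", "dbp:City"),
    Triple("dbp:London", "dbp:country", "dbp:United_Kingdom"),
}
broadcaster = {
    Triple("bbc:programme/1", "bbc:about", "dbp:London"),  # interlink into the other dataset
    Triple("bbc:programme/1", "bbc:title", "A Walk Through London"),
}


def describe(graph: set, subject: str) -> dict:
    """Return all predicate -> object pairs known about a subject."""
    return {t.predicate: t.obj for t in graph if t.subject == subject}


def follow(graph_a: set, graph_b: set, subject: str, link: str) -> dict:
    """Follow one link from graph_a and look the target up in graph_b."""
    target = describe(graph_a, subject).get(link)
    return describe(graph_b, target) if target else {}


merged = encyclopedia | broadcaster  # merging two graphs is just set union
print(len(merged))
print(follow(broadcaster, encyclopedia, "bbc:programme/1", "bbc:about"))
```

Because both datasets use the same identifier for London, a consumer that holds one of them can discover facts in the other, which is the point of the interlinks counted above.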

Page 20: Big Data on the Web – What We Will Do

Linked Data and Open Government Data

Why


Page 21: Big Data on the Web – What We Will Do

Applications

Case Studies


DBpedia, BBC, New York Times, thedatahub

Page 22: Big Data on the Web – What We Will Do

Linked Government Data

Case Studies


UK, Data-Gov Wiki, Code for America, Data Market

Page 23: Big Data on the Web – What We Will Do

This Presentation

Today

• Introduction
• Open Data and Open Government Data
• The Semantic Web & Linked Data
• What We Will Do

Page 24: Big Data on the Web – What We Will Do

What is the Semantic Web for?

Question


Search, Inference, Intelligence, Standards

Page 25: Big Data on the Web – What We Will Do

Google’s Semantic Search

Case Studies

“People should be able to ask questions and we should understand their meaning, or they should be able to talk about things at a conceptual level. ... A lot of people will turn to things like the semantic Web as a possible answer to that.” - Google Vice President of Search Products & User Experience Marissa Mayer

schema.org is an initiative launched on 2 June 2011 by Bing, Google and Yahoo! to "create and support a common set of schemas for structured data markup on web pages."

http://schema.org/docs/full.html
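As a rough illustration of what "structured data markup on web pages" can look like, the Python sketch below renders a small HTML fragment using the microdata attributes `itemscope`, `itemtype` and `itemprop` with the schema.org `Person` type. The `person_markup` helper is a hypothetical name for this example; the sample name and URL are taken from this presentation's own slides.

```python
from html import escape


def person_markup(name: str, url: str) -> str:
    """Render a schema.org Person as an HTML fragment with microdata attributes."""
    return (
        '<div itemscope itemtype="http://schema.org/Person">\n'
        f'  <span itemprop="name">{escape(name)}</span>\n'
        f'  <a itemprop="url" href="{escape(url, quote=True)}">blog</a>\n'
        "</div>"
    )


print(person_markup("Haklae Kim", "http://blogweb.co.kr"))
```

A crawler that understands the shared vocabulary can read the `name` and `url` properties directly instead of guessing them from the page layout.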

The Knowledge Graph is a collection of information sources that help discern a user’s specified intent with each individual query. The graph is actually an encyclopedia with structured information obtained from the web (currently, 200 million entities).

Freebase is an open, Creative Commons licensed repository of structured data of almost 22 million entities. An entity is a single person, place, or thing connected by a graph.

Page 26: Big Data on the Web – What We Will Do

Apple’s Siri

Case Studies

Ask Siri how Apple recorded the best quarter in history for a tech company, and her answer should be: "Me."


Siri (Speech Interpretation and Recognition Interface) is an intelligent personal assistant and knowledge navigator which works as an application for Apple's iOS.

A Brief History

- In December 2007, Siri, Inc. was formed by Dag Kittlaus (CEO), Adam Cheyer (VP Engineering), and Tom Gruber (CTO/VP Design).
- Siri, Inc. went after funding and by November 2009 had secured a $15.5 million investment, which resulted in the creation of the first Siri application; it debuted on the iPhone 3GS in February 2010.
- Siri was acquired by Apple; the iPhone becomes the Virtual Personal Assistant.

Knowledge Navigator (1987): a concept described by former Apple Computer CEO John Sculley in his 1987 book, Odyssey.

(Source: http://www.youtube.com/watch?v=QRH8eimU_20)

Page 27: Big Data on the Web – What We Will Do


Active Ontology

Case Studies

A processing formalism where distinct processing elements are arranged according to ontology notions; an execution environment.

Basic concepts

* Ontology : A data structure

- Formal representation for domain knowledge

- Classes, attributes, relations

* Active Ontology : A processing environment

- Processing elements arranged according to ontology notions

- Communication channels

[Diagram: a “movie” concept node connected to “genre”, “actor” and “rating” nodes, each with a processing element (P); a rule set consists of rules, each with a condition and an action]

(Baur et al., 2007)
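The idea of processing elements arranged along an ontology, each carrying a rule set of condition/action pairs, can be sketched in a few lines of Python. This is a toy illustration only, not the Active platform's actual API: the `Node` class, the `evaluate` loop, the vocabularies and the sample utterance are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Facts = Dict[str, object]
Rule = Tuple[Callable[[Facts], bool], Callable[[Facts], None]]  # (condition, action)


@dataclass
class Node:
    """A concept in the ontology with its own rule set (hypothetical structure)."""
    name: str
    rules: List[Rule] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)


def word_rule(slot: str, vocabulary: set) -> Rule:
    """Leaf rule: if any word of the utterance is in the vocabulary, fill the slot."""
    def condition(facts: Facts) -> bool:
        return slot not in facts and any(w in vocabulary for w in facts["words"])

    def action(facts: Facts) -> None:
        facts[slot] = next(w for w in facts["words"] if w in vocabulary)

    return condition, action


def movie_rule() -> Rule:
    """Parent rule: once any child slot is filled, assemble a movie query."""
    slots = ("genre", "actor", "rating")

    def condition(facts: Facts) -> bool:
        return "movie" not in facts and any(s in facts for s in slots)

    def action(facts: Facts) -> None:
        facts["movie"] = {s: facts[s] for s in slots if s in facts}

    return condition, action


def evaluate(root: Node, facts: Facts) -> Facts:
    """Fire rules bottom-up over the fact store until no condition holds any more."""
    def visit(node: Node) -> bool:
        fired = False
        for child in node.children:
            fired = visit(child) or fired
        for condition, action in node.rules:
            if condition(facts):
                action(facts)
                fired = True
        return fired

    while visit(root):
        pass
    return facts


# Hypothetical ontology mirroring the movie / genre / actor / rating diagram.
genre = Node("genre", [word_rule("genre", {"action", "comedy", "drama"})])
actor = Node("actor", [word_rule("actor", {"chaplin", "keaton"})])
rating = Node("rating", [word_rule("rating", {"g", "pg", "r"})])
movie = Node("movie", [movie_rule()], [genre, actor, rating])

result = evaluate(movie, {"words": "find a pg comedy with chaplin".split()})
print(result["movie"])  # {'genre': 'comedy', 'actor': 'chaplin', 'rating': 'pg'}
```

Visiting children before their parent lets the leaf concepts claim words first, so the `movie` node only assembles its query from slots that are already filled.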

Page 28: Big Data on the Web – What We Will Do


Active Framework

Examples

a platform for constructing service-oriented applications which can be accessed through multiple modalities in a natural, task-oriented manner that leverages context throughout the experience

[Diagram components: Active Server, Active Editor, Active Console, Facts store, Evaluation Engine, several Active Ontologies]

Example queries: “find action movies in San Francisco”, “nearby Chinese restaurants”

(Baur et al., 2007)

Page 29: Big Data on the Web – What We Will Do

Difficult

Think

Hope is not a strategy and the “change” has been change for the worse, and not better.

What We Will Do

Interdisciplinary Collaboration


Page 30: Big Data on the Web – What We Will Do

References

- Charles Baur, Adam Cheyer, Didier Guzzoni, Active, a platform for building intelligent software
- Page 1: http://www.w3.org/DesignIssues/diagrams/websci/Marius%20Watz%20-%20Web%20Science%20artwork.png
- Page 4: http://www.go-gulf.com/60seconds.jpg
- Page 9: http://cloud.frontpagemag.com/wp-content/uploads/2012/03/obama11.jpg
- Page 27: http://www.patentlyapple.com/.a/6a0120a5580826970c0168e5ccdd81970c-800wi
- Page 29: http://programminggeeks.com/wp-content/uploads/2010/05/Programming-Geeks-Web-Science.jpg
- Page 29: http://3.bp.blogspot.com/-C0Kyck90Djo/T4KZTg3k1XI/AAAAAAAAAsE/RUp165S0FCQ/s1600/Commitment.jpeg

Page 31: Big Data on the Web – What We Will Do

For more information, contact Haklae Kim via [email protected]

Twitter: haklaekim

Or read up on the sonagi blog at: http://blogweb.co.kr