(Day 2) Online Communication and Marketing · Topic - World Wide Web, Semantic Web and Schema.org...
© Copyright 2015 STI INNSBRUCK, www.sti-innsbruck.at
Online Communication and Marketing (Day 2)
Zaenal
Topic & Agenda
● Topic - World Wide Web, Semantic Web and Schema.org
● Agenda
○ 09.00 - 10.30
■ Introduction
■ The Internet and the World Wide Web
○ Break - 15 minutes
○ 10.45 - 12.15
■ Semantic Web
■ Markup Languages
■ Schema.org
○ Break - 30 minutes
○ 12.45 - 14.15
■ Pro-seminar
■ Working in groups (3-5 students) on the task at hand
1. INTRODUCTION
What do travel consumers do online?
(*) ETOA, “The New Online Travel Consumer”, 2014, http://www.etoa.org
Online sources of travel inspiration
(*) Think with Google, “The 2014 Traveler’s Road to Decision”, 2014, https://www.thinkwithgoogle.com
Top 10 online sources used in travel planning
(*) Think with Google, “The 2014 Traveler’s Road to Decision”, 2014, https://www.thinkwithgoogle.com
What travelers type into Google when they start to plan a trip
(*) Think with Google, “The 2014 Traveler’s Road to Decision”, 2014, https://www.thinkwithgoogle.com
Events in Landeck?
Events in Vienna?
Hotel Schwarzer Adler?
How old is David Alaba?
How is this possible?
The answer is the annotation of web pages with structured semantic data
- Search engines can then organize and display them more easily and in creative ways
How is it relevant?
• Semantically annotated web pages have higher online visibility
• More visible, higher-ranked web pages can attract more visitors
• More visitors mean more potential customers
• More potential customers increase your business success
2. FUNDAMENTALS OF THE INTERNET
Picture taken from: http://querosaber.sapo.pt/media/galeria_multimedia_v2/offline/19577.0.original.jpg
The Internet
https://en.wikipedia.org/wiki/File:Internet_map_1024_-_transparent,_inverted.png
• US Government (1960s): “robust, fault-tolerant communication via computer networks”
• ARPANET (1980s): “backbone for interconnection of regional academic and military networks”
• 1990s: birth of the “modern Internet” through the merging of
– Academic networks
– Military networks
– Commercial enterprise networks
Source: https://en.wikipedia.org/wiki/Internet
The Internet
http://www.bitrebels.com/technology/the-growth-of-the-internet-infographic/
http://www.internetworldstats.com/emarketing.htm
The Internet
Architecture
– Globally connected network of computers
– Currently 2.9 billion “things” connected [1]
– Estimated 25 billion by the end of 2020 → Internet of Things
Source: https://en.wikipedia.org/wiki/Internet
[1] http://www.zdnet.com/article/25-billion-connected-devices-by-2020-to-build-the-internet-of-things/
The Internet
Evolution (1945 to 1995)
• 1945: Memex conceived
• 1948: A Mathematical Theory of Communication
• 1958: Silicon chip
• 1962: First vast computer network envisioned
• 1964: Packet switching invented
• 1965: Hypertext invented
• 1969: ARPANET
• 1972: TCP/IP created
• 1984: Internet named and goes TCP/IP
• 1989: WWW created
• 1993: Mosaic created
• 1995: Age of eCommerce begins
Source: http://www.isoc.org/internet/history2002_0918_Internet_History_and_Growth.ppt
The Internet
Energy use:
2011 estimate: “170–307 GW, less than 2% of energy used by humanity”
The estimate includes building, operating and replacing:
• 750M laptops
• 1B smartphones
• 100M servers
• Routers, cell towers, optical switches, Wi-Fi transmitters and cloud storage devices
The Internet
Services based on the Internet: Communication
● E-mail: messages and attachments are sent over the Internet infrastructure
○ Protocols in use: SMTP, POP, IMAP
● Chat: short-message-based communication
○ Protocol: e.g. IRC (Internet Relay Chat)
○ Typically: install a client, connect to a server, start a conversation
○ Examples: ICQ, Skype, Talker, Windows Live Messenger
● Internet telephony (Skype, ...)
○ The Internet carries voice traffic
○ Calls are free or cost much less
○ Serious competitor to traditional telephony
○ Also known as VoIP (Voice over Internet Protocol)
The Internet
Services based on the Internet: Data transfer
● File sharing
○ Uploading files to a server for storing and sharing: FTP
○ Peer-to-peer sharing of large files: Torrent (BitTorrent)
● Streaming media: “real-time delivery of digital media for the immediate consumption or enjoyment by end users”
○ Live: radio stations, TV, ...: FM4, ORF TVthek, Das Erste Mediathek
○ On demand: podcasts, Netflix, Sky, Spotify, Pandora, ...
● Webcams
○ Weather cameras, animal watch, traffic monitoring, surveillance, sports, “online live shows”, video chat, live demos, ...
The Internet
Services based on the Internet: World Wide Web
3. THE WORLD WIDE WEB
Picture taken from: http://webfoundation.org/about/vision/history-of-the-web/
The World Wide Web
Vinton G. Cerf & Sir Tim Berners-Lee
29 October 2014, W3C20 Anniversary Symposium
The World Wide Web
“Invented” by Sir Tim Berners-Lee (TBL) while employed at CERN
→ in 1989 TBL wrote the proposal for the system that became the “World Wide Web” [1]
→ TBL wrote the first web browser and web server
→ TBL wrote the first web page
[1] http://www.w3.org/History/1989/proposal.html
The World Wide Web
“information space where documents and other web resources are identified by URLs, interlinked by hypertext links, and can be accessed via the Internet”
[Diagram: a website, its resources identified by URIs, and a hyperlink (link) pointing to a URL]
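As a small illustration of how a URL identifies a resource and how a hyperlink points to another one, the following Python sketch splits a URL into its parts and resolves a relative link against it. Only the host name comes from the slides; the page path and the link target are made up for the example.

```python
from urllib.parse import urlparse, urljoin

# Hypothetical page URL: only the host name appears in the slides.
page_url = "http://www.sti-innsbruck.at/teaching/index.html"

parts = urlparse(page_url)
print(parts.scheme)   # protocol used to access the resource
print(parts.netloc)   # web server that hosts the resource
print(parts.path)     # location of the document on that server

# A hyperlink inside a page may be relative; the browser resolves it
# against the URL of the page that contains it.
link_target = urljoin(page_url, "slides/day2.html")
print(link_target)
```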
The World Wide Web
Web “1.0”: the first stage of the WWW, the static web
“collection of text documents and other resources, linked by hyperlinks and URLs, usually accessed by web browsers, from web servers”
• Netscape
– Netscape is associated with the breakthrough of the Web.
– Netscape rapidly gained a large user community, which made it attractive for others to present their information on the Web.
• Google
– Google is the incarnation of Web 1.0 mega growth
– By 2008, Google had already found more than 1 trillion unique URLs on the Web [*]
– Google and similar search engines showed that a piece of information can be found again faster on the Web than in one's own bookmark list
[*] http://googleblog.blogspot.com/2008/07/we-knew-web-was-big.html
The World Wide Web
Web 2.0
“The term "Web 2.0" (2004–present) is commonly associated with web applications that facilitate interactive information sharing, interoperability, user-centered design, and collaboration on the World Wide Web”
• Web 2.0 is a vaguely defined phrase referring to various topics such as social networking sites, wikis, communication tools, and folksonomies.
• Tim Berners-Lee is right that all these ideas already underlie his original ideas for the Web. However, there are differences in emphasis that may cause a qualitative change.
• With Web 1.0 technology, significant software skills and investment in software were necessary to publish information.
• Web 2.0 technology changed this dramatically.
http://en.wikipedia.org/wiki/Web_2.0
The World Wide Web
The four major breakthroughs of Web 2.0 are:
1. Blurring the distinction between content consumers and content providers.
2. Moving from media for individuals towards media for communities.
3. Blurring the distinction between service consumers and service providers.
4. Integrating human and machine computing in a new and innovative way.
The World Wide Web
1. Blurring the distinction between content consumers and content providers
Wikis, blogs, and Twitter turned the publication of text into a mass phenomenon, as Flickr and YouTube did for multimedia
The World Wide Web
2. Moving from media for individuals towards media for communities
Social web sites such as del.icio.us, Facebook, FOAF, LinkedIn, Myspace and Xing allow communities of users to smoothly interweave their information and activities
The World Wide Web
3. Blurring the distinction between service consumers and service providers
Mashups allow web users to easily integrate services implemented by third parties into their own web sites
The World Wide Web
4. Integrating human and machine computing in a new and innovative way
Amazon Mechanical Turk provides access to human services through a web service interface, blurring the distinction between manually and automatically provided services
The World Wide Web
(*) K. Bratcher, “Web: The History of the Internet”, https://www.tes.com/lessons/hMm6KQB3x9wzPw/web-the-history-of-the-internet
The World Wide Web
But...
The current Web has its limitations when it comes to:
1. finding relevant information
2. extracting relevant information
3. combining and reusing information
The World Wide Web
• Finding information on the current Web is based on keyword search
• Keyword search has limited recall and precision due to:
– Synonyms:
• e.g. searching for information about “Cars” will ignore web pages that contain the word “Automobiles”, even though the information on these pages could be relevant
– Homonyms:
• e.g. searching for information about “Jaguar” will bring up pages about both “Jaguar” (the car brand) and “Jaguar” (the animal), even though the user is interested in only one of them
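The synonym and homonym problems can be reproduced with a few lines of Python. This is only a toy sketch: the four "pages" and the function name keyword_search are made up, and real search engines are far more sophisticated than exact word matching.

```python
# Four tiny, made-up "web pages" (illustrative data only).
pages = {
    "page1": "Cheap cars for sale in Innsbruck",
    "page2": "Used automobiles and spare parts",
    "page3": "The jaguar is a large cat living in the Americas",
    "page4": "Jaguar presents its new sports car",
}

def keyword_search(query, documents):
    """Return the ids of all documents that contain the query word."""
    query = query.lower()
    return sorted(
        doc_id for doc_id, text in documents.items()
        if query in text.lower().split()
    )

# Synonyms hurt recall: page2 is relevant but is not found.
cars_hits = keyword_search("cars", pages)

# Homonyms hurt precision: the animal and the car brand are both returned.
jaguar_hits = keyword_search("jaguar", pages)

print(cars_hits)
print(jaguar_hits)
```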
The World Wide Web
• Keyword search also has limited recall and precision due to:
– Spelling variants:
• e.g. “organize” in American English vs. “organise” in British English
– Spelling mistakes
– Multiple languages
• i.e. information about the same topic is published on the Web in different languages (English, German, Italian, …)
• Current search engines provide no means to specify the relation between a resource and a term
– e.g. sell / buy
The World Wide Web
• A one-size-fits-all automatic solution for extracting information from web pages is not possible due to different formats and different syntaxes
• Even from a single web page it is difficult to extract the relevant information
Which book is about the web?
What is the price of the book?
The World Wide Web
• Extracting information from current web sites can be done using wrappers
[Diagram: a wrapper extracts, annotates and structures content, turning web HTML pages (layout) into structured data, databases and XML (structure)]
The World Wide Web
• The actual extraction of information from web sites is specified using standards such as XSL Transformations (XSLT) *)
• Extracted information can be stored as structured data in XML format or in databases.
• However, wrappers do not really scale, because the actual extraction of information again depends on the format and layout of each web site
*) https://en.wikipedia.org/wiki/XSLT
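To see why wrappers do not scale, consider the sketch below. The slides name XSLT; this example swaps in Python's standard html.parser instead, and both page layouts, the class names "title" and "price", and the book data are invented for illustration.

```python
from html.parser import HTMLParser

class BookWrapper(HTMLParser):
    """Hand-written wrapper for ONE hypothetical page layout: the title
    is in <h2 class="title">, the price in <span class="price">."""

    def __init__(self):
        super().__init__()
        self.current_field = None
        self.record = {}

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if (tag, css_class) == ("h2", "title"):
            self.current_field = "title"
        elif (tag, css_class) == ("span", "price"):
            self.current_field = "price"

    def handle_data(self, data):
        # Store the first non-empty text that follows a recognised tag.
        if self.current_field and data.strip():
            self.record[self.current_field] = data.strip()
            self.current_field = None

def extract(html):
    """Run the wrapper over a page and return the structured record."""
    wrapper = BookWrapper()
    wrapper.feed(html)
    return wrapper.record

# Two made-up shops that publish the same kind of information.
shop_a = ('<div><h2 class="title">Example Book about the Web</h2>'
          '<span class="price">39.90 EUR</span></div>')
shop_b = ('<table><tr><td><b>Example Book about the Web</b></td>'
          '<td>EUR 41.00</td></tr></table>')

record_a = extract(shop_a)  # matches the layout the wrapper was written for
record_b = extract(shop_b)  # different layout: nothing is extracted
print(record_a)
print(record_b)
```

Every new layout needs its own wrapper, which is the scaling problem described above.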
The World Wide Web
Tasks often require combining data from across the Web
1. Searching for the same information in different digital libraries
2. Information may come from different websites and needs to be combined
The World Wide Web
1. Searches for the same information in different digital libraries
Example: I want to travel from Innsbruck to Rome.
The World Wide Web
2. Information may come from different websites and needs to be combined
Example: I want to travel from Innsbruck to Rome where I want to stay in a hotel and visit the city
The World Wide Web
• Increasing automatic linking among data
• Increasing recall and precision in search
• Increasing automation in data integration
• Increasing automation in the service life cycle
Summary
● The World Wide Web is a service that runs on the Internet
● Web 1.0 (the static web)
○ Content is available on web servers and accessible through browsers
○ Limited to the passive viewing of content
● Web 2.0 (the social web)
○ User-generated content
○ High interaction and collaboration of users in a virtual community
Summary
● Extracting information from the Web is challenging:
○ Pages use different formats and different syntaxes
○ Information is located on distributed sources
○ Keyword-based search is not accurate enough
■ A search for “Car” misses pages about “Automobiles”
■ A search for “Jaguar”: the car brand or the animal?
● Adding semantics to data and services is the solution!
● Technical solution: The Semantic Web
4. THE SEMANTIC WEB
The Semantic Web
Short motivation movie: https://www.youtube.com/watch?v=off08As3siM
The Semantic Web
WWW (static): URI, HTML, HTTP
More than 2 billion users, more than 50 billion pages
The Semantic Web
WWW (static): URI, HTML, HTTP
Serious problems in
• information finding
• information extracting
• information representing
• information interpreting
• information maintaining
Semantic Web (static): RDF, RDF(S), OWL
The Semantic Web
“The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation.”
T. Berners-Lee, J. Hendler, O. Lassila, “The Semantic Web”, Scientific American, May 2001
The Semantic Web
• The next generation of the WWW
• Information has machine-processable and machine-understandable semantics
• Not a separate Web but an augmentation of the current one
• Ontologies are the backbone of the Semantic Web
Ontology definition
“formal, explicit specification of a shared conceptualization”
• formal: machine-readability with computational semantics
• explicit: unambiguous terminology definitions
• shared: commonly accepted understanding
• conceptualization: conceptual model of a domain (ontological theory)
Gruber, “Toward principles for the design of ontologies used for knowledge sharing?”, Int. J. Hum.-Comput. Stud., vol. 43, no. 5-6, 1995
… “well-defined meaning” …
• “An ontology is an explicit specification of a conceptualization”
Gruber, “Toward principles for the design of ontologies used for knowledge sharing?”, Int. J. Hum.-Comput. Stud., vol. 43, no. 5-6, 1995.
• Ontologies are the modeling foundation of the Semantic Web
– They provide the well-defined meaning for information
… explicit, … specification, … conceptualization, …
An ontology is:
• A conceptualization
– An ontology is a model of the most relevant concepts of a phenomenon from the real world
• Explicit
– The model explicitly states the type of the concepts, the relationships between them and the constraints on their use
• Formal
– The ontology has to be machine readable (the use of natural language is excluded)
• Shared
– The knowledge contained in the ontology is consensual, i.e. it has been accepted by a group of people.
Studer, Benjamins, D. Fensel, “Knowledge engineering: Principles and methods”, Data Knowledge Engineering, vol. 25, no. 1-2, 1998.
Ontology example
• Concept: conceptual entity of the domain
• Property: attribute describing a concept
• Relation: relationship between concepts or properties
• Axiom: coherency description between concepts / properties / relations via logical expressions
[Diagram: isA hierarchy (taxonomy) in which Student and Professor are Persons. Properties: Person has name and email; Student has matr. nr.; Professor has research field; Lecture has topic and lecture nr. Relations: Student attends Lecture; Professor holds Lecture.]
Axiom: holds(Professor, Lecture) => Lecture.topic = Professor.researchField
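The example ontology can be approximated in Python as follows. This is a rough sketch only: classes stand in for concepts, fields for properties, lists of pairs for relations, and a function for the axiom. A real ontology would be written in a language such as RDF(S) or OWL, and all instance data below is invented.

```python
from dataclasses import dataclass

@dataclass
class Person:                # concept with the properties name and email
    name: str
    email: str

@dataclass
class Student(Person):       # isA: every Student is a Person
    matr_nr: str

@dataclass
class Professor(Person):     # isA: every Professor is a Person
    research_field: str

@dataclass
class Lecture:
    topic: str
    lecture_nr: str

def axiom_holds(professor, lecture):
    """holds(Professor, Lecture) => Lecture.topic = Professor.researchField"""
    return lecture.topic == professor.research_field

# Invented instance data.
prof = Professor("Example Professor", "prof@example.org", "Semantic Web")
student = Student("Example Student", "student@example.org", "0000000")
lecture = Lecture("Semantic Web", "L-1")

# Relations stored as (subject, object) pairs.
holds = [(prof, lecture)]
attends = [(student, lecture)]

# Check the axiom for every "holds" relation.
violations = [(p, l) for p, l in holds if not axiom_holds(p, l)]
print(violations)
```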
Types of ontologies
• Top-level ontology (also called generic, core, foundational, high-level or upper ontology): describes very general concepts like space, time, event, which are independent of a particular problem or domain
• Domain ontology: describes the vocabulary related to a generic domain by specializing the concepts introduced in the top-level ontology
• Task & problem-solving ontology: describes the vocabulary related to a generic task or activity by specializing the top-level ontologies
• Application ontology: the most specific ontologies. Concepts in application ontologies often correspond to roles played by domain entities while performing a certain activity.
[Guarino, 98] Formal Ontology in Information Systems, http://www.loa-cnr.it/Papers/FOIS98.pdf
The Semantic Web is about…
• Web Data Annotation
– connecting (syntactic) Web objects, like text chunks, images, … to their semantic notion
– e.g., this image is about Innsbruck, Dieter Fensel is a professor
• Data Linking on the Web (Web of Data)
– global networking of knowledge through URI, RDF, and SPARQL
– e.g., connecting my calendar with my RSS feeds, my pictures, ...
• Data Integration over the Web
– seamless integration of data based on different conceptual models
– e.g., integrating data coming from my two favorite book sellers
Web Data Annotation
(*) Images: http://mist-deid.sourceforge.net/docs_2_0/html/use_ui.html
Data Linking on the Web (Web of Data)
(*) Linking Open Data cloud diagram 2014, by Max Schmachtenberg, Christian Bizer, Anja Jentzsch and Richard Cyganiak. http://lod-cloud.net/
Data Linking on the Web Principles
• Use URIs as names for things
– anything, not just documents
– you are not your homepage
– information resources and non-information resources
• Use HTTP URIs
– globally unique names, distributed ownership
– allows people to look up those names
• Provide useful information in RDF
– when someone looks up a URI
• Include RDF links to other URIs
– to enable discovery of related information
Data Linking on the Web
Google Knowledge Graph:
1. Find the right thing
2. Get the best summary
3. Go deeper and broader
(*) https://googleblog.blogspot.co.at/2012/05/introducing-knowledge-graph-things-not.html
Data integration over the Web
• Data integration involves combining data residing in different sources and providing users with a unified view of these data
• Data integration over the Web can be implemented as follows:
1. Export the data sets to be integrated as RDF graphs
2. Merge identical resources (i.e. resources having the same URI) from different data sets
3. Start making queries on the integrated data, queries that were not possible on the individual data sets.
Data integration over the Web
1. Export the first data set as an RDF graph
For example, the following RDF graph contains information about the book “The Glass Palace” by Amitav Ghosh
http://www.w3.org/People/Ivan/CorePresentations/SWTutorial/Slides.pdf
Data integration over the Web
1. Export the second data set as an RDF graph
Information about the same book, this time in French, is modeled in the RDF graph below
http://www.w3.org/People/Ivan/CorePresentations/SWTutorial/Slides.pdf
Data Integration over the Web
Same URI = Same resource
2. Merge identical resources (i.e. resources having the same URI) from different data sets
http://www.w3.org/People/Ivan/CorePresentations/SWTutorial/Slides.pdf
Data integration over the Web
3. Start making queries on the integrated data
– A user of the second dataset may ask queries like: “give me the title of the original book”
– This information is not in the second dataset
– However, this information can be retrieved from the integrated dataset, in which the second dataset is connected with the first dataset
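The three integration steps can be sketched in Python with plain sets of (subject, predicate, object) triples instead of a real RDF library. The URIs, the property names (a:title, f:original, ...) and the French title are assumptions made for this sketch; the slides' graphs are not reproduced in this transcript.

```python
# Step 1: each data set is exported as a set of triples.
# Placeholder URIs: the real ones are not shown in the transcript.
BOOK_EN = "http://example.org/isbn/original"
BOOK_FR = "http://example.org/isbn/translation"

dataset_1 = {
    (BOOK_EN, "a:title", "The Glass Palace"),
    (BOOK_EN, "a:author", "Amitav Ghosh"),
}

dataset_2 = {
    (BOOK_FR, "f:titre", "Le palais des miroirs"),
    (BOOK_FR, "f:original", BOOK_EN),
}

# Step 2: merging is a set union. Triples that mention the same URI
# now describe the same node of the merged graph.
merged = dataset_1 | dataset_2

def objects(graph, subject, predicate):
    """Return all objects of triples matching subject and predicate."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)

def original_title(graph, book):
    """Follow f:original, then read a:title of the original book."""
    return [
        title
        for original in objects(graph, book, "f:original")
        for title in objects(graph, original, "a:title")
    ]

# Step 3: the query fails on the second data set alone
# but succeeds on the integrated data.
answer_alone = original_title(dataset_2, BOOK_FR)
answer_merged = original_title(merged, BOOK_FR)
print(answer_alone)
print(answer_merged)
```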
Summary
• The Semantic Web extends the current Web with machine-processable and machine-understandable semantics
• Its backbone is formed by ontologies
• It is about
– Web Data Annotation
– Data Linking on the Web
– Data Integration over the Web
5. MARKUP LANGUAGES
Picture taken from: http://webfoundation.org/about/vision/history-of-the-web/
Markup language
• A system for annotating a document in a way that is distinguishable from the text
• In digital media, these annotations are known as “tags”
<strong>Online Communication and Marketing</strong>
<underline>Landeck, Tyrol, Austria</underline>
(*) https://en.wikipedia.org/wiki/Markup_language
(*) Image: https://persistentenlightenment.wordpress.com/2013/04/14/popperberlinpart/
HTML
“HyperText Markup Language”
• Invented by Tim Berners-Lee at CERN (first publicly described in 1991, first draft specification in 1993)
• Current version: 5
• Standard markup language to create web pages
• Interpreted by the browser
• HTML describes the structure of a website
– Semantically
– With cues for presentation
XML
• Extensible Markup Language
• A markup language for encoding documents in a format that is both human-readable and machine readable
• XML emphasizes simplicity, generality, and usability across the Internet
(*) https://en.wikipedia.org/wiki/XML
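A short sketch of what "both human-readable and machine-readable" means in practice: the XML text below can be read by a person and parsed by Python's standard library. The element names (event, location, city, ...) are invented for this example; XML itself does not prescribe them.

```python
import xml.etree.ElementTree as ET

# Made-up XML document; the tag names are chosen by the author of the data.
xml_text = """
<event>
  <name>Online Communication and Marketing</name>
  <location>
    <city>Landeck</city>
    <region>Tyrol</region>
    <country>Austria</country>
  </location>
</event>
"""

root = ET.fromstring(xml_text)

# Navigate the element tree with simple path expressions.
event_name = root.findtext("name")
city = root.findtext("location/city")
print(event_name, "-", city)
```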
JSON
• JavaScript Object Notation
• An open-standard format that uses human-readable text to transmit data objects consisting of attribute-value pairs
<name> : <value>
<key> : <value>
<field> : <value>
<attribute> : <value>
(*) https://en.wikipedia.org/wiki/JSON
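The attribute-value structure maps directly onto a Python dictionary. The following sketch parses a JSON text and serializes it again; the keys name, day and topics are made up for the example.

```python
import json

# A JSON text: attribute-value pairs, with a list as one of the values.
json_text = (
    '{"name": "Online Communication and Marketing", '
    '"day": 2, '
    '"topics": ["World Wide Web", "Semantic Web", "Schema.org"]}'
)

# Parse the text into Python objects (dict, int, list, str).
course = json.loads(json_text)
print(course["name"])
print(len(course["topics"]))

# Serialize the objects back into human-readable JSON text.
round_trip = json.dumps(course, indent=2)
print(round_trip)
```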
JSON-LD
• JavaScript Object Notation for Linked Data: provides a “context” that links the objects in a JSON document to concepts in an ontology
• Playground: http://json-ld.org/playground/
(*) https://en.wikipedia.org/wiki/JSON-LD
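The role of the context can be illustrated with a deliberately simplified Python sketch. The function expand_keys is a hypothetical helper that only replaces short keys by the IRIs listed in the context; the actual JSON-LD expansion algorithm has many more rules.

```python
import json

# A JSON-LD document: the "@context" maps short keys to ontology concepts.
document = {
    "@context": {
        "name": "http://schema.org/name",
        "jobTitle": "http://schema.org/jobTitle",
    },
    "@type": "http://schema.org/Person",
    "name": "Dieter Fensel",
    "jobTitle": "Professor",
}

def expand_keys(doc):
    """Replace short keys by the IRIs given in the @context.
    Simplified illustration, not the full JSON-LD expansion algorithm."""
    context = doc["@context"]
    return {
        context.get(key, key): value
        for key, value in doc.items()
        if key != "@context"
    }

expanded = expand_keys(document)
print(json.dumps(expanded, indent=2))
```

After expansion, a key such as "name" is no longer an arbitrary string but a globally unique identifier that other data sets can share.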
Microdata
HTML specification used to embed metadata within existing content on web pages
(*) https://en.wikipedia.org/wiki/Microdata_(HTML)
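As an illustration, the sketch below embeds Microdata attributes (itemscope, itemtype, itemprop) in a small HTML fragment and reads them back with Python's html.parser. The fragment and the description text are invented, and the parser is a minimal assumption-laden sketch that handles only a single flat item, not nested items.

```python
from html.parser import HTMLParser

# Made-up HTML fragment annotated with Microdata attributes.
html = """
<div itemscope itemtype="http://schema.org/Hotel">
  <span itemprop="name">Hotel Schwarzer Adler</span>
  <span itemprop="description">A hypothetical description</span>
</div>
"""

class MicrodataParser(HTMLParser):
    """Collects the item type and the itemprop values of one flat item."""

    def __init__(self):
        super().__init__()
        self.item_type = None
        self.current_prop = None
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemscope" in attrs:
            self.item_type = attrs.get("itemtype")
        self.current_prop = attrs.get("itemprop")

    def handle_data(self, data):
        # The text inside an element with itemprop is the property value.
        if self.current_prop and data.strip():
            self.properties[self.current_prop] = data.strip()
            self.current_prop = None

parser = MicrodataParser()
parser.feed(html)
print(parser.item_type)
print(parser.properties)
```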
RDFa
• Resource Description Framework in Attributes
• A set of attribute-level extensions to HTML and XML for embedding rich metadata within web documents
(*) https://en.wikipedia.org/wiki/RDFa
Summary
• Markup languages - used to annotate a document in a way that is distinguishable from the text
• HyperText Markup Language (HTML)
• Extensible Markup Language (XML)
• JavaScript Object Notation (JSON)
• JavaScript Object Notation for Linked Data (JSON-LD)
• Microdata
• Resource Description Framework in Attributes (RDFa)
6. SCHEMA.ORG
Schema.org
• Initiative founded in 2011 by:
– Bing
– Google
– Yahoo!
– Yandex (joined later in 2011)
• Vocabulary for structuring data in web sites
• Embedded into HTML via
– Microdata
– RDFa
– JSON-LD
Recipes
A set of instructions that describes how to prepare or make something
Events
An organized event that people may attend at a particular time and place
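A minimal sketch of how such an event could be annotated with the Schema.org vocabulary and embedded into a web page as JSON-LD. The types Event and Place and the properties name, startDate, location and address exist in Schema.org; the event itself, its date and its venue are invented for this example.

```python
import json

# Made-up event described with Schema.org types and properties.
event = {
    "@context": "http://schema.org",
    "@type": "Event",
    "name": "Example concert",            # invented event name
    "startDate": "2016-01-01T19:00",      # invented date
    "location": {
        "@type": "Place",
        "name": "Example venue",          # invented venue
        "address": "Landeck, Tyrol, Austria",
    },
}

# JSON-LD is embedded into HTML inside a script element of this type.
script_tag = (
    '<script type="application/ld+json">\n'
    + json.dumps(event, indent=2)
    + "\n</script>"
)
print(script_tag)
```

The resulting script element can be pasted into the page's HTML, where search engines that understand Schema.org can pick it up.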
Reviews
A review of an item such as restaurant, movie, or store
Products
Information about a product, including price, availability, and review ratings
Schema.org
(*) Google’s testing tool: https://search.google.com/structured-data/testing-tool
Summary
• World Wide Web Evolution
• The need for semantic structured data
• Markup Languages
• Markup Vocabulary