Wolf-Tilo Balke Philipp Wille Institut für Informationssysteme Technische Universität Braunschweig http://www.ifis.cs.tu-bs.de
Knowledge-Based Systems and Deductive Databases
13.1 Generating ontologies 13.2 Wisdom of the crowds 13.3 Folksonomies
13 Social Systems
• Last week we saw ontologies as a powerful instrument for…
  – Representing knowledge
  – And reasoning about it!
• Ontologies, rules and logics form the middle layer of the proposed Semantic Web stack – Formal syntax – Formal semantics
13.0 Semantic Web Reasoning
• OWL is the language (and semantics) of choice for the ontology part
  – But OWL DL has somewhat different semantics from RDF/S
  – And OWL Full is compatible with RDF/S, but computationally difficult…
13.0 Semantic Web Reasoning
[Figure label: (DLP)]
• Thus, the stack does not really consist of a set of languages building directly and completely on the lower languages (RDF/S → OWL → logic)
  – Also, a subsequent refinement to the ‘DL-program’ bit of OWL and the split between OWL and rule languages did not help much
  – RDF triples encode facts, but are also used to encode syntax…
    • Complex syntax is clumsy to write
    • Syntax is a true fact..?!
13.0 Semantic Web Reasoning
• While RDF/S (or at least the DLP bits) form a valid foundation for OWL, Datalog-style rule languages need other assumptions
  – Closed world semantics
  – Leads to full negation as failure (NAF)
  – …
• Whereas DLP is only a subset of Horn rules
  – And if it is interpreted with Herbrand models and CWA, it is no longer suitable for OWL…
13.0 Semantic Web Reasoning
Is there an overarching logic framework?
• Hmmmm… this leads to difficult questions…
  – If you want to join the debate:
    • P. Patel-Schneider: A Revised Architecture for Semantic Web Reasoning. In PPSWR’05, LNCS, Springer, 2005.
    • I. Horrocks, B. Parsia, P. Patel-Schneider, J. Hendler: Semantic Web Architecture: Stack or Two Towers? In PPSWR’05, LNCS, Springer, 2005.
  – Maybe rules on top of OWL..?!
13.0 Semantic Web Reasoning
• In any case, ontologies and logics are powerful once you have them, but how do we get the ontologies..?!
  – Experts create them, like in our Datalog expert systems?
    • Do all experts have the same world view? Can we simply extract their knowledge?
  – Create a common backbone and let all individual users build their extensions ‘as they go’?
    • How to keep the ontology consistent?
13.0 Semantic Web Reasoning
• Ontologies are extremely powerful and based on decidable logics, but…
  – Let one little hobbit (read: inconsistency) in and the entire thing comes crashing down…
13.0 Semantic Web Reasoning
• So,… do we always need a full-fledged ontology or are there other possibilities..?!
  – Depends on the area: a medical domain ontology should be sound and consistent!!!
  – But some ontology for document management or organizing your holiday photos..?!
13.0 Semantic Web Reasoning
• Medical Subject Headings (MeSH)
  – Controlled vocabulary for indexing journal articles and books in the life sciences
    • Taxonomy
    • Thesaurus
  – Maintained by the US National Library of Medicine (NLM)
    • Used to classify the MEDLINE/PubMed collections
    • Free for use and download
      – Proprietary XML or text format
      – HTML web view
  – MeSH is hand-crafted by medical experts
13.1 The MeSH Ontology
– Currently, MeSH contains around 25,000 subjects (descriptors)
  • Accompanied by a brief definition and a synonym list
  • Descriptors are arranged in a hierarchy and may occur multiple times in different branches
– Entries in the tree hierarchies are uniquely identified by an alphanumerical ID system
13.1 The MeSH Ontology
[Figure: top-level concepts of caries; caries types]
13.1 The MeSH Ontology
[Figure: MeSH descriptor/heading (concept) page showing tree ID, definition, synonyms, related concepts, and qualifiers (common tags)]
http://www.nlm.nih.gov/cgi/mesh/2009/MB_cgi?mode=&index=3573&field=all&HM=&II=&PA=&form=&input=
• Qualifiers encode commonly used tags – Can be added to all other headings – e.g. viral, microbiol, epidemic, etc
13.1 The MeSH Ontology
Qualifier shortcut
http://www.nlm.nih.gov/cgi/mesh/2009/MB_cgi?mode=&term=MI&field=qual
• By using MeSH, concept maps can be visualized – Help to quickly assess a given topic
13.1 The MeSH Ontology
http://www.curehunter.com/public/dictionary.do
[Figure: visual dictionary; it uses the co-occurrence of concepts in publications as weight indicator]
13.1 The MeSH Ontology
[Figure: typed links between concepts allow for “browsing”]
• However, such concept maps can easily become very large and complex
13.1 The MeSH Ontology
• MeSH is an example of an enriched taxonomy manually modeled by domain experts
  – Expert taxonomies are widely used; however, they come with problems
    • Inflexible and rigid structure representing just the authors’ view and knowledge
    • Hard to change once established, expensive to maintain
    • Hierarchical classification often not very practical
13.1 Hierarchical Expert Ontologies
• Example hierarchies:
  – Periodic Table of Elements, devised in 1869 by Dmitri Mendeleev
  – Probably the best classification scheme ever
  – But still, it is and was heavily disputed
    • Represented just the knowledge known by Mendeleev
    • e.g. initial version was missing noble gases
      – …by the way, is Helium really a gas? It becomes solid when cooled…
    • Ordering scheme changed from weight to atomic number
    • Inserted and added rows / columns, added categories, etc.
13.1 Hierarchical Expert Ontologies
• Dewey Decimal Classification (DDC)
  – Proprietary system for library classification, developed by Melvil Dewey in 1876
    • Updated in varying intervals (currently 22nd revision)
  – Used by, e.g. Library of Congress
  – Organizes everything in 10 main classes, each divided into 10 divisions, each of which has 10 sections
    • A less flexible variant of a system similar to the tree ID in MeSH
    • Strictly hierarchical
13.1 Hierarchical Expert Ontologies
• Currently, the main classes are
  000 – Computer science, information, and general works
  100 – Philosophy and psychology
  200 – Religion
  300 – Social sciences
  400 – Languages
  500 – Science and Mathematics
  600 – Technology and applied science
  700 – Arts and recreation
  800 – Literature
  900 – History and geography and biography
  – e.g. 025 is library management
13.1 Hierarchical Expert Ontologies
• One of the main problems is inflexibility and the inability to further model relationships between entries
  – Also, all entries are considered to be co-equal
  – Until recently, classification for the top concept 200 – Religion looked like this:
13.1 Hierarchical Expert Ontologies
200 – Religion
210 Natural theology
220 Bible
230 Christian theology
240 Christian moral & devotional theology
250 Christian orders & local church
260 Christian social theology
270 Christian church history
280 Christian sects & denominations
290 Other religions
• In the 1990s, Yahoo! started to classify the World Wide Web
  – For this task, ontology experts were hired to create the classification hierarchy
  – Often, this classification was quite difficult and awkward
  – Also, links between entries were necessary
    • Strict hierarchical modeling not sensible
13.1 Hierarchical Expert Ontologies
[Figure: Yahoo! directory examples: “Books are not entertainment, link to Humanities!”; “Booksellers go to professions”]
• Thus, the transition was made from strictly hierarchical to linked hierarchical taxonomies
13.1 Hierarchical Expert Ontologies
• In case of highly unstructured domains, capturing information in a hierarchical way becomes increasingly difficult
  – More and more links, hierarchy less and less useful
• Modeling becomes more and more complex
  – Idea: Just omit the hierarchy part and use only links
    • Folksonomies
    • Automatically generated lightweight ontologies
13.1 Hierarchical Expert Ontologies
• Of course, the manual creation of ontologies is an expensive and error-prone process
  – Is there a possibility to create ontologies automatically?
  – It’s a current research question, but first approaches lead to semi-automatic procedures…
    • Basically all approaches mine statistical connections between terms…
13.1 Ontology Generation
• A major group of approaches for taxonomy creation is based on natural language processing
  – Gathering simple typical phrases from full texts like “…such as…” or “…like e.g.,…” to find synonyms or subclasses
    • The surrounding noun phrases can be put into some (hierarchical) relationship
    • The belief in the correctness of derived classes and/or hierarchies can be supported by comparison to general ontologies like WordNet or by counting co-occurrences, e.g., in documents retrieved from Google
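The phrase-pattern idea can be sketched in a few lines of Python. This is only a minimal illustration, not a real NLP pipeline: it assumes that single words stand in for noun phrases, and the names `SEPARATOR`, `SUCH_AS`, and `extract_subclass_pairs` are hypothetical.

```python
import re

# Separators between enumerated items: ", ", " and ", " or " (optionally after a comma)
SEPARATOR = r",? and |,? or |, "
# Naive pattern "<class> such as <item>, <item> and <item>";
# a real system would use a parser to find full noun phrases.
SUCH_AS = re.compile(r"(\w+),? such as (\w+(?:(?:" + SEPARATOR + r")\w+)*)")


def extract_subclass_pairs(text):
    """Return (subclass, class) candidates found via the 'such as' pattern."""
    pairs = []
    for match in SUCH_AS.finditer(text):
        parent = match.group(1).lower()
        for item in re.split(SEPARATOR, match.group(2)):
            pairs.append((item.lower(), parent))
    return pairs


pairs = extract_subclass_pairs(
    "He studies diseases such as caries, measles and influenza."
)
```

Each extracted pair is only a candidate; as noted above, it would still have to be checked, e.g., against WordNet or co-occurrence counts.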
13.1 Ontology Generation
• Or, domain ontologies can be derived relying on simple statistics, e.g., term co-occurrence
  – Extract all salient keywords from each document
  – Keyword X subsumes keyword Y if at least 80% of the documents in which Y occurs also contain X, and if X occurs in more documents than Y
  – Works only if a sufficiently large number of documents for a certain domain is given
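The subsumption rule above can be written down directly. The following is a small sketch under the assumption that keyword extraction has already happened and each document is given as a set of keywords; the function name `subsumptions` and the toy documents are made up for illustration.

```python
def subsumptions(doc_keywords, threshold=0.8):
    """Return sorted pairs (x, y) meaning 'keyword x subsumes keyword y'.

    doc_keywords: list of keyword sets, one set per document.
    """
    # Document frequency of every keyword
    df = {}
    for keywords in doc_keywords:
        for k in keywords:
            df[k] = df.get(k, 0) + 1

    result = []
    for x in df:
        for y in df:
            if x == y:
                continue
            # Number of documents containing both x and y
            both = sum(1 for kws in doc_keywords if x in kws and y in kws)
            # x subsumes y: x covers >= threshold of y's documents and is more frequent
            if both / df[y] >= threshold and df[x] > df[y]:
                result.append((x, y))
    return sorted(result)


docs = [
    {"disease", "caries"},
    {"disease", "caries", "teeth"},
    {"disease", "influenza"},
    {"disease"},
    {"teeth"},
]
hierarchy = subsumptions(docs)
```

With only five toy documents the result is of course not meaningful; the rule needs a large domain corpus, as stated above.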
13.1 Ontology Generation
• Please note, all these techniques are heuristics…
  – The Semantic Web does not really understand the contents of the pages (not yet..?!)
  – But still,… better than nothing…
  – Thus, the question arises: Can purely statistical approaches lead to reasonably intelligent results?
13.1 Ontology Generation
• Just a little anecdote for the start:
• Sir Francis Galton (1822-1911)
  – Victorian polymath with special interest in statistics
    • Established principles for correlation, deviation, and regression
  – Special interest in research methodologies of eugenics, heredity, genetics, and historiometry
    • Claim: Intelligence and leadership properties are inherited, and only few people possess them. And only those are able to lead and act intelligently.
13.2 Wisdom of Crowds
• In 1906, he visited a country fair which also featured an ox weight-guessing contest
  – An ox is presented, everybody guesses how much the meat will weigh after slaughtering, closest bet wins
  – ~800 people participated
    • Farmers, housewives, cattle experts, random visitors, children, etc.
  – Galton’s claim:
    • Experts will win, the other people will just guess nonsense, crowd consensus will be useless
  – Statistical analysis
    • The ox weighed 1,198 pounds, the average guess of all people was 1,197 pounds, no single guess was better than the crowd consensus
  – Galton afterwards:
    • “The result seems more creditable to the trustworthiness of a democratic judgment than one might have expected.”
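Why averaging works so well can be illustrated with a small simulation. The sketch below does not use Galton's data: it assumes 800 independent, unbiased guesses with a made-up spread of 75 pounds around the true weight quoted above, and then compares the error of the crowd average with the typical error of a single guess.

```python
import random
import statistics

TRUE_WEIGHT = 1198  # pounds, the figure quoted in the anecdote

rng = random.Random(42)  # fixed seed so the sketch is repeatable
# Assumption: independent, unbiased guesses with a (hypothetical) spread of 75 pounds
guesses = [rng.gauss(TRUE_WEIGHT, 75) for _ in range(800)]

crowd_estimate = statistics.mean(guesses)
crowd_error = abs(crowd_estimate - TRUE_WEIGHT)
# Average distance of a single guess from the truth
typical_individual_error = statistics.mean([abs(g - TRUE_WEIGHT) for g in guesses])

print(f"crowd error: {crowd_error:.1f} lb, "
      f"typical individual error: {typical_individual_error:.1f} lb")
```

Individual errors cancel out in the average as long as the guesses are independent and not systematically biased; both assumptions are built into this simulation and need not hold for a real crowd.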
13.2 Wisdom of Crowds
• This observation fueled new research on crowd decisions
• Experiments and observed events
  – Bean guessing games
    • Crowd estimate always very good
  – “Who wants to be a millionaire” joker
    • 91% success rate vs. 65% expert success
  – Predicting outcomes of sport events
    • Aggregated bets are usually more accurate than any expert guess
13.2 Wisdom of Crowds
• In 1968, the nuclear submarine USS Scorpion mysteriously disappeared
  – Search for the sub was hopeless and was abandoned
  – However, Dr. John Craven from the Navy’s Special Projects continued the search with his team of mathematicians
  – Idea:
    • Provide all known evidence to a large group of people and teams
      – Submarine experts, salvage experts, oceanologists, mathematicians, ship captains, etc.
    • Each team should develop a theory of what happened and where the submarine was
    • Craven combined all theories (wildly diverse) using Bayes’ theorem
    • Submarine wreckage was located 200 meters off the combined estimated location
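How several diverse theories might be combined with Bayes' theorem can be sketched as follows. This is not Craven's actual procedure: it is a naive illustration that assumes a uniform prior over a few search cells and treats each team's estimate as an independent likelihood; the cell names and all numbers are invented.

```python
def combine_estimates(prior, team_likelihoods):
    """Posterior over search cells: prior times all team likelihoods, normalized."""
    posterior = dict(prior)
    for likelihood in team_likelihoods:
        for cell in posterior:
            posterior[cell] *= likelihood[cell]
    total = sum(posterior.values())
    return {cell: p / total for cell, p in posterior.items()}


cells = ["A", "B", "C"]                 # hypothetical search areas
prior = {c: 1 / len(cells) for c in cells}
teams = [                               # invented per-team beliefs
    {"A": 0.5, "B": 0.4, "C": 0.1},
    {"A": 0.2, "B": 0.6, "C": 0.2},
    {"A": 0.3, "B": 0.5, "C": 0.2},
]
posterior = combine_estimates(prior, teams)
best_cell = max(posterior, key=posterior.get)
```

No single team has to be certain: cell B wins here because every team considers it at least plausible.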
13.2 Wisdom of Crowds
• Observation
  – Under certain restrictions, large crowds of people are able to make highly effective decisions
    • Far superior to nearly all singular decisions
    • Some care and control is required to prevent this approach from failing miserably
  – Further reading:
    • James Surowiecki: The Wisdom of Crowds, 2004
    • TED Talk at https://www.ted.com/speakers/james_surowiecki
13.2 Wisdom of Crowds
• Group intelligence can effectively be used on three types of problems
• Cognition Problems – Judging and Processing Information – Examples • Guessing, assessing, predicting, modeling,… • “Who will win Germany’s Next Top model?” • “How many beans are in the jar?” • “How many VW Golfs will be sold in the next term?” • “Which movie should one watch who liked Star Wars?” • “How can the music TOP-100 be classified into genres?”
13.2 Principles of Group Intelligence
• Coordination Problems
  – How to coordinate one’s own behavior with all others, knowing that they try to do the same?
  – Often, coordination problems encode cultural behavior
  – Examples
    • Navigating in heavy traffic
    • Using the seats in a lecture hall
    • How to figure out a good price for used items?
13.2 Principles of Group Intelligence
• Cooperation Problems
  – Get self-centered, distrustful people to work together for a greater good
  – Forming networks of trust without necessity of a central controller
  – Examples
    • The free market
    • Paying taxes
    • Dealing with pollution
13.2 Principles of Group Intelligence
• For obtaining ‘wise group decisions’, some key criteria have to be met
• Diversity of opinion
  – Each person should have private information, even if it's just an eccentric interpretation of the known facts
  – Opposing opinions usually increase the accuracy of group decisions by either…
    • Canceling out each other’s mistakes
    • Or fostering discussion among group members
  – However, it is important that everybody has an understanding of the problem
    • You don’t need to ask kindergarten children about the potential cause of the SARS epidemic
    • Diversity means diversity of knowledgeable opinions, not any opinions!
13.2 Principles of Group Intelligence
  – Groups being too homogeneous will not be able to tap into the power of their numbers
    • Too much of just the same comes up
• Independence
  – People's opinions should not be determined by the opinions of those around them
  – The strength of group decision making comes from the diversity of opinion, which will be lost if the group members are not independent
  – Dominating members will affect the decisions of the other members
    • Hype bubbles
    • “Monkey see, monkey do”
13.2 Principles of Group Intelligence
  – Especially, information cascades cancel the original diversity by homogenizing the group’s opinions and reducing its effectiveness
    • People observing others and adopting the observed decision as their own without further reflection
    • People adopting opinions of their superiors
    • Often leads to irrational and erratic herd behavior
      – “The Emperor’s New Clothes”
      – Telecom stocks
      – New economy bubble
13.2 Principles of Group Intelligence
• Decentralization
  – People are able to specialize and draw on local knowledge
    • Crucial to tap into people’s tacit knowledge
  – Specialization adds more diversity to the group
    • Specialists for a certain area provide more valuable input than non-specialists for a special problem
  – Using local knowledge allows for optimized solutions for special cases compared to central generic solutions
13.2 Principles of Group Intelligence
  – Example:
    • Open source software
      – Specialists from all areas work together in decentralized fashion
    • Ancient Athens
      – Local law and organization is left to regional magistrates, the central assembly only dealt with “great matters”
    • Ant or bee hives
      – Insects just act on their own, forming their behavior around local circumstances without central control
13.2 Principles of Group Intelligence
  – Problems with decentralization without further adjustments
    • Wasted efforts
      – Many try to solve the same problem although it was already solved many times elsewhere
    • Crucial information does not propagate among the groups
      – Think of 9/11: most facts for predicting the incident were known, but scattered among all the intelligence agencies
13.2 Principles of Group Intelligence
• Aggregation
  – All individual efforts are lost if there is no mechanism for turning them into a collective decision
    • Bean or ox guessing
      – Compute the average
    • Sport betting
      – Aggregate bets in form of betting margins or ratios
    • Find lost submarines
      – Perform Bayesian aggregation
    • Program an operating system
      – Integrate code of contributors into the distributions /core /etc.
    • Intelligence services
      – Communicate and share information
    • …
13.2 Principles of Group Intelligence
• How can crowd intelligence be harnessed for our problems (i.e. dealing with knowledge in computer science)?
• Most popular example: Google PageRank!
• Base idea:
  – Each link from one page to another is a vote, i.e. the author thinks that the linked page is somewhat important
  – The more “votes” a page gets, the more important it is
  – Votes originating from important pages count more than those from unimportant ones
  – The votes thus propagate along all pages, encoding the common, aggregated belief of importance of all websites given by all website authors!
• Incorporating the crowd knowledge given by PageRank into traditional IR methods made Google the most successful search engine ever!
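The vote-propagation idea can be sketched as a simple power iteration. This is a textbook-style simplification, not Google's production algorithm: the damping factor of 0.85, the fixed number of iterations, and the tiny link graph are assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links: dict mapping a page to the list of pages it links to (its 'votes')."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page keeps a small base score independent of incoming votes
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its own importance evenly among the pages it votes for
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Page without outgoing links: spread its importance over all pages
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


links = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(links)
```

In this toy graph, page c collects the most votes, and page a outranks b and d because its single vote comes from the important page c.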
13.3 Folksonomies
• So, how to use crowds for actually modeling knowledge?
  – Observation around 2004: People enthusiastically enjoyed tagging content on the web
  – Idea arose that these tags can be used to represent common, shared knowledge similar to ontologies
    • The folksonomy was invented!
      – Usually credited to Thomas Vander Wal
    • Also known as collaborative tagging, social classification, social indexing, and social tagging
13.3 Folksonomies
• What is tagging?
  – A tag is just some word which is assigned to some resource and represents some informal meta-data
    • Tags are usually freely chosen by the tagger
13.3 Folksonomies
[Figure: resources annotated with freely chosen tags, e.g. “ROFL!”, “TU BS”, “IfIS”, “infernal prison”]
• The tags for a single resource can be represented by tag clouds
  – The bigger a tag appears, the more often it was used for this resource
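A tag cloud boils down to counting how often each tag was used and mapping the counts to font sizes. The sketch below assumes a linear scaling between a minimum and a maximum size; the function name `tag_cloud_sizes`, the size range, and the example tags are illustrative choices only.

```python
from collections import Counter


def tag_cloud_sizes(tags, min_size=10, max_size=30):
    """Map each tag of one resource to a font size that grows with its usage count."""
    counts = Counter(tags)
    lowest, highest = min(counts.values()), max(counts.values())
    span = max(highest - lowest, 1)  # avoid division by zero if all counts are equal
    return {
        tag: min_size + (count - lowest) * (max_size - min_size) / span
        for tag, count in counts.items()
    }


# All tags that different users assigned to one (hypothetical) resource
sizes = tag_cloud_sizes(["IfIS", "TU BS", "IfIS", "lol", "IfIS", "TU BS"])
```

Real tag clouds often use logarithmic scaling so that a few very popular tags do not dwarf everything else; the function expects at least one tag.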
13.3 Folksonomies
13.3 Folksonomies
[Figure: resources with their tags, e.g. “TU BS”, “IfIS”, “databases”, “RDB1”, “lol”]
• Now, a folksonomy could be built by, e.g., observing the co-occurrence of tags on resources
[Figure: tags linked by co-occurrence, e.g. “TU BS”, “IfIS”, “ROFL!”, “lol”, “databases”, “RDB1”]
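Observing the co-occurrence of tags on resources amounts to counting, for every pair of tags, on how many resources they appear together. A minimal sketch with invented resources and a hypothetical helper `tag_cooccurrence`:

```python
from collections import Counter
from itertools import combinations


def tag_cooccurrence(resource_tags):
    """Count for each (sorted) tag pair the number of resources carrying both tags."""
    counts = Counter()
    for tags in resource_tags.values():
        # sorted(set(...)) gives every unordered pair exactly once per resource
        for first, second in combinations(sorted(set(tags)), 2):
            counts[(first, second)] += 1
    return counts


resource_tags = {  # hypothetical resources and their tags
    "lecture-page": ["IfIS", "TU BS", "databases"],
    "course-page": ["IfIS", "databases", "RDB1"],
    "photo": ["TU BS", "lol"],
}
cooccurrence = tag_cooccurrence(resource_tags)
```

The resulting counts are the link weights between tags: the higher the count, the stronger the link.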
• What are folksonomies?
  – A folksonomy is a much weaker structure than description logic ontologies
    • No taxonomies and usually not even a vocabulary
  – A folksonomy just links some tags to some other tags
    • The tags themselves as well as the links do not necessarily have to be meaningful
  – A folksonomy represents the self-emergent semantics of the collaborative tagging effort
13.3 Folksonomies
• Formal representation of folksonomies
  – A folksonomy $T$ can be represented by a tripartite hypergraph $H(T) = \langle V, E \rangle$
    • Vertices $V = A \cup C \cup I$ are partitioned into the disjoint sets
      – set $A$ of actors/users,
      – set $C$ of tags/concepts,
      – set $I$ of instances/objects.
    • Each tagging represents an edge between an actor, tag, and instance
      – $E = \{\{a, c, i\} \mid (a, c, i) \in T\}$
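The definition translates almost literally into code. In this sketch the folksonomy is a set of (actor, concept, instance) triples; prefixing every vertex with its kind is an assumption made here to keep the three vertex sets disjoint, and all example names are invented.

```python
def build_hypergraph(taggings):
    """Build (V, E) of the tripartite hypergraph from (actor, concept, instance) triples."""
    actors = {("actor", a) for a, _, _ in taggings}
    concepts = {("concept", c) for _, c, _ in taggings}
    instances = {("instance", i) for _, _, i in taggings}
    vertices = actors | concepts | instances
    # One hyperedge {a, c, i} per tagging
    edges = {
        frozenset({("actor", a), ("concept", c), ("instance", i)})
        for a, c, i in taggings
    }
    return vertices, edges


T = {  # hypothetical taggings
    ("alice", "databases", "page1"),
    ("bob", "databases", "page1"),
    ("alice", "lol", "photo7"),
}
V, E = build_hypergraph(T)
```

Every hyperedge connects exactly three vertices, one from each partition.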
13.3 Folksonomies
Ontologies are us: A unified model of social networks and semantics, Peter Mika, ISWC 2005
13.3 Folksonomies
[Figure: tripartite hypergraph $H(T) = \langle V, E \rangle$ over instances $I$, concepts $C$, and actors $A$, with example tags “lol”, “TU BS”, “IfIS”]
• Based on the hypergraph of $T$, three weighted bipartite graphs can be generated
  – The weight represents how often the two vertices in question were connected by edges in the hypergraph
  – The graph $AC$ of actors and concepts
  – The graph $CI$ of concepts and instances
  – The graph $AI$ of actors and instances
  – E.g., see the definition of $AC$:
    • $AC = \langle A \times C, E_{ac} \rangle$, $E_{ac} = \{(a, c) \mid \exists i \in I : (a, c, i) \in E\}$
    • $w : E \to \mathbb{N}$, $\forall e = (a, c) \in E_{ac}$: $w(e) := |\{i : (a, c, i) \in E\}|$
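The weight of an edge in the actor-concept graph is simply the number of instances an actor has tagged with that concept. A small sketch, with taggings given as (actor, concept, instance) triples and purely illustrative names:

```python
from collections import Counter


def actor_concept_graph(taggings):
    """Weighted bipartite graph AC: weight(a, c) = number of instances i with (a, c, i)."""
    weights = Counter()
    # set() removes accidental duplicates so that every instance is counted once
    for actor, concept, _instance in set(taggings):
        weights[(actor, concept)] += 1
    return weights


taggings = [  # hypothetical taggings
    ("alice", "databases", "page1"),
    ("alice", "databases", "page2"),
    ("bob", "databases", "page1"),
    ("alice", "lol", "photo7"),
]
ac = actor_concept_graph(taggings)
```

The graphs of concepts and instances and of actors and instances follow the same scheme, counting over the respective third component instead.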
13.3 Folksonomies
• Resulting weighted graph $CI$
  – Also called affiliation graph
    • Optionally, a threshold can be applied to remove weak edges
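Applying a threshold is just a filter over the edge weights. The sketch below assumes that the weight of a concept-instance edge is the number of actors who made that tagging; `affiliation_graph` and `min_weight` are hypothetical names.

```python
from collections import Counter


def affiliation_graph(taggings, min_weight=1):
    """Weighted concept-instance graph built from (actor, concept, instance) triples."""
    # Each distinct actor who attached concept c to instance i adds 1 to the edge weight
    weights = Counter((c, i) for _, c, i in set(taggings))
    # Drop weak edges below the threshold
    return {edge: w for edge, w in weights.items() if w >= min_weight}


taggings = [  # hypothetical taggings
    ("alice", "databases", "page1"),
    ("bob", "databases", "page1"),
    ("alice", "lol", "photo7"),
]
strong_edges = affiliation_graph(taggings, min_weight=2)
```

With a threshold of 2, the one-off tag "lol" disappears, which is one simple way to suppress noise from individual taggers.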
13.3 Folksonomies
[Figure: weighted bipartite graph between concepts $C$ (“lol”, “TU BS”, “IfIS”) and instances $I$, with edge weights]
• The affiliation graphs can be folded into two lightweight ontologies
  – i.e. for the affiliation graph $CI$, we can get
    • The lightweight ontology of related concepts
    • The lightweight ontology of related instances
  – Those ontologies represent how strongly their contained entities are related
    • Similar to counting co-occurrence
  – Mathematically, this can be achieved by multiplying the matrix of the affiliation graph with its transpose, normalizing it with the Jaccard coefficient, etc.
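Folding can be sketched without any matrix library. The product used here is that of the binary concept-by-instance matrix $M$ with its transpose, $M M^T$, whose entries count shared instances; the Jaccard coefficient then normalizes these counts. Function name and data are illustrative assumptions.

```python
def fold_concepts(affiliation):
    """affiliation: dict mapping a concept to the set of instances it is attached to.

    Returns Jaccard-normalized relatedness for every pair of concepts sharing an instance.
    """
    concepts = sorted(affiliation)
    instances = sorted(set().union(*affiliation.values()))
    # Binary concept x instance matrix M
    m = [[1 if i in affiliation[c] else 0 for i in instances] for c in concepts]
    # (M * M^T)[x][y] = number of instances shared by concepts x and y
    shared = [
        [sum(a * b for a, b in zip(row_x, row_y)) for row_y in m]
        for row_x in m
    ]
    related = {}
    for x, concept_x in enumerate(concepts):
        for y, concept_y in enumerate(concepts):
            if x < y and shared[x][y] > 0:
                # Jaccard: |X and Y| / |X or Y|; the diagonal holds |X| and |Y|
                union = shared[x][x] + shared[y][y] - shared[x][y]
                related[(concept_x, concept_y)] = shared[x][y] / union
    return related


affiliation = {  # hypothetical affiliation graph
    "IfIS": {"page1", "page2", "page3"},
    "databases": {"page1", "page2"},
    "lol": {"photo7"},
}
related = fold_concepts(affiliation)
```

Multiplying in the other order, $M^T M$, yields the lightweight ontology of related instances in the same way.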
13.3 Folksonomies
• Lightweight ontology for concepts in delicious
13.3 Folksonomies
• An excerpt from the delicious lightweight ontology graph
13.3 Folksonomies
• Some shortcomings
  – No controlled vocabulary
    • e.g. sciencefiction vs. Science Fiction vs. science-fiction
    • LOL, ROFL, etc.
  – Handling of synonyms and homonyms
    • IfIS vs. Institut für Informationssysteme
    • Bachelor (degree) vs. Bachelor (unmarried male)
  – Questionable semantics of links
    • What does a link in a folksonomy mean? Does it mean something?
    • No formal reasoning possible!
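The missing controlled vocabulary can be mitigated a little by normalizing spelling variants before counting tags. The following sketch only lower-cases tags and strips spaces, hyphens, and underscores; the function name `normalize_tag` is hypothetical, and the approach deliberately does nothing about synonyms or homonyms.

```python
import re


def normalize_tag(tag):
    """Lower-case a tag and remove whitespace, hyphens, and underscores."""
    return re.sub(r"[\s_\-]+", "", tag.lower())


variants = ["sciencefiction", "Science Fiction", "science-fiction", "science_fiction"]
normalized = {normalize_tag(v) for v in variants}
```

All four spellings collapse into one tag, but "Bachelor" (degree) and "Bachelor" (unmarried male) remain indistinguishable, and an abbreviation is still not linked to its long form.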
13.3 Folksonomies
• delicious.com
13.3 Folksonomies
[Figure: delicious.com screenshot with the tags highlighted]
• flickr.com
13.3 Folksonomies
[Figure: flickr.com photo page with picture, tags, in-picture tags, comments, and group]
• Project 10X
  – “Industry Roadmap to Web 3.0 and Multibillion Dollar Market Opportunities”
    • Vast industrial report on the business future of the Semantic Web
    • i.e. marketing fluff, but still realistic
  – Web 2.0: Connecting people
  – Web 3.0: Connecting knowledge
    • Add a “knowledge layer” on top of the internet
    • Finally realize the Semantic Web vision
13.4 The Web 3.0?
http://www.project10x.com/
http://www.isoco.com/pdf/Semantic_Wave_2008-Executive_summary.pdf
• Claim: the support for creating the Web 3.0 is finally there – Semantic technologies embraced by many big players
13.4 The Web 3.0?
• Trends of Web 3.0
  – Semantic User Experience
    • “Intelligent user interfaces drive gains in user productivity & satisfaction”
    • Personalized, context-aware, immersive human-computer interaction
  – Semantic Social Computing
    • “Collective knowledge systems become the next killer app”
    • Enrich Web 2.0 technologies (blogging, tagging, social networking, wikis, etc.) with semantic layers
      – Tag ontologies, semantic wikis, semantic blogs, etc.
13.4 The Web 3.0?
  – Semantic Applications
    • “New capabilities, concepts of operation, & improved lifecycle economics”
    • Enhance enterprise-level off-the-shelf software (e.g. ERP, CRM, SCM, PLM, HR, etc.) with knowledge layers
      – Ontology-driven discovery of documents
      – Policy-driven processes modeled using ontologies
      – Business logic modeling
      – Automated agents and advisors
  – Semantic Infrastructure
    • “Hardware for semantic software”
    • New immersive display technologies for better data interaction, specialized processors, mega-broadband internet, “everything connected”
      – Ubiquitous computing
13.4 The Web 3.0?
• The future internet in 2020? Web 4.0?
13.4 The Web 3.0?
• Technologies for Web 3.0?
13.4 Web 3.0?
• Question Answering – Basic Techniques – Watson
13 Next Lecture