http://www.ontopia.net/© 2004 Ontopia AS XML 2004, Washington DC
Towards Seamless Knowledge: Integrating Public Sector Portals in Norway
Steve Pepper, Chief Strategy Officer, [email protected]
Topic Maps, Published Subjects, TMRAP
What is Topic Maps?
• Topic Maps is an ISO standard for Knowledge Integration
• It is the only international standard for Knowledge Integration
• But the more important question is…
What are Topic Maps used for?
• That’s like asking: What are relational databases used for?
• The answer is: A whole number of things, including (but not limited to):
– Organizing large bodies of information
– Capturing corporate memory
– Representing complex rules and processes
– Supporting concept-based eLearning
– Enabling Enterprise Knowledge Integration (EKI)
• But in particular…
• Any or all of the above, in combination!
• Topic Maps lets you achieve Seamless Knowledge
Seamless Knowledge
• General business problem addressed by Topic Maps:
– The disconnectedness of Information and Knowledge
• Seamless Knowledge
– A term coined within the Topic Maps community to describe the business benefits of applying Topic Maps
• There is growing awareness of the scale of this problem:
– Increased talk about “metadata”, “taxonomies”, “ontologies”, and “semantics”
– The META Group talks of a “near-impending crisis”
– What people are looking for is knowledge integration, i.e., Seamless Knowledge
– Topic Maps offers a standards-based solution
• Seamless Knowledge is not the same as the Semantic Web
– But there is some overlap and even more potential synergy
Semantic Portals
• One of many applications of Topic Maps
– Topic Maps is an ideal model for portals and other forms of web-based information delivery
• The basic concept is to have the topic map drive the portal
– Not just a navigational layer on top of something else
– The very structure of the portal is a topic map
– All content is organized around topics (“subject-centric organization”)
• Each page represents a topic (we call this a “Topic Page”)
– Topics act as points of collocation
– They provide a “one-stop shop” for everything that is known about a particular subject
• Navigating the portal == Navigating the topic map
– Associations provide very intuitive navigation (“As we may think”)
A Topic Page
[Screenshot of a Topic Page, annotated: the current topic; multiple names; (multiple) types; multiple typed occurrences; multiple typed associations]
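The parts of a Topic Page listed above can be sketched as a small data structure. This is only an illustration: the class names, field names, `render_topic_page` and the sample data (including the example.org address) are assumptions made for this sketch, not the API of any Topic Maps engine.

```python
from dataclasses import dataclass, field


@dataclass
class Occurrence:
    type: str     # kind of resource, e.g. "synopsis"
    locator: str  # address of the information resource


@dataclass
class Association:
    type: str         # kind of relationship, e.g. "composed by"
    other_topic: str  # id of the topic at the other end


@dataclass
class Topic:
    id: str
    names: list = field(default_factory=list)
    types: list = field(default_factory=list)
    occurrences: list = field(default_factory=list)
    associations: list = field(default_factory=list)


def render_topic_page(topic):
    """Return the lines of a plain-text Topic Page for one topic."""
    lines = [f"Topic: {topic.names[0] if topic.names else topic.id}"]
    lines += [f"  also known as: {name}" for name in topic.names[1:]]
    lines += [f"  type: {t}" for t in topic.types]
    lines += [f"  {o.type}: {o.locator}" for o in topic.occurrences]
    # Each association doubles as a navigation link to another Topic Page.
    lines += [f"  {a.type} -> {a.other_topic}" for a in topic.associations]
    return lines


# Hypothetical sample topic.
tosca = Topic(
    id="tosca",
    names=["Tosca"],
    types=["opera"],
    occurrences=[Occurrence("synopsis", "http://example.org/tosca/synopsis")],
    associations=[Association("composed by", "puccini")],
)
print("\n".join(render_topic_page(tosca)))
```

Because every association ends at another topic, following the rendered links is the same thing as navigating the topic map.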
Architecture of a Topic Maps Portal
[Diagram labels: users, web client, web server, topic map application, topic map, data and documents. Page elements: the current topic, multiple names, multiple occurrences, multiple associations]
The Rise and Rise of Semantic Portals in Norway
• In Norway, this concept has been put into practice on a scale that is now verging on the industrial…
– There are over a dozen topic map-driven portals in production
– More are on the way…
• And while the rest of the world is asking questions like
– “Metadata?” “Taxonomies?” “Ontologies?”
• …in Norway, customers are saying “Topic Maps!”
– RfPs regularly specify Topic Maps as a requirement
– Headhunters are looking for Topic Maps experts
– 120 people attended the last Topic Maps Congress (Norway: pop. 4 million)
– Topic Maps are quickly moving from “early adopter” to “early majority”
• How did this situation come about?
– The presence of Ontopia was important, but not enough on its own
– We needed a high visibility success story as well…
The ITU Story (in brief)
• Once upon a time, not long ago (in late 2000), …
• … the Network for IT Research and Competence in Education (ITU) was planning a new web site
• They had rather special requirements…
– “Relationships between objects and various groups of objects offer users multiple paths to the same content and stimulate cross-site content exploration.”
– “Visualisation of this network is supposed to give the user a conceptual model of the network, and give a feeling of being in a ‘relational space’.”
• The consultant leading the project was Stian Danenbarger
• At exactly the same time, XTM 1.0 was announced:
– “A standardized notation … used to define topics, and the relationships between topics... A topic map defines a multidimensional topic space (in which) locations are topics… relationships […] define the path from one topic to another.”
– A light bulb went on for Stian…
– Ontopia helped him build an Open Source web-based content management and publishing system that was entirely driven by topic maps, called ZTM (Zope Topic Maps)
• … and ITU got the web site it was looking for:
[Screenshot of the ITU web site, annotated: current topic; occurrences; associations; more associations]
The success of ITU started a trend
• ITU was “bleeding edge” in early 2001
– Stian calls it a “technical base jump – without a parachute”
– Such adventures are not for the faint-hearted
• Since then, Topic Maps Portals have become a proven and well-established technology
– …at least in Norway...
• ITU was followed by web sites for the Norwegian Research Council, the Norwegian Consumers Association and many others…
– Some of these are based on ZTM
– Others are based on other Topic Maps engines
• At present there are over a dozen, with more on the way
Some Topic Maps Portals in Norway
• In production
– http://www.itu.no and http://www.luna.itu.no (Ministry of Education)
– http://www.forskning.no and http://www.nysgjerrigper.no (Research Council of Norway)
– http://forbrukerportalen.no (Consumers Association)
– http://www.skifte.no (Norwegian Defence)
– http://www.hoyre.no ++ (Norwegian Conservative Party)
– http://matportalen.no (Ministry of Agriculture)
– http://www.udi.no (Ministry of Justice)
– http://www.kulturnett.no (Ministry of Culture)
– etc.
• Under development
– Skatteetaten (Tax Office)
– Statsministerens kontor (Office of the Prime Minister)
– Statistisk Sentralbyrå (Central Bureau of Statistics)
– IFE/Halden (Nuclear Reactor Project)
– etc.
Towards Seamless Knowledge
• As the number of portals multiplies, the amount of overlap increases…
• Take these three portals as an example:
• forskning.no (Research Council web site aimed at young adults)
• forbrukerportalen.no (Public site of the Norwegian Consumer Association)
• matportalen.no (Biosecurity portal of the Department of Agriculture)
Genetically modified food at forskning.no
Genetically modified food at Forbrukerrådet
Genetically modified foodstuffs at Matportalen
Three Topic Maps Portals – One Common Subject
One “virtual portal” with seamless navigation in all directions
Towards Seamless Knowledge
• Very little is required for these portals to achieve a simple but effective form of Seamless Knowledge
• They have already achieved subject-centric organization of their content
– Without this, Seamless Knowledge is beyond reach
• From a technical viewpoint, only two additional pieces are required to complete the puzzle:
#1 An identity mechanism
– To make it possible to know when their subjects are the same
#2 An exchange protocol
– To enable information to be requested and exchanged automatically
• (There must also be a real desire to share information, but that’s a political matter)
Piece #1: The Identity Mechanism
• Simply put:
– How can we know that “genetically modified food” is the same as “genetically modified foodstuffs” (or “GM food”, or “genmodifisert mat”, for that matter)?
• One thing is certain: Basing this on names won’t work
– Synonyms, homonyms and polysemy make names a minefield
– In any case we would like multilingual knowledge integration
• What is needed is nothing more or less than unique, “global” identifiers for all subjects of common interest
• An impossible task?
• Not if we go about it the right way…
• In fact, the solution already exists in the form of a mechanism developed as part of the Topic Maps standard…
• That mechanism is called Published Subjects
So what is Published Subjects?
• An open, distributed mechanism for assigning unique, global identifiers to arbitrary subjects
• Originally conceived as part of the Topic Maps effort, but applicability is far more general
• The mechanism is based on using URLs as identifiers
– e.g. Ibsen Museum in Oslo: http://psi.kulturnett.no/museum/ibsen-museet
• Nothing very special about that…, except that
• The Published Subjects mechanism has two interesting characteristics:
– It is two-sided – it works for both computers and humans
– It works from the bottom up – not from the top down
• Both of these are critically important…
For computers AND humans
• Clearly any mechanism has to work for computers
– In one sense, that is the whole point:
– To make it possible for computers to decide when subjects A and B are the same
– Only then can information be connected correctly
• Computers can simply compare URLs
– http://psi.kulturnett.no/museum/ibsen-museet + http://psi.kulturnett.no/museum/ibsen-museet = same subject
– http://psi.kulturnett.no/museum/ibsen-museet + http://psi.kulturnett.no/museum/ibsen-huset = not (demonstrably) the same subject
• But the mechanism must also work for humans
– Because it is humans who (in the final analysis) actually assign the identifiers when structuring or classifying their information
– A human needs to know exactly which subject is represented by a URL such as http://psi.kulturnett.no/museum/ibsen-museet
– (Is it the Ibsen Museum in Oslo or the Henrik Ibsen Museum in Skien?)
• With PSIs, this can be done quite simply by clicking on the URL
– The result is a document that provides some suitably unambiguous, human-interpretable indication of the identity of the subject
– We call this document a subject indicator
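The computer side of this can be sketched in a few lines of Python. The two URLs are the ones used on this slide; the function names and the sample statements from "source A" and "source B" are hypothetical and serve only to illustrate the idea.

```python
from collections import defaultdict

IBSEN_MUSEET = "http://psi.kulturnett.no/museum/ibsen-museet"
IBSEN_HUSET = "http://psi.kulturnett.no/museum/ibsen-huset"


def same_subject(identifier_a, identifier_b):
    """Return True only when both subject identifiers are the same string.

    False means "not demonstrably the same", not "proven different".
    """
    return identifier_a == identifier_b


def connect_by_identifier(statements):
    """Group (subject identifier, information) pairs by identifier."""
    connected = defaultdict(list)
    for identifier, information in statements:
        connected[identifier].append(information)
    return dict(connected)


# Hypothetical statements coming from two different sources.
statements = [
    (IBSEN_MUSEET, "source A: opening hours"),
    (IBSEN_HUSET, "source A: guided tours"),
    (IBSEN_MUSEET, "source B: exhibition review"),
]

print(same_subject(IBSEN_MUSEET, IBSEN_MUSEET))
print(same_subject(IBSEN_MUSEET, IBSEN_HUSET))
print(connect_by_identifier(statements)[IBSEN_MUSEET])
```

Information from both sources ends up under the same key only because both used the identical URL; no name matching is involved.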
The dual nature of Published Subjects
• The issue of identity is two-sided
– It involves both humans and computers
• The Published Subjects mechanism is similarly two-sided
• The dual aspects are
– a subject identifier (URL) used by computers
– a subject indicator (document) intended for humans
• The identifier is the address of the indicator
• To understand what the identifier is intended to identify, simply click on it!
• What could be simpler?
For computers AND humans
• A subject is identified via a URL
– The URL is called a subject identifier
– e.g. the topic “Ibsen Museum” has the subject identifier http://psi.kulturnett.no/museum/ibsen-museet
• The URL is the address of a document
– That document provides a human-interpretable indication of the identity of the subject
– The document is called a subject indicator
– e.g. “Ibsen Museum: Museum located in the apartment in Arbiens gate in Oslo where the playwright Henrik Ibsen lived for the last 11 years of his life, from 1895 until 1906.”
• Humans use the indicator
– By inspecting the document one can be sure that the identifier does not refer to, say, the Henrik Ibsen Museum in Skien
• Computers use the identifier
– Simple comparison of string values: Identical values mean that the subject is the same
The concept of Subject Indicators in Topic Maps
[Diagram regions: Life, the Universe and Everything; The Computer Domain; The Topic Map Domain]
• The identity of most subjects can only be established indirectly
– An information resource (like a definition or a picture) can provide some kind of indication of the subject’s identity to a human
– Such a resource is called a subject indicator
– A topic may have multiple subject indicators
• Because it is a resource, a subject indicator has an address, even though the subject that it is indicating does not
– Computers can use the address of the subject indicator to establish identity
– These are called subject identifiers
• Subject indicators and subject identifiers are the two sides of the human-computer dichotomy
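A minimal sketch of how a computer can use subject identifiers to decide that two topics represent the same subject, and then merge them. The `Topic` class, the function names and the second identifier (under example.org) are assumptions made for illustration; only the psi.ontopia.net URL comes from the diagram on this slide.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    names: set = field(default_factory=set)
    # Addresses of this topic's subject indicators (a topic may have several).
    subject_identifiers: set = field(default_factory=set)


def represent_same_subject(a, b):
    """Two topics demonstrably share a subject if any identifier is shared."""
    return bool(a.subject_identifiers & b.subject_identifiers)


def merge(a, b):
    """Combine two topics that share at least one subject identifier."""
    if not represent_same_subject(a, b):
        raise ValueError("no shared subject identifier")
    return Topic(a.names | b.names, a.subject_identifiers | b.subject_identifiers)


PUCCINI_PSI = "http://psi.ontopia.net/opera/puccini.html"

first = Topic({"Puccini"}, {PUCCINI_PSI})
second = Topic(
    {"Giacomo Puccini"},
    {PUCCINI_PSI, "http://example.org/psi/composers/puccini"},  # hypothetical
)

merged = merge(first, second)
print(sorted(merged.names))
print(len(merged.subject_identifiers))
```

The names differ, yet the topics merge, because identity rests on the identifiers rather than on the names.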
[Diagram: the topic “Puccini” and its subject, connected via a subject indicator (“Giacomo Puccini, Italian composer, b. Lucca 22nd Dec 1858, d. Brussels, 29th Nov 1924. Best known for his operas, of which Tosca is the most . . .”) whose address, http://psi.ontopia.net/opera/puccini.html, is the subject identifier]
Ramming the point home: Another diagram!
The abbreviation PSI means published subject indicator, but could equally well mean published subject identifier
From the bottom up – open and democratic
• Earlier top-down attempts to create global identifiers have largely failed – or at least only met with moderate success
– For example, URNs have been around for yonks and yet there are still only 17 official URN namespaces
  • Perhaps requiring a registration authority is too bureaucratic?
  • Perhaps the inability to resolve URNs easily makes them difficult to use?
• Published Subjects uses the opposite approach…
– Anyone can create a PSI (Published Subject Indicator)
– The process is bottom up – open and anarchic – just like the Web itself
• Survival of the most trusted
– An evolutionary, Darwinian process
– The more authoritative, trusted, and respected the “publisher”, the more likely its identifiers will achieve widespread adoption
• Emergence of de facto standards based on trust
– The key parameter is confidence in the stability and longevity of the PSI
Piece #2: The Exchange Protocol
Portal A: forskning.no
Portal B: Matportalen
Portal A: Hi! Do you know the subject “genetically modified food”? *
Portal B: Sure. My URL is: http://matportalen.no/Matportalen/Emner/gmo
This scenario is Level 1 of TMRAP knowledge integration.
* The actual question was: Is the subject http://psi.forskning.no/food/gm-food known in your system?
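The Level 1 exchange can be sketched as a lookup from subject identifier to Topic Page URL. This is an in-memory stand-in, not the TMRAP wire format: the `Portal` class, the method name `get_topic_page` and the "unknown-subject" identifier are assumptions made for illustration. The PSI and the Matportalen URL are the ones shown on this slide.

```python
class Portal:
    """A portal that can say whether it knows a given subject."""

    def __init__(self, name, topic_pages):
        self.name = name
        # Maps subject identifier -> URL of this portal's Topic Page.
        self.topic_pages = topic_pages

    def get_topic_page(self, subject_identifier):
        """Answer "Is this subject known in your system?" with a URL or None."""
        return self.topic_pages.get(subject_identifier)


GM_FOOD = "http://psi.forskning.no/food/gm-food"

matportalen = Portal(
    "Matportalen",
    {GM_FOOD: "http://matportalen.no/Matportalen/Emner/gmo"},
)

# Portal A asks Portal B about a subject it knows, then about one it does not.
print(matportalen.get_topic_page(GM_FOOD))
print(matportalen.get_topic_page("http://psi.forskning.no/food/unknown-subject"))
```

The asking portal never sends a name, only the identifier, so the exchange works regardless of what each portal calls the subject.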
TMRAP (Topic Maps Remote Access Protocol)
• Abstract protocol for getting information from remote repositories
– The protocol has an HTTP REST binding
– A SOAP binding would be easy to do
• Any repository can support TMRAP
– For topic map applications, supporting TMRAP is very easy
– For other applications it’s less easy, but the benefit is that legacy applications can be integrated
• The OKS (Ontopia Knowledge Suite) currently contains a prototype implementation
– Used to implement the Vizigator applet
– Also used for the Omnigator Rap demo
• For a short introduction to TMRAP:
– http://www.jtc1sc34.org/repository/0507.htm
• Some related work:
– RDF Net API: http://www.w3.org/Submission/2003/SUBM-rdf-netapi-20031002/
– SNAPI: http://sourceforge.net/projects/snapi
The Omnigator Rap Demo (Part 1: VISIT)
• Two Omnigators are running on this machine
– Different browsers (Opera and Internet Explorer)
– Different skins (Ontopia National Colours and Vive Québec)
– Different names (pepper and poivre)
– Different TMs (Italian Opera and Various Geographical TMs)
• They are aware of each other’s existence
• Their support for TMRAP is turned on
VISIT: Some Considerations
• The functionality is deceptively simple, yet potentially very powerful
– From the user’s point of view the VISIT links might have been hand-coded (there is no visible difference)
– The cool thing is that they are generated entirely automatically
– This is spontaneous knowledge federation in practice!!
• Think about it a bit:
– Having multiple Omnigators rapping together is already fairly cool
– In fact, any application built with the Ontopia Knowledge Suite can now join in the fun
– And more importantly:
– So can any application at all – whether or not it is based on Topic Maps
– The only prerequisites are:
  • Subject-centric organization (i.e., some concept of Topic Pages)
  • Use of Published Subjects (for the purpose of subject identification)
  • Support for TMRAP (in order to send and respond to requests)
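How VISIT links might be generated automatically can be sketched as follows, assuming each peer exposes some lookup from subject identifier to Topic Page URL. The function name `visit_links`, the peer registry and the empty "other-portal" peer are hypothetical. In a real deployment each lookup would be a TMRAP request rather than a dictionary access.

```python
def visit_links(subject_identifiers, peers):
    """Return (peer name, Topic Page URL) for each peer that knows the subject.

    `peers` maps a peer name to a callable taking a subject identifier and
    returning a Topic Page URL, or None if the subject is unknown there.
    """
    links = []
    for peer_name, lookup in peers.items():
        for identifier in subject_identifiers:
            url = lookup(identifier)
            if url is not None:
                links.append((peer_name, url))
                break  # one VISIT link per peer is enough
    return links


GM_FOOD = "http://psi.forskning.no/food/gm-food"

# Stand-ins for remote portals: plain dictionaries instead of network calls.
matportalen_index = {GM_FOOD: "http://matportalen.no/Matportalen/Emner/gmo"}
other_portal_index = {}  # a peer that does not know the subject

peers = {
    "Matportalen": matportalen_index.get,
    "other-portal": other_portal_index.get,
}

for peer_name, url in visit_links([GM_FOOD], peers):
    print(f"VISIT {peer_name}: {url}")
```

Nothing in the function is specific to Topic Maps engines, which reflects the point above: any application with Topic Pages, PSIs and a way to answer the lookup can take part.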
VISIT: More Considerations
• How useful is it really?
• Isn’t it a little simple-minded?
• For many of our customers it is sufficient as a first step
– The Norwegian Research Council and the Norwegian Consumers’ Association want to be able to link to each other in this way
– The VISIT paradigm enables them to retain their own branding
– At the same time, they offer their users an extremely valuable service
• TMRAP is already being implemented in ZTM
– When done, not only will the Research Council and the Consumers’ Association be able to rap together…
– …any Omnigator user will also be able to rap with them!
• And remember:
– This game can be played by any solution that uses some kind of subject-centric organization and PSIs
The Omnigator Rap Demo (Part 2: GET)
• But we can go a step further with relatively little effort
• Remember: Topic Maps are designed for merging …
– … so we can exchange not only Topic Page URLs,
– but also fragments of content in topic map form
• We are calling those fragments topic maplets
• TMRAP also supports exchanging maplets
Piece #3: Topic Maplets (XTM fragments)
Portal A: forskning.no
Portal B: Matportalen
Portal A: Hi! What do you know about “genetically modified food”? *
Portal B: Oh, this and that. Here you are. Be my guest!
This scenario is Level 2 of TMRAP knowledge integration.
* The actual question was: What information do you have about http://psi.forskning.no/food/gm-food in your system?
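The Level 2 exchange can be sketched as merging a received maplet into the local topic map. Real maplets are XTM fragments; here plain dictionaries of sets stand in for them, and the field names, `merge_maplet` and the "topic page" occurrence type are assumptions made for illustration.

```python
FIELDS = ("subject_identifiers", "names", "occurrences")


def merge_maplet(topic_map, maplet):
    """Merge a received fragment into a local topic map (a list of topic dicts).

    If a local topic shares a subject identifier with the maplet, the two are
    merged; otherwise the maplet is added as a new topic.
    """
    for topic in topic_map:
        if topic["subject_identifiers"] & maplet["subject_identifiers"]:
            for key in FIELDS:
                topic[key] |= maplet[key]
            return topic
    new_topic = {key: set(maplet[key]) for key in FIELDS}
    topic_map.append(new_topic)
    return new_topic


GM_FOOD = "http://psi.forskning.no/food/gm-food"

# Local topic map at Portal A.
local_topic_map = [
    {
        "subject_identifiers": {GM_FOOD},
        "names": {"Genetically modified food"},
        "occurrences": set(),
    }
]

# Fragment received from Portal B in answer to the GET request.
maplet = {
    "subject_identifiers": {GM_FOOD},
    "names": {"Genetically modified foodstuffs"},
    "occurrences": {("topic page", "http://matportalen.no/Matportalen/Emner/gmo")},
}

merged = merge_maplet(local_topic_map, maplet)
print(sorted(merged["names"]))
print(len(local_topic_map))
```

After the merge the local topic carries both names and the remote occurrence, while the topic map still contains a single topic for the subject.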
GET: Some Considerations
• The functionality is even more powerful…
– The seamlessness factor is much greater
– In fact we have “dumbed it down” in this Omnigator implementation in order to be able to show what is going on: The GET functionality could be activated automatically
• Application areas are slightly different:
– Useful when seamlessness is more important and branding issues less important
  • E.g., within a corporate environment
– Opens up the possibility of totally individualized “portals”
• Topic Maplets
– Raises some interesting technical issues
– The most important is deciding exactly what the fragment should contain
– TMQL (Topic Maps Query Language) will provide greater flexibility
The Building Blocks of Seamless Knowledge
• Topic Maps
– Semantically structured data that can be “viewed as topic maps”
  • Without this, Seamless Knowledge is beyond reach
  • By the way, this includes RDF, Relational DBs, XML and more
– Already here
• Published Subjects
– The Semantic Superhighway
– Globally unique identifiers for arbitrary subjects
– Already here
• Topic Maps Remote Access Protocol (TMRAP)
– Protocol for requesting and delivering Topic Page URIs and Topic Maplets
– Already here
• Topic Maps Query Language (TMQL)
– For more powerful and precise TMRAP requests
– Watch this space (and use tolog in the meantime)
Seamless Knowledge and the Semantic Web
• Are they the same thing?
– Not if you go by the vision of the Semantic Web articulated by Tim Berners-Lee (e.g., in the famous Scientific American article)
  • In reality, that amounts to “AI on the Web”
– Most business users today don’t need AI and they don’t want to be restricted to the Web
– On the other hand, other people have other visions of the Semantic Web…
• In any case
– Semantics are akin to knowledge …
– … and seamlessness implies the existence of something web-like …
– … so in a broader sense they do have a lot in common
– Certainly Semantic Web data (i.e., RDF) will be easily reusable in the context of Seamless Knowledge (as will relational data and XML)
– The RDF and Topic Maps communities are currently working together to achieve interoperability at the data level (RDF/TM Interoperability Task Force)
• However, the TBL Semantic Web won’t be here for many years
– There is much research still to be done
• Seamless Knowledge is achievable today
– Solving the problem of disconnected knowledge on a less ambitious scale
Conclusions
• Topic Maps has almost “crossed the chasm” – at least in Norway
– Web sites, Portals, E-learning, Knowledge Management, Enterprise Knowledge Integration, …
• Seamless Knowledge is what Topic Maps is about
– “Topic Maps” speaks only to the technology
– CIOs are interested in business benefits and ROI
• Published Subjects are the key to solving the identity issue
– Anyone can create a PSI (Published Subject Indicator)
– PSIs work for computers AND humans
• TMRAP allows other data to be “viewed as” Topic Maps
– Provided information can be made to look like a topic map, any legacy technology can play
– The key is subject-centric organization of information
  • Without this, Seamless Knowledge is beyond reach