Kontrast@TKE 2012

description

This paper is part of a study focusing on the terminological and socio-organizational analysis of a corpus of 18 national and international standards, written in English, in the domains of business continuity activity management and risk management. The aim is to determine whether lobbying by certain countries seeking to impose their own national standards is a decisive element in standardization. First, we present the building of a new tool, called KONTRAST, designed to exploit the terminological variants in a non-stabilized terminological domain. Then we describe the workflow to build an RDF/SKOS/OWL base from an XML glossary and a use case to illustrate the ability of KONTRAST to detect influence networks.

Transcript of Kontrast@TKE 2012

Page 1: Kontrast@TKE 2012

KONTRAST

TKE 2012

Brigitte Juanals

Martin Lafréchoux

Jean-Luc Minel

Hi.

I am Martin Lafréchoux.

Photo : http://www.flickr.com/photos/daynoir/2180507211/

Page 2: Kontrast@TKE 2012

Standardization and Global Security

The work I am about to present was part of a three-year project called NOTSEG (www.notseg.fr, ANR-CSOG 2009).

We acknowledge funding from the French National Research Agency (ANR)

Page 3: Kontrast@TKE 2012
Page 4: Kontrast@TKE 2012

KONTRAST is a termino-ontological resource designed to represent and analyze the vocabulary used in international management standards

In short...

Photo : http://www.flickr.com/photos/natureindyablogspotcom/3038070680

Page 5: Kontrast@TKE 2012

I. Context

II. Model

III. Use Case

I will briefly describe the context of our work, i.e. the challenges faced by terminology in management standards. I’ll then describe the representation model we came up with to address these challenges, as well as our workflow. The last part of this talk will show you a quick use case.

Photo : http://www.flickr.com/photos/natureindyablogspotcom/3038070680

Page 6: Kontrast@TKE 2012

I. Context

Page 7: Kontrast@TKE 2012

Standards & Terminology

Kontrast was designed to address the specific issues of terminology in the context of management standards.

Let me begin by giving you an overview of these specificities.

Page 8: Kontrast@TKE 2012

Standards

Standards can refer to many things.

The first thing that comes to mind is probably something very down to earth, like power outlets or the size of vegetables.

That’s not what I’ll be talking about today. I’ll talk about management standards, and more precisely I’ll talk about business continuity.

Illustration : http://www.flickr.com/photos/double-m2/4341910416/

Page 9: Kontrast@TKE 2012

Business Continuity: The activity performed by an organization to ensure that critical functions will be available to those who need them even in the event of a disaster.

Just a quick reminder.

Business continuity is a subtask of risk management.

Page 10: Kontrast@TKE 2012

• Management standards are not about physical objects

• They deal with the abstract — processes, business rules, concepts, methods

• Standards include a ‘Terms and Definitions’ (T&D) section to alleviate ambiguity

Generally speaking, management standards use natural language to standardize abstract material: rules, processes, and so on.

Given that there is nothing tangible or concrete to refer to, only words and abstractions, these standards have to include some manner of terminology. That is the purpose of their T&D section.

This is what we studied.

Illustration : http://www.flickr.com/photos/double-m2/4341910416/

Page 11: Kontrast@TKE 2012

Here you can see the beginning of the T&D of ISO 31000:2009. It’s a glossary, either thematic or alphabetical.

We chose to study T&D because we witnessed some heated discussion about them during the writing process of certain ISO standards. They seemed to be some kind of focal point where we could observe the different influences at play. This was confirmed by the AFNOR experts we worked with: it’s not easy to agree on a terminology when writing international standards.

Page 12: Kontrast@TKE 2012

Consensus

If I oversimplify things a bit, the international standardization process goes something like this:

Illustration : http://www.flickr.com/photos/double-m2/4324115629/

Page 13: Kontrast@TKE 2012

• Each standards organization can define its own vocabulary

• International standards have to choose between several concurrent vocabularies

• The writing process follows a so-called ‘consensual procedure’.

On a given topic, several countries write their own national standard. Each one comes with its own T&D.

When ISO decides to write an international standard on this topic, these countries send delegations and, of course, each country wants its own terminology to prevail, as it would give a competitive advantage to those who have already adopted its national standard.

So there is a competition between several vocabularies, none of which can be seen as more valid than the others.

Illustration : http://www.flickr.com/photos/double-m2/4324115629/

Page 14: Kontrast@TKE 2012

Experts

This is where the ‘experts’ come in.

‘Experts’ is a generic term for the people sent to ISO by each organisation and country. They are mostly consultants or come from a corporate background.

When writing a standard, a so-called ‘consensual procedure’ is used, where one expert (secretary) will review each definition proposal and every other expert can ask for modifications. There is no formal definition of this procedure.

Illustration : http://www.flickr.com/photos/beatnic/3683822225/

Page 15: Kontrast@TKE 2012

• The way a standard will be implemented depends largely on its T&D

• T&D are a product of the notional systems of the experts who wrote them

• T&D are an economic, sometimes political issue

Since the way a standard will be implemented depends largely on its T&D, the T&D can become economic and political issues. The question we are trying to answer is this: can the power plays and maneuvers that took place when writing international standards be traced in their T&D?

Illustration : http://www.flickr.com/photos/beatnic/3683822225/

Page 16: Kontrast@TKE 2012

Authority?

International standardization has one major specificity compared to other terminology-heavy fields - like, say, industry. No one has authority.

Illustration : http://www.flickr.com/photos/double-m2/4324611290/

Page 17: Kontrast@TKE 2012

• ISO has no authority over national standardization organisations

• ISO vocabularies neither replace nor supersede other terminologies

• ISO itself is not monolithic. Different subgroups coexist within ISO.

Standards are not laws. They are only references. An organization is free to use a standard or not.

In the same way, ISO has no authority over national organisations.

Page 18: Kontrast@TKE 2012

“resilience”

“The ability of an organization to resist being affected by an incident”

“The adaptive capacity of an organization in a complex and changing environment”

ISO DIS 22300:2011

ISO/IEC 27031:2011

For example here are two definitions of the word ‘resilience’ in two ISO standards written in 2011. Resilience is a key concept in business continuity.

But more on that later.

Page 19: Kontrast@TKE 2012

Borrowings & References

As I said, each country *can* create its own vocabulary. That does not mean that they always do.

Photo : http://www.flickr.com/photos/linneberg/6976347269/

Page 20: Kontrast@TKE 2012

• Creating a new terminology is a tedious and costly process

• Standards frequently recycle or reuse other definitions

• These quotes and reuses sometimes go unacknowledged

As I am sure you are all aware, creating a terminology can be a long, daunting, costly process.

Often a standard will reuse the T&D of an existing standard, in part or in total, or at least refer to it. These quotes and borrowings create a complex network, which is what we want our system to track.

Photo : http://www.flickr.com/photos/linneberg/6976347269/

Page 21: Kontrast@TKE 2012

Reuse

The simplest case is a straight reuse. In effect it is similar to ‘importing’ a library in a programming language.

Source : ISO DIS 22301:2010

Page 22: Kontrast@TKE 2012

Quote

When a standard quotes a single definition, the ID of the original standard is in brackets.

Page 23: Kontrast@TKE 2012

Modified Quote

Some quotes are shortened or modified.

Page 24: Kontrast@TKE 2012

Quote?

Some even go unacknowledged, which is, of course, of particular interest to us.

Guide 81 vs. BS 25999 1 - impact

Page 25: Kontrast@TKE 2012

Reference

Some definitions also refer to another one.

Page 26: Kontrast@TKE 2012

• A wide range of terminological systems coexist

• They are interlinked by a complex network of influence, borrowings, adaptations and reuses

• How can we represent them simultaneously without alignment?

Photo : http://www.flickr.com/photos/moofbong/4240137966/

Page 27: Kontrast@TKE 2012

II. Model

Here is the termino-ontological model we designed to address these issues.

Photo : http://www.flickr.com/photos/esm723/3573226450/

Page 28: Kontrast@TKE 2012

A contrastive ontological glossary

As you will see, it is not a ‘proper’ ontoterminology, so we call it a contrastive ontological glossary instead.

Photo : http://www.flickr.com/photos/esm723/3573226450/

Page 29: Kontrast@TKE 2012

•A two-part model:

- Terminological Data

- Structural Data

It’s a 2-part structure.

Photo : http://www.flickr.com/photos/esm723/3573226450/

Page 30: Kontrast@TKE 2012

Terminological Data

Photo : http://www.flickr.com/photos/fijneman/2971217479/

Page 31: Kontrast@TKE 2012

•Terminology in standards offers unique challenges to knowledge engineering

•How can several semi-identical concepts coexist?

•We built on the operational properties of OWL ontologies with a twist: an unorthodox definition of the ‘concept’

The main challenge is that we had to represent several conflicting concepts at the same time, without alignment and without any central authority to choose the ‘right one’.

Page 32: Kontrast@TKE 2012

In Kontrast a ‘concept’ is the relationship between a term, a context of use and a definition.

Page 33: Kontrast@TKE 2012

“An unstable condition involving an impending abrupt or significant change...”@en

Definition

This allowed us to design a scattered, decentralised model, where the individuals representing concepts are nothing more than the reification of a ternary relationship between a term, a standard and a definition.

Please allow me to reiterate: this was a technical design decision.
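To make this definition concrete, here is a minimal Python sketch of a concept as a reified (term, standard, definition) triple. It is an illustration only, not the Kontrast code: the `Concept` class, the `slug` helper and the ID scheme are assumptions made for this example. The two definitions of ‘resilience’ are the ones quoted in this talk.

```python
import re
from dataclasses import dataclass


def slug(text: str) -> str:
    """Lowercase the text and replace each run of non-alphanumerics with '_'."""
    return re.sub(r"[^a-z0-9]+", "_", text.lower()).strip("_")


@dataclass(frozen=True)
class Concept:
    """One (term, standard, definition) relationship, reified as an individual."""
    term: str
    standard: str
    definition: str

    @property
    def concept_id(self) -> str:
        # Hypothetical ID scheme: the term and the standard joined together.
        return f"{slug(self.term)}__{slug(self.standard)}"


# Same term, two standards, two definitions: two distinct concepts,
# with no central authority deciding which one is "right".
british = Concept(
    "resilience", "ISO/IEC 27031:2011",
    "The ability of an organization to resist being affected by an incident")
american = Concept(
    "resilience", "ASIS SPC.1:2009",
    "The adaptive capacity of an organization in a complex and changing environment")

print(british.concept_id)
print(american.concept_id)
```

Because the standard is part of the identity, adding a new standard never forces us to merge or realign existing concepts.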

Page 34: Kontrast@TKE 2012

One of the first advantages of this definition was that it worked well with the concept as defined by SKOS.

SKOS is designed to represent several thesauri that may share terms and/or definitions, which worked well for us.

Illustration : http://www.w3.org/TR/2005/WD-swbp-skos-core-guide-20051102/

Page 35: Kontrast@TKE 2012

Ontology Building

Here is how we worked.

Page 36: Kontrast@TKE 2012

•A linear XML glossary is converted to RDF/OWL

•XSLT performs most of the work

•Python scripts extract relationships between concepts

In a different part of the NOTSEG project, a glossary was manually compiled from the 20 standards of our corpus. We then applied automatic and manual processing to this glossary.

I’ll walk you through the different steps.

Page 37: Kontrast@TKE 2012

Here is an entry of the XML glossary.

Page 38: Kontrast@TKE 2012

31000:2009

Here is a diagram representing the same entry in Kontrast.

Page 39: Kontrast@TKE 2012

First the term.

Page 40: Kontrast@TKE 2012

31000:2009

It is used in several places in the ontology: as a term (left), as part of the ID for the concept (center), and as a label (right).

Page 41: Kontrast@TKE 2012

Definition

Page 42: Kontrast@TKE 2012

31000:2009

Transposed as a concept property

Page 43: Kontrast@TKE 2012

The standards in which the concept appears...

Page 44: Kontrast@TKE 2012

31000:2009

... are represented as skos:ConceptScheme individuals, which makes them independent thesauri.
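The conversion steps above can be sketched as follows. In the project this work was done with XSLT; the Python below is only an illustration of the same mapping. The XML element names, the `k:` namespace and the ID scheme are assumptions, since the real glossary schema is not reproduced in this transcript. The entry reuses the ISO Guide 73:2009 definition of ‘vulnerability’ quoted later in the talk.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical glossary layout: element names are assumed for this sketch.
GLOSSARY_XML = """<glossary>
  <entry>
    <term>vulnerability</term>
    <standard>ISO Guide 73:2009</standard>
    <definition>intrinsic properties of something resulting in susceptibility to a risk source (3.5.1.2) that can lead to an event with a consequence (3.6.1.3)</definition>
  </entry>
</glossary>"""

PREFIXES = (
    "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .\n"
    "@prefix k: <http://example.org/kontrast#> .\n"  # placeholder namespace
)


def slug(text: str) -> str:
    """Turn a term or standard name into a safe local name."""
    return re.sub(r"[^a-z0-9]+", "_", text.lower()).strip("_")


def literal(text: str) -> str:
    """Format an English Turtle literal, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@en'


def entry_to_turtle(entry: ET.Element) -> str:
    term = entry.findtext("term").strip()
    standard = entry.findtext("standard").strip()
    definition = entry.findtext("definition").strip()
    scheme = f"k:{slug(standard)}"                    # the standard
    concept = f"k:{slug(term)}__{slug(standard)}"     # term + standard as ID
    return "\n".join([
        f"{scheme} a skos:ConceptScheme .",
        f"{concept} a skos:Concept ;",
        f"    skos:prefLabel {literal(term)} ;",            # the term as a label
        f"    skos:definition {literal(definition)} ;",     # definition as a property
        f"    skos:inScheme {scheme} .",
    ])


root = ET.fromstring(GLOSSARY_XML)
turtle = PREFIXES + "\n".join(entry_to_turtle(e) for e in root.iter("entry"))
print(turtle)
```

The term shows up three times in the output, as in the diagram: in the concept ID, in the label, and implicitly as the key that lets us compare concepts across standards.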

Page 45: Kontrast@TKE 2012

Relationship Extraction

The next step is extracting relationships.

Page 46: Kontrast@TKE 2012

“vulnerability”

ISO Guide 73:2009

intrinsic properties of something resulting in susceptibility to a risk source (3.5.1.2) that can lead to an event with a consequence (3.6.1.3)

AS/NZS 5050:2010

Intrinsic properties of something resulting in susceptibility to a risk source that can lead to an event with a consequence. [ISO Guide 73:2009, Risk Management—Vocabulary, definition 3.6.1.6]

ASIS SPC.1:2009

Intrinsic properties of something that create susceptibility to a source of risk (3.53) that can lead to a consequence. [ISO/IEC Guide 73:2002]

ISO DIS 22300:2011

intrinsic properties of something resulting in susceptibility to a risk source that can lead to an event with a consequence

Another example.

Here are four definitions of ‘vulnerability’.

Page 47: Kontrast@TKE 2012

To compare them, we normalize the text: we drop everything in brackets, strip the punctuation and lowercase the text.
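As an illustration, this normalization step could look like the Python sketch below. The function name and the exact rules (both round and square brackets are dropped, ASCII punctuation only) are assumptions for this example, not the project's actual script. The definitions are three of those shown on the slide.

```python
import re
import string


def normalize(definition: str) -> str:
    """Drop bracketed text, lowercase, strip punctuation, collapse spaces."""
    # Remove cross-references such as (3.5.1.2) and sources such as [ISO ...].
    text = re.sub(r"\([^)]*\)|\[[^\]]*\]", " ", definition)
    # Lowercase, then delete ASCII punctuation characters.
    text = text.lower().translate(str.maketrans("", "", string.punctuation))
    # Collapse the whitespace left behind by the removals.
    return " ".join(text.split())


guide_73 = ("intrinsic properties of something resulting in susceptibility to a "
            "risk source (3.5.1.2) that can lead to an event with a consequence (3.6.1.3)")
dis_22300 = ("intrinsic properties of something resulting in susceptibility to a "
             "risk source that can lead to an event with a consequence")
asis = ("Intrinsic properties of something that create susceptibility to a source "
        "of risk (3.53) that can lead to a consequence. [ISO/IEC Guide 73:2002]")

print(normalize(guide_73) == normalize(dis_22300))  # True
print(normalize(guide_73) == normalize(asis))       # False
```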

Page 48: Kontrast@TKE 2012

Three of them match: we create a skos:exactMatch relationship.
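A minimal sketch of this matching step, assuming the definitions have already been normalized: concepts are grouped by identical text, and every pair inside a group gets a skos:exactMatch link. The concept IDs and the function name are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

SAME = ("intrinsic properties of something resulting in susceptibility to a "
        "risk source that can lead to an event with a consequence")

# Hypothetical concept IDs mapped to their already-normalized definitions.
normalized = {
    "vulnerability__iso_guide_73_2009": SAME,
    "vulnerability__as_nzs_5050_2010": SAME,
    "vulnerability__iso_dis_22300_2011": SAME,
    "vulnerability__asis_spc_1_2009": (
        "intrinsic properties of something that create susceptibility to a "
        "source of risk that can lead to a consequence"),
}


def exact_match_triples(definitions):
    """Link every pair of concepts whose normalized definitions are identical."""
    groups = defaultdict(list)
    for concept_id, text in definitions.items():
        groups[text].append(concept_id)
    triples = []
    for ids in groups.values():
        for left, right in combinations(sorted(ids), 2):
            triples.append((left, "skos:exactMatch", right))
    return sorted(triples)


for triple in exact_match_triples(normalized):
    print(triple)
```

Since skos:exactMatch is symmetric and transitive, emitting each pair once is enough for a tool that applies the SKOS semantics. The ASIS concept stays unlinked at this stage.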

Page 49: Kontrast@TKE 2012

The fourth one is slightly different. It’s very close, but we have no way to confirm it automatically. On such short texts, a variation of a few words can be huge. Human analysis is still needed.

The best we can do is to create a temporary file for humans to check afterwards.
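One possible way to produce such a review file is sketched below. This talk does not say how near matches were detected, so the word-level similarity from Python's `difflib` and the 0.7 threshold are assumptions picked for illustration, as are the concept IDs and the CSV layout.

```python
import csv
import io
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical concept IDs mapped to already-normalized definitions.
normalized = {
    "vulnerability__iso_guide_73_2009": (
        "intrinsic properties of something resulting in susceptibility to a "
        "risk source that can lead to an event with a consequence"),
    "vulnerability__asis_spc_1_2009": (
        "intrinsic properties of something that create susceptibility to a "
        "source of risk that can lead to a consequence"),
}


def review_candidates(definitions, threshold=0.7):
    """List pairs that are similar but not identical, for a human to check."""
    rows = []
    for (id_a, text_a), (id_b, text_b) in combinations(sorted(definitions.items()), 2):
        if text_a == text_b:
            continue  # identical texts are handled as exact matches
        # Compare word sequences rather than characters.
        ratio = SequenceMatcher(None, text_a.split(), text_b.split()).ratio()
        if ratio >= threshold:
            rows.append((id_a, id_b, round(ratio, 2)))
    return rows


def write_review_file(rows, handle):
    """Write the candidate pairs as CSV to any writable text handle."""
    writer = csv.writer(handle)
    writer.writerow(["concept_a", "concept_b", "similarity"])
    writer.writerows(rows)


rows = review_candidates(normalized)
buffer = io.StringIO()  # stands in for a temporary file on disk
write_review_file(rows, buffer)
print(buffer.getvalue())
```

The script only proposes candidates. Whether a pair becomes a skos:closeMatch or a skos:relatedMatch remains a human decision.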

Page 50: Kontrast@TKE 2012

            exactMatch   closeMatch   relatedMatch
Recall      1.0          0.38         0.21
Precision   1.0          1.0          1.0

Our tests are designed for maximum precision: every match they report is correct. The obvious downside is that recall is very low for closeMatch and relatedMatch.
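As a reminder of what these two figures measure, here is a small Python sketch. The link sets are toy data invented for the example, not the Kontrast evaluation data: precision is the share of extracted links that are correct, and recall is the share of reference links that were found.

```python
def precision_recall(found, expected):
    """Return (precision, recall) of extracted links against a reference set."""
    true_positives = len(found & expected)
    # By convention here, an empty set gives a score of 1.0.
    precision = true_positives / len(found) if found else 1.0
    recall = true_positives / len(expected) if expected else 1.0
    return precision, recall


# Toy data: the extractor proposes two links, both correct,
# but the human reference contains five.
expected = {("c1", "c2"), ("c1", "c3"), ("c2", "c3"), ("c4", "c5"), ("c6", "c7")}
found = {("c1", "c2"), ("c1", "c3")}

print(precision_recall(found, expected))  # (1.0, 0.4)
```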

Page 51: Kontrast@TKE 2012

•20 standards

•291 terms

•649 concepts

•1107 matching relationships

• 486 skos:relatedMatch

• 85 skos:exactMatch

• 535 skos:closeMatch

Page 52: Kontrast@TKE 2012

Structural Data

Briefly, here is the other part of the resource.

Photo : http://www.flickr.com/photos/hindrik/1919291052/

Page 53: Kontrast@TKE 2012

Data about...

•the standards: release date, version, current status, reach...

•the standardization process: publishers, working groups, institutions...

It contains mostly metadata about the standards and the writing process.

Photo : http://www.flickr.com/photos/hindrik/1919291052/

Page 54: Kontrast@TKE 2012

This data is linked with the terminological data through the individuals representing standards.

We used Dublin Core (dcterms) whenever possible to ensure maximum interoperability.

Page 55: Kontrast@TKE 2012

Standards are revised regularly. Through the borrowings and quotes, an ‘older’ definition can remain in use even if a new version of the standard has been published. So we have to represent several versions of the same standard at the same time.

We also used dcterms.

Page 56: Kontrast@TKE 2012

Some standards of our corpus are present in DBpedia. We used owl:sameAs or dcterms:isPartOf assertions to connect Kontrast with Linked Data.
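To give an idea of what the structural data might look like, here is a small Python sketch that serializes a few such assertions as Turtle. The `k:` namespace, the identifiers and the choice of properties (`dcterms:issued`, `dcterms:replaces`) are assumptions for illustration. The DBpedia link is an example of the kind of assertion described above, not a statement of which resources were actually linked.

```python
PREFIXES = {
    "dcterms": "http://purl.org/dc/terms/",
    "owl": "http://www.w3.org/2002/07/owl#",
    "dbr": "http://dbpedia.org/resource/",
    "k": "http://example.org/kontrast#",  # placeholder namespace
}

# Illustrative triples: metadata, versioning, and a Linked Data connection.
TRIPLES = [
    ("k:iso_guide_73_2009", "dcterms:title", '"ISO Guide 73:2009"'),
    ("k:iso_guide_73_2009", "dcterms:issued", '"2009"'),
    # Both versions stay in the base, because older definitions are still quoted.
    ("k:iso_guide_73_2009", "dcterms:replaces", "k:iso_iec_guide_73_2002"),
    ("k:iso_31000_2009", "owl:sameAs", "dbr:ISO_31000"),
]


def to_turtle(prefixes, triples):
    """Serialize prefix declarations and simple triples as Turtle lines."""
    lines = [f"@prefix {name}: <{iri}> ." for name, iri in prefixes.items()]
    lines += [f"{s} {p} {o} ." for s, p, o in triples]
    return "\n".join(lines)


def previous_versions(standard, triples):
    """Follow dcterms:replaces links to list the versions a standard supersedes."""
    return [o for s, p, o in triples if s == standard and p == "dcterms:replaces"]


turtle = to_turtle(PREFIXES, TRIPLES)
print(turtle)
print(previous_versions("k:iso_guide_73_2009", TRIPLES))
```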

Page 57: Kontrast@TKE 2012

•Decentralized, simple and extensible model

•Uses standard semantic web vocabularies

•Connected to the linked data

Photo : http://www.flickr.com/photos/sperkyajachtu/5497757852/

Page 58: Kontrast@TKE 2012

III. Use Case

Photo : http://www.flickr.com/photos/daynoir/2180507271/

Page 59: Kontrast@TKE 2012

‘resilience’

Kontrast does not have its own GUI. We used third-party tools such as RDF Gravity or Ontograf to display a graphical representation of Kontrast.

Page 60: Kontrast@TKE 2012

“resilience”

“The ability of an organization to resist being affected by an incident”

“The adaptive capacity of an organization in a complex and changing environment”

ISO DIS 22300:2011

ISO/IEC 27031:2011

Let’s get back to resilience.

Page 61: Kontrast@TKE 2012

“resilience”

Here are the concepts using the term ‘resilience’ in Kontrast.

skos:closeMatch links are in yellow, skos:exactMatch links in green and skos:relatedMatch links in brown.

Capture : Ontograf (Protégé plug-in)

Page 62: Kontrast@TKE 2012

“resilience”

These three nodes are British standards: the two parts of the BS 25999 standard and the associated good practice guide.

Page 63: Kontrast@TKE 2012

“resilience”

“The ability of an organization to resist being affected by an incident”

They use the same definition of ‘resilience’.

The UK has worked on emergency planning since the 1980s. It has been a business continuity leader since then, and its standards have long been used as international references.

This led the BSI to a prominent position within ISO for business continuity standards.

Page 64: Kontrast@TKE 2012

“resilience”

“The ability of an organization to resist being affected by an incident”

ISO 27031 was recently published and uses the British definition.

Page 65: Kontrast@TKE 2012

“resilience”

On the other side of the graph are definitions under American influence.

Page 66: Kontrast@TKE 2012

“resilience”

Towards the center you can see the ASIS SPC.1:2009 definition.

Page 67: Kontrast@TKE 2012

The adaptive capacity of an organization in a complex and changing environment. - NOTE 1: Resilience is the ability of an organization to resist being affected by an event or the ability to return to an acceptable level of performance in an acceptable period of time after being affected by an event.

“resilience”

ASIS SPC.1:2009

«The adaptive capacity of an organization in a complex and changing environment.»

At first, the definition seems different. But if you take a closer look...

Page 68: Kontrast@TKE 2012

The adaptive capacity of an organization in a complex and changing environment. - NOTE 1: Resilience is the ability of an organization to resist being affected by an event or the ability to return to an acceptable level of performance in an acceptable period of time after being affected by an event.

“resilience”

ASIS SPC.1:2009

The first note reproduces the British definition. In 2009, when the ASIS standard was published, the British influence was still very strong, and completely forgoing the British definition would have handicapped the standard.

Page 69: Kontrast@TKE 2012

“The adaptive capacity of an organization in a complex and changing environment”

“resilience”

After the publication of this standard, the US used a different strategy. They went around the British influence and pushed to have their definition of ‘resilience’ adopted in broader standards: the ISO 31000 series, which deals with risk management.

With the help of an Israeli expert, the US managed to get their definition into the 2009 version of the ISO Guide 73 (rightmost node). This was a good move because the definitions of the ISO Guide are often quoted or borrowed, as we can see here at the bottom of the graph (the Australia / New Zealand standard).

Page 70: Kontrast@TKE 2012

“The adaptive capacity of an organization in a complex and changing environment, to achieve the organizations objectives NOTE 1 Resilience is the ability of an organization to manage the risks of events”

“resilience”

In 2011, the US pushed for a new definition to be adopted in the ISO 22300 series. This caused quite a stir, as the 22300 series is a purely business continuity (BC) series, where the British concepts usually prevail.

The debate is still ongoing.

Page 71: Kontrast@TKE 2012

“resilience”

And that’s how you end up with two conflicting definitions of an important concept in two standards written within the same organization, during the same year.

Page 72: Kontrast@TKE 2012

“resilience”

“The ability of an organization to resist being affected by an incident”

“The adaptive capacity of an organization in a complex and changing environment”

ISO DIS 22300:2011

ISO/IEC 27031:2011

These definitions are important because they reflect two different visions of business continuity: in short, the US / Israeli position is about the planning and reactive capacities of an organisation, whereas the British experts believe that risk assessment is costly and not very realistic, as it is impossible to anticipate all possible risks.

Photos : http://www.flickr.com/photos/pmillera4/6366227011/ & http://www.flickr.com/photos/adrianclarkmbbs/3050195566/

Page 73: Kontrast@TKE 2012

Conclusion

Photo : http://www.flickr.com/photos/bruceberrien/4262228892/

Page 74: Kontrast@TKE 2012

• Standardization terminology offers unique challenges to knowledge engineering

• Influence can be traced in terms and definitions and Kontrast can be a useful tool to assist human analysis

• Many possible optimizations: use Lemon instead of SKOS, automate relationship extraction...

SKOS was great for us because it was readily available and simple to set up, but we are now beginning to feel its limits.

Lemon would allow us to push our idea further.

Photo : http://www.flickr.com/photos/bruceberrien/4262228892/

Page 75: Kontrast@TKE 2012

Thank you