A Vague Sense Classifier for Detecting Vague Definitions in Ontologies


1. A Vague Sense Classifier for Detecting Vague Definitions in Ontologies. Panos Alexopoulos, John Pavlopoulos. 14th Conference of the European Chapter of the Association for Computational Linguistics, Gothenburg, Sweden, 26-30 April 2014

2. Introduction: Vagueness

Vagueness is a semantic phenomenon where predicates admit borderline cases, i.e. cases where it is not determinately true that the predicate applies or not (Shapiro 2006). This happens when predicates have blurred boundaries:

  • What is the threshold number of years separating old and not old films?
  • What are the exact criteria that distinguish modern restaurants from non-modern ones?

3. Introduction: Vagueness Consequences

The problem with vague terms in semantic data is that they invite disagreement. For example, when we asked domain experts to provide instances of the concept Critical Business Process, they disputed whether certain processes should be regarded as critical or not. The experts had different criteria of process criticality and could not decide which of these were sufficient to classify a process as critical.

4. Introduction: Problematic Scenarios

  1. Structuring Data with a Vague Ontology: Possible disagreement among experts when defining class and relation instances.
  2. Utilizing Vague Facts in Ontology-Based Systems: Reasoning results might not meet users' expectations.
  3. Integrating Vague Semantic Information: The merging of particular vague elements can lead to data that will not be valid for all its users.

5. Automatic Vagueness Detection: Problem Definition & Approach

Problem Definition: Can we automatically determine whether an ontology entity (class, relation, etc.) is vague or not?

  • StrategicClient, defined as "A client that has a high value for the company", is vague.
  • AmericanCompany, defined as "A company that has legal status in the United States", is not.

Approach:

  • We train a binary classifier that distinguishes between vague and non-vague word senses.
  • Training is supervised, using examples from WordNet.
  • We use this classifier to determine whether a given ontology element definition is vague or not.
6. Automatic Vagueness Detection: Data

  • 2,000 adjective senses from WordNet: 1,000 vague and 1,000 non-vague.
  • Inter-annotator agreement on the vague/non-vague annotation among 3 human judges was 0.64 (Cohen's kappa).

Vague senses:

  • Abnormal: not normal, not typical or usual or regular or conforming to a norm
  • Impenitent: impervious to moral persuasion
  • Notorious: known widely and usually unfavorably
  • Aroused: emotionally aroused

Non-vague senses:

  • Compound: composed of more than one part
  • Biweekly: occurring every two weeks
  • Irregular: falling below the manufacturer's standard
  • Outermost: situated at the farthest possible point from a center

7. Automatic Vagueness Detection: Training and Evaluation

  • 80% of the data was used to train a multinomial Naive Bayes classifier.
  • We removed stop words and represented each instance as a bag of words.
  • The remaining 20% of the data was used as a test set.
  • Classification accuracy was 84%.

8. Automatic Vagueness Detection: Comparison with Subjectivity Analyzer

  • We also used a subjective sense classifier to classify our dataset's senses as subjective or objective.
  • Of the 1,000 vague senses, only 167 were classified as subjective, while of the 1,000 non-vague senses, 993 were classified as objective.
  • This shows that treating vagueness in the same way as subjectivity is not effective: the subjectivity classifier misses most vague senses.

9. Automatic Vagueness Detection: Use Case: Detecting Vagueness in CiTO Ontology

  • As an ontology use case we considered CiTO, an ontology that enables characterization of the nature or type of citations.
  • CiTO consists primarily of relations, many of which are vague (e.g. plagiarizes).
  • We selected 44 relations and had 3 human judges manually classify them as vague or not.
  • Then we applied our WordNet-trained vagueness classifier to the textual definitions of the same relations.
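The training setup on the Training and Evaluation slide (stop-word removal, bag of words, multinomial Naive Bayes, 80/20 split, accuracy) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the slides do not name a toolkit, a stop-word list, or a smoothing scheme, so the `STOP_WORDS` set, the Laplace smoothing parameter `alpha`, the label strings, and all function and class names here are assumptions. The toy data are the eight example glosses from the Data slide, far too few to reproduce the reported figures.

```python
import math
import random
import re
from collections import Counter

# Assumed, tiny stop-word list; the slides do not say which list was used.
STOP_WORDS = {"a", "an", "and", "or", "the", "of", "to", "in", "that",
              "than", "for", "from", "at", "is", "has"}

def tokenize(definition):
    """Lowercase, keep alphabetic tokens, drop stop words (bag of words)."""
    words = re.findall(r"[a-z]+", definition.lower())
    return [w for w in words if w not in STOP_WORDS]

class MultinomialNB:
    """Multinomial Naive Bayes with Laplace smoothing (alpha is an assumption)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.class_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(tokenize(doc))
        self.vocab = set().union(*self.word_counts.values())
        return self

    def log_scores(self, doc):
        """Log prior plus summed log likelihoods; unseen words are ignored."""
        total_docs = sum(self.class_counts.values())
        tokens = [t for t in tokenize(doc) if t in self.vocab]
        scores = {}
        for c in self.classes:
            total_words = sum(self.word_counts[c].values())
            denom = total_words + self.alpha * len(self.vocab)
            score = math.log(self.class_counts[c] / total_docs)
            for t in tokens:
                score += math.log((self.word_counts[c][t] + self.alpha) / denom)
            scores[c] = score
        return scores

    def predict(self, doc):
        scores = self.log_scores(doc)
        return max(self.classes, key=lambda c: scores[c])

def train_test_split(examples, train_fraction=0.8, seed=0):
    """Shuffle a copy of the examples and cut it into train and test parts."""
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def accuracy(model, examples):
    """Fraction of (text, label) pairs the model labels correctly."""
    correct = sum(model.predict(text) == label for text, label in examples)
    return correct / len(examples)

if __name__ == "__main__":
    # Toy data: the example glosses shown on the Data slide.
    senses = [
        ("not normal, not typical or usual or regular or conforming to a norm", "vague"),
        ("impervious to moral persuasion", "vague"),
        ("known widely and usually unfavorably", "vague"),
        ("emotionally aroused", "vague"),
        ("composed of more than one part", "non-vague"),
        ("occurring every two weeks", "non-vague"),
        ("falling below the manufacturer's standard", "non-vague"),
        ("situated at the farthest possible point from a center", "non-vague"),
    ]
    train, test = train_test_split(senses)
    model = MultinomialNB().fit([t for t, _ in train], [y for _, y in train])
    print("test accuracy on toy data:", accuracy(model, test))
    # Apply the sense-trained model to an ontology element definition.
    print(model.predict("A client that has a high value for the company"))
```

With a realistic training set, the same `predict` call is what would be applied to each CiTO relation definition; on eight glosses the output is only a smoke test of the pipeline.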
10. Automatic Vagueness Detection: Use Case: Detecting Vagueness in CiTO Ontology

Vague relations:

  • plagiarizes: A property indicating that the author of the citing entity plagiarizes the cited entity, by including textual or other elements from the cited entity without formal acknowledgement of their source.
  • citesAsAuthority: The citing entity cites the cited entity as one that provides an authoritative description or definition of the subject under discussion.

Non-vague relations:

  • sharesAuthorInstitutionWith: Each entity has at least one author that shares a common institutional affiliation with an author of the other entity.
  • providesDataFor: The cited entity presents data that are used in work described in the citing entity.

11. Automatic Vagueness Detection: Use Case: Detecting Vagueness in CiTO Ontology

Classification results:

  • 82% of relations were correctly classified as vague/non-vague.
  • 94% accuracy for non-vague relations.
  • 74% accuracy for vague relations.

Again, we classified the same relations with the subjectivity classifier:

  • 40% of vague/non-vague relations were classified as subjective/objective respectively.
  • 94% of non-vague relations were classified as objective.
  • 7% of vague relations were classified as subjective.

12. Vagueness-Aware Semantic Data: Future Work

  • Incorporate the current classifier into an ontology analysis tool.
  • Improve the classifier by considering new features.
  • See whether it is possible to build a vague sense lexicon.

13. Questions? Thank you!

Dr. Panos Alexopoulos, Semantic Applications Research Manager, iSOCO, (t) +34 913 349 797

  • iSOCO Madrid: Av. del Partenón, 16-18, 17, Campo de las Naciones, 28042 Madrid, España, (t) +34 913 349 797
  • iSOCO Pamplona: Parque Tomás Caballero, 2, 64, 31006 Pamplona, España, (t) +34 948 102 408
  • iSOCO Valencia: C/ Prof. Beltrán Báguena, 4, Oficina 107, 46009 Valencia, España, (t) +34 963 467 143
  • iSOCO Barcelona: Av. Torre Blanca, 57, Edificio ESADE CREAPOLIS, Oficina 3C 15, 08172 Sant Cugat del Vallès, Barcelona, España, (t) +34 935 677 200
  • iSOCO Colombia: Complejo Ruta N, Calle 67, 52-20, Piso 3, Torre A, Medellín, Colombia, (t) +57 516 7770 ext. 1132