Using construction grammar in conversational systems
![Page 1: Using construction grammar in conversational systems](https://reader035.fdocuments.net/reader035/viewer/2022062418/555051f0b4c905ae3f8b46dd/html5/thumbnails/1.jpg)
Using Construction Grammar in Conversational Systems
Marie-Claire Jenkins, PhD Thesis
(High level overview)
![Page 2: Using construction grammar in conversational systems](https://reader035.fdocuments.net/reader035/viewer/2022062418/555051f0b4c905ae3f8b46dd/html5/thumbnails/2.jpg)
Overview
This thesis was motivated by the machine's limitations in understanding natural language and in forming responses. The limitations and complexities of current search engine querying were also a factor.
Conversational systems are good for testing possible solutions and are useful on the web.
We used methods that are not common in these systems:
- Construction Grammar (CxG)
- OWL ontologies
- Lexical semantics
- A new stemmer (UEA-Lite)
![Page 3: Using construction grammar in conversational systems](https://reader035.fdocuments.net/reader035/viewer/2022062418/555051f0b4c905ae3f8b46dd/html5/thumbnails/3.jpg)
What I'm going to talk about
• Conversational systems: what they are and how they work & what their limitations are
• The Turing test and the Loebner prize
• 2 early experimental systems that we built
• OWL ontologies vs databases
• Construction grammar and Fluid construction grammar
• UEA-Lite stemmer
• Machine learning component
• KIA system diagram
• Evaluation methods and learnings
![Page 4: Using construction grammar in conversational systems](https://reader035.fdocuments.net/reader035/viewer/2022062418/555051f0b4c905ae3f8b46dd/html5/thumbnails/4.jpg)
Things I covered in my research:
- Natural language understanding
- Natural language generation
- Human computer interaction
- Service oriented systems
Things I didn't cover in my research:
- Knowledge acquisition
- Open domains
- Affective behaviour
- Everything else
![Page 5: Using construction grammar in conversational systems](https://reader035.fdocuments.net/reader035/viewer/2022062418/555051f0b4c905ae3f8b46dd/html5/thumbnails/5.jpg)
Conversational systems
They are more commonly referred to as "chatbots" or "Artificial Conversational Entities".
They converse with a user in natural language and simulate a human-human conversation.
They need to:
- "Understand" the user input
- Retrieve relevant information
- Generate a natural language response
There are 3 different kinds of chatbots...
![Page 6: Using construction grammar in conversational systems](https://reader035.fdocuments.net/reader035/viewer/2022062418/555051f0b4c905ae3f8b46dd/html5/thumbnails/6.jpg)
Social chatbots
Their purpose is to chat freely about anything at all with a user, much like you would with a friend. They are used online for fun.
![Page 7: Using construction grammar in conversational systems](https://reader035.fdocuments.net/reader035/viewer/2022062418/555051f0b4c905ae3f8b46dd/html5/thumbnails/7.jpg)
Educational chatbots
Their purpose is to help the user learn about something such as a new language, history or geography. They are often used in schools.
![Page 8: Using construction grammar in conversational systems](https://reader035.fdocuments.net/reader035/viewer/2022062418/555051f0b4c905ae3f8b46dd/html5/thumbnails/8.jpg)
Service oriented chatbots
Their purpose is to help customers find their way around the website and also to answer questions about their products & services.
![Page 9: Using construction grammar in conversational systems](https://reader035.fdocuments.net/reader035/viewer/2022062418/555051f0b4c905ae3f8b46dd/html5/thumbnails/9.jpg)
How they work
There are a variety of methods used but the most popular are:
- Database driven
- AIML (Artificial Intelligence Markup Language, XML based)
- Canned responses
- Stochastic methods
- Supervised learning
- Named entity recognition
- Templates
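As a rough illustration of the template and canned-response methods listed above, here is a minimal Python sketch of wildcard matching in the spirit of AIML. It is not actual AIML: the `CATEGORIES` table, the `respond` function and the sample phrases are all hypothetical names chosen for this example.

```python
import re

# Hypothetical pattern/template pairs, loosely modelled on AIML categories.
# "*" is a wildcard that captures one or more characters.
CATEGORIES = [
    ("WHERE IS MY *", "Let me check the status of your {0}."),
    ("HELLO", "Hello! How can I help you today?"),
]


def compile_pattern(pattern):
    """Turn a wildcard pattern into an anchored, case-insensitive regex."""
    parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile("^" + "(.+)".join(parts) + "$", re.IGNORECASE)


def respond(user_input, default="Sorry, I did not understand that."):
    """Return the first matching canned response, or a default reply."""
    # Strip punctuation so "Where is my order?" matches "WHERE IS MY *".
    text = re.sub(r"[^\w\s]", "", user_input).strip()
    for pattern, template in CATEGORIES:
        match = compile_pattern(pattern).match(text)
        if match:
            captured = [group.lower() for group in match.groups()]
            return template.format(*captured)
    return default
```

Any input that does not fit a stored pattern falls through to the default reply, which is the brittleness the later slides describe.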
![Page 10: Using construction grammar in conversational systems](https://reader035.fdocuments.net/reader035/viewer/2022062418/555051f0b4c905ae3f8b46dd/html5/thumbnails/10.jpg)
Phrase-based systems
"Phrase-based systems" are seen as generalized templates at the sentence level (like phrase structure rules) or at the discourse level.
1 - A phrasal pattern is selected [subject noun verb]
2 - Each part of the pattern is expanded [noun modifiers]
3 - When each phrasal pattern has been replaced by one or more words: END
They are very difficult to build because the phrasal interrelationships must be clearly specified, otherwise there can be inappropriate phrase expansions.
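The three expansion steps for phrase-based systems can be sketched in a few lines of Python. This is a toy under stated assumptions: the `GRAMMAR` table, its symbols and its vocabulary are hypothetical and are not taken from any system described in the talk.

```python
import random

# Hypothetical toy grammar: upper-case keys are phrasal patterns,
# everything else is a plain word.
GRAMMAR = {
    "SENTENCE": [["SUBJECT", "VERB", "NOUN_PHRASE"]],
    "SUBJECT": [["the", "customer"], ["the", "agent"]],
    "VERB": [["orders"], ["returns"]],
    "NOUN_PHRASE": [["DETERMINER", "NOUN"], ["DETERMINER", "MODIFIER", "NOUN"]],
    "DETERMINER": [["a"], ["the"]],
    "MODIFIER": [["new"], ["faulty"]],
    "NOUN": [["book"], ["phone"]],
}


def expand(symbol, rng):
    """Recursively expand a phrasal pattern until only words remain."""
    if symbol not in GRAMMAR:
        # Step 3: a plain word cannot be expanded any further.
        return [symbol]
    # Steps 1 and 2: select one expansion and expand each of its parts.
    words = []
    for part in rng.choice(GRAMMAR[symbol]):
        words.extend(expand(part, rng))
    return words


def generate(seed=0):
    """Generate one sentence; the seed makes the choice reproducible."""
    rng = random.Random(seed)
    return " ".join(expand("SENTENCE", rng))
```

Even in this tiny grammar, nothing stops a modifier from being paired with a noun it does not suit, which hints at why the interrelationships have to be specified so carefully.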
![Page 11: Using construction grammar in conversational systems](https://reader035.fdocuments.net/reader035/viewer/2022062418/555051f0b4c905ae3f8b46dd/html5/thumbnails/11.jpg)
Feature-based systems
In "feature-based systems" each possible alternative is represented by a feature, and each sentence is specified by a set of features.
Sentence generation is achieved by selecting features until the sentence is fully determined.
Features may include: positive/negative, past/present, statement/question...
Strength: any distinction in language can be a feature.
Weakness: it is very hard to maintain feature inter-relationships and to control feature selection.
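A minimal Python sketch of feature-based generation, using only the three example features named above (positive/negative, past/present, statement/question). The `realise` function, the feature names and the verb table are assumptions made for this illustration, and it only handles third-person singular subjects.

```python
# Hypothetical verb entry: the forms needed by the three features below.
ORDER = {"base": "order", "present": "orders", "past": "ordered"}


def realise(subject, verb, obj, features):
    """Build a sentence from a feature set (third-person singular only)."""
    past = features["tense"] == "past"
    negative = features["polarity"] == "negative"
    question = features["mood"] == "question"
    aux = "did" if past else "does"

    if question:
        # Questions front the auxiliary and use the base verb form.
        words = [aux, subject] + (["not"] if negative else []) + [verb["base"], obj]
        end = "?"
    elif negative:
        # Negative statements need the auxiliary plus "not".
        words = [subject, aux, "not", verb["base"], obj]
        end = "."
    else:
        words = [subject, verb["past"] if past else verb["present"], obj]
        end = "."

    text = " ".join(words) + end
    return text[0].upper() + text[1:]
```

Note how the three features already interact (negation and questions both change the verb form), which is a small-scale view of the maintenance problem.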
![Page 12: Using construction grammar in conversational systems](https://reader035.fdocuments.net/reader035/viewer/2022062418/555051f0b4c905ae3f8b46dd/html5/thumbnails/12.jpg)
Observations from live data
Tests on dialogue from the human-human customer service system on a large commercial website reveal that there is no consistency in language or phrase formulation.
There is a very small amount of formulaic language (canned responses).
A question was never formulated in the same way and never answered in the same way (apart from formulaic language).
This makes it hard for us to produce templates or anticipate user utterances.
![Page 13: Using construction grammar in conversational systems](https://reader035.fdocuments.net/reader035/viewer/2022062418/555051f0b4c905ae3f8b46dd/html5/thumbnails/13.jpg)
More Limitations
Main issues with existing systems:
- Scalability
- Knowledge & information storage
- User input disambiguation
- Response generation (word order, vocabulary, etc...)
- Knowledge/information retrieval
- Anaphora
- Managing the dialogue
- Displaying appropriate behaviour (affective issues)
- Knowledge assimilation
- Evaluation
![Page 14: Using construction grammar in conversational systems](https://reader035.fdocuments.net/reader035/viewer/2022062418/555051f0b4c905ae3f8b46dd/html5/thumbnails/14.jpg)
Turing test
“A machine is termed capable of thinking if it can, under certain prescribed conditions imitate a human by answering questions sufficiently well to deceive a human questioner for a reasonable period of time.” (Turing)
Objections to the test include that passing it does not prove intelligence or "understanding", among other things.
My personal opinion has changed since the beginning of my PhD research:
“The question of whether a computer can think is no more interesting than the question of whether a submarine can swim.” (Dijkstra)
![Page 15: Using construction grammar in conversational systems](https://reader035.fdocuments.net/reader035/viewer/2022062418/555051f0b4c905ae3f8b46dd/html5/thumbnails/15.jpg)
Turing test illustration
Wikipedia
![Page 16: Using construction grammar in conversational systems](https://reader035.fdocuments.net/reader035/viewer/2022062418/555051f0b4c905ae3f8b46dd/html5/thumbnails/16.jpg)
XKCD
![Page 17: Using construction grammar in conversational systems](https://reader035.fdocuments.net/reader035/viewer/2022062418/555051f0b4c905ae3f8b46dd/html5/thumbnails/17.jpg)
Loebner prize
This yearly contest is run by Hugh Loebner, who has offered a $100,000 prize for the first chatbot to pass the Turing test.
This test is controversial. Marvin Minsky said:
"I do hope that someone will volunteer to violate this proscription so that Mr. Loebner will indeed revoke his stupid prize, save himself some money, and spare us the horror of this obnoxious and unproductive annual publicity campaign."
![Page 18: Using construction grammar in conversational systems](https://reader035.fdocuments.net/reader035/viewer/2022062418/555051f0b4c905ae3f8b46dd/html5/thumbnails/18.jpg)
Loebner prize diagram
Michael Mauldin, Carnegie Mellon
![Page 19: Using construction grammar in conversational systems](https://reader035.fdocuments.net/reader035/viewer/2022062418/555051f0b4c905ae3f8b46dd/html5/thumbnails/19.jpg)
John
We built a conversational chatbot and entered it into the Loebner prize (2006). It was designed & built in 2 months and operated on a closed domain.
Reason: to run on a small database requiring little manual labour. We used n-grams, weighted responses, a vector approach, Perl, Brill, UEA-Lite, wildcards and AIML.
We were a finalist and we learned that:
- A small database worked for a small amount of time
- A database system makes for a laborious build and limited information (well used systems work much better)
- Template methods are limited
- Canned responses are awkward
- AIML is restrictive
![Page 20: Using construction grammar in conversational systems](https://reader035.fdocuments.net/reader035/viewer/2022062418/555051f0b4c905ae3f8b46dd/html5/thumbnails/20.jpg)
KIA: the HCI tests
We designed a system to research human-machine interaction and human behaviour: this is a test on humans and not on the system.
We included functions that were meant to test user persistence with query repair, emotive response, language etc...
Results: users persist, are emotive, sensitive to interface design and more.
Details available in our paper
![Page 21: Using construction grammar in conversational systems](https://reader035.fdocuments.net/reader035/viewer/2022062418/555051f0b4c905ae3f8b46dd/html5/thumbnails/21.jpg)
KIA – a CxG & OWL driven system
![Page 22: Using construction grammar in conversational systems](https://reader035.fdocuments.net/reader035/viewer/2022062418/555051f0b4c905ae3f8b46dd/html5/thumbnails/22.jpg)
Databases vs OWL ontologies:
Databases focus on local semantics and ontologies on global semantics.
In ontologies the semantics are explicit and in databases implicit.
Ontologies allow data to be reused whereas database schemas cannot be reused.
Ontologies are portable between websites to facilitate maintenance and construction
Restrictions in databases do not allow for all of the necessary relations to be built into the data.
![Page 23: Using construction grammar in conversational systems](https://reader035.fdocuments.net/reader035/viewer/2022062418/555051f0b4c905ae3f8b46dd/html5/thumbnails/23.jpg)
Database (Wordpress Bits)
OWL Ontology (Richard Durban)
![Page 24: Using construction grammar in conversational systems](https://reader035.fdocuments.net/reader035/viewer/2022062418/555051f0b4c905ae3f8b46dd/html5/thumbnails/24.jpg)
OWL flavour
We used OWL (Web Ontology Language) as it is more expressive than other semantic web languages and is built to enable ontologies to be created easily.
It is a semantic markup language and an extension of RDF (Resource Description Framework).
There are different subsets of OWL: OWL Full, OWL Lite and OWL DL (Description Logic).
We chose to use OWL DL.
![Page 25: Using construction grammar in conversational systems](https://reader035.fdocuments.net/reader035/viewer/2022062418/555051f0b4c905ae3f8b46dd/html5/thumbnails/25.jpg)
Why Ontologies & why OWL DL?
Taxonomies are also not as expansive as ontologies.
"At one extreme there are ontologies and the other mind maps and pathfinder networks, and in between taxonomies and browserable hierarchies." (Brewster and Wilks)
Ontologies have a greater potential for inference and a greater degree of formality.
OWL DL has stricter restrictions which are necessary in our type of system.
It has maximum expressiveness without losing the computational completeness (all entailments will be computed) and decidability (all computations will finish in finite time) of reasoning systems.
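To give a feel for the kind of inference an ontology makes possible, here is a deliberately tiny Python sketch that derives implied superclasses from explicit subclass statements. It is plain Python, not OWL syntax and not an OWL DL reasoner; the triples, class names and the `superclasses` helper are hypothetical.

```python
# Hypothetical facts stored as (subject, predicate, object) triples.
TRIPLES = {
    ("Koala", "subClassOf", "Marsupial"),
    ("Marsupial", "subClassOf", "Animal"),
    ("Koala", "eats", "EucalyptusLeaves"),
}


def superclasses(cls, triples):
    """Return every class that cls belongs to, directly or by inference."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in triples:
            if subject == current and predicate == "subClassOf" and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found
```

Here "Koala is an Animal" is never stated, yet it follows from the two subclass statements. A real reasoner covers far more than this single transitive rule.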
![Page 26: Using construction grammar in conversational systems](https://reader035.fdocuments.net/reader035/viewer/2022062418/555051f0b4c905ae3f8b46dd/html5/thumbnails/26.jpg)
OWL Ontology example: Koala
![Page 27: Using construction grammar in conversational systems](https://reader035.fdocuments.net/reader035/viewer/2022062418/555051f0b4c905ae3f8b46dd/html5/thumbnails/27.jpg)
What do we store in there?
- All of the domain knowledge (e.g. all about koalas)
- The collection of constructions (commonly used when discussing koalas)
- Canned responses (formulaic language)
![Page 28: Using construction grammar in conversational systems](https://reader035.fdocuments.net/reader035/viewer/2022062418/555051f0b4c905ae3f8b46dd/html5/thumbnails/28.jpg)
KIA system domain knowledge
![Page 29: Using construction grammar in conversational systems](https://reader035.fdocuments.net/reader035/viewer/2022062418/555051f0b4c905ae3f8b46dd/html5/thumbnails/29.jpg)
Construction Grammar
It is a cognitive linguistic method and it is:
- Constraint based
- Generative
- Non-derivational
- A monostratal grammatical model
- Incorporates the cognitive and interactional foundations of language
- Consists of taxonomies of families of constructions
- Uses entire constructions as the primary unit of grammar
- Is a pairing of form and meaning (metonomic)
- Frames used in CxG != regular frames because the argument structure types invoke frames which designate event types
- The verb alone is not the main unit of meaning, the construction itself is
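The idea of a construction as a pairing of form and meaning can be sketched as a small data structure. This is only an illustration: the `Construction` class, the ditransitive example and the slot names are assumptions for this sketch, not the representation used in KIA.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Construction:
    """A form-meaning pairing: the whole pattern carries the meaning."""
    name: str
    form: tuple   # syntactic pole: ordered slots
    meaning: str  # semantic pole: event type evoked by the pattern


# Hypothetical example: the ditransitive pattern evokes a transfer event,
# whichever verb fills the VERB slot.
DITRANSITIVE = Construction(
    name="ditransitive",
    form=("SUBJ", "VERB", "OBJ1", "OBJ2"),
    meaning="transfer: {SUBJ} causes {OBJ1} to receive {OBJ2}",
)


def instantiate(construction, fillers):
    """Fill the slots and return (surface sentence, semantic reading)."""
    missing = [slot for slot in construction.form if slot not in fillers]
    if missing:
        raise ValueError(f"missing fillers for slots: {missing}")
    sentence = " ".join(fillers[slot] for slot in construction.form)
    reading = construction.meaning.format(**fillers)
    return sentence, reading
```

With the fillers "Heather", "sang", "Tom" and "a song", the reading is a transfer event even though "sing" on its own does not mean transfer, which is the point of the last bullet above.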
![Page 30: Using construction grammar in conversational systems](https://reader035.fdocuments.net/reader035/viewer/2022062418/555051f0b4c905ae3f8b46dd/html5/thumbnails/30.jpg)
Constructions make sense in computing
Constructions
Words
Sentences
![Page 31: Using construction grammar in conversational systems](https://reader035.fdocuments.net/reader035/viewer/2022062418/555051f0b4c905ae3f8b46dd/html5/thumbnails/31.jpg)
Example of CxG
Semantics: relational predicate involving a singer
Syntactics: predicate requires arguments and "Heather" is the subject
Generative Grammar
Construction Grammar
![Page 32: Using construction grammar in conversational systems](https://reader035.fdocuments.net/reader035/viewer/2022062418/555051f0b4c905ae3f8b46dd/html5/thumbnails/32.jpg)
Advantages of CxG
- Adapts to changing language patterns easily
- Takes into consideration both semantics and syntactics
- Constructions are easier to manage than words as the atomic unit
- Allows for integration into bigger collections of constructions
- Can be computed
![Page 33: Using construction grammar in conversational systems](https://reader035.fdocuments.net/reader035/viewer/2022062418/555051f0b4c905ae3f8b46dd/html5/thumbnails/33.jpg)
UEA-Lite stemmer
After testing the system with all available stemmers, we realised that we needed to design our own to facilitate topic/construction detection.
UEA-Lite stems conservatively to orthographically correct word forms and recognizes words which do not need to be stemmed.
There are Perl, Java and Ruby versions.
More information here (an updated paper to follow soon)
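To illustrate what "conservative" stemming means in practice, here is a small Python sketch. It does not reproduce UEA-Lite's rules: the three suffix rules, the `PROTECTED` list and the decision to leave capitalised words alone are assumptions for this example, and a real stemmer needs a much larger rule set and exception list.

```python
import re

# Hypothetical list of words that must never be stemmed.
PROTECTED = frozenset({"this", "news", "series", "always"})

# Hypothetical rules, tried in order; the first match wins.
RULES = [
    (re.compile(r"ies$"), "y"),          # queries -> query
    (re.compile(r"sses$"), "ss"),        # classes -> class
    (re.compile(r"(?<=[^su])s$"), ""),   # koalas -> koala, but not status/glass
]


def stem(word):
    """Return a conservative stem, or the word unchanged when in doubt."""
    # Leave protected words, very short words and capitalised words alone.
    if word in PROTECTED or len(word) <= 3 or word[0].isupper():
        return word
    for pattern, replacement in RULES:
        if pattern.search(word):
            return pattern.sub(replacement, word)
    return word
```

The design choice is to prefer under-stemming: when no rule clearly applies, the word is returned as typed, so the output is always a readable word form.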
![Page 34: Using construction grammar in conversational systems](https://reader035.fdocuments.net/reader035/viewer/2022062418/555051f0b4c905ae3f8b46dd/html5/thumbnails/34.jpg)
Machine learning
It identifies constructions (NP or VP). The syntactic pole and the semantic pole feed in information so that constructions are loaded with both meaning and form information.
The machine learning engine finds sets of constructions which commonly work in conjunction with each other or that have been used in conjunction in the past.
The weights are adjusted each time a new construction is added. This happens when the system encounters a new instance.
The engine runs through this data and calculates the probability that the right matches for the query information will be found.
![Page 35: Using construction grammar in conversational systems](https://reader035.fdocuments.net/reader035/viewer/2022062418/555051f0b4c905ae3f8b46dd/html5/thumbnails/35.jpg)
Algorithms
- Jaccard Distance to weight the constructions (how often different constructions are found in conjunction, partial or complete)
- Naive Bayes algorithm clusters all of the constructions according to their different features in our training set (requires little training data)
Once the data has been processed through the Naive Bayes algorithm we know which constructions are often found with others, and in what order. We not only look at the syntax but also at the semantic aspect both in isolation and in conjunction with each other.
The role of the classifier is to determine which categories future constructions belong to, and also to tell us which constructions are a likely match to a query.
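As a sketch of the Jaccard weighting step, assume each construction is represented by the set of utterances it has appeared in; two constructions that often occur together then have a small distance. The function name and the sample data are hypothetical.

```python
def jaccard_distance(a, b):
    """1 - |A & B| / |A | B|; 0.0 means the two sets are identical."""
    union = a | b
    if not union:
        # Two empty sets are treated as identical.
        return 0.0
    return 1.0 - len(a & b) / len(union)


# Hypothetical data: ids of the utterances in which each construction occurred.
seen_in = {
    "NP:order": {1, 2, 3},
    "VP:arrive": {2, 3, 4},
    "VP:decline": {7, 8},
}
```

With this data, "NP:order" and "VP:arrive" share two of four utterances, so their distance is 0.5, while "NP:order" and "VP:decline" never co-occur and have distance 1.0.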
![Page 36: Using construction grammar in conversational systems](https://reader035.fdocuments.net/reader035/viewer/2022062418/555051f0b4c905ae3f8b46dd/html5/thumbnails/36.jpg)
Naïve Bayes for CxG
P(constructions) does not change from one category to another, so it can be ignored when comparing categories. Naive Bayes estimates a multinomial distribution over categories, which is the prior distribution of categories, P(cat). We can therefore say that:
$$\text{best category} = \arg\max_{cat \in cats} P(\text{constructions} \mid cat)\, P(cat)$$
If $c_1, c_2, \ldots, c_n$ are the constructions in the document, then:
$$\text{best category} = \arg\max_{cat \in cats} P(c_1 \mid cat)\, P(c_2 \mid cat) \cdots P(c_n \mid cat)\, P(cat)$$
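The arg max above can be sketched directly in Python. Two details here are assumptions that the slides do not mention: add-one (Laplace) smoothing so that unseen constructions do not zero out a category, and summing logarithms instead of multiplying probabilities to avoid underflow. The training data and function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """examples: list of (category, [construction, ...]) pairs."""
    cat_counts = Counter()
    cons_counts = defaultdict(Counter)
    vocab = set()
    for cat, constructions in examples:
        cat_counts[cat] += 1
        cons_counts[cat].update(constructions)
        vocab.update(constructions)
    return cat_counts, cons_counts, vocab


def best_category(constructions, model):
    """Return arg max over cat of P(c1|cat) * ... * P(cn|cat) * P(cat)."""
    cat_counts, cons_counts, vocab = model
    total = sum(cat_counts.values())
    best, best_score = None, -math.inf
    for cat in sorted(cat_counts):
        # log P(cat): the prior over categories.
        score = math.log(cat_counts[cat] / total)
        # Add-one smoothing (an assumption of this sketch).
        denom = sum(cons_counts[cat].values()) + len(vocab)
        for c in constructions:
            score += math.log((cons_counts[cat][c] + 1) / denom)
        if score > best_score:
            best, best_score = cat, score
    return best


# Hypothetical training data.
EXAMPLES = [
    ("delivery", ["NP:order", "VP:arrive"]),
    ("delivery", ["NP:parcel", "VP:arrive"]),
    ("payment", ["NP:card", "VP:decline"]),
]
MODEL = train(EXAMPLES)
```

For example, `best_category(["VP:arrive"], MODEL)` picks "delivery", because that construction was only ever seen in delivery examples and delivery also has the larger prior.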
![Page 37: Using construction grammar in conversational systems](https://reader035.fdocuments.net/reader035/viewer/2022062418/555051f0b4c905ae3f8b46dd/html5/thumbnails/37.jpg)
System diagram
As you can see, there are many more components to the system than are covered in this presentation.
![Page 38: Using construction grammar in conversational systems](https://reader035.fdocuments.net/reader035/viewer/2022062418/555051f0b4c905ae3f8b46dd/html5/thumbnails/38.jpg)
Evaluation methods
There are not any robust evaluation methods for conversational systems but we found that a mixture of the following worked well:
- Human evaluation (feedback form)
- "Pourpre" to evaluate sentence complexity (Jimmy Lin)
- Expected vs given response score
Evaluation is not finished as yet but the initial results are encouraging with good knowledge retrieval and construction selection.
![Page 39: Using construction grammar in conversational systems](https://reader035.fdocuments.net/reader035/viewer/2022062418/555051f0b4c905ae3f8b46dd/html5/thumbnails/39.jpg)
Things that didn't work
Using LSI/PLSI to determine the similarity between individual utterances in order to extract useful constructions failed.
The reasons:
LSI is an information retrieval method and Q&A systems require a higher level of accuracy.
Information retrieval uses a hammer and every problem is a nail. Subtler systems require a more delicate approach.
It is very hard to get LSI to scale to sentence level, which is interesting as it has been proven that it doesn't scale
The fact that it can't capture polysemy is ok because we disambiguate prior to this and append information to constructions
![Page 40: Using construction grammar in conversational systems](https://reader035.fdocuments.net/reader035/viewer/2022062418/555051f0b4c905ae3f8b46dd/html5/thumbnails/40.jpg)
Fluid Construction Grammar (FCG) (also didn't work!)
- Bi-directional (using rules)
- Selects meanings and maps them into the real world.
- "fluid" because it takes into consideration the fact that users change and update their grammars often.
- User input can be broken down syntactically in order to gain meaning from the grammatical components, whilst also being able to map the semantic relationships
BUT: not developed enough to work well in our system
Also: bi-directional rules are very hard to write
![Page 41: Using construction grammar in conversational systems](https://reader035.fdocuments.net/reader035/viewer/2022062418/555051f0b4c905ae3f8b46dd/html5/thumbnails/41.jpg)
Some Outcomes & Learnings
- Construction Grammar is a useful method for NLU & NLG
- OWL ontologies are well suited to these systems
- Stemming affects the system greatly
- Fluid CxG is not practical at this time
- Better evaluation methods need to be developed
- Turing test is not useful as it does not prove machine intelligence or understanding
- User perception is an essential area of research
![Page 42: Using construction grammar in conversational systems](https://reader035.fdocuments.net/reader035/viewer/2022062418/555051f0b4c905ae3f8b46dd/html5/thumbnails/42.jpg)
Applications & Future work
- Assisted search
- Summarization systems
- Content creation
- Speech systems
- Sentiment analysis
- More powerful AI module
- Anaphora resolution
- Open domain testing
- Improved machine learning
- Further work on query disambiguation methods
![Page 43: Using construction grammar in conversational systems](https://reader035.fdocuments.net/reader035/viewer/2022062418/555051f0b4c905ae3f8b46dd/html5/thumbnails/43.jpg)
Thank you
Find me at:
http://www.scienceforseo.com
http://twitter.com/missmcj
Google reader