Historical Perspectives on Natural Language Processing
April 2008 Historical Perspectives on NLP 1
Historical Perspectives on Natural Language Processing
Mike Rosner
Dept Artificial Intelligence
Outline
• What is NLP?
• What makes natural languages special?
• Classic computational models
– Knowledge Free NLP
– Knowledge Based NLP
• Issues
• Demos
References
• Dan Jurafsky and Jim Martin, Speech and Language Processing, Prentice Hall 2000.
• www.cs.um.edu.mt/~mros/nlpworld/historical/index.html
What is NLP?
• NLP aims to get computers to process/use natural language like people.
• Motivation
– performance goal: make computers more human friendly.
– scientific goal: understand how language works by building computational models.
Language-Enabled Programs
• Spelling and Style Correction
• Parsing and Generation
• Document Processing
– Classification
– Summarisation
– Retrieval/Extraction
• Translation
• Dialogue
• Question Answering
• Speech
• Multimodal Communication
NLP is Interdisciplinary
• Computer Science +
• Linguistics +
• Artificial Intelligence +
• Software Engineering +
• Signal Processing +
• Knowledge Representation
Overall History
• 1950-1965: Machine Translation
• 1970-1980: Artificial Intelligence approaches based on semantics
• 1980-1995: Increasingly sophisticated use of syntactic models
• 1995-Present: Data driven statistical models.
Machine Translation
Typical Problem:
• Les soldats sont dans le café
• The soldiers are in the coffee.
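The mistranslation above is what a purely word-for-word lookup produces. The following is a minimal sketch of that failure, not a reconstruction of any historical MT system: the `LEXICON` table and the function name are invented for illustration, and each French word is assumed to have exactly one English equivalent.

```python
# Hypothetical one-sense-per-word lexicon (an assumption for illustration).
# "café" can mean either the drink or the place, but only one sense fits here.
LEXICON = {
    "les": "the",
    "soldats": "soldiers",
    "sont": "are",
    "dans": "in",
    "le": "the",
    "café": "coffee",
}


def translate_word_for_word(sentence):
    """Replace each word by its single lexicon entry, ignoring context."""
    words = sentence.lower().rstrip(".").split()
    english = [LEXICON.get(word, word) for word in words]
    return " ".join(english).capitalize() + "."


print(translate_word_for_word("Les soldats sont dans le café"))
# prints: The soldiers are in the coffee.
```

Because the lookup never consults the surrounding words, nothing tells it that soldiers are more plausibly inside a café than inside a drink.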
Natural Languages: What is so Special?
• Not designer languages
• Universal application
• Context dependence
• Linguistic + para-linguistic elements
• Ambiguity
Ambiguity
• Word-sense ambiguity
– The board balanced the decision
– The board balanced the lead weights
• Syntactic Ambiguity
– I shot an elephant in my underpants
• Semantic Ambiguity
– Every girl likes a sailor
• Pragmatic Ambiguity
– Can you pass the salt?
ELIZA
• Joseph Weizenbaum coded ELIZA at MIT during the years 1964-1966
• The aim was to produce a very simple program, based on the presence of keywords in the input, capable of typed NL conversation.
• ELIZA simulates a psychotherapist who reflects the patient's statements back to him and encourages the patient to keep talking.
• The role of the computer was carefully chosen to disguise the program's lack of knowledge.
ELIZA Interaction
• Men are all alike
• IN WHAT WAY?
• They're always bugging us about something or other
• CAN YOU THINK OF A SPECIFIC EXAMPLE?
• Well my boyfriend made me come here
• YOUR BOYFRIEND MADE YOU COME HERE?
• He says I'm depressed much of the time
• I'M SORRY TO HEAR YOU ARE DEPRESSED
• Perhaps I could learn to get on with my mother.
• TELL ME MORE ABOUT YOUR FAMILY
ELIZA – How it works
• Database of keywords, each associated with
– A decomposition rule to match against input: (0 you 0 me), with components numbered 1 2 3 4 from left to right
– A reassembly rule that constitutes an output specification, e.g. (what makes you think that I 3 you?)
• Example
– I wonder if you really like me
– WHAT MAKES YOU THINK THAT I REALLY LIKE YOU
• Various heuristics for selecting next topic, keeping conversation going, avoiding repetition etc.
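The keyword, decomposition and reassembly mechanism can be sketched in a few lines. This is an illustrative approximation, not Weizenbaum's code: the single rule is the one shown on the slide, while the `RULES` table layout, the function names and the `FALLBACK` reply are assumptions. The topic-selection and anti-repetition heuristics are left out.

```python
import re

# Hypothetical rule table: (decomposition pattern, reassembly template).
# The one rule is taken from the slide; a real script would hold many more.
RULES = [
    ("0 you 0 me", "what makes you think that I 3 you?"),
]
FALLBACK = "please go on"  # assumed default reply when no rule matches


def compile_decomposition(pattern):
    """Build a regex from a pattern such as '0 you 0 me'.

    '0' stands for any run of words (possibly empty); every other token is a
    keyword that must appear literally. Each token becomes a numbered group.
    """
    groups = []
    for token in pattern.split():
        if token == "0":
            groups.append(r"(.*?)")
        else:
            groups.append(r"(\b" + re.escape(token) + r"\b)")
    return re.compile(r"^\s*" + r"\s*".join(groups) + r"\s*$", re.IGNORECASE)


def reassemble(template, match):
    """Fill numeric slots in the template with the matched components."""
    words = []
    for token in template.split():
        if token.isdigit():
            piece = match.group(int(token)).strip()
            if piece:
                words.append(piece)
        else:
            words.append(token)
    return " ".join(words).upper()


def respond(utterance):
    """Apply the first rule whose decomposition pattern matches."""
    text = utterance.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = compile_decomposition(pattern).match(text)
        if match:
            return reassemble(template, match)
    return FALLBACK.upper()


print(respond("I wonder if you really like me"))
# prints: WHAT MAKES YOU THINK THAT I REALLY LIKE YOU?
```

Note that the program never interprets component 3; it copies "really like" into the reply verbatim, which is why such a system can always answer while knowing nothing about language or the world.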
ELIZA - Comments
• Remarkable example of robust NLP which always produces an answer.
• Contains no interesting knowledge about language or the world
• Good example of creating apparently intelligent behaviour using programming tricks and an artificial context
• J. Weizenbaum, ELIZA – A Computer Program for the Study of Natural Language Communication between Man and Machine, CACM 9(1), pp. 36-45, 1966
Eliza Anecdote [1]
• Student: May I use your computer time from the office?
• Response: Do you want to use my computer time?
• Student: Yes.
• Response: I understand.
• Student: Are you planning to come into the office tonight?
• Response: Why do you want to know whether I am planning to come into the office tonight?
• Student: Because I want to use your computer time!
• Response: I see.
• Student: Well?
• Response: What do you mean, "Well"?
• Student: Well, what's your answer to my questions?
• Response: Let's explore further why you want to use my computer time.
Eliza Anecdote [2]
• After about ten minutes of increasing frustration and anger, the student rushed to a telephone and dialed Bobrow's home, rousing him from a deep sleep.
• Student: What the heck do you think you're doing?
• Bobrow: What do you mean, What the heck do I think I'm doing?
• http://www.kurzweilai.net/meme/frame.html?main=/articles/art0319.html
SHRDLU (T. Winograd 1971)
Three Basic Principles
• A computer cannot deal reasonably with language unless it can understand the subject it is discussing.
• In modelling language understanding, different levels of language processing must be integrated.
• Humans do not understand isolated sentences. Sentences always occur in a context which includes discourse context, spatio/temporal context, and shared knowledge.
Winograd's Approach
• It is difficult or impossible to isolate one aspect of language use (syntax, morphology, semantics, pragmatics) from another:
– He gave the house plants to charity
– He gave the boy plants to water
• It is difficult to draw the line between use of linguistic knowledge and other kinds of knowledge
• How can the problem be contained?
Blocks World
• Winograd's strategy was to pick a microworld to talk about, the "blocks world" of a toy robot. In the course of a dialogue:
– Manipulate objects
– Answer questions
– Discuss events
– Accept new facts
• Key idea: by limiting the subject area, it becomes possible to address general issues of how language is used in a framework of physical objects, events and continuing discourse.
Main Components
• Syntactic Parser – complex grammar of English
• Semantic routines – knowledge of concepts attached to words
• Cognitive deductive system for exploring the consequences of actions
• Simple set of programs for generating appropriate English responses
Sample Interaction I
• Pick up a big red block
• OK
• Grasp the pyramid
• I DON'T KNOW WHICH PYRAMID YOU MEAN
• Find a block which is taller than the one you are holding and put it in the box
• BY IT I ASSUME YOU MEAN THE BLOCK WHICH IS TALLER THAN THE ONE I AM HOLDING
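The first two exchanges show why knowledge of the microworld matters: "a big red block" succeeds if any such block exists, while "the pyramid" fails when more than one pyramid is on the table. The sketch below illustrates only that check and is not Winograd's implementation. The `WORLD` contents, the `Thing` type and the function names are invented, the no-match reply is an assumed wording, and the descriptions are passed in already parsed rather than as English text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thing:
    name: str
    shape: str
    colour: str
    size: str


# Hypothetical scene; the real program's table held different objects.
WORLD = [
    Thing("b1", "block", "red", "big"),
    Thing("b2", "block", "green", "small"),
    Thing("p1", "pyramid", "green", "small"),
    Thing("p2", "pyramid", "red", "small"),
]


def candidates(shape, colour=None, size=None):
    """Return every object in WORLD that fits the description."""
    return [
        thing for thing in WORLD
        if thing.shape == shape
        and (colour is None or thing.colour == colour)
        and (size is None or thing.size == size)
    ]


def resolve(determiner, shape, colour=None, size=None):
    """Reply as the robot would to a request mentioning this noun phrase."""
    found = candidates(shape, colour, size)
    if not found:
        return f"I CAN'T FIND ANY SUCH {shape.upper()}"  # assumed wording
    if determiner == "the" and len(found) > 1:
        # A definite description must pick out exactly one object.
        return f"I DON'T KNOW WHICH {shape.upper()} YOU MEAN"
    return "OK"


print(resolve("a", "block", colour="red", size="big"))  # prints: OK
print(resolve("the", "pyramid"))
# prints: I DON'T KNOW WHICH PYRAMID YOU MEAN
```

Adding a colour makes the definite description unique again: `resolve("the", "pyramid", colour="red")` returns "OK" in this scene.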
Sample Interaction II
• Had you touched any pyramid before you put the little green one on the little cube?
• YES, THE GREEN ONE
• When did you pick it up?
• WHILE I WAS STACKING UP THE RED CUBE, A LARGE RED BLOCK AND A LARGE GREEN CUBE
• Why?
• TO GET RID OF IT
• Why did you do that?
• TO CLEAN OFF THE RED CUBE
Issues
• Does NLP have to be knowledge-based?
• What kind of knowledge? How is it represented?
• How does the knowledge get there?
– Expert Systems approach – knowledge is entered explicitly in the form of rules
– Data driven approach – take large quantities of data and use machine learning techniques to acquire knowledge automatically