CSTalks-Natural Language Processing-2 Nov

Natural Language Processing Daniel Dahlmeier NUS Graduate School for Integrative Sciences and Engineering [email protected] CSTalks 2 November 2011

Transcript of CSTalks-Natural Language Processing-2 Nov

Page 1: CSTalks-Natural Language Processing-2 Nov

Natural Language Processing

Daniel Dahlmeier

NUS Graduate School for Integrative Sciences and Engineering

[email protected]

CSTalks 2 November 2011

Page 2: CSTalks-Natural Language Processing-2 Nov

Acknowledgments

Examples and figures from Michael Collins’ lecture notes: http://www.cs.columbia.edu/~mcollins.

Some other figures are from Wikipedia: http://www.wikipedia.org.

The rest I randomly found on the web.

Page 3: CSTalks-Natural Language Processing-2 Nov

Examples

What is NLP?

Background

NLP tasks

Why is it hard?

Related Stuff

Conclusion

Google translate

Page 4: CSTalks-Natural Language Processing-2 Nov

IBM’s Watson computer wins at Jeopardy!

Page 5: CSTalks-Natural Language Processing-2 Nov

Siri

Page 6: CSTalks-Natural Language Processing-2 Nov

What is Natural Language Processing?

Natural Language Processing (NLP) or Computational Linguistics

Language processing that goes beyond a “bag of words” representation.
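To see what a “bag of words” representation throws away, here is a minimal sketch (the function name and the example sentences are illustrative assumptions, not from the talk). Two sentences with opposite meanings end up with identical bags because word order is discarded.

```python
from collections import Counter

def bag_of_words(sentence: str) -> Counter:
    """Count lowercase word tokens, discarding word order."""
    return Counter(sentence.lower().split())

# Illustrative sentences (not from the talk): same words, different meaning.
a = bag_of_words("dog bites man")
b = bag_of_words("man bites dog")
same_bag = (a == b)  # the two bags are indistinguishable
```

Telling these two sentences apart requires at least some notion of structure, which is where the tasks on the following slides come in.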

Example

Translate from one language into another.

Answer natural language questions.

Parse the syntactic/semantic structure of a sentence.

The other NLP

NLP ≠ neuro-linguistic programming.

Page 7: CSTalks-Natural Language Processing-2 Nov

Background(s): Artificial Intelligence

Talk to your computer

Dave: Hello, HAL. Do you read me, HAL?

HAL: Affirmative, Dave. I read you.

Dave: Open the pod bay doors, HAL.

HAL: I’m sorry, Dave. I’m afraid I can’t do that.

The computer needs to ...

Understand the user: Natural Language Understanding.

Generate a well-formed reply: Natural Language Generation.

Page 8: CSTalks-Natural Language Processing-2 Nov

Background(s): Artificial Intelligence (cont.)

Turing Test

An experimenter C talks to two parties, A and B, via a terminal.

If C cannot distinguish which party is a computer and which is a human, we should consider the computer to be intelligent.

Natural language is deeply intertwined with intelligence.

Page 9: CSTalks-Natural Language Processing-2 Nov

Background(s): Linguistics

Generative Linguistics

Humans can produce and understand an infinite number of sentences by means of a finite set of rules.

Language is produced through a generative, recursive process in the human brain.

The principles that underlie this process are universal to all languages (universal grammar).

Page 10: CSTalks-Natural Language Processing-2 Nov

Background(s): the Web

“We are drowning in information but starved for knowledge.” (Edward Osborne Wilson)

Too much text to read...

Wikipedia: over 3.7 million articles (English).

PubMed: over 20 million citations.

WWW: billions of pages, trillions of words.

Page 11: CSTalks-Natural Language Processing-2 Nov

Part-of-speech Tagging

Part-of-speech tagging

Input: a sentence.

Output: a part-of-speech tag sequence, e.g., noun, verb, adjective, ...

Example

Profits/N soared/V at/P Boeing/N Co./N ,/, easily/ADV topping/V forecasts/N on/P Wall/N Street/N ./.
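The input/output contract above can be made concrete with a very simple baseline: tag every word with the tag it carried most often in training data. This is only a sketch under stated assumptions, not the method behind any real tagger. The helper names are made up, the "training data" is just the single example sentence from the slide, and unknown words default to N.

```python
from collections import Counter, defaultdict

def read_tagged(line):
    """Split 'word/TAG word/TAG ...' into (word, tag) pairs."""
    return [tuple(item.rsplit("/", 1)) for item in line.split()]

def train_baseline(tagged_sentences):
    """Map each lowercased word to the tag it was seen with most often."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, tag in sentence:
            counts[word.lower()][tag] += 1
    return {word: tags.most_common(1)[0][0] for word, tags in counts.items()}

def tag(words, lexicon, default="N"):
    """Tag each word from the lexicon; unknown words get the default tag."""
    return [(w, lexicon.get(w.lower(), default)) for w in words]

# Toy training set: only the example sentence from the slide.
train = [read_tagged(
    "Profits/N soared/V at/P Boeing/N Co./N ,/, easily/ADV topping/V "
    "forecasts/N on/P Wall/N Street/N ./."
)]
lexicon = train_baseline(train)
tagged = tag("Airbus soared on Wall Street .".split(), lexicon)
```

A baseline like this ignores context entirely, so it cannot tag the same word differently in different sentences; that limitation is exactly what makes the task interesting.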

Page 12: CSTalks-Natural Language Processing-2 Nov

Named-entity recognition

Named-entity recognition

Input: a sentence.

Output: a BIO named-entity tag sequence, e.g., PERSON, ORGANIZATION, OTHER.

Example

Profits/O soared/O at/O Boeing/B-ORG Co./I-ORG ,/O easily/O topping/O forecasts/O on/O Wall/O Street/O ./O
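In the BIO scheme, B- marks the first token of an entity, I- a continuation, and O a token outside any entity. The following sketch (the function name is an assumption) shows how a tag sequence like the one above could be turned back into entity spans. It ignores a stray I- tag that has no matching open entity, which is one of several possible conventions.

```python
def bio_to_spans(tokens, tags):
    """Group B-/I- tagged tokens into (entity text, entity type) pairs."""
    spans, words, kind = [], [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("I-") and words and tag[2:] == kind:
            words.append(token)  # continue the open entity
            continue
        if words:  # any other tag closes the open entity
            spans.append((" ".join(words), kind))
            words, kind = [], None
        if tag.startswith("B-"):  # start a new entity
            words, kind = [token], tag[2:]
    if words:  # close an entity that runs to the end of the sentence
        spans.append((" ".join(words), kind))
    return spans

tagged = ("Profits/O soared/O at/O Boeing/B-ORG Co./I-ORG ,/O easily/O "
          "topping/O forecasts/O on/O Wall/O Street/O ./O")
tokens, tags = zip(*(item.rsplit("/", 1) for item in tagged.split()))
entities = bio_to_spans(tokens, tags)
```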

Page 13: CSTalks-Natural Language Processing-2 Nov

Word Sense Disambiguation

Word sense disambiguation

Input: a sentence.

Output: the sense of each word in the sentence.

Example

I/sense1 can/sense1 can/sense2 a/sense1 can/sense3 .

Page 14: CSTalks-Natural Language Processing-2 Nov

Parsing

Parsing

Input: a sentence.

Output: the syntactic tree structure of the sentence.

Example

Boeing is located in Seattle.
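The tree figure for this sentence is not part of the transcript. As a rough sketch of how such a tree could be computed, here is a small CKY-style chart parser over a toy grammar. The grammar, the lexicon and the function name are assumptions made up for this one sentence, and the parser keeps only the first tree it finds for each label and span.

```python
from collections import defaultdict

# Toy grammar in Chomsky normal form (an assumption for illustration):
# (left child, right child) -> parent label.
BINARY = {
    ("NP", "VP"): "S",
    ("V", "VP"): "VP",
    ("V", "PP"): "VP",
    ("P", "NP"): "PP",
}
# Toy lexicon: one label per word; unknown words raise KeyError.
LEXICON = {"Boeing": "NP", "Seattle": "NP", "is": "V", "located": "V", "in": "P"}

def cky_parse(words):
    """Return a bracketed tree rooted in S, or None if there is no parse."""
    n = len(words)
    # chart[(i, j)] maps a label to a tree string covering words[i:j]
    chart = defaultdict(dict)
    for i, word in enumerate(words):
        label = LEXICON[word]
        chart[(i, i + 1)][label] = f"({label} {word})"
    for width in range(2, n + 1):            # span length
        for i in range(n - width + 1):       # span start
            j = i + width
            for k in range(i + 1, j):        # split point
                for left, left_tree in chart[(i, k)].items():
                    for right, right_tree in chart[(k, j)].items():
                        parent = BINARY.get((left, right))
                        if parent and parent not in chart[(i, j)]:
                            chart[(i, j)][parent] = (
                                f"({parent} {left_tree} {right_tree})"
                            )
    return chart[(0, n)].get("S")

tree = cky_parse("Boeing is located in Seattle".split())
# tree == "(S (NP Boeing) (VP (V is) (VP (V located) (PP (P in) (NP Seattle)))))"
```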

Page 15: CSTalks-Natural Language Processing-2 Nov

Machine translation

Machine Translation

Input: a sentence in language F.

Output: the translated sentence in language E.

Example

Input: Syriens Präsident Baschar al-Assad hat den Westen davor gewarnt, sich in die Angelegenheiten seines Landes einzumischen.

Output: Syrian President Bashar al-Assad has warned the West against interfering in the affairs of his country.

Page 16: CSTalks-Natural Language Processing-2 Nov

Why is it hard? (example from L. Lee)

“At last, a computer that understands you like your mother”

Page 17: CSTalks-Natural Language Processing-2 Nov

Ambiguity of Natural Language

“At last, a computer that understands you like your mother”

This could mean...

1. It understands you as well as your mother understands you.

2. It understands (that) you like your mother.

3. It understands you as well as it understands your mother.

1 and 3: Does this mean well, or poorly?

Page 18: CSTalks-Natural Language Processing-2 Nov

Ambiguity at the Acoustic Level

“At last, a computer that understands you like your mother”

This sounds like...

1. “... a computer that understands you like your mother.”

2. “... a computer that understands you lie cured mother.”

Page 19: CSTalks-Natural Language Processing-2 Nov

Ambiguity at the Syntactic (structure) Level

“At last, a computer that understands you like your mother”

Page 20: CSTalks-Natural Language Processing-2 Nov

Ambiguity at the Syntactic (structure) Level

“List all flights on Tuesday.”
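The figure for this slide is missing from the transcript. Presumably it contrasted two attachments of “on Tuesday”: flights that depart on Tuesday, versus doing the listing on Tuesday. The sketch below counts parse trees with a CKY-style chart under a toy grammar. The grammar, the lexicon and the function name are assumptions for illustration only.

```python
from collections import defaultdict

# Toy grammar in Chomsky normal form (an assumption): (parent, left, right).
BINARY = [
    ("VP", "V", "NP"),    # list [all flights ...]
    ("VP", "VP", "PP"),   # [list all flights] [on Tuesday]
    ("NP", "NP", "PP"),   # [all flights] [on Tuesday]
    ("NP", "DT", "N"),
    ("PP", "P", "NP"),
]
LEXICON = {"List": ["V"], "all": ["DT"], "flights": ["N"],
           "on": ["P"], "Tuesday": ["NP"]}

def count_parses(words, root="VP"):
    """Count the distinct trees rooted in `root` that cover all the words."""
    n = len(words)
    counts = defaultdict(int)  # (start, end, label) -> number of trees
    for i, word in enumerate(words):
        for label in LEXICON[word]:
            counts[(i, i + 1, label)] += 1
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for parent, left, right in BINARY:
                    # every pairing of a left subtree with a right subtree
                    counts[(i, j, parent)] += (
                        counts[(i, k, left)] * counts[(k, j, right)]
                    )
    return counts[(0, n, root)]

n_trees = count_parses("List all flights on Tuesday".split())  # 2 readings
```

Under this toy grammar the imperative sentence is rooted in VP, and the two trees differ only in whether the PP attaches to the noun phrase or to the verb phrase.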

Page 21: CSTalks-Natural Language Processing-2 Nov

Ambiguity at the Semantic (meaning) Level

Definition of “mother”

1. a woman who has given birth to a child

2. a stringy slimy substance consisting of yeast cells and bacteria; is added to cider or wine to produce vinegar.

More ambiguity

They put money in the bank (= buried in mud?).

I saw her duck with a telescope (= a duck carrying a telescope?).
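One classic heuristic for choosing between word senses, a simplified variant of the Lesk algorithm, picks the sense whose dictionary gloss shares the most words with the sentence. The talk does not say which method it has in mind, so this is only a sketch: the glosses, the sense labels and the stopword list below are made up for illustration.

```python
STOPWORDS = {"a", "an", "the", "in", "of", "that", "and", "or", "to", "they"}

# Hypothetical glosses, written for this example (not from a real dictionary).
SENSES = {
    "bank": {
        "financial institution":
            "an institution that accepts deposits of money and makes loans",
        "river bank":
            "sloping land of mud or sand beside a river or lake",
    }
}

def content_words(text):
    """Lowercase the words, strip punctuation, drop stopwords."""
    return {w.strip(".,;").lower() for w in text.split()} - STOPWORDS

def simplified_lesk(word, sentence):
    """Pick the sense whose gloss overlaps most with the sentence context."""
    context = content_words(sentence) - {word}
    best_sense, best_overlap = None, -1
    for sense, gloss in SENSES[word].items():
        overlap = len(context & content_words(gloss))
        if overlap > best_overlap:  # ties keep the earlier sense
            best_sense, best_overlap = sense, overlap
    return best_sense

sense = simplified_lesk("bank", "They put money in the bank.")
```

Here “money” appears in the first gloss, so the financial sense is chosen. With no overlapping words at all the heuristic falls back to the first listed sense, which shows how fragile it is.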

Page 22: CSTalks-Natural Language Processing-2 Nov

Ambiguity at the Discourse (multi-clause) Level

Anaphora resolution

Alice says they’ve built a computer that understands you like your mother. But she ...

... doesn’t know any details (Alice)

... doesn’t understand me at all (my mother)
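A first step in anaphora resolution is often to collect the earlier mentions that agree with the pronoun, for example in gender. The sketch below uses a made-up two-word lexicon and a hypothetical function name. It shows that agreement alone leaves both “Alice” and “mother” as candidates for “she”, so the continuation of the sentence has to decide.

```python
# Hypothetical mini-lexicon of words that can be referred to as "she";
# a real system would need far richer knowledge.
FEMININE = {"alice", "mother"}

def candidate_antecedents(tokens, pronoun_index):
    """Return earlier tokens that agree in gender with the pronoun 'she'."""
    return [t for t in tokens[:pronoun_index] if t.lower() in FEMININE]

tokens = ("Alice says they've built a computer that understands you "
          "like your mother . But she").split()
candidates = candidate_antecedents(tokens, tokens.index("she"))
```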

Page 23: CSTalks-Natural Language Processing-2 Nov

Related Stuff

Machine Learning

This really made large-scale, open domain NLP applications possible.

Information Retrieval

Both need to “understand” language.

Linguistics

Interested in the nature of language.

Psychology / Cognitive Science

Both interested in human cognitive capabilities.

Page 24: CSTalks-Natural Language Processing-2 Nov

Conclusion

What I have told you...

What NLP is about.

Some NLP tasks that people work on.

Why it’s not that easy.

What I haven’t told you

How do you solve all these problems?

How well does it work?

What is left to be done?

Page 25: CSTalks-Natural Language Processing-2 Nov

Would you like to know more?

NLP courses at NUS

CS4248: natural language processing

CS6207: advanced natural language processing

Books

Jurafsky and Martin, Speech and Language Processing (2nd Edition)
