ch01 cs834


  • 7/27/2019 ch01 cs834


    An Introduction to Language Processing with Perl and Prolog

    Chapter 1: An Overview of Language Processing

    Pierre Nugues

    Lund University


    http://www.cs.lth.se/home/Pierre_Nugues/

    Pierre Nugues An Introduction to Language Processing with Perl and Prolog 1 / 19


    Applications of Language Processing

    Spelling and grammatical checkers: MS Word

    Text indexing and information retrieval on the Internet: Google, Microsoft Bing, Yahoo

    Telephone information that understands some spoken questions: SJ (trains in Sweden) or Tellme.com in the United States

    Speech dictation of letters or reports: IBM ViaVoice, Windows Vista

    Translation: Google Translate, SYSTRAN


    Linguistics Layers

    Sounds

    Phonemes

    Words and morphology

    Syntax and functions

    Semantics

    Dialogue


    Sounds and Phonemes

    [Figure: the utterances "Serious", "C'est par là", and "It is that way"]


    Lexicon and Parts of Speech

    The big cat ate the gray mouse

    The/article big/adjective cat/noun ate/verb the/article gray/adjective mouse/noun

    Le/article gros/adjectif chat/nom mange/verbe la/article souris/nom grise/adjectif

    Die/Artikel große/Adjektiv Katze/Substantiv ißt/Verb die/Artikel graue/Adjektiv Maus/Substantiv
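The English annotation above can be reproduced with a simple dictionary lookup. The following is a minimal sketch, assuming a hypothetical toy lexicon (`LEXICON`) in which every word has exactly one part of speech; real taggers must also resolve words that accept several tags.

```python
# Hypothetical toy lexicon: each word form maps to a single part of speech.
LEXICON = {
    "the": "article",
    "big": "adjective",
    "gray": "adjective",
    "cat": "noun",
    "mouse": "noun",
    "ate": "verb",
}


def tag(sentence, lexicon=LEXICON):
    """Annotate each word of the sentence as word/part-of-speech."""
    tagged = []
    for word in sentence.split():
        # Look the word up in lowercase; words missing from the lexicon
        # are marked "unknown" instead of being guessed.
        pos = lexicon.get(word.lower(), "unknown")
        tagged.append(f"{word}/{pos}")
    return " ".join(tagged)


print(tag("The big cat ate the gray mouse"))
```

The same function would work for the French and German examples given a lexicon for each language.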


    Morphology

    Word        Root form
    worked      to work + verb + preterit
    travaillé   travailler + verb + past participle
    gearbeitet  arbeiten + verb + past participle
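A rough illustration of how the first row of the table could be computed: the sketch below strips a suffix and checks the result against a list of known roots. The rule table `RULES` and the root list `ROOTS` are hypothetical and cover only the English example; they are not taken from the slides.

```python
# Hypothetical suffix rules: (suffix to strip, features of the inflected form).
RULES = [
    ("ed", ("verb", "preterit")),
]

# Hypothetical list of known root forms, used to reject bad analyses.
ROOTS = {"work", "walk", "talk"}


def analyze(word):
    """Return (root, features) for every rule that yields a known root."""
    analyses = []
    for suffix, features in RULES:
        if word.endswith(suffix):
            root = word[: -len(suffix)]
            if root in ROOTS:
                analyses.append((root, features))
    return analyses


print(analyze("worked"))
print(analyze("red"))  # ends in "ed" but "r" is not a known root
```

Checking the candidate root against a lexicon is what keeps a word such as "red" from being analyzed as a preterit.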


    Syntactic Tree

    sentence
      noun phrase
        article: The
        noun: boy
      verb phrase
        verb: hit
        noun phrase
          article: the
          noun: ball
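One possible way to hold such a tree in a program is as nested tuples, where the first element is the node label and the remaining elements are its children. This encoding is an assumption made for illustration, not a representation given in the slides.

```python
# Each node is (label, child, child, ...); a leaf child is a plain string.
TREE = (
    "sentence",
    ("noun phrase", ("article", "The"), ("noun", "boy")),
    (
        "verb phrase",
        ("verb", "hit"),
        ("noun phrase", ("article", "the"), ("noun", "ball")),
    ),
)


def leaves(node):
    """Collect the words at the leaves of the tree, from left to right."""
    _label, *children = node
    words = []
    for child in children:
        if isinstance(child, str):
            words.append(child)
        else:
            words.extend(leaves(child))
    return words


print(" ".join(leaves(TREE)))
```

Reading the leaves from left to right gives back the original sentence.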


    Syntax: A Classical View

    A graph of dependencies and functions

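    The boy hit the ball

    [Figure: dependency graph over the sentence, labeled with the functions Subject (boy), Verb (hit), and Object (ball)]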


    Semantics

    As opposed to syntax:

    1. Colorless green ideas sleep furiously.

    2. *Furiously sleep ideas green colorless.

    Determining the logical form:

    Sentence                   Logical representation
    Frank is writing notes     writing(Frank, notes).
    François écrit des notes   écrit(François, notes).
    Franz schreibt Notizen     schreibt(Franz, Notizen).
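For the simplest row of the table, the mapping from sentence to logical form can be sketched in a few lines. The function below is a deliberately naive illustration that assumes the sentence has exactly three words in subject, verb, object order; the other two rows need real parsing to skip "is" and "des".

```python
def logical_form(sentence):
    """Map a three-word subject-verb-object sentence to verb(subject, object)."""
    # Assumption: exactly three whitespace-separated words. Any other
    # length raises ValueError during unpacking.
    subject, verb, obj = sentence.split()
    return f"{verb}({subject}, {obj})."


print(logical_form("Franz schreibt Notizen"))
```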


    Reference

    1. sentence: Pierre wrote notes

    2. logical representation: wrote(pierre, notes)

    3. real world: Pierre, Louis, Charlotte; operating systems, computational linguistics, Prolog programming

    [Figure: "referencing" arrows link the sentence and the logical representation to the real world]


    Ambiguity

    Many analyses are ambiguous, which makes language processing difficult.

    Ambiguity occurs in any layer: speech recognition, part-of-speech tagging, parsing, etc.

    Example of an ambiguous phonetic transcription:

    The boys eat the sandwiches

    That may correspond to:

    The boy seat the sandwiches; the boy seat this and which is; the buoy seat the sand which is
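The same kind of ambiguity can be shown on written text with the spaces removed. The sketch below enumerates every way to cut a string into words of a lexicon. It works on letters rather than phonemes, and the small `LEXICON` is hypothetical, so it finds only the readings that survive in spelling.

```python
def segmentations(text, lexicon):
    """Return every way to split text into a sequence of lexicon words."""
    if not text:
        return [[]]  # one way to segment the empty string: no words
    results = []
    for end in range(1, len(text) + 1):
        prefix = text[:end]
        if prefix in lexicon:
            # Combine this first word with each segmentation of the rest.
            for rest in segmentations(text[end:], lexicon):
                results.append([prefix] + rest)
    return results


# Hypothetical toy lexicon.
LEXICON = {"the", "boy", "boys", "eat", "seat", "sandwiches"}

readings = [
    " ".join(words)
    for words in segmentations("theboyseatthesandwiches", LEXICON)
]
for reading in readings:
    print(reading)
```

With this lexicon, the input has two readings, "the boy seat the sandwiches" and "the boys eat the sandwiches". A later layer, such as syntax, would have to choose between them.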


    Models and Tools

    Linguistics has produced an impressive set of theories and models.

    Language processing requires significant resources.

    Models and tools have matured. Resources are available.

    Tools notably include finite-state automata, regular expressions, rewriting rules, logic, statistics, and machine learning.
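Two of the tools listed above, finite-state automata and regular expressions, describe the same sets of strings. As a small illustration, the sketch below hand-codes an automaton for the made-up pattern `ab*c` and compares it with Python's `re` module. The transition table and the pattern are assumptions chosen for the example.

```python
import re

# Hypothetical automaton for the pattern ab*c:
# state 0 --a--> state 1, state 1 --b--> state 1, state 1 --c--> state 2.
TRANSITIONS = {
    (0, "a"): 1,
    (1, "b"): 1,
    (1, "c"): 2,
}
START = 0
FINAL = {2}


def accepts(word):
    """Run the automaton on word and report whether it ends in a final state."""
    state = START
    for symbol in word:
        state = TRANSITIONS.get((state, symbol))
        if state is None:  # no transition defined: reject
            return False
    return state in FINAL


PATTERN = re.compile(r"ab*c")

for word in ["ac", "abbbc", "abca", "bc"]:
    print(word, accepts(word), bool(PATTERN.fullmatch(word)))
```

On each test word, the automaton and the regular expression give the same answer.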


    The Carsim System: A Text-to-Scene Converter

    Texts -> NLP engine -> XML Templates -> Java 3D animation program -> 3D Animation

    Text (translated from French):

    Vehicle B coming from my left, I am in the intersection, at low speed, about 40 km/h, when vehicle B hits my vehicle and denies me the right of way from the right. The first impact hits my left rear wing,

    Template:

    // Static Objects
    STATIC [
      ROAD
      TREE
    ]
    // Dynamic Objects
    DYNAMIC [
      VEHICLE [
        ID = vehicule b;
        INITDIRECTION = east;


    Dialogue: The Persona Project from Microsoft Research

    A conversation with Peedy

    Turn    Utterance

    [Peedy is asleep on his perch]

    User: Good morning, Peedy.

    [Peedy rouses]

    Peedy: Good morning.

    User: Let's do a demo.

    [Peedy stands up, smiles]

    Peedy: Your wish is my command, what would you like to hear?

    User: What have you got by Bonnie Raitt?

    [Peedy waves in a stream of notes, and grabs one as they rush by.]

    Peedy: I have "The Bonnie Raitt Collection" from 1990.

    User: Pick something from that.

    Peedy: How about "Angel from Montgomery"?


    User: Sounds good.

    [Peedy drops note on pile]

    Peedy: OK.

    User: Play some rock after that.

    [Peedy scans the notes again, selects one]

    Peedy: How about "Fools in love"?

    User: Who wrote that?

    [Peedy cups one wing to his ear]

    Peedy: Huh?

    User: Who wrote that?

    [Peedy looks up, scrunches his brow]

    Peedy: Joe Jackson

    User: Fine.

    [Drops note on pile]


    Persona System Architecture

    Source: http://research.microsoft.com/research/pubs/view.aspx?pubid=439


    Research Relevance

    Large companies like Microsoft, Google, Yahoo, IBM, or Xerox conduct research in natural language processing.

    The 7th European framework program (2007-2013) names six technology pillars in information technologies. Two of them are related to language processing:

    Knowledge, cognitive and learning systems: semantic systems; capturing and exploiting knowledge embedded in web and multimedia content; bio-inspired artificial systems that perceive, understand, learn and evolve, and act autonomously; learning by convivial machines and humans based on a better understanding of human cognition.

    Simulation, visualization, interaction and mixed realities: tools for innovative design and creativity in products, services and digital media, and for natural, language-enabled and context-rich interaction and communication.
