Automatic Construction of Semantic Hierarchies

Rion L. Snow
rionsnow@fairisaac

Fair Isaac
5935 Cornerstone Court
San Diego, CA 92121

AQUAINT Phase I Biannual Workshop
San Diego, CA
9 – 12 June 2003

Outline / Summary

• Notation and Terminology

• Similarity of Meaning and Usage

• Automatic Polysemy Discovery

• Constructing Semantic Hierarchies

• Applications to Query Expansion and Sentence Meaning Comparison

Language Representation in the Model

The Token Lexicon

president george bush visited san jose last weekend.

Each word activates a fixed token of neurons on the input region.

Our experiments typically use a nine-region network; this network advances one word at a time over the input text.

[Figure labels: president, george, bush, visited]

“President George Bush visited San Jose last weekend.”

Incoming text is mapped into the universal token language by means of the token lexicon.

Our experiments use a lexicon of size 100,000, representing the 30,000 most frequent words and 70,000 selected phrases.
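The lexicon mapping described on this slide can be illustrated with a minimal sketch. It assumes a greedy longest-match lookup, which the slides do not specify; the toy lexicon, its integer ids, and the function name `to_tokens` are hypothetical stand-ins for the 100,000-entry lexicon.

```python
import re

# Hypothetical toy lexicon (the real one has 100,000 entries); ids are arbitrary.
LEXICON = {
    "president": 0, "george": 1, "bush": 2, "visited": 3,
    "san jose": 4, "last": 5, "weekend": 6,
}
MAX_PHRASE_WORDS = max(len(entry.split()) for entry in LEXICON)

def to_tokens(text):
    """Map raw text to (entry, token id) pairs, preferring the longest phrase.

    Assumption: greedy longest-match; words outside the lexicon get id None.
    """
    words = re.findall(r"[a-z0-9&']+", text.lower())
    tokens, i = [], 0
    while i < len(words):
        for n in range(min(MAX_PHRASE_WORDS, len(words) - i), 0, -1):
            entry = " ".join(words[i:i + n])
            if entry in LEXICON:
                tokens.append((entry, LEXICON[entry]))
                i += n
                break
        else:
            tokens.append((words[i], None))  # out-of-lexicon word
            i += 1
    return tokens

tokens = to_tokens("President George Bush visited San Jose last weekend.")
```

In this sketch "san jose" is consumed as one phrase entry rather than two words, which is the role the 70,000 selected phrases would play.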

Unsupervised Language Training

[Figure: an antecedent support network in cerebral cortex, with a source region and a target region]

Cortical antecedent support fascicles are trained between each pair of regions. The connection strength between a pair of neurons is a function of those neurons’ occurrence and co-occurrence probabilities.

In our experiments we train a maximum of four fascicles forward and backward from each region, for a total of 52 possible fascicles.
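The figure of 52 can be checked with a short count. This sketch assumes that "four fascicles forward and backward" means each region links to the regions up to four positions away in either direction, with one directed fascicle per ordered pair, in the nine-region network mentioned earlier; the function name is hypothetical.

```python
def count_fascicles(regions=9, max_span=4):
    """Count directed (source, target) region pairs at most max_span apart."""
    return sum(
        1
        for source in range(regions)
        for target in range(regions)
        if source != target and abs(source - target) <= max_span
    )

total = count_fascicles()  # 2 * (8 + 7 + 6 + 5) directed pairs
```

Under these assumptions the middle region has eight fascicles leaving it, while each end region has only four, which is why the total is 52 rather than 9 × 8 = 72.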

For training we use an untagged newswire corpus of 1.4 billion words and 75 million sentences (which includes the AQUAINT newswire corpus).

president george bush visited san jose last weekend.

$$ S_{ij} = \frac{\Pr(i, j)}{\Pr(j)} $$
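As a rough illustration of training one fascicle, the sketch below estimates a link strength from counts. The slide states only that the strength is a function of occurrence and co-occurrence probabilities; taking it to be Pr(i, j) / Pr(j), with j the source token and i the target token a fixed number of positions later, is an assumption, as are the function name and the toy sentences.

```python
from collections import Counter

def train_link(sentences, offset=1):
    """Estimate S[(i, j)] = Pr(i, j) / Pr(j) for one source/target offset.

    j is the token at position t (source), i the token at t + offset (target).
    """
    pair_counts, source_counts = Counter(), Counter()
    total = 0
    for sentence in sentences:
        for t in range(len(sentence) - offset):
            j, i = sentence[t], sentence[t + offset]
            pair_counts[(i, j)] += 1
            source_counts[j] += 1
            total += 1
    return {
        (i, j): (count / total) / (source_counts[j] / total)
        for (i, j), count in pair_counts.items()
    }

# Hypothetical toy corpus, already mapped to lexicon entries.
corpus = [
    ["george", "bush", "visited"],
    ["george", "bush", "spoke"],
    ["george", "washington", "visited"],
]
strengths = train_link(corpus, offset=1)
```

Here "bush" follows "george" in two of the three cases where "george" is a source, so that link comes out stronger than the "washington" link.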

Similarity of Meaning and Usage

brazil: colombia, guatemala, nicaragua, bolivia, ecuador, mexico, el salvador, honduras, venezuela, costa rica, panama

[Figure labels: brazil, venezuela, ecuador]

Word Families: The Emergence of Abstraction

A simple automated process produced over 400,000 families for a lexicon of 30,000 words and 70,000 multi-word ‘phrases’. Each family is a subset of the synonymy set of a word. Families are like word senses, but more abstract and more useful. For example, word family matching between sentences can be used to evaluate their similarity of meaning.

Example families:

• across, into, along, near, around, through, onto, toward, off, inside, down, over, from, on, in, out

• newspaper, new york times, washington post, magazine, wall street journal, journal, daily, times, post, newspapers, daily news, associated press, paper, news, weekly

• passion, fascination, enthusiasm, desire, appetite, penchant, obsession, love, fondness, affection, sense

• red, blue, pink, gray, green, black, yellow, white

• talks, accord, agreement, negotiations, deal, peace, process, plan, peace talks, efforts

• comply with, abide by, compliance with, accordance with, violation of, line with, complying with

Automatic Polysemy Discovery

Plants

• “living plants”: plants, animals, birds, species, fish, dogs, animal, dog, cats, humans, insects

• “industrial plants”: plants, plant, facilities, facility, reactors, reactor, factories, factory, systems, equipment, system, pipeline, nuclear, station

Foot

• “body part”: foot, knee, ankle, shoulder, wrist, elbow, leg

• “unit of measurement”: foot, feet, ## feet, miles, inches, meters, ## miles, mile, kilometers, ## inches, ## meters, inch, yards, km

Win

• “verb: to win”: win, to win, winning, won, after winning, who won, wins

• “noun: a win”: win, victory, game, victory over, loss, games, season, win over, opener, series

• “common verbs”: win, get, see, do, play, go, take, make, hear, find
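Once families exist, reading polysemy off them is a simple lookup: a word that belongs to more than one family has more than one discovered sense. The sketch below shows only that lookup, not the discovery process itself, which the slides do not describe. The member lists are abridged from this slide, and the function name is hypothetical.

```python
from collections import defaultdict

# Families abridged from the slide; labels are the slide's own glosses.
FAMILIES = {
    "living plants": ["plants", "animals", "birds", "species", "fish"],
    "industrial plants": ["plants", "plant", "facilities", "reactors", "factories"],
    "body part": ["foot", "knee", "ankle", "shoulder", "wrist", "elbow", "leg"],
    "unit of measurement": ["foot", "feet", "miles", "inches", "meters"],
    "verb: to win": ["win", "to win", "winning", "won", "wins"],
    "noun: a win": ["win", "victory", "game", "loss", "series"],
    "common verbs": ["win", "get", "see", "do", "play", "go"],
}

def senses_by_word(families):
    """Invert family -> members into word -> list of family labels."""
    index = defaultdict(list)
    for label, members in families.items():
        for word in members:
            index[word].append(label)
    return index

index = senses_by_word(FAMILIES)
# Words appearing in two or more families are treated as polysemous.
polysemous = {word: labels for word, labels in index.items() if len(labels) > 1}
```

With these abridged lists, only "plants", "foot", and "win" land in more than one family.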

Multi-Scale Semantic Similarity

“Companies” Super-Family

• merrill lynch, morgan stanley, salomon brothers, lehman brothers, goldman sachs, smith barney, goldman, j.p. morgan, bear stearns

• wells fargo, bank, bank of america, first interstate, security pacific, chase manhattan, fargo

• corp., co., inc., ltd., plc, company, group, unit

• intel, ibm, microsoft, compaq, hewlett-packard, apple, oracle, motorola

• at&t, bellsouth, gte, mci, bell atlantic, nynex, sprint, sbc, ameritech, bell

Family heads joined in the super-family: intel, at&t, wells fargo, merrill lynch, corp.

Extrapolating this Process Yields…

The Semantic Structure of the English Language

Semantic Analysis with Word Families

“He was named executive vice president following the annual meeting.”

“Who served as vice president in the wake of the meeting?”

Word families matched in the two sentences:

• he, i, we, she, you, people, it, who

• was named, will become, was appointed, became, was elected, serves as, he became, served as, is now

• executive vice president, vice president, senior vice president, chief executive, vice chairman, chief executive officer, president, chairman, director, chief operating officer

• following, during, shortly after, just before, prior to, after, soon after, shortly before, before, days before, hours after, on the eve of, days after, in the wake of

• the, their, his, its, her, our, my

• annual meeting, convention, meeting, conference, summit, annual convention, annual conference
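The comparison on this slide can be sketched as follows: replace each word or phrase with the family it belongs to, then measure how many families the two sentences share. The scoring rule (Jaccard overlap of family sets) is an assumption, since the slides do not state one. The family labels and function names are hypothetical, the member lists are abridged from this slide, and the sentences are assumed to be already segmented into lexicon entries.

```python
# Families abridged from the slide; the labels are hypothetical names.
FAMILIES = {
    "pronoun": {"he", "i", "we", "she", "you", "people", "it", "who"},
    "determiner": {"the", "their", "his", "its", "her", "our", "my"},
    "appointment": {"was named", "was appointed", "was elected", "served as", "is now"},
    "title": {"executive vice president", "vice president", "chief executive", "chairman"},
    "time": {"following", "during", "after", "prior to", "in the wake of"},
    "meeting": {"annual meeting", "meeting", "convention", "conference", "summit"},
}
FAMILY_OF = {member: label for label, members in FAMILIES.items() for member in members}

def to_families(segments):
    """Replace each segment by its family label; unknown segments stand for themselves."""
    return [FAMILY_OF.get(segment, "word:" + segment) for segment in segments]

def family_match(a, b):
    """Jaccard overlap of the family sets of two segmented sentences (assumed score)."""
    fa, fb = set(to_families(a)), set(to_families(b))
    union = fa | fb
    return len(fa & fb) / len(union) if union else 0.0

statement = ["he", "was named", "executive vice president", "following", "the", "annual meeting"]
question = ["who", "served as", "vice president", "in the wake of", "the", "meeting"]
score = family_match(statement, question)
```

The two sentences share only the literal word "the", yet every segment of one falls into the same family as the corresponding segment of the other, so the family-level score is maximal.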

Demo: Sentence Meaning Comparison

Conclusion

• Similarity of meaning is applied to construct powerful hypernym-type semantic hierarchies by grouping words according to similar contexts.

• Domain-specific knowledge is seamlessly integrated into the overall semantic construction.

• This method may be directly applied to foreign languages, as well as to other information modalities such as sound and vision.

• We plan on building semantic hierarchies for Chinese, Arabic, and Spanish soon.
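The idea of grouping words according to similar contexts can be illustrated with a standard distributional-similarity sketch. This is a swapped-in textbook technique (context count vectors compared by cosine), not the antecedent support network described in these slides; the toy corpus and all names are hypothetical.

```python
import math
from collections import Counter, defaultdict

def context_vectors(sentences, window=1):
    """Count (relative position, neighbor) contexts for every word."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        for t, word in enumerate(sentence):
            lo, hi = max(0, t - window), min(len(sentence), t + window + 1)
            for u in range(lo, hi):
                if u != t:
                    vectors[word][(u - t, sentence[u])] += 1
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[key] for key, count in a.items() if key in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical toy corpus.
corpus = [
    ["exports", "from", "brazil", "rose"],
    ["exports", "from", "venezuela", "rose"],
    ["the", "president", "visited", "brazil", "yesterday"],
    ["the", "bank", "cut", "rates"],
]
vectors = context_vectors(corpus)
similar = cosine(vectors["brazil"], vectors["venezuela"])
unrelated = cosine(vectors["brazil"], vectors["bank"])
```

Words that occur in the same surrounding contexts, such as "brazil" and "venezuela" here, score higher than words that do not, and grouping by such scores is one simple way to form candidate families.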

The Team

Other Team Members

Dr. Robert Hecht-Nielsen - Project Leader

Dr. Robert Means - Chief Technologist

Kate Mark - Project Coordinator

David Busby - Chief Brain Software Architect

Dr. Syrus Nemat-Nasser - Scientist

Dr. Shailesh Kumar - Scientist

Adrian Fan - Researcher

Research Sponsors

Fair Isaac

ARDA