Research Introspection “ICML does ICML”

Post on 09-Jan-2016



Research Introspection “ICML does ICML”

Andrew McCallum

Computer Science Department

University of Massachusetts Amherst

Relational Modeling of the Research Literature & other Entities

• Better understand structure of our own research area.
• Tools to help us learn a new sub-field.
• Aid collaboration.
• Map how ideas travel through social networks of researchers.
• Aids for hiring and finding reviewers!
• Many opportunities for rich relational learning
• ... in a domain we understand well.

Previous Systems

[Figure: entity-relation diagram with the labels Research Paper and Cites.]

More Entities and Relations

[Figure: entity-relation diagram with the labels Previous Systems, Research Paper, Cites, Person, University, Venue, Grant, Groups, Expertise.]

Rexa System Overview

[Figure: pipeline diagram. The stages below are listed with the implementation notes shown beside them. Other labels in the diagram: WWW, NSF grant DB.]

• Spider Web for PDFs: home-grown Java + MySQL (~1m PDF/day)
• Convert to text (with layout & format): enhanced ps2text (better word stitching, plus layout in XML)
• Extract metadata (title, authors, abstract, venue, citations; 14 fields in total): Conditional Random Fields (99% word accuracy)
• Reference resolution (of papers, authors & grants): discriminatively trained graph partitioning (competition-winning accuracy)
• Topic Analysis & other Data Mining
• Browsable Web Interface

From Text to Actionable Knowledge

[Figure: pipeline diagram. Labels: Spider; Filter; Document collection; IE; Segment, Classify, Associate, Cluster; Database; Data Mining; Discover patterns (entity types, links / relations, events); Prediction, Outlier detection, Decision support; Actionable knowledge.]

[Second build of the same figure adds the labels Uncertainty Info, Emerging Patterns, and Joint Inference.]

[Third build of the same figure replaces Database with Probabilistic Model.]

Conditional Random Fields [Lafferty, McCallum, Pereira]

Conditional PRMs [Koller…], [Jensen…], [Getoor…], [Domingos…]

Discriminatively-trained undirected graphical models

Complex Inference and Learning

Just what we researchers like to sink our teeth into!

Unified Model

Information Extraction

Markov dependencies

...and long-range & KB dependencies?

IE from Research Papers [McCallum et al ‘99]

@article{ kaelbling96reinforcement,
  author = "Leslie Pack Kaelbling and Michael L. Littman and Andrew P. Moore",
  title = "Reinforcement Learning: A Survey",
  journal = "Journal of Artificial Intelligence Research",
  volume = "4",
  pages = "237-285",
  year = "1996",

(Linear Chain) Conditional Random Fields

Undirected graphical model, trained to maximize conditional probability of output sequence given input sequence.

[Figure: the model drawn both as a finite state model and as a graphical model. FSM states y_{t-1}, y_t, y_{t+1}, y_{t+2}, y_{t+3}, ... (output seq) are paired with observations x_{t-1}, x_t, x_{t+1}, x_{t+2}, x_{t+3}, ... (input seq).]

input seq:   said   Jones   a      Microsoft  VP     …
output seq:  OTHER  PERSON  OTHER  ORG        TITLE  …

Wide-spread interest, positive experimental results in many applications:

• Noun phrase, Named entity [HLT’03], [CoNLL’03]
• Protein structure prediction [ICML’04]
• IE from Bioinformatics text [Bioinformatics ‘04], …
• Asian word segmentation [COLING’04], [ACL’04]
• IE from Research papers [HLT’04]
• Object classification in images [CVPR ‘04]

[Lafferty, McCallum, Pereira 2001]

$$p(y \mid x) = \frac{1}{Z_x} \prod_t \Phi(y_t, y_{t-1}, x, t)$$

where

$$\Phi(y_t, y_{t-1}, x, t) = \exp\left( \sum_k \lambda_k f_k(y_t, y_{t-1}, x, t) \right)$$
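The linear-chain CRF distribution p(y | x) defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the feature functions, the weights in `WEIGHTS`, and the `START` label are made up for the example and are not taken from the talk. The normalizer Z_x is computed with the standard forward recursion in log space.

```python
import itertools
import math

LABELS = ["OTHER", "PERSON", "ORG", "TITLE"]

# Hypothetical weights lambda_k, chosen by hand for illustration only.
WEIGHTS = {
    ("capitalized", "PERSON"): 1.5,
    ("capitalized", "ORG"): 1.2,
    ("capitalized", "TITLE"): 0.8,
    ("lowercase", "OTHER"): 2.0,
    ("transition", "OTHER", "PERSON"): 0.5,
    ("transition", "ORG", "TITLE"): 1.0,
}


def feature_values(y_t, y_prev, x, t):
    """Hypothetical feature functions f_k(y_t, y_{t-1}, x, t)."""
    word = x[t]
    return {
        ("capitalized", y_t): 1.0 if word[0].isupper() else 0.0,
        ("lowercase", y_t): 1.0 if word.islower() else 0.0,
        ("transition", y_prev, y_t): 1.0,
    }


def log_phi(y_t, y_prev, x, t):
    """log Phi = sum_k lambda_k * f_k(y_t, y_{t-1}, x, t)."""
    feats = feature_values(y_t, y_prev, x, t)
    return sum(WEIGHTS.get(name, 0.0) * value for name, value in feats.items())


def logsumexp(values):
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_partition(x):
    """log Z_x via the forward recursion over label sequences."""
    alpha = {y: log_phi(y, "START", x, 0) for y in LABELS}
    for t in range(1, len(x)):
        alpha = {
            y: logsumexp([alpha[y_prev] + log_phi(y, y_prev, x, t) for y_prev in LABELS])
            for y in LABELS
        }
    return logsumexp(list(alpha.values()))


def sequence_log_score(y, x):
    """Sum of log Phi over positions, i.e. the log of the unnormalized product."""
    total, y_prev = 0.0, "START"
    for t, y_t in enumerate(y):
        total += log_phi(y_t, y_prev, x, t)
        y_prev = y_t
    return total


def conditional_probability(y, x):
    """p(y | x) = exp(score(y, x)) / Z_x."""
    return math.exp(sequence_log_score(y, x) - log_partition(x))


def brute_force_log_partition(x):
    """Enumerate every label sequence; only feasible for short inputs."""
    scores = [sequence_log_score(list(y), x) for y in itertools.product(LABELS, repeat=len(x))]
    return logsumexp(scores)


x = ["said", "Jones", "a", "Microsoft", "VP"]
y = ["OTHER", "PERSON", "OTHER", "ORG", "TITLE"]
print(conditional_probability(y, x))
```

The forward recursion and the brute-force enumeration should agree on log Z_x for short inputs, which is a convenient sanity check. In a trained model the weights would be fit to maximize the conditional likelihood of labeled sequences rather than set by hand.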

Entity Resolution

Joint inference among all pairwise coref

...models of entities, attributes, first-order...

Joint Co-reference Decisions, Discriminative Model

[Culotta & McCallum 2005]

[Figure: three People mentions, "Stuart Russell", "Stuart Russell", and "S. Russel", with a Y/N co-reference decision for each pair of mentions.]

Co-reference for Multiple Entity Types

[Culotta & McCallum 2005]

[Figure: People mentions "Stuart Russell", "Stuart Russell", "S. Russel" and Organizations mentions "University of California at Berkeley", "Berkeley", "Berkeley", with six Y/N co-reference decisions between pairs of mentions.]

Joint Co-reference of Multiple Entity Types

[Culotta & McCallum 2005]

[Figure: the same People and Organizations mentions ("Stuart Russell", "Stuart Russell", "S. Russel"; "University of California at Berkeley", "Berkeley", "Berkeley"), with six Y/N co-reference decisions.]

Reduces error by 22%
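As a rough illustration of turning pairwise Y/N co-reference decisions into a partition of mentions, the sketch below greedily merges clusters while the summed pairwise score is positive. It is a stand-in, not the model from the talk: the pairwise score here is a hand-written string similarity on the last token (an assumption), whereas the slides describe a discriminatively trained model, and this sketch does not couple decisions across entity types. `THRESHOLD`, `pair_score`, and `partition` are hypothetical names.

```python
from difflib import SequenceMatcher
from itertools import combinations

THRESHOLD = 0.8  # assumed boundary between "Y" (positive score) and "N" (negative)

MENTIONS = [
    "Stuart Russell",
    "Stuart Russell",
    "S. Russel",
    "University of California at Berkeley",
    "Berkeley",
    "Berkeley",
]


def pair_score(a, b):
    """Hypothetical pairwise score: similarity of last tokens minus a threshold.

    A trained pairwise classifier would take the place of this function.
    """
    last_a = a.lower().replace(".", "").split()[-1]
    last_b = b.lower().replace(".", "").split()[-1]
    return SequenceMatcher(None, last_a, last_b).ratio() - THRESHOLD


def partition(mentions):
    """Greedy agglomerative partitioning of the mention graph.

    Repeatedly merge the two clusters whose cross-cluster pairwise scores
    have the largest positive sum; stop when no merge has positive gain.
    """
    clusters = [[i] for i in range(len(mentions))]
    while len(clusters) > 1:
        best_gain, best_pair = 0.0, None
        for (i, ci), (j, cj) in combinations(enumerate(clusters), 2):
            gain = sum(pair_score(mentions[a], mentions[b]) for a in ci for b in cj)
            if gain > best_gain:
                best_gain, best_pair = gain, (i, j)
        if best_pair is None:
            break
        i, j = best_pair  # i < j, so deleting j leaves index i valid
        clusters[i] = sorted(clusters[i] + clusters[j])
        del clusters[j]
    return sorted(clusters)


for cluster in partition(MENTIONS):
    print([MENTIONS[i] for i in cluster])
```

Scoring whole clusters rather than thresholding each pair independently is what lets a weak pair decision be overruled by the other mentions in the same cluster.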

Structured Topic Models

Discovering latent structure in jointly modeling words, time, relations...

Topical N-gram Model

[Figure: graphical model with variables z1, z2, z3, z4, ...; w1, w2, w3, w4, ...; and y1, y2, y3, y4, ...; plates labeled T, D, W, and TW.]

[Wang, McCallum 2005]

Finding Topics with TNG

Traditional unigram LDA run on 1.6 million titles / abstracts (200 topics)

...select ~300k papers on ML, NLP, robotics, vision...

Find 200 TNG topics among those papers.
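For readers new to the first step, here is a minimal sketch of traditional unigram LDA fit by collapsed Gibbs sampling. It illustrates plain LDA only, not the Topical N-gram model, and it is not the implementation used for the numbers above. The function name `lda_gibbs`, the hyperparameter values, and the toy documents are all assumptions made for the example.

```python
import random
from collections import defaultdict


def lda_gibbs(docs, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for unigram LDA on tokenized documents."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})

    doc_topic = [[0] * num_topics for _ in docs]            # n(d, k)
    topic_word = [defaultdict(int) for _ in range(num_topics)]  # n(k, w)
    topic_total = [0] * num_topics                          # n(k)
    z = []                                                  # topic of each token

    # Random initialization of topic assignments.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        z.append(z_d)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = z[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample its topic from the collapsed conditional.
                weights = [
                    (doc_topic[d][j] + alpha)
                    * (topic_word[j][w] + beta)
                    / (topic_total[j] + vocab_size * beta)
                    for j in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                z[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    return z, doc_topic, topic_word


# Toy titles, invented for illustration.
docs = [
    "conditional random fields sequence labeling".split(),
    "topic models latent dirichlet allocation".split(),
    "random fields inference labeling".split(),
    "latent topic models text".split(),
]
z, doc_topic, topic_word = lda_gibbs(docs, num_topics=2, iters=50)
print(doc_topic)
```

Each row of `doc_topic` counts how many tokens of a document are assigned to each topic; normalizing a row (after adding `alpha`) gives that document's estimated topic mixture.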

Topical Transfer

Citation counts from one topic to another.

Map “producers and consumers”
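Counting citations from one topic to another can be sketched as below. The simplifying assumption here, which is not stated in the talk, is that each paper has a single dominant topic; the paper IDs, topic names, and citation pairs are invented, and `topical_transfer` and `producers_and_consumers` are hypothetical helper names.

```python
from collections import Counter

# Hypothetical data: one dominant topic per paper, and (citing, cited) pairs.
PAPER_TOPIC = {"p1": "boosting", "p2": "boosting", "p3": "nlp", "p4": "vision"}
CITATIONS = [("p3", "p1"), ("p3", "p2"), ("p4", "p1"), ("p2", "p1"), ("p1", "p3")]


def topical_transfer(paper_topic, citations):
    """Count citations keyed by (citing topic, cited topic)."""
    counts = Counter()
    for citing, cited in citations:
        counts[(paper_topic[citing], paper_topic[cited])] += 1
    return counts


def producers_and_consumers(counts):
    """Cross-topic citations received (produced) and made (consumed) per topic."""
    produced, consumed = Counter(), Counter()
    for (citing_topic, cited_topic), n in counts.items():
        if citing_topic != cited_topic:  # ignore within-topic citations
            produced[cited_topic] += n
            consumed[citing_topic] += n
    return produced, consumed


counts = topical_transfer(PAPER_TOPIC, CITATIONS)
produced, consumed = producers_and_consumers(counts)
print(produced.most_common())
print(consumed.most_common())
```

With soft topic assignments, each citation could instead contribute fractional counts in proportion to the topic mixtures of the citing and cited papers.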

Trends in 17 years of NIPS proceedings

Topic Distributions Conditioned on Time

[Figure: topic distributions over time. Horizontal axis: time; topic mass shown as vertical height.]

Topical Transfer Through Time

• Can we predict which research topics will be “hot” at ICML next year?
• ...based on
  – the hot topics in “neighboring” venues last year
  – learned “neighborhood” distances for venue pairs

How do Ideas Progress Through Social Networks?

Hypothetical Example:

[Figure, repeated over three builds: the idea “AdaBoost” and the venues COLT, ICML, ACL (NLP), ICCV (Vision), and SIGIR (Info. Retrieval).]

Preliminary Results

Transfer Model with Ridge Regression is a good predictor.

[Figure: mean squared prediction error (smaller is better) versus number of venues used for prediction; one curve is labeled Transfer Model.]
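A minimal sketch of ridge regression in this setting follows, assuming each training row holds one topic's proportion at a few neighboring venues last year and the target is that topic's proportion at the venue of interest this year. The data, the feature layout, and the names `ridge_fit` and `predict` are illustrative assumptions, not the setup behind the results above. The weights solve (XᵀX + λI) w = Xᵀy.

```python
def ridge_fit(X, y, lam):
    """Solve (X^T X + lam * I) w = X^T y by Gauss-Jordan elimination."""
    d = len(X[0])
    # Build the augmented matrix [X^T X + lam * I | X^T y].
    M = []
    for i in range(d):
        row = [
            sum(r[i] * r[j] for r in X) + (lam if i == j else 0.0)
            for j in range(d)
        ]
        row.append(sum(r[i] * t for r, t in zip(X, y)))
        M.append(row)
    for col in range(d):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, d), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(d):
            if r != col:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [M[i][d] for i in range(d)]


def predict(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


# Hypothetical rows: a topic's proportion at two neighboring venues last year.
X = [[0.1, 0.2], [0.3, 0.1], [0.2, 0.4], [0.4, 0.3]]
# Hypothetical targets: the same topic's proportion at the target venue this year.
y = [0.5 * a + 0.25 * b for a, b in X]

w = ridge_fit(X, y, lam=0.01)
print(w, predict(w, [0.25, 0.25]))
```

Larger values of `lam` shrink the weights toward zero, which trades a little bias for lower variance when there are many venues and few years of data.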

Other Relational Opportunities

• Categorizing citations.
• Map transfer of ideas through science.
• Rank CS departments by various criteria.
• What 10 papers tell the story of ASR research?
• Predicting when a student will graduate.
• Help me find the right postdoc.
• Suggest best collaborative opportunities.
• Who should chair the next ICML?