HITIQA and Data Fusion. Tomek Strzalkowski, PI; Paul B. Kantor. June 2003 AQUAINT 18-month Meeting.
HITIQA and Data Fusion
Tomek Strzalkowski, PI
Paul B. Kantor
June 2003
AQUAINT 18-month Meeting
June 10, 2003
HITIQA Team
• SUNY Albany:
– Prof. Tomek Strzalkowski, PI/PM
– Prof. Rong Tang
– Prof. Boris Yamrom, consultant
– Ms. Sharon Small, Research Scientist
– Mr. Sean Ryan, Research Assistant
– Mr. Ting Liu, Graduate Student
– Mr. Nobuyuki Shimizu, Graduate Student
– Mr. Zhenyu Dai, Graduate Student
– Mr. Tom Palen, summer intern
– Mr. Peter LaMonica, summer intern/AFRL
– Mr. Pei Zhu, Graduate Student, CUNY Lehman
• Rutgers:
– Prof. Paul Kantor, co-PI
– Prof. K.B. Ng
– Prof. Nina Wacholder
– Mr. Robert Rittman, Graduate Student
– Ms. Ying Sun, Graduate Student
– Mr. Peng Song, Graduate Student
– Mr. Bing Bai, Graduate Student
Rutgers: Two Components
• Automate the estimation of qualitative aspects
– At the interface
– To support customization to the individual analyst
• Leverage the power of multiple information retrieval systems via data fusion techniques
– Behind the scenes
– To improve the quality of the clusters/subclusters used for frame building and for dialog aimed at improving the system's understanding of the analyst's goals
The two faces of fusion
• Fusion of evidence -- improves the retrieved set supporting dialog with the analyst
• Fusion of answer information -- preliminary demo tomorrow afternoon
Our Approach to Retrieval Fusion
[Diagram: the request goes to three retrieval systems (SMART, InQuery, and the KL language model), each of which searches the document sets and returns a result set; the fusion process combines the three result sets into the delivered set.]
Progress -- Fusion
• Systems
– 11 Aspects of Lemur
• Redundancies
• Errors in the Lemur code
• Language model appears to work best
– SMART
• "State of the art"
• Already in HITIQA
Fusion
• Additions to HITIQA
– InQuery system – more powerful than SMART
– Available in a research version
• Uses bag-of-words OR
– More powerful query structures [not yet implemented]
• Fusion of SMART and InQuery
– Linear classifier models used
– Coefficients optimized on OTHER TOPICS
Background on the Fusion Problem
• There are systems S, T, U, …
• There are problems to be solved P, Q, R, …
• This defines several fusion problems. Local fusion: for a given problem P and a pair of systems S, T, what is the best fusion rule?
• Let s(d), t(d) be the scores assigned to document d by systems S and T. Fusion tries to find the "best" combining function f(s, t).
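A minimal sketch of this setup: one common family of combining functions f(s, t) is a weighted linear combination of normalized scores over the union of both systems' retrieved documents. The function names, weights, and scores below are illustrative assumptions, not the deck's actual rule:

```python
def minmax(scores):
    """Normalize a {doc_id: score} map to the [0, 1] range."""
    lo, hi = min(scores.values()), max(scores.values())
    span = (hi - lo) or 1.0
    return {d: (v - lo) / span for d, v in scores.items()}

def linear_fusion(s, t, a=0.5, b=0.5):
    """f(s, t) = a*s(d) + b*t(d) over the union of retrieved docs.

    A document missing from one system's list gets score 0 there.
    """
    s, t = minmax(s), minmax(t)
    docs = set(s) | set(t)
    return {d: a * s.get(d, 0.0) + b * t.get(d, 0.0) for d in docs}

# Hypothetical raw scores from systems S and T.
s = {"d1": 2.0, "d2": 1.0, "d3": 0.5}
t = {"d2": 9.0, "d4": 3.0}
fused = linear_fusion(s, t)
ranking = sorted(fused, key=fused.get, reverse=True)
```

Here `d2`, which both systems retrieve, rises to the top of the fused ranking even though neither system ranked it first.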
What does “best f(s,t)” mean?
• A. Effective in locating relevant documents
– p_ave(fusion) is a better precision score
– Means more of the relevant documents are near the top of the list
• Does not penalize duplicates
• HITIQA deals with these by clustering
• B. Local versus global
Local Fusion Rule
• A local fusion rule f_P(s,t) depends on the specific problem P.
– This is relevant if P represents a static problem or profile, which will be considered on many occasions.
• A global fusion rule f(s,t) does not depend on a specific problem P,
– and can be safely used on a variety of problems.
Global is Harder
• ….. Of course
• How to measure it:
– For a set of topics
– Approach them with an established fusion rule
– Ask whether (that is, for how many of them):
• Does the fusion rule beat the system that does best for this particular topic? [Called "beating the oracle" -- Ng]
• Does the fusion rule beat the average of the systems?
• Does the fusion rule beat the "generally better" system?
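The three comparisons above can be tallied mechanically. The sketch below (the per-topic precision tuples are made-up illustrations, not the deck's data) counts, over a set of topics, how often fusion beats the per-topic oracle, the per-topic average, and the globally better single system:

```python
def compare_to_baselines(per_topic):
    """per_topic: list of (p_smart, p_inquery, p_fusion) precision
    scores, one tuple per topic.  Returns counts of topics where
    fusion beats the per-topic oracle, the per-topic average, and
    the globally better single system."""
    # The "generally better" system is the one with the higher total.
    total_s = sum(s for s, _, _ in per_topic)
    total_i = sum(i for _, i, _ in per_topic)
    better = 0 if total_s >= total_i else 1

    beats_oracle = beats_avg = beats_better = 0
    for s, i, f in per_topic:
        if f > max(s, i):        # "beating the oracle"
            beats_oracle += 1
        if f > (s + i) / 2:      # beating the average
            beats_avg += 1
        if f > (s, i)[better]:   # beating the generally better system
            beats_better += 1
    return beats_oracle, beats_avg, beats_better

# Hypothetical per-topic precision scores (SMART, InQuery, fusion).
scores = [(0.2, 0.5, 0.6), (0.4, 0.3, 0.4), (0.1, 0.7, 0.5)]
wins = compare_to_baselines(scores)
```

Note how the oracle is the hardest baseline: it picks the best system separately for every topic, so fusion can beat the average and the better system on a topic while still losing to the oracle.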
Global results are mixed
• SMART was being used
• Added the InQuery capability in compatible mode (that is, using bag of words)
• Each system has some special features (name lists, etc.)
• However, the fused system does not beat InQuery very often on a global basis (6 of 15)
Local Fusion Results are Good
Completely rigorous. For each topic:
1) Randomly split the documents into two parts: training and testing
2) Do the logistic regression on the training part and get the fusion scores for both training and testing documents
3) Calculate P100 on the testing documents
4) Excellent results (one random sample for each)
5) Test SMART and InQuery on the same random testing set
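The split-half procedure above can be sketched as follows. This is a minimal stand-in, assuming scikit-learn and synthetic two-system scores in place of the project's actual SMART/InQuery runs; `p_at_100` counts relevant documents in the top 100 of the fused ranking:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def p_at_100(scores, labels, k=100):
    """Number of relevant documents among the top k by fused score."""
    order = np.argsort(-scores)
    return int(labels[order[:k]].sum())

def split_half_fusion(X, y, seed=0):
    """Steps 1-3 above: random split, logistic regression on the
    training half, fused scores and P100 on the testing half.
    X is an (n_docs, n_systems) score matrix; y holds 0/1 relevance."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    train, test = idx[: len(y) // 2], idx[len(y) // 2 :]
    clf = LogisticRegression().fit(X[train], y[train])
    fused = clf.predict_proba(X[test])[:, 1]  # P(relevant) as fused score
    return p_at_100(fused, y[test])

# Synthetic stand-in for two systems' scores: relevant documents
# (about 25% of 400) score systematically higher in both systems.
rng = np.random.default_rng(1)
y = (rng.random(400) < 0.25).astype(int)
X = rng.normal(size=(400, 2)) + y[:, None] * 1.5
p100 = split_half_fusion(X, y)
```

Fitting on one half and scoring the held-out half is what makes the comparison with SMART and InQuery on the same testing set fair.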
Local Fusion Results are Good

Topic    SMART  InQuery  Fusion  F>SM  F>IN  F=IN  IN>F
311         26       15      27     1     1     0     0
318          0        3       0     0     0     0     1
324         55       59      62     1     1     0     0
342          1        5       5     1     0     1     0
359          2       10      10     1     0     1     0
365          6       16      16     1     0     1     0
374         30       36      38     1     1     0     0
386          2        4       4     1     0     1     0
392         27       13      27     0     1     0     0
403         11       11      11     0     0     1     0
415         39       36      41     1     1     0     0
421          3        5       5     1     0     1     0
424         21       32      33     1     1     0     0
432          0        1       1     1     0     1     0
450         74       69      74     0     1     0     0
Overall    297      315     354    11     7     7     1
Summary of Local Fusion

       F & SM   F & IN   F & BEST
win        11        7         5
tie         4        7         9
lose        0        1         1

PROBLEM CASE
Topic  SMART  InQuery  Fusion
318        0        3       0
318        1        5       2
318        1        5       4
318        0        3       1
318        1        4       2

We ran 5 split-half runs on the odd case (318) and the results persist.
Newer Results
• Using a degree-2 polynomial as the kernel function, we got the best performance with SVM. Its overall P100 on 50 topics is 742 (logistic regression: 732). The overall P100 on the 13 "hard" [embedded relevant documents] topics is 53 (logistic regression: 49).
• The comparison over the individual systems, fusing all 3 systems:

              SMART              InQuery            KL
          wins  ties  loses  wins  ties  loses  wins  ties  loses
SVM         38     5      6    34     5     10    29     9     11
Logistic    36     7      6    31     8     10    27    11     11
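A degree-2 polynomial kernel gives the SVM conic-section level curves in the score plane, which is what lets it capture "embedded" relevant documents that no linear rule can separate. The sketch below is a synthetic illustration of that idea (scikit-learn, made-up scores), not the project's experiment: relevant documents sit in a disk in the middle of the two systems' normalized score plane, and a circle is exactly a conic:

```python
import numpy as np
from sklearn.svm import SVC

# Synthetic "embedded relevant documents": relevant docs occupy a disk
# in the middle of the (s, t) score plane, so no linear rule in that
# plane separates them, but a circle -- a conic section -- does.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(600, 2))
y = (np.hypot(X[:, 0], X[:, 1]) < 0.45).astype(int)

# Degree-2 polynomial kernel: the SVM's level curves are conics.
svm = SVC(kernel="poly", degree=2).fit(X[:400], y[:400])

# Rank held-out documents by the SVM margin, used as the fused score.
fused = svm.decision_function(X[400:])
top50_relevant = int(y[400:][np.argsort(-fused)[:50]].sum())
```

On such data a linear classifier cannot do better than guessing the majority class, while the quadratic decision surface recovers the embedded cluster.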
Is Local Sensible?
• Local fusion depends on getting information about a particular topic, and doing the best possible fusion.
• Not available in an ad hoc (e.g., Google) setting
• Potentially available in intelligence applications -- filtering; standing profiles
Making Local Fusion Work
• The system must get accurate feedback from the analyst/user on the relevance of individual items that are examined. If the analyst only sees excerpts, the judgment may be inaccurate (reflecting only the excerpt).
• The system can use feedback to update a fusion rule, “offline”.
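The offline update could look like the following sketch, which accumulates the analyst's relevance judgments and refits a logistic fusion rule on demand (the class, its fallback rule, and scikit-learn are all assumptions for illustration, not HITIQA's implementation):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

class FeedbackFusion:
    """Accumulate (per-system scores, relevance judgment) pairs from
    analyst feedback and refit the fusion rule offline on demand.
    Until judgments of both kinds exist, fall back to a plain sum."""

    def __init__(self):
        self.X, self.y = [], []
        self.clf = None

    def feedback(self, scores, relevant):
        """Record one judged document's scores and its judgment."""
        self.X.append(list(scores))
        self.y.append(int(relevant))

    def refit(self):
        """The offline update: refit on all feedback gathered so far."""
        if len(set(self.y)) == 2:  # need both classes to fit
            self.clf = LogisticRegression().fit(np.array(self.X),
                                                np.array(self.y))

    def score(self, scores):
        """Fused score for a new document's per-system scores."""
        if self.clf is None:
            return float(np.sum(scores))
        return float(self.clf.predict_proba([list(scores)])[0, 1])
```

Batching the refit (rather than updating per judgment) matches the "offline" framing above: the analyst's workflow is never blocked by model fitting.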
Inserting Local Fusion
• When the system sees (retrospectively) that the fusion rule would have worked significantly better than the better system, it can rely on the fusion rule to make future judgments.
Our Approach to Retrieval Fusion: Adaptive "Local" Fusion
[Diagram: the same pipeline as before -- the request goes to SMART, InQuery, and the KL language model, whose result sets the fusion process combines over the document sets into the delivered set -- with an added feedback loop: monitor the fusion set and receive feedback, then either ADOPT the fusion system or USE the better single system.]
Summary of Retrieval Fusion
• Fusion processes examined
– Linear forms – discriminant analysis; logistic
• Key results
– Global fusion of InQuery and SMART beats SMART 10 times of 15.
• Only beats InQuery 6 of 15
– Local (topic-dependent) fusion beats (ties) InQuery 7 (7) of 15 times
• Prospective
– Expand local (profile-dependent) fusion
– Explore more powerful methods such as:
• Quadratic classifiers
• SVM
Why is Global Fusion Hard?
• Global fusion requires that there be a single good rule that selects the region of greatest richness, for most topics, at once.
Fusion of InQuery and SMART: Topic 450
* Easy case – almost any linear rule works well. Either system works well.
Fusion of InQuery and SMART: Topic 392
* Easy case – SMART works well. InQuery works poorly.
Fusion of InQuery and SMART: Topic 432
* Another hard case – relevant documents not compactly grouped in the score space. Not many relevant documents found at all.
Fusion of InQuery and SMART: Topic 318
* Interesting case – no linear rule works well. Relevant documents embedded. Requires non-linear methods – quadratic; SVM; other.
Fusion of InQuery and SMART: Topic 421
* Really challenging case: quite a few relevant documents. Very diffuse in score space. Neither system works well. Possibly Boolean AND.
Fusion of InQuery and SMART: Topic 359
* A disaster.
Fusion of InQuery and SMART: Topic 374
* Possible Boolean AND. Neither works well alone.
Fusion of InQuery and SMART: Topic 415
* Part of the relevant material is easily found. Part is embedded.
KL Language Model and InQuery -- SVM with 2nd-degree polynomial kernel
* The level curves are conic sections. RBF gives multiple centers, but inferior performance. The X-axis is the normalized score from KL and the Y-axis is the normalized score from InQuery; the legend at the right of each plot gives the predicted score for each level curve.
Fusion
• Fusion of SMART, InQuery, and the K-L language model
– Linear classifier models used
– Coefficients optimized on OTHER TOPICS (Global)
– Coefficients optimized on SPECIFIC TOPICS (Local: good results)
Results and Prospects
• Fusion processes examined
– Linear forms – logistic analysis; local weighted regression [100NN-300NN]; SVM; kNN SVM
• Key results – fusion of all 3 systems:

              SMART              InQuery            KL                 ORACLE
          wins  ties  loses  wins  ties  loses  wins  ties  loses  wins  ties  loses
SVM         38     5      6    34     5     10    29     9     11    25     9     15
Logistic    36     7      6    31     8     10    27    11     11    19    13     17

• Prospective
– Expand local (profile-dependent) fusion
– Develop selection rules: when to use the "best single system" -- when to use fusion. Can we predict?
* Warning
Microsoft Windows has now been operating continuously for 88 minutes and 23 seconds. This is the limit allowed under your lease agreement. Windows Operating System will now crash in flames, destroying all of the content in your presentation, and humiliating you in front of your peers. For a better upgrade version contact Microsoft Sales Promotion at any computer store.
Thank you
• Questions?
Paul,

I've uploaded some of the recent data about fusion onto the HITIQA website at:
http://www.scils.rutgers.edu/~hitiqa/TEAM/psong/AquaintWorkshop200306/.

I only uploaded some key performance files because I'm not sure what data you will need in the presentation. Please let me know any further data needed.

Below is an explanation of the files I uploaded:

* result.iks.logistic: P100 for each topic with the logistic regression method
* perf.iks.logistic: Number of wins, ties and losses compared with the individual systems
* result.iks.lwr250: P100 for each topic with local weighted regression with kernel size 250
* perf.iks.lwr250: Number of wins, ties and losses compared with the individual systems
* result.iks.meta: P100 for each topic when combining the scores of logistic regression and LWR with equal weights (meta fusion)
* perf.iks.meta: Number of wins, ties and losses compared with the individual systems
* result.iks.svm: P100 for each topic with the support vector machine using the best parameter set
* perf.iks.svm: Number of wins, ties and losses compared with the individual systems

Please let me know of any questions.

Peng
------- End of forwarded message -------