HITIQA and Data Fusion. Tomek Strzalkowski, PI; Paul B. Kantor. June 2003 AQUAINT 18-month Meeting.
HITIQA and Data Fusion
Tomek Strzalkowski, PI
Paul B. Kantor
June 2003
AQUAINT 18-month Meeting
June 10, 2003
HITIQA Team
• SUNY Albany:
– Prof. Tomek Strzalkowski, PI/PM
– Prof. Rong Tang
– Prof. Boris Yamrom, consultant
– Ms. Sharon Small, Research Scientist
– Mr. Sean Ryan, Research Assistant
– Mr. Ting Liu, Graduate Student
– Mr. Nobuyuki Shimizu, Graduate Student
– Mr. Zhenyu Dai, Graduate Student
– Mr. Tom Palen, summer intern
– Mr. Peter LaMonica, summer intern/AFRL
– Mr. Pei Zhu, Graduate Student, CUNY Lehman
• Rutgers:
– Prof. Paul Kantor, co-PI
– Prof. K.B. Ng
– Prof. Nina Wacholder
– Mr. Robert Rittman, Graduate Student
– Ms. Ying Sun, Graduate Student
– Mr. Peng Song, Graduate Student
– Mr. Bing Bai, Graduate Student
Rutgers: Two Components
• Automate the estimation of qualitative aspects
– At the interface
– To support customization to the individual analyst
• Leverage the power of multiple information retrieval systems via data fusion techniques
– Behind the scenes
– To improve the quality of the clusters/subclusters used for frame building and for dialog aimed at improving the system's understanding of the analyst's goals
The two faces of fusion
• Fusion of evidence -- improves the retrieved set supporting dialog with the analyst
• Fusion of answer information -- preliminary demo tomorrow afternoon
Our Approach to Retrieval Fusion
[Diagram: the request goes to three retrieval systems (SMART, InQuery, and the KL language model), each of which searches the document sets and returns a result set; the fusion process combines the three result sets into the delivered set.]
Progress -- Fusion
• Systems
– 11 Aspects of Lemur
• Redundancies
• Errors in the Lemur code
• Language model appears to work best
– SMART
• "State of the art"
• Already in HITIQA
Fusion
• Additions to HITIQA
– InQuery system – more powerful than SMART
– Available in a research version
• Uses bag-of-words OR
– More powerful query structures [not yet implemented]
• Fusion of SMART and InQuery
– Linear classifier models used
– Coefficients optimized on OTHER TOPICS
Background on the Fusion Problem
• There are systems S, T, U, …
• There are problems to be solved P, Q, R, …
• This defines several fusion problems. Local fusion: for a given problem P and a pair of systems S, T, what is the best fusion rule?
• Let s(d), t(d) be the scores assigned to document d by systems S and T. Fusion tries to find the "best" combining function f(s, t).
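A minimal sketch of this setup: one common family of combining functions f(s, t) is a weighted linear combination of normalized scores over the union of both systems' retrieved documents. The function names, weights, and scores below are illustrative assumptions, not the deck's actual rule:

```python
def minmax(scores):
    """Normalize a {doc_id: score} map to the [0, 1] range."""
    lo, hi = min(scores.values()), max(scores.values())
    span = (hi - lo) or 1.0
    return {d: (v - lo) / span for d, v in scores.items()}

def linear_fusion(s, t, a=0.5, b=0.5):
    """f(s, t) = a*s(d) + b*t(d) over the union of retrieved docs.

    A document missing from one system's list gets score 0 there.
    """
    s, t = minmax(s), minmax(t)
    docs = set(s) | set(t)
    return {d: a * s.get(d, 0.0) + b * t.get(d, 0.0) for d in docs}

# Hypothetical raw scores from systems S and T.
s = {"d1": 2.0, "d2": 1.0, "d3": 0.5}
t = {"d2": 9.0, "d4": 3.0}
fused = linear_fusion(s, t)
ranking = sorted(fused, key=fused.get, reverse=True)
```

Here `d2`, which both systems retrieve, rises to the top of the fused ranking even though neither system ranked it first.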
What does “best f(s,t)” mean?
• A. Effective in locating relevant documents
– p_ave(fusion) is a better precision score
– Means more of the relevant documents are near the top of the list
• Does not penalize duplicates
• HITIQA deals with these by clustering
• B. Local versus global
Local Fusion Rule
• A local fusion rule f_P(s,t) depends on the specific problem P.
– This is relevant if P represents a static problem or profile, which will be considered on many occasions.
• A global fusion rule f(s,t) does not depend on a specific problem P,
– and can be safely used on a variety of problems.
Global is Harder
• ….. Of course
• How to measure it:
– For a set of topics
– Approach them with an established fusion rule
– Ask whether (that is, for how many of them):
• Does the fusion rule beat the system that does best for this particular topic? [Called "beating the oracle" -- Ng]
• Does the fusion rule beat the average of the systems?
• Does the fusion rule beat the "generally better" system?
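The three comparisons above can be tallied mechanically. The sketch below (the per-topic precision tuples are made-up illustrations, not the deck's data) counts, over a set of topics, how often fusion beats the per-topic oracle, the per-topic average, and the globally better single system:

```python
def compare_to_baselines(per_topic):
    """per_topic: list of (p_smart, p_inquery, p_fusion) precision
    scores, one tuple per topic.  Returns counts of topics where
    fusion beats the per-topic oracle, the per-topic average, and
    the globally better single system."""
    # The "generally better" system is the one with the higher total.
    total_s = sum(s for s, _, _ in per_topic)
    total_i = sum(i for _, i, _ in per_topic)
    better = 0 if total_s >= total_i else 1

    beats_oracle = beats_avg = beats_better = 0
    for s, i, f in per_topic:
        if f > max(s, i):        # "beating the oracle"
            beats_oracle += 1
        if f > (s + i) / 2:      # beating the average
            beats_avg += 1
        if f > (s, i)[better]:   # beating the generally better system
            beats_better += 1
    return beats_oracle, beats_avg, beats_better

# Hypothetical per-topic precision scores (SMART, InQuery, fusion).
scores = [(0.2, 0.5, 0.6), (0.4, 0.3, 0.4), (0.1, 0.7, 0.5)]
wins = compare_to_baselines(scores)
```

Note how the oracle is the hardest baseline: it picks the best system separately for every topic, so fusion can beat the average and the better system on a topic while still losing to the oracle.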
Global results are mixed
• SMART was being used
• Added the InQuery capability in compatible mode (that is, using bag of words)
• Each system has some special features (name lists, etc.)
• However, the fused system does not beat InQuery very often on a global basis (6 of 15)
Local Fusion Results are Good
Completely rigorous. For each topic:
1) Randomly split the documents into two parts: training and testing
2) Do the logistic regression on the training part and get the fusion scores for both training and testing documents
3) Calculate P100 on the testing documents
4) Excellent results (one random sample for each)
5) Test SMART and InQuery on the same random testing set
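The split-half procedure above can be sketched as follows. This is a minimal stand-in, assuming scikit-learn and synthetic two-system scores in place of the project's actual SMART/InQuery runs; `p_at_100` counts relevant documents in the top 100 of the fused ranking:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def p_at_100(scores, labels, k=100):
    """Number of relevant documents among the top k by fused score."""
    order = np.argsort(-scores)
    return int(labels[order[:k]].sum())

def split_half_fusion(X, y, seed=0):
    """Steps 1-3 above: random split, logistic regression on the
    training half, fused scores and P100 on the testing half.
    X is an (n_docs, n_systems) score matrix; y holds 0/1 relevance."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    train, test = idx[: len(y) // 2], idx[len(y) // 2 :]
    clf = LogisticRegression().fit(X[train], y[train])
    fused = clf.predict_proba(X[test])[:, 1]  # P(relevant) as fused score
    return p_at_100(fused, y[test])

# Synthetic stand-in for two systems' scores: relevant documents
# (about 25% of 400) score systematically higher in both systems.
rng = np.random.default_rng(1)
y = (rng.random(400) < 0.25).astype(int)
X = rng.normal(size=(400, 2)) + y[:, None] * 1.5
p100 = split_half_fusion(X, y)
```

Fitting on one half and scoring the held-out half is what makes the comparison with SMART and InQuery on the same testing set fair.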
Local Fusion Results are Good

Topic    SMART  InQuery  Fusion  F>SM  F>IN  F=IN  IN>F
311         26       15      27     1     1     0     0
318          0        3       0     0     0     0     1
324         55       59      62     1     1     0     0
342          1        5       5     1     0     1     0
359          2       10      10     1     0     1     0
365          6       16      16     1     0     1     0
374         30       36      38     1     1     0     0
386          2        4       4     1     0     1     0
392         27       13      27     0     1     0     0
403         11       11      11     0     0     1     0
415         39       36      41     1     1     0     0
421          3        5       5     1     0     1     0
424         21       32      33     1     1     0     0
432          0        1       1     1     0     1     0
450         74       69      74     0     1     0     0
Overall    297      315     354    11     7     7     1
Summary of Local Fusion

       F & SM   F & IN   F & BEST
win        11        7         5
tie         4        7         9
lose        0        1         1

PROBLEM CASE
Topic  SMART  InQuery  Fusion
318        0        3       0
318        1        5       2
318        1        5       4
318        0        3       1
318        1        4       2

We ran 5 split-half runs on the odd case (318) and the results persist.
Newer Results
• Using a degree-2 polynomial as the kernel function, we got the best performance with SVM. Its overall P100 on 50 topics is 742 (logistic regression: 732). The overall P100 on the 13 "hard" [embedded relevant documents] topics is 53 (logistic regression: 49).
• The comparison over the individual systems, fusing all 3 systems:

              SMART              InQuery            KL
          wins  ties  loses  wins  ties  loses  wins  ties  loses
SVM         38     5      6    34     5     10    29     9     11
Logistic    36     7      6    31     8     10    27    11     11
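A degree-2 polynomial kernel gives the SVM conic-section level curves in the score plane, which is what lets it capture "embedded" relevant documents that no linear rule can separate. The sketch below is a synthetic illustration of that idea (scikit-learn, made-up scores), not the project's experiment: relevant documents sit in a disk in the middle of the two systems' normalized score plane, and a circle is exactly a conic:

```python
import numpy as np
from sklearn.svm import SVC

# Synthetic "embedded relevant documents": relevant docs occupy a disk
# in the middle of the (s, t) score plane, so no linear rule in that
# plane separates them, but a circle -- a conic section -- does.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(600, 2))
y = (np.hypot(X[:, 0], X[:, 1]) < 0.45).astype(int)

# Degree-2 polynomial kernel: the SVM's level curves are conics.
svm = SVC(kernel="poly", degree=2).fit(X[:400], y[:400])

# Rank held-out documents by the SVM margin, used as the fused score.
fused = svm.decision_function(X[400:])
top50_relevant = int(y[400:][np.argsort(-fused)[:50]].sum())
```

On such data a linear classifier cannot do better than guessing the majority class, while the quadratic decision surface recovers the embedded cluster.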
Is Local Sensible?
• Local fusion depends on getting information about a particular topic, and doing the best possible fusion.
• Not available in an ad hoc (e.g., Google) setting
• Potentially available in intelligence applications -- filtering; standing profiles
Making Local Fusion Work
• The system must get accurate feedback from the analyst/user on the relevance of individual items that are examined. If the analyst only sees excerpts, the judgment may be inaccurate (reflecting only the excerpt).
• The system can use feedback to update a fusion rule, “offline”.
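The offline update could look like the following sketch, which accumulates the analyst's relevance judgments and refits a logistic fusion rule on demand (the class, its fallback rule, and scikit-learn are all assumptions for illustration, not HITIQA's implementation):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

class FeedbackFusion:
    """Accumulate (per-system scores, relevance judgment) pairs from
    analyst feedback and refit the fusion rule offline on demand.
    Until judgments of both kinds exist, fall back to a plain sum."""

    def __init__(self):
        self.X, self.y = [], []
        self.clf = None

    def feedback(self, scores, relevant):
        """Record one judged document's scores and its judgment."""
        self.X.append(list(scores))
        self.y.append(int(relevant))

    def refit(self):
        """The offline update: refit on all feedback gathered so far."""
        if len(set(self.y)) == 2:  # need both classes to fit
            self.clf = LogisticRegression().fit(np.array(self.X),
                                                np.array(self.y))

    def score(self, scores):
        """Fused score for a new document's per-system scores."""
        if self.clf is None:
            return float(np.sum(scores))
        return float(self.clf.predict_proba([list(scores)])[0, 1])
```

Batching the refit (rather than updating per judgment) matches the "offline" framing above: the analyst's workflow is never blocked by model fitting.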
Inserting Local Fusion
• When the system sees (retrospectively) that the fusion rule would have worked significantly better than the better system, it can rely on the fusion rule to make future judgments.
Our Approach to Retrieval Fusion: Adaptive "Local" Fusion
[Diagram: the same pipeline as before -- the request goes to SMART, InQuery, and the KL language model, whose result sets the fusion process combines over the document sets into the delivered set -- with an added feedback loop: monitor the fusion set and receive feedback, then either ADOPT the fusion system or USE the better single system.]
Summary of Retrieval Fusion
• Fusion processes examined
– Linear forms – discriminant analysis; logistic
• Key results
– Global fusion of InQuery and SMART beats SMART 10 times of 15.
• Only beats InQuery 6 of 15
– Local (topic-dependent) fusion beats (ties) InQuery 7 (7) of 15 times
• Prospective
– Expand local (profile-dependent) fusion
– Explore more powerful methods such as:
• Quadratic classifiers
• SVM
Why is Global Fusion Hard?
• Global fusion requires that there be a single good rule that selects the region of greatest richness, for most topics, at once.
Fusion of InQuery and SMART: Topic 450
* Easy case – almost any linear rule works well. Either system works well.
Fusion of InQuery and SMART: Topic 392
* Easy case – SMART works well. InQuery works poorly.
Fusion of InQuery and SMART: Topic 432
* Another hard case – relevant documents not compactly grouped in the score space. Not many relevant documents found at all.
Fusion of InQuery and SMART: Topic 318
* Interesting case – no linear rule works well. Relevant documents embedded. Requires non-linear methods – quadratic; SVM; other.
Fusion of InQuery and SMART: Topic 421
* Really challenging case: quite a few relevant documents. Very diffuse in score space. Neither system works well. Possibly Boolean AND.
Fusion of InQuery and SMART: Topic 359
* A disaster.
Fusion of InQuery and SMART: Topic 374
* Possible Boolean AND. Neither works well alone.
Fusion of InQuery and SMART: Topic 415
* Part of the relevant material is easily found. Part is embedded.
KL Language Model and InQuery -- SVM with 2nd-degree polynomial kernel
* The level curves are conic sections. RBF gives multiple centers, but inferior performance. The X-axis is the normalized score from KL and the Y-axis is the normalized score from InQuery; the legend at the right of each plot gives the predicted score for each level curve.
Fusion
• Fusion of SMART, InQuery, and the K-L language model
– Linear classifier models used
– Coefficients optimized on OTHER TOPICS (Global)
– Coefficients optimized on SPECIFIC TOPICS (Local: good results)
Results and Prospects
• Fusion processes examined
– Linear forms – logistic analysis; local weighted regression [100NN-300NN]; SVM; kNN SVM
• Key results – fusion of all 3 systems:

              SMART              InQuery            KL                 ORACLE
          wins  ties  loses  wins  ties  loses  wins  ties  loses  wins  ties  loses
SVM         38     5      6    34     5     10    29     9     11    25     9     15
Logistic    36     7      6    31     8     10    27    11     11    19    13     17

• Prospective
– Expand local (profile-dependent) fusion
– Develop selection rules: when to use the "best single system" -- when to use fusion. Can we predict?
* Warning
Microsoft Windows has now been operating continuously for 88 minutes and 23 seconds. This is the limit allowed under your lease agreement. Windows Operating System will now crash in flames, destroying all of the content in your presentation, and humiliating you in front of your peers. For a better upgrade version contact Microsoft Sales Promotion at any computer store.
Thank you
• Questions?
Paul,

I've uploaded some of the recent data about fusion onto the HITIQA website at:
http://www.scils.rutgers.edu/~hitiqa/TEAM/psong/AquaintWorkshop200306/.

I only uploaded some key performance files because I'm not sure what data you will need in the presentation. Please let me know any further data needed.

Below is an explanation of the files I uploaded:

* result.iks.logistic: P100 for each topic with the logistic regression method
* perf.iks.logistic: Number of wins, ties and losses compared with the individual systems
* result.iks.lwr250: P100 for each topic with local weighted regression with kernel size 250
* perf.iks.lwr250: Number of wins, ties and losses compared with the individual systems
* result.iks.meta: P100 for each topic when combining the scores of logistic regression and LWR with equal weights (meta fusion)
* perf.iks.meta: Number of wins, ties and losses compared with the individual systems
* result.iks.svm: P100 for each topic with the support vector machine using the best parameter set
* perf.iks.svm: Number of wins, ties and losses compared with the individual systems

Please let me know of any questions.

Peng
------- End of forwarded message -------