Overview of the INEX 2008 Efficiency Track


Martin Theobald, Ralf Schenkel

Max-Planck Institute

General Idea

• Have a nice collection of readily assessed INEX ad-hoc topics from 2006-2008, so why not consider runtimes for a change?

• Attract more people from DB&IR to efficient XML-IR

• Investigate effectiveness/efficiency trade-offs for different retrieval modes and topic types: Article, Thorough, Focused, NEXI CO/CAS, XPath 2.0 Full-Text, high-dimensional content (query expansion/relevance feedback), deep structure, top-k, distribution, sequential vs. parallel executions

Test Collection

• Default INEX-Wikipedia collection 2007
– 4.38 GB XML sources
   > 659,000 documents and > 115,000,000 elements
– Not very heterogeneous but a rather awkward structure sometimes, many deeply nested paths
   > 3,000 distinct tags (> 1,000 out of which have content)
   > 120,000 distinct root-to-leaf paths
– No DTD available
– But: Be able to reuse large body of Ad-Hoc topics & assessments
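The structural figures above (distinct tags, tags with content, root-to-leaf paths) can be gathered with a single tree walk. The following is a minimal sketch, not the tooling used for the track: the function name `structure_stats`, the sample document, and the reading of "have content" as "directly contains non-whitespace text" are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

def structure_stats(xml_text):
    """Return (distinct tags, tags with direct text content, root-to-leaf paths)."""
    root = ET.fromstring(xml_text)
    tags, tags_with_content, leaf_paths = set(), set(), set()

    def walk(elem, prefix):
        path = prefix + "/" + elem.tag
        tags.add(elem.tag)
        # Assumption: "content" means non-whitespace text directly inside the tag.
        if elem.text and elem.text.strip():
            tags_with_content.add(elem.tag)
        children = list(elem)
        if not children:
            leaf_paths.add(path)  # no child elements, so this path ends at a leaf
        for child in children:
            walk(child, path)

    walk(root, "")
    return tags, tags_with_content, leaf_paths

# Hypothetical miniature article, only to exercise the function.
SAMPLE = (
    "<article><name>Castle</name><body><section><p>A motte.</p>"
    "<p>A bailey.</p></section><figure><caption>Moat</caption></figure>"
    "</body></article>"
)
tags, with_content, paths = structure_stats(SAMPLE)
print(len(tags), len(with_content), len(paths))  # 7 3 3
```

For a collection of this size one would stream the files rather than load each one fully, and merge the three sets across documents.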

Topics

• 540 type (A) topics (no. 289-828)
– Previous and current Ad-Hoc topics taken from INEX 2006-2008
– 308 topics have assessments
– Topic titles in NEXI CO & CAS and XPath 2.0 Full-Text syntax
– Full-text predicates: “”, +, -

• 21 type (B) topics (no. 829-849)
– High-dimensional content with up to 112 keywords
– Obtained from the 2006 Interactive Track feedback experiments by the Royal School Of Library And Information Science, Denmark
– Originally CO topics only, cast into CAS using //*[about(…)]
– Mapped to the original Ad-Hoc topic id, so assessments can be reused

• 7 type (C) topics (no. 850-856)
– High-dimensional structure with multiple branches
– Newly submitted by Efficiency Track participants
– Assessments skipped due to low expected impact on overall effectiveness results
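The id ranges and the CO-to-CAS casting described above are simple enough to sketch in a few lines. The names `TOPIC_RANGES`, `topic_type`, `cast_co_to_cas` and `cast_co_to_xpath` are hypothetical; the query templates follow the wildcard pattern `//*[about(…)]` named on the slide and the `ftcontains` form that appears in the type (B) example topic.

```python
# Id ranges as listed on the Topics slide (upper bounds are exclusive in range()).
TOPIC_RANGES = {
    "A": range(289, 829),
    "B": range(829, 850),
    "C": range(850, 857),
}

def topic_type(topic_id):
    """Map a topic id to its type label (A, B or C)."""
    for label, ids in TOPIC_RANGES.items():
        if topic_id in ids:
            return label
    raise ValueError(f"topic id {topic_id} is outside the listed ranges")

def cast_co_to_cas(co_title):
    """Wrap a keyword-only (CO) title into a wildcard NEXI CAS query."""
    return f"//*[about(., {co_title})]"

def cast_co_to_xpath(co_title):
    """Wrap a keyword-only (CO) title into an XPath Full-Text style query."""
    return f'//*[. ftcontains "{co_title}"]'

print({label: len(ids) for label, ids in TOPIC_RANGES.items()})
# {'A': 540, 'B': 21, 'C': 7}
print(topic_type(844), cast_co_to_cas("castle mound"))
# B //*[about(., castle mound)]
```

The range lengths reproduce the topic counts given on the slide (540, 21 and 7), which is a quick consistency check on the id ranges.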

Example Topic - Type (B)

<topic id="844" adhocid="517" type="B">
<co_title>castle mound castles fortress defensive earthworks offensive herefordshire french fortification ditch hollingbury scrob type circular conquest defend siegecraft surrounded dry essential walls norman ages weapon skagerrak kattegat inseparably internees feature citadels halland bayeux connotes palisade zar segovia mota provide earth castel word motte richard middle twofold fitz moat inroads tapestry confessor shropshire country flattened article disambiguation examines perimeter include bordering angular denotes styled crest prehistoric discusses fortified maiden timber denote countryside occupy welsh summit hostile erected towers parish danish mainland siege depicted wait mechanism stronger restricted residence aspect familiar provinces knight subjects survive virtually medieval lay swedish estate enemies describes measure denmark structures architecture traditionally domestic techniques store permanent normally camp fort</co_title>
<cas_title>//*[about(., castle mound castles fortress defensive earthworks offensive herefordshire french fortification ditch hollingbury scrob type circular conquest defend siegecraft surrounded dry essential walls norman ages weapon skagerrak kattegat inseparably internees feature citadels halland bayeux connotes palisade zar segovia mota provide earth castel word motte richard middle twofold fitz moat inroads tapestry confessor shropshire country flattened article disambiguation examines perimeter include bordering angular denotes styled crest prehistoric discusses fortified maiden timber denote countryside occupy welsh summit hostile erected towers parish danish mainland siege depicted wait mechanism stronger restricted residence aspect familiar provinces knight subjects survive virtually medieval lay swedish estate enemies describes measure denmark structures architecture traditionally domestic techniques store permanent normally camp fort)]</cas_title>
<xpath_title>//*[. ftcontains "castle mound castles fortress defensive earthworks offensive herefordshire french fortification ditch hollingbury scrob type circular conquest defend siegecraft surrounded dry essential walls norman ages weapon skagerrak kattegat inseparably internees feature citadels halland bayeux connotes palisade zar segovia mota provide earth castel word motte richard middle twofold fitz moat inroads tapestry confessor shropshire country flattened article disambiguation examines perimeter include bordering angular denotes styled crest prehistoric discusses fortified maiden timber denote countryside occupy welsh summit hostile erected towers parish danish mainland siege depicted wait mechanism stronger restricted residence aspect familiar provinces knight subjects survive virtually medieval lay swedish estate enemies describes measure denmark structures architecture traditionally domestic techniques store permanent normally camp fort"]</xpath_title>
</topic>

Conjunctive evaluation not feasible!
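One plausible reading of the remark on conjunctive evaluation: with this many keywords, hardly any element contains all of them, so AND semantics returns an empty result, while OR semantics with ranking still yields a usable top-k. The toy sketch below illustrates the difference; the index contents, element ids, and the count-based scoring are invented for illustration and do not reflect any participant's engine.

```python
from collections import Counter

# Toy inverted index (hypothetical data): term -> set of element ids.
INDEX = {
    "castle": {1, 2, 3, 5},
    "motte": {2, 3},
    "moat": {3, 4},
    "palisade": {5},
}

def conjunctive(terms):
    """Elements containing every query term (AND semantics)."""
    postings = [INDEX.get(term, set()) for term in terms]
    return set.intersection(*postings) if postings else set()

def disjunctive(terms, k):
    """Top-k elements ranked by number of matching terms (OR semantics)."""
    hits = Counter()
    for term in terms:
        for elem in INDEX.get(term, set()):
            hits[elem] += 1
    # Sort by descending match count, breaking ties by element id.
    return sorted(hits, key=lambda e: (-hits[e], e))[:k]

query = ["castle", "motte", "moat", "palisade"]
print(conjunctive(query))     # set()
print(disjunctive(query, 3))  # [3, 2, 5]
```

Even with four terms the conjunctive result is already empty here, whereas the disjunctive ranking still surfaces the elements that match most of the query.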

Example Topic - Type (C)

<topic id="856" type="C">
<co_title>State Parks Geology Geography +Canyon</co_title>
<cas_title>//article//body[about(.//section//p, State Park) and about(.//section//title, Geology) and about(.//section//title, Geography)]//figure[about(.//caption, +Canyon)]</cas_title>
<xpath_title>//article//body[.//section//p ftcontains "State Park" and .//section//title ftcontains "Geology" and .//section//title ftcontains "Geography"]//figure[.//caption ftcontains "Canyon"]</xpath_title>
<description>I’m looking for state parks with sections describing their geology and/or geography, preferably with a figure of a canyon as target element.</description>
<narrative>State park pages often follow the common pattern of having sections entitled with "Geology" or "Geography". I’m particularly interested in those pages with a figure of a canyon, e.g., the Grand Canyon.</narrative>
</topic>
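To see what "multiple branches" means in practice, the `about()` clauses of this CAS title can be pulled apart into (relative path, keywords) pairs. This is a rough regex-based sketch, not a NEXI parser: the pattern `ABOUT` and the helper `about_clauses` are hypothetical, and the pattern would break on nested parentheses or keywords that contain a closing parenthesis.

```python
import re

# CAS title of topic 856, copied from the example topic.
CAS_TITLE = (
    "//article//body[about(.//section//p, State Park) and "
    "about(.//section//title, Geology) and "
    "about(.//section//title, Geography)]"
    "//figure[about(.//caption, +Canyon)]"
)

# about(<relative path>, <keywords>): path up to the first comma,
# keywords up to the next closing parenthesis.
ABOUT = re.compile(r"about\(\s*([^,]+?)\s*,\s*([^)]*?)\s*\)")

def about_clauses(cas_title):
    """Return a list of (relative path, keyword list) pairs."""
    return [(path, keywords.split()) for path, keywords in ABOUT.findall(cas_title)]

for path, keywords in about_clauses(CAS_TITLE):
    print(path, keywords)
# .//section//p ['State', 'Park']
# .//section//title ['Geology']
# .//section//title ['Geography']
# .//caption ['+Canyon']
```

Each pair corresponds to one structural branch that an engine has to evaluate and join, which is what makes type (C) topics structurally demanding.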

Sub-Tasks

• Article
– Article-only runs, naturally overlap-free
– In combination with CO queries, this resembles a classic IR setting with keyword queries and documents as results

• Thorough
– Used in INEX 2003-2006, allows overlapping results
– May be more efficient for some systems

• Focused
– Current default mode in INEX, overlap-free at both passage- and element-level
– Removing overlap may be an expensive post-processing step
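The overlap removal needed for Focused runs can be sketched as a greedy filter over a ranked result list: keep a result unless an already-kept result in the same file is the same element, an ancestor, or a descendant. This is one simple strategy chosen for illustration, not necessarily what any participant implemented; the file names and paths are made up, and the pairwise check is quadratic in the number of kept results, which hints at why the step can be costly.

```python
def overlaps(path_a, path_b):
    """True if the two element paths are equal or one is an ancestor of the other."""
    shorter, longer = sorted((path_a, path_b), key=len)
    # Appending "/" prevents section[2] from matching section[21].
    return longer == shorter or longer.startswith(shorter + "/")

def remove_overlap(ranked):
    """Greedy filter over (file, path) pairs given in rank order."""
    kept = []
    for file, path in ranked:
        if not any(f == file and overlaps(path, p) for f, p in kept):
            kept.append((file, path))
    return kept

# Hypothetical Thorough-style run with overlapping elements.
run = [
    ("12345.xml", "/article[1]/body[1]/section[2]"),
    ("12345.xml", "/article[1]/body[1]/section[2]/p[1]"),  # descendant of rank 1
    ("12345.xml", "/article[1]/body[1]/section[3]"),
    ("67890.xml", "/article[1]"),
    ("12345.xml", "/article[1]/body[1]"),                  # ancestor of rank 1
]
print(remove_overlap(run))
```

With this input the filter keeps the first, third and fourth entries and drops the two that overlap with the top-ranked section.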

Submissions

DTD for run submissions

<!ELEMENT efficiency-submission (topic-fields, general_description, ranking_description, indexing_description, caching_description, topic+)>
<!ATTLIST efficiency-submission
  participant-id CDATA #REQUIRED
  run-id CDATA #REQUIRED
  task (article|thorough|focused) #REQUIRED
  query (automatic|manual) #REQUIRED
  sequential (yes|no) #REQUIRED
  no_cpu CDATA #IMPLIED
  ram CDATA #IMPLIED
  no_nodes CDATA #IMPLIED
  hardware_cost CDATA #IMPLIED
  hardware_year CDATA #IMPLIED
  topk (15|150|1500) #IMPLIED >

<!ELEMENT topic-fields EMPTY>
<!ATTLIST topic-fields
  co_title (yes|no) #REQUIRED
  cas_title (yes|no) #REQUIRED
  xpath_title (yes|no) #REQUIRED
  text_predicates (yes|no) #REQUIRED
  description (yes|no) #REQUIRED
  narrative (yes|no) #REQUIRED >

<!ELEMENT general_description (#PCDATA)>
<!ELEMENT ranking_description (#PCDATA)>
<!ELEMENT indexing_description (#PCDATA)>
<!ELEMENT caching_description (#PCDATA)>

<!ELEMENT topic (result*)>
<!ATTLIST topic
  topic-id CDATA #REQUIRED
  total_time_ms CDATA #REQUIRED
  cpu_time_ms CDATA #IMPLIED
  io_time_ms CDATA #IMPLIED >

<!ELEMENT result (file, path, rank?, rsv?) >
<!ELEMENT file (#PCDATA)>
<!ELEMENT path (#PCDATA)>
<!ELEMENT rank (#PCDATA)>
<!ELEMENT rsv (#PCDATA)>
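A run file that follows this DTD can be assembled with Python's standard `xml.etree.ElementTree`. The sketch below is only an illustration of the element and attribute layout: the helper `build_submission`, the participant id, run id, description texts, file name, path and score are all placeholders, and only the required attributes plus `topk` are set.

```python
import xml.etree.ElementTree as ET

def build_submission(results_by_topic):
    """results_by_topic: {topic_id: (total_time_ms, [(file, path, rsv), ...])}."""
    root = ET.Element("efficiency-submission", {
        "participant-id": "99",   # placeholder value
        "run-id": "example-run",  # placeholder value
        "task": "focused",
        "query": "automatic",
        "sequential": "yes",
        "topk": "15",
    })
    ET.SubElement(root, "topic-fields", {
        "co_title": "yes", "cas_title": "no", "xpath_title": "no",
        "text_predicates": "no", "description": "no", "narrative": "no",
    })
    # The four free-text descriptions must appear in this order.
    for name in ("general_description", "ranking_description",
                 "indexing_description", "caching_description"):
        ET.SubElement(root, name).text = "placeholder"
    for topic_id, (total_ms, results) in results_by_topic.items():
        topic = ET.SubElement(root, "topic", {
            "topic-id": str(topic_id),
            "total_time_ms": str(total_ms),
        })
        for rank, (file, path, rsv) in enumerate(results, start=1):
            result = ET.SubElement(topic, "result")
            ET.SubElement(result, "file").text = file
            ET.SubElement(result, "path").text = path
            ET.SubElement(result, "rank").text = str(rank)
            ET.SubElement(result, "rsv").text = str(rsv)
    return root

root = build_submission({
    856: (42, [("12345.xml", "/article[1]/body[1]/figure[1]", 0.87)]),
})
print(ET.tostring(root, encoding="unicode"))
```

ElementTree does not validate against a DTD, so the child order and the enumerated attribute values are the caller's responsibility here.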


Metrics

• Interpolated Precision (iP) and Mean Average Interpolated Precision (MAiP) for Focused & Article modes
– New INEX evaluation software 2008 (passage-based)
– Qrels from 2006-2007 transformed into 2008 format

• Classic precision/recall plots for Thorough mode
– INEX EvalJ 2006-2007 (element-based)
– Qrels from 2008 transformed back into 2006 format (made available for download at the track homepage)
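The idea behind iP and MAiP can be shown with a deliberately simplified sketch. It treats each result as simply relevant or not, whereas the 2008 evaluation software is passage-based, so the numbers it produces are not comparable to official scores. It assumes 101 recall levels (0.00 to 1.00 in steps of 0.01), which matches the recall axis of the plots in this deck; all function names are hypothetical.

```python
def interpolated_precision(relevance, total_relevant, levels=101):
    """iP at evenly spaced recall levels for one ranked list of booleans.

    total_relevant must be greater than zero.
    """
    points = []  # (recall, precision) after each rank
    hits = 0
    for rank, is_relevant in enumerate(relevance, start=1):
        hits += bool(is_relevant)
        points.append((hits / total_relevant, hits / rank))
    ip = []
    for i in range(levels):
        x = i / (levels - 1)
        # Interpolation: best precision at any point with recall >= x.
        reachable = [p for r, p in points if r >= x]
        ip.append(max(reachable) if reachable else 0.0)
    return ip

def average_ip(relevance, total_relevant):
    """AiP: mean of iP over all recall levels for one topic."""
    ip = interpolated_precision(relevance, total_relevant)
    return sum(ip) / len(ip)

def mean_average_ip(runs):
    """MAiP over topics; runs is a list of (relevance, total_relevant) pairs."""
    return sum(average_ip(rel, n) for rel, n in runs) / len(runs)

ip = interpolated_precision([True, False, True], total_relevant=2)
print(ip[0], ip[50], round(ip[51], 3), round(ip[100], 3))  # 1.0 1.0 0.667 0.667
```

In the small example the ranking reaches recall 0.5 with precision 1.0 at rank 1 and recall 1.0 with precision 2/3 at rank 3, so iP drops from 1.0 to 2/3 once the recall level exceeds 0.5.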

Participants & Runs

• Max-Planck-Institut Informatik [10], 8 runs
• University of Frankfurt [16], 5 runs
• University of Toronto [42], 2 runs
• University of Twente & CWI [53], 4 runs
• JustSystems Corporation [56], 1 run

Results Overview

• General Setting (parameters taken from submission headers)

• Effectiveness (iP, MAiP) vs. Efficiency (wallclock runtime)
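Per-topic runtimes of the kind compared here could be collected with a small timing wrapper. This is a sketch under assumptions: `timed` and `toy_query` are hypothetical names, the returned keys simply mirror the `total_time_ms` and `cpu_time_ms` attributes of the submission DTD, and how participants actually measured their runtimes is not stated in the slides.

```python
import time

def timed(run_query, *args):
    """Run a query function and report wallclock and CPU time in milliseconds."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    result = run_query(*args)
    cpu_ms = (time.process_time() - cpu_start) * 1000.0
    total_ms = (time.perf_counter() - wall_start) * 1000.0
    return result, {"total_time_ms": round(total_ms), "cpu_time_ms": round(cpu_ms)}

def toy_query(keywords):
    # Stand-in for a real retrieval engine (hypothetical).
    return sorted(set(keywords))

results, timing = timed(toy_query, ["castle", "moat", "castle"])
print(results, timing)
```

Wallclock time includes waiting on I/O while CPU time does not, which is why the DTD keeps the two as separate attributes.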

Effectiveness: Focused & Article, All Topics

[Plot: iPrecision (y-axis, 0.0 to 0.6) vs. Recall (x-axis, 0.00 to 1.00). Runs shown: pftijah_article_strict; B2U0_full-depth-heur; VSM_RIP; TOPX2-Eff08-CO-15-Focused-W; TOPX2-Eff08-CO-150-Focused-W; TOPX2-Eff08-CO-1500-Focused-W; TOPX2-Eff08-CAS-15-Focused-W; TOPX2-Eff08-CAS-150-Focused-W; TOPX2-Eff08-CAS-1500-Focused-W; 003-Uni Frankfurt,Architect-Sim; 002-Uni Frankfurt,Baseline; 004-Uni Frankfurt,Fine-Sim; 005-Uni Frankfurt,Path-Sim; 001-Uni Frankfurt,Strict]

Focused, Type (A)

[Plot: iPrecision (y-axis, 0.0 to 0.6) vs. Recall (x-axis, 0.00 to 1.00). Runs shown: pftijah_article_strict; B2U0_full-depth-heur; VSM_RIP; TOPX2-Eff08-CO-15-Focused-W; TOPX2-Eff08-CO-150-Focused-W; TOPX2-Eff08-CO-1500-Focused-W; TOPX2-Eff08-CAS-15-Focused-W; TOPX2-Eff08-CAS-150-Focused-W; TOPX2-Eff08-CAS-1500-Focused-W; 003-Uni Frankfurt,Architect-Sim; 002-Uni Frankfurt,Baseline; 004-Uni Frankfurt,Fine-Sim; 005-Uni Frankfurt,Path-Sim; 001-Uni Frankfurt,Strict]

Focused, Type (B)

[Plot: iPrecision (y-axis, 0.00 to 0.40) vs. Recall (x-axis, 0.00 to 1.00). Runs shown: pftijah_article_strict; B2U0_full-depth-heur; VSM_RIP; TOPX2-Eff08-CO-15-Focused-W; TOPX2-Eff08-CO-150-Focused-W; TOPX2-Eff08-CO-1500-Focused-W; TOPX2-Eff08-CAS-15-Focused-W; TOPX2-Eff08-CAS-150-Focused-W; TOPX2-Eff08-CAS-1500-Focused-W]


Focused, Type (C)

Thorough, All Topics

[Plot: Precision (y-axis, 0.00 to 0.30) vs. Recall (x-axis, 0.01 to 1.00). Runs shown: pftijah_star_strict; pftijah_asp_strict; pftijah_asp_vague; B2U0_full-depth-sr; TOPX2-Eff08-CO-15-Thorough-W; TOPX2-Eff08-CAS-15-Thorough-W]

Conclusions

• Continue in 2009 with more new topics & subtasks

• Establish as reference benchmark for XML-IR experiments for a broad DB&IR audience

• Make available also to non-INEX participants?

http://www.inex.otago.ac.nz/efficiency/efficiency.asp