Chances and Challenges in Comparing Cross-Language Retrieval Tools
Giovanna Roda, Vienna, Austria
IRF Symposium 2010 / June 3, 2010
CLEF-IP: the Intellectual Property track at CLEF

CLEF-IP is an evaluation track within the Cross-Language Evaluation Forum (CLEF).¹

organized by the IRF
first track ran in 2009
running this year for the second time

¹ http://www.clef-campaign.org
What is an evaluation track?

An evaluation track in Information Retrieval is a cooperative action aimed at comparing different techniques on a common retrieval task.

produces experimental data that can be analyzed and used to improve existing systems
fosters exchange of ideas and cooperation
produces a reusable test collection, sets milestones

Test collection

A test collection consists traditionally of target data, a set of queries, and relevance assessments for each query.
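To make the definition concrete, here is a minimal sketch of a test collection and the evaluation loop it enables. The structure follows the definition above; the field names and the recall@k measure are illustrative choices, not something the slides prescribe.

```python
from dataclasses import dataclass

@dataclass
class TestCollection:
    """A test collection in the traditional sense: target data,
    queries, and relevance assessments (qrels) per query."""
    documents: dict   # doc id -> document text (the target data)
    queries: dict     # query id -> query text
    qrels: dict       # query id -> set of relevant doc ids

def evaluate(collection, system, k=100):
    """Run a retrieval system over all queries and report mean recall@k,
    one simple way to compare systems on a common task."""
    recalls = []
    for qid, query in collection.queries.items():
        top_k = set(system(query)[:k])   # system returns ranked doc ids
        relevant = collection.qrels[qid]
        recalls.append(len(relevant & top_k) / len(relevant))
    return sum(recalls) / len(recalls)
```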
CLEF-IP 2009: the task

The main task in the CLEF-IP track was to find prior art for a given patent.

Prior art search

Prior art search consists in identifying all information (including non-patent literature) that might be relevant to a patent's claim of novelty.
Participants - 2009 track

1. Tech. Univ. Darmstadt, Dept. of CS, Ubiquitous Knowledge Processing Lab (DE)
2. Univ. Neuchatel - Computer Science (CH)
3. Santiago de Compostela Univ. - Dept. Electronica y Computacion (ES)
4. University of Tampere - Info Studies (FI)
5. Interactive Media and Swedish Institute of Computer Science (SE)
6. Geneva Univ. - Centre Universitaire d'Informatique (CH)
7. Glasgow Univ. - IR Group Keith (UK)
8. Centrum Wiskunde & Informatica - Interactive Information Access (NL)
Participants - 2009 track

9. Geneva Univ. Hospitals - Service of Medical Informatics (CH)
10. Humboldt Univ. - Dept. of German Language and Linguistics (DE)
11. Dublin City Univ. - School of Computing (IE)
12. Radboud Univ. Nijmegen - Centre for Language Studies & Speech Technologies (NL)
13. Hildesheim Univ. - Information Systems & Machine Learning Lab (DE)
14. Technical Univ. Valencia - Natural Language Engineering (ES)
15. Al. I. Cuza University of Iasi - Natural Language Processing (RO)
Participants - 2009 track

15 participants
48 experiments submitted for the main task
10 experiments submitted for the language tasks
2009-2010: participants
2009-2010: evolution of the CLEF-IP track

| 2009 | 2010 |
| --- | --- |
| 1 task: prior art search | prior art candidate search and classification task |
| targeting granted patents | patent applications |
| 15 participants | 20 participants |
| all from academia | 4 industrial participants |
| families and citations | include forward citations |
| manual assessments | expanded lists of relevant docs |
| standard evaluation measures | new measure: PRES, more recall-oriented |
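The table's last row mentions PRES without giving its formula. As a point of reference, here is a sketch of the published formulation (Magdy & Jones, SIGIR 2010), in which relevant documents not retrieved within the first n_max results are assumed to sit at the worst possible ranks; treat the details as an assumption, since the slides do not spell them out.

```python
def pres(relevant, ranked, n_max):
    """Patent Retrieval Evaluation Score: 1 when all relevant documents
    are at the very top of the ranking, 0 when none appear in the first
    n_max results.

    relevant: set of relevant doc ids
    ranked:   retrieved doc ids, best first
    n_max:    maximum number of results the searcher will examine
    """
    n = len(relevant)
    if n == 0:
        raise ValueError("query has no relevant documents")
    ranks = [i + 1 for i, doc in enumerate(ranked[:n_max]) if doc in relevant]
    # relevant docs missing from the top n_max get ranks n_max+1, n_max+2, ...
    ranks += [n_max + i + 1 for i in range(n - len(ranks))]
    return 1 - (sum(ranks) / n - (n + 1) / 2) / n_max
```

With all relevant documents ranked first this yields 1; with none of them retrieved it yields 0, which is what makes the measure recall-oriented.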
What are relevance assessments?

A test collection (also known as a gold standard) consists of a target dataset, a set of queries, and relevance assessments corresponding to each query.

The CLEF-IP test collection:

target data: 2 million EP patents
queries: full-text patents (without images)
relevance assessments: extended citations
Relevance assessments

We used patents cited as prior art as relevance assessments.

Sources of citations:

1. applicant's disclosure: the USPTO requires applicants to disclose all known relevant publications
2. patent office search report: each patent office will do a search for prior art to judge the novelty of a patent
3. opposition procedures: patents cited to prove that a granted patent is not novel
Extended citations as relevance assessments

direct citations and their families
direct citations of family members ... and their families
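A minimal sketch of this expansion, assuming two hypothetical lookup tables, citations and family (patent families are defined on the next slide; assume family[p] includes p itself). The real track worked from EPO bibliographic data, not these toy structures.

```python
def extended_citations(patent, citations, family):
    """Collect the extended citations of a patent: the direct citations
    of the patent and of its family members, together with the families
    of all those cited documents.

    citations: patent id -> ids of the patents it cites (hypothetical)
    family:    patent id -> members of its simple family, incl. itself
    """
    relevant = set()
    for member in family[patent]:                # the patent and its family
        for cited in citations.get(member, ()):  # their direct citations ...
            relevant.update(family[cited])       # ... and the cited families
    return relevant
```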
Patent families

A patent family consists of patents granted by different patent authorities but related to the same invention.

simple family: all family members share the same priority number
extended family: there are several definitions; in the INPADOC database, all documents which are directly or indirectly linked via a priority number belong to the same family
Patent families

Patent documents are linked by priorities; these links define the INPADOC family. CLEF-IP uses simple families.
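The INPADOC definition above amounts to computing connected components over priority links. A sketch with union-find, assuming a hypothetical priorities lookup from patent ids to their priority numbers:

```python
from collections import defaultdict

def inpadoc_families(priorities):
    """Group patents into INPADOC-style extended families: documents
    directly or indirectly linked via a priority number end up in the
    same family (connected components of the patent-priority graph).

    priorities: patent id -> list of priority numbers (hypothetical)
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path compression
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    for patent, prios in priorities.items():
        for prio in prios:                  # link patent to each priority
            union(patent, ("prio", prio))

    families = defaultdict(set)
    for patent in priorities:
        families[find(patent)].add(patent)
    return list(families.values())
```

For example, inpadoc_families({"EP1": ["P1"], "US1": ["P1", "P2"], "JP1": ["P2"]}) puts all three documents into one extended family, even though EP1 and JP1 share no priority directly; a simple family would instead require identical priority numbers.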
Relevance assessments 2010

Expanding the 2009 extended citations:

1. include citations of forward citations ...
2. ... and their families

This is apparently a well-known method among patent searchers. Zig-zag search?
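Reusing the extended_citations sketch from above, the 2010 expansion might look like this; cited_by is a hypothetical reverse-citation lookup (the forward citations), and the whole function is an illustration of the two steps listed, not the track's actual code.

```python
def expanded_assessments_2010(patent, citations, cited_by, family):
    """2010 expansion: the 2009 extended citations, plus the citations
    of forward citations (patents that cite this one) and the families
    of those cited documents."""
    relevant = extended_citations(patent, citations, family)
    for forward in cited_by.get(patent, ()):      # patents citing `patent`
        for cited in citations.get(forward, ()):  # what those patents cite
            relevant.update(family[cited])        # ... and their families
    return relevant
```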
How good are the CLEF-IP relevance assessments?

CLEF-IP uses families + citations:

how complete are extended citations as relevance assessments?
will every prior art patent be included in this set?
and if not, what percentage of prior art items is captured by extended citations?
when considering forward citations, how good are extended citations as a prior art candidate set?
Feedback from patent experts needed

The quality of prior art candidate sets has to be assessed; the know-how of patent search experts is needed.

at CLEF-IP 2009, 7 patent search professionals assessed 12 search results
the task was not well defined and there were misunderstandings about the concept of relevance
the amount of data was not sufficient to draw conclusions
Some initiatives associated with CLEF-IP

The results of evaluation tracks are mostly useful for the research community, and this community often produces prototypes that are of little interest to the end user.

Next I'd like to present two concrete outcomes, not of CLEF-IP directly but arising from work in patent retrieval evaluation.
Soire

developed at Matrixware
service-oriented architecture, available as a Web service
allows replicating IR experiments based on the classical evaluation model
tested on the CLEF-IP data
customized for the evaluation of machine translation
Spinque

a spin-off (2010) from CWI (the Dutch National Research Center in Computer Science and Mathematics)
introduces search-by-strategy
provides optimized strategies for patent search, tested on CLEF-IP data
transparency: understand your search results to improve the strategy
CLEF-IP 2009 learnings

Humboldt University implemented a model for patent search that produced the best results. The model combined several strategies:

using metadata (IPC, ECLA)
indexes built at lemma level
an additional phrase index for English
a crosslingual concept index (multilingual terminological database)
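The slides list the components of the Humboldt run but not how they were combined. One plausible reading is weighted score fusion across the separate indexes; the sketch below is an assumption for illustration, not the team's actual method, and the index names are placeholders.

```python
def combined_search(query, indexes, weights):
    """Fuse results from several indexes (e.g. lemma, English phrase,
    crosslingual concept) with a weighted sum of per-index scores.

    indexes: name -> search function returning (doc id, score) pairs
    weights: name -> weight for that index (both hypothetical)
    """
    scores = {}
    for name, search in indexes.items():
        for doc, score in search(query):
            scores[doc] = scores.get(doc, 0.0) + weights[name] * score
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```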
Some additional investigations

Some citations were hard to find:

| % runs | class |
| --- | --- |
| x ≤ 5 | hard |
| 5 < x ≤ 10 | very difficult |
| 10 < x ≤ 50 | difficult |
| 50 < x ≤ 75 | medium |
| 75 < x ≤ 100 | easy |
We looked at the content of citations and citing patents. These investigations are ongoing.
Thank you for your attention.