WEB SEARCH ENGINES & THEIR PERFORMANCE
Merve YILMAZ
MGIS 301
08.05.2008
An “information retrieval system” searches the computer system to find
the required information.
The most public, visible form of a search engine.
Information it searches for can be:
› Web pages
› Images
› Other types of files
The very first tool leading to the web search engines: Archie
› Collects file names and builds a searchable database from the collection; the contents of the files are not indexed.
› Created by a student at McGill University, Montreal
› The name stands for “archive” without the “v”
The first search engine: Wandex
› Created by an MIT student
One of the first full-text, crawler-based search engines: WebCrawler
* crawler: something that crawls like a reptile
Another popular search engine: Lycos, which started at Carnegie Mellon University.
Then came other search engines: Excite, AltaVista, Infoseek…
Yahoo provides directory browsing
How did Google become the most popular among others?
› An innovation called “PageRank”
› A minimalist interface, rather than embedding a web search engine into a web portal
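The slide only names PageRank. As a rough illustration, the sketch below runs the commonly described power-iteration form of the idea on a made-up four-page link graph: a page ranks higher when more (and higher-ranked) pages link to it. The damping value 0.85, the iteration count, and the `*.example` page names are assumptions chosen for illustration; they are not taken from the presentation or from Google's production system.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict {page: [pages it links to]}."""
    # Collect every page that appears as a source or as a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start from a uniform distribution.
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page receives a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy web: three pages link to c.example, which links to a.example.
toy_web = {
    "a.example": ["c.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example"],
    "d.example": ["c.example"],
}
ranks = pagerank(toy_web)
best = max(ranks, key=ranks.get)
```

In this toy graph `best` is `c.example`, the page with the most incoming links, and the ranks sum to 1.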
I. Web Crawling
II. Indexing
III. Searching
Web Crawling: an automated web browser follows every link it sees.
Indexing: words are extracted from titles, headings, and meta tags.
* meta tag: <META name="keywords" content="stamps, stamp collecting, stamps for sale">
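The three steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how any particular engine is built: the "web" is a hypothetical in-memory dictionary (`TOY_WEB`) instead of real HTTP fetches, only titles, headings, and the keywords meta tag are indexed, and a search returns the pages that contain every query word.

```python
import re
from collections import defaultdict, deque
from html.parser import HTMLParser

# Hypothetical in-memory "web" (URL -> HTML); a real crawler would fetch over HTTP.
TOY_WEB = {
    "http://stamps.example/": (
        "<html><head><title>Stamp Shop</title>"
        '<META name="keywords" content="stamps, stamp collecting, stamps for sale">'
        "</head><body><h1>Rare stamps</h1>"
        '<a href="http://stamps.example/history">History</a></body></html>'
    ),
    "http://stamps.example/history": (
        "<html><head><title>History of Postage</title></head>"
        "<body><h2>Early postage</h2>"
        '<a href="http://stamps.example/">Home</a></body></html>'
    ),
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


class PageParser(HTMLParser):
    """Collects links, plus words from titles, headings, and keyword meta tags."""

    TEXT_TAGS = {"title", "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.links = []
        self.words = []
        self._depth = 0  # greater than 0 while inside a title or heading

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "meta" and (attrs.get("name") or "").lower() == "keywords":
            self.words += tokenize(attrs.get("content") or "")
        elif tag in self.TEXT_TAGS:
            self._depth += 1

    def handle_endtag(self, tag):
        if tag in self.TEXT_TAGS and self._depth:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth:
            self.words += tokenize(data)


def crawl_and_index(seed, web):
    """I + II: follow every link breadth-first and build word -> set of URLs."""
    index = defaultdict(set)
    seen, queue = {seed}, deque([seed])
    while queue:
        url = queue.popleft()
        html = web.get(url)
        if html is None:  # link to a page that is not in the toy web
            continue
        parser = PageParser()
        parser.feed(html)
        for word in parser.words:
            index[word].add(url)
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index


def search(index, query):
    """III: return the pages whose indexed words include every query word."""
    terms = tokenize(query)
    if not terms:
        return set()
    return set.intersection(*(index.get(term, set()) for term in terms))


index = crawl_and_index("http://stamps.example/", TOY_WEB)
stamp_pages = search(index, "stamp collecting")
postage_pages = search(index, "postage")
```

Here `stamp_pages` contains only the shop page, because "collecting" appears only in its keywords meta tag, while `postage_pages` contains only the history page, which the crawler reached by following the link from the seed.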
Search Target 1: Odysseuss2009.org
Best rating percentage: Yahoo.com
Fewest results: Excite.com
Worst rating percentage: Windows Live / MSN Search (search.msn.com) (0%)
* Excite.com is actually a meta search engine and shows results from other search engines.
Best rating percentage among meta engines: Dogpile.com
Worst rating percentage: Apollo7.co.uk (almost 0%)
* The engines from which meta search engines collect their results differ from one meta search engine to another, so their performance varies accordingly.
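To illustrate why the choice of underlying engines matters, here is a minimal sketch of a meta search engine that merges the ranked lists returned by several engines. The merging rule (sum of reciprocal ranks, ties broken alphabetically) and the three stand-in engines are assumptions for illustration; the presentation does not say how Dogpile.com, Apollo7.co.uk, or any other meta engine combines results.

```python
def meta_search(query, engines):
    """Query each engine and merge the ranked URL lists into one ranking."""
    scores = {}
    for engine in engines:
        # A URL earns 1/position from every engine that returns it,
        # so duplicates across engines are merged and rewarded.
        for position, url in enumerate(engine(query), start=1):
            scores[url] = scores.get(url, 0.0) + 1.0 / position
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda url: (-scores[url], url))


# Hypothetical stand-ins for real search engines (fixed, made-up results).
def engine_a(query):
    return ["x.example", "y.example", "z.example"]


def engine_b(query):
    return ["y.example", "x.example"]


def engine_c(query):
    return ["w.example"]


merged_ab = meta_search("lot-sizing", [engine_a, engine_b])
merged_bc = meta_search("lot-sizing", [engine_b, engine_c])
```

The same query gives `merged_ab` and `merged_bc` different result lists purely because the two meta engines draw on different sources.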
Search query: “Loss of customer goodwill in lot-sizing”, an article by Deniz Aksen
* All results that link to the article, or to pages where the article is cited or referenced, are counted as a “hit”.
• Best rating percentage: Ask.com (found 1st in almost all searches)
• Worst rating percentage: Lycos.com (almost none of the results are hits)
• Best rating percentage among meta engines: Mamma.com
• Worst rating percentage among meta engines: Donbusca.com
* The samples show the results of only some of the searches; in total there are 480 searches with various keywords for one target search result. The rest are included in detail in the Excel sheets.
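The slides do not define how the "rating percentage" was calculated. One plausible reading, shown here purely as an assumption, is the share of an engine's returned results that count as a "hit" under the rule stated for the article search. The URLs below are hypothetical placeholders, not data from the study.

```python
def rating_percentage(results, hit_urls):
    """Percentage of returned results that are hits (assumed definition)."""
    if not results:
        return 0.0  # assumption: an engine that returns nothing scores 0%
    hits = sum(1 for url in results if url in hit_urls)
    return 100.0 * hits / len(results)


# Hypothetical hit set: the target article plus pages known to cite it.
hit_urls = {
    "journal.example/article",
    "citing-site.example/refs",
}

# Hypothetical results returned by one engine for one query.
engine_results = [
    "journal.example/article",
    "unrelated.example/page",
    "citing-site.example/refs",
    "other.example/page",
]

score = rating_percentage(engine_results, hit_urls)
```

With two hits among four results, `score` is 50.0. Averaging such scores over all searches for a target would give one figure per engine.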