
21 May 2005 Karen Blakeman www.rba.co.uk

Top Tips for Expert Searching

Inforum 2005, Prague

Karen Blakeman, RBA Information Services
[email protected]
rba.co.uk


What will be covered?

■ What Google has been doing over the last year
■ Alternatives to Google
■ RSS and Blogs
■ Desktop Search
■ Storing, managing and sharing resources
■ Top Tips to reduce overload


Google

■ New services
  – Google Mail
  – Google Print
  – Google Scholar
  – Google Suggests
  – Google Local and Maps (only US, Canada and UK)
  – My Search History
  – Web Accelerator
  – Personalized News
  – Google Desktop
■ Look in labs.google.com


Google

■ Increased database to over 8 billion pages
  – more rubbish to sift through
  – need to use advanced search features to get more useful results
■ New search features
  – increased number of search terms to 32
  – numeric range search
      toblerone 1..5 kg
      DVD player $100..200
      2000..2005
  – synonym search (~ before a word)


Google: use advanced search

■ language
■ file format (filetype:)
■ date modified
■ domain (site:)
■ Similar pages (related:)
■ pages that link to a known page (link:)
■ Also define: for definitions
■ Use * to stand in for a word in a phrase
■ Use + sign before a word to stop automatic stemming
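
The operators listed on this slide and the previous one can be combined in a single search box. The sketch below only assembles the query text; it does not contact any search engine. The function name and parameters are made up for illustration, and the operators are taken as described in these slides, so current behaviour may differ.

```python
def build_query(terms, site=None, filetype=None, exact_words=(),
                synonyms=(), numrange=None):
    """Assemble a query string from plain terms plus the slide's operators."""
    parts = list(terms)
    parts += [f"+{w}" for w in exact_words]   # + before a word stops automatic stemming
    parts += [f"~{w}" for w in synonyms]      # ~ before a word asks for synonyms
    if numrange is not None:
        low, high = numrange
        parts.append(f"{low}..{high}")        # numeric range search
    if site:
        parts.append(f"site:{site}")          # restrict to a domain
    if filetype:
        parts.append(f"filetype:{filetype}")  # restrict to a file format
    return " ".join(parts)


query = build_query(['"market share"', "beer"],
                    site="ac.uk", filetype="pdf", numrange=(2000, 2005))
print(query)
```

Pasting the printed string into the search box is equivalent to filling in the corresponding fields on the advanced search screen.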


"Google sinker"

■ Works in most search engines (not MSN)
■ Repeat the most important word in your search several times e.g.
  – beer "market share" france germany czech
  – beer "market share" france germany czech czech
  – beer "market share" france germany czech czech czech
      all give different results!
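
The three example searches follow a simple pattern, so the variants can be generated rather than typed. This is a minimal sketch; `sinker_variants` is a hypothetical helper name and the output is just the query text to paste into a search engine.

```python
def sinker_variants(base_terms, key_word, max_repeats=3):
    """Return queries in which key_word appears 1..max_repeats times at the end."""
    return [" ".join(base_terms + [key_word] * n)
            for n in range(1, max_repeats + 1)]


variants = sinker_variants(["beer", '"market share"', "france", "germany"], "czech")
for v in variants:
    print(v)
```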


Google Scholar

■ scholar.google.com
  – "search specifically for scholarly literature, including peer-reviewed papers, theses, books, preprints, abstracts and technical reports from all broad areas of research"
  – no source list
  – includes citations and books
  – limited "advanced search" and author search unpredictable
  – articles ranked by relevance only
  – many articles are priced


Google print

■ Books supplied to Google by publishers
■ Google digitizes them
■ Search on:
  – books about......
  – e.g. books about hubbert
■ 3 books listed near the top of the results list
■ Search within the book
■ Limit on number of pages that can be viewed
■ Information about the book and links to book stores


Google Suggests

■ labs.google.com and click on Google Suggests
■ Start typing in your search and Google suggests additions to your search together with the number of results


Variations on Google

■ Lots of search tools based on Google
■ bananaslug.com
  – takes your search, adds a random term and searches Google
  – pulls up results that would usually be hidden far down your results list
  – can select a random term from a category
      animals, random number, Shakespeare
      random number works well when looking for statistics or market data
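
The idea of adding a random term can be sketched in a few lines. This is not how bananaslug.com is implemented; the word lists and the number range below are invented for illustration, and the function only builds the query text.

```python
import random

# Hypothetical word lists; the site's real categories and vocabulary are not given here.
CATEGORY_WORDS = {
    "animals": ["badger", "heron", "otter"],
    "shakespeare": ["Hamlet", "Prospero", "Falstaff"],
}


def add_random_term(query, category, rng=random):
    """Append one random term from the chosen category to the query."""
    if category == "random number":
        term = str(rng.randint(1, 1000))   # assumed range, chosen arbitrarily
    else:
        term = rng.choice(CATEGORY_WORDS[category])
    return f"{query} {term}"


rng = random.Random(2005)  # seeded only so that repeated runs give the same terms
print(add_random_term('beer "market share" czech', "random number", rng))
print(add_random_term('beer "market share" czech', "animals", rng))
```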


Why try alternatives?

■ Different coverage
■ Different way of sorting results
■ Different search features
■ Different types of resource
■ Compare some of the major search engines using Thumbshots Ranking
  – ranking.thumbshots.com
  – shows overlap in first hundred results
      varies depending on the search


Dogpile Missing Pieces

■ missingpieces.dogpile.com/WhitePaper.pdf
■ missingpieces.dogpile.com/missingpiecestool.aspx
  – compares Google, Yahoo and Ask Jeeves
  – graphic shows how many results are in only 1, 2 or in all 3 search engines for a particular search and includes sponsored links from the top of the pages.
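
The overlap count behind such a graphic is easy to reproduce once you have the result lists. The sketch below assumes you have already collected the URLs by hand; the URLs are made up and nothing here queries the Dogpile tool or any search engine.

```python
from collections import Counter


def overlap_counts(results_by_engine):
    """Count how many URLs were returned by exactly 1, 2, 3... engines."""
    engines_per_url = Counter()
    for urls in results_by_engine.values():
        for url in set(urls):          # ignore duplicates within one engine
            engines_per_url[url] += 1
    return Counter(engines_per_url.values())


# Invented result lists for one search.
results = {
    "Google": ["example.com/a", "example.com/b", "example.com/c"],
    "Yahoo": ["example.com/b", "example.com/c", "example.com/d"],
    "Ask Jeeves": ["example.com/c", "example.com/e"],
}
counts = overlap_counts(results)
for n in sorted(counts):
    print(f"found by {n} engine(s): {counts[n]} result(s)")
```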


Yahoo!

■ search.yahoo.com
■ Launched at the beginning of 2004
■ ? billion pages
■ Search features very similar to Google
■ Key features
  – searches first 500K of a page (Google only 100K)
  – full Boolean search
  – link and linkdomain command better than Google
  – RSS/XML filetype search
  – News alerts available as RSS feed


Yahoo link and linkdomain commands

■ Google
  – link:www.zefix.ch (68 results)
■ Yahoo
  – link:http://www.zefix.ch/ (1100 results)
      finds pages that link to this individual page
  – linkdomain:www.zefix.ch (1290 results)
  – can also exclude your starting point using -site:


MSN

■ www.msn.com
■ Launched in autumn 2004
■ 5 billion pages
■ "Search Builder" = advanced search options
■ No filetype search option in advanced search
■ Results tend to be "consumer" oriented
■ News Alerts available as RSS


Exalead

■ www.exalead.com
■ 1 billion pages
■ Full Boolean search
■ NEAR command - within 16 words of each other
■ Supports wild cards
  – middle and end of the word


Exalead (2)

■ Advanced Search
  – phonetic search
  – approximate spelling
  – automatic stemming
■ Pattern matching
  – good for solving (cheating at?) crossword puzzles
      start pattern with a forward slash, represent each missing letter with a full stop and finish pattern with a forward slash e.g. /.h.s.c..n/
      use a full stop followed by an asterisk to represent one or more letters e.g. /psych.*ist/
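
The two example patterns look like regular expressions, so their effect can be tried out locally on a word list. This sketch assumes the patterns behave like Python's `re` syntax, which is an assumption about Exalead rather than something the slides state; note that in Python `.*` also matches zero letters. The word list is made up.

```python
import re


def match_pattern(pattern, words):
    """Return the words that match a /pattern/ written in the slide's notation."""
    body = pattern.strip("/")              # drop the opening and closing slash
    regex = re.compile(body)
    return [w for w in words if regex.fullmatch(w)]


words = ["physician", "psychologist", "psychiatrist", "psychic", "physicist"]
crossword = match_pattern("/.h.s.c..n/", words)
stems = match_pattern("/psych.*ist/", words)
print(crossword)
print(stems)
```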


Exalead (3)

■ Results
  – can be sorted by date (newest to oldest or oldest to newest)
      select option on Advanced Search screen
  – results display thumbnail of page next to each entry
  – related terms displayed
  – can be viewed by file format e.g. PDF, DOC


Kartoo

■ kartoo.com
■ Meta-search tool
■ Graphical representation of results
■ Extracts related terms from documents
■ Different layouts available including straightforward text listing


Unique search features

■ Google
  – numeric range search, synonym search, define command, Google Suggests
■ Yahoo
  – RSS/XML format
■ Exalead
  – phonetic search, approximate spelling, pattern matching, wildcards, NEAR command, related terms
■ Kartoo
  – graphical representation of results, related terms


Which search tool

■ Synonyms and related terms
  – Google, Exalead, Kartoo
■ Wild cards, variations on words
  – Exalead wild card, phonetic search, approximate spelling, pattern matching
■ Proximity search
  – Exalead
■ Numeric range search
  – Google


Quick facts and reference queries

■ Answers.com
  – "topic-based snapshot"
  – 100 authoritative encyclopedias, dictionaries, glossaries and atlases
■ Wikipedia
  – www.wikipedia.org
  – free-content encyclopaedia that anyone can edit
  – editors required to compile a balanced article including references to other sources
  – several language versions
  – good for quick reference and for links to other related sources


Quick facts and reference queries (2)

■ More examples:
  – acronymfinder.com
  – dictionary.com
  – Encarta
  – encyclopedia.com
  – brainboost.com


Evaluated listings

■ Annotated directories on a particular subject, industry or type of information
■ Provide access to recommended resources on a topic
■ Expert human assessment of resources
■ Examples:
  – eco5.com for finance and economics
  – Biogate biogate.lub.lu.se for "1000 best links in the biological sciences"
  – Official Statistics on the Web www.library.auckland.ac.nz/subjects/stats/offstats/


Evaluated listings

■ How do you find them?
  – BUBL Link bubl.ac.uk
  – Pinakes, a subject launchpad www.hw.ac.uk/libWWW/irn/pinakes/pinakes.html
  – professional or trade association
  – by personal recommendation
  – by chance


Meta search tools

■ Take your search and run it in several search engines at once
■ For:
  – saves time and effort
  – combined results sometimes better than individual search tools
  – some arrange results into folders e.g. Killerinfo
■ Against:
  – cannot use the advanced search options of individual search tools
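
What a meta search tool does with the individual result lists can be sketched as follows. The two "engines" are stand-in functions returning invented URLs, and the merging rule (pages found by more engines first, then best rank) is an assumption chosen for illustration, not the method used by any of the tools named in these slides.

```python
def meta_search(query, engines):
    """Run the query in every engine and merge the lists, removing duplicates."""
    merged = {}
    for name, search in engines.items():
        for rank, url in enumerate(search(query), start=1):
            entry = merged.setdefault(url, {"best_rank": rank, "engines": []})
            entry["best_rank"] = min(entry["best_rank"], rank)
            entry["engines"].append(name)
    # Pages found by more engines come first; ties are broken by best rank.
    return sorted(merged.items(),
                  key=lambda kv: (-len(kv[1]["engines"]), kv[1]["best_rank"]))


# Stand-in engines with made-up results; a real tool would call each search service.
def engine_a(query):
    return ["example.com/1", "example.com/2", "example.com/3"]


def engine_b(query):
    return ["example.com/2", "example.com/4"]


merged_results = meta_search("beer market share",
                             {"Engine A": engine_a, "Engine B": engine_b})
for url, info in merged_results:
    print(url, info["engines"])
```

Because every engine receives the same plain query string, engine-specific advanced options have no place in this design, which is the drawback listed under "Against".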


Examples of meta search tools

■ kartoo.com
■ killerinfo.com
■ vivisimo.com
■ ixquick.com
■ dogpile.com
■ turboscout.com (searches one at a time)
■ turbo10.com (build your own!)


Turboscout

■ Interface to a range of different search tools and types of resources
  – 21 "standard" search tools e.g. Google, Teoma
  – 12 image search tools
  – 17 reference sources e.g. Wikipedia, Scirus
  – 10 news search tools
  – 13 product search e.g. Amazon
  – 9 blog tools e.g. DayPop, Technorati
  – 8 audio visual
■ Type in your search once and click on each tool in turn


Turbo10.com

■ Build your own meta search tool
  – select search tools from existing list of search engines and sites
  – add your own search tool or site
      wizard takes you through the steps
      not all tools and sites can be added
      dependent on cookies to keep and display your collections so problems if you move from one PC to another


RSS and blogs

■ What is RSS?
  – a way of delivering headlines and stories
  – stands for Really Simple Syndication, or Rich Site Summary, or RDF Site Summary
  – more information at www.rba.co.uk/rss/rss.htm
■ Need a program to "read" the feeds
  – web based e.g. bloglines.com
      Bloglines tutorial at tinyurl.com/ap42n
  – desktop program e.g. FeedReader, FeedDemon
  – list of readers at en.wikipedia.org/wiki/List_of_news_aggregators
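
To show what "reading" a feed involves, the sketch below parses a tiny feed with Python's standard library and lists the headlines and links. The feed content is invented and assumes the RSS 2.0 layout (`channel` containing `item` elements); real feeds may use other versions or extra elements.

```python
import xml.etree.ElementTree as ET

# A made-up RSS 2.0 feed with two stories.
RAW_FEED = """<rss version="2.0">
  <channel>
    <title>Example news</title>
    <link>http://www.example.com/</link>
    <description>Invented feed for illustration</description>
    <item>
      <title>First headline</title>
      <link>http://www.example.com/1</link>
      <description>Summary of the first story</description>
    </item>
    <item>
      <title>Second headline</title>
      <link>http://www.example.com/2</link>
      <description>Summary of the second story</description>
    </item>
  </channel>
</rss>"""


def read_headlines(raw_xml):
    """Return (title, link) pairs for every item in the feed."""
    root = ET.fromstring(raw_xml)
    return [(item.findtext("title"), item.findtext("link"))
            for item in root.iter("item")]


headlines = read_headlines(RAW_FEED)
for title, link in headlines:
    print(f"{title} - {link}")
```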


"Raw" RSS feed


RSS feeds in a Feedreader


RSS and blogs

■ Many news services now offer RSS feeds
  – Yahoo News
  – MSN News
  – Newstrove.com, Moreover.com and FeedDirect
  – look for the RSS or XML logo
■ Blogs
  – online journal or diary
  – can range from superficial irrelevances to extreme erudition
  – try and find blogs by industry "gurus"


RSS and blogs

■ Searching for RSS feeds and blogs
  – Yahoo Advanced Search, RSS/XML file format
  – syndic8.com
  – bloglines.com
  – blogdex.net
  – blogdigger.com
  – daypop.com
  – technorati.com
  – feedster.com
  – blogpulse.com


Desktop Search

■ Searches documents, emails, chat, IM messages, web pages etc. stored on your PC
  – search both file names and content
■ Can combine local search with web search
■ Indexes documents and folders on your PC
  – do not generally index documents on other machines on the network (some exceptions)
■ Useful for tracking down "lost" files but....
■ Not a replacement for structured, well managed document folders


Examples

■ Blinkx - www.blinkx.com
■ Ask Jeeves - sp.ask.com/docs/desktop/
■ Google Desktop - desktop.google.com
■ MSN - www.msn.com
■ Copernic Desktop - www.copernic.com
■ Yahoo desktop - desktop.yahoo.com


Desktop search - what is indexed?

■ Your documents, email, cached web pages etc
■ How?
  – varies with search tool e.g. Google first creates a cache and then indexes the cache, others create just an index
■ First time indexing can take a long time
■ Updates handled differently by different tools
  – scheduled e.g. every hour, once a day
  – continually as new or edited documents appear (dynamic indexing)
      can interfere with work
      need to be able to pause or "snooze" indexing if necessary
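
The indexing step can be illustrated with a very small index that covers both file names and file content. This is a toy sketch for plain text files only, not the method of any product named in these slides; it builds the index once and has none of the scheduling, caching or pausing described above. The demo files are created in a temporary folder.

```python
import re
import tempfile
from collections import defaultdict
from pathlib import Path


def build_index(folder):
    """Map each lower-cased word to the files whose name or text contains it."""
    index = defaultdict(set)
    for path in Path(folder).rglob("*.txt"):
        text = path.read_text(encoding="utf-8", errors="ignore")
        for word in re.findall(r"\w+", f"{path.stem} {text}".lower()):
            index[word].add(path.name)
    return index


def search(index, query):
    """Return the files that contain every word of the query."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return set()
    return set.intersection(*[index.get(w, set()) for w in words])


with tempfile.TemporaryDirectory() as folder:
    Path(folder, "budget.txt").write_text("Forecast for the Prague office",
                                          encoding="utf-8")
    Path(folder, "notes.txt").write_text("Budget meeting moved to Friday",
                                         encoding="utf-8")
    index = build_index(folder)

print(sorted(search(index, "budget")))           # matches a file name and a file's content
print(sorted(search(index, "prague forecast")))  # both words must be present
```

Rebuilding the whole index on every change is what makes first-time indexing slow; real tools update incrementally instead.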


Document types supported

■ Varies depending on the desktop search tool
■ Crucial when choosing a tool
■ Usually at least MS Office, html, text files
  – may not support OpenOffice, Thunderbird, Eudora, Firefox, Netscape etc.
  – may only index default folders e.g. My documents
■ Preview will vary depending on type of document
■ Check treatment of secure web pages
  – https e.g. bank statements, Intranet pages
■ Check treatment of password protected docs


Also check...

■ Can the content of a particular document format be searched or is only title searched?
■ Can the tool support files without extensions?
■ Do you really need network or Enterprise Search rather than just local PC search?
■ Also check that you have at least the minimum spec on your machine to support the desktop search tool - some are resource hungry
■ Operating systems supported
  – MS, MAC, Linux?


Three useful references

■ UW E-Business Institute "Benchmark Study of Desktop Search Tools" - www.uwebi.org
  – free of charge
■ Desktop Search Handbook - an Office Watch guide - shop.office-watch.com/dsh/
  – US$ 14.95, e-book, updates are free
■ Desktop Detectives, Davey Winder, Information World Review, May 2005, Issue 213, pp.19-21


UWEBI criteria

■ Usability
  – how easy to use, intuitive?
■ Versatility
  – e.g. which document formats are supported?
■ Accuracy
■ Efficiency
  – memory usage, indexing time, index pause options
■ Security
■ Enterprise readiness


Blinkx

■ "blinkx changes the way you interact with all kinds of information by reading the content on your computer screen and automatically linking you to related information - Web sites, the latest news on the Web, even documents and email on your computer."

■ "Without having to actively or explicitly fire off a search or even choose words to search on, IQ uses intelligent analysis of the page a user is reading or writing to find related information, again regardless of its source, whether the local computer, the internet or television."


Blinkx

■ Limited range of file formats supported
■ Indexing is very slow
■ Memory hungry
■ Not intuitive
■ "Intelligent Analysis" loses the plot if you have a wide range of interests


Ask Jeeves

■ sp.ask.com/docs/desktop/
■ Document support limited
■ Not easy to pause indexing
■ Does not search content of PDF, ZIP or Excel files
■ No web history search
■ Can be unstable
  – regularly crashed my machine


Google Desktop Search

■ desktop.google.com
■ File formats supported improving
■ Third party plugins provide support for file formats
■ Indexes documents as you view them, even password protected files unless you tell it not to
■ Includes https files unless you tell it not to
■ Sends back anonymous info about your searches unless you tell it not to
■ Problems with persistent cache and indexes
  – remove function not easy to use


MSN Desktop

■ www.msn.com
■ Document formats support is limited
■ Spots when you are active on your PC and suspends indexing quite quickly
  – also a "snooze" button you can use to stop indexing immediately.
■ Good sort options


Copernic Desktop

■ www.copernic.com
■ Supports "the usual suspects" plus Firebird, Mozilla, Netscape and Thunderbird
  – Advanced options allow you to
      add file types to index
      add text file types to index
■ Searches as you type
■ Dynamic indexing
■ Has a good pause button to stop indexing
■ Uses Alltheweb for the Web search


Copernic Desktop (2)

■ Individual results are displayed in a Quick Preview pane below the search results
■ Results categorised into groups that change depending on search type e.g. email, file, music
■ Search terms are highlighted
■ Enterprise option - Coveo


Yahoo Desktop

■ desktop.yahoo.com
■ Based on X1 Desktop Search
■ Supports over 200 document types + media files
■ Searches inside zip files
■ Good preview
■ No dynamic indexing
■ Enterprise search - X1


Which Desktop Search?

Karen's choice
1. Yahoo
2. Copernic
3. MSN
4. Google
5. Ask Jeeves
6. Blinkx

UWEBI's choice
1. Copernic
2. Yahoo
4. MSN
5. Google
6. Ask Jeeves
11. Blinkx


Alternatives

■ Windows Search
■ Agent Ransack/File Locator Pro
  – www.mythicsoft.com
  – "Unlike other search products FileLocator Pro does not consider any file too small or insignificant to examine."
  – sophisticated (geekish?) search options
  – like Windows Search can take a long time to search your PC


Should you install desktop search?

■ Security issues to consider and conflicts with document management policies
■ Safest one so far seems to be Yahoo
■ All limited in file types supported
  – Yahoo most comprehensive
■ Do not rely on desktop search to find your documents
  – far better to have well organised, structured document folders


The future for desktop search

■ Limited take up at present in the corporate environment
  – some are specifically prohibited
■ Many tools need to improve drastically
  – many will not bother
■ Microsoft to integrate desktop search into the operating system
■ Will still need tools for non-Microsoft platforms and non-Microsoft document formats


Storing and organising resources

■ Organise bookmarks/favorites
■ Add frequently used sites to links tool bar
  – www.rba.co.uk/search/toolbar.htm
■ Copy URLs and descriptions to your own web page or word document
■ Firefox users
  – Copy URL extension
      copy URL, page title and any highlighted text
      paste into application of your choice


Storing and organising resources

■ Netsnippets
  – netsnippets.com
  – "Capture. Organize. Share"
  – stores selected text, whole web pages, files along with your comments
  – organise research and pages into folders or "projects"
  – share that information with colleagues and friends
  – help produce a report from your research


Storing and sharing bookmarks online

■ Store, comment on and organise resources via third party web site
■ Can keep your bookmarks private, make them totally public or share them with selected individuals
■ Ideal for sharing project resources amongst a group of widely dispersed co-workers
■ But service could vanish so make backups
■ Examples
  – Furl.net, Spurl.net, del.icio.us, de.lirio.us, Connotea.org


Contact details

Karen Blakeman
RBA Information Services
email: [email protected]
rba.co.uk
Tel: +44 118 947 2256
Fax: +44 20 8020 0253