Transcript of Ryerson University Library and Archives Searching the Deep Web Winter 2012.
Slide 1
Ryerson University Library and Archives Searching the Deep Web
Winter 2012
Slide 2
Virtual Parking Lot
If you have questions that are too time-consuming, theoretical or
technical to be addressed in this introductory session, e-mail them
to Jay Wolofsky ([email protected]@ryerson.ca). The answer to your
question(s) will be shared with the group.
Slide 3
The Deep Web
The Deep Web is currently 400 to 500 times larger than the commonly
defined Surface Web or WWW (7,500 terabytes of information compared
to 19 terabytes of information in the Surface Web) and is growing
exponentially.
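As a quick check on the figures quoted on this slide, the ratio of the two sizes can be computed directly. This is only arithmetic on the numbers given above; the variable names are illustrative.

```python
# Sizes quoted on the slide, in terabytes.
deep_web_tb = 7_500
surface_web_tb = 19

# How many times larger the Deep Web figure is than the Surface Web figure.
ratio = deep_web_tb / surface_web_tb
print(f"Deep Web is roughly {ratio:.0f} times the size of the Surface Web")
```

The result is a little under 400, which sits at the lower end of the 400 to 500 range stated on the slide.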
Slide 4
Deep Web/Surface Web
The Deep Web (a.k.a. the Invisible Web) contains high-quality
information that is not accessible from conventional search engines
such as Google.
Slide 5
Deep Web/Surface Web
Structured information contained in research databases cannot be
accessed from the Surface Web.
Slide 6
Deep Web/Surface Web
The real problem is the spidering and crawling technology used by
conventional search engines, which return links based on popularity,
not content. Surface Web search results are ranked by the frequency
with which documents link to each other (page rank). The first
results are those that have been referenced most often by other
documents, and not necessarily those with the most relevant or
recent information or content.
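The difference between ranking by popularity and ranking by content can be sketched as follows. This is a deliberately simplified illustration, not how any real search engine works: it counts inbound links as a stand-in for page rank and counts query-term matches as a stand-in for relevance. The page names and texts are invented for the example.

```python
from collections import Counter

# Hypothetical pages: each has some text and a list of pages it links to.
pages = {
    "popular": {"text": "general portal with news and links", "links": []},
    "blog_a": {"text": "my favourite sites", "links": ["popular"]},
    "blog_b": {"text": "more favourite sites", "links": ["popular"]},
    "report": {
        "text": "deep web research report on deep web databases",
        "links": ["popular"],
    },
}


def rank_by_inbound_links(pages):
    """Order pages by how many other pages link to them (popularity)."""
    counts = Counter({name: 0 for name in pages})
    for page in pages.values():
        for target in page["links"]:
            counts[target] += 1
    return sorted(pages, key=lambda name: counts[name], reverse=True)


def rank_by_content(pages, query):
    """Order pages by how often the query terms occur in their text."""
    terms = query.lower().split()

    def score(name):
        words = pages[name]["text"].lower().split()
        return sum(words.count(term) for term in terms)

    return sorted(pages, key=score, reverse=True)


by_links = rank_by_inbound_links(pages)
by_content = rank_by_content(pages, "deep web")
print("Top result by links:  ", by_links[0])
print("Top result by content:", by_content[0])
```

In this toy data set the most linked-to page comes first under link ranking even though it never mentions the query, while the content ranking puts the page that actually discusses the topic first.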
Slide 7
Federated Search Engines
Federated search engines execute simultaneous real-time searches of
the Deep Web using sophisticated software connectors. The results
are collated and presented back to the user in a unified format.
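The idea of simultaneous searching through connectors, followed by collation into one format, can be sketched roughly as below. Everything here is assumed for illustration: the two connector functions, their record layouts, and the unified fields (source, title, year) are hypothetical and do not describe any particular product.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical connectors: each takes a query and returns records
# in that source's own format.
def search_business_db(query):
    return [{"headline": f"Market study: {query}", "yr": 2011}]


def search_medical_db(query):
    return [{"title": f"Clinical review of {query}", "year": 2010}]


CONNECTORS = {"business": search_business_db, "medical": search_medical_db}


def normalise(source, record):
    """Map a source-specific record onto one unified format."""
    return {
        "source": source,
        "title": record.get("title") or record.get("headline"),
        "year": record.get("year") or record.get("yr"),
    }


def federated_search(query):
    """Query every connector at the same time and collate the results."""
    with ThreadPoolExecutor(max_workers=len(CONNECTORS)) as pool:
        futures = {
            name: pool.submit(connector, query)
            for name, connector in CONNECTORS.items()
        }
        results = []
        for name, future in futures.items():
            results.extend(normalise(name, rec) for rec in future.result())
    # Present the most recent items first.
    return sorted(results, key=lambda rec: rec["year"], reverse=True)


for hit in federated_search("sleep"):
    print(hit)
```

Each connector hides the quirks of one source, so adding a new database in this sketch only means adding one more function to CONNECTORS.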
Slide 8
Federated Search Engines
One type, a web spider variant, crawls information from as many
databases as possible, creating a giant uniform index, e.g. Google
Scholar. A more advanced type searches across each database's own
indexing AND crawls information, e.g. Biznar, Mednar, DeepDyve.
Slide 9
Federated Search Engines
There are 3 general types:
The first type searches across each database using its own indexing.
The second type, a web spider, crawls information from as many
databases as possible, creating a giant uniform index, e.g. Google
Scholar, OpenDOAR.
The third type searches across each database's own indexing AND
crawls information, e.g. Biznar, Mednar, DeepDyve.
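The three approaches described on this slide can be contrasted in a small sketch. It assumes a hypothetical Database class whose search method stands in for a database's own indexing; the document titles are invented, and real systems are far more involved.

```python
from collections import defaultdict


class Database:
    """Hypothetical database with its own (native) search."""

    def __init__(self, name, docs):
        self.name = name
        self.docs = docs

    def search(self, term):
        # Stand-in for the database's own indexing.
        return [d for d in self.docs if term.lower() in d.lower().split()]


def native_search(databases, term):
    """Type 1: ask each database to search its own index."""
    return {(db.name, doc) for db in databases for doc in db.search(term)}


def build_uniform_index(databases):
    """Type 2: crawl every document once into a single inverted index."""
    index = defaultdict(set)
    for db in databases:
        for doc in db.docs:
            for word in doc.lower().split():
                index[word].add((db.name, doc))
    return index


def combined_search(databases, index, term):
    """Type 3: merge native results with hits from the crawled index."""
    return native_search(databases, term) | index.get(term.lower(), set())


dbs = [
    Database("biz", ["Retail market outlook", "Bank merger report"]),
    Database("med", ["Market for generic drugs"]),
]
index = build_uniform_index(dbs)
print(sorted(native_search(dbs, "market")))
print(sorted(index["market"]))
print(sorted(combined_search(dbs, index, "market")))
```

In this toy example all three approaches return the same two documents. In practice the trade-off is that a crawled index is built ahead of time and can go stale, while native searches are run live against each source.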
Slide 10
Accessing Deep Web Content
Biznar (Business)
DeepDyve (Multidisciplinary)
E-Print Network (Science and Technology)
Google Scholar (Multidisciplinary)
Highbeam (Multidisciplinary)
HighWire (Multidisciplinary)
Mednar (Medicine)
MetaPress (Multidisciplinary)
OpenDOAR (Multidisciplinary)
Science.gov (Science and Technology)
Scirus (Science and Technology)
Social Science Research Network (Social Sciences)
World Wide Science (Science and Technology)