Searching the Deep Web
LEMA, February 2011
Deep Web Video
Image from express.howstuffworks.com, 14 Feb 11
Surface Web: accessible via general-purpose search engines such as Google and Yahoo!
Deep Web: Not accessible via typical search engines; primarily databases
25%
75%
AKA visible vs. invisible web
1 trillion + Pages
500 trillion+ pages!
The “deep web” contains …
Databases which use dynamic or temporary links
  Often ?, &, CGI, or other elements in the URL
Websites which aren’t indexed, by design or because there are no links to them
Deep web sites
  Google limits the amount of a web site it indexes, an unpublished factor in its secret algorithm
  At one point, only 110K
Formats that aren’t currently supported
  Google now shows results for .pdf, .doc, .ppt
The boundary between the surface and deep web is always in flux: search engines incorporate more of the deep web at the same time as more is being added to it
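The URL clues mentioned above (?, &, CGI) can be checked with a few lines of Python. This is only a rough sketch for classroom demonstration: the function name looks_dynamic and the example.org addresses are made up for illustration, and plenty of well-indexed surface pages also use query strings, so a True result is a hint rather than proof that a page lives in the deep web.

```python
from urllib.parse import urlsplit


def looks_dynamic(url: str) -> bool:
    """Rough heuristic: True if the URL has a query string or a CGI-style path."""
    parts = urlsplit(url)
    has_query = bool(parts.query)          # text after "?", with pairs joined by "&"
    has_cgi = "cgi" in parts.path.lower()  # e.g. /cgi-bin/catalog
    return has_query or has_cgi


if __name__ == "__main__":
    # Hypothetical addresses, used only to show the three cases.
    for url in [
        "https://example.org/about.html",
        "https://example.org/cgi-bin/catalog",
        "https://example.org/search?term=metabolism&page=2",
    ]:
        print(url, looks_dynamic(url))
```

Students can paste in addresses from their own searches and discuss why a database results page often looks different from a static page.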
Deep Web: Why important?
Studies show that students’ searching habits are fairly ingrained by college
  Use Google for everything
  Only look at the 1st page of results
  Assume trustworthiness of web sites
Rich source of in-depth material not accessible through a typical Google search
Expose students now to richer and more authoritative resources.
Students need to understand …
The best results are NOT in the top 10
Everything’s NOT on the web
Google does NOT search the whole web
Everything’s NOT free
Everything’s NOT trustworthy
Searching/Research is NOT always easy
How can we help our students be better searchers?
Introduce them to the idea that Google isn’t everything & why
Reinforce the idea of evaluating resources
Make them better “surface” searchers
  Many information needs can be met with the surface web
  Easy yet “advanced” Google searching techniques
Better alternatives to the “surface web” & how to effectively search these alternatives
  Databases!
  Familiarity with “deep” sites on a particular topic
    Example: Primary materials available at Library of Congress
    Example: Legislative info at thomas.loc.gov
  Familiarity with portals and directories
Three simple techniques for being a better Google searcher …
Phrase searching
  "xxx xxxx"
Searching the title of web pages
  intitle:xxx or intitle:"xxx xxxx"
  Example: intitle:"climate change"
  Example: intitle:unicorn
Specifying a site
  site:.xxx or site:xxx.com
  Example: egypt site:washingtonpost.com
  Example: "climate change" site:.gov
NOTE:
1. No space after the colon
2. Lowercase commands
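The three techniques above can also be written out as a small Python sketch that assembles the query text, which makes the two rules in the NOTE easy to see (no space after the colon, lowercase command). The helper names phrase, intitle, and site are invented for this illustration; they only build the string you would type into the search box and do not contact Google.

```python
def phrase(words: str) -> str:
    """Wrap the words in double quotes so they are searched as an exact phrase."""
    return f'"{words}"'


def intitle(term: str) -> str:
    """Restrict a term to page titles; multi-word terms are quoted first."""
    if " " in term:
        term = phrase(term)
    return f"intitle:{term}"  # lowercase command, no space after the colon


def site(domain: str) -> str:
    """Restrict results to one site (washingtonpost.com) or one domain type (.gov)."""
    return f"site:{domain.lower()}"


if __name__ == "__main__":
    print(phrase("Howard Morris"))
    print(intitle("climate change"))
    print(intitle("unicorn"))
    print(f"egypt {site('washingtonpost.com')}")
    print(f"{phrase('climate change')} {site('.gov')}")
```

Each printed line matches one of the slide's examples and can be pasted directly into a search box.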
Let’s try a site: search …
Look for a Washington Post article on the B-52s
Now let’s try a phrase search …
First, try Howard Morris as a simple keyword search. How many hits?
Now try it as a phrase: “Howard Morris”. How many hits?
Now let’s try an intitle: search
First, just search for “climate change”. How many hits?
An intitle: search
Now try searching for “climate change” in the title of the web page. How many hits?
Searching the Deep Web
LVHS Library Web Page: Deep Web link on the left
Google search for your topic and add the keyword database
  Ex: Plane crashes database
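The "add the keyword database" tip can be shown as a short Python sketch that appends the word to a topic and builds the matching search link. The function name database_search_url is hypothetical, and the sketch assumes Google's usual search address with the query carried in the q parameter; it builds the link but does not fetch anything.

```python
from urllib.parse import urlencode


def database_search_url(topic: str) -> str:
    """Append the keyword "database" to a topic and build a Google search link."""
    query = f"{topic} database"
    # urlencode turns spaces into "+" so the query is safe inside a URL.
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    print(database_search_url("plane crashes"))
    # prints https://www.google.com/search?q=plane+crashes+database
```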
The Deep Web: A Comparison
Using Google, search on the term metabolism
Open a separate tab, go to www.science.gov and search metabolism again
Looking at the top ten results of each, which provided generally “better” information?
How difficult/easy is it to pursue your search in related fields?
Directories/Portals of Interest
ipl2
  January 2010
  Merger of Internet Public Library and Librarians’ Internet Index
  Librarians and Information Science Professionals
  Hosted by Drexel University’s College of Information Science & Technology
Infomine
  University-level scholarly resources
  Librarian built and maintained
  University of California
Virtual Private Library
Other Resources
LVHS Library Web Page: Deep Web link on the left
Going Beyond Google: The Invisible Web in Learning and Teaching by Jane Devine and Francine Egger-Sider, 2009
  Not as up-to-date as web resources, but very focused on teaching
Any questions?