
Search Tips or with competition with search robots

Inspired by Mary Ellen Bates’ workshop “Tips From a Super Searcher: Getting the Most From the Web and Online Sources”, Prague, 2003.

Toshka Borisova

AUBG Freedom Forum Journalism Library Coordinator

19 and 26 June 2003 Toshka Borisova 2

Search Tips

The World Wide Web contains more information than any other single resource in existence today. Finding the information you are looking for among the billions of web pages on the web can be tough. This guide of search tips will have you on the road to finding information quickly and effectively.

- Web search tips
- The invisible web


Online Search Strategies

What are you looking for?

- Full text or abstracts?
- Current material or 10 years back?
- Basic or advanced material?
- Short or in-depth articles?
- Any "validating" sources?
- Exact match or something close?
- Leads to identify experts to call?
- White papers (official sets of proposals in specific policy areas), statistics and other info more likely to be on web sites?


Online Search Tips

- Use the "advanced search" option: http://www.aubg.bg/library/text.php?i=68
- Google: Well known as the "king of search," this engine has one of the largest databases of web pages in the world. Fast, accurate results are common here, and chances are good that if you can't find it in Google, it's not meant to be found.


Online Search Tips

- Plan on two separate search sessions
- Be sure to value your time
  – White paper on the true cost of searching the open web vs. the professional online services: www.factiva.com/infopro/BusIntellletter.pdf
- Assume you will find something
- We have higher relevance expectations than our patrons
- Watch for what's not online


Online Search Tips

- Watch for references to "grey literature": "That which is produced on all levels of government, academics, business and industry in print and electronic formats, but which is not controlled by commercial publishers."
- Include www or http in your search strategies to find mentions of web sites
- Always use several tools for the same search
- Watch for alternate spellings and phrasings
- Use same words in different order
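
The last two tips can be automated when a search has to be repeated across several tools. The following is a minimal sketch in Python, not part of the original workshop: the function name `query_variants` and the sample terms are assumptions chosen for illustration. It lists every word order for a set of terms and repeats the list for each alternate spelling supplied.

```python
from itertools import permutations

def query_variants(terms, alternates=None):
    """Return every ordering of the terms, repeated for each alternate spelling."""
    alternates = alternates or {}
    # Start with the terms as given, then swap in each alternate spelling.
    term_sets = [list(terms)]
    for word, spellings in alternates.items():
        for spelling in spellings:
            term_sets.append([spelling if t == word else t for t in terms])
    variants = []
    for term_set in term_sets:
        for ordering in permutations(term_set):
            query = " ".join(ordering)
            if query not in variants:  # keep first occurrence only
                variants.append(query)
    return variants

# Hypothetical example: British and American spellings of "grey".
queries = query_variants(["grey", "literature"], {"grey": ["gray"]})
for q in queries:
    print(q)
```

The number of orderings grows factorially with the number of terms, so this is only practical for short queries of two to four words.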


Web Search Tips

- Use tools, not search engines. There is absolutely no pattern
- Wayback Machine: http://www.archive.org/
- Purge your "assumptions cache" regularly
- Keep a trail of where you have been
- Be sure to value your time


Web Search Tips

- When exploring a site, use the Site Map or Site Index
- Use the [Search This Site] feature to find hidden pages
- Know the "power tools" of each search engine
  – Field searches
  – File-type searches
  – Limits by date, language, site
  – Truncation
  – Boolean


Search Tips

- Keyword Search: Many search engines by default offer a keyword search
- Phrase Search
- Boolean Operators: Named after mathematician George Boole, Boolean logic involves the operators AND, OR, NOT, and occasionally NEAR
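
One way to picture AND, OR and NOT is as operations on sets of matching pages. The sketch below is an illustration only: the tiny index and its page numbers are invented, and real search engines implement these operators in their own ways.

```python
# Hypothetical mini-index: search term -> set of page ids that contain it.
index = {
    "library": {1, 2, 3},
    "journalism": {2, 4},
    "archive": {3, 4},
}

def pages(term):
    """Look up the pages for a term; unknown terms match nothing."""
    return index.get(term, set())

and_hits = pages("library") & pages("journalism")  # AND: pages with both terms
or_hits = pages("journalism") | pages("archive")   # OR: pages with either term
not_hits = pages("library") - pages("archive")     # NOT: first term without the second

print(and_hits, or_hits, not_hits)
```

AND can only shrink a result set and OR can only grow it, which is why AND is the narrowing operator and OR the broadening one.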


Online Search Tips

Keyword Search

- Use KWIC (Key Word In Context)
- Try to find synonyms, acronyms
  – http://www.keyworddensity.com/
  – http://www.wordtracker.com/
- Search for key words in title
- Use the "at least X times" feature
  – DJI/Factiva, LexisNexis, Dialog:
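
To make the KWIC idea concrete, here is a minimal Python sketch of a key-word-in-context display. It is an assumption-laden illustration rather than how any of the services above work: the function name `kwic`, the bracket formatting and the sample sentence are all invented.

```python
def kwic(text, keyword, width=3):
    """Return each occurrence of keyword with up to `width` words of context per side."""
    words = text.split()
    lines = []
    for i, word in enumerate(words):
        # Compare without surrounding punctuation and without regard to case.
        if word.strip('.,;:!?"').lower() == keyword.lower():
            left = " ".join(words[max(0, i - width):i])
            right = " ".join(words[i + 1:i + 1 + width])
            lines.append(f"{left} [{word}] {right}".strip())
    return lines

# Hypothetical sample text.
sample = "Grey literature is not controlled by commercial publishers. Much grey material is online."
hits = kwic(sample, "grey", width=2)
for line in hits:
    print(line)
```

Counting the returned lines also gives a rough stand-in for an "at least X times" check, for example `len(kwic(sample, "grey")) >= 2`.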


Web Search Tips

Phrase Searching

Requires the terms to appear in the exact order that they are typed. Most systems that allow phrase searching have the user enter the phrase in quotes.

"national endowment for the arts"

- Phrase Searching: Supported by all
- Google: Phrases may not be on page
- Teoma: "Not always exact matches" (FIXED)
- Openfind: Debuting in beta form on July 5, 2002, Openfind is a new, large independently-built search engine, initially claiming 3.5 billion pages. It is based on research in Taiwan and has a Chinese version as well. None available now
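
The difference between a keyword search and a phrase search can be sketched in a few lines of Python. This is an illustration under simple assumptions (lowercased, whitespace-separated words; invented sample pages), not a description of how any engine listed above matches phrases.

```python
def tokens(text):
    """Split text into lowercase words."""
    return text.lower().split()

def keyword_match(page, query):
    """True if every query word appears somewhere on the page, in any order."""
    page_words = set(tokens(page))
    return all(word in page_words for word in tokens(query))

def phrase_match(page, phrase):
    """True only if the words appear consecutively, in the order typed."""
    page_words, wanted = tokens(page), tokens(phrase)
    return any(page_words[i:i + len(wanted)] == wanted
               for i in range(len(page_words) - len(wanted) + 1))

# Hypothetical pages.
page_a = "the national endowment for the arts announced grants"
page_b = "arts funding for the national museum endowment"
phrase = "national endowment for the arts"

print(keyword_match(page_b, phrase), phrase_match(page_b, phrase))
```

Here `page_b` contains every word of the phrase but not in sequence, so it satisfies the keyword search and fails the phrase search.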


Web Search Tips

Boolean operators

- Just use it wisely
  – Simple ANDs, ORs
  – Narrows results
- Boolean NOT ( - )
  – Exclude meaning
  – Exclude domains
- Boolean OR
  – Crucial synonyms
  – Need more pages


Web Search Tips

To OR or not to OR

- Google: OR in CAPS, advanced
  – Does not always work right
  – yellowstone bison OR buffalo
- AlltheWeb: use ( ) or Advanced Boolean Box
  – yellowstone (bison buffalo)
- AltaVista: normal
  – yellowstone AND (bison OR buffalo)
- Gigablast: Use + (but not the same)
  – +yellowstone bison buffalo
- Teoma
  – yellowstone bison OR buffalo
  – Becomes (yellowstone AND bison) OR buffalo
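
Why the grouping matters can be shown with sets of page ids. The sketch below uses an invented index purely for illustration; it compares the intended query with the way the Teoma example above is regrouped.

```python
# Hypothetical index: term -> set of page ids that contain it.
index = {
    "yellowstone": {1, 2},
    "bison": {1, 3},
    "buffalo": {2, 4, 5},
}
y, bi, bu = index["yellowstone"], index["bison"], index["buffalo"]

grouped = y & (bi | bu)    # yellowstone AND (bison OR buffalo)
ungrouped = (y & bi) | bu  # (yellowstone AND bison) OR buffalo

print(sorted(grouped))
print(sorted(ungrouped))
```

The second form returns every page that mentions buffalo, whether or not it mentions Yellowstone, so the result set grows and loses focus.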


Web Search Tips

Proximity

  – Text matching
  – Citation hunt
  – Plagiarism check
  – Q&A

NEAR and Other Proximity

  – AltaVista only
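
A proximity operator such as NEAR can be sketched as a check on word positions. The Python below is an illustration only: the function name `near`, the default window of 10 words and the sample sentence are assumptions, and each search service defines its own window.

```python
def near(text, first, second, max_gap=10):
    """True if the two words occur within max_gap words of each other, in either order."""
    words = [w.strip('.,;:!?"').lower() for w in text.split()]
    first_at = [i for i, w in enumerate(words) if w == first.lower()]
    second_at = [i for i, w in enumerate(words) if w == second.lower()]
    return any(abs(i - j) <= max_gap for i in first_at for j in second_at)

# Hypothetical sample sentence.
text = "Bison roam the northern range of Yellowstone in winter."
print(near(text, "bison", "yellowstone", max_gap=6))
print(near(text, "bison", "yellowstone", max_gap=3))
```

A tight window suits text matching and plagiarism checks, where the words should sit almost side by side; a wider window suits question-and-answer searches.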


Web Search Tips

Truncation

Searches for variants of a word by using a symbol to represent one or more characters. The most common symbols are * (asterisk), ? (question mark), and ! (exclamation mark). If truncation is not supported by the search engine, use the Boolean operator OR to combine like terms.

- AltaVista: Truncation
- HotBot & MSN: Truncation
- Another term, "Stemming": MSN (e.g., finds "movies" if your search word is "movie")
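
The OR fallback for engines without truncation can be sketched as follows. This is a minimal illustration: the function name `expand_truncation` and the word list are assumptions, and in practice the searcher supplies the variants from their own knowledge of the topic.

```python
def expand_truncation(stem, vocabulary):
    """Build an OR query from the vocabulary words that start with the stem."""
    variants = sorted(w for w in vocabulary if w.startswith(stem))
    return " OR ".join(variants)

# Hypothetical list of candidate words.
vocabulary = {"librarian", "libraries", "library", "liberty", "search"}
query = expand_truncation("librar", vocabulary)
print(query)
```

The result stands in for a truncated search such as librar* on an engine that supports only OR.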


Web Search Tips

Case Sensitive (alaskan pipeline, with the incorrect lowercase "a")

  – AltaVista Advanced or Quoted Simple
  – MIT vs. mit or IT vs. it


Web Search Tips

Wild Card Word in Phrase

Wild Card characters represent undefined letters or numerals in a search term. Wild Card characters allow for retrieval of:

- Singular and plural word forms
- Spelling variations (e.g., British/American spellings)
- Word stems with prefixes and suffixes

* - Represents zero to any number of characters at the beginning or end of a term.
  *GROW* - Possible retrievals: GROW, GROWS, OUTGROWTH

? - Represents exactly one character within a term...
  T??TH - TEETH, TOOTH, TRUTH

...or one character at the end of a term.
  AMIN? - AMINE, AMINO
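
The wild card rules above can be tried out with Python's standard `fnmatch` module, whose `*` and `?` happen to follow the same conventions. This is only a sketch: the word list is invented, and `fnmatch` also gives square brackets a special meaning that the slide does not cover.

```python
from fnmatch import fnmatchcase

def wildcard_hits(pattern, terms):
    """Return the terms matching a pattern where * is any run of characters and ? is exactly one."""
    return [t for t in terms if fnmatchcase(t, pattern)]

# Hypothetical term list.
terms = ["GROW", "GROWS", "OUTGROWTH", "TEETH", "TOOTH", "TRUTH",
         "TRUTHS", "AMINE", "AMINO", "AMIN"]

print(wildcard_hits("*GROW*", terms))
print(wildcard_hits("T??TH", terms))
print(wildcard_hits("AMIN?", terms))
```

Note that AMIN is not retrieved by AMIN? because the question mark requires exactly one character, and TRUTHS is not retrieved by T??TH because the pattern fixes the length at five.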


Web Search Tips

Field Searching

Field searching allows the searcher to designate where a specific search term will appear. Rather than searching for words anywhere on a Web page, fields define specific structural units of a document. The title, the URL, an image tag, or a hypertext link are common fields on a Web page.

How search engines work

- Spidering program - Collect links
- Indexing program - Include metatags
- Search/retrieval program - Sort results
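
The three programs can be sketched as three small Python functions working on an invented, in-memory "web". Everything here is an assumption made for illustration: the `WEB` data, the function names, and the alphabetical sorting, which stands in for the relevance ranking a real engine would apply. Metatags are left out to keep the sketch short.

```python
# Hypothetical in-memory "web": URL -> (page text, outgoing links).
WEB = {
    "a.example": ("search tips for journalists", ["b.example", "c.example"]),
    "b.example": ("library search tools", ["a.example"]),
    "c.example": ("journalism library", []),
    "d.example": ("unlinked page about search", []),
}

def spider(start):
    """Spidering: follow links from a start page and collect every reachable URL."""
    seen, queue = [], [start]
    while queue:
        url = queue.pop(0)
        if url in seen or url not in WEB:
            continue
        seen.append(url)
        queue.extend(WEB[url][1])
    return seen

def build_index(urls):
    """Indexing: map each word to the set of URLs whose text contains it."""
    index = {}
    for url in urls:
        for word in WEB[url][0].split():
            index.setdefault(word, set()).add(url)
    return index

def retrieve(index, word):
    """Search/retrieval: return matching URLs, sorted alphabetically for simplicity."""
    return sorted(index.get(word, set()))

crawled = spider("a.example")
index = build_index(crawled)
print(retrieve(index, "search"))
```

The page d.example mentions "search" but no crawled page links to it, so the spider never collects it and it cannot be retrieved. That is one simple way content ends up outside a search engine's reach.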


Web Search Tips

Link Searching

Pages include a link to the specified URL.

- Link Updates, Impact Analysis
- Best at AltaVista, AlltheWeb
  – Can have different results for http://www.name.org/
- Example: http://www.freedomforum.org/ - finds pages with links to this site

title:searching will look for the word 'searching' in the title of a Web page. Hits have the term(s) in the HTML title element. title: "search engines"


Web Search Tips

Field Searching

- IP: Page is in the specified IP range. Incomplete numbers are truncated. ip:216.32.120 finds any computer in 216.32.120.*
- Site: Results are only from the specified site. site:nasa.gov - finds pages at NASA's Web site
- Suburl: Pages have the term(s) somewhere in the URL (host name, path, or filename). suburl:searchenginewatch
- URL: Result must be exactly this URL and nothing else. url: www.slashdot.com/index.html
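
How these field restrictions behave can be sketched by testing them against a single page record. The Python below is an illustration under stated assumptions: the record layout, the function name `matches`, and the sample page (including its IP address) are hypothetical, and real engines may interpret each field differently.

```python
from urllib.parse import urlparse

def matches(page, field, value):
    """Check one field:value restriction against a page record (a dict)."""
    parsed = urlparse(page["url"])
    value = value.lower()
    if field == "title":
        return value in page["title"].lower()
    if field == "site":
        host = parsed.netloc.lower()
        return host == value or host.endswith("." + value)
    if field == "suburl":
        return value in page["url"].lower()
    if field == "url":
        # Compare the whole address without its scheme: an exact match only.
        return page["url"].lower().split("://", 1)[-1] == value
    if field == "ip":
        # An incomplete address matches the whole range beneath it.
        return page["ip"] == value or page["ip"].startswith(value.rstrip(".") + ".")
    raise ValueError(f"unsupported field: {field}")

# Hypothetical page record.
page = {
    "url": "http://www.nasa.gov/missions/index.html",
    "title": "NASA Missions",
    "ip": "216.32.120.7",
}

print(matches(page, "site", "nasa.gov"))
print(matches(page, "ip", "216.32.120"))
print(matches(page, "url", "www.nasa.gov"))
```

The last call is False because url: demands the exact address, while suburl: would accept any fragment of it.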


Web Search Tips

Field Searching

- title: AltaVista, AlltheWeb, HotBot, Lycos, Gigablast
- intitle: Google, Teoma
- url: AltaVista, AlltheWeb, Lycos, Gigablast
- inurl: Google, Teoma
- site: AlltheWeb, Gigablast, Google, Teoma
- link: AltaVista, Google, AlltheWeb, HotBot, Gigablast
- anchor: AltaVista
- image: AltaVista


Web Search Tips

Selected Limits

- Usually on advanced search form
- Language: At most, languages vary
- Date: AlltheWeb, AltaVista, Google, Inktomi
  – Cut out old material, focus search
  – Or to find old information
- File Type: AlltheWeb, AltaVista, Google, Inktomi. PDFs at all, Flash at AlltheWeb
- Media Type: HotBot, MSN, AlltheWeb
- Page Size: AlltheWeb
- IP Range: AlltheWeb standard


Web Search Tips

Diacritics: é

Does e find é? - Sometimes

- Not at Google
  – Exact match on diacritics only
- At other search engines
  – e usually finds e OR é
  – é usually finds only é
- Use English equivalents for special letters and omit diacritics
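
Because engines differ on diacritics, it can help to prepare both forms of a term. The following Python sketch uses the standard `unicodedata` module; the function names and sample words are assumptions for illustration. It only handles letters that decompose into a base letter plus an accent, so special letters such as ø or ß still need a manually chosen English equivalent.

```python
import unicodedata

def strip_diacritics(text):
    """Decompose accented letters and drop the combining accent marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def query_forms(term):
    """Return the term as typed plus its unaccented form, without duplicates."""
    plain = strip_diacritics(term)
    return [term] if plain == term else [term, plain]

print(query_forms("café"))
print(query_forms("cafe"))
```

Searching both forms, joined with OR where the engine allows it, covers engines that match diacritics exactly as well as those that do not.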


Web Search Tips

Counting Complexities

- Search Engines Can’t Count
- Only the big search engines count, top10 search engines
- Numbers constantly change
  – From one page of results to the next
  – From one minute to the next
- Try reloading for more


Web Search Tips

Feature Inconsistencies

- Database Changes
  – Constant
  – If they don’t . . .
    • They get old, out-of-date, dead links
  – Size Changes Often Sudden
  – Database Reversions
  – Searching Failures And Other Unexpected Results

On the Fly Analysis

- Always Question Results
- Evaluate and Compare
- Find one unique, low-posted term
  – Use for search engine comparisons
  – Evaluate change over time

“On-the-Fly Search Engine Analysis.” ONLINE 23(5):63-66, Sept. 1999. onlinemag.net/OL1999/net9.html


Web Search Tips

- SEO - Search Engine Optimization
- SearchEngineShowdown.com
  – More on Advanced Features
  – Feature Chart
  – Detailed Reviews
- Search Engine Watch: http://www.searchenginewatch.com/facts/ataglance.html


Inconsistencies

Low Recall or "I am not finding any sites on my topic!!"

- Have I chosen the correct database?
- Have I been too specific in formulating the search?
- Have I included all possible terms and word forms? Should I use truncation?
- Was Boolean logic used correctly?
- Did I make a technical error, e.g., spelling or command syntax?

Low Precision or "I found hundreds of citations and many are not on my topic!!"

- Delete less specific synonyms and ambiguous terms
- Search fewer fields, e.g., just the title field or URL
- Add additional facets with AND or NOT
- Add restrictions, e.g., date of publication
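
Recall and precision have standard definitions that are easy to compute once the relevant pages are known. The Python sketch below uses invented page ids to show the two failure modes described above; in a real search the full set of relevant pages is unknown, so these figures can only be estimated.

```python
def precision_recall(retrieved, relevant):
    """Precision: share of retrieved pages that are relevant.
    Recall: share of relevant pages that were retrieved."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical topic with eight relevant pages (ids 1 to 8).
relevant = [1, 2, 3, 4, 5, 6, 7, 8]
narrow = precision_recall(retrieved=[1, 2], relevant=relevant)        # too specific
broad = precision_recall(retrieved=range(1, 41), relevant=relevant)   # too general

print(narrow)
print(broad)
```

The narrow search finds only relevant pages but misses most of them (low recall); the broad search finds them all but buries them among irrelevant ones (low precision).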


The Invisible Web

What is it?

It consists of searchable information resources whose contents cannot be indexed by traditional search engines.

- Content in databases
- Professional online services
- Non-ASCII files
- Sites that require log-in or registration
- Real-time information
- Dynamically-created web pages
- Discussion forums and BBSs


Searching the Invisible Web

- Much "invisible" content has a "visible web" front
- Some databases are opening up
- Google searches PDF, XLS, RTF, DOC files


Searching the Invisible Web

- Use directories and portals
  – Open Directory Project http://www.dmoz.org is the largest, most comprehensive human-edited directory of the Web. It is constructed and maintained by a vast, global community of volunteer editors.
  – Librarian’s Index to the Internet http://www.lii.org
  – Subject-specific directories http://www.econ.bg
- Experts and info pros watch for this material
  – Experts.com www.experts.com: A reliable and diverse source of experts, many of whom are outside the academic arena.
  – Yahoo - http://groups.yahoo.com/
- Search for database or forum along with subject terms


Searching the Invisible Web

- Use meta-search engines
  – DogPile.com
  – MetaCrawler.com
- Use Teoma.com's "Experts' Links"
- Scan the libraries of relevant discussion groups
- Lurk on lists


Searching the Invisible Web

- Use reverse link look-up to find "more like this"
- Google and Alta Vista: link:www.BatesInfo.com
- HotBot: http://www.hotbot.com/ link:www.aubg.bg/fforum - use [Links to this URL]
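
The idea behind reverse link look-up can be sketched with a small link table. The Python below is an illustration only: the link data and function names are invented, and a real link: search draws on the engine's own record of links rather than a local table.

```python
# Hypothetical link data: page -> pages it links to.
links = {
    "a.example": ["www.BatesInfo.com", "b.example"],
    "b.example": ["www.BatesInfo.com"],
    "c.example": ["a.example"],
}

def inbound(target, links):
    """Return the pages that link to the target, as a link: search would."""
    return sorted(page for page, outgoing in links.items() if target in outgoing)

def more_like_this(target, links):
    """Return the other pages that the target's referrers also link to."""
    related = set()
    for page in inbound(target, links):
        related.update(links[page])
    related.discard(target)
    return sorted(related)

print(inbound("www.BatesInfo.com", links))
print(more_like_this("www.BatesInfo.com", links))
```

Pages that point to a known good site tend to point to similar sites as well, which is what makes the referrers a useful source of "more like this" leads.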


The Invisible Web

Invisible Web Directories

- http://www.invisibleweb.com/: The InvisibleWeb Catalog™ contains over 10,000 databases and searchable sources that have been frequently overlooked by traditional searching.
- CompletePlanet.com: Contains 103 searchable databases
- DirectSearch: Difficult to use but extensive
- http://www.internets.com/: They have assembled the largest filtered collection of useful search engines and newswires anywhere on the World Wide Web. There are 1-2 billion documents on the "surface web". The deep web is estimated to be approximately 500 billion documents.
  – Good hierarchy of databases


Web Search Tips

Set aside one afternoon every two weeks for your web reading!!!

More info: http://www.BatesInfo.com