Internet Librarian International 2008
Maintaining Search Quality or How to Beat the Search Engine Crunch
Karen Blakeman, RBA Information Services
22 October 2008 Karen Blakeman www.rba.co.uk
This work is licensed under a Creative Commons Attribution 2.5 License
This work is licensed under a Creative Commons Attribution 3.0 License
Photo by Florian Knust http://www.flickr.com/photos/florianknust/471111469/
Search engine meltdown
AlltheWeb Livesearch - gone
AlltheWeb itself still alive but no further development (uses Yahoo databases)
Ask
Big News - gone
Live.com
link and linkdomain commands – now you see ‘em, now you don’t
Academic Live, Live Books – gone
Yahoo
NOT command, parentheses, Mindset - gone
Exalead
Approximate spelling (transformed into smellslike, spell slike!)
‘regular expression’ internal masking of letters - gone
Accoona – gone
http://www.rba.co.uk/wordpress/2008/10/05/accoona-is-no-more/
What’s new?
Google
Knol, much improved Google Finance, lots of tweaks to existing services
Ask
Yet another makeover, new layout, return to “Ask a Question”
MSE360
http://www.mse360.com
Silobreaker - http://www.silobreaker.com/
Search visualisation tools
Quintura, AllPlus, Cluuz ……
Lots of Web 2.0 ‘stuff’
Cuil
Cuil not so Cool
http://www.rba.co.uk/wordpress/2008/07/28/cuil-not-so-cool/
Search techniques – a reminder
Search engines still search for all of your terms by default
but note that Google also looks for terms in ‘links to’
Double quote marks around phrases
e.g. “climate change”
To exclude pages containing a term, precede the term with a minus sign (-)
Boolean search
OR, AND, NOT
must use capital letters for the operators
only OR works in Google and even that does not work well
Live.com, Exalead and MSE360 are best (Yahoo has withdrawn NOT, and nested Boolean)
for example chemical engineer AND (inurl:cv OR intitle:cv) AND (oil OR petroleum)
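A nested Boolean strategy like the one above can be assembled from its parts. This is a minimal sketch; the helper names `any_of` and `all_of` are made up for illustration and are not part of any search engine.

```python
def any_of(*terms):
    """Join alternatives with OR inside parentheses, e.g. (oil OR petroleum)."""
    return "(" + " OR ".join(terms) + ")"


def all_of(*parts):
    """Join required parts with AND; the operators stay in capital letters."""
    return " AND ".join(parts)


query = all_of(
    "chemical engineer",
    any_of("inurl:cv", "intitle:cv"),
    any_of("oil", "petroleum"),
)
print(query)
```

The printed string can be pasted into an engine that supports nested Boolean, such as the ones listed above.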
Search techniques – a reminder (2)
Focus your search on areas of the document
inurl: for example “process engineer” inurl:cv
intitle: for example “process engineer” intitle:cv
Search sites or domains using the site: command
e.g. site:statistics.gov.uk, site:gov.uk
Imagine what you would like to appear in your ideal document and include those terms in your strategy
Ask your question or partially answer your question in your strategy
“How fast can a hippopotamus run?”
“A hippopotamus can run at”
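The phrase and field commands on this slide can be combined into one strategy and turned into a search link. A rough sketch: the helper names are hypothetical, and the Google base URL with its `q` parameter is an assumption about how the engine accepts queries.

```python
from urllib.parse import urlencode


def phrase(text):
    """Wrap a phrase in double quote marks."""
    return f'"{text}"'


def command(name, value):
    """Build a field command such as inurl:cv or site:gov.uk (kept lower case)."""
    return f"{name.lower()}:{value}"


def search_url(base, *parts):
    """Join the parts with spaces and URL-encode them as the q parameter."""
    return base + "?" + urlencode({"q": " ".join(parts)})


# Assumed base URL; swap in the engine you are testing.
url = search_url(
    "https://www.google.com/search",
    phrase("process engineer"),
    command("inurl", "cv"),
)
print(url)
```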
Search techniques – a reminder (3)
Repeat your key search terms in your strategy
chocolate production UK france belgium
chocolate production UK france belgium belgium belgium
give different results
Change the order of your terms
chocolate production Belgium Switzerland
production Belgium Switzerland chocolate
different results
See the summary and comparison chart for the major search engines at http://www.rba.co.uk/search/compare.pdf and http://www.rba.co.uk/search/compare.shtml
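To try the repeat and reorder tricks systematically, the variants can be generated rather than typed by hand. A small sketch with made-up helper names; it only builds the strings and does not run any searches.

```python
from itertools import permutations


def reordered(terms):
    """Return every ordering of the search terms as a query string."""
    return [" ".join(order) for order in permutations(terms)]


def with_repeats(terms, key, times):
    """Append the key term `times` extra times to the end of the strategy."""
    return " ".join(list(terms) + [key] * times)


orders = reordered(["chocolate", "production", "Belgium", "Switzerland"])
repeated = with_repeats(
    ["chocolate", "production", "UK", "france", "belgium"], "belgium", 2
)
print(len(orders), "orderings to try")
print(repeated)
```

Four terms already give 24 orderings, so in practice it is worth trying only a handful.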
File format search
Use advanced search options to limit your search to file types or format:
pdf or doc for government or industry/market reports
xls for data and statistics
ppt or pdf for presentations
Run in at least Google, Yahoo and Live
Looking for experts on a topic or presentations?
Slideshare http://www.slideshare.net/
authorSTREAM http://www.authorstream.com/
YouTube http://www.youtube.com/
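The file format suggestions at the top of this slide can be written down as a lookup. This sketch assumes the engine accepts a `filetype:` command on the search line, which not every engine does; the slide itself recommends the advanced search options. The names `FORMATS_BY_NEED` and `filetype_queries` are illustrative only.

```python
# Formats suggested on the slide, keyed by the kind of document wanted.
FORMATS_BY_NEED = {
    "reports": ["pdf", "doc"],
    "statistics": ["xls"],
    "presentations": ["ppt", "pdf"],
}


def filetype_queries(terms, need):
    """Return one query per suggested format (assumes a filetype: command)."""
    return [f"{terms} filetype:{ext}" for ext in FORMATS_BY_NEED[need]]


for query in filetype_queries("chocolate production statistics", "statistics"):
    print(query)
```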
Do not capitalise commands
Google
site:charity-commission.gov.uk grants - about 5,950
Site:charity-commission.gov.uk grants - 3 (all my presentations giving this as an example!)
Google sees the capital ‘S’ and treats the strategy as a phrase
Yahoo
site:charity-commission.gov.uk grants - 1,020
Site:charity-commission.gov.uk grants - 495,000
With a capital ‘S’ Yahoo treats all parts of the search including the domain as separate key words
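If strategies are stored or generated, a simple check can catch a capitalised command before it is run. A minimal sketch: the list of command names is taken from commands mentioned in these slides and is an assumption, not a complete list for any engine.

```python
import re

# Command names mentioned in this deck; extend as needed (assumed list).
COMMAND = re.compile(
    r"\b(site|inurl|intitle|filetype|link|linkdomain|feed|region):",
    re.IGNORECASE,
)


def lowercase_commands(query):
    """Lower-case any known command prefix, leaving the rest of the query alone."""
    return COMMAND.sub(lambda match: match.group(1).lower() + ":", query)


print(lowercase_commands("Site:charity-commission.gov.uk grants"))
```

Boolean operators such as AND and OR are not touched, since those must stay in capitals.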
Unique Google search features
Automatically looks for variations on your terms
to stop it, precede your terms with plus signs e.g. air +pollution or put your term in double quotes e.g. “Smyth”
Synonym search
precede your search terms with a tilde (~)
Numeric range search
now on advanced search page
can be weights, distances, years, prices
Command line syntax is
search term(s) first value..second value unit of measurement
TV advertising spend forecasts 2005..2012
toblerone 1..5 kg
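The numeric range syntax above is regular enough to build from its pieces. A short sketch; `numeric_range` is a made-up helper that only formats the string.

```python
def numeric_range(terms, first, second, unit=""):
    """Format: search term(s) first..second [unit of measurement]."""
    query = f"{terms} {first}..{second}"
    return f"{query} {unit}" if unit else query


print(numeric_range("TV advertising spend forecasts", 2005, 2012))
print(numeric_range("toblerone", 1, 5, "kg"))
```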
Unique Google search features (2)
Proximity
use the asterisk (*) to stand in for one or more terms
macular * degeneration picks up
macular retinal degeneration
macula disciform degeneration
macular choroidal degeneration
macular vitelliform degeneration
macular pigmentary degeneration
adding extra * changes the results
add or remove spaces between * * to change ranking of results
why does it do that – who knows?
no information on maximum number of terms of separation
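Since extra asterisks and the spacing between them both change the results, it can help to list the variants and run each in turn. A sketch with a hypothetical helper; it makes no claim about how Google interprets each form.

```python
def asterisk_variants(first, last, max_stars=3):
    """List proximity variants: 1..max_stars asterisks, spaced and unspaced."""
    variants = []
    for count in range(1, max_stars + 1):
        spaced = " ".join("*" * count)  # e.g. "* * *"
        variants.append(f"{first} {spaced} {last}")
        if count > 1:
            unspaced = "*" * count  # e.g. "***"
            variants.append(f"{first} {unspaced} {last}")
    return variants


for query in asterisk_variants("macular", "degeneration"):
    print(query)
```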
Firefox – Customise Google Add-on
Adds numbers to Google search results (position counter)
Links to other search engines
Stream search result pages
Add links to Wayback Machine
Use something other than Google
Ask
http://www.ask.com/, http://www.ask.co.uk/
Suggestions for narrowing down or expanding your search
Particularly good for blogs
Big News gone
US search interface revamped
new Q&A tab
Exalead
http://www.exalead.com/
http://www.exalead.co.uk/
Supports wild cards
asterisk (*) at the end of a word
pollut* finds pollute, pollutant, polluting etc.
NEAR - finds words within 16 terms of one another
NEAR/n finds words within n terms of one another
climate NEAR/3 change
Approximate spelling, phonetic search (?)
Regular expression (internal masking of letters)
Feedback from users is that there is more European content, which seems to be given priority
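The Exalead wild card and NEAR syntax described on this slide can be generated with two small helpers. A sketch only; `truncate` and `near` are illustrative names and simply format the strings shown above.

```python
def truncate(stem):
    """Add the end-of-word wild card, e.g. pollut -> pollut*."""
    return stem + "*"


def near(left, right, distance=None):
    """Build NEAR (default distance) or NEAR/n (within n terms)."""
    if distance is None:
        return f"{left} NEAR {right}"
    if distance < 1:
        raise ValueError("distance must be at least 1")
    return f"{left} NEAR/{distance} {right}"


print(near("climate", "change", 3))
print(near(truncate("pollut"), "river"))
```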
Live Search
http://www.live.com/
Results tend to be more consumer oriented
Has the most up to date database
Possibly has the most extensive database of web pages
Good image search option
Feed command for locating RSS feeds on a specified web site
site:bbc.co.uk feed:bbc.co.uk
Revamped interface but no improvement in advanced search screen
Axed: link commands, Books and Academic Live
Yahoo!
http://search.yahoo.co.uk/ http://search.yahoo.com/
Boolean AND, OR
NOT no longer available – use the minus sign.
parentheses do not work
Indexes first 500 K of a document (Google 101 K)
Square brackets round terms to pick up terms on the page in the order specified
[carbon emissions trading]
Region command (inherited from Inktomi)
region: e.g. region:europe, region:mediterranean
others are africa, asia, centralamerica, northamerica, southamerica, mideast, southeastasia, downunder
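The region values listed above can be checked before a query is run, which avoids silent failures from a misspelt region. A sketch; the set contains only the values named on this slide, and `with_region` is a made-up helper.

```python
# Region values as listed on the slide.
REGIONS = {
    "europe", "mediterranean", "africa", "asia", "centralamerica",
    "northamerica", "southamerica", "mideast", "southeastasia", "downunder",
}


def with_region(terms, region):
    """Append a region: command, rejecting values not in the slide's list."""
    if region not in REGIONS:
        raise ValueError(f"unknown region: {region}")
    return f"{terms} region:{region}"


print(with_region("[carbon emissions trading]", "europe"))
```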
MSE360.com
http://www.mse360.com/
See reviews at
http://www.rba.co.uk/wordpress/2008/10/05/mse360-search/
http://www.rba.co.uk/wordpress/2008/10/06/update-on-mse360/
Full Boolean nested search options
No advanced search screen but can use commands e.g. filetype:, site:
‘Tiered’ results – Web, Wikipedia, blogs
Customise results layout
Tags sites that you have already visited (Firefox only at present)
Quick to respond to bug reports and fix problems
Zuula.com
Intelways.com
News
Search engine news options e.g. Yahoo, Google
have only the last 30 days of free news
advanced search options limited and unreliable
no source list, and sources frequently change
key industry publications may not be included
Google News Archive http://www.google.com/archivesearch
some sources going back 200 years
many articles are priced (before you buy check other sources)
Silobreaker - http://www.silobreaker.com/
Chipwrapper - http://www.chipwrapper.co.uk/
Silobreaker http://www.silobreaker.com
covers free resources
news, blogs, video, images
market trends
geographical location of stories
people
networks
Chipwrapper http://www.chipwrapper.co.uk
Google Custom Search engine
Searches everything available on 15 free UK News Sites
No date sort option but typing in the year usually works
Yahoo Finance
Google Finance
Let RSS take the strain
Blog searching
Google Blogsearch
http://www.google.com/blogsearch
Ask – Blogs and feeds
http://www.ask.com/
Exalead
http://www.exalead.com/
limit search to Site Type Blog
Live Search
http://www.live.com/ and select Feeds
Blog and feed search engines
Technorati.com, Blogpulse.com
Blogpulse search and trends
Click on the graph to see ‘trends’
Blogpulse trends
Twitter
http://www.twitter.com/
Micro-blogging - 140 characters per ‘tweet’
What are people saying about you?
Oh dear!
Searching Twitter and Tweets
How Companies Use Twitter to Bolster Their Brands - BusinessWeek
http://www.businessweek.com/technology/content/sep2008/tc2008095_320491.htm
Searching public tweets
http://www.twitterment.com/ (?)
http://www.tweetscan.com/ (?)
Searching hashtags e.g. #ili2008
http://search.twitter.com/
http://www.hashtags.org/ - deceased?
http://www.twemes.com/ - delayed reporting, so not ideal for keeping up with conference tweets in real time
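When tweets have been collected by hand or from one of the tools above, the hashtags can be pulled out with a simple pattern. A rough sketch, assuming a hashtag is a `#` followed by letters, digits or underscores; the sample tweet text is invented for illustration.

```python
import re

# Assumed definition: '#' followed by word characters.
HASHTAG = re.compile(r"#(\w+)")


def hashtags(tweet):
    """Return the distinct hashtags in a tweet, lower-cased and sorted."""
    return sorted({tag.lower() for tag in HASHTAG.findall(tweet)})


sample = "Search engine crunch talk at #ILI2008, slides soon #ili2008 #search"
print(hashtags(sample))
```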
Twemes – http://twemes.com/
http://search.twitter.com/
pipl
http://www.pipl.com/
Review at http://www.rba.co.uk/wordpress/2007/05/05/pipl-people-search-beta/
Searches ‘hidden’ web + Google search
blog search, Google Groups, LinkedIn, Flickr, Google Scholar, Electoral Roll, Directories, Amazon, Hoovers, Zoominfo etc.
Google web search results not the same as an ordinary Google search – they incorporate terms such as resume, CV
Zoominfo - Karen Blakeman’s verified profile
Information ‘verified’ by Karen Blakeman
View the ‘references’ (web pages) to see the information in context
LinkedIn
Facebook
Cluuz
http://www.cluuz.com/
“Cluuz … core technology understands the relationship between the entities, terms, or persons searched leading to more relevant, easy to understand search results”
Not totally intuitive but the network visualisation is ‘cool’
The links in the network visualisation do not always relate to the same person or organisation but they are usually working in a similar field or subject area
Results change from one day to the next, one hour to the next, but still worth a look
Cluuz
Quintura.com
AllPlus.com
Create your own search engine
Examples:
AlacraSearch
http://www.alacra.com/alacrasearch
pipl
http://www.pipl.com/
Chipwrapper
http://www.chipwrapper.co.uk/
Google Custom Search Engines
http://www.google.com/coop/cse
can be hosted on your own site or on Google
http://www.rba.co.uk/sources/energy.shtml
http://www.google.com/coop/cse?cx=014304212364962740038:tui4ebh5r_a
‘Disappearing’ pages
Search engine cache copies
Google, Yahoo, Live, Ask, Exalead
Wayback machine
http://www.archive.org/
from 1996 to about 6 months ago
navigate the archived site or type in the full URL of the document if known
Firefox users
install the Resurrect Pages add-on
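When the full URL of a vanished document is known, a Wayback Machine link can be built directly. A minimal sketch, assuming the `web.archive.org/web/<timestamp>/<url>` address pattern, where `*` as the timestamp asks for the list of archived copies; `wayback_url` is an illustrative helper name.

```python
def wayback_url(page_url, timestamp="*"):
    """Build a Wayback Machine address for a page.

    timestamp: "*" for the list of captures, or digits such as "2008"
    or "20081022" to ask for the copy nearest that date (assumed pattern).
    """
    return f"https://web.archive.org/web/{timestamp}/{page_url}"


print(wayback_url("http://www.rba.co.uk/"))
print(wayback_url("http://www.rba.co.uk/", "2008"))
```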
Wayback Machine
Karen Blakeman
Tel: 0118 947 2256, +44 118 947 2256
karen.blakeman@rba.co.uk
http://www.rba.co.uk/
blog: http://www.rba.co.uk/wordpress/
Facebook – Karen Blakeman
Twitter: karenblakeman
http://www.slideshare.net/KarenBlakeman
Photo Nachoman-au. Creative Commons Attribution ShareAlike license versions 2.5, 2.0, and 1.0
http://commons.wikimedia.org/wiki/Image:Lotto_Skyworks_Applecross.jpg