Search Secrets: Technology Intelligence (Finding What You Want)
Larry G. Hull
May 20, 2004

Presentation Goals
  Common Search Mistakes
  Basic Search Techniques
  Major Search Engines
  Google Secrets
  Advanced Search Secrets
  Google Advanced Operators and Information
  Google Resources
  Other Technology Intelligence Resources
  Search Engines' Future Goals

Biggest Search Mistakes
  Typing URLs into the wrong box

Top Search Terms - 2:00 May 13
  Kelly Blue Book, Paris Hilton / Pamela Anderson, Mapquest, Google, Games, Yahoo, Jokes, Health, Thong / Thongs, eBay, Dictionary, Lyrics, Hotmail, Friends Finale, Ask Jeeves

What Did You Notice?
  Kelly Blue Book, Games, Mapquest, Google, Paris Hilton / Paris Hilton Video, Yahoo, Jokes, Health, Thong / Thongs, eBay, Dictionary, Pamela Anderson, Hotmail, Friends Finale, Ask Jeeves

My Point
  Kelly Blue Book, Games, Mapquest, Google, Paris Hilton / Paris Hilton Video, Yahoo, Jokes, Health, Thong / Thongs, eBay, Dictionary, Pamela Anderson, Hotmail, Friends Finale, Ask Jeeves

Biggest Search Mistake
  Typing URLs into the wrong box
    Search terms entered in the address field
    Addresses entered in the search field
  Source: wordtracker.com
    Today's top search terms
    Shown as moving list at top of page
    Download using "Send words to a friend" link
  People continue to type addresses as search terms, even after the mistake has been pointed out to them

Biggest Search Mistakes
  Typing URLs into the wrong box
  Using the wrong tool at the wrong time

The Second Biggest Mistake
  Using a directory instead of a search engine ... and then not understanding why you can't find anything!
  Would you look up the definition of a word in the phone book?
  Directories are human-compiled and have a small number of pages in their databases (usually in the low millions), e.g., MSN or Yahoo
  Search engines are machine-compiled and have a very large number of pages in their databases (usually in the hundreds of millions or even billions)

Directories vs. Search Engines
  Directories are great for "telephone book" searches
    What is the Web page address for company A, organization B, or entity C?
    Who makes product X?
    Where is a list of Web pages for topic Z?
  Search engines are great for "encyclopedia" or "dictionary" searches
    What is Z?
  Three parts to a search engine
    Spider or bot (sometimes called a crawler)
    Index or catalog
    Front end or retrieval page

Directories vs. Search Engines
  Directories may grow to where they effectively become a small search engine or may explicitly decide to transform themselves into one, e.g., Yahoo
  Search engines may divide into topic or functionality oriented subsets that resemble directories, e.g., Google
  Some offer unique front ends, e.g., Ask Jeeves and KartOO (offering multisearch and a visual display)
  Competition is fierce

People Use Directories -- Why?
  Search engines have more, far more, pages in their databases than do directories
  Why do people predominantly use directories?
    They know how to use a directory
    No one has ever shown them how to use a search engine!
  Note: This is also true for a search box or page at a Web site. People will say they prefer to use a site's search facility. Usability tests show most use the site map if one exists.

Biggest Search Mistakes
  Typing URLs into the wrong box
  Using the wrong tool at the wrong time
  Not knowing when and how to use directories or search engines to actually find information

Search Basics
  Be specific ... because if you aren't specific, you'll end up with a bunch of garbage!
    Minimum of three to five specific terms, e.g.,
      ocean biology and biogeochemistry data mining
    "and" is a "stop word", i.e., an extremely common word that is frequently discarded to speed up searches

Search Basics
  Be specific ... because if you aren't specific, you'll end up with a bunch of garbage!
  Use quotes to search for phrases
    "ocean biology" biogeochemistry "data mining"

Search Basics
  Be specific ... because if you aren't specific, you'll end up with a bunch of garbage!
  Use quotes to search for phrases
  Use the + sign to require
    "ocean biology" biogeochemistry +"data mining"

Search Basics
  Be specific ... because if you aren't specific, you'll end up with a bunch of garbage!
  Use quotes to search for phrases.
  Use the + sign to require
  Use the - sign to exclude
    "ocean biology" biogeochemistry +"data mining" -solicitation

Search Basics
  Be specific ... because if you aren't specific, you'll end up with a bunch of garbage!
  Use quotes to search for phrases.
  Use the + sign to require.
  Use the - sign to exclude.
  Combine symbols as often as possible (see rule #1)
    +"ocean biology" biogeochemistry +"data mining" -solicitation

Major Search Engines
  Google http://www.google.com
    Well-deserved reputation as top choice for searching the web
    Crawler-based service provides comprehensive coverage and great relevancy
    Recommended as a first stop
  Yahoo http://www.yahoo.com
    Web's oldest (1994) directory
    Shift in October 2002 to crawler-based listings from Google
    Second shift in February 2004 to its own search technology

Major Search Engines - 2
  Ask Jeeves http://www.ask.com/
    Ask Jeeves initially gained fame in 1998 and 1999 as being the "natural language" search engine
    Behind the scenes, editors monitored search logs and went out onto the web to locate what seemed to be the best sites
    Today, Ask Jeeves depends on crawler-based technology
  HotBot http://www.hotbot.com
    HotBot provides easy access to the web's three major crawler-based search engines: Yahoo, Google and Teoma (pronounced tay-o-ma and owned by Ask Jeeves)
    Unlike a meta search engine, it cannot blend the results from all of these crawlers together

Meta Search Engines
  Unlike search engines, metacrawlers don't crawl the web to build listings. Instead, they allow searches to be sent to several search engines all at once. The results are then blended together onto one page.
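The blending step described above can be illustrated with a small sketch. It assumes each engine has already returned a ranked list of URLs and merges them round-robin while dropping duplicates. The function name, the engine names, and the example URLs are all hypothetical, and real metacrawlers may blend results quite differently.

```python
from itertools import zip_longest


def blend_results(results_by_engine):
    """Round-robin merge of per-engine ranked URL lists, skipping duplicates."""
    blended = []
    seen = set()
    # zip_longest walks rank 1 of every engine, then rank 2, and so on,
    # padding shorter lists with None.
    for same_rank in zip_longest(*results_by_engine.values()):
        for url in same_rank:
            if url is not None and url not in seen:
                seen.add(url)
                blended.append(url)
    return blended


# Hypothetical result lists; a real metacrawler would fetch these live.
results = {
    "engine_a": ["http://a.example/1", "http://shared.example/", "http://a.example/2"],
    "engine_b": ["http://shared.example/", "http://b.example/1"],
}
print(blend_results(results))
```

A page returned by more than one engine appears once, at the best rank any engine gave it.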
  Dogpile http://www.dogpile.com
    Searches a customizable list of search engines, directories and specialty search sites
    Displays results from each search engine individually

Meta Search Engines - 2
  MetaCrawler http://www.metacrawler.com/
    One of the oldest meta search services, MetaCrawler began in July 1995 at the University of Washington
  KartOO http://www.kartoo.com
    Shows search results using a visual display
    Sites are interconnected by keywords
  Looking for search engines? http://searchenginewatch.com/links/

Google Secrets
  It's easy if you remember a few simple commands
  Title searches
    Search for pages that have a particular word or phrase in their title
      intitle:terms
    Examples
      intitle:moon
      intitle:"Moon Landing"

Google Secrets - 2
  Site searches (used with another term)
      term(s) site:domainname
    Limit search to only pages within a specific site or domain
      apollo site:nasa.gov
      apollo +site:gov
    Exclude pages from a specific site or domain
      apollo -site:nasa.gov
      apollo -site:gov

Google Secrets - 3
  URL searches
    Search for pages that have a particular word or phrase in their URL
    Works great when you can only remember one word in a long URL
      inurl:url
      inurl:apollo
    Example of "real" search
      +inurl:apollo +moon -"John Young" +site:nasa.gov
    Note: -"John Young" does not mean Young won't appear in the site

Google Secrets - 4
  Related searches
    Search for pages that are similar to another page
      related:url
    Related search to find pages similar to gsfc.nasa.gov
      related:gsfc.nasa.gov
    Typical result will be servername.gsfc.nasa.gov
    Note: related: search results are equivalent to a result's "Similar pages" link

Advanced Search Secrets
  How does search really work in search engines?
    Only the insiders know for sure
    People have made some shrewd guesses
      Number of keywords appearing on page scores "X"
      Keyword adjacency (closeness) on page scores "Y"
      Number of appearances of a keyword is given a weight and composite of weights scores "Z"
      Search engine uses X, Y, Z, and other secret variables to come up with a score for each page; applies a page rank; and displays the top n pages m at a time
  What is a page rank?
    Note: "PageRank" is Google's secret algorithm

Page Rank (Digression)
  Premise: importance of a research paper can be judged by the number of citations
  Analogy: importance of a Web page can be judged by the number of links pointing to it from other pages
  Or, mathematically,
      PR(A) = (1 - d) + d * (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))
    where PR = page rank and C = citations (links); T1 ... Tn are the pages linking to page A, and C(T) is the number of links on page T
    d is a damping factor in the range of 0 < d < 1, perhaps 0.8

Page Rank (continued)
  The page rank of a Web page is built from the page ranks of all the pages linking to it, each divided by the number of links on that linking page, then summed
  A page with many links to it from other pages is judged to be more important than a page with only a few links from other pages
  A link from a page with only a few links to other pages counts for more than a link from a page with links to lots of other pages, e.g., a portal

Boolean Operators
  Search engines use AND as their Boolean default
    Automatically look for pages that contain ALL keywords
    Google: The "AND" operator is unnecessary -- we include all search terms by default.
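Returning briefly to the page rank digression: the calculation can be tried on a toy link graph. The sketch below assumes the commonly cited form PR(A) = (1 - d) + d * (sum of PR(T)/C(T) over the pages T linking to A), with d = 0.8 as suggested in the slides. The function name, the three-page graph, and the fixed iteration count are illustrative assumptions, not Google's actual implementation.

```python
def page_rank(links, d=0.8, iterations=50):
    """Iteratively estimate page rank.

    links maps each page to the list of pages it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    pr = {page: 1.0 for page in pages}  # start every page at the same rank
    for _ in range(iterations):
        new_pr = {}
        for page in pages:
            # Each linking page passes on its rank divided by its link count.
            incoming = sum(
                pr[src] / len(targets)
                for src, targets in links.items()
                if page in targets
            )
            new_pr[page] = (1 - d) + d * incoming
        pr = new_pr
    return pr


# Hypothetical three-page site: both inner pages link back to "home".
toy_links = {
    "home": ["about", "news"],
    "about": ["home"],
    "news": ["home"],
}
ranks = page_rank(toy_links)
for page, rank in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {rank:.3f}")
```

In this toy graph "home" ends up with the highest rank because two pages link to it, while "about" and "news" each receive only half of what "home" passes on.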
    One reason to use phrases, e.g., "data mining"
  Some searches need a Boolean OR
    OR operator is (almost) always in all caps
    Goes between keywords
    Often may be (should be) used with parentheses
      ocean biology and biogeochemistry data mining
      "ocean biology" biogeochemistry "data mining"
      ("ocean biology" OR biogeochemistry) "data mining"

More Tips
  Capitalization may not matter (except Boolean OR)
  Number of keywords may be limited
  Don't use stop words: a, about, an, and, are, as, at, be, by, from, how, i, in, is, it, of, on, or, that, the, this, to, we, what, when, where, which, with
    Note: try using a + to search for a stop word, +we
  Use wildcard if supported by search engine
    Wildcards are characters, usually asterisks (*), that represent other characters
    Google offers full-word wildcards, it's +a * world
    Some search engines support a technique called "stemming", some* matches sometime, somewhere, ...

More Tips 2
  Order of the search keywords matters
    Number of results (hits) will probably be the same
    Ordering of the hits, particularly the top hits, will vary
  Some search engines offer advanced search (page and/or advanced search operators)
    Search specific parts of Web pages
    Search for specific types of information
  We've seen some of Google's advanced operators
    Query modifiers, e.g., intitle:, inurl:, site:
    Alternative query types, e.g., related:
    Information types, e.g., define:

Google Advanced Operators
  filetype:
    Results restricted to files with a particular extension: ".doc", ".xls", ".ppt", etc.
    Shows only files created with the corresponding program
    No space between filetype: and the file extension
    "dot" in the file extension is optional
      "data mining" filetype:pdf
  daterange:
    Date or range of dates page was indexed
    Works only with Julian dates
      "data mining" daterange:2452401-2452766

Google Advanced Operators - 2
  inanchor:
    Results restricted to text in a page's anchors (links)
    No space between inanchor: and the first word
    Phrases (in quotes) do work
      <p>Please <a href="guestbook.html">sign our guestbook</a></p>
      terms inanchor:guestbook
  intext:
    Searches body text and only body text
    No space between intext: and the first word or phrase
      terms intext:"Goddard Space Flight Center"

Google Information
  phonebook:, three versions
    phonebook: searches the entire Google phonebook
    rphonebook: searches residential listings only
    bphonebook: searches business listings only
  Parameters
    first name (or first initial), last name, city (state is optional)
    first name (or first initial), last name, state
    first name (or first initial), last name, area code
    first name (or first initial), last name, zip code
    phone number, including area code
    last name, city, state
    last name, zip code
  Example
      phonebook:hull 20770
      Rena Hull - (301) 614-9654 - 155 Westway, Greenbelt, MD 20770

Google Information - 2
  define:
    definitions for the word or phrase that follows, if definitions are available
    no space between define: and the word or phrase
      define:nanotechnology
  stocks:
    Google treats the query terms as stock ticker symbols
    Links to a Yahoo finance page showing stock information for those symbols
    Ignores spaces

Google Resources
  Free guides and FAQ
    Web searching in general
    Google's features in specific
    http://www.google.com/help/
  Google Usenet newsgroup
    google.public.support.general
  Google Hacks by Calishain and Dornfest
    Extremely advanced book written for Perl programmers
    Recommended but only if you can hack it

Other Resources
  Deep Web (Invisible Web)
    Content of databases accessible on the Web
      Estimated 500 times larger than the fixed Web
    Non-textual files on the Web
      Multimedia files
      Graphical files (now offered by Google)
      Documents in non-standard formats such as Portable Document Format (PDF)
  Sources
    Directory of high quality databases maintained by Gary Price and Chris Sherman
      http://www.invisible-web.net/
    CompletePlanet
      http://www.completeplanet.com/

Other Resources - 2
  The GSFC Library http://library.gsfc.nasa.gov/
    Library Catalog
    Article/Paper Search Engines
    Technical databases (deep web)
      Available to users within GSFC domains
    Journals, eBooks, and more
    NEW: Ask A Librarian
      Real-time reference!
      10 am to 4 pm, Mon - Fri
  Never underestimate a well-trained reference librarian

Future Goals
  Search engines understand (and answer) questions in natural language
    Microsoft Research's AskMSR
      Rewrite search question (generate templates)
      Search for exact matches -- done
      Search for close matches -- redundancy => high probability
  Search engines as good as well-trained reference librarians
    Major advances needed in probabilistic machine learning and natural-language processing
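As a recap of the operators covered under Search Basics and Google Advanced Operators, the sketch below assembles a query string from its parts and converts calendar dates to the Julian day numbers that daterange: expects. The helper names and parameters are hypothetical, and the operators are used only as the slides describe them. The calendar dates in the example were back-computed from the slide's daterange:2452401-2452766, which works out to 6 May 2002 through 6 May 2003.

```python
from datetime import date


def julian_day(d):
    """Julian day number for a calendar date (proleptic Gregorian calendar)."""
    # date.toordinal() counts days from 0001-01-01 as day 1;
    # that day is Julian day number 1721426.
    return d.toordinal() + 1721425


def build_query(phrases=(), required=(), excluded=(),
                site=None, filetype=None, indexed_between=None):
    """Assemble a query string from the operators described in the slides."""
    parts = [f'"{p}"' for p in phrases]      # quotes search for phrases
    parts += [f"+{t}" for t in required]     # + requires a term
    parts += [f"-{t}" for t in excluded]     # - excludes a term
    if site:
        parts.append(f"site:{site}")
    if filetype:
        parts.append(f"filetype:{filetype}")
    if indexed_between:
        start, end = indexed_between
        parts.append(f"daterange:{julian_day(start)}-{julian_day(end)}")
    return " ".join(parts)


query = build_query(
    phrases=["data mining"],
    excluded=["solicitation"],
    site="nasa.gov",
    filetype="pdf",
    indexed_between=(date(2002, 5, 6), date(2003, 5, 6)),
)
print(query)
# "data mining" -solicitation site:nasa.gov filetype:pdf daterange:2452401-2452766
```

Building the string in one place keeps the "no space after the colon" rule from the slides easy to follow consistently.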