Search engines
- types
- what they search
- how they work
- crawl for content
- store content (meta tags, source code, and later the queries and who issued them)
- index content for retrieval (HTML, PDF, DOC, XLS, Flash, images, ...)
- searching
- results presentation / ranking: PageRank scores a page by what links to it (Austin, 2005; Berry, 2005)
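The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any real engine is built: the three-page `PAGES` dictionary stands in for crawled and stored content, `build_index`, `pagerank`, and `search` are hypothetical helper names, and the damping factor of 0.85 is a commonly quoted default rather than something taken from these notes.

```python
from collections import defaultdict

# Hypothetical in-memory "web": page -> (stored text, outgoing links).
# In a real engine this would come from the crawl and store steps.
PAGES = {
    "a.html": ("search engines crawl the web", ["b.html", "c.html"]),
    "b.html": ("engines index crawled content", ["c.html"]),
    "c.html": ("ranking uses links between pages", ["a.html"]),
}


def build_index(pages):
    """Index step: map each word to the set of pages that contain it."""
    index = defaultdict(set)
    for url, (text, _links) in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def pagerank(pages, damping=0.85, iterations=50):
    """Ranking step: simplified PageRank by repeated redistribution of score."""
    n = len(pages)
    rank = {url: 1.0 / n for url in pages}
    for _ in range(iterations):
        new_rank = {url: (1.0 - damping) / n for url in pages}
        for url, (_text, links) in pages.items():
            targets = [t for t in links if t in pages]
            if not targets:
                # A page with no outgoing links shares its score with every page.
                for t in pages:
                    new_rank[t] += damping * rank[url] / n
            else:
                share = damping * rank[url] / len(targets)
                for t in targets:
                    new_rank[t] += share
        rank = new_rank
    return rank


def search(query, index, rank):
    """Search step: pages containing every query word, best-ranked first."""
    words = query.lower().split()
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits, key=lambda url: rank[url], reverse=True)
```

With this toy data, `search("engines", build_index(PAGES), pagerank(PAGES))` returns both pages that mention the word, ordered by link-based score; `c.html` ends up with the highest score because both other pages link to it.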
- history
- human driven: 1990 Archie, Gopher, 1994 Yahoo (which turned hybrid)
- robot driven: 1993 Aliweb, 1994 WebCrawler, Infoseek, Lycos, 1998 Google, 2004 Yahoo Search, 2006 Windows Live Search, Kartoo
- challenges
- exponential growth (assembling an index takes 12 days for 10B pages), plus updated pages and secure sites (Google claimed 25B pages indexed)
- keyword search creates false positives
- dynamically generated pages or database driven content = hidden or invisible web
- linkspam / attempts to inflate rankings
Shortcuts
Street maps, stock quotes, calculator, definitions, travel conditions, weather, numbers (flights, UPC, area codes, patents, ISBN/ISSN), phone numbers & addresses, find in a library
Search operators
Services
Advanced search form (reverse engineering)
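One way to picture the reverse-engineering idea: an advanced search form ultimately produces an ordinary query string, so the same search can be assembled by hand. The sketch below is a hedged illustration only. The helper `build_search_url` is hypothetical, the base URL and the `q` parameter are assumed from what a Google results address typically looks like, and `site:` and `filetype:` are used as examples of search operators.

```python
from urllib.parse import urlencode


def build_search_url(terms, site=None, filetype=None,
                     base="https://www.google.com/search"):
    """Combine plain terms and optional operators into one search URL.

    The base URL and the "q" parameter name are assumptions for illustration.
    """
    parts = [terms]
    if site:
        parts.append(f"site:{site}")          # restrict results to one site
    if filetype:
        parts.append(f"filetype:{filetype}")  # restrict results to one file type
    return base + "?" + urlencode({"q": " ".join(parts)})


url = build_search_url("invisible web", site="example.edu", filetype="pdf")
```

Filling in the form and comparing the resulting address against a URL built this way is a quick check of which form field maps to which operator.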
Tools for better searching
Google Hacks (book)
Gwiggle (online puzzle game)
Google Guide (website)