FyberSearch is not the owner of the article below and accepts no responsibility for its content, claims or the websites that it links to. It was submitted to FyberSearch for publication on Search Industry Blog by its assumed author. If you are the author and did not authorize its publication, please contact the Search Industry Blog immediately.
The Internet, and networks in general, revolve around the passing of information between users distributed over distance. Looking at the various services it provides in historical order can give us a clearer idea of what the Internet is and what it can do.

In the beginning... there was Archie, FTP (File Transfer Protocol) and Gopher. Gopher and Archie already contained most of the components of current search engines. They had spiders crawling the network looking for content, which was stored in databases and/or topic directories. Sites were ranked by a computer-generated estimate of relevance to the query. Search protocols, including the search command syntax, were established from the beginning. As Archie was to FTP archives, Veronica was to Gopherspace: a search utility that helps find information on Gopher servers. (See Common Questions and Answers about Veronica, a title search and retrieval system for use with the Internet Gopher.) Check the Original Internet Hunt and THE ANSWERS for an indication of what the early Internet could deliver in the right hands.

Then came... graphical user interfaces (GUIs), in the form of Mosaic in 1993, and the World Wide Web (WWW), and existing search features were built into new search engines such as Lycos, AltaVista and HotBot. (See A brief history of the Lycos and HotBot search engines and A brief history of the AltaVista search engine.) AltaVista's popularity stemmed from its embrace of Boolean searches enhanced with 'case sensitivity', 'phrase searching' and a 'proximity search capability' (the NEAR operator), all of which survived until Yahoo's recent takeover of AltaVista.

Ten little Indians...
HotBot, based on Inktomi search results, offered features such as field searching, limiting by date and searching for particular file types. It was, briefly, the search engine of choice for many but never achieved the popularity enjoyed by AltaVista, and it lost direction after substituting DirectHit content for that from Inktomi. Even before the Internet bubble burst in 2001, great search tools closed or changed. You can see some of them at Searching Graveyard, where they are organized in chronological order with some of their logos.

The Swiss Army knife of search engines; or why we are googly-eyed about Google

In a departure from the Boolean-search-based technologies of the early nineties, the ranking of Google hits is based on their linkages (in imitation of the famed ISI Citation Indexes for academic literature) and authority rather than on 'weightings' by the number of occurrences of keywords in the text. Google detects phrase matches even when quotes are not used in the basic search mode, and it usually ranks documents with matching phrases higher.
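Ranking by linkage rather than by keyword counts can be illustrated with a small sketch. The code below is a simplified link-analysis iteration in the general style of citation ranking; it is not Google's actual implementation. The toy graph, the damping value of 0.85, the iteration count and the function name `rank_by_links` are all assumptions made for illustration.

```python
def rank_by_links(links, damping=0.85, iterations=50):
    """Score pages by the links pointing at them (simplified sketch).

    links maps each page name to the list of pages it links to.
    Returns a dict of page name -> score; scores sum to 1.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with an equal score for every page.
    scores = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page keeps a small baseline score.
        new_scores = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score on to the pages it links to.
                share = damping * scores[page] / len(targets)
                for target in targets:
                    new_scores[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * scores[page] / n
                for target in pages:
                    new_scores[target] += share
        scores = new_scores
    return scores


# Hypothetical four-page web: three pages link to c.example.
toy_web = {
    "a.example": ["c.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example"],
    "d.example": ["c.example"],
}
scores = rank_by_links(toy_web)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In this toy graph c.example comes out on top because three pages link to it, and a.example follows because it is linked from the top-ranked page. No keyword counting takes place at any point.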
(See Review of Google 5 June 2004; Google Advanced Search operators and The Google ~Guide Site)

Google has its limitations...:
There is no nesting, no truncation, and it does not support full Boolean logic;
It only indexes the first 101 KB of a Web page and about 120 KB of PDFs;
The number of keywords you can search on is limited to 10 (now 32...January 2005), but you can override this limitation by putting a plus sign ( + ) in front of any of the words when using them in a search phrase, or you can use the wildcard symbol ( * ) and actually search for more than 10 (32) keywords at a time because the ( * ) is not counted as a word

...and special features:
It is currently indexing the abstract records for all online technical documents and standards by the Institute of Electrical and Electronics Engineers (IEEE); abstracts are available free and full-text documents are available to subscribers or for online purchase;
Starting a search with "define", "definition", "what is", and "what are" will invoke a Google Glossary lookup;
Google will soon provide access to a 2 million record subset of more than 53 million records in the OCLC Project WorldCat - the most popular and widely available books (but see Two Million Open Worldcat Records Hit the Yahoo Database - Infotoday July 18 2004);
WebQuotes - what people are saying about a particular site;
Google provides background information on a page if you type the URL in the form info:www.whatever.xxx.

(See also Gary Price's Tips for Searching Google and FAQ based on questions in the google.public.support.general newsgroup)

... but that ain't all. Google very sensibly allows and encourages others to adapt and enhance their software, as indicated by the following examples:
Google Ultimate Interface utilizes all advanced search options (e.g. Web search, Image search, News search) and Google's tools (e.g. Glossary, Sets), lets you toggle the Duplicates Filter on or off, use the file format search, and set the number of results per page, and has links for typing non-English letters;
Google API Proximity Search (GAPS) lets you look for two words within one, two or three words of each other;
Google Hacks by Tara Calishain and Rael Dornfest (book) - 100 industrial-strength, real-world, tested solutions to practical problems, including:
Hack 5: Getting Around the 10 Word Limit
Hack 17: Consulting the Phonebook
Hack 32: Google News
Hack 44: Scraping Google Results
Hack 54: NoXML, Another SOAP::Lite Alternative
Hack 79: Measuring Google Mindshare
Hack 87: Google Whacking
Hack 100: Removing Your Materials from Google

There are other search engine technologies...

The clustering search engines Vivisimo, Mooter and SnakeT (SNippet Aggregation for Knowledge ExTraction) show potential but are affected by the usual business manoeuvrings.
The clustering meta-search engine Vivisimo no longer harvests data from Google.
Different tools cluster using different methods. One of the more common methods
is to look for phrases which appear in multiple listings. All pages that have a
certain phrase are listed in this cluster, whose name is that phrase. (See Topic Clustering in
Searches.) Kartoo visual search
and Maps of the Web use similar
technology but present their results in a visual display.

Beyond Google...

Rumours persist of work being done by Yahoo! and Microsoft (MSN) to supplant Google (See MSN launches revamped search engine and Yahoo! Search has a fresh, new look), and claims are made about social networking search technologies such as those employed by Eurekster, Orkut, Ryze, LinkedIn, delicious, and Furl, but none of these appears to be in a position yet to effect a dramatic shift in web searching. (See accounts of Tim Berners-Lee's Semantic Web for more measured projections of future search technologies.)

But wait. Search engines don't tell us the full story. There are many tried and true websites for searchers. Here is a selection of resources which can be appealed to immediately when appropriate:

DIRECTORIES:
Keyword searching ensures maximum recall but often
finds far too many hits to check easily and some of those found have limited relevance.
The hierarchical subject directories on the other hand were usually produced by
human indexers and consequently excluded much of the ephemeral, the unreliable
and the purely commercial sites. Whilst these are now under challenge from the
clustering search engines (see above) many remain key resources, for
example:

SPECIALISED SEARCH TOOLS
Amazon.com "Search Inside the Book" results list authors and titles, "excerpt from" and the hyperlinked title of the book...FAQ
Cached websites: Gigablast, Wayback Machine, Daypop, IncyWincy, Yuntis (See also Finding Old Web Pages)
MESA - Meta-Email-Search-Agent
PINAKES, A Subject Launchpad
Voice of the Shuttle (University of California, Santa Barbara), one of the few comprehensive research subject lists with a humanities orientation
SurfWax Enterprise / SurfWax Scholar / SurfWax LawKT (Knowledge Tools), by subscription

SEARCH PORTALS
Fagan Finder - search engines, reference, tools, and more...Biography page...Quotations and Proverbs Search
Pandia Powersearch: All-in-One List of Search Engines

DATABANKS &/OR DIGITAL LIBRARIES
Encyclopedia Britannica: The 1911 Edition
Jewish Encyclopedia.com
New Advent Catholic Encyclopedia
Nonverbal Dictionary of Gestures, Signs & Body Language Cues
Official history of Australia in the war of 1914–1918
Guardian Archive (since 1899)
Home Economics Archive: Research, Tradition, History
Old Car Manual Project
Spectator Text Project: published by Joseph Addison and Richard Steele from 1711 to 1714
Technology in Australia 1788-1988

INTERNET ARCHIVES
Scout Report Archives
The Coombsweb is the world's oldest and most prominent Asian Studies online research facility. Its Web pages are designed for transmission speed, not fancy looks.
Alan Lomax Archive...Audio Archive...Film and Videotape Archive...Paper Archive...Photograph Collection
BBC World Service Archive: international news, analysis and information in English and 42 other languages (See also BBC Audio Interviews)

AUSTRALIAN DATABANKS & GUIDES
The AusAnthrop Database On Line
AusStage: gateway to Australian performing arts
Australian Cooperative Digitisation Project, 1840-45
Australian Digital Theses Program - CAUL
Historical Australian Acts (none earlier than 1973)
Mining in Australia
Social Health Atlas of Australia
Womens Weekly Index Database
See also NLA's Electronic Australiana, Charles Sturt University Regional Archives and SLNSW's Aboriginal Australian links

SOUTH AUSTRALIAN DATABANKS & GUIDES
The author of this article is Brian Bingley.