Finding Web Sites, Content, Information and Resources

Once a user has connected to the World Wide Web using a browser, the next step is finding relevant websites, content, information and resources. When the World Wide Web was launched in 1991, there were only a handful of web servers, and Tim Berners-Lee (the web's inventor) was able to keep a list of every one. It was therefore fairly simple to find most of the information available on the web, but only because so little information was available. As the web expanded from 1991 to 1993, web content was typically found by one of three methods: 1) following hyperlinks suggested by the website a user was currently viewing; 2) word of mouth; 3) using directories (which provided human-edited lists of links).

Dmoz directory, launched in 1998 and closed in 2017; Yahoo! directory: Jerry and David's Guide to the World Wide Web

Dmoz and Yahoo! directories

Web directories tend to be human-edited and consist of webpages organised by subject, each containing a list of links related to that subject. While directories can be accurate in the short term, when recently edited, they suffer from 'link rot' (dead links) over time; link rot refers to links that point to content that no longer exists or that now redirects to another resource. Directories are notoriously difficult to keep up to date because they are edited by humans. Another problem directories face is how to generate income: most rely on paid inclusion, which has the drawback of creating a poor user experience. One of the earliest web directories, and one of the most popular, was named Jerry and David's Guide to the World Wide Web. Launched in 1994 by two graduate students, Jerry Yang and David Filo, it was simply a list of Jerry Yang's favourite websites. Jerry and David's Guide to the World Wide Web was renamed Yahoo! in 1995. While the Yahoo! directory continued until 2014, its relevance and prominence on the Yahoo! website declined from 2002 onwards, when Yahoo! focused on its new search engine. The Yahoo! directory was human-edited and a fee was charged for most successful submissions. Another popular directory was Dmoz, launched in 1998 and later owned by America Online. Its catchphrase was "humans do it better", and its volunteer structure enabled it to stay active, updated and relevant far longer than any other directory. Dmoz was closed on March 17th 2017, which probably signalled the end of directories as a web service that could challenge search engines as a method for finding new websites.

AltaVista, a search engine that dominated the web before Google

By 1994, due to the popularity of the web and the amount of new content being added, locating web resources through a human-edited directory was proving inefficient and unsatisfactory, and it became apparent that an automated service was required. That service, launched initially in 1993 but more successfully in 1994, was the search engine, which used an automated piece of software called a crawler to follow hyperlinks and index the content found on each webpage it visited. Early web crawlers could only index a small proportion of the information found on a webpage, typically its metadata, due to the cost of storing such a large amount of information. WebCrawler, launched in 1994 by Brian Pinkerton at the University of Washington, was the first search engine able to crawl, index and search the full text of every webpage it visited. WebCrawler was purchased by America Online in 1995, and its functionality was soon rivalled by AltaVista. AltaVista is credited as being one of the first search engines capable of indexing a large proportion of the content found on the World Wide Web, which it achieved by using multiprocessor machines and back-end machines with unheard-of storage capacity. While AltaVista was probably the first professional-looking search engine, the web was still small compared to the modern web: in 1997, AltaVista was receiving in the region of 12 million queries every day, whereas in 2017 Google was receiving in the region of 3.5 billion queries per day (internetlivestats.com statistics).
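The crawl-and-index idea described above can be illustrated with a minimal sketch. It is not how WebCrawler or AltaVista were actually built. It assumes a tiny in-memory "web" (the hypothetical PAGES dictionary and example.test addresses) in place of real network requests, and a simple word-to-pages index in place of a production search index. The names crawl, search and LinkAndTextParser are invented for this illustration.

```python
from collections import deque
from html.parser import HTMLParser

# Hypothetical in-memory "web": address -> HTML. Stands in for real HTTP fetches.
PAGES = {
    "http://example.test/": (
        "<p>Welcome to the web guide</p>"
        '<a href="http://example.test/directories">directories</a>'
        '<a href="http://example.test/search">search engines</a>'
    ),
    "http://example.test/directories": (
        "<p>Human edited directories suffer from link rot</p>"
        '<a href="http://example.test/missing">gone</a>'  # a dead link
    ),
    "http://example.test/search": (
        "<p>Search engines crawl and index the web</p>"
        '<a href="http://example.test/">home</a>'
    ),
}


class LinkAndTextParser(HTMLParser):
    """Collects hyperlink targets and the words of the visible text."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.words = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        self.words.extend(data.lower().split())


def crawl(start_url, pages):
    """Follow hyperlinks breadth-first and build a word -> set-of-pages index."""
    index = {}
    seen = {start_url}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        html = pages.get(url)
        if html is None:
            continue  # dead link: nothing to index
        parser = LinkAndTextParser()
        parser.feed(html)
        parser.close()
        for word in parser.words:  # full-text indexing of the page
            index.setdefault(word, set()).add(url)
        for link in parser.links:  # queue each newly discovered page once
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index


def search(index, query):
    """Return the pages that contain every word of the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results


index = crawl("http://example.test/", PAGES)
print(sorted(search(index, "link rot")))  # ['http://example.test/directories']
```

Indexing only a page's metadata, as the earliest crawlers did, would correspond to storing far fewer words per page; the sketch indexes every word of visible text, which is the full-text approach attributed above to WebCrawler.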

It is impossible to ascertain what proportion of new content is found through search engines, and what proportion is found through suggestions provided by websites the user already visits. Social networking websites, especially Twitter and Facebook, now function as a 'hub' from which new content is served to users in their news feeds. What is clear is that search engines continue to be the most popular web service specifically designed to help users locate third-party content.

Search Engines: Google, Bing, Yandex, Baidu, Ask Jeeves
Directories: ODP (Dmoz), Looksmart
Social Media: Bebo, LinkedIn, Google Plus, Myspace, Facebook, Tumblr, Twitter
Web Portals: About, Excite, Lycos, Yahoo!