
Finding Web Sites


Once a user can connect to the Internet, accessing the World Wide Web is easy: all that is required is a web browser. The three most popular browsers are Microsoft's Internet Explorer, Mozilla's Firefox and Google's Chrome (all free to download). The World Wide Web is based on a client-server model: the client (browser) retrieves documents (webpages) from computers (servers) connected to the Internet.
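The client-server exchange described above can be sketched in a few lines of Python. This is only an illustration, not how any real browser is built: the page content, the PageHandler class and the fetch_from_local_server function are made-up names, and the "server" is a throwaway one running on the same machine.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# A made-up webpage that the toy server hands out.
PAGE = b"<html><body><h1>Hello from the server</h1></body></html>"


class PageHandler(BaseHTTPRequestHandler):
    """Server side: answers every GET request with the same small webpage."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # suppress the default request logging to keep the example quiet


def fetch_from_local_server():
    """Start a throwaway server, then act as the client and request one page."""
    server = HTTPServer(("127.0.0.1", 0), PageHandler)  # port 0 = any free port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # Client side: this is the part a browser performs for the user.
        connection = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        connection.request("GET", "/index.html")
        response = connection.getresponse()
        result = (response.status, response.read())
        connection.close()
        return result
    finally:
        server.shutdown()
        server.server_close()


status, body = fetch_from_local_server()
print(status, body.decode("utf-8"))
```

A real browser does the same thing on a larger scale: it sends a request to a server somewhere on the Internet and then renders the document that comes back.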

Retrieving Web Content

The World Wide Web is a hypertext document system: each document hosted by a web server (HTTP server) has a unique address, termed a URL (Uniform Resource Locator). If a user knows the URL of a web document, its content can be retrieved in a variety of ways: for example, it can be downloaded as a feed (RSS) from a publisher and read offline.
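To make the idea of a URL more concrete, the short Python sketch below splits an address into its parts using the standard library. The address itself is invented for the example (example.com is reserved for documentation), and describe_url is a hypothetical helper name.

```python
from urllib.parse import urlsplit


def describe_url(url):
    """Break a URL into the parts a browser uses to locate a document."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,      # protocol used to talk to the server
        "host": parts.hostname,      # which server holds the document
        "port": parts.port,          # network port on that server, if given
        "path": parts.path,          # which document on the server
        "query": parts.query,        # extra parameters sent to the server
        "fragment": parts.fragment,  # a position inside the document
    }


# An invented address used purely for illustration.
info = describe_url("http://www.example.com:8080/guides/search.html?lang=en#top")
for name, value in info.items():
    print(f"{name}: {value}")
```

The scheme and host tell the browser which server to contact and how; the path, query and fragment identify the document and the position within it.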

Address box

The most popular way to retrieve web content, however, is to type the address (URL) of the resource into the address bar (shown above) of a browser and navigate to the webpage. The drawback of finding web content this way is that the user must already know the address of the resource; without the URL, the user cannot access it. This raises a problem: if a user does not know the address of a resource, how does the user find the web content?

XML and XHTML: Future of HTML

In 1998, XML 1.0 was released. XML was a new markup language, developed by a range of W3C working groups, and it was viewed as an alternative to, and possible replacement for, HTML. XML did not replace HTML, but its syntax was used to create the Extensible Hypertext Markup Language (XHTML).

The development of XHTML was inspired by the publication of a W3C Working Draft titled 'Reformulating HTML in XML', released in 1998. XHTML 1.0 was released in 2000. For many years afterwards, HTML itself was not updated beyond HTML 4.01 (1999), as the World Wide Web Consortium (W3C) focused its development work on XHTML.

By 2010, both HTML and XHTML were commonly used on the World Wide Web. HTML5 was developed as an attempt to create a single markup language that encompasses both HTML and XHTML. HTML5 is the first version of HTML that is not defined as an application of SGML; it can also be written in an XML syntax (often called XHTML5), and it aims to increase the interoperability of future web content.

The Web Hypertext Application Technology Working Group (WHATWG) was formed in 2004 to encourage the development of web technologies and to evolve HTML. WHATWG played an important role in the development of HTML5: creating a range of proposals and papers for W3C working groups to vote upon. Members of WHATWG included the Mozilla Foundation and Opera Software (browser developers).

Finding Web Content: Search Engines

As already stated, a web user who is looking for web content but does not know its address (URL) is left with a dilemma: how do I find the address of the content? Web content (websites) is connected together through hyperlinks: a hyperlink is a hypertext element with a URL embedded within it. Users can therefore click on hyperlinks to be taken to web content; the problem is finding hyperlinks that lead to relevant content (content they are interested in).
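As a rough illustration of a URL being "embedded" in a hyperlink, the Python sketch below pulls the address and the visible (anchor) text out of each link in a small piece of HTML. The sample markup, the LinkExtractor class and the extract_links function are all invented for the example.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects (url, anchor_text) pairs from the <a href="..."> elements of a page."""

    def __init__(self):
        super().__init__()
        self.links = []     # finished (url, anchor_text) pairs
        self._href = None   # URL of the link currently being read, if any
        self._text = []     # pieces of visible text inside that link

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Return every hyperlink found in an HTML string."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


# Invented sample markup containing two hyperlinks.
SAMPLE = (
    '<p>Try <a href="http://www.example.com/search">a search engine</a> '
    'or <a href="/directory.html">our directory</a>.</p>'
)
links = extract_links(SAMPLE)
for url, text in links:
    print(f"{text!r} points to {url}")
```

The user only sees the anchor text on the page; the browser uses the embedded URL when the link is clicked.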

What is needed is a starting point, or "jump off" location. Search engines are currently (2014) the best "jump off" location. Search engines "crawl" the World Wide Web (by following hyperlinks) and index the documents they find (document address, title, description) in a database. Users of the search engine can then query this database with a 'search term' to find documents that match it. The most successful search engines (such as Google) provide accurate results; the anchor text of hyperlinks is one of the signals search engines use to judge what a linked document is about.
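The crawl, index and query cycle can be sketched with a toy example. The Python below is a heavily simplified assumption of how such a system might work, not a description of any real search engine: the "web" is a small made-up dictionary of pages rather than the real Internet, and the index simply records which words appear in each page's title and text.

```python
from collections import deque

# A made-up miniature "web": each address maps to a title, some text and its links.
TOY_WEB = {
    "http://a.example/": {
        "title": "Internet guide",
        "text": "how to find web sites",
        "links": ["http://b.example/", "http://c.example/"],
    },
    "http://b.example/": {
        "title": "Search engines",
        "text": "search engines crawl and index documents",
        "links": ["http://a.example/"],
    },
    "http://c.example/": {
        "title": "Web directories",
        "text": "directories list web sites by subject",
        "links": ["http://b.example/", "http://missing.example/"],
    },
}


def crawl(start_url, web):
    """Follow hyperlinks from start_url and build a word -> set of URLs index."""
    index = {}
    seen = {start_url}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        page = web.get(url)
        if page is None:
            continue  # dead link: nothing to index
        words = (page["title"] + " " + page["text"]).lower().split()
        for word in words:
            index.setdefault(word, set()).add(url)
        for link in page["links"]:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index


def search(index, query):
    """Return the addresses of documents containing every word of the search term."""
    terms = query.lower().split()
    if not terms:
        return []
    matches = set.intersection(*(index.get(term, set()) for term in terms))
    return sorted(matches)


index = crawl("http://a.example/", TOY_WEB)
print(search(index, "web sites"))       # pages mentioning both "web" and "sites"
print(search(index, "search engines"))  # pages mentioning both "search" and "engines"
```

Real search engines add a great deal on top of this, most importantly a way of ranking the matching documents, but the basic steps of following links, storing what is found and answering queries from the stored index are the same.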

Therefore, all a user requires is the address (URL) of a search engine, and they should be able to find web content that interests them. Most ISPs (Internet Service Providers) set their own homepage as the default website that loads when a web browser is opened. ISP homepages tend to provide a search facility, but the search results are usually provided by another company: most likely Google, Bing or Yahoo!

Websites which combine a search engine with other services - most notably a directory (explained below) - are often described as web portals. Yahoo! is a prime example of a web portal: it began life as a directory, but expanded its range of services to include a search engine.

Alternatives to a Search Engine: Directories and Social Networking Websites


While most users use and 'rate' search engines as the best way to find content online, there is another option: the directory. Directories tend to be human edited - in contrast to search engines, which use 'robots' to index content - and consist of webpages full of hyperlinks on a specific subject. While directories can be accurate in the short term - when recently edited - they suffer from 'link rot' (dead links) over time: the hyperlinks listed on their subject pages point at documents that no longer exist. A historical record of directories and websites can be found at the Internet Archive. Directory websites are commonly structured around a site map, alongside a site search. Most directory websites are free to use, but they may charge websites - on a subscription payment model - to be listed within the directory.
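Link rot is something a directory editor could check for automatically. The Python sketch below is one possible approach, offered as an assumption rather than a description of how any particular directory works: find_dead_links and http_page_exists are hypothetical helper names and the listed addresses are invented. The demonstration uses a made-up set of live pages so that it runs without an Internet connection.

```python
import urllib.request


def http_page_exists(url, timeout=5):
    """Ask the server for the page headers only; treat any failure as a dead link.

    Note: some servers reject HEAD requests, so a failure here is a hint
    that a link may be dead, not proof.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout):
            return True
    except (OSError, ValueError):
        # OSError covers HTTP errors, unreachable hosts and timeouts;
        # ValueError covers malformed addresses.
        return False


def find_dead_links(links, page_exists):
    """Return the links for which page_exists(url) reports no document."""
    return [url for url in links if not page_exists(url)]


# Offline demonstration: an invented directory page and an invented set of live pages.
LIVE_PAGES = {"http://a.example/", "http://b.example/"}
directory_page = [
    "http://a.example/",
    "http://gone.example/old.html",
    "http://b.example/",
]
dead = find_dead_links(directory_page, lambda url: url in LIVE_PAGES)
print(dead)
```

To check real addresses, http_page_exists could be passed in place of the lambda, for example find_dead_links(directory_page, http_page_exists).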

Social Networking Websites

Since 2006, social networking websites - like Myspace - have become increasingly popular. Social networking websites are categorised as a Web 2.0 service: a term which does not refer to a specific technical infrastructure, but to websites which interact with their users. Web 2.0 websites - like blogs and the blogosphere - usually allow users to comment on the content of a page, give it a rating, vote in a poll and so on. Social media takes blogs a step further by linking "friends" together to share opinions and web content. Social networking websites, like Facebook, have become so popular that they are often the first website a user opens and, as such, are an alternative "entry point" to search engines for finding web content. Social media websites also tend to work better on mobile devices than search engine results do: while a search engine may be compatible with a mobile device, the content it lists may not be. Many websites now list their content on social media websites like Facebook; users can therefore browse a wide breadth of content on Facebook, which is compatible with most devices.

Alternatives to Search Engines

Directories: ODP (Dmoz), Looksmart
Social Media: Bebo, LinkedIn, Google Plus, Myspace, Facebook, Tumblr, Twitter
Web Portals: About, Excite, Lycos, Yahoo!