
World Wide Web

Introduction

The World Wide Web is a nonproprietary hypertext document system that is available on the Internet. The World Wide Web is commonly confused with the Internet; however, the World Wide Web is simply a service accessed on the Internet. The World Wide Web is also referred to as: w3, web, www, and 'the web'. The World Wide Web was not the first hypertext system, but it was the first hypertext system that was successfully interconnected with the software systems (TCP/IP) of the Internet.

The World Wide Web was designed with the aim of giving users unfettered access to information, with no centre, and a technological leveling of hierarchy. Authority became horizontal; earlier information systems tended to be vertical, with a strong power structure. This, in part, is due to the World Wide Web supporting unidirectional links, where a resource can be linked to without the owner of the resource having given permission.

Berners-Lee managed to "tie the knot" between hypertext and the Internet by inventing or co-inventing three technologies:

  1. HTML (universal hypertext language to design Web documents)
  2. HTTP (protocol, using URLs, to retrieve data from Web servers)
  3. URLs (unique identifier for documents hosted on Web servers)

The World Wide Web is designed around a client-server model, which splits the workload between a server (which provides the data) and a client (which requests the data). While the server shares its resources with the client, the client simply accesses the resource and does not share any of its own resources. Therefore, the World Wide Web is a system composed of web servers (computers that host files) and web browsers (client programs that retrieve the files).

The World Wide Web is named a 'Web' because hypertext documents (webpages) are connected together - in a system likened to a 'Web' - through the use of hyperlinks. Hyperlinks are embedded into hypertext documents (webpages), and each hyperlink includes a URL. Uniform Resource Locators (URLs) are a type of Uniform Resource Identifier (URI), and are used by the Hypertext Transfer Protocol (HTTP) to locate the computer on which a webpage is stored.
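To illustrate how hyperlinks are embedded in a hypertext document, the following minimal sketch uses Python's standard library to collect the URLs found in a small page. The page content, the addresses, and the class name LinkCollector are assumptions made up for this example; a real browser does considerably more work.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the target URL of every <a href="..."> hyperlink in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the address of the page itself.
                self.links.append(urljoin(self.base_url, value))


# Hypothetical page and addresses, used only for illustration.
PAGE = """
<html><body>
  <p>See the <a href="/history.html">history</a> page or
  <a href="http://example.org/gopher.html">Gopher</a>.</p>
</body></html>
"""

collector = LinkCollector("http://example.com/index.html")
collector.feed(PAGE)
print(collector.links)
```

Each collected link is a complete URL, which is what a browser would pass to HTTP when the user clicks on the hyperlink.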

The World Wide Web is not the only online document retrieval system: Gopher is an example of another system.

History

The World Wide Web was invented by Sir Tim Berners-Lee. From 1989 to 1991, Berners-Lee proposed and developed the software systems of the World Wide Web. Berners-Lee invented the World Wide Web while he was employed at CERN. Berners-Lee had attempted to build a document system for CERN in the early 1980s: the system was named ENQUIRE. Just as with the World Wide Web, the aim of ENQUIRE was to help CERN scientists share information, and to avoid losing information.

Tim Berners-Lee was in the 'right place at the right time' to invent the World Wide Web. From 1983, the CERN Networking Group - whose members included Brian Carpenter, Giorgio Heiman, François Flückiger and Jean-Michel Jouanigot - began to build an internal and external networking infrastructure. The CERN Networking Group decided to implement TCP/IP instead of the ISO networking standard, and by 1991, the CERN external network was a hub for international Internet traffic. It has been claimed that, by 1991, the CERN network was responsible for handling 80% of Europe's international Internet traffic. Therefore, CERN was the ideal location to launch a new Internet service.

Berners-Lee proposed a new 'hypertext project' on the 12th of March, 1989. The proposal paper, titled 'Information Management: A Proposal', described 'a large hypertext database with typed links'. This proposal was not successful, and it led to Berners-Lee asking for assistance from Robert Cailliau to develop a more 'concrete' proposal. The proposal they developed was named 'WorldWideWeb: Proposal for a HyperText Project', and it was published on the 12th of November, 1990. This proposal was successful, and Cailliau and Berners-Lee began the development of the software systems for their new hypertext project.

The project to develop the World Wide Web was originally referred to as the CERN WWW Project, and members of the WWW Project included: Alain Favre, Arthur Secret, Bebo White, Bernd Pollermann, Carl Barker, Dan Connolly, David Foster, Eelco van Asperen, James Whitescarver, Jean-Francois Groff, Jonathan Streets, Nicola Pellow, Peter Dobberstein, Paul Kunz, Pei Wei, Robert Cailliau, Tim Berners-Lee, Tony Johnson and Willem van Leeuwen.

Berners-Lee decided that the hypertext documents of the World Wide Web would be read-only, and accessed through a client-server architecture (by browsers). The first World Wide Web server (used by Berners-Lee) was a 'NeXT' computer, and Berners-Lee developed the first web server software, named CERN httpd. Since then, a plethora of HTTP server software, such as Apache, has simplified the process of hosting web servers and has helped to popularise the World Wide Web.

Berners-Lee was also responsible for designing the first web browser, unsurprisingly named WorldWideWeb. The second browser created - a version of the WorldWideWeb browser that was ported to several operating systems - was the Line Mode Browser, and it was designed by Tim Berners-Lee, Henrik Frystyk Nielsen and Nicola Pellow. In 1992, Robert Cailliau and Nicola Pellow developed the first browser for the Macintosh platform, named MacWWW. Pei Wei, a member of the WWW project - like Cailliau and Pellow - went on to develop the ViolaWWW hypertext browser.

The World Wide Web was first available as an Internet service on the 6th of August, 1991, when Berners-Lee released information about his "Hypertext project" on the newsgroup alt.hypertext. However, Berners-Lee had launched the first web server on the 25th of December, 1990: some simple webpages were available for download, but only a select number of people knew the project existed at that date.

On the 30th of April, 1993, CERN made the World Wide Web's software - such as a library of code - publicly available, with the aim of increasing its popularity. CERN also announced that the World Wide Web would be free to use, and that no license fee would be charged to developers (unlike Gopher, for which a license fee was charged to host a server). In October 1994, Berners-Lee left CERN and founded the World Wide Web Consortium. The role of the World Wide Web Consortium is to create new web standards and promote the compatibility of web standards for developers.

The World Wide Web was fortunate that it was launched at the same time that the Internet was being transitioned from a U.S. federally funded network to a commercial network. By 1995, the World Wide Web had become the Internet's most popular service. By the late 1990s, the popularity of the World Wide Web had spawned the development of large technology companies - Yahoo!, PayPal, eBay - whose primary focus was developing content for the World Wide Web, or selling products on it. The growth of Internet use led to the dot com bubble, where the stocks of Internet companies soared in value (1997-2000) and then crashed.

HTTP and Internet Protocols

The World Wide Web is a service/application found on the Internet. The Internet is a system of interconnected computer networks that use TCP/IP (the Internet protocol suite) to communicate. The Internet protocol suite has four layers, and its highest layer is the application layer. The Hypertext Transfer Protocol (HTTP) is part of the application layer of the Internet protocol suite.

The World Wide Web could not function without HTTP. HTTP is a 'request-response' protocol, which means that one computer sends a request for data and another computer responds to the request. The World Wide Web is based upon a client-server computing model: a client application program (browser), residing on a user's computer, will use HTTP to send a request to retrieve data from a server connected to the Internet.
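The request-response exchange can be sketched with Python's standard library. In this minimal example, which is only an illustration and not how any particular web server is implemented, a tiny server and a client run on the same computer; the handler name HelloHandler, the function fetch_once, the path /index.html, and the page content are all assumptions chosen for the example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Server side: answers every GET request with a small HTML page."""

    def do_GET(self):
        body = b"<html><body>Hello, Web</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # suppress the default request log to keep the example quiet


def fetch_once(path="/index.html"):
    # Port 0 asks the operating system for any free port on this machine.
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        # Client side: send one request and read the response.
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
        conn.request("GET", path)
        response = conn.getresponse()
        status, body = response.status, response.read()
        conn.close()
        return status, body
    finally:
        server.shutdown()
        server.server_close()


status, body = fetch_once()
print(status, body.decode())
```

The client only asks for a resource and the server only answers; neither side needs to know how the other is implemented, which is the essence of the client-server model.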

Files hosted at a web server are retrieved by a client program (browser) using a Uniform Resource Identifier (URI). The World Wide Web uses a type of URI that is named the Uniform Resource Locator (URL). URLs are embedded within hyperlinks: hyperlinks, referred to as links, are embedded into webpages (hypertext documents), and a user simply has to click on a hyperlink and a browser will use HTTP to locate the resource.
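As a brief illustration of what a URL contains, the sketch below splits a URL into its parts with Python's standard urllib.parse module. The address shown is a made-up example, not a real resource.

```python
from urllib.parse import urlsplit

# Hypothetical URL, used only for illustration.
parts = urlsplit("http://www.example.com:8080/docs/index.html?lang=en#history")

print("scheme  :", parts.scheme)    # protocol used to retrieve the resource
print("host    :", parts.hostname)  # computer (web server) that hosts it
print("port    :", parts.port)      # port the server listens on
print("path    :", parts.path)      # location of the file on that server
print("query   :", parts.query)     # optional parameters sent to the server
print("fragment:", parts.fragment)  # optional position within the document
```

The scheme and host tell the browser which protocol to use and which server to contact, while the path identifies the file being requested.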

When a client program uses HTTP to locate and retrieve a resource, HTTP relies on other Internet protocols in the process of retrieving data from a server. HTTP, through a process of encapsulation, uses protocols in the transport layer of the Internet protocol suite, most notably the Transmission Control Protocol (TCP). TCP ensures that data is reliably sent and received on behalf of an application layer protocol. While TCP will use other Internet protocols, HTTP is unaware of, and unconcerned with, how TCP functions.
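The layering described above can be made visible by writing an HTTP message by hand and sending it over a plain TCP connection. The following is a minimal sketch under stated assumptions: the toy server (serve_one), the function raw_http_get, the requested path, and the fixed response are all invented for the example, and both ends run on the local computer.

```python
import socket
import threading


def serve_one(listener):
    """Toy server: accept one TCP connection and reply with a fixed HTTP response."""
    conn, _addr = listener.accept()
    with conn:
        conn.recv(4096)  # read the request bytes (ignored by this toy server)
        body = b"hello"
        conn.sendall(
            b"HTTP/1.0 200 OK\r\n"
            b"Content-Type: text/plain\r\n"
            b"Content-Length: 5\r\n"
            b"\r\n" + body
        )


def raw_http_get():
    # Port 0 asks the operating system for any free port on this machine.
    listener = socket.create_server(("127.0.0.1", 0))
    port = listener.getsockname()[1]
    threading.Thread(target=serve_one, args=(listener,), daemon=True).start()

    # The HTTP request is plain text; TCP carries it as a stream of bytes.
    request = b"GET /index.html HTTP/1.0\r\nHost: 127.0.0.1\r\n\r\n"
    with socket.create_connection(("127.0.0.1", port), timeout=5) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closed the connection
                break
            chunks.append(data)
    listener.close()
    return b"".join(chunks)


raw = raw_http_get()
headers, _, body = raw.partition(b"\r\n\r\n")
print(headers.decode().splitlines()[0])
print(body.decode())
```

The code that writes the HTTP text never deals with packets, retransmission, or ordering; those concerns are left entirely to TCP and the layers beneath it.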

The Internet Protocol (IP) is ultimately responsible for the transmission of data between the computer sending a request and the computer responding to the request. IP uses an IP address (a 32-bit or 128-bit number) to identify a host computer, and uses a datagram service to transmit data between IP addresses. IP itself does not guarantee delivery; TCP adds reliability on top of IP by detecting lost data packets and having them resent.
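The statement that an IP address is a 32-bit or 128-bit number can be checked with Python's standard ipaddress module. The two addresses below are taken from ranges reserved for documentation, and the helper name describe is an assumption made for this sketch.

```python
import ipaddress


def describe(text):
    """Return (IP version, size in bits, integer value) for an address string."""
    address = ipaddress.ip_address(text)
    return address.version, address.max_prefixlen, int(address)


for text in ("192.0.2.1", "2001:db8::1"):
    version, bits, value = describe(text)
    print(f"{text}: IPv{version}, {bits}-bit number, integer value {value}")
```

The familiar dotted or colon-separated notation is only a human-readable way of writing the underlying number.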

Content on the World Wide Web

If a user wants to upload data to the World Wide Web, then they need to upload it to a web server: a web server is a computer connected to the Internet that allows files on its hard disk to be retrieved using HTTP. Users can upload data to web servers hosted by other users; however, a user has very little control over the data they upload to such a server.

The most common files uploaded to a web server are webpages (HTML documents). When a collection of webpages is located at a single web domain, it is referred to as a website. If a user wishes to create their own website, then they need to either set up their own web server or rent space from a third party web server (which can be either a shared or a dedicated server).

The majority of people who create a website rent 'space' from a web hosting company. Once a user has rented 'space' at a web server, they have the ability to upload files to the server. If a user wishes to create a website, then they will need to create webpages (hypertext documents) for it. Once a user has uploaded webpages to a web server, other Internet users can access these pages by using a Uniform Resource Locator (URL).

While a URL can use an IP address to locate a webpage, the problem with IP addresses is that they are difficult to remember. A domain name is preferable to an IP address: domain names are easier to remember, and if a user changes the server they host their data on, then the data can still be found using the domain name (the domain name is tied to an IP address, and the IP address it is tied to can be changed).
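The indirection a domain name provides can be sketched with a toy lookup table. This is not how the real Domain Name System is implemented; the table dns_table, the functions resolve and server_address_for, the domain name, and both IP addresses are hypothetical and exist only to show that the URL stays the same while the address behind it changes.

```python
from urllib.parse import urlsplit

# Hypothetical name-to-address table standing in for the Domain Name System.
dns_table = {"www.example.com": "192.0.2.10"}


def resolve(hostname):
    """Look up the IP address currently tied to a domain name."""
    return dns_table[hostname]


def server_address_for(url):
    """Return the IP address of the server that hosts the given URL."""
    return resolve(urlsplit(url).hostname)


url = "http://www.example.com/index.html"
before = server_address_for(url)

# The site owner moves to a different server: only the table entry changes.
dns_table["www.example.com"] = "198.51.100.7"
after = server_address_for(url)

print(url, "was served from", before, "and is now served from", after)
```

Visitors keep using the same URL throughout; only the record that ties the name to an address has to be updated.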

The content of webpages falls into two broad categories: commercial and non-commercial. The World Wide Web has spawned many successful online businesses, which are referred to as e-tailers (e-tailing) and e-commerce businesses. Some notable e-commerce businesses are Amazon, eBay, and PayPal. Virtual currencies, such as Bitcoin, are also used for some online transactions.

The world's largest technology corporations are typically located in Silicon Valley (California); whereas a number of UK technology companies can be found in Silicon Glen (Scotland).

Accessing and finding Web content

As has already been stated, a user needs a web browser to access and read web content; the browser retrieves web documents/files by using HTTP over TCP/IP and also renders the document. Once a user has a web browser and is 'surfing' the web, they need to find content: the primary way of doing that is by using websites called search engines.

Privacy and the World Wide Web

While there is no requirement to record browsing history on the World Wide Web, the majority of browsing is typically recorded to some degree. The World Wide Web is based upon a client-server model: a client program (browser) requests and retrieves files (webpages, pictures) from a web server (a computer connected to the Internet). Therefore, whenever a file is requested and retrieved on the World Wide Web, the client and the server usually both record the session.

When a user requests a file (webpage) from a web server, the server usually records this request and the IP address of the user requesting the data. Likewise, the browser (client) of the user will usually record the data transmission by keeping a copy of the retrieved files in its cache (directory) and keeping a record in the history feature of the browser. The Internet Service Provider (ISP) - the network the user accesses the Internet with - will typically keep a record of the files a user accesses from the Internet. However, the internal policies of ISPs differ: it is difficult to know precisely what an ISP will record and store, and for how long. ISPs will generally only share their user logs when required to do so by a legal authority.

Alongside server logs, websites can also record the browsing habits of their users by using HTTP cookies (invented by Louis Montulli). Cookies are small files, stored on the user's computer, that store information such as username authentication, password authentication, and past browsing history. Therefore, if a user returns to a commercial website - for example eBay - the user will not be required to enter their username and password again, and the website can serve content to the user that is related to the content they viewed the last time they visited the website. Users can delete cookies whenever they wish, and the typical (first party) cookie does not pose a serious privacy risk, especially if the user has not provided personally identifiable information to the website. However, tracking cookies, referred to as third party cookies, can compile a long term record of a user's browsing history - as they record browsing habits at multiple websites - and are sometimes viewed as malware.
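The way a cookie travels between server and browser can be sketched with Python's standard http.cookies module. The cookie name session_id and its value are assumptions made up for this example, and the browser's role is simulated by a single line of code rather than by a real browser.

```python
from http.cookies import SimpleCookie

# Server side: build the value of the Set-Cookie header sent with a response.
outgoing = SimpleCookie()
outgoing["session_id"] = "abc123"    # hypothetical cookie name and value
outgoing["session_id"]["path"] = "/"
set_cookie_header = outgoing["session_id"].OutputString()
print("Set-Cookie:", set_cookie_header)

# Client side (simulated): the browser stores the name=value pair and sends
# it back in the Cookie header of later requests to the same website.
cookie_header = f"session_id={outgoing['session_id'].value}"
print("Cookie:", cookie_header)

# Server side, on the next request: parse the Cookie header to recognise
# the returning user.
incoming = SimpleCookie()
incoming.load(cookie_header)
print("Recognised session:", incoming["session_id"].value)
```

Because the browser returns the stored value on each visit, the website can link separate requests to the same user, which is what makes both convenient logins and tracking possible.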

Most websites have a 'privacy policy' that will usually promise to keep a user's browsing history secret. A problem arises when a user creates an account at a website and the account publicly provides personal details about the user. The most obvious example is social media websites, like Facebook, where a user will publicly share their name and location, amongst other personal data. While social media websites do have privacy settings, the amount of information a user shares is usually extensive, and the impact it may have upon a person's privacy and 'real life' is far greater than with other websites.

Further Reading:

1. WWW Error Messages: understand the error messages for unavailable Web pages.

Frequently Asked Questions

Do I need special software to view Web pages?
What options do I have to surf the Web?
Is the WWW prefix required for a URL?
Did Tim Berners-Lee invent the World Wide Web?
Was the Web inspired by similar projects?
Was the Web always free?
Did Berners-Lee create the World Wide Web Consortium?
Is the Deep Web akin to the Invisible Web?
Is the Web 2.0 a new World Wide Web?