World Wide Web

Introduction

The World Wide Web is a nonproprietary hypertext document system that is available on the Internet. It is also referred to as W3, WWW, and 'the Web'. The World Wide Web was not the first hypertext system, but it was the first hypertext system that was successfully interconnected with the software systems (TCP/IP) of the Internet. The World Wide Web was designed with the aim of giving users unfettered access to information, with no centre, and a technological levelling of hierarchy. Authority became horizontal; earlier information systems tended to be vertical, with a strong power structure. This is due, in part, to the World Wide Web supporting unidirectional links, where a resource can be linked to without its owner having given permission. Tim Berners-Lee managed to "tie the knot" between hypertext and the Internet by inventing or co-inventing three technologies:

  1. HTML (universal hypertext language to design Web documents)
  2. HTTP (protocol, using URLs, to retrieve data from Web servers)
  3. URLs (unique identifier for documents hosted on Web servers)

The World Wide Web is designed around a client-server model, which splits the workload between a server (which provides the data) and a client (which requests the data). While the server shares its resources with the client, the client usually only accesses those resources and does not share any of its own. The World Wide Web is therefore a system composed of web servers (computers that host files) and web browsers (client programs that retrieve the files). The World Wide Web is named a 'Web' because hypertext documents (webpages) are connected together, in a system likened to a web, through the use of hyperlinks. Hyperlinks are text or images that are embedded into hypertext documents (webpages) and include a URL. Uniform Resource Locators (URLs) are a type of Uniform Resource Identifier (URI), and are used with the Hypertext Transfer Protocol (HTTP) to locate the computer on which a webpage is stored.
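To make the role of a URL more concrete, the following is a minimal Python sketch that splits a URL into the parts a browser would need to locate a resource. The function name describe_url and the address used are illustrative assumptions (www.example.org is a reserved documentation domain), and the default ports shown are the conventional ones for HTTP and HTTPS.

```python
from urllib.parse import urlsplit

def describe_url(url: str) -> dict:
    """Split a URL into the parts a browser needs to locate a resource."""
    parts = urlsplit(url)
    default_ports = {"http": 80, "https": 443}
    return {
        "scheme": parts.scheme,   # which protocol to use
        "host": parts.hostname,   # which server hosts the file
        "port": parts.port or default_ports.get(parts.scheme),
        "path": parts.path or "/",  # which file on that server
    }

# Placeholder URL, not a real page from this guide.
print(describe_url("http://www.example.org/history/index.html"))
```

The host part identifies the computer, while the path identifies the file on that computer.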

The World Wide Web is not the only online document retrieval system: Gopher is an example of another. The World Wide Web is commonly confused with the Internet, but it is a service accessed over the Internet. The Internet existed before the World Wide Web and could continue to exist without it. That said, the World Wide Web is the most popular service used on the Internet, and it is essential to many 'real world' civil and business services.

History

The World Wide Web was invented by Sir Tim Berners-Lee: between 1989 and 1991, he proposed and developed the software systems of the World Wide Web while he was employed at CERN. Berners-Lee had already attempted to build a document system for CERN in the early 1980s: the system was named ENQUIRE, and its purpose was to help CERN scientists share information and avoid losing it. Berners-Lee was in the 'right place at the right time' to invent the World Wide Web: from 1983, the CERN Networking Group, whose members included Brian Carpenter, Giorgio Heiman, François Flückiger, and Jean-Michel Jouanigot, had begun to build an internal and external networking infrastructure. The CERN Networking Group decided to implement TCP/IP instead of the ISO networking standard, and by 1991 the CERN external network was a hub for international Internet traffic, handling up to 80% of Europe's international Internet traffic. CERN was therefore the ideal location to launch a new Internet service.

Berners-Lee proposed his new 'hypertext project' on the 12th of March, 1989; the paper, 'Information Management: A Proposal', described 'a large hypertext database with typed links'. This proposal was not successful, and it led Berners-Lee to ask Robert Cailliau for assistance in developing a more 'concrete' proposal. The proposal they developed was named 'WorldWideWeb: Proposal for a HyperText Project', and it was published on the 12th of November, 1990. This proposal was 'green lit', and Cailliau and Berners-Lee began building a development team to design the software systems for the new hypertext project. The project was originally referred to as the CERN WWW Project, and its members included: Alain Favre, Arthur Secret, Bebo White, Bernd Pollermann, Carl Barker, Dan Connolly, David Foster, Eelco van Asperen, James Whitescarver, Jean-Francois Groff, Jonathan Streets, Nicola Pellow, Peter Dobberstein, Paul Kunz, Pei Wei, Robert Cailliau, Tim Berners-Lee, Tony Johnson, and Willem van Leeuwen.

Berners-Lee decided that the hypertext documents of the World Wide Web would be read-only and accessed through a client-server architecture, with browsers as the clients. The first World Wide Web server (used by Berners-Lee) was a NeXT computer, and the first web server software was named CERN httpd. Since then, a plethora of HTTP server software, such as Apache, has simplified the process of hosting web servers and helped to popularise the World Wide Web. Berners-Lee was responsible for designing the first web browser, unsurprisingly named WorldWideWeb. The second browser created, a version of the WorldWideWeb browser that was ported to several operating systems, was the Line Mode Browser; it was designed by Tim Berners-Lee, Henrik Frystyk Nielsen, and Nicola Pellow. In 1992, Robert Cailliau and Nicola Pellow developed the first browser for the Macintosh platform, named MacWWW. Pei Wei, another member of the WWW Project, developed the ViolaWWW hypertext browser.

The World Wide Web was first available as an Internet service on the 6th of August, 1991, when Berners-Lee released information about his "Hypertext project" on the newsgroup alt.hypertext. However, Berners-Lee had launched the first web server on the 25th of December, 1990; some simple webpages were available for download, but only a select number of people knew the project existed. On the 30th of April, 1993, CERN made the World Wide Web's software, such as a library of code, publicly available, with the aim of increasing its popularity. CERN also announced that the World Wide Web would be free to use, and that no license fee would be charged to developers (unlike Gopher, for which a license fee was charged to host a server). In October 1994, Berners-Lee left CERN and founded the World Wide Web Consortium; the role of the World Wide Web Consortium is to create new web standards and to promote the compatibility of web standards for developers.

The World Wide Web was fortunate that it was launched at the same time that the Internet was transitioning from a U.S. government funded network to a commercial network. By 1995, the World Wide Web was the Internet's most popular service, and it went on to give rise to large tech companies like Amazon, Yahoo!, PayPal, and eBay. The early growth of the Internet and the World Wide Web led to the dot-com bubble, in which the stocks of Internet companies soared in value (1997-2000) and then crashed.

HTTP and Internet Protocols

The World Wide Web is a service (application) found on the Internet; the Internet is a system of interconnected computer networks that use TCP/IP (the Internet protocol suite) to communicate. The Internet protocol suite has four layers, and its highest layer is the application layer. The Hypertext Transfer Protocol (HTTP) is part of the application layer of the Internet protocol suite. The World Wide Web could not function without HTTP: HTTP is a 'request-response' protocol, which means that one computer sends a request for data and another computer responds to the request.
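The request-response pattern can be sketched with Python's standard library alone. This is only an illustration under stated assumptions: the toy server below runs on the local machine (127.0.0.1), answers every request with the same made-up page, and the names PageHandler and fetch_once are invented for this example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<html><body><h1>Hello, Web</h1></body></html>"

class PageHandler(BaseHTTPRequestHandler):
    """Toy web server: answers every GET request with the same page."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # keep the example quiet

def fetch_once(path: str = "/"):
    """Start the toy server, send one request, and return (status, body)."""
    server = HTTPServer(("127.0.0.1", 0), PageHandler)  # port 0 = any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
        conn.request("GET", path)      # the client's request
        response = conn.getresponse()  # the server's response
        body = response.read()
        conn.close()
        return response.status, body
    finally:
        server.shutdown()
        server.server_close()

status, body = fetch_once("/index.html")
print(status, body.decode())
```

The client never receives data it did not ask for: every response is the answer to exactly one request.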

The World Wide Web is based upon a client-server computing model: a client application program (browser), residing on a user's computer, uses HTTP to send a request to retrieve data from a web server connected to the Internet. Computer files are retrieved by the client program (browser) using a Uniform Resource Identifier (URI); the type of URI used by the World Wide Web is named the Uniform Resource Locator (URL). URLs are embedded within hyperlinks: hyperlinks, usually referred to as 'links', are embedded into webpages (hypertext documents), and a user simply has to click on a hyperlink for the browser to use HTTP to locate the resource.
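As a rough illustration of how URLs sit inside hyperlinks, the sketch below collects the URL from each link in a small piece of HTML, resolving relative links against the address of the page itself. The class name LinkCollector, the function extract_links, the sample markup, and the example.com/example.org addresses are all assumptions made for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkCollector(HTMLParser):
    """Collect the URL carried by every <a href="..."> hyperlink in a page."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Relative links are resolved against the page's own URL.
                self.links.append(urljoin(self.base_url, href))

def extract_links(html: str, base_url: str) -> list:
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links

page = (
    '<p>See the <a href="history.html">history</a> and the '
    '<a href="http://example.org/faq">FAQ</a>.</p>'
)
print(extract_links(page, "http://example.com/web/index.html"))
```

A browser does something similar when a link is clicked: it works out the full URL and then issues an HTTP request for it.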

When a client program (browser) uses HTTP to locate and retrieve a computer file, more than one Internet protocol is involved in retrieving the data from the server. Through a process of encapsulation, HTTP typically uses the Transmission Control Protocol (TCP) of the transport layer of the Internet protocol suite: TCP ensures that application layer data is reliably sent and received. The TCP data segments are then encapsulated (enveloped) in Internet Protocol (IP) packets, which are in turn encapsulated in link layer frames as they 'hop' across the Internet from host to host. The process is likened to a letter being placed inside an envelope, which is placed inside another envelope, which is placed within a final envelope.
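The envelope analogy can be modelled in a few lines of Python. This is a deliberately simplified sketch: the bracketed labels stand in for real TCP, IP, and link layer headers, which are binary structures with many fields, and the function names encapsulate and decapsulate are invented for the example.

```python
def encapsulate(http_message: bytes) -> bytes:
    """Wrap an HTTP message in toy TCP, IP, and link layer 'envelopes'.

    The bracketed labels are placeholders, not real header formats.
    """
    tcp_segment = b"[TCP]" + http_message  # transport layer
    ip_packet = b"[IP]" + tcp_segment      # internet layer
    frame = b"[FRAME]" + ip_packet         # link layer
    return frame

def decapsulate(frame: bytes) -> bytes:
    """Open the envelopes in reverse order on the receiving host."""
    for label in (b"[FRAME]", b"[IP]", b"[TCP]"):
        if not frame.startswith(label):
            raise ValueError(f"expected {label!r} header")
        frame = frame[len(label):]
    return frame

request = b"GET /index.html HTTP/1.1\r\nHost: example.org\r\n\r\n"
frame = encapsulate(request)
print(frame[:16])
print(decapsulate(frame) == request)
```

Each layer only reads its own envelope, which is why HTTP does not need to know how the frame physically travels between hosts.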

Content on the World Wide Web

If a user wants to upload content to the World Wide Web, then they need to upload it to a web server: a web server is a computer system that processes requests via HTTP. The most common files uploaded to a web server are image files (gif, jpeg) and html documents. The next issue is how users access the files and content located on a web server: one option is via the web server's IP address, but the most common way is through the Domain Name System (DNS). A domain name is registered, then DNS records are created that tie the domain name to the web server: all users need to do is enter the domain name to locate the files and content uploaded to the web server.
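The way DNS records tie a domain name to a web server can be sketched with a small lookup table. Everything in this example is hypothetical: the record table, the resolve function, the example.org names, and the address 192.0.2.10 (which comes from a range reserved for documentation). A real lookup would query DNS servers instead of a dictionary.

```python
# Hypothetical records for a made-up zone; not real DNS data.
DNS_RECORDS = {
    "example.org": ("A", "192.0.2.10"),           # name -> IP address
    "www.example.org": ("CNAME", "example.org"),  # alias -> another name
}

def resolve(name: str, records=DNS_RECORDS, max_hops: int = 5) -> str:
    """Follow CNAME aliases until an A record yields an IP address."""
    for _ in range(max_hops):
        if name not in records:
            raise LookupError(f"no record for {name}")
        record_type, value = records[name]
        if record_type == "A":
            return value
        name = value  # CNAME: look up the alias target next
    raise LookupError("too many CNAME hops")

print(resolve("www.example.org"))
```

Both names lead to the same address, which is how a website can be reached with or without a prefix when its records are set up that way.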

When a collection of webpages (typically html documents, including an index.html file) is uploaded to a web server that is tied to a domain name, the overall resource of content is termed a website. If a user wishes to create their own website, all they need to do is set up or rent a web server, register a domain name, and upload web content to the web server. Users can then access the web content by entering its URL (which includes the domain name) into a browser, and HTTP will retrieve the content. A rented web server will usually have a monthly bandwidth limit; if this bandwidth (download limit) is exceeded, then the content will be unreachable and requests for the content will be served with an error message.

Web content 'falls' into two broad categories: commercial and noncommercial. The World Wide Web has spawned many successful online commercial businesses, which are referred to as e-tailers (e-tailing) and e-commerce businesses. Some notable e-commerce businesses are Amazon, eBay, and PayPal; many such businesses are located in California, the state in which ARPANET (the network that evolved into the Internet) was launched. Virtual currencies, such as Bitcoin, have also emerged online. While many of the world's largest technology corporations are located in a tech centre named Silicon Valley (California), UK technology companies can be found in Silicon Glen (Scotland) and Silicon Roundabout (East London Tech City).

Accessing and finding Web content

As has already been stated, a user needs a web browser to access and read web content; the browser retrieves web documents and files by using HTTP over TCP/IP, and it also renders the document. Once a user has a web browser and is 'surfing' the web, they need to find content: the primary way of doing that is by using websites called search engines.

Privacy and the World Wide Web

While there is no requirement to record browsing history on the World Wide Web, the majority of browsing is recorded to some degree. The World Wide Web is based upon a client-server model: a client program (browser) requests and retrieves files (webpages, pictures) from a web server (a computer connected to the Internet). Therefore, whenever a file is requested and retrieved on the World Wide Web, the client and the server usually both record the session. Web servers run software that usually logs the IP address of all incoming requests for data. Likewise, the browser (client) of the user will usually record the data transmission by keeping a copy of the retrieved files in its cache (directory) and keeping a record in the history feature of the browser. Internet Service Providers (the networks through which users access the Internet) will also keep a record of each user's usage. The internal policies of ISPs differ: it is difficult to know precisely what an ISP will record and store, and for how long. ISPs will typically only share their user logs when it is demanded by a legal entity.
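To show what a server log entry might contain, the sketch below parses one line written in the Common Log Format, a layout that many web servers can produce. The log line itself is made up for the example (192.0.2.7 is a documentation address), and the names LOG_PATTERN and parse_log_line are assumptions; real servers may be configured to record more or fewer fields.

```python
import re

# A made-up entry in the Common Log Format.
LOG_LINE = (
    '192.0.2.7 - - [10/Oct/2000:13:55:36 -0700] '
    '"GET /index.html HTTP/1.0" 200 1043'
)

LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)

def parse_log_line(line: str) -> dict:
    """Return the fields of one log entry as a dictionary of strings."""
    match = LOG_PATTERN.match(line)
    if match is None:
        raise ValueError("not a Common Log Format line")
    return match.groupdict()

entry = parse_log_line(LOG_LINE)
print(entry["ip"], entry["path"], entry["status"])
```

Even this short entry records who asked (by IP address), what they asked for, and when.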

Alongside server logs, websites can also record the browsing habits of their users by using HTTP cookies (invented by Louis Montulli). Cookies are small files, stored on the user's computer, that hold information such as username authentication, password authentication, and past browsing history. Therefore, if a user returns to a commercial website (for example, eBay), the user will not be required to enter their username and password again, and the website can serve content that is related to the content they viewed the last time they visited. Users can delete cookies whenever they wish, and the typical (first party) cookie does not pose a serious privacy risk, especially if the user has not provided personally identifiable information to the website. However, tracking cookies, referred to as third party cookies, can compile a long term record of a user's browsing history, as they record browsing habits at multiple websites, and are sometimes viewed as malware.
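The exchange of a cookie between a website and a browser can be sketched with Python's standard http.cookies module. The cookie names (session, theme), their values, and the two helper functions are assumptions made for the example; they only illustrate the header a server might send and the header a browser might send back on a later visit.

```python
from http.cookies import SimpleCookie

def make_set_cookie_header(session_id: str) -> str:
    """Build the value of a Set-Cookie header sent with a response."""
    cookie = SimpleCookie()
    cookie["session"] = session_id
    cookie["session"]["path"] = "/"
    cookie["session"]["max-age"] = 3600  # ask the browser to keep it for an hour
    return cookie["session"].OutputString()

def read_cookie_header(header: str) -> dict:
    """Parse the Cookie header a browser sends on a later visit."""
    cookie = SimpleCookie()
    cookie.load(header)
    return {name: morsel.value for name, morsel in cookie.items()}

print(make_set_cookie_header("abc123"))
print(read_cookie_header("session=abc123; theme=dark"))
```

Because the browser returns the stored value with each later request, the website can recognise a returning user without asking them to log in again.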

Most websites have a 'privacy policy' that typically promises to keep users' personal details and usage history secret, though sometimes they will share this data with third parties, which should be disclosed in the 'terms of use' of the website. Social media websites (Twitter, Facebook, LinkedIn, Google+) are by their nature more open when it comes to privacy, with most users openly sharing information publicly. While social media websites do include privacy options, many users are probably unaware that the information they upload to these sites is often data mined to identify patterns and establish relationships (usually related to targeted adverts). Additionally, due to the extensive amount of personal information a user shares on social media, the long term impact it may have upon a person's 'real life' is far greater than with other types of websites.

Further Reading:

1. WWW Error Messages: understand the error messages for unavailable Web pages.

Frequently Asked Questions

Do I need special software to view Web pages?
What options do I have to surf the Web?
Is the WWW prefix required for a URL?
Did Tim Berners-Lee invent the World Wide Web?
Was the Web inspired by similar projects?
Was the Web always free?
Did Berners-Lee create the World Wide Web Consortium?
Is the Deep Web akin to the Invisible Web?
Is the Web 2.0 a new World Wide Web?