
Transmission Control Protocol (TCP)

Last Edit: 13/06/17

Introduction

The Transmission Control Protocol (TCP) is a reliable host-to-host protocol that was designed and developed in the 1970s as part of the TCP/IP protocol suite. Its specification was published in 1981 as RFC 793, which describes it as the "DoD Standard Transmission Control Protocol (TCP)". TCP was designed for use on packet-switching networks and for sending data reliably between hosts; hosts (computers connected to the network) are the source and destination of packets (blocks of data). When its specification was published in 1981, TCP was named a DoD (Department of Defense) communication protocol standard. RFC 793 states that the motivation behind TCP/IP was to provide robust communication that could easily connect strategic and tactical military computer networks; TCP's role was to provide end-to-end reliability for multi-network applications.

In 1972, Bob Kahn joined the Information Processing Techniques Office (IPTO) of DARPA (an agency of the U.S. DoD) and started a project to design a new communications protocol that would improve upon and replace the Network Control Program (NCP). Kahn invited Vint Cerf to help him design this new protocol, which was intended to provide an open, simple network design. The emphasis was on making it easier to interconnect networks, and on placing responsibility for the delivery of data and for complex functions on the host computers (the edge of the network) rather than on the central network architecture.

The eventual result was TCP/IP, featuring a set of protocols placed across a four-layer model: Application layer > Transport layer > Internet layer > Link layer. RFC 793 states that "very few assumptions are made as to the reliability of the communication protocols below the TCP [transport] layer." When the specification for TCP was published in 1981, these layers were referred to as: higher-level > TCP > internet protocol > communication network. While TCP/IP was designed for use on military networks, its civilian use has overtaken its military use, and DoD TCP/IP was eventually renamed the Internet protocol suite. The Internet protocol suite and TCP are now maintained by the Internet Engineering Task Force (IETF), an organisation of volunteers who form working groups to design Internet protocols. TCP extensions are still being proposed and developed to improve its efficiency, for example: TCP Cookie Transactions (TCPCT), TCP Fast Open and Proportional Rate Reduction (PRR).

Function

The Internet and TCP/IP are based upon packet switching, where small blocks of data, known as packets (also referred to as datagrams), are routed from one location to another. The size and structure of packets are defined by the Internet Protocol (IP) (IPv4 and IPv6), and this protocol also provides an address syntax so that a packet can be sent from a source location to a destination location. The problem with IP is that it is unreliable: it does not guarantee that the data received will be identical to the data sent, because packets can be lost, damaged, duplicated or delivered out of order as a result of network congestion, load balancing and other network issues. The payload of an IP datagram (packet) can carry a TCP segment. TCP detects when any of these problems occur and corrects them.

When application layer software (web, email etc.) requires the use of TCP, the sequence of bytes (octets) to be sent is divided into TCP segments and these segments are then enclosed in IP datagrams - a process referred to as encapsulation - and IP is then responsible for delivering the data. TCP assigns a sequence number to each octet of data, and these sequence numbers, alongside an acknowledgment number, ensure that the segments are ordered correctly and that lost data can be detected. Each segment also includes a checksum to detect data that was damaged in transit. Sequence numbers also help TCP provide flow control by creating a 'window': only a certain range of sequence numbers will be accepted by the destination host at any one time. Ports are used by TCP to enable multiplexing, where multiple application layer protocols can use TCP at once because each is assigned its own port number. Sequence numbers are also used in the handshake mechanism by which a connection between hosts is established and closed. Buffers are used at each end to hold the data while segments are assembled and reordered.
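The segmentation and reordering described above can be sketched in a few lines of Python. This is only an illustration, not how a real TCP stack is written: the function names `segment` and `reassemble`, the initial sequence number of 1000 and the 8-octet segment size are assumptions chosen for the example, and retransmission, acknowledgments and the window are left out.

```python
import random

SEQ_SPACE = 2**32  # sequence numbers are 32-bit values and wrap around


def segment(data: bytes, initial_seq: int, max_size: int):
    """Split a byte stream into (sequence_number, payload) pairs.

    Each segment is labelled with the sequence number of its first octet.
    """
    return [
        ((initial_seq + offset) % SEQ_SPACE, data[offset:offset + max_size])
        for offset in range(0, len(data), max_size)
    ]


def reassemble(segments, initial_seq: int) -> bytes:
    """Rebuild the stream from segments that arrive shuffled or duplicated."""
    pending = {seq: payload for seq, payload in segments}  # duplicates collapse
    expected = initial_seq  # next octet the receiver is waiting for
    stream = bytearray()
    while expected in pending:
        payload = pending.pop(expected)
        stream += payload
        expected = (expected + len(payload)) % SEQ_SPACE
    return bytes(stream)


message = b"The quick brown fox jumps over the lazy dog"
sent = segment(message, initial_seq=1000, max_size=8)
arrived = sent + [sent[2]]   # simulate a duplicated segment
random.shuffle(arrived)      # simulate out-of-order delivery
assert reassemble(arrived, initial_seq=1000) == message
```

If a segment were missing, `reassemble` would stop at the gap; in real TCP the receiver's acknowledgment number would stay at that point until the sender retransmitted the missing octets.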

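To achieve the above - alongside congestion control, precedence and security - each TCP data segment (enclosed in an IP datagram) is given a segment header with the following fields: source port, destination port, sequence number, acknowledgment number, data offset, reserved bits, control flags, window, checksum, urgent pointer and options (followed by padding).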
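As a rough illustration of the segment header, the sketch below unpacks the fixed 20-octet part of a TCP header as laid out in RFC 793, using Python's standard `struct` module. The names `TcpHeader`, `parse_tcp_header` and `flag_names` are hypothetical helpers invented for this example, options are ignored, and only the six control bits defined in RFC 793 are decoded.

```python
import struct
from typing import NamedTuple


class TcpHeader(NamedTuple):
    source_port: int
    destination_port: int
    sequence_number: int
    acknowledgment_number: int
    data_offset: int        # header length, in 32-bit words
    flags: int              # the six control bits from RFC 793
    window: int
    checksum: int
    urgent_pointer: int


# Control bits, lowest bit first (RFC 793 layout).
FLAG_NAMES = {0x01: "FIN", 0x02: "SYN", 0x04: "RST",
              0x08: "PSH", 0x10: "ACK", 0x20: "URG"}


def parse_tcp_header(segment: bytes) -> TcpHeader:
    """Decode the fixed part of a TCP header (network byte order)."""
    if len(segment) < 20:
        raise ValueError("a TCP header is at least 20 octets long")
    (src, dst, seq, ack, offset_and_flags,
     window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    return TcpHeader(src, dst, seq, ack,
                     offset_and_flags >> 12,     # top 4 bits: data offset
                     offset_and_flags & 0x3F,    # low 6 bits: control flags
                     window, checksum, urgent)


def flag_names(flags: int):
    """List the names of the control bits that are set."""
    return [name for bit, name in FLAG_NAMES.items() if flags & bit]


# A hand-built SYN segment from port 49152 to port 80 (illustrative values).
raw = struct.pack("!HHIIHHHH", 49152, 80, 1000, 0, (5 << 12) | 0x02, 65535, 0, 0)
header = parse_tcp_header(raw)
assert header.destination_port == 80
assert flag_names(header.flags) == ["SYN"]
```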

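The checksum is calculated over the TCP segment together with a pseudo-header that contains the IP source and destination addresses, so the checksum method differs slightly between IPv4 and IPv6 (version 4 and 6).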
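To make the checksum more concrete, here is a minimal sketch of the calculation for IPv4, assuming the usual 16-bit ones' complement sum over a 12-octet pseudo-header (source address, destination address, a zero octet, protocol number 6 and the TCP length) followed by the segment. The function names and the addresses are assumptions made for the example. For IPv6 the same sum is used, but the pseudo-header carries 128-bit addresses and a 32-bit length field.

```python
import socket
import struct


def ones_complement_sum(data: bytes) -> int:
    """Add the data as 16-bit words, folding any carry back into the sum."""
    if len(data) % 2:
        data += b"\x00"                      # pad to a whole number of words
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def tcp_checksum_ipv4(src_ip: str, dst_ip: str, segment: bytes) -> int:
    """Checksum of a TCP segment, including the IPv4 pseudo-header."""
    pseudo_header = (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
                     + struct.pack("!BBH", 0, 6, len(segment)))  # 6 = TCP
    return ~ones_complement_sum(pseudo_header + segment) & 0xFFFF


# Build a segment with the checksum field (octets 16-17) set to zero,
# compute the checksum, then write it into the header.
segment = bytearray(
    struct.pack("!HHIIHHHH", 49152, 80, 1000, 0, (5 << 12) | 0x02, 65535, 0, 0)
    + b"hi"
)
checksum = tcp_checksum_ipv4("192.0.2.1", "198.51.100.7", bytes(segment))
struct.pack_into("!H", segment, 16, checksum)

# A receiver repeats the sum over the segment as received; an undamaged
# segment gives zero.
assert tcp_checksum_ipv4("192.0.2.1", "198.51.100.7", bytes(segment)) == 0
```

Changing any octet of the segment, or delivering it to a different address, would make the final check fail, which is how the destination notices damaged or misdelivered data.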

To conclude: the Transmission Control Protocol (TCP) is concerned with reliability rather than speed. A data stream that uses TCP can therefore be delayed, because the receiver has to wait for lost segments to be retransmitted and for segments that arrive out of order to be put back in sequence. If an application layer protocol did not use TCP, but relied solely on the Internet Protocol (IP) to send the data, the data would be transported faster, but with no guarantee that it would arrive, or arrive in order. UDP is a transport layer protocol that provides a 'middle way' between the two approaches: it provides a header with ports and error checking (a checksum), but no handshake or acknowledgment to ensure that the data is delivered reliably.

If an application layer protocol does use the Transmission Control Protocol (TCP), then TCP will accept the data, divide it into segments, and add a header (which includes port and sequencing information) to each segment. The TCP segment will then - through a system of encapsulation - be enclosed in an Internet Protocol (IP) datagram. IP will always be responsible for the actual delivery of the data, but TCP will 'keep an eye' on the data to ensure it is sent and received as intended. TCP achieves this by segmenting the data and adding its own header.
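The encapsulation step can be sketched as follows, under simplifying assumptions: both headers are the minimum 20 octets with no options, both checksum fields are left at zero, and the function names, ports and addresses are invented for the example. The point is only to show that the TCP segment (header plus application data) becomes the payload of the IP datagram.

```python
import socket
import struct


def build_tcp_segment(src_port: int, dst_port: int, seq: int, ack: int,
                      payload: bytes) -> bytes:
    """TCP header (ACK flag set, no options, checksum left at 0) plus data."""
    offset_and_flags = (5 << 12) | 0x10      # 5 words of header, ACK bit
    header = struct.pack("!HHIIHHHH", src_port, dst_port, seq, ack,
                         offset_and_flags, 65535, 0, 0)
    return header + payload


def encapsulate_in_ipv4(src_ip: str, dst_ip: str, segment: bytes) -> bytes:
    """Wrap a TCP segment in a minimal IPv4 header (checksum left at 0)."""
    version_and_ihl = (4 << 4) | 5           # IPv4, 5 words of header
    total_length = 20 + len(segment)
    header = struct.pack("!BBHHHBBH4s4s",
                         version_and_ihl, 0, total_length,
                         0, 0,               # identification, flags/fragment
                         64, 6, 0,           # TTL, protocol 6 = TCP, checksum
                         socket.inet_aton(src_ip), socket.inet_aton(dst_ip))
    return header + segment


payload = b"GET / HTTP/1.1\r\n\r\n"         # data from the application layer
segment = build_tcp_segment(49152, 80, seq=1000, ack=5000, payload=payload)
datagram = encapsulate_in_ipv4("192.0.2.1", "198.51.100.7", segment)

assert datagram[9] == 6                      # IP header names TCP as payload
assert datagram[20:] == segment              # the segment is carried intact
assert datagram[40:] == payload              # and the data sits inside that
```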