TCP/UDP and Sockets

Transport Layer Protocols

The two main transport layer protocols, UDP and TCP, are described below.

UDP (User Datagram Protocol)

UDP is a connectionless protocol. It provides a way for applications to send encapsulated IP datagrams without having to establish a connection.

  • Datagram oriented
  • Unreliable, connectionless
  • Simple
  • Unicast and multicast
  • Suited to only a few kinds of applications, e.g., multimedia applications
  • Widely used for services such as network management (SNMP), routing (RIP), and naming (DNS)

UDP transmits segments consisting of an 8-byte header followed by the payload. The two ports serve to identify the end points within the source and destination machines. When a UDP packet arrives, its payload is handed to the process attached to the destination port.

The UDP header consists of four 16-bit fields:

  • Source port address (16 bits)
  • Destination port address (16 bits)
  • Total length of the user datagram (16 bits)
  • Checksum, used for error detection (16 bits)
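The header layout can be sketched in a few lines of Python. This is only an illustration: the names UDP_HEADER, build_udp_segment and parse_udp_segment are invented for this example, and the checksum is left at 0 rather than computed.

```python
import struct

# "!" selects network (big-endian) byte order; each "H" is an unsigned 16-bit field.
# Field order: source port, destination port, total length, checksum.
UDP_HEADER = struct.Struct("!HHHH")


def build_udp_segment(src_port, dst_port, payload, checksum=0):
    """Prepend an 8-byte UDP header to the payload.

    The checksum is not computed here; 0 is used as a placeholder
    (an assumption made to keep the sketch short).
    """
    length = UDP_HEADER.size + len(payload)  # header plus payload, in bytes
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + payload


def parse_udp_segment(segment):
    """Split a segment back into its four header fields and the payload."""
    src_port, dst_port, length, checksum = UDP_HEADER.unpack_from(segment)
    payload = segment[UDP_HEADER.size:length]
    return src_port, dst_port, length, checksum, payload


if __name__ == "__main__":
    segment = build_udp_segment(5353, 53, b"hello")
    print(len(segment))
    print(parse_udp_segment(segment))
```

The total length field counts the header as well as the payload, so a 5-byte payload gives a length of 13.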

TCP (Transmission Control Protocol)

TCP provides full transport layer services to applications. It is a reliable, stream-oriented, port-to-port transport protocol. The term stream, in this context, means connection-oriented: a connection must be established between both ends of a transmission before either may transmit data. By creating this connection, TCP sets up a virtual circuit between sender and receiver that remains active for the duration of the transmission.

TCP is a reliable, point-to-point, connection-oriented, full-duplex protocol.

Flag bits

  • URG: Urgent pointer is valid. If the bit is set, the following bytes contain an urgent message in the sequence number range “SeqNo <= urgent message <= SeqNo + urgent pointer”
  • ACK: Segment carries a valid acknowledgement
  • PSH: Push flag. Notification from the sender that the receiver should pass all the data it has to the application. Normally set by the sender when the sender’s buffer is empty
  • RST: Reset the connection. The flag causes the receiver to reset the connection. The receiver of a RST terminates the connection and notifies the higher-layer application of the reset
  • SYN: Synchronize sequence numbers. Sent in the first packet when initiating a connection
  • FIN: Sender is finished with sending. Used for closing a connection; both sides of a connection must send a FIN
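As a rough illustration of how these flags are carried, the sketch below treats them as individual bits of a single byte, using the bit positions from the standard TCP header (FIN is the lowest bit, URG the sixth). The names TCP_FLAGS, encode_flags and decode_flags are made up for this example.

```python
# Bit masks for the six flags described above, lowest bit first.
TCP_FLAGS = {
    "FIN": 0x01,
    "SYN": 0x02,
    "RST": 0x04,
    "PSH": 0x08,
    "ACK": 0x10,
    "URG": 0x20,
}


def encode_flags(*names):
    """Combine flag names into one integer by OR-ing their bit masks."""
    value = 0
    for name in names:
        value |= TCP_FLAGS[name]
    return value


def decode_flags(flag_byte):
    """Return the names of all flags whose bit is set, lowest bit first."""
    return [name for name, mask in TCP_FLAGS.items() if flag_byte & mask]


if __name__ == "__main__":
    # The second packet of a connection setup carries both SYN and ACK.
    syn_ack = encode_flags("SYN", "ACK")
    print(hex(syn_ack), decode_flags(syn_ack))
```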

TCP segment format

Each machine supporting TCP has a TCP transport entity, which is either a library procedure, a user process, or part of the kernel. In all cases, it manages TCP streams and interfaces to the IP layer. A TCP entity accepts user data streams from local processes, breaks them up into pieces not exceeding 64 KB, and sends each piece as a separate IP datagram.
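The splitting step can be pictured with the following sketch. It is a simplification: split_stream is a hypothetical helper, the 64 KB default is taken from the figure quoted above, and real TCP implementations normally use much smaller pieces based on the negotiated maximum segment size.

```python
def split_stream(data, max_piece=64 * 1024):
    """Break a byte stream into consecutive pieces of at most max_piece bytes."""
    if max_piece <= 0:
        raise ValueError("max_piece must be positive")
    return [data[i:i + max_piece] for i in range(0, len(data), max_piece)]


if __name__ == "__main__":
    stream = b"x" * 150_000
    pieces = split_stream(stream)
    print([len(piece) for piece in pieces])
    # Joining the pieces in order reproduces the original stream.
    print(b"".join(pieces) == stream)
```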

Sockets

A socket is one end of an inter-process communication channel. The two processes each establish their own socket. The system calls for establishing a connection are somewhat different for the client and the server, but both involve the basic construct of a socket.

The steps involved in establishing a socket on the client side are as follows:

  1. Create a socket with the socket() system call
  2. Connect the socket to the address of the server using the connect() system call
  3. Send and receive data. There are a number of ways to do this, but the simplest is to use the read() and write() system calls.

The steps involved in establishing a socket on the server side are as follows:

  1. Create a socket with the socket() system call
  2. Bind the socket to an address using the bind() system call. For a server socket on the Internet, an address consists of a port number on the host machine.
  3. Listen for connections with the listen() system call
  4. Accept a connection with the accept() system call. This call typically blocks until a client connects with the server.
  5. Send and receive data
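The client and server steps listed above can be sketched with Python's socket module, whose methods are thin wrappers around the system calls of the same names (recv() and sendall() stand in for read() and write()). The function names, the loopback address, and the upper-casing reply are choices made for this example only.

```python
import socket
import threading


def start_upper_server(host="127.0.0.1"):
    """Server side: socket(), bind(), listen(), then accept() in a thread."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # step 1
    server.bind((host, 0))  # step 2; port 0 lets the OS pick a free port
    server.listen(1)        # step 3
    port = server.getsockname()[1]

    def serve_one_client():
        conn, _addr = server.accept()  # step 4: blocks until a client connects
        with conn:
            data = conn.recv(1024)     # step 5: receive, then reply
            conn.sendall(data.upper())
        server.close()

    thread = threading.Thread(target=serve_one_client, daemon=True)
    thread.start()
    return port, thread


def run_client(host, port, message):
    """Client side: socket(), connect(), then send and receive."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:  # step 1
        client.connect((host, port))  # step 2
        client.sendall(message)       # step 3
        # One recv() is enough for this tiny reply; longer replies need a loop.
        return client.recv(1024)


if __name__ == "__main__":
    port, thread = start_upper_server()
    print(run_client("127.0.0.1", port, b"hello"))
    thread.join()
```

The server runs in a background thread only so that both ends fit in one script; in practice the client and server are usually separate processes.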

When a socket is created, the program has to specify the address domain and the socket type.

Two processes can communicate with each other only if their sockets are of the same type and in the same domain.

There are two widely used address domains: the Unix domain, in which two processes that share a common file system communicate, and the Internet domain, in which two processes running on any two hosts on the Internet communicate. Each of these has its own address format.

The address of a socket in the Unix domain is a character string which is basically an entry in the file system.
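A small sketch of a Unix domain socket follows, assuming a Unix-like system where socket.AF_UNIX is available. The file name demo.sock and the function unix_domain_roundtrip are invented for the example; the point is that bind() creates an entry in the file system and the client connects to that path.

```python
import os
import socket
import tempfile


def unix_domain_roundtrip(message):
    """Send one message over a Unix domain stream socket bound to a file path."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "demo.sock")  # the socket's address

        server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        server.bind(path)  # creates the file system entry
        server.listen(1)
        entry_exists = os.path.exists(path)

        client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        client.connect(path)  # queued in the listen backlog until accept()
        conn, _addr = server.accept()

        client.sendall(message)
        received = conn.recv(1024)

        for sock in (conn, client, server):
            sock.close()
        return entry_exists, received


if __name__ == "__main__":
    print(unix_domain_roundtrip(b"hello over a file system path"))
```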

The address of a socket in the Internet domain consists of the Internet address of the host machine (under IPv4, every computer on the Internet has a unique 32-bit address, often referred to as its IP address). In addition, each socket needs a port number on that host. Port numbers are 16-bit unsigned integers. The lower numbers are reserved in Unix for standard services.

For example, the port number for the FTP server is 21. It is important that standard services be at the same port on all computers so that clients will know their addresses. However, port numbers above 2000 are generally available.
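To make the sizes concrete, the sketch below packs an IPv4 address and a port number into their binary forms: 4 bytes (32 bits) for the address and 2 bytes (16 bits) for the port. The helpers pack_endpoint and unpack_endpoint are hypothetical, and 192.0.2.10 is a documentation-only example address.

```python
import socket
import struct


def pack_endpoint(ip, port):
    """Return the 32-bit IPv4 address followed by the 16-bit port, as 6 bytes."""
    if not 0 <= port <= 0xFFFF:
        raise ValueError("port must fit in an unsigned 16-bit integer")
    return socket.inet_aton(ip) + struct.pack("!H", port)


def unpack_endpoint(raw):
    """Reverse pack_endpoint: recover the dotted address and the port."""
    ip = socket.inet_ntoa(raw[:4])
    (port,) = struct.unpack("!H", raw[4:6])
    return ip, port


if __name__ == "__main__":
    raw = pack_endpoint("192.0.2.10", 21)  # 21 is the FTP port mentioned above
    print(len(raw), unpack_endpoint(raw))
```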

Socket Types

There are two widely used socket types, stream sockets, and datagram sockets.

Stream sockets treat communication as a continuous stream of characters, while datagram sockets have to read entire messages at once. Each uses its own communications protocol. Stream sockets use TCP (Transmission Control Protocol), which is a reliable, stream-oriented protocol, and datagram sockets use UDP (User Datagram Protocol), which is unreliable and message-oriented. You might want to use a datagram socket in cases where only one message is sent from the client to the server, and only one message is sent back. There are several differences between a datagram socket and a stream socket.

  1. Datagrams are unreliable, which means that if a packet of information gets lost somewhere in the Internet, the sender is not told (and of course the receiver does not know about the existence of the message). In contrast, with a stream socket, the underlying TCP protocol will detect that a message was lost because it was not acknowledged, and it will be retransmitted without the process at either end knowing about this.
  2. Message boundaries are preserved in datagram sockets. If the sender sends a datagram of 100 bytes, the receiver must read all 100 bytes at once; any bytes that do not fit in the receiver's buffer are discarded. This can be contrasted with a stream socket, where if the sender wrote a 100-byte message, the receiver could read it in two chunks of 50 bytes or 100 chunks of one byte.
  3. The communication is done using the special system calls sendto() and recvfrom() rather than the more generic read() and write().
  4. There is a lot less overhead associated with a datagram socket because connections do not need to be established and torn down, and packets do not need to be acknowledged. This is why datagram sockets are often used when the service to be provided is short, such as a time-of-day service.
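The differences above can be seen in a short datagram sketch on the loopback interface. The function name udp_roundtrip, the buffer size, and the timeout are assumptions made for the example. Note that no connect(), listen() or accept() call appears, and that each recvfrom() returns exactly one datagram.

```python
import socket


def udp_roundtrip(messages):
    """Send each message as one datagram and collect what the receiver reads."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    receiver.settimeout(2.0)         # do not wait forever for a lost datagram
    address = receiver.getsockname()

    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    received = []
    try:
        for message in messages:
            sender.sendto(message, address)  # no connection is set up first
        for _ in messages:
            data, _peer = receiver.recvfrom(2048)  # one whole datagram per call
            received.append(data)
    finally:
        sender.close()
        receiver.close()
    return received


if __name__ == "__main__":
    result = udp_roundtrip([b"a" * 100, b"bb"])
    print([len(datagram) for datagram in result])
```

On a real network, delivery and ordering are not guaranteed, so a datagram application has to tolerate a timeout or a missing reply; the loopback interface rarely drops packets, which is why the example is usable as a demonstration.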