Socket-level Programming

TCP/IP

The OSI model was devised by committee: the standard was written first and implemented afterwards. Some parts of the OSI standard are obscure, some cannot easily be implemented, and some have never been implemented.

The TCP/IP protocol suite was devised through a long-running DARPA project. This worked the other way round: implementation came first, followed by RFCs (Requests For Comments). TCP/IP (Transmission Control Protocol/Internet Protocol) is the principal Unix networking protocol.

TCP/IP stack

The TCP/IP stack is shorter than the OSI one. Its transport layer offers two protocols on top of IP: TCP is a connection-oriented protocol, while UDP (User Datagram Protocol) is a connectionless protocol.

IP datagrams

The IP layer provides a connectionless and unreliable delivery system. It considers each datagram independently of the others. Any association between datagrams must be supplied by the higher layers.

The IP layer supplies a checksum that covers only its own header, not the data it carries. The header includes the source and destination addresses.

The IP layer handles routing through an Internet. It is also responsible for breaking up large datagrams into smaller ones for transmission and reassembling them at the other end.

UDP

UDP is also connectionless and unreliable. What it adds to IP is a checksum for the contents of the datagram and port numbers. These are used to give a client/server model - see later.

TCP

TCP supplies logic to give a reliable connection-oriented protocol above IP. It provides a virtual circuit that two processes can use to communicate.

Internet addresses

In order to use a service you must be able to find it. The Internet uses an address scheme for machines so that they can be located.

The address is a 32 bit integer which gives the IP address. This encodes a network ID and more addressing. The network ID falls into various classes according to the size of the network address.

Network address

Class A uses 8 bits for the network address, with 24 bits left over for other addressing. Class B uses 16 bit network addressing and Class C uses 24 bit network addressing. Class D addresses are not split this way: they are used for multicast groups.

The University of Canberra is registered as a Class B network, so we have a 16 bit network address with 16 bits left to identify each machine.

Subnet address

Internally, the Uni network is divided into subnetworks. Building 11 is currently on one subnetwork and uses 10-bit addressing, allowing 1024 different hosts.

Host address

8 bits are finally used for host addresses within our subnet. This places a limit of 256 machines that can be on the subnet.

Total address

The 32 bit address is usually written as 4 integers separated by dots, one integer per byte, for example 137.92.11.1.

Symbolic names

Each host has a name. This can be found from the Unix user level command hostname. A symbolic name for the network also exists. For our network it is ``canberra.edu.au''. The symbolic network name for any host is formed from the two: birch.ise.canberra.edu.au

Programming interface in C

Address conversion

These functions convert between ``dotted'' addresses such as 137.92.11.1 and 32 bit integer addresses:
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

unsigned long inet_addr(char *ptr);
char *inet_ntoa(struct in_addr in);
(The structure in_addr has only one field, which is the 32 bit IP address.)
struct in_addr {
    unsigned long int s_addr;
};
The BSD library provides some functions for finding names.
int gethostname(char *name, int size);
copies the ordinary hostname into ``name'' and returns zero on success.
struct hostent *gethostbyname(char *name);
returns a pointer to a structure with two important fields: ``char *h_name'', which is the ``official'' network name of the host, and ``char **h_addr_list'', which is a list of IP addresses.
struct hostent {
   char    *h_name;        /* official name of host */
   char    **h_aliases;    /* alias list */
   int     h_addrtype;     /* host address type */
   int     h_length;       /* length of address */
   char    **h_addr_list;  /* list of addresses */
};
The following program prints these:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/param.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <netinet/in.h>
#include <arpa/inet.h>

#define SIZE (MAXHOSTNAMELEN+1)

int main(void)
{
  char name[SIZE];
  struct hostent *entry;

  /* find the ordinary host name */
  if (gethostname(name, SIZE) != 0) {
    fprintf(stderr, "unknown name\n");
    exit(1);
  }
  printf("host name is %s\n", name);

  /* look up the network information for that name */
  if ((entry = gethostbyname(name)) == NULL) {
    fprintf(stderr, "no host name info\n");
    exit(2);
  }
  printf("offic. name: %s\n", entry->h_name);
  /* print the first address in dotted form */
  printf("address: %s\n",
         inet_ntoa(*(struct in_addr *) (entry->h_addr_list[0])));
  exit(0);
}
This programming interface uses a number of standard files: typically /etc/hostname to find the name and /etc/hosts to find the network address, with a name server consulted if the address cannot be found there.

Port addresses

A service exists on a host, and is identified by its port. This is a 16 bit number. To send a message to a server you send it to the port for that service of the host that it is running on. This is not location transparency!

Certain of these ports are ``well known''. They are listed in the file /etc/services. For example, the finger service used later in these notes is on port 79 and the time service is on port 13.

Ports in the region 1-255 are reserved by TCP/IP. The system may reserve more. User processes may have their own ports above 1023.

The function ``getservbyname'' can be used to find the port for a service that is registered in /etc/services.

struct servent *getservbyname(const char *name, const char *proto);

struct servent {
   char    *s_name;        /* official service name */
   char    **s_aliases;    /* alias list */
   int     s_port;         /* port number, in network byte order */
   char    *s_proto;       /* protocol to use */
};

Berkeley sockets

When you know how to reach a service via its network and port IDs, what then? If you are a client you need an API that will allow you to send messages to that service and read replies from it.

If you are a server, you need to be able to create a port and listen at it. When a message comes in you need to be able to read and write to it.

Berkeley sockets are the BSD Unix system calls for this. They are part of the BSD Unix kernel. They have also been adopted by the PC world. They form the lowest practical level of doing client/server on both Windows and Unix.


Data representation

Some computers are ``big endian''. This refers to the representation of objects such as integers within a word. A big endian machine stores them in the expected way: the high byte of an integer is stored in the leftmost byte, while the low byte of an integer is stored in the rightmost byte. A Sun Sparc is big endian. So the number 5 + 6 * 256 would be stored in a 4 byte integer as the bytes 0, 0, 6, 5, reading from the lowest address.

A ``little endian'' machine stores them the other way round, so the same number would be stored as the bytes 5, 6, 0, 0. The 386 is little endian.

If a Sparc sends an integer to a 386, what happens? The 386 sees 5 + 6 * 256 as
     5 *16777216 + 6 * 65536 
To avoid this, two communicating machines must agree on data representation.

The Sun RPC uses a format known as XDR, which just happens to be the format that doesn't require any conversions for Suns. However, if two 386s are communicating then each of them will have to keep swapping bytes both on receipt and send.

The OSF DCE uses native format, with the receiving machine swapping bytes if needed.

This section describes the Unix BSD networking API for IP as in W. R. Stevens, ``Unix Network Programming''.


Byte ordering

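To handle byte ordering for 16 bit and 32 bit integers there are conversion functions between host and network byte order: htons() and htonl() convert from host to network order, and ntohs() and ntohl() convert back.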

Addresses

The address of an IP service is given using a structure:
#include <netinet/in.h>

struct sockaddr_in {
    short           sin_family;
    u_short         sin_port;
    struct in_addr  sin_addr;
    char            sin_zero[8];
};
Example: The finger service (port 79) on machine 137.92.11.1 is given by
struct sockaddr_in addr;

addr.sin_family = AF_INET;
addr.sin_port = htons(79);
addr.sin_addr.s_addr = inet_addr("137.92.11.1");

Sockets

A socket is a data structure maintained by a BSD-Unix system to handle network connections.

A socket is created using the call ``socket''. It returns an integer that is like a file descriptor: it is an index into a table and ``reads'' and ``writes'' to the network use this ``socket file descriptor''.

#include <sys/types.h>
#include <sys/socket.h>

int socket(int family, int type, int protocol);
Here ``family'' will be AF_INET for IP communications, ``protocol'' will be zero, and ``type'' will be SOCK_STREAM for TCP or SOCK_DGRAM for UDP.

Two processes wishing to communicate over a network create a socket each. These are similar to two ends of a pipe - but the actual pipe does not yet exist.


Connection oriented (TCP)

One process (server) makes its socket known to the system using ``bind''. This will allow other sockets to find it.

It then ``listens'' on this socket and ``accepts'' any incoming connections.

The other process (client) establishes a network connection to it, and then the two exchange messages.

As many messages as needed may be sent along this channel, in either direction.

Server:

create endpoint (socket())
bind address (bind())
specify queue (listen())
wait for connection (accept())
transfer data (read() write())

Client:

create endpoint (socket())
connect to server (connect())
transfer data (read() write())

TCP time client

Each machine runs a TCP server on port 13 that returns the time on that machine in readable form. All that a client has to do is connect to that machine and read the time from it.
/* TCP client that finds the time from a server */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

#define SIZE 1024
char buf[SIZE];

#define TIME_PORT 13

int main(int argc, char *argv[])
{
  int sockfd;
  int nread;
  struct sockaddr_in serv_addr;

  if (argc != 2) {
    fprintf(stderr, "usage: %s IPaddr\n", argv[0]);
    exit(1);
  }

  /* create endpoint */
  if ((sockfd = socket(AF_INET, SOCK_STREAM, 0)) < 0) {
    perror(NULL);
    exit(2);
  }

  /* connect to server */
  serv_addr.sin_family = AF_INET;
  serv_addr.sin_addr.s_addr = inet_addr(argv[1]);
  serv_addr.sin_port = htons(TIME_PORT);
  if (connect(sockfd, (struct sockaddr *) &serv_addr,
              sizeof(serv_addr)) < 0) {
    perror(NULL);
    exit(3);
  }

  /* transfer data: copy whatever was read to standard output */
  nread = read(sockfd, buf, SIZE);
  if (nread > 0)
    write(1, buf, nread);
  close(sockfd);
  exit(0);
}
Example: If the program is compiled to ``tcptime'', find the time in various places by running it with the IP address of each machine, for example ``tcptime 137.92.11.1''.

TCP time server

The real time server can only be started by the system supervisor (usually at boot time) as the time port is reserved. To run the following code yourself, change the time port to, say, 2013.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

#define SIZE 1024
char buf[SIZE];

#define TIME_PORT 13

int main(int argc, char *argv[])
{
  int sockfd, client_sockfd;
  socklen_t len;
  struct sockaddr_in serv_addr, client_addr;
  time_t t;

  /* create endpoint */
  if ((sockfd = socket(AF_INET, SOCK_STREAM, 0)) < 0) {
    perror(NULL);
    exit(2);
  }

  /* bind address: any local interface, on the time port */
  serv_addr.sin_family = AF_INET;
  serv_addr.sin_addr.s_addr = htonl(INADDR_ANY);
  serv_addr.sin_port = htons(TIME_PORT);
  if (bind(sockfd, (struct sockaddr *) &serv_addr,
           sizeof(serv_addr)) < 0) {
    perror(NULL);
    exit(3);
  }

  /* specify queue */
  listen(sockfd, 5);

  for (;;) {
    len = sizeof(client_addr);
    client_sockfd = accept(sockfd, (struct sockaddr *) &client_addr,
                           &len);
    if (client_sockfd == -1) {
      perror(NULL);
      continue;
    }

    /* transfer data */
    time(&t);
    sprintf(buf, "%s", asctime(localtime(&t)));
    write(client_sockfd, buf, strlen(buf));
    close(client_sockfd);
  }
}

Connectionless (UDP)

In a connectionless protocol both sockets have to make their existence known to the system using ``bind''. This is because each message is treated separately, so the client has to find the server each time it sends a message and vice versa.

When bind is called it binds to a new port - it cannot bind to one already in use. If you specify the port as zero the system gives you a currently unused port.

Because of this extra task on each message send, the processes do not use read/write but recvfrom/sendto. These functions take as parameters the socket to write to, and the address of the service on the remote machine.

Server:

create endpoint (socket())
bind address (bind())
transfer data (sendto() recvfrom())

Client:

create endpoint (socket())
bind address (bind())
connect to server (connect())
transfer data (sendto() recvfrom())

Time client (UDP)

The UDP time server requires a datagram to be sent to it. It ignores the contents of the message but uses the return address to send back a datagram containing the time.
/* UDP client for time */

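#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>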

#define SIZE 1400
char buf[SIZE];

#define TIME_PORT 13

int main(int argc,
         char *argv[])
{
  int sockfd;
  int nread;
  struct sockaddr_in serv_addr,
                   client_addr;
  socklen_t len;

  if (argc != 2) {
    fprintf(stderr,
          "usage: %s IPaddr\n",
          argv[0]);
    exit(1);
  }
  if ((sockfd =
       socket(AF_INET,
              SOCK_DGRAM, 0)) 
       < 0) {
    perror(NULL);
    exit(2);
  }
  client_addr.sin_family =
             AF_INET;
  client_addr.sin_addr.s_addr =
             htonl(INADDR_ANY);
  client_addr.sin_port =
             htons(0);

  serv_addr.sin_family =
            AF_INET;
  serv_addr.sin_addr.s_addr =
            inet_addr(argv[1]);
  serv_addr.sin_port =  
            htons(TIME_PORT);


  if (bind(sockfd,
           (struct sockaddr *) &client_addr,
           sizeof(client_addr))
      < 0) {
    perror(NULL);
    close(sockfd);
    exit(3);
  }

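  /* the server ignores the contents, so send a single byte */
  len = sizeof(serv_addr);
  sendto(sockfd, buf, 1, 0,
         (struct sockaddr *) &serv_addr, len);
  /* the sender's address of the reply overwrites client_addr */
  nread = recvfrom(sockfd, buf,
                  SIZE, 0,
                  (struct sockaddr *) &client_addr,
                  &len);
  if (nread > 0)
    write(1, buf, nread);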

  close(sockfd);
  exit(0);
}

Socket controls

Sockets are treated by the O/S as devices and so there are a variety of device driver controls that can be used (see later). For example, the command ``fcntl'' can be used to make a socket non-blocking, and ``select'' can be used to test if a socket (device) has input or output pending.

In addition, ``getsockopt'' and ``setsockopt'' can be used for more specific socket control.

If a read or write does not return, it should time out. What should the time limit be? On your own machine or on your local network it should be in milliseconds. To Melbourne it should be in seconds, whereas to Scandinavia it should probably be in minutes.

Timeout algorithms should adjust the time according to the current trip time in some manner. They can be implemented using timer signals.

Server slaves

The servers given earlier could only handle one client at a time. If the connection lasted for some time, then other clients would be blocked from access during this time. In such cases, the server should create "slave" servers to handle each client as it connects, as a separate process.

In Unix this is fairly easy: after the accept() has succeeded, the server should fork off a new process to handle the client. Here is the TCP time server, capable of handling many clients at once:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <signal.h>
#include <time.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

#define SIZE 1024
char buf[SIZE];

#define TIME_PORT 13

int main(int argc, char *argv[])
{
  int sockfd, client_sockfd;
  socklen_t len;
  struct sockaddr_in serv_addr, client_addr;
  time_t t;

  /* let the system clean up finished slaves */
  signal(SIGCHLD, SIG_IGN);

  /* create endpoint */
  if ((sockfd = socket(AF_INET, SOCK_STREAM, 0)) < 0) {
    perror(NULL);
    exit(2);
  }

  /* bind address */
  serv_addr.sin_family = AF_INET;
  serv_addr.sin_addr.s_addr = htonl(INADDR_ANY);
  serv_addr.sin_port = htons(TIME_PORT);
  if (bind(sockfd, (struct sockaddr *) &serv_addr,
           sizeof(serv_addr)) < 0) {
    perror(NULL);
    exit(3);
  }

  /* specify queue */
  listen(sockfd, 5);

  for (;;) {
    len = sizeof(client_addr);
    client_sockfd = accept(sockfd, (struct sockaddr *) &client_addr,
                           &len);
    if (client_sockfd == -1) {
      perror(NULL);
      continue;
    }

    /* create a new slave */
    if (fork() == 0) {
      /* child: the listening socket is not needed here */
      close(sockfd);

      /* transfer data */
      time(&t);
      sprintf(buf, "%s", asctime(localtime(&t)));
      write(client_sockfd, buf, strlen(buf));
      close(client_sockfd);
      exit(0);
    }
    /* parent: the slave owns the connection now */
    close(client_sockfd);
  }
}

Server listening on multiple sockets

A server may be attempting to listen to multiple clients not just on one port, but on many. In this case it has to use some sort of polling mechanism between the ports.

The select() call lets the kernel do this work. The call takes a number of file descriptors. The process is suspended. When I/O is ready on one of these, a wakeup is done, and the process can continue. This is cheaper than busy polling.

Internet superserver

Each server executes as a process. If there are multiple servers on a system, then many processes will exist, most probably doing nothing. The Internet inetd process can replace many of these, doing a listen on each port. When a request comes through, inetd will create a new process to handle each request. The file /etc/inetd.conf will specify which servers it is to listen for.

When a request comes through, a server is created to handle it. To make it simpler for the server, instead of having to deal with ports, inetd will remap these onto stdin and stdout so that the server just reads/writes to stdin/stdout.


Sockets in Java

Client

import java.io.*;
import java.net.*;

public class SocketTest {

  public static void main(String argv[]) {
    try {
      Socket t = new Socket("java.sun.com", 13);
      // BufferedReader replaces the deprecated DataInputStream.readLine()
      BufferedReader is =
        new BufferedReader(new InputStreamReader(t.getInputStream()));
      boolean more = true;
      while (more) {
        String str = is.readLine();
        if (str == null)
          more = false;
        else
          System.out.println(str);
      }
      t.close();
    } catch(IOException e) {
      System.out.println("Error" + e);
    }
  }
}

Server

import java.io.*;
import java.net.*;

public class EchoServer {
  public static void main(String argv[]) {
    try {
      ServerSocket s = new ServerSocket(8189);
      // wait for a single client to connect
      Socket incoming = s.accept();
      BufferedReader in =
        new BufferedReader(new InputStreamReader(incoming.getInputStream()));
      PrintStream out =
        new PrintStream(incoming.getOutputStream());
      out.println("Hello. Enter BYE to exit");

      // echo each line until the client closes or sends BYE
      boolean done = false;
      while ( ! done) {
        String str = in.readLine();
        if (str == null)
          done = true;
        else {
          out.println("Echo: " + str);
          if (str.trim().equals("BYE"))
            done = true;
        }
      }
      incoming.close();
    } catch(Exception e) {
      System.out.println(e);
    }
  }
}

Sockets in Perl

Perl has a C-like interface to sockets. It also has a higher level one using the IO::Socket module. This example is a client to fetch documents from a Web server.


#!/usr/bin/perl -w
use IO::Socket;
unless (@ARGV > 1) { die "usage: $0 host document ..." }
$host = shift(@ARGV);
$EOL = "\015\012";
$BLANK = $EOL x 2;
foreach $document ( @ARGV ) {
    $remote = IO::Socket::INET->new( Proto => "tcp",
                                     PeerAddr  => $host,
                                     PeerPort  => "http(80)",
                                    );
    unless ($remote) { die "cannot connect to http daemon on $host" }
    $remote->autoflush(1);
    print $remote "GET $document HTTP/1.0" . $BLANK;
    while ( <$remote> ) { print }
    close $remote;
}

Here is a server that will execute some commands and return the result:


#!/usr/bin/perl -w
use IO::Socket;
use Net::hostent;              # for OO version of gethostbyaddr

$PORT = 9000;                  # pick something not in use

$server = IO::Socket::INET->new( Proto     => 'tcp',
                                 LocalPort => $PORT,
                                 Listen    => SOMAXCONN,
                                 Reuse     => 1);

die "can't setup server" unless $server;
print "[Server $0 accepting clients]\n";

while ($client = $server->accept()) {
    $client->autoflush(1);
    print $client "Welcome to $0; type help for command list.\n";
    $hostinfo = gethostbyaddr($client->peeraddr);
    printf "[Connect from %s]\n", $hostinfo->name || $client->peerhost;
    print $client "Command? ";
    while (<$client>) {
        next unless /\S/;       # blank line
        if (/quit|exit/i) {
            last;
        } elsif (/date|time/i) {
            printf $client "%s\n", scalar localtime;
        } elsif (/who/i ) {
            print  $client `who 2>&1`;
        } elsif (/cookie/i ) {
            print  $client `/usr/games/fortune 2>&1`;
        } elsif (/motd/i ) {
           print  $client `cat /etc/motd 2>&1`;
        } else {
           print $client "Commands: quit date who cookie motd\n";
        }
     } continue {
        print $client "Command? ";
     }
     close $client;
}


Jan Newmarch <jan@newmarch.name>
Last modified: Sun Jul 22 18:44:23 EST 2001