OHJ-5016 Introduction to Distributed Systems, 5 cr



Programming project 1:
Network programming with Berkeley sockets

There is a small change in the specification that has been indicated in red below. There is also a new deadline: 10:00 on Fri 30 May 2008.
Here is the echoserv.c program and Makefile that we studied in class. Here is a new version of echoserv.c and two makefiles: Makefile.haikara (that works on Linux servers) and Makefile.kaarne (that works on Sun servers). On haikara, you should use the command "make -f Makefile.haikara", and on kaarne the command "gmake -f Makefile.kaarne".

Doctor Not-An-Ale has just had a brilliant idea: data can be sent across the network much faster if the transfer operations are parallelized among many file servers. (This is essentially the BitTorrent idea.) In the implementation, the names and locations of binary files are stored in a central database (at the server) and this information is provided to clients who use it to download and upload the binary files from and to each other.

Implement this idea in either C or C++ and use BSD sockets to implement the communication. Your program must work in Lintula's Solaris environment on the machine(s) called kaarne.cs.tut.fi. Before you return your answer, you MUST test your program in this environment.

IMPORTANT CONSTRAINT:

Every implementation must contain code that destroys the user process after one hour, at the latest. This will prevent Lintula from being overrun with zombie processes. Implement this functionality as the FIRST step, so that your testing code also has this property.

This example code shows you how to do this.
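As an illustration of the idea, here is a minimal sketch that relies on the default action of SIGALRM, which terminates the process. The helper name limit_lifetime is hypothetical and is not taken from the course's example code.

```c
#include <unistd.h>

/* Hypothetical helper: ask the kernel to deliver SIGALRM after `seconds`.
 * As long as no handler is installed for SIGALRM, its default action
 * terminates the process. Returns the number of seconds that were left
 * on a previously scheduled alarm (0 if none was pending). */
unsigned int limit_lifetime(unsigned int seconds)
{
    return alarm(seconds);
}
```

A program would call limit_lifetime(3600) as the first statement of main. Note that a pending alarm is not inherited by a child created with fork, so a forking server would need to call the helper again in each child.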

Server

The task of the server is to listen on a TCP/IP port for connecting clients and to use the hajap protocol to implement the registration service.

When a client connects to the server, the first step is always to authenticate the user. Once this is done, the server must be ready to process one of the following three requests:

  1. Ask for a list of file names.
  2. Ask for location information about one file name.
  3. Register itself as a provider for a specific file name.

The server may terminate the connection as soon as it has sent a response to the client's request. Only one request is serviced for each TCP connection.
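The listening side of the server could be set up roughly as follows. This is only a sketch under the assumption of an IPv4-only server; the helper name open_listener and the backlog value 5 are illustrative choices, and the hajap protocol handling is deliberately left out.

```c
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: create a TCP socket that listens on `port`
 * (given in host byte order) on all local IPv4 addresses.
 * Returns the listening descriptor, or -1 on error. */
int open_listener(unsigned short port)
{
    struct sockaddr_in addr;
    int on = 1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    /* Allow the server to be restarted quickly on the same port. */
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof on);

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *) &addr, sizeof addr) < 0 ||
        listen(fd, 5) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

An iterative server would then loop over accept on the returned descriptor, authenticate the user, answer the single request, and close the connection before accepting the next one.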

The server must understand only one command-line option:

  • --portti N where N is an integer that specifies the TCP/IP port number on which clients will connect.

NOTE: It is important that your client and server use the Finnish language command-line options, because your programs will be tested with these exact options.

User authentication

User authentication is performed using a challenge-response authentication mechanism, whereby the user's password is never sent directly over the network. Instead, the client and the server engage in the following exchange:

Client:
  • Connects to the server and sends the user id K.

Server:
  • Searches its database for the user id K and the associated password S1. (If the user id is not found, the authentication phase is aborted and the connection is closed.)
  • Generates a random number R and sends it to the client.

Client:
  • Asks the user for their password S2.
  • Concatenates the password and random number (S2+R) and calculates the MD5 hash value H1 of the string.
  • Sends the value H1 to the server.

Server:
  • Calculates the MD5 hash value H2 of the string S1+R.
  • Compares the values H1 and H2.
  • If they agree, the password that the user entered is correct and the user has been authenticated. If not, the authentication phase is aborted and the connection is closed.

MD5 hashing operates on 12 bytes of data, of which the first 8 bytes are taken from the ASCII password and the last 4 bytes are the 32-bit random number. If the password is shorter than 8 characters, it is filled up with ASCII 0 characters. If the password is longer than 8 characters, only the first 8 are used. The random number is always a 32-bit integer in "network byte order"; this means "most significant byte first".
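The layout of the 12-byte hash input can be sketched as follows. The helper name build_md5_input is hypothetical, and the sketch ASSUMES that "ASCII 0 characters" means bytes with the value 0 (NUL); if the character '0' is intended instead, the padding byte has to be changed. The MD5 computation itself is not shown.

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

#define PW_FIELD_LEN  8    /* bytes reserved for the password */
#define MD5_INPUT_LEN 12   /* password field + 32-bit random number */

/* Hypothetical helper: fill `out` with the 12 bytes that are fed to MD5.
 * `r` is the random number in host byte order. */
void build_md5_input(const char *password, uint32_t r,
                     unsigned char out[MD5_INPUT_LEN])
{
    size_t len = strlen(password);
    uint32_t r_net = htonl(r);          /* most significant byte first */

    if (len > PW_FIELD_LEN)
        len = PW_FIELD_LEN;             /* only the first 8 characters count */

    memset(out, 0, MD5_INPUT_LEN);      /* ASSUMPTION: pad with NUL bytes */
    memcpy(out, password, len);
    memcpy(out + PW_FIELD_LEN, &r_net, sizeof r_net);
}
```

Both the client (with S2) and the server (with S1) would build this buffer in the same way before hashing it, so that H1 and H2 agree exactly when the passwords agree in their first 8 characters.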

You can find pseudocode for computing the MD5 hash function on the Wikipedia page for MD5.

Client

The client program supports three operations: name listing, binary data upload, and binary data download.

The client ALWAYS needs the following four command-line options:

  • --palvelin N specifies the server address. Here N is the machine name in standard dot notation ("130.230.4.2") or symbolic notation ("mustavaris.cs.tut.fi") that needs to be looked up on a name server.
  • --portti P specifies the TCP/IP port number P that the server is using.
  • --tunnus K specifies the user id that the client must use to connect to the server.
  • --salasana S specifies the user password.

When the client connects to the server, the first step is always to authenticate the user. Once this has been done, the client can request the operation that the user is interested in.
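Both forms of the --palvelin argument (dot notation and symbolic names) can be handled by a single lookup call. The following is a sketch that assumes IPv4 and uses getaddrinfo; the helper name resolve_server is hypothetical.

```c
#include <string.h>
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: turn a host (dotted or symbolic) and a port string
 * into an IPv4 socket address suitable for connect().
 * Returns 0 on success and -1 if the lookup fails. */
int resolve_server(const char *host, const char *port,
                   struct sockaddr_in *out)
{
    struct addrinfo hints;
    struct addrinfo *res;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;        /* IPv4 only in this sketch */
    hints.ai_socktype = SOCK_STREAM;  /* the server connection uses TCP */

    if (getaddrinfo(host, port, &hints, &res) != 0)
        return -1;

    /* Use the first result; a more careful client could try them all. */
    memcpy(out, res->ai_addr, sizeof *out);
    freeaddrinfo(res);
    return 0;
}
```

The values given to --palvelin and --portti can be passed to the helper unchanged, and the resulting address can be handed to connect.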

  • Name listing

    If the user specifies the --listaa command-line option, the client must request the name list from the server (using the hajap protocol) and display the list to the user.

  • Distribution operation

    If the user specifies the --jakele <filename> command-line option, the client must check that the given file exists locally, register the file name at the server using the hajap protocol, and then be prepared to transmit the named file as UDP packets using the same protocol.

    After sending the registration command, the client closes the TCP connection, but it must still respond to the incoming (hajap protocol) UDP requests that it has received. Only after all such requests have been satisfied can the client quit. The client must quit when the user presses Control-C, even if it is in the middle of a send operation. Bear in mind that this also has important implications for the receiver of the information. The client program can choose for itself the port on which it receives UDP requests, and this port is sent to the server in the registration process.

  • Collection operation

    If the user specifies the --nouda <filename> command-line option, the client must use the hajap protocol to obtain information about the given file, break the TCP connection, and then download the binary file using UDP packets. Data collection works in the following way:

    1. The client sends a state request to all the UDP servers registered for the given filename.

    2. All those servers that are prepared to share the given file are placed in a list of active responders.

    3. Requests for data are sent to all the active responders (for example, by cycling through the list and asking for different addresses from each responder) and the data received is saved temporarily (in memory, for example).

    4. When all the data has arrived, the client writes a local file with the given file name containing the data, and the collection operation is complete.

    When the client has received all the requested data, it informs the user and switches over to the distribution operation.
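The cycling mentioned in step 3 of the collection operation can be illustrated with two small helpers. This is only a sketch: the helper names are hypothetical, and it ASSUMES that the file is requested in fixed-size blocks, which is a detail that the hajap protocol specification decides, not this page.

```c
#include <stddef.h>

/* Hypothetical helper: number of blocks needed to cover `file_size`
 * bytes when every block holds at most `block_size` bytes. */
size_t block_count(size_t file_size, size_t block_size)
{
    return (file_size + block_size - 1) / block_size;
}

/* Hypothetical helper: choose the responder for a block by cycling
 * through the list of active responders (round robin).
 * `n_responders` must be greater than zero. */
size_t responder_for_block(size_t block, size_t n_responders)
{
    return block % n_responders;
}
```

With three active responders, blocks 0, 3, 6, ... would be requested from the first responder, blocks 1, 4, 7, ... from the second, and so on.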

Evaluation

The minimum requirement for this project is a working implementation of the server and client programs, such that the client can construct and display a list of active responders for the collection operation. In other words, the minimum implementation does NOT have to implement the actual data transfer, but it must use the UDP protocol to build a list of responders.

You can earn extra points by implementing some of the following features:

  1. The client can download the binary data one block at a time using several of the active responders. Different responders deliver different blocks of data.

  2. More robust UDP transfers (for example, if a requested data block does not arrive within time X, the client can re-request the block from another active responder).

  3. The client and server programs can work with IPv6 connections.

  4. There is also a new version ("2") of the protocol that uses checksums to ensure that the transferred data is not corrupted. The client should read a checksum (from the active responder) for each of the data blocks and check that it is correct. Also the server should provide a checksum for the entire file (which it obtains when a client registers that it is willing to provide the file). The client can use this overall checksum to check that the correct blocks have been downloaded. (If you implement this version, you must still also implement version 1 of the protocol. Add a command-line parameter to your server and client programs to activate the advanced version.)
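For the more robust UDP transfers in item 2, the client needs a way to notice that a block has not arrived within time X. One possible approach, sketched here with select, is shown below; the helper name recv_with_timeout and the RECV_TIMEOUT value are hypothetical.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

#define RECV_TIMEOUT (-2)   /* hypothetical code for "nothing arrived" */

/* Hypothetical helper: wait at most `timeout_ms` milliseconds for a
 * datagram on the UDP socket `fd`. Returns the number of bytes received,
 * RECV_TIMEOUT if no datagram arrived in time, or -1 on error. */
long recv_with_timeout(int fd, void *buf, size_t len, long timeout_ms)
{
    fd_set readfds;
    struct timeval tv;
    int rc;

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    rc = select(fd + 1, &readfds, NULL, NULL, &tv);
    if (rc < 0)
        return -1;
    if (rc == 0)
        return RECV_TIMEOUT;   /* caller may re-request from another responder */
    return (long) recv(fd, buf, len, 0);
}
```

When the helper reports a timeout, the client could mark the block as missing and send the same request to the next responder in its list.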

How to submit your program

  • The name of your server program must be palvelin and the name of your client program must be asiakas.

  • Place your source code (no binaries please) in a gzipped tar-file called ohj5016.tar.gz. For example, if directory MyWork contains your source files, you can say

        $ tar cf ohj5016.tar MyWork
        $ gzip ohj5016.tar
    
  • Send your gzipped-tar file as an attachment to the address geldenh2 -at- cs.tut.fi with the subject OHJ5016 Project. If you use another subject, the mail will not be picked up by the filtering mechanism. Make sure that it is correct.

  • You are allowed to do the project either alone or with a partner. The body of your mail must include your name and student number, and the name and student number of your partner, if you have one. If this information is not present, the credit for the project will be donated to Ethiopian orphans, and you will not receive any of it.

  • The deadline for the submission of the tar-file is 10:00 on Fri 30 May 2008 (previously 10:00 on Fri 2 May 2008).

Notes

  • The different machines in Lintula use firewalls and port protection mechanisms that make testing the project more difficult. A working TCP or UDP connection can be established in one of the following ways:

    1. run the client and the server on the same machine and use the connection address "localhost" or "127.0.0.1";

    2. run the client and the server within Lintula on the following machines and use port numbers in the range 40001-65535:

      mustatilhi, hopeatilhi, viherkiuru, mustakiuru, viherharakka, mustaharakka

      (NOTE: this will only work when you connect to the server from "inside" Lintula, not from the outside world.)

  • To avoid confusion with port numbers during testing, use the following values for TCP and UDP port numbers: 40001 + (student_number % 25499)

  • The TCP server may operate iteratively (serving one client and request at a time). If, however, you implement a concurrent server (for example, by using fork), REMEMBER to implement the important time-out constraint that is discussed at the top of this page, and make sure that it is working properly in all the processes.

  • If you are programming in C++, you can still use the C libraries by using the extern "C" { mechanism.

  • There are a lot of examples of socket programming on the Internet. We cannot prevent you from looking at them, and "if you can't beat them, join them". So here is our own example code.

  • Getting integer data into and out of packets can be tricky (because of the differences between network byte order and machine byte order). Here is some example code.

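  • If for one reason or another you need to know your own IP address, the right routine for the job is getsockname:

        /* addr is a struct sockaddr_in that holds the server address */
        connect(sock, (struct sockaddr *) &addr, sizeof addr);
        addrlen = sizeof addr;   /* addrlen must be a socklen_t */
        /* addr is now overwritten with the local address of the socket */
        getsockname(sock, (struct sockaddr *) &addr, &addrlen);
        printf("My IP is: %s\n", inet_ntoa(addr.sin_addr));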
    
  • To use the BSD socket interface in Solaris, you will need to link your program with the following libraries: -lsocket -lnsl (see, for example, man -s 3n socket).

  • Processing the command line is not the focus of the project, and you should not spend too much time on it. If you are using C, you could use the GNU popt library. (On Solaris say man -M/opt/local/gnu/man popt, and link the library with -L/opt/local/gnu/lib -lpopt).
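Regarding the note above about getting integer data into and out of packets: one way to avoid byte order problems is to write and read the individual bytes explicitly. The sketch below is not the course's own example code, and the helper names put_u32 and get_u32 are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: store a 32-bit value at `p` in network byte
 * order (most significant byte first), whatever the machine byte order. */
void put_u32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char) (v >> 24);
    p[1] = (unsigned char) (v >> 16);
    p[2] = (unsigned char) (v >> 8);
    p[3] = (unsigned char) v;
}

/* Hypothetical helper: read back a 32-bit value that was stored in
 * network byte order at `p`. */
uint32_t get_u32(const unsigned char *p)
{
    return ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16) |
           ((uint32_t) p[2] << 8)  |  (uint32_t) p[3];
}
```

Because the helpers work on bytes, they do not depend on the alignment of the packet buffer, and they behave the same on the Sun (kaarne) and Linux (haikara) machines.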


http://www.cs.tut.fi/OHJ-5016/sockets.html