UNIVERSITY    of     HOUSTON
Department of Computer Science

COSC 4377 - Introduction to Computer Networks
Section 07662, Fall 2001

Project: A Distributed Sorting Program using UDP transport layer

Design Plan: Due in the class, April 2, 2001

Final Project: Due at 10 PM, April 24 (new date), 2001

 

Description:

The goal of the project is to design and implement a distributed sorting algorithm using UDP transport layer. The input will be a text file containing a list of integer keys separated by blanks. The output should have exactly the same format, except that the numbers should be sorted in increasing order. Only one processor should read the input and write to the output file. The general idea is that one node will read input and distribute the keys among nodes. All the nodes will then combine to perform the sort. Finally, one node will output the sorted keys in a file.

You have a lot of freedom regarding how you would implement this project - basically any reasonable approach is fine. You can choose any sorting algorithm, but it should scale. This means that, as the number of processors is increased, the work done by each processor should decrease proportionally (approximately). Obviously, a single processor should not sort the entire list, or do any operation on all keys that is not linear in the number of keys. But a single processor may do a linear operation on the entire list, e.g., it could merge two arrays or lists of sorted keys, or it could separate a list of keys into buckets, where each bucket is an array or list of numbers in a range of values. You should also avoid unnecessary communication. For example, it is not desirable to broadcast the entire array of keys to all processors.

You can develop any scheme to overcome messages lost or reordered by the UDP transport layer. (You may want to write a version of your socket routines to randomly lose a packet with a fixed probability, for testing). However, you should keep in mind that the purpose of this scheme is simply to get sorting done reliably, and not a general purpose reliable transport like TCP.

You can make reasonable assumptions. Here are some examples:

Some of the assumptions you can make about the keys.

Do not assume that the number of processors is a constant, but you may make other reasonable assumptions. For instance, you may assume:

You have to take into account that messages may get lost or reordered in transit when using UDP. However, if it helps, you may assume:

You may make other reasonable assumptions. But please make sure you list them in your design document and the final report. Also, you probably will not need to some of the assumptions above. Don't think you are missing the solution if you don't need some or all of the above sample assumptions.

What to submit and when:

April 2 (in class): Submit a plan of how you will implement the project. It should contain the list of person(s) in the team, what kind of algorithm you will use for sorting, how you will implement it in the sockets framework, and how you will recover from any loss and reordering of packets done by UDP. This would typically be about one page. The main purpose of this document is to make sure that you have a good plan, so don't waste time formatting and styling.

April 20 (10 PM): Submission of a final project report. The report should contain the following:

You will be asked to pick an appointment slot and demonstrate your project the week following the submission.

Honesty Policy:

You are free to discuss your design with other students and exchange example test inputs etc. But you must write your reports and develop your code independently. We are very eager to catch cheating with our toolset to find commonalities in documents and code. Don't fall in the trap.

Late penalty:

10 points per day out of a total of 100 points for the final project submission.
5 points per day for late submission of your design document.

Sample UDP Client/Server Programs:

Here are sample UDP programs. Here is sample code for timer which show you how to use "gettimeofday".

Test Data:

Here is the source (testdata.c) for generating test data. The default setting will generate 500 numbers in the range of [0 .. 9999]. Change the program to fit your needs.

How to submit

At command line on CS SunOS/Solaris machines (pegasus, letos, etc.), type
  ~xsun/submit assignment# filename#
then, follow the instruction. For exapmle:
  ~xsun/submit 4 filename#

Last modified: April 21, 2001.
tihuang at cs . uh . edu