COSC 6377 - Computer Networks

Fall 2011

MW 2:30-4:00pm at AH301

Instructor

Homework - 2

Due: October 19, 2011

Source Routing with Bloom Filters

Let's imagine we have a network whose paths have an extremely large number of hops. We want to use source routing because the source has the entire network topology and can compute the entire route to the destination. Unfortunately, we do not have space in our packet to hold the entire path, so we will use Bloom filters. As we discussed in class, Bloom filters compress set membership information at a small risk of making mistakes (false positives). In this assignment, you will write programs that encode and decode the source route.

Write a program that will encode the source route as a Bloom filter. You will be provided an input file that represents the path; the nodes along the path are separated by spaces. For example, if the content of the file is the following:

1 2 3 4 5

then the path starts at node 1 and ends at node 5 with 2, 3, and 4 as intermediate nodes.

You will compute the Bloom filter as follows. First, we need to compute an index into the filter. Typically, we use a hash function to compute the index. In this assignment, we will use a special hash function:

str = "node1,node2"
hashval = asciivalue(md5sumhex(str)) mod n

md5sumhex() returns a string of length 32: the MD5 message digest written in hex digits. asciivalue() returns the sum of the ASCII values of all the characters in its argument, which is a string. For example, asciivalue("abc") would return 294.
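The hash function described above can be sketched in Python as follows. This is a minimal sketch, not a reference solution: the helper name edge_hash is hypothetical, and the sketch assumes the hex digest uses lowercase letters (as hashlib and the md5sum tool produce), which matters because uppercase digits would change the ASCII sum.

```python
import hashlib

def md5sumhex(s: str) -> str:
    # 32 hex digits of the MD5 digest (lowercase letters: an assumption).
    return hashlib.md5(s.encode("ascii")).hexdigest()

def asciivalue(s: str) -> int:
    # Sum of the ASCII codes of all characters in s.
    return sum(ord(c) for c in s)

def edge_hash(node1: str, node2: str, n: int) -> int:
    # Hypothetical helper: index in 0..n-1 for the edge "node1,node2".
    return asciivalue(md5sumhex(f"{node1},{node2}")) % n
```

Calling edge_hash("1", "2", 40) returns the filter index for the first edge of the sample route when the filter has 40 bits.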

This hash function will convert each edge in the path to an index (hashval) in the range 0..n-1. For example, if our path is n1->n2->n3, we will compute the hash for "n1,n2" and "n2,n3". Now, you will maintain a bit array and set the bit at each computed index. If the hashes of "n1,n2" and "n2,n3" are 2 and 5, then we will set the bits at indices 2 and 5 to '1'. This is our Bloom filter.
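The encoding step can be illustrated with a short Python sketch. It assumes the filter is held in memory as a list of 0/1 integers (a debugging-friendly representation), and the names edge_hash and build_filter are hypothetical.

```python
import hashlib

def edge_hash(a: str, b: str, n: int) -> int:
    # Sum of ASCII codes of the MD5 hex digest of "a,b", reduced mod n.
    digest = hashlib.md5(f"{a},{b}".encode("ascii")).hexdigest()
    return sum(ord(c) for c in digest) % n

def build_filter(path, n):
    # One bit per filter position; set the bit for every edge on the path.
    bits = [0] * n
    for a, b in zip(path, path[1:]):
        bits[edge_hash(a, b, n)] = 1
    return bits

route = "1 2 3 4 5".split()
route_filter = build_filter(route, 40)
```

Because two edges can hash to the same index, the number of set bits may be smaller than the number of edges.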

Your job is to convert the path given in the input file to a bloom filter of n bits. We will run your encoder like this:

./write-route route_filename output_filename filter_size

Sample route input file

where filter_size specifies the size of the filter in bits and output_filename is the file to which you write your Bloom filter. You should write the size of the filter as the first byte of the file and the filter itself starting at the second byte. If your filter size is 5 bytes, then here are the six bytes in the output file: 5, A, B, C, D, E, where A..E is the bit array representing your filter. Of course, all these bytes would be written one after another in the file in binary format.
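One possible way to lay out these bytes is sketched below in Python. Several choices here are assumptions rather than requirements stated in this handout: the first byte stores the filter size in bits (so the size must be at most 255), the bits are packed most-significant-bit first, and the function names pack_filter and unpack_filter are hypothetical. If the size byte is meant to count bytes instead, as the 5-byte example can be read, only the first byte changes.

```python
def pack_filter(bits):
    # Assumed layout: [size in bits][packed bits, MSB first, zero padded].
    n = len(bits)
    if n > 255:
        raise ValueError("filter size must fit in one byte")
    body = bytearray((n + 7) // 8)
    for i, bit in enumerate(bits):
        if bit:
            body[i // 8] |= 0x80 >> (i % 8)
    return bytes([n]) + bytes(body)

def unpack_filter(data):
    # Inverse of pack_filter: recover the list of 0/1 values.
    n = data[0]
    return [(data[1 + i // 8] >> (7 - i % 8)) & 1 for i in range(n)]

packed = pack_filter([0, 0, 1, 0, 0, 1, 0, 0, 0, 1])
```

The resulting bytes object can be written with open(output_filename, "wb") and read back with open(filter_filename, "rb").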

For debugging, you can use a byte array for the filter and write it to the file in human-readable text form. But you should eventually write the bit array in binary format.

If we were building a network protocol, the filter would be put on a packet and forwarded to the next hop. In our example, we are going to write the filter to a file and write another program that will decode the filter stored in the file. With this exercise, we are essentially simulating sending a packet (writing to a file), receiving a packet (reading from a file and decoding the information), and the process by which an intermediate router will decide the next hop.

For this part, assume there is an input file that specifies the topology of the network. For each node in the network that is listed, there is a line in this file. The first column of each line is a node, and the rest of the nodes on that line are its neighbors. For example, if we have an input file like this:

1 2 3
5 900 901

Sample topology file

then we have 6 nodes in the network. Node 1 has links to nodes 2 and 3. Node 5 has links to nodes 900 and 901. Here 2 and 3 are not neighbors of each other; each of them has exactly one neighbor: node 1.
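Reading such a file could look like the following Python sketch. It treats every link as bidirectional, as the example above implies, and keeps node IDs as strings; the function name parse_topology is hypothetical.

```python
def parse_topology(text):
    # Map each node ID (string) to the set of its neighbors.
    neighbors = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        node = fields[0]
        neighbors.setdefault(node, set())
        for other in fields[1:]:
            # Links are recorded in both directions.
            neighbors[node].add(other)
            neighbors.setdefault(other, set()).add(node)
    return neighbors

topology = parse_topology("1 2 3\n5 900 901\n")
```

With the sample topology, topology["2"] contains only "1", matching the description above.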

Your decoder program will read the filter (output by write-route), the topology file, and decide the next hop for nodeid. For example,

./next-hop filter_filename topology_filename nodeid

Your program will then display the next hop(s) for the nodeid. You will first read the topology file to find all the neighbors of nodeid. Then, you should do a lookup in the filter for each edge at the node using the hash function above.

Let's say the route is 1->2->3->4 and node 2 has nodes 3, 5, and 10 as neighbors. If we call next-hop for nodeid 2, it should display 3 (assuming no false positive) as the next hop. If there are false positives, it should display all the possible next hops.
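The lookup for this example can be sketched in Python as follows. The sketch assumes the filter has already been read into a list of 0/1 values and that each candidate edge is hashed as "nodeid,neighbor"; the names edge_hash and next_hops are hypothetical.

```python
import hashlib

def edge_hash(a: str, b: str, n: int) -> int:
    # Sum of ASCII codes of the MD5 hex digest of "a,b", reduced mod n.
    digest = hashlib.md5(f"{a},{b}".encode("ascii")).hexdigest()
    return sum(ord(c) for c in digest) % n

def next_hops(bits, neighbors, nodeid):
    # Keep every neighbor whose edge from nodeid maps to a set bit.
    n = len(bits)
    return [nbr for nbr in neighbors if bits[edge_hash(nodeid, nbr, n)]]

# Encode the route 1->2->3->4 into a 64-bit filter (size chosen arbitrarily).
route = ["1", "2", "3", "4"]
bits = [0] * 64
for a, b in zip(route, route[1:]):
    bits[edge_hash(a, b, 64)] = 1

hops = next_hops(bits, ["3", "5", "10"], "2")
```

Here hops always contains "3"; it also contains "5" or "10" only if one of those edges happens to collide with a set bit, which is a false positive.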

Q 1 Let's say our path has 50 hops and we are using a filter of size 40. What is the probability of a false positive, assuming uniform hashing?

Q 2 What happens to our routing/forwarding mechanism if there is a false positive? Did you encounter false positives? If you did, provide the command line arguments, route input file, and topology file so we can reproduce the false positive.

Q 3 Plot the CDF of the index values computed by our hash function for the strings "1" through "10000", for filters of size 50 and 100 bits. Is our hash function biased?

Q 4 Complete this table experimentally for these input sets: "1".."20", "1".."40", "1".."60", "1".."80", and "1".."100". Here "1".."20" represents the twenty different strings "1", "2", "3", ... "20". Thus the argument to our hash function will be a single integer formatted as a string (not "integer,integer" formatted as a string, as in our work above).

Filter Size | Compression Ratio | False Positive Rate
5           |                   |
10          |                   |
20          |                   |
30          |                   |
40          |                   |

Then, draw one graph with five lines, one for each of the five input sets. Put filter size on the x-axis and false positive rate on the y-axis.

Q 5 Why does the encoder need to put the size of the filter as the first byte in the output file?

Homework Submission

Your submission will contain the code and the answers to the questions. The answers should be in a single PDF file. Your code should be in a sub-directory. There should be a text file named README with clear instructions on how to compile and run your code with the examples given in the HW2 description. If your code does not work for some inputs or in some scenarios, you should explain that in the README. Please submit your homework through Moodle.