COSC 4377 - Introduction to Computer Networks

Spring 2012

MW 1:00-2:30pm at PGH347

Instructor: Omprakash Gnawali

Homework 10 : Understanding the Internet Paths

Due: midnight April 11, 2012

In this assignment, we will study the quality of paths in the Internet. Often we are interested in knowing how long it takes to send data from one machine in the Internet to another, e.g., from a web server to a client that wants to browse the site. We will learn how to measure latency between machines in the Internet, visualize the latencies, and make recommendations for service placement based on the information we collect.

Although the bandwidth of a path is also an important metric, we will focus exclusively on latency in this assignment. Latency matters more than bandwidth in interactive online applications that do not transfer a lot of data.

Measuring Latency

We can use the "ping" command to measure the round-trip time (RTT) between the machine you are logged in to and an arbitrary host in the Internet, as long as that host replies to ping messages.

Here is an example of using ping to measure RTT:


$ ping www.google.com
PING www.l.google.com (74.125.227.112): 56 data bytes
64 bytes from 74.125.227.112: icmp_seq=0 ttl=51 time=22.235 ms
64 bytes from 74.125.227.112: icmp_seq=1 ttl=50 time=21.460 ms
64 bytes from 74.125.227.112: icmp_seq=2 ttl=51 time=23.291 ms
64 bytes from 74.125.227.112: icmp_seq=3 ttl=51 time=23.881 ms

In the example above, the first line is the command. The rest of the lines are output from the ping command. We can see that the first RTT is 22.235 ms, the second is 21.460 ms, etc. We collect a number of these measurements and report their mean as the latency between the host on which we ran ping and www.google.com. Depending on your platform, the output of the ping command might be slightly different.
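
The averaging step described above can be sketched in Python. This is only an illustration, not a required approach: the names RTT_PATTERN, parse_rtts, and mean_rtt are made up for this example, and the pattern assumes reply lines containing a "time=... ms" field as in the sample output, which may differ on your platform.

```python
import re
from statistics import mean

# Assumption: each reply line carries a field like "time=22.235 ms".
# Some ping versions print "time<1 ms" for very short RTTs, so "<" is accepted too.
RTT_PATTERN = re.compile(r"time[=<]([\d.]+)\s*ms")


def parse_rtts(ping_output):
    """Return every RTT sample (in ms) found in the text printed by ping."""
    return [float(value) for value in RTT_PATTERN.findall(ping_output)]


def mean_rtt(ping_output):
    """Return the mean RTT in ms, or raise if no reply lines were found."""
    rtts = parse_rtts(ping_output)
    if not rtts:
        raise ValueError("no RTT samples found in ping output")
    return mean(rtts)


# The sample output shown in the example above.
SAMPLE = """\
PING www.l.google.com (74.125.227.112): 56 data bytes
64 bytes from 74.125.227.112: icmp_seq=0 ttl=51 time=22.235 ms
64 bytes from 74.125.227.112: icmp_seq=1 ttl=50 time=21.460 ms
64 bytes from 74.125.227.112: icmp_seq=2 ttl=51 time=23.291 ms
64 bytes from 74.125.227.112: icmp_seq=3 ttl=51 time=23.881 ms
"""

if __name__ == "__main__":
    print(parse_rtts(SAMPLE))
    print(mean_rtt(SAMPLE))
```

Raising an error when no samples are found keeps an unreachable host from silently showing up as a latency of zero.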

Participant Machines

There will be 20 machines available to you for approximately 10 days. In the meantime, you can prepare your scripts and commands on bayou. While on bayou, you can run ping commands to 127.0.0.1 or localhost and "pretend" they are different machines for testing your methodology.
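
One way to rehearse a methodology on bayou is sketched below, under several assumptions: the helper names run_ping and timed_batch are hypothetical, the system ping is assumed to accept "-c <count>" (as on Linux and macOS; other platforms may use a different flag), and running probes from a thread pool is just one possible design, not the expected one.

```python
import subprocess
import time
from concurrent.futures import ThreadPoolExecutor


def run_ping(host, count=4):
    """Run the system ping against host and return its raw text output.

    Assumption: a Unix-like ping that accepts "-c <count>".
    """
    result = subprocess.run(
        ["ping", "-c", str(count), host],
        capture_output=True,
        text=True,
        check=False,  # an unreachable host should not abort the whole batch
    )
    return result.stdout


def timed_batch(targets, probe=run_ping, workers=8):
    """Probe all targets concurrently.

    Returns (results, elapsed_seconds), where results is a list of
    (target, probe_output) pairs in the same order as targets.
    """
    start = time.monotonic()  # monotonic clock: unaffected by wall-clock changes
    with ThreadPoolExecutor(max_workers=workers) as pool:
        outputs = list(pool.map(probe, targets))
    elapsed = time.monotonic() - start
    return list(zip(targets, outputs)), elapsed
```

For a dry run on bayou, calling timed_batch(["127.0.0.1", "localhost"]) treats the two names as if they were different machines and also reports how long the whole batch took. The probe parameter lets you substitute a stand-in function while testing without sending any packets.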

Questions

  1. We have 20 machines available to us. How many pair-wise paths are there? Please report all the pair-wise latencies in a text file called latencies.txt. Each line in the text file will have three columns: host1, host2, and latency in ms. The entries in the file must be sorted by latency in ascending order. Please also plot the CDF of the pairwise latencies.
  2. One of our goals is to create a latency snapshot of the Internet paths. This means finishing all the measurements as quickly as possible, so that the result represents the state of the links at a given time. If it takes you a long time (e.g., 2 hrs) to perform all the latency measurements, it would be hard to argue that you have a snapshot, because some measurements are taken as far apart as 2 hrs. During that time, the link latencies in the Internet might have changed a few times.

    1. Please describe your methodology for the measurements.
    2. How long do you expect the measurement to take?
    3. How long did your measurement take? If this time is different from your expected time, can you explain the reason for the difference?
    4. How do you measure how long it takes to perform the measurement?
    Taking a snapshot with the least measurement time is a competition. We will honor the submission with the shortest measurement time.
  3. To what extent do the links have asymmetric latencies? Come up with a creative way of visualizing the link latency asymmetry in at most 1/2 page.
  4. Plot a graph where the hosts are nodes, the edges represent the paths between the nodes, and the edge weight is the smaller of the two latencies for that edge. Try your best to maintain the scale. For example, if a link AB has 10 ms latency and link CD has 20 ms latency, the link CD should be twice as long as AB in your graph. You might find tools such as graphviz or another graph drawing package useful for such an illustration.
  5. Let's imagine we have a web service that is very popular among UH students. If we had the flexibility of using one of the above machines as a server, which machine should we pick as the primary, 1st backup, and 2nd backup server? Why?
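
As a rough illustration of the bookkeeping behind question 1, the sketch below enumerates host pairs, formats lines for latencies.txt, and computes the points of an empirical CDF. Everything here is an assumption made for the example: the function names are hypothetical, each direction is treated as its own path (which question 3 seems to suggest), columns are separated by single spaces, and latencies are printed with three decimals.

```python
from itertools import permutations


def ordered_pairs(hosts):
    """Every (source, destination) pair with source != destination."""
    return list(permutations(hosts, 2))


def format_latencies(latencies):
    """Format measurements as lines for latencies.txt.

    latencies maps (host1, host2) -> mean latency in ms.
    Returns "host1 host2 latency" lines sorted by latency, ascending.
    """
    rows = sorted(latencies.items(), key=lambda item: item[1])
    return [f"{h1} {h2} {ms:.3f}" for (h1, h2), ms in rows]


def empirical_cdf(values):
    """Return (x, F(x)) points, where F(x) is the fraction of samples <= x.

    Tied values produce several points with the same x; the highest
    F(x) among them is the CDF value at that x.
    """
    ordered = sorted(values)
    n = len(ordered)
    return [(x, (i + 1) / n) for i, x in enumerate(ordered)]


if __name__ == "__main__":
    # Made-up host names and latencies, for illustration only.
    hosts = ["hostA", "hostB", "hostC"]
    print(len(ordered_pairs(hosts)), "ordered pairs")
    fake = {("hostA", "hostB"): 20.0, ("hostB", "hostA"): 10.5}
    for line in format_latencies(fake):
        print(line)
    print(empirical_cdf(fake.values()))
```

The (x, F(x)) points can be handed to any plotting tool as a step plot to obtain the CDF figure.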

Submission

All answers to the questions should be in a single PDF, no longer than three pages in length. You should also include latencies.txt in your submission. If you wrote custom scripts/programs for this assignment, please include the source. Put all the files into a folder with the name: uhid_hw10, where uhid is the prefix of your .uh.edu email address. Then, zip the directory and upload the zip file using Blackboard.