UNIVERSITY of HOUSTON
Department of Computer Science

COSC 6377 - Computer Networks, Fall 2000

FAQ for Term Project

  1. Q: Can I use Java (or other languages) for the project? [11/03/2000]
    Yes, but your program has to meet all the requirements. You should also provide sufficient information on how to run your program. The TA will compile and run it on the resources (machines and software) available to the department. If you need special software, you have to provide it and schedule a demo with the TA (bring your notebook/PC if necessary).

    We will not provide equivalent library functions (libproxy.a) for Java (or other languages), so you have to implement your own version of libproxy.a (unless you know how to call a C library from Java or the language you choose).

    The Java program you wrote for the Internet Computing class will probably not help you much, since here you are dealing with much lower-level things: HTTP messages, pipes, multiple socket connections, etc. You might also have to rewrite libproxy.a (and librx.a), so you might end up spending as much time as with C/C++, or more. Make sure you can finish the program in time before you commit to Java.

    Perl can do lots of things, especially with the many libraries available to you. Again, you have to rewrite libproxy.a or have something equivalent. In addition, compared to C/C++, a Perl version of the proxy server is very slow and requires lots of system resources. This also applies to Python.

    If you want to use Visual C++, you have to use Winsock, and you have to port libproxy.a to Windows, too.

  2. Q: Can I write my program on bayou? [11/03/2000]
    Yes, but you have to make sure it runs on the SunOS or Linux machines in the department. I strongly suggest that you use "gcc" for better portability. You still need an account on the CS department's machines so that you can submit your programs.

    Be aware that the department's firewall blocks most ports, so you probably can't reach a proxy server running on bayou from any of the CS department's machines.

  3. Q: How can I test my program? [last update: 11/08/2000]
    You can use telnet to test your program. For example, if your proxy server is running on pegasus and listening on port 3000, you can run "telnet pegasus 3000" and type in an HTTP request manually.

    Or use "sample1.c" (client) and "sample2.c" / "sample2-sig.c" (server) to test your program.

    You can also use Netscape or Internet Explorer (IE) to test your program. For Netscape, click "Edit" --> "Preferences" --> click on the triangle in front of "Advanced" --> "Proxies" --> check "Manual proxy configuration" and click "View..." --> type in the hostname and port number where your HTTP proxy is running. For IE, click "Tools" --> "Internet Options"; the rest is similar to Netscape. (I don't have IE with me now; I will update this part later.)

  4. Q: How can I get client's hostname? [11/06/2000]
    You can use getpeername() to determine the address of the remote system connected to a socket. Use it together with gethostbyaddr() to look up the DNS name of the remote system. Please see the sample program "message logger" for an example.
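
    A minimal sketch of that lookup, assuming an IPv4 TCP socket. The helper name peer_hostname() is made up for illustration and is not part of libproxy.a. It falls back to the dotted-quad address when the reverse lookup fails.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netdb.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: write the name of the peer connected to fd into out.
 * Returns 0 on success, -1 if fd is not a connected IPv4 socket. */
int peer_hostname(int fd, char *out, size_t outlen)
{
    struct sockaddr_in peer;
    socklen_t len = sizeof(peer);
    struct hostent *he;

    assert(out != NULL);
    memset(&peer, 0, sizeof(peer));
    if (outlen == 0 || getpeername(fd, (struct sockaddr *)&peer, &len) < 0)
        return -1;
    if (peer.sin_family != AF_INET)
        return -1;

    /* Reverse lookup: binary IP address -> host name. */
    he = gethostbyaddr(&peer.sin_addr, sizeof(peer.sin_addr), AF_INET);
    if (he != NULL && he->h_name != NULL)
        snprintf(out, outlen, "%s", he->h_name);
    else
        snprintf(out, outlen, "%s", inet_ntoa(peer.sin_addr)); /* no name found */
    return 0;
}
```

    You would call it on the descriptor returned by accept(), for example peer_hostname(client_fd, name, sizeof(name)).
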
  5. Q: How will my program be graded? [11/06/2000]
    Please see the grading section on the project page.
  6. Q: I can't compile the select() sample program, nbserver.c, from The World of select(). [last update: 11/14/2000]
    I have corrected some errors in nbserver.c. You can use my corrected version.

    You will need sockhelp.c and sockhelp.h to compile it.

    Some of you reported problems compiling sockhelp.c on SunOS machines. I have posted a corrected version of sockhelp.h.

  7. Q: How can I trace the messages exchanged among client, proxy, and web server? [11/14/2000]
    You can use a sniffer program to do so. One of the many you can find on the web is ethereal. See the instructions for ethereal that I wrote for the COSC 4377 class.
  8. Q: My program was running fine on one machine (Linux), but when I ran it on a different machine (say, a SunOS machine) it had all kinds of problems. [11/15/2000]
    If you move to a machine with a different OS, you have to recompile both "libproxy.a" and your program there; otherwise you will run into unexpected problems.
  9. Q: Some web sites reported that they couldn't find the document, but it worked fine under Netscape. [11/16/2000]
    Let's take www.yahoo.com as an example. If your request line to the web server reads "GET http://www.yahoo.com HTTP/1.0\r\n\r\n", the server will answer "document not found". If you send "GET / HTTP/1.0\r\n\r\n" instead, it will find the correct document. Click on the two "GET" links for details.

    Please note that Netscape always includes a "Host:" header field, which you can use to find the remote web site's hostname.
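
    One way to turn the absolute URL that a browser sends to a proxy into the host to connect to and the path for the outgoing request line is sketched below. This is only an illustration: split_http_url() is a made-up name, port 80 is assumed when the URL gives none, and only "http://" URLs are handled.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: split "http://host[:port][/path]" into its parts.
 * An empty path becomes "/". Returns 0 on success, -1 on a malformed URL
 * or a buffer that is too small. */
int split_http_url(const char *url, char *host, size_t hostlen,
                   int *port, char *path, size_t pathlen)
{
    const char *p, *slash, *colon;
    size_t n;

    assert(url != NULL && host != NULL && port != NULL && path != NULL);
    if (strncmp(url, "http://", 7) != 0)
        return -1;
    p = url + 7;

    /* The host[:port] part ends at the first '/' or at the end of the string. */
    slash = strchr(p, '/');
    if (slash == NULL)
        slash = p + strlen(p);

    colon = memchr(p, ':', (size_t)(slash - p));
    n = (size_t)((colon != NULL ? colon : slash) - p);
    if (n == 0 || n >= hostlen)
        return -1;
    memcpy(host, p, n);
    host[n] = '\0';

    *port = (colon != NULL) ? atoi(colon + 1) : 80;
    if (*port <= 0 || *port > 65535)
        return -1;

    if (*slash == '\0')
        slash = "/";            /* "http://www.yahoo.com" means the root document */
    if (strlen(slash) >= pathlen)
        return -1;
    strcpy(path, slash);
    return 0;
}
```

    With host and path in hand, the proxy connects to host:port and sends "GET <path> HTTP/1.0" to the web server.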

  10. Q: How do I know the client (Netscape) has finished transmitting the request? [11/16/2000]
    First, check for two consecutive CRLFs (\r\n\r\n), which separate the header fields from the body. Second, if there is a "Content-Length" header field, it gives the length of the content that follows the CRLFs, so you know how many more bytes you have to read from the client.
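
    A rough sketch of this check, assuming you append everything read from the client to one buffer and call the function after each read. The name request_total_length() is hypothetical, and the sketch ignores overflow of very large Content-Length values.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>

/* Hypothetical helper: returns the total size of the request in bytes
 * (header fields plus body), or -1 if the blank line that ends the
 * header fields has not arrived yet. buf need not be NUL-terminated. */
long request_total_length(const char *buf, size_t len)
{
    static const char key[] = "Content-Length:";
    const size_t klen = sizeof(key) - 1;
    size_t i, hdr_end = 0;
    long body = 0;

    assert(buf != NULL);
    for (i = 0; i + 4 <= len; i++) {
        if (memcmp(buf + i, "\r\n\r\n", 4) == 0) {
            hdr_end = i + 4;            /* first byte after the separator */
            break;
        }
    }
    if (hdr_end == 0)
        return -1;

    /* Look for Content-Length at the start of a header line.
     * Header field names are case-insensitive. */
    for (i = 0; i + klen <= hdr_end; i++) {
        if ((i == 0 || buf[i - 1] == '\n') &&
            strncasecmp(buf + i, key, klen) == 0) {
            const char *p = buf + i + klen;
            const char *end = buf + hdr_end;
            while (p < end && (*p == ' ' || *p == '\t'))
                p++;
            while (p < end && isdigit((unsigned char)*p))
                body = body * 10 + (*p++ - '0');
            break;
        }
    }
    return (long)hdr_end + body;
}
```

    The request is complete once the number of bytes received is at least the returned value.
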
  11. Q: Should I keep the connection open after I finish transmitting data to the client (Netscape)? [11/16/2000]
    No, you don't have to, unless you get "Proxy-Connection: Keep-Alive" in the header fields. Replies from web sites may contain "Connection: Keep-Alive"; you can replace it with, or add, "Connection: close" so that the client closes the connection after receiving the data.
  12. Q: I can get HTML files without problems, but no images. [11/19/2000; last update: 11/20/2000]
    This problem is most likely caused by the way your proxy reads data from the web server. A socket is treated as a stream, so you can use any of the file read / write / flush functions (except those for random-access files), for example read(), recv(), fscanf(), fgets(), fgetc(), getc(), etc. (the stdio functions need a FILE pointer, which you can get from fdopen()). But be aware of how some of these functions treat terminating characters such as '\n' and '\0': both can be part of the image content.

    Also, pay attention to "Content-Length", which tells you the length of the image file.

    [11/20/2000 update:]
    If you read extra data and send it to the client, the client (Netscape) might not display the image properly.
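
    A sketch of a copy loop that is safe for binary data, because it relies only on the byte counts returned by read() and write() and never on '\n' or '\0'. The name relay_bytes() is made up for illustration. It copies until end of file; a variant could stop after Content-Length bytes instead.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: copy everything from descriptor `from` to descriptor
 * `to` until EOF. Returns the number of bytes copied, or -1 on error. */
long relay_bytes(int from, int to)
{
    char buf[4096];
    long total = 0;

    assert(from >= 0 && to >= 0);
    for (;;) {
        ssize_t n = read(from, buf, sizeof(buf));
        ssize_t off = 0;

        if (n == 0)
            break;                      /* EOF: the sender closed its end */
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        /* write() may accept fewer bytes than requested, so loop. */
        while (off < n) {
            ssize_t w = write(to, buf + off, (size_t)(n - off));
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            off += w;
        }
        total += n;
    }
    return total;
}
```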

  13. Q: gethostbyname() does not work when I call it a second time. [11/20/2000]
    Some students reported that gethostbyname() does not work properly if you use strtok() in your program. Since the text you are parsing (for example, the request line) has a fixed format, you can use sscanf() instead.
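
    For example, a request line can be split with one sscanf() call. This is only a sketch: parse_request_line() and the buffer-size macros are hypothetical names, and the field widths in the format string must stay one less than the buffer sizes.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define REQ_METHOD_MAX  16
#define REQ_URL_MAX     1024
#define REQ_VERSION_MAX 16

/* Hypothetical helper: split "GET http://host/path HTTP/1.0" into three
 * caller-supplied buffers of the sizes above. Returns 0 on success. */
int parse_request_line(const char *line, char *method, char *url, char *version)
{
    assert(line != NULL && method != NULL && url != NULL && version != NULL);
    /* %s stops at whitespace, so the trailing "\r\n" is not copied. */
    if (sscanf(line, "%15s %1023s %15s", method, url, version) != 3)
        return -1;
    if (strncmp(version, "HTTP/", 5) != 0)
        return -1;
    return 0;
}
```
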
  14. Q: I can't get the sample code socket-sample.c to run on Linux, but it ran fine on SunOS machines. [11/20/2000]
    Somehow, Linux doesn't like two lines in socket-sample.c. I have made some changes; see the new version of socket-sample.c. Line 74, "server_addr.sin_addr.s_addr = inet_addr(inet_ntoa(hostentry->h_addr_list[0]));", does not work well on Linux. Lines 105-110, the if clause for setsockopt(), also do not work well on Linux.
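
    For reference, h_addr_list[0] already holds the address in binary form and network byte order, so it can be copied straight into sin_addr without going through inet_ntoa() and inet_addr(). The sketch below shows that common idiom; it is not necessarily what the corrected socket-sample.c does, and fill_server_addr() is a made-up name.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: resolve hostname and fill in addr for an IPv4
 * connect(). Returns 0 on success, -1 if the name cannot be resolved. */
int fill_server_addr(const char *hostname, unsigned short port,
                     struct sockaddr_in *addr)
{
    struct hostent *hostentry;

    assert(hostname != NULL && addr != NULL);
    hostentry = gethostbyname(hostname);
    if (hostentry == NULL || hostentry->h_addrtype != AF_INET)
        return -1;
    if ((size_t)hostentry->h_length > sizeof(addr->sin_addr))
        return -1;

    memset(addr, 0, sizeof(*addr));
    addr->sin_family = AF_INET;
    addr->sin_port = htons(port);
    /* Copy the binary address as-is; no text conversion is needed. */
    memcpy(&addr->sin_addr, hostentry->h_addr_list[0],
           (size_t)hostentry->h_length);
    return 0;
}
```
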
  15. Q: I have a question about getting the client host name, such as "eng.uh.edu". I used getpeername() and got an IP address. How do I convert the IP address to the actual domain name? [11/22/2000]
    You can use gethostbyaddr().

    [11/23/2000 update:]
    Also, see item 4.9 in the Unix-socket-faq for network programming.

    The h_name field of the struct hostent returned by gethostbyaddr() might contain only the host name (say "pegasus", not "pegasus.cs.uh.edu"). The reason is that name resolution is done by NIS (instead of DNS). One way to work around this is to check the content of h_name: if it doesn't contain a domain name (for example "cs.uh.edu" or "uh.edu"), you can append the domain name to h_name yourself (you can run "domainname" to get the domain name). As far as I know, there is no easy way to solve this problem. If you know a better solution, feel free to post it in the newsgroup or send me an e-mail.
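
    A sketch of the workaround, under the simplifying assumption that a name without a dot has no domain part. qualify_hostname() is a hypothetical name, and the domain string is whatever you decide to pass in (for example "cs.uh.edu").

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy name to out, appending "." and domain when
 * name has no dot in it. Returns 0 on success, -1 if out is too small. */
int qualify_hostname(const char *name, const char *domain,
                     char *out, size_t outlen)
{
    int n;

    assert(name != NULL && domain != NULL && out != NULL);
    if (strchr(name, '.') != NULL)
        n = snprintf(out, outlen, "%s", name);        /* already qualified */
    else
        n = snprintf(out, outlen, "%s.%s", name, domain);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```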

  16. Q: I am still confused about how the filter works. And what is "scramble" in proxyrc? [last update: 11/27/2000]
    This is how you filter incoming data. The HTTP protocol defines a set of content types that have exactly the same structure as those used in MIME mail. You have to parse the "Content-Type" header field, extract the media type, and pass it to proxy_filter(). If a filter has been defined for that type, the return value is a string containing the shell command to execute.
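
    Extracting the media type could look like the sketch below. extract_media_type() is a made-up name; the result (for example "text/html") is the string you would then hand to proxy_filter(), whose exact signature is not shown here.

```c
#include <assert.h>
#include <string.h>
#include <strings.h>

/* Hypothetical helper: given one header line such as
 * "Content-Type: text/html; charset=iso-8859-1\r\n", copy "text/html"
 * into out. Returns 0 on success, -1 if the line is not a Content-Type
 * header or out is too small. */
int extract_media_type(const char *line, char *out, size_t outlen)
{
    static const char key[] = "Content-Type:";
    const char *p;
    size_t n;

    assert(line != NULL && out != NULL);
    /* Header field names are case-insensitive. */
    if (strncasecmp(line, key, sizeof(key) - 1) != 0)
        return -1;
    p = line + sizeof(key) - 1;
    while (*p == ' ' || *p == '\t')
        p++;

    /* The media type ends at a parameter (';'), whitespace, or end of line. */
    n = strcspn(p, "; \t\r\n");
    if (n == 0 || n >= outlen)
        return -1;
    memcpy(out, p, n);
    out[n] = '\0';
    return 0;
}
```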

    In the proxyrc file,

        filter text/(.*) "scramble -swedish -\1"
          
    The sample entry above says that all text should be piped through an imaginary scramble filter, which reads text on stdin and writes a swedish-chef version of it to stdout. There is no program called "scramble" on the system. To test your program, you can use the following entry instead. (An earlier version of this sed command had a typo: there should be an "s" before the first "/".)
        filter text/(.*) "sed s/is/IS/g"
          
    This replaces every "is" in the returned document with "IS". Say the document requested from the remote web site is "http://www.yahoo.com/index.html". The document "index.html" received from www.yahoo.com is redirected to the stdin of the filter (the sed program), and every "is" in index.html is replaced with "IS" (for example, "This is" becomes "ThIS IS"). The filter gives the same result as "cat index.html | sed s/is/IS/g | more" (the stdin of "sed" is redirected from "cat" and the stdout of "sed" is redirected to "more"). This doesn't mean you have to save the document to a file: you should create pipes to run the external filter, and redirect the output of the filter (stdout) to the client.

    Also, after you get the result from proxy_filter(), you can call proxy_exec() to execute the filter. Please read "exec.c" for information on how to use proxy_exec(). Please note that one student reported that proxy_exec() might not work properly. If you can't get proxy_exec() working, consider writing your own version of it.
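
    If you do write your own replacement, the usual pipe / fork / exec pattern is one option. The sketch below is not the libproxy.a implementation, and filter_buffer() is a made-up name. It assumes the document body is already in memory and that out_fd is the client socket. A real proxy should also ignore SIGPIPE in case the filter exits early.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the shell command cmd with data on its stdin
 * and out_fd as its stdout. Returns 0 if all data was written and the
 * command exited with status 0, otherwise -1. */
int filter_buffer(const char *cmd, const char *data, size_t len, int out_fd)
{
    int to_child[2];
    int status;
    size_t off = 0;
    pid_t pid;

    assert(cmd != NULL && data != NULL);
    if (pipe(to_child) < 0)
        return -1;

    pid = fork();
    if (pid < 0) {
        close(to_child[0]);
        close(to_child[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: read from the pipe, write to the client, run the filter. */
        dup2(to_child[0], STDIN_FILENO);
        dup2(out_fd, STDOUT_FILENO);
        close(to_child[0]);
        close(to_child[1]);
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127);                     /* only reached if exec failed */
    }

    /* Parent: feed the document to the filter, then signal EOF. */
    close(to_child[0]);
    while (off < len) {
        ssize_t w = write(to_child[1], data + off, len - off);
        if (w < 0)
            break;
        off += (size_t)w;
    }
    close(to_child[1]);

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return (off == len && WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

    For the sample entry, filter_buffer("sed s/is/IS/g", body, body_len, client_fd) would send the filtered document to the client without saving it to a file.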


Last modified: November 27, 2000.