Assignment 1 is due by 5 pm on Thursday, April 16th. Place your code, makefile, and README in a directory labeled A-B, where A and B are the last names of the two people in your group. Use tar to create a file A-B.tar.gz and email it to barath@cs.ucsd.edu. Your makefile should generate a binary named “server”, and upon execution the server should take the listening port number as an option. Your README should contain a brief description of what you implemented, how you implemented it, and what problems there were (or still are). Your assignment will be evaluated under Linux (2.6.21-rc6 i686 Intel(R) Pentium(R) 4 CPU 2.80GHz with GCC/G++ version 3.4.6).
The grading checklist that will be used to evaluate your submission is here. Take a look and feel free to share test cases, questions, and answers in the comments section of this post.
Hi, I have a few questions:
For HTTP 1.0, should the server NOT include a content-length header? (i.e., is it an error to include this for HTTP 1.0?)
There was mention of .htaccess files in the assignment page, and from one of your email responses to me I inferred that we had to implement the HEAD method as well… I’ve done these – are these not graded?
Also I did not work with a partner. What filename would you expect me to use for submission (lastname.tar.gz)?
You should include content-length even for HTTP 1.0:
As for the HEAD method, I’ll be using it to test whether the server returns a 404 when appropriate. You don’t have to implement .htaccess, but you should check whether you are able to read the file in question (and return a 403 if not).
And yup, if you worked alone, just name your submission lastname.tar.gz
Barath,
the page says: “Applications should use this field to indicate the size of the Entity-Body to be transferred, regardless of the media type of the entity. A valid Content-Length field value is required on all HTTP/1.0 request messages containing an entity body.”
Since we only send response messages, is it necessary for our server to send Content-Length header as well?
Suprita
Sorry, I should have been clearer. Content-length isn’t necessary for HTTP 1.0 since the default is that the server just closes a connection after it sends an object to the client (thereby indicating EOF). However, if it’s easier (and I imagine it might be depending on your implementation) you should include it for both 1.0 and 1.1. I’m not going to use a client that gets confused about content-length.
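For anyone who wants to send Content-Length for both versions, here is a minimal sketch of formatting the header block for a 200 response. The function name format_ok_header and its parameters are made up for illustration and are not part of the assignment; a real server would add more headers (Content-Type and so on).

```c
#include <stdio.h>

/* Hypothetical helper: writes the status line and a Content-Length header
 * for a 200 response into buf. minor_version is 0 for HTTP/1.0 and 1 for
 * HTTP/1.1. Returns the number of bytes written, or -1 if buf is too small. */
int format_ok_header(char *buf, size_t cap, int minor_version, long body_len)
{
    int n = snprintf(buf, cap,
                     "HTTP/1.%d 200 OK\r\n"
                     "Content-Length: %ld\r\n"
                     "\r\n",
                     minor_version, body_len);
    if (n < 0 || (size_t)n >= cap)
        return -1;   /* output was truncated or formatting failed */
    return n;
}
```

The same header block works for 1.0 and 1.1. With 1.0 the server can still close the connection after the body, so the length is just extra information for the client.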
I have a question regarding 403 vs. 404 in the case of trying to access something above the document root (e.g. accessing /../foo). It seems like it would take considerable effort to return a 403 for files which don’t exist since it is hard to perform relative -> absolute path conversion on a file that doesn’t exist (without parsing the relative path by hand, which is a pain). I know it’s probably inappropriate for a server to reveal whether a file exists outside of the document root, but is it okay to return a 403 in some cases and a 404 in others as long as no files above the document root are allowed to be viewed in the end? I guess to ask it another way, how rigorously are we going to be tested on files above the document root?
In a related question, are we defining a 403 for an otherwise valid document as not being world readable (e.g. chmod o-r foo)?
I’m not going to test weird cases of files above the document root. The easiest way to handle it is to return 403 for any request that goes above the document root, irrespective of whether the file exists or not.
Also, you don’t have to explicitly check permissions. Instead, just try to open the file for reading and if you get EACCES then you can return 403.
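A rough sketch of that approach, assuming a hypothetical helper named open_for_response (the name and the choice of 500 for other errors are my assumptions, not requirements): try the open and translate errno into a status code.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: tries to open path for reading and maps the result
 * to an HTTP status code. On success (200), *fd_out holds the open file
 * descriptor, which the caller must close; otherwise *fd_out is -1. */
int open_for_response(const char *path, int *fd_out)
{
    int fd = open(path, O_RDONLY);
    if (fd >= 0) {
        *fd_out = fd;
        return 200;
    }
    *fd_out = -1;
    switch (errno) {
    case EACCES:
        return 403;   /* file exists but we may not read it */
    case ENOENT:
    case ENOTDIR:
        return 404;   /* no such file, or a path component is not a directory */
    default:
        return 500;   /* assumption: treat anything else as a server error */
    }
}
```

This avoids a separate permission check that could disagree with what open() actually does.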
I guess my question is more specific to my implementation. I’m using realpath (man 3 realpath) to convert all of my paths to canonicalized absolute paths (no symlinks, /./, or /../). This has the advantage of making it very easy to tell if the final path is below the document root since I can just do a strncmp to ensure that the root is a prefix of the path. Unfortunately, if the file given to realpath does not exist, it returns ENOENT and does not give any absolute path. In order to always return a 403 for any access that goes above the document root, it seems like it would be necessary to check every path element in the relative path, which is more complicated. I can do this if it’s what is required, though.
realpath() is quite convenient…
Well, my tests were going to be only for 404s for files within the document root that don’t exist and for 403s for files above the document root that do exist, so it’s up to you how you handle the case of files above the document root that don’t exist.
I noticed that when I do something like http://www.cs.ucsd.edu/../index.html, it would just return http://www.cs.ucsd.edu/index.html (i.e. looking for anything above the root is treated as looking for a file in the root), rather than return a 403, so I implemented the corresponding behaviour. Is this incorrect?
That’s fine. A lot of webservers do a redirect to the root if you try to go above the root, but doing a redirect isn’t a requirement for this assignment.
Hi Barath, I had a question regarding connection keep-alive. Although my server is keeping the connection open, I find that the browser makes a new connection for subsequent requests instead of reusing the existing one. I am sending the HTTP keep-alive field in the header to the client. Is there anything else I need to do to make the client use the persistent connection?
If you’re using Firefox, try setting network.http.max-connections-per-server, which I think should force it to use a single pipelined connection for its transfers. (And of course make sure network.http.pipelining is enabled.)
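On the server side, it may also be worth double-checking the rule for when to keep the connection open. A minimal sketch, assuming a hypothetical helper should_keep_alive that receives the already-parsed value of the request's Connection header (or NULL if there was none): HTTP/1.1 connections are persistent unless the client says "close", and HTTP/1.0 connections are persistent only if the client asks for "keep-alive".

```c
#include <stddef.h>
#include <strings.h>

/* Hypothetical helper: decides whether to keep the connection open after
 * the current response. minor_version is 0 for HTTP/1.0, 1 for HTTP/1.1.
 * connection_value is the request's Connection header value, or NULL. */
int should_keep_alive(int minor_version, const char *connection_value)
{
    if (connection_value != NULL) {
        if (strcasecmp(connection_value, "close") == 0)
            return 0;
        if (strcasecmp(connection_value, "keep-alive") == 0)
            return 1;
    }
    /* No explicit preference: persistent by default only in HTTP/1.1. */
    return minor_version >= 1;
}
```

This sketch assumes the header holds a single token. A reused connection also only works if the client can tell where each response ends, so every response on it needs a correct Content-Length.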