Graduate Networks, UCSD

CSE222 – Spring 2009

End-to-end Arguments in System Design April 2, 2009

Three important facts brought out by the paper

Many communication sub-systems provide functions intended to make the system more reliable, secure, error-proof, and so on. The paper looks at the big picture, covering both the communication sub-system and the application layer above it, and shows that many of these functions provide more complete coverage when implemented in the application layer.

The paper also argues that providing the same functionality in both the communication sub-system and the application layer mostly adds redundancy to the system, without helping performance or coverage to any great extent.

In addition, it discusses the trade-offs and the purpose served by providing these functions at a lower or upper layer. It looks at six major functions relevant to a networked, distributed system: reliability, delivery guarantees, secure data transmission, duplicate message suppression, in-order delivery, and transaction management. For each, it discusses the main benefit of providing the feature in the communication sub-system, in the application layer, or in both. This gives some interesting insight into how several of these features perform redundant work when others are already in place, wasting bandwidth and compute time.

Facts it overlooks or doesn’t provide

It presents no concrete experimental results. The reasoning it provides makes sense, but there are more trade-offs between providing a function in the lower or upper layer than the paper touches on. For example, it does not discuss the overhead and time wasted in letting an error reach the application layer before it is detected. It does mention that large files might benefit from per-packet error checking and that this can also be accomplished by a “careful file transfer” application. However, it makes no mention of the time and computation required to perform this error checking in the application layer compared with the lower communication layers. It is possible that performing the checks in the lower layers takes half as much time, and if the majority of errors occur in the lower layers, efficient error checking there could yield a large performance benefit. End-to-end reliability checking also carries the overhead of sending a checksum from the client machine to the server machine as the last step, so that the original checksum can be compared with the final one. If the communication sub-system detected the error earlier, this overhead could be avoided. Actual experimental results, comparing functions provided at the lower and upper layers, would give a clearer and more concrete picture of how all these trade-offs interact.
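As a rough illustration of the cost of letting an error reach the application layer, the sketch below compares the expected number of packets sent under two schemes: an end-to-end check only (the whole file is resent whenever any packet was corrupted) and per-packet retransmission in the lower layer. The model is an assumption made for this note, not something taken from the paper: packet errors are independent with a fixed probability, and checksum and acknowledgement traffic is ignored. The function names are hypothetical.

```python
def expected_packets_end_to_end(n_packets: int, p_error: float) -> float:
    """Expected packets sent when the whole file is resent until one
    attempt arrives with no corrupted packet (end-to-end check only).

    Assumes independent packet errors with probability p_error < 1.
    """
    # Probability that a single attempt delivers every packet intact.
    p_file_ok = (1.0 - p_error) ** n_packets
    # Attempts follow a geometric distribution with mean 1 / p_file_ok,
    # and every attempt sends all n_packets.
    return n_packets / p_file_ok


def expected_packets_per_packet(n_packets: int, p_error: float) -> float:
    """Expected packets sent when each packet is resent on its own
    until it arrives intact (lower-layer, per-packet check)."""
    # Each packet needs 1 / (1 - p_error) transmissions on average.
    return n_packets / (1.0 - p_error)


if __name__ == "__main__":
    n, p = 1000, 0.001  # illustrative values, not measurements
    print("end-to-end only:", expected_packets_end_to_end(n, p))
    print("per-packet     :", expected_packets_per_packet(n, p))
```

Under these assumptions, a 1000-packet file with a 0.1% packet error rate costs roughly 2,700 packets with the end-to-end check alone, against roughly 1,001 with per-packet retransmission. The gap grows quickly with file size, which is the kind of quantitative comparison the paper does not provide.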

Future Work

One extension of the paper’s idea is to provide quantitative measures that take into account the network type, traffic, protocol, etc., to determine whether a function should be provided in the upper or lower layer. A further extension could make this a dynamic decision: depending on the number of packet errors or the QoS provided by the network, a function might be realized in the lower or upper layer. For example, if there were more than one or two packet errors in a single file, error checking/reliability could be provided in the lower layer, to avoid spending time receiving the entire file before declaring it to contain an error.
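The dynamic decision described above could be sketched as a simple threshold policy. This is only one possible reading of the idea: the class name, the mode labels, and the default threshold of two errors per file are assumptions chosen for illustration.

```python
class AdaptiveCheckPolicy:
    """Chooses where error checking runs for the current file transfer.

    Hypothetical sketch: starts with an end-to-end check only, and
    switches to per-packet checking in the lower layer once the number
    of corrupted packets seen in this file exceeds a threshold.
    """

    def __init__(self, error_threshold: int = 2) -> None:
        self.error_threshold = error_threshold
        self.errors_seen = 0

    def record_packet(self, corrupted: bool) -> None:
        """Update the error count after a packet has been inspected."""
        if corrupted:
            self.errors_seen += 1

    def mode(self) -> str:
        """Return the layer that should do the checking from now on."""
        if self.errors_seen > self.error_threshold:
            return "per-packet"
        return "end-to-end"

    def start_new_file(self) -> None:
        """Reset the count, since the decision is made per file."""
        self.errors_seen = 0
```

A real policy would also need a way to observe corrupted packets while running in end-to-end mode, for example from link-layer statistics; the sketch leaves that source of information open.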

 

End-to-end Arguments in System Design April 2, 2009

The authors raise the question of where to place key functions within a layered design. The discussion is framed in the context of a network stack. They use several examples to illustrate that implementing functions in the communication layer is often insufficient for applications. In their examples, the communication layer may be able to provide guaranteed, once-only, FIFO, uncorrupted, confirmed delivery of individual packets across a network. However, only the endpoint applications can determine whether the intended action was successful. This determination will most likely involve some application-specific function that duplicates functionality already added to the communication layer. The question becomes: why add any functions to the lower layers if the application layer must perform its own form of the same functionality?
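The point that only the endpoints can tell whether the intended action succeeded can be sketched as follows. The example assumes a transfer made of two stages, a channel and a store, each modeled as a plain function on bytes; these names, and the use of SHA-256 as the checksum, are choices made for this illustration rather than details from the paper.

```python
import hashlib
from typing import Callable

Stage = Callable[[bytes], bytes]


def digest(data: bytes) -> str:
    """Checksum used for the end-to-end comparison (SHA-256 here)."""
    return hashlib.sha256(data).hexdigest()


def careful_transfer(source: bytes, channel: Stage, store: Stage) -> bool:
    """Send data through the channel, store it, then verify end to end.

    Returns True only if what was finally stored matches the source.
    """
    sent_digest = digest(source)
    received = channel(source)  # the network, possibly already reliable
    stored = store(received)    # e.g. the write to disk at the receiver
    return digest(stored) == sent_digest


def perfect_channel(data: bytes) -> bytes:
    """Stand-in for a communication layer that delivers data intact."""
    return data


def reliable_store(data: bytes) -> bytes:
    """Stand-in for a storage step that keeps the data unchanged."""
    return data


def faulty_store(data: bytes) -> bytes:
    """Stand-in for a storage step that corrupts the first byte.

    Assumes data is non-empty.
    """
    return bytes([data[0] ^ 0xFF]) + data[1:]
```

With `perfect_channel` and `faulty_store`, the communication layer has done everything right and the transfer has still failed; only the final comparison in `careful_transfer` reveals it.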

The authors go on to argue that having some level of functionality in the communication layer will reduce the errors encountered at the application level, and that most applications, except real-time ones, will benefit from this minimal level of functionality. Indeed, this is probably true for most applications.

It seems, however, that most applications expect a common level of functionality from the communication layer. This is most evident in the widespread use of TCP. Of course, real-time applications may choose UDP for performance reasons. But if the authors’ arguments are as strong as they suggest, I would expect more applications to use UDP: the corresponding TCP functionality would need to be built into the application anyway, so why bother with redundant functionality in the communication layer?

While the main argument about efficiency is valid, the larger question seems to be whether the cost of redundant functionality is acceptable to the application. My limited experience suggests that the benefits of some of the lower-layer functions exceed the cost of the redundancy (in terms of development time and runtime). Further research into quantifying how much each feature is “worth” for common network applications would be an interesting avenue to pursue.