Graduate Networks, UCSD

CSE222 – Spring 2009

TCP Vegas: End to End Congestion Avoidance on a Global Internet May 15, 2009

  1. Conditions for retransmission: instead of relying on the actual timer for every timeout or automatically retransmitting after n duplicate ACKs, Vegas checks, when a duplicate ACK arrives, whether the time elapsed since the corresponding packet was sent exceeds a timeout derived from the measured round-trip time, and retransmits if it does.
  2. Congestion control: instead of blindly increasing the congestion window until congestion is detected, shrinking it to compensate, and then repeating the cycle, TCP Vegas estimates the expected throughput from the non-congested round-trip time and the current congestion window size, and compares it with the actual throughput, measured from the number of bytes sent between a packet and its ACK.
  3. Slow-start: to set a more accurate threshold window for switching from exponential to linear expansion of the congestion window, the experimental Vegas* is proposed. Vegas* sends four packets and waits for the ACKs; as they arrive, it schedules future increases of the congestion window and adjusts the threshold window based on the corresponding bandwidth estimates.
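
The comparison in item 2 can be sketched in a few lines of Python. This is only an illustration, assuming the expected rate is the congestion window divided by the smallest RTT seen so far and the actual rate is the bytes sent during one measured round trip divided by that round trip's duration. The function and parameter names are hypothetical, not taken from any implementation.

```python
def throughput_gap(cwnd_bytes: float, base_rtt: float,
                   bytes_sent: float, sample_rtt: float) -> float:
    """Return expected minus actual throughput, in bytes per second.

    cwnd_bytes  current congestion window size
    base_rtt    smallest RTT seen so far, taken as the non-congested RTT
    bytes_sent  bytes transmitted between sending a segment and its ACK
    sample_rtt  measured RTT of that segment
    """
    if base_rtt <= 0 or sample_rtt <= 0:
        raise ValueError("RTT values must be positive")
    expected = cwnd_bytes / base_rtt    # rate if no queueing occurred
    actual = bytes_sent / sample_rtt    # rate actually achieved
    return expected - actual
```

A gap near zero suggests the path is not queueing the connection's data. A growing gap suggests that extra segments are sitting in router buffers, for instance when a 100 ms base RTT stretches to 125 ms with the same window.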
 

TCP Vegas: End to End Congestion Avoidance on a Global Internet May 15, 2009

Three Important Things:

  • TCP Vegas makes several implementation changes on the sending side in order to achieve better bandwidth utilization. This means that it is still fully protocol compliant, but with significant optimizations. The first is an extension to the retransmission mechanism for lost packets. In the Reno implementation, TCP waits for 3 duplicate ACKs before retransmitting. The problem is that in many cases these ACKs never arrive because of loss, and a timeout occurs. To reduce the number of timeouts and detect loss early, Vegas retransmits immediately once it determines that a sent packet was not received. It makes this determination by comparing the time taken for the duplicate ACK to arrive against the previously observed RTT.
  • Another issue with Reno concerns congestion detection. It is a reactive protocol that needs to actually create congestion and loss in order to determine the bottleneck bandwidth. Vegas improves on this by again using a measurement and heuristic approach. It operates under the principle that an increased window size should result in increased throughput. If it observes that this is not the case, Vegas assumes that the extra segments are starting to fill up the router queues, and reduces the window size. Conversely, when the throughput is too close to the expected value, more segments are sent to prevent under-utilization.
  • Finally, the TCP slow start mechanism is modified to arrive more quickly at a reasonable window size, with less overshoot and fewer losses during the initial phase. Instead of performing exponential increase and the resulting exponential backoff, Vegas doubles its window only every other round trip. Using the congestion avoidance heuristic, it decides after the intermediate round trip whether to double again or switch to linear growth. The result is that the window size and throughput waveforms are more stable initially, and settle faster into the familiar sawtooth shape.
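
The modified slow start in the last point can be illustrated with a small Python sketch. It assumes the window is held fixed for one round trip, a decision is taken, and only then is the window doubled. The `queue_building` callback is a hypothetical stand-in for the congestion avoidance heuristic, and the exact ordering of the hold and double steps is an assumption.

```python
from typing import Callable, List


def slow_start_schedule(initial_window: int, max_rtts: int,
                        queue_building: Callable[[int], bool]) -> List[int]:
    """Return the window (in segments) used in each RTT of slow start.

    The window doubles only every other RTT. In the RTT in between it is
    held fixed so a measurement can be taken; queue_building(window) is a
    hypothetical stand-in for that measurement and returns True once the
    heuristic says queues are filling, which ends slow start.
    """
    window = initial_window
    schedule: List[int] = []
    for rtt in range(max_rtts):
        schedule.append(window)
        if rtt % 2 == 1:
            # The measurement RTT just finished: decide before growing again.
            if queue_building(window):
                break
            window *= 2
    return schedule
```

Calling `slow_start_schedule(1, 10, lambda w: w >= 8)` yields `[1, 1, 2, 2, 4, 4, 8, 8]`, so the window takes about twice as many round trips to reach 8 segments as plain doubling would, but each step is checked before the next one.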

Glaring Problem:

The new implementation relies heavily on RTT measurements for all of its heuristic rules. This is a possible weakness, as it requires the host to have an accurate, fine-grained timing mechanism with very low latency. It also makes the protocol susceptible to random fluctuations in RTT that could prompt it to behave erroneously.

Future Work:

The new ideas established in this paper should be deployed further and examined independently by others. Of particular interest is the interaction between Reno and Vegas, since any rollout of Vegas would no doubt need to coexist with Reno for a long time. Reno may turn out to dominate Vegas in many cases, limiting its effectiveness.

 

TCP Vegas: End to End Congestion Avoidance on a Global Internet May 15, 2009

(i) the three most important things the paper says

The first important point the paper makes is that TCP Reno can only detect congestion by causing it. This is a significant observation, as TCP Reno must effectively make the congestion situation worse before it can make it better. The authors claim, and go on to show, that TCP Vegas handles this situation very differently. TCP Vegas attempts to predict congestion before it significantly degrades performance on the network as a whole (by clogging up buffers at each intermediate router). In effect, the authors are saying that TCP's anti-congestion mechanisms should be preventative rather than triggered by failure.

The next observation the authors make concerns how conservative the Reno retransmit algorithm is. They claim that it does not detect a lost packet fast enough (that is, before the next global timeout). TCP Vegas addresses this by not waiting for a third duplicate ACK and instead judging loss from the timeout value of each individual outstanding packet (rather than the global timeout).
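
A minimal sketch of this per-packet check follows, assuming the sender records a send time for every outstanding segment and consults it whenever a duplicate ACK arrives. The `SendTimes` class and its method names are hypothetical, and sequence-number wraparound is ignored for brevity.

```python
from typing import Dict, Optional


class SendTimes:
    """Tracks when each outstanding segment was sent (hypothetical helper)."""

    def __init__(self) -> None:
        self._sent_at: Dict[int, float] = {}

    def on_send(self, seq: int, now: float) -> None:
        self._sent_at[seq] = now

    def on_ack(self, acked_seq: int) -> None:
        # Cumulative ACK: everything up to acked_seq is no longer outstanding.
        for seq in [s for s in self._sent_at if s <= acked_seq]:
            del self._sent_at[seq]

    def on_duplicate_ack(self, now: float, timeout: float) -> Optional[int]:
        """Return the oldest outstanding sequence number if it has been
        unacknowledged for longer than the per-segment timeout, else None."""
        if not self._sent_at:
            return None
        oldest = min(self._sent_at)
        if now - self._sent_at[oldest] > timeout:
            return oldest
        return None
```

The point of the design is that the first duplicate ACK can already trigger a retransmission, provided the segment's own clock says it is overdue, instead of waiting for two more duplicates or for the coarse timer.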

A third important observation made by the authors is that the slow start mechanism used by Reno is too aggressive. The exponential start can create a congestion problem simply through how quickly it ramps up (without testing the current network conditions). TCP Vegas handles this by allowing more time between exponential growth steps, so the TCP connection can gauge its progress gradually (a slower ramp-up) without congesting the connection on a wrong guess.

(ii) the most glaring problem with the paper

One of the biggest problems with this paper is that it introduces quite a bit of complexity into what was a relatively simple protocol. This means that each TCP connection will require more processing power. While this may not be of great concern on modern PCs with ample processing power, it could become a problem for lower-powered embedded systems that use TCP for communication. With lower-powered devices on the rise (Internet phones, etc.), increasing the complexity of TCP might not be the greatest idea.

(iii) the future research directions of the work

I feel that a larger-scale study could be done with TCP Vegas on a TCP Reno based Internet. The study in the paper used a small set of Vegas connections over the Internet; it would be more interesting to have a large subset of Internet hosts run TCP Vegas alongside the existing TCP Reno hosts. I feel that this would reveal more about the usefulness of TCP Vegas in a TCP Reno world (and whether or not the conservativeness of Vegas will fall prey to the aggressive actions of TCP Reno).

 

TCP Vegas: End to End Congestion Avoidance on a Global Internet May 15, 2009

This paper talks about TCP Vegas, a TCP implementation based on TCP Reno but with better throughput and fewer packet losses. The basic idea is to try to predict congestion and avoid it, as opposed to Reno’s approach which creates congestion and then adjusts afterwards. The main contributions of the paper are:

  1. Retransmission Mechanism: Reno relies on a coarse-grained timer for retransmission timeouts, so it typically takes too long before retransmitting. It also retransmits when 3 duplicate ACKs are received. Vegas observes that reducing the dependency on the coarse-grained timer would cut timeouts further and increase throughput. Vegas recalculates the RTT each time an ACK arrives, which gives better RTT estimates. It then takes advantage of this more accurate RTT and uses certain ACKs as hints to check whether a timeout should occur: (1) rather than waiting for N duplicate ACKs, each time a duplicate ACK is received it checks whether any of the relevant segments have timed out and, if so, retransmits them; (2) when the first or second non-duplicate ACK is received after a retransmission, it again checks whether the segments in question have timed out. With this policy, Vegas can quickly detect that a retransmission is necessary, rather than having to wait for the coarse-grained timeout to expire.
  2. Congestion Avoidance and modified Slow-start: Reno needs to create losses before it can detect congestion. It has to keep increasing its window size to obtain more bandwidth until it congests the network. Vegas, on the other hand, tries to predict congestion by comparing the current throughput to an expected throughput calculation. Vegas tries to keep the difference between the two within a pair of predetermined threshold values (which indicate the number of extra buffers to use in the network). As for slow-start, Reno always sets the threshold window to half the congestion window when a retransmit timeout occurs. The congestion window then increases exponentially until it reaches the slow-start threshold, after which it increases linearly. Vegas, on the other hand, allows exponential growth only every other RTT, and in the RTT in between it keeps the window fixed so that it can calculate an accurate throughput rate.
  3. Implementation can play a significant role in the performance of a protocol: TCP Vegas is an implementation of the TCP protocol, in the same way that TCP Reno is. This goes to show that implementation details that are typically left out of the protocol specification can play a significant role in the performance of the end system, and thus deserve careful consideration.
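
The threshold rule from item 2 can be sketched as a single window-update step in Python. This is an illustration rather than the authors' code: it assumes the throughput difference is converted into a count of queued segments by multiplying by the base RTT, and the names and the default thresholds `alpha=1` and `beta=3` are assumptions chosen for the example.

```python
def adjust_window(cwnd: int, base_rtt: float, sample_rtt: float,
                  alpha: float = 1.0, beta: float = 3.0) -> int:
    """Return the next congestion window, in segments.

    diff estimates how many of this connection's segments are queued in
    the network: (expected rate - actual rate) * base_rtt.
    """
    expected = cwnd / base_rtt      # segments/s with empty queues
    actual = cwnd / sample_rtt      # segments/s actually observed
    diff = (expected - actual) * base_rtt
    if diff < alpha:
        return cwnd + 1             # too little queued: grow linearly
    if diff > beta:
        return max(cwnd - 1, 1)     # too much queued: shrink linearly
    return cwnd                     # within the band: hold steady
```

With a 10-segment window and a 100 ms base RTT, a 100 ms sample grows the window to 11, a 125 ms sample leaves it at 10, and a 200 ms sample shrinks it to 9. The adjustments are linear in both directions, which is what keeps the window steady once it sits inside the band.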

Problems: Vegas tries to always keep a few extra buffers filled up in the network. That is, it sends enough traffic to ensure that some segments are buffered up on the routers. While this might result in good performance for a few nodes, when there are thousands or millions of Vegas nodes all doing the same thing, the routers’ buffers could easily be overwhelmed and start dropping packets. It seems like a greedy idea to aim to buffer packets on the routers all the time.
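
A rough bound makes this concern concrete. It assumes, purely for illustration, that N Vegas flows share a single bottleneck router whose buffer holds B segments, and that each flow keeps between α and β extra segments queued there. N, B, α and β are symbols introduced here, not values from the paper, and Q is the total number of queued segments.

```latex
N\alpha \;\le\; Q \;\le\; N\beta,
\qquad \text{so drops become unavoidable once } N\alpha > B .
```

For example, with α = 1 and B = 500, more than 500 concurrent flows would exceed the buffer under these assumptions even if every flow sat at its lower threshold.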

Future Research: Vegas was the basis for NewReno and therefore provided the groundwork for the current popular TCP implementation. I don’t know anything about NewReno, but I would assume some research was done to change the buffering mechanism of Vegas, as it would most likely have overwhelmed routers’ buffers.