Three Important Things:
- It is commonly accepted that the internet in its current form is robust and adjusts routes efficiently in the face of link failures. Such failures are fairly common in an infrastructure of this size, so the authors saw value in testing this belief. By evaluating BGP convergence, they concluded that inter-domain failover incurs significant delay, and that the path vector approach is incorrectly believed to provide better convergence properties than distance vector routing. They claim that the delay stems from ambiguity in the BGP specification and from varying implementations across router vendors.
- The paper focuses on experimental trials conducted over a long period of time, which lends strength to the results. The authors emphasize several points about designing and executing their experiments so as to produce meaningful, relevant results. Faults were injected into the actual internet backbone rather than into a simulation environment. Latency effects were measured at the router level by monitoring ISP routing tables, which allowed the examination of four kinds of events: route repair, route failure, repair and failover, and failure and failover. End-to-end measurements were performed to determine the application-level impact. The results showed that even moderate routing table flux caused packet loss, increased packet latency, and out-of-order delivery.
- In addition to the hard data, the authors provide theoretical lower and upper bounds on convergence. Exploring these bounds yielded several insights about delayed convergence. For example, the lack of a minimum route advertisement interval timer in some implementations influences the order in which link updates are processed, which in turn affects the rate of convergence. Numerous implementation variations across vendors contribute to the spread between the bounds.
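To make the path vector point above more concrete, here is a minimal sketch of how a withdrawal can trigger "path exploration" in a full mesh. It is a toy model of my own, not the authors' methodology and not real BGP: the function names, the synchronous rounds, the shortest-path preference, and the lowest-neighbor tie-break are all assumptions, and there is no advertisement timer at all.

```python
from typing import Dict, Optional, Tuple

# An AS path is a tuple of node ids ending at the origin; None means "no route".
Path = Optional[Tuple[int, ...]]


def best_path(node: int, rib_in: Dict[int, Path]) -> Path:
    """Pick the shortest loop-free path among those learned from neighbors."""
    candidates = [
        (node,) + path
        for _, path in sorted(rib_in.items())
        if path is not None and node not in path  # path-vector loop detection
    ]
    # min() keeps the first minimal element, so ties go to the lowest neighbor id.
    return min(candidates, key=len) if candidates else None


def simulate_withdrawal(n: int) -> Tuple[int, int, Dict[int, Path]]:
    """Full mesh of n nodes; node 0 originates a prefix, then withdraws it.

    Hypothetical toy model. Returns (rounds, messages, final best paths).
    """
    nodes = range(n)
    # Converged starting state: every node uses the direct route via node 0.
    best: Dict[int, Path] = {v: (0,) if v == 0 else (v, 0) for v in nodes}
    rib_in = {v: {u: best[u] for u in nodes if u != v} for v in nodes}

    # The failure: node 0 loses the prefix and withdraws it from every neighbor.
    best[0] = None
    in_flight = [(0, v, None) for v in nodes if v != 0]
    rounds, messages = 0, len(in_flight)

    while in_flight:
        rounds += 1
        # Deliver everything sent in the previous round.
        for sender, receiver, path in in_flight:
            rib_in[receiver][sender] = path
        in_flight = []
        # Each node reselects its route and announces it only if it changed.
        for v in nodes:
            new_best = best_path(v, rib_in[v])
            if new_best != best[v]:
                best[v] = new_best
                in_flight += [(v, u, new_best) for u in nodes if u != v]
        messages += len(in_flight)
    return rounds, messages, best


if __name__ == "__main__":
    for size in (3, 4, 5, 6):
        r, m, _ = simulate_withdrawal(size)
        print(f"n={size}: rounds={r}, messages={m}")
```

In this sketch each node falls back to a longer, stale path learned from a neighbor before finally giving up, so a single withdrawal costs more rounds and more messages as the mesh grows. That is only meant to illustrate the kind of behavior the bounds describe, not to reproduce them.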
Glaring Problem:
A major concern I had with this work is that it focuses only on BGP and does not really discuss the parallels between inter-domain and intra-domain results. It would have been useful to see how the results pertain to other routing protocols, if at all.
Future Work:
Following this thorough analysis of BGP convergence, it would be useful to follow up on the root causes of delayed convergence and to design new protocols, or at least new features in future routers, that could help alleviate the strain on the internet backbone. These topics were mentioned in the paper, but not at length. Such research would be crucial to the continued viability of the internet from a QoS perspective.