Graduate Networks, UCSD

CSE222 – Spring 2009

Tor: The Second-Generation Onion Router May 12, 2009

Filed under: Extra Papers — stufflebean @ 12:32 am

This paper describes the architecture of the Tor communication service. Tor is an overlay network on the Internet that offers relatively strong anonymity for the traffic carried over it.

The main points of this paper are as follows:

  1. There are a great number of possible attacks against traffic over the Internet: timing attacks to determine the identity of a person accessing a given service, denial-of-service attacks to reduce or eliminate the functionality of a network, exploits which aim to impersonate a trusted service or user, and many others. The Tor network mostly aims to anonymize users, defending against other attacks only to the extent that doing so makes the service more desirable or useful. The main mechanism by which Tor provides anonymity is a set of many onion routers (ORs) over which communication is established in a circuit-based manner. The data is encrypted once for each hop in the circuit, each time with a different key, so as the encrypted message is passed along the circuit, one layer of encryption is “peeled away” at every step (hence the term “onion router”), revealing the unencrypted data only when it needs to exit the Tor overlay. This makes it very difficult to correlate data coming into the network with data leaving the network unless the attacker has compromised a large segment of the network.
  2. This system has an interesting blend of end-to-end and per-hop technologies. The onion unwrapping technique is necessarily per-hop, but the system also provides end-to-end message verification to prevent a hostile OR from injecting or modifying the data in a packet. There is also a blend of centralization and distribution. The network should be composed of many ORs in order to make it more difficult for a malicious observer to watch all endpoints for timing information, which can be used to deduce who is accessing a given service at a given time. However, to reduce overhead, directory services, which are used to provide an anonymous rendezvous for a responder service (normally only the requester’s identity is protected), are provided by a small number of centralized servers. This opens up further avenues of attack against the network, but should allow the system to operate more efficiently than one built on a gossip-based protocol. These dichotomies are interesting because they support the argument that there are always tradeoffs involved in choosing between end-to-end and per-hop or between centralized and distributed designs, and that many applications require aspects of each.
  3. The paper also makes the point that no matter how well the Tor network is designed, it is only part of the communication process, and therefore can provide only a limited set of protections against attack. For example, suppose a malicious party commandeers a web server which an anonymous user is accessing through Tor. Unless the user has used an application-level proxy to scrub personally identifying information from the HTTP request, the request might contain information that could be used to track down the requester even though it travelled over the Tor network. Tor also provides only so much deniability for a basic user. The default method for a user to connect to Tor is an onion proxy, which packages his or her requests for transit to the overlay. However, since all of the data coming from the proxy consists of requests from that user, it is easier to identify which service the user is accessing through traffic analysis at the exit nodes. This is not a failure of the Tor network, but an artifact of the service it provides. The limitation can be mitigated by the user running a full OR, since onion data leaving the user’s computer could then be either a local request or data forwarded on behalf of other users.
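To make the layer peeling from the first point concrete, here is a minimal sketch in Python. It is only an illustration under loud assumptions: the "cipher" is a toy XOR keystream built from SHA-256, which is not secure and is not what Tor uses, and the names `keystream`, `xor_layer`, `wrap`, and `relay` are made up for this example. Key negotiation, fixed-size cells, and the end-to-end integrity check are all left out.

```python
import hashlib


def keystream(key, length):
    """Toy keystream from SHA-256 in counter mode. Illustration only, NOT secure."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor_layer(key, data):
    """Add or remove one layer; XOR with the same keystream undoes itself."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def wrap(payload, hop_keys):
    """Client side: one layer per hop, so the entry hop's layer ends up outermost."""
    cell = payload
    for key in reversed(hop_keys):
        cell = xor_layer(key, cell)
    return cell


def relay(cell, hop_keys):
    """Pass the cell along the circuit and record what each hop sees after peeling."""
    views = []
    for key in hop_keys:
        cell = xor_layer(key, cell)
        views.append(cell)
    return views


# Hypothetical three-hop circuit; in this sketch the keys are simply given.
hop_keys = [b"entry-key", b"middle-key", b"exit-key"]
payload = b"GET /index.html"
cell = wrap(payload, hop_keys)
views = relay(cell, hop_keys)
```

Only the last entry of `views` equals the original payload. The entry and middle hops each see data that is still wrapped in the layers belonging to the hops after them.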

The largest weakness that I see with the paper is that the Tor network, while effective, depends a great deal on there being a considerable number of trustworthy, reliable nodes to form the OR overlay. Since there is not much of an incentive for the average Internet user to go to such great lengths to protect his or her privacy, the overlay network might have problems becoming large enough to provide the deniability which is so valuable to the anonymous users. Further, since it introduces considerable latency into the communication path, it is unsuitable for some classes of applications which might benefit from anonymity, such as VoIP. It will be interesting to see how useful it is in large-scale deployment, especially when attackers have time to exploit many of the weaknesses listed in the paper.

Future research in this field should include ways to reduce the overhead of communication while incentivizing participation in the overlay. If the overhead can be reduced to the point where it is tolerable for the average user, then the onion routing approach could be deployed on a very large scale, providing an unprecedented level of anonymity and, to some extent, privacy for the Internet at large.

 

Internet Indirection Infrastructure April 16, 2009

Three Important Things:

  • The motivating observation by the authors is that the demand for additional internet abstractions is out there, and the current IP layer cannot provide things such as multicast, anycast, and mobility in a scalable way. Taking the overlay network approach to this problem is a crucial decision, and it has the advantage of working with the infrastructure in place instead of trying to replace it. Such a network could be easily deployed in today’s internet, and based on the evaluation of its merit could someday become the dominant protocol.
  • By decoupling the sender from the receiver, i3 is able to implement a variety of services. Using indirection is a key design point that enables flexible communication between hosts. This transition to rendezvous-based communication is a marked improvement over the point-to-point model.
  • As the way that we use the internet has evolved, security has become a primary concern because of the services that are delivered over the internet. The problem is that the protocols in use today were not designed with security in mind. Any new idea for internet architecture should make sure to address security concerns. The paper comments on issues that are particularly relevant, such as eavesdropping and DoS attacks. Private triggers, the rule that a host can insert triggers only on its own behalf, and fair queuing are all important extensions.

Glaring Problem:
A major concern with i3 is that the end hosts are responsible for inserting triggers, which means that it would be hard to enforce order and structure in the resulting overlay network. I would have liked to see the paper address this more fully, and maybe elaborate on how a large-scale network could evolve out of an ad hoc approach. This is discussed in the context of preventing attacks, but not in terms of how the network should be structured.

Future Work:
The i3 overlay is presented to us as a general mechanism for implementing communication abstractions. It would be interesting to see what other abstractions could be implemented with i3 in its current form or with extensions.

 

Internet Indirection Infrastructure April 16, 2009

Filed under: R05. Internet Indirection Infrastructure — koderaks @ 2:16 pm

The main contributions of this paper are:

  1. Indirection: the authors discuss how decoupling of the sender and receivers helps to generalize point-to-point communication. i3 provides better mobility, multicast, and anycast through this indirection. i3 identifiers and triggers serve to achieve this level of indirection.
  2. The paper demonstrates the flexibility of overlay networks. It further provides valuable techniques to build security, scalability, availability, fault tolerance, and improved performance on top of the overlay. The overlay also allows for incremental deployment of i3 in a network.
  3. The authors provide many novel ideas such as using a stack of i3 identifiers to achieve service composition on top of i3, heterogeneous multicast (where receivers choose to receive different flavors of the multicast data), and fault tolerance (if one identifier fails, try the next identifier).

Glaring problems: The paper is well written and many scenarios are well thought out. However, the authors provide no experimental results for the applications of i3 most discussed in the paper: multicast, anycast, and mobility. The paper goes to great lengths to describe how i3 eases multicast, but does not provide evidence about i3’s efficiency when dealing with these. Additionally, it is my understanding that i3 would make unicast quite slow, so it would not be a suitable replacement for existing network infrastructure.

Future research: it would be interesting to implement a hybrid approach where packets are not required to go through an i3 node. It would also be worth implementing i3 servers as modified switches, that is, modifying a network switch to work with i3 triggers and identifiers instead of IP packets. If a switch can be built this way, much of the overhead of processing the triggers in software goes away.

 

Internet Indirection Infrastructure April 16, 2009

Filed under: R05. Internet Indirection Infrastructure — mdjacobsen @ 2:12 pm

The authors introduce a design for separating network senders from receivers by introducing a layer of indirection. The system is called i3. The basic structure involves receivers registering a trigger containing an id and the receiver’s address which signals willingness to receive data labeled with the id. Correspondingly, senders send data with an id. The i3 system ensures that data sent with an id is routed to receivers interested in data with that id. The decoupling is intended to improve connectivity in the face of end host mobility as well as provide scalable multicast and anycast solutions.
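The trigger mechanism described above can be sketched in a few lines of Python. This is a simplified illustration, not the paper's implementation: the class `I3Node` and its methods are hypothetical names, a single in-memory table stands in for the whole overlay, identifiers are matched exactly, and delivery is recorded in a list instead of being sent over a network.

```python
from collections import defaultdict


class I3Node:
    """Hypothetical single-node rendezvous table: id -> set of receiver addresses."""

    def __init__(self):
        self.triggers = defaultdict(set)
        self.delivered = []  # (address, data) pairs, recorded instead of sent

    def insert_trigger(self, ident, addr):
        """A receiver signals willingness to receive data labeled with ident."""
        self.triggers[ident].add(addr)

    def remove_trigger(self, ident, addr):
        """A receiver withdraws its interest in ident at the given address."""
        self.triggers[ident].discard(addr)

    def send(self, ident, data):
        """A sender publishes data under ident without knowing the receivers."""
        for addr in sorted(self.triggers.get(ident, ())):
            self.delivered.append((addr, data))


node = I3Node()

# Multicast: two receivers register the same id, one send reaches both.
node.insert_trigger("news", "10.0.0.1")
node.insert_trigger("news", "10.0.0.2")
node.send("news", "hello")

# Mobility: a receiver moves and re-registers; the sender keeps using the id.
node.remove_trigger("news", "10.0.0.2")
node.insert_trigger("news", "192.168.5.9")
node.send("news", "still here")
```

The sender's code is identical in both cases, which is the point of the indirection: multicast and mobility are handled entirely by what the receivers register.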

The greatest contribution is the idea of decoupling the sender from the receiver. This is also the source of the greatest hit to performance, as will be discussed later. Because the sender does not know who will be receiving its transmissions, it is free to simply publish the data in whatever way is easiest. The infrastructure can handle routing the data to the appropriate hosts, whether there is just one host or several.

Another feature of the i3 design is that it allows ids to be composite ordered stacks. This means that an end host can express a route that the data should take before being delivered. This feature can be used to compose services, such as transforming data into a specific format before being received by the end host. By allowing this kind of id based routing, end hosts can push processing into the network instead of orchestrating such compositions directly (as might be done in a traditional IP network). Of course this freedom is not without consequences. Being able to direct a route is a powerful feature and can cause problems with efficiency and loops if not used carefully.
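A rough sketch of how an ordered id stack could drive service composition follows, again in Python and again under assumptions. The function `route`, the three trigger kinds (`"service"`, `"redirect"`, `"host"`), and the `max_hops` budget are all inventions for this example. The hop budget is one simple way to guard against the loops mentioned above; it is not a description of what the paper does.

```python
def route(id_stack, data, triggers, max_hops=16):
    """Consume ids front to back until a host trigger delivers the data.

    triggers maps an id to a (kind, target) pair, where kind is one of:
      "service"  - target is a function that transforms the data in the network
      "redirect" - target is a list of ids that replaces the current id
      "host"     - target is the final receiver address
    """
    stack = list(id_stack)
    for _ in range(max_hops):
        if not stack:
            raise ValueError("identifier stack exhausted before delivery")
        kind, target = triggers[stack.pop(0)]
        if kind == "service":
            data = target(data)
        elif kind == "redirect":
            stack = list(target) + stack
        else:
            return target, data
    raise RuntimeError("hop budget exceeded; possible routing loop")


# Hypothetical triggers: an uppercasing service, a receiver, and a self-loop.
triggers = {
    "id_upper": ("service", str.upper),
    "id_recv": ("host", "10.0.0.7"),
    "id_loop": ("redirect", ["id_loop"]),
}

result = route(["id_upper", "id_recv"], "hello", triggers)
```

Here the sender asks for the data to pass through the uppercasing service before delivery, so `result` is the receiver's address paired with the transformed data. Routing on `["id_loop"]` never reaches a host and is cut off by the hop budget.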

The authors take steps to avoid the obvious misuses of this type of design. Security and robustness are directly addressed with seemingly adequate solutions. Although they also implement numerous optimizations to achieve low overhead and low latency, they are ultimately tied to the extra processing and hops necessitated by the level of indirection. The simulations in the paper show that at best, packet latency is twice as long as direct IP delivery — and this is after caching addresses and sampling servers to find the closest intermediary. Interestingly, one of the example applications proposed in the paper is transmission of MPEG data that is distributed to multiple end hosts. One of the hosts makes use of i3 composition to transcode the MPEG data into H.264 data en route. Clearly, this was not meant to be an example of a realtime video stream.

Despite the glaringly high latency, I find the approach to be interesting and possibly useful in specialized environments. However, it is not clear how i3 could be used on the Internet in practice. Further exploration into specific applications where mobility is an extremely common and problematic issue might yield enough demand to consider using i3.

 

Active Network Vision and Reality: Lessons from a Capsule-based System. April 14, 2009

The authors provide a description of their experience building and using an active network system, ANTS. ANTS is a system that routes capsules (i.e. network packets) through a network of active nodes according to custom software, as provided by the capsules themselves. Because the software is custom, it can provide support for deploying a new protocol without changing all the routers currently in the network.

ANTS is a Java-based active router that runs on PCs, which the paper calls active nodes. The capsules are augmented IP packets that contain additional header information that (among other things) defines the custom software to run. The custom software is not actually contained in the capsule itself. It is referenced by a signature in the capsule and acquired separately via a directory service.

One of the key characteristics of this implementation is that although the capsule doesn’t contain the custom code itself, the code is still acquired on demand. When an active node receives a capsule whose custom code it does not yet have, a header in the capsule identifies the previous active node, which can supply the missing code on demand. This solution is efficient in that it only requires distributing custom code when necessary and only to those nodes on the path of actively routed capsules.
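The on-demand loading step can be sketched as follows. This is a Python illustration of the idea only, not ANTS itself (which is written in Java): `ActiveNode`, `forward`, `tag_hop`, and the `"proto-1"` signature string are hypothetical, the capsule is a plain dict, and the previous node is passed in directly where the real system would read it from a capsule header.

```python
class ActiveNode:
    """Hypothetical active node that caches protocol code by signature."""

    def __init__(self, name):
        self.name = name
        self.code_cache = {}  # code signature -> forwarding routine
        self.fetches = 0      # how many times code was pulled from upstream

    def request_code(self, signature):
        """Serve a code request from the next node on the path."""
        return self.code_cache[signature]

    def receive(self, capsule, prev_node):
        """Run the capsule's routine, loading it from prev_node if missing."""
        signature = capsule["code_sig"]
        if signature not in self.code_cache:
            self.code_cache[signature] = prev_node.request_code(signature)
            self.fetches += 1
        return self.code_cache[signature](self, capsule)


def forward(path, capsule):
    """Send a capsule along a list of nodes; path[0] already holds the code."""
    results = []
    for prev_node, node in zip(path, path[1:]):
        results.append(node.receive(capsule, prev_node))
    return results


def tag_hop(node, capsule):
    """Stand-in for custom forwarding code: report who handled the capsule."""
    return f"{node.name} handled {capsule['payload']}"


source, middle, sink = ActiveNode("a"), ActiveNode("b"), ActiveNode("c")
source.code_cache["proto-1"] = tag_hop
capsule = {"code_sig": "proto-1", "payload": "ping"}
first = forward([source, middle, sink], capsule)
second = forward([source, middle, sink], capsule)
```

The first capsule causes each downstream node to fetch the routine once from its predecessor. The second capsule finds the code already cached, so the fetch counters do not change.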

Another benefit of the ANTS design is that the active nodes are based on soft state. Thus each active node has the flexibility to unload state as necessary. The state can be reacquired at some later point if necessary. This flexibility may result in dropped capsules, but the authors argue that this behavior is consistent with “normal” levels of packet loss in traditional forwarding networks.
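One way to picture soft state is a bounded cache that may drop any entry and fetch it again later. The sketch below is an assumption-heavy illustration: `SoftStateCache` is a made-up name, least-recently-used eviction is my choice rather than something the paper specifies, and a miss here re-fetches immediately where a real node might drop the capsule instead.

```python
from collections import OrderedDict


class SoftStateCache:
    """Hypothetical bounded cache: entries may be discarded and reacquired."""

    def __init__(self, capacity, fetch):
        self.capacity = capacity
        self.fetch = fetch            # callable used to (re)acquire state
        self.entries = OrderedDict()  # oldest entry first
        self.misses = 0

    def get(self, key):
        """Return cached state, reacquiring it if it was never loaded or was evicted."""
        if key in self.entries:
            self.entries.move_to_end(key)
            return self.entries[key]
        self.misses += 1
        value = self.fetch(key)
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # unload the least recently used entry
        return value


# The fetch function stands in for asking an upstream node for the state.
cache = SoftStateCache(capacity=2, fetch=lambda key: key.upper())
values = [cache.get(k) for k in ["a", "b", "c", "a"]]
```

With room for two entries, loading `"c"` pushes `"a"` out, so the final lookup of `"a"` counts as a fresh miss and the state is reacquired. Correctness never depends on an entry surviving, which is what lets a node unload state freely.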

Lastly, and possibly the most important contribution, the ANTS implementation supports incremental deployment. All custom protocols must support heterogeneous network environments, where active nodes are connected by traditional IP forwarding nodes. This requirement allows the ANTS system to be deployed over existing networks. Ultimately, any new protocol will need to be deployed incrementally as it is unrealistic to expect any large network to change nodes completely and instantaneously.

Unfortunately, their security model is not as well crafted. The authors describe how they tried to balance security with the idea of making ANTS available to anyone with a new protocol. It seems that their compromises were less than successful at achieving either goal. Although the ANTS runtime environment provides significant protection in terms of protocol isolation and resource usage, the validity of the custom routing code ultimately boils down to third-party certification. This approach seems to be the fallback when no programmatic means of ensuring security can be found.

Despite some faults, the ANTS system seems like a reasonable implementation for trying active network approaches. I’d like to see further research into developing low-cost devices that can function as active nodes. FPGAs are a likely candidate, as they can perform a constrained set of active node API operations at wire speed.