This paper describes the architecture of the Tor communication service. Tor is an overlay network on the Internet that provides relatively strong anonymity guarantees for the communication it carries.
The main points of this paper are as follows:
- There are a great number of possible attacks against traffic over the Internet. There are timing attacks to determine the identity of a person accessing a given service, denial-of-service attacks to reduce or eliminate the functionality of a network, exploits which aim to impersonate a trusted service or user, and many others. The Tor network mostly aims to anonymize users, protecting against other attacks only to the extent that doing so makes the service more desirable or useful. The main mechanism by which Tor provides anonymity is a set of onion routers (ORs) over which communication is established in a circuit-based manner. A message is encrypted once for each hop in the circuit, using a different key per hop, so as the encrypted message is passed along the circuit, one layer of encryption is “peeled away” at every step (hence the term “onion router”), and the unencrypted data is revealed only when it needs to exit the Tor overlay. This makes it very difficult to correlate data coming into the network with data leaving the network unless the attacker has compromised a large segment of the network.
- This system has an interesting blend of end-to-end and per-hop technologies. The onion unwrapping technique is necessarily per-hop, but the system also provides end-to-end message verification to prevent a hostile OR from injecting or modifying the data in a packet. There is also a blend of centralization and distribution. The network should be composed of many ORs in order to make it more difficult for a malicious observer to watch all endpoints for timing information, which can be used to deduce who is accessing a given service at a given time. However, to reduce overhead, directory services are provided by a small number of centralized servers. These services are used to provide an anonymous rendezvous for a responder service (normally only the requester’s identity is protected). This centralization opens up further avenues of attack against the network, but should allow the system to operate more efficiently than one based on a gossip-based protocol. These dichotomies are interesting because they support the argument that there are always tradeoffs involved when choosing end-to-end/per-hop or centralized/distributed designs, and that there are generally applications which require aspects of each.
- The paper also makes the point that no matter how well the Tor network is designed, it is only part of the communication process, and therefore can only provide a relatively limited set of protections against attack. For example, suppose a malicious party manages to commandeer a web server which an anonymous user is attempting to access through Tor. Unless the user has used an application-level proxy to scrub any personally identifying information from the HTTP request, the request might contain information that could be used to track down the requester even though they used the Tor network. Tor also only provides so much deniability for a basic user. The default method for a user to connect to Tor is using an onion proxy, which packages the user’s requests for transit to the overlay. However, since all of the data coming from the proxy consists of requests from that user, it is easier to identify which service the user is accessing through traffic analysis at the exit nodes. This is not a failure of the Tor network, but an artifact of the service it provides. This limitation can also be mitigated by the user running a full OR, since onion data leaving the user’s computer could then be either a local request or data forwarded on behalf of other users.
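To make the per-hop and end-to-end pieces described above concrete, here is a minimal sketch of layered encryption with an integrity tag that is checked only at the exit. It is a toy built on assumptions of my own, not Tor's actual ciphers or cell format: the keystream is improvised from SHA-256 and is not secure, the hop keys are assumed to be already shared with each OR, and the names `wrap`, `peel`, and `route` are hypothetical.

```python
import hashlib
import hmac
from typing import Sequence

TAG_LEN = 32  # length of a SHA-256 HMAC tag in bytes


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 in counter mode). Illustration only, NOT secure."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_layer(key: bytes, data: bytes) -> bytes:
    """Add or remove one encryption layer; XOR is its own inverse."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def wrap(payload: bytes, hop_keys: Sequence[bytes]) -> bytes:
    """Client side: tag the payload for the exit hop, then add one layer per hop."""
    # End-to-end check: only the holder of the last hop's key can verify this tag.
    tag = hmac.new(hop_keys[-1], payload, hashlib.sha256).digest()
    cell = tag + payload
    # The last hop's layer goes on first (innermost), the first hop's layer last.
    # With an XOR cipher the order does not change the bytes; it is kept here
    # to mirror the onion structure.
    for key in reversed(hop_keys):
        cell = xor_layer(key, cell)
    return cell


def peel(cell: bytes, key: bytes) -> bytes:
    """One OR removes its own layer and forwards the result."""
    return xor_layer(key, cell)


def route(cell: bytes, hop_keys: Sequence[bytes]) -> bytes:
    """Simulate the circuit: each hop peels a layer, the exit verifies the tag."""
    for key in hop_keys:
        cell = peel(cell, key)
    tag, payload = cell[:TAG_LEN], cell[TAG_LEN:]
    expected = hmac.new(hop_keys[-1], payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("integrity check failed: cell was modified in transit")
    return payload


if __name__ == "__main__":
    keys = [b"entry-key", b"middle-key", b"exit-key"]  # hypothetical hop keys
    onion = wrap(b"GET / HTTP/1.0", keys)
    print(route(onion, keys))
```

Flipping any byte of the wrapped cell before it reaches the exit makes `route` raise `ValueError`, which is the end-to-end property: an intermediate OR can still alter the ciphertext, but the change is detected when the last layer comes off.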
The largest weakness that I see with the paper is that the Tor network, while effective, depends a great deal on there being a considerable number of trustworthy, reliable nodes to form the OR overlay. Since there is not much of an incentive for the average Internet user to go to such great lengths to protect his or her privacy, the overlay network might have problems becoming large enough to provide the deniability which is so valuable to the anonymous users. Further, since it introduces considerable latency into the communication path, it is unsuitable for some classes of applications which might benefit from anonymity, such as VoIP. It will be interesting to see how useful it is in large-scale deployment, especially when attackers have time to exploit many of the weaknesses listed in the paper.
Future research in this field should include ways to reduce the overhead of communication while incentivizing participation in the overlay. If the overhead can be reduced to the point where it is tolerable for the average user, then the onion routing approach could be deployed on a very large scale, providing an unprecedented level of anonymity and, to some extent, privacy for the Internet at large.