Graduate Networks, UCSD

CSE222 – Spring 2009

Internet Mapping: from Art to Science May 25, 2009

Filed under: R17. Internet Mapping: from Art to Science — vikrams3 @ 4:13 pm

The title of the paper describes the efforts of the authors to make Internet research more compatible with other science fields.

  1. Unlike other science areas, Internet researchers are faced with a dearth of real data, which keeps them from realistically monitoring the state and patterns of network traffic. This lack of understanding has led to many major investments being made in the dark, misguided by businesses’ individual interests. Hence there is an immediate need to develop tools that make monitoring of the Internet simpler.
  2. The measurement system Ark that is described in the paper has several useful features. It helps in easy development and rapid prototyping of a measurement idea, and hence indirectly helps the researcher devote more time to innovative ideas. Another important feature is the distributed measurement and coordination (aggregation of results) among several nodes in the experimental system.
  3. A very useful feature of Ark is that its coordination is based on tuple-space distributed shared memory. Ark provides a simple interface and usage semantics to the user, who can use simple queries to get data on the dynamics behind ping and traceroute programs, RTT as a function of distance, etc. The results are stored in the form of tuples.

One problem that I find is in the usage of distributed shared memory for storing data in tuple-space. DSM is known to have many problems, such as complex semantics and availability issues. It is not clear from the paper whether the authors have addressed these issues.

It will be interesting to see the measurements from the largest set of IP topology data that the authors are currently collecting. Also, it will be nice to see if there are any substantial advantages of IPv6 over IPv4 based on the measurement data from the Ark tools developed.

 

Internet Mapping: from Art to Science May 25, 2009

Filed under: R17. Internet Mapping: from Art to Science — giledelman @ 4:13 pm

Three Important Things:

  • The Ark architecture was developed as a means to deploy active internet measurements. It was built around good software engineering principles of rapid prototyping and low barrier-to-entry coding. Recognizing that internet measurement is an embarrassingly parallel operation, the architecture incorporates several “monitor” nodes that can cooperate and coordinate their operations. The implemented communication channel (tuple space) allows unicast and multicast communication, and serves as the building block for decentralized management and data aggregation.
  • The process for consolidating path information into AS-level topologies is non-trivial in the context of the paper. Still, a high level view of the internet is desirable, and so the authors outline three steps to achieve such a map. First they establish ways to map IP addresses to routers, and then IP addresses to ASes. A router level topology is established by consolidating IP addresses that belong to the same routers. Interdomain and intradomain links are identified when two adjacent addresses map to different entities.
  • With an infrastructure of 31 monitors at the time that this paper was written, the team members were able to gather 2.1 billion traceroutes. This was done with random probing of /24 IP prefixes using the “scamper” tool. This demonstrates the scope and ultimate feasibility of internet measurement.
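The IP-to-AS step and the link identification described in the second point can be illustrated with a minimal sketch. It is not the authors' code: the prefix table, the AS numbers (private-use values), the addresses (documentation prefixes), and the function names `ip_to_as` and `as_links` are all assumptions made for illustration. A real table would come from BGP data rather than a hard-coded dictionary.

```python
import ipaddress

# Hypothetical prefix-to-AS table: documentation prefixes, private-use AS numbers.
PREFIX_TO_AS = {
    ipaddress.ip_network("192.0.2.0/24"): 64512,
    ipaddress.ip_network("198.51.100.0/24"): 64513,
    ipaddress.ip_network("203.0.113.0/24"): 64514,
}


def ip_to_as(addr):
    """Return the AS of the longest matching prefix, or None if unmapped."""
    ip = ipaddress.ip_address(addr)
    matches = [net for net in PREFIX_TO_AS if ip in net]
    if not matches:
        return None
    best = max(matches, key=lambda net: net.prefixlen)
    return PREFIX_TO_AS[best]


def as_links(path):
    """Collect undirected AS links from adjacent hops that map to different ASes."""
    links = set()
    for hop_a, hop_b in zip(path, path[1:]):
        as_a, as_b = ip_to_as(hop_a), ip_to_as(hop_b)
        if as_a is not None and as_b is not None and as_a != as_b:
            links.add(frozenset((as_a, as_b)))
    return links


# A made-up traceroute path crossing three ASes.
path = ["192.0.2.1", "192.0.2.254", "198.51.100.7", "203.0.113.9"]
links = as_links(path)
```

Adjacent hops inside the same AS (the first two addresses here) produce no link, which is why only the AS boundaries show up in the result.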

Glaring Problem:

Lacking in this paper was a discussion on creating a geographic map of the internet. With information about the IP-router mappings this should be possible to obtain.

Future Work:

Ark is presented as an extensible and distributed architecture for performing internet measurements. As such there are many more possible uses for it. Possible future applications are: bandwidth estimation, quality of service monitoring, content sampling, traffic pattern observation based on time of day and location.

 

Internet Mapping: from Art to Science May 25, 2009

Filed under: R17. Internet Mapping: from Art to Science — ctrezzo @ 4:13 pm

Three major contributions:

  1. Developed and designed a new architecture for Internet topology measurement and mapping. It is called Archipelago (Ark).
  2. Currently collecting the largest set of IP topology data for use by academic researchers.
  3. Provide a method for Internet researchers to quickly and easily deploy new measurement techniques at a low cost. This allows more experimentation with measurement techniques that could lead to greater discovery.

Major problem:

They claim that it is easy and cheap for researchers to deploy measurement experiments, but they do not provide any proof for this claim. Also, they do not show if their measurement architecture is actually producing realistic data.

Future implications:

This will allow for greater collaboration between Internet researchers. Also, this system will hopefully provide larger real world data sets that researchers can use to substantiate their work. Questions about the political (security) and economic effect this system will have on the world need to be answered.

 

Internet Mapping: from Art to Science May 25, 2009

Filed under: R17. Internet Mapping: from Art to Science — Mike @ 4:13 pm

(i) The three most important things the paper says:

1. Ark was conceived to give researchers a platform capable of many different kinds of internet measurements. They distributed system nodes worldwide to enable measurements of separate internet cultures as well as combined measurements of the internet as a whole. Ark also has the capability to create AS topologies and IP topologies of the internet through use of novel identification techniques.

2. Since the system is completely distributed, coordination was a huge concern and forte of Ark’s design. Without coordination that allows the nodes to communicate quickly and effectively, the researchers would be extremely limited in the kinds of measurements they could take, and most of the advantages of having wide deployment would vanish. Ark employs the tuple space coordination language. This language takes full advantage of pattern-matching to allow autonomous measurements while allowing communication and control between nodes.

3. Since parameters, commands and return values can all be expressed in the form of tuples, the user can easily write programs in the functional model used by most programming languages today. This abstraction allows the programmer to stay far away from the network-level details of his implementation and focus his effort on taking sufficient measurements for his needs.

(ii) The most glaring problem with the paper:

The most glaring problem I noticed with the paper is that it doesn’t discuss the shortcomings of Ark. One of the most useful things to know about an application is what it fails to achieve. Even though it is hard for an author to want to present weaknesses, doing so allows others to further understand the design and brings to light further research topics in the area. It is unfathomable that Ark has perfectly addressed all of its original objectives.

(iii) The future research directions of the work:

Ark in itself presents a wide range of future directions for research. Finding novel ways to exploit Ark’s design to achieve different measurements would be a rewarding avenue to pursue. There is also the opportunity to try to build better research platforms than Ark, which could be tailored towards certain applications or be even more flexible.

 

Internet Mapping: from Art to Science May 25, 2009

Filed under: R17. Internet Mapping: from Art to Science — mohit1982 @ 4:13 pm

This paper talks about the importance of internet infrastructure measurement and assessment and proposes a secure measurement platform to accomplish it. They claim to be using the best available techniques for IP topology mapping and developing some new techniques and software for data analysis, topology generation, and interactive visualization of result-oriented annotated graphs. The authors talk about their next generation skitter-based active measurement infrastructure, Archipelago (Ark). They stress three qualities that Ark strives for:

  • The authors see easy deployment and rapid prototyping as very important for internet measurements, and therefore Ark supports software development at a high level of abstraction using dynamic scripting languages and pre-built APIs and services. They use Ruby as the primary implementation language.
  • They stress that many desirable measurements require dynamism and coordination among measurement nodes. To enable this, they provide a tuple-space coordination model called Marinda which is a distributed shared memory with a small number of easy to use operations. It also allows for decoupling of measurement processes in time and space.
  • Another quality pursued is the support of measurement services. This support for services is made possible by the tuple space, which acts as the unified mechanism for transport and messaging, in the terminology of the web services protocol stack.  The service architecture based on tuple space has advantages of low deployment effort and cost, no special privileges required, decentralized management, ease of implementation, ease of aggregation and diverse communication patterns.

The authors have deployed these Ark monitors in diverse geographical regions. This infrastructure systematically measures IP-level paths to a dynamically generated list of IP addresses covering all /24 prefixes in routed IPv4 address space. They also perform DNS lookup of all IP addresses seen in the IPv4 Routed /24 topology dataset. Reconstructing the router-level topology from this data requires grouping multiple IP addresses belonging to the same router. This grouping process is called alias resolution. The authors have developed several IP alias resolution heuristic techniques to this end. They derive an AS-level topology map of the internet from Ark and Route Views data.
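The grouping step of alias resolution can be sketched as follows. This is only an illustration under stated assumptions: it does not reproduce the authors' heuristics, and it assumes that some earlier step has already produced pairs of addresses believed to sit on the same router. The function name `group_aliases` and the sample addresses (documentation prefixes) are hypothetical. The sketch merges the pairs into router groups with a union-find structure.

```python
def group_aliases(addresses, alias_pairs):
    """Merge pairwise alias evidence into groups, one group per inferred router."""
    parent = {addr: addr for addr in addresses}

    def find(addr):
        # Follow parent links to the group representative, shortening the path.
        parent.setdefault(addr, addr)
        while parent[addr] != addr:
            parent[addr] = parent[parent[addr]]
            addr = parent[addr]
        return addr

    for a, b in alias_pairs:
        parent[find(a)] = find(b)

    groups = {}
    for addr in list(parent):
        groups.setdefault(find(addr), set()).add(addr)
    return sorted(sorted(group) for group in groups.values())


# Made-up input: two alias pairs that chain three addresses onto one router.
addresses = ["192.0.2.1", "192.0.2.5", "198.51.100.2", "203.0.113.4"]
alias_pairs = [("192.0.2.1", "198.51.100.2"), ("198.51.100.2", "192.0.2.5")]
routers = group_aliases(addresses, alias_pairs)
```

An address with no alias evidence stays in a group of its own, which corresponds to treating it as a separate router until more data arrives.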

Future work would involve gathering a large dataset of IP topology data and expanding the analysis tools required to infer trends from the data.

 

Internet Mapping: from Art to Science May 25, 2009

Filed under: R17. Internet Mapping: from Art to Science — ameenakel @ 4:13 pm

(i) the three most important things the paper says

One thing that the paper says is that in order for their system of measurement to work properly, their system must be deployed diversely around the world. Thinking about this observation, this is really the only way to achieve a scalable measurement of the internet as a whole, as it is quite vast and complicated. Also, since these types of measurements will require such hardware, the measurement hardware itself cannot be prohibitively expensive. Another important point that the paper makes is that even measuring traffic around the internet will not alone give insight as to how the internet is actually organized. It requires much more analysis of that data to even think of developing some sort of meaningful organization of ASes and routers. The authors even note that these algorithms are not trivial either. The authors also point out that ICMP-based traffic seems to reach the greatest number of router links on the internet, but unfortunately does not return very desirable observations about the network being analyzed. This goes to show how difficult a field internet measurement can be.

(ii) the most glaring problem with the paper

It seems that, although the researchers here have invented a great system for measuring certain aspects about the internet, they have yet to develop a system that measures the data that they are most interested in.  They claim that they must use pretty complicated heuristics (and that they’re still searching for more) in order to extrapolate data from their results.  It seems as if the project wasn’t quite ready for a paper submission (as the system itself–including interpretive models–isn’t even ready yet).

(iii) the future research directions of the work.

I can’t quite think of a better research direction than the one the authors are already pursuing.  It is of great importance that the network community study an infrastructure that mimics that of the internet in order to make decisions about future innovations and changes to its underlying protocols.

 

Internet Mapping: from Art to Science May 25, 2009

Filed under: R17. Internet Mapping: from Art to Science — siva @ 4:13 pm

One of the contributions of the Ark project is the simplicity and speed of development and deployment of different measurement experiments related to Internet measurement. It provides a high level of abstraction to users, which allows for faster development of measurement experiments.

Another important contribution of the project is the ability to create experiments using several nodes with a high degree of coordination between them. This is made possible through the tuple space abstraction. This allows different heterogeneous hosts to coordinate with each other and perform a measurement task efficiently.

A characteristic feature of the Ark project is the creation of a service-oriented measurement infrastructure for mapping the internet topology. Users can implement and deploy measurement-related services using the tuple space abstraction. This allows the research community to easily use the services developed by other users.

One of the drawbacks of the paper is that it does not describe how the Ark project is different from other similar efforts on Internet measurement like iPlane. There is no comparison to other related efforts.

Future research in this direction could be on how to use this topology data to determine interesting properties of the internet. For example, studying how the topology changes over time (both on a short timescale and over a longer period), combined with other measurements of bandwidth demands, can provide good pointers on provisioning resources as the internet grows.

 

Internet Mapping: from Art to Science May 25, 2009

Filed under: R17. Internet Mapping: from Art to Science — stufflebean @ 4:13 pm

This paper describes Archipelago (Ark), which is a measurement infrastructure being used primarily for mapping the IP topology of the Internet. The main body of the paper describes their efforts to measure the topology of the Internet, but they start by describing the three features they are striving toward in the Ark infrastructure. We will examine how their topology mapping benefits from the architecture of the Ark infrastructure by examining their goals:

  1. Easy development and rapid prototyping. The process of generating IP topology data requires multiple steps involving creative uses of (primarily) ICMP and UDP probing. Since the Ark infrastructure provides high-level APIs and services, like the scamper measurement tool, generating these probes is a very straightforward process involving only a few commands in a high-level scripting language (they use Ruby). In addition, Ark facilitates writing new measurement tools, like those comparing ICMP traceroute against UDP link measurement and analyzing probes from spoofer clients, which have been developed by other researchers for use on Ark.
  2. Dynamic and coordinated measurements. The Ark infrastructure provides a tuple space which allows storage and communication of data through a map-like interface. This tuple space provides shared state which is vital for creating distributed services, since it allows them to communicate without having to coexist temporally or geographically. It also provides greater flexibility since messages don’t have to be directed to an address which might not be known a priori.
  3. Measurement services. Another useful attribute of the tuple space is that it enables the construction of named services without having to know who is providing them. For example, by placing the tuple ["PING", "<some IP address>"] into the tuple space, the user or program requesting the ping is counting on someone providing a ping service to the tuple space. However, the requester does not have to know who is providing the service in order to receive the response. It also allows great parallelism, since multiple service providers have access to the tuple space and one or many of them can respond to a given request, depending on its semantics.
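The request and response pattern from the third point can be sketched with a toy, in-process tuple space. This is not Marinda's API: the class `TupleSpace`, its `write` and `take` methods, the `PING_RESULT` tuple, and the stand-in `fake_measure` function are all assumptions made for illustration, and no packets are sent. The point is only that the requester and the provider never address each other directly.

```python
class TupleSpace:
    """Toy in-process tuple space; None in a pattern matches any value."""

    def __init__(self):
        self._tuples = []

    def write(self, tup):
        self._tuples.append(tuple(tup))

    def take(self, pattern):
        """Remove and return the first tuple matching pattern, or None."""
        for i, tup in enumerate(self._tuples):
            if len(tup) == len(pattern) and all(
                p is None or p == v for p, v in zip(pattern, tup)
            ):
                return self._tuples.pop(i)
        return None


def ping_provider(space, measure):
    """Serve one pending PING request, if any, and post the result."""
    request = space.take(("PING", None))
    if request is None:
        return False
    _, addr = request
    space.write(("PING_RESULT", addr, measure(addr)))
    return True


def fake_measure(addr):
    # Stand-in for a real probe; returns a made-up RTT in milliseconds.
    return 42.0


space = TupleSpace()
space.write(("PING", "192.0.2.1"))           # requester posts a request
served = ping_provider(space, fake_measure)  # any provider may pick it up
result = space.take(("PING_RESULT", "192.0.2.1", None))
```

Because `take` removes the request, a second provider polling the same space would find nothing to serve, so each request is handled once in this sketch. Requests that should be answered by many providers would need a non-destructive read instead.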

The weakness of this paper is that the Ark service does not seem to have any real provisions for security. Anyone who is connected to the tuple space can serve requests without having to provide credentials certifying that they will produce a valid result. While this may not be a problem at the earlier stages of the network (similarly to the Internet, where initially all users were assumed to be trusted), it might be a problem if it is used on a larger scale or for commercial purposes.

Future research might include a layer of security that is simple to use in a similar manner to the tuple paradigm already in effect. For example, a credential server could provide service registration through a shared key system whereby only the registered service providers had the private key necessary to decrypt requests.

 

Internet Mapping: from Art to Science May 25, 2009

Filed under: R17. Internet Mapping: from Art to Science — liyunjiu @ 4:13 pm

The Archipelago measurement infrastructure is CAIDA’s newest active measurement infrastructure which uses state-of-the-art strategic measurement and analysis methods to provide a comprehensive and coherent view of the internet topology.

1. Ark allows for rapid prototyping by developing at a high level of abstraction using dynamic scripting languages. They provide a library for controlling tools like scamper and iffinder from a Ruby script which allows users to collect distributed measurements.

2. Ark focuses on coordination between Ark monitor nodes to obtain results. Ark employs a new implementation called Marinda, a tuple-space distributed shared memory that stores tuples and offers a small number of operations.

3. Ark supports measurement services: a user can write a program that interprets tuples as commands, performs measurements, and returns the result as a tuple. This approach allows researchers to build on the work of others at the granularity of services.

There are no serious flaws in the paper. Further research topics include measurement tools for probing the internet and the legal ramifications of publishing AS-level topology that providers will not want to be public.

 

Internet Mapping: from Art to Science May 25, 2009

Filed under: R17. Internet Mapping: from Art to Science — koderaks @ 4:13 pm

This paper is about CAIDA’s newest measurement platform. The main contributions of the paper are:

  1. Archipelago  (Ark): The paper describes the new generation of CAIDA’s measurement infrastructure. Ark is designed with three goals in mind: (1) Easy deployment; this is achieved by choosing Ruby as the primary implementation language for measurements, with a set of APIs. (2) Dynamic measurements; the goal is to achieve a dynamic environment for the targets where coordination among the measurement nodes is possible. Ark achieves this by utilizing a tuple-space model, where the notion of distributed shared memory is made feasible. (3) Providing measurement services; the goal is to provide a set of services that the researchers can build on, and to hide the complexity of network programming by utilizing these services.
  2. Two alias resolution methods are used to group multiple IP addresses belonging to the same router. The iffinder tool and kapar (modified version of APAR) are used towards obtaining a router-level map of the Internet. This creates an IP-to-router mapping.
  3. The infrastructure presented in the paper enables the creation of an AS-router internet topology. A traceroute is first captured, including IP hops along the path from a source to a destination, and then Route Views is used to map IP addresses to AS numbers. Two different ASes within two consecutive IP hops of a trace usually represent a link between the two ASes. This topology is represented using a simple undirected, unweighted graph. This is an IP-to-AS mapping. This mapping and the IP-to-router mapping mentioned above are merged to create a dual AS-router graph: links between ASes are annotated with router IDs, and routers are annotated with the AS that they belong to. This map is a big step forward in the internet mapping desired by the authors.
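The merge into a dual AS-router graph can be sketched as below. It is a minimal illustration, not the authors' procedure: it assumes the IP-to-router mapping and a router-to-AS assignment are already available as dictionaries, and the function name `dual_graph`, the router IDs, the addresses (documentation prefixes), and the AS numbers (private-use values) are hypothetical.

```python
from collections import defaultdict


def dual_graph(hop_pairs, ip_to_router, router_to_as):
    """Map each inter-AS link to the set of router pairs observed on it."""
    as_links = defaultdict(set)
    for ip_a, ip_b in hop_pairs:
        r_a, r_b = ip_to_router.get(ip_a), ip_to_router.get(ip_b)
        if r_a is None or r_b is None:
            continue  # hop without a resolved router
        as_a, as_b = router_to_as.get(r_a), router_to_as.get(r_b)
        if as_a is None or as_b is None or as_a == as_b:
            continue  # unknown AS, or an intra-AS link
        as_key = tuple(sorted((as_a, as_b)))
        as_links[as_key].add(tuple(sorted((r_a, r_b))))
    return dict(as_links)


# Made-up inputs: R1 has two interfaces; R2 and R3 share an AS.
ip_to_router = {
    "192.0.2.1": "R1",
    "192.0.2.2": "R1",
    "198.51.100.1": "R2",
    "203.0.113.1": "R3",
}
router_to_as = {"R1": 64512, "R2": 64513, "R3": 64513}
hop_pairs = [
    ("192.0.2.1", "198.51.100.1"),
    ("192.0.2.2", "198.51.100.1"),
    ("198.51.100.1", "203.0.113.1"),
]
graph = dual_graph(hop_pairs, ip_to_router, router_to_as)
```

The two hop pairs that enter through different interfaces of R1 collapse into a single annotated link, which is the benefit of resolving aliases before building the AS graph.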

Problems: The only problem I can think of is the placement of Ark monitors. The map shows no monitors in Africa and the ones in Asia are all concentrated around China/Japan. Though these areas represent most of the internet, the other areas that are under-represented by the monitors could affect the final mapping of AS-to-router significantly.

Future: there seem to be many scenarios where it’s not possible to determine precisely which AS owns which IP address. A good number of these scenarios are avoided by the measurement protocol (i.e., the AS-sets and multi-origin ASes). More research to find ways to determine the mapping for these ambiguous cases could result in a more precise mapping.

 

Internet Mapping: from Art to Science May 25, 2009

(i) Three most important things

1. We critically depend on the Internet for our professional, personal, and political lives, but we know little about what keeps the Internet stable, and the Internet is becoming ever more challenging to research and analyze. The paper presents the design of an infrastructure and operating system platform known as Ark that supports large-scale active measurement studies of the global Internet for Internet topology mapping.

2. Easy development and rapid prototyping are important factors in how they promote discovery. A researcher can explore riskier ideas, which could have higher returns, when the time and effort needed to implement a measurement idea are lowered. Ark supports rapid prototyping by promoting software development at a high level of abstraction using dynamic scripting languages and pre-built APIs and services.

3. It should be easy for researchers to use and to build upon the work of others at the granularity of services. Ark supports measurement services by providing a tuple space which acts as the unified mechanism for transport and messaging. A user can easily deploy a measurement service by writing a program that interprets tuples as commands, performs some measurement, and returns the result as a tuple.

(ii) Most glaring problem

The most glaring problem would be that the paper doesn’t discuss much how Ark has helped study Internet topology. The paper mentions a couple of research efforts that have used Ark but doesn’t really provide any actual results and conclusions.

(iii) Future Research Directions

Future research directions for this work would be to have Ark implemented by other research communities so that more data can be collected on Internet topology and expand Ark to perform more IPv6 topology measurements.

 

Internet Mapping: from Art to Science May 25, 2009

Filed under: R17. Internet Mapping: from Art to Science — yipiokayyay @ 4:13 pm

(i) The three most important things the paper says:

1) Building their system around easy development and rapid prototyping is an important design decision. Claffy et al. point out that this not only increases productivity but also promotes discovery. I agree, because it allows researchers to focus on developing new research strategies instead of spending large amounts of time on development. If developing a program for research takes a long time, researchers may limit the types of experiments that they complete. However, if the facilities for performing this research are easy to use, they may be encouraged to perform more types of analysis.

2) The paper indicated that “In a few cases when we cannot determine a Provider-Customer relationship for a set of ASes accessing the same router, we assign this router to the AS with the smallest outdegree.” Basically, in this case they are making a best-effort educated guess. What I think is important is this type of approach, in which they are actively looking for ways to continue their research and not be bounded by external factors. This is unlike the Claffy paper “Ten Things Lawyers Should Know About the Internet”, which didn’t really address the limitations that it complained about. Furthermore, they did mention that moving forward they will replace these assumptions with accurate data as they get it. This approach is very important to this study since so many pieces of information are missing.

3) They are actively looking to obtain information about IPv6 as it is being deployed. I believe this is an important fact, since they are in the unique position to capture information about a technology at its early adoption phase. This type of “infrastructure data” is invaluable for future research, as indicated in “Ten Things Lawyers Should Know About the Internet”. In addition, they have already indicated a good use of this data to help deal with IPv4 exhaustion. This further reinforces the point that this is an important topic.
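The smallest-outdegree fallback quoted in point 2 can be written as a short sketch. It makes several assumptions that the quote does not settle: the AS graph is given as a dictionary from an AS number to its set of neighbours, an AS's outdegree is taken to be the size of that set, and ties are broken by the lower AS number. The function name `assign_router` and the AS numbers (private-use values) are hypothetical.

```python
def assign_router(candidate_ases, as_graph):
    """Pick the candidate AS with the smallest outdegree (ties: lower AS number)."""
    return min(
        candidate_ases,
        key=lambda asn: (len(as_graph.get(asn, ())), asn),
    )


# Made-up AS graph: 64512 has three neighbours, the others have one each.
as_graph = {
    64512: {64513, 64514, 64515},
    64513: {64512},
    64514: {64512},
    64515: {64512},
}
owner = assign_router([64512, 64513], as_graph)
```

The intuition is that a router seen at the edge between a large provider and a small network more plausibly belongs to the small one, though, as the review notes, this remains an educated guess.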

(ii) The most glaring problem with the paper:

The biggest problem with the paper is the fact that they didn’t talk about the errors in the ARK system.  No system is perfect and many assumptions were made in the building of ARK and in the measurement it collected (e.g. AS information in traces).  However, there was no mention about potential errors in the calculation of the topology, only assumptions.  Also, when it presented the assumptions that were made, Claffy et al didn’t address the implications of these underlying assumptions.  The reader is left to speculate the outcome.

(iii) The future research directions of the work:

The future research of the work would work to resolve inconsistency or gaps in their data by building business relationships.  Perhaps there is some investigation into the legal options in which they can partner with an ISP.  Ultimately this system seems like they are from the “outside looking in”.  However if we want something accurate and precise, we will need to have data on the inside.

 

Internet Mapping: from Art to Science May 25, 2009

The paper describes an active measurement infrastructure currently in use by CAIDA, the Cooperative Association for Internet Data Analysis. It states that the need for this measurement effort is justified by the large dependency in today’s society on the function of the Internet.

Mainly, this paper contributes a description of the Archipelago (short: Ark) measurement infrastructure, which provides a high-level abstraction for the service of measuring network properties in a controlled, distributed manner. A Ruby API ensures rapid prototyping. A tuple-space coordinates the different measurement nodes by providing shared state between the nodes.

Second, this paper introduces a number of measurements which are supported by the Ark infrastructure and describes the methods and goals of these measurements. To capture IP topology, traceroute and ping tools are run from measurement nodes targeting all routable /24 prefixes in the IPv4 Internet. These measurements obtain topology information annotated with round-trip times over time. Furthermore, DNS resolution and IP to router mappings are measured. With these annotations, an AS-level map of the internet is generated by mapping routers to ASes.
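Covering every /24 in a routed prefix with one random destination each can be sketched as follows. This is not the target-list generator used by Ark or scamper: the function name `one_target_per_slash24` is hypothetical, the example prefix comes from a range reserved for benchmarking, and no probes are sent. The sketch only shows how a routed prefix can be split into /24 blocks with one address drawn from each.

```python
import ipaddress
import random


def one_target_per_slash24(routed_prefix, rng):
    """Return one randomly chosen address from each /24 inside routed_prefix."""
    net = ipaddress.ip_network(routed_prefix)
    if net.prefixlen >= 24:
        blocks = [net]  # already a /24 or smaller: treat as a single block
    else:
        blocks = list(net.subnets(new_prefix=24))
    return [str(block[rng.randrange(block.num_addresses)]) for block in blocks]


# A /22 contains four /24 blocks, so four targets are produced.
targets = one_target_per_slash24("198.18.0.0/22", random.Random(0))
```

Passing the random generator in as a parameter keeps a run reproducible when a seed is fixed, and lets each measurement cycle draw a fresh set of destinations otherwise.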

One problem of these approaches is a possible bias in all those measurements because the Ark measurement nodes are likely not placed in representative places in the network, because only academic nodes are involved. This view of the network may differ from what a normal DSL customer experiences on the internet.

Future research involves mainly coming up with new measurement ideas to leverage the data collected by this infrastructure and to further extend the toolset. Also, it would be worthwhile to compare the collected results with the real topology existent at a large provider. While this topology is likely to be private, the providers could at least acknowledge the precision of those measurement methods.

 

Internet Mapping: from Art to Science May 25, 2009

Filed under: R17. Internet Mapping: from Art to Science — erubow @ 4:13 pm


Important Things:
1) The structure, performance limits, dynamics, and evolution of the global Internet are not currently well understood. Structural information is hard to come by due to the Internet's distributed ownership and maintenance, and concerns about sharing information. At such a large scale, it is a challenge to perform infrastructure measurements.
2) They have built a flexible Internet measurement infrastructure, Archipelago (or Ark), which runs on several Ark monitors distributed throughout the globe (31 as of mid-December 2008). It makes use of such tools as ping, traceroute, and DNS queries to gather data and places the results in a shared tuple-space to enable coordination. A library makes the development of experiments easy. The experiments can be static or dynamic.
3) The results will be validated through surveys of infrastructure owners, and this infrastructure will continue to be refined to provide useful Internet data for research.

Problems:
Measurements and analysis at this early stage may be prone to some degree of error. It is not yet clear to me how much error is involved, or how much of the Internet can currently be captured. She also questioned the Internet’s ability to “maintain and strengthen its role as the world’s communications substrate” in the first paragraph without elaborating on this.

Future Work:
This is definitely very exciting work, shedding light on the structure of the Internet through external measurements. This should certainly continue, eventually giving a view of its evolution over time. I think it will be important to attempt to quantify the degree of error and completeness in the resulting data, as well as to add richer information to the Internet map.

 

Internet Mapping: from Art to Science May 25, 2009

Filed under: R17. Internet Mapping: from Art to Science — supritapagad @ 4:13 pm

1. Distributed and scalable measurement architecture

The paper suggests a measurement/monitoring architecture that allows different groups of users dispersed over geographically separated locations to make topology measurements. It also provides a common, shared database that is updated with all measurements made. This way, not only is support for a large user base provided, but diverse and time-skewed measurements of the Internet topology are also obtained. It makes use of the tuple-space coordination model to achieve this shared memory structure. Decentralized measurement also introduces randomness into measurements, which more accurately captures topology data.

2. Incremental development

The architecture makes it feasible for the users to build upon the work and development done by others. It shields the users from underlying complexities and allows them to use a scripting language to develop their codes for the measurement.

3. Interpreting and processing gathered information

The paper suggests some interesting methods for deriving topology information from the raw data gathered. For example, they mention a technique to obtain the IP addresses of hosts connected to a router from data obtained by measuring paths to a list of IP addresses covering all /24 prefixes in an address space.

Short-comings

A shared memory to update measurements and a common code base to which any user can make additions mean that faulty or malicious code can compromise the entire structure. However, this is a trade-off one needs to make with any form of open-ware. In addition, there is no mention of accommodating dynamic and changing network topologies while allowing different users to update the database with measurements made at different instances of time. The method of alias resolution mentioned by them seems to only support end routers with hosts directly connected to the router of concern.

Future Work

The idea is still young and can grow in multiple dimensions. The API can be developed further to provide greater ease of use and deployment. More functionality can be added to the data interpretation tools to glean greater insight into the network topology from the raw data gathered by the system. Attempts can also be made to increase the variety of data gathered by the system.

 

Internet Mapping: from Art to Science May 25, 2009

Filed under: R17. Internet Mapping: from Art to Science — mdjacobsen @ 4:12 pm

The authors describe a distributed network measurement infrastructure, Archipelago (Ark), which they have deployed around the world to help researchers coordinate large scale Internet measurements. The Ark infrastructure is deployed at over 31 sites around the globe and can coordinate using a tuple-based directory service. This is the primary contribution of this paper.

Ark is deployed as individual monitors (nodes) that run the Ark software. The interface is written in Ruby, though lower-level access using C/C++ is supported as well. The distributed directory service is based on Marinda, a tuple-space distributed shared memory. Monitors can modify this tuple space to implement support for dynamic coordination and even measurement services. The tuple space serves as the layer of abstraction that services can use to read/write domain/range values.

As described, such an infrastructure by itself is not such a new or useful tool. However, the designers wrap the implementation with an easy to use API (in Ruby). This allows measurement studies to be written in a high level scripting language, which is conducive to rapid development and high adoption. Furthermore, the designers decided to allow access to anyone who wants it. No special authorization is needed.

Using this system, the authors have been able to gather Internet topology information that is annotated with link latencies, router IDs, and AS numbers (whenever possible). This well-annotated global topology can be used by many researchers for macroscopic studies. The authors claim that using Ark, they have been able to coordinate with other researchers and develop improved methods for such measurement tasks as reconstructing router-level topology, DNS resolution, and ICMP & UDP topology probing.

One of the most attractive features of Ark is that it is open to anyone who wishes to gather measurement data. The paper describes the Ark infrastructure as secure, but there is no explanation of how it is secure. Indeed, if anyone can use the system and access the tuple-space shared memory to coordinate with other services/researchers, and there is no authorization, how is the system secure? This is not addressed in the paper. Some discussion of the security claim is in order.

I’d expect to see further development for Ark in the direction of built-in measurement services. It seems like the addition of a library of highly optimized (best practice) basic measurement routines would be very useful for any researcher starting out with a measurement project.

 

Internet Mapping: from Art to Science May 25, 2009

Filed under: R17. Internet Mapping: from Art to Science — brokerer @ 4:12 pm

This paper is about a measurement platform capable of performing various types of Internet infrastructure measurements and assessments. The authors use various techniques for IP topology mapping and also provide supporting software for data analysis, topology generation, etc. It is important that we have a good map of the Internet topology, both for research and for determining the ownership of network devices. No one really knows what keeps the Internet stable, and Ark is a stepping stone in that direction. The major points below are all features that help the platform gain wide adoption, since the IP topology techniques themselves are based on various known techniques.

Major Points:
1.) Easy development and rapid prototyping
It is very important that development on ARK is fast because ARK is a tool that should aid researchers. Researchers are not going to adopt the tool unless it is fast and easy to develop on. Using a language like Ruby and keeping everything at a high level allows developers to get more done without getting bogged down in low-level development work.
2.) Coordination
ARK focuses on coordination to allow the heterogeneous pieces of a measurement infrastructure to work efficiently toward a common task. The authors use a new implementation called Marinda that follows a tuple-space coordination model. The tuple space stores tuples, and clients retrieve tuples by pattern matching. The tuple space is easy to use and hides the complexities of the network, so it lowers the barrier to deploying sophisticated distributed measurements.
3.) Measurement Services
To allow researchers to build on top of the work of others, ARK supports services using the tuple space. This allows a user to easily deploy a measurement service by writing a program that interprets tuples as commands, performs some measurement and returns the result as a tuple. This has a lot of advantages like: low deployment effort and cost, anyone can provide a service, decentralized management, etc…
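
The coordination and service model described in points 2 and 3 can be illustrated with a toy, in-process tuple space. This is only a sketch in Python and not Marinda's actual API: the `TupleSpace` class, its `write`/`take` methods, the use of `None` as a wildcard, and the `fake_ping` stand-in are all assumptions made for illustration.

```python
class TupleSpace:
    """Toy in-memory tuple space (hypothetical; not the Marinda API)."""

    def __init__(self):
        self._tuples = []

    def write(self, tup):
        """Store a tuple in the space."""
        self._tuples.append(tuple(tup))

    @staticmethod
    def _matches(pattern, tup):
        # None in a pattern acts as a wildcard for that position.
        return len(pattern) == len(tup) and all(
            p is None or p == t for p, t in zip(pattern, tup)
        )

    def take(self, pattern):
        """Remove and return the first tuple matching the pattern, or None."""
        for i, tup in enumerate(self._tuples):
            if self._matches(pattern, tup):
                return self._tuples.pop(i)
        return None


def fake_ping(target):
    """Stand-in for a real measurement; returns a made-up RTT in ms."""
    return 10.0 + len(target)


def serve_one_request(space):
    """A measurement service: take a command tuple, write a result tuple."""
    request = space.take(("PING", None))
    if request is None:
        return False
    _, target = request
    space.write(("RESULT", target, fake_ping(target)))
    return True


space = TupleSpace()
space.write(("PING", "192.0.2.1"))    # a client posts a command tuple
serve_one_request(space)               # a monitor picks it up and answers
result = space.take(("RESULT", "192.0.2.1", None))
```

The client and the service never address each other directly; they only agree on the shape of the tuples, which is what keeps the deployment effort low.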

Glaring Problem:
Assigning routers to ASes is a difficult task. ARK’s approach works fine when the IP addresses on both sides of a link belong to the same AS, but the authors run into cases where they cannot find a provider-customer relationship for a set of ASes, so they simply assign the router to the AS with the smallest outdegree. This leads to an incomplete map.
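
A rough sketch of the fallback just described, in Python. The function name, the input format and the tie-break are assumptions, and the provider-customer step is left out because this review does not say how it resolves ownership. The sketch also assumes the router has at least one mapped interface.

```python
def assign_router_to_as(interface_ases, as_outdegree):
    """Pick an owning AS for a router.

    interface_ases: AS numbers seen on the router's interfaces.
    as_outdegree:   hypothetical mapping AS number -> outdegree in the AS graph.
    """
    candidates = set(interface_ases)
    if len(candidates) == 1:
        # Easy case: every interface maps to the same AS.
        return next(iter(candidates))
    # Fallback: choose the AS with the smallest outdegree.
    # Ties go to the lower AS number so the result is deterministic
    # (this tie-break is an assumption of the sketch).
    return min(candidates, key=lambda asn: (as_outdegree[asn], asn))


outdegree = {64500: 120, 64501: 3, 64502: 3}
same_as = assign_router_to_as([64500, 64500], outdegree)
fallback = assign_router_to_as([64500, 64501, 64502], outdegree)
```

The second call shows why the heuristic is fragile: two candidate ASes have the same outdegree, so the choice between them is arbitrary.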

Future Work:
The whole idea of ARK is to help expand the future of network research. Because the platform is easy to develop with, ARK has already been used by two researchers outside of CAIDA. As the authors state, researchers and policymakers are analyzing a trillion-dollar ecosystem in the dark. With ARK on its way, these researchers and policymakers will have a new dimension of information to work with. The authors also mention working on an IPv6 topology measurement, exploring more dynamic IPv4 topology measurements using their new ad-hoc topology measurement facility, and implementing new visualizations of IP- and AS-level topology.

 

Internet Mapping: from Art to Science May 25, 2009

Filed under: R17. Internet Mapping: from Art to Science — subhramazumdar @ 4:12 pm

The paper discusses the design, implementation and deployment of a secure measurement platform capable of performing various types of Internet infrastructure measurements and assessments. State-of-the-art technology has been used to build a coherent view of the Internet topology that will help in understanding the strengths and pitfalls of the network, and thus support its faster and more secure use. Internet topology measurement and mapping data have been gathered to build a consolidated map for use by academic researchers. The agencies currently charged with infrastructure protection have little awareness of the global dynamics and operational threats. In the face of such darkness, it is imperative to build a coherent view of the Internet and understand it much better. The problem arises from the distributed ownership of the various parts of the network and the unavailability of internal data to outsiders.

The paper proposes a secure measurement platform for collecting and processing data, analysis and topology generation, and interactive visualization of large annotated graphs as a view of the Internet topology. The authors describe Archipelago, or Ark, as the measurement infrastructure, which supports rapid prototyping by promoting software development at a very high level of abstraction. Ark focuses on coordination, which concerns planning, executing and controlling a set of distributed computations working towards a common goal. For this purpose Ark uses a tuple-space coordination model, which acts as a channel for one-to-one and many-to-one communication.

Ark performs systematic measurement of IP-level paths by dynamically distributing the task among its members. From the IP traceroutes, an IP-level topology of the Internet is constructed. DNS lookups for the IPs also give a data set for studying the characteristics of DNS name servers. Detailed analysis also yields data such as the relationships between organizations. AS-level topologies are also built from the RouteViews data, which requires mapping traceroute-observed IPs to ASes. Finally, a dual AS-router level Internet topology is built by merging the AS-level and router-level maps, giving an integrated view of the links and nodes in both graphs, consistently annotated with relevant metadata.
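
Mapping traceroute-observed IPs to ASes is commonly done with a longest-prefix match against a table of BGP prefixes and their origin ASes. The sketch below shows that general technique in Python. It is not claimed to be the authors' exact procedure, and the prefix table, AS numbers and function names are made up for illustration.

```python
import ipaddress


def build_table(prefix_to_as):
    """Parse a {prefix string: origin AS} mapping into network objects."""
    return [(ipaddress.ip_network(p), asn) for p, asn in prefix_to_as.items()]


def ip_to_as(ip, table):
    """Return the origin AS of the longest matching prefix, or None."""
    addr = ipaddress.ip_address(ip)
    best = None
    for net, asn in table:
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, asn)
    return best[1] if best else None


# Hypothetical BGP-derived table: a /16 and a more specific /24 inside it.
table = build_table({"198.51.0.0/16": 64501, "198.51.100.0/24": 64500})

# Hops from one made-up traceroute, mapped hop by hop.
hops = ["198.51.100.7", "198.51.7.1", "203.0.113.9"]
as_hops = [ip_to_as(h, table) for h in hops]
```

The last hop maps to `None` because no prefix covers it, which is one way gaps appear in an AS-level map built from traceroutes. A linear scan is fine for a sketch; a real table would need a trie or similar structure.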

The major challenges of the project seem to be deploying the monitors, collecting such large volumes of data, and analyzing them to get a coherent view of the Internet topology. Since the big picture of the Internet is built from numerous individual traceroutes, integrating them into one consistent data set and representing the information in a comprehensive way is difficult. Also, because certain detailed information and access are lacking, some information has to be extrapolated from the collected data set, and the extrapolation might not hold in all cases.

The project is still in its early stages, and future work will include increasing its coverage and deploying it on a larger scale. The set of analysis tools needs to be expanded, and the queries to be asked of the dataset have to be properly modeled. The emerging IPv6 topology also needs to be measured. Finally, new visualizations of IP- and AS-level topology have to be implemented that will enable regular provision of rich topology maps of the observable Internet infrastructure and support security objectives.

 

Internet Mapping: From Art to Science May 25, 2009

Filed under: R17. Internet Mapping: from Art to Science — gracewangcse222 @ 4:12 pm

(i) The three most important things the paper says:

  1. It is currently difficult to obtain a clear view of the current topology of the internet, as well as many characteristics crucial to research and making policies about the internet. The paper describes a measurement platform called Archipelago which aims to collect data about internet topology.
  2. The Ark service architecture uses tuples to allow users to create measurement services (with tuples representing the input and output to the measurement program) that are easy and cheap to develop, decentralized, and able to support a number of communication patterns.
  3. The authors used 31 Ark monitors deployed around the world to obtain traceroute measurements to all routed /24s by parallelizing the measurement tasks (randomized probing) dynamically among groups of monitors. A desired result of obtaining this data is to determine both a router-level topology and an AS-level topology of the internet.
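
The division of /24s among monitors in point 3 could be sketched roughly as follows. This is a Python illustration under stated assumptions, not the authors' implementation: the monitor names, the fixed seed and the round-robin loop (standing in for monitors pulling work at their own pace) are all hypothetical.

```python
import ipaddress
import random
from collections import deque


def probe_plan(prefixes, monitors, seed=0):
    """Give each /24 to exactly one monitor and pick a random target in it."""
    rng = random.Random(seed)
    work = list(prefixes)
    rng.shuffle(work)                  # randomized probing order
    queue = deque(work)                # shared work queue
    plan = {m: [] for m in monitors}
    while queue:
        for m in monitors:             # each monitor pulls the next prefix
            if not queue:
                break
            net = ipaddress.ip_network(queue.popleft())
            # Random host offset 1..254 inside the /24.
            target = net.network_address + rng.randint(1, 254)
            plan[m].append(str(target))
    return plan


prefixes = ["192.0.2.0/24", "198.51.100.0/24", "203.0.113.0/24"]
plan = probe_plan(prefixes, ["mon-a", "mon-b"])
targets = [t for ts in plan.values() for t in ts]
```

Each prefix is probed once per pass, from whichever monitor happens to pull it, so repeated passes see the same destination from different vantage points.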

(ii) The most glaring problem with the paper:

I’m not sure how consistent DNS responses tend to be over time, but 1 or 2 days seems like a while to wait before doing a DNS lookup. Furthermore, I would be curious as to how malicious internet behaviour, such as poisoning the DNS cache, would affect the validity of the resulting data.

(iii) The future research directions of the work:

Once the macroscopic maps of the internet are complete, a number of things can be done with the data. This information would certainly be of great use to lawmakers who are in charge of internet policy, as well as to various ASes who might be interested in developing the best possible set of business relationships and informing their own routing policies to maximize their own efficiency. The maps will also be helpful to researchers as they develop new techniques, and to aid and inform deployment of new technologies.

 

Internet Mapping: from Art to Science May 25, 2009

Filed under: R17. Internet Mapping: from Art to Science — dgaschk @ 4:12 pm

Determining the topology of the Internet is a difficult proposition. The Internet Protocol provides no intrinsic facility for collecting this data, and network operators are reluctant to share their proprietary operational information. Several efforts to collect Internet infrastructure data are ongoing, and the Ark architecture provides a new tool for data collection.

Ark monitors have been distributed on all continents. Additional Ark devices are deployed monthly. The system provides a foundation for collection experiments that require coordinated distributed measurement locations. Deployment of new measurement tasks is facilitated by the Ark system. The system allows collected data to be processed in multiple stages over time. Existing tasks may be used as components in newer measurement endeavors.

The document describes a useful and extensible system for data collection. It fails to provide motivation for the data collection. Examples of how the data will be used are required. It is interesting to know the topology and routing characteristics of the Internet, but is there a higher purpose?

The system is limited by the number of deployed measurement devices. Promulgating an appealing motivation would help to recruit additional participants. Ensuring that the type of information collected is limited, and providing ways for participants to guarantee that devices placed on their networks collect only agreed-upon data, will allay security fears and help to further deployment.

 

Internet Mapping: from Art to Science May 25, 2009

Filed under: R17. Internet Mapping: from Art to Science — krishnanadh @ 4:12 pm

The paper presents an overview of Archipelago (Ark), the platform for active measurement, analysis and mapping of Internet IP topology data for use by network research communities worldwide. The authors indicate that the Internet infrastructure protection agencies and regulatory authorities require constructive comprehension of Internet topology data to provide better services to address the ever changing dynamics of Internet and deter operational threats faced by it. Topology data is also made available to the research community which has traditionally relied on heuristics to solve network related challenges. In addition to providing a network measurement infrastructure the authors also develop software for data processing and analysis. The main points discussed in the paper are the Ark architecture, its deployment and accomplishments and future goals.

The Ark architecture primarily aims to enable ease of development and rapid prototyping, support dynamic and coordinated measurements, and provide a set of measurement services that come in handy for researchers. Ark allows users to access network measurement services over the network through dynamic scripting languages and pre-built APIs. Measurements like path diversity within a given prefix and monitoring of prefixes containing critical infrastructure require dynamism and coordination between measuring nodes distributed in space. Ark supports this dynamism and coordination using a tuple-space model called Marinda. The tuple space acts like a shared memory between communicating processes, the contents of which can be retrieved by simple pattern matching, thereby enabling distributed measurements. Various measurement services like ping and trace-route can be built and deployed at monitor nodes on top of the tuple space, which acts as the underlying transport/messaging medium. Such a service architecture allows flexibility such as decentralized management and ease of aggregation for diverse communication patterns.

The authors plan to deploy Ark monitors in under-represented regions and in regions with IPv6 connectivity to enable active measurements on it. One of the measurements the Ark infrastructure makes is the delay of IP paths to a dynamically generated set of hosts in /24 prefixes. The task is parallelized by dividing it among multiple monitors at different geographical sites, organized into three teams that randomly probe the prefixes. These monitors poll the scamper measurement tool server node, which supports IPv4, IPv6, ping and similar services and implements trace-route measurements using TCP, UDP and ICMP. Apart from trace-route measurements on these /24 prefixes, the authors also perform DNS lookups on them and obtain data such as IP-to-hostname maps and raw DNS query/response traffic. The third major measurement, alias resolution, is used by the authors to get a router-level map of the Internet, which will allow them to identify more realistic physical links between routers rather than between IP interfaces. Alias resolution is implemented by two heuristic techniques, CAIDA iffinder and APAR. Further, the authors extract AS-level Internet maps, which can be used to maximize the number of valid paths in the AS topology. Lastly, the authors construct dual AS-router level Internet topologies. The resulting dual map merges router- and AS-level graphs into an integrated view where links and nodes in both graphs are consistently annotated with semantically relevant metadata. This increases researchers’ situational awareness of the critical Internet infrastructure and opens grounds for understanding and modeling Internet evolution.
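
Once alias resolution has reported which interface addresses belong to the same router, turning the IP-level graph into a router-level graph amounts to merging aliased interfaces. The Python sketch below uses a union-find for that step. The alias pairs are made-up input in an assumed format, not the output of iffinder or APAR, and every address in a pair or link is assumed to appear in the interface list.

```python
def group_aliases(interfaces, alias_pairs):
    """Union-find: map each interface IP to a representative router id."""
    parent = {ip: ip for ip in interfaces}

    def find(ip):
        while parent[ip] != ip:
            parent[ip] = parent[parent[ip]]   # path halving
            ip = parent[ip]
        return ip

    for a, b in alias_pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            # Keep the smaller string as the representative.
            parent[max(ra, rb)] = min(ra, rb)
    return {ip: find(ip) for ip in interfaces}


def router_links(ip_links, router_of):
    """Collapse IP-level links into router-level links, dropping self-loops."""
    links = set()
    for a, b in ip_links:
        ra, rb = router_of[a], router_of[b]
        if ra != rb:
            links.add(tuple(sorted((ra, rb))))
    return links


interfaces = ["10.0.0.1", "10.0.1.1", "10.0.2.1", "10.0.3.1"]
alias_pairs = [("10.0.0.1", "10.0.1.1")]       # same router, two interfaces
ip_links = [("10.0.0.1", "10.0.2.1"),
            ("10.0.1.1", "10.0.2.1"),
            ("10.0.2.1", "10.0.3.1")]

router_of = group_aliases(interfaces, alias_pairs)
links = router_links(ip_links, router_of)
```

The first two IP-level links collapse into a single router-level link, which is the sense in which the router-level map is closer to the physical topology than the interface-level one.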

Overall, Ark provides a great platform that research users can use to quickly design, implement and coordinate the execution of experiments across a distributed set of dedicated monitors. The Ark implementation is a big step towards providing synergic Internet topology data to various research communities across the globe, and it requires careful inclusion and expansion to encompass geographical regions spanning different continents and reflecting disparate Internet traffic characteristics. The main problems that arise from this vision could be resistance from service providers and regulatory authorities to inclusive growth and the sharing of critical Internet data, as well as economies of scale that at some level might not be acceptable to all communities. Maintainability is also a major issue, since trace-route type measurements need a series of active nodes, and the failure of any one of them might disrupt further experimentation.

 

Internet Mapping: from Art to Science May 25, 2009

Filed under: R17. Internet Mapping: from Art to Science — filipposeracini @ 4:12 pm

This paper describes some of the features of a measurement platform capable of performing several measurements of the Internet infrastructure. This platform, called Ark, is a Ruby-based system that supports large-scale active measurements of the global Internet. Ark is composed of 31 monitors located around the world. The plan is to increase the number of monitors in order to have a wider representation of the Internet. The Ark project is along the lines of PlanetLab, iPlane, etc.

This measurement infrastructure allows researchers to run experiments and prototypes using a high-level language API. One of the most important features of Ark is the coordination between all the monitors, which allows the deployment of sophisticated distributed measurements. Such coordination takes place in the form of a service-oriented architecture (SOA). Each measurement can be seen as a service, and the whole experiment reduces to the orchestration of such services. SOA is known to scale up very well. Hence I believe that it is a very sound approach in such a large-scale context as Ark.

The Ark project is quite recent, but it has already achieved some good results. In particular, it was able to create a first representation of the Internet IP infrastructure at router granularity. Other experiments to increase the information associated with it are currently running or under development.

It is hard to evaluate this paper because it is basically just a shopping list of what the authors did and what they are planning to do. It will be interesting to see how this project evolves and what results eventually come out of it. I think one of the trickiest parts for Ark will be to establish itself as the platform on which to run experiments over the Internet, and so avoid being just another measurement infrastructure.

 

Internet Mapping: from Art to Science May 21, 2009

Filed under: R17. Internet Mapping: from Art to Science — damedeiros @ 12:45 pm

The purpose of this article was to describe the ARK architecture as well as the authors’ results and accomplishments in using the system. The ARK was designed to improve on the methods used for Internet measurement and technology discovery. The key points in this article were:

1. ARK is a distributed system that uses an implementation of a tuple space called Marinda. This system was designed for ease of use and to remove the challenges involved in deploying a large-scale distributed system. This approach allows the researchers to deploy autonomous systems across a wide area without needing to control their actions directly, and it improves the “range” of the measurements being taken as well as the ease of implementing new services and measurements.

2. The ARK has been successful in conducting several extremely large-scale Internet measurements, such as the macro-topology of the network, the DNS lookup topology of the network, and alias-lookup measurements to separate and divide those systems behind the same routers, as well as several others. The system itself is extremely flexible, allowing nearly any measurement to be conducted.

3. The system is extensible, having already (at the time of this paper) been used by several researchers in their own projects, which involved measurements of IP spoofing and of the most efficient probing mechanism. This shows that the authors’ design choice to make the system simple as well as extensible has led to further use and was a well-thought-out plan.

The paper did a good job of providing a basic summary as well as describing some of the results that are already available. I thought that there could have been more discussion of the performance issues, as well as of what this system provides that others do not and that helps it stand out. Some discussion of the limits of the design would also have been useful in evaluating its strength and potential.

The future research that this paper could lead to would be further optimization of Internet measurements, as well as a slew of freely distributed results for researchers to use and compare. This could lead to a much greater understanding of the Internet, as well as some things to work on as we move forward. The authors talk about an IPv6 measurement intended to help plan the deployment of the new protocol. I think that this is a good area to start with, as we are likely to learn a great deal from this likely imminent protocol, and its planning should have all of the information available.