Overlay network monitoring enables distributed Internet applications to detect and recover from path outages and periods of degraded performance within seconds. For an overlay network with end hosts, existing systems either require O(n2) measurements, and thus lack scalability, or can only estimate the latency but not congestion or failures. Our earlier extended abstract [Y. Chen, D. Bindel, and R. H. Katz, "Tomography-based overlay network monitoring," Proceedings of the ACM SIGCOMM Internet Measurement Conference (IMC), 2003] briefly proposes an algebraic approach that selectively monitors k linearly independent paths that can fully describe all the O(n2) paths. The loss rates and latency of these paths can be used to estimate the loss rates and latency of all other paths. Our scheme only assumes knowledge of the underlying IP topology, with links dynamically varying between lossy and normal. In this paper, we improve, implement, and extensively evaluate such a monitoring system.We further make the following contributions: i) scalability analysis indicating that for reasonably large (e.g., 100), the growth of is bounded as O(n log n), ii) efficient adaptation algorithms for topology changes, such as the addition or removal of end hosts and routing changes, iii) measurement load balancing schemes, iv) topology measurement error handling, and v) design and implementation of an adaptive streaming media system as a representative application. Both simulation and Internet experiments demonstrate we obtain highly accurate path loss rate estimation while adapting to topology changes within seconds and handling topology errors.
- Load balancing
- Network measurement and monitoring
- Numerical linear algebra
ASJC Scopus subject areas
- Computer Science Applications
- Computer Networks and Communications
- Electrical and Electronic Engineering