Reversible sketches: Enabling monitoring and analysis over high-speed data streams

Robert Schweller*, Zhichun Li, Yan Chen, Yan Gao, Ashish Gupta, Yin Zhang, Peter A Dinda, Ming-Yang Kao, Gokhan Memik

*Corresponding author for this work

Research output: Contribution to journalArticle

67 Citations (Scopus)

Abstract

A key function for network traffic monitoring and analysis is the ability to perform aggregate queries over multiple data streams. Change detection is an important primitive which can be extended to construct many aggregate queries. The recently proposed sketches are among the very few that can detect heavy changes online for high speed links, and thus support various aggregate queries in both temporal and spatial domains. However, it does not preserve the keys (e. g., source IP address) of flows, making it difficult to reconstruct the desired set of anomalous keys. To address this challenge, we propose the reversible sketch data structure along with reverse hashing algorithms to infer the keys of culprit flows. There are two phases. The first operates online, recording the packet stream in a compact representation with negligible extra memory and few extra memory accesses. Our prototype single FPGA board implementation can achieve a throughput of over 16 Gb/s for 40-byte packet streams (the worst case). The second phase identifies heavy changes and their keys from the representation in nearly real time. We evaluate our scheme using traces from large edge routers with OC-12 or higher links. Both the analytical and experimental results show that we are able to achieve online traffic monitoring and accurate change/intrusion detection over massive data streams on high speed links, all in a manner that scales to large key space size. To the best of our knowledge, our system is the first to achieve these properties simultaneously.

Original languageEnglish (US)
Pages (from-to)1059-1072
Number of pages14
JournalIEEE/ACM Transactions on Networking
Volume15
Issue number5
DOIs
StatePublished - Oct 1 2007

Fingerprint

Monitoring
Data storage equipment
Intrusion detection
Routers
Data structures
Field programmable gate arrays (FPGA)
Throughput

ASJC Scopus subject areas

  • Software
  • Computer Science Applications
  • Computer Networks and Communications
  • Electrical and Electronic Engineering

Cite this

Schweller, Robert ; Li, Zhichun ; Chen, Yan ; Gao, Yan ; Gupta, Ashish ; Zhang, Yin ; Dinda, Peter A ; Kao, Ming-Yang ; Memik, Gokhan. / Reversible sketches : Enabling monitoring and analysis over high-speed data streams. In: IEEE/ACM Transactions on Networking. 2007 ; Vol. 15, No. 5. pp. 1059-1072.
@article{31c1ab2e7a4048e7bdd0193623cbbb50,
title = "Reversible sketches: Enabling monitoring and analysis over high-speed data streams",
abstract = "A key function for network traffic monitoring and analysis is the ability to perform aggregate queries over multiple data streams. Change detection is an important primitive which can be extended to construct many aggregate queries. The recently proposed sketches are among the very few that can detect heavy changes online for high speed links, and thus support various aggregate queries in both temporal and spatial domains. However, it does not preserve the keys (e. g., source IP address) of flows, making it difficult to reconstruct the desired set of anomalous keys. To address this challenge, we propose the reversible sketch data structure along with reverse hashing algorithms to infer the keys of culprit flows. There are two phases. The first operates online, recording the packet stream in a compact representation with negligible extra memory and few extra memory accesses. Our prototype single FPGA board implementation can achieve a throughput of over 16 Gb/s for 40-byte packet streams (the worst case). The second phase identifies heavy changes and their keys from the representation in nearly real time. We evaluate our scheme using traces from large edge routers with OC-12 or higher links. Both the analytical and experimental results show that we are able to achieve online traffic monitoring and accurate change/intrusion detection over massive data streams on high speed links, all in a manner that scales to large key space size. To the best of our knowledge, our system is the first to achieve these properties simultaneously.",
author = "Robert Schweller and Zhichun Li and Yan Chen and Yan Gao and Ashish Gupta and Yin Zhang and Dinda, {Peter A} and Ming-Yang Kao and Gokhan Memik",
year = "2007",
month = "10",
day = "1",
doi = "10.1109/TNET.2007.896150",
language = "English (US)",
volume = "15",
pages = "1059--1072",
journal = "IEEE/ACM Transactions on Networking",
issn = "1063-6692",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "5",

}

Reversible sketches : Enabling monitoring and analysis over high-speed data streams. / Schweller, Robert; Li, Zhichun; Chen, Yan; Gao, Yan; Gupta, Ashish; Zhang, Yin; Dinda, Peter A; Kao, Ming-Yang; Memik, Gokhan.

In: IEEE/ACM Transactions on Networking, Vol. 15, No. 5, 01.10.2007, p. 1059-1072.

Research output: Contribution to journalArticle

TY - JOUR

T1 - Reversible sketches

T2 - Enabling monitoring and analysis over high-speed data streams

AU - Schweller, Robert

AU - Li, Zhichun

AU - Chen, Yan

AU - Gao, Yan

AU - Gupta, Ashish

AU - Zhang, Yin

AU - Dinda, Peter A

AU - Kao, Ming-Yang

AU - Memik, Gokhan

PY - 2007/10/1

Y1 - 2007/10/1

N2 - A key function for network traffic monitoring and analysis is the ability to perform aggregate queries over multiple data streams. Change detection is an important primitive which can be extended to construct many aggregate queries. The recently proposed sketches are among the very few that can detect heavy changes online for high speed links, and thus support various aggregate queries in both temporal and spatial domains. However, it does not preserve the keys (e. g., source IP address) of flows, making it difficult to reconstruct the desired set of anomalous keys. To address this challenge, we propose the reversible sketch data structure along with reverse hashing algorithms to infer the keys of culprit flows. There are two phases. The first operates online, recording the packet stream in a compact representation with negligible extra memory and few extra memory accesses. Our prototype single FPGA board implementation can achieve a throughput of over 16 Gb/s for 40-byte packet streams (the worst case). The second phase identifies heavy changes and their keys from the representation in nearly real time. We evaluate our scheme using traces from large edge routers with OC-12 or higher links. Both the analytical and experimental results show that we are able to achieve online traffic monitoring and accurate change/intrusion detection over massive data streams on high speed links, all in a manner that scales to large key space size. To the best of our knowledge, our system is the first to achieve these properties simultaneously.

AB - A key function for network traffic monitoring and analysis is the ability to perform aggregate queries over multiple data streams. Change detection is an important primitive which can be extended to construct many aggregate queries. The recently proposed sketches are among the very few that can detect heavy changes online for high speed links, and thus support various aggregate queries in both temporal and spatial domains. However, it does not preserve the keys (e. g., source IP address) of flows, making it difficult to reconstruct the desired set of anomalous keys. To address this challenge, we propose the reversible sketch data structure along with reverse hashing algorithms to infer the keys of culprit flows. There are two phases. The first operates online, recording the packet stream in a compact representation with negligible extra memory and few extra memory accesses. Our prototype single FPGA board implementation can achieve a throughput of over 16 Gb/s for 40-byte packet streams (the worst case). The second phase identifies heavy changes and their keys from the representation in nearly real time. We evaluate our scheme using traces from large edge routers with OC-12 or higher links. Both the analytical and experimental results show that we are able to achieve online traffic monitoring and accurate change/intrusion detection over massive data streams on high speed links, all in a manner that scales to large key space size. To the best of our knowledge, our system is the first to achieve these properties simultaneously.

UR - http://www.scopus.com/inward/record.url?scp=36148989956&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=36148989956&partnerID=8YFLogxK

U2 - 10.1109/TNET.2007.896150

DO - 10.1109/TNET.2007.896150

M3 - Article

AN - SCOPUS:36148989956

VL - 15

SP - 1059

EP - 1072

JO - IEEE/ACM Transactions on Networking

JF - IEEE/ACM Transactions on Networking

SN - 1063-6692

IS - 5

ER -