TY - GEN
T1 - Reversible sketches for efficient and accurate change detection over network data streams
AU - Schweller, Robert
AU - Gupta, Ashish
AU - Parsons, Elliot
AU - Chen, Yan
PY - 2004
Y1 - 2004
N2 - Traffic anomalies such as failures and attacks are increasing in frequency and severity, and thus identifying them rapidly and accurately is critical for large network operators. The detection typically treats the traffic as a collection of flows and looks for heavy changes in traffic patterns (e.g., volume, number of connections). However, as link speeds and the number of flows increase, keeping per-flow state is not scalable. The recently proposed sketch-based schemes [14] are among the very few that can detect heavy changes and anomalies over massive data streams at network traffic speeds. However, sketches do not preserve the key (e.g., source IP address) of the flows. Hence, even if anomalies are detected, it is difficult to infer the culprit flows, making it a big practical hurdle for online deployment. Meanwhile, the number of keys is too large to record. To address this challenge, we propose efficient reversible hashing algorithms to infer the keys of culprit flows from sketches without storing any explicit key information. No extra memory or memory accesses are needed for recording the streaming data. Meanwhile, the heavy change detection daemon runs in the background with space complexity and computational time sublinear to the key space size. This short paper describes the conceptual framework of the reversible sketches, as well as some initial approaches for implementation. See [23] for the optimized algorithms in details. Evaluated with netflow traffic traces of a large edge router, we demonstrate that the reverse hashing can quickly infer the keys of culprit flows even for many changes with high accuracy.
AB - Traffic anomalies such as failures and attacks are increasing in frequency and severity, and thus identifying them rapidly and accurately is critical for large network operators. The detection typically treats the traffic as a collection of flows and looks for heavy changes in traffic patterns (e.g., volume, number of connections). However, as link speeds and the number of flows increase, keeping per-flow state is not scalable. The recently proposed sketch-based schemes [14] are among the very few that can detect heavy changes and anomalies over massive data streams at network traffic speeds. However, sketches do not preserve the key (e.g., source IP address) of the flows. Hence, even if anomalies are detected, it is difficult to infer the culprit flows, making it a big practical hurdle for online deployment. Meanwhile, the number of keys is too large to record. To address this challenge, we propose efficient reversible hashing algorithms to infer the keys of culprit flows from sketches without storing any explicit key information. No extra memory or memory accesses are needed for recording the streaming data. Meanwhile, the heavy change detection daemon runs in the background with space complexity and computational time sublinear to the key space size. This short paper describes the conceptual framework of the reversible sketches, as well as some initial approaches for implementation. See [23] for the optimized algorithms in details. Evaluated with netflow traffic traces of a large edge router, we demonstrate that the reverse hashing can quickly infer the keys of culprit flows even for many changes with high accuracy.
KW - Change detection
KW - Data stream computation
KW - IP mangling
KW - Modular hashing
KW - Network anomaly detection
KW - Reverse hashing
KW - Sketch
UR - http://www.scopus.com/inward/record.url?scp=14944385135&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=14944385135&partnerID=8YFLogxK
U2 - 10.1145/1028788.1028814
DO - 10.1145/1028788.1028814
M3 - Conference contribution
AN - SCOPUS:14944385135
SN - 1581138210
SN - 9781581138214
T3 - Proceedings of the 2004 ACM SIGCOMM Internet Measurement Conference, IMC 2004
SP - 207
EP - 212
BT - Proceedings of the 2004 ACM SIGCOMM Internet Measurement Conference, IMC 2004
PB - Association for Computing Machinery (ACM)
T2 - Proceedings of the 2004 ACM SIGCOMM Internet Measurement Conference, IMC 2004
Y2 - 25 October 2004 through 27 October 2004
ER -