TY - JOUR
T1 - General, Efficient, and Real-Time Data Compaction Strategy for APT Forensic Analysis
AU - Zhu, Tiantian
AU - Wang, Jiayu
AU - Ruan, Linqi
AU - Xiong, Chunlin
AU - Yu, Jinkai
AU - Li, Yaosheng
AU - Chen, Yan
AU - Lv, Mingqi
AU - Chen, Tieming
N1 - Funding Information:
Manuscript received May 4, 2020; revised December 13, 2020; accepted April 21, 2021. Date of publication April 28, 2021; date of current version June 2, 2021. This work was supported in part by the National Natural Science Foundation of China under Grant 62002324, Grant U1936215, and Grant 61772026, in part by the Zhejiang Provincial Natural Science Foundation of China under Grant LQ21F020016, in part by the Ministry of Industry and Information Technology of China under Grant TC190H3WN, and in part by the Zhejiang Provincial Key Research Projects under Grant 2021C01117. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Tomas Pevny. (Corresponding author: Tieming Chen.) Tiantian Zhu, Jiayu Wang, Jinkai Yu, Yaosheng Li, Mingqi Lv, and Tieming Chen are with the College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023, China (e-mail: ttzhu@zjut.edu.cn; 2111912210@zjut.edu.cn; 2111912024@zjut.edu.cn; 2111812108@zjut.edu.cn; mingqilv@zjut.edu.cn; tmchen@zjut.edu.cn).
Publisher Copyright:
© 2005-2012 IEEE.
PY - 2021
Y1 - 2021
N2 - The damage caused by Advanced Persistent Threat (APT) attacks to governments and large enterprises is gradually escalating. Once an attack event is detected, forensic analysis will use the dependencies between system audit logs to rapidly locate intrusion points and determine the impact of the attacks. Due to the high persistence of APT attacks, huge amounts of data will be stored to meet the needs of forensic analysis, which not only brings great storage overhead, but also sharply increases the computing costs. To compact data without affecting forensic analysis, several methods have been proposed. However, in real-world scenarios, we meet the problems of weak cross-platform capability, large data processing overhead, and poor real-time performance, rendering existing data compaction methods difficult to meet the usability and universality requirements jointly. To overcome these difficulties, this paper proposes a general, efficient, and real-time data compaction method at the system log level; it does not involve internal analysis of the program or depend on the specific operating system type, and it includes two strategies: 1) data compaction of maintaining global semantics (GS), which determines and deletes redundant events that do not affect global dependencies, and 2) data compaction based on suspicious semantics (SS). Given that the purpose of forensic analysis is to restore the attack chain, SS performs context analysis on the remaining events from GS and further deletes the parts that are not related to the attack. The results of the real-world experiments show that the compaction ratios of our method to system events are as high as 4.36× to 13.18× and 7.86× to 26.99× on GS and SS, respectively, which is better than state-of-the-art studies.
KW - Data compaction
KW - advanced persistent threat
KW - forensic analysis
UR - http://www.scopus.com/inward/record.url?scp=85105083105&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85105083105&partnerID=8YFLogxK
U2 - 10.1109/TIFS.2021.3076288
DO - 10.1109/TIFS.2021.3076288
M3 - Article
AN - SCOPUS:85105083105
SN - 1556-6013
VL - 16
SP - 3312
EP - 3325
JO - IEEE Transactions on Information Forensics and Security
JF - IEEE Transactions on Information Forensics and Security
M1 - 9417210
ER -