AIRS: Explanation for Deep Reinforcement Learning based Security Applications

Jiahao Yu, Wenbo Guo, Qi Qin, Gang Wang, Ting Wang, Xinyu Xing

Research output: Chapter in Book/Report/Conference proceedingConference contribution

1 Scopus citations

Abstract

Recently, we have witnessed the success of deep reinforcement learning (DRL) in many security applications, ranging from malware mutation to selfish blockchain mining. Like all other machine learning methods, the lack of explainability has been limiting its broad adoption as users have difficulty establishing trust in DRL models’ decisions. Over the past years, different methods have been proposed to explain DRL models but unfortunately, they are often not suitable for security applications, in which explanation fidelity, efficiency, and the capability of model debugging are largely lacking. In this work, we propose AIRS, a general framework to explain deep reinforcement learning-based security applications. Unlike previous works that pinpoint important features to the agent’s current action, our explanation is at the step level. It models the relationship between the final reward and the key steps that a DRL agent takes, and thus outputs the steps that are most critical towards the final reward the agent has gathered. Using four representative security-critical applications, we evaluate AIRS from the perspectives of explainability, fidelity, stability, and efficiency. We show that AIRS could outperform alternative explainable DRL methods. We also showcase AIRS’s utility, demonstrating that our explanation could facilitate the DRL model’s failure offset, help users establish trust in a model decision, and even assist the identification of inappropriate reward designs.

Original languageEnglish (US)
Title of host publication32nd USENIX Security Symposium, USENIX Security 2023
PublisherUSENIX Association
Pages7375-7392
Number of pages18
ISBN (Electronic)9781713879497
StatePublished - 2023
Event32nd USENIX Security Symposium, USENIX Security 2023 - Anaheim, United States
Duration: Aug 9 2023Aug 11 2023

Publication series

Name32nd USENIX Security Symposium, USENIX Security 2023
Volume10

Conference

Conference32nd USENIX Security Symposium, USENIX Security 2023
Country/TerritoryUnited States
CityAnaheim
Period8/9/238/11/23

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems
  • Safety, Risk, Reliability and Quality

Fingerprint

Dive into the research topics of 'AIRS: Explanation for Deep Reinforcement Learning based Security Applications'. Together they form a unique fingerprint.

Cite this