Bayesian adversarial multi-node bandit for optimal smart grid protection against cyber attacks

Jianyu Xu, Bin Liu, Huadong Mo, Daoyi Dong

Research output: Contribution to journal > Article > peer-review


The cyber security of smart grids has become one of the key problems in developing reliable modern power and energy systems. This paper introduces a non-stationary adversarial cost with a variation constraint for smart grids, which makes it possible to study optimal smart grid protection against cyber attacks in a relatively practical scenario. In particular, a Bayesian multi-node bandit (MNB) model with adversarial costs is constructed, and a new regret function is defined for this model. An algorithm, called the Thompson–Hedge algorithm, is presented to solve the problem, and its superior performance is proven in terms of the convergence rate of the regret function. The applicability of the algorithm to real smart grid scenarios is verified, and its performance is demonstrated by numerical examples.
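
The abstract does not spell out how the Thompson–Hedge algorithm works. As a rough illustration only, the Python sketch below shows the two standard building blocks that the name suggests: a Hedge (exponential weights) update for adversarial costs, and Thompson sampling from Beta posteriors for Bayesian updating. It is not the authors' algorithm. The function names, the learning rate `eta`, the Beta(1, 1) priors, and all numbers are assumptions made for this example.

```python
import math
import random


def hedge_update(weights, costs, eta):
    """One Hedge step: scale each weight by exp(-eta * cost), then renormalize."""
    scaled = [w * math.exp(-eta * c) for w, c in zip(weights, costs)]
    total = sum(scaled)
    return [w / total for w in scaled]


def thompson_pick(successes, failures, rng):
    """Thompson sampling for Bernoulli outcomes, assuming Beta(1, 1) priors."""
    samples = [
        rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)
    ]
    # Choose the option whose sampled success probability is largest.
    return max(range(len(samples)), key=samples.__getitem__)


# Hypothetical data: three nodes, with made-up costs and outcome counts.
rng = random.Random(0)
weights = hedge_update([1 / 3, 1 / 3, 1 / 3], costs=[0.9, 0.1, 0.5], eta=1.0)
node = thompson_pick(successes=[1, 8, 3], failures=[9, 2, 7], rng=rng)
```

After the update, the node with the lowest observed cost carries the largest weight. The sampled choice `node` depends on the random seed.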
Original language: English
Article number: 109551
Number of pages: 12
Early online date: 6 Mar 2021
Publication status: Published - 30 Jun 2021


  • multi-node bandit
  • reinforcement learning
  • Bayesian updating
  • cyber attack
  • smart grid
