Quantum-enhanced reinforcement learning for control: a preliminary study

Hu, Yazhou; Tang, Fengzhen; Chen, Jun; Wang, Wenxue

doi:10.1007/s11768-021-00063-x

Research Article
Published: 26 November 2021

Volume 19, pages 455–464, (2021)

Control Theory and Technology

Yazhou Hu¹, Fengzhen Tang²,³, Jun Chen¹ & Wenxue Wang²,³

Abstract

Reinforcement learning is one of the fastest-growing areas of machine learning and has achieved notable results in biomedicine, the Internet of Things (IoT), logistics, robotic control, and other fields. However, many challenges remain for engineering applications, such as how to speed up the learning process and how to balance the trade-off between exploration and exploitation. Quantum technology, which can solve complex problems faster than classical methods, especially in supercomputers, offers a new paradigm for overcoming these challenges in reinforcement learning. In this paper, a quantum-enhanced reinforcement learning algorithm is proposed for optimal control. In this algorithm, the states and actions of reinforcement learning are quantized by quantum technology. A probability amplification method is then presented, which can effectively avoid the trade-off between exploration and exploitation via quantized technology. Finally, the optimal control policy is learned during the reinforcement learning process. The performance of this quantized algorithm is demonstrated in the MountainCar and CartPole environments, both of which are classic control reinforcement learning environments in the OpenAI Gym. The preliminary results indicate that, compared with Q-learning, this quantized reinforcement learning method achieves better control performance without having to consider the trade-off between exploration and exploitation. The learning performance of the new algorithm is stable across learning rates from 0.01 to 0.10, which suggests that it is promising for systems with unknown dynamics.
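The abstract describes actions being represented in quantum form and a probability amplification step that takes the place of an explicit exploration schedule. Since the full text is not reproduced here, the following is only a minimal classical simulation of that general idea, assuming real-valued amplitudes over a small discrete action set and a Grover-style amplification step. The function names, the choice of which action to amplify, and the iteration count are all illustrative assumptions, not the authors' method.

```python
import math
import random


def uniform_amplitudes(n_actions):
    """Equal superposition: every action starts with amplitude 1/sqrt(n)."""
    a = 1.0 / math.sqrt(n_actions)
    return [a] * n_actions


def probabilities(amps):
    """Measurement probability of each action is its squared amplitude."""
    return [a * a for a in amps]


def grover_iteration(amps, target):
    """One Grover-style step on real amplitudes (assumed amplification rule).

    Step 1: flip the sign of the marked action's amplitude.
    Step 2: reflect every amplitude about the mean amplitude.
    Both steps preserve the total probability.
    """
    flipped = [-a if i == target else a for i, a in enumerate(amps)]
    mean = sum(flipped) / len(flipped)
    return [2.0 * mean - a for a in flipped]


def amplify(amps, target, n_iter):
    """Apply n_iter Grover-style steps that favour the action `target`."""
    for _ in range(n_iter):
        amps = grover_iteration(amps, target)
    return amps


def measure(amps, rng=random):
    """Sample one action index according to the squared amplitudes."""
    return rng.choices(range(len(amps)), weights=probabilities(amps), k=1)[0]


if __name__ == "__main__":
    amps = amplify(uniform_amplitudes(8), target=3, n_iter=1)
    print([round(p, 5) for p in probabilities(amps)])
    print("sampled action:", measure(amps))
```

In this sketch, one iteration over four actions moves all of the probability onto the marked action, while one iteration over eight actions raises it from 0.125 to about 0.78, so the other actions are still sampled occasionally. That residual probability is how an amplification step of this kind can stand in for a separately tuned exploration parameter.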

Author information

Authors and Affiliations

College of Mechanical and Electronic Engineering, Northwest A&F University, Yangling, Shaanxi, 712100, China
Yazhou Hu & Jun Chen
The State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang, Liaoning, 110016, China
Fengzhen Tang & Wenxue Wang
Institutes for Robotics and Intelligent Manufacturing, Chinese Academy of Sciences, Shenyang, Liaoning, 110169, China
Fengzhen Tang & Wenxue Wang

Corresponding author

Correspondence to Wenxue Wang.

About this article

Cite this article

Hu, Y., Tang, F., Chen, J. et al. Quantum-enhanced reinforcement learning for control: a preliminary study. Control Theory Technol. 19, 455–464 (2021). https://doi.org/10.1007/s11768-021-00063-x

Received: 18 May 2021
Revised: 01 July 2021
Accepted: 06 July 2021
Published: 26 November 2021
Issue Date: November 2021
DOI: https://doi.org/10.1007/s11768-021-00063-x
