Cite this article: HE Wei, YAN Jia-cheng, ZHOU Wang-ping, LI Hong-jie. Adaptive control for buck converter based on long short-term memory neural network[J]. Control Theory & Applications, 2025, 42(9): 1838-1848.
Adaptive control for buck converter based on long short-term memory neural network
Abstract views: 2341  Full-text views: 209  Received: 2023-05-24  Revised: 2025-02-13
DOI: 10.7641/CTA.2024.30355
2025, 42(9): 1838-1848
Keywords: constant power load; DC-DC buck converter; long short-term memory neural network; double deep Q network; deep reinforcement learning
Funding: Supported by the National Natural Science Foundation of China (62373195, 62173205, 52077105, 62073169).
Author / Affiliation / E-mail
HE Wei  Collaborative Innovation Center on Atmospheric Environment and Equipment Technology, Nanjing University of Information Science & Technology  hwei@nuist.edu.cn
YAN Jia-cheng  Collaborative Innovation Center on Atmospheric Environment and Equipment Technology, Nanjing University of Information Science & Technology
ZHOU Wang-ping* (corresponding author)  Collaborative Innovation Center on Atmospheric Environment and Equipment Technology, Nanjing University of Information Science & Technology  wpzhou@nuist.edu.cn
LI Hong-jie  School of Electrical Engineering, Xi'an Jiaotong University
Abstract
      The model-free control method based on deep reinforcement learning avoids the complex process of system modeling, sidesteps the difficulties of nonlinear system control, and offers excellent robustness. In this paper, a model-free adaptive control strategy based on a long short-term memory (LSTM) neural network is proposed for a DC-DC buck converter system with a constant power load. First, a state space composed of consecutive voltage error signals is defined; these error signals form the input state of the control algorithm. Second, a discrete action space is constructed based on the reference voltage and a reward function is designed: the action space converts the algorithm's output into a duty cycle, and a reward signal computed from the controlled system's next state evaluates the algorithm's control performance. Then, the LSTM network serves as the state-action value function estimator of a double deep Q network, computing the Q-value of each candidate decision for the input state and selecting the decision with the highest Q-value as the optimal output. Finally, simulation and experimental studies are conducted on the DC-DC buck converter system with a constant power load under the proposed method. The results demonstrate that the control strategy achieves excellent reference-tracking performance and that, in the presence of external disturbances, the closed-loop system exhibits good robustness.
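The pipeline described in the abstract (an error-window state, a discrete duty-cycle action set, a reward computed from the plant's next state, and a double-DQN target in which the online network chooses the action and the target network scores it) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the averaged buck-with-CPL plant model, all component values (Vin, L, C, R, P), the nine-level duty set, and every function name here are assumptions for illustration only.

```python
# Illustrative sketch of the abstract's control pipeline (all values are assumptions).

class BuckCPL:
    """Averaged buck converter feeding a constant power load (CPL), Euler-stepped."""
    def __init__(self, vin=24.0, L=1e-3, C=1e-3, R=50.0, P=5.0, dt=1e-4):
        self.vin, self.L, self.C, self.R, self.P, self.dt = vin, L, C, R, P, dt
        self.i, self.v = 0.0, 5.0          # inductor current, capacitor voltage

    def step(self, duty):
        # Averaged model: L di/dt = duty*Vin - v ;  C dv/dt = i - v/R - P/v
        di = (duty * self.vin - self.v) / self.L
        dv = (self.i - self.v / self.R - self.P / max(self.v, 1e-3)) / self.C
        self.i += di * self.dt
        self.v += dv * self.dt
        return self.v

V_REF = 12.0                                # reference output voltage (assumed)
DUTIES = [0.1 * k for k in range(1, 10)]    # 9 discrete duty-cycle actions (assumed)

def make_state(err_window):
    """State = a window of consecutive voltage errors, fed to the LSTM as a sequence."""
    return tuple(err_window)

def reward(v_next):
    """Reward from the plant's next state: penalize tracking error."""
    return -abs(V_REF - v_next)

def ddqn_target(r, q_online_next, q_target_next, gamma=0.99):
    """Double-DQN target: online net selects the next action, target net evaluates it."""
    a_star = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    return r + gamma * q_target_next[a_star]
```

In a full implementation the LSTM replaces the tabular Q-values: it maps the error-sequence state to one Q-value per duty-cycle action, and `ddqn_target` supplies the regression target for training the online network, with the target network's weights periodically synchronized.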