Cite this article: HE Wei, YAN Jia-cheng, ZHOU Wang-ping, LI Hong-jie. Adaptive control for buck converter based on long short-term memory neural network[J]. Control Theory & Applications, 2025, 42(9): 1838-1848.
Adaptive control for buck converter based on long short-term memory neural network
Abstract views: 2341  Full-text views: 209  Received: 2023-05-24  Revised: 2025-02-13
DOI: 10.7641/CTA.2024.30355
2025, 42(9): 1838-1848
Keywords: constant power load; DC-DC buck converter; long short-term memory neural network; double deep Q network; deep reinforcement learning
Funding: Supported by the National Natural Science Foundation of China (62373195, 62173205, 52077105, 62073169).
Author / Affiliation / E-mail
HE Wei  Collaborative Innovation Center on Atmospheric Environment and Equipment Technology, Nanjing University of Information Science & Technology  hwei@nuist.edu.cn
YAN Jia-cheng  Collaborative Innovation Center on Atmospheric Environment and Equipment Technology, Nanjing University of Information Science & Technology
ZHOU Wang-ping* (corresponding author)  Collaborative Innovation Center on Atmospheric Environment and Equipment Technology, Nanjing University of Information Science & Technology  wpzhou@nuist.edu.cn
LI Hong-jie  School of Electrical Engineering, Xi'an Jiaotong University
Abstract
      The model-free control method based on deep reinforcement learning avoids the complex process of system modeling, sidesteps the difficulties of nonlinear system control, and offers excellent robustness. In this paper, a model-free adaptive control strategy based on a long short-term memory (LSTM) neural network is proposed for a DC-DC buck converter system with a constant power load. First, a state space composed of consecutive voltage error signals is defined; these error signals form the input state of the control algorithm. Second, a discrete action space is constructed based on the reference voltage and a reward function is designed: the action space converts the algorithm's output into a duty cycle, and a reward signal computed from the controlled system's next state evaluates the algorithm's control performance. Then, the LSTM network serves as the state-action value function estimator of a double deep Q network, computing the Q-value of each candidate decision for the input state and selecting the decision with the highest Q-value as the optimal output. Finally, simulation and experimental studies are conducted on the DC-DC buck converter system with a constant power load under the proposed method. The results demonstrate that the control strategy achieves excellent reference-tracking performance and that, in the presence of external disturbances, the closed-loop system exhibits good robustness.
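The pipeline described in the abstract (an error-window state, a discrete duty-cycle action set, a reward computed from the plant's next state, and a double-DQN target in which the online network chooses the action and the target network scores it) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the averaged buck-with-CPL plant model, all component values (Vin, L, C, R, P), the nine-level duty set, and every function name here are assumptions for illustration only.

```python
# Illustrative sketch of the abstract's control pipeline (all values are assumptions).

class BuckCPL:
    """Averaged buck converter feeding a constant power load (CPL), Euler-stepped."""
    def __init__(self, vin=24.0, L=1e-3, C=1e-3, R=50.0, P=5.0, dt=1e-4):
        self.vin, self.L, self.C, self.R, self.P, self.dt = vin, L, C, R, P, dt
        self.i, self.v = 0.0, 5.0          # inductor current, capacitor voltage

    def step(self, duty):
        # Averaged model: L di/dt = duty*Vin - v ;  C dv/dt = i - v/R - P/v
        di = (duty * self.vin - self.v) / self.L
        dv = (self.i - self.v / self.R - self.P / max(self.v, 1e-3)) / self.C
        self.i += di * self.dt
        self.v += dv * self.dt
        return self.v

V_REF = 12.0                                # reference output voltage (assumed)
DUTIES = [0.1 * k for k in range(1, 10)]    # 9 discrete duty-cycle actions (assumed)

def make_state(err_window):
    """State = a window of consecutive voltage errors, fed to the LSTM as a sequence."""
    return tuple(err_window)

def reward(v_next):
    """Reward from the plant's next state: penalize tracking error."""
    return -abs(V_REF - v_next)

def ddqn_target(r, q_online_next, q_target_next, gamma=0.99):
    """Double-DQN target: online net selects the next action, target net evaluates it."""
    a_star = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    return r + gamma * q_target_next[a_star]
```

In a full implementation the LSTM replaces the tabular Q-values: it maps the error-sequence state to one Q-value per duty-cycle action, and `ddqn_target` supplies the regression target for training the online network, with the target network's weights periodically synchronized.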