| Cite this article: | WANG Qing, WANG Yu-jue, WANG Hao-ran, XIN Bin. Dynamic weapon-target assignment optimization integrating deep reinforcement learning and graph neural networks[J]. Control Theory & Applications, 2025, 42(11): 2252-2260. |
|
| Dynamic weapon-target assignment optimization integrating deep reinforcement learning and graph neural networks |
| Abstract views: 4021; Full-text views: 179; Submitted: 2025-02-20; Revised: 2025-09-10 |
| DOI: 10.7641/CTA.2025.50065 |
| 2025,42(11):2252-2260 |
| Keywords: dynamic sensor-weapon-target assignment; deep reinforcement learning; graph neural network; OODA loop |
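| Funding: Supported by the General Program of the Beijing Natural Science Foundation (4252050) and the National Science Fund for Distinguished Young Scholars of the National Natural Science Foundation of China (62425304). |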
|
| Abstract |
| This paper proposes a dynamic sensor-weapon-target assignment (SWTA) method based on deep reinforcement learning (DRL) and graph neural networks (GNNs), aimed at the complex, dynamic decision-making requirements of modern battlefields. Traditional static methods are inefficient and adapt poorly to battlefield environments that change in real time. To address this, DRL is combined with a GNN to build an intelligent decision-making framework that optimizes the decision policy through environment interaction and deep learning, improving resource allocation efficiency and decision accuracy. Guided by OODA loop theory, the framework uses the GNN to capture relational features among the weapons, targets, and sensors in a scenario and to generate assignment solutions quickly, while the DRL component optimizes the policy, enabling resource allocation optimization in dynamic environments. The optimization process takes into account operational effectiveness, resource consumption, and the protection of key locations. Experiments show that the method performs well across a variety of scenarios and significantly improves resource utilization and operational effectiveness. |