Multi-agent batch task allocation for urgent tasks

ZHANG Hua-qing; HAO Ming-rui; JIANG Ji-xiang; ZHANG Xiao-fei; MA Hong-bin; SI Jia-shuai

Cite this article:	ZHANG Hua-qing, HAO Ming-rui, JIANG Ji-xiang, ZHANG Xiao-fei, MA Hong-bin, SI Jia-shuai. Multi-agent batch task allocation for urgent tasks[J]. Control Theory & Applications, 2025, 42(11): 2242-2251.

Received: 2024-09-28  Revised: 2025-10-15

DOI: 10.7641/CTA.2025.40518

2025,42(11):2242-2251

Keywords: heterogeneous multiagent; task allocation; deep reinforcement learning; attention mechanism; urgent task scenarios

Funding: Supported by the National Natural Science Foundation of China (62076028).

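Author	Affiliation	E-mail
ZHANG Hua-qing	National Key Laboratory of Complex System Control and Intelligent Agent Cooperation, Beijing Electro-Mechanical Engineering Institute	WTJ_1993@126.com
HAO Ming-rui	National Key Laboratory of Complex System Control and Intelligent Agent Cooperation, Beijing Electro-Mechanical Engineering Institute
JIANG Ji-xiang^*	National Key Laboratory of Complex System Control and Intelligent Agent Cooperation, Beijing Electro-Mechanical Engineering Institute	jjxldy@163.com
ZHANG Xiao-fei	School of Vehicle and Mobility, Tsinghua University
MA Hong-bin	School of Automation, Beijing Institute of Technology
SI Jia-shuai	National Key Laboratory of Complex System Control and Intelligent Agent Cooperation, Beijing Electro-Mechanical Engineering Institute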

Abstract

Existing learning-based constructive task allocation methods must build a complete allocation scheme step by step before any task is assigned to an agent for execution, so they usually cannot meet the real-time requirements of large-scale urgent task scenarios such as rescue and confrontation. To address the problem of optimizing heterogeneous multi-agent task allocation policies in such scenarios, this paper proposes a multi-agent batch task allocation method based on deep reinforcement learning. The method uses a policy model consisting of an encoder, agent and task-node selection decoders, and a recursive embedding structure. Guided by the optimality requirements of the objective function, the model outputs one batch of agent-task node pairs at a time, and each batch forms a partial task allocation scheme. During online task allocation, the corresponding agents can therefore start executing their tasks without waiting for the complete scheme to be constructed. Evaluation results show that the proposed method improves the real-time performance, reliability, and cooperative capability of task allocation policies in urgent task scenarios.
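
The difference between constructing a complete scheme and releasing it batch by batch can be illustrated with a small sketch. This is not the policy model described in the abstract: a greedy nearest-pair rule stands in for the learned encoder-decoder policy, and the function name `allocate_in_batches`, the 2-D positions, the `batch_size` parameter, and the rule of at most one task per agent per batch are all assumptions made for illustration only.

```python
import math
from collections.abc import Iterator

Point = tuple[float, float]


def allocate_in_batches(
    agents: dict[str, Point],
    tasks: dict[str, Point],
    batch_size: int,
) -> Iterator[list[tuple[str, str]]]:
    """Yield partial allocation schemes, one batch of (agent, task) pairs at a time.

    Hypothetical stand-in: a greedy nearest-pair rule replaces the learned policy.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    if tasks and not agents:
        raise ValueError("tasks cannot be allocated without agents")

    positions = dict(agents)   # where each agent will be after its assigned tasks
    remaining = dict(tasks)    # tasks not yet allocated

    while remaining:
        batch: list[tuple[str, str]] = []
        free = set(positions)  # assumption: one task per agent within a batch
        while remaining and free and len(batch) < batch_size:
            # Pick the closest (agent, task) pair among free agents and open tasks.
            agent, task = min(
                ((a, t) for a in sorted(free) for t in sorted(remaining)),
                key=lambda pair: math.dist(positions[pair[0]], remaining[pair[1]]),
            )
            batch.append((agent, task))
            positions[agent] = remaining.pop(task)  # agent ends up at the task site
            free.remove(agent)
        # The batch is released here, before later batches have been constructed.
        yield batch


agents = {"a1": (0.0, 0.0), "a2": (10.0, 0.0)}
tasks = {"t1": (1.0, 0.0), "t2": (9.0, 0.0), "t3": (2.0, 0.0)}
batches = list(allocate_in_batches(agents, tasks, batch_size=2))
```

Because the function is a generator, a caller can dispatch each batch to the agents as soon as it is yielded, which mirrors the idea that agents need not wait for the full scheme. In the toy data above, the first batch pairs `a1` with `t1` and `a2` with `t2`, and the second batch assigns `t3` to `a1`.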