Cite this article: ZHANG Hua-qing, HAO Ming-rui, JIANG Ji-xiang, ZHANG Xiao-fei, MA Hong-bin, SI Jia-shuai. Multi-agent batch task allocation for urgent tasks[J]. Control Theory & Applications, 2025, 42(11): 2242-2251.
Multi-agent batch task allocation for urgent tasks
Received: 2024-09-28  Revised: 2025-10-15
DOI  10.7641/CTA.2025.40518
  2025,42(11):2242-2251
Keywords  heterogeneous multi-agent  task allocation  deep reinforcement learning  attention mechanism  urgent task scenarios
Funding  Supported by the National Natural Science Foundation of China (62076028).
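Author  Affiliation  E-mail
ZHANG Hua-qing  National Key Laboratory of Complex System Control and Intelligent Agent Cooperation, Beijing Electro-Mechanical Engineering Institute  WTJ_1993@126.com
HAO Ming-rui  National Key Laboratory of Complex System Control and Intelligent Agent Cooperation, Beijing Electro-Mechanical Engineering Institute
JIANG Ji-xiang*  National Key Laboratory of Complex System Control and Intelligent Agent Cooperation, Beijing Electro-Mechanical Engineering Institute  jjxldy@163.com
ZHANG Xiao-fei  School of Vehicle and Mobility, Tsinghua University
MA Hong-bin  School of Automation, Beijing Institute of Technology
SI Jia-shuai  National Key Laboratory of Complex System Control and Intelligent Agent Cooperation, Beijing Electro-Mechanical Engineering Institute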
Abstract
      Existing learning-based constructive task allocation methods must construct a complete task allocation scheme step by step before any task can be assigned to an agent for execution, so they usually cannot meet real-time requirements in large-scale urgent scenarios such as rescue or confrontation. To address the problem of optimizing heterogeneous multi-agent task allocation policies in such scenarios, this paper proposes a multi-agent batch task allocation method based on deep reinforcement learning. The method uses a policy model comprising an encoder, agent and task-node selection decoders, and a recursive embedding structure. Guided by the optimality requirements of the objective function, the model outputs, one batch at a time, a partial task allocation scheme built from agent-task node pairs. During online task allocation, agents therefore do not need to wait for the complete scheme to be constructed before executing their assigned tasks. Evaluation results show that the proposed method improves the real-time performance, reliability, and cooperative capability of task allocation policies in urgent task scenarios.
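
The batch-wise control flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' model: the learned policy (encoder, decoders, recursive embedding) is replaced here by a hypothetical distance-based scorer, agents are treated as identical rather than heterogeneous, and every name (`Agent`, `Task`, `distance_scorer`, `allocate_in_batches`, `batch_size`) is an assumption made for illustration. The sketch also assumes each agent receives at most one task per batch. What it shows is only that partial allocations are released batch by batch, so execution can begin before the full scheme exists.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Agent:
    name: str
    x: float
    y: float


@dataclass
class Task:
    name: str
    x: float
    y: float


def distance_scorer(agent, task):
    """Hypothetical stand-in for a learned policy: nearer pairs score higher."""
    return -hypot(agent.x - task.x, agent.y - task.y)


def allocate_in_batches(agents, tasks, batch_size, scorer=distance_scorer):
    """Yield partial allocations, one batch of (agent, task) name pairs at a time."""
    if batch_size < 1 or not agents:
        raise ValueError("need at least one agent and batch_size >= 1")
    # Work on copies so the caller's agent objects are not modified.
    fleet = [Agent(a.name, a.x, a.y) for a in agents]
    remaining = list(tasks)
    while remaining:
        batch, busy = [], set()
        # Assumption: each agent takes at most one task per batch.
        while remaining and len(batch) < batch_size and len(busy) < len(fleet):
            candidates = (
                (a, t) for a in fleet if a.name not in busy for t in remaining
            )
            agent, task = max(candidates, key=lambda pair: scorer(*pair))
            batch.append((agent.name, task.name))
            busy.add(agent.name)
            remaining.remove(task)
            # The agent is assumed to end up at the task location.
            agent.x, agent.y = task.x, task.y
        # Releasing the batch here lets agents start executing immediately.
        yield batch


if __name__ == "__main__":
    demo_agents = [Agent("A", 0, 0), Agent("B", 10, 0)]
    demo_tasks = [Task("t1", 1, 0), Task("t2", 8, 0),
                  Task("t3", 2, 0), Task("t4", 5, 5)]
    for i, batch in enumerate(allocate_in_batches(demo_agents, demo_tasks, 2)):
        print(f"batch {i}: dispatch {batch}")
```

Writing the allocator as a generator mirrors the online setting: the caller can dispatch each batch as soon as it is yielded instead of waiting for the loop to exhaust all tasks. A learned policy could in principle be substituted by passing a different `scorer`, although the model described in the abstract selects agent-task node pairs with decoders rather than by exhaustive pair scoring.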