Cite this article: ZHU An, CHEN Li. Collision avoidance and compliance reinforcement learning control for space robot with spring-damper buffer device capturing satellite[J]. Control Theory & Applications, 2020, 37(8): 1727-1736.
Collision avoidance and compliance reinforcement learning control for space robot with spring-damper buffer device capturing satellite
Received: 2019-10-10  Revised: 2020-03-15
DOI  10.7641/CTA.2020.90839
  2020,37(8):1727-1736
Keywords  space robot  spring-damper buffer device  collision avoidance and compliance strategy  impact effect  reinforcement learning
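Funding  Supported by the National Natural Science Foundation of China (11372073) and the Fujian Provincial Major Research and Development Platform for Basic Components of Industrial Robots (2014H21010011)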
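Author  Affiliation  E-mail
ZHU An  School of Mechanical Engineering and Automation, Fuzhou University  zhu_an24@sina.com
CHEN Li*  School of Mechanical Engineering and Automation, Fuzhou University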
Abstract
      To protect the joints of a space robot from damage caused by impact loads during satellite capture, a spring-damper buffer device is designed between the joint motor and the manipulator. The device uses the spring to make the capture operation compliant and the damper to absorb collision energy and suppress flexible vibration; combined with a suitably designed collision avoidance and compliance strategy, it also limits the impact torque on the joints to a safe range. First, the dynamic equations of the separate space robot and target satellite systems before collision are derived using the Lagrange equations with dissipative forces and the Newton-Euler method, respectively. Then, by combining Newton's third law, the velocity constraint at the capture point, and the position constraints of the individual bodies, the dynamic equations of the hybrid system after capture are obtained, and the impact effect and impact force are calculated from the conservation of momentum. Finally, a collision avoidance and compliance reinforcement learning control scheme incorporating the buffer device is proposed. The scheme obtains a penalty signal through real-time trial-and-error interaction with the dynamic environment and uses this signal to optimize the controller, thereby stabilizing the unstable hybrid system. The stability of the system is proved using Lyapunov's theorem, and numerical simulations verify the impact resistance of the buffer device and the effectiveness of the proposed strategy.
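The buffering idea in the abstract can be illustrated with a deliberately simplified sketch. It is not the model used in the paper: it assumes a single link attached to a locked motor shaft through a torsional spring and damper, and represents the capture impact as an initial angular velocity of the link. The function name `simulate_buffer` and all parameter values are hypothetical and chosen only for illustration.

```python
def simulate_buffer(k=50.0, c=5.0, inertia=2.0, omega0=0.5, dt=1e-3, t_end=10.0):
    """Single-link toy model (an assumption, not the paper's dynamics).

    k       : torsional spring stiffness of the buffer [N*m/rad]
    c       : damper coefficient of the buffer [N*m*s/rad]
    inertia : link moment of inertia [kg*m^2]
    omega0  : link angular velocity just after impact [rad/s]
    Returns (peak joint torque, initial energy, final energy).
    """
    theta, omega = 0.0, omega0            # link deflection relative to the locked motor
    peak_torque = 0.0
    energy0 = 0.5 * inertia * omega0 ** 2
    for _ in range(int(t_end / dt)):
        # Torque transmitted through the buffer: spring term plus damper term.
        torque = -k * theta - c * omega
        peak_torque = max(peak_torque, abs(torque))
        # Semi-implicit Euler step: update velocity first, then position.
        omega += torque / inertia * dt
        theta += omega * dt
    energy = 0.5 * inertia * omega ** 2 + 0.5 * k * theta ** 2
    return peak_torque, energy0, energy


if __name__ == "__main__":
    peak_soft, e0, e_end = simulate_buffer()
    peak_stiff, _, _ = simulate_buffer(k=5000.0)
    print(f"soft buffer peak torque:  {peak_soft:.2f} N*m")
    print(f"stiff joint peak torque:  {peak_stiff:.2f} N*m")
    print(f"energy before/after:      {e0:.3e} / {e_end:.3e} J")
```

In this toy setting, a softer spring lowers the peak torque transmitted to the joint for the same impact velocity, and the damper removes the stored energy over time. A torque threshold check on `peak_torque` would be one simple way to express a "safe range" condition, though the actual strategy and controller are those described in the paper itself.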