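Collision avoidance and compliance reinforcement learning control for space robot with spring-damper buffer device capturing satellite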

ZHU An; CHEN Li

Cite this article:	ZHU An, CHEN Li. Collision avoidance and compliance reinforcement learning control for space robot with spring-damper buffer device capturing satellite[J]. Control Theory & Applications, 2020, 37(8): 1727-1736.

Received: 2019-10-10 Revised: 2020-03-15

DOI: 10.7641/CTA.2020.90839

2020,37(8):1727-1736

Keywords: space robot; spring-damper buffer device; collision avoidance and compliance strategy; impact effect; reinforcement learning

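Funding: Supported by the National Natural Science Foundation of China (11372073) and the Fujian Provincial Major Research and Development Platform for Industrial Robot Basic Components Technology (2014H21010011).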

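Author	Affiliation	E-mail
ZHU An	School of Mechanical Engineering and Automation, Fuzhou University	zhu_an24@sina.com
CHEN Li^*	School of Mechanical Engineering and Automation, Fuzhou University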

Abstract

To protect the joints of a space robot from impact damage while it captures a satellite, a spring-damper buffer device is designed between each joint motor and the manipulator. The device uses its spring to make the capture operation compliant and its damper to absorb collision energy and suppress flexible vibration. Combined with a suitably designed collision avoidance and compliance strategy, it also keeps the impact torque on the joints within a safe range. First, the dynamic equations of the space robot and of the target satellite, treated as separate systems before the collision, are derived using the Lagrange equations with dissipative forces and the Newton-Euler method, respectively. Then, by combining Newton's third law, the velocity constraint at the capture point, and the position constraints of the individual bodies, the dynamic equations of the combined (hybrid) system after capture are obtained, and the impact effect and impact force are calculated from momentum conservation. Finally, a collision avoidance and compliance reinforcement learning control scheme that works together with the buffer device is proposed. The scheme obtains a penalty signal through real-time trial-and-error interaction with the dynamic environment and uses that signal to optimize the controller, thereby stabilizing the unstable hybrid system. The stability of the system is proved with the Lyapunov theorem, and numerical simulations verify the impact resistance of the buffer device and the effectiveness of the proposed strategy.
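As a rough illustration of how an impact effect can be computed from momentum conservation, the sketch below treats the robot and the satellite as two point masses moving along one axis that lock together at capture. This one-dimensional reduction, the function name `capture_impact_1d`, and the numerical values are assumptions made for illustration only; the paper itself works with the full multibody dynamics of the hybrid system.

```python
def capture_impact_1d(m_robot, v_robot, m_sat, v_sat):
    """Return (common post-capture velocity, impulse acting on the robot).

    Hypothetical 1D point-mass model, not the paper's multibody model.
    Masses in kg, velocities in m/s, impulse in N*s.
    """
    if m_robot <= 0 or m_sat <= 0:
        raise ValueError("masses must be positive")
    # With no external forces, total linear momentum is the same
    # before and after the two bodies lock together.
    v_common = (m_robot * v_robot + m_sat * v_sat) / (m_robot + m_sat)
    # The impulse on the robot equals the change in its momentum.
    impulse_on_robot = m_robot * (v_common - v_robot)
    return v_common, impulse_on_robot


if __name__ == "__main__":
    # Illustrative numbers only: a 100 kg robot at rest captures
    # a 50 kg satellite drifting at 0.3 m/s.
    v, j = capture_impact_1d(100.0, 0.0, 50.0, 0.3)
    print(f"common velocity = {v:.3f} m/s, impulse on robot = {j:.3f} N*s")
```

By Newton's third law the impulse on the satellite is equal and opposite, so the two impulses sum to zero. Dividing the impulse by an assumed contact duration would give a mean impact force.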