An autonomous operant conditioning automaton

RUAN Xiao-gang; DAI Li-zhen; YU Nai-gong; YU Jian-jun

Cite this article:	RUAN Xiao-gang, DAI Li-zhen, YU Nai-gong, YU Jian-jun. An autonomous operant conditioning automaton[J]. Control Theory & Applications, 2012, 29(11): 1452-1457.

Received: 2011-08-15; revised: 2012-07-20

DOI: 10.7641/j.issn.1000-8152.2012.11.CCTA110929

2012,29(11):1452-1457

Keywords: automata theory; autonomy; operant conditioning; bionics; autonomous learning

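Funding: National Natural Science Foundation of China (61075110); Beijing Natural Science Foundation / Key Project of the Science and Technology Plan of the Beijing Municipal Education Commission (KZ201210005001); Beijing Natural Science Foundation (4102011); Specialized Research Fund for the Doctoral Program of Higher Education (20101103110007).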

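Author	Affiliation	E-mail
RUAN Xiao-gang	Institute of Artificial Intelligence and Robotics, College of Electronic Information and Control Engineering, Beijing University of Technology
DAI Li-zhen*	Institute of Artificial Intelligence and Robotics, College of Electronic Information and Control Engineering, Beijing University of Technology	alice.dai2011@gmail.com
YU Nai-gong	Institute of Artificial Intelligence and Robotics, College of Electronic Information and Control Engineering, Beijing University of Technology
YU Jian-jun	Institute of Artificial Intelligence and Robotics, College of Electronic Information and Control Engineering, Beijing University of Technology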

Abstract

To address the problem of bionic autonomous learning control, we propose an autonomous operant conditioning automaton (AOCA) model. The model builds on automata theory and the operant conditioning learning mechanism, and it uses a bionic self-organizing learning method. It consists of an action set, a state set, a set of condition-action rules, observable state transitions, and an operant conditioning learning law. We define the operant entropy based on the orientation values of the AOCA states, prove that this entropy converges, analyze the self-organization characteristics of the AOCA, and specify its recursive operating procedure. The AOCA is applied to simulate the Skinner animal experiment, in which the animal learns in stages and eventually acquires the skill. The experimental results show that the AOCA succeeds in simulating the operant conditioning learning mechanism.
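
The abstract names the components of the model but does not give the learning law or the entropy formula. The following is only a toy sketch of the general idea, not the AOCA itself. It assumes a linear reward-inaction update for the action probabilities and the Shannon entropy of those probabilities as a stand-in for the operant entropy. The class `ToyOperantAutomaton`, the function `run_toy_skinner_box`, the learning rate, and the single-state "lever" task are all hypothetical names and values chosen for illustration.

```python
import math
import random


def shannon_entropy(probs):
    """Shannon entropy (in bits) of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


class ToyOperantAutomaton:
    """Hypothetical toy model; NOT the AOCA defined in the paper."""

    def __init__(self, n_states, n_actions, rate=0.1, seed=0):
        self.rate = rate  # assumed learning rate
        self.rng = random.Random(seed)
        # One action distribution per state, initially uniform.
        self.probs = [[1.0 / n_actions] * n_actions for _ in range(n_states)]

    def choose(self, state):
        """Sample an action according to the current probabilities."""
        actions = list(range(len(self.probs[state])))
        return self.rng.choices(actions, weights=self.probs[state])[0]

    def reinforce(self, state, action, rewarded):
        """Assumed learning law: linear reward-inaction update."""
        if not rewarded:
            return  # unrewarded actions leave the probabilities unchanged
        p = self.probs[state]
        for a in range(len(p)):
            if a == action:
                p[a] += self.rate * (1.0 - p[a])  # strengthen rewarded action
            else:
                p[a] -= self.rate * p[a]  # weaken the others; sum stays 1


def run_toy_skinner_box(steps=500):
    """Single state, three actions; only action 0 ("press lever") is rewarded."""
    box = ToyOperantAutomaton(n_states=1, n_actions=3)
    before = shannon_entropy(box.probs[0])
    for _ in range(steps):
        action = box.choose(0)
        box.reinforce(0, action, rewarded=(action == 0))
    after = shannon_entropy(box.probs[0])
    return before, after, box.probs[0]


if __name__ == "__main__":
    before, after, probs = run_toy_skinner_box()
    print(f"entropy before: {before:.3f} bits, after: {after:.3f} bits")
    print("action probabilities:", [round(p, 3) for p in probs])
```

In this toy setting the entropy of the action distribution falls as the rewarded action comes to dominate, which is the kind of behaviour the convergence result in the abstract refers to. The paper's own definitions should be consulted for the actual learning law and entropy.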