Learnable receptive fields scheme in deep networks for image categorization

WANG Bo; GUO Ji-chang; ZHANG Yan

Cite this article: WANG Bo, GUO Ji-chang, ZHANG Yan. Learnable receptive fields scheme in deep networks for image categorization[J]. Control Theory & Applications, 2015, 32(8): 1114-1119.

Received: 2015-01-22; Revised: 2015-09-04

DOI: 10.7641/CTA.2015.50063

2015,32(8):1114-1119

Keywords: image categorization; hierarchical architecture; deep networks; receptive fields

Funding: Supported by the Specialized Research Fund for the Doctoral Program of Higher Education (20120032110034).

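Author	Affiliation	E-mail
WANG Bo*	School of Electronic Information Engineering, Tianjin University	neuwb@tju.edu.cn
GUO Ji-chang*	School of Electronic Information Engineering, Tianjin University
ZHANG Yan	School of Electronic Information Engineering, Tianjin University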

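Abstract (translated from the Chinese)

As a fundamental task for image retrieval, image organization and robotic vision, image categorization has received wide attention in computer vision and machine learning. Various deep-learning-based models for object recognition and image categorization have likewise attracted great interest in the field. This paper proposes an algorithm that replaces scale-invariant feature transform (SIFT) and histogram of oriented gradients (HOG) descriptors: it uses a deep hierarchical architecture to learn effective image representations layer by layer, learning features directly from raw pixels. The method uses K-singular value decomposition (K-SVD) for dictionary training and orthogonal matching pursuit (OMP) for encoding. In addition, the paper adopts a scheme that learns the classifier and the receptive fields used for pooling simultaneously. Experimental results show that the algorithm achieves better classification performance on the object (Oxford flowers) and event (UIUC-Sports) image categorization test sets.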
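The encoding step named in the abstract can be illustrated with a minimal sketch of orthogonal matching pursuit. This is not the authors' implementation: the function name `omp_encode`, the toy orthonormal dictionary and the sparsity level are assumptions made for illustration, and the dictionary is taken as given rather than trained with K-SVD.

```python
import numpy as np

def omp_encode(D, x, n_nonzero):
    """Greedy OMP: approximate x using at most n_nonzero columns (atoms) of D.

    D is assumed to have unit-norm columns; this is a textbook sketch,
    not the encoder used in the paper.
    """
    residual = x.astype(float).copy()
    support = []              # indices of the atoms selected so far
    coeffs = np.zeros(0)
    for _ in range(n_nonzero):
        # Select the atom most correlated with the current residual.
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:
            break
        support.append(k)
        # Re-fit all selected atoms jointly by least squares.
        coeffs, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coeffs
        if np.linalg.norm(residual) < 1e-10:
            break
    code = np.zeros(D.shape[1])
    code[support] = coeffs
    return code

# Toy usage: a signal built from two atoms of a hypothetical dictionary.
D = np.eye(6)
x = 2.0 * D[:, 1] - 3.0 * D[:, 4]
code = omp_encode(D, x, n_nonzero=2)
print(np.nonzero(code)[0])
```

In a hierarchical pipeline of the kind the abstract describes, sparse codes like `code` would be computed for each local patch and then pooled before being passed to the next layer.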

English abstract

Visual categorization has attracted increasing interest in computer vision and machine learning because it is a fundamental task for image retrieval, image organization and robotic vision. Over the past decade, various deep-learning-based models have been proposed and broadly applied to visual recognition and categorization. In this paper, the proposed approach learns features from scratch rather than employing hand-crafted scale-invariant feature transform (SIFT) and histogram of oriented gradients (HOG) descriptors. A deep hierarchical architecture for learning effective image representations is built up layer by layer. Specifically, K-SVD and OMP are used for the training and encoding phases, respectively, because of their simplicity and efficiency. In addition, sum, average and max operators are three commonly used pooling strategies in modern categorization models. Instead of a traditional pooling pattern, we apply an improved scheme that learns the receptive fields for pooling together with the classifier. We provide a detailed analysis of deep networks for event and object tasks and compare our method with several state-of-the-art algorithms, including kernel-based feature learning and saliency-weighted hierarchical sparse coding. Experimental results show that our algorithm achieves better classification performance on the UIUC-Sports and Oxford flowers datasets.
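The three pooling operators mentioned in the abstract can be sketched as follows. The function name `pool` and the representation of a receptive field as a boolean mask over spatial locations are assumptions for illustration. The sketch only shows pooling over one fixed field; the joint learning of receptive fields and classifier described in the paper is not reproduced here.

```python
import numpy as np

def pool(codes, field, op="max"):
    """Pool per-location codes of shape (H, W, K) over a boolean field of shape (H, W).

    Returns one K-dimensional feature for the receptive field.
    """
    selected = codes[field]  # shape (n_locations_in_field, K)
    if op == "sum":
        return selected.sum(axis=0)
    if op == "average":
        return selected.mean(axis=0)
    if op == "max":
        return selected.max(axis=0)
    raise ValueError(f"unknown pooling operator: {op}")

# Toy usage: a 2x2 grid of 2-dimensional codes, pooled over the top row.
codes = np.arange(8, dtype=float).reshape(2, 2, 2)
top_row = np.array([[True, True], [False, False]])
for op in ("sum", "average", "max"):
    print(op, pool(codes, top_row, op))
```

A learned scheme would, roughly speaking, choose or weight many candidate fields such as `top_row` according to how useful their pooled features are to the classifier, rather than fixing a grid in advance.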