Human steering angle estimation in video based on key point detection and Kalman filter

Hu, Yanpeng; Liu, Yuxuan; Xu, Yanguang; Wang, Yinghui

doi:10.1007/s11768-022-00100-3

Research Article
Published: 27 June 2022

Volume 20, pages 408–417, (2022)

Control Theory and Technology

Yanpeng Hu¹,
Yuxuan Liu¹,
Yanguang Xu² &
…
Yinghui Wang¹

192 Accesses
2 Citations

Abstract

Human pose recognition and estimation in video is widely used. However, process noise and local occlusion make pose recognition challenging. In this paper, we introduce the Kalman filter into pose recognition to reduce noise and to handle local occlusion. The core of pose recognition in video is the fast detection of key points and the calculation of human steering angles. We therefore first build a human key point detection model. Frames are skipped based on the Hamming distance between the hash values of every two adjacent frames in the video. The Kalman filter is then applied to the key point coordinates to reduce noise. To calculate the human steering angle, the current state of each key point is predicted from the optimal estimate of that key point at the previous time step, and the steering angle is then computed from the current and previous state information. Improved SENet, NLNet and GCNet modules are integrated into the key point detection model to improve accuracy. Tests are presented to illustrate the effectiveness of the proposed algorithm.
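The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and every concrete choice in it is an assumption: the abstract does not say which hash is used, so an average hash stands in; the Kalman filter uses a constant-velocity model with made-up noise levels; and the angle is taken as the direction of the displacement between the previous and current key point positions, which may differ from the paper's definition. All function names, class names and parameters are hypothetical.

```python
import numpy as np


def average_hash(gray, size=8):
    """Assumed hash: block-average a grayscale frame to size x size, threshold at the mean."""
    h, w = gray.shape
    gray = gray[: h - h % size, : w - w % size].astype(float)
    blocks = gray.reshape(size, gray.shape[0] // size, size, gray.shape[1] // size)
    small = blocks.mean(axis=(1, 3))
    return (small > small.mean()).ravel()


def hamming(a, b):
    """Number of hash bits that differ."""
    return int(np.count_nonzero(a != b))


def select_frames(frames, threshold=5):
    """Keep frame 0, then every frame whose hash differs from the
    preceding frame's hash by more than `threshold` bits (assumed rule)."""
    kept = [0]
    prev = average_hash(frames[0])
    for i in range(1, len(frames)):
        cur = average_hash(frames[i])
        if hamming(prev, cur) > threshold:
            kept.append(i)
        prev = cur
    return kept


class KeyPointKalman:
    """Constant-velocity Kalman filter for one key point; state is [x, y, vx, vy]."""

    def __init__(self, xy, dt=1.0, q=1e-2, r=1.0):
        self.x = np.array([xy[0], xy[1], 0.0, 0.0])
        self.P = np.eye(4) * 10.0          # initial uncertainty (assumed)
        self.F = np.eye(4)                 # state transition
        self.F[0, 2] = self.F[1, 3] = dt
        self.H = np.eye(2, 4)              # only the position is measured
        self.Q = np.eye(4) * q             # process noise (assumed)
        self.R = np.eye(2) * r             # measurement noise (assumed)

    def predict(self):
        """Propagate the previous optimal estimate to the current frame."""
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2].copy()

    def update(self, z):
        """Correct the prediction with a detected key point position."""
        y = np.asarray(z, dtype=float) - self.H @ self.x
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.x[:2].copy()


def steering_angle(prev_xy, cur_xy):
    """Assumed definition: direction of the displacement in degrees, in image coordinates."""
    d = np.asarray(cur_xy, dtype=float) - np.asarray(prev_xy, dtype=float)
    return float(np.degrees(np.arctan2(d[1], d[0])))
```

In this sketch, a key point that is occluded in some frame would be handled by calling `predict()` without a following `update()`, so the filter carries the previous estimate forward until a detection is available again.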

Author information

Authors and Affiliations

School of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing, 100083, China
Yanpeng Hu, Yuxuan Liu & Yinghui Wang
KE Holdings Inc., Beijing, 100085, China
Yanguang Xu

Corresponding author

Correspondence to Yinghui Wang.

Additional information

This work was supported by the National Natural Science Foundation of China (Nos. 72101026, 61621063) and the State Key Laboratory of Intelligent Control and Decision of Complex Systems.

About this article

Cite this article

Hu, Y., Liu, Y., Xu, Y. et al. Human steering angle estimation in video based on key point detection and Kalman filter. Control Theory Technol. 20, 408–417 (2022). https://doi.org/10.1007/s11768-022-00100-3

Received: 29 September 2021
Revised: 21 March 2022
Accepted: 22 March 2022
Published: 27 June 2022
Issue Date: August 2022
DOI: https://doi.org/10.1007/s11768-022-00100-3
