FaceDetectNet: face detection via fully-convolutional network
Gorbatsevich V.S., Moiseenko A.S., Vizilter Y.V.

State Research Institute of Aviation Systems (GosNIIAS), Moscow, Russia;

Moscow Institute of Physics and Technology (MIPT), Moscow, Russia

Abstract:
Face detection is one of the most popular computer vision tasks. Many face detection approaches have been proposed, including various CNN-based techniques, but finding the optimal balance between detection quality and computational speed remains an open problem. In this paper we propose a new CNN-based face detection solution called FaceDetectNet. Our CNN architecture builds on the ideas of YOLO/DetectNet and the GoogLeNet architecture, supplemented with new tools and implementation details developed specifically for our face detection application. We propose an original iterative proposal clustering (IPC) algorithm for aggregating the output face proposals formed by the CNN, and a 2-level “weak pyramid” that provides better detection quality on test sets containing both small and huge images. Our approach is close to the previously proposed SSD-based face detection; the principal difference is that we use the deep features of the top hidden CNN layer to form face proposals of any size. We thus exploit global semantic and context information to improve detection quality for small faces. FaceDetectNet is trained and tested on the highly challenging WIDER FACE detection benchmark. Our algorithm achieves an average precision (AP) of 0.69 on the WIDER FACE hard level and thus outperforms all competing detectors on this level except the state-of-the-art HR solution. Note that the HR solution is based on a substantially deeper and slower CNN, while FaceDetectNet can run in real time on an NVIDIA GeForce 1080 GPU. By contrast, an SSD-based face detector with comparable CNN parameters achieves an AP of only 0.625 on the WIDER FACE hard level. Our approach therefore provides the best detection quality among detectors with reasonable computational speed.

Keywords:
CNN, face detection, DetectNet, YOLO.

Citation:
Gorbatsevich, V.S. FaceDetectNet: Face detection via fully-convolutional network / V.S. Gorbatsevich, A.S. Moiseenko, Y.V. Vizilter // Computer Optics. – 2019. – Vol. 43, Issue 1. – P. 63-71. – DOI: 10.18287/2412-6179-2019-43-1-63-71.

References:

  1. Redmon, J. You only look once: Unified, real-time object detection / J. Redmon, S. Divvala, R. Girshick, A. Farhadi // Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR). – 2016. – P. 779-788. – DOI: 10.1109/CVPR.2016.91.
  2. Tao, A. DetectNet: Deep neural network for object detection in DIGITS [Electronic resource] / A. Tao, J. Barker, S. Sarathy. – URL: https://devblogs.nvidia.com/detectnet-deep-neural-network-object-detection-digits/ (accessed 3.12.2018).
  3. Liu, W. SSD: Single shot multibox detector [Electronic resource] / W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C. Fu, A. Berg. – URL: https://arxiv.org/abs/1512.02325 (accessed 3.12.2018). – DOI: 10.1007/978-3-319-46448-0_2.
  4. Hu, P. Finding tiny faces / P. Hu, D. Ramanan // Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR). – 2017. – URL: https://arxiv.org/abs/1612.04402 (accessed 3.12.2018).
  5. Zhu, C. CMS-RCNN: contextual multi-scale region-based CNN for unconstrained face detection / C. Zhu, Y. Zheng, K. Luu, M. Savvides. – 2016. – URL: https://arxiv.org/abs/1606.05413 (accessed 3.12.2018).
  6. Viola, P. Rapid object detection using a boosted cascade of simple features / P. Viola, M. Jones // Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR). – 2001. – Vol. 1. – DOI: 10.1109/CVPR.2001.990517.
  7. Bourdev, L. Robust object detection via soft cascade / L. Bourdev, J. Brandt // Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR). – 2005. – Vol. 2. – P. 236-243. – DOI: 10.1109/CVPR.2005.310.
  8. Chen, D. Joint cascade face detection and alignment / D. Chen, S. Ren, Y. Wei, X. Cao, J. Sun. – In: Computer Vision – ECCV 2014 / ed. by D. Fleet, T. Pajdla, B. Schiele, T. Tuytelaars. – Cham: Springer, 2014. – P. 109-122. – DOI: 10.1007/978-3-319-10599-4_8.
  9. Li, J. Face detection using SURF Cascade / J. Li, T. Wang, Y. Zhang // 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops). – 2011. – P. 2183-2190. – DOI: 10.1109/ICCVW.2011.6130518.
  10. Krizhevsky, A. ImageNet classification with deep convolutional neural networks / A. Krizhevsky, I. Sutskever, G.E. Hinton // NIPS'12 Proceedings of the 25th International Conference on Neural Information Processing Systems. – 2012. – Vol. 1. – P. 1097-1105.
  11. Girshick, R. Rich feature hierarchies for accurate object detection and semantic segmentation / R. Girshick, J. Donahue, T. Darrell, J. Malik // CVPR '14 Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition. – 2014. – P. 580-587. – DOI: 10.1109/CVPR.2014.81.
  12. Ren, S. Faster R-CNN: Towards real-time object detection with region proposal networks [Electronic resource] / S. Ren, K. He, R. Girshick, J. Sun // arXiv preprint. – 2015. – URL: https://arxiv.org/abs/1506.01497 (accessed 3.12.2018).
  13. Yang, S. From facial parts responses to face detection: A deep learning approach / S. Yang, P. Luo, C.C. Loy, X. Tang // Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV). – 2015. – P. 3676-3684. – DOI: 10.1109/ICCV.2015.419.
  14. Jiang, H. Face detection with the faster R-CNN / H. Jiang, E. Learned-Miller // 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017). – 2017. – P. 650-657. – DOI: 10.1109/FG.2017.82.
  15. Li, H. A convolutional neural network cascade for face detection / H. Li, Z. Lin, X. Shen, J. Brandt, G. Hua // Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). – 2015. – P. 5325-5334. – DOI: 10.1109/CVPR.2015.7299170.
  16. Qin, H. Joint training of cascaded CNN for face detection / H. Qin, J. Yan, X. Li, X. Hu // Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). – 2016. – P. 3456-3465. – DOI: 10.1109/CVPR.2016.376.
  17. Zhang, K. Joint face detection and alignment using multi-task cascaded convolutional networks / K. Zhang, Z. Zhang, Z. Li, Y. Qiao // IEEE Signal Processing Letters. – 2016. – Vol. 23, Issue 10. – P. 1499-1503. – DOI: 10.1109/LSP.2016.2603342.
  18. Zhang, C. Improving multiview face detection with multi-task deep convolutional neural networks / C. Zhang, Z. Zhang // Proceedings of the IEEE Winter Conference on Applications of Computer Vision. – 2014. – P. 1036-1041. – DOI: 10.1109/WACV.2014.6835990.
  19. Redmon, J. YOLO9000: Better, faster, stronger [Electronic resource] / J. Redmon, A. Farhadi. – arXiv preprint. – URL: https://arxiv.org/abs/1612.08242 (accessed 3.12.2018).
  20. Yang, S. Face detection through scale-friendly deep convolutional networks [Electronic resource] / S. Yang, Y. Xiong, C.C. Loy, X. Tang. – arXiv preprint. – URL: https://arxiv.org/abs/1706.02863 (accessed 3.12.2018).
  21. Szegedy, C. Going deeper with convolutions / C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, A. Rabinovich // Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). – 2015. – P. 1-9. – DOI: 10.1109/CVPR.2015.7298594.
  22. Jia, Y. Caffe: Convolutional architecture for fast feature embedding / Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, T. Darrell // MM '14 Proceedings of the 22nd ACM international conference on Multimedia. – 2014. – P. 675-678. – DOI: 10.1145/2647868.2654889.
  23. Deng, J. ImageNet: A large-scale hierarchical image database / J. Deng, W. Dong, R. Socher, L.J. Li, K. Li, L. Fei-Fei // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). – 2009. – P. 248-255. – DOI: 10.1109/CVPR.2009.5206848.
  24. Yang, S. WIDER FACE: A face detection benchmark / S. Yang, P. Luo, C.C. Loy, X. Tang // Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). – 2016. – P. 5525-5533. – DOI: 10.1109/CVPR.2016.596.
