A framework of reading timestamps for surveillance video
Cheng J., Dai W.

Computer School, Hubei Polytechnic University, Huangshi 435000, Hubei, China,
School of Economics and Management, Hubei Polytechnic University, Huangshi 435003, Hubei, China

Abstract:
This paper presents a framework for automatically reading timestamps in surveillance video. Reading timestamps from surveillance video is difficult because of color variety, font diversity, noise, and low resolution. The proposed algorithm addresses these challenges with a deep learning framework. The framework comprises: joint training of timestamp localization and recognition in a single end-to-end pass; the structure of the recognition CNN; and the geometry of its input layer, which preserves the aspect ratio of the timestamps and adapts its resolution to the data. The proposed method achieves state-of-the-art accuracy in end-to-end timestamp recognition on our datasets, while being an order of magnitude faster than competing methods. The framework can improve the market competitiveness of panoramic video surveillance products.
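
The idea of an input geometry that preserves the aspect of a timestamp while adapting its resolution can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `recognition_input_size` and the parameters `target_h`, `stride`, and `max_w` (with their default values) are assumptions chosen for illustration. The sketch only computes the size to which a localized timestamp crop would be resized before recognition: fixed height, width scaled proportionally.

```python
def recognition_input_size(box_w, box_h, target_h=32, stride=4, max_w=512):
    """Return (width, height) for resizing a timestamp crop.

    Hypothetical helper: the height is fixed to target_h and the width is
    scaled by the same factor, so the crop's aspect ratio is kept
    (up to rounding to the network stride and the max_w cap).
    """
    if box_w <= 0 or box_h <= 0:
        raise ValueError("box dimensions must be positive")

    # Scale the width by the same factor applied to the height.
    width = round(box_w * target_h / box_h)

    # Round up to a multiple of the (assumed) network stride, at least one stride.
    width = max(stride, -(-width // stride) * stride)

    # Cap very long crops at an assumed maximum width.
    return min(width, max_w), target_h


if __name__ == "__main__":
    # A 200x20 pixel timestamp box is resized to a 32-pixel-high input.
    print(recognition_input_size(200, 20))
```

Fixing only the height lets short and long timestamp strings map to inputs of different widths instead of being stretched to one shape; the values used here are placeholders, not settings reported in the paper.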

Keywords:
surveillance video, timestamp localization, timestamp recognition.

Citation:
Cheng, J. A framework of reading timestamps for surveillance video / J. Cheng, W. Dai // Computer Optics. - 2019. - Vol. 43, Issue 1. - P. 72-77. - DOI: 10.18287/2412-6179-2019-43-1-72-77.

