CNN Based Transfer Learning for Scene Script Identification

  • Conference paper
Neural Information Processing (ICONIP 2017)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10639)

Abstract

Identifying scripts in natural images is an important step in document analysis. Convolutional Neural Networks (CNNs) have recently achieved great success in image classification tasks, owing to their strong capacity and their invariance to translation and distortion. Training a new CNN, however, requires a large number of labelled images and extensive computational resources. Transfer learning from pre-trained models eases the application of CNNs and, in some circumstances, even improves performance. In this paper, we apply transfer learning and fine-tuning to document analysis. Specifically, we study scene script identification quantitatively by comparing the performance of transfer learning with that of learning from scratch. We evaluate two CNN architectures pre-trained on natural images: AlexNet and VGG-16. Experimental results on several benchmark datasets, namely SIW-13, MLe2e and CVSI2015, demonstrate that our approach outperforms both previous approaches and full training.
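
To make the contrast between full training and transfer learning concrete, the minimal sketch below counts how many parameters are updated under three regimes: training from scratch, fine-tuning all layers after replacing the classifier, and training only the new classifier on frozen pre-trained features. It is an illustration only, not the authors' implementation. The layer shapes follow one widely used single-stream AlexNet variant (an assumption; the abstract does not specify the variant), the class count of 13 is assumed from the dataset name SIW-13, and the names Layer, alexnet_like, transfer and trainable_params are hypothetical helpers.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    n_params: int             # weights plus biases
    pretrained: bool = False  # initialised from a source model?
    trainable: bool = True    # updated during training?


def conv(name, out_ch, in_ch, k):
    """Convolution layer with a square k x k kernel and one bias per filter."""
    return Layer(name, out_ch * in_ch * k * k + out_ch)


def fc(name, n_in, n_out):
    """Fully connected layer with one bias per output unit."""
    return Layer(name, n_in * n_out + n_out)


def alexnet_like(num_classes):
    """AlexNet-style layer list (assumed shapes) with a num_classes-way head."""
    return [
        conv("conv1", 64, 3, 11),
        conv("conv2", 192, 64, 5),
        conv("conv3", 384, 192, 3),
        conv("conv4", 256, 384, 3),
        conv("conv5", 256, 256, 3),
        fc("fc6", 256 * 6 * 6, 4096),
        fc("fc7", 4096, 4096),
        fc("fc8", 4096, num_classes),
    ]


def transfer(num_classes, freeze_features):
    """Reuse every layer except the last; the new head starts from scratch."""
    layers = alexnet_like(num_classes)
    for layer in layers[:-1]:
        layer.pretrained = True
        layer.trainable = not freeze_features
    return layers


def trainable_params(layers):
    return sum(layer.n_params for layer in layers if layer.trainable)


NUM_SCRIPTS = 13  # assumption: one class per script in SIW-13

scratch = alexnet_like(NUM_SCRIPTS)                        # random init, all trained
fine_tuned = transfer(NUM_SCRIPTS, freeze_features=False)  # pre-trained, all trained
frozen = transfer(NUM_SCRIPTS, freeze_features=True)       # only the new head trained

for label, model in [("scratch", scratch), ("fine-tuned", fine_tuned), ("frozen", frozen)]:
    reused = sum(layer.n_params for layer in model if layer.pretrained)
    print(f"{label:>10}: trainable={trainable_params(model):,} reused={reused:,}")
```

Under these assumed shapes, training from scratch and full fine-tuning both update 57,057,101 parameters, but fine-tuning starts 57,003,840 of them from pre-trained values; with frozen features only the 53,261 parameters of the new 13-way classifier are learned, which is why transfer learning needs far fewer labelled images and less computation.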

Acknowledgment

This work was performed in the framework of a MOBIDOC thesis financed by the EU under the PASRI program. The authors would also like to acknowledge the partial financial support of this work by grants from the General Direction of Scientific Research (DGRT), Tunisia, under the ARUB program. The research leading to these results has received funding from the Ministry of Higher Education and Scientific Research of Tunisia under grant agreement number LR11ES48.

Author information

Corresponding author

Correspondence to Maroua Tounsi.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Tounsi, M., Moalla, I., Lebourgeois, F., Alimi, A.M. (2017). CNN Based Transfer Learning for Scene Script Identification. In: Liu, D., Xie, S., Li, Y., Zhao, D., El-Alfy, ES. (eds) Neural Information Processing. ICONIP 2017. Lecture Notes in Computer Science, vol 10639. Springer, Cham. https://doi.org/10.1007/978-3-319-70136-3_74

  • DOI: https://doi.org/10.1007/978-3-319-70136-3_74

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-70135-6

  • Online ISBN: 978-3-319-70136-3

  • eBook Packages: Computer Science, Computer Science (R0)
