
Joint Learning of LSTMs-CNN and Prototype for Micro-video Venue Classification

  • Conference paper
Advances in Multimedia Information Processing – PCM 2018 (PCM 2018)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 11165))

Abstract

The venue category of a micro-video is an important cue in social network applications, such as location-oriented applications and personalized services. Existing micro-video venue classification methods have two weaknesses: unsuitable convolutional filters and convolutional padding reduce discrimination, and the softmax layer limits robustness. To alleviate these problems, we propose a novel learning framework that jointly learns LSTMs-CNN and Prototype for micro-video venue classification. Specifically, LSTMs-CNN with SAME-type convolutional padding and a small convolutional filter is used to extract spatio-temporal information. The Prototype is learned simultaneously to improve robustness relative to the softmax classification function. We adopt a Euclidean distance loss function to train the whole network. Extensive experimental results on a real-world dataset show that our model significantly outperforms the state-of-the-art baselines in terms of both Micro-F and Macro-F scores.
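
The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The output-length rule follows the TensorFlow convention for SAME and VALID padding. The prototype part assumes one prototype per class and uses the squared Euclidean distance to the true-class prototype as the loss, which the abstract does not spell out. The venue labels, feature values, and function names are hypothetical.

```python
import math


def conv_output_length(n, kernel, stride=1, padding="SAME"):
    """Output length of a 1-D convolution under TensorFlow-style padding."""
    if padding == "SAME":
        # Input is zero-padded so that the output length depends only on the stride.
        return math.ceil(n / stride)
    if padding == "VALID":
        # No padding: the filter must fit entirely inside the input.
        return math.ceil((n - kernel + 1) / stride)
    raise ValueError(f"unknown padding: {padding!r}")


def squared_distance(x, p):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, p))


def predict(feature, prototypes):
    """Return the label whose prototype is nearest to the feature."""
    return min(prototypes, key=lambda label: squared_distance(feature, prototypes[label]))


def distance_loss(feature, label, prototypes):
    """Assumed loss: squared distance from the feature to its true-class prototype."""
    return squared_distance(feature, prototypes[label])


# Hypothetical 2-D prototypes, one per venue category.
prototypes = {"park": [0.0, 0.0], "beach": [4.0, 0.0], "gym": [0.0, 3.0]}
feature = [3.0, 1.0]  # hypothetical network output for one micro-video

predicted = predict(feature, prototypes)
loss = distance_loss(feature, "beach", prototypes)
print(conv_output_length(7, 3), conv_output_length(7, 3, padding="VALID"))
print(predicted, loss)
```

With a filter of size 3 and stride 1, SAME padding keeps a length-7 input at length 7, while VALID padding shrinks it to 5. In this sketch, classification needs no softmax layer: the prediction is simply the nearest prototype.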


Notes

  1. https://vine.co/

  2. https://www.snapchat.com/

  3. https://instagram.com/

  4. https://acmmm17.wixsite.com/eastern

  5. https://github.com/davoclavo/vinepy

  6. https://github.com/librosa/librosa

  7. https://ww2.mathworks.cn/

  8. https://www.tensorflow.org

Acknowledgments

We would like to thank the anonymous reviewers for their valuable comments. This work was supported by the National Natural Science Foundation of China (61772539), and the Fundamental Research Funds for the Central Universities (Nos. 3132017XNG1715, 3132018XNG1806).

Corresponding author

Correspondence to Wei Liu.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Liu, W., Huang, X., Cao, G., Song, G., Yang, L. (2018). Joint Learning of LSTMs-CNN and Prototype for Micro-video Venue Classification. In: Hong, R., Cheng, WH., Yamasaki, T., Wang, M., Ngo, CW. (eds) Advances in Multimedia Information Processing – PCM 2018. PCM 2018. Lecture Notes in Computer Science, vol 11165. Springer, Cham. https://doi.org/10.1007/978-3-030-00767-6_65

  • DOI: https://doi.org/10.1007/978-3-030-00767-6_65

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-00766-9

  • Online ISBN: 978-3-030-00767-6

  • eBook Packages: Computer Science, Computer Science (R0)
