DOI: 10.1145/3357384.3357996
research-article

HiCAN: Hierarchical Convolutional Attention Network for Sequence Modeling

Published: 03 November 2019

ABSTRACT

Convolutional neural networks (CNNs) are widely used on sequential data because they can capture local context dependencies and temporal order information within sequences. Attention (ATT) mechanisms have also attracted enormous interest owing to their ability to identify the important parts of a sequence. These two architectures extract different kinds of features from sequences. To combine the advantages of CNN and ATT, we propose the convolutional attention network (CAN), which merges the structures of CNN and ATT into a single neural network and can serve as a new basic module in more complex neural networks. Building on CAN, we then construct a sequence encoding model with a hierarchical structure, the "hierarchical convolutional attention network (HiCAN)", to tackle sequence modeling problems. It explicitly captures both local and global context dependencies as well as temporal order information in sequences. Extensive experiments on session-based recommendation (Recommender Systems) demonstrate that HiCAN outperforms state-of-the-art methods while offering higher computational efficiency. We further evaluate the model on text classification (Natural Language Processing); the results show that it also achieves competitive performance on NLP tasks.
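The core idea described above, a convolution that captures local context followed by an attention step that weights positions globally, can be sketched in a few lines of NumPy. This is an illustrative schematic only, not the paper's actual CAN module; the function name `conv_attention_block` and the parameters `conv_w`, `att_w`, and `window` are placeholders introduced for this sketch.

```python
import numpy as np

def conv_attention_block(x, conv_w, att_w, window=3):
    """Schematic convolution-then-attention block (not the paper's exact CAN).

    x:      (seq_len, d) input sequence embeddings.
    conv_w: (window * d, d) weights of a 1-D convolution over the sequence.
    att_w:  (d,) scoring vector for attention over positions.
    Returns a (d,) summary vector of the whole sequence.
    """
    seq_len, d = x.shape
    pad = window // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))  # same-length ("same") padding
    # Convolution: each position sees a local window of neighbours,
    # capturing local context dependencies and temporal order.
    h = np.stack([
        np.tanh(xp[i:i + window].reshape(-1) @ conv_w)
        for i in range(seq_len)
    ])                                    # (seq_len, d) local features
    # Attention: softmax over positions yields a global, content-based
    # weighting of the locally extracted features.
    scores = h @ att_w                    # (seq_len,)
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()
    return alpha @ h                      # (d,) attention-weighted summary
```

A hierarchical model in the spirit of HiCAN would apply such a block within local segments of a sequence and then again over the resulting segment summaries, so that both local and global dependencies are modeled explicitly.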


Published in
CIKM '19: Proceedings of the 28th ACM International Conference on Information and Knowledge Management
November 2019, 3373 pages
ISBN: 9781450369763
DOI: 10.1145/3357384

        Copyright © 2019 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States



        Acceptance Rates

CIKM '19 paper acceptance rate: 202 of 1,031 submissions (20%). Overall acceptance rate: 1,861 of 8,427 submissions (22%).
