ABSTRACT
Convolutional neural networks (CNNs) are widely used on sequential data because they can capture local context dependencies and temporal order information inside sequences. Attention (ATT) mechanisms have also attracted enormous interest owing to their ability to identify the important parts of a sequence. These two architectures extract different features from sequences. To combine the advantages of CNNs and ATT, we propose the convolutional attention network (CAN), which merges the structures of CNN and ATT into a single neural network and can serve as a new basic module in complex neural networks. Based on CAN, we then build a hierarchically structured sequence encoding model, the "hierarchical convolutional attention network" (HiCAN), to tackle sequence modeling problems. It explicitly captures both local and global context dependencies as well as temporal order information in sequences. Extensive experiments on session-based recommendation (recommender systems) demonstrate that HiCAN outperforms state-of-the-art methods while offering higher computational efficiency. Furthermore, extended experiments on text classification (natural language processing) show that our model also achieves competitive performance on NLP tasks.
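The abstract describes CAN only at a high level: a convolution extracts local context features and an attention mechanism weights the important positions. The NumPy toy below is a minimal sketch of one plausible way such a block could compose the two operations; the function names (`conv1d`, `can_block`), the attention vector `v_att`, and all shapes are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

def conv1d(x, w):
    """'Same'-padded 1D convolution over a (seq_len, d_in) sequence
    with kernel w of shape (k, d_in, d_out)."""
    k, d_in, d_out = w.shape
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    return np.stack([np.tensordot(xp[t:t + k], w, axes=([0, 1], [0, 1]))
                     for t in range(len(x))])

def softmax(z):
    z = z - z.max()                       # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def can_block(x, w_conv, v_att):
    """Hypothetical CAN-style block: the convolution extracts local
    context features, v_att scores each position's importance, and a
    softmax-weighted sum yields a fixed-size sequence summary."""
    h = np.tanh(conv1d(x, w_conv))        # (seq_len, d_out) local features
    scores = h @ v_att                    # (seq_len,) importance scores
    alpha = softmax(scores)               # attention weights, sum to 1
    return alpha @ h                      # (d_out,) weighted summary

rng = np.random.default_rng(0)
x = rng.normal(size=(10, 8))              # toy sequence: 10 steps, dim 8
w = rng.normal(size=(3, 8, 16)) * 0.1     # kernel width 3, output dim 16
v = rng.normal(size=16)
summary = can_block(x, w, v)
print(summary.shape)                      # (16,)
```

A hierarchical model in the spirit of HiCAN would stack such blocks, applying one over local windows and another over the resulting window summaries to capture global dependencies.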
HiCAN: Hierarchical Convolutional Attention Network for Sequence Modeling