DOI: 10.1007/978-3-319-99247-1_14
Article

A Hybrid RNN-CNN Encoder for Neural Conversation Model

Published: 17 August 2018

Abstract

Conventional dialogue systems are retrieval-based, so their performance is directly limited by the size of the dataset: such a system gives an improper response when the question falls outside the dataset. Recently, following the successful application of neural networks to machine translation, attention has shifted to building generative dialogue systems using sequence-to-sequence (seq2seq) learning with neural networks. However, it is still difficult to build a satisfactory neural conversation model, as the system sometimes tends to generate a generic response. Today, the most widely used method for dialogue generation is the neural conversation model, whose main structure is a recurrent neural network (RNN) encoder-decoder. Little work has yet introduced convolutional neural networks (CNNs) into neural conversation models. Considering that CNNs have been used in many natural language processing (NLP) tasks with great improvements, in this research we try to improve the performance of the neural conversation model by introducing a hybrid RNN-CNN encoder. The experimental results show this architecture’s promising potential.
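
The abstract does not say how the RNN and CNN parts of the encoder are combined. As a purely illustrative sketch, and not the authors' architecture, the following toy example assumes one simple option: a plain tanh RNN and a 1-D convolution with max-over-time pooling both read the same token embeddings, and their outputs are concatenated into one context vector. All names, sizes, and the random weights are hypothetical stand-ins.

```python
import math
import random


def rnn_encode(embeddings, w_in, w_rec):
    """Plain tanh (Elman) RNN; returns the final hidden state."""
    hidden = [0.0] * len(w_rec)
    for x in embeddings:
        hidden = [
            math.tanh(
                sum(w * xi for w, xi in zip(w_in[j], x))
                + sum(w * hi for w, hi in zip(w_rec[j], hidden))
            )
            for j in range(len(w_rec))
        ]
    return hidden


def cnn_encode(embeddings, filters, width):
    """1-D convolution over token windows, ReLU, then max-over-time pooling."""
    # Each window flattens `width` consecutive token embeddings into one vector.
    windows = [
        [v for token in embeddings[i:i + width] for v in token]
        for i in range(len(embeddings) - width + 1)
    ]
    return [
        max(max(0.0, sum(w * v for w, v in zip(f, win))) for win in windows)
        for f in filters
    ]


def hybrid_encode(embeddings, w_in, w_rec, filters, width):
    """Concatenate the RNN and CNN views (an assumed way to combine them)."""
    return rnn_encode(embeddings, w_in, w_rec) + cnn_encode(embeddings, filters, width)


def random_matrix(rows, cols, rng):
    """Hypothetical helper: small random weights in place of trained ones."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
EMB, HID, N_FILTERS, WIDTH = 4, 3, 2, 2  # illustrative sizes only
w_in = random_matrix(HID, EMB, rng)
w_rec = random_matrix(HID, HID, rng)
filters = random_matrix(N_FILTERS, EMB * WIDTH, rng)
utterance = random_matrix(5, EMB, rng)  # five tokens, random stand-in embeddings

context = hybrid_encode(utterance, w_in, w_rec, filters, WIDTH)
print(len(context))  # HID + N_FILTERS entries
```

In a seq2seq setting, a vector like `context` would initialize or condition the decoder. Concatenation is only one possibility; the paper itself may stack or fuse the two encoders differently.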

  • Published in
    Knowledge Science, Engineering and Management: 11th International Conference, KSEM 2018, Changchun, China, August 17–19, 2018, Proceedings, Part II
    Aug 2018
    501 pages
    ISBN:978-3-319-99246-4
    DOI:10.1007/978-3-319-99247-1

    © Springer Nature Switzerland AG 2018

    Publisher

    Springer-Verlag

    Berlin, Heidelberg
