A deep reinforcement learning model using long contexts for Chatbots
AC Le - 2021 International Conference on System Science …, 2021 - ieeexplore.ieee.org
Recently, deep neural network (DNN) based chatbots have shown great promise for generating responses. Built on an encoder-decoder architecture with an attention mechanism, they generate natural responses well; however, they are still short-sighted, predicting one response at a time while ignoring its influence on future outcomes. This is one of the reasons DNN-based models tend to generate responses that are not relevant to previous responses. In this paper, we show how to apply reinforcement learning to these models to help them achieve a specific goal, by simulating dialogues between two pretrained DNN chatbots and then using policy gradient methods to reward sequences. We also present our forward-looking function for the desired goal, which is used to improve the models. The experimental results show that the proposed model generates appropriate responses containing more information relevant to the conversation context. In addition, our model improves the BLEU score by up to 43% compared to the baseline.
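The policy gradient idea mentioned in the abstract can be illustrated with a minimal REINFORCE sketch. This is not the paper's model: it assumes a toy policy that chooses among three candidate responses, and the reward values in `TOY_REWARDS`, the function names, and the hyperparameters are all hypothetical stand-ins for the paper's forward-looking function, which the abstract does not specify.

```python
import math
import random


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce(reward_fn, n_actions, steps=2000, lr=0.1, seed=0):
    """Train a softmax policy with the REINFORCE update.

    reward_fn is a hypothetical stand-in for a dialogue reward;
    each action plays the role of one candidate response.
    """
    rng = random.Random(seed)
    logits = [0.0] * n_actions
    baseline = 0.0  # running average of rewards, used to reduce variance
    for _ in range(steps):
        probs = softmax(logits)
        action = rng.choices(range(n_actions), weights=probs)[0]
        reward = reward_fn(action)
        advantage = reward - baseline
        baseline += 0.05 * (reward - baseline)
        # d/d logit_k of log pi(action) = 1[k == action] - probs[k]
        for k in range(n_actions):
            grad = (1.0 if k == action else 0.0) - probs[k]
            logits[k] += lr * advantage * grad
    return softmax(logits)


# Hypothetical rewards: response 1 is assumed to be the most useful one.
TOY_REWARDS = [0.1, 1.0, 0.3]
final_probs = reinforce(lambda a: TOY_REWARDS[a], n_actions=3)
print(final_probs)
```

In a dialogue setting the action would be a whole generated sequence and the reward would be computed over a simulated conversation, but the update rule has the same shape: raise the log-probability of sampled outputs in proportion to how much their reward exceeds a baseline.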