Why does a classic neural network perform better than an LSTM in sentiment analysis?
My goal is to predict the polarity of some reviews (negative, positive or neutral). I tried two different neural networks:
import numpy as np
from keras.layers import Input, Dense, Concatenate
from keras.models import Model

# First model (~80% accuracy): two dense branches merged before the softmax output.
# review_matrix, X_train and labels are prepared earlier in the script.
left_branch = Input((7000,))
left_branch_dense = Dense(512, activation='relu')(left_branch)
right_branch = Input((14012,))
right_branch_dense = Dense(512, activation='relu')(right_branch)
merged = Concatenate()([left_branch_dense, right_branch_dense])
output_layer = Dense(3, activation='softmax')(merged)  # 3 classes: negative / neutral / positive
model = Model(inputs=[left_branch, right_branch], outputs=output_layer)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit([np.array(review_matrix), np.array(X_train)], labels, epochs=2, verbose=1)
model.save('model.merged')
#############################################################################
# Second model (~70% accuracy): replace the left dense branch with an
# embedding + LSTM branch over the integer-encoded review text.
#############################################################################
import pandas as pd
from keras.layers import Embedding, LSTM, Dropout

# Prepare the review column for the embedding layer:
review_matrix_for_embedding = prepare_for_encoding(train_set[4].tolist(), 7000)  # shape: (1503, 100)
second_matrix = np.array(pd.concat([onehot_category, aspect_matrix], axis=1))

left_branch = Input(shape=(100,), dtype='int32')
# input_dim: vocabulary size (should be at least the maximum integer index + 1)
# output_dim: size of each embedding vector
# input_length: length of each input sequence
left_branch_embedding = Embedding(7000, 300, input_length=100)(left_branch)
lstm_out = LSTM(256)(left_branch_embedding)
lstm_out = Dropout(0.7)(lstm_out)
lstm_out = Dense(128, activation='sigmoid')(lstm_out)
right_branch = Input((7012,))
merged = Concatenate()([lstm_out, right_branch])
output_layer = Dense(3, activation='softmax')(merged)
model = Model(inputs=[left_branch, right_branch], outputs=output_layer)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit([review_matrix_for_embedding, second_matrix], labels, epochs=5, verbose=1)
The first one reaches about 80% accuracy, while the second one, with embedding vectors and an LSTM layer, only reaches about 70%. How is that possible? Is there anything wrong in my architecture?
Tags: keras, nlp
asked 2 days ago by nolw38 · edited 2 days ago by Nischal Hp
2 Answers
First of all, I noticed that you used a sigmoid activation for the Dense layer after your LSTM, whereas in your plain network you used ReLU. This may, possibly, be one reason for the lower performance, because sigmoid activations suffer from two problems (a small numerical sketch follows the list below):
- Saturating gradients: the sigmoid flattens out in its tails, so in those regions the gradient practically vanishes, which hampers backpropagation and training.
- Non-zero-centered outputs: this is an issue for the gradient computation during backpropagation. Because all the sigmoid outputs feeding the next layer are positive (or, in other settings, all negative), the weight gradients share the same sign, which adds a 'zigzag' effect to gradient descent and makes training harder.
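As an illustration of the first point (a minimal NumPy sketch, not from the original answer), the sigmoid gradient shrinks rapidly once the input moves away from zero, while the ReLU gradient stays at 1 for any positive input:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)  # peaks at 0.25 when x = 0

for x in [0.0, 2.0, 5.0, 10.0]:
    print(f"x = {x:5.1f}   d(sigmoid)/dx = {sigmoid_grad(x):.6f}")
# The gradient drops from 0.25 at x = 0 to roughly 0.000045 at x = 10,
# so neurons operating in the tails contribute almost nothing to backpropagation.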
My comments above are adapted from this excellent tutorial, which you should read: http://cs231n.github.io/neural-networks-1/
I tried to summarize it, but they did a masterful job, so I think you should read it in full.
Second, we must consider other factors. Are you analysing your train/validation/test losses? Maybe your LSTM network simply takes longer to train and has not reached its minimum yet. You should look into these aspects before drawing any conclusions. Plot your training and validation loss so we can assess whether your model is underfitting.
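For example, a minimal sketch of how those curves could be obtained (assuming matplotlib is available, and adding a validation split that is not in your original fit call):

import matplotlib.pyplot as plt

history = model.fit([review_matrix_for_embedding, second_matrix], labels,
                    epochs=20, validation_split=0.2, verbose=1)

plt.plot(history.history['loss'], label='training loss')
plt.plot(history.history['val_loss'], label='validation loss')
plt.xlabel('epoch')
plt.ylabel('categorical cross-entropy')
plt.legend()
plt.show()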
Lastly, I have a question for you: why should your LSTM network perform better than a simple feed-forward network in the first place? Although LSTMs with embeddings are powerful techniques that have gained a lot of attention, especially in NLP, it is not on every task that they beat classical approaches. I have tried this myself on different data sets, and depending on the application, simple ML algorithms such as an SVM can easily beat the more complex models, sentiment analysis included.
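As a point of comparison, a classical baseline could look like the sketch below (not from the original answer; `texts` and `y` are hypothetical names for your raw review strings and polarity labels):

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

# texts: list of raw review strings, y: polarity labels -- hypothetical placeholders
baseline = make_pipeline(TfidfVectorizer(max_features=7000, ngram_range=(1, 2)),
                         LinearSVC())
scores = cross_val_score(baseline, texts, y, cv=5, scoring='accuracy')
print(f"TF-IDF + LinearSVC accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")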
So, to conclude, try these things and let us know your results. Also, if anyone disagrees with my answer, I would be glad to discuss it. I hope this helps.
answered 2 days ago by Victor Oliveira
Thank you for your answer.
I changed sigmoid to ReLU, and the result is the same. Anyway, I will keep ReLU now!
Here are pictures of training for both the model that reaches 79% accuracy (the two merged classic neural networks) and the one that reaches 70% (of course I am talking about the test accuracy, not the training accuracy).
I see a difference in the loss value, but I don't know how to interpret it. Does this mean that the weaker architecture does not reach a minimum? If so, what can I modify in my network?
Thank you for the time you took to answer!
Edit: where can I modify parameters like the learning rate in the LSTM part of the network?
left_branch = Input(shape=(100,), dtype='int32')
# input_dim: vocabulary size (should be at least the maximum integer index + 1)
# output_dim: size of each embedding vector
# input_length: length of each input sequence
left_branch_embedding = Embedding(7000, 300, input_length=100)(left_branch)
lstm_out = LSTM(256)(left_branch_embedding)
lstm_out = Dropout(0.7)(lstm_out)
lstm_out = Dense(128, activation='sigmoid')(lstm_out)
right_branch = Input((7012,))
merged = Concatenate()([lstm_out, right_branch])
output_layer = Dense(3, activation='softmax')(merged)
model = Model(inputs=[left_branch, right_branch], outputs=output_layer)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
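For reference on the learning-rate question above: a minimal sketch of one way to set it is to pass an optimizer instance to compile instead of the string 'adam' (the 0.0005 value is only an illustrative placeholder, not from the original post):

from keras.optimizers import Adam

model.compile(loss='categorical_crossentropy',
              optimizer=Adam(lr=0.0005),  # explicit learning rate instead of the default 0.001; newer Keras versions use learning_rate=
              metrics=['accuracy'])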
answered yesterday by nolw38