How to handle negative words in word2vec?
I am training a large corpus with word2vec and averaging the word vectors to obtain sentence vectors. What is the best way to handle negation words so that negative and positive sentences end up far apart? For example, "After the fix the code worked" and "After the fix the code did not work" should ideally yield sentence vectors that are far from each other. One approach I have heard of is to look for negation words such as "not" and negate the next word vector. Is that a good approach, or is there a better one?
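For concreteness, below is a minimal sketch of that negate-the-next-word heuristic on top of plain averaging, using gensim's KeyedVectors; the model file name and the negation word list are illustrative assumptions:

import numpy as np
from gensim.models import KeyedVectors

# Hypothetical pretrained vectors; swap in your own model file.
wv = KeyedVectors.load_word2vec_format('vectors.bin', binary=True)
NEGATIONS = {'not', 'no', 'never'}

def sentence_vector(tokens):
    # Average word vectors, flipping the sign of the word right after a negation.
    vecs, flip = [], False
    for tok in tokens:
        if tok in NEGATIONS:
            flip = True  # skip the negation itself; negate the next known word
            continue
        if tok in wv:
            vecs.append(-wv[tok] if flip else wv[tok])
            flip = False
    return np.mean(vecs, axis=0) if vecs else np.zeros(wv.vector_size)

v_pos = sentence_vector('after the fix the code worked'.split())
v_neg = sentence_vector('after the fix the code did not work'.split())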
machine-learning neural-network nlp word2vec
asked Dec 17 '16 at 11:36 · Shamy
Don't average them; use a document vector model such as paragraph2vec, and see the sentiment-analysis results in that paper's Experiments section for a performance evaluation. – Emre, Dec 17 '16 at 17:32
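A minimal sketch of the paragraph2vec (doc2vec) route Emre suggests, using gensim's Doc2Vec; the toy corpus and hyperparameters are illustrative only:

from gensim.models.doc2vec import Doc2Vec, TaggedDocument

# Toy corpus; in practice, train on the full corpus.
sentences = ['after the fix the code worked',
             'after the fix the code did not work']
docs = [TaggedDocument(words=s.split(), tags=[i]) for i, s in enumerate(sentences)]

model = Doc2Vec(docs, vector_size=100, window=5, min_count=1, epochs=40)

# Infer a vector for an unseen sentence instead of averaging word vectors.
vec = model.infer_vector('the fix did not work'.split())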
4 Answers
When you look at the vectors that word2vec generates, negation words may have distinctive features, but the model treats them just like any other word: as far as the network is concerned, they are simply words that occur in similar contexts. You may have to construct "concept vectors" on top of the word vectors to achieve what you want.

Your part-of-speech tagging should automatically mark negating words as ADV. You can then train on these adverbs in conjunction with your verbs to predict a positive or negative output. Here's an example using spaCy:

import spacy

nlp = spacy.load('en')  # loading can take a while; newer spaCy versions use 'en_core_web_sm'
sample_text = u'Do not go.'
parsed_text = nlp(sample_text)
token_text = [token.orth_ for token in parsed_text]  # surface form of each token
token_pos = [token.pos_ for token in parsed_text]    # coarse part-of-speech tag

At this point token_text holds the words and token_pos their POS tags:

Do - VERB
not - ADV
go - VERB
. - PUNCT

As you can see, "not" is tagged ADV. You can now feed this tagged output (or a better parse tree) into a second network trained to produce a negative or positive output.

Hope this helps.

answered Dec 18 '16 at 15:21 · Daniel Wee
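A minimal sketch of that second stage: a bag of word_POS features with logistic regression stands in here for the "second network", and the two-sentence training set is purely illustrative (scikit-learn assumed):

import spacy
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

nlp = spacy.load('en')  # or 'en_core_web_sm' on newer spaCy

def pos_augmented(text):
    # Tag each token with its POS so "not_ADV" becomes a distinct feature.
    return ' '.join('{}_{}'.format(t.orth_, t.pos_) for t in nlp(text))

# Hypothetical labels: 1 = positive outcome, 0 = negative outcome.
texts = ['After the fix the code worked.', 'After the fix the code did not work.']
labels = [1, 0]

clf = make_pipeline(CountVectorizer(), LogisticRegression())
clf.fit([pos_augmented(t) for t in texts], labels)
print(clf.predict([pos_augmented('The code did not work.')]))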
It is possible to refine word2vec vectors, which research shows capture both semantic relatedness and semantic similarity, so that they also capture relations between words such as antonymy or negation. Take a look at the counter-fitting method (or the methods discussed in its related work); the authors' implementation should be available online. This may improve the results of your sentiment-analysis method.

answered Apr 2 '18 at 14:45 · Smarty77
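A minimal numpy sketch of the core counter-fitting idea, not the authors' code or exact objective: repel antonym pairs, attract synonym pairs, and stay close to the original vectors. The word lists, learning rate, and iteration count are illustrative assumptions:

import numpy as np

def counter_fit(vecs, synonyms, antonyms, iters=50, lr=0.05, reg=0.1):
    # Start from unit-normalised copies of the original vectors.
    orig = {w: v / np.linalg.norm(v) for w, v in vecs.items()}
    cur = {w: v.copy() for w, v in orig.items()}
    for _ in range(iters):
        step = {w: reg * (orig[w] - cur[w]) for w in cur}  # stay near originals
        for a, b in antonyms:
            step[a] -= lr * cur[b]  # push antonyms apart
            step[b] -= lr * cur[a]
        for a, b in synonyms:
            step[a] += lr * cur[b]  # pull synonyms together
            step[b] += lr * cur[a]
        for w in cur:
            v = cur[w] + step[w]
            cur[w] = v / np.linalg.norm(v)  # renormalise to unit length
    return cur

# Toy usage with random vectors; real use starts from trained word2vec vectors.
rng = np.random.default_rng(0)
vecs = {w: rng.normal(size=50) for w in ('work', 'succeed', 'fail')}
fitted = counter_fit(vecs, synonyms=[('work', 'succeed')], antonyms=[('work', 'fail')])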
You can check this link; a way of handling negation is suggested there [1].

answered 20 mins ago · Behzad Mirzababaei (new contributor)

Please don't give link-only answers; the link might go dead after a while. Summarize the linked content and cite the link as a source, but always include the answer itself as text. – Tasos, 2 mins ago
You can see the paper "Querying Word Embeddings for Similarity and Relatedness".

answered Nov 3 '18 at 14:21 · Fatma.S.Gadelrab (edited Nov 3 '18 at 14:46 by Stephen Rauch♦)