What does coefficient mean for a binary independent variable in multiple linear regression?
$begingroup$
Let's say I have a multiple linear regression model where my dependent variable, Y, is an integer, and one of my independent variables, x1, is binary: either 0 or 1.
We know that the sign of the coefficient for x1 in the model, positive or negative, indicates its correlation with Y. My question is: is there any way to know how each value of x1, 0 or 1, is individually related to the dependent variable? For instance, can we tell whether 0 is the reason for the sign of the coefficient, so that more 0s lead to a larger Y? (I'm not sure this is the correct language, but I hope you understand what I am asking.)
A real-world example: I have a multiple linear regression model to find the correlation of a set of numeric indexes in tweets, as independent variables, with the number of retweets as the dependent variable. One of the independent variables is the tweet's fact-checking label, "true" or "false", which we call "truth_label". I have trained the model, and the coefficient of truth_label is positive, meaning that it has a positive correlation with the number of retweets. But I would like to know whether that positive correlation is due to the "true" values or the "false" values. I hope this example makes my question a bit clearer.
linear-regression correlation
$endgroup$
asked 1 hour ago
Pedram
1 Answer
$begingroup$
Your question is a little confusing, but I will try to answer it.
You are asking whether the sign of the coefficient equals the sign of the correlation, that is, whether a positive coefficient in the regression implies a positive correlation between $Y$ and the binary $X$.
It depends.
This is true if you run a regression in which that variable is the only regressor.
The equation $\text{Retweets} = \beta_0 + \beta_1 \cdot \text{truth.label}$ satisfies this.
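For the single-regressor case, the reason can be written out. This is a sketch of the standard OLS result, assuming truth.label is coded 0/1; the symbols $\bar{y}_0$ and $\bar{y}_1$ (mean Retweets in the 0 group and the 1 group), $r_{xy}$ (sample correlation) and $s_x$, $s_y$ (sample standard deviations) are notation introduced here, not taken from the question:

$$
\hat{\beta}_0 = \bar{y}_0, \qquad
\hat{\beta}_1 = \bar{y}_1 - \bar{y}_0 = \frac{\widehat{\mathrm{Cov}}(x, y)}{\widehat{\mathrm{Var}}(x)} = r_{xy}\,\frac{s_y}{s_x}
$$

Since $s_x$ and $s_y$ are positive, $\hat{\beta}_1$ and $r_{xy}$ share the same sign. Note also that the coefficient is a contrast between the two groups: a positive $\hat{\beta}_1$ says the group coded 1 has a higher mean than the group coded 0, so the sign cannot be attributed to either value on its own, and it flips if the coding is swapped.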
The model $\text{Retweets} = \beta_0 + \beta_1 \cdot \text{truth.label.1} + \beta_2 \cdot \text{truth.label.2}$ does not necessarily satisfy it.
Why? When a model has two or more variables, they complement each other: each one picks up what the other misses (in terms of errors). Each coefficient is therefore estimated with the other variables held fixed, and its sign does not have to match the sign of the correlation between that variable and $Y$.
This holds regardless of the type of variable: binary, integer, or continuous.
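To make the two cases concrete, here is a minimal sketch on simulated data (not the asker's tweets). Everything in it is an assumption for illustration: the binary `x1` stands in for truth_label, `x2` is a hypothetical second regressor that is correlated with `x1`, the data-generating coefficients are made up, and `ols` is a small helper built on `numpy.linalg.lstsq`.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Hypothetical data: x1 is binary, x2 is strongly related to x1,
# and y depends negatively on x1 but positively on x2.
x1 = rng.integers(0, 2, size=n).astype(float)
x2 = 3.0 * x1 + rng.normal(size=n)
y = -1.0 * x1 + 1.0 * x2 + rng.normal(size=n)


def ols(y, *columns):
    """Least-squares fit with an intercept; returns [intercept, coefficients...]."""
    X = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


simple = ols(y, x1)        # y ~ x1
multiple = ols(y, x1, x2)  # y ~ x1 + x2

corr = np.corrcoef(x1, y)[0, 1]
mean_diff = y[x1 == 1].mean() - y[x1 == 0].mean()

print("correlation(x1, y):          ", corr)
print("mean(y | x1=1) - mean(y | 0):", mean_diff)
print("slope of x1, simple model:   ", simple[1])
print("slope of x1, multiple model: ", multiple[1])
```

In this setup the simple-regression slope equals the difference in group means and has the same sign as the correlation, while the coefficient of `x1` in the two-variable model comes out with the opposite sign, because `x2` absorbs the positive association.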
$endgroup$
answered 1 hour ago
Juan Esteban de la Calle