Difference between output of probabilistic and ordinary least squares regressions
If I execute the commands
from sklearn.linear_model import LinearRegression
my_reg = LinearRegression()
my_reg.fit(X, Y)
I train my model. As I understand it, training a model means calculating the coefficient estimates.
I do not really understand the difference between this and, e.g.,
scipy.stats.linregress(X, Y)
which calculates a 'normal' regression and also gives me the coefficient estimates along with the other statistics connected with them.
Could anyone tell me what the difference is?
machine-learning linear-regression
edited yesterday
Esmailian
asked 2 days ago
ruedi
2 Answers
They both solve the exact same objective, which is minimizing the mean squared error. However, the second method can also answer "how confident can we be that the slope is not zero, i.e. that $Y$ is correlated with $X$?" via a p-value.
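To make the "same objective" point concrete, here is a minimal sketch that fits both methods on the same data and compares the estimates. The synthetic data, the seed, and the variable names other than `my_reg` are assumptions made for illustration only.

```python
import numpy as np
from scipy import stats
from sklearn.linear_model import LinearRegression

# Synthetic data (an assumption for illustration): y = 2.1 * x + 0.5 + noise
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=200)
y = 2.1 * x + 0.5 + rng.normal(scale=1.0, size=200)

# scikit-learn expects a 2-D feature matrix of shape (n_samples, n_features)
my_reg = LinearRegression()
my_reg.fit(x.reshape(-1, 1), y)

# scipy.stats.linregress expects two 1-D arrays of equal length
result = stats.linregress(x, y)

print("sklearn slope, intercept:", my_reg.coef_[0], my_reg.intercept_)
print("scipy   slope, intercept:", result.slope, result.intercept)
```

Both lines should print the same slope and intercept up to floating-point error, since both are ordinary least squares fits of a line with an intercept.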
In detail
Let's denote the data as $(X, Y) = \{(x_n, y_n) \mid x_n \in \mathbb{R}^D, y_n \in \mathbb{R}\}$ and the regression as $\hat{y} = Ax + B$. Note that scipy.stats.linregress handles a single predictor only, so $D = 1$ for that method.
The extra quantities returned by scipy.stats.linregress(X, Y) include rvalue ($r$) and pvalue ($p$).
In statistics, $r^2$ (known as r-squared) measures the "goodness of fit". That is, as the regression $\hat{y} = Ax + B$ gets closer to the observations $y$, $r^2$ gets closer to $1$. Since it is a function of $y$ and $\hat{y}$, it can be calculated for the first method too, so there is no difference here.
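As a rough check of the claim that $r^2$ is available for both methods, the sketch below computes it three ways: from its definition using the scikit-learn predictions, via the model's `score` method, and by squaring `rvalue` from scipy. The data and variable names are made up for illustration.

```python
import numpy as np
from scipy import stats
from sklearn.linear_model import LinearRegression

# Synthetic data (an assumption for illustration)
rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=100)
y = 2.1 * x + rng.normal(scale=3.0, size=100)

X = x.reshape(-1, 1)  # scikit-learn wants a 2-D feature matrix
my_reg = LinearRegression().fit(X, y)
y_hat = my_reg.predict(X)

# r^2 from its definition: 1 - (residual sum of squares) / (total sum of squares)
ss_res = np.sum((y - y_hat) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r2_manual = 1.0 - ss_res / ss_tot

# The same quantity as reported by each library
r2_sklearn = my_reg.score(X, y)
r2_scipy = stats.linregress(x, y).rvalue ** 2

print(r2_manual, r2_sklearn, r2_scipy)
```

The three values agree for a single predictor with an intercept, which is the only case linregress covers.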
However, $p$ is specific to the second method. scipy.stats.linregress(X, Y) adds a normality assumption on the noise, i.e. it assumes $\epsilon \sim N(0, \sigma^2)$, where $$\epsilon = y - \overbrace{(Ax+B)}^{\hat{y}}$$
On the basis of this assumption, it can answer an additional question: "how confident can we be that the slope is not zero?" The first method does not report anything that answers this question.
For example, suppose the estimated slope is $2.1$ for both methods. We still cannot tell whether this slope is significant or whether $Y$ is actually independent of $X$ unless we look at the value of $p$. For example, for $p < 0.01$ we can reject a zero slope at significance level $0.01$ and conclude that $Y$ is correlated with $X$, but for $p > 0.1$ we cannot be confident, i.e. the slope $2.1$ could be due to chance and $Y$ might be independent of $X$.
This link gives more details on how the p-value is actually calculated in the second method.
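For readers who want to see the calculation itself, here is a sketch of the standard two-sided t-test for a zero slope with $n - 2$ degrees of freedom, compared against what linregress reports. The helper name `slope_p_value` and the noisy synthetic data are hypothetical, chosen so that a true slope of $2.1$ may or may not come out significant.

```python
import numpy as np
from scipy import stats


def slope_p_value(x, y):
    """Two-sided p-value for H0: slope = 0 in simple linear regression.

    Hypothetical helper written for this illustration.
    """
    n = len(x)
    x_centered = x - x.mean()
    sxx = np.sum(x_centered ** 2)

    # Ordinary least squares estimates of slope and intercept
    slope = np.sum(x_centered * (y - y.mean())) / sxx
    intercept = y.mean() - slope * x.mean()

    # Unbiased estimate of the noise variance sigma^2 (n - 2 degrees of freedom)
    residuals = y - (slope * x + intercept)
    sigma2_hat = np.sum(residuals ** 2) / (n - 2)

    # Standard error of the slope and the resulting t statistic
    se_slope = np.sqrt(sigma2_hat / sxx)
    t_stat = slope / se_slope

    # Two-sided tail probability under Student's t with n - 2 degrees of freedom
    p_value = 2.0 * stats.t.sf(abs(t_stat), df=n - 2)
    return slope, se_slope, p_value


# Few points and heavy noise (an assumption), so the slope is hard to pin down
rng = np.random.default_rng(2)
x = rng.uniform(0, 10, size=30)
y = 2.1 * x + rng.normal(scale=20.0, size=30)

slope, se_slope, p_manual = slope_p_value(x, y)
result = stats.linregress(x, y)

print("manual:", slope, se_slope, p_manual)
print("scipy: ", result.slope, result.stderr, result.pvalue)
```

The manual and scipy numbers should match. With fewer points or more noise the p-value tends to grow, which is the "could be due to chance" case described above.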
edited 22 hours ago
answered 2 days ago
Esmailian
There is no difference in the conceptual sense: both methods calculate linear regression coefficients. The difference lies in the interface. With scipy.stats you get the coefficients directly (and it is up to you to put them into an equation to calculate the predictions), whereas scikit-learn wraps them in a model object so that you can use it in the same way as other ML models such as decision trees. (You can still obtain the regression coefficients from the fitted scikit-learn model using my_reg.coef_ and my_reg.intercept_.)
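A small sketch of the interface difference, assuming made-up data and new inputs `x_new`: with scipy you plug the coefficients into the line equation yourself, while scikit-learn exposes a `predict` method on the fitted model.

```python
import numpy as np
from scipy import stats
from sklearn.linear_model import LinearRegression

# Synthetic training data and new inputs (assumptions for illustration)
rng = np.random.default_rng(3)
x = rng.uniform(0, 10, size=50)
y = 2.1 * x + 0.5 + rng.normal(size=50)
x_new = np.array([1.0, 5.0, 9.0])

# scipy: take the coefficients and evaluate the line equation by hand
result = stats.linregress(x, y)
pred_scipy = result.intercept + result.slope * x_new

# scikit-learn: the fitted model object makes the predictions
my_reg = LinearRegression().fit(x.reshape(-1, 1), y)
pred_sklearn = my_reg.predict(x_new.reshape(-1, 1))

print(pred_scipy)
print(pred_sklearn)
```

The two prediction arrays should agree, which reflects that only the wrapping differs, not the fitted line.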
answered 2 days ago
Jan Šimbera