Difference between output of probabilistic and ordinary least squares regressions












If I execute the commands

from sklearn.linear_model import LinearRegression
my_reg = LinearRegression()
my_reg.fit(X, Y)

I train my model. As I understand it, training a model means calculating the coefficient estimators.

I do not really understand the difference between this and, e.g.,

scipy.stats.linregress(X, Y)

which calculates a 'normal' regression that also gives me the coefficient estimators and all the other statistics connected with it.

Could anyone tell me what the difference is here?

machine-learning linear-regression

edited yesterday by Esmailian
asked 2 days ago by ruedi

2 Answers

They both solve the exact same objective, which is minimizing the mean squared error. However, the second method can also answer "how confident are we that the slope is not zero, i.e. that $Y$ is correlated with $X$?" via a p-value.

In detail

Let's denote the data as $(X, Y) = \{(x_n, y_n) \mid x_n \in \mathbb{R}^D, y_n \in \mathbb{R}\}$ and the regression as $\hat{y} = Ax + B$. Note that scipy.stats.linregress handles only a single predictor ($D = 1$), whereas LinearRegression accepts any $D$.

Extra quantities returned by scipy.stats.linregress(X,Y) include rvalue ($r$) and pvalue ($p$).

In statistics, $r^2$ (known as r-squared) measures the "goodness-of-fit". That is, as the regression $\hat{y} = Ax + B$ gets closer to the observations $y$, $r^2$ gets closer to $1$. Since it is a function of $y$ and $\hat{y}$, it can be calculated for the first method too. So there is no difference here.
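
As a rough check of the claim that both methods share the same fit and the same $r^2$, here is a minimal sketch. The synthetic data, the seed, and the variable names `x`, `y`, `res`, `r2_sklearn` and `r2_scipy` are assumptions made for illustration; they are not part of the question.

```python
import numpy as np
from scipy import stats
from sklearn.linear_model import LinearRegression

# Synthetic data (an assumption for illustration): y = 2.1 * x + 0.5 + noise
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, size=50)
y = 2.1 * x + 0.5 + rng.normal(0.0, 1.0, size=50)

# First method: scikit-learn expects a 2-D feature matrix (n_samples, n_features)
X = x.reshape(-1, 1)
my_reg = LinearRegression().fit(X, y)
r2_sklearn = my_reg.score(X, y)  # coefficient of determination on the training data

# Second method: scipy.stats.linregress takes the 1-D arrays directly
res = stats.linregress(x, y)
r2_scipy = res.rvalue ** 2

print("slope:    ", my_reg.coef_[0], res.slope)
print("intercept:", my_reg.intercept_, res.intercept)
print("r^2:      ", r2_sklearn, r2_scipy)
```

With a single predictor and an intercept, the two printed values on each line should agree up to floating-point error.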



However, $p$ is specific to the second method. scipy.stats.linregress(X,Y) adds a normality assumption on the noise, i.e. it assumes $\epsilon \sim N(0, \sigma^2)$ where $$\epsilon = y - \overbrace{(Ax + B)}^{\hat{y}}$$
On the basis of this assumption, it can answer an additional question: "how confident are we that the slope is not zero?". LinearRegression does not report a p-value, so the first method cannot answer this question on its own.

For example, suppose the estimated slope is $2.1$ for both methods. We still cannot tell whether this slope is significant or whether $Y$ is actually independent of $X$ unless we look at the value of $p$. For $p < 0.01$ we are confident (at significance level $0.01$) that $Y$ is correlated with $X$, but for $p > 0.1$ we cannot be confident, i.e. the slope $2.1$ could be due to chance and $Y$ might be independent of $X$.
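
To make the role of the normality assumption more concrete, the following sketch recomputes the p-value by hand from a scikit-learn fit and compares it with the one linregress reports. It assumes the default two-sided test of the null hypothesis "slope is zero" with a Student's t statistic on $n - 2$ degrees of freedom. The data and the names `se_slope`, `t_stat` and `p_manual` are illustrative assumptions.

```python
import numpy as np
from scipy import stats
from sklearn.linear_model import LinearRegression

# Synthetic data (an assumption for illustration)
rng = np.random.default_rng(1)
n = 40
x = rng.uniform(0.0, 10.0, size=n)
y = 2.1 * x + rng.normal(0.0, 5.0, size=n)

# Fit with scikit-learn, which gives coefficients but no p-value
X = x.reshape(-1, 1)
my_reg = LinearRegression().fit(X, y)
slope = my_reg.coef_[0]

# Estimate the noise variance from the residuals, using n - 2 degrees of freedom
residuals = y - my_reg.predict(X)
sigma2_hat = np.sum(residuals ** 2) / (n - 2)

# Standard error of the slope and the t statistic for H0: slope = 0
se_slope = np.sqrt(sigma2_hat / np.sum((x - x.mean()) ** 2))
t_stat = slope / se_slope

# Two-sided p-value from Student's t distribution
p_manual = 2.0 * stats.t.sf(abs(t_stat), df=n - 2)

# Compare with the values reported by scipy.stats.linregress
res = stats.linregress(x, y)
print("p-value:", p_manual, res.pvalue)
print("stderr: ", se_slope, res.stderr)
```

So the p-value is not out of reach for the first method; it simply has to be derived separately from the fitted coefficients and residuals.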



This link gives more details on how the p-value is actually calculated in the second method.

edited 22 hours ago
answered 2 days ago by Esmailian

There is no conceptual difference: both methods calculate linear regression coefficients. The difference lies in the interface. scipy.stats returns the coefficients directly, and it is up to you to put them into an equation to calculate predictions, whereas scikit-learn wraps them in a model object that you can use in the same fashion as other ML models, such as decision trees. (You can still obtain the regression coefficients from the fitted scikit-learn model using my_reg.coef_ and my_reg.intercept_.)
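
The interface difference described above might look like this in practice. This is a sketch under assumed synthetic data; `x_new`, `pred_scipy` and `pred_sklearn` are hypothetical names chosen for the example.

```python
import numpy as np
from scipy import stats
from sklearn.linear_model import LinearRegression

# Synthetic training data and new inputs (assumptions for illustration)
rng = np.random.default_rng(2)
x = rng.uniform(0.0, 10.0, size=30)
y = 2.1 * x + 0.5 + rng.normal(0.0, 1.0, size=30)
x_new = np.array([1.0, 5.0, 9.0])

# scipy.stats: the coefficients come back directly, predictions are up to you
res = stats.linregress(x, y)
pred_scipy = res.slope * x_new + res.intercept

# scikit-learn: the coefficients live on a fitted model object with predict()
my_reg = LinearRegression().fit(x.reshape(-1, 1), y)
pred_sklearn = my_reg.predict(x_new.reshape(-1, 1))

print("coefficients:", my_reg.coef_, my_reg.intercept_)
print("predictions: ", pred_scipy, pred_sklearn)
```

Because the model object exposes the same fit/predict methods as other scikit-learn estimators, it can be swapped for, say, a decision tree without changing the surrounding code.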







answered 2 days ago by Jan Šimbera