What does it mean when I add a new variable to my linear model and the R^2 stays the same?

I'm inclined to think that the new variable is not correlated with the response. But could the new variable instead be correlated with another variable already in the model?

linear-model r-squared

asked yesterday by Chance113

  • It depends. Could you provide us with a reduced sample of your data or the output from your linear models? Without more information it's hard to assist you. – OliverFishCode, yesterday

  • It shouldn't stay exactly the same unless the new variable is perfectly orthogonal to your response, or is a linear combination of the variables already included. It may be that the change is too small to show up in the number of decimal places displayed. – gung, yesterday

  • @gung What you can infer is that the new variable is orthogonal to the response modulo the subspace generated by the other variables. That's more general than the two options you mention. – whuber, yesterday

  • @whuber, yes, I suppose so. – gung, yesterday

  • Test your variables for multicollinearity (en.wikipedia.org/wiki/Multicollinearity); probably some features are linearly related. Use the caret package and vif() (which comes from the car package) in R (sthda.com/english/articles/39-regression-model-diagnostics/…). – Tom Zinger, 23 hours ago
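
whuber's comment can be checked numerically. The following is a minimal sketch, not taken from the thread: the data, the helper names `dot`, `orthogonal_basis` and `r_squared`, and the choice of dependency-free Python are all assumptions made for illustration. It builds an `x2` that is clearly correlated with the response and is not a linear combination of `x1`, yet leaves $R^2$ unchanged, because the part of `x2` that `x1` does not explain is orthogonal to the residuals of the current model.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def orthogonal_basis(columns, n, tol=1e-10):
    """Gram-Schmidt basis for span{intercept, columns}; redundant columns are dropped."""
    basis = []
    for col in [[1.0] * n] + [list(map(float, c)) for c in columns]:
        v = col[:]
        for b in basis:
            coef = dot(v, b) / dot(b, b)
            v = [vi - coef * bi for vi, bi in zip(v, b)]
        if dot(v, v) > tol * dot(col, col):
            basis.append(v)
    return basis


def r_squared(columns, y):
    """R^2 of the least-squares fit of y on an intercept plus the given columns."""
    n = len(y)
    fitted = [0.0] * n
    for b in orthogonal_basis(columns, n):
        coef = dot(y, b) / dot(b, b)
        fitted = [f + coef * bi for f, bi in zip(fitted, b)]
    y_bar = sum(y) / n
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    tss = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - rss / tss


x1 = [1, 2, 3, 4, 5, 6]
e = [1, -2, 1, 0, 0, 0]   # orthogonal to the intercept and to x1
w = [0, 0, 0, 1, -2, 1]   # orthogonal to the intercept, to x1 and to e
y = [2 * a + b for a, b in zip(x1, e)]   # response
x2 = [a + b for a, b in zip(x1, w)]      # new variable: x1 plus something unrelated to e

r2_old = r_squared([x1], y)
r2_new = r_squared([x1, x2], y)
r2_x2_alone = r_squared([x2], y)   # well above zero: x2 is correlated with y

print(r2_old, r2_new, r2_x2_alone)
```

Here `x2` on its own explains a good share of `y`, so "the new variable is not correlated with the response" would be the wrong conclusion to draw from the unchanged $R^2$.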




















2 Answers

Seeing little to no change in $R^2$ when you add a variable to a linear model means that the variable adds little to no explanatory power for the response beyond what is already in your model. As you note, this can be either because it tells you almost nothing about the response or because it explains the same variation in the response as the variables already in the model.

answered yesterday by TrynnaDoStat
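
Both cases in this answer can be read off one standard least-squares identity. The notation is introduced here and is not from the thread: $X$ is the design matrix of the variables already in the model (intercept included, full column rank), $M = I - X(X^\top X)^{-1}X^\top$ is its residual-maker matrix, $z$ is the new variable with $Mz \neq 0$, and $\mathrm{SST}$ is the total sum of squares of the response $y$. Under those assumptions:

$$
R^2_{\text{new}} - R^2_{\text{old}} \;=\; \frac{\left(z^\top M y\right)^2}{\left(z^\top M z\right)\,\mathrm{SST}}
$$

So the increase is small whenever $Mz$, the part of $z$ that the existing variables do not explain, is nearly uncorrelated with the current residuals $My$, and it is exactly zero when $z^\top M y = 0$. If instead $Mz = 0$, then $z$ is an exact linear combination of the existing variables and the fitted values cannot change at all.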

As others have alluded to, seeing no change at all in $R^2$ when you add a variable to your regression is unusual. In finite samples, this should in practice only happen when your new variable is a linear combination of variables already present (the other route, a new variable whose unexplained part is exactly orthogonal to the response, essentially never occurs by chance in real data). In the linear-combination case, most standard regression routines simply exclude that variable from the regression, and your $R^2$ will remain unchanged because the model was effectively unchanged.
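
A minimal sketch of the linear-combination case, using made-up data and hypothetical helper names (`dot`, `orthogonal_basis`, `r_squared`), not code from this answer. The helper drops any column whose leftover part is numerically zero, which imitates in spirit, but is not, what a regression routine does when it excludes a redundant variable.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def orthogonal_basis(columns, n, tol=1e-10):
    """Gram-Schmidt basis for span{intercept, columns}; redundant columns are dropped."""
    basis = []
    for col in [[1.0] * n] + [list(map(float, c)) for c in columns]:
        v = col[:]
        for b in basis:
            coef = dot(v, b) / dot(b, b)
            v = [vi - coef * bi for vi, bi in zip(v, b)]
        if dot(v, v) > tol * dot(col, col):
            basis.append(v)
    return basis


def r_squared(columns, y):
    """R^2 of the least-squares fit of y on an intercept plus the given columns."""
    n = len(y)
    fitted = [0.0] * n
    for b in orthogonal_basis(columns, n):
        coef = dot(y, b) / dot(b, b)
        fitted = [f + coef * bi for f, bi in zip(fitted, b)]
    y_bar = sum(y) / n
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    tss = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - rss / tss


x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
y = [3, 2, 7, 8, 10, 12]
x3 = [2 * a - b for a, b in zip(x1, x2)]   # exact linear combination of x1 and x2

kept_old = len(orthogonal_basis([x1, x2], len(y)))      # intercept, x1, x2
kept_new = len(orthogonal_basis([x1, x2, x3], len(y)))  # x3 adds no new direction
r2_old = r_squared([x1, x2], y)
r2_new = r_squared([x1, x2, x3], y)

print(kept_old, kept_new, r2_old, r2_new)
```

The number of usable directions is the same with and without `x3`, so the fitted values, and therefore $R^2$, cannot move.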



As you note, this does not mean the variable is unimportant, but rather that you are unable to distinguish its effect from that of the other variables in your model.



More broadly, however, I (and many here at Cross Validated) would caution against using $R^2$ for model selection and interpretation. What I've discussed above is how the $R^2$ could stay unchanged while the variable is still important. Worse yet, the $R^2$ could change somewhat (or even dramatically) when you include an irrelevant variable. Broadly, using $R^2$ for model selection fell out of favor in the 70s, when it was dropped in favor of AIC (and its contemporaries). Today, a typical statistician would recommend using cross-validation (see the site name) for your model selection.
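
To make the contrast between $R^2$ and cross-validation concrete, here is a minimal sketch on made-up data, with hypothetical helper names (`dot`, `orthogonal_basis`, `r2_and_press`), not code from this answer. It relies on the standard leave-one-out shortcut for least squares, in which the held-out prediction error for observation $i$ equals $e_i / (1 - h_{ii})$, with $e_i$ the ordinary residual and $h_{ii}$ the leverage. The column `z` is an arbitrary 0/1 variable that played no part in constructing `y`.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def orthogonal_basis(columns, n, tol=1e-10):
    """Gram-Schmidt basis for span{intercept, columns}; redundant columns are dropped."""
    basis = []
    for col in [[1.0] * n] + [list(map(float, c)) for c in columns]:
        v = col[:]
        for b in basis:
            coef = dot(v, b) / dot(b, b)
            v = [vi - coef * bi for vi, bi in zip(v, b)]
        if dot(v, v) > tol * dot(col, col):
            basis.append(v)
    return basis


def r2_and_press(columns, y):
    """Return (R^2, PRESS) for the least-squares fit of y on an intercept plus columns."""
    n = len(y)
    fitted = [0.0] * n
    leverage = [0.0] * n
    for b in orthogonal_basis(columns, n):
        bb = dot(b, b)
        coef = dot(y, b) / bb
        fitted = [f + coef * bi for f, bi in zip(fitted, b)]
        leverage = [h + bi * bi / bb for h, bi in zip(leverage, b)]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    y_bar = sum(y) / n
    tss = sum((yi - y_bar) ** 2 for yi in y)
    r2 = 1.0 - dot(resid, resid) / tss
    # PRESS: sum of squared leave-one-out prediction errors
    press = sum((e / (1.0 - h)) ** 2 for e, h in zip(resid, leverage))
    return r2, press


x1 = [1, 2, 3, 4, 5, 6]
y = [3, 2, 7, 8, 10, 12]   # equals 2 * x1 plus a small disturbance
z = [0, 1, 0, 1, 1, 0]     # arbitrary column, not used to build y

r2_small, press_small = r2_and_press([x1], y)
r2_big, press_big = r2_and_press([x1, z], y)

print(f"x1 only: R^2 = {r2_small:.3f}, LOOCV MSE = {press_small / len(y):.2f}")
print(f"x1 + z : R^2 = {r2_big:.3f}, LOOCV MSE = {press_big / len(y):.2f}")
```

On this particular toy data the larger model has the higher $R^2$ but the worse leave-one-out error, which is the kind of disagreement the paragraph above warns about. Other data sets can of course behave differently.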



In general, adding a variable never decreases $R^2$ and almost always increases it, so using $R^2$ to determine a variable's importance is a bit of a wild goose chase. Even when trying to understand simple situations you will end up with a completely absurd collection of variables.
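
A small illustration of how far this can go, again a sketch with made-up data and hypothetical helper names (`dot`, `orthogonal_basis`, `r_squared`). The "predictors" are indicator columns that each flag a single observation, chosen here only because they are obviously meaningless. $R^2$ never goes down as they are added one by one, and it reaches 1 once there are as many fitted parameters as observations.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def orthogonal_basis(columns, n, tol=1e-10):
    """Gram-Schmidt basis for span{intercept, columns}; redundant columns are dropped."""
    basis = []
    for col in [[1.0] * n] + [list(map(float, c)) for c in columns]:
        v = col[:]
        for b in basis:
            coef = dot(v, b) / dot(b, b)
            v = [vi - coef * bi for vi, bi in zip(v, b)]
        if dot(v, v) > tol * dot(col, col):
            basis.append(v)
    return basis


def r_squared(columns, y):
    """R^2 of the least-squares fit of y on an intercept plus the given columns."""
    n = len(y)
    fitted = [0.0] * n
    for b in orthogonal_basis(columns, n):
        coef = dot(y, b) / dot(b, b)
        fitted = [f + coef * bi for f, bi in zip(fitted, b)]
    y_bar = sum(y) / n
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    tss = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - rss / tss


y = [3, 2, 7, 8, 10, 12]
n = len(y)

# One indicator column per observation (all but the last): meaningless predictors.
dummies = [[1.0 if i == j else 0.0 for i in range(n)] for j in range(n - 1)]

# R^2 with 0, 1, ..., n-1 of these columns added to the intercept.
path = [r_squared(dummies[:k], y) for k in range(n)]
print([round(r, 3) for r in path])
```

The sequence starts at 0 for the intercept-only model and ends at a perfect fit, even though none of the added columns says anything about how the response was generated.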







answered yesterday by user5957401

  • Could you elaborate on the claim that the $R^2$ could change somewhat (or even dramatically) when you include an irrelevant variable, specifically on the case of a dramatic change? In what sense would the variable then be irrelevant? – Richard Hardy, yesterday



















