Help interpreting the result of linear regression and confidence interval (beginner level)












$begingroup$


I have 200 regressors and one response. I want to predict the response and have fitted an ordinary least squares (OLS) regression. In the plot below, the red dots are the predicted response and the blue dots are the actual response on a test set of $n = 50$ samples. The model was trained on 5000 samples. The test and training sets overlap.



Using the training set,




  • I computed the $R^2$ coefficient to assess the goodness of fit and got about 0.845.

  • The mean squared error between the fitted response and the known response is 135.

  • I took alpha = 0.05 / 2 (alpha for a two-tailed test), and the F0 value is 62.


I cannot tell whether I should accept this model. Can somebody please help me understand whether it is a good fit? I used Matlab. Thank you.



[Figure: predicted response (red) and actual response (blue) for the 50 test samples]










$endgroup$
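Since the question says the test and training sets overlap, one way to get an honest error estimate is to score the model only on samples it never saw during fitting. The following is a minimal sketch in Python rather than Matlab, and it is not the asker's code: the data are synthetic, the model has a single regressor instead of 200, and the names `fit_ols_1d` and `mse` are hypothetical helpers introduced only for illustration.

```python
import random


def fit_ols_1d(x, y):
    """Closed-form least-squares slope and intercept for a single regressor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mse(actual, predicted):
    """Mean squared error between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


rng = random.Random(0)

# Synthetic data (an assumption for illustration): y = 3x + 2 plus unit-variance noise.
x = [rng.uniform(0.0, 10.0) for _ in range(5050)]
y = [3.0 * xi + 2.0 + rng.gauss(0.0, 1.0) for xi in x]

# Shuffle once, then cut: the two index sets cannot share a sample.
idx = list(range(len(x)))
rng.shuffle(idx)
test_idx, train_idx = idx[:50], idx[50:]

x_train = [x[i] for i in train_idx]
y_train = [y[i] for i in train_idx]
x_test = [x[i] for i in test_idx]
y_test = [y[i] for i in test_idx]

# Fit on the training samples only.
slope, intercept = fit_ols_1d(x_train, y_train)

train_mse = mse(y_train, [slope * xi + intercept for xi in x_train])
test_mse = mse(y_test, [slope * xi + intercept for xi in x_test])

print(f"train MSE: {train_mse:.3f}")
print(f"test MSE:  {test_mse:.3f}")
```

If the held-out MSE is much larger than the training MSE, the training-set figures are probably too optimistic to judge the model by.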












  • $begingroup$
    It is good that you plotted the response, because visually inspecting the outcome is always a good sanity check. Based on the image you provided, I would reject the model. The predicted response is almost always far too high: only 6 of the 50 samples have a predicted response below the actual response. Your predictor has a strong positive bias.
    $endgroup$
    – macaw_9227
    2 hours ago










  • $begingroup$
    Thank you for your suggestions. However, a statistic such as R-squared = 0.8 is quite high even though the prediction looks terrible, so is this value misleading?
    $endgroup$
    – Sm1
    2 hours ago










  • $begingroup$
    Well, if your predicted values were all exactly 0.1 * actual value, then the squared correlation between predicted and actual values would be r squared = 1, but it would still be a terrible predictor.
    $endgroup$
    – macaw_9227
    1 hour ago










  • $begingroup$
    I see, thank you. It would be of immense help if you could post an answer on these practical aspects, where the statistics look good yet the model's predictions are terrible. In that case, how should one choose a model? Also, I am having a tough time understanding the meaning of the F0 statistic and the p-value.
    $endgroup$
    – Sm1
    1 hour ago
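The "0.1 * actual value" example from the comments can be checked numerically. The sketch below, in plain Python with made-up numbers, compares two quantities that both get called "R squared": the squared Pearson correlation, and the coefficient of determination 1 - SS_res / SS_tot. The thread does not say which one Matlab reported, so treat that as an open assumption. The function names are hypothetical.

```python
import math


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)


def coefficient_of_determination(actual, predicted):
    """1 - SS_res / SS_tot; it can be negative for a badly biased predictor."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Made-up values, only to reproduce the "0.1 * actual" scenario.
actual = [10.0, 20.0, 30.0, 40.0, 50.0]
predicted = [0.1 * a for a in actual]

squared_correlation = pearson_r(actual, predicted) ** 2
r2 = coefficient_of_determination(actual, predicted)
mean_error = sum(p - a for a, p in zip(actual, predicted)) / len(actual)

print(f"squared correlation: {squared_correlation:.3f}")  # about 1.000
print(f"1 - SS_res/SS_tot:   {r2:.3f}")                   # about -3.455
print(f"mean error (bias):   {mean_error:.3f}")           # about -27.000
```

The squared correlation only measures how linearly related the predictions are to the actual values, so it ignores a constant scale or offset error. The mean error and the coefficient of determination both expose the bias.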


















Tags: regression, prediction






Asked 2 hours ago by Sm1 (new contributor); edited 2 hours ago.



