Help interpreting the result of linear regression and confidence interval (beginner level)
I have 200 regressors and one response. I want to predict the response and have fitted an ordinary least squares (OLS) regression. In the plot below, the red dots are the predicted responses and the blue dots are the actual responses on a test set of $n = 50$ samples. The model was trained on 5000 samples. The test and training sets overlap.
Using the training set,
- I computed the $R^2$ coefficient to assess the goodness of fit and got a value close to 0.845.
- The mean squared error between the fitted response and the known response is 135.
- I took alpha = 0.05 / 2 (the alpha for a two-tailed test), and the F0 value is 62.
I cannot tell whether I should accept this model. Can somebody please help me understand whether it is a good fit? I have used Matlab. Thank you.

regression prediction
It is good that you plotted the response; visually inspecting the outcome is always a good sanity check. Based on the image you provided, I would reject the model. The predicted response is almost always far too high: only 6 of the 50 samples have a predicted response below the actual response. Your predictor has a strong positive bias.
– macaw_9227
2 hours ago
Thank you for your suggestions. However, a statistic such as R-squared = 0.8 is quite high even though the prediction looks terrible, so is this value misleading?
– Sm1
2 hours ago
Well, if your predicted values were all exactly 0.1 * the actual value, then you would have r squared = 1 (taking r as the correlation between predicted and actual values), but it would still be a terrible predictor.
– macaw_9227
1 hour ago
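The point in the comment above can be checked with a small sketch. It is only an illustration on made-up numbers (the `actual` values below are hypothetical, not the data from the question), and it uses plain Python rather than Matlab. It compares the squared correlation between predictions and actual values with the coefficient of determination computed as 1 - SS_res / SS_tot, plus the mean signed error as a simple bias check.

```python
from statistics import mean


def squared_correlation(actual, predicted):
    """Squared Pearson correlation between actual and predicted values."""
    mean_a, mean_p = mean(actual), mean(predicted)
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov * cov / (var_a * var_p)


def r2_score(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_a = mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def mean_error(actual, predicted):
    """Mean signed error (predicted minus actual); far from 0 means bias."""
    return mean([p - a for a, p in zip(actual, predicted)])


# Hypothetical data: every prediction is exactly 0.1 * the actual value.
actual = [10.0, 20.0, 30.0, 40.0, 50.0]
predicted = [0.1 * a for a in actual]

sq_corr = squared_correlation(actual, predicted)
r2 = r2_score(actual, predicted)
bias = mean_error(actual, predicted)

print("squared correlation:", sq_corr)  # essentially 1
print("1 - SS_res/SS_tot:  ", r2)       # strongly negative
print("mean signed error:  ", bias)     # large systematic underprediction
```

On this toy data the squared correlation is essentially 1 while 1 - SS_res / SS_tot is negative, so which definition of R-squared was used matters. Evaluating the second form and the mean signed error on held-out data that does not overlap the training set is one way to catch a systematic offset like the one visible in the plot.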
I see, thank you. It would be of immense help if you could post an answer about this practical situation, where the statistics look good yet the model predictions are terrible. In that case, how should one choose a model? Also, I am having a tough time understanding the meaning of the F0 statistic and the p-value.
– Sm1
1 hour ago
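As a rough aid for the F0 question, here is a minimal sketch of how the overall F statistic of a linear regression relates to R-squared. It assumes that F0 means the overall F test (null hypothesis: all 200 slope coefficients are zero) for a model with an intercept; the question does not show how F0 was computed, so that is an assumption, and the function name `overall_f_statistic` is made up for this illustration.

```python
def overall_f_statistic(r_squared, n_samples, n_regressors):
    """Overall F statistic for an OLS model with an intercept.

    F = (R^2 / p) / ((1 - R^2) / (n - p - 1)),
    where p is the number of regressors and n the number of samples.
    """
    df_model = n_regressors
    df_resid = n_samples - n_regressors - 1
    return (r_squared / df_model) / ((1.0 - r_squared) / df_resid)


# Values quoted in the question: R^2 close to 0.845, 5000 training samples,
# 200 regressors.
f_value = overall_f_statistic(0.845, 5000, 200)
print(round(f_value, 1))  # about 130.8
```

Under these assumptions the formula gives a value of about 130.8, which differs from the F0 = 62 reported in the question, so F0 there may have been computed differently. The p-value attached to such an F statistic is the probability, if all slopes were truly zero, of seeing an F at least this large. With 5000 samples a tiny p-value is easy to obtain, and it only says the regressors explain some variance; it says nothing about whether predictions on new data are unbiased.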