Help interpreting the result of linear regression and confidence interval (beginner level)

I have 200 regressors and one response. I want to predict the response and have used ordinary least squares (OLS) regression. The plot below shows the predicted response (red dots) and the actual response (blue dots) for a test data set of $n = 50$ samples. Training was done using 5000 samples. The test and training sets overlap.

Using the training set:

The $R^2$ coefficient, which I used to determine the goodness of fit, is close to 0.845.

The mean squared error between the fitted and the known response is 135.

I took alpha = 0.05 / 2; % Alpha for a 2 tailed test, and the F0 value is 62.

I cannot tell whether I should accept this model. Can somebody please help me understand whether it is a good fit? I have used MATLAB. Thank you.
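Because the question states that the test and training sets overlap, a disjoint hold-out split is worth illustrating. The following is a minimal Python sketch (not the asker's MATLAB code). The synthetic data, the single-regressor model, and the helper names `fit_line` and `mse` are all assumptions made for illustration; the real problem has 200 regressors.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares with one regressor: returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def mse(actual, predicted):
    """Mean squared error between two equally long sequences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

# Synthetic data (an assumption): y = 3 + 2x + Gaussian noise.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(500)]
ys = [3.0 + 2.0 * x + rng.gauss(0, 1) for x in xs]

# Shuffle the indices once, then cut them so no sample lands in both sets.
indices = list(range(len(xs)))
rng.shuffle(indices)
test_idx, train_idx = indices[:50], indices[50:]

# Fit on the training samples only.
intercept, slope = fit_line([xs[i] for i in train_idx],
                            [ys[i] for i in train_idx])

# Evaluate separately on the training set and on the held-out test set.
train_mse = mse([ys[i] for i in train_idx],
                [intercept + slope * xs[i] for i in train_idx])
test_mse = mse([ys[i] for i in test_idx],
               [intercept + slope * xs[i] for i in test_idx])
print(f"train MSE = {train_mse:.3f}, test MSE = {test_mse:.3f}")
```

With an overlapping test set, the test error is partly a training error and tends to look better than the error on genuinely unseen data, so comparing the two numbers is only informative when the split is disjoint.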

[Plot: predicted response (red dots) and actual response (blue dots) for the 50 test samples]

edited 2 hours ago

asked 2 hours ago

Sm1

New contributor

It is good that you plotted the response, because visually inspecting the outcome is always a good sanity check. Based on the image you provided, I would reject the model. The predicted response is almost always far too high: only 6 of the 50 samples have a predicted response that is less than the actual response. Your predictor has a strong positive bias.
– macaw_9227
2 hours ago
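The bias check described in this comment can be sketched in a few lines of Python (not the asker's MATLAB code). The five data points and the helper name `bias_summary` are hypothetical and only illustrate the computation; they are not the values from the plot.

```python
def bias_summary(actual, predicted):
    """Return (mean signed error, number of under-predictions)."""
    errors = [p - a for a, p in zip(actual, predicted)]
    mean_error = sum(errors) / len(errors)
    # An under-prediction is a sample whose prediction is below the actual value.
    n_under = sum(1 for e in errors if e < 0)
    return mean_error, n_under

# Hypothetical values, chosen so that most predictions are too high.
actual = [10.0, 12.0, 9.0, 15.0, 11.0]
predicted = [14.0, 15.5, 8.0, 19.0, 16.0]

mean_error, n_under = bias_summary(actual, predicted)
print(f"mean signed error = {mean_error:.2f}")
print(f"under-predictions = {n_under} of {len(actual)}")
```

A mean signed error far from zero, together with a very lopsided count of under-predictions, points to a systematic bias rather than random scatter around the true values.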

Thank you for your suggestions. However, a statistic such as R squared = 0.8 is quite high even though the prediction looks terrible, so is this value misleading?
– Sm1
2 hours ago

Well, if your predicted values were all exactly 0.1 * actual value, then the squared correlation between predicted and actual values (one common way of computing R squared) would be 1, yet you would still have a terrible predictor.
– macaw_9227
1 hour ago
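The point of this comment can be checked numerically. The Python sketch below (not the asker's MATLAB code) uses hypothetical data and two hypothetical helper functions: one computes R squared as the squared correlation between actual and predicted values, the other as 1 - SS_res / SS_tot. Which definition the asker's software reports is not stated in the question.

```python
def mean(values):
    return sum(values) / len(values)

def squared_correlation(actual, predicted):
    """R squared defined as the squared Pearson correlation."""
    ma, mp = mean(actual), mean(predicted)
    cov = sum((a - ma) * (p - mp) for a, p in zip(actual, predicted))
    var_a = sum((a - ma) ** 2 for a in actual)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov * cov / (var_a * var_p)

def r2_from_residuals(actual, predicted):
    """R squared defined as 1 - SS_res / SS_tot."""
    ma = mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - ma) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Hypothetical data: every prediction is exactly 0.1 * actual value.
actual = [10.0, 20.0, 30.0, 40.0, 50.0]
predicted = [0.1 * a for a in actual]

print(squared_correlation(actual, predicted))  # 1.0 up to rounding
print(r2_from_residuals(actual, predicted))    # strongly negative
```

The squared correlation only measures how well the predictions track the actual values up to a linear rescaling, so it cannot detect a scale error or a constant offset. The residual-based definition, evaluated on the predictions themselves, does penalize them.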

I see, thank you. It would be of immense help if you could post an answer about these practical situations, where the statistics look good and yet the model predictions are terrible. In that case, how should one choose a model? Also, I am having a tough time understanding the meaning of the F0 statistic and the p-value.
– Sm1
1 hour ago
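On the F statistic: for an OLS model with an intercept, the overall F test has the null hypothesis that all slope coefficients are zero, and its statistic can be written in terms of R squared as F = (R^2 / p) / ((1 - R^2) / (n - p - 1)), with p regressors and n training samples. The Python sketch below (not the asker's MATLAB code) plugs in the numbers from the question. The function name `overall_f_statistic` is hypothetical, and the presence of an intercept is an assumption. The question does not say how F0 = 62 was computed, so this value is not expected to reproduce it.

```python
def overall_f_statistic(r2, n_samples, n_regressors):
    """Overall F statistic of an OLS fit with an intercept (assumed)."""
    df_model = n_regressors                  # numerator degrees of freedom
    df_resid = n_samples - n_regressors - 1  # denominator degrees of freedom
    return (r2 / df_model) / ((1.0 - r2) / df_resid)

# Values taken from the question: R^2 close to 0.845, 5000 training
# samples, 200 regressors.
f_value = overall_f_statistic(0.845, 5000, 200)
print(round(f_value, 1))  # about 130.8
```

The p-value is the probability, under the null hypothesis, of observing an F statistic at least this large. A small p-value says the regressors jointly explain more variance in the training data than chance would, which is a much weaker statement than saying the model predicts new samples accurately.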

regression prediction


