Density plot looks normal, QQ plot does not, Shapiro test significant
$begingroup$
The following QQ plot has many points off the reference line, the density plot looks normal, and the Shapiro test gives a p-value < 2.2e-16, which says this is not a normal distribution. However, I've read that the Shapiro test should not be trusted when there are about 1000 data points. Should I conclude that this distribution is normal?
normal-distribution
$endgroup$
asked 12 hours ago
AnaHochma
$begingroup$
Clear asymmetry suggests something is up. Did you calculate the skewness? A nominal rule of thumb is that if Pearson skewness $\ge$ 0.1, then you have to take corrective action, e.g., performing statistics on the log() of your measurements rather than on the measurements directly. Also, are there additional factors (metadata, etc.) that you can use to subselect your data? This may also be a mixture of models, so you might have two or three normal distributions all sitting close together, but with the second and third small enough not to create an obviously multimodal histogram.
$endgroup$
– Peter Leopold
12 hours ago
1
$begingroup$
@Peter Where does that rule of thumb come from? It's not generally applicable, so it would be of interest to know its limitations and assumptions.
$endgroup$
– whuber♦
11 hours ago
1
$begingroup$
@PeterLeopold "Pearson skewness" is not uniquely defined. Pearson himself put most emphasis on measuring skewness relative to the mode, which had a major role in his system of distributions. But he did also use a dimensionless ratio based on third and second moments around the mean. And yet again (mean $-$ median) / SD appears in his work. But regardless I wouldn't regard skewness of about 0.1 on any measure I've encountered as requiring transformation. I would always want to see the data, however.
$endgroup$
– Nick Cox
11 hours ago
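To make the point about competing definitions concrete, here is a minimal sketch of two of the skewness measures mentioned in the comment above: the moment-based ratio and (mean - median) / SD. The mode-based version is left out because it needs a mode estimate. The function names and the exponential sample are assumptions made for illustration; they are not from the thread.

```python
import random
import statistics


def moment_skewness(xs):
    """Third central moment divided by the cubed (population) standard deviation."""
    mean = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    return sum((x - mean) ** 3 for x in xs) / (len(xs) * sd ** 3)


def median_skewness(xs):
    """(mean - median) / SD, one of the ratios that appears in Pearson's work."""
    return (statistics.fmean(xs) - statistics.median(xs)) / statistics.pstdev(xs)


# Hypothetical right-skewed sample (exponential with mean 1), for illustration only.
rng = random.Random(0)
sample = [rng.expovariate(1.0) for _ in range(5000)]

print("moment-based skewness:", round(moment_skewness(sample), 3))
print("(mean - median) / SD: ", round(median_skewness(sample), 3))
```

The two numbers differ substantially on the same data, so a single threshold such as 0.1 cannot mean the same thing for both measures.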
$begingroup$
@Peter The problem here is more about kurtosis than skewness, and it's not clear that taking the log is justified, even if the data were skewed. It depends on what the OP is going to use the data for.
$endgroup$
– Peter Flom♦
11 hours ago
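As a rough illustration of the kurtosis-versus-skewness distinction, the sketch below compares a normal sample with a symmetric but heavy-tailed one. The heavy-tailed data are a hypothetical mixture (80% N(0, 1) and 20% N(0, 3)) chosen only for the example; nothing here is the OP's data.

```python
import random
import statistics


def standardized_moment(xs, k):
    """k-th central moment divided by SD**k (population versions)."""
    mean = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    return sum((x - mean) ** k for x in xs) / (len(xs) * sd ** k)


rng = random.Random(1)

# Hypothetical symmetric, heavy-tailed data: 80% N(0, 1) mixed with 20% N(0, 3).
heavy = [rng.gauss(0, 3) if rng.random() < 0.2 else rng.gauss(0, 1) for _ in range(5000)]
normal = [rng.gauss(0, 1) for _ in range(5000)]

for name, xs in [("normal", normal), ("heavy-tailed", heavy)]:
    skew = standardized_moment(xs, 3)
    excess_kurt = standardized_moment(xs, 4) - 3  # 0 for a normal distribution
    print(f"{name:>12}: skewness={skew:+.2f}, excess kurtosis={excess_kurt:+.2f}")
```

A sample like the second one can look bell-shaped and symmetric in a density plot while its tails still pull the ends of a QQ plot away from the line.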
$begingroup$
I would not trust a scale for happiness [NB] with such results!
$endgroup$
– Nick Cox
11 hours ago
1 Answer
$begingroup$
First, the density plot does not really look normal. It's symmetric, but the shape is wrong. I suggest generating a normal distribution with the same mean and variance as yours and then overlaying that density on the one you've got. I am fairly sure you will see a mismatch.
Second, a quantile normal plot is often a better clue to nonnormality.
Third, and probably most importantly, why are you concerned about the normality of this variable? What are you going to do with the variable?
$endgroup$
answered 11 hours ago
Peter Flom♦
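The first two suggestions in the answer can be sketched without a plotting library: compare a histogram density estimate with the pdf of a normal that has the same mean and standard deviation, then list the pairs a normal QQ plot would draw. The stand-in data, the helper names `histogram_density` and `qq_pairs`, and the plotting positions (i + 0.5) / n are assumptions made for this example.

```python
import random
import statistics
from statistics import NormalDist

rng = random.Random(2)
# Hypothetical stand-in for the OP's data: symmetric but heavier-tailed than normal.
data = [rng.gauss(0, 3) if rng.random() < 0.2 else rng.gauss(0, 1) for _ in range(1000)]

# Normal reference with the same mean and standard deviation as the data.
ref = NormalDist(statistics.fmean(data), statistics.stdev(data))


def histogram_density(xs, lo, hi, bins):
    """Return (bin center, estimated density) pairs for equal-width bins on [lo, hi)."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in xs:
        if lo <= x < hi:
            counts[min(int((x - lo) / width), bins - 1)] += 1
    return [(lo + (i + 0.5) * width, c / (len(xs) * width)) for i, c in enumerate(counts)]


def qq_pairs(xs, dist):
    """Return (theoretical quantile, observed value) pairs for a QQ plot against dist."""
    n = len(xs)
    return [(dist.inv_cdf((i + 0.5) / n), x) for i, x in enumerate(sorted(xs))]


# 1) Density overlay: data density vs. the matching normal density at each bin center.
for center, dens in histogram_density(data, -4, 4, 8):
    print(f"x={center:+.1f}  data={dens:.3f}  normal={ref.pdf(center):.3f}")

# 2) QQ pairs: points far from theoretical == observed indicate non-normality.
pairs = qq_pairs(data, ref)
for i in (0, 9, 499, 989, 999):
    theo, obs = pairs[i]
    print(f"rank {i + 1:4d}: theoretical={theo:+.2f}  observed={obs:+.2f}")
```

With real data, the same pairs could be passed to any plotting tool; the printed table is only a stand-in for the overlay and the QQ plot.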
1
$begingroup$
Thanks for all the comments and answers. @Peter Flom, I've got measures of "Happyness" for two groups of people. I plotted them against time and it looks like one group gets higher values, so I'm trying to compare them statistically. I ran the Shapiro test for both groups and got p-values << 0.05, so I don't know whether to use a t-test or Wilcoxon.
$endgroup$
– AnaHochma
9 hours ago
$begingroup$
I'd go with Wilcoxon. Or maybe a bootstrap.
$endgroup$
– Peter Flom♦
9 hours ago
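A minimal sketch of the two options named in the comment above, under loud assumptions: the two groups are treated as independent samples (which may not hold if the same people were measured repeatedly over time), the data are simulated, and the function names are invented for this example. The rank-sum p-value uses the normal approximation without a tie correction.

```python
import random
import statistics
from statistics import NormalDist


def rank_sum_test(a, b):
    """Wilcoxon rank-sum (Mann-Whitney U) test, two-sided, normal approximation.

    Ties get midranks; the variance has no tie correction, so treat the
    p-value as approximate when there are many ties.
    """
    pooled = sorted((x, g) for g, xs in enumerate((a, b)) for x in xs)
    ranks_a = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of the 1-based ranks i+1 .. j
        ranks_a += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(a), len(b)
    u = ranks_a - n1 * (n1 + 1) / 2
    z = (u - n1 * n2 / 2) / (n1 * n2 * (n1 + n2 + 1) / 12) ** 0.5
    return u, 2 * (1 - NormalDist().cdf(abs(z)))


def bootstrap_mean_diff(a, b, reps=2000, seed=0):
    """Percentile 95% bootstrap interval for mean(a) - mean(b)."""
    rng = random.Random(seed)
    diffs = sorted(
        statistics.fmean(rng.choices(a, k=len(a))) - statistics.fmean(rng.choices(b, k=len(b)))
        for _ in range(reps)
    )
    return diffs[int(0.025 * reps)], diffs[int(0.975 * reps) - 1]


# Hypothetical scores for two groups; not the OP's data.
rng = random.Random(3)
group_a = [rng.gauss(5.0, 1.0) for _ in range(500)]
group_b = [rng.gauss(5.4, 1.0) for _ in range(500)]

u, p = rank_sum_test(group_a, group_b)
lo, hi = bootstrap_mean_diff(group_a, group_b)
print(f"U = {u:.0f}, approximate two-sided p = {p:.3g}")
print(f"95% bootstrap interval for the mean difference: ({lo:.2f}, {hi:.2f})")
```

The rank-sum test asks whether one group tends to have larger values, while the bootstrap interval estimates the size of the mean difference; the two answer slightly different questions.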
$begingroup$
Ah, two distributions! Thanks for confirming what the data was hinting strongly at.
$endgroup$
– Peter Leopold
8 hours ago
1
$begingroup$
Thanks! So I'll go with Wilcoxon too :) Is the main reason that the Shapiro test p-value is << 0.05?
$endgroup$
– AnaHochma
8 hours ago