Density plot looks normal, Q-Q plot does not, and the Shapiro test is significant
















In the following Q-Q plot, too many points fall off the line, yet the density plot looks normal. The Shapiro test gives a p-value < 2.2e-16, which says this is not a normal distribution. However, I've read that the Shapiro test should not be trusted when there are about 1000 data points. Should I conclude that this distribution is normal?

[Figures: normal Q-Q plot and density plot of the data; images not included]







normal-distribution















asked 12 hours ago by AnaHochma












  • The clear asymmetry suggests something is up. Did you calculate the skewness? A nominal rule of thumb is that if the Pearson skewness is $\ge$ 0.1, you have to take corrective action, e.g., performing statistics on the log() of your measurements rather than on the measurements directly. Also, are there additional factors (metadata, etc.) that you can use to subselect your data? This may also be a mixture of models: you might have two or three normal distributions sitting close together, with the second and third small enough not to create an obviously multimodal histogram.
    – Peter Leopold
    12 hours ago






  • @Peter Where does that rule of thumb come from? It's not generally applicable, so it would be of interest to know its limitations and assumptions.
    – whuber
    11 hours ago






  • @PeterLeopold "Pearson skewness" is not uniquely defined. Pearson himself put most emphasis on measuring skewness relative to the mode, which had a major role in his system of distributions. But he did also use a dimensionless ratio based on third and second moments around the mean. And yet again (mean $-$ median) / SD appears in his work. But regardless, I wouldn't regard skewness of about 0.1 on any measure I've encountered as requiring transformation. I would always want to see the data, however.
    – Nick Cox
    11 hours ago












  • @Peter The problem here is more about kurtosis than skewness, and it's not clear that taking the log would be justified even if the data were skewed. It depends on what the OP is going to use the data for.
    – Peter Flom
    11 hours ago










  • I would not trust a scale for happiness [NB] with such results!
    – Nick Cox
    11 hours ago
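
The two moment-based quantities discussed in the comments above (the third-moment skewness ratio and (mean − median) / SD), plus the kurtosis Peter Flom points to, can be computed directly. Below is a minimal sketch in Python using only the standard library. The function names are illustrative, and `data` is a hypothetical stand-in (a symmetric, heavy-tailed Laplace sample of size 1000), not the OP's measurements.

```python
import random
import statistics


def moment_skewness(xs):
    """Third central moment divided by the cubed (population) SD."""
    n = len(xs)
    mean = statistics.fmean(xs)
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5


def mean_median_skewness(xs):
    """(mean - median) / SD, the other ratio mentioned in the comments."""
    return (statistics.fmean(xs) - statistics.median(xs)) / statistics.pstdev(xs)


def excess_kurtosis(xs):
    """Fourth central moment over squared variance, minus 3 (0 for a normal)."""
    n = len(xs)
    mean = statistics.fmean(xs)
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / m2 ** 2 - 3.0


# Hypothetical data: exponential magnitude with a random sign gives a
# Laplace sample, which is symmetric but has heavier tails than a normal.
rng = random.Random(0)
data = [rng.expovariate(1.0) * rng.choice((-1, 1)) for _ in range(1000)]

print(f"moment skewness:      {moment_skewness(data):+.3f}")
print(f"(mean - median) / SD: {mean_median_skewness(data):+.3f}")
print(f"excess kurtosis:      {excess_kurtosis(data):+.3f}")
```

For a sample like this one would expect both skewness measures to sit near zero while the excess kurtosis is clearly positive, which is the pattern "symmetric, but the shape is wrong" describes.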




























1 Answer



First, the density plot does not really look normal. It's symmetric, but the shape is wrong. I suggest generating a normal distribution with the same mean and variance as yours and then overlaying that density on the one you've got. I am fairly sure you will see a mismatch.
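
The comparison suggested in the paragraph above can be sketched numerically, without a plotting library: bin the data, turn the counts into a density estimate, and set it beside the density of a normal distribution with the same mean and standard deviation. This is a rough sketch under stated assumptions. The function name `density_mismatch`, the choice of 20 bins, and the Laplace sample used as `data` are all hypothetical, not taken from the question.

```python
import random
import statistics
from statistics import NormalDist


def density_mismatch(xs, n_bins=20):
    """Compare a histogram density estimate with the matched normal density.

    Returns a list of (bin_center, empirical_density, normal_density) tuples.
    """
    fitted = NormalDist(statistics.fmean(xs), statistics.stdev(xs))
    lo, hi = min(xs), max(xs)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in xs:
        # Clamp the index so that the maximum value falls in the last bin.
        idx = min(int((x - lo) / width), n_bins - 1)
        counts[idx] += 1
    rows = []
    for i, count in enumerate(counts):
        center = lo + (i + 0.5) * width
        empirical = count / (len(xs) * width)
        rows.append((center, empirical, fitted.pdf(center)))
    return rows


# Hypothetical stand-in for the data: symmetric, heavy-tailed Laplace sample.
rng = random.Random(1)
data = [rng.expovariate(1.0) * rng.choice((-1, 1)) for _ in range(1000)]

for center, empirical, normal in density_mismatch(data):
    print(f"{center:8.2f}  empirical={empirical:.4f}  normal={normal:.4f}")
```

With real data one would more likely draw both curves on the same axes; the table is only meant to show where the two densities part company (typically at the peak and in the tails).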



Second, a quantile normal plot is often a better clue to nonnormality.
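
For reference, the points of a normal quantile plot can be built by pairing the sorted observations with standard normal quantiles. The sketch below uses Python's standard library; the plotting positions (i − 0.5) / n are one common convention among several, and the function name and the sample `data` are hypothetical.

```python
import random
import statistics
from statistics import NormalDist


def normal_qq_points(xs):
    """Return (theoretical, observed) quantile pairs for a normal Q-Q plot."""
    n = len(xs)
    std_normal = NormalDist()  # mean 0, SD 1
    observed = sorted(xs)
    # Plotting positions (i - 0.5) / n stay strictly inside (0, 1).
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, observed))


# Hypothetical stand-in for the data: symmetric, heavy-tailed Laplace sample.
rng = random.Random(2)
data = [rng.expovariate(1.0) * rng.choice((-1, 1)) for _ in range(1000)]

mean, sd = statistics.fmean(data), statistics.stdev(data)
points = normal_qq_points(data)

# Show the extremes: under normality, observed is close to mean + sd * theoretical.
for theoretical, observed in points[:3] + points[-3:]:
    reference = mean + sd * theoretical
    print(f"z={theoretical:+.2f}  observed={observed:+.2f}  reference={reference:+.2f}")
```

Systematic departures from the reference line at both ends, in opposite directions, are the usual signature of tails that are heavier or lighter than normal.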



Third, and probably most importantly, why are you concerned about the normality of this variable? What are you going to do with the variable?

















answered 11 hours ago by Peter Flom








  • Thanks for all the comments and answers. @Peter Flom I have measures of "happiness" for two groups of people. I plotted them against time, and it looks like one group gets higher values, so I'm trying to compare the groups statistically. I ran the Shapiro test for both groups and got p-values << 0.05, so I don't know whether to use a t-test or Wilcoxon.
    – AnaHochma
    9 hours ago










  • I'd go with Wilcoxon. Or maybe a bootstrap.
    – Peter Flom
    9 hours ago










  • Ah, two distributions! Thanks for confirming what the data was hinting strongly at.
    – Peter Leopold
    8 hours ago






  • Thanks! So I'll go with Wilcoxon too :) Is the main reason that the Shapiro test p-value is << 0.05?
    – AnaHochma
    8 hours ago
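
The two options mentioned in the comments above, a Wilcoxon rank-sum test and a bootstrap, can be sketched as follows in standard-library Python. This is an illustration under assumptions, not the commenters' own procedure: the function names are made up, the rank-sum p-value uses a plain normal approximation (no tie or continuity correction), the bootstrap is a simple percentile interval for the difference in means, and the two groups are simulated. Both functions also assume independent observations, which the thread does not confirm for data collected over time.

```python
import math
import random
import statistics
from statistics import NormalDist


def rank_sum_test(a, b):
    """Wilcoxon rank-sum (Mann-Whitney U) test via a normal approximation.

    Ties get midranks; the variance is not tie-corrected, so the p-value
    is approximate. Returns (U statistic for sample a, two-sided p-value).
    """
    pooled = sorted(
        (value, label) for label, sample in ((0, a), (1, b)) for value in sample
    )
    rank_sum_a = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of the 1-based ranks i+1 .. j
        rank_sum_a += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n_a, n_b = len(a), len(b)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    mean_u = n_a * n_b / 2
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u_a - mean_u) / sd_u
    return u_a, 2 * (1 - NormalDist().cdf(abs(z)))


def bootstrap_mean_diff_ci(a, b, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for mean(a) - mean(b)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        resample_a = rng.choices(a, k=len(a))  # sample with replacement
        resample_b = rng.choices(b, k=len(b))
        diffs.append(statistics.fmean(resample_a) - statistics.fmean(resample_b))
    diffs.sort()
    lower = diffs[int((1 - level) / 2 * n_boot)]
    upper = diffs[int((1 + level) / 2 * n_boot) - 1]
    return lower, upper


# Hypothetical groups: heavy-tailed scores, group_b shifted upward by 0.3.
rng = random.Random(3)
group_a = [rng.expovariate(1.0) * rng.choice((-1, 1)) for _ in range(500)]
group_b = [0.3 + rng.expovariate(1.0) * rng.choice((-1, 1)) for _ in range(500)]

u, p = rank_sum_test(group_a, group_b)
low, high = bootstrap_mean_diff_ci(group_a, group_b)
print(f"U = {u:.1f}, approximate two-sided p = {p:.4g}")
print(f"95% bootstrap interval for mean(a) - mean(b): [{low:.3f}, {high:.3f}]")
```

In practice a statistics package would normally be used for the rank-sum test (for example `scipy.stats.mannwhitneyu` in Python or `wilcox.test` in R), since those handle ties and small samples more carefully than this sketch.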















