Do error bars on probabilities have any meaning?












People often say some event has a 50-60% chance of happening. Sometimes I will even see people give explicit error bars on probability assignments. Do these statements have any meaning, or are they just a linguistic quirk reflecting discomfort with choosing a specific number for something that is inherently unknowable?










Tags: probability, error






asked 20 hours ago by mahnamahna




Comment (Dikran Marsupial): Doesn't the Probably Approximately Correct framework in computational learning theory do just that, typically giving a bound on the error rate of a classifier that holds with probability $1-\delta$? If it were a meaningless concept, I doubt those (extremely clever) CoLT people would have failed to spot it!














6 Answers
























Answer (22 votes) – Tim













It wouldn't make sense if you were talking about known probabilities, e.g. with a fair coin the probability of throwing heads is 0.5 by the laws of physics. However, unless you are talking about a textbook example, the exact probability is never known; we only know it approximately.



A different story is when you estimate probabilities from data: say you observed 13 winning tickets among the 12563 tickets you bought, so from this data you estimate the probability to be 13/12563. This is something you estimated from the sample, so it is uncertain, because with a different sample you could have observed a different value. The uncertainty estimate is not about the probability itself, but about your estimate of it.



Another example would be when the probability is not fixed, but depends on other factors. Say that we are talking about the probability of dying in a car accident. We can consider a "global" probability, a single value marginalized over all the factors that directly and indirectly lead to car accidents. On the other hand, we can consider how the probabilities vary across the population given the risk factors.



You can find many more examples where the probabilities themselves are considered as random variables, so they vary rather than being fixed.
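
As a rough numerical sketch of the lottery-ticket example above (an editorial addition, not part of the original answer), assuming Python with scipy: it contrasts a frequentist confidence interval around the estimate with a Bayesian credible interval that treats the probability itself as uncertain.

    # Error bars on an estimated probability: 13 winning tickets out of 12563.
    from scipy import stats

    wins, n = 13, 12563
    p_hat = wins / n  # point estimate of the winning probability

    # Wald (normal-approximation) 95% confidence interval for p
    se = (p_hat * (1 - p_hat) / n) ** 0.5
    wald = (p_hat - 1.96 * se, p_hat + 1.96 * se)

    # Bayesian alternative: 95% credible interval from a Beta(1 + wins, 1 + n - wins)
    # posterior under a uniform prior -- the "probability of a probability" view.
    posterior = stats.beta(1 + wins, 1 + n - wins)
    credible = posterior.ppf([0.025, 0.975])

    print(f"estimate: {p_hat:.5f}")
    print(f"95% Wald CI:           [{wald[0]:.5f}, {wald[1]:.5f}]")
    print(f"95% credible interval: [{credible[0]:.5f}, {credible[1]:.5f}]")

Either interval is a reasonable reading of "error bars" here; the point is that they describe the uncertainty of the estimate, not a property of any single ticket.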


























Comments:

  • usεr11852: If the calculation of a probability estimate was done through something like a logistic regression, wouldn't it also be natural to assume that these "error bars" refer to prediction intervals? (I am asking mostly as a clarification to the first point you raise, +1 obviously)

  • Tim: @usεr11852 Confidence intervals, prediction intervals, highest density regions etc., depending on the actual case. I made the answer very broad, since we have "varying" probabilities in many scenarios and they vary in different ways. Also, you can interpret them differently in different scenarios.

  • Nuclear Wang: Even "known" probabilities can be shorthand for very small error bars. One could presumably show that a coin flip is perhaps 50.00001%-49.99999% with enough trials to get small enough error bars to exclude 50.00000%. There's no physical law suggesting the odds should be precisely even for an asymmetrical coin, but the error bars are far too small for anyone to care.

  • De Novo: @NuclearWang This is accounted for by the OP's use of the phrase "fair coin". By definition, P(HEADS) for a fair coin is 0.5. A fair coin is a mathematical construct. I would suggest an edit replacing "by the laws of physics" with "by definition" to emphasize this point.

  • Tim: @DeNovo The same applies to physical coins (stat.columbia.edu/~gelman/research/published/diceRev2.pdf), but yes, I said "fair" precisely not to start this discussion.



















Answer (15 votes) – Xi'an













A most relevant illustration from xkcd:



[xkcd cartoon; image not reproduced here]



with associated caption:




...an effect size of 1.68 (95% CI: 1.56 (95% CI: 1.52 (95% CI: 1.504
(95% CI: 1.494 (95% CI: 1.488 (95% CI: 1.485 (95% CI: 1.482 (95% CI:
1.481 (95% CI: 1.4799 (95% CI: 1.4791 (95% CI: 1.4784...































Comments:

  • BalinKingOfMoria: Does this imply that error bars on probabilities are redundant?

  • Xi'an: Joking aside, this means that the precision of the error bars is uncertain and that the evaluation of the uncertainty is itself uncertain, in an infinite regress.

  • Xi'an: Which is why I deem the picture relevant and deeply connected with the fundamental difficulty (and beautiful challenge) of assessing errors in statistics.

  • gerrit: That figure illustrates meta-uncertainty, which may be related to uncertainty on a probability, since uncertainty itself is a measure of the width of a probability distribution; but your post does not explain this in any way. In fact the xkcd comic suggests it has something to do with error propagation (which is false), whereas the question does not.





















Answer (3 votes) – usul













I know of two interpretations. The first was said by Tim: We have observed $X$ successes out of $Y$ trials, so if we believe the trials were i.i.d. we can estimate the probability of the process at $X/Y$ with some error bars, e.g. of order $1/\sqrt{Y}$.



The second involves "higher-order probabilities", or uncertainty about the generating process. For example, say I have a coin in my hand manufactured by a crafty gambler, who with probability $0.5$ made a 60%-heads coin, and with probability $0.5$ made a 40%-heads coin. My best guess is a 50% chance that the coin comes up heads, but with big error bars: the "true" chance is either 40% or 60%.



In other words, you can imagine running the experiment a billion times and taking the fraction of successes $X/Y$ (actually the limiting fraction). It makes sense, at least from a Bayesian perspective, to give e.g. a 95% confidence interval around that number. In the above example, given current knowledge, this is $[0.4,0.6]$. For a real coin, maybe it is $[0.47,0.53]$ or something. For more, see:



Do We Need Higher-Order Probabilities and, If So, What Do They Mean?
Judea Pearl. UAI 1987. https://arxiv.org/abs/1304.2716
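
As a concrete toy version of the two-coin example (an editorial addition, not from the original answer), here is a short Python sketch of the "higher-order" distribution over the heads probability and how it updates once flips are observed:

    # Two candidate coins with heads probabilities 0.4 and 0.6, each chosen with prob. 1/2.
    from math import comb

    p_values = [0.4, 0.6]
    prior    = [0.5, 0.5]

    # Marginal ("best guess") probability of heads before seeing any data: 0.5
    print(sum(w * p for w, p in zip(prior, p_values)))

    def posterior(heads, flips, prior=prior, p_values=p_values):
        """Posterior over the two candidate coins after observing `heads` in `flips` flips."""
        likes = [comb(flips, heads) * p**heads * (1 - p)**(flips - heads) for p in p_values]
        joint = [w * l for w, l in zip(prior, likes)]
        z = sum(joint)
        return [j / z for j in joint]

    print(posterior(heads=14, flips=20))  # belief shifts strongly toward the 60%-heads coin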




































Answer (1 vote) – gerrit













    All measurements are uncertain.



    Therefore, any measurement of probability is also uncertain.



    This uncertainty on the measurement of probability can be visually represented with an uncertainty bar. Note that uncertainty bars are often referred to as error bars. This is incorrect or at least misleading, because it shows uncertainty and not error (the error is the difference between the measurement and the unknown truth, so the error is unknown; the uncertainty is a measure of the width of the probability density after taking the measurement).



A related topic is meta-uncertainty. Uncertainty describes the width of an a posteriori probability distribution function, and in the case of a Type A uncertainty (uncertainty estimated by repeated measurements), there is inevitably an uncertainty on the uncertainty; metrologists have told me that metrological practice dictates expanding the uncertainty in this case (IIRC, if uncertainty is estimated by the standard deviation of $N$ repeated measurements, one should multiply the resulting standard deviation by $\frac{N}{N-2}$), which is essentially a meta-uncertainty.
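
A toy Python sketch of that Type A calculation (an editorial addition; the $\frac{N}{N-2}$ expansion factor is only the "IIRC" recollection quoted above, not something this sketch can vouch for), with invented measurement values:

    import statistics

    measurements = [0.49, 0.52, 0.50, 0.47, 0.51, 0.53, 0.48]  # invented repeated measurements
    n = len(measurements)

    s = statistics.stdev(measurements)   # sample standard deviation
    u = s / n**0.5                       # Type A standard uncertainty of the mean
    u_expanded = u * n / (n - 2)         # expansion factor as recalled in the answer

    print(f"mean = {statistics.mean(measurements):.4f}, u = {u:.4f}, expanded u = {u_expanded:.4f}")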






































Answer (1 vote)













Error bars are shorthand for a probability distribution. So error bars on error bars are shorthand for a probability distribution over a probability distribution. That entails assigning probabilities to probabilities, so this is what we need to study.



Following John Maynard Keynes, a probability is the degree of belief that we are justified in entertaining in one proposition, in relation to another. Now there are many pairs of intelligible propositions for which we don't know how to assign a numerical probability. But for my part, I cannot think of a pair of plain propositions (not themselves containing a probability phrase) which yield a numerically quantified uncertain probability. If we try to generate one, it always entails an ordinary probability. For example, suppose that our given proposition is
$$
\mathcal{I} = \text{``die 1 has 6 sides, and die 2 has 7 sides; one of these dice is rolled''}
$$

and that the proposition we're assessing against $\mathcal{I}$ is
$$
\mathcal{A} = \text{``the number showing is 3''}
$$

Then it's tempting to set
$$
\mathrm{prob}\left(\mathrm{prob}(\mathcal{A} \mid \mathcal{I}) = p \mid \mathcal{I}\right) = \begin{cases}
\frac{1}{2} & p = \frac{1}{6} \\
\frac{1}{2} & p = \frac{1}{7}
\end{cases}
$$

which attempts to assign a probability distribution to a probability. Quite apart from the confusion introduced by the nested probability phrases, this cannot be correct, because $\mathrm{prob}(\mathcal{A} \mid \mathcal{I})$ is deducible and therefore not subject to uncertainty. Let
$$
\mathcal{D}(d) \iff \text{``die $d$ is rolled''}
$$

Then
\begin{align}
\mathrm{prob}(\mathcal{A} \mid \mathcal{I}) &= \sum_d \mathrm{prob}(\mathcal{A}, \mathcal{D}(d) \mid \mathcal{I}) \\
&= \sum_d \mathrm{prob}(\mathcal{A} \mid \mathcal{D}(d), \mathcal{I}) \, \mathrm{prob}(\mathcal{D}(d) \mid \mathcal{I}) \\
&= \sum_d \mathrm{prob}(\mathcal{A} \mid \mathcal{D}(d), \mathcal{I}) \cdot \frac{1}{2} \\
&= \frac{1}{2} \left( \frac{1}{6} + \frac{1}{7} \right) \\
&= \frac{13}{84}
\end{align}



      Nested probabilities, in short, cannot be sustained, and neither, therefore, can nested error bars.
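
A quick arithmetic check of the final step above (an editorial addition), using Python's exact fractions:

    from fractions import Fraction

    prob = Fraction(1, 2) * (Fraction(1, 6) + Fraction(1, 7))
    print(prob)                      # 13/84
    assert prob == Fraction(13, 84)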




































Answer (0 votes)













There are very often occasions where you want a probability of a probability. Say, for instance, you worked in food safety and used a survival analysis model to estimate the probability that botulinum spores will germinate (and thus produce the deadly toxin) as a function of the food preparation steps (i.e. cooking) and incubation time/temperature (cf. paper). Food producers may then want to use that model to set safe "use-by" dates so that the consumer's risk of botulism is appropriately small. However, the model is fit to a finite training sample, so rather than picking a use-by date for which the probability of germination is less than, say, 0.001, you might want to choose an earlier date for which (given the modelling assumptions) you could be 95% sure the probability of germination is less than 0.001. This seems a fairly natural thing to do in a Bayesian setting.
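
A toy Python sketch of that last step (an editorial addition; this is not the survival-analysis model the answer refers to, and the counts are invented): with a Beta posterior over the germination probability at a candidate use-by date, check whether its 95% upper credible bound lies below the target risk.

    from scipy import stats

    target = 0.001                    # maximum acceptable germination probability
    germinated, trials = 0, 5000      # invented outcomes at a candidate use-by date

    posterior = stats.beta(1 + germinated, 1 + trials - germinated)  # uniform prior
    upper95 = posterior.ppf(0.95)     # p is below this value with 95% posterior probability

    print(f"point estimate: {germinated / trials:.5f}")
    print(f"95% upper bound on p: {upper95:.5f}")
    print("date acceptable" if upper95 < target else "choose an earlier date")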




























          Your Answer





          StackExchange.ifUsing("editor", function () {
          return StackExchange.using("mathjaxEditing", function () {
          StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
          StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\\(","\\)"]]);
          });
          });
          }, "mathjax-editing");

          StackExchange.ready(function() {
          var channelOptions = {
          tags: "".split(" "),
          id: "65"
          };
          initTagRenderer("".split(" "), "".split(" "), channelOptions);

          StackExchange.using("externalEditor", function() {
          // Have to fire editor after snippets, if snippets enabled
          if (StackExchange.settings.snippets.snippetsEnabled) {
          StackExchange.using("snippets", function() {
          createEditor();
          });
          }
          else {
          createEditor();
          }
          });

          function createEditor() {
          StackExchange.prepareEditor({
          heartbeatType: 'answer',
          autoActivateHeartbeat: false,
          convertImagesToLinks: false,
          noModals: true,
          showLowRepImageUploadWarning: true,
          reputationToPostImages: null,
          bindNavPrevention: true,
          postfix: "",
          imageUploader: {
          brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
          contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
          allowUrls: true
          },
          onDemand: true,
          discardSelector: ".discard-answer"
          ,immediatelyShowMarkdownHelp:true
          });


          }
          });






          mahnamahna is a new contributor. Be nice, and check out our Code of Conduct.










          draft saved

          draft discarded


















          StackExchange.ready(
          function () {
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstats.stackexchange.com%2fquestions%2f393316%2fdo-error-bars-on-probabilities-have-any-meaning%23new-answer', 'question_page');
          }
          );

          Post as a guest















          Required, but never shown

























          6 Answers
          6






          active

          oldest

          votes








          6 Answers
          6






          active

          oldest

          votes









          active

          oldest

          votes






          active

          oldest

          votes









          22












          $begingroup$

          It wouldn't make sense if you were talking about known probabilities, e.g. with fair coin the probability of throwing heads is 0.5 by the laws of physics. However, unless you are talking about textbook example, the exact probability is never known, we only know it approximately.



          The different story is when you estimate the probabilities from the data, e.g. you observed 13 winning tickets among the 12563 tickets you bought, so from this data you estimate the probability to be 13/12563. This is something you estimated from the sample, so it is uncertain, because with different sample you could observe different value. The uncertainty estimate is not about the probability, but around the estimate of it.



          Another example would be when the probability is not fixed, but depends on other factors. Say that we are talking about probability of dying in car accident. We can consider "global" probability, single value that is marginalized over all the factors that directly and indirectly lead to car accidents. On another hand, you can consider how the probabilities vary among the population given the risk factors.



          You can find many more examples where probabilities themselves are considered as random variables, so they vary rather then being fixed.






          share|cite|improve this answer











          $endgroup$









          • 1




            $begingroup$
            If the calculation of a probability estimate was done through something like a logistic regression wouldn't be also natural to assume that these "error bars" refer to prediction intervals? (I am asking mostly as a clarification to the first point you raise, +1 obviously)
            $endgroup$
            – usεr11852
            16 hours ago






          • 1




            $begingroup$
            @usεr11852 confidence intervals, prediction intervals, highest density regions etc., depending on actual case. I made the answer very broad, since we have "varying" probabilities in many scenarios and they vary in different ways. Also you can interpret them differently in different scenarios.
            $endgroup$
            – Tim
            16 hours ago








          • 1




            $begingroup$
            Even "known" probabilities can be shorthand for very small error bars. One could presumably show that a coin flip is perhaps 50.00001%- 49.99999% with enough trials to get small enough error bars that exclude 50.00000%. There's no physical law suggesting the odds should be precisely even for an asymmetrical coin, but the error bars are far too small for anyone to care.
            $endgroup$
            – Nuclear Wang
            10 hours ago






          • 4




            $begingroup$
            @NuclearWang this is accounted for by the OPs use of the phrase "fair coin". By definition, P(HEADS) for a fair coin is 0.5. A fair coin is a mathematical construct. I would suggest an edit replacing "by the laws of physics" with "by definition" to emphasize this point.
            $endgroup$
            – De Novo
            9 hours ago












          • $begingroup$
            @DeNovo same applies to physical coins stat.columbia.edu/~gelman/research/published/diceRev2.pdf , but yes I said "fair" not to start this discussion
            $endgroup$
            – Tim
            9 hours ago
















          22












          $begingroup$

          It wouldn't make sense if you were talking about known probabilities, e.g. with fair coin the probability of throwing heads is 0.5 by the laws of physics. However, unless you are talking about textbook example, the exact probability is never known, we only know it approximately.



          The different story is when you estimate the probabilities from the data, e.g. you observed 13 winning tickets among the 12563 tickets you bought, so from this data you estimate the probability to be 13/12563. This is something you estimated from the sample, so it is uncertain, because with different sample you could observe different value. The uncertainty estimate is not about the probability, but around the estimate of it.



          Another example would be when the probability is not fixed, but depends on other factors. Say that we are talking about probability of dying in car accident. We can consider "global" probability, single value that is marginalized over all the factors that directly and indirectly lead to car accidents. On another hand, you can consider how the probabilities vary among the population given the risk factors.



          You can find many more examples where probabilities themselves are considered as random variables, so they vary rather then being fixed.






          share|cite|improve this answer











          $endgroup$









          • 1




            $begingroup$
            If the calculation of a probability estimate was done through something like a logistic regression wouldn't be also natural to assume that these "error bars" refer to prediction intervals? (I am asking mostly as a clarification to the first point you raise, +1 obviously)
            $endgroup$
            – usεr11852
            16 hours ago






          • 1




            $begingroup$
            @usεr11852 confidence intervals, prediction intervals, highest density regions etc., depending on actual case. I made the answer very broad, since we have "varying" probabilities in many scenarios and they vary in different ways. Also you can interpret them differently in different scenarios.
            $endgroup$
            – Tim
            16 hours ago








          • 1




            $begingroup$
            Even "known" probabilities can be shorthand for very small error bars. One could presumably show that a coin flip is perhaps 50.00001%- 49.99999% with enough trials to get small enough error bars that exclude 50.00000%. There's no physical law suggesting the odds should be precisely even for an asymmetrical coin, but the error bars are far too small for anyone to care.
            $endgroup$
            – Nuclear Wang
            10 hours ago






          • 4




            $begingroup$
            @NuclearWang this is accounted for by the OPs use of the phrase "fair coin". By definition, P(HEADS) for a fair coin is 0.5. A fair coin is a mathematical construct. I would suggest an edit replacing "by the laws of physics" with "by definition" to emphasize this point.
            $endgroup$
            – De Novo
            9 hours ago












          • $begingroup$
            @DeNovo same applies to physical coins stat.columbia.edu/~gelman/research/published/diceRev2.pdf , but yes I said "fair" not to start this discussion
            $endgroup$
            – Tim
            9 hours ago














          22












          22








          22





          $begingroup$

          It wouldn't make sense if you were talking about known probabilities, e.g. with fair coin the probability of throwing heads is 0.5 by the laws of physics. However, unless you are talking about textbook example, the exact probability is never known, we only know it approximately.



          The different story is when you estimate the probabilities from the data, e.g. you observed 13 winning tickets among the 12563 tickets you bought, so from this data you estimate the probability to be 13/12563. This is something you estimated from the sample, so it is uncertain, because with different sample you could observe different value. The uncertainty estimate is not about the probability, but around the estimate of it.



          Another example would be when the probability is not fixed, but depends on other factors. Say that we are talking about probability of dying in car accident. We can consider "global" probability, single value that is marginalized over all the factors that directly and indirectly lead to car accidents. On another hand, you can consider how the probabilities vary among the population given the risk factors.



          You can find many more examples where probabilities themselves are considered as random variables, so they vary rather then being fixed.






          share|cite|improve this answer











          $endgroup$



          It wouldn't make sense if you were talking about known probabilities, e.g. with fair coin the probability of throwing heads is 0.5 by the laws of physics. However, unless you are talking about textbook example, the exact probability is never known, we only know it approximately.



          The different story is when you estimate the probabilities from the data, e.g. you observed 13 winning tickets among the 12563 tickets you bought, so from this data you estimate the probability to be 13/12563. This is something you estimated from the sample, so it is uncertain, because with different sample you could observe different value. The uncertainty estimate is not about the probability, but around the estimate of it.



          Another example would be when the probability is not fixed, but depends on other factors. Say that we are talking about probability of dying in car accident. We can consider "global" probability, single value that is marginalized over all the factors that directly and indirectly lead to car accidents. On another hand, you can consider how the probabilities vary among the population given the risk factors.



          You can find many more examples where probabilities themselves are considered as random variables, so they vary rather then being fixed.







          share|cite|improve this answer














          share|cite|improve this answer



          share|cite|improve this answer








          edited 7 hours ago

























          answered 19 hours ago









          TimTim

          57.7k9126219




          57.7k9126219








          • 1




            $begingroup$
            If the calculation of a probability estimate was done through something like a logistic regression wouldn't be also natural to assume that these "error bars" refer to prediction intervals? (I am asking mostly as a clarification to the first point you raise, +1 obviously)
            $endgroup$
            – usεr11852
            16 hours ago






          • 1




            $begingroup$
            @usεr11852 confidence intervals, prediction intervals, highest density regions etc., depending on actual case. I made the answer very broad, since we have "varying" probabilities in many scenarios and they vary in different ways. Also you can interpret them differently in different scenarios.
            $endgroup$
            – Tim
            16 hours ago








          • 1




            $begingroup$
            Even "known" probabilities can be shorthand for very small error bars. One could presumably show that a coin flip is perhaps 50.00001%- 49.99999% with enough trials to get small enough error bars that exclude 50.00000%. There's no physical law suggesting the odds should be precisely even for an asymmetrical coin, but the error bars are far too small for anyone to care.
            $endgroup$
            – Nuclear Wang
            10 hours ago






          • 4




            $begingroup$
            @NuclearWang this is accounted for by the OPs use of the phrase "fair coin". By definition, P(HEADS) for a fair coin is 0.5. A fair coin is a mathematical construct. I would suggest an edit replacing "by the laws of physics" with "by definition" to emphasize this point.
            $endgroup$
            – De Novo
            9 hours ago












          • $begingroup$
            @DeNovo same applies to physical coins stat.columbia.edu/~gelman/research/published/diceRev2.pdf , but yes I said "fair" not to start this discussion
            $endgroup$
            – Tim
            9 hours ago














          • 1




            $begingroup$
            If the calculation of a probability estimate was done through something like a logistic regression wouldn't be also natural to assume that these "error bars" refer to prediction intervals? (I am asking mostly as a clarification to the first point you raise, +1 obviously)
            $endgroup$
            – usεr11852
            16 hours ago






          • 1




            $begingroup$
            @usεr11852 confidence intervals, prediction intervals, highest density regions etc., depending on actual case. I made the answer very broad, since we have "varying" probabilities in many scenarios and they vary in different ways. Also you can interpret them differently in different scenarios.
            $endgroup$
            – Tim
            16 hours ago








          • 1




            $begingroup$
            Even "known" probabilities can be shorthand for very small error bars. One could presumably show that a coin flip is perhaps 50.00001%- 49.99999% with enough trials to get small enough error bars that exclude 50.00000%. There's no physical law suggesting the odds should be precisely even for an asymmetrical coin, but the error bars are far too small for anyone to care.
            $endgroup$
            – Nuclear Wang
            10 hours ago






          • 4




            $begingroup$
            @NuclearWang this is accounted for by the OPs use of the phrase "fair coin". By definition, P(HEADS) for a fair coin is 0.5. A fair coin is a mathematical construct. I would suggest an edit replacing "by the laws of physics" with "by definition" to emphasize this point.
            $endgroup$
            – De Novo
            9 hours ago












          • $begingroup$
            @DeNovo same applies to physical coins stat.columbia.edu/~gelman/research/published/diceRev2.pdf , but yes I said "fair" not to start this discussion
            $endgroup$
            – Tim
            9 hours ago








          1




          1




          $begingroup$
          If the calculation of a probability estimate was done through something like a logistic regression wouldn't be also natural to assume that these "error bars" refer to prediction intervals? (I am asking mostly as a clarification to the first point you raise, +1 obviously)
          $endgroup$
          – usεr11852
          16 hours ago




          $begingroup$
          If the calculation of a probability estimate was done through something like a logistic regression wouldn't be also natural to assume that these "error bars" refer to prediction intervals? (I am asking mostly as a clarification to the first point you raise, +1 obviously)
          $endgroup$
          – usεr11852
          16 hours ago




          1




          1




          $begingroup$
          @usεr11852 confidence intervals, prediction intervals, highest density regions etc., depending on actual case. I made the answer very broad, since we have "varying" probabilities in many scenarios and they vary in different ways. Also you can interpret them differently in different scenarios.
          $endgroup$
          – Tim
          16 hours ago






          $begingroup$
          @usεr11852 confidence intervals, prediction intervals, highest density regions etc., depending on actual case. I made the answer very broad, since we have "varying" probabilities in many scenarios and they vary in different ways. Also you can interpret them differently in different scenarios.
          $endgroup$
          – Tim
          16 hours ago






          1




          1




          $begingroup$
          Even "known" probabilities can be shorthand for very small error bars. One could presumably show that a coin flip is perhaps 50.00001%- 49.99999% with enough trials to get small enough error bars that exclude 50.00000%. There's no physical law suggesting the odds should be precisely even for an asymmetrical coin, but the error bars are far too small for anyone to care.
          $endgroup$
          – Nuclear Wang
          10 hours ago




          $begingroup$
          Even "known" probabilities can be shorthand for very small error bars. One could presumably show that a coin flip is perhaps 50.00001%- 49.99999% with enough trials to get small enough error bars that exclude 50.00000%. There's no physical law suggesting the odds should be precisely even for an asymmetrical coin, but the error bars are far too small for anyone to care.
          $endgroup$
          – Nuclear Wang
          10 hours ago




          4




          4




          $begingroup$
          @NuclearWang this is accounted for by the OPs use of the phrase "fair coin". By definition, P(HEADS) for a fair coin is 0.5. A fair coin is a mathematical construct. I would suggest an edit replacing "by the laws of physics" with "by definition" to emphasize this point.
          $endgroup$
          – De Novo
          9 hours ago






          $begingroup$
          @NuclearWang this is accounted for by the OPs use of the phrase "fair coin". By definition, P(HEADS) for a fair coin is 0.5. A fair coin is a mathematical construct. I would suggest an edit replacing "by the laws of physics" with "by definition" to emphasize this point.
          $endgroup$
          – De Novo
          9 hours ago














          $begingroup$
          @DeNovo same applies to physical coins stat.columbia.edu/~gelman/research/published/diceRev2.pdf , but yes I said "fair" not to start this discussion
          $endgroup$
          – Tim
          9 hours ago




          $begingroup$
          @DeNovo same applies to physical coins stat.columbia.edu/~gelman/research/published/diceRev2.pdf , but yes I said "fair" not to start this discussion
          $endgroup$
          – Tim
          9 hours ago













          15












          $begingroup$

          A most relevant illustration from xkcd:



          enter image description here



          with associated caption:




          ...an effect size of 1.68 (95% CI: 1.56 (95% CI: 1.52 (95% CI: 1.504
          (95% CI: 1.494 (95% CI: 1.488 (95% CI: 1.485 (95% CI: 1.482 (95% CI:
          1.481 (95% CI: 1.4799 (95% CI: 1.4791 (95% CI: 1.4784...







          share|cite|improve this answer











          $endgroup$













          • $begingroup$
            Does this imply that error bars on probabilities are redundant?
            $endgroup$
            – BalinKingOfMoria
            13 hours ago






          • 7




            $begingroup$
            Joke apart, this means that the precision of the error bars is uncertain and that the evaluation of the uncertainty is itself uncertain, in an infinite regress.
            $endgroup$
            – Xi'an
            9 hours ago






          • 3




            $begingroup$
            Which is why I deem the picture relevant and deeply connected with the fundamental difficulty (and beautiful challenge) of assessing errors in statistics.
            $endgroup$
            – Xi'an
            7 hours ago






          • 4




            $begingroup$
            That figure illustrates meta-uncertainty, which may be related to an uncertainty on a probability since uncertainty itself is a measure of the width of a probability distribution, but your post does not explain this in any way; in fact the XKCD comic suggests it has something to do with error propagation (which is false), which the question does not.
            $endgroup$
            – gerrit
            7 hours ago


















          15












          $begingroup$

          A most relevant illustration from xkcd:



          enter image description here



          with associated caption:




          ...an effect size of 1.68 (95% CI: 1.56 (95% CI: 1.52 (95% CI: 1.504
          (95% CI: 1.494 (95% CI: 1.488 (95% CI: 1.485 (95% CI: 1.482 (95% CI:
          1.481 (95% CI: 1.4799 (95% CI: 1.4791 (95% CI: 1.4784...







          share|cite|improve this answer











          $endgroup$













          • $begingroup$
            Does this imply that error bars on probabilities are redundant?
            $endgroup$
            – BalinKingOfMoria
            13 hours ago






          • 7




            $begingroup$
            Joke apart, this means that the precision of the error bars is uncertain and that the evaluation of the uncertainty is itself uncertain, in an infinite regress.
            $endgroup$
            – Xi'an
            9 hours ago






          • 3




            $begingroup$
            Which is why I deem the picture relevant and deeply connected with the fundamental difficulty (and beautiful challenge) of assessing errors in statistics.
            $endgroup$
            – Xi'an
            7 hours ago






          • 4




            $begingroup$
            That figure illustrates meta-uncertainty, which may be related to an uncertainty on a probability since uncertainty itself is a measure of the width of a probability distribution, but your post does not explain this in any way; in fact the XKCD comic suggests it has something to do with error propagation (which is false), which the question does not.
            $endgroup$
            – gerrit
            7 hours ago
















          15












          15








          15





          $begingroup$

          A most relevant illustration from xkcd:



          enter image description here



          with associated caption:




          ...an effect size of 1.68 (95% CI: 1.56 (95% CI: 1.52 (95% CI: 1.504
          (95% CI: 1.494 (95% CI: 1.488 (95% CI: 1.485 (95% CI: 1.482 (95% CI:
          1.481 (95% CI: 1.4799 (95% CI: 1.4791 (95% CI: 1.4784...







          share|cite|improve this answer











          $endgroup$



          A most relevant illustration from xkcd:



          enter image description here



          with associated caption:




          ...an effect size of 1.68 (95% CI: 1.56 (95% CI: 1.52 (95% CI: 1.504
          (95% CI: 1.494 (95% CI: 1.488 (95% CI: 1.485 (95% CI: 1.482 (95% CI:
          1.481 (95% CI: 1.4799 (95% CI: 1.4791 (95% CI: 1.4784...








          share|cite|improve this answer














          share|cite|improve this answer



          share|cite|improve this answer








          edited 9 hours ago









          Stephan Kolassa

          46k695167




          46k695167










          answered 19 hours ago









          Xi'anXi'an

          57.1k895360




          57.1k895360












          • $begingroup$
            Does this imply that error bars on probabilities are redundant?
            $endgroup$
            – BalinKingOfMoria
            13 hours ago






          • 7




            $begingroup$
            Joke apart, this means that the precision of the error bars is uncertain and that the evaluation of the uncertainty is itself uncertain, in an infinite regress.
            $endgroup$
            – Xi'an
            9 hours ago






          • 3




            $begingroup$
            Which is why I deem the picture relevant and deeply connected with the fundamental difficulty (and beautiful challenge) of assessing errors in statistics.
            $endgroup$
            – Xi'an
            7 hours ago






          • 4




            $begingroup$
            That figure illustrates meta-uncertainty, which may be related to an uncertainty on a probability since uncertainty itself is a measure of the width of a probability distribution, but your post does not explain this in any way; in fact the XKCD comic suggests it has something to do with error propagation (which is false), which the question does not.
            $endgroup$
            – gerrit
            7 hours ago




















          • $begingroup$
            Does this imply that error bars on probabilities are redundant?
            $endgroup$
            – BalinKingOfMoria
            13 hours ago






          • 7




            $begingroup$
            Joke apart, this means that the precision of the error bars is uncertain and that the evaluation of the uncertainty is itself uncertain, in an infinite regress.
            $endgroup$
            – Xi'an
            9 hours ago






          • 3




            $begingroup$
            Which is why I deem the picture relevant and deeply connected with the fundamental difficulty (and beautiful challenge) of assessing errors in statistics.
            $endgroup$
            – Xi'an
            7 hours ago






          • 4




            $begingroup$
            That figure illustrates meta-uncertainty, which may be related to an uncertainty on a probability since uncertainty itself is a measure of the width of a probability distribution, but your post does not explain this in any way; in fact the XKCD comic suggests it has something to do with error propagation (which is false), which the question does not.
            $endgroup$
            – gerrit
            7 hours ago


















          $begingroup$
          Does this imply that error bars on probabilities are redundant?
          $endgroup$
          – BalinKingOfMoria
          13 hours ago




          $begingroup$
          Does this imply that error bars on probabilities are redundant?
          $endgroup$
          – BalinKingOfMoria
          13 hours ago




          7




          7




          $begingroup$
          Joke apart, this means that the precision of the error bars is uncertain and that the evaluation of the uncertainty is itself uncertain, in an infinite regress.
          $endgroup$
          – Xi'an
          9 hours ago




          $begingroup$
          Joke apart, this means that the precision of the error bars is uncertain and that the evaluation of the uncertainty is itself uncertain, in an infinite regress.
          $endgroup$
          – Xi'an
          9 hours ago




          3




          3




          $begingroup$
          Which is why I deem the picture relevant and deeply connected with the fundamental difficulty (and beautiful challenge) of assessing errors in statistics.
          $endgroup$
          – Xi'an
          7 hours ago




          $begingroup$
          Which is why I deem the picture relevant and deeply connected with the fundamental difficulty (and beautiful challenge) of assessing errors in statistics.
          $endgroup$
          – Xi'an
          7 hours ago




          4




          4




          $begingroup$
          That figure illustrates meta-uncertainty, which may be related to an uncertainty on a probability since uncertainty itself is a measure of the width of a probability distribution, but your post does not explain this in any way; in fact the XKCD comic suggests it has something to do with error propagation (which is false), which the question does not.
          $endgroup$
          – gerrit
          7 hours ago






          $begingroup$
          That figure illustrates meta-uncertainty, which may be related to an uncertainty on a probability since uncertainty itself is a measure of the width of a probability distribution, but your post does not explain this in any way; in fact the XKCD comic suggests it has something to do with error propagation (which is false), which the question does not.
          $endgroup$
          – gerrit
          7 hours ago













          3












          $begingroup$

          I know of two interpretations. The first was said by Tim: We have observed $X$ successes out of $Y$ trials, so if we believe the trials were i.i.d. we can estimate the probability of the process at $X/Y$ with some error bars, e.g. of order $1/sqrt{Y}$.



          The second involves "higher-order probabilities" or uncertainties about a generating process. For example, say I have a coin in my hand manufactured by a crafter gambler, who with $0.5$ probability made a 60%-heads coin, and with $0.5$ probability made a 40%-heads coin. My best guess is a 50% chance that the coin comes up heads, but with big error bars: the "true" chance is either 40% or 60%.



          In other words, you can imagine running the experiment a billion times and taking the fraction of successes $X/Y$ (actually the limiting fraction). It makes sense, at least from a Bayesian perspective, to give e.g. a 95% confidence interval around that number. In the above example, given current knowledge, this is $[0.4,0.6]$. For a real coin, maybe it is $[0.47,0.53]$ or something. For more, see:



          Do We Need Higher-Order Probabilities and, If So, What Do They Mean?
          Judea Pearl. UAI 1987. https://arxiv.org/abs/1304.2716






          share|cite|improve this answer









          $endgroup$


















            3












            $begingroup$

            I know of two interpretations. The first was said by Tim: We have observed $X$ successes out of $Y$ trials, so if we believe the trials were i.i.d. we can estimate the probability of the process at $X/Y$ with some error bars, e.g. of order $1/sqrt{Y}$.



            The second involves "higher-order probabilities" or uncertainties about a generating process. For example, say I have a coin in my hand manufactured by a crafter gambler, who with $0.5$ probability made a 60%-heads coin, and with $0.5$ probability made a 40%-heads coin. My best guess is a 50% chance that the coin comes up heads, but with big error bars: the "true" chance is either 40% or 60%.



            In other words, you can imagine running the experiment a billion times and taking the fraction of successes $X/Y$ (actually the limiting fraction). It makes sense, at least from a Bayesian perspective, to give e.g. a 95% confidence interval around that number. In the above example, given current knowledge, this is $[0.4,0.6]$. For a real coin, maybe it is $[0.47,0.53]$ or something. For more, see:



            Do We Need Higher-Order Probabilities and, If So, What Do They Mean?
            Judea Pearl. UAI 1987. https://arxiv.org/abs/1304.2716






            share|cite|improve this answer









            $endgroup$
















              3












              3








              3





              $begingroup$

              I know of two interpretations. The first was said by Tim: We have observed $X$ successes out of $Y$ trials, so if we believe the trials were i.i.d. we can estimate the probability of the process at $X/Y$ with some error bars, e.g. of order $1/sqrt{Y}$.



              The second involves "higher-order probabilities" or uncertainties about a generating process. For example, say I have a coin in my hand manufactured by a crafter gambler, who with $0.5$ probability made a 60%-heads coin, and with $0.5$ probability made a 40%-heads coin. My best guess is a 50% chance that the coin comes up heads, but with big error bars: the "true" chance is either 40% or 60%.



              In other words, you can imagine running the experiment a billion times and taking the fraction of successes $X/Y$ (actually the limiting fraction). It makes sense, at least from a Bayesian perspective, to give e.g. a 95% confidence interval around that number. In the above example, given current knowledge, this is $[0.4,0.6]$. For a real coin, maybe it is $[0.47,0.53]$ or something. For more, see:



              Do We Need Higher-Order Probabilities and, If So, What Do They Mean?
              Judea Pearl. UAI 1987. https://arxiv.org/abs/1304.2716






              share|cite|improve this answer









              $endgroup$



              I know of two interpretations. The first was said by Tim: We have observed $X$ successes out of $Y$ trials, so if we believe the trials were i.i.d. we can estimate the probability of the process at $X/Y$ with some error bars, e.g. of order $1/sqrt{Y}$.



              The second involves "higher-order probabilities" or uncertainties about a generating process. For example, say I have a coin in my hand manufactured by a crafter gambler, who with $0.5$ probability made a 60%-heads coin, and with $0.5$ probability made a 40%-heads coin. My best guess is a 50% chance that the coin comes up heads, but with big error bars: the "true" chance is either 40% or 60%.



              In other words, you can imagine running the experiment a billion times and taking the fraction of successes $X/Y$ (actually the limiting fraction). It makes sense, at least from a Bayesian perspective, to give e.g. a 95% confidence interval around that number. In the above example, given current knowledge, this is $[0.4,0.6]$. For a real coin, maybe it is $[0.47,0.53]$ or something. For more, see:



              Do We Need Higher-Order Probabilities and, If So, What Do They Mean?
              Judea Pearl. UAI 1987. https://arxiv.org/abs/1304.2716







              share|cite|improve this answer












              share|cite|improve this answer



              share|cite|improve this answer










              answered 7 hours ago









              usulusul

              1272




              1272























                  1












                  $begingroup$

                  All measurements are uncertain.



                  Therefore, any measurement of probability is also uncertain.



                  This uncertainty on the measurement of probability can be visually represented with an uncertainty bar. Note that uncertainty bars are often referred to as error bars. This is incorrect or at least misleading, because it shows uncertainty and not error (the error is the difference between the measurement and the unknown truth, so the error is unknown; the uncertainty is a measure of the width of the probability density after taking the measurement).



                  A related topic is meta-uncertainty. Uncertainty describes the width of an a posteriori probability distribution function, and in case of a Type A uncertainty (uncertainty estimated by repeated measurements), there is inevitable an uncertainty on the uncertainty; metrologists have told me that metrological practice dictates to expand the uncertainty in this case (IIRC, if uncertainty is estimated by the standard deviation of N repeated measurements, one should multiply the resulting standard deviation by $frac{N}{N-2}$), which is essentially a meta-uncertainty.






                  share|cite|improve this answer











                  $endgroup$


















answered 7 hours ago by gerrit (edited 7 hours ago)




                          $begingroup$

Error bars are shorthand for a probability distribution. So error bars on error bars are shorthand for a probability distribution over a probability distribution. That entails assigning probabilities to probabilities, so this is what we need to study.



                          Following John Maynard Keynes, a probability is the degree of belief that we are justified in entertaining in one proposition, in relation to another. Now there are many pairs of intelligible propositions for which we don't know how to assign a numerical probability. But for my part, I cannot think of a pair of plain propositions (not themselves containing a probability phrase) which yield a numerically quantified uncertain probability. If we try to generate one, it always entails an ordinary probability. For example, suppose that our given proposition is
$$
\mathcal{I} = \unicode{8220}\text{die 1 has 6 sides, and die 2 has 7 sides; one of these dice is rolled}\unicode{8221}
$$

and that the proposition we're assessing against $\mathcal{I}$ is
$$
\mathcal{A} = \unicode{8220}\text{the number showing is 3}\unicode{8221}
$$

Then it's tempting to set
$$
\mathrm{prob}\left(\mathrm{prob}(\mathcal{A} | \mathcal{I}) = p \,|\, \mathcal{I}\right) = \begin{cases}
\frac{1}{2} & p = \frac{1}{6} \\
\frac{1}{2} & p = \frac{1}{7}
\end{cases}
$$

which attempts to assign a probability distribution to a probability. Quite apart from the confusion introduced by the nested probability phrases, this cannot be correct, because $\mathrm{prob}(\mathcal{A} | \mathcal{I})$ is deducible and therefore not subject to uncertainty. Let
$$
\mathcal{D}(d) \iff \unicode{8220}\text{die $d$ is rolled}\unicode{8221}
$$

Then
\begin{align}
\mathrm{prob}(\mathcal{A} | \mathcal{I}) &= \sum_d \mathrm{prob}(\mathcal{A}, \mathcal{D}(d) | \mathcal{I}) \\
&= \sum_d \mathrm{prob}(\mathcal{A} | \mathcal{D}(d), \mathcal{I}) \: \mathrm{prob}(\mathcal{D}(d) | \mathcal{I}) \\
&= \sum_d \mathrm{prob}(\mathcal{A} | \mathcal{D}(d), \mathcal{I}) \cdot \frac{1}{2} \\
&= \frac{1}{2} \left( \frac{1}{6} + \frac{1}{7} \right) \\
&= \frac{13}{84}
\end{align}



                          Nested probabilities, in short, cannot be sustained, and neither, therefore, can nested error bars.
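As a quick numerical check of the deduction above (a sketch added here, not part of the original answer; the dice setup is the one defined by $\mathcal{I}$), the same $13/84$ falls out both exactly and by simulation:

    # Sketch: verify prob(A | I) = 13/84 for the two-dice setup above.
    from fractions import Fraction
    import random

    # Exact: each die is rolled with probability 1/2, and a 3 shows with
    # probability 1/6 or 1/7 depending on which die was rolled.
    p_exact = Fraction(1, 2) * (Fraction(1, 6) + Fraction(1, 7))
    print(p_exact)                 # 13/84

    # Monte Carlo under the same assumptions.
    random.seed(0)
    trials = 1_000_000
    hits = sum(random.randint(1, random.choice((6, 7))) == 3 for _ in range(trials))
    print(hits / trials)           # close to 13/84 ≈ 0.1548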






                          $endgroup$


















answered 1 hour ago by CarbonFlambe




                                  $begingroup$

There are often occasions where you want a probability of a probability. Say, for instance, you work in food safety and use a survival analysis model to estimate the probability that botulinum spores will germinate (and thus produce the deadly toxin) as a function of the food preparation steps (e.g. cooking) and the incubation time/temperature (cf. paper). Food producers may then want to use that model to set safe "use-by" dates so that the consumers' risk of botulism is appropriately small. However, the model is fit to a finite training sample, so rather than picking a use-by date for which the estimated probability of germination is less than, say, 0.001, you might want to choose an earlier date for which (given the modelling assumptions) you can be 95% sure the probability of germination is less than 0.001. This seems a fairly natural thing to do in a Bayesian setting.
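A hedged sketch of that idea (the counts, the conjugate Beta prior, and the 0.001/95% figures below are illustrative stand-ins, not the paper's actual survival model): with a Bayesian posterior over the germination probability, one can check whether the chosen date leaves us 95% sure that the probability is below the target.

    # Sketch: "95% sure the germination probability is below 0.001",
    # using an invented binomial experiment and a conjugate Beta prior.
    from scipy.stats import beta

    n, k = 3000, 1                   # hypothetical: n units incubated, k germinated
    prior_a, prior_b = 1, 1          # flat Beta(1, 1) prior

    posterior = beta(prior_a + k, prior_b + n - k)

    p_mean = posterior.mean()        # point estimate of the germination probability
    p_upper95 = posterior.ppf(0.95)  # value we are 95% sure the probability lies below

    print(f"posterior mean          : {p_mean:.5f}")
    print(f"95% credible upper bound: {p_upper95:.5f}")
    print(f"95% sure p < 0.001?     : {p_upper95 < 0.001}")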






                                  $endgroup$


















answered 1 hour ago by Dikran Marsupial



