Do error bars on probabilities have any meaning?
People often say some event has a 50-60% chance of happening. Sometimes I will even see people give explicit error bars on probability assignments. Do these statements have any meaning, or are they just a linguistic quirk born of discomfort with choosing a specific number for something that is inherently unknowable?

Tags: probability, error

asked 20 hours ago by mahnamahna (new contributor); edited 6 hours ago by Ferdi
Comment: Doesn't the Probably Approximately Correct framework in computational learning theory do just that, typically giving a bound on the error rate of a classifier that holds with probability $1-\delta$? If it were a meaningless concept, I doubt those (extremely clever) CoLT people would have failed to spot it! – Dikran Marsupial, 1 hour ago
6 Answers
$begingroup$
It wouldn't make sense if you were talking about known probabilities, e.g. with fair coin the probability of throwing heads is 0.5 by the laws of physics. However, unless you are talking about textbook example, the exact probability is never known, we only know it approximately.
The different story is when you estimate the probabilities from the data, e.g. you observed 13 winning tickets among the 12563 tickets you bought, so from this data you estimate the probability to be 13/12563. This is something you estimated from the sample, so it is uncertain, because with different sample you could observe different value. The uncertainty estimate is not about the probability, but around the estimate of it.
Another example would be when the probability is not fixed, but depends on other factors. Say that we are talking about probability of dying in car accident. We can consider "global" probability, single value that is marginalized over all the factors that directly and indirectly lead to car accidents. On another hand, you can consider how the probabilities vary among the population given the risk factors.
You can find many more examples where probabilities themselves are considered as random variables, so they vary rather then being fixed.
$endgroup$
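To make the lottery-ticket example concrete, here is a rough Python sketch of one way to put an interval around such an estimate. The 13/12563 counts come from the answer; the choice of a simple normal-approximation (Wald) interval is just one convenient option among several (Wilson or Clopper-Pearson would behave better this close to zero):

```python
import math

# Observed data from the example: 13 winning tickets out of 12563 bought.
wins, n = 13, 12563
p_hat = wins / n  # point estimate of the winning probability

# Normal-approximation 95% interval around the *estimate* of the
# probability; a different sample of the same size would move p_hat
# by roughly this much.
se = math.sqrt(p_hat * (1 - p_hat) / n)
lo, hi = p_hat - 1.96 * se, p_hat + 1.96 * se

print(f"estimate = {p_hat:.5f}, 95% CI = [{lo:.5f}, {hi:.5f}]")
# estimate = 0.00103, 95% CI = [0.00047, 0.00160] (approximately)
```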
Comments:

If the calculation of a probability estimate was done through something like a logistic regression, wouldn't it also be natural to assume that these "error bars" refer to prediction intervals? (I am asking mostly as a clarification of the first point you raise; +1 obviously.) – usεr11852, 16 hours ago

@usεr11852 Confidence intervals, prediction intervals, highest density regions, etc., depending on the actual case. I made the answer very broad, since we have "varying" probabilities in many scenarios and they vary in different ways. You can also interpret them differently in different scenarios. – Tim♦, 16 hours ago

Even "known" probabilities can be shorthand for very small error bars. One could presumably show that a coin flip is perhaps 50.00001%-49.99999% with enough trials to get error bars small enough to exclude 50.00000%. There's no physical law suggesting the odds should be precisely even for an asymmetrical coin, but the error bars are far too small for anyone to care. – Nuclear Wang, 10 hours ago

@NuclearWang This is accounted for by the OP's use of the phrase "fair coin". By definition, P(heads) for a fair coin is 0.5; a fair coin is a mathematical construct. I would suggest an edit replacing "by the laws of physics" with "by definition" to emphasize this point. – De Novo, 9 hours ago

@DeNovo The same applies to physical coins (stat.columbia.edu/~gelman/research/published/diceRev2.pdf), but yes, I said "fair" so as not to start this discussion. – Tim♦, 9 hours ago

(8 more comments not shown.)
A most relevant illustration from xkcd (image not reproduced here), with associated caption:

"...an effect size of 1.68 (95% CI: 1.56 (95% CI: 1.52 (95% CI: 1.504 (95% CI: 1.494 (95% CI: 1.488 (95% CI: 1.485 (95% CI: 1.482 (95% CI: 1.481 (95% CI: 1.4799 (95% CI: 1.4791 (95% CI: 1.4784..."

answered 19 hours ago – Xi'an; edited 9 hours ago by Stephan Kolassa

Comments:

Does this imply that error bars on probabilities are redundant? – BalinKingOfMoria, 13 hours ago

Joking apart, this means that the precision of the error bars is uncertain, and that the evaluation of the uncertainty is itself uncertain, in an infinite regress. – Xi'an, 9 hours ago

Which is why I deem the picture relevant and deeply connected with the fundamental difficulty (and beautiful challenge) of assessing errors in statistics. – Xi'an, 7 hours ago

That figure illustrates meta-uncertainty, which may be related to an uncertainty on a probability, since uncertainty itself is a measure of the width of a probability distribution; but your post does not explain this in any way. In fact the xkcd comic suggests it has something to do with error propagation (which is false), which the question does not. – gerrit, 7 hours ago
I know of two interpretations. The first is the one Tim gave: we have observed $X$ successes out of $Y$ trials, so if we believe the trials were i.i.d. we can estimate the probability of the process at $X/Y$, with some error bars, e.g. of order $1/\sqrt{Y}$.

The second involves "higher-order probabilities", or uncertainty about the generating process. For example, say I have a coin in my hand manufactured by a crafty gambler who, with probability $0.5$, made a 60%-heads coin, and with probability $0.5$ made a 40%-heads coin. My best guess is a 50% chance that the coin comes up heads, but with big error bars: the "true" chance is either 40% or 60%.

In other words, you can imagine running the experiment a billion times and taking the fraction of successes $X/Y$ (actually the limiting fraction). It makes sense, at least from a Bayesian perspective, to give e.g. a 95% confidence interval around that number. In the above example, given current knowledge, this is $[0.4, 0.6]$. For a real coin, maybe it is $[0.47, 0.53]$ or something. For more, see:

Judea Pearl, "Do We Need Higher-Order Probabilities and, If So, What Do They Mean?", UAI 1987. https://arxiv.org/abs/1304.2716

answered 7 hours ago – usul
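A quick simulation makes the gambler's-coin example tangible. The sketch below is illustrative only (the flip count is arbitrary): each run first draws the hidden coin, so the long-run fraction of heads concentrates near 0.4 or 0.6, never near the marginal best guess of 0.5.

```python
import random

def limiting_fraction(n_flips: int = 100_000) -> float:
    """Approximate the limiting fraction of heads for one manufactured coin."""
    p = random.choice([0.4, 0.6])  # the gambler's hidden 50/50 choice
    heads = sum(random.random() < p for _ in range(n_flips))
    return heads / n_flips

# Five independent coins: each result hugs 0.4 or 0.6, reflecting the
# "higher-order" uncertainty rather than averaging out to 0.5.
print([round(limiting_fraction(), 3) for _ in range(5)])
```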
All measurements are uncertain. Therefore, any measurement of probability is also uncertain.

This uncertainty on the measurement of probability can be visually represented with an uncertainty bar. Note that uncertainty bars are often referred to as error bars. This is incorrect, or at least misleading, because the bar shows uncertainty and not error: the error is the difference between the measurement and the unknown truth, so the error is unknown, whereas the uncertainty is a measure of the width of the probability density after taking the measurement.

A related topic is meta-uncertainty. Uncertainty describes the width of an a posteriori probability distribution function, and in the case of a Type A uncertainty (uncertainty estimated by repeated measurements), there is inevitably an uncertainty on the uncertainty. Metrologists have told me that metrological practice dictates expanding the uncertainty in this case (IIRC, if the uncertainty is estimated by the standard deviation of $N$ repeated measurements, one should multiply the resulting standard deviation by $\frac{N}{N-2}$), which is essentially a meta-uncertainty.

answered 7 hours ago, edited 7 hours ago – gerrit
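As a minimal sketch of a Type A evaluation along the lines the answer describes (the readings below are hypothetical, and the $\frac{N}{N-2}$ expansion factor is only as reliable as the answer's "IIRC"):

```python
import statistics

# Hypothetical repeated measurements of the same quantity.
readings = [9.98, 10.02, 10.01, 9.97, 10.03, 10.00]
n = len(readings)

u = statistics.stdev(readings)  # Type A uncertainty: sample standard deviation
u_expanded = u * n / (n - 2)    # expansion factor recalled in the answer

print(f"mean = {statistics.mean(readings):.4f}")
print(f"u = {u:.4f}, expanded u = {u_expanded:.4f}")
```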
Error bars are shorthand for a probability distribution. So error bars on error bars are shorthand for a probability distribution over a probability distribution. That entails assigning probabilities to probabilities, so this is what we need to study.

Following John Maynard Keynes, a probability is the degree of belief that we are justified in entertaining in one proposition in relation to another. Now, there are many pairs of intelligible propositions for which we don't know how to assign a numerical probability. But for my part, I cannot think of a pair of plain propositions (not themselves containing a probability phrase) which yields a numerically quantified uncertain probability. If we try to generate one, it always reduces to an ordinary probability. For example, suppose that our given proposition is
$$
\mathcal{I} = \text{“die 1 has 6 sides, and die 2 has 7 sides; one of these dice is rolled”}
$$
and that the proposition we're assessing against $\mathcal{I}$ is
$$
\mathcal{A} = \text{“the number showing is 3”}
$$
Then it's tempting to set
$$
\mathrm{prob}\left(\mathrm{prob}(\mathcal{A} \mid \mathcal{I}) = p \mid \mathcal{I}\right) = \begin{cases}
\frac{1}{2} & p = \frac{1}{6} \\
\frac{1}{2} & p = \frac{1}{7}
\end{cases}
$$
which attempts to assign a probability distribution to a probability. Quite apart from the confusion introduced by the nested probability phrases, this cannot be correct, because $\mathrm{prob}(\mathcal{A} \mid \mathcal{I})$ is deducible and therefore not subject to uncertainty. Let
$$
\mathcal{D}(d) \iff \text{“die $d$ is rolled”}
$$
Then
\begin{align}
\mathrm{prob}(\mathcal{A} \mid \mathcal{I}) &= \sum_d \mathrm{prob}(\mathcal{A}, \mathcal{D}(d) \mid \mathcal{I}) \\
&= \sum_d \mathrm{prob}(\mathcal{A} \mid \mathcal{D}(d), \mathcal{I}) \, \mathrm{prob}(\mathcal{D}(d) \mid \mathcal{I}) \\
&= \sum_d \mathrm{prob}(\mathcal{A} \mid \mathcal{D}(d), \mathcal{I}) \cdot \frac{1}{2} \\
&= \frac{1}{2} \left( \frac{1}{6} + \frac{1}{7} \right) \\
&= \frac{13}{84}
\end{align}
Nested probabilities, in short, cannot be sustained, and neither, therefore, can nested error bars.

answered 1 hour ago – CarbonFlambe
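The marginalization above can be checked mechanically; here is a tiny sketch using exact rational arithmetic:

```python
from fractions import Fraction

# prob(A | D(d), I): probability of showing a 3, given which die is rolled.
prob_A_given_die = {1: Fraction(1, 6), 2: Fraction(1, 7)}
prob_die = Fraction(1, 2)  # prob(D(d) | I): each die equally likely

# Marginalize over the dice, exactly as in the derivation.
prob_A = sum(p * prob_die for p in prob_A_given_die.values())
print(prob_A)  # 13/84 -- deducible, hence no uncertainty left to quantify
```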
There are often occasions where you want a probability of a probability. Say, for instance, you worked in food safety and used a survival analysis model to estimate the probability that botulinum spores would germinate (and thus produce the deadly toxin) as a function of the food preparation steps (i.e. cooking) and incubation time/temperature (cf. paper). Food producers may then want to use that model to set safe "use-by" dates so that the consumer's risk of botulism is appropriately small. However, the model is fit to a finite training sample, so rather than picking a use-by date for which the probability of germination is less than, say, 0.001, you might want to choose an earlier date for which (given the modelling assumptions) you could be 95% sure the probability of germination is less than 0.001. This seems a fairly natural thing to do in a Bayesian setting.

answered 1 hour ago – Dikran Marsupial
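A minimal Bayesian sketch of that last step, under stated assumptions: a uniform Beta(1, 1) prior, hypothetical trial counts, and a plain binomial likelihood standing in for the survival model in the answer. The decision compares the 95% upper credible bound, not the point estimate, against the 0.001 safety target.

```python
from scipy.stats import beta

# Hypothetical data for one candidate use-by date: k germinations in n trials.
k, n = 0, 3000
posterior = beta(1 + k, 1 + n - k)  # Beta posterior under a Beta(1, 1) prior

upper95 = posterior.ppf(0.95)  # 95% upper credible bound on germination prob.
print(f"posterior mean = {posterior.mean():.6f}, 95% upper bound = {upper95:.6f}")
print("date acceptable at the 0.001 target?", upper95 < 0.001)
```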