Chapter 5: Conditional Probability
5.1 What Changes When You Have More Information
Probability depends on what you know.
Example: rolling a die
A die has been rolled, but we have not yet seen the outcome.
Q: What is the probability that the result is 6?
Answer: $\dfrac{1}{6}$
Now suppose someone says, “It is even.”
Q: What is the probability that the result is 6?
The even faces are $\{2, 4, 6\}$ — three outcomes. The face 6 is one of them.
Answer: $\dfrac{1}{3}$
The probability changed because we received additional information.
Note: in this example the probability went up, but information can also decrease the probability or leave it unchanged. For instance, learning “the result is odd” lowers the probability of getting 6 from $\dfrac{1}{6}$ to $0$.
5.2 Definition of Conditional Probability
The probability that A occurs given that B has occurred is called the conditional probability of A given B, written $P(A|B)$. For $P(B) > 0$ it is defined by
$$P(A|B) = \dfrac{P(A \cap B)}{P(B)}.$$
A way to remember it
“Once we know B has occurred,
the denominator is the probability of B (the new whole),
and the numerator is the probability that A also occurs within B.”
A concrete example: the die in a Venn diagram
Restating the previous example as a Venn diagram gives Figure 5.2. After we learn that B has occurred, only the green disc (B) matters; the conditional probability is the proportion of B in which A also occurs.
How to read the diagram
- Blue disc (A): rolling a 6, i.e., $\{6\}$
- Green disc (B): even faces, i.e., $\{2, 4, 6\}$
- Overlap (A∩B): rolling a 6 (which is even), i.e., $\{6\}$
- P(A|B) = 1/3: among even outcomes (B), the probability of getting 6 (A)
Important: once we condition on B, the green disc (B) becomes the new sample space. The conditional probability is the relative size of the overlap inside B.
Verification with formulas
- $A$: rolling 6 → $P(A) = \dfrac{1}{6}$
- $B$: rolling an even number → $P(B) = \dfrac{3}{6} = \dfrac{1}{2}$
- $A \cap B$: rolling 6 (which is even) → $P(A \cap B) = \dfrac{1}{6}$
Hence
$$P(A|B) = \dfrac{P(A \cap B)}{P(B)} = \dfrac{1/6}{1/2} = \dfrac{1}{3}.$$
The information “the result is even” raised the probability of rolling a 6 from $\dfrac{1}{6}$ to $\dfrac{1}{3}$.
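For readers who want to verify this by computer, here is a minimal Python sketch (our illustration, not part of the original example) that enumerates the sample space and recovers $P(A|B)$ exactly:

```python
from fractions import Fraction

# A fair die: six equally likely faces.
faces = [1, 2, 3, 4, 5, 6]

A = {6}                                  # event A: the result is 6
B = {f for f in faces if f % 2 == 0}     # event B: the result is even

p_B = Fraction(len(B), len(faces))       # P(B) = 1/2
p_AB = Fraction(len(A & B), len(faces))  # P(A ∩ B) = 1/6
print(p_AB / p_B)                        # P(A|B) = 1/3
```

Using `Fraction` keeps every probability exact, so the output is literally `1/3` rather than a rounded decimal.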
5.3 Worked Examples
Example 1: two dice
Two dice are rolled. We are told the sum is at least 8. What is the probability that the sum is exactly 10?
Solution
- $B$: sum is at least 8
- $A$: sum is 10
Count the favourable outcomes for each event.
Outcomes with sum at least 8
- Sum 8: $(2,6),(3,5),(4,4),(5,3),(6,2)$ → 5 outcomes
- Sum 9: $(3,6),(4,5),(5,4),(6,3)$ → 4 outcomes
- Sum 10: $(4,6),(5,5),(6,4)$ → 3 outcomes
- Sum 11: $(5,6),(6,5)$ → 2 outcomes
- Sum 12: $(6,6)$ → 1 outcome
Total: $5+4+3+2+1 = 15$ outcomes.
Outcomes with sum 10: 3 outcomes.
$$P(A|B) = \dfrac{3}{15} = \dfrac{1}{5}$$
Example 2: a deck of cards
Draw one card from a standard 52-card deck. Given that the card is a face card (J, Q, or K), what is the probability that it is a heart?
Solution
- $B$: face card → 12 cards
- $A$: heart → 13 cards
- $A \cap B$: heart face card → 3 cards
Hence
$$P(A|B) = \dfrac{P(A \cap B)}{P(B)} = \dfrac{3/52}{12/52} = \dfrac{3}{12} = \dfrac{1}{4}.$$
Answer: $\dfrac{1}{4}$.
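Both worked examples can be verified by enumerating the equally likely outcomes; a minimal Python sketch (the event encodings are ours):

```python
from fractions import Fraction
from itertools import product

# Example 1: two dice, P(sum = 10 | sum >= 8).
rolls = list(product(range(1, 7), repeat=2))          # 36 equally likely pairs
at_least_8 = [r for r in rolls if sum(r) >= 8]        # 15 outcomes
exactly_10 = [r for r in at_least_8 if sum(r) == 10]  # 3 of them
print(Fraction(len(exactly_10), len(at_least_8)))     # 1/5

# Example 2: one card, P(heart | face card).
suits = ["hearts", "spades", "diamonds", "clubs"]
ranks = list(range(2, 11)) + ["J", "Q", "K", "A"]
deck = list(product(suits, ranks))                    # 52 cards
face_cards = [c for c in deck if c[1] in ("J", "Q", "K")]  # 12 cards
heart_faces = [c for c in face_cards if c[0] == "hearts"]  # 3 cards
print(Fraction(len(heart_faces), len(face_cards)))    # 1/4
```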
5.4 The Multiplication Rule
Rearranging the definition of conditional probability gives:
When $P(B) > 0$,
$$P(A \cap B) = P(B) \times P(A|B).$$
Likewise, when $P(A) > 0$,
$$P(A \cap B) = P(A) \times P(B|A).$$
By definition,
$$P(A|B) = \dfrac{P(A \cap B)}{P(B)}.$$
Multiplying both sides by $P(B)$,
$$P(B) \times P(A|B) = P(A \cap B).$$
The second identity follows analogously from $P(B|A) = \dfrac{P(A \cap B)}{P(A)}$.
In words: “the probability that A and B both occur” equals “the probability of B” times “the probability of A given B.”
Example: drawing lottery tickets
A box has 10 tickets, of which 3 are winners. Two tickets are drawn in succession (without replacement). What is the probability that both are winners?
Solution
- Probability the 1st ticket wins: $\dfrac{3}{10}$
- Given the 1st was a winner, probability the 2nd also wins: $\dfrac{2}{9}$
By the multiplication rule,
$$P(\text{both win}) = \dfrac{3}{10} \times \dfrac{2}{9} = \dfrac{6}{90} = \dfrac{1}{15}.$$
Answer: $\dfrac{1}{15}$.
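As a sanity check, the multiplication rule can be confirmed by simulating the two draws; a sketch (the helper name `both_winners` is ours):

```python
import random

def both_winners(trials=100_000):
    """Estimate P(both tickets win) when drawing 2 of 10 without replacement."""
    tickets = ["win"] * 3 + ["lose"] * 7
    hits = 0
    for _ in range(trials):
        first, second = random.sample(tickets, 2)  # draw without replacement
        hits += first == "win" and second == "win"
    return hits / trials

print(both_winners())  # ≈ 1/15 ≈ 0.067
```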
5.5 The Law of Total Probability
Suppose the sample space splits into mutually exclusive cases. Formally, let $B_1, B_2, \ldots, B_n$ be a partition of the sample space $S$ (mutually exclusive events whose union is $S$) such that $P(B_i) > 0$ for every $i$. Then
$$P(A) = \sum_{i=1}^{n} P(B_i)P(A|B_i) = P(B_1)P(A|B_1) + P(B_2)P(A|B_2) + \cdots + P(B_n)P(A|B_n).$$
Since $B_1, B_2, \ldots, B_n$ partition $S$,
$$S = B_1 \cup B_2 \cup \cdots \cup B_n, \quad B_i \cap B_j = \emptyset \; (i \neq j).$$
Decomposing $A$ along the partition,
$$A = A \cap S = A \cap (B_1 \cup B_2 \cup \cdots \cup B_n) = (A \cap B_1) \cup (A \cap B_2) \cup \cdots \cup (A \cap B_n).$$
Because the $B_i$ are mutually exclusive, so are the $A \cap B_i$.
By additivity,
$$P(A) = P(A \cap B_1) + P(A \cap B_2) + \cdots + P(A \cap B_n).$$
Applying the multiplication rule $P(A \cap B_i) = P(B_i)P(A|B_i)$ to each term,
$$P(A) = P(B_1)P(A|B_1) + P(B_2)P(A|B_2) + \cdots + P(B_n)P(A|B_n).$$
Example: two bags
Bag 1 contains 3 red and 2 white balls. Bag 2 contains 4 red and 6 white balls.
A coin is tossed: heads selects bag 1, tails selects bag 2. One ball is then drawn from the chosen bag.
What is the probability that the drawn ball is red?
Solution
- $B_1$: bag 1 chosen → $P(B_1) = \dfrac{1}{2}$
- $B_2$: bag 2 chosen → $P(B_2) = \dfrac{1}{2}$
- $P(\text{red}|B_1) = \dfrac{3}{5}$
- $P(\text{red}|B_2) = \dfrac{4}{10} = \dfrac{2}{5}$
By the law of total probability,
$$P(\text{red}) = \dfrac{1}{2} \times \dfrac{3}{5} + \dfrac{1}{2} \times \dfrac{2}{5} = \dfrac{3}{10} + \dfrac{2}{10} = \dfrac{1}{2}.$$
Answer: $\dfrac{1}{2}$.
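The same answer falls out of simulating the coin-then-draw experiment directly; a sketch under the stated setup:

```python
import random

def draw_red(trials=100_000):
    """Estimate P(red) for the coin-then-bag experiment."""
    bag1 = ["red"] * 3 + ["white"] * 2   # bag 1: 3 red, 2 white
    bag2 = ["red"] * 4 + ["white"] * 6   # bag 2: 4 red, 6 white
    reds = 0
    for _ in range(trials):
        bag = bag1 if random.random() < 0.5 else bag2  # fair coin picks the bag
        reds += random.choice(bag) == "red"
    return reds / trials

print(draw_red())  # ≈ 0.5, matching the law of total probability
```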
5.6 Bayes' Theorem
Given that A occurred, Bayes' theorem gives the probability that the cause was $B_i$.
Let $B_1, B_2, \ldots, B_n$ be a partition of the sample space $S$ with $P(B_i) > 0$ for every $i$, and let $P(A) > 0$. Then
$$P(B_i|A) = \dfrac{P(B_i)P(A|B_i)}{P(A)} = \dfrac{P(B_i)P(A|B_i)}{\sum_{j=1}^{n} P(B_j)P(A|B_j)}.$$
How to read the formula: the numerator is $P(B_i)$ (prior) × $P(A|B_i)$ (likelihood). The denominator $P(A)$ is exactly the law of total probability of §5.5, summing the same product $P(B_j)P(A|B_j)$ over $B_1, \ldots, B_n$. In other words, Bayes' theorem is just the multiplication rule combined with the law of total probability.
By the definition of conditional probability,
$$P(B_i|A) = \dfrac{P(A \cap B_i)}{P(A)}.$$
The multiplication rule gives $P(A \cap B_i) = P(B_i)P(A|B_i)$, so
$$P(B_i|A) = \dfrac{P(B_i)P(A|B_i)}{P(A)}.$$
Substituting the law of total probability for $P(A)$,
$$P(B_i|A) = \dfrac{P(B_i)P(A|B_i)}{\sum_{j=1}^{n} P(B_j)P(A|B_j)}.$$
- $P(B_i)$: prior probability (before observing $A$)
- $P(B_i|A)$: posterior probability (after observing $A$)
- $P(A|B_i)$: likelihood (probability of $A$ assuming $B_i$)
Bayes' theorem updates beliefs about a cause $B_i$ in light of the observed result $A$.
Mental picture of Bayes' theorem
“See the result, update the cause.”
Example: a positive test result → how likely is the disease really?
- Prior: prevalence of the disease before the test
- Posterior: probability after taking the test result into account
Example: defective products
Two factories produce a part:
- Factory A produces 60% of the parts and has a 2% defect rate.
- Factory B produces 40% of the parts and has a 5% defect rate.
A given part is defective. What is the probability it was made at factory A?
Solution
- $P(A) = 0.6$ (made at factory A)
- $P(B) = 0.4$ (made at factory B)
- $P(\text{defective}|A) = 0.02$
- $P(\text{defective}|B) = 0.05$
First, compute the probability of a defective part using the law of total probability:
$$P(\text{defective}) = 0.6 \times 0.02 + 0.4 \times 0.05 = 0.012 + 0.02 = 0.032.$$
By Bayes' theorem,
$$P(A|\text{defective}) = \dfrac{0.6 \times 0.02}{0.032} = \dfrac{0.012}{0.032} = \dfrac{12}{32} = \dfrac{3}{8} = 0.375.$$
The probability the part came from factory A is 37.5%.
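The two-step pattern — law of total probability in the denominator, multiplication rule in the numerator — translates directly into code. A small Python helper (the name `posterior` is ours) that reproduces the 37.5%:

```python
def posterior(priors, likelihoods):
    """Bayes' theorem over a finite partition.

    priors[i]      = P(B_i)
    likelihoods[i] = P(A | B_i)
    Returns the posteriors P(B_i | A).
    """
    joint = [p * l for p, l in zip(priors, likelihoods)]  # P(B_i) P(A|B_i)
    total = sum(joint)                                    # P(A) by total probability
    return [j / total for j in joint]

# Factory A: 60% of parts, 2% defective; factory B: 40%, 5% defective.
print(posterior([0.6, 0.4], [0.02, 0.05]))  # [0.375, 0.625]
```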
Example: DNA evidence and miscarriages of justice
A real-world misuse
In past trials it has been argued that “the DNA matched at 99.9%, so the suspect is guilty.” But applied correctly, Bayes' theorem can show that the actual probability of guilt is far smaller. Knowing this could have prevented wrongful convictions.
Suppose a crime has occurred. The suspect's DNA is compared with evidence from the scene. The DNA test has the following accuracy:
- If the person is the perpetrator, the test matches with probability 99.9% (sensitivity).
- For an unrelated person, the test matches by chance with probability 0.1% (false-positive rate).
The region has 1,000,000 residents, exactly one of whom is the perpetrator. Given that the test matched, what is the probability that the suspect is the actual perpetrator?
Common intuitive answer: “A 99.9% match means a 99.9% probability of guilt!”
However, the correct calculation says otherwise.
Bayesian computation
- $B_1$: the suspect is the perpetrator → $P(B_1) = \dfrac{1}{1{,}000{,}000}$ (prior)
- $B_2$: the suspect is unrelated → $P(B_2) = \dfrac{999{,}999}{1{,}000{,}000}$
- $A$: DNA test matches
- $P(A|B_1) = 0.999$ (the perpetrator's DNA matches)
- $P(A|B_2) = 0.001$ (an unrelated person's DNA matches by chance)
By Bayes' theorem, the probability that the suspect is truly the perpetrator given that the DNA matched is
$$P(B_1|A) = \dfrac{P(B_1) \cdot P(A|B_1)}{P(B_1) \cdot P(A|B_1) + P(B_2) \cdot P(A|B_2)}.$$
Plugging in the numbers,
$$P(B_1|A) = \dfrac{\dfrac{1}{1{,}000{,}000} \times 0.999}{\dfrac{1}{1{,}000{,}000} \times 0.999 + \dfrac{999{,}999}{1{,}000{,}000} \times 0.001}.$$
Multiplying numerator and denominator by $1{,}000{,}000$,
$$= \dfrac{0.999}{0.999 + 999.999} = \dfrac{0.999}{1000.998} \approx 0.000998 \approx 0.1\%.$$
Answer: approximately 0.1% (about 1 in 1000).
Why isn't “DNA matches” the same as “99.9% guilty”?
Out of 1,000,000 residents:
- The 1 perpetrator: matches with probability 99.9% → about 1 person.
- The 999,999 unrelated residents: each matches with probability 0.1% → about 1,000 people.
So about 1,001 people would match. Of those, only one is the actual perpetrator.
A DNA match alone therefore identifies the true perpetrator with probability of only about 1 in 1001 — roughly 0.1%.
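This frequency argument takes only a few lines to reproduce; a sketch with the chapter's numbers:

```python
population = 1_000_000
sensitivity = 0.999      # P(match | perpetrator)
false_positive = 0.001   # P(match | unrelated person)

true_matches = 1 * sensitivity                      # ≈ 1 person
false_matches = (population - 1) * false_positive   # ≈ 1000 people

# Among everyone expected to match, the fraction who are the real perpetrator:
print(true_matches / (true_matches + false_matches))  # ≈ 0.000998, about 0.1%
```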
Take-aways
- Test accuracy alone is not enough: “99.9% match” describes the test, not the probability of guilt.
- The prior (base rate) matters: a prior of 1 in 1,000,000 cannot be ignored.
- Count false positives in absolute terms: even at high accuracy, false positives accumulate when the population is large.
- Combine with other evidence: a DNA match should be weighed alongside alibi, motive, and other corroborating evidence.
Implications for trials
If additional evidence (eyewitness testimony, motive, presence at the scene) raises the prior far above 1 in 1,000,000, a DNA match increases the probability of guilt substantially. Without other evidence, however, a DNA match alone is insufficient to support a conviction. Understanding Bayes' theorem allows evidence to be evaluated correctly.
Confusing the conditional probability of a match with the probability of guilt — ignoring the prior (base rate) — is a classic statistical mistake at the intersection of probability and law, known as the prosecutor's fallacy (a special case of the base rate fallacy).
5.7 Independent Events
Two events $A$ and $B$ are independent if the occurrence of one does not change the probability of the other.
Definition: independent events
Two events $A$ and $B$ are independent if
$$P(A \cap B) = P(A) \times P(B).$$
When $P(A) > 0$ and $P(B) > 0$, the following are equivalent:
- $P(A \cap B) = P(A) \times P(B)$
- $P(B|A) = P(B)$
- $P(A|B) = P(A)$
(1) ⇔ (2):
By the definition of conditional probability,
$$P(B|A) = \dfrac{P(A \cap B)}{P(A)}.$$
Hence $P(B|A) = P(B)$ is equivalent to
$$\dfrac{P(A \cap B)}{P(A)} = P(B) \iff P(A \cap B) = P(A) \times P(B).$$
(1) ⇔ (3) follows analogously.
Example: tossing a coin twice
The outcome of the first toss and the outcome of the second toss are independent.
$$P(\text{both heads}) = \dfrac{1}{2} \times \dfrac{1}{2} = \dfrac{1}{4}.$$
A dependent example: drawing tickets without replacement
A box has 10 tickets, of which 3 are winners. Two tickets are drawn without replacement.
- $A$: the 1st ticket is a winner → $P(A) = \dfrac{3}{10}$
- $B$: the 2nd ticket is a winner
The outcome of the first draw changes the probability of the second:
- If the 1st was a winner: 9 tickets remain, 2 of them winners → $P(B|A) = \dfrac{2}{9}$.
- If the 1st was a loser: 9 tickets remain, 3 of them winners → $P(B|\overline{A}) = \dfrac{3}{9} = \dfrac{1}{3}$.
By the law of total probability, $P(B) = \dfrac{3}{10} \times \dfrac{2}{9} + \dfrac{7}{10} \times \dfrac{1}{3} = \dfrac{6}{90} + \dfrac{21}{90} = \dfrac{3}{10}$. Since $P(B|A) = \dfrac{2}{9} \neq \dfrac{3}{10} = P(B)$, $A$ and $B$ are not independent.
Compare: with replacement, the probability that the 2nd ticket wins is always $\dfrac{3}{10}$ regardless of the 1st draw, so the two draws are independent.
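Both claims can be verified exactly by enumerating ordered pairs of distinct tickets; a Python sketch (the helper `prob` is ours):

```python
from fractions import Fraction
from itertools import permutations

tickets = ["W"] * 3 + ["L"] * 7            # 3 winners among 10 tickets
pairs = list(permutations(range(10), 2))   # 90 equally likely ordered draws

def prob(event):
    return Fraction(sum(event(i, j) for i, j in pairs), len(pairs))

p_A  = prob(lambda i, j: tickets[i] == "W")                        # 3/10
p_B  = prob(lambda i, j: tickets[j] == "W")                        # 3/10
p_AB = prob(lambda i, j: tickets[i] == "W" and tickets[j] == "W")  # 1/15

print(p_B, p_AB / p_A)  # P(B) = 3/10 but P(B|A) = 2/9: not independent
```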
Does buying more lottery tickets change the per-ticket expected value?
Lottery tickets are drawn without replacement, so the events “ticket 1 wins” and “ticket 2 wins” are not independent. Surprisingly, however, the expected value per ticket is the same regardless of how many you buy.
Concrete example: a typical large lottery
Suppose a lottery sells tickets at 300 yen each with a prize-fund payout of about 50% of total sales. Then the expected value per ticket is about 150 yen.
Buying 1 ticket
- Expected value: $E[X_1] = 150$ yen.
- Expected profit: $150 - 300 = -150$ yen.
The 2nd ticket when 2 tickets are bought
After buying ticket 1, you buy ticket 2. The probability that ticket 2 wins depends on ticket 1 (they are not independent), but what about its expected value?
By symmetry, the expected value of the 2nd ticket is also 150 yen:
- All issued tickets are symmetric (no ticket is privileged).
- The first ticket is not special.
- Whichever ticket you buy, the expected value per ticket is the same.
Conclusion: no matter how many tickets you buy, the expected value per ticket is 150 yen.
Why doesn't the expected value change?
The reason is the linearity of expectation, which holds even when events are not independent:
$$E[X_1 + X_2] = E[X_1] + E[X_2].$$
The total prize fund is fixed and is distributed symmetrically among all tickets, so the expected value per ticket is the same.
Does buying many tickets help?
It is tempting to think “more tickets = better odds = better deal.” That intuition is misleading.
- 1 ticket: expected profit = 150 - 300 = -150 yen.
- 10 tickets: expected profit = 1500 - 3000 = -1500 yen.
- 100 tickets: expected profit = 15,000 - 30,000 = -15,000 yen.
Buying more tickets does increase the chance of winning, but the expected loss grows in proportion to the number of tickets.
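The symmetry argument is easy to check by simulation. The sketch below uses a deliberately tiny toy lottery — 10 tickets, one 1500-yen prize, numbers of our own choosing so that the fair per-ticket value is 150 yen:

```python
import random

def per_ticket_value(n_buy, trials=100_000):
    """Average payout per ticket when buying n_buy of the 10 tickets."""
    total = 0
    for _ in range(trials):
        winner = random.randrange(10)           # index of the winning ticket
        mine = random.sample(range(10), n_buy)  # my tickets, without replacement
        total += 1500 if winner in mine else 0
    return total / (trials * n_buy)

for n in (1, 2, 5):
    print(n, round(per_ticket_value(n)))  # ≈ 150 yen per ticket, for any n
```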
Caution: independence is not the same as mutual exclusivity
“Independent” and “mutually exclusive” are often confused, but they are completely different concepts. Side-by-side Venn diagrams make the contrast obvious (Figure 5.12).
| Concept | Meaning | Venn diagram | Formula |
|---|---|---|---|
| Independent | Occurrence of one does not affect the other | Usually overlap | $P(A \cap B) = P(A) \times P(B)$ |
| Mutually exclusive | Cannot occur together | No overlap | $P(A \cap B) = 0$ |
Important caveat
Mutually exclusive events with $P(A) > 0$ and $P(B) > 0$ are never independent.
If $A$ occurs we know $B$ cannot occur, so the probability of $B$ has changed.
5.8 Chapter Summary
Formula sheet
| Name | Formula | Meaning |
|---|---|---|
| Conditional probability | $$P(A|B) = \dfrac{P(A \cap B)}{P(B)}$$ | Probability of A given B |
| Multiplication rule | $$P(A \cap B) = P(B) \times P(A|B)$$ | Joint probability via two stages |
| Law of total probability | $$P(A) = \sum_{i} P(B_i)P(A|B_i)$$ | Sum the cases of a partition |
| Bayes' theorem | $$P(B_i|A) = \dfrac{P(B_i)P(A|B_i)}{P(A)}$$ | Recover the cause from the result |
| Independence | $$P(A \cap B) = P(A) \times P(B)$$ | Events do not influence each other |
Keyword recap
- Conditional probability $P(A|B)$: probability of A given B
- Multiplication rule: joint probabilities computed in two stages
- Law of total probability: sum probabilities over a partition
- Bayes' theorem: recover the cause from the result
- Prior and posterior: probabilities before and after observing
- Independence: events do not change each other's probabilities
Glossary
- Conditional probability $P(A|B)$
- Probability that A occurs given that B has occurred. Treat B as the new sample space.
- Multiplication rule
- $P(A \cap B) = P(B) \times P(A|B)$. Computes the joint probability in two stages.
- Law of total probability
- $P(A) = \sum_i P(B_i)P(A|B_i)$. Sums the probabilities over a partition.
- Bayes' theorem
- $P(B_i|A) = \dfrac{P(B_i)P(A|B_i)}{P(A)}$. Recovers the cause from the result.
- Prior probability
- Probability before observing evidence. Denoted $P(B_i)$ in Bayes' theorem.
- Posterior probability
- Probability after observing the result A. Denoted $P(B_i|A)$ in Bayes' theorem.
- Likelihood
- Probability $P(A|B_i)$ that A occurs assuming the cause is $B_i$.
- Independent events
- Events satisfying $P(A \cap B) = P(A) \times P(B)$ — one's occurrence does not change the other's probability.
- Mutually exclusive events
- $P(A \cap B) = 0$. Events that cannot occur together. Different from independence.
Exercises
Problem 1
A die is rolled. Given that the result is at least 3, what is the probability that it is even?
Problem 2
Two cards are drawn from a deck (without replacement). Given that the first is a spade, what is the probability that the second is also a spade?
Problem 3
From a group of 6 men and 4 women, 2 people are selected. Given that the first is a man, what is the probability that the second is also a man?
Problem 4
A medical test has the following properties:
- 1% of the population has the disease.
- If a person has the disease, the test is positive 99% of the time.
- If a person does not have the disease, the test is positive 2% of the time (false positive).
Given a positive test result, what is the probability that the person actually has the disease?
Problem 5 (independence check)
Two dice are rolled. Consider the events:
- $A$: the first die is even.
- $B$: the second die is at least 3.
- $C$: the sum is 7.
(1) Are $A$ and $B$ independent?
(2) Are $A$ and $C$ independent?
Solution to Problem 1
Idea: condition on $B$ = “the result is at least 3” and find the probability of $A$ = “the result is even.”
- $B$: at least 3 → $\{3, 4, 5, 6\}$, four outcomes.
- $A$: even → $\{2, 4, 6\}$.
- $A \cap B$: at least 3 and even → $\{4, 6\}$, two outcomes.
By the definition of conditional probability,
$$P(A|B) = \dfrac{P(A \cap B)}{P(B)} = \dfrac{2/6}{4/6} = \dfrac{2}{4} = \dfrac{1}{2}.$$
Answer: $\dfrac{1}{2}$.
Solution to Problem 2
Idea: after the first card (a spade) is removed, 51 cards remain, of which 12 are spades.
- Total: 52 cards (13 per suit).
- 1st is a spade → 51 cards remain.
- Spades remaining: $13 - 1 = 12$.
Probability the 2nd is a spade: $\dfrac{12}{51} = \dfrac{4}{17}$.
Answer: $\dfrac{4}{17}$.
Solution to Problem 3
Idea: if the first selected person is a man, 9 people remain and 5 of them are men.
- Initial group: 6 men and 4 women (10 people).
- 1st is a man → 9 people remain.
- Men remaining: $6 - 1 = 5$.
Answer: $\dfrac{5}{9}$.
Solution to Problem 4
Idea: use Bayes' theorem — recover the cause “disease” from the result “positive.”
- $P(\text{disease}) = 0.01$ (prior).
- $P(\text{positive}|\text{disease}) = 0.99$ (sensitivity).
- $P(\text{positive}|\text{healthy}) = 0.02$ (false-positive rate).
By the law of total probability,
$$P(\text{positive}) = 0.01 \times 0.99 + 0.99 \times 0.02 = 0.0099 + 0.0198 = 0.0297.$$
Bayes' theorem then gives
$$P(\text{disease}|\text{positive}) = \dfrac{0.01 \times 0.99}{0.0297} = \dfrac{0.0099}{0.0297} \approx 0.333.$$
Answer: about 33% (roughly 1 in 3).
Important: even at 99% sensitivity, the rarity of the disease (1%) means only about a third of positive tests correspond to actual disease — a classic illustration of the importance of the prior.
Intuitive check: imagine 1000 people. About 10 have the disease and 990 are healthy.
- Diseased 10 × 99% ≈ 9.9 test positive (true positives).
- Healthy 990 × 2% ≈ 19.8 test positive (false positives).
- Total positives: 9.9 + 19.8 ≈ 29.7 people.
Of the positives, the fraction who actually have the disease is $\dfrac{9.9}{29.7} \approx 0.333$, about 33%, matching the formula above.
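Both the formula and the 1000-person head count reduce to a few lines of arithmetic; a sketch:

```python
prior_disease = 0.01    # P(disease)
sensitivity = 0.99      # P(positive | disease)
false_positive = 0.02   # P(positive | healthy)

# Law of total probability, then Bayes' theorem.
p_positive = prior_disease * sensitivity + (1 - prior_disease) * false_positive
print(p_positive)                                # 0.0297
print(prior_disease * sensitivity / p_positive)  # ≈ 0.333
```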
Solution to Problem 5
(1) Are $A$ and $B$ independent?
Idea: the two rolls are independent, so we expect $A$ and $B$ to be independent. Let's verify.
- $P(A) = \dfrac{3}{6} = \dfrac{1}{2}$ (first die even).
- $P(B) = \dfrac{4}{6} = \dfrac{2}{3}$ (second die at least 3).
- $P(A \cap B) = \dfrac{3}{6} \times \dfrac{4}{6} = \dfrac{12}{36} = \dfrac{1}{3}$.
- $P(A) \times P(B) = \dfrac{1}{2} \times \dfrac{2}{3} = \dfrac{1}{3}$.
Since $P(A \cap B) = P(A) \times P(B)$, the events are independent.
(2) Are $A$ and $C$ independent?
Idea: count outcomes carefully and check whether the parity of the first roll changes the probability of the sum being 7.
- $P(C) = \dfrac{6}{36} = \dfrac{1}{6}$ (sum 7: $(1,6),(2,5),(3,4),(4,3),(5,2),(6,1)$, six outcomes).
- $P(A \cap C)$: first die even and sum 7 → $(2,5),(4,3),(6,1)$, three outcomes → $\dfrac{3}{36} = \dfrac{1}{12}$.
- $P(A) \times P(C) = \dfrac{1}{2} \times \dfrac{1}{6} = \dfrac{1}{12}$.
Since $P(A \cap C) = P(A) \times P(C)$, the events are independent.
Why are they independent? No matter what the first die shows, there is exactly one second-die value that makes the sum equal to 7: if the first is 2 the second must be 5; if 4 then 3; if 6 then 1. Each of those joint outcomes has probability $\dfrac{1}{6} \times \dfrac{1}{6} = \dfrac{1}{36}$. So the joint count is $\dfrac{3}{36} = \dfrac{1}{12}$.
The same logic works when the first die is odd (1, 3, or 5): the second die only needs to take the matching values (6, 4, 2). The probability is again $\dfrac{3}{36} = \dfrac{1}{12}$. By symmetry the parity of the first roll does not change the probability of the sum being 7.
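The whole problem reduces to counting the 36 outcomes; a Python sketch (the helper `prob` is ours) confirming both checks:

```python
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))  # 36 equally likely outcomes

def prob(event):
    return Fraction(sum(event(a, b) for a, b in rolls), len(rolls))

A = lambda a, b: a % 2 == 0   # first die even
B = lambda a, b: b >= 3       # second die at least 3
C = lambda a, b: a + b == 7   # sum is 7

for X, Y, name in ((A, B, "A and B"), (A, C, "A and C")):
    joint = prob(lambda a, b: X(a, b) and Y(a, b))
    print(name, "independent:", joint == prob(X) * prob(Y))  # True both times
```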
References
- Conditional probability — Wikipedia
- Bayes' theorem — Wikipedia
- Law of total probability — Wikipedia
- Independence (probability theory) — Wikipedia
- Base rate fallacy — Wikipedia (the misuse of priors highlighted in the DNA example, also known as the prosecutor's fallacy)