Chapter 4 Conditional Probability

Conditional probability is not a theorem, a conjecture, a lemma, or a corollary; it is a definition. It answers the question of how you should update a probability in light of newfound evidence. It captures the philosophically intuitive idea that the conditional probability \(\mathbb{P}(A|B)\) measures how knowing that \(B\) has occurred changes the probability that \(A\) occurs.

4.1 Conditional Probability

The conditional probability \(\mathbb{P}(A|B)\) is given by the ratio of the relative frequency of the intersection of \(A\) and \(B\) to the relative frequency of \(B\). We can interpret this as the ratio of the probability of the joint occurrence of \(A\) and \(B\) to the probability of \(B\).

We define conditional probability as

\[ \mathbb{P}(A|B)=\frac{\mathbb{P}(AB)}{\mathbb{P}(B)} \]

where \(\mathbb{P}(B)>0\).
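
As a minimal sketch of this definition in code, the following enumerates the sample space of two fair dice and computes a conditional probability as a ratio of counts; the particular events \(A\) (first die shows 6) and \(B\) (sum is at least 10) are illustrative assumptions.

```python
from itertools import product

# Sample space: all 36 equally likely outcomes of rolling two fair dice.
S = list(product(range(1, 7), repeat=2))

B = [(d1, d2) for (d1, d2) in S if d1 + d2 >= 10]  # B: sum is at least 10
AB = [(d1, d2) for (d1, d2) in B if d1 == 6]       # AB: ...and first die shows 6

# P(A|B) = P(AB) / P(B); with equally likely outcomes this reduces to a
# ratio of counts, |AB| / |B|.
p_AB = len(AB) / len(S)
p_B = len(B) / len(S)
print(p_AB / p_B)  # 3/6 = 0.5
```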

Set Theory Interpretation: \(AB\) denotes the intersection of \(A\) and \(B\), the set of elements common to both; \(\mathbb{P}(AB)\) is therefore the probability of their joint occurrence, i.e. the probability that both events occur.

Conditional Probability is a Probability Measure

That is, for a fixed \(B\) with \(\mathbb{P}(B) > 0\), the conditional probabilities \(\mathbb{P}(\cdot|B)\) satisfy the same axioms that ordinary probabilities satisfy. This enables us to use the theorems that are true for probabilities for conditional probabilities as well.
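
For instance, normalization and finite additivity follow directly from the definition (countable additivity follows by the same argument); for disjoint \(A_1\) and \(A_2\), the sets \(A_1 B\) and \(A_2 B\) are also disjoint, so

\[ \mathbb{P}(S|B) = \frac{\mathbb{P}(SB)}{\mathbb{P}(B)} = \frac{\mathbb{P}(B)}{\mathbb{P}(B)} = 1, \qquad \mathbb{P}(A_1 \cup A_2|B) = \frac{\mathbb{P}(A_1 B) + \mathbb{P}(A_2 B)}{\mathbb{P}(B)} = \mathbb{P}(A_1|B) + \mathbb{P}(A_2|B) \]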

Reduction of a Sample Space

It is possible to reduce the sample space, e.g. from \(S\) to \(B\), and work with a smaller set of subsets, making it easier to calculate conditional probabilities.
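
As a quick worked example of this reduction (the single-die setup is an illustrative assumption): roll one fair die and condition on \(B = \{2, 4, 6\}\), the event that the outcome is even. Treating \(B\) itself as the sample space, its three outcomes are equally likely, and indeed

\[ \mathbb{P}(\{2\}|B) = \frac{\mathbb{P}(\{2\})}{\mathbb{P}(B)} = \frac{1/6}{1/2} = \frac{1}{3} \]

which agrees with the uniform probability \(1/3\) on the reduced space \(B\).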

4.2 Law of Multiplication

Recall that conditional probability is defined by the relation \(\mathbb{P}(A|B) = \frac{\mathbb{P}(AB)}{\mathbb{P}(B)}\), where \(\mathbb{P}(B) > 0\).

Multiplying both sides by \(\mathbb{P}(B)\) yields the law of multiplication: the probability of the joint occurrence of \(A\) and \(B\) is the product of the probability of \(B\) and the conditional probability of \(A\) given that \(B\) has occurred.

\[ \mathbb{P}(AB)=\mathbb{P}(A|B)\mathbb{P}(B), \text{ where } \mathbb{P}(B)>0 \]

also,

\[ \mathbb{P}(AB)=\mathbb{P}(BA)=\mathbb{P}(B|A)\mathbb{P}(A), \text{ where } \mathbb{P}(A) > 0 \]
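
As a worked example (the card-drawing setup is an illustrative assumption): draw two cards from a standard 52-card deck without replacement, and let \(A_1\) and \(A_2\) be the events that the first and second cards are aces. Then

\[ \mathbb{P}(A_1 A_2) = \mathbb{P}(A_1)\mathbb{P}(A_2|A_1) = \frac{4}{52} \cdot \frac{3}{51} = \frac{1}{221} \]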

We can also calculate the joint probability of three events by applying the law of multiplication twice, using the associativity of intersection: \(\mathbb{P}(ABC) = \mathbb{P}(AB)\mathbb{P}(C|AB) = \mathbb{P}(A)\mathbb{P}(B|A)\mathbb{P}(C|AB)\), provided \(\mathbb{P}(AB) > 0\).

These statements can be generalized from two and three events to \(n\) events; the resulting formula can be proved by mathematical induction.

\[ \text{If } \mathbb{P}(A_1 A_2 A_3 \cdots A_{n-1}) > 0 \text{, then} \\ \mathbb{P}(A_1 A_2 A_3 \cdots A_{n-1} A_n) = \mathbb{P}(A_1)\mathbb{P}(A_2|A_1)\mathbb{P}(A_3|A_1 A_2) \cdots \mathbb{P}(A_n|A_1 A_2 A_3 \cdots A_{n-1}) \]
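
Continuing the assumed card-drawing example with three events, the probability that the first three cards drawn are all aces is

\[ \mathbb{P}(A_1 A_2 A_3) = \mathbb{P}(A_1)\mathbb{P}(A_2|A_1)\mathbb{P}(A_3|A_1 A_2) = \frac{4}{52} \cdot \frac{3}{51} \cdot \frac{2}{50} = \frac{1}{5525} \]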

Law of Total Probability

\[ \mathbb{P}(A) = \mathbb{P}(A|B)\mathbb{P}(B) + \mathbb{P}(A|B^c)\mathbb{P}(B^c), \text{ where } 0 < \mathbb{P}(B) < 1 \]

Generalized Law of Total Probability

\[ \text{If } \{B_1, B_2, \ldots, B_n\} \text{ is a partition of } S \text{ with } \mathbb{P}(B_i) > 0 \text{ for each } i \text{, then} \\ \mathbb{P}(A) = \sum_{i=1}^{n} \mathbb{P}(A|B_i)\mathbb{P}(B_i) \]
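
A minimal sketch in code, with the partition probabilities and conditional probabilities chosen as hypothetical numbers (say, three factories \(B_1, B_2, B_3\) producing an item, and \(A\) the event that the item is defective):

```python
# Hypothetical partition of the sample space: which factory made the item.
p_B = [0.5, 0.3, 0.2]             # P(B_i); sums to 1 since {B_i} is a partition
p_A_given_B = [0.01, 0.02, 0.05]  # P(A|B_i): assumed defect rate per factory

# Law of total probability: P(A) = sum_i P(A|B_i) * P(B_i)
p_A = sum(pa * pb for pa, pb in zip(p_A_given_B, p_B))
print(p_A)  # 0.5*0.01 + 0.3*0.02 + 0.2*0.05 = 0.021
```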

4.3 Bayes’ Formula

Bayes’ Rule is a simple calculation but a big idea about beliefs. Consider the partitioned sample space we imagined for the law of total probability.

With Bayes’ Rule we update our belief about something, given that we know some other event is true or has occurred.

We want to calculate the probability of an unknown event occurring.

Examining the probability of the hypothesis given that the evidence is true, we arrive at the expression

\[ \mathbb{P}(H|E) = \frac{\mathbb{P}(E|H) \cdot \mathbb{P}(H)}{\mathbb{P}(E)} \]

  • \(\mathbb{P}(H|E)\) represents the probability of hypothesis \(H\) given evidence \(E\).
  • \(\mathbb{P}(E|H)\) denotes the probability of observing evidence \(E\) given the hypothesis \(H\).
  • \(\mathbb{P}(H)\) is the prior probability of hypothesis \(H\).
  • \(\mathbb{P}(E)\) represents the probability of observing evidence \(E\).

In order to update our beliefs, we weight how likely the evidence is under the hypothesis, \(\mathbb{P}(E|H)\), by our prior knowledge \(\mathbb{P}(H)\), and normalize by \(\mathbb{P}(E)\), which the law of total probability expands as \(\mathbb{P}(E|H)\mathbb{P}(H) + \mathbb{P}(E|H^c)\mathbb{P}(H^c)\).
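
A minimal sketch of such an update, using hypothetical numbers for a diagnostic-test scenario (the prevalence, sensitivity, and false-positive rate below are assumptions for illustration only):

```python
# H = "has the condition", E = "test comes back positive".
p_H = 0.01              # prior P(H): assumed prevalence
p_E_given_H = 0.95      # P(E|H): assumed sensitivity
p_E_given_not_H = 0.05  # P(E|H^c): assumed false-positive rate

# Law of total probability gives the normalizing constant P(E).
p_E = p_E_given_H * p_H + p_E_given_not_H * (1 - p_H)

# Bayes' formula: posterior P(H|E).
p_H_given_E = p_E_given_H * p_H / p_E
print(round(p_H_given_E, 3))  # ~0.161: the evidence lifts a 1% prior to ~16%
```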

4.4 Monty Hall Problem, Explained

This assumes you know the background, i.e. the “rules” of the Monty Hall problem:

https://en.wikipedia.org/wiki/Monty_Hall_problem

  • We begin with a 1/3 chance of having chosen the door that hides the car.

  • “After the host opens a door there is a 1/2 chance of finding the car”

    • We may conclude this if we appeal to equally likely outcomes and reduce the sample space with conditional probability.

    • This conclusion falsely assumes independence: that Monty has a free choice between the two remaining doors, i.e. that he is choosing between them at random. The issue is that this does not constitute full conditioning on the evidence provided to us. We possess the additional knowledge that the specific door opened was door two.

    • In the event that we switch doors, we can reason as follows: the originally picked door (e.g. door 1) stands on its own, while the remaining closed door is paired with one that we know hides a goat (doors 2 and 3 together).

      • What does this mean? Before conditioning, door 1 had a one-third chance of hiding the car, so its complement (doors 2 and 3 together) carried probability 2/3. After conditioning, we know which of doors 2 and 3 hides a goat, so the remaining closed door alone must carry that 2/3 probability of hiding the car. The simulation sketched below bears this out.
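
As a minimal sketch, the following simulation (standard rules assumed: the host always opens a goat door other than the player's pick) estimates the win probability for each strategy; switching should come out near 2/3 and staying near 1/3.

```python
import random

def play(switch: bool) -> bool:
    """Play one round of Monty Hall; return True if the player wins the car."""
    doors = [0, 1, 2]
    car = random.choice(doors)
    pick = random.choice(doors)
    # The host opens a door that is neither the player's pick nor the car.
    host = random.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Switch to the one remaining closed door.
        pick = next(d for d in doors if d != pick and d != host)
    return pick == car

trials = 100_000
wins_switch = sum(play(switch=True) for _ in range(trials))
wins_stay = sum(play(switch=False) for _ in range(trials))
print(f"switch: {wins_switch / trials:.3f}")  # ~0.667
print(f"stay:   {wins_stay / trials:.3f}")    # ~0.333
```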