Chapter 10 Estimation

When we have a random sample, we make an assumption about the true distribution of the population from which the sample is drawn. We can then use a point estimate, which is the value of a statistic computed from the sample, to estimate the parameter values of the assumed distribution.

10.1 Estimation from Normal Distribution

Suppose we have \(n\) independent random variables, each normally distributed with its own mean and variance:

\[ X_i \stackrel{ind}{\sim} N(\mu_i, \sigma_i^2) \qquad i = 1,...,n \]

A sample that we assume comes from such a population can be summarized by the following statistics, whose sampling distributions are known.

Any linear combination \(Y = \sum a_i X_i\) of these variables is also normal. Its mean is the same linear combination of the means, and its variance is the sum of the variances weighted by the squared coefficients:

\(Y \sim N\left(\sum_{i=1}^n a_i\mu_i, \sum_{i=1}^n a_i^2 \sigma_i^2\right)\)
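The result can be checked numerically. The following is a minimal sketch using only the Python standard library. The coefficients, means, and standard deviations are arbitrary illustrative values, and `simulate_linear_combination` is a hypothetical helper name, not part of any library.

```python
import random
import statistics

def simulate_linear_combination(a, mus, sigmas, reps=50_000, seed=0):
    """Draw `reps` values of Y = sum(a_i * X_i) with independent X_i ~ N(mu_i, sigma_i^2)."""
    rng = random.Random(seed)
    ys = [
        sum(ai * rng.gauss(mu, sd) for ai, mu, sd in zip(a, mus, sigmas))
        for _ in range(reps)
    ]
    return statistics.fmean(ys), statistics.variance(ys)

# Illustrative (assumed) values for the coefficients and parameters
a = [2.0, -1.0, 0.5]
mus = [1.0, 3.0, -2.0]
sigmas = [1.0, 2.0, 0.5]

# Theoretical mean and variance of Y from the formula above
theory_mean = sum(ai * mu for ai, mu in zip(a, mus))
theory_var = sum(ai**2 * sd**2 for ai, sd in zip(a, sigmas))

sim_mean, sim_var = simulate_linear_combination(a, mus, sigmas)
print(f"mean: theory={theory_mean:.4f}, simulated={sim_mean:.4f}")
print(f"variance: theory={theory_var:.4f}, simulated={sim_var:.4f}")
```

The simulated values should land close to the theoretical ones, with the gap shrinking as `reps` grows.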

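The sample mean is a special case of this linear combination. When all \(X_i\) share the same mean \(\mu\) and variance \(\sigma^2\) and we set \(a_i = \frac{1}{n}\), we get \(Y = \bar{X}\), which is also normal with mean \(\mu\) and a variance that shrinks as the sample size grows: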

\(\bar{X} \sim N(\mu, \frac{\sigma^2}{n})\)
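As a rough check of the \(\frac{\sigma^2}{n}\) variance, we can draw many samples of size \(n\) and look at the spread of their means. This is a sketch with assumed values (\(\mu = 5\), \(\sigma = 2\), \(n = 25\)), and `sample_means` is a hypothetical helper name.

```python
import random
import statistics

def sample_means(mu, sigma, n, reps=20_000, seed=1):
    """Return the means of `reps` independent samples of size n from N(mu, sigma^2)."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
        for _ in range(reps)
    ]

# Assumed illustrative parameters
mu, sigma, n = 5.0, 2.0, 25

means = sample_means(mu, sigma, n)
mean_of_means = statistics.fmean(means)
var_of_means = statistics.variance(means)
theory_var_of_mean = sigma**2 / n

print(f"mean of sample means: {mean_of_means:.4f} (theory {mu})")
print(f"variance of sample means: {var_of_means:.4f} (theory {theory_var_of_mean:.4f})")
```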

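The sampling distribution of the sample variance \(s^2 = \frac{1}{n-1}\sum (X_i - \bar{X})^2\), once scaled by \(\frac{n-1}{\sigma^2}\), is a Chi-squared distribution with \(n-1\) degrees of freedom: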

\(\frac{(n-1)s^2}{\sigma^2} \sim \chi^2(n-1)\)
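A \(\chi^2(k)\) random variable has mean \(k\) and variance \(2k\), which gives a simple way to sanity-check this statement by simulation. The sketch below uses assumed values (\(\mu = 0\), \(\sigma = 3\), \(n = 10\)), and `scaled_variances` is a hypothetical helper name.

```python
import random
import statistics

def scaled_variances(mu, sigma, n, reps=20_000, seed=2):
    """Return (n-1) * s^2 / sigma^2 for `reps` samples of size n from N(mu, sigma^2)."""
    rng = random.Random(seed)
    out = []
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        s2 = statistics.variance(sample)  # unbiased sample variance (divides by n-1)
        out.append((n - 1) * s2 / sigma**2)
    return out

# Assumed illustrative parameters
mu, sigma, n = 0.0, 3.0, 10
df = n - 1

q = scaled_variances(mu, sigma, n)
q_mean = statistics.fmean(q)
q_var = statistics.variance(q)

print(f"mean: {q_mean:.3f} (chi-squared theory: {df})")
print(f"variance: {q_var:.3f} (chi-squared theory: {2 * df})")
```

Matching the first two moments does not prove the distribution is Chi-squared, but a mismatch would signal an error in the scaling.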

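When \(\sigma^2\) is not known, we replace it with the sample variance \(s^2\) when making inferences about the true \(\mu\). The resulting standardized mean follows a Student’s t distribution with \(n-1\) degrees of freedom. A t random variable is a standard normal divided by the square root of an independent Chi-squared random variable that has itself been divided by its degrees of freedom.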

\(\frac{\bar{X}-\mu}{\frac{s}{\sqrt{n}}} \sim t(n-1)\)
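One practical consequence is that the t distribution has heavier tails than the standard normal when \(n\) is small. The sketch below simulates the statistic for an assumed small sample size (\(n = 5\)) and compares how often it exceeds 1.96 in absolute value with the corresponding standard normal probability. The parameter values and the helper name `t_statistics` are illustrative assumptions.

```python
import math
import random
import statistics

def t_statistics(mu, sigma, n, reps=20_000, seed=3):
    """Return (xbar - mu) / (s / sqrt(n)) for `reps` samples of size n from N(mu, sigma^2)."""
    rng = random.Random(seed)
    stats = []
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = statistics.fmean(sample)
        s = statistics.stdev(sample)  # sample standard deviation (n-1 denominator)
        stats.append((xbar - mu) / (s / math.sqrt(n)))
    return stats

# Assumed illustrative parameters; n is deliberately small
n = 5
ts = t_statistics(mu=10.0, sigma=3.0, n=n)

# Two-sided tail frequency beyond 1.96 for the simulated statistic
tail_t = sum(abs(t) > 1.96 for t in ts) / len(ts)
# Same two-sided tail probability under the standard normal
tail_z = 2 * (1 - statistics.NormalDist().cdf(1.96))

print(f"P(|T| > 1.96) simulated with n={n}: {tail_t:.3f}")
print(f"P(|Z| > 1.96) standard normal:     {tail_z:.3f}")
```

The simulated tail frequency should come out clearly larger than the normal value, which is why using normal critical values with small samples and an estimated \(s\) understates the uncertainty.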

10.2 Estimation without Normal Assumption

How do we estimate parameters without the assumption that the data come from a normal distribution? We use the central limit theorem (CLT).

When we do not know whether a random sample (\(X_1, X_2,..,X_n\)) comes from a normal distribution, we can use the CLT to conclude that the distribution of the sample mean is approximately normal, provided \(n\) is large and the population has a finite variance.

For any distribution with finite mean \(\mu\) and finite variance \(\sigma^2\), the sample mean \(\bar{X}\) has mean \(\mu\) and variance \(\frac{\sigma^2}{n}\). The CLT adds that, as the sample size \(n\) grows, the distribution of \(\bar{X}\) converges to a normal distribution with that mean and variance.

When \(n\) is large enough, we can treat the sample mean as approximately normally distributed. A common rule of thumb is that \(\bar{X}\) will be approximately normal if \(n > 30\). However, the rate of convergence depends on the original distribution: strongly skewed or heavy-tailed populations may need a much larger \(n\).
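To illustrate the dependence on the original distribution, the sketch below draws sample means from a skewed population, an exponential distribution with rate 1 (an assumed example), and tracks how the skewness of \(\bar{X}\) shrinks as \(n\) increases. A normal distribution has skewness 0, so values near 0 indicate a more symmetric sampling distribution. The helper names are hypothetical.

```python
import random
import statistics

def skewness(values):
    """Moment-based sample skewness: the average cubed standardized deviation."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([((v - m) / sd) ** 3 for v in values])

def exponential_sample_means(n, reps=20_000, seed=4):
    """Return the means of `reps` samples of size n from an Exponential(rate=1) population."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(reps)
    ]

skews = {}
variances = {}
for n in (2, 10, 50):
    means = exponential_sample_means(n)
    skews[n] = skewness(means)
    variances[n] = statistics.variance(means)
    # Exponential(1) has mean 1 and variance 1, so Var(xbar) should be close to 1/n
    print(f"n={n:>2}: skewness of xbar={skews[n]:.3f}, variance of xbar={variances[n]:.4f} (theory {1 / n:.4f})")
```

Even at the largest \(n\) used here the skewness is not exactly zero, which shows that a threshold such as \(n > 30\) is a guideline rather than a guarantee.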