## I. Introduction to Normal Distribution

The normal distribution is a symmetrical, bell-shaped curve characterized by its mean $$\mu$$ and standard deviation $$\sigma$$. Mathematically, it is defined by the probability density function:

$f(x | \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{ -\frac{(x - \mu)^2}{2\sigma^2}}$
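As a quick sanity check, the density can be evaluated directly from the formula. The sketch below is illustrative only: the function name `normal_pdf` and the standard normal inputs are assumptions, and the result is compared against Python's built-in `statistics.NormalDist`.

```python
import math
from statistics import NormalDist


def normal_pdf(x, mu, sigma):
    """Evaluate the normal density f(x | mu, sigma^2) term by term."""
    variance = sigma ** 2
    exponent = -((x - mu) ** 2) / (2 * variance)
    return math.exp(exponent) / math.sqrt(2 * math.pi * variance)


# Height of the standard normal curve at its peak (x equal to the mean).
peak = normal_pdf(0.0, mu=0.0, sigma=1.0)

# The standard library implements the same density.
library_peak = NormalDist(mu=0.0, sigma=1.0).pdf(0.0)

print(f"formula: {peak:.4f}, library: {library_peak:.4f}")
```

Both values come out to about 0.3989, which is $$1/\sqrt{2\pi}$$.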

### Empirical Rule

For normally distributed data, approximately 68% of the values fall within one standard deviation of the mean, 95% within two standard deviations, and 99.7% within three standard deviations.
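The three percentages can be recovered from the normal cumulative distribution function. This is a minimal sketch using the standard normal curve; the helper name `share_within` is an assumption made for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)


def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


coverage = {k: share_within(k) for k in (1, 2, 3)}
for k, share in coverage.items():
    print(f"within {k} standard deviation(s): {share:.4f}")
```

The printed shares are roughly 0.6827, 0.9545, and 0.9973, which round to the 68%, 95%, and 99.7% figures of the rule.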

## II. Importance of Statistical Inference

Statistical inference allows us to make educated guesses about a population based on data from a sample. This is invaluable in various fields, from medicine to politics.

## III. Percentiles and the Normal Distribution

A percentile indicates the relative standing of a data point within a data set: the kth percentile is the value below which k% of the data fall. To find the 90th percentile, for instance, one can use the TI-84’s invNorm(0.9, mean, standard deviation) function. To find the percentile of a standardized value, let’s say 1.9, we can use the TI-84’s normcdf(-1E99, 1.9, 0, 1) function, where -1E99 stands in for negative infinity as the lower bound.
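For readers without a calculator at hand, the same two lookups can be sketched in Python. The mean of 100 and standard deviation of 15 are hypothetical values chosen for illustration; `inv_cdf` plays the role of invNorm, and `cdf` gives the area to the left of a value, which is the quantity the normcdf call is meant to return.

```python
from statistics import NormalDist

# Hypothetical population: mean 100, standard deviation 15.
scores = NormalDist(mu=100, sigma=15)

# 90th percentile: the value with 90% of the distribution below it.
p90 = scores.inv_cdf(0.90)

# Percentile of a standardized value z = 1.9: area to the left of 1.9
# under the standard normal curve.
percentile_of_z = NormalDist(mu=0, sigma=1).cdf(1.9)

print(f"90th percentile: {p90:.2f}")
print(f"z = 1.9 is at about the {percentile_of_z * 100:.1f}th percentile")
```

With these assumed inputs the 90th percentile is about 119.22, and z = 1.9 sits at roughly the 97th percentile.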

## IV. Sampling Distributions of Sample Means

### General Overview

The sampling distribution of the sample mean describes the values the sample mean can take, and how often each occurs, across all possible samples of a given size. It’s a concept pivotal for understanding the Central Limit Theorem. The distribution shows how much sample means vary around the population mean.
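A small simulation can make this concrete. The sketch below assumes a hypothetical normal population (mean 50, standard deviation 10) and a sample size of 25; it draws many samples and looks at how the resulting sample means are distributed. The comparison value $$\sigma/\sqrt{n}$$ is the standard deviation of the sample mean under independent sampling.

```python
import random
import statistics

rng = random.Random(42)          # fixed seed so the run is repeatable

POP_MEAN, POP_SD = 50.0, 10.0    # hypothetical population parameters
SAMPLE_SIZE = 25
NUM_SAMPLES = 5000


def draw_sample_mean():
    """Draw one random sample and return its mean."""
    sample = [rng.gauss(POP_MEAN, POP_SD) for _ in range(SAMPLE_SIZE)]
    return statistics.fmean(sample)


sample_means = [draw_sample_mean() for _ in range(NUM_SAMPLES)]

center = statistics.fmean(sample_means)         # should sit near POP_MEAN
spread = statistics.stdev(sample_means)         # should sit near sigma / sqrt(n)
theoretical_spread = POP_SD / SAMPLE_SIZE ** 0.5

print(f"mean of sample means: {center:.2f}")
print(f"sd of sample means: {spread:.2f} (theory: {theoretical_spread:.2f})")
```

The sample means cluster around the population mean, and their spread is much smaller than the spread of individual observations.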

### Relationship to Test Statistics and P-values

The sampling distribution serves as the foundation for hypothesis testing. A test statistic is calculated from sample data and compared to the sampling distribution to determine how extreme it is. In essence, the test statistic serves a role similar to a z-score: it quantifies how many standard errors a sample statistic lies from the hypothesized value of the population parameter, assuming the null hypothesis is true. The p-value is then the probability of obtaining a test statistic as extreme as, or more extreme than, the one observed, under the assumption that the null hypothesis is true.
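To illustrate how a test statistic and p-value fit together, here is a minimal sketch of a two-sided one-sample z-test. All numbers are hypothetical, and the population standard deviation is assumed known so that the normal distribution applies directly; a t-test would use the sample standard deviation and a t distribution instead.

```python
import math
from statistics import NormalDist

# Hypothetical inputs, chosen only for illustration.
null_mean = 100.0      # population mean claimed by the null hypothesis
population_sd = 15.0   # assumed known
sample_size = 36
sample_mean = 105.0

# Standard error: standard deviation of the sampling distribution of the mean.
standard_error = population_sd / math.sqrt(sample_size)

# Test statistic: distance from the hypothesized mean, in standard errors.
z = (sample_mean - null_mean) / standard_error

# Two-sided p-value: probability of a statistic at least this extreme
# in either direction, assuming the null hypothesis is true.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

Here the sample mean is 2 standard errors above the hypothesized mean, and the p-value is about 0.0455.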

## V. Central Limit Theorem

The Central Limit Theorem (CLT) states that for a sufficiently large sample size, the sampling distribution of the sample mean will approximate a normal distribution, regardless of the shape of the population distribution.
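The theorem can be checked informally by simulation. The sketch below assumes a strongly right-skewed population (an exponential distribution with mean 1 and standard deviation 1) and a sample size of 40, both chosen for illustration. If the sample means are approximately normal, about 68% of them should land within one standard error of the population mean.

```python
import math
import random
import statistics

rng = random.Random(7)   # fixed seed so the run is repeatable

SAMPLE_SIZE = 40
NUM_SAMPLES = 4000
POP_MEAN = 1.0           # mean of the exponential(rate=1) population
POP_SD = 1.0             # its standard deviation


def exponential_sample_mean():
    """Mean of one random sample from the skewed population."""
    sample = [rng.expovariate(1.0) for _ in range(SAMPLE_SIZE)]
    return statistics.fmean(sample)


means = [exponential_sample_mean() for _ in range(NUM_SAMPLES)]

standard_error = POP_SD / math.sqrt(SAMPLE_SIZE)
within_one_se = sum(abs(m - POP_MEAN) <= standard_error for m in means)
share_within_one_se = within_one_se / NUM_SAMPLES

print(f"share of sample means within one standard error: {share_within_one_se:.3f}")
```

Even though individual observations are far from normal, the share should come out close to 0.68, as expected for a bell-shaped sampling distribution.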

## VI. Assumptions and Tests

### 1. Normality

#### One-Sample T-Test

If the sample size is large ($$n \geq 30$$), the Central Limit Theorem justifies treating the sampling distribution of the sample mean as approximately normal. For smaller sample sizes, make a dot plot or histogram of the sample data and check that it shows no strong skew or outliers.

#### Two-Sample T-Test

If either sample size is smaller than 30, graphical methods such as dot plots or histograms should be used to check for normality. If both sample sizes are large ($$n \geq 30$$), the CLT can be assumed to hold.

#### One-Sample Z-Test for Proportion

Both $$np$$ and $$n(1-p)$$ should be greater than or equal to 10, as per AP guidelines, where $$p$$ is the proportion stated in the null hypothesis.

#### Two-Sample Z-Test for Proportion

Both $$n_1p_1$$ and $$n_1(1-p_1)$$, and $$n_2p_2$$ and $$n_2(1-p_2)$$ should be greater than or equal to 10.
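The conditions for both proportion tests reduce to the same check on expected successes and failures. The helper below is a sketch; the name `large_counts_ok` and the example sample sizes and proportions are assumptions made for illustration.

```python
def large_counts_ok(n, p, threshold=10):
    """Return True when both n*p and n*(1-p) reach the threshold."""
    return n * p >= threshold and n * (1 - p) >= threshold


# One-sample case: a single sample size and proportion.
one_sample_ok = large_counts_ok(200, 0.10)      # 20 successes, 180 failures

# Two-sample case: every group must pass the check on its own.
groups = [(200, 0.10), (40, 0.10)]              # second group: only 4 successes
two_sample_ok = all(large_counts_ok(n, p) for n, p in groups)

print(one_sample_ok, two_sample_ok)
```

In the two-sample example the second group expects only 4 successes, so the condition fails even though the first group passes.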

#### Chi-Squared Test for Independence

Each cell in the contingency table should have an expected frequency of 5 or more.
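Expected frequencies are computed from the table margins: for each cell, multiply its row total by its column total and divide by the grand total. The sketch below applies this to a hypothetical 2-by-2 table; the function name `expected_counts` and the observed counts are assumptions.

```python
def expected_counts(observed):
    """Expected cell counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Hypothetical observed counts (rows: groups, columns: outcomes).
observed = [[20, 30],
            [25, 25]]

expected = expected_counts(observed)
condition_met = all(cell >= 5 for row in expected for cell in row)

print(expected)
print("all expected counts at least 5:", condition_met)
```

Note that the condition applies to the expected counts, not the observed ones.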

### 2. Independence

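When sampling without replacement, the sample size should not exceed 10% of the population. Under this 10% rule, removing each sampled individual changes the makeup of the remaining population so little that the observations can be treated as independent.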
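The rule amounts to a single comparison. A minimal sketch, with a hypothetical helper name and example sizes:

```python
def ten_percent_rule_ok(sample_size, population_size):
    """Return True when the sample is at most 10% of the population."""
    # Integer comparison avoids floating-point rounding at the boundary.
    return 10 * sample_size <= population_size


small_sample_ok = ten_percent_rule_ok(50, 1000)    # 5% of the population
large_sample_ok = ten_percent_rule_ok(200, 1000)   # 20% of the population

print(small_sample_ok, large_sample_ok)
```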

### 3. Random Sampling

Data should be randomly sampled from the population.

## VII. Exercises

Answer the following questions based on the lecture notes.

1. What is the Empirical Rule and why is it important?
2. Calculate the z-score for a data point that is 3 units above a mean of zero, given a standard deviation of 2.
3. How do you find the 90th percentile using a TI-84 calculator?
4. Explain the concept of a sampling distribution of sample means.
5. What is the Central Limit Theorem, and why is it significant?
6. For a One-Sample T-Test, how do we check the assumption of normality?
7. For a Two-Sample Z-Test for Proportion, what are the AP guidelines for checking normality?
8. Explain the 10% rule in the context of statistical independence.
9. How do test statistics relate to z-scores?
10. What is a p-value and how is it calculated?