I. Introduction to Normal Distribution

The normal distribution is a symmetrical, bell-shaped curve characterized by its mean \( \mu \) and standard deviation \( \sigma \). Mathematically, it is defined by the probability density function:

\[
f(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{(x - \mu)^2}{2\sigma^2}}
\]
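
As a quick numerical sanity check, the density can be evaluated directly from the formula and compared against a library implementation. The sketch below assumes Python with NumPy and SciPy; the mean of 10 and standard deviation of 2 are made-up illustration values.

```python
import numpy as np
from scipy.stats import norm

def normal_pdf(x, mu, sigma):
    """Direct translation of the density formula above."""
    return np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / np.sqrt(2 * np.pi * sigma ** 2)

mu, sigma = 10, 2   # hypothetical parameters for illustration
x = 12.5
print(normal_pdf(x, mu, sigma))          # value from the formula
print(norm.pdf(x, loc=mu, scale=sigma))  # SciPy's built-in density; the two agree
```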

Empirical Rule

For data that follow a normal distribution, approximately 68% of values fall within one standard deviation of the mean, 95% within two standard deviations, and 99.7% within three standard deviations.
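
These percentages follow directly from the normal cumulative distribution function. A short check, again assuming SciPy is available, reproduces them:

```python
from scipy.stats import norm

# Probability of falling within k standard deviations of the mean
for k in (1, 2, 3):
    prob = norm.cdf(k) - norm.cdf(-k)
    print(f"within {k} SD: {prob:.4f}")
# Prints roughly 0.6827, 0.9545, 0.9973 -- the 68-95-99.7 rule
```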

II. Importance of Statistical Inference

Statistical inference allows us to make educated guesses about a population based on data from a sample. This is invaluable in various fields, from medicine to politics.

III. Percentiles and the Normal Distribution

A percentile indicates the relative standing of a data point within a data set. To find the 90th percentile, for instance, one can use the TI-84’s invNorm(0.9, mean, standard deviation) function. To find the percentile corresponding to a standardized value, say z = 1.9, use the TI-84’s normalcdf(-1E99, 1.9, 0, 1) function.
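
For readers without a TI-84, the same calculations can be sketched in Python with SciPy: norm.ppf plays the role of invNorm and norm.cdf the role of normalcdf. The mean of 70 and standard deviation of 10 below are hypothetical values chosen purely for illustration.

```python
from scipy.stats import norm

# 90th percentile of a N(70, 10) distribution -- analogous to invNorm(0.9, 70, 10)
p90 = norm.ppf(0.90, loc=70, scale=10)
print(p90)          # about 82.8

# Percentile of a standardized value z = 1.9 -- analogous to normalcdf(-1E99, 1.9, 0, 1)
pct = norm.cdf(1.9)
print(pct)          # about 0.9713, i.e. roughly the 97th percentile
```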

IV. Sampling Distributions of Sample Means

General Overview

Sampling distributions of sample means describe the range and frequency of possible sample means for a given sample size. This concept is pivotal for understanding the Central Limit Theorem: the distribution shows how much sample means vary around the population mean.
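
The idea can be made concrete by simulation. The sketch below draws many samples from an assumed normal population (the mean of 50, SD of 12, and sample size of 36 are illustrative choices) and examines the spread of the resulting sample means.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n = 50, 12, 36          # assumed population mean, SD, and sample size

# Draw 10,000 samples of size n and record each sample mean
sample_means = rng.normal(mu, sigma, size=(10_000, n)).mean(axis=1)

print(sample_means.mean())          # close to the population mean, 50
print(sample_means.std(ddof=1))     # close to sigma / sqrt(n) = 12 / 6 = 2
```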

Relationship to Test Statistics and P-values

The sampling distribution serves as the foundation for hypothesis testing. A test statistic is calculated from sample data and is compared to the sampling distribution to determine how extreme the test statistic is. In essence, the test statistic serves a role similar to a z-score; it quantifies how many standard deviations away a sample statistic is from the population parameter, assuming the null hypothesis is true. The p-value is then the probability of obtaining a test statistic as extreme as, or more extreme than, the one observed, under the assumption that the null hypothesis is true.
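
As a concrete, hypothetical illustration: suppose the null hypothesis claims a population mean of 100 with a known standard deviation of 15, and a sample of 40 observations has a mean of 104. The z test statistic and two-sided p-value can be computed as follows (all numbers are made up for this sketch).

```python
import numpy as np
from scipy.stats import norm

mu0, sigma, n, xbar = 100, 15, 40, 104   # hypothetical values

z = (xbar - mu0) / (sigma / np.sqrt(n))  # how many standard errors the sample mean is from mu0
p_value = 2 * (1 - norm.cdf(abs(z)))     # two-sided p-value under the null hypothesis
print(z, p_value)                        # z is about 1.69, p is about 0.09
```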

V. Central Limit Theorem

The Central Limit Theorem (CLT) states that for a sufficiently large sample size, the sampling distribution of the sample mean will approximate a normal distribution, regardless of the shape of the population distribution.
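
A quick simulation makes this visible. Here the population is exponential (strongly right-skewed, chosen purely for illustration), yet the distribution of sample means comes out roughly symmetric.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50                               # sample size, large enough for the CLT to take effect

# Exponential population: skewed, clearly not normal (mean 1, SD 1)
means = rng.exponential(scale=1.0, size=(10_000, n)).mean(axis=1)

# The sample means center near 1 with SD close to 1 / sqrt(n), about 0.14,
# despite the skewness of the underlying population.
print(means.mean(), means.std(ddof=1))
```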

VI. Assumptions and Tests

1. Normality

One-Sample T-Test

If the sample size is large (\( n \geq 30 \)), the Central Limit Theorem can be assumed to hold. For smaller sample sizes, make a dot plot or histogram and check that the distribution shows no strong skewness or outliers.
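
A minimal sketch of the check-then-test workflow, assuming SciPy and a small made-up sample:

```python
import numpy as np
from scipy.stats import ttest_1samp

# Hypothetical small sample (n = 12), so the normality check matters
sample = np.array([9.8, 10.1, 10.4, 9.6, 10.0, 10.2, 9.9, 10.3, 10.1, 9.7, 10.0, 10.2])

# With n < 30, first plot a dot plot or histogram of `sample` and look for
# strong skew or outliers before trusting the t-test.
t_stat, p_value = ttest_1samp(sample, popmean=10.0)
print(t_stat, p_value)
```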

Two-Sample T-Test

If either sample size is smaller than 30, graphical methods such as dot plots or histograms should be used to check for normality. If both sample sizes are large (\( n \geq 30 \)), the CLT can be assumed to hold.
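
The same workflow applies with two groups; the sketch below uses two hypothetical samples and SciPy's independent two-sample t-test.

```python
import numpy as np
from scipy.stats import ttest_ind

# Two hypothetical groups; each would need a dot plot or histogram check since n < 30
group_a = np.array([23, 25, 28, 22, 26, 27, 24, 25, 29, 23])
group_b = np.array([20, 22, 21, 24, 19, 23, 22, 20, 21, 22])

# equal_var=False gives the Welch (non-pooled) two-sample t-test
t_stat, p_value = ttest_ind(group_a, group_b, equal_var=False)
print(t_stat, p_value)
```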

One-Sample Z-Test for Proportion

Both \( np_0 \) and \( n(1-p_0) \) should be greater than or equal to 10, where \( p_0 \) is the hypothesized proportion, as per AP guidelines.
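
The condition check and the test itself are short enough to sketch directly; the sample size, success count, and hypothesized proportion below are hypothetical.

```python
from math import sqrt
from scipy.stats import norm

n, successes, p0 = 200, 124, 0.55    # hypothetical values; H0: p = 0.55
p_hat = successes / n

# Large-counts condition: n*p0 and n*(1 - p0) both at least 10
print(n * p0, n * (1 - p0))          # 110.0 and 90.0 -- condition met

z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
p_value = 2 * (1 - norm.cdf(abs(z)))  # two-sided p-value
print(z, p_value)
```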

Two-Sample Z-Test for Proportion

Both \( n_1p_1 \) and \( n_1(1-p_1) \), and \( n_2p_2 \) and \( n_2(1-p_2) \) should be greater than or equal to 10.
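
A sketch of the two-proportion case, with made-up counts, checking the condition and then computing the pooled z statistic:

```python
from math import sqrt
from scipy.stats import norm

# Hypothetical counts: 84 successes out of 150 in group 1, 66 out of 140 in group 2
n1, x1 = 150, 84
n2, x2 = 140, 66
p1_hat, p2_hat = x1 / n1, x2 / n2

# Large-counts condition: successes and failures in each group at least 10
print(x1, n1 - x1, x2, n2 - x2)

# Pooled proportion under H0: p1 = p2
p_pool = (x1 + x2) / (n1 + n2)
se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z = (p1_hat - p2_hat) / se
p_value = 2 * (1 - norm.cdf(abs(z)))
print(z, p_value)
```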

Chi-Squared Test for Independence

Each cell in the contingency table should have an expected frequency of 5 or more.
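
SciPy's chi2_contingency returns the expected counts, so the condition can be checked directly; the observed table below is a hypothetical example.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical 2x3 contingency table of observed counts
observed = np.array([[30, 45, 25],
                     [20, 35, 45]])

chi2, p_value, dof, expected = chi2_contingency(observed)

# Every expected count should be at least 5 before trusting the test
print(expected)
print((expected >= 5).all())
print(chi2, p_value, dof)
```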

2. Independence

The sample size should not exceed 10% of the population, as per the 10% rule, to ensure independence.

3. Random Sampling

Data should be randomly sampled from the population.

VII. Exercises

Answer the following questions based on the lecture notes.

  1. What is the Empirical Rule and why is it important?
  2. Calculate the z-score for a data point that is 3 units above a mean of zero, given a standard deviation of 2.
  3. How do you find the 90th percentile using a TI-84 calculator?
  4. Explain the concept of a sampling distribution of sample means.
  5. What is the Central Limit Theorem, and why is it significant?
  6. For a One-Sample T-Test, how do we check the assumption of normality?
  7. For a Two-Sample Z-Test for Proportion, what are the AP guidelines for checking normality?
  8. Explain the 10% rule in the context of statistical independence.
  9. How do test statistics relate to z-scores?
  10. What is a p-value and how is it calculated?
