Central Limit Demonstration



How do I do the Central Limit Demonstration?

1. Select Central Limit Demonstration from the main menu.

2. Enter the total Number of Samples you want the computer to randomly select (a number between 1,000 and 10,000).

3. Enter the Sample Size -- the number of observations that you want the computer to select for each sample (a number between 1 and 40).

4. Choose the population from which the computer draws its samples (normal, uniform, triangular or v-shaped).

5. Choose OK and the computer will build the distribution of sample means.  Before the computer adds a point to the distribution, it shows you the values that go into each mean and the value of the mean itself.  Pressing a number key from 1 to 9 delays each sample by 1/10 of a second to 8 seconds, respectively.  Pressing the 0 (zero) key finishes the demonstration as fast as possible.  Pressing the Escape key aborts the demonstration.  Pressing the End key pauses the demonstration: the computer displays a dialog box, "Pause Again?", and waits until any key is pressed or 'YES' or 'NO' is clicked.  If 'YES' is clicked or the End key is pressed again, the computer pauses again.
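The steps above amount to a simple sampling loop.  The following Python sketch is not the program's own code; it only illustrates the idea.  The function names, the population ranges, and the exact shape of the v-shaped population (density proportional to |x| on -1 to 1) are assumptions, since this help text does not specify them.

```python
import random
import statistics

# Hypothetical stand-ins for the four populations. The program's actual
# ranges and parameters are not given in the help text.
def draw_normal(rng):
    return rng.gauss(0.0, 1.0)

def draw_uniform(rng):
    return rng.uniform(-1.0, 1.0)

def draw_triangular(rng):
    # Peak in the middle of the range -1 to 1.
    return rng.triangular(-1.0, 1.0, 0.0)

def draw_v_shaped(rng):
    # Assumed density proportional to |x| on -1 to 1:
    # values near 0 are rare, values near the ends are common.
    magnitude = rng.random() ** 0.5
    return magnitude if rng.random() < 0.5 else -magnitude

POPULATIONS = {
    "normal": draw_normal,
    "uniform": draw_uniform,
    "triangular": draw_triangular,
    "v-shaped": draw_v_shaped,
}

def build_sample_means(population, number_of_samples, sample_size, seed=None):
    """Draw the requested samples and return the mean of each one."""
    rng = random.Random(seed)
    draw = POPULATIONS[population]
    means = []
    for _ in range(number_of_samples):
        sample = [draw(rng) for _ in range(sample_size)]
        means.append(statistics.fmean(sample))
    return means

means = build_sample_means("v-shaped", number_of_samples=5000, sample_size=10, seed=1)
print(len(means), round(statistics.fmean(means), 3))
```

Each entry in `means` corresponds to one point that the demonstration adds to the distribution of sample means.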


 

 


What will I see in the Central Limit Demonstration?

Your selections will be displayed in the top corners of the window, and the Distribution of Sample Means will begin to form.  If you have selected a sample size between 1 and 10, then, for each sample, the computer displays the values that it has randomly selected as well as the resulting mean.  For sample sizes greater than 10, only the mean is displayed.  Each calculated mean is plotted in the new distribution of sample means.  While the computer is building the distribution, pressing the number keys 1 - 9 will cause a delay of 1 to 9 half-seconds, respectively, for each sample.  If you press the 0 (zero) key, the demonstration will finish as fast as possible.

When the distribution of sample means is complete, the computer superimposes a theoretical normal curve onto it.
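To see what the superimposed curve represents, the sketch below compares observed counts of sample means with the counts a normal curve would predict.  It assumes a uniform population on 0 to 1 (so the population mean is 0.5 and sigma is the square root of 1/12), a sample size of 9, and a bin width of 0.05; none of these values come from the program itself.

```python
import math
import random
import statistics

rng = random.Random(2)
sample_size = 9
number_of_samples = 5000

# Assumed population: uniform on 0 to 1, so mu = 0.5 and sigma = sqrt(1/12).
mu = 0.5
sigma = math.sqrt(1 / 12)

means = [
    statistics.fmean([rng.random() for _ in range(sample_size)])
    for _ in range(number_of_samples)
]

# Theoretical normal curve: same mean, standard deviation sigma / sqrt(n).
theory = statistics.NormalDist(mu, sigma / math.sqrt(sample_size))

bin_width = 0.05
edges = [i * bin_width for i in range(21)]  # 0.00, 0.05, ..., 1.00

rows = []
for low, high in zip(edges, edges[1:]):
    observed = sum(low <= m < high for m in means)
    expected = number_of_samples * (theory.cdf(high) - theory.cdf(low))
    rows.append((low, observed, expected))
    print(f"{low:4.2f}-{high:4.2f}  observed {observed:5d}  normal curve {expected:7.1f}")
```

The closer the observed column tracks the normal-curve column, the better the histogram on screen will fit the superimposed curve.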


 

 


Purpose of the Central Limit Demonstration

By allowing you to draw large numbers of samples of varying size from different types of distributions, the Central Limit Demonstration shows that:

A. Given a normal population with standard deviation sigma from which random samples of size n are drawn, as the number of samples increases, the distribution of the sample means will come closer and closer to a normal distribution whose mean is the same as the population mean and whose standard deviation (referred to as the standard error) is sigma divided by the square root of n.

B. Given any population, with standard deviation sigma, from which random samples are drawn, as the number of samples and the sample size (n) increase:

1. The distribution of the sample means will come closer and closer to a normal distribution;

2. The mean of the sample means will come closer and closer to the mean of the population from which the samples were drawn; and

3. The standard deviation of the distribution of sample means (referred to as the standard error) will come closer and closer to sigma divided by the square root of n.

Proposition B is known as the Central Limit Theorem.

Consult the Teacher's PET


 

 


What can I demonstrate in the Central Limit Demonstration?

Try generating several sample mean distributions, varying the number of samples that goes into each.  (Make one with 1,000 samples, one with 10,000 samples, and a couple of distributions using a number of samples in between.)  You will see that the greater the number of samples, the more closely the shape of the resulting distribution approximates a normal one.  (The computer limits you to 10,000 samples.)

The Shape of the Distribution of Sample Means and the Importance of Sample Size

The shape of the distribution of sample means depends on three things: the shape of the population from which the samples are drawn, the number of samples, and the size of the samples.

You have varied the number of samples in the above demonstration.  Now let's systematically vary the other two factors.

1. Choose a small sample size (n=2), and construct a distribution of sample means for each of the parent populations (normal, uniform, v-shaped, and triangular) with this sample size.  Choose a sufficiently large number of samples as well, and hold it constant across the distributions (say, number of samples = 5,000).

2. Now let's use the v-shaped population to demonstrate how you can get a closer and closer approximation to normal with larger sample sizes.  Construct 4 distributions of sample means by a) selecting the v-shaped population as the population from which to draw the samples each time, b) keeping the number of samples constant in each distribution (5,000), and c) varying the sample size for each distribution.  Make your first distribution with sample size = 2, then another with sample size = 10, a third with sample size = 30, and finally, make a distribution of sample means with sample size = 40.

You should see that increasing the sample size to 10 dramatically improves the approximation to a normal shape for the distribution of sample means.  A sample size of 30 is somewhat better, and a larger sample size of 40 brings little further improvement; that is, a sample size of 30 is sufficient for the distribution of sample means to approach a normal shape.  Try this with the other non-normal populations as well: for each of the population shapes offered here, a sample size of 30 is sufficient for a close approximation to normal.
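One way to put a number on "closer to normal" is excess kurtosis, which is 0 for a normal distribution.  This measure is not part of the program; it is used here only as an illustration.  The sketch assumes a v-shaped population with density proportional to |x| on -1 to 1, which may differ from the program's own v-shaped population.

```python
import random
import statistics

def draw_v_shaped(rng):
    # Assumed v-shaped density proportional to |x| on -1 to 1.
    magnitude = rng.random() ** 0.5
    return magnitude if rng.random() < 0.5 else -magnitude

def excess_kurtosis(values):
    # Fourth central moment over squared variance, minus 3 (the normal value).
    mean = statistics.fmean(values)
    m2 = statistics.fmean([(v - mean) ** 2 for v in values])
    m4 = statistics.fmean([(v - mean) ** 4 for v in values])
    return m4 / (m2 ** 2) - 3.0

rng = random.Random(3)
number_of_samples = 5000  # held constant, as in the exercise
kurtosis_by_size = {}

for sample_size in (2, 10, 30, 40):
    means = [
        statistics.fmean([draw_v_shaped(rng) for _ in range(sample_size)])
        for _ in range(number_of_samples)
    ]
    kurtosis_by_size[sample_size] = excess_kurtosis(means)
    print(f"sample size {sample_size:2d}: excess kurtosis {kurtosis_by_size[sample_size]:+.3f}")
```

With a sample size of 2 the value should be clearly negative (a flatter shape than normal), and by a sample size of 10 it should be much closer to 0.  With only 5,000 samples, the difference between sample sizes 30 and 40 is likely to be lost in sampling noise, which matches the observation that 40 adds little over 30.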

What happens to standard error as n increases?

Make a number of sample mean distributions by varying the sample size.  Since the standard error is equal to the population standard deviation divided by the square root of the sample size, increasing the sample size will decrease the standard error.
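The relationship can be tabulated directly from the formula.  The short sketch below assumes a population standard deviation of 1; the sample sizes are arbitrary choices within the program's limit of 40.

```python
import math

sigma = 1.0  # assumed population standard deviation

# Standard error = sigma / sqrt(n) for a few sample sizes.
standard_errors = {n: sigma / math.sqrt(n) for n in (4, 9, 16, 25, 36)}

for n, se in standard_errors.items():
    print(f"n = {n:2d}  standard error = {se:.4f}")
```

Because of the square root, the sample size must be quadrupled to cut the standard error in half.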

Make a demonstration with each population shape using a sample size of 1.  Can you explain the results?


 

 


Central Limit Demonstration

Plain English Translation

Imagine that you are a student in Prof. D. Mented's Introductory Statistics class.  This is a popular course because Prof. D. Mented starts the course by giving everyone a passing grade (50/100).  That's the good news.  The bad news is that the course is full of pop quizzes.  The pop quizzes always have 10 questions.  Depending on the quality of the answer, each question can earn up to half a mark.  But, if the answer is really terrible, students can also lose up to half a mark.  So, scores on the pop quizzes can range from -5 to 5 marks.  If students completely flunk a pop quiz, 5 marks will be taken away from their starting mark of 50!

D. Mented has been teaching stats for 45 years using this method, and has saved the marks from all of her students' pop quizzes.  This population of students' pop quiz scores happens to be normally distributed, with a mean (mu) of 0, and a standard deviation of 1.  If we wanted to use this population of scores to demonstrate the Central Limit Theorem, this is what we'd do:

a. We'd first decide what sample size we were going to use to make a new distribution: a distribution of sample means.  Let's say n=9.

b. Next we'd randomly select 9 quiz scores, calculate the mean, draw another 9 scores, calculate the mean, and so on until we had drawn as many samples of size 9 as practical and calculated the mean for each.  (The computer programme limits you to a maximum of 10,000 samples.)

c. The means of the samples that were drawn in Step b are plotted in a new distribution: the distribution of sample means.  This is the kind of distribution that you build when you run the Central Limit Demonstration, after selecting the size of the random samples, the number of samples to draw, and the shape of the population from which to draw the samples.

If you run the Central Limit Demonstration now, and choose 5,000 samples of size 9, drawn from a normal distribution, you should see that the distribution of the sample means: i) is normally distributed, ii) has a mean of 0 (like the population from which it was drawn), and iii) has a standard error of 0.3333 [1 (the standard deviation of the population of quiz scores) divided by the square root of 9 (the sample size), which equals 0.3333].  The x-axis labels mark the values of the first three standard deviations.
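The same figures can be checked outside the program with a short simulation.  The Python sketch below is an illustration only, not the program's own code; the seed and variable names are arbitrary.

```python
import random
import statistics

rng = random.Random(9)
sample_size = 9
number_of_samples = 5000

# Draw 5,000 samples of 9 quiz scores from a normal population
# with mean 0 and standard deviation 1, keeping each sample's mean.
sample_means = []
for _ in range(number_of_samples):
    scores = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
    sample_means.append(statistics.fmean(scores))

mean_of_means = statistics.fmean(sample_means)
observed_se = statistics.pstdev(sample_means)
expected_se = 1.0 / sample_size ** 0.5  # sigma / sqrt(n)

print(f"mean of sample means:    {mean_of_means:+.4f}")
print(f"observed standard error: {observed_se:.4f}")
print(f"expected standard error: {expected_se:.4f}")
```

The mean of the sample means should land close to 0 and the observed standard error close to 0.3333, with small differences from run to run because the samples are random.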

If these points are still unclear, consult the Complete Means Demonstration, where they are first introduced.