Regression Demonstration

How do I do the Regression Demonstration?

1. Select Regression Demonstration from the main menu.

2. Click where you think the left or right end of the regression line should be, then click where you think the other end should be. The computer will draw a line through the two points you have chosen. You may then choose a new point for either end of the line as many times as you like. If you press the Escape key, the computer will draw the best-fitting regression line for you.
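The line through the two clicked points can be sketched in a few lines of Python. This is only an illustration of the geometry, not the program's own code: the function name line_through and the click coordinates are hypothetical, and the clicks are assumed to be already converted to data coordinates.

```python
def line_through(p1, p2):
    """Return (slope, intercept) of the line through two (x, y) points."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        # A vertical line cannot be written as y = slope * x + intercept.
        raise ValueError("the two points need different x values")
    slope = (y2 - y1) / (x2 - x1)
    intercept = y1 - slope * x1
    return slope, intercept


# Hypothetical clicks for the left and right ends of the line.
left_click = (1.0, 2.0)
right_click = (5.0, 4.0)

slope, intercept = line_through(left_click, right_click)
print(slope, intercept)  # prints: 0.5 1.5
```

Choosing a new point for one end amounts to calling the function again with that point replaced.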

What will I see in the Regression Demonstration?

Each time you start the Regression Demonstration, your screen will display 5 randomly chosen data points, labelled a to e, on a scatterplot. A red x marks the point (x-mean, y-mean). When the computer draws a line, it shows you the absolute value of the distance between the y value predicted by your line and each actual y value. At the top of your screen are the values for Syx (the minimum possible value for error in prediction) and err (the actual error in prediction given by your line).
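As a rough illustration of the quantities described above, the following Python sketch computes the point (x-mean, y-mean) and the absolute distance between each actual y value and the y value predicted by a line. The five data points and the line (slope 0.5, intercept 1.5) are made up for the example, and the function names are hypothetical; the program chooses its own random points.

```python
# Hypothetical data standing in for the five random points a to e.
points = {
    "a": (1.0, 2.0),
    "b": (2.0, 1.0),
    "c": (3.0, 4.0),
    "d": (4.0, 3.0),
    "e": (5.0, 5.0),
}


def mean_point(points):
    """Return (x-mean, y-mean), the location of the red x."""
    xs = [x for x, _ in points.values()]
    ys = [y for _, y in points.values()]
    return sum(xs) / len(xs), sum(ys) / len(ys)


def absolute_distances(points, slope, intercept):
    """Return |actual y - predicted y| for each labelled point."""
    return {
        label: abs(y - (slope * x + intercept))
        for label, (x, y) in points.items()
    }


print(mean_point(points))  # prints: (3.0, 3.0)

# Distances for a hypothetical user-drawn line y = 0.5 * x + 1.5.
for label, distance in absolute_distances(points, 0.5, 1.5).items():
    print(label, distance)
```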

Purpose of the Regression Demonstration

To demonstrate that the regression line, as calculated by the regression formula, is the best-fitting line.

What can I demonstrate in the Regression Demonstration?

When you select different points in the demonstration to define a line through the data, each line results in a certain amount of error (err) between the actual values of y and the predicted values of y (given by your line). The purpose of regression is to arrive at an equation for the line that best fits the data, that is, the line that minimizes the amount of error. This line is the regression line, and its error is equal to Syx; any other line through the data will result in greater error.

Start the Regression Demonstration and fit some lines to the data. Your screen will display the absolute value of the distance between the y value predicted by the line and each actual y value. At the top of your screen are the values for Syx and err. As you draw different lines through the data points, the value for err will change: it is the sum of the squared distances between the predicted y values and the actual y values, divided by 5 (the number of data points). Syx remains constant as you redraw the line; it is the minimum possible value for err, or the least squared error.

When you press the Escape key, the actual values for points a-e are displayed in a table, and the computer draws the best-fitting line through the data, so the values for Syx and err will be equal. Note that you will not be able to draw a line for which the value of err is smaller than the value for Syx (but you should try!). Note also that the regression line drawn by the computer always passes through (x-mean, y-mean).
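The same experiment can be tried outside the program with a short Python sketch. It computes err as described above (sum of squared distances divided by the number of points), fits the line with the standard least-squares formulas, and compares the two. The data, the user-drawn line, and the function names are assumptions made for the example; the minimum err found here plays the role of the value the demonstration labels Syx, and the program's own calculation may differ in detail.

```python
def err(points, slope, intercept):
    """Sum of squared vertical distances divided by the number of points."""
    squared = [(y - (slope * x + intercept)) ** 2 for x, y in points]
    return sum(squared) / len(points)


def least_squares_line(points):
    """Return (slope, intercept) of the least-squares regression line."""
    n = len(points)
    x_mean = sum(x for x, _ in points) / n
    y_mean = sum(y for _, y in points) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in points)
    sxx = sum((x - x_mean) ** 2 for x, _ in points)
    slope = sxy / sxx
    # Choosing the intercept this way makes the line pass through the means.
    intercept = y_mean - slope * x_mean
    return slope, intercept


# Hypothetical data standing in for points a to e.
points = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (5.0, 5.0)]

best_slope, best_intercept = least_squares_line(points)
minimum_err = err(points, best_slope, best_intercept)

# A hypothetical user-drawn line, y = 0.5 * x + 1.5.
user_err = err(points, 0.5, 1.5)

print("err for the fitted line:", round(minimum_err, 4))
print("err for the user line:  ", round(user_err, 4))
print("user line is no better: ", user_err >= minimum_err)
```

Trying other slopes and intercepts in place of 0.5 and 1.5 mirrors redrawing the line in the demonstration: err changes with each line, but it should not fall below the err of the fitted line.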