Teaching Elementary Statistics through the Concept of "Loss Function"
RUMA FALK
Descriptive statistics offers us several averages for a given set of numbers.
Most elementary courses introduce the mean (i.e., arithmetic mean), the median and the mode.
An average, which is supposed to characterise a given distribution of values, is never identical with all the values (except in the trivial case where all the values are equal). Every suggested average therefore involves some inaccuracy. The answer to the question "what is the best representation of the numbers?" depends on what is meant by "best representation". One could interpret this to mean that the average incurs the least possible "cost" in terms of differences between the average and the actual values. Each definition of the "cost" is minimised by an appropriate average. Asking students to pay (symbolically) the costs of the errors incurred by using different averages is one way of introducing the averages through the idea of the least combined error.
The following procedure, which may be presented as a game in the classroom, has helped my students at both the secondary-school and college levels.
THE GAME
Number of participants: 2 to 6.
Materials:
1. A deck of cards, each bearing a positive integer. Ideally, the distribution of numbers on the cards should be skewed, so that in most random samples drawn from this population the different averages (mean, median, mode) take different values. An example of a population of N = 100 numbers is given in Table 1.
2. Nine instruction cards (IC). Each IC specifies a different
possible loss function, that is, a rule for computing the player’s payment
for deviations from his or her suggested average. The instructions could
be stated either verbally, or by mathematical symbols, or in both forms.
TABLE 1. Example population of N = 100 card values (table body missing from this copy)
Population averages: Mean = 10, Median = 4, Mode = 1.
The instruction cards are presented in the boxes shown in Figure 1 (cards g to i may be omitted for some classes because of the complexity of the expressions). X_{1}, X_{2}, …, X_{n} denote the actual values in a sample of size n. A is the average suggested by the player.
3. Each player should be provided with a sheet of paper, a pencil and
a table of squares of natural numbers up to 100, or with a hand calculator,
for performing the necessary calculations.
Figure 1 Instruction Cards
Procedure:
1. The deck of number cards is shuffled and then n = 7 cards are exposed and arranged in a row in front of the participants. (The cards may be arranged in increasing order.)
2. One IC is randomly drawn and read aloud.
3. Each player (a) writes down an integer representing the 7 numbers, and (b) computes the payment (in "points") associated with that suggestion. Naturally, one should choose this integer so as to minimise losses.
4. The players compare their suggestions and check each other’s computations.
5. The 7 cards are returned to their deck and so is the IC. Both decks are shuffled and the procedure is repeated.
6. The game is terminated after a predetermined number of rounds (e.g.,
8) and the losses of each player are summed up. The winner is the player
whose total loss is the smallest.
Examples.
1. Let us assume that the following 7 cards were sampled out of the deck: 1, 1, 1, 3, 5, 30, 75. Let us assume further that ICd was selected, i.e., the payment should equal the sum of the squared deviations. Suppose a player suggests 3 as the average. That player’s loss would be:
(1 − 3)^{2} + (1 − 3)^{2} + (1 − 3)^{2} + (3 − 3)^{2} + (5 − 3)^{2} + (30 − 3)^{2} + (75 − 3)^{2} = 5929
Another player suggesting the value 10 would pay 4942. One then asks: Could the situation be further improved by still a better suggestion?
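The question can be explored by brute force. The following Python sketch evaluates the ICd payment for every integer suggestion between the smallest and the largest card in the sample. The helper name `squared_loss` and the choice of search range are illustrative assumptions, not part of the game as described.

```python
def squared_loss(sample, a):
    """ICd payment: sum of squared deviations from the suggestion a."""
    return sum((x - a) ** 2 for x in sample)

sample = [1, 1, 1, 3, 5, 30, 75]

print(squared_loss(sample, 3))   # first player's payment
print(squared_loss(sample, 10))  # second player's payment

# Try every integer between the smallest and the largest card.
candidates = range(min(sample), max(sample) + 1)
best = min(candidates, key=lambda a: squared_loss(sample, a))
print(best, squared_loss(sample, best))

# Arithmetic mean of the sample, for comparison with the best integer.
print(sum(sample) / len(sample))
```

For this sample the search settles on 17, the integer nearest the sample mean of about 16.6, with a payment of 4641.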
2. Let the random sample of cards be: 1, 2, 2, 4, 4, 50, 100, and let ICb be drawn, so that the loss equals the largest absolute deviation. A player offering the number 4 will lose 96 = |4 − 100|. Another player suggesting 75 would have a largest absolute deviation of only 74 = |75 − 1|. Could one further reduce the loss by improving the suggestion?
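A short Python sketch can answer this by checking every integer suggestion. The function name `max_abs_loss` is a hypothetical helper introduced here for illustration.

```python
def max_abs_loss(sample, a):
    """ICb payment: largest absolute deviation from the suggestion a."""
    return max(abs(x - a) for x in sample)

sample = [1, 2, 2, 4, 4, 50, 100]

print(max_abs_loss(sample, 4))   # first player's payment
print(max_abs_loss(sample, 75))  # second player's payment

# Payment for every integer between the smallest and the largest card.
losses = {a: max_abs_loss(sample, a)
          for a in range(min(sample), max(sample) + 1)}
smallest = min(losses.values())
best = [a for a, loss in losses.items() if loss == smallest]
print(best, smallest)

# The midpoint of the two extreme cards, for comparison.
midrange = (min(sample) + max(sample)) / 2
print(midrange, max_abs_loss(sample, midrange))
```

Here the two integers on either side of the midpoint, 50 and 51, tie for the smallest payment of 50. If non-integer suggestions were allowed, the midpoint 50.5 itself would cost 49.5.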
Comments.
Some of the optimal suggestions were easily discovered even by very young junior high school students. Discovery of the mode, as the average minimising "the number of errors" (ICe), is immediate. Short experimentation reveals that "the largest absolute deviation" (ICb) is minimised by the midrange, i.e., (Xmin + Xmax)/2.
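The observation about the mode can be checked with a few lines of Python. The sketch assumes that "the number of errors" on ICe means the number of cards whose value differs from the suggestion. The card itself is not reproduced here, so this reading and the helper name `number_of_errors` are assumptions.

```python
from statistics import mode

def number_of_errors(sample, a):
    """Assumed ICe payment: how many cards differ from the suggestion a."""
    return sum(1 for x in sample if x != a)

sample = [1, 1, 1, 3, 5, 30, 75]

# Compare the most frequent value with two other suggestions.
for a in (1, 3, 17):
    print(a, number_of_errors(sample, a))

print(mode(sample))  # the most frequent value in the sample
```

Suggesting the most frequent value, 1, leaves four cards in error. Any other sample value leaves six, and a value that is not on any card leaves all seven.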
Students usually discover, after some trial and error, that one should suggest the median in order to minimise "the sum of absolute deviations" (ICf), even though they are neither familiar with the term "median" nor able to prove their discovery. Often the arithmetic mean is tried first. This may be due to past learning, since the arithmetic mean X̄ is usually learned in elementary school and students tend to equate the concept of average with it.
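The trial and error that students go through can be mimicked in Python. This is a minimal sketch that searches the integers for the suggestion with the smallest ICf payment and compares it with the median and the rounded mean. The helper name `abs_loss` is an illustrative assumption.

```python
from statistics import mean, median

def abs_loss(sample, a):
    """ICf payment: sum of absolute deviations from the suggestion a."""
    return sum(abs(x - a) for x in sample)

sample = [1, 1, 1, 3, 5, 30, 75]

# Search every integer between the smallest and the largest card.
candidates = range(min(sample), max(sample) + 1)
best = min(candidates, key=lambda a: abs_loss(sample, a))
print(best, abs_loss(sample, best))

# The median, and the payment for the rounded arithmetic mean.
print(median(sample))
rounded_mean = round(mean(sample))
print(rounded_mean, abs_loss(sample, rounded_mean))
```

For this sample the best suggestion is the median, 3, at a cost of 107 points, whereas the rounded mean, 17, costs 145.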
Discovering that "the sum of squared deviations" (ICd) is minimised by the arithmetic mean is a different matter, although after experimenting with different alternatives, students usually get the feeling that the optimal value is heavily affected by extreme values and should be "drawn" in their direction. The fact that habit makes the arithmetic mean highly available sometimes helps students to come up with the best suggestion on the first trial.
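For classes ready for a proof, the result follows from a standard decomposition, sketched here in LaTeX. It uses the notation of the instruction cards, with \bar{X} for the arithmetic mean of the sample, and relies on the fact that the deviations from the mean sum to zero.

```latex
\begin{align*}
\sum_{i=1}^{n} (X_i - A)^2
  &= \sum_{i=1}^{n} \bigl[(X_i - \bar{X}) + (\bar{X} - A)\bigr]^2 \\
  &= \sum_{i=1}^{n} (X_i - \bar{X})^2
     + 2(\bar{X} - A)\sum_{i=1}^{n} (X_i - \bar{X})
     + n(\bar{X} - A)^2 \\
  &= \sum_{i=1}^{n} (X_i - \bar{X})^2 + n(\bar{X} - A)^2,
  \qquad \text{since } \sum_{i=1}^{n} (X_i - \bar{X}) = 0 .
\end{align*}
```

The first term does not depend on A, and the second is zero only when A = \bar{X}, so the payment is smallest at the arithmetic mean. When only integers may be suggested, the best choice is the integer nearest the mean.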
Most pupils have a hard time when first trying to minimise the product of the absolute deviations (ICc). Once they try the mode, they suddenly realise that any sample value will do the job, and they enjoy the trick.
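The trick is easy to confirm numerically. In this Python sketch (the helper name `product_loss` is an illustrative assumption), every suggestion that coincides with one of the cards contributes a zero factor, so the whole product vanishes.

```python
from math import prod

def product_loss(sample, a):
    """ICc payment: product of the absolute deviations from a."""
    return prod(abs(x - a) for x in sample)

sample = [1, 2, 2, 4, 4, 50, 100]

# Collect every integer suggestion from 1 to 100 that costs nothing.
free = [a for a in range(1, 101) if product_loss(sample, a) == 0]
print(free)

# A suggestion that is not on any card always costs something.
print(product_loss(sample, 3))
```

The zero-cost suggestions are exactly the distinct values on the cards: 1, 2, 4, 50 and 100.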
Even those students who fail to find the correct solutions get experience that helps them understand "what it's all about" once the averages are discussed in class.
Exact understanding of the verbal (and symbolic) instructions on the cards is not always easy. Many students find the distinction between "sum of absolute deviations" (ICf) and "absolute value of the sum of deviations" (ICa) quite difficult. Concentrating on these two expressions and working out the two functions they define could be a rewarding exercise in its own right.
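Tabulating the two expressions side by side may help make the distinction concrete. The Python sketch below takes the verbal descriptions literally. The function names `abs_of_sum` and `sum_of_abs` are hypothetical labels, and the sample is chosen only for illustration.

```python
def abs_of_sum(sample, a):
    """ICa as described: absolute value of the sum of deviations."""
    return abs(sum(x - a for x in sample))

def sum_of_abs(sample, a):
    """ICf as described: sum of the absolute deviations."""
    return sum(abs(x - a) for x in sample)

sample = [1, 2, 2, 4, 4, 50, 100]

# Compare the two payments at the median (4) and near the mean (23, 24).
for a in (4, 23, 24):
    print(a, abs_of_sum(sample, a), sum_of_abs(sample, a))
```

In the first expression positive and negative deviations cancel before the absolute value is taken, so the payment is small near the mean (about 23.3 for this sample). In the second expression each deviation counts in full, and the payment is smallest at the median, 4.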
A secondary benefit of playing the game is practising concepts like absolute value, squared deviation and so on. Searching for the value that minimises a given function helps acquaint the student with the way a function may vary when the independent variable undergoes small changes.
Jerusalem
Acknowledgements
Preparation of this paper was partly supported by the Human Development Center, The Hebrew University of Jerusalem.
The author is indebted to Baruch Fischhoff for his helpful comments.