Tue Dec 1 11:23:00 1998
Erwin Morton
For nearly two weeks, timss-forum has been dominated--to its
detriment--by a flurry of messages to and from one person.  This traffic
has essentially stopped all other threads and made it difficult to
discuss anything except one man's personal agenda.

May I suggest that everyone reading or participating in this forum would
find it instructive to look at the following web sites--the home pages
of the sites repeatedly referenced by Mr. Knight:

See also:

In addition to the obvious I-have-a-point-to-prove orientation of these
sites, I note that the "close correlations" referenced by Mr. Knight
involve "selected countries", and not always the same number of
countries.  What is the basis of the selection?  Has the person who
created the graphs selected the countries that best fit the correlation
he desires?  [I believe that in this instance I can safely use the
pronoun "he" without fear of being labeled sexist.]  How good is the
analysis of the data?  Here is a specific example, drawn from one of the
web sites Mr. Knight cites:

The three graphs

all purport to show correlations between TIMSS scores and the percentage
of male teachers, in 7, 13, and 17 countries respectively.  educate32
(the 7-country graph) and educate31 (the 13-country graph) show the same
7 data points; educate31 adds 6 more countries, all of which have
predominantly *male* teachers and relatively *low* TIMSS scores (i.e.,
all six points are on the right and well below the fitted line).  Yet
the (least-squares?) fitted line is identical on the two graphs!  That
is, the extra six points have no apparent effect on the fit!
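To see why an unchanged line is suspicious, here is a minimal Python sketch. The numbers are invented purely for illustration and are NOT the TIMSS data; the function name fit_line is also my own. It fits an ordinary least-squares line to seven points, then adds six points that lie to the right of and well below that line, and refits.

```python
def fit_line(points):
    """Ordinary least-squares fit of y on x; returns (slope, intercept)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical (percent male teachers, score) pairs, not real TIMSS data.
seven = [(10, 480), (15, 490), (20, 505), (25, 515),
         (30, 530), (35, 540), (40, 555)]

# Six invented extra points: high percentage of male teachers, low scores.
extra = [(60, 470), (65, 460), (70, 480), (75, 455), (80, 465), (85, 450)]

slope7, intercept7 = fit_line(seven)
slope13, intercept13 = fit_line(seven + extra)

print(f"7 points:  slope = {slope7:.2f}")
print(f"13 points: slope = {slope13:.2f}")
```

With these made-up values the slope goes from 2.5 to roughly -0.8. Real data would behave differently in detail, but six such points should not leave a least-squares line untouched.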

The caption on educate32 (7 countries) says TIMSS scores increase 1
point for each 1% increase in male teachers, while the caption on
educate31 (13 countries), with this trend obviously weakened if not even
reversed, says TIMSS scores increase *4* points for each 1% increase in
male teachers!  (The trend line shown on the graphs actually increases
by about 2.5 points for every 1% increase in male teachers.)

educate21, with more countries, adds more extreme outliers on *both*
sides of the line, and appears, at first glance, to have a rather low
correlation coefficient.  But the same conclusion is drawn--4 points for
each 1%.

And nowhere, of course, is there a word about correlation coefficients,
fitting methods, error bars, etc., etc., etc.
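As a rough indication of what such a report might contain, the following Python sketch computes a least-squares slope, its standard error, and the Pearson correlation coefficient. The five data points are hypothetical, and fit_summary is a name I have made up; this is one conventional way to do it, not a reconstruction of whatever method produced the graphs.

```python
import math


def fit_summary(points):
    """Return slope, intercept, slope standard error, and Pearson r."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares; clamp at zero to guard against rounding.
    residual_ss = max(0.0, syy - slope * sxy)
    slope_se = math.sqrt(residual_ss / (n - 2) / sxx)
    r = sxy / math.sqrt(sxx * syy)
    return {"slope": slope, "intercept": intercept,
            "slope_se": slope_se, "r": r}


# Hypothetical (percent male teachers, score) pairs, not real TIMSS data.
sample = [(10, 500), (20, 520), (30, 510), (40, 545), (50, 530)]
summary = fit_summary(sample)
print(f"slope = {summary['slope']:.2f} +/- {summary['slope_se']:.2f}, "
      f"r = {summary['r']:.2f}")
```

For these invented points the slope is 0.85 with a standard error of about 0.41, so the uncertainty is nearly half the size of the estimate. A caption quoting the slope alone would hide that.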

Even if we ignore the sloppy mathematics, it is dangerous to select the
data that fit (or appear to fit, or can be made to appear to fit) the
thesis one is trying to demonstrate, and to ignore or reject all other
data by waving one's hand and labeling it outlying, discordant, or
nonconforming.
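A small Python sketch shows how much selection alone can do. The data are an artificial 5-by-5 grid of points, deliberately constructed so that x and y are completely uncorrelated. The function name pearson_r and the cutoff of 20 are my own choices for illustration. Discarding the points that lie far from the line y = x, as if they were "discordant", manufactures a strong correlation.

```python
import math


def pearson_r(points):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / math.sqrt(sxx * syy)


# Artificial data: every combination of x and y in 10, 30, 50, 70, 90.
grid = [(x, y) for x in range(10, 100, 20) for y in range(10, 100, 20)]

# "Selected" data: keep only points within 20 units of the line y = x.
kept = [(x, y) for x, y in grid if abs(y - x) <= 20]

print(f"all {len(grid)} points:  r = {pearson_r(grid):.2f}")
print(f"kept {len(kept)} points: r = {pearson_r(kept):.2f}")
```

The full grid has a correlation of exactly zero. The 13 points that survive the selection have a correlation of about 0.82, although nothing about the underlying data has changed.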

It is not a sound basis for research, for understanding, or for policy
decisions.  It is a good way to delude ourselves, and an excellent
method for deluding others (intentionally or unintentionally).

Remember also that even a true close correlation does not, in and of
itself, demonstrate causality.

Pretty graphs are impressive and memorable, and may seem convincing, but
we should be extremely cautious about accepting any of these conclusions
without verifying that both the data and its analysis are correct and
complete.  We should also consider both what other interpretations might
be possible, and what other research sheds light on the same questions.

But, to quote Mr. Knight (Monday, Nov. 30):

> However, even without correcting these obvious erroneous data points,
> there is still such a close correlation between TIMSS Scores and the
> percent of teachers who are men that this probable factor cannot
> continue to be ignored
> []  It
> would not be appropriate to refocus the discussion on TIMSS data at
> this point, particularly if this *is* the root of the problem.

Yet even the graphs he cites do not confirm the correlation he claims.

Again, quoting Mr. Knight:

> Such pop edudcation (sic) theories obviously don't work.

I could not agree more, but I doubt that he and I are speaking of the
same theories.

The friendly folks at US TIMSS have been careful to distinguish among
(a) what the data shows, (b) what the data suggests, (c) what questions
the data simply does not answer, and (d) what questions the data does
not even address.

I suggest we try to do the same.

--Erwin Morton



