Sampling Precision and Sample Size

Because TIMSS is fundamentally a study of mathematics and science
achievement among fourth and eighth grade students, the precision of
survey estimates of student achievement and characteristics was of primary
importance. However, TIMSS also reports extensively on school, teacher,
and classroom characteristics, so it is necessary to have sufficiently large
samples of schools and classes. The TIMSS standards for sampling precision
require that all student samples have an effective sample size of at least 400
students for the main criterion variable, which is mathematics and science
achievement. In other words, all student samples should yield sampling
errors that are no greater than would be obtained from a simple random
sample of 400 students.

Given that sampling error, when using simple random sampling, can be
expressed as $SE_{SRS} = S / \sqrt{n}$, where $S$ gives the population
standard deviation and $n$ the sample size, a simple random sample of 400
students would yield a 95 percent confidence interval for an estimate of a
student-level mean of ±10 percent of its standard deviation
($1.96 \cdot S / \sqrt{400}$). Because the TIMSS achievement scale has a
standard deviation of 100 points, this translates into a ±10 points
confidence limit (or a standard error estimate of approximately 5 points).
Similarly, sample estimates of student-level percentages would have
a confidence interval of approximately ±5 percentage points.
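The arithmetic behind these limits can be checked with a short Python sketch. The function names are illustrative only, and the percentage case assumes the worst-case proportion of 0.5, where the standard deviation of a proportion, sqrt(p(1 - p)), is largest.

```python
import math

Z_95 = 1.96  # normal critical value for a 95 percent confidence interval


def srs_standard_error(sd, n):
    """Standard error of a mean under simple random sampling: S / sqrt(n)."""
    return sd / math.sqrt(n)


def ci_half_width(sd, n, z=Z_95):
    """Half-width of the confidence interval around an estimated mean."""
    return z * srs_standard_error(sd, n)


# Student-level mean on a scale with a standard deviation of 100 points.
se_points = srs_standard_error(100.0, 400)       # 5 points
mean_half_width = ci_half_width(100.0, 400)      # about 10 points

# Student-level percentage, assuming the worst case p = 0.5 (an assumption
# made here for illustration; other proportions give narrower intervals).
p = 0.5
pct_half_width = 100 * ci_half_width(math.sqrt(p * (1 - p)), 400)

print(f"SE of mean: {se_points:.1f} points")
print(f"95% CI half-width, mean: {mean_half_width:.1f} points")
print(f"95% CI half-width, percentage: {pct_half_width:.1f} percentage points")
```

The exact values, 9.8 points and 4.9 percentage points, round to the ±10 and ±5 figures quoted in the text.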

Notwithstanding these precision requirements, TIMSS required that
all student sample sizes should not be less than 4,000 students. This was
necessary to ensure adequate sample sizes for analyses where the student
population was broken down into many subgroups. For countries involved in
the previous TIMSS cycle in 2003, this minimum student sample size was set
to 5,150 students in order to compensate for participation in the TIMSS 2007
Bridging Study. Furthermore, since TIMSS planned to conduct analyses at the
school and classroom level in addition to the student level, all school sample
sizes were required to be not less than 150 schools, unless a complete census
failed to reach this minimum. Under simple random sampling assumptions,
a sample of 150 schools yields a 95 percent confidence interval for an estimate
of a school-level mean that is ±16 percent of a standard deviation.

Although the TIMSS sampling precision requirements are such that
they would be satisfied by a simple random sample of 400 students, sample
designs such as the TIMSS 2007 school-and-class design typically require
much larger student samples to achieve the same level of precision. Because
students in the same school, and even more so in the same class, tend
to be more like each other than like other students in the population,
sampling a single class of 30 students will yield less information per student
than a random sample of students drawn from across all students in the
population. TIMSS uses the intraclass correlation, a statistic indicating
how much students in a group are similar on an outcome measure, and a
related measure known as the design effect to adjust for this “clustering”
effect in planning sample sizes.
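To make the clustering effect concrete, the following Python sketch converts a clustered sample into its effective sample size. It assumes the common approximation Deff ≈ 1 + ρ(m − 1), the form used in the planning formulas of this section, and an intraclass correlation of 0.3 chosen purely for illustration; the function names are hypothetical.

```python
def design_effect(cluster_size, rho):
    """Approximate design effect for equal-sized clusters: 1 + rho * (m - 1)."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie between 0 and 1")
    return 1.0 + rho * (cluster_size - 1)


def effective_sample_size(n_students, cluster_size, rho):
    """Size of the simple random sample that would give the same precision."""
    return n_students / design_effect(cluster_size, rho)


# One intact class of 30 students, with an assumed intraclass correlation of 0.3.
one_class = effective_sample_size(30, 30, 0.3)
print(f"Design effect: {design_effect(30, 0.3):.1f}")
print(f"Effective sample size of one class of 30: {one_class:.1f}")
```

Under these assumptions, a single class of 30 students carries roughly as much information about the population mean as three students drawn at random.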

For countries taking part in TIMSS for the first time in 2007, the
following mathematical formulas were used to estimate how many schools
should be sampled to achieve an acceptable level of sampling precision:

$$
Var_{PPS} = Deff \cdot Var_{SRS} = Deff \cdot \frac{S^2}{n}
\cong \left[1 + \rho\,(mcs - 1)\right] \cdot \frac{S^2}{n}
\cong \left[1 + \rho\,(mcs - 1)\right] \cdot \frac{S^2}{a \cdot mcs}
$$

where $Deff$ is a compensation factor for using a sample selection method
that differs from a simple random sample (also called design effect), $S^2$ gives
the variance of the population, $\rho$ measures the intraclass correlation between
clusters, $mcs$ corresponds to the average number of sampled students per
class, and $a$ gives the number of schools to sample. Incorporating the
precision requirements described earlier into this equation, which translates
into $Var_{PPS} = (0.05)^2 \cdot S^2$, gives the number of schools required as:

(1) $a = 400 \cdot \dfrac{1 + \rho\,(mcs - 1)}{mcs}$
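Written out, the step to equation (1) sets the design-based variance equal to the variance of a simple random sample of 400 students, $S^2 / 400 = (0.05)^2 \cdot S^2$, and solves for $a$. The intermediate lines below are a reconstruction of that algebra rather than a quotation from the report.

```latex
\begin{align*}
\left[1 + \rho\,(mcs - 1)\right] \cdot \frac{S^2}{a \cdot mcs}
  &= (0.05)^2 \cdot S^2 = \frac{S^2}{400} \\
\frac{1 + \rho\,(mcs - 1)}{a \cdot mcs}
  &= \frac{1}{400} \\
a &= 400 \cdot \frac{1 + \rho\,(mcs - 1)}{mcs}
\end{align*}
```

Because $S^2$ cancels in the second line, the required number of schools depends only on the class size and the intraclass correlation, not on the scale of the achievement variable.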

For planning purposes, the intraclass correlation coefficient usually was
set to 0.3 if no other information was available. For example, with an $mcs$ of
20 students and a $\rho$ of 0.3, equation (1) gives 134 schools.
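Equation (1) is simple enough to evaluate directly. The sketch below is one possible Python rendering; the function name and the `effective_n` parameter are illustrative choices, with defaults taken from the planning values stated in the text.

```python
def schools_required(mcs, rho=0.3, effective_n=400):
    """Number of schools from equation (1): effective_n * [1 + rho*(mcs - 1)] / mcs.

    mcs          average number of sampled students per class
    rho          intraclass correlation (0.3 is the planning default in the text)
    effective_n  target effective sample size (400 in the text)
    """
    if mcs <= 0:
        raise ValueError("mcs must be positive")
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie between 0 and 1")
    design_effect = 1.0 + rho * (mcs - 1)
    return effective_n * design_effect / mcs


# Worked example from the text: mcs = 20 and rho = 0.3 give 134 schools.
print(round(schools_required(20, 0.3)))

# How the requirement changes with class size at the planning default rho.
for class_size in (15, 20, 25, 30):
    print(class_size, round(schools_required(class_size), 1))
```

The function returns an unrounded value so that the caller can decide how to round; in practice a fractional result would be rounded up to a whole number of schools.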

Equation (1) is a model for determining how many schools were required
for the TIMSS 2007 sample under the assumption that the standard error of
the criterion variable (student mathematics and science achievement) reflects
only sampling variance, the usual situation in sample surveys. However,
because of its complex matrix-sampling assessment design, standard errors
in TIMSS include an imputation error component in addition to the usual
sampling error component (see Chapter 11). To keep the standard error
within the prescribed precision limits, the number of schools determined
by equation (1) has to be increased, as shown in equation (2):

(2) $a_{irt} = \dfrac{400 \cdot 0.5}{mcs}$