What is stratified random sampling?

Assume we have a list of all college professors at the U of R. These individuals can be classified by rank: assistant, associate and full professor. Assume the breakdown is 60% assistant professors, 30% associate professors and 10% full professors.

We want to sample 100 of the professors at the university.

One option is a simple random sample: make a list of all professors regardless of rank, and draw 100 names at random from this list. The sample should include roughly the correct proportion of each rank, although the exact counts will vary by chance.
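As a rough illustration of the simple random sample, here is a minimal Python sketch. The roster of 1,000 professors and the function name simple_random_sample are assumptions made for the example. Only the 60/30/10 split comes from the text.

```python
import random
from collections import Counter

# Hypothetical roster: 1,000 professors split 60% / 30% / 10% by rank.
roster = (
    [("assistant", i) for i in range(600)]
    + [("associate", i) for i in range(300)]
    + [("full", i) for i in range(100)]
)

def simple_random_sample(population, n, seed=None):
    """Draw n members without replacement, ignoring rank."""
    rng = random.Random(seed)
    return rng.sample(population, n)

sample = simple_random_sample(roster, 100, seed=1)
rank_counts = Counter(rank for rank, _ in sample)
print(rank_counts)  # counts land near 60/30/10 but change from draw to draw
```

Running this with different seeds shows the rank counts drifting around the population proportions rather than matching them exactly.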

Alternatively, we can take a stratified sample, drawing separately from each group. Make a list of all assistant professors. If our total sample is 100, then 60% of the sample, or 60 individuals, should come from the assistant professor list. We take a simple random sample of 60 individuals from that list. We do the same with the other two groups of professors. This guarantees that the sample matches the population exactly on the variable rank.

We then send questionnaires to all 100 professors selected.

Our sample will then contain 60 assistant professors, 30 associate professors and 10 full professors.
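The procedure just described can be sketched in Python as follows. The stratum sizes (600, 300 and 100) and the function name stratified_sample are assumptions for illustration. The method itself is the one in the text: a simple random sample within each rank, sized in proportion to that rank's share of the population.

```python
import random
from collections import Counter

def stratified_sample(strata, n, seed=None):
    """Proportional allocation: each stratum supplies its population share of n."""
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = []
    for name, members in strata.items():
        # Rounding is exact here; in general the rounded counts
        # may not add up to n and would need adjusting.
        k = round(n * len(members) / total)
        sample.extend((name, m) for m in rng.sample(members, k))
    return sample

# Hypothetical lists of professor IDs for each rank.
strata = {
    "assistant": list(range(600)),
    "associate": list(range(300)),
    "full": list(range(100)),
}
picked = stratified_sample(strata, 100, seed=1)
counts = Counter(rank for rank, _ in picked)
print(counts)  # exactly 60 assistant, 30 associate, 10 full
```

Unlike a simple random sample, the rank counts here are fixed by design. Only the choice of individuals within each rank is random.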

We could also stratify on more than one variable, such as rank and sex.

In this case we need to determine what proportion of professors are in each combination of rank and sex.

                  Rank
Sex       Assistant   Associate   Full
Male      40%         20%         8%
Female    20%         10%         2%

To create a stratified sample, we draw from 6 subgroups, one for each combination of rank and sex.

Let’s assume that we want to sample 50 professors.

In this case we would expect 40% of our sample to be male assistant professors. Since our entire sample will be 50 professors, we need to sample 50 x 40%, or 20 professors, from the entire list of male assistant professors.

Similarly, we need to sample 50 x 20%, or 10 professors, from the list of female assistant professors.

We do this for all the groups.

For female full professors, we would sample only 1 individual from the list of female full professors (50 x 2% = 1).
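The arithmetic for all six subgroups can be worked out with a short Python sketch. The shares are taken from the rank-by-sex table above. The function name allocate is a made-up helper for this example.

```python
# Population shares from the rank-by-sex table.
shares = {
    ("male", "assistant"): 0.40,
    ("male", "associate"): 0.20,
    ("male", "full"): 0.08,
    ("female", "assistant"): 0.20,
    ("female", "associate"): 0.10,
    ("female", "full"): 0.02,
}

def allocate(shares, n):
    """Number to draw from each subgroup under proportional allocation."""
    return {stratum: round(n * share) for stratum, share in shares.items()}

allocation = allocate(shares, 50)
for (sex, rank), k in allocation.items():
    print(f"{sex:6} {rank:9} {k}")
```

For a total sample of 50 this gives 20, 10 and 4 male professors and 10, 5 and 1 female professors across the three ranks, which adds up to 50.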

How does the accuracy of a stratified sample differ from that of a simple random sample?

The accuracy of a stratified sample is usually better than that of a simple random sample of the same size. With proportional allocation, as in the examples above, it is essentially never worse. An increase in accuracy can be expected if:

· There are major differences between the stratified groups, and

· Within each group, there is homogeneity

Using the professors as an example, there would be higher accuracy on a variable, such as an attitude question, if assistant professors had on average quite different views from associate professors, and both had different views from full professors. Also, there would have to be little difference within each group. That is, most assistant professors would think alike.

This would increase the accuracy for this variable. For other variables there may not be any increase in accuracy, since there may be a large overlap in attitudes regardless of rank.
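The effect of stratification on accuracy can be checked with a small simulation. The sketch below is illustrative only: the attitude scores, the group averages (2, 5 and 8), the within-group spread (0.5) and the population size are all invented for the example. It draws many samples of 100 each way and compares how much the sample mean varies from one sample to the next.

```python
import random
import statistics

rng = random.Random(0)

# Hypothetical attitude scores: ranks differ a lot from each other,
# while members of the same rank largely agree.
group_means = {"assistant": 2.0, "associate": 5.0, "full": 8.0}
group_sizes = {"assistant": 600, "associate": 300, "full": 100}
population = {
    rank: [rng.gauss(group_means[rank], 0.5) for _ in range(size)]
    for rank, size in group_sizes.items()
}
everyone = [score for scores in population.values() for score in scores]

def srs_mean(n):
    """Mean of a simple random sample of n scores."""
    return statistics.mean(rng.sample(everyone, n))

def stratified_mean(n):
    """Mean of a proportionally allocated stratified sample of n scores."""
    drawn = []
    for scores in population.values():
        k = round(n * len(scores) / len(everyone))
        drawn.extend(rng.sample(scores, k))
    return statistics.mean(drawn)

# Spread of the sample mean over 500 repeated samples of each kind.
srs_spread = statistics.stdev([srs_mean(100) for _ in range(500)])
strat_spread = statistics.stdev([stratified_mean(100) for _ in range(500)])
print(f"simple random: {srs_spread:.3f}  stratified: {strat_spread:.3f}")
```

Under these assumptions the stratified mean varies far less than the simple random sample mean. If the group averages are set equal to each other, the two spreads come out about the same, which matches the point that some variables gain little from stratifying on rank.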

When we report accuracy figures for a survey, we typically report a single worst-case figure for the survey as a whole. On further analysis we might find that some variables have much higher accuracy. Accuracy could be calculated and reported separately for each variable.

If we want to increase the accuracy of a survey, we can use stratified sampling. We would have to stratify on variables that we think reflect major differences between the subgroups.
