Social Studies 306
Fall 98
Answers for Assignment 4



General Notes

1. When handing in your assignments, show as much as possible of the coding and other instructions you used to obtain the results. This is especially important when using recodes, computes, or select cases. If you do not include these SPSS commands, I have no way of knowing what you did if your results differ from mine.

2. When using the recode, compute, and select cases commands, it is often a good idea to run off the frequency distribution first (with Statistics-Summarize-Frequencies), so that you know exactly how the data are coded and distributed.

3. When recoding a variable that has many values, there is no single correct way to recode. As a result, if the tables do not produce the results you expect, you might want to try another way of recoding the data. For example, in question 2, you might have tried further recodes to see if you could find a relationship between job hours and study hours.

4. When producing a cross-classification table, it is best to request either column or row percentages. Otherwise, you have to deal with unequal sample sizes in each column and attempt to estimate whether or not there is a relationship between the variables on this basis. This is usually quite difficult.

5. I could not figure out how to get the bar chart into this file for the web site, so it is not included.

Answers

For questions 1 and 2, I first ran off the frequency distributions in order to see how I should recode these. These tables follow.

FREQUENCIES
  VARIABLES=coffee drunk  .

Frequencies


Coffee Consumption

Frequency Percent Valid Percent Cumulative Percent
Valid 1 None 493 66.1 66.6 66.6
2 1 to 3 cups daily 193 25.9 26.1 92.7
3 4 to 6 cups daily 42 5.6 5.7 98.4
4 7 or more cups daily 12 1.6 1.6 100.0
Total 740 99.2 100.0
Missing 6 1 .1

9 No response 1 .1

System Missing 4 .5

Total 6 .8

Total 746 100.0


# Drinks to be Drunk

Frequency Percent Valid Percent Cumulative Percent
Valid 1 One 6 .8 1.0 1.0
2 2 to 4 224 30.0 36.8 37.8
3 5 to 7 278 37.3 45.7 83.6
4 8 or more 100 13.4 16.4 100.0
Total 608 81.5 100.0
Missing 0 1 .1

6 Other 1 .1

8 NOT APPLICABLE 1 .1

9 NO RESPONSE 8 1.1

System Missing 127 17.0

Total 138 18.5

Total 746 100.0


FREQUENCIES
  VARIABLES=sthours jobhours
  /STATISTICS=STDDEV MEAN MEDIAN .

Statistics

                                      N
                               Valid   Missing    Mean   Median   Std. Deviation
Study Hours                      716        30   15.81    12.00            12.21
HOURS PER WEEK AT JOB - W96      395       351   18.38    17.00             9.55

Study Hours

Frequency Percent Valid Percent Cumulative Percent
Valid 0 2 .3 .3 .3
1 5 .7 .7 1.0
2 20 2.7 2.8 3.8
3 19 2.5 2.7 6.4
4 26 3.5 3.6 10.1
5 39 5.2 5.4 15.5
6 36 4.8 5.0 20.5
7 20 2.7 2.8 23.3
8 42 5.6 5.9 29.2
9 12 1.6 1.7 30.9
10 122 16.4 17.0 47.9
11 6 .8 .8 48.7
12 23 3.1 3.2 52.0
13 4 .5 .6 52.5
14 13 1.7 1.8 54.3
15 64 8.6 8.9 63.3
16 2 .3 .3 63.5
17 8 1.1 1.1 64.7
18 12 1.6 1.7 66.3
19 1 .1 .1 66.5
20 86 11.5 12.0 78.5
21 6 .8 .8 79.3
22 4 .5 .6 79.9
24 4 .5 .6 80.4
25 36 4.8 5.0 85.5
27 4 .5 .6 86.0
28 6 .8 .8 86.9
30 34 4.6 4.7 91.6
32 2 .3 .3 91.9
35 9 1.2 1.3 93.2
36 3 .4 .4 93.6
40 23 3.1 3.2 96.8
42 1 .1 .1 96.9
45 2 .3 .3 97.2
50 9 1.2 1.3 98.5
52 1 .1 .1 98.6
56 1 .1 .1 98.7
60 5 .7 .7 99.4
70 1 .1 .1 99.6
80 1 .1 .1 99.7
90 1 .1 .1 99.9
99 1 .1 .1 100.0
Total 716 96.0 100.0
Missing 990 2 .3

997 Uncertain 1 .1

999 No response 17 2.3

System Missing 10 1.3

Total 30 4.0

Total 746 100.0


HOURS PER WEEK AT JOB - W96

Frequency Percent Valid Percent Cumulative Percent
Valid 1 4 .5 1.0 1.0
2 3 .4 .8 1.8
3 5 .7 1.3 3.0
4 6 .8 1.5 4.6
5 4 .5 1.0 5.6
6 10 1.3 2.5 8.1
7 5 .7 1.3 9.4
8 19 2.5 4.8 14.2
9 6 .8 1.5 15.7
10 28 3.8 7.1 22.8
11 2 .3 .5 23.3
12 22 2.9 5.6 28.9
13 1 .1 .3 29.1
13 9 1.2 2.3 31.4
13 1 .1 .3 31.6
14 6 .8 1.5 33.2
15 37 5.0 9.4 42.5
16 1 .1 .3 42.8
16 23 3.1 5.8 48.6
17 8 1.1 2.0 50.6
18 13 1.7 3.3 53.9
19 4 .5 1.0 54.9
20 59 7.9 14.9 69.9
21 1 .1 .3 70.1
22 12 1.6 3.0 73.2
23 6 .8 1.5 74.7
24 7 .9 1.8 76.5
25 29 3.9 7.3 83.8
26 1 .1 .3 84.1
27 3 .4 .8 84.8
28 2 .3 .5 85.3
29 1 .1 .3 85.6
30 25 3.4 6.3 91.9
32 7 .9 1.8 93.7
35 3 .4 .8 94.4
36 1 .1 .3 94.7
37 2 .3 .5 95.2
40 11 1.5 2.8 98.0
43 1 .1 .3 98.2
45 1 .1 .3 98.5
48 1 .1 .3 98.7
50 3 .4 .8 99.5
55 2 .3 .5 100.0
Total 395 52.9 100.0
Missing 99 NO RESPONSE 6 .8

System Missing 345 46.2

Total 351 47.1

Total 746 100.0


SPSS Instructions and output for Question 1.

RECODE
  coffee
  (1=0)  (2=2)  (3=5)  (4=8)  INTO  rcoffee .
EXECUTE .
RECODE
  drunk
  (1=1)  (2=3)  (3=6)  (4=10)  INTO  rdrunk .
EXECUTE .
FREQUENCIES
  VARIABLES=rcoffee rdrunk
  /STATISTICS=STDDEV MEAN .

Frequencies


Statistics

                 N
           Valid   Missing     Mean   Std. Deviation
RCOFFEE      740         6    .9351           1.6214
RDRUNK       608       138   5.5033           2.4407

RCOFFEE

Frequency Percent Valid Percent Cumulative Percent
Valid .00 493 66.1 66.6 66.6
2.00 193 25.9 26.1 92.7
5.00 42 5.6 5.7 98.4
8.00 12 1.6 1.6 100.0
Total 740 99.2 100.0
Missing System Missing 6 .8

Total 6 .8

Total 746 100.0


RDRUNK

Frequency Percent Valid Percent Cumulative Percent
Valid 1.00 6 .8 1.0 1.0
3.00 224 30.0 36.8 37.8
6.00 278 37.3 45.7 83.6
10.00 100 13.4 16.4 100.0
Total 608 81.5 100.0
Missing System Missing 138 18.5

Total 138 18.5

Total 746 100.0

Description

After recoding, the new codes represent the approximate midpoints of the intervals into which the data were originally grouped. For the open-ended top categories (7 or more cups, 8 or more drinks), the codes 8 and 10 are assumed typical values rather than true midpoints. As a result, the means and standard deviations should reasonably approximate the actual values of the variables being measured, assuming that respondents are truthful and that this sample is reasonably representative of University of Regina undergraduates.

For coffee consumption, almost exactly two-thirds of respondents report that they do not drink coffee, with over three-quarters of the remainder (193 of 247) reporting under four cups daily. Fewer than 2 per cent report drinking seven or more cups daily. Since those who do not drink coffee are included in the average, the mean number of cups of coffee drunk daily is just under one cup. Given that some respondents drink a considerable amount of coffee, the distribution is quite varied, with a standard deviation of 1.6 cups daily.

For the number of drinks required to become drunk, there is a lot of variation, with the standard deviation being 2.4 drinks. That is, there are a lot of respondents in each of the 2-4, 5-7, and 8 or more categories. The mean is 5.5 drinks required to become drunk. Only 16 per cent report requiring 8 or more drinks before becoming drunk.
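The means and standard deviations reported above can be checked by hand from the recoded frequency tables. The following is a minimal sketch of that check, written in Python rather than SPSS purely as a cross-check. The function name grouped_mean_sd is hypothetical, and the sketch assumes the sample standard deviation (divisor N - 1), which is what SPSS reports as Std. Deviation.

```python
from math import sqrt

def grouped_mean_sd(freqs):
    """Return (N, mean, sample SD) for a {value: count} frequency table."""
    n = sum(freqs.values())
    mean = sum(value * count for value, count in freqs.items()) / n
    # Sum of squared deviations from the mean, weighted by frequency.
    ss = sum(count * (value - mean) ** 2 for value, count in freqs.items())
    return n, mean, sqrt(ss / (n - 1))

# Frequencies copied from the RCOFFEE and RDRUNK tables above.
rcoffee = {0: 493, 2: 193, 5: 42, 8: 12}
rdrunk = {1: 6, 3: 224, 6: 278, 10: 100}

for name, freqs in (("RCOFFEE", rcoffee), ("RDRUNK", rdrunk)):
    n, mean, sd = grouped_mean_sd(freqs)
    print(f"{name}: N={n} mean={mean:.4f} sd={sd:.4f}")
```

Because the recoded values stand in for whole intervals, these statistics describe the recoded variable, not the exact number of cups or drinks.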

Question 2


RECODE
  sthours
  (Lowest thru 9=5)  (10 thru 19=15)  (20 thru Highest=25)  INTO  rst .
EXECUTE .
RECODE
  jobhours
  (Lowest thru 9=5)  (10 thru 19=15)  (20 thru Highest=25)  INTO  rj .
EXECUTE .
CROSSTABS
  /TABLES=rst  BY rj
  /FORMAT= AVALUE TABLES
  /CELLS= COUNT COLUMN .

Crosstabs



RST * RJ Crosstabulation

                                   RJ
                          5.00     15.00     25.00     Total
RST    5.00  Count          14        48        70       132
             % within RJ   22.6%     31.0%     38.0%     32.9%
      15.00  Count          22        66        59       147
             % within RJ   35.5%     42.6%     32.1%     36.7%
      25.00  Count          26        41        55       122
             % within RJ   41.9%     26.5%     29.9%     30.4%
Total        Count          62       155       184       401
             % within RJ  100.0%    100.0%    100.0%    100.0%

Answer for Question 2

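Since tables with many cells are difficult to analyze, I used only three categories for each of the two variables. As most of you noted, there is not much of a relationship between hours worked at a job or jobs and study hours. This may partly be because those without jobs are not included in this table; only those who report some hours worked at jobs are included. Note also that the table total of 401 is six more than the 395 valid responses for job hours. The range 20 thru Highest appears to have picked up the six cases coded 99 (no response), and the same could happen with the study hours missing codes (990, 997 and 999). Ending the top range below these codes would keep such cases out of the table.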

The above cross-classification table does show some relationship between these variables, though. In the third row of the table, 42 per cent of those with fewer than 10 hours at jobs report 20 or more hours per week spent studying. In contrast, for those with 10 or more hours at a job (categories 15 and 25 in the table), only between 26 and 30 per cent report this many hours studied.

Again, in the first row of the table, note that as the number of job hours is increased (moving from left to right) the percentage of those who report less than 10 hours spent studying increases regularly.

As a result, this table does show some tendency for those with fewer hours worked to study somewhat more, and those with more hours worked to be more concentrated in the fewer study hours categories.
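The column percentages discussed above come directly from the cell counts. As a rough illustration of what the COLUMN option in CROSSTABS computes, here is a short Python sketch that rebuilds them from the counts in the table. The variable names are hypothetical and the counts are typed in by hand from the output.

```python
# Cell counts from the RST * RJ table.
# Outer keys: study-hours code (rows); inner keys: job-hours code (columns).
counts = {
    5:  {5: 14, 15: 48, 25: 70},
    15: {5: 22, 15: 66, 25: 59},
    25: {5: 26, 15: 41, 25: 55},
}
job_codes = [5, 15, 25]

# Column totals: number of respondents in each job-hours category.
col_totals = {j: sum(row[j] for row in counts.values()) for j in job_codes}

# Column percentage: each cell as a share of its column total.
col_pct = {
    s: {j: 100 * row[j] / col_totals[j] for j in job_codes}
    for s, row in counts.items()
}

for s in counts:
    print(f"RST {s:>2}: " + "  ".join(f"{col_pct[s][j]:5.1f}%" for j in job_codes))
```

Reading across a row of these percentages compares job-hours groups of different sizes on an equal footing, which is why column or row percentages are requested in the first place.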

Question 3


COMPUTE ch = ch6 + ch611 + ch12 .
EXECUTE .
FREQUENCIES
  VARIABLES=ch
  /STATISTICS=STDDEV MEAN
  /BARCHART  FREQ .

Frequencies


Statistics

            N
      Valid   Missing    Mean   Std. Deviation
CH      716        30   .2109            .7028

CH

Frequency Percent Valid Percent Cumulative Percent
Valid .00 639 85.7 89.2 89.2
1.00 32 4.3 4.5 93.7
2.00 25 3.4 3.5 97.2
3.00 15 2.0 2.1 99.3
4.00 2 .3 .3 99.6
5.00 2 .3 .3 99.9
6.00 1 .1 .1 100.0
Total 716 96.0 100.0
Missing System Missing 30 4.0

Total 30 4.0

Total 746 100.0



Answer for Question 3

Almost 90 per cent of respondents report that they did not have children. Among those with children, the number of respondents declines as the number of children increases, with only 2 respondents having 4 children, 2 having 5 children, and one having 6 children.

This large concentration of respondents at 0 children produces a very small mean of 0.2 children per respondent. The fact that there are respondents with several children means that the standard deviation is considerably larger than the mean, at 0.7 children.

Question 4


COMPUTE prob = regret + suffer + relation + actions + blackout + violent + la .
EXECUTE .
FREQUENCIES
  VARIABLES=prob
  /STATISTICS=STDDEV MEAN .

Frequencies


Statistics

              N
        Valid   Missing     Mean   Std. Deviation
PROB      587       159   9.4821           2.0969

PROB

Frequency Percent Valid Percent Cumulative Percent
Valid 7.00 113 15.1 19.3 19.3
8.00 97 13.0 16.5 35.8
9.00 124 16.6 21.1 56.9
10.00 104 13.9 17.7 74.6
11.00 57 7.6 9.7 84.3
12.00 35 4.7 6.0 90.3
13.00 29 3.9 4.9 95.2
14.00 16 2.1 2.7 98.0
15.00 6 .8 1.0 99.0
16.00 1 .1 .2 99.1
17.00 1 .1 .2 99.3
18.00 4 .5 .7 100.0
Total 587 78.7 100.0
Missing System Missing 159 21.3

Total 159 21.3

Total 746 100.0


MEANS
  TABLES=prob  BY alcuse permit
  /CELLS MEAN COUNT STDDEV  .

Means


Case Processing Summary

                                                          Cases
                                          Included         Excluded          Total
                                         N   Percent      N   Percent      N   Percent
PROB * USED ALCOHOL?                   584     78.3%    162     21.7%    746    100.0%
PROB * Permitted to Drink Underage?    583     78.2%    163     21.8%    746    100.0%

PROB * USED ALCOHOL?
PROB
USED ALCOHOL?               Mean      N   Std. Deviation
2 Very rarely             8.7569    144           1.9833
3 Special occasions       9.2231    130           2.1359
4 Weekends                9.8474    249           1.9158
5 Several times weekly   10.0189     53           2.0521
6 Every day              11.8750      8           3.9438
Total                     9.4829    584           2.0987

PROB * Permitted to Drink Underage?
PROB
Permitted to Drink Underage?     Mean      N   Std. Deviation
1 Not at all                   9.6864    118           2.3632
2 Occasional permission        9.3494    352           1.9670
3 No limits                    9.7345    113           2.1672
Total                          9.4923    583           2.0955


Answers for Question 4

This is an example of a scale constructed from several variables. Each of the variables is coded in a similar manner, with 1 representing minimal or no alcohol related problems, 2 some problems, and 3 representing greater alcohol related problems. While the severity of the problems is greater for some of the questions (e.g. got into trouble with the law) than for others (e.g. regret drinking so much), by adding these together, the scale treats each problem as equal. This is one of the problematic aspects of this scale.

By adding together seven variables, each coded 1 to 3, the minimum possible value that anyone could have on this scale is 7 and the maximum is 21 (in the case that the respondent answers 3 for each of the 7 questions). Note that no one has a value more than 18, with the great bulk of respondents having values between 7 and 10 on the scale. This means that about three-quarters of respondents have had relatively few alcohol related problems. However, 28 respondents (about 5 per cent) did report 14 or more, indicating considerable alcohol related problems.
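The range of the scale and the shares quoted above follow from simple arithmetic on the PROB frequency table. The sketch below, in Python and with hypothetical variable names, works through that arithmetic using the frequencies copied from the output.

```python
# Scale built from 7 items, each coded 1 (few or no problems) to 3 (greater problems).
n_items, low_code, high_code = 7, 1, 3
scale_min = n_items * low_code    # every item answered 1
scale_max = n_items * high_code   # every item answered 3

# Frequencies copied from the PROB table above.
prob = {7: 113, 8: 97, 9: 124, 10: 104, 11: 57, 12: 35,
        13: 29, 14: 16, 15: 6, 16: 1, 17: 1, 18: 4}
n = sum(prob.values())

# Share of respondents scoring 7 to 10, and the number scoring 14 or more.
pct_7_to_10 = 100 * sum(c for v, c in prob.items() if v <= 10) / n
n_14_plus = sum(c for v, c in prob.items() if v >= 14)
pct_14_plus = 100 * n_14_plus / n

print(scale_min, scale_max, n)
print(f"{pct_7_to_10:.1f}% score 7 to 10; {n_14_plus} ({pct_14_plus:.1f}%) score 14 or more")
```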

From the first means table, it does appear that those who report drinking greater amounts of alcohol, or at least drinking more frequently, report more alcohol related problems. The index averages 8.8 for those who report drinking alcohol only very rarely. It then increases to 9.2 for those who drink on special occasions, 9.8 for those drinking on weekends, 10.0 for those drinking several times a week, and 11.9 for the small number of respondents who report drinking every day. While these differences are not very large, they are consistent and regular. This leads a researcher to believe that those who drink more frequently are likely to have more alcohol related problems.

For the second means table, there is little or no apparent relation between the index and whether or not parents allowed children to drink alcohol. The means for each of the categories are very similar (9.7, 9.3 and 9.7) with no regular or consistent direction to the differences.
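One way to compare the two means tables is to look at the spread between the lowest and highest group means, and to confirm that the group means, weighted by group size, reproduce the Total row. The following Python sketch does both. The helper names pooled_mean and mean_range are hypothetical, and the figures are copied from the two tables above.

```python
# (label, mean, N) for each category, copied from the two MEANS tables.
alcuse = [("Very rarely", 8.7569, 144), ("Special occasions", 9.2231, 130),
          ("Weekends", 9.8474, 249), ("Several times weekly", 10.0189, 53),
          ("Every day", 11.8750, 8)]
permit = [("Not at all", 9.6864, 118), ("Occasional permission", 9.3494, 352),
          ("No limits", 9.7345, 113)]

def pooled_mean(groups):
    """Overall mean: group means weighted by group size."""
    return sum(m * n for _, m, n in groups) / sum(n for _, _, n in groups)

def mean_range(groups):
    """Gap between the highest and lowest group mean."""
    means = [m for _, m, _ in groups]
    return max(means) - min(means)

for name, groups in (("alcuse", alcuse), ("permit", permit)):
    print(f"{name}: overall mean {pooled_mean(groups):.4f}, "
          f"range of group means {mean_range(groups):.2f}")
```

The gap between group means is about three points for frequency of alcohol use but well under half a point for parental permission, which is the contrast described in the two paragraphs above.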

Question 5


USE ALL.
COMPUTE filter_$=(ch > 0).
VARIABLE LABEL filter_$ 'ch > 0 (FILTER)'.
VALUE LABELS filter_$  0 'Not Selected' 1 'Selected'.
FORMAT filter_$ (f1.0).
FILTER BY filter_$.
EXECUTE .
FREQUENCIES
  VARIABLES=ch
  /STATISTICS=STDDEV MEAN .

Frequencies


Statistics

            N
      Valid   Missing     Mean   Std. Deviation
CH       77         0   1.9610           1.0814

CH

Frequency Percent Valid Percent Cumulative Percent
Valid 1.00 32 41.6 41.6 41.6
2.00 25 32.5 32.5 74.0
3.00 15 19.5 19.5 93.5
4.00 2 2.6 2.6 96.1
5.00 2 2.6 2.6 98.7
6.00 1 1.3 1.3 100.0
Total 77 100.0 100.0
Total 77 100.0



Answer for Question 5

This requires the use of Select Cases to eliminate the respondents with no children. After that, the distribution of those with children can be seen; the frequencies are the same as in question 3. For those who do have children, the mean is almost 2, so those undergraduates who do have children average about two children per household.

Note that the standard deviation is actually larger in this distribution than it was in question 3. For the distribution of those with children, the standard deviation is 1.1 children, as opposed to 0.7 for all respondents. In the case of all respondents, the values of the variable are so heavily concentrated at 0 children that the standard deviation is small. Although the range is reduced when those with no children are eliminated, the remaining cases are more spread out over the values between 1 and 6 than in the prior case. This is an example of where the two measures of variation actually give different results.
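The effect of dropping the respondents with no children can be reproduced from the two frequency tables. As a minimal sketch (Python, with a hypothetical helper grouped_mean_sd and the sample standard deviation with divisor N - 1 assumed), the calculation below shows the range shrinking while the standard deviation grows.

```python
from math import sqrt

def grouped_mean_sd(freqs):
    """Return (N, mean, sample SD) for a {value: count} frequency table."""
    n = sum(freqs.values())
    mean = sum(v * c for v, c in freqs.items()) / n
    ss = sum(c * (v - mean) ** 2 for v, c in freqs.items())
    return n, mean, sqrt(ss / (n - 1))

# CH frequencies from question 3 (all respondents with valid data).
ch_all = {0: 639, 1: 32, 2: 25, 3: 15, 4: 2, 5: 2, 6: 1}
# Question 5: the same table with the zero-children cases filtered out.
ch_parents = {v: c for v, c in ch_all.items() if v > 0}

for name, freqs in (("all respondents", ch_all), ("with children", ch_parents)):
    n, mean, sd = grouped_mean_sd(freqs)
    value_range = max(freqs) - min(freqs)
    print(f"{name}: N={n} mean={mean:.4f} sd={sd:.4f} range={value_range}")
```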

Question 6


FILTER OFF.
USE ALL.
EXECUTE .
USE ALL.
COMPUTE filter_$=(sex = 1).
VARIABLE LABEL filter_$ 'sex = 1 (FILTER)'.
VALUE LABELS filter_$  0 'Not Selected' 1 'Selected'.
FORMAT filter_$ (f1.0).
FILTER BY filter_$.
EXECUTE .
MEANS
  TABLES=sthours hwhours dephours  BY job
  /CELLS MEAN COUNT STDDEV  .


Case Processing Summary

                                                           Cases
                                           Included         Excluded          Total
                                          N   Percent      N   Percent      N   Percent
Study Hours * Hold a job?               249     92.2%     21      7.8%    270    100.0%
DEPENDENT HOURS * Hold a job?           248     91.9%     22      8.1%    270    100.0%
HOUSEHOLD WORK HOURS * Hold a job?      248     91.9%     22      8.1%    270    100.0%

Report
Hold a job?                Study Hours   DEPENDENT HOURS   HOUSEHOLD WORK HOURS
1 No    Mean                     16.30              6.31                   5.89
        N                          112               112                    114
        Std. Deviation           14.97             18.75                  25.80
2 Yes   Mean                     13.77              3.72                   2.29
        N                          137               136                    134
        Std. Deviation           10.66              5.01                  10.53
Total   Mean                     14.91              4.89                   3.95
        N                          249               248                    248
        Std. Deviation           12.81             13.17                  19.17

USE ALL.
COMPUTE filter_$=(sex =2).
VARIABLE LABEL filter_$ 'sex =2 (FILTER)'.
VALUE LABELS filter_$  0 'Not Selected' 1 'Selected'.
FORMAT filter_$ (f1.0).
FILTER BY filter_$.
EXECUTE .
MEANS
  TABLES=sthours hwhours dephours  BY job
  /CELLS MEAN COUNT STDDEV  .

Means


Case Processing Summary

                                                           Cases
                                           Included         Excluded          Total
                                          N   Percent      N   Percent      N   Percent
Study Hours * Hold a job?               458     96.4%     17      3.6%    475    100.0%
DEPENDENT HOURS * Hold a job?           458     96.4%     17      3.6%    475    100.0%
HOUSEHOLD WORK HOURS * Hold a job?      455     95.8%     20      4.2%    475    100.0%

Report
Hold a job?                Study Hours   DEPENDENT HOURS   HOUSEHOLD WORK HOURS
1 No    Mean                     17.82              5.31                   9.53
        N                          205               204                    204
        Std. Deviation           13.08              7.25                  30.98
2 Yes   Mean                     14.94              4.52                   3.62
        N                          253               254                    251
        Std. Deviation           10.63              5.18                  14.43
Total   Mean                     16.23              4.87                   6.27
        N                          458               458                    455
        Std. Deviation           11.86              6.20                  23.51


Answers for Question 6

First note that the means of study hours, hours with dependents, and hours at housework all are lower for those with jobs, both males and females.

Second, note that the pattern does differ. For males, those with jobs spend approximately three hours less weekly at each of these three tasks than do those without jobs. For females, study hours are also about three hours less for those with jobs than for those without jobs. But for hours with dependents there is very little difference in the mean, less than one hour less for those with jobs. Then there is a large reduction, almost six hours, in the housework hours for those with jobs.

In terms of males and females, the latter generally spent more hours at each of the tasks than did the former. The only case where females spent fewer hours is hours with dependents, which is reported as one hour less for females without jobs than for males without jobs.
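The differences described above can be read off the two Report tables by subtraction. The sketch below, in Python with hypothetical variable names, computes the gap in mean weekly hours between those without and those with jobs, separately for males and females, using the means copied from the tables.

```python
tasks = ("study", "dependents", "housework")

# Mean weekly hours copied from the two Report tables, in the order of `tasks`.
means = {
    "male":   {"no job": (16.30, 6.31, 5.89), "job": (13.77, 3.72, 2.29)},
    "female": {"no job": (17.82, 5.31, 9.53), "job": (14.94, 4.52, 3.62)},
}

# Gap = mean for those without jobs minus mean for those with jobs.
gap = {
    sex: {t: round(g["no job"][i] - g["job"][i], 2) for i, t in enumerate(tasks)}
    for sex, g in means.items()
}

for sex, gaps in gap.items():
    print(sex, gaps)
```

The male gaps all fall between about two and a half and three and a half hours, while the female gaps range from under one hour (dependents) to almost six hours (housework).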


Paul Gingrich

November 12, 1998