**Social Studies 201**

**Fall 2003**

**Answers to Computer Problem Set 1**

**Note: Only the answers to the (b) parts are in this file. The full file with the SPSS tables and the (b) part answers is available in the lab in the folder:**


**t:\students\public\201\cp1af03.doc**


This assignment asks you to use data from SSAE98 (*Survey
of Student Attitudes and Experiences 1998* – see the questionnaire from the
Survey and the list of variables handout of September 16, 2003), obtain some statistics from these data and
make comments on the results. When you
have completed 1-6 below, hand in the printout of the SPSS output and the comments. You may wish to copy the SPSS output file to
another program such as Word for Windows, write the comments in that file, and
hand in a paper copy of this.

For each of the following questions, part (a) refers to what
you are to do with SPSS. For part (b)
of each of these questions, write the answers and comments in pen or pencil on
the printout (or in a Word for Windows file).


1. (a) Obtain frequency distributions
for each of the variables COLLEGE, CREDIT, and PRIORITY by using *Analyze – Descriptive Statistics – Frequencies*. (b) In a sentence or two,
describe the frequency distribution for each variable.

COLLEGE. Over two-thirds of respondents (70%) are registered directly through the University. Of the thirty per cent of students in the three federated colleges, Campion has the greatest representation in this sample (16%), with Luther next (10%) and then SIFC (4%).


CREDIT. By far the most common numbers of credit hours for which respondents are registered are 15 and 12 (40 and 28 per cent respectively), with around one-tenth registered for 9 hours. Some students are registered for 3, 6, 17, or 18 hours, although there are relatively few of these. Between these more common values, only a few students are registered for credit hours such as 5, 10, or 11.


PRIORITY. Approximately one-third of respondents favour reducing taxes or debt. Just over one-quarter favour using the deficit for infrastructure or social spending, with most of these favouring increased spending for social programs rather than infrastructure.

2. (a) For
each of the first three variables (free trade, affirmative action, gay couples)
in the table at the bottom of the first page of *Report on the Survey of
Student Attitudes and Experiences*, obtain a frequency distribution and
histogram. (b) Write a note comparing
the distributions of these three variables.
Also show how the numbers in the table in the *Report* were
obtained from the SPSS output.


(b) The distributions for free trade and affirmative action are similar, with the peak of the distribution in the centre (a neutral response) and with more respondents expressing support than expressing opposition. Relatively few respondents express strong agreement or disagreement. In the case of opinion about marriage of gays and lesbians, the shape of the distribution is much flatter, with relatively equal numbers of respondents in each category of opinion. The distribution of responses to this question differs from those for free trade and affirmative action in that many respondents express strong opposition or strong support. That is, respondents generally did not express strong support for, or opposition to, either free trade or affirmative action, but there appear to be much stronger and more diverse opinions on gay and lesbian marriage.

In terms of the table in the *Report*, “% in the Middle” is
the per cent of valid responses with opinion response 3, that is, 44.3%, which
rounds to 44%. For the “% Disagreeing,” I combined the valid per cent for the
two categories on the disagree side, that is, 1 (“Strongly Disagree”) and 2.
Similarly, for the “% Agreeing” I combined the valid per cent
for opinion categories 4 and 5 – those on the agree side of a neutral response
of 3. The percentages in the *Report*
differ slightly from those obtained here because I took the visa students out of the
sample. As a result, the percentage of
respondents who agreed with V1 was 21% (8.1 + 12.6 = 20.7) as opposed to the 20% in the *Report*.
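The combining and rounding described above can be sketched in a few lines of Python. This is only an illustration, not part of the SPSS output: the function name `combined_percent` is made up for this example, and the only figures used are the three valid per cents quoted in the answer (44.3, 8.1 and 12.6).

```python
def combined_percent(*valid_percents):
    """Add the valid per cents for the chosen categories and round
    the total to the nearest whole percentage."""
    return round(sum(valid_percents))

# "% Agreeing" for V1: categories 4 and 5 combined.
agree = combined_percent(8.1, 12.6)

# "% in the Middle": category 3 on its own.
middle = combined_percent(44.3)

print(agree, middle)
```

The same function would apply to the “% Disagreeing” figure by passing the valid per cents for categories 1 and 2 from the SPSS frequency table.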


3. (a) Obtain
stem-and-leaf displays for AGE (as dependent variable) by YEAR (factor list) by
using *Analyze – Descriptive
Statistics – Explore* for these
variables. Also, obtain the histograms
of AGE for each YEAR by clicking on *Plots*, then check *Histogram* and *Continue*,
prior to clicking *OK*. (Ignore or
delete the boxplot on the output. You
could also delete the box giving the descriptive statistics). (b) Using the histograms or the
stem-and-leaf displays, write a paragraph comparing the distributions of age by
year of program. Use the stem-and-leaf
displays to determine the median and the mode of age for each of fourth and
fifth year students.


(b) As might be expected, the distributions show that first
year respondents are concentrated at the youngest ages, second year respondents
are slightly older, followed by slightly older students in 3^{rd}
year. The distributions of age for each
of 4^{th} and 5^{th} year students show that there are fewer
of the youngest students (ages 18-20) and more at ages 23 plus.

For 4^{th} year students, the median student is the
116/2 = 58^{th} student. This
is at an age of 22 (3 at age 20, 22 at age 21, and 34 at age 22, for a cumulative total of
59, so the 58^{th} is at age 22).
The mode is also 22, since 34 respondents are aged 22, more than
at any other single age.
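The cumulative count used to locate the median can be sketched in Python as follows. This is a minimal illustration rather than the SPSS procedure: the helper `value_at_position` is a hypothetical name, and the frequency table holds only the three counts quoted above for 4^{th} year students (the counts for older ages are not given here and are not needed to reach the 58^{th} student).

```python
def value_at_position(freq_table, position):
    """Return the value held by the student at the given 1-based
    position once the data are sorted from lowest to highest."""
    cumulative = 0
    for value, count in sorted(freq_table.items()):
        cumulative += count
        if cumulative >= position:
            return value
    raise ValueError("position lies beyond the counts supplied")

# Counts quoted in the answer (youngest ages only), with n = 116.
fourth_year = {20: 3, 21: 22, 22: 34}
n = 116

median_age = value_at_position(fourth_year, n // 2)      # 58th student
next_age = value_at_position(fourth_year, n // 2 + 1)    # 59th student

print(median_age, next_age)
```

With an even number of cases, the median is often taken as the midpoint of the two middle values. Here the 58^{th} and 59^{th} students are both aged 22, so either convention gives the same median.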

For 5^{th} year students, the median is at age
26. That is, there are 57 respondents
who are 5^{th} year, so the median student is the (57 + 1)/2 = 29^{th}
student. There are 20 at ages 22 and
23, another 7 at ages 24 and 25, for a total of 27 – the 29^{th}
student is one of the students aged 26.
The most common age listed is 22, so this is the modal age.


4. (a) For
variables V8 and V9, obtain frequency distributions, histograms and the following statistics: mean, median, standard
deviation, and quartiles. (b) Calculate
the interquartile range. Using these
data, compare the distributions of each of these variables in words – comment
on whether or not responses to the two questions are consistent with each
other.


(b). The
interquartile range for V8 is the 75^{th} percentile minus the 25^{th}
percentile, or 3 - 1 = 2. For V9, the IQR
is 4 - 3 = 1.
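As a quick check on the arithmetic, the interquartile range calculation can be written as a short Python sketch. The function name `interquartile_range` is chosen for this example only. The quartile values are those read from the SPSS output as quoted above (25^{th} and 75^{th} percentiles of 1 and 3 for V8, and 3 and 4 for V9).

```python
def interquartile_range(q1, q3):
    """IQR = 75th percentile minus 25th percentile."""
    return q3 - q1

# (25th percentile, 75th percentile) for each variable, as quoted above.
quartiles = {"V8": (1, 3), "V9": (3, 4)}

iqr = {name: interquartile_range(q1, q3) for name, (q1, q3) in quartiles.items()}
print(iqr)
```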

The two distributions have quite different means, with V8 having a low mean of only 2, denoting the fact that there is disagreement with establishing user fees for health care. For V9, should there be more spending for health care, the mean of 3.5 is relatively large, indicating that respondents generally favoured more such spending. In terms of variability, V8 is a little more varied than V9, as indicated by both the larger IQR and standard deviation for V8 than V9.

The distributions are concentrated at opposite ends – for V8 most respondents strongly disagree, with 70% on the disagree side and only 14% on the agree side. In contrast, for V9, over 50% of respondents agree and only 15% disagree. From the relative values of the IQR and standard deviation, responses to V8 are a little more dispersed than are responses to V9. But the two sets of responses are consistent in the sense that there are many supporters of more spending for health care (agreement on V9) and much opposition to user fees (disagreement on V8).

5. (a) Obtain
all the statistics for the four opinion variables V2, V4, V6, and V7 by using *Analyze – Descriptive Statistics – Descriptives*. (b)
Write a short note describing the similarities and differences among the
measures of centrality and dispersion for these four variables.

(b). The summary statistics for these four variables are similar in that the mean opinion is always between 3 and 3.6 and the standard deviation between 1.0 and 1.4. But since opinions are measured on only a five-point scale, the results might be expected to be reasonably similar.

In terms of averages, there is more agreement with V6 and V7 than with V2 and V4, and especially with V6. This is indicated by the larger means for these two variables than for V2 and V4. That is, with a mean above the neutral response of 3 for V6 and V7, on average respondents agree that government helps business and they also agree (although less strongly) that as respondents they have the power to affect the future. In contrast, opinion on helping themselves and marriage for gays and lesbians averages to around 3, or a neutral opinion – that is, opinion is split on these issues.

In terms of variability, V4 – marriage for gays and lesbians – is more dispersed than any of the other opinions. That is, respondents have more varied views on this issue than on any of the other three opinion variables. The other three have similar variation, as measured by the standard deviation, from 1.0 to 1.2. Also note that these opinions are measured on a five-point scale, from 1 to 5, with a range of 4. The range divided by four is 1, and the standard deviations are all close to 1 (the rule of thumb for an approximate size of a standard deviation).
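The rule of thumb mentioned above, that a standard deviation is roughly the range divided by four, can be illustrated with a short Python sketch. The function name `rule_of_thumb_sd` is hypothetical, and the result is only a rough guide to the expected size of a standard deviation, not a substitute for the values SPSS reports.

```python
def rule_of_thumb_sd(minimum, maximum):
    """Approximate size of a standard deviation: range divided by four."""
    return (maximum - minimum) / 4

# Five-point opinion scale, from 1 to 5.
approx_sd = rule_of_thumb_sd(1, 5)
print(approx_sd)
```

The standard deviations reported for the four opinion variables, between 1.0 and 1.4, are all of about this size.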

Paul Gingrich

October 9, 2003