Social Studies 201

Fall 2003

Answers to Computer Problem Set 1

October 9, 2003


Note: Only the answers to the (b) parts are in this file.  The full file with the SPSS tables and the (b) part answers is available in the lab in the folder:





This assignment asks you to use data from SSAE98 (Survey of Student Attitudes and Experiences 1998 – see the questionnaire from the Survey and the list of variables handout of September 16, 2003),  obtain some statistics from these data and make comments on the results.  When you have completed 1-6 below, hand in the printout of the SPSS output and the comments.  You may wish to copy the SPSS output file to another program such as Word for Windows, write the comments in that file, and hand in a paper copy of this. 


For each of the following questions, part (a) refers to what you are to do with SPSS.  For part (b) of each of these questions, write the answers and comments in pen or pencil on the printout (or in a Word for Windows file).




1.       (a) Obtain frequency distributions for each of the variables COLLEGE, CREDIT, and PRIORITY by using Analyze – Descriptive Statistics  – Frequencies.  (b)  In a sentence or two, describe the frequency distribution for each variable. 



COLLEGE.  Over two-thirds of respondents (70%) are registered directly through the University.  Of the thirty per cent of students in the three federated colleges, Campion has the greatest representation in this sample (16 per cent), with Luther next (10%) and then SIFC (4%). 


CREDIT.  Fifteen and 12 hours are by far the most common numbers of credit hours for which respondents are registered (40 and 28 per cent respectively), with around one-tenth registered for 9 hours.  There are also a number of students registered for 3, 6, 17, and 18 hours, although there are relatively few of these.  In between these more common values of credit hours registered, there are only a few students registered for credit hours such as 5, 10, 11, etc.


PRIORITY.  Approximately one-third of respondents favour reducing taxes or debt.  Just over one-quarter favour using the deficit for infrastructure or social spending, with most of these favouring increased spending for social programs rather than infrastructure.



2.      (a) For each of the first three variables (free trade, affirmative action, gay couples) in the table at the bottom of the first page of Report on the Survey of Student Attitudes and Experiences, obtain a frequency distribution and histogram.  (b) Write a note comparing the distributions of these three variables.  Also show how the numbers in the table in the Report were obtained from the SPSS output.


(b) The distributions for free trade and affirmative action are similar, with the peak of the distribution in the centre (a neutral response) and with more respondents expressing support than expressing opposition.  There are relatively few respondents expressing strong agreement or disagreement.  In the case of opinion about marriage of gays and lesbians, the shape of the distribution is much flatter, with roughly equal numbers of respondents at each category of opinion.  The distribution of responses to this opinion question differs from that for free trade and affirmative action in that there are many respondents who express strong opposition or strong support.  That is, respondents generally did not express strong support for, or opposition to, either free trade or affirmative action.  But there appear to be much stronger and more diverse opinions on gay and lesbian marriage.


In terms of the table in the report, “% in the Middle” is the per cent of valid responses with opinion response 3, that is, 44.3% rounded off to the nearest percentage is 44%.  For the “% Disagreeing,” I combined the percentage of responses on the disagree side, that is, the “Strongly Disagree” and the 2.   Similarly, for the “% Agreeing” I combined the valid per cent for opinion categories 4 and 5 – those on the agree side of a neutral response of 3.  The percentages in the Report differ slightly from the table because I took the visa students out of the sample.  As a result, the percentage of respondents who agreed with V1 was 21% (8.1+12.6) as opposed to the 20% in the Report.
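The arithmetic behind a combined percentage can be sketched in Python.  This is only an illustration and is not part of the SPSS output; the function name combined_percent is hypothetical, and the inputs are the valid percentages quoted above.

```python
def combined_percent(valid_percents):
    """Add the valid per cents for the categories being combined and
    round the total to the nearest whole per cent."""
    return round(sum(valid_percents))


# The two valid percentages quoted above for V1.
print(combined_percent([8.1, 12.6]))  # 21

# A single category, such as the middle response of 3.
print(combined_percent([44.3]))       # 44
```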



3.      (a) Obtain stem-and-leaf displays for AGE (as dependent variable) by YEAR (factor list) by using Analyze – Descriptive Statistics  – Explore for these variables.  Also, obtain the histograms of AGE for each YEAR by clicking on Plots, then check Histogram and Continue, prior to clicking OK.  (Ignore or delete the boxplot on the output.  You could also delete the box giving the descriptive statistics).  (b) Using the histograms or the stem-and-leaf displays, write a paragraph comparing the distributions of age by year of program.  Use the stem-and-leaf displays to determine the median and the mode of age for each of fourth and fifth year students.


(b) As might be expected, the distributions show that first year respondents are concentrated at the youngest ages, second year respondents are slightly older, followed by slightly older students at 3rd year.  The distributions of age for each of 4th and 5th year students show that there are fewer of the youngest students (ages 18-20) and more at ages 23 plus. 


For 4th year students, the median student is the 116/2 = 58th student.  This is at an age of 22 (3 at age 20, 22 at age 21, and 34 at age 22, for a total of 59, so the 58th is at age 22).  The mode is also 22 since 34 respondents are age 22, more than at any other single age.


For 5th year students, the median is at age 26.  That is, there are 57 respondents who are 5th year, so the median student is the 29th student.  There are 20 at ages 22 and 23, another 7 at ages 24 and 25, for a total of 27 – the 29th student is one of the students aged 26.  The most common age listed is 22, so this is the modal age.
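The counting used to locate the median in a stem-and-leaf display can be sketched in Python.  This is a minimal illustration, assuming the lower middle case is taken when the number of cases is even (as in the 4th year count above); the function names are hypothetical, and only the 4th year counts quoted above are used, since the counts at older ages are not listed here.

```python
def median_position(n):
    """Position of the median case among n ordered cases
    (the lower middle case when n is even)."""
    return (n + 1) // 2


def median_from_counts(counts, n):
    """Walk up (value, count) pairs in ascending order until the
    cumulative count reaches the median position."""
    target = median_position(n)
    cumulative = 0
    for value, count in counts:
        cumulative += count
        if cumulative >= target:
            return value
    raise ValueError("the counts supplied do not reach the median position")


# 4th year students: only the youngest ages quoted above are needed,
# because the cumulative count (59) already passes the 58th case.
fourth_year = [(20, 3), (21, 22), (22, 34)]

print(median_position(116))                  # 58
print(median_from_counts(fourth_year, 116))  # 22
print(median_position(57))                   # 29 (5th year students)
```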



4.      (a) For variables V8 and V9, obtain frequency distributions, histograms and the  following statistics: mean, median, standard deviation, and quartiles.  (b) Calculate the interquartile range.  Using these data, compare the distributions of each of these variables in words – comment on whether or not responses to the two questions are consistent with each other.



(b).  The interquartile range for V8 is the 75th percentile minus the 25th percentile, or 3-1=2.  For V9, the IQR is 4-3=1. 
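As a quick check of the subtraction, the interquartile range can be written as a one-line Python function.  This is only a sketch; the function name is hypothetical, and the quartile values are the ones quoted above for V8 (25th percentile 1, 75th percentile 3) and V9 (25th percentile 3, 75th percentile 4).

```python
def interquartile_range(q1, q3):
    """75th percentile minus 25th percentile."""
    return q3 - q1


print(interquartile_range(1, 3))  # V8: 2
print(interquartile_range(3, 4))  # V9: 1
```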


The two distributions have quite different means, with V8 having a low mean of only 2, denoting the fact that there is disagreement with establishing user fees for health care.  For V9, should there be more spending for health care, the mean of 3.5 is relatively large, indicating that respondents generally favoured more such spending.  In terms of variability, V8 is a little more varied than V9, as indicated by both the larger IQR and standard deviation for V8 than V9.


The distributions are concentrated at opposite ends – for V8 most respondents strongly disagree, with 70% on the disagree side and only 14% agreeing.  In contrast, for V9, over 50% of respondents agree and only 15 per cent disagree.  From the relative values of the IQR and standard deviation, responses to V8 are a little more dispersed than are responses to V9.  But the two responses are consistent in the sense that there are many supporters of more spending for health care (agreement on V9) and much opposition to user fees (disagreement on V8).



5.      (a) Obtain all the statistics for the four opinion variables V2, V4, V6, and V7 by using Analyze – Descriptive Statistics  – Descriptives.   (b)  Write a short note describing the similarities and differences among the measures of centrality and dispersion for these four variables.

(b).  The summary statistics for these four variables are similar in that the mean opinion is always between 3 and 3.6 and the standard deviation between 1.0 and 1.4.  But since opinions are constructed on only a five-point scale, the results might be expected to be reasonably similar. 


In terms of averages, there is more agreement with V6 and V7 than with V2 and V4, with especially strong agreement with V6.  This is indicated by the larger mean for these two variables than for V2 and V4.  That is, with a mean above the neutral response of 3 for V6 and V7, on average respondents agree that government helps business and they also agree (although less strongly) that as respondents they have the power to affect the future.  In contrast, opinion on helping themselves and marriage for gays and lesbians averages to around 3, or a neutral opinion – that is, opinion is split on these issues.


In terms of variability, V4 – marriage for gays and lesbians – is more dispersed than any of the other opinions.  That is, respondents have more varied views on this issue than on any of the other three opinion variables.  The other three have similar variation, as measured by the standard deviation, from 1.0 to 1.2.  Also note that these opinions are measured on a five-point scale, from 1 to 5, with a range of 4.  The range divided by four is 1, and the standard deviations are all close to 1 (the rule of thumb for an approximate size of a standard deviation).
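The rule of thumb mentioned above (range divided by four as a rough size for the standard deviation) can be sketched in Python.  The function name is hypothetical, and the result is only an approximate guide, not a substitute for the standard deviation reported by SPSS.

```python
def rule_of_thumb_sd(lowest, highest):
    """Rough size of a standard deviation: the range divided by four."""
    return (highest - lowest) / 4


# Opinion variables are measured on a five-point scale from 1 to 5.
print(rule_of_thumb_sd(1, 5))  # 1.0
```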




Paul Gingrich

October 9, 2003