Winter 2004
Problem Set 4
Due: Monday, March 15, 2004
If you hand in the answers to this problem set by Friday, March 12, Mark Nelson will attempt to grade it by March 14.
There will be a review session in the Tuesday labs, March 14, prior to the second midterm examination the next day.
1. Standardized normal distribution. For the standardized normal distribution,
a. What is the area between Z of 0 and Z of +1.25?
b. What is the area between Z of +0.5 and Z of +2.5?
c. What is the proportion of cases between Z = -1.8 and Z = +2.5?
d. What percentage of the area under the normal curve lies to the left of Z = -1.33?
e. What is the area under the normal curve above Z = -1.83?
f. In a normally distributed population, what percentage of the population is within one and a half standard deviations of the mean?
g. What is the Z-value so that 0.25 of the area lies to the right of this Z? (That is, what is the Z-value of the 75th percentile or 3rd quartile?)
h. What are the Z-values so that there is 0.035 of the area in each tail of the distribution beyond these Z-values, for a total of 0.070 in the two tails of the distribution?
i. In a standardized normal distribution, where is the fifteenth percentile?
j. The Explore procedure in SPSS displays the “trimmed mean,” defined as the mean when the largest 5% and the smallest 5% of the cases have been eliminated. In the standardized normal distribution, what are the Z-values for the trim points?
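Areas and Z-values of the kind asked for in Question 1 are normally read from the printed table of the standardized normal distribution. As a way of checking table look-ups, the following is a minimal sketch in Python using the standard library's `statistics.NormalDist`. The helper names (`area_between`, `area_left`, `area_right`, `z_for_left_area`) are hypothetical names chosen for this illustration, and the Z-values in the usage lines are illustrative only, not those of the questions above.

```python
from statistics import NormalDist

# Standardized normal distribution: mean 0, standard deviation 1.
Z = NormalDist(mu=0.0, sigma=1.0)


def area_left(z):
    """Area under the curve to the left of z."""
    return Z.cdf(z)


def area_right(z):
    """Area under the curve to the right of z."""
    return 1.0 - Z.cdf(z)


def area_between(z_low, z_high):
    """Area under the curve between two Z-values (z_low <= z_high)."""
    return Z.cdf(z_high) - Z.cdf(z_low)


def z_for_left_area(p):
    """Z-value with proportion p of the area to its left (a percentile)."""
    return Z.inv_cdf(p)


if __name__ == "__main__":
    # Illustrative values only.
    print(round(area_between(0.0, 1.0), 4))   # 0.3413
    print(round(z_for_left_area(0.975), 2))   # 1.96
```

An area in one tail is converted to a Z-value by first expressing it as an area to the left: for a right-tail area of p, call `z_for_left_area(1 - p)`.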
2. Distribution of income. The distribution of household income for Saskatchewan respondents has a mean of fifty thousand dollars and a standard deviation of thirty-five thousand dollars. Using these values, and assuming that household income is normally distributed, obtain the following.
3. Television and internet use. The data in Table 2 come from Saskatchewan respondents aged 15-24 surveyed in Statistics Canada, 2000 General Social Survey, Cycle 14: Access to and Use of Information Communication Technology. Use these data to answer the following for Saskatchewan residents aged 15-24:
4. Problems using data from t:\students\public\201\ssae98.sav
a. Use Analyze-Descriptive Statistics-Frequencies, with options Charts-Histograms-With Normal Curve, to obtain frequency distributions of the three variables: study hours, V3 (affirmative action), and V4 (gays and lesbians married). The frequency distribution table and the histogram, with the normal curve superimposed, should be available on the printout.
i. For study hours, use the frequency distribution table to determine the percentage of cases that are within one standard deviation of the mean; within two standard deviations of the mean. Compare with the percentages of cases within one and two standard deviations of the mean in a normal distribution. Use the figure and diagram on the printout to write a note comparing the actual distribution of study hours with that of a normal distribution.
ii. For V3 and V4, use the statistics generated and the table of the normal distribution to determine the percentage of cases that take on the neutral response of 3 (between 2.5 and 3.5) if these variables were exactly normally distributed. Compare with the percentage of neutral responses in the table of the frequency distribution. Write a short note comparing these two frequency distributions with the normal distribution.
b. Use Analyze-Descriptive Statistics-Explore with Statistics selected and Plots deselected to obtain the following confidence intervals. Assume the data set is a random sample of all undergraduates at the University of Regina.
i. 90%, 95%, and 99% confidence interval estimates for true mean weekly study hours of all undergraduates. Use the formula from class to verify the value of one of the interval estimates and the standard error. Very briefly explain why the intervals differ in width.
ii. Obtain 80% and 99% confidence interval estimates for the true mean debt level (debt1) for students in each year of their program (first through fifth year). From these tables, what might you say about the mean debt level for all University of Regina students at each of the four undergraduate years? Why is the 80% interval for first year students so much narrower than the 99% interval for fourth year students?
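The verification asked for in Question 4b can be sketched in Python. This is only a sketch under stated assumptions: the formula from class is not reproduced in this handout, so the code assumes the usual large-sample interval, sample mean plus or minus Z times the standard error, where the standard error is s divided by the square root of n. The function name `mean_confidence_interval` is hypothetical, and the mean, standard deviation, and sample size in the usage lines are made-up values, not results from ssae98.sav. SPSS may base its intervals on the t distribution rather than Z, in which case its limits will differ slightly from those computed here.

```python
from math import sqrt
from statistics import NormalDist


def mean_confidence_interval(mean, sd, n, level):
    """Large-sample interval estimate for a population mean.

    Assumes the interval is mean +/- Z * (sd / sqrt(n)); this is an
    assumption about the class formula, not taken from the handout.
    Returns (lower limit, upper limit, standard error).
    """
    se = sd / sqrt(n)                          # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + level / 2)  # Z leaving (1 - level)/2 in each tail
    return mean - z * se, mean + z * se, se


if __name__ == "__main__":
    # Made-up sample statistics, for illustration only.
    for level in (0.90, 0.95, 0.99):
        low, high, se = mean_confidence_interval(20.0, 10.0, 100, level)
        print(level, round(low, 2), round(high, 2), round(se, 2))
```

Running the loop with the three confidence levels shows how the Z-value, and therefore the interval width, changes with the level while the standard error stays the same.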
Table 1. Frequency and percentage distribution of Saskatchewan household income
Income in thousands of dollars | Number of respondents | Percentage of respondents
0 | 15 | 1.7
Less than 5 | 10 | 1.2
5-10 | 36 | 4.2
10-15 | 67 | 7.7
15-20 | 56 | 6.5
20-30 | 113 | 13.1
30-40 | 112 | 12.9
40-50 | 101 | 11.7
50-60 | 103 | 11.9
60-80 | 104 | 12.0
80-100 | 69 | 8.0
100 plus | 79 | 9.1
Total | 865 | 100.0
Table 2. Statistics of weekly hours of television and internet use, Saskatchewan respondents aged 15-24 who used each service
Variable | Mean hours per week | Standard deviation of hours per week | Sample size
Watch television | 13.46 | 10.18 | 180
Use internet at home | 9.03 | 9.61 | 63
Use internet at work | 7.60 | 10.80 | 15
Use internet at school | 3.98 | 3.89 | 45
Source for tables: Statistics Canada, 2000 General Social Survey, Cycle 14: Access to and Use of Information Communication Technology