**Social Studies 201**

**Winter 2004**

**Monday, January 5,
2004**

**Contact information and office hours.** Monday 1:00 – 2:00 p.m., Wednesday 11:30 a.m. – 12:30 p.m., or by
appointment. My other class is on
Tuesday and Thursday from 11:30 to 12:45, so I am not available at those
times. I also have many committee
meetings, so I may be out of my office much of the time. If you wish to arrange a time to meet me,
please contact me before or after class, leave a message on the voice mail on
my telephone (585-4196), or send me an email message
(paul.gingrich@uregina.ca). I’ll try to
respond promptly to email, so if you have any brief questions that can be dealt
with by email, I’ll attempt to answer those quickly.

**Text.** There is only one textbook, with two
parts to it. Total cost is around
$50. I wrote this textbook about ten
years ago and some of the examples are now out of date, but it covers the
materials for this introductory statistics class. Please point out any errors or misleading explanations in the text. I have provided lots of examples in the text
and there are many examples on the web site.
Note that most of the examples in the text are worked out. You may want to try to work out some of the
examples yourself, before looking at the answers, for learning the materials
and preparing for examinations.

The textbook is too long, but it combines the explanations with the problems, so that it is like a text and a manual together. As we cover each part of the course, I will mention which section or pages to read.

If you want an approach different from mine, it may be useful
to look at some other statistics texts.
The text I used for several years was Ott, Larson and Mendenhall, *Statistics: A Tool for the Social Sciences*
– this provides a good introduction to statistics. Almost any introductory statistics text can be useful in
presenting a little different explanation.
While it can be helpful to use more than one text, the mathematical
notation may differ, and this can cause confusion.

**Grading – problem
sets and examinations.** 40% of the
total grade comes from the problem sets, 15% from each of the two midterms, and
30% from the final examination.
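
As a rough illustration of how these weights combine, here is a short Python sketch. Python is used only for illustration (the class itself uses SPSS), the weights are the ones stated above, and the component scores are hypothetical.

```python
# Weights are taken from the grading scheme above; the scores below are hypothetical.
WEIGHTS = {
    "problem_sets": 0.40,
    "midterm_1": 0.15,
    "midterm_2": 0.15,
    "final_exam": 0.30,
}

def final_grade(scores):
    """Return the weighted final grade (in percent) from component scores in percent."""
    return sum(WEIGHTS[part] * scores[part] for part in WEIGHTS)

# Hypothetical student: strong problem sets, weaker second midterm.
example_scores = {"problem_sets": 80, "midterm_1": 70, "midterm_2": 60, "final_exam": 75}
print(round(final_grade(example_scores), 1))  # 74.0
```

Because the problem sets carry the largest single weight, steady work on them moves the final grade more than any one examination does.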

There will be 5 or 6 problem sets: two before the first midterm, two between the two midterms, and one or two between the second midterm and the final examination. I will usually give you a week to complete these and, once they are handed in, I will attempt to provide model answers, or schedule sessions in the lab time to discuss the problems. I will hand out the first problem set on Friday, January 9 and it will be due on January 19. The second will be from January 19 to 30, and we will attempt to have it marked and returned to you before the first midterm. 40% of the final grade will be based on the problem sets, including the computer problems.

In addition to the regular problem sets, there will be several computer problem sets. Later in the semester, the computer problem sets will be merged with the regular problem sets.

The purpose of the problem sets is to learn how to do statistics, practice for exams, and to obtain points. With all the problem sets, this is a class where you have to keep up with the assignments. If you put in the time, you should be able to obtain a reasonably good grade. Do not expect to do well if you just study before each exam and ignore the work in between.

Examinations are all open book, but can be difficult – they ask you to develop answers to problems similar to those on the problem sets. The final examination will be held at the regularly scheduled time. I will schedule review sessions before each examination (February 3 and March 16) – some of these may be during the regularly scheduled lab times.

The mean grade last semester was 73% and the grade distribution for last semester was as follows:

| Grade | Frequency |
|-------|-----------|
| 90+   | 3         |
| 80-89 | 6         |
| 70-79 | 17        |
| 60-69 | 4         |
| 50-59 | 5         |
| Total | 35        |
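
A mean can be approximated from a grouped table like this one by weighting the midpoint of each interval by its frequency. The sketch below does this in Python. The frequencies are from the table, but the midpoints are my assumption (in particular, treating "90+" as running from 90 to 100), so the result will be close to, but not exactly, the 73% calculated from the individual grades.

```python
# Frequencies come from the grade table; the interval midpoints are assumptions.
grade_table = [
    (95.0, 3),   # 90+   (assumed to run from 90 to 100)
    (84.5, 6),   # 80-89
    (74.5, 17),  # 70-79
    (64.5, 4),   # 60-69
    (54.5, 5),   # 50-59
]

total = sum(freq for _, freq in grade_table)
grouped_mean = sum(mid * freq for mid, freq in grade_table) / total

print(total)                   # 35
print(round(grouped_mean, 1))  # 74.0
```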

**Calculator and math
background.** In order to do
statistics, it is necessary to use some mathematics, and the statistical
problems build on the ordinary arithmetic operations (addition, subtraction,
multiplication, and division) and also use some algebra that you should be
familiar with from secondary school.
But I will explain all the formulae and do example problems on the
blackboard. The aim of this class is
not to become proficient in manipulating algebraic materials, but to use the
formulae to do statistical problems. At
the same time, there is a lot of mathematics used in the course, and you have
to develop a certain ability in this area.
If this has been difficult for you in the past, attempt to tackle the
problems or additional examples from the web site. There will also be a student assistant available, so consult the
assistant or me if you are having difficulty.
We can also use parts of the Tuesday labs to go over problems and work
on extra problems.

**Labs and computer.** Much of the work in a statistics class can
be done with a calculator, or even with a paper and pencil. Later in the course, the formulae become
more complex, and require more calculation.
In addition, some data sets have many cases and many variables, so it
can be very time consuming to do everything with a calculator. For these reasons, we will do some of the
statistical work on the computer. This
semester we will be using SPSS (Statistical Package for the Social Sciences), a
program that many statisticians and survey researchers use to analyze survey
data. Along with the SPSS program, we will
use a data set produced from a survey of University of Regina undergraduates in
1998. I will provide more details
when we begin the computer labs.

There should be time in the computer lab times to do all the computer problems. However, if you need extra time, the computer lab (CL109) is generally open, except when other classes are scheduled there. I will prepare handouts for SPSS and this should be sufficient to do the work in this class.

The computer labs will not begin until Tuesday, January 13 or perhaps not until a week later, on January 20.

**Accommodation and
university policies**. Take note of
the possibilities for accommodation for those with special needs. If you have any special needs, please
contact me as soon as possible. All students
should be familiar with the relevant University policies, so take note of these
on the attached sheet or in the *University Calendar 2003-2004*.

**Web site.** Last semester, I constructed a web site
for Social Studies 201 and I have left all the materials from last semester on
the web site. For this semester, I will
be adding new materials and revising last semester’s material as we proceed
through the semester. The address of
the web site is http://uregina.ca/~gingrich/. Note that there is **no www.**
before uregina.ca. Some of the material
on the web site is in Acrobat Reader format, that is, with file type pdf. If you are unable to view files of this type
on your computer, you can use the computers in CL109 at times when there is no
class in that room.

The web site has various sections. In the section “Winter 2004 Semester,” I will put the notes, examples, and problems for this semester. I will attempt to update these at least weekly, and hopefully more frequently. In the Fall 2003 section, there are examples and examinations from previous semesters. I have posted all the problem sets and model answers from the Winter 2001 semester. I have also included all the examinations from Fall 2001 and Winter 2001, but without answers – I will be using some of the questions from these examinations for your problem sets this semester. Another part of the web site is the textbook. I have posted most of the text, with the exception of the diagrams, on the web site. These files are in Acrobat/pdf format.

If anyone is not able to use the web site, please notify me. I can provide printed copies of all materials on the web site to the University Library, and make them available there at the Reserve Desk.

**Class outline**

**A. Descriptive
statistics**. The first section of
the class, chapters 1-5 of the text, deals with descriptive statistics. This is
the most commonly used type of statistics, and the type that you will most
commonly encounter. Almost all the
statistics published in newspapers and reports are descriptive in nature, describing various phenomena. Descriptive statistics includes tables,
graphs, charts, maps, diagrams, etc. – hopefully helping us understand what is
being described.

Chapter 1 of the text is an introduction to statistics and the text. Read it quickly to get an overview.

In order to present data, it is necessary to consider how data are organized. This is the aim of Chapters 2 and 3. Chapter 2 examines issues related to the production of data, including discussion of assumptions involved in their production. In this chapter, I attempt to identify some of the main issues that must be addressed by anyone who works with quantitative data or produces such data. Do not spend a long time on this chapter, but attempt to be generally familiar with the issues raised in sections 2.3 through 2.6. Section 2.7 uses the example of Statistics Canada’s Labour Force Survey as an example of a relatively successful and thorough approach to data production, albeit one that has some problems and shortcomings, as do any data. This example is now outdated, but I will provide some more recent summary data on labour statistics.

Chapter 3 discusses how social constructs can be measured. Phenomena such as length or weight are measured in well understood and well defined units such as metres or kilograms, respectively. Social constructs such as attitudes, opinions, intelligence, ability, alienation, social solidarity, class consciousness, and ethnic or national identity are not so clear cut and are much more difficult to define and measure. After some of the different approaches to measurement have been examined, Chapter 4 deals with organizing these data for purposes of description. A few of the ways of organizing data into charts, tables and graphs are discussed in Chapter 4.

Another way to present data is to calculate summary measures that succinctly describe the phenomena being examined. It may be tedious to look at all the data, or we may be overwhelmed by all the data concerning a population or a social issue. In Chapter 5, summary measures of centrality and variation are discussed. Examples of measures of centrality include average income, life expectancy, and the median score on standardized tests; examples related to variation include variation in incomes and income inequality, and standard scores on tests. Measures of central tendency and variation are the most widely used summary statistics in both popular and academic applications.
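
To give a feel for what these summary measures do, here is a minimal Python sketch using the standard `statistics` module. The income figures are hypothetical and are not taken from the text or from any survey; they are chosen so that one large value pulls the mean well above the median.

```python
import statistics

# Hypothetical annual incomes in thousands of dollars (not data from the course).
incomes = [22, 25, 31, 38, 40, 47, 120]

mean_income = statistics.mean(incomes)      # pulled upward by the one large value
median_income = statistics.median(incomes)  # middle value of the sorted list
sd_income = statistics.stdev(incomes)       # sample standard deviation (n - 1 divisor)

print(round(mean_income, 1), median_income, round(sd_income, 1))
```

Comparing the mean with the median is a quick way to see whether a few extreme cases are affecting the measure of centrality.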

If we keep to the time schedule of the Class Syllabus, the first midterm will be based on this section of the course and we should complete Part I of the text just before the first midterm.

**B. Inferential
statistics**. Part II of the text
deals with inferential statistics and this occupies the remaining time and work
in Social Studies 201.

**1. Probability**. As an
introduction to the concepts required to study inferential statistics, Chapter
6 deals with probability and the normal distribution. Probability is used for two main reasons in the social
sciences. One reason is that data may
be obtained from some type of random or probability sample of population
members, using surveys or experiments.
The randomness of sample selection means that probability can be used to
obtain inferences about the social science issue or concern under
investigation. A typical example of how
probability is used in assessing the reliability of sample results is the
following from the polling agency Ipsos-Reid (available from Ipsos-Reid web
site: http://www.ipsos-reid.com/media/dsp_displaypr_cdn.cfm?id_to_view=1058).

*With a
national sample of 1,000 and 1,500 (for each component), one can say with 95%
certainty that the overall results are within a maximum of ± 3.1 percentage
points of what they would have been had the entire population of Canada’s
regular online users been surveyed. The margin of error will be larger for
sub-groupings of the survey population.*
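
The ± 3.1 percentage points in this statement can be reproduced, approximately, with the usual formula for the margin of error of a proportion. The sketch below assumes a simple random sample, the most conservative proportion (p = 0.5), and a z value of 1.96 for 95% confidence. These assumptions are mine, since the quotation does not state how the figure was calculated.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a proportion from a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

# n = 1,000 comes from the quotation; p = 0.5 and z = 1.96 are assumptions.
print(round(100 * margin_of_error(1000), 1))  # 3.1 percentage points
print(round(100 * margin_of_error(1500), 1))  # smaller, because the sample is larger
```

The same formula shows why the margin of error is larger for sub-groupings: a sub-group has a smaller n, so the square root term is larger.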

Second, the social sciences use various models to describe or explain what occurs in the real world. Some of these models are probabilistic in nature, and require some understanding of the principles of probability. One example is insurance rates – high risk occupations pay higher life insurance rates, older people may have lower home insurance rates.

One of these models is the normal probability distribution – the so called bell curve. Some researchers consider this distribution to describe characteristics of actual populations. In particular, some instructors think that class grades should follow a normal or bell curve. This is an application of a mathematical model. The normal curve has many other applications in statistics, and learning to use this curve is essential to understanding statistics. Whether this curve does describe the distribution of characteristics or behaviour of an actual population is another question – it may be that human populations are not so well-described by the normal distribution as some researchers and analysts claim.
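
As a small illustration of the normal curve as a mathematical model, the following Python sketch treats class grades as if they were normally distributed. The mean of 73 echoes last semester's mean grade, but the standard deviation of 10 is purely an assumption, and whether actual grades follow this model is exactly the question raised above.

```python
from statistics import NormalDist

# Hypothetical model: grades normally distributed with mean 73 and standard deviation 10.
# The standard deviation is an assumption made for illustration only.
grades = NormalDist(mu=73, sigma=10)

within_one_sd = grades.cdf(83) - grades.cdf(63)  # share within one SD of the mean
above_90 = 1 - grades.cdf(90)                    # share the model places above 90

print(round(within_one_sd, 3))  # about 0.683
print(round(above_90, 3))       # about 0.045
```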

Our concern with probability is not with probability as a study in and of itself. Rather, we will be mainly concerned with the principles of probability, and their applications in inferential statistics.

**2. Sampling
distributions**. Chapter 7 is a transitional
chapter, dealing with what are called sampling distributions. These are mathematical distributions that
are useful when conducting samples of a population or experiments in a
population. They describe how a
particular measure behaves under repeated sampling. For example, opinion polls such as the Gallup poll provide
estimates of the proportion of people with a particular characteristic. But another researcher selecting a different
set of individuals in a sample would obtain a somewhat different estimate. The potential variation in these proportions
can be described mathematically through the sampling distribution of the
proportion. We will examine this in
Chapter 7.
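
The idea of repeated sampling can be sketched with a short simulation. In the Python example below, the population proportion (0.40), the sample size (500), and the number of repeated polls (2,000) are all hypothetical values chosen for illustration. Each simulated poll gives a somewhat different estimate, and the spread of those estimates is compared with the theoretical standard error of a proportion, the square root of p(1 − p)/n.

```python
import random
import statistics

random.seed(201)  # fixed seed so the sketch is repeatable

TRUE_P = 0.40     # hypothetical population proportion
N = 500           # hypothetical sample size
POLLS = 2000      # number of repeated samples

def sample_proportion(n, p):
    """Proportion with the characteristic in one simple random sample of size n."""
    return sum(random.random() < p for _ in range(n)) / n

estimates = [sample_proportion(N, TRUE_P) for _ in range(POLLS)]

# Theoretical standard error of a sample proportion: sqrt(p(1 - p) / n).
theoretical_se = (TRUE_P * (1 - TRUE_P) / N) ** 0.5

print(round(statistics.mean(estimates), 3))   # centre of the simulated estimates
print(round(statistics.stdev(estimates), 4))  # spread of the simulated estimates
print(round(theoretical_se, 4))               # spread predicted by the formula
```

The estimates should cluster around the population proportion, with a spread close to the theoretical value. That is what the sampling distribution of the proportion describes.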

The third section of the class deals with techniques used in inferential statistics. These techniques are discussed in detail in Chapters 7-10. These chapters include what statisticians call hypothesis tests and estimation procedures. The aim of these is to infer, from survey or experimental results, conclusions about a whole population. Such conclusions always have probabilities attached to them.

**3. Estimation**. Estimation is the method used in surveys
and polls about opinions of members of a population, where a survey of a small
group of people can be used to make inferences about the nature of opinion,
attitudes, or other characteristics of a large population. The probability of obtaining a result that
is in error by no more than some specified amount can be calculated if the data
come from a sample which has been randomly selected from the whole population.

**4. Hypothesis testing**
begins by making an hypothesis, and then using a sample or experiment to test
it. The hypothesis may be that people
higher on the income scale are more likely to vote in a more conservative fashion,
while those with lower income may vote NDP, with Liberals in the middle. These hypotheses may be based on previous
research findings and on our observations of what interests each party is
regarded as addressing. Hypotheses may
be more complex, involving extensive theoretical and quantitative
research. In each case, principles of
probability are used to determine how likely the sample results would be if the hypothesis were true,
and so whether the hypothesis should be rejected.

There are many different types of hypothesis testing, but the principles involved in each type are much the same. By the end of the semester you should have a good grasp of these methods.
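
One of the simplest types is a test of a single proportion. The Python sketch below is a minimal example using the normal approximation. The poll figures and the hypothesized 40% level of support are hypothetical, and the function name is my own rather than anything from the text or from SPSS.

```python
import math
from statistics import NormalDist

def one_proportion_z_test(successes, n, p0):
    """Two-sided z test of the hypothesis that the population proportion equals p0."""
    p_hat = successes / n
    se = math.sqrt(p0 * (1 - p0) / n)  # standard error if the hypothesis is true
    z = (p_hat - p0) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical poll: 230 of 500 respondents support a party; the hypothesis is 40% support.
z, p_value = one_proportion_z_test(230, 500, 0.40)
print(round(z, 2), round(p_value, 4))
```

A small p-value means that a sample result this far from 40% would be unusual if the hypothesis were true, which is taken as evidence against the hypothesis.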

**5. Regression and
Correlation** have been left out of this semester's outline. The chapter on regression is also left out
of Part II of the textbook – to make the book a bit shorter and less
expensive. If we do have time at the
end of the semester, we will briefly examine the relationship between two
variables by studying the methods of correlation and regression.

**Aim of the Class**

**1. Learning basic
statistical methods**. After you have
completed this class, you should be familiar with basic statistical concepts
and measures. Do not expect to be an
expert in statistical methods after one semester, just as you would not expect
to be an expert on other areas after only one class in the area. But you should be able to understand what
statistics is, what can be done with it, and tackle some straightforward types
of statistical problems.

**2. Healthy skepticism**. I encourage students to have a healthy
skepticism about statistical data and interpretations. This involves developing an appreciation of
the usefulness of statistical data and methods at the same time as adopting a
critical approach to the data and methods.
Here I outline a few aspects of this and I will attempt to provide more
examples as we proceed through the semester.

**a. Misleading
statistics**. Some statistics that
are stated in verbal arguments or published are incorrect or misleading. Examples include partial results presented
by publicists, politicians, or advertisers – a decline in the unemployment rate
may be touted as evidence of solid economic gains when in fact the job market
is not so positive. Some election polls
are inaccurate or unable to predict election results – at the same time other
polls are fairly accurate representations of public opinion. One result of encountering misleading
statistics is to take the view that statistical data and methods are useless or
invalid. Some claim that any result can
be proved by statistics and one well-known book is *How to Lie with
Statistics*. Some researchers also
reject statistics on theoretical or practical grounds, considering that only
qualitative information is valid and that quantitative data have so many
weaknesses that the data can be ignored.

**b. Hard data**. At
the other end of the spectrum are those who are unwilling to believe anything
that does not have what they consider to be hard, statistical data to back it
up – numbers and quantitative data. Such
researchers may consider qualitative data as being soft, incomplete, or only
suggestive. Those adopting this
approach may consider statistics as the best method of proof, and perhaps the
only method of proof. For example, a
claim that student debt has increased dramatically, along with statements of
personal experiences of debt problems may be ignored by politicians or
policy-makers unless there is quantitative data to support such claims. Folk or traditional medical care methods
are treated with skepticism by much of the medical profession because carefully
controlled experiments, using well-established statistical methods, have not
been conducted.

**c. Useful if
carefully applied**. My approach is
in between these two extremes. I view
statistics as a legitimate and powerful approach to the study of social
phenomena and issues. It is often a
useful approach to describing people, their characteristics, views, and behaviour, and it can help in building
models and theories that can be used to understand the social world. Sometimes basic data is essential to
understanding a social issue. Examples
include the study of poverty and inequality, equity issues in the labour force
and politics, and trends in crime rates, all of which are closely associated with statistical data and
approaches. In each of these cases,
well-constructed statistical data has both identified problems and suggested
possible solutions. At the same time,
each of these subject areas has been associated with some poor or misleading
data. Differences of views concerning
the causes, severity, and solution to social issues or problems have been
associated with debates over definitions and interpretations of data and
statistical methods and models.

**d. One of many social
science methods**. Part of the problem with statistics is
that it is often misused, so that it can provide misleading results or be used
to bolster incorrect arguments and draw misleading conclusions. It is always necessary to remember that
statistics is only one tool available to social science practitioners, and one
that is best suited to certain types of data and certain types of
problems. Other tools may be more
useful in other circumstances. Methods
such as in-depth interviews, stories, oral histories, participant observation,
historical records, novels or movies, etc. may provide a better understanding
and interpretation of social issues.
Such methods often complement statistical methods. Theoretical approaches and the use of human
reason are also necessary, although these must be informed by some types of
data (quantitative or qualitative).

**3. Reading journals
and articles**. Academic articles,
books, and journals tend to be more quantitative today than in earlier
periods. After taking a statistics
class, you should be able to understand some of the statistical approaches used
in academic articles, even if you do not know all details of the methods. Some are difficult to understand, though, e.g.
regression, factor analysis, analysis of variance, or structural equation models. In order to understand some of these latter
methods, you will need to do further study of statistics.

**4. Further study of
statistics**. Social Studies 201
should provide all the basic statistical concepts and approaches so that you
can continue the study of statistics later.
Each discipline emphasizes a somewhat different set of techniques, and
this is what upper level courses in statistics cover. For example, psychologists tend to emphasize the analysis of
variance (ANOVA), t-tests and factor analysis.
Economists tend to rely heavily on regression.

If you find you like statistics and enjoy working with data, I would encourage you to take an upper level class in your discipline. Many of you will need more statistics for your honours or graduate work, and others may find it useful in employment after completing a degree.

For Sociology majors, the next class is Social Studies 306, where we again use SPSS, producing and analyzing survey data. There is a more advanced statistics course also, Social Studies 405/805, which the Department attempts to offer every other year.

Last edited on January 10, 2004