**Sociology 405/805**

**Review and Introduction**

**1. Production of Data**

This class uses existing data sources. Be familiar with the variables and how they are defined, and with the methods of data collection. We will have to work with these data as they come to us, but when doing so we should be aware of the definitions and methods used and of the procedures used to obtain the data.

· Data are produced, not just collected.

· Social research issues, theory, approach.

· Definition of variables – theoretical and operational.

· Questions used and potential answers.

· Organization of responses when data are presented.

· Sampling procedures.

¨ Population or sample.

¨ Non-probability or probability.

¨ Random, stratified, cluster, multistage.

¨ Interview, questionnaire, administrative data.

· Errors in data – non-sampling and sampling.

· Integrate data production with statistical procedures to be used, if possible.

· Use statistical analysis of previous data sets to improve data production for subsequent projects.

· Replicate existing studies or use questions from existing studies for comparative purposes.

**2. Types of Measurement**

**a. Discrete or Continuous**

· **Discrete** – number of possible values can be counted.

· **Continuous** – cannot count all possible values; possible values can be matched with some portion of a line segment.

**b. Level of Measurement**

· **Nominal** – Can classify values into categories; name or number them. Sex, ethnicity.

· **Ordinal** – Values can be ordered or ranked as less than, greater than, or equal to. Order of finish, Likert-scale attitudes.

· **Interval** – Differences or intervals are meaningful; equal numerical differences represent equal magnitudes; well-defined unit of measure. Height, weight, time, age, income.

· **Ratio** – Ratios of values meaningful; non-arbitrary 0 point. Height, weight, income, age.

· Levels of measurement are hierarchical. That is, all scales are nominal. Ordinal scales are also nominal. Interval scales are both nominal and ordinal. Ratio scales are nominal, ordinal, and interval.

· Most interval scales are also ratio. Temperature may be only interval and not ratio.

· These levels of measurement determine the type of summary statistic that can be calculated and the type of statistical analysis that can be used.

· Where possible, construct a variable so that it is measured at the highest possible level.

· Attitudes, opinions and many psychological variables are measured only at the ordinal level, but statistical analysis appropriate for interval or ratio level scales is commonly used.

· Some statistical methods appropriate only for interval or ratio level scales can use nominal or ordinal level scales through appropriate reconstruction of variables – e.g. dummy variables.
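
The dummy-variable reconstruction mentioned in the last point can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `region` variable, its categories, and the helper name `dummy_code` are all hypothetical, and statistical programs such as SPSS provide their own recoding tools for the same purpose.

```python
def dummy_code(values, reference):
    """Turn a nominal variable into 0/1 indicator variables.

    One indicator is created for every category except the reference
    category, which is represented by all indicators being 0.
    """
    categories = sorted(set(values) - {reference})
    return {cat: [1 if v == cat else 0 for v in values] for cat in categories}


# Hypothetical nominal variable: region of residence for six respondents.
region = ["west", "east", "north", "east", "west", "north"]

# "east" is chosen (arbitrarily) as the reference category.
dummies = dummy_code(region, reference="east")

for name, indicator in dummies.items():
    print(name, indicator)
```

A variable with k categories yields k - 1 dummy variables; the omitted reference category is the one against which the others are compared.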

**3. Descriptive Statistics**

**a. Distributions**. Frequency, percentage, proportional distributions. Organization of data into categories. Need nominal scale only.

**b. Positional Measures**. Percentiles, deciles, quintiles, quartiles, median. Need ordinal scale.

**c. Central Tendency**. Mode, median, mean. Mode for nominal scale, median for ordinal, mean for interval and ratio.

**d. Variation**. Variation, variance, standard deviation. Assume interval or ratio level scales. Use variation for ANOVA. For ordinal scales, interquartile range or other measures based on positional measures can be used.

**e. Measures of Association**. Lambda, phi, V, Q, tau. Covariation and correlation coefficients.

**f. Regression**. Regression coefficients and regression equation.

**g. Standardization**. Z-values and beta coefficients.
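
As a small illustration of standardization, the sketch below converts a set of scores to Z-values, that is, distances from the mean measured in standard deviation units. The scores are hypothetical, and the population standard deviation (`pstdev`) is used as a simplifying assumption; for sample data, `statistics.stdev` would be the usual choice.

```python
from statistics import mean, pstdev


def z_values(data):
    """Return (x - mean) / standard deviation for every value in data."""
    m = mean(data)
    s = pstdev(data)  # population SD; an assumption made for this sketch
    return [(x - m) / s for x in data]


# Hypothetical scores with mean 5 and population standard deviation 2.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
z = z_values(scores)

for x, zx in zip(scores, z):
    print(x, round(zx, 2))
```

A score of 9 is two standard deviations above the mean, so its Z-value is 2; standardized values always have mean 0 and standard deviation 1, which makes variables measured in different units comparable.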

· Each statistic provides a particular summary view of the distribution of the data.

· Use statistics appropriate to level of measurement and anticipated use of the data.

· Interpretation of statistics.

· Where data come from samples, or where inferences are made concerning the influence of particular variables, the variability of the estimate must be considered. Use interval estimates and hypothesis tests.
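
The link between level of measurement and choice of summary statistic can be sketched with Python's standard `statistics` module. All three variables and their values below are hypothetical, chosen only to give one example at each level.

```python
from statistics import mean, median, mode, stdev, variance

# Hypothetical data for three variables at different levels of measurement.
religion = ["none", "catholic", "protestant", "catholic", "other", "catholic"]  # nominal
agreement = [1, 2, 2, 3, 4, 4, 5]  # ordinal: 1 = strongly disagree ... 5 = strongly agree
age = [21, 25, 30, 34, 40]         # ratio

# Nominal: only the mode is meaningful.
modal_religion = mode(religion)    # "catholic"

# Ordinal: the median (a positional measure) can also be used.
median_agreement = median(agreement)  # 3

# Interval or ratio: mean, variance and standard deviation are meaningful.
mean_age = mean(age)               # 30
var_age = variance(age)            # sample variance: 222 / 4 = 55.5
sd_age = stdev(age)                # square root of the sample variance

print(modal_religion, median_agreement, mean_age, var_age, round(sd_age, 2))
```

Here `variance` and `stdev` divide by n - 1, the usual convention when the data are a sample rather than a whole population.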

**4. Probability and Sampling Distributions**

· Classical, frequency, subjective interpretations of probability.

· Independence and dependence of events.

· Random variables and expected values.

· Probability distributions – Normal, t, F, and chi-square distributions.

· Sampling

¨ Representative

¨ Random

¨ Sampling Distributions

¨ Stratified, cluster, multistage samples

¨ Sampling error

· Models

¨ Probabilistic, not deterministic

¨ Multivariate

¨ Description and explanation
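
The idea of a sampling distribution can be illustrated by simulation. The sketch below repeatedly draws random samples from a hypothetical normally distributed population and looks at how the sample means vary. The population mean and standard deviation, the sample size, the number of repetitions, and the seed are all assumptions chosen for illustration.

```python
import random
from statistics import mean, stdev

rng = random.Random(405)  # fixed seed so the sketch is repeatable

POP_MEAN, POP_SD = 50.0, 10.0  # hypothetical population parameters
N, REPS = 25, 2000             # sample size and number of samples drawn

# Draw REPS random samples of size N and record each sample mean.
sample_means = [
    mean([rng.gauss(POP_MEAN, POP_SD) for _ in range(N)])
    for _ in range(REPS)
]

centre = mean(sample_means)          # should be close to POP_MEAN
spread = stdev(sample_means)         # empirical standard error of the mean
theoretical_se = POP_SD / N ** 0.5   # sigma / sqrt(n) = 10 / 5 = 2.0

print(round(centre, 2), round(spread, 2), theoretical_se)
```

The sample means cluster around the population mean, and their standard deviation is close to the population standard deviation divided by the square root of the sample size. That variability of the estimate from sample to sample is the sampling error.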

**5. Inferential Statistics**

**a. Sampling Error**

**b. Interval Estimates**

- Confidence level
- Confidence interval
- Meaning of interval estimates
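
A minimal sketch of an interval estimate for a mean follows. It assumes a large random sample, so that the normal distribution can be used for the critical value; with small samples the t distribution would normally be used instead. The function name `interval_estimate` and the summary figures are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def interval_estimate(sample_mean, sample_sd, n, confidence=0.95):
    """Large-sample confidence interval for a population mean."""
    standard_error = sample_sd / sqrt(n)
    # Z-value that leaves (1 - confidence) / 2 in each tail of the normal curve.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin


# Hypothetical survey result: mean of 40 hours, SD of 12 hours, n = 144.
low, high = interval_estimate(40.0, 12.0, 144)
print(round(low, 2), round(high, 2))
```

With these figures the standard error is 12 / 12 = 1, so the 95% interval is roughly 40 plus or minus 1.96. The interpretation is that, over repeated random samples, about 95% of intervals constructed this way would contain the population mean.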

**c. Hypothesis Tests**

- Null and alternative (research) hypotheses
- Level of significance
- One- or two-tailed tests
- Test statistic
- Interpreting test results
- Types of error
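
The steps of a hypothesis test listed above can be sketched for the simplest case, a large-sample Z-test of a single mean. The function name `z_test` and all figures are hypothetical. The one-tailed option assumes that the observed difference lies in the direction stated by the research hypothesis.

```python
from math import sqrt
from statistics import NormalDist


def z_test(sample_mean, null_mean, sample_sd, n, alpha=0.05, two_tailed=True):
    """Large-sample Z-test of H0: population mean equals null_mean."""
    # Test statistic: distance of the sample mean from the hypothesized
    # mean, measured in standard errors.
    z = (sample_mean - null_mean) / (sample_sd / sqrt(n))
    # Probability, under H0, of a result at least this far out in one tail.
    tail = 1 - NormalDist().cdf(abs(z))
    p_value = 2 * tail if two_tailed else tail
    reject_null = p_value < alpha
    return z, p_value, reject_null


# Hypothetical example: H0 says the mean is 40; the sample of 144 cases
# has a mean of 42 and a standard deviation of 12.
z, p_value, reject_null = z_test(42.0, 40.0, 12.0, 144)
print(round(z, 2), round(p_value, 4), reject_null)
```

Here Z = 2.0 and the two-tailed probability is about 0.046, so the null hypothesis is rejected at the 0.05 level of significance. Rejecting a null hypothesis that is actually true is a Type I error, and its probability is the level of significance; failing to reject a false null hypothesis is a Type II error.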

**6. Other Issues**

- Independent and dependent variables
- Interaction
- Assumptions
- Interpretation
- Unobserved variables
- Errors in variables

**7. Statistical programs – SPSS and Minitab**

- Constructing a data set – data entry, labels
- Organizing data – selection of cases, transformation of variables
- Sampling and weighting
- Statistical procedures
- Different types of data sets – survey, weighted, time series
- SPSS syntax files

Last edited January 10, 2004