MEASUREMENT AND SAMPLING
Characteristics of Measurement
- Validity
- a valid measurement is a quantity or dimension that corresponds to the measured variable
- there are standard measurements (procedures and expressions) for common variables, but
where variables must be operationally defined, or surrogate variables are used because
they are more easily measured, the validity of the measurements has to be ensured
- Accuracy
- closeness of measurements to an expected or true value
- accuracy is inversely related to error (i.e. high accuracy corresponds to
low error)
- types of error:
- gross: blunders caused by carelessness or instrument failure
- systematic: consistent overestimation or underestimation of the target value; usually
caused by poor calibration of an instrument or a poor measurement procedure; often small
enough to go undetected, but results in a poor inference of the target value
- random: human error randomly (normally) distributed with respect to the mean observation
- Precision
- the closeness of repeated measurements to one another
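The distinction between accuracy and precision can be illustrated with a minimal sketch. It assumes the true value is known (as with a calibration standard), treats the offset of the mean from that value as the systematic error, and uses the sample standard deviation as the measure of precision. The function name and the readings are hypothetical.

```python
import statistics

def describe_measurements(measurements, true_value):
    """Summarise repeated measurements of a single quantity."""
    mean = statistics.mean(measurements)
    # Offset of the mean from the true value: an estimate of systematic error.
    bias = mean - true_value
    # Scatter of the repeated measurements about their own mean: precision.
    spread = statistics.stdev(measurements)
    return {"mean": mean, "bias": bias, "spread": spread}

# Hypothetical repeated readings of a reference quantity whose true value is 100.0
precise_but_biased = [102.1, 102.0, 101.9, 102.0]
accurate_but_imprecise = [97.0, 103.0, 99.0, 101.0]

a = describe_measurements(precise_but_biased, 100.0)
b = describe_measurements(accurate_but_imprecise, 100.0)

print("precise but biased:    ", a)
print("accurate but imprecise:", b)
```

The first set is tightly grouped but consistently high (precise, not accurate); the second is centred on the true value but widely scattered (accurate on average, not precise).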
Sampling
- Sample
- a subset of all the measurements that could be derived from a very large or infinite
population, where the population is defined by one or more common characteristics (e.g.
a certain class of people, slopes in limestone)
- sample and population refer to the items or to the corresponding sets of measurements
- the purpose of sampling is to
- gain an impression of an area or collection of things
- estimate a population parameter
- test hypotheses: unproven theories or suppositions that are the basis for further
investigation
- advantages of sampling
- the only means of obtaining data about an infinite population (e.g. air
temperatures)
- a cost- and time-effective means of obtaining data about a large finite population; yields
better data than hastily collected data for the entire population
- desirable when measurement is destructive or stressful (e.g. plant sampling,
some measurements on people)
Sampling error
- the difference between a sample estimate and the corresponding population parameter
- not usually quantifiable, since sampling is normally done because the population
parameter cannot be known; however, it can be predicted from statistical theory
- depends on measurement error and the representativeness of a sample, which in turn
depends on
- sample size
- there is a diminishing decrease in sampling error with increasing sample size, that is,
the largest decrease in error occurs with an increase in the size of a small sample
- a minimum sample size is three, since in a sample of two, a bad measurement cannot be
distinguished from the good one
- 30 observations or fewer is generally considered a small sample
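The diminishing return from larger samples can be sketched with the standard error of the mean, sigma / sqrt(n). This is one common result from statistical theory; it assumes independent random sampling and ignores measurement error. The value of sigma and the helper names are hypothetical.

```python
import math

def standard_error(sigma, n):
    """Standard error of the sample mean for a sample of size n."""
    return sigma / math.sqrt(n)

def error_reduction(sigma, n_from, n_to):
    """Decrease in standard error when the sample grows from n_from to n_to."""
    return standard_error(sigma, n_from) - standard_error(sigma, n_to)

sigma = 10.0  # hypothetical population standard deviation

for n in (3, 10, 30, 100, 300):
    print(f"n = {n:3d}  standard error = {standard_error(sigma, n):.2f}")

# A tenfold increase helps a small sample far more than a large one.
small_gain = error_reduction(sigma, 3, 30)
large_gain = error_reduction(sigma, 30, 300)
print(f"gain from 3 to 30:   {small_gain:.2f}")
print(f"gain from 30 to 300: {large_gain:.2f}")
```

Because the error falls with the square root of n, quadrupling the sample size only halves the standard error.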
- the sampling frame
- the means by which the sampled population is identified from the target population
- may be spatial (e.g. a quadrat or transect) or non-spatial (e.g. a
telephone book, voter list, or a street corner for the sampling of people)
- is poor if it causes bias towards the sampling of certain individuals (e.g.
overrepresentation of housewives and the unemployed by sampling shoppers on a weekday;
overrepresentation of weak soil or rock exposed in stream cuts)
- the sampling procedure
- methods of inferential statistics assume random sampling, that is, that there is an
equal probability of choosing every item in the sampled population and every possible
sample; these conditions are satisfied only by independent random sampling (i.e.
with replacement of measured items), although sampling without replacement is not much
different if the target population is very large, since discarding an item does not
significantly increase the probability of selecting the remaining items
- however, the distribution of measurements is often arbitrary (e.g. climate
stations), because the collection of a random sample would be much more difficult
- with random sampling, the locations (coordinates) of items in a list or in an area are
identified from a string of randomly generated numbers; these locations may be clustered
with respect to one another
- with systematic sampling, the sampled items are selected at regular intervals and thus
have a uniform distribution; however, the sample can be regarded as random only if the
population has a random distribution; if the sampling interval corresponds to some
periodic feature in the sampled population, then the sample will be biased
- stratified sampling, dividing the target population into sub-populations (strata)
according to one or more criteria, is done to
- enable comparisons between two or more strata
- obtain a sample that is more evenly distributed among strata (e.g. Canada is
often stratified into five regions, because an unstratified sample of socioeconomic
phenomena would be biased towards central Canada)
- obtain sub-samples that are more representative of individual strata (i.e. have
less variability and sampling error) than an unstratified sample would be of the total
area
- exert experimental control over independent variables; for example, if an area is
stratified by slope gradient and elevation, then any topographic variation in vegetation
cover can be attributed to variations in aspect
- sampling can be stratified random or stratified systematic
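The sampling procedures described above can be sketched with Python's standard random module. This is a minimal illustration rather than a definitive implementation: the function names, the population of item IDs, and the two strata are all hypothetical, and the population is assumed to be a simple list.

```python
import random

def simple_random_sample(items, k, rng, replace=False):
    """Choose k items at random, with or without replacement."""
    if replace:
        return [rng.choice(items) for _ in range(k)]
    return rng.sample(items, k)

def systematic_sample(items, interval, rng):
    """Choose every interval-th item, starting from a random offset."""
    start = rng.randrange(interval)
    return items[start::interval]

def stratified_random_sample(items, stratum_of, k_per_stratum, rng):
    """Group items into strata, then draw a random sample from each stratum."""
    strata = {}
    for item in items:
        strata.setdefault(stratum_of(item), []).append(item)
    return {name: rng.sample(members, min(k_per_stratum, len(members)))
            for name, members in strata.items()}

rng = random.Random(42)          # fixed seed so the sketch is repeatable
population = list(range(100))    # hypothetical item IDs 0..99

srs = simple_random_sample(population, 10, rng)
sys_sample = systematic_sample(population, 10, rng)
strat = stratified_random_sample(
    population, lambda i: "low" if i < 50 else "high", 5, rng)

# Periodic feature: every 10th item has the value 1, all others 0.
values = [1 if i % 10 == 0 else 0 for i in population]
true_mean = sum(values) / len(values)
sample_mean = sum(values[i] for i in sys_sample) / len(sys_sample)

print("simple random:", sorted(srs))
print("systematic:   ", sys_sample)
print("stratified:   ", strat)
print("true mean:", true_mean, " systematic-sample mean:", sample_mean)
```

Because the sampling interval (10) matches the period of the feature, the systematic sample sees either only the peaks or none of them, so its mean cannot equal the true mean; the stratified sample guarantees five items from each half of the population regardless of the draw.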