# Sampling in Quantitative Research I

Nov 8, 2013

Lecture 11 of 47C5 Social Research Process I:
Sampling in Quantitative Research I

Paul Lambert, 14.10.03, 4-5pm

47C5: Survey research lectures

Lecture 8: The Survey Method

Intro. to & qualities of survey method

Lecture 9: Using Secondary Datasets

Data access and issues

Lectures 11/12: Sampling

Sample design, data collection / analysis

Resources for lectures 8, 9, 11, 12

Lecture slides on WebCT site

Initial list in 47C5 unit outlines

Some on further list on WebCT site

Web Resources for lectures 8, 9, 11, 12

http://staff.stir.ac.uk/paul.lambert/teaching.htm

Some other internet resources (cf De Vaus 2002)

http://trochim.human.cornell.edu/kb/

http://statcomp.ats.ucla.edu/survey

L11/12: Surveys and Sampling

Lecture 11:

1) Role of sampling in social surveys

2) Types of sampling methods

Lecture 12:

3) Good practice in survey conduct

4) Robust analysis of survey data

Part 1: Role of sampling in survey research

Surveys can be censuses

More often samples from wider population

Several sampling methods select cases

Aim: representative of wider population

Inference

Key idea is inference

= confidence in our ability to generalise

Sampling inference = application of statistical theories in order to estimate probabilities that a sample result is ‘likely to have been unrepresentative’

The ‘normal’ (Gaussian) curve

Theories of sampling methods

Sampling and probability theories tell us that any particular random sample is most likely to have the same properties as the wider population. We can then estimate the probability that sample results of a particular nature could have arisen by chance, rather than because they are the same as the population result.
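
The idea that random samples tend to resemble the population can be illustrated with a small simulation. This is only a sketch: the population (10,000 heights with mean about 170 and standard deviation about 10), the sample size of 100 and the 2 cm tolerance are all made-up assumptions, not figures from the lecture.

```python
import random
import statistics

# Hypothetical population: 10,000 heights in cm (assumed values, for illustration).
rng = random.Random(42)
population = [rng.gauss(170, 10) for _ in range(10_000)]
pop_mean = statistics.mean(population)

def sample_means(population, n, draws, rng):
    """Means of `draws` simple random samples of size n (without replacement)."""
    return [statistics.mean(rng.sample(population, n)) for _ in range(draws)]

means = sample_means(population, n=100, draws=500, rng=rng)

# Share of samples whose mean lands within 2 cm of the population mean.
share_close = sum(abs(m - pop_mean) <= 2 for m in means) / len(means)

print(f"population mean: {pop_mean:.2f}")
print(f"share of sample means within 2 cm: {share_close:.2f}")
```

Most sample means fall close to the population mean, and a small share do not; that small share is what inferential statistics tries to put a probability on.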

If the cases in sample surveys were selected at random, then we can use sampling theories and thus ‘inference’

‘Inferential data analysis’

Variable-by-case matrix data analysis for generalising findings to population

Often distinguished from ‘descriptive’ data analysis (results of sample only)

Key: joint influence of

1) size of sample

2) strength of data pattern
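
The joint influence of sample size and strength of pattern can be sketched with a z statistic for a difference between two group means. The function names and the numbers (a 2-point difference, standard deviation 10) are illustrative assumptions, and the normal approximation is a simplification.

```python
import math

def z_for_difference(diff, sd, n_per_group):
    """z statistic for a difference in means between two equal-sized groups."""
    standard_error = sd * math.sqrt(2 / n_per_group)
    return diff / standard_error

def two_sided_p(z):
    """Two-sided p-value under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))

# Same strength of pattern (difference of 2, sd 10), different sample sizes.
for n in (50, 800):
    z = z_for_difference(diff=2, sd=10, n_per_group=n)
    print(f"n per group = {n}: z = {z:.2f}, p = {two_sided_p(z):.4f}")
```

The same difference gives z of about 1 with 50 cases per group and about 4 with 800 per group, so a weak pattern can look ‘significant’ in a large sample and a strong one can fail to in a small sample.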

Statistical inference

..causes confusion; one of hardest parts of survey data analysis to understand..

Phrases: ‘significance level’, ‘p-value’, ‘confidence interval’, ‘hypothesis testing’, ..

Meaning: Whether results would probably generalise to a larger population (if sample is treated as random)

See: Refs for L11 part 1 (supplementary list)
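
As one concrete instance of these phrases, a 95% confidence interval for a sample proportion can be sketched as below. It uses the normal approximation and assumes a simple random sample; the function name and the figures (50 ‘yes’ answers out of 100) are hypothetical.

```python
import math

def proportion_ci(successes, n, z=1.96):
    """Approximate confidence interval for a proportion (normal approximation)."""
    p = successes / n
    standard_error = math.sqrt(p * (1 - p) / n)
    return p - z * standard_error, p + z * standard_error

low, high = proportion_ci(successes=50, n=100)
print(f"sample proportion 0.50, 95% CI roughly {low:.3f} to {high:.3f}")
```

The interval only has its usual interpretation if the sample can be treated as random, which is the qualification in the slide above.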

Critiques of survey generalisation

1) Part of the ‘fall of survey methods’ 1960s:

Sampling is not representative and too strongly stated

See for example Cicourel 1964

Critiques of survey generalisation

2) Deconstructing inference (1980s)

Inferential methods over-relied upon - for ‘significant’ patterns

Inference needed less than often suggested

inference results, eg (non-)parametric variables; data clustering; …

See for example Rose and Sullivan 1996 p192-5

Contemporary survey research

Tends to use 2 strategies to address critiques:

Large scale, often secondary, rigorous methods

or

Small scale, primary, claims carefully qualified

Terms in sample survey analysis

Population: all cases of interest

Sampling frame: list of all potential cases

Sample: cases selected for analysis

Sampling method: technique for selecting cases from sampling frame

Sampling fraction: proportion of cases from population selected for sample (n/N)

Survey analysis: ‘variable-by-case matrix’

Cases   Variables
1       1    17   1.73   A    .    .    .    .
2       1    18   1.85   B    .    .    .    .
3       2    17   1.60   C    .    .    .    .
4       2    18   1.69   A    .    .    .    .
.       .    .    .      .    .    .    .    .
.       .    .    .      .    .    .    .    .
N

Sample Surveys (case selection)

Populn.   Cases   Variables
1         -       -    -    -      -
2         -       -    -    -      -
3         -       -    -    -      -
4         1       1    17   1.73   A
5         2       1    18   1.85   B
6         -       -    -    -      -
7         3       2    17   1.60   C
8         4       2    18   1.69   A

N=8

n=4
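
The case selection above, and the sampling fraction n/N it implies, can be written out directly. The values are copied from the slide; the variable names `v1` to `v4` are placeholders, since the slide does not name the variables.

```python
# Population members 1..8; on the slide a dash means "not selected".
population_ids = list(range(1, 9))

# Selected cases keyed by population ID (v1..v4 are placeholder variable names).
sample = {
    4: {"case": 1, "v1": 1, "v2": 17, "v3": 1.73, "v4": "A"},
    5: {"case": 2, "v1": 1, "v2": 18, "v3": 1.85, "v4": "B"},
    7: {"case": 3, "v1": 2, "v2": 17, "v3": 1.60, "v4": "C"},
    8: {"case": 4, "v1": 2, "v2": 18, "v3": 1.69, "v4": "A"},
}

N = len(population_ids)      # population size
n = len(sample)              # sample size
sampling_fraction = n / N    # proportion of the population selected

print(f"N={N}, n={n}, sampling fraction={sampling_fraction}")
```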

Part 2: Sampling methods and techniques

= Ways of selecting cases from population

2.1 Random (probabilistic): Generalisable, inferential statistics, fewer applications

2.2 Non-random (opportunistic; purposive): Harder to generalise, inference contested, more widely used

2.1a Simple Random Sample

A statistical method used to choose cases randomly (eg random numbers)

Every case in population has exactly the same chance of being in sample

Most data analysis techniques initially designed for simple random samples
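
A simple random sample can be drawn with a pseudo-random number generator. A minimal sketch, assuming the sampling frame is just a list of case identifiers; the function name and the frame of 1,000 IDs are hypothetical.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n cases without replacement; every case has the same chance."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

# Hypothetical sampling frame: case identifiers 1..1000.
frame = list(range(1, 1001))
srs = simple_random_sample(frame, n=50, seed=1)
print(sorted(srs)[:10])
```

Fixing the seed makes the draw reproducible, which helps when the sampling process has to be documented.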

2.1b Systematic Random Sample

Like the SRS, select cases from anywhere in the whole population

An easier selection method: choose every (n)th person for the sample

Danger of ‘periodicity’: if original population order has any structure, bias
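
Systematic selection and the periodicity problem can both be shown in a few lines. This is a sketch with a random start and a fixed interval `k`; the alternating M/F frame is a contrived assumption chosen to make the bias obvious.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Take every k-th case after a random start, where k = len(frame) // n."""
    k = len(frame) // n  # sampling interval
    if k < 1:
        raise ValueError("sample size exceeds the size of the frame")
    start = random.Random(seed).randrange(k)
    return frame[start::k][:n]

# Contrived frame whose order has structure: M, F, M, F, ...
periodic_frame = ["M", "F"] * 50
periodic_sample = systematic_sample(periodic_frame, n=10, seed=1)
print(periodic_sample)  # every selected case has the same sex
```

Because the interval (10) is a multiple of the period in the list (2), the sample contains only one sex, whichever the random start happens to hit.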

Problems with sample methods selecting from whole population

The ‘random’ part means it is always possible to get a population coverage quite different from known structures

If total population is large or dispersed, then coverage of random parts of it is expensive and time consuming: few surveys use random sampling from whole of UK

2.1c Stratified random samples

Modifies random sample to ensure even (or ‘boost’) coverage of population groups

split sampling frame by stratification factors

select random samples within each factor

final sample has correct proportions of each

Example: select 490 M and 510 F

Properties: proportionate sample, correct representations; but more expensive & complex, should use ‘weights’ for analysis
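
The three steps above (split the frame, sample within each stratum, keep the proportions) can be sketched as a proportionate stratified sample. The frame of 4,900 men and 5,100 women is an assumption chosen so that a sample of 1,000 reproduces the 490 M / 510 F example; the function name is hypothetical.

```python
import random
from collections import Counter, defaultdict

def stratified_sample(frame, stratum_of, n, seed=None):
    """Proportionate stratified sample: random selection within each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for case in frame:
        strata[stratum_of(case)].append(case)
    sample = []
    for cases in strata.values():
        # Allocate in proportion to stratum size (rounding may shift the total slightly).
        k = round(n * len(cases) / len(frame))
        sample.extend(rng.sample(cases, k))
    return sample

# Assumed frame: 4,900 men and 5,100 women, each case a (sex, id) pair.
frame = [("M", i) for i in range(4900)] + [("F", i) for i in range(5100)]
strat = stratified_sample(frame, stratum_of=lambda case: case[0], n=1000, seed=1)
counts = Counter(sex for sex, _ in strat)
print(counts)
```

A ‘boosted’ design would replace the proportional allocation with larger fixed numbers for small groups, which is where analysis weights become necessary.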

2.1d Multistage cluster samples

i) Select clusters of population at random

ii) Sample randomly within clusters

Eg: clusters = local authorities in UK

With qualifications, may still be treated as ‘random’ for analysis purposes

Big reduction in costs if face-to-face contacts

Most widely favoured sampling method
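
Stages i) and ii) can be sketched as a two-stage selection. The cluster names and sizes are invented, and choosing clusters with equal probability is a simplifying assumption; real designs often select clusters with probability proportional to their size.

```python
import random

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage i: pick clusters at random. Stage ii: sample cases within each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: rng.sample(clusters[name], n_per_cluster) for name in chosen}

# Hypothetical clusters: four areas with 200 pupils each.
clusters = {
    f"area_{i}": [f"area_{i}_pupil_{j}" for j in range(200)]
    for i in range(1, 5)
}
cluster_sample = two_stage_cluster_sample(clusters, n_clusters=2, n_per_cluster=30, seed=1)
for name, pupils in cluster_sample.items():
    print(name, len(pupils))
```

Interviewers then only need to travel to the selected clusters, which is the source of the cost reduction for face-to-face work.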

Example: Multistage cluster sample

Interest: attitudes of Scottish school pupils

Resources: 400 interviews with pupils

Edinburgh 100

Argyll 24

Islands 20

Highlands 40

Glasgow 124

Aberdeen 40

Shetlands 2

Borders 10

Perth 20

Moray 20

Glasgow 150

Edinburgh 150

Stirling 60

Moray 40

Stirling 60

30 young people at Balfron School and 30 young people at Stirling High

2.1e Longitudinal random samples

Longitudinal = interest in study over time

‘Panel’ and ‘cohort’ samples recontact an initially random sample

Problems of attrition

Retrospective sample

Rely on recall evidence of random selection

Problems of selective recall

Issues in random sampling

Only as good as underlying sampling frame (a good one may not be available, or not be as good as we think)

Data analysis methods need adapting for stratified / clustered designs

Other survey factors interact with sample selection issues, eg poor interviewers may discourage certain cases from response
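
One common adaptation for a non-proportionate (‘boosted’) stratified design is to weight each case by its group's population share divided by its sample share. The sketch below uses invented figures: group B is 10% of the population but half of the sample. Function names are hypothetical, and it covers only the point estimate, not the adjusted standard errors.

```python
def design_weights(population_shares, sample_counts):
    """Weight per group = population share / sample share."""
    n = sum(sample_counts.values())
    return {g: population_shares[g] / (sample_counts[g] / n) for g in sample_counts}

def weighted_mean(values, groups, weights):
    """Mean of values with each case weighted by its group's weight."""
    total = sum(weights[g] * v for v, g in zip(values, groups))
    return total / sum(weights[g] for g in groups)

# Assumed design: B is 10% of the population but boosted to 50% of the sample.
population_shares = {"A": 0.9, "B": 0.1}
sample_counts = {"A": 50, "B": 50}
weights = design_weights(population_shares, sample_counts)

values = [10] * 50 + [20] * 50   # everyone in A scores 10, everyone in B scores 20
groups = ["A"] * 50 + ["B"] * 50
print(sum(values) / len(values))              # unweighted mean
print(weighted_mean(values, groups, weights)) # weighted mean
```

The unweighted mean overstates the influence of the boosted group; the weighted mean recovers the value implied by the population shares (0.9 × 10 + 0.1 × 20).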

2.2) Opportunistic sampling

More often in social research, sample
design is ‘opportunistic’ (‘purposive’)

Random sampling is expensive

Random sampling is complex

Some purported random samples are actually
purposive (understanding of ‘random’)

2.2a Quota sampling

Fill up quotas of groups of interest

Quotas can ensure:

overall representation (cf systematic)

broad topic coverage (eg types of voter)

Example: market researchers in street; telephone call centres vetting contacts

Biases: issues in how a quota ‘fills up’
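
How a quota ‘fills up’ can be sketched as taking people in the order they are encountered until each group's quota is met. The stream of passers-by and the quotas of two per sex are invented for illustration.

```python
def quota_sample(stream, group_of, quotas):
    """Accept cases in arrival order until every group's quota is full."""
    remaining = dict(quotas)
    sample = []
    for person in stream:
        group = group_of(person)
        if remaining.get(group, 0) > 0:
            sample.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):
            break  # all quotas filled, stop recruiting
    return sample

# Hypothetical passers-by as (id, sex) pairs, in the order they walk past.
passers_by = [(1, "F"), (2, "F"), (3, "M"), (4, "F"), (5, "M"), (6, "M"), (7, "F")]
quota = quota_sample(passers_by, group_of=lambda p: p[1], quotas={"M": 2, "F": 2})
print(quota)
```

Nothing here is random: who ends up in the sample depends entirely on who happens to come past first, which is where the biases enter.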

2.2b Snowball sampling

Also ‘focussed enumeration’

Technique for contacting cases from populations rare / difficult to access

Ask first obtained contact for suggested further contacts

snowball gathers size…

Eg smaller ethnic minority groups

Problem: social networks are non-random!
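
The referral process can be sketched as a breadth-first walk through a contact network, starting from the first contacts obtained. The network below is invented; the function name is hypothetical.

```python
from collections import deque

def snowball_sample(contacts, seeds, max_size):
    """Follow referrals outward from the seed contacts, up to max_size cases."""
    sampled = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(sampled) < max_size:
        person = queue.popleft()
        sampled.append(person)
        for referral in contacts.get(person, []):
            if referral not in seen:
                seen.add(referral)
                queue.append(referral)
    return sampled

# Hypothetical referral network: each person names further contacts.
contacts = {"a": ["b", "c"], "b": ["d"], "c": [], "d": [], "e": ["f"], "f": []}
snowball = snowball_sample(contacts, seeds=["a"], max_size=10)
print(snowball)
```

Persons "e" and "f" are in the population but are never reached, because nobody in the seed's network knows them; this is the non-randomness of social networks in miniature.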

2.2c Convenience sampling

Samples whatever cases from population were easiest to reach, eg personal contacts

Often no other sampling strategy involved

Biases likely in convenience process

Examples: …most social research survey examples are ‘convenience’..!

Random vs Opportunistic

Random difficult and expensive: mainly government funded resources

Most people who have conducted a survey have conducted an opportunistic one

Much data analysis / inference assumes random sample, but not applied to

But opportunistic data is often robust…

More on sampling methods

Refs for sampling methods / properties: eg Gilbert 2001 chpt5; De Vaus 2001 chpt6; Bryman 2001 chpt4.

Research reports: most important is documentation of sampling process / issues

To consider unintentional mistakes

Summary on sampling methods

Good sampling not a panacea

Other elements of surveying equally crucial

Many statistical methods assume random sample

For good sampling, use secondary data..

All samples have some value, but non-random ones need careful context