Sampling in Quantitative


1

Lecture 11 of 47C5 Social Research Process I:

Sampling in Quantitative Research I

Paul Lambert, 14.10.03, 4-5pm

2

47C5: Survey research lectures



Lecture 8: The Survey Method

Intro. to & qualities of survey method

Lecture 9: Using Secondary Datasets

Data access and issues

Lectures 11/12: Sampling

Sample design, data collection / analysis

3

Resources for lectures 8,9,11,12


Lecture slides on WebCT site

2 Reading lists:

Initial list in 47C5 unit outlines

Some additions on further list on WebCT site


4

Web Resources for lectures 8,9,11,12



Slides and additional reading list also at:

http://staff.stir.ac.uk/paul.lambert/teaching.htm



Some other internet resources (cf De Vaus 2002)

http://trochim.human.cornell.edu/kb/

http://statcomp.ats.ucla.edu/survey


5

L11/12: Surveys and Sampling




Lecture 11:

1) Role of sampling in social surveys

2) Types of sampling methods

Lecture 12:

3) Good practice in survey conduct

4) Robust analysis of survey data


6

Part 1: Role of sampling in survey research

Surveys can be censuses

More often samples from wider population

Several sampling methods select cases

Aim: representative of wider population

7

Inference


Key idea is inference

= confidence in our ability to generalise

Sampling inference = application of statistical theories in order to estimate probabilities that a sample result is ‘likely to have been unrepresentative’


8

The ‘normal’ (Gaussian) curve

9

Theories of sampling methods

Sampling and probability theories tell us that any particular random sample is most likely to have the same properties as the wider population. We can then estimate the probability that sample results of a particular nature could have arisen by chance, rather than because they are the same as the population result.
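A small simulation can illustrate this idea. The sketch below is only an illustration: the population of ages is invented, and the helper name `sample_means` is hypothetical. It draws many simple random samples and compares how their means scatter around the population mean.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 ages between 16 and 90 (invented data).
population = [rng.randint(16, 90) for _ in range(10_000)]
population_mean = statistics.mean(population)


def sample_means(sample_size, repeats=1_000):
    """Draw `repeats` simple random samples and return the mean of each."""
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(repeats)
    ]


small = sample_means(25)
large = sample_means(400)

# Both sets of sample means centre on the population mean,
# but the means of the larger samples scatter much less around it.
print("population mean:", round(population_mean, 2))
print("n=25  mean of means / spread:",
      round(statistics.mean(small), 2), round(statistics.stdev(small), 2))
print("n=400 mean of means / spread:",
      round(statistics.mean(large), 2), round(statistics.stdev(large), 2))
```

The spread of the sample means is what lets us attach a probability to a result having arisen by chance.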

10



If the cases in sample surveys were selected at random, then we can use sampling theories and thus ‘inference’




11

‘Inferential data analysis’


Variable-by-case matrix data analysis for generalising findings to population

Often distinguished from ‘descriptive’ data analysis (results of sample only)

Key: joint influence of

1) size of sample

2) strength of data pattern

in increasing confidence about generalisations
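A rough sketch of how the two factors combine, assuming two simple random samples of equal size and the textbook pooled z statistic for a difference between two proportions. The function name and the figures are illustrative assumptions, not taken from the lecture.

```python
import math


def z_for_difference(p1, p2, n_per_group):
    """z statistic for the gap between two sample proportions.

    Assumes two simple random samples of equal size (an assumption
    of this sketch) and uses the pooled standard error.
    """
    pooled = (p1 + p2) / 2
    standard_error = math.sqrt(pooled * (1 - pooled) * (2 / n_per_group))
    return (p1 - p2) / standard_error


# Weak pattern (55% vs 50%) in a small sample.
weak_small = z_for_difference(0.55, 0.50, n_per_group=100)
# The same weak pattern in a much larger sample.
weak_large = z_for_difference(0.55, 0.50, n_per_group=2000)
# A strong pattern (70% vs 50%) in the small sample.
strong_small = z_for_difference(0.70, 0.50, n_per_group=100)

print(round(weak_small, 2), round(weak_large, 2), round(strong_small, 2))
```

With the conventional 1.96 cut-off, only the weak pattern in the small sample falls short: either a bigger sample or a stronger pattern raises confidence in the generalisation.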

12

Statistical inference

..causes confusion; one of the hardest parts of survey data analysis to understand..

Phrases: ‘significance level’, ‘p-value’, ‘confidence interval’, ‘hypothesis testing’, ..

Meaning: whether results would probably generalise to a larger population (if sample is treated as random)

See: Refs for L11 part 1 (supplementary list)

13

Critiques of survey generalisation

1) Part of the ‘fall of survey methods’, 1960s:

Sampling is not representative

Sampling is systematically biased

Inferential conclusions too carelessly made and too strongly stated

See for example Cicourel 1964

14

Critiques of survey generalisation

2) Deconstructing inference (1980s)

Inferential methods over-relied upon

Survey analysis becomes theory-free hunt for ‘significant’ patterns

Inference needed less than often suggested

Bad variable analysis (operationalisation) affects inference results, eg (non-)parametric variables; data clustering; …

See for example Rose and Sullivan 1996 p192-5

15

Contemporary survey research

Tends to use 2 strategies to address critiques:



Large scale, often secondary, rigorous methods



or


Small scale, primary, claims carefully qualified






16

Terms in sample survey analysis


Population: all cases of interest

Sampling frame: list of all potential cases

Sample: cases selected for analysis

Sampling method: technique for selecting cases from sampling frame

Sampling fraction: proportion of cases from population selected for sample (n/N)

17

Survey analysis: ‘variable-by-case matrix’

Cases    Variables
1        1    17    1.73    A    .    .    .    .
2        1    18    1.85    B    .    .    .    .
3        2    17    1.60    C    .    .    .    .
4        2    18    1.69    A    .    .    .    .
.        .    .     .       .    .    .    .    .
.        .    .     .       .    .    .    .    .
N

18

Sample Surveys (case selection)

Populn.   Cases   Variables
1         -       -    -     -       -
2         -       -    -     -       -
3         -       -    -     -       -
4         1       1    17    1.73    A
5         2       1    18    1.85    B
6         -       -    -     -       -
7         3       2    17    1.60    C
8         4       2    18    1.69    A

N=8   n=4
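As a worked instance of the sampling fraction n/N, using the figures on this slide (population N = 8, sample n = 4). The symbol f is an assumed label for the fraction, not one the slides use.

```latex
f = \frac{n}{N} = \frac{4}{8} = 0.5
```

Half of the population cases were selected for the sample.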

19

Part 2: Sampling methods and techniques

= Ways of selecting cases from population

2.1 Random (probabilistic)

Generalisable, inferential statistics, fewer applications

2.2 Non-random (opportunistic; purposive)

Harder to generalise, inference contested, more widely used

20

2.1a Simple Random Sample


A statistical method used to choose cases randomly (eg random numbers)

Every case in population has exactly the same chance of being in sample

Most data analysis techniques initially designed for simple random samples
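A minimal sketch of a simple random sample, assuming the sampling frame is available as a Python list. The frame contents and the function name are hypothetical. The standard library call `random.sample` draws without replacement, so every case has the same chance of selection and no case is picked twice.

```python
import random


def simple_random_sample(sampling_frame, n, seed=None):
    """Select n cases from the frame, each with an equal chance."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(sampling_frame, n)


# Hypothetical sampling frame of 1,000 case identifiers.
frame = [f"case_{i}" for i in range(1, 1001)]
sample = simple_random_sample(frame, 50, seed=1)

print(len(sample), "cases selected, for example:", sample[:3])
```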

21

2.1b Systematic Random Sample


Like the SRS, select cases from anywhere in the whole population

An easier selection method: choose every (n)th person for the sample

Danger of ‘periodicity’: if original population order has any structure, the sample may be biased
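A sketch of systematic selection with a random start, plus a deliberately artificial frame that shows the periodicity problem. The function name and both frames are assumptions made for illustration.

```python
import random


def systematic_sample(sampling_frame, n, seed=None):
    """Pick a random start, then take every k-th case, with k = N // n."""
    interval = len(sampling_frame) // n
    if interval < 1:
        raise ValueError("sample size exceeds the sampling frame")
    start = random.Random(seed).randrange(interval)
    # Trim to n in case N is not an exact multiple of n.
    return sampling_frame[start::interval][:n]


# Hypothetical frame with no particular ordering structure.
frame = list(range(1, 1001))
sample = systematic_sample(frame, 50, seed=1)

# Artificial frame where every 20th entry is a "corner" household.
# The interval (1000 // 50 = 20) matches that cycle exactly.
periodic_frame = ["corner" if i % 20 == 0 else "mid" for i in range(1000)]
periodic_sample = systematic_sample(periodic_frame, 50, seed=1)

print(len(sample), set(periodic_sample))
```

Because the sampling interval coincides with the cycle in the list, the second sample contains only one type of household, whichever the random start happens to land on.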

22

Problems with sample methods selecting from whole population

The ‘random’ part means it is always possible to get a population coverage quite different from known structures

If total population is large or dispersed, then coverage of random parts of it is expensive and time consuming: few surveys use random sampling from whole of UK

23

2.1c Stratified random samples


Modifies random sample to ensure even (or ‘boost’) coverage of population groups

split sampling frame by stratification factors

select random samples within each factor

final sample has correct proportions of each

Example: select 490 M and 510 F

Properties: proportionate sample, correct representations; but more expensive & complex, should use ‘weights’ for analysis
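A sketch of proportionate stratified selection that reproduces the 490 M / 510 F split, assuming a hypothetical frame of 4,900 men and 5,100 women. The weights shown are one common definition (cases in the stratum divided by cases sampled from it). The slides do not say which weighting scheme is meant, so treat that part as an assumption.

```python
import random


def stratified_sample(frame_by_stratum, total_n, seed=None):
    """Draw a random sample within each stratum, sized by its population share."""
    rng = random.Random(seed)
    population_size = sum(len(cases) for cases in frame_by_stratum.values())
    sample = {}
    for stratum, cases in frame_by_stratum.items():
        # Rounding can leave the total slightly off total_n in general.
        share = round(total_n * len(cases) / population_size)
        sample[stratum] = rng.sample(cases, share)
    return sample


def design_weights(frame_by_stratum, sample):
    """Weight per stratum = cases in the frame / cases sampled (assumed scheme)."""
    return {s: len(frame_by_stratum[s]) / len(sample[s]) for s in sample}


# Hypothetical sampling frame split by the stratification factor (sex).
frame = {
    "M": [f"M{i}" for i in range(4900)],
    "F": [f"F{i}" for i in range(5100)],
}
sample = stratified_sample(frame, 1000, seed=1)
weights = design_weights(frame, sample)

print({s: len(cases) for s, cases in sample.items()}, weights)
```

In a proportionate design the weights come out equal across strata. They differ only when a group is deliberately over-sampled (‘boosted’).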

24

2.1d Multistage cluster samples


i) Select clusters of population at random

ii) Sample randomly within clusters

Eg: clusters = local authorities in UK

With qualifications, may still be treated as ‘random’ for analysis purposes

Big reduction in costs if face-to-face contacts

Most widely favoured sampling method in large scale survey collection

25

Example: Multistage cluster sample



Interest: attitudes of Scottish school pupils


Resources: 400 interviews with pupils


26

Edinburgh 100

Argyll 24

Islands 20

Highlands 40

Glasgow 124

Aberdeen 40

Shetlands 2

Borders 10

Perth 20

Moray 20

27

Glasgow 150

Edinburgh 150

Stirling 60

Moray 40

28

Stirling 60

30 young people at Balfron School and 30 young people at Stirling High
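The selection logic behind this kind of example can be sketched in two stages. This is a simplified illustration, not the lecture's actual design: the areas and pupil lists are invented, every chosen area gets the same number of interviews (the slides use unequal numbers), and the further school stage is left out.

```python
import random


def multistage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: pick clusters at random. Stage 2: sample cases within each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: rng.sample(clusters[name], n_per_cluster) for name in chosen}


# Hypothetical frame: 10 areas, each listing 500 pupils.
areas = {
    f"area_{a}": [f"area_{a}_pupil_{p}" for p in range(500)]
    for a in range(10)
}

# 4 areas x 100 pupils = 400 interviews in total.
sample = multistage_cluster_sample(areas, n_clusters=4, n_per_cluster=100, seed=1)

print(sorted(sample), sum(len(pupils) for pupils in sample.values()))
```

Interviewers then only need to travel to the four selected areas, which is where the cost saving for face-to-face work comes from.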

29

2.1e Longitudinal random samples

Longitudinal = interest in study over time

‘Panel’ and ‘cohort’ samples

recontact an initially random sample

Problems of attrition

Retrospective sample

Rely on recall evidence of random selection

Problems of selective recall

30

Issues in random sampling


Only as good as underlying sampling frame (a good one may not be available, or not be as good as we think)

Data analysis methods need adapting for stratified / clustered designs

Other survey factors interact with sample selection issues, eg poor interviewers may discourage certain cases from response

31

2.2) Opportunistic sampling


More often in social research, sample design is ‘opportunistic’ (‘purposive’)

Random sampling is expensive

Random sampling is complex

Some purported random samples are actually purposive (understanding of ‘random’)


32

2.2a Quota sampling


Fill up quotas of groups of interest

Quotas can ensure:

overall representation (cf systematic)

broad topic coverage (eg types of voter)

Example: market researchers in street; telephone call centres vetting contacts

Biases: issues in how a quota ‘fills up’
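A sketch of how a quota ‘fills up’, assuming contacts are approached in the order they are met. The contact stream, group labels and function name are invented for illustration.

```python
def fill_quotas(contacts, quotas):
    """Accept contacts in the order met until every group's quota is full."""
    remaining = dict(quotas)
    accepted = []
    for person in contacts:
        group = person["group"]
        if remaining.get(group, 0) > 0:
            accepted.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):
            break  # all quotas filled, stop approaching people
    return accepted


# Hypothetical stream of passers-by: every third person is "M", the rest "F".
contacts = [{"id": i, "group": "F" if i % 3 else "M"} for i in range(100)]
accepted = fill_quotas(contacts, {"M": 5, "F": 5})

print([person["id"] for person in accepted])
```

The quota guarantees five cases per group, but only the earliest arrivals in each group get in. Whoever is met first is rarely a random subset, which is one way bias enters.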

33

2.2b Snowball sampling


Also ‘focussed enumeration’

Technique for contacting cases from populations rare / difficult to access

Ask first obtained contact for suggested further contacts

snowball gathers size…

Eg smaller ethnic minority groups

Problem: social networks are non-random!
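A sketch of the referral process, assuming each contact's suggestions are known in advance as a dictionary. In practice they are only discovered as fieldwork proceeds. The network, names and function are hypothetical.

```python
from collections import deque


def snowball_sample(referrals, seeds, max_size):
    """Follow referrals outward from the first contacts, up to max_size cases."""
    sampled = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(sampled) < max_size:
        person = queue.popleft()
        sampled.append(person)
        for contact in referrals.get(person, []):
            if contact not in seen:
                seen.add(contact)
                queue.append(contact)
    return sampled


# Hypothetical referral network: X and Y only know each other.
referrals = {
    "A": ["B", "C"],
    "B": ["C", "D"],
    "C": ["E"],
    "X": ["Y"],
    "Y": ["X"],
}
sample = snowball_sample(referrals, seeds=["A"], max_size=10)

print(sample)
```

Starting from A, the snowball never reaches X or Y. Anyone outside the first contact's network has no chance of selection, which is why the result cannot be treated as random.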

34

2.2c Convenience sampling


Samples whatever cases from population were easiest to reach, eg personal contacts

Often no other sampling strategy involved

Biases likely in convenience process

Examples: …most social research survey examples are ‘convenience’..!

35

Random vs Opportunistic

Random difficult and expensive

mainly government funded resources

Most people who have conducted a survey have conducted an opportunistic one

Much data analysis / inference assumes random sample, but not applied to

But opportunistic data is often robust…

36

More on sampling methods


Refs for sampling methods / properties: eg Gilbert 2001 chpt5; De Vaus 2001 chpt6; Bryman 2001 chpt4.

Research reports: most important is documentation of sampling process / issues


To be open about research


To consider unintentional mistakes

37

Summary on sampling methods


Good sampling not a panacea

Other elements of surveying equally crucial

Many statistical methods assume random sample

For good sampling, use secondary data..

All samples have some value, but non-random ones need careful context