

Questioning methodology

Gordon Rugg and Peter McGeorge

Working paper

Faculty of Management and Business

University College Northampton


ISBN 1 901 547 008



The Authors:

Gordon Rugg is Reader in Technology Acceptance at University College Northampton

Contact details:

Dr Gordon Rugg

Reader in Technology Acceptance

School of Accountancy, Information Systems and Law

University College Northampton

Boughton Green Road

Northampton, NN2 7AL



Tel: +44 (0)1604 735500

Peter McGeorge is Senior Lecturer in Psychology at Aberdeen University

Contact details

Dr Peter McGeorge

Department of Psychology

University of Aberdeen


Aberdeen, AB24 2UB

Scotland, UK


Phone: +44 (0)1224 272248



Any work of synthesis and integration is likely to include a significant amount of input of ideas and influence from many other people, and this paper is no exception.

The sections of this work dealing with knowledge representation and integration derive at least in part from experience in developing a Knowledge Elicitation Workbench while in the Artificial Intelligence Group at the Department of Psychology, University of Nottingham, working with Nigel Shadbolt, Mike Burton and Han Reichgelt. The sections on implicit knowledge were grounded in Peter McGeorge’s PhD work in the same department, with Mike Burton. The concept of accessing different versions of knowledge via different elicitation techniques derives from Gordon Rugg’s PhD work with Wyn Bellin, while in the Department of Psychology, Reading University. The idea of accessing different types of memory via different elicitation techniques in the context of requirements acquisition was developed with Neil Maiden, HCI Design Group, School of Business Computing, City University. The extension of the requirements acquisition work to the wider concept of questioning methodology was largely inspired by work with Ann Blandford, School of Computing Science, Middlesex University.

We would also like to record our gratitude to everyone else who helped us in this work, particularly those who provided constructive suggestions on previous drafts, and the long-suffering respondents who provided us with the practical experience of elicitation on which this work was based.


A central problem in many disciplines is the elicitation of a complete, correct, valid and reliable set of information from human beings: finding out what people want, think, know or believe. Examples include social science research, market and product research, opinion polls and client briefs. Although numerous elicitation techniques exist, there has traditionally been little theoretically driven guidance available on choice, sequencing and integration of techniques. Choice of technique has been largely a matter of individual preference, with interviews and questionnaires usually being chosen, regardless of how suitable they are for the task being approached. This paper discusses the issues involved in providing guidance about choice of technique, then describes a framework for providing such guidance.

A central feature of this paper is the distinction between various types of memory and knowledge. Some of these can be accessed via interviews or questionnaires. Others, however, can only be accessed by one technique, and are inaccessible to interviews and questionnaires. These types are listed in the framework, and matched with corresponding recommended elicitation techniques. The framework is illustrated by case studies, including two from the authors’ industrial experience.

The paper concludes that questioning methodology fills a methodological gap between experimental design and statistics, and should be established as a discipline in its own right.






A framework for categorising techniques


Selecting and integrating techniques


Method fragments


Case studies




Future work




Figure 1: A three layer graph

Table 1: Recommended and contra-indicated techniques for handling each knowledge type



A central problem in many disciplines is finding out what people want, or think, or believe, or know. This problem is at the heart of any research involving human behaviour or attitudes (the social sciences, in effect) and a surprising range of other fields. In computing science, for example, elicitation of expertise is central to knowledge acquisition for knowledge based systems, and elicitation of client requirements is at the heart of system analysis and of requirements acquisition.

The problem is not caused by a lack of research in the area, or of elicitation techniques; a recent book on qualitative research methods alone runs to over six hundred pages (Denzin and Lincoln, 1994), and an overview article on requirements acquisition listed a dozen major techniques, with clear recognition that there were numerous other techniques in existence, as well as numerous versions of both the major and minor techniques (Maiden and Rugg, 1996). The problem is more to do with choice of the appropriate technique or techniques, and with using them in the correct way. The same problem occurs in a wide range of disciplines.

Traditionally, there have been three main approaches to choice of questioning technique. One is to view choice of technique as unimportant; a second is to use the techniques traditionally used in the discipline; and the third is to view the issue as important, but not yet well enough understood to enable an informed choice. The first main point which emerges clearly from the findings described below is that choice of the correct questioning technique is not just important, but essential, in any discipline which involves eliciting information from people. The second main point which emerges is that there is now a theoretically grounded and practical way of approaching this area. These issues are the central theme of this paper.

A short example demonstrates the type of issue involved. One of the authors recently supervised an undergraduate project which was investigating hassles (minor stresses and irritations) affecting IT managers. This is a topic which is of considerable importance both theoretically (in relation to stress research) and practically (staff turnover among IT managers is a major problem for companies with a high IT presence). The student did a thorough piece of work, establishing a good rapport with the IT managers she was studying, and using several techniques to investigate different aspects of the topic, including interviews and “hassle diaries”. These provided an interesting insight into the nature of an IT manager’s role, with enough detail and breadth of coverage to produce the basis of a good dissertation. However, the interviews and hassle diaries all missed a major feature which was only detected by use of shadowing (i.e. following the managers around while they worked), namely that the managers quite often had no lunch break because of pressure of work.

If this were an isolated case, then there would be little cause for concern. However, it is such a typical case that the authors now routinely use “compare and contrast” designs involving different elicitation techniques as a standard basis for student projects. Although these projects also focus on an interesting domain, so that analysis can concentrate on the domain if the different techniques do not produce different findings, in practice the different techniques have reliably and systematically produced different findings across a range of domains and techniques. The following sections discuss reasons for this, and the implications which follow.

There has been considerable exchange of concepts and techniques between disciplines. For instance, laddering was developed by Hinkle (Hinkle, 1965) from Kelly’s Personal Construct Theory (Kelly, 1955), and has since then been used in clinical psychology (Bannister and Fransella, 1980; Fransella and Bannister, 1977), architecture (Honikman, 1977), market research (Reynolds and Gutman, 1988), knowledge acquisition (Rugg and McGeorge, 1995) and requirements acquisition (Maiden and Rugg, 1996). Ethnographic approaches in various forms have been applied outside traditional ethnography to fields such as criminology (Patrick, 1973) and requirements acquisition for air traffic control systems (Sawyer, Bentley and Twidale, 1993).

This exchange, however, has traditionally been at the level of individual concepts and techniques, rather than in terms of larger frameworks. This is in interesting contrast to the situation with statistics, experimental design and survey methods, which have historically been viewed as semi-autonomous disciplines in their own right, with the same textbooks and journals being used by researchers from a wide range of disciplines. The reason for this difference is probably quite simple, namely that there has in the past been little in the way of higher-level frameworks and metalanguage to handle elicitation techniques as a whole. It is, however, a critically important absence, because statistics, experimental design and survey methods cannot make up for damage caused by incorrect selection or use of questioning technique.

The aim of this article is to describe a framework which will help remedy this situation, by providing theoretically grounded and systematic guidance on choice of techniques. This framework is intended to be applicable to a range of disciplines, and to provide a common ground for the establishment of questioning methodology as a discipline in its own right. This new discipline would complement survey methods, experimental design, and statistics, thereby providing researchers with a complete set of conceptual tools and methods for research involving human behaviour.

This paper is divided into four main sections. The first section briefly describes existing questioning techniques. The second section describes and discusses knowledge and memory types, and the implications of these for choice of questioning technique. The third section describes a framework for selection and integration of questioning techniques. The fourth section provides a brief description of knowledge representation and related concepts, to provide some further metalanguage.

These are followed by two short case studies and a discussion of implications for further work.

1.2 Existing techniques

This section provides a brief overview of the main questioning techniques, to set the subsequent theoretical analysis in context. It is tempting to derive guiding frameworks from the techniques themselves, or from practical issues involved in technique choice, such as time or equipment required. Although this can be useful, it is only part of what is needed. Technique-based frameworks are derived from existing solutions, rather than from the problem, and it is the problem which is central. This issue is discussed in detail below. It should be emphasised that the ordering of the list of techniques in this section is largely arbitrary, and is not intended as a classification in its own right; classification is described later in this paper.

The descriptions of techniques are intended as a brief overview so that readers know what the various techniques are before encountering the sections on selection and integration of techniques, since few readers are likely to be familiar with all of them. For clarity, these have been kept deliberately brief. There is a separate section later in this paper which deals with further concepts relevant to techniques, such as knowledge representation; some topics which are only tersely outlined in the descriptions of techniques, such as hierarchical structures of knowledge, are discussed in more detail in the “further concepts” section.


The main elicitation techniques

There is a considerable literature on the individual techniques, and on the philosophical, theoretical and methodological issues associated with them: for instance, the role of the observer, and the nature of subjectivity in data collection. A good introduction to this literature is provided by Denzin and Lincoln (1994). Although these are important issues, for reasons of space they are not discussed in detail in this paper, which concentrates instead on the interaction between knowledge types and elicitation techniques. Some of the techniques described below can be traced back to a key source, or can be illustrated by a classic study; other techniques, such as interviews, are ubiquitous and have no clear origin. The descriptions below are intended to give a brief overview of the main techniques in use, and include references to further reading where a technique is likely to be unfamiliar to most readers.

Ethnographic approaches usually involve spending extensive amounts of time with the group being studied so as to gain a thorough first-hand understanding of how their physical and conceptual world is structured. Varieties include participant observation, where the observer participates in the group’s activities, which may in turn be either undisclosed participant observation (in which the observer does not disclose to the group that their participation is for purposes of research) or disclosed (in which the observer does not attempt to conceal the purpose of the participation).

A classic example of using disclosed participant observation is Mead’s (1928) study of sexual behaviour in Samoa. A more recent example is Sommerville et al.’s (Sommerville et al., 1993) study of the behaviour of air traffic controllers. Classic examples of undisclosed participant observation include Patrick’s study of a Glasgow gang (Patrick, 1973) and Rosenhan’s study of behaviour in a psychiatric ward (Rosenhan, 1973).


Observation involves observing the activity in question. Varieties include participant observation (described above, under ethnographic approaches), direct observation and indirect observation. In direct observation, the activity itself is observed; in the case of indirect observation, the by-products of the target activity are observed, usually when the target activity itself cannot be directly observed. A familiar example of direct observation is shadowing, where the researcher follows the respondent around, usually in the context of the respondent’s work. An example of indirect observation is examination of illegitimacy rates as an indicator of the incidence of premarital sex.


Reports involve the respondent verbally reporting on the target activity. There are numerous varieties, some of which would traditionally be considered as techniques in their own right (e.g. scenarios). The underlying similarity in deep structure, however, is great enough for classifying them together to be sensible. Varieties include self report and report of others. Each of these can in turn be subdivided into on-line and off-line reporting. In self report, the respondent reports on their own actions; in reports of others, the respondent reports on the actions of others. In on-line report, the reporting occurs while the action is taking place; in off-line report, the reporting occurs after the action. Scenarios are a special form of report in which the respondent reports on what they think would happen in a particular situation (i.e. scenario), which may involve themselves and/or others. Critical incident technique, and several closely related techniques such as illuminative incident analysis (Cortazzi and Roote, 1975), involve asking the respondent to describe and discuss a particularly instructive past incident.

Interviews are one of the most familiar and widely used elicitation techniques. The core concept is of a question and answer session between elicitor and respondent, but the term “interview” is used so loosely, and to cover so many variants, that it is of debatable value. A traditional distinction is made between structured and unstructured interviews. In the former, the elicitor has a series of prepared topics or specific questions; in the latter, the agenda is left open and unstructured. Interviews may overlap with scenarios, by asking about possible situations, and with critical incident technique, by asking about important past events, as well as with other techniques such as laddering, when clarifying the meaning of technical terms.

The Personal Construct Theory techniques are a range of techniques deriving from Kelly’s Personal Construct Theory (PCT). These include repertory grids, card sorts and laddering.

PCT is based on a set of assumptions explicitly described by Kelly (1955). These cluster round a model in which people make sense of the world by dividing it up into things (elements) which can then be described by appropriate attributes (constructs). There are various assumptions about the nature of elements and constructs: for instance, that there is enough similarity across individuals to allow us to communicate with each other, but enough divergence for each individual to be different. This model is reflected in the elicitation techniques based on PCT. Repertory grids are entity:construct (broadly equivalent to object:attribute) matrices with values in the resulting cells of the matrix; card sorts involve repeatedly sorting entities into groups on the basis of different criteria; laddering is a structured technique similar to a highly restricted interview, for eliciting categorisations, hierarchies and levels of explanation. These techniques are often formally linked to each other in elicitation, with output from one being used directly as input for another. Examples include Kelly’s original work describing PCT (Kelly, 1955), and Bannister and Fransella’s more accessible descriptions of PCT and of repertory grid technique (Bannister and Fransella, 1980 and Fransella and Bannister, 1977 respectively). Personal Construct Theory and its associated techniques have been applied to knowledge acquisition and requirements acquisition by Boose, Gaines and Shaw (e.g. Shaw, 1980, Shaw and Gaines, 1988, and Boose, Shema and Bradshaw, 1989). Card sorts are described in detail in Rugg and McGeorge, 1997. Laddering was used by Honikman in architecture (Honikman, 1977) and by Reynolds and Gutman in advertising research (Reynolds and Gutman, 1988).
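For readers who prefer a concrete illustration, the repertory grid described above can be sketched as a small element:construct matrix. The elements, constructs, ratings and the similarity measure in this sketch are invented purely for illustration, and are not drawn from any of the studies cited.

```python
# A minimal sketch of a repertory grid: elements (the things being construed)
# rated against bipolar constructs on a 1-5 scale. All data here is invented.

elements = ["interview", "questionnaire", "card sorts", "shadowing"]
constructs = [
    ("quick to administer", "time-consuming"),   # pole 1 (rating 1) vs pole 2 (rating 5)
    ("needs little equipment", "equipment-heavy"),
]

# grid[c][e] = rating of element e on construct c
grid = [
    [2, 1, 2, 5],
    [1, 1, 2, 4],
]

def element_profile(name):
    """Return the column of ratings for one element across all constructs."""
    e = elements.index(name)
    return [row[e] for row in grid]

def most_similar_pair():
    """Find the two elements whose rating profiles are closest (city-block
    distance), a common first step when analysing a grid."""
    best = None
    for i in range(len(elements)):
        for j in range(i + 1, len(elements)):
            d = sum(abs(row[i] - row[j]) for row in grid)
            if best is None or d < best[0]:
                best = (d, elements[i], elements[j])
    return best

print(element_profile("shadowing"))
print(most_similar_pair())
```

A real grid analysis would typically go on to cluster both elements and constructs; the city-block comparison here is only the simplest starting point.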


Questionnaires are lists of questions or statements, usually administered in written form, but sometimes used in spoken form (e.g. via telephone sessions). When in spoken form, they overlap with structured interviews (described above). A traditional distinction in questionnaires is between open questions, in which the respondent may use their own words, and closed questions, in which the respondent has to choose between possible responses on a list supplied by the elicitor.


is an approach used under different names in different d
isciplines. Versions
include architects’ models, engineering prototypes, software prototypes, artists’ impressions
in architecture, etc. The prototype is shown to the respondent, who then critiques it; the
results from this are then usually fed back into a
nother iteration of design. It should be noted
that this is a completely separate concept from
prototype theory
, which is discussed
separately in section 3.2.2 below.




A framework for categorising techniques

The choice of structure for a framework is an important issue. A single hierarchical tree, with classes and sub-classes, is only able to represent a single way of classifying the entities involved. In the case of elicitation techniques, however, it is necessary to categorise in several ways, which might include time taken to use the technique (minutes in the case of card sorts, months or years in the case of some ethnographic work) or equipment needed to use the technique (extremely sophisticated recording equipment for observation of human:computer interaction, or a notepad and pen in the case of laddering).

The approach used by Maiden and Rugg (Maiden and Rugg, 1996) and used in the present paper is a faceted one, in which several different categorisations are used and treated as orthogonal (i.e. separate from, and uncorrelated with, each other, but applied to the same entities). This has considerable advantages in terms of clarity. It also has the advantage of handling range of convenience much more elegantly than is the case with non-faceted approaches, such as matrix representations or elaborated trees. “Range of convenience” is a concept in Personal Construct Theory (Kelly, 1955), which refers to the way in which a particular term can only be used meaningfully within a certain range of settings. For instance, “IBM compatible” is meaningful only when applied to computer equipment, and is meaningless when applied to a dug-out canoe. In slot and filler representations such as matrices, such cases have to be handled by a “not applicable” value, and in extreme cases the “not applicable” cases can outnumber the meaningful values. These issues are discussed in more detail below in relation to knowledge representation and its role in questioning methodology.
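The advantage of simply omitting a facet that falls outside a technique’s range of convenience, rather than storing a “not applicable” value in a matrix cell, can be sketched as follows. The facet names and values below are illustrative assumptions, not the actual facets of the Maiden and Rugg framework.

```python
# A minimal sketch of a faceted classification of elicitation techniques.
# Facets are treated as orthogonal, and a facet is simply absent for a
# technique that lies outside its range of convenience, rather than being
# recorded as "not applicable". All facet names and values are invented.

techniques = {
    "card sorts":  {"time_needed": "minutes", "equipment": "cards and pen"},
    "ethnography": {"time_needed": "months or years"},  # equipment facet omitted
    "shadowing":   {"time_needed": "hours", "equipment": "notepad"},
}

def with_facet(facet, value=None):
    """Return techniques for which the facet is within range of convenience,
    optionally filtered to a specific facet value."""
    return [name for name, facets in techniques.items()
            if facet in facets and (value is None or facets[facet] == value)]

print(with_facet("equipment"))             # techniques where the facet applies at all
print(with_facet("time_needed", "minutes"))
```

In a matrix representation the `ethnography` row would need an explicit “not applicable” cell for `equipment`; here the question simply never arises.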

Although the technique-driven facets are important, they are not enough. A technique-driven classification alone would be analogous to a representation of illness which was based on the medicines and treatments which were available, but which contained no systematic description of the illnesses which the medicines and treatments were designed to cure.

The most important facet of the Maiden and Rugg framework involves what the authors termed “internal representations”. This term covers types of memory, of knowledge and of communication filter which affect the quantity and type of information which can be elicited.

An initial distinction can be made between what is termed “new system” knowledge and “existing domain” knowledge in the Maiden and Rugg framework. The former refers to knowledge about things which do not yet exist, the latter to knowledge about things which do exist, or have already existed. This is a distinction with important implications for the degree of validity which can reasonably be expected, and will be discussed in more depth later. Existing domain knowledge is divided into three types of internal representation, namely tacit, semi-tacit and non-tacit knowledge, which form the bulk of the Maiden and Rugg framework.


Tacit knowledge is knowledge which is not available to conscious introspection, and can be subdivided into implicit learning (Seger, 1994) and compiled skills (Neves and Anderson, 1981; Anderson, 1990). Implicit learning occurs without any conscious learning process being involved; the learning proceeds straight from the training set of large numbers of examples into the brain without any intermediate conscious cognitive processes. Compiled skills were initially learned explicitly, but subsequently became habitualised and speeded up to the point where the conscious component was lost. Everyday examples include touch typing and changing gear when driving a car.

In such cases, asking the respondent about the skill will produce valid responses only by chance. Touch typists, for instance, do not usually have significant explicit memory for the position of keys on the keyboard; if asked which key is to the right of “g”, for instance, they will usually have to visualise themselves typing, and observe the answer. Similarly, car drivers will not usually be able to recall the precise sequence of hand and foot movements which they made when going round a roundabout. Asking respondents to describe what they are doing while using a compiled skill usually leads to breakdown of performance because of the intrusion of a slower conscious component into the task. Tacit knowledge may include a significant amount of pattern matching, which is a very fast, massively parallel form of search quite different from the sequential reasoning used for other tasks; an everyday example of pattern matching is recognition of a familiar face. Because of its massively parallel nature, pattern matching is not amenable to being broken down into lower-level explanations, with consequent implications for elicitation. One of the most strikingly unexpected results from research into expertise was the extent to which experts use matching against a huge learned set of previous instances, rather than sequential logic, as a way of operating (e.g. Chi, Glaser and Farr, 1988; Ellis, 1989).


Explicit knowledge is defined as knowledge which is available to conscious introspection. This type of knowledge is in principle accessible using any elicitation technique, although it may be subject to various biases and distortions.

2.4 Semi-tacit knowledge is a term which applies to a wide range of memory types and communication filters. These include short term memory; recall versus recognition; taken for granted knowledge; preverbal construing; and front and back versions. The common factor shared by these is that they can only be accessed via some routes, and not via others.

2.4.2 Short term memory is probably the most widely known of these types, and is well understood as a result of considerable research in psychology. It is a limited capacity, short term storage, with a capacity of about seven items, plus or minus two (Miller, 1956), and a duration of a few seconds. Long term memory, in contrast, has enormous capacity, and can last for tens of years. In complex cognitive tasks, short term memory is often used as a sort of scratchpad, with the information involved never reaching long term memory. This means that any attempt to access that information after the task (e.g. via interviews) is doomed to failure, since the information was lost from memory within seconds of being used. Short term memory is only accessible via contemporaneous techniques such as on-line self-report, or indirectly via observation.


2.4.3 Recall versus recognition is another aspect of memory structure. Recall is active memory, when information is deliberately retrieved from memory; recognition is passive memory, when a specified item is compared to what is stored in memory to search for a match. Recognition is normally considerably more powerful than recall (c.f. Eysenck and Keane, 1995). A simple example involves trying to recall the names of the states in the USA, where most people can only recall a small number, but can correctly recognise a much larger number if shown a list of names.

2.4.4 Taken for granted knowledge (TFG knowledge) is knowledge which one participant in a communication assumes to be known by the other participant or participants (Grice, 1975). The concept is related to Norman’s concept of knowledge in the head, as opposed to knowledge in the world (i.e. knowledge explicitly represented in the external world, for example as instructions on street signs) (Norman, 1990). TFG knowledge is normally not stated explicitly during communication; for instance, one does not say “My aunt, who is a woman” because it can be taken for granted that aunts, by definition, are women. This principle increases the efficiency of normal communication by leaving out superfluous information. Unfortunately, filtering out of TFG knowledge is based on the assumption that the other participant or participants share the knowledge, and this assumption can be false. This is particularly the case when experts are dealing with non-experts, and are describing everyday features of their area of expertise. Precisely because these features are so familiar to them, experts are likely to take them for granted, and to assume that they are equally familiar to the non-expert. Initial evidence from research into semi-tacit knowledge suggests that TFG knowledge is one of the more common, and more serious, reasons for incomplete elicitation of information.


2.4.5 Preverbal construing is a term used in Personal Construct Theory to describe construing which occurs without a verbal label for the constructs involved. This effect is what is referred to in lay language by expressions such as “I can’t put it into words, but…”. In some cases, this may refer to constructs which are fairly explicitly understood by the respondent, but which happen not to have a verbal label; in other cases, some form of tacit knowledge is involved. A striking effect which sometimes happens when using PCT techniques is that the respondent suddenly has an “aha” experience, when a construct changes from preverbal to verbal status. This is usually accompanied by expressions such as “I’d always known there was a difference, but I’d never been able to put my finger on it before”.

2.4.6 Front and back versions are, respectively, the “public consumption” and “behind the scenes” versions of reality which members of a group present to outsiders (in the case of front versions) and insiders (in the case of back versions). These terms are derived from Goffman’s (1959) dramaturgical metaphor of the stage performance. This metaphor has the advantage of not implying any intention to deceive in the front version; the front version in many professions is viewed by group members as a professional image to be maintained, not as an extended lie to be fed to the public. It has been anecdotally reported that members of the US Air Force about to testify to public hearings are given three pieces of advice: firstly, don’t lie; secondly, don’t try to be funny; and thirdly, don’t panic and blurt out the truth. Although this does not map exactly onto the distinction between front and back versions, it does neatly capture the distinction between telling the whole truth on the one hand and not telling a lie on the other.

Any outsider, such as a researcher or analyst, coming into an organisation is likely to be given the front version. Although this may not be dishonest, it is also unlikely to be the whole truth, and the missing information can be extremely important. An extensive literature dating back to Weber (e.g. Weber, 1924) has consistently found that in most organisations there are usually unofficial short-cuts in working practices which are not officially allowed, but without which the system would be too unwieldy to work. A simple illustration of this is the work to rule, a form of industrial action in which the participants follow the official procedures exactly. This usually reduces productivity dramatically. The distinction between front and back versions is not an absolute one, but more of a spectrum. Outsiders may become gradually accepted by the group, and given access to increasingly sensitive back versions of reality.


The so-called “stranger on a train” effect is a paradoxical effect, in which people are prepared to discuss extremely personal and sensitive information if the situation is one of anonymity (such as talking to a sympathetic stranger on a train whom one does not expect ever to meet again). This may be used by investigators, but requires careful setting up: for instance, it is advisable only to use a single elicitation session with each respondent, and to make it clear that the respondent will not be identifiable in the published outcome of the research.

2.4.7 Future system knowledge is the term used by Maiden and Rugg to describe knowledge about future systems in the context of software development; a more appropriate term for general questioning would be “predictive knowledge”. This involves quite different issues from the knowledge types described above. In the case of the knowledge types described above, the relevant knowledge exists somewhere, and the key problem is accessing this information reliably and validly. The term “accessing” is an important one in this context. “Elicitation” describes the process of extracting information from the respondent, via the respondent; however, some types of knowledge, such as tacit knowledge, have to be acquired by indirect induction rather than directly from the respondent.

An example would be the use of observation to identify key actions during performance of a compiled skill; it would in principle be possible to produce a complete and correct description of this skill without the respondent ever knowing what was in the description. In knowledge acquisition, this sort of situation occurs in relation to machine learning, where the salient variables may be identified via explicit elicitation from a human respondent, but the correct weightings and correlations between these variables are then worked out by software. This approach can lead to a system which performs better than the human experts from whom the variables were elicited (Michalski and Chilausky, 1980; Kahneman, Slovic and Tversky, 1982); the reasons for this have important implications for questioning methodology, and are discussed in more detail below. The distinction between elicitation and acquisition is now generally accepted in Artificial Intelligence (AI) and in requirements engineering, with elicitation of knowledge or requirements being recognised as subsets of knowledge acquisition or requirements acquisition respectively.

2.5 Predicting requirements and behaviour

When a new product is being developed, it is not normally possible for any single individual to predict what the requirements will be. One reason for this is that usually more than one stakeholder is involved, leading to the need for negotiation of requirements between stakeholders.

Another reason involves what is known in Information Science as the Anomalous State of Knowledge (Belkin, Oddy and Brooks, 1982). An Anomalous State of Knowledge (ASK) exists when a person wants something (e.g. a relevant reference or a new system), but does not have enough knowledge of the possibility space to know what is possible and what could therefore meet their requirements. This is particularly striking in the case of software development, where users may be utterly unaware of what is technically feasible, and may dramatically alter their requirements when they see what can be done.

A third major reason for problems in identifying future needs involves people’s weakness in predicting future events and behaviours. This is well recognised in attitude theory, where it has long been known that people’s expressed attitudes correlate weakly at best with their actions (e.g. Wicker, 1969). The same principle applies to people’s predictions about their own behaviours in situations such as seeing smoke come from underneath the door in a waiting room. Some personality theorists have gone so far as to argue that an individual’s own predictions about their behaviour in a given situation are no higher in validity than the predictions of someone else who knows that person well, and that our mental models of our personalities are derived from observation of our own behaviour, rather than being the cause of that behaviour. Although more recent research has shown that it is possible to reduce significantly the gap between expressed attitudes and actual behaviours by concentrating on key variables in the research design and the data collection (Myers, 1990), the gap is still a long way from closed, and the topic needs to be addressed with care.

This issue may well be a subset of a more general principle, namely human weakness in dealing with multivariate information. A considerable literature in judgement and decision making has consistently found that humans are bad at identifying randomness in multivariate data, with a corresponding tendency to see correlations and patterns where none exist (Kahneman, Slovic and Tversky, 1982). When correlations and patterns do exist, people are consistently poor at weighting the variables correctly. An elegant example of this is a study by Ayton (1998), involving prediction of football scores. The first part of this study involved asking British football fans and Turkish students with no interest in football to predict British football results. The result was that the Turkish students performed at a similar level to the British fans, at well above the level which would be expected by chance. The Turkish students were using the only information available to them, namely whether or not they had heard of the teams or the towns where they were based. These tended to be the larger and/or more famous examples, and these tended to beat smaller or less famous rivals. This effect was a strong one, and the other variables used in predictions by the British fans were comparatively weak predictors; the British fans, however, weighted these other variables too heavily in relation to the main one.
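The strategy attributed above to the Turkish students can be sketched as a simple recognition rule: favour the team one has heard of. This is an illustrative sketch only, not the procedure used in the study; the team names and the recognised set are invented for illustration.

```python
# A minimal sketch of prediction by recognition alone: favour the team
# (or town) the judge has heard of. Names below are invented examples.

def predict_winner(home, away, recognised):
    """Pick the recognised team; if both or neither are recognised,
    fall back to the home team (no discriminating information)."""
    home_known = home in recognised
    away_known = away in recognised
    if home_known and not away_known:
        return home
    if away_known and not home_known:
        return away
    return home  # no basis for discrimination: default guess

recognised = {"Manchester United", "Liverpool"}
print(predict_winner("Manchester United", "Wrexham", recognised))  # Manchester United
```

The point of the sketch is that a single, crudely weighted cue can outperform judges who spread their weightings across many weaker variables.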

An obvious way of dealing with this problem, and one already used in knowledge acquisition, is to use elicitation techniques to identify the salient variables, and then use statistical or computational techniques to identify the appropriate weightings for these variables. This approach seems to have been comparatively little used in the social sciences, although multivariate approaches are routinely applied to the variables identified by the researchers involved. If human weakness in handling multivariate data is as prevalent as it appears, then attempts to extract accurate predictions from people will usually be attempts to find something which does not exist, and will therefore be a waste of time and effort.
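The division of labour just described (humans nominate the variables, software derives the weightings) can be sketched with ordinary least squares. This is a hedged illustration, not a method from the paper; the single elicited variable and the outcome data are invented.

```python
# A minimal sketch: a human nominates a salient variable (here, a "team fame"
# score), and software computes the weighting by ordinary least squares.
# The data points are invented for illustration.

def ols_weights(xs, ys):
    """Fit y = w0 + w1 * x by least squares (one predictor, for brevity)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    w1 = cov / var          # slope: the derived weighting for the variable
    w0 = my - w1 * mx       # intercept
    return w0, w1

fame = [1.0, 2.0, 3.0, 4.0]          # elicited variable
goal_diff = [0.1, 1.9, 4.1, 5.9]     # observed outcome
w0, w1 = ols_weights(fame, goal_diff)
```

With more variables the same idea generalises to multiple regression; the essential point is that the weighting comes from the data, not from the respondent's intuition.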

It should be noted as a parenthesis that, although the findings on human judgement and decision making (J/DM) described above are reliable and robust, there has been debate about their validity. The naturalist school of J/DM research argues that the effects found in the “heuristics and biases” literature are largely artefacts of the statistical representation used. The “heuristics and biases” school have generally used a probabilist presentation, i.e. one involving probability judgements, when framing the experimental task. Researchers such as Gigerenzer argue that if the same task is reframed in a frequentist format, i.e. one involving frequency judgements, then the biases and distortions described above no longer occur (Gigerenzer, 1994). This debate is unlikely to be resolved in the near future, and is closely linked with a long-running debate in statistics about the relative meaningfulness and validity of probabilist and frequentist representations.

It is likely that future research will identify further types of memory and knowledge filter; for instance, the authors are currently investigating the potential semi-tacit category of “not worth mentioning” knowledge, and intend to investigate tacit knowledge in more detail.



It is clear from the account above that no single technique is likely to be able to deal with all the types of knowledge involved in any given situation. Selection and integration of the appropriate techniques is therefore necessary. There are various facets on which selection and integration can be described, such as knowledge types involved, equipment needed and input and output formalisms. For brevity, only selection and integration on the basis of knowledge type are described in any detail here. Table 1 below is not exhaustive or set in tablets of stone; its main function is to provide a clear overview of the recommendations arising from the accounts of knowledge types and of techniques above. The reasons for the recommendations should be clear from the preceding text.


Table 1: recommended and contra-indicated techniques for handling each knowledge type.

Knowledge type | Recommended techniques | Contra-indicated techniques

Predictive knowledge | Any technique, but problems with validity |

Non-tacit knowledge | Any technique, but there may be problems with validity of memory |

Semi-tacit knowledge: short-term memory | On-line self-report | All others (see list above)

Recall v. recognition | Techniques involving showing examples to the respondent (e.g. reports, picture sorts, item sorts) | Techniques which do not involve showing examples to the respondent

Taken-for-granted knowledge | | All others

Preverbal construing | Repertory grid; card sorts; laddering; possibly reports and interviews if handled with care | All others

Front and back versions | Observation; possibly interviews, critical incident technique and reports once good rapport has been established with respondent | All others

Tacit knowledge: compiled skill | Observation and | All others

Tacit knowledge: implicit learning | Observation and | All others


One important part of questioning is the identification of which knowledge types are most salient in the situation being investigated. Practical considerations of time and resources usually limit the amount of investigation which can be undertaken, so it is important to identify the most important aspects of the situation and to choose the appropriate techniques for them. A certain amount of information can often be gained informally during the initial meetings with potential respondents, gatekeepers and other members of the organisation when a study is being set up. If the research is to take place in a commercial company, for instance, it is often possible to use direct and indirect observation when on the way to the contact person’s office: for instance, the demeanour of the staff, the information and other resources available to them (e.g. manuals on desks) and the speed with which they perform tasks. Demonstrations of tasks allow the identification of tacit knowledge; a standard indicator of this is that the demonstrator is able to talk while performing the task, with the conversation ceasing when conscious thought is required to perform the task. This kind of information is difficult or impossible to gather using preliminary interviews; however helpful the respondents are, they will omit to mention taken-for-granted knowledge, and will probably never have noticed the extent to which they use tacit knowledge. This issue is discussed in more detail in the case studies described below.

Once the types of knowledge involved have been identified, it is then possible to start prioritising the topics which need to be investigated further, and to select the appropriate techniques to handle the knowledge involved. It is advisable to proceed this way round, rather than selecting the issues first and then profiling the knowledge involved, because the profiling may well reveal serious misconceptions in the elicitor’s initial model of the area. An effective demonstration of this is to ask a geologist to give an on-line self-report on how they identify a rock specimen, leading up to identifying it, and then to follow this immediately by asking the same geologist to identify a rock specimen and then explain how they knew that it was the stated type of rock. For the second task, experienced field geologists will usually be able to identify a rock before the elicitor has finished putting it on the table; the on-line self-report, however, can go on for as much as half an hour. It is clear that the actual identification is accomplished by some form of tacit knowledge (in this case, pattern matching) and that the tasks described in the on-line self-report are a reconstructed version of how to proceed, used only for teaching students or for difficult specimens. Such differences can easily mislead the inexperienced elicitor depending on initial briefing interviews; a moment spent in observation is seldom wasted.

3.2 Terminology

One historical legacy of the separate evolution of elicitation techniques is that there has been only partial and unsystematic transfer of concepts across techniques and disciplines, so that concepts viewed as indispensable in one area are practically unknown in another. This section describes a range of concepts which are relevant across disciplines and techniques, and which are among the conceptual tools of questioning methodology as an integrated discipline.

The terminology described below derives from a variety of sources, but primarily from knowledge representation, which is a fairly recent but well established and extensive field within Artificial Intelligence. A good introduction is provided by Reichgelt (1991). Knowledge representation is also important in other areas of computing, such as requirements engineering (Jarke, Pohl, Jacobs, Bubenko, Assenova, Holm, Wangler, Rolland, Plihon, Schmitt, Sutcliffe, Jones, Maiden, Till, Vassilou, Constantopoulos and Spandoudakis, 1993).

A full description of the topic is beyond the scope of this paper; however, it provides an important basis for a metalanguage for questioning methodology. One significant advantage of using this literature as a foundation is that there has been considerable work on the formal semantics of the various representations used. This allows a more systematic, clean and rigorous terminology than would otherwise be the case. The following account draws heavily on this literature, with additions from other literatures where appropriate.


3.2.1 Validity and reliability

An important initial distinction is between validity and reliability, used here in the sense in which the terms are employed in statistics and experimental design. “Validity” describes the extent to which what is elicited corresponds to reality; “reliability” describes the extent to which the same finding occurs repeatedly, whether between different elicitors, different respondents, different occasions, or whatever other variable is involved. The standard metaphor is target shooting, where “validity” refers to how near the bullets are to the target, and “reliability” refers to how near the bullets are to each other. Bullets may be near to each other while very distant from the target, which is generally less desirable than the converse; however, it is usually easier to assess reliability than validity, and it is tempting to hope for the best if the results are reliable.
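The target-shooting metaphor can be made numerically concrete: reliability is the spread of repeated measurements around their own mean, while validity is the distance of that mean from the true value. The figures below are invented for illustration.

```python
# A minimal sketch: measurements may be tightly clustered (reliable)
# yet systematically distant from the true value (invalid).
# All numbers are invented for illustration.
import statistics

true_value = 10.0
shots = [13.1, 12.9, 13.0, 13.2, 12.8]  # tightly clustered but biased

reliability_spread = statistics.stdev(shots)               # small => reliable
validity_error = abs(statistics.mean(shots) - true_value)  # large => invalid

# Here the spread is under 0.2 while the systematic error is about 3.0:
# high reliability, low validity.
```

The sketch shows why reliability alone is a weak warrant: agreement between shots (or respondents) says nothing about nearness to the target.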

An everyday example of this is the Father Christmas effect. If a number of respondents are separately asked to describe Father Christmas, then their accounts are likely to agree closely (white-bearded man, somewhat overweight, in long red coat and hood with white trim: probably a more detailed description than in many crime reports). However, this reliability does not mean that there is a real Father Christmas, only that there is a widely known stereotype, which all adult respondents know to be fictitious.

Human memory is subject to numerous distortions, biases and imperfections, and should therefore be treated with caution. The clarity and detail of a memory are not valid indicators of its accuracy. Distortions can be significant, such as complete reversals of a sequence of events. There is a considerable literature on this topic, dating from Bartlett’s early work (Bartlett, 1932) to more recent work by e.g. Loftus and Palmer (1974) and Baddeley (1990). Robust findings include the active nature of memory, which involves encoding of events into memory rather than a passive recording of them. This encoding frequently leads to schematisation of the memory so that it fits into a familiar schema, even though this may involve a reversal of the sequence of events, or of the role of the participants involved.

3.2.2 Category theory and fuzzy representations

Categorisation is an important part both of everyday cognition and of expertise. Categories are usually defined in terms of the set of attributes which are specific to the category in question: for instance, the category “bird” in lay language is defined in terms of having feathers, being able to fly, making nests and laying eggs. However, many categories are not clear-cut in the sense of having no exceptions or ambiguities, and there may be similar uncertainty about the individual attributes. In the case of birds, for instance, penguins do not fly, most reptiles lay eggs and some penguins do not make nests. Within individual attributes, an attribute may be defined in terms of several sub-components, and these, like the attribute, may be “fuzzy” attributes. This term refers to attributes whose applicability is not a clear “either/or” issue, but rather a question of extent. The concept “tall”, for instance, applies strongly to someone two metres high, but there is no unambiguous cut-off point at which a height is described as “average” rather than “tall”. This lack of precision, however, does not stop the attribute from being meaningful; it means, rather, that the metalanguage needed to describe it needs to be sufficiently sophisticated.

Category theory, and more specifically prototype theory, have been investigated in some depth by Rosch (Rosch, 1983) and other researchers in the same tradition, who use the concept of core membership of a category, with increasing degrees of variation from the prototypical core membership. A robin, in this approach, is a prototypical bird exhibiting all the usual attributes of membership of the category “bird”; a puffin is less prototypical, and a penguin is on the edge of the category. Various branches of set theory and of formal semantics also deal with the same issue of categorisation, which is an important and ubiquitous one.


At a practical level, categorisation has major implications for any bureaucracy, and particularly for a bureaucracy trying to automate its procedures, as has been noted since Weber’s research into bureaucracies (Weber, 1924); the same is true for the law. For instance, assessment of welfare entitlements, or of tax liability, often involves a considerable amount of decision making about the appropriate category in which to put a particular issue; once the category has been decided, the rest of the assessment is comparatively trivial. At a theoretical level, the topic of categorisation is of particular interest to social anthropologists, in terms of the social construction of defining features of social structure, such as in-groups and out-groups.

Fuzziness is the topic of an extensive literature on fuzzy logic, dating back to Zadeh’s original work (Zadeh, 1965). This literature uses a mathematical approach to describe degrees of membership of fuzzy sets, and has proved a powerful tool in handling data of this sort. The basic concept is that set membership is quantified on a scale from zero (not a member) to one (completely a member), with intermediate membership being given an intermediate numeric score, such as 0.3 or 0.7.
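The graded membership just described can be sketched for the “tall” example used earlier. The cut-over heights (1.5 m and 2.0 m) are invented for illustration, not drawn from the fuzzy logic literature.

```python
# A minimal sketch of fuzzy set membership: degree of membership in "tall"
# rises linearly from 0 to 1 over a band of heights. The band limits
# (1.5 m and 2.0 m) are invented for illustration.

def tall_membership(height_m):
    """Return the degree of membership in the fuzzy set 'tall', from 0 to 1."""
    low, high = 1.5, 2.0
    if height_m <= low:
        return 0.0
    if height_m >= high:
        return 1.0
    return (height_m - low) / (high - low)

print(tall_membership(2.0))   # 1.0
print(tall_membership(1.75))  # 0.5
```

There is thus no single cut-off point; instead, each height maps to a degree of applicability, which is exactly the quality the prose above describes.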

There are also extensive literatures in statistics and psychology, particularly judgement and decision making (J/DM), dealing with areas such as uncertainty, stochastic events, imperfect knowledge and incomplete knowledge, which are different from fuzzy knowledge, but may overlap with it. Uncertainty refers to knowledge which may or may not be true; stochastic events happen or do not happen on a probabilistic basis; imperfect knowledge contains errors; incomplete knowledge is simply incomplete. Thus, for example, a doctor may think that a patient has a particular disease, but not be sure of the diagnosis (uncertainty); the disease may be known to cause delirium at unpredictable intervals (stochastic events); the medical records may contain errors, although the doctor does not know which parts of the records are correct and which are incorrect (imperfect knowledge); and the medical records may not contain any information about one aspect of the patient’s previous health (incomplete knowledge). Each of these has different implications for theory and practice.

3.2.3 Terms from knowledge representation

The standard literature on knowledge representation in Artificial Intelligence deals in depth with formalisms for representing knowledge, including facts, relationships and actions. Although these provide a powerful language for handling the output from elicitation sessions, this is too broad a topic to be covered in detail in this paper, so only an outline is given below.

Three well-established formalisms for representing relationships are nets, frames and rules. Nets, i.e. semantic networks, have the advantage of considerable flexibility in handling different types of relationship (e.g. “is-a” and “part-of” links) but the disadvantage of unclear semantics and of lack of structure. Frames involve a slot and filler notation, in which the various relevant semantic categories are listed in advance and then filled in for each instance being described. These have the advantage of clarity and completeness, but the disadvantage of rigidity. Rules represent information in terms of conditions and consequences (e.g. IF condition A AND condition B THEN action C). This is useful for representing knowledge about actions, but can lead to problems of obscurity with regard to precedence, concurrency, etc. in large rule sets. Although all of these formalisms are relevant to elicitation, the most immediately relevant is semantic networks, whose terminology is explicitly used in laddering and in category theory (described later).
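The rule formalism above (IF condition A AND condition B THEN action C) can be sketched as a small forward-chaining loop. The facts and rules below are invented for illustration, not drawn from any particular knowledge base.

```python
# A minimal sketch of the rule formalism: each rule is a pair of
# (set of conditions, consequence); the loop fires any rule whose
# conditions are all present until no new facts emerge.
# Facts and rules are invented for illustration.

rules = [
    ({"has_feathers", "flies"}, "is_bird"),
    ({"is_bird", "sings"}, "is_songbird"),
]

def forward_chain(facts, rules):
    """Repeatedly fire satisfied rules, adding their consequences."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, consequence in rules:
            if conditions <= facts and consequence not in facts:
                facts.add(consequence)
                changed = True
    return facts

print(forward_chain({"has_feathers", "flies", "sings"}, rules))
```

Even in this toy form, the obscurity problems mentioned above are visible: the order in which rules fire, and which rules can fire concurrently, are not stated anywhere in the rule set itself.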

Another set of representations from AI with implications for questioning methodology deals with classes, sub-classes and instances. Classes are categories which may be composed of sub-classes, and those in turn of further sub-classes. Eventually all classes end in instances, i.e. specific, unique entities which belong to that class. A familiar example is zoological classification, in which the class (using knowledge representation terminology) of canids includes the sub-class of dogs, and the sub-class of dogs in turn contains instances consisting of all the dogs in the world.

Each class has a set of attributes which define and/or describe it; for instance, the class of mammals includes the attributes of giving birth to live young and suckling the young with milk.
The concept of inheritance refers to the situation where a sub-class has not only its own attributes, but also inherits the attributes belonging to any higher-level classes to which that class belongs. Although computationally and semantically attractive because of its parsimony and elegance, this concept encounters representational problems with inheritance from different sets of higher-level classes and with exceptions which over-ride the inherited attributes; it therefore needs to be applied with caution. The classic example is Tweety the bird: the class of “bird” normally has the attribute “able to fly”, but if Tweety is a penguin, then this inherited attribute has to be over-ridden at the level of the class of “penguin” with the attribute “unable to fly”.
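The Tweety example maps directly onto class inheritance in a programming language: the sub-class over-rides the attribute it would otherwise inherit. A minimal sketch:

```python
# A minimal sketch of inheritance with an over-ride, following the Tweety
# example: "Penguin" inherits from "Bird" but over-rides the flying attribute.

class Bird:
    def can_fly(self):
        return True  # default attribute, inherited by sub-classes

class Penguin(Bird):
    def can_fly(self):
        return False  # exception: over-rides the inherited attribute

tweety = Penguin()
print(tweety.can_fly())  # False
```

The parsimony is visible (Penguin states only its exception), and so is the hazard: a reader of the Bird class alone would draw the wrong conclusion about Tweety.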

3.2.4 Terms from Personal Construct Theory (PCT)

Personal Construct Theory makes an initial distinction between elements (the entities being described) and constructs (the concepts used to describe them). This distinction is very similar to the distinction in AI between instances and attributes respectively. Considerable emphasis is placed in PCT on the elicitation of respondents’ own categorisation in the form of elements and constructs. Although elicitation of constructs may appear to a novice to be an endless task, in fact the number of constructs relevant to a particular domain of discourse is usually quite small (usually less than twenty, and often significantly less than that). Part of the reason for this is that the domain of discourse is only relevant to a sub-set of the constructs which the respondent knows; another part of the reason is that respondents will explicitly state that they know of more constructs which are applicable, but which are not particularly important. Since the elicited constructs are usually tersely described (two or three words) and tractable in number, it is possible to compare results across different respondents more easily than is the case with interviews, etc., and with more validity than is the case with e.g. questionnaires, which normally impose the elicitor’s constructs on the respondent rather than eliciting the respondent’s constructs.

PCT has an explicitly defined set of terminology and concepts, such as focus of convenience (the core area to which a construct can be applied) and range of convenience (the range of contexts to which a construct can meaningfully be applied). Focus of convenience and range of convenience are the most immediately relevant to questioning methodology, and space prevents a more exhaustive listing, but PCT terminology is an area which could profitably be studied by elicitors working in a range of disciplines and approaches in which it is currently little known, such as discourse analysis. In particular, its combination of flexibility and formalism would make it well suited to areas which have in the past used structuralism or semiotics; PCT is at least as flexible and formalist as these, but considerably richer and better defined. This flexibility is also a factor in the authors’ preference for PCT over approaches such as Q methodology. For instance, the classic Q sort, in which cards are sorted into a predetermined distribution, is diametrically opposed in its approach to the PCT practice of examining a respondent’s repertory grid specifically to see whether the responses show an unusual distribution. One potential link between PCT and grounded theory (Glaser and Strauss, 1967) could repay investigation: grounded theory’s concept of tracing inferencing through a series of levels of abstraction of data has clear similarities to some of the concepts in laddering. In particular, laddering on explanations can be used to check whether concepts have been fully explained, as described below in the section on graph theory.


3.2.5 Graph theory

A relevant literature which is comparatively little known in most non-mathematical disciplines is graph theory. This provides a clear, useful notation for representing knowledge in a way which allows qualitative analysis to be combined with quantitative. The term “graph” in this context refers not to graphs in the sense of plotting sets of values against each other, but to items linked to each other by lines, as in the simplified diagram below.

Figure 1: a three-layer graph

In this case, the top-level node (A) is joined by two arcs (connecting lines) to two lower-level nodes (A1 and A2). Node A1 is joined by two arcs to leaf-level (bottom-level) nodes; the node on the right (A2) is joined by three arcs to leaf-level nodes. The graph has a depth of three levels; the leaf-level nodes are the children of nodes A1 and A2, which in turn are the children of node A. The terms “nodes” and “arcs” are widely used in a range of disciplines in the sense described above, although formal graph theory favours the terms “vertices” and “edges” respectively for the same concepts.

There are various forms of graph, such as trees (graphs in which each node may have an upwards connection to a parent, and may have downwards connections to one or more children, but no sideways connections to other nodes) and nets (graphs which do not have the hierarchical structure of trees, and in which sideways links may occur). Graphs may be directed (each arc may be followed in one direction only) or undirected (each arc may be followed in either direction).

Using a very simple tree as an example, it is possible to see how graphs offer a powerful and flexible formalism for representation of relationships. For instance, it is possible to count the layers of nodes in the graph, as an index of hierarchical organisation of structure, or to count the number of nodes at a particular level of the graph, as an index of differentiation and breadth at that point. An obvious application is the study of organisational behaviour, where such indices can be used to describe the structure of the organisation; however, the same concept can be applied to other areas. It has, for instance, been applied to elucidatory depth, i.e. the number of successive layers of explanation needed to reach public domain terms or tacit knowledge (Rugg and McGeorge, 1995), and can be applied in the same way to fabricatory depth, i.e. the way in which tools are used to make tools to make tools, as an index of the depth and breadth of a culture’s technological infrastructure (currently being investigated by the authors).
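The two indices mentioned above, depth and breadth, can be computed directly from a nested representation of a tree like the one in Figure 1. The node labels below (A1, A2, and so on) are illustrative, chosen to mirror the structure described in the text: a root with two children, which have two and three leaf-level children respectively.

```python
# A minimal sketch of depth and breadth indices over a tree, using a
# dictionary mapping each node to its children. Labels are illustrative.

tree = {"A": ["A1", "A2"],
        "A1": ["A1.1", "A1.2"],
        "A2": ["A2.1", "A2.2", "A2.3"]}

def depth(tree, node="A"):
    """Number of layers from this node down to the deepest leaf."""
    children = tree.get(node, [])
    if not children:
        return 1
    return 1 + max(depth(tree, c) for c in children)

def breadth(tree, level, node="A"):
    """Number of nodes at the given level (level 1 = root)."""
    if level == 1:
        return 1
    return sum(breadth(tree, level - 1, c) for c in tree.get(node, []))

print(depth(tree))       # 3: a three-layer graph, as in Figure 1
print(breadth(tree, 3))  # 5: the leaf-level nodes
```

The same two functions could be applied unchanged to an organisational chart, or to a ladder of explanations when measuring elucidatory depth.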












Facet theory, as used by Rugg and McGeorge (1995), is derived largely from graph theory, with the concept of separate trees orthogonal to each other but sharing some or all of the same leaf-level instances. This concept is conveniently similar to the concept of “views” in software engineering, and is becoming increasingly used in that field. A similar concept is well established in information science (Vickery, 1960), though without the same underlying mathematical formalisms. Facet theory makes it possible to describe complex multivariate structures as a set of separate and comparatively simple structures, and is applicable to a wide range of uses. For instance, an organisation may have one structure for the commercial organisation, another for union membership within it and another for safety officers.
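The facet idea of separate trees sharing leaf-level instances can be sketched with two small trees over the same staff. The structures and names below are invented for illustration.

```python
# A minimal sketch of facets: two orthogonal trees (commercial structure and
# union membership) sharing some of the same leaf-level instances (staff).
# All names are invented for illustration.

commercial = {"Company": ["Sales", "Production"],
              "Sales": ["alice", "bob"],
              "Production": ["carol"]}

union = {"Union": ["Branch1"],
         "Branch1": ["alice", "carol"]}

def leaves(tree, root):
    """Collect the leaf-level instances reachable from the given root."""
    children = tree.get(root, [])
    if not children:
        return {root}
    return set().union(*(leaves(tree, c) for c in children))

# The facets overlap on some, but not all, leaf-level instances:
print(leaves(commercial, "Company") & leaves(union, "Union"))
```

Each tree stays simple on its own; the multivariate complexity lives only in how the facets intersect at the leaves.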

3.2.6 Schema theory

One of the features which Bartlett discovered in his research on memory (Bartlett, 1932) was that the processes of memory tend to organise events and facts into regular templates, which Bartlett termed schemata. The same underlying concept has been re-worked repeatedly in psychology since then, for instance in the form of script theory (Schanck and Abelson, 1977). This phenomenon is important to questioning methodology for two main reasons. The first is that it explains and predicts certain types of error in memory, particularly recall, which is salient to questioning techniques dependent on the respondent’s memory of the past. The second is that it helps explain the way in which respondents, particularly experts, structure parts of their expertise.

This has important implications for elicitation of information about values and judgements, and can explain apparent inconsistencies in them, although there appears to have been comparatively little work on this. In the field of software metrics, for instance, the majority of work appears to have concentrated on the elicitation of individual metrics for evaluating software, rather than on finding out which categories respondents use to cluster software into groups, and which metrics are relevant to each of those groups. In the domain of car design, for instance, there are well-established groups of car, such as town car, luxury car and estate car. The metric of “size” is applicable to all of these, but the desired value is very different for the different groups. In the case of a town car, small size is an asset, whereas in the case of a luxury car it is a drawback.

Techniques such as laddering are well suited to the elicitation of schemata, and it will be interesting to see what comes of future work using this approach. The field of software design appears to be particularly ready for such work, which would complement the existing literature on customising software to the individual user, and on identifying generic user types.


The traditional unit of analysis and discussion in elicitation is the method/technique: for instance, the interview, or the questionnaire, or repertory grid technique. There are, however, significant problems with this approach when looking at the bigger picture. One problem is that for most techniques there is no single standard form, so any description of the technique has to include descriptions of the main variants of the technique. Another is that the various techniques tend to blur into each other: the distinction between a self-report and an interview in which the respondent uses materials to demonstrate a point, for instance, is hard to draw. A related further problem is that the same features may occur in two or more techniques, leading to duplication of description in any systematic account of the techniques. These problems, and others like them, make it difficult to provide a systematic, clear, precise set of descriptions and prescriptions about methods/techniques and their use.

One solution to this problem is to use a finer-grained unit of analysis. Instead of treating each method or technique as an integral whole, one can instead treat it as being composed of a number of sub-components. For instance, in scenarios the elicitor uses a prepared set of information for the respondent; the elicitor then asks natural language questions; the respondent answers using natural language responses. This is quite different from the structure of a repertory grid session, where the elicitor uses a prepared grid format, and encourages the respondent to identify constructs which describe the elements used in the grid. It does, however, share some elements with a structured interview, which also involves natural language questions and natural language responses.

We introduce the term “method fragments” to describe these sub-components. Method fragments can be identified at various levels of granularity. The coarsest grained level consists of fragments such as “natural language question,” with finer grained levels such as “natural language question about a future event” and “natural language question phrased as a probability value.”

Method fragments have obvious practical advantages in any systematic work involving elicitation techniques and methods. They can be used to reduce or remove repetition when two or more techniques share common method fragments. In such cases, it is only necessary to cover each method fragment once, and to state which techniques involve that fragment. They can also be used in a “pick and mix” way to create the appropriate customised variant of a technique, or blend of two or more techniques, to fit a particular situation. One of our recent student projects, for instance, involved asking respondents to say what was going on in a photo, then followed this up with a short set of previously prepared questions, the responses to which were in turn probed using laddering. These fragments allowed investigation of attributional effects via the report on the photos (for instance, investigation of how women’s status was perceived in photos where the women were using IT equipment), which could then be compared with the accounts obtained via the interviews; the laddering allowed identification of attributes which were perceived as status markers, which could in turn be compared with results from the other two fragments.
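The fragment-based view lends itself to a simple formal treatment. The sketch below is our own illustration, not part of the framework itself, and the fragment and technique names are invented for the example: each technique is represented as a set of method fragments, so that shared fragments can be documented once and customised blends assembled pick-and-mix.

```python
# Illustrative sketch: techniques as sets of method fragments.
# Fragment and technique names are invented for this example.
TECHNIQUES = {
    "structured_interview": {"natural_language_question", "natural_language_response"},
    "scenario": {"prepared_materials", "natural_language_question",
                 "natural_language_response"},
    "repertory_grid": {"prepared_grid", "construct_elicitation"},
    "laddering": {"natural_language_question", "probe_downward",
                  "natural_language_response"},
}

def shared_fragments(a, b):
    """Fragments common to two techniques: these need documenting only once."""
    return TECHNIQUES[a] & TECHNIQUES[b]

def compose(*techniques):
    """A customised blend of techniques is the union of their fragments."""
    blend = set()
    for t in techniques:
        blend |= TECHNIQUES[t]
    return blend

# The student project described above: photo report, prepared questions, laddering.
blend = compose("scenario", "structured_interview", "laddering")
print(sorted(shared_fragments("scenario", "structured_interview")))
print(sorted(blend))
```

The two set operations mirror the two advantages described above: intersection identifies the fragments whose description can be shared between techniques, and union assembles a customised blend for a particular situation.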

A more profound advantage is that the use of fine-grained method fragments makes it possible to provide grounded advice about use of appropriate formats. In the case of “natural language question phrased as probability value,” for instance, there is a considerable literature within the Judgement and Decision Making (J/DM) area of psychology dealing with the various cognitive biases which are associated with probabilistic and frequentist presentations of the same underlying question. Similarly, the literature on attribution theory provides strong guidance about outcomes from phrasing a question in the second or the third person (“What would you do…” versus “What would most people do…”).

Although it might be thought that the number of potential method fragments would be enormous, our initial work in this area suggests that the number is in fact quite tractable. Our research so far has been both bottom-up, working from practical experience towards theory, and top-down, working from theory towards practice. There is still a considerable amount of work to be done in this area, but it holds great potential.



5.1 Bulk carriers

The first case study described here was one of the precipitating events leading to the development of the framework described above. The case study involved following the development of software to be used in the loading of bulk carrier ships. As part of this process, the author wanted not only to interview the software development team, but also to observe them in action, and to observe loading in progress. The interviews were unproblematic, but there were practical and security problems with access to the loading. During the negotiations about this, the software developers decided to undertake their own visit to observe loading, since their knowledge of the process came from requirements given to them as documentation.

The developers soon found several important aspects of the loading process which had serious implications for system design, but which had not been mentioned anywhere in their documentation. For instance, the developers had assumed that loading would occur at a fairly constant rate, making it possible to predict loading strains on the hull reasonably well in advance; however, this assumption turned out not to be correct. It also transpired that hull stresses could very quickly change from safe to dangerous if the cargo being loaded was a dense one, such as iron ore, where a large weight of cargo could be loaded very quickly. The developers had also not realised how much noise, glare and vibration were associated with the loading process, which had serious implications for the design of any computer based warning system.

In this example, the system analysis had been carried out competently by professionals, but had failed to record several important facts in the documentation. These facts were discovered in less than an hour by developers with no formal training in observation, leading one to wonder how many more might have been uncovered by a trained specialist. Interestingly, all the missing factors in this example appear to have been cases of taken for granted knowledge.

5.2 Industrial printing

The second case study was undertaken by Blandford and Rugg (in preparation) as part of an assessment of the feasibility of integrating requirements acquisition for real-world software systems with usability evaluation in general and Programmable User Models in particular. The domain involved was industrial printing of, for example, sell-by dates onto products; the company involved was a market leader in the field. The case study consisted of two main phases, the first of which was undertaken at the company’s premises, and the second of which was undertaken in a client’s food processing factory, where the equipment could be seen in action.

The first phase involved interviews with stakeholders, conducted separately to identify any differences between stakeholders with regard to requirements, and also demonstrations of the equipment, which were combined with observation and on-line self-report. The demonstrations showed that the demonstrators did not use the equipment often enough to have compiled the skills involved in using it, and also showed that using it was not a trivially easy task. (Since the equipment is normally set up to print a sell-by date a given number of days in the future, and can automatically update the date to be printed, simply showing the user the date being printed is not enough; the equipment also needs to be able to show the length of time by which the date is being offset.)

It became clear that user navigation through the equipment, and security issues associated with the password protection for the equipment, were particularly important potential problems. Programmable User Models were used to identify particular problems which might arise, after which the visit to the client’s site was conducted to see how well these predictions corresponded with reality.

The security issue turned out to have been solved by a passive work-around; the equipment was positioned next to packing teams, making it extremely difficult for anyone to use the equipment without authorisation. One device, however, was positioned in an isolated part of the factory, and there had been concerns about its security, as predicted by the authors. (There had been one occasion when a print code had mysteriously changed in the middle of a print run.)

The user navigation issue turned out to be an interesting one in several ways. The first phase of investigation had shown that frequency of use of the device would be an important variable (and one where the software development stakeholder and the training stakeholder had different perceptions of how often typical users would use the device). The authors had predicted that if the device was used frequently enough, then the skills involved would become compiled, and navigation would not be a problem; however, if the device was used less frequently, then navigation would be a problem, with various likely errors.

The site manager told the authors that there were different levels of training for the different staff involved, who used the device at various levels of sophistication. He mentioned that he and some of the other senior staff had been on the full training course, and were familiar with the device. This is what would be expected as a front version, and there was the prospect that the back version would be very different. However, when the manager demonstrated a feature of the device, the speed at which he operated it was clearly the result of a compiled skill, which indicated considerable use of the device, which in turn indicated that there was not a significantly different back version. For staff who used the device less often, for simple tasks, there were printed “crib sheets” attached to the device. This was an interesting finding, since an earlier interviewee had told the authors that this approach would not be used in the food industry because of the need to clean the outside of the device frequently to comply with health and safety regulations.

In addition to these expected issues, some serendipitous findings emerged. It had been expected that observation would identify issues missed in the previous sessions, but it was not possible to predict what these would be. An example of this was that the air in the second site contained a high proportion of suspended dust particles from the dried foods being processed. This was not directly relevant to the requirements for the equipment design, but had important indirect implications. The amount of dust was sufficient to make wearing spectacles inconvenient, since the lenses soon became dusty. A significant proportion of the staff on site were middle aged, and needed glasses to read small text, such as that on the device’s display screen. Since the site dealt with food processing, health and safety legislation meant that staff had to wear white coats. This combination of factors meant that for a significant proportion of staff, checking the display on the device involved taking out spectacles from under a white coat, putting the spectacles on, reading the display, cleaning the spectacles, and then putting them away again. This in turn meant that it was not possible to depend on staff glancing at the display in passing as a routine method of checking the device, with consequent implications for working practices.

Although the client had a long relationship with the company, and the sales representative who accompanied the authors to the site was on good terms with the manager and clearly knew the site well, there had been no previous mention of the dust issue and its implications. The most likely explanation was, once again, taken for granted knowledge which had gone unmentioned and undetected until observation was used.


5.3 Women’s working dress

A study of perceptions of women’s clothing at work used card sorts to investigate categorisation of women’s working dress by male and female respondents (Gerrard, 1995). This is an area which had previously been investigated by other researchers using a range of familiar techniques. However, there had not been any previous work using card sorts, which appeared to be a particularly appropriate technique for this area. In the study, each card held a different picture of a set of women’s clothing worn by a model. Respondents were asked to sort the cards repeatedly into groups of their choice, using a different criterion to categorise all of the cards each time (individual cards could be sorted into a group such as “not applicable” or “don’t know” if necessary). One finding was that half of the male respondents, but none of the female respondents, used the criterion of whether the women depicted were married or unmarried. This was something which had not emerged as an issue in any of the previous research in this area. It was also of interest because the pictures did not show the faces or the hands of the models, because of the risk of distraction from cues other than the clothing itself, so the respondents were unable to see wedding rings or other indications of marital status, and were therefore categorising solely on the basis of the clothing.
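Repeated card-sort data of this sort can be analysed very simply. The sketch below is our own illustration (the respondents, groups and criterion names are invented, not Gerrard’s data): each respondent contributes the list of criteria they used across their sorts, and criterion use is then tallied by group.

```python
# Invented card-sort records: (respondent group, criteria used across sorts).
sorts = [
    ("male", ["formality", "married_or_not", "colour"]),
    ("male", ["formality", "season"]),
    ("female", ["formality", "colour"]),
    ("female", ["season", "smartness"]),
]

def users_of(criterion, group):
    """Number of respondents in a group who used a given sorting criterion."""
    return sum(1 for g, criteria in sorts if g == group and criterion in criteria)

print(users_of("married_or_not", "male"))    # 1 of the 2 invented male respondents
print(users_of("married_or_not", "female"))  # 0 of the 2 invented female respondents
```

Tallies of this kind make it easy to spot criteria, such as marital status in the study above, that are used by one group of respondents but not another.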

5.4 Change at work

Management of change is a topic which has received considerable attention from researchers and practitioners. Change of apparently trivial factors can have knock-on implications which connect to higher-level values and goals in those affected by the change, and which can in turn lead to strong emotions and often resistance to the proposed change. This appeared a particularly suitable area for investigation via laddering, and was investigated using laddering and a questionnaire in the same organisation, which was about to bring in a new IT system (Andrews, 1999).

The results obtained via the two techniques had some similarities; for instance, the theme of improved communication via the proposed new IT system ran through responses from both techniques. However, there were also some striking differences. For example, only 5% of the respondents stated in questionnaires that the new technology would affect their job security, whereas this was explicitly mentioned by 43% of the respondents when laddering was used.

Another interesting result emerged during the quantitative analysis of the laddering results. This involved counting the average number of levels of higher level goals and implications of the new system elicited from respondents in different positions in the organisation. The average number of levels used by respondents with higher positions in the organisation was 1.9, whereas respondents lower in the organisation used an average of 3.7 levels. This result is counter-intuitive, since one would expect the more senior staff to have thought through more implications than the less senior staff. However, what was happening with the responses was that often the more senior staff were proceeding directly to the implications for the organisation, whereas the less senior staff were first proceeding to the implications for them personally, then moving to the implications for the organisation, and then proceeding to further implications for them personally.
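The level-counting analysis can be sketched as follows. The individual figures below are invented for illustration; only the contrast between group averages reflects the kind of result reported in the study.

```python
# Invented laddering data: number of levels produced by each respondent,
# grouped by organisational position.
ladder_levels = {
    "senior": [2, 2, 1, 2],
    "junior": [4, 3, 4, 4],
}

def mean_levels(group):
    """Average number of laddering levels produced by a group of respondents."""
    levels = ladder_levels[group]
    return sum(levels) / len(levels)

print(mean_levels("senior"))  # 1.75
print(mean_levels("junior"))  # 3.75
```

Averaging the depth of each respondent’s ladder gives a simple quantitative measure that can be compared across positions in the organisation, alongside the qualitative content of the ladders themselves.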

5.5 Case studies: summary

The case studies are simply case studies; wholesale testing of the framework will be a much larger operation. However, it is significant that in all cases, important issues were missed by previous work, and emerged only when different questioning techniques were introduced to the field, as predicted by the framework. It is also interesting that taken for granted knowledge, missed by interviews, featured prominently in the first two cases.

The problem of missing knowledge cannot be simply resolved by using observation in addition to whichever other technique the elicitor happens to favour; although observation happened to be an appropriate method in two of these case studies, there are other situations where it is impractical or impossible. An example of this occurred when one of the authors was investigating staff and student perceptions of what constituted a good dissertation. Staff and students agreed that presentation was an important factor, but elucidation via laddering of what was meant by “good presentation” showed that students interpreted “good presentation” quite differently from staff. The systematic nature of laddering made it possible to uncover the different interpretations in a way which would not have been practical via observation (which would have required an enormous range of examples of dissertations, and even then could not have guaranteed to identify all the rare features). It also improved the chances of identifying that there were different interpretations of the same term: because laddering usually proceeds down until a term has bottomed out, it elicits a fairly full description of how a term is being used. Interviews can do this, but it is not an inherent feature of interviews per se, and the degree of elucidation is normally decided by the preferences of the interviewer rather than any systematic principle. Selection and integration of techniques is clearly a critical factor in eliciting valid, reliable information, and needs to be considered carefully.



It should be clear from the examples above that questioning methodology spans a wide range
of areas, and that a considerable amount of work remains to be done. The following sections
discuss the main issues involved.

6.1 Questioning methodology

It is clear that choice of questioning technique is something which draws on a wide body of findings from numerous disciplines, and which is not trivially simple. It would therefore make sense to treat questioning methodology as a field in its own right, analogous to, and complementary to, statistics and survey methods. The commonality of methodological and theoretical issues in questioning across disciplines is sufficient to make cross-fertilisation both possible and desirable.

It is also clear that no single technique is adequate for handling the full range of knowledge types likely to be encountered, and that elicitors should expect to use more than one technique. Choice of the appropriate technique is an important issue, and there is a need for a guiding framework which is empirically validated and which is theoretically grounded in knowledge types and information filters, rather than a hopeful collection of ad hoc rules of thumb.

6.2 Validation

The framework described here is built of components which are individually validated, but this does not mean that the way in which they have been assembled is necessarily valid. An important piece of future work is validation of the framework, so that errors can be identified and corrected. The authors are currently working on this, via a combination of case studies and formal experiments. Initial results from case studies are consistent with the predictions of the framework, particularly in the case of semi-tacit knowledge; the role of taken for granted knowledge has been particularly striking.

6.3 Training needs

A practical point arising from the discussion above is that if the framework is validated by further work, then elicitors will need to be trained in relevant techniques before undertaking questioning work. This has serious implications in terms of training needs, and it is likely that questioning courses, analogous to statistics courses, would need to be set up as a routine part of academic and industrial infrastructure. The authors’ experience is that such a course is feasible, especially if the initial emphasis is on providing an overview of the framework and the main issues, with more detailed coverage of the particular techniques needed for an individual project.

6.4 Cross-discipline work

One of the reasons that a framework has taken so long to emerge is almost certainly that the relevant knowledge was scattered across so many different disciplines that most researchers would never see enough to obtain an overview. The brief descriptions above of the various techniques and concepts do not do justice to the wealth of knowledge which has been built up in the different disciplines, and there is much to be gained from exchange of knowledge across disciplines.

The need is not only for exchange of information, but also for comparisons across disciplines, domains and cultures, to assess the degree of commonality and of difference which exists across them. Repertory grid technique, for instance, makes clear and explicit assumptions about some aspects of human cognition; it would be interesting, and very much in the tradition of PCT, to see whether these hold true across different cultures.

6.5 Future work

The most immediate and obvious need for further work involves empirical validation of the framework described above. Although the framework is composed of established parts, this does not guarantee that the way in which they have been fitted together is correct. Validation of this sort requires large data sets; initial results from case studies indicate that the framework is sound, and provides useful guidance. One striking effect in the case studies has been the prominence of taken for granted knowledge as a source of missed requirements when only interviews are used. Another interesting feature is the frequent use of tacit knowledge by experts, often as a sub-component of a wider skill or task (described in more detail below).

An area which is attracting increasing attention is elicitation of information across cultures: for instance, when a high-cost product such as a building or an aircraft is being designed and built for a client from another culture. Even within a single country, different organisations can have quite different corporate cultures, with different implications for the suitability of a particular product to their context: this has been the subject of considerable work, much of it by researchers following the sociotechnical approach pioneered by groups such as the Tavistock Institute. The framework described above provides a structured and systematic way of approaching such problems, but does not in itself guarantee that the appropriate techniques exist to solve them.

One promising approach to such problems is the use of laddering. Laddering allows the elicitor to break down terms used by the respondent into progressively more specific components, until the explanation “bottoms out” at a level which cannot be explained further. The components at this level may be of several types. One type is externally observable attributes, such as size and colour; the other main type involves pattern matching in the broadest sense (shape, texture and sound), which may in turn be either public domain pattern matching (i.e. using patterns known to the lay public) or expert pattern matching, which is often associated with implicit learning and compiled skills. These components can then be compared across respondents.
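This bottoming-out can be made concrete with a simple representation. The sketch below is our own (the domain content and type labels are invented for illustration, not the authors’ notation): a ladder is held as a nested structure whose leaves are tagged with the component types just described.

```python
# Invented ladder for an invented term; leaves are (component type, component)
# pairs where the explanation bottoms out.
ladder = {
    "good quality print": {
        "legible": {
            "large enough text": ("observable", "size"),
            "sharp edges": ("expert_pattern", "print sharpness"),
        },
        "right colour": ("public_pattern", "colour match"),
    }
}

def leaves(node):
    """Collect the (type, component) pairs at which a ladder bottoms out."""
    if isinstance(node, tuple):
        return [node]
    found = []
    for child in node.values():
        found.extend(leaves(child))
    return found

for kind, component in leaves(ladder):
    print(kind, component)
```

Collecting the leaves of each respondent’s ladder in this way yields the sets of bottom-level components, tagged as observable attributes, public domain pattern matching or expert pattern matching, that can then be compared across respondents.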

A similar approach can be used to tackle issues such as script theory and chunking, where different respondents have different ways of grouping items together into higher-level structures. These structures differ between disciplines and professions. The usual example of a script (Schank and Abelson, 1977) is a series of actions linked in a predictable way, each of which may in turn be composed of several sub-actions; eating at a restaurant, for instance, usually consists of several main actions, such as “booking a table”, “hanging up coats” and “ordering”. “Chunking” is a similar concept involving the grouping together of individual items into a higher-level group.

One of the major differences between experts and novices is the extent to which series of actions are scripted or chunked up. In field archaeology, for instance, the script of “drawing a section” (i.e. a cross-section through an archaeological feature being excavated) is composed of a large number of sub-tasks, such as establishing a reference height relative to the site temporary bench mark, with each of these in turn being composed of other lower-level tasks such as setting up the surveying equipment. It is possible to elicit these scripts and chunks using laddering, and then to use graph-theoretic approaches to show the number and nature of them, thus combining qualitative and quantitative analysis.
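The graph-theoretic idea can be sketched as follows. This is our own illustration: the archaeology example is paraphrased from the text, but the particular sub-tasks are invented. An elicited script is held as a tree of sub-tasks, and simple measures such as node count and depth can then be compared across respondents.

```python
# An elicited script as a tree of sub-tasks (sub-task names are invented).
script = {
    "draw a section": {
        "establish reference height": {
            "set up surveying equipment": {},
            "sight the temporary bench mark": {},
        },
        "set up drawing frame": {},
        "measure and plot points": {},
    }
}

def node_count(tree):
    """Total number of tasks and sub-tasks in the script."""
    return sum(1 + node_count(sub) for sub in tree.values()) if tree else 0

def depth(tree):
    """Number of levels in the script, from top-level action to lowest sub-task."""
    return 1 + max((depth(sub) for sub in tree.values()), default=0) if tree else 0

print(node_count(script))  # 6 tasks and sub-tasks
print(depth(script))       # 3 levels
```

An expert’s script would be expected to show more nodes and deeper chunking than a novice’s for the same task, giving a quantitative measure to set alongside the qualitative content of the elicited scripts.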

Interestingly, the concept of script theory can be applied to the concept of agenda setting in discourse analysis, as an example of implicit or explicit debate about the script to be used by participants in the discourse. It should in principle be possible to use the same laddering-based approach as described above to investigate the nature and number of scripts available for a particular situation, and to approach the area of social interaction from a social cognition perspective.

6.6 Conclusion

In the past, questioning methodology was a Cinderella discipline compared to the elegant sisters of statistics and survey methods. It is now clear, however, that questioning methodology is as important and as rich a discipline as its sisters. The next steps are the traditional ones for a newly emerging discipline: the bringing together of knowledge from parent disciplines, the establishment of new research agendas and approaches, and the setting up of the infrastructure to support this, in such forms as workshops, textbooks, conferences and journals. It will be interesting to see what emerges from the work ahead; whether any previously intractable problems turn out to be tractable after all, and what new challenges appear to take their place. Traditionally, living in interesting times was treated as a curse, but in academia, living in interesting times is what every researcher hopes for. This certainly appears to be the most likely future for researchers in questioning methodology.



References

Anderson, J.R. The Adaptive Character of Thought. Erlbaum, Hillsdale, N.J., 1990

Andrews, S. An assessment of end user attitudes and motivation towards new technologies in the workplace and the behaviours arising from them. Unpublished undergraduate thesis, University College Northampton, 1999

Ayton, P. Pers. com. April 1998

Baddeley, A.D. Human memory: Theory and practice. Lawrence Erlbaum Associates, Hove, 1990

Bannister, D. and Fransella, F. Inquiring man. Penguin, Harmondsworth, 1980

Bartlett, F.C. Remembering: A study in experimental and social psychology. Cambridge University Press, Cambridge, 1932

Belkin, N.J., Oddy, R.N. and Brooks, H.M. ASK for Information Retrieval: Part I, Background and Theory. Journal of Documentation, pp. 61-71, June 1982

Blandford, A. and Rugg, G. Integration of Programmable User Model Approaches with Requirements Acquisition: A case study. In preparation

Boose, J.H., Shema, D.B. and Bradshaw, J.M. Recent progress in AQUINAS: A knowledge acquisition workbench. Knowledge Acquisition, 1, pp. 185-214, 1989

Chi, M.T.H., Glaser, R. and Farr, M.J. (eds.) The Nature of Expertise. Lawrence Erlbaum Associates, London, 1988

Cortazzi, D. and Roote, S. Illuminative Incident Analysis. McGraw-Hill, London, 1975

Denzin, N.K. and Lincoln, Y.S. (eds.) Handbook of Qualitative Research. Sage, London, 1994

Ellis, C. (ed.) Expert Knowledge and Explanation: The Knowledge-Language Interface. Ellis Horwood, Chichester, 1989

Eysenck, M.W. and Keane, M.T. Cognitive Psychology. Psychology Press, Hove, 1995

Fransella, F. and Bannister, D. A manual for repertory grid technique. Academic Press, London, 1977

Gerrard, S. The working wardrobe: perceptions of women’s clothing at work. Unpublished Master’s thesis, London University, 1995

Gigerenzer, G. Why the distinction between single event probabilities and frequencies is important for psychology (and vice versa). In Wright, G. and Ayton, P. (eds.) Subjective Probability. John Wiley and Sons, Chichester, 1994

Glaser, B.G. and Strauss, A.L. The Discovery of Grounded Theory. Aldine, New York, 1967

Goffman, E. The Presentation of Self in Everyday Life. Doubleday, New York, 1959

Grice, H.P. Logic and Conversation. In Cole, P. and Morgan, J.L. (eds.) Syntax and Semantics 3. Academic Press, New York, 1975

Hinkle, D. The change of personal constructs from the viewpoint of a theory of construct implications. Unpublished PhD thesis, Ohio State University, 1965. Cited in Bannister, D. and Fransella, F. Inquiring man. Penguin, Harmondsworth, 1980

Honikman, B. Construct Theory as an Approach to Architectural and Environmental Design. In Slater, P. (ed.) The Measurement of Interpersonal Space by Grid Technique: Volume 2: Dimensions of Interpersonal Space. John Wiley and Sons, London, 1977

Jarke, M., Pohl, K., Jacobs, S., Bubenko, J., Assenova, P., Holm, P., Wangler, P., Rolland, C., Plihon, V., Schmitt, J., Sutcliffe, A.G., Jones, S., Maiden, N.A.M., Till, D., Vassilou, Y., Constantopoulos, P. and Spandoudakis, G. Requirements Engineering: An Integrated View of Representation. In Sommerville, I. and Manfred, P. (eds.) Proceedings 4th European Software Engineering Conference, Garmisch-Partenkirchen, 1993. Springer-Verlag, Lecture Notes in Computer Science, pp. 100 ff.

Kahneman, D., Slovic, P. and Tversky, A. (eds.) Judgement under Uncertainty: Heuristics and Biases. Cambridge University Press, Cambridge, 1982

Kelly, G.A. The Psychology of Personal Constructs. W.W. Norton, New York, 1955

Loftus, E.F. and Palmer, J.C. Reconstruction of automobile destruction: An example of the interaction between language and memory. Journal of Verbal Learning and Verbal Behaviour, 13, pp. 585-589, 1974

Maiden, N.A.M. and Rugg, G. ACRE: a framework for acquisition of requirements. Software Engineering Journal, pp. 183-192, 1996

Mead, M. Coming of Age in Samoa. William Morrow, New York, 1928

Michalski, R.S. and Chilausky, R.L. Learning by being told and learning from examples: an experimental comparison of the two methods of knowledge acquisition in the context of developing an expert system for soybean disease. International Journal of Policy Analysis and Information Systems, pp. 125-161, 1980

Miller, G.A. The magic number seven, plus or minus two: Some limits on our capacity for processing information. Psychological Review, 63, pp. 81-97, 1956

Myers, D.G. Social Psychology (3rd edition). McGraw Hill, New York, 1990

Neves, D.M. and Anderson, J.R. Knowledge compilation: mechanisms for the automatization of cognitive skills. In Anderson, J.R. (ed.) Cognitive Skills and their Acquisition. Erlbaum, Hillsdale, N.J., 1981

Norman, D. The design of everyday things. Doubleday/Currency, New York, 1990

Patrick, J. A Glasgow Gang Observed. Eyre Methuen, London, 1973

Reichgelt, H. Knowledge Representation. Ablex Publishing Corp., Norwood, N.J., 1991

Reynolds, T.J. and Gutman, J. Laddering Theory, Method, Analysis, and Interpretation. Journal of Advertising Research, February-March 1988, pp. 11-31

Rosch, E. Prototype Classification and Logical Classification: the Two Systems. In Scholnick, E.K. (ed.) New Trends in Conceptual Representation: Challenges to Piaget's Theory. Lawrence Erlbaum Associates, Hillsdale, N.J., 1983

Rosenhan, D.L. On Being Sane in Insane Places. Science, 179, pp. 250-258, January 19, 1973


Rugg, G. and McGeorge, P. Laddering. Expert Systems, 12(4), pp. 339-346, 1995

Rugg, G. and McGeorge, P. The sorting techniques: a tutorial paper on card sorts, picture sorts and item sorts. Expert Systems, 14(2), 1997

Schank, R.C. and Abelson, R.P. Scripts, plans, goals and understanding. Lawrence Erlbaum Associates, Hillsdale, N.J., 1977

Seger, C.A. Implicit learning. Psychological Bulletin, 115(2), pp. 163-196, 1994

Shaw, M.L.G. Recent Advances in Personal Construct Theory. Academic Press, London, 1980

Shaw, M.L.G. and Gaines, B.R. A methodology for recognising consensus, correspondence, conflict and contrast in a knowledge acquisition system. Proceedings of the Workshop on Knowledge Acquisition for Knowledge-Based Systems, Banff, Canada, Nov 7-11, 1988

Sommerville, I., Rodden, T., Sawyer, P., Bentley, R. and Twidale, M. Integrating Ethnography into the Requirements Engineering Process. Proceedings of the IEEE Symposium on Requirements Engineering, IEEE Computer Society Press, pp. 165-173, 1993

Vickery, B.C. Faceted Classification: A Guide to the Construction and Use of Special Schemes. Aslib, London, 1960

Weber, M. Legitimate Authority and Bureaucracy (1924). In Pugh, D.S. (ed.) Organisation theory: selected readings (third edition). Penguin, London, 1990

Wicker, A.W. Attitudes versus actions: The relationship of verbal and overt behavioral responses to attitude objects. Journal of Social Issues, 25(4), pp. 41-78, 1969

Zadeh, L. Fuzzy sets. Information and Control, 8, pp. 338-353, 1965