The interplay of root, suffix and whole-word frequency in processing derived words





Cristina Burani and Anna M. Thornton



In three lexical decision experiments we investigated whether the relative
frequency of root, derivational suffix and whole word affects processing of
Italian printed derived stimuli. Experiment 1 considered pseudowords made up
of pseudoroots combined with either high-, medium-, or low-frequency
suffixes. Only pseudowords with high-frequency suffixes resulted in
increased decision times and higher error rates relative to nonsuffixed
pseudowords. Experiments 2 and 3 dealt with suffixed derived words. In
Experiment 2, low-frequency words with high- or low-frequency roots and with
high- or low-frequency suffixes were orthogonally contrasted. Lexical
decision latencies were a function of the frequency of both the root and the
suffix. However, post-hoc comparisons showed an effect of whole-word
familiarity. In Experiment 3, low-frequency derived words with orthogonal
variation of root and suffix frequency, and equal whole-word familiarity,
were investigated, and were contrasted with low-frequency nonderived words.
Words with high-frequency roots showed quicker and more accurate lexical
decision responses, irrespective of suffix frequency. By contrast, words
with low-frequency roots, irrespective of suffix frequency, did not differ
from nonderived words. These results are interpreted within Schreuder and
Baayen’s (1995) parallel dual-route model for morphological processing, as
evidence for both benefits and costs of morphemic access, due to the
balancing of the quantitative characteristics of root, suffix and whole
word.



1. Introduction


In most models of morphological processing, it is assumed that the
probability of accessing morphological constituents of words is conditioned,
at the different processing stages, by many properties of the
morphologically complex words. Among these properties, the frequency of
morphological constituents relative to the frequency of the complex word as
a whole form can play a major role.

Evidence for reliance on morphological structure in accessing printed
complex words comes from low-frequency words which include higher frequency
constituents (see, e.g., Andrews 1986; Burani and Caramazza 1987; Meunier
and Segui 1999). In parallel dual-route models of lexical access (see, e.g.,
Burani and Laudanna 1992; Chialant and Caramazza 1995; Frauenfelder and
Schreuder 1992; Schreuder and Baayen 1995), words composed of more than one
morpheme may activate in parallel two types of access units, namely units
corresponding to the whole word and units corresponding to the morphemes
included in the stimulus. In these models, the relative frequency of the
whole word and of the constituent morphemes affects the relative time-course
of activation of the different units in the different components. Hence,
frequency is the major determinant of the relative probability that lexical
access is either whole-word-based or morpheme-based.

The assumption underlying these models is that the higher the frequency of a
given lexical unit, be it a word, a root or an affix, the greater the
likelihood that this unit is quickly activated and processed in the
different processing components. What is crucial, in determining the
probability that lexical access is provided by either whole-word or morpheme
processing, is the complex balance existing between the frequency of the
whole word and the frequency of its constituent morphemes, both roots and
affixes, i.e., it is relative frequency, rather than absolute frequency (for
a similar proposal, see Hay 2000; 2001).

Hence, it might be predicted that a transparent derived word which has low
frequency in the language, like Italian bassezza (‘lowness’), but is
composed of a very frequent root (i.e., bass-, ‘low’) and a very frequent
suffix (i.e., -ezza, ‘-ness’) is likely to be accessed via activation of its
morphemic constituents, rather than via the unit corresponding to the whole
word, which is supposed to be very scarcely activated. This prediction
implies that the frequencies of both the root and the suffix are capable of
affecting processing, and calls for evidence concerning the roles of both
root and suffix frequency. However, and surprisingly, only the frequency of
roots has been considered so far, while the frequency of affixes has usually
been neglected. No study on lexical access to derived words has
systematically varied the quantitative values of derivational affixes. By
contrast, these values have been investigated in the context of pseudoword
processing (see below).



1.1. Studies on words


Several studies conducted in different languages, including English,
Italian, French and Dutch, have shown that access times and accuracy to
suffixed derived words are significantly affected by root frequency (see,
e.g., Beauvillain 1996; Bradley 1979; Burani and Caramazza 1987; Colé,
Beauvillain and Segui 1989; Holmes and O’Regan 1992; Schreuder, Burani, and
Baayen 2002). Lexical decisions were faster and more accurate when a
suffixed word, usually of low frequency, included a root of high frequency.
The facilitatory effect of high-frequency root morphemes was found both when
calculation of root frequency included the frequency of the base word and
its inflected forms only (Burani and Caramazza 1987), and when it was
extended to include the frequencies of all the derived word-forms sharing
the same root (Colé, Beauvillain and Segui 1989).

The root frequency effect has been found in the context of suffixed derived
words that were both orthographically and phonologically transparent with
respect to their base root, and were usually transparent for meaning with
respect to the meaning of their base. However, there has been evidence of
root frequency effects also for derived words that included bound roots, or
could be rated as semantically opaque with respect to their base (see, e.g.,
Holmes and O’Regan 1992; Schreuder, Burani and Baayen 2002). In the studies
in which root frequency effects have been found, suffixes were usually
productive and had high frequency. Suffix frequency was not directly
investigated per se, but it was usually kept constant across categories by
including the same suffixes in the high-frequency and low-frequency root
sets.



1.2. Studies on pseudowords


An investigation of suffix frequency per se was recently made, by adopting
pseudoword contexts made up of illegal root-suffix combinations. Burani et
al. (1997) submitted, to both lexical decision and naming, pseudowords that
were made up of real roots combined with derivational suffixes not
compatible with the root. In order to demonstrate that the probability of
access through activation of morphemic units corresponding to suffixes is
constrained by their frequency values (see also Laudanna and Burani 1995),
Burani et al. (1997) made use of suffixes belonging to two distinct
frequency ranges. In one experimental set, roots were combined with
high-frequency suffixes, and the resulting pseudowords were contrasted with
pseudowords in which the same roots were combined with control sequences
that had analogous orthographic frequency in final position of Italian
words, but were not suffixes. In the second set, a comparison was made
between pseudowords composed of roots plus low-frequency suffixes, and the
same roots combined with control low-frequency orthographic final sequences.
Roots were in both cases of medium frequency. In order to control for
asymmetries in the possibility to assign meaning to suffixed pseudowords of
the two kinds (i.e., with high- and low-frequency suffixes, respectively),
suffixed pseudowords in the two frequency sets were matched for mean
interpretability values derived from participants’ empirical ratings.

The lexical decision results by Burani et al. (1997) showed that the
interference effect which is usually found on pseudowords that include real
affixes (see, e.g., Caramazza, Laudanna and Romani 1988; Taft and Forster
1975; see also, for derivational suffixes, Jarvella and Wennstedt 1993) is
conditioned by the frequency of the embedded suffixes: Longer reaction times
and higher error rates were found, with respect to control pseudowords, only
when pseudowords included high-frequency suffixes. By contrast, pseudowords
with low-frequency suffixes took no longer to be rejected than control
pseudowords. From these results on suffixed pseudowords, Burani et al.
(1997) concluded that the probability that suffixes will affect processing
is conditioned by their frequency (see also Laine 1996, for Finnish
productive derivational suffixes causing interference effects on pseudoword
lexical decision).



1.3. Suffix frequency, suffix numerosity, and productivity


In considering the frequency of suffixes, two main quantitative measures can
be adopted. On the one hand, frequency in the proper sense is calculated on
word tokens, by summing up the cumulative frequency in a given corpus of all
the word tokens in which a given suffix occurs. On the other hand, suffix
frequency can be measured by calculating the number of word types in which a
given suffix occurs in a given language. This second measure could be named
numerosity of the suffix (as proposed by Burani et al. 1995).
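
The difference between the two measures can be illustrated with a minimal
sketch. It assumes a toy list of word tokens that have already been
annotated with their derivational suffix (or with None for nonderived
words); the annotation format, the function name suffix_measures and the toy
tokens are illustrative assumptions, not the corpus or the counting
procedure used in the studies cited.

```python
from collections import Counter

def suffix_measures(annotated_tokens, suffix):
    """Return (token_frequency, numerosity) for one suffix.

    annotated_tokens: iterable of (word, suffix_or_None) pairs, one per
    corpus token. This annotation format is a hypothetical convenience.
    """
    # Count how many times each word carrying the suffix occurs.
    word_counts = Counter(word for word, sfx in annotated_tokens if sfx == suffix)
    token_frequency = sum(word_counts.values())  # summed over word tokens
    numerosity = len(word_counts)                # number of distinct word types
    return token_frequency, numerosity

# Toy data: "pezza" ends in the letters "ezza" but is not suffixed,
# so it is annotated with None and does not enter either count.
toy_corpus = [
    ("bellezza", "-ezza"), ("bellezza", "-ezza"), ("bassezza", "-ezza"),
    ("tristezza", "-ezza"), ("pezza", None), ("casa", None),
]

token_frequency, numerosity = suffix_measures(toy_corpus, "-ezza")
print(token_frequency, numerosity)
```

In this toy example the suffix occurs in four tokens but only three types,
which shows how the two measures can diverge even though they tend to be
correlated in real corpora.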

There could be reasons for considering numerosity (i.e., suffix
type-frequency) as a better quantitative characterization for suffixes and a
stronger predictor of performance in access tasks. Suffix numerosity is
closely related to suffix productivity, thus allowing the suffix to “emerge”
as a separate processing unit (see, e.g., Baayen 1989; 1992; Bybee 1995a).
However, there is a strong link and a complex interplay among suffix type
and token frequency, productivity and probability of morphemic parsing (Hay
and Baayen 2002). In the study by Burani et al. (1997), suffix numerosity
and suffix frequency were not disentangled because, after inspection of
frequency distributions, it was found that suffix token-frequency and suffix
numerosity tended to be highly correlated. Suffix frequency was used in a
broad sense to subsume the two quantitative measures that could affect
processing. Consequently, suffixes were either high or low on both
dimensions, frequency and numerosity, calculated on a corpus of Italian
written language (Istituto di Linguistica Computazionale CNR 1989).



1.4. Other properties of suffixes relevant for word processing


Some recent research has investigated the role of properties of derivational
suffixes in lexical access to words (see Bertram, Laine and Karvinen 1999,
for Finnish; Bertram, Schreuder and Baayen 2000, for Dutch). For both
Finnish and Dutch, properties like suffix productivity and suffix homonymy
(i.e., suffix ambiguity in serving more than one semantic function) were
found to affect processing, with words including productive and
non-homonymous suffixes being more likely to induce morpheme-based
processing (see also, for Japanese, Hagiwara et al. 1999).

In the studies by Bertram, Laine and Karvinen (1999) and Bertram, Schreuder
and Baayen (2000), no information was given on the frequency values of the
productive vs. unproductive suffixes, and the issue of suffix
frequency/numerosity was not assessed directly. A variation in productivity
usually corresponds to a variation in frequency/numerosity. However,
although closely related to suffix productivity, suffix frequency and
numerosity do not necessarily correspond to productivity. There may be
differences in suffix numerosity that do not correspond to differences in
suffix productivity. At the same time, it is not always the case that
differences in productivity correspond to differences in suffix frequency or
numerosity.² Thus there are reasons for investigating suffix
frequency/numerosity in derived words, without identifying these
quantitative measures with productivity.



2. The present study


While there is evidence that root frequency affects access to printed
suffixed derived words, evidence for a role of quantitative properties of
suffixes in visual processing comes almost exclusively from pseudowords. The
present study aimed at assessing simultaneously the roles of both root and
suffix frequency in Italian derived words, by testing different combinations
of roots and suffixes with differing frequency. If the role of quantitative
properties of morphemes has to be assessed per se, derived words with
morphemic constituents of different frequencies should be matched for a
number of factors, including orthographic/phonological transparency,
semantic transparency, and whole-word frequency.

Italian derivation occurs mostly through agglutination of suffixes to roots
which are not occurring words themselves (see Peperkamp 1995).
Orthographic/phonological transparency of derived forms with respect to
their roots is quite common, widespread across different frequency ranges of
both words and morphemes, and can be easily controlled for. Semantic
transparency can also be kept under control, while varying suffix
frequency/numerosity. Suffix numerosity, i.e., the number of word types in
which a given affix occurs, is one determinant of semantic transparency, but
it does not coincide with it (see Bybee 1995a). Moreover, in intramodal
tasks there might be reasons for expecting effects of morphological
constituency also in derived words that are less transparent for meaning or
in semantically opaque words (see, e.g., Bentin and Feldman 1990; Feldman
and Soltano 1999; Plaut and Gonnerman 2000; Schreuder, Burani and Baayen
2002; Stolz and Feldman 1995; Vannest and Boland 1999; but see also, for
contrasting evidence in cross-modal tasks, Marslen-Wilson et al. 1994).

Given the basic prediction that low-frequency words with two high-frequency
constituents should be the best candidates for access through morphemes,
which predictions could be made for low-frequency words that include only
one high-frequency constituent (either the root or the affix)? Would lexical
access be equally sensitive to the higher frequency constituent? Would it be
differentially sensitive to the frequency of the root and the affix,
respectively? In some studies the assumption has been made that root
frequency effects should manifest themselves in low-frequency derived words
with frequent and productive suffixes. To our knowledge, there have been no
investigations of whether root frequency effects would show up in the
context of low-frequency suffixes. Similarly, no study has investigated
whether high-frequency suffixes may affect the probability of morpheme-based
access in the context of low-frequency roots. In sum, no study has addressed
the issue of whether low-frequency derived words that include one or two
low-frequency constituents are likely to activate morphemic units at all.

Our predictions were developed in the framework of the model proposed first
by Schreuder and Baayen (1995) (see, for recent updates, Baayen and
Schreuder 1999; 2000; Baayen, Schreuder and Sproat 2000). This is a race
model for the recognition of morphologically complex words in which there
are two parallel access routes, one based on whole-form information, and the
other based on morphemic decomposition. One assumption of the model is that,
for the visual modality, the complete input is available from the start.
Thus in principle, for low-frequency derived words, both root and suffix
frequency should affect processing at the stages in which morphemic access
representations are activated over time by the sensory input.

In the framework of this race model, it is crucial to assess the complex
balance between access through storage (whole-word activation) and access
through computation (morphemic activation). An open issue is how processing
proceeds for low-frequency words which include low-frequency constituent
morphemes. This implies assessing the relation existing, in terms of
processing costs and benefits, between components in which morphemic units
are segmented and activated, and subsequent components in which they are
re-combined in order to derive meaning.

So far, the probability of faster access through whole-word activation has
been suggested for high-frequency derived words that tend to be highly
lexicalized (Baayen and Neijt 1997; Burani and Laudanna 1992; Bybee 1995b;
Chialant and Caramazza 1995; Frauenfelder and Schreuder 1992; Schreuder and
Baayen 1995).³ Additionally, whole-word storage has been proposed for
English derived words which include non-neutral affixes and whose stems tend
to cluster around recurring patterns thus constituting “gangs” (Alegre and
Gordon 1999b), or for derived words which include unproductive suffixes
(Bertram, Laine and Karvinen 1999; Bertram, Schreuder and Baayen 2000;
Hagiwara et al. 1999).

However, the possibility should be considered that low-frequency derived
words which include low-frequency morphemic constituents, even if
phonologically transparent, are also more likely to be accessed as whole
forms through direct whole-word access. For these words, the reduced
probability of access through morphemic decomposition would derive from the
fact that the slight difference between the frequency of morphemes and
whole-word frequency is not large enough for morphological processing to
result in benefits relative to whole-word based lexical access.

The following visual lexical decision experiments addressed the latter
issues by combining evidence from pseudoword and word processing. Experiment
1 was conducted on pseudowords, whereas both Experiments 2 and 3 involved
words. Experiment 1 on pseudowords aimed at replicating and extending
results on the role of suffix quantitative properties in nonlexical
contexts. It addressed issues that should help in interpreting results from
the two following experiments on words. By including suffixes in pseudoword
contexts in which the initial orthographic sequence did not correspond to an
existing root, the role of suffix frequency/numerosity could be
differentiated from its consequences on the semantic transparency or
interpretability of a newly derived form with respect to its base. At the
same time, by investigating stimulus contexts in which the lexical morphemic
unit (the suffix) occurred in the rightmost part of the stimulus in the
absence of a morphemic unit on its left side, we aimed at providing evidence
for a role of morphemic units which is independent of sequential
left-to-right processing. Experiments 2 and 3 addressed the issue of the
interplay of root, suffix and whole-word frequency in access to derived
words, by orthogonally varying high- and low-frequency roots and suffixes in
low-frequency transparent derived Italian words. In order to specify the
balance between the processing routes based on whole-word and morphemic
units, respectively, the derived words were contrasted with nonderived words
of analogously low frequency (Experiment 3).



3. Experiment 1


In Experiment 1, we aimed at replicating, with three sets of suffixes, the
effect of suffix frequency in lexical decision to pseudowords found by
Burani et al. (1997). In that study, suffixes were combined with real roots
to form pseudowords. In the present experiment, the pseudowords were
obtained by combining suffixes of varying frequencies with orthographic
sequences that did not correspond to real roots. If suffix frequency effects
were to occur in lexical decision to pseudowords of this sort, strong
evidence for the role of suffixes in visual processing would be provided. If
activation of the morphemic units comprising a stimulus occurs irrespective
of their sequential positions within the stimulus, provided they are
frequent enough, the prediction could be made that high-frequency suffixes
are activated and play some role also when affixed after nonroots.

The expected result has twofold implications. On the one hand, if
high-frequency suffixes delay lexical decisions to pseudowords even when
they occur in stimuli that do not contain real roots, it could be concluded
that the effects of frequency/numerosity of suffixes occur at a processing
stage in which affix morphemes are available independently of their semantic
content. When suffixes are combined with nonroots, no interpretability of
the combination should be expected, because of the absence of a meaningful
component in first position (for the effects of interpretability of new
root-suffix combinations in lexical access, see Burani et al. 1999; see
also, for the effects of interpretability on novel Dutch compounds, Coolen,
van Jaarsveldt and Schreuder 1991; van Jaarsveldt, Coolen and Schreuder
1994).

On the other hand, if we were to show a frequency effect induced by a
morphemic unit located in the rightmost position of the stimulus, within an
orthographic context in which no lexical or morphemic unit occurs in left
position, we would challenge a sequential search model, which predicts that
the frequency of the second constituent should not affect lexical processing
(Taft and Forster 1976). Evidence against this sort of model has been
provided by studies which used both compound nonwords (Lima and Pollatsek
1983), and real compound words (Andrews 1986; Andrews and Davies 1999;
Pollatsek, Hyönä and Bertram 2000). The main finding of these studies, which
mainly employed lexical decision, but also the recording of eye movements in
sentence reading (Pollatsek, Hyönä and Bertram 2000), was an effect of the
lexical status or of the frequency of the second constituent. Evidence for a
frequency effect of the second constituent when this is a suffix is still
lacking. However, even in a theoretical framework which incorporates
principles of interactive activation (Taft 1994), it is still assumed that,
while inflectional endings would be stripped off in word processing,
derivational suffixes would not, because of their different role in
processing. Within the latter framework, all the pseudowords that are tested
in our experiment, provided they are equated in their leftmost nonlexical
part on purely orthographic grounds, should be rejected equally fast,
irrespective of the presence of a suffix on their rightmost side. In
contrast with these predictions, if interference effects on nonword lexical
decision do arise when the stimulus includes a high-frequency suffix in
combination with a nonroot orthographic sequence, a model in which the
processing system activates frequent morphemic units, irrespective of their
relative locations within the word, would be supported.

In Experiment 1, three sets of Italian derivational suffixes were selected.
The three sets, matched for length in letters and phonemes, and for
orthographic/phonological structure, differed only in frequency, calculated
both on word tokens and on word types. Suffixes in the three sets could be
considered of high, medium and low frequency, respectively, by considering
the overall distribution of frequency values of Italian suffixes of the same
length. The main prediction was that high-frequency suffixes should cause
more interference on nonword decision when included in pseudoword contexts,
relative to low-frequency suffixes. Suffixes of medium frequency might
either not constitute sufficiently activated processing units for
interference to occur, or they might show interference effects of a smaller
size than high-frequency suffixes.



3.1. Method


3.1.1. Materials and design


Nine suffixes were selected, equally subdivided in three experimental sets,
of high, medium and low frequency, respectively. Frequencies, in this
experiment and in all the following experiments, were derived from a corpus
of Italian written language of 1.5 million tokens (Istituto di Linguistica
Computazionale CNR 1989). The mean suffix frequencies, calculated on word
tokens, were 1,060 per 1.5 million (range: 639-1,557), 68.3 (range: 55-90),
and 12.3 (range: 7-18), for the three sets, respectively. Differences in
mean suffix numerosity between the three sets, calculated on word types in
the corpus, paralleled differences in frequency. The mean numbers of word
types in which suffixes occurred were 165 (range: 145-187), 20.6 (range:
11-39), and 6.3 (range: 2-10) for the three sets, respectively.

There were both nominal and adjectival suffixes. No suffix was homonymous
with another Italian suffix. All suffixes were four letters long and were
matched across sets for length in phonemes and for syllabic structure.

For each suffix, a control sequence with similar orthographic and syllabic
structure was selected. The sequences corresponding to a suffix and the
control sequences were matched for orthographic frequency in word final
position in each set. The mean frequencies of control sequences in word
final position were: 1,538 per 1.5 million for the first set; 472 for the
second set; 64 for the third set. These values were matched to the mean
frequencies of the orthographic strings corresponding to the selected
suffixes, calculated in word final position and including both real suffixes
and pseudosuffixes: 1,335 per 1.5 million for the high-frequency set; 411
for the medium-frequency set, and 62 for the low-frequency set,
respectively.

The suffixes were combined with orthographically legal letter sequences that
did not correspond to any existing root. Each suffix was combined with four
different pseudoroots, for a total of twelve pseudowords in each set. The
length of pseudowords fell within the length range of Italian words
including the same suffix, and respected as much as possible the
distribution of word length for each suffix in the Italian language
(calculations were based on Ratti et al. 1988). Mean lengths in letters of
the pseudowords were 9.1, 8.6, and 8.7 for the three sets, respectively.

Each suffixed pseudoword was matched with a control pseudoword that included
the same pseudoroot in combination with the orthographic sequence that
constituted the control sequence for the suffix. Thus pseudowords in each
suffixed-control pair had the same length, the same syllabic structure,
similar orthographic/phonological structure, and similar orthographic
frequency of the final part, either corresponding to a suffix or to a
nonsuffix. Pseudowords including suffixes and control sequences were also
matched for bigram frequency. Mean bigram frequencies, calculated on the
basis of the natural logarithm, were: 10.49, 10.57, and 10.60 for
pseudowords in the three suffixed sets; they were 10.80, 10.55, and 10.44
for pseudowords in control sets. In combining initial letter strings with
suffixes and control letter sequences, we avoided the presence in the
pseudowords of embedded real words. Pseudowords in the six sets were also
matched for their overall degree of orthographic similarity to a real word,
i.e., for the number of orthographic neighbors. Adopting the N-count measure
(Coltheart et al. 1977), i.e., the total number of words that can be
obtained from each pseudoword by replacing one letter at a time with another
letter, while preserving the other letters’ positions, we determined that
the great majority of pseudowords had a null N-count (with a few exceptions
of N-count = 1, balanced across sets), i.e., we obtained pseudowords that
were equally dissimilar from existing words, according to the N-count
metric.
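
The N-count definition given above can be written down as a short sketch. It
assumes a lexicon held as a set of lowercase word forms and an unaccented
26-letter alphabet; the function name n_count, the toy lexicon and the test
strings are illustrative assumptions, not the word list or software used for
the experiment.

```python
import string

def n_count(item, lexicon, alphabet=string.ascii_lowercase):
    """Number of lexicon words obtained by replacing exactly one letter of
    `item`, keeping all other letters in their positions."""
    neighbors = set()
    for position, original in enumerate(item):
        for letter in alphabet:
            if letter == original:
                continue  # a replacement must change the letter
            candidate = item[:position] + letter + item[position + 1:]
            if candidate in lexicon:
                neighbors.add(candidate)
    return len(neighbors)

# Toy lexicon of Italian word forms (hypothetical, for illustration only).
toy_lexicon = {"cane", "pane", "rane", "cene", "casa"}

print(n_count("bane", toy_lexicon))      # neighbors: cane, pane, rane
print(n_count("crofusso", toy_lexicon))  # no one-letter neighbor in the toy lexicon
```

A string with a null N-count, like the second one here, has no word in the
lexicon that differs from it by a single letter substitution.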

In sum, there were six sets of pseudowords, arranged in a 2x3 design, in
which the main factors were the presence vs. absence of a suffix, and the
high vs. medium vs. low frequency of the orthographic sequence corresponding
either to a suffix or to a nonsuffix. In each of the six sets, there were 12
pseudowords (4 for each suffix or control sequence), for a total of 72
experimental stimuli, 36 suffixed and 36 controls. The experimental items,
with the mean RT and percent error for each item, are reported in Appendix
A.

In order to avoid presenting the same pseudoroot to the same participant
both in the suffixed and in the nonsuffixed control condition, each
experimental set was split in two subsets of 6 items each. Each participant
was presented with 36 experimental pseudowords, 18 suffixed and 18
pseudosuffixed, in which no pseudoroot was repeated. In each subset there
were two instances of the same suffix or final control sequence. For each
set of suffixed and control pseudowords, the entire set of single scores was
provided by two participants presented with two complementary sublists.

In each sublist, the 36 experimental pseudowords were presented together
with 66 filler pseudowords and 102 filler words. Each participant was
presented with a total of 204 stimuli. Filler stimuli were the same in each
of the two sublists: Words included medium/low frequency singular nouns and
adjectives, either derivationally suffixed or nonsuffixed, in a proportion
that reflected the composition of the Italian basic dictionary in the
medium/low frequency range (see Thornton, Iacobini and Burani 1994; 1997).
Each suffix and each control final sequence that occurred in experimental
pseudowords was also included in the same number of filler words. Filler
pseudowords were derived from words analogous to the filler words by
changing one or two letters in different positions. Mean length was the same
for words and pseudowords (range: 6-11 letters).

The list was presented to participants in a single experimental session,
arranged in three randomized blocks of 68 items each. For each block,
participants were assigned to one of two different randomizations of items.
Each experimental list was preceded by a practice list of 50 items, 25 words
and 25 pseudowords, assigned in the same proportion to two randomized
blocks.



3.1.2. Procedure


Participants were tested individually in a soundproof experimental booth.
They received standard lexical decision printed instructions in which they
were asked to decide as quickly and as accurately as possible whether a
presented letter string was an Italian word or not. If it was a word (YES
response), they had to press the right one of two response keys, otherwise
(NO response) the left one. For left-handed participants, the order of the
response buttons was reversed.

Each trial started with the presentation of a fixation mark (a cross) in
the center of the screen for 400 ms, followed after 300 ms by the stimulus
centered at the same position. Stimuli were presented on a monitor in white
uppercase letters on a dark background and remained on the screen until
the participant pressed one of the two response buttons. They disappeared
after a time period of 1,500 milliseconds if no response was given. A new
trial began 1,200 ms after responding or time-out. If a participant
responded more slowly than the preset limit of 1.5 sec, the words FUORI
TEMPO (‘out of time’) appeared on the screen. If the participant gave the
wrong response, the word ERRORE (‘error’) appeared on the screen. This
signal was displayed for 500 ms. The interval between the disappearance of
the feedback and the next warning signal was 1,200 ms. There was a pause
after each block of stimuli. The total duration of the experimental session
was approximately 20 minutes.
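
The response and feedback rules described in this procedure can be sketched as a small function. This is only an illustration under stated assumptions, not the software used in the experiment: the names `feedback`, `TIMEOUT_MS` and `FEEDBACK_MS`, and the coding of responses as "YES", "NO" or `None`, are introduced here for the example.

```python
from typing import Optional

TIMEOUT_MS = 1500   # stimulus disappears after 1,500 ms without a response
FEEDBACK_MS = 500   # feedback message stays on screen for 500 ms


def feedback(is_word: bool, response: Optional[str],
             rt_ms: Optional[float]) -> Optional[str]:
    """Return the feedback message for one trial, or None if none is due.

    response is "YES", "NO", or None when no key was pressed (hypothetical coding).
    """
    # No response, or a response slower than the preset limit: time-out message.
    if response is None or rt_ms is None or rt_ms > TIMEOUT_MS:
        return "FUORI TEMPO"
    # A word calls for YES, a pseudoword for NO; anything else is an error.
    expected = "YES" if is_word else "NO"
    return None if response == expected else "ERRORE"
```

For instance, `feedback(False, "YES", 700)` returns `"ERRORE"`, because a pseudoword was accepted as a word.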



3.1.3. Participants


Forty-eight participants, mostly University students, were paid to
participate in the experiment. All were native speakers of Italian.



3.2. Results and discussion


The data of four participants, whose mean reaction times for correct
responses or whose error rates were more extreme than 2 s.d. from the mean
of all participants, were excluded from further analysis. Using the
remaining forty-four participants, the mean reaction times and error rates
for all items were obtained and one pair of items in the medium-frequency
set was removed because the number of errors for one of the two members of
the pair (crofusso) was more than 2.5 s.d. above the mean. When means for
length, bigram frequency and N-count were recalculated after removing the
two paired items, the sets were still balanced. The remaining observations
were used to calculate participants’ and items’ mean reaction times and
error scores. Mean reaction times by items and percentages of errors for
the three experimental categories and their respective controls are shown in
Table 1.


Table 1. Experiment 1. Mean reaction times by items in ms. and % error.
Suffixed and control (nonsuffixed) pseudowords, with high-frequency
(HF), medium-frequency (MF), and low-frequency (LF) final sequences

                 Suffixed   Control   Difference
HF   Mean RT       739        700        +39
     % Error       13.6       6.1        +7.5

MF   Mean RT       701        701          0
     % Error       4.1        7.4        -3.3

LF   Mean RT       680        679         +1
     % Error       4.6        4.2        +0.4


Results were submitted to a mixed three-way analysis of variance with two
within-participants factors: Suffixedness (suffixed vs. nonsuffixed
pseudowords) and frequency of final sequence, both suffix and control
(high vs. medium vs. low). The third, between-participants factor was list
(first vs. second sublist, each administered to one half of participants).

The ANOVAs were performed both by participants and by items and
showed an interaction between suffixedness and frequency on both reaction
times (F1(2,84) = 7.59, p<.001, MSE= 1,412.1; F2(2,58) = 3.34, p=.04,
MSE= 977.3) and errors (F1(2,84) = 5.92, p=.004, MSE= 117.2; F2(2,58)
= 5.97, p=.004, MSE= 1.48). Results differed across the three experimental
sets when suffixed pseudowords were compared to their respective controls.
Comparisons between suffixed-control pairs based on the Duncan test on
means by items revealed that suffixed pseudowords in the high-frequency
set were significantly (39 ms) slower (p=.03) and gave rise to
significantly (7.5%) more errors (p=.003) with respect to controls. By
contrast, suffixed pseudowords were equally fast relative to their controls
in both the medium-frequency and the low-frequency sets (p>.1 in both
cases). Percent errors on suffixed pseudowords were 3.3 less and 0.4 more
than on controls in the medium-frequency and in the low-frequency sets,
respectively. These differences were not significant (p>.1 in both cases).

An effect of frequency was found on both reaction times (F1(2,84) =
19.78, p<.001, MSE= 1,505.6; F2(2,58) = 10.12, p<.001, MSE= 977.3),
and errors (F1(2,84) = 5.76, p=.004, MSE= 122.5; F2(2,58) = 6.28, p=.003,
MSE= 1.48). The main effect of suffixedness (suffixed vs. nonsuffixed
pseudowords) was found on reaction times by participants only (F1(1,42) =
6.57, p=.013, MSE= 900.46; F2(1,58) = 2.56, p>.1, MSE= 977.3). No main
effect of suffixedness was found on errors (F1(1,42) = 2.03, p>.1, MSE=
72.23; F2(1,58) = 1.27, p>.1, MSE= 1.48).

As revealed by the strong interaction between suffixedness and
frequency of final sequence, and by post-hoc comparisons, response latencies
and percentages of errors to suffixed pseudowords were higher, relative to
their controls, only when the pseudowords included a high-frequency
suffix. By contrast, pseudowords that included either medium-frequency or
low-frequency suffixes revealed neither longer reaction times nor lower
accuracy with respect to matched orthographic controls. These results
confirm those obtained by Burani et al. (1997): High-frequency suffixes
activate corresponding morphemic access units in pseudoword contexts. By
contrast, no access unit seems to be available for suffixes that are either
medium- or low-frequency, at least not in pseudoword contexts and within
the time required to perform lexical decision.

The present results allow us to build on the findings by Burani et al.
(1997). In the present experiment, the interference effect caused by
high-frequency suffixes occurred with suffixes that were combined with
nonexisting roots. Hence, activation of morphemic lexical units
corresponding to suffixes occurred in the absence of a real root on their
left side. This finding hardly seems compatible with sequential search
accounts (Taft and Forster 1976), and with recent reformulations (Taft
1994), which predict that the frequency of the second constituent, the
derivational suffix, should not affect lexical processing in the absence of
a lexical unit as first constituent.



4. Experiment 2


Experiment 1 provided evidence for a role of suffix frequency in
processing, with high-frequency suffixes significantly affecting rejection
latencies in visual lexical decisions to pseudowords. In Experiment 1 there
was no evidence that suffixes of medium/low frequency, which extended up to
a frequency of 90 per 1.5 million, constituted effective processing units:
No interference arose in lexical decision when a suffix was either medium-
or low-frequency.

The role of suffix frequency in lexical decision to real words was
assessed in Experiment 2, by varying both root and suffix frequency in
transparent derived words of low surface frequency. Derived words included
suffixes belonging to two sets of differing frequencies. Suffixes of high
frequency were contrasted with suffixes that were of medium/low frequency.
In Experiment 1 there was no evidence for differences between medium- and
low-frequency suffixes. Hence, suffixes from both the latter frequency
ranges were pooled together in a single set. For simplicity, hereafter we
will refer to medium/low-frequency suffixes as low-frequency suffixes.

All low-frequency derived words should in principle be accessed
through constituent morphemes: even when both the root and the suffix
are low-frequency, the derived word is nevertheless lower in frequency
than its constituent morphemes. Hence, for low-frequency derived words
that are equated for all the relevant properties except for frequency of the
two constituent morphemes, either high or low, predictions were that both
reaction times and error rates should not be a function of whole-word
frequency, but should rather reflect differences in the frequency of
morphemic constituents.

Words including higher-frequency morphemes were expected to be
accessed more quickly and more accurately than words including
lower-frequency morphemes, with words including both root and suffix of
high frequency being the fastest and the most accurate, and words with
low-frequency root and suffix being the slowest and the least accurate.
Derived words in which only one morpheme, either the root or the suffix, is
of high frequency, were expected to show intermediate reaction times and
error rates. If for printed stimuli simultaneous parallel activation of both
root and affix is assumed, irrespective of their relative positions within
the word, words in which the high-frequency constituent is either the root
or the suffix were not expected to differ in activation times.



4.1. Method

4.1.1. Materials and design


Four sets of equally low-frequency suffixed derived words were selected.
In the four sets, root and suffix frequency varied orthogonally: The first
set included high-frequency roots and high-frequency suffixes (HH); the
second set included low-frequency roots and high-frequency suffixes (LH);
the third set included high-frequency roots and low-frequency suffixes
(HL); the fourth set included low-frequency roots and low-frequency
suffixes (LL). Suffixes were either high- or low-frequency on both tokens
and types, i.e., on both frequency tout court and numerosity. The root
frequency measure included the cumulative frequency of both the inflected
and the derived forms of the base.

Thirteen words (nouns and adjectives) were included in each set, for a
total of nine different suffixes in each set. No suffix was homonymous with
a different Italian suffix. The same suffixes were included in the two
high-frequency suffix sets and in the two low-frequency suffix sets,
respectively. Suffixes were three to five letters long. High-frequency
suffixes and low-frequency suffixes were matched for length and syllabic
structure. Roots were different in the four sets. Root length was balanced
across sets. The roots belonged to different grammatical categories (i.e.,
nouns, adjectives and verbs) that were balanced across sets. All the derived
words were presented in singular citation form. They were orthographically
and phonologically transparent with respect to their bases, i.e., there was
no orthographic/phonological assimilation at the boundary between root and
suffix.

Across the four sets, words were matched for surface frequency, length,
syllable structure and bigram frequency. The 52 experimental words were
presented together with 108 filler words and 160 filler pseudowords, for a
total of 320 stimuli. Any suffix that occurred in experimental words
occurred also in the same number of filler pseudowords. Filler pseudowords
were drawn from words analogous to the filler words by changing one or
two letters in different positions in the word. Filler words included
medium/low-frequency singular nouns and adjectives, either morphologically
complex or simple, in a proportion that reflected the composition of the
Italian basic dictionary in the medium/low frequency range (Thornton,
Iacobini and Burani 1994; 1997). Mean length was the same for words and
pseudowords (range: 6-11 letters).


The list was presented to participants in a single experimental session,
arranged in four randomized blocks of 80 items each. Each participant was
presented with a different block randomization and with a different
randomization of items within each block. Each experimental list was
preceded by a practice list of 40 items, 20 words and 20 pseudowords,
assigned in the same proportion to two randomized blocks.
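
The per-participant randomization described here (a different block order and a different item order within each block) can be sketched as follows. This is a minimal illustration under assumptions: the text does not say how items were assigned to the four blocks, so the example simply slices a placeholder list of 320 stimuli into consecutive blocks of 80; the function name `randomize_for_participant` and the use of a seed per participant are hypothetical.

```python
import random


def randomize_for_participant(blocks, seed):
    """Return a new block order with items reshuffled inside each block."""
    rng = random.Random(seed)  # one reproducible random stream per participant
    # Shuffle items within each block without modifying the input blocks.
    shuffled = [rng.sample(block, len(block)) for block in blocks]
    # Shuffle the order in which the blocks are presented.
    rng.shuffle(shuffled)
    return shuffled


# 52 experimental words + 108 filler words + 160 filler pseudowords = 320 stimuli.
stimuli = [f"item{i:03d}" for i in range(52 + 108 + 160)]
# Assumed fixed block composition: four consecutive slices of 80 items.
fixed_blocks = [stimuli[i:i + 80] for i in range(0, len(stimuli), 80)]
session = randomize_for_participant(fixed_blocks, seed=7)
```

Each block keeps the same set of items for every participant; only the presentation order changes with the seed.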



4.1.2. Procedure


The procedure was the same as in Experiment 1. The experimental session
lasted about 30 minutes.



4.1.3. Participants


Forty-five participants, mostly University students, were paid to participate
in the experiment. All were native speakers of Italian.



4.2. Results and discussion


The data of ten participants, who made more than 15 percent errors on the
experimental words, were excluded from further analysis. Using the
remaining thirty-five participants, the mean reaction times and error rates
for all items were obtained. We removed four experimental words that showed
error rates exceeding 40% from the data set. One word (tenerume) was
removed in set HL, two words (aratore and larvale) in set LH, and one
word (ameboide) in set LL. One item (rimanenza) was removed in set HH
because it was the only word which included a prefixed bound root of an
irregular verb. Removal of these items did not affect the matching of the
four sets for the relevant variables.
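
The exclusion rules and the resulting sample sizes can be checked with a short sketch. The error percentages below are invented for illustration and are not the paper's data; the helper name `exclude_above` is hypothetical. The degrees-of-freedom bookkeeping assumes a standard by-participants repeated-measures analysis and a by-items 2 x 2 between-items analysis, which matches the F1(1,34) and F2(1,43) values reported for this experiment.

```python
def exclude_above(error_rates, threshold):
    """Drop entries whose percent error exceeds the threshold."""
    return {name: err for name, err in error_rates.items() if err <= threshold}


# Hypothetical error percentages, for illustration only.
participants = {"p01": 4.0, "p02": 21.5, "p03": 15.0}
kept = exclude_above(participants, threshold=15.0)  # p02 is dropped (more than 15%)

# Counts taken from the text.
n_participants = 45 - 10   # 35 participants retained
n_items = 4 * 13 - 4 - 1   # 47 words retained: four error-prone items and rimanenza removed

# Error degrees of freedom under the assumed designs.
df_by_participants = n_participants - 1  # within-participants effect with 1 df
df_by_items = n_items - 2 * 2            # items minus the four cells of the 2 x 2 design
```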

In Table 2 the mean values with standard deviations for the variables in
each experimental set are reported. The list of the experimental items, with
root frequency, suffix frequency, word frequency, mean RT and percent
error for each item, is reported in Appendix B.

The remaining observations were used to calculate participant and item
mean reaction times and error scores. Mean reaction times by items and
percentages of errors for the four experimental sets are shown in Table 3.


Table 2. Experiment 2. Mean values and standard deviations (s.d.) for the
relevant variables.

HH = Derived words with high-frequency root and high-frequency suffix
HL = Derived words with high-frequency root and low-frequency suffix
LH = Derived words with low-frequency root and high-frequency suffix
LL = Derived words with low-frequency root and low-frequency suffix

                                 HH               HL               LH               LL
                            Mean    s.d.     Mean    s.d.     Mean    s.d.     Mean    s.d.
Root frequency               501   412.1      507   409.8     33.5    19.1     33.3    21.9
Family size                 15.7    8.05    10.20    3.82      4.4     2.5      4.2     1.7
Suffix frequency           1,859   1,515       57    33.7    1,636   1,260       58    32.5
Suffix numerosity            246   123.6     17.3    10.8      217  101.39     17.7    10.6
Semantic relatedness        3.77    0.43     3.47    0.89     3.66    0.68     3.45    0.95
Familiarity                 6.57    0.78     6.05    1.03     6.15     0.9     5.75    1.01
Bigram frequency           10.79     0.4    10.69     0.3    10.56     0.3    10.60     0.3
Word length in letters       8.2     1.5      8.2     1.1      8.4     1.0      8.2     0.9
Root length in letters       4.4     1.4      4.3     1.1      4.5     1.0      4.4     0.8
Suffix length in letters     3.8     0.6      3.9     0.5      3.9     0.5      3.8     0.6
Word frequency               3.1     2.8      3.5     3.2      2.2     2.3      1.7     1.2



Table 3. Experiment 2. Mean reaction times by items in ms and % error.
Suffixed derived words with high-frequency root and high-frequency
suffix (HH); high-frequency root and low-frequency suffix (HL);
low-frequency root and high-frequency suffix (LH); low-frequency root and
low-frequency suffix (LL).

                          HH       HL
Mean Reaction Time        597      624
% Error                   2.5      5.9

                          LH       LL
Mean Reaction Time        634      670
% Error                   6.7      12.2


Results were submitted to two-way analyses of variance, with root
frequency (high vs. low) and suffix frequency (high vs. low) as the two
factors. There were main effects of both root frequency and suffix frequency
on both reaction times and error rates. For root frequency, F1(1,34)= 77.83,
p<.001, MSE= 771.3; F2(1,43)= 7.96, p<.01, MSE= 2,551.55 on reaction
times; F1(1,34)= 13.96, p<.001, MSE= 68.9; F2(1,43)= 6.08, p<.025,
MSE= 6.59 on error rates. For suffix frequency, F1(1,34)= 28.92, p<.001,
MSE= 1,088; F2(1,43)= 4.77, p<.05, MSE= 2,551.55 on reaction times;
F1(1,34)= 11.00, p=.002, MSE= 63.59; F2(1,43)= 4.38, p<.05, MSE= 6.59
on error rates. There was no interaction between the two factors (p>.1 in
all the analyses).

Results on both reaction times and error rates strictly paralleled
differences in frequency of constituent morphemes, with words including
high-frequency constituents yielding quicker and more accurate
performance. No differential role of root frequency with respect to suffix
frequency was apparent in the data. Hence, results seemed to confirm the
hypothesis of access through activation of morphemic constituents for
low-frequency derived words.

We controlled post-hoc for possible residual asymmetries in the
properties of the experimental words that could have contributed to the
effect. Three properties of the derived words were considered: the semantic
relatedness to the base, the word’s morphological family size, and the word
familiarity.



4.2.1. Ratings of semantic relatedness with the base


According to some authors, effects of morphological structure should be
found preferentially in derived words that are semantically transparent (or
related) with respect to the base (Marslen-Wilson et al. 1994; but see also,
for contrasting data and accounts, Bentin and Feldman 1990; Feldman and
Soltano 1999; Plaut and Gonnerman 2000; Schreuder, Burani and Baayen
2002; Stolz and Feldman 1995; Vannest and Boland 1999). In selecting
stimuli, we aimed at balancing words in the four sets for the degree of
semantic transparency. We controlled for semantic transparency of the
derived words with respect to their bases by excluding semantically opaque
words, and by including in each set approximately the same number of
suffixes that could be considered either productive or unproductive on the
basis of different measures of productivity. In each frequency set there
were suffixes that could be considered productive because they had been
used to coin a substantial number of neologisms in the last fifty years, or
could be rated as productive on the basis of the quantitative measure of
productivity proposed by Baayen (1989; 1992). Analogously, in each set
there was a similar number of suffixes that could be considered scarcely
productive or unproductive on either one or both the latter measures.

However, it could not be excluded that, independently of productivity
rated in the latter ways, derived words including high-frequency suffixes
might result in greater semantic transparency with respect to their bases by
virtue of suffix numerosity itself. High suffix numerosity is related to a
greater number of derived words which tend to share a similar part of
meaning, namely the meaning carried by the suffix. Analogously, it could
not be excluded that words might have different semantic transparency
values due to specific idiosyncrasies.

Derived words were submitted to empirical ratings for semantic
relatedness with their base. Each derived word was paired with its base
word. The printed list of word-pairs was presented in different random
orders to thirty-eight University students who had not participated in the
lexical decision experiment. Participants had to rate, for each pair, on a
five-point scale ranging from “Very unrelated” to “Very related”, how
“related in meaning” they thought the first word (the derived word) was to
the second word (the base word).

Mean ratings of semantic relatedness with the base were 3.77 (s.d. 0.43)
for HH words; 3.47 (s.d. 0.89) for HL words; 3.66 (s.d. 0.68) for LH
words; 3.45 (s.d. 0.95) for LL words. A two-way ANOVA with root frequency
(high vs. low) and suffix frequency (high vs. low) as factors was performed
on semantic relatedness rating means both by participants and by items. A
suffix frequency effect was found, by participants only (F1 (1,37) = 15.65,
p<.001, MSE= 195.1; F2 (1,43) = 1.35, p=.25, MSE=.55), with words
including high-frequency suffixes being rated as significantly more related
in meaning to their base words (mean semantic relatedness: 3.71) than
words including low-frequency suffixes (mean semantic relatedness: 3.46).
Neither root frequency nor the interaction was significant (F<1 in both
cases).

In order to ensure that the suffix frequency effect was not confounded
with the fact that words with high-frequency suffixes were more
transparent in meaning than words with low-frequency suffixes, two further
analyses were carried out. First, the results of lexical decision were
reanalyzed by excluding from the two sets with low-frequency suffixes the
least transparent items (four items in all), thus obtaining new sets that
were perfectly matched for semantic transparency. After matching sets for
semantic transparency, the results of ANOVAs did not change, but showed
even stronger effects of both root and suffix frequency (F2 (1,39) = 7.3,
p<.025, MSE= 2,314.56; F2 (1,39) = 9.46, p<.005, MSE= 2,314.56 on reaction
times for root and suffix, respectively; F2 (1,39) = 6.08, p<.025, MSE=
6.36; F2 (1,39) = 7.88, p<.01, MSE= 6.36, on error rates for root and
suffix, respectively), and no interaction (p>.1). Furthermore, post-hoc
correlation analysis did not reveal a significant correlation (one-tailed
test) of reaction time with semantic relatedness (r = -.14, t(45)=0.92,
p>.1). Hence we could exclude that semantic transparency was responsible
for the effects found (see also, for evidence that semantic transparency
itself cannot explain why some suffixes induce decomposition while others do
not, Vannest and Boland 1999).



4.2.2. Morphological family size


Recently, Bertram, Baayen and Schreuder (2000) reported evidence for the
role, in the lexical processing of Dutch complex words, of morphological
family size, i.e., the type count of derived words and compounds with a
given base word as a constituent. This type count of the number of
morphological family members, which has been found to be a strong
independent co-determinant of response latencies for Dutch monomorphemic
and inflected words (de Jong, Schreuder and Baayen 2000; Schreuder and
Baayen 1997), also affected latencies to derived words. Bertram, Baayen
and Schreuder (2000) suggested that a large family size of the base word
facilitates lexical processing for most suffixed derived words. According to
the authors, the facilitatory effect of a large family size is due to
semantic activation spreading from a complex word to its family members.

The family size of the base root (i.e., the root numerosity) could in
principle affect processing of Italian derived words. In our study, a larger
morphological family size should be expected in both sets with high root
frequency, relative to the sets with low root frequency. For each target
word, the number of word types that share the same root in the corpus was
counted. The obtained mean number of morphological family members
was 15.7 (s.d. 8.05) for HH words, 10.2 (s.d. 3.82) for HL words, 4.4 (s.d.
2.5) for LH words, and 4.2 (s.d. 1.7) for LL words. A two-way ANOVA with
root frequency (high vs. low) and suffix frequency (high vs. low) as the
two factors was performed on family size values in the four sets. As
expected, a difference in the numerosity of the morphological family
between words with high and low frequency roots was found (F(1,43)= 39.01,
p<.0001, MSE= 22.5). A difference in morphological family size between
words with high and low frequency suffixes was also found (F(1,43)= 4.24,
p<.05, MSE= 22.5), together with a marginally significant interaction
(F(1,43)= 3.43, p=.07, MSE=22.5). A two-tailed t-test between HH and HL
sets revealed a significant difference in family size between the two sets
with high-frequency root (t(22)= 2.11, p<.05).

The differences in family size among the experimental sets were in the
same direction as the differences in response times, with HH words having
a larger mean number of family members (15.7) than HL words (10.2), and
words with high-frequency suffixes a larger mean number of family
members (10.1) than words with low-frequency suffixes (7.2). Hence we
assessed whether these differences could be responsible for part of the
effect that was found in lexical decision.
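
As a quick arithmetic check, the suffix-level means quoted above can be recovered from the four cell means reported for family size. The sketch below assumes that they are unweighted averages of the cell means (the sets are not of exactly equal size after item removal, so this is an assumption, not something the text states); the helper name `marginal` is hypothetical.

```python
from statistics import mean

# Mean family size per set, as reported in the text.
family_size = {"HH": 15.7, "HL": 10.2, "LH": 4.4, "LL": 4.2}


def marginal(cells, position, level):
    """Unweighted mean of the cells whose label has `level` at `position`.

    position 0 is root frequency, position 1 is suffix frequency.
    """
    return mean(v for k, v in cells.items() if k[position] == level)


high_suffix = marginal(family_size, 1, "H")  # (15.7 + 4.4) / 2
low_suffix = marginal(family_size, 1, "L")   # (10.2 + 4.2) / 2
```

Under this assumption the two values come out at about 10.05 and 7.2, in line with the 10.1 and 7.2 given in the text.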

Two further analyses were made. First, we reanalyzed the results of
lexical decision by excluding two items in each of the two sets with
high-frequency roots, to obtain new sets that were matched for mean family
size. After matching sets for family size, the results of ANOVAs did not
change, but still showed effects of both root and suffix frequency
(F2(1,39)= 7.82, p=.008, MSE= 2,602.1; F2 (1,39) = 4.31, p<.05, MSE=
2,602.1, on reaction times for root and suffix, respectively; F2 (1,39) =
6.7, p=.01, MSE= 6.75; F2 (1,39) = 4.54, p= .04, MSE= 6.75, on error rates
for root and suffix, respectively), and no interaction (p>.1). Furthermore,
post-hoc correlation analysis did not reveal any correlation of reaction
time with family size (r=.04, t(45)=0.27, p>.1). Thus the hypothesis that
differences in family size were responsible for part of the effects found in
lexical decision could be rejected.



4.2.3. Familiarity ratings


Words from the low-frequency range of a corpus may differ in familiarity,
and familiarity is usually a good predictor of lexical decision performance
(Connine et al. 1990; Gernsbacher 1984). Although our derived words
were matched for whole-word frequency, we made a post-hoc check for
familiarity.

The derived words were submitted to twenty-seven University students
for familiarity ratings. Participants had to rate the printed words on a
seven-point scale ranging from “Unknown” (1) to “Very well known” (7). All
the derived words received high familiarity ratings. However, there were
differences between the four groups. Mean familiarity ratings were: 6.57
(s.d. 0.78) for HH words; 6.05 (s.d. 1.03) for HL words; 6.15 (s.d. 0.9) for
LH words; 5.75 (s.d. 1.01) for LL words.

A two-way ANOVA with root frequency (high vs. low) and suffix
frequency (high vs. low) as the two factors performed on familiarity rating
means both by participants and by items showed significant effects of both
factors, by participants only (F1(1,26) = 15.03, p<.001, MSE= .51;
F2(1,43) = 2.56, p>.1, MSE= .88 for root frequency; F1(1,26) = 44.11,
p<.001, MSE= .46; F2(1,43) = 1.74, p>.1, MSE= .88 for suffix frequency),
and no interaction (F<1).

Results on familiarity ratings paralleled results on reaction times and
accuracy, with words including high-frequency roots and high-frequency
suffixes being rated as more familiar than words with low-frequency roots
and suffixes. Differences in rated familiarity were so clear-cut that we
could not select post-hoc a subset matched for familiarity. Moreover,
post-hoc correlation analysis revealed a strong correlation of reaction
time with familiarity (r = -.66, t(45) = 5.83, p<.0001).
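
The t values that accompany the post-hoc correlations can be related to r through the standard test of a Pearson correlation against zero, t = r * sqrt(df) / sqrt(1 - r^2), with df = n - 2. The sketch below applies that textbook formula; the function name `t_from_r` is introduced here for illustration, and because the reported r values are rounded to two decimals, the recomputed t values can only be expected to approximate the published ones.

```python
import math


def t_from_r(r: float, df: int) -> float:
    """t statistic for testing a Pearson correlation r against zero (df = n - 2)."""
    return r * math.sqrt(df) / math.sqrt(1.0 - r * r)


# Correlation of reaction time with familiarity, as reported: r = -.66, df = 45.
t_familiarity = t_from_r(-0.66, 45)
# Correlation of reaction time with family size, as reported: r = .04, df = 45.
t_family_size = t_from_r(0.04, 45)
```

With the rounded r values, the familiarity correlation gives a t of roughly 5.9 in absolute value and the family size correlation a t of roughly 0.27, close to the 5.83 and 0.27 reported.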



4.2.4. Ad interim considerations


In summary, the results of post-hoc controls on three possible factors
contributing to the effects found at lexical decision (i.e., semantic
relatedness with the base, morphological family size, and word familiarity)
highlighted word familiarity as a possible determinant of lexical decision
performance. There could be reasons for adopting differences in rated
familiarity of morphologically complex words as evidence per se for access
to morphemic constituency (see Bertram, Baayen and Schreuder 2000;
Schreuder and Baayen 1997). However, it could also be the case that word
familiarity provides a different source of explanation for the effect we
found. In Experiment 3, we tried to disentangle the role of morphemic
frequency from that of word familiarity, while addressing further
processing issues.



5. Experiment 3


The aim of Experiment 3 was twofold. First, we aimed at detailing the
effects found in Experiment 2 with different and larger sets of derived
words, better balanced for properties like semantic relatedness with the
base, morphological family size and rated familiarity. Second, we assessed
whether slower reaction times and higher error rates to words including
both a root and a suffix of low frequency (LL words) were a function of
the low frequency of constituent morphemes, or whether performance was
merely a function of low whole-word surface frequency. The latter
possibility would imply that, for low-frequency derived words with both
constituents of low-frequency, the output of lexical decision does not
result from morphological processing, but from whole-word processing. For
low-frequency derived words whose constituents are both low-frequency, the
moderate difference between the frequency of morphemes and whole-word
frequency (with root and suffix only slightly higher in frequency than the
whole-word) might not be large enough for morphological processing to
result in benefits relative to access based on the whole-word.

To address the latter issue, a set of nonderived words (ND) was
matched for frequency to four new sets of low-frequency derived words
that included morphemes of differing frequencies. Analogously to
Experiment 2, the first set of derived words included high-frequency roots
and high-frequency suffixes (HH); the second set included low-frequency
roots and high-frequency suffixes (LH); the third set high-frequency roots
and low-frequency suffixes (HL); the fourth set low-frequency roots and
low-frequency suffixes (LL). Words in the five sets (the four sets of
derived words and the set of nonderived words) had the same mean low
surface frequency and were matched for all the relevant variables,
including familiarity. The only difference was that words in the fifth
(nonderived) set did not include any derivational suffix.⁴

Predictions were the following. If LL derived words with both
low-frequency constituents are accessed preferentially as whole-words and
no morpheme-based access succeeds because of including exceedingly
low-frequency roots and suffixes, LL derived words should show similar
results to nonderived (ND) words: For both LL and ND words, reaction times
and error rates should be a function of surface frequency only. If, by
contrast, activation of morphemes is involved in access to LL derived
words, and if morphemic activation implies processing advantages because of
accessing a root and a suffix which are, although slightly, higher in
frequency than the whole-word, LL derived words should be quicker than
nonderived words. The latter do not benefit in fact from any constituent
morpheme of higher frequency.⁵ If the closer matching for familiarity,
semantic relatedness and morphological family size obtained for words in
the experimental sets of Experiment 3 does not make any contribution to
results, HL and LH derived words which include one high-frequency
constituent (either the root or the suffix), should show intermediate
reaction times and error rates relative to LL words on the one hand, and HH
derived words on the other. Accordingly, HH derived words with both
constituents of high-frequency should be the most quickly and most
accurately recognized.



5.1. Method

5.1.1. Materials and design


The five experimental sets included seventeen words each, nouns and
adjectives in analogous proportions in each set. In the four derived sets,
there were nine different high-frequency suffixes, and ten different
low-frequency suffixes, with analogous distributions in the two
high-frequency suffix sets, and in the two low-frequency suffix sets,
respectively. No suffix was homonymous with a different Italian suffix.
Suffixes were three to five letters long. High-frequency suffixes and
low-frequency suffixes were matched for length and syllabic structure. Roots
were different in the four sets, and belonged to different grammatical
categories (i.e., nouns, adjectives and verbs) that were balanced across
sets. Root length was balanced across sets. Analogously to Experiment 2,
suffixes were either high- or low-frequency on both word tokens (suffix
frequency) and word types (suffix numerosity), and the root frequency
included the cumulative frequency of both the inflected and the derived
forms of the base. The high-frequency root words also had a larger mean
family size than low-frequency root words. Words were matched for mean
morphological family size across sets with the same root frequency. All the
derived words were orthographically and phonologically transparent with
respect to their bases.

All words in the five sets were presented in the citation form. Words were matched, across the five sets, for surface frequency, length, syllabic structure and bigram frequency. The five sets were also matched for rated familiarity (familiarity ratings were obtained on a seven-point scale from twenty-four participants; see Experiment 2 for details about the method), and the four sets of derived words were matched for semantic relatedness with the base (semantic relatedness ratings were obtained on a seven-point scale from twenty-four different participants; see Experiment 2 for details concerning the method). The mean values with standard deviations for the relevant variables in each experimental set are reported in Table 4.

The list of stimuli, with root frequency, suffix frequency, word frequency, mean RT and percent error for each item, is reported in Appendix C.
6


The 85 experimental words were presented together with 115 filler words and 200 filler pseudowords, for a total of 400 stimuli. Any suffix that occurred in the experimental words also occurred in the same number of filler pseudowords. The filler pseudowords were derived from words analogous to the filler words by changing one or two letters in different positions in the word. The filler words included medium/low-frequency singular nouns and adjectives, either morphologically complex or simple, in a proportion that reflected the composition of the Italian basic dictionary in the medium/low frequency range (Thornton, Iacobini and Burani 1994; 1997). The mean length was the same for words and pseudowords (range: 6-11 letters).


The list was presented to participants in a single experimental session, arranged in five randomized blocks of eighty items each. Each participant was presented with a different block randomization and with a different randomization of items within each block. Each experimental list was preceded by a practice list of 40 items (20 words and 20 pseudowords), assigned in the same proportion to two randomized blocks.
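The blocking scheme described above can be illustrated with a minimal Python sketch. Since the text does not say how items were assigned to blocks, the sketch assumes that the 400 stimuli are first dealt at random into five fixed blocks of eighty, and that each participant then receives the blocks in a shuffled order, with a fresh shuffle of the items inside each block. The item labels, function names and seeds are hypothetical placeholders, not the actual materials or software used in the experiment.

```python
import random


def make_blocks(items, n_blocks, seed=0):
    """Deal items at random into n_blocks fixed, equal-sized blocks.

    The random assignment to blocks is an assumption of this sketch.
    """
    if len(items) % n_blocks != 0:
        raise ValueError("items must divide evenly into blocks")
    rng = random.Random(seed)
    shuffled = rng.sample(items, len(items))  # shuffled copy, input untouched
    size = len(items) // n_blocks
    return [shuffled[i * size:(i + 1) * size] for i in range(n_blocks)]


def participant_order(blocks, seed):
    """Shuffle the block order, then shuffle the items inside each block."""
    rng = random.Random(seed)
    block_order = list(blocks)
    rng.shuffle(block_order)
    return [rng.sample(block, len(block)) for block in block_order]


# Placeholder labels standing in for the real stimuli (hypothetical).
stimuli = (
    [f"experimental_word_{i}" for i in range(85)]
    + [f"filler_word_{i}" for i in range(115)]
    + [f"filler_pseudoword_{i}" for i in range(200)]
)

blocks = make_blocks(stimuli, n_blocks=5)
order_for_participant_1 = participant_order(blocks, seed=1)
```

Seeding by participant number keeps each participant's order reproducible; different seeds will, in practice, yield different block and item orders.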


Table 4. Experiment 3. Mean values and standard deviations (s.d.) for the relevant variables.

HH = Derived words with high-frequency root and high-frequency suffix
HL = Derived words with high-frequency root and low-frequency suffix
LH = Derived words with low-frequency root and high-frequency suffix
LL = Derived words with low-frequency root and low-frequency suffix
ND = Nonderived words

Freq: Frequency; Num: Numerosity; Rel: Relatedness




                HH                HL                LH                LL                ND
                Mean     s.d.     Mean     s.d.     Mean     s.d.     Mean     s.d.     Mean     s.d.
Root Freq.      547      536.5    554      526.2    38       16.72    31       22.9     9.5      7.98
Family Size     12.3     8.81     11.1     4.05     4.6      1.94     4.1      2.11     1.8      0.97
Suffix Freq.    2,119    1,653    75       37.26    1,892    1,479    68       43.9
Suffix Num.     247      141.2    21       12.6     227      127.33   21       12.5
Semantic Rel.   5.41     0.56     5.01     0.77     5.34     0.65     5.24     0.82
Familiarity     6.42     0.34     6.24     0.6      6.22     0.46     6.15     0.48     6.21     0.68
Bigram Freq.    10.72    0.44     10.74    0.25     10.58    0.52     10.47    0.27     10.75    0.29
Word Length     8.2      1.29     8.6      1.22     8.1      1.20     8.8      0.88     7.9      0.83
Root Length     4.4      1.06     4.5      1.23     4.3      1.22     5.0      0.94
Suffix Length   3.8      0.66     4.2      0.64     3.8      0.66     3.8      0.66
Word Freq.      5.3      4.04     4.6      4.94     3.1      2.68     3.1      3.53     0.4      3.47

5.1.2. Procedure

The procedure was the same as in Experiment 2. The experimental session lasted about 30 minutes.




5.1.3. Participants



Fifty participants, mostly University students, were paid to participate in
the experiment. All were native speakers of Italian.



5.1.4. Results and discussion


The data of three participants, whose mean reaction times for correct responses or whose error rates were more than 2 s.d. from the mean of all participants, were excluded from further analysis. Using the remaining forty-seven participants, the mean reaction times and error rates for all items were obtained. Mean reaction times by items and percentages of errors for the five experimental sets are shown in Table 5.
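The exclusion criterion just described can be sketched as follows. This is only an illustration under stated assumptions: the text does not say whether the sample or the population standard deviation was used (the sketch uses the sample s.d.), and the participant identifiers and values below are invented for the example, not data from the experiment.

```python
from statistics import mean, stdev


def outliers(values_by_participant, n_sd=2.0):
    """Return ids whose value lies more than n_sd s.d. from the group mean."""
    values = list(values_by_participant.values())
    centre = mean(values)
    spread = stdev(values)  # sample s.d.; an assumption of this sketch
    return {
        pid
        for pid, value in values_by_participant.items()
        if abs(value - centre) > n_sd * spread
    }


def excluded_participants(mean_rts, error_rates, n_sd=2.0):
    """Exclude a participant who is extreme on mean RT or on error rate."""
    return outliers(mean_rts, n_sd) | outliers(error_rates, n_sd)


# Invented values for eight hypothetical participants.
mean_rts = {"p1": 600, "p2": 610, "p3": 620, "p4": 590,
            "p5": 605, "p6": 615, "p7": 595, "p8": 900}
error_rates = {"p1": 5, "p2": 6, "p3": 4, "p4": 7,
               "p5": 5, "p6": 6, "p7": 30, "p8": 5}

excluded = excluded_participants(mean_rts, error_rates)
```

With these invented values, participant p8 is flagged on reaction time and p7 on error rate; the remaining participants would enter the item analyses.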


Table 5. Experiment 3. Mean reaction times by items in ms and % error. Suffixed derived words with high-frequency root and high-frequency suffix (HH); high-frequency root and low-frequency suffix (HL); low-frequency root and high-frequency suffix (LH); low-frequency root and low-frequency suffix (LL); nonderived words (ND).

                      HH      HL      LH      LL      ND
Mean Reaction Time    603     605     641     645     640
% Error               5.4     8.6     14.1    17.1    13.6


Results on the five sets were submitted to one-way ANOVAs. Additionally, two-way analyses of variance, with root frequency (high vs. low) and suffix frequency (high vs. low) as the two factors, were performed on the four derived sets.

Results of one-way ANOVAs showed a significant difference among experimental categories, on both reaction times (F1(4,184) = 35.77, p<.0001, MSE = 483.08; F2(4,80) = 4.05, p<.005, MSE = 1,816.13) and errors (F1(4,184) = 21.09, p<.0001, MSE = 1.43; F2(4,80) = 2.84, p<.03, MSE = 29.35). Post-hoc comparisons based on Duncan's test on the means by items showed that both HH and HL derived words gave rise to significantly faster reaction times than words in the other three sets (for all comparisons involving HH or HL words, p<.025).

Additionally, HH and HL derived words did not differ from one another (p>.1). Analogously, no differences were found among LH, LL and ND words (always p>.1). A similar pattern was found on errors, with significant differences between HH words on the one hand, and LH, LL, and ND words on the other (always p<.05), and between HL and LL words (p=.05).
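The degrees of freedom reported for the one-way ANOVAs can be checked against the design. The sketch below assumes, as the reported values suggest but the text does not state explicitly, that the analysis by participants (F1) treats word set as a repeated-measures factor and that the analysis by items (F2) treats it as a between-items factor. The function names are illustrative.

```python
def df_by_participants(n_participants, n_levels):
    """Repeated-measures one-way ANOVA: (k - 1, (k - 1) * (n - 1))."""
    effect = n_levels - 1
    error = (n_levels - 1) * (n_participants - 1)
    return effect, error


def df_by_items(n_items, n_groups):
    """Between-items one-way ANOVA: (k - 1, N - k)."""
    effect = n_groups - 1
    error = n_items - n_groups
    return effect, error


# 47 participants retained; 5 word sets of 17 items each.
f1_df = df_by_participants(n_participants=47, n_levels=5)
f2_df = df_by_items(n_items=5 * 17, n_groups=5)

print(f1_df)  # (4, 184)
print(f2_df)  # (4, 80)
```

Under these assumptions the computed values match the reported F1(4,184) and F2(4,80).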

The two-way ANOVAs on the four derived sets, with root frequency (high vs. low) and suffix frequency (high vs. low) as the two factors, confirmed a root effect only, by both participants and items, on both reaction times and errors (F1(1,46) = 129.3, p<.0001, MSE = 438.24; F2(1,64) = 14.59, p<.0001, MSE = 1,740.71 for reaction times; F1(1,46) = 66.38, p<.0001, MSE = 1.53; F2(1,64) = 9.14, p<.004, MSE = 30.66 for errors). On reaction times, no suffix effect was found (F<1), and no interaction (F<1). A suffix effect was found on errors, in the analysis by participants only (F1(1,46) = 9.83, p<.003, MSE = 1.35; F2(1,64) = 1.20, p>.1, MSE = 30.7), with words with low-frequency suffixes giving rise to more errors than words with high-frequency suffixes. No interaction was found on errors (F<1).
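The size of the effects tested in the two-way analyses can be read off the cell means in Table 5. The sketch below computes the root, suffix and interaction contrasts for the four derived sets; it is a descriptive illustration on the published means only, not a re-analysis of the data, and the function name is hypothetical.

```python
# Cell means from Table 5; first letter = root frequency, second = suffix frequency.
rt_ms = {"HH": 603, "HL": 605, "LH": 641, "LL": 645}
pct_error = {"HH": 5.4, "HL": 8.6, "LH": 14.1, "LL": 17.1}


def factorial_contrasts(cell):
    """Return low-minus-high contrasts for a 2 x 2 table of cell means."""
    # Root effect: low-frequency roots (LH, LL) minus high-frequency roots (HH, HL).
    root = (cell["LH"] + cell["LL"]) / 2 - (cell["HH"] + cell["HL"]) / 2
    # Suffix effect: low-frequency suffixes (HL, LL) minus high-frequency suffixes (HH, LH).
    suffix = (cell["HL"] + cell["LL"]) / 2 - (cell["HH"] + cell["LH"]) / 2
    # Interaction: suffix effect at low root minus suffix effect at high root.
    interaction = (cell["LL"] - cell["LH"]) - (cell["HL"] - cell["HH"])
    return {"root": root, "suffix": suffix, "interaction": interaction}


rt_contrasts = factorial_contrasts(rt_ms)
error_contrasts = factorial_contrasts(pct_error)

print(rt_contrasts)  # {'root': 39.0, 'suffix': 3.0, 'interaction': 2}
```

On reaction times the root contrast is 39 ms against 3 ms for the suffix contrast, in line with the root-only effect reported above. On errors the corresponding contrasts are roughly 8.6 and 3.1 percentage points.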

Results of Experiment 3 confirmed morpheme-based processing for HH words. As expected, these derived words showed faster reaction times and higher accuracy because they include high-frequency morphological constituents. The outcomes of Experiment 3 also suggested whole-word processing for LL derived words: the latter words, whose constituent morphemes were low-frequency, did not show any advantage with respect to nonderived words of the same surface frequency.

The results of Experiment 3 were not in accordance with the predictions made for HL and LH derived words. Unlike in Experiment 2, in which the two sets of derived words with one high-frequency constituent did not differ from each other and showed intermediate reaction times and accuracy relative to words with two high-frequency constituents on the one hand and words with two low-frequency constituents on the other, in Experiment 3 HL and LH words gave rise to contrasting results. Apparently, the imperfect balance in rated familiarity among word sets was responsible for part of the effects found in Experiment 2. A better matching among experimental sets led to a different pattern of results in Experiment 3. While words with high-frequency root and low-frequency suffix (HL words) were as fast as words with both root and suffix of high frequency (HH words), words with a high-frequency suffix but a low-frequency root (LH words) differed neither from words in which both constituents were low-frequency (LL words) nor from words with no morphological constituency, i.e., nonderived (ND) words.

From these results it could be argued that the major determinant of lexical decision performance to suffixed derived words is the root frequency, with no role for the frequency of the suffix. The General discussion presents a processing account that reconciles these results on words with the apparently contrasting results on suffixed pseudowords, including the results of Experiment 1, in which a strong role for suffix frequency was found.



6. General discussion


Three lexical decision experiments investigated how the frequency of morphemic constituents, namely roots and suffixes, affects the visual processing of low-frequency derived stimuli. In Experiment 1, pseudowords made up of Italian derivational suffixes of various frequencies and meaningless pseudoroots were contrasted with pseudowords in which the same pseudoroots were combined with orthographically legal control final sequences. These final sequences were matched to the suffixes for frequency and for other relevant variables, but were not suffixes themselves. There were three sets of suffixed/control pseudowords, differing in the frequency of the final sequences, either high, medium or low. Only pseudowords that included high-frequency suffixes showed interference, namely longer decision times and higher error rates, with respect to their matched nonsuffixed controls. Neither pseudowords including medium-frequency suffixes nor pseudowords with low-frequency suffixes differed from their respective controls. These results are in accordance with previous results by Burani et al. (1997), and extend them to pseudoword contexts in which no root is present. The results provide further evidence that the activation of morphemic access units corresponding to suffixes is constrained, in visual tasks, by the quantifiable characteristics of the suffixes themselves (see also Laudanna and Burani 1995; Laudanna, Burani and Cermele 1994).

Experiments 2 and 3 explored whether the frequency of the root or the suffix affected lexical decision to low-frequency suffixed derived Italian words. Four sets of derived nouns and adjectives were contrasted in Experiment 2. All of the words in the four sets had a low frequency, but differed with respect to the frequency of their morphemic constituents, either high or low with orthogonal variation. The results showed that reaction times and accuracy to derived words were affected by the frequency of both roots and suffixes. Lexical decisions were fastest and most accurate when the derived words included two high-frequency constituents; they were the slowest and the least accurate when both constituents had low frequency, and had intermediate times and error rates when the derived words included only one high-frequency constituent, either the root or the suffix. No differential effects of root and suffix frequency were found. However, post-hoc
controls showed that the effects of morphemic fr