UNDERGRADUATE MATHEMATICS STUDENTS’ CAREER: A CLASSIFICATION TREE
Chiara Andrà¹, Guido Magnano¹, Francesca Morselli²
¹University of Torino, Italy; ²University of Genoa, Italy
Starting from a longitudinal survey on the students enrolling in the mathematics undergraduate course at the University of Torino about possible causes of dropout, we analyse in detail the first year of a specific cohort of students: the freshmen in the academic year 2010/11. The purpose of the study is to shed light on the possible factors that can determine success or dropout. We use the methodological tool known as "classification tree", developed within the data mining domain.
LIFE AS A TRAIL, DECISIONS AS CROSSROADS
The 1998 movie Sliding Doors draws on the commonly accepted metaphor of human life as a trail: sometimes there are crossroads and people have to decide where to go. The verb “to decide”, indeed, derives from the Latin word de-caedere (to cut away): when a certain alternative is chosen, any other possibility is cut away. Similarly, we would like to draw on the trail metaphor for depicting undergraduate mathematics students’ careers (either taking the degree or dropping out).
Undergraduate students in mathematics, engineering, and sciences all around the world face several difficulties with mathematics, as has emerged in the UK, Canada, Australia, and Ireland (for a reference to recent research on this issue see Rylands & Coady, 2009). The problem is interpreted by Rylands and Coady as a consequence of the undergraduate students’ increasingly diverse backgrounds. Indeed, the variety of high schools from which the students enrolling in mathematics come is broad, and a multifaceted situation is depicted. Although we agree that students enter university with a varied background, for us this is not the final interpretation of the difficulties. On the contrary, we are interested in detecting the causes of difficulty on which it is possible to intervene.
Notice that we are not dealing with generic undergraduate students who have to attend math courses, but rather with individuals who specifically chose undergraduate studies in mathematics. Undergraduate students in mathematics are supposed to be both talented and motivated: they are likely to have experienced a positive causal relationship from attitude to achievement (Ma & Kishor, 1997), as well as from achievement to attitude (Ma & Xu, 2004), in a self-reinforcing sequence of positive experiences. This has also been observed in other research, confirming that positive beliefs are related to each other and to positive emotions (Hannula et al., 2006; Roesken et al., 2011). This is likely to involve both the individual and the social aspect of cognition, motivation and emotions (Hannula, 2011).
However, a longitudinal study on undergraduate mathematics students at the University of Torino has shown that each year 20-25% of them drop out after the first academic year (Andrà, Magnano & Morselli, 2012). Moreover, the aforementioned studies (see Rylands & Coady, 2009) focus mainly on the cognitive aspects in relation to students’ difficulties. Our longitudinal survey has also taken into account cognitive aspects, which allowed us to confirm that drop-out students were not necessarily those displaying a significant lack of knowledge at the beginning of university studies. Beliefs and other affective factors, such as motivation, may have a key role in determining difficulties and reactions to them. Along the same lines, recent studies have been carried out within the MAVI community, focusing on the situation in Spain (Gòmez-Chacòn et al., 2012) and in Germany (Griese et al., 2012).
The purpose of our research is to insert affective factors in a cognitive-oriented assessment test, and to investigate which factors emerge, how they emerge, and to what extent they contribute to understanding the drop-out phenomenon of undergraduate students in mathematics. We are also interested in investigating the kind of information the analysis of affective factors in the test would give us.
A methodological tool that can help us deal with the students’ decision to cut themselves away from university studies is the classification tree. This tool does not oblige the researchers to assume a linear correlation between the involved variables, a limit that has been pointed out by Hannula (2011) with respect to most research in the field of affect. As a consequence, it allows us to model the interplay of a large number of factors without imposing restrictive relations among variables. In the following sections, we first provide a brief description of our investigation tools, then we introduce the classification tree methodology, and finally we characterize some profiles.
THE “EXTENDED” TEST: COGNITIVE- AND AFFECTIVE-BASED ITEMS
The TARM test actually assesses the students’ mathematical abilities, but in this study we focus on the students’ interpretations of their achievements in mathematics at school. As a first attempt in this direction, in collaboration with Laura Nota (University of Padua) the 2010/11 TARM questionnaire was enlarged to include a set of items from the Career Adapt-Abilities Inventory (Savickas et al., 2009), from the Perceived Responsibility Scale (Zimmerman & Kitsantas, 2005) and from the Source of School Mathematics Self-Efficacy Scale (Usher & Pajares, 2009). Studies on the view of mathematics and self-beliefs of mathematics learners (Hannula et al., 2005; Roesken et al., 2011) have also been a helpful reference.
Savickas’ Career Adaptability is a multidimensional construct that concerns individual differences in the willingness, competence, and performance of behaviors required to cope with transitions. The willingness to engage in the five principal types of coping behaviors that constitute adaptation (orientation, exploration, establishment, management, disengagement) constitutes the psychological dimension of the model, and it is composed of the facets of flexibility, proactivity, conscientiousness, and openness. Adapt-ability is the psychosocial dimension and it is distinct from the behaviors that produce adaptation and its outcomes; it includes concern, control, curiosity, confidence, collaboration, and cooperation. Willingness and adapt-abilities shape the individual’s readiness and resources for performing the behaviors needed to face vocational development tasks, transitions, or traumas.
For example, we consider the attitude to think positively about one’s professional future (Adp1), the curiosity and desire to explore new opportunities, also in the professional sphere (Adp3), and self-confidence about one’s capacity in fostering professional self-realization (Adp5). For each of these, the students were given a list of 11 abilities, and they were asked to rate how much they think they have each ability, from 1 (very little) to 6 (very much). Examples of abilities are: “to reflect on how my future will be”, “to have a positive view of my future”, “to prepare for the future”, “to become aware of the educational and professional choices I have to make”, “to advance the changes”, “to be persevering” (as regards Adp1); “to explore my environment”, “to look for opportunities that help me grow as an individual”, “to consider different ways of doing things”, “to search for information about the choices I have to make”, “to ask for advice” (as regards Adp3); and “to learn from one’s own mistakes”, “to be proud of well-done work”, “to learn new abilities”, “to do things that I consider a challenge”, “to be reliable” (as regards Adp5).
Zimmerman and Kitsantas’ Perceived Responsibility Scale assesses individuals’ self-efficacy beliefs regarding their use of specific self-regulatory processes in various areas of academic functioning. In our study, the students were given 18 questions concerning the responsibility for school events. For each event they had to rate who is responsible, from 1 (the teacher) to 7 (the student), with the middle value 4 corresponding to “both”.
Usher and Pajares study self-efficacy beliefs as influenced by four sources: the interpretation of one’s own results, the vicarious experience, the social persuasion, and the emotional and psychological states (mostly anxiety). In our study, the students were asked to rate their level of agreement with each one of 14 statements from 1 (not at all) to 6 (perfectly). One interesting dimension regards the experienced sensations and emotions (Smt4), which is related to the emotional and psychological states. Examples of statements that have been administered to the students are: “the mere fact that I have to attend a math lesson makes me feel stressed and/or nervous”, “as I start to do some math exercises I start to perceive sensations of stress”, and “when I think that I have to study math I sag”.
DATA ANALYSIS
Methodological considerations
The key concept of data mining is discovery, commonly defined as "detecting something new". The actual data mining task is the exploration of large amounts of data, corresponding to several factors, to extract interesting patterns such as clusters, unusual records, or dependencies. These patterns can then be seen as a kind of summary of the input data, and used in further analysis.
A widespread method in data mining is the classification tree, which aims at predicting the value of a target variable on the basis of several input variables. The tree is algorithmically constructed (using a computer package) by computing, for each factor to be considered, the information gain (with respect to the target variable) given by splitting the initial population into two groups at some threshold value. Once the splitting value which maximizes the information gain has been found, the program explores the next factor, until all factors have been tested. The population is then split according to the factor (and the split value) corresponding to the highest information gain found. At this point, the full process is repeated (including the factor already used for the first splitting) for each of the obtained subgroups, and so on until either (i) each of the resulting subgroups contains only individuals having the same value of the target variable, or (ii) further splitting does not yield significant information gain. The best way to illustrate the method is to show how it works on data, as we shall do in the next section.
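As an illustration only, the splitting step described above can be sketched in a few lines of Python. The sketch assumes that information gain is measured as a reduction in Shannon entropy and that candidate thresholds are midpoints between consecutive observed values; the text does not say which package or criterion was actually used. The function names and the toy data are invented for the example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Entropy reduction when the group is split at `threshold`."""
    left = [y for x, y in zip(values, labels) if x < threshold]
    right = [y for x, y in zip(values, labels) if x >= threshold]
    n = len(labels)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - weighted


def best_threshold(values, labels):
    """Try midpoints between consecutive distinct values; return (threshold, gain)."""
    distinct = sorted(set(values))
    best = (None, 0.0)
    for a, b in zip(distinct, distinct[1:]):
        t = (a + b) / 2
        gain = information_gain(values, labels, t)
        if gain > best[1]:
            best = (t, gain)
    return best


def best_split(table, labels):
    """Scan every factor; return (factor, threshold, gain) with the highest gain."""
    best = (None, None, 0.0)
    for factor, values in table.items():
        t, gain = best_threshold(values, labels)
        if gain > best[2]:
            best = (factor, t, gain)
    return best


# Invented toy data (NOT the study's data): two factors and a binary target.
toy = {
    "score": [5, 8, 12, 13, 17, 18, 22, 27],
    "noise": [3, 9, 4, 8, 2, 7, 1, 6],
}
target = [0, 0, 0, 0, 1, 1, 1, 1]

print(best_split(toy, target))  # the "score" factor, split at 15.0, separates the two classes
```

Repeating `best_split` on each of the two resulting subgroups, until the subgroups are pure or the gain becomes negligible, yields the tree.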
Classification tree construction
Our data cover 162 undergraduate math students at the University of Torino, all enrolled in 2010/11. We shall first consider the number of passed exams (within the first year) as the target variable. In the classification tree construction, the target variable should assume the least possible number of distinct values, to avoid overfitting. Therefore, we first had to find a reliable way to "count" the earned credits, and then to split the range of values at a significant threshold level, so as to obtain only two classes for the target variable.
Following the 1999 Bologna Accords, European university courses are described in terms of ECTS credits, known in Italy as CFU (“crediti formativi universitari”). One CFU corresponds, in principle, to 25 hours of study; each academic year includes 60 CFU. In previous studies on the math undergraduate curriculum at Turin University, we observed that a critical CFU threshold in the first year is 21: students earning less than 21 CFU in the first year very seldom get the final degree. Accordingly, in the subsequent analysis we shift from description of what happened (number of CFU earned) to prediction of what is likely to happen (career). Thus, we say that CFU1 ≥ 21 “predicts success” and, conversely, that CFU1 < 21 “predicts drop-out”. At the moment, we also know the list of second-year students in 2011/12, and therefore the actual dropout incidence after the first year (a number of students, in fact, give up their studies at a later stage).
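The reduction of the credit count to a two-class target can be written down explicitly. The following Python fragment is only a sketch of the rule stated above (CFU1 ≥ 21 versus CFU1 < 21); the constant names, the function name and the sample credit totals are hypothetical.

```python
CFU_THRESHOLD = 21   # critical first-year threshold reported in the text
HOURS_PER_CFU = 25   # nominal workload of one CFU
CFU_PER_YEAR = 60    # nominal credits per academic year


def predicted_career(cfu1):
    """Return 1 ("predicts success") if CFU1 >= 21, else 0 ("predicts drop-out")."""
    return 1 if cfu1 >= CFU_THRESHOLD else 0


# Hypothetical first-year credit totals, for illustration only.
sample_cfu1 = [0, 12, 20, 21, 36, 60]

print([predicted_career(c) for c in sample_cfu1])  # [0, 0, 0, 1, 1, 1]
print(CFU_PER_YEAR * HOURS_PER_CFU)                # 1500 nominal hours of study per year
print(CFU_THRESHOLD / CFU_PER_YEAR)                # 0.35 of the nominal yearly load
```

In these terms, the threshold corresponds to roughly one third of the nominal first-year workload.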
We apply the classification tree method to single out which variables “characterize” the two groups (CFU1 < 21, CFU1 ≥ 21). We have at our disposal up to 27 input variables concerning: personal information that can be read in terms of social aspects (for instance, living in a big or in a small town, being a commuter and so on); psychological traits and motivations (as emerged from the answers to the “affective” part of the test); data from students’ previous career (diploma grades and type); the performance in the non-selective entrance test (TARM). For each student, we also know all scores, credits and examination dates for the University first-year courses, but in connection to dropout we considered only the total amount of CFU obtained in scored exams. The construction of the classification tree is controlled by a number of parameters, such as the list of factors to be used and the minimum information gain to be considered for a split. The "best" model should be a compromise between the maximum overall predictive power and the minimum number of factors and splits needed (a fully predictive tree with too many nodes is likely to be overfitting).
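The trade-off between predictive power and tree size can be made concrete with a small Python sketch. It assumes an entropy-based information gain and a single control parameter, here called `min_gain`; all names and the eight toy records are invented and do not reproduce the package or the data used in the study.

```python
import math
from collections import Counter


def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values()) if n else 0.0


def best_split(rows, labels, factors):
    """Return (gain, factor, threshold) for the most informative binary split."""
    best = (0.0, None, None)
    base, n = entropy(labels), len(labels)
    for f in factors:
        distinct = sorted({r[f] for r in rows})
        for a, b in zip(distinct, distinct[1:]):
            t = (a + b) / 2
            left = [y for r, y in zip(rows, labels) if r[f] < t]
            right = [y for r, y in zip(rows, labels) if r[f] >= t]
            gain = base - (len(left) * entropy(left) + len(right) * entropy(right)) / n
            if gain > best[0]:
                best = (gain, f, t)
    return best


def grow(rows, labels, factors, min_gain):
    """A node becomes a leaf when no split helps or the best gain is below min_gain."""
    majority = Counter(labels).most_common(1)[0][0]
    gain, f, t = best_split(rows, labels, factors)
    if f is None or gain < min_gain:
        return {"predict": majority}
    low = [(r, y) for r, y in zip(rows, labels) if r[f] < t]
    high = [(r, y) for r, y in zip(rows, labels) if r[f] >= t]
    return {
        "factor": f,
        "threshold": t,
        "low": grow([r for r, _ in low], [y for _, y in low], factors, min_gain),
        "high": grow([r for r, _ in high], [y for _, y in high], factors, min_gain),
    }


def count_splits(tree):
    if "predict" in tree:
        return 0
    return 1 + count_splits(tree["low"]) + count_splits(tree["high"])


def predict(tree, row):
    while "predict" not in tree:
        tree = tree["low"] if row[tree["factor"]] < tree["threshold"] else tree["high"]
    return tree["predict"]


def accuracy(tree, rows, labels):
    return sum(predict(tree, r) == y for r, y in zip(rows, labels)) / len(labels)


# Invented toy data: one factor "x", one "irregular" individual (x = 2).
rows = [{"x": v} for v in range(1, 9)]
labels = [0, 1, 0, 0, 1, 1, 1, 1]

full = grow(rows, labels, ["x"], min_gain=0.0)
small = grow(rows, labels, ["x"], min_gain=0.5)
print(count_splits(full), accuracy(full, rows, labels))    # 3 1.0
print(count_splits(small), accuracy(small, rows, labels))  # 1 0.875
```

On this toy sample the unrestricted tree classifies everybody correctly but needs three splits, two of which only serve to isolate a single individual; the restricted tree misclassifies that individual and keeps one split. The second behaviour is the kind of compromise described above.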
Figure 1 shows a classification tree which gives a correct prediction rate of 92%, using only 9 factors. The variable yielding the greatest information gain is T2, the score in the second part of the TARM test (mostly assessing the comprehension of texts taken from math and physics textbooks, in Italian and in English): the first part of TARM, T1, is the same for all the undergraduate courses in the Faculty of Sciences and assesses basic mathematical skills, while T2 is considered as “specific” for the math curriculum.
Figure 1: classification tree with CFU1 as predicted variable.
The T2 score ranges from 0 to 30, and the split value determined by the algorithm (14.5) shows that the test was well balanced. Students on the right branch (scoring more than 14.5) are subsequently discriminated by “prc”, that is the perceived sense of responsibility (Zimmerman & Kitsantas, 2005): if the latter is not very high (prc<60.44), then the next split is relative to the factor smt2 (possibility to observe and imitate effective models). If smt2<62.95, it predicts “success” (CFU1≥21). If prc is very high, instead, the variable adp4 (the ability to establish positive relationships and cooperate with others) intervenes. Notice that all measures of affective factors have been rescaled so that 50±10 corresponds to the mean ± one standard deviation (observed for a suitable reference population). The digits 0 and 1 at the bottom of terminal branches mean that the tree prediction for that branch is CFU1 < 21 or CFU1 ≥ 21, respectively. For each terminal group, we have indicated how many individuals of the original population are correctly classified (“T”) and incorrectly classified (“F”) by the tree.
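To read the split values of the affective factors, it may help to spell out the rescaling mentioned above. The Python sketch below assumes a plain linear standardization; the reference mean and standard deviation are hypothetical numbers, since the text does not report them.

```python
def rescale(raw, ref_mean, ref_sd):
    """Map a raw scale score so that the reference mean is 50 and one SD is 10."""
    return 50 + 10 * (raw - ref_mean) / ref_sd


def sd_units(rescaled):
    """Distance of a rescaled score from the reference mean, in standard deviations."""
    return (rescaled - 50) / 10


# Hypothetical reference-population statistics for one affective scale.
REF_MEAN, REF_SD = 4.0, 0.5

print(rescale(4.0, REF_MEAN, REF_SD))  # 50.0: exactly at the reference mean
print(rescale(4.5, REF_MEAN, REF_SD))  # 60.0: one standard deviation above the mean
print(round(sd_units(60.44), 2))       # 1.04
print(round(sd_units(62.95), 2))       # 1.3 (rounded from 1.295)
```

Under this reading, the split values prc<60.44 and smt2<62.95 lie roughly 1.0 and 1.3 standard deviations above the reference mean, respectively.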
Going back to the root branching on T2, the variable with the greatest information gain on the left branch (i.e., for students who scored less than 15 in T2) is the undergraduate curriculum: “MAT” (traditional math curriculum) versus “MFA” (applied math for finance and insurance). Here, we are not regarding the choice between the two curricula as an achievement factor: however, this datum should be included because reaching 21 CFU may have a different significance in the two curricula. It turns out that the difference affects only students with a low T2 score. Among these, MFA students are further classified by the variables adp2 (inclination to consider oneself as responsible for one’s own professional future) and st4 (writing ability). As regards MAT students with low T2, the variable smt2 again plays a fundamental role.
The remaining branches of the tree should be read along the same lines. However, this should not be assumed to describe the actual process which determines academic achievement: it rather indicates that, for instance, students getting a good score in T2 and nevertheless failing to reach 21 CFU could be singled out (with reasonable accuracy) by considering a specific combination of sense of responsibility, previous availability of effective models as a source of mathematical self-efficacy, and adaptability. To relate this predictive evidence to a causal process, further investigation is required: in particular, one should compare patterns emerging from this exploratory analysis with models proposed by current research on affective factors.
To give further examples of the method, we have displayed in figures 2 and 3 two more trees, both referring to the decision to attend the second year as the target variable. Figure 2 shows the tree that considers the same input variables as the tree in figure 1, whilst the tree in figure 3 also takes into account the variable CFU1, which was the target in the tree of figure 1. Cognitive and affective aspects intervene in these three distinct processes in different ways. In the analysis above, we have shown that, after the cognitive-based variable T2, the affective aspects play a crucial role. In figure 3, instead, the first split is determined by CFU1, and cognitive-based variables play a key role.
Figure 2: classification tree predicting continuation/drop-out on the basis of the factors measured at the beginning of the first year.
Figure 3: classification tree predicting continuation/drop-out on the basis of the same factors and of the first year total credits.
DISCUSSION AND FURTHER DEVELOPMENTS
What are we learning from classification trees?
The first branching in each classification tree concerns cognitive variables, then the affective factors emerge. They emerge both for students who are likely to take the degree and for students who are likely to drop out, but in different manners. In fact, for the first group, we observed that a high T2, a not too high sense of responsibility (prc), and a low availability of effective models (smt2) predict “success”. The same variable smt2 intervenes also when T2 is low, but here it predicts “failure” when it is low. This is a phenomenon which would never be observed in traditional correlation or multiple regression analysis. Whether this ambivalent (predictive) role of a single factor corresponds to a causal role or not is unclear for now. Yet, such phenomena are not manifestly absurd: an excessive sense of responsibility and self-comparison with effective models could actually be negative factors for academic achievement (figure 1); in turn, high adaptability is expected to be a positive factor for academic achievement, but could also lead students experiencing difficulties to decide more easily to change to a different curriculum (adp1 and adp4 in figure 2).
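The ambivalent role of a single factor can be reproduced on a deliberately artificial example. In the Python sketch below, eight synthetic records (not taken from the study) are built so that a low value of the factor goes with success when T2 is high and with failure when T2 is low; the variable names and the `pearson` helper are introduced only for this illustration.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


# Synthetic records (NOT the study's data): (high_t2, high_factor, success).
records = [
    (1, 0, 1), (1, 0, 1), (1, 1, 0), (1, 1, 0),  # high T2: low factor -> success
    (0, 0, 0), (0, 0, 0), (0, 1, 1), (0, 1, 1),  # low T2: low factor -> failure
]

factor = [r[1] for r in records]
success = [r[2] for r in records]
print(pearson(factor, success))  # 0.0 over the whole group

for t2 in (1, 0):
    sub = [r for r in records if r[0] == t2]
    print(t2, pearson([r[1] for r in sub], [r[2] for r in sub]))  # -1.0, then 1.0
```

Over the whole group the factor is uncorrelated with the outcome, while inside each T2 subgroup it determines the outcome completely, with opposite signs. A tree that first splits on T2 and then on the factor recovers this structure, whereas a single correlation coefficient computed on the whole group does not.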
Our study, along with Maggiani (2011), confirms that the transition from high school to university is a personal process where, beyond learning skills and motivation, self-beliefs, locus of control and adaptability interact in complex and "nonlinear" patterns, which cannot be explored by traditional correlation analysis.
The classification tree methodology
The first aspect we bring to light is that in our study we used the results from previous years (since 2001/02) to draw inferences about the present (and future) students enrolled in the mathematics undergraduate course. This implies an overarching assumption: the situation of the students of ten years ago is the same as that of today’s students. The term ‘situation’ has to be understood in a wide sense, in order to take into account socio-economic, psychological, and cognitive aspects. Although we are aware of the changes our educational system (as well as our country and our society) has gone through in the last ten years, we claim that some facts are still worth considering. In fact, if we look at the classification tree which contains information about enrolment in the second year, we can see a confirmation of this.
The methodology of the classification tree can be seen not only as a way to disentangle a complicated (and sometimes contradictory) picture, but also as a generator of research questions: a way to bring to light issues that need further elaboration. It provides the researcher with an articulated frame, and the researcher has to make sense of it, without relying on constraining assumptions such as linear correlations between variables.
Moreover, we can notice that a certain amount of ‘noise’ is present in our analysis. To us, variation is good. We do not believe that the future of undergraduate students may be completely predicted by means of tests (neither cognitive-based nor affective-based) or other tools. Individual unpredictability is, in our opinion, neither a limitation nor a mere matter of fact, but the reason to believe that each student is both responsible and free. Our research may serve as a source of information for the students to be aware of what is likely to happen in certain circumstances, as well as a warning for professors and administrative staff who care about the students’ careers, for any reason.
General considerations on the research
We conclude this paper with some general considerations. The first one concerns the role of the items used to collect data: the picture that emerges from the analysis depends on the data that have been collected. Items are not “neutral” to the research, and neither are the assumptions that lie in the background of the methodology used.
Among the items, the affective-based ones have a significant role. As expected, the first split is determined by cognitive factors, but the affective aspects contribute to delineating a varied and multifaceted landscape. Without them, the trees would have stopped after very few steps. In other words, this research contributes to showing that affect-related issues are of crucial importance in the learning processes, considered in a wide perspective.
A feature of this research is that we have taken into account psychological items that are “generic”. An open question might regard the possibility of using items that are “specific” to mathematics.
References
Andrà, C., Magnano, G., & Morselli, F. (2012). Drop-out undergraduate students in mathematics: an exploratory study. In: Current state of research on mathematical beliefs XVII. Proceedings of the MAVI-17 Conference, September 17-20, 2011, pp. 13-22. Ruhr-Universität Bochum, Germany.
Gòmez-Chacòn, I., Garcìa Madruga, J.A., Rodrìguez, R., Vila, J.O., & Elosùa, R. (2012). Mathematical beliefs and cognitive reflection: do they predict academic achievement? In: B. Roesken & M. Casper (Eds.), Current state of research on mathematical beliefs XVII. Proceedings of the MAVI-17 Conference, September 17-20, 2011, pp. 64-73. Ruhr-Universität Bochum, Germany.
Griese, B., Glasmachers, E., Harterich, J., Kallweit, M., & Roesken, B. (2012). Engineering students and their learning of mathematics. In: B. Roesken & M. Casper (Eds.), Current state of research on mathematical beliefs XVII. Proceedings of the MAVI-17 Conference, September 17-20, 2011, pp. 85-96. Ruhr-Universität Bochum, Germany.
Hannula, M.S. (2004). Affect in mathematical thinking and learning. Turku, Finland: Annales universitatis Turkuensis B 273.
Hannula, M.S. (2011). The structure and dynamics of affect in mathematical thinking and learning. In: M. Pytlak, E. Swoboda, & T. Rowland (Eds.), Proceedings of the CERME7. Rzeszów, Poland. 9-13 February 2011, pp. 34-60. University of Rzeszów, Poland: CERME.
Hannula, M.S., Kaasila, R., Laine, A., & Pehkonen, E. (2006). The structure of student teacher’s view of mathematics at the beginning of their studies. In: M. Bosch (Ed.), Proceedings of the CERME4. Sant Feliu de Guixols, Spain. 17-20 February 2005, pp. 205-214. Fundemi IQS, Universitat Ramon Llull: CERME.
Ma, X., & Kishor, N. (1997). Attitude toward self, social factors, and achievement in mathematics: a meta-analytic review. Educational Psychology Review, 9, 89-120.
Ma, X., & Xu, J. (2004). Determining the causal ordering between attitude toward mathematics and achievement in mathematics. American Journal of Education, 110(May), 256-280.
Maggiani, C. (2011). La transizione dalla scuola secondaria all’Università: il caso degli studenti del corso di laurea in Matematica. Unpublished manuscript. Tesi di Laurea, University of Genova (Italy).
Roesken, B., Hannula, M.S., & Pehkonen, E. (2011). Dimensions of students’ views of themselves as learners of mathematics. ZDM. The International Journal on Mathematics Education. DOI: 10.1007/s11858-011-0315-8
Rylands, L., & Coady, C. (2009). Performance of students with weak mathematics in first-year mathematics and science. International Journal of Mathematical Education in Science and Technology, 40(6), 741-753.
Savickas, M., Nota, L., Rossier, J., Dauwalder, J.P., Duarte, M.E., Guichard, J., Soresi, S., van Esbroeck, R., & van Vianen, A.E.M. (2009). Life designing: A paradigm for career construction in the 21st century. Journal of Vocational Behavior, 75(3), 239-250.
Usher, E.L., & Pajares, F. (2009). Sources of self-efficacy in mathematics: A validation study. Contemporary Educational Psychology, 34, 89-101.
Zimmerman, B.J., & Kitsantas, A. (2005). Students' perceived responsibility and completion of homework: The role of self-regulatory beliefs and processes. Contemporary Educational Psychology, 30, 397-417.