UNDERGRADUATE MATHEMATICS STUDENTS’ CAREER: A CLASSIFICATION TREE

Chiara Andrà¹, Guido Magnano¹, Francesca Morselli²


¹University of Torino, Italy; ²University of Genoa, Italy

Starting from a longitudinal survey on possible causes of dropout among the students enrolling in the mathematics undergraduate course at the University of Torino, we analyse in detail the first year of a specific cohort of students: the freshmen in the academic year 2010/11. The purpose of the study is to shed light on the possible factors that can determine success or drop-out. We use the methodological tool known as "classification tree", developed within the data mining domain.

LIFE AS A TRAIL, DECISIONS AS CROSSROADS

The 1998 movie Sliding Doors draws on the commonly accepted metaphor of human life as a trail: sometimes there are crossroads and people have to decide where to go. The verb “to decide”, indeed, derives from the Latin word de-caedere (to cut away): when a certain alternative is chosen, any other possibility is cut away. Similarly, we would like to draw on the trail metaphor to depict undergraduate mathematics students’ careers (either taking the degree or dropping out).


Undergraduate students in mathematics, engineering, and sciences all around the world face several difficulties with mathematics, as has emerged in the UK, Canada, Australia, and Ireland (for a reference to recent research on this issue see Rylands & Coady, 2009). The problem is interpreted by Rylands and Coady as a consequence of the undergraduate students’ increasingly diverse backgrounds. Indeed, the variety of high schools from which the students enrolling in mathematics come is broad, and a multifaceted situation is depicted. Although we agree that students enter university with a varied background, for us this is not the final interpretation of the difficulties. On the contrary, we are interested in detecting the causes of difficulty on which it is possible to intervene.

Notice that we are not dealing with generic undergraduate students who have to attend math courses, but rather with individuals who specifically chose undergraduate studies in mathematics. Undergraduate students in mathematics are supposed to be both talented and motivated: they are likely to have experienced a positive causal relationship from attitude to achievement (Ma & Kishor, 1997), as well as from achievement to attitude (Ma & Xu, 2004), in a self-reinforcing sequence of positive experiences. This has also been observed in other research, which confirms that positive beliefs are related to each other and to positive emotions (Hannula et al., 2006; Roesken et al., 2011). This is likely to involve both the individual and the social aspect of cognition, motivation and emotions (Hannula, 2011).







However, a longitudinal study on undergraduate mathematics students at the University of Torino has shown that each year 20-25% of them drop out after the first academic year (Andrà, Magnano & Morselli, 2012). Moreover, the aforementioned studies (see Rylands & Coady, 2009) focus mainly on the cognitive aspects in relation to students’ difficulties. Our longitudinal survey has also taken into account cognitive aspects, which allowed us to confirm that drop-out students were not necessarily those displaying a significant lack of knowledge at the beginning of university studies. Beliefs and other affective factors, such as motivation, may have a key role in determining difficulties and reactions to them. Along the same lines, recent studies within the MAVI community have focused on the situation in Spain (Gòmez-Chacòn et al., 2012) and in Germany (Griese et al., 2012).

The purpose of our research is to insert affective factors in a cognitive-oriented assessment test, and to investigate which factors emerge, how they emerge, and to what extent they contribute to understanding the drop-out phenomenon of undergraduate students in mathematics. We are also interested in investigating the kind of information the analysis of affective factors in the test would give us.

A methodological tool that can help us deal with the students’ decision to cut themselves away from university studies is the classification tree. This tool does not oblige the researchers to assume a linear correlation between the involved variables, a limit that has been pointed out by Hannula (2011) with respect to most research in the field of affect. As a consequence, it allows us to model the interplay of a large number of factors without imposing restrictive relations among variables. In the following sections, we first provide a brief description of our investigation tools, then we introduce the classification tree methodology, and finally we characterize some profiles.


THE “EXTENDED” TEST: COGNITIVE- AND AFFECTIVE-BASED ITEMS

The TARM test actually assesses the students’ mathematical abilities, but in this study we focus on the students’ interpretations of their achievements in mathematics at school. As a first attempt in this direction, in collaboration with Laura Nota (University of Padua) the 2010/11 TARM questionnaire was enlarged to include a set of items from the Career Adapt-Abilities Inventory (Savickas et al., 2009), from the Perceived Responsibility Scale (Zimmerman & Kitsantas, 2005) and from the Source of School Mathematics Self-Efficacy Scale (Usher & Pajares, 2009). Studies on the view of mathematics and self-beliefs of mathematics learners (Hannula et al., 2005; Roesken et al., 2011) have also been a helpful reference.

Savickas’ Career Adaptability is a multidimensional construct that concerns individual differences in the willingness, competence, and performance of behaviors required to cope with transitions. The willingness to engage in the five principal types of coping behaviors that constitute adaptation (orientation, exploration, establishment, management, disengagement) constitutes the psychological dimension of the model, and it is composed of the facets of flexibility, proactivity, conscientiousness, and openness. Adapt-ability is the psychosocial dimension and it is distinct from the behaviors that produce adaptation and its outcomes; it includes concern, control, curiosity, confidence, collaboration, and cooperation. Willingness and adapt-abilities shape the individual’s readiness and resources for performing the behaviors needed to face vocational development tasks, transitions, or traumas.

For example, we consider the attitude to think positively about one’s professional future (Adp1), the curiosity and desire to explore new opportunities, also in the professional sphere (Adp3), and self-confidence about one’s capacity in fostering professional self-realization (Adp5). For each of these dimensions, the students were given a list of 11 abilities, and they were asked to rate how much they think they have each ability, from 1 (very little) to 6 (very much). Examples of abilities are: “to reflect on how my future will be”, “to have a positive view of my future”, “to prepare for the future”, “to become aware of the educational and professional choices I have to make”, “to advance the changes”, “to be persevering” as regards Adp1; “to explore my environment”, “to look for opportunities that help me grow as an individual”, “to consider different ways of doing things”, “to search for information about the choices I have to make”, “to ask for advice” as regards Adp3; and “to learn from one’s own mistakes”, “to be proud of work well done”, “to learn new abilities”, “to do things that I consider a challenge”, “to be reliable” as regards Adp5.

Zimmerman and Kitsantas’ Perceived Responsibility Scale assesses individuals’ self-efficacy beliefs regarding their use of specific self-regulatory processes in various areas of academic functioning. In our study, the students were given 18 questions concerning responsibility for school events. For each event they had to rate who is responsible, from 1 (the teacher) to 7 (the student), with the midpoint value 4 corresponding to “both”.

Usher and Pajares study self-efficacy beliefs as influenced by four sources: the interpreted result of one’s own performance, vicarious experience, social persuasion, and emotional and psychological states (mostly anxiety). In our study, the students were asked to rate their level of agreement with each one of 14 statements from 1 (not at all) to 6 (perfectly). One interesting dimension regards the experienced sensations and emotions (Smt4), which is related to the emotional and psychological states. Examples of statements that have been administered to the students are: “the mere fact that I have to attend a math lesson makes me feel stressed and/or nervous”, “as I start to do some math exercises I start to perceive sensations of stress”, and “when I think that I have to study math I sag”.







DATA ANALYSIS

Methodological considerations

The key concept of data mining is discovery, commonly defined as "detecting something new". The actual data mining task is the exploration of large amounts of data, corresponding to several factors, to extract interesting patterns such as clusters, unusual records, or dependencies. These patterns can then be seen as a kind of summary of the input data, and used in further analysis. A widespread method in data mining is the classification tree, which aims at predicting the value of a target variable on the basis of several input variables. The tree is algorithmically constructed (using a computer package) by computing, for each factor to be considered, the information gain (with respect to the target variable) given by splitting the initial population into two groups at some threshold value. Once the splitting value which maximizes the information gain has been found, the program explores the next factor, until all factors have been tested. The population is then split according to the factor (and the split value) corresponding to the highest information gain found. At this point, the full process is repeated (including the factor already used for the first splitting) for each of the obtained subgroups, and so on until either (i) each of the resulting subgroups contains only individuals having the same value of the target variable, or (ii) further splitting does not yield significant information gain. The best way to illustrate the method is to show how it works on data, as we shall do in the next section.
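
As an illustration only, the search for a single split can be sketched in Python. The sketch assumes Shannon entropy as the impurity measure behind the "information gain" and takes the midpoints between consecutive observed values as candidate thresholds; neither choice is specified in the text, and the function names and the toy data are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(values, labels, threshold):
    """Entropy reduction when the population is split at `threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # a split that leaves one side empty carries no information
    n = len(labels)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - weighted

def best_threshold(values, labels):
    """Best (threshold, gain) for one factor, or (None, 0.0) if no split helps."""
    distinct = sorted(set(values))
    best = (None, 0.0)
    for a, b in zip(distinct, distinct[1:]):
        t = (a + b) / 2  # assumed candidate: midpoint between observed values
        g = information_gain(values, labels, t)
        if g > best[1]:
            best = (t, g)
    return best

def best_split(table, labels):
    """Test every factor in turn and keep the one with the highest gain."""
    best = (None, None, 0.0)
    for factor, values in table.items():
        t, g = best_threshold(values, labels)
        if g > best[2]:
            best = (factor, t, g)
    return best

# Invented toy data: two factors, six students, binary target.
toy_table = {"T2": [10, 12, 14, 15, 18, 20], "prc": [55, 62, 48, 59, 61, 50]}
toy_target = [0, 0, 0, 1, 1, 1]
print(best_split(toy_table, toy_target))
```

On this invented table the factor T2 separates the two classes perfectly, so it is selected for the first split; repeating the same search inside each resulting subgroup gives the recursive procedure described above.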

Classification tree construction

Our data cover 162 undergraduate math students at the University of Torino, all enrolled in 2010/11. We shall first consider the number of passed exams (within the first year) as the target variable. In the classification tree construction, the target variable should assume the least possible number of distinct values, to avoid overfitting. Therefore, we first had to find a reliable way to "count" the earned credits, and then to split the range of values at a significant threshold level, so as to obtain only two classes for the target variable. Following the 1999 Bologna Accords, European university courses are described in terms of ECTS credits, known in Italy as CFU (“crediti formativi universitari”). One CFU corresponds, in principle, to 25 hours of study; each academic year includes 60 CFU. In previous studies on the math undergraduate curriculum at the University of Torino, we observed that a critical CFU threshold in the first year is 21: students earning fewer than 21 CFU in the first year very seldom get the final degree. Accordingly, in the subsequent analysis we shift from description of what happened (number of CFU earned) to prediction of what is likely to happen (career). Thus, denoting by CFU1 the total CFU earned in the first year, we say that CFU1 ≥ 21 “predicts success” and, conversely, that CFU1 < 21 “predicts drop-out”.

At present, we also know the list of second-year students in 2011/12, and therefore the actual dropout incidence after the first year (a number of students, in fact, give up their studies at a later stage).
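
The reduction of the first-year credit total to a two-class target can be sketched as follows. This is a minimal illustration, assuming that a student with exactly 21 CFU falls in the "success" class (as "CFU1 < 21 predicts drop-out" implies); the function name and the list of credit totals are hypothetical.

```python
CFU_THRESHOLD = 21  # critical first-year threshold reported in the text

def career_class(cfu1):
    """Return 1 ("predicts success") if cfu1 >= 21, else 0 ("predicts drop-out")."""
    return 1 if cfu1 >= CFU_THRESHOLD else 0

# Invented first-year credit totals, for illustration only.
first_year_cfu = [0, 12, 20, 21, 36, 60]
classes = [career_class(c) for c in first_year_cfu]
print(classes)
```

Collapsing the credit count into two classes keeps the number of distinct target values at its minimum, which is the property the tree construction asks for.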







We apply the classification tree method to single out which variables “characterize” the two groups (CFU1 < 21, CFU1 ≥ 21). We have at our disposal up to 27 input variables concerning: personal information that can be read in terms of social aspects (for instance, living in a big or in a small town, being a commuter and so on); psychological traits and motivations (as emerged from the answers to the “affective” part of the test); data from students’ previous career (diploma grades and type); the performance in the non-selective entrance test (TARM). For each student, we know all scores, credits and examination dates for the University first-year courses, but in connection to dropout we considered only the total amount of CFU obtained in scored exams. The construction of the classification tree is controlled by a number of parameters, such as the list of factors to be used and the minimum information gain to be considered for a split. The "best" model should be a compromise between the maximum overall predictive power and the minimum number of factors and splits needed (a fully predictive tree with too many nodes is likely to be overfitting).
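
How the two parameters named above shape the tree can be sketched with a small recursive builder. This is an illustrative sketch, not the package used in the study: it assumes entropy-based information gain and midpoint thresholds, and the parameter name `min_gain`, the helper names and the toy rows are all hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values()) if n else 0.0

def best_split(rows, labels, factors):
    """Return (gain, factor, threshold) of the most informative binary split."""
    base, n, best = entropy(labels), len(labels), (0.0, None, None)
    for f in factors:
        distinct = sorted({r[f] for r in rows})
        for a, b in zip(distinct, distinct[1:]):
            t = (a + b) / 2
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            gain = base - (len(left) * entropy(left) + len(right) * entropy(right)) / n
            if gain > best[0]:
                best = (gain, f, t)
    return best

def build_tree(rows, labels, factors, min_gain=0.05):
    """Grow the tree until nodes are pure or no split reaches `min_gain`."""
    majority = Counter(labels).most_common(1)[0][0]
    gain, f, t = best_split(rows, labels, factors)
    if f is None or gain < min_gain:
        return {"predict": majority, "n": len(labels)}
    low = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    high = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return {
        "factor": f,
        "threshold": t,
        "low": build_tree([r for r, _ in low], [y for _, y in low], factors, min_gain),
        "high": build_tree([r for r, _ in high], [y for _, y in high], factors, min_gain),
    }

def predict(tree, row):
    """Follow the branches until a terminal node is reached."""
    while "predict" not in tree:
        tree = tree["low"] if row[tree["factor"]] <= tree["threshold"] else tree["high"]
    return tree["predict"]

def count_splits(tree):
    """Number of internal (splitting) nodes in the tree."""
    return 0 if "predict" in tree else 1 + count_splits(tree["low"]) + count_splits(tree["high"])

# Invented toy population, for illustration only.
rows = [{"T2": 10, "prc": 45}, {"T2": 12, "prc": 65}, {"T2": 13, "prc": 50},
        {"T2": 16, "prc": 48}, {"T2": 18, "prc": 52}, {"T2": 20, "prc": 66}]
labels = [0, 0, 0, 1, 1, 0]
for min_gain in (0.05, 0.5):
    tree = build_tree(rows, labels, ["T2", "prc"], min_gain)
    hits = sum(predict(tree, r) == y for r, y in zip(rows, labels))
    print(min_gain, count_splits(tree), hits / len(rows))
```

Raising the minimum gain prunes splits and lowers the correct prediction rate on the toy data, while lowering it grows a tree that fits every individual; the compromise discussed above lies between these two extremes.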

Figure 1 shows a classification tree which gives a correct prediction rate of 92%, using only 9 factors. The variable yielding the greatest information gain is T2, the score in the second part of the TARM test (mostly assessing the comprehension of texts taken from math and physics textbooks, in Italian and in English): the first part of TARM, T1, is the same for all the undergraduate courses in the Faculty of Sciences and assesses basic mathematical skills, while T2 is considered as “specific” for the math curriculum.


Figure 1: classification tree with CFU1 as predicted variable.

The T2 score ranges from 0 to 30, and the split value determined by the algorithm (14.5) shows that the test was well balanced. Students on the right branch (scoring more than 14.5) are subsequently discriminated by “prc”, that is the perceived sense of responsibility (Savickas et al., 2009): if the latter is not very high (prc<60.44), then the next split is relative to the factor smt2 (possibility to observe and imitate effective models). If smt2<62.95, it predicts “success” (CFU1 ≥ 21). If prc is very high, instead, the variable adp4 (the ability of establishing positive relationships and cooperating with others) intervenes. Notice that all measures of affective factors have been rescaled so that 50±10 corresponds to the mean ± one standard deviation (observed for a suitable reference population). The digits 0 and 1 at the bottom of terminal branches mean that the tree prediction for that branch is CFU1 < 21 or CFU1 ≥ 21, respectively. For each terminal group, we have indicated how many individuals of the original population are correctly classified (“T”) and incorrectly classified (“F”) by the tree.
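
The rescaling of the affective measures mentioned above can be sketched as a linear transformation. The sketch assumes that the reference mean and standard deviation are known; the function names and the raw scores are hypothetical, and the text does not say whether a population or a sample standard deviation was used.

```python
def rescale(raw_scores, ref_mean, ref_sd):
    """Map raw scores so that the reference mean becomes 50 and one reference SD becomes 10."""
    return [50 + 10 * (x - ref_mean) / ref_sd for x in raw_scores]

def sd_units(rescaled_score):
    """Distance of a rescaled score from the reference mean, in standard deviations."""
    return (rescaled_score - 50) / 10

# Invented raw scores, with an assumed reference mean of 4.0 and SD of 2.0.
print(rescale([4.0, 6.0, 2.0], ref_mean=4.0, ref_sd=2.0))

# Reading the split values of figure 1 on this scale.
print(sd_units(60.44), sd_units(62.95))
```

Read on this scale, the split value prc = 60.44 lies about one standard deviation above the reference mean, which is consistent with describing the students beyond it as having a "very high" sense of responsibility.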


Going back to the root branching on T2, the variable with the greatest information gain on the left branch (i.e., for students who scored less than 15 in T2) is the undergraduate curriculum: “MAT” (traditional math curriculum) versus “MFA” (applied math for finance and insurance). Here, we are not regarding the choice between the two curricula as an achievement factor: however, this datum should be included because reaching 21 CFU may have a different significance in the two curricula. It turns out that the difference affects only students with a low T2 score. Among these, MFA students are further classified by variables adp2 (inclination to consider oneself as responsible for one’s own professional future), and st4 (writing ability). As regards MAT students with low T2, the variable smt2 again plays a fundamental role.

The remaining branches of the tree should be read along the same lines. However, this should not be assumed to describe the actual process which determines the academic achievement: it rather indicates that, for instance, students getting a good score in T2 and nevertheless failing to reach 21 CFU could be singled out (with reasonable accuracy) by considering a specific combination of sense of responsibility, previous availability of effective models as a source of mathematical self-efficacy, and adaptability. To relate this predictive evidence to a causal process, further investigation is required: in particular, one should compare patterns emerging from this exploratory analysis with models proposed by current research on affective factors.

To give further examples of the method, we have displayed in figures 2 and 3 two more trees, both referring to the decision to attend the second year as the target variable. Figure 2 shows the tree that considers the same input variables as the tree in figure 1, whilst the tree in figure 3 also takes into account the variable CFU1, which was the target in the tree of figure 1. Cognitive and affective aspects intervene in these three distinct processes in different ways. In the analysis above, we have shown that after the cognitive-based variable T2 the affective aspects play a crucial role. In figure 3, instead, the first split is determined by CFU1, and cognitive-based variables play a key role.


Figure 2: classification tree predicting continuation/drop-out on the basis of the factors measured at the beginning of the first year.


Figure 3: classification tree predicting continuation/drop-out on the basis of the same factors and of the first year total credits.


DISCUSSION AND FURTHER DEVELOPMENTS

What are we learning from classification trees?

The first branching in each classification tree concerns cognitive variables, then the affective factors emerge. They emerge both for students who are likely to take the degree and for students who are likely to drop out, but in different manners. In fact, for the first group, we observed that a high T2, a not too high sense of responsibility (prc), and a low availability of effective models (smt2) predict “success”. The same variable smt2 intervenes also when T2 is low, but here it predicts “failure” when it is low. This is a phenomenon which would never be observed in traditional correlation or multiple regression analysis. Whether this ambivalent (predictive) role of a single factor corresponds to a causal role or not is unclear for now. Yet, such phenomena are not manifestly absurd: an excessive sense of responsibility and self-comparison with effective models could actually be negative factors for academic achievement (figure 1); in turn, high adaptability is expected to be a positive factor for academic achievement, but could also lead students experiencing difficulties to decide more easily to change to a different curriculum (adp1 and adp4 in figure 2).
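
The ambivalent role of a single factor can be made concrete with a deliberately extreme, fully synthetic example (not the study data): when a low smt2 goes with success among high-T2 students and with failure among low-T2 students, the overall linear correlation between smt2 and success can vanish even though smt2 is perfectly informative within each branch. All names and values below are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Synthetic students: (t2_high, smt2, success), ten per combination.
students = []
for t2_high in (True, False):
    for smt2 in (40, 60):
        # High T2: low smt2 goes with success. Low T2: low smt2 goes with failure.
        success = int((smt2 == 40) if t2_high else (smt2 == 60))
        students += [(t2_high, smt2, success)] * 10

def corr_smt2_success(group):
    """Correlation between smt2 and success within a group of students."""
    return pearson([s[1] for s in group], [s[2] for s in group])

overall = corr_smt2_success(students)
high_t2 = corr_smt2_success([s for s in students if s[0]])
low_t2 = corr_smt2_success([s for s in students if not s[0]])
print(overall, high_t2, low_t2)
```

In this caricature the two opposite effects cancel in the pooled population, whereas a tree that first splits on T2 recovers them separately in its two branches.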

Our study, along with Maggiani (2011), confirms that the transition from high school to university is a personal process where, beyond learning skills and motivation, self-beliefs, locus of control and adaptability interact in complex and "nonlinear" patterns, which cannot be explored by traditional correlation analysis.

The classification tree methodology

The first aspect we bring to light is that in our study we used the results from previous years (since 2001/02) to draw inferences about the present (and the future) students enrolled in the mathematics undergraduate course. This implies an overarching assumption: the situation of students ten years ago is the same as that of today’s students. The term ‘situation’ has to be understood in a wide sense, in order to take into account socio-economic, psychological, and cognitive aspects. Although we are aware of the changes our educational system, as well as our country and our society, has gone through in the last ten years, we claim that some facts are still worth considering. In fact, if we look at the classification tree which contains information about enrolment in the second year, we can see a confirmation of this.


The methodology of the classification tree can be seen not only as a way to disentangle a complicated (and sometimes contradictory) picture, but also as a generator of research questions, a way to bring to light issues that need further elaboration. It provides the researcher with an articulated frame, and the researcher has to make sense of it, without relying on constraining assumptions such as linear correlations among variables.

Moreover, we can notice that a certain amount of ‘noise’ is present in our analysis. To us, variation is good. We do not believe that the future of undergraduate students may be completely predicted by means of tests (neither cognitive-based, nor affective-based), or other tools. Individual unpredictability is, in our opinion, neither a limitation nor a mere matter of fact, but the reason to believe that each student is both responsible and free. Our research may serve as a source of information for the students to be aware of what is likely to happen in certain circumstances, as well as a warning for professors and administrative staff who care about the students’ career, for any reason.

General considerations on the research

We conclude this paper with some general considerations. The first one concerns the role of the items used to collect data: the picture that emerges from the analysis depends on the data that have been collected. Items are not “neutral” to the research, nor are the assumptions that lie in the background of the methodology used.

Among the items, the affective-based ones have a significant role. As expected, the first split is determined by cognitive factors, but the affective aspects contribute to delineating a varied and multifaceted landscape. Without them, the trees would have stopped after very few steps. In other words, this research contributes to showing that affect-related issues are of crucial importance in the learning processes, considered in a wide perspective.

A feature of this research is that we have taken into account psychological items that are “generic”. An open question might regard the possibility of using items that are “specific” to mathematics.

References

Andrà, C., Magnano, G., & Morselli, F. (2012). Drop-out undergraduate students in mathematics: an exploratory study. In: Current state of research on mathematical beliefs XVII. Proceedings of the MAVI-17 conference, September 17-20, 2011, pp. 13-22. Ruhr-Universität Bochum, Germany.

Gòmez-Chacòn, I., Garcìa Madruga, J.A., Rodrìguez, R., Vila, J.O., & Elosùa, R. (2012). Mathematical beliefs and cognitive reflection: do they predict academic achievement? In: B. Roesken & M. Casper (Eds.), Current state of research on mathematical beliefs XVII. Proceedings of the MAVI-17 conference, September 17-20, 2011, pp. 64-73. Ruhr-Universität Bochum, Germany.

Griese, B., Glasmachers, E., Harterich, J., Kallweit, M., & Roesken, B. (2012). Engineering students and their learning of mathematics. In: B. Roesken & M. Casper (Eds.), Current state of research on mathematical beliefs XVII. Proceedings of the MAVI-17 conference, September 17-20, 2011, pp. 85-96. Ruhr-Universität Bochum, Germany.

Hannula, M.S. (2004). Affect in mathematical thinking and learning. Turku, Finland: Annales universitatis Turkuensis B 273.

Hannula, M.S. (2011). The structure and dynamics of affect in mathematical thinking and learning. In: M. Pytlak, E. Swoboda, & T. Rowland (Eds.), Proceedings of the CERME7. Rzeszów, Poland. 9-13 February 2011, pp. 34-60. University of Rzeszów, Poland: CERME.

Hannula, M.S., Kaasila, R., Laine, A., & Pehkonen, E. (2006). The structure of student teacher’s view of mathematics at the beginning of their studies. In: M. Bosch (Ed.), Proceedings of the CERME4. Sant Feliu de Guixols, Spain. 17-20 February 2005, pp. 205-214. Fundemi IQS, Universitat Ramon Llull: CERME.

Ma, X., & Kishor, N. (1997). Attitude toward self, social factors, and achievement in mathematics: a meta-analytic review. Educational Psychology Review, 9, 89-120.







Ma, X., & Xu, J. (2004). Determining the causal ordering between attitude toward mathematics and achievement in mathematics. American Journal of Education, 110(May), 256-280.

Maggiani, C. (2011). La transizione dalla scuola secondaria all’Università: il caso degli studenti del corso di laurea in Matematica. Unpublished manuscript. Tesi di Laurea, University of Genova (Italy).

Roesken, B., Hannula, M.S., & Pehkonen, E. (2011). Dimensions of students’ views of themselves as learners of mathematics. ZDM. The International Journal on Mathematics Education. DOI: 10.1007/s11858-011-0315-8

Rylands, L., & Coady, C. (2009). Performance of students with weak mathematics in first-year mathematics and science. International Journal of Mathematical Education in Science and Technology, 40(6), 741-753.

Savickas, M., Nota, L., Rossier, J., Dauwalder, J.P., Duarte, M.E., Guichard, J., Soresi, S., van Esbroeck, R., & van Vianen, A.E.M. (2009). Life designing: A paradigm for career construction in the 21st century. Journal of Vocational Behavior, 75(3), 239-250.

Usher, E.L., & Pajares, F. (2009). Sources of self-efficacy in mathematics: A validation study. Contemporary Educational Psychology, 34, 89-101.

Zimmerman, B.J., & Kitsantas, A. (2005). Students' perceived responsibility and completion of homework: The role of self-regulatory beliefs and processes. Contemporary Educational Psychology, 30, 397-417.