Assessing robustness of inference in symmetrical nonlinear regression models

Luis H. Vanegas

Universidad Nacional de Colombia

Luz M. Rondon

Universidad Nacional de Colombia

Francisco José A. Cysneiros

Universidad Federal de Pernambuco

Reporte Interno de Investigación No. 16

Departamento de Estadística

Facultad de Ciencias

Universidad Nacional de Colombia

Bogotá, COLOMBIA

Luis Hernando Vanegas (a,∗), Luz Marina Rondon (a), Francisco José A. Cysneiros (b)

(a) Universidad Nacional de Colombia, Facultad de Ciencias, Departamento de Estadística - Carrera 30 No. 45-03, Bogotá - Colombia

(b) Departamento de Estatística, CCEN-UFPE - Cidade Universitária - Recife, PE - Brazil 50740-540

ABSTRACT

This paper describes how diagnostic procedures were derived for symmetrical nonlinear regression models, continuing the work carried out by Cysneiros and Vanegas (2008) and Vanegas and Cysneiros (2010), who showed that the parameter estimates in nonlinear models are more robust with heavy-tailed than with normal errors. In this paper, we focus on assessing whether the robustness of this kind of model is also observed in the inference process (i.e. the partial F-test). Symmetrical nonlinear regression models include all symmetric continuous distributions for the errors, covering both light- and heavy-tailed distributions such as Student-t, logistic-I and -II, power exponential, generalized Student-t, generalized logistic and contaminated normal. Firstly, a statistical test is presented for evaluating the assumption that the error terms all have equal variance. The results of a simulation study describing the behaviour of the proposed heteroscedasticity test in the presence of outliers are then given. To assess the robustness of the inference process, we present the results of a simulation study describing the behaviour of the partial F-test in the presence of outliers. Also, some diagnostic procedures are derived to identify influential observations on the partial F-test. A dataset described in Venables and Ripley (2002) is also analysed. Diagnostic analysis indicates that a power exponential nonlinear model seems to fit the data better than other symmetrical nonlinear models.

Key words: Symmetric distribution, heavy-tailed error, testing heteroscedasticity, partial F-test, robust model.

1 INTRODUCTION

It is well known that normal linear and nonlinear regression models can be highly influenced by extreme observations in the response variable (see Cook and Weisberg (1982), Barnett and Lewis (1994)). As an alternative, models can be considered in which the error distribution has heavier tails than the normal; such models can accommodate these observations and reduce and control their influence on the parameter estimates and on statistical inference. Examples of distributions with heavier tails than the normal include the Student-t, logistic-II and power exponential (with positive index parameter). The symmetrical regression model framework can be used to fit models where the errors follow distributions of this kind; it covers both light- and heavy-tailed error distributions. Considerable contributions have been made to symmetrical regression models in recent years. For instance, Cordeiro, Ferrari, Uribe-Opazo and Vasconcellos (2000) corrected maximum likelihood estimates in symmetrical nonlinear regression models, while Galea, Paula and Uribe-Opazo (2003) developed local influence measures in symmetrical linear regression models. Cysneiros and Paula (2004) proposed restricted tests in linear models with multivariate t distribution. Cordeiro (2004) corrected likelihood ratio tests in symmetrical nonlinear regression models; Galea, Paula and Cysneiros (2005) developed a standardized residual and proposed some local influence measures in symmetrical nonlinear regression models; and Cysneiros and Paula (2005) proposed restricted tests in symmetrical linear regression models. Cysneiros, Paula and Galea (2007) developed estimation procedures and diagnostic measures in heteroscedastic symmetrical linear regression models. Cysneiros and Vanegas (2008) proposed standardized residuals for symmetrical nonlinear regression models and carried out an analytical and empirical study to describe the behaviour of these residuals. Vanegas and Cysneiros (2010) developed diagnostic procedures based on case-deletion and mean-shift outlier models in symmetrical nonlinear regression models, showing that, in the presence of outliers in the response variable, the parameter estimates in nonlinear models are more robust with heavy-tailed than with normal errors. In this paper, we focus on assessing whether the robustness of this kind of model is also observed in the inference process.

∗ Corresponding author. Email address: lhvanegasp@unal.edu.co (Luis Hernando Vanegas).

This paper deals with diagnostic procedures for symmetrical nonlinear regression models, thereby continuing the work carried out in Cysneiros and Vanegas (2008) and Vanegas and Cysneiros (2010). A statistical test for heteroscedasticity is proposed. Also, some diagnostic procedures are described for identifying influential observations on the partial F-test. The paper is organized as follows. Section 2 introduces the class of symmetrical nonlinear regression models. Section 3 describes a statistical test for evaluating the presence of heteroscedasticity in the error terms and presents the results of a simulation study illustrating the behaviour of the proposed test in the presence of outliers. To assess the robustness of the inference process, Section 4 presents the results of a simulation study describing the behaviour of the partial F-test in the presence of extreme observations in the response variable. In addition, it describes some diagnostic procedures, based on the convergence of the iterative parameter estimation process and on the case-deletion model, for identifying influential observations on the partial F-test. Section 5 analyses a dataset described in Venables and Ripley (2002), to which the diagnostic procedures are applied. This dataset consists of 52 obese patients on a weight reduction programme who tended to lose adipose tissue at a diminishing rate; weight is measured in kilograms and the time since the start of the programme in days. Section 6 presents some concluding remarks.

2 SYMMETRICAL NONLINEAR REGRESSION MODELS

Suppose that $Y_1,\ldots,Y_n$ are $n$ independent random variables whose density function is given by

\[
f_{Y_i}(y) = \frac{1}{\sqrt{\phi}}\, g\{(y-\mu_i)^2/\phi\}, \quad y \in \mathbb{R},
\]

with $\mu_i \in \mathbb{R}$ and $\phi > 0$ the location and dispersion parameters, respectively. The function $g:\mathbb{R}\to[0,\infty)$ is such that $\int_0^\infty g(u)\,du < \infty$ and is typically known as the density generator. For instance, for the Student-t distribution with $\nu$ degrees of freedom we have $g(u) \propto (\nu+u)^{-(\nu+1)/2}$, and for the power exponential distribution with index parameter $-1 < k \le 1$ we have $g(u) \propto \exp(-\tfrac{1}{2}u^{1/(1+k)})$. We denote $Y_i \sim S(\mu_i,\phi,g)$. The symmetrical nonlinear regression model is defined as

\[
Y_i = \mu(\beta; x_i) + \epsilon_i, \quad i = 1,\ldots,n,
\]

where $\mu_i(\beta) = \mu(\beta; x_i)$ is an injective and twice differentiable function with respect to $\beta = (\beta_1,\ldots,\beta_p)^T$. We also suppose that the derivative matrix $D_\beta = \partial\mu/\partial\beta$ has rank $p$ ($p < n$) for all $\beta \in \Omega_\beta \subset \mathbb{R}^p$, with $\Omega_\beta$ a compact set with interior points, $\epsilon_i \sim S(0,\phi,g)$ and $x_i$ the vector of explanatory variables. The characteristic function of $Y_i$ can be expressed as $\varsigma_{Y_i}(t) = E(e^{itY_i}) = e^{it\mu_i}\varphi(t^2\phi)$, $t \in \mathbb{R}$, for some function $\varphi(\cdot)$ with $\varphi(u) \in \mathbb{R}$ for $u > 0$. When they exist, $E(Y_i) = \mu_i$ and $Var(Y_i) = \xi\phi$, where $\xi > 0$ is a constant given by $\xi = -2\varphi'(0)$ with $\varphi'(0) = \{\partial\varphi(u)/\partial u\}_{u=0}$ (see, for instance, Fang, Kotz and Ng (1990)). For example, for the power exponential distribution with index parameter $k$ we have $\xi = 2^{1+k}\,\Gamma[3(1+k)/2]/\Gamma[(1+k)/2]$, with $\Gamma(\cdot)$ the Gamma function. All extra parameters are considered to be known or fixed in this paper.
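As a quick numerical illustration of the variance constant, the sketch below evaluates $\xi$ for the power exponential family from the closed form quoted above. The function name `xi_power_exponential` is ours and purely illustrative; only the formula itself comes from the text.

```python
import math


def xi_power_exponential(k: float) -> float:
    """Variance constant xi = 2^(1+k) * Gamma[3(1+k)/2] / Gamma[(1+k)/2].

    Only the index range -1 < k <= 1 quoted in the text is accepted.
    """
    if not -1.0 < k <= 1.0:
        raise ValueError("index parameter k must satisfy -1 < k <= 1")
    a = (1.0 + k) / 2.0
    return 2.0 ** (1.0 + k) * math.gamma(3.0 * a) / math.gamma(a)


if __name__ == "__main__":
    # Var(Y_i) = xi * phi, so xi shows how the index k inflates the variance.
    for k in (-0.5, 0.0, 0.5, 1.0):
        print(f"k = {k:+.1f}   xi = {xi_power_exponential(k):.4f}")
```

For $k = 0$ the generator reduces to the normal case and the function returns 1, so $\phi$ is then the error variance itself; positive $k$ gives $\xi > 1$, in line with the heavier tails.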

The log-likelihood function for the parameter vector $\theta = (\beta^T,\phi)^T$ is given by

\[
L(\theta; y) = \sum_{i=1}^{n} l(y_i;\mu_i;\phi) = -\frac{n}{2}\log\phi + \sum_{i=1}^{n}\log[g(z_i^2)],
\]

where $z_i = \phi^{-1/2}\{y_i - \mu_i(\beta)\}$. The score functions for $\beta$ and $\phi$ have, respectively, the forms $U_\beta(\theta) = \phi^{-1}D_\beta^T D(v)(y-\mu)$ and $U_\phi(\theta) = (2\phi)^{-1}\{\phi^{-1}Q_V(\beta,\phi) - n\}$, with $\mu = (\mu_1,\ldots,\mu_n)^T$, $y = (y_1,\ldots,y_n)^T$ the observed responses, $D(v) = \mathrm{diag}\{v_1,\ldots,v_n\}$ with $v_i = v(z_i) = -2\,g'(z_i^2)/g(z_i^2)$, $g'(z_i^2) = \{\partial g(u)/\partial u\}_{u=z_i^2}$ and $Q_V(\beta,\phi) = (y-\mu)^T D(v)(y-\mu)$. The Fisher information matrix for $\theta$ can be expressed as $K_{\theta\theta} = \mathrm{diag}\{K_{\beta\beta},K_{\phi\phi}\}$, where $K_{\beta\beta} = \frac{4d_g}{\phi}D_\beta^T D_\beta$ and $K_{\phi\phi} = \frac{n}{4\phi^2}(4f_g - 1)$, with $4d_g = E(v^2(z)z^2)$, $4f_g = E(v^2(z)z^4)$ and $z \sim S(0,1,g)$. Thus, the maximum likelihood estimators of $\beta$ and $\phi$ are asymptotically independent. For the power exponential distribution with index parameter $k$ we found $4d_g = 2^{1-k}\,\Gamma[(3-k)/2]\,(1+k)^{-2}/\Gamma[(k+1)/2]$ and $4f_g = (k+3)/(k+1)$. Some expressions for $v(\cdot)$, $d_g$ and $f_g$ for symmetrical distributions can be found in Cysneiros and Paula (2005). The maximum likelihood estimate of $\theta$, $\hat\theta = (\hat\beta^T,\hat\phi)^T$, can be obtained by solving $U(\hat\theta) = (U_\beta^T(\hat\theta), U_\phi(\hat\theta))^T = 0$. Iterative procedures such as Newton-Raphson, BFGS and Fisher scoring can be used. The iterative process for $\hat\theta$ takes the form

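\[
\beta^{(m+1)} = \left(D_\beta^{(m)T}D_\beta^{(m)}\right)^{-1}D_\beta^{(m)T}\left\{D_\beta^{(m)}\beta^{(m)} + D(\rho^{(m)})\left[y - \mu(\beta^{(m)})\right]\right\} = \left(D_\beta^{(m)T}D_\beta^{(m)}\right)^{-1}D_\beta^{(m)T}\tilde{Z}^{(m)}, \tag{1}
\]

\[
\phi^{(m+1)} = \frac{1}{n}\,Q_V(\beta^{(m+1)},\phi^{(m)}), \quad m = 0,1,2,\ldots,
\]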

with $D(\rho) = \mathrm{diag}\{\rho(z_1),\ldots,\rho(z_n)\}$ and $\rho(z_i) = v(z_i)/4d_g$. If the function $g(u)$ is monotonically decreasing for $u > 0$, then $v(z) > 0$ and $\rho(z) > 0$ for all $z \in \mathbb{R}$. Also, $v(z) = v(-z)$ and $\rho(z) = \rho(-z)$ for all $z \in \mathbb{R}$. When the model error has a heavy-tailed distribution, the weights $\rho(z_i)$ and $v(z_i)$, which enter the updates of $\beta$ and $\phi$ respectively, take small values when $|z_i|$ is large. Thus, models with heavy-tailed distributions can reduce the influence of extreme observations, whereas in the normal nonlinear regression model the weights are equal for all observations and, consequently, the estimates are more sensitive to extreme observations. It is easy to show that for the Student-t, logistic-II and power exponential ($k > 0$) distributions the values of $v(z)$ and $\rho(z)$ decrease as $|z|$ grows. For example, for the power exponential distribution with index parameter $k$ we have $v(z) = (1+k)^{-1}|z|^{-2k/(k+1)}$.

$\hat\beta$ is a consistent estimator of $\beta$ and $\sqrt{n}(\hat\beta - \beta) \xrightarrow{d} N_p(0, J_{\beta\beta}^{-1})$ under suitable regularity conditions (see Cox and Hinkley (1974)), where $J_{\beta\beta} = \lim_{n\to\infty}\frac{1}{n}K_{\beta\beta}$. Then, $\hat{K}_{\beta\beta}^{-1} = \frac{\hat\phi}{4d_g}(D_{\hat\beta}^T D_{\hat\beta})^{-1}$ is a consistent estimator of the asymptotic variance-covariance matrix of $\hat\beta$. Also, $\hat\phi$ is a consistent estimator of $\phi$ and $\sqrt{n}(\hat\phi - \phi) \xrightarrow{d} N(0, J_{\phi\phi}^{-1})$, where $J_{\phi\phi} = \lim_{n\to\infty}\frac{1}{n}K_{\phi\phi}$. Then, $\hat{K}_{\phi\phi}^{-1} = \frac{4\hat\phi^2}{n(4f_g - 1)}$ is a consistent estimator of the asymptotic variance of $\hat\phi$. The regularity conditions do not hold for some symmetric distributions such as the Kotz, generalized Kotz and double exponential (Cordeiro, Ferrari, Uribe-Opazo and Vasconcellos, 2000).
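The iterative scheme (1) can be sketched in a few lines. The code below is an illustration under stated assumptions, not the authors' implementation: it uses the Michaelis-Menten mean function that appears later in the simulation study, Student-t weights $v(z) = (\nu+1)/(\nu+z^2)$ obtained from the generator given above, and the constant $4d_g = (\nu+1)/(\nu+3)$, which follows from the definition $4d_g = E(v^2(z)z^2)$ but is not stated in the text. All function names, the starting values and the small noise level are hypothetical choices made so that the plain iteration behaves well; no step-halving or other safeguard is included.

```python
import numpy as np


def mm_mean(beta, x):
    """Michaelis-Menten mean mu_i = beta1 * x_i / (beta2 + x_i)."""
    return beta[0] * x / (beta[1] + x)


def mm_jacobian(beta, x):
    """n x 2 derivative matrix D_beta = d mu / d beta."""
    return np.column_stack([x / (beta[1] + x),
                            -beta[0] * x / (beta[1] + x) ** 2])


def student_t_weight(nu):
    """Return the weight function v(z) and the constant 4*d_g (assumed form)."""
    return (lambda z: (nu + 1.0) / (nu + z ** 2)), (nu + 1.0) / (nu + 3.0)


def fit_symmetrical(x, y, beta0, v, four_dg, max_iter=500, tol=1e-10):
    """Iterate scheme (1) until (beta, phi) change by less than tol."""
    beta = np.asarray(beta0, dtype=float)
    phi = float(np.mean((y - mm_mean(beta, x)) ** 2))  # crude start for phi
    for _ in range(max_iter):
        resid = y - mm_mean(beta, x)
        rho = v(resid / np.sqrt(phi)) / four_dg
        D = mm_jacobian(beta, x)
        # beta update: (D'D)^{-1} D' {D beta + D(rho) (y - mu)}
        beta_new = beta + np.linalg.solve(D.T @ D, D.T @ (rho * resid))
        resid_new = y - mm_mean(beta_new, x)
        # phi update: Q_V(beta_new, phi_old) / n
        phi_new = float(np.mean(v(resid_new / np.sqrt(phi)) * resid_new ** 2))
        done = (np.max(np.abs(beta_new - beta)) < tol
                and abs(phi_new - phi) < tol)
        beta, phi = beta_new, phi_new
        if done:
            break
    return beta, phi


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.uniform(0.0, 100.0, 40)
    y = mm_mean([1.0, 6.41], x) + rng.normal(0.0, 0.01, 40)  # illustrative noise
    v, four_dg = student_t_weight(4.0)
    print("weights v(z) at z = 0, 2, 5:", v(np.array([0.0, 2.0, 5.0])))
    print("estimates:", fit_symmetrical(x, y, [1.0, 6.0], v, four_dg))
```

Passing `v = lambda z: np.ones_like(z)` with `four_dg = 1.0` recovers the normal model, where every observation receives the same weight and the $\beta$ update is the usual Gauss-Newton step.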

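Vanegas and Cysneiros (2010) have shown that heavy-tailed models can be more robust against extreme observations than the normal model. They also developed diagnostic procedures based on case-deletion and mean-shift outlier models. For instance, they found that the one-step approximation $\hat\theta^I_{(i)}$ of $\hat\theta_{(i)}$, the estimate of $\theta$ obtained when the $i$-th observation is deleted, can be expressed as

\[
\hat\theta^I_{(i)} = \begin{pmatrix}\hat\beta^I_{(i)} \\ \hat\phi^I_{(i)}\end{pmatrix} = \begin{pmatrix}\hat\beta - \hat\phi^{1/2}\rho(\hat z_i)\hat z_i\,(D_{\hat\beta}^T D_{\hat\beta})^{-1}\hat d_i/(1-\hat h_{ii}) \\ \hat\phi - 2\hat\phi\,(v(\hat z_i)\hat z_i^2 - 1)/\{(n-1)(4f_g - 1)\}\end{pmatrix},
\]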

where $\hat h_{ii} = \hat d_i^T(D_{\hat\beta}^T D_{\hat\beta})^{-1}\hat d_i$ is the $(i,i)$-th element of $\hat H = D_{\hat\beta}(D_{\hat\beta}^T D_{\hat\beta})^{-1}D_{\hat\beta}^T$, $\hat d_i^T$ is the $i$-th row of $D_{\hat\beta}$ and $D_{\hat\beta} = \{D_\beta\}_{\beta=\hat\beta}$. To identify influential observations we can use the following measures for $\hat\beta_j$ ($j = 1,\ldots,p$) and $\hat\phi$, respectively,

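\[
t_{\beta_j,i} = \frac{\hat\beta_j - \hat\beta^I_{j(i)}}{\sqrt{\widehat{Var}(\hat\beta_j)}} = \frac{\sqrt{(4d_g)\hat h_{ii}}\;\rho(\hat z_i)\hat z_i}{1-\hat h_{ii}}\,\hat\psi_{j,i} \tag{2}
\]

and

\[
t_{\phi,i} = \frac{\hat\phi - \hat\phi^I_{(i)}}{\sqrt{\widehat{Var}(\hat\phi)}} = \sqrt{\frac{n}{4f_g - 1}}\;\frac{v(\hat z_i)\hat z_i^2 - 1}{n-1}, \tag{3}
\]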

where $\hat\beta^I_{j(i)} = a_j^T\hat\beta^I_{(i)}$,

\[
\psi_{j,i} = \frac{a_j^T(D_\beta^T D_\beta)^{-1}d_i}{\sqrt{a_j^T(D_\beta^T D_\beta)^{-1}a_j}\,\sqrt{h_{ii}}}
\]

is the linear correlation coefficient between $\hat\beta_j$ and $d_i^T\hat\beta$, and $a_j = (a_1,\ldots,a_p)^T$ with $a_j = 1$ and $a_s = 0$ for all $s \neq j$. We have that $GD_{\beta_j,i} = t^2_{\beta_j,i}$ and $GD_{\phi,i} = t^2_{\phi,i}$, with $GD_{\beta_j,i}$ and $GD_{\phi,i}$ the univariate versions of the Generalized Cook Distance for $\hat\beta_j$ and $\hat\phi$, respectively (see Vanegas and Cysneiros (2010)). Large values of $GD_{\beta_j,i}$ or $GD_{\phi,i}$ indicate that the $i$-th observation has a disproportionate influence on $\hat\beta_j$ or $\hat\phi$, respectively.
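The closed-form right-hand sides of (2) and (3) need only the derivative matrix, the standardized residuals and the weight function, so all $n$ values can be computed at once. The following sketch assumes these quantities are already available from a fitted model; the function name `influence_measures` and its argument layout are hypothetical.

```python
import numpy as np


def influence_measures(D, z, v, four_dg, four_fg):
    """Closed forms (2) and (3) for every observation.

    D       : n x p derivative matrix evaluated at the estimate
    z       : length-n standardized residuals
    v       : weight function v(z)
    Returns : (t_beta, t_phi), with t_beta of shape n x p (column j holds
              t_{beta_j, i}) and t_phi of length n.
    """
    n = D.shape[0]
    C = np.linalg.inv(D.T @ D)                 # (D'D)^{-1}
    A = D @ C                                  # row i holds d_i' (D'D)^{-1}
    h = np.einsum("ij,ij->i", A, D)            # leverages h_ii
    rho = v(z) / four_dg
    # psi[i, j]: correlation between beta_hat_j and d_i' beta_hat
    psi = A / np.sqrt(np.outer(h, np.diag(C)))
    t_beta = (np.sqrt(four_dg * h) * rho * z / (1.0 - h))[:, None] * psi
    t_phi = np.sqrt(n / (four_fg - 1.0)) * (v(z) * z ** 2 - 1.0) / (n - 1.0)
    return t_beta, t_phi


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    D = rng.normal(size=(20, 2))               # stand-in derivative matrix
    z = rng.normal(size=20)                    # stand-in residuals
    # Normal model: v = 1, 4 d_g = 1, 4 f_g = 3 (power exponential with k = 0).
    t_beta, t_phi = influence_measures(D, z, lambda u: np.ones_like(u), 1.0, 3.0)
    gd_beta, gd_phi = t_beta ** 2, t_phi ** 2  # generalized Cook distances
    print("largest GD for beta_1 at observation", int(np.argmax(gd_beta[:, 0])))
    print("largest GD for phi at observation", int(np.argmax(gd_phi)))
```

Squaring the outputs gives the univariate generalized Cook distances, which are the quantities one would plot against the observation index.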

3 TESTING FOR HETEROSCEDASTICITY

Many authors have discussed the detection and testing of variance heterogeneity, for instance Cook and Weisberg (1983) for normal linear regression models, Lin and Wei (2003) for normal nonlinear regression models and Cysneiros, Paula and Galea (2007) for symmetrical linear regression models. We now evaluate the standard model assumption that the error terms all have equal variance. For this, it is assumed that, in the same way as $\mu_i$, the dispersion parameter depends on a set of explanatory variables through the following structure

\[
\phi_i = \phi\,\tau(g_i,\lambda), \quad i = 1,\ldots,n,
\]

where $\tau_i(\lambda) = \tau(g_i,\lambda)$ is an injective and twice differentiable function with respect to $\lambda = (\lambda_1,\ldots,\lambda_q)^T$. It is supposed that the derivative matrix $D_\lambda = \partial\tau/\partial\lambda$ has rank $q$, with $\tau = (\tau_1,\ldots,\tau_n)^T$ and $g_i$ the vector of explanatory variables. Then, the score function for $\theta = (\beta^T,\lambda^T,\phi)^T$ in this model can be written as

\[
U(\theta) = \begin{pmatrix}U_\beta(\theta) \\ U_\lambda(\theta) \\ U_\phi(\theta)\end{pmatrix},
\]

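where $U_\beta(\theta) = D_\beta^T D_{(f)}(y-\mu)$, $U_\lambda(\theta) = \phi D_\lambda^T m$ and $U_\phi(\theta) = \tau^T m$, with $D_{(f)} = \mathrm{diag}\{f_1,\ldots,f_n\}$, $f_i = v(z_i)/\phi_i$, $m = (m_1,\ldots,m_n)^T$ and $m_i = (v(z_i)z_i^2 - 1)/2\phi_i$. Likewise, the Fisher information matrix for $\theta$ is $K_{\theta\theta}(\theta) = \mathrm{diag}\{K_{\beta\beta},K^*\}$, where

\[
K^* = \begin{pmatrix}K_{\lambda\lambda} & K_{\lambda\phi} \\ K_{\phi\lambda} & K_{\phi\phi}\end{pmatrix},
\]

$K_{\beta\beta} = D_\beta^T W_1 D_\beta$, $K_{\lambda\lambda} = \phi^2 D_\lambda^T W_2 D_\lambda$, $K_{\lambda\phi} = \phi D_\lambda^T W_2\tau$, $K_{\phi\phi} = \tau^T W_2\tau$, with $W_1 = 4d_g\,\mathrm{diag}\{1/\phi_1,\ldots,1/\phi_n\}$ and $W_2 = \frac{4f_g - 1}{4}\,\mathrm{diag}\{1/\phi_1^2,\ldots,1/\phi_n^2\}$. It is assumed that there exists $\lambda^\circ$ in the parameter space of $\lambda$ such that $\tau_i(\lambda^\circ) = 1$ for $i = 1,\ldots,n$. Then, for testing variance heterogeneity, the following hypotheses are evaluated:

\[
H_0: \lambda = \lambda^\circ \quad \text{vs.} \quad H_1: \lambda \neq \lambda^\circ. \tag{4}
\]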

The score test is used because it only requires estimating the model parameters under the null hypothesis, which is computationally efficient. The score test statistic for evaluating (4), denoted $\xi_\lambda$, can be expressed as

\[
\xi_\lambda = \left\{U_\lambda^T(\theta)\,Var(\hat\lambda)\,U_\lambda(\theta)\right\}_{\theta=\hat\theta^\circ}, \tag{5}
\]

where $\hat\theta^\circ = (\hat\beta^T,\lambda^{\circ T},\hat\phi)^T$ is the maximum likelihood estimate of $\theta$ under $H_0$ (i.e. constant variance for the error terms) and $Var(\hat\lambda)$ is the appropriate submatrix of $K_{\theta\theta}^{-1}(\theta)$. Asymptotically and under $H_0$, $\xi_\lambda$ follows a chi-square distribution with $q$ degrees of freedom. Large values of $\xi_\lambda$ suggest evidence of heteroscedasticity.

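Theorem 1 The score test statistic for evaluating hypothesis (4) can be expressed as

\[
\xi_\lambda = \frac{1}{(4f_g - 1)}\left\{\hat m^{*T}\hat m^*\right\}_{\theta=\hat\theta^\circ}, \tag{6}
\]

where $\hat m^* = H^*_\lambda m^*$, $H^*_\lambda = D^*_\lambda\left(D^{*T}_\lambda D^*_\lambda\right)^{-1}D^{*T}_\lambda$, $D^*_\lambda = \left(I - \frac{11^T}{n}\right)D_\lambda$, $m^* = (m^*_1,\ldots,m^*_n)^T$ and $m^*_i = v(z_i)z_i^2$.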

This result extends the heteroscedasticity test developed by Cook and Weisberg (1983) and Lin and Wei (2003) for normal regression models to symmetrical nonlinear regression models. When the model error has a heavy-tailed distribution, the weights $v(\hat z_i)$ in $m^*_i$ take small values when $|\hat z_i|$ is large, whereas in normal regression $v(\hat z_i) = 1$ for all observations. Thus, models whose error has a heavy-tailed distribution can reduce the influence of extreme observations on $\xi_\lambda$. Cook and Weisberg (1983) suggested two options for the functional form of $\tau_i(\lambda)$:

i)

\[
\tau_i = \exp(g_i^T\lambda), \quad i = 1,\ldots,n. \tag{7}
\]

In this case $\lambda^\circ = 0$ and $D_\lambda = D_{(\tau)}G$, with $D_{(\tau)} = \mathrm{diag}\{\tau_1,\ldots,\tau_n\}$ and $G = (g_1,\ldots,g_n)^T$. Then, $\xi_\lambda$ is given by expression (6), where $H^*_\lambda = G^*(G^{*T}G^*)^{-1}G^{*T}$ and $G^* = \left(I - \frac{11^T}{n}\right)G$.

ii)

\[
\tau_i = \prod_{j=1}^{q} g_{ij}^{\lambda_j}, \quad g_{ij} > 0, \quad i = 1,\ldots,n;\; j = 1,\ldots,q. \tag{8}
\]

Here, $\lambda^\circ = 0$ and $D_\lambda = D_{(\tau)}G_l$, with $G_l$ the matrix of natural logarithms of $G$. Then, $\xi_\lambda$ is given by expression (6), where $H^*_\lambda = G^*_l(G^{*T}_l G^*_l)^{-1}G^{*T}_l$ and $G^*_l = \left(I - \frac{11^T}{n}\right)G_l$.
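Expression (6) with either functional form amounts to projecting $m^*$ onto the centred covariate matrix and scaling the squared length of the projection. A minimal sketch, assuming the standardized residuals come from the fit under $H_0$; the function name `hetero_score_test` and the `log_form` switch are hypothetical conveniences, not part of the original development.

```python
import numpy as np


def hetero_score_test(z, v, four_fg, G, log_form=False):
    """Score statistic (6) for H0: constant dispersion.

    z        : standardized residuals from the homoscedastic fit
    v        : weight function v(z)
    G        : n x q matrix (or length-n vector) of variance covariates
    log_form : False uses form (7); True uses form (8), which needs G > 0
    """
    z = np.asarray(z, dtype=float)
    G = np.asarray(G, dtype=float).reshape(z.size, -1)
    if log_form:
        if np.any(G <= 0):
            raise ValueError("form (8) requires strictly positive covariates")
        G = np.log(G)
    G_star = G - G.mean(axis=0)                # (I - 11'/n) G
    m_star = v(z) * z ** 2
    # m_hat = H* m*, obtained by least squares instead of forming H* explicitly
    coef, *_ = np.linalg.lstsq(G_star, m_star, rcond=None)
    m_hat = G_star @ coef
    return float(m_hat @ m_hat) / (four_fg - 1.0)


if __name__ == "__main__":
    rng = np.random.default_rng(3)
    x = rng.uniform(1.0, 100.0, 40)
    z = rng.normal(size=40)                    # stand-in residuals
    ones = lambda u: np.ones_like(u)           # normal model: v = 1, 4 f_g = 3
    print("form (7):", hetero_score_test(z, ones, 3.0, x))
    print("form (8):", hetero_score_test(z, ones, 3.0, x, log_form=True))
```

The returned value would be referred to a chi-square distribution with $q$ degrees of freedom, or to an empirical critical value as in the simulation study that follows.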

3.1 Simulation study 1

Some Monte Carlo simulations were carried out to study the performance of the partial F-test in nonlinear models in the presence of outliers in the response variable. We considered the Michaelis-Menten model, which can be expressed as

\[
Y_i = \frac{\beta_1 x_i}{\beta_2 + x_i} + \epsilon_i, \quad i = 1,\ldots,n,
\]

where $\beta_1 > 0$, $\beta_2 > 0$, $x_i > 0$ and the $\epsilon_i \sim S(0,\phi,g)$ are independent and identically distributed. For the errors, we generated 10,000 samples from $N(0,\phi)$ with sample size $n = 40$. The explanatory variable $x$ was generated from a uniform distribution on the interval $(0,100)$ and its values were kept fixed throughout the simulations. The parameter values were $\beta_1 = 1$, $\beta_2 = 6.41$ and $\phi = 0.1$. To guarantee the presence of one outlier we constructed $Y^*_i = \beta^*_1 x_i/(\beta_2 + x_i) + \epsilon_i$, where $i = 6$ and $\beta^*_1 = 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7$ and $1.8$. The heteroscedasticity test statistic $\xi_\lambda$ for the two functional forms of $\tau_i$ considered in this section was calculated in each of the replications. $\delta_\alpha = \#\{\xi_\lambda > \xi_\alpha\}/100$ was then computed, being the percentage of replications in which the heteroscedasticity test suggested that the assumption of constant variance of the error terms was violated at the $100\alpha\%$ level of significance, with $\xi_\alpha$ such that $\#\{\xi_\lambda > \xi_\alpha \mid \beta^*_1 = \beta_1\}/10{,}000 = \alpha$. Tables 1 and 2 present the values of $\delta_\alpha$ for the $\alpha = 0.1$, $0.05$ and $0.01$ levels where $\tau_i$ has the functional forms given by (7) and (8), respectively, with $q = 1$ and $g_i = x_i$, $i = 1,\ldots,n$.
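The design just described has two reusable pieces: generating responses with one planted outlier, and turning a set of test statistics into the percentage $\delta_\alpha$ using an empirical critical value from the uncontaminated replications. The sketch below illustrates both. The helper names are hypothetical, the paper's observation $i = 6$ becomes index 5 in zero-based Python, and the text does not say whether the same error draws are reused across values of $\beta^*_1$, so no such choice is built in.

```python
import numpy as np


def simulate_responses(x, beta1, beta2, phi, n_rep, rng,
                       outlier_index=None, beta1_star=None):
    """n_rep x n matrix of Michaelis-Menten responses with N(0, phi) errors.

    If outlier_index is given, that observation is generated with beta1_star
    in place of beta1, as in the construction of Y*_i.
    """
    b1 = np.full(x.shape, beta1, dtype=float)
    if outlier_index is not None:
        b1[outlier_index] = beta1_star
    mu = b1 * x / (beta2 + x)
    return mu + rng.normal(0.0, np.sqrt(phi), size=(n_rep, x.size))


def rejection_percentage(stat_null, stat_contaminated, alpha):
    """delta_alpha: percentage of contaminated statistics above the empirical
    critical value, taken as the (1 - alpha) quantile of the null statistics."""
    critical = np.quantile(stat_null, 1.0 - alpha)
    return 100.0 * np.mean(np.asarray(stat_contaminated) > critical)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.uniform(0.0, 100.0, 40)            # fixed design, n = 40
    y_clean = simulate_responses(x, 1.0, 6.41, 0.1, 1000, rng)
    y_out = simulate_responses(x, 1.0, 6.41, 0.1, 1000, rng,
                               outlier_index=5, beta1_star=1.8)
    print("mean response at the planted outlier:",
          y_clean[:, 5].mean(), "vs", y_out[:, 5].mean())
```

In a full replication of the study, each row of `y_clean` and `y_out` would be fitted under every error distribution and the resulting statistics passed to `rejection_percentage`.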

It can be observed that, in all the scenarios studied, the percentage of replications in which $\xi_\lambda$ suggested that the assumption of constant variance of the error terms had been violated increased as the difference between $\beta^*_1$ and $\beta_1$ increased. However, for the normal distribution the increase occurred more quickly, which indicates that this distribution is more sensitive than the other distributions considered to extreme observations in the response variable. Additionally, the values of $\delta_\alpha$ were higher in the normal model than in the other models in all scenarios. The influence of outliers on $\xi_\lambda$ increased in the Student-t models as $\nu$, the degrees of freedom, increased. The same occurred for the power exponential models as $k$, the index parameter, decreased. This is because the Student-t distribution tends to the normal as $\nu$ tends to infinity, while the power exponential distribution tends to the normal as $k$ tends to zero. We therefore conclude that, in the presence of outliers in the response variable, the heteroscedasticity test in nonlinear models is more robust with heavy-tailed than with normal errors.

4 INFLUENTIAL OBSERVATIONS ON PARTIAL F-TEST

4.1 Simulation study 2

The simulation study that began in Section 3 is now continued. In each replication, the partial F-test statistic $F_1(1) = (\hat\beta_1 - 1)^2/\widehat{Var}(\hat\beta_1)$ was obtained for evaluating the hypothesis $H_0: \beta_1 = 1$ vs. $H_a: \beta_1 \neq 1$ in the normal, Student-t with $\nu = 4, 6, 8$ and $10$ degrees of freedom, power exponential with index parameter $k = 0.6, 0.7, 0.8$ and $0.9$, and logistic-II models. $\delta_\alpha = \#\{F_1(1) > F_\alpha\}/100$ was then computed, being the percentage of replications in which the partial F-test for $\beta_1$ suggested that $H_0$ must be rejected at the $100\alpha\%$ level of significance, with $F_\alpha$ such that $\#\{F_1(1) > F_\alpha \mid H_0 \text{ true}\}/10{,}000 = \alpha$. Table 3 presents the values of $\delta_\alpha$ for the $\alpha = 0.1$, $0.05$ and $0.01$ levels.

Table 1
Performance of the heteroscedasticity test with functional form (7) in the presence of extreme observations in the response variable

                                            $\beta^*_1 - \beta_1$
Distribution              0.1     0.2     0.3     0.4     0.5     0.6     0.7     0.8

$\delta_{0.1}$
Normal                  10.48   11.30   13.36   16.88   22.12   29.90   39.34   50.34
Student-t (ν = 4)       10.33   10.51   11.06   12.20   13.55   14.72   16.26   17.65
Student-t (ν = 6)       10.09   10.68   11.52   12.93   15.02   17.37   19.82   22.53
Student-t (ν = 8)       10.10   10.78   11.75   13.62   16.38   19.37   23.07   26.64
Student-t (ν = 10)      10.20   10.98   12.13   14.10   17.55   20.76   25.59   30.34
Power Exp. (k = 0.6)    10.52   11.50   13.45   16.55   21.61   28.79   37.99   48.01
Power Exp. (k = 0.7)    10.39   11.54   13.38   16.26   21.38   28.38   37.34   47.20
Power Exp. (k = 0.8)    10.24   11.43   13.19   15.97   21.25   27.91   36.63   46.55
Power Exp. (k = 0.9)    10.27   11.45   13.24   16.14   21.15   27.61   36.40   45.90
Logistic-II             10.19   10.85   11.81   13.73   16.92   20.72   25.65   31.15

$\delta_{0.05}$
Normal                   5.28    5.83    7.04    9.66   14.15   20.86   29.47   39.13
Student-t (ν = 4)        4.92    4.96    5.39    6.12    6.89    7.90    8.97    9.94
Student-t (ν = 6)        4.96    5.20    5.67    6.82    8.34   10.06   11.80   13.69
Student-t (ν = 8)        5.08    5.36    6.04    7.35    9.40   11.56   14.37   17.15
Student-t (ν = 10)       5.10    5.37    6.22    7.65   10.05   13.08   16.51   19.83
Power Exp. (k = 0.6)     5.12    5.57    6.78    9.23   13.17   19.05   26.78   36.54
Power Exp. (k = 0.7)     5.11    5.43    6.66    9.18   12.91   18.59   26.15   35.80
Power Exp. (k = 0.8)     5.09    5.40    6.54    9.09   12.90   18.47   25.82   35.21
Power Exp. (k = 0.9)     5.04    5.35    6.50    9.07   12.88   18.33   25.52   34.72
Logistic-II              5.05    5.26    6.06    7.48    9.72   12.92   16.75   21.02

$\delta_{0.01}$
Normal                   1.10    1.18    1.82    2.94    4.76    8.30   13.24   20.72
Student-t (ν = 4)        1.01    1.02    1.14    1.35    1.59    1.80    2.10    2.37
Student-t (ν = 6)        1.01    1.03    1.31    1.60    1.94    2.41    3.05    3.52
Student-t (ν = 8)        1.05    1.21    1.51    1.85    2.37    3.16    4.09    5.04
Student-t (ν = 10)       1.13    1.24    1.59    1.98    2.67    3.65    5.04    6.83
Power Exp. (k = 0.6)     1.12    1.30    1.71    2.65    4.48    7.53   12.05   18.81
Power Exp. (k = 0.7)     1.14    1.30    1.76    2.70    4.52    7.61   11.91   18.54
Power Exp. (k = 0.8)     1.09    1.26    1.70    2.60    4.29    7.18   11.25   17.50
Power Exp. (k = 0.9)     1.12    1.23    1.71    2.51    4.01    6.84   10.56   16.51
Logistic-II              1.05    1.16    1.52    1.86    2.54    3.74    5.40    7.48

Table 2
Performance of the heteroscedasticity test with functional form (8) in the presence of extreme observations in the response variable

                                            $\beta^*_1 - \beta_1$
Distribution              0.1     0.2     0.3     0.4     0.5     0.6     0.7     0.8

$\delta_{0.1}$
Normal                  10.49   12.29   15.99   22.53   31.86   42.75   55.95   69.09
Student-t (ν = 4)       10.38   10.73   12.03   14.09   16.85   20.00   23.02   26.03
Student-t (ν = 6)       10.39   11.29   13.09   16.16   20.40   25.25   30.96   36.40
Student-t (ν = 8)       10.41   11.64   13.67   17.44   22.86   29.27   37.12   44.49
Student-t (ν = 10)      10.42   11.76   14.19   18.48   24.98   32.49   41.79   51.52
Power Exp. (k = 0.6)    10.41   12.25   15.55   21.70   30.86   41.35   53.59   66.69
Power Exp. (k = 0.7)    10.36   12.16   15.51   21.53   30.42   41.02   53.02   66.24
Power Exp. (k = 0.8)    10.33   12.13   15.44   21.45   30.05   40.52   52.67   65.43
Power Exp. (k = 0.9)    10.31   12.06   15.37   21.28   29.74   40.02   52.32   64.58
Logistic-II             10.40   11.60   13.60   17.73   23.90   31.77   40.89   51.28

$\delta_{0.05}$
Normal                   5.49    7.05   10.09   15.83   24.53   35.29   47.56   60.97
Student-t (ν = 4)        5.23    5.87    6.88    8.69   10.41   12.59   14.91   17.02
Student-t (ν = 6)        5.37    6.28    7.91   10.28   13.48   17.45   21.71   25.98
Student-t (ν = 8)        5.42    6.52    8.61   11.55   15.67   21.26   27.14   33.66
Student-t (ν = 10)       5.46    6.66    8.90   12.31   17.35   23.90   31.24   40.05
Power Exp. (k = 0.6)     5.48    6.99   10.03   15.21   22.77   33.39   45.19   58.28
Power Exp. (k = 0.7)     5.45    6.92   10.02   14.98   22.50   32.96   44.46   57.46
Power Exp. (k = 0.8)     5.36    6.87   10.01   14.87   22.27   32.50   43.90   56.73
Power Exp. (k = 0.9)     5.35    6.82    9.99   14.83   22.14   32.11   43.68   55.79
Logistic-II              5.45    6.46    8.67   11.82   16.55   23.30   31.36   40.89

$\delta_{0.01}$
Normal                   1.36    2.30    3.97    7.16   12.83   21.13   31.18   43.36
Student-t (ν = 4)        1.12    1.47    2.13    3.08    4.12    5.11    6.11    7.12
Student-t (ν = 6)        1.23    1.90    2.79    4.15    5.83    7.76    9.77   12.28
Student-t (ν = 8)        1.25    1.91    2.94    4.69    7.01    9.48   12.79   16.72
Student-t (ν = 10)       1.27    1.98    3.24    5.00    7.85   11.31   15.62   21.25
Power Exp. (k = 0.6)     1.25    2.04    3.68    6.50   11.32   18.63   28.80   40.21
Power Exp. (k = 0.7)     1.24    2.02    3.57    6.47   11.07   18.26   28.02   39.57
Power Exp. (k = 0.8)     1.23    2.00    3.51    6.31   10.67   17.58   27.14   38.33
Power Exp. (k = 0.9)     1.16    2.00    3.48    6.29   10.44   17.29   26.56   37.58
Logistic-II              1.16    1.84    2.97    4.72    7.50   11.00   15.96   22.72

It can be observed that, in all the models and for all the significance levels studied, the percentage of replications in which $F_1(1)$ suggested that $H_0$ must be rejected, $\delta_\alpha$, increased as the difference between $\beta^*_1$ and $\beta_1$ increased. However, the increase for the normal distribution occurred more quickly than for the other distributions considered. In all scenarios the values of $\delta_\alpha$ were higher in the normal model. For example, with $\alpha = 0.1$ the value of $\delta_\alpha$ reached 28.9% in the normal model, while for the other models it was always less than 18%. Similar trends were observed for the other levels of $\alpha$. In the Student-t models the influence of outliers on the partial F-test increased as $\nu$, the degrees of freedom, increased. The same occurred for the power exponential models as $k$, the index parameter, decreased. This pattern indicates that the partial F-test in models with heavy-tailed distributions is less sensitive to outliers in the response variable than in the normal model. It can also be observed that, unlike in the other models considered, in the Student-t and power exponential models the influence of outliers on the partial F-test did not increase strictly with $(\beta^*_1 - \beta_1)$. We therefore conclude that, in the presence of outliers in the response variable, the partial F-test in nonlinear models is more robust with heavy-tailed than with normal errors. This finding is new: earlier works established the robustness of heavy-tailed models only for the parameter estimates.

4.2 Diagnostic procedures

Wald's test can be used for evaluating the hypothesis $H_0: \beta_j = \gamma$ vs. $H_a: \beta_j \neq \gamma$. The statistic of this test is

\[
F_j(\gamma) = \frac{(\hat\beta_j - \gamma)^2}{\widehat{Var}(\hat\beta_j)} = t_j^2(\gamma), \tag{9}
\]

where $t_j(\gamma) = (\hat\beta_j - \gamma)/\sqrt{\widehat{Var}(\hat\beta_j)}$. Asymptotically and under $H_0$, $F_j(\gamma)$ and $t_j(\gamma)$ follow chi-square (with 1 degree of freedom) and standard normal distributions, respectively. The case-deletion model (CDM), proposed by Cook and Weisberg (1982) within the normal linear regression framework, can be used to study the influence of the $i$-th observation on $F_j(\gamma)$. The following influence measure can then be calculated:

\[
\Delta_{j(i)}(\gamma) = \frac{F_{j(i)}(\gamma) - F_j(\gamma)}{F_j(\gamma)}, \quad F_j(\gamma) \neq 0, \tag{10}
\]

where $F_{j(i)}(\gamma)$ is the partial F-test statistic calculated with $\hat\theta_{(i)}$, the estimate of $\theta$ obtained when the $i$-th observation has been excluded from the dataset. Large values of $|\Delta_{j(i)}(\gamma)|$ indicate that the $i$-th observation has a disproportionate influence on $F_j(\gamma)$. However, the calculation of $\Delta_{j(i)}(\gamma)$, $i = 1,\ldots,n$, can be computationally expensive, especially when $n$ is large. An approximation to these values is thus obtained by substituting $\hat\theta_{(i)}$ by its one-step approximation, $\hat\theta^I_{(i)}$, in expression (10).
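The two quantities in (9) and (10) are simple to compute once an estimate and its estimated variance are available. The sketch below is illustrative only: the function names are hypothetical, and the p-value uses the asymptotic chi-square distribution with 1 degree of freedom mentioned above, written through the complementary error function so that only the standard library is needed.

```python
import math


def wald_test(beta_hat_j, var_hat_j, gamma):
    """Return (F, t, p) for H0: beta_j = gamma.

    F is statistic (9), t its signed root, and p the asymptotic p-value
    P(chi2_1 > F) = P(|N(0,1)| > |t|) = erfc(sqrt(F / 2)).
    """
    t = (beta_hat_j - gamma) / math.sqrt(var_hat_j)
    F = t * t
    p = math.erfc(math.sqrt(F / 2.0))
    return F, t, p


def relative_change(F_deleted, F_full):
    """Influence measure (10): relative change in F after deleting a case."""
    if F_full == 0.0:
        raise ValueError("measure (10) is undefined when F_j(gamma) = 0")
    return (F_deleted - F_full) / F_full


if __name__ == "__main__":
    F, t, p = wald_test(beta_hat_j=1.2, var_hat_j=0.01, gamma=1.0)
    print(f"F = {F:.3f}, t = {t:.3f}, asymptotic p-value = {p:.4f}")
    # Hypothetical refit without one observation giving F = 2.5:
    print("relative change:", relative_change(2.5, F))
```

A brute-force use of `relative_change` would refit the model $n$ times, which is the cost the one-step approximation is meant to avoid.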

Table 3
Performance of the partial F-test in the presence of extreme observations in the response variable

                                            $\beta^*_1 - \beta_1$
Distribution              0.1     0.2     0.3     0.4     0.5     0.6     0.7     0.8

$\delta_{0.1}$
Normal                  11.25   12.79   14.18   16.77   19.27   22.26   25.49   28.92
Student-t (ν = 4)       10.19   11.10   13.42   14.47   15.02   15.32   14.91   14.44
Student-t (ν = 6)       10.27   11.12   13.44   14.92   15.79   16.33   16.29   15.89
Student-t (ν = 8)       10.76   11.98   13.56   14.93   16.07   16.71   16.98   16.98
Student-t (ν = 10)      10.93   12.60   13.60   15.14   16.51   17.45   17.80   17.93
Power Exp. (k = 0.6)    11.13   12.53   13.21   14.11   14.68   15.01   15.06   14.83
Power Exp. (k = 0.7)    10.93   12.46   13.14   14.07   14.50   14.71   14.56   14.30
Power Exp. (k = 0.8)    10.87   12.04   13.04   13.86   14.23   14.09   13.84   13.47
Power Exp. (k = 0.9)    10.88   11.32   13.03   13.80   14.16   14.07   13.82   13.46
Logistic-II             10.94   11.94   13.38   14.71   15.70   16.35   16.48   16.52

$\delta_{0.05}$
Normal                   6.03    6.90    8.35    9.85   11.77   14.39   16.78   19.44
Student-t (ν = 4)        5.62    6.73    7.73    8.37    8.89    8.90    8.66    8.25
Student-t (ν = 6)        5.70    6.74    7.76    8.63    9.26    9.78    9.76    9.28
Student-t (ν = 8)        5.73    6.77    7.84    8.64    9.47   10.05   10.27   10.24
Student-t (ν = 10)       5.74    6.83    8.04    8.96    9.85   10.71   11.02   11.13
Power Exp. (k = 0.6)     5.99    6.73    7.39    8.02    8.30    8.53    8.57    8.54
Power Exp. (k = 0.7)     5.88    6.68    7.34    8.01    8.25    8.32    8.33    8.03
Power Exp. (k = 0.8)     5.83    6.66    7.33    7.99    8.13    8.18    8.12    7.84
Power Exp. (k = 0.9)     5.76    6.61    7.28    7.70    7.83    7.94    7.70    7.38
Logistic-II              5.76    6.80    7.88    8.56    9.39    9.83   10.10   10.11

$\delta_{0.01}$
Normal                   1.55    1.93    2.33    2.72    3.41    4.14    5.00    6.00
Student-t (ν = 4)        1.34    1.74    2.10    2.30    2.52    2.56    2.39    2.22
Student-t (ν = 6)        1.34    1.83    2.15    2.42    2.79    2.87    2.81    2.68
Student-t (ν = 8)        1.39    1.87    2.26    2.57    2.94    3.08    3.16    3.15
Student-t (ν = 10)       1.43    1.89    2.30    2.63    3.05    3.29    3.36    3.36
Power Exp. (k = 0.6)     1.48    1.79    2.15    2.34    2.46    2.52    2.54    2.47
Power Exp. (k = 0.7)     1.46    1.86    2.13    2.33    2.40    2.50    2.51    2.43
Power Exp. (k = 0.8)     1.41    1.85    2.11    2.32    2.39    2.41    2.39    2.31
Power Exp. (k = 0.9)     1.40    1.84    2.03    2.31    2.37    2.32    2.29    2.20
Logistic-II              1.45    1.91    2.26    2.54    2.93    3.05    3.05    3.02

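Theorem 2 The influence measure $\Delta_{j(i)}(\gamma)$ based on the one-step approximation of $\hat\theta_{(i)}$, denoted $\hat\theta^I_{(i)}$, can be expressed as

\[
\Delta^I_{j(i)}(\gamma) \approx \frac{\left(1 - t_{\beta_j,i}/t_j(\gamma)\right)^2}{\left(1 - \dfrac{2\,t_{\phi,i}}{\sqrt{n(4f_g - 1)}}\right)\left(1 + \hat\psi^2_{j,i}\,\dfrac{\hat h_{ii}}{1-\hat h_{ii}}\right)} - 1, \quad t_j(\gamma) \neq 0. \tag{11}
\]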

According to (11), the $i$-th observation is not influential on $F_j(\gamma)$ when the following three conditions are satisfied: i) it is not influential on $\hat\beta_j$ (i.e. $t^2_{\beta_j,i} \approx 0$), ii) it is not influential on $\hat\phi$ (i.e. $t^2_{\phi,i} \approx 0$), and iii) it is not located in a remote region of the subspace defined by the columns of $D_{\hat\beta}$ (i.e. $\hat h_{ii} \approx 0$). The result obtained in (11) extends the expression developed by Cook and Weisberg (1982) for normal linear regression models to symmetrical nonlinear regression models.
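The one-step measure can be evaluated for all observations without refitting. The sketch below is a hedged illustration with hypothetical names: it combines the closed forms of $t_{\beta_j,i}$ and $t_{\phi,i}$ from Section 2 with the two denominator factors, writing the dispersion factor as $1 - 2t_{\phi,i}/\sqrt{n(4f_g-1)}$, which equals $\hat\phi^I_{(i)}/\hat\phi$ under the one-step approximation of Section 2. The derivative matrix is held at $\hat\beta$ throughout, which is where the approximation enters.

```python
import numpy as np


def one_step_delta(D, z, v, four_dg, four_fg, t_j, j):
    """One-step approximation to the relative change in F_j(gamma), all i.

    D   : n x p derivative matrix at the estimate
    z   : standardized residuals
    v   : weight function v(z)
    t_j : signed Wald root t_j(gamma), must be nonzero
    j   : zero-based index of the coefficient under test
    """
    if t_j == 0.0:
        raise ValueError("t_j(gamma) must be nonzero")
    n = D.shape[0]
    C = np.linalg.inv(D.T @ D)
    A = D @ C                                  # row i holds d_i' (D'D)^{-1}
    h = np.einsum("ij,ij->i", A, D)            # leverages h_ii
    psi = A[:, j] / np.sqrt(h * C[j, j])
    t_beta = np.sqrt(four_dg * h) * (v(z) / four_dg) * z * psi / (1.0 - h)
    t_phi = np.sqrt(n / (four_fg - 1.0)) * (v(z) * z ** 2 - 1.0) / (n - 1.0)
    phi_ratio = 1.0 - 2.0 * t_phi / np.sqrt(n * (four_fg - 1.0))
    var_ratio = 1.0 + psi ** 2 * h / (1.0 - h)
    return (1.0 - t_beta / t_j) ** 2 / (phi_ratio * var_ratio) - 1.0


if __name__ == "__main__":
    rng = np.random.default_rng(2)
    D = rng.normal(size=(30, 2))               # stand-in derivative matrix
    z = rng.normal(size=30)                    # stand-in residuals
    # Normal model constants: v = 1, 4 d_g = 1, 4 f_g = 3.
    delta = one_step_delta(D, z, lambda u: np.ones_like(u), 1.0, 3.0,
                           t_j=2.5, j=0)
    worst = int(np.argmax(np.abs(delta)))
    print("most influential observation:", worst, "delta =", delta[worst])
```

An index plot of the returned values is the natural way to read them: observations whose absolute value stands out are the ones that move the test statistic most.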

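The convergence of the iterative process for estimating $\theta$ may also be used for studying the influence of the $i$-th observation on $F_j(\gamma)$. At convergence, from (1) we have

\[
\hat\beta = (D_{\hat\beta}^T D_{\hat\beta})^{-1}D_{\hat\beta}^T\tilde{Z} = (D_{\hat\beta}^T D_{\hat\beta})^{-1}D_{\hat\beta}^T\left\{D_{\hat\beta}\hat\beta + D(\hat\rho)\left[y - \mu(\hat\beta)\right]\right\}, \tag{12}
\]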

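where $\beta = (\beta^{*T},\beta_j)^T$ and $D_\beta = (D_{\beta^*}, d_{(j)})$ can be defined, with $\beta^*$ the vector of location parameters when the $j$-th has been excluded, $D_{\beta^*} = \partial\mu/\partial\beta^*$ and $d_{(j)} = \partial\mu/\partial\beta_j$. From (12), and using that $(I - \hat H^*)$ is a symmetric and idempotent matrix, we can express $\hat\beta$ as

\[
\begin{pmatrix}\hat\beta^* \\ \hat\beta_j\end{pmatrix}
= \begin{pmatrix}D_{\hat\beta^*}^T D_{\hat\beta^*} & D_{\hat\beta^*}^T\hat d_{(j)} \\ \hat d_{(j)}^T D_{\hat\beta^*} & \hat d_{(j)}^T\hat d_{(j)}\end{pmatrix}^{-1}
\begin{pmatrix}D_{\hat\beta^*}^T \\ \hat d_{(j)}^T\end{pmatrix}\tilde{Z}
= \begin{pmatrix}(D_{\hat\beta^*}^T D_{\hat\beta^*})^{-1}D_{\hat\beta^*}^T\left[I - (\hat R_{(j)}^T\hat R_{(j)})^{-1}\hat d_{(j)}\hat R_{(j)}^T\right]\tilde{Z} \\ (\hat R_{(j)}^T\hat R_{(j)})^{-1}\hat R_{(j)}^T(I - \hat H^*)\tilde{Z}\end{pmatrix}, \tag{13}
\]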

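where $H^{*} = D_{\beta^{*}}\left(D^{T}_{\beta^{*}} D_{\beta^{*}}\right)^{-1} D^{T}_{\beta^{*}}$ and $\hat{R}_{(j)} = (I - \hat{H}^{*})\hat{d}_{(j)}$. Also, from Section 2, the variance of $\hat{\beta}_j$ is the respective element of $K^{-1}_{\beta\beta}$, thus $Var(\hat{\beta}_j) = \phi\left(R^{T}_{(j)} R_{(j)}\right)^{-1}/4d_g$. From (13), the expression (9) can then be rewritten as

$$F_j(\gamma) = \frac{\left[\left(\hat{R}^{T}_{(j)} \hat{R}_{(j)}\right)^{-1} \hat{R}^{T}_{(j)} \left(I - \hat{H}^{*}\right)\tilde{Z} - \gamma\right]^{2}}{\hat{\phi}\left(\hat{R}^{T}_{(j)} \hat{R}_{(j)}\right)^{-1}/4d_g} = \frac{\left[\left(\hat{R}^{T}_{(j)} \hat{R}_{(j)}\right)^{-1} \hat{R}^{T}_{(j)} \left(I - \hat{H}^{*}\right)\left(\tilde{Z} - \gamma\hat{d}_{(j)}\right)\right]^{2}}{\hat{\phi}\left(\hat{R}^{T}_{(j)} \hat{R}_{(j)}\right)^{-1}/4d_g}.$$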

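Given that $\tilde{Z} = D_{\hat{\beta}^{*}}\hat{\beta}^{*} + \hat{\beta}_j\hat{d}_{(j)} + D(\hat{\rho})\left[y - \mu(\hat{\beta})\right]$, we can write the vector $(I - \hat{H}^{*})(\tilde{Z} - \gamma\hat{d}_{(j)})$ as $\hat{R}_{(z)} = (I - \hat{H}^{*})\left\{D(\hat{\rho})\left[y - \mu(\hat{\beta})\right] + (\hat{\beta}_j - \gamma)\hat{d}_{(j)}\right\}$. Defining $\hat{R}^{*}_{(z)} = \sqrt{4d_g/\hat{\phi}}\;\hat{R}_{(z)}$, we can write $F_j(\gamma)$ as

$$F_j(\gamma) = \frac{\left[\left(\hat{R}^{T}_{(j)} \hat{R}_{(j)}\right)^{-1} \hat{R}^{T}_{(j)} \hat{R}^{*}_{(z)}\right]^{2}}{\left(\hat{R}^{T}_{(j)} \hat{R}_{(j)}\right)^{-1}}, \qquad (14)$$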

i.e., $F_j(\gamma)$ can be regarded as the test statistic for evaluating the significance of the normal linear regression through the origin of $\hat{R}^{*}_{(z)}$ on $\hat{R}_{(j)}$ when the error variance is 1. Therefore, a graph of $\hat{R}^{*}_{(z)}$ against $\hat{R}_{(j)}$ can reveal which observations contribute to the relationship assessed by $F_j(\gamma)$ and which depart from it. In the normal linear case with intercept, (14) means that the partial F-test can be interpreted as

$$F_j(0) = n\,\frac{(1 - R^{*2})}{(1 - R^{2})}\,\rho^{2}_{X_j Y X^{*}},$$

where $\rho_{X_j Y X^{*}}$ is the partial correlation coefficient between the response and the j-th explanatory variable, while $R^{2}$ and $R^{*2}$ are the coefficients of determination for the model with all the covariates and without the j-th, respectively.
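The identities behind (13), (14) and the partial-correlation form can be checked numerically in the normal linear case. The sketch below uses NumPy on simulated data (all data and variable names are illustrative, not from the paper). It assumes that, for normal errors in a linear model, $\tilde{Z}$ reduces to $y$, $4d_g = 1$, and $\hat{\phi}$ is the maximum likelihood estimate of the error variance (residual sum of squares divided by n).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 40

# Reduced design X_star (intercept plus one covariate) and the j-th covariate.
X_star = np.column_stack([np.ones(n), rng.normal(size=n)])
x_j = 0.5 * X_star[:, 1] + rng.normal(size=n)
y = 1.0 + 2.0 * X_star[:, 1] + 0.7 * x_j + rng.normal(size=n)

# Full least-squares fit; the coefficient of x_j is the last one.
X = np.column_stack([X_star, x_j])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_hat
phi_hat = resid @ resid / n  # ML estimate of the error variance (assumption)

# M = I - H_star projects onto the orthogonal complement of the reduced design.
H_star = X_star @ np.linalg.solve(X_star.T @ X_star, X_star.T)
M = np.eye(n) - H_star
R_j = M @ x_j

# Second block of (13): beta_j obtained from the partitioned form.
beta_j_part = (R_j @ M @ y) / (R_j @ R_j)

# Wald form of the partial F statistic (4 d_g = 1 assumed for normal errors).
gamma = 0.0
F_wald = (beta_hat[-1] - gamma) ** 2 / (phi_hat / (R_j @ R_j))

# Form (14): regression through the origin of R_z_star on R_j.
R_z_star = M @ (y - gamma * x_j) / np.sqrt(phi_hat)
F_14 = (R_j @ R_z_star) ** 2 / (R_j @ R_j)

# Partial-correlation form, valid here because gamma = 0.
e_y = M @ y
rho = (R_j @ e_y) / np.sqrt((R_j @ R_j) * (e_y @ e_y))
tss = np.sum((y - y.mean()) ** 2)
R2 = 1.0 - (resid @ resid) / tss
R2_star = 1.0 - (e_y @ e_y) / tss
F_corr = n * (1.0 - R2_star) / (1.0 - R2) * rho ** 2

print(F_wald, F_14, F_corr)
```

The three printed values agree up to rounding error. A scatter plot of `R_z_star` against `R_j` would correspond to the diagnostic graph described above.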

5 EXAMPLE

We considered a dataset analyzed in Venables and Ripley (2002), which showed that obese patients on a weight reduction programme tended to lose adipose tissue at a diminishing rate. The two variables were x, the time (in days) since the start of the programme, and y, the patient's weight in kilograms measured under standard conditions. The dataset pertained to a 48-year-old male patient, 193 cm tall, with a large body frame. The model used for analyzing this dataset was given by

$$Y_i = \beta_1 + \beta_2\,2^{-x_i/\beta_3} + \epsilon_i, \qquad i = 1, \ldots, 52,$$

where $\beta_1$ was the ultimate lean weight, or asymptote, $\beta_2$ was the total amount to be lost and $\beta_3$ was the time taken to lose half the amount remaining to be lost. The data are illustrated in Figure 1. We considered two distributions in addition to the normal for the errors in the model: Student-t with 4 degrees of freedom and power exponential with k = 0.9. Table 4 shows the parameter estimates and standard errors for the fitted models. It can be observed that the ultimate lean weight estimate, $\hat{\beta}_1$, was smallest in the normal model, which was consistent with the other parameter estimates, since in this model $\hat{\beta}_2$ and $\hat{\beta}_3$ were both the largest. For the location and scale parameters, the standard errors were smaller in the heavy-tailed error models.
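The interpretation of the three location parameters can be illustrated by evaluating the mean function at a few time points. This is a minimal sketch: the function name is hypothetical, the exponent is assumed to be negative (as the interpretation of $\beta_3$ as a half-life requires), and the parameter values are the normal-model estimates reported in Table 4.

```python
def mean_weight(x, beta1, beta2, beta3):
    """Mean weight at day x, assuming mu(x) = beta1 + beta2 * 2**(-x / beta3)."""
    return beta1 + beta2 * 2.0 ** (-x / beta3)


# Normal-model estimates from Table 4.
b1, b2, b3 = 81.373, 102.684, 141.910

start = mean_weight(0.0, b1, b2, b3)            # beta1 + beta2: fitted weight at day 0
after_one = mean_weight(b3, b1, b2, b3)         # half of beta2 still to be lost
after_two = mean_weight(2.0 * b3, b1, b2, b3)   # a quarter of beta2 still to be lost

print(start, after_one, after_two)
```

Every additional $\beta_3$ days halves the amount remaining above the asymptote $\beta_1$, which is why a hypothesis about $\beta_3$ is a hypothesis about the speed of weight loss.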

Fig. 1. Scatter plot for the weight loss data. [Figure omitted; x-axis: Time (in days), y-axis: Weight (in kilograms).]

Table 4
Parameter estimates (standard errors) for symmetrical nonlinear models fitted on weight loss data

Distribution              $\beta_1$   $\beta_2$   $\beta_3$   $\phi$
Normal                    81.373      102.684     141.910     0.754
                          (2.202)     (2.021)     (5.139)     (0.148)
Student-t(4)              82.529      101.612     139.413     0.481
                          (2.013)     (1.844)     (4.704)     (0.124)
Power exponential(0.9)    83.675      100.674     136.717     0.131
                          (1.620)     (1.479)     (3.781)     (0.035)

Three observations (22, 39 and 44) were identified in the residual plots (Figures 2.a - 5.a) that could be considered outliers in all the models except the power exponential. These plots also show an increase in the magnitude of the residuals as weight decreased, which suggested that the error variance was not constant. This behavior was less pronounced in the model with the power exponential error distribution. Table 5 shows the percentage changes in the parameter estimates when the outliers were eliminated from the dataset. It can be seen that the changes were smaller in the heavy-tailed error models than in the normal model, especially with the power exponential distribution. In the plots of $GD_{\beta,i}$ (Figures 2.b - 5.b) and $GD_{\phi,i}$ (the latter not shown here) it can also be observed that the influence of the observations was smaller in the heavy-tailed error models.

It is also of interest to evaluate whether the time taken to lose half the amount remaining to be lost differs from 145 days, i.e., to test $H_0: \beta_3 = 145$ vs. $H_1: \beta_3 \neq 145$. Table 6 shows $F_3(145)$ with and without the outliers. Here, again, the power exponential model appeared more robust to outliers than the other models considered. The graphs of $|\Delta^{I}_{3(i)}(145)|$ (Figures 2.c - 5.c) indicate that $F_3(145)$ was more robust in the heavy-tailed error models, especially in the power exponential, where the outliers did not have a disproportionate influence. The influence on $F_3(145)$ can also be observed in the plots of $\hat{R}_{(j)}$ versus $\hat{R}^{*}_{(z)}$ (Figures 2.d - 5.d). The dotted line in these plots represents the regression of $\hat{R}^{*}_{(z)}$ on $\hat{R}_{(j)}$. The observations did not show the significance of $F_3(145)$ as clearly in the normal and Student-t models as they did in the power exponential model. These plots also flagged the same observations as influential as the plots of $|\Delta^{I}_{3(i)}(145)|$.

Table 5
Percentage changes in parameter estimates when observations 22, 39 and 44 were eliminated from the dataset.

Distribution              $\beta_1$   $\beta_2$   $\beta_3$   $\phi$
Normal                    1.655       −1.234      −2.238      −31.029
Student-t(4)              0.956       −0.722      −1.299      −23.738
Power exponential(0.9)    −0.073      0.061       0.121       −26.505

Table 6
$F_3(145)$ values and p-values with and without observations 22, 39 and 44

                          With all observations     Without outliers
Distribution              $F_3(145)$   p-value      $F_3(145)$   p-value
Normal                    0.3613       0.5477       2.2733       0.1315
Student-t(4)              1.4105       0.2350       3.3001       0.0692
Power exponential(0.9)    4.7985       0.0284       6.8244       0.0090
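The "with all observations" column of Table 6 can be approximately reproduced from the estimates and standard errors of $\beta_3$ in Table 4. The sketch below rests on two assumptions that are not stated in this section: that $F_3(145)$ is the squared ratio of $(\hat{\beta}_3 - 145)$ to its reported standard error, and that the p-values come from a chi-square reference distribution with one degree of freedom. Both happen to match the tabulated values up to rounding. The function names are hypothetical.

```python
import math


def wald_statistic(estimate, std_error, gamma):
    """Squared standardized distance between the estimate and gamma."""
    return ((estimate - gamma) / std_error) ** 2


def chi2_1_pvalue(stat):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom.

    Uses P(chi2_1 > s) = P(|Z| > sqrt(s)) = erfc(sqrt(s / 2)).
    """
    return math.erfc(math.sqrt(stat / 2.0))


# (estimate, standard error) of beta_3 taken from Table 4.
fits = {
    "Normal": (141.910, 5.139),
    "Student-t(4)": (139.413, 4.704),
    "Power exponential(0.9)": (136.717, 3.781),
}

results = {}
for name, (b3_hat, se) in fits.items():
    stat = wald_statistic(b3_hat, se, 145.0)
    results[name] = (stat, chi2_1_pvalue(stat))
    print(f"{name}: F = {stat:.4f}, p = {results[name][1]:.4f}")
```

Small differences in the last digits are expected because the inputs in Table 4 are rounded to three decimals.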

Theorem 1 was used to assess the assumption that the error terms all had equal variance, with the functional forms described in (7) and (8). Table 7 shows $\xi_{\lambda}$ with and without the outliers when q = 1 and $g_i = x_i$, $i = 1, \ldots, n$. It was observed that there was

Fig. 2. Residual plots $t^{*}_{D}(\hat{z}_i)$ (a), index plots $GD_{\beta,i}$ (b), index plots $|\Delta^{I}_{3(i)}(145)|$ (c) and $\hat{R}_{(j)}$ vs $\hat{R}^{*}_{(z)}$ plot (d) for fitted model with normal errors on weight loss data. [Figure omitted; labelled observations: (a) 22, 39, 44; (b) 22, 39, 44, 46, 49, 50, 51; (c) 22, 31, 39, 44, 46, 49, 50, 51; (d) 22, 31, 39, 44, 46, 49, 50, 51.]

Table 7
Heteroscedasticity test $\xi_{\lambda}$ values and p-values with and without observations 22, 39 and 44

Functional form                                               With all observations      Without outliers
of $\tau_i$                          Distribution             $\xi_{\lambda}$  p-value   $\xi_{\lambda}$  p-value
$\exp(g_i^{T}\lambda)$               Normal                   6.598            0.011     4.703            0.030
                                     Student-t(4)             7.253            0.007     4.723            0.029
                                     Power exponential(0.9)   1.936            0.164     1.217            0.269
$\prod_{j=1}^{q} g_{ij}^{\lambda_j}$ Normal                   4.995            0.025     3.192            0.074
                                     Student-t(4)             5.400            0.020     3.160            0.075
                                     Power exponential(0.9)   1.460            0.227     0.799            0.371

statistical evidence in the normal and Student-t models against the equal variance assumption for the error terms. The heteroscedasticity test was highly influenced by the outliers in those models. By contrast, the heteroscedasticity test was robust to outliers and was not significant

Fig. 3. Residual plots $t^{*}_{D}(\hat{z}_i)$ (a), index plots $GD_{\beta,i}$ (b), index plots $|\Delta^{I}_{3(i)}(145)|$ (c) and $\hat{R}_{(j)}$ vs $\hat{R}^{*}_{(z)}$ plot (d) for fitted model with Student-t(4) errors on weight loss data. [Figure omitted; labelled observations: (a) 22, 39, 44; (b) 4, 5, 39, 44, 46, 49, 50, 51; (c) 4, 5, 22, 46, 49, 50, 51; (d) 22, 39, 44, 46, 49, 50, 51.]

for the power exponential model. Figure 5 shows the values of $|\Delta^{I}_{3(i)}(145)|$ and $|\Delta_{3(i)}(145)|$ for all the models considered. The dotted line, which has intercept 0 and slope 1, can be used as a reference. These measurements showed high agreement. In conclusion, the power exponential nonlinear model seemed to fit the data better than the other symmetrical nonlinear models.

6 CONCLUDING REMARKS

This paper described how diagnostic procedures were derived for symmetrical nonlinear regression models. We developed a statistical test to assess heteroscedasticity in the error terms and some diagnostic procedures for identifying influential observations on the partial F-test. Simulation experiments on the Michaelis-Menten model showed that, in the presence of outliers in the response variable, inference (i.e., the partial F-test and the heteroscedasticity test) in nonlinear models is more robust with heavy-tailed than with normal errors.

Fig. 4. Residual plots $t^{*}_{D}(\hat{z}_i)$ (a), index plots $GD_{\beta,i}$ (b), index plots $|\Delta^{I}_{3(i)}(145)|$ (c) and $\hat{R}_{(j)}$ vs $\hat{R}^{*}_{(z)}$ plot (d) for fitted model with power exponential(0.9) errors on weight loss data. [Figure omitted; labelled observations: (a) 22, 39, 44; (b) 37, 44, 50; (c) 2, 39, 44, 49, 51; (d) 22, 39, 44, 49, 50.]

Fig. 5. Influence measurements $|\Delta_{3(i)}(145)|$ and $|\Delta^{I}_{3(i)}(145)|$ for the fitted models on weight loss data. [Figure omitted; panels: Normal, Student-t(4), Power Exp(0.9); axes labelled $|\Delta^{I}_{3(i)}(145)|$ and $|\Delta_{3(i)}(145)|$, both ranging from 0.0 to 1.5.]

APPENDIX

A Proof of Theorem 1

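The following was obtained after algebraic manipulations:

$$\left\{Var(\hat{\gamma})\right\}_{\theta=\hat{\theta}^{\circ}} = \frac{4}{4f_g - 1}\left[D^{T}_{\lambda}\left(I - \frac{11^{T}}{n}\right)D_{\lambda}\right]^{-1}_{\lambda=\lambda^{\circ}}$$

and

$$\left\{U_{\lambda}(\theta)\right\}_{\theta=\hat{\theta}^{\circ}} = \frac{1}{2}\left[D^{T}_{\lambda}(m^{*} - 1)\right]_{\theta=\hat{\theta}^{\circ}}.$$

Then, substituting in (5), the following was obtained:

$$\xi_{\lambda} = \frac{1}{4f_g - 1}\left\{(m^{*} - 1)^{T} D_{\lambda}\left[D^{T}_{\lambda}\left(I - \frac{11^{T}}{n}\right)D_{\lambda}\right]^{-1} D^{T}_{\lambda}(m^{*} - 1)\right\}_{\theta=\hat{\theta}^{\circ}}.$$

Since $\left(I - \frac{11^{T}}{n}\right)$ is a symmetric and idempotent matrix, the following could be written:

$$\xi_{\lambda} = \frac{1}{4f_g - 1}\left\{(m^{*} - 1)^{T} D_{\lambda}\left(D^{*T}_{\lambda} D^{*}_{\lambda}\right)^{-1} D^{T}_{\lambda}(m^{*} - 1)\right\}_{\theta=\hat{\theta}^{\circ}}.$$

From the convergence of the iterative estimation of the scale parameter, $1^{T}m^{*} = n$, which implies that $(m^{*} - 1)$ can be written as $\left(I - \frac{11^{T}}{n}\right)m^{*}$. Replacing in the expression above, $\xi_{\lambda}$ can be expressed as

$$\xi_{\lambda} = \frac{1}{4f_g - 1}\left\{m^{*T} D^{*}_{\lambda}\left(D^{*T}_{\lambda} D^{*}_{\lambda}\right)^{-1} D^{*T}_{\lambda} m^{*}\right\}_{\theta=\hat{\theta}^{\circ}}.$$

Since $H^{*}_{\lambda} = D^{*}_{\lambda}\left(D^{*T}_{\lambda} D^{*}_{\lambda}\right)^{-1} D^{*T}_{\lambda}$ is a symmetric and idempotent matrix, then

$$\xi_{\lambda} = \frac{1}{4f_g - 1}\left\{\hat{m}^{*T}\hat{m}^{*}\right\}_{\theta=\hat{\theta}^{\circ}}$$

with $\hat{m}^{*} = H^{*}_{\lambda}m^{*}$. ✷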
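The algebraic simplification in this proof can be checked numerically. The following NumPy sketch uses simulated inputs (the matrix `D_lam` and the vector `m` are illustrative stand-ins for $D_{\lambda}$ and $m^{*}$, not quantities from a fitted model) and assumes that $D^{*}_{\lambda}$ denotes the column-centred matrix $\left(I - 11^{T}/n\right)D_{\lambda}$. The value of $f_g$ is set to 3/4, the value usually associated with normal errors, purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, q = 30, 2

D_lam = rng.normal(size=(n, q))   # stand-in for D_lambda
m = rng.uniform(0.2, 3.0, size=n)
m *= n / m.sum()                  # enforce the convergence condition 1^T m* = n

ones = np.ones(n)
C = np.eye(n) - np.outer(ones, ones) / n   # centring matrix I - 11^T / n
f_g = 0.75                                  # illustrative value (assumption)

# First expression for xi_lambda, written with (m* - 1) and D_lambda.
u = m - ones
xi_long = u @ D_lam @ np.linalg.solve(D_lam.T @ C @ D_lam, D_lam.T @ u) / (4 * f_g - 1)

# Final expression: squared norm of the projection of m* onto the columns of D*_lambda.
D_star = C @ D_lam
H_star = D_star @ np.linalg.solve(D_star.T @ D_star, D_star.T)
m_hat = H_star @ m
xi_short = (m_hat @ m_hat) / (4 * f_g - 1)

print(xi_long, xi_short)
```

The two values coincide because the centring matrix is idempotent and $1^{T}m^{*} = n$ makes $m^{*} - 1$ the centred version of $m^{*}$.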

B Proof of Theorem 2

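$F_{j(i)}(\gamma)$ can be written as

$$F_{j(i)}(\gamma) = \frac{\left(\hat{\beta}_{j(i)} - \gamma\right)^{2}}{\dfrac{\hat{\phi}_{(i)}}{4d_g}\,a^{T}_{j}\left(D^{(i)T}_{\hat{\beta}_{(i)}} D^{(i)}_{\hat{\beta}_{(i)}}\right)^{-1}a_{j}} \approx \frac{\left(\hat{\beta}_{j(i)} - \gamma\right)^{2}}{\dfrac{\hat{\phi}_{(i)}}{4d_g}\,a^{T}_{j}\left(D^{T}_{\hat{\beta}} D_{\hat{\beta}} - \hat{d}_{i}\hat{d}^{T}_{i}\right)^{-1}a_{j}}, \qquad (B.1)$$

where $\hat{\beta}_{j(i)} = a^{T}_{j}\hat{\beta}_{(i)}$ and $D^{(i)}_{\beta}$ is the derivative matrix after excluding the i-th observation from the dataset. Substituting $\hat{\theta}_{(i)}$ by $\hat{\theta}^{I}_{(i)}$ in (B.1), $F^{I}_{j(i)}(\gamma)$ was obtained, given by

$$\begin{aligned}
F^{I}_{j(i)}(\gamma) &\approx \frac{\left(a^{T}_{j}\hat{\beta}^{I}_{(i)} - \gamma\right)^{2}}{\dfrac{\hat{\phi}^{I}_{(i)}}{4d_g}\,a^{T}_{j}\left[\left(D^{T}_{\hat{\beta}} D_{\hat{\beta}}\right)^{-1} + \dfrac{\left(D^{T}_{\hat{\beta}} D_{\hat{\beta}}\right)^{-1}\hat{d}_{i}\hat{d}^{T}_{i}\left(D^{T}_{\hat{\beta}} D_{\hat{\beta}}\right)^{-1}}{1 - \hat{d}^{T}_{i}\left(D^{T}_{\hat{\beta}} D_{\hat{\beta}}\right)^{-1}\hat{d}_{i}}\right]a_{j}} \\[6pt]
&= \frac{\left[a^{T}_{j}\hat{\beta} - \dfrac{\hat{\phi}^{1/2}\rho(\hat{z}_i)\hat{z}_i\,a^{T}_{j}\left(D^{T}_{\hat{\beta}} D_{\hat{\beta}}\right)^{-1}\hat{d}_{i}}{1 - \hat{h}_{ii}} - \gamma\right]^{2}}{\dfrac{\hat{\phi}}{4d_g}\left[1 - \dfrac{2\left(v(\hat{z}_i)\hat{z}^{2}_{i} - 1\right)}{(n-1)(4f_g - 1)}\right]\left[a^{T}_{j}\left(D^{T}_{\hat{\beta}} D_{\hat{\beta}}\right)^{-1}a_{j} + \dfrac{\left(a^{T}_{j}\left(D^{T}_{\hat{\beta}} D_{\hat{\beta}}\right)^{-1}\hat{d}_{i}\right)^{2}}{1 - \hat{h}_{ii}}\right]} \\[6pt]
&= \frac{\left[\hat{\beta}_{j} - \gamma - \dfrac{\hat{\phi}^{1/2}\rho(\hat{z}_i)\hat{z}_i\,a^{T}_{j}\left(D^{T}_{\hat{\beta}} D_{\hat{\beta}}\right)^{-1}\hat{d}_{i}}{1 - \hat{h}_{ii}}\right]^{2}}{\dfrac{\hat{\phi}}{4d_g}\,a^{T}_{j}\left(D^{T}_{\hat{\beta}} D_{\hat{\beta}}\right)^{-1}a_{j}\left[1 - \dfrac{2\left(v(\hat{z}_i)\hat{z}^{2}_{i} - 1\right)}{(n-1)(4f_g - 1)}\right]\left[1 + \dfrac{\left(a^{T}_{j}\left(D^{T}_{\hat{\beta}} D_{\hat{\beta}}\right)^{-1}\hat{d}_{i}\right)^{2}}{a^{T}_{j}\left(D^{T}_{\hat{\beta}} D_{\hat{\beta}}\right)^{-1}a_{j}\,(1 - \hat{h}_{ii})}\right]} \\[6pt]
&= \frac{\left[\dfrac{\hat{\beta}_{j} - \gamma}{\sqrt{\dfrac{\hat{\phi}}{4d_g}\,a^{T}_{j}\left(D^{T}_{\hat{\beta}} D_{\hat{\beta}}\right)^{-1}a_{j}}} - \dfrac{(4d_g)^{1/2}\rho(\hat{z}_i)\hat{z}_i\,a^{T}_{j}\left(D^{T}_{\hat{\beta}} D_{\hat{\beta}}\right)^{-1}\hat{d}_{i}}{(1 - \hat{h}_{ii})\sqrt{a^{T}_{j}\left(D^{T}_{\hat{\beta}} D_{\hat{\beta}}\right)^{-1}a_{j}}}\right]^{2}}{\left[1 - \dfrac{2\left(v(\hat{z}_i)\hat{z}^{2}_{i} - 1\right)}{(n-1)(4f_g - 1)}\right]\left[1 + \dfrac{\left(a^{T}_{j}\left(D^{T}_{\hat{\beta}} D_{\hat{\beta}}\right)^{-1}\hat{d}_{i}\right)^{2}}{(1 - \hat{h}_{ii})\,a^{T}_{j}\left(D^{T}_{\hat{\beta}} D_{\hat{\beta}}\right)^{-1}a_{j}}\right]}.
\end{aligned}$$

From (2) and (3) we have

$$F^{I}_{j(i)}(\gamma) = \frac{\left[t_j(\gamma) - t_{\beta_j,i}\right]^{2}}{\left[1 - 2\sqrt{\dfrac{4f_g-1}{n}}\;t_{\phi,i}\right]\left[1 + \hat{\psi}^{2}_{j,i}\,\dfrac{\hat{h}_{ii}}{1-\hat{h}_{ii}}\right]} = \frac{t^{2}_{j}(\gamma)\left[1 - t_{\beta_j,i}/t_j(\gamma)\right]^{2}}{\left[1 - 2\sqrt{\dfrac{4f_g-1}{n}}\;t_{\phi,i}\right]\left[1 + \hat{\psi}^{2}_{j,i}\,\dfrac{\hat{h}_{ii}}{1-\hat{h}_{ii}}\right]}, \qquad t_j(\gamma) \neq 0.$$

Then, as $\Delta^{I}_{j(i)}(\gamma)$ can be expressed as $F^{I}_{j(i)}(\gamma)/F_j(\gamma) - 1$, we have

$$\Delta^{I}_{j(i)}(\gamma) \approx \frac{\left[1 - t_{\beta_j,i}/t_j(\gamma)\right]^{2}}{\left[1 - 2\sqrt{\dfrac{4f_g-1}{n}}\;t_{\phi,i}\right]\left[1 + \hat{\psi}^{2}_{j,i}\,\dfrac{\hat{h}_{ii}}{1-\hat{h}_{ii}}\right]} - 1, \qquad t_j(\gamma) \neq 0. \qquad ✷$$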
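The step from (B.1) to the next expression in this proof relies on the rank-one update formula for the inverse of $D^{T}_{\hat{\beta}} D_{\hat{\beta}} - \hat{d}_{i}\hat{d}^{T}_{i}$ (the Sherman-Morrison identity). The NumPy sketch below checks it on a simulated matrix; `D`, the row index `i` and the vector `a_j` are illustrative. In this toy setting the derivative matrix does not depend on the parameters, so removing row i gives exactly $D^{T}D - d_{i}d^{T}_{i}$, whereas in the nonlinear case (B.1) is only an approximation.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 25, 3

D = rng.normal(size=(n, p))   # stand-in for the derivative matrix
i = 4                         # observation to delete (illustrative)
d_i = D[i]

A_inv = np.linalg.inv(D.T @ D)
h_ii = d_i @ A_inv @ d_i      # leverage of observation i

# Direct inverse of the downdated cross-product matrix.
direct = np.linalg.inv(D.T @ D - np.outer(d_i, d_i))

# Rank-one update used in the proof.
updated = A_inv + (A_inv @ np.outer(d_i, d_i) @ A_inv) / (1.0 - h_ii)

# Same matrix obtained by actually deleting row i.
D_minus_i = np.delete(D, i, axis=0)
deleted = np.linalg.inv(D_minus_i.T @ D_minus_i)

# Quadratic form in a_j, as it appears in the denominator of the proof.
a_j = np.zeros(p)
a_j[-1] = 1.0
quad_direct = a_j @ direct @ a_j
quad_update = a_j @ A_inv @ a_j + (a_j @ A_inv @ d_i) ** 2 / (1.0 - h_ii)

print(np.max(np.abs(direct - updated)), quad_direct, quad_update)
```

The update avoids refitting after each deletion: only the full-data inverse and the leverage $\hat{h}_{ii}$ are needed.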

References

Barnett, V. and Lewis, T. (1994). Outliers in statistical data. New York: John Wiley.

Cook, R.D. and Weisberg, S. (1982). Residuals and Influence in Regression. New York: Chapman and Hall.

Cook, R.D. and Weisberg, S. (1983). Diagnostics for heteroscedasticity in regression. Biometrika, 70, 1-10.

Cordeiro, G.M.; Ferrari, S.L.P.; Uribe-Opazo, M.A. and Vasconcellos, K.L.P. (2000). Corrected maximum likelihood estimation in a class of symmetric nonlinear regression models. Statistics and Probability Letters, 46, 317-328.

Cordeiro, G.M. (2004). Corrected LR tests in symmetric nonlinear regression models. Journal of Statistical Computation and Simulation, 74, 609-620.

Cox, D.R. and Hinkley, D.V. (1974). Theoretical Statistics. London: Chapman and Hall.

Cysneiros, F.J.A. and Paula, G.A. (2004). One-sided test in linear models with multivariate t-distribution. Communications in Statistics, Simulation and Computation, 33(3), 747-772.

Cysneiros, F.J.A. and Paula, G.A. (2005). Restricted methods in symmetrical linear regression models. Computational Statistics and Data Analysis, 49, 689-708.

Cysneiros, F.J.A.; Paula, G.A. and Galea, M. (2007). Heteroscedastic symmetrical linear regression models. Statistics and Probability Letters, 77, 1084-1090.

Cysneiros, F.J.A. and Vanegas, L.H. (2008). Residuals and their statistical properties in symmetrical nonlinear models. Statistics and Probability Letters, 78, 3269-3273.

Fang, K.T.; Kotz, S. and Ng, K.W. (1990). Symmetric Multivariate and Related Distributions. London: Chapman and Hall.

Galea, M.; Paula, G.A. and Cysneiros, F.J.A. (2005). On diagnostic in symmetrical nonlinear models. Statistics and Probability Letters, 73, 459-467.

Galea, M.; Paula, G.A. and Uribe-Opazo, M. (2003). On influence diagnostics in univariate elliptical linear regression models. Statistical Papers, 44, 23-45.

Lin, J.G. and Wei, B.C. (2003). Testing for Heteroscedasticity in Nonlinear Regression Models. Communications in Statistics, Theory and Methods, 32(1), 171-192.

Vanegas, L.H. and Cysneiros, F.J.A. (2010). Assessment of Diagnostic Procedures in Symmetrical Nonlinear Regression Models. Computational Statistics and Data Analysis, 54, 1002-1016.

Venables, W.N. and Ripley, B.D. (2002). Modern Applied Statistics with S. Fourth Edition, New York: Springer.
