Iterated Random Functions: Convergence Theorems

C. D. Fuh†

Institute of Statistical Science

Academia Sinica, Taipei, Taiwan, ROC

ABSTRACT

Iterated random functions are used to draw pictures, to simulate large Ising models, and to represent the likelihood function of hidden Markov models, among other applications. They offer a method for studying the steady-state distribution of a Markov chain, and there is a simple unifying idea: iterated random Lipschitz functions converge if the functions are contracting on average. To be more precise, let $(X,d)$ be a complete separable metric space and $(F_n)_{n\ge 0}$ a sequence of i.i.d. random functions from $X$ to $X$ which are uniformly Lipschitz, that is, $L_n = \sup_{x\ne y} d(F_n(x),F_n(y))/d(x,y) < \infty$ a.s. Provided the mean contraction assumption $E\log^+ L_1 < 0$ and $E\log^+ d(F_1(x_0),x_0) < \infty$ for some $x_0 \in X$ holds, it is known that the forward iterations $M_n^x = F_n \circ \cdots \circ F_1(x)$, $n \ge 0$, converge weakly to a unique stationary distribution $\pi$ for each $x \in X$. The associated backward iterations $\hat M_n^x = F_1 \circ \cdots \circ F_n(x)$ converge a.s. to a random variable $\hat M_\infty$ which does not depend on $x$ and has distribution $\pi$. In this paper, we describe the essential results on the asymptotic behavior of the iterated random functions $M_n^x$. To start with, we summarize recent results on the stochastic stability of iterated random functions. Then we study limit theorems for additive functionals of a Markov chain that can be constructed as an iterated random function; these include the ergodic theorem, the central limit theorem, quick convergence, Edgeworth expansions and renewal theorems. Three prototypical methods for proving limit theorems are introduced: the regeneration method, the Poisson equation, and spectral theory for the transition operator. Several examples are given for illustration.

AMS 2000 subject classifications. 60J05, 60J15, 60K05, 60G17.

Keywords and phrases: random function, Lipschitz map, Markov chain, Poisson equation, forward iterations, backward iterations, stationary distribution, Prokhorov metric, level ladder epoch, moment generating function, product of random matrices, Liapunov exponent, Harris recurrence, total variation, w-ergodicity, geometric ergodicity, uniform ergodicity, strict contraction, drift condition, central limit theorem, quick convergence, Edgeworth expansion, renewal theorem.

† Research partially supported by NSC 91-2118-M-001-016.

1 Introduction

Iterated random functions (IRF) have a wide range of applications, including perfect simulation, the generation of fractal images, data compression, queueing theory, autoregressive processes and the likelihood representation of hidden Markov models, among others. The reader is referred to Duflo (1997) and Diaconis and Freedman (1999) for excellent recent surveys that include an extensive list of relevant literature. In this paper, we study the theoretical aspects of iterated random functions and summarize recent limit theorems in the literature. To be more precise, a sequence of the form

\[ M_n = F(\theta_n, M_{n-1}), \quad n \ge 1, \tag{1.1} \]

is called an iterated random function (IRF) of i.i.d. Lipschitz maps provided that

1. $M_0, \theta_1, \theta_2, \ldots$ are independent random elements on a common probability space $(\Omega, \mathcal{U}, P)$;

2. $\theta_1, \theta_2, \ldots$ are identically distributed with common distribution $\Lambda$ and take values in a second countable measurable space $(\Theta, \mathcal{A})$;

3. $M_0, M_1, \ldots$ take values in a complete separable metric space $(X,d)$ with Borel $\sigma$-field $\mathcal{B}(X)$;

4. $F : (\Theta \times X, \mathcal{A} \otimes \mathcal{B}(X)) \to (X, \mathcal{B}(X))$ is jointly measurable and Lipschitz continuous in the second argument.

Let $X_0$ be a dense subset of $X$ and $M(X_0,X)$ the space of all mappings $f : X_0 \to X$, endowed with the product topology and product $\sigma$-field. Then the space $\mathcal{L}_{\mathrm{Lip}}(X,X)$ of all Lipschitz continuous mappings $f : X \to X$, properly embedded, forms a Borel subset of $M(X_0,X)$, and the mappings

\[ \mathcal{L}_{\mathrm{Lip}}(X,X) \times X \ni (f,x) \mapsto f(x) \in X, \]

\[ \mathcal{L}_{\mathrm{Lip}}(X,X) \ni f \mapsto l(f) := \sup_{x \ne y} \frac{d(f(x),f(y))}{d(x,y)} \]

are Borel; see Lemma 5.1 in Diaconis and Freedman (1999) for details. Hence

\[ L_n := l(F(\theta_n,\cdot)), \quad n \ge 1, \tag{1.1} \]

are also measurable and form a sequence of i.i.d. random variables.

In the following, we write $F_n(x)$ for $F(\theta_n,x)$. Let $F_{k:n} := F_k \circ \cdots \circ F_n$, $F_{n:k} := F_n \circ \cdots \circ F_k$ and $F_{n:n-1}$ the identity on $X$ for all $1 \le k \le n$. Hence

\[ M_n = F_n(M_{n-1}) = F_{n:1}(M_0) \tag{1.2} \]

for all $n \ge 0$. Closely related to these forward iterations, and in fact a key tool in their analysis, is the following sequence of backward iterations:

\[ \hat M_n := F_{1:n}(M_0), \quad n \ge 0. \tag{1.3} \]

The connection is established by the identity

\[ P_x(M_n \in \cdot) = P_x(\hat M_n \in \cdot) \]

for all $n \ge 0$. Put also $M_n^x := F_{n:1}(x)$ and $\hat M_n^x := F_{1:n}(x)$ for $x \in X$, and note that

\[ P\big((M_n^x, \hat M_n^x)_{n \ge 0} \in \cdot\big) = P_x\big((M_n, \hat M_n)_{n \ge 0} \in \cdot\big). \]

The reason for introducing these additional sequences is that we will compare $\hat M_n^x$ and $\hat M_n^y$, or $M_n^x$ and $M_n^y$, for different $x, y$. In the verification of stochastic stability, it is known that the forward iterations $M_n^x = F_n \circ \cdots \circ F_1(x)$, $n \ge 0$, converge weakly to a unique stationary distribution $\pi$ for each $x \in X$, while the associated backward iterations $\hat M_n^x = F_1 \circ \cdots \circ F_n(x)$ converge a.s. to a random variable $\hat M_\infty$ which does not depend on $x$ and has distribution $\pi$, provided the mean contraction assumption $E\log^+ L_1 < 0$ and $E\log^+ d(F_1(x_0),x_0) < \infty$ holds for some $x_0 \in X$.
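The contrast between forward and backward iterations can be seen numerically. The following is a minimal sketch, not taken from the paper: the affine maps $F_k(x) = a_k x + b_k$, the helper names `make_maps`, `forward` and `backward`, and all parameter values are assumptions chosen for illustration. Every $a_k$ is below 1, so each map is a strict contraction and the mean contraction assumption holds trivially.

```python
import random

def make_maps(n, seed=0):
    """Draw n i.i.d. affine maps F_k(x) = a_k * x + b_k (hypothetical example)."""
    rng = random.Random(seed)
    # a_k in {0.3, 0.9}: each map has Lipschitz constant a_k < 1.
    return [(rng.choice([0.3, 0.9]), rng.uniform(-1.0, 1.0)) for _ in range(n)]

def forward(maps, x):
    """Forward iteration F_n o ... o F_1(x): F_1 is applied first."""
    for a, b in maps:
        x = a * x + b
    return x

def backward(maps, x):
    """Backward iteration F_1 o ... o F_n(x): F_n is applied first."""
    for a, b in reversed(maps):
        x = a * x + b
    return x

maps = make_maps(200)
steps = (50, 100, 200)
# Backward iterations from two different starting points, same random maps.
back_a = [backward(maps[:n], -5.0) for n in steps]
back_b = [backward(maps[:n], 7.0) for n in steps]
# Forward iterations along the same maps keep fluctuating with n.
fwd = [forward(maps[:n], -5.0) for n in steps]

print("backward from -5:", back_a)
print("backward from  7:", back_b)
print("forward  from -5:", fwd)
```

For fixed n, reversing the order of the i.i.d. maps turns a backward iteration into a forward one, which is why the two have the same law; only the backward sequence settles down along a single path, and its limit does not depend on the starting point.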

The theory of additive functionals of iterated random functions gives rise to general results, of which typical examples are the ergodic theorem and the central limit theorem; the results described here can be considered an infinite-dimensional extension of this theory. The new aspect of the situation is the non-commutativity of the iteration, and thus we are led to study a certain Markov chain theory. Clearly, by definition (1.1), $(M_n)_{n \ge 0}$ constitutes a temporally homogeneous Markov chain with state space $X$ and transition kernel $P$, given by

\[ P(x,B) = \Lambda(F(\cdot,x) \in B) \]

for $x \in X$ and $B \in \mathcal{B}(X)$. The $n$-step transition kernel is denoted $P^n$. For $x \in X$, let $P_x$ be the probability measure on the underlying measurable space under which $M_0 = x$ a.s. The associated expectation is denoted $E_x$, as usual. For an arbitrary distribution $\nu$ on $X$, we put $P_\nu(\cdot) := \int P_x(\cdot)\, \nu(dx)$ with associated expectation $E_\nu$. We use $P$ and $E$ for probabilities and expectations, respectively, that do not depend on the initial distribution.

It is known (cf. Alsmeyer, 2003) that the Markov chain induced by iterated random functions is Harris recurrent on a set $H$, and $w$-ergodic for some weight function $w$ if extra moment conditions are assumed. The results may be more easily derived from related results in Meyn and Tweedie (1993, Chapter 17) if $(M_n)_{n \ge 0}$ is further irreducible (with respect to some measure on $\mathcal{B}(X)$), in which case it is even positive Harris recurrent on some $P$-absorbing set. However, many IRF of i.i.d. Lipschitz functions are not irreducible but only weak Feller chains. It is this fact that complicates the necessary arguments in the general situation.

In this paper, we study limit theorems for additive functionals of a Markov chain that can be constructed as an iterated random function; these include stochastic stability, the ergodic theorem, the central limit theorem, quick convergence, Edgeworth expansions and renewal theorems. Three prototypical methods for proving limit theorems are introduced: the regeneration method, the Poisson equation, and spectral theory for the transition operator. To start with, we introduce the regeneration method to prove rates of convergence and the ergodic theorem for IRF in Section 2. Secondly, without assuming irreducibility, we apply the Poisson equation method to prove the central limit theorem and quick convergence in Section 3. To prove Edgeworth expansions and renewal theorems, we need an irreducibility assumption, for which two types of conditions are imposed here. A density hypothesis on $\Lambda$ leads to the setting of Harris recurrence; another natural hypothesis is the positivity of the functions in the support of $\Lambda$, which yields contraction properties that also lead to precise results. In Section 4, we state the results on Harris recurrence and $w$-ergodicity for iterated random functions, and introduce a sufficient condition for irreducibility based on the density hypothesis. In Section 5, we study the hypothesis of positivity for the functions in the support of $\Lambda$, on which basis we develop our spectral theory. Edgeworth expansions and renewal theorems, considered in Sections 6 and 7 respectively, then follow from the established Markov chain theory. Illustrative examples are included in Section 8. The first two satisfy the density assumption, while the third satisfies the positivity assumption. The fourth example satisfies neither.

2 Stochastic stability and ergodic theorem

In this section, we summarize the results on stochastic stability and rates of convergence for iterations of i.i.d. mean-contractive random Lipschitz functions. An ergodic theorem is also given. A central question for an IRF $(M_n)_{n \ge 0}$ is under which conditions it stabilizes, that is, converges to a stationary distribution $\pi$. Elton (1990) showed in the more general situation of a stationary sequence $(F_n)_{n \ge 1}$ that this holds true whenever $E\log^+ l(F_1)$ and $E\log^+ d(F_1(x_0),x_0)$ are both finite for some (and then all) $x_0 \in X$ and the Liapunov exponent $l^* := \lim_{n \to \infty} n^{-1} \log l(F_{n:1})$, which exists by Kingman's subadditive ergodic theorem, is a.s. negative. His results for i.i.d. $F_1, F_2, \ldots$ under the slightly stronger assumptions $E\log^+ l(F_1) < 0$, $E\log^+ d(F_1(x_0),x_0) < \infty$ for some $x_0 \in X$ are restated in Theorem 2.1. The basic idea is to consider the backward iterations $\hat M_n^x = F_{1:n}(x)$ and to prove their a.s. convergence to a limit $\hat M_\infty$ which does not depend on $x$ and which has distribution $\pi$. The obvious inequality

\[ d(\hat M_{n+m}^x, \hat M_n^x) \le \Big( \prod_{k=1}^n l(F_k) \Big)\, d(F_{n+1:n+m}(x), x) \quad \text{a.s.}, \tag{2.1} \]

valid for all $n, m \ge 0$ and $x \in X$, forms a key tool in the necessary analysis. Alsmeyer and Fuh (2001) build on that same inequality together with the simple observation that

\[ \log\Big( \prod_{k=1}^n l(F_k) \Big) = \sum_{k=1}^n \log l(F_k), \quad n \ge 0, \tag{2.2} \]

is an ordinary zero-delayed random walk and thus perfectly amenable to renewal theoretic (regeneration) arguments. Under the mean contraction assumption $E\log^+ l(F_1) < 0$, it has negative drift, whence, for arbitrary $\gamma \in (0,1)$, the level $\log\gamma$ ladder epochs $\sigma_0(\gamma) \equiv 0$,

\[ \sigma_n(\gamma) := \inf\Big\{ k > \sigma_{n-1}(\gamma) : \sum_{j=\sigma_{n-1}(\gamma)+1}^{k} \log l(F_j) \le \log\gamma \Big\}, \quad n \ge 1, \tag{2.3} \]

are all a.s. finite and constitute an ordinary discrete renewal process. As a consequence, the subsequence $(M_{\sigma_n(\gamma)})_{n \ge 0}$ again forms an IRF of i.i.d. Lipschitz maps, which is moreover strictly contractive because, by construction,

\[ l(F_{1:\sigma_1(\gamma)}) \le \gamma < 1. \]

For the associated backward iterations $\hat M_{\sigma_n(\gamma)}^x = F_{1:\sigma_n(\gamma)}(x)$, inequality (2.1) hence takes the very strong form

\[ d(\hat M_{\sigma_{n+m}(\gamma)}^x, \hat M_{\sigma_n(\gamma)}^x) \le \gamma^n\, d(F_{\sigma_n(\gamma)+1:\sigma_{n+m}(\gamma)}(x), x) \tag{2.4} \]

for all $n, m \ge 0$ and $x \in X$, and suggests the following procedure for proving convergence results for $(M_n)_{n \ge 0}$ and its associated sequence of backward iterations:

Step 1. Given a set of conditions, find out what kind of results hold true for the strictly contractive IRF $(M_{\sigma_n(\gamma)})_{n \ge 0}$ for any $\gamma \in (0,1)$.

Step 2. Analyze the excursions of $(M_n)_{n \ge 0}$ between two successive ladder epochs $\sigma_k(\gamma)$ and $\sigma_{k+1}(\gamma)$, and adjust the results with respect to $(M_n)_{n \ge 0}$ if necessary.
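The ladder-epoch construction can be made concrete on a finite list of Lipschitz constants. The sketch below is illustrative only: the function name `ladder_epochs`, the sample constants and the level 0.5 are assumptions, not part of the paper. It restarts the random walk of log-Lipschitz constants each time the running sum falls to the chosen level, which is the renewal structure described above.

```python
import math

def ladder_epochs(lipschitz, gamma):
    """Return the ladder epochs (1-based indices) at which the sum of
    log Lipschitz constants since the previous epoch first drops to
    log(gamma) or below, for a finite list l(F_1), ..., l(F_n)."""
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must lie in (0, 1)")
    epochs, running = [], 0.0
    for k, l in enumerate(lipschitz, start=1):
        running += math.log(l)           # partial sum since the last epoch
        if running <= math.log(gamma):   # block product is now <= gamma
            epochs.append(k)
            running = 0.0                # restart: next block is independent
    return epochs

# Hypothetical Lipschitz constants; some exceed 1, so single maps may expand.
consts = [1.5, 0.5, 0.5, 2.0, 0.2, 0.9, 0.5]
gamma = 0.5
epochs = ladder_epochs(consts, gamma)

# Product of the constants within each block between successive epochs.
block_products, start = [], 0
for e in epochs:
    block_products.append(math.prod(consts[start:e]))
    start = e

print(epochs, block_products)
```

Each block product bounds the Lipschitz constant of the composed block, so the chain observed only at ladder epochs is driven by maps that contract by at least the chosen factor, even though individual maps in the list expand.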

The stability results in this section are taken from Alsmeyer and Fuh (2001). They focus on estimates for $d(\hat M_\infty, \hat M_n)$ under $P_x$, $x \in X$, and $d(M_n^x, M_n^y)$ for $x, y \in X$. The latter distance may be viewed as the coupling rate of the forward iterations at time $n$ when started at different values $x$ and $y$. The two sets of conditions we will consider are that, for some $p > 0$ and some $x_0 \in X$, either

\[ E\log^{p+1}(1 + L_1) < \infty \quad\text{and}\quad E\log^{p+1}(1 + d(F_1(x_0),x_0)) < \infty \tag{2.5} \]

or

\[ E L_1^p < \infty \quad\text{and}\quad E\, d(F_1(x_0),x_0)^p < \infty \tag{2.6} \]

holds. Two major conclusions will concern the distance between $P^n(x,\cdot)$, $x \in X$, and $\pi$ in the Prokhorov metric associated with $d$. Following Diaconis and Freedman (1999), the latter is also denoted $d$ and defined, for two probability measures $\lambda_1, \lambda_2$ on $X$, as the infimum over all $\delta > 0$ such that

\[ \lambda_1(B) < \lambda_2(B^\delta) + \delta \quad\text{and}\quad \lambda_2(B) < \lambda_1(B^\delta) + \delta \]

for all $B \in \mathcal{B}(X)$, where $B^\delta := \{x \in X : d(x,y) < \delta \text{ for some } y \in B\}$. We will show that, for all $x \in X$ and $n \ge 0$,

\[ d(P^n(x,\cdot), \pi) \le A_x (n+1)^{-p}, \tag{2.7} \]

if (2.5) holds, and

\[ d(P^n(x,\cdot), \pi) \le A_x r^n \tag{2.8} \]

for some $r \in (0,1)$ not depending on $x$ and $n$, if (2.6) is true.

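Now let $\sigma_1(\gamma)$ be as defined in (2.3) for $\gamma \in (0,1)$, i.e.

\[ \sigma_1(\gamma) := \inf\{n \ge 1 : L_{1:n} \le \gamma\} = \inf\Big\{ n \ge 1 : \sum_{k=1}^n \log L_k \le \log\gamma \Big\}. \tag{2.9} \]

Provided $E\log^+ L_1 < 0$, a condition which will always be in force throughout, $\sigma_1(\gamma)$ is an a.s. finite first passage time with finite mean $\mu(\gamma)$. It also has finite variance $\vartheta(\gamma)^2$, say, if $E\log^2(1 + L_1) < \infty$. Let further

\[ \log\gamma^* := \inf_{\gamma \in (0,1)} \frac{\log\gamma}{\mu(\gamma)}. \tag{2.10} \]

If $E|\log L_1| < \infty$, then it is well known from renewal theory that

\[ \frac{\log\gamma}{E\log L_1} \le \mu(\gamma) \le \frac{\log\gamma}{E\log L_1}\,(1 + o(1)) \quad (\gamma \to 0). \tag{2.11} \]

It is now easily checked that in this case

\[ \log\gamma^* = \lim_{\gamma \downarrow 0} \frac{\log\gamma}{\mu(\gamma)} = E\log L_1. \tag{2.12} \]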

Theorem 2.1. Given an IRF $(M_n)_{n \ge 0}$ of i.i.d. Lipschitz maps, suppose

\[ E\log^+ L_1 < 0 \quad\text{and}\quad E\log^+ d(F_1(x_0),x_0) < \infty \tag{2.13} \]

for some $x_0 \in X$. Then the following assertions hold:

(a) $\hat M_n$ converges a.s. to a random element $\hat M_\infty$ with distribution $\pi$ which does not depend on the initial distribution.

(b) For each $\gamma \in (\gamma^*, 1)$, $\lim_{n \to \infty} P_x(d(\hat M_\infty, \hat M_n) > \gamma^n) = 0$ for all $x \in X$.

(c) $M_n$ converges in distribution to $\pi$ under every $P_x$, $x \in X$.

(d) $\pi$ is the unique stationary distribution of $(M_n)_{n \ge 0}$ and $(\hat M_n)_{n \ge 0}$ a stationary sequence under $P_\pi$.

(e) $(M_n)_{n \ge 0}$ is ergodic under $P_\pi$.

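Theorem 2.2. Given the situation of Theorem 2.1 and additionally condition (2.5) for some $p > 0$, the following assertions hold:

(a) For each $\gamma \in (\gamma^*, 1)$,

\[ \sum_{n \ge 1} n^{p-1} P_x(d(\hat M_\infty, \hat M_n) > \gamma^n) \le c_\gamma \big( 1 + \log^p(1 + d(x,x_0)) \big) \]

and

\[ \lim_{n \to \infty} n^p P_x(d(\hat M_\infty, \hat M_n) > \gamma^n) = 0 \]

for all $x \in X$ and some $c_\gamma \in (0,\infty)$.

(b) For each $\gamma \in (\gamma^*, 1)$,

\[ \limsup_{n \to \infty} n^{(p-1)/p} \Big( \frac{1}{n} \log d(\hat M_\infty, \hat M_n) - \log\gamma \Big) \le 0 \quad P_x\text{-a.s.} \]

for all $x \in X$. In case $0 < p \le 1$ this remains true for $\gamma = \gamma^*$.

(c) If $p = 1$, then $\lim_{n \to \infty} \gamma^{-n} d(\hat M_\infty, \hat M_n) = 0$ $P_x$-a.s. for all $x \in X$ and all $\gamma \in (\gamma^*, 1)$.

(d) $d(P^n(x,\cdot), \pi) \le A_x (n+1)^{-p}$ for all $n \ge 0$, $x \in X$ and a positive constant $A_x$ of the form $\max\{A, 2d(x,x_0)\}$, where $A$ depends neither on $x$ nor on $n$.

(e) $\int_X \log^p(1 + d(x,x_0))\, \pi(dx) = \int_0^\infty p\, t^{p-1}\, \pi(x : \log(1 + d(x,x_0)) > t)\, dt < \infty$.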

Theorem 2.3. Given the situation of Theorem 2.1 and additionally condition (2.6) for some $p > 0$, the following assertions hold:

(a) For each $\gamma \in (\gamma^*, 1)$,

\[ \lim_{n \to \infty} \beta_\gamma^{-n} P_x(d(\hat M_\infty, \hat M_n) > \gamma^n) = 0 \]

for all $x \in X$ and some $\beta_\gamma \in (0,1)$.

(b) There exists $\eta > 0$ such that for each $q \in (0,\eta)$,

\[ \lim_{n \to \infty} \sup_{x \in X} \frac{\beta_q^{-n}}{(1 + d(x,x_0))^q}\, E_x\, d(\hat M_\infty, \hat M_n)^q = 0 \]

for some $\beta_q \in (0,1)$. The same holds true for $q = \eta$ with $\beta_q = 1$.

(c) $d(P^n(x,\cdot), \pi) \le A_x r^n$ for all $n \ge 0$, some $r \in (0,1)$ and a constant $A_x$ of the form $\max\{A, d(x,x_0)\}$. The constants $r$ and $A$ depend neither on $x$ nor on $n$.

(d) $\int_X d(x,x_0)^\eta\, \pi(dx) = \int_0^\infty \eta\, t^{\eta-1}\, \pi(x : d(x,x_0) > t)\, dt < \infty$ for some $\eta > 0$.

Let us mention that the constants $c_\gamma$, $\beta_\gamma$, $\beta_q$, $\eta$, $A_x$ and $r$ in the previous theorems generally further depend on the value $p > 0$ of the respective moment condition.

The assertions of the previous two theorems on $d(\hat M_\infty, \hat M_n)$ are easily translated into similar results on $d(M_n^x, M_n^y)$ for the forward iterations started at different values $x$ and $y$. Essentially, this only requires the observation that $(M_n^x, M_n^y)$ and $(\hat M_n^x, \hat M_n^y)$ are identically distributed for all $x, y \in X$ and $n \ge 0$, and that

\[ d(\hat M_n^x, \hat M_n^y) \le d(\hat M_\infty, \hat M_n^x) + d(\hat M_\infty, \hat M_n^y). \]

We summarize the results in the following two corollaries.

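Corollary 2.1. Given the situation of Theorem 2.2, the following assertions hold:

(a) For each $\gamma \in (\gamma^*, 1)$,

\[ \sum_{n \ge 1} n^{p-1} P(d(M_n^x, M_n^y) > \gamma^n) \le c_\gamma \big( 1 + \log^p(1 + d(x,x_0)) + \log^p(1 + d(y,x_0)) \big) \]

and

\[ \lim_{n \to \infty} n^p P(d(M_n^x, M_n^y) > \gamma^n) = 0 \]

for all $x, y \in X$ and some $c_\gamma \in (0,\infty)$.

(b) For each $\gamma \in (\gamma^*, 1)$,

\[ \limsup_{n \to \infty} n^{(p-1)/p} \Big( \frac{1}{n} \log d(M_n^x, M_n^y) - \log\gamma \Big) \le 0 \quad \text{a.s.} \]

for all $x, y \in X$. In case $0 < p \le 1$ this remains true for $\gamma = \gamma^*$.

(c) If $p = 1$, then $\lim_{n \to \infty} \gamma^{-n} d(M_n^x, M_n^y) = 0$ a.s. for all $x, y \in X$ and all $\gamma \in (\gamma^*, 1)$.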

Corollary 2.2. Given the situation of Theorem 2.3, the following assertions hold:

(a) For each $\gamma \in (\gamma^*, 1)$,

\[ \lim_{n \to \infty} \beta_\gamma^{-n} P(d(M_n^x, M_n^y) > \gamma^n) = 0 \]

for all $x, y \in X$ and some $\beta_\gamma \in (0,1)$.

(b) There exists $\eta > 0$ such that for each $q \in (0,\eta)$,

\[ \lim_{n \to \infty} \sup_{x,y \in X} \frac{\beta_q^{-n}}{(1 + d(x,x_0) \vee d(y,x_0))^q}\, E\, d(M_n^x, M_n^y)^q = 0 \]

for some $\beta_q \in (0,1)$. The same holds true for $q = \eta$ with $\beta_q = 1$.

3 Central limit theorem and quick convergence: Poisson equation approach

In this section we show that normalized partial sums of a function evaluated along iterated random functions converge in distribution to a normal law. The machinery which we develop to prove this result rests on the stability theory of Section 2. These techniques are appealing as well as powerful, and can lead to much further insight into the asymptotic behavior of iterated random functions. Here we focus on two results: the central limit theorem and quick convergence.

Let $g \in L_0^2(\pi)$ be a square integrable function with mean 0, i.e.

\[ \int_X g\, d\pi = 0 \quad\text{and}\quad \|g\|_2^2 = \int_X g^2\, d\pi < \infty. \tag{3.1} \]

Consider the sequence

\[ S_n(g) := g(M_1) + \cdots + g(M_n), \quad n \ge 1, \tag{3.2} \]

which may be viewed as a Markov random walk with driving chain $(M_n)_{n \ge 0}$. By constructing a solution $h \in L^2(\pi)$ to the Poisson equation

\[ h = g + Ph, \tag{3.3} \]

where $Ph(x) := \int_X h(y)\, P(x,dy)$, and a subsequent decomposition of $S_n(g)$ into a martingale and a stochastically bounded sequence, Benda (1998) showed that $S_n(g)/\sqrt{n}$ is asymptotically normal as $n \to \infty$ under $P_x$ for $\pi$-almost all $x \in X$, if $g \in \mathcal{L}_{\mathrm{Lip}}(X,\mathbb{R})$,

\[ E L_1^2 < 1 \quad\text{and}\quad E\, d(F_1(x_0),x_0)^2 < \infty. \tag{3.4} \]

It was observed by Wu and Woodroofe (2000) that these conditions may be relaxed if the integrability assumption on $g$ is slightly strengthened to $g \in L_0^2(\pi) \cap L^r(\pi)$ for some $r > 2$. Their further assumptions are $E\log^+ L_1 < 0$, (2.6) and a $\pi$-square integrability condition on a certain local Lipschitz constant for $g$ with respect to a flattened metric $d_\psi$. The main point is that this allows discontinuous $g$, for instance suitable indicator functions. A main purpose of this section is to summarize the results of Benda (1998) and Wu and Woodroofe (2000) on the asymptotic normality of $S_n(g)/\sqrt{n}$, and to apply the results of Alsmeyer (1990) and Fuh and Zhang (2000) to the quick convergence of $n^{-1} S_n(g)$ to 0. As to the above-mentioned local Lipschitz constant for $g$, we will show that its integrability (instead of square integrability) with respect to $\pi$ suffices.

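We will further give sufficient conditions for the $\rho$-quick convergence of $n^{-1} S_n(g)$ to 0. The concept of quick convergence was introduced by Strassen (1967). A sequence $(Z_n)_{n \ge 0}$ is said to converge $\rho$-quickly ($\rho > 0$) to a constant $\mu$ if

\[ E\big( \sup\{n \ge 0 : |Z_n - \mu| \ge \varepsilon\} \big)^\rho < \infty \tag{3.5} \]

for all $\varepsilon > 0$. Plainly, $Z_n \to \mu$ $\rho$-quickly implies $Z_n \to \mu$ a.s. Put $N_\varepsilon := \sup\{n \ge 0 : |Z_n - \mu| \ge \varepsilon\}$. Since (3.5) then reads $E N_\varepsilon^\rho < \infty$ for all $\varepsilon > 0$, the $\rho$-quick convergence of $Z_n$ to $\mu$ holds if, and only if,

\[ \sum_{n \ge 1} n^{\rho-1} P(N_\varepsilon \ge n) = \sum_{n \ge 1} n^{\rho-1} P\Big( \sup_{j \ge n} |Z_j - \mu| \ge \varepsilon \Big) < \infty \tag{3.6} \]

for all $\varepsilon > 0$.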
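The last-exceedance time that defines quick convergence is easy to compute on a finite path. The following sketch is only illustrative: the function name `last_exceedance`, the convention of returning 0 when the path never exceeds the level, and the deterministic toy path $Z_n = 1/(n+1)$ are all assumptions, not part of the paper.

```python
def last_exceedance(z, mu, eps):
    """Last index n with |z[n] - mu| >= eps on a finite path z[0], z[1], ...
    Returns 0 if no index qualifies (a convention assumed for this sketch)."""
    last = 0
    for n, value in enumerate(z):
        if abs(value - mu) >= eps:
            last = n
    return last

# Deterministic toy path Z_n = 1 / (n + 1), which converges to mu = 0.
path = [1.0 / (n + 1) for n in range(1000)]
mu, eps = 0.0, 0.15
n_eps = last_exceedance(path, mu, eps)

def tail_sup(z, mu, n):
    """sup over j >= n of |z[j] - mu| on the finite path."""
    return max(abs(v - mu) for v in z[n:])

# The events {N_eps >= n} and {sup_{j >= n} |Z_j - mu| >= eps} coincide.
events_agree = all(
    (n_eps >= n) == (tail_sup(path, mu, n) >= eps) for n in range(1, 20)
)
print(n_eps, events_agree)
```

On random paths one would average a power of this last-exceedance time over many simulated paths to probe the moment condition; the deterministic path is used here only so that the output can be checked by hand.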

Our results are stated in Theorem 3.1 and Corollaries 3.1 and 3.2. As in Benda (1998) and Wu and Woodroofe (2000), the bulk of the work is to verify the existence of a solution to the Poisson equation (3.3). This is the content of Theorem 3.1. The asymptotic normality of $S_n(g)/\sqrt{n}$ (Corollary 3.1) then follows as in Benda (1998) by applying a martingale central limit theorem, while the $\rho$-quick convergence of $S_n(g)/n$ to 0 for suitable $\rho$ (Corollary 3.2) is obtained by using a result from Alsmeyer (1990) and Fuh and Zhang (2000).

Some preliminary considerations are needed before presenting our results:

A. Flattening the metric. For solving the Poisson equation (3.3) for a given function $g$, the particular complete separable metric $d$ on the space $X$ is not essential and may be altered to our convenience. This has been observed by Wu and Woodroofe (2000), who therefore consider flattened variations of $d$ obtained by composing $d$ with an arbitrary nondecreasing, concave function $\psi : [0,\infty) \to [0,\infty)$ with $\psi(0) = 0$ and $\psi(t) > 0$ for all $t > 0$. Let $\Psi$ be the collection of all such functions. It is easy to see that $d_\psi := \psi \circ d$ is again a complete metric. Possible choices from $\Psi$ include $\psi_p(t) := t^p$ for any $0 < p \le 1$ as well as $\psi^*(t) := \frac{t}{1+t}$. The latter choice leads to a bounded metric $d_{\psi^*}$ satisfying

\[ d_{\psi^*}(x,y) \le d(x,y) \le 2\, d_{\psi^*}(x,y) \tag{3.7} \]

for all $(x,y) \in \{(u,v) \in X^2 : d(u,v) \le 1\}$. This shows that the behavior of $d$ and $d_{\psi^*}$ is essentially the same for small values. Notice further that $\psi^* \circ \psi \in \Psi$ with

\[ \lim_{t \downarrow 0} \frac{\psi^* \circ \psi(t)}{\psi(t)} = 1 \tag{3.8} \]

for all $\psi \in \Psi$.
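The two-sided comparison between a metric and its bounded flattened version can be checked numerically. The sketch below is illustrative: it assumes the real line with the usual distance as the underlying space, and the helper names `psi_star`, `psi_p` and `flatten` are hypothetical.

```python
def psi_star(t):
    """Bounded flattening function t / (1 + t)."""
    return t / (1.0 + t)

def psi_p(t, p):
    """Power flattening function t**p, intended for 0 < p <= 1."""
    return t ** p

def flatten(d, psi):
    """Compose a metric d with a flattening function psi."""
    return lambda x, y: psi(d(x, y))

d = lambda x, y: abs(x - y)              # usual metric on the real line
d_star = flatten(d, psi_star)
d_root = flatten(d, lambda t: psi_p(t, 0.5))

# Pairs at distance at most 1: the sandwich d* <= d <= 2 d* should hold.
pairs = [(0.0, 0.1), (0.2, 0.9), (-0.5, 0.5), (1.0, 1.0)]
sandwich_ok = all(
    d_star(x, y) <= d(x, y) <= 2.0 * d_star(x, y) for x, y in pairs
)

# At distance 3 the upper bound fails, which is why it is stated only
# for pairs at distance at most 1.
upper_fails_far = d(0.0, 3.0) > 2.0 * d_star(0.0, 3.0)

print(sandwich_ok, upper_fails_far, d_star(0.0, 1e6), d_root(0.0, 4.0))
```

The flattened metric stays below 1 however far apart the points are, while for nearby points it differs from the original metric by at most a factor of 2.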

B. Integrable local Lipschitz constant. One can further relax the global Lipschitz continuity of $g$ needed in Benda (1998) and instead be satisfied with a $\pi$-almost sure local Lipschitz continuity (with respect to a flattened metric $d_\psi$) in combination with an integrability condition on the local Lipschitz constant. To make this precise, let $\psi \in \Psi$. For a measurable $g : X \to \mathbb{R}$, define its local Lipschitz constant at $x \in X$ with respect to $d_\psi$ as

\[ l_\psi(g,x) = \sup_{y : 0 < d(x,y) \le 1} \frac{|g(x) - g(y)|}{d_\psi(x,y)} \tag{3.9} \]

and, for $r \in [1,\infty]$,

\[ \|g\|_{r,\psi} = \|l_\psi(g,\cdot)\|_r, \tag{3.10} \]

where $\|\cdot\|_r$ denotes the usual norm on $L^r(\pi)$. It is easily seen that $\|\cdot\|_{r,\psi}$ defines a (pseudo-)norm on the space

\[ L^r_{\psi,0}(\pi) = \Big\{ g \in L^r(\pi) : \int_X g(x)\, \pi(dx) = 0 \text{ and } \|g\|_{r,\psi} < \infty \Big\} \tag{3.11} \]

and that $L^r_{\psi,0}(\pi) = L^r_{\psi^* \circ \psi,0}(\pi)$ with $\frac{1}{2} \|\cdot\|_{r,\psi^* \circ \psi} \le \|\cdot\|_{r,\psi} \le \|\cdot\|_{r,\psi^* \circ \psi}$ on this space (use (3.7) and (3.8)). Possibly after replacing $\psi$ with $\psi^* \circ \psi$, we may therefore always assume $\psi$ to be bounded when dealing with elements of $L^r_{\psi,0}(\pi)$.

Plainly, all global Lipschitz functions, i.e. all $g \in \mathcal{L}_{\mathrm{Lip}}(X,\mathbb{R})$, are elements of $L^r_{\psi,0}(\pi)$ for any $\psi \in \Psi$. However, $g$ need not be continuous in order to be an element of some $L^r_{\psi,0}(\pi)$. As pointed out in Wu and Woodroofe (2000), if $g = 1_B$ is the indicator function of some $B \in \mathcal{B}(X)$, then

\[ l_\psi(1_B, x) = \frac{1}{d_\psi(\partial B, x)}, \tag{3.12} \]

where $\partial B$ denotes the topological boundary of $B$ and $d_\psi(\partial B, x) := \inf_{y \in \partial B} d_\psi(x,y)$. They further show that, if $B(x,R) = \{y : |x - y| \le R\}$ is the closed $R$-ball with center $x \in X$, $\psi(t) = t^{1/4}$ and $\lambda$ denotes Lebesgue measure, then, for each $x \in X$, $1_{B(x,R)} - \pi(B(x,R)) \in L^2_{\psi,0}(\pi)$ for $\lambda$-almost all $R > 0$; see their Theorem 3.

Theorem 3.1. Let $r \in (1,\infty]$ with conjugate number $s \ge 1$, given by $\frac{1}{r} + \frac{1}{s} = 1$. Let also $\psi \in \Psi$ satisfy

\[ \int_0^1 \frac{\psi(t)}{t}\, dt < \infty. \tag{3.13} \]

If $E\log^+ L_1 < 0$ and (2.6) holds for some $p > s$, then each $g \in L^r(\pi) \cap L^1_{\psi,0}(\pi)$ admits a solution $h \in L_0^r(\pi)$ to the Poisson equation $h = g + Ph$.

We remark that all examples of 2 mentioned in Section 8 satisfy condition (2.6).

With the help of the Poisson equation, one may write

\[ S_n(g) = W_n + R_n, \quad n \ge 1, \tag{3.14} \]

where

\[ W_n := \sum_{k=1}^n \big( h(M_k) - Ph(M_{k-1}) \big), \quad n \ge 0, \tag{3.15} \]

forms a zero-mean martingale under $P_\pi$ with stationary increments from $L^r(\pi)$, and

\[ R_n := Ph(M_0) - Ph(M_n), \quad n \ge 1, \tag{3.16} \]

is stochastically $L^r$-bounded under $P_\pi$ in the sense that

\[ \sup_{n \ge 1} P_\pi(|R_n| > t) \le 2\, P_\pi(Z > t) \tag{3.17} \]

for all $t > 0$ and some $Z \in L^r(\pi)$; take any random variable $Z \ge 0$ with distribution function $P_\pi(|Ph(M_0)| \le t/2)$ for $t \ge 0$.
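The decomposition of the partial sums into a martingale part and a bounded remainder can be verified on a toy example. The sketch below is an illustration under strong simplifying assumptions, not the construction used in the paper: the state space is a two-point set, the transition matrix `P`, the centered function `g` and the helper names are hypothetical, and the Poisson equation is solved by summing the series of iterated kernels applied to `g`, which converges here because the chain is geometrically ergodic.

```python
import random

P = [[0.9, 0.1], [0.2, 0.8]]      # toy two-state transition matrix
pi = [2.0 / 3.0, 1.0 / 3.0]       # its stationary distribution
g = [1.0 / 3.0, -2.0 / 3.0]       # centered: pi[0]*g[0] + pi[1]*g[1] = 0

def apply_kernel(P, h):
    """(Ph)(x) = sum over y of P(x, y) * h(y)."""
    return [sum(P[x][y] * h[y] for y in range(len(h))) for x in range(len(P))]

def solve_poisson(P, g, iterations=200):
    """Iterate h <- g + Ph starting from h = 0, i.e. sum g + Pg + P^2 g + ..."""
    h = [0.0] * len(g)
    for _ in range(iterations):
        Ph = apply_kernel(P, h)
        h = [g[x] + Ph[x] for x in range(len(g))]
    return h

h = solve_poisson(P, g)
Ph = apply_kernel(P, h)
mean_g = sum(pi[x] * g[x] for x in range(2))

# Simulate a path M_0, ..., M_500 of the chain started in state 0.
rng = random.Random(1)
path = [0]
for _ in range(500):
    path.append(0 if rng.random() < P[path[-1]][0] else 1)

S = sum(g[m] for m in path[1:])                                     # partial sum
W = sum(h[path[k]] - Ph[path[k - 1]] for k in range(1, len(path)))  # martingale part
R = Ph[path[0]] - Ph[path[-1]]                                      # remainder

print(h, S, W + R)
```

The identity between the partial sum and the sum of the two parts holds along every path, since it only uses the Poisson equation and a telescoping sum; the remainder involves just the first and last states, which is why it stays bounded while the martingale part carries the fluctuations.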

In the stationary regime, that is under $P_\pi$, the following central limit theorem now follows exactly as in Benda (1998) from Theorem 3.1 and a martingale central limit theorem. However, an additional argument is needed to show that the same result holds true under $P_x$ for $\pi$-almost all $x \in X$. While this extension is not considered in Wu and Woodroofe (2000), its proof in Benda (1998) fails to work here because it draws on the continuity of $g$ and a moment condition like (2.5).

Corollary 3.1. Given the assumptions of Theorem 3.1 with $r \ge 2$ and $p > s$, $S_n(g)/\sqrt{n}$ is asymptotically normal with mean 0 and variance $s^2(g) := \int (h^2 - (Ph)^2)\, d\pi$ under $P_\pi$ as well as under $P_x$ for $\pi$-almost all $x \in X$.

So if $g \in L^2(\pi) \cap L^1_{\psi,0}(\pi)$, we need moment condition (2.6) for some $p > 2$ to conclude asymptotic normality of $S_n(g)/\sqrt{n}$. Using the existence of a solution to the Poisson equation (3.3), the following corollary is taken from Theorem 2 in Fuh and Zhang (2000).

Corollary 3.2. Given the assumptions of Theorem 3.1 with $p > s > 1$, $S_n(g)/n$ converges $\rho$-quickly to 0 for $\rho = r - 1$, i.e.

\[ \sum_{n \ge 1} n^{r-2} P\Big( \sup_{j \ge n} j^{-1} |S_j(g)| \ge \varepsilon \Big) < \infty \tag{3.18} \]

for all $\varepsilon > 0$.

4 Harris recurrence of iterated random functions

Let $M_n = F(\theta_n, M_{n-1})$, $n \ge 1$, be the iterated random functions defined in Section 1. By the ergodic theorem, Theorem 2.1(e), we have for each $B \in \mathcal{B}(X)$

\[ \lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^n 1_B(M_k) = \pi(B) \tag{4.1} \]

$P_\pi$-a.s. and thus also $P_x$-a.s. for $\pi$-almost all $x \in X$. Hence, if $\pi(B) > 0$, then

\[ P_x(M_n \in B \text{ i.o.}) = 1 \tag{4.2} \]

for $\pi$-almost all $x \in X$, and we would like to conclude that every $\pi$-positive set $B$ is recurrent. Unfortunately, the $\pi$-null set of $x \in X$ for which (4.2) fails to hold in general depends on the set $B$. On the other hand, if it does not, we infer the $\pi$-irreducibility of the chain $(M_n)_{n \ge 0}$ on some $H$ with $\pi(H) = 1$ and then, because of (4.2) for each $\pi$-positive $B$, further its Harris recurrence on $H$. Provided the chain is additionally aperiodic, this in turn implies that $P_x(M_n \in \cdot)$ converges to $\pi$ in total variation for every $x \in H$, which, of course, is a much stronger conclusion than Elton's result in Theorem 2.1. With regard to a further analysis of IRF, for instance the rate of convergence towards stationarity (in total variation), it also gives access to the highly developed theory for irreducible and Harris recurrent Markov chains on general state spaces.

Given an IRF of i.i.d. Lipschitz maps with an a.s. negative Liapunov exponent and satisfying condition (2.13), two questions will be considered in this section and discussed in various examples in Section 8. First, we state a sufficient condition for $H = X$ in Theorem 4.1. This condition is quite often easy to check in applications when the stationary distribution $\pi$ is known to some extent; see Section 8 for several examples. Second, we deal with the convergence towards stationarity for Harris recurrent IRF. Under additional moment conditions on $L_1$ and $d(F_1(x_0),x_0)$, we will show $w$-regularity and $w$-ergodicity for suitable functions $w$ in Theorem 4.2, and provide polynomial as well as geometric rates of convergence towards stationarity in Theorem 4.3. Theorems 4.1 to 4.3 are taken from Alsmeyer (2003).

A set $B \in \mathcal{B}(X)$ is called $\pi$-full if $\pi(B) = 1$, and $P$-absorbing if $P(x,B) = 1$ for all $x \in B$. For the definitions of irreducibility, Harris recurrence and related notions for Markov chains on general state spaces not explicitly repeated here, we refer to the standard monograph by Meyn and Tweedie (1993). If $(M_n)_{n \ge 0}$ is a Harris chain on a set $H$, this set is called a Harris set (for $(M_n)_{n \ge 0}$). It is well known that in this case there always exists a maximal absorbing set with this property, called the maximal Harris set. Our next theorem contains some information on when this latter set is the whole space $X$. Let $\mathrm{int}(B)$ denote the interior of a set $B \in \mathcal{B}(X)$.

Theorem 4.1. Suppose $(M_n)_{n \ge 0}$ is an IRF of i.i.d. Lipschitz maps which has a.s. negative Liapunov exponent $l^*$ and satisfies (2.13). Let $\pi$ denote its stationary distribution. Suppose $(M_n)_{n \ge 0}$ is Harris recurrent with maximal Harris set $H$. Then the following assertions hold:

(a) Either $\pi(\mathrm{int}(H)) = 0$, or $H = X$.

(b) Suppose there exist a $\pi$-positive set $X_0$ and a $\sigma$-finite measure $\lambda$ on $(X, \mathcal{B}(X_0))$ such that each $P(x,\cdot)$, $x \in X_0$, possesses a $\lambda$-continuous component. If, furthermore, $\pi(\mathrm{int}(X_0)) > 0$ and $\mathrm{int}(\mathrm{supp}\, \pi) \ne \emptyset$, then $H = X$.

As already mentioned above, Theorem 4.1 implies, by invoking the ergodic theorem for aperiodic, positive Harris chains (see Meyn and Tweedie (1993), Theorem 13.0.1), that

\[ \lim_{n \to \infty} \|P_x(M_n \in \cdot) - \pi\| = 0 \tag{4.3} \]

for all $x \in H$, where $\|\cdot\|$ denotes the total variation distance. A weaker metric considered in Diaconis and Freedman (1999) and Alsmeyer and Fuh (2001) is the Prokhorov metric associated with $d$. See also Theorems 2.2 and 2.3 in Section 2.

If $(M_n)_{n \ge 0}$ is Harris recurrent, it is natural to ask, in view of Theorem 2.2(d) and Theorem 2.3(c), whether or not similar conclusions hold when replacing the Prokhorov distance with the total variation distance. The positive answer is provided in Theorem 4.3 for the case $H = X$ and under the additional assumption that the support of the stationary distribution $\pi$ has nonempty interior.

Weaker conclusions concerning the $w$-regularity of $(M_n)_{n \ge 0}$, stated as Theorem 4.2, can be obtained considerably more easily. Following Meyn and Tweedie (1993), a set $C \in \mathcal{B}(X)$ is called $w$-regular for a function $w : X \to [1,\infty)$ if for each $\pi$-positive $B \in \mathcal{B}(X)$

\[ \sup_{x \in C} E_x \Big( \sum_{n=0}^{\varrho(B)-1} w(M_n) \Big) < \infty, \]

where $\varrho(B) := \inf\{n \ge 1 : M_n \in B\}$. $(M_n)_{n \ge 0}$ is called $w$-regular on a $P$-absorbing set $H$ if it is $\pi$-irreducible and $H$ admits a countable cover of $w$-regular sets. Define the $w$-norm $\|\lambda\|_w$ for a signed measure $\lambda$ as

\[ \|\lambda\|_w := \sup_{|g| \le w} |\lambda(g)|, \quad \lambda(g) := \int g\, d\lambda. \]

$(M_n)_{n \ge 0}$ is called $w$-ergodic on $H$ if it is positive Harris on $H$ with invariant distribution $\pi$ satisfying $\pi(w) < \infty$ and if

\[ \lim_{n \to \infty} \|P^n(x,\cdot) - \pi\|_w = 0 \]

for all $x \in H$. Now put

\[ w(x) := 1 + \log^p(1 + d(x,x_0)) \tag{4.4} \]

provided (2.5) holds for $p > 0$, and

\[ w(x) := 1 + d(x,x_0)^\eta \tag{4.5} \]

provided (2.6) holds for $p > 0$, with $0 < \eta \le p$ such that $\int_X d(x,x_0)^\eta\, d\pi(x) < \infty$. By using Meyn and Tweedie's main result on $w$-regularity, the following result is now immediate and hence stated without proof.

Theorem 4.2. Let $(M_n)_{n \ge 0}$ be an IRF of i.i.d. Lipschitz maps satisfying Elton's conditions. Suppose further that $(M_n)_{n \ge 0}$ is an aperiodic positive Harris chain on a $\pi$-full, absorbing set $H$ and that either (2.5) or (2.6) holds for some $p > 0$. Then $H$ may be chosen such that $(M_n)_{n \ge 0}$ is $w$-regular and $w$-ergodic on $H$ with $w$ according to (4.4) or (4.5), respectively.

It is to be understood that the Harris set $H$ on which $(M_n)_{n \ge 0}$ is $w$-regular need not be the maximal Harris set.

Theorem 4.3. Let $(M_n)_{n \ge 0}$ be an IRF of i.i.d. Lipschitz maps with a.s. negative Liapunov exponent $l^*$ and stationary distribution $\pi$. Suppose further that $(M_n)_{n \ge 0}$ is a positive Harris chain on the whole of $X$ and that $\mathrm{int}(\mathrm{supp}\, \pi) \ne \emptyset$. Then the following assertions hold:

(a) If $(M_n)_{n \ge 0}$ satisfies (2.5) for some $p > 0$, then

\[ \sum_{n \ge 1} n^{p-1} \|P_x(M_n \in \cdot) - \pi\| < \infty \tag{4.6} \]

as well as

\[ \lim_{n \to \infty} n^p \|P_x(M_n \in \cdot) - \pi\| = 0 \tag{4.7} \]

for all $x \in X$.

(b) If $(M_n)_{n \ge 0}$ satisfies (2.6) for some $p > 0$, then

\[ \sum_{n \ge 0} r^{-n} \|P_x(M_n \in \cdot) - \pi\|_w < \infty \tag{4.8} \]

for all $x \in X$ and some $r \in (0,1)$ not depending on $x \in X$, where $w$ is defined as in (4.5).

5 Spectral decomposition and characteristic functions of Markov random walks

It is shown in Section 4 that the Markov chain $(M_n)_{n \ge 0}$ induced by the iterated random functions is Harris recurrent on a set $H$. Under the assumption $H = X$ and the moment assumption (2.6), $(M_n)_{n \ge 0}$ is $w$-geometrically ergodic with $w$ defined in (4.5). In this section, we introduce the culminating form of the geometric ergodicity theorem, and show that such convergence can be viewed as geometric convergence in an operator norm. That is, the convergence is bounded independently of the starting point. In the following, we study the spectral theory for uniformly ergodic Markov chains with respect to a general norm, and apply it to iterated random functions in the next two sections. The material of this section is similar to that of Fuh and Lai (2001) and Fuh and Lai (2003); we include it here for completeness.

Let $\{(X_n,S_n), n \ge 0\}$ be a Markov random walk on $X \times R^{d}$. For ease of notation, denote $P(x,A) = P(x, A \times R^{d})$. For all transition probability kernels $P(x,A)$, $Q(x,A)$, $x \in X$, $A \in \mathcal{A}$, and for all measurable functions $h(x)$, $x \in X$, define $Qh$ and $PQ$ by $Qh(x) = \int Q(x,dy)\,h(y)$ and $PQ(x,A) = \int P(x,dy)\,Q(y,A)$, respectively.

Let $N$ be the Banach space of measurable functions $h: X \to C$ (:= set of complex numbers) with norm $\|h\| < \infty$. We introduce the Banach space $B$ of transition probability kernels $Q$ such that the operator norm $\|Q\| = \sup\{\|Qg\| : \|g\| \le 1\}$ is finite. Two prototypical norms used in the literature are the supremum norm and the $L^{p}$-norm for $1 < p < \infty$. Two other norms commonly used in applications are the weighted variation norm and the bounded Lipschitz norm, described as follows:

1. Let $w: X \to [1,\infty)$ be a measurable function. Define, for all measurable functions $h$, the weighted variation norm

$$\|h\|_{w} = \sup_{x\in X} |h(x)|/w(x), \qquad (5.1)$$

and set $N_{w} = \{h : \|h\|_{w} < \infty\}$. The corresponding norm in $B_{w}$ is of the form $\|Q\|_{w} = \sup_{x\in X} \int |Q|(x,dy)\,w(y)/w(x)$.

2. Let $(X,d)$ be a metric space. For any continuous function $h$ on $X$, the Lipschitz seminorm is defined by $\|h\|_{L} := \sup_{x\ne y} |h(x) - h(y)|/d(x,y)$. Denote the supremum norm by $\|h\|_{\infty} = \sup_{x\in X} |h(x)|$. The bounded Lipschitz norm is

$$\|h\|_{BL} := \|h\|_{L} + \|h\|_{\infty} \qquad (5.2)$$

and $N_{BL} = \{h : \|h\|_{BL} < \infty\}$. Here BL stands for "bounded Lipschitz".

Denote by $P^{n}(x,A) = P(X_n \in A \mid X_0 = x)$ the transition probabilities over $n$ steps. The kernel $P^{n}$ is the $n$-fold power of $P$. Define the Cesàro averages $P^{(n)} = \sum_{j=0}^{n} P^{j}/n$, where $P^{0} = P^{(0)} = I$ and $I$ is the identity operator on $B$.


Definition 1. A Markov chain $\{X_n, n \ge 0\}$ is said to be uniformly ergodic (or strongly stable) with respect to a given norm $\|\cdot\|$ if there exists a stochastic kernel $\Pi$ such that $P^{(n)} \to \Pi$ as $n \to \infty$ in the induced operator norm in $B$. The Markov chain $\{X_n, n \ge 0\}$ is called $w$-uniformly ergodic in the case of the weighted variation norm.

The Markov chain $\{X_n, n \ge 0\}$ is assumed to be irreducible (with respect to some measure on $\mathcal{A}$), aperiodic and strongly stable. Theorem 1.1 of Kartashov (1996) implies that $P$ has a unique stationary projector $\Pi$, in the sense that $\Pi^{2} = \Pi = P\Pi = \Pi P$, and $\Pi(x,A) = \pi(A)$ for all $x \in X$ and $A \in \mathcal{A}$.

The following assumptions will be used in this section.

C1. There exist a natural number $n$, a measure $\psi$ on $\mathcal{A}$ and a measurable function $h$ on $X$ such that $\int \pi(dx)\,h(x) > 0$, $\psi(X) = 1$, $\int \psi(dx)\,h(x) > 0$, and the kernel $T(x,A) = P^{n}(x,A) - h(x)\psi(A)$ is nonnegative.

C2. $\sup_{\|h\|\le 1} \|E[h(X_1) \mid X_0 = x]\| < \infty$.

C3. $\sup_{x} E_x|\xi_1|^{2} < \infty$ and $\sup_{\|h\|\le 1} \|E[|\xi_1|^{r} h(X_1) \mid X_0 = x]\| < \infty$ for some $r \ge 3$.

C4. Let $\nu$ be an initial distribution of the Markov chain $\{X_n, n \ge 0\}$, and assume that for some $r \ge 1$,

$$\|\nu\| := \sup_{\|h\|\le 1} \Big| \int_{x\in X} h(x)\,E_x|\xi_1|^{r}\,\nu(dx) \Big| < \infty.$$

Remarks: 1. Condition C1 is a mixing condition on the Markov chain $\{X_n, n \ge 0\}$, and it is satisfied for Harris recurrent Markov chains. An example on page 9 of Kartashov (1996) shows that there exists a Markov chain that is uniformly ergodic with respect to a given norm but is not Harris recurrent. C2 guarantees that the operators defined in (5.4)-(5.5) below are bounded. C3 and C4 are moment conditions. We also note that, by an argument similar to that in Section 3 of Jensen (1987), the $X_1$ and $\xi_1$ appearing in C2-C4 can be relaxed to $X_t$ and $\xi_t$ for some fixed $t > 1$.

2. Theorem 2.2 and Corollary 2.1 of Kartashov (1996) show that, under Condition C1, a Markov chain $X$ with transition probability kernel $P$ is uniformly ergodic with respect to a given norm $\|\cdot\|$ if and only if there exist $\gamma > 0$ and $0 < \rho < 1$ such that for all $n \ge 1$,

$$\|P^{n} - \Pi\| \le \gamma\rho^{n}. \qquad (5.3)$$

When the Markov chain $\{X_n, n \ge 0\}$ is $w$-uniformly ergodic, (5.3) is satisfied without Condition C1.
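As a hedged numerical illustration of the geometric bound (5.3), the following sketch takes a hypothetical three-state chain (not from the paper) and the supremum norm, for which the induced operator norm of a kernel is its maximal absolute row sum. The matrix `P`, the helper names and the number of steps are assumptions made only for illustration.

```python
# Hypothetical 3-state transition matrix, chosen only for illustration.
P = [[0.5, 0.3, 0.2],
     [0.2, 0.6, 0.2],
     [0.1, 0.3, 0.6]]


def stationary(P, iters=2000):
    """Approximate the stationary row vector by iterating pi <- pi P."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def sup_operator_norm(K):
    """Operator norm induced by the supremum norm: max absolute row sum."""
    return max(sum(abs(v) for v in row) for row in K)


def deviation_norms(P, steps):
    """Return [||P^n - Pi|| for n = 1..steps], where Pi(x, .) = pi."""
    n = len(P)
    pi = stationary(P)
    Pn = [row[:] for row in P]
    norms = []
    for _ in range(steps):
        diff = [[Pn[i][j] - pi[j] for j in range(n)] for i in range(n)]
        norms.append(sup_operator_norm(diff))
        # Advance to the next power P^(n+1) = P^n P.
        Pn = [[sum(Pn[i][k] * P[k][j] for k in range(n)) for j in range(n)]
              for i in range(n)]
    return norms


norms = deviation_norms(P, 20)
ratios = [b / a for a, b in zip(norms, norms[1:])]
```

For a finite chain like this one, the successive ratios in `ratios` settle near the second-largest eigenvalue modulus of `P`, which plays the role of the rate in (5.3) in this toy setting.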

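For $d \times 1$ vectors $\alpha$, define the linear operators $P_{\alpha}$, $P$, $\nu_{\alpha}$ and $Q$ on $N$ by

$$(P_{\alpha}h)(x) = \int h(y)\,e^{i\alpha' s}\,P(x, dy \times ds) = E[h(X_1)\,e^{i\alpha' S_1} \mid X_0 = x], \qquad (5.4)$$

$$(Ph)(x) = \int h(y)\,P(x, dy \times ds) = E[h(X_1) \mid X_0 = x], \qquad (5.5)$$

$$\nu_{\alpha}h = E_{\nu}\{h(X_0)\,e^{i\alpha' S_0}\}, \qquad Qh = \int h(y)\,\pi(dy). \qquad (5.6)$$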

Condition C2 ensures that $P_{\alpha}$ and $P$ are bounded linear operators on $N$, and (5.3) implies that

$$\|P^{n} - Q\| = \sup_{h\in N,\ \|h\|\le 1} \|P^{n}h - Qh\| \le \gamma\rho^{n}. \qquad (5.7)$$

For a bounded linear operator $T: N \to N$, the resolvent set is defined as $\{z \in C : (T - zI)^{-1} \text{ exists}\}$, and $(T - zI)^{-1}$ is called the resolvent (when the inverse exists). From (5.7), it follows that for $z \ne 1$ and $|z| > \rho$,

$$R(z) := Q/(z-1) + \sum_{n=0}^{\infty} (P^{n} - Q)/z^{n+1} \qquad (5.8)$$

is well defined. Since $R(z)(P - zI) = -I = (P - zI)R(z)$, the resolvent of $P$ is $-R(z)$.

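Moreover, by C3 and an argument similar to the proof of Lemma 2.2 of Jensen (1987), there exist $K > 0$ and $\delta > 0$ such that for $|\alpha| \le \delta$, $|z - 1| > (1-\rho)/6$ and $|z| > \rho + (1-\rho)/6$,

$$\|P_{\alpha} - P\| \le K|\alpha|, \qquad (5.9)$$

$$R_{\alpha}(z) := \sum_{n=0}^{\infty} R(z)\{(P_{\alpha} - P)R(z)\}^{n} \ \text{ is well defined.} \qquad (5.10)$$

Since $R_{\alpha}(z)(P_{\alpha} - zI) = R_{\alpha}(z)\{(P_{\alpha} - P) + (P - zI)\} = -I = (P_{\alpha} - zI)R_{\alpha}(z)$, the resolvent of $P_{\alpha}$ is $-R_{\alpha}(z)$.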

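For $|\alpha| \le \delta$, the spectrum (which is the complement of the resolvent set) of $P_{\alpha}$ therefore lies inside the two circles

$$C_1 = \{z : |z - 1| = (1-\rho)/3\} \ \text{ and } \ C_2 = \{z : |z| = \rho + (1-\rho)/3\}. \qquad (5.11)$$

Hence, by the spectral decomposition theorem (cf. Riesz and Sz.-Nagy (1955), page 421), $N = N_1(\alpha) \oplus N_2(\alpha)$ and

$$Q_{\alpha} := \frac{1}{2\pi i}\int_{C_1} R_{\alpha}(z)\,dz, \qquad I - Q_{\alpha} := \frac{1}{2\pi i}\int_{C_2} R_{\alpha}(z)\,dz \qquad (5.12)$$

are parallel projections of $N$ onto the subspaces $N_1(\alpha)$, $N_2(\alpha)$, respectively. Moreover, by an argument similar to the proof of Lemma 2.3 of Jensen (1987), there exists $0 < \tilde\delta \le \delta$ such that $N_1(\alpha)$ is one-dimensional for $|\alpha| \le \tilde\delta$ and

$$\sup_{|\alpha|\le\tilde\delta} \|Q_{\alpha} - Q\| < 1. \qquad (5.13)$$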


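For $|\alpha| \le \tilde\delta$, let $\lambda(\alpha)$ be the eigenvalue of $P_{\alpha}$ with corresponding eigenspace $N_1(\alpha)$. Since $Q_{\alpha}$ is the parallel projection onto the subspace $N_1(\alpha)$ in the direction of $N_2(\alpha)$,

$$P_{\alpha}Q_{\alpha}h = \lambda(\alpha)\,Q_{\alpha}h \ \text{ for } h \in N. \qquad (5.14)$$

Letting $\nu$ denote the initial distribution of $(X_0,S_0)$ and defining the operator $\nu_{\alpha}$ by (5.6), we then have for $h \in N$,

$$E_{\nu}\{e^{i\alpha' S_n}h(X_n)\} = \nu_{\alpha}P_{\alpha}^{n}h = \nu_{\alpha}P_{\alpha}^{n}\{Q_{\alpha} + (I - Q_{\alpha})\}h = \lambda^{n}(\alpha)\,\nu_{\alpha}Q_{\alpha}h + \nu_{\alpha}P_{\alpha}^{n}(I - Q_{\alpha})h. \qquad (5.15)$$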

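Suppose C4 also holds. An argument similar to the proof of Lemma 2.4 of Jensen (1987) shows that there exist $K^{*} > 0$ and $0 < \delta^{*} < \tilde\delta$ such that for $|\alpha| \le \delta^{*}$,

$$\|\nu_{\alpha}P_{\alpha}^{n}(I - Q_{\alpha})h\| \le K^{*}\,\|h\|\,|\alpha|\,\{(1 + 2\rho)/3\}^{n}. \qquad (5.16)$$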

We next consider the summand $\lambda^{n}(\alpha)\,\nu_{\alpha}Q_{\alpha}h$ in (5.15). Suppose that C3 holds with $r \ge 3$ and let $[r]$ denote the integer part of $r$. Then, analogous to Lemma 2.5 of Jensen (1987), $\lambda(\alpha)$ has the Taylor expansion

$$\lambda(\alpha) = 1 + \sum_{(j_1,\dots,j_d):\ 1 \le j_1+\cdots+j_d \le [r]} i^{\,j_1+\cdots+j_d}\,\lambda_{j_1,\dots,j_d}\,\alpha_1^{j_1}\cdots\alpha_d^{j_d}/(j_1!\cdots j_d!) + \Delta(\alpha) \qquad (5.17)$$

in some neighborhood of the origin, where $\Delta(\alpha) = O(|\alpha|^{r})$ as $\alpha \to 0$. Assume furthermore that C4 holds. Then, analogous to Lemma 2.6 of Jensen (1987), $\nu_{\alpha}Q_{\alpha}h_1$ has continuous partial derivatives of order $[r] - 2$ in some neighborhood of the origin. Moreover, there exist constants $K$ and $0 < \delta_0 < \delta^{*}$ such that for $|\alpha| < \delta_0$ and $l \le r - 2$, we have

$$\frac{d^{l}}{d\alpha^{l}}\,\nu_{\alpha}Q_{\alpha}h = \sum_{j=l}^{r-3} \frac{1}{(j-l)!}\,\alpha^{j-l}\,\nu_j + cK|\alpha|^{r-2-l}, \qquad (5.18)$$

where $|c| \le 1$, and $\nu_j$, $j = 0,1,\dots,r-3$, are constants with $\nu_0 = 1$. $\nu_{\alpha}P_{\alpha}^{n}(I - Q_{\alpha})h$ has continuous partial derivatives of order $[r]$ in some neighborhood of the origin. Moreover, the norm of any such partial derivative converges to 0 geometrically fast. In summary, we have

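Theorem 5.1. Let $\{(X_n,S_n), n \ge 0\}$ be the Markov random walk defined as (5.1), satisfying Conditions C1-C4. Then there exists a $\delta > 0$ such that for all $\alpha \in R^{d}$ with $|\alpha| < \delta$, we have

$$P_{\alpha} = \lambda(\alpha)\,Q_{\alpha} + P_{\alpha}(I - Q_{\alpha}), \qquad (5.19)$$

and

(i) $\lambda(\alpha)$ is the unique eigenvalue of maximal modulus of $P_{\alpha}$;

(ii) $Q_{\alpha}$ is a rank-one projection such that $Q_{\alpha}(I - Q_{\alpha}) = (I - Q_{\alpha})Q_{\alpha} = 0$;

(iii) the mappings $\lambda(\alpha)$, $Q_{\alpha}$ and $I - Q_{\alpha}$ are analytic for $|\alpha| < \delta$;

(iv) $|\lambda(\alpha)| > \frac{2+\rho}{3}$, and for each $p \in \mathbb{N}$, the set of positive integers, there exists $c > 0$ such that for each $n \in \mathbb{N}$,

$$\Big\| \frac{d^{p}}{d\alpha^{p}}\,P_{\alpha}^{n}(I - Q_{\alpha}) \Big\| \le c\Big(\frac{1 + 2\rho}{3}\Big)^{n};$$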

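(v) defining $\gamma_j = \lim_{n\to\infty} (1/n)\,E_x \log \|T_n^{(j)}\|$ as the upper Liapunov exponent, it follows that

$$\gamma_j = \frac{\partial\lambda(\alpha)}{\partial\alpha_j}\Big|_{\alpha=0} = \int E_x\big(\log \|M_1^{(j)}u\|/\|u\|\big)\,dm(x,u).$$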
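To make the role of the leading eigenvalue $\lambda(\alpha)$ concrete, here is a minimal sketch for a hypothetical two-state chain with increments $\xi_1 = g(X_1)$ (the special case mentioned in the remarks below). In this finite setting the operator (5.4) is the matrix with entries $P(x,y)e^{i\alpha g(y)}$, and first-order perturbation theory gives $\lambda'(0) = i\sum_x \pi(x)g(x)$. The matrix `P`, the function values `g` and the step size `h` are assumptions chosen for illustration, not taken from the paper.

```python
import cmath

# Hypothetical two-state chain and increment function (illustrative only).
P = [[0.7, 0.3],
     [0.4, 0.6]]
g = [1.0, -2.0]
pi = [4.0 / 7.0, 3.0 / 7.0]  # stationary distribution of P (pi P = pi)


def leading_eigenvalue(alpha):
    """Eigenvalue of P_alpha(x, y) = P(x, y) exp(i alpha g(y)) closest to 1."""
    a = P[0][0] * cmath.exp(1j * alpha * g[0])
    b = P[0][1] * cmath.exp(1j * alpha * g[1])
    c = P[1][0] * cmath.exp(1j * alpha * g[0])
    d = P[1][1] * cmath.exp(1j * alpha * g[1])
    trace, det = a + d, a * d - b * c
    root = cmath.sqrt(trace * trace - 4.0 * det)
    candidates = [(trace + root) / 2.0, (trace - root) / 2.0]
    return min(candidates, key=lambda z: abs(z - 1.0))


# Stationary mean of the increments.
mu = sum(p * v for p, v in zip(pi, g))

# Central finite difference for lambda'(0); should be close to i * mu.
h = 1e-4
derivative = (leading_eigenvalue(h) - leading_eigenvalue(-h)) / (2.0 * h)
```

The finite difference `derivative` is numerically close to `1j * mu`, and `abs(leading_eigenvalue(alpha))` drops below 1 for small nonzero `alpha`, which is the mechanism behind the decomposition (5.15).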

Remarks: 1. Under C1-C3 with respect to a norm $\|\cdot\|$, together with the assumption $\|E_x\{|S_1|^{r}\}\| < \infty$, it can be shown by an argument similar to the proof of Lemma 2.7 of Jensen (1987) that $\nu_{\alpha}P_{\alpha}^{n}(I - Q_{\alpha})h$ has continuous partial derivatives of order $[r]$ in some neighborhood of the origin. Moreover, analogous to (5.16), the norm of any such partial derivative converges to 0 geometrically fast, by an argument similar to the proof of Lemma 2.4 of Jensen (1987).

2. For the special case $\xi_t = g(X_t)$ with $g: X \to R$, the representation (5.15)-(5.17) of the characteristic function $E_{\nu}(e^{i\alpha S_n})$ was first obtained by Nagaev (1957) under the uniform ergodicity condition

$$\sup_{A,x,y} |P(X_m \in A \mid X_0 = x) - P(X_m \in A \mid X_0 = y)| < 1 \ \text{ for some } m \ge 1. \qquad (5.20)$$

As noted by Nagaev, (5.20) implies the existence of a stationary distribution $\pi$ and a uniform geometric rate of convergence to the stationary distribution,

$$\sup_{A,x} |P(X_n \in A \mid X_0 = x) - \pi(A)| \le \gamma\rho^{n}, \qquad (5.21)$$

for some $\gamma > 0$, $0 < \rho < 1$ and all $n \ge 1$. Jensen (1987) first clarified Nagaev's arguments and then considered the more general case $\xi_t = g(X_{t-1},X_t)$ with $g: X \times X \to R$. Noting that the moment condition $\sup_x E[|g(x,X_1)|^{r} \mid X_0 = x] < \infty$ required in Nagaev's arguments for such $\xi_t$ is not satisfied in most cases where $g$ depends on both $X_{t-1}$ and $X_t$, he extended Nagaev's representation (5.15)-(5.17) to the case where (5.20) holds (also in the case of the $L^{p}$-norm for $1 < p < \infty$) and

$$\sup_x E[|g(X_m,X_{m+1})|^{r} \mid X_0 = x] < \infty \ \text{ for some } m \ge 1 \text{ and } r \ge 3, \qquad (5.22)$$

Z

E[jg(X

t1

;X

t

)j

r2

jX

0

= x](dx) < 1for 1 t m2:(5.23)


Instead of introducing a delay $m$ as in (5.22) and (5.23), we broaden the scope of applicability of Nagaev's representation theory by using a general norm not only in the moment condition C3 but also in the ergodicity condition. In Sections 6, 7 and 8, the usefulness of this idea is discussed further and illustrated with examples of iterated random functions.

6 Rate of convergence theorems: Asymptotic expansion

We showed in Section 4 that iterated random functions, under some regularity conditions, are Harris recurrent and $w$-ergodic. A sufficient condition for irreducibility is also given in Theorem 4.1. Section 5 provides the spectral theory for irreducible Markov operators. In this section, we apply results for Harris recurrent and strongly stable Markov chains to obtain Edgeworth expansions for iterated random functions. Edgeworth expansions for irreducible Harris recurrent Markov chains can be found in Hipp (1985), Malinovskii (1987) and Jensen (1989), while Edgeworth expansions for strongly stable Markov chains are in Fuh and Lai (2003).

By the assumption of Harris recurrence, we can assume, without loss of generality, that the state space $X$ has an atom $A_0$, that is, $\pi(A_0) > 0$ and $P(x,\cdot) = P(y,\cdot)$ for all $x, y \in A_0$. We may then define the stopping times $T_0 = \inf\{n \ge 0 \mid g(M_n) \in A_0\}$, $T_k = \inf\{n > T_{k-1} \mid g(M_n) \in A_0\}$ for $k \ge 1$, and $\tau_k = T_k - T_{k-1}$. Also we let 0 denote a fixed point in $A_0$. We adopt the notation of Section 3. For $j = 1,\dots,d$, let $g_j \in L_0^{2}(\pi)$ be a square integrable function with mean 0, i.e.

$$\int_X g_j\,d\pi = 0 \ \text{ and } \ \|g_j\|_2^{2} = \int_X g_j^{2}\,d\pi < \infty.$$

Denote $g = (g_1,\dots,g_d)$ and consider the sequence

$$S_n := S_n(g) := g(M_1) + \cdots + g(M_n), \quad n \ge 1,$$

which may be viewed as a Markov random walk with driving chain $(M_n)_{n\ge 0}$. We want to derive an asymptotic expansion of the distribution of the sum $S_n(g)$. For this we define the random variable

$$Z_k = \sum_{j=T_{k-1}+1}^{T_k} g(M_j) \qquad (6.1)$$

for $k = 1,2,\dots$. The uniform Cramér condition for $(Z_1,\tau_1)$ under $P_0$ states that for any $c > 0$ there exists a $\delta < 1$ such that

$$|E_0\{\exp(iu'Z_1 + iv\tau_1)\}| < \delta \qquad (6.2)$$

for all $v \in R$ and all $u \in R^{d}$ with $\|u\| > c$. We define a uniformity class $\mathcal{B}_c$ of Borel sets in the following way:

$$\mathcal{B}_c = \{B \in \mathcal{B}^{d} \mid \Phi\{(\partial B)^{\varepsilon}\} < c\varepsilon \ \text{ for all } \varepsilon > 0\}, \qquad (6.3)$$

where is the standard normal distribution in R

d

and (@B)

"

= fB(x;")jx 2 B;B(x;")

B

c

g with B(x;") a ball centered at x and with radius".
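The regeneration scheme above (hitting times $T_k$ of the atom, cycle lengths, and the block sums of (6.1)) can be sketched on a toy example. The following is only an illustration under stated assumptions: a hypothetical three-state chain with atom `{0}`, an integer-valued function `g`, and hitting times defined through the chain itself entering the atom. None of these choices come from the paper.

```python
import random

# Hypothetical 3-state transition matrix (illustrative only).
P = [[0.2, 0.5, 0.3],
     [0.4, 0.3, 0.3],
     [0.5, 0.25, 0.25]]


def g(x):
    """Hypothetical integer-valued additive functional."""
    return x - 1


def simulate_chain(P, x0, n, rng):
    """Simulate n steps of the chain started at x0; returns n + 1 states."""
    path = [x0]
    for _ in range(n):
        current = path[-1]
        path.append(rng.choices(range(len(P)), weights=P[current])[0])
    return path


def regeneration_blocks(path, g, atom):
    """Return hitting times T_k, cycle lengths T_k - T_{k-1}, and block sums Z_k."""
    hits = [n for n, x in enumerate(path) if x in atom]
    cycles = list(zip(hits, hits[1:]))
    taus = [b - a for a, b in cycles]
    blocks = [sum(g(path[j]) for j in range(a + 1, b + 1)) for a, b in cycles]
    return hits, taus, blocks


rng = random.Random(1)
path = simulate_chain(P, 0, 2000, rng)
hits, taus, blocks = regeneration_blocks(path, g, {0})
```

Because the chain restarts afresh each time it enters the atom, the pairs `(blocks[k], taus[k])` are i.i.d., which is what reduces expansions for the additive functional to expansions for sums of independent blocks.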

The following theorem is taken from Theorem 1 of Jensen (1989).

Theorem 6.1 Suppose (M

n

)

n0

is an IRF of i.i.d.Lipschitz maps which has a.s.negative

Liapunov exponent l

and satises (2.13).Let denote its stationary distribution.Suppose

there exists a -positive set X

0

and a -nite measure on (X;B(X)) such that each P(x;),

x 2 X

0

,possesses a -continuous component.Furthermore,if X with (int(X

0

)) > 0 and

if int(supp ) 6=;.Assume further that A

0

is the positive Harris recurrent atom and let

0 2 A

0

.Let be the initial distribution of M

0

.Assume for some s 3 the moment

conditions

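$$E_{\nu}(T_0^{\,s-2}) < \infty, \qquad E_{\nu}\Big(\Big(\sum_{j=0}^{T_0} |g(M_j)|\Big)^{s-2}\Big) < \infty, \qquad (6.4)$$

$$E_0(\tau_1^{\,s}) < \infty, \qquad E_0\Big(\Big(\sum_{j=1}^{T_1} |g(M_j)|\Big)^{s}\Big) < \infty, \qquad (6.5)$$

and assume that under $P$ the covariance of $(Z_1,\tau_1)$ is positive definite and that $(Z_1,\tau_1)$ satisfies the uniform Cramér condition (6.2). Then

$$P\Big(\frac{1}{\sqrt{n}}\sum_{j=1}^{n}\{g(M_j) - \pi(g)\} \in B\Big) = \int_B \varphi_{\tilde\Sigma}(x)\sum_{r=0}^{s-3} n^{-r/2}q_r(x)\,dx + O(n^{-(s-2)/2})$$

uniformly for $B \in \mathcal{B}_c$. Here $\varphi_{\tilde\Sigma}$ is the density of the normal distribution with mean zero and covariance $\tilde\Sigma$, $q_r$ is a polynomial in $x$, and

$$\tilde\Sigma = E_{\pi}\{g(M_1) - \pi(g)\}'\{g(M_0) - \pi(g)\} + C,$$

where

$$C = \sum_{n=2}^{\infty} E_{\pi}\{g(M_1) - \pi(g)\}'\{g(M_n) - \pi(g)\}.$$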

In the second part of this section, we introduce the Edgeworth expansion for strongly stable Markov chains and then apply it to additive functionals of IRFs. As in (5.3), the Markov chain $\{X_n, n \ge 0\}$ is geometrically mixing in the sense that there exist $\gamma > 0$ and $0 < \rho < 1$ such that for all $x \in X$, $k \ge 0$ and $n \ge 1$, and for all real-valued measurable functions $g, h$ with $g, h \in N$,

$$\|E_x\{g(X_k)h(X_{k+n})\} - \{E_x g(X_k)\}\{E_x h(X_{k+n})\}\| \le \gamma\rho^{n}. \qquad (6.6)$$

Let $\tilde g, \tilde h$ be real-valued measurable functions on $X \times X$. Since $E_x\tilde h(X_k,X_{k+1}) = E_x h(X_k)$, where $h(z) = E_z\{\tilde h(z,X_1)\}$, the same proof as that of Theorem 2.2 of Kartashov (1996) can be used to show that there exist $\gamma_1 > 0$ and $0 < \rho_1 < 1$ such that for all $x \in X$, $k \ge 0$ and $n \ge 1$, and for all measurable $\tilde g, \tilde h$ with $\sup_y \tilde g^{2}(x,y) \in N$ and $\sup_y \tilde h^{2}(x,y) \in N$,

$$\|E_x\{\tilde g(X_k,X_{k+1})\,\tilde h(X_{k+n},X_{k+n+1})\} - \{E_x g(X_k)\}\{E_x h(X_{k+n})\}\| \le \gamma_1\rho_1^{\,n-1}. \qquad (6.7)$$

To establish asymptotic expansions for Markov random walks, we shall make use of (6.7) in conjunction with the following extension of the conditional Cramér (strongly nonlattice) condition (cf. Fuh and Lai (2003)): There exists $m \ge 1$ such that

$$\limsup_{|\alpha|\to\infty} |E\{\exp(i\alpha' S_m) \mid X_0, X_m\}| < 1. \qquad (6.8)$$

Next, we assume that the following strong mixing condition holds:

$$|E_{\nu}\{\tilde g(X_k,X_{k+1})\,\tilde h(X_{k+n},X_{k+n+1})\} - \{E_{\nu} g(X_k)\}\{E_{\nu} h(X_{k+n})\}| \le \gamma_1\rho_1^{\,n-1}. \qquad (6.9)$$

Remarks: 1. When the norm is the weighted variation norm, we do not need the strong mixing condition (6.9); cf. page 653 of Fuh and Lai (2001).

2. In the special case where $S_1$ is independent of $(X_0,X_1)$, so that the kernel $P(x, A \times B)$ in (1.1) can be factorized as $P_1(x,A)P_2(B)$, (6.8) reduces to the condition that the random variable $S_1$ is strongly nonlattice:

$$\limsup_{|\alpha|\to\infty} \Big| \int \exp(i\alpha' s)\,P_2(ds) \Big| < 1.$$

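In addition to (6.8) and (6.9), we shall assume that C1-C4 in Section 5 hold for some integer $r \ge 3$. Let

$$\mu = \int E_x S_1\,\pi(dx) \quad (= \lambda'(0)), \qquad (6.10)$$

and let $V = (\partial^{2}\lambda(\alpha)/\partial\alpha_i\partial\alpha_j|_{\alpha=0})_{1\le i,j\le d}$ be the Hessian matrix of $\lambda$ at 0. By Theorem 5.1,

$$\lim_{n\to\infty} n^{-1}E_{\nu}\{(S_n - n\mu)(S_n - n\mu)'\} = V, \qquad (6.11)$$

where $'$ denotes the transpose.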


Let $\psi_n(\alpha) = E_{\nu}(e^{i\alpha' S_n})$ and let $h_1 \in N$ be the constant function $h_1 \equiv 1$. Then by Proposition 1 and the fact that $\nu_{\alpha}Q_{\alpha}h_1$ has continuous partial derivatives of order $r - 2$ in some neighborhood of $\alpha = 0$, we have the Taylor series expansion of $\psi_n(\alpha/\sqrt{n})$ for $|\alpha/\sqrt{n}| \le \varepsilon$ (some sufficiently small positive number):

$$\psi_n(\alpha/\sqrt{n}) = \Big\{1 + \sum_{j=1}^{r-2} n^{-j/2}\,\tilde\pi_j(i\alpha)\Big\}e^{-\alpha' V\alpha/2} + o(n^{-(r-2)/2}), \qquad (6.12)$$

where $\tilde\pi_j(i\alpha)$ is a polynomial in $i\alpha$ of degree $3j$, whose coefficients are smooth functions of the partial derivatives of $\lambda(\alpha)$ at $\alpha = 0$ up to the order $j + 2$ and those of $\nu_{\alpha}Q_{\alpha}h_1$ at $\alpha = 0$ up to the order $j$. Letting $D$ denote the $d \times 1$ vector whose $k$th component is the partial differentiation operator $D_k$ with respect to the $k$th coordinate, define the differential operator $\tilde\pi_j(-D)$. As in the case of sums of i.i.d. zero-mean random vectors (cf. Bhattacharya and Rao, 1976), we obtain an Edgeworth expansion for the "formal density" of the distribution of $S_n$ by replacing the $\tilde\pi_j(i\alpha)$ and $e^{-\alpha' V\alpha/2}$ in (6.12) by $\tilde\pi_j(-D)$ and $\phi_V(y)$, respectively, where $\phi_V$ is the density function of the $d$-variate normal distribution with mean 0 and covariance matrix $V$. The following two theorems are taken from Fuh and Lai (2003).

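Theorem 6.2. Let $r \ge 3$ be an integer. Assume C1-C4, (6.8) and (6.9) hold (or C1-C4 and (6.8) hold in the case of $w$-uniform ergodicity). Let $\phi_{j,V} = \tilde\pi_j(-D)\phi_V$ for $j = 1,\dots,r-2$. For $0 < \beta \le 1$ and $c > 0$, let $\mathcal{B}_{\beta,c}$ be the class of all Borel subsets $B$ of $R^{d}$ such that $\int_{(\partial B)^{\varepsilon}} \phi_V(y)\,dy \le c\varepsilon^{\beta}$ for every $\varepsilon > 0$, where $\partial B$ denotes the boundary of $B$ and $(\partial B)^{\varepsilon}$ denotes its $\varepsilon$-neighborhood. Then

$$\sup_{B\in\mathcal{B}_{\beta,c}} \Big| P_{\nu}\{(S_n - n\mu)/\sqrt{n} \in B\} - \int_B \Big\{\phi_V(y) + \sum_{j=1}^{r-2} n^{-j/2}\phi_{j,V}(y)\Big\}dy \Big| = o(n^{-(r-2)/2}). \qquad (6.13)$$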

Next, we apply Theorem 6.2 to the case of iterated random functions. The following theorem proves that $M_n$ is a strongly stable Markov chain under some moment conditions.

Theorem 6.3. Given an IRF $(M_n)_{n\ge 0}$ of i.i.d. Lipschitz maps, suppose for some $p > 0$,

$$E\log^{+}L_1 < 0, \quad EL_1^{p} < \infty \quad \text{and} \quad E\,d(F_1(x_0),x_0)^{p} < \infty \qquad (6.14)$$

for some $x_0 \in X$, and suppose that the assumptions of Theorem 6.1 hold. Then $(M_n)_{n\ge 0}$ is ergodic with stationary distribution $\pi$ and uniformly ergodic with respect to the norm

$$\|h\|_{wl} := \|h\|_{w} + \|h\|_{BL} := \sup_{x\in X}\frac{|h(x)|}{1 + d(x_0,x)^{p}} + \sup_{x\ne y}\frac{|h(x) - h(y)|}{d(x,y)^{q}}, \qquad (6.15)$$

for $q \in (0,p)$ and some $x_0 \in X$. Here, $wl$ represents a combination of the weighted variation norm with $w(x) = 1 + d(x_0,x)^{p}$ and the bounded Lipschitz norm. Furthermore, there exist $\gamma > 0$ and $0 < \rho_q < 1$ such that

$$\|P^{n} - Q\|_{wl} = \sup_{\|h\|=1} \|P^{n}h - Qh\|_{wl} \le \gamma\rho_q^{\,n}, \qquad (6.16)$$

where $P$, $Q$ are defined as in (5.4)-(5.6).

Under the negative Liapunov exponent assumption $E\log^{+}L_1 < 0$ and the moment conditions $EL_1^{2} < \infty$, $E\,d(F_1(x_0),x_0)^{2} < \infty$ for some $x_0 \in X$, Benda (1998) and Wu and Woodroofe (2000) proved the central limit theorem for $S_n(g)/\sqrt{n} := \sum_{t=1}^{n} g(M_t)/\sqrt{n}$ in iterated random functions. In this section, we study the asymptotic expansion for $S_n(g)/\sqrt{n}$ for a given function $g$. Note that the method used in Benda (1998) and Wu and Woodroofe (2000) is based on the Poisson equation, and no irreducibility assumption is needed in their argument. Here, we apply Theorem 4.1 for aperiodic, irreducible and uniformly ergodic (with respect to the $\|\cdot\|_{wl}$ norm) Markov chains that can be constructed as iterated random functions.

7 Renewal theorems

In this section, we summarize the results from Fuh and Lai (2001) to state $d$-dimensional renewal theorems, with an estimate of the rate of convergence, for the Markov random walks induced by iterated random functions. Although the norm considered in Fuh and Lai (2001) is the weighted variation norm (5.1), the spectral theory from Section 5 can be used to generalize them to a general norm without any difficulty.

Let $\{(X_n,S_n), n \ge 0\}$ be the Markov random walk considered in Section 5. In the one-dimensional case, let $g: X \times R \to R$. The classical Markov renewal theorem states that, under certain regularity conditions,

$$E_{\nu}\Big(\sum_{k=0}^{\infty} g(X_k, b - S_k)\Big) \to \int_X\int_R g(x,s)\,ds\,d\pi(x) \Big/ \int_X E_x\xi_1\,d\pi(x) \qquad (7.1)$$

as $b \to \infty$. In Theorem 7.4 we establish rates of convergence for (7.1), generalizing Stone's (1965) results in the i.i.d. case, while in Theorems 7.1-7.3 we extend the results to multidimensional Markov renewal theory (for the case $d > 1$) with convergence rates, where the Markov random walks are induced by iterated random functions. Our approach uses the Fourier transform of the Markov transition operator and Schwartz's theory of distributions, developed in Section 5.


When the increments $\xi_n$ are i.i.d. and strongly nonlattice, Stone (1965) and Carlsson (1983) derived the rate of convergence of the renewal measure to its limit under moment conditions on $\xi_n$ in the case $d = 1$, while Carlsson and Wainger (1982) and Keener (1990) developed asymptotic expansions of the renewal measure in the case $d > 1$. To generalize these results to Markov random walks, we first recall the conditional Cramér condition (corresponding to conditionally strongly nonlattice random vectors) defined in (6.8): There exists $m \ge 1$ such that

$$\limsup_{|\alpha|\to\infty} |E\{\exp(i\alpha' S_m) \mid X_0, X_m\}| < 1. \qquad (7.2)$$

Here and in the sequel, we use column vectors to denote $\alpha \in R^{d}$, $\alpha'$ to denote the transpose of $\alpha$, and $|\alpha|$ to denote its Euclidean norm $(\alpha'\alpha)^{1/2}$.

Let = E

1

and V = lim

n!1

n

1

E

f(S

n

n)(S

n

n)

0

g,which are well dened

under C2 and C3.Let S

n;j

(or

n;j

,

j

,

j

) denote the jth component of the d-dimensional

vector S

n

(or

n

,,).Suppose

1

> 0.Without loss of generality,it will be assumed that

V is positive denite (i.e.,

n

is strictly d-dimensional under ),because otherwise we can

consider a lower-dimensional subspace instead.In the case d > 1 dene

= E

f(

n;2

=

1

; ;

n;d

=

1

)

0

g;

~

V = ( ;I

d1

)V

0

I

d1

;(7.3)

where I

k

is the k k identity matrix.Note that

~

V is the asymptotic covariance matrix

(under P

) of f(S

n;2

; ;S

n;d

)

0

S

n;1

g=

p

n.For s 2 R

d

,dene ~s = (s

2

; ;s

d

)

0

s

1

.

First consider the case of i.i.d.

n

,with d > 1 and S

0

= 0.The renewal measure is

dened by U(B) =

P

1

n=0

PfS

n

2 Bg;and multivariate renewal theory is concerned with

approximating U(s + ) by

k

(s + ) as s

1

!1,where

k

is a -nite measure on R

d

whose density function (i.e.,Radon-Nikodym derivative) with respect to Lebesgue measure

is of the form

k

(s) =

1

1

p

det

~

V

(

1

2s

1

)

(d1)=2

e

1

~s

0 ~

V

1

~s=2s

1

f1 +

k

X

j=1

s

j=2

1

!

j

(~s=

p

s

1

)g (7.4)

for s

1

> 0,and

k

(s) = 0 for s

1

0,where!

j

(u) =

P

n

j

l=0

q

l

(u) and q

l

(u) is a polynomial of

degree l in u whose coecients are associated with the Taylor expansion of (1 Ee

i

0

1

)

1

near = 0.For Markov random walks,the renewal measure involves not only fS

n

g but

also fX

n

g.For A 2 A and B 2 B,dene

U

A

(B) =

1

X

n=0

P

fX

n

2 A;S

n

2 Bg:(7.5)


We can approximate U

A

(s +) by (A)

A;

k

(s +),in which

A;

k

is a -nite measure on

R

d

with density function

A;

k

with respect to Lebesgue measure,where

A;

k

(s) = 0 for

s

1

0 and

A;

k

(s) is given by (7.4) for s

1

> 0,with the coecients of the polynomials

!

1

(~s); ;!

k

(~s) depending also on A and via Taylor's expansion of the Fourier transform

of U

A

near the origin,assuming that C4 hold for some suciently large r (depending on

k).Note that when is degenerate at (x;0),C4 follows from C2.The precise denition of

!

j

is given in Section 4.1 of Fuh and Lai (2001),where they also proved the following mul-

tidimensional Markov renewal theorem with bounds on the remainders in approximating

U

A

(s +) by (A)

A;

k

(s +) as s

1

!1,recalling the assumption

1

> 0.

Theorem 7.1.Let k 1 and let f(M

n

;S

n

);n 0g be a strongly nonlattice Markov

random walk satisfying C1-C4 for some r > k +5 +maxf1;(d 1)=2g.Let A 2 A and B

be a d-dimensional rectangle

Q

d

j=1

[

j

;

j

].Then as s

1

!1,

U

A

s +

0

s

1

+B

= (A)

A;

k

s +

0

s

1

+B

+o(s

(d1+k)=2

1

)

uniformly in ~s.

Theorem 7.2.Let f(M

n

;S

n

);n 0g be a strongly nonlattice Markov random walk satis-

fying C1-C4 for some r > 3.Let h > 0 and > 0.Let B

be the class of all Borel subsets

of R

d1

such that

R

(@B)

"

exp(jyj

2

=2)dy = O("

) as"#0,where @B denotes the boundary

of B and (@B)

"

denotes its"-neighborhood.Then as s

1

!1,

U

A

([s

1

;s

1

+h]

p

s

1

(s

1

+C)) = (A)

A;

1

([s

1

;s

1

+h]

p

s

1

(s

1

+C)) +o(s

(1+)=2

1

)

for every < min(1;r 3);uniformly in A 2 A and C 2 B

.

For"> 0 and f:R

d

!R,dene the oscillation function

f

(s;") = supfjf(s) f(t)j:

js tj "g:Let F

b

be the set of all Borel functions f:R

d

![0;1] such that f(s) = 0

whenever s

1

62 [b;b +h];with xed h > 0.

Theorem 7.3.Let f(M

n

;S

n

);n 0g be a strongly nonlattice Markov random walk sat-

isfying C1-C4 for some r > 3.Let 0 < < min(1;r 3).Then for every > 0,as

b!1,

Z

f(s)dU

A

(s) = (A)

Z

f(s)d

A;

1

(s) +O

Z

f

(s;b

)d

A;

1

(s)

+o(b

(1+)=2

)

uniformly in f 2 F

b

and A 2 A.


In the case $d = 1$, $V$ is a scalar, which will be denoted by $\sigma^{2}$. The following theorem provides bounds on the difference between $U_A([b, b+h])$ and its renewal-theoretic approximation as $b \to \infty$.

Theorem 7.4. Suppose $d = 1$ and $\{(M_n,S_n), n \ge 0\}$ is a strongly nonlattice Markov random walk satisfying C1-C4 for some $r \ge 2$. Then as $b \to \infty$,

$$U_A([b, b+h]) = \pi(A)\,h/\mu + o(b^{-(r-1)})$$

uniformly in $A \in \mathcal{A}$.
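As a sanity check on the leading term $h/\mu$ in the one-dimensional renewal approximation, the following Monte Carlo sketch estimates the renewal measure of a window $[b, b+h]$. It is only an illustration in the simplest i.i.d. special case (a trivial driving chain, so $\pi(A) = 1$), with exponential increments of mean $\mu = 1$. The function names, the increment law and the parameter values are assumptions, not part of the paper.

```python
import random


def renewals_in_window(b, h, rng, draw_increment):
    """Count the indices n >= 0 with S_n in [b, b + h] along one path (S_0 = 0)."""
    s, count = 0.0, 0
    while s <= b + h:
        if s >= b:
            count += 1
        s += draw_increment(rng)
    return count


def estimate_renewal_measure(b, h, runs, seed=0):
    """Monte Carlo estimate of U([b, b + h]) = E #{n : S_n in [b, b + h]}."""
    rng = random.Random(seed)

    def draw_increment(r):
        # Positive increments with mean mu = 1 (an illustrative assumption).
        return r.expovariate(1.0)

    total = sum(renewals_in_window(b, h, rng, draw_increment) for _ in range(runs))
    return total / runs


estimate = estimate_renewal_measure(b=20.0, h=2.0, runs=20000)
```

With these choices the renewal process is a unit-rate Poisson process, so the estimate should be close to $h/\mu = 2$. Replacing `draw_increment` by increments driven by a Markov chain would give a rough numerical check of the Markov version.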

Given an IRF $(M_n)_{n\ge 0}$ of i.i.d. Lipschitz maps, suppose (2.13) and (2.6) hold for some $p > 0$. Then $(M_n)_{n\ge 0}$ satisfies assumptions C1-C4 in Section 5. For $j = 1,\dots,d$, let $g_j \in L_0^{2}(\pi)$ be a square integrable function with mean 0, i.e.

$$\int_X g_j\,d\pi = 0 \ \text{ and } \ \|g_j\|_2^{2} = \int_X g_j^{2}\,d\pi < \infty. \qquad (7.6)$$

Denote $g = (g_1,\dots,g_d)$ and consider the sequence

$$S_n := S_n(g) := g(M_1) + \cdots + g(M_n), \quad n \ge 1, \qquad (7.7)$$

which may be viewed as a Markov random walk with driving chain $(M_n)_{n\ge 0}$.

In order to apply the spectral theory for (M

n

)

n0

,we put the following irreducible

condition.We shall say that the Markovian kernel P(x;dy) is irreducible if the condition

P

h = e

i

h ( 2 Rand h 2 N) implies that e

i

= 1 and h is a constant.By using Theorem

6.3,we can apply Theorems 7.1-7.4 to have the conclusion.

8 Examples and Applications

In the following, $F$ always denotes a generic copy of $F_1, F_2, \dots$ and $\lambda$ denotes Lebesgue measure on $R$ (or some subset). Examples 1, 2 and 4 are similar to those in Alsmeyer (2003), and Example 3 is taken from Fuh (2003). Examples 1 and 2 show irreducibility from the density point of view, while Example 3 is from the positivity point of view.

Example 1. This is the motivating example in Diaconis and Freedman (1999), see Section 2.1 there. Let $X := [0,1]$, $\phi_u(x) := ux$, $\psi_u(x) := x + u(1 - x)$ for $u \in [0,1]$, and

$$F(x) = Z\phi_U(x) + (1 - Z)\psi_U(x) \qquad (8.1)$$

for independent random variables $U, Z$ with a uniform distribution on $[0,1]$ and a Bernoulli(1/2) distribution, respectively. It is not difficult to verify that $(M_n)_{n\ge 0}$ satisfies the assumptions of Theorem 4.1 and has stationary distribution $\pi = \mathrm{Beta}(1/2,1/2)$ with Lebesgue density $f(x) = \frac{1}{\pi\sqrt{x(1-x)}}$ on $(0,1)$, also called the arcsine distribution. (Plainly, $\pi$ in the denominator of $f$ means the constant 3.14...) Now observe that $P(x,\cdot)$ is a mixture of a uniform distribution on $[0,x]$ and a uniform distribution on $[x,1]$. So it possesses a $\lambda$-continuous component for each $x \in [0,1]$. Theorem 4.1 therefore implies the Harris recurrence of $(M_n)_{n\ge 0}$ on $H = X = [0,1]$. The conclusion remains true in the biased case where $Z$ has a Bernoulli($p$) distribution for some $p \ne 1/2$. Since a move toward 0 is then made with probability $p$, the stationary mean is $q := 1 - p$, and the stationary distribution is a Beta($q,p$) distribution with Lebesgue density $\frac{\Gamma(p+q)}{\Gamma(p)\Gamma(q)}x^{q-1}(1 - x)^{p-1}$ on $(0,1)$, where $\Gamma$ is the gamma function.
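A quick simulation can make the stationary law of (8.1) tangible. The sketch below iterates the random map and compares empirical moments with those of the arcsine law (mean 1/2, variance 1/8). The sample size, burn-in, seed and helper names are illustrative assumptions. The parameter `p_left` is the probability of the move $x \mapsto Ux$, and taking expectations in (8.1) shows that the stationary mean $m$ solves $m = m/2 + (1 - p_{\text{left}})/2$, that is, $m = 1 - p_{\text{left}}$.

```python
import random


def step(x, rng, p_left=0.5):
    """One application of the random map F in (8.1)."""
    u = rng.random()
    if rng.random() < p_left:      # Z = 1: move toward 0
        return u * x
    return x + u * (1.0 - x)       # Z = 0: move toward 1


def simulate(n, p_left=0.5, x0=0.3, burn_in=1000, seed=0):
    """Return n successive states of the chain after a burn-in period."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for k in range(n + burn_in):
        x = step(x, rng, p_left)
        if k >= burn_in:
            samples.append(x)
    return samples


def mean_and_variance(samples):
    m = sum(samples) / len(samples)
    v = sum((s - m) ** 2 for s in samples) / len(samples)
    return m, v


mean_fair, var_fair = mean_and_variance(simulate(100000))
mean_biased, _ = mean_and_variance(simulate(100000, p_left=0.3))
```

The samples are correlated, so this is a consistency check on the moments rather than a test of the full distribution.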

The Beta-Walk is a generalization of (8.1), obtained by replacing the uniform variable $U$ in (8.1) by a Beta($\alpha,\alpha$) variable $V_{\alpha}$, $\alpha \in [0,\infty]$. Here $\mathrm{Beta}(0,0) := \frac{1}{2}(\delta_0 + \delta_1)$ and $\mathrm{Beta}(\infty,\infty) := \delta_{1/2}$. (8.1) is recovered when $\alpha = 1$. As one can easily see with Theorem 4.1, $(M_n)_{n\ge 0}$ is a positive Harris chain on $X = [0,1]$ for $\alpha \in (0,\infty)$, but is not for $\alpha = 0$. Diaconis and Freedman [5, Theorem 6.1] show that $\pi$ equals $\mathrm{Beta}(\frac{\alpha}{\alpha+1},\frac{\alpha}{\alpha+1})$ for $\alpha \in \{0,1,\infty\}$, but differs from it otherwise, although sharing the first three moments. Except for the case $\alpha = 0$, where $\pi = \frac{1}{2}(\delta_0 + \delta_1)$, $\pi$ is further absolutely continuous with therefore nonempty $\mathrm{int}(\mathrm{supp}\,\pi)$. Since $X$ is compact, condition (2.6) with $x_0 = 0$ holds for every $p > 0$, whence Theorem 4.3 implies geometric ergodicity of the chain for every $\alpha \in (0,\infty)$. If $\alpha = 0$, the same conclusion follows by observing that, starting from any $x \in [0,1]$, it takes a geometric time to enter the absorbing closed Harris set $\mathrm{supp}\,\pi = \{0,1\}$ and that Theorem 4.3 gives geometric ergodicity on that set.

Example 2. Let us next take a look at matrix recursions, which have been studied by many authors; see Section 2.2 in Diaconis and Freedman (1999) and the references given there. The defining equation is

$$M_n = A_nM_{n-1} + B_n, \quad n \ge 1 \qquad (8.2)$$

on $X = R^{m}$ for some $m \ge 1$, where $(A_1,B_1), (A_2,B_2), \dots$ are i.i.d.; $A_n$ is an $m \times m$ matrix and $B_n$ an $m \times 1$ vector. So the associated random Lipschitz map is

$$F(x) = Ax + B \qquad (8.3)$$

with $(A,B)$ being a generic copy of $(A_1,B_1)$. Let $\|\cdot\|$ be any norm on $R^{m}$, define $\|A\| := \sup\{\|Ax\| ;\ x \in R^{m}, \|x\| \le 1\}$ for $m \times m$ matrices $A$, and suppose that

$$E\log^{+}\|A\| < \infty \ \text{ and } \ E\log^{+}\|B\| < \infty.$$

Suppose further an a.s. negative Liapunov exponent $l^{*}$, here given by

$$l^{*} = \inf\{n^{-1}E\log\|A_1\cdots A_n\| ;\ n \ge 1\}.$$


Then the conditions of Theorem 2.1 are satisfied (with $x_0 = 0$), whence, by Theorem 2.1, $M_n$ possesses a unique stationary distribution $\pi$, which is the distribution of any solution $M_{\infty}$ of the stochastic fixed point equation $M_{\infty} \stackrel{d}{=} AM_{\infty} + B$, where $(A,B)$ and $M_{\infty}$ are independent. As one can easily see, we may take

$$M_{\infty} = \sum_{n\ge 1}\Big(\prod_{k=1}^{n-1} A_k\Big)B_n. \qquad (8.4)$$
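The series (8.4) is the almost sure limit of the backward iterations $F_1 \circ \cdots \circ F_n(x)$, whatever the starting point $x$. The following sketch checks this numerically in the scalar case $m = 1$. The laws of $A$ and $B$ (uniform on $[0, 0.9]$ and standard normal), the seed and the function names are assumptions chosen only to give a contractive example.

```python
import random


def draw_maps(n, seed=0):
    """Draw n i.i.d. pairs (A_k, B_k) defining the maps F_k(x) = A_k x + B_k."""
    rng = random.Random(seed)
    return [(rng.uniform(0.0, 0.9), rng.gauss(0.0, 1.0)) for _ in range(n)]


def forward(maps, x):
    """Forward iteration F_n o ... o F_1 (x): apply F_1 first."""
    for a, b in maps:
        x = a * x + b
    return x


def backward(maps, x):
    """Backward iteration F_1 o ... o F_n (x): apply F_n first, F_1 last."""
    for a, b in reversed(maps):
        x = a * x + b
    return x


def series_partial_sum(maps):
    """Partial sum of (8.4): sum over n of (A_1 ... A_{n-1}) B_n."""
    total, product = 0.0, 1.0
    for a, b in maps:
        total += product * b
        product *= a
    return total


maps = draw_maps(200)
limit_estimate = series_partial_sum(maps)
```

Here `backward(maps[:n], x)` equals the n-term partial sum plus `A_1 ... A_n * x`, so it stabilizes as `n` grows and forgets `x`. In contrast, `forward(maps[:n], x)` keeps fluctuating with `n` and converges only in distribution.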

If we now additionally assume that $(A,B)$ is nonsingular with respect to $\lambda^{m\times m} \otimes \lambda^{m}$, then all $P(x,\cdot)$, $x \in R^{m}$, are evidently nonsingular with respect to $\lambda^{m}$, and Theorem 4.1 shows the positive Harris recurrence of $(M_n)_{n\ge 0}$ on the whole of $X = R^{m}$. The same conclusion holds true provided that $A$, $B$ are independent and $B$ is nonsingular with respect to $\lambda^{m}$.

Next, consider the matrix recursion (8.2) with a.s. negative Liapunov exponent and satisfying

(k1) $E\log^{+}\|A\| < \infty$ and $E\log^{+}\|B\| < \infty$.

We recall that its positive Harris recurrence on the whole of $X = R^{m}$ follows if further

(k2) $(A_1,B_1)$ is nonsingular with respect to $\lambda^{m\times m} \otimes \lambda^{m}$,

or

(k2') $A_1$, $B_1$ are independent and $B_1$ is nonsingular with respect to $\lambda^{m}$

holds true. Given any $p > 0$, it is then immediate to conclude the assertion of Theorem 4.2(a) and of 4.2(b) with $w(x) = \|x\|^{p}$, providing additionally

(k3) $E\log^{p+1}(1 + \|A_1\|) < \infty$ and $E\log^{p+1}(1 + \|B_1\|) < \infty$,

respectively

(k4) $E\|A_1\|^{p} < \infty$ and $E\|B_1\|^{p} < \infty$.

Example 3. In this example, we consider the statistical inference problem for hidden Markov models. A hidden Markov model is defined as a parameterized Markov chain in a Markovian random environment (cf. Cogburn, 1980), with the underlying environmental Markov chain viewed as missing data. That is, for each $\theta \in \Theta \subset R^{q}$, the unknown parameter, we consider $X = \{X_n, n \ge 0\}$ as an ergodic (positive recurrent, irreducible and aperiodic) Markov chain on a finite state space $D = \{1,2,\dots,d\}$, with transition probability matrix $P(\theta) = [p_{xy}(\theta)]_{x,y=1,\dots,d}$ and stationary distribution $\pi(\theta) = (\pi_x(\theta))_{x=1,\dots,d}$.

Suppose that an additive component

n

=

P

n

k=0

k

;taking values in R,is adjoined to the

chain such that f(X

n

;

n

);n 0g is a Markov chain on DR and conditioning on the full

X sequence,

n

is a Markov chain with probability

P

()

f

n+1

2 BjX

0

;X

1

; ;

0

;

1

; ;

n

g = P

()

(X

n+1

:

n

;B) a:s:(8.5)


for each n and B 2 B(R);the Borel -algebra of R.Furthermore,we assume the existence

of a transition probability density for the Markov chain f(X

n

;

n

);n 0g with respect to

a -nite measure on R such that

$$P^{(\theta)}\{X_1 \in A, \xi_1 \in B \mid X_0 = x, \xi_0 = s_0\} = \sum_{y \in A}\int_{B} p_{xy}(\theta)\, f(s; \varphi_y(\theta) \mid s_0)\, d\mu(s), \tag{8.6}$$

where $f(\xi_k; \varphi_{X_k}(\theta) \mid \xi_{k-1})$ is the transition probability density of $\xi_k$ given $\xi_{k-1}$ and $X_k$ with respect to $\mu$, $\theta \in \Theta$ is the unknown parameter, and $\varphi_y(\theta)$ is a function defined on the parameter space for each $y = 1, \ldots, d$. Here and in the sequel, we assume that the Markov chain $\{(X_n, \xi_n), n \ge 0\}$ has a stationary probability with probability density $\pi_x(\theta) f(\cdot; \varphi_x(\theta))$ with respect to $\mu$. In this example, we assume that only one parameter is of interest and treat the other parameters as nuisance parameters. That is, for simplicity, we consider $\theta \in \Theta \subset \mathbb{R}$ as a one-dimensional unknown parameter. For convenience of notation, we will use $\pi_x$ for $\pi_x(\theta)$ and $p_{xy}$ for $p_{xy}(\theta)$, respectively, in the sequel. We give a formal definition of a hidden Markov model as follows:

Definition 2 A process $\{\xi_n, n \ge 0\}$ is called a hidden Markov model if there is a Markov chain $\{X_n, n \ge 0\}$ such that the process $\{(X_n, \xi_n), n \ge 0\}$ satisfies (8.5) and (8.6).

Note that if the $\xi_n$ are conditionally independent given the full sequence $X$, then the Markov chain $\{(X_n, \xi_n), n \ge 0\}$ is called a Markov random walk, and $\{\xi_n, n \ge 0\}$ is the classical hidden Markov model.

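Now, let $\xi_0, \xi_1, \ldots, \xi_n$ be the observations from the hidden Markov model $\{\xi_n, n \ge 0\}$ with an unknown parameter $\theta$. Let

$$S_n := \frac{p_n(\xi_0, \xi_1, \ldots, \xi_n; \theta_1)}{p_n(\xi_0, \xi_1, \ldots, \xi_n; \theta_0)} := \frac{\sum_{x_0=1}^{d}\cdots\sum_{x_n=1}^{d} \pi_{x_0}(\theta_1)\, f(\xi_0; \varphi_{x_0}(\theta_1)) \prod_{k=1}^{n} p_{x_{k-1}x_k}(\theta_1)\, f(\xi_k; \varphi_{x_k}(\theta_1) \mid \xi_{k-1})}{\sum_{x_0=1}^{d}\cdots\sum_{x_n=1}^{d} \pi_{x_0}(\theta_0)\, f(\xi_0; \varphi_{x_0}(\theta_0)) \prod_{k=1}^{n} p_{x_{k-1}x_k}(\theta_0)\, f(\xi_k; \varphi_{x_k}(\theta_0) \mid \xi_{k-1})} \tag{8.7}$$

for fixed $\theta_0, \theta_1 \in \Theta$.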

Let $\theta_0 \in \Theta^{0}$ (the interior of $\Theta$) and consider the problem of testing the hypothesis $\theta \le \theta_0$. Given $\theta_1 > \theta_0$, we can construct a sequential probability ratio test (SPRT) of $\theta = \theta_0$ versus $\theta = \theta_1$ and use it to test the composite hypothesis $\theta \le \theta_0$. The sequential probability ratio test of $\theta = \theta_0$ versus $\theta = \theta_1$ stops sampling at stage

$$T := \inf\{n : \log S_n \le a \ \text{or}\ \log S_n \ge b\} \tag{8.8}$$

for $a \le 0 < b$, and accepts the null hypothesis $\theta = \theta_0$ (or the alternative hypothesis $\theta = \theta_1$) according to whether $\log S_T \le a$ (or $\log S_T \ge b$). When it is regarded as a test of $\theta \le \theta_0$, the SPRT rejects $\theta \le \theta_0$ if and only if $\log S_T \ge b$. The problem of interest here is to approximate the type I error $\alpha = P^{(\theta_0)}\{\log S_T \ge b\}$, the type II error $\beta = P^{(\theta_1)}\{\log S_T \le a\}$ and the expected sample sizes $E^{(\theta_0)}T$ ($E^{(\theta_1)}T$) of the test, where $P^{(\theta)}$ ($E^{(\theta)}$) refers to the probability (expectation) with initial distribution given by the stationary distribution $\pi_x(\theta) f(\cdot; \varphi_x(\theta))$.
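
The stopping rule (8.8) is simple to state in code. The following Python sketch is a minimal illustration, assuming that the values $\log S_0, \log S_1, \ldots$ have already been computed and are supplied as a sequence; the function name `sprt_stop` and the returned labels are hypothetical.

```python
from typing import Iterable, Optional, Tuple

def sprt_stop(log_lr: Iterable[float], a: float, b: float) -> Tuple[Optional[int], str]:
    """Return (T, decision) for the rule T = inf{n : log S_n <= a or log S_n >= b}.

    log_lr yields log S_0, log S_1, ...; thresholds must satisfy a <= 0 < b.
    If neither boundary is crossed within the supplied data, T is None.
    """
    if not (a <= 0 < b):
        raise ValueError("thresholds must satisfy a <= 0 < b")
    for n, value in enumerate(log_lr):
        if value <= a:
            return n, "accept theta0"    # lower boundary crossed first
        if value >= b:
            return n, "accept theta1"    # upper boundary crossed first
    return None, "undecided"

upper = sprt_stop([0.1, -0.3, 0.9, 2.5, -5.0], a=-2.0, b=2.0)
lower = sprt_stop([0.5, -1.0, -2.0], a=-2.0, b=2.0)
print(upper, lower)
```

The error probabilities $\alpha$ and $\beta$ discussed above could then be approximated by Monte Carlo, repeating this rule over independent simulated paths under $\theta_0$ and $\theta_1$.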

To analyze the likelihood ratio (8.7), we have the following likelihood representation via products of random matrices. Given a column vector $u = (u_1, \ldots, u_d)^{t} \in \mathbb{R}^{d}$, where $t$ denotes the transpose of the underlying vector in $\mathbb{R}^{d}$, define the $L_1$-norm of $u$ as $\|u\| = \sum_{i=1}^{d} |u_i|$. The likelihood ratio (8.7) then can be represented as

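$$S_n = \frac{p_n(\xi_0, \xi_1, \ldots, \xi_n; \theta_1)}{p_n(\xi_0, \xi_1, \ldots, \xi_n; \theta_0)} = \frac{\|M_n(\theta_1)\cdots M_1(\theta_1)M_0(\theta_1)\pi(\theta_1)\|}{\|M_n(\theta_0)\cdots M_1(\theta_0)M_0(\theta_0)\pi(\theta_0)\|}, \tag{8.9}$$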

where

$$M_0 = M_0(\theta) = \begin{bmatrix} f(\xi_0; \varphi_1(\theta)) & 0 & \cdots & 0 \\ \vdots & \ddots & & \vdots \\ 0 & \cdots & 0 & f(\xi_0; \varphi_d(\theta)) \end{bmatrix}, \tag{8.10}$$

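$$M_k = M_k(\theta) = \begin{bmatrix} p_{11}(\theta) f(\xi_k; \varphi_1(\theta) \mid \xi_{k-1}) & \cdots & p_{d1}(\theta) f(\xi_k; \varphi_1(\theta) \mid \xi_{k-1}) \\ \vdots & \ddots & \vdots \\ p_{1d}(\theta) f(\xi_k; \varphi_d(\theta) \mid \xi_{k-1}) & \cdots & p_{dd}(\theta) f(\xi_k; \varphi_d(\theta) \mid \xi_{k-1}) \end{bmatrix} \tag{8.11}$$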

for $k = 1, \ldots, n$, and

$$\pi(\theta) = \big(\pi_1(\theta), \ldots, \pi_d(\theta)\big)^{t}. \tag{8.12}$$

Note that each component $p_{xy} f(\xi_k; \varphi_y(\theta) \mid \xi_{k-1})$ in $M_k$ corresponds to the transition $X_{k-1} = x$, $X_k = y$, and $\xi_k$ is a Markov chain with transition probability density $f(\xi_k; \varphi_y(\theta) \mid \xi_{k-1})$, for $k = 1, \ldots, n$; therefore the $M_k$ are random matrices. Since $\{(X_n, \xi_n), n \ge 0\}$ is a Markov chain by definitions (8.5) and (8.6), this implies that $\{M_k, k = 1, \ldots, n\}$ is a sequence of Markov random matrices. Hence, $S_n$ is the ratio of the $L_1$-norms of products of Markov random matrices via representation (8.9). Note that $\pi$ is fixed in (8.9).
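
As a numerical check of representation (8.9), the following Python sketch compares the matrix-product form with the direct sum over all hidden paths in (8.7). It is an illustration only, under assumptions that are ours and not from Fuh (2003): a classical hidden Markov model with $d = 2$ states (so the density does not depend on $\xi_{k-1}$), unit-variance normal densities, hypothetical means $\varphi_1(\theta) = 0$ and $\varphi_2(\theta) = \theta$, and a transition matrix that does not depend on $\theta$.

```python
import itertools
import math

def normal_pdf(s, mean):
    """Unit-variance normal density; stands in for f(s; phi_y(theta))."""
    return math.exp(-0.5 * (s - mean) ** 2) / math.sqrt(2.0 * math.pi)

def likelihood_matrix_product(obs, P, pi, means):
    """L1-norm of M_n ... M_1 M_0 pi, via repeated matrix-vector products."""
    d = len(pi)
    # M_0 pi: component x equals pi_x * f(xi_0; phi_x).
    u = [pi[x] * normal_pdf(obs[0], means[x]) for x in range(d)]
    for s in obs[1:]:
        # (M_k u)_y = sum_x p_xy * f(xi_k; phi_y) * u_x, as in (8.11).
        u = [normal_pdf(s, means[y]) * sum(P[x][y] * u[x] for x in range(d))
             for y in range(d)]
    return sum(u)  # all entries are nonnegative, so this is the L1-norm

def likelihood_path_sum(obs, P, pi, means):
    """Direct evaluation of the sum over all hidden paths x_0, ..., x_n in (8.7)."""
    d, n = len(pi), len(obs) - 1
    total = 0.0
    for path in itertools.product(range(d), repeat=n + 1):
        term = pi[path[0]] * normal_pdf(obs[0], means[path[0]])
        for k in range(1, n + 1):
            term *= P[path[k - 1]][path[k]] * normal_pdf(obs[k], means[path[k]])
        total += term
    return total

def phi(theta):
    """Hypothetical parameterization: state means (0, theta)."""
    return [0.0, theta]

P = [[0.9, 0.1], [0.2, 0.8]]          # hypothetical transition matrix
pi = [2.0 / 3.0, 1.0 / 3.0]           # its stationary distribution
obs = [0.3, 1.1, -0.4, 2.0, 0.7]      # made-up observations xi_0, ..., xi_4
theta0, theta1 = 1.0, 2.0

num = likelihood_matrix_product(obs, P, pi, phi(theta1))
den = likelihood_matrix_product(obs, P, pi, phi(theta0))
log_S_n = math.log(num / den)
print("log S_n =", log_S_n)
```

The path sum has $d^{n+1}$ terms, whereas the matrix product needs only $n$ matrix-vector multiplications. For long observation sequences the vector would need to be renormalized at each step to avoid underflow, which this sketch omits.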

Note that we consider i.i.d. random Lipschitz maps in this paper. Although the example shows a finite-dimensional (random matrix) linear iteration driven by a Markov chain, the results developed in our setting can still be applied. The irreducibility property comes from the positivity of the densities defined in (8.10) and (8.11). The reader is referred to Fuh (2003) for formal definitions and details.

Example 4. Let us now look at an example, in fact a one-dimensional special case of Example 2 ($A = (a)$ and $B_n = \xi_n$), with a negative answer as to Harris recurrence. Put $f_0(x) \equiv ax - 1$, $f_1(x) \equiv ax + 1$ for $x \in \mathbb{R}$ and some $a \in (0,1)$, and consider

$$F(x) = f_{\theta}(x) \tag{8.13}$$

where $\theta$ is 0 or 1 with probability $1/2$ each. The associated IRF $(M_n)_{n\ge 0}$ with state space $\mathbb{X} = \mathbb{R}$ thus satisfies the recursive equation

$$M_n = aM_{n-1} + \xi_n, \quad n \ge 1 \tag{8.14}$$

where $\xi_1, \xi_2, \ldots$ are independent Bernoulli($1/2$) variables taking the values $\pm 1$. Its unique stationary distribution $\pi$ is the distribution of the infinite series $\sum_{n\ge 1} a^{n-1}\xi_n$. It is known that $\pi$ is continuous for every $a \in (0,1)$, singular for $a \in (0,1/2) \cup N$, where $N \subset (1/2,1)$ is a nonempty $\lambda$-null set, and absolutely continuous otherwise. If $a = 1/2$, $\pi$ is the uniform distribution on $[-2,2]$.
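
The case $a = 1/2$ is easy to check by simulation. The following Python sketch is an illustration under our own choices of truncation level, sample size and seed: it draws from the truncated series $\sum_{n=1}^{N} a^{n-1}\xi_n$ with $\xi_n = \pm 1$ and also runs the forward recursion (8.14) from an arbitrary starting point. For the uniform distribution on $[-2,2]$ one expects mean $0$ and variance $1/(1-a^{2}) = 4/3$.

```python
import random

def stationary_sample(a, n_terms, rng):
    """Truncated series sum_{n=1}^{n_terms} a^(n-1) * xi_n with xi_n = +1 or -1."""
    total, weight = 0.0, 1.0
    for _ in range(n_terms):
        total += weight * rng.choice((-1.0, 1.0))
        weight *= a
    return total

def forward_iterate(a, x0, n_steps, rng):
    """Forward recursion M_n = a * M_{n-1} + xi_n started at M_0 = x0."""
    m = x0
    for _ in range(n_steps):
        m = a * m + rng.choice((-1.0, 1.0))
    return m

rng = random.Random(2003)
a = 0.5
samples = [stationary_sample(a, 40, rng) for _ in range(20000)]
sample_mean = sum(samples) / len(samples)
sample_var = sum((s - sample_mean) ** 2 for s in samples) / len(samples)
m_50 = forward_iterate(a, 10.0, 50, rng)

print("mean:", sample_mean, "variance:", sample_var, "M_50:", m_50)
```

The forward iterate forgets its starting point geometrically fast, since $|M_n| \le a^{n}|x| + (1-a^{n})/(1-a)$, which is why $M_{50}$ started at $x = 10$ already lies in $[-2,2]$ up to rounding.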

We claim that $(M_n)_{n\ge 0}$ is never Harris recurrent. If it were, by Theorem 2.1 of Alsmeyer (2003), we could find a $\pi$-positive set $\mathbb{X}_0$, necessarily uncountable because $\pi$ is continuous, such that the

$$P(x, \cdot) = \tfrac{1}{2}\delta_{ax+1} + \tfrac{1}{2}\delta_{ax-1}, \quad x \in \mathbb{X}_0$$

were dominated by some $\sigma$-finite measure. By a well-known result of Halmos and Savage (1948), we could then find a countable subset $\mathbb{X}_1$ of $\mathbb{X}_0$ such that $(P(x,\cdot))_{x\in\mathbb{X}_0}$ and $(P(x,\cdot))_{x\in\mathbb{X}_1}$ were equivalent, that is, $P(x,N) = 0$ for all $x \in \mathbb{X}_0$ iff $P(x,N) = 0$ for all $x \in \mathbb{X}_1$. On the other hand, given any countable $\mathbb{X}_1 = \{x_n, n \ge 1\}$, the set of $x$ such that $P(x,\cdot)$ is nonsingular with respect to some $P(x_n,\cdot)$ is easily identified as $\mathbb{X}_1 \cup \{x \in \mathbb{X} : x = x_n \pm \tfrac{2}{a} \text{ for some } n\}$, which is again countable. Consequently, the uncountable $\mathbb{X}_0$ contains elements $x$ such that $P(x,\cdot)$ is orthogonal to each $P(x_n,\cdot)$, a contradiction to the equivalence of $(P(x,\cdot))_{x\in\mathbb{X}_0}$ and $(P(x,\cdot))_{x\in\mathbb{X}_1}$.
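
The identification of the nonsingular partners of a given $x_n$ can be verified mechanically. Two measures of the form $\tfrac{1}{2}\delta_{ax+1} + \tfrac{1}{2}\delta_{ax-1}$ are nonsingular exactly when they share an atom. The following Python sketch checks this in exact rational arithmetic on a finite grid; the values of $a$ and $x_n$ and the grid are hypothetical choices made for illustration.

```python
from fractions import Fraction

def support(x, a):
    """The two atoms of P(x, .) = (1/2) delta_{ax+1} + (1/2) delta_{ax-1}."""
    return {a * x + 1, a * x - 1}

def nonsingular(x, y, a):
    """Two-atom kernels are nonsingular iff they have an atom in common."""
    return bool(support(x, a) & support(y, a))

def nonsingular_partners(x_n, a):
    """Claimed description: x = x_n or x = x_n +/- 2/a."""
    return {x_n, x_n + 2 / a, x_n - 2 / a}

a = Fraction(1, 3)
x_n = Fraction(1, 5)
grid = [x_n + Fraction(k, 2) for k in range(-30, 31)]   # contains x_n and x_n +/- 6
found = {x for x in grid if nonsingular(x, x_n, a)}
print(sorted(found))
```

On this grid the points found are exactly $x_n$ and $x_n \pm 2/a$, in agreement with the set displayed in the argument above.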

References

[1] Alsmeyer, G. (1990). Convergence rates in the law of large numbers for martingales. Stoch. Proc. Appl. 36, 181-194.

[2] Alsmeyer, G. (2003). On the Harris recurrence of iterated random Lipschitz functions and related convergence rate results. To appear in J. Theoretical Probab.

[3] Alsmeyer, G. and Fuh, C. D. (2001). Limit theorems for iterated random functions by regenerative methods. Stoch. Proc. Appl. 96, 123-142. Corrigendum (2002), 97, 341-345.

[4] Benda, M. (1998). A central limit theorem for contractive stochastic dynamical systems. J. Appl. Prob. 35, 200-205.

[5] Bhattacharya, R. N. and Ranga Rao, R. (1976). Normal Approximation and Asymptotic Expansions. Krieger, Malabar, FL, 1986 (Revised Reprint).

[6] Carlsson, H. (1983). Remainder term estimates of the renewal function. Ann. Probab. 11, 143-157.

[7] Carlsson, H. and Wainger, S. (1982). An asymptotic series expansion of the multidimensional renewal measure. Comp. Math. 47, 355-364.

[8] Diaconis, P. and Freedman, D. (1999). Iterated random functions. SIAM Review 41, 45-76.

[9] Duflo, M. (1997). Random Iterative Models. Springer-Verlag, New York.

[10] Elton, J. H. (1990). A multiplicative ergodic theorem for Lipschitz maps. Stoch. Proc. Appl. 34, 39-47.

[11] Fuh, C. D. (2003). SPRT and CUSUM in hidden Markov models. To appear in the Ann. Statist. vol 31.

[12] Fuh, C. D. and Lai, T. L. (2001). Asymptotic expansions in multidimensional Markov renewal theory and first passage times for Markov random walks. Adv. Appl. Prob. 33, 652-673.

[13] Fuh, C. D. and Lai, T. L. (2003). Characteristic function and Edgeworth expansions for Markov random walks with applications to bootstrap methods. Working paper.

[14] Fuh, C. D. and Zhang, C. H. (2000). Poisson equation, moment inequalities and quick convergence for Markov random walks. Stoch. Proc. Appl. 87, 53-67.

[15] Halmos, P. and Savage, L. J. (1948). Application of the Radon-Nikodym theorem to the theory of sufficient statistics. Ann. Math. Statist. 20, 225-241.

[16] Hipp, C. (1985). Asymptotic expansions in the central limit theorem for compound and Markov processes. Z. Wahrsch. Verw. Gebiete 69, 361-385.

[17] Jensen, J. L. (1987). A note on asymptotic expansions for Markov chains using operator theory. Adv. Appl. Math. 8, 377-392.

[18] Jensen, J. L. (1989). Asymptotic expansions for strongly mixing Harris recurrent Markov chains. Scand. J. Statist. 16, 47-63.

[19] Kartashov, N. V. (1996). Strong Stable Markov Chains. VSP, Utrecht.

[20] Keener, R. (1990). Asymptotic expansions in multivariate renewal theory. Stoch. Proc. Appl. 34, 137-143.

[21] Malinovskii, V. K. (1987). Limit theorems for Harris-Markov chains, I. Theory Probab. Appl. 31, 269-285.

[22] Meyn, S. P. and Tweedie, R. L. (1993). Markov Chains and Stochastic Stability. Springer-Verlag, New York.

[23] Nagaev, S. V. (1957). Some limit theorems for stationary Markov chains. Theory Probab. Appl. 2, 378-406.

[24] Ney, P. and Nummelin, E. (1987). Markov additive processes I. Eigenvalue properties and limit theorems. Ann. Probab. 15, 561-592.

[25] Riesz, F. and Sz-Nagy, B. (1955). Functional Analysis. Ungar, New York.

[26] Stone, C. (1965). On characteristic functions and renewal theory. Trans. Amer. Math. Soc. 120, 327-342.

[27] Strassen, V. (1967). Almost sure behavior of sums of independent random variables and martingales. Proc. Fifth Berkeley Symp. Math. Statist. and Probability, 315-343.

[28] Wu, W. B. and Woodroofe, M. (2000). A central limit theorem for iterated random functions. J. Appl. Prob. 37, 748-755.
