The Kuhn-Tucker and Envelope Theorems
Peter Ireland
EC720.01 - Math for Economists
Boston College, Department of Economics
Fall 2013
The Kuhn-Tucker and envelope theorems can be used to characterize the solution to
a wide range of constrained optimization problems: static or dynamic, and under perfect
foresight or featuring randomness and uncertainty. In addition, these same two results
provide foundations for the work on the maximum principle and dynamic programming that
we will do later on. For both of these reasons, the Kuhn-Tucker and envelope theorems
provide the starting point for our analysis. Let's consider each in turn, first in fairly general
or abstract settings and then applied to some economic examples.
1 The Kuhn-Tucker Theorem
References:
Dixit, Chapters 2 and 3.
Simon-Blume, Chapters 18 and 19.
Acemoglu, Appendix A.
Consider a simple constrained optimization problem:
$x \in \mathbb{R}$: choice variable
$F : \mathbb{R} \to \mathbb{R}$: objective function, continuously differentiable
$c \geq G(x)$: constraint, with $c \in \mathbb{R}$ and $G : \mathbb{R} \to \mathbb{R}$, also continuously differentiable.
The problem can be stated as:
$$\max_x F(x) \text{ subject to } c \geq G(x)$$
Copyright (c) 2013 by Peter Ireland. Redistribution is permitted for educational and research purposes,
so long as no changes are made. All copies must be provided free of charge and must include this copyright
notice.
This problem is "simple" because it is static and contains no random or stochastic elements
that would force decisions to be made under uncertainty. This problem is also "simple"
because it has a single choice variable and a single constraint. All these simplifications
will make our statement and proof of the Kuhn-Tucker theorem as clean and intuitive
as possible. But the results can be generalized along all of these dimensions and,
throughout the semester, we will work through examples that do so.
Probably the easiest way to solve this problem is via the method of Lagrange multipliers.
The mathematical foundations that allow for the application of this method are given
to us by Lagrange's Theorem or, in its most general form, the Kuhn-Tucker Theorem.
To prove this theorem, begin by defining the Lagrangian:
$$L(x, \lambda) = F(x) + \lambda [c - G(x)]$$
for any $x \in \mathbb{R}$ and $\lambda \in \mathbb{R}$.
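Theorem (Kuhn-Tucker) Suppose that $x^*$ maximizes $F(x)$ subject to $c \geq G(x)$, where
$F$ and $G$ are both continuously differentiable, and suppose that $G'(x^*) \neq 0$. Then
there exists a value $\lambda^*$ of $\lambda$ such that $x^*$ and $\lambda^*$ satisfy the following four conditions:
$$L_1(x^*, \lambda^*) = F'(x^*) - \lambda^* G'(x^*) = 0, \tag{1}$$
$$L_2(x^*, \lambda^*) = c - G(x^*) \geq 0, \tag{2}$$
$$\lambda^* \geq 0, \tag{3}$$
and
$$\lambda^* [c - G(x^*)] = 0. \tag{4}$$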
Proof Consider two possible cases, depending on whether or not the constraint is binding
at $x^*$.
Case 1: Nonbinding Constraint.
If $c > G(x^*)$, then let $\lambda^* = 0$. Clearly, (2)-(4) are satisfied, so it only remains to show
that (1) must hold. With $\lambda^* = 0$, (1) holds if and only if
$$F'(x^*) = 0. \tag{5}$$
We can show that (5) must hold using a proof by contradiction. Suppose that
instead of (5), it turns out that
$$F'(x^*) < 0.$$
Then, by the continuity of $F$ and $G$, there must exist an $\varepsilon > 0$ such that
$$F(x^* - \varepsilon) > F(x^*) \text{ and } c > G(x^* - \varepsilon).$$
But this result contradicts the assumption that $x^*$ maximizes $F(x)$ subject to
$c \geq G(x)$. Similarly, if it turns out that
$$F'(x^*) > 0,$$
then by the continuity of $F$ and $G$ there must exist an $\varepsilon > 0$ such that
$$F(x^* + \varepsilon) > F(x^*) \text{ and } c > G(x^* + \varepsilon).$$
But, again, this result contradicts the assumption that $x^*$ maximizes $F(x)$ subject
to $c \geq G(x)$. This establishes that (5) must hold, completing the proof for case 1.
Case 2: Binding Constraint.
If $c = G(x^*)$, then let $\lambda^* = F'(x^*)/G'(x^*)$. This is possible, given the assumption
that $G'(x^*) \neq 0$. Clearly, (1), (2), and (4) are satisfied, so it only remains to show
that (3) must hold. With $\lambda^* = F'(x^*)/G'(x^*)$, (3) holds if and only if
$$F'(x^*)/G'(x^*) \geq 0. \tag{6}$$
We can show that (6) must hold using a proof by contradiction. Suppose that
instead of (6), it turns out that
$$F'(x^*)/G'(x^*) < 0.$$
One way that this can happen is if $F'(x^*) > 0$ and $G'(x^*) < 0$. But if these
conditions hold, then the continuity of $F$ and $G$ implies the existence of an $\varepsilon > 0$
such that
$$F(x^* + \varepsilon) > F(x^*) \text{ and } c = G(x^*) > G(x^* + \varepsilon),$$
which contradicts the assumption that $x^*$ maximizes $F(x)$ subject to $c \geq G(x)$.
And if, instead, $F'(x^*)/G'(x^*) < 0$ because $F'(x^*) < 0$ and $G'(x^*) > 0$, then the
continuity of $F$ and $G$ implies the existence of an $\varepsilon > 0$ such that
$$F(x^* - \varepsilon) > F(x^*) \text{ and } c = G(x^*) > G(x^* - \varepsilon),$$
which again contradicts the assumption that $x^*$ maximizes $F(x)$ subject to
$c \geq G(x)$. This establishes that (6) must hold, completing the proof for case 2.
Notes:
a) The theorem can be extended to handle cases with more than one choice variable
and more than one constraint: see Dixit, Simon-Blume, Acemoglu, or section 4.1
of the notes below.
b) Equations (1)-(4) are necessary conditions: if $x^*$ is a solution to the optimization
problem, then there exists a $\lambda^*$ such that (1)-(4) must hold. But (1)-(4) are not
sufficient conditions: if $x^*$ and $\lambda^*$ satisfy (1)-(4), it does not follow automatically
that $x^*$ is a solution to the optimization problem.
Despite point (b) listed above, the Kuhn-Tucker theorem is extremely useful in practice.
Suppose that we are looking for the solution $x^*$ to the constrained optimization problem
$$\max_x F(x) \text{ subject to } c \geq G(x).$$
The theorem tells us that if we form the Lagrangian
$$L(x, \lambda) = F(x) + \lambda [c - G(x)],$$
then $x^*$ and the associated $\lambda^*$ must satisfy the first-order condition (FOC) obtained
by differentiating $L$ with respect to $x$ and setting the result equal to zero:
$$L_1(x^*, \lambda^*) = F'(x^*) - \lambda^* G'(x^*) = 0. \tag{1}$$
In addition, we know that $x^*$ must satisfy the constraint:
$$c \geq G(x^*). \tag{2}$$
We know that the Lagrange multiplier $\lambda^*$ must be nonnegative:
$$\lambda^* \geq 0. \tag{3}$$
And finally, we know that the complementary slackness condition
$$\lambda^* [c - G(x^*)] = 0 \tag{4}$$
must hold: if $\lambda^* > 0$, then the constraint must bind; if the constraint does not bind,
then $\lambda^* = 0$.
In searching for the value of $x$ that solves the constrained optimization problem, we only
need to consider values of $x^*$ that satisfy (1)-(4).
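As a minimal illustration of this search procedure, the sketch below checks conditions (1)-(4) numerically at two candidate points. The specific problem, $F(x) = -(x-2)^2$, $G(x) = x$, $c = 1$, and the helper name `kuhn_tucker_holds` are assumptions chosen for illustration; they do not appear in the notes.

```python
# Hypothetical example: F(x) = -(x - 2)^2, G(x) = x, c = 1.
def F_prime(x):
    return -2.0 * (x - 2.0)

def G(x):
    return x

def G_prime(x):
    return 1.0

c = 1.0

def kuhn_tucker_holds(x, lam, tol=1e-9):
    """Return True if (x, lam) satisfies conditions (1)-(4)."""
    foc = abs(F_prime(x) - lam * G_prime(x)) <= tol    # (1)
    feasible = c - G(x) >= -tol                        # (2)
    nonnegative = lam >= -tol                          # (3)
    slackness = abs(lam * (c - G(x))) <= tol           # (4)
    return foc and feasible and nonnegative and slackness

# Candidate A: the constraint binds, so x = 1 and lam = F'(1)/G'(1).
x_a = 1.0
lam_a = F_prime(x_a) / G_prime(x_a)

# Candidate B: the unconstrained maximizer x = 2 with lam = 0.
x_b, lam_b = 2.0, 0.0

print(kuhn_tucker_holds(x_a, lam_a))  # candidate A passes all four conditions
print(kuhn_tucker_holds(x_b, lam_b))  # candidate B violates the constraint (2)
```

Only candidate A survives the screen, which matches the intuition that the unconstrained optimum at $x = 2$ is infeasible when $c = 1$.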
Two pieces of terminology:
a) The extra assumption that $G'(x^*) \neq 0$ is needed to guarantee the existence of a
multiplier $\lambda^*$ satisfying (1)-(4). This extra assumption is called the constraint
qualification, and almost always holds in practice.
b) Note that (1) is a FOC for $x$, while (2) is like a FOC for $\lambda$. In many economic
applications, where $F(x)$ is concave and $G(x)$ is convex, $x^*$ maximizes $L(x, \lambda^*)$,
while $\lambda^*$ minimizes $L(x^*, \lambda)$. For this reason, $(x^*, \lambda^*)$ is typically a saddle-point of
$L(x, \lambda)$.
Note, however, that in the general case where $F(x)$ need not be concave, $x^*$ will always be
a critical point of the Lagrangian (that is, it will satisfy the first-order condition (1)),
but $x^*$ need not maximize $L(x, \lambda^*)$. An example of how this can happen is given by
Dixit's example 7.2 (p.103). Consider the problem
$$\max_x e^x \text{ subject to } 1 \geq x.$$
Since the exponential objective function is strictly increasing, we know that this problem
is solved with $x^* = 1$. Forming the Lagrangian and taking the first-order condition
$$L_1(x^*, \lambda^*) = e^{x^*} - \lambda^* = 0$$
shows that the associated value of $\lambda^*$ is $e$. But $\lambda^* = e$ implies that
$$L(x, \lambda^*) = e^x + \lambda^* (1 - x) = e^x + e(1 - x),$$
and since $e^x - ex$ is strictly convex in $x$ with a zero derivative at $x = 1$, $x^* = 1$
minimizes, rather than maximizes, the Lagrangian given $\lambda^*$. Hence, $(x^*, \lambda^*)$ is not a
saddle-point of $L$, since the objective function from the original problem is convex, not concave.
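The behavior of the Lagrangian in this example can be checked numerically. The short sketch below evaluates $L(x, \lambda^*) = e^x + e(1 - x)$ at and around $x^* = 1$; the function names and the evaluation points 0.5 and 1.5 are illustrative assumptions.

```python
import math

def lagrangian(x, lam=math.e):
    """L(x, lam) = e^x + lam * (1 - x) for the problem max e^x s.t. 1 >= x."""
    return math.exp(x) + lam * (1.0 - x)

def lagrangian_dx(x, lam=math.e):
    """Derivative of the Lagrangian with respect to x."""
    return math.exp(x) - lam

x_star = 1.0

# x* = 1 is a critical point of L(x, lam*), as condition (1) requires ...
print(abs(lagrangian_dx(x_star)) < 1e-12)

# ... but nearby points on either side give a larger value of L, so x* does
# not maximize the Lagrangian given lam* = e.
print(lagrangian(0.5) > lagrangian(x_star))
print(lagrangian(1.5) > lagrangian(x_star))
```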
But, in general, by solving the problem in this way, we are using the Lagrangian to turn
a constrained optimization problem into something more like an unconstrained
optimization problem, in that the solution is a critical point of $L(x, \lambda)$ rather than simply
$F(x)$.
One final note:
Our general constraint, $c \geq G(x)$, nests as a special case the nonnegativity constraint
$x \geq 0$, obtained by setting $c = 0$ and $G(x) = -x$.
So nonnegativity constraints can be introduced into the Lagrangian in the same way
as all other constraints. If we consider, for example, the extended problem
$$\max_x F(x) \text{ subject to } c \geq G(x) \text{ and } x \geq 0,$$
then we can introduce a second multiplier $\mu$, form the Lagrangian as
$$L(x, \lambda, \mu) = F(x) + \lambda [c - G(x)] + \mu x,$$
and write the first-order condition for the optimal $x^*$ as
$$L_1(x^*, \lambda^*, \mu^*) = F'(x^*) - \lambda^* G'(x^*) + \mu^* = 0. \tag{1$'$}$$
In addition, analogs to our earlier conditions (2)-(4) must also hold for the second
constraint: $x^* \geq 0$, $\mu^* \geq 0$, and $\mu^* x^* = 0$.
Kuhn and Tucker's original statement of the theorem, however, does not incorporate
nonnegativity constraints into the Lagrangian. Instead, even with the additional
nonnegativity constraint $x \geq 0$, they continue to define the Lagrangian as
$$L(x, \lambda) = F(x) + \lambda [c - G(x)].$$
In this case, the first-order condition for $x^*$ must be modified to read
$$L_1(x^*, \lambda^*) = F'(x^*) - \lambda^* G'(x^*) \leq 0, \text{ with equality if } x^* > 0. \tag{1$''$}$$
Of course, in (1$'$), $\mu^* \geq 0$ in general and $\mu^* = 0$ if $x^* > 0$. So a close inspection reveals
that these two approaches to handling nonnegativity constraints lead in the end
to the same results.
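To see the equivalence of the two approaches in a concrete case, consider the sketch below. The problem $F(x) = -(x+1)^2$, $G(x) = x$, $c = 5$ is an assumption made purely for illustration; because $F$ is decreasing for $x \geq -1$, the solution sits at the corner $x^* = 0$ with the first constraint slack, so $\lambda^* = 0$.

```python
# Hypothetical example: F(x) = -(x + 1)^2, G(x) = x, c = 5, plus x >= 0.
def F_prime(x):
    return -2.0 * (x + 1.0)

def G_prime(x):
    return 1.0

x_star = 0.0    # corner solution on the nonnegativity constraint
lam_star = 0.0  # the constraint c >= G(x) does not bind at x* = 0

# Approach 1: carry a second multiplier mu and solve (1') for it.
mu_star = -(F_prime(x_star) - lam_star * G_prime(x_star))
foc_with_mu = F_prime(x_star) - lam_star * G_prime(x_star) + mu_star

# Approach 2: Kuhn and Tucker's original form (1''), an inequality that
# may be strict because x* = 0.
foc_original = F_prime(x_star) - lam_star * G_prime(x_star)

print(mu_star)             # nonnegative multiplier on x >= 0
print(foc_with_mu)         # (1') holds with equality
print(foc_original <= 0)   # (1'') holds as an inequality
print(mu_star * x_star)    # complementary slackness for x >= 0
```

In this example the slack in (1$''$) equals $-\mu^*$, which is the sense in which the two formulations carry the same information.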
2 The Envelope Theorem
References:
Dixit, Chapter 5.
Simon-Blume, Chapter 19.
Acemoglu, Appendix A.
In our discussion of the Kuhn-Tucker theorem, we considered an optimization problem of
the form
$$\max_x F(x) \text{ subject to } c \geq G(x).$$
Now, let's generalize the problem by allowing the functions $F$ and $G$ to depend on a
parameter $\theta \in \mathbb{R}$. The problem can now be stated as
$$\max_x F(x, \theta) \text{ subject to } c \geq G(x, \theta).$$
For this problem, define the maximum value function $V : \mathbb{R} \to \mathbb{R}$ as
$$V(\theta) = \max_x F(x, \theta) \text{ subject to } c \geq G(x, \theta).$$
Note that evaluating $V$ requires a two-step procedure:
First, given $\theta$, find the value of $x^*$ that solves the constrained optimization problem.
Second, substitute this value of $x^*$, together with the given value of $\theta$, into the
objective function to obtain
$$V(\theta) = F(x^*, \theta).$$
Now suppose that we want to investigate the properties of this function $V$. Suppose, in
particular, that we want to take the derivative of $V$ with respect to its argument $\theta$.
As the first step in evaluating $V'(\theta)$, consider solving the constrained optimization problem
for any given value of $\theta$ by setting up the Lagrangian
$$L(x, \lambda) = F(x, \theta) + \lambda [c - G(x, \theta)].$$
We know from the Kuhn-Tucker theorem that the solution $x^*$ to the optimization problem
and the associated value of the multiplier $\lambda^*$ must satisfy the complementary slackness
condition:
$$\lambda^* [c - G(x^*, \theta)] = 0.$$
Use this last result to rewrite the expression for $V$ as
$$V(\theta) = F(x^*, \theta) = F(x^*, \theta) + \lambda^* [c - G(x^*, \theta)].$$
So suppose that we tried to calculate $V'(\theta)$ simply by differentiating both sides of this
equation with respect to $\theta$:
$$V'(\theta) = F_2(x^*, \theta) - \lambda^* G_2(x^*, \theta).$$
But, in principle, this formula may not be correct. The reason is that $x^*$ and $\lambda^*$ will
themselves depend on the parameter $\theta$, and we must take this dependence into account
when differentiating $V$ with respect to $\theta$.
However, the envelope theorem tells us that our formula for $V'(\theta)$ is, in fact, correct. That
is, the envelope theorem tells us that we can ignore the dependence of $x^*$ and $\lambda^*$ on $\theta$
in calculating $V'(\theta)$.
To see why, for any $\theta$, let $x^*(\theta)$ denote the solution to the problem: $\max_x F(x, \theta)$ subject to
$c \geq G(x, \theta)$, and let $\lambda^*(\theta)$ be the associated Lagrange multiplier.
Theorem (Envelope) Let $F$ and $G$ be continuously differentiable functions of $x$ and $\theta$.
For any given $\theta$, let $x^*(\theta)$ maximize $F(x, \theta)$ subject to $c \geq G(x, \theta)$, and let $\lambda^*(\theta)$ be
the associated value of the Lagrange multiplier. Suppose, further, that $x^*(\theta)$ and $\lambda^*(\theta)$
are also continuously differentiable functions, and that the constraint qualification
$G_1[x^*(\theta), \theta] \neq 0$ holds for all values of $\theta$. Then the maximum value function defined by
$$V(\theta) = \max_x F(x, \theta) \text{ subject to } c \geq G(x, \theta)$$
satisfies
$$V'(\theta) = F_2[x^*(\theta), \theta] - \lambda^*(\theta) G_2[x^*(\theta), \theta]. \tag{7}$$
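Proof The Kuhn-Tucker theorem tells us that for any given value of $\theta$, $x^*(\theta)$ and $\lambda^*(\theta)$
must satisfy
$$L_1[x^*(\theta), \lambda^*(\theta)] = F_1[x^*(\theta), \theta] - \lambda^*(\theta) G_1[x^*(\theta), \theta] = 0 \tag{1}$$
and
$$\lambda^*(\theta) \{c - G[x^*(\theta), \theta]\} = 0. \tag{4}$$
In light of (4),
$$V(\theta) = F[x^*(\theta), \theta] = F[x^*(\theta), \theta] + \lambda^*(\theta) \{c - G[x^*(\theta), \theta]\}.$$
Differentiating both sides of this expression with respect to $\theta$ yields
$$\begin{aligned}
V'(\theta) = {} & F_1[x^*(\theta), \theta] x^{*\prime}(\theta) + F_2[x^*(\theta), \theta] \\
& + \lambda^{*\prime}(\theta) \{c - G[x^*(\theta), \theta]\} - \lambda^*(\theta) G_1[x^*(\theta), \theta] x^{*\prime}(\theta) - \lambda^*(\theta) G_2[x^*(\theta), \theta],
\end{aligned}$$
which shows that, in principle, we must take the dependence of $x^*$ and $\lambda^*$ on $\theta$ into
account when calculating $V'(\theta)$.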
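Note, however, that
$$\begin{aligned}
V'(\theta) = {} & \{F_1[x^*(\theta), \theta] - \lambda^*(\theta) G_1[x^*(\theta), \theta]\} x^{*\prime}(\theta) \\
& + F_2[x^*(\theta), \theta] + \lambda^{*\prime}(\theta) \{c - G[x^*(\theta), \theta]\} - \lambda^*(\theta) G_2[x^*(\theta), \theta],
\end{aligned}$$
which by (1) reduces to
$$V'(\theta) = F_2[x^*(\theta), \theta] + \lambda^{*\prime}(\theta) \{c - G[x^*(\theta), \theta]\} - \lambda^*(\theta) G_2[x^*(\theta), \theta].$$
Thus, it only remains to show that
$$\lambda^{*\prime}(\theta) \{c - G[x^*(\theta), \theta]\} = 0. \tag{8}$$
Clearly, (8) holds for any $\theta$ such that the constraint is binding.
For $\theta$ such that the constraint is not binding, (4) implies that $\lambda^*(\theta)$ must equal zero.
Furthermore, by the continuity of $G$ and $x^*$, if the constraint does not bind at $\theta$, there
exists an $\varepsilon^* > 0$ such that the constraint does not bind for all $\theta + \varepsilon$ with $\varepsilon^* > |\varepsilon|$.
Hence, (4) also implies that $\lambda^*(\theta + \varepsilon) = 0$ for all $\varepsilon^* > |\varepsilon|$. Using the definition of the
derivative
$$\lambda^{*\prime}(\theta) = \lim_{\varepsilon \to 0} \frac{\lambda^*(\theta + \varepsilon) - \lambda^*(\theta)}{\varepsilon} = \lim_{\varepsilon \to 0} \frac{0}{\varepsilon} = 0,$$
it once again becomes apparent that (8) must hold.
Thus,
$$V'(\theta) = F_2[x^*(\theta), \theta] - \lambda^*(\theta) G_2[x^*(\theta), \theta]$$
as claimed in the theorem.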
Once again, this theorem is useful because it tells us that we can ignore the dependence of
$x^*$ and $\lambda^*$ on $\theta$ in calculating $V'(\theta)$.
And once again, the theorem can be extended to apply in more general settings: see Dixit,
Simon-Blume, Acemoglu, or section 4.2 of the notes below.
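The formula in (7) can be checked on a small example in which the parameter enters only through the constraint. The sketch below assumes $F(x, \theta) = \ln x$, $G(x, \theta) = \theta x$, and $c = 1$, with $x > 0$ and $\theta > 0$; these functional forms and all function names are illustrative assumptions, not taken from the notes. Since $\ln x$ is increasing, the constraint binds, so $x^*(\theta) = c/\theta$ and condition (1) gives $\lambda^*(\theta) = 1/(\theta x^*(\theta))$.

```python
import math

c = 1.0

def x_star(theta):
    # The constraint c >= theta * x binds at the optimum.
    return c / theta

def lam_star(theta):
    # Condition (1): F_1 = lam * G_1, i.e. 1/x = lam * theta.
    return 1.0 / (x_star(theta) * theta)

def V(theta):
    # Maximum value function: F evaluated at the optimal choice.
    return math.log(x_star(theta))

def envelope_formula(theta):
    F_2 = 0.0             # ln(x) has no direct dependence on theta
    G_2 = x_star(theta)   # partial derivative of theta * x w.r.t. theta
    return F_2 - lam_star(theta) * G_2

def central_diff(f, t, h=1e-6):
    # Numerical derivative that does account for x*(theta) changing.
    return (f(t + h) - f(t - h)) / (2.0 * h)

theta = 2.0
print(central_diff(V, theta))   # derivative of V computed directly
print(envelope_formula(theta))  # right-hand side of (7)
```

Note that dropping the multiplier term and using $F_2$ alone would give zero here, which illustrates why the constraint term has to be added before differentiating.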
But what is the intuition for why the envelope theorem holds? To obtain some intuition,
begin by considering the simpler, unconstrained optimization problem:
$$\max_x F(x, \theta),$$
where $x$ is the choice variable and $\theta$ is the parameter.
Associated with this unconstrained problem, define the maximum value function in the
same way as before:
$$V(\theta) = \max_x F(x, \theta).$$
To evaluate $V$ for any given value of $\theta$, use the same two-step procedure as before. First,
find the value $x^*(\theta)$ that solves the unconstrained maximization problem for that value
of $\theta$. Second, substitute that value of $x$ back into the objective function to obtain
$$V(\theta) = F[x^*(\theta), \theta].$$
Now differentiate both sides of this expression with respect to $\theta$, carefully taking the dependence
of $x^*$ on $\theta$ into account:
$$V'(\theta) = F_1[x^*(\theta), \theta] x^{*\prime}(\theta) + F_2[x^*(\theta), \theta].$$
But, if $x^*(\theta)$ is the value of $x$ that maximizes $F$ given $\theta$, we know that $x^*(\theta)$ must be a
critical point of $F$:
$$F_1[x^*(\theta), \theta] = 0.$$
Hence, for the unconstrained problem, the envelope theorem implies that
$$V'(\theta) = F_2[x^*(\theta), \theta],$$
so that, again, we can ignore the dependence of $x^*$ on $\theta$ in differentiating the maximum
value function. And this result holds not because $x^*$ fails to depend on $\theta$: to the
contrary, in fact, $x^*$ will typically depend on $\theta$ through the function $x^*(\theta)$. Instead, the
result holds because, since $x^*$ is chosen optimally, $x^*(\theta)$ is a critical point of $F$ given $\theta$.
Now return to the constrained optimization problem
$$\max_x F(x, \theta) \text{ subject to } c \geq G(x, \theta)$$
and define the maximum value function as before:
$$V(\theta) = \max_x F(x, \theta) \text{ subject to } c \geq G(x, \theta).$$
The envelope theorem for this constrained problem tells us that we can also ignore the
dependence of $x^*$ on $\theta$ when differentiating $V$ with respect to $\theta$, but only if we start by
adding the complementary slackness condition to the maximized objective function to
first obtain
$$V(\theta) = F[x^*(\theta), \theta] + \lambda^*(\theta) \{c - G[x^*(\theta), \theta]\}.$$
In taking this first step, we are actually evaluating the entire Lagrangian at the optimum,
instead of just the objective function. We need to take this first step because for the
constrained problem, the Kuhn-Tucker condition (1) tells us that $x^*(\theta)$ is a critical
point, not of the objective function by itself, but of the entire Lagrangian formed by
adding the product of the multiplier and the constraint to the objective function.
And what gives the envelope theorem its name? The "envelope" theorem refers to a
geometrical presentation of the same result that we've just worked through.
To see where that geometrical interpretation comes from, consider again the simpler,
unconstrained optimization problem:
$$\max_x F(x, \theta),$$
where $x$ is the choice variable and $\theta$ is a parameter.
Following along with our previous notation, let $x^*(\theta)$ denote the solution to this problem
for any given value of $\theta$, so that the function $x^*(\theta)$ tells us how the optimal choice of
$x$ depends on the parameter $\theta$.
Also, continue to define the maximum value function $V$ in the same way as before:
$$V(\theta) = \max_x F(x, \theta).$$
Now let $\theta_1$ denote a particular value of $\theta$, and let $x_1^*$ denote the optimal value of $x$ associated
with this particular value $\theta_1$. That is, let
$$x_1^* = x^*(\theta_1).$$
After substituting this value of $x_1^*$ into the function $F$, we can think about how $F(x_1^*, \theta)$
varies as $\theta$ varies; that is, we can think about $F(x_1^*, \theta)$ as a function of $\theta$, holding $x_1^*$ fixed.
In the same way, let $\theta_2$ denote another particular value of $\theta$, with $\theta_2 > \theta_1$ let's say. And
following the same steps as above, let $x_2^*$ denote the optimal value of $x$ associated with
this particular value $\theta_2$, so that
$$x_2^* = x^*(\theta_2).$$
Once again, we can hold $x_2^*$ fixed and consider $F(x_2^*, \theta)$ as a function of $\theta$.
The geometrical presentation of the envelope theorem can be derived by thinking about the
properties of these three functions of $\theta$: $V(\theta)$, $F(x_1^*, \theta)$, and $F(x_2^*, \theta)$.
One thing that we know about these three functions is that for $\theta = \theta_1$:
$$V(\theta_1) = F(x_1^*, \theta_1) > F(x_2^*, \theta_1),$$
where the first equality and the second inequality both follow from the fact that, by
definition, $x_1^*$ maximizes $F(x, \theta_1)$ by choice of $x$.
Another thing that we know about these three functions is that for $\theta = \theta_2$:
$$V(\theta_2) = F(x_2^*, \theta_2) > F(x_1^*, \theta_2),$$
because again, by definition, $x_2^*$ maximizes $F(x, \theta_2)$ by choice of $x$.
On a graph, these relationships imply that:
At $\theta_1$, $V(\theta)$ coincides with $F(x_1^*, \theta)$, which lies above $F(x_2^*, \theta)$.
At $\theta_2$, $V(\theta)$ coincides with $F(x_2^*, \theta)$, which lies above $F(x_1^*, \theta)$.
And we could find more and more values of $V$ by repeating this procedure for more
and more specific values of $\theta_i$, $i = 1, 2, 3, \ldots$.
In other words:
$V(\theta)$ traces out the "upper envelope" of the collection of functions $F(x_i^*, \theta)$, formed
by holding $x_i^* = x^*(\theta_i)$ fixed and varying $\theta$.
Moreover, $V(\theta)$ is tangent to each individual function $F(x_i^*, \theta)$ at the value $\theta_i$ of $\theta$ for
which $x_i^*$ is optimal, or equivalently:
$$V'(\theta) = F_2[x^*(\theta), \theta],$$
which is the same analytical result that we derived earlier for the unconstrained
optimization problem.
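If, for example,
$$F(x, \theta) = -(x - \theta)^2 + \theta^2 = -x^2 + 2\theta x,$$
then
$$V(\theta) = \max_x \, -(x - \theta)^2 + \theta^2 = \theta^2,$$
since, in this case, $x^*(\theta) = \theta$ for all values of $\theta$.
The figure below sets $\theta_1 = 2$ and $\theta_2 = 7$, hence $x_1^* = 2$ and $x_2^* = 7$, then plots
$$F(x_1^*, \theta) = -4 + 4\theta,$$
$$F(x_2^*, \theta) = -49 + 14\theta,$$
and
$$V(\theta) = \theta^2$$
to show how
$$V(\theta_1) = F(x_1^*, \theta_1) > F(x_2^*, \theta_1) \text{ at } \theta_1 = 2$$
and
$$V(\theta_2) = F(x_2^*, \theta_2) > F(x_1^*, \theta_2) \text{ at } \theta_2 = 7,$$
and how, more generally, $V(\theta)$ traces out the upper envelope of the family of functions
$F(x_i^*, \theta)$, where each $x_i^*$ maximizes $F(x, \theta)$ for some value $\theta_i$ of $\theta$.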
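The envelope property of this quadratic example is easy to confirm numerically. The sketch below uses the example's own $F(x, \theta) = -(x - \theta)^2 + \theta^2$ with $\theta_1 = 2$ and $\theta_2 = 7$; the function names, the finite-difference step, and the grid of $\theta$ values are illustrative assumptions.

```python
def F(x, theta):
    return -(x - theta) ** 2 + theta ** 2

def x_star(theta):
    # The unconstrained maximizer of F for a given theta.
    return theta

def V(theta):
    return F(x_star(theta), theta)

def F_2(x, theta):
    # Partial derivative of F with respect to theta, holding x fixed.
    return 2.0 * (x - theta) + 2.0 * theta

def central_diff(f, t, h=1e-6):
    return (f(t + h) - f(t - h)) / (2.0 * h)

theta_1, theta_2 = 2.0, 7.0
x_1, x_2 = x_star(theta_1), x_star(theta_2)

# Tangency: V'(theta_i) matches F_2 evaluated at the optimal choice.
for th in (theta_1, theta_2):
    print(central_diff(V, th), F_2(x_star(th), th))

# Upper envelope: V lies on or above both fixed-x curves on a grid of thetas.
grid = [0.5 * i for i in range(0, 21)]  # theta = 0.0, 0.5, ..., 10.0
print(all(V(th) >= F(x_1, th) and V(th) >= F(x_2, th) for th in grid))
```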
To generalize these arguments so that they apply to the constrained optimization problem
$$\max_x F(x, \theta) \text{ subject to } c \geq G(x, \theta),$$
simply use the fact that in many cases (as when $F$ is concave and $G$ is convex) the
value $x^*(\theta)$ that solves the constrained optimization problem for any given value of $\theta$
also maximizes the Lagrangian function
$$L(x, \lambda, \theta) = F(x, \theta) + \lambda [c - G(x, \theta)],$$
so that
$$V(\theta) = \max_x F(x, \theta) \text{ subject to } c \geq G(x, \theta) = \max_x L(x, \lambda, \theta).$$
Now just replace the function $F$ with the function $L$ in working through the arguments
from above to conclude that
$$V'(\theta) = L_3[x^*(\theta), \lambda^*(\theta), \theta] = F_2[x^*(\theta), \theta] - \lambda^*(\theta) G_2[x^*(\theta), \theta],$$
which is again the same result that we derived before for the constrained optimization
problem.
3 Two Examples
3.1 Utility Maximization
A consumer has a utility function defined over consumption of two goods: $U(c_1, c_2)$
Prices: $p_1$ and $p_2$
Income: $I$
Budget constraint: $I \geq p_1 c_1 + p_2 c_2 = G(c_1, c_2)$
The consumer's problem is:
$$\max_{c_1, c_2} U(c_1, c_2) \text{ subject to } I \geq p_1 c_1 + p_2 c_2.$$
The Kuhn-Tucker theorem tells us that if we set up the Lagrangian
$$L(c_1, c_2, \lambda) = U(c_1, c_2) + \lambda (I - p_1 c_1 - p_2 c_2),$$
then the optimal consumptions $c_1^*$ and $c_2^*$ and the associated multiplier $\lambda^*$ must satisfy the
FOC:
$$L_1(c_1^*, c_2^*, \lambda^*) = U_1(c_1^*, c_2^*) - \lambda^* p_1 = 0$$
and
$$L_2(c_1^*, c_2^*, \lambda^*) = U_2(c_1^*, c_2^*) - \lambda^* p_2 = 0.$$
Move the terms with minus signs to the other side, and divide the first of these FOC by
the second to obtain
$$\frac{U_1(c_1^*, c_2^*)}{U_2(c_1^*, c_2^*)} = \frac{p_1}{p_2},$$
which is just the familiar condition that says that the optimizing consumer should set
the slope of his or her indifference curve, the marginal rate of substitution, equal to
the slope of his or her budget constraint, the ratio of prices.
Now consider $I$ as one of the model's parameters, and let the functions $c_1^*(I)$, $c_2^*(I)$, and
$\lambda^*(I)$ describe how the optimal choices $c_1^*$ and $c_2^*$ and the associated value $\lambda^*$ of the
multiplier depend on $I$.
In addition, define the maximum value function as
$$V(I) = \max_{c_1, c_2} U(c_1, c_2) \text{ subject to } I \geq p_1 c_1 + p_2 c_2.$$
The Kuhn-Tucker theorem tells us that
$$\lambda^*(I) [I - p_1 c_1^*(I) - p_2 c_2^*(I)] = 0$$
and hence
$$V(I) = U[c_1^*(I), c_2^*(I)] = U[c_1^*(I), c_2^*(I)] + \lambda^*(I) [I - p_1 c_1^*(I) - p_2 c_2^*(I)].$$
The envelope theorem tells us that we can ignore the dependence of $c_1^*$ and $c_2^*$ on $I$ in
calculating
$$V'(I) = \lambda^*(I),$$
which gives us an interpretation of the multiplier $\lambda^*$ as the marginal utility of income.
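A concrete utility function makes this interpretation easy to verify. The sketch below assumes the Cobb-Douglas form $U(c_1, c_2) = a \ln c_1 + (1 - a) \ln c_2$, which is not part of the notes; under that assumption the FOC and the binding budget constraint give $c_1^* = aI/p_1$, $c_2^* = (1-a)I/p_2$, and $\lambda^* = 1/I$. The parameter values and function names are likewise illustrative.

```python
import math

def demands(I, p1, p2, a):
    # Optimal consumptions under the assumed log Cobb-Douglas utility.
    return a * I / p1, (1.0 - a) * I / p2

def multiplier(I, p1, p2, a):
    # From the first FOC: lambda* = U_1(c1*, c2*) / p1.
    c1, _ = demands(I, p1, p2, a)
    return (a / c1) / p1

def indirect_utility(I, p1, p2, a):
    # Maximum value function V(I).
    c1, c2 = demands(I, p1, p2, a)
    return a * math.log(c1) + (1.0 - a) * math.log(c2)

def central_diff(f, t, h=1e-4):
    return (f(t + h) - f(t - h)) / (2.0 * h)

I, p1, p2, a = 100.0, 2.0, 5.0, 0.3
c1, c2 = demands(I, p1, p2, a)

# The marginal rate of substitution equals the price ratio at the optimum.
mrs = (a / c1) / ((1.0 - a) / c2)
print(mrs, p1 / p2)

# V'(I), computed by finite differences, matches the multiplier.
dV_dI = central_diff(lambda inc: indirect_utility(inc, p1, p2, a), I)
print(dV_dI, multiplier(I, p1, p2, a))
```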
3.2 Cost Minimization
The Kuhn-Tucker and envelope conditions can also be used to study constrained
minimization problems.
Consider a firm that produces output $y$ using capital $k$ and labor $l$, according to the
technology described by
$$f(k, l) \geq y.$$
$r$ = rental rate for capital
$w$ = wage rate
Suppose that the firm takes its output $y$ as given, and chooses inputs $k$ and $l$ to minimize
costs. Then the firm solves
$$\min_{k, l} rk + wl \text{ subject to } f(k, l) \geq y.$$
If we set up the Lagrangian as
$$L(k, l, \lambda) = rk + wl - \lambda [f(k, l) - y],$$
where the term involving the multiplier $\lambda$ is subtracted rather than added in the case of
a minimization problem, the Kuhn-Tucker conditions (1)-(4) continue to apply, exactly
as before.
Thus, according to the Kuhn-Tucker theorem, the optimal choices $k^*$ and $l^*$ and the
associated multiplier $\lambda^*$ must satisfy the FOC:
$$L_1(k^*, l^*, \lambda^*) = r - \lambda^* f_1(k^*, l^*) = 0 \tag{9}$$
and
$$L_2(k^*, l^*, \lambda^*) = w - \lambda^* f_2(k^*, l^*) = 0. \tag{10}$$
Move the terms with minus signs over to the other side, and divide the first FOC by the
second to obtain
$$\frac{f_1(k^*, l^*)}{f_2(k^*, l^*)} = \frac{r}{w},$$
which is another familiar condition that says that the optimizing firm chooses factor
inputs so that the marginal rate of substitution between inputs in production equals
the ratio of factor prices.
Now suppose that the constraint binds, as it usually will:
$$y = f(k^*, l^*). \tag{11}$$
Then (9)-(11) represent three equations that determine the three unknowns $k^*$, $l^*$, and $\lambda^*$ as
functions of the model's parameters $r$, $w$, and $y$. In particular, we can think of the
functions
$$k^* = k^*(r, w, y)$$
and
$$l^* = l^*(r, w, y)$$
as demand curves for capital and labor: strictly speaking, they are conditional (on $y$)
factor demand functions.
Now define the minimum cost function as
$$\begin{aligned}
C(r, w, y) &= \min_{k, l} rk + wl \text{ subject to } f(k, l) \geq y \\
&= r k^*(r, w, y) + w l^*(r, w, y) \\
&= r k^*(r, w, y) + w l^*(r, w, y) - \lambda^*(r, w, y) \{f[k^*(r, w, y), l^*(r, w, y)] - y\}.
\end{aligned}$$
The envelope theorem tells us that in calculating the derivatives of the cost function, we
can ignore the dependence of $k^*$, $l^*$, and $\lambda^*$ on $r$, $w$, and $y$.
Hence:
$$C_1(r, w, y) = k^*(r, w, y),$$
$$C_2(r, w, y) = l^*(r, w, y),$$
and
$$C_3(r, w, y) = \lambda^*(r, w, y).$$
The first two of these equations are statements of Shephard's lemma; they tell us that
the derivatives of the cost function with respect to factor prices coincide with the
conditional factor demand curves. The third equation gives us an interpretation of the
multiplier $\lambda^*$ as a measure of the marginal cost of increasing output.
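Shephard's lemma and the marginal-cost interpretation of the multiplier can be checked with a specific technology. The sketch below assumes a Cobb-Douglas production function $f(k, l) = k^a l^{1-a}$, which the notes do not specify; with it, (9)-(11) imply $l^*/k^* = (1-a)r/(aw)$ and closed-form conditional factor demands. The parameter values and function names are illustrative assumptions.

```python
def factor_demands(r, w, y, a=0.4):
    # Conditional factor demands for the assumed f(k, l) = k^a * l^(1 - a).
    z = ((1.0 - a) * r) / (a * w)   # optimal labor-capital ratio l*/k*
    k = y * z ** (-(1.0 - a))
    l = y * z ** a
    return k, l

def multiplier(r, w, y, a=0.4):
    # From (9): lambda* = r / f_1(k*, l*), with f_1 = a * (l/k)^(1 - a).
    k, l = factor_demands(r, w, y, a)
    return r / (a * (l / k) ** (1.0 - a))

def cost(r, w, y, a=0.4):
    # Minimum cost function C(r, w, y).
    k, l = factor_demands(r, w, y, a)
    return r * k + w * l

def central_diff(f, t, h=1e-6):
    return (f(t + h) - f(t - h)) / (2.0 * h)

r, w, y = 2.0, 4.0, 10.0
k_star, l_star = factor_demands(r, w, y)

C_1 = central_diff(lambda r_: cost(r_, w, y), r)
C_2 = central_diff(lambda w_: cost(r, w_, y), w)
C_3 = central_diff(lambda y_: cost(r, w, y_), y)

print(C_1, k_star)               # Shephard's lemma for capital
print(C_2, l_star)               # Shephard's lemma for labor
print(C_3, multiplier(r, w, y))  # multiplier as marginal cost
```

The finite-difference derivatives let $k^*$, $l^*$, and $\lambda^*$ adjust fully, so agreement with the right-hand sides is exactly what the envelope theorem predicts.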
Thus, our two examples illustrate how we can apply the Kuhn-Tucker and envelope theorems
in specific economic problems.
The two examples also show how, in the context of specific economic problems, it is often
possible to attach an economic interpretation to the multiplier $\lambda^*$.
4 Generalizing the Basic Results
4.1 The Kuhn-Tucker Theorem
Our "simple" version of the Kuhn-Tucker theorem applies to a problem with only one choice
variable and one constraint.
Section 19.6 of Simon and Blume's book develops a proof for the more general case, with
$n$ choice variables and $m$ constraints. Their proof makes repeated, clever use of the
implicit function theorem, which makes the arguments surprisingly short but also works
to obscure some of the intuition provided by the analysis of the simplest case.
Nevertheless, having gained the intuition from working through the simple
case, it is useful to see how the result extends.
Simon and Blume (Chapter 15) and Acemoglu (Appendix A) both present fairly general
statements of the implicit function theorem. The special case or application of their
results that we will need works as follows.
Consider a system of $n$ equations in $n$ variables:
$$\begin{aligned}
H_1(y_1, y_2, \ldots, y_n) &= c_1, \\
H_2(y_1, y_2, \ldots, y_n) &= c_2, \\
&\;\;\vdots \\
H_n(y_1, y_2, \ldots, y_n) &= c_n.
\end{aligned}$$
The functions may have other arguments ("exogenous variables"), but since these
will be held fixed, notation referring to them can be suppressed.
Now evaluate these equations at a specific set of values $y_1^*, y_2^*, \ldots, y_n^*$ to obtain
$$\begin{aligned}
H_1(y_1^*, y_2^*, \ldots, y_n^*) &= c_1^*, \\
H_2(y_1^*, y_2^*, \ldots, y_n^*) &= c_2^*, \\
&\;\;\vdots \\
H_n(y_1^*, y_2^*, \ldots, y_n^*) &= c_n^*.
\end{aligned}$$
Suppose that each function $H_i$, $i = 1, \ldots, n$, is continuously differentiable and that the
$n \times n$ matrix of derivatives
$$\begin{bmatrix}
\partial H_1 / \partial y_1 & \cdots & \partial H_1 / \partial y_n \\
\partial H_2 / \partial y_1 & \cdots & \partial H_2 / \partial y_n \\
\vdots & \ddots & \vdots \\
\partial H_n / \partial y_1 & \cdots & \partial H_n / \partial y_n
\end{bmatrix}$$
is nonsingular at $y_1^*, y_2^*, \ldots, y_n^*$.
Then there exist continuously differentiable functions
$$y_1(c_1, c_2, \ldots, c_n), \; y_2(c_1, c_2, \ldots, c_n), \; \ldots, \; y_n(c_1, c_2, \ldots, c_n),$$
defined in an open subset $C$ of $\mathbb{R}^n$ containing $(c_1^*, c_2^*, \ldots, c_n^*)$, such that
$$\begin{aligned}
H_1(y_1(c_1, c_2, \ldots, c_n), y_2(c_1, c_2, \ldots, c_n), \ldots, y_n(c_1, c_2, \ldots, c_n)) &= c_1, \\
H_2(y_1(c_1, c_2, \ldots, c_n), y_2(c_1, c_2, \ldots, c_n), \ldots, y_n(c_1, c_2, \ldots, c_n)) &= c_2, \\
&\;\;\vdots \\
H_n(y_1(c_1, c_2, \ldots, c_n), y_2(c_1, c_2, \ldots, c_n), \ldots, y_n(c_1, c_2, \ldots, c_n)) &= c_n,
\end{aligned}$$
for all $(c_1, c_2, \ldots, c_n) \in C$.
With this result in hand, consider the following generalized version of the Kuhn-Tucker
theorem we proved earlier. Let there be $n$ choice variables, $x_1, x_2, \ldots, x_n$. The objective
function $F : \mathbb{R}^n \to \mathbb{R}$ is continuously differentiable, as are the $m$ functions
$G_j : \mathbb{R}^n \to \mathbb{R}$, $j = 1, 2, \ldots, m$, that enter into the constraints
$$c_j \geq G_j(x_1, x_2, \ldots, x_n),$$
where $c_j \in \mathbb{R}$ for all $j = 1, 2, \ldots, m$.
The problem can be stated as:
$$\max_{x_1, x_2, \ldots, x_n} F(x_1, x_2, \ldots, x_n) \text{ subject to } c_j \geq G_j(x_1, x_2, \ldots, x_n) \text{ for all } j = 1, 2, \ldots, m.$$
Note that, typically, $m \leq n$ will have to hold so that there is a set of values
for the choice variables that satisfy all of the constraints.
To define the Lagrangian, introduce the multipliers $\lambda_j$, $j = 1, 2, \ldots, m$, one for each
constraint. Then
$$L(x_1, x_2, \ldots, x_n, \lambda_1, \lambda_2, \ldots, \lambda_m) = F(x_1, x_2, \ldots, x_n) + \sum_{j=1}^m \lambda_j [c_j - G_j(x_1, x_2, \ldots, x_n)].$$
Theorem (Kuhn-Tucker) Suppose that $x_1^*, x_2^*, \ldots, x_n^*$ maximize $F(x_1, x_2, \ldots, x_n)$
subject to $c_j \geq G_j(x_1, x_2, \ldots, x_n)$ for all $j = 1, 2, \ldots, m$, where $F$ and the $G_j$'s are all
continuously differentiable. Suppose (without loss of generality) that the first $\bar{m} \leq m$
constraints bind at the optimum and that the remaining $m - \bar{m} \geq 0$ constraints are
nonbinding, and assume that the $\bar{m} \times n$ matrix of derivatives
$$\begin{bmatrix}
G_{1,1}(x_1^*, x_2^*, \ldots, x_n^*) & \cdots & G_{1,n}(x_1^*, x_2^*, \ldots, x_n^*) \\
G_{2,1}(x_1^*, x_2^*, \ldots, x_n^*) & \cdots & G_{2,n}(x_1^*, x_2^*, \ldots, x_n^*) \\
\vdots & \ddots & \vdots \\
G_{\bar{m},1}(x_1^*, x_2^*, \ldots, x_n^*) & \cdots & G_{\bar{m},n}(x_1^*, x_2^*, \ldots, x_n^*)
\end{bmatrix}, \tag{12}$$
where $G_{j,i} = \partial G_j / \partial x_i$, has rank $\bar{m}$. Then there exist values $\lambda_1^*, \lambda_2^*, \ldots, \lambda_m^*$ that,
together with $x_1^*, x_2^*, \ldots, x_n^*$, satisfy:
$$L_i(x_1^*, x_2^*, \ldots, x_n^*, \lambda_1^*, \lambda_2^*, \ldots, \lambda_m^*) = F_i(x_1^*, x_2^*, \ldots, x_n^*) - \sum_{j=1}^m \lambda_j^* G_{j,i}(x_1^*, x_2^*, \ldots, x_n^*) = 0 \tag{13}$$
for $i = 1, 2, \ldots, n$,
$$L_{n+j}(x_1^*, x_2^*, \ldots, x_n^*, \lambda_1^*, \lambda_2^*, \ldots, \lambda_m^*) = c_j - G_j(x_1^*, x_2^*, \ldots, x_n^*) \geq 0 \tag{14}$$
for $j = 1, 2, \ldots, m$,
$$\lambda_j^* \geq 0 \tag{15}$$
for $j = 1, 2, \ldots, m$, and
$$\lambda_j^* [c_j - G_j(x_1^*, x_2^*, \ldots, x_n^*)] = 0 \tag{16}$$
for $j = 1, 2, \ldots, m$.
Proof To begin, set the multipliers $\lambda_{\bar{m}+1}^*, \lambda_{\bar{m}+2}^*, \ldots, \lambda_m^*$ associated with the nonbinding
constraints equal to zero. Since each of the functions $G_j$, $j = \bar{m}+1, \bar{m}+2, \ldots, m$, is
continuously differentiable, sufficiently small adjustments in the choice variables can
be made without causing these $m - \bar{m}$ constraints to become binding.
Next, note that the $(\bar{m}+1) \times n$ matrix
$$\begin{bmatrix}
F_1(x_1^*, x_2^*, \ldots, x_n^*) & \cdots & F_n(x_1^*, x_2^*, \ldots, x_n^*) \\
G_{1,1}(x_1^*, x_2^*, \ldots, x_n^*) & \cdots & G_{1,n}(x_1^*, x_2^*, \ldots, x_n^*) \\
G_{2,1}(x_1^*, x_2^*, \ldots, x_n^*) & \cdots & G_{2,n}(x_1^*, x_2^*, \ldots, x_n^*) \\
\vdots & \ddots & \vdots \\
G_{\bar{m},1}(x_1^*, x_2^*, \ldots, x_n^*) & \cdots & G_{\bar{m},n}(x_1^*, x_2^*, \ldots, x_n^*)
\end{bmatrix} \tag{17}$$
must have rank $\bar{m} < \bar{m}+1$. To see why, consider the system of equations
$$\begin{aligned}
F(x_1, x_2, \ldots, x_n) &= y \\
G_1(x_1, x_2, \ldots, x_n) &= c_1 \\
G_2(x_1, x_2, \ldots, x_n) &= c_2 \\
&\;\;\vdots \\
G_{\bar{m}}(x_1, x_2, \ldots, x_n) &= c_{\bar{m}}.
\end{aligned}$$
With $y^*$ set equal to the maximized value of the objective function,
$$y^* = F(x_1^*, x_2^*, \ldots, x_n^*),$$
each of these $\bar{m}+1$ equations holds when the functions are evaluated at $x_1^*, x_2^*, \ldots, x_n^*$.
In this case, if the matrix in (17) had full rank $\bar{m}+1$, the implicit function theorem would
imply that it should be possible to adjust the values of $\bar{m}+1$ of the choice variables so as
to find a new set of values $x_1^{**}, x_2^{**}, \ldots, x_n^{**}$ such that
$$\begin{aligned}
F(x_1^{**}, x_2^{**}, \ldots, x_n^{**}) &= y^* + \varepsilon \\
G_1(x_1^{**}, x_2^{**}, \ldots, x_n^{**}) &= c_1 \\
G_2(x_1^{**}, x_2^{**}, \ldots, x_n^{**}) &= c_2 \\
&\;\;\vdots \\
G_{\bar{m}}(x_1^{**}, x_2^{**}, \ldots, x_n^{**}) &= c_{\bar{m}}
\end{aligned}$$
for a strictly positive but sufficiently small value of $\varepsilon$. But this contradicts the
assumption that $x_1^*, x_2^*, \ldots, x_n^*$ solves the constrained optimization problem.
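Since the matrix in (17) has rank $\bar{m} < \bar{m}+1$, its $\bar{m}+1$ rows must be linearly dependent.
Hence, there exist scalars $\alpha_0, \alpha_1, \ldots, \alpha_{\bar{m}}$, at least one of which is nonzero, such that
$$\begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}
= \alpha_0 \begin{bmatrix} F_1(x_1^*, x_2^*, \ldots, x_n^*) \\ \vdots \\ F_n(x_1^*, x_2^*, \ldots, x_n^*) \end{bmatrix}
+ \alpha_1 \begin{bmatrix} G_{1,1}(x_1^*, x_2^*, \ldots, x_n^*) \\ \vdots \\ G_{1,n}(x_1^*, x_2^*, \ldots, x_n^*) \end{bmatrix}
+ \ldots
+ \alpha_{\bar{m}} \begin{bmatrix} G_{\bar{m},1}(x_1^*, x_2^*, \ldots, x_n^*) \\ \vdots \\ G_{\bar{m},n}(x_1^*, x_2^*, \ldots, x_n^*) \end{bmatrix}. \tag{18}$$
Moreover, in (18), $\alpha_0 \neq 0$, since otherwise, the matrix in (12) would have rank less
than $\bar{m}$.
Thus, for $j = 1, 2, \ldots, \bar{m}$, set $\lambda_j^* = -\alpha_j / \alpha_0$. With these settings for $\lambda_1^*, \lambda_2^*, \ldots, \lambda_{\bar{m}}^*$, plus the
settings $\lambda_{\bar{m}+1}^* = \lambda_{\bar{m}+2}^* = \ldots = \lambda_m^* = 0$ chosen earlier, (18) implies that (13) must hold for
all $i = 1, 2, \ldots, n$. Clearly, (14) and (16) are satisfied for all $j = 1, 2, \ldots, m$, and (15)
holds for all $j = \bar{m}+1, \bar{m}+2, \ldots, m$. So it only remains to show that (15) holds for
$j = 1, 2, \ldots, \bar{m}$.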
To see that these last conditions must hold, consider the system of equations
$$\begin{aligned}
G_1(x_1, x_2, \ldots, x_n) &= c_1 - \delta \\
G_2(x_1, x_2, \ldots, x_n) &= c_2 \\
&\;\;\vdots \\
G_{\bar{m}}(x_1, x_2, \ldots, x_n) &= c_{\bar{m}},
\end{aligned} \tag{19}$$
where $\delta \geq 0$. These equations hold, with $\delta = 0$, at $x_1^*, x_2^*, \ldots, x_n^*$. And since the matrix
in (12) has rank $\bar{m}$, the implicit function theorem implies that there are functions
$x_1(\delta), x_2(\delta), \ldots, x_n(\delta)$ such that the same equations hold for all sufficiently small values
of $\delta$.
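Since $c_1 - \delta \leq c_1$, the choices $x_1(\delta), x_2(\delta), \ldots, x_n(\delta)$ satisfy all of the constraints from
the original optimization problem. And since, by assumption, $x_1(0) = x_1^*, x_2(0) = x_2^*, \ldots, x_n(0) = x_n^*$
maximizes the objective function subject to the constraints, it must
be that
$$\left. \frac{dF(x_1(\delta), x_2(\delta), \ldots, x_n(\delta))}{d\delta} \right|_{\delta=0} = \sum_{i=1}^n F_i(x_1^*, x_2^*, \ldots, x_n^*) x_i'(0) \leq 0. \tag{20}$$
In addition, the equations in (19) implicitly defining $x_1(\delta), x_2(\delta), \ldots, x_n(\delta)$ imply
$$\left. \frac{dG_1(x_1(\delta), x_2(\delta), \ldots, x_n(\delta))}{d\delta} \right|_{\delta=0} = \sum_{i=1}^n G_{1,i}(x_1^*, x_2^*, \ldots, x_n^*) x_i'(0) = -1 \tag{21}$$
and
$$\left. \frac{dG_j(x_1(\delta), x_2(\delta), \ldots, x_n(\delta))}{d\delta} \right|_{\delta=0} = \sum_{i=1}^n G_{j,i}(x_1^*, x_2^*, \ldots, x_n^*) x_i'(0) = 0 \tag{22}$$
for $j = 2, 3, \ldots, \bar{m}$.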
Putting all these results together, (13) implies
$$0 = F_i(x_1^*, x_2^*, \ldots, x_n^*) - \sum_{j=1}^m \lambda_j^* G_{j,i}(x_1^*, x_2^*, \ldots, x_n^*)$$
for all $i = 1, 2, \ldots, n$. Multiplying each of these equations by $x_i'(0)$ and summing over
all $i$ yields
$$0 = \sum_{i=1}^n F_i(x_1^*, x_2^*, \ldots, x_n^*) x_i'(0) - \sum_{i=1}^n \sum_{j=1}^m \lambda_j^* G_{j,i}(x_1^*, x_2^*, \ldots, x_n^*) x_i'(0),$$
or
$$0 = \sum_{i=1}^n F_i(x_1^*, x_2^*, \ldots, x_n^*) x_i'(0) - \sum_{j=1}^m \lambda_j^* \left[ \sum_{i=1}^n G_{j,i}(x_1^*, x_2^*, \ldots, x_n^*) x_i'(0) \right],$$
or, since $\lambda_j^* = 0$ for $j = \bar{m}+1, \bar{m}+2, \ldots, m$,
$$0 = \sum_{i=1}^n F_i(x_1^*, x_2^*, \ldots, x_n^*) x_i'(0) - \sum_{j=1}^{\bar{m}} \lambda_j^* \left[ \sum_{i=1}^n G_{j,i}(x_1^*, x_2^*, \ldots, x_n^*) x_i'(0) \right].$$
In light of (21) and (22), this last equation simplifies to
$$0 = \sum_{i=1}^n F_i(x_1^*, x_2^*, \ldots, x_n^*) x_i'(0) + \lambda_1^*.$$
And hence, in light of (20),
$$\lambda_1^* \geq 0.$$
Analogous arguments show that
$$\lambda_j^* \geq 0$$
for $j = 2, 3, \ldots, \bar{m}$ as well, completing the proof.
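Conditions (13)-(16) can be checked mechanically once a candidate solution and multipliers are in hand. The sketch below does this for a hypothetical problem with $n = 2$ choice variables and $m = 2$ constraints: maximize $-(x_1 - 2)^2 - (x_2 - 2)^2$ subject to $2 \geq x_1 + x_2$ and $3 \geq x_1$. The problem, the helper name `kuhn_tucker_holds`, and the tolerance are all assumptions made for illustration. Only the first constraint binds at the solution $(1, 1)$, so $\lambda_2^* = 0$.

```python
def grad_F(x):
    # F(x1, x2) = -(x1 - 2)^2 - (x2 - 2)^2
    return [-2.0 * (x[0] - 2.0), -2.0 * (x[1] - 2.0)]

def G(x):
    # G_1(x) = x1 + x2 and G_2(x) = x1
    return [x[0] + x[1], x[0]]

def jac_G(x):
    # Row j holds the partial derivatives G_{j,i} of constraint j.
    return [[1.0, 1.0], [1.0, 0.0]]

c = [2.0, 3.0]

def kuhn_tucker_holds(x, lam, tol=1e-9):
    gF, g, J = grad_F(x), G(x), jac_G(x)
    n, m = len(x), len(lam)
    cond13 = all(
        abs(gF[i] - sum(lam[j] * J[j][i] for j in range(m))) <= tol
        for i in range(n)
    )
    cond14 = all(c[j] - g[j] >= -tol for j in range(m))
    cond15 = all(lam[j] >= -tol for j in range(m))
    cond16 = all(abs(lam[j] * (c[j] - g[j])) <= tol for j in range(m))
    return cond13 and cond14 and cond15 and cond16

x_candidate = [1.0, 1.0]
lam_candidate = [2.0, 0.0]
print(kuhn_tucker_holds(x_candidate, lam_candidate))

# The unconstrained maximizer (2, 2) with zero multipliers fails (14).
print(kuhn_tucker_holds([2.0, 2.0], [0.0, 0.0]))
```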
4.2 The Envelope Theorem
Proving a generalized version of the envelope theorem requires no new ideas, just repeated
applications of the previous ones.
Consider, again, the constrained optimization problem with $n$ choice variables and $m$
constraints:
$$\max_{x_1, x_2, \ldots, x_n} F(x_1, x_2, \ldots, x_n) \text{ subject to } c_j \geq G_j(x_1, x_2, \ldots, x_n) \text{ for all } j = 1, 2, \ldots, m.$$
Now extend this problem by allowing the functions $F$ and $G_j$, $j = 1, 2, \ldots, m$, to depend
on a parameter $\theta \in \mathbb{R}$:
$$\max_{x_1, x_2, \ldots, x_n} F(x_1, x_2, \ldots, x_n, \theta) \text{ subject to } c_j \geq G_j(x_1, x_2, \ldots, x_n, \theta) \text{ for all } j = 1, 2, \ldots, m.$$
Just as before, define the maximum value function $V : \mathbb{R} \to \mathbb{R}$ as
$$V(\theta) = \max_{x_1, x_2, \ldots, x_n} F(x_1, x_2, \ldots, x_n, \theta) \text{ subject to } c_j \geq G_j(x_1, x_2, \ldots, x_n, \theta) \text{ for all } j = 1, 2, \ldots, m.$$
Note that $V$ is still a function of the single parameter $\theta$, since the $n$ choice variables are
"optimized out." Put another way, evaluating $V$ requires the same two-step procedure
as before:
First, given $\theta$, find the values $x_1^*(\theta), x_2^*(\theta), \ldots, x_n^*(\theta)$ that solve the constrained
optimization problem.
Second, substitute these values $x_1^*(\theta), x_2^*(\theta), \ldots, x_n^*(\theta)$, together with the given value
of $\theta$, into the objective function to obtain
$$V(\theta) = F(x_1^*(\theta), x_2^*(\theta), \ldots, x_n^*(\theta), \theta).$$
And just as before, the envelope theorem tells us that we can calculate the derivative $V'(\theta)$
of the maximum value function while ignoring the dependence of $x_1^*, x_2^*, \ldots, x_n^*$ and
$\lambda_1^*, \lambda_2^*, \ldots, \lambda_m^*$ on $\theta$, provided we invoke the complementary slackness conditions (16)
to add the sum of all of the multipliers times all of the constraints to the objective
function before differentiating through by $\theta$.
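Theorem (Envelope) Let $F$ and $G_j$, $j = 1, 2, \ldots, m$, be continuously differentiable
functions of $x_1, x_2, \ldots, x_n$ and $\theta$. For any value of $\theta$, let $x_1^*(\theta), x_2^*(\theta), \ldots, x_n^*(\theta)$ maximize
$F(x_1, x_2, \ldots, x_n, \theta)$ subject to $c_j \geq G_j(x_1, x_2, \ldots, x_n, \theta)$ for all $j = 1, 2, \ldots, m$, and let
$\lambda_1^*(\theta), \lambda_2^*(\theta), \ldots, \lambda_m^*(\theta)$ be the associated values of the Lagrange multipliers. Suppose,
further, that $x_1^*(\theta), x_2^*(\theta), \ldots, x_n^*(\theta)$ and $\lambda_1^*(\theta), \lambda_2^*(\theta), \ldots, \lambda_m^*(\theta)$ are all continuously
differentiable functions, and that the $\bar{m}(\theta) \times n$ matrix of derivatives
$$\begin{bmatrix}
G_{1,1}(x_1^*(\theta), x_2^*(\theta), \ldots, x_n^*(\theta), \theta) & \cdots & G_{1,n}(x_1^*(\theta), x_2^*(\theta), \ldots, x_n^*(\theta), \theta) \\
G_{2,1}(x_1^*(\theta), x_2^*(\theta), \ldots, x_n^*(\theta), \theta) & \cdots & G_{2,n}(x_1^*(\theta), x_2^*(\theta), \ldots, x_n^*(\theta), \theta) \\
\vdots & \ddots & \vdots \\
G_{\bar{m}(\theta),1}(x_1^*(\theta), x_2^*(\theta), \ldots, x_n^*(\theta), \theta) & \cdots & G_{\bar{m}(\theta),n}(x_1^*(\theta), x_2^*(\theta), \ldots, x_n^*(\theta), \theta)
\end{bmatrix}$$
associated with the $\bar{m}(\theta) \leq m$ binding constraints has rank $\bar{m}(\theta)$ for each value of $\theta$.
Then the maximum value function defined by
$$V(\theta) = \max_{x_1, x_2, \ldots, x_n} F(x_1, x_2, \ldots, x_n, \theta) \text{ subject to } c_j \geq G_j(x_1, x_2, \ldots, x_n, \theta) \text{ for all } j = 1, 2, \ldots, m$$
satisfies
$$V'(\theta) = F_{n+1}(x_1^*(\theta), x_2^*(\theta), \ldots, x_n^*(\theta), \theta) - \sum_{j=1}^m \lambda_j^*(\theta) G_{j,n+1}(x_1^*(\theta), x_2^*(\theta), \ldots, x_n^*(\theta), \theta). \tag{23}$$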
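Proof The Kuhn-Tucker theorem implies that for any given value of $\theta$,
$$F_i(x_1^*(\theta), x_2^*(\theta), \ldots, x_n^*(\theta), \theta) - \sum_{j=1}^m \lambda_j^*(\theta) G_{j,i}(x_1^*(\theta), x_2^*(\theta), \ldots, x_n^*(\theta), \theta) = 0 \tag{13}$$
for $i = 1, 2, \ldots, n$, and
$$\lambda_j^*(\theta) [c_j - G_j(x_1^*(\theta), x_2^*(\theta), \ldots, x_n^*(\theta), \theta)] = 0 \tag{16}$$
for $j = 1, 2, \ldots, m$ must hold.
In light of (16),
$$V(\theta) = F(x_1^*(\theta), x_2^*(\theta), \ldots, x_n^*(\theta), \theta) + \sum_{j=1}^m \lambda_j^*(\theta) [c_j - G_j(x_1^*(\theta), x_2^*(\theta), \ldots, x_n^*(\theta), \theta)].$$
Differentiating both sides of this expression by $\theta$ yields
$$\begin{aligned}
V'(\theta) = {} & \sum_{i=1}^n F_i(x_1^*(\theta), x_2^*(\theta), \ldots, x_n^*(\theta), \theta) x_i^{*\prime}(\theta) \\
& + F_{n+1}(x_1^*(\theta), x_2^*(\theta), \ldots, x_n^*(\theta), \theta) \\
& + \sum_{j=1}^m \lambda_j^{*\prime}(\theta) [c_j - G_j(x_1^*(\theta), x_2^*(\theta), \ldots, x_n^*(\theta), \theta)] \\
& - \sum_{i=1}^n \sum_{j=1}^m \lambda_j^*(\theta) G_{j,i}(x_1^*(\theta), x_2^*(\theta), \ldots, x_n^*(\theta), \theta) x_i^{*\prime}(\theta) \\
& - \sum_{j=1}^m \lambda_j^*(\theta) G_{j,n+1}(x_1^*(\theta), x_2^*(\theta), \ldots, x_n^*(\theta), \theta),
\end{aligned}$$
which shows that, in principle, we must take the dependence of $x_1^*(\theta), x_2^*(\theta), \ldots, x_n^*(\theta)$
and $\lambda_1^*(\theta), \lambda_2^*(\theta), \ldots, \lambda_m^*(\theta)$ on $\theta$ into account when calculating $V'(\theta)$.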
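Note, however, that (13) implies that the sums in the first and fourth lines of this last
expression together equal zero. Hence, to show that (23) holds, it only remains to
show that
$$\sum_{j=1}^m \lambda_j^{*\prime}(\theta) [c_j - G_j(x_1^*(\theta), x_2^*(\theta), \ldots, x_n^*(\theta), \theta)] = 0$$
and this is true if
$$\lambda_j^{*\prime}(\theta) [c_j - G_j(x_1^*(\theta), x_2^*(\theta), \ldots, x_n^*(\theta), \theta)] = 0 \tag{24}$$
for all $j = 1, 2, \ldots, m$.
Clearly, (24) holds for any $\theta$ such that constraint $j$ is binding.
For $\theta$ such that constraint $j$ is not binding, (16) implies that $\lambda_j^*(\theta) = 0$. Furthermore, by
the continuity of $G_j$ and $x_i^*(\theta)$, $i = 1, 2, \ldots, n$, if constraint $j$ does not bind at $\theta$, there
exists an $\varepsilon^* > 0$ such that constraint $j$ does not bind for all $\theta + \varepsilon$ with $\varepsilon^* > |\varepsilon|$. Hence,
$$\lambda_j^{*\prime}(\theta) = \lim_{\varepsilon \to 0} \frac{\lambda_j^*(\theta + \varepsilon) - \lambda_j^*(\theta)}{\varepsilon} = \lim_{\varepsilon \to 0} \frac{0}{\varepsilon} = 0,$$
and once again it becomes apparent that (24) must hold. Hence, (23) must hold as
well.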