Liedtke

  • 8/11/2019 Liedtke

    A Study of Convex Functions with Applications

    Matthew Liedtke

    May 14, 2012

    1. Introduction

    The study of concave up functions in calculus is often limited to describing one of the possible classifications of an increasing function. This is unfortunate because the simple generalization to a convex function greatly extends our scope for analysis. The theory developed in the study of convex functions, arising from intuitive geometrical observations, may be readily applied to topics in real analysis and economics.

    This paper begins with a rigorous study of convex functions with the goal of developing a theoretical framework to extend to further topics. Our applications to real analysis include right-endpoint approximation for integrals and pointwise convergence. In our discussion of economics, we will use the theory of consumer behavior under uncertainty and Jensen's Inequality to show the relation between convex functions and the modeling of risk-loving behavior. While a first course in real analysis is necessary to truly grasp these results, this paper has been written to be accessible to those who have completed a bridge to higher mathematics course.

    2. Convex functions and theory

    2.1. Concave Functions. The theoretical predecessor to convexity is a concave up function. As seen in Figure 1, the rate at which the concave up function $f$ is increasing is itself increasing. This may be formally expressed in terms of $f'$.

    Definition 2.1. Let $f$ be a differentiable function defined on an interval $I$.

    (a) The function $f$ is concave up on $I$ if $f'$ is increasing on $I$.
    (b) The function $f$ is concave down on $I$ if $f'$ is decreasing on $I$.

    Two important theoretical properties of a concave up function begin with geometric observations. Consider a function $f$ that is concave up on an interval $I$. Referring to Figure 1, we see that the line tangent to the graph at a point $(c, f(c))$ lies below all other points on the graph of $f$. Similarly, a secant line drawn between $(a, f(a))$ and $(b, f(b))$ lies above the graph.

    Definition 2.2. Let $f$ be a function defined on an interval $I$.

    (a) Suppose that $f$ is differentiable on $I$. The graph of $f$ lies above its tangent lines on $I$ if for each $c \in I$, the inequality $f(x) \ge T_c(x)$ is valid for all $x \in I$, where $T_c$ represents the function whose graph is the line tangent to the graph of $y = f(x)$ when $x = c$.

    (b) The graph of $f$ lies below its secant lines on $I$ if for each pair of distinct points $a, b \in I$ with $a < b$, the inequality $f(x) \le S_{ab}(x)$ is valid for all $x \in [a, b]$, where $S_{ab}$ represents the function whose graph is the line joining the points $(a, f(a))$ and $(b, f(b))$.


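    Figure 1. A concave up function. [Figure omitted: graph of $y = f(x)$ with the points $a < c < b$ marked on the $x$-axis and $f(a)$, $f(c)$, $f(b)$ marked on the $y$-axis.]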

    Our observation concerning secant lines is particularly useful as a tool in further proofs. Indeed, throughout this paper we will be returning to this result as a basis for proofs and applications. For this reason we will carefully prove this result below, while the complementary tangent line proof may be found in Gordon [1].

    Theorem 2.3. Let $f$ be a differentiable function defined on an interval $I$. The function $f$ is concave up on $I$ if and only if the graph of $f$ lies below its secant lines on $I$.

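    Proof. Suppose that the function $f$ is concave up on $I$, and let $[a, b] \subseteq I$. We define the secant line between the points $(a, f(a))$ and $(b, f(b))$ by

    \[ S_{ab}(x) = \frac{f(b) - f(a)}{b - a}(x - a) + f(a). \]

    Let $D : [a, b] \to \mathbb{R}$ be defined by

    \[ D(x) = \begin{cases} \dfrac{f(x) - f(a)}{x - a} & \text{for } a < x \le b; \\[1ex] f'(a) & \text{for } x = a. \end{cases} \]

    Since $f$ is continuous on $I$ and differentiable at $a$, it follows that $D$ is continuous on $[a, b]$. Furthermore, using the fact that $f$ is concave up, and thus $f'(x) - f'(a) \ge 0$ for $x \in (a, b)$, we have that $D'(x) \ge 0$ for $x \in (a, b)$. (Explicitly, $D'(x) = \bigl(f'(x) - D(x)\bigr)/(x - a)$, and the Mean Value Theorem gives $D(x) = f'(\xi)$ for some $\xi \in (a, x)$, so $D'(x) \ge 0$ because $f'$ is increasing.) The Monotonicity Theorem states that $D(x) \le D(b)$ for $x \in (a, b)$, and using the definition of $D$, we find that

    \[ \frac{f(x) - f(a)}{x - a} \le \frac{f(b) - f(a)}{b - a}, \]

    \[ f(x) \le \frac{f(b) - f(a)}{b - a}(x - a) + f(a). \]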

    Thus, the graph of $f$ lies below its secant lines on $I$.

    Now instead suppose that the graph of $f$ lies below its secant lines on $I$. Let $a, b \in I$ such that $a < b$. By the Mean Value Theorem, there exists a $c \in (a, b)$ such that $f'(c) = \frac{f(b) - f(a)}{b - a}$. Since the graph of $f$ lies below its secant lines on $I$, we have for $x \in [a, b]$ that

    \[ f(x) \le S_{ab}(x) = f'(c)(x - a) + f(a), \]

    and therefore,

    \[ f'(a) = \lim_{x \to a^+} \frac{f(x) - f(a)}{x - a} \le f'(c). \]

    It can be shown that $f'(c) \le f'(b)$; the proof is analogous. It follows that $f'(a) \le f'(b)$. Since $a < b$ were arbitrary points of $I$, the derivative $f'$ is increasing on $I$, and therefore $f$ is concave up on $I$. ¹

    2.2. Convex functions. While it is certainly convenient to work within the set of differentiable functions, this restriction will limit the scope of our analysis. For instance, there are functions that fail to be differentiable at some point of an interval and yet have a graph that lies below its secant lines on that interval. The absolute value function, which is not differentiable at zero, is an example of a function with this property. See Figure 2.

    Figure 2. The function $f(x) = |x|$ lies below its secant lines. [Figure omitted: graph of $f(x) = |x|$.]

    With the following definition of a convex function we will drop the requirement that $f$ is differentiable while maintaining the geometrical properties of a concave up function.

    Definition 2.4. Let $f$ be a function defined on an interval $I$. The function $f$ is convex on $I$ if for each interval $[c, d] \subseteq I$, the inequality

    \[ f\bigl((1 - t)c + td\bigr) \le (1 - t)f(c) + tf(d) \]

    is valid for $0 \le t \le 1$.

    As might be expected, this generalization comes with a trade-off. Proving convexity from the definition has two possible difficulties. First, we will show in Example 2.5 that even working with a simple function is in itself nontrivial. Second, recall that the motivating factor when introducing convexity was to further the reach of our analysis. As is often the case, it may be beneficial to have an equivalent definition. In Theorem 2.8 we will return to our original geometric observation and show that secant lines lie above the graph of a convex function.

    ¹ The details of this proof are referenced from Gordon [1].


    Example 2.5. We will show that $f(x) = x^2$ is convex from the definition. Let $[c, d] \subseteq I$ and $0 \le t \le 1$. We find that

    \[ \begin{aligned}
    f\bigl((1 - t)c + td\bigr) &= (1 - t)^2 c^2 + 2t(1 - t)cd + t^2 d^2 \\
    &= (1 - t)c^2 - t(1 - t)c^2 + 2t(1 - t)cd + t^2 d^2 \\
    &= (1 - t)c^2 + t(1 - t)c(2d - c) + t^2 d^2 - td^2 + td^2 \\
    &= (1 - t)c^2 - t(1 - t)(c^2 - 2cd + d^2) + td^2 \\
    &= (1 - t)c^2 - t(1 - t)(c - d)^2 + td^2 \\
    &\le (1 - t)c^2 + td^2.
    \end{aligned} \]

    Thus, the function $f(x) = x^2$ is convex.
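The algebra in Example 2.5 says that the gap between the two sides of the convexity inequality for $f(x) = x^2$ is exactly $t(1 - t)(c - d)^2$. The following Python sketch is only a numerical sanity check of that identity, not a proof; the helper names `convexity_gap` and `check_example` are illustrative and do not come from the paper.

```python
from fractions import Fraction

def f(x):
    return x * x

def convexity_gap(c, d, t):
    """Right side minus left side of the convexity inequality for f."""
    return (1 - t) * f(c) + t * f(d) - f((1 - t) * c + t * d)

def check_example(c, d, steps=10):
    """True if the gap equals t(1-t)(c-d)^2 exactly for sampled t in [0, 1]."""
    for k in range(steps + 1):
        t = Fraction(k, steps)          # exact rational arithmetic
        gap = convexity_gap(c, d, t)
        if gap != t * (1 - t) * (c - d) ** 2 or gap < 0:
            return False
    return True

print(check_example(-1, 3))
```

Exact rational arithmetic is used so that the comparison is an equality test rather than a floating-point tolerance check.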

    We begin the proof of Theorem 2.8 with two lemmas.

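    Lemma 2.6. Let $f$ be defined on an interval $[c, d]$. If $0 \le t \le 1$, then $c \le (1 - t)c + td \le d$.

    Proof. Simple algebra reveals that

    \[ \begin{aligned}
    0 &\le t \le 1 \\
    0 &\le t(d - c) \le d - c \\
    c &\le c + t(d - c) \le d \\
    c &\le (1 - t)c + td \le d,
    \end{aligned} \]

    which is the desired result.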

    Lemma 2.7. Let $S_{cd}$ be the function that represents the line through the points $(c, f(c))$ and $(d, f(d))$ and suppose that $0 \le t \le 1$. Then

    \[ S_{cd}\bigl((1 - t)c + td\bigr) = (1 - t)f(c) + tf(d). \]

    Proof. Note that the function $S_{cd}$ is defined by

    \[ S_{cd}(x) = \frac{f(d) - f(c)}{d - c}(x - c) + f(c). \]

    For $0 \le t \le 1$, we find that

    \[ \begin{aligned}
    S_{cd}\bigl((1 - t)c + td\bigr) &= \frac{f(d) - f(c)}{d - c}\bigl((1 - t)c + td - c\bigr) + f(c) \\
    &= \frac{f(d) - f(c)}{d - c}\,(t)(d - c) + f(c) \\
    &= t\bigl(f(d) - f(c)\bigr) + f(c) \\
    &= (1 - t)f(c) + tf(d).
    \end{aligned} \]

    Theorem 2.8. Let $f$ be a function defined on an interval $I$. Then $f$ is convex on $I$ if and only if the graph of $f$ lies below its secant lines on $I$.


    Proof. Let $f$ be defined on an interval $I$. Suppose that the graph of $f$ lies below its secant lines on $I$, and let $c, d \in I$ with $c < d$. By hypothesis the inequality $f(x) \le S_{cd}(x)$ is valid for all $x \in [c, d]$, and we note that the point $x = (1 - t)c + td$ for $0 \le t \le 1$ is in the interval $[c, d]$ by Lemma 2.6. It follows from Lemma 2.7 that

    \[ f(x) = f\bigl((1 - t)c + td\bigr) \le S_{cd}\bigl((1 - t)c + td\bigr) = (1 - t)f(c) + tf(d). \]

    Hence, the function $f$ is convex by definition.

    Now suppose $f$ is convex on $I$, and let $c, d \in I$ with $c < d$. For each $x \in [c, d]$ there exists a $t \in [0, 1]$ such that $x = (1 - t)c + td$, namely $t = \frac{x - c}{d - c}$. From the definition of convexity and Lemma 2.7 we have that

    \[ f(x) = f\bigl((1 - t)c + td\bigr) \le (1 - t)f(c) + tf(d) = S_{cd}\bigl((1 - t)c + td\bigr). \]

    This shows that the function $f$ lies below its secant lines on $I$.
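As a small illustration of Theorem 2.8 and Lemma 2.7, the sketch below builds the secant line $S_{cd}$ and compares it with $f$ at sampled points of $[c, d]$. It assumes integer or rational endpoints so that the arithmetic is exact, and the function names `secant` and `lies_below_secant` are hypothetical helpers introduced here.

```python
from fractions import Fraction

def secant(f, c, d):
    """Return S_cd, the line through (c, f(c)) and (d, f(d)), assuming c < d."""
    slope = Fraction(f(d) - f(c), d - c)
    return lambda x: slope * (x - c) + f(c)

def lies_below_secant(f, c, d, steps=20):
    """Check f(x) <= S_cd(x) at the sample points x = (1 - t)c + td."""
    s = secant(f, c, d)
    for k in range(steps + 1):
        t = Fraction(k, steps)
        x = (1 - t) * c + t * d
        # Lemma 2.7: the secant value is the convex combination of f(c) and f(d).
        assert s(x) == (1 - t) * f(c) + t * f(d)
        if f(x) > s(x):
            return False
    return True

print(lies_below_secant(abs, -2, 3))                   # absolute value function
print(lies_below_secant(lambda x: -x * x, -1, 1))      # a concave down function
```

A finite sample can only refute the secant-line property, never establish it, so a `True` result here is evidence rather than proof.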

    2.3. Continuity. With Theorem 2.8 we have sufficient theoretical tools to answer an important question: how well behaved are convex functions? That is, what can we say about continuity and differentiability? Recall that a function may not be continuous if it has a jump discontinuity, a removable discontinuity, or is wildly oscillating about a point. The fact that the secant lines must lie above the graph of a convex function should be convincing evidence that none of these cases are possible. See Figure 3 for an illustration of this argument with a jump discontinuity.

    Figure 3. The function $f$ with a jump discontinuity. [Figure omitted: graph of $f(x)$.]

    In proving this result, we will make use of Lemma 2.9, a proof of which may be found in Gordon [1].

    Lemma 2.9. Let $f$ be a convex function defined on an interval $I$. If $a$, $b$, and $c$ are points in $I$ such that $a < b < c$, then

    \[ \frac{f(b) - f(a)}{b - a} \le \frac{f(c) - f(a)}{c - a} \le \frac{f(c) - f(b)}{c - b}. \]


    The inequality in Lemma 2.9 compares the slopes of secant lines along the graph of $f$. This allows us to visualize lower and upper bounds for the slope of a line segment by choosing appropriate secant lines. This idea will be illustrated in the proof of the following theorem.
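To make the slope comparison in Lemma 2.9 concrete, here is a minimal Python sketch that computes the three secant slopes for chosen points $a < b < c$. The helper names `slope` and `three_slopes` and the sample points are assumptions made for illustration only.

```python
from fractions import Fraction

def slope(f, p, q):
    """Slope of the secant line through (p, f(p)) and (q, f(q))."""
    return Fraction(f(q) - f(p), q - p)

def three_slopes(f, a, b, c):
    """Return the slopes over [a, b], [a, c], and [b, c] for a < b < c."""
    return slope(f, a, b), slope(f, a, c), slope(f, b, c)

s_ab, s_ac, s_bc = three_slopes(lambda x: x * x, 1, 2, 4)
print(s_ab, s_ac, s_bc)   # 3 5 6
```

For the convex function $x^2$ with $a = 1$, $b = 2$, $c = 4$ the slopes are $3 \le 5 \le 6$, in the order the lemma predicts.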

    Theorem 2.10. If $f$ is a convex function defined on an open interval $(a, b)$, then $f$ is continuous on $(a, b)$.

    Proof. Suppose $f$ is convex on $(a, b)$, and let $[c, d] \subseteq (a, b)$. Choose $c_1$ and $d_1$ such that

    \[ a < c_1 < c < d < d_1 < b. \]

    If $x, y \in [c, d]$ with $x < y$, we have from Lemma 2.9 (see Figure 4) that

    \[ \frac{f(y) - f(x)}{y - x} \le \frac{f(d) - f(y)}{d - y} \le \frac{f(d_1) - f(d)}{d_1 - d} \]

    and

    \[ \frac{f(y) - f(x)}{y - x} \ge \frac{f(x) - f(c)}{x - c} \ge \frac{f(c) - f(c_1)}{c - c_1}, \]

    where the middle quotient is simply omitted when $y = d$ or $x = c$. This shows that the set

    \[ \left\{ \frac{f(y) - f(x)}{y - x} : c \le x < y \le d \right\} \]

    is bounded in absolute value by some $M > 0$. It follows that $|f(y) - f(x)| \le M|y - x|$, and therefore $f$ is uniformly continuous on $[c, d]$. Recalling that uniform continuity implies continuity, we have shown that $f$ is continuous on $[c, d]$. Since the interval $[c, d]$ was arbitrary, $f$ is continuous on $(a, b)$. ²

    Figure 4. Upper and lower bounds for Theorem 2.10. [Figure omitted: graph of $y = f(x)$ with the points $a < c_1 < c < x < y < d < d_1 < b$ marked on the $x$-axis.]

    The following corollary follows immediately from the fact that any closed interval may be contained in an open interval.

    ² The details of this proof are referenced from Gordon [1].


    Corollary 2.11. If $f$ is a convex function defined on $\mathbb{R}$, then $f$ is continuous on $\mathbb{R}$.

    2.4. Differentiability. We now consider the differentiability of a convex function. Remember that we introduced convexity in order to remove the differentiability constraint from a concave up function. Since a convex function is still geometrically related to a concave up function, we may suspect they still share similar properties. The derivative of a concave up function is increasing. However, we cannot readily draw an analogous conclusion for convex functions because we are not guaranteed that the derivative of a convex function exists at a point. We need an additional definition to expand the concept.

    Definition 2.12. Let $f$ be defined on an interval $I$ and let $c \in I$. The left and right derivatives of $f$ at $c$ are denoted and defined by

    \[ f'_-(c) = \lim_{x \to c^-} \frac{f(x) - f(c)}{x - c} \]

    and

    \[ f'_+(c) = \lim_{x \to c^+} \frac{f(x) - f(c)}{x - c} \]

    respectively, provided the appropriate limits exist. (If $c$ is an endpoint of $I$, only one of these expressions is defined.) To say that $f$ has one-sided derivatives at $c$ means that both $f'_-(c)$ and $f'_+(c)$ exist.

    The concept of left and right derivatives is similar to that of left and right limits; a function $f$ is differentiable at $c$ if and only if both one-sided derivatives exist and $f'_-(c) = f'_+(c)$. The following theorem provides the analogous result for convex functions. A proof may be found in Gordon [1].

    Theorem 2.13. If $f$ is a convex function defined on an open interval $(a, b)$, then $f$ has one-sided derivatives at each point of $(a, b)$, $f'_-(x) \le f'_+(x)$ for all $x \in (a, b)$, and the functions $f'_-$ and $f'_+$ are increasing on $(a, b)$. Furthermore, if $f$ is continuous on $[a, b]$, then these conclusions may be extended to the closed interval $[a, b]$.

    Thus, the difference between a convex and concave up function in terms of differentiability is related to the inequality of the left and right derivatives. The following example continues our discussion of this result.

    Example 2.14. Let $f$ be a continuous function defined on $[a, b]$, let $c \in (a, b)$, and suppose that $f$ is convex on $[a, c]$ and $[c, b]$. We will construct a function that shows that $f$ may not be convex on $[a, b]$. The clear point of interest is where the two intervals intersect at $c$. If we can piecewise define a function $f$ such that $f'_-(c) > f'_+(c)$, then we may proceed using the contrapositive of Theorem 2.13. Suppose $a > 0$, and define $f : [a, b] \to \mathbb{R}$, shown in Figure 5, by

    \[ f(x) = \begin{cases} x^2 & \text{for } a \le x \le c; \\ (x - c)^2 + c^2 & \text{for } c \le x \le b. \end{cases} \]

    We note that $f$ is continuous and that $f'_-(c) = 2c > 0 = f'_+(c)$. By Theorem 2.13, the function $f$ is not convex on $[a, b]$.
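The failure of convexity in Example 2.14 can also be seen directly from the secant line over $[a, b]$. The sketch below uses the illustrative values $a = 1$, $c = 2$, $b = 3$ (these numbers are an assumption chosen so that $c$ is the midpoint of $[a, b]$; the paper leaves them general).

```python
def f(x, c=2):
    """Piecewise function from Example 2.14 with the illustrative choice c = 2."""
    if x <= c:
        return x * x
    return (x - c) ** 2 + c * c

a, c, b = 1, 2, 3                       # hypothetical values with 0 < a < c < b
midpoint_value = f(c)                   # f at the midpoint of [a, b]
secant_value = (f(a) + f(b)) / 2        # secant line S_ab at the midpoint
print(midpoint_value, secant_value)     # 4 3.0
```

Since $f(2) = 4$ exceeds the secant value $3$, the graph rises above its secant line, which agrees with the conclusion drawn from the one-sided derivatives.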

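    Figure 5. The function $f$ from Example 2.14. [Figure omitted: graph of $y = f(x)$ with the horizontal line $y = c^2$ and the points $a < c < b$ marked on the $x$-axis.]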

    In the previous example, the function $f$ fails to be convex on the interval $[a, b]$ because the condition $f'_-(x) \le f'_+(x)$ is not valid at the point $c$. The following theorem explores a special case in which $f$ is defined by piecewise linear segments. Since the slope of each segment is constant, we will satisfy this inequality if the slope of the first line is less than the slope of the second.

    Theorem 2.15. Let $f$ be a continuous function defined on $[a, b]$, let $c \in (a, b)$, and suppose that $f$ is convex on $[a, c]$ and $[c, b]$. If $f$ is linear on each of the intervals $[a, c]$ and $[c, b]$ and the slope of $f$ on $[a, c]$ is less than the slope of $f$ on $[c, b]$, then $f$ is convex on $[a, b]$.

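    Proof. Let $f_1$ and $f_2$ denote the line segments joining the points $(a, f(a))$, $(c, f(c))$ and $(c, f(c))$, $(b, f(b))$ respectively (see Figure 6). We will first consider the case that $a < x \le c$ and show

    \[ S_{ab}(x) = \frac{f(b) - f(a)}{b - a}(x - a) + f(a) \ge \frac{f(c) - f(a)}{c - a}(x - a) + f(a) = f_1(x). \]

    By hypothesis, the slope of $f_2$ is greater than the slope of $f_1$. This means

    \[ \frac{f(b) - f(c)}{b - c} - \frac{f(c) - f(a)}{c - a} \ge 0. \]

    Noting that $\frac{b - c}{b - a} > 0$, we find

    \[ \begin{aligned}
    0 &\le \frac{b - c}{b - a}\left[\frac{f(b) - f(c)}{b - c} - \frac{f(c) - f(a)}{c - a}\right] \\
    &= \frac{c - b}{b - a}\left[\frac{f(b) - f(c)}{c - b} + \frac{f(c) - f(a)}{c - a}\right] \\
    &= \frac{f(b) - f(c)}{b - a} + \frac{c - b}{b - a} \cdot \frac{f(c) - f(a)}{c - a} \\
    &= \frac{f(b) - f(c)}{b - a} + \left[\frac{1}{b - a} - \frac{1}{c - a}\right]\bigl(f(c) - f(a)\bigr) \\
    &= \frac{f(b) - f(c) + f(c) - f(a)}{b - a} - \frac{f(c) - f(a)}{c - a} \\
    &= \frac{f(b) - f(a)}{b - a} - \frac{f(c) - f(a)}{c - a}.
    \end{aligned} \]

    Multiplying by $x - a > 0$ and then adding and subtracting $f(a)$ gives

    \[ 0 \le \left[\frac{f(b) - f(a)}{b - a}(x - a) + f(a)\right] - \left[\frac{f(c) - f(a)}{c - a}(x - a) + f(a)\right]. \]

    It follows that $S_{ab}(x) \ge f_1(x)$. The case in which we consider $c < x < b$ and show $S_{ab}(x) \ge f_2(x)$ is analogous. The graph of $f$ lies below the secant line $S_{ab}$, and therefore $f$ is convex on $[a, b]$.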

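    Figure 6. The function $f$ as defined in Theorem 2.15. [Figure omitted: the segments $f_1$ and $f_2$ and the secant line $S_{ab}$, with $a < c < b$ marked on the $x$-axis and $f(a)$, $f(c)$, $f(b)$ on the vertical axis.]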

    Here we take an aside to prove an important result: the number of points at which a convex function defined on an open interval is not differentiable is countable. We will then return to our previous theorem in order to provide a worst case example of a convex function that is not differentiable at a countably infinite number of points.

    Theorem 2.16. If $f$ is a convex function defined on an open interval $(a, b)$, then the set of points in $(a, b)$ at which $f$ is not differentiable is countable.

    Proof. The function $f'_+$ is increasing by Theorem 2.13. It can be shown that since $f'_+$ is monotone, $f'_+$ has a countable number of discontinuities. We will show that $f$ is differentiable at every point where $f'_+$ is continuous.

    Let $\varepsilon > 0$, and suppose $f'_+$ is continuous at $c$. There exists a $\delta > 0$ such that $|f'_+(x) - f'_+(c)| < \varepsilon$ for all $x \in (a, b)$ that satisfy $|x - c| < \delta$. Suppose $c - \delta < x < c$. By Theorem 2.13, we find $f'_+(x) \le f'_-(c)$. Using this fact, we have

    \[ f'_+(c) - \varepsilon < f'_+(x) \le f'_-(c) \le f'_+(c) < f'_+(c) + \varepsilon. \]

    Since $\varepsilon$ was arbitrary, it follows that $f'_-(c) = f'_+(c)$. Thus, $f$ is differentiable at $c$. ³

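    Example 2.17. For each positive integer $n$, let $x_n = \frac{2^n - 1}{2^n}$ and $y_n = \frac{n}{n + 1}$. Let $f$ be the piecewise linear function that joins the points $(x_n, y_n)$. The function $f$ is convex on the interval $[0.5, 1)$ and the set of points at which $f$ is not differentiable is countably infinite. [1]

    First note that the sequence $\{x_n\}$ increases from $x_1 = 0.5$ toward $1$, so the segments joining consecutive points cover the interval $[0.5, 1)$. Our goal will be to show that the slope of any linear segment of $f$ is at most the slope of the subsequent segment. This amounts to proving that the following inequality is true for all positive integers $n \ge 2$:

    \[ \frac{y_{n+1} - y_n}{x_{n+1} - x_n} \ge \frac{y_n - y_{n-1}}{x_n - x_{n-1}}. \tag{1} \]

    This will allow us to use Theorem 2.15 to prove that $f$ is convex on $[0.5, 1)$. However, before we proceed it would be prudent to simplify inequality (1) to a more workable form. Substituting the definitions of the sequences $\{x_n\}$ and $\{y_n\}$, inequality (1) becomes

    \[ \frac{\dfrac{n + 1}{n + 2} - \dfrac{n}{n + 1}}{\dfrac{2^{n+1} - 1}{2^{n+1}} - \dfrac{2^n - 1}{2^n}} \ge \frac{\dfrac{n}{n + 1} - \dfrac{n - 1}{n}}{\dfrac{2^n - 1}{2^n} - \dfrac{2^{n-1} - 1}{2^{n-1}}}. \tag{2} \]

    Consider the left hand side of inequality (2). Simple algebra reveals that

    \[ \begin{aligned}
    \frac{n + 1}{n + 2} - \frac{n}{n + 1} &= \frac{(n + 1)(n + 1) - n(n + 2)}{(n + 2)(n + 1)} \\
    &= \frac{n^2 + 2n + 1 - n^2 - 2n}{(n + 2)(n + 1)} \\
    &= \frac{1}{(n + 2)(n + 1)}
    \end{aligned} \]

    and

    \[ \begin{aligned}
    \frac{2^{n+1} - 1}{2^{n+1}} - \frac{2^n - 1}{2^n} &= \frac{2^n(2^{n+1} - 1) - (2^n - 1)(2^{n+1})}{(2^{n+1})(2^n)} \\
    &= \frac{-2^n + 2^{n+1}}{(2^{n+1})(2^n)} \\
    &= \frac{1}{2^{n+1}}.
    \end{aligned} \]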

    ³ The details of this proof are referenced from Gordon [1].


    By performing similar operations on the right hand side, it can be shown that inequality (2) reduces to

    \[ \frac{2^{n+1}}{(n + 2)(n + 1)} \ge \frac{2^n}{(n + 1)(n)}. \]

    Let $n \ge 2$. We find that

    \[ \begin{aligned}
    n &\ge 2 \\
    \frac{2}{n + 2} &\ge \frac{1}{n} \\
    \frac{2^{n+1}}{(n + 2)(n + 1)} &\ge \frac{2^n}{(n + 1)(n)}.
    \end{aligned} \]

    Hence, inequality (2) is true for all $n \ge 2$. By Theorem 2.15 it follows that the function $f$ is convex on $[0.5, 1)$.

    Furthermore, since $f$ is convex on $[0.5, 1)$, Theorem 2.16 states that the set of points in $(0.5, 1)$ at which $f$ is not differentiable is countable. Extending this observation to the interval $[0.5, 1)$ is valid because we are adding at most one more point at which $f$ is not differentiable. Thus, the set of points in $[0.5, 1)$ at which $f$ is not differentiable is countable. The function $f$ is illustrated in Figure 7. While $f$ may appear smooth in the figure, it should be noted that the graph of $f$ consists of a countably infinite number of line segments.
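The slope comparison in Example 2.17 is easy to check with exact rational arithmetic. The following Python sketch computes the slope of each segment from the definitions of $x_n$ and $y_n$; the function names `segment_slope` and `slopes_increase` are illustrative helpers and are not part of the original example.

```python
from fractions import Fraction

def x(n):
    return Fraction(2**n - 1, 2**n)

def y(n):
    return Fraction(n, n + 1)

def segment_slope(n):
    """Slope of the segment joining (x_n, y_n) and (x_{n+1}, y_{n+1})."""
    return (y(n + 1) - y(n)) / (x(n + 1) - x(n))

def slopes_increase(up_to):
    """Check inequality (1) for n = 2, ..., up_to."""
    return all(segment_slope(n) >= segment_slope(n - 1)
               for n in range(2, up_to + 1))

print([str(segment_slope(n)) for n in range(1, 5)])
print(slopes_increase(30))
```

The computed slopes agree with the closed form $\frac{2^{n+1}}{(n+2)(n+1)}$ derived above. Note that the first two slopes are both $\frac{2}{3}$, so inequality (1) holds with equality at $n = 2$ and strictly for $n \ge 3$; the set of corner points remains countably infinite.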

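    Figure 7. The function $f$ as defined in Example 2.17. [Figure omitted: piecewise linear graph of $f(x)$ through the points $(x_n, y_n)$, with $x_1 = \frac{1}{2}$, $x_2 = \frac{3}{4}$, $x_3$, $x_4$, ... approaching $1$ on the $x$-axis and $y_1 = \frac{1}{2}$, $y_2 = \frac{2}{3}$, $y_3$, $y_4$, ... approaching $1$ on the vertical axis.]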


    3. Alternative criteria for convexity

    Before proceeding to applications of convexity in real analysis and economics we will first derive a useful result applicable to continuous functions. Our motivating factor is clear from Example 2.5; proving convexity from the definition may be very cumbersome. Any method to streamline this will make a welcome addition to our criteria for convexity.

    Theorem 3.1. If $f$ is continuous on $(a, b)$ and

    \[ f\left(\frac{x + y}{2}\right) \le \frac{1}{2}f(x) + \frac{1}{2}f(y) \]

    for all $x, y \in (a, b)$, then $f$ is convex on $(a, b)$.

    This is a rather remarkable result for two reasons. First, Theorem 3.1 simplifies the definition of convexity so that we need only consider the case in which $t = \frac{1}{2}$ for a large class of functions. This result also provides a means to approach unconventional functions that we know are continuous, an example of which will be shown following the proof.

    This theorem is also notable for its long and possibly unfamiliar proof technique. We begin by leveraging our hypothesis to show that the number of terms in the numerator, and the corresponding integer divisor, can be extended to any power of two, and ultimately to any $n \in \mathbb{Z}^+$ (Lemma 3.3 and Lemma 3.5, respectively). Using these results, we can show the function satisfies the definition of convexity for rational values of $t$. Finally, sequences of rational numbers will be used to approximate irrational values of $t$.

    Proof of Theorem 3.1. Each of the following lemmas will be preceded by an example in which we work through a specific case. While our algebra is not necessarily difficult, it is quite possible to be overwhelmed in the general proof.

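    Example 3.2. Suppose that $f$ is continuous on $(a, b)$ and

    \[ f\left(\frac{x + y}{2}\right) \le \frac{1}{2}f(x) + \frac{1}{2}f(y) \]

    for all $x, y \in (a, b)$. We will illustrate how the number of terms in Theorem 3.1 may be extended from two to four. By applying our hypothesis twice, we see that

    \[ \begin{aligned}
    f\left(\frac{x_1 + x_2 + x_3 + x_4}{4}\right) &= f\left(\frac{\frac{x_1 + x_2}{2} + \frac{x_3 + x_4}{2}}{2}\right) \\
    &\le \frac{1}{2}f\left(\frac{x_1 + x_2}{2}\right) + \frac{1}{2}f\left(\frac{x_3 + x_4}{2}\right) \\
    &\le \frac{1}{2}\left[\frac{1}{2}f(x_1) + \frac{1}{2}f(x_2) + \frac{1}{2}f(x_3) + \frac{1}{2}f(x_4)\right] \\
    &= \frac{1}{4}f(x_1) + \frac{1}{4}f(x_2) + \frac{1}{4}f(x_3) + \frac{1}{4}f(x_4).
    \end{aligned} \]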


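    Lemma 3.3. If $f$ is defined on $(a, b)$ and

    \[ f\left(\frac{x + y}{2}\right) \le \frac{1}{2}f(x) + \frac{1}{2}f(y) \]

    for all $x, y \in (a, b)$, then for $n \in \mathbb{Z}^+$ and points $x_1, \ldots, x_{2^n} \in (a, b)$, $f$ has the property that

    \[ f\left(\frac{1}{2^n}\sum_{i=1}^{2^n} x_i\right) \le \frac{1}{2^n}\sum_{i=1}^{2^n} f(x_i). \]

    Proof. We will use the Principle of Mathematical Induction. When $n = 1$, the inequality is valid by hypothesis. Suppose that the inequality is true for some positive integer $k$. We then have that

    \[ \begin{aligned}
    f\left(\frac{x_1 + x_2 + \cdots + x_{2^k} + \cdots + x_{2^{k+1}}}{2^{k+1}}\right) &= f\left(\frac{\frac{x_1 + x_2 + \cdots + x_{2^k}}{2^k} + \frac{x_{2^k+1} + x_{2^k+2} + \cdots + x_{2^{k+1}}}{2^k}}{2}\right) \\
    &\le \frac{1}{2}f\left(\frac{x_1 + x_2 + \cdots + x_{2^k}}{2^k}\right) + \frac{1}{2}f\left(\frac{x_{2^k+1} + x_{2^k+2} + \cdots + x_{2^{k+1}}}{2^k}\right) \\
    &\le \frac{1}{2}\left[\frac{1}{2^k}\sum_{i=1}^{2^k} f(x_i)\right] + \frac{1}{2}\left[\frac{1}{2^k}\sum_{i=2^k+1}^{2^{k+1}} f(x_i)\right] \\
    &= \frac{1}{2^{k+1}}\sum_{i=1}^{2^{k+1}} f(x_i).
    \end{aligned} \]

    Thus, the inequality is true for $k + 1$. It follows from the Principle of Mathematical Induction that the inequality

    \[ f\left(\frac{1}{2^n}\sum_{i=1}^{2^n} x_i\right) \le \frac{1}{2^n}\sum_{i=1}^{2^n} f(x_i) \]

    holds for all positive integers $n$.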

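    Our next goal is to show that the number of terms in Theorem 3.1 may be extended to any $m \in \mathbb{Z}^+$. Note that for $m \ge 2$ we have $2^n \le m < 2^{n+1}$ for some $n \in \mathbb{Z}^+$. We will work through an example where $m = 6$.

    Example 3.4. We wish to show that

    \[ f\left(\frac{1}{6}\sum_{i=1}^{6} x_i\right) \le \frac{1}{6}\sum_{i=1}^{6} f(x_i). \]

    Let $\bar{x}$ equal the average of $x_1, x_2, \ldots, x_6$; that is, let $\bar{x} = \frac{1}{6}\sum_{i=1}^{6} x_i$. By adding $\bar{x}$ to the average of the $x_i$'s we find that the resulting average remains unchanged. Therefore $\frac{\sum_{i=1}^{6} x_i + \bar{x} + \bar{x}}{8} = \bar{x}$. The idea is to continue adding $\bar{x}$ until we are considering a number of terms that is a power of two. This allows us to use our previous result. We find that

    \[ \begin{aligned}
    f(\bar{x}) = f\left(\frac{\sum_{i=1}^{6} x_i + \bar{x} + \bar{x}}{8}\right) &\le \frac{\sum_{i=1}^{6} f(x_i) + f(\bar{x}) + f(\bar{x})}{8} = \frac{1}{8}\sum_{i=1}^{6} f(x_i) + \frac{2}{8}f(\bar{x}) \\
    \frac{3}{4}f(\bar{x}) &\le \frac{1}{8}\sum_{i=1}^{6} f(x_i) \\
    f(\bar{x}) &\le \frac{1}{6}\sum_{i=1}^{6} f(x_i) \\
    f\left(\frac{1}{6}\sum_{i=1}^{6} x_i\right) &\le \frac{1}{6}\sum_{i=1}^{6} f(x_i).
    \end{aligned} \]

    Having reached the desired result, we will now prove the general case.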

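    Lemma 3.5. If $f$ is defined on $(a, b)$ and

    \[ f\left(\frac{x + y}{2}\right) \le \frac{1}{2}f(x) + \frac{1}{2}f(y) \]

    for all $x, y \in (a, b)$, then for $m \in \mathbb{Z}^+$ and points $x_1, \ldots, x_m \in (a, b)$, the function $f$ has the property that

    \[ f\left(\frac{1}{m}\sum_{i=1}^{m} x_i\right) \le \frac{1}{m}\sum_{i=1}^{m} f(x_i). \]

    Proof. Note that we have already covered the case where $m$ is a power of two. Let $\bar{x} = \frac{1}{m}\sum_{i=1}^{m} x_i$ and choose $n \in \mathbb{Z}^+$ such that $2^n < m < 2^{n+1}$. Let $p = 2^{n+1} - m$. Since $m + p$ is a power of two, we have by Lemma 3.3 that

    \[ \begin{aligned}
    f(\bar{x}) = f\left(\frac{\sum_{i=1}^{m} x_i + \sum_{i=1}^{p} \bar{x}}{m + p}\right) &\le \frac{\sum_{i=1}^{m} f(x_i) + \sum_{i=1}^{p} f(\bar{x})}{m + p} = \frac{1}{m + p}\sum_{i=1}^{m} f(x_i) + \frac{p}{m + p}f(\bar{x}) \\
    \frac{m}{m + p}f(\bar{x}) &\le \frac{1}{m + p}\sum_{i=1}^{m} f(x_i) \\
    f(\bar{x}) &\le \frac{1}{m}\sum_{i=1}^{m} f(x_i) \\
    f\left(\frac{1}{m}\sum_{i=1}^{m} x_i\right) &\le \frac{1}{m}\sum_{i=1}^{m} f(x_i).
    \end{aligned} \]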
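The padding step used in Example 3.4 and Lemma 3.5 can be sketched in a few lines of Python: copies of the mean are appended until the number of terms is a power of two, which leaves the average unchanged. The helper names `pad_to_power_of_two` and `jensen_gap`, and the six sample points, are assumptions made purely for illustration.

```python
from fractions import Fraction

def pad_to_power_of_two(xs):
    """Append copies of the mean until the length is a power of two."""
    m = len(xs)
    mean = Fraction(sum(xs), m)
    size = 1
    while size < m:
        size *= 2
    return list(xs) + [mean] * (size - m), mean

def jensen_gap(f, xs):
    """Average of f(x_i) minus f(average of x_i); nonnegative for convex f."""
    m = len(xs)
    return Fraction(sum(f(v) for v in xs), m) - f(Fraction(sum(xs), m))

xs = [1, 2, 3, 4, 5, 9]                  # six sample points, as in Example 3.4
padded, mean = pad_to_power_of_two(xs)
print(len(padded), mean, Fraction(sum(padded), len(padded)))   # 8 4 4
print(jensen_gap(lambda v: v * v, xs))                         # 20/3
```

Here six points are padded to eight, the mean stays at $4$, and for the convex function $x^2$ the gap in the inequality of Lemma 3.5 is the nonnegative number $\frac{20}{3}$.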

    We now have the computational tools to prove Theorem 3.1.

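    Theorem 3.1. If $f$ is continuous on $(a, b)$ and

    \[ f\left(\frac{x + y}{2}\right) \le \frac{1}{2}f(x) + \frac{1}{2}f(y) \]

    for all $x, y \in (a, b)$, then $f$ is convex on $(a, b)$.

    Proof. We will first prove the case where $t = \frac{m}{n}$ for $m, n \in \mathbb{Z}^+$ with $m \le n$. By Lemma 3.5 we have that

    \[ \begin{aligned}
    f\bigl((1 - t)x + ty\bigr) &= f\left(\left(1 - \frac{m}{n}\right)x + \frac{m}{n}y\right) \\
    &= f\left(\frac{(n - m)x + my}{n}\right) \\
    &\le \frac{1}{n}\left[\sum_{i=1}^{n-m} f(x) + \sum_{i=1}^{m} f(y)\right] \\
    &= \frac{n - m}{n}f(x) + \frac{m}{n}f(y) \\
    &= (1 - t)f(x) + tf(y).
    \end{aligned} \]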

    Now consider the general case where $t \in [0, 1]$. Let $t \in [0, 1]$, and let $\{t_n\}$ be a sequence of rational numbers in $[0, 1]$ that converges to $t$. By our earlier result we find that

    \[ f\bigl((1 - t_n)x + t_n y\bigr) \le (1 - t_n)f(x) + t_n f(y) \]

    for every positive integer $n$. Since $f$ is continuous, we have

    \[ \lim_{n \to \infty} f\bigl((1 - t_n)x + t_n y\bigr) = f\bigl((1 - t)x + ty\bigr) \]

    and

    \[ \lim_{n \to \infty} \bigl[(1 - t_n)f(x) + t_n f(y)\bigr] = (1 - t)f(x) + tf(y). \]

    Therefore $f\bigl((1 - t)x + ty\bigr) \le (1 - t)f(x) + tf(y)$. It follows that $f$ is convex by definition.

    The benefit of Theorem 3.1, as we alluded to earlier, is that it makes determining convexity for continuous functions much easier. By considering only the $t = \frac{1}{2}$ case in the definition of a convex function, we may gain insight into algebraic techniques that would otherwise be obfuscated. The following theorem is an excellent example of how our result simplifies an otherwise difficult proof.

    Theorem 3.6. Suppose that $f$ is increasing on an interval $[a, b)$ and define a function $\varphi$ by

    \[ \varphi(x) = \int_a^x f. \]

    The function $\varphi$ is convex on $(a, b)$.

    Proof. By the Fundamental Theorem of Calculus, the function $\varphi$ is continuous on the interval $(a, b)$. Let $[c, d] \subseteq (a, b)$, and let $m = \frac{c + d}{2}$. We must show

    \[ \varphi(m) \le \frac{1}{2}\varphi(c) + \frac{1}{2}\varphi(d) \]

    or equivalently,

    \[ 2\int_a^m f \le \int_a^c f + \int_a^d f. \]

    Since $f$ is increasing, the number $f(m)$ is an upper and lower bound for $f$ on the intervals $[c, m]$ and $[m, d]$ respectively. Noting that $m$ is the midpoint of $c$ and $d$, we have

    \[ \int_c^m f \le f(m)(m - c) = f(m)(d - m) \le \int_m^d f. \]

    From this statement we find that

    \[ \begin{aligned}
    \int_a^c f + \int_a^d f &= \int_a^c f + \int_a^m f + \int_m^d f \\
    &\ge \int_a^c f + \int_a^m f + \int_c^m f \\
    &= \int_a^m f + \int_a^m f \\
    &= 2\int_a^m f.
    \end{aligned} \]

    By Theorem 3.1, the function $\varphi$ is convex on $(a, b)$.

    We note in passing that if the increasing function $f$ were continuous, we could have used the Fundamental Theorem of Calculus to show that $\varphi' = f$, which is increasing, and hence that $\varphi$ is concave up on $(a, b)$. Using our previous observations about secant lines, we know that a concave up function is also convex. Since a convex function may not be differentiable, however, the converse of this statement is not true in general.
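Theorem 3.6 does not require the increasing function to be continuous. As an illustration, the Python sketch below takes the step function $f(t) = \lfloor t \rfloor$ on $[0, 4)$ (an example chosen here, not one from the paper), uses the closed form of its integral, and tests the midpoint inequality of Theorem 3.1 on a grid. The names `phi` and `midpoint_convex_on_grid` are hypothetical.

```python
import math

def phi(x):
    """Exact value of the integral of floor(t) from 0 to x, for x >= 0."""
    n = math.floor(x)
    return n * (n - 1) / 2 + n * (x - n)

def midpoint_convex_on_grid(g, lo, hi, steps=60):
    """Check g((c + d)/2) <= (g(c) + g(d))/2 for all grid points c < d."""
    pts = [lo + (hi - lo) * k / steps for k in range(steps + 1)]
    return all(
        g((c + d) / 2) <= (g(c) + g(d)) / 2 + 1e-12   # small rounding tolerance
        for i, c in enumerate(pts)
        for d in pts[i + 1:]
    )

print(midpoint_convex_on_grid(phi, 0.0, 4.0))
```

The integral is a piecewise linear function whose slopes $0, 1, 2, 3$ increase from piece to piece, so the grid test is expected to pass; a grid test is evidence only and does not replace the proof.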

    4. Applications to Real Analysis

    As we explored the behavior of convex functions we found that several of the results appealed to the fact that a convex function lies below its secant lines. A few of the specific applications to real analysis, however, instead refer to the definition of convexity. The result is that the proofs become notably more algebraic in nature. The extra work involved should not be a deterrent from the notable results. In this section, we will first discuss the relationship between a decreasing convex function and the sequence of right endpoint approximations to the area under the curve. We will then explore a curious exception involving pointwise convergence.

    4.1. Right Endpoint Approximation. The Riemann integral is inextricably tied to the concept of determining the area under a function $f$ on a closed interval $I$. Generally, the interval $I$ is first divided into a number of subintervals, and a point, referred to as a tag, is chosen from each. The function evaluated at each tag, multiplied by the length of the respective subinterval, is a rectangle estimate of the area under $f$ on that subinterval. The sum of these rectangles provides an estimate of the area under $f$ on the entire interval.

    Right endpoint approximation is the specific case in which the right endpoint of each subinterval is chosen as the tag. The sequence of right endpoint approximations to a positive, continuous, decreasing function defined on $[0, 1]$ is given by

    \[ R_n = \sum_{i=1}^{n} f\left(\frac{i}{n}\right)\frac{1}{n}. \]


    While each term of the sequence will underestimate the area under the curve, it can be shown that $R_n$ converges to the Riemann integral. However, it is not necessarily true that each iteration provides a better estimate; that is, we need additional criteria to guarantee the sequence of right endpoint approximations $\{R_n\}$ is increasing. Convexity satisfies this requirement.

    The first three values of $R_n$ are illustrated in Figure 8. Note that since $R_2$ is simply a bisection of $R_1$, the total area approximated must be greater. With this argument we see that the subsequence $R_{2^n}$ is increasing. However, it is not immediately obvious that $R_2 \le R_3$.

    Figure 8. The sequence $\{R_n\}$ represented for a convex function: (a) $n = 1$, (b) $n = 2$, (c) $n = 3$. [Figure omitted: three panels showing right endpoint rectangles under $f(x) = \frac{1}{1 + x}$ on $[0, 1]$, with (a) Area $= 0.5$, (b) Area $\approx 0.58$, (c) Area $\approx 0.61$.]


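    Lemma 4.1. If $f$ is convex on $[0, 1]$ and $i$ and $n$ are positive integers with $i \le n$, then the inequality

    \[ f\left(\frac{i}{n}\right) \le \frac{n - i}{n}f\left(\frac{i}{n + 1}\right) + \frac{i}{n}f\left(\frac{i + 1}{n + 1}\right) \]

    is valid.

    Proof. We find that

    \[ \frac{i}{n} = \frac{in + i}{n(n + 1)} = \frac{in - i^2 + i^2 + i}{n(n + 1)} = \frac{(n - i)i + i(i + 1)}{n(n + 1)} = \frac{n - i}{n} \cdot \frac{i}{n + 1} + \frac{i}{n} \cdot \frac{i + 1}{n + 1}, \]

    and noting that $f$ is convex (with $t = \frac{i}{n}$ in Definition 2.4), we have that

    \[ f\left(\frac{i}{n}\right) \le \frac{n - i}{n}f\left(\frac{i}{n + 1}\right) + \frac{i}{n}f\left(\frac{i + 1}{n + 1}\right). \]

    Theorem 4.2. If $f : [0, 1] \to \mathbb{R}$ is convex, positive, and decreasing on $[0, 1]$, then the sequence $\{R_n\}$ of right endpoint approximations to $\int_0^1 f$ is increasing.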


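    Proof. We will show that $R_n \le R_{n+1}$. We have from Lemma 4.1 that

    \begin{align*}
    R_n &= \frac{1}{n} \sum_{i=1}^{n} f\left(\frac{i}{n}\right) \\
    &\le \frac{1}{n} \sum_{i=1}^{n} \left[ \frac{n-i}{n}\, f\left(\frac{i}{n+1}\right) + \frac{i}{n}\, f\left(\frac{i+1}{n+1}\right) \right] \\
    &= \frac{1}{n} \sum_{i=1}^{n} \frac{n-i}{n}\, f\left(\frac{i}{n+1}\right) + \frac{1}{n} \sum_{i=1}^{n+1} \frac{i-1}{n}\, f\left(\frac{i}{n+1}\right) \\
    &= \frac{1}{n} \sum_{i=1}^{n} \frac{n-1}{n}\, f\left(\frac{i}{n+1}\right) + \frac{1}{n} f(1) \\
    &= \sum_{i=1}^{n} \frac{1}{n+1}\, f\left(\frac{i}{n+1}\right) + \sum_{i=1}^{n} \left( \frac{n-1}{n^2} - \frac{1}{n+1} \right) f\left(\frac{i}{n+1}\right) + \frac{1}{n} f(1) \\
    &= \sum_{i=1}^{n} \frac{1}{n+1}\, f\left(\frac{i}{n+1}\right) + \sum_{i=1}^{n} \frac{(n-1)(n+1) - n^2}{n^2 (n+1)}\, f\left(\frac{i}{n+1}\right) + \frac{1}{n} f(1) \\
    &= \sum_{i=1}^{n} \frac{1}{n+1}\, f\left(\frac{i}{n+1}\right) - \frac{1}{n^2 (n+1)} \sum_{i=1}^{n} f\left(\frac{i}{n+1}\right) + \frac{1}{n} f(1),
    \end{align*}

    and using the fact that $f$ is a decreasing function,

    \begin{align*}
    R_n &\le \sum_{i=1}^{n} \frac{1}{n+1}\, f\left(\frac{i}{n+1}\right) - \frac{1}{n^2 (n+1)} \sum_{i=1}^{n} f(1) + \frac{1}{n} f(1) \\
    &= \sum_{i=1}^{n} \frac{1}{n+1}\, f\left(\frac{i}{n+1}\right) + f(1) \left( \frac{1}{n} - \frac{1}{n(n+1)} \right) \\
    &= \sum_{i=1}^{n} \frac{1}{n+1}\, f\left(\frac{i}{n+1}\right) + f(1)\, \frac{1}{n+1} \\
    &= \frac{1}{n+1} \left[ \sum_{i=1}^{n} f\left(\frac{i}{n+1}\right) + f\left(\frac{n+1}{n+1}\right) \right] \\
    &= \frac{1}{n+1} \sum_{i=1}^{n+1} f\left(\frac{i}{n+1}\right) = R_{n+1}.
    \end{align*}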
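The monotonicity in Theorem 4.2 can be observed numerically. The sketch below is only an illustration: the test function $f(x) = 1/(1+x)$ and the helper name `right_endpoint_sum` are assumptions chosen here, not part of the paper. Exact rational arithmetic is used so that the comparisons are not affected by rounding.

```python
from fractions import Fraction


def right_endpoint_sum(f, n):
    """Right endpoint approximation R_n of the integral of f over [0, 1]."""
    return sum(f(Fraction(i, n)) for i in range(1, n + 1)) / n


def f(x):
    # Hypothetical test function: convex, positive, and decreasing on [0, 1].
    return 1 / (1 + x)


# R_1, R_2, ..., R_8 as exact fractions.
approximations = [right_endpoint_sum(f, n) for n in range(1, 9)]

# Theorem 4.2 predicts R_n <= R_{n+1} for every n.
is_increasing = all(a <= b for a, b in zip(approximations, approximations[1:]))

print([float(r) for r in approximations])
print(is_increasing)
```

For this choice of $f$ the approximations climb toward $\int_0^1 \frac{dx}{1+x} = \ln 2$ from below, as expected for a decreasing integrand.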

    4.2. Pointwise Convergence. A sequence of functions $\{f_n\}$ converges pointwise to $f$ on an interval $I$ if the sequence $\{f_n(x)\}$ converges to $f(x)$ for each $x \in I$. One of the striking weaknesses of pointwise convergence is that


    certain properties of $\{f_n\}$ may fail to carry over to the limit function. That is to say, even if each of the functions $f_n$ is bounded, continuous, or Riemann integrable, it does not necessarily follow that $f$ shares this property. For counterexamples demonstrating this fact, see Gordon [1].

    One may wonder if a sequence of convex functions shares a similar fate. The answer may be surprising. Pointwise convergence is indeed sufficient to guarantee that convexity transfers to the limit function.

    Theorem 4.3. Let $\{f_n\}$ be a sequence of convex functions defined on an interval $I$. If $\{f_n\}$ converges pointwise on $I$ to a function $f$, then $f$ is convex on $I$.

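    Proof. Suppose that $\{f_n\}$ is a sequence of convex functions defined on $I$ that converges pointwise on $I$ to a function $f$. Since limits preserve weak inequalities, this means that for $[c, d] \subseteq I$ and $0 \le t \le 1$,

    \begin{align*}
    f\bigl((1-t)c + td\bigr) &= \lim_{n \to \infty} f_n\bigl((1-t)c + td\bigr) \\
    &\le \lim_{n \to \infty} \bigl[ (1-t) f_n(c) + t f_n(d) \bigr] \\
    &= \lim_{n \to \infty} (1-t) f_n(c) + \lim_{n \to \infty} t f_n(d) \\
    &= (1-t) f(c) + t f(d).
    \end{align*}

    It follows that the function $f$ is convex on $I$ by definition.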
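A small numerical illustration of Theorem 4.3 follows. The sequence $f_n(x) = x^n$ on $[0, 1]$ is an example chosen here, not one taken from the paper. Each $f_n$ is convex and continuous, and the pointwise limit is 0 on $[0, 1)$ and 1 at $x = 1$, so continuity is lost while convexity survives. The check below only tests the defining inequality on a finite grid, so it is a sanity check rather than a proof.

```python
from fractions import Fraction
from itertools import product


def limit_function(x):
    """Pointwise limit of f_n(x) = x**n on [0, 1]."""
    return 1 if x == 1 else 0


def is_convex_on_grid(f, points, weights):
    """Test f((1-t)c + t*d) <= (1-t)f(c) + t*f(d) for every grid choice of c, d, t."""
    return all(
        f((1 - t) * c + t * d) <= (1 - t) * f(c) + t * f(d)
        for c, d, t in product(points, points, weights)
    )


# Grid of points in [0, 1]; the same grid doubles as the set of weights t.
grid = [Fraction(k, 10) for k in range(11)]

# Each f_n(x) = x**n passes the grid test for n = 1, ..., 5.
each_fn_convex = all(
    is_convex_on_grid(lambda x, n=n: x**n, grid, grid) for n in range(1, 6)
)

# The discontinuous limit function passes the same test.
limit_convex = is_convex_on_grid(limit_function, grid, grid)

print(each_fn_convex, limit_convex)
```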

    5. Economic Modeling of Uncertainty

    The purpose of this section is to explore an application of convexity to economic consumer theory. Economics is the study of how agents choose to allocate scarce resources. Given a set of rationality assumptions, economists hope to model and predict the choices of individuals. Since the 1960s, economists have been increasingly concerned with developing more rigorous mathematical models. The particular model we will be discussing is that of consumer behavior under situations of uncertainty. Before addressing our application to convexity, we will briefly discuss its theoretical foundation.⁴

    5.1. Consumer Preference. Let $x_i \in \mathbb{R}_+$ represent the number of units of good $i$. We assume that there is a finite number of goods $n$. The vector $\mathbf{x} = (x_1, \ldots, x_n)$ describes a particular distribution of goods, and is referred to as a consumption bundle. Note that a consumption bundle is a point $\mathbf{x} \in \mathbb{R}^n_+$. In situations of certainty, consumer theory looks at the preferences of individuals over the set of consumption bundles $X$. An axiomatic assumption is that an individual is able to apply a preference relation to a collection of bundles. That is, given two bundles $\mathbf{x}^1$ and $\mathbf{x}^2$, an individual

    ⁴ The axioms, definitions, and theorems in this section are referenced from Jehle and Reny's Advanced Microeconomic Theory.


    has a preference or is indifferent between the two. The symbol $\succsim$ represents a binary relation defined on the set of consumption bundles $X$. For example, the statement $\mathbf{x}^1 \succsim \mathbf{x}^2$ means that $\mathbf{x}^1$ is at least as preferable as $\mathbf{x}^2$ to a particular consumer. We extend this concept to consumer behavior in situations of uncertainty by considering a vector of probability weighted outcomes called a gamble.

    Let $A = \{a_1, \ldots, a_n\}$ be a finite set of certain outcomes. Depending on the situation, an outcome may be an amount of money, an event, or anything else. A simple gamble assigns a probability $p_i$, using the symbol $\circ$ as notation, to each of the outcomes $a_i$ in $A$ such that

    \[ g = (p_1 \circ a_1, \ldots, p_n \circ a_n) \quad \text{where } p_i \ge 0, \ \sum_{i=1}^{n} p_i = 1. \]

    For example, the gamble in which a coin is flipped and an individual receives \$10 if heads, and \$0 if tails, may be represented as $(\tfrac{1}{2} \circ 10, \tfrac{1}{2} \circ 0)$. The set of simple gambles $G_S$ on $A$ is given by

    \[ G_S \equiv \left\{ (p_1 \circ a_1, \ldots, p_n \circ a_n) \ \middle|\ p_i \ge 0, \ \sum_{i=1}^{n} p_i = 1 \right\}. \]

    There are six axioms of choice under uncertainty; however, for the scope of this paper, we will only be concerned with two. The interested reader should refer to Jehle and Reny [2]. We assume that a consumer is able to apply a preference relation $\succsim$ over $G_S$, just as they are able to over consumption bundles. The symbols $\sim$ and $\succ$ denote indifference and strict preference, respectively.

    Axiom 1 (Completeness). For any two gambles $g$ and $g'$ in $G_S$, $g \succsim g'$ or $g' \succsim g$.

    Axiom 2 (Transitivity). For any three gambles $g$, $g'$, $g''$ in $G_S$, if $g \succsim g'$ and $g' \succsim g''$, then $g \succsim g''$.

    5.2. Von Neumann-Morgenstern Utility Functions. It may be difficult to analyze the behavior of individuals when their preference information is expressed over a set of vectors. We therefore introduce the concept of a utility function. In general, a utility function is a transformation from a set of objects to the set of real numbers that preserves consumer preferences. In situations of certainty, the domain of a utility function is the set of consumption bundles. The corresponding domain of the utility function $u$ in situations of uncertainty is the set of gambles $G_S$. We will also want $u$ to have the expected utility property.

    Definition 5.1. The utility function $u : G_S \to \mathbb{R}$ has the expected utility property if, for every $g \in G_S$,

    \[ u(g) = \sum_{i=1}^{n} p_i u(a_i) \]


    where $(p_1 \circ a_1, \ldots, p_n \circ a_n)$ is the simple gamble induced by $g$.
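The expected utility property translates directly into a short computation. The sketch below is a minimal illustration under stated assumptions: a simple gamble is represented as a list of (probability, outcome) pairs, probabilities are exact `Fraction` values so that the sum-to-one check is reliable, and the function name `expected_utility` and both utility functions are hypothetical choices made here.

```python
from fractions import Fraction


def expected_utility(gamble, u):
    """Return sum of p_i * u(a_i) for a simple gamble given as (p_i, a_i) pairs."""
    probabilities = [p for p, _ in gamble]
    # A simple gamble needs nonnegative probabilities that sum to 1.
    if any(p < 0 for p in probabilities) or sum(probabilities) != 1:
        raise ValueError("probabilities must be nonnegative and sum to 1")
    return sum(p * u(a) for p, a in gamble)


# The coin flip gamble from the text: $10 on heads, $0 on tails.
coin_flip = [(Fraction(1, 2), 10), (Fraction(1, 2), 0)]

# Hypothetical utility functions over money, for illustration only.
linear_value = expected_utility(coin_flip, lambda a: a)
squared_value = expected_utility(coin_flip, lambda a: a * a)

print(linear_value, squared_value)
```

With the linear utility the result is just the expected payoff of the coin flip; with the squared utility the high outcome is weighted more heavily.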

    Hence, the utility of a gamble is simply the probability weighted sum of the utilities of the outcomes $a_i$. Note that $u(a_i)$ is simply the utility function evaluated at a gamble in which the probability assigned to the outcome $a_i$ is equal to one. A utility function in situations of uncertainty is referred to as a von Neumann-Morgenstern (VNM) utility function. It is named after mathematician John von Neumann and economist Oskar Morgenstern, who proved its existence from four of the axioms of choice under uncertainty. A proof may be found in Jehle and Reny [2].

    Theorem 5.2. Let preferences $\succsim$ over gambles in $G_S$ satisfy the six axioms of choice under uncertainty. Then there exists a utility function $u : G_S \to \mathbb{R}$ representing $\succsim$ on $G_S$, such that $u$ has the expected utility property.

    5.3. Risk. One of the defining characteristics of the VNM utility function is the preservation of consumer preferences over $G_S$. This result allows us to visualize consumer behavior. In particular, we will look at how an individual's attitude towards risk is shown by the VNM utility function.

    For illustrative purposes, we will consider the case in which our outcomes are some units of wealth $w_i$. The expected value of a gamble $g$ is given by

    \[ E(g) = \sum_{i=1}^{n} p_i w_i. \]

    Intuitively, a risk averse individual is one who prefers to minimize their exposure to uncertainty. We would therefore expect a risk averse individual to prefer receiving the expected value of a gamble with certainty over accepting the uncertainty inherently associated with the gamble. The following definitions reflect this idea.

    Definition 5.3. Let the function $u$ represent an individual's VNM utility function for gambles over wealth. For the gamble $g$, an individual is

    (1) risk averse at $g$ if $u(E(g)) > u(g)$.

    (2) risk neutral at $g$ if $u(E(g)) = u(g)$.

    (3) risk loving at $g$ if $u(E(g)) < u(g)$.

    The following hypothetical gamble illustrates risk loving behavior.

    Example 5.4. Consider the following gamble:

    \[ g_1 = (0.75 \circ \$700, \ 0.25 \circ \$1900). \]

    The expected value of the gamble is given by

    \[ E(g_1) = p_1 w_1 + p_2 w_2 = (0.75)(\$700) + (0.25)(\$1900) = \$1000. \]

    By the definition of the expected utility property, the utility of the gamble is given by the individual's VNM utility function evaluated at each of the


    outcomes and weighted by the respective probabilities:

    \[ u(g_1) = p_1 u(w_1) + p_2 u(w_2) = .75u(\$700) + .25u(\$1900). \]

    Suppose that an individual who is risk loving over $G_S$ is offered $g_1$. By our definition, we have that

    \[ u(\$1000) = u(E(g_1)) < u(g_1) = .75u(\$700) + .25u(\$1900). \]

    The individual's preference for the risk inherent in $g_1$ is illustrated in Figure 9. Note that the individual's VNM utility function $u$ appears to be convex on $\mathbb{R}_+$.

    We find intuition for this observation by recalling that the graph of a convex function lies below its secant lines. In the case of a gamble with two outcomes, it can be shown that $u(g_1)$ corresponds to a point along the secant line drawn from $(w_1, u(w_1))$ to $(w_2, u(w_2))$. In particular, we find that

    \[ S_{w_1 w_2}(E(g_1)) = u(g_1). \]

    With reference to Figure 9, we note that a convex VNM utility function satisfies the requirement for risk loving behavior that $u(E(g_1)) < u(g_1)$.
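The secant line identity can be verified in a few lines. The derivation below is a sketch that assumes $S_{w_1 w_2}$ denotes the secant line through $(w_1, u(w_1))$ and $(w_2, u(w_2))$ written in point-slope form, and it uses only $p_1 + p_2 = 1$.

```latex
\begin{align*}
S_{w_1 w_2}(x) &= u(w_1) + \frac{u(w_2) - u(w_1)}{w_2 - w_1}\,(x - w_1), \\
E(g_1) - w_1 &= p_1 w_1 + p_2 w_2 - w_1 = p_2 (w_2 - w_1), \\
S_{w_1 w_2}\bigl(E(g_1)\bigr) &= u(w_1) + p_2 \bigl( u(w_2) - u(w_1) \bigr)
  = p_1 u(w_1) + p_2 u(w_2) = u(g_1).
\end{align*}
```

For the gamble of Example 5.4, the middle line reads $1000 - 700 = 0.25\,(1900 - 700) = 300$.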

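    Figure 9. Illustration of the gamble presented in Example 5.4.

    [Figure: graph of the utility function $u(x)$, with wealth (\$) on the horizontal axis and utility on the vertical axis. Marked wealth levels: $w_1 = 700$, $E(g_1) = 1000$, $w_2 = 1900$. Marked utility levels: $u(w_1) = u(700)$, $u(E(g_1)) = u(1000)$, $u(g_1) = .75u(700) + .25u(1900)$, $u(w_2) = u(1900)$.]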


    5.4. Jensen's Inequality. We now return to the overarching topic of convexity. As we alluded to at the conclusion of our previous example, it can be shown that a VNM utility function that preserves risk loving behavior is convex. This follows quickly from Jensen's Inequality, a result that we prove below.

    Lemma 5.5. For any points $x_1, x_2, \ldots, x_n$ in $(a, b)$ and for any nonnegative numbers $p_1, p_2, \ldots, p_n$, not all zero, we have

    \[ a < \frac{p_1 x_1 + p_2 x_2 + \cdots + p_n x_n}{p_1 + p_2 + \cdots + p_n} < b. \]

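    Proof. Let $x_m = \min\{x_1, x_2, \ldots, x_n\}$ and $x_M = \max\{x_1, x_2, \ldots, x_n\}$. We find that

    \[ \frac{p_1 x_1 + p_2 x_2 + \cdots + p_n x_n}{p_1 + p_2 + \cdots + p_n} \le \frac{p_1 x_M + p_2 x_M + \cdots + p_n x_M}{p_1 + p_2 + \cdots + p_n} = x_M \, \frac{p_1 + p_2 + \cdots + p_n}{p_1 + p_2 + \cdots + p_n} = x_M < b, \]

    and

    \[ \frac{p_1 x_1 + p_2 x_2 + \cdots + p_n x_n}{p_1 + p_2 + \cdots + p_n} \ge \frac{p_1 x_m + p_2 x_m + \cdots + p_n x_m}{p_1 + p_2 + \cdots + p_n} = x_m \, \frac{p_1 + p_2 + \cdots + p_n}{p_1 + p_2 + \cdots + p_n} = x_m > a. \]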

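    Theorem 5.6 (Jensen's Inequality). A function $f$ is convex on $(a, b)$ if and only if for any points $x_1, x_2, \ldots, x_n$ in $(a, b)$ and for any nonnegative numbers $p_1, p_2, \ldots, p_n$, not all zero, the inequality

    \[ f\left( \frac{p_1 x_1 + p_2 x_2 + \cdots + p_n x_n}{p_1 + p_2 + \cdots + p_n} \right) \le \frac{p_1 f(x_1) + p_2 f(x_2) + \cdots + p_n f(x_n)}{p_1 + p_2 + \cdots + p_n} \]

    is valid.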

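    Proof. Suppose that a function $f$ is convex on $(a, b)$. We will use the Principle of Mathematical Induction. For any points $x_1, x_2, \ldots, x_n$ in $(a, b)$ and for any nonnegative numbers $p_1, p_2, \ldots, p_n$, not all zero, consider the inequality

    \[ f\left( \frac{p_1 x_1 + p_2 x_2 + \cdots + p_n x_n}{p_1 + p_2 + \cdots + p_n} \right) \le \frac{p_1 f(x_1) + p_2 f(x_2) + \cdots + p_n f(x_n)}{p_1 + p_2 + \cdots + p_n}. \]

    When $n = 1$, we have that

    \[ f\left( \frac{p_1 x_1}{p_1} \right) = f(x_1) \le \frac{p_1 f(x_1)}{p_1} = f(x_1). \]

    Suppose that the inequality is true for some positive integer $k$. For $k + 1$, the left-hand side may be expressed in summation notation as

    \[ f\left( \frac{\sum_{i=1}^{k+1} p_i x_i}{\sum_{i=1}^{k+1} p_i} \right) = f\left( \frac{\sum_{i=1}^{k} p_i x_i}{\sum_{i=1}^{k+1} p_i} + \frac{p_{k+1} x_{k+1}}{\sum_{i=1}^{k+1} p_i} \right). \]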


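    We may assume that $\sum_{i=1}^{k} p_i > 0$, since otherwise both sides of the inequality for $k + 1$ reduce to $f(x_{k+1})$. Since $\frac{\sum_{i=1}^{k} p_i x_i}{\sum_{i=1}^{k} p_i}$ is in $(a, b)$ by Lemma 5.5, we may use the definition of a convex function and the induction hypothesis. Referring to the definition of a convex function, it may be helpful to note

    \[ t = \frac{\sum_{i=1}^{k} p_i}{\sum_{i=1}^{k+1} p_i} \quad \text{and} \quad (1 - t) = \frac{p_{k+1}}{\sum_{i=1}^{k+1} p_i}. \]

    We find that

    \begin{align*}
    f\left( \frac{\sum_{i=1}^{k+1} p_i x_i}{\sum_{i=1}^{k+1} p_i} \right)
    &= f\left( \frac{\sum_{i=1}^{k} p_i x_i}{\sum_{i=1}^{k} p_i} \cdot \frac{\sum_{i=1}^{k} p_i}{\sum_{i=1}^{k+1} p_i} + \frac{p_{k+1} x_{k+1}}{\sum_{i=1}^{k+1} p_i} \right) \\
    &\le f\left( \frac{\sum_{i=1}^{k} p_i x_i}{\sum_{i=1}^{k} p_i} \right) \frac{\sum_{i=1}^{k} p_i}{\sum_{i=1}^{k+1} p_i} + f(x_{k+1}) \, \frac{p_{k+1}}{\sum_{i=1}^{k+1} p_i} \\
    &\le \frac{\sum_{i=1}^{k} p_i f(x_i)}{\sum_{i=1}^{k} p_i} \cdot \frac{\sum_{i=1}^{k} p_i}{\sum_{i=1}^{k+1} p_i} + \frac{f(x_{k+1}) \, p_{k+1}}{\sum_{i=1}^{k+1} p_i} \\
    &= \frac{\sum_{i=1}^{k} p_i f(x_i)}{\sum_{i=1}^{k+1} p_i} + \frac{f(x_{k+1}) \, p_{k+1}}{\sum_{i=1}^{k+1} p_i} \\
    &= \frac{\sum_{i=1}^{k+1} p_i f(x_i)}{\sum_{i=1}^{k+1} p_i},
    \end{align*}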

    thus showing that the inequality is true for $k + 1$. It follows from the Principle of Mathematical Induction that the inequality

    \[ f\left( \frac{p_1 x_1 + p_2 x_2 + \cdots + p_n x_n}{p_1 + p_2 + \cdots + p_n} \right) \le \frac{p_1 f(x_1) + p_2 f(x_2) + \cdots + p_n f(x_n)}{p_1 + p_2 + \cdots + p_n} \]

    is true for all positive integers $n$, for any points $x_1, x_2, \ldots, x_n$ in $(a, b)$, and for any nonnegative numbers $p_1, p_2, \ldots, p_n$, not all zero.

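    Now suppose that for any points $x_1, x_2, \ldots, x_n$ in $(a, b)$ and for any nonnegative numbers $p_1, p_2, \ldots, p_n$, not all zero, the inequality

    \[ f\left( \frac{p_1 x_1 + p_2 x_2 + \cdots + p_n x_n}{p_1 + p_2 + \cdots + p_n} \right) \le \frac{p_1 f(x_1) + p_2 f(x_2) + \cdots + p_n f(x_n)}{p_1 + p_2 + \cdots + p_n} \]

    is valid. For $0 \le t \le 1$, let $p_1 = t$, $p_2 = (1 - t)$, and $p_3 = \cdots = p_n = 0$. We find that

    \[ f\bigl(t x_1 + (1 - t) x_2\bigr) \le t f(x_1) + (1 - t) f(x_2). \]

    Hence, the function $f$ is convex by definition.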
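Jensen's Inequality is easy to test on concrete data. The sketch below evaluates both sides of the inequality for one hypothetical choice of convex function, points, and weights; the helper name `jensen_sides` and the sample values are assumptions made for illustration. Exact fractions avoid rounding effects in the comparison.

```python
from fractions import Fraction


def jensen_sides(f, points, weights):
    """Return (f(weighted mean of points), weighted mean of f(points))."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must be nonnegative and not all zero")
    weighted_mean = sum(p * x for p, x in zip(weights, points)) / total
    left = f(weighted_mean)
    right = sum(p * f(x) for p, x in zip(weights, points)) / total
    return left, right


# Hypothetical sample data: f(x) = x^2 is convex on the whole real line.
points = [Fraction(1), Fraction(2), Fraction(4)]
weights = [Fraction(1), Fraction(2), Fraction(1)]

left, right = jensen_sides(lambda x: x * x, points, weights)
print(left, right, left <= right)
```

Here the weighted mean of the points is $9/4$, so the left side is $81/16$ and the right side is $25/4$. Note that the weights do not need to sum to 1, because both sides are normalized by their total.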

    Theorem 5.7. An individual is risk loving over gambles involving nonnegative wealth levels if and only if their VNM utility function is strictly convex on $\mathbb{R}_+$.

    Proof. Suppose that an individual is risk loving over gambles involving nonnegative wealth levels, let $u$ represent their VNM utility function, and let


    $g \in G_S$. This means that $u(E(g)) < u(g)$, which can be equivalently expressed as

    \[ u(p_1 w_1 + p_2 w_2 + \cdots + p_n w_n) < p_1 u(w_1) + p_2 u(w_2) + \cdots + p_n u(w_n). \]

    Noting that $\sum_{i=1}^{n} p_i = 1$, it follows from Jensen's Inequality, with strict inequality in place of $\le$, that the function $u$ is strictly convex on $\mathbb{R}_+$.

    Now suppose that an individual's VNM utility function $u$ is strictly convex on $\mathbb{R}_+$ and let $g \in G_S$ be a gamble with at least two distinct outcomes of positive probability. Jensen's Inequality, again in its strict form, states that the inequality

    \[ u(p_1 w_1 + p_2 w_2 + \cdots + p_n w_n) < p_1 u(w_1) + p_2 u(w_2) + \cdots + p_n u(w_n) \]

    holds for each $w_i \in \mathbb{R}_+$. Hence, the individual is risk loving at $g$.

    We have thus shown that an individual is risk loving over gambles involving nonnegative wealth levels if and only if their VNM utility function is strictly convex on $\mathbb{R}_+$.

    Conclusion

    Using convexity to discuss risk behavior has allowed for greater appreciation of the underlying economic theory. In the same way, our theoretical analysis of convex functions that began with geometrical observations led to rewarding topics in real analysis. We conclude our discussion of convex functions with a survey of areas of further research. With additional time, we might have explored the relationship between convex functions and metric preserving functions. Those interested in applications of convexity to economics may consult Jehle and Reny's discussion of indifference curve analysis in [2]. Furthermore, we note that convex functions are inextricably related to convex sets. The study of convex sets is essential in linear optimization, as well as in graduate level economic proofs.

    Acknowledgements

    Sincere thanks to Price Hardman for his helpful comments on several drafts of the paper, to Professor Gordon and Professor Balof for their advisement and careful attention to details, to Professor Schueller, for without his support with TikZ this paper would be notably less interesting, and to Professor Crouter for her support in the economics section. Finally, I would like to thank all members of the Whitman Swim Team, past and present, for their daily support and encouragement.

    References

    [1] Russell A. Gordon, Real Analysis: A First Course (New York: Addison-Wesley, 2002).

    [2] Geoffrey A. Jehle and Philip J. Reny, Advanced Microeconomic Theory (New York: Addison-Wesley, 2001).