Notes on Partial Differential Equations (Theory) Claudio Canuto Dipartimento di Matematica Politecnico di Torino 10129 Torino Italy [email protected] http://calvino.polito.it/ccanuto July 21, 2009


  • Notes on Partial Differential Equations

    (Theory)

    Claudio Canuto

    Dipartimento di Matematica, Politecnico di Torino, 10129 Torino, Italy

    [email protected]

    http://calvino.polito.it/ccanuto

    July 21, 2009


  • Chapter 1

    Basic Concepts

    1.1 Vectors

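    The inner product between two column vectors $\mathbf{a} = (a_i) \in \mathbb{R}^m$ and $\mathbf{b} = (b_i) \in \mathbb{R}^m$ will be denoted by

    \[ \mathbf{a} \cdot \mathbf{b} := \sum_{i=1}^m a_i b_i . \]

    An equivalent notation is $\mathbf{a}^T \mathbf{b}$ (where the superscript $T$ indicates the transpose of a vector or a matrix), since $\mathbf{a} \cdot \mathbf{b}$ is the matrix product between the $(1, m)$-matrix $\mathbf{a}^T$ and the $(m, 1)$-matrix $\mathbf{b}$.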

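    For $m = 3$, the vector product (or external product) between $\mathbf{a}$ and $\mathbf{b}$ is the vector

    \[ \mathbf{a} \times \mathbf{b} := \det \begin{pmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3 \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{pmatrix} = (a_2 b_3 - a_3 b_2)\,\mathbf{e}_1 + (a_3 b_1 - a_1 b_3)\,\mathbf{e}_2 + (a_1 b_2 - a_2 b_1)\,\mathbf{e}_3 = (a_2 b_3 - a_3 b_2,\; a_3 b_1 - a_1 b_3,\; a_1 b_2 - a_2 b_1)^T , \]

    where $\mathbf{e}_i = (\delta_{ij})_{j=1,\dots,3}$, $i = 1, \dots, 3$, are the vectors of the canonical basis.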
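    The two products defined in this section can be sketched in a few lines of Python. This is only an illustration: the function names `dot` and `cross` are chosen here and do not come from the notes, and vectors are represented as plain tuples.

```python
def dot(a, b):
    """Inner product a . b = sum_i a_i * b_i of two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(ai * bi for ai, bi in zip(a, b))


def cross(a, b):
    """Vector product a x b for m = 3, from the determinant expansion."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)


e1, e2, e3 = (1, 0, 0), (0, 1, 0), (0, 0, 1)
a, b = (1, 2, 3), (4, 5, 6)
c = cross(a, b)
```

    With these definitions, `cross(e1, e2)` returns `e3`, and `c` is orthogonal to both `a` and `b` in the sense that `dot(a, c)` and `dot(b, c)` vanish.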

    1.2 Introduction and Notations

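    Let us denote by $\mathbf{x} = (x_1, \dots, x_m)$ the independent variable in the Euclidean space $\mathbb{R}^m$. Let $u = u(\mathbf{x})$ be a real-valued function defined in some open set $O \subseteq \mathbb{R}^m$. Given a multi-integer $\alpha = (\alpha_1, \dots, \alpha_m) \in \mathbb{N}^m$, the $\alpha$-partial derivative of $u$ at a point $\mathbf{x} \in O$ is obtained by differentiating $u$ at $\mathbf{x}$ $\alpha_i$ times with respect to the variable $x_i$, for $i = 1, \dots, m$. The order of the partial derivative is defined as $|\alpha| := \alpha_1 + \dots + \alpha_m$. The $\alpha$-partial derivative will be denoted by one of the symbols

    \[ D^\alpha u \quad \text{or} \quad \frac{\partial^{|\alpha|} u}{\partial x_1^{\alpha_1} \cdots \partial x_m^{\alpha_m}} . \]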

    Other symbols may be preferred for indicating low order derivatives. For instance, the first order partial derivatives with respect to $x_i$ will also be denoted by

    \[ D_i u \quad \text{or} \quad D_{x_i} u \quad \text{or} \quad u_{x_i} ; \]


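    the second order derivative with respect to $x_i$ and $x_j$ will be denoted by

    \[ D^2_{ij} u \quad \text{or} \quad D^2_{x_i x_j} u \quad \text{or} \quad u_{x_i x_j} , \]

    and so on.

    We now introduce the most commonly used first order differential operators. The gradient $\nabla$ or grad is defined as

    \[ \nabla u := \begin{pmatrix} \dfrac{\partial u}{\partial x_1} \\ \vdots \\ \dfrac{\partial u}{\partial x_m} \end{pmatrix} = \begin{pmatrix} \dfrac{\partial}{\partial x_1} \\ \vdots \\ \dfrac{\partial}{\partial x_m} \end{pmatrix} u ; \]

    note that $\nabla$ acts on a scalar function and produces a column vector function, i.e., a vector field defined in $O$.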

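    The divergence $\nabla \cdot$ or div acts on a vector-valued function $\mathbf{u} = (u_1, \dots, u_m)^T$ and produces a scalar function, according to the definition

    \[ \nabla \cdot \mathbf{u} := \frac{\partial u_1}{\partial x_1} + \dots + \frac{\partial u_m}{\partial x_m} . \]

    The notation is coherent with the fact that $\nabla \cdot \mathbf{u}$ can be formally obtained as the inner product of the column vectors $\nabla$ and $\mathbf{u}$. Therefore, an equivalent notation for $\nabla \cdot \mathbf{u}$ is $\nabla^T \mathbf{u}$.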

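    In dimension $m = 3$, the curl $\nabla \times$ or rot acts on a vector-valued function $\mathbf{u}$ and produces a vector-valued function, according to the definition

    \[ \nabla \times \mathbf{u} = \left( \frac{\partial u_3}{\partial x_2} - \frac{\partial u_2}{\partial x_3},\; \frac{\partial u_1}{\partial x_3} - \frac{\partial u_3}{\partial x_1},\; \frac{\partial u_2}{\partial x_1} - \frac{\partial u_1}{\partial x_2} \right)^T ; \]

    the vector $\nabla \times \mathbf{u}$ can be formally obtained as the vector product of the column vectors $\nabla$ and $\mathbf{u}$.

    In dimension $m = 2$, we can define the curl of a scalar function $u$ as the column vector in $\mathbb{R}^2$

    \[ \nabla \times u = \left( \frac{\partial u}{\partial x_2},\; -\frac{\partial u}{\partial x_1} \right)^T ; \]

    note that $\nabla \times u$ contains the first two components of the vector $\nabla \times \mathbf{U} \in \mathbb{R}^3$, where $\mathbf{U} = (0, 0, u)^T \in \mathbb{R}^3$ (the last component is obviously 0). Similarly, we can define the curl of a vector function $\mathbf{u} = (u_1, u_2)^T \in \mathbb{R}^2$ as the scalar

    \[ \nabla \times \mathbf{u} = \frac{\partial u_2}{\partial x_1} - \frac{\partial u_1}{\partial x_2} , \]

    which coincides with the third component of the vector $\nabla \times \mathbf{U} \in \mathbb{R}^3$, where $\mathbf{U} = (\mathbf{u}, 0)^T$ (the first two components are 0 since $\mathbf{u}$ does not depend on $x_3$).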

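    Perhaps the most popular second order differential operator is the Laplacian $\Delta$, defined as

    \[ \Delta u := \frac{\partial^2 u}{\partial x_1^2} + \dots + \frac{\partial^2 u}{\partial x_m^2} . \]

    Note that $\Delta u$ is obtained by taking the divergence of the vector $\nabla u$, i.e., $\Delta u = \nabla \cdot \nabla u = \nabla^T \nabla u$; for this reason, the Laplacian is also denoted by the symbol $\nabla^2$.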
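    As a rough numerical illustration of the gradient and the Laplacian, the following Python sketch approximates both by central finite differences. The finite-difference approach, the step sizes and the sample function `u` are assumptions made here for illustration only; the notes define the operators analytically.

```python
def gradient(u, x, h=1e-5):
    """Central-difference approximation of the gradient of u at the point x."""
    grad = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((u(xp) - u(xm)) / (2.0 * h))
    return grad


def laplacian(u, x, h=1e-3):
    """Sum of second central differences, approximating u_{x1x1} + ... + u_{xmxm}."""
    centre = u(list(x))
    total = 0.0
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        total += (u(xp) - 2.0 * centre + u(xm)) / h**2
    return total


def u(x):
    # Sample function (an assumption): u = x1^2 + 3 x2^2 + x1 x2,
    # so grad u = (2 x1 + x2, 6 x2 + x1) and Laplacian u = 2 + 6 = 8.
    return x[0] ** 2 + 3.0 * x[1] ** 2 + x[0] * x[1]


point = (1.0, 2.0)
grad_u = gradient(u, point)
lap_u = laplacian(u, point)
```

    For this quadratic sample the approximations agree with the exact values $(4, 13)$ and $8$ up to rounding error.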


    A partial differential equation is a relationship

    \[ R(\mathbf{x}, u; D^\alpha) = 0 \tag{1.2.1} \]

    among the independent variable $\mathbf{x}$, the dependent variable $u = u(\mathbf{x})$ and certain partial derivatives $D^\alpha$ applied to $u$ or to some functions depending on $u$; the multi-integers $\alpha$ vary in some finite subset of $\mathbb{N}^m$. The equation is required to be satisfied in some open set $O \subseteq \mathbb{R}^m$. The order of the equation is the maximum order of the partial derivatives which appear in the relationship.

    Examples of (first order) partial differential equations are

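    (i) the transport (or advection) equation

    \[ \mathbf{a} \cdot \nabla u = \sum_{i=1}^m a_i \frac{\partial u}{\partial x_i} = 0, \tag{1.2.2} \]

    where $\mathbf{a} = \mathbf{a}(\mathbf{x}) = (a_1(\mathbf{x}), \dots, a_m(\mathbf{x}))^T$ is a given vector field defined in $O$;

    (ii) the inviscid Burgers equation

    \[ \frac{\partial u}{\partial t} + \frac{\partial}{\partial x}\left( \frac{u^2}{2} \right) = 0, \tag{1.2.3} \]

    where we have set $(x_1, x_2) = (x, t) \in \mathbb{R}^2$;

    (iii) the liquid crystal equation

    \[ \nabla u \cdot \nabla u = \sum_{i=1}^m \left( \frac{\partial u}{\partial x_i} \right)^2 = f, \tag{1.2.4} \]

    where $f = f(\mathbf{x})$ is a given function in $O$.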

    A partial differential equation is linear if (1.2.1) can be written as

    \[ L(\mathbf{x}, u; D^\alpha) = f, \tag{1.2.5} \]

    where $L$ is linear in $u$, i.e., $L(\mathbf{x}, \lambda u + \mu v; D^\alpha) = \lambda L(\mathbf{x}, u; D^\alpha) + \mu L(\mathbf{x}, v; D^\alpha)$ for all $\lambda, \mu \in \mathbb{R}$, and $f = f(\mathbf{x})$ is a given function in $O$. Equivalently, the left-hand side of (1.2.5) is a linear differential operator (of some order $N$) applied to $u$; this means that for all $\alpha$ with $|\alpha| \le N$ there exist coefficients $a_\alpha = a_\alpha(\mathbf{x})$ such that

    \[ L(\mathbf{x}, u; D^\alpha) = \sum_{|\alpha| \le N} a_\alpha D^\alpha u, \tag{1.2.6} \]

    and, at each $\mathbf{x} \in O$, at least one coefficient with $|\alpha| = N$ does not vanish. For convenience, the left-hand side $L(\mathbf{x}, u; D^\alpha)$ will be denoted by $Lu$.

    If we restrict the sum in (1.2.6) to the indices with $|\alpha| = N$, we obtain the principal part $L^{(N)}$ of the operator $L$, i.e.,

    \[ L^{(N)} u = \sum_{|\alpha| = N} a_\alpha D^\alpha u. \]

    The transport equation (1.2.2) is an example of a linear first order equation. Examples of linear second order equations (in two independent variables) are


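    (i) the Poisson equation

    \[ -\left( \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} \right) = f \tag{1.2.7} \]

    (called the Laplace equation if $f = 0$);

    (ii) the heat equation

    \[ \frac{\partial u}{\partial t} - \frac{\partial^2 u}{\partial x^2} = f \tag{1.2.8} \]

    ($t$ denotes the time variable, whereas $x$ denotes the space variable);

    (iii) the wave equation

    \[ \frac{\partial^2 u}{\partial t^2} - \frac{\partial^2 u}{\partial x^2} = f ; \tag{1.2.9} \]

    (iv) the Tricomi equation

    \[ \frac{\partial^2 u}{\partial x^2} + y \frac{\partial^2 u}{\partial y^2} = f. \tag{1.2.10} \]

    Note that the principal part of the heat operator is minus the second order derivative in the space variable $x$, i.e., minus the Laplacian in one space variable.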

    A partial differential equation is quasi-linear if it is linear in the highest-order derivatives, i.e., if it can be written as

    \[ \sum_{|\alpha| = N} a_\alpha D^\alpha u = f, \tag{1.2.11} \]

    where the coefficients $a_\alpha$ as well as $f$ may depend not only on $\mathbf{x}$ but also on $u$ and certain derivatives $D^\beta u$ of order $|\beta| < N$. An example is the inviscid Burgers equation (1.2.3), which can be written in the (formally) equivalent expression

    \[ \frac{\partial u}{\partial t} + u \frac{\partial u}{\partial x} = 0. \]

    Finally, a partial differential equation is semi-linear if it is quasi-linear and the coefficients $a_\alpha$ in (1.2.11) depend neither on $u$ nor on its derivatives (whereas $f$ may depend on them). Examples of semi-linear equations are the viscous Burgers equation

    \[ \frac{\partial u}{\partial t} - \nu \frac{\partial^2 u}{\partial x^2} + \frac{\partial}{\partial x}\left( \frac{u^2}{2} \right) = 0 \]

    (where $\nu > 0$ is the viscosity constant), the Korteweg-de Vries equation

    \[ \frac{\partial u}{\partial t} + u \frac{\partial u}{\partial x} + \frac{\partial^3 u}{\partial x^3} = 0, \tag{1.2.12} \]

    and the ground-state equation

    \[ -\Delta u = u^3. \]

    We will now discuss in which sense a function $u$ defined in the open set $O \subseteq \mathbb{R}^m$ is a solution of the partial differential equation (1.2.1) therein. Indeed, we can give different meanings to the word solution. We go from the concept of classical solution to that of strong solution, and then to weaker and weaker definitions, which require a solution to be less and less regular (i.e., differentiable). One of the main achievements of the Mathematics of the XXth century has been the relaxation of the concept of solution of a partial differential equation; this has allowed differential problems to be formulated in the most appropriate way for being studied by often sophisticated analytical tools, and numerically discretized by efficient methods.

    Let us denote by $N$ the order of the partial differential equation (1.2.1). A classical solution is an $N$-times continuously differentiable function in $O$ (i.e., $u \in C^N(O)$) which, inserted with all its derivatives in the left-hand side of (1.2.1), makes the equation satisfied pointwise in $O$:

    \[ R(\mathbf{x}, u; D^\alpha) = 0, \qquad \forall \mathbf{x} \in O. \tag{1.2.13} \]

    We want to formulate these conditions in an equivalent way, which subsequently will allow us to relax the concept of solution. To this end, we introduce the notion of test function, i.e., an infinitely differentiable function $\varphi$ defined and having compact support in $O$: this means that the closed set

    \[ \operatorname{supp} \varphi = \text{closure of } \{ \mathbf{x} \in O : \varphi(\mathbf{x}) \neq 0 \} \]

    is bounded and contained in $O$. Then, $\varphi$ vanishes with all its derivatives in a neighborhood of the boundary $\partial O$. The set of all test functions forms a vector space, which will be denoted by $\mathcal{D}(O)$. Note that any partial derivative of a test function is itself a test function.

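    Example 1.2.1. It is often important to know that test functions with certain properties exist; for example, one often needs a test function that is positive in a small neighborhood of a given point $\mathbf{x}_0$ and zero outside that neighborhood. Such a function can be given explicitly:

    \[ \varphi(\mathbf{x}) = \begin{cases} \exp\left( \dfrac{\varepsilon^2}{\|\mathbf{x} - \mathbf{x}_0\|^2 - \varepsilon^2} \right) & \text{if } \|\mathbf{x} - \mathbf{x}_0\| < \varepsilon, \\ 0 & \text{otherwise,} \end{cases} \]

    where $\|\cdot\|$ denotes the Euclidean norm of a vector in $\mathbb{R}^m$.

    Let us assume that $R$ depends continuously on all its arguments, so that $R(\mathbf{x}, v; D^\alpha)$ is a continuous function in $O$, for all functions $v \in C^N(O)$. The set of conditions (1.2.13) is equivalent to the set of conditions

    \[ \int_O R(\mathbf{x}, u; D^\alpha)\, \varphi(\mathbf{x})\, d\mathbf{x} = 0, \qquad \forall \varphi \in \mathcal{D}(O). \tag{1.2.14} \]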
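    The explicit test function of Example 1.2.1 can be evaluated with a short Python sketch. The function name `bump` and the argument names are chosen here for illustration; `eps` plays the role of the radius of the neighborhood on which the function is positive.

```python
import math


def bump(x, x0, eps):
    """Test function of Example 1.2.1: positive for ||x - x0|| < eps, zero elsewhere."""
    r2 = sum((xi - ci) ** 2 for xi, ci in zip(x, x0))
    if r2 >= eps ** 2:
        return 0.0
    # The exponent is negative inside the ball and tends to minus infinity
    # at the boundary, so the function and all its derivatives vanish there.
    return math.exp(eps ** 2 / (r2 - eps ** 2))


centre_value = bump((0.0, 0.0), (0.0, 0.0), 1.0)   # exp(-1) at the centre
inside_value = bump((0.5, 0.0), (0.0, 0.0), 1.0)   # exp(-4/3)
outside_value = bump((1.0, 0.0), (0.0, 0.0), 1.0)  # on the boundary of the ball: 0
```

    Dividing `bump` by its value $e^{-1}$ at the centre gives a nonnegative test function equal to 1 at $\mathbf{x}_0$.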

    Equivalence is established by a classical argument in analysis. If (1.2.13) is satisfied, we multiply both sides by $\varphi(\mathbf{x})$ and integrate over $O$ to get (1.2.14). Conversely, suppose that (1.2.14) holds; assume by contradiction that there exists $\mathbf{x}_0 \in O$ such that $R(\mathbf{x}_0, u; D^\alpha) \neq 0$, say, strictly positive. Since $R(\mathbf{x}, u; D^\alpha)$ is a continuous function of $\mathbf{x}$, it will also be strictly positive in a neighborhood $B(\mathbf{x}_0)$ of $\mathbf{x}_0$. Take as test function a nonnegative function $\varphi$ having support contained in $B(\mathbf{x}_0)$ and satisfying $\varphi(\mathbf{x}_0) = 1$. Then,

    \[ \int_O R(\mathbf{x}, u; D^\alpha)\, \varphi(\mathbf{x})\, d\mathbf{x} > 0, \]

    which contradicts (1.2.14).

    The interest of formulation (1.2.14) relies on the fact that certain derivatives applied to $u$, or to some functions of $u$, can be moved onto $\varphi$, thus relaxing the differentiability requirements on $u$. This is accomplished via the (repeated) use of the integration-by-parts formula

    \[ \int_O \frac{\partial g}{\partial x_i}(\mathbf{x})\, \varphi(\mathbf{x})\, d\mathbf{x} = -\int_O g(\mathbf{x})\, \frac{\partial \varphi}{\partial x_i}(\mathbf{x})\, d\mathbf{x}, \qquad \forall \varphi \in \mathcal{D}(O), \quad i = 1, \dots, m, \tag{1.2.15} \]


    which holds, at least, if $g$ is continuously differentiable in $O$. Note that no boundary term appears, since a test function vanishes in a neighborhood of $\partial O$. While the left-hand side requires the partial derivative of $g$ with respect to $x_i$ to be defined and integrable on $O$, the right-hand side is defined under the milder condition that $g$ be integrable on $O$, only.
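    The integration-by-parts formula (1.2.15) can be checked numerically in one space dimension. The sketch below is only an illustration under assumptions made here: $O = (0, 1)$, $g(x) = x^2$, a test function of the type of Example 1.2.1 centred at $0.5$ with radius $0.4$, and a composite trapezoidal rule for the integrals.

```python
import math

CENTRE, EPS = 0.5, 0.4  # assumed centre and radius of the test function


def phi(x):
    """Test function supported in [CENTRE - EPS, CENTRE + EPS], inside O = (0, 1)."""
    r2 = (x - CENTRE) ** 2
    if r2 >= EPS ** 2:
        return 0.0
    return math.exp(EPS ** 2 / (r2 - EPS ** 2))


def phi_prime(x):
    """Derivative of phi, obtained by the chain rule."""
    r2 = (x - CENTRE) ** 2
    if r2 >= EPS ** 2:
        return 0.0
    return phi(x) * (-2.0 * EPS ** 2 * (x - CENTRE) / (r2 - EPS ** 2) ** 2)


def trapezoid(fn, a, b, n=20000):
    """Composite trapezoidal rule on [a, b] with n subintervals."""
    h = (b - a) / n
    total = 0.5 * (fn(a) + fn(b))
    for k in range(1, n):
        total += fn(a + k * h)
    return total * h


def g(x):
    return x ** 2


def g_prime(x):
    return 2.0 * x


lhs = trapezoid(lambda x: g_prime(x) * phi(x), 0.0, 1.0)   # integral of g' * phi
rhs = -trapezoid(lambda x: g(x) * phi_prime(x), 0.0, 1.0)  # minus integral of g * phi'
```

    The two numbers `lhs` and `rhs` agree up to quadrature error, and no boundary term is needed because `phi` vanishes near the endpoints.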

    To explain how (1.2.15) is used to manipulate (1.2.14), assume that the partial differential equation is written in the quasi-divergence form

    \[ R(\mathbf{x}, u; D^\alpha) = -\nabla \cdot \mathbf{g}(\mathbf{x}, u; D^\alpha) + g_0(\mathbf{x}, u; D^\alpha) = -\sum_{i=1}^m \frac{\partial}{\partial x_i} g_i(\mathbf{x}, u; D^\alpha) + g_0(\mathbf{x}, u; D^\alpha), \]

    where each $g_i$ ($i = 0, 1, \dots, m$) only involves partial derivatives of order strictly less than $N$. Many partial differential equations which model fundamental phenomena of the physical world are precisely obtained in this form; conservation laws are an example. Then, applying (1.2.15), conditions (1.2.14) can be written as

    \[ \int_O \left[ \sum_{i=1}^m g_i(\mathbf{x}, u; D^\alpha)\, \frac{\partial \varphi}{\partial x_i}(\mathbf{x}) + g_0(\mathbf{x}, u; D^\alpha)\, \varphi(\mathbf{x}) \right] d\mathbf{x} = 0, \qquad \forall \varphi \in \mathcal{D}(O). \tag{1.2.16} \]

    In this formulation, $u$ need not be differentiable up to order $N$; it is enough for the functions $g_i$ to be defined and integrable on $O$. Any function $u$ for which this is true and which satisfies (1.2.16) is called a weak solution of the partial differential equation. Obviously, a classical solution is also a weak solution, whereas the converse need not be true.

    Further integrations by parts in (1.2.16) may lead to even weaker definitions of solution.

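    Example 1.2.2. Consider the transport equation (1.2.2) and assume that the coefficients $a_i$ ($i = 1, \dots, m$) belong to $C^1(O)$. After a change of sign, the equation can be written as

    \[ -\sum_{i=1}^m \frac{\partial}{\partial x_i}(a_i u) + \left( \sum_{i=1}^m \frac{\partial a_i}{\partial x_i} \right) u = 0. \]

    Thus, the weak formulation is

    \[ \int_O u(\mathbf{x}) \left[ \sum_{i=1}^m a_i(\mathbf{x})\, \frac{\partial \varphi}{\partial x_i}(\mathbf{x}) + \left( \sum_{i=1}^m \frac{\partial a_i}{\partial x_i}(\mathbf{x}) \right) \varphi(\mathbf{x}) \right] d\mathbf{x} = 0, \qquad \forall \varphi \in \mathcal{D}(O). \]

    In this way, we allow the transport equation to have bounded, piecewise smooth but discontinuous weak solutions.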

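    Example 1.2.3. Recalling that $\Delta = \nabla \cdot \nabla$, the Poisson equation

    \[ -\Delta u = f \]

    in $O$ is written in weak form as

    \[ \int_O \nabla u \cdot \nabla \varphi\, d\mathbf{x} = \int_O \sum_{i=1}^m \frac{\partial u}{\partial x_i} \frac{\partial \varphi}{\partial x_i}\, d\mathbf{x} = \int_O f \varphi\, d\mathbf{x}, \qquad \forall \varphi \in \mathcal{D}(O), \]

    provided $f$ and all the first order partial derivatives of $u$ exist and are integrable on $O$. A further integration by parts yields

    \[ -\int_O u\, \Delta \varphi\, d\mathbf{x} = \int_O f \varphi\, d\mathbf{x}, \qquad \forall \varphi \in \mathcal{D}(O), \]

    which only requires $u$ to be integrable on $O$.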


    Partial differential equations are usually supplemented by boundary and/or initial conditions, i.e., conditions that the solution has to satisfy on all or part of the boundary $\partial O$ of the region $O$ in which the equation is set; if $O$ is unbounded, the solution may be required to match a prescribed asymptotic behaviour at infinity. Indeed, in most cases, a partial differential equation admits infinitely many solutions; the conditions on $\partial O$ or at infinity, which often originate as part of the mathematical model describing the phenomenon of interest, allow us to select precisely one solution.

    Example 1.2.4. Consider the simple transport equation in one space variable

    \[ u_t + u_x = 0. \]

    It is immediate to check that if $g = g(s)$ is any continuously differentiable function on the real line, then $u(x, t) = g(x - t)$ is a classical solution of the equation. Thus, if we set the equation in the half-plane $O = \{ (x, t) \in \mathbb{R}^2 : t > 0 \}$, so that $\partial O = \{ (x, 0) : x \in \mathbb{R} \}$ represents the space at the initial time $t = 0$, then $u$ is the unique solution of the initial value problem

    \[ \begin{cases} u_t + u_x = 0 & \text{in } O, \\ u = g & \text{on } \partial O. \end{cases} \]

    Example 1.2.5. As a second example, consider the Poisson equation

    \[ -\Delta u = f \]

    in some $O \subseteq \mathbb{R}^m$. If we know one solution $u_f$, then all solutions can be written in the form $u = u_0 + u_f$, where $u_0$ denotes any harmonic function in $O$, i.e., any solution of the homogeneous equation $-\Delta u_0 = 0$ therein. We shall see that, under appropriate assumptions, a unique solution can be selected by forcing $u$ to vanish on the whole of $\partial O$, and, if $O$ is unbounded, at infinity.

    The two previous examples concern linear partial differential equations,

    \[ Lu = f \quad \text{in } O. \tag{1.2.17} \]

    For such equations, the set of solutions is an (affine) vector space. To see this, consider at first the associated homogeneous equation

    \[ Lu_0 = 0 \quad \text{in } O. \]

    The set of its solutions is a linear vector space: indeed, if $u_0$ and $v_0$ are two such solutions, then by the linearity of $L$ one has

    \[ L(\lambda u_0 + \mu v_0) = \lambda L u_0 + \mu L v_0 = 0, \qquad \forall \lambda, \mu \in \mathbb{R}, \]

    so $\lambda u_0 + \mu v_0$ is also a solution. Going back to the nonhomogeneous equation, if $u$ and $v$ are two solutions, then

    \[ L(u - v) = Lu - Lv = f - f = 0, \]

    i.e., $u - v$ is a solution of the homogeneous equation. Thus, if (1.2.17) admits a solution $u_f$, then all its solutions can be written in the form

    \[ u = u_0 + u_f , \]

    with $u_0$ an arbitrary solution of the homogeneous equation. This is the well-known superposition principle of linear equations.
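    A one-dimensional worked instance may help fix ideas. The data below ($O = (0, 1)$, $f = 2$, vanishing boundary values) are chosen here purely for illustration and do not appear in the notes.

```latex
% Poisson equation in one variable on O = (0,1):  -u'' = 2.
% A particular solution:
u_f(x) = -x^2 , \qquad -u_f''(x) = 2 .
% Solutions of the homogeneous equation -u_0'' = 0:
u_0(x) = c_0 + c_1 x , \qquad c_0, c_1 \in \mathbb{R} .
% Superposition gives all solutions:
u(x) = c_0 + c_1 x - x^2 .
% Forcing u to vanish on the boundary \{0, 1\} selects exactly one of them:
u(0) = c_0 = 0 , \qquad u(1) = c_1 - 1 = 0
\quad \Longrightarrow \quad u(x) = x (1 - x) .
```

    The two free constants of the homogeneous solution are fixed by the two boundary conditions, which is the selection mechanism described in Example 1.2.5.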


    1.3 Linear First Order Equations

    The most general linear first order partial differential equation is

    \[ \mathbf{a} \cdot \nabla u + a_0 u = \sum_{i=1}^m a_i \frac{\partial u}{\partial x_i} + a_0 u = f ; \tag{1.3.1} \]

    $\mathbf{a} = (a_1, \dots, a_m)^T \neq \mathbf{0}$ and $a_0$ are the coefficients of the equation, whereas $f$ is a given function.

    An alternative formulation of the equation is the quasi-divergence form

    \[ \nabla \cdot (\mathbf{a} u) + a_0 u = \sum_{i=1}^m \frac{\partial}{\partial x_i}(a_i u) + a_0 u = f. \]

    If the coefficients $a_i$ ($i = 1, \dots, m$) are differentiable, the two formulations are equivalent by the differentiation rule of a product, up to a different definition of the zeroth-order coefficient $a_0$.

    We want to show that (1.3.1) is equivalent to a family of ordinary differential equations. We write $\mathbf{a} = \|\mathbf{a}\|\, \hat{\mathbf{a}}$, with $\hat{\mathbf{a}}$ having unit Euclidean norm, and we denote by $\frac{\partial u}{\partial \hat{\mathbf{a}}} = \hat{\mathbf{a}} \cdot \nabla u$ the directional derivative of $u$ along $\hat{\mathbf{a}}$. Then, (1.3.1) becomes

    \[ \|\mathbf{a}\|\, \frac{\partial u}{\partial \hat{\mathbf{a}}} + a_0 u = f, \]

    which is a family of ordinary differential equations in the directions of $\hat{\mathbf{a}}$. To be more explicit, let us assume that the coefficients $a_i$ ($i = 1, \dots, m$) are bounded, continuously differentiable functions in the closure $\overline{O}$ of the region $O$ in which the equation is set. Let us introduce the characteristic curves of the equation, i.e., the curves $\mathbf{x} = \mathbf{x}(s)$ defined as the solutions of the autonomous ordinary differential system

    \[ \frac{d\mathbf{x}}{ds} = \mathbf{a}(\mathbf{x}). \tag{1.3.2} \]

    Here, $s$ is a real variable which parametrizes each curve. A classical result in the theory of ordinary differential equations (see, e.g., (??)) guarantees that, under the assumptions made on the coefficients, for each $\mathbf{x}_0 \in O$ there exists exactly one characteristic curve passing through $\mathbf{x}_0$; it is defined as the solution $\mathbf{x} = \mathbf{x}(s; \mathbf{x}_0)$ of the Cauchy problem

    \[ \begin{cases} \dfrac{d\mathbf{x}}{ds} = \mathbf{a}(\mathbf{x}), \\ \mathbf{x}(0) = \mathbf{x}_0. \end{cases} \]

    The solution exists for positive and negative values of the parameter $s$, until $\mathbf{x}$ reaches the boundary $\partial O$. Note that the characteristics only depend on the principal part of the operator.

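    Now let $u$ be a (classical) solution of (1.3.1), and let us consider its restriction $\tilde{u} = \tilde{u}(s) = u(\mathbf{x}(s))$ to a characteristic curve. By the chain rule and (1.3.2), one has

    \[ \frac{d\tilde{u}}{ds} = \sum_{i=1}^m \frac{\partial u}{\partial x_i} \frac{dx_i}{ds} = \mathbf{a} \cdot \nabla u. \]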

    Figure 1.1: Decomposition of the boundary of a channel $O$ into inflow boundary $\partial O_-$, characteristic boundary $\partial O_0$ and outflow boundary $\partial O_+$

    It follows that $\tilde{u}$ can be determined by solving the linear ordinary differential equation

    \[ \frac{d\tilde{u}}{ds} + \tilde{a}_0 \tilde{u} = \tilde{f} \tag{1.3.3} \]

    on each characteristic curve (again, the symbol $\tilde{\ }$ indicates restriction to the characteristic curve); solvability is guaranteed if, for instance, $a_0$ and $f$ are bounded and continuous in $O$. Furthermore, $\tilde{u}$ can be uniquely determined by prescribing its value at one point of each characteristic curve.
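    The reduction of (1.3.1) to ordinary differential equations along characteristics can be sketched numerically: integrate the system (1.3.2) for $\mathbf{x}(s)$ together with equation (1.3.3) for the restricted solution. The sketch below is an illustration only. The classical fourth-order Runge-Kutta scheme, the function names, and the sample data (rotating field $\mathbf{a} = (-x_2, x_1)^T$, $a_0 = 1$, $f = 0$) are all assumptions made here, not part of the notes.

```python
import math


def rk4_step(rhs, y, h):
    """One classical fourth-order Runge-Kutta step for y' = rhs(y)."""
    k1 = rhs(y)
    k2 = rhs([yi + 0.5 * h * ki for yi, ki in zip(y, k1)])
    k3 = rhs([yi + 0.5 * h * ki for yi, ki in zip(y, k2)])
    k4 = rhs([yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + h / 6.0 * (p + 2.0 * q + 2.0 * r + w)
            for yi, p, q, r, w in zip(y, k1, k2, k3, k4)]


def integrate_along_characteristic(a, a0, f, x0, u0, s_end, n_steps=1000):
    """Integrate dx/ds = a(x) and du/ds + a0(x) u = f(x) from s = 0 to s = s_end."""
    def rhs(y):
        x, u = y[:-1], y[-1]
        return list(a(x)) + [f(x) - a0(x) * u]

    y = list(x0) + [u0]
    h = s_end / n_steps
    for _ in range(n_steps):
        y = rk4_step(rhs, y, h)
    return y[:-1], y[-1]


# Sample data (assumed): characteristics are circles around the origin,
# and along each of them the solution decays like exp(-s).
x_end, u_end = integrate_along_characteristic(
    a=lambda x: (-x[1], x[0]),
    a0=lambda x: 1.0,
    f=lambda x: 0.0,
    x0=(1.0, 0.0),
    u0=2.0,
    s_end=math.pi / 2.0,
)
```

    For these data the characteristic issuing from $(1, 0)$ reaches $(0, 1)$ at $s = \pi/2$, where the restricted solution equals $2 e^{-\pi/2}$; the computed `x_end` and `u_end` reproduce these values up to discretization error.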

    A situation of particular interest is the following one. Let $O$ be smooth enough so that the unit vector $\mathbf{n} = \mathbf{n}(\mathbf{x})$ normal to $\partial O$ exists at each point $\mathbf{x} \in \partial O$; we assume that $O$ is locally on one side of $\partial O$, and $\mathbf{n}$ is pointing outwards. Let us introduce the inflow boundary of $O$ as the set

    \[ \partial O_- := \{ \mathbf{x} \in \partial O : (\mathbf{a} \cdot \mathbf{n})(\mathbf{x}) < 0 \} . \tag{1.3.4} \]

    The terminology comes from the fact that if $\mathbf{a}$ is the (Eulerian) velocity of fluid particles, then $\partial O_-$ is the portion of the boundary where the fluid is entering the region $O$. The sets $\partial O_+$ (outflow boundary) and $\partial O_0$ (characteristic boundary) are defined similarly, with $>$ and $=$, respectively.

    Now, suppose that each point in $O$ is reached by a characteristic curve issuing from $\partial O_-$ (see Figure 1.1). Then, we can prescribe the value of $u$ at each point in $\partial O_-$ and uniquely solve the set of equations (1.3.3), getting $u$ at each point in $O$. In other words, given a function $g$ on $\partial O_-$, the boundary value problem

    \[ \begin{cases} \mathbf{a} \cdot \nabla u + a_0 u = f & \text{in } O, \\ u = g & \text{on } \partial O_- \end{cases} \tag{1.3.5} \]

    admits a unique solution.

    Before presenting an example, we anticipate that in Chapter ?? we shall see that this problem is indeed solvable under weaker assumptions on the data (the domain, the coefficients of the operator and the right-hand sides $f$ and $g$).

    Example 1.3.1. Consider the simple, constant coefficient equation

    \[ u_t + a u_x = 0 \tag{1.3.6} \]

    in the variables $(x_1, x_2) = (x, t)$. Thus, $\mathbf{a} = (a, 1)^T$ and $a_0 = 0$. The characteristic curves are defined by the relations

    \[ \frac{dx}{ds} = a, \qquad \frac{dt}{ds} = 1. \]

    Eliminating $s$, we get

    \[ x - at = \text{constant}; \tag{1.3.7} \]

    in other words, the characteristics are straight lines in the plane $(x, t)$ having slope $1/a$ (see Fig. 1.2). The solution $u$ is constant along these lines. Thus, the equation models the propagation of a signal in the $x$-direction, with speed $a$: a signal issued at time $t_0$ from position $x_0$ is received at time $t_0 + \Delta t$ at position $x_0 + a\Delta t$ (see Fig. 1.3). Indeed,

    \[ u(x_0, t_0) = u(x_0 + a\Delta t,\, t_0 + \Delta t). \]

    Figure 1.2: Characteristics in the $(x, t)$-plane (the line $t = t_0 + \frac{1}{a}(x - x_0)$ joins the point $(x_0, t_0)$ to the point $(x_0 + a\Delta t,\, t_0 + \Delta t)$)

    At first, let us suppose that $O$ is the half-plane $\{ (x, t) : t > 0 \}$. Since $\mathbf{n} = (0, -1)^T$ on $\partial O = \{ (x, 0) : x \in \mathbb{R} \}$, we have $\mathbf{a} \cdot \mathbf{n} = -1$ therein, so that $\partial O_- = \partial O$, i.e., all the boundary is inflow. Thus, we prescribe the value $u_0$ of $u$ on $\partial O$, i.e., at the initial time $t = 0$. In this case, (1.3.5) reads as

    \[ \begin{cases} u_t + a u_x = 0, & x \in \mathbb{R},\ t > 0, \\ u(x, 0) = u_0(x), & x \in \mathbb{R}, \end{cases} \tag{1.3.8} \]

    which is more properly called an initial value problem. Given a point $(\bar{x}, \bar{t}) \in O$, the characteristic line passing through it originates from the point $(x_0, 0) \in \partial O$ such that $\bar{x} - a\bar{t} = x_0$ (see (1.3.7) and Fig. 1.4). Since $u$ is constant on this line, we have $u(\bar{x}, \bar{t}) = u(x_0, 0) = u_0(x_0) = u_0(\bar{x} - a\bar{t})$.

    Figure 1.3: Propagation of a signal from time $t = t_0$ to time $t = t_0 + \Delta t$ (two panels, showing the profiles $u(x, t_0)$ and $u(x, t_0 + \Delta t)$ of a signal $S$ located at $x_0$ and at $x_0 + a\Delta t$, respectively)

    Figure 1.4: Solution of the initial value problem

    Dropping the bars on $x$ and $t$, we get the explicit formula for the solution of the initial value problem (1.3.8)

    \[ u(x, t) = u_0(x - at), \qquad \text{for all } (x, t) \in O. \tag{1.3.9} \]

    It is trivial to check that if $u_0$ is continuously differentiable, then $u$ is continuously differentiable with respect to $x$ and $t$; thus, $u$ is a classical solution of the partial differential equation. On the other hand, suppose that $u_0$ has a discontinuity at a point $x_0$; then, $u$ will have a discontinuity across the characteristic line $x - at = x_0$ issued at $x_0$. In other words, singularities propagate along characteristic curves. This very important property is quite general, as it holds for all first order equations. Obviously, the function defined by (1.3.9) is not a strong solution of (1.3.6) in $O$; however, it is a weak solution.

    Finally, suppose that $O$ is the semi-infinite strip $O = \{ (x, t) : 0 < x < 1,\ t > 0 \}$. For the sake of definiteness, assume that $a > 0$. Then,

    \[ \partial O_- = \{ (x, 0) : 0 \le x < 1 \} \cup \{ (0, t) : t > 0 \} \]

    (note that the normal vector to $\partial O$ does not exist at the origin, yet this boundary point is an inflow point for the equation). We prescribe the value $u_0 = u_0(x)$ at time $t = 0$ and the value $g = g(t)$ at the left endpoint of the interval $(0, 1)$; no condition has to be prescribed at the right endpoint. Thus, we consider the initial-boundary value problem

    \[ \begin{cases} u_t + a u_x = 0, & 0 < x < 1,\ t > 0, \\ u(x, 0) = u_0(x), & 0 < x < 1, \\ u(0, t) = g(t), & t > 0. \end{cases} \tag{1.3.10} \]

    In order to solve this problem, let us fix a point $(\bar{x}, \bar{t}) \in O$. If $\bar{x} \ge a\bar{t}$, then the characteristic passing through $(\bar{x}, \bar{t})$ meets $\partial O_-$ at the point $(x_0, 0)$, with $x_0 = \bar{x} - a\bar{t}$; hence, as above, $u(\bar{x}, \bar{t}) = u_0(\bar{x} - a\bar{t})$. On the other hand, if $\bar{x} < a\bar{t}$, then the characteristic passing through $(\bar{x}, \bar{t})$ meets $\partial O_-$ at the point $(0, t_0)$, with $t_0 = \bar{t} - \bar{x}/a$ (see Fig. 1.5); hence, $u(\bar{x}, \bar{t}) = u(0, t_0) = g(t_0) = g(\bar{t} - \bar{x}/a)$. We conclude that the solution of the initial-boundary value problem (1.3.10) is

    \[ u(x, t) = \begin{cases} u_0(x - at) & \text{if } x \ge at, \\ g(t - x/a) & \text{if } x < at. \end{cases} \]

    Note that if the data $u_0$ and $g$ do not match properly at the origin, a singularity propagates along the characteristic line $x = at$.

    Figure 1.5: Solution of the initial-boundary value problem

    If the coefficient $a$ is strictly negative, the boundary condition $g$ is enforced at the right endpoint of the interval $(0, 1)$.
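    The solution formula of the initial-boundary value problem (1.3.10) translates directly into code. The following Python sketch assumes $a > 0$; the function name `solve_ibvp` and the sample data `u0` and `g` are illustrative choices made here.

```python
def solve_ibvp(u0, g, a, x, t):
    """Evaluate the solution of (1.3.10) at (x, t), 0 < x < 1, t > 0, for a > 0."""
    if a <= 0:
        raise ValueError("this sketch assumes a > 0")
    if x >= a * t:
        # The characteristic through (x, t) starts on the initial line t = 0.
        return u0(x - a * t)
    # Otherwise it starts on the left boundary x = 0 at time t - x / a.
    return g(t - x / a)


def u0(x):
    return x ** 2  # sample initial datum (assumed)


def g(t):
    return t       # sample boundary datum (assumed); g(0) = u0(0) = 0


a = 2.0
from_initial_line = solve_ibvp(u0, g, a, x=0.75, t=0.125)  # x - a t = 0.5
from_left_boundary = solve_ibvp(u0, g, a, x=0.25, t=0.5)   # t - x / a = 0.375
```

    The first point lies below the characteristic $x = at$ through the origin and picks up the initial datum; the second lies above it and picks up the boundary datum.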

    The concept of characteristic line introduced above is a particular case of the more general concept of characteristic manifold. An $(m - 1)$-dimensional manifold $\Gamma$ (a line in two dimensions, a surface in three dimensions, and so on) contained in $O$ is said to be non-characteristic for equation (1.3.1) whenever the following property holds: if one prescribes the value of $u$ on $\Gamma$, then $u$ is uniquely determined by the partial differential equation in a neighborhood of $\Gamma$. As a first step, one aims at determining the gradient of $u$ on $\Gamma$; then, if the manifold, the coefficients and the data are smoother and smoother, one can differentiate the equation to get derivatives of $u$ on $\Gamma$ of higher and higher order; at last, the condition of real analyticity leads to the representation of $u$ in terms of its Taylor series in a neighborhood of each point in $\Gamma$ (Cauchy-Kowalewska Theorem). Confining ourselves to the determination of the gradient of $u$ on $\Gamma$, we observe that the directional derivative of $u$ along any tangential vector to $\Gamma$ is uniquely determined by the prescribed value of $u$ therein. Therefore, the differential equation should allow one to express the derivative of $u$ along a non-tangential direction to $\Gamma$ in terms of the value of $u$ on the manifold. In other words, denoting by $\mathbf{n}$ the normal vector to $\Gamma$, one should have $\mathbf{a} \cdot \mathbf{n} \neq 0$ on $\Gamma$. This motivates the following

    Definition 1.3.2. Any smooth manifold $\Gamma$ in $O$ whose normal vector $\mathbf{n}$ satisfies

    \[ \mathbf{a} \cdot \mathbf{n} = 0 \quad \text{on } \Gamma \]

    is called a characteristic manifold for equation (1.3.1).

    It is easy to check that characteristic curves lie on characteristic manifolds. Furthermore, the inflow boundary $\partial O_-$ of $O$ is obviously a non-characteristic manifold.

    1.4 Linear Second Order Equations

    The most general linear second order partial differential equation reads as follows:

    \[ -\sum_{i,j=1}^m a_{ij} \frac{\partial^2 u}{\partial x_i \partial x_j} + \sum_{i=1}^m a_i \frac{\partial u}{\partial x_i} + a_0 u = f \tag{1.4.1} \]

    (the choice of the minus sign in front of the principal part will be motivated in the sequel). We actually consider the equation in the quasi-divergence form

    \[ -\sum_{i,j=1}^m \frac{\partial}{\partial x_i}\left( a_{ij} \frac{\partial u}{\partial x_j} \right) + \sum_{i=1}^m \frac{\partial}{\partial x_i}(a_i u) + a_0 u = f. \tag{1.4.2} \]

    As already mentioned above, often the equation is derived in this form; if not, we can transform (1.4.1) into (1.4.2) by an appropriate modification of the lower order coefficients $a_i$ ($i = 0, 1, \dots, m$).

    To simplify the notation, let us introduce the square matrix of order $m$

    \[ A := (a_{ij})_{1 \le i,j \le m}. \tag{1.4.3} \]

    Since $u_{x_i x_j} = u_{x_j x_i}$ for any twice continuously differentiable function, it is not restrictive to assume that $a_{ij} = a_{ji}$ for all $i$ and $j$, i.e., to assume that the matrix $A$ is symmetric. Indeed, if the matrix is not symmetric, we write

    \[ a_{ij} u_{x_i x_j} + a_{ji} u_{x_j x_i} = \frac{1}{2}(a_{ij} + a_{ji})\, u_{x_i x_j} + \frac{1}{2}(a_{ij} + a_{ji})\, u_{x_j x_i} , \]

    i.e., we replace $A$ by $\frac{1}{2}(A + A^T)$. As before, $\mathbf{a} = (a_1, \dots, a_m)^T$ denotes the coefficients of the first order part. Then, (1.4.2) is compactly written as

    \[ Lu = -\nabla \cdot (A \nabla u) + \nabla \cdot (\mathbf{a} u) + a_0 u = f , \tag{1.4.4} \]

    or, equivalently,

    \[ Lu = -\nabla^T (A \nabla u) + \nabla^T (\mathbf{a} u) + a_0 u = f . \tag{1.4.5} \]

    A linear second order differential equation can be classified according to the structure of its principal part. This classification is very important: indeed, the type of the equation influences the kind of boundary and/or initial conditions which are admissible for the equation, the relevant properties of the solution, as well as the techniques for solving the equation (analytically or numerically).

    The classification is accomplished by looking at the sign of the eigenvalues of the coefficient matrix $A$ (recall that $A$ is symmetric, so all its eigenvalues are real). Note that since the coefficients may depend on $\mathbf{x}$, the type of the equation may vary from point to point. Let us consider $A = A(\mathbf{x})$ at a fixed point $\mathbf{x} \in O$.

    Three situations are most commonly encountered in applications:

    (i) all the eigenvalues of $A$ are nonzero, and they all have the same sign; in this case we say that the operator $L$ (or the equation (1.4.4)) is of elliptic type at $\mathbf{x}$;

    (ii) precisely one eigenvalue of $A$ is zero, while the others all have the same sign; in this case we say that the operator $L$ is of parabolic type at $\mathbf{x}$;

    (iii) all the eigenvalues of $A$ are nonzero, and precisely one eigenvalue has a different sign with respect to the others; in this case we say that the operator $L$ is of hyperbolic type at $\mathbf{x}$.

    In two independent variables, this classification is exhaustive (since, by assumption, $A$ cannot be the null matrix). The terminology comes from the fact that the level curves in the $(\xi_1, \xi_2)$-plane of the associated quadratic form

    \[ Q(\boldsymbol{\xi}) = \boldsymbol{\xi}^T A\, \boldsymbol{\xi}, \qquad \boldsymbol{\xi} = (\xi_1, \xi_2)^T \]

    are ellipses, or degenerate parabolae, or hyperbolae, depending on whether the operator $L$ is elliptic, or parabolic, or hyperbolic.


    Example 1.4.1. The Poisson equation (1.2.7) is elliptic, the heat equation (1.2.8) is parabolic, whereas the wave equation (1.2.9) is hyperbolic. Obviously, the type of each equation is the same at all points in the plane.

    In contrast, the Tricomi equation (1.2.10) is of variable type: it is elliptic in the upper half plane, parabolic on the axis $y = 0$ and hyperbolic in the lower half plane.
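    In two independent variables the classification can be sketched in Python using the closed-form eigenvalues of a symmetric $2 \times 2$ matrix. The function names and the tolerance `tol` are assumptions made here; the matrices follow the sign convention of (1.4.1), so that, for instance, the Tricomi equation (1.2.10) corresponds to $A = -\operatorname{diag}(1, y)$.

```python
import math


def eigenvalues_sym2(a11, a12, a22):
    """Eigenvalues of the symmetric matrix [[a11, a12], [a12, a22]], in increasing order."""
    mean = 0.5 * (a11 + a22)
    radius = math.hypot(0.5 * (a11 - a22), a12)
    return mean - radius, mean + radius


def classify(a11, a12, a22, tol=1e-12):
    """Type of the operator whose principal part has coefficient matrix A."""
    lo, hi = eigenvalues_sym2(a11, a12, a22)
    if abs(lo) <= tol and abs(hi) <= tol:
        raise ValueError("A must not be the null matrix")
    if abs(lo) <= tol or abs(hi) <= tol:
        return "parabolic"
    return "elliptic" if lo * hi > 0 else "hyperbolic"


def tricomi_type(y):
    """Type of the Tricomi equation u_xx + y u_yy = f at height y."""
    return classify(-1.0, 0.0, -y)


poisson = classify(1.0, 0.0, 1.0)   # -(u_xx + u_yy): A = diag(1, 1)
heat = classify(1.0, 0.0, 0.0)      # u_t - u_xx in (x, t): A = diag(1, 0)
wave = classify(1.0, 0.0, -1.0)     # u_tt - u_xx in (x, t): A = diag(1, -1)
```

    Calling `tricomi_type` with a positive, zero and negative argument reproduces the change of type described in Example 1.4.1.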

    In three or more independent variables, other situations may occur. If $A$ has two or more zero eigenvalues and the remaining ones are of one sign, we say that the operator is ultra-parabolic. If two or more eigenvalues are of one sign, whereas two or more remaining ones are of the opposite sign, we say that $L$ is ultra-hyperbolic. We shall not consider these cases further on.

    We now use the classification introduced above to reduce the general second order equation (1.4.4) to a canonical form. To this end, we shall make the simplifying assumption that the coefficients of the principal part are constant (otherwise, one can modify the arguments below by freezing the coefficients in a neighborhood of each point $\mathbf{x} \in O$).

    Denote by $\lambda_i$ ($i = 1, \dots, m$) the eigenvalues of $A$, and let $\mathbf{w}_i$ be the corresponding eigenvectors, which can be chosen to form a complete orthonormal set since $A$ is symmetric. Define the diagonal matrix $\Lambda := \operatorname{diag}(\lambda_1, \dots, \lambda_m)$, as well as the orthogonal matrix $S := (\mathbf{w}_1, \dots, \mathbf{w}_m)$. The eigenvalue-eigenvector relations, written as $AS = S\Lambda$, yield the diagonalization of $A$

    \[ S^T A\, S = \Lambda. \tag{1.4.6} \]

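    Now, let us fix a point $\bar{\mathbf{x}} \in O$ and let us make the change of independent variable

    \[ \mathbf{y} = \bar{\mathbf{x}} + S^T(\mathbf{x} - \bar{\mathbf{x}}). \]

    Denoting by $\nabla_x$ the gradient in the $\mathbf{x}$-variable and defining $\nabla_y$ similarly, we have by the chain rule

    \[ \nabla_x = S \nabla_y . \]

    Indeed, if $V = V(\mathbf{y})$ is a given function and we set $v(\mathbf{x}) = V(\bar{\mathbf{x}} + S^T(\mathbf{x} - \bar{\mathbf{x}}))$, we have

    \[ \frac{\partial v}{\partial x_i} = \sum_{j=1}^m \frac{\partial V}{\partial y_j} \frac{\partial y_j}{\partial x_i} = \sum_{j=1}^m s_{ij} \frac{\partial V}{\partial y_j}, \]

    since $y_j = \bar{x}_j + \sum_{i=1}^m (S^T)_{ji} (x_i - \bar{x}_i) = \bar{x}_j + \sum_{i=1}^m s_{ij} (x_i - \bar{x}_i)$. Substituting into (1.4.5) gives

    \[ Lu = -\nabla_y^T \left( S^T A\, S\, \nabla_y u \right) + \nabla_y^T \left[ S^T (\mathbf{a} u) \right] + a_0 u = f ; \]

    recalling (1.4.6) and setting $\tilde{\mathbf{a}} := S^T \mathbf{a}$, we obtain

    \[ Lu = -\nabla_y^T (\Lambda \nabla_y u) + \nabla_y^T (\tilde{\mathbf{a}} u) + a_0 u = f. \tag{1.4.7} \]

    Thus, we have diagonalized the principal part of the operator $L$, i.e.,

    \[ Lu = -\sum_{i=1}^m \lambda_i \frac{\partial^2 u}{\partial y_i^2} + \text{lower order terms}. \]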

    In order to proceed, we consider the three main types of equations introduced above.


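    (i) Elliptic equations.

    If the equation is elliptic, we can assume (possibly after changing the sign of the equation) that all the eigenvalues of $A$ are positive. Then, we set

    \[ D := \operatorname{diag}\left( \frac{1}{\sqrt{\lambda_1}}, \dots, \frac{1}{\sqrt{\lambda_m}} \right) \]

    and we make the further change of variable $\mathbf{z} = \bar{\mathbf{x}} + D(\mathbf{y} - \bar{\mathbf{x}})$, which implies $\nabla_y = D \nabla_z$. Thus, setting $\hat{\mathbf{a}} := D\tilde{\mathbf{a}}$, (1.4.7) becomes

    \[ Lu = -\Delta_z u + \nabla_z^T (\hat{\mathbf{a}} u) + a_0 u = f, \]

    where $\Delta_z$ is the Laplacian in the $\mathbf{z}$-variable. We conclude that the Laplace operator $-\Delta$ is the canonical form of an elliptic operator.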

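    (ii) Parabolic equations.

    Set $n = m - 1$. Suppose that $\lambda_i > 0$ for $i = 1, \dots, n$, whereas $\lambda_m = 0$. If the last component $\tilde{a}_m$ of the vector $\tilde{\mathbf{a}}$ appearing in (1.4.7) is zero, then the equation does not contain partial derivatives of $u$ with respect to $y_m$: it is an elliptic equation in $n$ variables, the variable $y_m$ acting only as a parameter. On the other hand, if $\tilde{a}_m \neq 0$, we set

    \[ D := \operatorname{diag}\left( \frac{1}{\sqrt{\lambda_1}}, \dots, \frac{1}{\sqrt{\lambda_n}}, \frac{1}{\tilde{a}_m} \right) \]

    and we make the change of variable $\mathbf{z} = \bar{\mathbf{x}} + D(\mathbf{y} - \bar{\mathbf{x}})$. Writing $\mathbf{z} = (z_1, \dots, z_n, t) = (\mathbf{z}', t)$ and denoting by $\hat{\mathbf{a}}$ the first $n$ components of the vector $D\tilde{\mathbf{a}}$, we transform (1.4.7) into the form

    \[ D_t u - \Delta_{z'} u + \nabla_{z'}^T (\hat{\mathbf{a}} u) + a_0 u = f. \]

    We conclude that the heat operator $D_t - \Delta$ is the canonical form of a (genuinely) parabolic operator.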

    (iii) Hyperbolic equations.

    Set again n = m 1. Suppose now that i > 0 for i = 1, . . . , n, whereas m < 0. Define

    D := diag

    (11, . . . ,

    1n

    ,1m

    )and perform the same change of variable as in the parabolic case. Setting a :=Da, (1.4.7)becomes

    D2ttuzu+Tz (au) + a0u = f.We conclude that the wave operator (also termed the DAlembert operator)

    := D2tt

    is the canonical form of a hyperbolic operator.
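The three cases above are distinguished only by the signs of the eigenvalues of $A$. The following is a minimal sketch of that test, assuming $m = 2$ and a constant symmetric matrix $A$; the function names `eigenvalues_sym2` and `classify` and the tolerance `tol` are illustrative choices, not part of the notes.

```python
import math

def eigenvalues_sym2(a11, a12, a22):
    """Eigenvalues of the symmetric matrix [[a11, a12], [a12, a22]], in increasing order."""
    mean = 0.5 * (a11 + a22)
    radius = math.hypot(0.5 * (a11 - a22), a12)
    return mean - radius, mean + radius

def classify(a11, a12, a22, tol=1e-12):
    """Classify the principal part by the signs of the eigenvalues of A (A != 0 assumed)."""
    lam1, lam2 = eigenvalues_sym2(a11, a12, a22)
    if abs(lam1) < tol or abs(lam2) < tol:
        return "parabolic"      # one eigenvalue vanishes
    if lam1 * lam2 > 0:
        return "elliptic"       # both eigenvalues have the same sign
    return "hyperbolic"         # eigenvalues of opposite sign

print(classify(1.0, 0.0, 1.0))    # Laplace operator
print(classify(1.0, 0.0, 0.0))    # heat operator, variables (x, t)
print(classify(1.0, 0.0, -1.0))   # wave operator, variables (x, t)
```

For the parabolic case the sketch only inspects the principal part; whether the equation is genuinely parabolic also depends on the first order coefficient $\tilde{a}_m$, as explained in (ii).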


    1.5 Boundary and Initial Conditions. Characteristics

    Let us consider the simple case of a hyperbolic equation, in dimension $m = 2$. After diagonalization, and assuming that the lower order terms are zero, we have

    $$-\left( \lambda_1 \frac{\partial^2 u}{\partial y_1^2} + \lambda_2 \frac{\partial^2 u}{\partial y_2^2} \right) = f, \tag{1.5.1}$$

    with $\lambda_1 > 0$ and $\lambda_2 < 0$. Let us define $a^2 := \lambda_1 / |\lambda_2| > 0$; setting $x = y_1$, $t = y_2$ and $g = f / |\lambda_2|$, we obtain

    $$D^2_{tt} u - a^2 D^2_{xx} u = g. \tag{1.5.2}$$

    The equation factorizes as

    $$(D_t + a D_x)(D_t - a D_x) u = g, \tag{1.5.3}$$

    which is equivalent to the first order hyperbolic system

    $$(D_t + a D_x) w = g, \tag{1.5.4}$$

    $$(D_t - a D_x) u = w \tag{1.5.5}$$

    (note that the $+$ and $-$ signs can be exchanged in these formulae). Recalling the results of Sect. 1.3, $u$ can be obtained by first integrating (1.5.4) along the characteristics $x - at = $ constant, next integrating (1.5.5) along the characteristics $x + at = $ constant. Actually, the lines of the family $x \pm at = $ constant are called the characteristics of equation (1.5.2). In order to uniquely determine the solution, one can prescribe a condition on $u$ for each characteristic line at each boundary point where it enters the region $O$. Let us detail two examples.

    Example 1.5.1. At first, suppose that $O$ is the half-plane $\{(x, t) : t > 0\}$. Both characteristics enter $O$ at each point in $\partial O$; thus, we prescribe $u$ and a non-tangential derivative of $u$, such as the normal derivative $u_t$, therein. Precisely, we consider the initial value problem

    $$\begin{cases} u_{tt} - a^2 u_{xx} = 0 & x \in \mathbb{R},\ t > 0, \\ u(x, 0) = u_0(x) & x \in \mathbb{R}, \\ u_t(x, 0) = u_1(x) & x \in \mathbb{R}, \end{cases} \tag{1.5.6}$$

    (where, for simplicity, we have chosen $g \equiv 0$). Taking into account (1.5.4), (1.5.5) and noting that $w(x, 0) = (u_t - a u_x)(x, 0) = u_1(x) - a u_0'(x)$, we first integrate along the characteristics $x - at = $ constant to solve the initial value problem

    $$\begin{cases} w_t + a w_x = 0 & x \in \mathbb{R},\ t > 0, \\ w(x, 0) = u_1(x) - a u_0'(x) & x \in \mathbb{R}; \end{cases}$$

    we get $w(x, t) = u_1(x - at) - a u_0'(x - at)$. Next, we integrate along the characteristics $x + at = $ constant to solve the initial value problem

    $$\begin{cases} u_t - a u_x = w & x \in \mathbb{R},\ t > 0, \\ u(x, 0) = u_0(x) & x \in \mathbb{R}. \end{cases}$$

    We get

    $$u(x, t) = u_0(x + at) + \int_0^t w(x + at - as, s)\, ds;$$

    Figure 1.6: The domain of dependence of a point $(x, t)$ (left) and the domain of influence of a point $(x_0, 0)$ (right).

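    substituting the expression of $w$ and making a change of variable in the integral leads to the final form of the solution:

    $$u(x, t) = \frac{1}{2} \left[ u_0(x - at) + u_0(x + at) \right] + \frac{1}{2a} \int_{x - at}^{x + at} u_1(s)\, ds. \tag{1.5.7}$$

    Setting $\varphi(z) = \frac{1}{2} u_0(z) + \frac{1}{2a} \int_0^z u_1(s)\, ds$ and $\psi(z) = \frac{1}{2} u_0(z) + \frac{1}{2a} \int_z^0 u_1(s)\, ds$, we have

    $$u(x, t) = \varphi(x + at) + \psi(x - at).$$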
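Formula (1.5.7) can be checked numerically on data for which the solution is known in closed form. The sketch below is only an illustration: the helper names `integrate` and `dalembert`, the midpoint quadrature, and the choice $u_0 = \sin$, $u_1 = \cos$, $a = 2$ are assumptions made here, not taken from the notes.

```python
import math

def integrate(fn, lo, hi, n=2000):
    """Composite midpoint rule for the integral of fn over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(fn(lo + (k + 0.5) * h) for k in range(n))

def dalembert(u0, u1, a, x, t):
    """Evaluate formula (1.5.7) for the homogeneous wave equation."""
    average = 0.5 * (u0(x - a * t) + u0(x + a * t))
    velocity_part = integrate(u1, x - a * t, x + a * t) / (2.0 * a)
    return average + velocity_part

a = 2.0
x, t = 0.3, 0.7
# With u0 = sin and u1 = cos, the exact solution is
# u(x, t) = sin(x) cos(a t) + cos(x) sin(a t) / a.
approx = dalembert(math.sin, math.cos, a, x, t)
exact = math.sin(x) * math.cos(a * t) + math.cos(x) * math.sin(a * t) / a
print(abs(approx - exact))
```

Only the values of $u_0$ at $x \pm at$ and of $u_1$ on $[x - at, x + at]$ enter the computation.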

    Note that the solution is the superposition of two signals, traveling leftwards and rightwards, respectively, with speed $-a$ and $+a$; also note that $u$ at $(x, t)$ only depends on the initial data on the interval $[x - at, x + at]$. If we had considered our equation with a nonzero right-hand side $g$, then $u(x, t)$ would have depended on the values of $g$ in the triangle

    $$T = \{(\xi, \tau) : 0 \le \tau \le t,\ x - a(t - \tau) \le \xi \le x + a(t - \tau)\}.$$

    Indeed, adapting the computations above to the presence of the right-hand side yields

    $$u(x, t) = \frac{1}{2} \left[ u_0(x - at) + u_0(x + at) \right] + \frac{1}{2a} \int_{x - at}^{x + at} u_1(s)\, ds + \int_0^t ds \int_s^t g(x - a(2\tau - t - s), s)\, d\tau.$$

    We call the region $T$ the domain of dependence of the point $(x, t)$ (see Fig. 1.6, left). Conversely, the initial values at a point $(x_0, 0)$ influence the solution in the angle

    $$A = \{(x, t) : x_0 - at \le x \le x_0 + at\};$$

    this region is called the domain of influence of the point $(x_0, 0)$ (see Fig. 1.6, right).

    This simple example shows that a second order hyperbolic equation describes the propagation and composition of two signals moving at finite speed; the solution depends locally on the data of the problem (the initial data $u_0$ and $u_1$, the right-hand side $g$).

    Example 1.5.2. Let us now consider our equation in the semi-infinite strip

    O = {(x, t) : 0 < x < 1, t > 0}.


    At each point of the spatial boundary $\{(0, t) : t > 0\} \cup \{(1, t) : t > 0\}$, one characteristic is entering the domain and one is leaving it, see Fig. 1.7. Thus, one has to prescribe one boundary condition on $u$; this can be either the value of $u$ or the value of $u_x$ (which is the normal derivative to $\partial O$ therein). For instance, we can consider the following initial-boundary value problem

    $$\begin{cases} u_{tt} - a^2 u_{xx} = 0 & 0 < x < 1,\ t > 0, \\ u(0, t) = \gamma_0(t) & t > 0, \\ u_x(1, t) = \gamma_1(t) & t > 0, \\ u(x, 0) = u_0(x) & 0 < x < 1, \\ u_t(x, 0) = u_1(x) & 0 < x < 1. \end{cases} \tag{1.5.8}$$

    Figure 1.7: The characteristics entering the domain

    In order to motivate the admissibility of the boundary conditions, let us fix a point $P_0 = (0, t_0)$ in $\partial O$ (see Fig. 1.8). If we prescribe $u$ at this point, say $u(0, t_0) = \gamma_0(t_0)$, it is convenient to exchange the signs in (1.5.4), (1.5.5); then, we use the boundary data to integrate $u$ along the characteristic line $x - at = -at_0$ entering $O$ at $P_0$, i.e., we solve

    $$\begin{cases} u_t + a u_x = w \\ u(0, t_0) = \gamma_0(t_0), \end{cases}$$

    with $w$ coming from the inside along the characteristic lines $x + at = $ constant.

    Figure 1.8: Construction of the solution near a boundary point $P_0 = (0, t_0)$


    Conversely, if we prescribe $u_x$ at $P_0$, say $u_x(0, t_0) = \gamma_1(t_0)$, then we factorize the equation as in (1.5.4), (1.5.5) and we observe that $w$ is known at $P_0$. Indeed, $u$ has already been determined for $t \le t_0$, so

    $$u_t(0, t_0) = \lim_{t \to t_0^-} \frac{u(0, t) - u(0, t_0)}{t - t_0};$$

    thus, $w(0, t_0) = u_t(0, t_0) - a \gamma_1(t_0)$ and we can integrate $w$ along the characteristic line $x - at = -at_0$, i.e.,

    $$\begin{cases} w_t + a w_x = 0 \\ w(0, t_0) = u_t(0, t_0) - a \gamma_1(t_0). \end{cases}$$

    For an initial-boundary value problem, the domains of dependence and influence are defined in the obvious way.

    Summarizing, at a point belonging to $\partial O$, we assign as many independent boundary conditions as the number of characteristics entering the domain $O$ for increasing $t$.

    It is instructive to consider the situation in which $\lambda_1$ is fixed and $\lambda_2$ tends to 0. In this case, the speed $a$ tends to infinity, i.e., signals propagate with faster and faster speed. Geometrically, the slopes of the characteristic lines in the $(x, t)$-plane tend to 0, and the domains of dependence and influence of any point get wider and wider. In the limit, eq. (1.5.1) becomes the elliptic equation

    $$-\lambda_1 \frac{\partial^2 u}{\partial x^2} = f$$

    in the sole space variable $y_1 = x$. The solution at each point $x$ depends on the values of the data $f$ at all points $\xi$ in the domain, as well as on the boundary data at all boundary points.

    If a low order term is present in the equation, i.e., if the equation is

    $$-\left( \lambda_1 \frac{\partial^2 u}{\partial x^2} + \lambda_2 \frac{\partial^2 u}{\partial t^2} \right) + a_1 \frac{\partial u}{\partial x} + a_2 \frac{\partial u}{\partial t} = f$$

    with $a_2 \neq 0$, the limit equation for $\lambda_2 \to 0$ is parabolic; the solution at each point $(x, t)$ in the domain depends on all the values of the data $f$ and the boundary data for all $\tau < t$, as well as on the initial condition $u_0$ at $t = 0$. Propagation of signals takes place with infinite speed.

    At last, we briefly deal with the concept of characteristic manifold. For a second order partial differential equation, a manifold $\Gamma \subset O$ is non-characteristic if the prescription of $u$ and $\nabla u$ on $\Gamma$ uniquely determines the Hessian of $u$ (i.e., the set of all its second order partial derivatives) therein, via the differential equation.

    Suppose that the manifold $\Gamma$ is described by an implicit equation $\varphi(x) = 0$, for a smooth $\varphi$. Fix a point $\bar{x}$ on $\Gamma$ and let

    $$n = \frac{\nabla \varphi(\bar{x})}{\|\nabla \varphi(\bar{x})\|}$$

    be the normal vector to $\Gamma$ at $\bar{x}$. Let us make a change of independent variable $y = \bar{x} + R^T (x - \bar{x})$, such that the last coordinate direction is along $n$. Setting $\tilde{\varphi}(y) = \varphi\left( \bar{x} + (R^T)^{-1} (y - \bar{x}) \right)$, we have $R^{-1} \nabla_x \varphi = \nabla_y \tilde{\varphi}$, so we choose $R$ such that $R^{-1} n = e_m = (0, \dots, 0, 1)^T$. The differential equation in the new coordinates becomes

    $$-\nabla_y^T \left( R^T A R \nabla_y u \right) + \text{lower order terms} = f.$$


    We note that all second order derivatives of $u$ except $u_{y_m y_m}$ are determined at $\bar{x}$ by the values of $u$ and $\nabla u$ on $\Gamma$. Thus, in order to get the value of $u_{y_m y_m}$, we must have

    $$\left( R^T A R \right)_{mm} = e_m^T R^T A R\, e_m = n^T A n \neq 0.$$

    Thus, we are led to the following

    Definition 1.5.3. Any smooth $(m - 1)$-dimensional manifold $\Gamma$ in $O$ whose normal vector $n$ satisfies

    $$n^T A n = 0 \quad \text{on } \Gamma$$

    is called a characteristic manifold for equation (1.4.1).

    For instance, the characteristic manifolds of the wave equation (1.5.2) are defined by the equation $a^2 n_x^2 - n_t^2 = 0$ (where $n = (n_x, n_t)^T$), i.e., they are precisely the lines $x \pm at = $ constant.

    The characteristic manifolds for the heat equation (1.2.8) satisfy $n_x^2 = 0$, i.e., they are the lines $t = $ constant.
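The condition of Definition 1.5.3 is a single quadratic form evaluation, which can be sketched as follows. The helper name `quadratic_form`, the value $a = 3$, and the representation of $A$ as a list of rows are assumptions made here for illustration; the matrix for the wave equation is read off from (1.5.2) in the variables $(x, t)$.

```python
def quadratic_form(A, n):
    """Return n^T A n for a square matrix A (list of rows) and a vector n."""
    size = len(n)
    return sum(n[i] * A[i][j] * n[j] for i in range(size) for j in range(size))

a = 3.0
# Wave equation (1.5.2) in the variables (x, t): A = diag(a^2, -1).
A_wave = [[a * a, 0.0], [0.0, -1.0]]
# The line x - a t = constant has normal proportional to (1, -a).
print(quadratic_form(A_wave, [1.0, -a]))     # vanishes: characteristic line
# The line t = constant has normal (0, 1).
print(quadratic_form(A_wave, [0.0, 1.0]))    # nonzero: not characteristic
# Laplace operator: A = identity, so the form equals |n|^2 > 0 for n != 0.
A_laplace = [[1.0, 0.0], [0.0, 1.0]]
print(quadratic_form(A_laplace, [1.0, -a]))
```

The last evaluation illustrates why an operator with a definite matrix $A$ cannot have real characteristic directions.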

    Finally, an elliptic equation has no (real) characteristic manifold. This means that, under appropriate regularity conditions, the Cauchy problem

    $$\begin{cases} Lu = f & \text{in } O_\Gamma, \\ u = u_0 & \text{on } \Gamma, \\ \dfrac{\partial u}{\partial n} = u_1 & \text{on } \Gamma, \end{cases}$$

    is always uniquely solvable in a neighborhood $O_\Gamma$ of any $(m - 1)$-dimensional manifold $\Gamma$. However, the Cauchy problem is not well-posed for an elliptic equation. This means that arbitrarily small changes in the data $u_0$ and $u_1$ may lead to arbitrarily large changes in the solution $u$, as the following example shows.

    Example 1.5.4. Let us consider the Laplace equation in the half-plane

    $$\Delta u = 0 \quad \text{in } O = \{(x, y) \in \mathbb{R}^2 : y > 0\}$$

    and let us prescribe on $\partial O$

    $$u(x, 0) = \frac{\sin nx}{n}, \qquad \frac{\partial u}{\partial n}(x, 0) = -\frac{\partial u}{\partial y}(x, 0) = 0,$$

    for a fixed $n > 0$ (thus, $u(x, y) = u_n(x, y)$). The solution can be found by the ansatz

    $$u(x, y) = \frac{\sin nx}{n}\, \hat{u}(y),$$

    which reduces the problem to a second order ordinary differential equation with two initial conditions:

    $$\begin{cases} \dfrac{1}{n} \hat{u}'' - n \hat{u} = 0 \\ \hat{u}(0) = 1 \\ \hat{u}'(0) = 0. \end{cases}$$

    The result is

    $$\hat{u}(y) = \frac{e^{ny} + e^{-ny}}{2} = \cosh ny,$$


    so that the exact solution of the Laplace problem is

    $$u(x, y) = \frac{\sin nx}{n} \cosh ny.$$

    As $n \to \infty$, the initial data converge to 0 uniformly in $x$, whereas $u$ becomes arbitrarily large in an arbitrarily small neighborhood of $\partial O$.

    For this reason, an elliptic equation is more appropriately supplemented by one boundary condition, involving $u$ and/or $\frac{\partial u}{\partial n}$, at each point of the boundary $\partial O$ of the region $O$ where the equation is set. In this way, one obtains a well-posed problem, as we shall see in Chapter 4.
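The loss of continuous dependence in Example 1.5.4 is easy to observe numerically. The sketch below simply tabulates the size of the data, $\max_x |\sin(nx)/n| = 1/n$, against the size of the solution at a fixed small height $y$, $\max_x |u_n(x, y)| = \cosh(ny)/n$; the function names and the choice $y = 0.1$ are illustrative assumptions.

```python
import math

def data_sup(n):
    """Maximum over x of |u_n(x, 0)| = |sin(n x)| / n."""
    return 1.0 / n

def solution_sup(n, y):
    """Maximum over x of |u_n(x, y)| = cosh(n y) / n."""
    return math.cosh(n * y) / n

y = 0.1
for n in (1, 10, 100):
    # The data shrink like 1/n while the solution at height y grows like exp(n y)/(2 n).
    print(n, data_sup(n), solution_sup(n, y))
```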

    1.6 Exercises

    1.1. Consider the transport equation

    $$\frac{\partial u}{\partial t} + 2 \frac{\partial u}{\partial x} = 0$$

    in the half-plane $\{(x, t) : t > 0\}$, with the initial condition

    $$u(x, 0) = u_0(x) = \begin{cases} 3 & \text{if } x < 0 \\ 1 & \text{if } x > 0. \end{cases}$$

    Show that the function

    $$u(x, t) = \begin{cases} 3 & \text{if } t > \frac{1}{2} x \\ 1 & \text{if } t < \frac{1}{2} x \end{cases}$$

    is a weak, not classical, solution of the equation.

    1.2. Consider the linear equation

    $$\frac{\partial u}{\partial t} + x \frac{\partial u}{\partial x} = 1$$

    and:

    (i) find the general solution;

    (ii) solve the initial value problem in $O = \mathbb{R} \times (0, +\infty)$ with the condition $u(x, 0) = u_0(x)$;

    (iii) solve the initial-boundary value problem first for $x \in [0, 1]$ and then for $x \in [1, 2]$ with the further condition $u = g(t)$ on the inflow boundary.

    1.3. The inviscid Burgers equation

    $$\frac{\partial u}{\partial t} + u \frac{\partial u}{\partial x} = 0$$

    is the simplest example of a nonlinear transport equation.

    (i) Show that the solution $u$ is constant along the characteristics.

    (ii) Deduce that the characteristics are straight lines in the half plane $\{(x, t) : t > 0\}$.

    (iii) Suppose the initial datum $u(x, 0) = u_0(x)$ is prescribed for every $x \in \mathbb{R}$; find the slope of the characteristics.


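    1.4. Classify the following second order equations:

    (i) $\dfrac{\partial^2 u}{\partial x^2} + 3 \dfrac{\partial^2 u}{\partial x \partial y} + \dfrac{\partial^2 u}{\partial y \partial x} + 4 \dfrac{\partial^2 u}{\partial y^2} = f$

    (ii) $\dfrac{\partial^2 u}{\partial x^2} + y \dfrac{\partial^2 u}{\partial x \partial y} = g$.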

  • Chapter 2

    Theory of Distributions

    The theory of distributions was created by Laurent Schwartz in 1944; its main purpose is to extend the results which hold for integrable and differentiable functions to those functions that do not satisfy the necessary conditions of classical regularity.

    2.1 Basic Definitions

    Let $O$ be an open set in $\mathbb{R}^m$; we recall that the set $\mathcal{D}(O)$ has been previously defined as

    $$\mathcal{D}(O) = \{\varphi \in C^\infty(O) \mid \mathrm{supp}\, \varphi \text{ is a compact subset of } O\},$$

    where $\mathrm{supp}\, \varphi$ denotes the support of $\varphi$, i.e., the closure of the set of all points $x$ in $O$ at which $\varphi$ does not vanish:

    $$\mathrm{supp}\, \varphi = \overline{\{x \in O \mid \varphi(x) \neq 0\}};$$

    it is easy to verify that $\mathcal{D}(O)$ is a linear space.

    Let us now introduce the following notion of convergence in $\mathcal{D}(O)$:

    Definition 2.1.1. A sequence $\{\varphi_n\}_{n \ge 0} \subset \mathcal{D}(O)$ is said to converge to $\varphi \in \mathcal{D}(O)$ if:

    (i) there exists a compact set $K \subset O$ which contains all the supports of $\varphi_n$ and $\varphi$;

    (ii) for all multi-integers $\alpha \in \mathbb{N}^m$, the sequence $\{D^\alpha \varphi_n\}_{n \ge 0}$ converges to $D^\alpha \varphi$ uniformly on $K$, i.e.,

    $$\|D^\alpha \varphi_n - D^\alpha \varphi\|_{\infty, K} \to 0 \quad \text{as } n \to \infty.$$

    We are now ready to discuss the concept of distribution.

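    Definition 2.1.2. A distribution (or generalized function) is a linear form

    $$T : \mathcal{D}(O) \to \mathbb{R}$$

    such that if $\{\varphi_n\}_{n \ge 0}$ converges to $\varphi$ in $\mathcal{D}(O)$ then $\{T(\varphi_n)\}_{n \ge 0}$ also converges to $T(\varphi)$ in $\mathbb{R}$ when $n \to \infty$.

    The set of all distributions on $O$ is a linear space denoted by $\mathcal{D}'(O)$. Moreover, the notation $\langle T, \varphi \rangle$ is often used instead of $T(\varphi)$ and it is called a duality form.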


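    Example 2.1.3. Let $f$ be a real-valued and Riemann (or Lebesgue)-integrable function on $O$; let us set

    $$\langle T_f, \varphi \rangle := \int_O f(x) \varphi(x)\, dx \quad \forall \varphi \in \mathcal{D}(O)$$

    and let us verify that $T_f$ is a distribution. To do this, we have to check the properties of the previous definition; in particular:

    (i) $T_f$ is certainly a linear form because it is real-valued and the integral is a linear operator;

    (ii) suppose $\{\varphi_n\}_{n \ge 0} \subset \mathcal{D}(O)$ is a sequence such that $\|\varphi_n - \varphi\|_{\infty, K} \to 0$ when $n \to \infty$ for a certain $\varphi \in \mathcal{D}(O)$; then

    $$\langle T_f, \varphi_n \rangle - \langle T_f, \varphi \rangle = \langle T_f, \varphi_n - \varphi \rangle = \int_O f(x) [\varphi_n(x) - \varphi(x)]\, dx$$

    and so

    $$|\langle T_f, \varphi_n \rangle - \langle T_f, \varphi \rangle| \le \int_K |f(x)|\, |\varphi_n(x) - \varphi(x)|\, dx \le \|\varphi_n - \varphi\|_{\infty, K} \int_K |f(x)|\, dx = G\, \|\varphi_n - \varphi\|_{\infty, K} \to 0 \quad \text{as } n \to \infty,$$

    where $G = \int_K |f(x)|\, dx$ is a finite constant that comes from the hypothesis that $f$ is integrable on $K$. Thus we have $\langle T_f, \varphi_n \rangle \to \langle T_f, \varphi \rangle$ when $n \to \infty$.

    Note that the following equality holds true:

    $$\int_O f(x) [\varphi_n(x) - \varphi(x)]\, dx = \int_K f(x) [\varphi_n(x) - \varphi(x)]\, dx$$

    because the supports of all the $\varphi_n$'s and of $\varphi$ are contained in $K$, so the integrand vanishes on $O \setminus K$. This implies that only a local integrability of $f$ on subdomains of $O$, and not on the whole set $O$, is needed to define the distribution $T_f$.

    Throughout this chapter, we shall refer to this type of distribution as a function-like distribution.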

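    Example 2.1.4 (The Dirac delta). Consider a point $x_0 \in O$; we introduce now the following form

    $$\langle \delta_{x_0}, \varphi \rangle := \varphi(x_0) \quad \forall \varphi \in \mathcal{D}(O)$$

    and we want to verify that it is a distribution in the sense of Definition 2.1.2.

    (i) The linearity is obvious.

    (ii) Let us suppose that $\varphi_n \to \varphi$ in $\mathcal{D}(O)$; by Definition 2.1.1, for all $\alpha$ we have a uniform convergence of $D^\alpha \varphi_n$ to $D^\alpha \varphi$; then for $|\alpha| = 0$ it follows

    $$\max_{x \in K} |\varphi_n(x) - \varphi(x)| \to 0 \quad \text{as } n \to \infty$$

    and so, if $x_0 \in K$:

    $$|\langle \delta_{x_0}, \varphi_n \rangle - \langle \delta_{x_0}, \varphi \rangle| = |\varphi_n(x_0) - \varphi(x_0)| \le \max_{x \in K} |\varphi_n(x) - \varphi(x)| \to 0 \quad \text{as } n \to \infty.$$

    If $x_0 \notin K$, then we directly have $\varphi_n(x_0) - \varphi(x_0) = 0$ for all $n$.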

    Figure 2.1: A piecewise constant approximation of the Dirac delta $\delta_0$.

    Such a distribution is called the Dirac delta at the point $x_0$; it is possible to show (see Exercise 2.1) that it is not a function-like distribution, i.e., there is no function $f$ such that the action of $\delta_{x_0}$ on a test function $\varphi \in \mathcal{D}(O)$ can be expressed as the integral over $O$ of $f$ times $\varphi$.

    After introducing the notion of convergence in $\mathcal{D}(O)$, it would be useful to provide a similar tool for the space $\mathcal{D}'(O)$ too. This is accomplished by the following

    Definition 2.1.5. Let $T, T_n \in \mathcal{D}'(O)$, $n \ge 0$; the sequence $\{T_n\}_{n \ge 0}$ is said to converge to $T$ in the sense of $\mathcal{D}'(O)$ if

    $$\langle T_n, \varphi \rangle \to \langle T, \varphi \rangle \quad \text{as } n \to \infty, \text{ for every } \varphi \in \mathcal{D}(O).$$

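    This definition leads us to an important characterization of the Dirac delta. Let us set $O = \mathbb{R}$ and $T = \delta_0$, $\langle T, \varphi \rangle = \varphi(0)$ for all $\varphi \in \mathcal{D}(\mathbb{R})$; then, for every $n > 0$, let us define the function (see Figure 2.1)

    $$f_n(x) = \begin{cases} n & \text{if } |x| \le \frac{1}{2n} \\ 0 & \text{otherwise.} \end{cases}$$

    We can observe that the integral

    $$\int_{\mathbb{R}} f_n(x)\, dx = \int_{-\frac{1}{2n}}^{\frac{1}{2n}} n\, dx = 1$$

    does not depend on $n$: every function $f_n$ has therefore the same unit area on $\mathbb{R}$. If we now consider the family of distributions $T_{f_n}$, we have:

    $$\langle T_{f_n}, \varphi \rangle = \int_{\mathbb{R}} f_n(x) \varphi(x)\, dx = n \int_{-\frac{1}{2n}}^{\frac{1}{2n}} \varphi(x)\, dx = n\, \frac{1}{n}\, \varphi(x_n) = \varphi(x_n)$$

    where $x_n$ is a point in the interval $\left[ -\frac{1}{2n}, \frac{1}{2n} \right]$ whose existence is guaranteed by the Mean Value Theorem for integrals. It is clear that $x_n \to 0$ when $n \to \infty$; then using the continuity of $\varphi$ gives

    $$\langle T_{f_n}, \varphi \rangle \to \varphi(0) = \langle \delta_0, \varphi \rangle \quad \text{as } n \to \infty.$$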


    Since this argument holds for every $\varphi \in \mathcal{D}(\mathbb{R})$, we conclude that $T_{f_n} \to \delta_0$ in the sense of $\mathcal{D}'(\mathbb{R})$. This shows that, although the Dirac delta cannot be represented by a classical function, it can nevertheless be obtained as a limit of classical functions in the sense of Definition 2.1.5.

    In general, it is easy to check that any sequence $\{f_n\}$ of nonnegative integrable functions satisfying $\int_{\mathbb{R}} f_n(x)\, dx = 1$ and $\mathrm{supp}\, f_n \subset B(0, r_n)$ with $r_n \to 0$ as $n \to \infty$, converges to $\delta_0$ in $\mathcal{D}'(\mathbb{R})$ as $n \to \infty$.
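The convergence $T_{f_n} \to \delta_0$ can be observed on a concrete test function. The following sketch assumes the standard bump function $\exp(-1/(1 - x^2))$ on $(-1, 1)$, shifted by 0.2 so that it is not even about the origin; the names `bump`, `pairing` and the midpoint quadrature are illustrative choices, not part of the notes.

```python
import math

def bump(x):
    """Smooth test function supported in [-1, 1]."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def pairing(n, phi, cells=1000):
    """<T_{f_n}, phi> = n * integral of phi over [-1/(2n), 1/(2n)] (midpoint rule)."""
    lo, hi = -0.5 / n, 0.5 / n
    h = (hi - lo) / cells
    return n * h * sum(phi(lo + (k + 0.5) * h) for k in range(cells))

def phi(x):
    """Shifted bump, so that phi is not symmetric about x = 0."""
    return bump(x - 0.2)

for n in (1, 10, 100, 1000):
    # The difference from <delta_0, phi> = phi(0) shrinks as n grows.
    print(n, abs(pairing(n, phi) - phi(0.0)))
```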

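    Definition 2.1.6. A distribution $T$ is said to be of finite order if there exist $r \in \mathbb{N}$ and a constant $C_r > 0$ such that

    $$\forall \varphi \in \mathcal{D}(O), \quad |\langle T, \varphi \rangle| \le C_r \max_{|\alpha| \le r} \|D^\alpha \varphi\|_{\infty, O}.$$

    The smallest $r$ for which this condition holds is called the order of the distribution.

    Example 2.1.7. Let $T_f \in \mathcal{D}'(O)$; then

    $$|\langle T_f, \varphi \rangle| = \left| \int_O f(x) \varphi(x)\, dx \right| \le \|\varphi\|_{\infty, O} \int_O |f(x)|\, dx$$

    and, if $f$ is integrable over $O$, we have

    $$0 \le \int_O |f(x)|\, dx = C < +\infty$$

    so

    $$|\langle T_f, \varphi \rangle| \le C\, \|\varphi\|_{\infty, O}.$$

    In this case $r = 0$, so $T_f$ is a distribution of order zero.

    It is possible to verify that this is also the order of the Dirac delta $\delta_{x_0}$.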

    Definition 2.1.8. Let $T \in \mathcal{D}'(O)$; the support of $T$ is the smallest closed set $K \subset O$ such that

    $$\forall \varphi \in \mathcal{D}(O), \quad \mathrm{supp}\, \varphi \subset O \setminus K \ \Rightarrow\ \langle T, \varphi \rangle = 0.$$

    This definition states that the support of a distribution $T$ is strictly related to those of test functions. In more detail, the support $K$ of $T$ is the smallest closed set in $O$ that has the following property: every test function whose support does not intersect $K$ sees $T$ as zero.

    For instance, if we take $T = \delta_{x_0}$, $x_0 \in O$, we find $\mathrm{supp}\, \delta_{x_0} = \{x_0\}$ because every test function $\varphi$ whose support does not contain $x_0$ is such that $\varphi(x_0) = 0$ and so $\langle \delta_{x_0}, \varphi \rangle = 0$.

    As another example, let us consider an integrable function $f$ with a compact support in $O$; then $\mathrm{supp}\, T_f = \mathrm{supp}\, f$.

    Example 2.1.9. Consider an open set $O$ in $\mathbb{R}^m$ and let $\Gamma$ be a closed $(m - 1)$-dimensional regular manifold contained in $O$; let $g$ be an integrable function defined on $\Gamma$. Then, the distribution $\delta_{\Gamma, g}$ defined as

    $$\langle \delta_{\Gamma, g}, \varphi \rangle = \int_\Gamma g(\sigma) \varphi(\sigma)\, d\sigma \quad \forall \varphi \in \mathcal{D}(O),$$

    is of order zero with support equal to $\Gamma$.


    2.2 Derivatives of Distributions

    In this section, the main results from the differential theory of distributions are presented. In particular, we shall see, with the aid of many examples, in which sense such a theory represents a generalization of the classical one and what meaning has to be given to the word derivative when referred to a distribution.

    Let us start with this basic definition.

    Definition 2.2.1. Let $\alpha \in \mathbb{N}^m$ and $T \in \mathcal{D}'(O)$; the partial derivative of $T$ of order $\alpha$ is the distribution $D^\alpha T$ whose action on a test function $\varphi \in \mathcal{D}(O)$ is defined as

    $$\langle D^\alpha T, \varphi \rangle = (-1)^{|\alpha|} \langle T, D^\alpha \varphi \rangle.$$

    We can immediately observe that, in the sense of this definition, all the distributions are infinitely differentiable, since the derivative is moved onto the test function, which is of class $C^\infty(O)$.

    The following example will explain the reason for such a definition and where it comes from.

    Example 2.2.2. Let $f \in C^1(O)$ and consider the distribution $T = T_f$; in order to calculate its derivative $D_i T_f$, we set $\alpha = (0, \dots, 0, 1, 0, \dots, 0)$ (where the only component of the multi-integer different from zero is the $i$-th) and then we apply Definition 2.2.1:

    $$\langle D_i T_f, \varphi \rangle = -\langle T_f, D_i \varphi \rangle = -\int_O f(x) \frac{\partial \varphi}{\partial x_i}(x)\, dx = \int_O \frac{\partial f}{\partial x_i}(x) \varphi(x)\, dx = \langle T_{D_i f}, \varphi \rangle$$

    for all $\varphi \in \mathcal{D}(O)$; we recall that in applying the integration-by-parts formula no boundary term appears, since a test function vanishes in a neighborhood of $\partial O$.

    We conclude that $D_i T_f = T_{D_i f}$, i.e., the partial derivative with respect to $x_i$ of the distribution based on the function $f$ is the distribution based on the function $D_i f$, which exists in the classical sense under the hypothesis $f \in C^1(O)$. As we have just seen, this result follows from the integration-by-parts formula and it allows us to calculate the derivatives of a function-like distribution in a somewhat classical way.

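    Example 2.2.3. Let $O = \mathbb{R}$ and consider the function

    $$f(x) = \begin{cases} 2x & \text{if } x \ge 0 \\ x & \text{if } x < 0 \end{cases}$$

    which is not differentiable in the classical sense because of the singularity at the origin. Nevertheless, in the distributional sense we have:

    $$\langle (T_f)', \varphi \rangle = -\langle T_f, \varphi' \rangle = -\int_{\mathbb{R}} f(x) \varphi'(x)\, dx = -\int_{-\infty}^0 x \varphi'(x)\, dx - \int_0^{+\infty} 2x \varphi'(x)\, dx$$

    $$= \int_{-\infty}^0 \varphi(x)\, dx + \int_0^{+\infty} 2 \varphi(x)\, dx = \int_{\mathbb{R}} g(x) \varphi(x)\, dx = \langle T_g, \varphi \rangle \quad \forall \varphi \in \mathcal{D}(\mathbb{R})$$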


    where $g$ is the function

    $$g(x) = \begin{cases} 2 & \text{if } x > 0 \\ 1 & \text{if } x < 0; \end{cases}$$

    then $(T_f)' = T_g$ or, as one often writes, $f' = g$ in the sense of distributions.

    Note that the derivative of $T_f$ is itself a function-like distribution; this depends strictly on the fact that $f$ is continuous on $\mathbb{R}$.

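    Example 2.2.4. Let us now consider the function

    $$f(x) = \begin{cases} 2x + 1 & \text{if } x > 0 \\ x & \text{if } x < 0 \end{cases}$$

    which is discontinuous at the origin. In this case we have:

    $$\langle (T_f)', \varphi \rangle = -\langle T_f, \varphi' \rangle = -\int_{\mathbb{R}} f(x) \varphi'(x)\, dx = -\int_{-\infty}^0 x \varphi'(x)\, dx - \int_0^{+\infty} (2x + 1) \varphi'(x)\, dx$$

    $$= \int_{-\infty}^0 \varphi(x)\, dx + \int_0^{+\infty} 2 \varphi(x)\, dx - \Big[ (2x + 1) \varphi(x) \Big]_0^{+\infty} = \int_{\mathbb{R}} g(x) \varphi(x)\, dx + \varphi(0) \quad \forall \varphi \in \mathcal{D}(\mathbb{R}),$$

    where $g$ is defined as in the example above. Then

    $$\langle (T_f)', \varphi \rangle = \langle T_g, \varphi \rangle + \langle \delta_0, \varphi \rangle \quad \forall \varphi \in \mathcal{D}(\mathbb{R})$$

    and consequently $(T_f)' = T_g + \delta_0$, which is no longer a function-like distribution although $T_f$ is.

    In general, if one has

    $$f(x) = \begin{cases} f_+(x) & \text{if } x > x_0 \\ f_-(x) & \text{if } x < x_0 \end{cases}$$

    with $f_+ \in C^1[x_0, +\infty)$, $f_- \in C^1(-\infty, x_0]$, then

    $$(T_f)' = T_g + |[f]|_{x = x_0}\, \delta_{x_0}, \tag{2.2.1}$$

    where

    $$g(x) = \begin{cases} f_+'(x) & \text{if } x > x_0 \\ f_-'(x) & \text{if } x < x_0 \end{cases}$$

    and $|[f]|_{x = x_0} = f_+(x_0) - f_-(x_0)$ denotes the jump of $f$ at the point $x_0$. Therefore, $(T_f)'$ is a function-like distribution if, and only if, $f$ is continuous at $x_0$; in fact, in this case $|[f]|_{x = x_0} = 0$, which eliminates the delta from the expression (2.2.1).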
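Formula (2.2.1) can be tested numerically on the function of Example 2.2.4 by comparing both sides of the identity on one test function. The sketch below is an illustration under stated assumptions: the test function is the standard bump $\exp(-1/(1 - x^2))$ on $(-1, 1)$, its derivative is coded by hand, and the integrals are approximated with a midpoint rule split at the jump point; none of these names come from the notes.

```python
import math

def phi(x):
    """Smooth test function supported in [-1, 1]."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def dphi(x):
    """Classical derivative of phi."""
    if abs(x) >= 1.0:
        return 0.0
    return -2.0 * x / (1.0 - x * x) ** 2 * phi(x)

def integrate(fn, lo, hi, n=20000):
    """Composite midpoint rule on [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(fn(lo + (k + 0.5) * h) for k in range(n))

# Left-hand side: <(T_f)', phi> = -<T_f, phi'>, with f = x for x < 0 and f = 2x + 1 for x > 0.
lhs = -(integrate(lambda x: x * dphi(x), -1.0, 0.0)
        + integrate(lambda x: (2.0 * x + 1.0) * dphi(x), 0.0, 1.0))

# Right-hand side: <T_g, phi> + jump * phi(0), with g = 1 for x < 0 and g = 2 for x > 0.
jump = 1.0
rhs = (integrate(phi, -1.0, 0.0) + 2.0 * integrate(phi, 0.0, 1.0)
       + jump * phi(0.0))

print(lhs, rhs)
```

Dropping the term `jump * phi(0.0)` makes the two sides disagree by $\varphi(0)$, which is exactly the contribution of the delta in (2.2.1).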

    Example 2.2.5. The Heaviside function is defined as

    $$H(x) = \begin{cases} 1 & \text{if } x > 0 \\ 0 & \text{if } x < 0; \end{cases}$$

    from (2.2.1) it follows $(T_H)' = \delta_0$. The Heaviside function is then a primitive of the Dirac delta in the sense of distributions.

    More often one writes $H' = \delta_0$, where the derivative is, of course, intended in the distributional sense.


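    Example 2.2.6. Consider the Dirac delta $\delta_0 \in \mathcal{D}'(\mathbb{R})$ and let $\varphi \in \mathcal{D}(\mathbb{R})$; from Definition 2.2.1 one has

    $$\langle \delta_0', \varphi \rangle = -\langle \delta_0, \varphi' \rangle = -\varphi'(0)$$

    $$\langle \delta_0'', \varphi \rangle = \langle \delta_0, \varphi'' \rangle = \varphi''(0)$$

    $$\vdots$$

    $$\langle \delta_0^{(k)}, \varphi \rangle = \dots = (-1)^k \varphi^{(k)}(0), \quad \forall k \in \mathbb{N}.$$

    Example 2.2.7. The multidimensional counterpart of the general situation considered in Example 2.2.4 is as follows. Let $O$ be an open set in $\mathbb{R}^m$ and let $\Gamma$ be an $(m - 1)$-dimensional regular manifold contained in $O$, which splits $O$ into $O_-$ and $O_+$, with $O_\pm$ open disjoint sets such that $\partial O_- \cap \partial O_+ = \Gamma$.

    Let the function $f$ satisfy

    $$f(x) = \begin{cases} f_-(x) & \text{if } x \in O_- \\ f_+(x) & \text{if } x \in O_+ \end{cases}$$

    with $f_+ \in C^1(\overline{O_+})$, $f_- \in C^1(\overline{O_-})$. Then, for any $i = 1, \dots, m$, one has

    $$D_i(T_f) = T_{g_i} + \delta_{\Gamma, h_i} \tag{2.2.2}$$

    where

    $$g_i(x) = \begin{cases} D_i f_-(x) & \text{if } x \in O_- \\ D_i f_+(x) & \text{if } x \in O_+ \end{cases}$$

    and

    $$h_i(s) = |[f]|_s\, n_i(s), \quad s \in \Gamma,$$

    where $|[f]|_s$ denotes the jump of $f$ at the point $s$ in going from $O_-$ to $O_+$, and $n_i$ is the $i$-th component of the normal unit vector to $\Gamma$ pointing from $O_-$ to $O_+$.

    We prove the result in the particular case in which $f$ is the Heaviside function associated with the given partition of $O$, i.e.,

    $$H(x) = \begin{cases} 0 & \text{if } x \in O_- \\ 1 & \text{if } x \in O_+. \end{cases}$$

    Let us compute $D_i(T_H)$ in the sense of distributions. Using the divergence theorem (see (3.1.3)), we have

    $$\langle D_i(T_H), \varphi \rangle = -\int_O H(x) \frac{\partial \varphi}{\partial x_i}(x)\, dx = -\int_{O_+} \frac{\partial \varphi}{\partial x_i}(x)\, dx = \int_\Gamma \varphi\, n_i\, d\sigma = \langle \delta_{\Gamma, n_i}, \varphi \rangle,$$

    for all $\varphi \in \mathcal{D}(O)$; this is precisely (2.2.2) in the present situation.

    We refer to Exercise 2.6 for the proof of the general result.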

    2.3 Study of the Laplace Operator in $\mathcal{D}'(O)$

    In this section we study the Laplacian

    $$\Delta = \sum_{i=1}^m \frac{\partial^2}{\partial x_i^2}$$

    as an operator into the space of distributions $\mathcal{D}'(O)$; in particular, we are interested in those functions $g : O \subseteq \mathbb{R}^m \to \mathbb{R}$ whose Laplacian is the Dirac delta $\delta_0$ at the origin.


    Definition 2.3.1. Every function $g = g(x)$, $x \in \mathbb{R}^m$, such that

    $$\Delta g = \delta_0 \quad \text{in } \mathcal{D}'(O)$$

    is said to be a fundamental solution of the Laplacian.

    Let us start with $m = 1$ (dimension 1); if we take the function $u(x) = |x|$, it is easy to verify that $u'(x) = \mathrm{sign}(x)$ and thus $u''(x) = 2\delta_0$, as it immediately follows from (2.2.1). Hence, the function $g(x) = \frac{1}{2} u(x) = \frac{1}{2} |x|$ is a fundamental solution of the Laplacian on $\mathbb{R}$.

    Let us now consider $m = 2$; in this case, it is convenient to use the polar coordinates defined by the transformation

    $$\Phi : [0, +\infty) \times [0, 2\pi) \to \mathbb{R}^2, \qquad (r, \theta) \mapsto (x, y) = (r \cos \theta, r \sin \theta);$$

    since we can think of every function $u = u(x, y)$ as $u(x, y) = u(\Phi(r, \theta)) = U(r, \theta)$, we have $\Delta_{(x, y)} u = \Delta_{(r, \theta)} U$, where $\Delta_{(x, y)}$ and $\Delta_{(r, \theta)}$ denote the Laplacian in cartesian and polar coordinates respectively, with (see Exercise 2.8)

    $$\Delta_{(r, \theta)} = \frac{\partial^2}{\partial r^2} + \frac{1}{r} \frac{\partial}{\partial r} + \frac{1}{r^2} \frac{\partial^2}{\partial \theta^2}. \tag{2.3.1}$$

    If we take the function $u(x, y) = \log \sqrt{x^2 + y^2} = \log r$ and we set $(x, y) \neq (0, 0)$, we obtain from (2.3.1)

    $$\Delta u = \Delta_{(r, \theta)} \log r = -\frac{1}{r^2} + \frac{1}{r^2} = 0;$$

    hence, $\log r$ is a harmonic function in the classical sense everywhere in the plane except at the origin.
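The classical harmonicity of $\log r$ away from the origin can also be checked with a finite-difference Laplacian. This is a minimal sketch; the five-point stencil, the step `h` and the sample points are choices made here for illustration.

```python
import math

def u(x, y):
    """u(x, y) = log r with r = sqrt(x^2 + y^2), defined for (x, y) != (0, 0)."""
    return 0.5 * math.log(x * x + y * y)

def laplacian_fd(fn, x, y, h=1e-3):
    """Five-point finite-difference approximation of the Laplacian at (x, y)."""
    return (fn(x + h, y) + fn(x - h, y) + fn(x, y + h) + fn(x, y - h)
            - 4.0 * fn(x, y)) / (h * h)

for point in ((0.7, -0.4), (1.5, 2.0), (-0.3, 0.9)):
    # Each value is close to zero, up to the O(h^2) truncation error of the stencil.
    print(point, laplacian_fd(u, *point))
```

The check says nothing about the origin itself, where $\log r$ is singular and the distributional computation that follows in the text is needed.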

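    Let us now calculate $\Delta u$ in the sense of distributions; taking $\varphi \in \mathcal{D}(\mathbb{R}^2)$ we have

    $$\langle \Delta u, \varphi \rangle = \langle u, \Delta \varphi \rangle = \int_{\mathbb{R}^2} \log r\, \Delta \varphi\, dx\, dy = \lim_{\varepsilon \to 0^+} \int_{\mathbb{R}^2 \setminus B(0, \varepsilon)} \log r\, \Delta \varphi\, dx\, dy$$

    where $B(0, \varepsilon)$ is the open ball of radius $\varepsilon > 0$ centered at the origin. Applying the integration-by-parts formula gives

    $$\langle \Delta u, \varphi \rangle = \lim_{\varepsilon \to 0^+} \left( -\int_{r > \varepsilon} \nabla \log r \cdot \nabla \varphi\, dx\, dy + \int_{r = \varepsilon} \log r\, \frac{\partial \varphi}{\partial n}\, d\sigma \right)$$

    $$= \lim_{\varepsilon \to 0^+} \left( \int_{r > \varepsilon} \Delta \log r\, \varphi\, dx\, dy - \int_{r = \varepsilon} \frac{\partial}{\partial n} \log r\, \varphi\, d\sigma + \int_{r = \varepsilon} \log r\, \frac{\partial \varphi}{\partial n}\, d\sigma \right)$$

    $$= \lim_{\varepsilon \to 0^+} \left( -\int_{r = \varepsilon} \frac{\partial}{\partial n} \log r\, \varphi\, d\sigma + \int_{r = \varepsilon} \log r\, \frac{\partial \varphi}{\partial n}\, d\sigma \right)$$

    where the result $\Delta \log r = 0$ away from the origin has been used.

    Since $B(0, \varepsilon)$ is a disk, the normal vector to its circumference $r = \varepsilon$ is a radial vector, which allows us to write

    $$\frac{\partial}{\partial n} \log r = -\frac{d}{dr} \log r = -\frac{1}{r}$$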


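    where the minus sign depends only on the fact that $\frac{\partial}{\partial n} \log r = \nabla \log r \cdot n$ should be negative because the two vectors $\nabla \log r$ and $n$ point in opposite directions. Thus

    $$-\int_{r = \varepsilon} \frac{\partial}{\partial n} \log r\, \varphi\, d\sigma = \frac{1}{\varepsilon} \int_{r = \varepsilon} \varphi\, d\sigma = 2\pi\, \frac{1}{2\pi\varepsilon} \int_{r = \varepsilon} \varphi\, d\sigma.$$

    Note that $\frac{1}{2\pi\varepsilon} \int_{r = \varepsilon} \varphi\, d\sigma$ is the mean value that $\varphi$ takes along the circumference $r = \varepsilon$; since $\varphi$ is continuous, it follows

    $$\lim_{\varepsilon \to 0^+} \left( 2\pi\, \frac{1}{2\pi\varepsilon} \int_{r = \varepsilon} \varphi\, d\sigma \right) = 2\pi \varphi(0, 0).$$

    Moreover

    $$\int_{r = \varepsilon} \log r\, \frac{\partial \varphi}{\partial n}\, d\sigma = \log \varepsilon \int_{r = \varepsilon} \frac{\partial \varphi}{\partial n}\, d\sigma$$

    and we have

    $$\left| \int_{r = \varepsilon} \frac{\partial \varphi}{\partial n}\, d\sigma \right| \le \int_{r = \varepsilon} \left| \frac{\partial \varphi}{\partial n} \right| d\sigma = \int_{r = \varepsilon} |\nabla \varphi \cdot n|\, d\sigma \le \int_{r = \varepsilon} \|\nabla \varphi\|\, d\sigma \le \max_{(x, y) \in \mathbb{R}^2} \|\nabla \varphi\| \int_{r = \varepsilon} d\sigma = 2\pi\varepsilon \max_{(x, y) \in \mathbb{R}^2} \|\nabla \varphi\|;$$

    in the third passage, the Cauchy-Schwarz inequality has been used together with the fact that $\|n\| = 1$ (here $\|\cdot\|$ denotes the Euclidean norm in $\mathbb{R}^2$). Since

    $$2\pi \max_{(x, y) \in \mathbb{R}^2} \|\nabla \varphi\| = M$$

    is a real nonnegative finite constant, we conclude that

    $$\left| \log \varepsilon \int_{r = \varepsilon} \frac{\partial \varphi}{\partial n}\, d\sigma \right| \le M \varepsilon\, |\log \varepsilon| \to 0 \quad \text{as } \varepsilon \to 0^+$$

    and finally

    $$\langle \Delta u, \varphi \rangle = 2\pi \varphi(0, 0) = 2\pi \langle \delta_0, \varphi \rangle \quad \forall \varphi \in \mathcal{D}(\mathbb{R}^2)$$

    that is

    $$\Delta u = 2\pi \delta_0 \quad \text{in } \mathcal{D}'(\mathbb{R}^2).$$

    Hence, the function $g(x, y) = \frac{1}{2\pi} u(x, y) = \frac{1}{2\pi} \log \sqrt{x^2 + y^2}$ is a fundamental solution for the Laplacian on $\mathbb{R}^2$.

    In three dimensions, with the aid of the spherical coordinates, it can be found that the function

    $$u(x, y, z) = \frac{1}{\sqrt{x^2 + y^2 + z^2}}$$

    is such that $\Delta u = -4\pi \delta_0$; it is therefore proportional to a fundamental solution on $\mathbb{R}^3$.

    In general, we have the following expressions for the fundamental solutions of the Laplacian:

    $$g(x) = \begin{cases} \dfrac{1}{2}\, r & m = 1 \\[2mm] \dfrac{1}{2\pi} \log r & m = 2 \\[2mm] -\dfrac{1}{4\pi}\, \dfrac{1}{r} & m = 3 \\[2mm] -\dfrac{1}{(m - 2)\, \omega_m}\, \dfrac{1}{r^{m - 2}} & m \ge 4 \end{cases} \qquad r = \|x\| = \sqrt{\sum_{i=1}^m x_i^2} \tag{2.3.2}$$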


    where $\omega_m = \dfrac{2 \pi^{m/2}}{\Gamma(m/2)}$ is the surface area of the unit sphere in $\mathbb{R}^m$.

    It is obvious that adding any harmonic function to $g$, i.e., a function $v$ such that $\Delta v \equiv 0$, leads to another fundamental solution of the Laplacian. Actually, we are more interested in the existence rather than in the uniqueness of the fundamental solutions, since their importance is due to the fact that they provide a powerful tool for solving the following more general problem: find $u$ such that $\Delta u = f$ in $O$, where $f$ is a given bounded function with a compact support and integrable on $O$ (i.e. $f \in L^1(O)$).

    Note that, given any function $g$ such that $\Delta g = \delta_0$, the new function

    $$G(x, y) := g(x - y), \quad x, y \in \mathbb{R}^m$$

    has the following property: if we denote by $\Delta_x$ the Laplacian with respect to the variable $x$, then

    $$\Delta_x G = \delta_y \quad \text{in } \mathcal{D}'(\mathbb{R}^m),$$

    because the singularity of $g$ has now been moved from the origin to the point $y$.

    Let us set

    $$u(x) := (f * g)(x) = \int_O g(x - y) f(y)\, dy = \int_O G(x, y) f(y)\, dy;$$

    if we calculate the Laplacian of $u$ in the sense of distributions we obtain

    $$\langle \Delta u, \varphi \rangle = \langle u, \Delta \varphi \rangle = \int_O u(x) \Delta \varphi(x)\, dx = \int_O \int_O G(x, y) f(y) \Delta \varphi(x)\, dx\, dy = \int_O f(y) \int_O G(x, y) \Delta \varphi(x)\, dx\, dy;$$

    but from Definition 2.2.1 we have

    $$\int_O G(x, y) \Delta \varphi(x)\, dx = \langle G, \Delta \varphi \rangle = \langle \Delta_x G, \varphi \rangle = \langle \delta_y, \varphi \rangle = \varphi(y)$$

    and then

    $$\langle \Delta u, \varphi \rangle = \int_O f(y) \varphi(y)\, dy = \langle f, \varphi \rangle;$$

    since this argument holds for every test function $\varphi \in \mathcal{D}(O)$, we conclude that such a $u$ is a solution of the elliptic equation $\Delta u = f$ in the sense of distributions.
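The construction $u = f * g$ can be tried out in dimension $m = 1$, where $g(z) = \frac{1}{2}|z|$. The sketch below assumes the right-hand side $f(y) = 1 - y^2$ on $[-1, 1]$ (zero elsewhere), evaluates the convolution with a midpoint rule split at $y = x$, and compares a centered second difference of $u$ with $f$; all names and numerical parameters are illustrative choices.

```python
def integrate(fn, lo, hi, n=2000):
    """Composite midpoint rule on [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(fn(lo + (k + 0.5) * h) for k in range(n))

def f(y):
    """Bounded right-hand side supported in [-1, 1]."""
    return 1.0 - y * y if abs(y) <= 1.0 else 0.0

def u(x):
    """u(x) = integral of g(x - y) f(y) dy with g(z) = |z| / 2, for x in [-1, 1]."""
    left = integrate(lambda y: 0.5 * (x - y) * f(y), -1.0, x)    # part with y < x
    right = integrate(lambda y: 0.5 * (y - x) * f(y), x, 1.0)    # part with y > x
    return left + right

def second_difference(fn, x, d=0.05):
    """Centered finite-difference approximation of the second derivative."""
    return (fn(x + d) - 2.0 * fn(x) + fn(x - d)) / (d * d)

for x in (-0.5, 0.0, 0.4):
    # The second difference of u reproduces f up to discretization error.
    print(x, second_difference(u, x), f(x))
```

Here $f$ is continuous, so the distributional identity $u'' = f$ is also a classical one inside the support, which is what the finite difference can detect.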

    2.4 Exercises

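    2.1. Prove that the Dirac delta is not a function-like distribution, i.e., that there is no integrable function $f : O \subseteq \mathbb{R}^m \to \mathbb{R}$ such that

    $$\langle \delta_{x_0}, \varphi \rangle = \int_O f(x) \varphi(x)\, dx, \quad \forall \varphi \in \mathcal{D}(O).$$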


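    2.2. Consider an open set $O$ in $\mathbb{R}^m$ and let $\Gamma$ be an $(m - 1)$-dimensional regular manifold contained in $O$. Moreover, let $g$ be an integrable function defined on $\Gamma$; prove that the formula

    $$\langle T_{\Gamma, g}, \varphi \rangle = \int_\Gamma g(\sigma) \frac{\partial \varphi}{\partial n}(\sigma)\, d\sigma$$

    defines a distribution $T_{\Gamma, g} \in \mathcal{D}'(O)$ which is not function-like.

    2.3. Find all functions $u : \mathbb{R} \to \mathbb{R}$ such that $u' = 2\delta_1 - \delta_3$ in $\mathcal{D}'(\mathbb{R})$.

    2.4. For every $n \ge 1$ let us set

    $$f_n(x) = \begin{cases} 0 & \text{if } x < 0 \text{ or } x > \frac{2}{n} \\ n^2 x & \text{if } 0 \le x \le \frac{1}{n} \\ 2n - n^2 x & \text{if } \frac{1}{n} < x \le \frac{2}{n}; \end{cases}$$

    calculate $\lim_{n \to \infty} f_n$ in $\mathcal{D}'(\mathbb{R})$.

    2.5. Let $\Gamma$ be the straight line in the plane having equation $y = 2x$. Define then the distribution $\delta_\Gamma \in \mathcal{D}'(\mathbb{R}^2)$ such that

    $$\langle \delta_\Gamma, \varphi \rangle = \int_\Gamma \varphi(\sigma)\, d\sigma$$

    for every test function $\varphi \in \mathcal{D}(\mathbb{R}^2)$.

    (i) Find all functions $u : \mathbb{R}^2 \to \mathbb{R}$ such that

    $$\frac{\partial u}{\partial x} = \delta_\Gamma \quad \text{in } \mathcal{D}'(\mathbb{R}^2).$$

    (ii) Find all functions $v : \mathbb{R}^2 \to \mathbb{R}$ such that

    $$\frac{\partial v}{\partial x} + \frac{\partial v}{\partial y} = \delta_\Gamma \quad \text{in } \mathcal{D}'(\mathbb{R}^2).$$

    2.6. Prove the identity (2.2.2) in the general case.

    2.7. Find the distributions $u \in \mathcal{D}'(\mathbb{R}^2)$ such that

    $$\frac{\partial u}{\partial x} = \delta_0 \quad \text{in } \mathcal{D}'(\mathbb{R}^2).$$

    What is the support of $u$?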


  • Chapter 3

    Sobolev Spaces

    3.1 Motivation

    In order to motivate the introduction of the Sobolev space H1(), let us consider the followingDirichlet boundary-value problem for a general second-order elliptic operator Lu:{

    Lu = (Au) + (au) + a0u = f in u = 0 on .

    (3.1.1)

    Here, A, a, a0 and f are known functions defined in ; precisely, A takes its values in the spaceof symmetric and positive-definite matrices of order n, a is a vector-valued function, whereas a0and f are scalar functions.

    We aim at giving a weak (or integral, or variational) formulation of this problem, whichcorresponds to the general form (1.2.16). At the beginning, we will proceed in a formal manner,assuming that all mathematical operations are permitted; then, step by step, we will envisage aset of assumptions on the data of the problem (the coefficients of the operator, the right-handside, the domain) which make the resulting formulation mathematically rigorous.

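    The starting point consists of multiplying the first equation in (3.1.1) by a test function $v$ and integrating over $\Omega$, to get

    $$ -\int_\Omega \nabla\cdot(A\nabla u)\, v + \int_\Omega \nabla\cdot(\mathbf{a}u)\, v + \int_\Omega a_0 u v = \int_\Omega f v . \tag{3.1.2} $$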

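    Next, we perform an integration by parts in the first and second terms on the left-hand side. Precisely, we invoke the divergence theorem

    $$ \int_\Omega \nabla\cdot\mathbf{F} = \int_{\partial\Omega} \mathbf{F}\cdot\mathbf{n} , \tag{3.1.3} $$

    where $\mathbf{F}$ is a vector field and $\mathbf{n}$ is the unit vector which is normal to $\partial\Omega$ and pointing outwards, as well as the differentiation rule for a product

    $$ \nabla\cdot(\boldsymbol{\Phi} v) = (\nabla\cdot\boldsymbol{\Phi})\, v + \boldsymbol{\Phi}\cdot\nabla v , \tag{3.1.4} $$

    where $\boldsymbol{\Phi}$ is a vector field and $v$ is a scalar function. Applying (3.1.3) and (3.1.4) to $\mathbf{F} = \boldsymbol{\Phi} v$ with $\boldsymbol{\Phi} = A\nabla u$, we obtain

    $$ \int_\Omega \nabla\cdot(A\nabla u)\, v + \int_\Omega (A\nabla u)\cdot\nabla v = \int_{\partial\Omega} \mathbf{n}\cdot(A\nabla u)\, v ; $$

    if we introduce the conormal derivative of $u$ on $\partial\Omega$ with respect to $A$, i.e. the function

    $$ \frac{\partial u}{\partial n_A} = \mathbf{n}\cdot(A\nabla u) \tag{3.1.5} $$

    (which coincides with the normal derivative $\frac{\partial u}{\partial n} = \mathbf{n}\cdot\nabla u$ when $A$ is the identity matrix), we get

    $$ -\int_\Omega \nabla\cdot(A\nabla u)\, v = \int_\Omega (A\nabla u)\cdot\nabla v - \int_{\partial\Omega} \frac{\partial u}{\partial n_A}\, v . \tag{3.1.6} $$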

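    On the other hand, applying (3.1.3) and (3.1.4) to $\mathbf{F} = \boldsymbol{\Phi} v$ with $\boldsymbol{\Phi} = \mathbf{a}u$ yields

    $$ \int_\Omega \nabla\cdot(\mathbf{a}u)\, v = -\int_\Omega u\, \mathbf{a}\cdot\nabla v + \int_{\partial\Omega} \mathbf{a}\cdot\mathbf{n}\, u v . \tag{3.1.7} $$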

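    Combining the two previous results, we can write (3.1.2) as

    $$ \int_\Omega (A\nabla u)\cdot\nabla v - \int_\Omega u\, \mathbf{a}\cdot\nabla v + \int_\Omega a_0 u v - \int_{\partial\Omega} \frac{\partial u}{\partial n_A}\, v + \int_{\partial\Omega} \mathbf{a}\cdot\mathbf{n}\, u v = \int_\Omega f v . \tag{3.1.8} $$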

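    Now, we observe that $u$ is required to vanish on $\partial\Omega$; therefore, from now on, we will require that our test functions $v$ vanish on $\partial\Omega$, too (note that functions in $\mathcal{D}(\Omega)$ do satisfy this condition). Then, (3.1.8) simplifies to

    $$ \int_\Omega (A\nabla u)\cdot\nabla v - \int_\Omega u\, \mathbf{a}\cdot\nabla v + \int_\Omega a_0 u v = \int_\Omega f v . \tag{3.1.9} $$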

    Note that this equation only involves first-order partial derivatives of u and v.
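The passage from the strong form (3.1.1) to the weak form (3.1.9) can be checked numerically in one space dimension. The following is only a minimal sketch under stated assumptions: the domain $\Omega = (0,1)$, the constant coefficients `A`, `a`, `a0`, the functions `u` and `v`, and the helper `midpoint` are all hypothetical choices made for illustration, not taken from the text.

```python
import math

def midpoint(f, a, b, n=20000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

# Hypothetical constant coefficients on Omega = (0, 1)
A, a, a0 = 1.0, 2.0, 3.0

# u vanishes on the boundary, as required by the Dirichlet condition
u   = lambda x: math.sin(math.pi * x)
du  = lambda x: math.pi * math.cos(math.pi * x)
d2u = lambda x: -math.pi ** 2 * math.sin(math.pi * x)

# v is a test function vanishing at x = 0 and x = 1
v  = lambda x: x * x * (1.0 - x)
dv = lambda x: 2.0 * x - 3.0 * x * x

# Strong form in 1D: f = Lu = -(A u')' + (a u)' + a0 u
f = lambda x: -A * d2u(x) + a * du(x) + a0 * u(x)

# Left- and right-hand sides of the weak form (3.1.9)
lhs = midpoint(lambda x: A * du(x) * dv(x) - a * u(x) * dv(x) + a0 * u(x) * v(x), 0.0, 1.0)
rhs = midpoint(lambda x: f(x) * v(x), 0.0, 1.0)
print(lhs, rhs)
```

The two printed values should agree up to quadrature error, since the boundary terms in (3.1.8) vanish for this choice of `v`.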

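    Next, we make assumptions on the functions appearing in (3.1.9), so that all integrals therein are guaranteed to be meaningful and finite. On the left-hand side, we have integrals of products of three functions, such as

    $$ \int_\Omega a_{ij}\, \frac{\partial u}{\partial x_j}\, \frac{\partial v}{\partial x_i} \quad \text{or} \quad \int_\Omega a_i\, \frac{\partial v}{\partial x_i}\, u \quad \text{or} \quad \int_\Omega a_0 u v , $$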

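    whereas the right-hand side is the integral of the product of two functions. Thus, we set ourselves in the framework of the Lebesgue Integration Theory, which, in particular, ensures that the product $\varphi\psi$ of two functions is integrable in $\Omega$, i.e., $\varphi\psi \in L^1(\Omega)$ if $\varphi \in L^p(\Omega)$ and $\psi \in L^{p'}(\Omega)$ with $p, p' \in [1,\infty]$ satisfying $\frac1p + \frac1{p'} = 1$; furthermore, the following Hölder inequality holds:

    $$ \int_\Omega |\varphi\psi| \le \Big( \int_\Omega |\varphi|^p \Big)^{1/p} \Big( \int_\Omega |\psi|^{p'} \Big)^{1/p'} = \|\varphi\|_{L^p(\Omega)} \|\psi\|_{L^{p'}(\Omega)} \tag{3.1.10} $$

    (if $p = \infty$, the term $(\int_\Omega |\varphi|^p)^{1/p}$ has to be replaced by $\operatorname{ess\,sup}_\Omega |\varphi|$, and similarly if $p' = \infty$). This result extends to the product of three functions, i.e., $\varphi\psi\chi \in L^1(\Omega)$ if $\varphi \in L^p(\Omega)$, $\psi \in L^{p'}(\Omega)$ and $\chi \in L^{p''}(\Omega)$ with $p, p', p'' \in [1,\infty]$ satisfying $\frac1p + \frac1{p'} + \frac1{p''} = 1$; in this case, one has

    $$ \|\varphi\psi\chi\|_{L^1(\Omega)} \le \|\varphi\|_{L^p(\Omega)} \|\psi\|_{L^{p'}(\Omega)} \|\chi\|_{L^{p''}(\Omega)} . \tag{3.1.11} $$

    The structure of the integrals in (3.1.9) suggests working in a Hilbertian setting, i.e., assuming that $u$, $v$ and their first derivatives belong to $L^2(\Omega)$. More precisely, the previous results tell us that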

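    $\int_\Omega f v$ is well-defined if $f$ and $v \in L^2(\Omega)$;

    $\int_\Omega a_0 u v$ is well-defined if $a_0 \in L^\infty(\Omega)$ and $u, v \in L^2(\Omega)$;

    an integral of the form $\int_\Omega a_i \frac{\partial v}{\partial x_i} u$ is well-defined if $a_i \in L^\infty(\Omega)$, $u \in L^2(\Omega)$ and $\frac{\partial v}{\partial x_i} \in L^2(\Omega)$;

    finally, an integral of the form $\int_\Omega a_{ij} \frac{\partial u}{\partial x_j} \frac{\partial v}{\partial x_i}$ is well-defined if $a_{ij} \in L^\infty(\Omega)$, $\frac{\partial u}{\partial x_j}$ and $\frac{\partial v}{\partial x_i} \in L^2(\Omega)$.

    In conclusion, if we assume that

    $$ A \in (L^\infty(\Omega))^{n\times n} , \quad \mathbf{a} \in (L^\infty(\Omega))^n , \quad a_0 \in L^\infty(\Omega) , \quad f \in L^2(\Omega) , \tag{3.1.12} $$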

    then $u$ and $v$ should belong to $L^2(\Omega)$ together with all their first-order partial derivatives. Such derivatives have to be considered in the sense of distributions, since $u$ and $v$ are merely $L^2$-integrable functions, and not classical differentiable functions.

    This leads us to introduce the Sobolev space $H^1(\Omega)$ and, subsequently, its closed subspace $H^1_0(\Omega)$ of the functions vanishing on $\partial\Omega$. This will be the appropriate space for setting the weak formulation of problem (3.1.1) and for studying its well-posedness.

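    3.2 The space $H^1(\Omega)$

    Motivated by the previous discussion, we introduce the space $H^1(\Omega)$ as follows.

    Definition 3.2.1. $H^1(\Omega)$ is the subspace of $L^2(\Omega)$ of the functions whose first-order partial derivatives, in the distributional sense, belong to $L^2(\Omega)$, i.e.,

    $$ H^1(\Omega) = \Big\{ v \in L^2(\Omega) : \frac{\partial v}{\partial x_i} \in L^2(\Omega) \text{ for } 1 \le i \le n \Big\} = \{ v \in L^2(\Omega) : \nabla v \in (L^2(\Omega))^n \} . $$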

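    $H^1(\Omega)$ is endowed with the inner product

    $$ (u,v)_{H^1(\Omega)} = (u,v)_{L^2(\Omega)} + \sum_{i=1}^n \Big( \frac{\partial u}{\partial x_i}, \frac{\partial v}{\partial x_i} \Big)_{L^2(\Omega)} = (u,v)_{L^2(\Omega)} + (\nabla u, \nabla v)_{(L^2(\Omega))^n} , $$

    which induces the norm

    $$ \|v\|_{H^1(\Omega)} = \Big( \|v\|^2_{L^2(\Omega)} + \sum_{i=1}^n \Big\| \frac{\partial v}{\partial x_i} \Big\|^2_{L^2(\Omega)} \Big)^{1/2} = \Big( \|v\|^2_{L^2(\Omega)} + \|\nabla v\|^2_{(L^2(\Omega))^n} \Big)^{1/2} . $$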

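    We point out that the requirement $\partial v/\partial x_i \in L^2(\Omega)$ means that there exists $g_i \in L^2(\Omega)$ such that $\partial T_v/\partial x_i = T_{g_i}$ in the sense of distributions, i.e.,

    $$ \Big\langle \frac{\partial}{\partial x_i} T_v, \varphi \Big\rangle = -\Big\langle T_v, \frac{\partial\varphi}{\partial x_i} \Big\rangle = -\int_\Omega v\, \frac{\partial\varphi}{\partial x_i} = \int_\Omega g_i \varphi = \langle T_{g_i}, \varphi \rangle \qquad \forall \varphi \in \mathcal{D}(\Omega) ; $$

    then, $g_i$ is identified with $\partial v/\partial x_i$.

    By the very definition of the norm in $H^1(\Omega)$, one has $\|v\|_{L^2(\Omega)} \le \|v\|_{H^1(\Omega)}$ for all $v \in H^1(\Omega)$, i.e., the inclusion $H^1(\Omega) \subset L^2(\Omega)$ is continuous.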

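    The next property is one of the fundamental properties of $H^1(\Omega)$.

    Property 3.2.2. $H^1(\Omega)$ is a Hilbert space.

    Proof. Let $\{v_k\}_{k\ge0}$ be a Cauchy sequence in the $H^1(\Omega)$-norm, i.e., $\forall \varepsilon > 0$, $\exists k \in \mathbb{N}$ such that, $\forall \ell, m > k$ one has $\|v_\ell - v_m\|_{H^1(\Omega)} < \varepsilon$. This immediately implies that each sequence $\{v_k\}_{k\ge0}$, $\{\partial v_k/\partial x_i\}_{k\ge0}$ for $i = 1, \dots, n$, is a Cauchy sequence in $L^2(\Omega)$. By the completeness of this space, there exist functions $v$ and $g_i$, $i = 1, \dots, n$, belonging to $L^2(\Omega)$, such that

    $$ \lim_{k\to\infty} v_k = v , \qquad \lim_{k\to\infty} \frac{\partial v_k}{\partial x_i} = g_i , \quad i = 1, \dots, n , $$

    in $L^2(\Omega)$. The property is proven if we prove that $\partial v/\partial x_i = g_i$ for $i = 1, \dots, n$. This follows from

    $$ \Big\langle \frac{\partial v}{\partial x_i}, \varphi \Big\rangle = -\int_\Omega v\, \frac{\partial\varphi}{\partial x_i} = -\int_\Omega \Big( \lim_{k\to\infty} v_k \Big) \frac{\partial\varphi}{\partial x_i} = -\lim_{k\to\infty} \int_\Omega v_k\, \frac{\partial\varphi}{\partial x_i} $$

    $$ = \lim_{k\to\infty} \int_\Omega \frac{\partial v_k}{\partial x_i}\, \varphi = \int_\Omega \Big( \lim_{k\to\infty} \frac{\partial v_k}{\partial x_i} \Big) \varphi = \int_\Omega g_i \varphi = \langle g_i, \varphi \rangle \qquad \forall \varphi \in \mathcal{D}(\Omega) . $$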

    The next property, which we state without proof, is important both from the theoretical and the constructive/numerical point of view; indeed, it guarantees that functions in $H^1(\Omega)$ can be approximated arbitrarily well by functions belonging to a sequence of finite dimensional subspaces.

    Property 3.2.3. $H^1(\Omega)$ is separable, i.e., it contains a sequence $\{v_k\}_{k\ge0}$ which is dense in it.

    Let us improve our knowledge of the space $H^1(\Omega)$ by observing that it contains classical differentiable functions. Indeed, if $\Omega$ is bounded, any function $v \in C^1(\overline\Omega)$ belongs to $H^1(\Omega)$, and one has

    $$ \|v\|_{H^1(\Omega)} \le |\Omega|^{1/2} \Big( \|v\|^2_{C^0(\overline\Omega)} + \sum_{i=1}^n \Big\| \frac{\partial v}{\partial x_i} \Big\|^2_{C^0(\overline\Omega)} \Big)^{1/2} \le (n+1)\, |\Omega|^{1/2}\, \|v\|_{C^1(\overline\Omega)} . $$

    In other words, we have:

    Property 3.2.4. If $\Omega$ is bounded, then $C^1(\overline\Omega) \subset H^1(\Omega)$ with continuous injection.

    If $\Omega$ is not bounded, then any function $v \in C^1(\overline\Omega)$ which decays fast enough as $\|\mathbf{x}\| \to \infty$ belongs to $H^1(\Omega)$. In particular, for any open set $\Omega$, one has:

    Property 3.2.5. $\mathcal{D}(\Omega) \subset H^1(\Omega)$.

    So far, we have seen that sufficiently smooth functions, in a classical sense, belong to $H^1(\Omega)$. On the other hand, $H^1(\Omega)$ also contains piecewise smooth functions, provided they are globally continuous. The following result illustrates the situation.

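    Property 3.2.6. Let $\Omega$ be a bounded open set, which is divided into two open subsets $\Omega_-$ and $\Omega_+$ by a smooth $(n-1)$-dimensional manifold $\Sigma$. Given two functions $v_\pm \in C^1(\overline{\Omega_\pm})$, the function $v$ defined as

    $$ v(\mathbf{x}) = \begin{cases} v_-(\mathbf{x}) & \text{if } \mathbf{x} \in \Omega_- , \\ v_+(\mathbf{x}) & \text{if } \mathbf{x} \in \Omega_+ , \end{cases} $$

    belongs to $H^1(\Omega)$ if and only if $v$ is continuous across $\Sigma$.

    Proof. Obviously, $v \in L^2(\Omega)$. On the other hand, for any $i = 1, \dots, n$, if we set

    $$ g_i(\mathbf{x}) = \begin{cases} \dfrac{\partial v_-}{\partial x_i}(\mathbf{x}) & \text{if } \mathbf{x} \in \Omega_- , \\ \dfrac{\partial v_+}{\partial x_i}(\mathbf{x}) & \text{if } \mathbf{x} \in \Omega_+ , \end{cases} $$

    we have (see (2.2.2))

    $$ \frac{\partial}{\partial x_i} T_v = T_{g_i} + \delta_{\Sigma,[v]n_i} , $$

    where $[v]$ is the jump of $v$ across $\Sigma$ and $n_i$ is the $i$-th component of the normal vector $\mathbf{n}$ to $\Sigma$. The result follows from the observation that $g_i \in L^2(\Omega)$, whereas $\delta_{\Sigma,[v]n_i} \notin L^2(\Omega)$ unless $[v]n_i \equiv 0$.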

    The previous result has a strong practical impact, as it guarantees that one can use continuous, piecewise polynomial functions in order to approximate the solution of second-order elliptic problems; the finite element method relies precisely on this property.
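The role of continuity in Property 3.2.6 can be illustrated in one dimension, where the interface reduces to the single point $x = 0$. The sketch below is an illustration under assumptions, not part of the text: the interval $(-1,1)$, the polynomial `phi` (used as a stand-in for a test function, since it vanishes at the endpoints) and the helper names are hypothetical. It evaluates the defect $-\int v\varphi' - \int g\varphi$, which is zero exactly when the piecewise derivative $g$ is the distributional derivative of $v$.

```python
def midpoint(f, a, b, n=20000):
    """Composite midpoint rule; with n even, x = 0 is a cell boundary on [-1, 1]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

# Smooth function vanishing at x = -1 and x = 1 (stand-in for a test function)
phi  = lambda x: (1 - x * x) ** 2 * (1 + x)
dphi = lambda x: -4 * x * (1 - x * x) * (1 + x) + (1 - x * x) ** 2

def weak_derivative_defect(v, g):
    """-int v phi' - int g phi over (-1, 1); zero if g is the weak derivative of v."""
    return midpoint(lambda x: -v(x) * dphi(x) - g(x) * phi(x), -1.0, 1.0)

# Continuous, piecewise C^1 function: the hat function and its one-sided derivatives
hat   = lambda x: 1 - abs(x)
g_hat = lambda x: 1.0 if x < 0 else -1.0

# Discontinuous function: the unit step, whose one-sided derivatives vanish
step   = lambda x: 0.0 if x < 0 else 1.0
g_step = lambda x: 0.0

defect_hat = weak_derivative_defect(hat, g_hat)
defect_step = weak_derivative_defect(step, g_step)
print(defect_hat, defect_step, phi(0.0))
```

For the hat function the defect vanishes up to quadrature error, whereas for the step function it equals the jump term $[v]\,\varphi(0)$, which cannot be represented by an $L^2$ function.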

    One may wonder whether a function belonging to $H^1(\Omega)$ is more regular than just an $L^2(\Omega)$-function, for instance whether it is continuous. First, let us clarify the real meaning of the statement "a function $v \in H^1(\Omega)$ is continuous". Indeed, $H^1(\Omega)$ is a subspace of $L^2(\Omega)$, which, according to the Lebesgue integration theory, rigorously speaking does not contain functions but equivalence classes of functions, two functions in the same class differing only on a zero-measure subset of $\Omega$. Then, the statement above means that in the equivalence class of $v$ there exists a function, say $\tilde v$, which is continuous. For simplicity, in the sequel of this book we will not distinguish between a function and the equivalence class which contains it. The following result gives a positive answer in dimension 1.

    Property 3.2.7. If $\Omega = I$ is a bounded interval, then $H^1(I) \subset C^{0,1/2}(\overline I)$ with continuous injection, where $C^{0,1/2}(\overline I)$ is the space of the Hölder continuous functions of exponent $1/2$ in $\overline I$.

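    Proof. Let us fix $v \in H^1(I)$ and let us set $g = v' \in L^2(I)$. Let us define the function

    $$ w(x) = \int_{x_0}^{x} g(s)\, ds , $$

    where $x_0$ is any fixed point in $I$. Since $v' = w'$ in the distribution sense, there exists a constant $C$ such that $v(x) = w(x) + C$ in $I$. Thus, for any two points $x_1, x_2 \in \overline I$,

    $$ v(x_1) - v(x_2) = w(x_1) - w(x_2) = \int_{x_2}^{x_1} g(s)\, ds , $$

    whence, by the Cauchy-Schwarz inequality,

    $$ |v(x_1) - v(x_2)| = \Big| \int_{x_2}^{x_1} 1 \cdot g(s)\, ds \Big| \le \Big| \int_{x_2}^{x_1} 1^2\, ds \Big|^{1/2} \Big| \int_{x_2}^{x_1} g^2(s)\, ds \Big|^{1/2} \le |x_1 - x_2|^{1/2}\, \|g\|_{L^2(I)} . $$

    This precisely means that $v$ is Hölder continuous of exponent $1/2$ in $\overline I$, and that

    $$ |v|_{C^{0,1/2}(\overline I)} := \sup_{x_1, x_2 \in \overline I} \frac{|v(x_1) - v(x_2)|}{|x_1 - x_2|^{1/2}} \le \|v'\|_{L^2(I)} . \tag{3.2.1} $$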

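    On the other hand, again by the Cauchy-Schwarz inequality,

    $$ \Big| \int_I v(y)\, dy \Big| = \Big| \int_I 1 \cdot v(y)\, dy \Big| \le |I|^{1/2}\, \|v\|_{L^2(I)} . \tag{3.2.2} $$

    Since $v$ is continuous, there exists a point $\bar x \in \overline I$ such that

    $$ v(\bar x) = \frac{1}{|I|} \int_I v(y)\, dy $$

    by the mean value theorem. For any $x \in \overline I$, we write

    $$ v(x) = v(\bar x) + \frac{v(x) - v(\bar x)}{|x - \bar x|^{1/2}}\, |x - \bar x|^{1/2} , $$

    and we apply (3.2.1) and (3.2.2) to get

    $$ |v(x)| \le |v(\bar x)| + \frac{|v(x) - v(\bar x)|}{|x - \bar x|^{1/2}}\, |I|^{1/2} \le |I|^{-1/2}\, \|v\|_{L^2(I)} + |I|^{1/2}\, \|v'\|_{L^2(I)} , $$

    which easily implies $\|v\|_{C^0(\overline I)} \le C_1 \|v\|_{H^1(I)}$. In conclusion, this estimate and (3.2.1) yield

    $$ \|v\|_{C^{0,1/2}(\overline I)} \le C_2 \|v\|_{H^1(I)} . $$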
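The Hölder estimate (3.2.1) can be observed on a function whose derivative is unbounded but square-integrable. The sketch below is an illustration under assumptions: the interval $I = (0,1)$, the function $v(x) = x^{3/4}$ and the sampling grid are hypothetical choices, and a finite grid can only give a lower bound for the supremum in (3.2.1).

```python
import math

# v(x) = x^(3/4) on I = (0, 1): v'(x) = (3/4) x^(-1/4) blows up at x = 0, but
# ||v'||_{L^2(I)}^2 = (9/16) * int_0^1 x^(-1/2) dx = 9/8, so v belongs to H^1(I).
v = lambda x: x ** 0.75
norm_dv = math.sqrt(9.0 / 8.0)

# Sample the Hoelder quotient |v(x1) - v(x2)| / |x1 - x2|^(1/2) on a uniform grid
n = 400
pts = [k / n for k in range(n + 1)]
worst = max(abs(v(x1) - v(x2)) / math.sqrt(abs(x1 - x2))
            for i, x1 in enumerate(pts) for x2 in pts[:i])
print(worst, norm_dv)
```

The largest sampled quotient stays below $\|v'\|_{L^2(I)}$, in agreement with (3.2.1), even though $v$ is not Lipschitz continuous near the origin.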

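    If $n > 1$, then $H^1(\Omega)$ is neither contained in $C^0(\overline\Omega)$ nor in $L^\infty(\Omega)$, as the following simple counterexample shows. Consider the disc $\Omega = \{ (x,y) \in \mathbb{R}^2 : r = \sqrt{x^2 + y^2} < 1/2 \}$ and the function $v(x,y) = |\log r|^\alpha$ for some $\alpha > 0$. Obviously, $v$ is unbounded as $r \to 0$, hence, in particular, it is not continuous at the origin. Let us show that $v \in H^1(\Omega)$ if and only if $\alpha < 1/2$. We have

    $$ \int_\Omega v^2\, dx\, dy = \int_0^{2\pi} \int_0^{1/2} |\log r|^{2\alpha}\, r\, dr\, d\theta < +\infty \quad \text{for any } \alpha > 0 . $$

    On the other hand,

    $$ \frac{\partial v}{\partial x} = -\alpha\, |\log r|^{\alpha-1}\, \frac{x}{r^2} , \qquad \frac{\partial v}{\partial y} = -\alpha\, |\log r|^{\alpha-1}\, \frac{y}{r^2} , $$

    whence

    $$ \int_\Omega \Big( \Big| \frac{\partial v}{\partial x} \Big|^2 + \Big| \frac{\partial v}{\partial y} \Big|^2 \Big)\, dx\, dy = \alpha^2 \int_\Omega |\log r|^{2\alpha-2}\, \frac{1}{r^2}\, dx\, dy = 2\pi\alpha^2 \int_0^{1/2} |\log r|^{2\alpha-2}\, \frac{1}{r}\, dr < +\infty \quad \text{iff } \alpha < 1/2 . $$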
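The threshold $\alpha < 1/2$ can be made concrete by truncating the last radial integral at $r = \varepsilon$ and letting $\varepsilon \to 0$. The following sketch is only an illustration: the function name `grad_energy` and the sample values of $\alpha$ and $\varepsilon$ are hypothetical, and the closed form comes from the substitution $t = -\log r$, which turns the integral into $\int_{\log 2}^{\log(1/\varepsilon)} t^{2\alpha-2}\, dt$.

```python
import math

def grad_energy(alpha, eps):
    """2*pi*alpha^2 * int_eps^{1/2} |log r|^(2*alpha-2) / r dr, in closed form (t = -log r)."""
    t0, t1 = math.log(2.0), math.log(1.0 / eps)
    if alpha == 0.5:
        primitive = math.log(t1) - math.log(t0)      # int dt / t
    else:
        p = 2.0 * alpha - 1.0
        primitive = (t1 ** p - t0 ** p) / p           # int t^(p-1) dt
    return 2.0 * math.pi * alpha ** 2 * primitive

for alpha in (0.25, 0.5, 0.75):
    values = [grad_energy(alpha, eps) for eps in (1e-10, 1e-100, 1e-300)]
    print(alpha, [round(val, 4) for val in values])
```

For $\alpha = 0.25$ the truncated integrals remain bounded as $\varepsilon$ decreases, while for $\alpha = 0.5$ and $\alpha = 0.75$ they keep growing (very slowly in the borderline case), which is the numerical counterpart of the condition $\alpha < 1/2$.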

    We will see later on (Thm. 3.8.2) that functions in $H^1(\Omega)$, although not necessarily continuous, belong to some space $L^p(\Omega)$ with $p > 2$ depending on $n$.

    Another fundamental result is the following one. Let $\mathcal{D}(\overline\Omega)$ be the space of the $C^\infty$-functions defined in $\overline\Omega$, whose support is compact and contained in $\overline\Omega$ (thus, they are allowed to be nonzero on $\partial\Omega$). Note that $\mathcal{D}(\overline\Omega) = C^\infty(\overline\Omega)$ if $\Omega$ is bounded, whereas $\mathcal{D}(\overline\Omega) = \mathcal{D}(\Omega)$ iff $\Omega = \mathbb{R}^n$. The space $\mathcal{D}(\overline\Omega)$ can equivalently be defined as the space of the restrictions to $\overline\Omega$ of the functions in $\mathcal{D}(\mathbb{R}^n)$.

    Property 3.2.8. $\mathcal{D}(\overline\Omega)$ is a dense subspace of $H^1(\Omega)$.

    We will give some ideas of the proof, in some particular cases, later on. The property states that any function in $H^1(\Omega)$ can be approximated arbitrarily well by smooth classical functions. This will allow us to pass to the limit and extend results which are well-known for classical functions to analogous results for functions in $H^1(\Omega)$.

    3.3 The families of spaces $H^m(\Omega)$ and $W^{m,p}(\Omega)$

    The definition of $H^1(\Omega)$ can be generalized, by considering for any $m \ge 2$ the set of all functions $v \in L^2(\Omega)$ such that all their distributional partial derivatives $D^\alpha v$ of order $|\alpha| \le m$ belong to $L^2(\Omega)$. Thus, we set

    $$ H^m(\Omega) = \{ v \in L^2(\Omega) : D^\alpha v \in L^2(\Omega) \text{ for } |\alpha| \le m \} $$

    and we endow this space with the inner product

    $$ (u,v)_{H^m(\Omega)} = \sum_{|\alpha| \le m} (D^\alpha u, D^\alpha v)_{L^2(\Omega)} , $$

    which induces the norm

    $$ \|v\|_{H^m(\Omega)} = \Big( \sum_{|\alpha| \le m} \|D^\alpha v\|^2_{L^2(\Omega)} \Big)^{1/2} $$

    (we use here the convention that $D^0 v = v$). In this way, we obtain a separable Hilbert space which enjoys properties similar or equal to those seen for $H^1(\Omega)$: for instance, it contains $\mathcal{D}(\overline\Omega)$ as a dense subspace and, if $\Omega$ is bounded, it contains $C^m(\overline\Omega)$ but also those functions of $C^{m-1}(\overline\Omega)$ which are piecewise $C^m$-differentiable. Furthermore, all functions in $H^m(\Omega)$ enjoy classical differentiability of some order $< m$ (which depends on the space dimension $n$); for instance, in dimension $n = 2$, the space $H^2(\Omega)$ is contained in $C^0(\overline\Omega)$ with continuous inclusion (but not in $C^1(\overline\Omega)$). The precise result will be given in Thm. 3.8.2.

    We thus have a scale of function spaces, in which smoothness is measured in a weak, integral sense; each space is strictly contained in all the spaces of lower index, with continuous inclusion. Such a scale is the counterpart of the classical scale of spaces $C^m(\overline\Omega)$, in which smoothness is measured in a strong, pointwise sense. Precisely, the two sequences of spaces satisfy

    $$ \cdots \subset H^{m+1}(\Omega) \subset H^m(\Omega) \subset H^{m-1}(\Omega) \subset \cdots \subset H^1(\Omega) \subset H^0(\Omega) = L^2(\Omega) $$

    $$ \cdots \subset C^{m+1}(\overline\Omega) \subset C^m(\overline\Omega) \subset C^{m-1}(\overline\Omega) \subset \cdots \subset C^1(\overline\Omega) \subset C^0(\overline\Omega) $$

    and if $\Omega$ is bounded each space of the lower sequence is contained with continuous inclusion in the space above it in the upper sequence. Working in the Sobolev scale rather than in the classical one is more appropriate for handling the weak, or integral, formulation of an elliptic boundary value problem; in particular, the Sobolev scale consists of Hilbert spaces, whereas the classical scale consists merely of non-reflexive Banach spaces.

    A further generalization comes from replacing $L^2(\Omega)$ by some $L^p(\Omega)$ with $p \in [1,+\infty]$ in the definition of the Sobolev space. Thus, we set

    $$ W^{m,p}(\Omega) = \{ v \in L^p(\Omega) : D^\alpha v \in L^p(\Omega) \text{ for } |\alpha| \le m \} $$

    equipped with the norm

    $$ \|v\|_{W^{m,p}(\Omega)} = \Big( \sum_{|\alpha| \le m} \|D^\alpha v\|^p_{L^p(\Omega)} \Big)^{1/p} . $$

    Such a space is a Banach space, which, as $L^p(\Omega)$, is reflexive if $1 < p < +\infty$ and is non-reflexive if $p = 1$ or $p = +\infty$. Note that $W^{m,2}(\Omega) = H^m(\Omega)$. Sobolev spaces of summability index $p \ne 2$ play a crucial role in studying nonlinear partial differential equations.

    3.4 The spaces $H^s(\mathbb{R}^n)$

    The study of Sobolev spaces is particularly interesting and important when the domain $\Omega$ is the full space $\mathbb{R}^n$. In this section, we first provide a characterization of $H^1(\mathbb{R}^n)$ by means of the Fourier transform. Next, we consider the spaces $H^m(\mathbb{R}^n)$ for $m > 1$ and we extend the definition of Sobolev spaces to the case where the index is any real number. Finally, we sketch the proof of Property 3.2.8 in the present situation, i.e., when the boundary $\partial\Omega$ is empty.


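    3.4.1 A characterization of $H^1(\mathbb{R}^n)$

    We recall that the (continuous) Fourier transform

    $$ v(\mathbf{x}) \mapsto \hat v(\boldsymbol{\xi}) = \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} v(\mathbf{x})\, e^{-i\,\mathbf{x}\cdot\boldsymbol{\xi}}\, d\mathbf{x} $$

    is an isomorphism between $L^2(\mathbb{R}^n)$ and itself, whose inverse is

    $$ \hat v(\boldsymbol{\xi}) \mapsto v(\mathbf{x}) = \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} \hat v(\boldsymbol{\xi})\, e^{i\,\mathbf{x}\cdot\boldsymbol{\xi}}\, d\boldsymbol{\xi} ; $$

    more precisely, the transform is an isometry, i.e., for all $v, w \in L^2(\mathbb{R}^n)$ one has

    $$ \int_{\mathbb{R}^n} v(\mathbf{x})\, \overline{w(\mathbf{x})}\, d\mathbf{x} = \int_{\mathbb{R}^n} \hat v(\boldsymbol{\xi})\, \overline{\hat w(\boldsymbol{\xi})}\, d\boldsymbol{\xi} , \quad \text{whence} \quad \|v\|_{L^2(\mathbb{R}^n)} = \|\hat v\|_{L^2(\mathbb{R}^n)} . \tag{3.4.1} $$

    Furthermore, if $\varphi$ is a compactly supported, continuously differentiable function, one has

    $$ \Big( \frac{\partial\varphi}{\partial x_k} \Big)^{\!\wedge}(\boldsymbol{\xi}) = i\, \xi_k\, \hat\varphi(\boldsymbol{\xi}) \qquad \forall \boldsymbol{\xi} \in \mathbb{R}^n , \quad k = 1, \dots, n , \tag{3.4.2} $$

    as can be seen by integrating by parts in the integral which defines the Fourier transform of $\partial\varphi/\partial x_k$.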

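    The following result gives the announced characterization of $H^1(\mathbb{R}^n)$ in terms of summability at infinity of the Fourier transform of its functions.

    Proposition 3.4.1. One has

    $$ H^1(\mathbb{R}^n) = \{ v \in L^2(\mathbb{R}^n) : \text{the function } \boldsymbol{\xi} \mapsto (1 + \|\boldsymbol{\xi}\|^2)^{1/2}\, \hat v(\boldsymbol{\xi}) \text{ belongs to } L^2(\mathbb{R}^n) \} $$

    and

    $$ \|v\|_{H^1(\mathbb{R}^n)} = \Big( \int_{\mathbb{R}^n} (1 + \|\boldsymbol{\xi}\|^2)\, |\hat v(\boldsymbol{\xi})|^2\, d\boldsymbol{\xi} \Big)^{1/2} . $$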

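    Proof. It is enough to prove that, for each $k = 1, \dots, n$, $\frac{\partial v}{\partial x_k} \in L^2(\mathbb{R}^n)$ in the distributional sense iff the function $\boldsymbol{\xi} \mapsto \xi_k \hat v(\boldsymbol{\xi})$ belongs to $L^2(\mathbb{R}^n)$, with identical norm. Let us assume that $\frac{\partial v}{\partial x_k} = g_k \in L^2(\mathbb{R}^n)$; then, using (3.4.2) and (3.4.1), for all $\varphi \in \mathcal{D}(\mathbb{R}^n)$ one has on the one side

    $$ \Big\langle \frac{\partial v}{\partial x_k}, \varphi \Big\rangle = -\int_{\mathbb{R}^n} v(\mathbf{x})\, \frac{\partial\varphi}{\partial x_k}(\mathbf{x})\, d\mathbf{x} = -\int_{\mathbb{R}^n} \hat v(\boldsymbol{\xi})\, \overline{\Big( \frac{\partial\varphi}{\partial x_k} \Big)^{\!\wedge}(\boldsymbol{\xi})}\, d\boldsymbol{\xi} = \int_{\mathbb{R}^n} i\, \xi_k\, \hat v(\boldsymbol{\xi})\, \overline{\hat\varphi(\boldsymbol{\xi})}\, d\boldsymbol{\xi} $$

    and on the other side

    $$ \Big\langle \frac{\partial v}{\partial x_k}, \varphi \Big\rangle = \int_{\mathbb{R}^n} g_k(\mathbf{x})\, \varphi(\mathbf{x})\, d\mathbf{x} = \int_{\mathbb{R}^n} \hat g_k(\boldsymbol{\xi})\, \overline{\hat\varphi(\boldsymbol{\xi})}\, d\boldsymbol{\xi} . $$

    By equating the last two expressions and by recalling that $\varphi$ is arbitrary, we get $i\, \xi_k\, \hat v(\boldsymbol{\xi}) = \hat g_k(\boldsymbol{\xi})$ almost everywhere in $\mathbb{R}^n$, and therefore the function $\boldsymbol{\xi} \mapsto \xi_k \hat v(\boldsymbol{\xi})$ belongs to $L^2(\mathbb{R}^n)$. Conversely, if this happens, one sets $\hat g_k(\boldsymbol{\xi}) = i\, \xi_k\, \hat v(\boldsymbol{\xi})$, so that its inverse transform $g_k(\mathbf{x})$ belongs to $L^2(\mathbb{R}^n)$ and satisfies $g_k = \frac{\partial v}{\partial x_k}$.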

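    The argument given in the proof shows that

    $$ \text{for all } v \in H^1(\mathbb{R}^n) , \qquad \Big( \frac{\partial v}{\partial x_k} \Big)^{\!\wedge}(\boldsymbol{\xi}) = i\, \xi_k\, \hat v(\boldsymbol{\xi}) \qquad \forall \boldsymbol{\xi} \in \mathbb{R}^n , \quad k = 1, \dots, n . \tag{3.4.3} $$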
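The Fourier characterization of the $H^1$-norm can be checked on a function that is not classically differentiable. The sketch below is an illustration under assumptions: the dimension $n = 1$, the function $v(x) = e^{-|x|}$, the truncation lengths and the helper `midpoint` are hypothetical choices. With the normalization of the transform recalled above, one finds $\hat v(\xi) = \sqrt{2/\pi}\, /\, (1 + \xi^2)$, and both sides of the norm identity equal $\|v\|^2_{H^1(\mathbb{R})} = 2$.

```python
import math

def midpoint(f, a, b, n):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

# v(x) = exp(-|x|): continuous, with a kink at x = 0
v     = lambda x: math.exp(-abs(x))
dv    = lambda x: -math.copysign(1.0, x) * math.exp(-abs(x))     # weak derivative
v_hat = lambda xi: math.sqrt(2.0 / math.pi) / (1.0 + xi * xi)    # Fourier transform of v

# Physical side: ||v||^2 + ||v'||^2 (tails beyond |x| = 40 are negligible)
physical = midpoint(lambda x: v(x) ** 2 + dv(x) ** 2, -40.0, 40.0, 80000)

# Fourier side: int (1 + xi^2) |v_hat|^2 d xi, truncated to |xi| <= L;
# the neglected tail is of order 4 / (pi * L) because the integrand decays like xi^(-2)
L = 2000.0
fourier = midpoint(lambda xi: (1.0 + xi * xi) * v_hat(xi) ** 2, -L, L, 400000)
print(physical, fourier)
```

The slow decay of $(1 + \xi^2)|\hat v(\xi)|^2$ reflects the limited smoothness of $v$: the kink at the origin is what makes the Fourier-side integral converge only algebraically.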

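    3.4.2 The spaces $H^s(\mathbb{R}^n)$, with $s \in \mathbb{R}$

    Given a vector $\boldsymbol{\xi} = (\xi_1, \dots, \xi_n) \in \mathbb{R}^n$ and a multi-index $\alpha = (\alpha_1, \dots, \alpha_n) \in \mathbb{N}^n$, let us set $\boldsymbol{\xi}^\alpha = \xi_1^{\alpha_1} \xi_2^{\alpha_2} \cdots \xi_n^{\alpha_n} \in \mathbb{R}$. An argument similar to the proof of Proposition 3.4.1 shows that $D^\alpha v \in L^2(\mathbb{R}^n)$ in the sense of distributions if and only if $\boldsymbol{\xi}^\alpha \hat v(\boldsymbol{\xi}) \in L^2(\mathbb{R}^n)$; in this case, one has

    $$ (D^\alpha v)^{\wedge}(\boldsymbol{\xi}) = i^{|\alpha|}\, \boldsymbol{\xi}^\alpha\, \hat v(\boldsymbol{\xi}) , $$

    which generalizes (3.4.3). Thus, any Sobolev space $H^m(\mathbb{R}^n)$ can be characterized as the subset of $L^2(\mathbb{R}^n)$ of those functions satisfying $\boldsymbol{\xi}^\alpha \hat v(\boldsymbol{\xi}) \in L^2(\mathbb{R}^n)$ for all $\alpha \in \mathbb{N}^n$ such that $|\alpha| \le m$; the norm in $H^m(\mathbb{R}^n)$ could be represented as

    $$ \|v\|_{H^m(\mathbb{R}^n)} = \Big( \int_{\mathbb{R}^n} \sum_{|\alpha| \le m} |\boldsymbol{\xi}|^{2\alpha}\, |\hat v(\boldsymbol{\xi})|^2\, d\boldsymbol{\xi} \Big)^{1/2} , $$

    where $|\boldsymbol{\xi}| = (|\xi_1|, |\xi_2|, \dots, |\xi_n|)$. An equivalent but simpler expression of the norm is preferred, which can be derived by applying the following technical lemma, whose elementary proof is left to the reader.

    Lemma 3.4.2. There exist constants $c, C > 0$ depending only on $n$ and $m$ such that

    $$ c\, (1 + \|\boldsymbol{\xi}\|^2)^m \le \sum_{|\alpha| \le m} |\boldsymbol{\xi}|^{2\alpha} \le C\, (1 + \|\boldsymbol{\xi}\|^2)^m , \qquad \forall \boldsymbol{\xi} \in \mathbb{R}^n . $$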
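The two-sided bound of Lemma 3.4.2 can be explored numerically before attempting the proof. The sketch below is an illustration under assumptions: the values $n = 3$, $m = 2$, the sampling box and the helper names are hypothetical. As a side remark that is not in the text, expanding $(1 + \xi_1^2 + \dots + \xi_n^2)^m$ by the multinomial formula produces exactly the monomials $|\boldsymbol{\xi}|^{2\alpha}$ with $|\alpha| \le m$, each with an integer coefficient between $1$ and $m!$, which suggests that the ratio computed below lies in $[1/m!,\, 1]$.

```python
import itertools
import math
import random

def multi_indices(n, m):
    """All multi-indices alpha in N^n with |alpha| <= m."""
    return [a for a in itertools.product(range(m + 1), repeat=n) if sum(a) <= m]

def ratio(xi, m):
    """sum_{|alpha| <= m} |xi|^(2 alpha)  divided by  (1 + ||xi||^2)^m."""
    s = sum(math.prod(abs(x) ** (2 * a) for x, a in zip(xi, alpha))
            for alpha in multi_indices(len(xi), m))
    return s / (1.0 + sum(x * x for x in xi)) ** m

random.seed(0)
n, m = 3, 2
samples = [[random.uniform(-50.0, 50.0) for _ in range(n)] for _ in range(2000)]
ratios = [ratio(xi, m) for xi in samples]
print(min(ratios), max(ratios), 1.0 / math.factorial(m))
```

A random sample does not prove the lemma, but the observed minimum and maximum of the ratio give candidate values for the constants $c$ and $C$.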

    The result allows us to characterize $H^m(\mathbb{R}^n)$ in an equivalent manner as the subset of $L^2(\mathbb{R}^n)$ of those functions such that $(1 + \|\boldsymbol{\xi}\|^2)^{m/2}\, \hat v(\boldsymbol{\xi}) \in L^2(\mathbb{R}^n)$, and to use the $L^2(\mathbb{R}^n)$-norm of this function as an equivalent norm in $H^m(\mathbb{R}^n)$.

    At this point, a remarkable observation can be made, namely, that the latter characterization does not require the parameter $m$ to be an integer: any real value of $m$ is admissible. This leads us to extend the definition of Sobolev spaces given so far to the case of real positive indices.

    Definition 3.4.3. For any real $s \ge 0$, we set

    $$ H^s(\mathbb{R}^n) = \{ v \in L^2(\mathbb{R}^n) : (1 + \|\boldsymbol{\xi}\|^2)^{s/2}\, \hat v(\boldsymbol{\xi}) \in L^2(\mathbb{R}^n) \} $$

    equipped with the norm

    $$ \|v\|_{H^s(\mathbb{R}^n)} = \Big( \int_{\mathbb{R}^n} (1 + \|\boldsymbol{\xi}\|^2)^s\, |\hat v(\boldsymbol{\xi})|^2\, d\boldsymbol{\xi} \Big)^{1/2} . \tag{3.4.4} $$

    In this way, we obtain a continuous family of separable Hilbert spaces, which satisfy

    $\mathcal{D}(\mathbb{R}^n)$ is a dense subspace of $H^s(\mathbb{R}^n)$ for all $s \ge 0$ ;

    $H^{s'}(\mathbb{R}^n) \subset H^s(\mathbb{R}^n)$ iff $s' > s$ ,

    $H^s(\mathbb{R}^n) = H^m(\mathbb{R}^n)$ if $s = m$ ,

    $H^{m+1}(\mathbb{R}^n) \subset H^s(\mathbb{R}^n) \subset H^m(\mathbb{R}^n)$ iff $m < s < m+1$ .

    The last relation shows that the Sobolev spaces $H^s(\mathbb{R}^n)$ of non-integer index can be viewed as a sort of interpolating spaces between consecutive Sobolev spaces of integer index. The concept can be made rigorous, within the so-called Theory of Space Interpolation.

    One can further extend the definition of Sobolev space to the case of negative indices, by setting

    $$ H^s(\mathbb{R}^n) = \big( H^{|s|}(\mathbb{R}^n) \big)' \quad \text{if } s < 0 , $$

    where $X'$ denotes the dual space of the Hilbert space $X$; equivalently, $H^s(\mathbb{R}^n)$ can be defined as the space of the distributions whose Fourier transform (defined in a suitable sense) makes the right-hand side of (3.4.4) finite.

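    3.4.3 Sketch of the proof of Property 3.2.8

    The idea is to apply to any function in $H^1(\mathbb{R}^n)$ a truncation, which yields a compactly supported function, followed by a regularization, which produces a $C^\infty$ function.

    Truncation. Given any $R > 0$, let $\chi_R \in \mathcal{D}(\mathbb{R})$ be an even function satisfying

    $$ 0 \le \chi_R(t) \le 1 \quad \forall t \in \mathbb{R} , \qquad \chi_R(t) \equiv 1 \text{ if } |t| \le R , \qquad \chi_R(t) \equiv 0 \text{ if } |t| \ge R + 1 . $$

    Then, given any $v \in H^1(\mathbb{R}^n)$, one can prove that the function $v_R(\mathbf{x}) = \chi_R(\|\mathbf{x}\|)\, v(\mathbf{x})$ belongs to $H^1(\mathbb{R}^n)$ and is supported in $B(0, R+1)$; furthermore, $\|v - v_R\|_{H^1(\mathbb{R}^n)} \to 0$ as $R \to +\infty$.

    Regularization. Given any $\varepsilon > 0$, let $\rho_\varepsilon(\mathbf{x})$ be any non-negative function in $\mathcal{D}(\mathbb{R}^n)$ satisfying

    $$ \operatorname{supp} \rho_\varepsilon \subseteq B(0, \varepsilon) , \qquad \int_{\mathbb{R}^n} \rho_\varepsilon(\mathbf{x})\, d\mathbf{x} = 1 . $$

    An example of such a function is obtained by properly scaling the function given in Example 1.2.1. Note that as $\varepsilon \to 0$, $\rho_\varepsilon$ converges in $\mathcal{D}'(\mathbb{R}^n)$ to the distribution $\delta_0$.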
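The two defining properties of the regularizing kernel, and its convergence to the Dirac distribution, can be observed in one dimension. The sketch below rests on assumptions: the function `bump` is the standard $C^\infty$ bump $e^{-1/(1-x^2)}$, which we assume plays the role of the function in Example 1.2.1; the names `rho`, `Z`, `phi` and the values of $\varepsilon$ are hypothetical choices made for illustration.

```python
import math

def midpoint(f, a, b, n=20000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def bump(x):
    """C-infinity function supported in [-1, 1] (assumed analogue of Example 1.2.1)."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

Z = midpoint(bump, -1.0, 1.0)           # normalization constant, computed numerically

def rho(eps, x):
    """Scaled kernel: non-negative, supported in [-eps, eps], with unit integral."""
    return bump(x / eps) / (eps * Z)

phi = lambda x: math.cos(x) + x          # a smooth function with phi(0) = 1
for eps in (1.0, 0.1, 0.01):
    mass = midpoint(lambda x: rho(eps, x), -eps, eps)
    pairing = midpoint(lambda x: rho(eps, x) * phi(x), -eps, eps)
    print(eps, mass, pairing)            # pairing approaches phi(0) as eps decreases
```

The scaling $x \mapsto x/\varepsilon$ shrinks the support while the factor $1/\varepsilon$ preserves the unit mass; in $n$ dimensions the corresponding factor would be $\varepsilon^{-n}$.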

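    Then, given any $v \in H^1(\mathbb{R}^n)$, one can prove that the convolution function

    $$ v_\varepsilon(\mathbf{x}) = (\rho_\varepsilon * v)(\mathbf{x}) = \int_{\mathbb{R}^n} \rho_\varepsilon(\mathbf{x} - \mathbf{y})\, v(\mathbf{y})\, d\mathbf{y} $$

    belongs to $H^1(\mathbb{R}^n)$ and is infinitely differentiable at every $\mathbf{x} \in \mathbb{R}^n$; furthermore, $\|v - v_\varepsilon\|_{H^1(\mathbb{R}^n)} \to 0$ as $\varepsilon \to 0$.

    Finally, we combine the two previous approximations by considering functions $v_{R,\varepsilon} = (v_R)_\varepsilon$ obtained by first truncating a function $v \in H^1(\mathbb{R}^n)$, and then regularizing the result. Since both $v_R$ and $\rho_\varepsilon$ are compactly supported, so is $v_{R,\varepsilon}$; precisely, $\operatorname{supp} v_{R,\varepsilon} \subseteq B(0, R + 1 + \varepsilon)$. Thus, $v_{R,\varepsilon}$ belongs to $\mathcal{D}(\mathbb{R}^n)$.

    An appropriate choice of $\varepsilon = \varepsilon(R)$, such that $\varepsilon(R) \to 0$ as $R \to \infty$, shows that $v$ can be approximated in $H^1(\mathbb{R}^n)$ to any prescribed precision by a function $v_{R,\varepsilon}$ for a sufficiently large $R$.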

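    3.5 The space $H^1(\mathbb{R}^n_+)$

    One of the simplest examples of an open domain with nonempty boundary is the half-space $\Omega = \mathbb{R}^n_+ = \{ \mathbf{x} = (x_1, \dots, x_{n-1}, x_n) \in \mathbb{R}^n : x_n > 0 \}$, whose boundary $\partial\mathbb{R}^n_+ = \{ \mathbf{x} = (x_1, \dots, x_{n-1}, 0) \in \mathbb{R}^n \}$ can be identified with $\mathbb{R}^{n-1}$. For notational simplicity, every point $\mathbf{x} \in \mathbb{R}^n$ will be written as $\mathbf{x} = (\mathbf{x}', x_n)$ with $\mathbf{x}' \in \mathbb{R}^{n-1}$.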

    A number of properties of $H^1(\mathbb{R}^n_+)$, such as Property 3.2.8, can be obtained from the analogous properties of $H^1(\mathbb{R}^n)$ after introducing a suitable prolongation operator which extends the functions belonging to $H^1(\mathbb{R}^n_+)$ into functions belonging to $H^1(\mathbb{R}^n)$. Precisely, given any $v \in H^1(\mathbb{R}^n_+)$, let us set

    $$ (Pv)(\mathbf{x}) = v(\mathbf{x}', |x_n|) \qquad \forall \mathbf{x} = (\mathbf{x}', x_n) \in \mathbb{R}^n . $$

    Thus, $P$ realizes an extension of $v$ by an even reflection around the boundary $\partial\mathbb{R}^n_+$. It is easy to check that $Pv \in H^1(\mathbb{R}^n)$ iff $v \in H^1(\mathbb{R}^n_+)$, and that

    $$ \|Pv\|_{H^1(\mathbb{R}^n)} \le \sqrt{2}\, \|v\|_{H^1(\mathbb{R}^n_+)} \qquad \forall v \in H^1(\mathbb{R}^n_+) , $$

    i.e., $P$ is continuous. Thus, we have established the following result.

    Property 3.5.1. There exists a linear continuous operator $P : H^1(\mathbb{R}^n_+) \to H^1(\mathbb{R}^n)$ such that $Pv|_{\mathbb{R}^n_+} \equiv v$ for all $v \in H^1(\mathbb{R}^n_+)$.

    Any operator satisfying the conditions of the Property is termed a prolongation operator in $H^1(\mathbb{R}^n_+)$.

    As an application, we can easily prove Property 3.2.8 in the current situation. Indeed, given any $v \in H^1(\mathbb{R}^n_+)$, by the analogous result in $\mathbb{R}^n$ one can find a sequence of functions $\tilde v_k \in \mathcal{D}(\mathbb{R}^n)$ which converge to $\tilde v = Pv$ in the $H^1(\mathbb{R}^n)$-norm. It is immediate that the functions $v_k = (\tilde v_k)|_{\overline{\mathbb{R}^n_+}}$ belong to $\mathcal{D}(\overline{\mathbb{R}^n_+})$ and converge to $v$ in the $H^1(\mathbb{R}^n_+)$-norm.

    Next, we begin the discussion of the concept of trace on the boundary $\partial\Omega$ of a function belonging to $H^1(\Omega)$. Clarifying this concept is very important, for instance in order to give the proper meaning to a Dirichlet boundary condition. The present geometrically simple situation will allow us to keep ideas separated from technicalities, which may occur in the case of general domains. While for a smooth function defined in $\overline\Omega$ (e.g., a function in $\mathcal{D}(\overline\Omega)$) its trace on $\partial\Omega$ is defined pointwise in the obvious way, for a function $v$ which merely belongs to $H^1(\Omega)$ the same procedure cannot be applied in dimension $n \ge 2$. Indeed,