1. Convolutions and approximations to the identity
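We restrict our attention to the Euclidean case ${\mathbb R}^n$ equipped with the Lebesgue measure. As we have seen, the space $L^1({\mathbb R}^n)$ is a vector space; linear combinations of functions in $L^1({\mathbb R}^n)$ remain in the space. There is, however, a `product’ defined between elements of $L^1({\mathbb R}^n)$ that turns $L^1({\mathbb R}^n)$ into a Banach algebra. For $f,g\in L^1({\mathbb R}^n)$ we define the convolution $f*g$ of $f$ and $g$ to be the function
$$(f*g)(x)=\int_{{\mathbb R}^n}f(y)g(x-y)\,dy.$$
Furthermore, using Fubini’s theorem to change the order of integration we can easily see that
$$\|f*g\|_{L^1}\le\|f\|_{L^1}\|g\|_{L^1}.$$
Thus for $f,g\in L^1({\mathbb R}^n)$ we have that their convolution $f*g$ is again an element of $L^1({\mathbb R}^n)$. Note that the previous estimate is the main difficulty in showing that $L^1({\mathbb R}^n)$ is a Banach algebra.
More generally, the convolution of $f\in L^p({\mathbb R}^n)$, $1\le p\le\infty$, and $g\in L^1({\mathbb R}^n)$, is a well defined element of $L^p({\mathbb R}^n)$ and we have that
$$\|f*g\|_{L^p}\le\|f\|_{L^p}\|g\|_{L^1}.\qquad(1)$$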
Exercise 1 Use the integral version of Minkowski’s inequality to prove estimate (1) above.
Let us summarize some properties of convolution in the following proposition. We take the chance to give two definitions here that we will use throughout these notes.
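Definition 1 Let $X$ be a topological space and $f:X\to{\mathbb C}$ be a continuous function. The support of $f$, denoted by ${\rm supp}(f)$, is the set
$${\rm supp}(f)=\overline{\{x\in X: f(x)\neq 0\}}.$$
This is the smallest closed set in $X$ outside which $f=0$.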
Observe that we gave the definition of the support of a function for continuous functions. This is mostly a technical issue. It is easily understood that, in general, the support of a measurable function can only be defined up to sets of measure zero. The precise definition is as follows.
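Definition 2 Let $\mu$ be a regular Borel measure on a topological space $X$ and $f:X\to{\mathbb C}$ be a Borel measurable function. A point $x\in X$ is called a support point for $f$ if
$$\mu(\{y\in U: f(y)\neq 0\})>0$$
for every open neighborhood $U$ of $x$. The set ${\rm supp}(f)$ of all support points for $f$ is called the support of $f$.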
Exercise 2 Assume that the measure $\mu$ in the previous definition has the additional property that $\mu(U)>0$ for every nonempty open set $U$. Use exercise 1 of notes 1 to prove that for any continuous function $f$ the two definitions of ${\rm supp}(f)$, that is Definition 1 and Definition 2, coincide.
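Proposition 3 Let $f,g,h$ be such that the convolutions below are well defined.
(i) (commutative) $f*g=g*f$.
(ii) (associative) $(f*g)*h=f*(g*h)$.
(iii) (translations) For $f:{\mathbb R}^n\to{\mathbb C}$ and $y\in{\mathbb R}^n$ we define the translation operator
$$(\tau_yf)(x)=f(x-y).$$
For $y\in{\mathbb R}^n$ we have $\tau_y(f*g)=(\tau_yf)*g=f*(\tau_yg)$.
(iv) (support) If $A=\overline{\{x+y: x\in{\rm supp}(f),\ y\in{\rm supp}(g)\}}$ then ${\rm supp}(f*g)\subset A$.
Proof: Statements (i), (ii) and (iii) are trivial consequences of changes of variables and Fubini’s theorem. For (iv) observe that if $x\notin A$ then for any $y\in{\rm supp}(g)$ we have $x-y\notin{\rm supp}(f)$. Thus $f(x-y)g(y)=0$ for all $y$, so $(f*g)(x)=0$. Since $A$ is closed, this gives ${\rm supp}(f*g)\subset A$.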
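The algebraic properties in Proposition 3 can be checked by hand in a discrete model. The following is a minimal sketch, assuming we replace ${\mathbb R}^n$ by the integers with counting measure and store finitely supported functions as dictionaries; the helper names `convolve`, `shift` and `support` are hypothetical and introduced only for this illustration.

```python
def convolve(f, g):
    """Convolution of finitely supported functions on Z, stored as {index: value}."""
    out = {}
    for i, a in f.items():
        for j, b in g.items():
            out[i + j] = out.get(i + j, 0) + a * b
    return out

def shift(f, y):
    """Translation operator: (tau_y f)(x) = f(x - y)."""
    return {k + y: v for k, v in f.items()}

def support(f):
    """Set of points where f does not vanish."""
    return {k for k, v in f.items() if v != 0}

f = {0: 1, 1: 2, 2: -1}
g = {5: 3, 7: 1}
h = {-2: 1, -1: 4}

commutative = convolve(f, g) == convolve(g, f)
associative = convolve(convolve(f, g), h) == convolve(f, convolve(g, h))
translation = shift(convolve(f, g), 3) == convolve(shift(f, 3), g)
sumset = {i + j for i in support(f) for j in support(g)}
support_ok = support(convolve(f, g)) <= sumset
print(commutative, associative, translation, support_ok)
```

Integer values are used so that the comparisons are exact; with floating point data one would compare up to a tolerance instead.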
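A very useful property of the convolution of two functions is that it adopts the smoothness of the `nicest’ function. Formally this is because any differentiation operator applied to $f*g$ can be transferred to either $f$ or $g$:
$$\partial^\alpha(f*g)=(\partial^\alpha f)*g=f*(\partial^\alpha g).$$
Here we use the standard multi-index notation: for $\alpha=(\alpha_1,\ldots,\alpha_n)$, a multi-index of nonnegative integers, and $x=(x_1,\ldots,x_n)\in{\mathbb R}^n$ we write as usual $\partial^\alpha=\partial_{x_1}^{\alpha_1}\cdots\partial_{x_n}^{\alpha_n}$. We also write $|\alpha|=\alpha_1+\cdots+\alpha_n$. In practice we need one of the functions to have some regularity and some mild conditions on the second function to do this rigorously. For example we have the following: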
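Proposition 4 Let $f\in L^1({\mathbb R}^n)$ and suppose that $g$ has continuous partial derivatives up to $k$-th order, that is $g\in C^k({\mathbb R}^n)$. Suppose also that $\partial^\alpha g$ is bounded for all $|\alpha|\le k$. Then $f*g$ has continuous derivatives up to $k$-th order, i.e. $f*g\in C^k({\mathbb R}^n)$, and $\partial^\alpha(f*g)=f*(\partial^\alpha g)$ for all $|\alpha|\le k$.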
Proof: Let’s just see the special case and
. The proof in the general case is identical. Call
. Since
,
is a finite, absolutely continuous measure. We then need to show that
Fix some sequence . Observe that
. By the mean value theorem we have that
Using Lebesgue’s dominated convergence theorem we get
Observe that the hypothesis on the boundedness of the higher order derivatives will be used to show the uniform boundedness of (the analogues of) the functions in the general case.
It is instructive to fix one function to be an indicator function, say
where the constant
is there just in order to normalize the total
-mass of the function
to
. Usually we consider smooth versions of
but let’s just stick to the case of the characteristic function for the sake of simplicity. Consider the `reflection’ of
given as
. Since we have started with an even function this makes no difference so that
. Observe that we can write
For some fixed , the translations of
by
,
centers the function
at the point
. So
is (a multiple of) the indicator function of an interval of length
, centered at
. Integrating against
essentially averages the function
around the point
with `weight’, the function
. In this averaging process, our choice of
implies that only the values of
at a scale
around
will be important. Thus the convolution of
and
replaces the value of
at a point with the average of the values of
at a scale
around a point. One can take this process one step further and consider sequences of functions that are more and more concentrated around the origin, but have the same
mass, say
. For example the second function in this sequence would be
, the third could be
, and so on. Taking convolutions of the function
with the functions
amounts to averaging the function
around every point, at smaller and smaller scales around the point. Intuitively one thinks that in the limit one should recover the function itself, at least in some weak sense. This turns out to be indeed the case. But what’s the gain? We just saw that taking convolutions of an integrable (say) function with a smooth bounded function gives us again a smooth function. Thus the previous process allows us to approximate (in some sense) any reasonable function by a sequence of very smooth functions. This has many technical advantages, as one can think of any function as a limit, in the appropriate sense, of smooth approximations. This also gives a heuristic explanation of why the convolution of two functions is at least as well behaved as the `nicest’ function in the convolution; averaging is a smoothing process.
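We will now make the previous heuristic discussion precise. Let $\phi$ be a function on ${\mathbb R}^n$ and $t>0$. We define the dilations of the function $\phi$ to be
$$\phi_t(x)=t^{-n}\phi(x/t).$$
Usually we will have a lot of freedom in choosing the function $\phi$ and we will require at least that $\phi\in L^1({\mathbb R}^n)$. Observe that dilating the function $\phi$ by $t$ doesn’t change the integral:
$$\int_{{\mathbb R}^n}\phi_t(x)\,dx=\int_{{\mathbb R}^n}\phi(x)\,dx.$$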
You should think of the function as a function concentrated around a point as was for example
in the previous discussion or, even better, as smooth approximations of it (bump function). Thus for example
could be a smooth function with compact support around the origin. Observe that as
, the mass of the function
, which is constant, becomes more and more concentrated around the origin. We will refer to this construction as `an approximation to the identity’. The reason is that, as was mentioned before, one can recover any reasonable function
by convolving with
and taking the limit as
, at least in the
sense. A more rigorous explanation is that
converges (in a weak sense) to a Dirac mass at
.
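The two features of an approximation to the identity described above, constant total mass and concentration around the origin, can be observed numerically. The following is a minimal sketch in dimension $n=1$, assuming the normalization $\phi_t(x)=t^{-1}\phi(x/t)$ for the dilations; the triangle kernel `phi` and the helper `integrate` are illustrative choices and not part of the notes.

```python
def phi(x):
    """Triangle kernel supported on [-1, 1] with integral 1 (an illustrative choice)."""
    return max(0.0, 1.0 - abs(x))

def dilate(t):
    """Return the dilation phi_t(x) = phi(x / t) / t in dimension n = 1."""
    return lambda x: phi(x / t) / t

def integrate(func, a, b, steps):
    """Midpoint Riemann sum of func over [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))

for t in (1.0, 0.5, 0.1):
    phi_t = dilate(t)
    total = integrate(phi_t, -2.0, 2.0, 4000)       # total mass, independent of t
    near_origin = integrate(phi_t, -0.2, 0.2, 400)  # mass carried by |x| < 0.2
    print(t, round(total, 6), round(near_origin, 6))
```

The total mass stays equal to $1$ while the fraction of the mass inside the fixed interval $|x|<0.2$ increases to $1$ as $t$ decreases.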
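Theorem 5 Let $\phi\in L^1({\mathbb R}^n)$ with $\int\phi=1$. For $t>0$ define the dilations of $\phi$ as before, $\phi_t(x)=t^{-n}\phi(x/t)$. Then, for any $f\in L^p({\mathbb R}^n)$, $1\le p<\infty$, we have that $f*\phi_t\to f$ in $L^p$ as $t\to0$:
$$\lim_{t\to0}\|f*\phi_t-f\|_{L^p}=0.$$
Proof: For $y\in{\mathbb R}^n$ we use the notation $(\tau_yf)(x)=f(x-y)$ for the translation operator. Using the fact that $\phi_t$ has integral $1$ and the change of variables $y\mapsto ty$ we can write
$$(f*\phi_t)(x)-f(x)=\int[f(x-y)-f(x)]\phi_t(y)\,dy=\int[(\tau_{ty}f)(x)-f(x)]\phi(y)\,dy.$$
By Minkowski’s integral inequality we get that
$$\|f*\phi_t-f\|_{L^p}\le\int\|\tau_{ty}f-f\|_{L^p}|\phi(y)|\,dy.$$
Now $\|\tau_{ty}f-f\|_{L^p}\to0$ as $t\to0$ for every fixed $y$ (see remark below) and $\|\tau_{ty}f-f\|_{L^p}|\phi(y)|\le2\|f\|_{L^p}|\phi(y)|\in L^1({\mathbb R}^n)$, so by the dominated convergence theorem we get the result.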
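Remark 1 The translation operator is continuous in $L^p({\mathbb R}^n)$ for all $1\le p<\infty$, that is
$$\lim_{y\to0}\|\tau_yf-f\|_{L^p}=0$$
for all $f\in L^p({\mathbb R}^n)$, $1\le p<\infty$.
Observe that for $p=\infty$, $\|\tau_yf-f\|_{L^\infty}\to0$ as $y\to0$ means that $f$ is (after modification on a set of measure zero) uniformly continuous. This shows why the previous theorem breaks down in $L^\infty({\mathbb R}^n)$.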
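The contrast between $p<\infty$ and $p=\infty$ is already visible for the indicator function of an interval. The sketch below is a discrete stand-in for Theorem 5, assuming a uniform grid on $[-2,2]$ and a normalized box kernel realized as a moving average; `box_average` and the grid parameters are hypothetical choices made for the illustration.

```python
def box_average(values, w):
    """Mean of the samples within w grid steps of each index (a normalized box kernel)."""
    n = len(values)
    out = []
    for i in range(n):
        lo, hi = max(0, i - w), min(n, i + w + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out

h = 0.01                                                    # grid step on [-2, 2]
f = [1.0 if 100 <= i <= 300 else 0.0 for i in range(401)]   # indicator of [-1, 1]

l1_errors, sup_errors = [], []
for w in (50, 25, 12, 6):                                   # kernel supported on [-w*h, w*h]
    avg = box_average(f, w)
    diffs = [abs(a - b) for a, b in zip(avg, f)]
    l1_errors.append(h * sum(diffs))                        # discrete L^1 error
    sup_errors.append(max(diffs))                           # discrete sup-norm error
print(l1_errors)
print(sup_errors)
```

As the kernel shrinks, the $L^1$ error decreases towards zero, while the sup-norm error stays close to $1/2$ because of the jumps of $f$ at $\pm1$.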
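Exercise 3 Show that the translation operator is continuous in $L^p({\mathbb R}^n)$ for $1\le p<\infty$. Use the fact that continuous functions with compact support are dense in $L^p({\mathbb R}^n)$ for $1\le p<\infty$. See also section 2.
Exercise 4 Let $\phi\in L^1({\mathbb R}^n)$ with $\int\phi=1$ and let $\phi_t$ denote its dilations.
(i) Show that $\|f*\phi_t-f\|_{L^\infty}\to0$ as $t\to0$ for all $f$ which are bounded and uniformly continuous.
(ii) If $f$ is bounded and continuous on ${\mathbb R}^n$ show that $f*\phi_t\to f$ as $t\to0$, uniformly on compact subsets of ${\mathbb R}^n$.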
Remark 2 There is a slight abuse of notation here. We use
for the norm in the space
defined in terms of the essential supremum of a function. However, the right norm in spaces of continuous functions should be defined in terms of the actual supremum of the function. Note however that for a continuous function, the two notions are identical so this should create no confusion.
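Exercise 5 Let $1<p<\infty$ and $p'$ be its dual exponent. Suppose that $f\in L^p({\mathbb R}^n)$ and $g\in L^{p'}({\mathbb R}^n)$. Show that $(f*g)(x)$ exists for every $x\in{\mathbb R}^n$ and that $f*g$ is continuous and decays to zero at infinity. Also show the estimate
$$\|f*g\|_{L^\infty}\le\|f\|_{L^p}\|g\|_{L^{p'}}.$$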
Remark 3 If
is a finite Borel measure on
and
it makes perfect sense to define the convolution of
with
to be the function
We then have
where
is the total variation of the measure
.
2. Some dense classes of functions
In this paragraph we will discuss some dense sub-classes of functions inside the $L^p$ spaces. These will prove to be very useful, as many estimates will be easier to establish for these special sub-classes. Also, many times, working with a dense class in $L^p$ helps us avoid several technical difficulties or even define operators that are not obviously defined directly on some $L^p$ space. We will state some of the results here in the generality of a locally compact Hausdorff space, noting that everything goes through for ${\mathbb R}^n$ equipped with the Lebesgue measure.
Simple functions: Let be the class of all simple functions
such that
that is all simple complex valued functions that have support of finite measure. For the space
is dense in
. The space of all simple functions (not necessarily with support of finite measure) is dense in
for
.
Continuous functions with compact support: Let be a measure space, where
is a locally compact Hausdorff space,
is a
-algebra that contains all compact subsets of
and such that
(i) locally finite: for all compact sets
.
(ii) is inner regular, meaning
(iii) is outer regular, meaning
We denote by the space of continuous functions
with compact support. Then, for every
,
is dense in
.
Remark here that whenever we embed into
,
automatically inherits the topology induced by the larger space, that is, the one defined by the norm
. Since
spaces are complete under our hypotheses, this says that
is the completion of
with respect to the norm of
for
. For $p=\infty$, the completion of $C_c(X)$ with respect to the $L^\infty$ norm is not $L^\infty(X)$ but the space of continuous functions on $X$ that vanish at infinity.
Continuous functions that vanish at infinity: Let be a locally compact Hausdorff space (a Hausdorff space where every point has a compact neighborhood). A function
is said to vanish at infinity if for every
there exists a compact set
such that
for all
. We denote by
the space of all complex valued continuous functions on
that vanish at infinity.
It is clear that , and actually the two spaces coincide whenever
is compact. We can equip the space
with the norm
Theorem 6 If
is a locally compact Hausdorff space, then
is the completion of
with respect to the supremum norm defined above.
For the proofs of the previous classical results see for example [F] or [R].
All the previous results apply to the Euclidean setup . Of course simple functions with support of finite measure are dense in
whenever
. A bit more can be said, as we can choose our simple functions to be linear combinations of characteristic functions of ($n$-dimensional) bounded intervals, and these are still dense in
. Continuous functions with compact support are also dense in
for all
. We can also restrict to a smaller class of more regular functions:
Infinitely differentiable functions with compact support: Let us consider the space of functions which are infinitely differentiable and have compact support. We denote this space by
. First of all it is not totally trivial that this space is non-empty.
Lemma 7 There exists a function
. From this we easily conclude that there is a
.
Exercise 6 Consider the function
(i) Show that this function is infinitely differentiable and that it is bounded together with its derivatives of any order.
(ii) Consider the function
. Show that
if
and
otherwise. It is obvious then that
.
(iii) For
show that the function
belongs to
.
(iv) For
consider the function
Show that
.
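For concreteness, here is a minimal sketch of one standard construction of a smooth compactly supported function. It assumes the classical building block $f(t)=e^{-1/t}$ for $t>0$, $f(t)=0$ for $t\le0$, and the radial bump $x\mapsto f(1-|x|^2)$; this is not necessarily the exact function used in Exercise 6, and the names `f` and `bump` are introduced only for the illustration.

```python
import math

def f(t):
    """Smooth on R: exp(-1/t) for t > 0 and 0 for t <= 0."""
    return math.exp(-1.0 / t) if t > 0 else 0.0

def bump(x):
    """Smooth function on R^n supported in the closed unit ball; x is a tuple of coordinates."""
    return f(1.0 - sum(c * c for c in x))

print(bump((0.0, 0.0)))   # value at the origin of R^2
print(bump((0.5,)))       # strictly positive inside the unit ball of R
print(bump((2.0, 0.0)))   # vanishes outside the unit ball
```

Dividing `bump` by its integral would give a candidate for the function $\phi$ used in approximations to the identity.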
Obviously . However, it is not hard to see that the space
is still dense in
for
. It will however be easier to show that once we’ve introduced some more tools from real analysis and, in particular, convolution.
Schwartz functions: Here we introduce the space of Schwartz functions $\mathcal S({\mathbb R}^n)$, which will turn out to be extremely useful in what follows. So let $\mathcal S({\mathbb R}^n)$ be the space of all infinitely differentiable ($C^\infty$) functions $f:{\mathbb R}^n\to{\mathbb C}$ such that
$$\sup_{x\in{\mathbb R}^n}|x^\alpha\partial^\beta f(x)|<\infty$$
for all multi-indices $\alpha,\beta$ of nonnegative integers. In other words, Schwartz functions are smooth functions that, together with their partial derivatives of every order, decay at infinity faster than the inverse of any polynomial. Of course every function in the class
is trivially a Schwartz function since it vanishes identically at infinity together with its derivatives of every order. A more interesting example of a Schwartz function is the Gaussian function
:
The space is also dense in all
spaces for
. Of course this is immediate once one shows that
is dense in
.
Schematically we have the following inclusions
and each space in this chain is dense in with the topology induced by
for
. We will discuss the space of Schwartz functions in much more detail in what follows. For now you can think of it as another nice class of functions that is dense in all the spaces
for
.
In the following proposition we use convolutions to show the previous denseness properties:
Proposition 8 The space
, and thus also the space
, is dense in
for all
. Also the space
is dense in
in the supremum norm.
Proof: Let and
. Since the space
is dense in
, there is a
such that
Let with
. By Theorem 5 we have that
in
as
. Thus for
small enough we have that
We conclude that
It remains to verify that is in
for every
. Note however that
is smooth by Proposition 4. Also, since both
and
have compact support, Proposition 3 shows that
also has compact support and we are done. Observe that the same argument applies if we start with a
. Using the fact
is dense in
it suffices to approximate a function
. However, functions in
are obviously bounded, so Exercise 4 completes the proof in this case as well.
Let us go back to approximations of the identity and justify their name.
Exercise 7 (convergence of approximations to the identity in the sense of distributions) For
we denote by
the Dirac measure at the point
:
Let
with
and consider the approximation to the identity
,
. Show that
for every
. We say that
(considered as a sequence of finite measures) converges in the sense of distributions to the measure
. We will come back to that point later on in the course.
3. Operators on $L^p$ spaces; boundedness and interpolation
Having set up our main environment, the spaces , we come to the core of this introduction: operators acting on these spaces and their properties. In general, we will consider operators
taking functions on some measure space
to functions on some other measure space
. Many times our operators will be initially defined on `nice functions’ such as smooth functions with compact support or Schwartz functions. The goal would then be to extend the operator to a standard normed vector space such as
.
Suppose that and
are two normed vector spaces (usually Banach spaces of functions) and let
be a linear operator, that is, we have
for all and complex numbers
. We will say that
is bounded if there is a constant
such that
for every
. The norm of the operator
, denoted by
or just
, is the smallest constant
so that such an inequality is true. We thus have
Continuity is equivalent to boundedness for linear operators:
Lemma 9 Let
be a linear operator. The following are equivalent: (i) The operator
is continuous.
(ii) The operator
is continuous at
.
(iii) The operator
is bounded.
Suppose that we want to show that a linear operator is a well defined bounded linear operator, where
are Banach spaces. Many times however we can only define the operator on some dense subset
. Suppose we have then that
. When can we extend
to the whole class
? Given
, the obvious thing to do is to consider some sequence
such that
. We then need to examine whether the limit
exists. Suppose that
is bounded on the dense sub-class, that is,
for all . Using the boundedness of
on the dense class and linearity (essential) we can conclude that
so the sequence is a Cauchy sequence. The completeness of
then implies that the limit of
does indeed exist, so we can define
Observe also that for any other sequence we must have
for any . From this we conclude that
thus the extension is unique. Many times we will only define the operator
on the dense class and show its continuity on the dense sub-class. We will then say that
is densely defined.
We will use this device many times in trying to show that some linear operator is well defined and bounded, by examining the continuity of
on one of the dense classes that we have considered before (depending on what is more convenient).
A more general class of operators we will come across quite often is that of sublinear operators. Suppose that $T$ is an operator acting on a vector space of measurable functions. Then $T$ is called sublinear if
$$|T(cf)|=|c||T(f)|$$
for all complex constants $c$ and
$$|T(f+g)|\le|T(f)|+|T(g)|$$
for all $f,g$ in the vector space. Of course all linear operators are sublinear. However, the most typical example of a sublinear operator we will come across is a maximal type operator. Such an operator has the form
$$T^*(f)(x)=\sup_{t\in\Lambda}|T_t(f)(x)|,$$
where $\{T_t\}_{t\in\Lambda}$ is a family of linear operators acting on some vector space of measurable functions, $\Lambda$ is an infinite countable or uncountable index set, and the function $T^*(f)$ is a measurable function of $x$. Such operators are called maximal operators and the linearity of each $T_t$ guarantees that $T^*$ is sublinear.
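The sublinearity of a maximal operator can be verified on a toy example. The sketch below assumes a finite family of averaging operators on sequences, indexed by a handful of radii; `average` and `maximal` are hypothetical helpers and only mimic the structure of a supremum over a family of linear operators.

```python
def average(f, i, r):
    """Linear operator T_r evaluated at index i: mean of f over the window of radius r."""
    lo, hi = max(0, i - r), min(len(f), i + r + 1)
    return sum(f[lo:hi]) / (hi - lo)

def maximal(f, radii=(0, 1, 2, 4)):
    """Maximal operator: pointwise supremum over r of |T_r f|."""
    return [max(abs(average(f, i, r)) for r in radii) for i in range(len(f))]

f = [1.0, -3.0, 2.0, 0.0, 5.0, -1.0]
g = [0.5, 4.0, -2.0, 1.0, -5.0, 2.0]
c = -2.5

Mf, Mg = maximal(f), maximal(g)
M_sum = maximal([a + b for a, b in zip(f, g)])
M_scaled = maximal([c * a for a in f])

subadditive = all(s <= a + b + 1e-12 for s, a, b in zip(M_sum, Mf, Mg))
homogeneous = all(abs(s - abs(c) * a) < 1e-12 for s, a in zip(M_scaled, Mf))
print(subadditive, homogeneous)
```

Each `average` is linear in `f`, but taking the absolute value and the supremum destroys linearity and leaves only the two sublinearity conditions.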
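Definition 10 (i) Let $1\le p,q\le\infty$ and $T$ be a sublinear operator on $L^p(X,\mu)$. We will say that $T$ is of strong type $(p,q)$ if
$$\|T(f)\|_{L^q(Y,\nu)}\lesssim_{p,q}\|f\|_{L^p(X,\mu)}$$
for all $f\in L^p(X,\mu)$, where the implied constant depends only on $p$ and $q$. In this case we write $\|T\|_{L^p\to L^q}$ for the norm of the operator $T$.
(ii) We will say that $T$ is of weak type $(p,q)$ if
$$\|T(f)\|_{L^{q,\infty}(Y,\nu)}\lesssim_{p,q}\|f\|_{L^p(X,\mu)}$$
for all $f\in L^p(X,\mu)$. We will write $\|T\|_{L^p\to L^{q,\infty}}$ for the norm of the operator $T$.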
Observe that for fixed , the strong type
property of
trivially implies that
is of weak type
. The opposite, of course, is not true. However, we will see that in many cases the strong type bound can be deduced by interpolating between suitable endpoint weak type bounds. The first such result is the Marcinkiewicz interpolation theorem.
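The gap between strong and weak type bounds comes from the gap between the $L^p$ norm and the weak $L^p$ quasi-norm, which Chebyshev's inequality orders as $\|f\|_{L^{p,\infty}}\le\|f\|_{L^p}$. Here is a minimal sketch on a finite set with counting measure, assuming the usual definition $\|f\|_{L^{p,\infty}}=\sup_{\lambda>0}\lambda\,\mu(\{|f|>\lambda\})^{1/p}$; the function names are hypothetical.

```python
def strong_norm(f, p):
    """L^p norm with respect to counting measure."""
    return sum(abs(a) ** p for a in f) ** (1.0 / p)

def weak_norm(f, p):
    """sup over lam > 0 of lam * #{|f| > lam}^(1/p), computed from the decreasing rearrangement."""
    decreasing = sorted((abs(a) for a in f), reverse=True)
    return max(a * (k + 1) ** (1.0 / p) for k, a in enumerate(decreasing))

f = [1.0 / k for k in range(1, 1001)]   # f(k) = 1/k on {1, ..., 1000}
print(weak_norm(f, 1), strong_norm(f, 1))
print(weak_norm(f, 2), strong_norm(f, 2))
```

For $f(k)=1/k$ and $p=1$ the weak quasi-norm stays equal to $1$ while the $L^1$ norm is a harmonic sum that grows logarithmically with the size of the set, so no reverse inequality can hold with a uniform constant.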
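Theorem 11 (Marcinkiewicz interpolation theorem) Let $(X,\mu)$ and $(Y,\nu)$ be measure spaces, $1\le p_0<p_1\le\infty$, and let $T$ be a sublinear operator defined on $L^{p_0}(X,\mu)+L^{p_1}(X,\mu)$ and taking values in the space of measurable functions on $Y$. Suppose that $T$ is of weak type $(p_0,p_0)$ and of weak type $(p_1,p_1)$. Then $T$ is of strong type $(p,p)$ for any $p_0<p<p_1$.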
Remark 4 Before going into the proof of this theorem let us discuss a bit its hypothesis. Given a function
we first need to show that
is well defined. Having the information that
is well defined on
we essentially need to see that
whenever
. To see this, fix a positive constant
, to be defined later, and consider the functions
Obviously we have
. Moreover,
Similarly we can estimate
This shows that we can decompose any function
to a sum of two functions
and
, whenever
, thus
. In particular,
is well defined for any
.
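The decomposition used in this remark is easy to test numerically. The sketch below assumes the standard splitting of $f$ at height $\gamma$ into the part where $|f|>\gamma$ and the part where $|f|\le\gamma$, together with the elementary bounds $\|f\chi_{\{|f|>\gamma\}}\|_{p_0}^{p_0}\le\gamma^{p_0-p}\|f\|_p^p$ and $\|f\chi_{\{|f|\le\gamma\}}\|_{p_1}^{p_1}\le\gamma^{p_1-p}\|f\|_p^p$ for $p_0<p<p_1<\infty$; the names `split_at_level`, `large` and `small` are introduced only for this illustration.

```python
def split_at_level(f, gamma):
    """Split f into the part where |f| > gamma and the part where |f| <= gamma."""
    large = [a if abs(a) > gamma else 0.0 for a in f]
    small = [a if abs(a) <= gamma else 0.0 for a in f]
    return large, small

def norm_pow(f, p):
    """The quantity ||f||_p^p for counting measure."""
    return sum(abs(a) ** p for a in f)

p0, p, p1 = 1.0, 2.0, 4.0
f = [3.0, -0.2, 1.5, 0.7, -4.0, 0.05, 1.0]

checks = []
for gamma in (0.5, 1.0, 2.0):
    large, small = split_at_level(f, gamma)
    recombines = all(a + b == c for a, b, c in zip(large, small, f))
    large_ok = norm_pow(large, p0) <= gamma ** (p0 - p) * norm_pow(f, p) + 1e-12
    small_ok = norm_pow(small, p1) <= gamma ** (p1 - p) * norm_pow(f, p) + 1e-12
    checks.append((recombines, large_ok, small_ok))
print(checks)
```

The large part is controlled in the smaller exponent and the small part in the larger exponent, which is exactly what is needed to apply the two endpoint hypotheses separately.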
Proof: We first prove the theorem when $p_1<\infty$. Since our hypothesis involves the distribution sets of
it is convenient to recall the representation of the
norm of a function in terms of its distribution set. Indeed, from Proposition 9 of notes 1 we have
The measure of the set will appear many times in the proof so it is convenient to give it a shorter notation:
Fix for a moment and consider the decomposition of the function
at level
as in the remark before:
The sublinearity of allows us to write
for any . Thus,
so that
Since and
is of weak type
we can estimate the first summand as
Similarly, since and
is of weak type
we have
where are two numerical constants depending only on
respectively and on
and
. For simplicity we suppress the dependence of the constants
on these parameters. Combining the previous estimates we can write
Unravelling the definitions of the previous estimate yields
In order to recover the norm of
observe by (2) that it’s enough to multiply
by
and integrate in
.
Multiplying the first summand on the right hand side of (4) by and integrating we get
Similarly, multiplying the second summand in (4) by and integrating we have
Summing up the previous two estimates we conclude that
which shows that is of strong type
with
Observe that there is no claim here that this quantitative estimate on the norm of is optimal in general.
The proof in the case is very similar. Now the hypothesis that
is of weak type
is replaced by the hypothesis that
maps
to
. That is, there exists some constant
, depending only on
and
, such that
for all . We fix some level
and we split the function
as
where
. Obviously
so by the hypothesis we have that
. Arguing as in the case
we can write
Since , the second summand in the previous estimate vanishes identically. We conclude that
This concludes the proof in the case as well as providing the quantitative estimate
.
Exercise 8 Modify the proof above to show that under the hypotheses of the Marcinkiewicz interpolation theorem we can conclude that
where
for some
.
Hint: This is already the constant appearing in the case
. For the case
split the function
at the level
(instead of
), for some
, and optimize in the parameter
at the end of the proof. For this, use the heuristic that a sum is optimized when the terms in the sum are roughly equal in size.
Exercise 9 Let
and suppose that
. Show that
for all
. Hint: The proof is very similar to the proof of the Marcinkiewicz interpolation theorem, only simpler. Use again the fact that
and split the range of
as
, at an appropriate level
. Use the weak integrability conditions for
in the appropriate intervals of
.
Exercise 10 Let
be a finite set equipped with counting measure and let
be a function. Show that for any
we have that
Thus on finite sets, the spaces
and
are equivalent. Here
denotes the cardinality of
.
Hint: Observe that
and use Proposition 9 of notes 1.
Exercise 11 (Dual formulation of
) Let
. Show that for every
, we have
where
.
Hint: As in the previous exercise, write
Since the set
has finite measure one can estimate further the measure of the level set by
Now split the integral we want to estimate accordingly in order to take advantage of this estimate. See also the hint in the previous exercise. This will give you one direction of the estimate, the other direction being trivial.
While the Marcinkiewicz interpolation theorem is the prototype of real interpolation, complex methods can be used to derive similar conclusions. An example of such a method has already been used via the three lines lemma applied to exhibit the log convexity of the norms (which is also a form of interpolation). We will now describe the prototype of complex interpolation.
The following theorem has some differences compared to the Marcinkiewicz interpolation theorem. First of all we assume that is linear rather than sublinear. Note as well that our hypotheses concern strong type bounds for the operator
rather than weak endpoint bounds. On the other hand, the conclusion gives a good estimate for the norm of the operator when interpolating between the endpoints and allows more freedom in the choice of the exponents at the endpoints.
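Theorem 12 (Riesz-Thorin interpolation theorem) Let $(X,\mu)$, $(Y,\nu)$ be measure spaces and $1\le p_0,p_1,q_0,q_1\le\infty$. Let $T$ be a linear operator that is of strong type $(p_0,q_0)$ with norm $k_0$ and of strong type $(p_1,q_1)$ with norm $k_1$. That is we have that
$$\|T(f)\|_{L^{q_0}(Y,\nu)}\le k_0\|f\|_{L^{p_0}(X,\mu)}$$
for all $f\in L^{p_0}(X,\mu)$ and
$$\|T(f)\|_{L^{q_1}(Y,\nu)}\le k_1\|f\|_{L^{p_1}(X,\mu)}$$
for all $f\in L^{p_1}(X,\mu)$. Then $T$ is of strong type $(p_\theta,q_\theta)$ with norm at most $k_0^{1-\theta}k_1^\theta$:
$$\|T(f)\|_{L^{q_\theta}(Y,\nu)}\le k_0^{1-\theta}k_1^\theta\|f\|_{L^{p_\theta}(X,\mu)}$$
for all $f\in L^{p_\theta}(X,\mu)$, where
$$\frac1{p_\theta}=\frac{1-\theta}{p_0}+\frac\theta{p_1}$$
and
$$\frac1{q_\theta}=\frac{1-\theta}{q_0}+\frac\theta{q_1},$$
with $0\le\theta\le1$.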
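The bookkeeping of exponents in the Riesz-Thorin theorem is a frequent source of arithmetic slips, so a small calculator can be handy. This is a minimal sketch, assuming the usual relations $1/p_\theta=(1-\theta)/p_0+\theta/p_1$ and the bound $k_0^{1-\theta}k_1^\theta$ for the interpolated norm; the function names and the sample constants $4$ and $9$ are made up for the illustration.

```python
def interpolate_exponent(p0, p1, theta):
    """Return p_theta with 1/p_theta = (1 - theta)/p0 + theta/p1; float('inf') is allowed."""
    inverse = (1.0 - theta) / p0 + theta / p1
    return float("inf") if inverse == 0.0 else 1.0 / inverse

def interpolated_bound(k0, k1, theta):
    """The bound k0^(1 - theta) * k1^theta for the norm at the intermediate exponents."""
    return k0 ** (1.0 - theta) * k1 ** theta

inf = float("inf")
# Interpolating between type (1, 1) and type (inf, inf) at theta = 1/2 gives type (2, 2).
p_theta = interpolate_exponent(1.0, inf, 0.5)
q_theta = interpolate_exponent(1.0, inf, 0.5)
bound = interpolated_bound(4.0, 9.0, 0.5)
print(p_theta, q_theta, bound)
```

The bound is the geometric mean of the endpoint norms weighted by $\theta$, so it never exceeds the larger of the two.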
Proof: Let us first consider the case . Then by the log-convexity of the
norm we get directly that
as desired. We can therefore focus on the case so that
. Without loss of generality we can assume that
.
We divide the proof in several steps:
step 1: It is enough to prove the theorem for . To see this just observe that we can always replace the measures
by
,
respectively, for appropriate constants
. We can choose these constants so that
and then we also have
. Doing the calculations you will see that we need to define the constants
by means of the equations
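In what follows we will therefore assume that both operator norms are equal to $1$ in the statement of the theorem.
step 2: We have that
$$\bigg|\int_YT(f)\,g\,d\nu\bigg|\le\|f\|_{L^{p_\theta}(X,\mu)}\|g\|_{L^{q_\theta'}(Y,\nu)}\qquad(5)$$
for all simple functions $f,g$ of finite measure support. Here $q_\theta'$ is the dual exponent of $q_\theta$.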
First of all, since is of strong type
, Hölder’s inequality shows that
and, similarly, by the type of
we get that
Thus, estimate (5) is true for . It is obvious that we need to interpolate between the two endpoint estimates above. We will do that by means of the three lines convexity lemma. First we define the map
where . Here there is a problem in the case
since the dual exponents are equal to
. In this case the definition of
should be understood as
The function is a holomorphic function of
. Furthermore, since
are simple functions of finite measure support, it is not hard to see that
is actually bounded on the strip
. Furthermore, for
we see that
. Now, on the boundary of the strip we have that
from (6) and similarly
from (7). Using the three lines lemma we get that
The right hand side however is equal to . Applying the result for
and
we get the claim of step 2. Observe that nothing really changes in the case
.
step 3: We have that
for all and all simple functions
of finite measure support.
To see this let and
be a simple function with finite measure support. We write
. Observe that
and
. Now let
be sequences of simple functions of finite measure support such that
and
as . We write
. By step 2 and (6) and (7) we have that
Letting and observing that
as
we get the claim of this step as well.
step 4: We have that
for all and
.
First of all observe that from step 3 we can actually conclude that
for all and simple functions
of finite measure support. In order to see this let
be any simple function that vanishes outside a set
of finite measure and define
. Consider a sequence of simple functions
such that
and
. In particular
vanishes outside the set
. We thus have the estimate
. Also observe that
since
is a function in
by our hypothesis. Lebesgue’s dominated convergence theorem now shows that
Now for any and
, let
be a sequence of simple functions with finite measure support such that
pointwise and
. Fatou’s lemma now gives
This proves the claim of this step as well. Duality between and
now completes the proof of the theorem.
As a first application of the Riesz-Thorin interpolation theorem we will now prove Young’s inequality on convolutions of functions.
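Proposition 13 (Young’s inequality) Let $1\le p,q,r\le\infty$ be such that $\frac1p+\frac1q=1+\frac1r$. If $f\in L^p({\mathbb R}^n)$ and $g\in L^q({\mathbb R}^n)$, then $f*g$ is a well defined function in $L^r({\mathbb R}^n)$ and we have the estimate
$$\|f*g\|_{L^r}\le\|f\|_{L^p}\|g\|_{L^q}.$$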
Proof: For and
fixed we define the operator
As we have already seen (see Exercise 1) we have the bound , that is,
is of strong type
. It is also very easy to see that if
is the conjugate exponent of
then we have
that is and
is of strong type
. Letting
and
, the Riesz-Thorin interpolation theorem shows that
is of strong type
. Replacing
and using the hypothesis
we get that
. Thus we conclude that
is of strong type
with norm at most
. That is we have
as we wanted to show.
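Young's inequality can be sanity checked in the discrete setting, where it holds in the same form for finitely supported sequences on the integers. The following is a minimal sketch, assuming counting measure in place of Lebesgue measure; `convolve`, `lp_norm` and `young_exponent` are hypothetical helper names.

```python
def convolve(f, g):
    """Discrete convolution (f*g)[k] = sum_j f[j] g[k-j] of finite sequences."""
    out = [0.0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

def lp_norm(f, p):
    """l^p norm for counting measure; p = float('inf') gives the sup norm."""
    if p == float("inf"):
        return max(abs(a) for a in f)
    return sum(abs(a) ** p for a in f) ** (1.0 / p)

def young_exponent(p, q):
    """Return r with 1/p + 1/q = 1 + 1/r (r is infinite when 1/p + 1/q = 1)."""
    inverse = 1.0 / p + 1.0 / q - 1.0
    return float("inf") if abs(inverse) < 1e-12 else 1.0 / inverse

f = [1.0, -2.0, 0.5, 3.0]
g = [0.25, 0.5, -1.0]
results = []
for p, q in ((1.0, 1.0), (1.5, 1.2), (2.0, 2.0), (3.0, 1.0)):
    r = young_exponent(p, q)
    lhs = lp_norm(convolve(f, g), r)
    rhs = lp_norm(f, p) * lp_norm(g, q)
    results.append(lhs <= rhs + 1e-12)
print(results)
```

The pair $(p,q)=(2,2)$ gives $r=\infty$ and reduces to the Cauchy-Schwarz inequality, while $q=1$ recovers estimate (1) from the first section.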
Exercise 12 (Schur’s test) Let
and
. Let
be a
-measurable function such that (i) For almost every
we have that
(ii) For almost every
we have that
.
We consider the operator
,
for suitable functions
. Define
and
.
Show that
is of strong type
with norm at most
where
and
are as in the Riesz-Thorin interpolation theorem.
Hint: First consider the sublinear operator
,
which is always well defined (though maybe infinite) and controls
. Use Minkowski’s integral inequality and Hölder’s inequality to show that
, and thus $T$ is of strong type
and of strong type
. Use the Riesz-Thorin interpolation theorem to conclude the proof.
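A finite dimensional instance of Schur's test can be checked directly. The sketch below assumes the classical special case in which the kernel is a matrix, $A$ bounds the absolute row sums and $B$ bounds the absolute column sums, so that interpolation between the $\ell^\infty$ and $\ell^1$ bounds gives $\|T\|_{\ell^p\to\ell^p}\le A^{1-1/p}B^{1/p}$; this is not necessarily the exact form of the hypotheses in the exercise, and all names below are hypothetical.

```python
import random

def apply_kernel(K, f):
    """(Tf)(x) = sum over y of K[x][y] * f[y]."""
    return [sum(k * v for k, v in zip(row, f)) for row in K]

def lp_norm(f, p):
    """l^p norm for counting measure, 1 <= p < infinity."""
    return sum(abs(a) ** p for a in f) ** (1.0 / p)

rng = random.Random(0)   # fixed seed so the experiment is reproducible
K = [[rng.uniform(-1, 1) for _ in range(5)] for _ in range(4)]
A = max(sum(abs(k) for k in row) for row in K)             # sup over x of the sum over y
B = max(sum(abs(row[y]) for row in K) for y in range(5))   # sup over y of the sum over x

def schur_bound_holds(p, trials=200):
    """Test ||Tf||_p <= A^(1 - 1/p) * B^(1/p) * ||f||_p on random vectors f."""
    bound = A ** (1.0 - 1.0 / p) * B ** (1.0 / p)
    for _ in range(trials):
        f = [rng.uniform(-1, 1) for _ in range(5)]
        if lp_norm(apply_kernel(K, f), p) > bound * lp_norm(f, p) + 1e-12:
            return False
    return True

print([schur_bound_holds(p) for p in (1.0, 1.5, 2.0, 4.0)])
```

Random testing cannot prove the bound, but a single failing vector would reveal an error in the exponents attached to $A$ and $B$.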
[Update 24 Feb 2011: Omission in Exercise 12 corrected and a solution hint added.]
[Update 15 Mar 2011: Typo in Exercise 8 corrected, Exercise 4 edited.]