1. Averages and maximal operators
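This week we will be discussing the Hardy-Littlewood maximal function and some closely related maximal type operators. In order to have something concrete let us first of all define the averages of a locally integrable function $f$ around the point $x\in\mathbb R^n$:

$$A_r(f)(x)=\frac{1}{|B(x,r)|}\int_{B(x,r)}f(y)\,dy,\qquad r>0,$$

where $B(x,r)$ is the Euclidean ball with center $x$ and radius $r>0$, and $|B(x,r)|$ denotes its Lebesgue measure. Note that since Lebesgue measure is translation invariant we have

$$|B(x,r)|=|B(0,r)|=\Omega_n r^n,$$

where $\Omega_n$ denotes the Lebesgue measure (or volume in this case) of the $n$-dimensional unit ball $B(0,1)$. Denoting by $\chi=\Omega_n^{-1}\mathbf 1_{B(0,1)}$ the indicator function of the normalized unit ball and noting that the balls centered at zero are $0$-symmetric, we can write

$$A_r(f)(x)=\frac{1}{\Omega_n r^n}\int_{\mathbb R^n}f(y)\,\mathbf 1_{B(0,r)}(x-y)\,dy=\int_{\mathbb R^n}f(y)\,\frac{1}{r^n}\chi\Big(\frac{x-y}{r}\Big)\,dy.$$

Thus $A_r(f)=f*\chi_r$ and of course $\{\chi_r\}_{r>0}$ is an approximation to the identity since $\int\chi=1$ and $\chi_r$ are just the dilations of the function $\chi$:

$$\chi_r(x)=r^{-n}\chi(x/r).$$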
Remembering the discussion that followed the definition of the convolution in Notes 2, the convolution of a locally integrable function $f$ with the dilations of an $L^1$ function $\phi$ was viewed as an averaging process. We see now that when $\phi=\chi$ this is exact, that is, $f*\chi_r(x)$ is the average of $f$ with respect to a ball around $x$ of radius $\simeq r$, where the implied constant only depends on the dimension. A similar conclusion follows if we start with any set $K\subset\mathbb R^n$ that is, say, $0$-symmetric and convex and normalized to volume $1$. We then have that

$$f*(\mathbf 1_K)_r(x)=\frac{1}{|rK|}\int_{x+rK}f(y)\,dy,$$

that is, $f*(\mathbf 1_K)_r$ are the averages of $f$ with respect to the dilations of the fixed convex body $K$ at every point $x$. Here we denote by $rK=\{ry:y\in K\}$ the dilations of $K$. It is an easy exercise to show that all these averages are uniformly bounded in size. For all $1\le p\le\infty$ we have

$$\|f*(\mathbf 1_K)_r\|_{L^p(\mathbb R^n)}\le\|f\|_{L^p(\mathbb R^n)}.$$

One of course could consider more general sets instead of convex sets which are $0$-symmetric and in fact this leads to one of the most interesting families of problems in harmonic analysis. This however falls outside the scope of this course and we will mostly focus on the case of the normalized unit ball which in some sense is the prototypical example.
The Hardy-Littlewood maximal operator (with respect to Euclidean balls) is defined as

$$M(f)(x)=\sup_{r>0}\frac{1}{|B(x,r)|}\int_{B(x,r)}|f(y)|\,dy=\sup_{r>0}(|f|*\chi_r)(x).$$

Observe that this is a sublinear operator that is well defined at least when $f$ is locally integrable. Although maximal operators are interesting in their own right, there are some very specific applications we have in mind. The first has to do with pointwise convergence of averages of a function and is a consequence of the following simple proposition.
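Proposition 1 Let $\{T_t\}_{t>0}$ be a family of sub-linear operators on $L^p(X,\mu)$ and define the maximal operator

$$T^*(f)(x)=\sup_{t>0}|T_t(f)(x)|.$$

If $T^*$ is of weak type $(p,q)$ then for any $t_0\ge0$ the set

$$\Big\{f\in L^p(X,\mu):\lim_{t\to t_0}T_t(f)(x)=f(x)\ \text{for almost every }x\in X\Big\}$$

is closed in $L^p(X,\mu)$.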
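Proof: In order to show that the set

$$E=\Big\{f\in L^p(X,\mu):\lim_{t\to t_0}T_t(f)(x)=f(x)\ \text{for almost every }x\in X\Big\}$$

is closed, consider a sequence of functions $f_n\in E$ with $f_n\to f$ in $L^p(X,\mu)$. We need to show that $f\in E$. To see this observe that for almost every $x\in X$ we have

$$\limsup_{t\to t_0}|T_t(f)(x)-f(x)|\le T^*(f-f_n)(x)+|f(x)-f_n(x)|.$$

Thus for any $\lambda>0$ we can write

$$\mu\big(\{x:\limsup_{t\to t_0}|T_t(f)(x)-f(x)|>\lambda\}\big)\le\mu\big(\{T^*(f-f_n)>\lambda/2\}\big)+\mu\big(\{|f-f_n|>\lambda/2\}\big)\lesssim\frac{\|f-f_n\|_{L^p}^q}{\lambda^q}+\frac{\|f-f_n\|_{L^p}^p}{\lambda^p}.$$

Since the right hand side tends to $0$ as $n\to\infty$ and the left hand side does not depend on $n$ we conclude that for every $\lambda>0$

$$\mu\big(\{x:\limsup_{t\to t_0}|T_t(f)(x)-f(x)|>\lambda\}\big)=0.$$

Now we have that

$$\mu\big(\{x:\limsup_{t\to t_0}|T_t(f)(x)-f(x)|>0\}\big)\le\sum_{k=1}^\infty\mu\big(\{x:\limsup_{t\to t_0}|T_t(f)(x)-f(x)|>1/k\}\big)=0.$$

Thus $\lim_{t\to t_0}T_t(f)(x)=f(x)$ for almost every $x\in X$ so that $f\in E$.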
Remark 1 We have indexed the family $\{T_t\}$ in $t>0$ for the sake of definiteness but one can of course consider more general index sets and the previous proposition remains valid. In every case that the index set is uncountable some attention should be given in assuring the measurability of $T^*(f)$.
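Remark 2 To get a clearer picture of what this proposition says consider the family of operators $T_t(f)=f*\phi_t$, $t>0$, for some $\phi\in L^1(\mathbb R^n)$ with integral $1$. As we have seen already many times, these averages of $f$ converge to $f$ in many different senses for different classes of functions $f$. In particular if $f\in C_c(\mathbb R^n)$ then $f*\phi_t$ converges to $f$ even uniformly as $t\to0$. Thus we have

$$C_c(\mathbb R^n)\subset\Big\{f\in L^p(\mathbb R^n):\lim_{t\to0}f*\phi_t(x)=f(x)\ \text{for almost every }x\Big\}.$$

Since $C_c(\mathbb R^n)$ is dense in $L^p(\mathbb R^n)$ for $1\le p<\infty$, Proposition 1 implies that if $f\mapsto\sup_{t>0}|f*\phi_t|$ is of weak type $(p,q)$ then $\lim_{t\to0}f*\phi_t(x)=f(x)$ for almost every $x\in\mathbb R^n$, for every $f\in L^p(\mathbb R^n)$. Thus in order to show that approximations to the identity converge to the function almost everywhere it is enough to show that the corresponding maximal operator is of weak type $(p,q)$. In what follows we will show that the Hardy-Littlewood maximal operator is of weak type $(1,1)$ and this already implies the corresponding statement for a wide class of `nice’ approximations to the identity.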
To avoid confusion, remember that in Theorem 15 of Notes 3 we have already exhibited that $\lim_{t\to0}f*\phi_t(x)=f(x)$ for every Lebesgue point $x$ of $f$. However this is only interesting if we already know that $f$ has `many’ Lebesgue points (in particular almost every point in $\mathbb R^n$). In Theorem 15 of Notes 3 we took for granted that the integral of a locally integrable function is almost everywhere differentiable and this in turn implied that almost every point in $\mathbb R^n$ is a Lebesgue point of $f$. In this part of the course we will fill in this gap by showing that the integral of a locally integrable function is almost everywhere differentiable.
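Exercise 1 Let $T^*$ be of weak type $(p,q)$. Show that for every $t_0$ the set

$$\Big\{f\in L^p(X,\mu):\lim_{t\to t_0}T_t(f)(x)\ \text{exists for almost every }x\in X\Big\}$$

is closed in $L^p(X,\mu)$.

Hint: The proof is very similar to that of Proposition 1. Observe that it suffices to show that

$$\mu\big(\{x:\limsup_{t\to t_0}T_t(f)(x)-\liminf_{t\to t_0}T_t(f)(x)>\lambda\}\big)=0$$

for every $\lambda>0$.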
2. The Hardy-Littlewood maximal theorem
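We focus our attention on the Hardy-Littlewood maximal operator; for $f\in L^1_{\rm loc}(\mathbb R^n)$,

$$M(f)(x)=\sup_{r>0}\frac{1}{|B(x,r)|}\int_{B(x,r)}|f(y)|\,dy.$$

The discussion in the previous section suggests that one should try to prove weak bounds for the operator $M$. In fact we will prove the following theorem which summarizes the boundedness properties of $M$.

Theorem 2 (Hardy-Littlewood maximal theorem) (i) The Hardy-Littlewood maximal operator is of strong type $(p,p)$ for $1<p\le\infty$:

$$\|M(f)\|_{L^p(\mathbb R^n)}\lesssim_{n,p}\|f\|_{L^p(\mathbb R^n)}$$

for all $f\in L^p(\mathbb R^n)$ and $1<p\le\infty$.

(ii) The Hardy-Littlewood maximal operator is of weak type $(1,1)$:

$$|\{x\in\mathbb R^n:M(f)(x)>\lambda\}|\lesssim_n\frac{\|f\|_{L^1(\mathbb R^n)}}{\lambda},\qquad\lambda>0,$$

for all $f\in L^1(\mathbb R^n)$.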
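Remark 3 The Hardy-Littlewood maximal operator is not of strong type $(1,1)$. To see this note that for any $f\in L^1(\mathbb R^n)$ we have that

$$M(f)(x)\gtrsim_n\frac{\|f\|_{L^1(\mathbb R^n)}}{|x|^n}\qquad\text{for all sufficiently large }|x|,$$

which shows in particular that $M(f)$ is never integrable whenever $f$ is not identically $0$. Moreover, no strong estimates of type $(p,q)$ are possible whenever $p\ne q$ as can be seen by examining the dilations of $f$ and $M(f)$.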
Exercise 2 Prove the assertions in the previous remark.
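To get a feeling for the decay described in Remark 3, here is a small numerical sketch in dimension one. It is only an illustration under our own assumptions: the test function is the indicator of $[-1,1]$, the supremum is taken over a finite grid of radii (the name `RADII` and the grid are ours), and for $|x|\ge1$ the exact value is $M(f)(x)=1/(1+|x|)$, which decays like $1/|x|$ and is therefore not integrable.

```python
def ball_average(x, r):
    """Average of f = indicator of [-1, 1] over the interval (x - r, x + r)."""
    overlap = max(0.0, min(x + r, 1.0) - max(x - r, -1.0))
    return overlap / (2.0 * r)


def maximal_function(x, radii):
    """Approximate M(f)(x) by a supremum over a finite list of radii."""
    return max(ball_average(x, r) for r in radii)


# Finite grid of radii (an assumption of this sketch): 0.001, 0.002, ..., 200.
RADII = [k / 1000.0 for k in range(1, 200001)]

if __name__ == "__main__":
    for x in (2.0, 5.0, 20.0, 100.0):
        # second column: grid approximation, third column: 1 / (1 + x)
        print(x, maximal_function(x, RADII), 1.0 / (1.0 + x))
```

The supremum is attained at the radius $r=x+1$, the smallest radius for which the interval $(x-r,x+r)$ swallows all of $[-1,1]$; this is the same choice of ball that gives the lower bound in the remark.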
Exercise 3 Let
and let
be a ball such that
for every
. Let
be the ball with the same center and twice the radius of
. Show that
for every
.
Proof of Theorem 2: First of all let us observe that $M$ is of strong type $(\infty,\infty)$. This is just a consequence of the general fact that an average never exceeds a `maximum’. In view of the Marcinkiewicz interpolation theorem it then suffices to show the assertion (ii) of the theorem, namely that $M$ is of weak type $(1,1)$. Furthermore, by homogeneity, it suffices to show that

$$|\{x\in\mathbb R^n:M(f)(x)>1\}|\lesssim_n\|f\|_{L^1(\mathbb R^n)}.$$

We now fix some $f\in L^1(\mathbb R^n)$ and set $E=\{x\in\mathbb R^n:M(f)(x)>1\}$. Let $K$ be any compact subset of $E$; our task is to obtain an estimate of the form $|K|\lesssim_n\|f\|_{L^1(\mathbb R^n)}$, uniformly in $K$.

For every $x\in K$ there is a ball $B_x=B(x,r_x)$ (of some radius $r_x>0$) such that

$$\frac{1}{|B_x|}\int_{B_x}|f(y)|\,dy>1.$$

The family $\{B_x\}_{x\in K}$ clearly covers the compact set $K$ so we can extract a finite subcollection of balls which we denote by $B_1,\dots,B_N$ which still cover $K$. Since $K\subset\bigcup_{j=1}^NB_j$ we get that

$$|K|\le\Big|\bigcup_{j=1}^NB_j\Big|.$$

Observe on the other hand that

$$\Big|\bigcup_{j=1}^NB_j\Big|\le\sum_{j=1}^N|B_j|<\sum_{j=1}^N\int_{B_j}|f(y)|\,dy$$

so if we manage to show that

$$\sum_{j=1}^N\int_{B_j}|f(y)|\,dy\lesssim_n\int_{\mathbb R^n}|f(y)|\,dy$$

we would be done. The main obstruction to such an estimate is that the balls may overlap a lot. On the other hand, if the balls $B_j$ were disjoint (or `almost’ disjoint) then there would be no problem. Although we can’t directly claim that the family $\{B_j\}$ is non-overlapping, the following lemma will allow us to extract a subcollection of balls which has this property, without losing too much of the measure of the union of balls in the collection.
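Lemma 3 (Vitali-type covering lemma) Let $B_1,\dots,B_N$ be a finite collection of balls. Then there exists a subcollection $B_{j_1},\dots,B_{j_M}$ of disjoint balls such that

$$\sum_{m=1}^M|B_{j_m}|=\Big|\bigcup_{m=1}^MB_{j_m}\Big|\ge3^{-n}\Big|\bigcup_{j=1}^NB_j\Big|.$$

Before giving the proof of this covering lemma let us see how we can use it to conclude the proof of Theorem 2. Recall that we have extracted a finite collection of balls $B_1,\dots,B_N$ which cover the set $K$ and which satisfy

$$|B_j|<\int_{B_j}|f(y)|\,dy.$$

Now applying the covering lemma we can extract a subcollection of disjoint balls $B_{j_1},\dots,B_{j_M}$ so that the measure of their union exceeds a multiple of the measure of the union of the original family of balls. Thus, we can write

$$|K|\le\Big|\bigcup_{j=1}^NB_j\Big|\le3^n\sum_{m=1}^M|B_{j_m}|\le3^n\sum_{m=1}^M\int_{B_{j_m}}|f(y)|\,dy\le3^n\int_{\mathbb R^n}|f(y)|\,dy.$$

Observe that this estimate is uniform over all compact sets $K\subset E$ so taking the supremum over such sets and using the inner regularity of the Lebesgue measure we conclude that

$$|E|\le3^n\|f\|_{L^1(\mathbb R^n)},$$

which concludes the proof.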
Proof of the Covering Lemma 3: First of all let us assume that the balls are arranged in decreasing order of size (thus $B_1$ is the largest ball). We will choose the subcollection $\{B_{j_m}\}$ by the greedy algorithm. The first ball we choose in the subcollection is the largest ball, thus $B_{j_1}=B_1$. Now assume we have chosen the balls $B_{j_1},\dots,B_{j_m}$ for some $m\ge1$. We choose the ball $B_{j_{m+1}}$ to be the largest ball which doesn’t intersect any of the balls already chosen. Observe that this amounts to choosing

$$j_{m+1}=\min\{j>j_m:B_j\cap(B_{j_1}\cup\dots\cup B_{j_m})=\emptyset\}.$$

We continue this process until we run out of balls. It is clear that the resulting subcollection consists of disjoint balls. On the other hand, every ball $B_j$ of the original collection is either selected or it intersects one of the selected balls, say, $B_{j_m}$ in the subcollection of greater or equal radius (otherwise the ball $B_j$ would be selected). Then it is not hard to see that

$$B_j\subset3B_{j_m},$$

where $3B_{j_m}$ is the ball with the same center as $B_{j_m}$ and three times its radius. Thus we have that

$$\bigcup_{j=1}^NB_j\subset\bigcup_{m=1}^M3B_{j_m}.$$

Taking the Lebesgue measure of both unions we conclude

$$\Big|\bigcup_{j=1}^NB_j\Big|\le\sum_{m=1}^M|3B_{j_m}|=3^n\sum_{m=1}^M|B_{j_m}|$$

and we are done.
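The greedy selection in the proof above is easy to try out on a computer. The following is a minimal sketch, assuming that a ball is represented as a pair `(center, radius)` with `center` a tuple of coordinates; the function names `vitali_greedy` and `contained_in_triple` are ours and only serve the illustration.

```python
import math


def vitali_greedy(balls):
    """Greedy selection of disjoint balls, largest radius first.

    balls: list of (center, radius), with center a tuple of floats.
    Returns the selected sub-list of pairwise disjoint open balls.
    """
    ordered = sorted(balls, key=lambda b: b[1], reverse=True)
    chosen = []
    for center, radius in ordered:
        # Two open balls are disjoint iff the distance between their
        # centers is at least the sum of their radii.
        if all(math.dist(center, c) >= radius + r for c, r in chosen):
            chosen.append((center, radius))
    return chosen


def contained_in_triple(ball, chosen):
    """True if the ball lies inside 3B for some selected ball B."""
    center, radius = ball
    return any(math.dist(center, c) + radius <= 3 * r for c, r in chosen)


if __name__ == "__main__":
    balls = [((0.0, 0.0), 2.0), ((1.0, 0.0), 1.0), ((5.0, 0.0), 1.5),
             ((6.0, 1.0), 1.0), ((10.0, 10.0), 0.5)]
    chosen = vitali_greedy(balls)
    print(chosen)
    print(all(contained_in_triple(b, chosen) for b in balls))
```

Processing the balls in decreasing order of radius is what guarantees that a rejected ball always meets a selected ball of greater or equal radius, which is exactly the step that produces the factor $3$ in the lemma.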
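Exercise 4 (The maximal function on the class $L\log L$) We saw that if $f$ is a non-trivial integrable function then $M(f)$ is never integrable. Suppose however that $f$ is supported in a finite ball $B$ and that it is a `bit better’ than being integrable, namely it satisfies

$$\int_B|f(x)|\log^+|f(x)|\,dx<+\infty,$$

where $\log^+t=\max(\log t,0)$. We say in this case that $f\in L\log L(B)$. Then we have that $M(f)\in L^1(B)$ and

$$\int_BM(f)(x)\,dx\lesssim_n|B|+\int_B|f(x)|\log^+|f(x)|\,dx.$$

Hints: (a) For $\lambda>0$ show that

$$|\{x\in\mathbb R^n:M(f)(x)>2\lambda\}|\lesssim_n\frac{1}{\lambda}\int_{\{|f|>\lambda\}}|f(x)|\,dx.$$

It will help you to split the function $f$ as $f=f_1+f_2$ with $f_1=f\mathbf 1_{\{|f|>\lambda\}}$ and $f_2=f\mathbf 1_{\{|f|\le\lambda\}}$, and observe that $M(f_2)\le\lambda$.

(b) Show that

$$\int_BM(f)(x)\,dx\le2|B|+2\int_1^\infty|\{x\in B:M(f)(x)>2\lambda\}|\,d\lambda.$$

From this, (a) and Fubini’s theorem you can conclude the proof.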
3. Consequences of the maximal theorem
Our first application of the maximal theorem has to do with the differentiability of the integral of a locally integrable function. Indeed, using Theorem 2 and Proposition 1 we immediately get the following.
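Corollary 4 (Lebesgue differentiation theorem) Let $f$ be a locally integrable function. Then, for almost every $x\in\mathbb R^n$ we have that

$$\lim_{r\to0}\frac{1}{|B(x,r)|}\int_{B(x,r)}f(y)\,dy=f(x).$$

For the proof just observe that $\sup_{r>0}|f*\chi_r|\le M(f)$ and that the claimed convergence property is a local property thus one can confine any locally integrable function in a ball around the point $x$ which turns $f$ into an $L^1$ function. As we have already seen in Notes 3, the previous statement also implies the following:

Corollary 5 Let $f\in L^1_{\rm loc}(\mathbb R^n)$. Then almost every point in $\mathbb R^n$ is a Lebesgue point of $f$, that is, we have that

$$\lim_{r\to0}\frac{1}{|B(x,r)|}\int_{B(x,r)}|f(y)-f(x)|\,dy=0$$

for almost every $x\in\mathbb R^n$.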
Lebesgue’s differentiation theorem generalizes to more general averages. A manifestation of this is already presented in Theorem 15 of Notes 3 which asserts that for `nice’ approximations to the identity $\{\phi_t\}_{t>0}$, the means $f*\phi_t$ converge to $f$ at every Lebesgue point of $f$. Here we will give an alternative proof of this theorem by controlling the maximal operator $f\mapsto\sup_{t>0}|f*\phi_t|$ by the Hardy-Littlewood maximal function.
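Proposition 6 Let $\phi$ be a positive and radially decreasing function with $\int\phi=1$. Then we have that

$$\sup_{t>0}|f*\phi_t(x)|\le M(f)(x).$$

Proof: First suppose that $\phi$ is of the form

$$\phi(x)=\sum_ja_j\frac{\mathbf 1_{B_j}(x)}{|B_j|},$$

where $a_j>0$ and $B_j$ are Euclidean balls centered at $0$ for all $j$. Then we have

$$|f*\phi_t(x)|\le\sum_ja_j\frac{1}{|tB_j|}\int_{x+tB_j}|f(y)|\,dy\le\Big(\sum_ja_j\Big)M(f)(x)=\|\phi\|_{L^1(\mathbb R^n)}M(f)(x).$$

However, any function which is positive and radially decreasing can be approximated monotonically from below by a sequence of simple functions of the form $\sum_ja_j\mathbf 1_{B_j}/|B_j|$ so we are done.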
As an immediate corollary we get the same control for approximations to the identity which are controlled by positive radially decreasing functions.
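Corollary 7 Let $|\phi(x)|\le\psi(x)$ almost everywhere where $\psi$ is positive, radially decreasing and integrable. Then we have that

$$\sup_{t>0}|f*\phi_t(x)|\le\|\psi\|_{L^1(\mathbb R^n)}M(f)(x).$$

In particular $f\mapsto\sup_{t>0}|f*\phi_t|$ is of weak type $(1,1)$ and strong type $(p,p)$ for all $1<p\le\infty$. We conclude that, for $f\in L^p(\mathbb R^n)$ with $1\le p<\infty$,

$$\lim_{t\to0}f*\phi_t(x)=\Big(\int\phi\Big)f(x)$$

for almost every $x\in\mathbb R^n$.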
Remark 4 The qualitative conclusion of the previous corollaries is that maximal averages of $f$ with radially decreasing integrable kernels are controlled by the Hardy-Littlewood maximal function. A typical radially decreasing integrable kernel is the Gaussian kernel

$$W(x)=e^{-\pi|x|^2}.$$

By dilating $W$ by $t>0$ we get

$$W_t(x)=t^{-n}e^{-\pi|x|^2/t^2}.$$

The function $W_t$ can be viewed as a smooth approximation of the indicator function of a ball of radius $t$ (up to constants). Indeed, for $|x|\le t$ say, we have that $W_t(x)\simeq t^{-n}$, while for $|x|>t$ the function $W_t$ decays very fast. Thus the kernel $W_t$ is not so different from $\chi_t$.
3.1. Points of density and the Marcinkiewicz Integral
A direct consequence of Lebesgue’s differentiation theorem is that almost every point of a measurable set is `completely’ surrounded by other points of the set. To make this precise, let us give a definition.
Definition 8 Let $E$ be a measurable set in $\mathbb R^n$ and let $x\in\mathbb R^n$. We say that $x$ is a point of density of the set $E$, if

$$\lim_{r\to0}\frac{|E\cap B(x,r)|}{|B(x,r)|}=1.$$

Of course the limit in the previous definition might not exist in general or not be equal to $1$. Observe however that if the previous limit is equal to $0$ then $x$ is a point of density of the set $E^c$, the complement of $E$. On the other hand, applying Lebesgue’s differentiation theorem to the function $\mathbf 1_E$ which is obviously locally integrable we get

$$\lim_{r\to0}\frac{|E\cap B(x,r)|}{|B(x,r)|}=\mathbf 1_E(x)$$

for almost every $x\in\mathbb R^n$. Thus we immediately get the following

Proposition 9 Let $E\subset\mathbb R^n$ be a measurable set. Then almost every point of $E$ is a point of density of $E$. Likewise, almost every point $x\notin E$ is a point of density of $E^c$.

Thus a point of density is in a measure theoretic sense completely surrounded by other points of $E$. The measure of the set $E$ in the ball $B(x,r)$ is asymptotically equal to the measure of the ball as $r\to0$ whenever $x$ is a point of density.
Another way to describe this notion is the following. Let $F$ be a closed set and define $\delta(x)=\mathrm{dist}(x,F)$. Of course $\delta(x)=0$ if $x\in F$. Now think of $y$ in a neighborhood of zero so that the vector $x+y$ is in the neighborhood of $x$. If $x\in F$ then the distance of the point $x+y$ from $F$ is at most $|y|$ since $x\in F$ and $|(x+y)-x|=|y|$. Thus we have that $\delta(x+y)\le|y|$ whenever $x\in F$. That is, when the point $x+y$ approaches $x\in F$, the distance $\delta(x+y)$, that is the distance of $x+y$ from $F$, approaches zero. In fact the estimate above can be improved.

Proposition 10 Let $F$ be a closed set. Then for almost every $x\in F$, $\delta(x+y)=o(|y|)$ as $|y|\to0$. This is true in particular if $x$ is a point of density of the set $F$.

Exercise 5 Prove Proposition 10 above. The $o(|y|)$ is interpreted as follows: For every $\epsilon>0$ there exists some $\eta>0$ such that $\delta(x+y)\le\epsilon|y|$ whenever $|y|<\eta$.
We will be mostly interested in another instance of this principle that is reflected in the Marcinkiewicz integral. This will also come in handy in our study of oscillatory integrals in the next chapter.
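For a closed set $F$ as before we define the Marcinkiewicz integral associated to $F$, $I(x)=I_F(x)$, as

$$I(x)=\int_{|y|\le1}\frac{\delta(x+y)}{|y|^{n+1}}\,dy.$$

Theorem 11 (i) If $x\notin F$ then $I(x)=+\infty$.

(ii) For almost every $x\in F$ we have $I(x)<+\infty$.

Remark 5 The previous theorem shows that, on average, $\delta(x+y)$ is small enough whenever $x\in F$ to make the integral converge locally. This can be seen as a variation of Proposition 10 though no direct quantitative connection is claimed.

Part (i) is obvious and is left as an exercise. For (ii) it will be enough to show the following: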
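Lemma 12 Let $F$ be a closed set whose complement $F^c$ has finite measure. Then we set

$$I_*(x)=\int_{\mathbb R^n}\frac{\delta(x+y)}{|y|^{n+1}}\,dy.$$

Then $I_*(x)<+\infty$ for almost every $x\in F$. In particular we have

$$\int_FI_*(x)\,dx\lesssim_n|F^c|.$$

Proof: It is enough to show $\int_FI_*(x)\,dx\lesssim_n|F^c|$ since then $I_*(x)$ is finite for almost every $x\in F$. To that end we write

$$\int_FI_*(x)\,dx=\int_F\int_{\mathbb R^n}\frac{\delta(y)}{|x-y|^{n+1}}\,dy\,dx=\int_F\int_{F^c}\frac{\delta(y)}{|x-y|^{n+1}}\,dy\,dx=\int_{F^c}\delta(y)\Big(\int_F\frac{dx}{|x-y|^{n+1}}\Big)dy.$$

Now fix a $y\in F^c$. As $x\in F$ we obviously have that $|x-y|\ge\delta(y)$ thus $F\subset\{x:|x-y|\ge\delta(y)\}$. Since all the quantities under the integral signs are positive the previous estimate implies

$$\int_F\frac{dx}{|x-y|^{n+1}}\le\int_{|x-y|\ge\delta(y)}\frac{dx}{|x-y|^{n+1}}\simeq_n\frac{1}{\delta(y)}$$

whenever $y\in F^c$. Integrating for $y\in F^c$ we get

$$\int_FI_*(x)\,dx\lesssim_n\int_{F^c}dy=|F^c|.$$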
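To get the proof of Theorem 11 we now use the previous lemma as follows. Let $F$ be a closed set and let $B_m$ be the open ball of radius $m$ centered at $0$. Let $F_m=F\cup B_m^c$. Then $F_m$ is closed and $F_m^c\subset B_m$ so that $|F_m^c|<+\infty$. Thus the previous lemma applies to $F_m$ and we get that

$$\int_{|y|\le1}\frac{\delta_m(x+y)}{|y|^{n+1}}\,dy<+\infty$$

for almost every $x\in F_m$, where we denote by $\delta_m$ the distance from the set $F_m$. Now observe that for $x\in F\cap B_{m-2}$ and $|y|\le1$ we have that $\delta_m(x+y)=\delta(x+y)$; indeed $\delta(x+y)\le|y|\le1$ and $\mathrm{dist}(x+y,B_m^c)\ge1$ thus $\delta_m(x+y)=\min\big(\delta(x+y),\mathrm{dist}(x+y,B_m^c)\big)=\delta(x+y)$. We conclude that

$$I(x)<+\infty$$

for almost every $x\in F\cap B_{m-2}$. Since every $x\in F$ eventually belongs to some $B_{m-2}$ for some large $m$ we get the conclusion of the theorem.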
Exercise 6 (i) Show the following strengthened form of Lemma 12: For
and locally integrable then
whenever
is closed and
.
(ii) Use (i) and the maximal theorem to conclude that
for all
.
4. The dyadic maximal function
We now come to a different approach to the maximal function theorem. On the one hand the `dyadic’ approach we will follow here already implies the maximal theorem presented in the previous paragraph. It is however interesting in its own right and it will give us the chance to present a dyadic structure on the Euclidean space which will come in handy in many different cases.
Consider the basic cube $Q_0=[0,1)^n$. A dyadic dilation of this cube is the cube $2^kQ_0=[0,2^k)^n$ where $k\in\mathbb Z$. Now we also consider integer translations of this cube of the form $2^k(m+Q_0)$ for some integer vector $m\in\mathbb Z^n$. We have the following definition:

Definition 13 A dyadic cube of generation $k$ is a cube of the form

$$Q=2^k\big(m+[0,1)^n\big),$$

where $k\in\mathbb Z$ and $m\in\mathbb Z^n$. The family of disjoint cubes $\mathcal D_k=\{2^k(m+[0,1)^n):m\in\mathbb Z^n\}$ defines the $k$-th generation of dyadic cubes.

The dyadic cubes have the following basic properties.

(d1) The dyadic cubes in the generation $\mathcal D_k$ are disjoint and their union is $\mathbb R^n$. Thus any point $x\in\mathbb R^n$ belongs to a unique dyadic cube in the $k$-th generation.

(d2) Two (different) dyadic cubes are either disjoint or one contains the other.

(d3) A dyadic cube in $\mathcal D_k$ consists of exactly $2^n$ dyadic cubes of the generation $\mathcal D_{k-1}$. On the other hand, for any dyadic cube $Q\in\mathcal D_k$ and any $j>k$ there is a unique dyadic cube in the generation $\mathcal D_j$ that contains $Q$.
As a first instance of how things simplify and get sharper in the dyadic world, let us see the analogue of the Vitali covering lemma in the dyadic case.
Lemma 14 (Dyadic Vitali-type covering lemma) Let $Q_1,\dots,Q_N$ be a finite collection of dyadic cubes. There exists a subcollection $Q_{j_1},\dots,Q_{j_M}$ of disjoint dyadic cubes such that

$$\bigcup_{j=1}^NQ_j=\bigcup_{m=1}^MQ_{j_m}.$$

Proof: Let $Q_{j_1},\dots,Q_{j_M}$ be the maximal cubes among $Q_1,\dots,Q_N$, that is, the cubes that are not contained in any other cube of the collection $\{Q_j\}$. Then the cubes $Q_{j_m}$ are disjoint (otherwise they wouldn’t be maximal). Also any cube that is not maximal is contained in the union $\bigcup_mQ_{j_m}$.
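Given a function $f\in L^1_{\rm loc}(\mathbb R^n)$ and $k\in\mathbb Z$ we set

$$E_k(f)(x)=\sum_{Q\in\mathcal D_k}\Big(\frac{1}{|Q|}\int_Qf(y)\,dy\Big)\mathbf 1_Q(x).$$

Observe that given $x\in\mathbb R^n$ there is a unique cube $Q\in\mathcal D_k$ that contains $x$ and then the value of $E_k(f)$ at $x$ equals the average of the function $f$ over the cube $Q$. In fact, $E_k(f)$ is the conditional expectation of $f$ with respect to the $\sigma$-algebra generated by the family $\mathcal D_k$. Observe that for every generation $k$, if $\Omega$ is a union of cubes in $\mathcal D_k$ then

$$\int_\Omega E_k(f)(x)\,dx=\int_\Omega f(x)\,dx.$$

The operator $E_k$ is the discrete dyadic analogue of an approximation to the identity dilated at level $2^k$. A difference however is that the averages here are not `centered’. Indeed, $E_k(f)(x)$ is the average of $f$ with respect to the cube $Q$ whenever $x\in Q$ for some $Q\in\mathcal D_k$. However, $x$ is not necessarily the `center’ of the cube $Q$.

The dyadic maximal function is defined as

$$M_\Delta(f)(x)=\sup_{k\in\mathbb Z}E_k(|f|)(x)=\sup_{Q\ni x,\ Q\ \text{dyadic}}\frac{1}{|Q|}\int_Q|f(y)|\,dy.$$

Thus the supremum is taken over all dyadic cubes that contain $x$ or, equivalently, over all generations of dyadic cubes. We have the analogue of the maximal theorem: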
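Theorem 15 (Dyadic Maximal Theorem) (i) The dyadic maximal function is of weak type $(1,1)$ with weak type norm at most $1$:

$$|\{x\in\mathbb R^n:M_\Delta(f)(x)>\lambda\}|\le\frac{\|f\|_{L^1(\mathbb R^n)}}{\lambda},\qquad\lambda>0,$$

for all $f\in L^1(\mathbb R^n)$. (ii) The dyadic maximal function is of strong type $(p,p)$, for all $1<p\le\infty$; for all $f\in L^p(\mathbb R^n)$ we have

$$\|M_\Delta(f)\|_{L^p(\mathbb R^n)}\lesssim_p\|f\|_{L^p(\mathbb R^n)},$$

where the implied constant depends only on $p$.

We conclude using Proposition 1 that

(iii) For every $f\in L^1_{\rm loc}(\mathbb R^n)$ we have that

$$\lim_{k\to-\infty}E_k(f)(x)=f(x)\qquad\text{for almost every }x\in\mathbb R^n.$$

Exercise 7 Give the proof of Theorem 15 above. Observe that the proof is essentially identical to that of Theorem 2 using the dyadic version of the Vitali covering Lemma instead of the non-dyadic one. For (iii) you need to observe that the statement is true for continuous functions (for example) and use Proposition 1.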
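The dyadic maximal function is simple enough to compute by hand for step functions, which gives a quick sanity check of part (i) of Theorem 15. The sketch below is only an illustration under our own assumptions: we work in dimension one, the function is a step function on $[0,1)$ given by its values on $2^N$ equal cells, and only dyadic intervals inside $[0,1)$ with side-length at least $2^{-N}$ are used (for such a function the remaining dyadic intervals do not increase the supremum on $[0,1)$). The names `dyadic_maximal` and `level_set_measure` are ours.

```python
def dyadic_maximal(values):
    """Dyadic maximal function of a step function on [0, 1).

    values: list of length 2**N with the values of f on the cells
    [i / 2**N, (i + 1) / 2**N). Returns the values of the dyadic
    maximal function on the same cells.
    """
    n = len(values)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    best = [abs(v) for v in values]   # averages over the finest cells
    sums = [abs(v) for v in values]   # integrals of |f| over current blocks
    block = 1
    while block < n:
        block *= 2
        # pass to the parent generation: merge neighbouring blocks in pairs
        sums = [sums[2 * i] + sums[2 * i + 1] for i in range(len(sums) // 2)]
        for i, s in enumerate(sums):
            avg = s / block
            for j in range(i * block, (i + 1) * block):
                if avg > best[j]:
                    best[j] = avg
    return best


def level_set_measure(mf, lam):
    """Lebesgue measure of the set where the step function mf exceeds lam."""
    return sum(1 for v in mf if v > lam) / len(mf)


if __name__ == "__main__":
    values = [0, 0, 8, 0, 0, 0, 0, 0]
    mf = dyadic_maximal(values)
    norm1 = sum(abs(v) for v in values) / len(values)
    for lam in (0.5, 1.5, 3.0, 7.0):
        print(lam, level_set_measure(mf, lam), norm1 / lam)
```

In the printed table the measure of the level set never exceeds $\|f\|_{L^1}/\lambda$, in agreement with the weak type $(1,1)$ bound with constant $1$.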
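Exercise 8 (The maximal function with respect to cubes) Let $M_Q$ denote the maximal function with respect to cubes, that is,

$$M_Q(f)(x)=\sup_{r>0}\big(|f|*(\mathbf 1_Q)_r\big)(x),$$

where $\mathbf 1_Q$ is the indicator function of the cube $Q=[-\tfrac12,\tfrac12]^n$. Show that

$$M_Q(f)(x)\simeq_nM(f)(x),$$

where the implied constants depend only on the dimension $n$.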
Exercise 9 Show the pointwise estimate

$$M_\Delta(f)(x)\lesssim_nM(f)(x),$$

where the implied constant depends only on the dimension $n$. On the other hand, show that the opposite estimate cannot be true. For example when $n=1$ test against the function $f=\mathbf 1_{[0,1]}$. Conclude that the dyadic maximal theorem follows from the non-dyadic one (with a different constant though). Hint: Observe that if $x\in Q$ and $Q$ is a dyadic cube, there exists a ball $B=B(x,r)$ which contains $Q$ and $|B|\simeq_n|Q|$.
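Exercise 10 Consider the non-centered maximal function with respect to cubes, or balls:

$$M'(f)(x)=\sup_{B\ni x}\frac{1}{|B|}\int_B|f(y)|\,dy,$$

where the supremum is taken over all Euclidean balls containing $x$. Likewise

$$M'_Q(f)(x)=\sup_{Q\ni x}\frac{1}{|Q|}\int_Q|f(y)|\,dy,$$

where the supremum is taken over all cubes (with sides parallel to the coordinate axes) that contain $x$. Show that $M$, the maximal function $M_Q$ of Exercise 8, $M'$ and $M'_Q$ are all pointwise equivalent, that is

$$M(f)(x)\simeq_nM_Q(f)(x)\simeq_nM'(f)(x)\simeq_nM'_Q(f)(x).$$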
5. The Calderón-Zygmund decomposition
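Let $(X,\mu)$ be a measure space and $f$ be a measurable function (say) in $L^p(X,\mu)$. For a level $\lambda>0$ we have many times used the decomposition of $f$ at level $\lambda$:

$$f=f\mathbf 1_{\{|f|\le\lambda\}}+f\mathbf 1_{\{|f|>\lambda\}}=:g+b.$$

The function $g$ is the `good’ part of $f$; indeed we have that

$$\|g\|_{L^p(X,\mu)}\le\|f\|_{L^p(X,\mu)},\qquad\|g\|_{L^\infty(X,\mu)}\le\lambda.$$

Thus the good part inherits the $L^p$-integrability of $f$ and furthermore it is bounded. On the other hand the `bad’ part $b$ satisfies

$$\|b\|_{L^p(X,\mu)}\le\|f\|_{L^p(X,\mu)},\qquad\mu(\mathrm{supp}(b))\le\mu(\{|f|>\lambda\})\le\frac{\|f\|_{L^p(X,\mu)}^p}{\lambda^p}.$$

Thus the bad part also inherits the $L^p$-integrability of $f$ but it also has `small’ support.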
In a general measure space one cannot do much more than that in terms of decomposing $f$ into a good part and a bad part. If however there is also a metric structure in the space which is compatible with the measure, one can do a bit better and also get some control on the local oscillation of the bad part $b$. Various forms of this decomposition are usually referred to as Calderón-Zygmund decompositions. We present here the basic example in the dyadic Euclidean setup.
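Proposition 16 (Dyadic Calderón-Zygmund decomposition) Let $f\in L^1(\mathbb R^n)$ and $\lambda>0$. There exists a decomposition of $f$ of the form

$$f=g+\sum_{Q\in\mathcal B}b_Q,$$

where $\mathcal B$ is a collection of disjoint dyadic cubes and the sum is taken over all the cubes $Q\in\mathcal B$. This decomposition satisfies the following properties:

(i) The `good part’ $g$ satisfies the bound

$$\|g\|_{L^\infty(\mathbb R^n)}\le2^n\lambda.$$

(ii) The `bad part’ is $b=\sum_{Q\in\mathcal B}b_Q$; each function $b_Q$ is supported on $Q$ and

$$\int_Qb_Q(x)\,dx=0,\qquad\|b_Q\|_{L^1(\mathbb R^n)}\le2^{n+1}\lambda|Q|.$$

(iii) For each $Q\in\mathcal B$ we have

$$\lambda<\frac{1}{|Q|}\int_Q|f(y)|\,dy\le2^n\lambda.$$

Furthermore we have that

$$\bigcup_{Q\in\mathcal B}Q=\{x\in\mathbb R^n:M_\Delta(f)(x)>\lambda\}.$$

In particular, from the dyadic maximal theorem we have

$$\sum_{Q\in\mathcal B}|Q|\le\frac{\|f\|_{L^1(\mathbb R^n)}}{\lambda}.$$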
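Proof: The proof is very similar to the proof of the dyadic covering lemma. We fix some level $\lambda>0$ and let us call a dyadic cube $Q$ bad if

$$\frac{1}{|Q|}\int_Q|f(y)|\,dy>\lambda.$$

If a dyadic cube is not bad we call it good. A bad cube $Q$ will be called maximal if $Q$ is bad and also there is no bad dyadic cube strictly containing $Q$. Let us denote by $\mathcal B$ the collection of maximal bad cubes. Since the cubes in the collection $\mathcal B$ are dyadic and maximal, they are disjoint. Also, for any bad cube $Q$, let $x\in Q$. We have that

$$M_\Delta(f)(x)\ge\frac{1}{|Q|}\int_Q|f(y)|\,dy>\lambda.$$

Also, since $f\in L^1(\mathbb R^n)$, every bad cube is contained in some maximal bad cube. Indeed, if $Q$ is a bad cube, let $Q^{(j)}$ denote the dyadic cube that contains $Q$ and has $2^j$ times its side-length, $j\ge0$. Then $|Q^{(j)}|\to\infty$ as $j\to\infty$, while $\int_{Q^{(j)}}|f|$ increases to a limit which is at most $\|f\|_{L^1(\mathbb R^n)}$, so monotone convergence implies that

$$\frac{1}{|Q^{(j)}|}\int_{Q^{(j)}}|f(y)|\,dy\to0\qquad\text{as }j\to\infty.$$

It follows that there is a large enough $N\ge0$ such that $Q^{(N)}$ is bad while $Q^{(j)}$ is good for all $j>N$. Thus the dyadic cube $Q^{(N)}\supset Q$ is maximal and bad.

Now let $Q$ be a maximal bad cube and consider the parent of $Q$, $\hat Q$, that is the unique dyadic cube with double the side-length that contains $Q$. Since $Q$ is maximal, $\hat Q$ has to be good so we have

$$\frac{1}{|\hat Q|}\int_{\hat Q}|f(y)|\,dy\le\lambda$$

and thus

$$\lambda<\frac{1}{|Q|}\int_Q|f(y)|\,dy\le\frac{|\hat Q|}{|Q|}\frac{1}{|\hat Q|}\int_{\hat Q}|f(y)|\,dy\le2^n\lambda$$

for all maximal bad cubes $Q\in\mathcal B$. We set

$$b_Q=\Big(f-\frac{1}{|Q|}\int_Qf(y)\,dy\Big)\mathbf 1_Q$$

whenever $Q$ is a maximal bad cube. We also set

$$g=f-\sum_{Q\in\mathcal B}b_Q=f\,\mathbf 1_{\mathbb R^n\setminus\bigcup_{Q\in\mathcal B}Q}+\sum_{Q\in\mathcal B}\Big(\frac{1}{|Q|}\int_Qf(y)\,dy\Big)\mathbf 1_Q.$$

It is not hard to verify all the required properties of $g$ and $b_Q$ except maybe that $\|g\|_{L^\infty(\mathbb R^n)}\le2^n\lambda$. It is easy to see that $|g(x)|\le2^n\lambda$ whenever $x\in Q$ and $Q$ is a maximal bad cube. If $x\notin\bigcup_{Q\in\mathcal B}Q$ and $x\in Q'$ for some dyadic cube $Q'$, then necessarily $Q'$ is good. We thus have that

$$\Big|\frac{1}{|Q'|}\int_{Q'}f(y)\,dy\Big|\le\frac{1}{|Q'|}\int_{Q'}|f(y)|\,dy\le\lambda$$

since $Q'$ is good. Now, by the dyadic maximal theorem, we have that, for almost every such $x$,

$$\frac{1}{|Q'|}\int_{Q'}f(y)\,dy\to f(x)$$

as $|Q'|\to0$ with $x\in Q'$. Since $g(x)=f(x)$ for $x\notin\bigcup_{Q\in\mathcal B}Q$ we conclude that $|g(x)|\le\lambda$ for almost every such $x$ and we are done in this case as well.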
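The selection of the maximal bad cubes in the proof above is a stopping-time procedure and can be sketched in a few lines. The code below is only an illustration under our own assumptions: dimension one, a step function on $[0,1)$ given by its values on $2^N$ equal cells, and a level $\lambda$ for which the top interval $[0,1)$ is good (so that no dyadic interval containing $[0,1)$ is bad either). Intervals are encoded as index ranges `(start, length)` over the cells; the name `calderon_zygmund` is ours.

```python
def calderon_zygmund(values, lam):
    """Maximal bad dyadic intervals and good part of a step function.

    values: list of length 2**N, the values of f on equal cells of [0, 1).
    lam:    the level; the average of |f| over [0, 1) must be <= lam.
    Returns (bad, g): bad is a sorted list of (start, length) index
    ranges, g is the list of values of the good part on the cells.
    """
    n = len(values)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    assert sum(abs(v) for v in values) / n <= lam, "top interval must be good"
    bad = []
    stack = [(0, n)]              # good intervals whose children we inspect
    while stack:
        start, length = stack.pop()
        if length == 1:
            continue              # finest cells: good means |f| <= lam there
        half = length // 2
        for s in (start, start + half):
            avg = sum(abs(v) for v in values[s:s + half]) / half
            if avg > lam:
                bad.append((s, half))     # bad child of a good parent: maximal
            else:
                stack.append((s, half))
    g = list(values)
    for s, m in bad:
        mean = sum(values[s:s + m]) / m
        g[s:s + m] = [mean] * m           # g equals the average of f on bad intervals
    return sorted(bad), g


if __name__ == "__main__":
    values = [0, 0, 8, 0, 0, 0, 0, 0]
    print(calderon_zygmund(values, 1.5))
    print(calderon_zygmund(values, 3.0))
```

Since a bad interval is only recorded when its parent is good, its average lies in $(\lambda,2\lambda]$, which is property (iii) of Proposition 16 with $n=1$; the bad part is recovered as `values[i] - g[i]` on each selected interval and has mean zero there.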
Observe that in the previous decomposition of $f$, the `bad set’, that is the set where $b$ lives, is given in the form

$$\bigcup_{Q\in\mathcal B}Q=\{x\in\mathbb R^n:M_\Delta(f)(x)>\lambda\}.$$

One could prove the Calderón-Zygmund decomposition starting from the set $\{x\in\mathbb R^n:M_\Delta(f)(x)>\lambda\}$ and decomposing it as a union of disjoint dyadic cubes. This sort of decomposition is interesting in its own right. Let us see how this can be done.
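Proposition 17 (Dyadic Whitney decomposition) Let $\Omega\subset\mathbb R^n$ be an open set which is not all of $\mathbb R^n$. Then there exists a decomposition

$$\Omega=\bigcup_{Q\in\mathcal F}Q,$$

where $\mathcal F$ is a collection of disjoint dyadic cubes. For each $Q\in\mathcal F$ we have

$$\mathrm{diam}(Q)\le\mathrm{dist}(Q,\Omega^c)\le4\,\mathrm{diam}(Q).$$

Proof: Let $\mathcal F$ denote the dyadic cubes inside $\Omega$ such that

$$\mathrm{diam}(Q)\le\mathrm{dist}(Q,\Omega^c)\le4\,\mathrm{diam}(Q).$$

Obviously $\bigcup_{Q\in\mathcal F}Q\subset\Omega$ but the opposite inclusion is also true. Indeed, if $x\in\Omega$ note that $x$ is contained in some dyadic cube $Q\subset\Omega$ since $\Omega$ is open. Now for $Q$ a dyadic cube let $\hat Q$ be its `parent’, that is the unique dyadic cube of side twice the side-length of $Q$, containing $Q$. Writing $d=\mathrm{dist}(x,\Omega^c)>0$ and considering successive parents of $Q$ (or passing to smaller dyadic cubes containing $x$ if necessary) there will be a dyadic cube $Q'$ containing $x$ with diameter at least $\frac14d$ and less than $\frac12d$. Thus $\mathrm{dist}(Q',\Omega^c)\ge d-\mathrm{diam}(Q')>\mathrm{diam}(Q')$ and $\mathrm{dist}(Q',\Omega^c)\le d\le4\,\mathrm{diam}(Q')$, so that $Q'\in\mathcal F$. The collection of dyadic cubes $\mathcal F$ is not necessarily disjoint so we only choose the cubes in $\mathcal F$ which are maximal with respect to set inclusion and call this collection again $\mathcal F$. Now maximal and dyadic means disjoint so we are done.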
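The construction in the proof can be carried out explicitly in the simplest case. The following is a minimal sketch, assuming $n=1$ and $\Omega=(0,1)$, and truncating the construction at side-length $2^{-\text{depth}}$ so that the output is finite; exact rational arithmetic is used to avoid rounding issues. The name `whitney_intervals` and the parameter `depth` are ours.

```python
from fractions import Fraction


def whitney_intervals(depth):
    """Maximal dyadic intervals Q = [a, b) inside Omega = (0, 1) with
    len(Q) <= dist(Q, complement of Omega) <= 4 * len(Q),
    restricted to side-lengths 2**-j for 1 <= j <= depth.
    """
    selected = []
    for j in range(1, depth + 1):          # from large intervals to small ones
        length = Fraction(1, 2 ** j)
        for m in range(2 ** j):
            a, b = m * length, (m + 1) * length
            dist = min(a, 1 - b)           # distance of [a, b) to the complement of (0, 1)
            if not (length <= dist <= 4 * length):
                continue
            # keep Q only if it is not inside a larger interval already selected
            if not any(c <= a and b <= d for c, d in selected):
                selected.append((a, b))
    return sorted(selected)


if __name__ == "__main__":
    for a, b in whitney_intervals(4):
        print(a, b)
```

With `depth = 4` the selected intervals tile $[\frac1{16},\frac{15}{16})$: their lengths halve as one approaches either endpoint of $(0,1)$, which is the characteristic picture of a Whitney decomposition.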
Using the Whitney decomposition lemma one can give an alternative proof of the Calderón-Zygmund decomposition by taking $\Omega=\{x\in\mathbb R^n:M(f)(x)>\lambda\}$ and noting that the latter set is open.
As a corollary we get a control of the level sets of the usual (non-dyadic) maximal function by the level sets of the dyadic maximal function.
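Corollary 18 For all $\lambda>0$ we have that

$$|\{x\in\mathbb R^n:M_Q(f)(x)>4^n\lambda\}|\le2^n|\{x\in\mathbb R^n:M_\Delta(f)(x)>\lambda\}|,$$

where $M_Q$ is the (centered) maximal function with respect to cubes of Exercise 8.

Proof: Let $\{Q_j\}_j$ be the collection of dyadic cubes obtained by the Calderón-Zygmund decomposition at level $\lambda$. We have that

$$\bigcup_jQ_j=\{x\in\mathbb R^n:M_\Delta(f)(x)>\lambda\}.$$

We write $2Q_j$ for the cube with the same center as $Q_j$ and twice its side-length. Since $|\bigcup_j2Q_j|\le2^n\sum_j|Q_j|$, it suffices to show that

$$\{x\in\mathbb R^n:M_Q(f)(x)>4^n\lambda\}\subset\bigcup_j2Q_j.\qquad(2)$$

Indeed, let $x\notin\bigcup_j2Q_j$ and $Q$ be any cube centered at $x$. Denoting by $\ell$ the side-length of $Q$, we choose $k\in\mathbb Z$ so that $2^{k-1}\le\ell<2^k$. Then $Q$ intersects $m\le2^n$ cubes in the $k$-th generation $\mathcal D_k$, and let us call them $R_1,\dots,R_m$. Observe that none of these cubes can be contained in any of the $Q_j$ because otherwise we would have that $x\in2Q_j$. Thus the average of $|f|$ on each $R_i$ is at most $\lambda$ so

$$\frac{1}{|Q|}\int_Q|f(y)|\,dy=\frac{1}{|Q|}\sum_{i=1}^m\int_{Q\cap R_i}|f(y)|\,dy\le\sum_{i=1}^m\frac{2^{kn}}{|Q|}\frac{1}{|R_i|}\int_{R_i}|f(y)|\,dy\le2^nm\lambda\le4^n\lambda.$$

This proves the claim (2) and thus the corollary.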
Exercise 11 Using the dyadic maximal theorem only, conclude that the operators $M$, $M_Q$, $M'$ and $M'_Q$ of Exercises 8 and 10 are of weak type $(1,1)$.
5.1. The Fefferman-Stein weighted inequality.
We give a first application of the Calderón-Zygmund decomposition which in some sense is the prototype of a weighted norm inequality. It is a variation of the maximal theorem where the Lebesgue measure is replaced by a measure of the form $w(x)\,dx$ for some non-negative measurable function $w$. It then turns out that the maximal function maps $L^p(M(w))$ to $L^p(w)$ boundedly for all $1<p<\infty$ and that it also satisfies a weak endpoint analogue for $p=1$. In particular we have

Theorem 19 (Fefferman-Stein inequality) Let $w$ be a non-negative locally integrable function (a `weight’).

(i) We have that

$$\int_{\mathbb R^n}M(f)(x)^p\,w(x)\,dx\lesssim_{n,p}\int_{\mathbb R^n}|f(x)|^p\,M(w)(x)\,dx$$

for all measurable $f$ and all $p$ with $1<p<\infty$.

(ii) In the endpoint case $p=1$ we get the weak analogue

$$\int_{\{x:M(f)(x)>\lambda\}}w(x)\,dx\lesssim_n\frac{1}{\lambda}\int_{\mathbb R^n}|f(x)|\,M(w)(x)\,dx$$

for all $\lambda>0$.
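Proof: We will show that $\|M(f)\|_{L^\infty(w)}\le\|f\|_{L^\infty(M(w))}$ and that the weak $(1,1)$ inequality in (ii) holds. Then the Marcinkiewicz interpolation theorem will give (i) as well.

The bound $\|M(f)\|_{L^\infty(w)}\le\|f\|_{L^\infty(M(w))}$ is trivial and is left as an exercise. We turn our attention to the weak $(1,1)$-bound. Let $\{Q_j\}_j$ be the collection of the dyadic cubes obtained from the Calderón-Zygmund decomposition of $f$ at level $\lambda$. By the proof of Corollary 18 we have that

$$\{x\in\mathbb R^n:M_Q(f)(x)>4^n\lambda\}\subset\bigcup_j2Q_j,$$

where $2Q_j$ is the cube with the same center as $Q_j$ and twice its side-length. We have

$$\int_{\{M_Q(f)>4^n\lambda\}}w(y)\,dy\le\sum_j\int_{2Q_j}w(y)\,dy=2^n\sum_j|Q_j|\frac{1}{|2Q_j|}\int_{2Q_j}w(y)\,dy.$$

Again, from the Calderón-Zygmund decomposition (at level $\lambda$) we have that

$$|Q_j|<\frac{1}{\lambda}\int_{Q_j}|f(x)|\,dx$$

for all $Q_j$ of the decomposition. Combining the last two estimates we can write

$$\int_{\{M_Q(f)>4^n\lambda\}}w(y)\,dy\le\frac{2^n}{\lambda}\int_{\mathbb R^n}|f(x)|\sum_j\mathbf 1_{Q_j}(x)\Big(\frac{1}{|2Q_j|}\int_{2Q_j}w(y)\,dy\Big)dx.$$

For fixed $x$ the term $\mathbf 1_{Q_j}(x)$ is non-zero if and only if $x\in Q_j$, which happens for at most one $j$ since the cubes are disjoint, and then $x\in2Q_j$. Thus the previous estimate implies

$$\int_{\{M_Q(f)>4^n\lambda\}}w(y)\,dy\le\frac{2^n}{\lambda}\int_{\mathbb R^n}|f(x)|\,M'_Q(w)(x)\,dx,$$

where $M'_Q$ is the non-centered maximal function associated to cubes. See Exercise 10. Since $M'_Q(w)\lesssim_nM(w)$ and $M(f)\lesssim_nM_Q(f)$ this concludes the proof.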
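Exercise 12 (Hedberg’s inequality and Hardy-Littlewood-Sobolev theorem) Let $0<\alpha<n$, $1\le p<\frac n\alpha$ and $\frac1q=\frac1p-\frac\alpha n$.

(i) Show Hedberg’s inequality: If $f\in L^p(\mathbb R^n)$ then

$$\Big|\int_{\mathbb R^n}\frac{f(y)}{|x-y|^{n-\alpha}}\,dy\Big|\lesssim_{n,\alpha,p}\|f\|_{L^p(\mathbb R^n)}^{\frac{\alpha p}{n}}M(f)(x)^{1-\frac{\alpha p}{n}}.$$

(ii) Use the Hardy-Littlewood maximal theorem and (i) to conclude the Hardy-Littlewood-Sobolev theorem: For every $1<p<\frac n\alpha$ we have that

$$\Big\|\int_{\mathbb R^n}\frac{f(y)}{|\cdot-y|^{n-\alpha}}\,dy\Big\|_{L^q(\mathbb R^n)}\lesssim_{n,\alpha,p}\|f\|_{L^p(\mathbb R^n)}.$$

Hint: In order to show (i) split the integral

$$\int_{\mathbb R^n}\frac{f(y)}{|x-y|^{n-\alpha}}\,dy=\int_{|x-y|\le R}\frac{f(y)}{|x-y|^{n-\alpha}}\,dy+\int_{|x-y|>R}\frac{f(y)}{|x-y|^{n-\alpha}}\,dy=:I_1+I_2,$$

where $R>0$ is a parameter to be chosen later on. For $I_1$ observe that

$$I_1=(f*K_R)(x),\qquad K_R(y)=|y|^{\alpha-n}\mathbf 1_{\{|y|\le R\}}.$$

Observe that $K_R$ is decreasing, radial, non-negative and integrable (since $\alpha>0$). Use Proposition 6 and the calculation in its proof to show the bound

$$|I_1|\lesssim_{n,\alpha}R^\alpha M(f)(x).$$

For $I_2$ use Hölder’s inequality to show

$$|I_2|\lesssim_{n,\alpha,p}R^{\alpha-\frac np}\|f\|_{L^p(\mathbb R^n)}.$$

Choose the parameter $R$ to minimize the sum $R^\alpha M(f)(x)+R^{\alpha-\frac np}\|f\|_{L^p(\mathbb R^n)}$. Part (ii) is a trivial consequence of (i).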
[Update 4 Apr 2011: Section 3.1 concerning the Marcinkiewicz integral added; numbering changed.
Update 9th May 2011: Typo in the hint of Exercise 1 corrected.]