1. Averages and maximal operators
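This week we will be discussing the Hardy-Littlewood maximal function and some closely related maximal type operators. In order to have something concrete let us first of all define the averages of a locally integrable function $f$ around the point $x\in\mathbb R^n$:
$$A_rf(x):=\frac{1}{|B(x,r)|}\int_{B(x,r)}f(y)\,dy,\qquad r>0,$$
where $B(x,r)$ is the Euclidean ball with center $x$ and radius $r$ and $|B(x,r)|$ denotes its Lebesgue measure. Note that since Lebesgue measure is translation invariant we have
$$|B(x,r)|=|B(0,r)|=\Omega_nr^n,$$
where $\Omega_n$ denotes the Lebesgue measure (or volume in this case) of the $n$-dimensional unit ball $B(0,1)$. Denoting by $\chi$ the indicator function of the normalized unit ball,
$$\chi:=\mathbf 1_{B(0,\rho_n)},\qquad\rho_n:=\Omega_n^{-1/n},\qquad\text{so that }|B(0,\rho_n)|=1,$$
and noting that the balls centered at zero are $0$-symmetric, we can write
$$A_{\rho_nr}f(x)=\frac{1}{r^n}\int_{\mathbb R^n}\chi\Big(\frac{x-y}{r}\Big)f(y)\,dy=(f*\chi_r)(x),$$
and of course $\{\chi_r\}_{r>0}$ is an approximation to the identity since $\int\chi=1$ and the $\chi_r$ are just the dilations of the function $\chi$:
$$\chi_r(x):=r^{-n}\chi(x/r).$$
Remembering the discussion that followed the definition of the convolution in Notes 2, the convolution of a locally integrable function $f$ with the dilations of an $L^1$ function $\phi$ was viewed as an averaging process. We see now that when $\phi=\chi$ this is exact, that is, $(f*\chi_r)(x)$ is the average of $f$ with respect to a ball around $x$ of radius $\simeq r$ where the implied constant only depends on the dimension. A similar conclusion follows if we start with any set $K\subset\mathbb R^n$ that is, say, $0$-symmetric and convex and normalized to volume $1$. We then have that
$$(f*(\mathbf 1_K)_r)(x)=\frac{1}{|rK|}\int_{x+rK}f(y)\,dy,$$
that is, $f*(\mathbf 1_K)_r$ are the averages of $f$ with respect to the dilations of the fixed convex body $K$ at every point $x$. Here we denote by $rK$ the dilations of $K$:
$$rK:=\{ry:y\in K\},\qquad r>0.$$
It is an easy exercise to show that all these averages are uniformly bounded in size. For all $1\le p\le\infty$ we have
$$\sup_{r>0}\|f*(\mathbf 1_K)_r\|_{L^p(\mathbb R^n)}\le\|f\|_{L^p(\mathbb R^n)}.$$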
One of course could consider more general sets instead of convex sets which are $0$-symmetric, and in fact this leads to one of the most interesting families of problems in harmonic analysis. This however falls outside the scope of this course and we will mostly focus on the case of the normalized unit ball which in some sense is the prototypical example.
The Hardy-Littlewood maximal operator (with respect to Euclidean balls) is defined as
$$Mf(x):=\sup_{r>0}\frac{1}{|B(x,r)|}\int_{B(x,r)}|f(y)|\,dy,\qquad x\in\mathbb R^n.$$
Observe that this is a sublinear operator that is well defined at least when $f$ is locally integrable. Although maximal operators are interesting in their own right, there are some very specific applications we have in mind. The first has to do with pointwise convergence of averages of a function and is a consequence of the following simple proposition.
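Proposition 1 Let $\{T_t\}_{t>0}$ be a family of linear operators on $L^p(\mathbb R^n)$ and define the maximal operator $T^*f(x):=\sup_{t>0}|T_tf(x)|$. If $T^*$ is of weak type $(p,q)$, with $1\le p,q<\infty$, then for any $t_0\in[0,\infty]$ the set
$$\{f\in L^p(\mathbb R^n):\lim_{t\to t_0}T_tf(x)=f(x)\ \text{for almost every }x\}$$
is closed in $L^p(\mathbb R^n)$.
Proof: In order to show that the set
$$\mathcal C:=\{f\in L^p(\mathbb R^n):\lim_{t\to t_0}T_tf(x)=f(x)\ \text{for almost every }x\}$$
is closed, consider a sequence of functions $f_k\in\mathcal C$ with $f_k\to f$ in $L^p$. We need to show that $f\in\mathcal C$. To see this observe that, since $T_tf_k\to f_k$ almost everywhere, for almost every $x$ we have
$$\limsup_{t\to t_0}|T_tf(x)-f(x)|\le\limsup_{t\to t_0}|T_t(f-f_k)(x)|+|f_k(x)-f(x)|\le T^*(f-f_k)(x)+|f(x)-f_k(x)|.$$
Thus for any $\lambda>0$ we can write
$$|\{x:\limsup_{t\to t_0}|T_tf(x)-f(x)|>\lambda\}|\le|\{T^*(f-f_k)>\lambda/2\}|+|\{|f-f_k|>\lambda/2\}|\lesssim\Big(\frac{\|f-f_k\|_{L^p}}{\lambda}\Big)^q+\Big(\frac{\|f-f_k\|_{L^p}}{\lambda}\Big)^p.$$
Since the right hand side tends to $0$ as $k\to\infty$ and the left hand side does not depend on $k$ we conclude that for every $\lambda>0$
$$|\{x:\limsup_{t\to t_0}|T_tf(x)-f(x)|>\lambda\}|=0.$$
Now we have that
$$\{x:\limsup_{t\to t_0}|T_tf(x)-f(x)|>0\}=\bigcup_{m\ge1}\{x:\limsup_{t\to t_0}|T_tf(x)-f(x)|>1/m\},$$
which is a countable union of sets of measure zero. Thus $\lim_{t\to t_0}T_tf(x)=f(x)$ for almost every $x$ so that $f\in\mathcal C$.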
Remark 1 We have indexed the family $\{T_t\}$ in $t>0$ for the sake of definiteness but one can of course consider more general index sets and the previous proposition remains valid. Whenever the index set is uncountable some attention should be given to ensuring the measurability of $T^*f=\sup_t|T_tf|$.
Remark 2 To get a clearer picture of what this proposition says consider the family of operators
$$T_tf:=f*\phi_t,\qquad\phi_t(x):=t^{-n}\phi(x/t),\quad t>0,$$
for some $\phi\in L^1(\mathbb R^n)$ with integral $1$. As we have seen already many times, these averages of $f$ converge to $f$ in many different senses for different classes of functions $f$. In particular if $f\in C_c(\mathbb R^n)$ then $f*\phi_t$ converges to $f$ even uniformly as $t\to0$. Thus we have
$$\lim_{t\to0}(f*\phi_t)(x)=f(x)\quad\text{for every }x\in\mathbb R^n\text{ and }f\in C_c(\mathbb R^n).$$
Since $C_c(\mathbb R^n)$ is dense in $L^p(\mathbb R^n)$, $1\le p<\infty$, Proposition 1 implies that if $f\mapsto\sup_{t>0}|f*\phi_t|$ is of weak type $(p,q)$ then for every $f\in L^p(\mathbb R^n)$
$$\lim_{t\to0}(f*\phi_t)(x)=f(x)$$
for almost every $x\in\mathbb R^n$. Thus in order to show that approximations to the identity converge to the function almost everywhere it is enough to show that the corresponding maximal operator is of weak type $(p,q)$. In what follows we will show that the Hardy-Littlewood maximal operator is of weak type $(1,1)$ and this already implies the corresponding statement for a wide class of `nice’ approximations to the identity.
To avoid confusion, remember that in Theorem 15 of Notes 3 we have already exhibited that
$$\lim_{t\to0}(f*\phi_t)(x)=f(x)$$
for every Lebesgue point $x$ of $f$. However this is only interesting if we already know that $f$ has `many’ Lebesgue points (in particular almost every point in $\mathbb R^n$). In Theorem 15 of Notes 3 we took for granted that the integral of a locally integrable function is almost everywhere differentiable and this in turn implied that almost every point in $\mathbb R^n$ is a Lebesgue point of $f$. In this part of the course we will fill in this gap by showing that the integral of a locally integrable function is almost everywhere differentiable.
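Exercise 1 Let $T^*f=\sup_{t>0}|T_tf|$ be of weak type $(p,q)$. Show that for every $t_0$ the set
$$\{f\in L^p(\mathbb R^n):\lim_{t\to t_0}T_tf(x)\ \text{exists for almost every }x\}$$
is closed in $L^p(\mathbb R^n)$.
Hint: The proof is very similar to that of Proposition 1. Observe that (for real-valued functions) it suffices to show that
$$|\{x:\limsup_{t\to t_0}T_tf(x)-\liminf_{t\to t_0}T_tf(x)>\lambda\}|=0$$
for every $\lambda>0$.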
2. The Hardy-Littlewood maximal theorem
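We focus our attention on the Hardy-Littlewood maximal operator; for $f\in L^1_{\rm loc}(\mathbb R^n)$,
$$Mf(x)=\sup_{r>0}\frac{1}{|B(x,r)|}\int_{B(x,r)}|f(y)|\,dy.$$
The discussion in the previous section suggests that one should try to prove weak type bounds for the operator $M$. In fact we will prove the following theorem which summarizes the boundedness properties of $M$.
Theorem 2 (Hardy-Littlewood maximal theorem) (i) The Hardy-Littlewood maximal operator is of weak type $(1,1)$:
$$|\{x\in\mathbb R^n:Mf(x)>\lambda\}|\lesssim_n\frac{\|f\|_{L^1(\mathbb R^n)}}{\lambda}$$
for all $f\in L^1(\mathbb R^n)$ and $\lambda>0$.
(ii) The Hardy-Littlewood maximal operator is of strong type $(p,p)$ for $1<p\le\infty$:
$$\|Mf\|_{L^p(\mathbb R^n)}\lesssim_{n,p}\|f\|_{L^p(\mathbb R^n)}$$
for all $f\in L^p(\mathbb R^n)$.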
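Remark 3 The Hardy-Littlewood maximal operator is not of strong type $(1,1)$. To see this note that for any $f\in L^1(\mathbb R^n)$ which is not identically $0$ we have that
$$Mf(x)\gtrsim_{n,f}\frac{1}{|x|^n}\quad\text{for all sufficiently large }|x|,$$
which shows in particular that $Mf$ is never integrable whenever $f$ is not identically $0$. Moreover, no strong estimates of type $(p,q)$ are possible whenever $p\neq q$ as can be seen by examining the dilations of $f$ and $Mf$.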
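As a hedged illustration of the first claim in the simplest case (a worked special case only, assuming dimension $n=1$, the function $f=\mathbf 1_{[-1,1]}$ and a point $x>1$; it is not the general argument), the centered averages can be computed by hand:

```latex
% Assumptions: n = 1, f = 1_{[-1,1]}, x > 1, averages over the intervals (x-r, x+r).
\[
\frac{1}{2r}\,\bigl|[-1,1]\cap(x-r,x+r)\bigr| =
\begin{cases}
0, & 0<r\le x-1,\\[4pt]
\dfrac{r-(x-1)}{2r}=\dfrac12-\dfrac{x-1}{2r}, & x-1<r\le x+1,\\[8pt]
\dfrac{1}{r}, & r>x+1,
\end{cases}
\]
% The middle branch increases in r and the last one decreases, so the supremum is at r = x+1:
\[
Mf(x)=\frac{1}{x+1}\quad (x>1),
\qquad
\int_1^\infty \frac{dx}{x+1}=\infty
\quad\text{although}\quad \|f\|_{L^1(\mathbb R)}=2 .
\]
```

In this example the decay $Mf(x)\simeq|x|^{-1}$ is exactly the borderline non-integrable rate in dimension one.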
Exercise 2 Prove the assertions in the previous remark.
Exercise 3 Let and let be a ball such that for every . Let be the ball with the same center and twice the radius of . Show that for every .
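Proof of Theorem 2: First of all let us observe that $M$ is of strong type $(\infty,\infty)$. This is just a consequence of the general fact that an average never exceeds a `maximum’. In view of the Marcinkiewicz interpolation theorem it then suffices to show the assertion (i) of the theorem, namely that $M$ is of weak type $(1,1)$. Furthermore, by homogeneity, it suffices to show that
$$|\{x\in\mathbb R^n:Mf(x)>1\}|\lesssim_n\|f\|_{L^1(\mathbb R^n)}.$$
We now fix some $f\in L^1(\mathbb R^n)$ and set
$$E:=\{x\in\mathbb R^n:Mf(x)>1\}$$
and let $K$ be any compact subset of $E$; our task is to obtain an estimate of the form $|K|\lesssim_n\|f\|_{L^1}$, uniformly in $K$.
For every $x\in K$ there is a ball $B_x$ centered at $x$ (of some radius) such that
$$\frac{1}{|B_x|}\int_{B_x}|f(y)|\,dy>1.$$
The family $\{B_x\}_{x\in K}$ clearly covers the compact set $K$ so we can extract a finite subcollection of balls, which we denote by $B_1,\dots,B_N$, which still covers $K$. Since $K\subset\bigcup_jB_j$ we get that
$$|K|\le\Big|\bigcup_{j=1}^NB_j\Big|\le\sum_{j=1}^N|B_j|.$$
Observe on the other hand that
$$\sum_{j=1}^N|B_j|<\sum_{j=1}^N\int_{B_j}|f(y)|\,dy,$$
so if we manage to show that
$$\sum_{j=1}^N\int_{B_j}|f(y)|\,dy\lesssim_n\int_{\mathbb R^n}|f(y)|\,dy$$
we would be done. The main obstruction to such an estimate is that the balls $B_j$ may overlap a lot. On the other hand, if the balls were disjoint (or `almost’ disjoint) then there would be no problem. Although we can’t directly claim that the family $\{B_j\}$ is non-overlapping, the following lemma will allow us to extract a subcollection of balls which has this property, without losing too much of the measure of the union of balls in the collection.
Lemma 3 (Vitali covering lemma) Let $B_1,\dots,B_N$ be a finite collection of balls in $\mathbb R^n$. Then there exists a subcollection $B_{j_1},\dots,B_{j_M}$ of disjoint balls such that
$$\sum_{m=1}^M|B_{j_m}|=\Big|\bigcup_{m=1}^MB_{j_m}\Big|\ge3^{-n}\Big|\bigcup_{j=1}^NB_j\Big|.$$
Before giving the proof of this covering lemma let us see how we can use it to conclude the proof of Theorem 2. Recall that we have extracted a finite collection of balls $\{B_j\}_{j=1}^N$ which cover the set $K$ and which satisfy
$$|B_j|<\int_{B_j}|f(y)|\,dy,\qquad j=1,\dots,N.$$
Now applying the covering lemma we can extract a subcollection of disjoint balls $\{B_{j_m}\}_{m=1}^M$ so that the measure of their union exceeds a multiple of the measure of the union of the original family of balls. Thus, we can write
$$|K|\le\Big|\bigcup_{j=1}^NB_j\Big|\le3^n\sum_{m=1}^M|B_{j_m}|\le3^n\sum_{m=1}^M\int_{B_{j_m}}|f(y)|\,dy=3^n\int_{\bigcup_mB_{j_m}}|f(y)|\,dy\le3^n\|f\|_{L^1(\mathbb R^n)}.$$
Observe that this estimate is uniform over all compact sets $K\subset E$ so taking the supremum over such sets and using the inner regularity of the Lebesgue measure we conclude that
$$|E|=|\{x\in\mathbb R^n:Mf(x)>1\}|\le3^n\|f\|_{L^1(\mathbb R^n)},$$
which concludes the proof.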
Proof of the Covering Lemma 3: First of all let us assume that the balls $B_1,\dots,B_N$ are arranged in decreasing order of size (thus $B_1$ is the largest ball). We will choose the subcollection by the greedy algorithm. The first ball we choose in the subcollection is the largest ball, thus $B_{j_1}:=B_1$. Now assume we have chosen the balls $B_{j_1},\dots,B_{j_m}$ for some $m\ge1$. We choose the ball $B_{j_{m+1}}$ to be the largest ball which doesn’t intersect any of the balls already chosen. Observe that this amounts to choosing
$$j_{m+1}:=\min\{j>j_m:B_j\cap B_{j_i}=\emptyset\ \text{for all }1\le i\le m\}.$$
We continue this process until we run out of balls. It is clear that the resulting subcollection consists of disjoint balls. On the other hand, every ball $B_j$ of the original collection is either selected or it intersects one of the selected balls, say $B_{j_m}$, in the subcollection of greater or equal radius (otherwise the ball $B_j$ would be selected). Then it is not hard to see that
$$B_j\subset3B_{j_m},$$
where $3B_{j_m}$ is the ball with the same center as $B_{j_m}$ and three times its radius. Thus we have that
$$\bigcup_{j=1}^NB_j\subset\bigcup_{m=1}^M3B_{j_m}.$$
Taking the Lebesgue measure of both unions we conclude
$$\Big|\bigcup_{j=1}^NB_j\Big|\le\sum_{m=1}^M|3B_{j_m}|=3^n\sum_{m=1}^M|B_{j_m}|$$
and we are done.
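The greedy selection in the proof above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: balls are open and encoded as hypothetical `(center, radius)` pairs with `center` a tuple of floats, and the helper names `balls_intersect`, `greedy_vitali` and `covered_by_triple` are introduced here for illustration.

```python
import math

def balls_intersect(b1, b2):
    """Open balls (center, radius) intersect iff the centers are closer than r1 + r2."""
    (c1, r1), (c2, r2) = b1, b2
    return math.dist(c1, c2) < r1 + r2

def greedy_vitali(balls):
    """Return a pairwise disjoint subcollection chosen by the greedy rule."""
    selected = []
    # Scan from the largest radius down; keep a ball only if it misses every kept ball.
    for ball in sorted(balls, key=lambda b: b[1], reverse=True):
        if not any(balls_intersect(ball, kept) for kept in selected):
            selected.append(ball)
    return selected

def covered_by_triple(ball, selected):
    """True if `ball` lies inside the 3x dilate of some selected ball."""
    c, r = ball
    return any(math.dist(c, cs) + r <= 3 * rs for cs, rs in selected)

# Illustrative data (made up for this sketch): five balls in the plane.
balls = [((0.0, 0.0), 1.0), ((1.5, 0.0), 1.0), ((3.5, 0.0), 0.5),
         ((0.0, 2.5), 2.0), ((10.0, 10.0), 0.3)]
chosen = greedy_vitali(balls)
print(chosen)
print(all(covered_by_triple(b, chosen) for b in balls))
```

Sorting once by radius and scanning is equivalent to repeatedly picking the largest ball that misses all previously chosen ones, which is the rule used in the proof.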
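Exercise 4 (The maximal function on the class $L\log L$) We saw that if $f$ is a non-trivial integrable function then $Mf$ is never integrable. Suppose however that $f$ is supported in a finite ball $B$ and that it is a `bit better’ than being integrable, namely it satisfies
$$\int_B|f(x)|\log^+|f(x)|\,dx<\infty,$$
where $\log^+t:=\max(\log t,0)$. We say in this case that $f\in L\log L(B)$. Then we have that $Mf\in L^1(B)$ and
$$\int_BMf(x)\,dx\le2|B|+C_n\int_B|f(x)|\log^+|f(x)|\,dx.$$
Hints: (a) For $\lambda>0$ show that
$$|\{x:Mf(x)>2\lambda\}|\lesssim_n\frac{1}{\lambda}\int_{\{|f|>\lambda\}}|f(x)|\,dx.$$
It will help you to split the function as
$$f=f\mathbf 1_{\{|f|>\lambda\}}+f\mathbf 1_{\{|f|\le\lambda\}}=:f_1+f_2$$
and observe that $Mf_2\le\lambda$.
(b) Show that
$$\int_BMf(x)\,dx=2\int_0^\infty|\{x\in B:Mf(x)>2\lambda\}|\,d\lambda\le2|B|+2\int_1^\infty|\{x:Mf(x)>2\lambda\}|\,d\lambda.$$
From this, (a) and Fubini’s theorem you can conclude the proof.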
3. Consequences of the maximal theorem
Our first application of the maximal theorem has to do with the differentiability of the integral of a locally integrable function. Indeed, using Theorem 2 and Proposition 1 we immediately get the following.
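Corollary 4 (Lebesgue differentiation theorem) Let $f$ be a locally integrable function. Then, for almost every $x\in\mathbb R^n$ we have that
$$\lim_{r\to0}\frac{1}{|B(x,r)|}\int_{B(x,r)}f(y)\,dy=f(x).$$
For the proof just observe that the averages over $B(x,r)$ are bounded in absolute value by $Mf(x)$ and that the claimed convergence property is a local property; thus one can confine any locally integrable function $f$ to a ball around the point $x$, which turns $f$ into an $L^1$ function. As we have already seen in Notes 3, the previous statement also implies the following:
Corollary 5 Let $f\in L^1_{\rm loc}(\mathbb R^n)$. Then almost every point in $\mathbb R^n$ is a Lebesgue point of $f$, that is, we have that
$$\lim_{r\to0}\frac{1}{|B(x,r)|}\int_{B(x,r)}|f(y)-f(x)|\,dy=0$$
for almost every $x\in\mathbb R^n$.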
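As a quick sanity check of the Lebesgue differentiation theorem, here are two one-dimensional computations (illustrative examples chosen here, not taken from the notes): a smooth function, where the averages converge at every point, and a jump function, where convergence fails at the single point of discontinuity.

```latex
% Example 1 (assumption: n = 1, f(y) = y^2): the averages converge at every x.
\[
\frac{1}{2r}\int_{x-r}^{x+r} y^2\,dy
=\frac{(x+r)^3-(x-r)^3}{6r}
= x^2+\frac{r^2}{3}\;\xrightarrow[r\to 0]{}\; x^2 .
\]
% Example 2 (assumption: n = 1, f = 1_{[0,\infty)}): at x = 0 the averages equal 1/2 for all r,
% so they do not converge to f(0) = 1; the exceptional set {0} has measure zero.
\[
\frac{1}{2r}\int_{-r}^{r}\mathbf 1_{[0,\infty)}(y)\,dy=\frac12\neq 1=f(0)\qquad\text{for every } r>0 .
\]
```

The second example shows that the qualifier "almost every" cannot be dropped for a fixed representative of $f$.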
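Lebesgue’s differentiation theorem generalizes to more general averages. A manifestation of this is already presented in Theorem 15 of Notes 3 which asserts that for `nice’ approximations to the identity $\{\phi_t\}_{t>0}$, the means $(f*\phi_t)(x)$ converge to $f(x)$ at every Lebesgue point $x$ of $f$. Here we will give an alternative proof of this theorem by controlling the maximal operator $f\mapsto\sup_{t>0}|f*\phi_t|$ by the Hardy-Littlewood maximal function.
Proposition 6 Let $\phi$ be a positive, radially decreasing and integrable function and $\phi_t(x):=t^{-n}\phi(x/t)$. Then
$$\sup_{t>0}|(f*\phi_t)(x)|\le\|\phi\|_{L^1}Mf(x).$$
Proof: First suppose that $\phi$ is of the form $\phi=\sum_ja_j\mathbf 1_{B_j}$ where $a_j>0$ and $B_j$ are Euclidean balls centered at $0$ for all $j$. Then we have
$$|(f*\phi_t)(x)|\le\sum_ja_j|B_j|\,\frac{1}{|tB_j|}\int_{x+tB_j}|f(y)|\,dy\le\Big(\sum_ja_j|B_j|\Big)Mf(x)=\|\phi\|_{L^1}Mf(x).$$
However, any function which is positive and radially decreasing can be approximated monotonically from below by a sequence of simple functions of the form $\sum_ja_j\mathbf 1_{B_j}$ so, by monotone convergence, we are done.
As an immediate corollary we get the same control for approximations to the identity which are controlled by positive radially decreasing functions.
Corollary 7 Let $|\phi(x)|\le\psi(x)$ almost everywhere where $\psi$ is positive, radially decreasing and integrable. Then we have that
$$\sup_{t>0}|(f*\phi_t)(x)|\le\|\psi\|_{L^1}Mf(x).$$
In particular $f\mapsto\sup_{t>0}|f*\phi_t|$ is of weak type $(1,1)$ and strong type $(p,p)$ for all $1<p\le\infty$. We conclude that, for $f\in L^p(\mathbb R^n)$ with $1\le p<\infty$,
$$\lim_{t\to0}(f*\phi_t)(x)=f(x)\int_{\mathbb R^n}\phi(y)\,dy$$
for almost every $x\in\mathbb R^n$.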
Remark 4 The qualitative conclusion of the previous corollaries is that maximal averages of with radially decreasing integrable kernels are controlled by the Hardy-Littlewood maximal function. A typical radially decreasing integrable kernel is the Gaussian kernel
By dilating by we get
The function can be viewed as smooth approximation of the indicator function of a ball of radius (up to constants). Indeed, for say, we have that , while for the function decays very fast. Thus the kernel is not so different from .
3.1. Points of density and the Marcinkiewicz Integral
A direct consequence of Lebesgue’s differentiation theorem is that almost every point of a measurable set is `completely’ surrounded by other points of the set. To make this precise, let us give a definition.
Definition 8 Let $E$ be a measurable set in $\mathbb R^n$ and let $x\in\mathbb R^n$. We say that $x$ is a point of density of the set $E$, if
$$\lim_{r\to0}\frac{|E\cap B(x,r)|}{|B(x,r)|}=1.$$
Of course the limit in the previous definition might not exist in general or not be equal to $1$. Observe however that if the previous limit is equal to $0$ then $x$ is a point of density of the set $E^c$, the complement of $E$. On the other hand, applying Lebesgue’s differentiation theorem to the function $\mathbf 1_E$ which is obviously locally integrable we get
$$\lim_{r\to0}\frac{|E\cap B(x,r)|}{|B(x,r)|}=\mathbf 1_E(x)$$
for almost every $x\in\mathbb R^n$. Thus we immediately get the following
Proposition 9 Let $E$ be a measurable set. Then almost every point of $E$ is a point of density of $E$. Likewise, almost every point $x\notin E$ is a point of density of $E^c$.
Thus a point of density is in a measure theoretic sense completely surrounded by other points of $E$: when $x$ is a point of density, the measure of the set $E$ in the ball $B(x,r)$ is asymptotically equal to the measure of the ball as $r\to0$.
Another way to describe this notion is the following. Let be a closed set and define . Of course if . Now think of in a neighborhood of zero so that the vector is in the neighborhood of . If then the distance of the point from is at most since and . Thus we have that whenever . That is, when the points approaches , the distance , that is the distance of from approaches zero. In fact the estimate above can be improved.
Exercise 5 Prove Proposition 10 above. The is interpreted as follows: For every there exists some such that whenever .
We will be mostly interested in another instance of this principle that is reflected in the Marcinkiewicz integral. This will also come in handy in our study of oscillatory integrals in the next chapter.
For a closed set as before we define the Marcinkiewicz integral associated to , , as
Remark 5 The previous theorem shows that, in average, is small enough whenever to make the integral converge locally. This can be seen as a variation of Proposition 10 though no direct quantitative connection is claimed.
Part (i) is obvious and is left as an exercise. For (ii) it will be enough to show the following:
Then for almost every . In particular we have
Proof: It is enough to show
since then is finite for almost every . To that end we write
Now fix a . As we obviously have that thus . Since all the quantities under the integral signs are positive the previous estimate implies
whenever . Integrating for we get
To get the proof of Theorem 11 we now use the previous lemma as follows. Let be a closed set and let be a ball of radius centered at . Let . Then is closed and so that . Thus the previous lemma applies to and we get that
for almost every where we denote by the distance from the set . Now observe that for and we have that ; indeed and thus . We conclude that
for almost every . Since every eventually belongs to some for some large we get the conclusion of the theorem.
Exercise 6 (i) Show the following strengthened form of Lemma 12: For and locally integrable then
whenever is closed and .
(ii) Use (i) and the maximal theorem to conclude that for all .
4. The dyadic maximal function
We now come to a different approach to the maximal function theorem. On the one hand the `dyadic’ approach we will follow here already implies the maximal theorem presented in the previous paragraph. It is however interesting in its own right and it will give us the chance to present a dyadic structure on the Euclidean space which will come in handy in many different cases.
Consider the basic cube $[0,1)^n$. A dyadic dilation of this cube is the cube $2^k[0,1)^n=[0,2^k)^n$ where $k\in\mathbb Z$. Now we also consider integer translations of this cube of the form $2^k(m+[0,1)^n)$ for some integer vector $m\in\mathbb Z^n$. We have the following definition:
Definition 13 A dyadic cube of generation $k$ is a cube of the form
$$Q=2^k(m+[0,1)^n)=\prod_{i=1}^n[2^km_i,2^k(m_i+1)),$$
where $k\in\mathbb Z$ and $m=(m_1,\dots,m_n)\in\mathbb Z^n$. The family of disjoint cubes
$$\mathcal D_k:=\{2^k(m+[0,1)^n):m\in\mathbb Z^n\}$$
defines the $k$-th generation of dyadic cubes.
The dyadic cubes have the following basic properties.
(d1) The dyadic cubes in the generation $\mathcal D_k$ are disjoint and their union is $\mathbb R^n$. Thus any point $x\in\mathbb R^n$ belongs to a unique dyadic cube in the $k$-th generation.
(d2) Two (different) dyadic cubes are either disjoint or one contains the other.
(d3) A dyadic cube in $\mathcal D_k$ consists of exactly $2^n$ dyadic cubes of the generation $\mathcal D_{k-1}$. On the other hand, for any dyadic cube $Q\in\mathcal D_k$ and any $j>k$ there is a unique dyadic cube in the generation $\mathcal D_j$ that contains $Q$.
As a first instance of how things simplify and get sharper in the dyadic world, let us see the analogue of the Vitali covering lemma in the dyadic case.
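Lemma 14 (Dyadic Vitali-type covering lemma) Let $\{Q_j\}_{j=1}^N$ be a finite collection of dyadic cubes. There exists a subcollection $\{Q_{j_m}\}_{m=1}^M$ of disjoint dyadic cubes such that
$$\bigcup_{j=1}^NQ_j=\bigcup_{m=1}^MQ_{j_m}.$$
Proof: Let $Q_{j_1},\dots,Q_{j_M}$ be the maximal cubes among $\{Q_j\}$, that is, the cubes that are not contained in any other cube of the collection $\{Q_j\}$. Then the cubes $Q_{j_m}$ are disjoint (otherwise they wouldn’t be maximal, since two dyadic cubes are either disjoint or nested). Also any cube $Q_j$ that is not maximal is contained in the union $\bigcup_mQ_{j_m}$.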
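The selection of maximal cubes is easy to carry out explicitly. The following is a minimal one-dimensional sketch, under the assumption (made here for illustration) that a dyadic interval is encoded as a pair `(k, m)` standing for $[m2^k,(m+1)2^k)$; the function names `contains` and `maximal_cubes` are hypothetical.

```python
def contains(big, small):
    """True if the dyadic interval `big` contains `small`.

    A pair (k, m) encodes [m * 2**k, (m + 1) * 2**k); this encoding is an
    assumption of the sketch.
    """
    (kb, mb), (ks, ms) = big, small
    # Only an interval of the same or a larger generation can contain `small`;
    # shifting m right by (kb - ks) gives the index of the ancestor of `small`.
    return kb >= ks and (ms >> (kb - ks)) == mb

def maximal_cubes(cubes):
    """Keep the cubes that are not contained in any other cube of the collection."""
    cubes = set(cubes)
    return sorted(q for q in cubes
                  if not any(p != q and contains(p, q) for p in cubes))

# Illustrative data: [0,1), [0,1/2), [1/2,1), [5/4,3/2), [1,3/2), [5/2,21/8).
cubes = [(0, 0), (-1, 0), (-1, 1), (-2, 5), (-1, 2), (-3, 20)]
print(maximal_cubes(cubes))
```

Because two dyadic intervals are either nested or disjoint, the returned intervals are automatically pairwise disjoint and have the same union as the input.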
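Given a function $f\in L^1_{\rm loc}(\mathbb R^n)$ and $k\in\mathbb Z$ we set
$$E_kf(x):=\sum_{Q\in\mathcal D_k}\Big(\frac{1}{|Q|}\int_Qf(y)\,dy\Big)\mathbf 1_Q(x),$$
where $\mathcal D_k$ denotes the $k$-th generation of dyadic cubes.
Observe that given $x\in\mathbb R^n$ there is a unique cube $Q\in\mathcal D_k$ that contains $x$ and then the value of $E_kf$ at $x$ equals the average of the function $f$ over the cube $Q$. In fact, $E_kf$ is the conditional expectation of $f$ with respect to the $\sigma$-algebra generated by the family $\mathcal D_k$. Observe that for every generation $k$, if $\Omega$ is a bounded union of cubes in $\mathcal D_k$ then
$$\int_\Omega E_kf(x)\,dx=\int_\Omega f(x)\,dx.$$
The operator $E_k$ is the discrete dyadic analogue of an approximation to the identity dilated at the scale of the cubes in $\mathcal D_k$. A difference however is that the averages here are not `centered’. Indeed, $E_kf(x)$ is the average of $f$ with respect to the cube $Q$ whenever $x\in Q$ for some $Q\in\mathcal D_k$. However, $x$ is not the `center’ of the cube $Q$.
The dyadic maximal function is defined as
$$M_df(x):=\sup_{k\in\mathbb Z}E_k(|f|)(x)=\sup_{Q\ni x}\frac{1}{|Q|}\int_Q|f(y)|\,dy.$$
Thus the supremum is taken over all dyadic cubes that contain $x$ or, equivalently, over all generations of dyadic cubes. We have the analogue of the maximal theorem: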
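Theorem 15 (Dyadic maximal theorem) (i) The dyadic maximal function $M_df(x)=\sup_{Q\ni x}\frac{1}{|Q|}\int_Q|f|$, the supremum being over dyadic cubes, is of weak type $(1,1)$ with weak type norm at most $1$:
$$|\{x\in\mathbb R^n:M_df(x)>\lambda\}|\le\frac{\|f\|_{L^1(\mathbb R^n)}}{\lambda},\qquad\lambda>0,$$
for all $f\in L^1(\mathbb R^n)$. (ii) The dyadic maximal function is of strong type $(p,p)$, for all $1<p\le\infty$; for all $f\in L^p(\mathbb R^n)$ we have
$$\|M_df\|_{L^p(\mathbb R^n)}\lesssim_p\|f\|_{L^p(\mathbb R^n)},$$
where the implied constant depends only on $p$.
(iii) We conclude using Proposition 1 that for every $f\in L^1_{\rm loc}(\mathbb R^n)$ we have that
$$\lim_{\substack{Q\ni x\\ |Q|\to0}}\frac{1}{|Q|}\int_Qf(y)\,dy=f(x)\quad\text{for almost every }x\in\mathbb R^n,$$
where the limit is taken along the dyadic cubes $Q$ containing $x$.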
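To make the dyadic maximal function concrete, here is a small discrete sketch. It is only an analogue under simplifying assumptions made here: the function lives on $[0,1)$, is constant on $2^N$ equal cells, and only the dyadic intervals inside $[0,1)$ are used; the name `dyadic_maximal` is hypothetical.

```python
def dyadic_maximal(values):
    """Discrete dyadic maximal function on [0, 1) split into len(values) equal cells.

    For each cell, return the largest average of |f| over the dyadic intervals
    of [0, 1) that contain the cell (a simplification of the definition on R^n).
    """
    n = len(values)
    if n == 0 or n & (n - 1) != 0:
        raise ValueError("the number of cells must be a power of two")
    avgs = [abs(v) for v in values]   # averages over the finest generation
    best = avgs[:]                    # running supremum for every cell
    block = 1                         # number of cells per interval in `avgs`
    while block < n:
        # Average neighbouring siblings to obtain the averages over the parents.
        avgs = [(avgs[2 * i] + avgs[2 * i + 1]) / 2 for i in range(len(avgs) // 2)]
        block *= 2
        for i in range(n):
            best[i] = max(best[i], avgs[i // block])
    return best

def weak_type_holds(values, lam):
    """Check |{M_d f > lam}| <= ||f||_1 / lam, where each cell has measure 1 / n."""
    n = len(values)
    measure = sum(1 for v in dyadic_maximal(values) if v > lam) / n
    norm1 = sum(abs(v) for v in values) / n
    return measure <= norm1 / lam + 1e-12

print(dyadic_maximal([0, 0, 0, 8]))
```

The helper `weak_type_holds` tests the weak type $(1,1)$ inequality with constant $1$ at a given level, which also holds in this restricted setting because the set where the maximal function exceeds the level is a disjoint union of maximal intervals with large average.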
Exercise 7 Give the proof of Theorem 15 above. Observe that the proof is essentially identical to that of Theorem 2 using the dyadic version of the Vitali covering Lemma instead of the non-dyadic one. For (iii) you need to observe that the statement is true for continuous functions (for example) and use Proposition 1.
where is the indicator function of the cube . Show that
where the implied constants depend only on the dimension .
Exercise 9 Show the pointwise estimate
where the implied constant depends only on the dimension . On the other hand, show that the opposite estimate cannot be true. For example when test against the function . Conclude that the dyadic maximal theorem follows from the non-dyadic one (with a different constant though). Hint: Observe that if and is a dyadic cube, there exists a ball which contains and .
Exercise 10 Consider the non-centered maximal function with respect to cubes, or balls
where the supremum is taken over all Euclidean balls containing . Likewise
where the supremum is taken over all cubes (with sides parallel to the coordinate axes) that contain . Show that and are all pointwise equivalent, that is
5. The Calderón-Zygmund decomposition
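Let $(X,\mu)$ be a measure space and $f$ be a measurable function (say) in $L^1(X,\mu)$. For a level $\lambda>0$ we have many times used the decomposition of $f$ at level $\lambda$:
$$f=f\mathbf 1_{\{|f|\le\lambda\}}+f\mathbf 1_{\{|f|>\lambda\}}=:g+b.$$
The function $g$ is the `good’ part of $f$; indeed we have that
$$\|g\|_{L^1(X,\mu)}\le\|f\|_{L^1(X,\mu)}\quad\text{and}\quad\|g\|_{L^\infty(X,\mu)}\le\lambda.$$
Thus the good part inherits the $L^1$-integrability of $f$ and furthermore it is bounded. On the other hand the `bad’ part $b$ satisfies
$$\|b\|_{L^1(X,\mu)}\le\|f\|_{L^1(X,\mu)}\quad\text{and}\quad\mu(\{b\neq0\})\le\mu(\{|f|>\lambda\})\le\frac{\|f\|_{L^1(X,\mu)}}{\lambda}.$$
Thus the bad part also inherits the $L^1$-integrability of $f$ but it also has `small’ support.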
In a general measure space one cannot do much more than that in terms of decomposing in a good part and a bad part. If however there is also a metric structure in the space which is compatible with the measure, one can do a bit better and also get some control on the local oscillation of the bad part . Various forms of this decomposition are usually referred to as Calderón-Zygmund decompositions. We present here the basic example in the dyadic Euclidean setup.
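Theorem 16 (Dyadic Calderón-Zygmund decomposition) Let $f\in L^1(\mathbb R^n)$ and $\lambda>0$. There exists a decomposition of $f$ of the form
$$f=g+b=g+\sum_jb_j,$$
where $\{Q_j\}$ is a collection of disjoint dyadic cubes and the sum is taken over all the cubes $Q_j$. This decomposition satisfies the following properties:
(i) The `good part’ $g$ satisfies the bound
$$\|g\|_{L^\infty(\mathbb R^n)}\le2^n\lambda.$$
(ii) The `bad part’ is $b=\sum_jb_j$; each function $b_j$ is supported on $Q_j$ and
$$\int_{Q_j}b_j(x)\,dx=0.$$
(iii) For each $Q_j$ we have
$$\lambda<\frac{1}{|Q_j|}\int_{Q_j}|f(x)|\,dx\le2^n\lambda.$$
Furthermore we have that
$$\bigcup_jQ_j=\{x\in\mathbb R^n:M_df(x)>\lambda\},$$
where $M_d$ denotes the dyadic maximal function. In particular, from the dyadic maximal theorem we have
$$\sum_j|Q_j|\le\frac{\|f\|_{L^1(\mathbb R^n)}}{\lambda}.$$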
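Proof: The proof is very similar to the proof of the dyadic covering lemma. We fix some level $\lambda>0$ and let us call a dyadic cube $Q$ bad if
$$\frac{1}{|Q|}\int_Q|f(x)|\,dx>\lambda.$$
If a dyadic cube is not bad we call it good. A bad cube $Q$ will be called maximal if $Q$ is bad and also no dyadic cube strictly containing $Q$ is bad. Let us denote by $\{Q_j\}$ the collection of maximal bad cubes. Since the cubes in the collection $\{Q_j\}$ are dyadic and maximal, they are disjoint. Also, for any bad cube $Q$ we have that
$$|Q|<\frac{1}{\lambda}\int_Q|f(x)|\,dx\le\frac{\|f\|_{L^1(\mathbb R^n)}}{\lambda}.$$
Also, since $f\in L^1(\mathbb R^n)$, every bad cube is contained in some maximal bad cube. Indeed, if $Q$ is a bad cube, let $Q^{(k)}$ denote the unique dyadic cube containing $Q$ with side-length $2^k$ times that of $Q$. Then $|Q^{(k)}|\to\infty$ as $k\to\infty$ while $\int_{Q^{(k)}}|f|\le\|f\|_{L^1}$. It follows that there is a large enough $k_0$ such that
$$\frac{1}{|Q^{(k)}|}\int_{Q^{(k)}}|f(x)|\,dx\le\lambda$$
for all $k\ge k_0$. Thus the largest bad cube among $Q^{(0)},\dots,Q^{(k_0-1)}$ is maximal and bad, and it contains $Q$.
Now let $Q$ be a maximal bad cube and consider the parent $\hat Q$ of $Q$, that is the unique dyadic cube with double the side-length that contains $Q$. Since $Q$ is maximal, $\hat Q$ has to be good so we have
$$\lambda<\frac{1}{|Q|}\int_Q|f(x)|\,dx\le\frac{2^n}{|\hat Q|}\int_{\hat Q}|f(x)|\,dx\le2^n\lambda$$
for all maximal bad cubes $Q$. We set
$$b_j:=\Big(f-\frac{1}{|Q_j|}\int_{Q_j}f(y)\,dy\Big)\mathbf 1_{Q_j}$$
whenever $Q_j$ is a maximal bad cube. We also set
$$g:=f-\sum_jb_j.$$
It is not hard to verify all the required properties of $g$ and $b$ except maybe that $|g|\le2^n\lambda$ almost everywhere. It is easy to see that
$$|g(x)|=\Big|\frac{1}{|Q_j|}\int_{Q_j}f(y)\,dy\Big|\le2^n\lambda$$
whenever $x\in Q_j$ and $Q_j$ is a maximal bad cube. If $x\notin\bigcup_jQ_j$ and $Q\ni x$ is a dyadic cube, then necessarily $Q$ is good. We thus have that
$$\Big|\frac{1}{|Q|}\int_Qf(y)\,dy\Big|\le\frac{1}{|Q|}\int_Q|f(y)|\,dy\le\lambda$$
since $Q$ is good. Now, by the dyadic maximal theorem, we have that $\frac{1}{|Q|}\int_Qf\to f(x)$ as $|Q|\to0$ with $x\in Q$, for almost every $x$. Since $g(x)=f(x)$ for $x\notin\bigcup_jQ_j$ we conclude that $|g(x)|\le\lambda$ for almost every such $x$ and we are done in this case as well.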
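The stopping-time construction in the proof can be mimicked on a computer. The sketch below is a one-dimensional discrete analogue under assumptions made here for illustration: the function is constant on $2^N$ equal cells of $[0,1)$, only dyadic intervals inside $[0,1)$ are considered, and the average of $|f|$ over $[0,1)$ is at most $\lambda$ (which plays the role of the large cubes being good). The name `calderon_zygmund` is hypothetical.

```python
def calderon_zygmund(values, lam):
    """Discrete 1D Calderon-Zygmund decomposition at level `lam`.

    Returns (g, b, bad) where `bad` lists the maximal bad dyadic intervals as
    (start_index, length) pairs, g is the good part and b = f - g the bad part.
    """
    n = len(values)
    if n == 0 or n & (n - 1) != 0:
        raise ValueError("the number of cells must be a power of two")
    if sum(abs(v) for v in values) / n > lam:
        raise ValueError("this sketch assumes the average of |f| over [0, 1) is <= lam")

    bad = []

    def visit(start, length):
        # Top-down search: the first bad interval met on a branch is maximal.
        avg = sum(abs(v) for v in values[start:start + length]) / length
        if avg > lam:
            bad.append((start, length))
        elif length > 1:
            half = length // 2
            visit(start, half)
            visit(start + half, half)

    visit(0, n)

    g = list(values)
    for start, length in bad:
        mean = sum(values[start:start + length]) / length
        for i in range(start, start + length):
            g[i] = mean                 # g equals the average of f on each bad interval
    b = [v - gv for v, gv in zip(values, g)]
    return g, b, bad

f = [0, 0, 0, 8, 1, 1, 0, 0]
g, b, bad = calderon_zygmund(f, 2)
print(bad, g, b)
```

In dimension one the bound on the good part reads $|g|\le2\lambda$, and the bad part has mean zero on each selected interval, in line with properties (i) and (ii) of the theorem.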
Observe that in the previous decomposition of , the `bad set’, that is the set where lives, is given in the form
One could prove the Calderón-Zygmund decomposition starting from the set and decomposing it as a union of disjoint dyadic cubes. This sort of decomposition is interesting in its own right. Let us see how this can be done.
Proposition 17 (Dyadic Whitney decomposition) Let be an open set which is not all of . Then there exists a decomposition
where is a collection of disjoint dyadic cubes. For each we have
Proof: Let denote the dyadic cubes inside such that
Obviously but the opposite inclusion is also true. Indeed, if note that is contained in some dyadic cube since is open. Now for a dyadic cube let be its `parent’, that is the unique dyadic cube of side twice the side-length of , containing . Considering successive parents of there will be a dyadic cube containing with diameter greater than and less than . Thus and . The collection of dyadic cubes is not necessarily disjoint so we only choose the cubes in which are maximal with respect to set inclusion and call this collection again . Now maximal and dyadic means disjoint so we are done.
Using the Whitney decomposition lemma one can give an alternative proof of the Calderón-Zygmund decomposition by taking
and noting that the latter set is open.
As a corollary we get a control of the level sets of the usual (non-dyadic) maximal function by the level sets of the dyadic maximal function.
Proof: Let be the collection of dyadic cubes obtained by the Calderón-Zygmund decomposition at level . We have that
Indeed, let and be any cube centered at . Denoting by the side-length of , we choose so that . Then intersects cubes in the -th generation , and let us call them . Observe that none of these cubes can be contained in any of the because otherwise we would have that . Thus the average of on each is at most so
This proves the claim (2) and thus the corollary.
Exercise 11 Using the dyadic maximal theorem only, conclude that the operators are of weak type .
5.1. The Fefferman-Stein weighted inequality.
We give a first application of the Calderón-Zygmund decomposition which in some sense is the prototype of a weighted norm inequality. It is a variation of the maximal theorem where the Lebesgue measure is replaced by a measure of the form $w(x)\,dx$ for some non-negative measurable function $w$. It then turns out that the maximal function maps $L^p(Mw)$ to $L^p(w)$ boundedly for all $1<p<\infty$ and that it also satisfies a weak endpoint analogue for $p=1$. In particular we have
Theorem 19 (Fefferman-Stein inequality) Let $w$ be a non-negative locally integrable function (a `weight’).
(i) We have that
$$\int_{\mathbb R^n}(Mf(x))^pw(x)\,dx\lesssim_{n,p}\int_{\mathbb R^n}|f(x)|^pMw(x)\,dx$$
for all $f$ with $1<p<\infty$.
(ii) In the endpoint case $p=1$ we get the weak analogue
$$\int_{\{Mf>\lambda\}}w(x)\,dx\lesssim_n\frac{1}{\lambda}\int_{\mathbb R^n}|f(x)|Mw(x)\,dx$$
for all $\lambda>0$.
Proof: We will show that $M:L^\infty(Mw)\to L^\infty(w)$ and that the weak inequality in (ii) holds. Then the Marcinkiewicz interpolation theorem will give (i) as well.
is trivial and is left as an exercise. We turn our attention to the -bound. Let be the collection of the dyadic cubes obtained from the Calderón-Zygmund decomposition at level . By the proof of Lemma 18 we have that
where is the cube with the same center as and twice its side-length. We have
Again, from the Calderón-Zygmund decomposition (at level ) we have that
for all of the decomposition. Combining the last two estimates we can write
For fixed the term is non-zero if and only if . Thus the previous estimate implies
where $M''$ is the non-centered maximal function associated to cubes; see Exercise 10. Since $M''w\lesssim_nMw$ this concludes the proof.
Exercise 12 (Hedberg’s inequality and Hardy-Littlewood-Sobolev theorem) Let , and .
(i) Show Hedberg’s inequality: If then
(ii) Use the Hardy-Littlewood maximal theorem and (i) to conclude the Hardy-Littlewood-Sobolev theorem: For every we have that
Hint: In order to show (i) split the integral
where is a parameter to be chosen later on. For observe that
Observe that is decreasing, radial, non-negative and integrable (since ). Use Proposition 6 and the calculation in its proof to show the bound
For use Hölder’s inequality to show
Choose the parameter to minimize the sum . Part (ii) is a trivial consequence of (i).
[Update 4 Apr 2011: Section 3.1 concerning the Marcinkiewicz integral added; numbering changed.
Update 9th May 2011: Typo in the hint of Exercise 1 corrected.]