**1. Averages and maximal operators**

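This week we will be discussing the Hardy-Littlewood maximal function and some closely related maximal type operators. In order to have something concrete let us first of all define the averages of a locally integrable function $f$ around the point $x\in\mathbb R^n$:

$$A_r(f)(x) := \frac{1}{|B(x,r)|}\int_{B(x,r)} f(y)\,dy,\quad x\in\mathbb R^n,\ r>0,$$

where $B(x,r)$ is the Euclidean ball with center $x$ and radius $r$ and $|B(x,r)|$ denotes its Lebesgue measure. Note that since Lebesgue measure is translation invariant we have

$$|B(x,r)| = |B(0,r)| = \Omega_n r^n,$$

where $\Omega_n$ denotes the Lebesgue measure (or volume in this case) of the $n$-dimensional unit ball $B(0,1)$. Denoting by $\chi$ the indicator function of the normalized unit ball,

$$\chi(x) := \frac{1}{\Omega_n}\chi_{B(0,1)}(x),$$

and noting that the balls centered at zero are $0$-symmetric, we can write

$$\frac{1}{|B(x,r)|}\chi_{B(x,r)}(y) = \frac{1}{\Omega_n r^n}\chi_{B(0,1)}\Big(\frac{x-y}{r}\Big) = \frac{1}{r^n}\chi\Big(\frac{x-y}{r}\Big).$$

Thus

$$A_r(f)(x) = \int_{\mathbb R^n} f(y)\,\frac{1}{r^n}\chi\Big(\frac{x-y}{r}\Big)\,dy = (f*\chi_r)(x),$$

and of course $\{\chi_r\}_{r>0}$ is an approximation to the identity since $\int\chi = 1$ and $\chi_r$ are just the *dilations* of the function $\chi$:

$$\chi_r(x) := \frac{1}{r^n}\chi\Big(\frac{x}{r}\Big),\quad r>0.$$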

Remembering the discussion that followed the definition of the convolution in Notes 2, the convolution of a locally integrable function $f$ with the dilations of an $L^1$ function $\phi$ was viewed as an averaging process. We see now that when $\phi=\chi$ this is exact, that is, $(f*\chi_r)(x)$ is the average of $f$ with respect to a ball around $x$ of radius $\simeq_n r$, where the implied constant only depends on the dimension. A similar conclusion follows if we start with any set $K\subset\mathbb R^n$ that is say $0$-symmetric and convex and normalized to volume $1$. We then have that

$$(f*(\chi_K)_r)(x) = \frac{1}{r^n}\int_{\mathbb R^n} f(y)\,\chi_K\Big(\frac{x-y}{r}\Big)\,dy = \frac{1}{|K_r|}\int_{x+K_r} f(y)\,dy,$$

that is, $f*(\chi_K)_r$ are the averages of $f$ with respect to the dilations of the fixed convex body $K$ at every point $x\in\mathbb R^n$. Here we denote by $K_r$ the dilations of $K$:

$$K_r := \{rx : x\in K\},\quad r>0.$$

It is an easy exercise to show that all these averages are uniformly bounded in size. For all $1\le p\le\infty$ we have

$$\sup_{r>0}\|f*(\chi_K)_r\|_{L^p(\mathbb R^n)} \le \|f\|_{L^p(\mathbb R^n)}.$$

One of course could consider more general sets instead of convex sets which are $0$-symmetric and in fact this leads to one of the most interesting families of problems in harmonic analysis. This however falls outside the scope of this course and we will mostly focus on the case of the normalized unit ball which in some sense is the prototypical example.

The Hardy-Littlewood maximal operator (with respect to Euclidean balls) is defined as

$$M(f)(x) := \sup_{r>0}\frac{1}{|B(x,r)|}\int_{B(x,r)}|f(y)|\,dy,\quad x\in\mathbb R^n.$$

Observe that this is a sublinear operator that is well defined at least when $f$ is locally integrable. Although maximal operators are interesting in their own right, there are some very specific applications we have in mind. The first has to do with pointwise convergence of averages of a function and is a consequence of the following simple proposition.
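As a brief aside before the proposition, the definition of the maximal operator can be made concrete in a discrete model. The sketch below is only an illustration under stated assumptions: the function is replaced by a finite list of samples on the integer grid (taken to be zero outside the list), a "ball" of radius $r$ around the index $i$ is the window of $2r+1$ grid points centered at $i$, and the name `centered_maximal` is hypothetical.

```python
def centered_maximal(f):
    """Discrete analogue of the centered Hardy-Littlewood maximal function.

    Assumption: f is a finite list of samples, extended by zero outside
    the list; the 'ball' B(i, r) is the window {i-r, ..., i+r}, which
    contains 2r+1 grid points.
    """
    n = len(f)
    # prefix[j] = |f[0]| + ... + |f[j-1]|, so window sums cost O(1)
    prefix = [0.0]
    for v in f:
        prefix.append(prefix[-1] + abs(v))

    out = []
    for i in range(n):
        best = 0.0
        # radii beyond n-1 only enlarge the window without adding mass
        for r in range(n):
            lo = max(0, i - r)
            hi = min(n, i + r + 1)
            window_sum = prefix[hi] - prefix[lo]
            best = max(best, window_sum / (2 * r + 1))
        out.append(best)
    return out


if __name__ == "__main__":
    print(centered_maximal([0, 0, 4, 0, 0]))
```

Taking $r=0$ shows that the discrete maximal function dominates $|f|$ pointwise, which mirrors the fact that $M(f)\ge|f|$ almost everywhere in the continuous setting.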

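**Proposition 1** Let $\{T_t\}_{t>0}$ be a family of sub-linear operators on $L^p(X,\mu)$ and define the maximal operator

$$T^*(f)(x) := \sup_{t>0}|T_t(f)(x)|.$$

If $T^*$ is of weak type $(p,q)$ then for any $t_0\in[0,\infty]$ the set

$$\{f\in L^p(X,\mu):\ \lim_{t\to t_0}T_t(f)(x) = f(x)\ \text{for almost every } x\in X\}$$

is closed in $L^p(X,\mu)$.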

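*Proof:* In order to show that the set

$$E := \{f\in L^p(X,\mu):\ \lim_{t\to t_0}T_t(f)(x) = f(x)\ \text{for almost every } x\in X\}$$

is closed, consider a sequence of functions $\{f_n\}\subset E$ with $f_n\to f$ in $L^p(X,\mu)$. We need to show that $f\in E$. To see this observe that for almost every $x\in X$ we have

$$\limsup_{t\to t_0}|T_t(f)(x)-f(x)| \le \limsup_{t\to t_0}|T_t(f-f_n)(x)| + \limsup_{t\to t_0}|T_t(f_n)(x)-f_n(x)| + |f_n(x)-f(x)| \le T^*(f-f_n)(x) + |f(x)-f_n(x)|.$$

Thus for any $\lambda>0$ we can write (for $q<\infty$, and with $C$ the weak type constant of $T^*$)

$$\mu(\{x: \limsup_{t\to t_0}|T_t(f)(x)-f(x)|>\lambda\}) \le \mu(\{T^*(f-f_n)>\lambda/2\}) + \mu(\{|f-f_n|>\lambda/2\}) \le \Big(\frac{2C}{\lambda}\|f-f_n\|_{L^p}\Big)^q + \Big(\frac{2}{\lambda}\|f-f_n\|_{L^p}\Big)^p.$$

Since the right hand side tends to $0$ as $n\to\infty$ and the left hand side does not depend on $n$ we conclude that for every $\lambda>0$

$$\mu(\{x: \limsup_{t\to t_0}|T_t(f)(x)-f(x)|>\lambda\}) = 0.$$

Now we have that

$$\mu(\{x: \limsup_{t\to t_0}|T_t(f)(x)-f(x)|>0\}) \le \sum_{k=1}^\infty \mu(\{x: \limsup_{t\to t_0}|T_t(f)(x)-f(x)|>1/k\}) = 0.$$

Thus $\lim_{t\to t_0}T_t(f)(x) = f(x)$ for almost every $x\in X$ so that $f\in E$.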

**Remark 1** We have indexed the family $T_t$ in $t\in\mathbb R_+$ for the sake of definiteness but one can of course consider more general index sets and the previous proposition remains valid. Whenever the index set is uncountable, some care is needed to ensure the measurability of $T^*(f)$.

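**Remark 2** To get a clearer picture of what this proposition says consider the family of operators

$$T_t(f)(x) := (f*\phi_t)(x),\quad t>0,$$

for some $\phi\in L^1(\mathbb R^n)$ with integral $1$. As we have seen already many times, these averages of $f$ converge to $f$ in many different senses for different classes of functions $f$. In particular if $f\in C_c(\mathbb R^n)$ then $f*\phi_t$ converges to $f$ even uniformly as $t\to0$. Thus we have

$$C_c(\mathbb R^n)\subset\{f\in L^p(\mathbb R^n):\ \lim_{t\to0}(f*\phi_t)(x) = f(x)\ \text{for almost every } x\}.$$

Since $C_c(\mathbb R^n)$ is dense in $L^p(\mathbb R^n)$, $1\le p<\infty$, Proposition 1 implies that if $T^*(f)=\sup_{t>0}|f*\phi_t|$ is of weak type $(p,q)$ then

$$\lim_{t\to0}(f*\phi_t)(x) = f(x),\quad f\in L^p(\mathbb R^n),$$

for almost every $x\in\mathbb R^n$. Thus in order to show that approximations to the identity converge to the function almost everywhere it is enough to show that the corresponding maximal operator is of weak type $(p,q)$. In what follows we will show that the Hardy-Littlewood maximal operator is of weak type $(1,1)$ and this already implies the corresponding statement for a wide class of ‘nice’ approximations to the identity.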

To avoid confusion, remember that in Theorem 15 of Notes 3 we have already exhibited that

$$\lim_{t\to0}(f*\phi_t)(x) = f(x)$$

for every Lebesgue point $x$ of $f$. However this is only interesting if we already know that $f$ has ‘many’ Lebesgue points (in particular almost every point in $\mathbb R^n$). In Theorem 15 of Notes 3 we took for granted that the integral of a locally integrable function is almost everywhere differentiable and this in turn implied that almost every point in $\mathbb R^n$ is a Lebesgue point of $f$. In this part of the course we will fill in this gap by showing that the integral of a locally integrable function is almost everywhere differentiable.

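**Exercise 1** Let $T^*$ be of weak type $(p,q)$. Show that for every $t_0$ the set

$$\{f\in L^p(X,\mu):\ \lim_{t\to t_0}T_t(f)(x)\ \text{exists for almost every } x\in X\}$$

is closed in $L^p(X,\mu)$.

*Hint:* The proof is very similar to that of Proposition 1. Observe that it suffices to show that

$$\mu(\{x: \limsup_{t\to t_0}T_t(f)(x)-\liminf_{t\to t_0}T_t(f)(x)>\lambda\}) = 0$$

for every $\lambda>0$.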

**2. The Hardy-Littlewood maximal theorem**

We focus our attention on the Hardy-Littlewood maximal operator; for $f\in L^1_{\mathrm{loc}}(\mathbb R^n)$,

$$M(f)(x) = \sup_{r>0}\frac{1}{|B(x,r)|}\int_{B(x,r)}|f(y)|\,dy.$$

The discussion in the previous section suggests that one should try to prove weak $(p,q)$ bounds for the operator $M$. In fact we will prove the following theorem which summarizes the boundedness properties of $M$.

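**Theorem 2 (Hardy-Littlewood maximal theorem)** (i) The Hardy-Littlewood maximal operator is of strong type $(p,p)$ for $1<p\le\infty$:

$$\|M(f)\|_{L^p(\mathbb R^n)}\lesssim_{n,p}\|f\|_{L^p(\mathbb R^n)},$$

for all $f\in L^p(\mathbb R^n)$ and $1<p\le\infty$.

(ii) The Hardy-Littlewood maximal operator is of weak type $(1,1)$:

$$|\{x\in\mathbb R^n: M(f)(x)>\lambda\}|\lesssim_n\frac{\|f\|_{L^1(\mathbb R^n)}}{\lambda},\quad\lambda>0,$$

for all $f\in L^1(\mathbb R^n)$.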

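**Remark 3** The Hardy-Littlewood maximal operator is *not* of strong type $(1,1)$. To see this note that for any $f\in L^1(\mathbb R^n)$ which is not identically $0$ we have that

$$M(f)(x)\gtrsim_{n,f}\frac{1}{|x|^n}\quad\text{for all sufficiently large } |x|,$$

which shows in particular that $M(f)$ is never integrable whenever $f$ is not identically $0$. Moreover, no strong estimates of type $(p,q)$ are possible whenever $p\neq q$ as can be seen by examining the dilations of $f$ and $M(f)$.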

**Exercise 2** Prove the assertions in the previous remark.

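**Exercise 3** Let $f\in L^1(\mathbb R^n)$ and let $B$ be a ball such that $M(f)(x)>\lambda$ for every $x\in B$. Let $B^*$ be the ball with the same center and twice the radius of $B$. Show that $M(f)(x)\gtrsim_n\lambda$ for every $x\in B^*$.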

*Proof of Theorem 2:* First of all let us observe that $M$ is of strong type $(\infty,\infty)$. This is just a consequence of the general fact that an average never exceeds a ‘maximum’. In view of the Marcinkiewicz interpolation theorem it then suffices to show the assertion (ii) of the theorem, namely that $M$ is of weak type $(1,1)$. Furthermore, by homogeneity, it suffices to show that

$$|\{x\in\mathbb R^n: M(f)(x)>1\}|\lesssim_n\|f\|_{L^1(\mathbb R^n)}.$$

We now fix some $f\in L^1(\mathbb R^n)$ and set

$$E := \{x\in\mathbb R^n: M(f)(x)>1\},$$

and let $K$ be any compact subset of $E$. Our task is to obtain an estimate of the form $|K|\lesssim_n\|f\|_{L^1(\mathbb R^n)}$, uniformly in $K$.

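For every $x\in K$ there is an open ball $B_x = B(x,r_x)$ (of some radius $r_x>0$) such that

$$\frac{1}{|B_x|}\int_{B_x}|f(y)|\,dy>1.$$

The family $\{B_x\}_{x\in K}$ clearly covers the compact set $K$ so we can extract a finite subcollection of balls which we denote by $\{B_j\}_{j=1}^N$ which still cover $K$. Since $K\subset\bigcup_{j=1}^N B_j$ we get that

$$|K|\le\Big|\bigcup_{j=1}^N B_j\Big|.$$

Observe on the other hand that

$$\int_{\bigcup_j B_j}|f(y)|\,dy\le\|f\|_{L^1(\mathbb R^n)},$$

so if we manage to show that

$$\Big|\bigcup_{j=1}^N B_j\Big|\lesssim_n\int_{\bigcup_j B_j}|f(y)|\,dy$$

we would be done. The main obstruction to such an estimate is that the balls $B_j$ may overlap a lot. On the other hand, if the balls were disjoint (or ‘almost’ disjoint) then there would be no problem, since $|B_j|<\int_{B_j}|f|$ for each $j$. Although we can’t directly claim that the family $\{B_j\}$ is non-overlapping, the following lemma will allow us to extract a subcollection of balls which has this property, without losing too much of the measure of the union of balls in the collection.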

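**Lemma 3 (Vitali-type covering lemma)** Let $\{B_j\}_{j=1}^N$ be a finite collection of balls. Then there exists a subcollection $\{B_{j_k}\}_{k=1}^M$ of *disjoint balls* such that

$$\Big|\bigcup_{k=1}^M B_{j_k}\Big| = \sum_{k=1}^M|B_{j_k}|\ge 3^{-n}\Big|\bigcup_{j=1}^N B_j\Big|.$$

Before giving the proof of this covering lemma let us see how we can use it to conclude the proof of Theorem 2. Recall that we have extracted a finite collection of balls $\{B_j\}_{j=1}^N$ which cover the set $K$ and which satisfy

$$|B_j|<\int_{B_j}|f(y)|\,dy,\quad j=1,\ldots,N.$$

Now applying the covering lemma we can extract a subcollection of disjoint balls $\{B_{j_k}\}_{k=1}^M$ so that the measure of their union exceeds a multiple of the measure of the union of the original family of balls. Thus, we can write

$$|K|\le\Big|\bigcup_{j=1}^N B_j\Big|\le 3^n\sum_{k=1}^M|B_{j_k}|\le 3^n\sum_{k=1}^M\int_{B_{j_k}}|f(y)|\,dy = 3^n\int_{\bigcup_k B_{j_k}}|f(y)|\,dy\le 3^n\|f\|_{L^1(\mathbb R^n)}.$$

Observe that this estimate is uniform over all compact sets $K\subset E$ so taking the supremum over such sets and using the inner regularity of the Lebesgue measure we conclude that

$$|E|\le 3^n\|f\|_{L^1(\mathbb R^n)},$$

which concludes the proof.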

*Proof of the Covering Lemma 3:* First of all let us assume that the balls are arranged in decreasing order of size (thus $B_1$ is the largest ball). We will choose the subcollection by the *greedy algorithm*. The first ball we choose in the subcollection is the largest ball, thus $B_{j_1} := B_1$. Now assume we have chosen the balls $B_{j_1},\ldots,B_{j_k}$ for some $k\ge1$. We choose the ball $B_{j_{k+1}}$ to be the largest ball which doesn’t intersect any of the balls already chosen. Observe that this amounts to choosing

$$j_{k+1} := \min\{j>j_k:\ B_j\cap(B_{j_1}\cup\cdots\cup B_{j_k}) = \emptyset\}.$$

We continue this process until we run out of balls. It is clear that the resulting subcollection consists of disjoint balls. On the other hand, every ball $B_j$ of the original collection is either selected or it intersects one of the selected balls, say $B_{j_k}$, in the subcollection of greater or equal radius (otherwise the ball $B_j$ would be selected). Then it is not hard to see that

$$B_j\subset 3B_{j_k},$$

where $3B_{j_k}$ is the ball with the same center as $B_{j_k}$ and three times its radius. Thus we have that

$$\bigcup_{j=1}^N B_j\subset\bigcup_{k=1}^M 3B_{j_k}.$$

Taking the Lebesgue measure of both unions we conclude

$$\Big|\bigcup_{j=1}^N B_j\Big|\le\sum_{k=1}^M|3B_{j_k}| = 3^n\sum_{k=1}^M|B_{j_k}| = 3^n\Big|\bigcup_{k=1}^M B_{j_k}\Big|$$

and we are done.
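The greedy selection in the proof of the covering lemma can be sketched in a few lines of code. This is only an illustration under stated assumptions: balls are open Euclidean balls given as `(center, radius)` pairs, and the names `greedy_vitali` and `tripled_balls_cover` are hypothetical helpers introduced here.

```python
import math


def greedy_vitali(balls):
    """Greedy selection from the proof of the Vitali-type covering lemma.

    balls: list of (center, radius), center a tuple of floats.
    Returns the indices of the selected, pairwise disjoint balls.
    """
    # visit the balls in decreasing order of radius
    order = sorted(range(len(balls)), key=lambda j: -balls[j][1])
    chosen = []
    for j in order:
        cj, rj = balls[j]
        # open balls are disjoint iff the centers are at distance >= r + r'
        if all(math.dist(cj, balls[k][0]) >= rj + balls[k][1] for k in chosen):
            chosen.append(j)
    return chosen


def tripled_balls_cover(balls, chosen):
    """Check that every ball lies inside 3B for some selected ball B."""
    def inside_triple(j, k):
        cj, rj = balls[j]
        ck, rk = balls[k]
        return math.dist(cj, ck) + rj <= 3 * rk
    return all(any(inside_triple(j, k) for k in chosen) for j in range(len(balls)))


if __name__ == "__main__":
    balls = [((0.0, 0.0), 1.0), ((1.5, 0.0), 1.0),
             ((0.5, 0.5), 0.3), ((5.0, 5.0), 0.5)]
    picked = greedy_vitali(balls)
    print(picked, tripled_balls_cover(balls, picked))
```

The second helper checks exactly the inclusion used at the end of the proof: a ball that was not selected meets a selected ball of greater or equal radius, and is therefore contained in its threefold dilate.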

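**Exercise 4 (The maximal function on the class $L\log L$)** We saw that if $f$ is a non-trivial integrable function then $M(f)$ is never integrable. Suppose however that $f$ is supported in a finite ball $B$ and that it is a ‘bit better’ than being integrable, namely it satisfies

$$\int_B|f(x)|\log^+|f(x)|\,dx<\infty,$$

where $\log^+t := \max(\log t,0)$. We say in this case that $f\in L\log L(B)$. Then we have that $M(f)\in L^1(B)$ and

$$\int_B M(f)(x)\,dx\lesssim_n|B|+\int_B|f(x)|\log^+|f(x)|\,dx.$$

*Hints:* (a) For $\lambda>0$ show that

$$|\{x: M(f)(x)>2\lambda\}|\lesssim_n\frac{1}{\lambda}\int_{\{|f|>\lambda\}}|f(x)|\,dx.$$

It will help you to split the function as

$$f = f\chi_{\{|f|>\lambda\}}+f\chi_{\{|f|\le\lambda\}} =: f_1+f_2$$

and observe that $M(f_2)\le\lambda$.

(b) Show that

$$\int_B M(f)(x)\,dx = 2\int_0^\infty|\{x\in B: M(f)(x)>2\lambda\}|\,d\lambda\le 2|B|+2\int_1^\infty|\{x\in B: M(f)(x)>2\lambda\}|\,d\lambda.$$

From this, (a) and Fubini’s theorem you can conclude the proof.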

**3. Consequences of the maximal theorem**

Our first application of the maximal theorem has to do with the differentiability of the integral of a locally integrable function. Indeed, using Theorem 2 and Proposition 1 we immediately get the following.

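**Corollary 4 (Lebesgue differentiation theorem)** Let $f$ be a locally integrable function. Then, for almost every $x\in\mathbb R^n$ we have that

$$\lim_{r\to0}\frac{1}{|B(x,r)|}\int_{B(x,r)}f(y)\,dy = f(x).$$

For the proof just observe that the averages on the left are $(f*\chi_r)(x)$, with $\chi_r$ the dilations of the normalized indicator function of the unit ball from Section 1, and that the claimed convergence property is a local property; thus one can confine any locally integrable function in a ball around the point $x$, which turns $f$ into an $L^1$ function. As we have already seen in Notes 3, the previous statement also implies the following:

**Corollary 5** Let $f\in L^1_{\mathrm{loc}}(\mathbb R^n)$. Then almost every point in $\mathbb R^n$ is a *Lebesgue point* of $f$, that is, we have that

$$\lim_{r\to0}\frac{1}{|B(x,r)|}\int_{B(x,r)}|f(y)-f(x)|\,dy = 0$$

for almost every $x\in\mathbb R^n$.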

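Lebesgue’s differentiation theorem generalizes to more general averages. A manifestation of this is already presented in Theorem 15 of Notes 2 which asserts that for ‘nice’ approximations to the identity $\{\phi_t\}_{t>0}$, the means $f*\phi_t$ converge to $f$ at every Lebesgue point of $f$. Here we will give an alternative proof of this theorem by controlling the maximal operator $\sup_{t>0}|f*\phi_t|$ by the Hardy-Littlewood maximal function.

**Proposition 6** Let $\phi$ be a positive and radially decreasing function with $\int_{\mathbb R^n}\phi = 1$. Then we have that

$$\sup_{t>0}|(f*\phi_t)(x)|\le M(f)(x).$$

*Proof:* First suppose that $\phi$ is of the form $\phi = \sum_j a_j\chi_{B_j}$ where $a_j>0$ and $B_j$ are Euclidean balls centered at $0$ for all $j$. Then we have

$$|(f*\phi_t)(x)|\le\sum_j a_j|B_j|\frac{1}{|tB_j|}\int_{x+tB_j}|f(y)|\,dy\le\Big(\sum_j a_j|B_j|\Big)M(f)(x) = \|\phi\|_{L^1(\mathbb R^n)}M(f)(x).$$

However, any function $\phi$ which is positive and radially decreasing can be approximated monotonically from below by a sequence of simple functions of the form $\sum_j a_j\chi_{B_j}$, so we are done by monotone convergence.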

As an immediate corollary we get the same control for approximations to the identity which are controlled by positive radially decreasing functions.

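**Corollary 7** Let $|\phi(x)|\le\psi(x)$ almost everywhere where $\psi$ is positive, radially decreasing and integrable. Then we have that

$$\sup_{t>0}|(f*\phi_t)(x)|\le\|\psi\|_{L^1(\mathbb R^n)}M(f)(x).$$

In particular $f\mapsto\sup_{t>0}|f*\phi_t|$ is of weak type $(1,1)$ and strong type $(p,p)$ for all $1<p\le\infty$. We conclude that

$$\lim_{t\to0}(f*\phi_t)(x) = \Big(\int_{\mathbb R^n}\phi\Big)f(x)$$

for almost every $x\in\mathbb R^n$.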

**Remark 4** The qualitative conclusion of the previous corollaries is that maximal averages of $f$ with radially decreasing integrable kernels are controlled by the Hardy-Littlewood maximal function. A typical radially decreasing integrable kernel is the Gaussian kernel

$$W(x) := e^{-\pi|x|^2}.$$

By dilating by $t>0$ we get

$$W_t(x) = \frac{1}{t^n}e^{-\pi|x|^2/t^2}.$$

The function $W_t$ can be viewed as a smooth approximation of the (normalized) indicator function of a ball of radius $t$ (up to constants). Indeed, for $|x|\le t$ say, we have that $W_t(x)\simeq t^{-n}$, while for $|x|>t$ the function decays very fast. Thus the kernel $W_t$ is not so different from $\chi_t$.

**3.1. Points of density and the Marcinkiewicz Integral**

A direct consequence of Lebesgue’s differentiation theorem is that almost every point of a measurable set is ‘completely’ surrounded by other points of the set. To make this precise, let us give a definition.

**Definition 8** Let $E$ be a measurable set in $\mathbb R^n$ and let $x\in\mathbb R^n$. We say that $x$ is a *point of density* of the set $E$, if

$$\lim_{r\to0}\frac{|E\cap B(x,r)|}{|B(x,r)|} = 1.$$

Of course the limit in the previous definition might not exist in general or not be equal to $1$. Observe however that if the previous limit is equal to $0$ then $x$ is a point of density of the set $E^c$, the complement of $E$. On the other hand, applying Lebesgue’s differentiation theorem to the function $\chi_E$ which is obviously locally integrable we get

$$\lim_{r\to0}\frac{|E\cap B(x,r)|}{|B(x,r)|} = \chi_E(x)$$

for almost every $x\in\mathbb R^n$. Thus we immediately get the following

**Proposition 9** Let $E$ be a measurable set. Then almost every point of $E$ is a point of density of $E$. Likewise, almost every point $x\in E^c$ is a point of density of $E^c$.

Thus a point of density is in a measure theoretic sense completely surrounded by other points of $E$. The measure of the set $E$ in the ball $B(x,r)$ is asymptotically equal to the measure of the ball as $r\to0$ when $x$ is a point of density.

Another way to describe this notion is the following. Let $F$ be a closed set and define $\delta(x) := \mathrm{dist}(x,F)$. Of course $\delta(x) = 0$ if $x\in F$. Now think of $y$ in a neighborhood of zero so that the vector $x+y$ is in the neighborhood of $x$. If $x\in F$ then the distance of the point $x+y$ from $F$ is at most $|y|$ since $x\in F$ and $|x+y-x| = |y|$. Thus we have that $\delta(x+y)\le|y|$ whenever $x\in F$. That is, when the point $x+y$ approaches $x\in F$, the distance $\delta(x+y)$, that is the distance of $x+y$ from $F$, approaches zero. In fact the estimate above can be improved.

**Proposition 10** Let $F$ be a closed set. Then for almost every $x\in F$, $\delta(x+y) = o(|y|)$ as $|y|\to0$. This is true in particular if $x$ is a point of density of the set $F$.

**Exercise 5** Prove Proposition 10 above. The $o(|y|)$ is interpreted as follows: For every $\epsilon>0$ there exists some $\eta>0$ such that $\delta(x+y)\le\epsilon|y|$ whenever $|y|<\eta$.

We will be mostly interested in another instance of this principle that is reflected in the Marcinkiewicz integral. This will also come in handy in our study of oscillatory integrals in the next chapter.

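For a closed set $F$ as before we define the *Marcinkiewicz integral associated to $F$*, $I(x)$, as

$$I(x) := \int_{|y|\le1}\frac{\delta(x+y)}{|y|^{n+1}}\,dy.$$

**Theorem 11** Let $F$ be a closed set. (i) If $x\notin F$ then $I(x) = +\infty$. (ii) For almost every $x\in F$ we have $I(x)<+\infty$.

**Remark 5** The previous theorem shows that, on average, $\delta(x+y)$ is small enough whenever $x\in F$ to make the integral converge locally. This can be seen as a variation of Proposition 10 though no direct quantitative connection is claimed.

Part (i) is obvious and is left as an exercise. For (ii) it will be enough to show the following: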

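**Lemma 12** Let $F$ be a closed set whose complement has finite measure. Then we set

$$I_*(x) := \int_{\mathbb R^n}\frac{\delta(x+y)}{|y|^{n+1}}\,dy.$$

Then $I_*(x)<\infty$ for almost every $x\in F$. In particular we have

$$\int_F I_*(x)\,dx\lesssim_n|F^c|.$$

*Proof:* It is enough to show

$$\int_F I_*(x)\,dx\lesssim_n|F^c|,$$

since then $I_*$ is finite for almost every $x\in F$. To that end we write, using that $\delta$ vanishes on $F$,

$$\int_F I_*(x)\,dx = \int_F\int_{\mathbb R^n}\frac{\delta(y)}{|x-y|^{n+1}}\,dy\,dx = \int_{F^c}\delta(y)\Big(\int_F\frac{dx}{|x-y|^{n+1}}\Big)dy.$$

Now fix a $y\in F^c$. As $x\in F$ we obviously have that $|x-y|\ge\delta(y)$, thus $F\subset\{x:|x-y|\ge\delta(y)\}$. Since all the quantities under the integral signs are positive the previous estimate implies

$$\int_F\frac{dx}{|x-y|^{n+1}}\le\int_{|x-y|\ge\delta(y)}\frac{dx}{|x-y|^{n+1}}\simeq_n\frac{1}{\delta(y)}$$

whenever $y\in F^c$. Integrating for $y\in F^c$ we get

$$\int_F I_*(x)\,dx\lesssim_n\int_{F^c}\delta(y)\frac{1}{\delta(y)}\,dy = |F^c|.$$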

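To get the proof of Theorem 11 we now use the previous lemma as follows. Let $F$ be a closed set and let $B_m$ be the open ball of radius $m$ centered at $0$. Let $F_m := F\cup B_m^c$. Then $F_m$ is closed and $F_m^c\subset B_m$ so that $|F_m^c|<\infty$. Thus the previous lemma applies to $F_m$ and we get that

$$\int_{|y|\le1}\frac{\delta_m(x+y)}{|y|^{n+1}}\,dy<\infty$$

for almost every $x\in F_m$, where we denote by $\delta_m$ the distance from the set $F_m$. Now observe that for $x\in F\cap B_{m-2}$ and $|y|\le1$ we have that $\delta_m(x+y) = \delta(x+y)$; indeed $|x+y|<m-1$ and thus $\mathrm{dist}(x+y,B_m^c)>1\ge|y|\ge\delta(x+y)$. We conclude that

$$I(x) = \int_{|y|\le1}\frac{\delta(x+y)}{|y|^{n+1}}\,dy<\infty$$

for almost every $x\in F\cap B_{m-2}$. Since every $x\in F$ eventually belongs to $B_{m-2}$ for some large $m$ we get the conclusion of the theorem.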

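**Exercise 6** (i) Show the following strengthened form of Lemma 12: For $f\ge0$ and locally integrable,

$$\int_F I_*(x)f(x)\,dx\lesssim_n\int_{F^c}M(f)(x)\,dx$$

whenever $F$ is closed and $|F^c|<\infty$.

(ii) Use (i) and the maximal theorem to conclude that $I_*\in L^p(F)$ for all $1\le p<\infty$.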

**4. The dyadic maximal function**

We now come to a different approach to the maximal function theorem. On the one hand the ‘dyadic’ approach we will follow here already implies the maximal theorem presented in the previous paragraph. It is however interesting in its own right and it will give us the chance to present a dyadic structure on the Euclidean space which will come in handy in many different cases.

Consider the basic cube $Q_0 := [0,1)^n$. A dyadic dilation of this cube is the cube $2^{-k}Q_0 = [0,2^{-k})^n$ where $k\in\mathbb Z$. Now we also consider integer translations of this cube of the form $2^{-k}(m+Q_0)$ for some integer vector $m\in\mathbb Z^n$. We have the following definition:

**Definition 13** A dyadic cube of *generation $k$* is a cube of the form

$$Q = 2^{-k}(m+[0,1)^n) = \prod_{i=1}^n[2^{-k}m_i,2^{-k}(m_i+1)),$$

where $k\in\mathbb Z$ and $m = (m_1,\ldots,m_n)\in\mathbb Z^n$. The family of disjoint cubes

$$\mathcal D_k := \{2^{-k}(m+[0,1)^n):\ m\in\mathbb Z^n\}$$

defines the $k$-th generation of dyadic cubes.

The dyadic cubes have the following basic properties.

(d1) The dyadic cubes in the generation $\mathcal D_k$ are disjoint and their union is $\mathbb R^n$. Thus any point $x\in\mathbb R^n$ belongs to a unique dyadic cube in the $k$-th generation.

(d2) Two (different) dyadic cubes are either disjoint or one contains the other.

(d3) A dyadic cube in $\mathcal D_k$ consists of exactly $2^n$ dyadic cubes of the generation $\mathcal D_{k+1}$. On the other hand, for any dyadic cube $Q\in\mathcal D_k$ and any $j<k$ there is a unique dyadic cube in the generation $\mathcal D_j$ that contains $Q$.
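Properties (d1) and (d3) translate directly into integer arithmetic. The following sketch is an illustration only and rests on a convention that is an assumption here: a dyadic cube of generation $k$ has side-length $2^{-k}$ and is identified with the integer vector $m$ of its lower-left corner in units of $2^{-k}$. The helper names are hypothetical.

```python
import math
from itertools import product


def dyadic_cube(x, k):
    """Index m of the unique generation-k cube 2^{-k}(m + [0,1)^n) containing x."""
    return tuple(math.floor(t * 2.0 ** k) for t in x)


def parent(m):
    """Index of the generation-(k-1) cube containing the generation-k cube m."""
    # floor division also handles negative indices correctly
    return tuple(mi // 2 for mi in m)


def children(m):
    """Indices of the 2^n generation-(k+1) cubes that tile the cube m."""
    return [tuple(2 * mi + e for mi, e in zip(m, eps))
            for eps in product((0, 1), repeat=len(m))]


if __name__ == "__main__":
    m = dyadic_cube((0.3, 0.7), 1)
    print(m, parent(m), children(m))
```

Passing from a cube to its parent is a floor division of the index by $2$, which is why every dyadic cube has exactly one ancestor in each earlier generation.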

As a first instance of how things simplify and get sharper in the dyadic world, let us see the analogue of the Vitali covering lemma in the dyadic case.

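**Lemma 14 (Dyadic Vitali-type covering lemma)** Let $\{Q_j\}_{j=1}^N$ be a finite collection of dyadic cubes. There exists a subcollection $\{Q_{j_k}\}_{k=1}^M$ of disjoint dyadic cubes such that

$$\bigcup_{j=1}^N Q_j = \bigcup_{k=1}^M Q_{j_k}.$$

*Proof:* Let $\{Q_{j_k}\}_{k=1}^M$ be the maximal cubes among $\{Q_j\}_{j=1}^N$, that is, the cubes that are not contained in any other cube of the collection $\{Q_j\}_{j=1}^N$. Then the cubes $Q_{j_k}$ are disjoint: by (d2) two of them that intersect would be nested, so one of them wouldn’t be maximal. Also any cube that is not maximal is contained in the union $\bigcup_k Q_{j_k}$.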
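The selection of maximal cubes in this proof is easy to carry out explicitly. The sketch below assumes the encoding of a dyadic cube as a pair `(k, m)`, meaning the cube of side-length $2^{-k}$ with integer corner index `m`; both the encoding and the names `contains` and `maximal_cubes` are assumptions made for illustration.

```python
def contains(big, small):
    """True if the dyadic cube `big` contains the dyadic cube `small`.

    A cube is encoded as (k, m): side-length 2^{-k}, corner 2^{-k} * m.
    """
    (kb, mb), (ks, ms) = big, small
    if kb > ks:
        # a strictly smaller cube cannot contain a larger one
        return False
    shift = ks - kb
    # shifting the index right by `shift` bits gives the ancestor's index
    return tuple(mi >> shift for mi in ms) == tuple(mb)


def maximal_cubes(cubes):
    """Cubes of the collection not contained in any other cube of it."""
    cubes = list(set(cubes))
    return [q for q in cubes
            if not any(p != q and contains(p, q) for p in cubes)]


if __name__ == "__main__":
    # one-dimensional example: [0,1), [0,1/2), [3/4,1), [1,3/2), [1,9/8)
    family = [(0, (0,)), (1, (0,)), (2, (3,)), (1, (2,)), (3, (8,))]
    print(sorted(maximal_cubes(family)))
```

No measure is lost in this selection: the maximal cubes have the same union as the whole family, in contrast with the factor $3^{-n}$ in the covering lemma for balls.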

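Given a function $f\in L^1_{\mathrm{loc}}(\mathbb R^n)$ and $k\in\mathbb Z$ we set

$$E_k(f)(x) := \sum_{Q\in\mathcal D_k}\Big(\frac{1}{|Q|}\int_Q f(y)\,dy\Big)\chi_Q(x).$$

Observe that given $x\in\mathbb R^n$ there is a unique cube $Q\in\mathcal D_k$ that contains $x$ and then the value of $E_k(f)$ at $x$ equals the average of the function $f$ over the cube $Q$. In fact, $E_k(f)$ is the *conditional expectation* of $f$ with respect to the $\sigma$-algebra generated by the family $\mathcal D_k$. Observe that for every generation $k$, if $\Omega$ is a union of cubes in $\mathcal D_k$ then

$$\int_\Omega E_k(f)(x)\,dx = \int_\Omega f(x)\,dx.$$

The operator $E_k$ is the discrete dyadic analogue of an approximation to the identity dilated at level $2^{-k}$. A difference however is that the averages here are not ‘centered’. Indeed, $E_k(f)(x)$ is the average of $f$ with respect to the cube $Q$ whenever $x\in Q$ for some $Q\in\mathcal D_k$. However, $x$ is not necessarily the ‘center’ of the cube $Q$.

The *dyadic maximal function* is defined as

$$M_\Delta(f)(x) := \sup_{k\in\mathbb Z}E_k(|f|)(x) = \sup_{Q\ni x,\ Q\ \mathrm{dyadic}}\frac{1}{|Q|}\int_Q|f(y)|\,dy.$$

Thus the supremum is taken over all dyadic cubes that contain $x$ or, equivalently, over all generations of dyadic cubes. We have the analogue of the maximal theorem: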
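Before stating the theorem, here is a small numerical illustration of the dyadic maximal function. It is a sketch under simplifying assumptions: the function is a step function on $2^K$ cells of equal length (a list of values), and only the dyadic blocks inside this finite array are considered, so the supremum over all generations becomes a maximum over finitely many block sizes. The name `dyadic_maximal` is hypothetical.

```python
def dyadic_maximal(f):
    """Dyadic maximal function of a step function on 2^K equal cells.

    For each cell, returns the largest average of |f| over the dyadic
    blocks (of length 1, 2, 4, ..., len(f)) that contain the cell.
    """
    n = len(f)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    a = [abs(v) for v in f]
    best = list(a)                      # blocks of length 1
    block = 2
    while block <= n:
        for start in range(0, n, block):
            avg = sum(a[start:start + block]) / block
            for i in range(start, start + block):
                best[i] = max(best[i], avg)
        block *= 2
    return best


if __name__ == "__main__":
    f = [0, 0, 0, 8, 0, 0, 0, 0]
    print(dyadic_maximal(f))
```

In this finite model one can check the weak type $(1,1)$ bound with constant $1$ by hand: the number of cells where the output exceeds $\lambda$ is at most the sum of $|f|$ divided by $\lambda$.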

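**Theorem 15 (Dyadic Maximal Theorem)** (i) The dyadic maximal function is of weak type $(1,1)$ with weak type norm at most $1$:

$$|\{x\in\mathbb R^n: M_\Delta(f)(x)>\lambda\}|\le\frac{1}{\lambda}\int_{\{M_\Delta(f)>\lambda\}}|f(y)|\,dy\le\frac{\|f\|_{L^1(\mathbb R^n)}}{\lambda},\quad\lambda>0,$$

for all $f\in L^1(\mathbb R^n)$. (ii) The dyadic maximal function is of strong type $(p,p)$, for all $1<p\le\infty$; for all $f\in L^p(\mathbb R^n)$ we have

$$\|M_\Delta(f)\|_{L^p(\mathbb R^n)}\lesssim_p\|f\|_{L^p(\mathbb R^n)},$$

where the implied constant depends only on $p$.

We conclude using Proposition 1 that

(iii) For every $f\in L^1_{\mathrm{loc}}(\mathbb R^n)$ we have that

$$\lim_{k\to+\infty}E_k(f)(x) = f(x)\quad\text{for almost every } x\in\mathbb R^n.$$

**Exercise 7** Give the proof of Theorem 15 above. Observe that the proof is essentially identical to that of Theorem 2 using the dyadic version of the Vitali covering Lemma instead of the non-dyadic one. For (iii) you need to observe that the statement is true for continuous functions (for example) and use Proposition 1.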

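**Exercise 8 (The maximal function with respect to cubes)** Let $M_Q$ denote the maximal function with respect to cubes, that is,

$$M_Q(f)(x) := \sup_{r>0}(|f|*(\chi_Q)_r)(x) = \sup_{r>0}\frac{1}{r^n}\int_{\mathbb R^n}|f(y)|\chi_Q\Big(\frac{x-y}{r}\Big)dy,$$

where $\chi_Q$ is the indicator function of the cube $Q = [-\frac12,\frac12]^n$. Show that

$$M_Q(f)(x)\simeq_n M(f)(x),$$

where the implied constants depend only on the dimension $n$.

**Exercise 9** Show the pointwise estimate

$$M_\Delta(f)(x)\lesssim_n M(f)(x),$$

where the implied constant depends only on the dimension $n$. On the other hand, show that the opposite estimate cannot be true. For example when $n=1$ test against the function $f = \chi_{[0,1]}$. Conclude that the dyadic maximal theorem follows from the non-dyadic one (with a different constant though). *Hint:* Observe that if $x\in Q$ and $Q$ is a dyadic cube, there exists a ball $B(x,r)$ which contains $Q$ and $|B(x,r)|\simeq_n|Q|$.

**Exercise 10** Consider the *non-centered maximal function* with respect to cubes, or balls

$$M'(f)(x) := \sup_{B\ni x}\frac{1}{|B|}\int_B|f(y)|\,dy,$$

where the supremum is taken over all Euclidean balls $B$ containing $x$. Likewise

$$M'_Q(f)(x) := \sup_{Q\ni x}\frac{1}{|Q|}\int_Q|f(y)|\,dy,$$

where the supremum is taken over all cubes $Q$ (with sides parallel to the coordinate axes) that contain $x$. Show that $M$, $M_Q$, $M'$ and $M'_Q$ are all pointwise equivalent, that is

$$M(f)(x)\simeq_n M_Q(f)(x)\simeq_n M'(f)(x)\simeq_n M'_Q(f)(x).$$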

**5. The Calderón-Zygmund decomposition**

Let $(X,\mu)$ be a measure space and $f$ be a measurable function (say) in $L^p(X,\mu)$. For a level $\lambda>0$ we have many times used the decomposition of $f$ at level $\lambda$:

$$f = f\chi_{\{|f|\le\lambda\}}+f\chi_{\{|f|>\lambda\}} =: g+b.$$

The function $g$ is the ‘good’ part of $f$; indeed we have that

$$\|g\|_{L^p(X,\mu)}\le\|f\|_{L^p(X,\mu)},\quad\|g\|_{L^\infty(X,\mu)}\le\lambda.$$

Thus the good part adopts the $L^p$-integrability of $f$ and furthermore it is bounded. On the other hand the ‘bad’ part satisfies

$$\|b\|_{L^p(X,\mu)}\le\|f\|_{L^p(X,\mu)},\quad\mu(\mathrm{supp}(b))\le\mu(\{|f|>\lambda\})\le\frac{\|f\|_{L^p(X,\mu)}^p}{\lambda^p}.$$

Thus the bad part also inherits the $L^p$-integrability of $f$ but it also has ‘small’ support.

In a general measure space one cannot do much more than that in terms of decomposing $f$ in a good part and a bad part. If however there is also a metric structure in the space which is compatible with the measure, one can do a bit better and also get some control on the local oscillation of the bad part $b$. Various forms of this decomposition are usually referred to as Calderón-Zygmund decompositions. We present here the basic example in the dyadic Euclidean setup.

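**Proposition 16 (Dyadic Calderón-Zygmund decomposition)** Let $f\in L^1(\mathbb R^n)$ and $\lambda>0$. There exists a decomposition of $f$ of the form

$$f = g+b = g+\sum_{Q\in\mathcal B}b_Q,$$

where $\mathcal B$ is a collection of disjoint dyadic cubes and the sum is taken over all the cubes $Q\in\mathcal B$. This decomposition satisfies the following properties:

(i) The ‘good part’ satisfies the bound $\|g\|_{L^\infty(\mathbb R^n)}\le2^n\lambda$.

(ii) The ‘bad part’ is $b = \sum_{Q\in\mathcal B}b_Q$; each function $b_Q$ is supported on $Q$ and

$$\int_Q b_Q(x)\,dx = 0,\quad\|b_Q\|_{L^1(\mathbb R^n)}\le2^{n+1}\lambda|Q|.$$

(iii) For each $Q\in\mathcal B$ we have

$$\lambda<\frac{1}{|Q|}\int_Q|f(y)|\,dy\le2^n\lambda.$$

Furthermore we have that

$$\bigcup_{Q\in\mathcal B}Q = \{x\in\mathbb R^n: M_\Delta(f)(x)>\lambda\}.$$

In particular, from the dyadic maximal theorem we have

$$\sum_{Q\in\mathcal B}|Q|\le\frac{\|f\|_{L^1(\mathbb R^n)}}{\lambda}.$$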

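*Proof:* The proof is very similar to the proof of the dyadic covering lemma. We fix some level $\lambda>0$ and let us call a dyadic cube $Q$ *bad* if

$$\frac{1}{|Q|}\int_Q|f(y)|\,dy>\lambda.$$

If a dyadic cube is not bad we call it *good*. A bad cube $Q$ will be called *maximal* if $Q$ is bad and also no dyadic cube strictly containing $Q$ is bad. Let us denote by $\mathcal B$ the collection of maximal bad cubes. Since the cubes in the collection $\mathcal B$ are dyadic and maximal, they are disjoint. Also, for any bad cube $Q$ and $N\ge0$, let $Q^{(N)}$ denote the $N$-th ancestor of $Q$, that is, the unique dyadic cube of side-length $2^N$ times that of $Q$ which contains $Q$. We have that

$$Q = Q^{(0)}\subset Q^{(1)}\subset Q^{(2)}\subset\cdots.$$

Also, since $f\in L^1(\mathbb R^n)$, every bad cube is contained in some maximal bad cube. Indeed, if $Q$ is a bad cube then $Q^{(N)}$ increases as $N\to\infty$ so monotone convergence implies that $\int_{Q^{(N)}}|f|\to\int_{\bigcup_N Q^{(N)}}|f|\le\|f\|_{L^1(\mathbb R^n)}$, while $|Q^{(N)}| = 2^{nN}|Q|\to\infty$. It follows that there is a large enough $N_0$ such that

$$\frac{1}{|Q^{(N)}|}\int_{Q^{(N)}}|f(y)|\,dy\le\lambda$$

for all $N>N_0$. Thus the dyadic cube $Q^{(N_1)}$, where $N_1\le N_0$ is the largest index for which $Q^{(N_1)}$ is bad, is maximal and bad.

Now let $Q$ be a maximal bad cube and consider the parent of $Q$, $Q^{(1)}$, that is the unique dyadic cube with double the side-length that contains $Q$. Since $Q$ is maximal, $Q^{(1)}$ has to be good so we have

$$\frac{1}{|Q^{(1)}|}\int_{Q^{(1)}}|f(y)|\,dy\le\lambda$$

and thus

$$\lambda<\frac{1}{|Q|}\int_Q|f(y)|\,dy\le\frac{|Q^{(1)}|}{|Q|}\frac{1}{|Q^{(1)}|}\int_{Q^{(1)}}|f(y)|\,dy\le2^n\lambda$$

for all maximal bad cubes $Q$. We set

$$b_Q(x) := \Big(f(x)-\frac{1}{|Q|}\int_Q f(y)\,dy\Big)\chi_Q(x)$$

whenever $Q$ is a maximal bad cube. We also set

$$g := f-\sum_{Q\in\mathcal B}b_Q.$$

It is not hard to verify all the required properties of the decomposition except maybe that $\|g\|_{L^\infty(\mathbb R^n)}\le2^n\lambda$. It is easy to see that

$$|g(x)| = \Big|\frac{1}{|Q|}\int_Q f(y)\,dy\Big|\le2^n\lambda$$

whenever $x\in Q$ and $Q$ is a maximal bad cube. If $x\notin\bigcup_{Q\in\mathcal B}Q$ and $x\in Q_k\in\mathcal D_k$, then necessarily $Q_k$ is good. We thus have that

$$|E_k(f)(x)|\le\frac{1}{|Q_k|}\int_{Q_k}|f(y)|\,dy\le\lambda$$

since $Q_k$ is good. Now, by the dyadic maximal theorem, we have that $E_k(f)(x)\to f(x)$ as $k\to\infty$ with $x\in Q_k\in\mathcal D_k$, for almost every $x$. Since $g(x) = f(x)$ off the maximal bad cubes we conclude that $|g(x)|\le\lambda$ for almost every such $x$ and we are done in this case as well.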
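The stopping-time construction in this proof can be tried out on a toy example. The sketch below works in a finite one-dimensional model, which is an assumption made for illustration: the function is a list of $2^K$ cell values, dyadic cubes are the dyadic blocks of the list, and the whole list is assumed to be a good block (this plays the role of the integrability of the function, which forces large cubes to be good). The name `dyadic_cz` is hypothetical.

```python
def dyadic_cz(f, lam):
    """Dyadic Calderon-Zygmund decomposition of a list of 2^K cell values.

    Returns (g, b, cubes) with f = g + b, where cubes lists the maximal
    bad blocks as (start, length) pairs.
    Assumption: the average of |f| over the whole list is at most lam.
    """
    n = len(f)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if sum(abs(v) for v in f) / n > lam:
        raise ValueError("the whole array must be a good block")

    cubes = []

    def visit(start, length):
        avg = sum(abs(v) for v in f[start:start + length]) / length
        if avg > lam:
            # every ancestor was good, so this bad block is maximal
            cubes.append((start, length))
        elif length > 1:
            half = length // 2
            visit(start, half)
            visit(start + half, half)

    visit(0, n)

    g = list(f)
    b = [0.0] * n
    for start, length in cubes:
        mean = sum(f[start:start + length]) / length
        for i in range(start, start + length):
            g[i] = mean            # good part: the average on each bad block
            b[i] = f[i] - mean     # bad part: mean zero on each bad block
    return g, b, cubes


if __name__ == "__main__":
    print(dyadic_cz([0, 0, 0, 8, 0, 0, 0, 0], 1.5))
```

Searching from the largest block downwards and stopping at the first bad block finds exactly the maximal bad blocks, since all the ancestors of a block visited by the recursion are good. In one dimension the good part is then bounded by $2\lambda$.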

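Observe that in the previous decomposition of $f$, the ‘bad set’, that is the set where $b$ lives, is given in the form

$$\bigcup_{Q\in\mathcal B}Q = \{x\in\mathbb R^n: M_\Delta(f)(x)>\lambda\}.$$

One could prove the Calderón-Zygmund decomposition starting from the set $\{M_\Delta(f)>\lambda\}$ and decomposing it as a union of disjoint dyadic cubes. This sort of decomposition is interesting in its own right. Let us see how this can be done.

**Proposition 17 (Dyadic Whitney decomposition)** Let $\Omega\subset\mathbb R^n$ be an open set which is not all of $\mathbb R^n$. Then there exists a decomposition

$$\Omega = \bigcup_{Q\in\mathcal W}Q,$$

where $\mathcal W$ is a collection of disjoint dyadic cubes. For each $Q\in\mathcal W$ we have

$$\mathrm{diam}(Q)\le\mathrm{dist}(Q,\Omega^c)\le4\,\mathrm{diam}(Q).$$

*Proof:* Let $\mathcal W$ denote the dyadic cubes $Q$ inside $\Omega$ such that

$$\mathrm{diam}(Q)\le\mathrm{dist}(Q,\Omega^c)\le4\,\mathrm{diam}(Q).$$

Obviously $\bigcup_{Q\in\mathcal W}Q\subset\Omega$ but the opposite inclusion is also true. Indeed, if $x\in\Omega$ note that $x$ is contained in some dyadic cube $Q\subset\Omega$, of arbitrarily small diameter, since $\Omega$ is open. Now for a dyadic cube $Q$ let $Q^{(1)}$ be its ‘parent’, that is the unique dyadic cube of side twice the side-length of $Q$, containing $Q$. Considering successive parents of $Q$ there will be a dyadic cube $Q'$ containing $x$ with diameter at least $\frac14d$ and less than $\frac12d$, where $d := \mathrm{dist}(x,\Omega^c)$. Thus $\mathrm{diam}(Q')\le d-\mathrm{diam}(Q')\le\mathrm{dist}(Q',\Omega^c)\le d\le4\,\mathrm{diam}(Q')$ and $x\in Q'\in\mathcal W$. The collection of dyadic cubes $\mathcal W$ is not necessarily disjoint so we only choose the cubes in $\mathcal W$ which are maximal with respect to set inclusion and call this collection again $\mathcal W$. Now maximal and dyadic means disjoint so we are done.

Using the Whitney decomposition lemma one can give an alternative proof of the Calderón-Zygmund decomposition by taking

$$\Omega := \{x\in\mathbb R^n: M(f)(x)>\lambda\}$$

and noting that the latter set is open.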

As a corollary we get a control of the level sets of the usual (non-dyadic) maximal function by the level sets of the dyadic maximal function.

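**Corollary 18** For all $f\in L^1(\mathbb R^n)$ and $\lambda>0$ we have that

$$|\{x\in\mathbb R^n: M_Q(f)(x)>4^n\lambda\}|\le2^n|\{x\in\mathbb R^n: M_\Delta(f)(x)>\lambda\}|.$$

*Proof:* Let $\{Q_j\}_j$ be the collection of dyadic cubes obtained by the Calderón-Zygmund decomposition at level $\lambda$. We have that

$$\{x\in\mathbb R^n: M_\Delta(f)(x)>\lambda\} = \bigcup_j Q_j.$$

We write $2Q_j$ for the cube with the same center as $Q_j$ and twice its side-length. We claim that

$$\{x\in\mathbb R^n: M_Q(f)(x)>4^n\lambda\}\subset\bigcup_j2Q_j.\qquad(2)$$

Indeed, let $x\notin\bigcup_j2Q_j$ and $Q$ be any cube centered at $x$. Denoting by $\ell(Q)$ the side-length of $Q$, we choose $k\in\mathbb Z$ so that $2^{-k-1}\le\ell(Q)<2^{-k}$. Then $Q$ intersects $m\le2^n$ cubes in the $k$-th generation $\mathcal D_k$, and let us call them $R_1,\ldots,R_m$. Observe that none of these cubes can be contained in any of the $Q_j$ because otherwise we would have that $x\in\bigcup_j2Q_j$. Thus the average of $|f|$ on each $R_i$ is at most $\lambda$ so

$$\frac{1}{|Q|}\int_Q|f(y)|\,dy\le\sum_{i=1}^m\frac{|R_i|}{|Q|}\frac{1}{|R_i|}\int_{R_i}|f(y)|\,dy\le2^nm\lambda\le4^n\lambda.$$

This proves the claim (2) and thus the corollary, since $|\bigcup_j2Q_j|\le2^n\sum_j|Q_j|$.

**Exercise 11** Using the dyadic maximal theorem only, conclude that the operators $M$, $M_Q$, $M'$ and $M'_Q$ are of weak type $(1,1)$.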

**5.1. The Fefferman-Stein weighted inequality.**

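We give a first application of the Calderón-Zygmund decomposition which in some sense is the prototype of a weighted norm inequality. It is a variation of the maximal theorem where the Lebesgue measure is replaced by a measure of the form $w(x)\,dx$ for some non-negative measurable function $w$. It then turns out that the maximal function maps $L^p(\mathbb R^n,M(w)\,dx)$ to $L^p(\mathbb R^n,w\,dx)$ boundedly for all $1<p\le\infty$ and that it also satisfies a weak endpoint analogue for $p=1$. In particular we have

**Theorem 19 (Fefferman-Stein inequality)** Let $w$ be a non-negative locally integrable function (a ‘weight’).

(i) We have that

$$\int_{\mathbb R^n}M(f)(x)^p\,w(x)\,dx\lesssim_{n,p}\int_{\mathbb R^n}|f(x)|^p\,M(w)(x)\,dx$$

for all $f\in L^p(\mathbb R^n,M(w)\,dx)$ with $1<p<\infty$.

(ii) In the endpoint case $p=1$ we get the weak analogue

$$\int_{\{x:\,M(f)(x)>\lambda\}}w(x)\,dx\lesssim_n\frac{1}{\lambda}\int_{\mathbb R^n}|f(x)|\,M(w)(x)\,dx$$

for all $\lambda>0$.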

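*Proof:* We will show that $M: L^\infty(\mathbb R^n,M(w)\,dx)\to L^\infty(\mathbb R^n,w\,dx)$ and that the weak inequality in (ii) holds. Then the Marcinkiewicz interpolation theorem will give (i) as well.

The bound

$$\|M(f)\|_{L^\infty(w\,dx)}\le\|f\|_{L^\infty(M(w)\,dx)}$$

is trivial and is left as an exercise. We turn our attention to the weak $L^1$-bound. Let $\{Q_j\}_j$ be the collection of the dyadic cubes obtained from the Calderón-Zygmund decomposition of $f$ at level $\lambda$. By the proof of Corollary 18 we have that

$$\{x\in\mathbb R^n: M_Q(f)(x)>4^n\lambda\}\subset\bigcup_j2Q_j,$$

where $2Q_j$ is the cube with the same center as $Q_j$ and twice its side-length. We have

$$\int_{\{M_Q(f)>4^n\lambda\}}w(x)\,dx\le\sum_j\int_{2Q_j}w(x)\,dx.$$

Again, from the Calderón-Zygmund decomposition (at level $\lambda$) we have that

$$|Q_j|<\frac{1}{\lambda}\int_{Q_j}|f(y)|\,dy$$

for all $Q_j$ of the decomposition. Combining the last two estimates we can write

$$\int_{\{M_Q(f)>4^n\lambda\}}w(x)\,dx\le\sum_j\frac{2^n|Q_j|}{|2Q_j|}\int_{2Q_j}w(x)\,dx\le\frac{2^n}{\lambda}\int_{\mathbb R^n}|f(y)|\sum_j\chi_{Q_j}(y)\Big(\frac{1}{|2Q_j|}\int_{2Q_j}w(x)\,dx\Big)dy.$$

For fixed $y$ the term $\chi_{Q_j}(y)$ is non-zero if and only if $y\in Q_j\subset2Q_j$. Thus the previous estimate implies

$$\int_{\{M_Q(f)>4^n\lambda\}}w(x)\,dx\le\frac{2^n}{\lambda}\int_{\mathbb R^n}|f(y)|\,M'_Q(w)(y)\,dy,$$

where $M'_Q$ is the non-centered maximal function associated to cubes. See Exercise 10. Since $M'_Q(w)\simeq_n M(w)$ and $M(f)\simeq_n M_Q(f)$ pointwise, this concludes the proof.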

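**Exercise 12 (Hedberg’s inequality and Hardy-Littlewood-Sobolev theorem)** Let $0<\alpha<n$, $1\le p<\frac n\alpha$ and $\frac1q = \frac1p-\frac\alpha n$.

(i) Show Hedberg’s inequality: If $f\in L^p(\mathbb R^n)$ then

$$\Big|\int_{\mathbb R^n}\frac{f(y)}{|x-y|^{n-\alpha}}\,dy\Big|\lesssim_{n,\alpha,p}\|f\|_{L^p(\mathbb R^n)}^{\frac{\alpha p}{n}}\,M(f)(x)^{1-\frac{\alpha p}{n}}.$$

(ii) Use the Hardy-Littlewood maximal theorem and (i) to conclude the Hardy-Littlewood-Sobolev theorem: For every $f\in L^p(\mathbb R^n)$, $1<p<\frac n\alpha$, we have that

$$\Big\|\int_{\mathbb R^n}\frac{f(y)}{|\cdot-y|^{n-\alpha}}\,dy\Big\|_{L^q(\mathbb R^n)}\lesssim_{n,\alpha,p}\|f\|_{L^p(\mathbb R^n)}.$$

*Hint:* In order to show (i) split the integral

$$\int_{\mathbb R^n}\frac{f(y)}{|x-y|^{n-\alpha}}\,dy = \int_{|x-y|<R}\frac{f(y)}{|x-y|^{n-\alpha}}\,dy+\int_{|x-y|\ge R}\frac{f(y)}{|x-y|^{n-\alpha}}\,dy =: I_1+I_2,$$

where $R>0$ is a parameter to be chosen later on. For $I_1$ observe that

$$I_1 = R^\alpha(f*\phi_R)(x),\quad\phi(x) := |x|^{\alpha-n}\chi_{\{|x|<1\}}(x).$$

Observe that $\phi$ is decreasing, radial, non-negative and integrable (since $\alpha>0$). Use Proposition 6 and the calculation in its proof to show the bound

$$|I_1|\lesssim_{n,\alpha}R^\alpha M(f)(x).$$

For $I_2$ use Hölder’s inequality to show

$$|I_2|\lesssim_{n,\alpha,p}R^{\alpha-\frac np}\|f\|_{L^p(\mathbb R^n)}.$$

Choose the parameter $R$ to minimize the sum $R^\alpha M(f)(x)+R^{\alpha-\frac np}\|f\|_{L^p(\mathbb R^n)}$. Part (ii) is a trivial consequence of (i).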

*[Update 4 Apr 2011: Section 3.1 concerning the Marcinkiewicz integral added; numbering changed.*

*Update 9th May 2011: Typo in the hint of Exercise 1 corrected.]*

Hi. I find that your notes are very useful. I only have a question:

How can one prove that “any function $\phi$ which is positive and radially decreasing can be approximated monotonically from below by a sequence of simple functions of the form $\sum a_j \chi_{B_j}$”?

Thank you very much.

It suffices to do it on the real line: look at the superlevel sets of the function.