## DMat0101, Notes 7: General Calderón-Zygmund Operators

After having studied the Hilbert transform in detail we now move to the study of general Calderón-Zygmund operators, that is operators given formally as

$\displaystyle T(f)(x)=\int K(x,y)f(y)dy,$

for an appropriate kernel ${K}$. Let us quickly review what we used in order to show that the Hilbert transform ${H}$ is of weak type ${(1,1)}$ and strong type ${(2,2)}$. First of all we essentially used the fact that the linear operator ${H}$ is defined on ${L^2}$ and bounded, that is, that it is of strong type ${(2,2)}$. This information was used in two different ways. First of all, the fact that ${H}$ is defined on ${L^2}$ means that it is defined on a dense subspace of ${L^p}$ for every ${1\leq p <+\infty}$. Furthermore, the boundedness of the Hilbert transform on ${L^2}$ allowed us to treat the set ${\{|H(g)|>\lambda\}}$ where ${g}$ is the ‘good part’ in the Calderón-Zygmund decomposition of a function ${f}$. Secondly, we used the fact that there is a specific representation of the operator ${H}$ of the form

$\displaystyle H(f)(x)=\int K(x,y)f(y)dy,$

whenever ${f\in L^2}$ and has compact support and ${x\notin {\mathrm{supp}}(f)}$. For the Hilbert transform we had that the kernel ${K}$ is given as

$\displaystyle K(x,y)=\frac{1}{x-y}.$

We used the previous representation and the formula for ${K}$ to prove a sort of restricted ${L^1}$ boundedness of ${H}$ on functions which are localized and have mean zero, which is the content of Lemma 7 of Notes 6. This, in turn, allowed us to treat the ‘bad part’ of the Calderón-Zygmund decomposition of ${f}$. From the proof of that lemma it is clear that what we really need for ${K}$ is a Hölder-type condition. Note as well that for the Hilbert transform we first proved the ${L^p}$ bounds for ${1<p<2}$, and then the corresponding boundedness for ${2<p<\infty}$ followed from the fact that ${H}$ is essentially self-adjoint.

1. Singular kernels and Calderón-Zygmund operators

We will now define the class of Calderón-Zygmund operators in such a way that we will be able to repeat the scheme used for the Hilbert transform. We begin by defining an appropriate class of kernels ${K}$, namely the singular (or standard) kernels.

Definition 1 (Singular or Standard kernels) A singular (or standard) kernel is a function ${K:\mathbb R^n \times \mathbb R^n \rightarrow {\mathbb C}}$, defined away from the diagonal ${x=y}$, which satisfies the decay estimate

$\displaystyle |K(x,y)|\lesssim_n |x-y|^{-n}, \ \ \ \ \ (1)$

for ${x\neq y}$ and the Hölder-type regularity estimates

$\displaystyle |K(x,y_1)-K(x,y)|\lesssim_{n,\sigma} \frac{|y-y_1|^\sigma}{|x-y|^{n+\sigma}}\quad\mbox{if}\quad |y-y_1|<\frac{1}{2}|x-y|, \ \ \ \ \ (2)$

and

$\displaystyle |K(x_1,y)-K(x,y)|\lesssim_{n,\sigma} \frac{|x-x_1|^\sigma}{|x-y|^{n+\sigma}}\quad\mbox{if}\quad |x-x_1|<\frac{1}{2}|x-y|, \ \ \ \ \ (3)$

for some Hölder exponent ${0<\sigma\leq 1}$.

Example 1 Let ${K:{\mathbb R}\times {\mathbb R}\rightarrow {\mathbb R}}$ be given as ${K(x,y)=(x-y)^{-1}}$ for ${x,y\in{\mathbb R}}$ with ${x\neq y}$. Then ${K}$ is a singular kernel. Observe that ${K}$ is the singular kernel associated with the Hilbert transform.
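As a quick sanity check of Example 1, the following short computation verifies the regularity estimate (2) for the kernel of the Hilbert transform. It is only a sketch, with ${n=1}$ and Hölder exponent ${\sigma=1}$, and the explicit constant ${2}$ is not claimed to be optimal. Estimate (3) follows in the same way by symmetry, and (1) holds with equality.

```latex
\[
\begin{array}{rcl}
|K(x,y_1)-K(x,y)|
 &=& \displaystyle \left|\frac{1}{x-y_1}-\frac{1}{x-y}\right|
  = \frac{|y-y_1|}{|x-y_1|\,|x-y|} \\[2ex]
 &\leq& \displaystyle \frac{2\,|y-y_1|}{|x-y|^{2}}
  \qquad\mbox{if}\quad |y-y_1|<\frac{1}{2}|x-y| ,
\end{array}
\]
% The last step uses |x-y_1| \geq |x-y|-|y-y_1| > \frac{1}{2}|x-y|.
```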

Example 2 Let ${K:{\mathbb R}^n\times {\mathbb R}^n \rightarrow {\mathbb C}}$ be given as

$\displaystyle K(x,y)=\Omega(\frac{x-y}{|x-y|})|x-y|^{-n},$

where ${\Omega:S^{n-1}\rightarrow {\mathbb C}}$ is a Hölder-continuous function:

$\displaystyle |\Omega(x')-\Omega(y')|\lesssim_{n,\sigma} |x'-y'|^\sigma,$

for some ${0<\sigma\leq 1}$. Then ${K}$ is a singular kernel.

Exercise 1 Prove that the kernel ${K}$ of Example 2 is a singular kernel.

Example 3 Let ${K:{\mathbb R}^n\times {\mathbb R}^n \rightarrow {\mathbb C}}$ satisfy the size estimate

$\displaystyle |K(x,y)|\lesssim_n |x-y|^{-n},$

and the regularity estimates

$\displaystyle |\nabla_x K(x,y)|\lesssim_n |x-y|^{-{(n+1)}},\quad |\nabla_y K(x,y)|\lesssim_n |x-y|^{-{(n+1)}},$

away from the diagonal ${x=y}$. Then ${K}$ is a singular kernel. In particular, the kernel ${K:{\mathbb R}^n\times {\mathbb R}^n \rightarrow {\mathbb C}}$ given as

$\displaystyle K(x,y)=|x-y|^{-n},$

is a singular kernel since the gradient of ${K}$ is of the order ${|x-y|^{-{(n+1)}}}$. Thus the estimates (2) and (3) are consistent with (1) but of course do not follow from it.

Remark 1 The constant ${\frac{1}{2}}$ appearing in (2), (3) is inessential. The conditions are equivalent to the corresponding conditions where ${\frac{1}{2}}$ is replaced by any constant strictly between zero and one.

We are now ready to define Calderón-Zygmund operators.

Definition 2 (Calderón-Zygmund operators) A Calderón-Zygmund operator (in short CZO) is a linear operator ${T:L^2({\mathbb R}^n)\rightarrow L^2({\mathbb R}^n)}$ which is bounded on ${L^2({\mathbb R}^n)}$:

$\displaystyle \|T(f)\|_{L^2({\mathbb R}^n)}\lesssim_{T,n} \|f \|_{L^2({\mathbb R}^n)} \quad \mbox{for all}\quad f\in L^2({\mathbb R}^n),$

and such that there exists a singular kernel ${K}$ for which we have

$\displaystyle T(f)(x)=\int_{{\mathbb R}^n} K(x,y)f(y) dy,$

for all ${f\in L^2({\mathbb R}^n)}$ with compact support and ${x\notin{\mathrm{supp}}(f)}$.

Remark 2 Note that the integral ${\int K(x,y)f(y)dy}$ converges absolutely whenever ${f\in L^2({\mathbb R}^n)}$ has compact support and ${x}$ lies outside the support of ${f}$. Indeed, by the Cauchy-Schwarz inequality and (1),

$\displaystyle \begin{array}{rcl} \int_{{\mathbb R}^n} |K(x,y)||f(y)|dy&\leq&\bigg( \int_{y\in {\mathrm{supp}}(f)}|K(x,y)|^2dy\bigg)^\frac{1}{2}\|f\|_{L^2({\mathbb R}^n)} \\ \\ &\lesssim_n& \bigg( \int_{|x-y|\geq \delta }\frac{1}{|x-y|^{2n}}dy \bigg)^\frac{1}{2} \|f\|_{L^2({\mathbb R}^n)}, \end{array}$

where ${\delta>0}$ is the distance of ${x}$ from ${{\mathrm{supp}}(f)}$, which is positive since ${{\mathrm{supp}}(f)}$ is compact and does not contain ${x}$. Observe that the integral in the last estimate converges.

Remark 3 For any singular kernel ${K}$ one can define ${T_K}$ by means of

$\displaystyle T_K(f)(x)=\int_{{\mathbb R}^n} K(x,y) f(y)dy,$

for ${f\in L^2({\mathbb R}^n)}$ with compact support and ${x\notin {\mathrm{supp}}(f)}$. However, ${T_K}$ need not be a CZO since it might fail to be bounded on ${L^2({\mathbb R}^n)}$.

Remark 4 It is not hard to see that ${T}$ uniquely determines the kernel ${K}$. That is, if

$\displaystyle T(f)(x)=\int_{{\mathbb R}^n} K(x,y) f(y)dy=\int_{{\mathbb R}^n} K_1(x,y) f(y)dy,$

for all ${f\in L^2({\mathbb R}^n)}$ with compact support and all ${x\notin{\mathrm{supp}}(f)}$, then ${K=K_1}$ almost everywhere (why?). The converse is not true. Indeed, for any bounded function ${b\in L^\infty({\mathbb R}^n)}$ the operator defined as ${T(f)(x)=b(x)f(x)}$ is a Calderón-Zygmund operator with kernel zero. A more specific example is the identity operator, which falls in the previous class and is thus a CZO with kernel ${0}$. However, this is the only ambiguity; see Exercise 2.

Exercise 2 Let ${T_1,T_2}$ be two CZOs with the same singular kernel ${K}$. Show that there exists a bounded function ${b\in L^\infty({\mathbb R}^n)}$ such that

$\displaystyle T_1(f)=T_2(f)+bf,$

for all ${f\in L^2({\mathbb R}^n)}$.

If ${T}$ is a CZO, the definition already contains the fact that ${T}$ is defined and bounded on ${L^2({\mathbb R}^n)}$, so we don’t need to worry about that. The next step is to establish the restricted ${L^1}$ boundedness for ${L^1}$ functions with mean zero. The following lemma is the analogue of Lemma 7 of Notes 6.

Lemma 3 Let ${B=B(z,R)}$ be a Euclidean ball in ${{\mathbb R}^n}$ and denote by ${B^*}$ the ball with the same center and twice the radius, that is ${B^*=B(z,2R)}$. Let ${f\in L^1(B)}$ have mean zero, that is ${\int_B f =0}$. Then we have that

$\displaystyle |T(f)(x)|\lesssim_{n,\sigma} \frac{R^\sigma}{|x-z|^{n+\sigma}}\int_{B} |f(y)|dy,$

for all ${x\notin B^*}$. We conclude that

$\displaystyle \|T(f)\|_{L^1 ( {\mathbb R}^n\setminus B^*)}\lesssim_{n,\sigma} \|f\|_{L^1(B)}.$

Proof: Since ${f}$ is supported in ${B}$ and has zero mean, for ${x\notin B^*}$ we have ${T(f)(x)=\int_B [K(x,y)-K(x,z)]f(y)dy}$. For ${y\in B}$ we have ${|y-z|<R\leq \frac{1}{2}|x-z|}$, so (2) applies with ${z}$ in the role of ${y}$ and ${y}$ in the role of ${y_1}$, and we can estimate

$\displaystyle \begin{array}{rcl} |T(f)(x)|&\leq& \int_B |K(x,y)-K(x,z)||f(y)|dy\lesssim_{n,\sigma}\int_B \frac{|y-z|^\sigma}{|x-z|^{n+\sigma}}|f(y)|dy \\ \\ &\leq & \frac{R^\sigma}{|x-z|^{n+\sigma}}\int_B |f(y)|dy. \end{array}$

Integrating over ${{\mathbb R}^n\setminus B^*}$ we also get the second estimate in the lemma. $\Box$
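For completeness, here is a sketch of the integration that gives the second estimate of Lemma 3, carried out in polar coordinates centered at ${z}$. The symbol ${\omega_{n-1}}$, denoting the surface measure of the unit sphere ${S^{n-1}}$, is notation introduced only for this computation.

```latex
\[
\begin{array}{rcl}
\displaystyle \int_{|x-z|\geq 2R}\frac{R^\sigma}{|x-z|^{n+\sigma}}\,dx
 &=& \displaystyle \omega_{n-1}R^\sigma\int_{2R}^{\infty} r^{-(n+\sigma)}\,r^{n-1}\,dr \\[2ex]
 &=& \displaystyle \omega_{n-1}R^\sigma\,\frac{(2R)^{-\sigma}}{\sigma}
  = \frac{\omega_{n-1}}{\sigma\,2^{\sigma}} .
\end{array}
\]
```

The result does not depend on ${R}$ or ${z}$, which is why the implicit constant in the ${L^1}$ estimate depends only on ${n}$ and ${\sigma}$.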

The only thing missing in order to conclude the proof of the ${L^p}$ bounds for CZOs is the fact that the class of CZOs is closed under taking adjoints. In particular, we need the following.

Lemma 4 Let ${T}$ be a CZO. Consider the adjoint ${T^*}$ defined by means of

$\displaystyle \int T(f)\bar{g}=\int f \overline{T^*(g)}, \ \ \ \ \ (4)$

for all ${f,g}$ in ${L^2}$. Then ${T^*}$ is a CZO.

Proof: It is immediate from (4) and the fact that ${T}$ is bounded on ${L^2}$ that ${T^*}$ is also bounded on ${L^2}$ with the same norm. Now let ${f,g\in L^2({\mathbb R}^n)}$ have disjoint compact supports. We have

$\displaystyle \int T(f) \bar g = \int \int K(x,y) f(y) dy \ \bar g(x) dx=\int f(y) \overline{\int \overline{K(x,y)} g(x)dx}\ dy. \ \ \ \ \ (5)$

Now let ${z\notin {\mathrm{supp}}(g)}$ and ${\phi \in C_c ^\infty({\mathbb R}^n)}$ have support inside ${B(0,1)}$ with ${\int \phi =1}$, and set ${\phi_\epsilon(y):=\epsilon^{-n}\phi(y/\epsilon)}$ for ${\epsilon>0}$. The functions ${y\mapsto\phi_\epsilon(z-y)}$ are supported in ${B(z,\epsilon)}$ so, for ${\epsilon}$ small enough, their support is disjoint from the support of ${g}$. By (5) we conclude that

$\displaystyle \int \phi_\epsilon (z-y) \overline {T^*(g)(y)} dy= \int \phi_\epsilon(z-y)\overline{\int \overline{K(x,y)} g(x)dx}\ dy.$

Letting ${\epsilon \rightarrow 0}$ we get

$\displaystyle T^*(g)(z)=\int \overline{K(x,z)}g(x)dx,$

for almost every ${z\notin {\mathrm{supp}}(g)}$. Since the conditions defining singular kernels are symmetric in the variables ${x,y}$, the kernel ${S(x,y):=\overline{K(y,x)}}$ is again a singular kernel so we are done. $\Box$

The discussion above leads to the main theorem for CZOs:

Theorem 5 Let ${T}$ be a Calderón-Zygmund operator. Then ${T}$ extends to a linear operator which is of weak type ${(1,1)}$ and of strong type ${(p,p)}$ for all ${1<p<\infty}$, where the corresponding norms depend only on ${n}$, ${\sigma}$ and ${p}$.

2. Pointwise convergence and maximal truncations

Let ${T}$ be a CZO. The example of the Hilbert transform suggests that we should have the almost everywhere convergence

$\displaystyle T(f)(x)=\lim_{\epsilon \rightarrow 0 }\int_{|x-y|>\epsilon} K(x,y)f(y)dy,$

at least for nice functions ${f\in {\mathcal S(\mathbb R^n)}}$. The truncated operators

$\displaystyle T_\epsilon(f)(x):=\int_{|x-y|>\epsilon} K(x,y)f(y)dy,$

certainly make sense for ${f\in L^2({\mathbb R}^n)}$ because of (1). However, the limit ${\lim_{\epsilon \rightarrow 0 } T_\epsilon(f)(x)}$ need not exist in general, or it may exist and be different from ${T(f)(x)}$. Here we can use the trivial example of the operator ${T(f)(x)=b(x)f(x)}$. As we have already observed, this is a CZO with kernel ${0}$. Thus ${T_\epsilon(f)(x)=0}$ for all ${\epsilon>0}$ but clearly ${T(f)\neq 0}$ in general.

The following lemma clarifies the situation as far as the existence of the limit is concerned:

Lemma 6 The limit

$\displaystyle \lim_{\epsilon \rightarrow 0} T_\epsilon(f)(x),$

exists almost everywhere for all ${f\in {\mathcal S(\mathbb R^n)}}$ if and only if the limit

$\displaystyle \lim_{\epsilon \rightarrow 0} \int_{\epsilon<|x-y|<1} K(x,y)dy ,$

exists almost everywhere.

Proof: First suppose that the limit ${\lim_{\epsilon \rightarrow 0} T_\epsilon (f)(x)}$ exists almost everywhere for all ${f\in{\mathcal S(\mathbb R^n)}}$. Fix ${R>0}$ and let ${\phi \in {\mathcal S(\mathbb R^n)}}$ with ${\phi \equiv 1}$ on ${B(0,R+1)}$. Then for ${|x|<R}$ we have ${\phi\equiv 1}$ on ${B(x,1)}$ and thus

$\displaystyle \lim_{\epsilon \rightarrow 0} T_\epsilon(\phi)(x)=\lim_{\epsilon\rightarrow 0}\int_{\epsilon<|x-y|<1}K(x,y) dy+\int_{|x-y|>1}K(x,y)\phi(y)dy .$

Observe that by (1) the second integral on the right hand side converges absolutely. Since the limit on the left hand side exists for almost every ${x}$, we conclude that the limit on the right hand side exists for almost every ${x}$ with ${|x|<R}$ and, since ${R>0}$ is arbitrary, almost everywhere. Conversely, suppose that the limit

$\displaystyle \lim_{\epsilon\rightarrow 0}\int_{\epsilon<|x-y|<1}K(x,y) dy=L$

exists and let ${f\in{\mathcal S(\mathbb R^n)}}$. We have that

$\displaystyle \begin{array}{rcl} T_\epsilon(f)(x)&=&\int_{\epsilon<|x-y|<1}K(x,y)f(y)dy+\int_{|x-y|>1}K(x,y)f(y)dy \\ \\ &=& \int_{\epsilon<|x-y|<1}K(x,y)[f(y)-f(x)]dy+f(x)\int_{\epsilon<|x-y|<1}K(x,y)dy \\ \\ && +\int_{|x-y|>1}K(x,y)f(y)dy=:I_1(\epsilon)+I_2(\epsilon)+I_3. \end{array}$

By the same considerations as before, the integral ${I_3}$ converges absolutely and does not depend on ${\epsilon}$. By the hypothesis we also have that ${\lim_{\epsilon\rightarrow 0} I_2(\epsilon)=Lf(x)}$. Finally for ${I_1(\epsilon)}$ observe that we have

$\displaystyle \begin{array}{rcl} \int_{0<|x-y|<1}|K(x,y)||x-y|dy\lesssim_n \int_{|x-y|<1}|x-y|^{-(n-1)}dy\lesssim_n 1, \end{array}$

by (1). Since

$\displaystyle |K(x,y)[f(x)-f(y)]|\lesssim \|\nabla f\|_{L^\infty({\mathbb R}^n)} |K(x,y)||x-y|,$

dominated convergence implies that ${\lim_{\epsilon \rightarrow 0 }I_1(\epsilon)}$ exists as well. $\Box$

Thus, for specific kernels ${K}$ one has an easy criterion to establish whether the limit ${\lim_{\epsilon \rightarrow 0}T_\epsilon (f)}$ exists a.e. for ‘nice’ functions ${f}$. For example, for the kernel ${K(x,y)=(x-y) ^{-1}}$ of the Hilbert transform, the existence of the limit

$\displaystyle \lim_{\epsilon\rightarrow 0}\int_{\epsilon<|x-y|<1}\frac{1}{x-y}dy=0$

is obvious. In order to extend the almost everywhere convergence to the class ${L^p({\mathbb R}^n)}$ we need to consider the corresponding maximal function.
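To see the criterion of Lemma 6 at work, one can compare two kernels that appeared earlier: the Hilbert kernel of Example 1 and the kernel ${|x-y|^{-n}}$ of Example 3. The computation below is only an illustrative sketch; ${\omega_{n-1}}$ denotes the surface measure of the unit sphere, a notation assumed here.

```latex
\[
\int_{\epsilon<|x-y|<1}\frac{dy}{x-y}
 =\int_{\epsilon<|t|<1}\frac{dt}{t}=0
 \qquad\mbox{(substituting } t=x-y\mbox{; the integrand is odd)},
\]
\[
\int_{\epsilon<|x-y|<1}\frac{dy}{|x-y|^{n}}
 =\omega_{n-1}\int_{\epsilon}^{1}\frac{dr}{r}
 =\omega_{n-1}\log\frac{1}{\epsilon}\longrightarrow+\infty
 \qquad\mbox{as } \epsilon\rightarrow 0 .
\]
```

So the criterion is satisfied by the first kernel, while for the second kernel the limit appearing in Lemma 6 does not exist at any point ${x}$.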

Definition 7 Let ${T}$ be a CZO and define the truncations of ${T}$ as before

$\displaystyle T_\epsilon(f)(x):=\int_{|x-y|>\epsilon} K(x,y)f(y)dy,\quad x\in {\mathbb R}^n,\quad f\in{\mathcal S(\mathbb R^n)}.$

The maximal truncation of ${T}$ is the sublinear operator defined as

$\displaystyle T_*(f)(x)=\sup_{\epsilon>0} |T_{\epsilon}(f)(x)|,\quad x\in {\mathbb R}^n.$

The maximal truncation of a CZO has the same continuity properties as ${T}$ itself.

Theorem 8 Let ${T}$ be a CZO and ${T_*}$ denote its maximal truncation. Then ${T_*}$ is of weak type ${(1,1)}$ and strong type ${(p,p)}$ for ${1<p<\infty}$.

The proof of Theorem 8 depends on the following two results.

Lemma 9 Let ${S}$ be an operator of weak type ${(1,1)}$ and ${\nu\in(0,1)}$. Then for every set ${E\subset {\mathbb R}^n}$ with ${0<|E|<+\infty}$ we have that

$\displaystyle \int_E |S(f)(x)|^\nu dx\lesssim_{\nu,S} |E|^{1-\nu}\|f\|_1 ^\nu.$

The proof of this lemma is a simple application of the representation of the ${L^\nu}$ norm in terms of level sets and is left as an exercise.

Exercise 3 Prove Lemma 9 above.

The second result we need is the following lemma that gives a pointwise control of the maximal truncations of the CZO ${T}$ by an expression that involves the maximal function of ${f}$ and the maximal function of ${T(f)}$.

Lemma 10 Let ${T}$ be a CZO and ${0< \nu \leq 1}$. Then for all ${f\in C_c ^\infty({\mathbb R}^n)}$ we have that

$\displaystyle T_*(f)(x)\lesssim_{\nu,n,\sigma} [M(|Tf|^\nu)(x)]^\frac{1}{\nu}+M(f)(x).$

Proof: Let us fix a function ${f\in C_c^\infty({\mathbb R}^n)}$, a point ${x\in{\mathbb R}^n}$ and ${\epsilon>0}$, and consider the ball ${B=B(x,\epsilon/2)}$ and its double ${B^*=B(x,\epsilon)}$. We decompose ${f}$ in the form

$\displaystyle f=f\chi_{B^*}+f(1-\chi_{B^*})=:f_1+f_2.$

Since ${{\mathrm{supp}}(f_2)\cap B=\emptyset}$ and obviously ${f_2\in L^2({\mathbb R}^n)}$ has compact support we can write

$\displaystyle T(f_2)(x)=\int_{{\mathbb R}^n} K(x,y)f_2(y)dy=\int_{|x-y|>\epsilon}K(x,y)f(y)dy= T_\epsilon(f)(x). \ \ \ \ \ (6)$

Also, no ${w\in B}$ is contained in the support of ${f_2}$, thus

$\displaystyle \begin{array}{rcl} |T(f_2)(w)-T(f_2)(x)|&=&\bigg|\int_{|x-y|>\epsilon} [K(x,y)-K(w,y)]f_2(y)dy \bigg| \\ \\ & \lesssim_{n,\sigma}& \int_{|x-y|>\epsilon}\frac{|x-w|^\sigma}{|x-y|^{n+\sigma}}|f(y)|dy, \end{array}$

by (3), since ${|x-w|<\frac{\epsilon}{2}<\frac{1}{2} |x-y|}$ for ${y}$ in the area of integration above. By this estimate we get that

$\displaystyle \begin{array}{rcl} |T(f_2)(w)-T(f_2)(x)|&\lesssim_\sigma &\epsilon^\sigma \sum_{k=0} ^\infty \int_{2^k\epsilon<|x-y|<2^{k+1}\epsilon} \frac{|f(y)|}{(2^k\epsilon)^{n+\sigma}}dy\\ \\ &\lesssim_\sigma & \sum_{k=0} ^\infty \frac{1}{\epsilon^n} \frac{1}{2^{k(n+\sigma)}}\int_{|x-y|< 2^{k+1}\epsilon}|f(y)|dy \\ \\ & \lesssim_{\sigma,n} &\sum_{k=0} ^\infty \frac{1}{2^{k\sigma} } M(f)(x)\simeq_{n,\sigma} M(f)(x). \end{array}$
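The last step above uses only the volume of a ball and a geometric series. Spelled out as a sketch (here ${v_n}$ denotes the volume of the unit ball of ${{\mathbb R}^n}$, a notation introduced only for this computation):

```latex
\[
\begin{array}{rcl}
\displaystyle \frac{1}{\epsilon^n 2^{k(n+\sigma)}}\int_{|x-y|<2^{k+1}\epsilon}|f(y)|\,dy
 &=& \displaystyle \frac{2^n v_n}{2^{k\sigma}}\cdot
   \frac{1}{|B(x,2^{k+1}\epsilon)|}\int_{B(x,2^{k+1}\epsilon)}|f(y)|\,dy \\[2ex]
 &\leq& \displaystyle \frac{2^n v_n}{2^{k\sigma}}\,M(f)(x),
\end{array}
\]
\[
\sum_{k=0}^{\infty}\frac{1}{2^{k\sigma}}=\frac{1}{1-2^{-\sigma}} .
\]
```

The first identity is just ${|B(x,2^{k+1}\epsilon)|=v_n 2^{(k+1)n}\epsilon^n}$, and the series converges precisely because ${\sigma>0}$.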

Combining the previous estimates we conclude that for any ${w\in B}$

$\displaystyle |T_\epsilon(f)(x)|\leq A M(f)(x)+|T(f_2)(w)|\leq A M(f)(x)+|T(f)(w)|+|T(f_1)(w)|, \ \ \ \ \ (7)$

for some constant ${A}$ depending only on ${n}$ and ${\sigma}$.

If ${T_\epsilon(f)(x)=0}$ then we are done. Otherwise fix any ${\lambda}$ with ${0<\lambda<|T_\epsilon(f)(x)|}$. Let

$\displaystyle B_1=\{w\in B:|Tf(w)|>\lambda/3\},$

$\displaystyle B_2=\{w\in B:|Tf_1(w)|>\lambda/3\},$

and

$\displaystyle B_3=\begin{cases} \emptyset, \quad\mbox{if}\quad M(f)(x)\leq A^{-1}\lambda/3,\\ \\ B,\quad\mbox{if}\quad M(f)(x)>A^{-1} \lambda/3 \end{cases}.$

Let ${w \in B}$. Then either ${w\in B_1}$ or ${w\in B_2}$ or ${AM(f)(x)>\lambda/3}$. In the last case ${B_3=B}$ so in every case we conclude that ${w\in B_1\cup B_2\cup B_3}$ thus ${B\subset B_1\cup B_2 \cup B_3}$. However we have that

$\displaystyle |B_1|\lesssim \frac{1}{\lambda} \int_{B} |T(f)(y)|dy\leq \frac{|B|}{\lambda} M(Tf)(x).$

Also, since ${T}$ is of weak type ${(1,1)}$, we get

$\displaystyle |B_2|\lesssim \frac{1}{\lambda}\|f_1\|_{L^1({\mathbb R}^n)}=\frac{1}{\lambda}\int_{B^*} |f(y)|dy\leq \frac{|B^*|}{\lambda}M(f)(x)=\frac{2^n|B|}{\lambda}M(f)(x).$

Finally, if ${B_3=B}$ then ${\lambda \lesssim_{n,\sigma} M(f)(x)}$. Otherwise ${B_3=\emptyset}$ so

$\displaystyle |B|\leq |B_1|+|B_2|\lesssim_{n,\sigma} \frac{|B|}{\lambda}(M(Tf)(x)+M(f)(x)).$

Thus in every case we get that

$\displaystyle \lambda \lesssim_{n,\sigma} M(Tf)(x)+M(f)(x).$

Since the previous estimate is true for any ${0<\lambda< |T_\epsilon(f)(x)|}$ we conclude that

$\displaystyle |T_\epsilon(f)(x)|\lesssim_{n,\sigma} M(Tf)(x)+M(f)(x),$

which gives the desired estimate in the case ${\nu=1}$.

For ${\nu <1}$, estimate (7) implies that

$\displaystyle |T_\epsilon(f)(x)|^\nu\lesssim_{\sigma,\nu,n} |M(f)(x)|^\nu + |T(f)(w)|^\nu + |T(f_1)(w)|^\nu,$

for every ${w\in B}$. Averaging over ${w\in B}$ we get

$\displaystyle |T_\epsilon(f)(x)|^\nu \lesssim_{\sigma,\nu,n} |M(f)(x)|^\nu + \frac{1}{|B|}\int_B |T(f)(w)|^\nu dw +\frac{1}{|B|}\int_B |T(f_1)(w)|^\nu dw,$

and thus

$\displaystyle |T_\epsilon(f)(x)| \lesssim_{\sigma,\nu,n} |M(f)(x)|+ \bigg( \frac{1}{|B|}\int_B |T(f)(w)|^\nu dw \bigg)^\frac{1}{\nu}+\bigg( \frac{1}{|B|}\int_B |T(f_1)(w)|^\nu dw\bigg)^\frac{1}{\nu}.$

Note that

$\displaystyle \bigg( \frac{1}{|B|}\int_B |T(f)(w)|^\nu dw\bigg )^\frac{1}{\nu} \leq [M(|Tf|^\nu)(x)]^\frac{1}{\nu},$

and by Lemma 9 the last term is controlled by

$\displaystyle \bigg( \frac{1}{|B|}\int_B |T(f_1)(w)|^\nu dw \bigg)^\frac{1}{\nu } \lesssim_{\nu} \frac{1}{|B|}\|f_1\|_1 \lesssim_n M(f)(x) ,$

since ${T}$ is of weak type ${(1,1)}$. Gathering these estimates we get

$\displaystyle |T_\epsilon(f)(x)|\lesssim_{\sigma,\nu,n}M(f)(x)+[M(|Tf|^\nu)(x)]^\frac{1}{\nu},$

as we wanted to show. $\Box$

We can now give the proof of the fact that the maximal truncation of a CZO is of weak type ${(1,1)}$ and strong type ${(p,p)}$ for ${1<p<\infty}$.

Proof of Theorem 8: By Lemma 10 for ${\nu=1}$ we immediately get that ${T_*}$ is of strong type ${(p,p)}$ for ${1<p<\infty}$ since both ${M}$ and ${T}$ are. In order to show that ${T_*}$ is of weak type ${(1,1)}$ we fix some ${0<\nu<1}$ and argue as follows. By Lemma 10 we have that

$\displaystyle \begin{array}{rcl} |\{x\in{\mathbb R}^n:T_*(f)(x)>\lambda \}|&\lesssim_{n,\nu,\sigma}& |\{x\in{\mathbb R}^n:M(f)(x)>\lambda/2 \}| \\ \\ && + |\{x\in{\mathbb R}^n:[M (|Tf|^\nu)(x)]^\frac{1}{\nu}>\lambda /2\}| \\ \\ &\lesssim& \frac{1}{\lambda}\|f\|_{L^1({\mathbb R}^n)}+ |\{x\in{\mathbb R}^n:[M (|Tf|^\nu)(x)]^\frac{1}{\nu}>\lambda /2\}| . \end{array}$

Thus the proof will be complete if we show that

$\displaystyle |\{x\in{\mathbb R}^n:[M (|Tf|^\nu)(x)]^\frac{1}{\nu}>\lambda /2\}|\lesssim \frac{1}{\lambda}\|f\|_1.$

As we have seen in Corollary 18 of Notes 5 we have that

$\displaystyle |\{x\in{\mathbb R}^n: M(g)(x)>4^n \lambda \}|\leq 2^n |\{x\in{\mathbb R}^n: M_\Delta(g)(x)>\lambda \}|, \ \ \ \ \ (8)$

where ${M_\Delta}$ is the dyadic maximal function. Furthermore, using the Calderón-Zygmund decomposition it is not hard to see (see Exercise 4) that

$\displaystyle |\{x\in{\mathbb R}^n:M_\Delta(g)(x)>\lambda \}|\lesssim \frac{1}{\lambda} \int_{\{M_\Delta(g)(x)>\lambda \}}|g(x)|dx.$

Applying the last estimate to ${g=|Tf|^\nu}$ at level ${(\lambda/2)^\nu}$, and writing ${E_\lambda:=\{x\in{\mathbb R}^n:[M_\Delta (|Tf|^\nu)(x)]^\frac{1}{\nu}>\lambda /2\}}$, we get

$\displaystyle |E_\lambda|\lesssim_{\nu} \frac{1}{\lambda^\nu} \int_{E_\lambda} |Tf(x)|^\nu dx.$

For ${f\in C^\infty _c({\mathbb R}^n)}$ the set ${E_\lambda}$ has finite measure. Thus by Lemma 9 we conclude that

$\displaystyle |E_\lambda|\lesssim_{\nu} \frac{1}{\lambda^\nu} \|f\|_{L^1({\mathbb R}^n)} ^\nu |E_\lambda| ^{1-\nu} ,$

that is, ${|E_\lambda|\lesssim_\nu \lambda^{-1}\|f\|_{L^1({\mathbb R}^n)}}$. Thus by (8), applied to ${g=|Tf|^\nu}$ at level ${(\lambda/2)^\nu}$, we get that

$\displaystyle |\{x\in{\mathbb R}^n:[M (|Tf|^\nu)(x)]^\frac{1}{\nu}>4^{n/\nu}\lambda /2\}|\leq 2^n |E_\lambda|\lesssim_{\nu,n} \frac{1}{\lambda } \|f\|_{L^1({\mathbb R}^n)}.$

Replacing ${\lambda}$ by ${4^{-n/\nu}\lambda}$ gives the desired estimate.

This concludes the proof. $\Box$

Exercise 4 Show that for all ${f\in L^1({\mathbb R}^n)}$ and all ${\lambda>0}$ we have that

$\displaystyle |\{x\in{\mathbb R}^n:M_\Delta(f)(x)>\lambda\}|\lesssim_n \frac{1}{\lambda}\int_{\{x\in{\mathbb R}^n:M_\Delta(f)(x)>\lambda\}}|f(x)|dx.$

3. Singular integral operators on ${L^\infty}$ and ${\textnormal{BMO}}$.

The theory of Calderón-Zygmund operators developed so far is pretty satisfactory except for one point, the action of a CZO on ${L^\infty}$. Exercise 4 from Notes 6 shows for example that in general a CZO cannot be bounded on ${L^\infty}$. Furthermore, it is at the moment unclear how to define the action of ${T}$ on a general bounded function or even on a dense subset of ${L^\infty}$. With a little effort however this can be achieved.

Let us first fix a function ${f\in L^\infty({\mathbb R}^n)}$ and look at the formula

$\displaystyle T(f)(x)=\int K(x,y)f(y)dy. \ \ \ \ \ (9)$

As we have already mentioned several times, such a formula is not meaningful throughout ${{\mathbb R}^n}$. Indeed the integral above need not converge, both close to the diagonal ${x=y}$, since ${K}$ is singular, as well as at infinity since ${K}$ only decays like ${|x-y|^{-n}}$, not fast enough to make the integral above absolutely convergent. The first problem we have dealt with so far by considering functions with compact support and requiring the validity of (9) only for ${x\notin {\mathrm{supp}}(f)}$. A similar solution could work now but we still have a problem at infinity. Note that we have not run into this problem so far since we only considered functions in ${L^p({\mathbb R}^n)}$ which necessarily possess decay at infinity. This is not necessarily the case for bounded functions. However, looking at the difference of the values of ${T(f)}$ at two points ${x_1,x_2}$ with ${x_1\neq x_2}$, we can formally write

$\displaystyle \begin{array}{rcl} T(f)(x_1)-T(f)(x_2)=\int[K(x_1,y)-K(x_2,y)]f(y)dy. \end{array}$

Using the regularity condition (3) we see that

$\displaystyle |K(x_1,y)-K(x_2,y)| \lesssim_{n,\sigma} \frac{|x_1-x_2|^\sigma}{|x_1-y|^{n+\sigma}}$

as ${|y|\rightarrow \infty}$. This is enough to ensure integrability at infinity in the previous integral, as long as ${x_1,x_2\notin {\mathrm{supp}}(f)}$. Motivated by this heuristic discussion we define for ${f\in L^\infty({\mathbb R}^n)}$:

$\displaystyle T(f)(x)= T(f\chi_B)(x)+\int_{{\mathbb R}^n\setminus B} [K(x,y)-K(0,y)]f(y)dy, \ \ \ \ \ (10)$

for some Euclidean ball ${B}$ so that ${0,x\in B}$. First of all it is easy to see that the terms above make sense. Indeed, ${T(f\chi_B)}$ is well defined since ${f\chi_B}$ is in ${L^2({\mathbb R}^n)}$. On the other hand, the integral in the second summand converges absolutely since we integrate away from ${B\ni 0,x}$, ${f}$ is bounded and ${K(x,y)-K(0,y)}$ decays like ${|y|^{-(n+\sigma)}}$ as ${|y|\rightarrow +\infty}$. However, (10) only defines ${T(f)}$ up to a constant. Indeed it is easy to see that if ${B,B'}$ are two different balls containing ${0,x}$ the difference of the two definitions is equal to

$\displaystyle \int_{B\setminus B'} K(0,y)f(y)dy-\int_{B'\setminus B} K(0,y)f(y)dy,$

which is a constant independent of ${x}$. Thus we only define ${T(f)}$ modulo constants. This definition of ${T}$ gives a linear operator which extends our previous definitions on ${L^2({\mathbb R}^n)}$ or ${\mathcal S({\mathbb R}^n)}$. To deal with the ambiguity in the definition, we have to define the appropriate space.
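The claim that two admissible balls give definitions differing by a constant can be checked as follows. This is a sketch for open balls ${B,B'}$ containing ${0}$ and ${x}$, writing ${T_B(f)}$ and ${T_{B'}(f)}$ (notation introduced here) for the right hand side of (10) computed with ${B}$ and ${B'}$ respectively. It uses the kernel representation of ${T}$ on the functions ${f\chi_{B\setminus B'}}$ and ${f\chi_{B'\setminus B}}$, whose supports stay away from ${x\in B\cap B'}$.

```latex
\[
\begin{array}{rcl}
T_B(f)(x)-T_{B'}(f)(x)
 &=& \displaystyle T(f\chi_{B\setminus B'})(x)-T(f\chi_{B'\setminus B})(x) \\[1ex]
 && \displaystyle +\int_{B'\setminus B}[K(x,y)-K(0,y)]f(y)\,dy
    -\int_{B\setminus B'}[K(x,y)-K(0,y)]f(y)\,dy \\[2ex]
 &=& \displaystyle \int_{B\setminus B'}K(0,y)f(y)\,dy-\int_{B'\setminus B}K(0,y)f(y)\,dy .
\end{array}
\]
```

The terms involving ${K(x,y)}$ cancel, and what remains does not depend on ${x}$. Both remaining integrals converge absolutely since ${0\in B\cap B'}$ lies at positive distance from the bounded sets ${B\setminus B'}$ and ${B'\setminus B}$.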

Definition 11 We say that two functions ${f,g}$ on ${{\mathbb R}^n}$ are equivalent modulo a constant if there exists a constant ${c\in {\mathbb C}}$ such that ${f(x)-g(x)=c}$ almost everywhere on ${{\mathbb R}^n}$. This is an equivalence relation. By abuse of language and notation we will oftentimes identify an equivalence class with a representative of the class, much like we do with measurable functions.

Definition 12 (Bounded Mean Oscillation) Let ${f}$ be a locally integrable function, defined modulo a constant. We set

$\displaystyle f_B=\frac{1}{|B|}\int_B f,$

to be the average of ${f}$ on the Euclidean ball ${B}$. The ${\textnormal{BMO}}$ norm of ${f}$ is the quantity

$\displaystyle \|f\|_{\textnormal{BMO}}:=\sup_B \frac{1}{|B|}\int_B | f- f_B|,$

where the supremum varies over all Euclidean balls ${B}$. The space ${\textnormal{BMO}({\mathbb R}^n)}$ is the set of all locally integrable functions ${f}$, defined modulo a constant, such that ${\|f\|_{\textnormal{BMO}}<+\infty}$. Thus, an element of ${\textnormal{BMO}}$ is only defined up to a constant.

First of all observe that this is a good definition since replacing a function ${f}$ by ${f+c}$ for any constant ${c\in {\mathbb C}}$ does not affect its BMO norm. Thus, all elements in the equivalence class of ${f}$ have the same BMO norm. The previous quantity actually defines a norm, always keeping in mind that we identify functions that differ by a constant. For example any constant is equivalent to the function ${0}$ in BMO and thus ${\|f\|_{\textnormal{BMO}}=0}$ if and only if ${f=c}$ almost everywhere for some ${c\in{\mathbb C}}$.

It is not hard to give the following alternative description of the BMO norm, which is maybe a bit more revealing:

Proposition 13 (i) Let ${f\in \textnormal{BMO}({\mathbb R}^n)}$. We have that

$\displaystyle \|f\|_{\textnormal{BMO}}\simeq \sup_B \inf_{a\in{\mathbb C}} \frac{1}{|B|}\int_B|f-a|.$

(ii) For any locally integrable function ${f}$ and a cube ${Q}$ set ${f_Q=\frac{1}{|Q|}\int_Q f}$. We set

$\displaystyle \|f\|_{\textnormal{BMO}_\square}:=\sup_Q \frac{1}{|Q|}\int_Q |f-f_Q|,$

where the supremum is taken over all cubes ${Q\subset {\mathbb R}^n}$. Then

$\displaystyle \|f\|_{\textnormal{BMO}_\square}\simeq\sup_Q \inf_{a\in {\mathbb C}}\frac{1}{|Q|}\int_Q|f-a|$

as in ${(i)}$. Moreover

$\displaystyle \|f\|_{\textnormal{BMO}}\simeq_n \|f\|_{\textnormal{BMO}_\square}.$

Proof: For (i) observe that for any ball ${B}$ we have

$\displaystyle \inf_{a\in {\mathbb C}} \frac{1}{|B|}\int_B |f-a| \leq \frac{1}{|B|}\int_B |f-f_B|.$

On the other hand for any ${a\in {\mathbb C}}$ we have

$\displaystyle \frac{1}{|B|}\int_B |f-f_B| \leq \frac{1}{|B|} \int_B|f-a| +\frac{1}{|B|}\int_B|f_B-a|\leq \frac{2}{|B|}\int_B|f-a|,$

which gives the opposite inequality as well by taking the infimum over ${a\in {\mathbb C}}$. The proof of the first claim in ${(ii)}$ is identical. For the second claim in ${(ii)}$ let ${a\in {\mathbb C} }$ and ${Q}$ be a cube. Consider the smallest ball ${B\supset Q}$ with the same center as ${Q}$. Then

$\displaystyle \frac{1}{|B|}\int_B |f-a| \gtrsim _n \frac{1}{|Q|} \int_Q |f-a|.$

Thus,

$\displaystyle \sup_B\inf_{a\in{\mathbb C}}\frac{1}{|B|}\int_B |f-a| \gtrsim _n\inf_{a\in {\mathbb C}} \frac{1}{|Q|} \int_Q |f-a|,$

for any cube ${Q}$. Taking also the supremum over cubes ${Q}$ proves one direction of the inequality. The proof of the opposite inequality is similar. $\Box$

Thus a function ${f}$ in BMO has the property that for any ball ${B}$ there is a constant ${c(B)}$ such that ${\frac{1}{|B|}\int_B|f-c(B)|\leq \|f\|_{\textnormal{BMO}}}$. That is, the values of ${f}$ deviate from ${c(B)}$ by at most ${\|f\|_{\textnormal{BMO}}}$ on average over ${B}$. Locally, and in the mean, the function ${f}$ has bounded oscillation.

The space BMO contains ${L^\infty}$ but also contains unbounded functions.

Proposition 14 (i) For every ${f\in L^\infty({\mathbb R}^n)}$ we have that

$\displaystyle \|f\|_{\textnormal{BMO}({\mathbb R}^n)}\lesssim \|f\|_{L^\infty({\mathbb R}^n)},$

thus ${L^\infty({\mathbb R}^n)\subset \textnormal{BMO}({\mathbb R}^n)}$.

(ii) The function ${f(x)=\log |x|}$ is in ${\textnormal{BMO}({\mathbb R}^n)}$. Thus ${L^\infty({\mathbb R}^n)}$ is a proper subset of ${\textnormal{BMO}({\mathbb R}^n)}$.
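Part (i) is a one-line computation. As a sketch, one admissible value of the implicit constant is ${2}$, since ${|f_B|\leq \frac{1}{|B|}\int_B |f|}$ for every ball ${B}$:

```latex
\[
\frac{1}{|B|}\int_B|f-f_B|
 \leq \frac{1}{|B|}\int_B|f|+|f_B|
 \leq \frac{2}{|B|}\int_B|f|
 \leq 2\|f\|_{L^\infty({\mathbb R}^n)} .
\]
```

Taking the supremum over all balls ${B}$ gives ${\|f\|_{\textnormal{BMO}}\leq 2\|f\|_{L^\infty({\mathbb R}^n)}}$.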

Our interest in the space BMO mainly lies in the fact that it serves as a substitute endpoint for the boundedness of CZOs, namely a CZO ${T}$ is bounded from ${L^\infty}$ to BMO, where ${T}$ should be defined as in (10). Note here that even though (10) only defines ${T(f)}$ ‘up to constants’, this is exactly what is needed, since an element of BMO is itself only defined up to a constant.

Theorem 15 Let ${T}$ be a CZO. Then for every ${f\in L^\infty({\mathbb R}^n)}$ we have that

$\displaystyle \|T(f)\|_{\textnormal{BMO}({\mathbb R}^n)}\lesssim_{n,\sigma} \|f\|_{L^\infty({\mathbb R}^n)}.$

Proof: Let ${B=B(z,r)}$ be some ball in ${{\mathbb R}^n}$. We need to show that

$\displaystyle \frac{1}{|B|}\int_B| T(f)-T(f)_B|\lesssim_{n,\sigma} \|f\|_{L^\infty}.$

Denote ${B^*=B(z,2\sqrt n r)}$. We set

$\displaystyle f=f\chi_{B^*}+f\chi_{{\mathbb R}^n\setminus B^*}=:f_1+f_2.$

Since ${T}$ is of strong type ${(2,2)}$ we have

$\displaystyle \|T(f_1)\|_{L^2({\mathbb R}^n)}\lesssim_{n,\sigma}\|f_1\|_{L^2({\mathbb R}^n)}\leq \|f\|_{L^\infty({\mathbb R}^n)} |B^*|^\frac{1}{2}.$

Thus by Cauchy-Schwarz we have

$\displaystyle \frac{1}{|B|} \int_B |T(f_1)| \leq\frac{1}{|B|}\|T(f_1)\|_{L^2({\mathbb R}^n)}|B|^\frac{1}{2}\lesssim_{n,\sigma} \|f\|_{L^\infty}.$

On the other hand, for ${x\in B}$ the ball ${B^*}$ contains both ${x}$ and ${z}$, so we can use (10) with the ball ${B^*}$ and with ${z}$ in place of the base point ${0}$, which only changes ${T(f_2)}$ by a constant. Since ${f_2\chi_{B^*}=0}$ and ${|x-z|<r\leq\frac{1}{2}|y-z|}$ whenever ${|y-z|\geq 2r}$, estimate (3) gives

$\displaystyle \begin{array}{rcl} |T(f_2)(x)|&=&\bigg|T(f_2\chi_{B^*})(x)+\int_{\mathbb R^n\setminus B^*} [K(x,y)-K(z,y)]f_2(y)dy\bigg| \\ \\ &\leq& \|f\|_{L^\infty({\mathbb R}^n)} \int_{|y-z|\geq 2r}|K(x,y)-K(z,y)|dy \\ \\ &\lesssim_{n,\sigma} & \|f\|_{L^\infty({\mathbb R}^n)} \int_{|y-z|\geq 2r} \frac{|x-z|^\sigma}{|y-z|^{n+\sigma}}dy \\ \\ &\leq & \|f\|_{L^\infty({\mathbb R}^n)} r^\sigma\int_{|y-z|\geq 2r}\frac{1}{|y-z|^{n+\sigma}}dy \\ \\ &\lesssim_{n,\sigma} & \|f\|_{L^\infty({\mathbb R}^n)} . \end{array}$

Remembering that (10) only defines ${T}$ up to a constant ${c(B)}$ we get

$\displaystyle \frac{1}{|B|}\int_{B}|T(f)(x)-c(B)|\lesssim_{n,\sigma}\|f\|_{L^\infty({\mathbb R}^n)} .$

By Proposition 13 this proves the theorem. $\Box$

3.1. The John-Nirenberg Inequality

We will now see that although the space BMO contains unbounded functions like ${\log|x|}$, this is in a sense the maximum possible growth for a BMO function. Although such a claim is not precise in a pointwise sense, it can be rigorously proved in the sense of level sets. Indeed, assuming ${\|f\|_{\textnormal{BMO}}=1}$ then

$\displaystyle \frac{1}{|B|}\int_B|f-f_B|\leq 1,$

for all balls ${B}$. Using Chebyshev’s inequality this implies

$\displaystyle |\{x\in B: |f(x)-f_B|>\lambda\}|\leq\frac{|B|}{\lambda}.$

This estimate is interesting for ${\lambda}$ large, and states that on any ball ${B}$ the function ${f}$ deviates from its average by more than ${\lambda}$ only on a small fraction, at most ${1/\lambda}$, of the ball ${B}$. In fact, this can be improved.

Theorem 16 (John-Nirenberg inequality) Let ${f\in\textnormal{BMO}({\mathbb R}^n)}$. Then for any Euclidean cube ${Q}$ we have that

$\displaystyle |\{x\in Q:|f(x)-f_Q|>\lambda\}|\lesssim_n e^{-c_n{\lambda }/{\|f\|_{\textnormal{BMO}_\square}}}|Q|,$

for all ${\lambda>0}$, where the constant ${c_n>0}$ depends only on the dimension ${n}$.

Remark 5 Obviously it doesn’t make any difference to work with balls instead of cubes, so the previous theorem remains valid with balls ${B}$ replacing cubes ${Q}$.
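
A quick sanity check, in dimension ${n=1}$, suggests that the exponential decay in Theorem 16 cannot be improved in general. Take ${f(x)=\log(1/|x|)}$ and the interval ${Q=(0,1)}$; these specific choices are ours, made only for illustration. Then ${f_Q=\int_0 ^1\log(1/x)dx=1}$, and for ${\lambda\geq 1}$ the set where ${f-f_Q<-\lambda}$ is empty inside ${Q}$, so

$\displaystyle |\{x\in Q:|f(x)-f_Q|>\lambda\}|=|\{x\in(0,1):\log(1/x)>1+\lambda\}|=|(0,e^{-1-\lambda})|=e^{-1}e^{-\lambda}|Q|.$

Thus the level sets of this function decay exactly exponentially in ${\lambda}$, which matches the form of the estimate in the theorem.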

Proof: For ${\lambda>0}$ let us denote by ${c(\lambda)}$ the best constant in the inequality

$\displaystyle |\{x\in Q:|f(x)-f_Q|>\lambda\}|\leq c(\lambda)|Q|,$

valid for any cube ${Q}$ and ${f}$ with ${\|f\|_{\textnormal{BMO}_\square}=1.}$ By Chebyshev’s inequality combined with the trivial bound we get

$\displaystyle c(\lambda)\leq \min(1,{1}/{\lambda}),$

which is of course quite far from the desired estimate

$\displaystyle c(\lambda )\lesssim_n e^{-c_n\lambda}.$

This will be achieved by iterating a local Calderón-Zygmund decomposition as follows.

Let us fix a cube ${Q_o}$ and consider the family ${\mathcal B_1}$ of ${2^n}$ cubes inside ${Q_o}$ which are formed by bisecting each side of ${Q_o}$. Then define the second generation ${\mathcal B_2}$ by bisecting the sides of each cube in ${\mathcal B_1}$ and so on. The family of all cubes in all generations will be denoted by ${\mathcal B'}$. For a level ${\Lambda > 1}$ to be chosen later let ${\mathcal B''}$ be the ‘bad’ cubes in ${\mathcal B'}$, that is the cubes ${Q\in \mathcal B'}$ such that

$\displaystyle \frac{1}{|Q|}\int_Q F(w)dw>\Lambda,$

where ${F(w)=|f(w)-f_{Q_o}|}$.

Finally let ${\mathcal B}$ be the family of maximal bad cubes. Since ${\frac{1}{|Q_o|}\int_ {Q_o} F(w)dw\leq 1 <\Lambda}$ for the original cube ${Q_o}$, every bad cube is contained in a maximal bad cube. As in the global Calderón-Zygmund decomposition we conclude that

$\displaystyle \Lambda\leq \frac{1}{|Q|}\int_Q F(w)dw \leq r_n \Lambda$

for each cube ${Q\in \mathcal B}$ where the constant ${r_n}$ depends only on the dimension ${n}$. We also conclude that

$\displaystyle F(w)\leq \Lambda$

for almost every ${w\in Q_o\setminus \cup_{Q\in\mathcal B} Q}$ by the dyadic maximal theorem. Remembering the initial normalization ${\|f\|_{\textnormal{BMO}_\square}=1}$ we get

$\displaystyle \sum_{Q\in\mathcal B}|Q|\leq \frac{1}{ \Lambda}\sum_{Q\in\mathcal B}\int_Q F(w)dw\leq \frac{1}{\Lambda} |Q_o|$

and for ${Q\in\mathcal B}$

$\displaystyle |f_{Q}-f_{Q_o}|=|\frac{1}{|Q |}\int_{Q }[f-f_{Q_o}]| \leq \frac{1}{|Q|}\int_Q F(w)dw \leq r_n \Lambda.$

Now consider ${\lambda>r_n \Lambda }$. We have

$\displaystyle \begin{array}{rcl} |\{x\in Q_o: |(f-f_{Q_o})(x)|> \lambda \}|&=& |\{x\in \cup_{Q\in \mathcal B} Q: |f(x)-f_{Q_o}|>\lambda\}| \\ \\&\leq&|\{x\in \cup_{Q\in \mathcal B} Q: |f(x)-f_{Q}|>\lambda-|f_Q-f_{Q_o}| \}|\\ \\ &\leq & \sum_{Q\in\mathcal B}|\{x\in Q :|f(x)-f_Q|>\lambda -r_n\Lambda \}| \\ \\ &\leq & c(\lambda-r_n\Lambda) \sum_{Q\in\mathcal B}|Q| \\ \\ &\leq & c(\lambda-r_n\Lambda) \frac{1}{ \Lambda}|Q_o|. \end{array}$

However this means that

$\displaystyle c(\lambda)\leq \frac{c(\lambda-r_n\Lambda)}{\Lambda}$

whenever ${\lambda>r_n\Lambda}$. Suppose that ${Nr_n \Lambda < \lambda \leq (N+1) r_n \Lambda}$ for some integer ${N\geq 1}$. Applying the previous inequality ${N}$ times and then using the trivial estimate ${c\leq 1}$ we get

$\displaystyle c(\lambda)\leq \frac{c(\lambda-Nr_n\Lambda)}{\Lambda^N}\leq \frac{1}{\Lambda^N}= e^{-N\ln \Lambda}\leq e^{-(\frac{\lambda}{r_n\Lambda}-1)\ln \Lambda}\lesssim e^{-c_n\lambda}$

for ${\Lambda=e}$ (say) and ${\lambda >r_n e}$. On the other hand, for ${\lambda\leq r_n e}$ we have

$\displaystyle c(\lambda)\leq 1 \lesssim_n e^ {-c_n\lambda}$

so the proof is complete. $\Box$
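
The iteration of the recursion ${c(\lambda)\leq c(\lambda-r_n\Lambda)/\Lambda}$ can also be checked numerically. The following is only a minimal sketch: the function names `iterated_bound` and `exponential_envelope` are ours, and the value `r_n = 2.0` is a hypothetical choice of the dimensional constant made purely for illustration. It applies the recursion as long as ${\lambda>r_n\Lambda}$, finishes with the trivial bound ${c\leq 1}$, and compares the result with the closed form ${e\cdot e^{-\lambda/(e r_n)}}$ obtained for ${\Lambda=e}$.

```python
import math


def iterated_bound(lam, r_n, Lambda=math.e):
    """Bound for c(lam) from repeated use of c(lam) <= c(lam - r_n*Lambda) / Lambda.

    The recursion is applied while lam > r_n*Lambda; what remains is
    estimated by the trivial bound c <= 1.
    """
    step = r_n * Lambda
    bound = 1.0
    while lam > step:
        lam -= step      # one application of the recursion ...
        bound /= Lambda  # ... gains a factor 1/Lambda
    return bound


def exponential_envelope(lam, r_n):
    """Closed form e * exp(-lam / (e * r_n)), i.e. c_n = 1/(e * r_n) when Lambda = e."""
    return math.e * math.exp(-lam / (math.e * r_n))


if __name__ == "__main__":
    r_n = 2.0  # hypothetical value of the dimensional constant, for illustration only
    for lam in (1.0, 12.0, 25.0, 60.0):
        print(lam, iterated_bound(lam, r_n), exponential_envelope(lam, r_n))
```

In this sketch the iterated bound is a step function of ${\lambda}$ that always sits below the exponential envelope, which is the content of the last chain of inequalities in the proof.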

Corollary 17 Consider the ${L^p}$ version of the BMO norm

$\displaystyle \begin{array}{rcl} \|f\|_{\textnormal{BMO,p}}:=\sup_B\bigg( \frac{1}{|B|}\int_B|f-f_B|^p\bigg)^\frac{1}{p}\simeq_{p,n} \sup_B\inf_{a\in{\mathbb C}}\bigg( \frac{1}{|B|}\int_B|f-a|^p\bigg)^\frac{1}{p} \\ \\ \simeq_{n,p} \sup_Q\bigg( \frac{1}{|Q|}\int_Q|f-f_Q|^p\bigg)^\frac{1}{p}\simeq_{n,p}\sup_Q\inf_{a\in{\mathbb C}}\bigg( \frac{1}{|Q|}\int_Q|f-a|^p\bigg)^\frac{1}{p}. \end{array}$

Then

$\displaystyle \|f\|_{\textnormal{BMO}}\simeq_{p,n} \|f\|_{\textnormal{BMO,p}}.$

Exercise 5 Use the John-Nirenberg inequality and the description of ${L^p}$ norms in terms of level sets to prove Corollary 17.

Finally, we show how we can use the space ${\textnormal{BMO}({\mathbb R}^n)}$ as a different endpoint in the log-convexity estimates for the ${L^p}$ norms.

Lemma 18 Let ${0<p<q<\infty}$ and ${f\in L^p({\mathbb R}^n)\cap \textnormal{BMO}({\mathbb R}^n)}$. Then ${f\in L^q({\mathbb R}^n)}$ and

$\displaystyle \|f\|_{L^q({\mathbb R}^n)}\lesssim_{p,q,n} \|f\|_{L^p({\mathbb R}^n)} ^\frac{p}{q}\|f\|_{\textnormal{BMO}({\mathbb R}^n)}^{1-\frac{p}{q}}.$

Proof: Obviously it is enough to assume that ${\|f\|_{\textnormal{BMO}}\neq 0}$, since otherwise there is nothing to prove. Also by homogeneity we can normalize so that ${\|f\|_{\textnormal{BMO}}=1}$. Now form the Calderón-Zygmund decomposition of ${|f|^p}$ at level ${1}$ and denote by ${\mathcal B}$ the family of bad cubes as usual. For each cube ${Q\in\mathcal B}$ we then have

$\displaystyle \frac{1}{|Q|}|\int_Q f|\leq \frac{1}{|Q|}\int_Q |f|^p\lesssim_n 1 .$

From the John-Nirenberg inequality we conclude that

$\displaystyle |\{x\in Q:|f(x)|>\lambda\}|\leq |\{x\in Q:|f(x)-f _Q|>\lambda-|f _Q|\}|\lesssim_n e^{-c_n\lambda}|Q|,$

for all the bad cubes ${Q\in\mathcal B}$. Since we have that ${|f(x)|\leq 1}$ for almost every ${x\notin \cup_{Q\in\mathcal B} Q}$ we get

$\displaystyle |\{x\in {\mathbb R}^n: |f(x)|>\lambda \}|\lesssim_n e^{-c_n \lambda} \|f\|_{L^p} ^p, \ \ \ \ \ (11)$

for all ${\lambda>1}$. On the other hand, since ${f\in L^p}$ we have

$\displaystyle |\{x\in{\mathbb R}^n:|f(x)|>\lambda\}|\leq\frac{\|f\|_{L^p} ^p}{\lambda^p}. \ \ \ \ \ (12)$

We conclude the proof by using the description of the ${L^p}$ norm in terms of level sets and using (12) for ${\lambda<1}$ and (11) for ${\lambda>1}$. $\Box$
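
For completeness, here is a sketch of this last step under the normalization ${\|f\|_{\textnormal{BMO}}=1}$. Writing ${C_n}$ for the implied constant in (11) (the name ${C_n}$ is ours), the level set description of the ${L^q}$ norm gives

$\displaystyle \|f\|_{L^q} ^q=q\int_0 ^\infty \lambda^{q-1}|\{|f|>\lambda\}|d\lambda\leq q\|f\|_{L^p} ^p\bigg(\int_0 ^1\lambda^{q-p-1}d\lambda+C_n\int_1 ^\infty \lambda^{q-1}e^{-c_n\lambda}d\lambda\bigg)\lesssim_{p,q,n}\|f\|_{L^p} ^p.$

The first integral equals ${1/(q-p)}$, which is finite exactly because ${q>p}$. Applying this to ${f/\|f\|_{\textnormal{BMO}}}$ for a general ${f}$ and taking ${q}$-th roots recovers

$\displaystyle \|f\|_{L^q}\lesssim_{p,q,n}\|f\|_{L^p} ^\frac{p}{q}\|f\|_{\textnormal{BMO}} ^{1-\frac{p}{q}}.$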

Exercise 6 (The sharp Maximal function) For ${f\in L^1 _{\textnormal{loc}}({\mathbb R}^n)}$ define the sharp maximal function

$\displaystyle M^\sharp (f)(x)=\sup_{B\ni x} \frac{1}{|B|}\int_B |f(y)-f_B|dy,\quad x\in {\mathbb R}^n.$

Observe that ${f\in \textnormal{BMO}({\mathbb R}^n)}$ if and only if ${M^\sharp(f)\in L^\infty({\mathbb R}^n)}$ and, in particular,

$\displaystyle \|f\|_{\textnormal{BMO}({\mathbb R}^n)}=\|M^\sharp(f)\|_{L^\infty({\mathbb R}^n)}.$

Show that for every ${x\in {\mathbb R}^n}$ we have

$\displaystyle M^\sharp(f)(x)\lesssim_n M(f)(x).$

4. Vector valued Calderón-Zygmund Singular integral operators

We close this chapter on CZOs by describing a vector valued setup in which all our results on CZOs go through almost verbatim. We will see an application of these vector valued results in our study of Littlewood-Paley inequalities.

So let ${\mathcal H}$ be a separable Hilbert space with inner product ${(\cdot,\cdot)}$ and norm ${\|\cdot\|}$ and consider a function ${f:{\mathbb R}^n \rightarrow \mathcal H}$. All the well known facts about spaces of measurable scalar functions have almost obvious generalizations in this setup once we fix some analogies. For example, the function ${f}$ will be called measurable if for every ${h\in\mathcal H}$ the function ${{\mathbb R}^n\ni x\mapsto (f(x),h) }$ is a measurable function of ${x}$. If ${f}$ is measurable then ${\| f \| }$ is also measurable. We then denote by ${L^p({\mathbb R}^n,\mathcal H)}$ the space of all measurable functions ${f:{\mathbb R}^n\rightarrow \mathcal H}$ such that

$\displaystyle \|f\|_{L^p({\mathbb R}^n,\mathcal H)}:=\bigg(\int \|f(x)\|^p dx \bigg)^\frac{1}{p}<+\infty,\quad 1\leq p <+\infty ,$

and the usual corresponding definition for ${p=\infty}$

$\displaystyle \|f\|_{L^\infty({\mathbb R}^n,\mathcal H)}:=\mathop{\mathrm{ess\,sup}}_{x\in{\mathbb R}^n}\|f(x)\|.$

It is not hard to check the duality relations for these ${L^p}$ spaces; for example

$\displaystyle \|f\|_{L^p({\mathbb R}^n, \mathcal H)}=\sup\bigg\{ \big| \int (f(x),g(x)) dx \big|: \|g\|_{L^{p'}({\mathbb R}^n,\mathcal H)}\leq 1\bigg\},$

for all ${1\leq p<\infty}$. Also our interpolation theorems, the Marcinkiewicz interpolation theorem and the Riesz-Thorin interpolation theorem, go through in this setup as well.

Moreover, if a function ${f:{\mathbb R}^n\rightarrow \mathcal H}$ is absolutely integrable, we can define its integral as an element of ${\mathcal H }$ by defining the functional ${I_f:\mathcal H\rightarrow {\mathbb C}}$

$\displaystyle I_f(h):=\int_{{\mathbb R}^n} (f(x),h)dx .$

Note here that ${I_f}$ is uniquely defined as a functional in ${\mathcal H^*}$. Indeed, ${I_f}$ is obviously linear and by the Cauchy-Schwarz inequality we have

$\displaystyle |I_f(h)| = \big| \int_{{\mathbb R}^n} (f(x),h)dx \big| \leq \int_{{\mathbb R}^n}|(f(x),h)|dx\leq\bigg( \int_{{\mathbb R}^n}||f(x)|| dx \bigg) || h ||.$

By the Riesz representation theorem on Hilbert spaces, there is a unique element of ${\mathcal H}$, which we denote by ${\int_{{\mathbb R}^n} f(x) dx}$, such that ${I_f=(\int_{{\mathbb R}^n}f(x)dx,\cdot)}$, that is

$\displaystyle I_f(h)=\bigg(\int_{{\mathbb R}^n}f(x)dx,h \bigg),\quad h\in\mathcal H.$
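
As an illustration of this definition, take ${\mathcal H=\ell^2({\mathbb N})}$ with its standard orthonormal basis ${(e_j)_j}$ and write ${f=(f_j)_j}$; this concrete choice of ${\mathcal H}$ is ours and is only meant as an example. Testing the defining identity against ${h=e_j}$ shows that the integral is computed coordinate by coordinate,

$\displaystyle \bigg(\int_{{\mathbb R}^n}f(x)dx\bigg)_j=\int_{{\mathbb R}^n}f_j(x)dx,\quad j\in{\mathbb N},$

while choosing ${h=\int_{{\mathbb R}^n}f(x)dx}$ in the estimate for ${|I_f(h)|}$ gives the triangle inequality

$\displaystyle \bigg\|\int_{{\mathbb R}^n}f(x)dx\bigg\|\leq\int_{{\mathbb R}^n}\|f(x)\|dx.$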

Finally, if ${\mathcal H_1, \mathcal H_2}$ are separable Hilbert spaces we denote by ${B(\mathcal H_1,\mathcal H_2)}$ the space of bounded linear operators ${T:\mathcal H_1\rightarrow \mathcal H_2}$, equipped with the usual operator norm:

$\displaystyle \|T\|_{\mathcal H_1\rightarrow \mathcal H_2}:= \sup_{x\in\mathcal H_1\setminus\{0\}}\frac{\|Tx\|_{\mathcal H_2}}{\|x\|_{\mathcal H_1}}.$

Again, a function ${F:{\mathbb R}^n \rightarrow B(\mathcal H_1,\mathcal H_2) }$ will be called measurable if for every ${h\in\mathcal H_1}$ the function

$\displaystyle {\mathbb R}^n\ni x\mapsto F(x)h \in \mathcal H_2$

is a measurable ${\mathcal H_2}$-valued function.

We are now ready to give the description of vector valued CZOs. We start with the definition of a singular kernel.

Definition 19 (Vector valued singular Kernel) Let ${\mathcal H_1,\mathcal H_2}$ be two separable Hilbert spaces and ${K:{\mathbb R}^n\times {\mathbb R}^n\rightarrow B(\mathcal H_1, \mathcal H_2)}$ be a function defined away from the diagonal ${\Delta:=\{x= y\}.}$ Then ${K}$ will be called a (vector-valued) singular kernel if it obeys the size estimate

$\displaystyle \|K(x,y)\|_{\mathcal H_1\rightarrow \mathcal H_2}\lesssim_n \frac{1}{|x-y|^n}, \quad (x,y)\in {\mathbb R}^n\times {\mathbb R}^n\setminus \Delta, \ \ \ \ \ (13)$

and the regularity estimates

$\displaystyle \|K(x,y_1)-K(x,y)\|_{\mathcal H_1\rightarrow \mathcal H_2}\lesssim_{n,\sigma} \frac {|y-y_1|^\sigma} {|x-y|^{n+\sigma} }\quad\mbox{if}\quad |y-y_1|<\frac{1}{2}|x-y|, \ \ \ \ \ (14)$

and

$\displaystyle \|K(x_1,y)-K(x,y)\|_{\mathcal H_1\rightarrow \mathcal H_2}\lesssim_{n,\sigma} \frac {|x-x_1|^\sigma} {|x-y|^{n+\sigma} }\quad\mbox{if}\quad |x-x_1|<\frac{1}{2}|x-y|, \ \ \ \ \ (15)$

for some Hölder exponent ${0<\sigma\leq 1}$.
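
To make the definition concrete, here is a hedged example; the spaces and the family ${(K_j)_j}$ below are our own illustrative choices. Take ${\mathcal H_1={\mathbb C}}$ and ${\mathcal H_2=\ell^2({\mathbb N})}$, let ${K_j(x,y)}$ be scalar functions defined away from the diagonal, and let ${K(x,y)}$ act on ${a\in{\mathbb C}}$ by ${K(x,y)a:=(K_j(x,y)a)_{j}}$. Then the operator norm is just the ${\ell^2}$ norm of the sequence of scalar kernels,

$\displaystyle \|K(x,y)\|_{{\mathbb C}\rightarrow \ell^2}=\bigg(\sum_{j}|K_j(x,y)|^2\bigg)^\frac{1}{2},$

so in this case the size estimate (13) and the regularity estimate (14) read

$\displaystyle \bigg(\sum_{j}|K_j(x,y)|^2\bigg)^\frac{1}{2}\lesssim_n\frac{1}{|x-y|^n},\qquad \bigg(\sum_{j}|K_j(x,y_1)-K_j(x,y)|^2\bigg)^\frac{1}{2}\lesssim_{n,\sigma}\frac{|y-y_1|^\sigma}{|x-y|^{n+\sigma}},$

the second one under the restriction ${|y-y_1|<\frac{1}{2}|x-y|}$, and similarly for (15).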

Definition 20 Let ${\mathcal H_1,\mathcal H_2}$ be separable Hilbert spaces. A linear operator ${T:L^2({\mathbb R}^n,\mathcal H_1)\rightarrow L^2({\mathbb R}^n,\mathcal H_2)}$ is called a (vector valued) Calderón-Zygmund operator (vector valued CZO) from ${\mathcal H_1}$ to ${\mathcal H_2}$ if it is bounded from ${L^2({\mathbb R}^n,\mathcal H_1)}$ to ${ L^2({\mathbb R}^n,\mathcal H_2)}$

$\displaystyle \|T(f)\|_{L^2({\mathbb R}^n,\mathcal H_2)}\lesssim_{n,T} \|f\|_{L^2({\mathbb R}^n,\mathcal H_1)},$

for all ${f\in L^2({\mathbb R}^n,\mathcal H_1)}$, and there exists a vector valued singular kernel ${K:{\mathbb R}^n\times {\mathbb R}^n\rightarrow B(\mathcal H_1, \mathcal H_2)}$ such that

$\displaystyle T(f)(x)=\int K(x,y)f(y)dy,$

whenever ${f\in L^2({\mathbb R}^n,\mathcal H_1)}$ has compact support and ${x\notin {\mathrm{supp}}(f)}$.

Adjusting the proof of the scalar case to this vector valued setup we get the analogue of Theorem 5.

Theorem 21 Let ${\mathcal H_1,\mathcal H_2}$ be separable Hilbert spaces and ${T}$ be a vector valued Calderón-Zygmund operator from ${\mathcal H_1}$ to ${\mathcal H_2}$.

(i) The operator ${T}$ is of weak type ${(1,1)}$

$\displaystyle |\{x\in{\mathbb R}^n: \| T(f)(x)\|_{\mathcal H_2}>\lambda\}| \lesssim_{n,\sigma}\frac{\|f\|_{L^1({\mathbb R}^n,\mathcal H_1)}}{\lambda},\quad \lambda>0,$

for all ${f\in L^1({\mathbb R}^n,\mathcal H_1)}$.

(ii) For all ${1<p<\infty}$, ${T}$ is of strong type ${(p,p)}$

$\displaystyle \|T(f)\|_{L^p({\mathbb R}^n,\mathcal H_2)}\lesssim_{n,\sigma} \|f\|_{L^p({\mathbb R}^n,\mathcal H_1)},$

for all ${f\in L^p({\mathbb R}^n,\mathcal H_1)}$.