The maximal function along a polynomial curve; effective dimension bounds.

In this post I will try to give a description of an older result of mine that studies the {L^2\rightarrow L^2} operator norm of the maximal function along a polynomial curve. The relevant paper can be found here. The main object of study in this paper is the maximal operator

\displaystyle \mathcal{M}_P(f)(x):=\sup_{\epsilon>0}\frac{1}{2\epsilon} \int_{|t|\leq \epsilon} |f(x_1-t,x_2-t^2,\ldots,x_d-t^d)|dt.

It has been known since the seventies that this operator is bounded on {L^p} for {1<p<\infty}. I was however interested in getting some effective bounds for the operator norm, at least on {L^2}. In fact it is possible to do that:

Theorem 1 (Parissis, 2010) There is a numerical constant {c>0} such that

\displaystyle 	\|\mathcal{M}_P(f)\|_{L^2({\mathbb R}^d)} \leq c \log d\ \| f\|_{L^2({\mathbb R}^d)}.

In this post we will content ourselves with proving a slightly weaker estimate with linear (instead of logarithmic) growth in {d}. This will serve to present the main ideas and techniques involved in the proof while keeping things as simple as possible. I will however give some hints on how to move from the linear dependence to the logarithmic one, without presenting too many details. Of course the reader can always consult the original paper, where all the details are presented.

The methods and ideas in this paper are a mix originating in two independent lines of investigation. The first is concerned with the dimension dependence of the operator norm of the maximal function. Stein was the first to observe that the Hardy-Littlewood maximal function associated with the Euclidean ball is bounded on {L^p} with norm bounds that do not depend on the dimension. The second area of research has to do with the boundedness properties of maximal functions (and singular integrals) along lower dimensional varieties. The operator under study is such an example. However, since here I am interested in good dimensional constants for the norm of such an operator, tools from the first area of research will be used. I will try to give a short overview of these two areas. I will then try to describe how Bourgain’s ideas for the study of the standard maximal function can be used together with some new ones in order to get a good operator bound for the maximal function along a polynomial curve.

Notation: I will use the symbols \gtrsim, \lesssim, \simeq to suppress numerical constants only. The dependence on the parameters we are interested in here will never be hidden in these symbols. Also the constants c,c_1,\ldots will be used to denote generic numerical constants that can change even in the same line of text. The notation c_p will denote for example a constant that depends on p only.

— 1. Dimension free inequalities for the Maximal function —

The first area of research alluded to before has to do with proving dimension free inequalities for the maximal function with respect to a fixed convex body. In order to fix some notation, let us consider a convex set {K\subset \mathbb R^d} which is centrally symmetric, i.e. {x\in K\iff -x\in K}. Furthermore, we normalize {K} so that it has volume {|K|=1}. The maximal function associated with {K} is defined as

\displaystyle  \begin{array}{rcl} 	 M_K(f)(x)&:=&\sup _{\epsilon>0}\frac{1}{|\epsilon K|}\int_{\epsilon K} |f(x-y)| dy \\ \\ &= &\sup _{\epsilon>0}\frac{1}{\epsilon^d}\int_{\epsilon K} |f(x-y)| dy,\quad x\in\mathbb R^d, \end{array}

since we have normalized {K} to have volume {1}. In other words, {M_K(f)} assigns to each point {x\in {\mathbb R}^d} the maximal average of {f} over all dilations of the convex body {K+x}, the copy of {K} centered at {x}. I encourage you to think of the normalized Euclidean ball or the unit cube of {{\mathbb R}^d} in the place of {K}, reducing the previous definition to the more familiar standard definition of the Hardy-Littlewood maximal function.

An alternative way to write down the maximal function which is notationally convenient is through the isotropic dilations of a function. So, let {h} be a locally integrable function on {{\mathbb R}^d}. If {x\in \mathbb R^d} and {s>0}, the isotropic dilation of {h} is defined as

\displaystyle  h^s (x):=\frac{1}{s^d} h\Big(\frac{x}{s}\Big),

where {(x/s)} just means {(x_1/s,\ldots,x_d/s)}. Here, the word isotropic is used to emphasize the fact that we dilate all variables in the same way, i.e. isotropically. It will be useful to remember this when we define the anisotropic dilations later on. Two easy comments are in order. Whenever the Fourier transform of {h} makes sense, we have that

\displaystyle  \widehat {h^s}(\xi)=\hat h (s\xi)=\hat h(s\xi_1,\ldots,s\xi_d),\quad \xi\in{\mathbb R} ^d.

In particular, dilations preserve integrals:

\displaystyle  \widehat h^s(0)=\hat h(0)\implies \int h^s(x)dx= \int h(x) dx.

It is now a simple exercise to check that the maximal function can be written as

\displaystyle M_K(f)(x)=\sup_{\epsilon>0} [|f|*(\chi_K)^\epsilon](x), \quad x\in {\mathbb R}^d,

where {\chi_K} denotes the indicator function of {K}.
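To make the definition concrete, here is a minimal discrete sketch in dimension one, where the convex body is an interval and the dilations become windows of integer radius on a grid. The function name `maximal_function`, the grid and the truncation `max_radius` are my own illustrative choices and not part of the theory above; points outside the grid are treated as zeros.

```python
def maximal_function(values, max_radius):
    """Discrete centered maximal averages of |values| over windows of radius 0..max_radius.

    Illustrative one-dimensional analogue of M_K; not an implementation from the paper.
    """
    n = len(values)
    absvals = [abs(v) for v in values]
    # Prefix sums, so that each window sum costs O(1).
    prefix = [0.0]
    for v in absvals:
        prefix.append(prefix[-1] + v)
    result = []
    for i in range(n):
        best = 0.0
        for r in range(max_radius + 1):
            lo, hi = max(0, i - r), min(n, i + r + 1)
            # Points outside the grid count as zeros, so we divide by the full window length.
            avg = (prefix[hi] - prefix[lo]) / (2 * r + 1)
            best = max(best, avg)
        result.append(best)
    return result


f = [0, 0, 3, 0, 0, -1, 0, 0]
Mf = maximal_function(f, max_radius=3)
print(Mf)
```

The radius zero window shows that the maximal function dominates {|f|} pointwise, and every average is at most {\max|f|}, which is the discrete form of the trivial {L^\infty} bound.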

The following theorem summarizes the boundedness properties of the operator {M_K}.

Theorem 2 Let {K\subset\mathbb R^d} be a centrally symmetric convex body normalized so that {|K|=1}.

    (i) For all {\lambda >0}\displaystyle |\{x\in{\mathbb R}^d:|M_K(f)(x)|>\lambda \}| \leq c_1 \frac{\|f\|_{L^1(\mathbb R^d)}}{\lambda},

    for some constant {c_1>0} which depends only on the dimension {d} and on the choice of the convex body {K}.

    (ii) For every {1<p\leq \infty},

    \displaystyle  \|M_K(f)\|_{L^p(\mathbb R^d)} \leq c_p \|f \|_{L^p(\mathbb R^d)},

    for some constant {c_p>0} which depends only on {p, d} and on the choice of the convex body {K}.

Let us denote by {c_{1,d}(K)} the best possible value of the constant {c_1} in (i) and by {c_{p,d}(K)} the best value of the constant {c_p} in (ii). In (Stein and Strömberg, 1983) an investigation was initiated on understanding the behavior of these constants as {d\rightarrow+\infty}. In particular, the interest was mainly whether these constants can be independent of the dimension as {d\rightarrow+\infty}. While significant progress has been made, several aspects of this question remain largely open. Before summarizing what is known, let me define some convex bodies that are of special interest. In what follows, {B^d} denotes the normalized Euclidean ball of {{\mathbb R}^d} and {Q^d=[-\frac{1}{2},\frac{1}{2}]^d}. Also, {\tilde B_q ^d} denotes the {\ell^q} ball in {{\mathbb R}^d}:

\displaystyle \tilde B_q ^d := \{ x\in\mathbb R^d: \big(\sum_{j=1} ^d |x_j|^q\big)^\frac{1}{q}<1\}.

We then define { B_q ^d} to be the normalized {\ell^q} ball in {{\mathbb R}^d}, that is {B_q ^d=|\tilde B_q ^d|^{-\frac{1}{d}}\tilde B_q ^d}, so that {|B_q ^d|=1}. Of course we have that {B^d=B_2 ^d} and {Q^d=B_\infty ^d}.
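As a quick sanity check of the normalization, here is a small sketch that computes the volume of the unit {\ell^q} ball from the classical Gamma function formula {(2\Gamma(1+1/q))^d/\Gamma(1+d/q)} and the dilation factor {s} for which {s^d} times that volume equals {1}. The helper names are hypothetical, chosen only for this illustration.

```python
import math


def lq_ball_volume(d, q):
    """Volume of {x in R^d : sum |x_j|^q < 1}, via the classical Gamma-function formula."""
    return (2.0 * math.gamma(1.0 + 1.0 / q)) ** d / math.gamma(1.0 + d / q)


def normalizing_scale(d, q):
    """Factor s such that the dilated ball s * B has volume s^d * |B| = 1."""
    return lq_ball_volume(d, q) ** (-1.0 / d)


print(lq_ball_volume(2, 2))         # area of the unit disc
print(normalizing_scale(3, 1.0e6))  # for very large q the ball is almost the cube [-1, 1]^3
```

For very large {q} the scale approaches {1/2}, in agreement with {Q^d=[-\frac{1}{2},\frac{1}{2}]^d=B_\infty ^d}.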

The following bounds are known:

  • (Stein and Strömberg, 1983): There exists a numerical constant {c>0} such that\displaystyle \sup_K c_{1,d}(K) \leq c d \log d.
  • (Bourgain, 1986b), (Carbery, 1986): For {p\in(3/2,\infty]},\displaystyle \sup_K c_{p,d}(K) \leq c(p) ,where {c(p)>0} depends only on {p}.
  • (Stein and Strömberg, 1983): There exists a numerical constant {c'>0} such that\displaystyle  c_{1,d}(B^d) \leq c' d .
  • (Stein and Strömberg, 1983): For {1<p\leq \infty},\displaystyle  c_{p,d}(B^d) \leq c'(p) ,where {c'(p)>0} depends only on {p}.
  • (Bourgain, 1987),(Müller, 1990): For {1\leq q <\infty} and {1<p\leq \infty},\displaystyle c_{p,d}(B^d _q) \leq c''(p,q),where {c''(p,q)>0} depends only on {p,q}.
  • (Aldaz, 2008): We have that\displaystyle  \lim_{d\rightarrow +\infty} c_{1,d}(Q^d) =+\infty.

Following (Aldaz, 2008), a lower bound {c_{1,d}(Q^d) \geq c (\log d)^{1-o(1)}} as {d\rightarrow +\infty} was proved in (Aubrun, 2009).

Theorem 2 is a textbook theorem whose proof can be found in any graduate text in Real Analysis. I will only point out that the standard proof first establishes the weak bound (i) by means of a suitable covering lemma. The strong {L^p}-bound {(ii)} is then proved by interpolating between the weak {L^1} inequality (i) and the trivial {L^\infty} bound

\displaystyle  \|M_K(f)\|_{L^\infty(\mathbb R^d)} \leq \|f \|_{L^\infty(\mathbb R^d)}.

This method does not give optimal constants for the operator norm. One reason for that is that we don’t know the optimal constants for the weak {L^1} inequality! A more important reason is revealed by Aldaz’s result; at least in the case of the unit cube, such dimension free weak inequalities do not actually hold. A third reason is just that, in many cases, interpolation does not give the optimal constants. As a result most of the strong {L^p} dimension free inequalities for the maximal function start from {L^2({\mathbb R}^d)} and extrapolate to {L^p({\mathbb R}^d)} for {p<2}. One exception is the special case of the unit ball {B^d} where all the strong {L^p} inequalities are proved simultaneously. However the method there is particular to the Euclidean symmetry of the ball and does not seem to generalize to other convex bodies.

— 1.1. The dyadic maximal function —

For the purpose of this post it will actually be enough to consider the following model-case operator. We choose the Euclidean ball {B} for our convex body {K}. Moreover, instead of considering all dilations of the ball, we will only consider dyadic dilations. We can thus define the following dyadic version of the maximal function

\displaystyle  	M^\textnormal{dyad}_B(f)(x):= \sup_{k\in\mathbb Z} (|f|*(\chi_B)^{2^k})(x). \ \ \ \ \ (1)

It actually turns out that the object just defined is not as innocent as it looks. It is obvious that this dyadic maximal function is controlled by the `full’ maximal function. Conversely, one can often control, or at least gain some information on, the full maximal function from the dyadic one. Roughly speaking, if one knows how the maximal averages behave on dyadic dilations and has some information on the derivative of these averages with respect to the dilation parameter, then it is possible to `interpolate’ the information from the dyadic nodes to every dilation scale. I won’t explain how this is done since we will not actually need it here. You can however check the article of Bourgain (Bourgain, 1986b) which uses this principle to get the {L^2\rightarrow L^2} dimension free bounds for the maximal function.

— 2. The maximal function along a polynomial curve —

The second line of research involves the study of maximal averages with respect to `thin’ sets. These maximal operators are much more singular than the maximal functions considered in the previous paragraph and many of the standard tools (for example standard covering lemmas) do not apply any more. A typical situation is when the family of averaging sets consists of lower dimensional subvarieties of {{\mathbb R}^d}. Again here, there are two typical examples.

In the first case let us consider the dilations of a fixed variety in {{\mathbb R}^d}. A typical example of a {d-1}-dimensional variety in {{\mathbb R}^d} is the unit sphere {S^{d-1}\subset{\mathbb R}^d} which gives rise to the spherical maximal function:

\displaystyle \mathcal M_{\sigma}(f)(x):= \sup_{s>0}\int_{S^{d-1}}f(x-sy)d\sigma(y),

where {\sigma} is the induced Lebesgue measure on the unit sphere of {{\mathbb R}^d}.

Theorem 3 (Stein and Wainger, 1978; Bourgain, 1986a) Let {d\geq 2} and {p>d/(d-1)}. Then

\displaystyle \|\mathcal M_\sigma(f)\|_{L^p({\mathbb R}^d)} \leq C \|f\|_{L^p({\mathbb R}^d)},

where {C} is a constant that depends only on {d} and {p}.

The other typical example of maximal averages with respect to thin sets arises when one considers segments of a fixed {k}-dimensional submanifold of {{\mathbb R}^d}. Specializing even more, let us consider a polynomial surface

\displaystyle \vec P:\mathbb R^k \rightarrow \mathbb R^d, \quad \vec P(t)=(P_1(t),\ldots,P_d(t)),\quad t\in\mathbb R^k.

The related maximal operator here is defined as

\displaystyle \mathcal M_{\vec P} (f)(x):=\sup_{\epsilon>0}\frac{1}{\epsilon^k}\int_{|t|\leq \epsilon} |f(x-\vec P(t))|dt.

The following theorem gives the boundedness properties of {\mathcal M_{\vec P}} on {L^p} spaces.

Theorem 4 (Stein and Wainger, 1978) Let {1<p\leq \infty} and {f\in L^p({\mathbb R}^d)}. We have that

\displaystyle \| \mathcal M_{\vec P}(f)\|_{L^p({\mathbb R} ^d)} \leq c_{p,d,k} \|f\|_{L^p({\mathbb R} ^d)},

where the constant {c_{p,d,k}} depends only on {p,d,k}.

Observe the absence of an endpoint estimate on {L^1({\mathbb R}^d)}. In fact it is not known whether the operator {\mathcal M_{\vec P}} maps {L^1} to {L^{1,\infty}} and this is one of the big open problems in the area. There are several results `close’ to {L^1}:

Theorem 5 (Christ and Stein, 1987) Consider the maximal operator {\mathcal{M}_P} where {P(t)=(t,t^2,\ldots,t^d)}, {t\in{\mathbb R}} ({k=1}). Then {\mathcal M_P} maps {L\log L(B)} to {L^1(B)} where {B} is any bounded set of {\mathbb R^d}, that is locally.

This theorem is not the optimal known result but it is a good introduction to such theorems due to the (relative) simplicity of its proof. For sharper results see for example (Seeger et al., 2004).

— 2.1. A rewriting of the operator in a dyadic fashion —

I will from now on stick to the case {P(t)=(t,t^2,\ldots,t^d)}, that is our averaging set is a one-dimensional variety (curve) and our maximal function takes the form

\displaystyle  \mathcal M_P(f)(x)=\sup_{\epsilon>0}\frac{1}{2\epsilon}\int_{|t|\leq \epsilon} |f(x_1-t,x_2-t^2,\ldots,x_d-t^d)|dt.

My intention is to rewrite this operator in a form that resembles the dyadic maximal function (1). So let me fix an {\epsilon>0} and define the integer {k_o} by {2^{k_o-1} < \epsilon \leq 2^{k_o}}. We now have

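\displaystyle  \frac{1}{2\epsilon}\int_{|t|\leq \epsilon} |f(x_1-t,\ldots,x_d-t^d)|dt \leq \frac{1}{2^{k_o}} \sum_{j=-\infty} ^{k_o}\int_{2^{j-1} <|t| \leq 2^{j}} |f(x_1-t,\ldots,x_d-t^d)|dt .

Defining the dyadic maximal operator

\displaystyle \mathcal M_P ^{\textnormal{dyad}}(f)(x):=\sup_{j\in\mathbb Z} \frac{1}{2^j}\int_{2^{j-1} <|t| \leq 2^{j}} |f(x_1-t,\ldots,x_d-t^d)|dt,

each term of the previous sum is at most {2^j\mathcal M_P ^{\textnormal{dyad}}(f)(x)} and {\sum_{j\leq k_o}2^{j-k_o}=2}, so that {\mathcal M_P(f)(x)\leq 2\mathcal M_P ^{\textnormal{dyad}}(f)(x)}. Conversely, each dyadic average is at most twice the average over {|t|\leq 2^j}, so {\mathcal M_P ^{\textnormal{dyad}}(f)(x)\leq 2\mathcal M_P(f)(x)}. Together these estimates yield

\displaystyle \mathcal M_P(f)(x)\simeq \mathcal M_P ^{\textnormal{dyad}}(f)(x).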

Since the dyadic version of our operator is equivalent (up to numerical constants) to the original one, we will carry out the analysis for {\mathcal M_P ^{\textnormal{dyad}}(f)(x)} instead of {\mathcal M_P(f)(x)}.

— 2.2. Parabolic dilations —

In order to write the operator {\mathcal M_P ^{\textnormal{dyad}}(f)(x)} in a form resembling (1), we need to introduce ‘anisotropic dilations’. For {x=(x_1,\ldots,x_d)\in \mathbb R^d} and {s>0}, the anisotropic, or parabolic, dilations of {x} are defined as

\displaystyle \delta_s x = (sx_1,s^2x_2,\ldots,s^dx_d).

Observe that this dilation of {{\mathbb R}^d} matches the geometry of the curve {P=P(t)}. In fact the curve {P} is the orbit of the point {(1,1,\ldots,1)} (say) under the parabolic dilations operator {\delta_s}. That being said, let’s move on to defining the parabolic dilations of a locally integrable function {h} on {\mathbb R^d} as

\displaystyle h_s(x):=\frac{1}{s^\alpha}h(\delta_\frac{1}{s}x)=\frac{1}{s^\alpha} h\Big(\frac{x_1}{s},\frac{x_2}{s^2},\ldots,\frac{x_d}{s^d}\Big),

where {\alpha=1+2+\cdots+d}. In analogy with isotropic dilations we have that

\displaystyle \widehat h_s(\xi)=\hat h (\delta_s \xi),\quad \int h_s(x)dx = \int h(x) dx,

whenever the involved integrals make sense. It is a small step to extend the previous definition to finite Borel measures on {\mathbb R^d}. If {\mu} is such a measure we define the parabolic dilations of {\mu} by means of the formula

\displaystyle \widehat {d\mu_s}(\xi)=\widehat{d\mu}(\delta_s\xi).
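The algebra of the parabolic dilations is easy to experiment with. The following sketch, with hypothetical helper names `delta` and `curve`, checks on one example that the orbit of {(1,\ldots,1)} under {\delta_s} traces the curve, that {\delta_s\delta_t=\delta_{st}}, and that the dilations map the curve to itself, {\delta_s P(t)=P(st)}.

```python
def delta(s, x):
    """Parabolic dilation: the j-th coordinate (1-indexed) is multiplied by s**j."""
    return [s ** (j + 1) * xj for j, xj in enumerate(x)]


def curve(t, d):
    """The point P(t) = (t, t^2, ..., t^d)."""
    return [t ** j for j in range(1, d + 1)]


d = 4
ones = [1.0] * d
alpha = sum(range(1, d + 1))             # homogeneous dimension 1 + 2 + ... + d

orbit_point = delta(3.0, ones)           # should equal P(3)
composed = delta(2.0, delta(3.0, ones))  # should equal delta(6, ones)
on_curve = delta(2.0, curve(-0.5, d))    # should equal P(-1)
print(orbit_point, composed, on_curve, alpha)
```

The exponent {\alpha} is the power of {s} by which {\delta_s} scales volumes, which is why it appears in the normalization of {h_s}.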

Going back to the maximal function {\mathcal M_P ^{\textnormal{dyad}}(f)(x)}, consider the measure {d\mu_{2^j}} defined for every test function {\phi} as

\displaystyle \langle \phi,d\mu_{2^j}\rangle :=\frac{1}{2^j}\int_{ 2^{j-1}<|t|\leq 2^j}\phi(t,t^2,\ldots,t^d)dt=\int_{\frac{1}{2}<|t|\leq 1} \phi(\delta_{2^j}P(t))dt.

This notation suggests that the measures {d\mu_{2^j}} are parabolic dilations of a single measure {d\mu}. We will shortly see that this is in fact the case.

On the Fourier transform side we have that

\displaystyle \widehat {d\mu_{2^j}}(\xi)=\int_{\frac{1}{2}<|t|\leq 1} e^{-2\pi i(\xi_1 2^j t +\xi_2 (2^j)^2 t^2+\cdots+\xi_d (2^j)^d t^d)}dt=\widehat{d\mu}(\delta_{2^j}\xi),

where {d\mu} is the measure with Fourier transform

\displaystyle  	\widehat{d\mu}(\xi):=\int_{\frac{1}{2}<|t|\leq 1} e^{-2\pi i(\xi_1 t +\xi_2 t^2+\cdots+\xi_d t^d)}dt. \ \ \ \ \ (2)

If you would rather see how this measure acts on test functions this is also pretty obvious:

\displaystyle \langle \phi,d\mu\rangle :=\int_{ \frac{1}{2}<|t|\leq 1}\phi(t,t^2,\ldots,t^d)dt.

Thus for every {j\in\mathbb Z}, {d\mu_{2^j}} is the parabolic dilation of the measure {d\mu}, which is the reason for choosing the notation in the first place.
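For readers who like to see numbers, here is a rough numerical sketch of the scaling identity {\widehat{d\mu_{2^j}}(\xi)=\widehat{d\mu}(\delta_{2^j}\xi)}. Both sides are approximated with a plain midpoint rule; the function names, the sample frequency `xi` and the number of nodes are arbitrary choices of mine, so this is a consistency check under those assumptions and not a proof.

```python
import cmath
import math


def delta(s, xi):
    """Parabolic dilation: coordinate number m (1-indexed) is multiplied by s**m."""
    return [s ** (m + 1) * c for m, c in enumerate(xi)]


def dyadic_piece_hat(xi, j=0, n=4000):
    """Midpoint-rule value of (1/2^j) * integral over 2^(j-1) < |t| <= 2^j of
    exp(-2 pi i (xi_1 t + ... + xi_d t^d)) dt. For j = 0 this approximates formula (2)."""
    a, b = 2.0 ** (j - 1), 2.0 ** j
    h = (b - a) / n
    total = 0j
    for i in range(n):
        t = a + (i + 0.5) * h
        for u in (t, -t):  # the domain is symmetric, so both signs of t contribute
            phase = sum(c * u ** (m + 1) for m, c in enumerate(xi))
            total += cmath.exp(-2j * math.pi * phase)
    return total * h / b


xi = [0.3, -0.2, 0.1]
j = 2
direct = dyadic_piece_hat(xi, j=j)                    # average over 2 < |t| <= 4
dilated = dyadic_piece_hat(delta(2.0 ** j, xi), j=0)  # mu_hat evaluated at delta_{2^j} xi
print(abs(direct - dilated))
```

The two quadratures use nodes that correspond under the change of variables {t\mapsto 2^jt}, so they agree up to rounding. At {\xi=0} the routine returns the total mass {1} of {d\mu}.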

— 3. A unified approach to maximal convolution operators —

Recall the description of the dyadic maximal function with respect to the unit ball:

\displaystyle 	M^\textnormal{dyad}_B(f)(x)=\sup_{j\in\mathbb Z} (|f|*(\chi_B)^{2^j})(x).

On the other hand, using the parabolic dilations previously defined it is straightforward to check that {\mathcal{M}_P ^{\textnormal{dyad}}(f)(x)} can be written in the form

\displaystyle \mathcal{M}_P ^{\textnormal{dyad}}(f)(x)=\sup_{j\in\mathbb Z} (|f|*d\mu_{2^j})(x),

where {d\mu} is the measure defined in (2).

Note that the superscript {^ {2^j}} denotes isotropic dilations while the subscript {_{2^j}} denotes anisotropic or parabolic dilations. These two maximal operators have a different `geometry’ which is reflected by the different dilations, isotropic in one case and parabolic in the other case. We will overcome this issue by working with a metric on the Euclidean space that matches the geometry of the parabolic dilations. This essentially means we will be working on a space of homogeneous type. We will take up this issue later on in the discussion.

A second important difference between these maximal functions is that {M^\textnormal{dyad}_B} is defined with respect to a measure supported on a convex set in {\mathbb R^d} while {\mathcal{M}_P ^{\textnormal{dyad}}} is defined with respect to a measure supported on the one-dimensional curve {P(t)}. The day is saved by the fact that the manifold {t\rightarrow P(t)} has non-vanishing curvature around the point {t=0}, and thus the Fourier transform of the measure {d\mu} will have power decay at infinity.

The following strategy is inspired by Bourgain’s proof of the dimension independent {L^2\rightarrow L^2} bound for {M^\textnormal{dyad}_B(f)}. The initial step is to choose any finite Borel measure {d\nu} on {{\mathbb R}^d } and write:

\displaystyle  \begin{array}{rcl}  	\mathcal{M}_P ^{\textnormal{dyad}}(f)(x)&\leq& \sup_{j\in\mathbb Z} (|f|*d\nu_{2^j})(x) + \Big( \sum_{j\in\mathbb Z} \big|(|f|*(d\mu-d\nu)_{2^j})(x)\big|^2\Big)^\frac{1}{2} \\ &=: &T(f)(x)+ S(f)(x). \end{array}

The operator {S(f)} is a square function and the way to treat it is to understand the decay of the Fourier transform of the measure {d\mu}. The operator {T(f)} has a very similar form to our original operator. However here we have the freedom to choose the measure {d\nu} as we wish. The following paragraph explains why an appropriate choice of the measure {d\nu} gives a desirable estimate for {T(f)}.
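The splitting above rests on an elementary pointwise fact about sequences: for each {j} we have {a_j\leq b_j+|a_j-b_j|}, hence {\sup_j a_j\leq \sup_j b_j+\big(\sum_j|a_j-b_j|^2\big)^\frac{1}{2}}. Here is a small randomized sketch of that inequality, where the sequences `a` and `b` are stand-ins of my own for the values {(|f|*d\mu_{2^j})(x)} and {(|f|*d\nu_{2^j})(x)} at a fixed point {x}.

```python
import math
import random


def check_splitting(a, b):
    """Return (sup a, sup b + l2 norm of a - b) for two finite real sequences."""
    sup_a = max(a)
    sup_b = max(b)
    square_function = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return sup_a, sup_b + square_function


random.seed(0)
results = []
for _ in range(1000):
    a = [random.uniform(0, 1) for _ in range(20)]
    b = [random.uniform(0, 1) for _ in range(20)]
    results.append(check_splitting(a, b))

print(all(lhs <= rhs for lhs, rhs in results))
```

Replacing the supremum of the differences by their {\ell^2} norm looks wasteful, but it is exactly what makes Plancherel's theorem available for {S(f)}.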

— 4. Symmetric diffusion semi-groups —

We will use in an essential way Stein’s theorem on symmetric diffusion semi-groups. For details see (Stein, 1970). Here we recall the definition and the relevant theorem.

Theorem 6 For {t>0} let {T^t:L^p(\mathbb R^d)\rightarrow L^p(\mathbb R^d)}, {1\leq p \leq \infty}, be a family of operators such that {T^{t_1}\circ T^{t_2}=T^{t_1+t_2}} for every {t_1,t_2>0} and {T^0=Id}. Assume also that {\lim_{t\rightarrow 0}T^tf =f} in {L^2(\mathbb R ^d)}. Suppose that the family {\{T^t\}_{t>0}} satisfies the following properties:

  1. {\|T^t f\|_{L^p(\mathbb R^d)}\leq \| f\|_{L^p(\mathbb R^d)}}, {t>0}, {1\leq p \leq \infty} (contraction property).
  2. For every {t>0}, {T^t} is a self adjoint operator in {L^2(\mathbb R ^d)} (symmetry property).
  3. {T^t f \geq 0} if {f\geq 0}, {t>0} (positivity property).
  4. {T^t1=1}, {t>0} (conservation property).

We call the family {\{T^t\}_{t>0}} a symmetric diffusion semi-group. Let

\displaystyle T^*(f)(x)=\sup_{t>0}T^t(f)(x).

Then

\displaystyle  \begin{array}{rcl}  \| T^*(f)\|_{L^p(\mathbb R^d)} \leq c_p \| f\|_{L^p(\mathbb R ^d)} ,\quad 1<p\leq \infty, \quad f\in L^p(\mathbb R^d), \end{array}

where {c_p} depends only on {p}.

We have written down Stein’s theorem on the Euclidean space for simplicity but in fact it is a much more general theorem that applies to positive measure spaces.

Since we will consider convolution operators a special mention is in order. So, suppose that

\displaystyle  	T^t(f)(x):=(f*(\Delta_td\nu))(x) \ \ \ \ \ (3)

where {d\nu} is a probability measure and {\Delta_t} denotes isotropic or parabolic dilations (it makes no difference here).

Proposition 7 The family of operators {(T^t)_{t>0}} defined in (3) is a positive symmetric diffusion semi-group if and only if

\displaystyle \widehat{\Delta_{t_1}d\nu}(\xi)\widehat{\Delta_{t_2}d\nu}(\xi)=\widehat{\Delta_{t_1+t_2}d\nu}(\xi),

for every {t_1,t_2>0}.

Proof: All the semi-group properties are automatically satisfied for {T^t} and we only need to check that {T^{t_1}\circ T^{t_2}=T^{t_1+t_2}}. Taking Fourier transforms completes the proof. \Box

If this looks a bit too abstract for you, let us review two classical semi-groups.

The Poisson semi-group: Recall that the Poisson kernel is defined on {{\mathbb R}^d} as

\displaystyle  P(x):=c_d {(1+|x|^2)^{-\frac{(d+1)}{2}}},

where {c_d} is the appropriate dimensional constant so that {\|P\|_{L^1({\mathbb R}^d)}=1}. We consider the isotropic dilations of the Poisson kernel,

\displaystyle  P^t(x)=\frac{1}{t^d}P\Big(\frac{x}{t}\Big),

as usual. Now the family of operators

\displaystyle T^t(f)(x):= (f*P^t)(x),

is a symmetric diffusion semigroup. To see this we use the well known fact that {\widehat{P}(\xi)=e^{-2\pi|\xi|}} for {\xi\in\mathbb R^d}. We thus get that

\displaystyle \widehat{P^{t_1}}(\xi)\widehat{P^{t_2}}(\xi)=\hat P(t_1\xi)\hat P(t_2\xi)=e^{-2\pi(t_1+t_2)|\xi|}=\widehat{P^{t_1+t_2}}(\xi).

The Heat semi-group: The Heat kernel is defined on {{\mathbb R} ^d} as

\displaystyle H(x):=(4\pi)^{-\frac{d}{2}}e^{-\frac{|x|^2}{4}}.

Dilating isotropically by {\sqrt{t}} we get the Heat semi-group

\displaystyle H^{\sqrt{t}}(x)=(4\pi t)^{-\frac{d}{2}}e^{-\frac{|x|^2}{4t}}.

Using the fact that {\hat H(\xi)=e^{-4\pi^2|\xi|^2}} we can easily see that the Heat semigroup is a positive symmetric diffusion semi-group.
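If you want to double check the normalization of the Fourier transform used here, the following sketch compares, in dimension {d=1}, a midpoint-rule approximation of the Fourier transform of {H^{\sqrt t}} with the closed form {e^{-4\pi^2t|\xi|^2}}, and then tests the semi-group identity on one pair of times. The truncation window `half_width` and the number of nodes are assumptions of mine that are adequate for the sample values below.

```python
import math


def heat_hat_numeric(xi, t=1.0, half_width=40.0, n=20000):
    """Midpoint-rule Fourier transform (d = 1) of (4 pi t)^(-1/2) exp(-x^2 / (4 t))."""
    h = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        x = -half_width + (i + 0.5) * h
        kernel = math.exp(-x * x / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)
        # The kernel is even, so only the cosine part of exp(-2 pi i x xi) survives.
        total += kernel * math.cos(2.0 * math.pi * x * xi)
    return total * h


def heat_hat_exact(xi, t=1.0):
    """Closed form exp(-4 pi^2 t xi^2), that is H_hat evaluated at sqrt(t) * xi."""
    return math.exp(-4.0 * math.pi ** 2 * t * xi * xi)


xi = 0.1
product = heat_hat_numeric(xi, t=0.5) * heat_hat_numeric(xi, t=1.5)
combined = heat_hat_numeric(xi, t=2.0)
print(abs(heat_hat_numeric(0.2) - heat_hat_exact(0.2)), abs(product - combined))
```

Note that the semi-group parameter is {t} while the dilation parameter is {\sqrt t}, which is why the exponents add.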

We just saw two classical examples of isotropic semi-groups on the Euclidean space. Bourgain used the Poisson semi-group in order to get dimension-free bounds for the maximal function associated with a convex body. Our maximal function here is quite different. In particular we have seen that it is defined as a convolution operator with respect to the parabolic dilations of a given measure. We thus need to define a `parabolic’ semi-group that matches the geometry of our dilations.

— 5. Parabolic Poisson kernel —

We begin by defining the appropriate norm function that respects the geometry of the parabolic dilations.

Let {\rho:{\mathbb R}^d\rightarrow {\mathbb R}^+\cup\{0\}} be a function such that, for every {x,y\in{\mathbb R}^d} we have

  • {\rho(x)=0\iff x=0}.
  • {\rho(-x)=\rho(x)}.
  • {\rho(x+y)\leq c (\rho(x)+\rho(y))}, for some constant {c>0}.
  • {\rho(\delta_s x)=s \rho(x)}, for any {s>0}.

Then {\rho} is a parabolic (quasi) norm. We have that {({\mathbb R}^d,dx,\rho)} is a space of homogeneous type. Observe that the `balls’ {\{\rho(x)<r \}} have volume of the order {r^\alpha}, where {\alpha=1+2+\cdots+d}. Thus this space has homogeneous dimension {\alpha}. Given a dilation operator, the norm function is not unique, though all parabolic norm functions are equivalent up to dimensional constants. However, here we are interested in the dependence of the operator norms on the dimension so the specific choice of the norm function turns out to be important. For the dilation operator {\delta_s(x)=(sx_1,s^2x_2,\ldots,s^dx_d)}, the following functions are natural examples of parabolic norms:

  • {\rho_1(x):= |x_1| +|x_2|^\frac{1}{2}+\cdots+|x_d|^\frac{1}{d},}
  • {\rho_2(x):= \max_{1\leq j \leq d} |x_j|^\frac{1}{j}.}
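Both examples are easy to test numerically. The sketch below (helper names are mine) checks the homogeneity {\rho(\delta_sx)=s\rho(x)} on a sample point, together with the comparison {\rho_2\leq\rho_1\leq d\,\rho_2}, which is one instance of the equivalence up to dimensional constants mentioned above.

```python
def delta(s, x):
    """Parabolic dilation: the j-th coordinate (1-indexed) is multiplied by s**j."""
    return [s ** (j + 1) * xj for j, xj in enumerate(x)]


def rho1(x):
    """rho_1(x) = |x_1| + |x_2|^(1/2) + ... + |x_d|^(1/d)."""
    return sum(abs(xj) ** (1.0 / (j + 1)) for j, xj in enumerate(x))


def rho2(x):
    """rho_2(x) = max over j of |x_j|^(1/j)."""
    return max(abs(xj) ** (1.0 / (j + 1)) for j, xj in enumerate(x))


x = [0.7, -2.0, 0.05, 3.0]
s = 2.5
print(rho1(delta(s, x)), s * rho1(x))
print(rho2(delta(s, x)), s * rho2(x))
print(rho2(x), rho1(x), len(x) * rho2(x))
```

The factor {d} in the comparison is precisely the kind of dimensional constant we cannot afford to lose here, which is why the choice of {\rho} matters.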

Formally, the Poisson kernel for our space of homogeneous type should look like

\displaystyle \widehat{P^\rho}(\xi):=e^{-\rho(\xi)}.

Observe that by dilating parabolically we get for {t>0}

\displaystyle \widehat{P^\rho _t}(\xi)=e^{-\rho(\delta_t \xi)}=e^{-t\rho(\xi)},

using the homogeneity of {\rho} with respect to the parabolic dilations. This property alone shows that the family of operators

\displaystyle T^t(f):=P_t ^\rho * f

has the desired semigroup structure, much like the Poisson kernel on the Euclidean space. However, it is not clear yet what meaning to give to {P^\rho}. In particular, when defining {\widehat{P^\rho}}, we need to make sure that this function is the Fourier transform of a probability measure. This is necessary in order for {T^t} to be a positive symmetric diffusion semi-group of operators. In fact this is one of the factors that affects how we choose the parabolic norm {\rho}.

Let us quickly see why this is the case for the function {\rho_1}:

Proposition 8 The function {e^{-\rho_1(\xi)}} is the Fourier transform of a probability measure. In particular, there is a non-negative function {P^{\rho_1}\in L^1({\mathbb R}^d)} such that {\widehat{P^{\rho_1}}(\xi)=e^{-\rho_1(\xi)}.}

This is a consequence of a well known theorem of Pólya:

Theorem 9 (Pólya) Let {f} be a function on {{\mathbb R}} which satisfies the following conditions for all {t\in{\mathbb R}}

  • {f(0)=1, \quad f(t)\geq 0,\quad f(t)=f(-t)}
  • The function {f} is decreasing, continuous and convex on {[0,+\infty)}.

Then {f} is the Fourier transform of a probability measure.

In order to see why Proposition 8 is true, we apply Pólya’s theorem to every function {e^{-|\xi_j|^\frac{1}{j}}} for {j=1,2,\ldots,d}. We then get that {e^{-|\xi_j|^\frac{1}{j}}=\widehat{d\nu^{(j)}}(\xi_j).} Defining

\displaystyle d\nu:=d\nu^{(1)}\otimes\cdots\otimes d\nu^{(d)}

we readily see that {e^{-\rho_1(\xi)}=\widehat{d\nu}(\xi)}.
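The only hypothesis of Pólya's theorem that is not completely obvious for {t\mapsto e^{-t^\frac{1}{j}}} is convexity on {[0,+\infty)}. It follows from composing the convex decreasing function {e^{-u}} with the concave function {u=t^\frac{1}{j}}, but here is also a crude numerical sketch that inspects centered second differences on a grid. The grid step, the range and the function names are arbitrary choices of mine; this is an illustration and does not replace the argument.

```python
import math


def polya_profile(t, j):
    """The function t -> exp(-t^(1/j)) on [0, infinity)."""
    return math.exp(-t ** (1.0 / j))


def min_second_difference(j, step=0.01, count=2000):
    """Smallest centered second difference of the profile on the grid 0, step, 2*step, ..."""
    smallest = float("inf")
    for i in range(1, count + 1):
        left = polya_profile((i - 1) * step, j)
        mid = polya_profile(i * step, j)
        right = polya_profile((i + 1) * step, j)
        smallest = min(smallest, left - 2.0 * mid + right)
    return smallest


for j in range(1, 7):
    print(j, min_second_difference(j) > 0.0)
```

A positive second difference at every grid point is what convexity predicts; the remaining hypotheses ({f(0)=1}, evenness in {\xi_j}, monotonicity) can be read off the formula.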

The following statement is just an application of Stein’s general semi-group theorem on the parabolic semi-group just constructed.

Corollary 10 Let {(T^t)_{t>0}} be the family of operators defined as

\displaystyle T^t(f)(x):=(f*d\nu_t)(x),

where {d\nu} is the measure of Proposition 8 and {d\nu_t} denotes the parabolic dilations of {d\nu}:

\displaystyle \widehat{d\nu_t}(\xi)=\widehat{d\nu}(\delta_t\xi).

Let us define {T^*(f)(x):=\sup_{t>0} (f*d\nu_t)(x)}. Then

\displaystyle \| T^*(f)\|_{L^p({\mathbb R}^d)}\leq c_p \|f \|_{L^p({\mathbb R}^d)},

where the constant {c_p} depends only on {p}.

— 6. The square function estimate —

We recall the basic estimate

\displaystyle  \begin{array}{rcl}  	\mathcal{M}_P ^{\textnormal{dyad}}(f)(x)&\leq& \sup_{j\in\mathbb Z} (|f|*(d\nu)_{2^j})(x) \\ \\ && 	+ \Big( \sum_{j\in\mathbb Z} \big|(|f|*(d\mu-d\nu)_{2^j})(x)\big|^2\Big)^\frac{1}{2} \\ \\ &=&T(f)(x)+ S(f)(x). \end{array}

Now we have a good candidate for the choice of the measure {d\nu}, namely the measure constructed in Proposition 8. Note also that any other probability measure corresponding to a different parabolic norm {\rho} will be as good, provided we can prove it is well defined! Corollary 10 takes care of the first term in the previous estimate, and in fact for all {1< p \leq\infty}. We have

\displaystyle  \| \mathcal{M}_P ^{\textnormal{dyad}}(f) \|_{L^2({\mathbb R}^d)} \leq c_2 \|f\|_{L^2({\mathbb R}^d)}+ \|S(f)\|_{L^2({\mathbb R}^d)},

where {c_2>0} is just a numerical constant. Setting {d\lambda:=d\mu-d\nu} and using Plancherel’s theorem, we have

\displaystyle  \begin{array}{rcl}  \| S(f)\|_{L^2(\mathbb R^d)}&=& \bigg \| \bigg(\sum_{k\in \mathbb Z}|f*d\lambda_{2^k}|^2\bigg)^\frac{1}{2}\bigg\| _{L^2(\mathbb R^d)} \\ \\ &=&\bigg( \sum_{k\in \mathbb Z}\int_{\mathbb R ^d}|(f*d\lambda_{2^k})(x)|^2 dx\bigg) ^\frac{1}{2}\\ \\ &=& \bigg( \sum_{k\in \mathbb Z}\int_{\mathbb R ^d}|\hat f(\xi)|^2|\widehat{d\lambda_{2^k}}(\xi)|^2 d\xi\bigg) ^\frac{1}{2}\\ &\leq& \sup_{\xi\in\mathbb R ^d} \| \widehat {d\lambda_{2^k}}(\xi)\|_{\ell^2(\mathbb Z)} \| f\|_{L^2(\mathbb R^d)} . \end{array}

Here of course we denote {\|\widehat {d\lambda_{2^k}}(\xi)\| _{\ell^2(\mathbb Z)}=\big(\sum_{k\in \mathbb Z}|\widehat{d\lambda_{2^k}}(\xi)|^2\big)^\frac{1}{2}}.

Theorem 11 Let {d\lambda=d\mu -d\nu} where

\displaystyle \widehat{d\mu}(\xi)=\int_{\frac{1}{2}<|t|\leq 1} e^{-2\pi i(\xi_1t+\cdots+\xi_d t^d)}dt

and {\widehat{d\nu}(\xi)=\widehat {P^{\rho_1}}(\xi)} is the Fourier transform of the measure defined in Proposition 8. Then

\displaystyle \sup _{\xi\in\mathbb R^d} \|\widehat {d\lambda_{2^k}}(\xi)\| _{\ell^2(\mathbb Z)} \lesssim d.

For the proof of this statement we will need the following simple estimate on oscillatory integrals with polynomial phase, due to Vinogradov:

Lemma 12 (Vinogradov) For any positive integer {d} we have

\displaystyle  \bigg|\int_{\frac{1}{2}<|t|\leq 1} e^{-2\pi i (\xi_1t+\cdots+\xi_d t^d)}dt \bigg| \lesssim \frac{1}{(\max_{1\leq j \leq d}|\xi_j|)^\frac{1}{d}}.

This is a special case of a more general lemma due to Vinogradov. The proof is an easy consequence of a corresponding sub-level set estimate. For a proof see for example (Parissis, 2008).

Proof of Theorem 11: Let us set {|\xi_{j_o}|^\frac{1}{j_o}:=\max_{1\leq j \leq d}|\xi_j |^\frac{1}{j}} and {A:=|\xi_{j_o}|^{-\frac{1}{j_o}}}. Now for `large’ {k}, {2^k>A}, we write

\displaystyle  \begin{array}{rcl}  	|\widehat{d\lambda_{2^k}}(\xi)|&=& |(\widehat {d\mu}-\widehat {d\nu})(\delta_{2^k}\xi)|\leq |\widehat{d\mu}(\delta_{2^k}\xi)|+|\widehat{d\nu}(\delta_{2^k}\xi)|\\ \\ 	&\lesssim& \frac{1}{|\xi_{j_o}|^\frac{1}{d} 2^\frac{kj_o}{d}}+\frac{1}{2^k |\xi_{j_o}|^\frac{1}{j_o}}, \end{array}

using Vinogradov’s Lemma for the first term and the elementary bound {e^{-x}\leq 1/x}, {x>0}, together with {\rho_1(\xi)\geq |\xi_{j_o}|^\frac{1}{j_o}}, for the second. Summing in {k} for {2^k>A} we get

\displaystyle \bigg(\sum_{2^k>A}	|\widehat{d\lambda_{2^k}}(\xi)|^2\bigg)^\frac{1}{2}\lesssim \frac{1}{|\xi_{j_o}|^\frac{1}{d} } A^{-\frac{j_o}{d}}\frac{1}{1-2^{-\frac{j_o}{d}}}+\frac{1}{|\xi_{j_o}|^\frac{1}{j_o}}\frac{1}{A}\lesssim \frac{d}{j_o}\leq d.
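The linear growth in {d} enters through the geometric series factor {(1-2^{-\frac{j_o}{d}})^{-1}}. Since {1-2^{-x}\geq x/2} for {0\leq x\leq 1} (a concave function lies above its chord), this factor is at most {2d/j_o}. The constant {2} is my own bookkeeping and is not taken from the paper; the short sketch below simply confirms it over a range of dimensions.

```python
import math


def geometric_factor(j, d):
    """The factor 1 / (1 - 2^(-j/d)) produced by summing the geometric series."""
    return 1.0 / (1.0 - 2.0 ** (-j / d))


# Largest value of geometric_factor(j, d) * (j / d) over 1 <= j <= d <= 200.
worst_ratio = max(
    geometric_factor(j, d) * j / d
    for d in range(1, 201)
    for j in range(1, d + 1)
)
print(worst_ratio, 1.0 / math.log(2.0))
```

The ratio stays between {1/\log 2} and {2}, so the factor really is of size {d/j_o} and the loss cannot be avoided with this choice of {\rho_1}.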

On the other hand, for `small’ {k}, {2^k\leq A}, the following estimate is relevant

\displaystyle  \begin{array}{rcl}  	|\widehat{d\lambda_{2^k}}(\xi)|&=& |(\widehat {d\mu}-\widehat {d\nu})(\delta_{2^k}\xi)|\leq |\widehat {d\mu}(\delta_{2^k}\xi)-1|+|\widehat {d\nu}(\delta_{2^k}\xi)-1|\\ \\ 	&\lesssim & \sum_{j=1} ^d \frac {2^{kj}|\xi_j|}{j+1} +2^{k}d|\xi_{j_o}|^\frac{1}{j_o}\lesssim \log d \ |\xi_{j_1}|2^{kj_1}+2^{k}d|\xi_{j_o}|^\frac{1}{j_o}, \end{array}

for some {j_1\in\{1,2,\ldots,d\}}. Summing up the estimates for small {k} we get

\displaystyle  \begin{array}{rcl}  \bigg(\sum_{2^k\leq A} |\widehat{d\lambda_{2^k}}(\xi)|^2\bigg)^\frac{1}{2} &\lesssim & \log d \ \bigg(\sum_{2^k\leq A}(|\xi_{j_1}|2^{k j_1})^2 \bigg)^\frac{1}{2} \\ \\ && + d |\xi_{j_o}|^\frac{1}{j_o} \bigg(\sum_{2^k\leq A}(2^k)^2 \bigg)^\frac{1}{2} \\ \\ &\lesssim & \log d \ |\xi_{j_1}| A^{j_1}+ d| \xi_{j_o}|^\frac{1}{j_o} A\lesssim d. \end{array}

Thus we have proved that for any {\xi} we have {\|\widehat {d\lambda_{2^k}}(\xi)\| _{\ell^2(\mathbb Z)} \lesssim d} as desired. \Box
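To get a feeling for the quantity just estimated, here is a rough numerical sketch in dimension {d=2} that evaluates a truncated version of the {\ell^2} sum at one sample frequency. Everything about the setup is an assumption of mine: the midpoint rule for {\widehat{d\mu}}, the sample point `xi`, and above all the truncation to {-12\leq k\leq 3}, which is chosen so that the oscillatory integrals are still resolved by the quadrature. The output is therefore only an illustration of the two regimes of the proof and not a computation of the supremum.

```python
import cmath
import math


def delta(s, xi):
    """Parabolic dilation: coordinate number m (1-indexed) is multiplied by s**m."""
    return [s ** (m + 1) * c for m, c in enumerate(xi)]


def rho1(xi):
    """rho_1(xi) = |xi_1| + |xi_2|^(1/2) + ... + |xi_d|^(1/d)."""
    return sum(abs(c) ** (1.0 / (m + 1)) for m, c in enumerate(xi))


def mu_hat(xi, n=4000):
    """Midpoint-rule approximation of formula (2), the integral over 1/2 < |t| <= 1."""
    h = 0.5 / n
    total = 0j
    for i in range(n):
        t = 0.5 + (i + 0.5) * h
        for u in (t, -t):
            phase = sum(c * u ** (m + 1) for m, c in enumerate(xi))
            total += cmath.exp(-2j * math.pi * phase)
    return total * h


def lambda_hat(xi, k):
    """(mu_hat - nu_hat) evaluated at delta_{2^k} xi, with nu_hat = exp(-rho_1)."""
    scaled = delta(2.0 ** k, xi)
    return mu_hat(scaled) - math.exp(-rho1(scaled))


xi = [1.0, -0.5]
terms = {k: abs(lambda_hat(xi, k)) for k in range(-12, 4)}
truncated_norm = math.sqrt(sum(v * v for v in terms.values()))
print(truncated_norm)
```

For very negative {k} both transforms are close to {1} and the terms are tiny, while for large {k} each term is trivially at most {2} and it is the decay supplied by Vinogradov's Lemma that makes the full sum converge.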

— 7. Improving the linear bound —

I will give a very brief description of how to prove the logarithmic bound in the dimension {d}. The main difference with the proof described above is the choice of the parabolic norm function {\rho}. One first needs to observe that there is an improvement over Theorem 11 if {\rho_1} is replaced by the norm function

\displaystyle \rho_3(\xi):=\sum_{j=0} ^{N-1}\max_{2^j\leq \ell< 2^{j+1}}{|\xi_\ell|^\frac{1}{\ell}},

where we assume that {d=2^N} for some positive integer {N}; this already gives the general case via a simple argument. The way to get this gain was introduced in (Parissis, 2008) and consists of dividing the Euclidean space in `dyadic blocks’ of dimensions. One can then show by induction on the index of the dyadic block that in fact, with the previous choice {\rho_3}, we have that

\displaystyle \sup _{\xi\in\mathbb R^d} \|\widehat {d\mu_{2^k}}(\xi) - e^{-2^k\rho_3(\xi)}\| _{\ell^2(\mathbb Z)} \lesssim \log d.

The problem now is that one needs to make sure that there is a probability measure on {{\mathbb R}^d}, let’s call it {d\nu_3}, such that

\displaystyle  \widehat{d\nu_3}(\xi)=e^{-\rho_3(\xi)}.

The presence of {\max} in the definition of {\rho_3} makes this a pretty hard task. We can however consider another parabolic norm {\rho_4} which is equivalent, up to numerical constants, to {\rho_3}. Indeed, if we define

\displaystyle \rho_4(\xi):=\sum_{j=0} ^{N-1}\bigg(\sum_{2^j\leq \ell< 2^{j+1}}{|\xi_\ell|^\frac{2^j}{\ell}}\bigg)^\frac{1}{2^j},

it is easy to check that {\rho_3(\xi)\simeq \rho_4(\xi)}, where as usual the implied constants do not depend on anything. For {\rho_4} it is possible to show that there exists a probability measure (in fact a non-negative {L^1} function) {P^{\rho_4}(x)dx} such that {\widehat{P^{\rho_4}}(\xi)=e^{-\rho_4(\xi)}}.
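
The equivalence of the two norm functions can be checked one dyadic block at a time. For fixed {j}, write {a_\ell:=|\xi_\ell|^{1/\ell}} for the {2^j} indices {2^j\leq \ell<2^{j+1}} (the notation {a_\ell} is introduced only for this remark), so that {|\xi_\ell|^{2^j/\ell}=a_\ell^{2^j}}. Comparing a sum of {2^j} non-negative terms with its largest term gives

\displaystyle \max_{2^j\leq \ell<2^{j+1}} a_\ell \leq \bigg(\sum_{2^j\leq \ell<2^{j+1}} a_\ell^{2^j}\bigg)^\frac{1}{2^j}\leq (2^j)^{\frac{1}{2^j}}\max_{2^j\leq \ell<2^{j+1}} a_\ell\leq \sqrt{2}\max_{2^j\leq \ell<2^{j+1}} a_\ell,

since {n^{1/n}\leq \sqrt 2} whenever {n} is a power of {2}. Summing over {j} yields {\rho_3(\xi)\leq\rho_4(\xi)\leq\sqrt 2\,\rho_3(\xi)}.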

— 8. Some open questions —

Let me just rewrite the statement of the main theorem presented here.

Theorem 13 Let {\mathcal {M}_P} denote the maximal function along the polynomial curve {P(t)=(t,t^2,\ldots,t^d)}:

\displaystyle \mathcal{M}_P(f)(x)=\sup_{\epsilon>0} \frac{1}{2\epsilon}\int_{|t|\leq \epsilon}|f(x-P(t))|dt.

Then, for all {f\in L^2({\mathbb R}^d)},

\displaystyle \|\mathcal{M}_{P}(f)\|_{L^2({\mathbb R}^d)} \leq c \log d \ \|f\|_{L^2({\mathbb R}^d)},

where {c>0} is an absolute constant.

There is an aspect of the statement of this theorem which is a bit unsatisfactory: {d} here is both the dimension of the space and the degree of the curve. This is a bit confusing since in my opinion there should be no dependence on the dimension of the space here. However, in order to see this one needs to somehow `decouple’ the dependence on the dimension of the space from the dependence on the degree of the curve. From the proof of the theorem it is clear that the factor {\log d} comes from the degree of the polynomial curve. It is not so clear however what would happen if one considered the curve {Q(t)=(t^{n_1},\ldots,t^{n_d})} instead, where say {n_1<n_2<\ldots<n_d}.

Question 14: Let {\mathcal{M}_Q} denote the maximal function associated with the curve {Q(t)=(t^{n_1},\ldots,t^{n_d})}, where {n_1<n_2<\ldots<n_d} are positive integers. Is it true that

\displaystyle \|\mathcal M_Q\|_{L^2({\mathbb R}^d)\rightarrow L^2({\mathbb R}^d)} \lesssim \log n_d?

More generally, let {\vec P:{\mathbb R}^k\rightarrow{\mathbb R}^d} denote the polynomial map {\vec P(t)=(P_1(t),\ldots,P_d(t))}, {t\in{\mathbb R}^k}, where each {P_j} is of degree at most {N}. Can we describe the dependence of the norm {\|\mathcal M_{\vec P}\|_{L^2({\mathbb R}^d)\rightarrow L^2({\mathbb R}^d)} } on the parameters {d,k,N}?

Another obvious open end is whether the logarithmic bound of the theorem is optimal:

Question 15: Is there a function {f\in L^2({\mathbb R}^d)} such that

\displaystyle 	\|\mathcal M_P(f)\|_{L^2({\mathbb R}^d)} \gtrsim \log d \|f\|_{L^2({\mathbb R}^d)}.

Observe that if one considers the corresponding singular integral (the Hilbert transform along a polynomial curve), then this question has a positive answer.

Finally, I think it would be interesting to see if the bound of the theorem extends to {L^p({\mathbb R}^d)} for {p\neq 2}. Observe that for {p>2} we automatically get a bound by interpolating with the trivial {L^\infty\rightarrow L^\infty} bound. We can’t possibly know if these bounds are optimal though, so the previous question becomes relevant for any {p\in(1,\infty).} In combination with the first question, I think it would be interesting to see if this operator satisfies dimension-free bounds for {p<3/2}. Observe that for the maximal function associated with the Euclidean cube, we still don’t know the answer to this question.
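
To make the interpolation remark concrete, here is a sketch in which the constants depending on {p} are left implicit. For {2<p<\infty} write {\frac{1}{p}=\frac{\theta}{2}+\frac{1-\theta}{\infty}}, that is {\theta=\frac{2}{p}}. Since {\|\mathcal M_P\|_{L^\infty\rightarrow L^\infty}\leq 1}, the Marcinkiewicz interpolation theorem combined with the {L^2} bound of the theorem gives

\displaystyle \|\mathcal M_P\|_{L^p({\mathbb R}^d)\rightarrow L^p({\mathbb R}^d)}\lesssim_p (\log d)^{\frac{2}{p}},\qquad 2<p<\infty.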

Question 16: What is the dependence on {d} of the operator norm {\|\mathcal M_{ P}\|_{L^p({\mathbb R}^d)\rightarrow L^p({\mathbb R}^d)} } for {p\neq 2}? In particular it would be interesting to study this for {1<p<3/2}.

— 9. References —

Aldaz, J. M. 2008. The weak type (1, 1) bounds for the maximal function associated to cubes grow to infinity with the dimension, available at 0805.1565.

Aubrun, Guillaume. 2009. Maximal inequality for high-dimensional cubes, available at 0902.4305v2.

Bourgain, Jean. 1986a. Averages in the plane over convex curves and maximal operators, J. Analyse Math. 47, 69–85. MR874045.

Bourgain, Jean. 1986b. On the {L^p}-bounds for maximal functions associated to convex bodies in {{\mathbb R}^n}, Israel J. Math. 54, no. 3, 257–265. MR853451.

Bourgain, Jean. 1987. On dimension free maximal inequalities for convex symmetric bodies in {{\mathbb R}^n}, Geometrical aspects of functional analysis (1985/86), pp. 168–176. MR907693.

Bourgain, Jean. 1986. On high-dimensional maximal functions associated to convex bodies, Amer. J. Math. 108, no. 6, 1467–1476. MR868898.

Carbery, Anthony. 1986. An almost-orthogonality principle with applications to maximal functions associated to convex bodies, Bull. Amer. Math. Soc. (N.S.) 14, no. 2, 269–273. MR828824.

Christ, Michael and Elias M. Stein. 1987. A remark on singular Calderón-Zygmund theory, Proc. Amer. Math. Soc. 99, no. 1, 71–75. MR866432.

Müller, Detlef. 1990. A geometric bound for maximal functions associated to convex bodies, Pacific J. Math. 142, no. 2, 297–312. MR1042048.

Parissis, Ioannis. 2010a. Logarithmic dimension bounds for the maximal function along a polynomial curve, J. Geom. Anal. 20, no. 3, 771–785. MR2610899, available at 0810.4508.

Parissis, Ioannis R. 2008b. A sharp bound for the Stein-Wainger oscillatory integral, Proc. Amer. Math. Soc. 136, 963–972, available at 0709.1466.

Seeger, Andreas, Terence Tao, and James Wright. 2004. Singular maximal functions and Radon transforms near {L^1}, Amer. J. Math. 126, no. 3, 607–647. MR2058385.

Stein, E. M. and J.-O. Strömberg. 1983. Behavior of maximal functions in {{\mathbb R}^n} for large {n}, Ark. Mat. 21, no. 2, 259–269. MR727348.

Stein, Elias M. 1970. Topics in harmonic analysis related to the Littlewood-Paley theory, Annals of Mathematics Studies, No. 63, Princeton University Press, Princeton, N.J. MR0252961.

Stein, Elias M. and Stephen Wainger. 1978. Problems in harmonic analysis related to curvature, Bull. Amer. Math. Soc. 84, no. 6, 1239–1295. MR508453.


About ioannis parissis

I'm a postdoc researcher at the Center for mathematical analysis, geometry and dynamical systems at IST, Lisbon, Portugal.