Tomorrow I am traveling to Heraklion, Crete, where among other things I will give a talk on Discrepancy in two Dimensions. The talk is based on a recent paper with D. Bilyk, M. Lacey and A. Vaghasrhakyan. Here follows an outline of the talk which can also be used as an easier first reading of the paper (which is admittedly quite technical).
1. Introduction.
Everything will take place in the unit cube . For
we write
for the rectangle anchored at and at
. Let
be an
-point distribution in
:
Definition 1 The discrepancy function of
is
The first term in the previous definition is sometimes referred to as the counting part:
Obviously this term counts the number of points of
inside
. We call the second term in the definition of the discrepancy function `the linear part’:
Here
denotes the Lebesgue measure of a set
. The linear part expresses the expected number of points in
if we pick the points uniformly and independently at random.
It turns out that the `size’ of this function (in many different senses) must necessarily grow to infinity with
Question 2 Is there a function
such that
where
as
? Some typical choices for the norm
are
,
,
,
.
2. Motivation.
2.1. Koksma-Hlawka inequality
Let be a function of bounded variation
(in the sense of Hardy and Krause). Then
In dimension the variation of a
function can be expressed as
. In higher dimensions there is an expression for
involving the partial derivatives of
, assuming sufficient smoothness. Observe that if we replace
by
in the Koksma-Hlawka inequality, we recover the definition of the discrepancy function on the left hand-side.
Roughly speaking, the Monte-Carlo numerical integration scheme consists of picking random i.i.d points inside
and approximating the integral of
over the unit cube by the average of
over
. As we shall see shortly, the discrepancy function for
points selected independently and at random inside
satisfies
which by the Koksma-Hlawka inequality implies an error in the integration of
of the order
.
We shall see that there exist point distributions that satisfy . By the way, these point distributions are extremal for discrepancy; one can never do better than that. However, using this selection of points, one integrates with error of the order
, which is of course a huge improvement over the i.i.d. selection of points. Again, roughly speaking, this is the main idea behind the quasi Monte-Carlo integration scheme. Here the terminology is a bit deceiving as we will see that there is nothing random about point distributions that are extremal for discrepancy.
2.2. Uniformly distributed sequences in .
Another application of two dimensional discrepancy relates to the notion of uniform distribution.
Definition 3 A sequence
is said to be uniformly distributed in
if
for all
.
We can define the discrepancy function associated with a sequence
as
Thus, the sequence is uniformly distributed in
if
as
Of course this is just a weak, qualitative statement. It is many times desirable to replace the
with an explicit estimate in
.
Example 1 For
, let us consider the sequence
, where
is the fractional part. Then we have the estimate
The consideration of uniformly distributed sequences leads naturally to the following question:
Question 4 Is there a function
(as
) such that
for infinitely many
?
Questions 2 and 4 are equivalent in the sense that finding a function in Question 2 gives us the existence of a function
in Question 4. This is due to Roth (’54). For example, to see that Question 2 implies Question 4 for a sequence
justs consider the point distribution
. That was the initial motivation for studying the discrepancy function in two dimensions and it turns out that both Questions 2 and 4 have an affirmative answer with the lower bound function being
and
respectively.
There is a subtle distinction between discrepancy for point distributions in the unit square, and uniformly distributed sequences in , that is between the functions
and
. The difference is not so much in the finite/infinite setting but rather on a distinction between a `dynamic’ setting in the case of
and a `static’ one in
. Indeed, in the case of sequences in
, the definition of
depends on all the `initial segments’
simultaneously and the discrepancy changes drastically if we rearrange the terms of the sequence. No such phenomenon occurs in the static two dimensional setting where we look at the discrepancy of the point distribution
for the whole set
and the ordering of the points plays absolutely no role in the definition of the discrepancy. Thus a `dynamic’ problem in
is equivalent to a `static’ problem in
. More generally, there is always a `dynamic’ discrepancy problem in
dimensions which is equivalent to a `static’ problem in
dimensions for
. This rectangle has area
and contains
points of the set
. In discrepancy language this means that
. As we have already mentioned, one can achieve an
norm of the order
which shows that lattices are very far from being extremal for discrepancy. Any other consideration of points where a very thin axis-parallel rectangle can contain `many’ points of
will be also `bad ‘ for discrepancy.
3. `Bad’ and `Good’ sets for discrepancy.
3.1. Highly structured sets are bad.
Here it is implied that the `structure’ is consistent with the axis-parallel nature of the discrepancy function. Let us for example consider for example a lattice in
. Consider a very thin axis parallel rectangle of the form
for some
.
3.2. Random sets are bad.
Let be chosen independently and uniformly in
. Consider the rectangle
. Then
which implies that
This in turn means that with high probability. Another way to express this is to observe that by the Central limit theorem we get that
3.3. The Roth heuristic.
This expresses the empirical observation that extremal sets for discrepancy are such that each dyadic box of volume contains exactly one point of the distribution
. A typical example that satisfies the Roth heuristic (and is extremal for discrepancy) is the van der Corput set of
points: every dyadic box of volume
contains exactly one point of the van der Corput set. We will go into a detailed analysis of the van der Corput set in the subsequent paragraphs.
4. Lower bounds for discrepancy
As we have mentioned in the introduction, no matter how clever one is when choosing the point distribution , the discrepancy must always grow to infinity. The most typical instance of this principle is the following
lower bounds which are essentially optimal:
Theorem 5 (Roth ’54) For all dimensions
and all point distributions
, we have that
We will refer to the bound of Theorem 5 as `Roth bound’. We have the corresponding theorem in , originally due to Schmidt:
Theorem 6 (Schmidt ’77) For all dimensions
and all point distributions
, we have that
for all
.
These theorems are sharp as to the order of magnitude. In particular we have the following theorem:
Theorem 7 (Davenport ’56) Let
and
. For any positive integer
, there exists a point distribution
such that
Constructing point distribution that are extremal for discrepancy is a delicate matter, especially in high dimensions. For more extremal constructions see [Roth ’74, ’80], [Chen ’81].
Proof: We will work with dyadic rectangles . Each dyadic rectangle
is a tensor product
where each side ,
, is a dyadic interval in
. Each dyadic interval is divided by its midpoint to a left interval
and a right interval
which are also dyadic. The Haar function associated with the interval
is
where . The Haar function associated with
is just the corresponding tensor product
where . Observe that we adopt an
normalization of the Haar functions instead of the usual
normalization so our formulas will look a bit different than the standard ones. For a non-negative integer
, we define
. Now a vector defines a `geometry’ of dyadic rectangles inside
as follows. Define
Thus for fixed ,
is a collection of disjoint dyadic rectangles in
, whose corresponding sides have the same lengths. The volume of each
is also fixed since
.
For the proof of Roth’s bound, let us now fix a non negative integer such that
so that
. We have that
Now let us call `good’ the collection of rectangles . Thus good rectangles are dyadic rectangles of fixed geometry defined by
, which do not contain points of
. It is not hard to see that if a rectangle
is empty of points of
then
Furthermore, a simple calculation shows that we always have
Thus if we conclude that
Furthermore, pigeonholing principle gives
We thus get
For the last equality observe that that are more or less choices for the first
entries of
while the last one is fixed.
The proof of Theorem 6, that is the proof of the bound is essentially the same if one knows the Littlewood-Paley inequality: Proof: } First we need a simple Lemma:
Lemma 8 Suppose that
are measurable subsets of a probability space
with
for all
. Then
We conclude that
Proof: Suppose, for the sake of contradiction, that the conclusion is not true. Then we have
which proves the first conclusion of the lemma. Now we immediately get that
which is the second conclusion in the statement of the Lemma.
Now let’s go back to the proof of the -lower bound for the discrepancy function. Again, we fix a positive integer
such that
so that
. For
we define
. Let is write
for the Lebesgue measure in
. We have that
by our choice of . Using the Littlewood-Paley inequalities we have, for every
, that
Here is the Littlewood-Paley square function associated with
. As observed before, for the `good’ rectangles R we have the estimate
which means that
. Since, for fixed
, all the rectangles
in
are disjoint, we can rewrite this as
Finally, since , we can use Lemma 8 to get
which is the desired estimate.
While the -bounds for
are very well understood and are known to be sharp, the endpoints
and
are much harder with definitive information available only in two dimensions. Even in this case, the techniques required to prove the corresponding theorems are more involved than the
bounds we just described. We have:
Theorem 9 (Schmidt ’72) In dimension
we have
That is, there is a
jump over the average case bound.
Theorem 10 (Halász ’81) In dimension
we have
Because of Davenport’s construction in Theorem 7(as well as other construction due to several different people), Theorem 10 is known to be sharp. For the case we have constructions due to Halton:
Theorem 11 (Halton ’60) For any dimension
there exist point distributions with
For any dimension there is a long-standing conjecture
Conjecture 12 (Beck, Chen) For any dimension
we have that
Observe that because of Davenport’s construction, Theorem 7, Conjecture 12 is the best possible result one can hope for, that is if true, it will be sharp. For the bound however, it is not even clear what the right lower bound should be. We just mention one of the conjecture due to Michael Lacey:
Conjecture 13 (Lacey) For any dimension
we have that
that is, we have a
jump over the average case bound.
At the moment, we do not have any matching upper bounds for Conjecture 13. The only result point towards that direction, states that we have at least some improvement over the average case bound:
Theorem 14 (Bilyk, Lacey, Vagharshakyan) For all dimensions
we have that
where
is a function of
tending to
as
![]()
In our recent work with Bilyk, Lacey and Vagharshakyan, we focus on the two dimensional case and prove results that smoothly interpolate between the and the
norms. The lower bounds we prove are more or less consequences of the Roth bound; the main point is that these results are shown to be sharp:
Theorem 15 (Bilyk, Lacey, P, Vagharshakyan) (i) For dimension
we have that
for all
, and this is sharp.
(ii) For dimension
we have that
for all
, and this is sharp.
The space is the exponential Orlicz space. Without going into a lot of details about its formal definition let’s just note here that we have
for all and
, so the first part of theorem interpolates between the bounds of Roth (Theorem 5) and Schmidt (Theorem 9}).
The space BMO is the dyadic BMO space defined by Chang and Fefferman. This space consists of all square integrable functions in the linear span of
for which we have
Here the supremum is over all measurable subsets of , not just rectangles. Here i is useful to recall the simple observation that the BMO norm is insensitive to functions that are constant in either the vertical or horizontal direction (or both) since then the Haar functions have zero mean in each direction.
In this talk we will focus on the BMO result which is technically easier to analyze. In fact, the lower bound for the BMO norm of the Discrepancy function is just a repetition of the proof that gave us the Roth bound. The main interest is that this is sharp. In order to describe the point distribution that satisfies
we need to take a closer look at the van der Corput set.
5. The van der Corput set (vdC).
The van der Corput set of points is defined as
Abusing notation, we can rewrite the points of the van der Corput set in their binary expansion as
If we order the -coordinates of the set in increasing order,
we can also write
In order to find the that corresponds to a given
, write the
in its binary expansion and then reverse the first
binary digits. Note that the
-st digit is always
the way we have set up things.
Claim 1 The van der Corput set is a binary net. Each dyadic rectangle of volume
contains exactly one point of the van der Corput set.
Indeed, let us consider a generic dyadic rectangle of volume :
where ,
and
. Let
be a point of the van der Corput set,
. If
belongs to
, the first
binary digits of
must be the same as the corresponding binary digits of
while the first
binary digits of
must coincide with the firt
binary digits of
. Since
, this uniquely defines a point of the van der Corput set.
The van der Corput set thus agrees with the Roth heuristic (one point per box); it turns out that vdC is extremal for discrepancy
Theorem 16 For the vdC set
, consisting of
points, we have that
Proof: First observe that we have the equivalent definition of -discrepancy:
where the supremum is over all rectangles in
.
For any consider a `strip’
of the form
. Points of the vdC set inside
have
-coordinates with their last
binary digits fixed, so they can be written in the form
As a consequence, there are such points inside
, equally spaces with step
Subdividing the strip
into boxes
of length
in the
-direction, we see that each box
has volume
and contains exactly one point of the vdC set. Thus these boxes hace discrepancy zero. There is (possibly) one last box that might not contain a point, or might have smaller volume (depending on
) but for this last box we can bound its discrepancy by
. Overall we get that the strip
has discrepancy at most
.
Now for any , consider
such that
. Then write the box
as
, where
. The discrepancy of the box
is at most
since
has volume at most
and contains at most one point of the vdC set since points of the vdC set have
-coordinates which differ by at least
. Since any interval
can be written as a disjoint union of at most
dyadic intervals, we get that
can be written as a union of at most
strips
. Summing up we get that
The claim that any interval can be written as a disjoint union of at most
dyadic intervals can be proved by induction on
. Indeed for
this is obvious. Now suppose we know the claim for
and consider any interval of the form
. This can be written in the form
so if
is even, by the induction hypothesis we are done. Suppose now that
is odd. Then we can write
and since now is even we can use the inductive hypothesis again to get a total of (at most)
dyadic interval.
6. The BMO norm of the discrepancy of the van der Corput set.
In this section, we set ourselves the task to prove the following theorem:
Theorem 17 (B,L,P,V) For the vdC set we have
The heart of the proof is the following Lemma that gives estimates for the Haar coefficients of the van der Corput set:
Lemma 18 (Basic Lemma) For the van der Corput set, we have the basic estimate
Let us postpone the proof of the Lemma for now because it is quite technical and see how we can use it in order to prove Theorem 17.
Proof: Recall that the BMO norm of the discrepancy function is given by
Here, is any measurable set. We consider two cases:
Big Volume Rectangles : We have in this case that
where we have already used the basic Lemma to estimate the Haar coefficients of the Discrepancy function. Now for fixed , the rectangles in
are disjoint we are considering only the ones contained in
. Since each rectangle has fixed volume
, there are at most
of them. Overall we get
Small Volume Rectangles : Here we treat the `linear part’ and the `counting part’ separately and just use triangle inequality. Observe that this means we can get away with that because in the small volume case there is no significant cancellation between the linear and the counting part.
For the linear part we have the estimate
The counting part is a bit more involved. We let be the family of maximal dyadic rectangles
in
which satisfy
. Remember we are always in the small volume case so these rectangles
also satisfy
. Also the non-zero inner product with the haar function corresponding to
implies that
must contain at least one point of our set in its interior. However, the van der Corput set is a dyadic net at the scale
so each such
contains at most one point. This means that every
contains exactly one point of the van der Corput set. Thus the counting part contribution to the BMO norm can be estimated as
where is the unique point of the vdC set contained in
. Using Bessel inequality for the space
we get
Combining these estimates we get that the counting part contribution is bounded by
In the inner sum we only consider rectangles with a side length
. But this is justified by the fact these rectangles contain one point of the vdC set in their interior and they are dyadic. Since the points of the vdC set have
-coordinates that differ by
, this is only possible if the side length of the maximal rectangles is at least
.
For fixed, the rectangles
in the family
with side
have to be disjoint since they are dyadic and maximal. Since they are all contained in ,
the inner sum can be estimated by
Overall we get that the counting part can be estimated by
which is the desired estimate (squared).
For the proof of our basic lemma we need some auxiliary definitions that actually come in handy in several calculations and considerations of the van der Corput set. First of all, for we define
to be the
-th digit in the binary expansion of
. Abusing notation again, if
has binary expansion
this means for example that
and so on. Now for a
which has
binary digits, we define the digit scrambling or digital shift as
That is, the -th digit of
changes if
and stays the same if
. We also need the auxiliary function
defined as
where is the fractional part of
. Observe that
is a
-periodic function. We have the following proposition which contains the basic properties of the function
that we will use in what follows.
Proposition 19 Let
be a dyadic rectangle of the form
where
and
. Let
. Then
Furthermore, for every
we have that
Finally,
is a `Lipschitz’ function in the sense that for every
,
We can now give the proof of Lemma 18:
Proof: Let be a dyadic rectangle of the form
We write for the van der Corput set of
points. We consider again two cases depending on the volume of the rectangles:
Small volume rectangles : Here we have that
. This means that the rectangle
can contain at most
points of the van der Corput set in its interior. That is because there is only one free digit in the binary expansion of the points of the vdC set contained in
and for this digit there are two choices. Thus we have
Large volume rectangles : Here we have that
which means there are at least
points of
in the rectangle
. In more detail, let us consider
. Then
for the -coordinate of a point in
. The first
and the last
binary digits of
are fixed because
. Now we group the points in quadruples
according to their
-coordinates:
There are such quadruples
. We estimate
Observe that if is any one of the four points in a quadruple
, then the quadruple can be described by the points
Using this representation and the properties of the function we can calculate
which by the Lipschitz property of implies that
We conclude that
This completes the proof of the lemma and thus the proof of the BMO estimate.
Pingback: Circle discrepancy for checkerboard measures | I Woke Up In A Strange Place