A smoother Brownian motion

I have mentioned one type of fractal already, but nature’s favorite (or at least, my own favorite…) is probably the curve formed by the path of a standard Brownian motion, which is mathematically thought of as a “random” continuous function

$latex [0,+\infty[\rightarrow \mathbf{R}$

and is visualized by the graph of this function. One can find online plenty of pictures of “typical” paths, which are immediately noticeable by virtue of being very rough: indeed, it is a theorem of Paley, Wiener and Zygmund that, almost surely, the function

$latex t\mapsto B(t)$

is nowhere differentiable, hence violent oscillations must occur arbitrarily close to every point of the graph.

Hence the typical pictures of simulated Brownian motion are not as striking or beautiful as those of approximations of other fractal objects can be. This seems to be a shame, and things don’t have to be this way: since we can only draw approximations, the problem lies partly in the choice of approximation: Brownian motion is usually approximated by rescaled random walks with smaller and smaller steps, with affine interpolation between the discrete steps of the walk, and this creates the spiky look at each change of direction.
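For concreteness, here is a minimal sketch of this usual construction (my own, in Python; the function name and parameters are purely illustrative):

```python
import numpy as np

# A minimal sketch of the standard construction: a rescaled simple random
# walk, whose affine interpolation produces the familiar spiky pictures.
def random_walk_path(n=10000, rng=None):
    rng = np.random.default_rng(rng)
    steps = rng.choice([-1.0, 1.0], size=n)  # fair +/-1 coin flips
    walk = np.concatenate(([0.0], np.cumsum(steps))) / np.sqrt(n)
    t = np.linspace(0.0, 1.0, n + 1)
    return t, walk  # plotting t against walk joins successive points affinely
```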

But here’s my own version:

[Figure: a sample Bernstein-polynomial approximation of a Brownian path]

Isn’t it much nicer? Or, at least, more “readable”?

The theoretical background is the theory of Bernstein polynomials (S. Bernstein, not J. Bernstein, who also has his own polynomials). They were used by S. Bernstein to provide a nice constructive proof of the Weierstrass approximation theorem: for any continuous function

$latex f\,:\, [0,1]\rightarrow \mathbf{R}$

the polynomials

$latex B_n(x;f)=\sum_{0\leq i\leq n}{{n\choose i}f(i/n)x^i(1-x)^{n-i}}$

converge uniformly to f as n goes to infinity. Since those are polynomials, they are very smooth approximations of an arbitrary continuous function. And if one applies this to a path of Brownian motion, using the known distribution of the random variable B(i/n) (namely, a centered gaussian with variance i/n), one gets nice polynomial approximations of Brownian motion! The one above has degree 40000, and it should be said that trying to compute it from the formula doesn’t work (binomial coefficients $latex {40000\choose i}$ are not so easy to compute when i is reasonably big…): what is used instead is the probabilistic interpretation of

$latex {n\choose i}x^i(1-x)^{n-i}$

as the probability that a binomial distribution with parameters n and x (i.e., the sum of n Bernoulli random variables, each equal to 1 with probability x) takes the value i; this is approximated nicely by packages such as GSL. On the other hand, it takes much longer to output a graph like the one above than a random walk of similar complexity (interpreted, roughly speaking, as how complicated the graph looks — how many changes of direction, for instance).
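Here is a rough sketch of this evaluation strategy (my own reconstruction in Python, with SciPy’s binomial pmf standing in for GSL; the degree is taken much smaller than 40000 to keep it fast):

```python
import numpy as np
from scipy.stats import binom  # binom.pmf plays the role of the GSL binomial pdf

# Sketch: sample B(i/n) as a cumulative sum of independent N(0, 1/n)
# increments, then evaluate B_n(x; B) = sum_i C(n,i) B(i/n) x^i (1-x)^(n-i)
# via binomial probabilities, avoiding huge binomial coefficients entirely.
def bernstein_brownian(n=2000, n_points=500, rng=None):
    rng = np.random.default_rng(rng)
    b = np.concatenate(([0.0], np.cumsum(rng.normal(0.0, np.sqrt(1.0 / n), n))))
    i = np.arange(n + 1)
    xs = np.linspace(0.0, 1.0, n_points)
    # binom.pmf(i, n, x) = C(n,i) x^i (1-x)^(n-i), computed stably
    ys = np.array([np.dot(binom.pmf(i, n, x), b) for x in xs])
    return xs, ys
```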

All of this is described in the first part of my paper on Bernstein polynomials and Brownian motion (note that the last section of this, which tries to study zeros of random Bernstein polynomials, contains a serious mistake…); in particular it is shown that, indeed, those converge (in law) to Brownian motion, and in fact can be used to give an alternative proof of its mathematical existence.

Expectation symbol

Probabilists use the word “expectation” to describe the average of a random variable (which is by the way rather badly defined by the OED:

8. The degree of probability of the occurrence of any contingent event.

1832 WEBSTER s.v., If the chances of receiving or not receiving a hundred dollars..are equal; then..the expectation is worth half the money. 1838 DE MORGAN Ess. Probab. v. (1841) 97 The balance is the average required, and is known by the name of the mathematical expectation. 1848 WHARTON Law Lex., Expectation, in the doctrine of chances, is applied to any contingent event, upon the happening of which some benefit is expected. Ibid., The value of the expectation is..£5.

but let us leave that aside), and the notation is typically

$latex m=\mathbf{E}(X)$

or maybe with a blackboard “E”, or something similar. The notation is also more and more commonly used in some parts of analytic number theory to simplify the writing of (typically) finite averages, with things like

$latex \frac{1}{|X|}\sum_{x\in X}{f(x)}=\mathbf{E}_{x\in X}(f(x))$

quite common, e.g., in the papers of Green and Tao or the book on additive combinatorics by Tao and Vu.

This is indeed quite convenient, but I find that the LaTeX formatting is often rather ugly, especially in displayed formulas and when there are also integration and summation signs on the line. What I would really like to have is a way to typeset an expectation operator “E” rather like a summation sign “Σ”, with respect to size, and with upper and lower subscripts to indicate the conditions involved. I’ve looked in various collections of LaTeX fonts and symbols without success, and my vanishingly small LaTeX/Metafonting skills are quite insufficient to create such a symbol myself. Does anyone know of a satisfactory solution?
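For what it’s worth, the closest partial workaround I can describe uses the amsmath package: \DeclareMathOperator* at least puts the conditions underneath the “E” in displayed formulas, in \lim-like fashion, though it does not produce a genuinely larger glyph:

```latex
\documentclass{article}
\usepackage{amsmath}
% The starred form gives \lim-like limits placement: in displayed
% formulas, subscripts go underneath the operator, as with \sum,
% but the "E" itself does not grow like a true large operator.
\DeclareMathOperator*{\E}{\mathbf{E}}
\begin{document}
\[
  \E_{x\in X} f(x) = \frac{1}{|X|}\sum_{x\in X} f(x)
\]
\end{document}
```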

Semi-idle question: what is the structure of the set of gaussians?

Consider a probability space (Ω, Σ, μ) and the corresponding Hilbert space

$latex H=L^2_0(\Omega,\mu)$

of real-valued square-integrable functions with zero mean:

$latex \int_{\Omega}{f(x)d\mu(x)}=0\text{ for } f\in H.$

In H, we can define the subset

$latex G=\left\{X\,:\, \Omega\rightarrow \mathbf{R}\,\mid\, X\text{ is a centered gaussian variable}\right\},$

(where gaussian is in the non-strict sense: constant variables are considered gaussians, with variance 0; this has the effect in particular that G is not empty, as it contains the zero random variable).

Here’s a question I’m vaguely puzzled about: does the set G have any known interesting structure? Note that it is not a vector space, but it satisfies the following conditions:

(1) it is closed in the norm topology;

(2) it is stable under scalar multiplication (so it is geometrically a cone — already some kind of structure);

(3) if X and Y are in G, and are orthogonal [i.e., are independent; recall that all variables are centered], then the linear space spanned by X and Y is entirely contained in G (a short verification is sketched after this list);

(4) (almost) conversely, a result of Cramér states that if X and Y are in H, are orthogonal, and their sum is in G, then X and Y are in G.
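For instance, (3) is just the standard computation with characteristic functions: if X and Y are orthogonal (hence, here, independent) centered gaussians with variances σ² and τ², then for any real numbers a and b we have

$latex \mathbf{E}(e^{it(aX+bY)})=\mathbf{E}(e^{itaX})\mathbf{E}(e^{itbY})=e^{-a^2\sigma^2t^2/2}e^{-b^2\tau^2t^2/2}=e^{-(a^2\sigma^2+b^2\tau^2)t^2/2},$

which is the characteristic function of a centered gaussian with variance a²σ²+b²τ².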

These four conditions together are quite strong, and I’m wondering how far they are from capturing the truth about G. In other words, could it be that, in some sense, any subset M of H, or indeed of an arbitrary Hilbert space, satisfying (1)–(4) is necessarily related to a set of gaussian random variables? (This makes sense since all conditions are expressed completely in the language of Hilbert spaces, with no reference to the underlying probability space, once “orthogonal” has been used instead of “independent”).

Or are there interesting subsets M which are genuinely different? I don’t really have a clue (or an opinion), but I haven’t thought about it very much either…

Since there are many known characterizations of the normal distribution, these may well be questions which have already been answered, but for instance the book of Bogachev on Gaussian Measures (linked-to above) does not seem to contain anything of this kind.

Quantum Chaos in Bordeaux

I was in Bordeaux this week until yesterday to attend (parts of) the conference Quantum Chaos 2009 organized by M-L. Chabanol, S. Nonenmacher and G. Ricotta. This was a very enjoyable stay, although it was threatened by chaos of a natural kind, due to the terrible storm that swept over the South West of France last Saturday, and by chaos of a more social kind, in the form of a fairly general strike tomorrow, involving among others the public transportation system and the university staff restaurant in Bordeaux.

I will not try to give a summary of what Quantum Chaos is, since I am not at all an expert in this field in general, and also because Arnd Bäcker, the first lecturer, gave a wonderful overall presentation from the point of view of mathematical physics (which is where the subject grew). The downloadable version does not include the very nice computer simulations that he presented, unfortunately.

I myself lectured on the part of the subject that I know (a bit) about: the Arithmetic Quantum Unique Ergodicity Conjecture of Rudnick and Sarnak [see this survey of J. Marklof for some background on this and other arithmetic aspects of Quantum Chaos], which claimed that the probability measures on the modular surface

$latex SL(2,\mathbf{Z})\backslash \mathbf{H}$

defined by

$latex \mu_f=|f(z)|^2 y^{-2}dxdy,$

for eigenfunctions f of the Laplace operator which are of norm 1 and are also eigenfunctions of all Hecke operators, should converge weakly to the Poincaré measure

$latex \frac{3}{\pi}\,y^{-2}dxdy$

on the modular surface. I said “claimed” because this is a conjecture no more: rather dramatically, G. Prakash pointed out at the end of my second lecture that K. Soundararajan had posted on arXiv that same morning a preprint containing a proof of the last missing step required to verify the conjecture! This is based on the ground-breaking work of E. Lindenstrauss, who had used ergodic-theoretic methods to prove that any weak limit had to be a multiple of the Poincaré measure, but without being able to exclude the possibility that some of the unit mass would be lost (“escape in the cusp”): Soundararajan manages to exclude this last possibility. (I find it somewhat ironic that this happened on probably the first day in months when I didn’t have time to check the arXiv postings in the morning…)

What I had lectured on was the recent work of R. Holowinsky, and of (another work of) Soundararajan. Roughly speaking, Holowinsky had managed to prove the conjecture either “with very few exceptions” or under the Ramanujan-Petersson conjecture for Hecke eigenforms, using a very beautiful and clever sieve argument (the link between sieve methods and a conjecture motivated by mathematical physics is part of the beauty of mathematics…). Soundararajan, also with a remarkable argument of analytic number theory, had proved a fairly general “weak subconvexity” result for values of L-functions satisfying the Ramanujan-Petersson conjecture. This gave — among other things! — another proof of Unique Ergodicity with very few exceptions (assuming the Ramanujan-Petersson conjecture), because of earlier work, in particular, of Sarnak, and a formula of Watson linking the desired equidistribution to special values of triple product L-functions.

Although the (brand new) result of Soundararajan means that these two works are not needed any more to try to prove the original conjecture, their interest is not just as beautiful pieces of mathematics, as Holowinsky and Soundararajan show in another joint work: their methods work also for holomorphic cusp forms of large weight (still Hecke eigenfunctions). There, on the one hand, we have the Ramanujan-Petersson conjecture (by Deligne’s one-two knockout step: first linking it with the Riemann Hypothesis over finite fields, and then proving the latter). Also, on the other hand, a rather wonderful stroke of good fortune arises: the two “exceptional subsets”, where each method fails to give equidistribution, are disjoint! So Holowinsky and Soundararajan end up proving the holomorphic case of AQUE by a completely devious route (which recalls the first ineffective solution of the class number problem of Gauss, where either assuming that GRH holds, or that it doesn’t, led to the result!)

It is also interesting to note that this holomorphic case of AQUE is currently inaccessible to ergodic-theoretic methods, the reason being (apparently) that there is no “classical dynamics” behind it — the geodesic flow being the classical system underlying the original conjecture. (This was explained to me by M. Einsiedler, whose lectures in Bordeaux cover the ergodic-theoretic methods of Lindenstrauss).

The other lecturers in the programme are P. Kurlberg, describing the quantization of toral automorphisms and the problems that arise (where it is again very satisfying, from the point of view of the unity of mathematics, that the Riemann Hypothesis over finite fields turns out to play a big role), and J. Keating, who will lecture Thursday and Friday on Quantum Graphs (which, unfortunately, I know nothing about, but I’ll try to find a good survey to link to).

The Goldston-Pintz-Yildirim result, and how far do we have to walk to twin primes?

When D. Goldston, J. Pintz and C. Yildirim proved the spectacular result that the gaps between consecutive primes satisfy

$latex \liminf \frac{p_{n+1}-p_n}{\log n}=0\ \ \ \ \ \ \ (*)$

(which means that those gaps are infinitely often smaller than an arbitrarily small multiple of their average size) there was a great hope that maybe some of their ideas would lead to a proof of the much stronger result

$latex \liminf (p_{n+1}-p_n) \text{ is finite}\ \ \ \ \ \ \ (**)$

and possibly to the twin prime conjecture (in the form stating that this liminf should be equal to 2). Since then, titles of lectures like Goldston’s “Revenge of the twin prime conjecture” have shown that this hope was, maybe, somewhat sanguine.

The work of Goldston-Pintz-Yildirim has been the subject of a number of surveys, and there is an excellent recent blog post on this topic by T. Tao (the first paragraph of which links to three of these surveys). In this post, I wish to discuss a complementary topic which is not addressed in detail in any of these: what, exactly, is needed to go to finite gaps (i.e., to prove (**))? why did it seem reasonable to hope that it might be accessible, and how hard can we guess this extra step to actually be? Some of this is discussed in my own Bourbaki report, but there will be new information towards the end, based on some discussions I had with É. Fouvry after I had written it.

The basic data that the method of Goldston-Pintz-Yildirim depends on is the quantitative equidistribution of primes in arithmetic progressions, which is measured by the quantities

$latex E(x;q,a)=\sum_{p\leq x,\ p\equiv a\text{ mod } q}{\log p}-\frac{x}{\varphi(q)},$

defined for any modulus q > 0 and any residue class a modulo q which is coprime with q. The basic intuition is that, in various senses, E(x;q,a) is small, and the difficulty (in this method as in countless other problems of analytic number theory) is that we must quantify this intuition in various ranges of the three variables, and most often uniformly.
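As a small numerical illustration (a toy experiment of my own, not anything from the literature), one can compute E(x;q,a) directly for small x and check that it is indeed tiny compared with the main term x/φ(q):

```python
import math
import sympy

# E(x; q, a) = sum over primes p <= x with p = a mod q of log p, minus x/phi(q).
def E(x, q, a):
    s = sum(math.log(p) for p in sympy.primerange(2, x + 1) if p % q == a)
    return s - x / int(sympy.totient(q))

x = 10**5
for q in (3, 5, 7):
    errors = [round(E(x, q, a)) for a in range(1, q) if math.gcd(a, q) == 1]
    print(q, errors)  # each |E| is small next to x/phi(q), e.g. 50000 for q = 3
```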

What is known, and gives a reasonably strong foundation for further results and expectations, is that

$latex |E(x;q,a)|\leq C(A)\frac{x}{(\log x)^A}$

for any A>0, for some constant C(A) which is independent of q and a, but which is ineffective (not actually computable for any given A). This looks good, but recall that the “main term” involved is

$latex \frac{x}{\varphi(q)}$

and because the Euler function φ(q) is of size roughly q (up to small oscillations which are not important in this post), this means that the estimate we obtain is, in fact, trivial (and turns out to be useless) if q grows with x faster than any power of the logarithm. For instance, this theorem (due to Siegel and Walfisz) doesn’t give any information about

$latex \sum_{p\leq x,\ p\equiv a\text{ mod } q}{\log p}$

if q is a small power of x (in particular, one can not guarantee the existence of primes occurring in the summation, even if the main term x/φ(q) suggests there should be plenty of them).
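Indeed, the sum contains at most about x/q terms, each of size at most log x, so that trivially

$latex \sum_{p\leq x,\ p\equiv a\text{ mod } q}{\log p}\leq \Bigl(\frac{x}{q}+1\Bigr)\log x,$

and as soon as q is of size (log x)^{A+1} or more, this trivial bound is already smaller than the error term C(A)x/(log x)^A: the Siegel-Walfisz estimate then carries no information at all.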

Now, what is needed for (*) and (**)? (Note that by jumping to this, I — purposely in this post — am leaving aside the extremely ingenious and beautiful original ideas of Goldston, Pintz and Yildirim). Well, for (*), it suffices to know some upper bound on average over q up to, roughly, the square root of x, and uniformly over a: it suffices to find some B, depending on A, such that

$latex \sum_{q\leq \sqrt{x}/(\log x)^B}{\max_{a\text{ mod } q}{|E(x;q,a)|}}\leq C(A)\frac{x}{(\log x)^A},$

(where the constant C(A) may of course be another one than the one before). Now this is much stronger than the Siegel-Walfisz Theorem, and at first sight one may wonder if this is reasonable. That it is, indeed, expected, was known for a long time before it was proved unconditionally by Bombieri and Vinogradov (independently) in the 1960’s. The reason is that this estimate (with some specific value of B which I won’t write down explicitly) is a trivial consequence of the Generalized Riemann Hypothesis for Dirichlet L-functions (widely expected to be true…) and the so-called “Explicit formulas” which transform sums over primes in arithmetic progressions into sums over the zeros of the relevant L-functions. Indeed, as in this post, the “trivial” refers here to transforming an identity (or something close to it) into an upper bound by using the triangle inequality for a sum.
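To see why, here is the back-of-the-envelope version of this computation: through the explicit formulas and the triangle inequality, GRH gives the individual bound

$latex |E(x;q,a)|\leq C\sqrt{x}(\log x)^2,$

uniformly in q and a, and summing this trivially over the moduli involved yields

$latex \sum_{q\leq \sqrt{x}/(\log x)^B}{\max_{a\text{ mod } q}{|E(x;q,a)|}}\leq C\frac{\sqrt{x}}{(\log x)^B}\cdot\sqrt{x}(\log x)^2=C\frac{x}{(\log x)^{B-2}},$

which is the desired bound with B = A + 2.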

(Of course, the fact that this was expected does not mean that the result is easy: proving the Bombieri-Vinogradov theorem was a remarkable breakthrough in analytic number theory, and it was one of the main achievements that led to Bombieri being awarded the Fields Medal in 1974).

So, in a sense, the Bombieri-Vinogradov theorem explains why (*) is a theorem today. As to (**), much more is needed: it would suffice to extend the Bombieri-Vinogradov theorem to a range of moduli larger than the square root of x, i.e., to prove, for any A, that

$latex \sum_{q\leq x^{\delta}}{\max_{a\text{ mod } q}{|E(x;q,a)|}}\leq D(A)\frac{x}{(\log x)^A},\ \ \ \ \ \ \ (***)$

for some δ>1/2 (independent of A); precisely, if this holds for a given δ, it follows that there are infinitely many primes at a finite distance from each other, the bound on the distance depending on δ. Currently (as far as I know), assuming the best possible result, which is that any δ<1 will work, one gets primes at a distance at most 16 from each other.

But any estimate of the type above remains completely conjectural! And based on the previous description of the Bombieri-Vinogradov Theorem, one can understand why: the latter is, in some sense, an unconditional confirmation of what the Generalized Riemann Hypothesis suggests should be true. But GRH says nothing about primes in arithmetic progressions where the modulus exceeds (roughly) the square root of the length! So estimates like the one we need are unknown, even under the assumption of GRH.

This may lead to some scratching of chins, and some wonderment whether those statements are, in fact, correct. This is definitely a legitimate question: building on very clever ideas of Maier, Friedlander and Granville have shown that some very optimistic versions, where the bound on the modulus is

$latex x/(\log x)^B\text{ instead of } x^{\delta},\ \delta<1$

are definitely false. But there are reasons to hope, at least for δ just slightly larger than 1/2, if one remembers that GRH implies the Bombieri-Vinogradov by a trivial treatment of a sum over zeros of L-functions: surely, the ordinates of the latter must be distributed randomly enough to lead to compensations when the sum is not submitted to the rough treatment of the triangle inequality!

A further — and more convincing — cause for hope is given by the remarkable work of Fouvry and Iwaniec first, strengthened later by Bombieri, Friedlander and Iwaniec: for certain specific types of weights γq (arising naturally in sieve theory), and for certain explicit δ>1/2 (namely any δ<4/7 in the strongest version), we have

$latex \Bigl|\sum_{q\leq x^{\delta}}{\gamma_q E(x;q,a)}\Bigr|\leq D(a,A)\frac{x}{(\log x)^A}.$

Statements of this type (which go back to the early-to-mid 1980’s) rank among the deepest known results in analytic number theory. Indeed, note that, for the reason sketched above, this can not be deduced from the Generalized Riemann Hypothesis, and yet it is an unconditional result: hence, it goes well beyond the already remarkable Bombieri-Vinogradov Theorem. (It would actually be interesting to have a proof of such results assuming — and using! — GRH, since apparently it involves in some mysterious way the detection of randomness in the ordinates of zeros of L-functions, but I don’t think there has been any work in this direction).

The proofs of these bounds for primes in arithmetic progressions to large moduli depend essentially on general estimates from the spectral theory of Kloosterman sums (due largely to Deshouillers and Iwaniec), and thus (ultimately) on properties of modular forms and eigenfunctions of the hyperbolic Laplace operator on finite area hyperbolic surfaces. Among applications, one may mention a result of Fouvry which was used (among other applications) in the first version of the deterministic polynomial-time algorithm for primality testing of Agrawal, Kayal and Saxena.

Coming back to (**), the fact that the exponent 4/7 is larger than 1/2 may suggest that the job is done. Unfortunately, there are two difficulties: (i) the weights γq are not arbitrary, and can not be chosen so as to recover the absolute value |E(x;q,a)| in the summation; (ii) more importantly, the residue class a is now fixed. This second condition is the crucial one, because the method of Goldston, Pintz and Yildirim fundamentally requires using more than one residue class.
Despite renewed efforts, it would seem to be very hard to obtain (***) using anything like the methods of Bombieri, Fouvry, Friedlander and Iwaniec.

However, one shouldn’t discard all hope too quickly: looking at the proof of (**), one realizes that it is not really (***) which is needed, but rather the following type of inequality

$latex \sum_{q\leq x^{\delta}}{\sum_{a\in S(P,q)}{|E(x;q,a)|}}\leq D(A)\frac{x}{(\log x)^A},\ \ \ \ \ \ \ (****)$

where the moduli q can be restricted to squarefree ones while the set of residue classes in the inner sum is

$latex S(P,q)=\{a\text{ mod }q\,\mid\, (a,q)=1\text{ and } P(a)\equiv 0\text{ mod } q\},$

for some fixed monic integral polynomial P (though its degree may be fairly high). This special structure may be more accessible, because

$latex \sum_{q\leq Q}{|S(P,q)|}\ll Q(\log Q)^{E}$

for some (fixed) exponent E>0, so the number of residue classes considered is, on average, not very large. In particular, note that a trivial summation of the Bombieri-Fouvry-Friedlander-Iwaniec bounds over

$latex 0<a\leq (\log x)^{B}$

shows (together with the information that the implied constant is polynomial in a) that their result does hold uniformly over sets with that many residue classes… except that those classes must, in the current state of knowledge, be located in an initial interval, and not be spread out over all residue classes, as the roots of P modulo q are likely to be.
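To get a feeling for how sparse these sets are, here is a toy computation (with an arbitrarily chosen monic polynomial, purely for illustration) of the average size of S(P,q):

```python
import math

# Toy check that sum over q <= Q of |S(P, q)| grows roughly like Q times a
# power of log Q, so the average |S(P, q)| stays quite small.
def S_size(P, q):
    return sum(1 for a in range(q) if math.gcd(a, q) == 1 and P(a) % q == 0)

P = lambda a: a**4 + 3*a + 1  # an arbitrary monic example polynomial
for Q in (500, 1000, 2000):
    total = sum(S_size(P, q) for q in range(1, Q + 1))
    print(Q, total, total / Q)  # the last column grows very slowly with Q
```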

Fouvry has looked a little bit at this type of inequality, starting from scratch; unfortunately, his rough computations give no particular reason to be optimistic at this point: it seems one needs to understand averages of very short Kloosterman sums in many variables, where both the length and the number of variables depend on the degree of the polynomial. Now, this polynomial is not arbitrary: in a somewhat delicate way, its degree ends up dictating for which value of δ>1/2 one needs (****) in order to derive (**). But the larger the degree, the smaller the exponent one can actually achieve, even assuming the most optimistic bounds for those short exponential sums (which are utterly out of reach anyway).

The conclusion? Well, one which seems to be unavoidable is that it really looks like a significant new idea is needed to get bounded gaps between primes: either to bypass the type of strategies which have been successful up to now in studying the distribution of primes in arithmetic progressions; or to find a way to bring those techniques to an entirely new level!