Euler style

Courtesy of the divisor function, here is another fun example of reasoning in the great style of Euler (the last installment is rather old…) A classical tool to study the distribution of values of $latex d(n)$ (the number of positive divisors of $latex n$) is the Voronoi summation formula, which expresses a sum

$latex S(w,c,a)=\sum_{n\geq 1}d(n)w(n)e\Bigl(\frac{an}{c}\Bigr),$

for a nice test function $latex w$, some positive integer $latex c\geq 1$, and some integer $latex a$ coprime to $latex c$, in terms of a “dual sum”

$latex S(W,c,\bar{a})=\sum_{m\in \mathbf{Z}-\{0\}}{d(|m|)W(m/c^2)e\Bigl(\frac{\bar{a}m}{c}\Bigr)},$

where $latex \bar{a}$ is the inverse of $latex a$ modulo $latex c$, and

$latex W(y)=\int w(|x|) k(xy)dx$

is some integral transform of $latex w$, with kernel $latex k(y)$ involving the classical Bessel functions $latex Y_0$ and $latex K_0$. Precisely, we have

$latex k(y)=\begin{cases} -2\pi Y_0(4\pi \sqrt{y})&\text{ if } y>0\\ 4 K_0(4\pi\sqrt{|y|})&\text{ if } y<0\end{cases},$

and one should add that there is also a main term in the Voronoi formula, but it is irrelevant for today's story. A classical application of this formula is to improve the error term in Dirichlet's asymptotic evaluation of

$latex \sum_{n\leq X}d(n),$

which was indeed done by Voronoi.
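
For readers who like to see numbers, here is a minimal Python sketch of Dirichlet's evaluation. It computes the sum of $latex d(n)$ for $latex n\leq X$ by the hyperbola method and compares it with the classical main term $latex X\log X+(2\gamma-1)X$, whose error is $latex O(\sqrt{X})$. The function names are my own choices for illustration, and nothing here uses the Voronoi formula itself.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def divisor_sum(X: int) -> int:
    """Return sum_{n <= X} d(n) using Dirichlet's hyperbola method.

    The sum counts pairs (a, b) with a*b <= X; splitting according to
    whether a or b is at most sqrt(X) gives the formula below.
    """
    r = math.isqrt(X)
    return 2 * sum(X // k for k in range(1, r + 1)) - r * r


def brute_force_divisor_sum(X: int) -> int:
    """Count the pairs (a, b) with a*b <= X directly (sanity check)."""
    return sum(X // a for a in range(1, X + 1))


def dirichlet_main_term(X: int) -> float:
    """Main term X log X + (2*gamma - 1) X of Dirichlet's asymptotic."""
    return X * math.log(X) + (2 * EULER_GAMMA - 1) * X


if __name__ == "__main__":
    for X in (10**2, 10**4, 10**6):
        error = divisor_sum(X) - dirichlet_main_term(X)
        # The last column stays bounded, as Dirichlet's O(sqrt(X)) predicts.
        print(X, divisor_sum(X), error, error / math.sqrt(X))
```

Printing the normalized error for larger and larger $latex X$ gives some feeling for why one expects an exponent smaller than $latex 1/2$, which is what Voronoi obtained.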

In an ongoing work with É. Fouvry, S. Ganguly and Ph. Michel, we needed to know some unitarity property of the transformation

$latex w \mapsto W.$

This is an entirely classical question, but we didn't find a ready-made statement in Watson’s book on Bessel functions. There is however a formal argument that suggests the answer: if we consider the function $latex g(x,y)$ of two real variables defined by

$latex g(x,y)=w(|xy|),$

then it turns out that we have

$latex \hat{g}(u,v)=W(uv),$

where $latex \hat{g}$ is the standard Fourier transform of $latex g$ (this is contained in Section 4.5 of the book of H. Iwaniec and myself.) Hence we have, by the unitarity of the Fourier transform, the identity

$latex \int \int |w(|xy|)|^2dxdy = \int\int |W(uv)|^2dudv.$

Offhandedly, by changing variables, this means that

$latex \int |w(|t|)|^2 dt \times I = \int |W(s)|^2 ds \times I,$

which would give

$latex 2\|w\|^2= \|W\|^2\quad\quad\quad\quad\quad\quad (\star)$

(the factor $latex 2$ comes from the fact that $latex w$ is extended to an even function on $latex \mathbf{R}$ from its original source as a function defined for non-negative real numbers), if not for the fact that the “constant” $latex I$ is the integral

$latex I=\int \frac{dx}{|x|}.$

Alas, it diverges, although probably Euler would write it as $latex I=4\log (\infty)$ (two infinities from the divergence at $latex 0^{\pm}$, the other two from the divergence at $latex \pm \infty$), and be happy with the outcome.
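
One way to make sense of the "four infinities" is to truncate symmetrically, keeping only $latex 1/R<|x|<R$: the truncated integral is then exactly $latex 4\log R$. The sketch below checks this numerically; the truncation and the name `truncated_I` are my own choices for illustration, not part of the argument.

```python
import math


def truncated_I(R: float, steps: int = 20_000) -> float:
    """Approximate the integral of dx/|x| over 1/R < |x| < R.

    The interval (1/R, R) is cut into `steps` geometrically spaced cells
    and the midpoint rule is applied on each cell; the negative half-line
    contributes the same amount by symmetry, hence the final factor 2.
    """
    ratio = (R * R) ** (1.0 / steps)  # common ratio of consecutive grid points
    total = 0.0
    left = 1.0 / R
    for _ in range(steps):
        right = left * ratio
        midpoint = 0.5 * (left + right)
        total += (right - left) / midpoint
        left = right
    return 2.0 * total


if __name__ == "__main__":
    for R in (10.0, 1e3, 1e6):
        # Compare the quadrature with the closed form 4 log R.
        print(R, truncated_I(R), 4 * math.log(R))
```

Each of the four ends (near $latex 0^{\pm}$ and near $latex \pm\infty$) contributes one $latex \log R$.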

One can then prove rigorously the formula $latex (\star)$ by truncation arguments, but here is a more conceptual argument (which offers the advantage of being something we can just quote), which follows from the interpretation of the Voronoi formula in terms of the representation theory of $latex G=\mathrm{SL}_2(\mathbf{R})$. What happens is that there exists a unitary representation $latex \rho$ of $latex G$ (the principal series with Casimir eigenvalue $latex 1/4$) which can be represented as acting on the Hilbert space $latex H=L^2(\mathbf{R},|x|^{-1}dx)$ (the Kirillov model) in such a way that the unitary operator

$latex T=\rho\Bigl(\begin{pmatrix}0&-1\\1&0\end{pmatrix}\Bigr)$

is given by an integral operator

$latex (T\varphi)(x)=\int \varphi(y) j(xy)\frac{dy}{|y|}$

for some function $latex j$, which Cogdell and Piatetski-Shapiro called the Bessel function of $latex \rho$ (see this note of Cogdell for a short explanation of this, with the analogues for finite fields and $latex p$-adic fields). Now, by direct inspection of the formula for $latex j(y)$ that Cogdell and Piatetski-Shapiro computed, and comparison with the kernel $latex k(y)$ in the Voronoi formula, one finds that

$latex W(y)=|y|^{-1/2} T( x\mapsto \sqrt{|x|} w(|x|) )(y)$

(in this other short note, Cogdell explains why it is no coincidence that this abstract Bessel function appears in the Voronoi summation formula). Now, from

$latex \int |\varphi(x)|^2 \frac{dx}{|x|}=\int |T(\varphi)(x)|^2\frac{dx}{|x|},$

which holds for all $latex \varphi\in H$ because $latex T$ is unitary on $latex H$, we deduce exactly $latex (\star)$…

Remark. There is a completely similar story where the circles $latex x^2+y^2=a$ replace the hyperbolas $latex xy=a$: in other words, one defines
$latex g(x,y)=w(x^2+y^2).$

Then the Fourier transform of $latex g$ is still a radial function $latex W(u^2+v^2)$, and the map $latex w\mapsto W$ is a Hankel transform (it involves the Bessel function $latex J_0$). Its unitarity follows then immediately from that of the Fourier transform, since the analogue of the divergent integral $latex I$ is now, indeed, a finite constant.

In terms of representation theory, the story is the same as above, except that the representation $latex \rho$ is replaced with a discrete series representation. One can also deal similarly with radial functions in higher-dimensional Euclidean spaces, which involves other discrete series representations.

Trace functions, I

This is again the first of a series of a few posts in which I will explain (as promised a very long while ago, and as far as I can…) the trace weights that are used in my paper with É. Fouvry and Ph. Michel (henceforth, this paper will be referred to as FKM). Given a prime number $latex p$, these are certain specific functions

$latex K\,:\, \mathbf{F}_p\rightarrow \mathbf{C}$

that “come from algebraic geometry”, and that can be studied using both a very rich formalism, and such extraordinarily deep results as Deligne’s “Weil 2” form of the Riemann Hypothesis over finite fields.

In fact, each function of this type is really a kind of “shadow” of a more intrinsic (more algebraic, more geometric, more arithmetic, as you wish) object, and it is rather these objects which algebraic geometry studies. In general, $latex K$ does not determine this other object: if I call $latex \mathcal{F}$ the latter, it may well be the case that two distinct objects $latex \mathcal{F}_1$ and $latex \mathcal{F}_2$ give rise to the same trace function $latex K$. However, there is also a basic complexity invariant $latex c(\mathcal{F})\geq 1$ defined for a given $latex \mathcal{F}$ (which is called its “conductor”), and one can show (this uses the Riemann Hypothesis…) that, given $latex p$, there is a bound $latex T(p)$ (which grows with $latex p$) such that a given function $latex K$ can come from at most one object $latex \mathcal{F}$ with complexity at most $latex T(p)$. I will come back to this in a later post, since I consider the question of determining precisely $latex T(p)$ to be quite fundamental and fascinating, but for the basic purpose of FKM, this issue does not really arise.

As a terminological aside, we tend to call these functions $latex K$ either “trace weights” or “trace functions”. Maybe a better word might be well-deserved for this notion, but we’re not quite sure what might work, though possibly we might use “tracic function”, a good translation of the French fonction tracique that we’ve found ourselves using; this has, at least, some classic ring.

In this first post, I will outline the three possible definitions (or interpretations) of the class of trace functions, going from what is possibly the most closely related to notions known to analytic number theorists, and ending with the most flexible, but maybe least familiar one.

Special Hecke eigenvalues of automorphic forms. In the first picture, one looks at automorphic forms related to the field $latex F=\mathbf{F}_p(T)$ of rational functions over the finite field $latex \mathbf{F}_p$. As is the case for classical modular forms, there are Hecke operators associated to each place of $latex F$, in particular to the irreducible polynomials $latex P_x=X-x$ for $latex x\in\mathbf{F}_p$. Given an automorphic form $latex \phi$, one can then define a function
$latex K_{\phi}(x)=\lambda_{X-x}(\phi),$
the corresponding Hecke eigenvalue for these particular Hecke operators. The complexity of $latex \phi$ can then be defined as the sum of the “traditional” automorphic conductor and the rank $latex r$. Indeed, it is essential here to consider automorphic forms on all groups $latex \mathrm{GL}_r(F)$, and not just on $latex \mathrm{GL}_1$ or $latex \mathrm{GL}_2$.

As examples, imitating the correspondence from Dirichlet characters to Hecke characters for $latex \mathrm{GL}_1$ over the field $latex \mathbf{Q}$, it is not too difficult to construct explicitly some automorphic forms (of rank $latex 1$) for which the associated functions are given by
$latex K(x)=e(P(x)/p),\quad\quad\text{ or }\quad\quad K(x)=\chi(P(x)),$
for some polynomial $latex P\in\mathbf{Z}[X]$ and some multiplicative Dirichlet character $latex \chi$. These are certainly the most natural-looking “functions of algebraic origin” on a finite field, and indeed this construction of (analogues of) Dirichlet characters is the original, and easiest, way to prove the rationality and functional equation for the associated $latex L$-functions over $latex F$ (since, in order to prove this, one does not even need to mention automorphic forms, the whole argument happening within the realm of Dirichlet characters.)
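
To fix ideas, here is a small Python sketch that tabulates these two basic functions and sums them over $latex \mathbf{F}_p$. The choices $latex p=101$, $latex P=X^3+X+1$ and $latex \chi$ the Legendre symbol are arbitrary examples of mine. The comparison with $latex (\deg P-1)\sqrt{p}$ is the classical Weil bound, which applies here because $latex P$ is squarefree modulo $latex 101$ and $latex \deg P<p$.

```python
import cmath
import math


def e_p(a: int, p: int) -> complex:
    """Additive character e(a/p) = exp(2*pi*i*a/p)."""
    return cmath.exp(2j * cmath.pi * (a % p) / p)


def legendre(a: int, p: int) -> int:
    """Legendre symbol modulo an odd prime p, via Euler's criterion."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def poly_eval(coeffs, x: int, p: int) -> int:
    """Evaluate a polynomial mod p (coefficients from highest degree down)."""
    value = 0
    for c in coeffs:
        value = (value * x + c) % p
    return value


def additive_sum(coeffs, p: int) -> complex:
    """Sum of K(x) = e(P(x)/p) over x in F_p."""
    return sum(e_p(poly_eval(coeffs, x, p), p) for x in range(p))


def multiplicative_sum(coeffs, p: int) -> int:
    """Sum of K(x) = chi(P(x)) over x in F_p, chi the Legendre symbol."""
    return sum(legendre(poly_eval(coeffs, x, p), p) for x in range(p))


if __name__ == "__main__":
    p, P = 101, [1, 0, 1, 1]  # hypothetical example: P = X^3 + X + 1
    weil_bound = (len(P) - 2) * math.sqrt(p)  # (deg P - 1) * sqrt(p)
    print(abs(additive_sum(P, p)), abs(multiplicative_sum(P, p)), weil_bound)
```

Both sums have $latex p$ terms of modulus at most $latex 1$, so the square-root cancellation that one observes is far from trivial.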

Despite their many fine qualities, automorphic forms are however a bit inflexible from the point of view of defining generalizations of these basic functions $latex K(x)$. For instance, it is rather difficult to write down concretely the function attached to an automorphic form of rank at least $latex 2$. In fact, I don’t really know how to do it (except for automorphic forms built from the case $latex r=1$, like analogues of Eisenstein series) without first applying one of the two other definitions, constructing some object $latex \mathcal{F}$ and associated trace function $latex K$, and then invoking some version of the Langlands correspondence to claim the existence of some automorphic form $latex \phi$ with Hecke eigenvalues $latex K_{\phi}$ coinciding with the original $latex K$.

Similarly, given two functions $latex K_1(x)$, $latex K_2(x)$ arising as Hecke eigenvalues of some automorphic forms $latex \phi_1$ and $latex \phi_2$, it is a rather big theorem to show that there exists another automorphic form with eigenvalues

$latex K(x)=K_1(x)K_2(x),$

(for $latex x$ unramified for both $latex \phi_1$ and $latex \phi_2$): this is the general theory of the Rankin-Selberg convolution.

Another serious drawback (which I will amplify later) is that this is, as far as I know and at the current time, strictly a one-variable story. There is no simple definition (that I know) that can be used to easily package a family of automorphic forms $latex \phi_t$ and, for instance, create a new automorphic form $latex \Phi$ with Hecke eigenvalues related to some average of the eigenvalues of $latex \phi_t$.

Galois representations of function fields. The first alternative to automorphic representations is given by Galois representations, and it is again a customary picture on the side of number fields. The base field is still $latex F=\mathbf{F}_p(T)$, but we now consider the Galois group
$latex G=\mathrm{Gal}(F^{sep}/F)$
of some separable closure of $latex F$, and finite-dimensional representations
$latex \rho\,:\, G\rightarrow \mathrm{GL}(V).$
Then, as is customary in algebraic number theory, for any $latex x\in \mathbf{F}_p$, we have the associated decomposition and inertia group at the place corresponding to $latex x$, and the Frobenius automorphism $latex Fr_x$ which acts on $latex V$ if $latex x$ is unramified for $latex \rho$ (i.e., if the inertia group at $latex x$ acts trivially on $latex V$) and which acts on the invariants $latex V^{I_x}$ otherwise. In all cases we can define a function
$latex K(x)=\mathrm{Tr}(\rho(Fr_x)\mid V^{I_x}).$
It is immediately clear that such a definition gives a very flexible formalism, because we are now dealing largely with linear algebra. So formally, we can add these functions (taking direct sums of representations), or multiply them (taking tensor products; because this operation does not always commute with invariants, the corresponding trace function coincides with the product of the two factors at the unramified $latex x$, but may differ at the others).

There is a non-trivial difficulty having to do with topology: to obtain a good theory, since $latex G$ is an infinite profinite group, we want to consider continuous representations. But then, if $latex V$ is a $latex \mathbf{C}$-vector space with its usual topology, we have the difficulty that there are too few representations: any continuous representation then has finite image. One works around this issue by the well-known device of picking some auxiliary prime number $latex \ell\not=p$, and considering continuous representations into $latex \bar{\mathbf{Q}}_{\ell}$-vector spaces. There are many representations in that case (in particular, many with large infinite image), but of course the trace function now takes values in an $latex \ell$-adic field. Qu’à cela ne tienne (or, as Katz says, ell-adic, schmell-adic), one can pick (with some effort or help from a friendly axiom) an isomorphism
$latex \iota\,:\, \bar{\mathbf{Q}}_{\ell}\rightarrow \mathbf{C},$
and consider the function
$latex x\mapsto \iota(\mathrm{Tr}(\rho(Fr_x)\mid V^{I_x})),$
which is complex-valued.
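
The linear algebra behind "adding" and "multiplying" trace functions is just the behavior of the trace under direct sums and tensor products. Here is a minimal Python sketch. The two matrices are made-up stand-ins for $latex \rho_1(Fr_x)$ and $latex \rho_2(Fr_x)$ at a point $latex x$ unramified for both representations, and they do not come from any actual Galois representation.

```python
def trace(m):
    """Trace of a square matrix given as a list of rows."""
    return sum(m[i][i] for i in range(len(m)))


def direct_sum(a, b):
    """Block-diagonal matrix with blocks a and b."""
    n, k = len(a), len(b)
    out = [[0] * (n + k) for _ in range(n + k)]
    for i in range(n):
        for j in range(n):
            out[i][j] = a[i][j]
    for i in range(k):
        for j in range(k):
            out[n + i][n + j] = b[i][j]
    return out


def tensor_product(a, b):
    """Kronecker product of two square matrices."""
    n, k = len(a), len(b)
    return [
        [a[i // k][j // k] * b[i % k][j % k] for j in range(n * k)]
        for i in range(n * k)
    ]


if __name__ == "__main__":
    # Hypothetical stand-ins for rho_1(Fr_x) and rho_2(Fr_x).
    A = [[1, 2], [3, 4]]
    B = [[0, 1, 0], [2, 5, 1], [1, 1, -2]]
    print(trace(direct_sum(A, B)), trace(A) + trace(B))
    print(trace(tensor_product(A, B)), trace(A) * trace(B))
```

At a ramified point, one would have to restrict to inertia invariants before taking the trace, and this is where the product formula can fail.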

The complexity is, here also, easy to define: there is a notion of Artin conductor for such a representation, and we add the dimension of $latex V$ to take the latter into account.

For applications to constructing interesting functions, this business involving $latex \ell$ shouldn’t be considered as too problematic. In fact, to a large extent, it turns out that the theory is rather independent of $latex \ell$. Without wanting to develop this too much, one can already see it by noticing that for any $latex \ell\not=p$, one can rather easily construct Galois representations with trace functions equal to
$latex K(x)=e(P(x)/p),\quad\quad K(x)=\chi(P(x)),$
the basic examples already considered. In fact, this is rather simpler than the corresponding construction of Dirichlet characters of $latex F$, and in particular, it is very easy to go from the construction of representations $latex \rho_a$ and $latex \rho_m$ with respective trace functions
$latex K_a(x)=e(x/p),\quad\quad K_m(x)=\chi(x),$
to the case involving a general polynomial: we have a map $latex F\rightarrow F$ by $latex T\mapsto P(T)$, hence a map of Galois groups $latex P^*\,:\, G\rightarrow G$, and we can “just” consider the composites
$latex \rho_a\circ P^*,\quad\quad \rho_m\circ P^*,$
to get the desired representations. (This is really a restriction of representations.)

This theory also has fairly natural extensions to higher-dimensional varieties (though one must assume some smoothness for the theory to work decently). To a large extent, FKM might have been written in this language, as far as the definitions of trace weights are concerned. But we use instead the third approach…

Middle-extension sheaves on the affine line. This last theory is closer in terms of formalism to the previous one, but more geometric in spirit, and it is the most flexible. Indeed, it is the one we use in FKM. But the counterpart to this geometric flexibility is that the basic flavor of the definition is least familiar to analytic number theorists. (Here, I am reminded of Cyrano de Bergerac who, having described six different ways of going to the moon, and being asked “Which one did you choose”, replied “A seventh”; or, in proper subjunctive French, –Mais voilà six moyens excellents !. . .Quel système Choisîtes-vous des six, Monsieur ? — Un septième !)

Here the basic object is an $latex \ell$-adic étale sheaf on the affine line over $latex \mathbf{F}_p$, with an added “regularity” property. It is a consequence of basic properties of such objects that, for any $latex x\in\mathbf{F}_p$, we can look at the “stalk” at $latex x$, which is a finite-dimensional $latex \bar{\mathbf{Q}}_{\ell}$-vector space $latex \mathcal{F}_x$, and that the Frobenius automorphism (in some incarnation) acts on this vector space, allowing us to define a trace function
$latex K(x)=\mathrm{Tr}(Fr\mid \mathcal{F}_x),$
and this is how we get our trace weights from this point of view.

To get a feeling for the actual meaning of this, I would like first to refer to my old expository text on Deligne’s first proof of the Riemann Hypothesis over finite fields, where the first part is an introduction to étale cohomology, which might be useful for readers with some basic background in elliptic curves over finite fields, but who haven’t studied the étale topology yet. But here is a more down-to-earth way of seeing things, which mixes fish and fowl to some extent.

A middle-extension sheaf $latex \mathcal{F}$ on the affine line over $latex \mathbf{F}_p$, whatever is the actual definition, comes concretely with some data. One of them is a finite set $latex S\subset \bar{\mathbf{F}}_p$ of singularities, which is defined over $latex \mathbf{F}_p$ (in other words, it is the zero set of some non-zero polynomial in $latex \mathbf{F}_p[T]$). On the complement $latex U$ of this set, the sheaf is what is called lisse, which is equivalent to saying that there is a representation of the étale fundamental group $latex \pi_1(U)$ of $latex U$ in some finite-dimensional $latex \bar{\mathbf{Q}}_{\ell}$-vector space which is “equivalent” to the restriction of the sheaf to $latex U$. But this étale fundamental group is, in fact, none other (canonically isomorphic) than the Galois group $latex G=\mathrm{Gal}(F^{sep}/F)$ of the previous description. And in fact, if we view the representation corresponding to $latex \mathcal{F}$ as a representation of $latex G$, the trace functions are the same.

This allows us at least to describe how one can define the complexity of a middle-extension sheaf: one just takes the complexity of the associated Galois representation (the dimension of the vector space, plus the Artin conductor.)

What is the point then of thinking in terms of sheaves? To my mind, here are some important advantages:

  • The geometric picture that arises is often the easiest way to “see” how to manipulate trace functions to construct new ones;
  • There are different ways of extending a lisse sheaf on $latex U$ to a sheaf on the affine line, and the “middle-extension” is just one of them. It is, in some sense, the best one, but there are others. In the general theory, these may come out because some construction goes outside of the realm of middle-extension sheaves: for instance, the tensor product of two middle-extension sheaves is not one in general; this accounts in a precise way for the way the product of two trace functions may not be one exactly;
  • The theory of sheaves extends handily to higher-dimensional varieties, where more types of singularities and other behaviors arise because there is “more room” for the dimension of various sets where different behaviors arise (so sheaves on a surface might be supported on a curve, etc). Here it is important to see middle-extension sheaves as just some of the étale sheaves, and to allow more general ones.
  • The formalism is by far the most powerful. Especially crucial to the proofs of the deepest results (including the Riemann Hypothesis) is the existence of the étale cohomology groups of a sheaf, and of so-called “higher-direct images” (with compact support or not), which make sense for étale sheaves, but in general do not preserve such regularity properties as being lisse or middle-extension.
  • As a consequence of the above, this is the language in which the sources concerning the properties of étale sheaves are written; for FKM, this means especially the books of N. Katz, which we have consulted and referenced extensively…

To conclude this first post, here is a concrete illustration of what the sheaf formalism gives that is important to analytic number theorists, and which is completely mysterious (as far as I know, at least) on the level of Galois representations or automorphic forms: the existence of the Fourier transform. In fact, given a trace weight $latex K(x)$ associated to some sheaf $latex \mathcal{F}$, a construction of Deligne delivers another sheaf $latex \mathcal{G}$, which is still a middle-extension sheaf, and is such that the associated trace function is
$latex \hat{K}(x)=-\frac{1}{\sqrt{p}}\sum_{y\in\mathbf{F}_p}K(y)e\Bigl(\frac{xy}{p}\Bigr).$
This construction is not obvious; in fact, it involves (1) the fact that sheaves make sense on higher-dimensional varieties, with a wide variety of “functorial” properties; (2) the fact that higher-direct images exist: this is what is needed to obtain results of the type “a sum over $latex y$ of some trace functions parametrized by $latex x$ is itself a trace function”…
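
On the level of functions (which is of course the easy part, the whole point being that the sheaves follow), the normalization above can be sketched in a few lines of Python. The helper names and the sample function are my own illustrative choices. The script checks two elementary facts: the transform preserves the $latex L^2$-norm, and applying it twice gives $latex x\mapsto K(-x)$, the two minus signs cancelling.

```python
import cmath
import math


def fourier_transform(K, p: int):
    """Return hat K as a list, where K is a list of length p and
    hat K(x) = -(1/sqrt(p)) * sum_y K(y) e(xy/p)."""
    return [
        -sum(K[y] * cmath.exp(2j * cmath.pi * ((x * y) % p) / p) for y in range(p))
        / math.sqrt(p)
        for x in range(p)
    ]


def norm_squared(K) -> float:
    """Sum of |K(x)|^2 over x in F_p."""
    return sum(abs(v) ** 2 for v in K)


if __name__ == "__main__":
    p = 11
    # Arbitrary sample function on F_p (not a trace function in any sense).
    K = [complex((x * x + 1) % p, x) for x in range(p)]
    K_hat = fourier_transform(K, p)
    K_hat_hat = fourier_transform(K_hat, p)
    print(norm_squared(K), norm_squared(K_hat))
    print(max(abs(K_hat_hat[x] - K[(-x) % p]) for x in range(p)))
```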

If we assume the existence of this construction (and most analytic number theorists would argue that, whatever a theory of functions of algebraic origin might do, it should be compatible with Fourier transform…) we immediately expand our range of examples with some highly-interesting ones, starting with the basic cases
$latex K(x)=e(P(x)/p),\quad\quad K(x)=\chi(P(x)),$
whose Fourier transforms are extremely interesting: they are values of families of exponential sums in one variable.

For instance, take
$latex K(x)=e(\bar{x}/p),\text{ for } x\not\equiv 0\pmod{p},$
where we denote by $latex \bar{x}$ the inverse of $latex x$ modulo $latex p$. Then we find that
$latex \hat{K}(x)=-\frac{1}{\sqrt{p}}\sum_{y\not=0}{e\Bigl(\frac{xy+\bar{y}}{p}\Bigr)}$
is a trace weight! In other words, the family of Kloosterman sums $latex S(x,1;p)$, as a function of $latex x$, is a function of algebraic origin modulo $latex p$…
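
Here is a minimal Python sketch that computes this function $latex \hat{K}$ directly from the definition; the prime $latex p=101$ is an arbitrary choice, and the function names are mine. It also checks two classical facts: the values are real, and by the Weil bound for Kloosterman sums they satisfy $latex |\hat{K}(x)|\leq 2$ for $latex x\not=0$ (for $latex x=0$ the value is $latex 1/\sqrt{p}$, since the sum is then equal to $latex -1$).

```python
import cmath
import math


def kloosterman_sum(a: int, b: int, p: int) -> complex:
    """S(a, b; p) = sum over y != 0 mod p of e((a*y + b*y^{-1}) / p)."""
    total = 0j
    for y in range(1, p):
        y_inv = pow(y, p - 2, p)  # inverse of y modulo the prime p (Fermat)
        total += cmath.exp(2j * cmath.pi * ((a * y + b * y_inv) % p) / p)
    return total


def K_hat(x: int, p: int) -> complex:
    """Normalized Fourier transform of K(y) = e(y^{-1}/p), y != 0."""
    return -kloosterman_sum(x, 1, p) / math.sqrt(p)


if __name__ == "__main__":
    p = 101
    values = [K_hat(x, p) for x in range(p)]
    print(max(abs(v.imag) for v in values))  # numerically zero: values are real
    print(max(abs(v) for v in values))       # at most 2 by the Weil bound
```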

Trailer for the next posts! I will probably next describe many examples of trace functions, and discuss the formalism that allows us to manipulate them conveniently. After this, I will come to their analytic properties, where the key point is the Riemann Hypothesis over finite fields…

Local limit theorems slides

I gave today a lecture in the conference in honor of F. Delbaen at ETH, and since the rooms were not suitable for blackboard talks, I prepared a beamer talk. I’ve put up the slides on the web for any interested reader; the topic was a survey of the general ideas surrounding my papers with A. Nikeghbali, J. Jacod, A. Barbour, and F. Delbaen on mod-gaussian convergence, mod-Poisson convergence, and related limiting behavior of sequences of random variables. I have posted about this a few times before, but not about all of the results. All the corresponding papers and preprints can be found on my home page.

Correlation sums in the wild

In my last post concerning my joint work with É. Fouvry and Ph. Michel, I reported a few weeks ago how happy we were to have found in the literature a specific case of the general correlation sums that we introduced in our paper to deal with “algebraic twists” of modular forms. The example, we are happy to report, turns out not to be isolated: we have found three more in the last few days. I only list them in the briefest way below, since some of them are rather complicated looking, but precise statements and references are found in a short note we just finished typing. It is rather nice to see how some order emerges from these sums (the last one has no less than 8 parameters, in addition to the three variables of summation) once they are considered from the point of view of general correlation sums.

  • In a paper from 1990 concerning small eigenvalues of the hyperbolic Laplace operator in special situations, H. Iwaniec considers correlation sums related to the weight $latex K(x)=e(2\bar{x}/p)S(\bar{x},\bar{x};p)$, where $latex \bar{x}$ is the inverse of $latex x$ modulo $latex p$, and $latex S(m,n;p)$ is the usual Kloosterman sum; the matrices $latex \gamma$ in the correlation sums are here all upper-triangular, the difficult case being when $latex \gamma$ is not diagonal.  It turns out that the general machinery we develop proves the desired estimates for these sums.
  • In a paper of 1995, N. Pitt considers correlation sums related to the weight $latex K(x)=e(\bar{x}/p)$, which was also the one involved in the sums of Friedlander and Iwaniec that we had earlier identified as examples of correlation sums. However, whereas the matrices in that first case were lower-triangular, there is no particular restriction on the sums of Pitt (which are somewhat involved, with 4 parameters), except that they are not upper-triangular.
  • Finally, in a recent preprint, R. Munshi also considers correlation sums related to $latex K(x)=e(\bar{x}/p)$, also without any particular restriction on the matrices involved except that they are not upper-triangular. These sums differ from those of Pitt by the number and configuration of parameters (there are 8 here…)

We have not yet fully updated the text of the paper to mention these examples, but this will be done soon…

Lecture notes list

Just a note to mention that I’ve just created a web page with links to the notes I’ve written for various courses at ETH over the years (though most of them are not as complete as I would like).