## Seminar in Probability Theory

The Seminar in Probability Theory takes place during the semester, normally on Wednesday at 11:00.

### Program HS 2019

Date/Time Speaker Title Location
30 October 2019 Alain-Sol Sznitman
ETH Zurich
On bulk deviations for the local behavior of random interlacements > Spiegelgasse 1 Room 00.003
In this talk we will discuss some recent large deviation asymptotics concerning the local behavior of random interlacements on $$\mathbb Z^d$$, $$d\ge 3$$. In particular, we will describe the link with previous results concerning macroscopic holes left inside a large box by the adequately thickened connected component of the boundary of the box in the vacant sets of random interlacements.
13 November 2019 Pierre-François Rodriguez
Imperial College
On the phase transition for level-set percolation of the Gaussian free field > Spiegelgasse 1 Room 00.003
We will discuss recent progress regarding the geometry of the Gaussian free field in three and more dimensions. Our results, based on joint work with H. Duminil-Copin, S. Goswami and F. Severo, deal with the percolation problem associated to level-sets of the Gaussian free field, as first investigated by J. Lebowitz and H. Saleur in 1986. We will examine the equality of several natural critical parameters related to this model.
27 November 2019 Alexis Prévost
University of Köln
Percolation for the Gaussian free field on the cable system > Spiegelgasse 1 Room 00.003
Among percolation models with long-range correlations, the Gaussian free field on discrete graphs has received a lot of attention over the last few years, while its continuous equivalent, the cable system, has received much less. I will present a brief history of the question, and explain why, for this model, the critical parameter is surprisingly equal to 0, and is thus explicitly known, on a large class of graphs. There are different proofs of this result, through uniqueness of the infinite cluster, Russo formula, exploration martingale or random interlacements, but also examples where the critical parameter is not equal to 0. Finally, I will give the law for the capacity of the clusters of the level sets of the Gaussian free field on the cable system.
Joint work with Alexander Drewitz and Pierre-François Rodriguez.
4 December 2019 Razvan Gurau
CNRS
Invitation to random tensors > Spiegelgasse 1 Room 00.003
I will give an introduction to random tensors and their applications. In particular I will describe a universality result for invariant probability measures for tensors: under generic scaling assumptions, for large tensor sizes any invariant tensor measure approaches a Gaussian. I will then discuss the implications of this result, as well as ways to avoid it.
11 December 2019 Daniele Tantari
Scuola Normale Superiore
Direct/Inverse Hopfield model and Restricted Boltzmann Machines > Spiegelgasse 1 Room 00.003
Mean-field methods fail to reconstruct the parameters of the model when the dataset is clustered. This situation is found at low temperatures because of the emergence of multiple thermodynamic states. The paradigmatic Hopfield model is considered in a teacher-student scenario as a problem of unsupervised learning with Restricted Boltzmann Machines (RBMs). For different choices of the priors on units and weights, the replica symmetric phase diagram of random RBMs is analyzed and in particular the paramagnetic phase boundary is presented as directly related to the optimal size of the training set necessary for a good generalization. The connection between the direct and inverse problem is pointed out by showing that inference can be efficiently performed by suitably adapting both standard learning techniques and standard approaches to the direct problem.
18 December 2019 Nicolas Macris
EPFL
Optimal errors and phase transitions in high-dimensional generalised linear models > Spiegelgasse 1 Room 00.003
High-dimensional generalized linear models are basic building blocks of current data analysis tools including multilayer neural networks. They arise in signal processing, statistical inference, machine learning, communication theory, and other fields. I will explain how to establish rigorously the intrinsic information-theoretic limitations of inference and learning for a class of randomly generated instances of generalized linear models, thus closing several old conjectures. Examples will be shown where one can delimit regions of parameters for which the optimal error rates are efficiently achievable with currently known algorithms. I will discuss how the proof technique, based on the recently developed adaptive interpolation method, is able to deal with the output nonlinearity and also to some extent with non-separable input distributions.

### Program FS 2019

Date/Time Speaker Title Location
27 February 2019 Mo Dick Wong
University of Cambridge
Universal tail profile of Gaussian multiplicative chaos > Spiegelgasse 5 Room 05.002
We study the tail probability of the mass of Gaussian multiplicative chaos and establish a formula for the leading order asymptotics under very mild assumptions, resolving a recent conjecture of Rhodes and Vargas. The leading order coefficient can be described by the product of two constants, one capturing the dependence on the test set and any non-stationarity and the other one encoding the universal properties of multiplicative chaos. This may be seen as a first step in understanding the full distributional properties of Gaussian multiplicative chaos.
20 March 2019 David Belius
University of Basel
Theory of Deep Learning 1: Introduction to the main questions >
slides
Spiegelgasse 5 Room 05.002
This is the first talk in a five-part series of talks on deep learning from a theoretical point of view, held jointly between the probability theory and machine learning groups of the Department of Mathematics and Computer Science. The four invited speakers who follow this talk are young researchers who are contributing in different ways to what will hopefully eventually be a comprehensive theory of deep neural networks.

In this first talk I will introduce the main theoretical questions about deep neural networks:
1. Representation - what can deep neural networks represent?
2. Optimization - why and under what circumstances can we successfully train neural networks?
3. Generalization - why do deep neural networks often generalize well, despite huge capacity?

As a preface I will review the basic models and algorithms (Neural Networks, (stochastic) gradient descent, ...) and some important concepts from machine learning (capacity, overfitting/underfitting, generalization, ...).
27 March 2019 Levent Sagun
EPFL
Theory of Deep Learning 2: Over-parametrization in neural networks: an overview and a definition >
slides
Spiegelgasse 5 Room 05.002
An excursion around the ideas for why the stochastic gradient descent algorithm works well on training deep neural networks leads to considerations about the underlying geometry of the related loss function. Recently, we gained a lot of insight into how tuning SGD leads to better or worse generalization properties on a given model and task. Furthermore, we have a reasonably large set of observations that lead to the conclusion that more parameters typically lead to better accuracies as long as the training process is not hampered. In this talk, I will speculatively argue that as long as the model is over-parameterized (OP), all solutions are equivalent up to finite size fluctuations.
We will start by reviewing some of the recent literature on the geometry of the loss function, and how SGD navigates the landscape in the OP regime. Then we will see how to define OP by finding a sharp transition described by the model's ability to fit its training set. Finally, we will discuss how this critical threshold is connected to the generalization properties of the model, and argue that life beyond this threshold is (more or less) as good as it gets.
3 April 2019 Arthur Jacot
EPFL
Theory of Deep Learning 3: Neural Tangent Kernel: Convergence and Generalization of Deep Neural Networks >
slides
Spiegelgasse 5 Room 05.002
We show that the behaviour of a Deep Neural Network (DNN) during gradient descent is described by a new kernel: the Neural Tangent Kernel (NTK). More precisely, as the parameters are trained using gradient descent, the network function (which maps the network inputs to the network outputs) follows a so-called kernel gradient descent w.r.t. the NTK. We prove that as the network layers get wider and wider, the NTK converges to a deterministic limit at initialization, which stays constant during training. This implies in particular that if the NTK is positive definite, the network function converges to a global minimum. The NTK also describes how DNNs generalise outside the training set: for a least squares cost, the network function converges in expectation to the NTK kernel ridgeless regression, explaining how DNNs generalise in the so-called overparametrized regime, which is at the heart of most recent developments in deep learning.
10 April 2019 Lenaïc Chizat
Université Paris-Sud
Theory of Deep Learning 4: Training Neural Networks in the Lazy and Mean Field Regimes >
slides
Spiegelgasse 5 Room 05.002
The current successes achieved by neural networks are mostly driven by experimental exploration of various architectures, pipelines, and hyper-parameters, motivated by intuition rather than precise theories. Focusing on the optimization/training aspect, we will see in this talk why pushing theory forward is challenging, but also why it matters and what key insights it may lead to. We will review some recent results on the phenomenon of "lazy training", on the role of over-parameterization, and on training neural networks with a single hidden layer.
15 April 2019 (Monday) 13:00 Marylou Gabrié
ENS
Theory of Deep Learning 5: Information theoretic approach to deep learning theory: a test using statistical physics methods > slides Spiegelgasse 5 Room 05.002
The complexity of deep neural networks remains an obstacle to the understanding of their great efficiency. Their generalisation ability, a priori counterintuitive, is not yet fully accounted for. Recently an information theoretic approach was proposed to investigate this question.
Relying on the heuristic replica method from statistical physics we present an estimator for entropies and mutual informations in models of deep neural networks. Using this new tool, we test numerically the relation between generalisation and information.
8 May 2019 Roland Bauerschmidt
University of Cambridge
The geometry of random walk isomorphisms > Spiegelgasse 5 Room 05.002
The classical random walk isomorphism theorems relate the local time of a random walk to the square of a Gaussian free field. I will present non-Gaussian versions of these theorems, relating hyperbolic and hemispherical sigma models (and their supersymmetric versions) to non-Markovian random walks interacting through their local time. Applications include a short proof of the Sabot-Tarres limiting formula for the vertex-reinforced jump process (VRJP) and a Mermin-Wagner theorem for hyperbolic sigma models and the VRJP. This is joint work with Tyler Helmuth and Andrew Swan.
15 May 2019 Augusto Teixeira
IMPA
Random walk on a simple exclusion process > Spiegelgasse 5 Room 05.002
In this talk we will study the asymptotic behavior of a random walk that evolves on top of a simple symmetric exclusion process. This nice example of a random walk on a dynamical random environment presents its own challenges due to the slow mixing properties of the underlying medium. We will discuss a law of large numbers that has been proved recently for this random walk. Interestingly, we can only prove this law of large numbers for all but two exceptional densities of the exclusion process. The main technique that we have employed is a multi-scale renormalization that has been derived from works in percolation theory.
Monday 17 June 2019 11:00 Shuta Nakajima
University of Nagoya
Gaussian fluctuations in directed polymers > Spiegelgasse 5 Room 05.001
In this talk, we consider the discrete directed polymer model with i.i.d. environment and we study the fluctuations of the partition function. It was proven by Comets and Liu that for sufficiently high temperature, the fluctuations converge in distribution towards the product of the limiting partition function and an independent Gaussian random variable. We extend the result to the whole $$L^2$$-region, which is predicted to be the maximal high-temperature region where the Gaussian fluctuations should occur under the considered scaling. This is joint work with Clément Cosco.

### Program HS 2018

Date/Time Speaker Title Location
6 September 2018 Lisa Hartung
New York University
The Ginibre ensemble and Gaussian multiplicative chaos >
It was proven by Rider and Virag that the logarithm of the characteristic polynomial of the Ginibre ensemble converges to a logarithmically correlated random field. In this talk we will see how this connection can be established on the level of powers of the characteristic polynomial by proving convergence to Gaussian multiplicative chaos. We consider the range of powers in the $$L^2$$ phase.
(Joint work in progress with Paul Bourgade and Guillaume Dubach).
Spiegelgasse 1 Room 00.003
19 September 2018 Alexander Drewitz
Universität Köln
Ubiquity of phases in some percolation models with long-range correlations >
We consider two fundamental percolation models with long-range correlations: the Gaussian free field and (the vacant set of) random interlacements. Both models have been the subject of intensive research during the last years and decades, on $$\mathbb Z^d$$ as well as on some more general graphs. We investigate some structural percolative properties around their critical parameters, in particular the ubiquity of the infinite components of complementary phases.
This talk is based on joint works with A. Prévost (Köln) and P.-F. Rodriguez (Bures-sur-Yvette).
Spiegelgasse 1 Room 00.003
31 October 2018 Anton Klimovsky
Universität Duisburg-Essen
High-dimensional Gaussian fields with isotropic increments seen through spin glasses >
Finding the (space-height) distribution of the (local) extrema of high-dimensional strongly correlated random fields is a notoriously hard problem with many applications. Following Fyodorov and Sommers (2007), we focus on the Gaussian fields with isotropic increments and take the viewpoint of statistical physics. By exploiting various probabilistic symmetries, we rigorously derive the Fyodorov-Sommers formula for the log-partition function in the high-dimensional limit. The formula suggests a rich picture for the distribution of the local extrema akin to the celebrated spherical Sherrington-Kirkpatrick model with mixed p-spin interactions.
Spiegelgasse 1 Room 00.003
7 November 2018 Dominik Schröder
IST Austria
Cusp Universality for Wigner-type Random Matrices >
For Wigner-type matrices, i.e. Hermitian random matrices with independent, not necessarily identically distributed entries above the diagonal, we show that at any cusp singularity of the limiting eigenvalue distribution the local eigenvalue statistics are universal and form a Pearcey process. Since the density of states typically exhibits only square root or cubic root cusp singularities, our work complements previous results on the bulk and edge universality and it thus completes the resolution of the Wigner-Dyson-Mehta universality conjecture for the last remaining universality type.
Spiegelgasse 1 Room 00.003
14 November 2018 Marius Schmidt
Universität Basel
Oriented first passage percolation on the hypercube >
Consider the hypercube as a graph with vertex set $$\{0,1\}^N$$ and edges between two vertices if they are only one coordinate flip apart. Choosing independent standard exponentially distributed lengths for all edges and asking how long the shortest directed path from $$(0,\dots,0)$$ to $$(1,\dots,1)$$ is defines oriented first passage percolation on the hypercube. We will discuss the conceptual steps needed to answer this question to the precision of the extremal process, following the two-paper series "Oriented first passage percolation in the mean field limit" by Nicola Kistler, Adrien Schertzer and Marius A. Schmidt: arXiv:1804.03117 [math.PR] and arXiv:1808.04598 [math.PR].
Spiegelgasse 1 Room 00.003
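For readers who want to experiment with the model in the abstract above, the following is a minimal brute-force sketch, not the method of the cited papers. It computes the shortest directed path length on $$\{0,1\}^N$$ by dynamic programming. The function names `oriented_fpp_hypercube` and `sample_exponential_fpp` are hypothetical, and the approach visits all $$2^N$$ vertices, so it is only practical for small $$N$$.

```python
import random


def oriented_fpp_hypercube(n, edge_length):
    """Length of the shortest directed path from (0,...,0) to (1,...,1).

    Vertices of {0,1}^n are encoded as integers 0 .. 2**n - 1 (bit i is
    coordinate i). A directed edge flips one coordinate from 0 to 1, so it
    always leads from a smaller integer to a larger one. Increasing integer
    order is therefore a valid topological order for the dynamic programme.
    `edge_length(v, w)` is called exactly once per directed edge.
    """
    size = 1 << n
    dist = [float("inf")] * size
    dist[0] = 0.0
    for v in range(size):
        for i in range(n):
            if not v & (1 << i):          # coordinate i is still 0 at v
                w = v | (1 << i)          # flip it to 1
                candidate = dist[v] + edge_length(v, w)
                if candidate < dist[w]:
                    dist[w] = candidate
    return dist[size - 1]


def sample_exponential_fpp(n, seed=0):
    """One sample with independent standard exponential edge lengths."""
    rng = random.Random(seed)
    return oriented_fpp_hypercube(n, lambda v, w: rng.expovariate(1.0))
```

Calling `sample_exponential_fpp(10, seed=1)` draws one realisation for $$N = 10$$; averaging over several seeds gives a rough empirical picture of how the shortest path length behaves as $$N$$ grows.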
21 November 2018 Antti Knowles
University of Geneva
Local law and eigenvector delocalization for supercritical Erdős-Rényi graphs >
We consider the adjacency matrix of the Erdős-Rényi graph $$G(N,p)$$ in the supercritical regime $$pN > C \log N$$ for some universal constant $$C$$. We show that the eigenvalue density is with high probability well approximated by the semicircle law on all spectral scales larger than the typical eigenvalue spacing. We also show that all eigenvectors are completely delocalized with high probability. Both results are optimal in the sense that they are known to be false for $$pN < \log N$$. A key ingredient of the proof is a new family of large deviation estimates for multilinear forms of sparse vectors. Joint work with Yukun He and Matteo Marcozzi.
Spiegelgasse 1 Room 00.003
28 November 2018 Gaultier Lambert
University of Zurich
How much can the eigenvalue of a random matrix fluctuate? >
The goal of this talk is to explain how much the eigenvalues of large Hermitian random matrices deviate from certain deterministic locations. These are known as “rigidity estimates” in the literature and they play a crucial role in the proof of universality. I will review some of the current results on eigenvalue fluctuations and present a new approach which relies on the theory of Gaussian Multiplicative Chaos and leads to optimal rigidity estimates for the Gaussian Unitary Ensemble. I will also mention how one can deduce a central limit theorem from our proof. This is joint work with Tom Claeys, Benjamin Fahs and Christian Webb.
Spiegelgasse 1 Room 00.003
12 December 2018 Ioan Manolescu
University of Fribourg
Uniform Lipschitz functions on the triangular lattice have logarithmic variations >
Uniform integer-valued Lipschitz functions on a finite domain of the triangular lattice are shown to have variations of logarithmic order in the radius of the domain. The level lines of such functions form a loop $$O(2)$$ model on the edges of the hexagonal lattice with edge-weight one. An infinite-volume Gibbs measure for the loop $$O(2)$$ model is constructed as a thermodynamic limit and is shown to be unique. It contains only finite loops and has properties indicative of scale-invariance: macroscopic loops appearing at every scale. The existence of the infinite-volume measure carries over to height functions pinned at 0; the uniqueness of the Gibbs measure does not. The proof is based on a representation of the loop $$O(2)$$ model via a pair of spin configurations that are shown to satisfy the FKG inequality. We prove RSW-type estimates for a certain connectivity notion in the aforementioned spin model. Based on joint work with Alexander Glazman.
Spiegelgasse 1 Room 00.003