Seminar in Probability Theory
The Seminar in Probability Theory takes place during the semester, normally on Wednesdays at 11:00.
Program HS 2019
Date/Time  Speaker  Title  Location 

30 October 2019  Alain-Sol Sznitman ETH Zurich 
On bulk deviations for the local behavior of random interlacements  Spiegelgasse 1 Room 00.003 
In this talk we will discuss some recent large deviation asymptotics concerning the
local behavior of random interlacements on \(\mathbb Z^d\), \(d\ge 3\). In
particular, we will describe the link with previous results concerning macroscopic
holes left inside a large box, by the adequately thickened connected component
of the boundary of the box in the vacant set of random interlacements.


13 November 2019  Pierre-François Rodriguez Imperial College 
On the phase transition for level-set percolation of the Gaussian free field  Spiegelgasse 1 Room 00.003 
We will discuss recent progress regarding the geometry of the Gaussian free field
in three and more dimensions. Our results, based on joint work with H.
Duminil-Copin, S. Goswami and F. Severo, deal with the percolation problem
associated to level sets of the Gaussian free field, as first investigated by J.
Lebowitz and H. Saleur in 1986. We will examine the equality of several natural
critical parameters related to this model.


27 November 2019  Alexis Prévost Universität Köln 
Percolation for the Gaussian free field on the cable system  Spiegelgasse 1 Room 00.003 
Among percolation models with long-range correlations, the Gaussian free field on
discrete graphs has received a lot of attention over the last few years, whereas its
continuous counterpart, the cable system, has received much less. I will present a
brief history of the question, and explain why, for this model, the critical parameter
is surprisingly equal to 0, and is thus explicitly known, on a large class of graphs.
There are different proofs of this result, through uniqueness of the infinite
cluster, Russo's formula, exploration martingales or random interlacements, but also
examples where the critical parameter is not equal to 0. Finally, I will give the
law for the capacity of the clusters of the level sets of the Gaussian free field
on the cable system.
Joint work with Alexander Drewitz and Pierre-François Rodriguez. 

4 December 2019  Razvan Gurau CNRS 
Invitation to random tensors  Spiegelgasse 1 Room 00.003 
I will give an introduction to random tensors and their applications. In particular, I will describe a universality result for invariant probability measures for tensors: under generic scaling assumptions, for large tensor sizes any invariant tensor measure approaches a Gaussian. I will then discuss the implications of this result, as well as ways to avoid it.


11 December 2019  Daniele Tantari Scuola Normale Superiore 
Direct/Inverse Hopfield model and Restricted Boltzmann Machines  Spiegelgasse 1 Room 00.003 
Mean-field methods fail to reconstruct the parameters of the model when the dataset is clustered. This situation occurs at low temperatures because of the emergence of multiple thermodynamic states. The paradigmatic Hopfield model is considered in a teacher-student scenario as a problem of unsupervised learning with Restricted Boltzmann Machines (RBMs). For different choices of the priors on units and weights, the replica-symmetric phase diagram of random RBMs is analyzed; in particular, the paramagnetic phase boundary is shown to be directly related to the optimal size of the training set necessary for good generalization. The connection between the direct and inverse problems is pointed out by showing that inference can be performed efficiently by suitably adapting both standard learning techniques and standard approaches to the direct problem.
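For readers unfamiliar with the direct problem: the classical Hopfield model stores patterns in Hebbian couplings and retrieves them by zero-temperature dynamics. The sketch below is a generic textbook toy (the network size, pattern count and corruption level are arbitrary illustrative choices), not the teacher-student setup analysed in the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
N, P = 100, 3  # neurons and stored patterns (illustrative sizes)

# Hebbian couplings J_ij = (1/N) sum_mu xi_i^mu xi_j^mu, zero diagonal
xi = rng.choice([-1, 1], size=(P, N))
J = (xi.T @ xi) / N
np.fill_diagonal(J, 0.0)

def recall(s, steps=20):
    """Zero-temperature synchronous dynamics: s <- sign(J s)."""
    for _ in range(steps):
        s = np.where(J @ s >= 0, 1, -1)
    return s

# Start from a corrupted copy of pattern 0 (flip 10% of the spins)
s0 = xi[0].copy()
flip = rng.choice(N, size=10, replace=False)
s0[flip] *= -1
overlap = recall(s0) @ xi[0] / N  # overlap 1 means perfect retrieval
```

At such a small load (P/N = 0.03) the dynamics flows back to the stored pattern; the inverse problem discussed in the talk asks the converse question of recovering the couplings from sampled configurations.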


18 December 2019  Nicolas Macris EPFL 
Optimal errors and phase transitions in high-dimensional generalised linear models  Spiegelgasse 1 Room 00.003 
High-dimensional generalized linear models are basic building blocks of current data-analysis tools, including multilayer neural networks. They arise in signal processing, statistical inference, machine learning, communication theory, and other fields. I will explain how to establish rigorously the intrinsic information-theoretic limitations of inference and learning for a class of randomly generated instances of generalized linear models, thus settling several old conjectures. Examples will be shown where one can delimit regions of parameters for which the optimal error rates are efficiently achievable with currently known algorithms. I will discuss how the proof technique, based on the recently developed adaptive interpolation method, is able to deal with the output nonlinearity and, to some extent, with non-separable input distributions.
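The "randomly generated instances" above can be made concrete in a few lines. The sketch below draws a perceptron-type instance (sign output nonlinearity) and recovers the signal direction with a naive plug-in correlation; both the estimator and all sizes are illustrative assumptions, not methods or results from the talk.

```python
import numpy as np

rng = np.random.default_rng(6)
n, d = 2000, 500                          # samples and dimension (illustrative)

x_star = rng.normal(size=d)               # hidden signal to be inferred
A = rng.normal(size=(n, d)) / np.sqrt(d)  # random Gaussian sensing matrix
y = np.sign(A @ x_star)                   # output nonlinearity: the perceptron case

# Naive plug-in estimator (for illustration only): correlate measurements back.
# For Gaussian rows, E[a_i y_i] is proportional to x_star's direction.
x_hat = A.T @ y
corr = x_hat @ x_star / (np.linalg.norm(x_hat) * np.linalg.norm(x_star))
```

The information-theoretic question of the talk is how large n/d must be before any estimator can achieve a given correlation, and when efficient algorithms attain it.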

Program FS 2019
Date/Time  Speaker  Title  Location 

27 February 2019  Mo Dick Wong University of Cambridge 
Universal tail profile of Gaussian multiplicative chaos  Spiegelgasse 5 Room 05.002 
We study the tail probability of the mass of Gaussian multiplicative chaos and
establish a formula for the leading order asymptotics under very mild assumptions,
resolving a recent conjecture of Rhodes and Vargas. The leading order coefficient
can be described by the product of two constants, one capturing the dependence on
the test set and any non-stationarity and the other one encoding the universal
properties of multiplicative chaos. This may be seen as a first step in
understanding the full distributional properties of Gaussian multiplicative chaos.


20 March 2019  David Belius University of Basel 
Theory of Deep Learning 1: Introduction to the main
questions  slides 
Spiegelgasse 5 Room 05.002 
This is the first talk in a five part series of talks on deep learning from a
theoretical point of view, held jointly between the probability theory and machine
learning groups of the Department of Mathematics and Computer Science. The four
invited speakers that follow this talk are young researchers who are
contributing in different ways to what will hopefully eventually be a comprehensive
theory of deep neural networks.
In this first talk I will introduce the main theoretical questions about deep neural networks: 1. Representation: what can deep neural networks represent? 2. Optimization: why and under what circumstances can we successfully train neural networks? 3. Generalization: why do deep neural networks often generalize well, despite huge capacity? As a preface I will review the basic models and algorithms (Neural Networks, (stochastic) gradient descent, ...) and some important concepts from machine learning (capacity, overfitting/underfitting, generalization, ...). 
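As a minimal runnable companion to this preface: a one-hidden-layer network trained with mini-batch SGD on a toy one-dimensional regression task. The target function, all sizes and the learning rate are arbitrary illustrative choices, assuming nothing beyond the abstract.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy regression task: learn f(x) = sin(x) on [-pi, pi]
X = rng.uniform(-np.pi, np.pi, size=(256, 1))
y = np.sin(X)

# One-hidden-layer network: x -> W2 tanh(W1 x + b1) + b2
h = 32
W1 = rng.normal(0, 1, (1, h)); b1 = np.zeros(h)
W2 = rng.normal(0, 1 / np.sqrt(h), (h, 1)); b2 = np.zeros(1)

def forward(x):
    a = np.tanh(x @ W1 + b1)
    return a, a @ W2 + b2

lr, batch = 0.05, 32
loss0 = np.mean((forward(X)[1] - y) ** 2)   # loss at initialization
for step in range(2000):                    # plain mini-batch SGD
    idx = rng.choice(len(X), batch, replace=False)
    xb, yb = X[idx], y[idx]
    a, out = forward(xb)
    g = 2 * (out - yb) / batch              # dL/dout for squared loss
    gW2 = a.T @ g; gb2 = g.sum(0)
    ga = g @ W2.T * (1 - a ** 2)            # backprop through tanh
    gW1 = xb.T @ ga; gb1 = ga.sum(0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1
loss1 = np.mean((forward(X)[1] - y) ** 2)   # training loss after SGD
```

The three questions of the talk appear even in this toy: what such a network can represent, why SGD finds a low-loss solution here, and whether the fit generalizes beyond the 256 training points.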

27 March 2019  Levent Sagun EPFL 
Theory of Deep Learning 2: Overparametrization in neural
networks: an overview and a definition  slides 
Spiegelgasse 5 Room 05.002 
An excursion around the ideas for why the stochastic gradient descent algorithm
works well on training deep neural networks leads to considerations about the
underlying geometry of the related loss function. Recently, we gained a lot of
insight into how tuning SGD leads to better or worse generalization properties on a
given model and task. Furthermore, we have a reasonably large set of observations
that lead to the conclusion that more parameters typically lead to better
accuracies as long as the training process is not hampered. In this talk, I will
speculatively argue that as long as the model is overparameterized (OP), all
solutions are equivalent up to finite size fluctuations.
We will start by reviewing some of the recent literature on the geometry of the loss function, and how SGD navigates the landscape in the OP regime. Then we will see how to define OP via a sharp transition in the model's ability to fit its training set. Finally, we will discuss how this critical threshold is connected to the generalization properties of the model, and argue that life beyond this threshold is (more or less) as good as it gets. 

3 April 2019  Arthur Jacot EPFL 
Theory of Deep Learning 3: Neural Tangent Kernel:
Convergence and Generalization of Deep Neural Networks  slides 
Spiegelgasse 5 Room 05.002 
We show that the behaviour of a Deep Neural Network (DNN) during gradient descent
is described by a new kernel: the Neural Tangent Kernel (NTK). More precisely, as
the parameters are trained using gradient descent, the network function (which maps
the network inputs to the network outputs) follows a so-called kernel gradient
descent w.r.t. the NTK. We prove that as the network layers get wider and wider,
the NTK converges to a deterministic limit at initialization, which stays constant
during training. This implies in particular that if the NTK is positive definite,
the network function converges to a global minimum. The NTK also describes how DNNs
generalise outside the training set: for a least-squares cost, the network function
converges in expectation to the kernel ridgeless regression with respect to the NTK,
explaining how DNNs generalise in the so-called over-parametrized regime, which is at the heart of
most recent developments in deep learning.
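Two of the properties quoted above are easy to see numerically: the empirical NTK is the Gram matrix of parameter gradients, and it concentrates around a deterministic limit as the width grows. The one-hidden-layer architecture, the \(1/\sqrt{h}\) scaling and the inputs below are illustrative assumptions, not the paper's setting.

```python
import numpy as np

rng = np.random.default_rng(2)

def ntk(xs, h=1000):
    """Empirical NTK of f(x) = (1/sqrt(h)) w2 . tanh(w1 * x) (scalar input)
    at random initialization: K = J J^T, J the Jacobian of f in all parameters."""
    w1 = rng.normal(size=h)
    w2 = rng.normal(size=h)
    rows = []
    for x in xs:
        a = np.tanh(w1 * x)
        df_dw2 = a / np.sqrt(h)                       # gradient in w2
        df_dw1 = w2 * (1 - a ** 2) * x / np.sqrt(h)   # gradient in w1
        rows.append(np.concatenate([df_dw1, df_dw2]))
    J = np.array(rows)
    return J @ J.T

xs = np.array([-1.0, -0.3, 0.5, 1.2])
K1, K2 = ntk(xs), ntk(xs)   # two independent initializations
```

By construction each kernel is symmetric positive semi-definite, and at width 1000 the two independent draws already nearly coincide, illustrating the deterministic infinite-width limit.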


10 April 2019  Lénaïc Chizat Université Paris-Sud 
Theory of Deep Learning 4: Training Neural Networks in
the Lazy and Mean Field Regimes  slides 
Spiegelgasse 5 Room 05.002 
The current successes achieved by neural networks are mostly driven by experimental
exploration of various architectures, pipelines, and hyperparameters, motivated by
intuition rather than precise theories. Focusing on the optimization/training
aspect, we will see in this talk why pushing theory forward is challenging, but
also why it matters, and the key insights it may lead to. We will review some recent
results on the phenomenon of "lazy training", on the role of overparameterization,
and on training neural networks with a single hidden layer.


15 April 2019 (Monday) 13:00  Marylou Gabrié ENS 
Theory of Deep Learning 5: Information theoretic approach to deep learning theory: a test using statistical physics methods  slides  Spiegelgasse 5 Room 05.002 
The complexity of deep neural networks remains an obstacle to the understanding of
their great efficiency. Their generalisation ability, a priori counterintuitive,
is not yet fully accounted for. Recently an information theoretic approach was
proposed to investigate this question.
Relying on the heuristic replica method from statistical physics, we present an estimator for entropies and mutual information in models of deep neural networks. Using this new tool, we test numerically the relation between generalisation and information. 



8 May 2019  Roland Bauerschmidt University of Cambridge 
The geometry of random walk isomorphisms  Spiegelgasse 5 Room 05.002 
The classical random walk isomorphism theorems relate the local time of a random
walk to the square of a Gaussian free field. I will present non-Gaussian versions
of these theorems, relating hyperbolic and hemispherical sigma models (and their
supersymmetric versions) to non-Markovian random walks interacting through their
local time. Applications include a short proof of the Sabot-Tarrès limiting formula
for the vertex-reinforced jump process (VRJP) and a Mermin-Wagner theorem for
hyperbolic sigma models and the VRJP. This is joint work with Tyler Helmuth and
Andrew Swan.


15 May 2019  Augusto Teixeira IMPA 
Random walk on a simple exclusion process  Spiegelgasse 5 Room 05.002 
In this talk we will study the asymptotic behavior of a random walk that evolves on
top of a simple symmetric exclusion process. This nice example of a random walk on
a dynamical random environment presents its own challenges due to the slow mixing
properties of the underlying medium. We will discuss a law of large numbers that
has been proved recently for this random walk. Interestingly, we can only prove
this law of large numbers for all but two exceptional densities of the exclusion
process. The main technique that we have employed is a multiscale renormalization
that has been derived from works in percolation theory.


Monday 17 June 2019 11:00  Shuta Nakajima University of Nagoya 
Gaussian fluctuations in directed polymers  Spiegelgasse 5 Room 05.001 
In this talk, we consider the discrete directed polymer model with i.i.d.
environment and we study the fluctuations of the partition function. It was proven
by Comets and Liu that for sufficiently high temperature, the fluctuations converge
in distribution towards the product of the limiting partition function and an
independent Gaussian random variable. We extend the result to the whole \(L^2\) region,
which is predicted to be the maximal high-temperature region where the Gaussian
fluctuations should occur under the considered scaling. This is joint work with
Clément Cosco.
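The normalized partition function in question can be simulated with the standard transfer recursion. The sketch below is an illustrative setting (Gaussian environment, three transverse dimensions, an inverse temperature well inside the \(L^2\) region), not the construction used in the paper.

```python
import numpy as np

rng = np.random.default_rng(3)
beta, n, d = 0.2, 30, 3        # inverse temperature, length, transverse dimension
lam = np.exp(beta ** 2 / 2)    # E[e^{beta*omega}] for omega ~ N(0, 1)

side = 2 * n + 1
z = np.zeros((side,) * d)
z[(n,) * d] = 1.0              # the walk starts at the origin
for _ in range(n):
    # Transfer step: average over the 2d nearest-neighbour moves of the walk,
    # then reweight each site by a fresh i.i.d. environment variable.
    z = sum(np.roll(z, s, axis=ax) for ax in range(d) for s in (-1, 1)) / (2 * d)
    z *= np.exp(beta * rng.normal(size=z.shape))
W = z.sum() / lam ** n         # normalized partition function, E[W] = 1
```

In the \(L^2\) region this positive martingale has a non-degenerate limit; the fluctuations of \(W\) around that limit are the object of the Gaussian central limit theorem discussed in the talk.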

Program HS 2018
Date/Time  Speaker  Title  Location 

6 September 2018 
Lisa Hartung New York University 
The Ginibre ensemble and Gaussian multiplicative
chaos
It was proven by Rider and Virág that the logarithm of the characteristic
polynomial of the Ginibre ensemble converges to a logarithmically correlated random
field. In this talk we will see how this connection can be established on the level
of powers of the characteristic polynomial by proving convergence to Gaussian
multiplicative chaos. We consider the range of powers in the \(L^2\) phase.
(Joint work in progress with Paul Bourgade and Guillaume Dubach). 
Spiegelgasse 1 Room 00.003 
19 September 2018 
Alexander
Drewitz Universität Köln 
Ubiquity of phases in some percolation models with
long-range correlations
We consider two fundamental percolation models with longrange correlations: The
Gaussian free field and (the vacant set) of Random Interlacements. Both models have
been the subject of intensive research during the last years and decades, on
\(\mathbb Z^d\) as well as on some more general graphs. We investigate some
structural percolative properties around their critical parameters, in particular
the ubiquity of the infinite components of complementary phases.
This talk is based on joint works with A. Prévost (Köln) and P.-F. Rodriguez (Bures-sur-Yvette). 
Spiegelgasse 1 Room 00.003 
31 October 2018 
Anton Klimovsky Universität Duisburg-Essen 
High-dimensional Gaussian fields with isotropic
increments seen through spin glasses
Finding the (space-height) distribution of the (local) extrema of high-dimensional
strongly correlated random fields is a notoriously hard problem with many
applications. Following Fyodorov and Sommers (2007), we focus on the Gaussian
fields with isotropic increments and take the viewpoint of statistical physics. By
exploiting various probabilistic symmetries, we rigorously derive the
Fyodorov-Sommers formula for the log-partition function in the high-dimensional
limit. The formula suggests a rich picture for the distribution of the local
extrema akin to the celebrated spherical Sherrington-Kirkpatrick model with mixed
\(p\)-spin interactions.

Spiegelgasse 1 Room 00.003 
7 November 2018 
Dominik
Schröder IST Austria 
Cusp Universality for Wigner-type Random Matrices
For Wigner-type matrices, i.e. Hermitian random matrices with independent, not
necessarily identically distributed entries above the diagonal, we show that at any
cusp singularity of the limiting eigenvalue distribution the local eigenvalue
statistics are universal and form a Pearcey process. Since the density of states
typically exhibits only square root or cubic root cusp singularities, our work
complements previous results on the bulk and edge universality and it thus
completes the resolution of the Wigner-Dyson-Mehta universality conjecture for the
last remaining universality type.

Spiegelgasse 1 Room 00.003 
14 November 2018 
Marius Schmidt Universität Basel 
Oriented first passage percolation on the
hypercube
Consider the hypercube as a graph with vertex set \(\{0,1\}^N\) and edges between two
vertices if they are only one coordinate flip apart. Choosing independent standard
exponentially distributed lengths for all edges and asking how long the shortest
directed path from \((0,\dots,0)\) to \((1,\dots,1)\) is defines oriented first passage
percolation on the hypercube. We will discuss the conceptual steps needed to answer
this question to the precision of the extremal process, following the two-paper series
"Oriented first passage percolation in the mean field limit" by Nicola Kistler,
Adrien Schertzer and Marius A. Schmidt: arXiv:1804.03117 [math.PR] and
arXiv:1808.04598 [math.PR].

Spiegelgasse 1 Room 00.003 
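The model above is small enough to simulate exactly: directed paths only ever flip 0s to 1s, so a single dynamic-programming pass over the \(2^N\) subsets of coordinates computes the shortest directed passage time. The dimension below is an arbitrary small choice for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
N = 12                         # hypercube dimension (illustrative)

# dist[S] = shortest directed passage time from (0,...,0) to the vertex whose
# set of 1-coordinates is the bitmask S; each directed edge gets a fresh
# Exp(1) length, drawn exactly once when the edge is first examined.
dist = np.full(1 << N, np.inf)
dist[0] = 0.0
for S in range(1 << N):        # S < S | (1 << i), so dist[S] is final here
    for i in range(N):
        if not S & (1 << i):
            T = S | (1 << i)
            w = rng.exponential(1.0)
            if dist[S] + w < dist[T]:
                dist[T] = dist[S] + w
T_N = dist[(1 << N) - 1]       # passage time to (1,...,1)
```

Even at \(N = 12\) the minimum over the \(N!\) directed paths is already of order 1, far below the length of any single path, which is the competition the extremal-process analysis makes precise.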
21 November 2018 
Antti Knowles University of Geneva 
Local law and eigenvector delocalization for
supercritical Erdős-Rényi graphs
We consider the adjacency matrix of the Erdős-Rényi graph \(G(N,p)\) in the
supercritical regime \(pN > C \log N\) for some universal constant \(C\). We show
that the eigenvalue density is with high probability well approximated by the
semicircle law on all spectral scales larger than the typical eigenvalue spacing.
We also show that all eigenvectors are completely delocalized with high
probability. Both results are optimal in the sense that they are known to be false
for \(pN < \log N\). A key ingredient of the proof is a new family of large
deviation estimates for multilinear forms of sparse vectors. Joint work with Yukun
He and Matteo Marcozzi.

Spiegelgasse 1 Room 00.003 
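The global counterpart of this local law (semicircle approximation of the empirical spectral density) is easy to check numerically; the sizes below are illustrative and far above the \(\log N\) threshold, and the moment comparison is a standard sanity check rather than anything from the paper.

```python
import numpy as np

rng = np.random.default_rng(5)
N, p = 2000, 0.1   # supercritical: p*N = 200 >> log N

# Adjacency matrix of G(N, p): symmetric 0/1 entries, zero diagonal
U = rng.random((N, N))
A = np.triu((U < p).astype(float), k=1)
A = A + A.T

# Center and rescale so the spectrum follows the semicircle law on [-2, 2]
H = (A - p * (np.ones((N, N)) - np.eye(N))) / np.sqrt(N * p * (1 - p))
eig = np.linalg.eigvalsh(H)

# The semicircle law on [-2, 2] has second moment 1 and fourth moment 2
m2, m4 = np.mean(eig ** 2), np.mean(eig ** 4)
```

The local law of the talk is much finer: it controls the spectral density on scales down to the eigenvalue spacing \(1/N\), not just these global moments.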
28 November 2018 
Gaultier
Lambert University of Zurich 
How much can the eigenvalues of a random matrix
fluctuate?
The goal of this talk is to explain how much the eigenvalues of large Hermitian
random matrices deviate from certain deterministic locations. These are known as
“rigidity estimates” in the literature and they play a crucial role in the proof of
universality. I will review some of the current results on eigenvalues’
fluctuations and present a new approach which relies on the theory of Gaussian
Multiplicative Chaos and leads to optimal rigidity estimates for the Gaussian
Unitary Ensemble. I will also mention how a central limit theorem can be deduced
from our proof. This is joint work with Tom Claeys, Benjamin Fahs and Christian
Webb.

Spiegelgasse 1 Room 00.003 
12 December 2018 
Ioan
Manolescu University of Fribourg 
Uniform Lipschitz functions on the triangular lattice
have logarithmic variations
Uniform integer-valued Lipschitz functions on a finite domain of the triangular
lattice are shown to have variations of logarithmic order in the radius of the
domain. The level lines of such functions form a loop \(O(2)\) model on the edges
of the hexagonal lattice with edge-weight one. An infinite-volume Gibbs measure for
the loop \(O(2)\) model is constructed as a thermodynamic limit and is shown to be
unique. It contains only finite loops and has properties indicative of
scale-invariance: macroscopic loops appearing at every scale. The existence of the
infinite-volume measure carries over to height functions pinned at 0; the
uniqueness of the Gibbs measure does not. The proof is based on a representation of
the loop \(O(2)\) model via a pair of spin configurations that are shown to satisfy
the FKG inequality. We prove RSW-type estimates for a certain connectivity notion
in the aforementioned spin model. Based on joint work with Alexander Glazman.

Spiegelgasse 1 Room 00.003 