I am a mathematician and my main research interest is probability theory.
- D.Belius, M. Schmidt - Complexity of local maxima of given radial derivative for mixed p-spin Hamiltonians, submitted.
[ Abstract
arxiv ]
We study the number of local maxima with given radial derivative of spherical
mixed p-spin models and prove that the second moment matches the square of the first moment
on exponential scale for arbitrary mixtures and any radial derivative. This is surprising, since
for the number of local maxima with given radial derivative and given energy the corresponding
result is only true for specific mixtures [Sub17; BSZ20].
We use standard Kac-Rice computations to derive formulas for the first and second moment
at exponential scale, and then find a remarkable analytic argument that shows that the second
moment formula is bounded by twice the first moment formula in this general setting.
This also leads to a new proof of a central inequality used to prove concentration of the
number critical points of pure p-spin models of given energy in [Sub17] and removes the need
for the computer assisted argument used in that paper for 3 ≤ p ≤ 10.
- D.Belius, M. Schmidt - Phase diagram for the TAP energy of the p-spin spherical mean field spin glass model, preprint.
[ Abstract
arxiv ]
We solve the Thouless-Anderson-Palmer (TAP) variational principle associated to the spherical pure p-spin mean field spin glass Hamiltonian and present a detailed phase diagram.
In the high temperature phase the maximum of variational principle is the annealed free energy of the model. In the low temperature phase the maximum, for which we give a formula, is strictly smaller.
The high temperature phase consists of three subphases. (1) In the first phase m=0 is the unique relevant TAP maximizer. (2) In the second phase there are exponentially many TAP maximizers, but m=0 remains dominant. (3) In the third phase, after the so called dynamic phase transition, m=0 is no longer a relevant TAP maximizer, and exponentially many non-zero relevant TAP solutions add up to give the annealed free energy.
Finally in the low temperature phase a subexponential number of TAP maximizers of near-maximal TAP energy dominate.
- D. Belius - High temperature TAP upper bound for the free energy of mean field spin glasses, submitted.
[ Abstract
arxiv ]
This work proves an upper bound for the free energy of the Sherrington-Kirkpatrick model and its generalizations
in terms of the Thouless-Anderson-Palmer (TAP) energy. The result applies to models with spherical or Ising spins and
any mixed p-spin Hamiltonian with external field or with a non-linear spike term. The bound is expected to be tight to
leading order at high temperature, and is non-trivial in the presence of an external field. For the proof a geometric
microcanonical method is employed, in which one covers the spin space with sets, each of which is centered at a magnetization
vector m and whose contribution to the partition function is bounded in terms of the TAP energy at m.
- D. Banerjee, D. Belius - Fluctuations of the free energy of the mixed p-spin mean field spin glass model, preprint.
[ Abstract
arxiv ]
We prove the convergence in distribution of the fluctuations of the free energy of the mixed p-spin Sherrington-Kirkpatrick model with non-vanishing 2-spin component at high enough temperature. The limit is Gaussian, and the fluctuations are seen to arise from weighted cycle counts in the complete graph on the spin indices weighted by the 2-spin interaction matrix.
- D. Belius, J. Černý, S. Nakajima, M. Schmidt - Triviality of the geometry of mixed p-spin spherical Hamiltonians with external field,
Journal of Statistical Physics 186.1 (2022): 1-3.
[ Abstract
arxiv Journal link ]
We study isotropic Gaussian random fields on the high-dimensional sphere with an added deterministic linear term, also known as mixed p-spin Hamiltonians with external field. We prove that if the external field is sufficiently strong, then the resulting function has trivial geometry, that is only two critical points. This contrasts with the situation of no or weak external field where these functions typically have an exponential number of critical points.
We give an explicit threshold hc for the magnitude of the external field necessary for trivialization and conjecture hc to be sharp.
The Kac-Rice formula is our main tool.
Our work extends [Fyo15], which identified the trivial regime for the special case of pure p-spin Hamiltonians with random external field.
- T. Liu, A. Chaman, D. Belius, I. Dokmanic - Interpreting U-Nets via Task-Driven Multiscale Dictionary Learning,
IEEE Transactions on Computational Imaging 8 (2022): 425–37. https://doi.org/10.1109/TCI.2022.3175309
[ Abstract
arxiv ]
U-Nets have been tremendously successful in many imaging inverse problems. In an effort to understand the source of this success, we show that one can reduce a U-Net to a tractable, well-understood sparsity-driven dictionary model while retaining its strong empirical performance. We achieve this by extracting a certain multiscale convolutional dictionary from the standard U-Net. This dictionary imitates the structure of the U-Net in its convolution, scale-separation, and skip connection aspects, while doing away with the nonlinear parts. We show that this model can be trained in a task-driven dictionary learning framework and yield comparable results to standard U-Nets on a number of relevant tasks, including CT and MRI reconstruction. These results suggest that the success of the U-Net may be explained mainly by its multiscale architecture and the induced sparse representation.
- M. Samarin, V. Roth, D. Belius, Feature Learning and Random Features in Standard Finite-Width Convolutional Neural Networks: An Empirical Study, Conference on Uncertainty in Artificial Intelligence (UAI) 2022.
[ Abstract
pdf ]
The Neural Tangent Kernel is an important milestone in the ongoing effort to build a theory for deep
learning. Its prediction that sufficiently wide neural
networks behave as kernel methods, or equivalently
as random feature models arising from linearized
networks, has been confirmed empirically for certain wide architectures. In this paper, we compare
the performance of two common finite-width convolutional neural networks, LeNet and AlexNet, to
their linearizations on common benchmark datasets
like MNIST and modified versions of it, CIFAR-10
and an ImageNet subset. We demonstrate empirically that finite-width neural networks, generally,
greatly outperform the finite-width linearization of
these architectures. When increasing the problem
difficulty of the classification task, we observe a
larger gap which is in line with common intuition
that finite-width neural networks perform feature
learning which finite-width linearizations cannot.
At the same time, finite-width linearizations improve dramatically with width, approaching the
behavior of the wider standard networks which
in turn perform slightly better than their standard
width counterparts. Therefore, it appears that feature learning for non-wide standard networks is important but becomes less significant with increasing
width. We furthermore identify cases where both
standard and linearized networks match in performance, in agreement with NTK theory, and a case
where a wide linearization outperforms its standard
width counterpart.
- The TAP-Plefka variational principle for the spherical SK model (with N. Kistler; Communications in Mathematical Physics volume 367, pages991–1017, 2019)
[ Abstract
arxiv
Journal pdf ]
We reinterpret the Thouless-Anderson-Palmer approach to mean field spin glass models as a variational principle in the spirit of the Gibbs variational principle and the Bragg-Williams approximation. We prove this TAP-Plefka variational principle rigorously in the case of the spherical Sherrington-Kirkpatrick model.
- Tightness for the Cover Time of compact two dimensional manifolds (with J. Rosen and O. Zeitouni; Probability Theory and Related Fields volume 176, pages1357–1437, 2020)
[ Abstract
arxiv ]
Let Cε,S2 denote the cover time of the two dimensional sphere S2 by a Wiener sausage of radius ε. We prove that
√Cε,S2−2√2(logε-1-(1/4)loglog ε-1) is tight.
- Barrier estimates for a critical Galton-Watson process and the cover time of the binary tree
(with J. Rosen and O. Zeitouni; Annales de l'Institut Henri Poincaré, Probabilités et Statistiques, Volume 55, Number 1 (2019), 127-154.)
[ Abstract
arxiv ]
For the critical Galton--Watson process with geometric offspring distributions we provide sharp barrier estimates for barriers which are (small) perturbations of linear barriers. These are useful in analyzing the cover time of finite graphs in the critical regime by random walk, and the Brownian cover times of compact two dimensional manifolds. As an application of the barrier estimates, we prove that if CL denotes the cover time of the binary tree of depth L by simple walk, then √CL/(2L+1) - (√2log2) L + log L/(√2 log2) is tight. The latter improves results of Aldous (1991), Bramson and Zeitouni (2009) and Ding and Zeitouni (2012). In a subsequent article we use these barrier estimates to prove tightness of the Brownian cover time for the two-dimensional sphere.
- Maximum of the Riemann zeta function on a short interval of the critical line
(with L-P. Arguin, P. Bourgade, M. Radziwill and K. Soundararajan; Communications in Pure and Applied Mathematics 72: 500-535 (2017) https://doi.org/10.1002/cpa.21791.)
[ Abstract
arxiv ]
We prove the leading order of a conjecture by Fyodorov, Hiary and Keating, about the maximum of the Riemann zeta function on random intervals along the critical line. More precisely, if t is uniformly distributed in [T,2T], then max |t - u| ≤ log |ζ(1/2 + iu)| =(1 + o(1))loglogT
with probability converging to 1 as T → ∞.
- Maximum of the Ginzburg-Landau fields
(with W. Wu; accepted for publication in Annals of Probability)
[ Abstract
arxiv ]
We study two dimensional massless field in a box with potential V(∇φ(·)) and zero boundary condition, where V is any symmetric and uniformly convex function. Naddaf-Spencer and Miller proved the macroscopic averages of this field converge to a continuum Gaussian free field. In this paper we prove the distribution of local marginal φ(x), for any x in the bulk, has a Gaussian tail. We further characterize the leading order of the maximum and dimension of high points of this field, thus generalize the results of Bolthausen-Deuschel-Giacomin and Daviaud for the discrete Gaussian free field.
- Maximum of the characteristic polynomial of random unitary matrices
(with L-P. Arguin and P. Bourgade; Communications in Mathematical Physics, Volume 349 (2017), Issue 2, pp 703-751.)
[ Abstract
Journal pdf
arxiv ]
It was recently conjectured by Fyodorov, Hiary and Keating that the maximum of the characteristic polynomial on the unit circle of a N×N random unitary matrix sampled from the Haar measure grows like CN/(logN)3/4 for some random variable C. In this paper, we verify the leading order of this conjecture, that is, we prove that with high probability the maximum lies in the range [N1−ε,N1+ε], for arbitrarily small ε. The method is based on identifying an approximate branching random walk in the Fourier decomposition of the characteristic polynomial, and uses techniques developed to describe the extremes of branching random walks and of other log-correlated random fields. A key technical input is the asymptotic analysis of Toeplitz determinants with dimension-dependent symbols. The original argument for these asymptotics followed the general idea that the statistical mechanics of 1/f-noise random energy models is governed by a freezing transition. We also prove the conjectured freezing of the free energy for random unitary matrices.
- Maxima of a randomized Riemann Zeta function, and branching random walks
(with L-P. Arguin and A. Harper; Annals of Applied Probability, Volume 27, Number 1 (2017), pp 178-215.)
[ Abstract
pdf ]
A recent conjecture of Fyodorov--Hiary--Keating states that the maximum of the absolute value of the Riemann zeta function on a typical bounded interval of the critical line is exp(loglog T − (3/4) logloglogT + O(1)), for an interval at (large) height T. In this paper, we verify the first two terms in the exponential for a model of the zeta function, which is essentially a randomized Euler product. The critical element of the proof is the identification of an approximate tree structure, present also in the actual zeta function, which allows us to relate the maximum to that of a branching random walk.
- The subleading order of two dimensional cover times (with N. Kistler; Probability Theory and Related Fields, Volume 167 (2017), Issue 1, pp 461-552.)
[ Abstract
Journal pdf
arxiv
]
The epsilon-cover time of the two dimensional torus by Brownian motion is the time it takes for the process to come within distance epsilon>0 from any point. Its leading order in the small epsilon-regime has been established by Dembo, Peres, Rosen and Zeitouni [Ann. of Math., 160 (2004)]. In this work, the second order correction is identified. The approach relies on a multi-scale refinement of the second moment method, and draws on ideas from the study of the extremes of branching Brownian motion.
- Gumbel fluctuations for cover times in the discrete torus (Probability Theory and Related Fields, Volume 157 (2013), Issue 3-4, pp 635-689.)
[ Abstract
Journal pdf
arxiv ]
This work proves that the fluctuations of the cover time of simple random walk in
the discrete torus of dimension at least three with large side-length are governed by the Gumbel
extreme value distribution. This result was conjectured for example in the book by Aldous & Fill. We also derive some
corollaries which qualitatively describe how covering happens. In addition, we develop a new
and stronger coupling of the model of random interlacements, introduced by Sznitman,
and random walk in the torus. This coupling is used to prove the cover time result and is also
of independent interest.
- Cover times in the discrete cylinder. (Preprint)
[ Abstract
arxiv ]
This article proves that, in terms of local times, the properly rescaled and re-centered
cover times of finite subsets of the discrete cylinder by simple random walk
converge in law to the Gumbel distribution, as the cardinality of the set goes to
infinity. As applications we obtain several other results related to covering in the
discrete cylinder. Our method is new and involves random interlacements, which
were introduced in [22]. To enable the proof we develop a new stronger coupling
of simple random walk in the cylinder and random interlacements, which is also of
independent interest.
- Cover levels and random interlacements (Annals of Applied Probability, Volume 22, Number 2 (2012), pp 522-540.)
[ Abstract
pdf ]
This note investigates cover levels of finite sets in the random interlacements model, that is the
least level such that the set is completely contained in the random interlacement at that level. It
proves that as the cardinality of a set goes to infinity, the rescaled and recentered cover level
tends in distribution to the Gumbel distribution with cumulative distribution function exp(-exp(-z))..