Aspen Winter Conference

Timezone: US/Mountain
Aspen Center for Physics

700 Gillespie Ave, Aspen, CO 81611, United States
Description

Fields, Strings, and Deep Learning

Deep learning has traditionally been driven by experimental data, but in recent years it has also begun to shape our understanding of formal structures arising in theoretical high energy physics and pure mathematics, via both theoretical and applied deep learning. This conference will bring together high energy theorists, mathematicians, and computer scientists working on a broad variety of topics at the interface of these fields. Featured topics include the interface of neural network theory with quantum field theory, lattice field theory, conformal field theory, and the renormalization group; theoretical physics for AI, including equivariant, diffusion, and other generative models; ML for pure mathematics, including knot theory and special holonomy metrics; and deep learning for applications in string theory and holography.


Organizers

Miranda Cheng, Michael Douglas, Jim Halverson, Fabian Ruehle

    • 17:00
      Welcome Reception

      Reception with light refreshments & wine

    • 11:15
      Midday break
    • 1
      QFT inspired learning

      We present a fully information theoretic approach to renormalization inspired by Bayesian statistical inference, which we refer to as Bayesian renormalization. The main insight of Bayesian renormalization is that the Fisher metric defines a correlation length that plays the role of an emergent renormalization group (RG) scale quantifying the distinguishability between nearby points in the space of probability distributions. This RG scale can be interpreted as a proxy for the maximum number of unique observations that can be made about a given system during a statistical inference experiment. The role of the Bayesian renormalization scheme is subsequently to prepare an effective model for a given system up to a precision which is bounded by the aforementioned scale. In applications of Bayesian renormalization to physical systems, the emergent information theoretic scale is naturally identified with the maximum energy that can be probed by current experimental apparatus, and thus Bayesian renormalization coincides with ordinary renormalization. However, Bayesian renormalization is sufficiently general to apply even in circumstances in which an immediate physical scale is absent, and thus provides an ideal approach to renormalization in data science contexts. To this end, we provide insight into how the Bayesian renormalization scheme relates to existing methods for data compression and data generation such as the information bottleneck and the diffusion learning paradigm. We conclude by designing an explicit form of Bayesian renormalization inspired by Wilson’s momentum shell renormalization scheme in quantum field theory. We apply this Bayesian renormalization scheme to a simple neural network and verify the sense in which it organizes the parameters of the model according to a hierarchy of information theoretic importance.

      Speaker: David Berman
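
      For orientation, the Fisher metric referred to in this abstract is the standard information metric on a parametric family of distributions p(x|\theta); in the usual conventions,

          I_{ij}(\theta) = \mathbb{E}_{x \sim p(x|\theta)}\big[\partial_{\theta^i}\log p(x|\theta)\,\partial_{\theta^j}\log p(x|\theta)\big], \qquad ds^2 = I_{ij}(\theta)\,d\theta^i d\theta^j,

      so that ds^2 measures how distinguishable two nearby models are from samples; this is the notion of scale on which the Bayesian renormalization scheme outlined above is built.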
    • 2
      Diffusion Models: A Return to its Thermodynamic Origins

      The 2015 paper that introduced diffusion models took inspiration from fluctuation theorems in non-equilibrium thermodynamics. I will demonstrate how some of these ideas can be refined through mathematical tools familiar to physicists (path integrals, the calculus of variations, etc.). This will allow us to study neural networks as thermodynamic systems, and offer a potential explanation for the impressive sample quality of this class of generative models.

      Speaker: Akhil Premkumar
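
      As general background (one standard continuous-time formulation, not necessarily the one used in this talk): the forward noising process of a diffusion model can be written as a stochastic differential equation, and sampling as its reverse,

          dx = f(x,t)\,dt + g(t)\,dW_t, \qquad dx = \big[f(x,t) - g(t)^2\,\nabla_x \log p_t(x)\big]\,dt + g(t)\,d\bar{W}_t,

      where the score \nabla_x \log p_t(x) is approximated by a neural network; the fluctuation-theorem viewpoint mentioned above compares the probabilities of forward and time-reversed trajectories of such processes.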
    • 18:00
      Evening Break
    • 3
      Intelligent Explorations of the String Landscape

      String theory has far surpassed expectations in its ability to shed light on many areas of theoretical and mathematical physics. However, partly due to the immense size of the solution space, it is yet to be determined whether our universe lives somewhere in the string landscape. In this talk, I will present how methods from computer science (genetic algorithms and reinforcement learning) can shed some light on this question by exploring promising regions of the string landscape.

      Speaker: Thomas Harvey
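
      As a toy illustration of one of the search methods mentioned above, the following is a minimal genetic-algorithm loop (selection, crossover, mutation) on an artificial bit-string fitness function; it is purely schematic and not a string-landscape scan.

      import numpy as np

      # Minimal genetic-algorithm sketch: evolve bit-strings toward a hidden
      # target. The fitness function is a stand-in for illustration only.
      rng = np.random.default_rng(1)
      length, pop_size, gens, mut_rate = 32, 64, 200, 0.02
      target = rng.integers(0, 2, size=length)
      pop = rng.integers(0, 2, size=(pop_size, length))

      def fitness(p):
          # number of bits matching the target, per individual
          return (p == target).sum(axis=1)

      for gen in range(gens):
          f = fitness(pop)
          if f.max() == length:
              break
          # tournament selection: each slot gets the fitter of two random parents
          idx = rng.integers(0, pop_size, size=(pop_size, 2))
          winners = np.where(f[idx[:, 0]] >= f[idx[:, 1]], idx[:, 0], idx[:, 1])
          parents = pop[winners]
          # one-point crossover with a shifted copy of the parent pool
          mates = np.roll(parents, 1, axis=0)
          cut = rng.integers(1, length, size=(pop_size, 1))
          children = np.where(np.arange(length) < cut, parents, mates)
          # random bit-flip mutation
          flips = rng.random((pop_size, length)) < mut_rate
          pop = np.where(flips, 1 - children, children)

      print("best fitness:", fitness(pop).max(), "/", length, "after", gen + 1, "generations")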
    • 4
      Aspects of Field Theories through the lens of Neural Networks

      Neural Network (NN) output distributions at initialization give rise to free or interacting field theories, known as NN field theories. Details of NN architectures, e.g., parameters and hyperparameters, control field interaction strengths. Following an introduction to NN field theories, I will review the construction of global symmetry groups of these field actions via a dual framework of NN parameter distributions. This duality in architecture is exploited to construct free and interacting Grassmann-valued NN field theories, starting from a Grassmann central limit theorem. I will present preliminary results on Grassmann NN field theories. Such constructions of NN field theories via architectures have potential impacts in both physics and ML.

      Speaker: Anindita Maiti
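
      A minimal numerical check of the free (Gaussian) limit referred to above, under the assumption of a single hidden layer with i.i.d. Gaussian parameters and 1/sqrt(width) scaling: as the width grows, the output distribution at a fixed input becomes Gaussian by the central limit theorem.

      import numpy as np

      # Sample a wide one-hidden-layer network at initialization many times and
      # check that the output at a fixed input is close to Gaussian (the free
      # NN field theory limit). Architecture and scalings are illustrative
      # assumptions, not the specific setups discussed in the talk.
      rng = np.random.default_rng(0)
      d, width, draws = 10, 4096, 2000
      x = rng.normal(size=d)                      # a fixed input point
      outputs = np.empty(draws)
      for i in range(draws):                      # independent initializations
          W = rng.normal(size=(width, d)) / np.sqrt(d)
          v = rng.normal(size=width) / np.sqrt(width)
          outputs[i] = v @ np.tanh(W @ x)
      z = (outputs - outputs.mean()) / outputs.std()
      print(f"variance = {outputs.var():.4f}, excess kurtosis = {np.mean(z**4) - 3:.4f}")
      # excess kurtosis near 0 signals (near-)Gaussian statistics at large width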
    • 5
      Machine Learning Quantum Integrability

      We design and train a novel Siamese-network-inspired ensemble of MLPs to directly obtain solutions to the Yang-Baxter equation. Our method exhaustively reproduces the entire classification of known solutions for two-dimensional spin chains. We also comment on new solutions in higher dimensions which are part of work in progress. The talk is based on arXiv:2304.07247 and on work in progress, with Suvajit Majumder and Evgeny Sobko.

      Speaker: Shailesh Lal
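
      For reference, one standard (difference) form of the Yang-Baxter equation whose solutions are being learned is

          R_{12}(u-v)\,R_{13}(u)\,R_{23}(v) = R_{23}(v)\,R_{13}(u)\,R_{12}(u-v),

      where R(u) acts on V \otimes V and the subscripts indicate the two factors of V \otimes V \otimes V on which it acts.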
    • 10:00
      Coffee Break
    • 6
      The Theory and Practice of Diffusion Models

      In recent years, diffusion models have become a central tool in image modeling. This talk presents the mathematics of diffusion models as it relates to the more general notion of a variational auto-encoder (VAE). The general VAE formulation also covers autoregressive multi-modal models, which are currently competitive with diffusion models.

      Speaker: David McAllester
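
      For background, the VAE objective referred to above is the evidence lower bound (ELBO),

          \log p_\theta(x) \ge \mathbb{E}_{q_\phi(z|x)}\big[\log p_\theta(x|z)\big] - \mathrm{KL}\big(q_\phi(z|x)\,\|\,p(z)\big),

      and the diffusion-model training objective can be viewed as this bound for a particular hierarchical choice of latent variables z (the progressively noised versions of x).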
    • 11:15
      Midday break

      Group Lunch at the Sundeck

    • 7
      Discrete lattice gauge theory from neural-network quantum states

      First introduced by Carleo and Troyer in 2017, neural-network quantum states have achieved state-of-the-art results for approximating the ground-state wavefunctions of spin-chain and lattice systems, as well as for a variety of supervised learning problems. More recently, this approach has been extended to gauge theories using lattice gauge-equivariant convolutional neural networks (L-CNNs).

      In this talk, after reviewing these developments, I will present new results on using L-CNNs and variational quantum Monte Carlo to approximate the ground state of Z_N lattice gauge theories in 2+1 dimensions. We will see that these networks can be used to study critical points and phase transitions while retaining exact gauge invariance. In particular, we will identify the confinement/deconfinement transition and critical exponents for Z_2, and the first-order phase transition for Z_3 gauge theory.

      Speaker: Anthony Ashmore
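
      Schematically, the variational quantum Monte Carlo step mentioned above minimizes the energy of the network wavefunction \psi_\theta over gauge-field configurations s,

          E(\theta) = \frac{\langle\psi_\theta|H|\psi_\theta\rangle}{\langle\psi_\theta|\psi_\theta\rangle} = \mathbb{E}_{s \sim |\psi_\theta(s)|^2}\!\left[\frac{\langle s|H|\psi_\theta\rangle}{\langle s|\psi_\theta\rangle}\right],

      with the expectation estimated by Monte Carlo sampling and \theta updated by stochastic gradient descent; the L-CNN architecture is what keeps \psi_\theta exactly gauge invariant.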
    • 8
      Metric Flows

      Recently, neural networks have been used to approximate Ricci-flat metrics of Calabi-Yau and G2 manifolds. In these setups, the neural network is the metric (in some choice of local coordinates). One typically starts from a reference metric that is not Ricci-flat and trains the neural network so that it approximates the Ricci-flat metric at the end of training. In this way, training induces a complicated flow in metric space. We discuss this flow, how it compares to other metric flows such as Ricci flow, and implementations of this flow in the so-called neural tangent kernel (NTK) regime, where the neural network becomes infinitely wide. We also propose ways of dealing with the notorious memory costs of NTK methods.

      Speaker: Fabian Ruehle
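
      For comparison, the two flows contrasted above can be written schematically as

          \partial_t g_{ij} = -2 R_{ij} \quad (\text{Ricci flow}), \qquad \partial_t g_{ij} = -\eta\,\frac{\delta \mathcal{L}[g]}{\delta g_{ij}} \quad (\text{training-induced flow}),

      where \mathcal{L}[g] is a curvature- or Monge-Ampere-type loss; the second expression is only heuristic, since the actual flow depends on how the network parametrizes g and, in the infinite-width limit, on the neural tangent kernel.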
    • 9
      Tensor network to learn the wave function of data

      How many different ways are there to handwrite the digit 3? To quantify this question, imagine extending the MNIST dataset of handwritten digits by sampling additional images until they start repeating. We call the collection of all resulting images of the digit 3 the "full set." To study the properties of the full set, we introduce a tensor network architecture which simultaneously accomplishes both classification (discrimination) and sampling tasks. Qualitatively, our trained network represents the indicator function of the full set. It can therefore be used to characterize the data itself. We illustrate this by studying the full sets associated with the digits of MNIST. Using a quantum-mechanical interpretation of our network, we characterize the full set by calculating its entanglement entropy. We also study its geometric properties such as mean Hamming distance, effective dimension, and size. The latter answers the question above: the total number of black-and-white threes written MNIST-style is 2^72.

      Speaker: Anatoly Dymarsky
    • 19:00
      Social at Aspen Tap
    • 10
      Computing Physical Yukawa Couplings from String Theory

      We present a numerical calculation, based on machine-learning techniques, of the physical Yukawa couplings in a heterotic string theory model, obtained by compactifying on a smooth Calabi-Yau three-fold. The model in question is one of a large class of heterotic line bundle models with precisely the MSSM low-energy spectrum plus fields uncharged under the standard-model group. The relevant quantities, that is, the Ricci-flat Calabi-Yau metric, the Hermitian Yang-Mills bundle metrics and the harmonic bundle-valued forms, are all computed by training suitable neural networks. The calculation is carried out at several points along a one-parameter family in complex structure moduli space, and each complete calculation takes about a day on a laptop. The methods presented here can be generalised to other string models and constructions, including F-theory models.

      Speaker: Kit Fraser-Taliente
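
      Very schematically (standard heterotic lore rather than a statement about this specific model), the holomorphic Yukawa couplings are overlap integrals of the form

          \lambda_{ijk} \propto \int_X \Omega \wedge \psi_i \wedge \psi_j \wedge \psi_k,

      with \Omega the holomorphic three-form and \psi_i harmonic bundle-valued (0,1)-forms; the physical couplings additionally require canonically normalized matter fields, which is where the neural-network approximations of the Ricci-flat metric, the bundle metrics, and the harmonic forms enter.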
    • 11
      Computer-aided Conjecture Generation in Maths and Physics

      Proposing good conjectures is at least as valuable as proving theorems. Good conjectures capture our attention and orient our efforts, acting like guide posts on the 'mazy paths to hidden truths' (Hilbert). This talk will touch on three aspects of computer-aided conjecture generation: matching numerical values, symbolic regression and generative models. I will present a Zipf-type law for Physics equations, discuss how genetic algorithms can be used to identify new identities of Rogers-Ramanujan type and present a number of conjectural generating functions for holomorphic line bundle cohomology on certain complex projective varieties.

      Speaker: Andrei Constantin
    • 10:00
      Coffee Break
    • 12
      Spectral Theory of Generalization in Regression

      A theoretical understanding of generalization remains an open problem for many machine learning models, including deep networks where overparameterization leads to better performance, contradicting the conventional wisdom from classical statistics. Here, we investigate generalization error for kernel regression, which, besides being a popular machine learning method, also describes certain infinitely overparameterized neural networks. We use techniques from statistical mechanics to derive an analytical expression for generalization error applicable to any kernel and data distribution. We present applications of our theory to real and synthetic datasets, and for many kernels including those that arise from training deep networks in the infinite-width limit. We elucidate an inductive bias of kernel regression to explain data with simple functions, characterize whether a kernel is compatible with a learning task, and show that more data may impair generalization when noisy or not expressible by the kernel, leading to non-monotonic learning curves with possibly many peaks.

      Speaker: Abdulkadir Canatar
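
      For reference, the kernel (ridge) regression predictor analyzed in this setting is

          f(x) = k(x, X)\,(K + \lambda I)^{-1} y, \qquad K_{\mu\nu} = K(x_\mu, x_\nu),

      and the spectral analysis uses the Mercer decomposition K(x,x') = \sum_\rho \eta_\rho\,\phi_\rho(x)\,\phi_\rho(x') with respect to the data distribution, expressing the generalization error in terms of the kernel eigenvalues \eta_\rho and the projections of the target function onto the eigenfunctions \phi_\rho.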
    • 13
      Mapping the Phase Space of Supersymmetric Gauge Theories using Explainable Machine Learning
      Speaker: Rak-Kyeong Seong
    • 12:00
      Midday break
    • Public Lecture
      • 14
        Artificial Intelligence: Ideas From Nature
        Speaker: David Berman
    • 15
      Full solution to the neural scaling law model

      I will present the full solution to the random feature model recently proposed by Maloney, Roberts and Sully as a simplified model that exhibits neural scaling laws, extending their results beyond the ridgeless limit. The calculation is based on a new large-N diagrammatic method for all-order resummation. The result enables full predictions of the scaling-law exponents and also provides an analytical understanding of how to optimize the ridge parameter.

      Speaker: Zhengkang (Kevin) Zhang
    • 16
      Reinforcement-Learning based numerical approaches for the conformal bootstrap

      I will give an update on approximately solving the crossing equations for a general CFT using stochastic optimisation methods. The main tools will include Reinforcement Learning algorithms. After describing the general setup I will then present concrete applications of this approach in the context of the 6D (2,0) theory and the 1D defect CFT.

      Speaker: Costis Papageorgakis
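
      For background, for a four-point function of identical scalars \phi of dimension \Delta_\phi the crossing equations being solved approximately take the standard form

          \sum_{\mathcal{O}} \lambda_{\mathcal{O}}^2 \left[ v^{\Delta_\phi}\, g_{\Delta,\ell}(u,v) - u^{\Delta_\phi}\, g_{\Delta,\ell}(v,u) \right] = 0,

      where the sum runs over operators in the \phi \times \phi OPE (including the identity) and g_{\Delta,\ell} are conformal blocks; the stochastic search varies trial spectra and OPE coefficients to minimize the violation of these equations.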
    • 10:00
      Coffee Break
    • 17
      Machine Learning negatively-curved manifolds and de Sitter vacua

      Constructing detailed cosmological models in UV-complete theories of gravity is of great importance to connect theory and observation.
      In this talk, I will describe the mathematical problems associated with a proposed class of de Sitter compactifications of M-theory on negatively-curved manifolds. These manifolds are generic and enjoy beautiful geometric properties, such as the absence of moduli and the presence of tunable short closed geodesics that support quantum effects. Despite their richness, smooth negatively-curved manifolds can be constructed explicitly starting from hyperbolic polytopes. I will describe the associated mathematical problems and current work in progress on using Machine Learning techniques for finding the detailed metrics and internal field configurations for de Sitter compactifications on these spaces.

      Speaker: Giuseppe Bruno De Luca
    • 11:15
      Midday break and Sundeck Lunch

      Sundeck Lunch

    • 18
      Statistical exploration of the landscape with consequences for SUSY at LHC

      Rather general considerations lead one to expect a power-law draw toward large soft SUSY-breaking terms on the landscape, which is then tempered by the anthropic requirement that the derived value of the weak scale lie within a factor of a few of its measured value. In a toy model of the landscape, this leads to natural SUSY being favored over unnatural SUSY models (like split SUSY, minisplit, high-scale SUSY, etc.). Probability distributions of sparticle and Higgs masses show a maximal preference for m_h ~ 125 GeV, with sparticles (except for light higgsinos) somewhat or well beyond present LHC reach.

      Speaker: Howard Baer
    • 19
      Moduli Stabilization and the Statistics of SUSY Breaking in the Landscape

      The statistics of the supersymmetry breaking scale in the string landscape has been extensively studied in the past, finding either a power-law behaviour induced by uniform distributions of F-terms or a logarithmic distribution motivated by dynamical supersymmetry breaking. These studies focused mainly on type IIB flux compactifications but did not systematically incorporate the Kähler moduli. We point out that the inclusion of the Kähler moduli is crucial to understand the distribution of the supersymmetry breaking scale in the landscape, since in general one obtains unstable vacua when the F-terms of the dilaton and the complex structure moduli are larger than the F-terms of the Kähler moduli. After taking Kähler moduli stabilisation into account, we find that the distribution of the gravitino mass and the soft terms is power-law only in KKLT and perturbatively stabilised vacua, which therefore favour high-scale supersymmetry. On the other hand, LVS vacua feature a logarithmic distribution of soft terms and thus a preference for lower scales of supersymmetry breaking. Whether the landscape of type IIB flux vacua predicts a logarithmic or power-law distribution of the supersymmetry breaking scale thus depends on the relative preponderance of LVS and KKLT vacua. The talk is based on arXiv:2007.04327.

      Speaker: Kuver Sinha
    • 20
      Three Dimensional IR N-ality with Eight Supercharges

      I will introduce a new family of IR dualities for quiver gauge theories in three space-time dimensions with eight supercharges. In contrast to the well-known example of 3d mirror symmetry, the aforementioned dualities map Coulomb branches to Coulomb branches and Higgs branches to Higgs branches in the deep IR. A novel feature of these dualities is that the Coulomb branch global symmetry is emergent in the IR on one side of the duality while being manifest in the UV on the other. For a large class of 3d quiver gauge theories, a sequence of these dualities can be explicitly constructed by step-wise implementing a set of four quiver mutations with non-trivial closure relations. A given duality sequence yields a set of quiver gauge theories which flow to the same IR SCFT -- a phenomenon I will refer to as IR N-ality. This set of N-al theories always contains a subset of quivers from which the rank of the IR Coulomb branch symmetry can be read off. I will illustrate the construction of the duality sequence and the resultant IR N-ality with a concrete example. Next, I will briefly discuss how IR N-ality proves to be an extremely useful tool for studying the Higgs branches of four/five dimensional SCFTs, focusing on the subclass of theories which are geometrically engineered by compactifying Type IIB/M-theory on three-fold canonical isolated hypersurface singularities. In the concluding section of my talk, I will discuss possible ways in which machine learning/deep learning methods can be deployed to explore such dualities in the landscape of 3d quivers with eight supercharges.

      Speaker: Anindya Dey
    • 21
      Twists and Turns through the Landscape of Little String Theories

      Little String Theories (LSTs) are a class of 6D supersymmetric theories that sit right between SCFTs and SUGRA theories: while LSTs admit global symmetries, they also admit a characteristic scale. This scale can be mixed with another circle radius, relating different T-dual 6D theories in 5D. LSTs and their network of T-duals, however, remained fairly unexplored until recently. In this talk I will review progress in charting the landscape of 6D LSTs and their network of (twisted) T-duals by exploiting recent progress in generalized symmetries and geometric engineering in M/F-theory. This talk is based on published and ongoing works with Hamza Ahmed, Florent Baume, Andreas Braun, Michele Del Zotto, Muyang Liu, Fabian Ruehle, and Benjamin Song.

      Speaker: Paul-Konstantin Oehlmann
    • 22
      Flavor Orientifolds and N=1 SCFTs

      In this work, we introduce a new class of N = 1 SCFTs with flavor symmetries, engineered in string theory through D3-branes probing various non-compact orientifolds of toric Calabi-Yau singularities, using brane-tiling techniques. The addition of D7 flavor branes is crucial to cancel tadpole contributions and leads to flavor symmetries. Distinct theories are identified and differentiated based on the choices of discrete torsion and Wilson lines. For instance, in the C^3/Z_3 case with trivial torsion, we find five distinct theories, where the flavor symmetries correspond to conjugacy classes of SO(8). In the conifold case, non-trivial discrete torsion coupled with a trivial Wilson line choice leads to two S-dual theories, each exhibiting an SO(8) global symmetry. However, the combination of non-trivial Wilson lines with non-trivial discrete torsion presents a more complex situation, leading to interesting results that necessitate further analysis. Despite their simplicity, these setups provide a starting point for exploring N=1 SCFTs and S-duality within non-compact orientifolds of toric Calabi-Yau singularities, while also facilitating the analysis of the impact of discrete torsion and Wilson lines on the physical properties of these theories.

      Speaker: Ajit Kumar Sorout
    • Topic-Driven Discussion Sessions

      Discussions on different topics in the Smart Lobby and the Alcoves. Pizza and beverages will be served

    • 23
      Phi^4 Theory as a Neural Network Field Theory

      In this talk I’ll review recent progress in a neural network approach to field theory. Topics include the systematics of interactions and relation to the central limit theorem, a method of obtaining actions from correlators, and engineering actions at infinite width via local operator insertions. This technique is used to obtain phi^4 theory as a neural network field theory.

      Speaker: James Halverson
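
      For reference, the target theory in the title is the standard Euclidean scalar theory with action

          S[\phi] = \int d^d x \left[ \tfrac{1}{2}(\partial\phi)^2 + \tfrac{1}{2} m^2 \phi^2 + \tfrac{\lambda}{4!}\phi^4 \right],

      which the talk engineers as the output distribution of a suitably chosen neural-network architecture at initialization.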
    • 24
      Andrews-Curtis conjecture and reinforcement learning
      Speaker: Sergei Gukov
    • 10:00
      Coffee Break
    • 25
      Statistical learning of deterministic functions
      Speaker: Michael Douglas
    • 26
      Block Award and Closing