Online Machine Learning Seminar
This seminar is held online via Zoom. To be added to the mailing list, please email Alex Kasprzyk or Lorenzo De Biase. Everybody is welcome.
Follow us on researchseminars.org to be kept up-to-date about upcoming talks. Recordings of talks are available on our YouTube channel.
Note: All times are UK times.
Upcoming Talks
-
- 22 March 2023, 10-11am
- Patrizio Frosini (Bologna)
- Some recent results on the theory of GENEOs and its application to Machine Learning
- Group equivariant non-expansive operators (GENEOs) were introduced a few years ago as mathematical tools for approximating data observers when data are represented by real-valued or vector-valued functions. The use of these operators is based on the assumption that the interpretation of data depends on the geometric properties of the observers. In this talk we will illustrate some recent results in the theory of GENEOs, showing how these operators open up a new approach to topological data analysis and geometric deep learning. The defining conditions are sketched below.
- Link to Event
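- As a reminder (a minimal sketch, not part of the abstract; the perception-pair notation is assumed): given spaces Φ and Ψ of bounded functions acted on by groups G and H, and a homomorphism T: G → H, a GENEO is an operator F: Φ → Ψ satisfying

```latex
% Equivariance with respect to T:
\[ F(\varphi \circ g) = F(\varphi) \circ T(g) \quad \text{for all } \varphi \in \Phi,\ g \in G. \]
% Non-expansiveness in the sup norm:
\[ \| F(\varphi_1) - F(\varphi_2) \|_\infty \le \| \varphi_1 - \varphi_2 \|_\infty \quad \text{for all } \varphi_1, \varphi_2 \in \Phi. \]
```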
-
- 5 April 2023, 10-11am
- Yang-Hui He (LIMS)
- Universes as Bigdata: Physics, Geometry and Machine-Learning
- The search for the Theory of Everything has led to superstring theory, which then led physics, first to algebraic/differential geometry/topology, and then to computational geometry, and now to data science. With a concrete playground of the geometric landscape, accumulated by the collaboration of physicists, mathematicians and computer scientists over the last 4 decades, we show how the latest techniques in machine-learning can help explore problems of interest to theoretical physics and to pure mathematics. At the core of our programme is the question: how can AI help us with mathematics?
- Link to Event
-
- 19 April 2023, 4-5pm
- Christoph Hertrich (LSE)
- Understanding Neural Network Expressivity via Polyhedral Geometry
- Neural networks with rectified linear unit (ReLU) activations are one of the standard models in modern machine learning. Despite their practical importance, fundamental theoretical questions concerning ReLU networks remain open to this day. For instance, what is the precise set of (piecewise linear) functions exactly representable by ReLU networks with a given depth? Even the special case asking how many layers are needed to compute a function as simple as max{0, x1, x2, x3, x4} has not been solved yet. In this talk we will explore the relevant background to understand this question and report on recent progress using tropical and polyhedral geometry as well as a computer-aided approach based on mixed-integer programming. This is based on joint works with Amitabh Basu, Marco Di Summa, and Martin Skutella (NeurIPS 2021), as well as Christian Haase and Georg Loho (ICLR 2023). A small warm-up construction is sketched below.
- Link to Event
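- As a warm-up (an illustrative sketch, not taken from the talk): the smaller function max{0, x1, x2} is exactly representable by a ReLU network with two hidden layers, using the identities max(x1, x2) = relu(x1 - x2) + x2 and x2 = relu(x2) - relu(-x2).

```python
import numpy as np

def relu(v):
    return np.maximum(v, 0.0)

def max_0_x1_x2(x1, x2):
    """Two-hidden-layer ReLU network computing max{0, x1, x2} exactly.

    Hidden layer 1: relu(x1 - x2), relu(x2), relu(-x2); the combination
    h1 + h2 - h3 equals max(x1, x2). Hidden layer 2: one more ReLU turns
    max(x1, x2) into max{0, x1, x2}.
    """
    h1 = relu(x1 - x2)
    h2 = relu(x2)
    h3 = relu(-x2)
    return relu(h1 + h2 - h3)

# Quick check against the direct computation.
for x1, x2 in [(-1.0, -2.0), (3.0, 1.5), (-0.5, 2.0)]:
    assert np.isclose(max_0_x1_x2(x1, x2), max(0.0, x1, x2))
```

The special case in the abstract asks the analogous question for max{0, x1, x2, x3, x4}: whether two hidden layers still suffice or a third is needed.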
Past Talks
-
- 15 March 2023
- Julia Lindberg (UT Austin)
- Estimating Gaussian mixtures using sparse polynomial moment systems
- The method of moments is a statistical technique for density estimation that solves a system of moment equations to estimate the parameters of an unknown distribution. A fundamental question, critical to understanding identifiability, asks how many moment equations are needed to obtain finitely many solutions, and how many solutions there are. We answer this question for classes of Gaussian mixture models using the tools of polyhedral geometry. Using these results, we present a homotopy method to perform parameter recovery, and therefore density estimation, for high dimensional Gaussian mixture models. The number of paths tracked in our method scales linearly in the dimension. The shape of these moment equations is sketched below.
- Video (YouTube), Video (Vimeo), Slides
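- For concreteness (an illustrative sketch with assumed notation, not taken from the talk), the moment system for a univariate mixture of k Gaussians with weights λi, means μi and variances σi² equates the sample moments with polynomial expressions in the parameters:

```latex
\[
m_1=\sum_{i=1}^{k}\lambda_i\,\mu_i,\qquad
m_2=\sum_{i=1}^{k}\lambda_i\,(\mu_i^{2}+\sigma_i^{2}),\qquad
m_3=\sum_{i=1}^{k}\lambda_i\,(\mu_i^{3}+3\mu_i\sigma_i^{2}),\qquad\dots
\]
\[
\text{together with } \sum_{i=1}^{k}\lambda_i = 1,
\qquad\text{where } m_j=\tfrac{1}{N}\sum_{n=1}^{N}x_n^{\,j}.
\]
```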
-
- 8 March 2023
- Nick Vannieuwenhoven (KU Leuven)
- Group-invariant tensor train networks for supervised learning
- Invariance under selected transformations has recently proven to be a powerful inductive bias in several machine learning models. One class of such models are tensor train networks. In this talk, we impose invariance relations on tensor train networks. We introduce a new numerical algorithm to construct a basis of tensors that are invariant under the action of normal matrix representations of an arbitrary discrete group. This method can be up to several orders of magnitude faster than previous approaches. The group-invariant tensors are then combined into a group-invariant tensor train network, which can be used as a supervised machine learning model. We apply this model to a protein binding classification problem, taking into account problem-specific invariances, and obtain prediction accuracy in line with state-of-the-art invariant deep learning approaches. This is joint work with Brent Sprangers. The underlying tensor-train format is sketched below.
- Video (YouTube), Video (Vimeo), Slides
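- Background sketch (assumed conventions, not code from the talk): a tensor train represents a large tensor entrywise as a product of small core slices, which is what makes imposing structure on the cores attractive.

```python
import numpy as np

def tt_entry(cores, idx):
    """Evaluate one entry of a tensor stored in tensor-train format.

    cores[k] has shape (r_k, n_k, r_{k+1}) with boundary ranks r_0 = r_d = 1;
    the entry T[i_1, ..., i_d] is the product of the selected core slices.
    """
    v = np.ones(1)
    for core, i in zip(cores, idx):
        v = v @ core[:, i, :]
    return v.item()

# Example: three random cores with TT-ranks (1, 2, 3, 1) and mode sizes (4, 4, 4).
rng = np.random.default_rng(0)
cores = [rng.normal(size=s) for s in [(1, 4, 2), (2, 4, 3), (3, 4, 1)]]
print(tt_entry(cores, (0, 2, 1)))
```

In the talk's setting, the cores would additionally be constrained so that the represented tensor is invariant under the chosen group action.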
-
- 22 February 2023
- Guido Montúfar (UCLA)
- Geometry and convergence of natural policy gradient methods
- We study the convergence of several natural policy gradient (NPG) methods in infinite-horizon discounted Markov decision processes with regular policy parametrizations. For a variety of NPGs and reward functions we show that the trajectories in state-action space are solutions of gradient flows with respect to Hessian geometries, based on which we obtain global convergence guarantees and convergence rates. In particular, we show linear convergence for unregularized and regularized NPG flows with the metrics proposed by Kakade and by Morimura and co-authors, by observing that these arise from the Hessian geometries of conditional entropy and entropy respectively. Further, we obtain sublinear convergence rates for Hessian geometries arising from other convex functions like log-barriers. Finally, we interpret the discrete-time NPG methods with regularized rewards as inexact Newton methods if the NPG is defined with respect to the Hessian geometry of the regularizer. This yields local quadratic convergence rates of these methods for step size equal to the penalization strength. This is joint work with Johannes Müller. The shared form of these updates is sketched below.
- Video (YouTube), Video (Vimeo), Slides
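- For orientation (a sketch with assumed notation, not taken from the abstract), the methods above share the natural-gradient form, with the choice of Gram matrix G encoding the geometry:

```latex
\[
\dot{\theta}_t = G(\theta_t)^{+}\,\nabla_\theta R(\theta_t)
\quad\text{(flow)},
\qquad
\theta_{t+1} = \theta_t + \eta\, G(\theta_t)^{+}\,\nabla_\theta R(\theta_t)
\quad\text{(discrete time)},
\]
```

where R is the (possibly regularized) reward and G(θ)⁺ is a pseudoinverse; taking G to be, for example, the Fisher information matrix recovers Kakade's metric, and different Hessian geometries correspond to different choices of G.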
-
- 15 February 2023
- Kathlén Kohn (KTH)
- The Geometry of Linear Convolutional Networks
- We discuss linear convolutional neural networks (LCNs) and their critical points. We observe that the function space (i.e., the set of functions represented by LCNs) can be identified with polynomials that admit certain factorizations, and we use this perspective to describe the impact of the network's architecture on the geometry of the function space. For instance, for LCNs with one-dimensional convolutions having stride one and arbitrary filter sizes, we provide a full description of the boundary of the function space. We further study the optimization of an objective function over such LCNs: We characterize the relations between critical points in function space and in parameter space and show that there do exist spurious critical points. We compute an upper bound on the number of critical points in function space using Euclidean distance degrees and describe dynamical invariants for gradient descent. This talk is based on joint work with Thomas Merkh, Guido Montúfar, and Matthew Trager. The factorization viewpoint is sketched below.
- Video (YouTube), Video (Vimeo), Slides
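- The correspondence with polynomial factorizations can be made concrete as follows (an illustrative sketch under the stride-one, one-dimensional assumptions above): identify a filter w = (w0, ..., wk) with the polynomial pw(z) = Σ wi z^i; composing two convolutional layers with filters w and v yields a single convolution with filter w * v, and

```latex
\[
p_{w \ast v}(z) \;=\; p_w(z)\, p_v(z),
\]
```

so the end-to-end functions of an LCN correspond to polynomials that factor into pieces of degrees fixed by the architecture.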
-
- 8 February 2023
- Manolis Tsakiris (Chinese Academy of Sciences)
- Unlabelled Principal Component Analysis
- This talk will consider the problem of recovering a matrix of bounded rank from a corrupted version of it, where the corruption consists of an unknown permutation of the matrix entries. Exploiting the theory of Groebner bases for determinantal ideals, recovery theorems will be given. For a special instance of the problem, an algorithmic pipeline will be demonstrated, which employs methods for robust principal component analysis with respect to outliers and methods for linear regression without correspondences. The recovery problem is stated precisely below.
- Video (YouTube), Video (Vimeo), Slides
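- In symbols (a sketch with assumed notation, not taken from the abstract): given X ∈ R^{m×n} with rank(X) ≤ r and an unknown permutation π of its mn entries, unlabelled PCA asks to recover X from the observation

```latex
\[
Y = \pi(X), \qquad \operatorname{rank}(X) \le r .
\]
```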
-
- 12 September 2022
- Anindita Maiti (Northeastern University)
- Non-perturbative Non-Lagrangian Neural Network Field Theories
- Ensembles of Neural Network (NN) output functions describe field theories. These Neural Network Field Theories become free, i.e. Gaussian, in the limit of infinite width and independent parameter distributions, due to the Central Limit Theorem (CLT). Interaction terms, i.e. non-Gaussianities, in these field theories arise from violations of the CLT at finite width and/or correlated parameter distributions. In general, non-Gaussianities render Neural Network Field Theories non-perturbative and non-Lagrangian. In this talk, I will describe methods to study non-perturbative non-Lagrangian field theories in Neural Networks, via a dual framework over parameter distributions. This duality lets us study correlation functions and symmetries of NN field theories in the absence of an action; further, the partition function can be approximated as a series sum over connected correlation functions. Thus, Neural Networks allow us to study non-perturbative non-Lagrangian field theories through their architectures, and can be beneficial to both Machine Learning and physics. A numerical illustration of the Gaussian limit is sketched below.
- Video (YouTube), Video (Vimeo), Slides
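- The free (Gaussian) limit can be seen numerically (a minimal sketch, not code from the talk; the 1/sqrt(width) readout scaling is an assumption of the sketch): the output of a single-hidden-layer network with iid parameters becomes increasingly Gaussian as the width grows.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_outputs(width, n_nets=20000, x=np.array([0.3, -0.7])):
    """Sample f(x) over an ensemble of single-hidden-layer nets with iid
    Gaussian parameters and 1/sqrt(width)-scaled readout weights."""
    W = rng.normal(size=(n_nets, width, x.size))
    b = rng.normal(size=(n_nets, width))
    v = rng.normal(size=(n_nets, width)) / np.sqrt(width)
    h = np.tanh(W @ x + b)
    return np.sum(v * h, axis=1)

for width in (2, 10, 1000):
    out = sample_outputs(width)
    # Excess kurtosis tending to zero signals the Gaussian (free) limit;
    # its nonzero value at finite width reflects the interaction terms.
    kurtosis = np.mean((out - out.mean()) ** 4) / out.var() ** 2 - 3.0
    print(width, round(float(kurtosis), 3))
```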
-
- 1 July 2022
- Alexei Vernitski (Essex)
- Using machine learning to solve mathematical problems and to search for examples and counterexamples in pure maths research
- Our recent research can be generally described as applying state-of-the-art technologies of machine learning to suitable mathematical problems. As to machine learning, we use both reinforcement learning and supervised learning (underpinned by deep learning). As to mathematical problems, we mostly concentrate on knot theory, for two reasons: firstly, we have positive experience of applying another kind of artificial intelligence (automated reasoning) to knot theory; secondly, examples and counter-examples in knot theory are finite and, typically, not very large, so they are convenient for the computer to work with. Here are some successful examples of our recent work, which I plan to talk about.
- 1) Some recent studies used machine learning to untangle knots using Reidemeister moves, but did not describe in detail how untangling was implemented on the computer. We invested effort into implementing untangling in one clearly defined scenario, were successful, and made our computer code publicly available.
- 2) We found counterexamples showing that some recent publications claiming to give new descriptions of realisable Gauss diagrams contain an error. We trained several machine learning agents to recognise realisable Gauss diagrams and noticed that they fail to recognise correctly the same counterexamples which human mathematicians failed to spot.
- 3) One problem related to (and "almost" equivalent to) recognising the trivial knot is colouring the knot diagram by elements of algebraic structures called quandles (I will define them). We considered, for some types of knot diagrams (including petal diagrams), how supervised learning copes with this problem. A small example of such colourings is sketched after this listing.
- Video (YouTube), Video (Vimeo), Slides
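- A small example of quandle colouring (an illustrative sketch, not code from the talk; it uses Fox 3-colourings, the colourings by the dihedral quandle of order 3): counting the colourings of the standard trefoil diagram already distinguishes it from the unknot.

```python
from itertools import product

# Crossing relations of the standard trefoil diagram with arcs 0, 1, 2.
# At each crossing the colours must satisfy  2*(over) = under + under  (mod n).
CROSSINGS = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]  # (over, under, under)

def count_colourings(n=3):
    count = 0
    for colours in product(range(n), repeat=3):
        ok = all((2 * colours[o] - colours[u] - colours[v]) % n == 0
                 for o, u, v in CROSSINGS)
        count += ok
    return count

# The trefoil admits 9 Fox 3-colourings (3 constant ones plus 6 non-trivial),
# whereas any diagram of the unknot admits only the 3 constant colourings.
print(count_colourings(3))  # 9
```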
-
- 29 September 2021
- Tom Oliver (Nottingham)
- Supervised learning of arithmetic invariants
- We explore the utility of standard supervised learning algorithms for a range of classification problems in number theory. In particular, we will consider class numbers of real quadratic fields, ranks of elliptic curves over Q, and endomorphism types for genus 2 curves over Q. Each case is motivated by its appearance in an open conjecture. Throughout, the basic strategy is the same: we vectorize the underlying objects via the coefficients of their L-functions. A toy version of this vectorisation is sketched below.
- Video (YouTube), Video (Vimeo)
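- A toy version of the vectorisation step (an illustrative sketch, not the pipeline from the talk; the brute-force point count and the choice of primes are assumptions, and primes of bad reduction are not treated specially):

```python
import numpy as np
from sympy import primerange

def ap_vector(a, b, primes):
    """Feature vector for the elliptic curve y^2 = x^3 + a*x + b, built from
    the L-function coefficients a_p = p + 1 - #E(F_p) (brute-force count)."""
    feats = []
    for p in primes:
        points = 1  # the point at infinity
        for x in range(p):
            rhs = (x ** 3 + a * x + b) % p
            points += sum(1 for y in range(p) if (y * y - rhs) % p == 0)
        feats.append(p + 1 - points)
    return np.array(feats)

primes = list(primerange(3, 100))
v = ap_vector(-1, 0, primes)  # the curve y^2 = x^3 - x
print(v[:5])
```

Vectors of this kind would then be fed to standard classifiers, with the class number, rank or endomorphism type as the label.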