Online Machine Learning Seminar
This seminar is held online via Zoom. To be added to the mailing list, please email Alex Kasprzyk or Lorenzo De Biase. Everybody is welcome.
Follow us on researchseminars.org to be kept up-to-date about upcoming talks. Recordings of talks are available on our YouTube channel.
Note: All times are UK times.
Upcoming Talks
-
- 22 March 2023, 10-11am
- Patrizio Frosini (Bologna)
- Some recent results on the theory of GENEOs and its application to Machine Learning
- Group equivariant non-expansive operators (GENEOs) were introduced a few years ago as mathematical tools for approximating data observers when data are represented by real-valued or vector-valued functions. The use of these operators is based on the assumption that the interpretation of data depends on the geometric properties of the observers. In this talk we will illustrate some recent results in the theory of GENEOs, showing how these operators open up a new approach to topological data analysis and geometric deep learning. The defining conditions are sketched below.
- Link to Event
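- As a reminder (a minimal sketch, not part of the abstract; the perception-pair notation is assumed): given spaces Φ and Ψ of bounded functions acted on by groups G and H, and a homomorphism T: G → H, a GENEO is an operator F: Φ → Ψ satisfying

```latex
% Equivariance with respect to T:
\[ F(\varphi \circ g) = F(\varphi) \circ T(g) \quad \text{for all } \varphi \in \Phi,\ g \in G. \]
% Non-expansiveness in the sup norm:
\[ \| F(\varphi_1) - F(\varphi_2) \|_\infty \le \| \varphi_1 - \varphi_2 \|_\infty \quad \text{for all } \varphi_1, \varphi_2 \in \Phi. \]
```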
-
- 5 April 2023, 10-11am
- Yang-Hui He (LIMS)
- Universes as Bigdata: Physics, Geometry and Machine-Learning
- The search for the Theory of Everything has led to superstring theory, which then led physics, first to algebraic/differential geometry/topology, and then to computational geometry, and now to data science. With a concrete playground of the geometric landscape, accumulated by the collaboration of physicists, mathematicians and computer scientists over the last 4 decades, we show how the latest techniques in machine-learning can help explore problems of interest to theoretical physics and to pure mathematics. At the core of our programme is the question: how can AI help us with mathematics?
- Link to Event
-
- 19 April 2023, 4-5pm
- Christoph Hertrich (LSE)
- Understanding Neural Network Expressivity via Polyhedral Geometry
- Neural networks with rectified linear unit (ReLU) activations are one of the standard models in modern machine learning. Despite their practical importance, fundamental theoretical questions concerning ReLU networks remain open to this day. For instance, what is the precise set of (piecewise linear) functions exactly representable by ReLU networks with a given depth? Even the special case asking how many layers are needed to compute a function as simple as max{0, x1, x2, x3, x4} has not been solved yet. In this talk we will explore the relevant background to understand this question and report on recent progress using tropical and polyhedral geometry as well as a computer-aided approach based on mixed-integer programming. This is based on joint works with Amitabh Basu, Marco Di Summa, and Martin Skutella (NeurIPS 2021), as well as Christian Haase and Georg Loho (ICLR 2023). A small warm-up construction is sketched below.
- Link to Event
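- As a warm-up (an illustrative sketch, not taken from the talk): the smaller function max{0, x1, x2} is exactly representable by a ReLU network with two hidden layers, using the identities max(x1, x2) = relu(x1 - x2) + x2 and x2 = relu(x2) - relu(-x2).

```python
import numpy as np

def relu(v):
    return np.maximum(v, 0.0)

def max_0_x1_x2(x1, x2):
    """Two-hidden-layer ReLU network computing max{0, x1, x2} exactly.

    Hidden layer 1: relu(x1 - x2), relu(x2), relu(-x2); the combination
    h1 + h2 - h3 equals max(x1, x2). Hidden layer 2: one more ReLU turns
    max(x1, x2) into max{0, x1, x2}.
    """
    h1 = relu(x1 - x2)
    h2 = relu(x2)
    h3 = relu(-x2)
    return relu(h1 + h2 - h3)

# Quick check against the direct computation.
for x1, x2 in [(-1.0, -2.0), (3.0, 1.5), (-0.5, 2.0)]:
    assert np.isclose(max_0_x1_x2(x1, x2), max(0.0, x1, x2))
```

The special case in the abstract asks the analogous question for max{0, x1, x2, x3, x4}: whether two hidden layers still suffice or a third is needed.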
Past Talks
-
- 15 March 2023
- Julia Lindberg (UT Austin)
- Estimating Gaussian mixtures using sparse polynomial moment systems
- The method of moments is a statistical technique for density estimation that solves a system of moment equations to estimate the parameters of an unknown distribution. A fundamental question, critical to understanding identifiability, asks how many moment equations are needed to obtain finitely many solutions, and how many solutions there are. We answer this question for classes of Gaussian mixture models using the tools of polyhedral geometry. Using these results, we present a homotopy method to perform parameter recovery, and therefore density estimation, for high dimensional Gaussian mixture models. The number of paths tracked in our method scales linearly in the dimension. The shape of these moment equations is sketched below.
- Video (YouTube), Video (Vimeo), Slides
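- For concreteness (an illustrative sketch with assumed notation, not taken from the talk), the moment system for a univariate mixture of k Gaussians with weights λi, means μi and variances σi² equates the sample moments with polynomial expressions in the parameters:

```latex
\[
m_1=\sum_{i=1}^{k}\lambda_i\,\mu_i,\qquad
m_2=\sum_{i=1}^{k}\lambda_i\,(\mu_i^{2}+\sigma_i^{2}),\qquad
m_3=\sum_{i=1}^{k}\lambda_i\,(\mu_i^{3}+3\mu_i\sigma_i^{2}),\qquad\dots
\]
\[
\text{together with } \sum_{i=1}^{k}\lambda_i = 1,
\qquad\text{where } m_j=\tfrac{1}{N}\sum_{n=1}^{N}x_n^{\,j}.
\]
```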
-
- 8 March 2023
- Nick Vannieuwenhoven (KU Leuven)
- Group-invariant tensor train networks for supervised learning
- Invariance under selected transformations has recently proven to be a powerful inductive bias in several machine learning models. One class of such models are tensor train networks. In this talk, we impose invariance relations on tensor train networks. We introduce a new numerical algorithm to construct a basis of tensors that are invariant under the action of normal matrix representations of an arbitrary discrete group. This method can be up to several orders of magnitude faster than previous approaches. The group-invariant tensors are then combined into a group-invariant tensor train network, which can be used as a supervised machine learning model. We apply this model to a protein binding classification problem, taking into account problem-specific invariances, and obtain prediction accuracy in line with state-of-the-art invariant deep learning approaches. This is joint work with Brent Sprangers. The underlying tensor-train format is sketched below.
- Video (YouTube), Video (Vimeo), Slides
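- Background sketch (assumed conventions, not code from the talk): a tensor train represents a large tensor entrywise as a product of small core slices, which is what makes imposing structure on the cores attractive.

```python
import numpy as np

def tt_entry(cores, idx):
    """Evaluate one entry of a tensor stored in tensor-train format.

    cores[k] has shape (r_k, n_k, r_{k+1}) with boundary ranks r_0 = r_d = 1;
    the entry T[i_1, ..., i_d] is the product of the selected core slices.
    """
    v = np.ones(1)
    for core, i in zip(cores, idx):
        v = v @ core[:, i, :]
    return v.item()

# Example: three random cores with TT-ranks (1, 2, 3, 1) and mode sizes (4, 4, 4).
rng = np.random.default_rng(0)
cores = [rng.normal(size=s) for s in [(1, 4, 2), (2, 4, 3), (3, 4, 1)]]
print(tt_entry(cores, (0, 2, 1)))
```

In the talk's setting, the cores would additionally be constrained so that the represented tensor is invariant under the chosen group action.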
-
- 22 February 2023
- Guido Montúfar (UCLA)
- Geometry and convergence of natural policy gradient methods
- We study the convergence of several natural policy gradient (NPG) methods in infinite-horizon discounted Markov decision processes with regular policy parametrizations. For a variety of NPGs and reward functions we show that the trajectories in state-action space are solutions of gradient flows with respect to Hessian geometries, based on which we obtain global convergence guarantees and convergence rates. In particular, we show linear convergence for unregularized and regularized NPG flows with the metrics proposed by Kakade and by Morimura and co-authors, by observing that these arise from the Hessian geometries of conditional entropy and entropy respectively. Further, we obtain sublinear convergence rates for Hessian geometries arising from other convex functions like log-barriers. Finally, we interpret the discrete-time NPG methods with regularized rewards as inexact Newton methods if the NPG is defined with respect to the Hessian geometry of the regularizer. This yields local quadratic convergence rates of these methods for step size equal to the penalization strength. This is joint work with Johannes Müller. The shared form of these updates is sketched below.
- Video (YouTube), Video (Vimeo), Slides
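- For orientation (a sketch with assumed notation, not taken from the abstract), the methods above share the natural-gradient form, with the choice of Gram matrix G encoding the geometry:

```latex
\[
\dot{\theta}_t = G(\theta_t)^{+}\,\nabla_\theta R(\theta_t)
\quad\text{(flow)},
\qquad
\theta_{t+1} = \theta_t + \eta\, G(\theta_t)^{+}\,\nabla_\theta R(\theta_t)
\quad\text{(discrete time)},
\]
```

where R is the (possibly regularized) reward and G(θ)⁺ is a pseudoinverse; taking G to be, for example, the Fisher information matrix recovers Kakade's metric, and different Hessian geometries correspond to different choices of G.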
-
- 15 February 2023
- Kathlén Kohn (KTH)
- The Geometry of Linear Convolutional Networks
- We discuss linear convolutional neural networks (LCNs) and their critical points. We observe that the function space (i.e., the set of functions represented by LCNs) can be identified with polynomials that admit certain factorizations, and we use this perspective to describe the impact of the network's architecture on the geometry of the function space. For instance, for LCNs with one-dimensional convolutions having stride one and arbitrary filter sizes, we provide a full description of the boundary of the function space. We further study the optimization of an objective function over such LCNs: We characterize the relations between critical points in function space and in parameter space and show that there do exist spurious critical points. We compute an upper bound on the number of critical points in function space using Euclidean distance degrees and describe dynamical invariants for gradient descent. This talk is based on joint work with Thomas Merkh, Guido Montúfar, and Matthew Trager. The factorization viewpoint is sketched below.
- Video (YouTube), Video (Vimeo), Slides
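- The correspondence with polynomial factorizations can be made concrete as follows (an illustrative sketch under the stride-one, one-dimensional assumptions above): identify a filter w = (w0, ..., wk) with the polynomial pw(z) = Σ wi z^i; composing two convolutional layers with filters w and v yields a single convolution with filter w * v, and

```latex
\[
p_{w \ast v}(z) \;=\; p_w(z)\, p_v(z),
\]
```

so the end-to-end functions of an LCN correspond to polynomials that factor into pieces of degrees fixed by the architecture.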
-
- 8 February 2023
- Manolis Tsakiris (Chinese Academy of Sciences)
- Unlabelled Principal Component Analysis
- This talk will consider the problem of recovering a matrix of bounded rank from a corrupted version of it, where the corruption consists of an unknown permutation of the matrix entries. Exploiting the theory of Groebner bases for determinantal ideals, recovery theorems will be given. For a special instance of the problem, an algorithmic pipeline will be demonstrated, which employs methods for robust principal component analysis with respect to outliers and methods for linear regression without correspondences. The recovery problem is stated precisely below.
- Video (YouTube), Video (Vimeo), Slides
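- In symbols (a sketch with assumed notation, not taken from the abstract): given X ∈ R^{m×n} with rank(X) ≤ r and an unknown permutation π of its mn entries, unlabelled PCA asks to recover X from the observation

```latex
\[
Y = \pi(X), \qquad \operatorname{rank}(X) \le r .
\]
```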
-
- 12 September 2022
- Anindita Maiti (Northeastern University)
- Non-perturbative Non-Lagrangian Neural Network Field Theories
- Ensembles of Neural Network (NN) output functions describe field theories. These Neural Network Field Theories become free, i.e. Gaussian, in the limit of infinite width and independent parameter distributions, due to the Central Limit Theorem (CLT). Interaction terms, i.e. non-Gaussianities, in these field theories arise from violations of the CLT at finite width and/or correlated parameter distributions. In general, non-Gaussianities render Neural Network Field Theories non-perturbative and non-Lagrangian. In this talk, I will describe methods to study non-perturbative non-Lagrangian field theories in Neural Networks, via a dual framework over parameter distributions. This duality lets us study correlation functions and symmetries of NN field theories in the absence of an action; further, the partition function can be approximated as a series sum over connected correlation functions. Thus, Neural Networks allow us to study non-perturbative non-Lagrangian field theories through their architectures, and can be beneficial to both Machine Learning and physics. A numerical illustration of the Gaussian limit is sketched below.
- Video (YouTube), Video (Vimeo), Slides
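- The free (Gaussian) limit can be seen numerically (a minimal sketch, not code from the talk; the 1/sqrt(width) readout scaling is an assumption of the sketch): the output of a single-hidden-layer network with iid parameters becomes increasingly Gaussian as the width grows.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_outputs(width, n_nets=20000, x=np.array([0.3, -0.7])):
    """Sample f(x) over an ensemble of single-hidden-layer nets with iid
    Gaussian parameters and 1/sqrt(width)-scaled readout weights."""
    W = rng.normal(size=(n_nets, width, x.size))
    b = rng.normal(size=(n_nets, width))
    v = rng.normal(size=(n_nets, width)) / np.sqrt(width)
    h = np.tanh(W @ x + b)
    return np.sum(v * h, axis=1)

for width in (2, 10, 1000):
    out = sample_outputs(width)
    # Excess kurtosis tending to zero signals the Gaussian (free) limit;
    # its nonzero value at finite width reflects the interaction terms.
    kurtosis = np.mean((out - out.mean()) ** 4) / out.var() ** 2 - 3.0
    print(width, round(float(kurtosis), 3))
```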
-
- 1 July 2022
- Alexei Vernitski (Essex)
- Using machine learning to solve mathematical problems and to search for examples and counterexamples in pure maths research
- Our recent research can be generally described as applying state-of-the-art technologies of machine learning to suitable mathematical problems. As to machine learning, we use both reinforcement learning and supervised learning (underpinned by deep learning). As to mathematical problems, we mostly concentrate on knot theory, for two reasons: firstly, we have positive experience of applying another kind of artificial intelligence (automated reasoning) to knot theory; secondly, examples and counter-examples in knot theory are finite and, typically, not very large, so they are convenient for the computer to work with. Here are some successful examples of our recent work, which I plan to talk about.
- 1) Some recent studies used machine learning to untangle knots using Reidemeister moves, but did not describe in detail how untangling was implemented on the computer. We invested effort into implementing untangling in one clearly defined scenario, were successful, and made our computer code publicly available.
- 2) We found counterexamples showing that some recent publications claiming to give new descriptions of realisable Gauss diagrams contain an error. We trained several machine learning agents to recognise realisable Gauss diagrams and noticed that they fail to recognise correctly the same counterexamples which human mathematicians failed to spot.
- 3) One problem related to (and "almost" equivalent to) recognising the trivial knot is colouring the knot diagram by elements of algebraic structures called quandles (I will define them). We considered, for some types of knot diagrams (including petal diagrams), how supervised learning copes with this problem. A small example of such colourings is sketched after this listing.
- Video (YouTube), Video (Vimeo), Slides
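- A small example of quandle colouring (an illustrative sketch, not code from the talk; it uses Fox 3-colourings, the colourings by the dihedral quandle of order 3): counting the colourings of the standard trefoil diagram already distinguishes it from the unknot.

```python
from itertools import product

# Crossing relations of the standard trefoil diagram with arcs 0, 1, 2.
# At each crossing the colours must satisfy  2*(over) = under + under  (mod n).
CROSSINGS = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]  # (over, under, under)

def count_colourings(n=3):
    count = 0
    for colours in product(range(n), repeat=3):
        ok = all((2 * colours[o] - colours[u] - colours[v]) % n == 0
                 for o, u, v in CROSSINGS)
        count += ok
    return count

# The trefoil admits 9 Fox 3-colourings (3 constant ones plus 6 non-trivial),
# whereas any diagram of the unknot admits only the 3 constant colourings.
print(count_colourings(3))  # 9
```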
-
- 29 September 2021
- Tom Oliver (Nottingham)
- Supervised learning of arithmetic invariants
- We explore the utility of standard supervised learning algorithms for a range of classification problems in number theory. In particular, we will consider class numbers of real quadratic fields, ranks of elliptic curves over Q, and endomorphism types for genus 2 curves over Q. Each case is motivated by its appearance in an open conjecture. Throughout, the basic strategy is the same: we vectorize the underlying objects via the coefficients of their L-functions. A toy version of this vectorisation is sketched below.
- Video (YouTube), Video (Vimeo)
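- A toy version of the vectorisation step (an illustrative sketch, not the pipeline from the talk; the brute-force point count and the choice of primes are assumptions, and primes of bad reduction are not treated specially):

```python
import numpy as np
from sympy import primerange

def ap_vector(a, b, primes):
    """Feature vector for the elliptic curve y^2 = x^3 + a*x + b, built from
    the L-function coefficients a_p = p + 1 - #E(F_p) (brute-force count)."""
    feats = []
    for p in primes:
        points = 1  # the point at infinity
        for x in range(p):
            rhs = (x ** 3 + a * x + b) % p
            points += sum(1 for y in range(p) if (y * y - rhs) % p == 0)
        feats.append(p + 1 - points)
    return np.array(feats)

primes = list(primerange(3, 100))
v = ap_vector(-1, 0, primes)  # the curve y^2 = x^3 - x
print(v[:5])
```

Vectors of this kind would then be fed to standard classifiers, with the class number, rank or endomorphism type as the label.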