This session is part of ICMS 2020, to be held virtually (see the update below) in Braunschweig, Germany, 13-16 July 2020.
Big Data is becoming increasingly important in fundamental mathematics research, with the production and manipulation of large datasets playing an essential role. The subject is in the throes of a data revolution, with new theoretical results being driven by millions of computer experiments which produce terabytes of data stored in huge problem-specific databases. Although the information contained in these databases is of enormous importance, thorough understanding of the datasets is often hindered by the poor performance of off-the-shelf database technologies in answering the types of broad unstructured queries of interest to mathematicians. The aim of this session is to bring together experts working with and creating large datasets in Pure Mathematics. We will explore the questions asked by these researchers, and discuss the database systems they have developed and their integration with existing Computer Algebra Systems.
Session organised by Gavin Brown, Tom Coates, and Alexander Kasprzyk.
Update on COVID-19
Due to the COVID-19 pandemic, ICMS 2020 will now be a virtual conference, with session talks held online. Further information about the virtual format of ICMS 2020 can be found here.
Talks and Abstracts

 Bettina Eick (TU Braunschweig)
 The Small Groups Library
 The Small Groups Library is a library of groups of certain "small" orders. The groups are sorted by their orders and they are listed up to isomorphism; that is, for each of the available orders a complete and irredundant list of isomorphism type representatives of groups is given. For example, the library contains all groups of order at most 2000 except 1024, all groups of cubefree order at most 50000 and all groups of order dividing p^{6} for all primes p. The talk gives a survey on the construction and the facilities of this library. The Small Groups Library is a joint project with Hans Ulrich Besche and Eamonn O'Brien as well as other contributors.
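
 The three families of orders quoted in the abstract can be written down as a small membership test. The following is a minimal Python sketch, not part of the library or of GAP: the function names are hypothetical, and it encodes only the three examples given above, so it says nothing about any other orders the library may contain.

```python
def is_cube_free(n: int) -> bool:
    """Return True if no cube p**3 with p >= 2 divides n."""
    p = 2
    while p * p * p <= n:
        if n % (p * p * p) == 0:
            return False
        p += 1
    return True


def prime_power(n: int):
    """Return (p, k) if n == p**k for a prime p and k >= 1, else None."""
    if n < 2:
        return None
    p = 2
    while p * p <= n:
        if n % p == 0:
            # p is the smallest prime factor of n; strip it out completely.
            k = 0
            while n % p == 0:
                n //= p
                k += 1
            return (p, k) if n == 1 else None
        p += 1
    return (n, 1)  # no factor up to sqrt(n), so n itself is prime


def order_in_quoted_families(n: int) -> bool:
    """Check n against the three families of orders named in the abstract.

    Hypothetical helper: it only encodes the examples from the abstract.
    """
    if n < 1:
        raise ValueError("a group order must be a positive integer")
    if n <= 2000 and n != 1024:
        return True  # order at most 2000, except 1024
    if n <= 50000 and is_cube_free(n):
        return True  # cubefree order at most 50000
    pk = prime_power(n)
    return pk is not None and pk[1] <= 6  # order dividing p^6


if __name__ == "__main__":
    for order in (1024, 2000, 30030, 7**6, 7**7):
        print(order, order_in_quoted_families(order))
```

 Trial division is used throughout because the orders involved are small; it is not meant as an efficient factorisation method.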

 Yang-Hui He (City and Oxford)
 Universes as Big Data: Superstrings, Calabi-Yau Manifolds and Machine Learning
 We review how, historically, the problem of string phenomenology led theoretical physics first to algebraic/differential geometry, then to computational geometry, and now to data science and AI. With the concrete playground of the Calabi-Yau landscape, accumulated by the collaboration of physicists, mathematicians and computer scientists over the last four decades, we show how the latest techniques in machine learning can help explore problems of physical and mathematical interest.
 Video, Slides

 Fredrik Johansson (INRIA Bordeaux)
 Fungrim: a symbolic library for mathematical functions
 We present the Mathematical Functions Grimoire (Fungrim), a website and database of formulas, theorems and tables for mathematical functions. A central goal of this project is to represent all data in semantic and fully computer-readable form. For instance, computer algebra systems may use formulas in Fungrim as rewrite rules to simplify expressions or prove theorems involving special functions or real numbers. As part of this effort, we have developed a new mathematical description language (the Grim formula language) and a symbolic computation library, Pygrim, used as the backend and main development tool for Fungrim. We discuss the design, implementation and testing of these projects, prospects of scaling the database, and potential applications.
 Video, Slides

 Andreas Paffenholz (TU Darmstadt)
 polyDB: A database for geometric objects based on MongoDB
 polyDB is a database for geometric objects with a growing set of collections. The project originated as an extension to the software polymake. It shares some conceptual ideas with polymake, and it uses the same JSON format for its data, but can be used completely independently, either by accessing it directly via a MongoDB client or driver, via a REST interface, or our web interface at polydb.org. The database is based on MongoDB and uses JSON as data format for its collections. It uses JSON schemas for the structural description of documents in a collection. The schema controls available entries in documents, their format, and their mathematical type. Meta information is stored for each collection, and it allows access control at collection level.
 In my talk I will explain the background of polyDB, introduce the structure of the data, and discuss the data model used. A NoSQL database like MongoDB supports our purpose quite well, but also has shortcomings when used for data from mathematics, which I will briefly address. I will show options for accessing the data, for querying it from a mathematical perspective, and for using it in computer algebra systems. For this, I will use the existing interface to polymake as my example.

 Giuseppe Pitton (Imperial College London)
 Database tools for the large scale computation of maximally mutable Laurent polynomials
 Fano polytopes are a family of integral lattice polytopes whose only interior point is the origin of the lattice. If the dual P^{*} of a Fano polytope P is itself an integral lattice polytope, we call P a reflexive Fano polytope. Reflexive Fano polytopes have important applications in toric geometry. Moreover, recent results in Mirror Symmetry showed that it is possible to find deformation-equivalent families of Fano varieties by computing some Laurent polynomials, called maximally mutable Laurent polynomials, which are naturally associated to reflexive Fano polytopes. Kreuzer and Skarke produced a list of four-dimensional reflexive Fano polytopes which consists of 473 800 776 entries. In this talk, I will illustrate database/HPC tools developed to compute maximally mutable Laurent polynomials for some interesting families of reflexive Fano polytopes in dimension 4. The massive scale of the Kreuzer and Skarke database requires a fast, robust interface with the HPC infrastructure used for carrying out the polynomial computations. This is joint work with Alexander Kasprzyk and Tom Coates.
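
 The reflexivity condition in the abstract (the dual P^{*} is again an integral lattice polytope) can be made concrete in dimension 2. The following Python sketch is an illustration only and is unrelated to the tools presented in the talk. It assumes the convention P^{*} = {u : <u, v> >= -1 for all v in P}, assumes the vertices are listed in cyclic order around the origin, and uses hypothetical function names. It checks only integrality of the dual, not the interior-point condition.

```python
from fractions import Fraction


def dual_vertices(vertices):
    """Vertices of the dual of a polygon containing the origin in its interior.

    Each edge [a, b] gives the dual vertex u solving <u, a> = <u, b> = -1.
    Assumes `vertices` are integer pairs listed in cyclic order.
    """
    n = len(vertices)
    duals = []
    for i in range(n):
        ax, ay = vertices[i]
        bx, by = vertices[(i + 1) % n]
        det = ax * by - ay * bx
        if det == 0:
            raise ValueError("edge lies on a line through the origin")
        # Cramer's rule for the 2x2 system with right-hand side (-1, -1).
        duals.append((Fraction(ay - by, det), Fraction(bx - ax, det)))
    return duals


def has_integral_dual(vertices) -> bool:
    """Return True if every vertex of the dual polygon is a lattice point."""
    return all(
        ux.denominator == 1 and uy.denominator == 1
        for ux, uy in dual_vertices(vertices)
    )


if __name__ == "__main__":
    square = [(1, 1), (-1, 1), (-1, -1), (1, -1)]
    triangle = [(1, 0), (0, 1), (-1, -1)]
    other_triangle = [(1, 0), (0, 1), (-1, -3)]
    for polygon in (square, triangle, other_triangle):
        print(polygon, has_integral_dual(polygon))
```

 Exact rational arithmetic via Fraction avoids floating-point rounding when deciding whether a dual vertex is a lattice point. The same idea in dimension 4 requires solving one linear system per facet, which is where the scale of the Kreuzer and Skarke list becomes the dominant cost.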
 Slides

 Andrew Sutherland (MIT)
 The L-functions and Modular Forms Database
 The L-functions and Modular Forms Database (LMFDB) is a large database of mathematical objects that arise in number theory, including tables of number fields, elliptic curves, modular forms, and related objects. Its primary goal is to make explicit the connections between these objects predicted by the Langlands program, which are mediated by L-functions. The LMFDB is the product of an ongoing international collaboration with more than 100 participants that has become an important resource for researchers and students in number theory and many adjacent fields. In this talk I will give an overview of the LMFDB and discuss some of the mathematical, computational, and infrastructural challenges that arose during its development and the ways in which we were able to overcome them.
 Video, Slides
Schedule for the Live Discussions
All times are in CEST.

Time           Monday, July 13      Tuesday, July 14
16:30-16:40    Yang-Hui He          Bettina Eick
16:40-16:45    Break                Break
16:45-16:55    Andrew Sutherland    Giuseppe Pitton
16:55-17:00    Break                Break
17:00-17:10    Fredrik Johansson    Andreas Paffenholz