Description
The abundance of dark matter halos is a key cosmological probe in forthcoming surveys. Placing tight constraints requires modelling the halo mass function to at least percent-level accuracy over a wide cosmological parameter space. However, our theoretical understanding of what is required for such accurate modelling is incomplete, limiting the generalisability of existing halo mass function models. We present a novel approach to gaining this understanding using deep learning. Unlike existing approaches, it requires minimal assumptions about the relevant physical quantities or their parametrisations. Instead, the deep-learning model compresses all the relevant quantities into a latent representation, which we interpret using mutual information. We find our model requires only three latent variables to reproduce the halo mass functions from the state-of-the-art Aemulus emulator at $z=0$ to within 0.25% residuals over $M = 10^{13.2}$–$10^{15}\,h^{-1} M_\odot$ in a $w$CDM$+N_\mathrm{eff}$ parameter space. Interpreting the latent representation, we find that, in addition to information expected from the extended Press-Schechter formalism, it also captures non-universality (information beyond the mass variance) that is required for accurately modelling the cosmology dependence. We find this additional information is strongly correlated with the recent growth history since dark energy domination, which can be parametrised by the linear growth rate at $z\sim 0.05$ for the lower-mass halos in our mass range, and by $\Omega_m$ for our highest-mass halos. Non-universality additionally depends on the effective neutrino number $N_\mathrm{eff}$. The compact representation our model learnt can also inform the design of emulator training sets, achieving high emulator accuracy with fewer simulations.