Capse.jl

Capse.jl is a Julia package designed to emulate the computation of the CMB Angular Power Spectrum, with a speedup of several orders of magnitude compared to standard codes such as CAMB or CLASS. The core functionalities of Capse.jl are inherited from the upstream library AbstractCosmologicalEmulators.jl.

Installation

To install Capse.jl, run the following in the Julia REPL:

using Pkg; Pkg.add(url="https://github.com/CosmologicalEmulators/Capse.jl")

Usage

Using Capse.jl involves two main steps:

  • Instantiating the emulators, i.e. initializing the neural network, its weights and biases, and the quantities employed in pre- and post-processing
  • Using the instantiated emulators to retrieve the spectra

The remainder of this section shows how to do both.

Instantiation

The most direct way to instantiate an official trained emulator is the following one-liner:

Cℓ_emu = Capse.load_emulator(weights_folder);

where weights_folder is the path to the folder containing the files required to build up the network. Some of the trained emulators can be found on Zenodo and we plan to release more of them there in the future.

An additional argument can be passed to the previous function to choose between the two NN backends currently available:

  • SimpleChains, which is tailored for small NNs running on a CPU
  • Lux, which can run on both CPUs and GPUs

SimpleChains.jl is faster, especially for small NNs on the CPU. To run on a GPU, use Lux.jl, which is selected by passing the additional keyword argument emu = Capse.LuxEmulator to the load_emulator function:

Cℓ_emu = Capse.load_emulator(weights_folder, emu = Capse.LuxEmulator);

Each trained emulator should ship with a description stored in its JSON file. To print the description, run:

Capse.get_emulator_description(Cℓ_emu)
The parameters the model has been trained are, in the following order: ln10As, ns, H0, ωb, ωc, τ.
The emulator has been trained by Marco Bonici.
Marco Bonici email is bonici.marco@gmail.com.
The emulator has been trained on the high-precision-settings prediction as computed by the CAMB Boltzmann solver.

Warning

Cosmological parameters must be fed to Capse.jl as arrays. It is the user's responsibility to check that they are in the right order, by reading the output of the get_emulator_description method.
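
One way to reduce ordering mistakes is to build the input array from named values. The following is a minimal sketch and not part of Capse.jl: PARAM_ORDER and pack_params are hypothetical names, the order is copied from the example description above (other emulators may use a different one), and the numerical values are purely illustrative.

```julia
# Order quoted in the example description above; other emulators may differ.
const PARAM_ORDER = (:ln10As, :ns, :H0, :ωb, :ωc, :τ)

# Hypothetical helper (not part of Capse.jl): build the input vector
# from named values, arranged in the order the emulator expects.
function pack_params(p::NamedTuple; order = PARAM_ORDER)
    missing_keys = [k for k in order if !haskey(p, k)]
    isempty(missing_keys) || throw(ArgumentError("missing parameters: $(missing_keys)"))
    return Float64[p[k] for k in order]
end

# Illustrative values only; the keys are deliberately given out of order.
x = pack_params((H0 = 67.4, ns = 0.965, ln10As = 3.044, τ = 0.054, ωb = 0.0224, ωc = 0.120))
```

The resulting x is a plain Vector{Float64}, so it can be passed to the emulator like any other array.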

After loading a trained emulator, feed it an array of input parameters x to get the emulated $C_\ell$'s:

x = rand(6) # generate some random input
Capse.get_Cℓ(x, Cℓ_emu) #compute the Cℓ's

Using SimpleChains.jl, we obtain a minimum execution time of about 46 microseconds (median 48 μs, mean 53 μs):

BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (min … max):  45.890 μs … 792.465 μs   GC (min … max): 0.00% … 87.25%
 Time  (median):     48.073 μs                GC (median):    0.00%
 Time  (mean ± σ):   53.221 μs ±  23.252 μs   GC (mean ± σ):  1.25% ±  2.96%

  ▃█▅▄▃▃▃▃▆▃▂▂▂▂▂▂▂▁▁▁▁▁▁                                    ▂
  ██████████████████████████▇█▇▇▇█▇▆▇▆▆▆▆▆▇▆▆▅▄▅▅▅▆▅▅▁▅▅▄▃▄▃ █
  45.9 μs       Histogram: log(frequency) by time       100 μs <

 Memory estimate: 117.47 KiB, allocs estimate: 8.

Using Lux.jl, with the same architecture, we obtain

BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (min … max):  42.592 μs … 940.212 μs   GC (min … max): 0.00% … 90.24%
 Time  (median):     44.549 μs                GC (median):    0.00%
 Time  (mean ± σ):   49.497 μs ±  34.222 μs   GC (mean ± σ):  2.60% ±  3.69%

   ▇█▄▃▂▃▃▃▃▂▂▂▁▁▁▁▁▁▁                                       ▁
  ▇█████████████████████▇▇███▇▇███▇▇▆▆▆▇▆▆▅▆▆▅▄▅▆▅▄▅▅▅▅▄▅▄▃▄ █
  42.6 μs       Histogram: log(frequency) by time      85.8 μs <

 Memory estimate: 161.34 KiB, allocs estimate: 59.

SimpleChains.jl and Lux.jl have almost the same performance and they give the same result up to floating point precision.

These benchmarks have been performed locally, with a 13th Gen Intel® Core™ i7-13700H, using a single core.

Considering that a high-precision-settings calculation performed with CAMB on the same machine requires around 60 seconds, Capse.jl is 5-6 orders of magnitude faster.
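
The ratio can be checked directly from the figures quoted above. This is only a back-of-the-envelope sketch: the variable names are arbitrary, the CAMB time is the approximate 60 seconds stated in the text, and the Capse.jl time is the SimpleChains.jl mean from the benchmark output.

```julia
# Timings in seconds, taken from the text and the benchmark output above.
t_camb  = 60.0        # approximate high-precision CAMB run
t_capse = 53.221e-6   # mean SimpleChains.jl evaluation time

speedup = t_camb / t_capse   # how many times faster the emulator is
orders  = log10(speedup)     # the same ratio in orders of magnitude

println("speedup: ", round(speedup; sigdigits = 2))
println("orders of magnitude: ", round(orders; digits = 1))
```

With these inputs the ratio comes out at roughly six orders of magnitude; since the CAMB time is only approximate, the result should be read as an estimate.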

Warning

Currently, there is a performance issue when using Lux.jl in a multi-threaded scenario. This is a known issue (see discussion here). To run multiple chains locally with Lux.jl, the suggested (working) strategy is to use distributed computing.
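
As a rough illustration of the distributed approach, the sketch below uses only the Distributed standard library. It makes several assumptions: the worker count is arbitrary, and evaluate_chain is a hypothetical placeholder that stands in for whatever each chain would actually compute. In a real setup each worker would presumably load its own emulator and evaluate it inside that function.

```julia
using Distributed

# Start two local worker processes (the count is arbitrary for this sketch).
addprocs(2)

# Define the work on every process. `evaluate_chain` is a placeholder:
# in practice each worker would load its own emulator and call it here.
@everywhere function evaluate_chain(x)
    return sum(abs2, x)  # stand-in for an emulator evaluation
end

# One input vector per chain; pmap distributes them across the workers.
inputs = [fill(Float64(i), 6) for i in 1:4]
results = pmap(evaluate_chain, inputs)

# Release the workers when done.
rmprocs(workers())
```

Because each worker is a separate process with its own memory, this pattern avoids sharing a single emulator between threads.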

Authors

  • Marco Bonici, Postdoctoral researcher at Waterloo Center for Astrophysics
  • Federico Bianchini, Postdoctoral researcher at Kavli Institute for Particle Physics and Cosmology
  • Jaime Ruiz-Zapatero, Research Software Engineer at the Advanced Research Computing centre of University College London
  • Marius Millea, Researcher at UC Davis and Berkeley Center for Cosmological Physics

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Please make sure to update tests as appropriate.

License

Capse.jl is licensed under the MIT "Expat" license; see LICENSE for the full license text.