Fitting the Global 21-cm Signal

It’s relatively straightforward to call the Markov Chain Monte Carlo code emcee (Foreman-Mackey et al. 2013) and perform fits to:

  • The global 21-cm signal.

  • The galaxy luminosity function.

  • Something really cool I haven’t even thought of yet!

Here, we’ll focus on the global signal application.

Fitting the Global 21-cm Spectrum

A fast model that yields semi-realistic global 21-cm signals is one which treats the Lyman-\(\alpha\), ionization, and thermal histories as tanh functions (see Harker et al. 2016), so that’s what we’ll use in this example.
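
Schematically, each of these histories is controlled by three numbers: an amplitude, a transition redshift, and a duration. Assuming the convention of Harker et al. (2016), each quantity \(A\) follows something like

\[A(z) = \frac{A_0}{2}\left[1 + \tanh\left(\frac{z_0 - z}{\Delta z}\right)\right],\]

rising smoothly from zero at high redshift to \(A_0\) at low redshift. The tanh_*0, tanh_*z0, and tanh_*dz parameters appearing below correspond to \(A_0\), \(z_0\), and \(\Delta z\), respectively.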

First, define the parameters that remain unchanged from model to model (mostly abstracted away by the problem_type=101 and tanh_model=True settings), including some metadata blobs:

import numpy as np

# These parameters are common to every model evaluated in the fit
base_pars = {
    'problem_type': 101,  # pre-defined global 21-cm signal problem setup
    'tanh_model': True,   # treat the Ly-a, ionization, and thermal histories as tanh

    # Group 0: scalar blobs; group 1: blobs tabulated vs. redshift
    'blob_names': [['tau_e', 'z_C', 'dTb_C'], ['cgm_h_2', 'igm_Tk', 'dTb']],
    'blob_ivars': [None, [('z', np.arange(6, 31))]],
    'blob_funcs': None,
}

Note

These blob_* parameters were covered in Simple Parameter Study: 2-D Model Grid, so if you have yet to go through that example, now might be a good time!
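
Before setting up the fit, it can be worth running a single forward model with these baseline settings to verify that they produce a sensible signal. A minimal sketch, using the same simulator class we’ll hand to the fitter below:

import ares

# Run one tanh model with the baseline settings as a sanity check
sim = ares.simulations.Global21cm(**base_pars)
sim.run()

# Plot the signal with the built-in convenience routine
sim.GlobalSignature()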

Now, initialize a fitter:

import ares

# Initialize fitter
fitter_gs = ares.inference.FitGlobal21cm()

and the signal to be fit (just a -100 mK Gaussian signal at 80 MHz with \(\sigma=20\) MHz for simplicity):

fitter_gs.frequencies = freq = np.arange(40, 200) # MHz
fitter_gs.data = -100 * np.exp(-(80. - freq)**2 / 2. / 20.**2)

# Set errors
fitter_gs.error = 20. # flat 20 mK error
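
To see what we’re asking the fitter to recover, it can help to plot the mock signal first; a quick sketch, assuming matplotlib is available:

import matplotlib.pyplot as plt

# Recompute the mock signal locally and plot it with its flat error band
mock = -100 * np.exp(-(80. - freq)**2 / 2. / 20.**2)
plt.plot(freq, mock, color='k')
plt.fill_between(freq, mock - 20., mock + 20., alpha=0.3)
plt.xlabel(r'$\nu$ [MHz]')
plt.ylabel(r'$\delta T_b$ [mK]')
plt.show()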

At this point, we’re ready to initialize the master fitter:

fitter = ares.inference.ModelFit(**base_pars)
fitter.add_fitter(fitter_gs)
fitter.simulator = ares.simulations.Global21cm

In general, we can add more fitters in this fashion; their log-likelihoods are simply summed.
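
For instance, a joint fit would follow the same pattern as this sketch (the second fitter is purely illustrative; consult the ARES documentation for the fitter classes actually available):

# Hypothetical second fitter, e.g., for the galaxy luminosity function:
# fitter_lf = ares.inference.FitGalaxyPopulation()  # class name illustrative
# fitter_lf.data = ...                              # observed measurements
# fitter.add_fitter(fitter_lf)                      # log-likelihoods add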

Now, we set the parameters to be varied in the fit and whether or not to explore their values in log10:

# Set axes of parameter space
fitter.parameters = ['tanh_J0', 'tanh_Jz0', 'tanh_Jdz', 'tanh_Tz0', 'tanh_Tdz']
fitter.is_log = [True] + [False] * 4  # explore tanh_J0 in log10, the rest linearly

as well as the priors on the parameters, which in this case we’ll take to be uninformative over a relatively broad range (to do this we need Keith Tauscher’s distpy package):

from distpy import DistributionSet
from distpy import UniformDistribution

ps = DistributionSet()
ps.add_distribution(UniformDistribution(-3, 3), 'tanh_J0')
ps.add_distribution(UniformDistribution(5, 20), 'tanh_Jz0')
ps.add_distribution(UniformDistribution(0.1, 20), 'tanh_Jdz')
ps.add_distribution(UniformDistribution(5, 20), 'tanh_Tz0')
ps.add_distribution(UniformDistribution(0.1, 20), 'tanh_Tdz')

fitter.prior_set = ps
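
As a quick sanity check on the priors, you can draw a random sample from the set (this assumes distpy’s DistributionSet.draw method):

# Draw one random realization of the parameters from the priors
sample = ps.draw()
print(sample)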

Finally, we set the number of Goodman-Weare walkers

fitter.nwalkers = 16  # In general, the more the merrier (~hundreds)

and run the fit:

# Do a quick 10-step burn-in, then run for 50 steps (per walker),
# checkpointing results to disk every save_freq=10 steps
fitter.run(prefix='test_tanh', burn=10, steps=50, save_freq=10)

This will result in a series of files named test_tanh*.pkl. See the example on Analyzing MCMC Calculations to proceed with inspecting the above dataset.
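
For a quick preview, loading the output looks something like the sketch below; ares.analysis.ModelSet and its plotting routines are covered in detail in that example:

import ares

# Load the MCMC output written under the 'test_tanh' prefix
anl = ares.analysis.ModelSet('test_tanh')

# Corner plot for two of the tanh parameters
anl.TrianglePlot(['tanh_Tz0', 'tanh_Tdz'])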

Note

For a simple model like the tanh, running the fit through ARES is slower than strictly necessary due to the overhead of initializing objects and performing analysis (like finding the signal’s extrema) in real time. For more sophisticated models, this overhead is dwarfed by the cost of each simulation, and for complex blobs the built-in I/O machinery is very useful. If all you’re interested in is phenomenological fits, it will be much faster to write your own wrapper around emcee.
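
For reference, a bare-bones wrapper of that sort, fitting a Gaussian directly to the same mock signal, might look like the sketch below (the model and priors here are illustrative and independent of ARES):

import numpy as np
import emcee

freq = np.arange(40, 200)  # MHz
data = -100 * np.exp(-(80. - freq)**2 / 2. / 20.**2)
err = 20.  # mK, flat

def log_prob(theta):
    # Gaussian model: amplitude, center, width; flat priors over broad ranges
    A, nu0, sigma = theta
    if not (-500 < A < 0 and 40 < nu0 < 200 and 1 < sigma < 100):
        return -np.inf
    model = A * np.exp(-(nu0 - freq)**2 / 2. / sigma**2)
    return -0.5 * np.sum((data - model)**2 / err**2)

ndim, nwalkers = 3, 32
p0 = np.array([-100., 80., 20.]) + 1e-2 * np.random.randn(nwalkers, ndim)
sampler = emcee.EnsembleSampler(nwalkers, ndim, log_prob)
sampler.run_mcmc(p0, 1000)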

Hopefully you recover a signal with a -100 mK trough at 80 MHz, but beware: with so few walkers and steps the chain will be nowhere near converged, so the plots won’t be pretty unless you increase the number of steps, walkers, or both.