alt text

The BTS fitting code

The BTS (Behind The Spectrum) fitting code is a python module designed to be a fully-automated multiple-component fitter for optically-thin spectra. The code is open-source and can be downloaded here. If the code is used in published work please cite Clarke et al. 2018, MNRAS, 479, 1722, which shows the first use of the code, as well as a description of the code's methodology and tests of the code's accuracy.

The latest version (updated 18/07/2023) is officially BTS v2.0 as significant changes have been made to how model selection is done, as well new features being added such as moment-masking. The following documentation has been updated to reflect this change.

Dependencies

BTS requires 4 common libraries to be installed:

Numpy,
Astropy,
Matplotlib,
Scipy, at least version 0.17.

To allow the import of the BTS module from any directory use the export command to modified the PYTHONPATH variable. This is done by adding the line

export PYTHONPATH=$PYTHONPATH:"Path to BTS' download location"

to the .bashrc file in the home directory.

How to use the code

There are only 6 functions which the user will call:

read_parameters - The function to read the parameter file and store the parameters in a dictionary. Its input is the parameter file name.
fit_single_line - The function to fit a single spectrum. It takes 4 arguments: the velocity array, the intensity in these bins, a mask denoting which channels potentially have signal, and the parameters.
fit_a_fits - The function to fit an entire fits file. It takes only 1 argument, the parameters. For this function to be used the fits file must have a header which contains information about the velocity range of the spectrum.
make_moments - The function to make moment 0, 1 and 2 maps using the moment-masking technique. It takes only one input, the parameters.
single_gaussian_test - A test function which produces test spectra containing a single Gaussian component. This test is used to determine the error on the fitted parameters. It takes the parameters as its only argument.
multi_gaussian_test - A test function which produces test spectra with up to 4 Gaussian components. This test is used to determine how well the code is at detecting the correct number of components. It takes the parameters as its only argument.

The general manner in which the code should be run is for the parameter file to be read in using read_parameters, followed by make_moments to generate the moment maps, followed by fit_a_fits to fit the PPV datacube. This is because the mask generated to calculate accurate moments is also used by the spectral fitting routine. However, once the moment-making function has been completed the mask is saved as a fits files, and thus this step does not need to be run every time the fitting routine is called. If a single spectrum is fitted, i.e. using fit_single_line, then the mask must be generated by the user. The outputs for making moments and fitting fits files, as well as their format, is explained in the next section which details the methods employed in each function.

The user interacts with BTS predominately via a parameter file which contains all important information. Included in the repository is an example script which shows how the 6 functions are used as well as example parameter files. An explanation of each of the parameters can be found at towards the end of this documentation.

Moment-masking technique

The three moments, 0,1 and 2, are commonly the first data products generated from PPV data cubes of molecular lines. However, it is known that noise in the spectra can affect the resulting moment maps. The typical technique used to combat this is called clipping, where only voxels in the PPV cube which have an intensity above some threshold (normally some multiple of the noise level, e.g. 5σ) are included. Thus this technique minimises the effect of noise by ignoring low intensity voxels; however, by doing so it also ignores numerous voxels which do contain true emission and so biases the resulting moment maps. Dame (2011) go into detail to all of the negative effects of clipping and suggest a moment-masking technique which helps alleviate them.

BTS uses such a moment-masking technique, though slightly modified, to better capture all emission voxels to produce accurate moment maps. In addition the mask generated during the process is highly useful for the spectral fitting as it allows noise-only velocity channels to be excluded from fitting metrics and thus biasing the fitting process.

The general outline of the moment-masking technique is this: the PPV cube is smoothed using a 3D top-hat filter of size (m,m,m); noise is estimated from this smoothed datacube on a pixel-by-pixel basis; voxels in the smoothed cube which have an intensity above some value TC are unmasked, i.e. are said to contain emission; voxels which neighbour previously unmasked voxels and also have an intensity above some value TL, where TL < TC, are unmasked; this previous step continues in an iterative manner until no new voxels are unmasked, completing the masking process; the voxels in the original PPV cube which were unmasked in the smoothed datacube are then used to calculate the moments. An example of the comparison between clipping and moment-masking is shown in figure 1, where m=3, TC = 8σ and TL = 4σ. Moment-masking is significantly better at producing accurate moment maps, where noise effects are minimised, while also capturing the extended, weaker emission.

Figure 1: alt text

From testing, the size of the top-hat filter should be kept relatively small, with m=3-5 found to be adequate. Optimal values for TC and TL should be found by the user; lower values mean noise may begin to effect the moment calculations while higher values exclude weaker emission. Experimentation finds that TC = 6-8σ and TL = 3-5σ typically work well. If no values are specified in the parameter file, BTS defaults to m=3, TC = 8σ and TL = 4σ.

Using the above technique, the make_moments function produces the moment 0, 1 and 2 maps and outputs them as 2D arrays stored in fits files, using the header information in the 3D PPV datacube provided, a 2D noise map stored as a fits file, as well as the 3D mask generated during the process stored as a fits file in the same format as the PPV datacube. Note that when calculating a moment 1 or 2 value for a pixel, make_moments requires that there are at least 4 channels of emission in the spectrum; thus, there may exist pixels where a moment 0 is calculated which do not have a corresponding moment 1 or 2 due to very low signal-to-noise.

Note that there is also a limited velocity range mode for the moment calculations, i.e. if one wishes to calculate moment 0, 1 and 2 from a cube over only a certain velocity range. This is done by turning the parameter flag use_velocity_range to 1 and then specifying the desired velocity range with the parameters min_velocity_range and max_velocity_range. One must ensure that the specified velocities are in the same unit as the velocity axis information in the fits file header. If the header uses frequency instead of velocity, BTS will automatically use km/s and use the header RESTFRQ keyword to make the conversion between frequency and velocity.

Multi-component fitting methodology

The BTS routine does not assume the number of components in a spectrum a priori, but uses the first, second and third derivatives to determine the number and positions of the velocity components. A least-squared fitting routine is then used to determine the best fit with that number of components, checking for over-fitting.

Figure 2 shows the first three derivatives for a perfect Gaussian with a mean of zero and a standard deviation of one. One sees that at the location of the Gaussian's maximum: the first derivative is at zero and decreasing, there exists a local minimum in the second derivative, and the third derivative is at zero and is increasing. It is the local minimum in the second derivative which is used as the main indicator of a component, the first and third derivatives are used as additional checks. This is because the second derivative is more sensitive than the first derivative, but not as sensitive as the third derivative to small oscillations, and so can be used for the detection of shoulders. This is seen in figure 3 where a second Gaussian with the same width, and half the amplitude of the first, is placed at x = 2. The first derivative is no longer zero at this location, but the second derivative shows a local minimum at x ~ 2.2 and the third derivative is zero.

Figure 2: alt text Figure 3:

Observed spectra and spectra from synthetic observations have noise, this will add considerable noise to the derivatives making the detection of a peak more difficult. Figure 4 shows the same Gaussian as seen in figure 1 but with noise added. The noise is obtained by sampling from a normal distribution with a mean of zero and a standard deviation of 0.04, leading to a peak signal-to-noise ratio of ~10. One can see that the second derivative is purely noise and shows no pronounced local minimum at the location of the peak. To combat this, one may smooth noisy spectra before determining the derivatives. To smooth the spectrum it is convolved with a Gau

BTS

Install / Use

README

The BTS fitting code

Dependencies

How to use the code

Moment-masking technique

Multi-component fitting methodology