Calibrating ARGUS data with pipelines

This is an outline of the calibration pipeline for processing ARGUS data taken with the GBT. The pipeline approach relies on two packages that I maintain:

  • gbtpipe (link) – An installable version of the GBT pipeline (developed by NRAO/GBO and forked from here). This provides a useful set of calibration routines from the observatory library. The Python module also provides a gridder for on-the-fly spectroscopic data.
  • degas (link) – The calibration pipeline package developed for the DEGAS survey. This package performs the actual calibration using the tools developed in gbtpipe, following the observing strategy adopted for DEGAS.

The gbtpipe package is also used for gridding in the KEYSTONE and GAS projects.

Data for DEGAS are collected using the VEGAS spectrometer and output into uncalibrated SDFITS files using the GBT filler. Data are recorded in spectrometer counts with the necessary pointing, tuning, and weather metadata tagged to each scan in the file. A typical observation consists of a VANE/SKY calibration, followed by scanning the array across the source repeatedly, and another VANE/SKY calibration. All calibration is performed on a per-feed basis.

Given these raw data, the degas package processes the VANE/SKY calibration scans to establish the system temperature scale of the observations. This is accomplished through the gettsys routine, which identifies VANE/SKY scans from the metadata and performs error checking. The routine returns the system temperature referenced to above the atmosphere.
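The arithmetic behind this step is the standard chopper-wheel estimate: with the vane at the ambient temperature, $T_{\rm sys} = T_{\rm vane}\,\langle\mathrm{SKY}\rangle/(\langle\mathrm{VANE}\rangle - \langle\mathrm{SKY}\rangle)$. Below is a minimal, self-contained sketch of that calculation, not degas's actual gettsys internals; the 280 K vane temperature is a placeholder.

```python
import numpy as np

def tsys_from_vane(vane_counts, sky_counts, t_vane=280.0):
    """Chopper-wheel style Tsys estimate from VANE/SKY scans.

    With the vane at ambient temperature t_vane (K), the system
    temperature referenced above the atmosphere is
        Tsys = t_vane * <SKY> / (<VANE> - <SKY>).
    """
    vane = np.nanmean(vane_counts, axis=0)   # average over integrations
    sky = np.nanmean(sky_counts, axis=0)
    return t_vane * sky / (vane - sky)       # per-channel Tsys in K

# Example with synthetic spectrometer counts (20 integrations x 1024 channels)
rng = np.random.default_rng(0)
sky = 50 + rng.normal(0, 0.1, (20, 1024))
vane = 120 + rng.normal(0, 0.1, (20, 1024))
print(np.median(tsys_from_vane(vane, sky)))  # ~ 280 * 50/70 = 200 K
```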

Individual scan calibration runs through the calscans routine, which accepts session scan numbers and VANE/SKY calibration scan numbers and loops over each set of scans to calibrate onto the $T_A^*$ scale. Each individual scan is processed using ON/OFF calibration. In the DEGAS observing strategy, there are no dedicated OFF positions, and the signal from an individual scan is typically weak. These conditions allow the first and last 25% of the scan data to be used to establish the OFF power, and two strategies are available. In the first, the OFF power in each channel is estimated by fitting a linear relationship of OFF power vs. time on a per-channel basis, yielding $\mathrm{OFF}(t,\nu)$; this regression provides a model of the bandpass in each channel over the course of the scan. Alternatively, users can construct a time-constant bandpass as the median of the scan data over time.
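Concretely, with the fitted OFF model in hand, each integration is calibrated with the standard position-switched estimator (written here in the notation above; this is a schematic of the approach rather than degas's exact internals):

$$T_A(t,\nu) = T_{\rm sys}(\nu)\,\frac{\mathrm{ON}(t,\nu) - \mathrm{OFF}(t,\nu)}{\mathrm{OFF}(t,\nu)},$$

where $\mathrm{ON}(t,\nu)$ is the measured power in the scan and $\mathrm{OFF}(t,\nu)$ is either the linear-in-time fit or the time-constant median bandpass.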

The gbtpipe package provides observatory weather forecast and atmospheric modeling routines. Using these, we calculate the optical depth of the atmosphere at the observing frequency and correct the data to the $T_A^*$ scale with that opacity. The calibration yields a set of scan data for each feed representing sky brightness as a function of position and frequency.
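Schematically, the opacity correction is the usual exponential airmass factor (the handling of efficiency factors in gbtpipe may differ in detail):

$$T_A^* = T_A\, e^{\tau_\nu A}, \qquad A \approx \frac{1}{\sin(\mathrm{el})},$$

where $\tau_\nu$ is the zenith opacity derived from the weather model and $A$ is the airmass at the elevation of the scan.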

Once calibration is complete, the scan data are gridded into a position-position-frequency data cube using the on-the-fly gridding method outlined in Mangum et al. (2007) (see this link). The gridding first establishes a world coordinate system (WCS) that appropriately represents all the observations to be gridded, so that the latitude/longitude extent includes all the observational data and the telescope beam is well sampled. Alternatively, the user can specify their desired WCS either in terms of keywords or a FITS template header.
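As a usage sketch, gridding a set of calibrated scan files might look like the following. The keyword names here are assumptions chosen to mirror the options described above, not a definitive reference for the gbtpipe API.

```python
import glob
import gbtpipe

# Hypothetical directory of calibrated, per-feed SDFITS scan files
filelist = glob.glob('NGC0001_scans/*.fits')

gbtpipe.griddata(filelist,
                 outname='NGC0001',      # hypothetical output prefix
                 templateHeader=None)    # or a FITS header pinning the WCS
```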

The data are then sampled onto the WCS grid using a gridding kernel as outlined in the article. The gbtpipe implementation loops over scans and determines which pixels receive weight from each scan. This scan-major approach is efficient when the number of scans is large.
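In pseudocode, the scan-major loop looks roughly like the following. This is a sketch of the strategy rather than gbtpipe's actual implementation; the function and argument names are illustrative only.

```python
import numpy as np

def grid_scans(scans, cube_shape, pixel_coords, support_radius, kernel):
    """Scan-major gridding sketch: visit each scan once and spread its
    spectra onto nearby cube pixels, accumulating data*weight and weight.

    scans        : iterable of [(position, spectrum), ...] per scan
    cube_shape   : (ny, nx, nchan) of the output cube
    pixel_coords : (ny, nx, 2) array of pixel sky coordinates
    """
    data_accum = np.zeros(cube_shape)
    weight_accum = np.zeros(cube_shape)
    for scan in scans:                       # one pass per scan
        for (x, y), spectrum in scan:
            # Distance from this sample to every pixel center
            dist = np.hypot(pixel_coords[..., 0] - x,
                            pixel_coords[..., 1] - y)
            nearby = dist < support_radius   # only pixels within the support
            w = kernel(dist[nearby])
            data_accum[nearby] += w[:, None] * spectrum
            weight_accum[nearby] += w[:, None]
    # Pixels with no coverage end up NaN after the division
    with np.errstate(invalid='ignore', divide='ignore'):
        return data_accum / weight_accum
```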

Each scan is processed through several steps. First, the scan is shifted to a common frequency sampling scale; the shift is determined from GBT Doppler tracking metadata and the data are re-sampled using Fourier shift transforms. Each scan is then usually corrected for a linear baseline using a window defined for the observational setup. After linear baseline correction, scans are rejected if they fail quality criteria, including (1) an rms noise value more than 25% above that expected from the radiometer formula (the threshold can be set by the user) or (2) excessive rippling. The position of the scan is then compared to all the positions in the data cube, and pixels at distances smaller than the support radius (usually 3 beam FWHMs) are included in the gridding. A weighted value of the spectrum is added to each position within the support using the gridding kernel. The jincGrid kernel provides the recommended Bessel-function gridding, and gaussGrid offers Gaussian gridding for data that are poorly sampled in position.
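For reference, the Gaussian-tapered jinc kernel recommended by Mangum et al. (2007) has roughly the form below. The scale parameters follow the paper's recommended values for a cell size of one-third of the beam FWHM; gbtpipe's jincGrid may parameterize this differently.

```python
import numpy as np
from scipy.special import j1

def jinc_kernel(r, fwhm):
    """Gaussian-tapered jinc gridding kernel (after Mangum et al. 2007).

    r and fwhm share the same angular units; a and b are the paper's
    recommended scales for a pixel size of fwhm / 3.
    """
    a = 1.55 * fwhm / 3.0                   # jinc scale
    b = 2.52 * fwhm / 3.0                   # Gaussian taper scale
    r = np.atleast_1d(np.asarray(r, dtype=float))
    x = np.pi * r / a
    jinc = np.full_like(x, 0.5)             # limit of J1(x)/x as x -> 0
    nz = x != 0
    jinc[nz] = j1(x[nz]) / x[nz]
    # Overall normalization is irrelevant: the cube is divided by the
    # accumulated weights at the end.
    return jinc * np.exp(-(r / b) ** 2)
```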

The basic result of the pipeline is the cube produced after all scans are accumulated into the appropriate positions and the accumulated data are divided by the weight values. The package also provides facilities to convert position-position-frequency cubes into position-position-velocity cubes in different frames or with different rest frequencies. The package also performs position-by-position rebaselining of the entire cube. In this step, the baseline order can be higher (usually order 3), and the package uses Legendre polynomials so that each baseline order is statistically independent.
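The rebaselining idea can be sketched with numpy's Legendre utilities; this illustrates the approach rather than reproducing degas's exact code, and the mask of line-free channels is left to the user.

```python
import numpy as np
from numpy.polynomial import legendre

def rebaseline(spectrum, linefree, order=3):
    """Fit and subtract a Legendre-polynomial baseline.

    spectrum : 1-D array of intensities for one cube position
    linefree : boolean mask of channels free of line emission
    order    : maximum Legendre order of the baseline
    """
    # Map channel index onto [-1, 1], the interval on which the
    # Legendre polynomials are orthogonal
    x = np.linspace(-1, 1, spectrum.size)
    coeffs = legendre.legfit(x[linefree], spectrum[linefree], order)
    return spectrum - legendre.legval(x, coeffs)
```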

The gbtpipe package also includes several quality assurance plotting and statistical routines for validating data quality. The degas package additionally includes observation and pipeline management software that uses observing logs maintained in Google Docs to generate calibrated and gridded data products from the raw data and the logs.

Written on March 20, 2018