This document describes how to use the PIMA software for processing VLBI visibility data. PIMA performs data calibration and fringe fitting, and exports the results of fringe fitting in a form that can be digested by the VTD/Post-Solve and Difmap software for astrometry/geodesy analysis and for imaging.
Contents: 
- Introduction
 - Principles of PIMA
 - Creation of a configuration file
 - Loading the data
 - Parsing log files
 - Calibrating the data
 - Examine raw data and calibration information
 - Running coarse fringe fitting
 - Computation of a complex bandpass
 - Running fine fringe fitting
 - Export data for astrometry/geodesy solution
 - Export data for imaging
 - Import of gain curves
 - Flagging visibilities with low amplitude at the beginning or end of a scan.
 - Running task splt for splitting and exporting data for imaging
 - Compute gain correction
 - OPAcity Generation
 - OPAcity LOading
 - Compute TSys MOdel
 - Use case of preparing the data suitable for imaging
 - Automatic imaging
 - Re-fringe the data using results of astrometry/geodesy solution
 - Data analysis pipeline
 - Running the analysis pipeline with pir.py
 - Processing dual-band observations.
 - Auxiliary tools
 
PIMA has a flexible command-line interface and is designed for non-interactive use. It is well suited for incorporation into scripts for shell, Python, or similar interpreters.
All control parameters that are needed for processing a given experiment are gathered in a control file. PIMA does not support any defaults: all parameters, even those that are not used for a specific operation, are to be explicitly defined in that file.
PIMA supports the following general syntax:
Keyword names are in upper case and are terminated by a colon. Values are case sensitive. If the same keyword is defined more than once, the last definition takes precedence. The keyword: value pairs defined on the command line are processed as if they were appended to the end of the control file. There are two exceptions: UV_FITS and INTMOD_FILE. More than one definition of these keywords is allowed, for the case when several input files should be defined.
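The keyword rules above can be illustrated with a minimal Python sketch. It assumes a simplified control file in which every meaningful line has the form KEYWORD: value. The function name parse_control, the argument extra_pairs, and the treatment of lines starting with '#' as comments are assumptions made for this illustration; this is not PIMA's own parser.

```python
# Keywords that may be defined more than once (named in the text above).
MULTI_VALUE_KEYWORDS = {"UV_FITS", "INTMOD_FILE"}

def parse_control(lines, extra_pairs=()):
    """Parse 'KEYWORD: value' lines into a dictionary.

    extra_pairs are handled as if they were appended to the end of the file,
    so they override earlier definitions of the same keyword.
    """
    params = {}
    for line in list(lines) + list(extra_pairs):
        line = line.strip()
        if not line or line.startswith("#"):   # assumption: '#' starts a comment
            continue
        # Split at the first colon only: values themselves may contain colons.
        keyword, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"keyword is not terminated by a colon: {line!r}")
        keyword, value = keyword.strip(), value.strip()
        if keyword in MULTI_VALUE_KEYWORDS:
            params.setdefault(keyword, []).append(value)   # keep every definition
        else:
            params[keyword] = value                        # last definition wins
    return params
```

For example, if SESS_CODE is defined twice, the second value is kept, while two UV_FITS lines produce a list of two file names.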
Some values in PIMA control file are file names. Although you can define relative file names, defining absolute file names is encouraged.
A user rarely creates a configuration file from scratch. Usually, a control file from a similar experiment is copied to a new name and the user edits it. The user should check every keyword carefully. The following keywords most commonly need to be changed:
There may be more than one FITS-IDI file in the processed dataset. In that case, more than one line with keyword UV_FITS should appear in the control file. FITS-IDI files should appear in chronological order.
PIMA requires that the number of spectral channels within each intermediate frequency (IF) be the same for each FITS-IDI file. If your experiment has different spectral resolutions, PIMA cannot process it as one experiment. In that case you need to write more than one control file with different SESS_CODE keywords and process them separately.
The differences between these two cases:
Keep in mind that each source should be defined in two catalogues. One catalogue is used for association with the centers of the fields defined in FITS-IDI files. The other catalogue is used for computation of the theoretical path delay; that catalogue is defined in the VTD control file used by PIMA. The primary source name is the "IVS name", which is the B1950 name with some exceptions. Any observed source (NB: a source, not the field!) should have a record in the source catalogue defined in the VTD control file that is associated with the PIMA source catalogue via the IVS source name field. Internally, PIMA uses IVS source names, but it also keeps the original name of the center of the field.
If during task load PIMA cannot find a source name(s) in the input catalogue, it issues an error message that contains names and coordinates of all missing sources at the beginning of the task. If PIMA cannot find a source name in VTD source catalogues, it issues an error message that contains names and coordinates of all missing sources at the end of the task.
The first operation of task load is parsing the control file. The VTD control file specified in the PIMA control file is also parsed. Finally, the experiment description file specified in the keyword MKDB.DESC_FILE is parsed. Any errors, such as syntax errors or files that do not exist, are reported: PIMA stops and issues an error message.
In the next step PIMA will check every source name first in the file specified in keyword SOU_NAMES, then in catalogue files specified in VTD control file. If it finds at least one source not in the catalogue, PIMA will issue the error message and print the list of missing source names and their coordinates extracted from the FITS file.
Then PIMA will check every station name first in the file specified in keyword STA_NAMES, then in catalogue files specified in VTD control file. If it finds at least one station not in the catalogue, PIMA will issue the error message and print the list of missing station names and their coordinates extracted from the FITS file.
Then PIMA checks the frequencies in each file and creates the global frequency table for the entire experiment. It converts lower side band intermediate frequency tables (LSB IF) into upper side band IFs by re-ordering the frequencies of the channels within each IF so that they follow in ascending order. It merges or combines frequency groups if requested. Finally, it creates tables of cross indices from the frequency groups of the original frequency tables to the frequency groups of the global frequency table, and vice versa.
The next step is to read all visibility data. Visibility data are sorted, cross-correlation data are linked to autocorrelation data, and tables of time indices, cross-correlation indices, and auto-correlation indices are created. PIMA checks for orphan visibilities: cross-correlation visibilities without autocorrelation data and auto-correlation data without matching cross-correlation data. These visibilities are added to the list of "bad data".
Then PIMA splits the data into scans. By that time the data are chronologically sorted. There are three parameters in the PIMA control file that control the process of data splitting: MIN_SCAN_LEN, MAX_SCAN_LEN, and MAX_SCAN_GAP. PIMA sets a preliminary scan boundary when the source changes. If it does not find valid visibilities for MAX_SCAN_GAP seconds after the last valid visibility of the previous source, it sets the end of the scan of the previous source. If the time elapsed since the first valid visibility of a given scan is longer than MAX_SCAN_LEN, a scan boundary is set and a new scan starts. That means that a scan cannot be longer than MAX_SCAN_LEN seconds and cannot have a gap longer than MAX_SCAN_GAP. At the same time, scans of different sources may overlap, i.e. scan B may have its start and stop time within the interval between the start and stop time of scan A. Scans shorter than MIN_SCAN_LEN seconds are eliminated, and the visibilities within such short scans are marked as bad.
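The splitting rules above can be sketched in a few lines of Python. This is a simplified interpretation, not PIMA's implementation: it assumes that visibilities are given as (time in seconds, source name) pairs and that each source is split independently, which is one way to obtain overlapping scans of different sources. The function name split_into_scans is hypothetical.

```python
from collections import defaultdict

def split_into_scans(vis, min_scan_len, max_scan_len, max_scan_gap):
    """Split (time_sec, source) pairs into scans.

    Returns (scans, bad): scans is a list of (source, t_start, t_stop),
    bad is a list of (time_sec, source) pairs that fell into scans
    shorter than min_scan_len.
    """
    by_source = defaultdict(list)
    for t, src in vis:
        by_source[src].append(t)

    scans, bad = [], []
    for src, times in by_source.items():
        times.sort()
        groups, current = [], [times[0]]
        for t in times[1:]:
            gap_too_long = t - current[-1] > max_scan_gap
            scan_too_long = t - current[0] > max_scan_len
            if gap_too_long or scan_too_long:
                groups.append(current)      # close the scan, start a new one
                current = [t]
            else:
                current.append(t)
        groups.append(current)
        for g in groups:
            if g[-1] - g[0] < min_scan_len:
                bad.extend((t, src) for t in g)    # scan too short: mark as bad
            else:
                scans.append((src, g[0], g[-1]))
    scans.sort(key=lambda s: s[1])
    return scans, bad
```

With this sketch a long scan with a data gap shorter than max_scan_gap stays in one piece, while a short visit to another source in the middle of it forms its own, possibly rejected, scan.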
The choice of MIN_SCAN_LEN, MAX_SCAN_LEN, and MAX_SCAN_GAP is determined by the scheduling goals and the correlator setup. Usually MIN_SCAN_LEN is set to cover at least three accumulation periods, otherwise the fringe fitting process may fail. For non-phase-referencing experiments MAX_SCAN_LEN can be set to the scan length set by the schedule. For experiments at 22 GHz and higher, MAX_SCAN_LEN can be set shorter, close to the coherence time. MAX_SCAN_GAP can be set to 1/2 of the scan length to prevent a scan split in the case of data loss within a scan. For phase-referencing observations MAX_SCAN_LEN is set to the cycle duration and MAX_SCAN_GAP is set to 90% of MAX_SCAN_LEN. It should be noted that PIMA allows using a portion of a scan in data analysis, but it cannot unite two scans. Parameters SCAN_LEN_USED and SCAN_LEN_SKIP allow selecting a continuous portion of a scan for fringe fitting and splitting after task load. But PIMA cannot increase the scan length after task load is done. If a user needs to change scan allocation or increase scan length, task load should be re-run. NB: if a new run of task load changes the total number of observations, fringe fitting should be re-run, since the stale fringe results have different scan and observation indices.
After PIMA splits the data into scans, it checks whether all cross- and auto- visibility data are claimed by scans. All visibilities not claimed by scans are marked as bad.
If PIMA finds at least one bad visibility, it stops with an error message. Since getting bad visibilities is a rather common situation, PIMA has a mechanism to accommodate them. The PIMA control file supports keyword UV_EXCLUDE_FILE, which defines a file with indices of visibilities, either cross or auto, that are to be excluded at the very beginning. These visibilities are excluded from analysis, and PIMA cannot mark them as bad because it does not see them. PIMA supports a special value of parameter UV_EXCLUDE_FILE: AUTO. If value AUTO is specified, then when PIMA finds bad points, it writes their visibility indices into the so-called bad visibility file SSSSS/EEE_uv.exc, where SSSSS is EXPER_DIR and EEE is SESS_CODE. If that file already exists, PIMA appends new visibilities to it. When UV_EXCLUDE_FILE: AUTO is specified, task load is executed several times. The first time, PIMA finds bad points, puts them in the SSSSS/EEE_uv.exc file, and stops with exit code 23. The second time, the bad points in SSSSS/EEE_uv.exc are read and excluded from the subsequent analysis. Usually two runs are sufficient. Sometimes a 3rd and a 4th run are required. Wrapper pf.py executes the 2nd, 3rd, and 4th runs automatically. NB: pf.py purges the SSSSS/EEE_uv.exc file if it exists.
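The repeated execution of task load with UV_EXCLUDE_FILE: AUTO can be pictured as a small retry loop. The sketch below is only an illustration of that logic and is not taken from pf.py. The callable run_load is hypothetical: it stands for "run task load once and return its exit code". The sketch also assumes that exit code 0 means success, which the text above does not state.

```python
EXIT_BAD_VIS_FOUND = 23   # exit code quoted in the text above

def load_with_retries(run_load, max_runs=4):
    """Call run_load() until no new bad visibilities are reported.

    run_load is a hypothetical callable that runs task load once and
    returns its exit code. Returns the number of runs that were needed.
    """
    for attempt in range(1, max_runs + 1):
        code = run_load()
        if code == 0:                      # assumption: 0 means success
            return attempt
        if code != EXIT_BAD_VIS_FOUND:     # any other failure is not retried
            raise RuntimeError(f"task load failed with exit code {code}")
    raise RuntimeError(f"bad visibilities still found after {max_runs} runs")
```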
After splitting the data into scans, PIMA reads and parses phase calibration, system temperature, weather information, the interferometric model, and interferometric model components. Any of these parameters may be missing in the FITS-IDI file. In such cases PIMA issues a warning, but proceeds.
If PCAL: NO is specified in the control file, PIMA will skip the phase calibration information present in the FITS-IDI file(s). Keep in mind that if PCAL: NO was specified during loading, PIMA cannot re-enable phase calibration later without running task load again. If phase calibration was loaded, it can be disabled for the entire experiment or for the specified station(s) and re-enabled again. If phase calibration is not available for some scans at some stations, such observations are flagged as bad and are skipped for fringe fitting and other operations, unless phase calibration is disabled for the entire experiment by specifying PCAL: NO or by disabling pcal at both stations of the baseline of that observation. It should be noted that bandpass and fringe results differ depending on whether phase calibration was used or not. Therefore, if the phase calibration status was changed, the bandpass should be re-generated and fringe fitting re-done.
As of 2016.01.01, only the VLBA puts the model and all calibration information into the FITS-IDI data. Lack of calibration information does not prevent PIMA from running fringe fitting, but may prevent further tasks. For instance, task splt cannot run if no Tsys and/or antenna gains are available. Task mkdb cannot run if the interferometric model is not available. The VLBA hardware correlator and DiFX version 2.0 and newer put phase calibration information into FITS-IDI. Other correlators do not do it. Missing weather information and Tsys can be loaded by PIMA task gean using the results of parsing log files. Missing antenna gains can be loaded by PIMA task gean from external gain files. A missing interferometric model for the VERA, SFXC, and KJCC correlators can be loaded by task moim from external model files in the native format that was used by the correlator.
 
 
 
  
  PIMA can directly import log files in VLBA format
or in the PIMA ANTAB format. Non-VLBA logs are parsed
by program log_antab and transformed to PIMA ANTAB
format. Program log_antab extracts system temperature, if present, nominal
on-off time tags, cable calibration, and meteorological information. Modern
VLBI analysis does not use in-situ meteorological information and uses
the output of numerical weather models instead. The current version
of PIMA does not use nominal on-off time tags from log
files, since this information is already used by the correlator.
PIMA can compute from the data the actual time tags
when the antenna was on source. The use of cable calibration in analysis is
at the analyst's discretion: it rarely improves the fit and sometimes
significantly degrades it. But the use of system temperature is critical for
imaging. When PIMA produces the calibrated averaged visibilities,
it discards observations without system temperature.
 
  Syntax of the program for parsing log-files:
 
     
     
     
     
     
 
NB: An analyst should always examine the output of the log_antab program.
Typical failure: log_to_antab fails to determine sky frequencies. Possible
reasons: a log file may have a portion at the beginning or at the end that
is related to another experiment, or the log format has recently changed. In
the first case, editing the log file solves the problem. In the latter case
you need to patch log_to_antab. Please try not to break its ability to parse
other log files. If everything else fails, you can either develop your own
parser or parse the log file by hand. Keep in mind that some stations do not
record Tsys at all.
 
  Wrapper pf.py supports log parsing. The following 
command does this: 
 
  Although the visibility data from the old hardware VLBA correlator has all
calibration information, it is recommended to re-load it since in some cases
the calibration information is not correct and re-loading fixes the problem.
There is no need to reload calibration information to FITS-IDI files generated
in Socorro by DiFX 2.0 and newer.
 
  Field System log files contain a) on-off start/stop scan time;
b) meteorological information; c) cable calibration; d) frequency table;
e) system temperature.  VLBA log files contain phase calibration phases
and amplitudes in addition to that.
 
  PIMA task gean inserts the calibration tables
into PIMA internal data structures. Task
gean requires a qualifier that is followed by a value.
The following qualifiers are supported:
 
     
            Comment 1: After adopting DBBC in 2013, the NRAO changed
            the data processing chain. It still provides a legacy calibration
            file, but that legacy calibration is inadequate for processing
            DBBC data.  You should not load legacy calibration into
            PIMA when processing DBBC NRAO data.
             
            Comment 2: Although FITS-IDI from analogue NRAO observations
            prior to 2013 contains phase calibration and system temperature,
            it is desirable to
            re-load calibration into PIMA using task
            gean. Instances were found in which the calibration
            information in FITS-IDI supplied by the NRAO was
            incorrect.
             
            
            Comment 3: PIMA issues warnings about missing
            phase-cal and Tsys. If PIMA uses phase
            calibration and phase calibration is missing for a given observation,
            PIMA will declare that observation as "bad"
            and will not perform fringe fitting. However,
            PIMA will process such an observation if
            PCAL: NO is specified in
            the control file. As of 2016.01.17 PIMA task
            splt will bypass observations with missing
            system temperature.
             
     
            Comment: Although FITS-IDI generated by NRAO at Socorro
            contains antenna gains, it is desirable to re-load calibration
            into PIMA using task
            gean. Instances were found in which the old calibration
            information in FITS-IDI supplied by the NRAO was
            incorrect.
             
     
     
     
     
     
     
 
 
     
     
          
          
            
          
          
     
     
         What to look for? First, look at the phase cal phases. The phase cal
         scatter with respect to a smoothed curve should not be
         excessive (more than 0.3 rad). Sometimes phase calibration
         may vary significantly with time. A plot of "phase cal relative f0"
         (R), i.e. differences of phase calibration phases with respect
         to the phase at the lowest frequency, is helpful in this
         situation. Another useful statistic is "phase cal amplitude"
         (M). There are several factors that cause variation of phase-cal
         amplitudes. Phase calibration amplitude is proportional to
         T_sys, which depends on elevation and may depend on time.
         The second factor is the presence of spurious narrow-band signal(s)
         generated by the hardware. Such a signal distorts the phase and
         amplitude of the phase-calibration signal. If front-filters
         are not tuned well, the phase calibration signal at the image
         sub-band may distort the phase and amplitude of the phase cal signal
         at the primary sub-band. A plot of "phase amp versus phase" (V)
         helps to reveal the presence of spurious signals. There is
         no dependence of phase calibration amplitude on phase if
         the hardware is perfect. A sinusoidal pattern indicates the
         presence of spurious signals. Spurious signals with an amplitude
         less than 10% of the average amplitude are usually harmless,
         while the use of phase calibration with spurious signals with
         an amplitude of 50% may significantly degrade results.
          
     
         An analyst should decide whether to keep phase
         calibration for a given station or not. PIMA
         allows disabling phase calibration for any given station or for
         all stations. In order to disable phase calibration for all the
         stations, PCAL: NO should
         be specified in the control file. Task gean
         allows disabling phase calibration for a given station.
         It requires qualifier pcal_off, which needs a value: the station
         name. If task gean is run with qualifier
         pcal_on, the phase calibration for a given
         station will be enabled. NB: if you loaded the experiment with
         PCAL: NO,
         PIMA does not read phase calibration, and
         therefore task gean cannot enable it.
         You need to load the experiment again in order to enable phase
         calibration.
          
         In addition, PIMA provides a fine-grained
         mechanism for toggling whether pcal is used or not for
         given stations. Keyword PCAL supports
         a qualifier that provides a station list.
          
         Syntax:
          
         PCAL:  value[:action:station[:station]...]
          
         Either : (colon) or , (comma) is allowed as a separator between stations.
          
         Action is either TO_USE or NOT_TO_USE. The action is
         case insensitive. If the action is TO_USE, then pcal only from
         the stations from the list will be used. If the action is NOT_TO_USE,
         then pcal from the stations on the list will not be used.
         Example:
          
 
 
         Fine-grained pcal station selection can change phase calibration
         use status only if pcal is available and was not turned off
         using task gean.
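The fine-grained PCAL syntax can be illustrated with a short Python sketch that splits the value field into its parts. The function names parse_pcal and pcal_selected are hypothetical, and the code is not PIMA's parser; it only follows the rules stated above (case-insensitive action, colon or comma between stations). The value itself is passed through unchanged.

```python
import re

def parse_pcal(value_field):
    """Split 'value[:action:station[:station]...]' into (value, action, stations)."""
    value, sep, rest = value_field.strip().partition(":")
    if not sep:
        return value, None, []            # no station selection given
    action, _, station_field = rest.partition(":")
    action = action.upper()               # the action is case insensitive
    if action not in ("TO_USE", "NOT_TO_USE"):
        raise ValueError(f"unknown action: {action!r}")
    # Stations may be separated by ':' or ','.
    stations = [s for s in re.split(r"[:,]", station_field) if s]
    if not stations:
        raise ValueError("an action requires at least one station")
    return value, action, stations

def pcal_selected(station, action, stations):
    """Return True if the station list allows pcal to be used for this station."""
    if action is None:
        return True
    listed = station in stations
    return listed if action == "TO_USE" else not listed
```

Note that such a selection can only restrict the use of pcal that is available; it cannot bring back pcal that was not loaded or was turned off by task gean.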
          
     
      
         What to look for? If PIMA shows no Tsys, that means
         it was not loaded. You need to check the log file and, if possible, fix it.
         Then you need to re-run task gean. Sometimes Tsys
         is so noisy or wrong that keeping such a station will degrade
         reconstructed source images. In that case bad Tsys in certain IFs can
         be disabled by editing the so-called gain correction file specified by the
         keyword GAIN_CORRECTION_FILE in the control file.
         PIMA task load creates and
         initializes it if the file specified by that keyword does
         not exist. The gain correction file specifies, for each station and
         each IF, a factor by which task splt multiplies Tsys.
         PIMA does not modify the file if it exists.
         An alternative way to initialize the gain correction file is to run task
         gaco with qualifier init.
         That qualifier requires a value, either fill or overwrite. Value
         fill instructs PIMA to add
         missing records: if for some
         IFs and some stations the gain correction was not defined,
         PIMA will add a record with correction equal to 1
         (i.e. no correction). If the qualifier init has value overwrite,
         PIMA will overwrite previous definitions with 1.
         If the gain correction for a given IF and a given station is 0.0, then
         PIMA task splt will set the
         amplitude to zero and such an IF will not be used for imaging.
          
     
        Usage: pt.py [-pt options] EEE B obs [pima_opts]
         
        where EEE is the lower case experiment name, B is the lower case
        band name, and obs is the observation index. These mandatory
        arguments may be followed by additional arguments of the command line
        for PIMA that are in the usual format keyword:
        value. The wrapper itself supports options --dry-run (-r) and
        --verbosity (-v). Option --dry-run just shows the
        PIMA command line without executing it. Option
        --verbosity requires a value. Value 0 means no
        informational messages, value 1 (default) means moderate
        verbosity, and values 2 and 3 produce
        more and more verbose output.
         
        If the PIMA control file defines 
        BANDPASS_FILE, and/or 
        BANDPASS_MASK_FILE, and/or 
        PCAL_MASK_FILE, and/or 
        POLARCAL_FILE that do not exist, wrapper 
        pt.py replaces them with 
        NO and issues a warning.
         
        pt.py displays two fringe plots: versus frequency 
        (and averaged over
        time) and versus time (and averaged over frequency). If a source is
        weak, the plot may look too noisy. Keyword 
        FRIB.1D_FRQ_MSEG averages
        the data over frequency after performing fringe fit and before
        plot preparation. The value of the keyword specifies how many spectral
        channels are coherently averaged out. This parameter should not exceed
        the total number of spectral channels in an IF. Analogously,
        FRIB.1D_TIM_MSEG averages the data over time after 
        performing fringe fit and before plot preparation. The value of the 
        keyword specifies how many accumulation periods are coherently 
        averaged out. 
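Coherent averaging means that complex visibilities, not their amplitudes, are averaged, so samples with scattered phases partially cancel. A minimal Python sketch of averaging over consecutive segments of M samples follows. The function name average_segments is hypothetical, and the treatment of an incomplete last segment (it is averaged over the samples that remain) and the absence of weights are assumptions of this sketch, not a description of PIMA internals.

```python
def average_segments(samples, mseg):
    """Coherently average consecutive groups of mseg complex samples."""
    if mseg < 1:
        raise ValueError("mseg must be a positive integer")
    averaged = []
    for start in range(0, len(samples), mseg):
        chunk = samples[start:start + mseg]
        # Complex mean: amplitudes add up only when the phases agree.
        averaged.append(sum(chunk) / len(chunk))
    return averaged
```

Applied along the frequency axis, this mimics the effect of FRIB.1D_FRQ_MSEG on the plotted spectrum; applied along the time axis, that of FRIB.1D_TIM_MSEG.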
                  
        
        What to look for? First, whether the source is detected. As a rule of
        thumb, SNR > 7.0 indicates a reliable detection,
        SNR in a range of [6.0, 7.0] is a marginal detection, SNR in a range
        [5.1, 6.0] is unlikely to be a detection, and SNR < 5.1 usually is
        a non-detection. If the observation you picked is a non-detection,
        try another. If all observations are non-detections, that is
        bad luck: you can stop the analysis at this point. Nothing can be done.
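The rule of thumb above can be written down as a small helper. The function name classify_snr and the handling of the exact boundary values (6.0 is counted as marginal, 5.1 as unlikely) are choices made for this sketch; the thresholds themselves come from the text.

```python
def classify_snr(snr):
    """Classify a fringe fit result by its signal-to-noise ratio."""
    if snr > 7.0:
        return "reliable detection"
    if snr >= 6.0:
        return "marginal detection"
    if snr >= 5.1:
        return "unlikely detection"
    return "non-detection"
```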
                  
        Since no bandpass calibration is applied at this point, the phases
        are not aligned. However, the residual fringe phases should follow
        a more or less smooth line for a high SNR observation. Jumps or
        low amplitudes at some IFs raise a concern. Phase behavior at
        individual IFs can be examined by running pt.py
        and specifying the IF under consideration with keywords
        BEG_FRQ and END_FRQ.
          
    
  To run coarse fringe fitting, task  frib is used. 
PIMA wrapper pf.py simplifies 
running coarse fringe fitting. Syntax:
 
 
  In a case when phase calibration is applied, the bandpass is computed with
respect to the phase calibration. If you use all tones of phase calibration,
you should first clean them and mask out the tones affected by internal 
radio interference (spurious signals).
  
   It is assumed that the residual bandpass is stable with time. An experiment may
have jumps in the bandpass due to a power-off/power-on cycle
of the VLBI hardware at one or more stations. Unfortunately, as of 2016.02.01
PIMA does not provide a convenient way of processing
such data. The workaround is to effectively split the dataset into two subsets,
before and after the jump, and compute two bandpasses. Splitting the dataset
can be done using keywords OBS,
INCLUDE_OBS_FILE, and
EXCLUDE_OBS_FILE. Fortunately, jumps in bandpass
occur in fewer than 5% of VLBI experiments.
 
   An analyst usually sets the mask for cross-spectrum data during this step.
The mask is an array of 1 and 0 that depends on the spectral channel index.
PIMA
multiplies the visibilities by the mask when it processes the data. Zeroes in the
mask effectively replace the spectral channels with zeroes. Usually an unwanted
portion of the spectrum is masked out: one affected either by RFI or by
hardware bandpasses. If a signal is narrow-band, for instance from a stellar
maser, then masking allows discarding spectral channels that have no signal,
but only noise.
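As an illustration of what multiplying by the mask means, here is a minimal Python sketch. It assumes that the visibilities of one IF are stored as a list of accumulation periods, each holding one complex value per spectral channel; this layout and the function name apply_channel_mask are assumptions of the sketch.

```python
def apply_channel_mask(vis, mask):
    """Multiply every accumulation period by a per-channel mask of 0 and 1.

    vis[i_time][i_chan] is a complex visibility; mask[i_chan] is 0 or 1.
    """
    for row in vis:
        if len(row) != len(mask):
            raise ValueError("mask length must equal the number of channels")
    # Masked channels become zero and no longer contribute signal or noise.
    return [[v * m for v, m in zip(row, mask)] for row in vis]
```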
 
  Usually, the phase calibration signal is a rail of very narrow-band signals
with a frequency separation of 1 MHz, which is less than the spectral resolution
of visibility data. PIMA interpolates the spectrum of
phase calibration within each IF. In the case when only one tone per IF
is used, PIMA considers the phase spectrum of the
calibration signal to be flat, i.e. it assigns the phase calibration phase
to all spectral channels. In the case when all phase calibration tones are
used, PIMA unwraps phases and performs linear
interpolation or extrapolation. The presence of spurious signals
distorts calibration phases. If they affect a small fraction of all
tones, the tones affected by spurious signals can be masked out.
Then PIMA will automatically interpolate between
tones that are not affected.
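The idea of unwrapping tone phases and interpolating them to the frequencies of the spectral channels can be sketched as follows. This is one possible reading of the description above, not PIMA's algorithm: it uses piecewise-linear interpolation between neighbouring unmasked tones, extrapolates linearly from the two nearest tones at the edges, and assumes that adjacent unmasked tones differ in phase by less than half a turn. The function names are hypothetical.

```python
import math
from bisect import bisect_right

def unwrap(phases):
    """Remove 2*pi jumps between consecutive phases given in radians."""
    out = [phases[0]]
    for p in phases[1:]:
        d = p - out[-1]
        d -= 2.0 * math.pi * round(d / (2.0 * math.pi))
        out.append(out[-1] + d)
    return out

def pcal_phase_at(freq_tone, phase_tone, mask_tone, freq_chan):
    """Interpolate pcal phases to channel frequencies using unmasked tones only."""
    kept = [(f, p) for f, p, m in zip(freq_tone, phase_tone, mask_tone) if m]
    if not kept:
        raise ValueError("all tones are masked out")
    f = [k[0] for k in kept]
    p = unwrap([k[1] for k in kept])
    if len(f) == 1:
        return [p[0]] * len(freq_chan)     # one tone: flat phase spectrum
    result = []
    for fc in freq_chan:
        # Segment index, clamped so that the edges are extrapolated linearly.
        i = min(max(bisect_right(f, fc) - 1, 0), len(f) - 2)
        slope = (p[i + 1] - p[i]) / (f[i + 1] - f[i])
        result.append(p[i] + slope * (fc - f[i]))
    return result
```

Masking a tone simply removes it from the list of nodes, so the phase at its frequency is taken from the line that connects its unmasked neighbours.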
 
  Task mppl shows phase and amplitudes of phase 
calibration signal with multiple tones per IF. It may be useful to use 
keyword OBS to control which observations to use 
for generating plots. 
 
  Task mppl shows several types of plots. Raw phase
usually is not informative, since the calibration signal may have many phase
turns per IF. Let PIMA unwrap the phase for you.
The plot of unwrapped phases (U) shows the spectrum: unwrapped phases (green)
and modeled phases (blue). The bandpass is supposed to be smooth. Jumps in
phases are due to spurious signals. The sum of the phase calibration tone
and the spurious signal depends on the phase of the phase calibration tone
itself. Therefore, if in a given plot of a given observation you see a phase
that does not strongly deviate from the smoothed curve, that does not
necessarily mean that other observations will not be affected, even if the
source of spurious signals does not depend on time. Mode (C) displays both
unwrapped phase and amplitude in the same plot. Since spurious signals affect
both the phase and the amplitude of the calibration signal, this plot helps to
identify frequencies affected by spurious signals.
 
  OK, you found a peak in the plot that you believe is due to a spurious
signal. What next? PIMA supports a so-called phase
calibration mask file specified by the keyword
PCAL_MASK_FILE. This file defines the value, 0 or 1,
for each phase calibration tone. If the value is zero, then that
calibration tone is masked out and not used for computation of the
smoothed curve that interpolates the phase calibration signal across
an IF. One may edit this file manually, but in general, that is too tedious.
PIMA supports so-called mask definition files that
allow specifying which calibration tones to suppress in a concise way.
They allow defining ranges of the tones that are to be masked out.
An analyst edits the calibration mask definition file, converts it to
the phase calibration mask, visualizes the phase calibration phases
and/or amplitudes, and repeats this procedure till a satisfactory result
is produced.
 
  A phase calibration mask definition file consists of records of variable
length. The first record identifies the format. It should always be
# PIMA PCAL_MASK_GEN  v  1.00 2015.05.10
Lines that start with '#', except the first one, are considered comments,
and the parser ignores them. Mask definition records consist of 8 words
separated by one or more blanks:
PCAL  STA: ssssssss  IND_FRQ: aa-bb  IND_TONE: xx-yy  OFF
 
  where ssssssss is the station name, aa is the index of
the first IF of the range, bb is the index of the last IF of the range,
xx is the index of the first tone in a given IF range and
yy is the index of the last tone in a given IF range.
Here is an example:
 
  PIMA does not accept the mask definition file directly.
Task pmge transforms the phase calibration mask definition
file into a phase calibration mask file. That task requires qualifier
mask_gen, whose value is the name of the mask definition file. The name of
the output file is defined by keyword PCAL_MASK_FILE in
the PIMA control file. Example:
 
     
 
 
  The cross-correlation spectrum can be distorted by external RFI, such as satellite
radio. Hardware problems or errors in the hardware setup may cause a significant
drop in the amplitude or a total loss of signal either at an entire IF, or
a range of IFs, or a portion of an IF. If there is no signal in given spectral
channels, the SNR will be reduced and weak sources may not be detected. Masking out
unwanted noise improves the SNR. The cross-correlation spectrum from narrow-band
targets, such as masers or satellites, may not have signal beyond the edges of
the bandwidth even in the absence of hardware failures.
 
   PIMA supports a mask file specified by the keyword
BANDPASS_MASK_FILE. This
file defines four sets of values, 0 or 1, for each spectral
channel. PIMA multiplies visibilities by the mask,
which effectively disables the spectral channels that correspond to a mask
value of 0. The first mask affects autocorrelations, and the three remaining masks
affect cross-correlations. The second mask is used only for computation
of the bandpass, the third mask is used for fringe fitting, and the fourth mask is
used by task splt for computing visibilities averaged
over frequency and time. Usually the second, the third, and the fourth masks
are the same, i.e. a common mask for cross-correlation is used.
Autocorrelation and cross-correlation masks are usually different, since
different factors make it necessary to mask auto-correlations and
cross-correlations.
 
  If the mask for a given cross-correlation is zero, the corresponding visibility
is replaced with zero. If the mask of a given auto-correlation is zero, the
auto-correlation at that spectral channel is computed by linear interpolation
between adjacent channels, or by linear extrapolation if the masked channel is at
the edge of the IF. The corresponding cross-correlation is not affected. Strong
unmasked spurious signals that result in an autocorrelation greater
than 2–5 times the average level usually affect the cross-correlation as well.
Therefore, it is prudent first to mask out strong spurious signals in the autocorrelation
and then check the cross-correlation spectrum.
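The replacement of masked autocorrelation channels can be illustrated with a short Python sketch. It follows the description above for one IF: a masked channel between unmasked neighbours is interpolated linearly, and a masked channel at the edge is extrapolated linearly from the two nearest unmasked channels. The function name fill_masked_autocorr and the requirement of at least two unmasked channels are assumptions of this sketch.

```python
def fill_masked_autocorr(acf, mask):
    """Return a copy of acf with masked (mask == 0) channels re-computed.

    acf holds autocorrelation amplitudes of one IF, one value per channel.
    """
    good = [i for i, m in enumerate(mask) if m]
    if len(good) < 2:
        raise ValueError("at least two unmasked channels are needed")
    out = list(acf)
    for i, m in enumerate(mask):
        if m:
            continue
        left = [g for g in good if g < i]
        right = [g for g in good if g > i]
        if left and right:
            a, b = left[-1], right[0]      # interpolate between neighbours
        elif right:
            a, b = right[0], right[1]      # extrapolate at the lower edge
        else:
            a, b = left[-2], left[-1]      # extrapolate at the upper edge
        out[i] = acf[a] + (acf[b] - acf[a]) * (i - a) / (b - a)
    return out
```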
 
  An analyst may create the mask file by hand, but this is tedious work.
PIMA supports so-called bandpass mask definition files that allow specifying
the spectral channels that are to be suppressed in a concise way. The bandpass
mask definition file allows defining ranges of the spectral channels that are
to be masked out. An analyst edits the bandpass mask definition file, converts it
to the bandpass mask, visualizes the cross- and auto- phase and amplitude
spectra, and repeats this procedure till a satisfactory result is produced.
 
  A bandpass mask definition file consists of records of variable
length. The first record identifies the format. It should always be
# PIMA BPASS_MASK_GEN  v 0.90 2009.02.05
Lines that start with '#', except the first one, are considered to be comments,
and the parser ignores them. Mask definition records consist of 8 words
separated by one or more blanks:
mmmm  STA: ssssssss  IND_FRQ: aa-bb  IND_CHN: xx-yy  ddd
 
  where mmmm is the mask type, one of AUTC (autocorrelation mask),
BPAS (bandpass mask), FRNG (fringe fitting mask), SPLT (split mask), 
CROS (bandpass+fringe_fitting+split masks), or ALL (all masks: 
autocorrelation+bandpass+fringe_fitting+split);
ssssssss is the station name; aa is the index of 
the first IF of the range and bb is the index of the last IF of the range;
xx is the index of the first spectral channel in a given IF range and 
yy is the index of the last spectral channel in a given IF range;
ddd is the disposition: ON or OFF. The station name may be substituted by 
ALL, which means the definition affects all stations.
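
A minimal sketch of how a record of this form could be split into its fields, which may be handy when checking a hand-edited definition file. It follows only the 8-word template given above; the names `MaskDef`, `parse_range` and `parse_mask_def` are hypothetical, and the real PIMA parser may accept or reject input differently.

```python
from typing import NamedTuple

MASK_TYPES = {"AUTC", "BPAS", "FRNG", "SPLT", "CROS", "ALL"}


class MaskDef(NamedTuple):
    mask_type: str      # mmmm
    station: str        # ssssssss or ALL
    if_range: tuple     # (aa, bb)
    chn_range: tuple    # (xx, yy)
    enabled: bool       # True for ON, False for OFF


def parse_range(word):
    """Split a word of the form 'aa-bb' into a pair of integers."""
    first, last = word.split("-")
    return int(first), int(last)


def parse_mask_def(line):
    """Parse one 8-word record; return None for a comment line."""
    if line.startswith("#"):
        return None
    words = line.split()
    if len(words) != 8:
        raise ValueError(f"expected 8 words, got {len(words)}")
    mask_type, sta_key, station, frq_key, frq, chn_key, chn, disp = words
    if (sta_key, frq_key, chn_key) != ("STA:", "IND_FRQ:", "IND_CHN:"):
        raise ValueError("unexpected field labels")
    if mask_type not in MASK_TYPES:
        raise ValueError(f"unknown mask type {mask_type}")
    if disp not in ("ON", "OFF"):
        raise ValueError(f"unknown disposition {disp}")
    return MaskDef(mask_type, station, parse_range(frq),
                   parse_range(chn), disp == "ON")
```

The record text in the usage below is made up for illustration: `parse_mask_def("AUTC  STA: ALL  IND_FRQ: 1-8  IND_CHN: 30-34  OFF")`.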
 
  The first definition sets the default disposition: ON or OFF. Unless you
have a really pathological experiment and you have to disable the majority of
spectral channels, the first definition is 
ALL STA ON, which enables all spectral channels.
The definitions are processed consecutively. Each new definition alters
the mask defined by the previous definitions.
Here is an example:
 
  Task bpas takes the results of fringe fitting as input. 
In general, it is not required to have fringe fitting results for all 
observations, although it is desirable. PIMA will 
find the n observations with the highest SNR at each baseline with 
the reference station and will compute the complex bandpass using these 
observations. PIMA uses one observation with the 
highest SNR per baseline in INIT or 
INSP mode. It may happen that just the observation with 
the highest SNR is affected by RFI or has other problems. In such a case 
the affected observation can be added to the exclude list. 
 
  In the case of dual-polarization observations two bandpasses are computed:
for the RR polarization and for the difference LL minus RR polarization. The second
bandpass is called the polarization bandpass. The main bandpass is RR for 
single-polarization RR data, LL for single-polarization LL data, and
RR for dual-polarization data. In order to compute both the main RR bandpass 
and the polarization bandpass, task bpas should be 
executed with POLAR: I.
 
  Wrapper pf.py simplifies generation of the bandpass. 
It checks exclusion file VVVVV/EEE/EEE_B_bpas_obs.exc. If it does 
not find such a file, the wrapper creates an empty file with a comment line. 
It supports option -insp that overrides value of 
BPS.MODE.
 
  When computing the bandpass, PIMA re-runs fringe fitting 
for the selected observation. Blue lines in the amplitude and phase plots show 
the residual amplitudes and phases from that fringe fitting. Then a smooth 
curve is fitted to the residuals. PIMA supports two 
algorithms: fitting with a smoothing spline of the 3rd degree over n 
equidistant knots or with a Legendre polynomial of degree b. 
The smoothing method should be the same for phase and amplitude, but the 
degree of the Legendre polynomial or the number of knots for the spline can 
be different for amplitude and phase.
 
 
  The smoothing mode is defined by keyword BPS.INTRP_METHOD. 
It can take values SPLINE, LEGENDRE, 
or LINEAR. The initial phase bandpass is the smoothed 
model bandpass with the opposite sign. The phase bandpass is normalized to have 
zero mean phase over the band. PIMA will 
unwrap the phase if the phase bandpass exceeds ±π/2. The amplitude bandpass is 
normalized to unity. If BPS.NORML: IF, then the amplitude 
bandpass is normalized over each IF separately. If BPS.NORML: BAND,
then the amplitude bandpass is normalized over the band. Which mode to use?
That depends on the hardware. If system temperature is measured for the entire band, 
then normalization should be made over the band. If system temperature is measured
at each IF (or pair of IFs) individually, then normalization over the IF should
be used. Autocorrelation is always normalized over the IF.
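
The difference between the two normalization modes can be illustrated with a short sketch. It assumes that "normalized to unity" means dividing by the arithmetic mean amplitude, which the text does not spell out; the function name and the data layout (one list of channel amplitudes per IF) are hypothetical.

```python
def normalize_amplitude(ampl_by_if, mode):
    """Normalize amplitudes to unit mean over each IF or over the band.

    ampl_by_if: list of lists, channel amplitudes of each IF.
    mode: "IF" or "BAND", mirroring the values of BPS.NORML.
    """
    if mode == "IF":
        # Each IF is divided by its own mean amplitude.
        return [[a / (sum(chans) / len(chans)) for a in chans]
                for chans in ampl_by_if]
    if mode == "BAND":
        # All IFs are divided by the common mean over the band.
        flat = [a for chans in ampl_by_if for a in chans]
        mean = sum(flat) / len(flat)
        return [[a / mean for a in chans] for chans in ampl_by_if]
    raise ValueError(f"unsupported mode {mode}")
```

With `mode="IF"` any difference in level between IFs is removed, while `mode="BAND"` keeps the relative levels of the IFs and only fixes the overall scale.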
 
  PIMA task bpas in the inspection 
mode displays two plots per baseline:
amplitude bandpass and phase bandpass. Reading these plots is an important skill.
An analyst should identify spikes in autocorrelation amplitudes and mask them
out. To identify a spike in autocorrelation, first set color index 3 by hitting 
C and 3, then move the cursor close to the spike and hit 
the LeftMouse button. Dump file SSSSS/pima/EEE.frq helps to identify 
the IF and channel indices. Using this information, an analyst adds a record to the 
mask definition file. There is no firm rule for when a spike should be masked out.
Usually spikes with amplitudes greater than 2 should be masked out, and those 
with amplitudes in the range [0.8, 1.2] are kept. After all spikes in 
autocorrelation are masked out, a mask should be generated from the mask 
definition file using task bmge. Task bmge requires qualifier mask_gen, whose 
value is the name of the mask definition file. After that, inspection of autocorrelation 
amplitudes should be repeated. NB: Amplitude of masked 
autocorrelation is set to zero. If necessary, the mask definition file should be 
edited, the mask re-generated, and the procedure repeated.
 
  After cleaning autocorrelation spectrum, cross correlation should be cleaned.
Several situations are rather common:
 
    
    
    
 
  Task bpas supports keyword 
BPS.AMP_MIN that defines the threshold on the fringe 
amplitude as a fraction of the mean amplitude over the IF. Spectral channels with 
amplitude below BPS.AMP_MIN are ignored 
by task bpas. The bandpass for these channels is obtained 
by extrapolation from the channels that are in use. This may be useful
if the bandpass changes strongly at channels with low amplitude,
since in that case smoothing with a Legendre polynomial or a spline may
not be robust.
 
   Phase may become too noisy in experiments with high spectral resolution.
In that case the residuals should be coherently averaged. The number 
of spectral channels averaged within an individual segment is defined
by keyword BPS.MSEG_ACCUM. Value 1 stands for no averaging. A balance 
between averaging and spectral resolution should be maintained. On
the one hand, strong averaging (a high value of BPS.MSEG_ACCUM)
results in less random noise in phase. On the other hand, averaging reduces 
the spectral resolution and our ability to model the system response as a function
of frequency. Scatter greater than 1–2 rad may result in a failure to 
resolve the phase ambiguity across an IF, which will lead to a wrong result. 
A general guideline is to select such averaging that the scatter of residual 
phases with respect to a smooth line is in the range of 0.05–0.2 rad for 
the sources used for bandpass computation.
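
Coherent averaging means that the complex residuals, not their phases, are averaged within each segment. A minimal sketch under that reading is given below; the function name is hypothetical, and the handling of a short last segment (it is averaged over the channels it has) is an assumption.

```python
def coherent_average(vis, mseg):
    """Average complex residuals over consecutive segments of mseg channels.

    mseg = 1 returns the input unchanged, i.e. no averaging.
    """
    if mseg < 1:
        raise ValueError("mseg must be a positive integer")
    averaged = []
    for start in range(0, len(vis), mseg):
        segment = vis[start:start + mseg]
        averaged.append(sum(segment) / len(segment))
    return averaged
```

Averaging 64 channels with `mseg=4` leaves 16 points per IF, which is the trade-off between noise and spectral resolution discussed above.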
 
  Interpolation of the bandpass is important. In the simplest form the bandpass
is just the reciprocal of the normalized complex cross-correlation function of the
observation with the highest SNR at a given baseline. If the SNR of that 
observation were infinitely high, this would be the optimal approach.
However, when we apply a bandpass computed by inversion of the 
normalized complex cross-correlation function of an observation with
finite SNR, we propagate the noise of that observation to other observations,
which results in an increase of the total noise. To alleviate undesirable
noise propagation, we 1) use more than one observation for computing
the bandpass; 2) coherently average n adjacent visibilities; 
3) smooth the bandpass with a polynomial or splines. 
 
  The choice of the amount of coherent averaging and the degree of 
the polynomial or the number of spline knots depends both on the SNR
and the number of spectral channels in the IF. The higher the SNR of the
observations used for bandpass computation, the better, although the 
quantitative measure of whether a given SNR is high enough is rather
subjective. Observations with SNR 200–2000 allow computing
a rather reliable bandpass with 7–20 spline knots. 
(In general, Legendre polynomials of degree higher than 
5–7 are undesirable since they are prone to end up with a bandpass
of wiggling shape like a dinosaur's spine.)
 
A wise principal investigator will insert enough strong
calibrators to make computation of the bandpass robust, but sometimes
either the PI did not plan well, or observations of strong calibrators
failed, or they were not possible, for instance for space VLBI.
In that case we have to compute the bandpass using weak calibrators.
When observations with SNR in the range 30–100 are used, care must
be taken to generate a robust bandpass. First, it should be checked 
that the scatter of the residual phases of the observations used for 
bandpass computation is smaller than π/3–π/6 and that phase
ambiguities can be reliably resolved. An error in ambiguity 
resolution will strongly poison a bandpass: such 
a bandpass, when applied, will degrade the SNR, not improve it. 
To make phase ambiguity resolution more reliable, the residual cross-correlation
is coherently averaged (keywords BPS.MSEG_ACCUM and
BPS.MSEG_FINE). Second, the degree of the smoothing
polynomial or the number of spline knots is reduced.
 
  Finally, when the bandpass has to be computed using very weak 
observations with SNR 7–20, BPS.INTRP_METHOD:
LINEAR is used as the last resort. 
This mode requires BPS.MSEG_ACCUM and 
BPS.MSEG_FINE to be equal to half of the number of 
spectral channels, together with 
BPS.DEG_AMP: 0 and 
BPS.DEG_PHS: 1.
In this mode the cross-spectrum is coherently averaged to two points per IF.
The amplitude bandpass is computed as the IF-averaged level. The phase
bandpass is computed as a linear function over two points. This mode 
is the most robust, but it does not model the finer shape
of the bandpass. This omission is not essential for low-SNR observations.
According to the reciprocity principle, omission in the data reduction 
of a quantity that cannot be reliably determined from observations cannot
significantly degrade the goodness of the fit. On the other hand, using
BPS.INTRP_METHOD: LINEAR for
experiments with higher SNR observations may give a worse result than
BPS.INTRP_METHOD: LEGENDRE 
or SPLINE. The natural lower limit on the SNR of observations used
for bandpass generation in BPS.INTRP_METHOD: 
LINEAR mode is the detection limit. Inadvertent
inclusion of a non-detection in the list of observations used for bandpass
generation will significantly degrade the bandpass. It is better to use no bandpass
than a wrong bandpass.
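
The idea behind the LINEAR mode, two coherently averaged points per IF, a constant amplitude and a straight-line phase, can be sketched for a single IF as follows. This is an illustration only: the function name is hypothetical, the amplitude level is taken here as the mean of the amplitudes of the two averaged points, and the points are placed at the centres of the two halves of the IF, both of which are assumptions.

```python
import cmath


def linear_mode_model(resid):
    """Model one IF from its complex residual cross-spectrum.

    Returns the amplitude level and the list of model phases (rad),
    one per spectral channel.
    """
    n = len(resid)
    if n < 2 or n % 2:
        raise ValueError("expected an even number of channels, at least 2")
    half = n // 2
    lo = sum(resid[:half]) / half      # coherent average of the lower half
    hi = sum(resid[half:]) / half      # coherent average of the upper half
    ampl = (abs(lo) + abs(hi)) / 2.0   # constant amplitude (degree 0)
    x_lo = (half - 1) / 2.0            # channel coordinate of the first point
    x_hi = half + (half - 1) / 2.0     # channel coordinate of the second point
    # Phase difference is taken from the product to avoid a 2*pi wrap.
    slope = cmath.phase(hi * lo.conjugate()) / (x_hi - x_lo)
    phase = [cmath.phase(lo) + slope * (k - x_lo) for k in range(n)]
    return ampl, phase
```

For a spectrum with constant amplitude and a small linear phase slope, the sketch recovers the slope, which is all this mode is meant to model.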
 
  When the analyst is satisfied with the plots that task bpas 
generates in the inspection mode, the analyst runs the bpas 
task in the non-interactive mode to generate the final bandpass. There are 
three modes for computation of the bandpass: INIT, 
ACCUM, and FINE. 
PIMA computes the residual visibilities averaged over 
time with the parameters of fringe fitting applied. In the INIT 
mode PIMA picks for each baseline the observation 
with the highest SNR among the observations with the reference station that 
pass the filters set by the OBS, INCLUDE_OBS, and 
EXCLUDE_OBS keywords. It normalizes 
the amplitude to have a mean value of unity either over the IF or over 
the band, depending on the BPS.NORML keyword. The residual 
phase is normalized to have zero mean value and zero mean rate over the bandwidth 
regardless of the value of BPS.NORML. 
PIMA smooths the residual phases and residual amplitudes 
and then inverts them: it flips the sign of the phase bandpass, replaces the residual
amplitude with its reciprocal for each spectral channel, and 
combines them to form an array of complex numbers. These quantities are called 
the initial complex bandpass. 
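
The inversion step, flipping the sign of the phase and taking the reciprocal of the amplitude, can be written in a few lines. The sketch assumes the smoothed, normalized residual amplitudes and phases are already available as plain lists; the function name is hypothetical.

```python
import cmath


def invert_to_bandpass(ampl, phase):
    """Build a complex bandpass from smoothed residual amplitudes and
    phases (rad): reciprocal amplitude, phase with the opposite sign."""
    return [cmath.rect(1.0 / a, -p) for a, p in zip(ampl, phase)]
```

Multiplying a residual visibility by the bandpass built from it gives 1 + 0j in every channel, which is a convenient sanity check.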
 
  The initial bandpass is computed using only one observation per baseline.
If an analyst selected BPS.MODE: INIT,
the task bpas stops here. If a user selected 
ACCUM or FINE mode, 
PIMA selects N observations per baseline with the 
highest SNR in addition to the one used in the INIT
mode. It applies the initial bandpass, determines residual phases and
amplitudes, and averages them. It reverses the sign of the residual phases,
replaces the normalized residual amplitudes with their reciprocals, combines them into an array of 
complex numbers, and multiplies it by the initial bandpass. The result
is called the accumulated bandpass. The number of observations per baseline
used for computation of the accumulated bandpass is controlled by
two parameters: BPS.NOBS_ACCUM and 
BPS.SNR_MIN_ACCUM. PIMA will 
select up to BPS.NOBS_ACCUM observations for each 
baseline with the highest SNR, not counting the observation used for 
computation of the initial bandpass, provided their SNR is 
BPS.SNR_MIN_ACCUM or above. The advantage of the 
accumulated bandpass is that it is an unweighted average over N observations, 
and therefore it accounts to some degree for bandpass variation. If a user 
selected ACCUM mode, PIMA task 
bpas stops here.
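
The selection rule for one baseline can be sketched as below. The function name and the dictionary of SNR values keyed by observation index are hypothetical; the two numeric arguments mirror BPS.NOBS_ACCUM and BPS.SNR_MIN_ACCUM.

```python
def select_accum_obs(snr_by_obs, nobs_accum, snr_min_accum):
    """Pick observations of one baseline for the accumulated bandpass.

    Returns the observation used for the initial bandpass (highest SNR)
    and up to nobs_accum further observations with SNR >= snr_min_accum,
    in order of decreasing SNR.
    """
    if not snr_by_obs:
        raise ValueError("no observations at this baseline")
    ranked = sorted(snr_by_obs, key=snr_by_obs.get, reverse=True)
    init_obs, rest = ranked[0], ranked[1:]
    accum = [obs for obs in rest if snr_by_obs[obs] >= snr_min_accum]
    return init_obs, accum[:nobs_accum]
```

For instance, with SNRs `{11: 250.0, 12: 90.0, 13: 400.0, 14: 30.0, 15: 120.0}`, two requested observations and a minimum SNR of 50, observation 13 serves the initial bandpass and observations 11 and 15 are accumulated.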
 
  If a user chooses the FINE bandpass computation mode, 
PIMA selects K observations per baseline with the 
highest SNR in addition to the one used in the INIT mode. 
It applies the accumulated bandpass and forms the system of linear equations 
for adjustments to the parameters of the model for the phase bandpass 
and the logarithm of the amplitude bandpass, i.e. the coefficients of either 
a Legendre polynomial or a B-spline. It solves the system using weighted
least squares with weights proportional to fringe amplitude. Then it
computes residual phases and amplitude corrections, computes their
statistics, and if the statistics exceed the specified threshold, it
discards the observation with the greatest residual and repeats the 
computation until either the statistics become lower than the 
threshold or the number of rejected observations per baseline reaches
the specified limit. 
 
   Keyword BPS.SNR_MIN_FINE specifies the minimum SNR. 
Observations with SNR below that limit are not used by PIMA 
for bandpass computation in the FINE mode. Keyword 
BPS.NOBS_FINE specifies the number of observations
with the highest SNR per baseline that PIMA selects 
for bandpass computation in the FINE mode, provided their SNR 
is no less than BPS.SNR_MIN_FINE. If a given observation has 
residual statistics above the threshold, PIMA will discard 
it provided the number of remaining used observations is no less than 
BPS.MINOBS_FINE. This mechanism prevents rejection of 
too many observations.
 
  After performing the first iteration of LSQ adjustment, PIMA 
computes for each IF weighted rms of residual phases and normalized residual 
amplitudes. Then PIMA finds an observation with maximum 
phase and maximum amplitude residual. If the rms of phase residual exceeds 
BPS.PHAS_REJECT radians, that observation is marked for 
rejection. If the rms of normalized amplitude residuals exceeds 
BPS.AMPL_REJECT, that observation is marked for rejection. 
If the number of used observations still exceeds 
BPS.MINOBS_FINE, the observation is rejected, and the next 
iteration runs. PIMA maintains two counters of used 
observations: for the phase bandpass and for the amplitude bandpass. An observation 
may be rejected for amplitude bandpass but kept for phase bandpass, or vice versa,
or rejected for both amplitude and phase bandpasses.
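
The bookkeeping of this rejection loop, for one kind of bandpass (phase or amplitude) at one baseline, can be sketched as follows. It is a simplification: the callable `rms_of` is a hypothetical stand-in for a new LSQ adjustment that returns the rms of residuals for each observation still in use, and the names are not taken from PIMA.

```python
def reject_outliers(obs, rms_of, threshold, minobs):
    """Iteratively reject the observation with the largest residual rms.

    obs:       list of observation indices at one baseline
    rms_of:    callable, list of used observations -> {obs: rms}
    threshold: rejection limit (role of BPS.PHAS_REJECT or BPS.AMPL_REJECT)
    minobs:    minimum number of observations to keep (BPS.MINOBS_FINE)
    """
    used = list(obs)
    rejected = []
    while len(used) > minobs:
        rms = rms_of(used)                  # stands in for re-adjustment
        worst = max(used, key=lambda o: rms[o])
        if rms[worst] <= threshold:
            break                           # all residuals are acceptable
        used.remove(worst)
        rejected.append(worst)
    return used, rejected
```

Running the loop separately for phase and amplitude gives the two independent counters of used observations mentioned above.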
 
  PIMA task bpas prints valuable 
statistics when DEBUG_LEVEL: 3 or higher. 
An analyst should always examine them.
Lines that start with "BPASS Removed" are especially important. If there are
many rejected observations at a given baseline, especially if the rms of their phase or
amplitude residuals is large, an analyst should examine these 
observations by running a trial fringe fit and looking at the residuals. Bad observations
may skew bandpass evaluation, so it may be necessary to examine residuals
with and without applying the bandpass (BANDPASS_USE: NO). One of the reasons for
computing the bandpass in FINE mode is to find observations with large residual
phases or residual amplitudes.
 
  It may happen that a fraction of high SNR observations are affected by 
hardware failure or RFI. If bpas task does not reject 
them, they skew the estimate of bandpass. There is another mechanism to get 
rid of such observations for bandpass computation: to put them in the exclude 
list. PIMA wrapper pf.py 
automatically checks file VVVVV/EEE/EEE_B_bpas_exc.obs. If it exists, 
it adds option EXCLUDE_OBS:  VVVVV/EEE/EEE_B_bpas_exc.obs,  
i.e. excludes them from participation in bandpass computation.
 
  PIMA task bpas assumes the bandpass is stable. The bandpass may nevertheless 
have one or more jumps, e.g. due to a power failure. In that case 
PIMA will reject many observations. At the moment, 
PIMA does not have a capability to accommodate a jump. 
A workaround is to process two portions of the experiment separately by 
specifying observation lists using INCLUDE_OBS_FILE, 
EXCLUDE_OBS_FILE or OBS keywords. 
If a given station has more than 2–3 jumps in bandpass, it should be 
discarded and station staff should be alerted.
 
  If there are many rejected observations an analyst may a) mask out bad
channels b) add offending observations to the exclude list; c) exclude
a list of observations; d) discard a station; e) ignore it.
 
  In the case of dual-polarization data when POLAR: 
I is specified, the procedure is repeated twice: 
first for the RR polarization bandpass, second for the LL polarization 
with respect to the RR polarization data. Therefore, a trial fringe fit 
with the bandpass applied should run three times: with RR polarization, 
with LL polarization, and with I polarization. If the polarization 
bandpass was computed perfectly, then the SNR at I polarization
should be
$$ \mathrm{SNR}_{I} = \sqrt{ \mathrm{SNR}_{RR}^{2} + \mathrm{SNR}_{LL}^{2} } $$
An SNR reduction of 2–5% with respect to the expression above is rather common,
though if the SNR at I polarization is more than 10% worse, this indicates
a problem that should be investigated.
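
A small helper for this check is sketched below. It takes the expected I-polarization SNR as the square root of the sum of squares of the RR and LL SNRs and reports the relative loss; the function names are hypothetical, and the SNR values would be read from the three trial fringe fits.

```python
import math


def expected_snr_i(snr_rr, snr_ll):
    """SNR at I polarization for a perfect polarization bandpass."""
    return math.hypot(snr_rr, snr_ll)


def snr_loss(snr_i, snr_rr, snr_ll):
    """Fractional SNR loss with respect to the expected value."""
    return 1.0 - snr_i / expected_snr_i(snr_rr, snr_ll)
```

With SNRs of 30 at RR and 40 at LL the expected value is 50, so a measured SNR of 45 at I polarization is a 10% loss, at the level where the text above recommends an investigation.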
 
  A general recommendation is to run trial fringe fitting for 5–7 
observations that are marked as "removed" in the log file of the 
bpas task, with and without the bandpass applied, in order to 
become familiar with the data. If the fringe plots look satisfactory, the next step, 
fringe fitting in the fine mode, should be done. Otherwise, masking, deselection 
of bad observations, and disabling/enabling of phase calibration should be repeated. 
NB: computation of the bandpass should be repeated if a) the bandpass mask or the phase cal 
mask was changed or b) the treatment of phase calibration was changed.
 
 
   PIMA task  frib performs fringe 
fitting. Wrapper pf.py is called as
 
   NB: wrapper pf.py by default overwrites the files with 
fringe results if they exist. Wrapper option -keep prevents overwriting the files 
with fringe results and fringe residuals specified in the control file. 
PIMA tasks that read results of fringe fitting, e.g. 
mkdb or splt, process the fringe 
file sequentially. If there is more than one record for a given observation, 
the latest record takes precedence. 
 
  There are a number of parameters that control fringe fitting.
 
     
     
   
     
  
         If you do not know the group delay and delay rate of your observation, you
         need to search for fringes in the entire area of the Fourier transform.
         This is the usual situation at the first iteration of data processing.
         After completion of the first iteration, you may be able to predict 
         the group delay and phase delay rate. In that case you may want to restrict
         the search window and re-run fringe fitting with a narrow window. This
         procedure is called re-fringing. Re-fringing with a narrow window is 
         usually done for two reasons: a) to guide PIMA to 
         pick up the main 
         peak of the averaged visibilities, which may appear to have a lower amplitude
         than the secondary peaks due to phase noise; b) to detect weaker sources.
         The probability of a false detection at a given SNR is lower when the search
         is done in a narrow window. The gain in the detection limit can reach
         30%.
          
     
         Keyword FRIB.AMPL_FUDGE_TYPE controls the fudge 
         factor correction. Supported values are VLBA, 
         KOGAN, DIFX and 
         NO. If your experiment has been correlated by 
         the NRAO hardware correlator, you should specify either
         VLBA or KOGAN. Then 
         a specific fudge factor that takes into account register saturation 
         in the hardware correlator is applied. Basically, you have to tell 
         PIMA that your data have been processed with the 
         hardware correlator, since there is no reliable way to learn it
         from the data themselves. The difference between 
         VLBA and KOGAN is that 
         in the latter case all weights are considered to be 1 when the 
         correction is applied. Values DIFX or 
         NO mean no fudge factor should be applied.
 
         Fringe amplitude decays linearly with increasing residual group
         delay. The attenuation factor reaches 0.5 when the residual group 
         delay reaches a quantity reciprocal to the spectral resolution.
         Keyword FRIB.AMPL_EDGE_WINDOW_COR instructs 
         PIMA to compensate for this attenuation. You should 
         specify its value USE, unless you have a strong argument against 
         it.
 
         Keyword FRIB.AMPL_EDGE_BEAM_COR instructs 
         PIMA to apply (YES) or
         not to apply (NO) a correction for beam 
         attenuation. It is assumed you know source positions with accuracy 1" 
         or better and used these a priori coordinates in the catalogue 
         when you loaded the data into PIMA. PIMA 
         has a table with measured beam factors for ATCA and VLA antennas. For 
         all other antennas it scales the VLA beam pattern using the antenna 
         diameter. It is recommended to use value YES.
 
         PIMA discards visibilities with low weights. Two 
         keywords FRIB.AUTOCORR_THRESHOLD and 
         FRIB.AUTOCORR_THRESHOLD specify
         the lower thresholds for autocorrelation and cross-correlation
         weights. Typically, normal weights are 1 and thresholds of 
         0.2 are recommended. Weights below that threshold usually 
         indicate failures. However, sometimes weights of normal visibilities
         are low. In that case thresholds of 0.2 may force PIMA 
         to discard all the data, and an analyst may try to reduce the values
         of keywords FRIB.AUTOCORR_THRESHOLD and 
         FRIB.AUTOCORR_THRESHOLD.
       
         Keyword FRIB.NOISE_NSIGMA controls computation of 
         the noise. After fringe fitting PIMA randomly selects 
         32768 samples of the Fourier transform, orders them by decreasing 
         amplitude, and computes the rms. Then it runs an iterative procedure of 
         excluding samples with amplitudes greater than 
         FRIB.NOISE_NSIGMA times the rms. After excluding
         each sample, the rms is updated. Usually, value 3.5 
         is optimal.
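
The iterative clipping can be sketched as follows. The sketch assumes that the rms is the root mean square of the sample amplitudes and that the threshold is `nsigma` times the current rms; the random selection of 32768 samples is left out, and the function name is hypothetical.

```python
import math


def clipped_noise_rms(ampl, nsigma):
    """Estimate the noise rms with iterative n-sigma clipping.

    Samples are visited in order of decreasing amplitude; a sample above
    nsigma * rms is excluded and the rms is updated before the next test.
    """
    if not ampl:
        raise ValueError("no samples")
    samples = sorted(ampl, reverse=True)
    sum_sq = sum(a * a for a in samples)
    n = len(samples)
    for a in samples:
        if n < 2:
            break
        rms = math.sqrt(sum_sq / n)
        if a <= nsigma * rms:
            break                    # the largest remaining sample is kept
        sum_sq -= a * a              # exclude the sample, update the rms
        n -= 1
    return math.sqrt(sum_sq / n)
```

Because the samples are ordered, the loop can stop at the first sample that passes the test: all the remaining samples are smaller.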
          
 
     
         PIMA supports several algorithms of fine fringe 
         search specified by keyword FRIB.FINE_SEARCH. 
         The most commonly used algorithm is LSQ.
         PIMA adjusts phase delay rate, group delay, and 
         group delay rate using least squares with both additive and 
         multiplicative reweighting. Method ACC performs 
         a similar procedure, but it adjusts phase delay
         acceleration instead of group delay rate. This mode is used for a case
         when a priori station position has a very large uncertainty that causes
         a significant quadratic term in phase. This happens mainly when one of 
         the elements of the interferometer is in space, f.e. RadioAstron.
         Method PAR that adjusts group delay and phase 
         delay rate using parabolic fitting is used mainly for a coarse fringe 
         search.
          
             
        A 1D-plot of amplitudes and residual phases versus frequency is 
        controlled by the keyword FRIB.1D_RESFRQ_PLOT. 
        The residuals are coherently averaged over 
        FRIB.1D_FRQ_MSEG spectral channels. Averaging is 
        not needed for high SNR observations (i.e. 
        FRIB.1D_FRQ_MSEG should be set to 
        1), but may be needed for lower SNR observations. 
         
        A 1D-plot of amplitudes and residual phases versus time is 
        controlled by the keyword FRIB.1D_RESTIM_PLOT. 
        The residuals are coherently averaged over FRIB.1D_TIM_MSEG
        accumulation periods. Averaging is not needed for high SNR observations 
        (i.e. FRIB.1D_TIM_MSEG should be set to 
        1), but may be needed for lower SNR observations. 
 
        A 1D-plot of the amplitude of the delay resolution function (DRF) is 
        controlled by the keyword FRIB.1D_DRF_PLOT. The delay 
        resolution function is the 1D slice of the Fourier transform of visibilities along
        the phase delay found by fringe fitting. Keyword 
        FRIB.1D_DRF_SPAN defines
        the span of the plot in units of the group delay ambiguity spacing.
        Value 1.2 is usually enough to visualize the DRF. 
        In the absence of phase offsets between IFs, the shape of the DRF is regular 
        with low sidelobes. The presence of phase offsets distorts the shape of 
        the DRF and raises the sidelobes, which may become stronger than the main 
        maximum.
 
        A 2D-plot of fringe amplitude versus group delay and phase delay
        rate is controlled by the keyword FRIB.2D_FRINGE_PLOT. 
        A portion of the 2D Fourier transform of visibilities is displayed. 
        Two keywords, FRIB.PLOT_DELAY_WINDOW_WIDTH and 
        FRIB.PLOT_RATE_WINDOW_WIDTH, control
        the size of the window. Units are seconds for delay and dimensionless for
        delay rate. The size of a usable window strongly depends on the
        frequency sequence and the duration of the accumulation period. 
        1.D-7 for delay and 5.D-12 for delay 
        rate may be a reasonable initial choice. The best values are selected 
        by trial.
 
  The basic operations of this task are a) splitting the data into output
scans with their reference time; b) computation of path delay; 
c) formatting the output.
 
  PIMA performs baseline-dependent fringe fitting, 
and it processes observations independently. PIMA 
finds the fringe reference time (FRT) automatically as the weighted mean 
epoch of the used accumulation periods when 
FRT_OFFSET: AUTO. In general, 
the FRT is different for different observations even if all stations had the same nominal start 
and stop time. Often VLBI experiments are scheduled in such a way that 
all stations have the same stop time but different start times. Geodetic 
schedules tend to have chaotic start and stop times, with different 
antennas of the network having different start and stop epochs. 
 
  PIMA has several ways to set the scan reference time (SRT). 
The algorithm is controlled by keyword MKDB.SRT. 
If it has value SRT_FRT, then PIMA 
sets the SRT to be the same as the FRT. 
As a result the number of output scans, i.e. sets of observations with the same 
epoch, tends to be the same as the number of observations, i.e. each output 
scan has only one observation. This is usually undesirable.
 
 
  When MKDB.SRT: MID_SCAN, 
PIMA consolidates time epochs so that as many 
observations of the same scan as possible have the same epoch. But when
the reference epoch is moved away from the weighted mean epoch,
the uncertainty of group delay increases. PIMA has 
two keywords that control how far the scan reference time 
can deviate from the fringe reference time so that the group delay uncertainty does not
grow too much: MKDB.GD_MAX_ADD_ERROR and 
MKDB.GD_MAX_SCL_ERROR. Keyword 
MKDB.GD_MAX_ADD_ERROR specifies the tolerance 
of the absolute increase of the uncertainty in seconds. Keyword 
MKDB.GD_MAX_SCL_ERROR specifies the tolerance of 
the increase of the uncertainty as a fraction of the original group delay 
uncertainty derived by PIMA task 
frib. PIMA uses the smaller 
of these two tolerances. A typical value of 
MKDB.GD_MAX_ADD_ERROR is 5.D-12, and 
a typical value of MKDB.GD_MAX_SCL_ERROR is 
0.2. That means that if, for example, the original path 
delay uncertainty is 40 ps, the tolerance is the smaller of 5 ps and 
0.2*40 = 8 ps, i.e. 5 ps. PIMA determines
the SRT in such a way that the increase of the uncertainty within
the tolerance limit is minimal. If there are observations that cannot
be combined to the same SRT, PIMA splits the input 
scans set by task load into several output scans.
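
The tolerance rule from the example above fits in one line of code. The function name is hypothetical; all quantities are in seconds, as in the control file.

```python
def srt_tolerance(gd_err, max_add_err, max_scl_err):
    """Allowed increase of the group delay uncertainty (seconds).

    gd_err:      original group delay uncertainty from task frib
    max_add_err: value of MKDB.GD_MAX_ADD_ERROR (absolute tolerance)
    max_scl_err: value of MKDB.GD_MAX_SCL_ERROR (fractional tolerance)
    """
    return min(max_add_err, max_scl_err * gd_err)
```

With an original uncertainty of 40 ps and the typical values 5.D-12 and 0.2, `srt_tolerance(40e-12, 5e-12, 0.2)` gives 5 ps; for an uncertainty of 10 ps the fractional tolerance of 2 ps becomes the binding one.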
 
  PIMA allows a user to compute scan reference times, 
write them to a file and supply the file name as the value of 
MKDB.SRT. This may be useful for comparing the results 
with other fringe fitting software.
 
  PIMA can write the output in three different formats. 
The format is controlled by the keyword MKDB.OUTPUT_TYPE. 
Value TEXT instructs PIMA to 
generate a plain ascii output file with total group delays, total phase delay 
rates and many other quantities, one line per observation. Value 
AMPL instructs 
PIMA to generate a plain ascii table with fringe 
amplitude, fringe phase, Tsys, gain, uv baseline projections and other 
parameters, one line per used observation. 
Results in AMPL format are mainly for non-imaging flux 
density analysis. Value GVF instructs 
PIMA to generate a database in binary 
GVF format. The VLBI analysis program Post-Solve for geodesy 
and absolute astrometry accepts GVF as input. Therefore, task 
mkdb provides an interface between 
PIMA and Post-Solve.
 
   The database in plain ascii is not equivalent to the database in GVF 
format: the GVF database contains more parameters than the ascii database.
Therefore, the GVF database can be converted to ascii, but reverse
transformation is not feasible.
 
  The ascii database contains parameters for two bands. If the observations
were performed in only one band or task mkdb was 
called with keyword MKDB.2ND_BAND: 
NO, the values for the second band will be zero.
When the output for both bands is available, the band with the higher
reference frequency, hereafter called the higher band, comes first.
Path delay is defined as the difference of two intervals of proper time:
1) the interval of proper time measured by the clock of the first (reference)
   station between the arrival of the wavefront at the reference point 
   of the first antenna and clock synchronization, and
2) the interval of proper time measured by the clock of the second (remote)
   station between the arrival of the wavefront at the reference point 
   of the second antenna and clock synchronization.
The antenna reference point is the point of injection of the phase calibration
tone if phase calibration was used in data analysis, or the phase center
of the antenna otherwise.
The following parameters are written to the output ascii database:
 
    
 
  Keyword MKDB.OUTPUT_NAME controls the name of the 
output file. Its meaning depends on MKDB.OUTPUT_TYPE. 
If the output type is TEXT or AMPL, 
then the value of MKDB.OUTPUT_NAME is the file name.
If the output type is GVF, then the value of 
MKDB.OUTPUT_NAME is the database suffix: 
a character in lower case. The database name is yyyymmdd_s, where 
yyyy is the year, mm is the month number with a leading zero, dd
is the day of the month with a leading zero, and s is the suffix specified by 
MKDB.OUTPUT_NAME. Suffixes a-e are
reserved for imported databases converted from MARK3-DBH or vgosdb 
formats to gvf. Suffixes x-z are reserved for tests.
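
The naming convention can be illustrated with a short sketch. The function names and the labels returned by `suffix_class` are hypothetical, and which date PIMA takes for the name (assumed here to be supplied by the caller) is not specified above.

```python
import datetime
import string


def gvf_database_name(date, suffix):
    """Build a database name of the form yyyymmdd_s."""
    if len(suffix) != 1 or suffix not in string.ascii_lowercase:
        raise ValueError("suffix must be one lower-case letter")
    return f"{date:%Y%m%d}_{suffix}"


def suffix_class(suffix):
    """Classify a suffix according to the reserved ranges."""
    if "a" <= suffix <= "e":
        return "imported"    # converted from MARK3-DBH or vgosdb
    if "x" <= suffix <= "z":
        return "test"
    return "regular"         # label chosen for this sketch
```

For example, `gvf_database_name(datetime.date(2009, 2, 5), "f")` returns the string `20090205_f`.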
 
  In order to generate the output in GVF format, PIMA 
requires some additional information that it cannot find in the FITS-IDI files
with visibilities. This information is supplied in the experiment
description file. The name of that file is the value of keyword
MKDB.DESC_FILE. If the value of 
MKDB.DESC_FILE is NO, then no 
information that is supposed to be defined in the experiment
description is exported to the output database.
 
  PIMA can put two frequency bands of a dual-band 
experiment in the output database. The fringe fitting procedure runs twice 
for a dual-band experiment. The second band can occupy either a range
of IFs or a frequency group. PIMA requires 
two control files, one for the lower band and one for the upper band. 
Task mkdb should use the control file for the upper band. The control 
file for the upper band of dual-band data should have a reference to the 
control file for the lower band. The name of the control file for the lower 
band is specified in the keyword MKDB.2ND_BAND. 
 
   Post-Solve has two slots for group delays, phase delay rates and
other quantities: the 1st and the 2nd. Post-Solve assumes the first 
slot is for the upper frequency and the second one is for the lower
frequency, but it does not check this.
 
  Keyword MKDB.2ND_BAND should have value 
NO for a single-band experiment and in the control 
file for fringe fitting the lower band. It should have the name of the 
control file for the lower band inside the control file for the upper 
band of a dual-band experiment.
 
  PIMA needs to know where to write the output database 
in GVF format. Keyword MKDB.VCAT_CONFIG specifies 
the configuration file for the VLBI database catalogue. That configuration file 
defines two directories: the directory for ascii envelope wrappers 
(keyword GVF_ENV_DIR) and the directory for binary files 
(keyword GVF_DB_DIR). The same configuration
file is supposed to be used by Post-Solve. Post-Solve assumes that
the configuration file has name $SAVE_DIR/vcat.conf . Therefore, in order
for Post-Solve to find the database created by PIMA, 
the control file should specify $SAVE_DIR/vcat.conf with environment 
variable SAVE_DIR expanded.
 
  A database file in GVF format can be read with the GVH library. The GVH
package has routine gvf_transform that can transform the database from 
the binary representation to ascii and back to the binary form.
The ascii representation of a database is human readable. Moreover,
Post-Solve can work with both ascii and binary representations, the binary
representation being more than one order of magnitude faster. 
Transformation to an ascii representation can be useful for simple
editing, such as replacement of a source name or a station name.
 
  Post-Solve classifies observations as good, bad but recoverable,
and bad unrecoverable. The latter category is not visible and Post-Solve
does not allow recovering such observations. The common reasons for that 
are: 1) SNR less than the limit specified in the keyword 
FRIB.SNR_DETECTION of the PIMA control file; 
2) lack of phase calibration for a given observation when pcal is anything 
other than NO; 3) failure in fringe fitting, e.g. because there are 
fewer than 3 valid accumulation periods. Since PIMA 
does not determine whether an observation is really detected or not, if 
FRIB.SNR_DETECTION is too high, there is a chance that 
Post-Solve will miss good observations. For this reason, it is recommended 
to set a rather low FRIB.SNR_DETECTION parameter, e.g. 
5.0, and then set the SNR limit in Post-Solve by hitting 
J. Post-Solve sets the SNR limit temporarily. The observations below 
the limit are considered unrecoverable until the limit is changed. If the SNR 
limit is lowered, but not below the limit used by PIMA, 
an observation that was bad and unrecoverable becomes bad and recoverable. 
Reducing the SNR limit in Post-Solve below the limit used by 
PIMA has no effect.
 
  A suggested strategy is to set a low FRIB.SNR_DETECTION, 
5.0, and after loading the database into Post-Solve to set 
a higher SNR limit in Post-Solve, 5.8–6.0. After 
cleaning the database of outliers with the SNR limit 
5.8–6.0, the limit is lowered to the value used by 
PIMA. Such a strategy avoids contaminating 
the dataset with too many outliers, which may cause difficulties 
in the initial analysis, since too many outliers, say more than 10%, may 
significantly skew residuals.
 
 
  If, for a given observation and a given IF, system temperature or antenna
gain is missing, or Tsys is out of the range [10, 10000] K, or the gain is
zero, the visibilities for that observation and that IF are discarded. 
Therefore, calibration for system temperature and antenna gain must be 
performed before running task splt. In contrast, the data without 
amplitude calibration are still usable for absolute astrometry/geodesy. 
Flagging data for the time intervals when the antennas were off-source 
is essential for deriving a good-quality image. Flagging can also be 
performed before mkdb, although usually it has 
only a marginal effect.
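 
  The rule for discarding visibilities can be written as a simple 
predicate. This is only an illustration of the rule stated above, not 
PIMA code; the function name is hypothetical, and a missing value is 
represented here by None.

```python
TSYS_MIN_K = 10.0      # lower bound of the accepted Tsys range
TSYS_MAX_K = 10000.0   # upper bound of the accepted Tsys range


def visibility_is_usable(tsys_k, gain):
    """Return True if Tsys and gain allow amplitude calibration."""
    if tsys_k is None or gain is None:
        return False                      # calibration is missing
    if not (TSYS_MIN_K <= tsys_k <= TSYS_MAX_K):
        return False                      # Tsys out of range
    return gain != 0.0                    # zero gain is not usable


print(visibility_is_usable(55.0, 0.1))
print(visibility_is_usable(5.0, 0.1))
```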
 
 
  PIMA task prga (PRint GAin) 
prints gain for each station, each IF. It is recommended to inspect these 
values.
 
  PIMA supports two formats of gain information: 
VLBA gain and EVN gain files. The VLBA gain format allows specifying 
different gains for different time ranges, and therefore this format is 
preferable.
 
  A gain file specifies the gain at the reference elevation, the so-called 
Degrees Per Flux Unit (DPFU) factor in K/Jy for R and L polarizations for 
a certain frequency range, and a set of coefficients of the 
polynomial that describes the dependence of gain on elevation, which 
is why it is called a "gain curve". If the elevation dependence is not known, 
then the polynomial has only one coefficient for degree 0: 1.0
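 
  A minimal sketch of evaluating a gain curve is given below. It assumes
that the polynomial coefficients are listed in order of ascending degree 
and that the polynomial argument is elevation in degrees; the actual 
convention is defined by the gain file and may differ. The function name 
is hypothetical.

```python
def antenna_gain(dpfu, poly_coeffs, elevation_deg):
    """Gain at a given elevation: DPFU times the gain-curve polynomial."""
    # Horner scheme; poly_coeffs[0] is the coefficient of degree 0.
    curve = 0.0
    for coeff in reversed(poly_coeffs):
        curve = curve * elevation_deg + coeff
    return dpfu * curve


# Unknown elevation dependence: a single coefficient 1.0 for degree 0,
# so the gain equals DPFU at every elevation.
print(antenna_gain(0.1, [1.0], 45.0))
```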
 
  PIMA task gean imports
gain curves. It requires either qualifier vlba_gain or evn_gain. The value of
the qualifier is the name of the file with gains. When the first qualifier is 
vlba_gain, PIMA requires the second 
qualifier gain_band that specifies the band. Supported 
bands are 90cm, 50cm, 21cm, 18cm, 13cm, 13cmsx, 6cm, 7ghz, 4cm, 4cmsx, 
2cm, 1cm, 24ghz, 7mm, 3mm. 
NB: PIMA does not check whether the band is supported.
 
  PIMA wrapper pf.py does this 
task as well. It expects to find files vlba.gains and ivs.gains in the 
directory specified by configuration parameter --stable-share. 
The pf.py wrapper tries both gain files. 
 
  Task gean does not report an error if it does not 
find a gain for the specific station, specific frequency range, and specific 
time interval. If it does not find a gain, the gain is set to zero. If the 
gain is zero for a given station and a given IF, PIMA task 
splt will not export visibilities for that station. 
It is recommended to inspect gain values by running PIMA 
task prga after importing gains in order to be sure they 
are correct. Wrapper pf.py runs 
prga automatically at the end. The results can be found 
in the log file of task gain.
 
 
  PIMA supports two mechanisms for flagging visibilities. 
First, when PIMA loads the data, it
checks all visibilities for inconsistencies, such as lack of autocorrelation,
wrong source indices, duplicates, etc. It puts indices of damaged visibilities
in a separate file and bars them from loading. These visibilities are 
considered unrecoverable. Second, PIMA supports keyword 
TIME_FLAG_FILE that defines a so-called time epoch flag 
file. The flag file consists of records in plain ascii that define 
the visibilities to be flagged. A record consists of four words separated by 
one or more blanks. The first word is the observation index, the second word 
is the index of the accumulation period within that observation, and the third
word is a flag. The visibility is multiplied by the flag. Flag 0 means the 
visibility will not be used for further processing. Lines that start with 
# are considered comments and are discarded by PIMA.
 
  Task onof uses this mechanism to flag out bad 
accumulation periods. It determines accumulation periods with low amplitude 
and writes their indices and observation indices into the time epoch flag 
file. Other tasks, such as frib and splt,
read this file and flag out visibilities that have corresponding indices.
 
  Task onof does not require qualifiers. Its behavior
is determined by a number of keywords of the control file. If keyword 
ONOF.GEN_FLAGS_MODE is CREATE, 
PIMA will ignore the previous contents of the flag 
file and overwrite it. If ONOF.GEN_FLAGS_MODE is 
UPDATE, then PIMA will honor the
contents of the flag file specified by the keyword 
TIME_FLAG_FILE and update it. In this mode 
PIMA will never reduce the number of flagged 
visibilities; it can only increase it.
 
  In order to determine accumulation periods at the beginning or the end
of the scan that have to be flagged, PIMA needs 
a hint which interval to consider as "good". Two keywords, 
ONOF.KERNEL_START_SHARE and 
ONOF.KERNEL_END_SHARE, determine the so-called kernel 
interval. The values of these keywords are the offsets of the kernel start 
and stop time as a share of the total nominal scan length. The share runs 
from 0 to 1. For instance, ONOF.KERNEL_START_SHARE:  
0.25, ONOF.KERNEL_END_SHARE: 
0.80 specifies the kernel interval that starts at 
0.25*scan_length and ends at 0.80*scan_length. However, the length of the 
kernel interval is limited by the value of 
ONOF.COHERENT_INTERVAL (in seconds). If parameters
ONOF.KERNEL_START_SHARE and 
ONOF.KERNEL_END_SHARE specify an interval longer
than ONOF.COHERENT_INTERVAL, then 
ONOF.KERNEL_END_SHARE is reduced in such a way that 
the kernel interval will be close to, but not exceeding 
ONOF.COHERENT_INTERVAL.
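 
  The computation of the kernel interval can be sketched as follows. The 
sketch clamps the interval length exactly to the coherent interval, which
is a simplification: the real end of the kernel interval presumably falls 
on an accumulation period boundary. The function name is hypothetical.

```python
def kernel_interval(scan_length, start_share, end_share, coherent_interval):
    """Return (start, end) of the kernel interval in seconds from scan start."""
    start = start_share * scan_length
    end = end_share * scan_length
    # The kernel interval may not be longer than the coherent interval:
    # the end is moved towards the start when necessary.
    if end - start > coherent_interval:
        end = start + coherent_interval
    return start, end


# 100 s scan, shares 0.25 and 0.80, coherent interval 30 s:
# the nominal 55 s kernel interval is shortened to 30 s.
print(kernel_interval(100.0, 0.25, 0.80, 30.0))
```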
 
   PIMA first computes coherently averaged complex 
visibilities over the kernel interval and then tries visibilities coherently 
averaged over frequency for individual accumulation periods, moving backwards 
from the start of the kernel interval and forward from the end of the kernel 
interval. PIMA computes the frequency-averaged visibility 
amplitude, computes the ratio of the visibility amplitude over the trial 
accumulation period to the amplitude over the kernel interval, and tries two 
criteria: a) if the ratio is less than (1-k*σa), where 
k is the value of the keyword 
ONOF.NSIG_THRESHOLD and σa is the amplitude uncertainty, 
then the accumulation period is marked as a candidate for exclusion; 
b) if the ratio is less than ONOF.AMPL_THRESHOLD, then 
the accumulation period is marked as a candidate for exclusion. If 
ONOF.NSIG_THRESHOLD is zero, then the first criterion is 
disabled. If ONOF.AMPL_THRESHOLD is zero, then the second 
criterion is disabled. If ONOF.NSIG_THRESHOLD > 0.0, 
the second criterion is used only if the ratio of the amplitude in the kernel 
interval to the amplitude uncertainty at a given accumulation period is less 
than ONOF.NSIG_THRESHOLD.
 
  If PIMA finds k consecutive candidates, where 
k is the value of keyword ONOF.MIN_LOW_AP, 
then it flags them and all consecutive visibilities at the beginning or 
the end of the scan. 
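 
  The logic of selecting and flagging accumulation periods after the end of 
the kernel interval can be sketched as below. This is an illustration under 
several assumptions, not the PIMA implementation: the uncertainty sigma is 
taken in the same relative units as the amplitude ratio, the additional 
condition on the second criterion is omitted, and "all consecutive 
visibilities" is read as all accumulation periods from the first one of the 
run to the end of the scan. Function names are hypothetical.

```python
def is_candidate(ratio, sigma, nsig_threshold, ampl_threshold):
    """Apply both criteria to one trial accumulation period."""
    # Criterion a): ratio below 1 - k*sigma; disabled when k is zero.
    by_nsig = nsig_threshold > 0.0 and ratio < 1.0 - nsig_threshold * sigma
    # Criterion b): ratio below the amplitude threshold; disabled when zero.
    by_ampl = ampl_threshold > 0.0 and ratio < ampl_threshold
    return by_nsig or by_ampl


def trailing_flags(candidates, min_low_ap):
    """Indices of accumulation periods to flag at the end of a scan.

    candidates is a list of booleans in time order for the accumulation
    periods that follow the kernel interval.
    """
    run = 0
    for i, cand in enumerate(candidates):
        run = run + 1 if cand else 0
        if run == min_low_ap:
            first = i - min_low_ap + 1
            return list(range(first, len(candidates)))
    return []


print(trailing_flags([False, True, False, True, True, False], 2))
```

For the beginning of a scan the same function can be applied to the list of 
candidates taken in reverse time order.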
 
    Criterion ONOF.AMPL_THRESHOLD is suitable for 
observations with high SNR and long accumulation periods. Otherwise, visibility 
amplitude computed over one accumulation period has a large scatter, and 
the fluctuations caused by noise may be mistakenly considered as the antenna 
being off source.
 
    Criterion ONOF.NSIG_THRESHOLD is suitable for both 
low SNR and high SNR observations. Value 3 was found 
satisfactory for most cases. Task onof is not able 
to find time intervals when antennas were off-source for observations with 
low SNR, say less than 10, because amplitude fluctuations due to random 
noise become too large to be distinguished from antennas being off-source.
  
 
 
  Task splt processes the data on a source basis. A user 
can specify the name of the source to be processed or request processing all
the sources in a cycle. Keyword SPLT.SOU_NAME controls 
this behavior. Its value can be either a B-name, or a J-name, or ALL, which 
means to process all the sources.
   
  Keyword SPLT.FRQ_MSEG specifies the number of spectral
channels to be coherently averaged. A usual choice for processing legacy 
data is to specify the number of spectral channels in an individual IF. In 
that case all spectral channels of an IF will be averaged. 
PIMA does not average spectral channels across IF 
boundaries. That means that the maximum value of 
SPLT.FRQ_MSEG is limited to the number of channels in 
an IF. Starting from 2014, observations with a 512 MHz band became more and 
more common. Averaging across a 512 MHz band may result in image smearing.
Therefore, it may be beneficial to decrease the number of spectral
channels that will be averaged. Task splt averages 
SPLT.FRQ_MSEG spectral channels into one output IF. 
If SPLT.FRQ_MSEG is less than the number of spectral 
channels in one IF, then the output dataset will have more IFs than the input 
dataset.
 
   Keyword SPLT.TIM_MSEG specifies the number of 
accumulation periods to be coherently averaged. The interval of time 
with SPLT.TIM_MSEG accumulation periods is called 
a segment. A usual choice is to select segment duration 10–20 seconds. 
DIFMAP allows averaging data further, increasing the segment length (DIFMAP 
task uvaver), but after the data have been averaged, there is no way to undo 
averaging. In a case when a very large map will be made, 
SPLT.TIM_MSEG may be 
reduced in order to avoid image smearing.
 
   Keyword SPLT.SNR_MIN specifies the SNR threshold.
Observations with the SNR over all frequencies and over the total scan duration
less than that threshold are excluded from processing and are not written 
in the output file.
 
 
  One output visibility is a result of coherent averaging over
SPLT.FRQ_MSEG*SPLT.TIM_MSEG input 
visibilities. It should be noted that if SPLT.FRQ_MSEG 
is not an integer divisor of the number of spectral channels or 
SPLT.TIM_MSEG is not an integer divisor of the total
number of accumulation periods of an observation, the number of used input 
visibilities can be less at the end of the observation or at the upper part 
of the spectrum.
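 
  The size of the output grid can be estimated with the following sketch. 
It assumes that an incomplete last group of channels or accumulation 
periods still produces an output point, in line with the remark above; 
the function name is hypothetical.

```python
import math


def output_grid(n_chan_per_if, n_if, n_ap, frq_mseg, tim_mseg):
    """Return (number of output IFs, number of time segments)."""
    # Channels are never averaged across IF boundaries.
    if not 1 <= frq_mseg <= n_chan_per_if:
        raise ValueError("frq_mseg must be between 1 and channels per IF")
    if tim_mseg < 1:
        raise ValueError("tim_mseg must be positive")
    out_if_per_input_if = math.ceil(n_chan_per_if / frq_mseg)
    n_segments = math.ceil(n_ap / tim_mseg)
    return n_if * out_if_per_input_if, n_segments


# 8 IFs with 64 channels each, 120 accumulation periods,
# averaging 16 channels and 10 accumulation periods.
print(output_grid(64, 8, 120, 16, 10))
```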
 
  Phases of input visibilities are rotated before averaging according
to phase delay rate and group delays that are found during fringe 
fitting. PIMA reads results of fringe fitting from 
the file specified by the keyword FRINGE_FILE. 
 
  Since PIMA performs fringe fitting for each 
observation independently, in general, the fringe reference time is different 
for different observations. As a result, the misclosure of the raw phases 
from the contribution of group delay and phase delay rates is not zero. If 
SPLT.STA_BASED: YES or 
ALL, then PIMA performs 
a procedure that converts baseline-dependent group delays and phase delay 
rates to station-based ones, which automatically have zero misclosure. 
Therefore, applying phase rotation from the results of fringe fitting does 
not change the misclosure in the original visibilities. PIMA 
performs this conversion at each scan. It may happen that observations of 
a given scan have to be split into several subarrays. A subarray is a set of 
observations that has common baselines 
with every station within the subarray. For example, a station array 
ABCDEFG may have to be split into two subarrays ABCD and
EFG if there are no usable observations at baselines AE, AF, AG,
BE, BF, BG, CE, CF, CG, DE, DF, DG. If in this example there are also no 
usable data at baseline FG, there will be three subarrays: 
ABCD, EF, EG. PIMA splits 
the data into subarrays when processing each scan. If a subarray for a given
station was already used in the previous scans, PIMA 
assigns the observations to that subarray. If a new subarray has 
all the stations that were in one of the previous subarrays, 
PIMA assigns observations to that subarray. If 
a new subarray has all the stations that were in one of the previous 
subarrays plus one or more new stations, PIMA extends 
that subarray and assigns the observations to that subarray.
 
  If all scans were scheduled at all antennas, fringes were 
detected in all observations, and no observations were excluded, there will 
be only one subarray. If one of these conditions is violated, 
PIMA may end up with many subarrays. Since splitting 
data into many subarrays reduces the number of phase and amplitude misclosures,
in general it is undesirable to have many subarrays. 
PIMA supports a procedure of subarray consolidation 
controlled by the keyword SPLT.SUBARRY_CONSOLIDATION. 
Value NO means to disable subarray consolidation. Value 
MIN instructs PIMA to perform 
minimal subarray consolidation: if all stations of subarray A are 
present in the subarray B, then the subarray B is 
consolidated with subarray A. Value MAX 
instructs to perform maximum subarray consolidation: if subarray A has 
at least one common station with subarray B, both subarrays are 
consolidated.  
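 
  The three consolidation modes can be illustrated with sets of station 
names. This is a sketch of the rules as stated above, not the 
PIMA algorithm: the order in which pairs of subarrays are 
examined and the function name are assumptions.

```python
def consolidate(subarrays, mode):
    """Merge subarrays (sets of station names) according to the mode."""
    if mode not in ("NO", "MIN", "MAX"):
        raise ValueError("mode must be NO, MIN or MAX")
    result = [set(s) for s in subarrays]
    if mode == "NO":
        return result
    merged = True
    while merged:                       # repeat until nothing can be merged
        merged = False
        for i in range(len(result)):
            for j in range(len(result)):
                if i == j:
                    continue
                a, b = result[i], result[j]
                # MIN: all stations of A are present in B.
                # MAX: A and B have at least one common station.
                mergeable = a <= b if mode == "MIN" else bool(a & b)
                if mergeable:
                    result[j] = a | b
                    del result[i]
                    merged = True
                    break
            if merged:
                break
    return result


print(consolidate([{"A", "B", "C", "D"}, {"A", "B"}, {"E", "F"}], "MIN"))
print(consolidate([{"A", "B"}, {"B", "C"}, {"E", "F"}], "MAX"))
```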
 
  In order to compute station-dependent group delay, phase delay, and
group delay rate correctly, baseline-dependent group delay, phase delay, and
group delay rate should be correct. Fringe fitting provides a wrong result
for a non-detection or for an observation affected by RFI. Group delays
of non-detections have a uniform distribution over the fringe search
window, which is several orders of magnitude larger than the scatter
of group delays for normal observations. Therefore, care should be taken
to prevent task splt from using bad group delays. 
It is recommended to process the data with
VTD/Post-Solve in order to identify outliers and exclude them from the input 
to the procedure for computing station-based group delay, phase delay
and group delay rate using keyword EXCLUDE_OBS_FILE.
If SPLT.STA_BASED: YES is used,
the observations excluded from computing station-based quantities will
remain excluded from further processing and will not be written in the
output file. However, in general, this approach is too restrictive.
When SPLT.STA_BASED: ALL is
specified, the filters specified by keywords 
FRIB.SNR_DETECTION, 
EXCLUDE_OBS_FILE,
INCLUDE_OBS_FILE, and OBS
are applied only to the input data of the procedure for computing 
station-based quantities. All observations between the stations for 
which station-dependent group delay, phase delay, and group delay rate 
are computed are used for further processing, regardless of whether they
passed the input filter or not. However, if there were no observations
at baselines with certain stations, there will be no station-based 
quantities for these stations, and therefore no observations 
with these stations will be used for generating the output.
Value ALL is recommended.
 
   
   Alternatively, when SPLT.STA_BASED: 
NO, PIMA does not convert 
baseline-dependent quantities to station-based. Important: phase misclosure 
is distorted when this option is used. This option is useful when the data 
are to be processed on a baseline basis. 
 
   Keyword SPLT.SNR_MIN specifies the SNR threshold
for the output. Observations with the SNR over all frequencies and over 
the total scan duration less than that threshold are excluded from processing 
and are not written in the output file. This criterion is applied after 
the averaged visibilities are computed. If 
SPLT.STA_BASED: YES or
SPLT.STA_BASED: NO was used,
observations with SNR less than FRIB.SNR_DETECTION
will remain excluded even if their SNR is equal to or greater than
SPLT.SNR_MIN. However, when 
SPLT.STA_BASED: ALL is 
specified, observations with SNR less than 
FRIB.SNR_DETECTION may become valid. For instance,
if stations B and C have low sensitivity, but station A has high
sensitivity, visibilities at baseline BC can be determined using phase
delay rate and group delay at baselines AB and AC. The SNR of visibilities
at baseline BC may be very low. Keyword SPLT.SNR_MIN 
allows filtering out such low-SNR visibilities computed on the basis of
fringe fitting results from visibility analysis at other baselines.
 
   There are several options for generating the output in the case of
dual-polarization data. Keyword SPLT.POLAR specifies 
which polarizations to put in the output: ALL for all 
polarizations that are present in the data, PAR only for 
RR or LL polarizations. Other 
supported values: I, RR, 
RL, LR, and LL.
 
   PIMA computes averaged visibilities and their weights. 
The algorithm for weights computation is controlled by keyword 
SPLT.WEIGHT_TYPE. According to FITS specifications, 
weight is defined as the reciprocal of the fringe amplitude variance. Value 
ONE forces PIMA to set all weights 
to 1. Value OBS_SNR instructs PIMA 
to compute weights on the basis of the signal to noise ratio SNR. The segment 
weight is Ampl/SNR**2, where SNR is the signal to noise ratio over 
visibilities of a given segment. The segment SNR is computed from the SNR over 
all visibilities used in fringe fitting, scaled by the square root of the 
ratio of the number of visibilities in the segment to the total number of used 
visibilities in the observation. When SPLT.WEIGHT_TYPE is 
OBS_RMS, PIMA computes the variance of 
the fringe amplitude over the observation and assigns weights for all segments 
reciprocal to this estimate of variance. When 
SPLT.WEIGHT_TYPE is SEG_RMS, 
PIMA computes the variance of fringe amplitude over 
visibilities of a given segment and assigns weights reciprocal to this 
variance.
 
   Method SEG_RMS is preferable, since it accounts for 
temporal variation of the variance. However, it requires a sufficient number of 
accumulation periods per segment for computing a meaningful variance. When 
SPLT.WEIGHT_TYPE is AUTO, 
PIMA uses different ways to compute the weight depending on 
the number of accumulation periods in a given segment. If the number of 
accumulation periods per segment specified in the keyword 
SPLT.TIM_MSEG is equal to or greater than the threshold 
(currently 8), the variance is computed over visibilities of a given segment. 
Otherwise, the variance will be computed over all visibilities of the 
observation (equivalent to OBS_RMS). 
SPLT.WEIGHT_TYPE: AUTO is recommended 
for a general case. 
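 
  The choice made in the AUTO mode can be sketched as follows. The sketch 
works with real-valued amplitudes and uses the sample variance from the 
Python standard library; which variance estimator PIMA uses 
is not specified here, so this is an assumption, as is the function name.

```python
from statistics import variance


def auto_weights(segments, tim_mseg, threshold=8):
    """Weights of segments as reciprocals of the amplitude variance.

    segments is a list of lists of amplitudes, one list per segment.
    """
    if tim_mseg >= threshold:
        # Enough accumulation periods per segment: as SEG_RMS.
        return [1.0 / variance(seg) for seg in segments]
    # Too few accumulation periods per segment: as OBS_RMS, one variance
    # over all visibilities of the observation.
    all_ampl = [ampl for seg in segments for ampl in seg]
    weight = 1.0 / variance(all_ampl)
    return [weight] * len(segments)


segments = [[1.0, 2.0], [3.0, 5.0]]
print(auto_weights(segments, tim_mseg=2))               # falls back to OBS_RMS
print(auto_weights(segments, tim_mseg=2, threshold=2))  # per-segment variance
```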
 
  When computing calibrated amplitude, PIMA applies two 
renormalizations unless a user disables them. When 
SPLT.AUTOCORR_NRML_METHOD: AVERAGED, 
PIMA renormalizes system temperature to account for 
masking of the autocorrelation spectrum. The system temperature is measured 
by integrating the total power over the entire IF. When a portion of the 
bandwidth where Tsys was computed is masked out, the total power changes. 
In general, the spectrum of noise is not constant over the band; it is 
proportional to the autocorrelation. 
PIMA normalizes the autocorrelation to have the average 
equal to 1 over the nominal IF width. With this method, 
PIMA computes the mean autocorrelation over the used 
portion of the bandwidth within each IF, which in general is not 1. Then 
the fringe amplitude is divided by the mean autocorrelation. 
SPLT.AUTOCORR_NRML_METHOD: NO 
disables this renormalization. It is recommended to use 
SPLT.AUTOCORR_NRML_METHOD: AVERAGED.
 
   When SPLT.BPASS_NRML_METHOD: 
WEIGHTED, PIMA divides the fringe 
amplitude by the square root of the product of the square root
of bandpass renormalization factor. The representative bandwidth of the 
intermediate frequency used for renormalization is specified by the
keyword SPLT.BPASS_NRML_RANGE. The value of this keyword 
is two numbers from 0 to 1 separated by a colon. These numbers, Bl and Bh, 
specify the lower and the upper edges of the representative bandwidth as 
a share of the total bandwidth. Example: SPLT.BPASS_NRML_RANGE: 
0.25:0.80. They define the representative bandwidth as 
[F_low + Bl*Fw, F_low + Bh*Fw], where F_low is the lower edge of the IF and 
Fw is the IF bandwidth. PIMA computes the
renormalization factor R = (Sum Br / Nr) / (Sum Bt / Nt), where Br is the 
bandpass within the representative bandwidth [F_low + Bl*Fw, F_low + Bh*Fw], 
Nr is the number of points in that bandwidth, Bt is the
bandpass in the total bandwidth, and Nt is the total number of points 
in the entire IF. Every point of the bandpass is multiplied by factor R, 
which makes it normalized over the representative bandwidth 
[F_low + Bl*Fw, F_low + Bh*Fw]. Usually R > 1.0. 
For example, if the IF bandwidth is 16 MHz and 
SPLT.BPASS_NRML_RANGE: 0.25:0.80,
then the representative portion of the bandwidth used for renormalization 
starts at 0.25*16=4.0 MHz and ends at 0.80*16=12.8 MHz. Thus, the portion
[4.0, 12.8] MHz of the total bandwidth [0, 16] MHz is
considered representative and the bandpass is normalized to be 1 over the
representative portion of the bandwidth. The share of the representative 
bandwidth depends on the quality of hardware. 0.25:0.80
is a good choice for most cases.
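 
  The computation of the representative bandwidth and of factor R can be 
sketched as below. The mapping of the shares to channel indices by rounding 
is an assumption made for this illustration, and the function names are 
hypothetical.

```python
def representative_band(f_low, width, share_low, share_high):
    """Edges of the representative bandwidth [F_low+Bl*Fw, F_low+Bh*Fw]."""
    return f_low + share_low * width, f_low + share_high * width


def renormalization_factor(bandpass, share_low, share_high):
    """R = (Sum Br / Nr) / (Sum Bt / Nt) for bandpass amplitudes of one IF."""
    n_total = len(bandpass)
    i_low = int(round(share_low * n_total))
    i_high = int(round(share_high * n_total))
    representative = bandpass[i_low:i_high]
    if not representative:
        raise ValueError("representative bandwidth contains no points")
    mean_rep = sum(representative) / len(representative)
    mean_total = sum(bandpass) / n_total
    return mean_rep / mean_total


# IF of 16 MHz with SPLT.BPASS_NRML_RANGE: 0.25:0.80
print(representative_band(0.0, 16.0, 0.25, 0.80))

# A toy bandpass with amplitude dropping to 0.5 at the edges of the IF.
toy_bandpass = [0.5, 0.5, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.5, 0.5]
print(renormalization_factor(toy_bandpass, 0.2, 0.8))
```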
 
   In addition to generating the output averaged visibilities in FITS-IDI
format, PIMA task splt 
will generate total visibilities averaged over the entire scan when 
SPLT.TOTAL_UV: YES is specified.
NB: unlike the averaged visibilities written in the FITS-IDI format,
the total visibilities are referred to the band reference frequency.
The name of a file with total visibilities obeys the following convention:
JJJJJJJJJJ_B_uvt.txt, where JJJJJJJJJJ is the 10-character
long J2000 source name and B is the band defined in the keyword 
BAND. Total visibilities are written in plain
ascii format. See document Total_visibilities_format.txt for format description.
 
 
  PIMA supports keyword 
SPLT.GAIN_CORR_FILE that specifies the so-called gain
correction file. This file in plain ascii format defines factors for each 
station and each IF by which the fringe amplitude is multiplied when task 
splt runs.
 
 
  These factors can be assigned manually or automatically. Task 
gaco can run in two modes: manual and automatic. 
In manual mode PIMA expects a qualifier init with 
the value of the a priori factor. Value 1.0 or 
0.0 are usual choices. Task 
gaco with qualifier init 
sets all gains to the initial value. Gain correction 1.0 means 
no correction. Gain correction 0.0 means all IFs all stations should 
be deselected.
 
  Task gaco writes the gain correction file. If task 
gaco was invoked in init mode, 
the gain correction file should be edited in order to be useful. Example: 
station KP-VLBA had fringe amplitude a factor of 8–10 lower in IFs 3 and 4, 
and a user would like to get rid of them for imaging purposes. Then 
PIMA task gaco with qualifier 
init and value 1.0 is called.
After that the user edits the gain correction file that the task 
created and changes the gain correction for KP-VLBA IFs 3 and 4 from 1.0 to 0.0.
 
  In order to compute gain corrections in the automatic mode, several
images should be made first and self-calibrated visibilities should be saved.
Then PIMA analyzes the original amplitudes 
before amplitude self-calibration and the amplitudes after amplitude 
self-calibration and determines their ratios for each station and each IF 
using least squares. These gain corrections are equivalent to the factors 
shown by task CORPLT of the DIFMAP package. In order to do it, 
PIMA should find 
visibilities before and after imaging. PIMA supports 
the convention that a file with visibilities before imaging has name 
JJJJJJJJJJ_B_uva.fits, where JJJJJJJJJJ is the 10-character
long J2000 source name and B is the band defined in the keyword 
BAND. PIMA task 
splt creates files with averaged visibilities in this 
format. PIMA expects files with self-calibrated 
visibilities after imaging to have names in the form of 
JJJJJJJJJJ_B_uvs.fits.
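 
  The naming convention for the pair of files can be expressed in a few 
lines of Python. The function name is hypothetical, and the source name 
and the band in the example are used for illustration only.

```python
def visibility_file_names(j2000_name, band):
    """Names of files with visibilities before and after imaging."""
    if len(j2000_name) != 10:
        raise ValueError("J2000 source name must be 10 characters long")
    stem = f"{j2000_name}_{band}"
    return f"{stem}_uva.fits", f"{stem}_uvs.fits"


before, after = visibility_file_names("J0555+3948", "X")
print(before)
print(after)
```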
 
  When used in the automatic mode, PIMA task 
gaco expects two qualifiers, sou and 
dir, the first is mandatory and the second is optional. 
The value of the first qualifier is a comma-separated source list. The value of 
the second optional qualifier specifies the directory where files with original
and self-calibrated visibilities can be found (they should be in the same 
directory). If the second qualifier is omitted, PIMA will 
search for visibilities in the same directory where task 
splt put them: SSSSS/EEE_uvs, where 
SSSSS is the PIMA scratch directory specified 
by the keyword EXPER_DIR and EEE is the experiment 
name specified by the keyword SESS_CODE.
 
  PIMA task gaco used in the 
automatic mode computes the gain correction file. If a given station observed 
no sources from the list, the gain correction for that station is set to 1.0. 
A user may edit the gain correction file, for instance setting zeros for IFs 
for certain station(s). Setting the gain correction to zero will effectively 
flag out these IFs for imaging purposes, while these IFs are still 
available for other tasks, for example, fringe fitting.
 
  PIMA task splt uses the gain 
correction file unless SPLT.GAIN_CORR_FILE: NO. 
It multiplies the calibrated visibility by the product of gain corrections 
of both stations of a baseline. It writes the used gain corrections into the
output FITS file in two places: 1) as an ascii table in the HISTORY records 
of the main table, 2) as a new table, GACO. It is possible to run task 
gaco a second time. PIMA 
searches for gain corrections in the FITS file with calibrated visibilities, 
applies these corrections as a priori and writes the updated total gain 
correction with respect to the case when no gain correction is applied. 
Thus, if task gaco is run more than once, the result 
will be approximately the same.
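 
  Applying gain corrections to a visibility can be sketched as below. 
The in-memory layout of the corrections, a dictionary of per-IF lists 
indexed from zero, is hypothetical and is not the format of the gain 
correction file; the numerical values are made up for illustration.

```python
def apply_gain_correction(vis, sta1, sta2, if_index, gain_corr):
    """Multiply a visibility by gain corrections of both stations."""
    return vis * gain_corr[sta1][if_index] * gain_corr[sta2][if_index]


# Illustrative corrections for four IFs; IFs 3 and 4 of KP-VLBA
# (indices 2 and 3) are deselected by a zero correction.
gain_corr = {
    "KP-VLBA": [1.0, 1.0, 0.0, 0.0],
    "LA-VLBA": [1.0, 1.1, 1.0, 1.0],
}

print(apply_gain_correction(2.0 + 1.0j, "KP-VLBA", "LA-VLBA", 1, gain_corr))
print(apply_gain_correction(2.0 + 1.0j, "KP-VLBA", "LA-VLBA", 2, gain_corr))
```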
 
  It is recommended first to run the splt task 
for several reference sources with DEBUG_LEVEL: 
2 or 3 and with 
FRIB.SNR_DETECTION: 6.0. It is 
recommended to investigate the splt log 
file. Search there for lines with PIMA_SPLT_FITSTA  SOU:, 
for instance using grep. This line provides the statistics for a 
subarray. (Remember: a scan may have one or more subarrays that can be 
later consolidated into one subarray). An analyst should examine the 
column that follows MaxDev_Gr_Del. This column provides the maximum 
residual of transforming baseline-dependent group delay to station-based. 
A typical value of this residual for a detected source is 
50–300 ps. A source with complicated structure may have a residual of
1–2 ns. But a residual of, say, 1000 ns indicates that at least one 
observation in the subarray was a non-detection. Even one non-detection 
can spoil the entire dataset for a given source to the point of uselessness. 
If you see a large maximum residual in the subarray statistics, just look in 
the preceding lines that start with PIMA_SPLT_FITSTA  Sou:. Identify 
sources with large residuals (say more than 2–3 ns), identify their 
observation indices that can be found in the column followed by 
OBS:, and add these observation indices to the exclude file. Then 
run PIMA task splt once again, specifying 
this file in keyword EXCLUDE_OBS_FILE. 
Then inspect the log file again.
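 
  When grep is not at hand, the lines of interest can be selected with 
a few lines of Python. The sketch only selects the lines; it does not parse 
the columns, since the layout of the log is not described here. The 
function name and the log file name in the comment are hypothetical.

```python
def fitsta_lines(lines):
    """Return the lines of a splt log that contain PIMA_SPLT_FITSTA."""
    return [line.rstrip("\n") for line in lines if "PIMA_SPLT_FITSTA" in line]


# Usage with a hypothetical log file name:
#
# with open("splt.log") as log:
#     for line in fitsta_lines(log):
#         print(line)
```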
 
  After inspection shows no subarray with large residuals, the selected
reference sources are imaged using phase and amplitude self-calibration. 
If for some reason an image of one of the reference sources is not
satisfactory, that bad source should be replaced with another
source. 
 
  When good images were produced for all reference sources, 
PIMA task gaco is invoked with 
qualifier sou with the comma-separated list of 
reference sources. After that the gain correction file is inspected.
If some IFs at some stations are to be masked out, corresponding 
values of the gain correction file are replaced with zeroes.
 
 
  After that task splt runs over the entire dataset by 
specifying SPLT.SOU_NAME: ALL. 
Care must be taken to use only detected observations. There are two 
approaches for cleaning the dataset of non-detections. The first approach 
is to raise the SNR detection limit defined by keyword 
FRIB.SNR_DETECTION. Depending on the search window,
the detection limit is 5.3–6.0: the wider the window, the higher 
the limit. Another approach is to run the full absolute astrometry/geodesy 
pipeline and exclude those observations that are flagged out by 
Post-Solve. Solve will flag non-detections and other "bad" observations, 
such as those affected by RFI, low fringe rate problem, etc.
The list of such observations is extracted by program
gvf_db that is a part of the Solve package. Usage: 
 
   After running splt over the entire dataset, 
the splt log should
be examined the same way as we did when we processed reference 
sources.
 
  Task splt creates calibrated visibilities of all 
the sources, except those that have too few detections, and puts them in 
directory SSSSS/EEE_uvs. Calibrated visibilities are used for 
imaging with DIFMAP, AIPS or other software. The result of imaging is two 
files per source: a file with self-calibrated visibilities in FITS format 
and an image in FITS image format. The image contains two tables: a set of 
CLEAN components and the binary image that was generated from the table 
of CLEAN components. 
 
 
 
  Task opag provides an interface to the 
SPD package for computation of 
opacity and atmosphere brightness temperature using the output of 
numerical weather models. You need the SPD package installed in order to run
this task. In the future, the ability to run SPD remotely will be added
if there is user demand.
 
  Keyword atmo_dir is required. It specifies
the name of the directory with the output of numerical weather model
processed with MALO package 
and written in HEB format.
 
  
  Task opag computes opacity and atmosphere
brightness temperature for all frequencies as well as slant path
delay at a grid over azimuth and elevation. The grid is equidistant
over azimuth and not equidistant over elevations. 
PIMA puts a series of output files for the 
time range of the VLBI experiment and two epochs before the 
first observation and two epochs after the last observation. The 
step of epochs is determined by the numerical weather model
(typically 3–6 hours). The output files are formed as
SSSSS/EEE_sob/EEE_DDDDDDDD_DDDD.spd, where 
DDDDDDDD_DDDD is the date in format YYYYMMDD_hhmm,
for instance 20150818_0300.
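
The naming scheme above can be illustrated with a short Python sketch. It is not part of PIMA: the function name, the scratch directory /scr and the experiment name abc are hypothetical, and the sketch assumes that model epochs fall on multiples of the step counted from 00:00 and that "two epochs before/after" means the two nearest epochs on each side of the experiment.

```python
from datetime import datetime, timedelta

def spd_file_names(scratch_dir, exp, first_obs, last_obs, step_hours=3, n_pad=2):
    """List the .spd names expected from opag (illustrative sketch only)."""
    step = timedelta(hours=step_hours)
    # Assumption: epochs are multiples of the step counted from midnight.
    midnight = first_obs.replace(hour=0, minute=0, second=0, microsecond=0)
    # Last model epoch at or before the first observation.
    first_epoch = midnight + ((first_obs - midnight) // step) * step
    # First model epoch at or after the last observation.
    last_epoch = midnight + ((last_obs - midnight) // step) * step
    if last_epoch < last_obs:
        last_epoch += step
    epoch = first_epoch - (n_pad - 1) * step
    stop = last_epoch + (n_pad - 1) * step
    names = []
    while epoch <= stop:
        names.append(f"{scratch_dir}/{exp}_sob/{exp}_{epoch:%Y%m%d_%H%M}.spd")
        epoch += step
    return names

if __name__ == "__main__":
    for name in spd_file_names("/scr", "abc",
                               datetime(2015, 8, 18, 4, 10),
                               datetime(2015, 8, 18, 10, 30)):
        print(name)
```

The actual set of epochs is decided by the SPD package and the numerical weather model, so the list printed here should be treated only as an aid for checking that the expected files are present.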
 
  Task opag runs for 3–20 minutes and creates
the set of files that are used by task opal. 
 
 
  Task opal parses the files with opacity and 
brightness temperature on the 3D elevation-azimuth-time grid created by
package SPD that was invoked by
task opag, interpolates them to the scan start
and scan end, and writes them into the PIMA internal 
data structure. It also computes receiver temperature Trec by subtracting 
atmosphere temperature from measured system temperature.
Task opal also initializes the arrays with so-called 
modeled and cleaned system temperature, i.e. removes them if they existed 
before, fills them with zeroes and sets flags "not available".
 
 
   Task tsmo computes the so-called modeled Tsys.
This task works in two modes, "elevation" mode and 
"if" mode. 
 
  In the "if" mode the task flags Tsys values that violate the assumption 
that the ratio of Tsys between IFs of the specified band is constant 
in time for a given experiment. First, PIMA finds
the reference IF within the band specified by keywords 
BEG_FRQ and END_FRQ. It tries
the IFs one by one. It computes the logarithms of the ratio of Tsys
of a given IF to the Tsys of the reference IF, finds the median ratio,
computes the rms of the deviation, and removes the outliers that are
FRIB.NOISE_NSIGMA times greater than the rms.
The trial reference IF that has the minimum number of outliers becomes
the reference IF. After that PIMA computes 
the average Tsys ratios and stores flags. If for a given
observation Tsys at the reference IF has to be flagged, a temporary
reference IF is sought.
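
The selection of the reference IF described above can be sketched in Python as follows. This is a simplified single-pass illustration, not PIMA's code: the function names and the layout of the input (a list of epochs, each a list of Tsys per IF) are assumptions, and the search for a temporary reference IF is omitted.

```python
import math
from statistics import median

def count_outliers(tsys, ref, nsigma):
    """Count outliers of log(Tsys_i / Tsys_ref) over all IFs i != ref.

    tsys[k][i] is Tsys at epoch k for IF i (hypothetical layout).
    """
    n_out = 0
    for i in range(len(tsys[0])):
        if i == ref:
            continue
        log_ratio = [math.log(row[i] / row[ref]) for row in tsys]
        med = median(log_ratio)
        rms = math.sqrt(sum((x - med) ** 2 for x in log_ratio) / len(log_ratio))
        # A point is an outlier if it deviates from the median by more
        # than nsigma times the rms (the role of FRIB.NOISE_NSIGMA).
        n_out += sum(1 for x in log_ratio if abs(x - med) > nsigma * rms)
    return n_out

def pick_reference_if(tsys, nsigma=3.0):
    """Return the index of the trial reference IF with the fewest outliers."""
    counts = [count_outliers(tsys, ref, nsigma) for ref in range(len(tsys[0]))]
    return counts.index(min(counts))

if __name__ == "__main__":
    # Twelve epochs, three IFs; IF 2 has one corrupted Tsys value.
    data = [[50.0, 60.0, 70.0] for _ in range(12)]
    data[5][2] = 700.0
    print(pick_reference_if(data))
```

An IF with corrupted Tsys makes every ratio that involves it look bad, so it collects the most outliers when tried as the reference and is therefore not selected.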
 
  In the "elevation" mode the task decomposes Tsys into the product
 T_sys = T_o * a(t) * b(e)
 
   where  
        
   After the task tsmo  computes Tsys decomposition,
it computes the so-called modeled Tsys using parameters of the decomposition
for start and stop date of every scan and stores it in the appropriate
slot. In addition, PIMA computes so-called cleaned
array of Tsys. Cleaned Tsys coincides with modeled Tsys for the points
with missing or flagged measured Tsys and coincides with measured Tsys
for all other points.
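
The rule for the cleaned Tsys array can be written down in a few lines of Python. This is only a sketch of the rule stated above; the function name and the representation of missing values as None are assumptions.

```python
def cleaned_tsys(measured, modeled, flagged):
    """Combine measured and modeled Tsys into the cleaned array.

    measured: list of measured Tsys, None where the measurement is missing
    modeled:  list of modeled Tsys for the same points
    flagged:  list of booleans, True where the measured value was flagged
    """
    cleaned = []
    for meas, mod, flag in zip(measured, modeled, flagged):
        # Modeled Tsys replaces missing or flagged measurements;
        # all other points keep the measured value.
        cleaned.append(mod if (meas is None or flag) else meas)
    return cleaned

if __name__ == "__main__":
    print(cleaned_tsys([50.0, None, 900.0], [51.0, 52.0, 53.0],
                       [False, False, True]))
```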
 
   Task tsmo requires keyword mode 
that accepts comma-separated values if and 
elev. Unless a user has reasons to do otherwise,
it is recommended to use both values: tsmo 
mode if,elev. When two
values are specified, PIMA will first execute
Tsys model computation in the "if" mode and then in the
"elevation" mode.
 
   For experiments with two bands, for instance S/X or C/X, task 
tsmo should run for the two bands separately since the 
ratio of Tsys between IFs of different receivers may change. Elevation 
dependence is frequency dependent and therefore should be modeled 
separately.
 
   Results of task tsmo can be examined with 
task tspl. When task tspl
is invoked with keyword TSYS:
MEASURED, it will show measured Tsys. When it 
is invoked with TSYS: MODELED,
it will show modeled Tsys. When it is invoked with 
TSYS: CLEANED, it will show 
cleaned Tsys. In general, TSYS: 
CLEANED is recommended: on the one hand, measured
Tsys is used; on the other hand, obvious Tsys outliers are eliminated.
 
   It should be remembered that opal purges results 
of task tsmo and task load 
purges results of both tasks opal and
tsmo.
 
 
  The quality of automatically generated images is not always satisfactory. 
The most common reasons: a) not all visibilities recorded when the antennas were off-source
are filtered out; b) visibilities at some IFs require significant 
corrections, say more than 50%; c) non-detection visibilities were used by
splt; d) the source is larger than the default field 
of view. Therefore, automatically generated images require scrutiny.
Recommended approach: an analyst uses utility gqvew to screen pictures of 
images and self-calibrated visibilities and selects those sources whose
images look suspicious. These selected sources are imaged manually.
It is recommended to adhere to the same naming conventions for output files that
PIMA uses.
 
 
  If an observation is suppressed because PIMA picked 
the wrong maximum in the Fourier transform of visibilities, it is possible to 
correct it. We can predict group delay rather precisely after the Post-Solve 
solution: to several nanoseconds at 2 GHz and several hundred picoseconds at 
8 GHz and higher. Therefore, we can guide PIMA where 
to search for the maximum and fringe the affected data once more. This 
procedure is called re-fringing. 
 
  Solve has a special mode that prints residuals and a priori delay computed
by VTD. To turn this mode on, hit key A at the "Last page" menu and 
rewind the spool file: find "menu 1", set (C)hange Spooling current: on, and
hit key ; to rewind the spool file with the solution listing, i.e. to purge 
its previous contents. After that, run the LSQ solution by hitting key Q,
scroll the listing by hitting the blank key two times, and leave Post-Solve by 
hitting key T. Then copy the spool file into the experiment directory 
under name EEE_B_init.spl, where EEE is the experiment name
and B is the band. Using information in the residual file, one can
construct a command line for PIMA with modified keywords
FRIB.DELAY_WINDOW_CENTER, 
FRIB.RATE_WINDOW_CENTER, 
FRIB.DELAY_WINDOW_WIDTH,
FRIB.RATE_WINDOW_WIDTH in such a way that 
PIMA will search for the maximum within 1–3 ns 
of the group delay predicted on the basis of the Post-Solve solution. Program 
samb, which is a part of the Post-Solve package, does this for you.
 
  Usage:
 
  Program samb analyzes the file with Post-Solve residuals, finds outliers
marked by character > or R in the 8th column, computes the
expected residual group delay with respect to the a priori
model used by the correlator, which in general is different from the 
a priori model used by VTD plus adjustments found by Post-Solve, and
uses this value as the FRIB.DELAY_WINDOW_CENTER argument 
of the PIMA command line for re-fringing.
 
   We need to save the Post-Solve solution before running 
PIMA by hitting the CNTRL/U key. Then we execute 
the command line generated by samb. Re-fringing may 
or may not find the correct maximum in the Fourier transform of visibility data. 
The control file generated by samb writes the results in file 
VVVVV/EEE/EEE_B_refri.fri and residuals in 
VVVVV/EEE/EEE_B_refri.frr, where VVVVV is the 
PIMA scratch directory, EEE is the experiment 
name specified in the keyword SESS_CODE of the 
PIMA control file, and B is the band name.
 
  The next step is to extract records from the fringe output and
fringe residual files generated by the command file created by samb, 
and to add those that have SNR greater than the limit specified in the samb
command to the end of the main fringe results and fringe residual 
files. Remember that the fringe results file is processed consecutively. If there 
is more than one record corresponding to the same observation, the 
latest record overrides the previous record(s).
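
The "latest record wins" rule can be illustrated with a small Python sketch. It does not read the real fringe file format: records are represented by hypothetical (observation index, SNR, text) tuples, and the function name is an assumption.

```python
def merge_fringe_records(main_records, refri_records, snr_limit):
    """Append re-fringed records above the SNR limit to the main list.

    Each record is a hypothetical tuple (obs_index, snr, text).
    Returns the appended list and a dictionary holding, for every
    observation, the record that takes effect (the latest one).
    """
    merged = list(main_records)
    merged += [rec for rec in refri_records if rec[1] > snr_limit]
    effective = {}
    for rec in merged:
        # A later record for the same observation overrides earlier ones.
        effective[rec[0]] = rec
    return merged, effective

if __name__ == "__main__":
    main = [(1, 12.0, "first fringing"), (2, 4.1, "first fringing")]
    refri = [(2, 9.3, "re-fringing"), (3, 4.0, "re-fringing")]
    merged, effective = merge_fringe_records(main, refri, snr_limit=5.5)
    print(len(merged), effective[2])
```

Because the file is processed consecutively, appending the accepted re-fringed records is sufficient; the earlier records of the same observations do not have to be deleted.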
 
  The next step is to create a GVF database from the updated files with 
fringe results and fringe residuals using PIMA 
task mkdb. There is a caveat. When the database is 
updated, it stores auto suppression and user suppression flags. 
These flags store the status of observations before re-fringing. 
Re-fringing may change observation status: an observation that was 
considered a non-detection in the first fringing may become detected 
in the re-fringing. Post-Solve does not allow changing the status of 
an observation marked as a non-detection. Program gvf_supr_promote
solves this problem. It updates the "not detected" flags. If an 
observation was not detected in the first fringing but was detected 
during re-fringing, the flag "not detected" is cleared and the flag 
"suppressed" is set. Program gvf_supr_promote is a part of 
Post-Solve. It accepts the full database name, including the extension 
.env, as an argument. Alternatively, the same operation 
can be performed with wrapper pu.py. 
Wrapper pu.py has two arguments: experiment 
name and band.
 
  Operations samb, PIMA, and the update of suppression 
flags can be performed with wrapper pr.py. 
Wrapper pr.py requires the experiment name as the 
first argument, the lower-case band as the second argument,
and the SNR limit as the third argument. Depending
on the band, pr.py will select the delay window semi-width. 
The delay semi-width can be overridden with optional argument 
-delwin. Optional flag -nodb
causes the wrapper to update only fringe results without creation of 
a database. This option is necessary when the low band of a dual-band 
experiment is re-fringed.
 
  After the database is updated, it should be processed with Post-Solve 
once more. Some observations that were previously suppressed as outliers
(ideally all) can be restored. Observations that were considered
non-detections, and therefore unrecoverable, appear as "bad" but
recoverable if they were detected during re-fringing. 
Post-Solve program ELIM in restoration mode should be executed and the
database should be updated. Usually, there is no need to make another 
iteration, unless an error has been made that should be corrected. 
 
   parameter --verbosity controls verbosity of the output.
    
          
          
    
   parameter --band specifies the 1-character long 
             band name. If the experiment has two bands, the code for 
             the upper band should be used.
 
 
   parameter --run-level controls which elements 
             or groups of elements of the VLBI data analysis pipeline 
             should be executed. The run-level is either a positive
             number when an elementary run level is specified or 
             a lower-case letter if a compound run level is selected.
             See the next subsection.
 
   If parameter -s was specified, statically linked 
PIMA  will be used.
 
  If the experiment has data from two bands, both bands will be processed
to enable the use of ionosphere-free combinations of group delay 
observables. The upper band should be specified when running 
pir.py.
 
   Limitations:
    
      
    
  The following run levels are supported:
 
         If you are going to image observed sources, you need to prepare files with 2–4
         reference sources at this point. Remember, these files have suffixes 
         _{band}_ref.sou, where {band} is a band.
          
     
     
     
     
 
   PIMA tasks load, 
gean, pmge, and 
bmge are band-independent; other tasks depend 
on the band and should be executed with the appropriate control file. All 
operations in the PIMA pipeline, except tasks  
load, gean, 
pmge, bmge, and 
mkdb, are performed two times: for the lower band and 
for the upper band. They can run concurrently. Bandpass and phase calibration mask 
files are common for both bands. Task mkdb should be 
run for the upper band only. When PIMA finds a value of 
MKDB.2ND_BAND that is the PIMA 
control file for the lower band, it computes the total observables for both 
control files and puts them in the appropriate slots of the GVF database. 
NB: PIMA does not check which band is the upper and 
which band is the lower frequency band; an analyst should define it. If 
PIMA task mkdb is run with the control 
file for the lower band, PIMA will create the GVF 
database with total observables only for that band. For historical reasons Post-Solve 
always marks the upper band as "X" and the lower band as "S" regardless 
of the frequency range. 
 
  
  Coarse fringe fitting, bandpass generation, and fine fringe fitting
are executed two times, for the lower and upper bands. Task 
mkdb is executed once, for the upper band control file 
only. The GVF database created in the dual-band mode contains the data for 
both bands. With VTD/Post-Solve the two bands are processed consecutively: 
first the lower band marked as "S" (Data type "GS"), then the upper band 
marked as "X" (Data type "GX"). Two files with residuals are created: one for the 
upper band and one for the lower band. Then wrapper pr.py
is executed for both bands: first for the lower band and then for the upper 
band. Option -nodb should be used with the wrapper for 
the lower band. This option prevents creation of a database for the lower 
band, since such a database would not have the data for the upper band. 
After wrapper pr.py for the lower band is completed, 
wrapper pr.py for the upper band is executed. During the
next VTD/Post-Solve iteration the lower band and upper band data are re-analyzed. 
After that a linear combination of the upper and lower band data 
(Data type: "G_GXS") is analyzed.
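
The order of the two pr.py calls can be sketched with a short Python helper that only builds the command lines and does not run them. The function name, the experiment name and the band letters are hypothetical; the argument order (experiment, lower-case band, SNR limit) follows the description of pr.py given earlier, while placing -nodb at the end of the command line is an assumption.

```python
def dual_band_refringe_commands(exp, lower_band, upper_band, snr_limit):
    """Return pr.py command lines for a dual-band experiment, in order."""
    return [
        # Lower band first, without creating a database.
        ["pr.py", exp, lower_band.lower(), str(snr_limit), "-nodb"],
        # Upper band second; this run creates the database.
        ["pr.py", exp, upper_band.lower(), str(snr_limit)],
    ]

if __name__ == "__main__":
    for cmd in dual_band_refringe_commands("abc", "S", "X", 5.5):
        print(" ".join(cmd))
```

Returning the commands as lists keeps the sketch free of side effects; they can be passed to subprocess.run one after another once the argument order has been checked against the wrapper's own help.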
 
  Imaging the upper and lower bands is done separately. Gain control files
should be separate for upper and lower bands.
 
  Although the GVF format supports up to 8 bands, as of 2016.05.05
VTD/Post-Solve supports only two bands. An experiment with more than
two frequency bands is processed similarly to a dual-band experiment,
except that the MKDB.2ND_BAND keyword should be NO. 
In that case the keyword MKDB.OUTPUT_NAME that defines 
the database suffix should be different for each band. Otherwise, 
PIMA task mkdb will overwrite
the database of a different band.
 
 
  There are four contents definitions formats that PIMA 
deals with:
 
      
      
      
          
 
          Usage:
 
      
          Usage:
 
      
          Usage:
 
           
 
                    
                   Unrecognized extension is treated as Postscript.
                    
               
               
                    
               
               
               
                    
           
      
          Usage:
 
           
 
                    
                   Unrecognized extension is treated as Postscript.
                    
               
               
                    
               
               
               
               
               
           
      
          Usage:
 
           
 
                    
                   Unrecognized extension is treated as Postscript.
                    
               
               
                    
           
      
          Usage:
 
          Mandatory arguments are the file with self-calibrated visibilities 
          in FITS-UVS format and image in FITS-MAP format. Options:
           
           
               
               
            
 
 Parsing log files 
  This is the most frustrating part of data analysis. If you have 
data from the VLBA, you do not need to parse log files. Parsing
log files from the KVN and VERA is very straightforward. Unfortunately,
parsing log files generated by the Field System developed at the Goddard
Space Flight Center often fails, because the format of the field system
log file is changed without notice, and the developer who maintains
the field system refuses to cooperate.  
     Usage: log_to_antab {mode} {log_file} {antab_file} [year]
  where 
    
  The main difficulty is in extracting system temperature from field system logs.
The parsing software needs to identify the Tsys record, extract the array of Tsys, and
match that array with sky frequencies. It needs to determine the intermediate frequency
with respect to the frequency of the local oscillator, determine the frequency
of the local oscillator, match them, find the tsys record, determine to which BBC
each field in the tsys record belongs, and match the field.
usage: pf.py EEE B logs
    where EEE is the experiment name and B is the band. 
It creates output file EEE_AA.ant, where AA is a two-character 
lower-case antenna name. 
 
 Calibrating the data 
  A FITS-IDI visibility file is supposed to have all calibration information 
inside. However, only the Socorro correlator inserts all calibration information
into FITS-IDI. Visibility files from all other correlators are missing some or
all calibration information.
    
NB: Antenna gain, system temperature, meteorological information and cable
calibration do not change the result of fringe fitting. Therefore, these 
calibrations can be applied after fringe fitting. Phase calibration affects
the result of fringe fitting. Therefore, this kind of calibration is supposed
to be performed before fringe fitting. If you changed phase calibration or 
phase calibration status (pcal_on, pcal_off), you have to redo bandpass
calibration and fringe fitting. Otherwise, you will get wrong results.
 
 Examine raw data and calibration information 
  An analyst must always examine the data and calibration information
as carefully as possible before running fringe fitting. If an error is
not noticed at the initial examination, then the analysis will 
have to be redone. Attentiveness during early examination saves time and
reduces the probability that an error will go unnoticed and will lead
to an erroneous result.
     
 
         
         PCAL:   USE_ALL:NOT_TO_USE:MEDICINA:NYALE13S:RAEGSMAR:YARRA12M
         Here phase calibration from the following stations, MEDICINA, 
         NYALE13S, RAEGSMAR, YARRA12M will not be used.
         
         PCAL:   USE_ALL:to_use:HART15M,KOKEE,WETTZELL
         Here phase calibration only from the following stations, 
         HART15M,KOKEE,WETTZELL will be used provided the phase 
         is available and was not turned off with task gean.
 Running coarse fringe fitting 
  The goal for coarse fringe fitting is preparation for bandpass 
computation and for initial data examination. Coarse fringe fitting
uses single polarization data (RR for dual-polarization data), does
not use bandpass, because usually bandpass is not known at that time,
and uses no oversampling in order to speed up computation, and performs
the parabolic fine fringe search. Therefore, the following parameters 
are always set:
BANDPASS_USE:          NO
BANDPASS_FILE:         NO
POLARCAL_FILE:         NO
FRIB.OVERSAMPLE_MD:    1
FRIB.OVERSAMPLE_RT:    1
FRIB.FINE_SEARCH:      PAR
MKDB.FRINGE_ALGORITHM: DRF
If the bandpass mask file is not available, then 
BANDPASS_MASK_FILE: NO
is set. POLAR: RR is set for dual-polarization or 
RR data and POLAR: LL is set for LL-polarization data. 
Usually PIMA runs in both coarse and fine fringe fitting modes. 
It is desirable to store results of coarse and fine fringe fitting in 
separate files. Therefore, when we run coarse fringe fitting, we set 
FRINGE_FILE and FRIRES_FILE
to different files than those specified in the PIMA control file.
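
The set of keywords used for coarse fringe fitting can be collected in one place, for instance with the Python sketch below. It only formats keyword: value pairs as text; it does not call PIMA. The function name and the output file names in the example are hypothetical, while the keyword names and values are those listed above.

```python
# Keywords that are always set for coarse fringe fitting (from the list above).
COARSE_SETTINGS = {
    "BANDPASS_USE": "NO",
    "BANDPASS_FILE": "NO",
    "POLARCAL_FILE": "NO",
    "FRIB.OVERSAMPLE_MD": "1",
    "FRIB.OVERSAMPLE_RT": "1",
    "FRIB.FINE_SEARCH": "PAR",
    "MKDB.FRINGE_ALGORITHM": "DRF",
}

def coarse_overrides(fringe_file, frires_file, polar="RR", have_mask=True):
    """Return the keyword: value pairs for a coarse fringe fitting run."""
    settings = dict(COARSE_SETTINGS)
    settings["POLAR"] = polar
    # Separate output files keep coarse and fine results apart.
    settings["FRINGE_FILE"] = fringe_file
    settings["FRIRES_FILE"] = frires_file
    if not have_mask:
        settings["BANDPASS_MASK_FILE"] = "NO"
    return [f"{key}: {value}" for key, value in settings.items()]

if __name__ == "__main__":
    # Hypothetical file names for the coarse results.
    for line in coarse_overrides("abc_x_nobps.fri", "abc_x_nobps.frr",
                                 have_mask=False):
        print(line)
```

Wrapper pf.py already performs the coarse run; a listing like this is useful mainly for checking which keywords differ from the control file.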
 Usage: pf.py EEE B coarse 
  where EEE is the experiment name and B is the band.
 Computation of a complex bandpass 
   Computation of the complex bandpass is the second major task that
requires human intervention. The bandpass of an ideal system has a rectangular
shape for the amplitude and zero phase. That means that the cross-correlation
spectrum of a signal from a radio source with a flat continuum spectrum is
also flat, with some constant phase offset and a multiplicative factor
that is proportional to the square root of the product of Tsys at both
stations. Unfortunately, perfect VLBI hardware has not yet been 
developed. The cross-correlation spectrum deviates from the ideal (flat phases
and flat amplitudes). The use of phase calibration may alleviate the deviation
from the ideal spectrum, may have no visible effect, or may even degrade it, but
it never fixes the phases. Therefore, normally a complex function of frequency is 
computed that, being multiplied by the cross-spectrum, makes it flat. This function
can be computed reliably using sources with SNR > 200, but preferably
with SNR > 1000. A good principal investigator does not hesitate to spend a sizable
amount of the allotted time observing bright sources that are used as calibrators.
Bandpass calibration can still be performed using sources with SNR 40–200,
but it is less reliable. The quality of a bandpass derived from observations with SNR
in the range 10–40 is questionable. Bandpass calibrator sources are supposed
to be continuum sources with a spectrum that is flat within an IF. Any active galactic nucleus
satisfies this condition.
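
The SNR guidance above can be summarized as a small Python helper. It is only a restatement of the thresholds in the text, with a hypothetical function name and labels; SNR below 10 is not discussed above, so the helper says so rather than guessing.

```python
def bandpass_calibrator_quality(snr):
    """Classify an observation as a bandpass calibrator by its SNR."""
    if snr > 1000:
        return "preferred"
    if snr > 200:
        return "reliable"
    if snr >= 40:
        return "usable, less reliable"
    if snr >= 10:
        return "questionable"
    return "not covered by the guidance above"

if __name__ == "__main__":
    for snr in (1500.0, 350.0, 80.0, 15.0):
        print(snr, bandpass_calibrator_quality(snr))
```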
 Cleaning phase calibration 
If you use four or more tones of phase calibration per IF, you should first 
clean them and mask out the tones affected by internal radio interference. 
Old VLBA hardware extracted two phase calibration tones per IF. PIMA treats
this case as a single tone per IF. A user can select which tone to use.
If fewer than four tones per IF were extracted in your experiment or you do not
apply multiple phase cal tones per IF, just skip this section.
# PIMA PCAL_MASK_GEN  v  1.00 2015.05.10
#
#  Phase calibration mask definition file for VLBI experiment VEPS02
#
#  Last updated on  2015.05.13_12:35:17
#
PCAL  STA:   KUNMING     IND_FRQ:   1-1   IND_TONE:   30-31  OFF
PCAL  STA:   SESHAN25    IND_FRQ:   8-8   IND_TONE:    4-4   OFF
PCAL  STA:   URUMQI      IND_FRQ:   1-16  IND_TONE:    1-1   OFF
  The first line identifies the format. The file defines the mask that
deselects tones with indices 30–31 of the first IF for station KUNMING,
the tone with index 4 for the 8th IF for station SESHAN25, and the first tone
in all IFs from the 1st to the 16th for station URUMQI. 
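
For checking a mask definition file before passing it to PIMA, the records shown above can be parsed with a few lines of Python. This is a sketch based only on the three example records: the function name and the returned dictionary keys are assumptions, and other record variants that PIMA may accept are not handled.

```python
def parse_pcal_mask_line(line):
    """Parse one line of a phase calibration mask definition file.

    Returns None for blank and comment lines, otherwise a dictionary
    with the station name, IF index range, tone index range and state.
    """
    if not line.strip() or line.lstrip().startswith("#"):
        return None
    words = line.split()
    if len(words) != 8 or words[0] != "PCAL":
        raise ValueError("unexpected record: " + line.rstrip())

    def index_range(text):
        # "30-31" -> (30, 31); both ends are inclusive.
        low, high = text.split("-")
        return int(low), int(high)

    return {
        "station": words[2],
        "ifs": index_range(words[4]),
        "tones": index_range(words[6]),
        "state": words[7],
    }

if __name__ == "__main__":
    record = "PCAL  STA:   KUNMING     IND_FRQ:   1-1   IND_TONE:   30-31  OFF"
    print(parse_pcal_mask_line(record))
```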
pima ru0186_x_pima.cnt pmge mask_gen ru0186_pcal_mask.gen
Comments:
    
Masking auto- and cross-correlation spectral channels
  PIMA allows masking out specified channels of 
either the auto- or cross-spectrum or both. The auto-correlation spectrum 
is corrupted by the presence of internal RFI generated in the vicinity 
of the data acquisition system. Usually the internal RFI causes 
peaks to appear in the auto-correlation spectrum. As a rule of thumb, 
peaks with an amplitude less than 1.5 of the average amplitude can be 
safely ignored, and peaks with an amplitude greater than the average 
by a factor of 2 must be masked out since they noticeably affect 
the fringe fitting procedure and distort the estimate of the average 
phase and the amplitude. Peaks with an amplitude in the range 
[1.2, 2] of the average autocorrelation are borderline.
It is not recommended to mask out auto-correlation at the edge of
the IFs, since during data processing PIMA
interpolates auto-correlation.
# PIMA BPASS_MASK_GEN  v 0.90 2009.02.05
#
#  Control for bandpass mask generation for VLBI experiment BP192B3
#
#  Created on 2015.11.14_12:33:11
#
ALL    STA:   ALL         ON
#
ALL    STA:   ALL         IND_FRQ:   3-3    IND_CHN: 121-129   OFF
CROS   STA:   KP-VLBA     IND_FRQ:   1-2    IND_CHN:   1-512   OFF
AUTC   STA:   FD-VLBA     IND_FRQ:   4-4    IND_CHN: 431-436   OFF
  The first line identifies the format. The first non-comment line
sets the initial mask to 1. The second non-comment line disables both
autocorrelations and cross-correlations for IF #3, spectral
channels 121 through 129. The third line disables cross-correlation
in IFs #1 and #2 (that experiment has 512 spectral channels per IF)
for station KP-VLBA. The fourth line disables autocorrelation for
FD-VLBA in spectral channels from 431 through 436 in IF #4, keeping
cross-correlation. 
  
 Creation complex bandpass in the inspection mode
  Task bpas computes the complex bandpass. This task 
supports 4 modes: INSP (inspection), 
INIT (initial), ACCUM 
(accumulation), and FINE. Modes INIT
and INSP differ only in the generated output: 
bpas in INSP mode generates plots, 
while bpas in INIT mode does not.
 Usage: pf.py exp band obs [-insp]
  When PIMA task bpas is invoked in 
the INSP mode, PIMA runs a cycle over
baselines with the reference station and computes amplitude and phase bandpasses
for the observation with the highest SNR among selected observations of each 
baseline with the reference station. It displays two plots per baseline: 
an amplitude plot and a phase plot. The amplitude plot shows three functions: 
autocorrelation (red), cross-correlation normalized to unity (blue) and the 
bandpass (green). The phase plot shows residual phase (blue) and phase bandpass 
(green).
   
A hint: task bpas in the inspection mode processes 
baselines in alphabetic order. If a certain station requires heavy 
editing of the bandpass generation file, you can run the task with the 
OBS: num_obs qualifier, where num_obs is the index 
of the observation with the highest SNR at the baseline of interest.
 Running fine fringe fitting 
   Fine fringe fitting is the main task. During coarse fringe fitting, the fine
fringe search procedure is disabled in order to speed up the process, and the 
bandpass is not applied. During fine fringe search these simplifications are lifted.
 Usage: pf.py exp band fine
 
   PIMA task  frib creates two ascii 
output files defined in keywords FRINGE_FILE and 
FRIRES_FILE. The first file keeps results of fringe fitting 
and it is used  by other tasks. The latter file with fringe fitting residuals is 
for informational purposes only. 
    
 
Usage: create_fftw_plan method num_threads request_file plan_file
         where method is one of MEASURE or PATIENT, 
         num_threads is the number of threads, request_file
         is the file with dimension definitions and plan_file is the output
         configuration file. PIMA supplies two configuration files
         pima_wis_big.inp  and pima_wis_small.inp . They can be found in the
         $PIMA_DIR/share/pima/ directory, where PIMA_DIR is the environment
         variable with the directory where PIMA has been 
         installed. It is suggested 
         to use pima_wis_big.inp unless you have less than 12 Gb memory.
         In that case you should use pima_wis_small.inp, but you will not
         be able to process efficiently wide field VLBI experiments with high 
         spectral and temporal resolution. The FFTW configuration file depends
         on the number of threads. If you generated the FFTW configuration
         file for N threads, but run PIMA with K threads, the configuration
         file will not be used. PIMA will run, but much 
         slower (by a factor of 2–5). Therefore you have to create several 
         plan files for different numbers of threads. Usually, you use the same 
         number of threads as the number of cores (but you may want to reduce 
         the number of threads if you run PIMA on a busy 
         server). FFTW supports several methods for computing the best  
         configuration file. Method MEASURE is recommended. Method 
         PATIENT is supposed to improve performance, but it may take 
         several days to compute it.
         Examples:
         create_fftw_plan MEASURE 1 $PIMA_DIR/share/pima/pima_wis_big.inp 
                          $PIMA_DIR/share/pima/pima_big_measure_1thr.wis
         create_fftw_plan MEASURE 12 $PIMA_DIR/share/pima/pima_wis_big.inp 
                          $PIMA_DIR/share/pima/pima_big_measure_12thr.wis
         PIMA keyword FFT_CONFIG_FILE 
         defines the FFTW configuration file.
         Keyword FFT_METHOD defines the method that was used 
         for generation of that file (MEASURE or PATIENT). 
         Keyword NUM_THREADS sets the number of threads that 
         PIMA uses for FFTW and some other parallel operations. 
         The FFTW configuration file should be generated with the same number of 
         threads and the same FFTW method, otherwise PIMA 
         performance will be seriously degraded.
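
Since a separate plan file is needed for every number of threads, the command lines can be generated in a loop. The Python sketch below only builds the commands and does not run them. The function name is hypothetical, and the output file naming pattern is inferred from the two examples above, so it should be treated as an assumption.

```python
def fftw_plan_commands(pima_dir, thread_counts, method="MEASURE", size="big"):
    """Build create_fftw_plan command lines, one per thread count."""
    request_file = f"{pima_dir}/share/pima/pima_wis_{size}.inp"
    commands = []
    for num_threads in thread_counts:
        # Naming pattern inferred from the examples in the text.
        plan_file = (f"{pima_dir}/share/pima/"
                     f"pima_{size}_{method.lower()}_{num_threads}thr.wis")
        commands.append(["create_fftw_plan", method, str(num_threads),
                         request_file, plan_file])
    return commands

if __name__ == "__main__":
    # /opt/pima stands in for the value of $PIMA_DIR.
    for cmd in fftw_plan_commands("/opt/pima", [1, 4, 12]):
        print(" ".join(cmd))
```

The plan file that matches the run must then be named in FFT_CONFIG_FILE, with FFT_METHOD and NUM_THREADS set to the same method and thread count.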
         
             
        A plot is written in directory SSSSS/EEE_fpl.
 Export data for astrometry/geodesy solution 
   Fringe results can be transformed to a form that the astrometry/geodesy
software Post-Solve can ingest. Task mkdb reads 
the fringe results file, the fringe residual file, the contents of internal 
PIMA tables that are kept in the SSSSS/EEE.pim 
file, and the contents of visibility files, computes the scan reference time (SRT), 
computes a priori path delays on the SRT, computes total group delays
and phase delay rates on the SRT, sorts them, and writes them into 
database files in either binary Geo VLBI Format (GVF) or plain ascii 
(TEXT) format. In the case of dual-band observations it reads the fringe 
results and fringe residual files for both bands and matches them.
   
   The offset and format of each parameter is specified in the header of the ascii
database.
 Split and export data for imaging 
   PIMA has the capability to format its results in a 
form that is suitable for either absolute astrometry/geodesy or imaging. In the 
latter case PIMA coherently averages the visibilities 
over time and frequency and applies all necessary calibrations and 
re-normalizations. 
 Import gain curves 
  As of 2016, only the NRAO generates visibility data that are
fully compliant with FITS-IDI specifications and contain gain curves.
Data generated by other correlators do not have this information, and
therefore, the gain curves should be imported. Importing gain curves 
is also beneficial for processing VLBA data, since the gain curves embedded
in the database may not be the best ones. 
 Flagging visibilities with low amplitude at the beginning or end 
of a scan. 
  Data acquisition systems often record before and/or after the actual 
scan time. The field system is supposed to record time stamps of the nominal
start and nominal stop time, and the correlator is supposed to flag accumulation
periods that were recorded when the antennas were off-source. However, it may
happen that visibility data for a given scan have intervals when an antenna
was off-source. Usually, this happens at the beginning of the scan. This
may happen because the field system software incorrectly determined the on/off
time, or did not propagate it to the correlator, or the antenna was off-source while
the field system software reported that the antenna was on-source. Propagating such "data" 
poses a serious problem for imaging. Such visibilities should be flagged at the 
imaging stage. If left unflagged, they distort an image. 
PIMA has task onof that analyzes 
the data, finds accumulation periods at the 
beginning and/or the end of a scan whose fringe amplitude is below the threshold, 
and flags them out. This task should run before splt.
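
The idea of flagging only the edges of a scan can be illustrated with a short Python sketch. It is a simplified illustration and not the algorithm implemented in task onof: the function name, the plain list of per-accumulation-period amplitudes and the single fixed threshold are all assumptions.

```python
def flag_scan_edges(amplitudes, threshold):
    """Flag low-amplitude accumulation periods at the edges of a scan.

    amplitudes: fringe amplitude for each accumulation period of a scan
    Returns a list of booleans, True where the period is flagged out.
    """
    n = len(amplitudes)
    flags = [False] * n
    # Walk in from the beginning of the scan while the amplitude is low.
    start = 0
    while start < n and amplitudes[start] < threshold:
        flags[start] = True
        start += 1
    # Walk in from the end of the scan, stopping at the first good period.
    end = n - 1
    while end >= start and amplitudes[end] < threshold:
        flags[end] = True
        end -= 1
    return flags

if __name__ == "__main__":
    amps = [0.1, 0.2, 1.0, 0.3, 1.1, 0.9, 0.2]
    print(flag_scan_edges(amps, 0.5))
```

Low amplitudes in the middle of a scan are deliberately left alone in this sketch, since only the beginning and the end of a scan are treated as candidates for off-source recording.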
 Running task splt for splitting and exporting data for imaging 
  Using results of fringe fitting, PIMA performs 
coherent averaging over time and frequency after rotating the phases according 
to group delays and phase delay rates, applies calibration for system 
temperature, gain curves, bandpass re-normalization, combines all visibilities 
of a given source and writes averaged visibilities and their weights into
output binary files in FITS format that are suitable for imaging with
AIPS or DIFMAP.
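A minimal sketch of coherent averaging after phase rotation is given below. It is only an illustration under stated assumptions: the sign convention of the phase model, the reference time and frequency, and the function name coherent_average are hypothetical and do not describe the internals of task splt, and no amplitude calibration is applied.

```python
import cmath
import math


def coherent_average(vis, times, freqs, group_delay, phase_rate,
                     ref_time, ref_freq):
    """Average vis[i][j] over time and frequency after removing the
    phase model defined by a group delay (s) and a phase delay rate (s/s).
    """
    total = 0j
    count = 0
    for i, t in enumerate(times):
        for j, f in enumerate(freqs):
            # Assumed phase model: the delay term varies across frequency,
            # the rate term varies across time.
            phase = 2.0 * math.pi * (group_delay * (f - ref_freq)
                                     + phase_rate * ref_freq * (t - ref_time))
            total += vis[i][j] * cmath.exp(-1j * phase)
            count += 1
    return total / count


# Synthetic unit-amplitude visibilities with a known delay and rate.
times = [0.0, 1.0, 2.0, 3.0]                       # seconds
freqs = [8.0e9 + k * 16.0e6 for k in range(4)]     # Hz
delay, rate = 5.0e-9, 1.0e-12
vis = [[cmath.exp(2j * math.pi * (delay * (f - freqs[0])
                                  + rate * freqs[0] * (t - times[0])))
        for f in freqs] for t in times]

averaged = coherent_average(vis, times, freqs, delay, rate,
                            times[0], freqs[0])
# Averaging without the rotation loses amplitude because phases disagree.
naive = sum(sum(row) for row in vis) / (len(times) * len(freqs))
```

With the correct delay and rate the rotated visibilities add up in phase, so the averaged amplitude is restored, while the naive average is reduced.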
 Compute gain correction 
   It is rather common for some stations to have IFs whose gain
is wrong by a certain factor. This may be due to an unaccounted change
in the gain curve, or due to a systematic error in Tsys. During the imaging process
the gain can be adjusted using amplitude closure. This procedure is called
amplitude self-calibration. However, a source should be relatively bright
and the UV coverage should be rather dense for amplitude self-calibration
to produce good results. If the gain is off by a factor that is constant 
over the entire experiment, the gain correction determined from imaging one
source can be used as a priori for imaging other sources. This is just
what PIMA task gaco 
(GAin COrrection) does.
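The idea of reusing a constant gain correction can be sketched as follows. This is an illustration only: the convention that the correction is a multiplicative factor per station and IF applied to the visibility amplitude, the table layout, and the name apply_gain_corrections are assumptions, not the format used by task gaco.

```python
def apply_gain_corrections(amplitude, sta1, sta2, if_index, gains):
    """Scale a visibility amplitude by the gain corrections of both stations.

    gains maps (station, if_index) to a multiplicative factor; stations
    or IFs absent from the table are left uncorrected (factor 1.0).
    """
    g1 = gains.get((sta1, if_index), 1.0)
    g2 = gains.get((sta2, if_index), 1.0)
    return amplitude * g1 * g2


# Hypothetical table: IF 2 of station STA_A needs a factor of 1.25,
# as determined from imaging a reference source.
gains = {("STA_A", 2): 1.25}
corrected = apply_gain_corrections(0.8, "STA_A", "STA_B", 2, gains)
untouched = apply_gain_corrections(0.8, "STA_B", "STA_C", 2, gains)
```

Because the factor is assumed constant over the experiment, the same table can be applied to every source.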
 Use case of preparing the data suitable for imaging
  The imaging analysis and absolute astrometry/geodesy pipelines have 
a common beginning: loading the data; parsing log files, checking
logs, checking phase calibration, cleaning phase calibration of
spurious tones, checking autocorrelation, checking Tsys; running
coarse fringe fitting, computation of the bandpass, running
fine fringe fitting. After that point the pipelines diverge.
The next task is to select good reference sources and run task
splt for them. A good reference source is strong, 
observed on all baselines, and has a relatively simple structure.
    gvh_db database_name mode
   where mode is either 10 for processing a single-band experiment 
or the upper band of a dual-band experiment, or 20 for processing
the lower band of a dual-band experiment. The advantage of the second
approach is that detections with SNR as low as 4.8 can be used, since Post-Solve
will eliminate non-detections and other bad detections. Re-fringing
allows weak detections to be recovered. The disadvantage is that Post-Solve 
should be installed, and running astrometry analysis requires extra
effort.
 OPAcity Generation 
 OPAcity Loading 
 Compute TSys MOdel 
       
       
   
                                                                      
Parameter T_o and coefficients of the spline a(t) and b(e) are found by 
iterative non-linear LSQ. Outliers are detected and flagged out during
this procedure. If the tsmo task was run in "if" mode 
before, the input for this procedure is the geometric average 
of Tsys, except those that were previously flagged out, i.e.
T = (Π Tsys(i))^(1/n). Otherwise, Tsys for 
BEG_FRQ is taken. It is strongly recommended first
to run the tsmo task in "if" mode and then
in "elevation" mode, since the latter mode is less robust against 
outliers. Potentially, a large outlier can significantly distort 
the solution.
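The geometric average T = (Π Tsys(i))^(1/n) over the values that were not flagged out can be written as a short sketch. The function name and the representation of flags as a list of booleans are assumptions made for illustration.

```python
import math


def geometric_mean_tsys(tsys_per_if, flagged):
    """Geometric average of Tsys over IFs, skipping flagged values.

    Returns None when every value is flagged.
    """
    good = [t for t, bad in zip(tsys_per_if, flagged) if not bad]
    if not good:
        return None
    # Averaging logarithms is equivalent to taking the n-th root of the
    # product and avoids overflow for long lists.
    return math.exp(sum(math.log(t) for t in good) / len(good))


# Made-up values in K; the third one is an outlier flagged earlier.
t_avg = geometric_mean_tsys([50.0, 200.0, 9999.0], [False, False, True])
```

In this example the flagged outlier does not contribute, so the result is the square root of 50 × 200.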
 Automatic imaging
   In principle, totally automatic imaging is feasible, but developing
such a system would require an order of magnitude more effort than
was invested in PIMA. Therefore,
PIMA provides a partially automatic imaging capability. 
In the framework of the approach implemented in PIMA, 
a user runs fringe fitting, runs the astrometry/geodesy solution with 
Post-Solve, runs task onof, runs 
splt for 2–4 reference sources, produces 
their images manually, runs task gaco, and then 
runs task map of the wrapper script 
pf.py. Task map of the 
pf.py wrapper invokes 
PIMA tasks gain and 
splt, calls DIFMAP in batch mode, generates images 
of all sources, and creates pictures of all imaged sources in gif 
format. The pf.py script puts calibrated visibilities, 
images, self-calibrated visibilities, pictures of source images, and 
pictures of scan-averaged flux densities as a function of baseline
length into directory SSSSS/EEE_uvs, where SSSSS is the 
PIMA scratch directory specified in the keyword 
EXPER_DIR and EEE is the experiment name 
specified in the keyword SESS_CODE. 
For each source with enough usable data, PIMA generates 
the following files:
    
 where B is the band specified in keyword BAND, 
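The location of the output directory SSSSS/EEE_uvs can be derived from the control file. The sketch below parses "KEYWORD: value" pairs so that the last definition wins, as described in the introduction. It is a simplified illustration: treating lines that start with # as comments is an assumption, the special handling of UV_FITS and INTMOD_FILE is ignored, and the directory and experiment names in the example are hypothetical.

```python
def parse_control_text(text):
    """Parse 'KEYWORD: value' lines; a later definition replaces an earlier one."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        # Assumption: blank lines and lines starting with '#' carry no keywords.
        if not line or line.startswith("#"):
            continue
        keyword, sep, value = line.partition(":")
        if not sep:
            continue
        params[keyword.strip()] = value.strip()
    return params


def uvs_directory(params):
    """Build SSSSS/EEE_uvs from keywords EXPER_DIR and SESS_CODE."""
    return params["EXPER_DIR"].rstrip("/") + "/" + params["SESS_CODE"] + "_uvs"


# Hypothetical control file fragment; SESS_CODE is defined twice on purpose.
example = """
SESS_CODE: old_name
EXPER_DIR: /scratch/pima
SESS_CODE: bp192b0
"""
out_dir = uvs_directory(parse_control_text(example))
```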
 Re-fringe the data using results of astrometry/geodesy solution 
  In general, an iteration PIMA → Post-Solve 
→ PIMA is required.
Interactive Post-Solve is used for a) setting up parameterization; 
b) outlier elimination; c) re-weighting; d) setting up constraints. 
Results of Post-Solve are written in the database during operation 
"database update" (Cntrl/U). Observation suppression status is kept in 
the database. This information can be extracted and used for excluding 
suppressed observations by task splt. Post-Solve suppresses 
observations with residuals greater than some limit, typically 
3–4 σ. There are several reasons why an observation may have 
a large residual: a) a deficiency in the theoretical model; b) a low fringe 
rate that results in PIMA selecting the correlation 
between phase-calibration signals; c) PIMA picking up 
a local maximum in the Fourier transform of visibilities. If the residual 
is caused by a deficiency in the ionosphere or atmosphere model, such an 
observation is "bad" for astrometry/geodesy, but good for imaging. If the 
a priori source position used by PIMA had a large 
error, say more than 0.5"–1", a quadratic term in the residual fringe 
phase appears, the group delay is biased, and the SNR is reduced. This problem 
can be alleviated if fringe fitting is repeated with a corrected a priori 
source position. PIMA checks the difference between 
the source position used by the correlator and the a priori position that 
PIMA takes from the supplied catalogue. 
If the difference exceeds a certain threshold, PIMA 
computes a phase correction that compensates for the quadratic term. This usually 
fixes the problem. NB: the dataset should be reloaded in order for a change 
in the a priori source catalogue to take effect. Similarly, if the source 
position used by the correlator was correct, but the source position in 
the PIMA catalogue is wrong, e.g. due to a typo 
or a wrong source association, PIMA will apply 
a wrong correction and may spoil data that are otherwise good.
 samb -p {pima_control_file} -w {window_semi_width_in_nsec} -s {snr_min} 
      -r {residual_file} -o {output_file}
  Parameter pima_control_file is the name of the 
PIMA control file. Parameter 
window_semi_width_in_nsec is the new window for group delay search. 
The recommended value is 5 times the wrms of residuals. Parameter snr_min 
is the new SNR limit. The recommended value is 4.8. Parameter residual_file 
is the file with residuals generated by Post-Solve. Finally, parameter 
output_file specifies the name of the output command file.
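The recommended settings can be turned into a samb command line with a small helper. This is a sketch under assumptions: the function name is hypothetical, the wrms of residuals is assumed to be given in nanoseconds, and the number formatting accepted by samb is not verified here. The file names in the example are placeholders.

```python
def samb_command(control_file, wrms_nsec, residual_file, output_file,
                 snr_min=4.8):
    """Build the argument list for samb with the recommended settings.

    The window semi-width is set to 5 times the wrms of residuals
    (assumed to be expressed in nanoseconds).
    """
    window_nsec = 5.0 * wrms_nsec
    return ["samb",
            "-p", control_file,
            "-w", f"{window_nsec:g}",
            "-s", f"{snr_min:g}",
            "-r", residual_file,
            "-o", output_file]


# Placeholder file names; a wrms of 0.5 ns gives a 2.5 ns window.
cmd = samb_command("exp_pima.cnt", 0.5, "exp_resid.txt", "exp_samb.csh")
```

The returned list can be passed to subprocess.run, or joined with spaces for inspection before running.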
 Data analysis pipeline 
   The recommended pipeline consists of three steps: 1) fringe fitting;
2) astrometry/geodesy; 3) imaging. The astrometry/geodesy step can be used
for imaging analysis or can be skipped.
 Fringe fitting pipeline 
 
   
 
 Pipeline for astrometry/geodesy data analysis 
This pipeline runs after the fringe fitting pipeline.
   
 
 Imaging pipeline 
  It is recommended to run the imaging pipeline after the fringe fitting pipeline
and the astrometry/geodesy pipeline. The latter pipeline allows you to
effectively filter out non-detections and corrupted observations,
e.g. observations where the fringe fitting algorithm found the maximum
that corresponds to the correlation of the phase calibration signal. It is
possible to skip the astrometry/geodesy pipeline, but in that case
you need to screen observations for non-detections or observations
where fringe fitting failed. Setting a higher SNR limit in the range 
6.0–6.5 seems prudent. NB: even one non-detection that slipped
into the dataset may severely distort an image.
   
   The last three steps are performed by the wrapper 
pf.py EEE B map. 
 
 Running the analysis pipeline with pir.py
  Program pir.py is provided to facilitate
running the VLBI analysis pipeline in a semi-automatic fashion.
As of 2021, the fully automated mode is not yet implemented.
However, pir.py substantially reduces the
amount of manual work. It executes elements of the VLBI data 
analysis pipeline. In total, there are 16 elements. Elements 
can be executed separately, in groups, or all together.
   usage: pir.py [-h] [--version] [-v verbosity] [-b band]            
                 [-r run-level] [-s] experiment
 
   where experiment is the experiment code following either NRAO,
         or KVN, or IVS, or KaVA, or EAVN notation.
         
   
      
 
 pir.py run levels 
   Program pir.py splits the VLBI data 
analysis pipeline into a number of run levels. The run levels are supposed 
to be executed in the defined order because results from the previous run 
levels are used for the next run level. The run levels can be elementary 
or compound. Compound run levels combine several elementary run levels. 
The division of the pipeline into elementary and compound run levels 
provides flexibility. For some data analysis scenarios compound run levels 
can be used; for other scenarios additional programs need to be run between 
elementary run levels.
    
       
  The following compound run levels are supported:
   
       
 
 Hints for pir.py use 
   The pipeline execution still requires manual steps. 
pir.py reduces the number of manual operations 
to a minimum and automatically runs the other steps. The following
sequence is recommended:
 
    
 
 
 Processing dual-band observations 
   Dual-band observations are processed separately, band by band. In some cases
it is possible to run fringe fitting over a very wide bandwidth (several
GHz), but that would be called wide-band fringe fitting. Depending
on the correlator setup, dual-band data can be put in one frequency group,
e.g. VLBA S/X observations or observations at remote wings of VLBA C-band
receivers, or be put into two different groups. If the frequency layout is not
known, the following parameters should be set: FRQ_GRP: 
1, BEG_FRQ: 1, 
END_FRQ: 1 before loading the 
experiment in PIMA. After that, a user should examine 
the frequency file SSSSS/EEE.frq created by 
PIMA task load and create 
two PIMA control files for the upper and lower 
bands. The upper band is considered the primary band and the lower band is 
considered the secondary band. Keywords BAND, 
FRQ_GRP, BEG_FRQ, 
END_FRQ should define the band name and the frequency indices 
within the frequency band. The control file of the primary (upper) frequency 
band should define the name of the PIMA control file 
for the secondary (lower) frequency band in the keyword 
MKDB.2ND_BAND. The value of this keyword should be 
NO in the control file for the lower band.
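The band-specific keywords of the two control files can be summarized with a short sketch. Only the keyword names come from the text above; the band names, frequency indices, and control file name in the example are hypothetical and must be taken from the actual frequency file SSSSS/EEE.frq.

```python
def dual_band_keywords(upper, lower, lower_control_file):
    """Return the band-specific keywords for the upper and lower band.

    upper and lower are dictionaries with BAND, FRQ_GRP, BEG_FRQ, END_FRQ.
    The upper (primary) band points to the control file of the lower
    (secondary) band; the lower band sets MKDB.2ND_BAND to NO.
    """
    common = ("BAND", "FRQ_GRP", "BEG_FRQ", "END_FRQ")
    upper_kw = {key: str(upper[key]) for key in common}
    lower_kw = {key: str(lower[key]) for key in common}
    upper_kw["MKDB.2ND_BAND"] = lower_control_file
    lower_kw["MKDB.2ND_BAND"] = "NO"
    return upper_kw, lower_kw


def as_control_lines(keywords):
    """Format keywords as 'KEYWORD: value' lines of a control file."""
    return [f"{key}: {value}" for key, value in keywords.items()]


# Hypothetical layout: both bands in frequency group 1, IFs 1-4 and 5-8.
upper_kw, lower_kw = dual_band_keywords(
    {"BAND": "X", "FRQ_GRP": 1, "BEG_FRQ": 5, "END_FRQ": 8},
    {"BAND": "S", "FRQ_GRP": 1, "BEG_FRQ": 1, "END_FRQ": 4},
    "exp_s_pima.cnt")
upper_lines = as_control_lines(upper_kw)
```

Since PIMA does not support defaults, these lines only override the band-specific part of a complete control file.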
 Auxiliary tools 
   PIMA provides a number of tools for examining
the data.
 
 Antenna log processing tool 
    Program log_to_antab processes input log files generated 
by the Field System software and writes results in PIMA Antab format.
         Usage: log_to_antab mode log_file antab_file  [year]
   There are three mandatory arguments:
    
 
             
          Tools for examining data in FITS-IDI format 
     When you receive the data from the experiment that you intend to
analyze, you first need to examine the data. 
NB: PIMA processes data only in 
FITS-IDI format. PIMA provides several utilities 
that are useful for an initial data check.
    
 
         Usage: fitsh fits_file
         
         Usage: fitsd directory file
         
         Usage: get_source_table_from_fits fits_file
          Tools for manipulation with data in FITS image format 
There are a number of tools for processing image data in FITS image format.
NB: these tools will work with the data generated by 
PIMA, AIPS, and DIFMAP. They may or may not work with 
data generated by other programs. There are two levels of specification: FITS 
and contents definition. The FITS format defines only the data structure at the 
lower level. This information is not sufficient to parse an arbitrary 
FITS file without knowledge of the contents definition specifications. 
     
  The following tools are provided:
     
          uva_merge uva_output input1_uva [input2_uva ...]
          Up to 30 FITS files can be merged. The order of input files
          does not matter. Do not forget that the output file comes first!
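          The argument order of uva_merge is easy to get wrong, so a small
          wrapper may help. The sketch below only builds the argument list; the
          function name is hypothetical, and interpreting the limit of 30 as
          a limit on the number of input files is an assumption.

```python
def uva_merge_command(output_uva, input_uvas):
    """Build the uva_merge argument list with the output file first."""
    inputs = list(input_uvas)
    if not inputs:
        raise ValueError("at least one input file is required")
    if len(inputs) > 30:
        raise ValueError("uva_merge accepts up to 30 files")
    if output_uva in inputs:
        raise ValueError("the output file must differ from the input files")
    # The output file comes first, then the input files in any order.
    return ["uva_merge", output_uva, *inputs]


# Placeholder file names.
merge_cmd = uva_merge_command("merged_uva.fits", ["a_uva.fits", "b_uva.fits"])
```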
          
         
          Usage:  fits_tim_avr input_uva tim_av_sec output_uva
          where the first argument is the input file, the second 
          argument is the averaging interval in seconds, and the 
          third argument is the name of the output file in fits format.
          
         
          fits_to_map [-o output_file] [-box value] [-size code] 
                      [-color code] [-lev value] [-beam code] fits_map_file
          A mandatory argument is map file in FITS-MAP format. Options:
          
             
          
                       
                   
                       
                   
                        
                   Fonts and line width are scaled in order to fit the
                   sizes above. By default, image size is 3,
                   i.e. 160x160 mm.
                   
                        
                        
                        
                   
                        
                   
                        
                        
                        
                        
                   
          fits_to_radplot [-o output_file] [-size code] [-color code] 
                          [-gap time] [-wei T|F] [-cutoff_err value] [-auto] fits_vis_file
          A mandatory argument is the visibility file in FITS-UVA format. Options:
          
             
          
                       
                   
                       
                       
                   
                        
                   Fonts and line width are scaled in order to fit the
                   sizes above. By default, image size is 3,
                   i.e. 160x160 mm.
                   
                        
                        
                        
                   
          fits_to_uvplot [-o output_file] [-size code] [-color code] uva_fits_file
          A mandatory argument is the visibility file in FITS-UVA or FITS-UVS format. Options:
          
             
          
                       
                   
                       
                       
                   
                        
                   Fonts and line width are scaled in order to fit the
                   sizes above. By default, image size is 3,
                   i.e. 160x160 mm.
                   
                        
                        
                        
                   
          fits_to_cfd [-help] [-o output_file] [-wei T|F] [-cutoff_err value] fits_vis_file fits_map_file
          
             
     This document was prepared by Leonid Petrov
     
     Last update:    2022.06.07