3.2. Important concepts and data setup

3.2.1. Catalog type

Modern surveys typically distribute their catalog data across several files, either because of the large data volume or because there are different samples (e.g., with or without spectroscopic redshift). Assuming that these files follow a standard layout for a given survey, in particular with the same column names, we introduce the concept of catalog type: all catalog files that have identical column names, with photometric band names referring to the same filters, belong to the same catalog type.

The catalog type concept is exploited to define the naming convention in the Phosphoros directory structure used to organize files (see the next section). It is also useful for the mapping operation (that connects input photometry to the corresponding filter transmission curves) because this operation can be defined only once for all files of the same catalog type.
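The grouping rule above can be illustrated with a minimal sketch. It is not part of Phosphoros: the function name, file names and column names are hypothetical, and the sketch only compares column names (whether two band names really refer to the same filter cannot be checked from the names alone).

```python
from collections import defaultdict


def group_by_catalog_type(catalog_columns):
    """Group catalog files whose column names are identical.

    catalog_columns maps a file name to its list of column names.
    Returns a sorted list of groups, each a sorted list of file names.
    """
    groups = defaultdict(list)
    for filename, columns in catalog_columns.items():
        # Assumption of this sketch: column order does not matter.
        groups[frozenset(columns)].append(filename)
    return sorted(sorted(files) for files in groups.values())


# Hypothetical files and columns, for illustration only.
example = {
    "survey_a_part1.fits": ["OBJECT_ID", "u_flux", "u_err", "g_flux", "g_err"],
    "survey_a_part2.fits": ["OBJECT_ID", "u_flux", "u_err", "g_flux", "g_err"],
    "survey_b.fits": ["OBJECT_ID", "J_flux", "J_err"],
}
groups = group_by_catalog_type(example)
```

Here the two `survey_a` files share one catalog type, so a single filter mapping would serve both, while `survey_b.fits` needs its own.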

3.2.2. Directory Organization

All data files are organized into a standardized directory structure (see Fig. 3.2). This greatly reduces the amount of configuration required, because the code automatically knows where to look for input data and where to write output files, without necessarily relying on additional user configuration.

Fig. 3.2 Standard directory structure of Phosphoros

In most cases, there is no need to alter the standard directory structure, or even to understand all of its details. The simplest approach is to use the default configuration (see here for changing the standard structure).

3.2.2.1. Phosphoros Root Directory

The root directory is the location of the top-level Phosphoros directory. By default, it is $HOME/Phosphoros. This location can be changed by setting the PHOSPHOROS_ROOT environment variable to a different directory.
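As an illustration of this lookup rule, a script that needs to locate the Phosphoros directories could resolve the root as sketched below. The function name is hypothetical, and treating an empty PHOSPHOROS_ROOT as unset is an assumption of the sketch, not documented Phosphoros behavior.

```python
import os
from pathlib import Path


def phosphoros_root(environ=None):
    """Return the Phosphoros root directory following the documented default.

    environ is a mapping of environment variables (defaults to os.environ);
    passing it explicitly makes the function easy to test.
    """
    env = os.environ if environ is None else environ
    custom = env.get("PHOSPHOROS_ROOT")
    if custom:  # assumption: an empty value is treated as "not set"
        return Path(custom)
    home = Path(env["HOME"]) if "HOME" in env else Path.home()
    return home / "Phosphoros"
```

For example, `phosphoros_root({"HOME": "/home/alice"})` gives `/home/alice/Phosphoros`, while setting `PHOSPHOROS_ROOT` in the mapping overrides that default.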

Below the root directory, there are five main directories:

Catalogs, AuxiliaryData, IntermediateProducts, Results and Config.

3.2.2.2. Catalogs

Input catalog files contain multi-band photometric flux measurements with their errors. Each row refers to a different source. A source ID column (e.g., OBJECT_ID) must be present. The catalog format is either ASCII or FITS, as described in the Catalog format section. Fluxes must be provided in units of \(\mu\)Jy (AB magnitudes are also accepted).
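For reference, fluxes in \(\mu\)Jy and AB magnitudes are related through the standard AB zero point of 3631 Jy. The sketch below shows that relation only; it does not describe how Phosphoros itself converts magnitudes, and the function names are hypothetical.

```python
import math

# Standard AB zero point: 3631 Jy, expressed here in microjansky.
AB_ZERO_POINT_MICROJY = 3631.0e6


def microjy_to_ab_mag(flux_microjy):
    """Convert a flux density in microjansky to an AB magnitude."""
    if flux_microjy <= 0:
        raise ValueError("AB magnitude is undefined for non-positive flux")
    return -2.5 * math.log10(flux_microjy / AB_ZERO_POINT_MICROJY)


def ab_mag_to_microjy(mag):
    """Convert an AB magnitude to a flux density in microjansky."""
    return AB_ZERO_POINT_MICROJY * 10 ** (-0.4 * mag)
```

With this definition a flux of 1 \(\mu\)Jy corresponds to an AB magnitude of about 23.9.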

Input catalogs are placed in sub-directories according to their catalog type. For example, catalogs belonging to the COSMOS catalog type are found in:

> $PHOSPHOROS_ROOT/Catalogs/COSMOS/

Tip

Input catalog files in FITS format can be examined using, for example, the TOPCAT software 1.

3.2.2.3. Auxiliary Data

All input files that are not catalogs are referred to as auxiliary data. They include all the required information to build the grid of models. Auxiliary data needed in the Phosphoros database are:

  • Filter transmission curves, which describe the total system transmission as a function of wavelength, including the telescope optics, the filter itself and the detector efficiency. Values range between 0 and 1. They are typically found in sub-directories of the Filters directory:

    > $PHOSPHOROS_ROOT/AuxiliaryData/Filters/
    

    Filter transmission curves can refer either to photon-counting or to energy-measuring systems. By default, Phosphoros assumes photon-counting systems (see the Auxiliary Data Format section for how to handle energy-measuring systems).

  • Spectral Energy Distribution (SED) templates, which are rest-frame SEDs of galaxies, stars, QSOs, etc. They are expected in erg/s/cm\(^2\)/\(\mathring{\rm A}\), and they are found in sub-directories of:

    > $PHOSPHOROS_ROOT/AuxiliaryData/SEDs/
    

    From the SED templates, Phosphoros generates a grid of modeled photometry that is compared with the photometric measurements.

    Note

    Before generating the model grid, Phosphoros normalizes all SED templates to the solar luminosity at a distance of 10 pc with respect to a given filter transmission (which users can choose). In this way, Phosphoros can provide physically well-defined values of the scale factor in output catalogs.

  • Reddening Curves, which provide the attenuation curves required to compute the intrinsic absorption caused by interstellar dust in galaxies. They are found in:

    > $PHOSPHOROS_ROOT/AuxiliaryData/ReddeningCurves/
    

All these input files must be ASCII tables, with the wavelength in \(\mathring{\rm A}\) as the first column and the corresponding values as the second column.
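A quick sanity check of such a two-column table can be sketched as follows. This is an illustration, not Phosphoros code: the function names and sample values are hypothetical, and treating `#` as a comment marker is an assumption (the exact format is described in the File Format Reference chapter).

```python
def read_two_column_table(text):
    """Parse (wavelength, value) pairs from a two-column ASCII table."""
    wavelengths, values = [], []
    for line in text.splitlines():
        # Assumption of this sketch: '#' starts a comment.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue  # skip blank and comment-only lines
        fields = line.split()
        wavelengths.append(float(fields[0]))  # wavelength in Angstrom
        values.append(float(fields[1]))       # transmission, flux, etc.
    return wavelengths, values


def is_valid_transmission(values):
    """Check that filter transmission values lie between 0 and 1."""
    return all(0.0 <= v <= 1.0 for v in values)


# Hypothetical filter curve, for illustration only.
sample = """\
# wavelength  transmission
3000.0 0.00
3500.0 0.45
4000.0 0.80
4500.0 0.02
"""
wavelengths, transmission = read_two_column_table(sample)
```

The range check applies only to filter transmission curves; SED templates and reddening curves use the same layout but are not restricted to values between 0 and 1.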

Optional functionalities in Phosphoros require additional auxiliary data that are also located in sub-directories of the AuxiliaryData directory.

Information on the auxiliary data format can be found in the File Format Reference chapter.

3.2.2.4. Intermediate Products

Intermediate products are all the relevant files produced by Phosphoros before the execution of the Redshift Estimate step. They can be reused for different runs. Typical intermediate products are the grid of models, the grid of luminosity models, the filter mapping, etc. They are organized per catalog type, e.g. for the Cosmos catalog type:

> $PHOSPHOROS_ROOT/IntermediateProducts/Cosmos/

When using Phosphoros through the GUI, you will never need to open the IntermediateProducts folder. If you use the CLI, you may have to locate files to pass to the next computation step.

3.2.2.5. Results

The main product of Phosphoros is an output source catalog that includes redshift estimates, best-fit models and, optionally, 1D PDFs of model parameters (see the Compute Redshifts section). The file format can be ASCII or FITS. Output data are organized per catalog type, e.g.:

> $PHOSPHOROS_ROOT/Results/Cosmos/
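The per-catalog-type layout of the Catalogs, IntermediateProducts and Results directories can be summarized in a small helper. This is a sketch with hypothetical names, not a Phosphoros API. The catalog type is used verbatim as a directory name, so its capitalization matters on case-sensitive file systems.

```python
from pathlib import Path

# Top-level directories that are organized per catalog type.
PER_TYPE_DIRS = ("Catalogs", "IntermediateProducts", "Results")


def catalog_type_dirs(root, catalog_type):
    """Return the per-catalog-type directories below the Phosphoros root."""
    root = Path(root)
    return {name: root / name / catalog_type for name in PER_TYPE_DIRS}


# Hypothetical root location, for illustration only.
dirs = catalog_type_dirs("/home/alice/Phosphoros", "Cosmos")
```

In this example `dirs["Results"]` points to `/home/alice/Phosphoros/Results/Cosmos`.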

3.2.2.6. Configuration Files

Configuration files include the list of command options required to run Phosphoros executables in the CLI. They are typically found in:

> $PHOSPHOROS_ROOT/config/

This folder also contains the GUI internal configuration ($PHOSPHOROS_ROOT/config/GUI/), which you should not alter by hand.

3.2.3. Phosphoros internal database

In order to make input and auxiliary data available for analysis, they must first be imported into the Phosphoros directory structure. When Phosphoros is launched for the first time, it automatically creates the folder structure under the $PHOSPHOROS_ROOT folder.

The standard procedure is to import input catalogs and auxiliary data files, such as filter transmission curves or SEDs, into the Phosphoros internal database. Operations such as importing, moving and deleting files can be done with shell commands such as cp, mv or rm (or with the GUI, which can import or delete folders). Users can also create or rearrange sub-directories in the Phosphoros structure to match their preferred organization scheme, using the mkdir shell command or the GUI.
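Since importing amounts to plain file operations, the same step can also be scripted. The sketch below is the Python equivalent of a mkdir followed by a cp. The function name and file names are hypothetical, and the demonstration runs inside a temporary directory rather than a real Phosphoros root.

```python
import shutil
import tempfile
from pathlib import Path


def import_catalog(source_file, root, catalog_type):
    """Copy a catalog file into <root>/Catalogs/<catalog_type>/."""
    target_dir = Path(root) / "Catalogs" / catalog_type
    target_dir.mkdir(parents=True, exist_ok=True)  # like `mkdir -p`
    return Path(shutil.copy(source_file, target_dir))  # like `cp`


# Demonstration in a temporary directory (hypothetical file names).
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    source = tmp / "my_catalog.txt"
    source.write_text("OBJECT_ID u_flux u_err\n")
    copied = import_catalog(source, tmp / "Phosphoros", "Cosmos")
    copied_exists = copied.exists()
    copied_relative = copied.relative_to(tmp).as_posix()
```

After the call, the catalog sits under `Phosphoros/Catalogs/Cosmos/`, where files of the Cosmos catalog type are expected.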

The Phosphoros Auxiliary Data Pack can be downloaded from the Phosphoros repository through the GUI (see GUI: Configuration) or the CLI (see CLI: Download Data Pack). Files will be automatically placed in the proper directories. This data pack contains a set of filter transmissions for the main recent UV/optical/IR surveys, the commonly adopted reddening curves and a large library of SED templates (a full description of the data pack can be found in the Data Repository (under construction) chapter).

Much of the data manipulated by Phosphoros can be reused in different analyses. The directory structure described above is designed to keep the input, intermediate and output data files of an arbitrary number of analyses.

Footnotes

1

see http://www.star.bris.ac.uk/~mbt/topcat/