CMIP5 product detection

drslib includes an algorithm, developed by Martin Juckes, for detecting the required CMIP5 product from filenames generated by CMOR2. The module drslib.p_cmip5 provides an API to the algorithm, and drs_tool uses it to allocate a product for incoming data when invoked with the --detect-product option.

A detailed discussion of the algorithm used is available as the document Identifying the product [PDF].

drs_tool configuration

Product detection requires extra configuration describing the CMIP5 experiments and the data being processed. Before using the product detection feature for the first time, initialise the CMIP5 experiment description data using drs_tool init:

$ drs_tool init --shelve-dir=DIR

where DIR is a directory suitable for containing the data files. On an ESG datanode a suitable command might be:

$ drs_tool init --shelve-dir=/usr/local/share/p_cmip5/data

In addition to the shelve directory, p_cmip5 requires information about the model output being processed, supplied in an external configuration file (see "The configuration file" below).

Both these parameters can be configured using metaconfig as follows:

[metaconfig]
configs = drslib

[drslib:p_cmip5]
shelve-dir = /usr/local/share/p_cmip5/data
config = /usr/local/share/p_cmip5/model.ini

The configuration file

A configuration file is required by the p_cmip5 module (which assigns the data to DRS product=output1 or output2) for processing piControl data. It may optionally contain additional information that ensures consistency of the assignment for some other datasets (details below).

The configuration file is in standard ini-file format and should contain one section for each model name you want to process. Within each section, a set of option name/value pairs specifies the relevant properties of the model.

An Example

For instance, the definitions for two models, HADCM3 and HIGEM1-2, could look as follows:

[HADCM3]

category=centennial
branch_year_piControl_to_historical=1820
base_year_historical=1850
branch_year_esmControl_to_esmHistorical=1850
base_year_esmHistorical=1850

[HIGEM1-2]

category=other
branch_year_piControl_to_historical=1820
base_year_historical=1850
branch_year_esmControl_to_esmHistorical=1850
base_year_esmHistorical=1850
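
Because the file is in standard ini format, it can be inspected with Python's standard-library configparser before it is passed to drs_tool. The following is a minimal sketch and not part of drslib: the helper name load_model_config is hypothetical, and the sketch assumes that every option other than category holds a plain integer year, as described in the option list on this page.

```python
import configparser

# A shortened copy of the example above, embedded so the sketch is self-contained.
EXAMPLE = """
[HADCM3]
category=centennial
branch_year_piControl_to_historical=1820
base_year_historical=1850

[HIGEM1-2]
category=other
branch_year_piControl_to_historical=1820
base_year_historical=1850
"""


def load_model_config(text):
    """Parse ini text into {model: {option: value}} (hypothetical helper)."""
    parser = configparser.ConfigParser()
    # configparser lower-cases option names by default; keep the mixed-case
    # names (e.g. branch_year_piControl_to_historical) as documented.
    parser.optionxform = str
    parser.read_string(text)
    models = {}
    for model in parser.sections():
        options = {}
        for name, value in parser.items(model):
            # Assumption: everything except 'category' is an integer year.
            options[name] = value if name == "category" else int(value)
        models[model] = options
    return models


models = load_model_config(EXAMPLE)
for model, options in models.items():
    print(model, options["category"], options["base_year_historical"])
```

Each section name becomes a key, so a model that is missing from the file shows up as a missing key rather than a silent default.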

Option names

1. category

Value: either ‘centennial’ or ‘other’
Description: The category specifies which suite of experiments the model is being used for. This information is used to determine which data should be prioritised for quality control and DOI assignment. The aim is to ensure consistency between experiments and between modelling groups. If a model is used for both centennial and decadal experiments, specify ‘centennial’.

2. branch_year_piControl_to_historical

Value: integer
Description: The year of the piControl data used to initiate the historical run.

Required if piControl data for tables aero, day or 6hrPlev is archived. This information can be determined from the global attribute “branch” in the historical data files and the base year from the time units of the piControl experiment.

3. base_year_historical

Value: integer
Description: The year of the start of the historical run. This is required because it is needed when processing piControl data, and so cannot generally be obtained from the data files being processed.

4. branch_year_esmControl_to_esmHistorical

Value: integer
Description: The year of the esmControl data used to initiate the esmHistorical run. See the notes on 2. branch_year_piControl_to_historical.

5. base_year_esmHistorical

Value: integer
Description: The year of the start of the esmHistorical experiment. See the notes on 3. base_year_historical.

6. base_year_abrupt4xCO2 [optional]

Value: integer
Description: The year of the start of the abrupt4xCO2 run. Used for processing abrupt4xCO2 data. Only needed if the base year specified by the time units does not correspond to the start of the experiment.

7. base_year_piControl [optional]

Value: integer
Description: The year of the start of the piControl run. Only needed if the base year specified by the time units does not correspond to the start of the experiment.

Used in determining which years of data from the aero table of the piControl experiment in the decadal suite are replicated.

8. base_year_1pctCO2 [optional]

Value: integer
Description: The year of the start of the 1pctCO2 run. Only needed if the base year specified by the time units does not correspond to the start of the experiment.
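
The option rules listed above can be checked mechanically before running drs_tool. The sketch below is illustrative only and is not how drslib validates its input: check_model_section is a hypothetical helper, it assumes category is mandatory (it is not marked optional above), and it reports any option name not listed on this page as undocumented.

```python
import configparser

VALID_CATEGORIES = {"centennial", "other"}

# Integer-valued options documented on this page.
YEAR_OPTIONS = {
    "branch_year_piControl_to_historical",
    "base_year_historical",
    "branch_year_esmControl_to_esmHistorical",
    "base_year_esmHistorical",
    "base_year_abrupt4xCO2",
    "base_year_piControl",
    "base_year_1pctCO2",
}


def check_model_section(section):
    """Return a list of problems found in one model section (hypothetical helper)."""
    problems = []
    category = section.get("category")
    if category not in VALID_CATEGORIES:
        problems.append("category must be 'centennial' or 'other', got %r" % (category,))
    for name, value in section.items():
        if name == "category":
            continue
        if name not in YEAR_OPTIONS:
            problems.append("undocumented option %r" % (name,))
        elif not value.strip().lstrip("-").isdigit():
            problems.append("%s must be an integer, got %r" % (name, value))
    return problems


parser = configparser.ConfigParser()
parser.optionxform = str  # keep mixed-case option names as documented
# Deliberately broken input: a misspelt category and a letter O in the year.
parser.read_string("""
[HADCM3]
category=centenial
base_year_historical=185O
""")

for model in parser.sections():
    for problem in check_model_section(parser[model]):
        print(model + ":", problem)
```

Returning a list of problems instead of raising on the first one lets every mistake in a section be reported in a single pass.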

Invoking drs_tool with product detection

drs_tool will allocate a product to files in the incoming directory when invoked with the --detect-product option.

drs_tool list will show the product deduced in each dataset_id and list datasets for which the product could not be determined as incomplete. drs_tool todo and drs_tool upgrade will only operate on datasets for which the product could be determined.

In the following example data from the UK Met Office Hadley Centre is processed into the DRS hierarchy.

$ drs_tool list -I ./mohc_holding/ -R ./cmip5 --detect-product
...
[INFO] drslib.p_cmip5: Product deduced as output1, selected years [112/56] assigned to output1
[INFO] drslib.p_cmip5: Deducing product for <DRS cmip5.%.MOHC.HadGEM2-ES.historical.6hr.atmos.6hrPlev.r1i1p1.%.va.2001120106-2002120100>
[INFO] drslib.p_cmip5: Product deduced as output1, selected years [112/56] assigned to output1
[INFO] drslib.p_cmip5: Deducing product for <DRS cmip5.%.MOHC.HadGEM2-ES.historical.6hr.atmos.6hrPlev.r1i1p1.%.va.2002120106-2003120100>
[INFO] drslib.p_cmip5: Product deduced as output1, selected years [112/56] assigned to output1
[INFO] drslib.p_cmip5: Deducing product for <DRS cmip5.%.MOHC.HadGEM2-ES.historical.6hr.atmos.6hrPlev.r1i1p1.%.va.2003120106-2004120100>
[INFO] drslib.p_cmip5: Product deduced as output1, selected years [112/56] assigned to output1
[INFO] drslib.p_cmip5: Deducing product for <DRS cmip5.%.MOHC.HadGEM2-ES.historical.6hr.atmos.6hrPlev.r1i1p1.%.va.2004120106-2005120100>
[INFO] drslib.p_cmip5: Product deduced as output1, selected years [112/56] assigned to output1
==============================================================================
DRS Tree at .
------------------------------------------------------------------------------
cmip5.output1.MOHC.HadGEM2-ES.historical.6hr.atmos.6hrLev.r1i1p1        0:0 1120:1371721676416
cmip5.output1.MOHC.HadGEM2-ES.historical.6hr.atmos.6hrPlev.r1i1p1       0:0 224:116004074368
------------------------------------------------------------------------------
2 datasets awaiting upgrade
==============================================================================

# Select the 6hrPlev dataset for publishing and check commands to be issued
$ drs_tool todo -I ./mohc_holding/ -R . --detect-product cmip5.%.%.%.%.%.%.6hrPlev
==============================================================================
DRS Tree at .
------------------------------------------------------------------------------
Publisher Tree cmip5.output1.MOHC.HadGEM2-ES.historical.6hr.atmos.6hrPlev.r1i1p1 todo for version 20101006

mv ./mohc_holding/psl_6hrPlev_HadGEM2-ES_historical_r1i1p1_194912010600-195012010000.nc ./cmip5/output1/MOHC/HadGEM2-ES/historical/6hr/atmos/6hrPlev/r1i1p1/files/psl_20101006/psl_6hrPlev_HadGEM2-ES_historical_r1i1p1_194912010600-195012010000.nc
ln -s ./cmip5/output1/MOHC/HadGEM2-ES/historical/6hr/atmos/6hrPlev/r1i1p1/files/psl_20101006/psl_6hrPlev_HadGEM2-ES_historical_r1i1p1_194912010600-195012010000.nc ./cmip5/output1/MOHC/HadGEM2-ES/historical/6hr/atmos/6hrPlev/r1i1p1/v20101006/psl/psl_6hrPlev_HadGEM2-ES_historical_r1i1p1_194912010600-195012010000.nc
...
mv ./mohc_holding/va_6hrPlev_HadGEM2-ES_historical_r1i1p1_200412010600-200512010000.nc ./cmip5/output1/MOHC/HadGEM2-ES/historical/6hr/atmos/6hrPlev/r1i1p1/files/va_20101006/va_6hrPlev_HadGEM2-ES_historical_r1i1p1_200412010600-200512010000.nc
ln -s ./cmip5/output1/MOHC/HadGEM2-ES/historical/6hr/atmos/6hrPlev/r1i1p1/files/va_20101006/va_6hrPlev_HadGEM2-ES_historical_r1i1p1_200412010600-200512010000.nc ./cmip5/output1/MOHC/HadGEM2-ES/historical/6hr/atmos/6hrPlev/r1i1p1/v20101006/va/va_6hrPlev_HadGEM2-ES_historical_r1i1p1_200412010600-200512010000.nc
==============================================================================

# Do the upgrade
$ drs_tool upgrade -I ./mohc_holding/ -R . --detect-product cmip5.%.%.%.%.%.%.6hrPlev
...

# List the results.
# Not including --detect-product lists datasets that are incompletely specified
$ drs_tool list -I ./mohc_holding -R mohc_dryrun
==============================================================================
DRS Tree at mohc_dryrun
------------------------------------------------------------------------------
cmip5.output1.MOHC.HadGEM2-ES.historical.6hr.atmos.6hrPlev.r1i1p1.v20101005  224:116004074368
------------------------------------------------------------------------------
Incompletely specified incoming datasets
------------------------------------------------------------------------------
cmip5.%.MOHC.HadGEM2-ES.historical.6hr.atmos.6hrLev.r1i1p1
==============================================================================
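
In the listings above, the product appears as the second dot-separated component of each dataset identifier, and % stands in for a component that has not been determined. A script that post-processes such listings could separate complete from incomplete identifiers along these lines. This is a sketch under that assumption about component order, taken only from the example output; product_of is a hypothetical helper and not part of drslib.

```python
def product_of(dataset_id):
    """Return the product component, or None when it is the '%' wildcard."""
    parts = dataset_id.split(".")
    # Assumption from the example output: activity first, product second.
    product = parts[1]
    return None if product == "%" else product


# Identifiers copied from the final listing above.
listed = [
    "cmip5.output1.MOHC.HadGEM2-ES.historical.6hr.atmos.6hrPlev.r1i1p1.v20101005",
    "cmip5.%.MOHC.HadGEM2-ES.historical.6hr.atmos.6hrLev.r1i1p1",
]

incomplete = [d for d in listed if product_of(d) is None]
for dataset_id in incomplete:
    print("product not determined:", dataset_id)
```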