=======================
CMIP5 product detection
=======================

drslib includes an algorithm, developed by Martin Juckes, for
detecting the required CMIP5 product from filenames generated by
CMOR2_.  The module :mod:`drslib.p_cmip5` contains an API to the
algorithm, and ``drs_tool`` uses the algorithm to allocate a product
for incoming data when invoked with the ``--detect-product`` option.

A detailed discussion of the algorithm is available in the document
`Identifying the product`_ [PDF].

.. _`Identifying the product`: doc/requested_subset_decision_tree_v0_5.pdf

``drs_tool`` configuration
==========================

Product detection requires extra information to be configured about
the CMIP5 experiment and the data being processed.  Before first using
the product detection feature you should initialise the CMIP5
experiment description data using ``drs_tool init``:

.. code-block:: bash

  $ drs_tool init --shelve-dir=DIR

where ``DIR`` is a directory suitable for containing the data files.
On an ESG datanode a suitable command might be:

.. code-block:: bash

  $ drs_tool init --shelve-dir=/usr/local/share/p_cmip5/data

In addition to the shelve directory, ``p_cmip5`` requires information
about the model output being processed, supplied in an external
configuration file (see `Input configuration file`_).  Both of these
parameters can be configured using metaconfig as follows:

.. code-block:: ini

  [metaconfig]
  configs = drslib

  [drslib:p_cmip5]
  shelve-dir = /usr/local/share/p_cmip5/data
  config = /usr/local/share/p_cmip5/model.ini

.. include:: p_cmip5/configuration.rst

Invoking drs_tool with product detection
========================================

``drs_tool`` will allocate a product to files in the incoming
directory when invoked with the ``--detect-product`` option.
``drs_tool list`` will show the product deduced in each dataset_id and
list datasets for which the product could not be determined as
incomplete.
``drs_tool todo`` and ``drs_tool upgrade`` will only operate on
datasets for which the product could be determined.

In the following example data from the UK Met Office Hadley Centre is
processed into the DRS hierarchy.

.. code-block:: bash

  $ drs_tool list -I ./mohc_holding/ -R ./cmip5 --detect-product
  ...
  [INFO] drslib.p_cmip5: Product deduced as output1, selected years [112/56] assigned to output1
  [INFO] drslib.p_cmip5: Deducing product for
  [INFO] drslib.p_cmip5: Product deduced as output1, selected years [112/56] assigned to output1
  [INFO] drslib.p_cmip5: Deducing product for
  [INFO] drslib.p_cmip5: Product deduced as output1, selected years [112/56] assigned to output1
  [INFO] drslib.p_cmip5: Deducing product for
  [INFO] drslib.p_cmip5: Product deduced as output1, selected years [112/56] assigned to output1
  [INFO] drslib.p_cmip5: Deducing product for
  [INFO] drslib.p_cmip5: Product deduced as output1, selected years [112/56] assigned to output1
  ==============================================================================
  DRS Tree at .
  ------------------------------------------------------------------------------
  cmip5.output1.MOHC.HadGEM2-ES.historical.6hr.atmos.6hrLev.r1i1p1    0:0    1120:1371721676416
  cmip5.output1.MOHC.HadGEM2-ES.historical.6hr.atmos.6hrPlev.r1i1p1   0:0    224:116004074368
  ------------------------------------------------------------------------------
  2 datasets awaiting upgrade
  ==============================================================================

  # Select the 6hrPlev dataset for publishing and check commands to be issued
  $ drs_tool todo -I ./mohc_holding/ -R . --detect-product cmip5.%.%.%.%.%.%.6hrPlev
  ==============================================================================
  DRS Tree at .
  ------------------------------------------------------------------------------
  Publisher Tree cmip5.output1.MOHC.HadGEM2-ES.historical.6hr.atmos.6hrPlev.r1i1p1 todo for version 20101006

  mv ./mohc_holding/psl_6hrPlev_HadGEM2-ES_historical_r1i1p1_194912010600-195012010000.nc ./cmip5/output1/MOHC/HadGEM2-ES/historical/6hr/atmos/6hrPlev/r1i1p1/files/psl_20101006/psl_6hrPlev_HadGEM2-ES_historical_r1i1p1_194912010600-195012010000.nc
  ln -s ./cmip5/output1/MOHC/HadGEM2-ES/historical/6hr/atmos/6hrPlev/r1i1p1/files/psl_20101006/psl_6hrPlev_HadGEM2-ES_historical_r1i1p1_194912010600-195012010000.nc ./cmip5/output1/MOHC/HadGEM2-ES/historical/6hr/atmos/6hrPlev/r1i1p1/v20101006/psl/psl_6hrPlev_HadGEM2-ES_historical_r1i1p1_194912010600-195012010000.nc
  ...
  mv ./mohc_holding/va_6hrPlev_HadGEM2-ES_historical_r1i1p1_200412010600-200512010000.nc ./cmip5/output1/MOHC/HadGEM2-ES/historical/6hr/atmos/6hrPlev/r1i1p1/files/va_20101006/va_6hrPlev_HadGEM2-ES_historical_r1i1p1_200412010600-200512010000.nc
  ln -s ./cmip5/output1/MOHC/HadGEM2-ES/historical/6hr/atmos/6hrPlev/r1i1p1/files/va_20101006/va_6hrPlev_HadGEM2-ES_historical_r1i1p1_200412010600-200512010000.nc ./cmip5/output1/MOHC/HadGEM2-ES/historical/6hr/atmos/6hrPlev/r1i1p1/v20101006/va/va_6hrPlev_HadGEM2-ES_historical_r1i1p1_200412010600-200512010000.nc
  ==============================================================================

  # Do the upgrade
  $ drs_tool upgrade -I ./mohc_holding/ -R . --detect-product cmip5.%.%.%.%.%.%.6hrPlev
  ...

  # List the results.
  # Not including --detect-product lists datasets that are incompletely specified
  $ drs_tool list -I ./mohc_holding -R mohc_dryrun
  ==============================================================================
  DRS Tree at mohc_dryrun
  ------------------------------------------------------------------------------
  cmip5.output1.MOHC.HadGEM2-ES.historical.6hr.atmos.6hrPlev.r1i1p1.v20101005    224:116004074368
  ------------------------------------------------------------------------------
  Incompletely specified incoming datasets
  ------------------------------------------------------------------------------
  cmip5.%.MOHC.HadGEM2-ES.historical.6hr.atmos.6hrLev.r1i1p1
  ==============================================================================

.. _CMOR2: http://www2-pcmdi.llnl.gov/cmor/documentation/
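
Product detection starts from the CMOR2 filename, so it can help to
see which components such a filename carries.  The following is a
minimal sketch using only the Python standard library; it is not part
of drslib and does not reproduce the detection algorithm.  The names
``CmorName`` and ``parse_cmor_filename`` are hypothetical, and the
sketch assumes the underscore-separated layout seen in the example
filenames above (variable, MIP table, model, experiment, ensemble
member and an optional time range).

.. code-block:: python

  import os
  from collections import namedtuple

  # Hypothetical container for the underscore-separated filename parts.
  CmorName = namedtuple(
      "CmorName", "variable table model experiment ensemble time_range")


  def parse_cmor_filename(path):
      """Split a CMOR2-style filename into its components.

      Illustrative helper only; it is not a drslib function.
      """
      stem, ext = os.path.splitext(os.path.basename(path))
      if ext != ".nc":
          raise ValueError("expected a .nc file: %r" % (path,))
      parts = stem.split("_")
      if len(parts) == 5:
          # Assumed case: a time-independent file with no time range.
          parts.append(None)
      if len(parts) != 6:
          raise ValueError("unexpected filename layout: %r" % (path,))
      return CmorName(*parts)


  name = parse_cmor_filename(
      "./mohc_holding/"
      "psl_6hrPlev_HadGEM2-ES_historical_r1i1p1_"
      "194912010600-195012010000.nc")
  print("%s %s %s %s %s" % (name.variable, name.table, name.model,
                            name.experiment, name.ensemble))

The components recovered here (``6hrPlev``, ``HadGEM2-ES``,
``historical``, ``r1i1p1``) match those in the dataset identifiers
listed by ``drs_tool``.  The product itself is not encoded in the
filename, which is why the ``--detect-product`` option is needed.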