Translating CMIP3 to CMIP5

The script translate_cmip3 converts the CMIP3 archive into a form as close to the DRS specification as possible. This transformation involves both filename and directory structure changes. From the command’s help message:

Usage: translate_cmip3 [options] cmip3_root cmip5_root

Options:
  -h, --help            show this help message and exit
  -i INCLUDE, --include=INCLUDE
                        Include paths matching INCLUDE regular expression
  -e EXCLUDE, --exclude=EXCLUDE
                        Exclude paths matching EXCLUDE regular expression
  -c, --copy            Copy rather than move files
  -d, --dryrun          Emit log messages but don't translate anything
  -l LOGLEVEL, --loglevel=LOGLEVEL
                        Set logging level
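
A dry run is a sensible first step before any files are moved. The following is a minimal sketch of assembling such an invocation from Python, using only the options listed in the help message above. The helper name build_translate_command, the directory paths and the include pattern are hypothetical and not part of drslib.

```python
import shlex


def build_translate_command(cmip3_root, cmip5_root, include=None,
                            exclude=None, copy=False, dryrun=True):
    """Assemble an argument list for translate_cmip3 (hypothetical helper)."""
    cmd = ["translate_cmip3"]
    if include is not None:
        cmd.append("--include=" + include)
    if exclude is not None:
        cmd.append("--exclude=" + exclude)
    if copy:
        cmd.append("--copy")
    if dryrun:
        cmd.append("--dryrun")
    # Positional arguments come last: source root, then destination root.
    cmd += [cmip3_root, cmip5_root]
    return cmd


# Illustrative paths and pattern only.
cmd = build_translate_command("/data/cmip3", "/data/cmip5",
                              include="20c3m/atm/da", copy=True)
print(shlex.join(cmd))
```

Once the dry-run log messages look right, the same list with dryrun=False could be passed to subprocess.run.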

Example

The drslib.cmip3 module implements an API similar to that of drslib.cmip5, so CMIP3 paths can be converted to DRS instances and then written out in CMIP5 DRS format.

>>> from drslib import cmip3, cmip5
>>> cmip3_trans = cmip3.make_translator('cmip3')
>>> cmip5_trans = cmip5.make_translator('http://example.com/cmip5')
>>> drs3 = cmip3_trans.filepath_to_drs('cmip3/20c3m/atm/da/rsus/gfdl_cm2_0/run1/rsus_A2.19610101-19651231.nc')
>>> drs3
<DRS activity='cmip3', product='output', institute='GFDL', model='CM2', experiment='20c3m', frequency='day', realm='atmos', variable='rsus', table='A2', ensemble=(1, None, None), version=1, subset=None, extended='19610101-19651231'>
>>> cmip5_trans.drs_to_filepath(drs3)
'http://example.com/cmip5/output/GFDL/CM2/20c3m/day/atmos/v1/rsus/r1/rsus_A2_CM2_20c3m_r1_19610101-19651231.nc'

CMIP3 DRS components

The CMIP3 activity is cmip3 and all datasets are given the product output. The version component is always v1. Translation of the other DRS components for CMIP3 is described below.

Institute & Model

Institutes and models are given in capital letters, and underscores are converted to dash characters. Capitalisation is chosen to be consistent with the examples given in sections 3.2 and 3.3 of the DRS specification, and dashes are used to avoid ambiguity in DRS filenames, which use underscores as the component separator.

Where the exact encoding is not trivial, the syntax used by the IPCC Data Distribution Centre [DDC] is followed.

[DDC] http://www.ipcc-data.org

CMIP3 directory     Institute   Model
bcc_cm1             CMA         BCC-CM1
bccr_bcm2_0         BCCR        BCM2
cccma_cgcm3_1       CCCMA       CGCM3-1-T47
cccma_cgcm3_1_t63   CCCMA       CGCM3-1-T63
cnrm_cm3            CNRM        CM3
miub_echo_g         MIUB-KMA    ECHO-G
csiro_mk3_0         CSIRO       MK3
csiro_mk3_5         CSIRO       MK3-5
gfdl_cm2_0          GFDL        CM2
gfdl_cm2_1          GFDL        CM2-1
inmcm3_0            INM         CM3
ipsl_cm4            IPSL        CM4
iap_fgoals1_0_g     LASG        FGOALS-G1-0
mpi_echam5          MPIM        ECHAM5
mri_cgcm2_3_2a      MRI         CGCM2-3-2
giss_aom            NASA        GISS-AOM
giss_model_e_h      NASA        GISS-EH
giss_model_e_r      NASA        GISS-ER
ncar_ccsm3_0        NCAR        CCSM3
ncar_pcm1           NCAR        PCM
miroc3_2_hires      NIES        MIROC3-2-HI
miroc3_2_medres     NIES        MIROC3-2-MED
ukmo_hadcm3         UKMO        HADCM3
ukmo_hadgem1        UKMO        HADGEM1
ingv_echam4         INGV        ECHAM4
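
Because the encoding is not a mechanical transformation of the directory name, a lookup table is the natural representation. The following is a minimal sketch, not drslib's own implementation: the names CMIP3_MODELS and institute_and_model are hypothetical, and only a handful of rows are included.

```python
# A few rows of the institute/model mapping; not the complete table.
CMIP3_MODELS = {
    "bcc_cm1": ("CMA", "BCC-CM1"),
    "cccma_cgcm3_1": ("CCCMA", "CGCM3-1-T47"),
    "gfdl_cm2_0": ("GFDL", "CM2"),
    "ipsl_cm4": ("IPSL", "CM4"),
    "miroc3_2_hires": ("NIES", "MIROC3-2-HI"),
}


def institute_and_model(cmip3_dir):
    """Return (institute, model) for a CMIP3 model directory name."""
    try:
        return CMIP3_MODELS[cmip3_dir]
    except KeyError:
        # Unknown directories are reported rather than guessed at.
        raise ValueError("unknown CMIP3 model directory: %r" % cmip3_dir)


print(institute_and_model("gfdl_cm2_0"))
```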

Experiment

The experiment component remains unchanged from the CMIP3 archive structure, except that its position in the tree changes to match the DRS specification.

Frequency

The CMIP3 frequency specifiers are translated into those described in the DRS specification as follows:

CMIP3   DRS
yr      yr
mo      mon
da      day
3h      3hr
fixed   fx
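
The frequency translation amounts to a fixed dictionary lookup. A minimal sketch, with hypothetical names that are not taken from drslib:

```python
# CMIP3 frequency directory name -> DRS frequency, as tabulated above.
CMIP3_TO_DRS_FREQUENCY = {
    "yr": "yr",
    "mo": "mon",
    "da": "day",
    "3h": "3hr",
    "fixed": "fx",
}


def translate_frequency(cmip3_frequency):
    """Return the DRS frequency for a CMIP3 frequency specifier."""
    return CMIP3_TO_DRS_FREQUENCY[cmip3_frequency]


print(translate_frequency("da"))
```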

Modelling-realm

We map CMIP3 realms onto equivalent CMIP5 realms. In some cases this mapping also depends on the variable. The mapping is defined in the table below:

CMIP3 realm   Variable   DRS realm
atm           mrsos      land
atm           trsult     aerosol
atm           trsul      aerosol
atm           tro3       atmosChem
atm           *          atmos
ice           *          seaIce
land          sftgif     landIce
land          *          land
ocn           *          ocean
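
One way to read this table is as an ordered rule list in which variable-specific rows take precedence over the * row for the same realm. The sketch below assumes that first-match-wins reading; the names REALM_RULES and translate_realm are hypothetical and this is not drslib's own code.

```python
# (CMIP3 realm, variable or "*", DRS realm), in table order.
REALM_RULES = [
    ("atm", "mrsos", "land"),
    ("atm", "trsult", "aerosol"),
    ("atm", "trsul", "aerosol"),
    ("atm", "tro3", "atmosChem"),
    ("atm", "*", "atmos"),
    ("ice", "*", "seaIce"),
    ("land", "sftgif", "landIce"),
    ("land", "*", "land"),
    ("ocn", "*", "ocean"),
]


def translate_realm(cmip3_realm, variable):
    """Return the DRS realm for a CMIP3 realm and variable name."""
    for realm, rule_variable, drs_realm in REALM_RULES:
        # Assumption: the first matching row wins; "*" matches any variable.
        if realm == cmip3_realm and rule_variable in ("*", variable):
            return drs_realm
    raise ValueError("unknown CMIP3 realm: %r" % cmip3_realm)


print(translate_realm("atm", "tro3"))
print(translate_realm("atm", "rsus"))
```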

Variable name

Variable names are left unchanged.

Ensemble member

The encoding run<N> is translated into r<N>.
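
A minimal sketch of this rewrite using a regular expression; the function name translate_ensemble is hypothetical and not part of drslib:

```python
import re


def translate_ensemble(run_dir):
    """Translate a CMIP3 run directory such as 'run1' into 'r1'."""
    match = re.fullmatch(r"run(\d+)", run_dir)
    if match is None:
        raise ValueError("not a CMIP3 run directory: %r" % run_dir)
    return "r" + match.group(1)


print(translate_ensemble("run4"))
```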

Subset and extended path

Although most filenames in the CMIP3 archive follow a consistent syntax, there are enough exceptions to make complete adherence to the DRS specification impractical. Instead, translate_cmip3 attempts to extract the variable and MIP table name from the CMIP3 path and constructs an approximate DRS filename of the form:

<variable>_<mip-table>_<model>_<experiment>_<ensemble-member>_<extended>.nc

where <extended> is the unparsed portion of the filename, which may contain a temporal subset or may be irregular. Some examples are given below:

/20c3m/atm/da/rsus/gfdl_cm2_0/run1/rsus_A2.19610101-19651231.nc --> rsus_A2_CM2_20c3m_r1_19610101-19651231.nc
/1pctto2x/atm/mo/rlftoaa_co2/ipsl_cm4/run1/rlftoaa_co2_A5_1860-1869.nc --> rlftoaa_co2_A5_CM4_1pctto2x_r1_1860-1869.nc
/2xco2/land/fixed/orog/miroc3_2_hires/run1/orog_A1.nc --> orog_A1_MIROC3-2-HI_2xco2_r1.nc
/sresa1b/atm/mo/rlut/cccma_cgcm3_1/run4/rlut_a1_sresa1b_4_cgcm3.1_t47_2001_2100.nc --> rlut_a1_CGCM3-1-T47_sresa1b_r4_sresa1b_4_cgcm3.1_t47_2001_2100.nc
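
The filename pattern above can be sketched as follows. This is an illustration that reproduces the four examples, not drslib's actual parser: the function names are hypothetical, and the sketch assumes that the variable name is already known from the directory path and that the MIP table is the alphanumeric token immediately following it.

```python
import re


def parse_cmip3_filename(filename, variable):
    """Split a CMIP3 filename into (mip_table, extended).

    Assumes the name starts with "<variable>_<table>", optionally followed
    by "." or "_" and an unparsed remainder, and ends in ".nc".
    """
    pattern = re.escape(variable) + r"_([A-Za-z0-9]+)(?:[._](.+))?\.nc"
    match = re.fullmatch(pattern, filename)
    if match is None:
        raise ValueError("unrecognised CMIP3 filename: %r" % filename)
    # group(2) is None when there is nothing after the table name.
    return match.group(1), match.group(2)


def approximate_drs_filename(variable, table, model, experiment, ensemble,
                             extended=None):
    """Join the components with underscores, omitting an empty <extended>."""
    parts = [variable, table, model, experiment, ensemble]
    if extended:
        parts.append(extended)
    return "_".join(parts) + ".nc"


table, extended = parse_cmip3_filename("rsus_A2.19610101-19651231.nc", "rsus")
print(approximate_drs_filename("rsus", table, "CM2", "20c3m", "r1", extended))
```

Passing the variable name in explicitly avoids guessing where it ends, which matters for names such as rlftoaa_co2 that themselves contain an underscore.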