ECLIS Plugins : tuning CNRM-CM to your needs

Want to have vegetation evolve your way ? Or to feed your monitoring graphs during the run ? You can easily include your own script sequences on top of the CNRM-CM scripts.

Article published online on 25 April 2012

last modified on 10 September 2013

by senesi

From version V5.4 on, the plugin feature of ECLIS allows you to largely tune CNRM-CM runs to your needs, so that you can cope with cases which were not explicitly anticipated when ECLIS was designed. This applies to both the coupled and forced modes. From V5.5 on, it also applies to the Nemo forced mode.
Section ’Plugin sections and their role’ is of interest even if you are only a plugin user, as it improves your understanding of your experiment’s setup. The remainder is of interest for designing a plugin.

Standard plugins

This applies for instance to running a modified version of the atmospheric model which needs some additional initial conditions and/or new boundary conditions ; another example is the use of an alternate post-processing. Some of these needs are common enough that standard plugins have been integrated in ECLIS ; see Standard plugins

Designing a plugin

If no standard plugin matches your need, you may embark on designing one. It is recommended to have a look at the code of a standard plugin while reading the following guidelines. Designing a plugin is done by providing, at the experiment installation step, up to 10 sequences of ksh shell commands, each of which will be executed at a given stage of the experiment run, and which can take advantage of some ’services’ provided by ECLIS. These services, described hereafter, are mainly the provision of some environment variables and of a number of script functions.

To activate such a set of script commands, just group them in a file and quote the filename in the experiment parameter ’PLUGINS’ (in the param_ file ; this is a string parameter that can contain multiple filenames, separated by blanks). The script commands should appear in the plugin file as ’sections’ labelled : PLUGIN_PREINSTALL, PLUGIN_INSTALL, PLUGIN_FETCH, PLUGIN_FETCH_LOOP1, PLUGIN_FETCH_LOOP2, PLUGIN_COMPUTE_INIT, PLUGIN_COMPUTE_BEFORE, PLUGIN_COMPUTE_AFTER, PLUGIN_COMPUTE_END, PLUGIN_PUT_LOOP, PLUGIN_PUT. A section begins with its label between square brackets (e.g. ’[PLUGIN_FETCH]’).
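
As an illustration, the sketch below writes a small plugin file and shows how it could be quoted in the PLUGINS parameter. The file name my_plugin and the commands inside each section are hypothetical placeholders ; only the section labels, the PLUGINS parameter and the LATMOUT setting come from this article. The scratch directory merely stands in for the directory from which the experiment install is launched.

```shell
# Scratch directory standing in for the install launch directory (assumption)
workdir=$(mktemp -d)

# Hypothetical plugin file 'my_plugin' ; section bodies are placeholders
cat > "$workdir/my_plugin" <<'EOF'
[PLUGIN_PREINSTALL]
# do not archive atmospheric raw outputs
LATMOUT=0

[PLUGIN_FETCH]
# run once per job, on the front-end
echo "extra fetch for experiment $EXPID"

[PLUGIN_COMPUTE_AFTER]
# run after each model run
echo "model run done for $YEAR$MONTH"
EOF

# In the param_ file, the plugin would then be quoted as :
PLUGINS="my_plugin"

# List the section labels found in the file
sections=$(grep '^\[PLUGIN_' "$workdir/my_plugin")
echo "$sections"
```

Sections which are not needed can simply be left out of the file.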

The plugin files are searched for first in the current working directory when launching the experiment install, and next in the ’plugins’ directory of the ECLIS version you are using ; you can of course also provide a full pathname, for organizing your own plugins library.

Micro-plugins are an easier alternative when you only need to add a few script commands.

Plugin sections and their role

PLUGIN_PREINSTALL : executed once per experiment, at the beginning of the ’regular’ experiment setup/install ; it is used only for modifying, when necessary, the default values of ECLIS parameters used at install stage ; for instance, you can specify there that you do not want to save atmospheric raw outputs on the archive machine (by setting LATMOUT to 0)

PLUGIN_INSTALL : executed once per experiment, at the end of the ’regular’ experiment setup/install ; this is the place for some actions done on the front-end, like :
- copying a specific namelist template from any machine, modifying it for this run (e.g. setting the experiment name, the coupling frequency ...), and storing this version in the relance directory
- fetching some initial conditions, copying them to the relevant transient directory on GFS under a name relevant for the experiment, and archiving them under this name, for reference
- creating some output (sub)directories on the archive machine, in the experiment reference output directory or anywhere else
- archiving the set of aerosols data that will be used by the run, for reference
  The actions above can actually also be put in the next sequence, PLUGIN_FETCH, for instance if they must use the compute phase working directory ; this is however not the case for the following one, which is quite powerful : defining some user environment variables which will be known to all job steps, using commands like
  echo "VAR=VALUE" >> ${RELEXP}/$EXPID.conf
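
A minimal sketch of this mechanism follows. The variable names MY_FORCING_DIR and MY_DEBUG and their values are hypothetical ; RELEXP and EXPID are given stand-in values only so that the sketch runs outside ECLIS (in a real plugin they are provided by ECLIS and should not be redefined). The last lines assume that the .conf file is read as plain shell assignments, which is a guess about how later job steps use it.

```shell
# Stand-in values, only for running this sketch outside ECLIS
RELEXP=$(mktemp -d)
EXPID=MYEXP

# Lines that would sit in the [PLUGIN_INSTALL] section
# (MY_FORCING_DIR and MY_DEBUG are hypothetical user variables)
echo "MY_FORCING_DIR=/some/path/forcings" >> "${RELEXP}/$EXPID.conf"
echo "MY_DEBUG=1" >> "${RELEXP}/$EXPID.conf"

# What a later job step would see, assuming the file is sourced
. "${RELEXP}/$EXPID.conf"
echo "MY_FORCING_DIR=$MY_FORCING_DIR MY_DEBUG=$MY_DEBUG"
```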

PLUGIN_FETCH_LOOP1 : executed once for each month of the simulation, this is the place for getting e.g. lateral boundary conditions from the archive machine and storing them in the current directory for this step ; its name is FTEXP, and it is a subdirectory of FTDIR dedicated to this experiment (see details about FTDIR further below). Because fetching data files one at a time with ftget is inefficient, you may at this stage rather use a function like EFGET, which records in a file the names of all the files to be fetched, and let the actual fetching be executed in the next step, PLUGIN_FETCH, using function RECUP. You have access to a number of environment variables which are updated at each iteration of the loop on months

PLUGIN_FETCH : executed once per job, on the front-end computer, after the fetch of all ’regular’ restart files ; this is the place for getting from the archive machine the kind of input data which changes only once per job (such as history/restart files), or series of monthly input files with RECUP as explained above.

PLUGIN_FETCH_LOOP2 : also executed once per month, it is the place for e.g. pre-processing the kind of data fetched by the first loop (e.g. untarring it)

PLUGIN_COMPUTE_EARLY_INIT : executed once per job, at the very beginning of the supercomputer / parallel step of the run job

PLUGIN_COMPUTE_INIT : executed once per job, on the supercomputer / parallel system, after the copy of job input restarts from the FTEXP directory to the run working directory (this copying makes it possible to choose, for each job, whether the working directory is on WORKDIR or TMPDIR), and after the copy of namelists from the RELDIR directory to the run working directory (this last copying makes it possible to prepare an on-the-fly change in a namelist during the course of an experiment). The user would insert here similar copying between permanent disk storage, where changes may occur (either on purpose or not) during one of the jobs of the experiment, and the simulation working directory, which should not be touched during a job except by the run itself. Note that, at that stage, ECLIS does change the date in Arpege and Surfex restarts

PLUGIN_COMPUTE_BEFORE : executed just before boundary conditions and climatology update, and so before each model run, so usually once per month, on the supercomputer. At that stage, ECLIS has cleaned the working directory (keeping only the input restarts and namelists it is aware of), then updated the namelists for calendar aspects and proc numbers, and updated the boundary conditions and forcings (like aerosols). See further below the environment variables available for accessing the model run loop index and the current month value (this also applies to other sequences).

PLUGIN_COMPUTE_AFTER : executed immediately after each model run, so usually once per month, on the supercomputer. At that stage, ECLIS has only collected the model outputs ; just after, it will rename output restarts to input restart names, copy or move to FTDIR the restarts to save, and prepare commands for copying restarts and model outputs on the archive machine ; hence, the user commands should not rename or delete model outputs

PLUGIN_COMPUTE_END : executed once per job, on the supercomputer, after the loop on months. ECLIS does nothing at that stage except asking for job accounting printout. This can be the place for some yearly processing which wouldn’t fit in the PUT steps (see below)

PLUGIN_PUT_LOOP : executed once per processed month, on the front-end, after ECLIS has copied the ’regular’ restarts and outputs for the month to the archive machine

PLUGIN_PUT : executed once per job, on the front-end, after the loop described above.

An example of such sequences is provided in the attached document

Environment variables available to the plugins :

Most variables of the first paragraph below are fully described in Design a ’param_’ file and are accessible alphabetically through the ECLIS parameters index. We quote here only the most interesting ones in the context of tuning, and the other ones in the second and third paragraph :

A selection of those variables which are fixed through the course of the experiment
- EXPID : experiment name
- [RELDIR] : directory for experiment ’relance’
- INIDATE, INITIME : date at simulation start (according to experiment setup )
- ARCHIV : name of the archive machine (you shouldn’t need to know it)
- FTEXP : directory for exchanges with the archive machine ; it is emptied at the beginning of each job, except for sub-directory ’restart’ ; its value is $STORE_LOC/$EXPID, and STORE_LOC can be set in the param file and defaults to $FTDIR (which is set by the system)
- RESTART=$FTEXP/restart : directory specifically used as a buffer for archiving ECLIS regular restarts
- ATMNAM, SFXNAM, OCENAM, IOSNAM, RIVNAM, IOXNAM, ICENAM, CPLNAM : namelists filenames (their content is instantiated for each model run)
- HOM_TOOL : directory of the ECLIS toolbox
- RUNMAIL, GEOM, LMSE (1 if Surfex active)

Variables which are fixed inside any job (but may be changed on a per job basis, by changing VARI in the _his file)
- RESDATE, [DATE] : date of the initial state of the current job (YYYYMMDD)
- ENDDATE, [DATF] : last date of the experiment
- ENDYEAR, ENDMONTH, ENDDAY : break down of ENDDATE
- NMONTH : number of months processed in the job (== size of model(s) launch loop)
- TOTAL_PROC_ARP, TOTAL_PROC_OCE, TOTAL_PROC_IOS : number of procs for the models

Variables which do evolve in job loops. The first series of variables address all loops, namely fetch loops, model launch loop and archiving loop :
- IMONTH : loop index (on months)
- YEAR, MONTH, DAY : date parts for the first day to process in the current month
- NEXTMONTH, NEXTYEAR, PREVMONTH, PREVYEAR : similar ; but NEXTYEAR is the year of next month, PREVYEAR is the year of previous month
- YYMMDATE = $YEAR$MONTH
- NDAY : number of days to process in the current month (even if end date is not at a month end)
- AT_END : equals 1, from start of both loops, if we are in the last loop
- in model launch loop :
  - launch : 0 from start of loop , 1 once model run is done
  - NSTEP_OCE : number of Nemo time steps
  - NFRHIS, NFRPOS, NSTOP : values from Arpege namelist
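
The loop variables listed above could be used in a section body as in the following sketch. The stand-in values (including the zero-padded MONTH) are assumptions made only so that the lines run outside ECLIS ; in a plugin, ECLIS sets these variables at each iteration and the first line must be dropped.

```shell
# Stand-in values, only for running this sketch outside ECLIS
IMONTH=3 ; YEAR=1990 ; MONTH=03 ; NDAY=31 ; AT_END=0

# Body of a hypothetical [PLUGIN_COMPUTE_AFTER] section
msg="loop index $IMONTH : ${YEAR}${MONTH}, $NDAY days"
echo "$msg"

last=no
if [ "$AT_END" = "1" ] ; then
    # last iteration of the loop on months for this job
    last=yes
fi
echo "last iteration : $last"
```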

Less useful variables, which are fixed through the course of the experiment
- ECLIS : reference directory for the version of ECLIS used at experiment setup
- FRCPL : coupling frequency
- ATMRESGEN, SFXFRESGEN, SFXTRESGEN, OCERESGEN, ICERESGEN, RIVRESGEN : pattern for the names of model restarts on archive machine and also on transient directory RESTART
- ATMRESARCH, SFXRESARCH, ICEOUTARCH, ICERESARCH, RIVRESARCH, CPLRESARCH : directories on archive machine for archiving model restarts for this experiment
- ATMOUTARCH, SFXOUTARCH, OCEOUTARCH, OCERESARCH, RIVOUTARCH, CPLOUTARCH : directories for archiving model outputs (diagnostics) for this experiment
- SAVE_RESTART_PER : interval between saved restarts (in month). ’all’=all restarts
- ATMEXE, OCEEXE, IOSEXE, CPLEXE, RIVEXE : model binaries
- UPDCLI, UPDOZO : binaries for Arpege boundary conditions update
- LOCEREST : 0 if Nemo first initial state is Levitus climatology
- YEAR_BCD, YEAR_GHG, YEAR_OZO ... : calendar pattern for boundary conditions and forcings (see doc)
- BCOND : Arpege boundary conditions file
- FORGHG, FORBCA, FORORA, FORSDA, FORSSA, FORSUL, FORVOL, FOROZO : aerosols forcing filename patterns
- ATMRESIN, SFXFRESIN, SFXTRESIN, OCERESIN, ICERESIN, RIVRESIN : name of the input restarts, as known by the models ; they are valid only in the experiment working directory
- ATMRESOUT, SFXFRESOUT : same for the output restarts
- KEEPTMP : 1 if the compute working directory is on $WORKDIR (and =$WORKDIR/rundirs/$EXPID )
- LATMOUT, LSFXOUT, LICEOUT, LOCEOUT, LRIVOUT : 0 if the output of this component is just thrown away (and not archived)

Functions provided and recommended

To be described later : FTPUT, FTGET, EFPUT, EFGET, REDATE, YERR_EFF, SUBS, get_timestep, change_var_in_namelist

Notes

Current directory for compute step : by default, that directory is $TMPDIR, as set and allocated by the system, or, if your job uses only one node, $TMP_LOC, which is a node-local file system (for efficiency). But, if you have set KEEP_TMP to 1 in the param_ file, the compute directory is $WORKDIR/rundirs/$EXPID ; it is emptied at the beginning of each job, and you can look at it for debugging purposes.
ECLIS cleans the compute step directory at the beginning of the compute loop ; it leaves only a number of files and directories that it needs and knows about ; hence, if you want to accumulate results, you have to store them either somewhere else, or in a sub-directory dedicated to that, whose name is ’plugins’ and which is not touched by ECLIS
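
The sketch below illustrates accumulating results in that ’plugins’ sub-directory. The file name my_summary.txt and the recorded text are hypothetical ; the scratch directory and the explicit loop only mimic, for the sake of a runnable example, the compute directory and the successive calls made by ECLIS for each month.

```shell
# Scratch directory standing in for the compute step directory (assumption)
cd "$(mktemp -d)"

# Mimic two successive months of the model launch loop
YEAR=1990
for MONTH in 01 02 ; do
    # Lines that would sit in a [PLUGIN_COMPUTE_AFTER] section
    mkdir -p plugins
    echo "${YEAR}${MONTH} processed" >> plugins/my_summary.txt
done

nlines=$(wc -l < plugins/my_summary.txt)
echo "lines accumulated : $nlines"
```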

How does it work (internals) :

The content of the plugin command files is analyzed at experiment install stage, and assembled in files dedicated to each sub-step (FETCH, COMPUTE_INIT ...) and located in the experiment relance directory, in subdir ’plugins’. The ECLIS run scripts insert these files using Mtool (which has been configured by the run script to use this sub-directory as a possible source of include files). Hence, the plugin commands remain unchanged until you re-install the experiment, or change manually one of those files.
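
To make the idea concrete, here is a rough sketch of such a split, written with awk. It is not the actual ECLIS code, only an illustration of how a plugin file could be cut into one file per sub-step ; the file name my_plugin, its content and the scratch directory are hypothetical.

```shell
# Scratch directory standing in for the experiment relance directory
reldir=$(mktemp -d)
mkdir -p "$reldir/plugins"

# Hypothetical plugin file with two sections
cat > "$reldir/my_plugin" <<'EOF'
[PLUGIN_FETCH]
echo "fetch step"
[PLUGIN_COMPUTE_INIT]
echo "compute init, line 1"
echo "compute init, line 2"
EOF

# Append the body of each section to a file named after the sub-step
awk -v dir="$reldir/plugins" '
    /^\[PLUGIN_[A-Z0-9_]+\]/ {
        # "[PLUGIN_" is 8 characters long : the name starts at position 9
        out = dir "/" substr($0, 9, index($0, "]") - 9)
        next
    }
    out != "" { print >> out }
' "$reldir/my_plugin"

ls "$reldir/plugins"
```

Appending rather than overwriting lets several plugin files contribute commands to the same sub-step.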

’On-the-fly’ plugins

From the paragraph above, you can deduce that, once an experiment is installed, you can avoid going again through the experiment installation process for introducing or modifying shell-script commands. You can do so by adding or modifying files in subdirectory ’plugins’ of the experiment relance directory, whose filename equals the relevant section name (e.g. INSTALL, COMPUTE_AFTER... see above). This is a way to debug an experiment or to easily tune the commands of a plugin. You can do so even if no plugin was quoted in the experiment param_ file at installation stage.
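
For instance, a debugging command could be added to the COMPUTE_AFTER step of an installed experiment as sketched below. RELDIR is given a stand-in value so that the lines run outside ECLIS (for a real experiment, use its actual relance directory), and the added command is only a hypothetical example.

```shell
# Stand-in for the experiment relance directory (assumption)
RELDIR=$(mktemp -d)
mkdir -p "$RELDIR/plugins"

# Append a command to the file holding the COMPUTE_AFTER commands ;
# the quoted 'EOF' keeps $YEAR and $MONTH unexpanded until run time
cat >> "$RELDIR/plugins/COMPUTE_AFTER" <<'EOF'
echo "after run ${YEAR}${MONTH} : $(ls | wc -l) entries in $(pwd)"
EOF

cat "$RELDIR/plugins/COMPUTE_AFTER"
```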

’Micro’- plugins

Furthermore, from version V6.0.2, when you only have a few lines of script code to add to the standard ECLIS script, you can provide it directly as shell strings in ECLIS parameters named after the scheme ’section’_ADD as e.g. in COMPUTE_INIT_ADD="ln -s Restart fort.45\n ls -al fort.45". Section keywords are the same as above
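
The following sketch shows how the example string above reads once the \n sequence is turned into a line break. It assumes that ECLIS interprets \n this way, as the example suggests ; printf with %b is used here only to display the result, and is not claimed to be what ECLIS does internally.

```shell
# Micro-plugin string, as it would be written in the param_ file
COMPUTE_INIT_ADD="ln -s Restart fort.45\n ls -al fort.45"

# Display the commands, one per line (%b expands the \n sequence)
printf '%b\n' "$COMPUTE_INIT_ADD"

nlines=$(printf '%b\n' "$COMPUTE_INIT_ADD" | wc -l)
echo "number of command lines : $nlines"
```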

I am a bit lost with all these pieces of script

It may be difficult to get a clear idea of the actual content of the run script when you use multiple plugins. To inspect the actual script after plugins inclusion, you can install your experiment using the special sequence CM_SUBMIT_METHOD="--submitcmd=echo" ./param_EXPID , and then look at the content of the mtool ’depot’ directory, whose name will be displayed