3. Building and Running the UFS Weather Model
3.1. Prerequisite Libraries
The UFS Weather Model requires a number of libraries for it to compile. There are two categories of libraries that are needed:
Bundled libraries (NCEPLIBS). These are libraries developed for use with NOAA weather models. Most have an NCEPLIBS prefix in the repository, e.g. NCEPLIBS-bacio. Select tools from the UFS Utilities repository (UFS-UTILS) are also included in this category. A list of the bundled libraries tested with this WM release is in the top-level README of the NCEPLIBS repository (be sure to look at the tag in that repository that matches the tag on this WM release).
Third-party libraries (NCEPLIBS-external). These are libraries that were developed external to the UFS Weather Model. They are general software packages that are also used by other models in the community. Building these is optional, since existing builds of these libraries can be pointed to instead. A list of the external libraries tested with this WM release is in the top-level README of the NCEPLIBS-external repository. Again, be sure to look at the tag in that repository that matches the tag on this WM release.
Note
The libraries in NCEPLIBS-external must be built before the libraries in NCEPLIBS.
See this wiki link for an explanation of which platforms and compilers are supported. This will help to determine if you need to build NCEPLIBS and NCEPLIBS-external or are working on a system that is already pre-configured. On pre-configured platforms, the libraries are already available.
If you do have to build the libraries, it is a good idea to check the platform- and compiler-specific
README files in the doc/ directory of the NCEPLIBS-external repository
as a first step, to see if your system or one similar to it is included. These files have detailed
instructions for building NCEPLIBS-external, NCEPLIBS, and the UFS Weather Model. They may be all the
documentation you need. Be sure to use the tag that corresponds to this version of the WM, and define a
WORK directory path before you get started.
If your platform is not included in these platform- and compiler-specific README files, there is a more
generic set of instructions in the README file at the top level of the NCEPLIBS-external repository, and at the top level of the NCEPLIBS repository. It may still be a good idea to look at some of the platform-
and compiler-specific README files as a guide. Again, be sure to use the tag that corresponds to this version of the WM.
The top-level README in the NCEPLIBS-external repository includes a troubleshooting section that may be helpful.
You can also get expert help through a user support forum set up specifically for issues related to build dependencies.
3.2. Downloading the Weather Model Code
To clone the develop branch of the ufs-weather-model repository, execute the following commands:
git clone https://github.com/ufs-community/ufs-weather-model.git ufs-weather-model
cd ufs-weather-model
git submodule update --init --recursive
Compiling the model will take place within the ufs-weather-model directory you just created.
3.3. Building the Weather Model
3.3.1. Loading the required modules
Modulefiles for pre-configured platforms are located in modulefiles/ufs_<platform>.<compiler>.
For example, to load the modules from the ufs-weather-model directory on Hera:
module use modulefiles
module load ufs_hera.intel
Note that loading this module file will also set the CMake environment variables shown in Table 3.1.
| Environment Variable | Description | Hera Intel Value |
|---|---|---|
| CMAKE_C_COMPILER | Name of C compiler | mpiicc |
| CMAKE_CXX_COMPILER | Name of C++ compiler | mpiicpc |
| CMAKE_Fortran_COMPILER | Name of Fortran compiler | mpiifort |
| CMAKE_Platform | String containing platform and compiler name | hera.intel |
If you are not running on one of the pre-configured platforms, you will need to set the environment variables in a different way.
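One way to do this is to export the variables from Table 3.1 by hand before building. The sketch below assumes a generic Linux system with GNU compilers behind MPI wrappers. The wrapper names mpicc, mpicxx and mpif90 are assumptions that depend on the local MPI installation, and linux.gnu is one of the platform strings that build.sh lists in its error message.

```shell
# Assumed MPI compiler wrappers; replace with the names used on your system.
export CMAKE_C_COMPILER=mpicc
export CMAKE_CXX_COMPILER=mpicxx
export CMAKE_Fortran_COMPILER=mpif90

# Platform and compiler string, in the <platform>.<compiler> form of Table 3.1.
export CMAKE_Platform=linux.gnu

# Show the CMAKE_* variables now present in the environment.
env | grep '^CMAKE_' | sort
```

Putting these lines in a small file that is sourced before each build keeps the settings consistent between shell sessions.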
3.3.2. Setting the CMAKE_FLAGS and CCPP_SUITES environment variables
You need to use the CMAKE_FLAGS environment variable to specify which application to build.
In order to have one or more CCPP physics suites available at runtime, you also need to select those suites at
build time by setting CCPP_SUITES. In the examples below it is passed as the -DCCPP_SUITES option inside
CMAKE_FLAGS, and multiple suites are separated by commas. The following examples are for the bash shell.
For the ufs-weather-model ATM app (standalone ATM):
export CMAKE_FLAGS="-DAPP=ATM -DCCPP_SUITES=FV3_GFS_v16"
For the ufs-weather-model ATM app (standalone ATM) in 32 bit:
export CMAKE_FLAGS="-DAPP=ATM -D32BIT=ON -DCCPP_SUITES=FV3_GFS_v16"
For the ufs-weather-model ATMW app (standalone ATM with wave):
export CMAKE_FLAGS="-DAPP=ATMW -DCCPP_SUITES=FV3_GFS_v16"
For the ufs-weather-model S2S app (atm/ice/ocean):
export CMAKE_FLAGS="-DAPP=S2S -DCCPP_SUITES=FV3_GFS_2017_coupled,FV3_GFS_2017_satmedmf_coupled,FV3_GFS_v15p2_coupled,FV3_GFS_v16_coupled,FV3_GFS_v16_couplednsst"
For the ufs-weather-model S2S app (atm/ice/ocean) with debugging flags turned on, with verbose build messages:
export CMAKE_FLAGS="-DAPP=S2S -DDEBUG=ON -DCCPP_SUITES=FV3_GFS_2017_coupled,FV3_GFS_2017_satmedmf_coupled,FV3_GFS_v15p2_coupled,FV3_GFS_v16_coupled,FV3_GFS_v16_couplednsst"
export BUILD_VERBOSE=1
For the ufs-weather-model S2SW app (atm/ice/ocean/wave):
export CMAKE_FLAGS="-DAPP=S2SW -DCCPP_SUITES=FV3_GFS_2017_coupled,FV3_GFS_v15p2_coupled,FV3_GFS_v16_coupled,FV3_GFS_v16_coupled_noahmp"
For the ufs-weather-model HAFS app (atm/ocean) in 32 bit:
export CMAKE_FLAGS="-DAPP=HAFS -D32BIT=ON -DCCPP_SUITES=FV3_HAFS_v0_gfdlmp_tedmf_nonsst,FV3_HAFS_v0_gfdlmp_tedmf,FV3_HAFS_v0_hwrf_thompson,FV3_HAFS_v0_hwrf"
For the ufs-weather-model HAFSW app (atm/ocean/wave) in 32 bit:
export CMAKE_FLAGS="-DAPP=HAFSW -D32BIT=ON -DCCPP_SUITES=FV3_HAFS_v0_gfdlmp_tedmf_nonsst,FV3_HAFS_v0_gfdlmp_tedmf,FV3_HAFS_v0_hwrf_thompson,FV3_HAFS_v0_hwrf"
For the ufs-weather-model HAFS-ALL app (data/atm/ocean/wave) in 32 bit:
export CMAKE_FLAGS="-DAPP=HAFS-ALL -D32BIT=ON -DCCPP_SUITES=FV3_HAFS_v0_gfdlmp_tedmf_nonsst,FV3_HAFS_v0_gfdlmp_tedmf,FV3_HAFS_v0_hwrf_thompson,FV3_HAFS_v0_hwrf"
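Long suite lists are easy to mistype. As a sketch, the comma-separated value can be assembled from one suite name per argument. Here join_suites is a hypothetical helper, not part of the WM repository, and the suite names are the ones from the S2SW example above.

```shell
# join_suites: hypothetical helper that joins its arguments with commas.
join_suites() {
    old_ifs=$IFS
    IFS=,
    joined="$*"          # "$*" joins the arguments with the first character of IFS
    IFS=$old_ifs
    printf '%s\n' "$joined"
}

suites=$(join_suites \
    FV3_GFS_2017_coupled \
    FV3_GFS_v15p2_coupled \
    FV3_GFS_v16_coupled \
    FV3_GFS_v16_coupled_noahmp)

export CMAKE_FLAGS="-DAPP=S2SW -DCCPP_SUITES=${suites}"
echo "$CMAKE_FLAGS"
```

Keeping one suite per line makes it easier to review which suites are added or dropped between builds.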
3.3.3. Building the model
The UFS Weather Model uses the CMake build system. There is a build script called build.sh in the
top-level directory of the WM repository that configures the build environment and runs the make
command. This script also checks that all necessary environment variables have been set.
If any of the environment variables have not been set, the build.sh script will exit with a message similar to:
./build.sh: line 11: CMAKE_Platform: Please set the CMAKE_Platform environment variable, e.g. [macosx.gnu|linux.gnu|linux.intel|hera.intel|...]
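The same kind of check can be run by hand before starting a long build. The sketch below is not the code in build.sh. Here check_build_env is a hypothetical helper that reports which of the named variables are unset or empty, and the call uses the four variables from Table 3.1. Whether build.sh requires exactly this set is an assumption.

```shell
# check_build_env: hypothetical helper; returns non-zero if any named
# environment variable is unset or empty.
check_build_env() {
    missing=0
    for name in "$@"; do
        eval "value=\${$name:-}"     # indirect lookup of the variable called $name
        if [ -z "$value" ]; then
            echo "Please set the $name environment variable" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Variables from Table 3.1; run this before ./build.sh.
if check_build_env CMAKE_C_COMPILER CMAKE_CXX_COMPILER \
                   CMAKE_Fortran_COMPILER CMAKE_Platform; then
    echo "Build environment looks complete"
fi
```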
The WM can be built by running the following command from the ufs-weather-model directory:
./build.sh
Once build.sh is finished, you should see the executable, named ufs_model, in the ufs-weather-model/build/ directory.
To build in a different directory, set the BUILD_DIR environment variable; for example, export BUILD_DIR=test_cpld
builds in the ufs-weather-model/test_cpld directory instead.
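Scripts that pick up the executable afterwards need to follow the same rule. As a small sketch, the expected location can be derived from BUILD_DIR with build as the fallback. Here ufs_exe_path is a hypothetical helper, and it assumes the executable keeps the name ufs_model when BUILD_DIR is set.

```shell
# ufs_exe_path: hypothetical helper; prints where the executable is expected,
# given the path to the ufs-weather-model directory as its first argument.
ufs_exe_path() {
    printf '%s/%s/ufs_model\n' "$1" "${BUILD_DIR:-build}"
}

unset BUILD_DIR
ufs_exe_path ufs-weather-model      # default build directory

BUILD_DIR=test_cpld
ufs_exe_path ufs-weather-model      # custom build directory
```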
Expert help is available through a user support forum set up specifically for issues related to the Weather Model.
3.4. Running the model
3.4.1. Using the regression test script
The regression test script rt.sh in the tests/ directory can be
used to run a number of preconfigured test cases. It is the top-level script
that calls lower-level scripts to build, set up environments and run tests.
On Tier-1 platforms, it can
be as simple as editing the rt.conf file and subsequently executing
./rt.sh -l rt.conf
The following discussion is general, but users may not be able to execute the script as is unless they are on one of the Tier-1 platforms.
Each line in the PSV (pipe-separated values) file rt.conf is used to either
build or run. The COMPILE line specifies the application to build (e.g.
-DAPP=S2S), the CCPP suite to use (e.g. -DCCPP_SUITES=FV3_GFS_2017_coupled), and
additional build options (e.g. -DDEBUG=ON) as necessary. The RUN line
specifies the name of a test to run. The test name should match the name of one
of the test files in the tests/tests/ directory or, if the user is adding a new
test, the name of the new test file. The order of lines in rt.conf matters
since rt.sh processes them sequentially; a RUN line should be preceded
by a COMPILE line that builds the model used in the test. The following example
rt.conf file builds the standalone ATM model in 32 bit and then runs the
control test:
COMPILE | -DAPP=ATM -DCCPP_SUITES=FV3_GFS_v16 -D32BIT=ON | | fv3
RUN | control | | fv3
The third column of rt.conf relates to the platform; if left blank, the test
runs on all Tier-1 platforms. The fourth column deals with baseline creation (more
on this later) and fv3 means the test will be included during baseline creation.
The rt.conf file includes a large number of tests. To run only a specific
test, the user can either comment out (using the # prefix) the
tests to be skipped, or create a new file, e.g. my_rt.conf, and then execute
./rt.sh -l my_rt.conf.
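As a sketch of that workflow, the following writes a short my_rt.conf using the example entries above and checks the ordering rule before anything is submitted. Here check_rt_conf is a hypothetical helper, not part of the repository. It only verifies that no RUN line appears before the first COMPILE line and does not validate anything else that rt.sh may require.

```shell
# Write a minimal configuration: one build followed by one test.
cat > my_rt.conf <<'EOF'
# Lines starting with '#' are treated as commented out
COMPILE | -DAPP=ATM -DCCPP_SUITES=FV3_GFS_v16 -D32BIT=ON | | fv3
RUN | control | | fv3
EOF

# check_rt_conf: hypothetical helper; fails if a RUN line comes before
# any COMPILE line in the given file.
check_rt_conf() {
    awk -F'|' '
        /^[[:space:]]*#/ || /^[[:space:]]*$/ { next }   # skip comments and blank lines
        { kind = $1; gsub(/[[:space:]]/, "", kind) }    # first column without padding
        kind == "COMPILE" { compiled = 1 }
        kind == "RUN" && !compiled {
            printf "line %d: RUN before any COMPILE\n", NR
            bad = 1
        }
        END { exit (bad ? 1 : 0) }
    ' "$1"
}

check_rt_conf my_rt.conf && echo "my_rt.conf ordering is consistent"
# Then, from the tests/ directory: ./rt.sh -l my_rt.conf
```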
The regression test generates a number of log files. The summary log file
RegressionTests_<machine>.<compiler>.log in the tests/ directory compares
the results of the test against the baseline specific to a given platform and
reports the outcome (hence, the ‘regression’ test). ‘Missing file’ is reported when
the expected files from the simulation are not found, which typically occurs
when the simulation did not run to completion; ‘OK’ means that the simulation
results are bit-for-bit identical to those of the baseline; ‘NOT OK’ means that
the results are not bit-for-bit identical; and ‘Missing baseline’ means that there
is no baseline data to compare against.
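For a quick pass/fail summary, the outcome strings can be counted in the summary log. This is a sketch under the assumption that the strings listed above appear verbatim on lines of the log. The exact line layout is not described here, so treat count_failures as a hypothetical helper and adjust the patterns to the actual file.

```shell
# count_failures: hypothetical helper; prints the number of lines in the given
# log that report anything other than a plain 'OK'.
# Note: grep -c prints 0 and exits non-zero when nothing matches.
count_failures() {
    grep -c -E 'NOT OK|Missing file|Missing baseline' "$1"
}

# Usage, with <machine> and <compiler> replaced by the actual names:
#   count_failures "RegressionTests_<machine>.<compiler>.log"
```

Matching on 'NOT OK' rather than 'OK' avoids counting passing lines, since the string 'OK' is contained in both outcomes.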
More detailed log files are found in the tests/log_<machine>.<compiler>/ directory.
In particular, the run directory path, given as the value of the RUNDIR variable
in the run_<test-name> file, is often useful. $RUNDIR is a
self-contained (i.e. sandboxed) directory with the executable file, initial
conditions, model configuration files, environment setup scripts and a batch job
submission script. The user can run the test by cd’ing into $RUNDIR and
invoking the command
sbatch job_card
This can be particularly useful for debugging and testing code changes. Note that
$RUNDIR is automatically deleted at the end of a successful regression test;
specifying the -k option retains the $RUNDIR, e.g. ./rt.sh -l rt.conf -k.
Found inside the $RUNDIR directory are a number of model configuration files:
input.nml, model_configure, nems.configure, and other application-dependent
files, e.g. ice_in for the Subseasonal-to-Seasonal application.
These model configuration files are
generated by rt.sh from the template files in the tests/parm/ directory.
Specific values used to fill in the template files depend on the test being run, and
are set in two stages: default values are specified in tests/default_vars.sh, and
the default values are overridden if necessary by those specified in a test file
tests/tests/<test-name>. For example, the variable DT_ATMOS, which is
substituted into the template file model_configure.IN to generate
model_configure, is initially assigned 1800 in the function export_fv3 of the
script default_vars.sh, but the test file tests/tests/control overrides it by
reassigning 720 to the variable.
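The two-stage assignment can be pictured with a toy version of the mechanism. Everything below is illustrative: the function body, the one-line template and the @[DT_ATMOS] placeholder syntax are assumptions made for the sketch, not copies of default_vars.sh, tests/tests/control or model_configure.IN.

```shell
# Stage 1: defaults (stand-in for export_fv3 in default_vars.sh).
export_fv3() {
    DT_ATMOS=1800
}
export_fv3

# Stage 2: the test file overrides the default (stand-in for tests/tests/control).
DT_ATMOS=720

# Fill a one-line stand-in template; the placeholder syntax is assumed.
template='dt_atmos: @[DT_ATMOS]'
rendered=$(printf '%s\n' "$template" | sed "s/@\[DT_ATMOS\]/${DT_ATMOS}/")
echo "$rendered"
```

Because the override runs after the defaults, the value written into the rendered line is the one from the test file.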
Also found inside the $RUNDIR directory are the files fv3_run and
job_card, which are generated from the template files in the tests/fv3_conf/
directory. The latter is a platform-specific batch job submission script, while
the former prepares the initial conditions by copying relevant data from the
input data directory of a given platform to the $RUNDIR directory.
Table 3.2 summarizes the subdirectories discussed above.
| Name | Description |
|---|---|
| tests/ | Regression test root directory. Contains rt-related scripts and the summary log file |
| tests/tests/ | Contains specific test files |
| tests/parm/ | Contains templates for model configuration files |
| tests/fv3_conf/ | Contains templates for setting up initial conditions and a batch job |
| tests/log_*/ | Contains fine-grained log files |
There are a number of command line options available to the rt.sh script.
The user can execute ./rt.sh to see information on these options. A couple
of them are discussed here. When running a large number (tens or hundreds) of
tests, the -e option to use the ecFlow workflow manager can significantly
decrease the testing time by queuing the jobs according to dependencies and
running them concurrently. The -n option can be used to run a single test;
for example, ./rt.sh -n control will build the ATM model and run the
control test. The -c option is used to create a baseline. New
baselines are needed when code changes lead to result changes, so that the results
deviate from existing baselines on a bit-for-bit basis.
When a developer needs to create a new test for their implementation, the
first step would be to identify a test in the tests/tests/ directory that can
be used as a basis and to examine the variables defined in the test file. As
mentioned above, some of the variables may be overrides for those defined in
default_vars.sh; others may be new variables that are needed specifically
for the test. Default variables and their values are defined in the export_fv3
function of the default_vars.sh script for the ATM application, the export_cpl
function for the S2S application and the export_datm function for the GODAS application.
Also, the names of template files for model configuration and initial conditions
can be identified via the variables INPUT_NML, NEMS_CONFIGURE and FV3_RUN;
for example, by trying grep -n INPUT_NML * inside the tests/ and tests/tests/
directories.
3.4.2. Using the operational requirement test script
The operational requirement test script opnReqTest in the tests/ directory can also be used to run
tests. Given the name of a test, opnReqTest carries out a suite of test cases.
Each test case addresses an aspect of the requirements new implementations
should satisfy, which are shown in Table 3.3.
For the following discussion of opnReqTest, the user should note the distinction between
‘test name’ and ‘test case’: examples of test names are control, cpld_control
and regional_control, which are all found in the tests/tests/ directory, whereas
a test case is any one of thr, mpi, dcp, rst, bit and dbg.
| Case | Description |
|---|---|
| thr | Varying the number of threads produces the same results |
| mpi | Varying the number of MPI tasks reproduces |
| dcp | Varying the decomposition (i.e. tile layout of FV3) reproduces |
| rst | Restarting reproduces |
| bit | Model can be compiled in double/single precision and run to completion |
| dbg | Model can be compiled and run to completion in debug mode |
The operational requirement test uses the same testing framework used by the regression
test, and therefore it is recommended that the user first read
Section 3.4.1. All the files in
the subdirectories shown in Table 3.2 are relevant to the
operational requirement test except that the opnReqTest script replaces rt.sh.
The tests/opnReqTests/ directory contains
opnReqTest-specific lower-level scripts used to set up run configurations.
On Tier-1 platforms, tests can be run by invoking
./opnReqTest -n <test-name>
For example, ./opnReqTest -n control performs all six test cases
listed in Table 3.3 for the control
test. At the end of the run, a log file OpnReqTests_<machine>.<compiler>.log
is generated in the tests/ directory, which informs the user whether each test case
passed or failed. The user can choose to run a specific test case by invoking
./opnReqTest -n <test-name> -c <test-case>
where <test-case> is one or
more comma-separated values selected from thr, mpi, dcp, rst,
bit, dbg. For example, ./opnReqTest -n control -c thr,rst runs the
control test and checks the reproducibility of threading and restart.
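A wrapper script may want to reject a mistyped case list before submitting jobs. The following sketch checks a comma-separated list against the six case names. Here valid_cases is a hypothetical helper and says nothing about how opnReqTest itself parses the -c argument.

```shell
# valid_cases: hypothetical helper; succeeds only if every comma-separated
# entry is one of the six documented test cases.
valid_cases() {
    [ -n "$1" ] || return 1
    rest=$1
    while [ -n "$rest" ]; do
        c=${rest%%,*}                       # text before the first comma
        case "$c" in
            thr|mpi|dcp|rst|bit|dbg) ;;
            *) echo "unknown test case: $c" >&2; return 1 ;;
        esac
        case "$rest" in
            *,*) rest=${rest#*,} ;;         # drop the entry just checked
            *)   rest= ;;
        esac
    done
    return 0
}

valid_cases thr,rst && echo "thr,rst is a valid case list"
# Then, from the tests/ directory: ./opnReqTest -n control -c thr,rst
```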
The user can see different command line options available to opnReqTest by
executing ./opnReqTest -h; frequently used options are -e to use the ecFlow
workflow manager, and -k to keep the $RUNDIR. In the following,
comparisons are made between the regression and operational requirement tests on how they handle
different reproducibility tests.
As discussed in Section 3.4.1, the variables and
values used to configure model parameters and to set up initial conditions in the
$RUNDIR directory are set up in two stages: first, tests/default_vars.sh
defines default values; then a specific test file in the tests/tests/ subdirectory
either overrides the default values or creates new variables if required by the test.
The regression test treats the different test cases shown in
Table 3.3 as different tests. Therefore, each
test case requires a test file in the tests/tests/ subdirectory; examples are
control_2threads, control_decomp, control_restart and control_debug,
which are just variations of the control test to check various reproducibilities.
There are two potential issues with this approach. First, if several different
variations of a given test were to be created and included in the rt.conf file,
there would be too many tests to run. Second, if a new test is added by the user, they
will also have to create these variations. The idea behind the operational requirement test is to
automatically configure and run these variations, or test cases, given a test file.
For example, ./opnReqTest -n control will run all six test cases in
Table 3.3 based on a single control test file.
Similarly, if the user adds a new test new_test, then ./opnReqTest -n new_test will
run all test cases. The opnReqTest script does this by adding a third
stage of variable overrides; the related scripts can be found in the tests/opnReqTests/
directory.