The Modules package is designed to abstract from the user the environment adjustments necessary for running various versions of the same application. It was created by John L. Furlani in the early 1990s when he was the system administrator for Sun’s Research Triangle Park facility in North Carolina. So this is a Sun Microsystems product that is more than 25 years old.
One of the advantages of Environment Modules is that a single modulefile supports all major shells, including bash, ksh, zsh, and tcsh, both for environment setup and initialization. This makes environments with multiple shells less complicated. This is less important now that bash has become the dominant shell.
This is mostly a sysadmin tool, and it adds another scripting language and a rather complex infrastructure to the mix. It makes sense only if users need multiple versions of software (for example, multiple versions of gcc or openmpi). In this particular case it makes .bashrc and .bash_profile files less complex and more modular. But the fact that the scripts are written in TCL, which few people know, makes this a devil's bargain.
Most modules are pretty primitive and do not implement the full capabilities of the software. For example, Environment Modules is capable of providing a per-application set of aliases, but this functionality is rarely used. The level of usage is so basic that typically it can be re-implemented in bash using a library of source files, storing the initial state of the environment and the set of aliases in a file (so that they can be restored from this file).
If your HPC cluster is large and the user community clueless, it does help to avoid complex mistakes caused by an incorrect order of directories in PATH or LD_LIBRARY_PATH. It frees the user from direct manipulation of environment variables by providing a higher-level abstraction of changes in the form of modulefiles. You just load the necessary modulefile and all changes are magically done (one module per version of the application should be written).
At the same time it adds complexity and is over-engineered. It makes no sense for regular users who do not have the problem of multiple versions of software. Due to the complexity, few users develop their own modules. In most cases users passively consume a repository of modules centrally maintained by datacenter staff.
In addition to maintaining variables critical for application packages, such as PATH and LD_LIBRARY_PATH, environment modules also allow you to resolve some application dependencies and warn about some conflicts. This is important if you use OpenMPI, as various applications require different versions of OpenMPI. If a particular package should be used with OpenMPI 1.10, then environment modules can provide the proper environment for it automatically.
In other words, in an environment where multiple versions of the same application are used, the Environment Modules package gives you two advantages:
- It frees users from direct manipulation of environment variables: loading a modulefile makes all the changes, and unloading it restores variables such as PATH to their previous state.
- It can resolve some application dependencies and warn about conflicts between versions.
Those two advantages were instrumental in making Environment Modules a part of most HPC cluster setups. It also inspired several alternative implementations, such as Lmod from the University of Texas, which is written in Lua instead of TCL.
The "environment adjustment scripts" called modulefiles are created on per application per version basis. They can be dynamically loaded, unloaded, or switched.
Along with the capability of using multiple versions of the same software, it can also be used to implement site policies regarding the access and use of applications. You can also designate one version as the default; if no version is supplied, that one will be loaded by the package interpreter.
Module files are written in the Tool Command Language (TCL), created by John K. Ousterhout. This language was designed as a universal macro language for applications, and at one time Sun planned to promote it, but Java killed those plans. As a result TCL never achieved universal usage and lags behind scripting languages such as Perl and Python. Even in its primary role as a universal macro language for applications it was by and large displaced by Lua.
Some niche usage of TCL can still be observed in networking and HPC computing. The language is simple, but the syntax is somewhat unusual and takes some time to get accustomed to. Outside of networking, writing environment modules is probably the only common use of TCL these days.
As each modulefile is written in Tcl, it can be interpreted either by a regular TCL interpreter or by the specialized modulecmd program (a distinction which, in these days of 16-to-32-core servers with 64-256GB of RAM, does not matter much for performance). Still, two versions of environment modules exist: one that uses the TCL interpreter and one that uses C plus the TCL library. They are mostly compatible and you should not worry about which version you use, but for power users some subtle differences exist in some commands, and the TCL-based version is preferable.
Unless you are a TCL user, when creating your own modules it makes sense to take a somewhat similar existing module and modify it, rather than create one from scratch.
There are two versions of environment modules: version 3 and version 4. Version 3 is written in C and uses the TCL library, while version 4 is a pure TCL version. RHEL RPMs are for version 3.
For RHEL 6 you need the environment-modules-3.2.10-3 RPM; for RHEL 7, environment-modules-3.2.10-10.
The RPM for environment modules is available for RHEL and derivatives from EPEL. It can be installed with:
yum install environment-modules
Among other things, this RPM creates the directory /usr/share/Modules/modulefiles and populates it with several modules:
# ll /usr/share/Modules/modulefiles
total 24
-rw-r--r-- 1 root root  716 Nov 23  2015 dot
-rw-r--r-- 1 root root  870 Nov 23  2015 module-git
-rw-r--r-- 1 root root 1990 Nov 23  2015 module-info
-rw-r--r-- 1 root root  806 Nov 23  2015 modules
-rw-r--r-- 1 root root  532 Nov 23  2015 null
-rw-r--r-- 1 root root 1676 Nov 23  2015 use.own

It also creates a startup script /etc/profile.d/modules.sh:
root@centos68:# cat /etc/profile.d/modules.sh
shell=`/bin/basename \`/bin/ps -p $$ -ocomm=\``
if [ -f /usr/share/Modules/init/$shell ]
then
  . /usr/share/Modules/init/$shell
else
  . /usr/share/Modules/init/sh
fi

For bash, /usr/share/Modules/init/bash is used for initialization:
module() { eval `/usr/bin/modulecmd bash $*`; }
export -f module
MODULESHOME=/usr/share/Modules
export MODULESHOME
if [ "${LOADEDMODULES:-}" = "" ]; then
  LOADEDMODULES=
  export LOADEDMODULES
fi
if [ "${MODULEPATH:-}" = "" ]; then
  MODULEPATH=`sed -n 's/[ #].*$//; /./H; $ { x; s/^\n//; s/\n/:/g; p; }' ${MODULESHOME}/init/.modulespath`
  export MODULEPATH
fi
if [ ${BASH_VERSINFO:-0} -ge 3 ] && [ -r ${MODULESHOME}/init/bash_completion ]; then
  . ${MODULESHOME}/init/bash_completion
fi
Modulefiles are small scripts written in TCL language.
Each module can modify a set of environment variables which will be restored to their initial state when the module is unloaded. That's the main advantage, as it allows you to use multiple versions of the same software in a single login session (although on modern computers even a dozen separate login sessions is nothing to worry about).
Here is an example of a very simple module called null. This module can be loaded first, giving you the ability to restore the environment to its initial state by unloading it. It does not do anything useful at all:
#%Module1.0#####################################################################
##
## null modulefile
##
## modulefiles/null.  Generated from null.in by configure.
##
proc ModulesHelp { } {
    global version
    puts stderr "\tThis module does absolutely nothing."
    puts stderr "\tIt's meant simply as a place holder in your"
    puts stderr "\tdot file initialization."
    puts stderr "\n\tVersion $version\n"
}
module-whatis "does absolutely nothing"

# for Tcl script use only
set version "3.2.10"

As you can see, the module contains the procedure ModulesHelp and the command module-whatis, which provide information about its usage. It is good practice to provide these two in all your custom modules.
Modules can also set aliases via the set-alias command. This capability, for example, is used in the module-git module supplied with the RPM:
root@centos68:/Modules/modulefiles # cat module-git
#%Module1.0#####################################################################
##
## module-cvs modulefile
##
## modulefiles/module-git.  Generated from module-git.in by configure.
##
proc ModulesHelp { } {
    global version
    puts stderr "\tThis module will set up an alias"
    puts stderr "\tfor easy anonymous check-out of this version of the"
    puts stderr "\tenvironment modules package."
    puts stderr "\tget-modules - retrieve modules sources for this version"
    puts stderr "\n\tVersion $version\n"
}

# for Tcl script use only
set version 3.2.10
set _version_ [ string map {. -} $version ]

module-whatis "get this version of the module sources from SourceForge.net"

set-alias get-modules "git clone git://git.code.sf.net/p/modules/git modules-$_version_ && cd modules-$_version_ && git checkout modules-$_version_"

if [ module-info mode load ] {
    ModulesHelp
}
Modulefiles can also execute arbitrary scripts via the system command.
When a modulefile is loaded, it replaces, appends, or prepends several environment variables necessary for the particular version of the application. When the module is unloaded, it restores variables like PATH to their previous state, but each setenv instruction in it becomes an unsetenv, which unsets the environment variable - the previous value will not be restored! (Unless you handle it explicitly by reloading your .bashrc, see below.)
So it does not restore the full environment; restoration is selective. Only environment variables such as PATH that are manipulated with the append-path and prepend-path commands will be restored.
For example, the module dot appends . to the PATH:
root@centos68:/Modules/modulefiles # cat dot
#%Module1.0#####################################################################
##
## dot modulefile
##
## modulefiles/dot.  Generated from dot.in by configure.
##
proc ModulesHelp { } {
    global dotversion
    puts stderr "\tAdds `.' to your PATH environment variable"
    puts stderr "\n\tThis makes it easy to add the current working directory"
    puts stderr "\tto your PATH environment variable.  This allows you to"
    puts stderr "\trun executables in your current working directory"
    puts stderr "\twithout prepending ./ to the executable name"
    puts stderr "\n\tVersion $dotversion\n"
}
module-whatis "adds `.' to your PATH environment variable"

# for Tcl script use only
set dotversion 3.2.10

append-path PATH .

If we load and unload it, the PATH environment variable will be restored to the state before the load:
# echo $PATH
/usr/lib64/qt-3.3/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin
# module load dot
# echo $PATH
/usr/lib64/qt-3.3/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin:.
# module unload dot
# echo $PATH
/usr/lib64/qt-3.3/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin
Module files are stored in repositories, which are regular directories with some additional files added (for example, files that define the default version of a particular package).
The initial repository, /usr/share/Modules/modulefiles, is created by the installation RPM and populated with several basic modules. It is activated by default.
Initially it contains six modules:
root@centos68:/Modules/modulefiles # ll
total 24
-rw-r--r-- 1 root root  716 Nov 23  2015 dot          -- adds . to the PATH
-rw-r--r-- 1 root root  870 Nov 23  2015 module-git   -- creates alias get-modules for getting modules from the repository
-rw-r--r-- 1 root root 1990 Nov 23  2015 module-info  -- provides information about the installation
-rw-r--r-- 1 root root  806 Nov 23  2015 modules      -- sets the environment for Environment Modules to operate
-rw-r--r-- 1 root root  532 Nov 23  2015 null         -- does nothing
-rw-r--r-- 1 root root 1676 Nov 23  2015 use.own      -- creates the ~/privatemodules directory (if it does not exist) and adds it to MODULEPATH
You can add additional repositories or replace the initial default repository using the use command, see below.
The most common repositories are /usr/share/Modules/modulefiles (created by the RPM), /etc/modulefiles, and $HOME/privatemodules.
The other commonly used repository is $HOME/privatemodules. It can be created and activated with the supplied use.own module.
The set of directories where module files are searched is defined by the environment variable $MODULEPATH. The initial setting is /usr/share/Modules/modulefiles. You can add your own directory in front of it with the command:
module use directory
This prepends the directory to the existing search path. Additional directories can instead be appended one at a time:
module use --append directory
If you have a system- or cluster-wide repository, it makes sense to add this command to a script in /etc/profile.d so that all users instantly have access to this module repository.
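As a sketch, such a profile.d fragment could simply extend MODULEPATH directly (the repository path /opt/site/modulefiles is hypothetical; adjust to your site):

```shell
# Hypothetical fragment for a script in /etc/profile.d (e.g. modules-site.sh):
# appends a site-wide repository to MODULEPATH for all users at login.
SITE_REPO=/opt/site/modulefiles
case ":${MODULEPATH:-}:" in
    *:"$SITE_REPO":*) ;;  # already present, do nothing
    *) export MODULEPATH="${MODULEPATH:+$MODULEPATH:}$SITE_REPO" ;;
esac
```

Manipulating MODULEPATH before the module function is defined has the same effect as running module use --append in every session, without depending on modulecmd at login time.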
You can create your own private modules directory in your home directory and add it to $MODULEPATH. You can copy one or several modules from other module repositories into it; in the example below we copy the module null, which does nothing and is used for initialization only:
mkdir $HOME/privatemodules
cp $MODULESHOME/modulefiles/null $HOME/privatemodules
module use --append $HOME/privatemodules
After that you will have the following environment:
root@centos68:/etc/modulefiles # env | grep MODULE
MODULEPATH=/usr/share/Modules/modulefiles:/etc/modulefiles:/home/joeuser/privatemodules:
LOADEDMODULES=
MODULESHOME=/usr/share/Modules
NOTE: In your scripts you need to explicitly source /etc/profile.d/modules.sh, as it defines the module shell function. Without this function defined, the module command fails and you can't load modules in your script. You can also source it in .bashrc and source your .bashrc in your script (which is necessary anyway). The road to hell is paved with good intentions.
If we have one or several modules (for example the null module) in the default repository, they will be shown by "module avail". If a module is visible in module avail, you can load it. The load command accepts a relative path, as in the examples below:
module load composer_xe/2015.1.133
module load openmpi/1.10
Here composer_xe and openmpi are directories, while 2015.1.133 and 1.10 are names of the modules. This is the convention most often used in structuring a modules repository, which can itself be maintained using git or another configuration management system:
root@centos68:...joeuser/privatemodules # find composer* -ls
1182156 4 drwxr-xr-x 2 root root 4096 Jul 28 18:06 composer_xe
1182161 4 -rw-r--r-- 1 root root 1772 Sep  9  2015 composer_xe/2013.4.183
1182171 4 -rw-r--r-- 1 root root 1772 Sep  9  2015 composer_xe/2013_sp1.3.174
1182157 4 -rw-r--r-- 1 root root 1772 Sep  9  2015 composer_xe/2013.0.079
1182166 4 -rw-r--r-- 1 root root 1772 Sep  9  2015 composer_xe/2016.0.109
1182162 4 -rw-r--r-- 1 root root 1772 Sep  9  2015 composer_xe/2013.3.163
1182163 4 -rw-r--r-- 1 root root 1772 Sep  9  2015 composer_xe/2011_sp1.13.367
1182160 4 -rw-r--r-- 1 root root 1772 Sep  9  2015 composer_xe/2013_sp1.0.080
1182173 4 -rw-r--r-- 1 root root 1772 Sep  9  2015 composer_xe/2015.1.133
1182165 4 -rw-r--r-- 1 root root 1772 Sep  9  2015 composer_xe/2015.0.090
1182167 4 -rw-r--r-- 1 root root 1772 Sep  9  2015 composer_xe/2013_sp1.2.144
1182170 4 -rw-r--r-- 1 root root 1772 Sep  9  2015 composer_xe/2013_sp1.4.211
1182172 4 -rw-r--r-- 1 root root 1772 Sep  9  2015 composer_xe/2013.1.117
1182159 4 -rw-r--r-- 1 root root 1772 Sep  9  2015 composer_xe/2011_sp1.12.361
1182168 4 -rw-r--r-- 1 root root 1772 Sep  9  2015 composer_xe/2013.2.146
1182164 4 -rw-r--r-- 1 root root 1772 Sep  9  2015 composer_xe/.generic
1182158 4 -rw-r--r-- 1 root root 1772 Sep  9  2015 composer_xe/2013_sp1.1.106
1182169 4 -rw-r--r-- 1 root root 2373 Nov 21  2014 composer_xe/.modulerc
Similarly:
root@centos68:...joeuser/privatemodules # find openmpi -ls
1182122 4 drwxr-xr-x 6 root root 4096 Jul 28 18:06 openmpi
1182129 4 drwxr-xr-x 2 root root 4096 Jul 28 18:06 openmpi/1.10
1182132 4 -rw-r--r-- 1 root root 1618 Mar 12  2015 openmpi/1.10/1.10.0_intel-2016.0.47
1182130 4 -rw-r--r-- 1 root root 1618 Mar 12  2015 openmpi/1.10/1.10.2_intel-2016.3.067
1182131 4 -rw-r--r-- 1 root root  310 Jun 14  2016 openmpi/1.10/.modulerc
1182142 4 drwxr-xr-x 2 root root 4096 Jul 28 18:06 openmpi/1.8
1182145 4 -rw-r--r-- 1 root root 1618 Mar 12  2015 openmpi/1.8/1.8.1_intel-2013_sp1.3.174
1182143 4 -rw-r--r-- 1 root root 1618 Mar 12  2015 openmpi/1.8/1.8.4_intel-2015.1.133
1182144 4 -rw-r--r-- 1 root root  308 Jun 14  2016 openmpi/1.8/.modulerc
As there are several modulefiles in the directory openmpi/1.10, the environment modules interpreter will load the default version defined in openmpi/1.10/.modulerc.
If it contains the line:
module-version 1.10/1.10.2_intel-2016.3.067 default

then the version defined in that line is considered the default and will be loaded. All others need to be loaded explicitly by providing the full relative path.
You can also define a .version file, which can also specify the default version.
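As a sketch, a .version file marking 2.0 as the default could be created like this (the scratch directory is hypothetical; the set ModulesVersion syntax follows the modulefile(4) convention):

```shell
# Mark version 2.0 as the default for a module directory via a .version file.
# /tmp/modulefiles-demo is a scratch location used only for illustration.
mkdir -p /tmp/modulefiles-demo/test
cat > /tmp/modulefiles-demo/test/.version <<'EOF'
#%Module1.0
set ModulesVersion "2.0"
EOF
```

With this file in place, "module load test" would resolve to test/2.0 once /tmp/modulefiles-demo is on MODULEPATH.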
The following example uses the null and module-info modules to show the use of a version file within a hierarchical organization and its effect on module avail and module show:
mkdir /etc/modulefiles/test
cp $MODULESHOME/modulefiles/null /etc/modulefiles/test/2.0
cp $MODULESHOME/modulefiles/module-info /etc/modulefiles/test/1.0
module avail
module show test
You will have the following output from the avail command:
-------------------------------------------- /usr/share/Modules/modulefiles --------------------------------------------
dot  module-git  module-info  modules  null  use.own
--------------------------------------------------- /etc/modulefiles ---------------------------------------------------
test/1.0  test/2.0
Let's set version 2.0 as the default:
cat > /etc/modulefiles/test/.modulerc <<EOF
#%Module
module-version test/2.0 default
EOF
Now you will have:
-------------------------------------------- /usr/share/Modules/modulefiles --------------------------------------------
dot  module-git  module-info  modules  null  use.own
--------------------------------------------------- /etc/modulefiles ---------------------------------------------------
test/1.0  test/2.0(default)
There is also the more rarely used file .version, which can define the default version similarly to the .modulerc file. The difference is that .version applies only to the current directory, while .modulerc applies to the current directory and all subdirectories.
There are other directives that you can put in a .modulerc file. Among them the most useful is module-alias, which allows you to define aliases for frequently used modules so that they can be addressed directly without any path.
For example:

module-alias name module-file
- Assigns the module file module-file to the alias name. This command should be placed in one of the modulecmd rc files in order to provide shorthand invocations of frequently used module file names.
The parameter module-file may be either
- a fully qualified modulefile with name and version
- a symbolic module file name
- another module file alias
module-alias myintel composer_xe/2015.1.133
See the MODULEFILE(4) manual page (C version).
On Scientific Linux, CentOS, and RHEL distributions, the environment-modules package includes the modules.csh and modules.sh scripts for the /etc/profile.d directory. Those scripts perform the initialization of modules.
For a source build, this automation for all users can be configured manually; /etc/profile.d/modules.sh is a very simple script:
root@centos68:/etc/profile.d # cat modules.sh
shell=`/bin/basename \`/bin/ps -p $$ -ocomm=\``
if [ -f /usr/share/Modules/init/$shell ]
then
  . /usr/share/Modules/init/$shell
else
  . /usr/share/Modules/init/sh
fi

In turn, /usr/share/Modules/init/bash contains:
root@centos68:/etc/profile.d # cat /usr/share/Modules/init/$shell
module() { eval `/usr/bin/modulecmd bash $*`; }
export -f module
MODULESHOME=/usr/share/Modules
export MODULESHOME
if [ "${LOADEDMODULES:-}" = "" ]; then
  LOADEDMODULES=
  export LOADEDMODULES
fi
if [ "${MODULEPATH:-}" = "" ]; then
  MODULEPATH=`sed -n 's/[ #].*$//; /./H; $ { x; s/^\n//; s/\n/:/g; p; }' ${MODULESHOME}/init/.modulespath`
  export MODULEPATH
fi
if [ ${BASH_VERSINFO:-0} -ge 3 ] && [ -r ${MODULESHOME}/init/bash_completion ]; then
  . ${MODULESHOME}/init/bash_completion
fi
That means that if you use bash, you can source /usr/share/Modules/init/bash directly in your scripts, without going through /etc/profile.d/modules.sh.
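A minimal sketch of a batch or cron script using this approach (the openmpi/1.10 module name is hypothetical; the guard lets the script degrade gracefully where Modules is not installed):

```shell
#!/bin/bash
# Sketch: a non-login script that wants the module function sources
# the bash init file directly, at the path installed by the RPM.
if [ -r /usr/share/Modules/init/bash ]; then
    . /usr/share/Modules/init/bash        # defines the module shell function
    module load openmpi/1.10 || true      # hypothetical module; ignore failure here
fi
```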
ABSTRACT
Typically users initialize their environment when they log in by setting environment information for every application they will reference during the session. The Modules package is a database and set of scripts that simplify shell initialization and lets users easily modify their environment during the session.
The Modules package lessens the burden of UNIX environment maintenance while providing a mechanism for the dynamic manipulation of application environment changes as single entities. Users not familiar with the UNIX environment benefit most from the single command interface. The Module package assists system administrators with the documentation and dissemination of information about new and changing applications.
This paper describes the motivations and concepts behind the Modules package design and implementation. It discusses the problems with modifying the traditional user environment and how the Modules package provides a solution to these problems. Both the user’s and the system administrator’s viewpoint are described. This paper also presents the reader with a partial implementation of the Modules package. Sample C Shell and Bourne Shell scripts with explanations are used to describe the implementation. Finally, an example login session contrasts the traditional user’s environment with one that uses the Modules package.
Introduction
Capitalized Modules refers to the package as a whole. References to the module files themselves (modules) are uncapitalized.
Typically, when users invoke a shell, the environment is initialized with the settings for every application they might access during a login session.
This information is stored in a set of initialization files in each user’s home directory. Over time, these files can incur numerous and relatively complex changes as applications move and new applications become available. Since each user has his own initialization files, keeping these files current with system-wide application changes becomes difficult for both the user and the system administrator. In this model, the user often makes environment changes during a login session by modifying the initialization files and then re-initializing the shell.
The Modules package provides a way to simplify this process.
The Modules package is a set of scripts and information files that provides a simple command interface for modifying the environment. Each module in the Modules package is a file containing the information needed to initialize the environment for a particular application (or any environment information). From a user’s perspective, the package supplies a single command with multiple arguments that provides for the addition, change, and removal of application environment information.
From the administrator’s point of view, the environment information is documented and maintained in one location with each module encapsulating one application’s information. Thus, it is easier for the system administrator to add new applications and ensure that the necessary environment for the application is correctly installed and maintained by all end users.
Design motivations
The first section describes the motivations driving the design of the Modules package. They are as follows:
- Help to alleviate the burden of UNIX environment maintenance for users. This is the main design motivation. The UNIX environment is cumbersome for even experienced users. Users not familiar with UNIX are both baffled and troubled by the complexity of environment maintenance. The Modules package attempts to ease this maintenance by encapsulating environment information and providing a single command for environment modification.
- Ease the dissemination of information and documentation of new software. The amount of information an administrator must convey about a new application can be large. Many variables may need to be changed for each new application. Because each module is self-documenting, the information is readily available for reference by users.
- Make it simple to change the environment numerous times during the login session. The amount of effort involved with adding environment information as applications are needed stops users from managing their environment dynamically. Using a new application requires looking up the necessary environment modifications, making the appropriate modifications, and finally invoking the command. Thus, it is usually worth initializing the environment with the information for every application when the shell is invoked. But if adding the environment for a new application is as simple as typing a single command, the balance turns in favor of adding to the environment just before accessing a new application.
- Decrease dependency on servers when the applications on those servers are not needed. When a directory is added to the search path, a dependency on the server containing this directory is added as well. If this server goes down, the user must wait for the server to return even though, in the case of an unnecessary directory, he doesn’t need access to the directory to complete his work.
- Manage the difficulties associated with frequently switching between different application releases. Switching between two releases of the same software is simplified by making it easier to swap the two applications’ environment information.
The second section presents an overview of the design.
Design Overview
- The Modules package should be simple to use. Therefore, it employs a single command interface, very similar to that of the sccs(1)[1] command. A single command with the ability to provide help is both easy to remember and simple to use.
- For the Modules package to assist a system administrator, it should save time spent maintaining environments as well as installing and documenting the use of new applications. The package must be flexible enough to accommodate any situation an application might require. It must meet the user’s needs in different ways. Some users will use the module(1) command to manipulate most of their environment. Other users will use it sparingly as an easy way to try new or rarely used applications. In addition, the package should permit experienced users to tailor it to their needs.
- Finally, the solution must be shell-independent. The interface should be the same regardless of which shell the user chooses.
Problems with Traditional Shell Initialization
This section describes some of the difficulties with traditional shell initialization as viewed by the user and by the system administrator.
The User’s Viewpoint

Maintaining shell start-up files can be difficult and frustrating for the user because he lacks interest or UNIX® knowledge. Since modification is generally required when a new application is installed, maintaining start-up files can be demanding and time consuming in very dynamic or large UNIX® environments.
When shell start-up files are used, the environment can become cluttered with unnecessary information. Information for every application, whether or not it will actually be referenced during the current session, is loaded into the shell. For the user to execute applications at the command line without using full pathnames, the search path could be very long.
When using an automounter[2], the system not only searches a longer path, but must remount infrequently referenced directories to search them. In addition, if a directory in the search path is on a server that goes down, the shell will hang trying to search that directory. Thus, the time needed to detect a “command not found” or to find programs toward the end of the search path is drastically increased.
Shells that have path caching help this problem immensely, but a number of shells and users do not have or use path caching. It is best if a small path containing only the most used directories is set at initialization and supplemented just before a new application is used.
Switching between different versions of an application is usually difficult “on the fly.” First, the user must know which environment variables to reset and then must enter the explicit shell commands to change the variables. Finally, the user must modify the search path to remove the old path and to add the new path. This is a cumbersome and time consuming process that restricts the flexibility of changing between different software versions.
Accessing a new application is difficult when it requires a change in the user’s environment. If the application is for temporary use, the user accesses the application by changing the environment in the current shell. If it’s a long-term addition to the user’s set of applications, the user edits the shell initialization files and the shell is re-initialized.
Novice users often don’t know how or don’t want to know how to modify their environment to use the new application.
The System Administrator’s Viewpoint
A system administrator currently announces the installation of a new application via e-mail or a note in /etc/motd. Usually, the notice contains a full description of the application and the environment variables that must be set to use the new application.
Some users do not understand what they really need to do to use an application. This, in turn, causes numerous requests to a system administrator. Novice users often need a system administrator to help them modify their start-up files.
At most sites, a logfile or database is maintained containing the descriptions and quirks of each application so that the user can set up his environment to use an application. Maintaining such a logfile can be time consuming as it can become very large.
The Modules package eases the dissemination of information about a newly installed application: only the module name must be announced. If users want to use the new application, they use the module(1) command to add the announced module. Any users who don’t use the Modules package can acquire the information needed to set their environment from the module file itself.
Each module is self-documenting. Users either access this information via the module(1) display command or view the module file itself. In general, a logfile or database still needs to be maintained, but only the module name is listed for each application.
Thus, when a user wants to use an application, the database references a module name that contains the current environment information. The user either loads this module directly or gets the environment information from the module and manually incorporates it into the environment.
The Modules Package Provides a Solution

The User’s Viewpoint
Although shell start-up files are still necessary, maintaining them is easier with the Modules package. The user is provided with two options: modify the start-up files directly, or use the module(1) command to modify them.
Changing start-up files is simpler with the Modules package. The user only has to add new arguments to, or remove them from, the module(1) command in his start-up file to add or remove an application’s environment. Or, if the user prefers, he can use the module(1) command to add module names to, or remove such names from, his start-up file’s module(1) command.
A clean environment is readily maintained since the Modules package makes it easy to dynamically modify the environment. A minimum of environment information is initialized at start-up, and an application’s environment is added only when needed. Response time is improved because search paths are much shorter on average.
The Modules package is optimum for the windowing environment under which many UNIX users work. For example, a user only loads the window system module during the login initialization. Then, in a shell window, the user uses the module(1) command to initialize the environment just prior to accessing a new application.
If the path to an application changes, the change will be masked by the Modules package. For example, a user loads a module named ‘openwin’ to use OpenWindows[3] even if its access path has changed. For the user, no environment or start-up file modification is required.
To switch between different releases, the user simply changes predefined modules. The old module is swapped out, and the new one is loaded in its place, even during the login session.
The Modules package will help inexperienced UNIX users manage their environment. They must learn a single command for manipulating their environment. This is opposed to having to thoroughly understand the quirks of setting a UNIX environment.
Shell Wrappers and Modules
Shell wrapper scripts set up environment variables for a certain application when a command for that application is invoked. For each application, the system administrator creates a wrapper script and a symbolic link to the script for each command in the application.
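The wrapper scheme described above can be sketched in a few lines of Bourne shell. Everything below is a hypothetical illustration (the package name "mysoft", its paths, and the "hello" command are invented): every command of the application is a symlink to one wrapper, which sets the application's environment and then execs the real command, found via the wrapper's own basename.

```shell
# Hypothetical sketch of the wrapper-script scheme: build a toy install
# with one real command, then invoke it through the wrapper's symlink.
top=$(mktemp -d)
mkdir -p "$top/mysoft/bin" "$top/links"

# The "real" application command.
cat > "$top/mysoft/bin/hello" <<'EOF'
#!/bin/sh
echo "hello from ${MYSOFT_HOME##*/}"
EOF
chmod +x "$top/mysoft/bin/hello"

# The wrapper script shared by all of the application's commands.
cat > "$top/wrapper" <<EOF
#!/bin/sh
MYSOFT_HOME=$top/mysoft
export MYSOFT_HOME
exec "\$MYSOFT_HOME/bin/\`basename \$0\`" "\$@"
EOF
chmod +x "$top/wrapper"

# One symlink per application command, all pointing at the wrapper.
ln -s "$top/wrapper" "$top/links/hello"

out=$("$top/links/hello")
echo "$out"    # hello from mysoft
rm -rf "$top"
```

The user only needs the symlink directory on his search path; the wrapper supplies the rest of the environment at invocation time.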
The Modules package can augment a wrapper script scheme or be used in place of wrapper scripts.
With wrapper scripts, users still add the directory containing the symbolic links to their search path in order to use the application. In this case, the Modules package augments the wrapper scheme by helping the user manage the search path.
One solution for managing the search path is creating a directory of symbolic links to all of the wrapper scripts. In this case, the user only adds one directory to his search path to access every application. Moving an application requires that every symbolic link for that application change.
When many applications are installed, this directory can quickly become overwhelming and unmanageable. Documenting and finding programs in such a directory is difficult and often not very clean.
The Modules package provides the user with a lot of flexibility by differentiating between user and system module files. Like wrapper scripts, the Modules package can encapsulate environment details from the user.
The Modules Package Implementation
The Modules package has been implemented for both the C Shell and Bourne Shell dialects. This section describes some of the implementation details.
Modules Initialization

The Modules package is initialized when a user sources a site-wide accessible initialization script.
This file is shell dependent and is usually done in a user’s .cshrc or .profile upon each shell invocation.
This Modules initialization script defines a few environment variables.
- MODULESHOME is the directory containing the master module file and the master command scripts.
- MODULEPATH is a standard path variable that is searched to find module files. MODULESHOME should always be a part of this path.
- The _loaded_modules variable contains a space separated list of every module that has been loaded.
All of these variables are exported so that the shell’s children will have the same information and be able to keep track of currently loaded modules.
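A minimal sketch of what the Bourne-shell initialization might set up follows. The paths are hypothetical stand-ins (the real values are site-specific and come with the package); the point is that everything is exported, so child shells inherit the module state.

```shell
# Minimal sketch of the Modules init variables (paths are hypothetical).
MODULESHOME=/site/Modules
MODULEPATH=$HOME/modules:$MODULESHOME   # MODULESHOME is always on the path
_loaded_modules=""                      # no modules loaded yet
export MODULESHOME MODULEPATH _loaded_modules

# Because the variables are exported, a child shell sees the same state:
sh -c 'echo "child sees MODULESHOME=$MODULESHOME"'
```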
This initialization script sets up the module(1) command. This command is an alias or a function depending upon which the shell supports (see Figure 1).
FIGURE 1. module(1) Command Initialization

Environment Modification

Because a process is unable to modify the environment of its parent, the Modules package sources scripts into the current shell.
The Modules package should not alter any part of the environment besides the variables documented in the init-script or the module files themselves. When a script is sourced into the current shell, it has the potential to change existing user variables. This “feature” permits the Modules package to work, but it presents a pitfall that the package must take into account. The main concern is the module(1) command because it uses a large number of variables to implement its subcommands.
A couple of precautions have been taken to avert the possibility of variables being changed or destroyed by the module(1) command. The first precaution is the choice of variable names used by the module(1) command: all of the names are preceded by an underscore and are all lowercase.
The use of underscores should stop most variable conflicts. It does, however, leave room for a user variable to be changed by the module(1) command.
So, a check for possible variable conflicts is made when the bulk of the script is sourced. If a conflict arises, the user is notified.
The problem of changing existing variables could be eliminated by running module(1) as a subshell and sourcing the return values. I found this has an unacceptable response time for the problem being addressed. If users run into variable conflicts, they can set an option telling the module(1) command to run the script in a subshell and source the script’s output (this code is not in Figure 1).
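The subshell fallback described above can be sketched as follows. This is a hypothetical, simplified stand-in for the real option ("_module_subshell" is an invented name): the module logic runs in a child process, which cannot touch the current shell's variables, and prints the desired changes as shell code; the current shell then evals that output.

```shell
# Sketch of the subshell alternative: scratch variables stay in the
# child; only the printed shell code affects the current shell.
_module_subshell() {
    # Runs inside $(...) below -- a separate process -- so any scratch
    # variables it sets cannot clobber the user's environment.
    scratch="private to the child"
    echo 'MANPATH=/depot/openwin/man:$MANPATH; export MANPATH'
}
eval "$(_module_subshell)"

echo "$MANPATH"             # now begins with /depot/openwin/man
echo "${scratch-unset}"     # unset: the child's variable never escaped
```

The cost is an extra process per module(1) invocation, which is why the paper treats it as an opt-in rather than the default.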
Since the Modules package is designed to abstract environment information from the user, it must be concerned with environment dependencies and conflicts between different applications and different versions of the same application. For example, the environment for two versions of the same application should not be loaded at the same time. Along the same lines, some applications (like Sun's AnswerBook[4]) are dependent upon other applications (OpenWindows[3]). Possible conflicts and dependencies are put into the module files themselves and detected by the module(1) command.

    ##
    ## module(1) User Command as Function
    ##
    module() {
        _module_argv="$*"
        . $MODULESHOME/.module.sh
        unset _module_argv
    }
Shell Independence
The Modules package is shell-independent because the interface is independent of the currently executing shell. To ease administration, the module files are shell independent as well. Thus, only one copy of an application’s environment information is maintained.
A number of functions or aliases are set up by the module(1) command. Each module calls these functions to accomplish specific, well defined tasks.
For example, the _set_environ task is responsible for setting an environment variable. This is a line from an "openwin" module file:

    _set_environ OPENWINHOME /depot/openwin

Here, OPENWINHOME is initialized to /depot/openwin[5].
System Module Files and User Module Files
Through the MODULEPATH environment variable, users can specify module directories to be searched before or after the site-wide directory.
Thus, a distinction is made between “system modules” and “user modules.” System modules are maintained by the system administrator and contain the default initialization information for packages that are installed on the network. User modules, created by the user, are either derivatives of the site-wide modules or are new modules that are specific to the user’s needs.
This arrangement provides the user with the same power as the system-wide modules provide to easily change the environment "on the fly."

Implementation of Internal Module File Functions

This is the set of internal functions called by the module files themselves. These functions change paths and set variables, aliases, and dependencies.
_prepend_*path and _append_*path

Sometimes an application's path should be prepended to a search path. Other times it should be appended to a search path. The Modules package makes this distinction in the application's module file. The path modification functions are currently implemented to modify the PATH, MANPATH, MODULEPATH, and LD_LIBRARY_PATH environment variables. See Figure 2 for an example of how the alias and function to append the MANPATH variable is implemented.
FIGURE 2. _append_manpath Alias and Function

_rm_*path

The _rm_*path functions remove the directory, given as an argument, from their associated path. As with the append and prepend path functions, they're currently defined for the PATH, MANPATH, MODULEPATH, and LD_LIBRARY_PATH environment variables.
Currently, awk(1)[?] does most of the work of removing and recreating the path.
When the user requests that a module be removed, the _rm_flag is set in a higher level function. Then, the module is reloaded with this flag set. Thus, every function that was called when the module was loaded is called again with the _rm_flag set (see Figure 2). The same functions that set up the environment now call their associated _rm_*path function to remove their environment information.
Bourne Shell Function:

    _append_manpath() {
        # _rm_flag is set (to an empty value) while a module is removed
        if [ "${_rm_flag-X}" != "X" ]; then
            _rm_manpath $1
        else
            MANPATH="$MANPATH":"$1"; export MANPATH
        fi
    }

C Shell Alias:

    alias _append_manpath 'if ($?_rm_flag) eval _rm_manpath \!:1; if (! $?_rm_flag) setenv MANPATH $MANPATH":"\!:1'

See Figure 3 for an example of how the function to remove a directory from the MANPATH variable is implemented.
FIGURE 3. _rm_manpath Function and awk Script

_set_environ

This function is responsible for setting and clearing environment variables. The code gets more complex when removing environment variables.
Since environment variables are often used to define paths to other directories, variables and paths defined later in a module must be able to reference these variables even as they are removed. The best way to describe this problem is with an example module file (see Figure 4).
FIGURE 4. OpenWindows module File

Here the first variable set is the location of OpenWindows (OPENWINHOME). This variable is used to define the path locations as well. If the variable were removed from the environment before the paths were removed, it would be impossible to remove the paths. So, environment variables are not actually cleared from the environment until after the module has been removed. Thus, _set_environ simply adds any environment variables to a list. Upon completion of reading the module file, the elements are removed from the environment. See Figure 5 for an example of how the alias and function defined to set and unset environment variables are implemented.
FIGURE 5. _set_environ Function and Alias

_prereq and _conflict

Two functions were created to manage conflicts and dependencies with other module files. The _prereq function is a list of modules the calling module must have loaded to run. Similarly, the _conflict function is a list of modules the calling module has conflicts with. If more than one module is listed for these commands, it is treated as an ORed list. Multiple calls can be used to get an ANDing effect.
For example, AnswerBook needs OpenWindows loaded to run. So, it defines the openwin (or openwin-v3) module as a prerequisite. A conflicting case would be OpenWindows Version 2.0 with OpenWindows Version 3.0. These two modules should not be loaded at the same time. So, each one defines the other as a conflict. See Figure 6 for an example of how the function for conflict management is defined.

Figure 3 (Bourne Shell Function):

    _rm_manpath() {
        MANPATH=`echo $MANPATH | awk -F: '
            BEGIN { p = 0 }
            {
                for (i = 1; i <= NF; i++) {
                    if ($i != "'$1'") {
                        if (p) { printf ":" }
                        p = 1
                        printf "%s", $i
                    }
                }
            }'`
        export MANPATH
    }

Figure 4 (OpenWindows module file):

    ##
    ## OpenWindows Version 2.0
    ##
    _set_environ OPENWINHOME /depot/openwin
    _set_environ DISPLAY `hostname`:0.0
    _prepend_newpath $OPENWINHOME/bin/xview
    _prepend_newpath $OPENWINHOME/bin
    _prepend_manpath $OPENWINHOME/man
    _prepend_ldpath $OPENWINHOME/lib

Figure 5 (Bourne Shell Function):

    _set_environ() {
        # _rm_flag is set (to an empty value) while a module is removed
        if [ "${_rm_flag-X}" != "X" ]; then
            _unset_list="$_unset_list $1"
        else
            eval $1="$2"; export $1
        fi
    }

Figure 5 (C Shell Alias):

    alias _set_environ 'if ($?_rm_flag) set _unsetenv_list = ($_unsetenv_list \!:1); if (! $?_rm_flag) setenv \!:1 \!:2'
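The ORed-list/ANDed-calls rule for _prereq can be made concrete with a toy version of the function. The implementation below is a hypothetical, simplified stand-in for the paper's real _prereq, written only to show the semantics: one call succeeds if any of its arguments is loaded, while separate calls must each succeed.

```shell
# Minimal sketch of _prereq semantics (hypothetical simplification).
_loaded_modules="openwin lang"

_prereq() {
    for _pre in "$@"; do
        for _mod in $_loaded_modules; do
            if [ "$_mod" = "$_pre" ]; then
                return 0    # any one listed module satisfies this call
            fi
        done
    done
    echo "ERROR: prerequisite not loaded: $*"
    return 1
}

_prereq openwin openwin-v3            # ORed list: openwin is loaded, OK
_prereq lang                          # a second call ANDs with the first
_prereq tex || echo "tex prerequisite missing"
```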
FIGURE 6. _conflict Function

Implementation of the module(1) Command

Each invocation of module(1) sources a site-wide script that actually implements the command.
Arguments are passed to the module(1) command using the _module_argv variable (see Figure 1).
The first argument designates the sub-command the module(1) command is to execute. Valid arguments are as follows:

- Load or Add
- Remove or Erase
- Switch or Change
- Show or Display
- Initadd
- Initrm
- List
- Available
- Help

Loading Modules

Loading modules is done by the _add_module internal function. This function takes any number of arguments and attempts to load each one as a module. It traverses the argument list, first verifying that a listed module isn't loaded already. If it is, the function prints an error and moves on to the next name in the argument list (see Figure 7).
FIGURE 7. _add_module() Argument Verification

If the module is not loaded, _add_module begins looking for the module by searching each directory specified in the MODULEPATH variable. Once found, the module is sourced, the module name is appended to the _loaded_modules variable, and if there are any other arguments, the load process begins anew (see Figure 8).

Figure 6 (_conflict function):

    _conflict() {
        # conflict checks are skipped while a module is being removed
        if [ "${_rm_flag-X}" != "X" ]; then return; fi
        _conflict_loaded=0
        for _con in $*; do
            for _mod in $_loaded_modules; do
                if [ "$_mod" = "$_con" ]; then
                    _conflict_loaded=$_mod
                fi
            done
        done
        if [ $_conflict_loaded != 0 ]; then
            echo "ERROR: Module conflict"
        fi
        unset _conflict_loaded
    }

Figure 7 (_add_module argument verification):

    _add_module() {
        if [ $# -lt 1 ]; then
            echo "ERROR: More arguments"
            return
        fi
        for mod in $*; do
            _found=0
            _cur_module=$mod
            for chkmod in $_loaded_modules; do
                if [ $mod = $chkmod ]; then
                    _found=1
                    break
                fi
            done
            if [ $_found -eq 1 ]; then
                echo "ERROR: Module is already loaded"
                continue
            fi
FIGURE 8. _add_module() Traverse MODULEPATH and Load Module

Removing Modules

Removing a module is accomplished by using the _rm_module internal function, which is very similar to the _add_module function. Any number of arguments can be passed to the _rm_module function.
First, each argument is checked to verify that the module is actually loaded (same code as in Figure 7 except the final if statement indicates an error if the module is not loaded). If a module is loaded, the _rm_flag is set and the MODULEPATH variable is searched. Once a module is found, it is sourced (see Figure 9).
FIGURE 9. _rm_module() Traverse MODULEPATH and Load Module The same functions that are used in loading a module are used to remove modules. When one of the functions detects that the _rm_flag is set, it removes its corresponding piece of environment instead of adding it (see Figures 2 and 3). Note that after the module file has been sourced, each variable in the _unset_list is unset.
Switching Modules

Although this function has not been fully implemented, I will describe it here. Only modules that define themselves as compatible with another module can be switched. The compatibility information is kept in the module file. If a module can be switched with another module, it lists that other module via the _switch function.
Switchable modules are very similar in that their environments match one another and their modules follow the same format. The variables that have to be reset are the same, and the search path changes are the same as well.
Figure 8 (_add_module, continued):

        for dir in $MODULEPATH; do
            if [ -f $dir/$mod ]; then
                echo "Loading $dir/$mod"
                . $dir/$mod
                if [ "${_load_error:-NotSet}" = "NotSet" ]; then
                    _loaded_modules="$_loaded_modules $mod"
                    export _loaded_modules
                    unset _load_error
                fi
                _found=1
                break
            fi
        done
        if [ $_found -ne 1 ]; then
            echo "ERROR: Module not found"
        fi
    done
    }

Figure 9 (_rm_module, continued):

    _rm_flag=
    for dir in $MODULEPATH; do
        if [ -f $dir/$mod ]; then
            echo "Removing $dir/$mod"
            . $dir/$mod
            _loaded_modules=`echo $_loaded_modules | sed s/$mod//`
            export _loaded_modules
            _found=1
            break
        fi
    done
    if [ $_found -ne 1 ]; then
        echo "ERROR: Module not found"
    fi
    for env in $_unset_list; do
        unset $env
    done

The difference between switching two modules and the process of removing a loaded one and loading a new one is that the location of a search path entry does not change. The append and prepend functions are used when removing and loading module files. This process has the possibility of altering a portion of the search path in relation to other entries.
Although this sounds restrictive, it is often very useful because most module switching involves different versions of the same program.
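The position-preserving property described above can be illustrated with plain sed. This is a sketch, not the paper's implementation: it contrasts remove-then-prepend (which moves the entry to the front) with an in-place substitution (which keeps the entry where it was).

```shell
# Why switching differs from remove-then-add: position in the path.
PATH_DEMO="/depot/lang:/depot/openwin/bin:/usr/bin"

# Remove-then-prepend moves the openwin entry to the front:
removed=$(echo "$PATH_DEMO" | sed 's;/depot/openwin/bin:;;')
prepended="/depot/openwin-v3/bin:$removed"
echo "$prepended"     # /depot/openwin-v3/bin:/depot/lang:/usr/bin

# In-place substitution keeps the entry in the middle, as a switch would:
switched=$(echo "$PATH_DEMO" | sed 's;/depot/openwin/bin;/depot/openwin-v3/bin;')
echo "$switched"      # /depot/lang:/depot/openwin-v3/bin:/usr/bin
```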
Displaying Modules

The _display_module internal function implements the user display and show subcommands.
If no arguments are provided, it displays information about every loaded module. Otherwise, only the modules named as arguments are displayed. An awk(1) script is used to convert the information contained in the module file to output that is visually pleasing to the user.
Changing User Initialization Files
The initadd and initrm sub-commands help the user add modules to and remove modules from their shell initialization files. The initialization file is searched for a comment line placed there by the module(1) command. Located immediately after this comment line is a line invoking the module(1) command. It is this line that is changed according to the list of modules given to the initadd and initrm request. If a comment line is not located in the initialization file and the user is requesting that a module be added, the comment line and the module(1) command line is appended to the user’s initialization file.
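The marker-comment bookkeeping described above can be sketched with awk (the tool the paper itself leans on). The marker text and file layout below are invented for illustration; the real module(1) command uses its own marker.

```shell
# Sketch of "initadd tex": find the marker comment in the init file and
# append the module name to the module(1) line that follows it.
rcfile=$(mktemp)
cat > "$rcfile" <<'EOF'
# -- Modules package: do not edit the next line by hand --
module add openwin lang
EOF

awk '/^# -- Modules package/ { print; getline; print $0 " tex"; next }
     { print }' "$rcfile" > "$rcfile.new" && mv "$rcfile.new" "$rcfile"

modline=$(grep '^module' "$rcfile")
echo "$modline"    # module add openwin lang tex
rm -f "$rcfile"
```

initrm would run the inverse edit, deleting the name from the same line; if no marker exists, initadd appends both the marker and a fresh module(1) line.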
Listing Modules

The _list_modules function simply prints out the current value of the _loaded_modules variable.
Available Modules

Each directory in the MODULEPATH variable is listed using the UNIX ls(1)[?] command by the _avail_modules function (see Figure 10).
FIGURE 10. _list_modules() and _avail_modules() Functions

Help for Modules

Two levels of help are provided. Without any arguments, the module(1) command lists the available sub-commands. With 'help' as the only argument, it provides a more complete description of the Modules package. If a second argument, the name of a sub-command, is provided, the module(1) command displays help about that sub-command.
Example Sessions

Contrast Conventional Style with Modules Style

The examples in Figures 11 and 12 depict how the same environment modifications are accomplished with and without using the Modules package. Specifically, the examples show how a user would switch from using OpenWindows Version 2.0 to a Development Version of OpenWindows Version 3.0. The keystrokes the user actually types are in bold. Notice the difference in effort needed to switch between the two window systems. Also notice the difference in the length of the search path variables.
Figure 10 code:

    _list_modules() {
        if [ "$_loaded_modules" = "" ]; then
            echo "No Modules Loaded"
        else
            echo "Loaded: $_loaded_modules"
        fi
    }

    _avail_modules() {
        for dir in $MODULEPATH; do
            echo $dir":"
            (cd $dir; ls)
        done
    }

FIGURE 11. Conventional Style

FIGURE 12. Modules Style

A More Complex Modules Example

A few more module(1) commands are demonstrated in Figure 13. Once the user logs in, a check is made of what modules are currently available.
Then, the ‘lang’ module is displayed to find out what the module does. Notice that the PATH environment variable is changed as the module display indicates.
The ‘answerbook’ module is displayed showing how prerequisites might be used. In this example, answerbook must have either the ‘openwin’ or ‘openwin-v3’ module loaded before it will load.
Finally, the ‘answerbook’ module is loaded and the program is started.
Figure 11 (conventional style):

    system login: jlf
    ++++++++ CSH Login ++++++++
    jlf@system% echo $PATH
    /depot/lang:/depot/openwin/bin:
    /depot/openwin/bin/xview:/usr/local/bin:
    /usr/bin:/usr/ucb:/usr/etc:.:
    /depot/frame/bin:/depot/sunvision/bin:
    /depot/TeX/bin
    jlf@system% setenv OPENWINHOME /depot/openwin-v3
    jlf@system% setenv PATH /depot/lang:
    /depot/openwin-v3/bin:/depot/openwin-v3/bin/xview:
    /usr/local/bin:/usr/bin:/usr/ucb:/usr/etc:.:
    /depot/frame/bin:/depot/sunvision/bin:
    /depot/TeX/bin
    jlf@system% setenv LD_LIBRARY_PATH /depot/lang/SC1.0:/depot/openwin-v3/lib:/usr/lib
    jlf@system% setenv MANPATH /depot/lang/man:
    /depot/openwin-v3/man:/depot/sunvision/man:
    /depot/TeX/man
    jlf@system% openwin

Figure 12 (Modules style):

    system login: jlf
    ++++++++ CSH Login ++++++++
    Loading /site/Modules/openwin
    jlf@system% echo $PATH
    /depot/openwin/bin:/depot/openwin/bin/xview:
    /usr/local/bin:/usr/bin:/usr/ucb:/usr/etc:.
    jlf@system% module rm openwin
    Removing /site/Modules/openwin
    jlf@system% module add openwin-v3
    Loading /site/Modules/openwin-v3
    jlf@system% openwin

Figure 13 (complex Modules example):

    system login: jlf
    ++++++++ CSH Login ++++++++
    Loading /site/Modules/openwin
    jlf@system% module avail
    /site/Modules:
    X11@         init-csh         openwin      tex
    X11R4        init-sh          openwin-v3   vx-devel
    answerbook   lang             saber        xgl
    dos          local            sunvision    frame
    lotus        sunvision-devel  frame-ol     mh
    taac-devel
    jlf@system% module show lang
    ++++++++ ( /site/Modules/lang Module ) ++++++++
    Unbundled Languages
    Prepend PATH: /depot/lang
    Prepend MANPATH: /depot/lang/man
    Prepend LD_LIBRARY_PATH: /depot/lang/SC1.0
    jlf@system% echo $PATH
    /depot/openwin/bin:/depot/openwin/bin/xview:
    /usr/local/bin:/usr/bin:/usr/ucb:/usr/etc:.
    jlf@system% module add lang
    Loading /site/Modules/lang
    jlf@system% echo $PATH
    /depot/lang:/depot/openwin/bin:
    /depot/openwin/bin/xview:/usr/local/bin:
    /usr/bin:/usr/ucb:/usr/etc:.
    jlf@system% module show answerbook
    +++++ ( /site/Modules/answerbook Module ) +++++
    Answerbook Version 1.0
    Prerequisites(ORed): openwin openwin-v3
    Append PATH: /depot/answerbook
    jlf@system% module add answerbook
    Loading /site/Modules/answerbook
    jlf@system% answerbook

Future Work

Having the Modules package as a set of scripts that are sourced into an existing shell helps make the interface shell independent. However, it would be best for performance and cleanliness to have support for the Modules package built into the shell itself. The module commands could be more complex without losing any performance over the current version.

Currently, conventional style search paths can be built in an order that doesn't represent a logical search structure. Users are responsible for maintaining the order of their paths with little or no help. The Modules package can provide the user with the information he needs to build a logical search path with existing modules.
Work is in progress to increase the grammar available in the module file. Syntax permitting if-else statements and path ordering are two examples of new syntax not described in this paper.
Finally, more options are under development to provide users with even greater control over how their environment is constructed by the Modules package. In some cases, users don’t want a variable modified when loading a certain module file. For example, if the OpenWindows[3] dynamic libraries are already cached using ldconfig(8)[1] the LD_LIBRARY_PATH should not be modified when loading the “openwin” module file. Other configuration and control options are being added for more experienced users as well.
Results, Performance Notes The Modules package is quite new and is still under development. Only a few of our users have begun working in the Modules’ environment. They have been very pleased with the package and the benefits it provides.
Currently, it takes a second or two to load a module into the current shell. In an effort to improve this performance, it is possible for the user to specify that the internal functions remain resident from one module(1) command to the next. This provides a marked improvement in speed since the functions are not being completely redefined upon every invocation.
Summary
The Modules package provides both the novice and the experienced UNIX user with a clean interface to the environment. This interface enables the user to easily add, change, and remove application environments dynamically.
John L. Furlani graduated from the University of South Carolina with a BS in Electrical and Computer Engineering. He worked as a system administrator at both USC and the Naval Research Laboratory in Washington, D.C. during his college years. Upon graduation, John joined Sun Microsystems Incorporated as the system administrator for Sun's Research Triangle Park Facility in North Carolina.
Reach him at Sun via U.S. Mail at Sun Microsystems Inc., P.O. Box 13447, Research Triangle Park, NC 27709-13447. Reach him via electronic mail at [email protected] or via the internet at [email protected]

Acknowledgments

Ken Manheimer and Don Libes at the National Institute of Standards and Technology deserve special thanks for their help and ideas toward this paper and some design considerations. Maureen Chew with Sun Microsystems provided me with a test environment and many ideas on how to improve Modules. There are many others that deserve thanks but too many to list here -- thanks to everyone who helped.
References
- [1] Sun Microsystems Incorporated, SunOS Reference Manual.
- [2] Sun Microsystems Incorporated, “Using the NFS Automounter”, System and Network Administration, Chapter 15.
- [3] Sun Microsystems Incorporated, OpenWindows Version 2 Reference Manual.
- [4] Sun Microsystems Incorporated, Using the Sun System Software Answerbook.
- [5] Manheimer, Warsaw, Clark, Rowe, “The Depot: A Framework for Sharing Software Installation Across Organization and UNIX Platform Boundaries”, USENIX Large Installation System Administration IV Conference Proceedings, October 1990, p. 37-76.
Apr 20, 2018 | lists.sdsc.edu
[Rocks-Discuss] Environment Modules Recommendations
>> 1. I use the C modules code. But, I like the tcl version better. It seems more robust and it is certainly easier to fix (or not -- like the erroneous "Duplicate version symbol" below) and enhance (as I have done). I think I'll switch the next time I upgrade our cluster.
>> 2. I have come to not like the recommended organizational scheme of package/version. I think I'll switch to using the version as a suffix, like RPMs do, e.g. package-version. I think that would make it easier to use the built-in default selection mechanism (alphabetic ordering, last one is the default). Right now, for example, my modules are:
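A caveat worth noting about the "last one alphabetically is the default" mechanism mentioned above: plain string ordering mis-sorts multi-digit version components, which matters for a package-version naming scheme. GNU sort's -V (version sort) shows the difference; the gcc names below are illustrative.

```shell
# Alphabetic order picks the wrong "latest" once versions reach two digits:
wrong=$(printf 'gcc-4.9\ngcc-4.10\n' | sort | tail -1)
echo "$wrong"    # gcc-4.9

# GNU sort -V compares version components numerically:
right=$(printf 'gcc-4.9\ngcc-4.10\n' | sort -V | tail -1)
echo "$right"    # gcc-4.10
```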
Environment Modules - Mailing Lists
[Modules] Duplicate version symbol found
[Modules] Duplicate version symbol found
From: Christoph Niethammer <niethammer@hl...> - 2013-07-08 11:33:33

Hello,

I would like to mark modules as default and testing so that they show up like this:

    $ module av mpi/openmpi
    mpi/openmpi/1.6.5(default)
    mpi/openmpi/1.7.2(testing)

and can be loaded via e.g.

    $ module load mpi/openmpi/testing

I tried to use module-version in .modulerc to achieve this behaviour with the commands

    module-version openmpi openmpi/1.6.5
    module-version openmpi openmpi/1.7.2

but I get a warning "Duplicate version symbol 'testing' found". For the default version there is no such warning. So it seems to me that there is a problem/bug in module-version.

Best regards
Christoph Niethammer

PS: My current workaround for this problem is to use a variable in all the .modulerc files.

    #%Module1.0
    # File: $MODULEPATH/mpi/openmpi/.modulerc
    set DEFAULT 1.6.5
    module-version openmpi openmpi/$DEFAULT

    # circumvent problem with duplicate definition of symbol testing
    # The used variable name has to be unique to prevent conflicts if
    # this workaround is used in multiple .modulerc files.
    if { ![info exists MPI_OPENMPI_TESTING] } {
        set MPI_OPENMPI_TESTING 1.7.2
        module-version mpi/openmpi/$MPI_OPENMPI_TESTING testing
    }
Jul 28, 2017 | hpc.nrel.gov
Instructions and policies for installing and maintaining environment modules on Peregrine.
Toolchains

Libraries and applications are built around the concept of 'toolchains'; at present a toolchain is defined as a specific version of a compiler and an MPI library, or the lack thereof. Applications are typically built with only a single toolchain, whereas libraries are built with and installed for potentially multiple toolchains as necessary to accommodate ABI differences produced by different toolchains. Workflows are primarily composed of the execution of a sequence of applications which may use different toolchains and might be orchestrated by an application or other tool. The toolchains presently supported are:
- impi-intel
- openmpi-gcc
- comp-intel (no MPI)
- gcc (no MPI)
Loading one of the above MPI-compiler modules will also automatically load the associated compiler module (currently gcc 4.8.2 and comp-intel/13.1.3 are the recommended compilers). Certain applications may of course require alternative toolchains. If demand for additional options becomes significant, requests for additional toolchain support will be considered on a case-by-case basis.
Building Module Files

Here are the steps for building an associated environment module for the installed mysoft software. First, create the appropriate module location:

    % mkdir -p /nopt/nrel/apps/modules/candidate/modulefiles/mysoft  # Use a directory and not a file.
    % touch /nopt/nrel/apps/modules/candidate/modulefiles/mysoft/1.3 # Place environment module tcl code here.
    % touch .version                                                 # If required, indicate default module in this file.

Next, edit the module file itself ("1.3" in the example). The current version of the HPC Standard Module Template is:

    #%Module -*- tcl -*-

    # Specify conflicts
    # conflict 'appname'

    # Prerequisite modules
    # prereq 'appname/version....'

    #################### Set top-level variables #########################

    # 'Real' name of package, appears in help/display message
    set PKG_NAME pkg_name

    # Version number (eg v major.minor.patch)
    set PKG_VERSION pkg_version

    # Name string from which enviro/path variable names are constructed
    # Will be similar to, but not necessarily the same as, PKG_NAME
    # eg PKG_NAME-->VisIt PKG_PREFIX-->VISIT
    set PKG_PREFIX pkg_prefix

    # Path to the top-level package install location.
    # Other enviro/path variable values constructed from this
    set PKG_ROOT pkg_root

    # Library name from which to construct link line
    # eg PKG_LIBNAME=fftw ---> -L/usr/lib -lfftw
    set PKG_LIBNAME pkg_libname

    ######################################################################

    proc ModulesHelp { } {
        global PKG_VERSION
        global PKG_ROOT
        global PKG_NAME
        puts stdout "Build: $PKG_NAME-$PKG_VERSION"
        puts stdout "URL: http://www.___________"
        puts stdout "Description: ______________________"
        puts stdout "For assistance contact [email protected]"
    }

    module-whatis "$PKG_NAME: One-line basic description"

    #
    # Standard install locations
    #
    prepend-path PATH            $PKG_ROOT/bin
    prepend-path MANPATH         $PKG_ROOT/share/man
    prepend-path INFOPATH        $PKG_ROOT/share/info
    prepend-path LD_LIBRARY_PATH $PKG_ROOT/lib
    prepend-path LD_RUN_PATH     $PKG_ROOT/lib

    #
    # Set environment variables for configure/build
    #

    ##################### Top level variables ##########################
    setenv ${PKG_PREFIX}      "$PKG_ROOT"
    setenv ${PKG_PREFIX}_ROOT "$PKG_ROOT"
    setenv ${PKG_PREFIX}_DIR  "$PKG_ROOT"
    ####################################################################

    ################ Template include directories ######################
    # Only path names
    setenv ${PKG_PREFIX}_INCLUDE     "$PKG_ROOT/include"
    setenv ${PKG_PREFIX}_INCLUDE_DIR "$PKG_ROOT/include"
    # 'Directives'
    setenv ${PKG_PREFIX}_INC "-I $PKG_ROOT/include"
    ####################################################################

    ################## Template library directories ####################
    # Only path names
    setenv ${PKG_PREFIX}_LIB         "$PKG_ROOT/lib"
    setenv ${PKG_PREFIX}_LIBDIR      "$PKG_ROOT/lib"
    setenv ${PKG_PREFIX}_LIBRARY_DIR "$PKG_ROOT/lib"
    # 'Directives'
    setenv ${PKG_PREFIX}_LD   "-L$PKG_ROOT/lib"
    setenv ${PKG_PREFIX}_LIBS "-L$PKG_ROOT/lib -l$PKG_LIBNAME"
    ####################################################################
- The tags 'pkg_name', 'pkg_version', 'pkg_prefix', 'pkg_root' and 'pkg_libname' should be replaced by the appropriate names for the library or application.
- Specify any module prerequisites and/or conflicts.
- Provide content for the URL and Description in the 'ModulesHelp' procedure
- Give a one-line description for the 'module whatis' line. This should reflect pkg_name, pkg_version, and the toolchain used to build.
- If needed, augment or edit the pre-defined environment settings. (Note: the default procedure is to use prepend-path, to permit bottom-up construction of an environment stack.)
- The ${PKG_PREFIX}_LIBS variable could require manual changes by developers due to non-standard and/or multiple library names.
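The tag substitution described above is easy to script. The following is a minimal sketch; the mysoft name, version, and install path are illustrative, and the here-document is a three-line stand-in for the real template file rather than its full contents.

```shell
#!/bin/sh
# Sketch: instantiate the module template by substituting the pkg_* tags.
# The here-document stands in for the real template file.
tmpl=$(cat <<'EOF'
set PKG_NAME pkg_name
set PKG_VERSION pkg_version
set PKG_ROOT pkg_root
EOF
)
modfile=$(printf '%s\n' "$tmpl" | sed \
    -e 's/pkg_name/mysoft/' \
    -e 's/pkg_version/1.3/' \
    -e 's|pkg_root|/nopt/nrel/apps/mysoft/1.3|')
printf '%s\n' "$modfile"
```

The same sed invocation, pointed at the actual template and redirected into the modulefiles/mysoft/1.3 file, produces a starting point that still needs the help text and any package-specific settings filled in by hand.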
The current module file template is maintained in a version control repo at [email protected]:hpc/hpc-devel.git. The template file is located in hpc-devel/modules/modTemplate. To see the current file:

git clone [email protected]:hpc/hpc-devel.git
cd ./hpc-devel/modules/
cat modTemplate

Next, specify a default version of the module package. Here is an example of an associated .version file for a set of module files:
% cat /nopt/nrel/apps/modules/candidate/modulefiles/mysoft/.version
#%Module########################################
# vim: syntax=tcl
set ModulesVersion "1.3"

The .version file is only useful if there are multiple versions of the software installed. Have the modulefile write notes to stderr as needed, so the user knows how to use the software correctly and where to find additional pointers.
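The modules system determines the default by reading the set ModulesVersion line from .version. As a rough illustration (not the actual implementation), the value can be pulled out with sed; the file content below mirrors the example above.

```shell
#!/bin/sh
# Sketch: extract the default version recorded in a .version file.
version_file=$(cat <<'EOF'
#%Module########################################
# vim: syntax=tcl
set ModulesVersion "1.3"
EOF
)
default=$(printf '%s\n' "$version_file" | \
    sed -n 's/^set ModulesVersion "\(.*\)"/\1/p')
echo "$default"
```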
NOTE: For modules with more than one level of sub-directory, although the default module as specified above is displayed correctly by the modules system, it is not loaded correctly; if more than one version exists, the most recent one will be loaded by default. In other words, the above will work fine for dakota/5.3.1 if 5.3.1 is a file alongside the file dakota/5.4, but not for dakota/5.3.1/openmpi-gcc when a dakota/5.4 directory is present. In this case, to force the correct default module to be loaded, a dummy symlink needs to be added in dakota/ that points to the module specified in .version.
Example
% cat /nopt/nrel/apps/modules/default/modulefiles/dakota/.version
#%Module########################################
# vim: syntax=tcl
set ModulesVersion "5.3.1/openmpi-gcc"

% module avail dakota
------------------ /nopt/nrel/apps/modules/default/modulefiles ------------------
dakota/5.3.1/impi-intel    dakota/5.3.1/openmpi-gcc(default)   dakota/default
dakota/5.3.1/openmpi-epel  dakota/5.4/openmpi-gcc

% ls -l /nopt/nrel/apps/modules/default/modulefiles/dakota
total 8
drwxrwsr-x 2 ssides   n-apps 8192 Sep 22 13:56 5.3.1
drwxrwsr-x 2 hsorense n-apps   96 Jun 19 10:17 5.4
lrwxrwxrwx 1 cchang   n-apps   17 Sep 22 13:56 default -> 5.3.1/openmpi-gcc

Naming Modules

Software which is made accessible via the modules system generally falls into one of three categories.
- Applications: these may be intended to carry out scientific calculations, or tasks like performance profiling of codes.
- Libraries: collections of header files and object code intended to be incorporated into an application at build time, and/or accessed via dynamic loading at runtime. The principal exceptions are technical communication libraries such as MPI, which are categorized as toolchain components below.
- Toolchains: compilers (e.g., Intel, GCC, PGI) and MPI libraries (OpenMPI, IntelMPI, mvapich2).
Often a package will contain both executable files and libraries. Whether it is classified as an Application or a Library depends on its primary mode of utilization. For example, although the HDF5 package contains a variety of tools for querying HDF5-format files, its primary usage is as a library which applications can use to create or access HDF5-format files. Each package can also be distinguished as a vendor- or developer-supplied binary, or a collection of source code and build components ( e.g. , Makefile(s)).
For pre-built applications or libraries, or for applications built from source code, the basic form of the module name should be

{package_name}/{version}

For libraries built from source, or any package containing components which can be linked against in normal usage, the name should be

{package_name}/{version}/{toolchain}

The difference arises from two considerations. For supplied binaries, the assumed vendor or developer expectation is that a package will run either on a specified Linux distribution (and may have specific requirements satisfied by the distribution), or across varied distributions (and has fairly generic requirements satisfied by most or all distributions). Thus, the toolchain for supplied binaries is implicitly supplied by the operating system. For source code applications, the user should not be directly burdened with the underlying toolchain requirement; where this is relevant ( i.e. , satisfying dependencies), the associated information should be available in module help output, as well as through dependency statements in the module itself.
Definitions:
{package_name} : This should be chosen such that the associated Application, Library, or Toolchain component is intuitively obvious, while concomitantly distinguishing its target from other Applications, Libraries, or Toolchain components likely to be made available on the system through the modules. So, "gaussian" is a sensible package_name , whereas "gsn" would be too generic and of unclear intent. Within these guidelines, though, there is some discretion left to the module namer.
{version} : The base version generally reflects the state of development of the underlying package, and is supplied by the developers or vendor. However, a great deal of flexibility is permitted here with respect to build options outside of the recognized {toolchain} terms. So, a Scalapack-enabled package version might be distinguished from a LAPACK-linked one by appending "-sc" to the base version, provided this is explained in the "module help" or "module show" information. {version} provides the most flexibility to the module namer.
{toolchain} : This is solely intended to track the compiler and MPI library used to build a source package. It is not intended to track the versions of these toolchain components, nor to track the use of associated toolkits ( e.g. , Cilk Plus) or libraries ( e.g. , MKL, Scalapack). As such, this term takes the form {MPI}-{compiler} , where {MPI} is one of
- openmpi
- impi (Intel MPI)
and {compiler} is one of

- gcc
- intel
- epel (which implies the gcc supplied with the OS, possibly at a newer version number than that in the base OS exposed in the filesystem without the EPEL module).

Module Directory Organization

For general support, modulefiles can be installed in three top locations:

- /nopt/Modules/3.2.10/modulefiles (Modules supplied by HP, not supported by the HPC Modeling & Simulation group)
- /nopt/nrel/apps/modules (majority of modules, most common location)
- /nopt/nrel/ecom/modules (specialized modules for certain groups/applications)

In addition, more specific requests can be satisfied in two other ways:

- /projects/X/modules (Modules useful to a single project with allocated resource)
- /home/$USER/modules (Modules useful to a single user)

For the '/nopt/nrel/apps' modules location (where most general installations should be made), the following sub-directories have been created to manage how modules are developed, tested and provided for production-level use. An example directory hierarchy for the module files is as follows:
- /nopt/nrel/apps/modules/candidate/modulefiles
- /nopt/nrel/apps/modules/default/modulefiles
- /nopt/nrel/apps/modules/deprecated/modulefiles
- /nopt/nrel/apps/modules/hpc/modulefiles
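These locations are exposed to users through the MODULEPATH search path. A minimal sketch of the effect in plain shell (the project name myproj is hypothetical; prepending a directory is what the standard `module use <dir>` command effectively does):

```shell
#!/bin/sh
# Sketch: prepending per-project and per-user module directories to
# MODULEPATH, so their modulefiles shadow the site-wide defaults.
MODULEPATH=/nopt/nrel/apps/modules/default/modulefiles
for d in /projects/myproj/modules "$HOME/modules"; do
    MODULEPATH=$d:$MODULEPATH
done
echo "$MODULEPATH"
```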
[wjones@login2 nrel]$ tree -a apps/modules/default/modulefiles/hdf5-parallel/
apps/modules/default/modulefiles/hdf5-parallel/
├── .1.6.4
│   ├── impi-intel
│   ├── openmpi-gcc
│   └── .version
├── 1.8.11
│   ├── impi-intel
│   └── openmpi-gcc
└── .version

[wjones@login2 nrel]$ tree -a apps/modules/default/modulefiles/hdf5
apps/modules/default/modulefiles/hdf5
├── .1.6.4
│   └── intel
├── 1.8.11
│   ├── gcc
│   └── intel
└── .version

[wjones@login2 nrel]$ module avail hdf5
------------------ /nopt/nrel/apps/modules/default/modulefiles ------------------
hdf5/1.8.11/gcc              hdf5-parallel/1.8.11/impi-intel(default)
hdf5/1.8.11/intel(default)   hdf5-parallel/1.8.11/openmpi-gcc

Module Migration (last modified Jul 06, 2015 03:16 PM)
- This policy covers three file paths. Each corresponds to a status of modules within a broader workflow for managing modules. (The other module locations are not directly part of the policy.)
- /nopt/nrel/apps/modules/candidate/modulefiles : This is the starting point for new modules. Modules are to be created here for testing and validation prior to production release. Modules here are not necessarily expected to work without issues, and may be modified or deleted without warning.
- /nopt/nrel/apps/modules/default/modulefiles : This is the production location, visible to the general user community by default. Modules here carry the expectation of functioning properly. Movement of modulefiles into and out of this location is managed through a monthly migration process.
- /nopt/nrel/apps/modules/deprecated/modulefiles : This location contains older modules which are intended for eventual archiving. Conflicts with newer software may render these modules non-functional, and so there is not an expectation of maintenance for these. They are retained to permit smooth migration out of the Peregrine software stack ( i.e. , users will still have access to them and may register objections/issues while retaining their productivity).
- "modifications" to modules entail
- Additions to any of the three stages;
- Major changes in functionality for modules in /default or /deprecated;
- Archiving modules from /deprecated; or,
- Making a module "default"
These are the only acceptable atomic operations. Thus, a migration is defined as an addition to one path and a subsequent deletion from its original path.
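In file terms, a migration as defined above is a copy to the destination stage followed by a deletion from the source stage. A sketch using a scratch tree in place of /nopt/nrel/apps/modules (the mysoft/1.3 module is hypothetical):

```shell
#!/bin/sh
# Sketch: migrate mysoft/1.3 from /candidate to /default.
root=$(mktemp -d)                # stands in for /nopt/nrel/apps/modules
mkdir -p "$root/candidate/modulefiles/mysoft"
echo '#%Module' > "$root/candidate/modulefiles/mysoft/1.3"

src=$root/candidate/modulefiles/mysoft/1.3
dst=$root/default/modulefiles/mysoft/1.3
mkdir -p "$(dirname "$dst")"     # addition to the new path...
cp "$src" "$dst" && rm "$src"    # ...then deletion from the original
ls "$root/default/modulefiles/mysoft"
```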
- Announcements to users may be one of the following six options:
- Addition to /candidate → "New Module";
- Migration from /candidate to /default → "Move to Production";
- Migration from /default to /deprecated → "Deprecate";
- Removing visibility and accessibility from /deprecated → "Archive";
- Major change in functionality in /default or /deprecated → "Modify"; or,
- Making a module default → "Make default"
Changes outside of these options, e.g. , edits in /candidate, will not be announced as batching these changes would inhibit our ability to respond nimbly to urgent problems.
- A "major change in functionality" is an edit to the module that could severely compromise users' productivity in the absence of adaptation on their part. So, pointing to a different application binary could result in incompatibilities in datasets generated before and after the module change; changing a module name can break workflows over thousands of jobs. On the other hand, editing inline documentation, setting an environment variable that increases performance with no side effects, or changing a dependency maintenance revision (e.g., a secondary module load of a library from v3.2.1 to v3.2.2) is unlikely to create major problems and does not need explicit attention.
- All module modifications are to be documented in the Sharepoint Modules Modifications table prior to making any changes (this table is linked at http://cs.hpc.nrel.gov/modeling/hpc-sharepoint-assets).
- Module modifications are to be batched for execution on monthly calendar boundaries, and (a) announced to [email protected] two weeks prior to execution, and (b) added to http://hpc.nrel.gov/users/announcements as a new page, which will auto-populate the table visible on the front page. Endeavor to make this list final prior to the first announcement.
- Modules may not be added to or deleted from /default without a corresponding deletion/addition from one of the other categories, i.e. , they may only be migrated relative to /default, not created or deleted directly.
- Good faith testing. There is not currently a formally defined testing mechanism for new modules in /candidate. It is thus left to the individual module steward's (most likely the individual who owns the modulefile in the *NIX sense) discretion what is a defensible test regimen. Within the current document's scope, this specifically relates to the module functionality, not the application functionality.
- Library and toolchain dependencies must be checked for prior to removal of modules from .../deprecated. For example, if a user identifies an application dependency on a deprecated library or toolchain, then the application module will point to the specific library or toolchain version; if it did not, then presumably an updated library/toolchain would be breaking the application. Thus, checking for dependencies on deprecated versions can be done via a simple grep of all candidate and production modules. (An obvious exception is if the user is handling the dependencies in their own scripts; this case cannot be planned around.) It is assumed that an identified dependency on a deprecated module would spur rebuilding and testing of the application against newer libraries/toolchains, so critical dependencies on deprecated tools should not often arise in practice.
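That grep check can be sketched as follows; the scratch tree stands in for the candidate and production module directories, and openmpi/1.4 is an illustrative deprecated version.

```shell
#!/bin/sh
# Sketch: find candidate/production modulefiles that still depend on a
# deprecated toolchain version before archiving it.
root=$(mktemp -d)
mkdir -p "$root/candidate/modulefiles" "$root/default/modulefiles"
printf 'prereq mpi/openmpi/1.4\n'   > "$root/default/modulefiles/app-a"
printf 'prereq mpi/openmpi/1.6.3\n' > "$root/candidate/modulefiles/app-b"

# -r: recurse over every modulefile; -l: print only matching file names
hits=$(grep -rl 'openmpi/1\.4' "$root/candidate/modulefiles" \
                               "$root/default/modulefiles")
echo "$hits"
```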
Jul 28, 2017 | genomics.upenn.edu
Basic module usage

To know what modules are available, you'll need to run the "module avail" command from an interactive session:
[asrini@consign ~]$ bsub -Is bash
Job <9990024> is submitted to default queue <interactive>.
<<Waiting for dispatch ...>>
<<Starting on node063.hpc.local>>
[asrini@node063 ~]$ module avail
------------------------ /usr/share/Modules/modulefiles ------------------------
NAMD-2.9-Linux-x86_64-multicore  dot              module-info           picard-1.96      rum-2.0.5_05
STAR-2.3.0e                      java-sdk-1.6.0   modules               pkg-config-path  samtools-0.1.19
STAR-hg19                        java-sdk-1.7.0   mpich2-x86_64         python-2.7.5     use.own
STAR-mm9                         ld-library-path  null                  r-libs-user
bowtie2-2.1.0                    manpath          openmpi-1.5.4-x86_64  ruby-1.8.7-p374
devtoolset-2                     module-cvs       perl5lib              ruby-1.9.3-p448
The module names should be pretty self-explanatory, but some are not. To see information about a module, you can issue a module show [module name]:

[asrini@node063 ~]$ module show null
-------------------------------------------------------------------
/usr/share/Modules/modulefiles/null:

module-whatis    does absolutely nothing
-------------------------------------------------------------------
[asrini@node063 ~]$ module show r-libs-user
-------------------------------------------------------------------
/usr/share/Modules/modulefiles/r-libs-user:

module-whatis    Sets R_LIBS_USER=$HOME/R/library
setenv           R_LIBS_USER ~/R/library
-------------------------------------------------------------------
[asrini@node063 ~]$ module show devtoolset-2
-------------------------------------------------------------------
/usr/share/Modules/modulefiles/devtoolset-2:

module-whatis    Devtoolset-2 packages include the newer versions of gcc
prepend-path     PATH /opt/rh/devtoolset-2/root/usr/bin
prepend-path     MANPATH /opt/rh/devtoolset-2/root/usr/share/man
prepend-path     INFOPATH /opt/rh/devtoolset-2/root/usr/share/info
-------------------------------------------------------------------

Example use of modules:
[asrini@node063 ~]$ python -V
Python 2.6.6
[asrini@node063 ~]$ which python
/usr/bin/python
[asrini@node063 ~]$ module load python-2.7.5
[asrini@node063 ~]$ python -V
Python 2.7.5
[asrini@node063 ~]$ which python
/opt/software/python/python-2.7.5/bin/python

After running the above commands, you will be able to use python v2.7.5 until you exit the interactive session or unload the module:
[asrini@node063 ~]$ module unload python-2.7.5
[asrini@node063 ~]$ which python
/usr/bin/python

Modules may also be included in your job scripts and submitted as a batch job.
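Under the hood, this load/unload pair is just PATH editing. A reduced shell sketch, with the install prefix taken from the session above:

```shell
#!/bin/sh
# Sketch: the PATH effect of 'module load python-2.7.5' and its unload.
PATH_ORIG=/usr/bin:/bin
PATH_LOADED=/opt/software/python/python-2.7.5/bin:$PATH_ORIG  # load
echo "$PATH_LOADED"
PATH_RESTORED=$PATH_ORIG                                      # unload
echo "$PATH_RESTORED"
```

Because the module's directory is prepended, its python shadows /usr/bin/python while the module is loaded, and unloading restores the original lookup order.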
Using Modules at Login

In order to have modules automatically load into your environment, add the module commands to your $HOME/.bashrc file. Note that modules are not available on the PMACS head node; hence, you'll need to ensure that your login script attempts to load a module only if you are on a compute node:
[asrini@consign ~]$ more .bashrc
# .bashrc

# Source global definitions
if [ -f /etc/bashrc ]; then
        . /etc/bashrc
fi
#
#
# Modules to load
if [ $HOSTNAME != "consign.hpc.local" ] && [ $HOSTNAME != "mercury.pmacs.upenn.edu" ]; then
        module load python-2.7.5
fi

# more stuff below .....

[asrini@consign ~]$ which python
/usr/bin/python
[asrini@consign ~]$ bsub -Is bash
Job <172129> is submitted to default queue <interactive>.
<<Waiting for dispatch ...>>
<<Starting on node063.hpc.local>>
[asrini@node063 ~]$ which python
/opt/software/python/python-2.7.5/bin/python
February 11, 2014 | Wiki de Calcul Québec
Example
A module file's contents are simple enough; a good starting point is to take an existing file as an example and modify the variables it contains to adapt it to the module that you wish to install:
File: modulefiles/mpi/openmpi/1.6.3_intel

#%Module1.0
#####################################################################
##
## OPENMPI MPI lib
##
##
proc ModulesHelp { } {
    puts stderr "\tAdds the OpenMPI library to your environment. "
}

module-whatis "(Category_______) mpi"
module-whatis "(Name___________) OpenMPI"
module-whatis "(Version________) 1.6.3"
module-whatis "(Website________) http://www.open-mpi.org/"
module-whatis "(Authorship_____) The Open MPI Team"
module-whatis "(Compiler_______) Intel 2013"
module-whatis "(Flags__________) CFLAGS='-O3 -xHOST -Wall' ../openmpi-1.6.3/configure --prefix=prefix --with-threads "
module-whatis "    --enable-mpi-thread-multiple --with-openib --enable-shared --enable-static --with-ft=cr --enable-ft-thread "
module-whatis "    --with-blcr=/software/apps/blcr/0.8.4 --with-blcr-libdir=/software/apps/blcr/0.8.4/lib --with-tm=/opt/torque "
module-whatis "    CFLAGS='CFLAGS' --with-io-romio-flags='--with-file-system=testfs+ufs+nfs+lustre'"
module-whatis "(Dependencies___) Intel 2013"

conflict mpi
prereq compilers/intel/2013

set synopsys      /software/MPI/openmpi/1.6.3_intel
set blcr_synopsys /software/apps/blcr/0.8.4

prepend-path PATH               $synopsys/bin:$blcr_synopsys/bin
prepend-path LD_LIBRARY_PATH    $synopsys/lib:$blcr_synopsys/lib
prepend-path C_INCLUDE_PATH     $synopsys/include
prepend-path CXX_INCLUDE_PATH   $synopsys/include
prepend-path CPP_INCLUDE_PATH   $synopsys/include
prepend-path CPLUS_INCLUDE_PATH $synopsys/include
prepend-path MANPATH            $synopsys/share/man:$blcr_synopsys/man

setenv OMPI_MCA_plm_rsh_num_concurrent 960

Let us consider this example in detail. This file starts with a comment (lines starting with the number sign #), which specifies that it is a module using format 1.0. Other comments show that this is a module for the MPI library OpenMPI. The actual module then starts by defining a function, ModulesHelp. This function is called when a user runs the following command:
[name@server $] module help mpi/openmpi/1.6.3_intel
This command outputs the message "Adds the OpenMPI library to your environment." to standard error.

Note: All messages that are displayed by the module command use standard error (stderr) instead of standard output (stdout).
The module then continues with a list of commands including the following:
Command        Meaning
module-whatis  Allows for a more elaborate description of the module, which is shown using module whatis module_name.
conflict       Specifies that this module cannot be loaded if the given module was already loaded.
prereq         Specifies a pre-required module.
set            Defines a variable that is internal to the module.
prepend-path   Prefixes an environment variable of the PATH type with the specified path.
setenv         Defines an environment variable.

The above module defines a detailed description using the module-whatis command. Following that, it specifies that the module cannot be loaded together with another mpi module; that means only one mpi module can be loaded at a time. After that, it specifies that the Intel compiler, version 2013, is required.
After that, the module adds some directories to various environment variables, and finally defines a new environment variable (an OpenMPI control parameter in this case).
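The prepend-path behavior can be emulated in plain shell, which also shows why the empty-variable case matters (no dangling colon). This is a sketch of the semantics only, not of the modules implementation:

```shell
#!/bin/sh
# Sketch: prepend a directory to a colon-separated path variable,
# avoiding a trailing ':' when the variable starts out empty.
prepend_path() {  # $1 = variable name, $2 = directory
    eval "cur=\$$1"
    if [ -z "$cur" ]; then
        eval "$1=\$2"
    else
        eval "$1=\$2:\$cur"
    fi
}

DEMO_PATH=""
prepend_path DEMO_PATH /software/MPI/openmpi/1.6.3_intel/bin
prepend_path DEMO_PATH /software/apps/blcr/0.8.4/bin
echo "$DEMO_PATH"
```

Note that successive prepends put the most recently added directory first, which is how a loaded module ends up shadowing system defaults.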
ADMIN Magazine
... ... ...
If you want to change your compiler or libraries – basically anything to do with your environment – you might be tempted to change your $PATH in the .bashrc file (if you are using Bash) and then log out and log back in whenever you need to change your compiler/MPI combination. Initially this sounds like a pain, and it is, but it works to some degree. It doesn't work in the situation where you want to run multiple jobs, each with a different compiler/MPI combination.

For example, say I have a job using the GCC 4.6.2 compilers with Open MPI 1.5.2, and then I have a job using GCC 4.5.3 and MPICH2. If I have both jobs in the queue at the same time, how can I control my .bashrc to make sure each job has the correct $PATH? The only way to do this is to restrict myself to one job in the queue at a time. When it's finished I can then change my .bashrc and submit a new job. Because you are using a different compiler/MPI combination from what is in the queue, even for something as simple as code development, you have to watch when the job is run to make sure your .bashrc matches your job.
The Easy Way
A much better way to handle compiler/MPI combinations is to use Environment Modules. (Be careful not to confuse "environment modules" with "kernel modules.") According to the website, "The Environment Modules package provides for the dynamic modification of a user's environment via modulefiles." Although this might not sound earth shattering, it actually is a quantum leap for using multiple compilers/MPI libraries, but you can use it for more than just that, which I will talk about later.
You can use Environment Modules to alter or change environment variables such as $PATH, $MANPATH, $LD_LIBRARY_PATH, and others. Because most job scripts for resource managers, such as LSF, PBS-Pro, and MOAB, are really shell scripts, you can incorporate Environment Modules into the scripts to set the appropriate $PATH for your compiler/MPI combination, or any other environment variables an application requires for operation.
How you install Environment Modules depends on how your cluster is built. You can build it from source, as I will discuss later, or you can install it from your package manager. Just be sure to look for Environment Modules.
Using Environment Modules
To begin, I'll assume that Environment Modules is installed and functioning correctly, so you can now test a few of the options typically used. In this article, I'll be using some examples from TACC. The first thing to check is what modules are available to you by using the module avail command:
[laytonjb@dlogin-0 ~]$ module avail

---------------------- /opt/apps/intel11_1/modulefiles ----------------------
fftw3/3.2.2   gotoblas2/1.08   hdf5/1.8.4   mkl/10.2.4.032   mvapich2/1.4
netcdf/4.0.1  openmpi/1.4

-------------------------- /opt/apps/modulefiles ----------------------------
gnuplot/4.2.6   intel/11.1(default)   papi/3.7.2
intel/10.1      lua/5.1.4             pgi/10.2

----------------------------- /opt/modulefiles ------------------------------
Linux  TACC  TACC-paths  cluster

-------------------------- /cm/shared/modulefiles ---------------------------
acml/gcc/64/4.3.0                  fftw3/gcc/64/3.2.2                  mpich2/smpd/ge/open64/64/1.1.1p1
acml/gcc/mp/64/4.3.0               fftw3/open64/64/3.2.2               mpiexec/0.84_427
acml/gcc-int64/64/4.3.0            gcc/4.3.4                           mvapich/gcc/64/1.1
acml/gcc-int64/mp/64/4.3.0         globalarrays/gcc/openmpi/64/4.2     mvapich/open64/64/1.1
acml/open64/64/4.3.0               globalarrays/open64/openmpi/64/4.2  mvapich2/gcc/64/1.2
acml/open64-int64/64/4.3.0         hdf5/1.6.9                          mvapich2/open64/64/1.2
blacs/openmpi/gcc/64/1.1patch03    hpl/2.0                             netcdf/gcc/64/4.0.1
blacs/openmpi/open64/64/1.1patch03 intel-cluster-checker/1.3           netcdf/open64/64/4.0.1
blas/gcc/64/1                      intel-cluster-runtime/2.1           netperf/2.4.5
blas/open64/64/1                   intel-tbb/ia32/22_20090809oss       open64/4.2.2.2
bonnie++/1.96                      intel-tbb/intel64/22_20090809oss    openmpi/gcc/64/1.3.3
cmgui/5.0                          iozone/3_326                        openmpi/open64/64/1.3.3
default-environment                lapack/gcc/64/3.2.1                 scalapack/gcc/64/1.8.0
fftw2/gcc/64/double/2.1.5          lapack/open64/64/3.2.1              scalapack/open64/64/1.8.0
fftw2/gcc/64/float/2.1.5           mpich/ge/gcc/64/1.2.7               sge/6.2u3
fftw2/open64/64/double/2.1.5       mpich/ge/open64/64/1.2.7            torque/2.3.7
fftw2/open64/64/float/2.1.5        mpich2/smpd/ge/gcc/64/1.1.1p1

This command lists what environment modules are available.
You'll notice that TACC has a very large number of possible modules that provide a range of compilers, MPI libraries, and combinations. A number of applications show up in the list as well.
You can check which modules are "loaded" in your environment by using the list option with the module command:
[laytonjb@dlogin-0 ~]$ module list
Currently Loaded Modulefiles:
  1) Linux        2) intel/11.1   3) mvapich2/1.4   4) sge/6.2u3   5) cluster   6) TACC

This indicates that when I log in, I have six modules already loaded for me. If I want to use any additional modules, I have to load them manually:
[laytonjb@dlogin-0 ~]$ module load gotoblas2/1.08
[laytonjb@dlogin-0 ~]$ module list
Currently Loaded Modulefiles:
  1) Linux        3) mvapich2/1.4   5) cluster   7) gotoblas2/1.08
  2) intel/11.1   4) sge/6.2u3      6) TACC

You can just cut and paste from the list of available modules to load the ones you want or need. (This is what I do, and it makes things easier.) By loading a module, you will have just changed the environment variables defined for that module. Typically this is $PATH, $MANPATH, and $LD_LIBRARY_PATH.
To unload or remove a module, just use the unload option with the module command, but you have to specify the complete name of the environment module:
[laytonjb@dlogin-0 ~]$ module unload gotoblas2/1.08
[laytonjb@dlogin-0 ~]$ module list
Currently Loaded Modulefiles:
  1) Linux   2) intel/11.1   3) mvapich2/1.4   4) sge/6.2u3   5) cluster   6) TACC

Notice that the gotoblas2/1.08 module is no longer listed. Alternatively, you can unload all loaded environment modules using module purge:
[laytonjb@dlogin-0 ~]$ module purge
[laytonjb@dlogin-0 ~]$ module list
No Modulefiles Currently Loaded.

You can see here that after the module purge command, no more environment modules are loaded.
If you are using a resource manager (job scheduler), you are likely creating a script that requests the resources and runs the application. In this case, you might need to load the correct Environment Modules in your script. Typically after the part of the script in which you request resources (in the PBS world, these are defined as #PBS commands), you will then load the environment modules you need.
Now that you've seen a few basic commands for using Environment Modules, I'll go into a little more depth, starting with installing from source. Then I'll use the module in a job script and write my own module.
Building Environment Modules for Clusters
In my opinion, the quality of open source code has improved over the last several years to the point at which building and installing is fairly straightforward, even if you haven't built any code before. If you haven't built code, don't be afraid to start with Environment Modules.
For this article, as an example, I will build Environment Modules on a "head" node in the cluster in /usr/local. I will assume that you have /usr/local NFS-exported to the compute nodes, or some other filesystem or directory that is mounted on the compute nodes (perhaps a global filesystem?). If you are building and testing your code on a production cluster, be sure to check that /usr/local is mounted on all of the compute nodes.
To begin, download the latest version – it should be a *.tar.gz file. (I'm using v3.2.6, but the latest as of writing this article is v3.2.9). To make things easier, build the code in /usr/local. The documentation that comes with Environment Modules recommends that it be built in /usr/local/Modules/src. As root, run the following commands:
% cd /usr/local
% mkdir Modules
% cd Modules
% mkdir src
% cp modules-3.2.6.tar.gz /usr/local/Modules/src
% gunzip -c modules-3.2.6.tar.gz | tar xvf -
% cd modules-3.2.6

At this point, I would recommend you carefully read the INSTALL file; it will save your bacon. (The first time I built Environment Modules, I didn't read it and had lots of trouble.)
Before you start configuring and building the code, you need to fulfill a few prerequisites. First, you should have Tcl installed, as well as the Tcl Development package. Because I don't know what OS or distribution you are running, I'll leave to you the tasks of installing Tcl and Tcl Development on the node where you will be building Environment Modules.
At this point, you should configure and build Environment Modules. As root, enter the following commands:
% cd /usr/local/Modules/src/modules-3.2.6
% ./configure
% make
% make install

The INSTALL document recommends making a symbolic link in /usr/local/Modules connecting the current version of Environment Modules to a directory called default:

% cd /usr/local/Modules
% sudo ln -s 3.2.6 default
% cd /usr/local/Modules % sudo ln -s 3.2.6 defaultThe reason they recommend using the symbolic link is that, if you upgrade Environment Modules to a new version, you build it in /usr/local/Modules/src and then create a symbolic link from /usr/local/Modules/<new> to /usr/local/Modules/default, which makes it easier to upgrade.
The next thing to do is copy one (possibly more) of the init files for Environment Modules to a global location for all users. For my particular cluster, I chose to use the sh init file. This file will configure Environment Modules for all of the users. I chose to use the sh version rather than csh or bash, because sh is the least common denominator:
% sudo cp /usr/local/Modules/default/init/sh /etc/profile.d/modules.sh
% chmod 755 /etc/profile.d/modules.sh

Now users can use Environment Modules by just putting the following in their .bashrc or .profile:
% . /etc/profile.d/modules.sh

As a simple test, you can run the above script and then type the command module. If you get some information about how to use modules, such as what you would see if you used the -help option, then you have installed Environment Modules correctly.
Environment Modules in Job Scripts
In this section, I want to show you how you can use Environment Modules in a job script. I am using PBS for this quick example, with this code snippet for the top part of the job script:
#PBS -S /bin/bash
#PBS -l nodes=8:ppn=2
. /etc/profile.d/modules.sh
module load compiler/pgi6.1-X86_64
module load mpi/mpich-1.2.7
(insert mpirun command here)

At the top of the code snippet are the PBS directives, which begin with #PBS. After the PBS directives, I invoke the Environment Modules startup script (modules.sh). Immediately after that, you should load the modules you need for your job. For this particular example, taken from a three-year-old job script of mine, I've loaded a compiler (pgi 6.1-x86_64) and an MPI library (mpich-1.2.7).
Building Your Own Module File
Creating your own module file is not too difficult. If you happen to know some Tcl, then it's pretty easy; however, even if you don't know Tcl, it's simple to follow an example to create your own.
The modules themselves define what you want to do to the environment when you load the module. For example, you can create new environment variables that you might need to run the application or change $PATH, $LD_LIBRARY_PATH, or $MANPATH so a particular application will run correctly. Believe it or not, you can even run code within the module or call an external application. This makes Environment Modules very, very flexible.
To begin, remember that all modules are written in Tcl, so this makes them very programmable. For the example here, all of the module files go in /usr/local/Modules/default/modulefiles. In this directory, you can create subdirectories to better label or organize your modules.
In this example, I'm going to create a module for gcc-4.6.2 that I build and install into my home account. To begin, I create a subdirectory called compilers for any module file that has to do with compilers. Environment Modules has a sort of template you can use to create your own module. I used this as the starting point for my module. As root, do the following:
% cd /usr/local/Modules/default/modulefiles
% mkdir compilers
% cp modules compilers/gcc-4.6.2

The new module will appear in the module list as compilers/gcc-4.6.2. I would recommend that you look at the template to get a feel for the syntax and what the various parts of the modulefile are doing. Again, recall that Environment Modules uses Tcl as its language, but you don't have to know much about Tcl to create a module file. The module file I created follows:
#%Module1.0#####################################################################
##
## modules compilers/gcc-4.6.2
##
## modulefiles/compilers/gcc-4.6.2. Written by Jeff Layton
##
proc ModulesHelp { } {
    global version modroot
    puts stderr "compilers/gcc-4.6.2 - sets the Environment for GCC 4.6.2 in my home directory"
}

module-whatis "Sets the environment for using gcc-4.6.2 compilers (C, Fortran)"

# for Tcl script use only
set topdir /home/laytonj/bin/gcc-4.6.2
set version 4.6.2
set sys linux86

setenv CC $topdir/bin/gcc
setenv GCC $topdir/bin/gcc
setenv FC $topdir/bin/gfortran
setenv F77 $topdir/bin/gfortran
setenv F90 $topdir/bin/gfortran

prepend-path PATH $topdir/include
prepend-path PATH $topdir/bin
prepend-path MANPATH $topdir/man
prepend-path LD_LIBRARY_PATH $topdir/lib

The file might seem a bit long, but it is actually fairly compact. The first section provides help with this particular module if a user asks for it (the line that begins with puts stderr); for example:
home8:~> module help compilers/gcc-4.6.2

----------- Module Specific Help for 'compilers/gcc-4.6.2' --------

compilers/gcc-4.6.2 - sets the Environment for GCC 4.6.2 in my home directory

You can have multiple strings by using several puts stderr lines in the module (the template has several lines).
After the help section in the procedure ModulesHelp, another line provides some simple information when a user uses the whatis option; for example:
home8:~> module whatis compilers/gcc-4.6.2
compilers/gcc-4.6.2 : Sets the environment for using gcc-4.6.2 compilers (C, Fortran)

After the help and whatis definitions is a section where I create whatever environment variables are needed, as well as modify $PATH, $LD_LIBRARY_PATH, and $MANPATH or other standard environment variables. To make life a little easier for me, I defined some local variables: topdir, version, and sys. I only used topdir, but I defined the other two variables in case I needed to go back and modify the module (the variables can help remind me what the module was designed to do).
In this particular modulefile, I defined a set of environment variables pointing to the compilers (CC, GCC, FC, F77, and F90). After defining those environment variables, I modified $PATH, $LD_LIBRARY_PATH, and $MANPATH so that the compiler was first in these paths by using the prepend-path directive.
This basic module is pretty simple, but you can get very fancy if you want or need to. For example, you could make a module file dependent on another module file so that you have to load a specific module before you load the one you want. Or, you can call external applications – for example, to see whether an application is installed and functioning. You are pretty much limited only by your needs and imagination.
Making Sure It Works Correctly
Now that you've defined a module, you need to check to make sure it works. Before you load the module, check to see which gcc is being used:
home8:~> which gcc
/usr/bin/gcc
home8:~> gcc -v
Reading specs from /usr/lib/gcc/i386-redhat-linux/3.4.3/specs
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --enable-shared --enable-threads=posix --disable-checking --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-java-awt=gtk --host=i386-redhat-linux
Thread model: posix
gcc version 3.4.3 20050227 (Red Hat 3.4.3-22.1)

This means gcc is currently pointing to the system gcc. (Yes, this is a really old gcc; I need to upgrade my simple test box at home.)
Next, load the module and check which gcc is being used:
home8:~> module avail

----------------------------- /usr/local/Modules/versions ------------------------------
3.2.6

------------------------- /usr/local/Modules/3.2.6/modulefiles --------------------------
compilers/gcc-4.6.2  dot          module-info  null
compilers/modules    module-cvs   modules      use.own

home8:~> module load compilers/gcc-4.6.2
home8:~> module list
Currently Loaded Modulefiles:
  1) compilers/gcc-4.6.2
home8:~> which gcc
~/bin/gcc-4.6.2/bin/gcc
home8:~> gcc -v
Using built-in specs.
Target: i686-pc-linux-gnu
Configured with: ./configure --prefix=/home/laytonj/bin/gcc-4.6.2 --enable-languages=c,fortran --enable-libgomp
Thread model: posix
gcc version 4.6.2

This means if you used gcc, you would end up using the version built in your home directory.
As a final check, unload the module and recheck where the default gcc points:
home8:~> module unload compilers/gcc-4.6.2
home8:~> module list
No Modulefiles Currently Loaded.
home8:~> which gcc
/usr/bin/gcc
home8:~> gcc -v
Reading specs from /usr/lib/gcc/i386-redhat-linux/3.4.3/specs
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --enable-shared --enable-threads=posix --disable-checking --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-java-awt=gtk --host=i386-redhat-linux
Thread model: posix
gcc version 3.4.3 20050227 (Red Hat 3.4.3-22.1)

Notice that after you unload the module, the default gcc goes back to the original version, which means the environment variables are probably correct. If you want to be more thorough, you should check all of the environment variables before loading the module, after the module is loaded, and then after the module is unloaded. But at this point, I'm ready to declare success!
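That more thorough check can be scripted: snapshot the environment at each stage and diff the snapshots. A sketch, using a plain export of a made-up DEMO_CC variable as a stand-in for a module load/unload so it runs even without Environment Modules installed:

```shell
# Snapshot the environment before, during, and after a change, then compare.
snapdir=$(mktemp -d)
env | sort > "$snapdir/before"
export DEMO_CC=/opt/demo/bin/gcc      # stand-in for 'module load'
env | sort > "$snapdir/loaded"
unset DEMO_CC                         # stand-in for 'module unload'
env | sort > "$snapdir/after"
# 'loaded' should differ from 'before'; 'after' should match it exactly
grep DEMO_CC "$snapdir/loaded"
diff "$snapdir/before" "$snapdir/after" && echo "environment fully restored"
```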
Final Comments
For clusters, Environment Modules are pretty much the best solution for handling multiple compilers, multiple libraries, or even applications. They are easy to use even for beginners to the command line. Just a few commands allow you to add modules to and remove them from your environment easily. You can even use them in job scripts. As you also saw, it's not too difficult to write your own module and use it. Environment Modules are truly one of the indispensable tools for clusters.
Kenneth Craft (Intel), Added June 8, 2015
What are Environment Modules
The Environment Modules utility allows dynamic modification of a user environment (shell environment variables such as PATH, LD_LIBRARY_PATH, etc.). It works with a set of system- or user-configured "modulefiles," which specify the environment settings necessary to use a particular development tool or toolset, such as Intel® Parallel Studio XE. More information on the utility and modulefiles can be found at http://modules.sourceforge.net/
Each modulefile contains the information needed to configure the shell environment for a specific version of a development tool or toolset. Once the Modules utility has a modulefile configured for a particular tool, typically by the system administrator, the user's environment can be modified using the 'module' command, which interprets modulefiles. Typically, modulefiles instruct the module command to alter or set shell environment variables such as PATH, MANPATH, etc. Modulefiles may be shared by many users on a system, and users may have their own collection to supplement or replace the shared modulefiles. This utility and collection of modulefiles allows users to test code with different versions of tools, such as compilers, quickly and easily.
Environment modules give you control over your environment by keeping track of which variables have been changed, so that the same entries can be removed later, preventing variable bloat.
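A plain-shell sketch of the bookkeeping this implies: if the exact entry that was added is recorded, it can be stripped back out later, restoring the variable (/opt/demo/bin is a made-up path):

```shell
# "load": prepend a (made-up) entry onto PATH, remembering the original value
orig_path="$PATH"
PATH="/opt/demo/bin:$PATH"
# "unload": strip exactly the entry that was added
PATH="${PATH#/opt/demo/bin:}"
[ "$PATH" = "$orig_path" ] && echo "PATH restored, no variable bloat"
```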
Supported Platforms
Linux* and OS X*
How to create a module file for Intel Software Development tools
First, install the Intel Development Tools on your system or cluster. Note that the Intel Development Tools (Parallel Studio XE, Cluster Edition, Composer Edition, etc.) do NOT come packaged with modulefiles. Instead, bash and tcsh/csh script files named '*vars.sh' or '*vars.csh' (VARS scripts) are installed to the installation 'bin' directory (or directories). To use the tools, you first set up the environment using the 'source' command, i.e.:
source [install-dir]/composer_xe_2015.x.yyy/bin/ifortvars.sh intel64
There are two ways you can create modulefiles. The obvious way is by hand, using the provided script files *vars.[sh|csh] as reference. DO NOT DO IT THIS WAY: the VARS script files usually call a series of other dependent scripts, and some scripts take arguments (such as the 'compilervars.sh' scripts) while others do not. Unraveling this nest of scripts is nearly impossible and error prone.
Alternatively, you can use the 'env2' utility to automate the process. The 'env2' utility executes a script, capturing the changes it makes to the environment, and echoes those environment variables and settings in a format suitable for a modulefile. This output can be captured to create your module file; to get the correct information, you will need to redirect the output from stdout to a file. The 'env2' utility can be found at http://env2.sourceforge.net/

Creating a module file for any Intel Development Tool:
- First, place a special comment at the beginning of the file so that module will recognize it as a module file, with the following command:
echo "#%Module" > my_module_file
- Then use the env2 command as follows, for example:
perl env2 -from bash -to modulecmd "[install-dir]/parallel_studio_xe_201m.0.nnn/psxevars.sh <intel64|ia-32>" >> my_module_file
This will create a module file that has all the correct information needed to run the compiler and tools correctly.

- Now follow the 'module' instructions to use the compiler tools through the module file.

Note: You can find more information about Environment Modules here: http://modules.sourceforge.net/
Raffy, Guillaume said on Fri, 12/23/2016 - 05:48
FYI: the simple solution that I proposed on 12/22/2016 - 06:29 doesn't actually work properly: env2 issues append-path commands instead of prepend-path commands, because there's an ambiguity: the same path finds itself at both the beginning and the end of the environment variable.
I managed to get env2 to output proper modulecmd instructions by adding environment variables with dummy content prior to calling env2. These environment variables prevent ifortvars.sh from executing lines such as "NLSPATH="$INSTALL_DIR/compiler/lib/locale/en_US/%N"; export NLSPATH": for example, declaring NLSPATH=dummy prior to calling env2 causes env2 to issue prepend-path on NLSPATH instead of setenv.
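The trick can be reproduced in plain shell. The fragment below stands in for the relevant part of ifortvars.sh (DEMO_NLSPATH and /opt/demo are made up); predefining the variable with dummy content forces the 'else' branch, which env2 then recognizes as a prepend:

```shell
# Predefine the variable so the vendor-style script takes its prepend branch.
DEMO_NLSPATH=dummy; export DEMO_NLSPATH
if [ -z "${DEMO_NLSPATH}" ]; then
    DEMO_NLSPATH="/opt/demo/%N"                     # plain assignment -> env2 emits setenv
else
    DEMO_NLSPATH="/opt/demo/%N:${DEMO_NLSPATH}"     # prepend -> env2 emits prepend-path
fi
export DEMO_NLSPATH
echo "$DEMO_NLSPATH"   # -> /opt/demo/%N:dummy
```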
In order to automate this process, I had to perform some scripting to automatically detect what environment variables ifortvars.sh creates. Here's the code that does it:

local strEnvComparerFilePath="./envcomparer.bash"
echo "#!/bin/bash" > "$strEnvComparerFilePath"
echo "env | awk -F= '{ print \$1}' | sort > $strTmpDir/env_before.txt" >> "$strEnvComparerFilePath"
echo "source ifortvars.sh intel64" >> "$strEnvComparerFilePath"
echo "env | awk -F= '{ print \$1}' | sort > $strTmpDir/env_after.txt" >> "$strEnvComparerFilePath"
chmod a+x "$strEnvComparerFilePath"
# temporarily disable the unbound variables check because some scripts
# dereference variables that are not necessarily defined, and that's ok
set +o nounset
"$strEnvComparerFilePath"
set -o nounset
diff "$strTmpDir/env_before.txt" "$strTmpDir/env_after.txt" | grep '> ' | sed 's/> //' > './new-env-vars.txt'
# create dummy environment variables so that env2 can properly detect prepend-path operations
local strNewVarName=''
for strNewVarName in $(cat ./new-env-vars.txt)
do
    export $strNewVarName='dummy'
done
# after this, env2 will generate proper modulecmd instructions
perl env2 -from bash -to modulecmd "[install-dir]/ifortvars.sh intel64" >> my_module_file

Hope this helps
Raffy, Guillaume said on Thu, 12/22/2016 - 06:29
This solution involving env2 doesn't actually generate a proper module file. The generated module file does seem to work as long as it's the only module being used. However, the generated module might wrongly redefine variables such as LD_LIBRARY_PATH instead of appending paths to it.
The reason for this is that env2 is not clever enough to decide when to issue setenv commands or prepend-path commands. In Intel's ifortvars, for example, there are lots of constructs like:

if [ -z "${NLSPATH}" ]
then
    NLSPATH="$INSTALL_DIR/compiler/lib/locale/en_US/%N"; export NLSPATH
else
    NLSPATH="$INSTALL_DIR/compiler/lib/locale/en_US/%N:${NLSPATH}"; export NLSPATH
fi

If NLSPATH exists before calling env2, then the second branch (else condition) is entered, env2 detects a prepend operation and rightfully issues a prepend-path command. However, if NLSPATH doesn't exist before calling env2, then the first branch (if condition) is entered, env2 detects a simple assignment, and wrongfully issues a setenv command.
There's a simple solution for this problem, though (it works for me but it's not guaranteed to work with all scripts): just run the Intel script once before calling it again via env2. This will create the path-like environment variables that are missing and will cause env2 to detect prepend operations.
Environment Modules are written in Tcl (Tool Command Language) and are interpreted by the modulecmd program via the module[17] user interface.
Environment Modules provides a set of extensions to the "standard" Tcl package, including setenv, unsetenv, append-path, prepend-path, set-alias, and more, as defined in the modulefiles man page[18]. Along with the built-in functionality of Tcl, this provides a rich environment for setting defaults and initializing an environment.
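For readers who don't know Tcl, the most common of these extensions map onto familiar shell operations. A rough shell equivalence (illustrative only; /opt/demo/bin is a made-up path):

```shell
# setenv CC gcc                    ->  export CC=gcc
export CC=gcc
# prepend-path PATH /opt/demo/bin ->  new entry wins lookups
PATH="/opt/demo/bin:$PATH"
# append-path PATH /opt/demo/bin  ->  entry would be searched last
# PATH="$PATH:/opt/demo/bin"
export PATH
```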
Feb 22, 2015 | nickgeoghegan.net
Creating module files
Above we specified that module files be placed in /modules, so that's where we'll put gcc's module files.
Create a gcc directory, if there isn't one
mkdir /modules/gcc

Add the associated module file:
vim /modules/gcc/4.6.2

What's in that file, then?
#%Module1.0
proc ModulesHelp { } {
    global dotversion
    puts stderr "\tGCC 4.6.2 (gcc, g++, gfortran)"
}

module-whatis "GCC 4.6.2 (gcc, g++, gfortran)"
conflict gcc

prepend-path PATH /packages/gcc/4.6.2/bin
prepend-path LD_LIBRARY_PATH /packages/gcc/4.6.2/lib64
prepend-path LIBRARY_PATH /packages/gcc/4.6.2/lib64
prepend-path MANPATH /packages/gcc/4.6.2/man

setenv CC gcc
setenv CXX g++
setenv FC gfortran
setenv F77 gfortran
setenv F90 gfortran

Modules allows you to set default versions of packages. So, say you have four versions of gcc and you'd like 4.6.2 as the default version, you can set it in a version file.
vim /modules/gcc/.version
#%Module1.0
set ModulesVersion "4.6.2"

How do I use modules?
Well, it's about bloody time that we finally get to use the damn modules we've setup, otherwise you'd drive to my house and beat the piss out of me.
List the modules on your system with module avail.
[nick@zoidberg ~]$ module avail

---------------------------------- /modules/ -----------------------------------
gcc/4.6.2(default)  haskell/ghc/7.0.4

The (default) means that I can just load gcc without specifying the version numbers.
Load a module on your system with module load
Before we do this, I'll assure you it works.
[nick@zoidberg ~]$ gcc --version
gcc (Debian 4.4.5-8) 4.4.5

Let's load gcc version 4.6.2:
[nick@zoidberg ~]$ module load gcc/4.6.2
[nick@zoidberg ~]$ gcc --version
gcc (GCC) 4.6.2

We can also load this version of gcc without specifying the version number, as 4.6.2 is the default.
[nick@zoidberg ~]$ module load gcc
[nick@zoidberg ~]$ gcc --version
gcc (GCC) 4.6.2

See what modules are loaded
The modules loaded will always contain version numbers, if you install them into the same folder structure as mine.
[nick@zoidberg ~]$ module list
Currently Loaded Modulefiles:
  1) /gcc/4.6.2

Unloading modules
The syntax for unloading modules is the same as loading them.
[nick@zoidberg ~]$ module unload gcc
[nick@zoidberg ~]$ gcc --version
gcc (Debian 4.4.5-8) 4.4.5
ADMIN Magazine
When people first start using clusters, they tend to stick with whatever compiler and MPI library came with the cluster when it was installed. As they become more comfortable with the cluster, using the compilers, and using the MPI libraries, they start to look around at other options: Are there other compilers that could perhaps improve performance? Similarly, they might start looking at other MPI libraries: Can they help improve performance? Do other MPI libraries have tools that can make things easier? Perhaps even more importantly, these people would like to install the next version of the compilers or MPI libraries so they can test them with their code. So this forces a question: How do you have multiple compilers and multiple MPI libraries on the cluster at the same time and not get them confused? I'm glad you asked.

The Hard Way
If you want to change your compiler or libraries – basically anything to do with your environment – you might be tempted to change your $PATH in the .bashrc file (if you are using Bash) and then log out and log back in whenever you need to change your compiler/MPI combination. Initially this sounds like a pain, and it is, but it works to some degree. It doesn't work, however, when you want to run multiple jobs, each with a different compiler/MPI combination.
For example, say I have one job using the GCC 4.6.2 compilers with Open MPI 1.5.2 and another job using GCC 4.5.3 and MPICH2. If I have both jobs in the queue at the same time, how can I control my .bashrc to make sure each job has the correct $PATH? The only way to do this is to restrict myself to one job in the queue at a time; when it's finished, I can change my .bashrc and submit a new job. Even for something as simple as code development, if you are using a different compiler/MPI combination from the job in the queue, you have to watch when the job runs to make sure your .bashrc matches it.
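In .bashrc terms, the juggling looks something like this (the install paths are made up): only one block can be active at a time, so only one compiler/MPI combination can safely be in the queue.

```shell
# Combination A: GCC 4.6.2 + Open MPI 1.5.2 (commented out for now;
# all paths here are hypothetical install locations)
# export PATH="/opt/gcc-4.6.2/bin:/opt/openmpi-1.5.2/bin:$PATH"

# Combination B: GCC 4.5.3 + MPICH2 (currently active)
export PATH="/opt/gcc-4.5.3/bin:/opt/mpich2/bin:$PATH"
```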
The Easy Way
A much better way to handle compiler/MPI combinations is to use Environment Modules. (Be careful not to confuse "environment modules" with "kernel modules.") According to the website, "The Environment Modules package provides for the dynamic modification of a user's environment via modulefiles." Although this might not sound earth shattering, it actually is a quantum leap for using multiple compilers/MPI libraries, but you can use it for more than just that, which I will talk about later.
You can use Environment Modules to alter or change environment variables such as $PATH, $MANPATH, $LD_LIBRARY_PATH, and others. Because most job scripts for resource managers, such as LSF, PBS-Pro, and MOAB, are really shell scripts, you can incorporate Environment Modules into the scripts to set the appropriate $PATH for your compiler/MPI combination, or any other environment variables an application requires for operation.
How you install Environment Modules depends on how your cluster is built. You can build it from source, as I will discuss later, or you can install it from your package manager. Just be sure to look for Environment Modules.
Using Environment Modules
To begin, I'll assume that Environment Modules is installed and functioning correctly, so you can now test a few of the options typically used. In this article, I'll be using some examples from TACC. The first thing to check is what modules are available to you by using the module avail command:
[laytonjb@dlogin-0 ~]$ module avail

------------------------------------------- /opt/apps/intel11_1/modulefiles -------------------------------------------
fftw3/3.2.2    gotoblas2/1.08    hdf5/1.8.4    mkl/10.2.4.032    mvapich2/1.4    netcdf/4.0.1    openmpi/1.4

------------------------------------------------ /opt/apps/modulefiles ------------------------------------------------
gnuplot/4.2.6        intel/11.1(default)  papi/3.7.2
intel/10.1           lua/5.1.4            pgi/10.2

-------------------------------------------------- /opt/modulefiles ---------------------------------------------------
Linux       TACC        TACC-paths  cluster

----------------------------------------------- /cm/shared/modulefiles ------------------------------------------------
acml/gcc/64/4.3.0                   fftw3/gcc/64/3.2.2                  mpich2/smpd/ge/open64/64/1.1.1p1
acml/gcc/mp/64/4.3.0                fftw3/open64/64/3.2.2               mpiexec/0.84_427
acml/gcc-int64/64/4.3.0             gcc/4.3.4                           mvapich/gcc/64/1.1
acml/gcc-int64/mp/64/4.3.0          globalarrays/gcc/openmpi/64/4.2     mvapich/open64/64/1.1
acml/open64/64/4.3.0                globalarrays/open64/openmpi/64/4.2  mvapich2/gcc/64/1.2
acml/open64-int64/64/4.3.0          hdf5/1.6.9                          mvapich2/open64/64/1.2
blacs/openmpi/gcc/64/1.1patch03     hpl/2.0                             netcdf/gcc/64/4.0.1
blacs/openmpi/open64/64/1.1patch03  intel-cluster-checker/1.3           netcdf/open64/64/4.0.1
blas/gcc/64/1                       intel-cluster-runtime/2.1           netperf/2.4.5
blas/open64/64/1                    intel-tbb/ia32/22_20090809oss       open64/4.2.2.2
bonnie++/1.96                       intel-tbb/intel64/22_20090809oss    openmpi/gcc/64/1.3.3
cmgui/5.0                           iozone/3_326                        openmpi/open64/64/1.3.3
default-environment                 lapack/gcc/64/3.2.1                 scalapack/gcc/64/1.8.0
fftw2/gcc/64/double/2.1.5           lapack/open64/64/3.2.1              scalapack/open64/64/1.8.0
fftw2/gcc/64/float/2.1.5            mpich/ge/gcc/64/1.2.7               sge/6.2u3
fftw2/open64/64/double/2.1.5        mpich/ge/open64/64/1.2.7            torque/2.3.7
fftw2/open64/64/float/2.1.5         mpich2/smpd/ge/gcc/64/1.1.1p1

This command lists what environment modules are available.
You'll notice that TACC has a very large number of possible modules that provide a range of compilers, MPI libraries, and combinations. A number of applications show up in the list as well.
You can check which modules are "loaded" in your environment by using the list option with the module command:
[laytonjb@dlogin-0 ~]$ module list
Currently Loaded Modulefiles:
  1) Linux   2) intel/11.1   3) mvapich2/1.4   4) sge/6.2u3   5) cluster   6) TACC

This indicates that when I log in, I have six modules already loaded for me. If I want to use any additional modules, I have to load them manually:
[laytonjb@dlogin-0 ~]$ module load gotoblas2/1.08
[laytonjb@dlogin-0 ~]$ module list
Currently Loaded Modulefiles:
  1) Linux        3) mvapich2/1.4   5) cluster   7) gotoblas2/1.08
  2) intel/11.1   4) sge/6.2u3      6) TACC

You can just cut and paste from the list of available modules to load the ones you want or need. (This is what I do, and it makes things easier.) By loading a module, you will have just changed the environment variables defined for that module. Typically this is $PATH, $MANPATH, and $LD_LIBRARY_PATH.
To unload or remove a module, just use the unload option with the module command, but you have to specify the complete name of the environment module:
[laytonjb@dlogin-0 ~]$ module unload gotoblas2/1.08
[laytonjb@dlogin-0 ~]$ module list
Currently Loaded Modulefiles:
  1) Linux   2) intel/11.1   3) mvapich2/1.4   4) sge/6.2u3   5) cluster   6) TACC

Notice that the gotoblas2/1.08 module is no longer listed. Alternatively, you can unload all loaded environment modules using module purge:
[laytonjb@dlogin-0 ~]$ module purge
[laytonjb@dlogin-0 ~]$ module list
No Modulefiles Currently Loaded.

You can see here that after the module purge command, no more environment modules are loaded.
If you are using a resource manager (job scheduler), you are likely creating a script that requests the resources and runs the application. In this case, you might need to load the correct Environment Modules in your script. Typically after the part of the script in which you request resources (in the PBS world, these are defined as #PBS commands), you will then load the environment modules you need.
Now that you've seen a few basic commands for using Environment Modules, I'll go into a little more depth, starting with installing from source. Then I'll use the module in a job script and write my own module.
Building Environment Modules for Clusters
In my opinion, the quality of open source code has improved over the last several years to the point at which building and installing is fairly straightforward, even if you haven't built any code before. If you haven't built code, don't be afraid to start with Environment Modules.
For this article, as an example, I will build Environment Modules on a "head" node in the cluster in /usr/local. I will assume that you have /usr/local NFS-exported to the compute nodes, or some other filesystem or directory that is mounted on the compute nodes (perhaps a global filesystem?). If you are building and testing your code on a production cluster, be sure to check that /usr/local is mounted on all of the compute nodes.
To begin, download the latest version – it should be a *.tar.gz file. (I'm using v3.2.6, but the latest as of writing this article is v3.2.9). To make things easier, build the code in /usr/local. The documentation that comes with Environment Modules recommends that it be built in /usr/local/Modules/src. As root, run the following commands:
% cd /usr/local
% mkdir Modules
% cd Modules
% mkdir src
% cp modules-3.2.6.tar.gz /usr/local/Modules/src
% cd src
% gunzip -c modules-3.2.6.tar.gz | tar xvf -
% cd modules-3.2.6

At this point, I would recommend you carefully read the INSTALL file; it will save your bacon. (The first time I built Environment Modules, I didn't read it and had lots of trouble.)
Before you start configuring and building the code, you need to fulfill a few prerequisites. First, you should have Tcl installed, as well as the Tcl Development package. Because I don't know what OS or distribution you are running, I'll leave to you the tasks of installing Tcl and Tcl Development on the node where you will be building Environment Modules.
At this point, you should configure and build Environment Modules. As root, enter the following commands:
% cd /usr/local/Modules/src/modules-3.2.6
% ./configure
% make
% make install

The INSTALL document recommends making a symbolic link in /usr/local/Modules connecting the current version of Environment Modules to a directory called default:
% cd /usr/local/Modules
% sudo ln -s 3.2.6 default

The reason they recommend the symbolic link is that, when you upgrade Environment Modules to a new version, you build it in /usr/local/Modules/src and then repoint the symbolic link from /usr/local/Modules/<new> to /usr/local/Modules/default, which makes upgrades easier.
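The upgrade flow can be sketched with a scratch directory instead of /usr/local/Modules, so it is safe to try anywhere (the version numbers are just examples):

```shell
# Lay out two version directories and point 'default' at one of them.
root=$(mktemp -d)
mkdir "$root/3.2.6" "$root/3.2.9"
ln -s 3.2.6 "$root/default"       # initial install
ln -sfn 3.2.9 "$root/default"     # upgrade: repoint 'default' at the new version
readlink "$root/default"          # -> 3.2.9
```

The -n flag makes ln replace the symlink itself rather than creating a link inside the directory it points to.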
The next thing to do is copy one (possibly more) of the init files for Environment Modules to a global location for all users. For my particular cluster, I chose to use the sh init file. This file will configure Environment Modules for all of the users. I chose to use the sh version rather than csh or bash, because sh is the least common denominator:
% sudo cp /usr/local/Modules/default/init/sh /etc/profile.d/modules.sh % chmod 755 /etc/profile.d/modules.shNow users can use Environment Modules by just putting the following in their .bashrc or .profile:
%. /etc/profile.d/modules.shAs a simple test, you can run the above script and then type the command module. If you get some information about how to use modules, such as what you would see if you used the -help option, then you have installed Environment Modules correctly.
Environment Modules in Job Scripts
In this section, I want to show you how you can use Environment Modules in a job script. I am using PBS for this quick example, with this code snippet for the top part of the job script:
#PBS -S /bin/bash #PBS -l nodes=8:ppn=2 . /etc/profile.d/modules.sh module load compiler/pgi6.1-X86_64 module load mpi/mpich-1.2.7 (insert mpirun command here)At the top of the code snippet is the PBS directives that begin with #PBS. After the PBS directives, I invoke the Environment Modules startup script (modules.sh). Immediately after that, you should load the modules you need for your job. For this particular example, taken from a three-year-old job script of mine, I've loaded a compiler (pgi 6.1-x86_64) and an MPI library (mpich-1.2.7).
Building Your Own Module File
Creating your own module file is not too difficult. If you happen to know some Tcl, then it's pretty easy; however, even if you don't know Tcl, it's simple to follow an example to create your own.
The modules themselves define what you want to do to the environment when you load the module. For example, you can create new environment variables that you might need to run the application or change $PATH, $LD_LIBRARY_LOAD, or $MANPATH so a particular application will run correctly. Believe it or not, you can even run code within the module or call an external application. This makes Environment Modules very, very flexible.
To begin, remember that all modules are written in Tcl, so this makes them very programmable. For the example, here, all of the module files go in /usr/local/Modules/default/modulefiles. In this directory, you can create subdirectories to better label or organize your modules.
In this example, I'm going to create a module for gcc-4.6.2 that I build and install into my home account. To begin, I create a subdirectory called compilers for any module file that has to do with compilers. Environment Modules has a sort of template you can use to create your own module. I used this as the starting point for my module. As root, do the following:
% cd /usr/local/Modules/default/modulefiles % mkdir compilers % cp modules compilers/gcc-4.6.2The new module will appear in the module list as compilers/gcc-4.6.2. I would recommend that you look at the template to get a feel for the syntax and what the various parts of the modulefile are doing. Again, recall that Environment Modules use Tcl as its language but you don't have to know much about Tcl to create a module file. The module file I created follows:
#%Module1.0##################################################################### ## ## modules compilers/gcc-4.6.2 ## ## modulefiles/compilers/gcc-4.6.2. Written by Jeff Layton ## proc ModulesHelp { } { global version modroot puts stderr "compilers/gcc-4.6.2 - sets the Environment for GCC 4.6.2 in my home directory" } module-whatis "Sets the environment for using gcc-4.6.2 compilers (C, Fortran)" # for Tcl script use only set topdir /home/laytonj/bin/gcc-4.6.2 set version 4.6.2 set sys linux86 setenv CC $topdir/bin/gcc setenv GCC $topdir/bin/gcc setenv FC $topdir/bin/gfortran setenv F77 $topdir/bin/gfortran setenv F90 $topdir/bin/gfortran prepend-path PATH $topdir/include prepend-path PATH $topdir/bin prepend-path MANPATH $topdir/man prepend-path LD_LIBRARY_PATH $topdir/libThe file might seem a bit long, but it is actually fairly compact. The first section provides help with this particular module if a user asks for it (the line that begins with puts stderr); for example:
home8:~> module help compilers/gcc-4.6.2 ----------- Module Specific Help for 'compilers/gcc-4.6.2' -------- compilers/gcc-4.6.2 - sets the Environment for GCC 4.6.2 in my home directoryYou can have multiple strings by using several puts stderr lines in the module (the template has several lines).
After the help section in the procedure ModuleHelp, another line provides some simple information when a user uses the whatis option; for example:
home8:~> module whatis compilers/gcc-4.6.2 compilers/gcc-4.6.2 : Sets the environment for using gcc-4.6.2 compilers (C, Fortran)After the help and whatis definitions is a section where I create whatever environment variables are needed, as well as modify $PATH, $LD_LIBRARY_PATH, and $MANPATH or other standard environment variables. To make life a little easier for me, I defined some local variables:topdir, version, and sys. I only used topdir, but I defined the other two variables in case I needed to go back and modify the module (the variables can help remind me what the module was designed to do).
In this particular modulefile, I defined a set of environment variables pointing to the compilers (CC, GCC, FC, F77, and F90). After defining those environment variables, I modified $PATH, $LD_LIBRARY_PATH, and $MANPATH so that the compiler was first in these paths by using the prepend-path directive.
This basic module is pretty simple, but you can get very fancy if you want or need to. For example, you could make a module file dependent on another module file so that you have to load a specific module before you load the one you want. Or, you can call external applications – for example, to see whether an application is installed and functioning. You are pretty much limited only by your needs and imagination.
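As an illustration of these fancier options, here is a sketch of a hypothetical modulefile for an OpenMPI build (the module names and paths are invented for the example). It uses the standard prereq and conflict modulefile commands to declare dependencies, and plain Tcl (catch/exec) to call an external program and abort the load if the installation looks broken:

```tcl
#%Module1.0
## Hypothetical module: an OpenMPI build compiled with GCC 4.6.2

module-whatis "Sets the environment for openmpi-1.6 built with gcc-4.6.2 (hypothetical paths)"

# Refuse to load unless the matching compiler module is already loaded,
# and refuse to coexist with any other module in the mpi/ family.
prereq   compilers/gcc-4.6.2
conflict mpi

set topdir /home/laytonj/bin/openmpi-1.6

# Call an external program to check that the installation is usable;
# if it fails, print a message and stop loading this module.
if { [catch { exec $topdir/bin/ompi_info }] } {
    puts stderr "openmpi installation at $topdir appears to be broken"
    break
}

setenv       MPICC           $topdir/bin/mpicc
prepend-path PATH            $topdir/bin
prepend-path LD_LIBRARY_PATH $topdir/lib
```

This is only a sketch of the technique, not a tested modulefile; the exact behavior of prereq and conflict is documented in the modulefile man page.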
Making Sure It Works Correctly
Now that you've defined a module, you need to check to make sure it works. Before you load the module, check to see which gcc is being used:
home8:~> which gcc
/usr/bin/gcc
home8:~> gcc -v
Reading specs from /usr/lib/gcc/i386-redhat-linux/3.4.3/specs
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --enable-shared --enable-threads=posix --disable-checking --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-java-awt=gtk --host=i386-redhat-linux
Thread model: posix
gcc version 3.4.3 20050227 (Red Hat 3.4.3-22.1)

This means gcc is currently pointing to the system gcc. (Yes, this is a really old gcc; I need to upgrade my simple test box at home.)
Next, load the module and check which gcc is being used:
home8:~> module avail

----------------------------- /usr/local/Modules/versions ------------------------------
3.2.6

------------------------- /usr/local/Modules/3.2.6/modulefiles --------------------------
compilers/gcc-4.6.2  dot         module-info  null
compilers/modules    module-cvs  modules      use.own

home8:~> module load compilers/gcc-4.6.2
home8:~> module list
Currently Loaded Modulefiles:
  1) compilers/gcc-4.6.2
home8:~> which gcc
~/bin/gcc-4.6.2/bin/gcc
home8:~> gcc -v
Using built-in specs.
Target: i686-pc-linux-gnu
Configured with: ./configure --prefix=/home/laytonj/bin/gcc-4.6.2 --enable-languages=c,fortran --enable-libgomp
Thread model: posix
gcc version 4.6.2

This means if you used gcc, you would end up using the version built in your home directory.
As a final check, unload the module and recheck where the default gcc points:
home8:~> module unload compilers/gcc-4.6.2
home8:~> module list
No Modulefiles Currently Loaded.
home8:~> which gcc
/usr/bin/gcc
home8:~> gcc -v
Reading specs from /usr/lib/gcc/i386-redhat-linux/3.4.3/specs
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --enable-shared --enable-threads=posix --disable-checking --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-java-awt=gtk --host=i386-redhat-linux
Thread model: posix
gcc version 3.4.3 20050227 (Red Hat 3.4.3-22.1)

Notice that after you unload the module, the default gcc goes back to the original version, which means the environment variables are probably correct. If you want to be more thorough, you should check all of the environment variables before loading the module, after the module is loaded, and then after the module is unloaded. But at this point, I'm ready to declare success!
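One quick way to do that thorough check is to snapshot the environment before and after and compare the two snapshots. The sketch below illustrates the idea; because it has to run without the module command available, the export line stands in for the changes a real `module load compilers/gcc-4.6.2` would make (the path is the hypothetical one used throughout this article):

```shell
# Snapshot the environment, sorted so the files can be compared line by line
env | sort > /tmp/env_before.txt

# In real use you would run here:  module load compilers/gcc-4.6.2
# For illustration we simulate one change the module would make (hypothetical path):
export CC=/home/laytonj/bin/gcc-4.6.2/bin/gcc

env | sort > /tmp/env_after.txt

# Print only the lines that are new or changed after the "load"
comm -13 /tmp/env_before.txt /tmp/env_after.txt
```

Running the same comparison after a module unload should show the environment back where it started.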
Final Comments
For clusters, Environment Modules are pretty much the best solution for handling multiple compilers, multiple libraries, or even applications. They are easy to use even for beginners to the command line. Just a few commands allow you to add modules to and remove them from your environment easily. You can even use them in job scripts. As you also saw, it's not too difficult to write your own module and use it. Environment Modules are truly one of the indispensable tools for clusters.
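For example, a batch job script (sketched here for a hypothetical PBS/Torque setup, using the module built above) just loads the module before invoking the compiler or application:

```shell
#!/bin/bash
#PBS -l nodes=1
#PBS -N gcc_test

# Load the module so the job runs with the right compiler and libraries
module load compilers/gcc-4.6.2

cd $PBS_O_WORKDIR
gcc -v
```

The directives and the scheduler are assumptions for illustration; the point is only that module load works the same way inside a job script as it does at an interactive prompt.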
Copyright © 1996-2021 by Softpanorama Society. www.softpanorama.org was initially created as a service to the (now defunct) UN Sustainable Development Networking Programme (SDNP) without any remuneration. This document is an industrial compilation designed and created exclusively for educational use and is distributed under the Softpanorama Content License. Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.
Last modified: March, 12, 2019