Compiling R with Intel compilers and MKL on Hyak

Prerequisites

First, request a build node, which will have internet access. An hour of time should be enough to install R and a few packages. Building R (especially the suggested packages) can be sped up by using multiple build jobs.

srun -p build --mem=10G --cpus-per-task=8 --time=1:00:00 --pty /bin/zsh

To use the Intel compilers and link to MKL on Hyak, load the relevant module. As of early 2020, this is accomplished by running

module load icc_19

I add this to my .bashrc and .zshrc so that the module is loaded every time I log in. To avoid warnings about building PDF documentation, load TeXLive as well.

module load contrib/texlive/2017

Next, download the R source from CRAN and extract the contents. For R v3.6.2, this is done using wget.

wget https://cran.r-project.org/src/base/R-3/R-3.6.2.tar.gz
tar -xvf R-3.6.2.tar.gz

Configuration

Following the Intel instructions for building R with MKL, we need to source the compilervars.sh script and set a number of environment variables. This can be done in a script for convenience. Some packages (e.g. TMB) will break if you include OpenMP in the standard compiler flags, so the OpenMP flags are moved to the more appropriate *_OPENMP_* environment variables. The following takes care of these steps.

source /sw/intel-2019/bin/compilervars.sh intel64

export CC="icc"
export CXX="icpc"
export F77="ifort"
export FC="ifort"
export AR="xiar"
export LD="xild"

export CFLAGS="-fPIC -O3 -ipo -xHost -multiple-processes=8"
export CXXFLAGS="-fPIC -O3 -ipo -xHost -multiple-processes=8"
export FFLAGS="-fPIC -O3 -ipo -xHost -multiple-processes=8"
export FCFLAGS="-fPIC -O3 -ipo -xHost -multiple-processes=8"
export LDFLAGS=""

# OpenMP flags go here rather than in the compiler flags above; otherwise packages such as TMB break
export R_OPENMP_CFLAGS="-qopenmp"
export SHLIB_OPENMP_CFLAGS="-qopenmp"
export SHLIB_OPENMP_CXXFLAGS="-qopenmp"
export SHLIB_OPENMP_FFLAGS="-qopenmp"

MKL="-lmkl_rt -liomp5 -lpthread"
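Before running configure, it may be worth checking that the compiler tools named in the exports above are actually on your PATH after sourcing compilervars.sh. The following is only a sketch of such a check; the variable name missing is made up for illustration, and the tool list is simply copied from the CC, CXX, FC, AR, and LD settings above.

```shell
# Report whether each tool named in the exports above can be found on PATH.
# "missing" is an illustrative counter, not something configure reads.
missing=0
for tool in icc icpc ifort xiar xild; do
    if command -v "$tool" >/dev/null 2>&1; then
        echo "found:   $tool"
    else
        echo "missing: $tool"
        missing=$((missing + 1))
    fi
done
echo "$missing tool(s) missing"
```

If anything is reported missing, the module load or the source line probably did not take effect in the current shell.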

Finally, run the configure script inside the R-* directory. The --prefix flag should be changed to the directory where the built version should be installed.

./configure --prefix=/gscratch/*/bin/R-3.6.2 \
            --enable-R-shlib \
            --with-blas="$MKL" \
            --with-lapack \
            --enable-BLAS-shlib

Build and install R

This part is pretty easy once configuration is complete. The -j flag can be used to specify the maximum number of parallel jobs to run. This should be equal to the --cpus-per-task specified when you requested the build node.

make -j8
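To keep the job count in step with the allocation without hard-coding it, one option is to read it from the environment. This sketch assumes Slurm has set SLURM_CPUS_PER_TASK in the shell started by srun (it normally does when --cpus-per-task is given); NJOBS is just an illustrative variable name.

```shell
# SLURM_CPUS_PER_TASK is set by Slurm when --cpus-per-task is requested.
# Fall back to a single job if it is unset (e.g. outside an allocation).
NJOBS="${SLURM_CPUS_PER_TASK:-1}"
echo "Would run: make -j$NJOBS"
```

You would then run make -j"$NJOBS" in place of the fixed make -j8.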

Installation can be accomplished using

make install

Finally, add the R and Rscript executables to your PATH.
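A minimal sketch of that last step follows. R_PREFIX is a placeholder name, and the path assigned to it is only an example; set it to whatever directory you passed to --prefix. The executables are installed into the bin subdirectory of the prefix.

```shell
# R_PREFIX is a placeholder: set it to the directory you passed to --prefix.
R_PREFIX="$HOME/R-3.6.2"
# Put the new R first so it takes precedence over any other R on the system.
export PATH="$R_PREFIX/bin:$PATH"
```

Adding these lines to your .bashrc or .zshrc makes the change persistent, and command -v R should then print the path to the new build.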

Testing the installation

In order to use multiple cores on a compute node, the --cpus-per-task flag must be greater than one. Note that --tasks-per-node does not allow multiple threads to be used by a single process. Watching CPU usage with e.g. htop will tell you if multithreaded linear algebra is working. So, on an interactive compute node (such as stf-int), open your newly compiled version of R and run

X <- matrix(rnorm(5000 * 5000), nrow = 5000)
t(X) %*% X

You should see multiple CPUs being used at 100% in htop. If only one CPU is maxed out, something went wrong.
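If only one CPU is busy, one thing that may be worth ruling out before revisiting the build is the allocation itself. The sketch below uses nproc from GNU coreutils, which reports how many CPUs the current process is allowed to use. Whether that number reflects your --cpus-per-task request depends on how the cluster confines jobs, so treat it as a hint rather than a definitive test; avail is an illustrative variable name.

```shell
# nproc (GNU coreutils) prints the number of CPUs available to this process.
avail="$(nproc)"
if [ "$avail" -gt 1 ]; then
    echo "$avail CPUs available; a multithreaded BLAS should be able to use them"
else
    echo "Only $avail CPU available; request more with --cpus-per-task"
fi
```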

John K Best
PhD candidate

My research focuses on advancing the use of statistics in ecology and fisheries.