
### Disclaimer

The instructions/steps given below worked for me (and Michigan Technological University) running Rocks 5.4.2 (with CentOS 5.5) – as has been common practice for several years now, a full version of the operating system was installed. These instructions may very well work for you (or your institution) on Rocks-like or other Linux clusters. Please note that if you decide to use these instructions on your machine, you are doing so entirely at your own discretion, and that neither this site, sgowtham.net, nor its author (or Michigan Technological University) is responsible for any damage – intellectual and/or otherwise.

LINPACK is a software library for performing numerical linear algebra on computers. It was written in FORTRAN by Jack Dongarra, Jim Bunch, Cleve Moler and Gilbert Stewart. LINPACK makes use of the BLAS libraries for performing basic vector and matrix operations. It has largely been superseded by LAPACK, which runs more efficiently on modern architectures.

The LINPACK benchmarks are a measure of a system’s floating point computing power. Introduced by Jack Dongarra, they measure how fast a computer solves a dense N × N system of linear equations, Ax = b. The solution is obtained by Gaussian elimination with partial pivoting, with

$\frac{2}{3}\:N^3 \:+\: 2\:N^2 \:+\: \mathcal{O}\left(N\right)$

floating point operations. The result is usually expressed in billions of floating point operations per second (GFLOPS). HPL, a portable implementation of the High-Performance LINPACK benchmark, is used as the performance measure for ranking supercomputers in the Top500 list.

### Pre-requisite #1: MPI

Following Rocks recommendations, this and the other pre-requisites will be installed under /share/apps/; software installed by me in clusters at Michigan Tech follows this template for its folder structure:

/share/apps/ --> Software/Software_Version/ --> Compiler/Compiler_Version

A Rocks 5.4.2 installation comes with a few flavors of MPI, but I prefer to compile MPICH2 using GCC 4.1.2. At the time of writing this post, the latest stable version of MPICH2 is 1.4.1p1 and it may be downloaded from here. Following the folder structure/template mentioned above, it will be installed under

/share/apps/ --> mpich2/1.4.1p1/ --> gcc/4.1.2

To avoid confusion and/or missed steps leading to undesired results, steps associated with installation of MPICH2 have been put in the following script:

```bash
#! /bin/bash
#
# install_mpich2.sh
# BASH script to install MPICH2 (compiled against GCC 4.1.2) on a
# Rocks 5.4.2 cluster's front end
# Must be root (or at least have sudo privilege) to run this script

# Begin root-check IF
if [ $UID != 0 ]
then
  clear
  echo
  echo " You must be logged in as root!"
  echo " Exiting..."
  echo
  exit
else
  # Set necessary variables
  export CC="gcc"
  export CXX="g++"
  export FC="gfortran"
  export F77="gfortran"
  export MPICH2_VERSION="1.4.1p1"
  export GCC_VERSION="4.1.2"
  export MPICH2_INSTALL="/share/apps/mpich2/${MPICH2_VERSION}/gcc/${GCC_VERSION}"
  export ANL="http://www.mcs.anl.gov/research/projects/mpich2/downloads/tarballs"

  echo
  echo " Step #0: Download MPICH2 to /share/apps/tmp"
  cd /share/apps/tmp/
  wget ${ANL}/${MPICH2_VERSION}/mpich2-${MPICH2_VERSION}.tar.gz

  # Begin mpich2-${MPICH2_VERSION}.tar.gz check IF
  if [ -e "mpich2-${MPICH2_VERSION}.tar.gz" ]
  then
    echo
    echo " Step #1: configure, make clean, make and make install"
    tar -zxpf mpich2-${MPICH2_VERSION}.tar.gz
    cd mpich2-${MPICH2_VERSION}/
    ./configure --prefix=${MPICH2_INSTALL}
    make clean
    make
    make install

    echo
    echo " Step #2: Update $HOME/.bashrc"
    cat << EOF

  Add the following lines to $HOME/.bashrc and remember to source it

  # MPICH2 (${MPICH2_VERSION}) settings
  export MPICH2="${MPICH2_INSTALL}"
  export PATH="\${PATH}:\${MPICH2}/bin:\${MPICH2}/sbin"
  export MANPATH="\${MANPATH}:\${MPICH2}/man"
  export LD_LIBRARY_PATH="\${LD_LIBRARY_PATH}:\${MPICH2}/lib"

EOF
  fi
  # End mpich2-${MPICH2_VERSION}.tar.gz check IF
  echo
fi
# End root-check IF
```

A good test of a successful installation is that `which mpicc`, `which mpif77`, etc. return the respective commands located in `${MPICH2_INSTALL}`.
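That sanity check can itself be scripted; a minimal sketch is below (the install path follows the template used in this post, and the list of tools checked is just an illustrative sample):

```shell
# Check that MPI wrapper commands resolve from the new MPICH2 install tree
# (path below is assumed from the folder template used in this post)
MPICH2_INSTALL="/share/apps/mpich2/1.4.1p1/gcc/4.1.2"

for tool in mpicc mpif77 mpiexec
do
  found=$(which ${tool} 2> /dev/null)
  case "${found}" in
    ${MPICH2_INSTALL}/bin/*) echo " ${tool}: OK (${found})" ;;
    "")                      echo " ${tool}: not found in PATH" ;;
    *)                       echo " ${tool}: found elsewhere (${found})" ;;
  esac
done
```

If a tool is reported "found elsewhere", the Rocks-supplied MPI flavor is likely shadowing the new install earlier in PATH.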

### Pre-requisite #2: Goto BLAS

As in the case of pre-requisite #1, this one will be installed under

/share/apps/ --> gotoblas2/1.13/ --> gcc/4.1.2

The following script, used in installation, will assume that one has downloaded Goto BLAS2 1.13 from here and placed it in /share/apps/tmp/
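A minimal sketch of such an installation script, following the same template as the MPICH2 one, might look like this – the tarball name, install paths and make options (BINARY, CC, FC) are assumptions, not the original script:

```shell
#! /bin/bash
#
# install_gotoblas2.sh
# Hypothetical sketch: install GotoBLAS2 1.13 (compiled with GCC 4.1.2)
# Assumes GotoBLAS2-1.13.tar.gz has already been placed in /share/apps/tmp

export GOTOBLAS2_VERSION="1.13"
export GCC_VERSION="4.1.2"
export GOTOBLAS2_INSTALL="/share/apps/gotoblas2/${GOTOBLAS2_VERSION}/gcc/${GCC_VERSION}"
export TARBALL="/share/apps/tmp/GotoBLAS2-${GOTOBLAS2_VERSION}.tar.gz"

if [ -e "${TARBALL}" ]
then
  cd /share/apps/tmp/
  tar -zxpf "${TARBALL}"
  cd GotoBLAS2/
  # Build a 64-bit library with GCC/gfortran; TARGET may be set explicitly
  # if the CPU auto-detection picks the wrong architecture
  make BINARY=64 CC=gcc FC=gfortran
  mkdir -p "${GOTOBLAS2_INSTALL}"
  cp libgoto2* "${GOTOBLAS2_INSTALL}/"
else
  echo " ${TARBALL} not found; download GotoBLAS2 first"
fi
```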

### Running HPL Benchmark

The amount of memory used by HPL is essentially the size of the coefficient matrix, A. Following the standard binary definition, 1 GB is 1024 * 1024 * 1024 bytes (MEM_BYTES). Most scientific/engineering computations use double precision numbers, with each such number taking 8 bytes of memory. Thus, 1 GB can accommodate 134,217,728 double precision numbers

DP_ELEMENTS = MEM_BYTES/8

Theoretically, sqrt(DP_ELEMENTS) represents the maximum possible value of N. However, the operating system needs some memory for its own operations. As such, the HPL benchmark is usually run for the following values of N – with m representing the fraction of total memory – making sure that swapping does not occur (which would reduce performance).

$N \:=\: m\:\sqrt{\mathrm{DP\_ELEMENTS}} \hspace{0.50in} m: 0.50\:(0.10)\:0.80$

HPL uses the block size (NB) for the data distribution as well as for the computational granularity. From a data distribution perspective, the smaller NB, the better the load balance. From a computational perspective, too small of a value for NB may limit the computational performance by a large factor since almost no data re-use will occur in the highest level of the memory hierarchy. The number of messages will also increase. In my case, this benchmark was performed for NB values of 128, 256 and 512.
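For reference, the N and NB choices end up as lines in HPL's input file, HPL.dat. A fragment such as the following controls them (all values here are illustrative – one problem size, three block sizes, and a 12 x 16 process grid matching the 192-core example used later in this post):

```
1             # of problems sizes (N)
113511        Ns
3             # of NBs
128 256 512   NBs
0             PMAP process mapping (0=Row-,1=Column-major)
1             # of process grids (P x Q)
12            Ps
16            Qs
```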

The results so obtained are compared with the theoretical peak value, $\mathrm{GFLOPS}_\mathrm{Theory}$, computed as follows:

$\mbox{\# of Nodes} \;\times\; \mbox{\# of Sockets/Node} \;\times\; \mbox{\# of Cores/Socket} \;\times\; \\\\\mbox{CPU Frequency (Cycles/second)} \;\times\; \mbox{\# of Floating Point Operations/Cycle}$

For example, for a cluster with 16 identical recent Intel architecture compute nodes, each compute node with dual hex-core processors @ 3.00 GHz, $\mathrm{GFLOPS}_\mathrm{Theory}$ will be

$\mbox{16 (\# of Nodes)} \;\times\; \mbox{2 (\# of Sockets/Node)} \;\times\; \mbox{6 (\# of Cores/Socket)} \;\times\; \\\\\mbox{3 G [CPU Frequency (Cycles/second)]} \;\times\; \mbox{4 (\# of Floating Point Operations/Cycle)}\\\\ = 2304 \mbox{ GFLOPS} \\\\ \approx 2.3 \mbox{ TFLOPS}$
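The arithmetic above is simple enough to script; a small BASH sketch, with the numbers taken from this example:

```shell
# Theoretical peak for the example cluster above
NODES=16              # identical compute nodes
SOCKETS_PER_NODE=2    # dual socket
CORES_PER_SOCKET=6    # hex core
GHZ=3                 # CPU frequency, in billions of cycles/second
FLOPS_PER_CYCLE=4     # floating point operations per cycle

GFLOPS_THEORY=$((NODES * SOCKETS_PER_NODE * CORES_PER_SOCKET * GHZ * FLOPS_PER_CYCLE))
echo " GFLOPS_Theory = ${GFLOPS_THEORY} GFLOPS"
# => 2304 GFLOPS, i.e., ~2.3 TFLOPS
```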

If each of these nodes had 24 GB RAM, then

MEM_BYTES = 1024 * 1024 * 1024 * 24 * 16 = 412316860416

and

DP_ELEMENTS = 412316860416/8 = 51539607552

As such, N values will be

$N \:=\: m\:\times\: \sqrt{51539607552} \:\approx\: m\:\times\: 227023 \hspace{0.50in} m: 0.50\:(0.10)\:0.80$
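These N values can be computed with a few lines of BASH and awk (memory figures from the 16-node, 24 GB/node example above):

```shell
# Problem sizes for 16 nodes with 24 GB RAM each
MEM_BYTES=$((1024 * 1024 * 1024 * 24 * 16))   # total memory, in bytes
DP_ELEMENTS=$((MEM_BYTES / 8))                # double precision numbers that fit

echo " MEM_BYTES   = ${MEM_BYTES}"
echo " DP_ELEMENTS = ${DP_ELEMENTS}"

for m in 0.50 0.60 0.70 0.80
do
  # N = m * sqrt(DP_ELEMENTS), truncated to an integer
  N=$(awk -v m="${m}" -v d="${DP_ELEMENTS}" 'BEGIN { printf "%d", m * sqrt(d) }')
  echo " m = ${m}  =>  N = ${N}"
done
```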

### What if the cluster has heterogeneous compute nodes?

Computing $\mathrm{GFLOPS}_\mathrm{Theory}$ isn't as easy in this case, and becomes even harder when the compute nodes belong to different generations, as one has to account for the aging factor. It has been a practice at Michigan Tech, in such cases, to split the cluster into different queues – one for each generation/type of compute node – and run the HPL benchmark separately on each.

### Thanks be to

Rocks mailing list and its participants.

### 3 responses to “HPL 2.0 benchmark with GCC 4.1.2 on Rocks 5.4.2”


2. BILAL says:

I have many problems installing LINPACK on a Rocks cluster. Can you please give me a proper Linux command guide to install MPICH, BLAS and then HPL, along with detailed instructions for the changes in the Make.linux file? I get two errors with the command:

make arch = linux

• Gowtham says:

Give this a try:

make arch=linux

That is, without any spaces around =

