Setup Hostname without DNS Server on CentOS 7

Let’s say that you have a cluster of 6 nodes and you do not have a DNS server.

How do you map the IPs to their respective hostnames so that every node in the cluster can resolve every other node’s hostname? We need to bypass the automatic DNS lookup.

I have 6 nodes with the following IPs and hostnames.

128.197.115.158 buhpc1
128.197.115.7 buhpc2
128.197.115.176 buhpc3
128.197.115.17 buhpc4
128.197.115.9 buhpc5
128.197.115.15 buhpc6

Now, we edit every node’s /etc/hosts file.

For all nodes

vi /etc/hosts

The original files look like this:

We change the files to look like this:

Restart the network.
systemctl restart network

Make sure that the /etc/hosts file is exactly the same on all 6 nodes! Then, after restarting the network on every node with the above command, the /etc/hosts file will bypass the automatic DNS lookup and map those hostnames to those IP addresses!
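As a minimal sketch, the six mappings can also be appended with a small script instead of editing the file by hand on each node. The HOSTS_FILE variable and the add_host_entry helper are hypothetical names introduced for this example. By default the sketch writes to a scratch file, so it can be tried safely before pointing it at /etc/hosts as root.

```shell
#!/bin/bash
# Target file; defaults to a scratch copy so nothing system-wide is touched.
# Set HOSTS_FILE=/etc/hosts (as root) to apply the entries for real.
HOSTS_FILE="${HOSTS_FILE:-./hosts.test}"
touch "$HOSTS_FILE"

# Append "IP hostname" only if the hostname is not already mapped.
add_host_entry() {
    local ip="$1" name="$2"
    if ! grep -qE "[[:space:]]${name}([[:space:]]|\$)" "$HOSTS_FILE"; then
        printf '%s %s\n' "$ip" "$name" >> "$HOSTS_FILE"
    fi
}

add_host_entry 128.197.115.158 buhpc1
add_host_entry 128.197.115.7   buhpc2
add_host_entry 128.197.115.176 buhpc3
add_host_entry 128.197.115.17  buhpc4
add_host_entry 128.197.115.9   buhpc5
add_host_entry 128.197.115.15  buhpc6
```

Running the script a second time adds nothing, which makes it easier to keep the file identical on all nodes.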

Testing

If I am on buhpc1, and I do:

ping buhpc2

I will be pinging 128.197.115.7. In the /etc/hosts file, we have defined 128.197.115.7 as buhpc2.

How to Mount USB Flash Drive on CentOS 7

In this blog article, I’ll show you how to mount a USB Flash Drive on CentOS 7 terminal. In my case, I needed to mount a USB Flash Drive on my minimal CentOS 7 machine to copy a file to the USB Flash Drive.

USB Flash Drive

The file system of my USB flash drive is FAT32. I used a Windows 10 computer to create a folder called System Volume Information on the USB flash drive.

Mounting

First, go to your CentOS 7 computer and create a folder where you’ll mount the contents of the USB flash drive to.

mkdir -p /media/USB

/dev is a location that represents devices attached to your computer. Check the /dev directory’s contents by typing:

ls /dev/sd*

You should see only the disks that are already attached, for example sda and its partitions.

Next, insert your USB flash drive into the CentOS 7 machine and list the devices again. You could also type ls /dev/sd and then hit Tab. You should see a new sdb and sdb1.
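If the listing is long, it can be easier to compare a snapshot taken before plugging in the drive with one taken afterwards. The following is only a sketch; new_devices is a hypothetical helper name, and it relies on bash process substitution.

```shell
#!/bin/bash
# Print the device names that appear in the "after" listing but not in the
# "before" listing. Both arguments are files with one name per line.
new_devices() {
    comm -13 <(sort "$1") <(sort "$2")
}

before=$(mktemp)
after=$(mktemp)

ls /dev/sd* 2>/dev/null > "$before"
# ... insert the USB flash drive here, then take the second snapshot ...
ls /dev/sd* 2>/dev/null > "$after"

new_devices "$before" "$after"
rm -f "$before" "$after"
```

Any name that is printed is a device or partition that showed up between the two snapshots.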

Our USB flash drive is represented by /dev/sdb1. We will mount the USB flash drive to the /media/USB folder that we created earlier.

mount -t vfat /dev/sdb1 /media/USB

Check if the USB flash drive is mounted by listing the contents of /media/USB.

ls /media/USB
System Volume Information

Since the /media/USB contains System Volume Information, I know that the USB flash drive was mounted properly. Now, I can copy any file to the mounted USB flash drive folder.

cp nfs-utils-1.3.0-0.21.el7_2.x86_64.rpm /media/USB/

Unmounting

After you are done with the USB flash drive, always remember to unmount the USB flash drive from the folder it is mounted on.

umount /media/USB

You can now safely eject the USB flash drive from the CentOS 7 machine.

How to Run HPL (LINPACK) Across Nodes – Guide

We’ve been working on a benchmark called HPL also known as High Performance LINPACK on our cluster. Our cluster is made of 6 nodes.
The specs are: 6 x86 nodes each with an Intel(R) Xeon (R) CPU 5140 @ 2.33 GHz, 4 cores, and no accelerators. Our OS is CentOS 7. At first, we had difficulty improving HPL performance across nodes. For some reason, we would get the same performance with 1 node compared to 6 nodes. Here’s what we did to improve performance across nodes, but before we get into performance, let’s answer the big questions about HPL. For more information, visit the HPL FAQs.

What is HPL?

HPL measures the floating point execution rate for solving a system of linear equations. The result is reported in FLOPS, which are floating point operations per second. The dependencies include MPI and BLAS.

 

Theoretical peak performance

When you run HPL, you will get a result with the FLOPS rate that HPL achieved. With benchmarks like HPL, there is something called the theoretical peak FLOPS, which is given by:

Number of cores * Average frequency * Operations per cycle

Your result will come in below the theoretical peak FLOPS, but the theoretical peak is a good number to compare your HPL results against. First, we’ll look at the number of cores we have.

cat /proc/cpuinfo

At the bottom of the cpuinfo of my laptop, I see processor: 7, which means that the OS sees 8 logical processors. Processor numbers start with 0. With Hyper-Threading, those 8 logical processors are only 4 physical cores, and it is the physical cores that count for peak FLOPS; the cpu cores field in /proc/cpuinfo shows that number. From the model name, I see that I have an Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz, which means the base frequency is 2.60GHz. For the operations per cycle, we need to dig deeper and search for additional information about the architecture. Doing a Google search on Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz, we find that the max frequency considering turbo is 3.5 GHz. After a little snooping on the page, I noticed a link stating Products formerly Skylake. Skylake is the name of a microarchitecture. There’s a Stackoverflow question listing the operations per cycle for a number of recent processor microarchitectures. On the link, we see:

Intel Haswell/Broadwell/Skylake:

  • 16 DP FLOPs/cycle: two 4-wide FMA (fused multiply-add) instructions
  • 32 SP FLOPs/cycle: two 8-wide FMA (fused multiply-add) instructions

DP stands for double-precision, and SP stands for single-precision. HPL runs in double precision, so for the CPU at its maximum turbo frequency we would have a theoretical peak performance of:

4 cores * 3.50 GHz * 16 FLOPs/cycle =  224 GFLOPS

You will have to do the same calculations for your GPU if you plan on running HPL with your GPU.
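The formula is easy to wrap in a small shell function. This is only a sketch; peak_gflops is a hypothetical helper name. The example call uses the 3.50 GHz turbo frequency and the 16 DP FLOPs/cycle quoted above and prints the peak for a single core.

```shell
#!/bin/bash
# peak_gflops CORES GHZ FLOPS_PER_CYCLE
# Prints cores * frequency * operations per cycle, in GFLOPS.
peak_gflops() {
    awk -v c="$1" -v f="$2" -v o="$3" 'BEGIN { printf "%.1f\n", c * f * o }'
}

# Per-core peak at 3.50 GHz with 16 DP FLOPs/cycle.
peak_gflops 1 3.50 16
```

To get the figure for a whole machine or cluster, pass the total number of physical cores as the first argument.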

 

Why are your performance results below the theoretical peak?

The performance results depend on the algorithm, size of the problem, implementation, human optimizations to the program, compiler’s optimizations, age of the compiler, the OS, interconnect, memory, architecture, and the hardware. Basically, things aren’t perfect when running HPL, so you won’t hit the theoretical peak, but the theoretical peak is a good number to base your results on. At least 50% of your cluster’s theoretical peak performance with HPL would be an excellent goal.

 

Improving HPL Performance across Nodes

Very helpful notes on tuning HPL are available here. The HPL.dat file resides next to the xhpl binary under hpl/bin. The file contains information on the problem size, machine configuration, and algorithm. In HPL.dat, you can change:

N – size of the problem. Pick the largest problem size that fits in memory; the HPL docs recommend filling up around 80% of total RAM. If the problem size is too large, the performance will drop. Think about how much RAM you have. For instance, let’s say that I had 4 nodes with 256 MB of RAM each. In total, I have 1 GB of RAM, which holds about 134 million double-precision (8-byte) elements. The N x N matrix therefore needs N below roughly 11,585, and lower still to leave room for the OS. On our cluster, our peak performance for N is at 64000.

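P – number of process rows in the process grid. One caveat is that P should be less than or equal to Q.

Q – number of process columns in the process grid. (P * Q is the total number of MPI processes you run on your cluster. We set P to the number of cores per node and Q to the number of nodes.)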

NBs – subset of N to distribute across nodes. NB is the block size, which is used for data distribution and data reuse. Small block sizes will limit the performance because there is less data reuse in the highest level of memory and more messaging. When block sizes are too big, we can waste space and extra computation for the larger sizes. HPL docs recommend 32 – 256. We used 256.
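As a rough sketch of the sizing rule for N, the function below takes the total memory in bytes, keeps a fraction of it (0.8 by default), and rounds the result down to a multiple of the block size. The name suggest_n and its defaults are assumptions made for this example, not something HPL provides.

```shell
#!/bin/bash
# suggest_n TOTAL_MEM_BYTES [FRACTION] [NB]
# N * N doubles (8 bytes each) should fit in FRACTION of total memory;
# the result is rounded down to a multiple of the block size NB.
suggest_n() {
    awk -v mem="$1" -v frac="${2:-0.8}" -v nb="${3:-256}" 'BEGIN {
        n = int(sqrt(mem * frac / 8))
        print n - (n % nb)
    }'
}

# Example from the text: 4 nodes with 256 MB each, about 1 GB in total.
suggest_n $((4 * 256 * 1024 * 1024))
```

For the 1 GB example this prints 10240. Treat the value as a starting point and adjust N up or down based on measured performance.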

Our example run: N = 64000

We used an N that is a multiple of NBs (64000 = 250 * 256), and we kept NBs at 256 because we noticed a huge performance drop when NBs < 256.

P = 4, which is the max number of cores we have on each node.

Q = 5, which is the number of nodes we use. We chose 5 because our 6th node didn’t have Intel libraries at the time.

NBs = 256.

After editing the HPL.dat and saving the file, you can test using MPI with HPL. Our /nfs/hosts2 file contains 5 IP addresses. To run HPL with mpirun:

mpirun -n 20 -f /nfs/hosts2 ./xhpl

You should get improved FLOPS performance compared to running HPL on a single node.
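The process count passed to mpirun has to match the P x Q grid in HPL.dat. A minimal sketch that derives it from the values of the example run and prints the command rather than running it:

```shell
#!/bin/bash
# Values from the example run above.
P=4
Q=5
HOSTFILE=/nfs/hosts2

# HPL uses a P x Q process grid, so mpirun must start exactly P * Q ranks.
NP=$((P * Q))
echo "mpirun -n $NP -f $HOSTFILE ./xhpl"
```

If you change P or Q in HPL.dat, the printed command changes with it, which avoids a mismatch between the grid and the number of ranks.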

How to Install Ansible on CentOS 7

Ansible is a useful configuration automation software that allows you to automate the setup of your machines. In comparison to other configuration automation out there like Chef or Puppet, I’ve found it to be much simpler to use for cluster setup.

Installation

First, I normally install the EPEL repository on CentOS 7 before installing ansible.

yum install epel-release

Afterwards, we can use yum install to set up ansible on our machine. If you have a cluster, install ansible on each individual node. Yum will install all the dependencies that ansible requires.

yum install ansible

To check if you have installed ansible correctly:

ansible --version
ansible 1.9.4
  configured module search path = None
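Once ansible is installed, a first step is usually an inventory file that lists the machines to manage. The sketch below writes one for six nodes named buhpc1 to buhpc6; those hostnames, the file name hosts.ini, and the group name cluster are assumptions made for this example.

```shell
#!/bin/bash
# Write a small INI-style inventory listing the six cluster nodes.
INVENTORY="${INVENTORY:-./hosts.ini}"
{
    echo "[cluster]"
    for i in 1 2 3 4 5 6; do
        echo "buhpc$i"
    done
} > "$INVENTORY"

# Print the ad-hoc command that would test connectivity to every host.
# It needs working SSH access to the nodes, so it is not run here.
echo "ansible all -i $INVENTORY -m ping"
```

Running the printed command should report a result from every node once SSH access between the machines is set up.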

How to Install Slurm on CentOS 7 Cluster

Slurm is an open-source workload manager designed for Linux clusters of all sizes. It’s a great system for queuing jobs for your HPC applications. I’m going to show you how to install Slurm on a CentOS 7 cluster.

  1. Delete failed installation of Slurm
  2. Install MariaDB
  3. Create the global users
  4. Install  Munge
  5. Install Slurm
  6. Use Slurm

Cluster Server and Compute Nodes

I configured our nodes with the following hostnames using these steps. Our server is:

buhpc3

The clients are:

buhpc1
buhpc2
buhpc3
buhpc4
buhpc5
buhpc6

 

Delete failed installation of Slurm

I leave this optional step in case you tried to install Slurm, and it didn’t work. We want to uninstall the parts related to Slurm unless you’re using the dependencies for something else.

First, I remove the database where I kept Slurm’s accounting.

yum remove mariadb-server mariadb-devel -y

Next, I remove Slurm and Munge. Munge is an authentication tool used to identify messaging from the Slurm machines.

yum remove slurm munge munge-libs munge-devel -y

I check if the slurm and munge users exist.

grep -E 'slurm|munge' /etc/passwd

Then, I delete the users and corresponding folders.

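userdel -r slurm
userdel -r munge

If userdel complains that the user is still in use, for example "userdel: user munge is currently used by process 26278", kill that process and run userdel again:

kill 26278
userdel -r munge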

Slurm, Munge, and Mariadb should be adequately wiped. Now, we can start a fresh installation that actually works.

 

Install MariaDB

You can install MariaDB to store the accounting that Slurm provides. If you want to store accounting, here’s the time to do so. I only install this on the server node, buhpc3. I use the server node as our SlurmDB node.

yum install mariadb-server mariadb-devel -y

We’ll setup MariaDB later. We just need to install it before building the Slurm RPMs.

 

Create the global users

Slurm and Munge require consistent UID and GID across every node in the cluster.

For all the nodes, before you install Slurm or Munge:

export MUNGEUSER=991
groupadd -g $MUNGEUSER munge
useradd  -m -c "MUNGE Uid 'N' Gid Emporium" -d /var/lib/munge -u $MUNGEUSER -g munge  -s /sbin/nologin munge
export SLURMUSER=992
groupadd -g $SLURMUSER slurm
useradd  -m -c "SLURM workload manager" -d /var/lib/slurm -u $SLURMUSER -g slurm  -s /bin/bash slurm
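Because the IDs have to match everywhere, it is worth checking them on each node after creating the users. A minimal sketch, where check_id is a hypothetical helper that reads the third field of a passwd-style or group-style file:

```shell
#!/bin/bash
# check_id NAME EXPECTED_ID [FILE]
# Succeeds when NAME has EXPECTED_ID in the third field of a passwd- or
# group-style file (default: /etc/passwd).
check_id() {
    local name="$1" expected="$2" file="${3:-/etc/passwd}"
    local actual
    actual=$(awk -F: -v n="$name" '$1 == n { print $3 }' "$file")
    if [ "$actual" = "$expected" ]; then
        echo "$name: OK ($actual)"
    else
        echo "$name: expected $expected, found ${actual:-none}"
        return 1
    fi
}

# "|| true" keeps the script going so that both users are always reported.
check_id munge 991 || true
check_id slurm 992 || true
```

Passing /etc/group as the third argument checks the GIDs in the same way.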

 

Install Munge

Since I’m using CentOS 7, I need to get the latest EPEL repository.

yum install epel-release

Now, I can install Munge.

yum install munge munge-libs munge-devel -y

After installing Munge, I need to create a secret key on the Server. My server is on the node with hostname, buhpc3. Choose one of your nodes to be the server node.

First, we install rng-tools to properly create the key.

yum install rng-tools -y
rngd -r /dev/urandom

Now, we create the secret key. You only have to do the creation of the secret key on the server.

/usr/sbin/create-munge-key -r
dd if=/dev/urandom bs=1 count=1024 > /etc/munge/munge.key
chown munge: /etc/munge/munge.key
chmod 400 /etc/munge/munge.key

After the secret key is created, you will need to send this key to all of the compute nodes.

scp /etc/munge/munge.key root@1.buhpc.com:/etc/munge
scp /etc/munge/munge.key root@2.buhpc.com:/etc/munge
scp /etc/munge/munge.key root@4.buhpc.com:/etc/munge
scp /etc/munge/munge.key root@5.buhpc.com:/etc/munge
scp /etc/munge/munge.key root@6.buhpc.com:/etc/munge
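With more than a handful of nodes, a loop is less error-prone than repeating the copy by hand. The sketch below assumes the compute nodes are reachable as root@1.buhpc.com, root@2.buhpc.com, and so on; push_key and DRY_RUN are hypothetical names, and by default the commands are only printed.

```shell
#!/bin/bash
# Compute nodes that still need the key; the server (node 3) already has it.
# Assumption: the nodes are reachable as root@<n>.buhpc.com.
NODES="1 2 4 5 6"
KEY=/etc/munge/munge.key

# With DRY_RUN=1 (the default in this sketch) the commands are only printed.
DRY_RUN="${DRY_RUN:-1}"

push_key() {
    local n
    for n in $NODES; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "scp $KEY root@$n.buhpc.com:/etc/munge"
        else
            scp "$KEY" "root@$n.buhpc.com:/etc/munge"
        fi
    done
}

push_key
```

Run it with DRY_RUN=0 once the printed commands look right.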

Now, we SSH into every node and correct the permissions as well as start the Munge service.

chown -R munge: /etc/munge/ /var/log/munge/
chmod 0700 /etc/munge/ /var/log/munge/
systemctl enable munge
systemctl start munge

To test Munge, we can try to access another node with Munge from our server node, buhpc3.

munge -n
munge -n | unmunge
munge -n | ssh 3.buhpc.com unmunge
remunge

If you encounter no errors, then Munge is working as expected.

 

Install Slurm

Slurm has a few dependencies that we need to install before proceeding.

yum install openssl openssl-devel pam-devel numactl numactl-devel hwloc hwloc-devel lua lua-devel readline-devel rrdtool-devel ncurses-devel man2html libibmad libibumad -y

Now, we download the latest version of Slurm preferably in our shared folder. The latest version of Slurm may be different from our version.

cd /nfs
wget http://www.schedmd.com/download/latest/slurm-15.08.9.tar.bz2

If you don’t have rpmbuild yet:

yum install rpm-build
rpmbuild -ta slurm-15.08.9.tar.bz2

We will check the rpms created by rpmbuild.

cd /root/rpmbuild/RPMS/x86_64

Now, we will copy the Slurm rpms to the shared folder so that the server and compute nodes can install them.

mkdir /nfs/slurm-rpms
cp slurm-15.08.9-1.el7.centos.x86_64.rpm slurm-devel-15.08.9-1.el7.centos.x86_64.rpm slurm-munge-15.08.9-1.el7.centos.x86_64.rpm slurm-perlapi-15.08.9-1.el7.centos.x86_64.rpm slurm-plugins-15.08.9-1.el7.centos.x86_64.rpm slurm-sjobexit-15.08.9-1.el7.centos.x86_64.rpm slurm-sjstat-15.08.9-1.el7.centos.x86_64.rpm slurm-torque-15.08.9-1.el7.centos.x86_64.rpm /nfs/slurm-rpms

On every node that you want to be a server or compute node, we change into /nfs/slurm-rpms and install those rpms. In our case, I want every node to be a compute node.

yum --nogpgcheck localinstall slurm-15.08.9-1.el7.centos.x86_64.rpm slurm-devel-15.08.9-1.el7.centos.x86_64.rpm slurm-munge-15.08.9-1.el7.centos.x86_64.rpm slurm-perlapi-15.08.9-1.el7.centos.x86_64.rpm slurm-plugins-15.08.9-1.el7.centos.x86_64.rpm slurm-sjobexit-15.08.9-1.el7.centos.x86_64.rpm slurm-sjstat-15.08.9-1.el7.centos.x86_64.rpm slurm-torque-15.08.9-1.el7.centos.x86_64.rpm

After we have installed Slurm on every machine, we will configure Slurm properly.

Visit http://slurm.schedmd.com/configurator.easy.html to make a configuration file for Slurm.

I leave everything default except:

ControlMachine: buhpc3
ControlAddr: 128.197.116.18
NodeName: buhpc[1-6]
CPUs: 4
StateSaveLocation: /var/spool/slurmctld
SlurmctldLogFile: /var/log/slurmctld.log
SlurmdLogFile: /var/log/slurmd.log
ClusterName: buhpc

After you hit Submit on the form, you will be given the full Slurm configuration file to copy.

On the server node, which is buhpc3:

cd /etc/slurm
vim slurm.conf

Copy the form’s Slurm configuration file that was created from the website and paste it into slurm.conf. We still need to change something in that file.

Underneath “# COMPUTE NODES” in slurm.conf, we see that Slurm tries to determine the IP addresses automatically with the one line.

NodeName=buhpc[1-6] CPUs=4 State=UNKNOWN

I don’t use IP addresses in order, so I manually delete this one line and change it to:
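The replacement lines are not reproduced in this copy of the article. As a hedged sketch only, explicit per-node lines with a NodeAddr could be generated from IP and hostname pairs. The addresses below are an assumption carried over from the hostname list in the first article, node_lines is a hypothetical helper name, and the author's actual slurm.conf may have differed.

```shell
#!/bin/bash
# Print one slurm.conf node line per "IP hostname" pair read from stdin.
# CPUs=4 matches the value entered in the configurator form.
node_lines() {
    local ip name
    while read -r ip name; do
        [ -n "$name" ] || continue
        echo "NodeName=$name NodeAddr=$ip CPUs=4 State=UNKNOWN"
    done
}

node_lines <<'EOF'
128.197.115.158 buhpc1
128.197.115.7 buhpc2
128.197.115.176 buhpc3
128.197.115.17 buhpc4
128.197.115.9 buhpc5
128.197.115.15 buhpc6
EOF
```

The printed lines can be pasted under the COMPUTE NODES comment in place of the single buhpc[1-6] line. Check every address against your own /etc/hosts before using them.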

After you explicitly put in the NodeAddr IP Addresses, you can save and quit. Here is my full slurm.conf and what it looks like:

Now that the server node has the slurm.conf correctly, we need to send this file to the other compute nodes.

scp slurm.conf root@1.buhpc.com:/etc/slurm/slurm.conf
scp slurm.conf root@2.buhpc.com:/etc/slurm/slurm.conf
scp slurm.conf root@4.buhpc.com:/etc/slurm/slurm.conf
scp slurm.conf root@5.buhpc.com:/etc/slurm/slurm.conf
scp slurm.conf root@6.buhpc.com:/etc/slurm/slurm.conf

Now, we will configure the server node, buhpc3. We need to make sure that the server has all the right configurations and files.

mkdir /var/spool/slurmctld
chown slurm: /var/spool/slurmctld
chmod 755 /var/spool/slurmctld
touch /var/log/slurmctld.log
chown slurm: /var/log/slurmctld.log
touch /var/log/slurm_jobacct.log /var/log/slurm_jobcomp.log
chown slurm: /var/log/slurm_jobacct.log /var/log/slurm_jobcomp.log

Now, we will configure all the compute nodes, buhpc[1-6]. We need to make sure that all the compute nodes have the right configurations and files.

mkdir /var/spool/slurmd
chown slurm: /var/spool/slurmd
chmod 755 /var/spool/slurmd
touch /var/log/slurmd.log
chown slurm: /var/log/slurmd.log

Use the following command to make sure that slurmd is configured properly.

slurmd -C

You should get something like this:

ClusterName=(null) NodeName=buhpc3 CPUs=4 Boards=1 SocketsPerBoard=2 CoresPerSocket=2 ThreadsPerCore=1 RealMemory=7822 TmpDisk=45753
UpTime=13-14:27:52

The firewall will block connections between nodes, so I normally disable the firewall on the compute nodes except for buhpc3.

systemctl stop firewalld
systemctl disable firewalld

On the server node, buhpc3, I usually open the default ports that Slurm uses:

firewall-cmd --permanent --zone=public --add-port=6817/udp
firewall-cmd --permanent --zone=public --add-port=6817/tcp
firewall-cmd --permanent --zone=public --add-port=6818/tcp
firewall-cmd --permanent --zone=public --add-port=7321/tcp
firewall-cmd --reload

If opening the ports does not work, stop firewalld for testing. Next, we need to check for out-of-sync clocks on the cluster. On every node:

yum install ntp -y
chkconfig ntpd on
ntpdate pool.ntp.org
systemctl start ntpd

The clocks should be synced, so we can try starting Slurm! On all the compute nodes, buhpc[1-6]:

systemctl enable slurmd.service
systemctl start slurmd.service
systemctl status slurmd.service

Now, on the server node, buhpc3:

systemctl enable slurmctld.service
systemctl start slurmctld.service
systemctl status slurmctld.service

When you check the status of slurmd and slurmctld, you should see whether they started successfully. If problems happen, check the logs!

Compute node bugs: tail /var/log/slurmd.log
Server node bugs: tail /var/log/slurmctld.log

Use Slurm

To display the compute nodes:

scontrol show nodes

-N allows you to choose how many compute nodes that you want to use. To run jobs on the server node, buhpc3:

srun -N5 /bin/hostname
buhpc3
buhpc2
buhpc4
buhpc5
buhpc1

To display the job queue:

scontrol show jobs
JobId=16 JobName=hostname
UserId=root(0) GroupId=root(0)
Priority=4294901746 Nice=0 Account=(null) QOS=(null)
JobState=COMPLETED Reason=None Dependency=(null)
Requeue=1 Restarts=0 BatchFlag=0 Reboot=0 ExitCode=0:0
RunTime=00:00:00 TimeLimit=UNLIMITED TimeMin=N/A
SubmitTime=2016-04-10T16:26:04 EligibleTime=2016-04-10T16:26:04
StartTime=2016-04-10T16:26:04 EndTime=2016-04-10T16:26:04
PreemptTime=None SuspendTime=None SecsPreSuspend=0
Partition=debug AllocNode:Sid=buhpc3:1834
ReqNodeList=(null) ExcNodeList=(null)
NodeList=buhpc[1-5]
BatchHost=buhpc1
NumNodes=5 NumCPUs=20 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
TRES=cpu=20,node=5
Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
MinCPUsNode=1 MinMemoryNode=0 MinTmpDiskNode=0
Features=(null) Gres=(null) Reservation=(null)
Shared=0 Contiguous=0 Licenses=(null) Network=(null)
Command=/bin/hostname
WorkDir=/root
Power= SICP=0

To submit script jobs, create a script file that contains the commands that you want to run. Then:

sbatch -N2 script-file
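As an illustration, a script file could look like the one created below. The job name and output file pattern are arbitrary choices for this sketch. Lines starting with #SBATCH are read by sbatch as options, while bash treats them as ordinary comments.

```shell
#!/bin/bash
# Create a small batch script called script-file in the current directory.
cat > script-file <<'EOF'
#!/bin/bash
#SBATCH --job-name=hello
#SBATCH --output=hello-%j.out

# The batch script itself runs on the first allocated node.
echo "batch host: $(hostname)"

# srun launches the command on every allocated node.
if command -v srun >/dev/null 2>&1; then
    srun /bin/hostname
fi
EOF
chmod +x script-file
```

After submitting it with sbatch, the output ends up in hello-<jobid>.out in the directory you submitted from.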

Slurm has a lot of useful commands. You may have heard of other queuing tools like torque. Here’s a useful link for the command differences: http://www.sdsc.edu/~hocks/FG/PBS.slurm.html

 

Accounting in Slurm

We’ll worry about accounting in Slurm with MariaDB for next time. Let me know if you encounter any problems with the above steps!

 

How to Install GlusterFS on CentOS 7

GlusterFS is a scale-out network-attached storage file system. In this tutorial, we’ll be setting up GlusterFS on a cluster with CentOS 7.  Our cluster has 6 nodes connected through a switch. I’ll be using all 6 nodes as servers for distributed replicated storage with opportunity for more nodes to be clients that can access files from the GlusterFS servers.

How does GlusterFS work

In GlusterFS, servers are used to store data in a distributed manner, and clients can access that data. Let’s explain with our 6 node example. I’m using a replica count of 3, which means that every file is stored on 3 bricks, so the 6 servers form 2 replica sets of 3 nodes each. With the brick order used in the volume create command later on, nodes 3, 1, and 2 (replication1) mirror each other, and nodes 4, 5, and 6 (replication2) mirror each other.

Sometimes, files are retrieved from replication1, and other times, from replication2. If you think about the entire storage file system, replication1 and replication2 combine into one larger storage system (distribution). The charm of GlusterFS is that it calculates a file’s location without a lookup, so there is less of a bottleneck.

 

Cluster Servers

I configured our nodes with the following hostnames using these steps. Our servers are:

3.buhpc.com
1.buhpc.com
2.buhpc.com
4.buhpc.com
5.buhpc.com
6.buhpc.com

Setting up the GlusterFS Servers

Update the yum repo and epel.

yum update -y
wget -P /etc/yum.repos.d http://download.gluster.org/pub/gluster/glusterfs/LATEST/EPEL.repo/glusterfs-epel.repo
yum install http://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm

Install the GlusterFS server and Samba.

yum install glusterfs-server samba -y

We will make a directory on every server node, which will be the location of the mount point.

mkdir -p /gfs/glustervol

On every server node, we want to start the gluster service.

systemctl enable glusterd.service && systemctl start glusterd.service

On every server node, if you have firewalld running, we want to open the correct ports.

firewall-cmd --zone=public --add-port=24009/tcp --permanent
firewall-cmd --zone=public --add-port=24007/tcp --permanent
firewall-cmd --zone=public --add-service=nfs --add-service=samba --add-service=samba-client --permanent
firewall-cmd --zone=public --add-port=111/tcp --add-port=139/tcp --add-port=445/tcp --add-port=965/tcp --add-port=2049/tcp --add-port=38465-38469/tcp --add-port=631/tcp --add-port=111/udp --add-port=963/udp --add-port=49152-49251/tcp  --permanent
firewall-cmd --reload

You should see success on every added firewalld rule.

 

The Main GlusterFS Server

For our setup, we chose our 3.buhpc.com node to be our main server that connects all the other servers. Choose one node as the main server and connect the peers:

gluster peer probe 1.buhpc.com
gluster peer probe 2.buhpc.com
gluster peer probe 4.buhpc.com
gluster peer probe 5.buhpc.com
gluster peer probe 6.buhpc.com

We can check if we successfully added all the peers to our main server.

gluster peer status
Number of Peers: 5
Hostname: cumm024-0b08-dhcp07.bu.edu
Uuid: b7c48a28-2229-49f5-af28-41cd9cce2fe6
State: Peer in Cluster (Connected)
Other names:
2.buhpc.com

Hostname: 1.buhpc.com
Uuid: 5eacbc2e-6490-47bb-b4fd-9a2575db941f
State: Peer in Cluster (Connected)

Hostname: 4.buhpc.com
Uuid: 240282d8-a4cb-4bbc-8ca6-00a3383a0c48
State: Peer in Cluster (Connected)

Hostname: 5.buhpc.com
Uuid: 4edc641b-dbcb-415f-9618-718087004adc
State: Peer in Cluster (Connected)

Hostname: 6.buhpc.com
Uuid: 24364805-7cbe-405d-adcf-a6334f9f6e40
State: Peer in Cluster (Connected)

Now, we will create the GlusterFS volume. We are naming it glustervol. We are using a replica count of 3, which groups the 6 servers that we have into 2 replica sets of 3.

gluster volume create glustervol replica 3 transport tcp 3.buhpc.com:/gfs/glustervol 1.buhpc.com:/gfs/glustervol 2.buhpc.com:/gfs/glustervol 4.buhpc.com:/gfs/glustervol 5.buhpc.com:/gfs/glustervol 6.buhpc.com:/gfs/glustervol force

If all goes well, we can start the gluster volume.

gluster volume start glustervol
gluster volume info all
Volume Name: glustervol
Type: Distributed-Replicate
Volume ID: ed995f44-6649-48d0-b5a8-7e87c3568473
Status: Started
Number of Bricks: 2 x 3 = 6
Transport-type: tcp
Bricks:
Brick1: 3.buhpc.com:/gfs/glustervol
Brick2: 1.buhpc.com:/gfs/glustervol
Brick3: 2.buhpc.com:/gfs/glustervol
Brick4: 4.buhpc.com:/gfs/glustervol
Brick5: 5.buhpc.com:/gfs/glustervol
Brick6: 6.buhpc.com:/gfs/glustervol
Options Reconfigured:
performance.readdir-ahead: on
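The "2 x 3 = 6" line reflects how the bricks are grouped: each run of consecutive bricks in the create command, as long as the replica count, forms one replica set. A small sketch that prints this grouping for a given brick order; replica_sets is a hypothetical helper written for this example, not a gluster command.

```shell
#!/bin/bash
# replica_sets COUNT BRICK...
# Groups bricks in the order given: each run of COUNT consecutive bricks
# forms one replica set.
replica_sets() {
    local count="$1"; shift
    local brick i=0 group=1 line=""
    for brick in "$@"; do
        line="$line $brick"
        i=$((i + 1))
        if [ "$i" -eq "$count" ]; then
            echo "set $group:$line"
            group=$((group + 1)); i=0; line=""
        fi
    done
}

replica_sets 3 3.buhpc.com 1.buhpc.com 2.buhpc.com 4.buhpc.com 5.buhpc.com 6.buhpc.com
```

Changing the brick order in the create command changes which servers mirror each other, so it is worth checking the grouping before creating a volume.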

If we go into /gfs/glustervol and create a file, it appears on all the server nodes.

cd /gfs/glustervol
touch slothparadise.txt
ssh [email protected]
ls /gfs/glustervol
slothparadise.txt

 

Connecting Clients

The servers store the data efficiently in a distributed replicated manner. Now, we can add clients to be able to access those files. Let’s say that we had a node 7, 7.buhpc.com. Here’s how we would add node 7 as a client to the 6 node servers.

yum install glusterfs glusterfs-fuse attr -y

After installing the necessary glusterfs dependencies for clients, we can mount glusterfs volume onto a folder of node 7. We mount with type, glusterfs, and access the main server’s volume name, which we named glustervol, and we mount it into /mnt on node 7.

mount -t glusterfs 3.buhpc.com:/glustervol /mnt/
ls /mnt
slothparadise.txt

 

Deleting Gluster Volume

You may want to delete a gluster volume in the future. To find what gluster volumes you have:

gluster volume info all

To stop and delete the gluster volume:

gluster volume stop nameOfVolume
gluster volume delete nameOfVolume

How to Install Ganglia on CentOS 7

Ganglia is a scalable distributed monitoring system for high-performance computing systems such as clusters and Grids. Ganglia is useful when monitoring nodes of a cluster. Setting up Ganglia on CentOS 7 with a bunch of nodes can be confusing. In this blog, I’ll show you how to setup Ganglia and its web interface properly. Our cluster has 6 nodes connected through a switch.

Cluster Server and Clients

I configured our nodes with the following hostnames using these steps. Our server is:

3.buhpc.com

The clients are:

1.buhpc.com
2.buhpc.com
4.buhpc.com
5.buhpc.com
6.buhpc.com

 

Installation

On the server, inside the shared folder of our cluster, we will first download the latest version of ganglia. For our cluster, /nfs is the folder with our network file system.

cd /nfs
wget http://downloads.sourceforge.net/project/ganglia/ganglia%20monitoring%20core/3.7.2/ganglia-3.7.2.tar.gz

On the server, we will install dependencies and libconfuse.

yum install freetype-devel rpm-build php httpd libpng-devel libart_lgpl-devel python-devel pcre-devel autoconf automake libtool expat-devel rrdtool-devel apr-devel gcc-c++ make pkgconfig -y
yum install https://dl.fedoraproject.org/pub/epel/7/x86_64/l/libconfuse-2.7-7.el7.x86_64.rpm -y
yum install https://dl.fedoraproject.org/pub/epel/7/x86_64/l/libconfuse-devel-2.7-7.el7.x86_64.rpm -y

Now, we will build the rpms from ganglia-3.7.2 on the server.

rpmbuild -tb ganglia-3.7.2.tar.gz

After running rpmbuild, /root/rpmbuild/RPMS/x86_64 contains the generated rpms:

cd /root/rpmbuild/RPMS/x86_64/
yum install *.rpm -y

We will remove gmetad because we do not need it on the clients. Send the rest of the rpms to all the clients’ /tmp folder:

cd /root/rpmbuild/RPMS/x86_64/
rm -rf ganglia-gmetad*.rpm
scp *.rpm root@1.buhpc.com:/tmp
scp *.rpm root@2.buhpc.com:/tmp
scp *.rpm root@4.buhpc.com:/tmp
scp *.rpm root@5.buhpc.com:/tmp
scp *.rpm root@6.buhpc.com:/tmp

SSH onto every client (replace # with the node number) and install the rpms that we will need:

ssh root@#.buhpc.com
yum install https://dl.fedoraproject.org/pub/epel/7/x86_64/l/libconfuse-2.7-7.el7.x86_64.rpm -y
yum install https://dl.fedoraproject.org/pub/epel/7/x86_64/l/libconfuse-devel-2.7-7.el7.x86_64.rpm -y
yum install /tmp/*.rpm -y

Back on the server, we will adjust the gmetad configuration file:

cd /etc/ganglia
vim gmetad.conf

buhpc will be the name of our cluster. Find the following line and add the name of your cluster and the IP address of the server. I am using the hostname instead of the IP address.

data_source "buhpc" 1 3.buhpc.com

Now, we edit the server’s gmond configuration file.

vim /etc/ganglia/gmond.conf

Make sure that these sections have the following and comment any extra lines you see that are within each section.

cluster {
  name = "buhpc"
  owner = "unspecified"
  latlong = "unspecified"
  url = "unspecified"
}

udp_send_channel {
  host = 1.buhpc.com
  port = 8649
  ttl = 1
}

udp_send_channel {
  host = 2.buhpc.com
  port = 8649
  ttl = 1
}

udp_send_channel {
  host = 3.buhpc.com
  port = 8649
  ttl = 1
}

udp_send_channel {
  host = 4.buhpc.com
  port = 8649
  ttl = 1
}

udp_send_channel {
  host = 5.buhpc.com
  port = 8649
  ttl = 1
}

udp_send_channel {
  host = 6.buhpc.com
  port = 8649
  ttl = 1
}

udp_recv_channel {
  port = 8649
  retry_bind = true
}
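Since the udp_send_channel blocks differ only in the host name, they can be generated instead of typed. A minimal sketch, where send_channels is a hypothetical helper and the {1..6} brace expansion requires bash:

```shell
#!/bin/bash
# Print one udp_send_channel block per host, using the same port and ttl
# as the configuration above.
send_channels() {
    local host
    for host in "$@"; do
        printf 'udp_send_channel {\n  host = %s\n  port = 8649\n  ttl = 1\n}\n\n' "$host"
    done
}

# Expands to 1.buhpc.com 2.buhpc.com ... 6.buhpc.com
send_channels {1..6}.buhpc.com
```

The output can be pasted into gmond.conf in place of the hand-written blocks.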

Now, SSH into each of the clients and do the following individually. On every client:

vim /etc/ganglia/gmond.conf

We will change the clients’ gmond.conf in the same way as the server’s.  Make sure that these sections have the following lines and comment any extra lines you see that are within each section.

cluster {
  name = "buhpc"
  owner = "unspecified"
  latlong = "unspecified"
  url = "unspecified"
}

udp_send_channel {
  host = 1.buhpc.com
  port = 8649
  ttl = 1
}

udp_send_channel {
  host = 2.buhpc.com
  port = 8649
  ttl = 1
}

udp_send_channel {
  host = 3.buhpc.com
  port = 8649
  ttl = 1
}

udp_send_channel {
  host = 4.buhpc.com
  port = 8649
  ttl = 1
}

udp_send_channel {
  host = 5.buhpc.com
  port = 8649
  ttl = 1
}

udp_send_channel {
  host = 6.buhpc.com
  port = 8649
  ttl = 1
}

udp_recv_channel {
  port = 8649
  retry_bind = true
}

We will start gmond on the clients for monitoring.

chkconfig gmond on
systemctl start gmond

Back on the server, we want to install the Ganglia web interface.

cd /nfs
wget http://downloads.sourceforge.net/project/ganglia/ganglia%20monitoring%20core/3.1.1%20%28Wien%29/ganglia-web-3.1.1-1.noarch.rpm -O ganglia-web-3.1.1-1.noarch.rpm
yum install -y ganglia-web-3.1.1-1.noarch.rpm

Next, we will want to disable SELinux. Change SELINUX inside /etc/sysconfig/selinux from enforcing to disabled. Then, restart the server node.

vim /etc/sysconfig/selinux
SELINUX=disabled
reboot

Now, on the server, we’ll open the correct ports on the firewall.

firewall-cmd --permanent --zone=public --add-service=http
firewall-cmd --permanent --zone=public --add-port=8649/udp
firewall-cmd --permanent --zone=public --add-port=8649/tcp
firewall-cmd --permanent --zone=public --add-port=8651/tcp
firewall-cmd --permanent --zone=public --add-port=8652/tcp
firewall-cmd --reload

On the server, we will now start httpd, gmetad, and gmond.

chkconfig httpd on
chkconfig gmetad on
chkconfig gmond on
systemctl start httpd
systemctl start gmetad
systemctl start gmond

Visit http://3.buhpc.com/ganglia to see Ganglia’s monitoring. You should see the Ganglia home page.

How to Bypass Intel PXE Boot

You might encounter a PXE Boot loading screen on your cluster, but you don’t have PXE Boot. And your network isn’t even configured yet! Many machines built to be part of clusters will pop up to this PXE screen and will loop forever! What do you do?

 

Change the Boot Order

Let’s say that all I want to do is boot the machine and install a new operating system. I have my trusty CentOS 7 Bootable USB installer drive with me, and I plug it into the computer. First, we want to restart our machine, and press the run setup button on the boot screen, which is normally F2, but on this machine, it is DEL.

We will maneuver tabs until we get to the Boot Order tab.

As you look at all the devices, you’ll see a couple of things in your boot order called IBA GE Slot. These PCI slots trigger the PXE boot. We want to make sure that these two slots are at the bottom of the boot order. <+> moves devices up and <-> moves devices down.

Move USB HDD to the top and both PCI BEV: IBA GE Slots to the bottom. Make sure that USB HDD is associated with a number like 1: USB HDD. If it does not have a number, your hard drive is part of the excluded list and you need to find its name and press <x> to include it into the boot list. Then, move USB HDD all the way up with <+>.

Press ESC and choose to save the configuration, and your computer restarts. Wait while your computer boots into your bootable USB flash drive installer as expected. Thankfully, PXE boot will no longer run before everything else!

How to Setup the Intel Compilers on a Cluster

Intel compilers like icc or icl are very useful for any cluster with Intel processors. They’ve been known to produce very efficient numerical code. If you are still a student, you can grab the student Intel Parallel Studio XE Cluster Edition, which includes Fortran and C/C++ for free for a year. Here’s our experience. If you need more information, definitely check out the official Intel Parallel Studio XE Cluster Edition guide.

Dependencies

The GCC C and C++ compilers must be installed on every machine in the cluster. I am using CentOS 7, so I install them with yum:

yum install gcc
yum install gcc-c++
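
A quick way to confirm the compilers are present on a node is to look for them on the PATH. This is a minimal sketch; missing_tools is a hypothetical helper name, not a standard command.

```shell
# Hypothetical helper: print the name of every command in the arguments
# that cannot be found on the PATH. Prints nothing if all are present.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

Running missing_tools gcc g++ on each node should print nothing once both packages are installed.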

Getting the Intel compilers and MPI libraries

I’m going to grab the student Intel® Parallel Studio XE Cluster Edition for Linux, which lasts for a year. The first thing to do is join the Intel Developer Zone at the following link:

https://software.intel.com/registration/?lang=en-us

Fill in your information and choose an Intel User ID. Now you’ll have an account, but you’ll need to be a student to get the Intel compilers for free at:

https://software.intel.com/en-us/qualify-for-free-software/student

Click on Linux underneath Intel Parallel Studio XE Cluster Edition. Check the items on the next page and fill in your e-mail before submitting. After submitting, you’ll receive an e-mail titled “Thank You for Your Interest in the Intel® Software Development Products.”

The e-mail contains a product serial number that should last a year. The e-mail also contains a DOWNLOAD button that you should click.

After visiting the link, you’ll be brought to Intel® Parallel Studio XE Cluster Edition for Linux*. I prefer the Full Offline Installer Package (3994 MB). If you choose the Full Offline Installer Package, you will need to stay on that link and acquire your license file. In the red text, you’ll see the following sentence:

"If you need to acquire your license file now, for offline installation, please click here to provide your host information and download your license file."

Once you click the here link, you’ll be brought to a Sign In page to download your license file. After signing in, you’ll see the licenses that you can download. Download your license file or e-mail it to yourself. If you download the license, it should be a .lic file.

At this point, you should have downloaded two files. parallel_studio_xe_2016_update2.tgz contains the zipped archive of the Intel Parallel Studio XE Cluster Edition, and NCOM….lic is your license.

parallel_studio_xe_2016_update2.tgz
NCOM....lic

You should upload these two files to the shared folder of your cluster. My shared folder is /nfs, so I’ll be sending those two files to my /nfs folder.

scp parallel_studio_xe_2016_update2.tgz NCOM...lic [email protected]:/nfs

Now, you can extract the tgz file by running:

ssh [email protected]
cd /nfs
tar -xvf parallel_studio_xe_2016_update2.tgz

We will move the license file into /root and name it Licenses.

mv NCOM....lic /root/Licenses

Activation

Now, we will set up the Intel compilers and MPI libraries.

cd parallel_studio_xe_2016_update2
./install.sh

It should say Initializing, please wait… until a text-based installer appears. Type the number of the option that starts the installation.

First, we need to activate. Hit 3 and press Enter.

Step 2 of 7 | License agreement
[Press space to continue, 'q' to quit.]

After pressing space a bunch of times, you’ll reach the end of the license.

Type 'accept' to continue or 'decline' to go back to the previous menu:

Type “accept.”

Please type a selection or press "Enter" to accept default choice [1]:
Please type your serial number (the format is XXXX-XXXXXXXX):

In another terminal, check the serial number, which will be inside /root/Licenses.

Install

Press Enter and choose the number options for the Intel compilers and libraries you want to install. You’ll see the installation of Intel MPI Benchmarks, Libraries, C++ Compiler, Fortran Compiler, and more. Using the install.sh script is the surest way to make sure that all the Intel libraries are installed correctly, but if you really only want specific libraries, you’ll have to select the ones you want from inside the rpm/ folder. The full installation may take 15 minutes or more.

Press "Enter" key to continue:
Press "Enter" key to quit:

As a final step, the Intel paths may not be set up automatically. I am using 64-bit CentOS 7, so I’ll have to set up the Intel 64-bit environment. We’ll have to adjust our ~/.bashrc.

vim ~/.bashrc

Add to the end of the file the following:
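
A minimal sketch of what such lines might look like follows. It rests on assumptions, not on the author’s exact configuration: the default /opt/intel install prefix and the bin/compilervars.sh script are typical for this release but may differ on your system, and load_intel_env is a hypothetical helper name.

```shell
# Sketch only: the /opt/intel prefix and the bin/compilervars.sh location are
# assumptions about a default install; check where your installer put them.
load_intel_env() {
    intel_root="${1:-/opt/intel}"
    if [ -f "$intel_root/bin/compilervars.sh" ]; then
        # intel64 selects the 64-bit environment
        . "$intel_root/bin/compilervars.sh" intel64
    else
        echo "compilervars.sh not found under $intel_root" >&2
        return 1
    fi
}
```

With the function defined in ~/.bashrc, a final line that calls load_intel_env (optionally with a different install prefix as its argument) loads the environment at login. The guard keeps the shell usable on nodes where the Intel tools are not installed.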

Save and quit. Note: your directories may be slightly different depending on the version of Intel Parallel Studio XE Cluster Edition you installed. Check that the directories exist on your system and adjust them accordingly.

source ~/.bashrc

Now, you should be able to access and use the Intel compilers as expected.

RLIMIT_MEMLOCK too small

When you first run your mpirun command with the Intel Parallel Studio XE Cluster Edition, you may receive an error about RLIMIT_MEMLOCK being too small.

mpirun -n 8 -f /nfs/hosts2 ./xhpcg --nx=16 --rt=60

The problem is that the memory lock limit is set to a fixed value that is too small. On every machine where you want to use MPI, set the memory lock limit to unlimited.

ulimit -l unlimited
ulimit -l

If the second command prints unlimited, the memory lock limit is now unlimited for the current shell. Next, we have to make the setting persistent so that it applies to every new login session.

vi /etc/security/limits.conf

Go to the bottom of the file and add the following:

*            hard   memlock           unlimited
*            soft   memlock           unlimited

Save and quit. Now, if you run the MPI command again, you should not encounter any problems.
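
To confirm that both entries made it into the file on a given node, a small check like the following can help. It is a minimal sketch; has_memlock_unlimited is a hypothetical helper name, and it only looks for the wildcard (*) entries shown above.

```shell
# Hypothetical helper: succeed only if the given limits.conf-style file
# contains both the hard and the soft "* ... memlock unlimited" entries.
has_memlock_unlimited() {
    for kind in hard soft; do
        grep -Eq "^[[:space:]]*\*[[:space:]]+$kind[[:space:]]+memlock[[:space:]]+unlimited" "$1" || return 1
    done
}
```

For example, has_memlock_unlimited /etc/security/limits.conf && echo ok prints ok when both lines are present. The new limits take effect in sessions started after the change, so log in again and check ulimit -l before rerunning mpirun.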

Missing Hydra Files

You may come across an error about missing hydra files. When you run mpirun, you may get:

bash: /usr/local/bin/hydra_pmi_proxy: No such file or directory

To fix the problem, I downloaded MPICH, a different MPI library, and compiled it with these instructions. The hydra binaries from MPICH should work with Intel MPI because both use the same process manager. I copied MPICH’s hydra binaries to a directory that is also on the PATH set in ~/.bashrc.

cp /nfs/mpich2/bin/hydra_persist /nfs/mpich2/bin/hydra_nameserver /nfs/mpich2/bin/hydra_pmi_proxy /usr/local/bin

Then, I added /usr/local/bin to the ~/.bashrc PATH.

vim ~/.bashrc

Add the following line:

export PATH=/usr/local/bin:$PATH

Save the file, and then reload ~/.bashrc.

source ~/.bashrc
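
Each time ~/.bashrc is sourced, the export line prepends /usr/local/bin again, so the PATH slowly fills with duplicates. A small guard avoids that. This is a minimal sketch; prepend_path is a hypothetical helper name, not part of the original fix.

```shell
# Hypothetical helper: put a directory at the front of PATH only if it is
# not already listed, so re-sourcing ~/.bashrc does not add duplicates.
prepend_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present, leave PATH unchanged
        *) PATH="$1:$PATH" ;;
    esac
    export PATH
}
```

In ~/.bashrc, a call such as prepend_path /usr/local/bin can then stand in for the plain export line.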

Do this for all the nodes where you are missing hydra_pmi_proxy. Afterwards, if you run mpirun again, it should work!

yum [Errno 14] HTTP Error 404 – Not Found

When you run a yum command like:

yum install vim

You may get the following error:

Loaded plugins: fastestmirror
http://repo.dimenoc.com/pub/centos/7.1.1503/os/x86_64/repodata/repomd.xml: [Errno 14] HTTP Error 404 - Not Found
Trying other mirror.
Is this ok [y/d/N]: y
Downloading packages:
Delta RPMs disabled because /usr/bin/applydeltarpm not installed.
vim-enhanced-7.4.160-1.el7.x86 FAILED
http://ftp.linux.ncsu.edu/pub/CentOS/7.1.1503/os/x86_64/Packages/vim-enhanced-7.4.160-1.el7.x86_64.rpm: [Errno 14] HTTP Error 404 - Not Found

To fix the error, clear yum’s cached metadata, which still points at the outdated mirror paths, and then update:

yum clean all
yum update

Now try again:

yum install vim

The command should continue normally.