Setting Up InfiniBand in CentOS 6.7

Installing InfiniBand Drivers

In CentOS/RHEL, software support for Mellanox InfiniBand hardware is provided by the package group “Infiniband Support”, which can be installed with yum:

$yum -y groupinstall "Infiniband Support"

This installs the required kernel modules and the InfiniBand subnet manager, opensm.

Several optional packages are also available that make configuring and troubleshooting the network easier:

$yum -y install infiniband-diags perftest qperf

infiniband-diags is a network diagnostic package containing useful analysis tools such as ibping and ibstat.

perftest and qperf are performance-testing packages containing benchmarking tools. (Note that the package is qperf; gperf is an unrelated hash-function generator.)

InfiniBand in CentOS uses RDMA (Remote Direct Memory Access, the ability to access the memory of a host without disturbing its CPU). Set the rdma and opensm services to start at boot time:

$chkconfig rdma on
$chkconfig opensm on
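
Before rebooting, it may be worth confirming that both services are actually registered to start. The sketch below assumes the usual chkconfig --list output format (a service name followed by runlevel:state pairs). The helper name enabled_in_runlevel and the sample line are illustrative and were not captured from a real host.

```shell
# Hypothetical helper: succeeds if the `chkconfig --list <service>` line on
# stdin shows the service switched on for the given runlevel.
enabled_in_runlevel() {
    # Put every whitespace-separated field on its own line, then look for
    # an exact "<level>:on" field.
    tr -s ' \t' '\n' | grep -qx "$1:on"
}

# Illustrative sample line (assumed format, not taken from a real system).
sample='rdma 0:off 1:off 2:on 3:on 4:on 5:on 6:off'

if printf '%s\n' "$sample" | enabled_in_runlevel 3; then
    echo "rdma is enabled in runlevel 3"
fi

# On a real CentOS 6 host the same check would read:
#   chkconfig --list rdma | enabled_in_runlevel 3
```

The function only inspects text, so it can be tried on any machine; only the commented-out line needs chkconfig itself.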

Restart the system. This will load the required kernel modules and start rdma and opensm.

$reboot

Checking Network Connectivity

After the system reboots, check the status of rdma and opensm (both should report as loaded or running):

$service rdma status
$service opensm status

Check that any InfiniBand interfaces are now recognized. They should appear as ib0, ib1, etc. with 20-byte hardware addresses:

$ifconfig
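
The 20-byte address length is an easy way to tell an InfiniBand interface from an Ethernet one (6 bytes). Since ifconfig can warn that it shows only part of such a long address, a small check like the following sketch may help. The helper name is_ipoib_hwaddr and the example address are made up for illustration, and the sysfs path in the comment assumes an interface named ib0.

```shell
# Hypothetical helper: succeeds if the argument looks like a 20-byte
# hardware address (20 colon-separated hex octets).
is_ipoib_hwaddr() {
    printf '%s\n' "$1" | grep -Eq '^([0-9A-Fa-f]{2}:){19}[0-9A-Fa-f]{2}$'
}

# Made-up example address, 20 octets long.
addr='80:00:00:48:fe:80:00:00:00:00:00:00:00:02:c9:03:00:1a:2b:3c'

if is_ipoib_hwaddr "$addr"; then
    echo "looks like an InfiniBand hardware address"
fi

# On a real host the address could be read from sysfs (assumes ib0 exists):
#   is_ipoib_hwaddr "$(cat /sys/class/net/ib0/address)"
```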

Check the status of the local InfiniBand link with ibstat:

$ibstat

Connected ports report a physical state of "LinkUp", and once a subnet manager is running their state should read "Active". Each port also has a Base LID (local identifier) field, which is the address by which other hosts recognize it.

To display the InfiniBand hosts and switches visible on the network, run the ibhosts and ibswitches commands:
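
When a host has several ports, the ibstat output gets long. As a rough sketch, the fields mentioned above can be condensed to one line per port. The helper name ib_port_summary is hypothetical, and the sample text only imitates the field labels ibstat prints (State, Physical state, Base lid); it is not output from a real adapter.

```shell
# Hypothetical helper: condense ibstat-style output on stdin into one line
# per port, using the State, Physical state and Base lid fields.
ib_port_summary() {
    awk '
        $1 == "Port" && $2 ~ /^[0-9]+:$/  { port = $2; sub(/:$/, "", port) }
        $1 == "State:"                    { state = $2 }
        $1 == "Physical" && $2 == "state:" { phys = $3 }
        $1 == "Base" && $2 == "lid:" {
            print "port=" port, "state=" state, "physical=" phys, "lid=" $3
        }
    '
}

# Illustrative input only (assumed layout, values invented).
ib_port_summary <<'EOF'
CA 'mlx4_0'
    Number of ports: 1
    Port 1:
        State: Active
        Physical state: LinkUp
        Rate: 40
        Base lid: 3
        LMC: 0
        SM lid: 1
EOF

# On a real host:
#   ibstat | ib_port_summary
```

The lid value printed here is what ibping expects as its destination argument.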

$ibhosts
$ibswitches

To test network connectivity with the ibping command (the InfiniBand counterpart of the ICMP ping command), log in to the host you wish to ping and start its ibping server:

$ibping -S &

Run ibstat on that host to find the base LID of the connected port. Then log in to the host you wish to ping from and run the command:

$ibping <lid_dest>

You should see a stream of packets traveling to the destination host with transfer times and sequence numbers.

Configuring the Network

Before making any changes to network scripts stop the network service:

$service network stop

CentOS 6.7 supports IP over InfiniBand (IPoIB), meaning that the InfiniBand interfaces can be configured with IP addresses just like Ethernet interfaces. Configuration files can be found in the /etc/sysconfig/network-scripts directory, and have ib0, ib1, etc. in the file name.

By default, the script should have values similar to:

TYPE=infiniband
BOOTPROTO=dhcp
NAME=ib0
UUID=
NM_CONTROLLED=yes
DEVICE=ib0
ONBOOT=no

To configure a static IP address, change the ONBOOT value to “yes”, the BOOTPROTO value to “static”, and the NM_CONTROLLED value to “no”. Add the IP address, netmask, and router address. The modified script should look like:

TYPE=infiniband
BOOTPROTO=static
NAME=ib0
UUID=
NM_CONTROLLED=no
DEVICE=ib0
ONBOOT=yes
IPADDR=<address>
GATEWAY=<router address, usually first address of subnet>
NETMASK=<netmask>

Save changes to the script and start the network service:

$service network start

Run ifconfig again to make sure the InfiniBand interfaces have static IP addresses.
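
When several hosts need the same treatment, the edits above can be scripted. The following is a minimal sketch, not a definitive tool: the function names first_host_of_subnet and write_ifcfg_ib, the file name pattern ifcfg-<device>, and the 192.168.10.x addresses are assumptions made for illustration. It also omits the UUID line and overwrites any existing file, so back up the original first. The first function works out the "first address of subnet" suggested for GATEWAY as (IP AND NETMASK) + 1.

```shell
# Hypothetical helper: print the first host address of the subnet that
# contains IP, i.e. (IP AND NETMASK) + 1. The body runs in a subshell so
# the IFS change stays local. Assumes a dotted-quad netmask shorter than /31.
first_host_of_subnet() (
    IFS=.
    set -- $1 $2          # split both dotted quads into eight octets
    printf '%d.%d.%d.%d\n' \
        $(( $1 & $5 )) $(( $2 & $6 )) $(( $3 & $7 )) $(( ($4 & $8) + 1 ))
)

# Hypothetical helper: write a static configuration for one interface.
# Arguments: device, IP address, netmask, gateway, target directory
# (the directory defaults to /etc/sysconfig/network-scripts).
write_ifcfg_ib() {
    cat > "${5:-/etc/sysconfig/network-scripts}/ifcfg-$1" <<EOF
TYPE=infiniband
BOOTPROTO=static
NAME=$1
NM_CONTROLLED=no
DEVICE=$1
ONBOOT=yes
IPADDR=$2
GATEWAY=$4
NETMASK=$3
EOF
}

# Demonstration in a scratch directory, with made-up addresses.
demo_dir=$(mktemp -d)
gateway=$(first_host_of_subnet 192.168.10.57 255.255.255.0)
write_ifcfg_ib ib0 192.168.10.57 255.255.255.0 "$gateway" "$demo_dir"
cat "$demo_dir/ifcfg-ib0"
rm -rf "$demo_dir"
```

Writing to a scratch directory first makes it possible to review the generated file before copying it into place on the real system.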

How to Check Disk Space on Linux Machines

This article outlines how to check disk space on Linux and Unix-based machines. (“Memory” here means storage on the hard drive, not RAM.) I’ll demonstrate a couple of useful commands that report how much space is used and free on your machine’s drives.

df -h

df stands for disk free, and the -h flag prints sizes in a human-friendly format. The command reports the space used and available on each mounted filesystem, which makes it very useful for seeing the overall disk space left on the device.

df -h
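
For scripting, the human-friendly output is awkward to parse. A possible approach, sketched below, is to use the POSIX output format (df -P) and print only the filesystems at or above a given usage percentage. The helper name fs_over_threshold and the 90 percent limit are arbitrary choices for illustration, and the sketch assumes mount points without spaces.

```shell
# Hypothetical helper: read `df -P` output on stdin and print the mount
# point and usage of every filesystem at or above the given percentage.
fs_over_threshold() {
    awk -v limit="$1" '
        NR > 1 {                       # skip the header line
            use = $5                   # Capacity column, e.g. "95%"
            sub(/%$/, "", use)
            if (use + 0 >= limit + 0) print $6, use "%"
        }
    '
}

# List filesystems that are at least 90% full (prints nothing if none are).
df -P | fs_over_threshold 90
```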

du -hs

The opposite of disk free is disk usage. du stands for disk usage; the -h flag prints sizes in a human-friendly format and -s summarizes the result as a single total. Run without a path, du -hs reports the space used by the current directory and everything beneath it. It gives less of an overview than df, but sometimes, instead of a summary of the whole drive, we only need the disk space already used in one place.

du -hs
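
A common follow-up question is which entries inside a directory take up the most room. One way to answer it, shown here as a sketch, is to run du on each entry and sort the totals. The helper name largest_entries is hypothetical; sizes are printed in kilobytes (du -sk) rather than human-friendly units so that sort can order them numerically.

```shell
# Hypothetical helper: print the COUNT largest entries directly inside DIR,
# biggest first, with sizes in kilobytes.
# Arguments: DIR (default: current directory), COUNT (default: 5).
largest_entries() {
    du -sk "${1:-.}"/* 2>/dev/null | sort -rn | head -n "${2:-5}"
}

# Show the three largest entries in the current directory.
largest_entries . 3
```

Hidden files are not matched by the * pattern, so they are left out of this listing.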

Checking overall disk space across drives/partitions

Instead of checking the space on the main partition, you might want to look at the sizes of all drives and partitions. Here’s where Macs and Linux machines differ.

Macs
diskutil list

diskutil list checks for the size of drives connected to your Mac.

Linux
sudo fdisk -l

fdisk -l performs a check similar to diskutil list, but for Linux machines. You may be prompted for your password because the command requires root privileges, hence the sudo.
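
If the same script has to run on both kinds of machine, the choice between the two commands can be made from the kernel name reported by uname -s. This is only a sketch; the helper name disk_list_command is made up, and it handles just the two systems discussed here.

```shell
# Hypothetical helper: print the disk-listing command for a kernel name
# as reported by `uname -s`; fails for systems not covered here.
disk_list_command() {
    case "$1" in
        Darwin) echo "diskutil list" ;;
        Linux)  echo "sudo fdisk -l" ;;
        *)      return 1 ;;
    esac
}

# Show which command applies to the current machine.
disk_list_command "$(uname -s)" || echo "no disk-listing command known for this system"
```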
