Softpanorama


InfiniBand Subnet Manager


Submitted by Peter Hartman (... on

The InfiniBand subnet manager (OpenSM) assigns a Local Identifier (LID) to each port connected to the InfiniBand fabric and builds a routing table based on the assigned LIDs.

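There are two types of subnet managers: hardware-based subnet managers, typically embedded in a managed InfiniBand switch, and software subnet managers such as OpenSM, which run on a host attached to the fabric.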

A typical InfiniBand installation using the OFED package will run the OpenSM subnet manager at system start up after the OpenIB drivers are loaded. This automatic OpenSM is resident in memory, and sweeps the InfiniBand fabric approximately every 5 seconds for new InfiniBand adapters to add to the subnet routing tables. This usage will be sufficient for most installations, and can be controlled using the following commands:

/etc/init.d/opensmd start
/etc/init.d/opensmd stop
/etc/init.d/opensmd restart
/etc/init.d/opensmd status
There are several instances where the default usage will not be sufficient, however. If the head node is used as a compute node, and resources are at a premium, the OpenSM subnet manager can be set to run once, configure the LIDs and routing tables, and then exit:
opensm -o
For InfiniBand adapters with two ports, a second instance of the subnet manager must be active to enable a subnet on the second port. To begin, start the subnet manager as above and check the port state with ibstat:
# service opensmd start
Starting IB Subnet Manager.                                [  OK  ]
# ibstat
CA 'mlx4_0'
        CA type: MT4099
        Number of ports: 1
        Firmware version: 2.30.8000
        Hardware version: 1
        Node GUID: 0x0002c90300b78240
        System image GUID: 0x0002c90300b78243
        Port 1:
                State: Active
                Physical state: LinkUp
                Rate: 40 (FDR10)
                Base lid: 1
                LMC: 0
                SM lid: 1
                Capability mask: 0x0251486a
                Port GUID: 0x0002c90300b78241
                Link layer: InfiniBand
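
A port that reports State: Active together with a non-zero SM lid has been configured by a subnet manager. As a minimal sketch, assuming the ibstat layout shown above (the function name ibstat_ports is hypothetical), the relevant fields can be pulled out for use in a script:

```shell
#!/bin/sh
# Hypothetical helper: reads `ibstat` output on stdin and prints one line per
# port in the form "<port> <state> <sm_lid>".
# Assumes the field names and layout shown in the listing above.
ibstat_ports() {
    awk '
        /^[ \t]*Port [0-9]+:/ { port = $2; sub(":", "", port) }  # "Port 1:" -> 1
        /^[ \t]*State:/       { state = $2 }                     # not "Physical state:"
        /^[ \t]*SM lid:/      { print port, state, $3 }          # last field we need
    '
}
```

On a host with the InfiniBand tools installed this would be used as `ibstat | ibstat_ports`; for the listing above it would print `1 Active 1`.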

If the adapter has two ports, we need to discover the GUID of the second port:

ibstat -p
This command will output two numbers, one for each port. Use the second number to start up a new OpenSM instance in daemon mode:
opensm -g <0xguid number> -B
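
The two steps above can be combined in a small script. This is only a sketch: the helper name nth_port_guid is hypothetical, and it assumes that ibstat -p prints one port GUID per line in port order, as described above.

```shell
#!/bin/sh
# Hypothetical helper: prints the Nth line (default: the 2nd) of `ibstat -p`
# output read from stdin, i.e. the GUID of the Nth port.
# Returns a non-zero status if there is no such line.
nth_port_guid() {
    awk -v n="${1:-2}" 'NR == n { print; found = 1 } END { exit !found }'
}

# Intended use on a host with two InfiniBand ports (not run here):
#   guid=$(ibstat -p | nth_port_guid 2) && opensm -g "$guid" -B
```

Because the helper fails when the second GUID is missing, the `&&` keeps opensm from being started with an empty argument on a single-port adapter.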
There may also be an instance where the head node does not have InfiniBand hardware, but the compute nodes do. In this case, provided a hardware subnet manager is not used, one of the compute nodes must act as the subnet manager.

If a subnet manager is already running on the cluster, either a hardware-based version or an OpenSM instance, then running OpenSM on another node will cause the new instance to be put in a STANDBY state. In this state, the instance monitors the existing subnet manager and takes over subnet manager duties once a failure has been detected.
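
One way to see which state a subnet manager is in is the sminfo utility from the infiniband-diags package. The sketch below assumes that sminfo prints its usual one-line summary ending in a state name such as SMINFO_MASTER or SMINFO_STANDBY; the helper name sm_state is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads `sminfo` output on stdin and prints the subnet
# manager state name, taken from the last field of the summary line.
# Assumes the one-line "sminfo: sm lid ... state <n> <NAME>" format.
sm_state() {
    awk '/^sminfo:/ { print $NF }'
}

# Intended use on a host attached to the fabric (not run here):
#   sminfo | sm_state
```

By default sminfo queries the subnet manager that the local port points to, so checking a standby instance would require passing that instance's LID to sminfo.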
 



13.7. Testing Early InfiniBand RDMA operation Red Hat Enterprise Linux 7 - Red Hat Customer Portal

Once the rdma service is enabled, the opensm service (if needed) is enabled, and the proper user-space library for the specific hardware has been installed, user-space RDMA operation should be possible. Simple test programs from the libibverbs-utils package help to determine whether RDMA operations are working properly. The ibv_devices program shows which devices are present in the system, and the ibv_devinfo command gives detailed information about each device. For example:
~]$ ibv_devices
    device                 node GUID
    ------              ----------------
    mlx4_0              0002c903003178f0
    mlx4_1              f4521403007bcba0
~]$ ibv_devinfo -d mlx4_1
hca_id: mlx4_1
        transport:                      InfiniBand (0)
        fw_ver:                         2.30.8000
        node_guid:                      f452:1403:007b:cba0
        sys_image_guid:                 f452:1403:007b:cba3
        vendor_id:                      0x02c9
        vendor_part_id:                 4099
        hw_ver:                         0x0
        board_id:                       MT_1090120019
        phys_port_cnt:                  2
                port:   1
                        state:                  PORT_ACTIVE (4)
                        max_mtu:                4096 (5)
                        active_mtu:             2048 (4)
                        sm_lid:                 2
                        port_lid:               2
                        port_lmc:               0x01
                        link_layer:             InfiniBand

                port:   2
                        state:                  PORT_ACTIVE (4)
                        max_mtu:                4096 (5)
                        active_mtu:             4096 (5)
                        sm_lid:                 0
                        port_lid:               0
                        port_lmc:               0x00
                        link_layer:             Ethernet
~]$ ibstat mlx4_1
CA 'mlx4_1'
        CA type: MT4099
        Number of ports: 2
        Firmware version: 2.30.8000
        Hardware version: 0
        Node GUID: 0xf4521403007bcba0
        System image GUID: 0xf4521403007bcba3
        Port 1:
                State: Active
                Physical state: LinkUp
                Rate: 56
                Base lid: 2
                LMC: 1
                SM lid: 2
                Capability mask: 0x0251486a
                Port GUID: 0xf4521403007bcba1
                Link layer: InfiniBand
        Port 2:
                State: Active
                Physical state: LinkUp
                Rate: 40
                Base lid: 0
                LMC: 0
                SM lid: 0
                Capability mask: 0x04010000
                Port GUID: 0xf65214fffe7bcba2
                Link layer: Ethernet
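
The listings above print the same node GUID in two notations: ibv_devinfo shows f452:1403:007b:cba0, while ibstat shows 0xf4521403007bcba0. When comparing the output of the two tools in a script, it helps to normalize one form to the other. A minimal sketch (the function name guid_to_hex is hypothetical):

```shell
#!/bin/sh
# Hypothetical helper: converts a GUID in ibv_devinfo notation
# (f452:1403:007b:cba0) to ibstat notation (0xf4521403007bcba0)
# by dropping the colons and adding the 0x prefix.
guid_to_hex() {
    printf '0x%s\n' "$(printf '%s' "$1" | tr -d ':')"
}
```

For example, `guid_to_hex f452:1403:007b:cba0` prints `0xf4521403007bcba0`, which matches the Node GUID line of the ibstat output.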
The ibv_devinfo and ibstat commands output slightly different information: the port MTU appears in ibv_devinfo output but not in ibstat output, and the Port GUID appears in ibstat output but not in ibv_devinfo output. A few things are also named differently; for example, the Base local identifier (LID) in ibstat output is the same as the port_lid in ibv_devinfo output.

Simple ping programs, such as ibping from the infiniband-diags package, can be used to test RDMA connectivity. The ibping program uses a client-server model: first start an ibping server on one machine, then run ibping as a client on another machine and tell it to connect to the ibping server. Because the goal is to test the base RDMA capability, the server must be specified with an RDMA-specific address resolution method instead of an IP address.

On the server machine, use the ibv_devinfo and ibstat commands to print the port_lid (or Base lid) and the Port GUID of the port to be tested (for port 1 of the above interface, the port_lid/Base lid is 2 and the Port GUID is 0xf4521403007bcba1). Then start ibping with the options needed to bind to the card and port to be tested and to run in server mode. The available options are listed by passing -? or --help to ibping; in this instance we need -S or --Server for server mode, and -C or --Ca and -P or --Port to bind to a specific card and port.

Note: port in this instance does not denote a network port number, but the physical port number on a multi-port card. Testing connectivity to the RDMA fabric through, for example, the second port of a multi-port card requires telling ibping to bind to port 2 on the card. With a single-port card, or when testing the first port of a card, this option is not needed. For example:
~]$ ibping -S -C mlx4_1 -P 1
Then change to the client machine and run ibping. Make note of either the port GUID of the port the server ibping program is bound to, or the local identifier (LID) of the port the server ibping program is bound to. Also, take note which card and port in the client machine is physically connected to the same network as the card and port that was bound to on the server. For example, if the second port of the first card on the server was bound to, and that port is connected to a secondary RDMA fabric, then on the client specify whichever card and port are necessary to also be connected to that secondary fabric. Once these things are known, run the ibping program as a client and connect to the server using either the port LID or GUID that was collected on the server as the address to connect to. For example:
~]$ ibping -c 10000 -f -C mlx4_0 -P 1 -L 2
--- rdma-host.example.com.(none) (Lid 2) ibping statistics ---
10000 packets transmitted, 10000 received, 0% packet loss, time 816 ms
rtt min/avg/max = 0.032/0.081/0.446 ms
or
~]$ ibping -c 10000 -f -C mlx4_0 -P 1 -G 0xf4521403007bcba1
--- rdma-host.example.com.(none) (Lid 2) ibping statistics ---
10000 packets transmitted, 10000 received, 0% packet loss, time 769 ms
rtt min/avg/max = 0.027/0.076/0.278 ms
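
When ibping is run from a script, the summary line can be checked automatically instead of by eye. The following is a minimal sketch that assumes the statistics format shown above; the function name ibping_ok is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads ibping output on stdin and succeeds (status 0)
# only if a summary line was seen and it reports 0% packet loss.
# Assumes the "... N% packet loss ..." summary format shown above.
ibping_ok() {
    awk 'BEGIN { loss = -1 }
         /packets transmitted/ {
             for (i = 1; i <= NF; i++)
                 if ($i ~ /^[0-9]+%$/) loss = $i + 0   # "0%" -> 0
         }
         END { exit !(loss == 0) }'
}

# Intended use on the client machine (not run here):
#   ibping -c 10000 -f -C mlx4_0 -P 1 -L 2 | ibping_ok && echo "RDMA path OK"
```

The helper treats missing statistics as a failure, so an ibping run that aborts before printing its summary is not mistaken for a clean result.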
This outcome verifies that end to end RDMA communications are working for user space applications. The following error may be encountered:
~]$ ibv_devinfo
libibverbs: Warning: no userspace device-specific driver found for /sys/class/infiniband_verbs/uverbs0
No IB devices found
This error indicates that the necessary user-space library is not installed. The administrator will need to install one of the user-space libraries (as appropriate for their hardware) listed in Section 13.4, "InfiniBand and RDMA related software packages". On rare occasions, this can also happen if a user installs the wrong architecture of the driver or of libibverbs. For example, if libibverbs is of arch x86_64 and libmlx4 is installed but is of type i686, then this error can result.
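
The architecture mismatch described above can be spotted by comparing the architectures of the installed packages. The sketch below is an illustration only: the helper name arch_mismatch is hypothetical, and the package names in the usage comment are the ones from the example above and depend on the hardware in use.

```shell
#!/bin/sh
# Hypothetical helper: reads "name arch" pairs on stdin (one per line) and
# prints the names of packages whose architecture differs from that of the
# first line. Prints nothing when all architectures agree.
arch_mismatch() {
    awk 'NR == 1 { want = $2; next } $2 != want { print $1 }'
}

# Intended use on an RPM-based system (not run here):
#   rpm -q --qf '%{NAME} %{ARCH}\n' libibverbs libmlx4 | arch_mismatch
```

Note that rpm reports a package that is not installed with a plain message instead of a "name arch" pair, so unexpected output from this check should be read as a hint to inspect the rpm output directly.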

Note

Many sample applications prefer to use host names or addresses instead of LIDs to open communication between the server and client. For those applications, it is necessary to set up IPoIB before attempting to test end-to-end RDMA communications. The ibping application is unusual in that it will accept simple LIDs as a form of addressing, and this allows it to be a simple test that eliminates possible problems with IPoIB addressing from the test scenario and therefore gives us a more isolated view of whether or not simple RDMA communications are working.

Recommended Links


Intel® Cluster Ready

13.7. Testing Early InfiniBand RDMA operation Red Hat Enterprise Linux 7 - Red Hat Customer Portal





Copyright © 1996-2020 by Softpanorama Society. www.softpanorama.org was initially created as a service to the (now defunct) UN Sustainable Development Networking Programme (SDNP) without any remuneration. This document is an industrial compilation designed and created exclusively for educational use and is distributed under the Softpanorama Content License. Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.
