Skip to content

Latest commit

 

History

History
 
 

libibverbs

Introduction
============

libibverbs is a library that allows programs to use RDMA "verbs" for
direct access to RDMA (currently InfiniBand and iWARP) hardware from
userspace.  For more information on RDMA verbs, see the InfiniBand
Architecture Specification vol. 1, especially chapter 11, and the RDMA
Consortium's RDMA Protocol Verbs Specification.

Using libibverbs
================

Device nodes
------------

The verbs library expects special character device files named
/dev/infiniband/uverbsN to be created.  When you load the kernel
modules, including both the low-level driver for your IB hardware as
well as the ib_uverbs module, you should see one or more uverbsN
entries in /sys/class/infiniband_verbs in addition to the
/dev/infiniband/uverbsN character device files.

To create the appropriate character device files automatically with
udev, a rule like

    KERNEL="uverbs*", NAME="infiniband/%k"

can be used.  This will create device nodes named

    /dev/infiniband/uverbs0

and so on.  Since the RDMA userspace verbs should be safe for use by
non-privileged users, you may want to add an appropriate MODE or GROUP
to your udev rule.

Permissions
-----------

To use IB verbs from userspace, a process must be able to access the
appropriate /dev/infiniband/uverbsN special device file.  You can
check the permissions on this file with the command

	ls -l /dev/infiniband/uverbs*

Make sure that the permissions on these files are such that the
user/group that your verbs program runs as can access the device file.

To use IB verbs from userspace, a process must also have permission to
tell the kernel to lock sufficient memory for all of your registered
memory regions as well as the memory used internally by IB resources
such as queue pairs (QPs) and completion queues (CQs).  To check your
resource limits, use the command

	ulimit -l

(or "limit memorylocked" for csh-like shells).

If you see a small number such as 32 (the units are KB) then you will
need to increase this limit.  This is usually done for ordinary users
via the file /etc/security/limits.conf.  More configuration may be
necessary if you are logging in via OpenSSH and your sshd is
configured to use privilege separation.

Valgrind support
----------------

When running applications that use libibverbs under the Valgrind
memory-checking debugger, Valgrind will falsely report "read from
uninitialized" for memory that was initialized by the kernel drivers.
Specifically, Valgrind cannot see when kernel drivers write to
userspace memory, so when the process reads from that memory, Valgrind
incorrectly assumes that the memory contents are uninitialized, and
therefore raises a warning.

libibverbs can be built with specific support for the Valgrind
memory-checking debugger by specifying the --with-valgrind command
line argument to configure.  This flag enables code in libibverbs to
tell Valgrind "this memory may look uninitialized, but it's really
OK," which therefore suppresses the incorrect "read from
uninitialized" warnings.  This code adds trivial overhead to the
critical performance path, so it is disabled by default.  The intent
is that production users can use a "normal" build of libibverbs and
developers can use the "valgrind debug" build by simply switching
their LD_LIBRARY_PATH environment variables.

Libibverbs needs some header files from Valgrind in order to compile
this support; it is important to use the header files from the same
version of Valgrind that will be used at run time.  You may need to
specify the directory where Valgrind's header files are installed as
an argument to --with-valgrind.  For example

	./configure --with-valgrind=/opt/valgrind

will make the libibverbs build look for valgrind headers in
/opt/valgrind/include

Reporting bugs
==============

Bugs should be reported to the OpenFabrics mailing list
<[email protected]>.  In your bug report, please include:

 * Information about your system:
   - Linux distribution and version
   - Linux kernel and version
   - InfiniBand/iWARP hardware and firmware version
   - ... any other relevant information

 * How to reproduce the bug.  Command line arguments for a libibverbs
   example program or source code that other developers can
   compile and run is most convenient.

 * If the bug is a crash, the exact output printed out when the crash
   occurred, including any kernel messages produced.

 * If a verbs call is mysteriously returning an error or failing, the
   output of "strace -ewrite -ewrite=all <command>".

Submitting patches
==================

Patches should also be submitted to the OpenFabrics mailing list
<[email protected]>.  Please use unified diff form (the -u
option to GNU diff), and include a good description of what your patch
does and why it should be applied.  If your patch fixes a bug, please
make sure to describe the bug and how your fix works.

Please include a change to the ChangeLog file (in standard GNU
changelog format) as part of your patch.

Make sure that your contribution can be licensed under the same
license as the original code you are patching, and that you have all
necessary permissions to release your work.

TODO
====

1.1 series
----------

The libibverbs API and ABI are frozen for all releases in the 1.1
series.  Methods were added to struct ibv_context to implement the
following features, so it should be possible to add them in a future
release in the 1.1 series:

 * Memory window (MW) support.

 * Implement the reregister memory region (MR) verb.  We will add an
   extension to the IB spec to allow the application to indicate that
   the region is only being extended, and that operations in progress
   should _not_ fail (contrary to the IB spec, which states that
   reregister must be implemented so that it behaves equivalently to a
   deregister followed by a register).

Other possibilities
-------------------

There are no plans to implement the following features, which would be
needed for completeness but don't seem particularly useful.  However,
if there is demand from application developers or an implementation is
contributed, then the feature may be added.

 * Implement the query address handle (AH) verb.
 * Implement the query memory region (MR) verb.