Skip to content

Zhaojp-Frank/urdma

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

This repository contains the DPDK verbs source along with some demo
applications.

Requirements
------------

 - Linux kernel >= 3.17.8
 - gcc >= 4.9 or a compiler which supports the C11 standard
 - libibverbs >= 1.2.0 (versions less than 1.1.8 do not support verbs extensions
   and version 1.1.8 has a broken extensions ABI)
 - librdmacm >= 1.0.21
 - DPDK >= 16.07 built as shared libraries (CONFIG_RTE_BUILD_SHARED_LIB=y),
   and the rte_kni kernel module built for the currently running kernel
 - libnl-3 and libnl-route-3
 - json-c
 - uthash (included in Fedora, Ubuntu, openSUSE, and EPEL for RHEL
   repositories)

Setup instructions
------------------

I am assuming a default installation of Ubuntu 16.10.  Other modern
distributions should also work, but you will need to install DPDK
yourself in order to build the required rte_kni kernel modules which
most other distributions do not ship in their repositories.

To build this package, RTE_SDK and RTE_TARGET need to be exported into
the environment.  Using the packages in the Ubuntu 16.10 repositories,
you can run the following:

    $ source /usr/share/dpdk/dpdk-sdk-env.sh

Or for a manual DPDK installation:

    $ export RTE_SDK=${prefix}/share/dpdk
    $ export RTE_TARGET=x86_64-native-linuxapp-gcc

If you are pulling this from a fresh git clone, first run:

    $ autoreconf -i

Then this follows the normal autotools-style build:

    $ ./configure --sysconfdir /etc
    $ make
    $ sudo make install

Note that sysconfdir must match that of your libibverbs installation, in
order for the verbs library to find the urdma driver.

The configure script will look for your kernel source directory in the
typical location by default (/lib/modules/`uname -r`/source).  You can
set the KERNELDIR environment variable to specify a different location;
for example, if you are building against a different kernel version than
what it installed locally.

To run an application with this driver, the KNI and urdma modules must be loaded:

    $ sudo modprobe rte_kni
    $ sudo modprobe urdma

You will need to create a file ${sysconfdir}/rdma/urdma.json that looks
like this, with appropriate values substituted in:

    { "ports": { "ipv4_address": "10.2.0.100" },
      "sock_name": "/run/urdma.sock",
      "eal_args": { "log-level": 7 }
    }

Finally, the urdmad service must be running:

    $ sudo systemctl start urdmad

This will cause devices to appear in your "ip link" output, and cause
uverbs devices to appear in /sys/class/infiniband_verbs.

Known Issues
------------

 - Shared receive queues (SRQs) are currently not supported.  In order
   to run openmpi over urdma, you will need to specify the following
   command line options to prevent openmpi from using shared receive
   queues, and to disable a warning that it doesn't know about the
   device vendor ID:

       $ mpirun --mca btl_openib_warn_no_device_params_found 0 \
                --mca btl_openib_receive_queues P,65536,256,192,128 \
		${mpi_app} ${mpi_app_args}...

 - There is a potential race condition with completion channels, where a
   completion event can get lost, and thus a thread waiting on
   ibv_get_cq_event() will never wake up, leading to a deadlock.  A cause has
   not been identified, but the issue has not been reproduced with the "extra"
   lock added around the rte_ring operations in do_poll_cq() and
   finish_post_cqe().

 - There is the possibility of a hang in the kernel module if the user process
   is killed while between the read() and write() calls on event_fd in
   poll_conn_state().  This is because rdma_destroy_id() in the kernel will
   block until the connection attempt completes, but itself prevents our
   event_fd from being closed which would unblock it.

 - The progress thread will use 100% CPU since it must busy-poll on the KNI
   interfaces (there is no way to sleep until the process gets an event).

 - urdma follows the RFC 5040 ordering rules strictly, meaning that it
   can place data segments out of order if it receives them out of
   order. This in turn means that if two RDMA WRITE requests are made
   on overlapping buffers, urdma may place a data segment from the first
   *after* the corresponding data segment from the second, thus leading
   to a torn write from the perspective of the application. Thus
   applications must not post multiple transfer requests on overlapping
   buffers simultaneously if they depend on the data ordering.

About

Verbs on DPDK

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • C 89.0%
  • Python 5.0%
  • Shell 2.7%
  • Makefile 1.9%
  • M4 1.4%