Skip to content

Commit

Permalink
Collectives notebook
Browse files Browse the repository at this point in the history
  • Loading branch information
quetric committed Jul 20, 2022
1 parent 43730a1 commit d3f4f3b
Show file tree
Hide file tree
Showing 2 changed files with 172 additions and 1 deletion.
172 changes: 172 additions & 0 deletions src/pyaccl/notebooks/collectives.ipynb
Original file line number Diff line number Diff line change
@@ -0,0 +1,172 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# ACCL Collectives (Emulator/Simulator)\n",
"In a system of more than one ACCL-enabled FPGAs, we can execute MPI-like collectives (scatter, gather, broadcast, reductions, etc). This notebook illustrates how to initialize the ACCL instances and run collectives. Usually, each ACCL instance runs in a separate process on a distinct compute node in a network, but for purposes of demonstration, we utilize multithreading in a single process to create and operate multiple ACCL instances"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setting up the ACCL ranks descriptor\n",
"Each ACCL instance requires a dictionary describing the ranks involved in communication. This dictionary describes, for each rank, the IP address and port of the rank, the session ID, and the maximum size of buffers the rank can receive."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"START_PORT = 5500\n",
"WORLD_SIZE = 4\n",
"RXBUF_SIZE = 16*1024\n",
"\n",
"ranks = []\n",
"for i in range(WORLD_SIZE):\n",
" ranks.append({\"ip\": \"127.0.0.1\", \"port\": START_PORT+WORLD_SIZE+i, \"session_id\":i, \"max_segment_size\": RXBUF_SIZE})"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Initializing ACCL emulator/simulator instances\n",
"We are now ready to initialize our ACCL instances. We assume that a simulator or emulator session has been started with the appropriate number of ranks (see ACCL documentation). Our ACCL instances will connect to the simulator or emulator."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from pyaccl import accl\n",
"\n",
"accl_instances = []\n",
"for i in range(WORLD_SIZE):\n",
" accl_instances.append(accl(ranks, i, bufsize=RXBUF_SIZE, protocol=\"TCP\",\n",
" sim_sock=\"tcp://localhost:\"+str(START_PORT+i) ))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Creating ACCL buffers\n",
"With the ACCL instances ready, we can allocate buffers in each of the instances' memories. We allocate one source buffer and one result buffer, and paint the source with floating point data."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"COUNT = 1000\n",
"\n",
"op0_buffers = []\n",
"op1_buffers = []\n",
"res_buffers = []\n",
"for i in range(WORLD_SIZE): \n",
" op0_buffers.append(accl_instances[i].allocate((COUNT,)))\n",
" res_buffers.append(accl_instances[i].allocate((COUNT,)))\n",
" op0_buffers[i][:] = [1.0*i for i in range(COUNT)]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Run an all-reduce collective\n",
"We are now ready to execute collectives. Since collectives require communication between the ACCL instances, we must start the collectives in each of the instances in parallel, utilizing threads. Each thread executes an all-reduce sum collective."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import threading\n",
"from pyaccl import ACCLReduceFunctions\n",
"import numpy as np\n",
"\n",
"def allreduce(n):\n",
" accl_instances[n].allreduce(op0_buffers[n], res_buffers[n], COUNT, ACCLReduceFunctions.SUM)\n",
"\n",
"threads = []\n",
"for i in range(WORLD_SIZE):\n",
" threads.append(threading.Thread(target=allreduce, args=(i,)))\n",
" threads[i].start()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Check results\n",
"All-reduce should produce in each of the result buffers the sum of all the input buffers from each of the ACCL instances involved in the collective. We can compare all-reduce outputs with the expected outputs, element by element, to make sure this is the case."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"for i in range(WORLD_SIZE):\n",
" threads[i].join()\n",
" assert np.isclose(res_buffers[i], sum(op0_buffers)).all()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## De-Initialize ACCL instances\n",
"The `deinit()` function clears all internal data structures in the ACCL instance."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"for i in range(WORLD_SIZE):\n",
" accl_instances[i].deinit()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.10"
},
"vscode": {
"interpreter": {
"hash": "916dbcbb3f70747c44a77c7bcd40155683ae19c65e1c03b4aa3499c5328201f1"
}
}
},
"nbformat": 4,
"nbformat_minor": 2
}
1 change: 0 additions & 1 deletion src/pyaccl/notebooks/multirank.ipynb

This file was deleted.

0 comments on commit d3f4f3b

Please sign in to comment.