Showing 2 changed files with 172 additions and 1 deletion.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,172 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"# ACCL Collectives (Emulator/Simulator)\n", | ||
"In a system of multiple ACCL-enabled FPGAs, we can execute MPI-like collectives (scatter, gather, broadcast, reductions, etc.). This notebook illustrates how to initialize the ACCL instances and run collectives. Usually, each ACCL instance runs in a separate process on a distinct compute node in a network, but for demonstration purposes we use multithreading in a single process to create and operate multiple ACCL instances." | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Setting up the ACCL ranks descriptor\n", | ||
"Each ACCL instance requires a list of dictionaries describing the ranks involved in communication. Each dictionary describes one rank: its IP address and port, its session ID, and the maximum size of buffers the rank can receive (`max_segment_size`)." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"START_PORT = 5500\n", | ||
"WORLD_SIZE = 4\n", | ||
"RXBUF_SIZE = 16*1024\n", | ||
"\n", | ||
"ranks = []\n", | ||
"for i in range(WORLD_SIZE):\n", | ||
"    ranks.append({\"ip\": \"127.0.0.1\", \"port\": START_PORT+WORLD_SIZE+i, \"session_id\": i, \"max_segment_size\": RXBUF_SIZE})" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Initializing ACCL emulator/simulator instances\n", | ||
"We are now ready to initialize our ACCL instances. We assume that a simulator or emulator session has been started with the appropriate number of ranks (see ACCL documentation). Our ACCL instances will connect to the simulator or emulator." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from pyaccl import accl\n", | ||
"\n", | ||
"accl_instances = []\n", | ||
"for i in range(WORLD_SIZE):\n", | ||
"    accl_instances.append(accl(ranks, i, bufsize=RXBUF_SIZE, protocol=\"TCP\",\n", | ||
"                               sim_sock=\"tcp://localhost:\"+str(START_PORT+i)))" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Creating ACCL buffers\n", | ||
"With the ACCL instances ready, we can allocate buffers in each instance's memory. For each instance we allocate one source buffer and one result buffer, and fill the source buffer with floating-point data." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"COUNT = 1000\n", | ||
"\n", | ||
"op0_buffers = []\n", | ||
"op1_buffers = []\n", | ||
"res_buffers = []\n", | ||
"for i in range(WORLD_SIZE):\n", | ||
"    op0_buffers.append(accl_instances[i].allocate((COUNT,)))\n", | ||
"    res_buffers.append(accl_instances[i].allocate((COUNT,)))\n", | ||
"    op0_buffers[i][:] = [1.0*j for j in range(COUNT)]" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Run an all-reduce collective\n", | ||
"We are now ready to execute collectives. Since collectives require communication between the ACCL instances, we must start the collective on every instance in parallel, here using one thread per instance. Each thread executes an all-reduce sum collective." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"import threading\n", | ||
"from pyaccl import ACCLReduceFunctions\n", | ||
"import numpy as np\n", | ||
"\n", | ||
"def allreduce(n):\n", | ||
"    accl_instances[n].allreduce(op0_buffers[n], res_buffers[n], COUNT, ACCLReduceFunctions.SUM)\n", | ||
"\n", | ||
"threads = []\n", | ||
"for i in range(WORLD_SIZE):\n", | ||
"    threads.append(threading.Thread(target=allreduce, args=(i,)))\n", | ||
"    threads[i].start()" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Check results\n", | ||
"All-reduce should leave, in every result buffer, the element-wise sum of the input buffers of all ACCL instances involved in the collective. We compare each all-reduce output with the expected output, element by element, to confirm that this is the case." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"for i in range(WORLD_SIZE):\n", | ||
"    threads[i].join()\n", | ||
"    assert np.isclose(res_buffers[i], sum(op0_buffers)).all()" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## De-Initialize ACCL instances\n", | ||
"The `deinit()` function clears all internal data structures in the ACCL instance." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"for i in range(WORLD_SIZE):\n", | ||
"    accl_instances[i].deinit()" | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "Python 3", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.8.10" | ||
}, | ||
"vscode": { | ||
"interpreter": { | ||
"hash": "916dbcbb3f70747c44a77c7bcd40155683ae19c65e1c03b4aa3499c5328201f1" | ||
} | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 2 | ||
} |
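
The all-reduce and result check in the notebook above can be tried without pyaccl or a running emulator. The following is a minimal sketch in plain Python, not ACCL code: `fake_allreduce_sum`, `sources`, and `results` are hypothetical names invented for this illustration, and the barrier only stands in for the fact that a collective needs every rank to take part. It assumes, as in the notebook, that every rank starts from the same ramp of values, so each result element should equal `WORLD_SIZE` times the source element.

```python
import threading

WORLD_SIZE = 4
COUNT = 8  # kept small for illustration; the notebook uses 1000

# Every rank holds the same ramp 0.0, 1.0, 2.0, ... as its source buffer.
sources = [[1.0 * j for j in range(COUNT)] for _ in range(WORLD_SIZE)]
results = [None] * WORLD_SIZE

# All ranks must arrive before any rank reads the other ranks' inputs.
barrier = threading.Barrier(WORLD_SIZE)

def fake_allreduce_sum(rank):
    """Hypothetical stand-in for an all-reduce (SUM) call on one rank."""
    barrier.wait()
    # Element-wise sum over the source buffers of all ranks.
    results[rank] = [sum(column) for column in zip(*sources)]

# One thread per rank, started together and then joined.
threads = [threading.Thread(target=fake_allreduce_sum, args=(r,))
           for r in range(WORLD_SIZE)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every rank should end up with the same element-wise sum.
expected = [WORLD_SIZE * 1.0 * j for j in range(COUNT)]
assert all(res == expected for res in results)
```

If only some of the threads were started, the barrier would block indefinitely, which mirrors why the notebook launches the collective on all instances before joining any of them.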
This file was deleted.