Update README #137

Open
wants to merge 13 commits into
base: sycl-develop
Apply suggestions from code review
Co-authored-by: Alejandro Acosta <[email protected]>
AD2605 and aacostadiaz authored Oct 16, 2024
commit abf077899c5f2b97c79f1ae7e9053f26b08eb27d
2 changes: 1 addition & 1 deletion README.md
@@ -42,7 +42,7 @@ and improves code composability and readability. More documentation specific to
In addition to GEMMs, CUTLASS implements high-performance convolution via the implicit GEMM algorithm. Implicit GEMM is the formulation of a convolution operation as a GEMM thereby taking advantage of CUTLASS's modular GEMM pipeline. This allows CUTLASS to build convolutions by reusing highly-optimized GEMM components.

## CUTLASS with SYCL
-CUTLASS 3.0 API now also supports SYCL, and can run on Nvidia(upto the Ampere architecture) and Intel Xe Core architecture GPUs using the SYCL backend using the Intel open source `DPC++` compiler.
+CUTLASS 3.0 API now also supports SYCL, and can run on Nvidia(upto the Ampere architecture) and Intel PVC GPUs using the SYCL backend using the Intel open source `DPC++` compiler.
The support is currently limited to GEMMs only. See [Quick Start Guide](./media/docs/build/building_with_sycl_support.md) on how to build and run
examples using the SYCL backend.

3 changes: 1 addition & 2 deletions media/docs/build/building_with_sycl_support.md
@@ -68,11 +68,10 @@ CUTLASS Examples <br>
* Example 14
* We also provide various SYCL examples for the Intel Data Center Max range of GPUs

-## SYCL Supported Architectures and APIs
+## SYCL Supported Architectures
At the time of writing, the SYCL backend supports all Nvidia architectures till Ampere, and the
Intel Data Center Max series of GPUs is supported.

-We support the `CollectiveMMA` and the collective builder APIs for the same.

# References

3 changes: 2 additions & 1 deletion media/docs/quickstart.md
@@ -36,7 +36,8 @@ $ mkdir build && cd build

$ cmake -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_C_COMPILER=clang -DCUTLASS_ENABLE_SYCL=ON -DDPCPP_SYCL_TARGET=nvptx64-nvidia-cuda -DDPCPP_SYCL_ARCH=sm_80 .. # compiles for the NVIDIA Ampere GPU architecture

-$ cmake -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_C_COMPILER=clang -DCUTLASS_ENABLE_SYCL=ON -DDPCPP_SYCL_TARGET=intel_gpu_pvc .. # compiles for the Intel Xe Core Architecture
+# compiles for the Intel PVC Architecture
+cmake -DCUTLASS_ENABLE_SYCL=ON -DDPCPP_SYCL_TARGET=intel_gpu_pvc ..
```
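The configure commands above differ only in the flags that select the target. As a rough sketch, they could be wrapped in a small shell helper that prints the matching command for a chosen backend. The function name `sycl_configure_cmd` and the `nvidia`/`intel` keywords are hypothetical and not part of CUTLASS. The flags are copied verbatim from the commands above, with the Intel variant as it reads after this commit.

```shell
#!/bin/sh
# Hypothetical helper (not part of CUTLASS): prints the configure command
# for a backend keyword. All cmake flags are copied from the quickstart
# commands; only the function name and keywords are assumptions.
sycl_configure_cmd() {
  case "$1" in
    nvidia)
      # NVIDIA Ampere (sm_80) through the SYCL CUDA target
      echo "cmake -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_C_COMPILER=clang -DCUTLASS_ENABLE_SYCL=ON -DDPCPP_SYCL_TARGET=nvptx64-nvidia-cuda -DDPCPP_SYCL_ARCH=sm_80 .."
      ;;
    intel)
      # Intel PVC target
      echo "cmake -DCUTLASS_ENABLE_SYCL=ON -DDPCPP_SYCL_TARGET=intel_gpu_pvc .."
      ;;
    *)
      # Unknown keyword: report usage on stderr and fail
      echo "usage: sycl_configure_cmd nvidia|intel" >&2
      return 1
      ;;
  esac
}
```

The helper only prints the command, so it can be inspected first and then run from the `build` directory with, for instance, `eval "$(sycl_configure_cmd intel)"`.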
A complete example can be as follows (running on the Intel Data Center Max 1100) -
