# Tiny FlashAttention

WIP

A tiny flash attention implementation in Python, Rust, CUDA and C, for learning purposes.

- python version
    - naive pure python code
- triton version
    - triton code
- [c version]
    - TODO: naive pure c code
    - naive cuda code standalone
    - naive cuda code python binding
    - cutlass cuda code
- [rust version]
- cutlass cute flash attention in action
    - en tutorial
    - zh tutorial
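For orientation, here is a minimal sketch of what a naive pure Python flash attention can look like. It is not the code from this repository: the function names `naive_attention` and `tiled_attention` and the `block_size` parameter are hypothetical, chosen only for illustration. It assumes plain lists of floats, a single head, no masking, and it tiles only over keys/values (one query row at a time), whereas real kernels also tile over queries.

```python
import math


def naive_attention(q, k, v):
    """Reference: softmax(q @ k^T / sqrt(d)) @ v, materializing every score row."""
    d = len(q[0])
    scale = 1.0 / math.sqrt(d)
    out = []
    for qi in q:
        scores = [scale * sum(a * b for a, b in zip(qi, kj)) for kj in k]
        m = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append([
            sum(w * vj[c] for w, vj in zip(weights, v)) / total
            for c in range(len(v[0]))
        ])
    return out


def tiled_attention(q, k, v, block_size=2):
    """Same result, but K/V are visited block by block with an online softmax."""
    d = len(q[0])
    dv = len(v[0])
    scale = 1.0 / math.sqrt(d)
    out = []
    for qi in q:
        m = float("-inf")  # running max of the scores seen so far
        l = 0.0            # running softmax denominator
        acc = [0.0] * dv   # running unnormalized output row
        for start in range(0, len(k), block_size):
            k_blk = k[start:start + block_size]
            v_blk = v[start:start + block_size]
            scores = [scale * sum(a * b for a, b in zip(qi, kj)) for kj in k_blk]
            m_new = max(m, max(scores))
            # Rescale the old state to the new max before adding this block.
            correction = math.exp(m - m_new)
            weights = [math.exp(s - m_new) for s in scores]
            l = l * correction + sum(weights)
            acc = [
                a * correction + sum(w * vj[c] for w, vj in zip(weights, v_blk))
                for c, a in enumerate(acc)
            ]
            m = m_new
        # Normalize once at the end, after all blocks have been folded in.
        out.append([a / l for a in acc])
    return out
```

The point of the tiled variant is that it never holds a full row of attention scores: it keeps only a running max, a running denominator and a running output, which is the idea the Triton and CUDA versions build on.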