I/O tensors are allocated in the `InitNetwork` function and never deallocated, so they effectively have an infinite lifetime. Since their dimensions are known at compile time, we should allocate and deallocate them explicitly.
As a first step, we should display them in the memory allocation visualization and raise an error if we exceed the memory capacity limit. Then, we should perform static memory allocation for them.
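A minimal sketch of the direction this could take, with hypothetical names and example shapes (not the project's actual API): I/O tensors whose dimensions are known at compile time are placed in static buffers, and the "above the memory capacity limit" error is raised at build time rather than at run time.

```c
/* Sketch only: names, shapes, and the capacity value are assumptions. */
#include <stdint.h>

#define MEM_CAPACITY_BYTES (512 * 1024)   /* assumed on-chip memory budget      */
#define INPUT_BYTES  (1 * 3 * 224 * 224)  /* example int8 input tensor size     */
#define OUTPUT_BYTES (1 * 1000)           /* example int8 output tensor size    */

/* Fail the build instead of silently exceeding the memory capacity limit. */
_Static_assert(INPUT_BYTES + OUTPUT_BYTES <= MEM_CAPACITY_BYTES,
               "I/O tensors exceed the memory capacity limit");

/* Statically placed I/O buffers: today these would be allocated in InitNetwork
 * and never freed; here their placement is decided at compile time instead. */
static int8_t network_input[INPUT_BYTES];
static int8_t network_output[OUTPUT_BYTES];
```

Under a scheme like this, the memory allocation visualization could show the I/O buffers alongside the intermediate tensors, since their sizes and placement are known before the network runs.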
This is especially an issue for Llama, where the KV cache is treated as an input and then as an output.
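One way to picture the Llama case (again a hypothetical layout, not the project's actual data structures): because the KV cache is consumed as an input and produced as an output of every decode step, a static allocator could let both views alias a single persistent region instead of keeping two never-freed copies.

```c
/* Sketch only: sizes and struct names are assumptions for illustration. */
#include <stdint.h>

#define KV_CACHE_BYTES (2 * 8 * 256 * 64)  /* example: K+V, 8 heads, 256 ctx, 64 dim, int8 */

/* Single statically placed KV cache region that persists across decode steps. */
static int8_t kv_cache[KV_CACHE_BYTES];

typedef struct {
    const int8_t *kv_in;   /* KV cache read by the current decode step    */
    int8_t       *kv_out;  /* KV cache written by the current decode step */
} DecodeStepIO;

/* Input and output views alias the same buffer, so the allocator only has to
 * account for KV_CACHE_BYTES once rather than as two separate I/O tensors. */
static const DecodeStepIO kStepIO = { kv_cache, kv_cache };
```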