Open
Description
🚀 The feature, motivation and pitch
Certain backends, such as QNN and Core ML, use static attention, which requires special handling in the runner code. More generally, different algorithms can introduce minor differences in a model's IO that the runner class cannot easily handle, so people end up forking the runner for every new model.
An IOManager interface would simplify the runner code by abstracting away details that really boil down to state management, letting much of the remaining runner boilerplate be shared.
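To make the pitch concrete, here is a minimal sketch of what such an interface could look like. Everything in it is an assumption for illustration: the method names (`prepare_inputs`, `update_state`), the toy models, and the convention that the first model output is the next token are all hypothetical, and Python is used only for brevity. The point is that one `generate` loop serves both a model that manages its own state and a static-attention style model that expects its cache and mask as explicit inputs.

```python
from abc import ABC, abstractmethod
from typing import Callable, List, Sequence

# Hypothetical stand-in for a backend module: takes a list of inputs and
# returns a list of outputs, where outputs[0] is the next token id.
ModelFn = Callable[[List[object]], List[object]]


class IOManager(ABC):
    """Hypothetical interface that owns all model-specific IO state."""

    @abstractmethod
    def prepare_inputs(self, token: int, pos: int) -> List[object]:
        """Build the full input list for one step."""

    @abstractmethod
    def update_state(self, outputs: Sequence[object]) -> None:
        """Absorb any state the model returned (e.g. new cache entries)."""


class SimpleIOManager(IOManager):
    """The model keeps its own state; inputs are just (token, position)."""

    def prepare_inputs(self, token: int, pos: int) -> List[object]:
        return [token, pos]

    def update_state(self, outputs: Sequence[object]) -> None:
        pass  # nothing to track


class StaticCacheIOManager(IOManager):
    """Static-attention style: fixed-size cache and mask passed explicitly."""

    def __init__(self, max_len: int) -> None:
        self.cache = [0] * max_len  # toy stand-in for a KV cache
        self.mask = [0] * max_len
        self._pos = 0

    def prepare_inputs(self, token: int, pos: int) -> List[object]:
        self._pos = pos
        self.mask[pos] = 1  # mark the current position as visible
        return [token, pos, list(self.cache), list(self.mask)]

    def update_state(self, outputs: Sequence[object]) -> None:
        # Assumed convention: outputs[1] is the cache entry for this position.
        self.cache[self._pos] = outputs[1]


def generate(model: ModelFn, io: IOManager, prompt: List[int],
             max_new_tokens: int) -> List[int]:
    """Shared runner loop; all model-specific IO goes through `io`."""
    generated: List[int] = []
    next_token = None
    pos = 0
    # Feed the prompt one token at a time (no batched prefill in this sketch).
    for tok in prompt:
        outputs = model(io.prepare_inputs(tok, pos))
        io.update_state(outputs)
        next_token = outputs[0]
        pos += 1
    for _ in range(max_new_tokens):
        generated.append(next_token)
        outputs = model(io.prepare_inputs(next_token, pos))
        io.update_state(outputs)
        next_token = outputs[0]
        pos += 1
    return generated


# Toy models: both predict (token + 1) % 10, but with different IO shapes.
def toy_stateful_model(inputs: List[object]) -> List[object]:
    token, _pos = inputs
    return [(token + 1) % 10]


def toy_static_model(inputs: List[object]) -> List[object]:
    token, _pos, _cache, _mask = inputs
    return [(token + 1) % 10, token]  # second output: new cache entry
```

With this split, supporting a new IO layout would mean writing a new `IOManager` subclass rather than forking `generate`; for example, `generate(toy_static_model, StaticCacheIOManager(max_len=8), [1, 2, 3], 3)` runs through the same loop as the stateful model.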
Alternatives
No response
Additional context
No response
RFC (Optional)
No response
Projects
Status
In progress