
multihead attention - no converter found #65

Closed
johndpope opened this issue Oct 17, 2024 · 3 comments


johndpope commented Oct 17, 2024

❌ Validation exception on node 'MultiheadAttention':
PyTorch op: MultiheadAttention(
(out_proj): NonDynamicallyQuantizableLinear(in_features=128, out_features=128, bias=True)
)
Keras op: ChangeOrderingLayer(func=<function converter_MultiheadAttention.<locals>.func at 0x75b6b7e9f380>)
Input args: ('Tensor(shape=[4096, 1, 128], dtype=torch.float32)', 'Tensor(shape=[4096, 1, 128], dtype=torch.float32)', 'Tensor(shape=[4096, 1, 128], dtype=torch.float32)')
Input kwargs: {}
Output tensors: ['Tensor(shape=[4096, 1, 128], dtype=torch.float32)', 'Tensor(shape=[1, 4096, 4096], dtype=torch.float32)']
Exception: You called set_weights(weights) on layer "multi_head_attention" with a weight list of length 8, but the layer was expecting 0 weights. Provided weights: [array([[[-0.05060041, -0.01487129, 0.10044055, ....
Traceback:

❌ Validation exception on node 'MultiheadAttentionModel':
PyTorch op: MultiheadAttentionModel(
(multihead_attn): MultiheadAttention(
(out_proj): NonDynamicallyQuantizableLinear(in_features=128, out_features=128, bias=True)
)
)
Keras op: <nobuco.layers.container.TransientContainer object at 0x75b6b7d08290>
Input args: ('Tensor(shape=[1, 128, 4096], dtype=torch.float32)',)
Input kwargs: {}
Output tensors: ['Tensor(shape=[1, 128, 4096], dtype=torch.float32)']
Exception: You called set_weights(weights) on layer "multi_head_attention_1" with a weight list of length 8, but the layer was expecting 0 weights. Provided weights: [array([[[-0.05060041, -0.01487129, 0.10044055, ....
Traceback:

[Nobuco] Converting (DONE): |████████████████████████████████████████████████████████████████████████████████| 26/26 ops [00:00]
Legend:
Green — conversion successful
Yellow — conversion imprecise
Red — conversion failed
Red — no converter found
Bold — conversion applied directly
* — subgraph reused
Tensor — this output is not dependent on any of subgraph's input tensors
Tensor — this input is a parameter / constant
Tensor — this tensor is useless

@johndpope (Author) commented:

Sample code is in #64.

Is this a known issue, or just something that has slipped by without anyone needing it?

@johndpope (Author) commented:

Basically crafted some Keras/Torch classes as a workaround; it's mostly working now:
https://github.com/johndpope/IMF/blob/feat/tensorflow-cips/tf-export2.py
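
For reference, a hand-ported workaround of this kind might look roughly like the sketch below. This is hypothetical and not the actual contents of tf-export2.py: the helper name `port_multihead_attention` and `num_heads=8` are assumptions (only embed_dim=128 is visible in the log above), and the Keras per-head kernel layout used here is what recent Keras builds report, so it is worth double-checking against `get_weights()` of the built layer.

```python
import numpy as np
import tensorflow as tf
import torch.nn as nn

def port_multihead_attention(torch_mha: nn.MultiheadAttention, num_heads: int, embed_dim: int):
    """Hypothetical helper: mirror a torch nn.MultiheadAttention in Keras."""
    key_dim = embed_dim // num_heads
    keras_mha = tf.keras.layers.MultiHeadAttention(num_heads=num_heads, key_dim=key_dim)

    # Build the Keras layer first; until it is called it has 0 weights and
    # set_weights() raises exactly the error reported in this issue.
    dummy = np.zeros((1, 1, embed_dim), dtype=np.float32)
    keras_mha(dummy, dummy)

    # PyTorch packs the q/k/v projections into one (3*embed_dim, embed_dim) matrix.
    w_q, w_k, w_v = np.split(torch_mha.in_proj_weight.detach().numpy(), 3, axis=0)
    b_q, b_k, b_v = np.split(torch_mha.in_proj_bias.detach().numpy(), 3, axis=0)
    w_out = torch_mha.out_proj.weight.detach().numpy()
    b_out = torch_mha.out_proj.bias.detach().numpy()

    # Keras stores per-head kernels: (embed_dim, num_heads, key_dim) for q/k/v
    # and (num_heads, key_dim, embed_dim) for the output projection.
    to_keras = lambda w: w.T.reshape(embed_dim, num_heads, key_dim)
    keras_mha.set_weights([
        to_keras(w_q), b_q.reshape(num_heads, key_dim),
        to_keras(w_k), b_k.reshape(num_heads, key_dim),
        to_keras(w_v), b_v.reshape(num_heads, key_dim),
        w_out.T.reshape(num_heads, key_dim, embed_dim), b_out,
    ])
    return keras_mha
```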

@AlexanderLutsenko (Owner) commented:

Exception: You called set_weights(weights) on layer "multi_head_attention_1" with a weight list of length 8, but the layer was expecting 0 weights.

Ah, I see. Some Keras layers do not initialize their weights until the first forward pass. If that's the case, it needs to be done inside that specific node's converter. I'll take a look at it later.
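
A quick way to see that behavior, as a minimal sketch independent of Nobuco (the shapes match the log above; `num_heads=8` is an assumption):

```python
import numpy as np
import tensorflow as tf

mha = tf.keras.layers.MultiHeadAttention(num_heads=8, key_dim=16)
print(len(mha.get_weights()))   # 0 -- no variables created yet, so set_weights() fails

x = np.zeros((1, 4096, 128), dtype=np.float32)
mha(x, x)                       # first call builds the layer's variables
print(len(mha.get_weights()))   # 8 -- q/k/v/output kernels and biases now exist
```

So, presumably, the converter needs to call (build) the Keras layer with correctly shaped dummy inputs before transferring the PyTorch weights.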
