
Add architectures where the transformer is used only for embeddings #163

Draft: lfoppiano wants to merge 26 commits into master
Conversation

@lfoppiano (Collaborator) commented Aug 21, 2023

This PR is still a work in progress.

The goal of this PR is to add additional architectures where the Transformer layer is frozen and used only to generate embedding representations.
At the moment the last 4 layers are concatenated and passed to a bidirectional LSTM with a dense layer on top. Once the implementation is confirmed, we can experiment with different options.
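For illustration, here is a minimal sketch (not the code in this PR) of this kind of frozen-embedding architecture, using TensorFlow/Keras with HuggingFace `transformers`; the model name, sequence length, LSTM size, and label count are placeholder assumptions:

```python
import tensorflow as tf
from transformers import TFAutoModel

# Load the transformer and freeze it, so it acts purely as an
# embedding generator (model name is illustrative).
transformer = TFAutoModel.from_pretrained("bert-base-cased")
transformer.trainable = False

max_len = 128     # assumed maximum sequence length
num_labels = 10   # placeholder number of sequence labels

input_ids = tf.keras.layers.Input(shape=(max_len,), dtype=tf.int32, name="input_ids")
attention_mask = tf.keras.layers.Input(shape=(max_len,), dtype=tf.int32, name="attention_mask")

# Request all hidden states so the last 4 layers can be concatenated
outputs = transformer(input_ids, attention_mask=attention_mask, output_hidden_states=True)

# Concatenate the last 4 hidden layers along the feature axis:
# shape (batch, seq_len, 4 * hidden_size)
embeddings = tf.keras.layers.Concatenate(axis=-1)(list(outputs.hidden_states[-4:]))

# Bidirectional LSTM over the frozen embeddings, dense layer on top
x = tf.keras.layers.Bidirectional(
    tf.keras.layers.LSTM(100, return_sequences=True))(embeddings)
logits = tf.keras.layers.TimeDistributed(
    tf.keras.layers.Dense(num_labels, activation="softmax"))(x)

model = tf.keras.Model(inputs=[input_ids, attention_mask], outputs=logits)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```

Concatenating the last 4 hidden layers mirrors the feature-based approach from the original BERT paper, where the frozen contextual embeddings feed a trainable downstream tagger instead of fine-tuning the whole transformer.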

Currently:

  • when the character channel is enabled, the ChainCRF architecture does not work
  • the results are much lower than expected, and some behaviours need review

@lfoppiano requested review from pjox and kermitt2 on Aug 21, 2023
@lfoppiano added the enhancement label on Aug 21, 2023