
Add architectures where the transformer is used only for embeddings #163

Draft: lfoppiano wants to merge 26 commits into master
Conversation

@lfoppiano (Collaborator) commented Aug 21, 2023

This PR is still a work in progress.

The goal of this PR is to add additional architectures where the Transformer layer is frozen and used only to generate embedding representations.
At the moment the last 4 layers are concatenated and passed to a bidirectional LSTM with a dense layer on top. Once the implementation is confirmed, we can experiment with different options.
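For illustration, here is a minimal sketch (not the code in this PR) of this kind of frozen-embedding architecture, using TensorFlow/Keras with HuggingFace `transformers`; the model name, sequence length, LSTM size, and label count are placeholder assumptions:

```python
import tensorflow as tf
from transformers import TFAutoModel

# Load the transformer and freeze it, so it acts purely as an
# embedding generator (model name is illustrative).
transformer = TFAutoModel.from_pretrained("bert-base-cased")
transformer.trainable = False

max_len = 128     # assumed maximum sequence length
num_labels = 10   # placeholder number of sequence labels

input_ids = tf.keras.layers.Input(shape=(max_len,), dtype=tf.int32, name="input_ids")
attention_mask = tf.keras.layers.Input(shape=(max_len,), dtype=tf.int32, name="attention_mask")

# Request all hidden states so the last 4 layers can be concatenated
outputs = transformer(input_ids, attention_mask=attention_mask, output_hidden_states=True)

# Concatenate the last 4 hidden layers along the feature axis:
# shape (batch, seq_len, 4 * hidden_size)
embeddings = tf.keras.layers.Concatenate(axis=-1)(list(outputs.hidden_states[-4:]))

# Bidirectional LSTM over the frozen embeddings, dense layer on top
x = tf.keras.layers.Bidirectional(
    tf.keras.layers.LSTM(100, return_sequences=True))(embeddings)
logits = tf.keras.layers.TimeDistributed(
    tf.keras.layers.Dense(num_labels, activation="softmax"))(x)

model = tf.keras.Model(inputs=[input_ids, attention_mask], outputs=logits)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```

Concatenating the last 4 hidden layers mirrors the feature-based approach from the original BERT paper, where the frozen contextual embeddings feed a trainable downstream tagger instead of fine-tuning the whole transformer.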

Currently:

  • when the character channel is enabled, the ChainCRF architecture does not work
  • the results are much lower than expected, and some behaviours need review

@lfoppiano requested review from pjox and kermitt2 on Aug 21, 2023
@lfoppiano added the enhancement label on Aug 21, 2023