How to use Pre-trained-BioGPT-Large model? #36

Open
knagamatsu opened this issue Feb 4, 2023 · 3 comments

@knagamatsu

I was able to use Pre-trained BioGPT in accordance with the use case.
Could you give us an example code of using Pre-trained BioGPT-Large?

I tried this code.

import os
os.chdir('/home/******/BioGPT')

import torch
from fairseq.models.transformer_lm import TransformerLanguageModel

m = TransformerLanguageModel.from_pretrained(
        "checkpoints/Pre-trained-BioGPT-Large", 
        "checkpoint.pt", 
        "data",
        tokenizer='moses', 
        bpe='biogpt-large-fastbpe', 
        bpe_codes="data/biogpt_large_bpecodes",
        min_len=100,
        max_len_b=1024)
m.cuda()
src_tokens = m.encode("COVID-19 is")
generate = m.generate([src_tokens], beam=5)[0]
output = m.decode(generate[0]["tokens"])
print(output)

And the result was the error below.

RuntimeError: Error(s) in loading state_dict for TransformerLanguageModel:
	size mismatch for decoder.embed_tokens.weight: copying a param with shape torch.Size([57717, 1600]) from checkpoint, the shape in current model is torch.Size([42384, 1600]).
	size mismatch for decoder.output_projection.weight: copying a param with shape torch.Size([57717, 1600]) from checkpoint, the shape in current model is torch.Size([42384, 1600]).

Thank you!

@renqianluo
Collaborator

renqianluo commented Feb 5, 2023

Pull the latest code and try this:

m = TransformerLanguageModel.from_pretrained(
        "checkpoints/Pre-trained-BioGPT-Large", 
        "checkpoint.pt", 
        "data/BioGPT-Large",
        tokenizer='moses', 
        bpe='fastbpe', 
        bpe_codes="data/BioGPT-Large/bpecodes",
        min_len=100,
        max_len_b=1024)
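
For context, the size mismatch in the original attempt comes from pairing the BioGPT-Large checkpoint (57,717-entry vocabulary) with the base BioGPT dictionary under data (42,384 entries); pointing the loader at data/BioGPT-Large resolves it. Below is a hedged sketch of the follow-up generation calls (the same ones from the original post) on the model m loaded above; it assumes the script is run from the repo root with the checkpoint extracted to checkpoints/Pre-trained-BioGPT-Large.

import torch  # only needed for the optional CUDA check below

if torch.cuda.is_available():
    m.cuda()  # optional: run generation on the GPU

src_tokens = m.encode("COVID-19 is")              # Moses tokenization + fastBPE encoding
generate = m.generate([src_tokens], beam=5)[0]    # beam search, 5 hypotheses for one prompt
output = m.decode(generate[0]["tokens"])          # detokenize the best hypothesis
print(output)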

@knagamatsu
Author

Thank you for your prompt work.
I was able to run it without any issues!

By the way, I think BioGPT's results are closer to human answers.
Is it necessary to change parameters such as the beam size for BioGPT-Large? (A sketch of passing such parameters follows the outputs below.)
The following is BioGPT's answer.

COVID-19 is a global pandemic caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the causative agent of coronavirus disease 2019 (COVID-19), which has spread to more than 200 countries and territories, including the United States (US), Canada, Australia, New Zealand, the United Kingdom (UK), and the United States of America (USA), as of March 11, 2020, with more than 800,000 confirmed cases and more than 800,000 deaths.

And this is the first half of BioGPT-Large's answer.

COVID-19 is a novel coronavirus that emerged in late 2 0 1 9 and is associated with a high mortality rate in patients with acute respiratory distress syndrome (ARDS). The aim of this study was to investigate the clinical characteristics of patients with ARDS caused by COVID-1 9. ▃ ▃ ▃ METHODS ▃ We retrospectively analyzed the clinical data of patients with ARDS caused by COVID-1 9 admitted to the intensive care unit (ICU) of the First Affiliated Hospital of Sun Yat-sen University from January 1, 2 0 1 9, to December 3 1, 2 0 1 9. ▃ ▃ ▃ RESULTS ▃ A total of 1 2 patients with ARDS caused by COVID-1 9 were included in this study. The median age of the patients was 6 0 years (interquartile range [IQR], 5 3-6 5 years), and the median Acute Physiology and Chronic Health Evaluation II (APACHE II) score was 2 5 (IQR, 2 2-2 8). The median time from symptom onset to ICU admission was 1 0 days (IQR, 8-1 2 days).
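
On the beam-size question: whether BioGPT-Large needs different decoding settings is something to check empirically, but fairseq's hub interface does accept generation options (beam, lenpen, and the standard fairseq sampling options) as keyword arguments to generate(). Below is a hedged sketch on the already-loaded model m; the parameter names are standard fairseq generation options rather than anything BioGPT-specific, and exact behaviour may vary across fairseq versions.

src_tokens = m.encode("COVID-19 is")

# Wider beam search: more hypotheses explored per step.
wide = m.generate([src_tokens], beam=10, lenpen=1.0)[0]
print(m.decode(wide[0]["tokens"]))

# Top-k sampling instead of beam search: usually more varied, less templated text.
sampled = m.generate([src_tokens], sampling=True, sampling_topk=20,
                     temperature=0.8, beam=1)[0]
print(m.decode(sampled[0]["tokens"]))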

@shashank140195

I do have a question: how did you download BioGPT-Large? Using the URL gives me an error that it is unable to load the parameters from the checkpoint. Did you use something else to download it?
