
Can't initialize PubMedQA #35

Open
armish opened this issue Feb 4, 2023 · 1 comment
armish commented Feb 4, 2023

I currently have commit 36309e2 cloned, and I downloaded the PubMedQA checkpoint on 2/3/2023:

$ md5sum checkpoints/QA-PubMedQA-BioGPT.tgz
8d05745c9cd93ce3a7b4d87251823b67  checkpoints/QA-PubMedQA-BioGPT.tgz

Following the advice under #23, I was able to make some progress, but this is what I get when I try to initialize PubMedQA:

import torch
from src.transformer_lm_prompt import TransformerLanguageModelPrompt

m = TransformerLanguageModelPrompt.from_pretrained(
        "checkpoints/QA-PubMedQA-BioGPT",
        "checkpoint_avg.pt",
        data="data/PubMedQA/biogpt-ansis-bin",
        tokenizer='moses',
        bpe='fastbpe',
        bpe_codes="data/bpecodes",
        min_len=100,
        max_len_b=1024)
---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
Cell In[2], line 1
----> 1 m = TransformerLanguageModelPrompt.from_pretrained(
      2         "checkpoints/QA-PubMedQA-BioGPT",
      3         "checkpoint_avg.pt",
      4         data="data/PubMedQA/biogpt-ansis-bin",
      5         tokenizer='moses',
      6         bpe='fastbpe',
      7         bpe_codes="data/bpecodes",
      8         min_len=100,
      9         max_len_b=1024)

File ~/miniconda3/envs/biogpt/lib/python3.10/site-packages/fairseq/models/fairseq_model.py:267, in BaseFairseqModel.from_pretrained(cls, model_name_or_path, checkpoint_file, data_name_or_path, **kwargs)
    244 """
    245 Load a :class:`~fairseq.models.FairseqModel` from a pre-trained model
    246 file. Downloads and caches the pre-trained model file if needed.
   (...)
    263         model archive path.
    264 """
    265 from fairseq import hub_utils
--> 267 x = hub_utils.from_pretrained(
    268     model_name_or_path,
    269     checkpoint_file,
    270     data_name_or_path,
    271     archive_map=cls.hub_models(),
    272     **kwargs,
    273 )
    274 logger.info(x["args"])
    275 return hub_utils.GeneratorHubInterface(x["args"], x["task"], x["models"])

File ~/miniconda3/envs/biogpt/lib/python3.10/site-packages/fairseq/hub_utils.py:73, in from_pretrained(model_name_or_path, checkpoint_file, data_name_or_path, archive_map, **kwargs)
     70 if "user_dir" in kwargs:
     71     utils.import_user_module(argparse.Namespace(user_dir=kwargs["user_dir"]))
---> 73 models, args, task = checkpoint_utils.load_model_ensemble_and_task(
     74     [os.path.join(model_path, cpt) for cpt in checkpoint_file.split(os.pathsep)],
     75     arg_overrides=kwargs,
     76 )
     78 return {
     79     "args": args,
     80     "task": task,
     81     "models": models,
     82 }

File ~/miniconda3/envs/biogpt/lib/python3.10/site-packages/fairseq/checkpoint_utils.py:432, in load_model_ensemble_and_task(filenames, arg_overrides, task, strict, suffix, num_shards, state)
    427     raise RuntimeError(
    428         f"Neither args nor cfg exist in state keys = {state.keys()}"
    429     )
    431 if task is None:
--> 432     task = tasks.setup_task(cfg.task)
    434 if "task_state" in state:
    435     task.load_state_dict(state["task_state"])

File ~/miniconda3/envs/biogpt/lib/python3.10/site-packages/fairseq/tasks/__init__.py:46, in setup_task(cfg, **kwargs)
     40         task = TASK_REGISTRY[task_name]
     42 assert (
     43     task is not None
     44 ), f"Could not infer task type from {cfg}. Available argparse tasks: {TASK_REGISTRY.keys()}. Available hydra tasks: {TASK_DATACLASS_REGISTRY.keys()}"
---> 46 return task.setup_task(cfg, **kwargs)

File ~/Projects/biogpt/src/language_modeling_prompt.py:134, in LanguageModelingPromptTask.setup_task(cls, args, **kwargs)
    132     args.source_lang, args.target_lang = data_utils.infer_language_pair(paths[0])
    133 if args.source_lang is None or args.target_lang is None:
--> 134     raise Exception(
    135         "Could not infer language pair, please provide it explicitly"
    136     )
    138 dictionary, output_dictionary = cls.setup_dictionary(args, **kwargs)
    139 prompt = cls.setup_prompt(args, dictionary)

Exception: Could not infer language pair, please provide it explicitly

I was wondering if it would be possible to add example code snippets (like the one for the basic model) to the repo for the different models, to make it easier to get started. I'm happy to help with the documentation if you have any pointers for me.

Thank you!


kamalkraj commented Feb 4, 2023

You can run inference directly here: https://huggingface.co/kamalkraj/BioGPT-Large-PubMedQA

Notebook: https://github.com/kamalkraj/BioGPT-HF/blob/master/BioGPT_Large_PubMedQA.ipynb

