Releases · shon-otmazgin/fastcoref

Avoid downloading the spaCy model when the input text is pretokenized (@aryehgigi)
wandb is no longer installed unless you use train mode; it now comes with the fastcoref[train] extra on PyPI (@aryehgigi)

Contributors

aryehgigi

24 Nov 17:10

shon-otmazgin

v2.1.1

802fabe

patch to disable progress bar at inference

patch to disable progress bar at inference (thanks to @radandreicristian)

Contributors

radandreicristian

10 Nov 18:08

shon-otmazgin

v2.1.0

3db712f

v2.1.0

NEW FEATURE (thanks to @aryehgigi):

The predict function signature changed to support tokenized text as input:

texts: Union[str, List[str], List[List[str]]],  # similar to huggingface tokenizer inputs
is_split_into_words: bool = False

If you pass tokenized text to the predict function, set is_split_into_words=True.

If you want to use a single tokenized sequence, you must set is_split_into_words=True (to resolve the ambiguity with a batch of sequences).
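The ambiguity mentioned above arises because a flat list of strings can be read either as a batch of raw texts or as one tokenized text. The following is a minimal sketch of that dispatch rule in plain Python. `as_batch` is a hypothetical helper written for illustration only; it is not part of fastcoref and does not claim to mirror its internals.

```python
from typing import List, Union

Texts = Union[str, List[str], List[List[str]]]


def as_batch(texts: Texts, is_split_into_words: bool = False) -> list:
    """Hypothetical helper: normalize the input into a batch (list of documents)."""
    if isinstance(texts, str):
        # A single raw text becomes a batch of one.
        return [texts]
    if is_split_into_words and texts and isinstance(texts[0], str):
        # A flat list of strings is ONE tokenized document, not a batch.
        return [texts]
    # Otherwise: a batch of raw texts, or a batch of tokenized documents.
    return list(texts)


tokens = ["Alice", "fell", ".", "She", "woke", "up", "."]
print(len(as_batch(tokens)))                            # 7: seven raw texts
print(len(as_batch(tokens, is_split_into_words=True)))  # 1: one tokenized text
```

Without the flag, the same list is interpreted as seven separate documents, which is why the flag is mandatory for a single tokenized sequence.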

Contributors

aryehgigi

09 Nov 18:30

shon-otmazgin

v2.0.3

5db91fc

v2.0.3

Fix: disable all spaCy components except the tokenizer; see #13 (thanks to @radandreicristian)

Contributors

radandreicristian

27 Oct 15:09

shon-otmazgin

v2.0.2

9c8bad8

v2.0.2

Reuse the existing spaCy instance when using the spaCy component

25 Oct 11:26

shon-otmazgin

v2.0.1

f381d41

v2.0.1 - spacy_component, trainer

Adding the following features:

spaCy component (thanks to @mlostar)

from fastcoref import spacy_component
import spacy


texts = ['Alice goes down the rabbit hole. Where she would discover a new reality beyond her expectations.']

nlp = spacy.load("en_core_web_sm")
nlp.add_pipe("fastcoref")

docs = list(nlp.pipe(texts))  # nlp() takes a single string; use nlp.pipe() for a list of texts
docs[0]._.coref_clusters
> [[(0, 5), (39, 42), (79, 82)]]
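The cluster output above appears to consist of character offsets with an exclusive end, since slicing the example sentence at those positions yields the three mentions. Here is a small sketch in plain Python (no spaCy needed). `spans_to_strings` is a hypothetical helper, not part of fastcoref.

```python
from typing import List, Tuple

Span = Tuple[int, int]


def spans_to_strings(text: str, clusters: List[List[Span]]) -> List[List[str]]:
    """Hypothetical helper: map (start, end) character spans to mention strings."""
    return [[text[start:end] for start, end in cluster] for cluster in clusters]


text = ('Alice goes down the rabbit hole. Where she would discover '
        'a new reality beyond her expectations.')
clusters = [[(0, 5), (39, 42), (79, 82)]]  # copied from the output above

print(spans_to_strings(text, clusters))  # [['Alice', 'she', 'her']]
```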

Trainer

from fastcoref import TrainingArgs, CorefTrainer

args = TrainingArgs(
    output_dir='test-trainer',
    overwrite_output_dir=True,
    model_name_or_path='distilroberta-base',
    device='cuda:2',
    epochs=129,
    logging_steps=100,
    eval_steps=100
)   # you can control other arguments such as learning head and others.

trainer = CorefTrainer(
    args=args,
    train_file='train_file_with_clusters.jsonlines', 
    dev_file='path-to-dev-file',    # optional
    test_file='path-to-test-file'   # optional
)
trainer.train()
trainer.evaluate(test=True)

trainer.push_to_hub('your-fast-coref-model-path')

predict now supports an output file:

from fastcoref import LingMessCoref

model = LingMessCoref()
preds = model.predict(texts=texts, output_file='train_file_with_clusters.jsonlines')
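Assuming the output file is written in JSON Lines format (one JSON object per line, as the `.jsonlines` extension suggests), it can be read back with the standard library. The sketch below makes no assumption about the field names fastcoref writes. `read_jsonlines` is a hypothetical helper and the sample record is made up for the demonstration.

```python
import json
import os
import tempfile
from typing import Iterator


def read_jsonlines(path: str) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line of the file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)


# Demonstration with a made-up record (NOT fastcoref's actual schema).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.jsonlines")
    with open(demo_path, "w", encoding="utf-8") as f:
        f.write(json.dumps({"example_field": [1, 2, 3]}) + "\n")
    records = list(read_jsonlines(demo_path))

print(records)  # [{'example_field': [1, 2, 3]}]
```

To inspect a real prediction file, point `read_jsonlines` at the path passed as `output_file` and print the first record to see which fields it contains.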

Contributors

mlostar


custom spacy model for trainer

bug fix

remove unneeded layers
