Stars
Repository containing data and baselines for the two 2024 AmericasNLP shared tasks.
Compute Inter Annotator Agreement from Brat files
Library to download PubMed abstracts with metadata. Originally created to obtain the DrugProt (BioCreative VII) background set
Various useful snippets I created while working at BSC
Train transformer language models with reinforcement learning.
Transformer models from BERT to GPT-4, environments from Hugging Face to OpenAI. Fine-tuning, training, and prompt engineering examples. A bonus section with ChatGPT, GPT-3.5-turbo, GPT-4, and DALL…
JARVIS, a system to connect LLMs with ML community. Paper: https://arxiv.org/pdf/2303.17580.pdf
BLOOM+1: Adapting BLOOM model to support a new unseen language
Finetuning InstructLLaMA with Portuguese data
The simplest, fastest repository for training/finetuning medium-sized GPTs.
Multiple NER tools combined in one output. Invoking multiple NER engines in parallel.
Prompt Engineering | Prompt Versioning | Use GPT or other prompt-based models to get structured output. Join our Discord for prompt engineering, LLMs and other latest research
Tool to convert CoNLL-U format files to CoNLL format files and manipulate training, validation and test sets.
My notes / works on deep learning from Coursera
✨ Innovative and open-source visualization application that transforms various data formats, such as JSON, YAML, XML, CSV and more, into interactive graphs.
🚀 State-of-the-art parsers for natural language.
Machine learning-based classifier that identifies sentences that contain evidence of social impact of research
Biomedical Named Entity Recognition and Normalization of Diseases, Chemicals and Genetic entity classes through the use of state-of-the-art models.
Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.
Python library for working with IIIF Image and Presentation APIs
OpenMMLab Text Detection, Recognition and Understanding Toolbox
A collection of corpora for named entity recognition (NER) and entity recognition tasks. These annotated datasets cover a variety of languages, domains and entity types.
Roadmap (mind map) and keywords for students interested in learning NLP
📖 A curated list of resources dedicated to Natural Language Processing (NLP)