
Train transformer language models with reinforcement learning.

Python 13,382 1,823 Updated Apr 23, 2025

Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.

Python 1,810 398 Updated Apr 6, 2025

RankLLM is a Python toolkit for reproducible information retrieval research using rerankers, with a focus on listwise reranking.

Python 436 56 Updated Apr 21, 2025

A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models.

Python 1,389 80 Updated Apr 1, 2025

A machine learning benchmark of in-the-wild distribution shifts, with data loaders, evaluators, and default models.

Python 564 129 Updated Jan 26, 2024

Large Action Model framework to develop AI Web Agents

Python 6,023 546 Updated Jan 21, 2025

A dataset of atomic Wikipedia edits containing insertions and deletions of a contiguous chunk of text in a sentence. This dataset contains ~43 million edits across 8 languages.

106 8 Updated May 6, 2019

Dataset for Unified Editing, EMNLP 2023. This is a model editing dataset where edits are natural language phrases.

Python 23 1 Updated Sep 4, 2024

Reference implementation for DPO (Direct Preference Optimization)

Python 2,534 207 Updated Aug 11, 2024

[ICLR'24 spotlight] An open platform for training, serving, and evaluating large language models for tool learning.

Python 5,003 430 Updated Nov 18, 2024

Generative Representational Instruction Tuning

Jupyter Notebook 623 42 Updated Mar 14, 2025

In-Context Learning for eXtreme Multi-Label Classification (XMC) using only a handful of examples.

Python 422 26 Updated Feb 13, 2024

Stanford NLP Python library for understanding and improving PyTorch models via interventions

Python 734 82 Updated Apr 23, 2025

Dataset of synthetic job ad sentences tagged with ESCO skills. From the paper Extreme Multi-Label Skill Extraction Training using Large Language Models.

2 Updated Jan 11, 2024

A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs).

Python 833 51 Updated Apr 22, 2025

Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many human languages

Python 7,443 902 Updated Apr 23, 2025

Dataset used to evaluate Skill Extraction systems based on the ESCO skills taxonomy.

13 Updated Jul 18, 2024

SKILLSPAN: Competences as Spans for Skill Extraction from Job Postings

Perl 60 15 Updated Feb 13, 2025

The dataset used to evaluate JobBERT on the task of job title normalization.

26 2 Updated Sep 10, 2022

TF-IDF / KNN-based string matching

Python 53 13 Updated Apr 6, 2023

State-of-the-art efficient coreference. This repository contains the code for the CRAC-2023 paper "CAW-coref: Conjunction-Aware Word-level Coreference Resolution". Forked from the EMNLP-2021 paper …

Python 9 3 Updated Nov 2, 2023

Inspecting and Editing Knowledge Representations in Language Models

Python 116 6 Updated Jul 24, 2023

BioDEX: Large-Scale Biomedical Adverse Drug Event Extraction for Real-World Pharmacovigilance.

Jupyter Notebook 49 5 Updated Jan 29, 2024

OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.

Python 37,318 3,267 Updated Aug 17, 2024

High-speed download of LLaMA, Facebook's 65B parameter GPT model

Shell 4,165 418 Updated Jun 28, 2023

QLoRA: Efficient Finetuning of Quantized LLMs

Jupyter Notebook 10,400 851 Updated Jun 10, 2024

Home of StarCoder: fine-tuning & inference!

Python 7,414 528 Updated Feb 27, 2024

Public repo for the NeurIPS 2023 paper "Unlimiformer: Long-Range Transformers with Unlimited Length Input"

Python 1,060 80 Updated Mar 7, 2024

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.

Python 18,187 1,829 Updated Apr 23, 2025

Holistic Evaluation of Language Models (HELM) is an open source Python framework created by the Center for Research on Foundation Models (CRFM) at Stanford for holistic, reproducible and transparen…

Python 2,182 290 Updated Apr 22, 2025