Stars
Writing AI Conference Papers: A Handbook for Beginners
A Collection of BM25 Algorithms in Python
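A minimal sketch of the Okapi BM25 scoring formula in plain Python, to illustrate what such a collection typically implements; the parameter values k1=1.5 and b=0.75 are the usual defaults, not specifics of this repository.

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each tokenized document in `docs` against `query_terms` with Okapi BM25."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    # Document frequency of each query term across the corpus
    df = {t: sum(1 for d in docs if t in d) for t in set(query_terms)}
    scores = []
    for doc in docs:
        tf = Counter(doc)
        score = 0.0
        for t in query_terms:
            if df.get(t, 0) == 0:
                continue
            idf = math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5))
            denom = tf[t] + k1 * (1 - b + b * len(doc) / avgdl)
            score += idf * tf[t] * (k1 + 1) / denom
        scores.append(score)
    return scores

# Example: rank two toy documents for the query "dense retrieval"
docs = [["dense", "retrieval", "models"], ["sparse", "bm25", "retrieval"]]
print(bm25_scores(["dense", "retrieval"], docs))
```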
Comprehensive tools and frameworks for developing foundation models tailored to recommendation systems.
The first dense retrieval model that can be prompted like an LM
GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embeddings
Retrieval and Retrieval-augmented LLMs
Use contrastive learning to train a large language model (LLM) as a retriever
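A minimal PyTorch sketch of the in-batch-negative contrastive (InfoNCE) objective commonly used to train an LLM as a retriever; the pooling of hidden states into embeddings is assumed to happen elsewhere, and the temperature here is an illustrative default, not this repository's exact recipe.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(query_emb, doc_emb, temperature=0.05):
    """In-batch negatives: doc i is the positive for query i, all other docs are negatives."""
    q = F.normalize(query_emb, dim=-1)
    d = F.normalize(doc_emb, dim=-1)
    logits = q @ d.T / temperature            # (batch, batch) cosine-similarity matrix
    labels = torch.arange(q.size(0), device=q.device)
    return F.cross_entropy(logits, labels)

# Toy usage with random tensors standing in for pooled LLM hidden states
queries = torch.randn(8, 768)
docs = torch.randn(8, 768)
print(info_nce_loss(queries, docs).item())
```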
[NeurIPS'24 Oral] HydraLoRA: An Asymmetric LoRA Architecture for Efficient Fine-Tuning
DeepSeek Coder: Let the Code Write Itself
OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340
Chinese LLM fine-tuning (LLM-SFT) with the math instruction dataset MWP-Instruct; supports models (ChatGLM-6B, LLaMA, Bloom-7B, baichuan-7B), tooling (LoRA, QLoRA, DeepSpeed, UI, TensorboardX), and workflows (fine-tuning, inference, evaluation, API), etc.
InstructIR, a novel benchmark specifically designed to evaluate the instruction-following ability of information retrieval models. It focuses on user-aligned instructions tailored to each query ins…
Source code for the paper "ReACC: A Retrieval-Augmented Code Completion Framework"
Instruct-tune LLaMA on consumer hardware
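For context on how such instruction tuning fits on consumer hardware, here is a minimal sketch of attaching LoRA adapters with the Hugging Face peft library; the checkpoint name and hyperparameters are placeholders, and the repository's actual training script may differ.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model, TaskType

# Placeholder checkpoint; the repository targets LLaMA-style models.
model = AutoModelForCausalLM.from_pretrained("decapoda-research/llama-7b-hf")

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                       # low-rank adapter dimension
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections, typical for LLaMA
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights remain trainable
```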
FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions
Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders'
A Heterogeneous Benchmark for Information Retrieval. Easy to use: evaluate your models across 15+ diverse IR datasets.
A quick guide, especially for trending instruction fine-tuning datasets
[AAAI 2024] Official PyTorch implementation of “Learning Real-World Image De-Weathering with Imperfect Supervision”
MTEB: Massive Text Embedding Benchmark
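A minimal sketch of running an embedding model through MTEB, following the usage pattern documented by the project; the task and model names are arbitrary examples, and the exact API may vary across versions.

```python
from sentence_transformers import SentenceTransformer
from mteb import MTEB

# Any model exposing an `encode(sentences) -> embeddings` method can be evaluated.
model = SentenceTransformer("all-MiniLM-L6-v2")

# Evaluate on a single illustrative task; the benchmark bundles many more.
evaluation = MTEB(tasks=["Banking77Classification"])
results = evaluation.run(model, output_folder="results/all-MiniLM-L6-v2")
print(results)
```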
A Comparative Study of Various Code Embeddings in Software Semantic Matching
A Comprehensive Benchmark for Code Information Retrieval.
A framework for the evaluation of autoregressive code generation language models.
Code for the paper "Evaluating Large Language Models Trained on Code"
Aligning pretrained language models with instruction data generated by themselves.
A high-accuracy, high-efficiency multi-task fine-tuning framework for Code LLMs. Accepted at KDD 2024.