Writing AI Conference Papers: A Handbook for Beginners

1,831 65 Updated Dec 23, 2024

This is the repo for the survey of LLM4IR.

463 37 Updated Sep 5, 2024

A Collection of BM25 Algorithms in Python

Python 1,093 90 Updated Oct 8, 2024

Comprehensive tools and frameworks for developing foundation models tailored to recommendation systems.

81 Updated Jan 15, 2025

The first dense retrieval model that can be prompted like an LM

Python 64 3 Updated Sep 18, 2024

GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embeddings

Python 37 1 Updated Mar 6, 2024

Retrieval and Retrieval-augmented LLMs

Python 8,313 608 Updated Jan 23, 2025
Python 186 9 Updated Jan 16, 2025

Use contrastive learning to train a large language model (LLM) as a retriever

Python 8 1 Updated Jul 19, 2024

[NeurIPS'24 Oral] HydraLoRA: An Asymmetric LoRA Architecture for Efficient Fine-Tuning

Python 157 8 Updated Dec 3, 2024

DeepSeek Coder: Let the Code Write Itself

Python 11,443 783 Updated May 21, 2024

OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340

Jupyter Notebook 3,440 283 Updated Jan 17, 2025

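Chinese LLM fine-tuning (LLM-SFT), with the math instruction dataset MWP-Instruct; supported models (ChatGLM-6B, LLaMA, Bloom-7B, baichuan-7B); supported tooling (LoRA, QLoRA, DeepSpeed, UI, TensorboardX); supported tasks (fine-tuning, inference, evaluation, API), etc.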

Python 183 13 Updated May 17, 2024

InstructIR, a novel benchmark specifically designed to evaluate the instruction-following ability of information retrieval models. It focuses on user-aligned instructions tailored to each query ins…

Python 31 4 Updated Jun 13, 2024

Source code for the paper "ReACC: A Retrieval-Augmented Code Completion Framework"

Python 60 14 Updated Apr 18, 2022

Instruct-tune LLaMA on consumer hardware

Jupyter Notebook 18,773 2,224 Updated Jul 29, 2024

FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions

Python 40 Updated Jul 3, 2024

Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders'

Python 1,407 111 Updated Jan 24, 2025

A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.

Python 1,682 199 Updated Jul 28, 2024

A quick guide (especially) for trending instruction finetuning datasets

2,792 180 Updated Nov 28, 2023

[AAAI 2024] Official PyTorch implementation of “Learning Real-World Image De-Weathering with Imperfect Supervision”

Python 15 Updated Aug 22, 2024

MTEB: Massive Text Embedding Benchmark

Jupyter Notebook 2,113 304 Updated Jan 26, 2025

A Comparative Study of Various Code Embeddings in Software Semantic Matching

Jupyter Notebook 13 2 Updated Dec 8, 2022

A Comprehensive Benchmark for Code Information Retrieval.

Python 58 11 Updated Dec 11, 2024

A framework for the evaluation of autoregressive code generation language models.

Python 872 226 Updated Oct 31, 2024
Python 1,446 109 Updated May 12, 2023

Code for the paper "Evaluating Large Language Models Trained on Code"

Python 2,527 358 Updated Jan 17, 2025

Aligning pretrained language models with instruction data generated by themselves.

Python 4,247 496 Updated Mar 27, 2023

A high-accuracy, high-efficiency multi-task fine-tuning framework for Code LLMs. This work has been accepted by KDD 2024.

Python 660 66 Updated Dec 30, 2024