Chinese natural language inference and semantic similarity datasets

336 75 Updated Jan 5, 2022

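A dataset of 3,000,000+ examples for semantic understanding and matching. It can be used with unsupervised contrastive learning, semi-supervised learning, and similar methods to build the best-performing pretrained models for Chinese.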

Python 277 35 Updated Oct 11, 2022

Phi2-Chinese-0.2B: train your own small Chinese Phi2 chat model from scratch. Supports plugging into langchain to load a local knowledge base for retrieval-augmented generation (RAG).

Jupyter Notebook 470 50 Updated Jul 11, 2024

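A small 0.2B Chinese dialogue model (ChatLM-Chinese-0.2B). Open-sources all code for the full pipeline: dataset sources, data cleaning, tokenizer training, model pretraining, SFT instruction fine-tuning, RLHF optimization, and more. Supports SFT fine-tuning for downstream tasks, with a fine-tuning example for triple information extraction.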

Python 1,171 143 Updated Apr 20, 2024

A blog of paper reading notes (built on GitHub Actions and GitHub Issues).

Python 22 1 Updated May 29, 2024

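A collection of industry practice articles on search, recommendation, advertising, user growth, and more (sources: Zhihu, Datafuntalk, technical WeChat official accounts)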

Python 2,270 288 Updated Oct 6, 2024

Fantastic Data Engineering for Large Language Models

43 1 Updated Aug 7, 2024

Paragraph-by-paragraph close readings of classic and new deep learning papers

26,524 2,408 Updated Aug 8, 2024

DALL·E Mini - Generate images from a text prompt

Python 14,749 1,206 Updated Nov 9, 2023

Books for Data Science

627 284 Updated Mar 27, 2024

Chinese named entity recognition with GlobalPointer, based on PyTorch.

Python 36 3 Updated Jul 7, 2023

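Second place in the "Ali Lingjie" (阿里灵杰) Wentian Engine E-commerce Search Algorithm Competition. A two-stage text matching algorithm for the e-commerce domain.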

Python 51 18 Updated Jul 28, 2022

A PyTorch implementation of the PaddleNLP UIE model

Python 586 99 Updated Aug 13, 2023

Named entity recognition with Baidu UIE, based on PyTorch.

Python 52 9 Updated Feb 1, 2023

Open Academic Research on Improving LLaMA to SOTA LLM

Python 1,593 101 Updated Aug 30, 2023

Awesome LLM Self-Consistency: a curated list of Self-consistency in Large Language Models

62 2 Updated Aug 10, 2024

[EMNLP 2023] Enabling Large Language Models to Generate Text with Citations. Paper: https://arxiv.org/abs/2305.14627

Python 448 40 Updated Aug 16, 2024

🤖 AgentVerse 🪐 is designed to facilitate the deployment of multiple LLM-based agents in various applications, which primarily provides two frameworks: task-solving and simulation

JavaScript 4,084 391 Updated Sep 9, 2024

An Open-source Framework for Data-centric, Self-evolving Autonomous Language Agents

Python 5,189 408 Updated Sep 26, 2024

A programming framework for agentic AI 🤖

Jupyter Notebook 31,625 4,596 Updated Oct 6, 2024

🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming

Python 44,064 5,243 Updated Sep 29, 2024

Resource, Evaluation and Detection Papers for ChatGPT

452 24 Updated Mar 21, 2024

Question and Answer based on Anything.

Python 11,540 1,117 Updated Sep 27, 2024
Python 2,493 304 Updated May 19, 2024

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting yo…

TypeScript 47,195 6,708 Updated Oct 3, 2024

Chat凉宫春日 (Chat Haruhi Suzumiya), an open-source role-playing chatbot by Cheng Li, Ziang Leng, and others.

Jupyter Notebook 1,785 159 Updated Aug 13, 2024

[ICLR 2024] Efficient Streaming Language Models with Attention Sinks

Python 6,590 361 Updated Jul 11, 2024

[ACL 2024] LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding

Python 632 45 Updated Sep 10, 2024

The official code of EMNLP 2022, "SCROLLS: Standardized CompaRison Over Long Language Sequences".

Python 67 8 Updated Jan 12, 2024

A quick guide (especially) for trending instruction finetuning datasets

2,486 161 Updated Nov 28, 2023