-
GOT-OCR2.0 Public
Forked from Ucas-HaoranWei/GOT-OCR2.0
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
Python Updated Sep 19, 2024 -
data-juicer Public
Forked from modelscope/data-juicer
A one-stop data processing system to make data higher-quality, juicier, and more digestible for (multimodal) LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
Python Apache License 2.0 Updated Sep 9, 2024 -
gptpdf Public
Forked from CosmosShadow/gptpdf
Using GPT to parse PDF
Python MIT License Updated Aug 7, 2024 -
Firefly Public
Forked from yangjianxin1/Firefly
Firefly: a training tool for large models, supporting training of Qwen2, Yi1.5, Phi-3, Llama3, Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, Llama, Qwen, Baichuan, ChatGLM2, InternLM, Ziya2, Vicuna, Bloom, and other large models
Python Updated Jul 16, 2024 -
pymilvus Public
Forked from milvus-io/pymilvus
Python SDK for Milvus.
Python Apache License 2.0 Updated Jun 28, 2024 -
PaddleOCR Public
Forked from PaddlePaddle/PaddleOCR
Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and…
Python Apache License 2.0 Updated Jun 18, 2024 -
ragflow Public
Forked from infiniflow/ragflow
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
Python Apache License 2.0 Updated Jun 18, 2024 -
unstructured Public
Forked from Unstructured-IO/unstructured
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
HTML Apache License 2.0 Updated Jun 18, 2024 -
SynapseML Public
Forked from microsoft/SynapseML
Simple and Distributed Machine Learning
Scala MIT License Updated May 28, 2024 -
MediaCrawler Public
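Forked from NanmiCoder/MediaCrawler
Xiaohongshu note and comment crawler, Douyin video and comment crawler, Kuaishou video and comment crawler, Bilibili video and comment crawler, Weibo post and comment crawler
Python Other Updated Apr 22, 2024 -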
GeoLite2-City Public
Forked from wp-statistics/GeoLite2-City
GeoLite2-City.mmdb.gz CDN files based on Free Open Source CDN jsDelivr!
Updated Apr 10, 2024 -
transformers Public
Forked from huggingface/transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
Python Apache License 2.0 Updated Oct 20, 2023 -
pytextclassifier Public
Forked from shibing624/pytextclassifier
pytextclassifier is a toolkit for text classification. It implements LR, XGBoost, TextCNN, FastText, TextRNN, BERT, and other classification models, ready to use out of the box.
Python Apache License 2.0 Updated Oct 18, 2023 -
ChatGLM2-6B Public
Forked from THUDM/ChatGLM2-6B
ChatGLM2-6B: An Open Bilingual Chat LLM
Python Other Updated Jul 28, 2023 -
ChatGLM-6B Public
Forked from THUDM/ChatGLM-6B
ChatGLM-6B: An Open Bilingual Dialogue Language Model
Python Apache License 2.0 Updated Jul 26, 2023 -
Chinese-LLaMA-Alpaca Public
Forked from ymcui/Chinese-LLaMA-Alpaca
Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment
Python Apache License 2.0 Updated Jul 19, 2023 -
Bert-Chinese-Text-Classification-Pytorch Public
Forked from 649453932/Bert-Chinese-Text-Classification-Pytorch
Chinese text classification using BERT and ERNIE
Python MIT License Updated Jul 2, 2023 -
backtrader Public
Forked from mementum/backtrader
Python Backtesting library for trading strategies
Python GNU General Public License v3.0 Updated Feb 28, 2022 -
pythondict-quant Public
Forked from Ckend/pythondict-quant
Quant Examples Based on Backtrader.
Python Updated Jan 24, 2022 -
Flask_Project Public
Forked from KikyoWu/Flask_Project
A shopping mall project based on Flask, MySQL, and SQLAlchemy
HTML Updated Sep 8, 2021 -
BigData-Interview Public
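Forked from water8394/BigData-Interview
🎯 🌟 [Big data interview questions] A collection of big-data interview questions gathered online, together with the author's own answer summaries. Currently covers interview knowledge for the Hadoop/Hive/Spark/Flink/HBase/Kafka/ZooKeeper frameworks.
Updated Aug 30, 2021 -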
PythonPark Public
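Forked from Jack-Cherish/PythonPark
Python open-source project "The Road to Self-Taught Programming", with step-by-step tutorials: AI lab, treasure-trove videos, data structures, study guides, machine learning in practice, deep learning in practice, web crawlers, big-tech interview experiences, programmer life, and resource sharing.
Python Updated Aug 27, 2021 -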
advanced-java Public
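Forked from doocs/advanced-java
😮 Core Interview Questions & Answers For Experienced Java (Backend) Developers | A complete primer on advanced knowledge for internet Java engineers, covering high concurrency, distributed systems, high availability, microservices, massive data processing, and related areas
Java Creative Commons Attribution Share Alike 4.0 International Updated Aug 18, 2021 -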
kamiFaka Public
Forked from Baiyuetribe/kamiFaka
A free, open-source card-key delivery system based on Vue 3.0; efficient, stable, and reliable.
Python MIT License Updated Mar 1, 2021 -
youxiang Public
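Forked from why2lyj/youxiang
Uses Itchat to fetch Taobao, JD.com, Pinduoduo (Duoduoke), Suning, and Vipshop coupons. By connecting to Taobao Alliance, JD Alliance, Pinduoduo (Duoduo Jinbao), Suning Alliance (Suning Tuike), Vipshop, and their corresponding open platforms, it retrieves discounted product images and the matching product information, then pushes them to designated group chats through a WeChat bot.
Python Updated Feb 26, 2021 -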
lxSpider Public
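Forked from lixi5338619/lxSpider
A collection of crawler examples, including but not limited to: Taobao, JD.com, Tmall, Douban, Douyin, Kuaishou, Weibo, WeChat, Alibaba, Toutiao, pdd, Youku, iQIYI, Ctrip, 12306, 58, Sohu, Baidu Index, CQVIP and Wanfang, Zlibrary, Oalib, novels, tender sites, procurement sites, and Xiaohongshu
Python Updated Feb 26, 2021 -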
spider_draft Public
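Forked from onepureman/spider_draft
Login cracking for various websites, for exchange and learning only, including: 126 Mail, 17173, 189 Mail, 360 Login Center, 37wan, 39 Health, 51 Games, 58.com, bilibili, YY Live, OnePlus, China Mobile, 9game, Toutiao, Qichacha, Youku, the Enterprise Information Publicity System, ifeng.com, Qunar, Qixinbao, Hexun, Migu Video login, Vipshop, Ximalaya, GOME, Dianping, Damai, Tianyancha, Haodou Recipes, Yidai, Xiaomi Mall, OSChina, Weibo, Hengxin Yidai, Fang.com, Soufangbang, Sou…
Python Updated Jan 25, 2021 -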
ECommerceCrawlers Public
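Forked from DropsDevopsOrg/ECommerceCrawlers
Hands-on 🐍 crawlers 🕷 for a variety of websites and e-commerce data. Includes 🕸: Taobao products, WeChat official accounts, Dianping, Qichacha, job sites, Xianyu, Alibaba tasks, Cnblogs, Weibo, Baidu Tieba, Douban Movies, ibaotu, Quanjing, Douban Music, a provincial drug administration, Sohu News, machine-learning text collection, fofa asset collection, Autohome, the National Bureau of Statistics, Baidu keyword index counts, spider pan-directories, Toutiao, Douban film reviews, Ctrip, Xiaomi App Store, Anjuke, and Tujia homestays ❤️❤️❤️. WeChat crawler showcase project:
Python MIT License Updated Nov 4, 2020 -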
EverydayWechat Public
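Forked from sfyc23/EverydayWechat
WeChat assistant: 1. Sends customized messages to friends (or a girlfriend) on a daily schedule. 2. Bot that automatically replies to friends. 3. Group assistant features (for example, lookups for waste sorting, weather, calendar, real-time movie box office, parcel tracking, and PM2.5).
Python MIT License Updated Oct 30, 2020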