qqccmm

1 follower · 0 following

Stars

harryhare / zuobiao_me_data_analysis

Python 1 Updated May 20, 2018

fujq / ZuoBiao.me

Java 1 1 Updated Aug 18, 2015

natbolon / wiki-gender

Study of linguistic gender biases in the overview of biographies in the English Wikipedia

Jupyter Notebook 5 3 Updated Dec 21, 2019

goldsmith / Wikipedia

A Pythonic wrapper for the Wikipedia API

Python 2,924 521 Updated May 12, 2024

barrust / mediawiki

MediaWiki API wrapper in python http://pymediawiki.readthedocs.io/en/latest/

Python 181 30 Updated Jan 20, 2025

ndrezn / wikipedia-histories

A Python tool to pull the complete edit history of a Wikipedia page

Python 20 7 Updated Dec 3, 2024

esbatmop / MNBVC

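MNBVC (Massive Never-ending BT Vast Chinese corpus): an ultra-large-scale Chinese corpus, benchmarked against the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures and even "Martian script" (火星文) data. It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wikis, classical poetry, lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.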

3,744 262 Updated Mar 2, 2025

yonggekkk / warp-yg

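Multi-function one-click warp script: supports switching between warp-go and wgcf, unlimited generation of warp configuration files, upgrading to warp+ and warp Teams accounts, and checking the VPS local IP and the Netflix and ChatGPT unlock status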

3,819 984 Updated Sep 24, 2024

bhaskatripathi / pdfGPT

PDF GPT allows you to chat with the contents of your PDF file by using GPT capabilities. The most effective open source solution to turn your PDF files into a chatbot!

Python 7,065 849 Updated Dec 9, 2024

wenda-LLM / wenda

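Wenda (闻达): an LLM invocation platform. It aims at efficient content generation for specific environments, while taking into account the limited computing resources of individuals and small and medium-sized enterprises, as well as knowledge security and privacy concerns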

JavaScript 6,266 812 Updated Jan 23, 2025

sindresorhus / capture-website

Capture screenshots of websites

JavaScript 1,955 138 Updated Dec 3, 2024

swinton / screenshot-website

📸 A GitHub Action to capture screenshots of a website, across Windows, Mac, and Linux

JavaScript 191 27 Updated Jun 24, 2024

hellodword / wechat-feeds

[Service discontinued] Generates RSS feeds for WeChat official accounts

987 566 Updated Jun 26, 2021

DIYgod / RSSHub

🧡 Everything is RSSible

TypeScript 35,484 7,790 Updated Mar 3, 2025

robinmoisson / staticrypt

Password protect a static HTML page, decrypted in-browser in JS with no dependency. No server logic needed.

HTML 7,274 446 Updated Feb 19, 2025

keliousabdelhak / Python-from-json-to-sql

Convert json to sql using python & sqlite3

Python 12 2 Updated Dec 18, 2020

ericskh2 / LIHKG-Clone-Practice

Practice coding

Java 3 Updated Aug 22, 2022

unixfox / pupflare

A webpage proxy that sends requests through Chromium (puppeteer) - can be used to bypass Cloudflare anti-bot / anti-DDoS protection from any application (like curl)

JavaScript 377 82 Updated Sep 20, 2024

luopeixiang / textclf

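TextClf: a text classification framework based on PyTorch/Sklearn, including models such as logistic regression, SVM, TextCNN, TextRNN, TextRCNN, DRNN, DPCNN, and BERT. Data processing, model training, and testing can all be done through simple configuration.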

Python 238 39 Updated Jul 21, 2023

gekelly / JD-Comment_emotional-analysis

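JD.com review sentiment analysis model, mainly covering: 1. data acquisition and exploratory analysis; 2. text preprocessing, word segmentation, text vectorization, feature extraction,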

Jupyter Notebook 79 19 Updated Jun 4, 2019

ShawnyXiao / 2018-DC-DataGrand-TextIntelProcess

2018-DC "DataGrand Cup" Text Intelligent Processing Challenge: Champion (1st/3131)

264 57 Updated Oct 24, 2018

hecongqing / 2018-daguan-competition

2018 "DataGrand Cup" Text Intelligent Processing Challenge, long-text classification, rank 4

Jupyter Notebook 282 75 Updated Aug 5, 2020

qqccmm / AutoHome_spider

Forked from StuPeter/AutoHome_spider

Autohome (汽车之家) crawler that works around font-based anti-scraping.

Python 1 Updated Nov 13, 2020

luyishisi / Anti-Anti-Spider

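More and more websites have anti-crawler features: some hide key data in images, and some use user-hostile CAPTCHAs. This is a code repository for anti-anti-crawling, built to improve skills by contending (without malice) with websites that have different characteristics. (Submissions of hard-to-scrape websites are welcome.) (Project paused due to work commitments.)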

Python 7,282 2,173 Updated Oct 17, 2021

alecxe / scrapy-fake-useragent

Random User-Agent middleware based on fake-useragent

Python 694 98 Updated Sep 18, 2023

qqccmm / Tieba_Spider

Forked from Aqua-Dream/Tieba_Spider

Baidu Tieba crawler (based on Scrapy and MySQL)

Python 1 Updated Oct 12, 2023

Aqua-Dream / Tieba_Spider

Baidu Tieba crawler (based on Scrapy and MySQL)

Python 407 116 Updated Nov 25, 2021

zxins / hotfish

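A multithreaded crawler that fetches trending headlines from Zhihu, V2EX, Weibo, Tieba, ITHome, Douban, Hupu, Tianya, GitHub, and other sites, with a Flask-based aggregation website.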

Python 35 12 Updated Feb 16, 2023

wrenfairbank / telegram_gcloner

Python 176 410 Updated Sep 29, 2020

mack-a / v2ray-agent

Xray, Tuic, hysteria2, sing-box: eight-in-one one-click script

Shell 15,141 4,761 Updated Feb 25, 2025