Java 1 1 Updated Aug 18, 2015

Study of linguistic gender biases in the overview of biographies in the English Wikipedia

Jupyter Notebook 5 3 Updated Dec 21, 2019

A Pythonic wrapper for the Wikipedia API

Python 2,924 521 Updated May 12, 2024

MediaWiki API wrapper in Python http://pymediawiki.readthedocs.io/en/latest/

Python 181 30 Updated Jan 20, 2025

A Python tool to pull the complete edit history of a Wikipedia page

Python 20 7 Updated Dec 3, 2024

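MNBVC (Massive Never-ending BT Vast Chinese corpus): an ultra-large-scale Chinese corpus, benchmarked against the 40T of data used to train chatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures and even "Martian script" (火星文) data. It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wiki, classical poetry, lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.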

3,744 262 Updated Mar 2, 2025

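Multi-function one-click warp script: supports switching between warp-go and wgcf, generating unlimited warp configuration files, upgrading to warp+ and warp Teams accounts, and checking the VPS's local IP and Netflix/ChatGPT unlock status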

3,819 984 Updated Sep 24, 2024

PDF GPT allows you to chat with the contents of your PDF file by using GPT capabilities. The most effective open source solution to turn your PDF files into a chatbot!

Python 7,065 849 Updated Dec 9, 2024

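Wenda (闻达): an LLM invocation platform. It aims at efficient content generation for specific environments, while taking into account the limited computing resources of individuals and small and medium-sized enterprises, as well as knowledge security and privacy concerns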

JavaScript 6,266 812 Updated Jan 23, 2025

Capture screenshots of websites

JavaScript 1,955 138 Updated Dec 3, 2024

📸 A GitHub Action to capture screenshots of a website, across Windows, Mac, and Linux

JavaScript 191 27 Updated Jun 24, 2024

[Service discontinued] Generates RSS feeds for WeChat official accounts

987 566 Updated Jun 26, 2021

🧡 Everything is RSSible

TypeScript 35,484 7,790 Updated Mar 3, 2025

Password protect a static HTML page, decrypted in-browser in JS with no dependency. No server logic needed.

HTML 7,274 446 Updated Feb 19, 2025

Convert json to sql using python & sqlite3

Python 12 2 Updated Dec 18, 2020

Practice coding

Java 3 Updated Aug 22, 2022

A webpage proxy that sends requests through Chromium (puppeteer); can be used to bypass Cloudflare anti-bot / anti-DDoS on any application (like curl)

JavaScript 377 82 Updated Sep 20, 2024

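TextClf: a text classification framework based on Pytorch/Sklearn, including models such as logistic regression, SVM, TextCNN, TextRNN, TextRCNN, DRNN, DPCNN, and Bert. Data processing, model training, testing, and other steps can be completed through simple configuration.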

Python 238 39 Updated Jul 21, 2023

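JD.com review sentiment analysis model, mainly covering: 1. data acquisition and exploratory analysis; 2. text preprocessing, word segmentation, text vectorization, feature extraction,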

Jupyter Notebook 79 19 Updated Jun 4, 2019

2018 DC "DataGrand Cup" (达观杯) Text Intelligent Processing Challenge: champion (1st/3131)

264 57 Updated Oct 24, 2018

2018 "DataGrand Cup" (达观杯) Text Intelligent Processing Challenge, long-text classification, rank 4

Jupyter Notebook 282 75 Updated Aug 5, 2020

Autohome (汽车之家) crawler that works around font-based anti-scraping.

Python 1 Updated Nov 13, 2020

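More and more websites have anti-crawler features: some hide key data in images, and some use user-hostile CAPTCHAs. This repository collects anti-anti-crawler code, improving technique by tackling websites with different characteristics (no malicious intent). (Submissions of hard-to-scrape websites are welcome.) (Project paused due to work commitments.)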

Python 7,282 2,173 Updated Oct 17, 2021

Random User-Agent middleware based on fake-useragent

Python 694 98 Updated Sep 18, 2023

Baidu Tieba crawler (based on scrapy and mysql)

Python 1 Updated Oct 12, 2023

Baidu Tieba crawler (based on scrapy and mysql)

Python 407 116 Updated Nov 25, 2021

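A multithreaded crawler that fetches trending headlines from Zhihu, V2EX, Weibo, Tieba, ITHome, Douban, Hupu, Tianya, GitHub, and other sites, with a Flask site that aggregates them.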

Python 35 12 Updated Feb 16, 2023

Xray, Tuic, hysteria2, sing-box eight-in-one one-click script

Shell 15,141 4,761 Updated Feb 25, 2025