magic-html-api

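An intelligent web content extraction API service built on magic-html and jina-ai/reader. It supports multiple content types (article, forum, WeChat, Zhihu) and multiple output formats (text, Markdown, HTML). It keeps only the main article content so that AI can better understand and analyze the text.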

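Features

🔍 Automatically detects the page type and extracts the main content
📚 Supports multiple content types (article, forum, WeChat, Zhihu)
📝 Multiple output formats (text, Markdown, HTML)
⚡ Asynchronous processing for fast responses
🚀 Deployed on Vercel, free to use
🤖 Automatic fallback: when the default extraction fails, jina-ai/reader is used automatically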

🔗 Online demo

Visit https://magic-html-api.vercel.app to try the online version.

One-click deployment is also supported.

API usage

Content extraction

GET /api/extract

Parameters:

url: URL of the page to extract content from (required)
output_format: output format (optional, defaults to "text")
- text: plain text
- markdown: Markdown
- html: HTML

Example request:

https://your-domain.vercel.app/api/extract?url=https://example.com&output_format=markdown
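The example request above passes the target URL unencoded, which works for simple addresses but can break when the target URL carries its own query string. The following is a minimal sketch of building the request URL with percent-encoding. The helper name `build_extract_url` and the client-side format check are assumptions for illustration, not part of the project.

```python
from urllib.parse import urlencode

# Output formats listed in the parameter description above.
ALLOWED_FORMATS = {"text", "markdown", "html"}


def build_extract_url(base: str, target_url: str, output_format: str = "text") -> str:
    """Build a GET /api/extract URL with a safely encoded query string.

    `base` is the origin of your own deployment (hypothetical helper).
    """
    if output_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported output_format: {output_format!r}")
    # urlencode percent-encodes the target URL so that its own '?' and '&'
    # are not mistaken for parameters of the API request.
    query = urlencode({"url": target_url, "output_format": output_format})
    return f"{base.rstrip('/')}/api/extract?{query}"


request_url = build_extract_url(
    "https://your-domain.vercel.app",
    "https://example.com/post?id=1&lang=en",
    "markdown",
)
print(request_url)
```

The resulting string can be passed to any HTTP client. Rejecting unknown formats on the client side is a design choice of this sketch; the server may handle them differently.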

Response format:

{
    "url": "the requested URL",
    "content": "the extracted content",
    "format": "the output format",
    "type": "the content type",
    "success": true
}
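As a rough illustration of consuming this response, the sketch below parses the JSON body into a small dataclass and checks that the five documented fields are present. The names `ExtractResult` and `parse_extract_response` are hypothetical, and the shape of a failed response is not documented here, so the sketch only validates field presence.

```python
import json
from dataclasses import dataclass

# Field names taken from the response format shown above.
FIELDS = ("url", "content", "format", "type", "success")


@dataclass
class ExtractResult:
    url: str
    content: str
    format: str
    type: str
    success: bool


def parse_extract_response(body: str) -> ExtractResult:
    """Parse a JSON response body into an ExtractResult (hypothetical helper)."""
    data = json.loads(body)
    missing = [name for name in FIELDS if name not in data]
    if missing:
        raise ValueError(f"response is missing fields: {missing}")
    # Ignore any extra keys so the parser tolerates additions to the response.
    return ExtractResult(**{name: data[name] for name in FIELDS})


sample = json.dumps({
    "url": "https://example.com",
    "content": "# Title\n\nBody text",
    "format": "markdown",
    "type": "article",
    "success": True,
})
result = parse_extract_response(sample)
print(result.type, result.success)
```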

Content types (type):

article: article
forum: forum
weixin: WeChat article
jina: AI extraction (handled by jina-ai/reader)
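A client may want to know whether a result came from the primary extractor or from the fallback. The sketch below maps the `type` values listed above to readable labels and flags fallback results. The helper names are assumptions, and the handling of unknown values is a defensive choice of this sketch.

```python
# The four type values listed above, with human-readable labels.
CONTENT_TYPES = {
    "article": "Article",
    "forum": "Forum",
    "weixin": "WeChat article",
    "jina": "AI extraction via jina-ai/reader",
}


def describe_type(result_type: str) -> str:
    """Return a label for a response type, tolerating unknown values."""
    return CONTENT_TYPES.get(result_type, f"Unknown type: {result_type}")


def used_fallback(result_type: str) -> bool:
    """True when the content was produced by the jina-ai/reader fallback."""
    return result_type == "jina"


for value in ("weixin", "jina"):
    print(value, "|", describe_type(value), "|", used_fallback(value))
```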

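Implementation

Uses magic-html as the primary content extraction engine
Integrates jina-ai/reader as the fallback extractor
Automatically detects the page type and selects the best extraction strategy
Smart fallback: switches to jina-ai/reader automatically when the default extraction fails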
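The fallback behaviour described above can be sketched as follows. This is not the project's actual code: the two extractors are passed in as plain callables standing in for magic-html and jina-ai/reader, and treating an exception or blank output as a "failure" is an assumption, as is the shape of the result when both extractors fail.

```python
from typing import Callable, Optional

# Hypothetical extractor signature: takes a URL, returns extracted text or None.
Extractor = Callable[[str], Optional[str]]


def _safe_call(extractor: Extractor, url: str) -> Optional[str]:
    """Run an extractor; treat exceptions and blank output as failure."""
    try:
        content = extractor(url)
    except Exception:
        return None
    return content if content and content.strip() else None


def extract_with_fallback(
    url: str,
    primary: Extractor,
    fallback: Extractor,
    detected_type: str = "article",
    output_format: str = "text",
) -> dict:
    """Try the primary extractor first, then fall back to the second one."""
    content = _safe_call(primary, url)
    result_type = detected_type
    if content is None:
        # Primary extraction failed: switch to the fallback and report "jina".
        content = _safe_call(fallback, url)
        result_type = "jina"
    return {
        "url": url,
        "content": content or "",
        "format": output_format,
        "type": result_type,
        "success": content is not None,
    }


def failing_primary(url: str) -> Optional[str]:
    raise RuntimeError("simulated extraction failure")


print(extract_with_fallback("https://example.com", failing_primary, lambda u: "fallback text"))
```

Injecting the extractors keeps the control flow testable without network access; in a real service each callable would wrap the corresponding library or HTTP call.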

Deployment

This project is deployed on Vercel; simply import the GitHub repository.

Requirements

Python 3.9+
Node.js 16+

Deployment steps

Fork this repository
Import the project in Vercel
The service is ready to use once deployment completes

Tech stack

Backend

FastAPI
magic-html
jina-ai/reader
Python 3.9+

Frontend

Next.js 13
React
Tailwind CSS
TypeScript

Deployment

Vercel

Acknowledgements

magic-html - a powerful web content extraction library
jina-ai/reader - an excellent AI content extraction service

License

MIT

Repository: onlyistranger/magichtmlapi

Folders and files

Latest commit

History

Repository files navigation

magic-html-api

功能特点

🔗 在线演示

API使用

内容提取

技术实现

部署

环境要求

部署步骤

技术栈

后端

前端

部署

致谢

License

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages