Stars
A simple screen-parsing tool towards a pure vision-based GUI agent
Qwen2.5-VL is the multimodal large language model series developed by the Qwen team, Alibaba Cloud.
This repository contains the Hugging Face Agents Course.
Open-sourced, Fast and Context-aware Action Grounding from GUI Instructions for GUI/Computer-use Agents
🦜🔗 Build context-aware reasoning applications
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting yo…
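Many container images are hosted outside China (gcr, for example), so downloads from within China are slow and need acceleration. This project aims to provide a stable, reliable, and secure container image service that connects the whole world.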
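Dive into Deep Learning: written for Chinese readers, runnable, and open to discussion. The Chinese and English editions are used for teaching at more than 500 universities in over 70 countries.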
The Ultimate Guide to Making Money with AI Side Hustles: a large collection that teaches you how to use AI for side projects and earn extra income. Check out the English versi…
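Crawlers for Xiaohongshu notes and comments, Douyin videos and comments, Kuaishou videos and comments, Bilibili videos and comments, Weibo posts and comments, Baidu Tieba posts and comment replies, and Zhihu Q&A articles and comments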
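Hello Algo: a data structures and algorithms tutorial with animated illustrations and one-click runnable code. Supports code in Python, Java, C++, C, C#, JS, Go, Swift, Rust, Ruby, Kotlin, TS, and Dart. The Simplified and Traditional Chinese editions are updated in sync; the English version is in progress.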
Use LLMs to dig out what you care about from massive amounts of information and a variety of sources daily.
The Patterns of Scalable, Reliable, and Performant Large-Scale Systems
MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone
A generative speech model for daily dialogue.
Python script to create highly-customized thumbnails for videos
Python implementation of the paper "Spatially-Varying Blur Detection Based on Multiscale Fused and Sorted Transform Coefficients of Gradient Magnitudes" (CVPR 2017)
💎 1MB lightweight face detection model
Open-source face mask detection model and data. Detect faces and determine whether people are wearing masks.
Just 1 minute of voice data can be used to train a good TTS model! (few-shot voice cloning)
Effortlessly create virtual displays in Windows that support various resolutions and refresh rates, suitable for remote control or graphics card spoofing.
The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.
A collection of ComfyUI custom nodes.
A library for efficient similarity search and clustering of dense vectors.
C4-PlantUML combines the benefits of PlantUML and the C4 model to provide a simple way of describing and communicating software architectures
Official Code for DragGAN (SIGGRAPH 2023)
A browser UI for JSNES, a JavaScript NES emulator