StarVector is a foundation model for SVG generation that transforms vectorization into a code generation task. Using a vision-language modeling architecture, StarVector processes both visual and te…

Python 2,698 147 Updated Mar 26, 2025

[CVPR 2025] FoundationStereo: Zero-Shot Stereo Matching

Python 1,078 43 Updated Mar 25, 2025

🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.

TypeScript 33,236 2,861 Updated Mar 27, 2025

ColiVara is a suite of services that allows you to store, search, and retrieve documents based on their visual embedding. ColiVara has state-of-the-art retrieval performance on both text and visual…

Python 877 62 Updated Feb 6, 2025

Wireshark for Docker containers

Go 2,425 50 Updated Mar 15, 2025

This is a list of free and open source projects related to Swarm and its growing ecosystem.

41 11 Updated Nov 7, 2024

Taming Stable Diffusion for Lip Sync!

Python 3,345 491 Updated Mar 21, 2025

Reverse Engineering: Decompiling Binary Code with Large Language Models

Python 5,354 366 Updated Oct 28, 2024

Frontier Multimodal Foundation Models for Image and Video Understanding

Jupyter Notebook 677 43 Updated Mar 21, 2025

Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Python 7,331 639 Updated Feb 10, 2025

Eagle Family: Exploring Model Designs, Data Recipes and Training Strategies for Frontier-Class Multimodal LLMs

Python 637 39 Updated Jan 28, 2025

Nexa SDK is a comprehensive toolkit for supporting GGML and ONNX models. It supports text generation, image generation, vision-language models (VLM), Audio Language Model, auto-speech-recognition (…

Python 4,471 623 Updated Mar 6, 2025

This project is a FastAPI-based web service that combines natural language processing with music database querying to provide detailed answers to music-related questions. It utilizes Claude AI for …

Python 3 Updated Nov 22, 2024

Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, and Cyrillic.

Python 26,125 3,286 Updated Sep 24, 2024

Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra-lightweight OCR system, supports recognition of 80+ languages, provides data annotation and synthesis tools, supports training and…

Python 47,847 8,101 Updated Mar 28, 2025

Continuation of the "Official Python Client for the Discogs API"

Python 332 52 Updated Feb 25, 2025

Agent Zero AI framework

Python 6,413 1,393 Updated Mar 18, 2025

Build your own second brain with supermemory. It's a ChatGPT for your bookmarks. Import tweets or save websites and content using the Chrome extension.

TypeScript 8,889 845 Updated Mar 25, 2025

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…

Python 8,866 692 Updated Mar 3, 2025

🔥Open Source No Code Web Data Extraction Platform. Turn Websites To APIs & Spreadsheets With No-Code Robots In Minutes🔥

TypeScript 9,678 766 Updated Mar 27, 2025

MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone

Python 19,085 1,374 Updated Mar 3, 2025

Cosmos is a world model development platform that consists of world foundation models, tokenizers and video processing pipeline to accelerate the development of Physical AI at Robotics & AV labs. C…

Jupyter Notebook 7,832 502 Updated Mar 27, 2025

World's most advanced database DevSecOps solution for Developer, Security, DBA and Platform Engineering teams. The GitHub/GitLab for database DevSecOps.

Go 12,167 785 Updated Mar 28, 2025

The Large-scale Manipulation Platform for Scalable and Intelligent Embodied Systems

Python 1,850 116 Updated Mar 20, 2025

SGLang is a fast serving framework for large language models and vision language models.

Python 12,589 1,385 Updated Mar 28, 2025

Official repository of "SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory"

Python 6,653 427 Updated Mar 18, 2025

AG2 (formerly AutoGen): The Open-Source AgentOS. Join us at: https://discord.gg/pAbnFJrkgZ

Python 2,155 271 Updated Mar 28, 2025

dstack is a lightweight, open-source alternative to Kubernetes & Slurm, simplifying AI container orchestration with multi-cloud & on-prem support. It natively supports NVIDIA, AMD, TPU, and Intel a…

Python 1,743 169 Updated Mar 28, 2025

Truly independent web browser

C++ 36,506 1,534 Updated Mar 28, 2025