SidneyRey

Starred repositories

Psarpei / Multi-Type-TD-TSR

Extracting Tables from Document Images using a Multi-stage Pipeline for Table Detection and Table Structure Recognition

Jupyter Notebook 274 53 Updated Sep 5, 2022

QwenLM / Qwen2.5-Omni

Qwen2.5-Omni is an end-to-end multimodal model by the Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, and video, and of performing real-time speech generation.

Jupyter Notebook 2,586 191 Updated Apr 17, 2025

Gmgge / TrOCR-Seal-Recognition

Transformer-based OCR, extended to official seal (stamp) recognition

Python 210 35 Updated Feb 27, 2025

JeremyCJM / DiffSHEG

[CVPR'24] DiffSHEG: A Diffusion-Based Approach for Real-Time Speech-driven Holistic 3D Expression and Gesture Generation

Python 148 11 Updated Apr 30, 2024

aigc3d / LHM

LHM: Large Animatable Human Reconstruction Model from a Single Image in Seconds

Python 1,850 129 Updated Apr 17, 2025

GuijiAI / HeyGem.ai

C 7,130 1,206 Updated Apr 16, 2025

mannaandpoem / OpenManus

No fortress, purely open ground. OpenManus is Coming.

Python 43,619 7,482 Updated Apr 17, 2025

SesameAILabs / csm

A Conversational Speech Generation Model

Python 12,588 1,138 Updated Mar 27, 2025

Sean-Der / embedded-sdk

LiveKit SDK for Embedded

C++ 46 8 Updated Oct 28, 2024

SIGRobotics-UIUC / LeKiwi

LeKiwi - Low-Cost Mobile Manipulator

588 63 Updated Apr 17, 2025

xavctn / img2table

img2table is a table identification and extraction Python Library for PDF and images, based on OpenCV image processing

Python 700 100 Updated Feb 10, 2025

OpenRobotLab / OpenHomie

Open-sourced code for "HOMIE: Humanoid Loco-Manipulation with Isomorphic Exoskeleton Cockpit".

C++ 230 19 Updated Apr 4, 2025

zhenye234 / LLaSA_training

LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis

Python 508 36 Updated Apr 8, 2025

facebookresearch / pippo

Pippo: High-Resolution Multi-View Humans from a Single Image

Python 513 41 Updated Apr 4, 2025

kvcache-ai / ktransformers

A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations

Python 13,588 954 Updated Apr 18, 2025

isarandi / nlf

[NeurIPS 2024] Neural Localizer Fields for Continuous 3D Human Pose and Shape Estimation

Python 271 9 Updated Mar 28, 2025

xiaoyu258 / DocProj

Document Rectification and Illumination Correction using a Patch-based CNN

Python 367 85 Updated Sep 28, 2022

JOY-MM / JoyGen

talking-face video editing

Python 308 44 Updated Feb 27, 2025

showlab / DragAnything

[ECCV 2024] DragAnything: Motion Control for Anything using Entity Representation

Python 488 18 Updated Jul 2, 2024

TencentARC / MotionCtrl

Official Code for MotionCtrl [SIGGRAPH 2024]

Python 1,422 77 Updated Feb 19, 2025

dzhng / deep-research

An AI-powered research assistant that performs iterative, deep research on any topic by combining search engines, web scraping, and large language models. The goal of this repo is to provide the si…

TypeScript 15,594 1,600 Updated Apr 12, 2025