Starred repositories (SidneyRey)


Extracting Tables from Document Images using a Multi-stage Pipeline for Table Detection and Table Structure Recognition

Jupyter Notebook 274 53 Updated Sep 5, 2022

Qwen2.5-Omni is an end-to-end multimodal model by Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, video, and performing real-time speech generation.

Jupyter Notebook 2,586 191 Updated Apr 17, 2025

Transformer-based OCR, with an extended application to official seals (seal recognition)

Python 210 35 Updated Feb 27, 2025

[CVPR'24] DiffSHEG: A Diffusion-Based Approach for Real-Time Speech-driven Holistic 3D Expression and Gesture Generation

Python 148 11 Updated Apr 30, 2024

LHM: Large Animatable Human Reconstruction Model from a Single Image in Seconds

Python 1,850 129 Updated Apr 17, 2025

No fortress, purely open ground. OpenManus is Coming.

Python 43,619 7,482 Updated Apr 17, 2025

A Conversational Speech Generation Model

Python 12,588 1,138 Updated Mar 27, 2025

LiveKit SDK for Embedded

C++ 46 8 Updated Oct 28, 2024

LeKiwi - Low-Cost Mobile Manipulator

588 63 Updated Apr 17, 2025

img2table is a table identification and extraction Python Library for PDF and images, based on OpenCV image processing

Python 700 100 Updated Feb 10, 2025

Open-sourced code for "HOMIE: Humanoid Loco-Manipulation with Isomorphic Exoskeleton Cockpit".

C++ 230 19 Updated Apr 4, 2025

LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis

Python 508 36 Updated Apr 8, 2025

Pippo: High-Resolution Multi-View Humans from a Single Image

Python 513 41 Updated Apr 4, 2025

A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations

Python 13,588 954 Updated Apr 18, 2025

[NeurIPS 2024] Neural Localizer Fields for Continuous 3D Human Pose and Shape Estimation

Python 271 9 Updated Mar 28, 2025

Document Rectification and Illumination Correction using a Patch-based CNN

Python 367 85 Updated Sep 28, 2022

talking-face video editing

Python 308 44 Updated Feb 27, 2025

[ECCV 2024] DragAnything: Motion Control for Anything using Entity Representation

Python 488 18 Updated Jul 2, 2024

Official Code for MotionCtrl [SIGGRAPH 2024]

Python 1,422 77 Updated Feb 19, 2025

An AI-powered research assistant that performs iterative, deep research on any topic by combining search engines, web scraping, and large language models. The goal of this repo is to provide the si…

TypeScript 15,594 1,600 Updated Apr 12, 2025

Hallo3: Highly Dynamic and Realistic Portrait Image Animation with Video Diffusion Transformer

Python 1,192 161 Updated Mar 13, 2025
Python 439 38 Updated Mar 13, 2025

Inference library for sequence-based table recognition, integrating table recognition algorithms such as PP-Structure and modelscope.

Python 268 20 Updated Apr 8, 2025

Official implementation of "DepthLab: From Partial to Complete"

Python 476 27 Updated Feb 14, 2025

Non-rigid iterative closest point, nricp.

MATLAB 85 14 Updated Mar 30, 2019

MPB (Miner-PDF-Benchmark) is an end-to-end PDF document comprehension evaluation suite designed for large-scale model data scenarios.

Python 22 5 Updated Dec 11, 2024

Code to accompany "A Method for Animating Children's Drawings of the Human Figure"

Python 12,407 1,068 Updated Apr 6, 2025

Python tool for converting files and office documents to Markdown.

Python 49,253 2,377 Updated Apr 13, 2025