myownskyW7


Qwen2.5 is the large language model series developed by the Qwen team, Alibaba Cloud.

Shell 11,592 703 Updated Dec 24, 2024
21 Updated Jan 6, 2025

Dispider: Enabling Video LLMs with Active Real-Time Interaction via Disentangled Perception, Decision, and Reaction

Python 24 Updated Jan 7, 2025

Code for BLT research paper

Python 1,271 89 Updated Jan 7, 2025

SOTA Open Source TTS

Python 18,182 1,360 Updated Jan 4, 2025

Official implementation of X-Prompt: Towards Universal In-Context Image Generation in Auto-Regressive Vision Language Foundation Models

167 3 Updated Dec 3, 2024

More relighting!

Python 7,270 427 Updated Nov 28, 2024

A Collection of Variational Autoencoders (VAE) in PyTorch.

Python 6,840 1,083 Updated Jun 13, 2024

Course slides and assignments for Hung-yi Lee's Spring 2021/2022/2023 Machine Learning courses

Jupyter Notebook 6,361 1,607 Updated Jun 3, 2023

Inference and training library for high-quality TTS models.

Python 4,876 504 Updated Dec 10, 2024

Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.

Python 1,897 113 Updated Jul 29, 2024

GLM-4-Voice | End-to-end Chinese-English spoken dialogue model

Python 2,537 205 Updated Dec 5, 2024

SGLang is a fast serving framework for large language models and vision language models.

Python 7,167 665 Updated Jan 8, 2025

The official code of the paper "Deciphering Cross-Modal Alignment in Large Vision-Language Models with Modality Integration Rate".

Python 92 3 Updated Nov 27, 2024

Inference code for the paper "Spirit-LM Interleaved Spoken and Written Language Model".

Python 858 56 Updated Oct 28, 2024

The official code of the paper "PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction".

Python 50 Updated Jan 8, 2025

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)

Python 3,612 341 Updated Jan 8, 2025

Official Implementation of paper "MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion"

Python 921 52 Updated Dec 26, 2024

SAM2Long: Enhancing SAM 2 for Long Video Segmentation with a Training-Free Memory Tree

Jupyter Notebook 528 15 Updated Dec 18, 2024

OpenEQA: Embodied Question Answering in the Era of Foundation Models

Jupyter Notebook 249 22 Updated Sep 20, 2024

OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models

Python 1,403 113 Updated Dec 26, 2024

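Extract WeChat chat history and export it to HTML, Word, and Excel documents for permanent storage; analyze the chat history to generate an annual chat report; use the chat data to train a personal AI chat assistant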

Python 36,170 3,752 Updated Jan 2, 2025

Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"

Python 8,745 1,144 Updated Jan 7, 2025

Baichuan-Omni: Towards Capable Open-source Omni-modal LLM 🌊

255 7 Updated Nov 2, 2024

An open-source implementation for training LLaVA-NeXT.

Python 407 21 Updated Oct 23, 2024

VideoLLM-online: Online Video Large Language Model for Streaming Video (CVPR 2024)

Python 278 32 Updated Aug 15, 2024

A general representation model across vision, audio, language modalities. Paper: ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities

Python 999 66 Updated Oct 6, 2024

Awesome papers & datasets specifically focused on long-term videos.

233 10 Updated Nov 15, 2024

800,000 step-level correctness labels on LLM solutions to MATH problems

Python 1,794 106 Updated Jun 1, 2023

SpeechGPT Series: Speech Large Language Models

Python 1,327 87 Updated Jul 22, 2024