Starred repositories
Fully open reproduction of DeepSeek-R1
NVIDIA Ingest is an early access set of microservices for parsing hundreds of thousands of complex, messy unstructured PDFs and other enterprise documents into metadata and text to embed into retri…
PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
Baselines for IS25 Source Tracing Special Session
Deep Learning-Based Gait Recognition Using Smartphones in the Wild
Accelerometer sensor data for Human Posture Recognition
The official code for "One Fits All: Power General Time Series Analysis by Pretrained LM (NeurIPS 2023 Spotlight)"
Code release for "TimesNet: Temporal 2D-Variation Modeling for General Time Series Analysis" (ICLR 2023), https://openreview.net/pdf?id=ju_Uqw384Oq
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
ICML 21 - Voice2Series: Adversarial Reprogramming Acoustic Models for Time Series Classification
Multivariate Time Series Transformer, public version
A professional list on Large (Language) Models and Foundation Models (LLM, LM, FM) for Time Series, Spatiotemporal, and Event Data.
A professionally curated list of awesome resources (paper, code, data, etc.) on transformers in time series.
[IJCAI-21] "Time-Series Representation Learning via Temporal and Contextual Contrasting"
Self-supervised contrastive learning for time series via time-frequency consistency
Cross-Platform, GPU Accelerated Whisper 🏎️
"LightRAG: Simple and Fast Retrieval-Augmented Generation"
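The first open-source, commercially usable dialogue model to support bilingual (Chinese and English) speech-text multimodal conversation. Convenient speech input greatly improves the experience of using large models that take text as input, while avoiding the cumbersome pipeline of ASR-based solutions and the errors it can introduce.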
SpeechGPT Series: Speech Large Language Models
Awesome speech/audio LLMs, representation learning, and codec models
Baseline Recipe for VoicePrivacy Challenge 2024: anonymization systems and evaluation software
SuperPrompt is an attempt to engineer prompts that might help us understand AI agents.
An unofficial PyTorch implementation of StreamVC (Real-Time Low-Latency Voice Conversion)
This is a Phi Family of SLMs book for getting started with Phi Models. Phi is a family of open-source AI models developed by Microsoft. Phi models are the most capable and cost-effective small langua…