Stars
Foundational Models for State-of-the-Art Speech and Text Translation
The reproduction code for Qwen-Audio fine-tuning
56 languages, 1 model: multilingual ASR
An open-source deep research clone. AI agent that reasons over large amounts of web data extracted with Firecrawl
Example of "biological" learning for MNIST
Introductory deep learning tutorials and outstanding articles (Deep Learning Tutorial)
Recurrent Neural Networks of Mediodorsal Thalamus and Prefrontal Cortex in Temporal Contexts
Official implementation for the paper Exploring Wav2vec 2.0 fine-tuning for improved speech emotion recognition
A Two Stage Adaptation Network (TSAN) for remote sensing images classification under single-source-mixed-multiple-target domain adaptation scenario
AD-Prediction: Convolutional Neural Networks for Alzheimer's Disease Prediction Using Brain MRI Images. Abstract: Alzheimer's disease (AD) is characterized by severe memory loss and cognitive impai…
Fine-tune the Whisper speech recognition model to support training without timestamp data, training with timestamp data, and training without speech data. Accelerate inference and support Web deplo…
[ICASSP 2025] Toward Robust Early Detection of Alzheimer's Disease via an Integrated Multimodal Learning Approach
Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
This repository contains the code for the INTERSPEECH 2023 paper "Alzheimer Disease Classification through ASR-based Transcriptions: Exploring the Impact of Punctuation and Pauses"
Fine-tune and evaluate Whisper models for Automatic Speech Recognition (ASR) on custom datasets or datasets from Hugging Face.
Analysis of the dynamics of the Wilson-Cowan model
Playing around with a simple Wilson-Cowan oscillator model
Simple Python wrapper for SRILM, supporting Python 2.x and 3.x
The Simple and Efficient Semi-Supervised Learning Method for Deep Neural Networks
A curated list of papers in Test-time Adaptation, Test-time Training and Source-free Domain Adaptation
Chinese speech pretrained models