-
whisperX Public
Forked from m-bain/whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Python BSD 2-Clause "Simplified" License Updated Jan 28, 2025
-
Fast and accurate multilingual translations, all available offline for enhanced privacy and accessibility.
-
Advanced Speech Emotion Recognition, based on ExHuBERT: Enhancing HuBERT Through Block Extension and Fine-Tuning on 37 Emotion Datasets and 14 languages (Emotions: Disgust, Neutral, Kind, Anger, Su…
-
mms-turkish-tts Public
Turkish text-to-speech model that is part of Facebook's Massively Multilingual Speech (MMS) project
-
Turkish-Text-to-Speech Public
Speech synthesis (TTS) in low-resource languages by training FastPitch from scratch and fine-tuning HiFi-GAN
-
jetson-voice Public
Forked from dusty-nv/jetson-voice
ASR/NLP/TTS deep learning inference library for NVIDIA Jetson using PyTorch and TensorRT
-
Image-Captioning Public
Image captioning with a benchmark of CNN-based encoder and GRU-based inject-type (init-inject, pre-inject, par-inject) and merge decoder architectures
-
NGC-docker Public
NVIDIA GPU Cloud setup and building NVIDIA Containers for Jetson and JetPack
2 Updated Aug 24, 2023
-
ASR-Quantization Public
Post-training quantization of an NVIDIA NeMo ASR model
-
Speaker-Verification Public
Verifying a person's identity from voice characteristics, independent of language, using NVIDIA NeMo models (ECAPA-TDNN, SpeakerNet, TitaNet-L).
-
Question-Answering-BERT Public
Extractive Question-Answering with BERT on SQuAD v2.0 (Stanford Question Answering Dataset) using NVIDIA PyTorch Lightning
-
Conda-Jupyter-Docker Public
Create a conda environment and launch Jupyter Notebook in an Anaconda Docker container
-
Custom object detection on a video dataset using PyTorch Faster R-CNN
-
YOLO Darknet: Traffic sign detection on image and video
-
Classifying custom image datasets by creating Convolutional Neural Networks and Residual Networks from scratch with PyTorch
-
Transfer learning using Inception V3 for custom image classification dataset with TensorFlow and Keras
-
Create a lightweight conda environment for ARM64 devices
6 Updated Jan 22, 2023
-
Speech-Datasets-for-ASR Public
Download speech datasets (English and non-English) for Automatic Speech Recognition
-
KenLM Public
Determining the probability of a sequence of words in Turkish using the KenLM language model with Python
-
IMECA Public
Automatic image captioning on Android-based mobile application with CNN and multi-layer GRU encoder-decoder model
-
Image-Caption-Generation Public
Automatic image captioning based on InceptionV3 and a multi-layer GRU, with the Keras and TensorFlow frameworks
-
TextPrepR Public
Package for cleaning and preprocessing text data; some functions support all languages, and all functions support English.
-
dtw-compare-audio-files Public
Compute the MFCCs and measure (dis)similarity between two audio files using DTW
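
As a rough illustration of the DTW comparison described in the last entry, here is a minimal pure-Python sketch. It is not the repository's code: MFCC extraction is omitted, and the function names (`euclidean`, `dtw_distance`) and the toy one-dimensional "frames" are hypothetical stand-ins for real per-frame MFCC vectors.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq_a, seq_b):
    """Accumulated DTW cost between two sequences of feature vectors.

    Each sequence is a list of frames; each frame is a list of floats
    (for audio, these would be the MFCCs of one analysis window).
    """
    n, m = len(seq_a), len(seq_b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must contain at least one frame")

    inf = float("inf")
    # cost[i][j] = minimal accumulated cost aligning seq_a[:i] with seq_b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = euclidean(seq_a[i - 1], seq_b[j - 1])
            # Allowed steps: advance in seq_a, advance in seq_b, or both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])

    return cost[n][m]


if __name__ == "__main__":
    # Toy frames (assumption): the second sequence is a time-stretched
    # copy of the first, so the alignment cost is zero.
    frames_a = [[0.0], [1.0], [2.0]]
    frames_b = [[0.0], [1.0], [1.0], [2.0]]
    print(dtw_distance(frames_a, frames_b))  # 0.0
```

A lower accumulated cost means the two recordings are more similar under the best time alignment; in practice the raw cost is often normalized by path or sequence length before files of different durations are compared.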