- The Hugging Face Transformers library is the most popular NLP library, with over 68,000 stars on GitHub.
- It provides state-of-the-art natural language processing models and a very clean API.
- The Transformers library works together with deep learning libraries such as PyTorch and TensorFlow.
- It supports Python 3.6+, PyTorch 1.1.0+, TensorFlow 2.0+, and Flax (a quick environment check is sketched below).
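- A minimal sketch of how to verify that the local environment meets these minimums, assuming transformers and torch are already installed (e.g. via pip):

# Check installed versions against the stated minimums
import sys
import torch
import transformers

print("Python      :", sys.version.split()[0])   # needs 3.6+
print("PyTorch     :", torch.__version__)        # needs 1.1.0+
print("Transformers:", transformers.__version__)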
- It can be used to perform speech-to-text conversion in the German language, as in the sketch below.
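- A minimal sketch of German speech-to-text with the high-level pipeline API, using the model discussed later in this document; "audio_de.wav" is a hypothetical 16 kHz German recording, and decoding a file path also requires ffmpeg to be installed:

# German speech-to-text with the high-level pipeline API
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="facebook/wav2vec2-large-xlsr-53-german",
)
result = asr("audio_de.wav")  # hypothetical German recording
print(result["text"])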
- Our research surveys a selection of ASR model architectures that are pretrained on the German language and evaluates state-of-the-art open-source German models on diverse datasets.
- Compared to English, fewer benchmark results have been published for German.
- Speaker Independent
- Continuous and Spontaneous Speech
- Large Vocabulary
- Open-sourced and pretrained
- wav2vec 2.0
- Conformer Transducer
- Conformer CTC
- QuartzNet
- ContextNet
- Citrinet
- SpeechBrain
- This model was developed by Facebook AI Research (FAIR).
- Several pretrained versions of this model are available on Hugging Face.
- Different fine-tuned versions of this model on the latest Common Voice datasets (v6.0, v7.0, v8.0, v9.0) are also available.
- We have selected one pretrained model, facebook/wav2vec2-large-xlsr-53-german, provided by Facebook,
- and a version of it fine-tuned on Common Voice 6.0 by Jonatas Grosman,
- since the latter had the lowest self-reported WER on Common Voice (12.06%), compared to 18.5% reported for the original model provided by FAIR (how WER is computed is sketched below).
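- For reference, these numbers are word error rates; a minimal sketch of computing WER with the jiwer package (an assumption, not mentioned above; the example sentences are made up):

# Compute WER between a reference transcript and a model hypothesis (jiwer assumed installed)
from jiwer import wer

reference = "das wetter ist heute schön"
hypothesis = "das wetter ist heute schon"

print(f"WER: {wer(reference, hypothesis):.2%}")  # 1 substitution out of 5 words -> 20.00%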
- The model facebook/wav2vec2-large-xlsr-53-german is a speech recognition model implemented in the Transformers library, generally used via the Python programming language.
- It is a multilingual pretrained wav2vec 2.0 (XLSR) model.
- The architecture is based on the Transformer encoder (its configuration can be inspected as sketched below).
- The pretrained model was trained on the Multilingual LibriSpeech, Common Voice, and BABEL datasets.
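- A quick way to see the encoder details of the selected checkpoint is to load its configuration; a minimal sketch using standard Wav2Vec2Config fields:

# Inspect the Transformer-encoder configuration of the pretrained checkpoint
from transformers import AutoConfig

config = AutoConfig.from_pretrained("facebook/wav2vec2-large-xlsr-53-german")
print(config.model_type)            # "wav2vec2"
print(config.num_hidden_layers)     # number of Transformer encoder layers
print(config.hidden_size)           # encoder hidden dimension
print(config.num_attention_heads)   # attention heads per layer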
- We can find this model easily in the Transformers Python library.
- To download and use the pretrained model for our given task,
- we just need a few lines of code (PyTorch version):
# Import the speech model class and its processor (feature extractor + tokenizer)
import numpy as np
import torch
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

# Define the model repo
model_name = "facebook/wav2vec2-large-xlsr-53-german"

# Download the PyTorch model and processor
processor = Wav2Vec2Processor.from_pretrained(model_name)
model = Wav2Vec2ForCTC.from_pretrained(model_name)

# Transform a raw waveform (1-D float array sampled at 16 kHz) into model inputs
speech = np.zeros(16_000, dtype=np.float32)  # placeholder: one second of silence
inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")

# Apply the model and decode the CTC output into text
with torch.no_grad():
    logits = model(**inputs).logits
transcription = processor.batch_decode(torch.argmax(logits, dim=-1))[0]
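- To transcribe a real recording instead of the silent placeholder, the waveform can be loaded and resampled to 16 kHz, for example with librosa (an assumed extra dependency; any audio loader works). This sketch reuses the processor and model objects created above; "sample_de.wav" is a hypothetical file.

# Transcribe a hypothetical German recording loaded with librosa
import librosa

speech, sample_rate = librosa.load("sample_de.wav", sr=16_000)  # resample to 16 kHz
inputs = processor(speech, sampling_rate=sample_rate, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(processor.batch_decode(torch.argmax(logits, dim=-1))[0])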