
MSKA


Introduction

We propose a multi-stream keypoint attention network to model sequences of keypoints produced by a readily available keypoint estimator. To facilitate interaction across the streams, we investigate several techniques, including keypoint fusion strategies, head fusion, and self-distillation. The resulting framework, denoted MSKA-SLR, is extended to a sign language translation (SLT) model by simply attaching an additional translation network. We carry out comprehensive experiments on the well-known Phoenix-2014, Phoenix-2014T, and CSL-Daily benchmarks to demonstrate the efficacy of our method. Notably, we achieve a new state-of-the-art result on the Phoenix-2014T sign language translation task.
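
As a rough illustration only (not the released implementation), the sketch below shows the general shape of a multi-stream keypoint model with head fusion: each keypoint group is encoded by its own attention encoder with its own gloss head, and the head outputs are fused by averaging. All module names, dimensions, keypoint groupings, and the vocabulary size here are assumptions made for this sketch.

# Minimal PyTorch sketch of a multi-stream keypoint model with head fusion.
# Names, shapes, and the stream split are illustrative assumptions, not the MSKA code.
import torch
import torch.nn as nn


class StreamEncoder(nn.Module):
    """Encodes one keypoint stream of shape (B, T, K, 2) with self-attention over time."""

    def __init__(self, num_keypoints, d_model=256, num_classes=1000):
        super().__init__()
        self.proj = nn.Linear(num_keypoints * 2, d_model)        # flatten (x, y) per frame
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, num_classes)               # per-stream gloss head

    def forward(self, kps):                                       # kps: (B, T, K, 2)
        x = self.proj(kps.flatten(2))                              # (B, T, d_model)
        x = self.encoder(x)
        return self.head(x)                                        # (B, T, num_classes)


class MultiStreamKeypointNet(nn.Module):
    """Runs one encoder per keypoint group and fuses the per-stream head outputs."""

    def __init__(self, stream_sizes=(9, 21, 21), num_classes=1000):
        super().__init__()
        self.stream_sizes = stream_sizes                           # e.g. body, left hand, right hand
        self.streams = nn.ModuleList(
            StreamEncoder(k, num_classes=num_classes) for k in stream_sizes
        )

    def forward(self, kps):                                        # kps: (B, T, sum(K_i), 2)
        logits, start = [], 0
        for size, stream in zip(self.stream_sizes, self.streams):
            logits.append(stream(kps[:, :, start:start + size]))
            start += size
        return torch.stack(logits).mean(0)                         # head fusion by averaging


if __name__ == "__main__":
    model = MultiStreamKeypointNet()
    dummy = torch.randn(2, 64, 51, 2)                              # two 64-frame keypoint clips
    print(model(dummy).shape)                                      # torch.Size([2, 64, 1000])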

Performance

MSKA-SLR

Dataset         WER (%)   Checkpoint   Training config
Phoenix-2014    21.2      ckpt         config
Phoenix-2014T   19.8      ckpt         config
CSL-Daily       27.1      ckpt         config

MSKA-SLT

Dataset         ROUGE   BLEU-1   BLEU-2   BLEU-3   BLEU-4   Checkpoint   Training config
Phoenix-2014T   53.54   54.79    42.42    34.49    29.03    ckpt         config
CSL-Daily       54.04   56.37    42.80    32.78    25.52    ckpt         config

Installation

conda create -n mska python==3.10.13
conda activate mska
# Please install PyTorch according to your CUDA version.
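# For example, for CUDA 12.1 (adjust the index URL to your CUDA version):
# pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121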
pip install -r requirements.txt

Download

Datasets

Download the datasets from their official websites and place them under the corresponding directories in data/.

Pretrained Models

mbart_de / mbart_zh: pretrained language models used to initialize the translation network for German and Chinese, respectively, with weights from mbart-cc-25.

We provide pretrained models for Phoenix-2014T and CSL-Daily. Download the directory and place its contents under pretrained_models/.

Keypoints

We provide human keypoints for the three datasets, Phoenix-2014, Phoenix-2014T, and CSL-Daily, pre-extracted with HRNet. Please download them and place them under data/Phoenix-2014t (or data/Phoenix-2014, data/CSL-Daily, respectively); a possible layout is sketched below.
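
For orientation only, the paths referenced in this README suggest a layout along these lines (the exact file names inside each dataset directory depend on the downloads; the ${dataset} placeholders match the commands below):

data/
    Phoenix-2014/       # dataset files + HRNet keypoints
    Phoenix-2014t/      # dataset files + HRNet keypoints
    CSL-Daily/          # dataset files + HRNet keypoints
pretrained_models/
    ${dataset}_SLR/best.pth
    ${dataset}_SLT/best.pth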

MSKA-SLR Training

python train.py --config configs/${dataset}_s2g.yaml --epoch 100

MSKA-SLR Evaluation

python train.py --config configs/${dataset}_s2g.yaml --resume pretrained_models/${dataset}_SLR/best.pth --eval

MSKA-SLT Training

python train.py --config configs/${dataset}_s2t.yaml --epoch 40

MSKA-SLT Evaluation

python train.py --config configs/${dataset}_s2t.yaml --resume pretrained_models/${dataset}_SLT/best.pth --eval

Citations

@article{GUAN2025111602,
  title   = {MSKA: Multi-stream keypoint attention network for sign language recognition and translation},
  author  = {Mo Guan and Yan Wang and Guangkun Ma and Jiarui Liu and Mingzu Sun},
  journal = {Pattern Recognition},
  volume  = {165},
  pages   = {111602},
  year    = {2025},
  issn    = {0031-3203},
  doi     = {10.1016/j.patcog.2025.111602},
  url     = {https://www.sciencedirect.com/science/article/pii/S0031320325002626},
}
