
Mirror of https://github.com/ai-forever/slovo


Slovo - Russian Sign Language Dataset

We introduce Slovo, a large-scale video dataset for Russian Sign Language recognition. The dataset is about 16 GB and contains 20400 RGB videos of 1000 sign language gestures from 194 signers. Each class has 20 samples. The dataset is divided into a training set and a test set by subject (user_id). The training set includes 15300 videos, and the test set includes 5100 videos. The total video recording time is ~9.2 hours. About 35% of the videos are recorded in HD format, and 65% are in FullHD resolution. The average length of a video with a gesture is 50 frames.

For more information see our paper - arXiv.

Downloads

| Downloads | Size (GB) | Comment |
|-----------|-----------|---------|
| Slovo | ~16 | Trimmed HD+ videos by (start, end) annotations |
| Origin | ~105 | Original HD+ videos from mining stage |
| 360p | ~13 | Resized original videos by min_side = 360 |
| Landmarks | ~1.2 | Mediapipe hand landmark annotations for each frame of trimmed videos |

Also, you can download Slovo from Kaggle.

The annotation file is easy to use and contains some useful columns; see the annotations.csv file:

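|   | attachment_id | user_id | width | height | length | text | train | begin | end |
|---|---------------|---------|-------|--------|--------|------|-------|-------|-----|
| 0 | de81cc1c-... | 1b... | 1440 | 1920 | 14 | привет | True | 30 | 45 |
| 1 | 3c0cec5a-... | 64... | 1440 | 1920 | 32 | утро | False | 43 | 66 |
| 2 | d17ca986-... | cf... | 1920 | 1080 | 44 | улица | False | 12 | 31 |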

where:

  • attachment_id - video file name
  • user_id - unique anonymized user ID
  • width - video width
  • height - video height
  • length - video length
  • text - gesture class in the Russian language
  • train - train or test boolean flag
  • begin - start of the gesture (for original dataset)
  • end - end of the gesture (for original dataset)
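
As a minimal sketch of how these columns might be used, the snippet below parses annotation rows with Python's standard csv module, splits them by the train flag, and checks that no user_id appears in both splits. The tab delimiter, the sample rows (with hypothetical IDs such as video-a and user-1), and the helper names are assumptions for illustration, not part of the dataset tooling.

```python
import csv
import io

COLUMNS = ["attachment_id", "user_id", "width", "height",
           "length", "text", "train", "begin", "end"]

# Hypothetical sample rows that mirror the column layout described above.
SAMPLE_ROWS = [
    ["video-a", "user-1", "1440", "1920", "14", "привет", "True", "30", "45"],
    ["video-b", "user-2", "1440", "1920", "32", "утро", "False", "43", "66"],
    ["video-c", "user-3", "1920", "1080", "44", "улица", "False", "12", "31"],
]
SAMPLE = "\n".join("\t".join(row) for row in [COLUMNS] + SAMPLE_ROWS)


def load_annotations(csv_text, delimiter="\t"):
    """Parse annotation text into a list of dicts with typed fields.

    The delimiter is an assumption; pass "," if the file is comma-separated.
    """
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text), delimiter=delimiter):
        rows.append({
            "attachment_id": row["attachment_id"],
            "user_id": row["user_id"],
            "width": int(row["width"]),
            "height": int(row["height"]),
            "length": int(row["length"]),
            "text": row["text"],
            "train": row["train"].strip().lower() == "true",
            "begin": int(row["begin"]),
            "end": int(row["end"]),
        })
    return rows


def split_by_flag(rows):
    """Return (train_rows, test_rows) according to the boolean train flag."""
    train = [r for r in rows if r["train"]]
    test = [r for r in rows if not r["train"]]
    return train, test


def shared_users(train, test):
    """Return user IDs present in both splits (empty for a subject-wise split)."""
    return {r["user_id"] for r in train} & {r["user_id"] for r in test}


rows = load_annotations(SAMPLE)
train_rows, test_rows = split_by_flag(rows)
overlap = shared_users(train_rows, test_rows)
```

To run this on the real file, read annotations.csv as text and pass it to load_annotations with the delimiter the file actually uses.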

For convenience, we have also prepared a compressed version of the dataset, in which all videos are resized by the minimum side (min_side = 360). Download link - slovo360p. We also annotate the trimmed videos using MediaPipe and provide hand keypoints in this annotation file.
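
The resize arithmetic can be sketched as follows, assuming that min_side = 360 means the shorter side is scaled to 360 pixels with the aspect ratio preserved. The function name resized_shape and the rounding choice are assumptions for illustration; the exact resizing procedure used for the 360p release is not described here.

```python
def resized_shape(width, height, min_side=360):
    """Return (new_width, new_height) with the shorter side equal to min_side.

    Assumes aspect-ratio-preserving scaling; rounding to the nearest
    integer is an illustrative choice.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    short = min(width, height)
    new_width = max(1, round(width * min_side / short))
    new_height = max(1, round(height * min_side / short))
    return new_width, new_height


portrait = resized_shape(1440, 1920)   # (360, 480)
landscape = resized_shape(1920, 1080)  # (640, 360)
```

The two calls use the frame sizes that appear in the annotation example (1440x1920 and 1920x1080).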

Models

We provide some pre-trained models as baselines for Russian sign language recognition. We tested models with 16, 32, and 48 input frames; the best configuration for each frame count is listed below. The first number in the model name is the number of frames, and the second is the frame interval.
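
To make the naming scheme concrete, here is a small sketch that splits a model name into its backbone, number of frames, and frame interval, and then lists which source frames such a configuration would read. It assumes the frame interval is the stride between consecutive sampled frames, and it clamps indices to the last frame for short clips; both are assumptions, and the helper names are hypothetical. Under this reading, a 32-2 model spans (32 - 1) * 2 + 1 = 63 source frames.

```python
def parse_model_name(name):
    """Split e.g. 'MViTv2-small-32-2' into (backbone, num_frames, frame_interval)."""
    backbone, num_frames, frame_interval = name.rsplit("-", 2)
    return backbone, int(num_frames), int(frame_interval)


def sample_frame_indices(total_frames, num_frames, frame_interval, start=0):
    """Return num_frames indices spaced frame_interval apart.

    Indices past the end of the clip are clamped to the last frame
    (an illustrative choice for clips shorter than the sampling span).
    """
    if total_frames <= 0:
        raise ValueError("total_frames must be positive")
    last = total_frames - 1
    return [min(start + i * frame_interval, last) for i in range(num_frames)]


backbone, num_frames, frame_interval = parse_model_name("MViTv2-small-32-2")
span = (num_frames - 1) * frame_interval + 1  # source frames covered: 63

# A 50-frame clip sampled for a 16-4 configuration.
indices = sample_frame_indices(50, 16, 4)
```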

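| Model Name | Model Size (MB) | Metric | ONNX | TorchScript |
|------------|-----------------|--------|------|-------------|
| MViTv2-small-16-4 | 140.51 | 58.35 | weights | weights |
| MViTv2-small-32-2 | 140.79 | 64.09 | weights | weights |
| MViTv2-small-48-2 | 141.05 | 62.18 | weights | weights |
| Swin-large-16-3 | 821.65 | 48.04 | weights | weights |
| Swin-large-32-2 | 821.74 | 54.84 | weights | weights |
| Swin-large-48-1 | 821.78 | 55.66 | weights | weights |
| ResNet-i3d-16-3 | 146.43 | 32.86 | weights | weights |
| ResNet-i3d-32-2 | 146.43 | 38.38 | weights | weights |
| ResNet-i3d-48-1 | 146.43 | 43.91 | weights | weights |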

SignFlow models

| Model Name | Desc | ONNX | Params |
|------------|------|------|--------|
| SignFlow-A | 63.3 Top-1 Acc on WLASL-2000 (SOTA) | weights | 36M |
| SignFlow-R | Pre-trained on ~50000 samples, has 267 classes, tested with GigaChat (as-is and context-based modes) | weights | 37M |

Demo

usage: demo.py [-h] -p CONFIG [--mp] [-v] [-l LENGTH]

optional arguments:
  -h, --help            show this help message and exit
  -p CONFIG, --config CONFIG
                        Path to config
  --mp                  Enable multiprocessing
  -v, --verbose         Enable logging
  -l LENGTH, --length LENGTH
                        Deque length for predictions

python demo.py -p <PATH_TO_CONFIG>
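
For readers who want to see how such a command line fits together, the following is a minimal argparse sketch that accepts the same flags as the usage text above. It is not the actual demo.py: the integer type and the default of 4 for --length are placeholders, and the bounded deque only illustrates one plausible meaning of "deque length for predictions" (keeping the most recent predicted labels).

```python
import argparse
from collections import deque


def build_parser():
    """Build a parser that mirrors the flags shown in the usage text."""
    parser = argparse.ArgumentParser(prog="demo.py")
    parser.add_argument("-p", "--config", required=True, help="Path to config")
    parser.add_argument("--mp", action="store_true", help="Enable multiprocessing")
    parser.add_argument("-v", "--verbose", action="store_true", help="Enable logging")
    # The default value here is a placeholder, not taken from the real script.
    parser.add_argument("-l", "--length", type=int, default=4,
                        help="Deque length for predictions")
    return parser


# Parse an explicit argument list so the sketch does not depend on sys.argv.
args = build_parser().parse_args(["-p", "config.yaml", "-l", "2"])

# A bounded deque drops the oldest entry once it holds args.length items.
recent_predictions = deque(maxlen=args.length)
for label in ["привет", "утро", "улица"]:
    recent_predictions.append(label)
```

With --length 2, only the two most recent labels remain in recent_predictions after the loop.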


Authors and Credits

Citation

You can cite the paper using the following BibTeX entry:

@inproceedings{kapitanov2023slovo,
    title={Slovo: Russian Sign Language Dataset},
    author={Kapitanov, Alexander and Karina, Kvanchiani and Nagaev, Alexander and Elizaveta, Petrova},
    booktitle={International Conference on Computer Vision Systems},
    pages={63--73},
    year={2023},
    organization={Springer}
}

License

This work is licensed under a variant of Creative Commons Attribution-ShareAlike 4.0 International License.

Please see the specific license.
