Towards Learning a Universal Non-Semantic Representation of Speech
Papers using this code:
- Interspeech 2022: TRILLsson: Distilled Universal Paralinguistic Speech Representations
- ICASSP 2022: Universal Paralinguistic Speech Representations Using Self-Supervised Conformers
- IEEE JSTSP 2022: BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition
- Interspeech 2021: FRILL: A Non-Semantic Speech Embedding for Mobile Devices
- Interspeech 2021: Comparing Supervised Models And Learned Speech Representations For Classifying Intelligibility Of Disordered Speech On Selected Phrases
- Interspeech 2020: Towards Learning a Universal Non-Semantic Representation of Speech
This repository contains a benchmark for comparing speech representations, along with the evaluation code to run it. It also includes a description of our best baseline representation, TRILL.
Things you can do
- Reproduce the results from our paper
- Compute the performance of a new embedding on the Non-Semantic Speech Benchmark (NOSS)
- Run our embedding TRILL, or any of the other embedding networks, on a new dataset
Citation
To use this benchmark, please cite as follows:
@inproceedings{trill,
  author={Joel Shor and Aren Jansen and Ronnie Maor and Oran Lang and Omry Tuval and Félix de Chaumont Quitry and Marco Tagliasacchi and Ira Shavitt and Dotan Emanuel and Yinnon Haviv},
  title={Towards Learning a Universal Non-Semantic Representation of Speech},
  year={2020},
  booktitle={Interspeech},
  pages={140--144},
  doi={10.21437/Interspeech.2020-1242}
}
To use the embeddings, please cite the appropriate paper from the list above.
For questions, reach out to:
Joel Shor (joelshor@google.com)
Oran Lang (oranl@google.com)