SOVA ASR/TTS Features
Availability
SOVA ASR/TTS is licensed under the Apache License 2.0 and can be used for both academic and commercial development.
Flexibility
SOVA speech recognition and synthesis work on both GPU and CPU, so you can run them on a wider range of devices.
Applicability
You can build your own virtual voice assistants on top of SOVA ASR/TTS.
Safety
SOVA speech recognition and synthesis run locally on your machine. Your data remains with you.
Openness
The SOVA ASR/TTS source code is openly available. You can take it and modify it to fit your needs.
Ease
SOVA speech recognition and synthesis are easy to install and come with a detailed README.
Speech recognition
SOVA ASR recognizes speech and converts it into written text, both from uploaded pre-recorded audio and in real time. Acoustic model - Wav2Letter. Decoder - CTC (Connectionist Temporal Classification). Language model - KenLM. Punctuator - BERT.
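To illustrate what a CTC decoder does with the acoustic model's output, here is a minimal sketch of greedy CTC decoding in Python. It is not SOVA's implementation: a production decoder would typically combine beam search with the language model, and the function name, the blank symbol "_", and the toy alphabet below are assumptions made up for this example. The idea is to pick the most likely symbol per audio frame, collapse consecutive repeats, and then drop the blank symbol.

```python
from itertools import groupby

BLANK = "_"  # hypothetical blank symbol, chosen only for this illustration


def ctc_greedy_decode(frame_probs, alphabet, blank=BLANK):
    """Greedy CTC decoding: argmax per frame, collapse repeats, drop blanks.

    frame_probs: list of per-frame probability lists, one entry per symbol.
    alphabet:    list of symbols, aligned with the columns of frame_probs.
    """
    # Step 1: most likely symbol for each frame.
    best = [
        alphabet[max(range(len(row)), key=row.__getitem__)]
        for row in frame_probs
    ]
    # Step 2: collapse runs of the same symbol into a single symbol.
    collapsed = [symbol for symbol, _ in groupby(best)]
    # Step 3: remove the blank symbol that separates genuine repeats.
    return "".join(symbol for symbol in collapsed if symbol != blank)


# Toy input: six frames over a four-symbol alphabet (values are invented).
alphabet = ["_", "a", "c", "t"]
frames = [
    [0.10, 0.10, 0.70, 0.10],  # c
    [0.10, 0.10, 0.70, 0.10],  # c (repeat, collapsed)
    [0.80, 0.10, 0.05, 0.05],  # blank
    [0.10, 0.70, 0.10, 0.10],  # a
    [0.10, 0.10, 0.10, 0.70],  # t
    [0.10, 0.10, 0.10, 0.70],  # t (repeat, collapsed)
]

print(ctc_greedy_decode(frames, alphabet))  # prints "cat"
```

The blank symbol is what lets CTC represent doubled letters: the frame sequence "a", blank, "a" decodes to "aa", whereas "a", "a" collapses to a single "a".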
Speech synthesis
SOVA TTS synthesizes human speech. You can send text of any length and receive an audio recording as output. Multiple voices are available. Engine - modified Tacotron 2. Vocoder - modified WaveGlow. NLP preprocessor - sova-tts-tps.