The Yandex team has announced a new feature for Yandex Browser users: watching videos with multi-voice voice-over translation.
The Browser originally used two synthesized voices to translate speech: one male and one female. Users now have access to multi-voice video translation with twelve voices, six male and six female. According to the developers, improved algorithms make videos with many speakers much easier to follow.
The Browser assigns voices to different speakers and "remembers" them using Yandex neural network technologies. First, one neural network converts speech into text, restores punctuation, and determines sentence boundaries. Then another neural network analyzes the spectrogram of the voice and marks the fragments spoken by different people. This makes it clear which speaker said what.
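The "assign and remember" step can be illustrated with a minimal Python sketch. It is not Yandex's implementation: the voice names, the `VoiceAssigner` class, and the idea that each speaker label arrives with a gender tag are all assumptions made for illustration. It only shows how a speaker who has already been heard keeps the same voice, drawing on a pool of six male and six female voices.

```python
from dataclasses import dataclass, field

# Hypothetical voice pool; the article only says there are six male
# and six female voices, the names here are invented.
MALE_VOICES = [f"male_{i}" for i in range(1, 7)]
FEMALE_VOICES = [f"female_{i}" for i in range(1, 7)]


@dataclass
class VoiceAssigner:
    """Gives each speaker label a voice and reuses it on later fragments."""

    assigned: dict = field(default_factory=dict)
    used: dict = field(default_factory=lambda: {"male": 0, "female": 0})

    def voice_for(self, speaker_id: str, gender: str) -> str:
        # A speaker seen before keeps the voice chosen the first time.
        if speaker_id in self.assigned:
            return self.assigned[speaker_id]
        # Assumption: anything not tagged "male" uses the female pool.
        key = "male" if gender == "male" else "female"
        pool = MALE_VOICES if key == "male" else FEMALE_VOICES
        # Wrap around if there are more speakers than voices in the pool.
        voice = pool[self.used[key] % len(pool)]
        self.used[key] += 1
        self.assigned[speaker_id] = voice
        return voice


# Fragments as a speaker-separation step might label them (made-up data).
segments = [
    ("spk_a", "male", "Hello."),
    ("spk_b", "female", "Hi."),
    ("spk_a", "male", "How are you?"),
]
assigner = VoiceAssigner()
dubbed = [(assigner.voice_for(s, g), text) for s, g, text in segments]
```

In this sketch the first and third fragments receive the same voice because they share the label `spk_a`, which is the behavior the article describes as "remembering" a speaker.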
A year earlier, in September 2021, the Yandex team made it widely possible to watch English-language videos on many popular platforms, including YouTube and Vimeo, with automatic Russian voice-over.