Date: 2021-04-11



Google has open sourced a new audio codec (coder-decoder) named Lyra. Lyra uses machine learning to enable low-bandwidth, high-quality voice calls. It can compress speech down to 3 kbps while keeping sound quality comparable to that of other codecs that require much more bandwidth.

The COVID-19 pandemic has led to an increase in remote work and telecommuting. As a result, data limits are under growing strain even in areas where connections are reliable. That’s why Google is now open sourcing Lyra to help make a difference in these situations.

The Lyra codec is optimized to pack recognizable, natural-sounding human speech into as little space as possible. Its ML models were trained on thousands of hours of speech from members of the general public speaking in over 70 languages and accents. Lyra also runs on low-power devices such as smartphones, with a latency of only 90 milliseconds.

Architecture

Because Lyra is a codec, its architecture consists of two main components: an encoder and a decoder.

The encoder captures the distinctive attributes, or features, of a person’s speech during a call. Lyra extracts these features in 40 ms chunks, compresses them, and then sends them over the network. This process is encoding. The decoder, on the other hand, converts these features back into an audio waveform that the other party in the call can hear and understand. Traditional codecs are usually based on digital signal processing (DSP) techniques, whereas Lyra uses a generative model to reconstruct a high-quality audio signal efficiently.
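To make the numbers concrete, the short Python sketch below works out how many bits a 3 kbps stream leaves for each 40 ms chunk, and how a recording would be cut into such chunks. It is only an illustration of the arithmetic, not Lyra's code: the 16 kHz sample rate is an assumption (the article does not state one), and the function names are hypothetical.

```python
def bits_per_frame(bitrate_bps, frame_ms):
    """Bits available for one frame at a constant bitrate."""
    return bitrate_bps * frame_ms // 1000


def split_into_frames(samples, sample_rate_hz, frame_ms):
    """Split samples into consecutive, non-overlapping frames.

    A trailing partial frame is dropped.
    """
    frame_len = sample_rate_hz * frame_ms // 1000
    usable = len(samples) - len(samples) % frame_len
    return [samples[i:i + frame_len] for i in range(0, usable, frame_len)]


SAMPLE_RATE_HZ = 16_000  # assumed for illustration, not stated in the article
FRAME_MS = 40            # chunk length given in the article
BITRATE_BPS = 3_000      # 3 kbps, as given in the article

one_second = [0] * SAMPLE_RATE_HZ  # one second of silent placeholder audio
frames = split_into_frames(one_second, SAMPLE_RATE_HZ, FRAME_MS)

# 25 frames per second, 640 samples per frame, 120 bits (15 bytes) per frame
print(len(frames), len(frames[0]), bits_per_frame(BITRATE_BPS, FRAME_MS))
```

Under these assumptions, each 640-sample chunk has to be described in about 120 bits, which is why the encoder transmits compact speech features rather than the waveform itself.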

Lyra’s aim is to compress speech so that it stays recognizable. The result is significantly lower in quality than the audio many people are accustomed to (that is, typical encoded recordings), yet it remains clearly recognizable.

Open source release

Google is therefore using Lyra to support parts of the world with low-bandwidth networks. Open sourcing Lyra is a good step, as it creates transparency and gives third-party developers reason to trust and adopt the codec. Google has already implemented Lyra in its free Duo video calling application. The company announced that it is open sourcing the code because it believes Lyra may be suitable for more applications.

The Lyra code is written in C++ for speed, efficiency, and interoperability. It is built with the Bazel build framework, uses Abseil, and relies on the GoogleTest framework for thorough unit testing. Lyra’s core API provides an interface for encoding and decoding at the file and packet levels. The release also includes a complete signal processing toolchain with various filters and transforms, along with the weights and vector quantizer needed to run Lyra. Lyra has been released in beta so that more developers can get involved and Google can collect feedback as soon as possible.
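For readers unfamiliar with the term, a vector quantizer replaces each feature vector with the index of the closest entry in a fixed table (a codebook), so that only the index has to be transmitted. The Python sketch below is a toy illustration of that general technique. It is not Lyra's quantizer or API; the codebook values and function names are made up for the example.

```python
def quantize(vector, codebook):
    """Return the index of the codebook entry nearest to `vector`
    (squared Euclidean distance)."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: dist2(vector, codebook[i]))


def dequantize(index, codebook):
    """Look up the codebook entry for a received index."""
    return codebook[index]


# Hypothetical 4-entry codebook: each 2-D vector costs only 2 bits to send.
CODEBOOK = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

index = quantize((0.9, 0.2), CODEBOOK)   # sender side
approx = dequantize(index, CODEBOOK)     # receiver side
print(index, approx)
```

The receiver recovers only an approximation of the original vector. A larger codebook lowers that error but costs more bits per index, which is the trade-off any low-bitrate codec has to balance.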

