New Fraunhofer Audio Format Could Let Consumers Balance Commentary, Ambient Sound

Fraunhofer Institute, the German research organization that brought the world the MP3 audio codec, is testing another new format, one that could let consumers adjust the balance between center-channel commentary and ambient sound and effects on sports broadcasts. Tentatively titled Dialogue Enhancement, the system would let viewers at home raise or lower the commentary relative to all other audio.

According to a system overview from Fraunhofer, a Dialogue Enhancement encoder analyzes input audio signals at the remote mix location and produces a single mono, stereo, or 5.1 surround mix of all those signals. The encoder then generates parameters that describe the relationship of each source signal to all other sources in a time- and frequency-selective manner, creating metadata for the broadcast audio. The mixed signal is encoded with a codec such as MPEG-4 AAC or HE-AAC, and the stream of parametric metadata is embedded into the audio bitstream for delivery through a standard signal-transport channel.
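The encoder side can be illustrated with a minimal Python sketch. It is not Fraunhofer's algorithm, whose actual parameters the overview does not spell out: the function name encode_frames, the frame size, and the choice of an energy ratio as the parameter are all assumptions, and the sketch is time-selective only, whereas the real system is also frequency-selective.

```python
def encode_frames(commentary, ambience, frame_size=4):
    """Return (downmix, params): a mono mix plus one parameter per frame.

    Hypothetical illustration only: the parameter is the share of each
    frame's energy that comes from the commentary source.
    """
    if len(commentary) != len(ambience):
        raise ValueError("sources must have equal length")

    # A single mono mix of all input signals.
    downmix = [c + a for c, a in zip(commentary, ambience)]

    params = []
    for start in range(0, len(downmix), frame_size):
        c_energy = sum(x * x for x in commentary[start:start + frame_size])
        a_energy = sum(x * x for x in ambience[start:start + frame_size])
        total = c_energy + a_energy
        # Commentary share of the frame energy, in the range 0..1.
        params.append(c_energy / total if total > 0 else 0.0)
    return downmix, params
```

In this toy version, a frame containing only commentary gets a parameter of 1.0 and a frame containing only ambience gets 0.0. A list like this is the kind of side information that would travel alongside the encoded mix.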

On the receiving side, the audio bitstream is decoded by a codec integrated into either a set-top box or the viewing device itself. The decoder uses the metadata from the parameter bitstream, together with the decoded mix signal, to give access to the individual audio sources. The user can then adjust the volume of each source individually.

Dialogue Enhancement was initially tested last year by Fraunhofer and BBC Radio at Wimbledon, where a stereo effects image was derived from a coincident crossed pair of microphones and the commentary was taken in mono from a Radio 5 feed. Listeners who took part in the experiment were able to boost or attenuate the commentary by up to 12 dB relative to the downmix.
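The arithmetic behind a 12 dB adjustment can be sketched in a few lines of Python. This assumes the decoder has already recovered separate commentary and ambience signals; the names MAX_ADJUST_DB, db_to_gain, and remix are hypothetical and are not part of any Fraunhofer API.

```python
MAX_ADJUST_DB = 12.0  # adjustment range reported for the Wimbledon trial


def db_to_gain(db):
    """Convert a level change in decibels to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)


def remix(commentary, ambience, adjust_db):
    """Mix commentary and ambience, with the commentary offset by adjust_db.

    The request is clamped to +/- MAX_ADJUST_DB before it is applied.
    """
    clamped = max(-MAX_ADJUST_DB, min(MAX_ADJUST_DB, adjust_db))
    gain = db_to_gain(clamped)
    return [gain * c + a for c, a in zip(commentary, ambience)]
```

A 12 dB boost corresponds to multiplying the commentary amplitude by roughly 4, and a 12 dB cut to multiplying it by roughly 0.25.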

The next trial will be held in conjunction with Swedish state radio sometime later this year and will experiment with more-elaborate features. These include the ability to apply compression algorithms that enable automatic ducking, a technique in which the level of the music and sound effects is lowered only when someone is talking. Fraunhofer hopes to test the technology on a variety of smartphones, tablets, and other devices.
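Ducking of this kind can be sketched in Python as follows. This is a simplified illustration, not the algorithm planned for the trial: the function name duck, the per-sample speech threshold, the 12 dB default reduction, and the smoothing factor are all assumptions, and a production system would detect speech from a smoothed level envelope rather than from individual samples.

```python
def duck(commentary, ambience, threshold=0.1, duck_db=-12.0, smoothing=0.5):
    """Lower the ambience while the commentary is active, then mix the two.

    threshold: commentary amplitude above which speech is assumed present.
    duck_db:   level change applied to the ambience during speech.
    smoothing: 0..1, how quickly the gain moves toward its target
               (1.0 means it jumps immediately).
    """
    duck_gain = 10.0 ** (duck_db / 20.0)
    gain = 1.0  # current ambience gain, starts at full level
    out = []
    for c, a in zip(commentary, ambience):
        # Pick the target gain from a crude per-sample speech detector.
        target = duck_gain if abs(c) > threshold else 1.0
        # Move part of the way toward the target to avoid abrupt jumps.
        gain += smoothing * (target - gain)
        out.append(c + gain * a)
    return out
```

With smoothing below 1.0, the ambience fades down over several samples when speech starts and fades back up when it stops, instead of switching abruptly.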

Fraunhofer Director of Marketing Communications Mathias Rose says that the Swedish Radio trial will also evaluate the system in both live and on-demand configurations. Though reiterating that increased speech intelligibility remains the primary goal, he adds that the system may eventually make it possible to change the relationship of all audio channels to each other, allowing for multiple commentary channels, among other features.

Rose says the system was demonstrated at this year’s NAB show and that U.S. broadcasters have shown an interest in it, although no domestic trials are scheduled. He adds that it would offer the same benefits here as in Europe, where a growing percentage of the population is hearing-challenged, which will create demand for ways to make spoken words more intelligible in broadcast and streamed sports events.

“Speech intelligibility has been a big source of complaints for sports,” he says. “Every time we demonstrate this process at trade shows, we get very positive feedback.”