While the two frontrunners in the NGA codec stakes are similar in many ways, a comparison of their background and frame of reference gives some interesting insights, writes John Maxwell Hobbs.

The two leading implementations of Next-Generation Audio (NGA) are Dolby AC-4 and MPEG-H 3D audio. Although they are both based around a central set of recommendations from the ITU, they have different origins, advantages, and limitations.

Dolby AC-4 vs MPEG-H 3D Audio: Background

Multi-channel audio broadcasts have been around since 1881, when Clément Ader connected pairs of telephone lines to send stereo audio from the Paris Opera to rooms at the Paris Electrical Exhibition. Between 1890 and 1932, audio programmes using this technology were commercially available in France and England. Since that time, broadcasting has gone from a single audio channel to as many as 22.

Dolby Atmos: Codec options for an immersive audio experience

In 2020, the ITU launched its recommendations for advanced sound systems under the title of Next-Generation Audio (NGA). NGA is a set of audio technologies designed to deliver high-quality audio with increased flexibility, interactivity, and immersion.

It encompasses various audio technologies, including object-based audio, 3D audio, and personalised audio. Object-based audio allows sound designers to treat audio as individual objects that can be placed and moved around in a three-dimensional space, providing a more immersive and realistic experience.

3D audio goes beyond traditional stereo and surround sound, providing a more immersive and directional audio experience, often with the use of headphones or specialised speaker setups. Personalised audio allows listeners to customise their audio experience based on their individual preferences, such as adjusting the volume levels of different audio elements or selecting specific audio tracks or languages.
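
To make the object-based idea concrete, here is a minimal sketch of a sound object carrying position metadata and being rendered to two speakers. It is an illustration only, not the renderer used by Dolby AC-4 or MPEG-H 3D audio: the `AudioObject` class, the azimuth convention and the constant-power pan law are all assumptions chosen for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class AudioObject:
    """Hypothetical audio object: mono samples plus position metadata."""
    name: str
    samples: list[float]
    azimuth_deg: float  # assumed convention: -90 hard left, 0 centre, +90 hard right


def pan_gains(azimuth_deg: float) -> tuple[float, float]:
    """Constant-power pan law: left**2 + right**2 is 1 at every position."""
    clamped = max(-90.0, min(90.0, azimuth_deg))
    angle = (clamped + 90.0) / 180.0 * (math.pi / 2)  # maps to 0 .. pi/2
    return math.cos(angle), math.sin(angle)


def render_stereo(objects: list[AudioObject]) -> list[tuple[float, float]]:
    """Mix every object into a stereo signal according to its metadata."""
    length = max((len(obj.samples) for obj in objects), default=0)
    out = [[0.0, 0.0] for _ in range(length)]
    for obj in objects:
        left, right = pan_gains(obj.azimuth_deg)
        for i, sample in enumerate(obj.samples):
            out[i][0] += sample * left
            out[i][1] += sample * right
    return [(left, right) for left, right in out]


if __name__ == "__main__":
    scene = [
        AudioObject("commentary", [0.5, 0.5, 0.5], azimuth_deg=0.0),
        AudioObject("crowd", [0.2, 0.3, 0.2], azimuth_deg=-60.0),
    ]
    print(render_stereo(scene))
```

The point of the sketch is that the mix is not baked in: the same objects could be rendered again for a different speaker layout simply by swapping the rendering function.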

NGA has been developed by various organisations and industry groups, including the International Telecommunication Union (ITU), the European Broadcasting Union (EBU), and the Advanced Television Systems Committee (ATSC) in the United States. These groups work to develop and promote NGA standards and guidelines, ensuring that NGA is interoperable across different platforms and devices.

Dolby AC-4 and MPEG-H 3D audio are arguably the most widely implemented NGA standards, and have various pros and cons, which we’ll now dig into:

Dolby Atmos and Dolby AC-4

Dolby Labs entered the world of multi-channel sound in 1975 with their Dolby Stereo system for cinemas and released a true four-channel surround sound system a year later.

Dolby Atmos, the best-known component of Dolby’s NGA technology, was first introduced in 2012 as a response to increasing demand for more realistic audio in movie theatres. Traditional surround sound systems use a fixed number of channels to produce audio, with sounds assigned to specific speakers based on their location on the screen. This approach worked well for creating a sense of directional sound, but it lacked the ability to create a truly immersive sound experience.

Dolby Atmos is the brand name of the immersive experience by Dolby. It can be delivered to consumers in the home using Dolby TrueHD, Dolby Digital Plus and Dolby AC-4. The development of the AC-4 codec began in 2011 with the aim of creating a high-quality audio format for broadcasting and streaming services. It is specified in ETSI TS 103 190 and was released for commercial use in 2014.

MPEG-H 3D Audio

The other leading technology in the NGA space is MPEG-H 3D audio, specified in the ISO/IEC 23008-3 standard. It is an audio coding standard developed by the Moving Picture Experts Group (MPEG). The development of MPEG-H 3D audio began in 2013 and was first finalised in 2015, with subsequent updates and improvements being made since then.

As with Dolby AC-4, MPEG-H 3D audio was developed to address the increasing demand for high-quality and immersive audio experiences. However, rather than being focused on cinemas, MPEG-H 3D audio was designed for broadcasting, streaming, and virtual reality. It is an advanced audio coding format that allows for the efficient transmission and storage of high-quality audio, while also providing additional features such as immersive sound and interactivity.

One of the key features of MPEG-H 3D audio is its ability to support channel-based, object-based and scene-based audio. This means that audio can be encoded and decoded in a traditional channel-based format, where sounds are assigned to specific speakers; in an object-based format, where sounds are treated as individual objects that can be placed and moved in a three-dimensional space; or in a scene-based format, where the sound field as a whole is represented (for example, as Higher Order Ambisonics). This allows for a more immersive and dynamic audio experience, particularly in virtual reality and gaming applications.

Another key feature of MPEG-H 3D audio is its ability to adapt to different playback environments. This means that the audio can be optimised for specific playback systems, such as headphones, stereo speakers, or surround sound systems. It also allows for the efficient transmission of audio over networks with varying bandwidths, ensuring that the audio quality remains consistent even under adverse network conditions.
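
As a rough illustration of what adapting to the playback environment can involve, the sketch below folds a single 5.1 sample frame down to stereo when only two speakers are available. The coefficients follow the widely used ITU-R BS.775-style downmix (centre and surrounds attenuated by about 3 dB, LFE discarded). The function names and layout labels are hypothetical, and this is not a description of how an MPEG-H 3D audio or AC-4 decoder renders internally.

```python
import math

ATTENUATION = 1 / math.sqrt(2)  # roughly -3 dB

CHANNEL_ORDER = ("L", "R", "C", "LFE", "Ls", "Rs")  # assumed ordering for this sketch


def downmix_51_to_stereo(frame: dict[str, float]) -> tuple[float, float]:
    """Fold one 5.1 sample frame down to (left, right); the LFE is dropped."""
    left = frame["L"] + ATTENUATION * frame["C"] + ATTENUATION * frame["Ls"]
    right = frame["R"] + ATTENUATION * frame["C"] + ATTENUATION * frame["Rs"]
    return left, right


def adapt_to_layout(frame: dict[str, float], layout: str) -> tuple[float, ...]:
    """Return the samples to send to the speakers for the given layout."""
    if layout == "5.1":
        return tuple(frame[name] for name in CHANNEL_ORDER)
    if layout == "stereo":
        return downmix_51_to_stereo(frame)
    raise ValueError(f"unsupported layout in this sketch: {layout}")


if __name__ == "__main__":
    frame = {"L": 0.2, "R": 0.1, "C": 0.5, "LFE": 0.3, "Ls": 0.05, "Rs": 0.0}
    print(adapt_to_layout(frame, "stereo"))
```

A real NGA decoder goes further, because object metadata lets it re-render each sound for the actual speaker positions rather than applying one fixed downmix matrix.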

Dolby AC-4 vs MPEG-H 3D Audio: Key Features

Both MPEG-H 3D audio and Dolby AC-4 are technologies that aim to enhance the audio experience for listeners. While they are similar in supporting object-based audio formats and offering compatibility with legacy systems, there are also differences between the two.

Dolby AC-4 Features

Dolby AC-4 combines audio compression techniques with a flexible and scalable system design. This minimises the amount of audio data that needs to be delivered and allows optimisation of the system for different content types and delivery methods. Dolby Atmos, the immersive experience delivered via AC-4, uses metadata to describe the position and movement of each sound object, allowing for accurate placement and movement of sounds in a 3D space.

Some of the key features of Dolby AC-4 include:

1. Object-based audio: Unlike traditional channel-based audio formats, which assign specific sounds to specific audio channels, Dolby AC-4 uses object-based audio, where sounds are represented as individual objects that can be positioned and moved around in a three-dimensional space. This allows for more precise and dynamic control over the sound field.

2. Channel-based audio: Dolby AC-4 supports channel-based audio, including immersive channel configurations with surround and height channels (e.g. 7.1.4).

3. Adaptive audio: Dolby AC-4 includes adaptive audio technology, which can adjust the audio mix in real time based on the specific playback system and environment. This ensures that the audio is optimised for the particular setup, whether it’s a home theatre system, a TV or a mobile device.

4. Personalisation: Dolby AC-4’s NGA features offer viewers many options, such as choice of language or home/away announcers in sports broadcasts, and user-selectable enhancement of dialogue intelligibility.

MPEG-H 3D Audio Features

MPEG-H 3D audio uses highly efficient compression algorithms that can reduce the size of audio data while maintaining high audio quality, which has significant advantages for use in broadcasting. Like Dolby Atmos, MPEG-H 3D audio uses metadata to describe the position and movement of each sound object, allowing for accurate placement and movement of sounds in a 3D space.

1. Object-based audio: MPEG-H 3D audio uses object-based audio coding, where audio objects are encoded as separate audio elements that can be combined and manipulated in real time. This gives more precise and dynamic control over the audio content, allowing for a more immersive and interactive audio experience.

2. Channel-based audio: MPEG-H 3D audio also supports channel-based audio coding, where audio is encoded and transmitted as separate channels. This allows for compatibility with existing audio systems and devices and ensures that the audio can be played back on any standard audio system.

3. Adaptive audio: MPEG-H 3D audio allows the audio mix to be adapted to the specific playback system and environment. This ensures that the audio is optimised for the particular setup, whether it’s a home theatre system or a mobile device.

4. Personalisation: MPEG-H 3D audio also includes support for interactivity, which allows the listener to interact with the audio content in real time, for example by changing the mix, adjusting the volume, or selecting different audio objects.
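
The personalisation described for both codecs can be pictured as a mix assembled at the receiver from separately delivered elements. The sketch below is a simplified assumption of how that might look: the `Element` class, its `kind` and `language` fields, and the gain handling are hypothetical and are not taken from either specification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Element:
    """Hypothetical audio element delivered separately from the rest of the mix."""
    name: str
    kind: str                       # e.g. "dialogue" or "ambience" (assumed labels)
    samples: list[float]
    language: Optional[str] = None  # None means the element is language-independent


def db_to_linear(gain_db: float) -> float:
    """Convert a gain in decibels to a linear amplitude factor."""
    return 10 ** (gain_db / 20)


def personalised_mix(
    elements: list[Element], language: str, gains_db: dict[str, float]
) -> list[float]:
    """Keep elements matching the chosen language and apply per-kind user gains."""
    active = [e for e in elements if e.language in (None, language)]
    length = max((len(e.samples) for e in active), default=0)
    mix = [0.0] * length
    for element in active:
        gain = db_to_linear(gains_db.get(element.kind, 0.0))
        for i, sample in enumerate(element.samples):
            mix[i] += gain * sample
    return mix


if __name__ == "__main__":
    programme = [
        Element("stadium", "ambience", [0.1, 0.1]),
        Element("commentary-en", "dialogue", [0.5, 0.0], language="en"),
        Element("commentary-fr", "dialogue", [0.0, 0.5], language="fr"),
    ]
    # English commentary, with dialogue raised by 6 dB for intelligibility.
    print(personalised_mix(programme, "en", {"dialogue": 6.0}))
```

Because the elements stay separate until playback, choosing a different commentary language or boosting dialogue requires no additional broadcast mix, only a different set of choices at the receiver.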

Dolby AC-4 vs MPEG-H 3D Audio: Comparison

At a glance, it appears that the two systems are virtually identical in their capabilities; however, they each show their roots.

Dolby Atmos clearly comes from the world of the cinema and is heavily weighted toward delivering a pre-produced multi-channel experience. Although it does incorporate the interactive and customisation features specified in the NGA recommendations, it is really designed for the lean-back, immersive viewing demanded by feature films and television drama. That is changing, however, and with the AC-4 codec Dolby has been putting significant development effort into enhancing its interactive capabilities.

In terms of user adoption, Dolby AC-4 is widely supported and compatible with a wide range of playback devices, from TVs to mobile devices. Dolby Atmos is also supported by several streaming services such as Apple TV, BT TV, Sky Q, Netflix, and Amazon, as well as by gaming consoles and virtual reality platforms. This makes it a versatile and widely available solution for immersive audio. AC-4 is included in the ATSC 3.0 and DVB broadcast standards.

MPEG-H 3D audio’s implementation of the NGA recommendations has a strong focus on customisation and interactivity, which betrays its genesis in broadcasting. MPEG-H 3D audio is also included in the ATSC 3.0 and DVB broadcast standards and is the technology behind the Sony 360 Reality Audio platform.

Although MPEG-H 3D audio is a newer technology and currently not as widely supported as Dolby Atmos, it is gaining popularity. Currently, it is supported by some broadcast television services, such as TTA in Korea and SBTVD in Brazil, by music streaming services like Amazon Music, and by gaming consoles and virtual reality platforms.

Based on the history of surround technology, it is highly likely that the two systems will co-exist for the foreseeable future. Most surround-capable consumer devices support Dolby Surround, Dolby Pro Logic, and DTS Surround as standard features. This approach is likely to continue with the addition of Dolby AC-4 and MPEG-H 3D audio to the mix.

Dolby AC-4 vs MPEG-H Audio: Feature Comparison

| Feature | MPEG-H 3D Audio | Dolby Atmos / Dolby AC-4 |
| --- | --- | --- |
| Audio Object Coding | Yes (up to 64 objects) | Yes (up to 128 objects) |
| Channel-Based Coding | Yes (up to 22.2) | Yes (up to 22.2) |
| Object-Based Audio | Yes | Yes |
| Immersive Audio | Yes | Yes (Dolby Atmos) |
| Spatial Audio | Yes, supports 3D audio | Yes, supports 3D audio |
| Height Information | Yes | Yes |
| Binaural Rendering | Yes | Yes |
| Interactive Audio | Yes | Yes |
| Audio Channels | 2.0 to 22.2 | 2.0 to 24.1.104 |
| Height Channels | Up to 9 | Up to 10 |
| Dynamic Range Control | Yes | Yes |
| Metadata | Yes, supports object metadata and scene-based audio | Yes, supports object metadata and scene-based audio |
| Speaker Configuration | Can be configured based on the playback environment | Configured based on the playback environment |
| Audio Processing | Can be done in real-time or post-production | Can be done in real-time or post-production |
| Rendering Technology | Uses metadata to position sounds in a 3D space. Higher accuracy in sound rendering | Uses metadata to position sounds in a 3D space. Higher accuracy in sound rendering |
| Platform Support | TV, mobile, and streaming platforms | TV, mobile, and streaming platforms |
| Audio Codecs | MPEG-H 3D audio | AC-4 |
| Bitrate | Variable, up to 768 kbps | Variable, up to 3 Mbps |
| Compatibility | Compatible with some current audio equipment | Compatible with most current audio equipment |
| Implementation | Supported by some streaming services and TV broadcasters: TTA (Korean TV), SBTVD (Brazilian TV), Amazon Music, Tidal, nugs.net | Supported by many streaming services and some TV broadcasters: BT TV, Sky Q, Netflix, Apple TV+, Disney+, Amazon Prime Video, Amazon Music, Tidal, QQ Music, Audible |
| Hardware Support | Limited support, but some hardware products available | More widely supported by hardware products |
| Licensing | Royalty-based | Royalty-based |
