Advanced Audio Coding
Advanced Audio Coding (AAC) is a standardized, lossy digital audio compression scheme. It was developed with the cooperation and contributions of several companies, chiefly Dolby, Fraunhofer (FhG), AT&T, Sony and Nokia, and was officially declared an international standard by the Moving Picture Experts Group in April 1997. It is specified both as Part 7 of the MPEG-2 standard and as Part 3 of the MPEG-4 standard. As such, AAC can be referred to as MPEG-2 Part 7 or MPEG-4 Part 3 depending on its implementation, but it is most often referred to as MPEG-4 AAC, or AAC for short. "MP4", by contrast, usually refers to the format described in MPEG-4 Part 14, which is a container format for carrying video and audio data.
AAC was designed as an improved-performance codec relative to MP3, which was specified by ISO/IEC in 11172-3 (MPEG-1) and 13818-3 (MPEG-2).
AAC was promoted as the successor to MP3 for audio coding at medium to high bitrates. Its popularity is currently sustained by its status as the default codec of Apple's iTunes, the media player that serves the iPod, the most popular digital audio player on the market. <ref>http://www.apple.com/pr/library/2006/jul/19results.html</ref> Furthermore, the iTunes Store, whose sales account for 85% of the market for legal online downloads, <ref>http://www.appleinsider.com/article.php?id=1896</ref> sells AAC-encoded songs (encapsulated with FairPlay Digital Rights Management).
How AAC works
AAC is a wideband audio coding algorithm that exploits two primary coding strategies to dramatically reduce the amount of data needed to represent high-quality digital audio:
- Signal components that are perceptually irrelevant are discarded;
- Redundancies in the coded audio signal are eliminated.
Encoding also involves the following steps:
- The signal is processed by a modified discrete cosine transform (MDCT) according to its complexity;
- Internal error correction codes are added;
- The signal is stored or transmitted.
The MPEG-4 audio standard does not define a single or small set of highly efficient compression schemes but rather a complex toolbox to perform a wide range of operations from low bitrate speech coding to high-quality audio coding and music synthesis.
- The MPEG-4 audio coding algorithm family spans the range from low bitrate speech encoding (down to 2 kbit/s) to high-quality audio coding (at 64 kbit/s per channel and higher).
- AAC offers sampling frequencies between 8 kHz and 96 kHz and any number of channels between 1 and 48.
- In contrast to MP3's hybrid filter bank, AAC uses the modified discrete cosine transform (MDCT) together with an increased window length of 2048 points. AAC is much more capable than MP3 or MP2 of encoding audio with streams of complex pulses and square waves.
AAC encoders can switch dynamically between a single MDCT block of 2048 points and 8 blocks of 256 points.
- If a signal change or transient occurs, the short window of 256 points is chosen for better temporal resolution.
- By default, the longer 2048-point window is used to improve the coding efficiency because of better frequency resolution.
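The block switching described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the energy-ratio transient detector, its threshold and the function names are hypothetical and are not taken from the standard, and the MDCT is the plain textbook definition without windowing or a fast algorithm. An MDCT over a window of 2N points yields N coefficients, so a 2048-point window gives 1024 spectral values and a 256-point window gives 128.

```python
import math

LONG_WINDOW = 2048   # one long block per frame
SHORT_WINDOW = 256   # used eight times per frame when a transient occurs

def mdct(block):
    """Direct O(N^2) MDCT: 2N input samples -> N coefficients (no windowing)."""
    two_n = len(block)
    n = two_n // 2
    return [
        sum(block[t] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
            for t in range(two_n))
        for k in range(n)
    ]

def has_transient(frame, parts=8, ratio=8.0):
    """Hypothetical detector: flag a sharp energy rise between adjacent sub-blocks."""
    size = len(frame) // parts
    energies = [sum(s * s for s in frame[i * size:(i + 1) * size]) + 1e-12
                for i in range(parts)]
    return any(later / earlier > ratio
               for earlier, later in zip(energies, energies[1:]))

def choose_window(frame):
    """Pick the short window for transients, the long window otherwise."""
    return SHORT_WINDOW if has_transient(frame) else LONG_WINDOW
```

For a steady 440 Hz sine sampled at 44.1 kHz, choose_window returns 2048; for a frame that is silent in its first half and then jumps to full amplitude, it returns 256.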
Modular encoding
AAC takes a modular approach to encoding. Depending on the complexity of the bitstream to be encoded, the desired performance and the acceptable output, implementers may create profiles to define which of a specific set of tools they want to use for a particular application. The standard offers four default profiles:
- Low Complexity (LC) - the simplest and most widely used and supported;
- Main Profile (MAIN) - like the LC profile, with the addition of backwards prediction;
- Sample-Rate Scalable (SRS), a.k.a. Scalable Sample Rate (MPEG-4 AAC-SSR);
- Long Term Prediction (LTP), added in the MPEG-4 standard - an improvement of the MAIN profile using a forward predictor with lower computational complexity.
Depending on the AAC profile and the MP3 encoder, 96 kbit/s AAC can give perceptual quality nearly equal to or better than that of 128 kbit/s MP3.[1]
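As a rough illustration of what that bitrate difference means for storage, the sketch below computes the size of a constant-bitrate stream at both rates. The four-minute track length and the function name are assumptions made for the example; container overhead and variable bitrate encoding are ignored.

```python
def stream_size_bytes(bitrate_kbit_s, duration_s):
    """Size in bytes of a constant-bitrate stream (1 kbit = 1000 bits)."""
    return bitrate_kbit_s * 1000 * duration_s // 8

duration = 4 * 60                        # assumed track length: four minutes
aac = stream_size_bytes(96, duration)    # 2,880,000 bytes
mp3 = stream_size_bytes(128, duration)   # 3,840,000 bytes
saving = 1 - aac / mp3                   # 0.25, i.e. a quarter smaller
```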
AAC Low Delay
The MPEG-4 Low Delay Audio Coder (AAC-LD) is designed to combine the advantages of perceptual audio coding with the low delay necessary for two-way communication. It is closely derived from the MPEG-2 Advanced Audio Coding (AAC) format.
The most stringent requirements are a maximum algorithmic delay of only 20 ms and good audio quality for all kinds of audio signals, including speech and music. In this way, the AAC-LD coding scheme bridges the gap between speech coding schemes and high quality audio coding schemes.
Two-way communication with AAC-LD is possible on ordinary analog telephone lines and via ISDN connections. Unlike conventional speech coders, the codec is capable of coding both music and speech signals with good quality, and the achieved coding quality scales up with bitrate. Transparent quality can be achieved.
AAC-LD can also process stereo signals by using the advanced stereo coding tools of AAC. Thus it is possible to transmit a stereo signal with a bandwidth of 7 kHz via one ISDN line or with a bandwidth of 15 kHz via two ISDN lines.
Error protection toolkit
Applying error protection enables error correction up to a certain extent. Error correcting codes are usually applied equally to the whole payload.
But since different parts of an AAC payload show different sensitivity to transmission errors, this would not be a very efficient approach.
The AAC payload can be subdivided into parts with different error sensitivities. Independent error correcting codes can be applied to any of these parts using the Error Protection (EP) tool defined in MPEG-4 Audio. This provides error correcting capability to just the most sensitive parts of the payload, which keeps the additional overhead low.
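The idea of protecting only the sensitive parts can be illustrated with a small Python sketch. A simple repetition code stands in here for the real error correcting codes, and the split into "side_info" and "spectral_data", the bit patterns and the protection strengths are all hypothetical; none of this reflects the actual EP tool syntax.

```python
def repeat_encode(bits, n):
    """Repetition code: send every bit n times (n = 1 means unprotected)."""
    return [b for b in bits for _ in range(n)]

def repeat_decode(coded, n):
    """Majority vote over each group of n received bits."""
    return [1 if 2 * sum(coded[i:i + n]) > n else 0
            for i in range(0, len(coded), n)]

# Hypothetical split of one payload into sensitivity classes.
payload = {
    "side_info": [1, 0, 1, 1, 0, 0, 1, 0],   # assumed most sensitive
    "spectral_data": [0, 1] * 16,            # assumed least sensitive
}
strength = {"side_info": 3, "spectral_data": 1}

protected = {name: repeat_encode(bits, strength[name])
             for name, bits in payload.items()}

# Overhead comparison: unequal protection vs. protecting everything 3x.
unequal_bits = sum(len(v) for v in protected.values())   # 8*3 + 32 = 56
equal_bits = 3 * sum(len(v) for v in payload.values())   # 40*3 = 120

# A single bit error in the protected class is corrected by the majority vote.
received = list(protected["side_info"])
received[4] ^= 1
recovered = repeat_decode(received, strength["side_info"])
```

In this toy case the unequally protected payload needs 56 bits where uniform threefold protection would need 120, while a single bit error in the sensitive part is still corrected.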
Error Resilient (ER) AAC
Error Resilience (ER) techniques can be used to make the coding scheme itself more robust against errors. For AAC, three custom-tailored methods were developed and defined in MPEG-4 Audio:
- Huffman Codeword Reordering (HCR) to avoid error propagation within spectral data;
- Virtual Codebooks (VCB11) to detect serious errors within spectral data;
- Reversible Variable Length Code (RVLC) to reduce error propagation within scale factor data.
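The principle behind a reversible variable length code can be shown with a toy example in Python. The code table below is hypothetical and is not taken from MPEG-4 Audio; it only uses palindromic codewords, which makes the code decodable from either end of the bitstream. If an error corrupts the beginning of the data, decoding from the end can still recover the symbols that follow the error.

```python
# Hypothetical palindromic code table; not the tables defined in the standard.
CODEBOOK = {0: "0", 1: "11", 2: "101", 3: "1001"}
DECODE = {bits: symbol for symbol, bits in CODEBOOK.items()}

def encode(symbols):
    """Concatenate the codewords of all symbols into one bit string."""
    return "".join(CODEBOOK[s] for s in symbols)

def decode_forward(bitstring):
    """Read bits left to right, emitting a symbol whenever a codeword matches."""
    symbols, current = [], ""
    for bit in bitstring:
        current += bit
        if current in DECODE:
            symbols.append(DECODE[current])
            current = ""
    return symbols

def decode_backward(bitstring):
    """Read bits right to left; palindromic codewords allow reuse of the table."""
    return decode_forward(bitstring[::-1])[::-1]

bits = encode([0, 2, 1, 3])          # "0101111001"
damaged = "1" + bits[1:]             # first bit flipped by a transmission error
tail = decode_backward(damaged)      # [2, 1, 3]: symbols after the error survive
```

Forward decoding of the damaged stream loses synchronization immediately, whereas backward decoding still returns the last three symbols intact.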
AAC's improvements over MP3
Some of its advances:
- Sample frequencies from 8 kHz to 96 kHz (official MP3: 16 kHz to 48 kHz)
- Up to 48 channels (MP3 supports up to two channels in MPEG-1 mode and up to 5.1 channels in MPEG-2 mode)
- Higher efficiency and simpler filterbank (hybrid → pure MDCT)
- Higher coding efficiency for stationary signals (blocksize: 576 → 1024 samples)
- Higher coding efficiency for transient signals (blocksize: 192 → 128 samples)
- Can use a Kaiser-Bessel derived window function to reduce spectral leakage at the expense of widening the main lobe
- Much better handling of frequencies above 16 kHz
- More flexible joint stereo (separate for every scale band)
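The two block size figures in the list above translate into frequency and time resolution as follows. This is a back-of-the-envelope sketch that assumes a 44.1 kHz sampling rate and treats the block sizes as the number of spectral lines covering the band from 0 Hz to half the sampling rate; the function names are illustrative.

```python
SAMPLE_RATE = 44100  # assumed CD-quality sampling rate in Hz

def bin_spacing_hz(coefficients, sample_rate=SAMPLE_RATE):
    """Spacing of spectral lines when they cover 0 .. sample_rate / 2."""
    return sample_rate / (2 * coefficients)

def block_duration_ms(samples, sample_rate=SAMPLE_RATE):
    """Time span covered by a block of the given number of samples."""
    return 1000 * samples / sample_rate

# Stationary signals: finer frequency resolution with the longer block.
mp3_long = bin_spacing_hz(576)       # about 38.3 Hz per line
aac_long = bin_spacing_hz(1024)      # about 21.5 Hz per line

# Transient signals: finer time resolution with the shorter block.
mp3_short = block_duration_ms(192)   # about 4.35 ms
aac_short = block_duration_ms(128)   # about 2.90 ms
```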
The result is a specification that allows developers more flexibility to design codecs that offer more efficient compression than MP3. However, the advantages are not entirely decisive, and the MP3 specification, while outdated, has proven surprisingly robust. Although AAC and HE-AAC are far better than MP3 at very low bitrates, at medium to higher bitrates the two formats are more comparable. In the future, as developers learn to better exploit the AAC format, AAC is expected to gain additional ground and perhaps overtake MP3.
AAC ISO standard
AAC, which was first specified in the standard known formally as ISO/IEC 13818-7, was published in 1997 as a new "part" (distinct from ISO/IEC 13818-3) in the MPEG-2 family of international standards.
Licensing and patents
In contrast with the MP3 format, which requires royalty payments on distributed content, no licenses or payments are required to be able to stream or distribute content in AAC format. <ref>Via Licensing. MPEG-4 Audio Licensing FAQ Q6.</ref> This reason alone makes AAC a much more attractive format for distributing content, particularly streaming content (such as Internet radio).
However, a patent license is required for all manufacturers or developers of AAC codecs. <ref>Via Licensing. MPEG-4 Audio Licensing FAQ Q1.</ref> For this reason, FOSS encoders and decoders such as FAAC and FAAD [2] are distributed in source form only, in order to avoid patent infringement.
Products that support AAC
iTunes and iPod
In April 2003, Apple Computer brought mainstream attention to AAC by announcing that its iTunes and iPod products would support songs in MPEG-4 AAC format (via a firmware update for older iPods), and that customers could download popular songs in a copyright-protected form (see FairPlay) via the iTunes Store.
Apple added support for VBR encoding of AAC tracks in iTunes v5.0. It also added certain enhancements in higher-end iPods such as chapters (bookmarks that can incorporate web links and pictures set to appear at certain times during playback of audio books and podcasts) which are not features of AAC itself, but of the proprietary Apple file format that wraps the AAC bitstream.[citation needed]
Microsoft Zune
In September 2006, Microsoft unveiled its Zune media player. Supported formats include AAC.
Sony PlayStation 3
With the launch of the PlayStation 3 on November 11, 2006, Sony announced support for the AAC format for music (MPEG-4 AAC) and video (MPEG-4 SP (AAC LC), H.264/MPEG-4 AVC Main Profile (AAC LC), MPEG-2 PS (AAC LC)). This will allow for easy interoperability with the PSP system, which already supports AAC. Sony announced that AAC will be the default file format for audio CDs ripped to the PS3.[3]
Other media players
Almost all current computer media players include built-in decoders for AAC, or can utilize a library to decode it. On Microsoft Windows, DirectShow can be utilized this way with the corresponding filters to enable AAC playback in any DirectShow based player. Software player applications of particular note include:
- ffdshow, a free open source DirectShow filter for Microsoft Windows operating systems that uses FAAD2 to support AAC decoding;
- foobar2000, a freeware audio player for Windows that supports LC and HE AAC;
- Winamp for Windows, which includes an AAC encoder that supports LC and HE AAC;
- MPlayer and xine, which are often used as AAC decoders on Linux;
- RealPlayer, which includes RealNetworks's RealAudio 10 AAC encoder;
- VLC media player, which supports playback of MP4 files;
- Media Player Classic;
- XBMC (Xbox Media Center), which supports both AAC (LC and HE) and aacPlus on the Xbox game console;
- KSP Sound Player;
- Sony SonicStage, which supports AAC in version 4;
- The KMPlayer;
- XMB (XrossMediaBar) for the Sony PlayStation 3.
Nero Digital Audio
In May 2006, Nero AG released a free AAC encoding tool, Nero Digital Audio, which is capable of encoding LC, HE and HEv2 AAC streams. The tool is a Command Line Interface tool only, and a separate utility is included to decode to PCM WAV.
The foobar2000 audio player for Windows can provide a GUI for the encoder.
Other portable devices
For a number of years, many mobile (cell) phones from the big manufacturers such as Nokia, Motorola, Samsung, Sony Ericsson, and Philips have supported AAC playback. During 2005, interest in music on mobile phones increased dramatically. Many manufacturers announced dedicated music phones, such as the Philips 960, Sony Ericsson S700i, Sony Ericsson W600/Sony Ericsson W550, Sony Ericsson Z550, Sony Ericsson Z530i, Sony Ericsson K510i, Sony Ericsson K750i/Sony Ericsson W800, Sony Ericsson W850i, Sony Ericsson W810, Sony Ericsson W900i, Sony Ericsson M600i, Sony Ericsson K800i, Nokia N91, Nokia 3250, Nokia 3300, Nokia N70, Nokia 6270, Nokia 6682, Samsung SGH-i300, Motorola ROKR E1, Motorola RAZR V3i, Motorola RAZR V3x, Motorola SLVR L7, Motorola RAZR V360, Siemens M75, Siemens CX75 and Siemens EL71, all with AAC playback as standard. This trend towards supporting AAC continues with the ever increasing number of advanced phones on the market today, with most high-end phone models capable of AAC playback. Currently, sites such as www.bdmobile.tk provide free AAC-formatted music for mobiles like these. (Caveat emptor: although some of these portable devices may be able to play back MP4 AAC, many do not support the proprietary and unpublished iTunes tags, and will be unable to read and display metadata embedded within the files. In some cases, the latest firmware updates add iTunes compatibility (e.g. Sony Ericsson Walkman phones).)
Also, the PlayStation Portable has had support for MP4 AAC files since the version 2.0 firmware update (released August 2005), but initially for files with a .mp4 extension only, meaning .m4a files would need to be renamed. This changed with the 2.7 firmware update (released April 2006), which can play MP4 AAC files having either extension. Other Sony products, including the A and E series Network Walkmans, support AAC with firmware updates (released May 2006).
Epson supports AAC playback in the P-2000 and P-4000 Multimedia/Photo Storage Viewers. This support is not available with their older models, however.
All Palm PDAs running Palm OS 5 or greater can play back .m4a AAC files with 3rd-party software such as Kinoma Player 3 EX, AeroPlayer or the TCPMP with its AAC plug-in. PocketPCs are also fully capable when running software like the Pocket PC version of TCPMP with its AAC plug-in.
The Sony Reader portable eBook plays M4A files containing AAC, and displays metadata created by iTunes.
High-end models in Pioneer's latest range of car stereo head units also include AAC decoding support alongside MP3 and WMA. Products like the DEH-P75BT will play back .m4a files recorded onto CD in a data format.
Extensions and improvements
Some extensions have been added to the original AAC standard:
- MPEG-4 Scalable To Lossless (SLS);
- High Efficiency AAC (HE-AAC), a.k.a. aacPlus v1 or AAC+ - the combination of SBR (Spectral Band Replication) and AAC; used for low bitrates;
- HE-AAC v.2, a.k.a. aacPlus v2 - the combination of Parametric Stereo (PS) and HE-AAC;
- Perceptual Noise Substitution (PNS);
- Long Term Predictor (LTP) - added in MPEG-4 Part 3.
External links
- Apple's page on MPEG-4 AAC
- EE Times article on AAC
- Fraunhofer MPEG-2 AAC Information
- MPEG Industry Forum: MPEG-4 Licensing Information
- Via Licensing MPEG-4 Audio FAQ
- Coding Technologies AAC SBR+PS Information
- AAC Licensing
- Open Source AAC codec libraries FAAD2 (Free Advanced Audio Decoder) decoder and FAAC (Free Advanced Audio Coding) encoder.
- Roberto's public listening tests - blind, controlled listening tests of lossy compression formats including AAC
- List of AAC and aacPlus resources
- Nero's encoder download (free)
Notes
<references/>



