Examples of audio coding
- Transform coding with time-domain aliasing cancellation based on an alternate application of 512-point modified discrete sine and cosine transforms (MDCT and MDST)
- frequency resolution 93.75 Hz
- The 512-point transform is computed every 256 samples (50 % overlap)
- if necessary, pre-echoes can be prevented by reducing the block size to 256 points
- Adjacent transform coefficients are then grouped into subband ranges which approximate the critical bands of human hearing.
- From the absolute values of the transform coefficients within such subbands a log-spectral envelope of the current block is estimated and used for a dynamic bit allocation routine.
- The coder uses no perceptual model except the critical band division.
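The 50 % overlap scheme above can be illustrated with a small sketch. Note the substitution: it uses the MDCT alone with a sine window, a common formulation of time-domain aliasing cancellation, rather than the alternating MDCT/MDST pair described above. All function names are hypothetical, and the 48 kHz sampling rate is an assumption, chosen because 48000 / 512 gives the quoted 93.75 Hz.

```python
import math

def sine_window(block_len):
    """Sine window; satisfies w[n]^2 + w[n + block_len/2]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / block_len) for n in range(block_len)]

def mdct(block, window):
    """Windowed MDCT: 2M input samples -> M coefficients."""
    m = len(block) // 2
    return [
        sum(window[n] * block[n]
            * math.cos(math.pi / m * (n + 0.5 + m / 2) * (k + 0.5))
            for n in range(2 * m))
        for k in range(m)
    ]

def imdct(coeffs, window):
    """Windowed inverse MDCT: M coefficients -> 2M aliased samples."""
    m = len(coeffs)
    return [
        window[n] * (2.0 / m)
        * sum(coeffs[k] * math.cos(math.pi / m * (n + 0.5 + m / 2) * (k + 0.5))
              for k in range(m))
        for n in range(2 * m)
    ]

def analyze_synthesize(signal, block_len):
    """Transform every block_len/2 samples, then overlap-add the inverses."""
    hop = block_len // 2
    window = sine_window(block_len)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - block_len + 1, hop):
        coeffs = mdct(signal[start:start + block_len], window)
        for n, value in enumerate(imdct(coeffs, window)):
            out[start + n] += value  # aliasing cancels where blocks overlap
    return out

def bin_spacing_hz(sample_rate_hz, block_len):
    """Frequency resolution quoted as sample rate divided by block length."""
    return sample_rate_hz / block_len
```

Samples covered by two overlapping blocks are reconstructed up to rounding error; the first and last half block are not, because they lack an overlapping neighbor. A small block length such as 16 keeps this pure-Python version fast; 512 works the same way, only slower.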
- AT&T's Perceptual Audio Coder (PAC) extends the idea of perceptual coding to stereo pairs.
- Transform coding
- PAC uses both L/R (left/right) and M/S (sum/difference) stereo coding, switched both in frequency and time in a signal-dependent fashion.
- In M/S stereo coding, the sum and difference signals (L+R and L-R) are coded instead of left and right signals.
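The sum/difference mapping can be written out in a few lines. This is a minimal sketch using the L+R and L-R definitions given above; the function names are hypothetical, and the signal-dependent switching between L/R and M/S is not modeled.

```python
def ms_encode(left, right):
    """Map left/right samples to sum (M = L + R) and difference (S = L - R)."""
    mid = [l + r for l, r in zip(left, right)]
    side = [l - r for l, r in zip(left, right)]
    return mid, side

def ms_decode(mid, side):
    """Invert the mapping: L = (M + S) / 2, R = (M - S) / 2."""
    left = [(m + s) / 2 for m, s in zip(mid, side)]
    right = [(m - s) / 2 for m, s in zip(mid, side)]
    return left, right
```

When the two channels are nearly identical, the difference signal is close to zero, which is the situation in which M/S coding is expected to pay off.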
- Hybrid (subband/transform) coding
- Used in MiniDisc
- In ATRAC the signal is first split into three subbands (0-5.5, 5.5-11.0, and 11.0-22.0 kHz).
- A modified DCT with 50 % overlap is then applied to each subband, combined with dynamic window switching.
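The three band edges have a 1:1:2 bandwidth ratio, which is what a two-stage tree of half-band splits produces. The sketch below shows that structure only. It is an assumption, since the text does not say how the split is done, and the two-tap Haar filter pair is a stand-in for a real filterbank, chosen because it reconstructs perfectly in a few lines. All names are hypothetical.

```python
import math

_S = 1.0 / math.sqrt(2.0)

def haar_split(x):
    """Split into half-rate low and high bands (Haar pair as a stand-in)."""
    low = [(x[i] + x[i + 1]) * _S for i in range(0, len(x) - 1, 2)]
    high = [(x[i] - x[i + 1]) * _S for i in range(0, len(x) - 1, 2)]
    return low, high

def haar_merge(low, high):
    """Inverse of haar_split."""
    out = []
    for l, h in zip(low, high):
        out.extend([(l + h) * _S, (l - h) * _S])
    return out

def three_band_split(x):
    """Stage 1 halves the spectrum; stage 2 halves the lower half again."""
    lower_half, high = haar_split(x)
    low, mid = haar_split(lower_half)
    return low, mid, high

def three_band_merge(low, mid, high):
    """Undo the two stages in reverse order."""
    return haar_merge(haar_merge(low, mid), high)
```

For an input whose length is a multiple of four, the low and mid bands each carry a quarter of the samples and the high band carries half, mirroring the 5.5 / 5.5 / 11 kHz widths.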
- The AC-3 multichannel coder is a transform coder with Dolby's AC-2 filterbank at its core.
- The frequency resolution is 93.75 Hz.
- Bit allocation can occur down to the individual transform coefficient level, with neighboring coefficients obtaining different allocations.
- AC-3 employs a core backward-adaptive bit allocation routine which runs in both the encoder and the decoder
- bit allocation routine controlled by a log-domain spectral envelope
- based on a psychoacoustical model
- certain parameters are sent explicitly to the decoder, allowing the encoder to readjust the psychoacoustic model used in the decoder
- If the encoder uses an improved bit allocation routine it can compare its results with those of the core routine used by the decoder and send correction information to fine-tune the decoder's bit allocation. The final bit allocations of the encoder and decoder are, of course, identical.
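The backward-adaptive idea can be sketched as follows: both sides run the same deterministic routine on the transmitted envelope, so only corrections need to be sent. The greedy rule below (one bit lowers the remaining demand by about 6.02 dB) is a generic textbook stand-in, not the actual AC-3 routine, and every name here is hypothetical.

```python
def core_allocation(envelope_db, total_bits):
    """Deterministic core routine, run identically in encoder and decoder."""
    bits = [0] * len(envelope_db)
    for _ in range(total_bits):
        # Give the next bit to the band with the largest remaining demand;
        # ties resolve to the lowest index, so both sides always agree.
        k = max(range(len(bits)), key=lambda i: envelope_db[i] - 6.02 * bits[i])
        bits[k] += 1
    return bits

def encoder_corrections(envelope_db, total_bits, improved_bits):
    """Encoder side: difference between its improved result and the core result."""
    core = core_allocation(envelope_db, total_bits)
    return [imp - c for imp, c in zip(improved_bits, core)]

def decoder_allocation(envelope_db, total_bits, corrections):
    """Decoder side: rerun the core routine, then apply the corrections."""
    core = core_allocation(envelope_db, total_bits)
    return [c + d for c, d in zip(core, corrections)]
```

If the encoder has nothing better than the core result, the corrections are all zero and nothing beyond the envelope is needed to reproduce the allocation.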
- Masking is exploited during periods of high bit demand to reduce the overall bit rate.
- It takes advantage of the fact that the ear is not able to independently detect the direction of two high frequency signals which are close in frequency.
=> A number of individual channel transform coefficients can be combined into a common coefficient.
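A rough sketch of combining coefficients across channels is given below. The averaging rule and the per-channel energy scale factors are assumptions made for illustration; the text above only states that several channel coefficients are merged into a common one. The names couple, decouple and start_bin are hypothetical.

```python
import math

def couple(channels, start_bin):
    """Keep bins below start_bin per channel; merge the rest into one set."""
    n = len(channels[0])
    common = [sum(ch[k] for ch in channels) / len(channels)
              for k in range(start_bin, n)]
    common_energy = sum(c * c for c in common)
    scales = []
    for ch in channels:
        energy = sum(v * v for v in ch[start_bin:])
        # Scale factor that restores this channel's high-band energy.
        scales.append(math.sqrt(energy / common_energy) if common_energy > 0 else 0.0)
    lows = [list(ch[:start_bin]) for ch in channels]
    return lows, common, scales

def decouple(lows, common, scales):
    """Rebuild each channel from its own low bins and the scaled common bins."""
    return [low + [s * c for c in common] for low, s in zip(lows, scales)]
```

The saving comes from sending one set of high-frequency coefficients plus one scale factor per channel instead of a full set per channel. The low bins are untouched, and only the high-band energy of each channel is preserved, not its exact waveform.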
- Dolby AC-3 is a 5.1 multichannel audio system that encodes the channels separately, i.e., without compatibility matrixing, thus avoiding "unmasking" artifacts.