Codecs

To send the large amount of data necessary for smooth audio and video, some form of compression must be used, and there are many ways to do it well. A procedure for compressing and decompressing the audio and video used in a videoconference is called a codec (short for coder/decoder). Here is a list of some codecs and how they work:

Audio Codecs:

A-Law, mu-Law PCM

PCM (Pulse Code Modulation) codecs encode the amplitude of each sample as a word (a unit of memory whose size varies from computer to computer). To compress the data, the amplitude values are quantized, that is, rounded to one of a limited set of levels. A-Law and mu-Law improve on this by spacing the quantization levels logarithmically. This allows a greater compression ratio, and it also means that quiet sounds are encoded more accurately than loud sounds. The G.711 ITU recommendation includes both A-Law and mu-Law PCM encoding, and requires 64 kbps of bandwidth (8,000 samples per second at 8 bits per sample).
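The effect of logarithmic quantization can be illustrated with a short sketch. It uses the continuous mu-Law formula with mu = 255 on samples scaled to the range -1 to 1. G.711 itself uses a segmented approximation of this curve, so the function names and the exact codes produced here are illustrative assumptions rather than the standard's bit layout.

```python
import math

MU = 255  # companding parameter commonly used for mu-Law


def mulaw_compress(x: float) -> float:
    """Map a sample in [-1, 1] through the continuous mu-Law curve."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mulaw_expand(y: float) -> float:
    """Inverse of mulaw_compress."""
    return math.copysign(((1 + MU) ** abs(y) - 1) / MU, y)


def encode(x: float, bits: int = 8) -> int:
    """Quantize the companded sample to a signed integer code."""
    levels = 2 ** (bits - 1) - 1
    return round(mulaw_compress(x) * levels)


def decode(code: int, bits: int = 8) -> float:
    """Recover an approximate sample from a code."""
    levels = 2 ** (bits - 1) - 1
    return mulaw_expand(code / levels)


# Compare the round-trip error of a quiet sample and a loud sample.
quiet_error = abs(decode(encode(0.01)) - 0.01)
loud_error = abs(decode(encode(0.9)) - 0.9)
print(quiet_error, loud_error)

# Bit rate for 8-bit codes at 8,000 samples per second.
print(8000 * 8)
```

The quiet sample comes back with a much smaller absolute error than the loud one, which is the trade-off described above.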

ADPCM

ADPCM (Adaptive Differential PCM) works in a similar manner to PCM encoding. However, instead of encoding the amplitude of each sample, ADPCM methods encode the difference between successive samples, which is usually a much smaller number than the amplitude itself. This allows greater compression ratios than PCM encoding. Many current ITU recommendations use ADPCM encoding: G.721, G.722, G.723, G.726, and G.727. G.722 in particular is interesting in that it has an adjustable bandwidth requirement; it can work with 64, 56, or 48 kbps of bandwidth.
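A minimal sketch of the idea follows. It is not the algorithm of any ITU recommendation: the step-size adaptation rule, the 4-bit code width, and the function names are assumptions chosen to keep the example short. What it does show is the general pattern of encoding small differences and letting encoder and decoder adapt the step size in lockstep.

```python
def _adapt(step, code, max_code):
    """Hypothetical adaptation rule: grow the step after large codes, shrink after small ones."""
    if abs(code) >= max_code - 1:
        return min(step * 2, 2048)
    if abs(code) <= 1:
        return max(step // 2, 1)
    return step


def adpcm_encode(samples, bits=4, step=4):
    """Encode integer samples as small signed codes of the prediction error."""
    max_code = 2 ** (bits - 1) - 1
    predicted = 0
    codes = []
    for sample in samples:
        diff = sample - predicted
        code = max(-max_code, min(max_code, round(diff / step)))
        codes.append(code)
        # Track the value the decoder will reconstruct, not the true sample.
        predicted += code * step
        step = _adapt(step, code, max_code)
    return codes


def adpcm_decode(codes, bits=4, step=4):
    """Rebuild samples by accumulating the scaled differences."""
    max_code = 2 ** (bits - 1) - 1
    predicted = 0
    samples = []
    for code in codes:
        predicted += code * step
        samples.append(predicted)
        step = _adapt(step, code, max_code)
    return samples


original = [0, 3, 7, 12, 15, 14, 10, 5]
codes = adpcm_encode(original)
print(codes)
print(adpcm_decode(codes))
```

Each code fits in 4 bits, yet the decoded signal stays close to the original because the differences between neighboring samples are small.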

LPC, CELP

LPC (Linear Predictive Coding) and CELP (Code Excited Linear Prediction) take a different approach to compressing audio signals. Both were designed specifically to transmit voice, and both take advantage of this by encoding their signals with a model of human speech. This allows for very low bandwidth requirements. CELP (while it requires slightly higher bandwidth) allows for higher quality audio by also encoding the error between what the model produces and what should be produced. The GSM (Groupe Spécial Mobile) codec, designed originally for cell phone use, is a slightly modified version of LPC coding which requires 13 kbps of bandwidth. The G.728 ITU recommendation uses a variation on CELP, and requires only 16 kbps; however, it is very computationally complex, so it requires special hardware for decoding.
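The "linear prediction" part of these codecs can be sketched on its own. The example below estimates predictor coefficients with the autocorrelation method and the Levinson-Durbin recursion, then computes the prediction error (the residual). It is only an illustration: real LPC and CELP codecs add framing, pitch modeling, quantization, and (for CELP) a codebook search, none of which appears here, and the function names are assumptions.

```python
import math


def autocorr(x, lag):
    """Autocorrelation of x at the given lag."""
    return sum(x[i] * x[i - lag] for i in range(lag, len(x)))


def lpc_coefficients(x, order):
    """Levinson-Durbin recursion.

    Returns a[1..order] such that x[n] is approximated by
    sum(a[k] * x[n - k] for k in 1..order).
    """
    r = [autocorr(x, k) for k in range(order + 1)]
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1 - k * k
    return a[1:]


def residual(x, coeffs):
    """Prediction error for every sample that has a full history."""
    order = len(coeffs)
    return [
        x[n] - sum(coeffs[k] * x[n - 1 - k] for k in range(order))
        for n in range(order, len(x))
    ]


signal = [math.sin(0.3 * n) for n in range(200)]
coeffs = lpc_coefficients(signal, order=2)
error = residual(signal, coeffs)
print(coeffs)
print(sum(e * e for e in error) / sum(s * s for s in signal))
```

For a predictable signal such as this sine wave, the residual carries only a tiny fraction of the signal's energy, which is why transmitting model parameters plus a coarse description of the error takes so little bandwidth.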

Video Codecs:

H.261

H.261 is the baseline recommended video codec for use with the H.323 protocol. All H.323 compliant programs must support the H.261 video codec, which makes it a safe choice when interoperability matters. It uses both intraframe (within one frame) and interframe (between frames) encoding. The intraframe step takes 8x8 blocks of pixels and quantizes their data based on an adjustable table. The table is adjusted at run time in order to maintain constant data flow. The interframe step calculates the difference between the previous frame and the current frame; it then compresses and sends this difference. Because of the adjustable quantization table, H.261 is able to maintain video smoothness through bandwidth problems, at the cost of image detail. However, more motion in the video means higher bandwidth requirements.
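The run-time adjustment of the quantizer can be sketched as follows. This is a simplification under stated assumptions: H.261 itself applies a discrete cosine transform to each block before quantizing, whereas this sketch quantizes the block values directly, uses a single step size instead of a table, and measures "data flow" as the number of nonzero values left in the block. The function names and the budget parameter are hypothetical.

```python
def quantize_block(block, step):
    """Divide every value by the step size, truncating toward zero."""
    return [[int(v / step) for v in row] for row in block]


def dequantize_block(quantized, step):
    """Approximate the original values from the quantized ones."""
    return [[v * step for v in row] for row in quantized]


def pick_step(block, max_nonzero, steps=range(2, 64, 2)):
    """Return the smallest step whose output fits the budget of nonzero values."""
    for step in steps:
        quantized = quantize_block(block, step)
        nonzero = sum(v != 0 for row in quantized for v in row)
        if nonzero <= max_nonzero:
            break
    return step, quantized


# An 8x8 block whose values fall off away from the top-left corner.
block = [[64 // (1 + r + c) for c in range(8)] for r in range(8)]

generous_step, _ = pick_step(block, max_nonzero=40)
tight_step, tight_q = pick_step(block, max_nonzero=10)
print(generous_step, tight_step)
```

A tighter budget forces a coarser step, so less data is sent and more detail is lost. This is the mechanism that lets the encoder trade image quality for a steady bit rate.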

H.263

H.263 is the successor to H.261. It is supported by some H.323 compliant programs, and works in a very similar manner to H.261. However, it supports larger picture sizes (up to 1408x1152 versus H.261's 352x288), and is capable of higher video quality at the same available bandwidth. In general, H.263 should be used over H.261 if it is supported, since it can deliver better image quality, lower bandwidth requirements, or some of each.

nv

nv (Network Video) was developed at Xerox PARC, and is the compression algorithm most often used with MBone videoconferencing systems. It, like H.261, uses both intraframe and interframe encoding. However, its interframe encoding works differently. It compares the current frame to the previous frame, and only transmits data for the sections of the frame that have changed. This data is compressed using one of two algorithms. The algorithm is selected at run time depending on whether the bottleneck is local computation or network bandwidth. Unchanged sections of the video are occasionally resent so that a receiver can recover from lost packets. nv is capable of very high compression ratios.
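Sending only the changed sections can be sketched with a few helper functions. This is not nv's actual implementation: the 8x8 tile size, the sum-of-absolute-differences test, the threshold, and all function names are assumptions, and the compression of the changed tiles is left out entirely.

```python
def changed_blocks(prev, curr, block=8, threshold=0):
    """Return the (row, col) origin of every tile that differs between frames."""
    changed = []
    height, width = len(curr), len(curr[0])
    for r in range(0, height, block):
        for c in range(0, width, block):
            diff = sum(
                abs(curr[y][x] - prev[y][x])
                for y in range(r, min(r + block, height))
                for x in range(c, min(c + block, width))
            )
            if diff > threshold:
                changed.append((r, c))
    return changed


def extract_updates(curr, origins, block=8):
    """Collect the pixel data for each changed tile (what the sender would transmit)."""
    return [
        (r, c, [row[c:c + block] for row in curr[r:r + block]])
        for r, c in origins
    ]


def apply_updates(frame, updates):
    """Receiver side: paste the received tiles over the previous frame."""
    out = [row[:] for row in frame]
    for r, c, tile in updates:
        for dy, tile_row in enumerate(tile):
            out[r + dy][c:c + len(tile_row)] = tile_row
    return out


previous = [[0] * 16 for _ in range(16)]
current = [row[:] for row in previous]
current[9][3] = 200  # a single pixel changes

origins = changed_blocks(previous, current)
updates = extract_updates(current, origins)
print(origins)
print(apply_updates(previous, updates) == current)
```

Only one of the four tiles is transmitted here. Periodically resending unchanged tiles, as described above, would let a receiver that missed an update catch up.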

Many of the details of specific codecs on this page come from http://www2.ncsu.edu/eos/service/ece/project/succeed_info/larettin/thesis/ch2.htm