Codecs
To send the large amount of data
necessary for smooth audio and video, some type of compression
must be used. Fortunately, there are many ways to do this well.
A procedure for compressing and decompressing the audio and video
used in a videoconference is called a codec (short for coder/decoder).
Here is a list of some codecs and how they work:
Audio Codecs:
A-Law, mu-Law PCM
PCM (Pulse
Code Modulation) codecs encode the amplitude of each sample
as a word (a unit of memory of specified size which varies from computer
to computer). In order to compress the data, the amplitude values
are quantized. A-Law and mu-Law improve on this by spacing the quantization
levels logarithmically. This allows for a greater compression ratio,
and it also means that quiet sounds are encoded more accurately than loud sounds.
The G.711 ITU recommendation includes
both A-Law and mu-Law PCM encoding, and requires 64Kbps of bandwidth.
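The effect of logarithmic quantization can be seen in a minimal sketch. It uses the continuous mu-Law formula with mu = 255 and a simple rounding step; real G.711 encoders use a segmented 8-bit approximation of this curve, so the function names and the exact code values here are illustrative assumptions only. The 64Kbps figure follows from 8000 samples per second at 8 bits each.

```python
import math

MU = 255                 # companding constant for mu-Law
SAMPLE_RATE_HZ = 8000    # G.711 sampling rate
BITS_PER_SAMPLE = 8      # G.711 code size

def mulaw_compress(x: float) -> float:
    """Map a sample in [-1, 1] onto the logarithmic mu-Law curve."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mulaw_expand(y: float) -> float:
    """Inverse of mulaw_compress."""
    return math.copysign(((1 + MU) ** abs(y) - 1) / MU, y)

def encode(x: float) -> int:
    """Quantize the companded value to a signed code in -127..127 (illustrative)."""
    return round(mulaw_compress(x) * 127)

def decode(code: int) -> float:
    """Recover an approximate sample from a code."""
    return mulaw_expand(code / 127)

def step_near(x: float) -> float:
    """Gap between the decoded values of two neighbouring codes near x."""
    code = encode(x)
    return abs(decode(code + 1) - decode(code))

bitrate = SAMPLE_RATE_HZ * BITS_PER_SAMPLE  # bits per second
```

Comparing `step_near(0.01)` with `step_near(0.9)` shows that the quantization step is much finer for quiet samples than for loud ones, which is the trade-off described above.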
ADPCM
ADPCM (Adaptive
Differential PCM) works in a similar manner to PCM encoding.
However, instead of encoding the amplitude of each sample, ADPCM methods
encode the difference between samples. This allows greater compression
ratios than PCM encoding. Many current ITU recommendations use ADPCM encoding: G.721,
G.722, G.723, G.726, and G.727. G.722 in particular is interesting
in that it has an adjustable bandwidth requirement; it can work with 64,
56, or 48Kbps of bandwidth.
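The idea of encoding differences with an adapting step size can be sketched as follows. This is a simplified illustration, not the algorithm of any G.72x recommendation: the 4-bit code range, the starting step of 16, and the adaptation factors 1.5 and 0.9 are all assumptions chosen for readability.

```python
import math

def _adapt(step: float, code: int) -> float:
    """Grow the step after large codes, shrink it after small ones."""
    factor = 1.5 if abs(code) >= 4 else 0.9
    return min(max(step * factor, 1.0), 2048.0)

def adpcm_encode(samples):
    """Encode each sample as a small code for its difference from the prediction."""
    predicted, step, codes = 0.0, 16.0, []
    for s in samples:
        # Codes in -7..7 fit in 4 bits, versus 16 bits for the raw sample.
        code = max(-7, min(7, round((s - predicted) / step)))
        codes.append(code)
        predicted += code * step      # track what the decoder will reconstruct
        step = _adapt(step, code)
    return codes

def adpcm_decode(codes):
    """Rebuild the signal by accumulating the scaled differences."""
    predicted, step, out = 0.0, 16.0, []
    for code in codes:
        predicted += code * step
        out.append(predicted)
        step = _adapt(step, code)
    return out

# A 16-bit-range sine wave as stand-in audio.
samples = [round(8000 * math.sin(2 * math.pi * i / 50)) for i in range(200)]
codes = adpcm_encode(samples)
decoded = adpcm_decode(codes)
```

Because the encoder updates its prediction with the same arithmetic the decoder uses, the two stay in step without any side information being sent. The first few decoded samples lag behind the input while the step size grows to match the signal.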
LPC, CELP
LPC (Linear
Predictive Coding) and CELP (Code Excited Linear
Prediction) both take a different approach to compressing audio signals.
Because they were designed specifically to transmit voice, both encode
their signals using a model of human speech. This
allows for very low bandwidth requirements. CELP (while it requires
slightly higher bandwidth) allows for higher quality audio by also encoding the
error between what the model produces and the original signal. The GSM
(Groupe Spécial Mobile) codec, designed originally for
cell phone use, is a slightly modified version of LPC coding which requires
13Kbps of bandwidth. The G.728 ITU
recommendation uses a variation on CELP, and requires only 16Kbps; however,
it is very computationally complex, so it requires special hardware for
decoding.
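The linear prediction at the heart of these codecs can be illustrated with a toy example. The sketch below fits a second-order predictor by least squares and computes the leftover error (the residual). Real LPC and CELP codecs use higher-order predictors fitted per short frame of speech, and CELP encodes the residual with a codebook; the order of 2, the function names, and the sine-wave test signal are assumptions made to keep the example small.

```python
import math

def fit_predictor(x):
    """Least-squares fit of x[n] ~ a1*x[n-1] + a2*x[n-2]."""
    idx = range(2, len(x))
    r11 = sum(x[n - 1] * x[n - 1] for n in idx)
    r12 = sum(x[n - 1] * x[n - 2] for n in idx)
    r22 = sum(x[n - 2] * x[n - 2] for n in idx)
    b1 = sum(x[n] * x[n - 1] for n in idx)
    b2 = sum(x[n] * x[n - 2] for n in idx)
    # Solve the 2x2 normal equations by Cramer's rule.
    det = r11 * r22 - r12 * r12
    a1 = (b1 * r22 - b2 * r12) / det
    a2 = (b2 * r11 - b1 * r12) / det
    return a1, a2

def residual(x, a1, a2):
    """Error between each sample and its prediction from the two before it."""
    return [x[n] - a1 * x[n - 1] - a2 * x[n - 2] for n in range(2, len(x))]

# A pure tone is perfectly predictable from its two previous samples.
signal = [math.sin(0.3 * n) for n in range(100)]
a1, a2 = fit_predictor(signal)
errors = residual(signal, a1, a2)
```

For this signal the two coefficients describe the whole waveform and the residual is essentially zero, which is why a predictor plus a small error signal can need far less bandwidth than the samples themselves.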
Video Codecs:
H.261
H.261 is the baseline
recommended video codec for use with the H.323 protocol. All H.323 compliant
programs must support the H.261 video codec, which makes it a safe choice for interoperability.
It utilizes both intraframe (within one frame) and interframe (between
frames) encoding. The intraframe step takes 8x8 blocks of pixels and
quantizes their data based on an adjustable table. The table is adjusted
at run time in order to maintain constant data flow. The interframe
encoding calculates the difference between the current frame and the previous
frame; it then compresses and sends this difference. Because of the
adjustable quantization table, H.261 is able to maintain video smoothness
through bandwidth problems. However, more motion in the video means
higher bandwidth requirements.
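Two ideas from this section, an adjustable quantizer and sending only frame differences, can be sketched on a single 8x8 block. This is a rough illustration under stated assumptions: real H.261 applies a DCT to each block before quantizing and also uses motion compensation, both omitted here, and the single `step` value stands in for the adjustable table.

```python
def frame_difference(current, previous):
    """Per-pixel difference between a block and the same block one frame earlier."""
    return [[c - p for c, p in zip(cur_row, prev_row)]
            for cur_row, prev_row in zip(current, previous)]

def quantize(block, step):
    """Divide by the step and round; a coarser step leaves fewer nonzero values."""
    return [[round(v / step) for v in row] for row in block]

def dequantize(levels, step):
    """Approximate reconstruction of the quantized values."""
    return [[q * step for q in row] for row in levels]

def nonzero_count(levels):
    """Rough proxy for how much data must be sent."""
    return sum(1 for row in levels for q in row if q != 0)

# Hypothetical 8x8 block in which two pixels changed between frames.
previous = [[r * 8 + c for c in range(8)] for r in range(8)]
current = [row[:] for row in previous]
current[2][3] += 40
current[5][5] += 6

diff = frame_difference(current, previous)
fine = quantize(diff, 4)      # small step: more detail, more data
coarse = quantize(diff, 16)   # large step: less detail, less data
```

Raising the step when bandwidth is scarce drops the small change entirely and keeps only an approximation of the large one, which is how an encoder can trade image detail for a steady data rate.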
H.263
H.263 is the successor
to H.261. It is supported by some H.323 compliant programs, and works
in a very similar manner to H.261. However, it supports larger frame
sizes (up to 1408x1152 versus H.261's 352x288), and is capable of higher video
quality with the same available bandwidth. In general, H.263 should
be used over H.261 if it is supported, since it can simultaneously boost image
quality and lower bandwidth requirements.
nv
nv (Network
Video) was developed at Xerox PARC, and is the compression algorithm
most often used with MBone videoconferencing systems. It, like H.261,
uses both intraframe and interframe encoding. However, its interframe
encoding works differently. It compares the current frame to the previous
frame, and only transmits data for the sections of the frame that have
changed. This data is compressed using one of two algorithms. The
algorithm is selected at run time depending on whether the bottleneck is
local computation or network bandwidth. Unchanged sections of the video
are occasionally resent to prevent loss. nv is capable of very high
compression ratios.
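The scheme of sending only changed sections, with an occasional resend of unchanged ones, can be sketched as below. This is not nv's actual code: the 8x8 block size, the sum-of-absolute-differences test, the threshold, and the one-block-per-frame round-robin refresh are all assumptions for illustration, and the compression of the selected blocks is left out.

```python
def block_origins(frame, block=8):
    """Top-left (row, col) corner of every block in the frame."""
    height, width = len(frame), len(frame[0])
    return [(r, c) for r in range(0, height, block)
            for c in range(0, width, block)]

def changed_blocks(previous, current, block=8, threshold=0):
    """Blocks whose summed absolute pixel difference exceeds the threshold."""
    changed = []
    for r, c in block_origins(current, block):
        rows = range(r, min(r + block, len(current)))
        cols = range(c, min(c + block, len(current[0])))
        diff = sum(abs(current[y][x] - previous[y][x])
                   for y in rows for x in cols)
        if diff > threshold:
            changed.append((r, c))
    return changed

def blocks_to_send(previous, current, frame_no, block=8, threshold=0):
    """Changed blocks plus one refresh block chosen round-robin per frame."""
    origins = block_origins(current, block)
    send = set(changed_blocks(previous, current, block, threshold))
    send.add(origins[frame_no % len(origins)])
    return sorted(send)

# Hypothetical 16x16 frame in which a single pixel changed.
previous = [[0] * 16 for _ in range(16)]
current = [row[:] for row in previous]
current[9][10] = 50
```

Here only the block containing the changed pixel is selected, plus whichever block is due for refresh, so a receiver that missed an earlier packet eventually gets every part of the picture again.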
Many of the details of specific codecs on this page come
from http://www2.ncsu.edu/eos/service/ece/project/succeed_info/larettin/thesis/ch2.htm