Streaming Audio Primer (Part 4)
Overview
The next major step is to compress that
whopping 250 Mbyte WAV file into a 6.5 Mbyte file, or maybe an
even smaller 2 Mbyte file. How do you do such wonderful things?
Simple: use either an MP3 or a VQF encoder. You may be asking,
"That sounds nice, but which one should I use, MP3 or VQF, and
what's the difference?" That's a good question. Here are a few
thoughts to help you decide.
MP3 versus VQF
First you must decide which audio
compression/encoding format you will use. Both have the same
basic idea: analyze the frequency content of the audio and
compress it by encoding only the audible content. Remarkably,
this can reduce file size by 10:1 with virtually no loss in
quality. And if you are willing to sacrifice quality, you can
achieve compression of almost 100:1 at the extreme. So what's
the difference?
- MP3:
The MP3 format, short for MPEG Audio Layer 3, has been
around for a while. It is the audio part of the standard
MPEG family of formats used to compress both audio and
video: Layer 3 was defined in MPEG-1 and extended to lower
sampling rates in MPEG-2. Most video capture boards save
files in MPEG format. MP3 is by far the most popular audio
compression format. Many expect it to take over the music
business and revolutionize how we listen to music. Already
many companies sell handheld or wristwatch players that play
MP3 files from flash ROMs instead of CDs. In general (except
at really low sampling rates), MP3 has slightly better
quality than VQF. Also, there is an abundance of players,
encoders, and other related software for the MP3 format
because of its immense popularity. Check out the web site of
the guys who invented the MP3 format and are now busy on MP4.
- VQF:
The newcomer to the audio compression scene, VQF has yet to
reach the popularity of MP3. However, it has advantages that
may cause it to eventually topple MP3. It has slightly
better compression than MP3 (maybe 5% to 10%). Also, the
free encoder has an 11,000 Hz (11 kHz) / 8,000 bps (8 kbits
per second) setting that produces a phenomenally small file
of only 2 Mbytes. Although the quality is significantly lower
than at higher bit rates (such as 11 kHz or 12 kHz at 16 or
20 kbps), it may be an option for those who have limited web
space or who worry that larger file sizes will discourage
people from downloading the file. The MP3 encoders also
support this low sampling rate, but surprisingly have much
worse quality at this particular low rate.
So, which one? That's up to you. I prefer MP3
because of its widespread acceptance. Also, the popularity of
MP3 and the abundance of related software make it much more
attractive to the majority of people. However, some prefer the
slightly smaller file sizes that VQF has to offer. The bottom
line is that both will work, and you will win no matter which
choice you make.
Encoding the WAV file
First you must download an MP3 or VQF
compressor. Start with the Fraunhofer Institute's web page on
its official MP3 encoders, which are licensed. I use
AudioActive's Production Studio Lite, which retails for about
$35. They have been known to offer a free 30-day trial download,
but it is currently discontinued. If you want a free VQF
encoder, you can download a free Linux version or a free 90-day
trial version from Yamaha.
Once you have downloaded an encoder program,
you can use it to encode the audio file into either the MP3 or
the VQF format, depending on which encoder you selected.
Just to get you started, try setting the compression preferences
to the 16,000 Hz sampling rate, 20 kbps, the Mono setting, and
the MP3 output format (using AudioActive's trial version).
Compress away! This will take a while too. Depending on the
quality of your compressor, the quality and size of your final
MP3 file will vary. These settings (20 kbps, 16,000 Hz for MP3
encoding) produce a comparatively small file (6.5 Mbytes for 45
minutes) and balance file size well against audio quality. This
is what I use for all of my recordings. For the VQF encoder, try
the 11 kHz / 8 kbps setting to produce the small 2 Mbyte files.
For recordings of speech, stereo is not required, since most
speech is recorded in Mono anyway. Also, stereo would double the
file size.
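The file sizes quoted here follow from simple arithmetic: an
encoded file is roughly bit rate times duration, and an
uncompressed WAV is roughly sample rate times bytes per sample
times channels times duration. The short Python sketch below
works that out. The function names are made up for illustration,
and the 48,000 Hz, 16-bit, mono source recording is an
assumption rather than something this primer states. File
headers are ignored.

```python
def encoded_size_bytes(bit_rate_bps, duration_s):
    """Approximate size of a constant-bit-rate file (headers ignored)."""
    return bit_rate_bps * duration_s / 8


def wav_size_bytes(sample_rate_hz, duration_s, bits_per_sample=16, channels=1):
    """Approximate size of uncompressed PCM audio (WAV header ignored)."""
    return sample_rate_hz * (bits_per_sample / 8) * channels * duration_s


duration = 45 * 60  # a 45-minute recording, in seconds

mp3 = encoded_size_bytes(20_000, duration)  # 20 kbps MP3 setting
vqf = encoded_size_bytes(8_000, duration)   # 8 kbps VQF setting
wav = wav_size_bytes(48_000, duration)      # ASSUMED source: 48 kHz, 16-bit, mono

print(f"WAV: {wav / 1e6:.1f} million bytes")
print(f"MP3: {mp3 / 1e6:.2f} million bytes")
print(f"VQF: {vqf / 1e6:.2f} million bytes")
print(f"WAV to MP3 ratio: {wav / mp3:.1f}:1")
```

Under these assumptions the estimates come out near the 250
Mbyte and 6.5 Mbyte figures; the 8 kbps estimate (about 2.7
million bytes) lands a little above 2 Mbytes, so treat all of
these as ballpark numbers.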
If you are going to provide streaming audio on
your site, then your primary concern in selecting an encoding
rate is the bits-per-second (bps, or "bit rate")
parameter. You must make sure that the bit rate you select is
below the modem connection speed you want to support. For
example, any modem that connects at 28.8 kbps should easily be
able to handle a streamed audio file at the 20 kbps rate. The 24
kbps rate should also be possible, but any momentary hiccup in
the network may cause a slight pause in the playback, which is
not uncommon. Because of this, I prefer the 20 kbps set of
encoding rates. Typically the higher sampling frequency (22,050
Hz, 16,000 Hz, etc.) within a given bit rate is better. It also
helps to plan ahead and choose the original recording rate
(48,000 Hz, 44,100 Hz, etc.) so that the encoding rate is an
integer factor of the original rate (48,000 / 16,000 = 3, an
integer with no fractional part). This makes for a better
encoding. Encoding rates that are not integer factors of the
original rate must be "resampled". That is why "Resample" is
listed by some frequency choices, but not all, in the "Encoding
Properties" window in AudioActive's Production Studio. Planning
ahead so that resampling is not required produces superior
results.
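Both rules of thumb in this paragraph can be checked with a few
lines of Python. This is only a sketch: the function names are
hypothetical, and the 75% headroom factor is an assumption
chosen so that 20 kbps counts as comfortable and 24 kbps as
marginal on a 28.8 kbps modem, matching the description above.
It does not claim to reproduce how AudioActive decides when to
show "Resample".

```python
def needs_resampling(source_rate_hz, encoding_rate_hz):
    """True when the encoding rate does not divide the source rate evenly."""
    return source_rate_hz % encoding_rate_hz != 0


def stream_margin(bit_rate_bps, modem_bps, headroom=0.75):
    """Classify a bit rate against a modem connection speed.

    headroom is an ASSUMED safety factor, not a figure from the primer.
    """
    if bit_rate_bps >= modem_bps:
        return "too high"
    if bit_rate_bps > modem_bps * headroom:
        return "marginal"
    return "comfortable"


# Recording rate versus encoding rate
for source, target in [(48_000, 16_000), (44_100, 22_050), (44_100, 16_000)]:
    label = "resample" if needs_resampling(source, target) else "integer factor"
    print(f"{source} Hz -> {target} Hz: {label}")

# Bit rate versus a 28.8 kbps modem
for rate in (20_000, 24_000, 32_000):
    print(f"{rate} bps: {stream_margin(rate, 28_800)}")
```

With these inputs, 48,000 to 16,000 Hz and 44,100 to 22,050 Hz
are integer factors, while 44,100 to 16,000 Hz is not. If you
expect a noisier connection, lower the headroom value.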
Once the encoder finishes crunching, play
back the result to see if you are satisfied with the output. If
not, try different compression levels to get the results you
like; just don't forget to balance file size and bit rate
against quality. Once you have completed the encoding step, you
are ready to update your web site with your new audio files!