iTunes and the Technology Beyond MP3
Recently, Apple Computer found a way to add a million dollars a week to its bottom line. Not a bad trick. Their newfound source of revenue comes from the recently introduced iTunes online store.
It is a superbly considered response to the music distribution problem posed by the technologies developed initially by Napster and elaborated in peer-to-peer designs by Gnutella, KaZaA, and Morpheus, among others. These technologies provided a nearly effortless mechanism for exchanging or copying music files on a massive scale. Effortless distribution of music people wanted led to just that, except that the intellectual property rights to the music were forgotten.
Apple's Music Store sells songs using the iTunes4 music player user interface. Incorporating the finding, previewing, and purchasing of music into the already familiar and well-designed iTunes music software makes the learning curve nearly flat. Songs you want to keep can be individually purchased for 99 cents each.
The track purchased is downloaded into iTunes and added to your "Purchased
Music" folder. The file itself, however, is not what you might have guessed.
It's not an MP3 file but a file in the AAC (Advanced Audio Coding) format.
Audio File Formats 101
The technology behind the ubiquitous music files that we all listen to is not
for the fainthearted. The topic here is specifically the choice made by Apple
to use AAC rather than MP3 for the songs they are distributing from the iTunes
Music Store. To understand the issue we need to digress briefly into the arcane
world of the Moving Picture Experts Group (MPEG) family of standards used for coding
audio-visual information in a compressed digital format. MP3 is one of
those standards for encoding audio for digital distribution, playback, and streaming
on the Internet.
The MP3 format is actually derived from the MPEG-1 Audio Layer 3 standard. This is a standard technology and format for compressing a sound sequence into a very small file (about one-twelfth the size of the original file) while nearly preserving the original sound quality when it is played. MP3 files are usually download-and-play files rather than the streaming sound files that you link to and listen to with streaming audio products.
To create an MP3 file, you use a program called a ripper to get a selection from a CD onto your hard drive and another program called an encoder to convert the selection to an MP3 file. You don't actually "play" the MP3 file itself. To listen to it, you need an MP3 player with a decoder to translate the MP3 file back into a native audio format, in this case AIFF (Audio Interchange File Format, sometimes called "Apple Interchange File Format," as it is the adopted standard for audio on Apple computers; on Windows systems it has the file extension .aif).
Audio encoding works at the intersection of acoustics and human signal processing. It involves complex compression algorithms that are "lossy": they take the digitally sampled audio from a CD, recorded at a sampling rate of 44.1 kHz with a bit depth of 16 bits, and throw out part of the signal, compressing the sound based on the psychoacoustic properties of human hearing. What we can't hear can be thrown out and we won't notice. The entire game is to figure out what information to get rid of that we wouldn't perceive anyway.
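To put the sampling figures in perspective, the arithmetic below works out the raw bit rate of CD audio and compares it with an MP3 stream. It is only a back-of-the-envelope sketch: the two-channel (stereo) count is the standard for audio CDs, and the 128 kbps MP3 rate is an assumed, commonly used setting rather than a figure fixed by the format.

```python
# CD audio parameters from the text; audio CDs carry two channels (stereo).
SAMPLE_RATE_HZ = 44_100
BIT_DEPTH = 16
CHANNELS = 2

# Assumed MP3 bit rate, chosen for illustration only.
MP3_BITRATE_BPS = 128_000


def pcm_bitrate_bps(sample_rate_hz, bit_depth, channels):
    """Bits per second needed for uncompressed PCM audio."""
    return sample_rate_hz * bit_depth * channels


cd_bitrate = pcm_bitrate_bps(SAMPLE_RATE_HZ, BIT_DEPTH, CHANNELS)
compression_ratio = cd_bitrate / MP3_BITRATE_BPS

print(f"CD audio bit rate: {cd_bitrate / 1000:.1f} kbps")
print(f"Compression ratio at 128 kbps: {compression_ratio:.1f}:1")
```

Under these assumptions the ratio comes out at roughly 11 to 1, in line with the "about one-twelfth" rule of thumb for MP3 file sizes.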
How is it that we don't perceive everything received by our ears? MPEG audio encoding algorithms take advantage of our inability to hear a kind of distortion called "quantization noise," an inability that results from another human auditory phenomenon called masking. Masking occurs whenever the presence of a strong audio signal makes a temporal or spectral neighborhood of weaker audio signals imperceptible. That is, within certain frequency bands, strong sounds obscure weaker ones, so why encode the weaker ones in the first place?
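A toy illustration of masking within a single band might look like the following. This is a deliberately crude sketch, not how an MPEG encoder actually models hearing: real psychoacoustic models use frequency-dependent spreading functions, whereas here a single fixed offset (the hypothetical `masking_offset_db`) stands in for the masking threshold, and the component levels are invented for the example.

```python
def audible_components(components, masking_offset_db=20.0):
    """Keep components that are not masked by the strongest one in the band.

    components: list of (frequency_hz, level_db) pairs within one band.
    masking_offset_db: hypothetical fixed margin; anything this far below
    the strongest component is treated as inaudible.
    """
    if not components:
        return []
    strongest = max(level for _, level in components)
    threshold = strongest - masking_offset_db
    return [(freq, level) for freq, level in components if level >= threshold]


# Invented example: one strong tone and three weaker neighbors.
band = [(1000, 70.0), (1040, 45.0), (1080, 62.0), (1120, 30.0)]
kept = audible_components(band)
print(kept)
```

With these made-up levels, the two weakest components fall below the threshold and would simply not be encoded.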
We hear only a portion of the sounds coming to our ears. This frequency dependency can be expressed in terms of critical bandwidths, which are less than 100 Hz for the lowest audible frequencies and more than 4 kHz at the highest. The human auditory system blurs the various signal components within these critical bands; different frequencies within a band basically sound the same to us. Because we hear only some of the sounds coming to our ears, and our perception of them depends on their amplitude, what we don't hear or resolve is a function of the energy of the sounds in these perceptual bands. MPEG audio coding maps sounds into frequency subbands that approximate the ear's critical bands, then keeps the sounds we are likely to hear, as determined by a complex filter applied to each band. The acoustic engineer would say we "quantize each subband according to the audibility of quantization noise within that band." If we're really good at this, we can compress the sound by throwing out what we wouldn't hear anyway, based precisely on the psychoacoustic model of how our brains perceive sound. For the most efficient compression, each band should be quantized with no more levels than necessary to make the quantization noise inaudible.
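The idea of quantizing each band with "no more levels than necessary" can be sketched with a common rule of thumb: each extra bit in a uniform quantizer lowers the quantization noise by about 6 dB. The sketch below assumes we already know, for each subband, the signal level and the masking threshold; those numbers, and the function name `bits_for_band`, are hypothetical, and a real encoder's bit allocation is considerably more elaborate.

```python
import math

# Approximate signal-to-noise gain per bit of a uniform quantizer.
DB_PER_BIT = 6.02


def bits_for_band(signal_db, mask_db):
    """Smallest bit count that pushes quantization noise below the mask.

    Returns 0 when the signal itself sits at or below the masking
    threshold, since the band is then inaudible and need not be coded.
    """
    signal_to_mask_db = signal_db - mask_db
    if signal_to_mask_db <= 0:
        return 0
    return math.ceil(signal_to_mask_db / DB_PER_BIT)


# Invented (signal level, masking threshold) pairs for three subbands, in dB.
subbands = [(60.0, 40.0), (35.0, 38.0), (50.0, 44.0)]
allocation = [bits_for_band(signal, mask) for signal, mask in subbands]
print(allocation)
```

In this made-up case the second band gets no bits at all, which is exactly where the compression comes from.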
What's the difference between MP3-encoded sound and AAC-encoded sound? AAC became part of the MPEG-2 standard in 1997 to provide efficient encoding for surround sound audio. It supports up to 5.1 channels (Left, Center, Right, Left and Right Surround, plus a low-frequency channel). Its quality at 64 kbps is comparable to MP3 at 128 kbps. MPEG-2 AAC is a continuation of the MP3 coding scheme. Like MP3, AAC exploits the psychoacoustic properties of human hearing, using sound-masking techniques to achieve efficient compression with very little noticeable degradation in audio quality. Finally, it provides a compression advantage of about 1.3 to 1.4 times that of MP3, with better sound quality.
Overall, it looks like AAC is simply a better approach to digital sound encoding. MPEG formal listening tests have demonstrated that for two-channel sound, that is, typical stereophonic listening, MPEG-2 AAC is able to provide slightly better audio quality at 96 kbps than MP3 at 128 kbps.
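What those bit rates mean for the files on your hard disk is easy to work out. The sketch below assumes a hypothetical four-minute song encoded at a constant bit rate and ignores file headers and tags, so the sizes are approximate.

```python
def file_size_mb(bitrate_kbps, duration_s):
    """Approximate file size in megabytes (10^6 bytes) at a constant bit rate."""
    bits = bitrate_kbps * 1000 * duration_s
    return bits / 8 / 1_000_000


DURATION_S = 240  # hypothetical four-minute song

mp3_mb = file_size_mb(128, DURATION_S)
aac_mb = file_size_mb(96, DURATION_S)
advantage = mp3_mb / aac_mb

print(f"MP3 at 128 kbps: {mp3_mb:.2f} MB")
print(f"AAC at 96 kbps:  {aac_mb:.2f} MB")
print(f"Size advantage:  {advantage:.2f}x")
```

The ratio of 128 to 96 is about 1.33, which sits inside the 1.3 to 1.4 compression advantage quoted for AAC.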
Apple's choice of AAC also provides an interesting option that restricts streaming of the original AAC music file to a small number of other Mac iTunes users. This is how the preview function works: you can listen to up to 30 seconds of any song in the Apple Music Store catalogue to help you decide if you want to buy it. In addition, as an iTunes user I can let up to five users listen to the music on my Mac through iTunes if we're on the same local area network. According to the iTunes help system:
"If your computer is connected to any other computers over a local network and you're using Mac OS X version 10.2.4 or later, you can share the music in your library and playlists with up to five of those computers (in the same subnet as your computer). Sharing is intended for personal use only. You can share MP3, AIFF, WAV, AAC, and radio station links; you can't share Audible spoken word content or QuickTime sound files."
It didn't take long for enterprising hackers to determine that they could extend this functionality. About a week after iTunes4 was released a number of Web sites were offering to connect you to the hard disks of individuals who were offering their iTunes files for your listening pleasure.
The benefits of Apple's choice of AAC coding quickly became apparent, as the files that were shared were MP3 files on these Mac users' hard disks, not AAC-encoded files. Still, the streaming of iTunes files from individual users was an extension of the intended personal use that Apple was trying, in good faith, to make available. It should also be noted that the iTunes Music Store was not hacked. Individual users of the software hacked their own copies to provide this new feature.
David Zeiler, in his column "Hackers bite Apple in its iTunes" (writing for SunSpot.net, an online community in Maryland), noted that the "Advanced Audio Coding file-compression format that Apple uses in its music store prevents large-scale streaming or downloading, as AAC-coded songs can only be played on three Macs authorized by the same user account. Streaming and downloading AAC-coded songs to strangers doesn't work." Add another plus in the AAC encoding column.
The Web sites that were hosting this sharing service for iTunes users have since dropped this from their offerings. Apple has done us another service by bringing newer music encoding technology into mainstream use. They have some work to do in their port of iTunes4 to Windows XP, planned for later this year, to try to prevent the exploitation of the streaming hack demonstrated on the OS X version. Perhaps, as Zeiler noted, the small size of the OS X community minimized the recording industry's response to this bit of creative sharing. Further, this limited form of sharing Apple has implemented might in fact be more reasonably within the constraints of personal use, assuming widespread streaming across the Internet were blocked. Still, the fact that MP3s can be streamed isn't going to sit well with the record industry if this "feature" makes it into the Windows versions of iTunes.
Apple could choose to leave the personal streaming function out of the Windows
version, leaving Mac users with this added benefit. Alternatively, they could
leave MP3 behind and only support AAC encoded music. This would have a drastic
impact on anyone without an iPod, as current MP3 players wouldn't be able to
use iTunes4 music. On the other hand, it would push the technology forward.
Consider the new Digital Radio Mondiale (yet another DRM acronym). This is a
consortium that is moving toward providing digital AM radio, worldwide, using
AAC encoding technology (see www.drm.org).
If they are successful, you'll begin to hear AM radio with FM quality.
DRM is the world's only non-proprietary digital system for short-wave, medium-wave/AM, and long-wave (below 30 MHz) that can use existing frequencies and bandwidth across the globe. In fact, last month the world of radio changed forever. On June 16, 2003, during the International Telecommunication Union's (ITU) World Radiocommunication Conference (WRC 2003) in Geneva, the world's first daily, live DRM broadcasts were scheduled to be transmitted across the globe by some of the world's best-known broadcasters, including Deutsche Welle, Radio Netherlands, Swedish Radio International, and DeutschlandRadio. Maybe the trend Apple is setting with iTunes4 is worth following.
There are a lot of MP3 players and AM radios that will need to be upgraded or replaced if this takes off. Then again, Betamax was a better technical standard than VHS for videotapes. It takes more than good technology to win the marketplace. Stay tuned!
Requirements:
- Macintosh computer
- Mac OS X 10.1.5 or later (version 10.2.5 or later recommended)
- iTunes 4 must be installed
- Internet connection (DSL, Cable, or LAN connections recommended)
- Apple ID or .Mac account required; if you don't have one, it's easy
to sign up
- The iTunes Music Store is only available in the United States
References
Digital Audio Tutorial: Introduction to digital audio
www.musiq.com/recording/digaudio,
accessed May 26, 2003.
Facts about MPEG4 AAC
www.telos-systems.com/?/techtalk/aac/default.htm,
accessed May 25, 2003.
Rob Koenen, "Overview of the MPEG-4 Standard." March
2002,
http://www.chiariglione.org/mpeg/standards/mpeg-4/mpeg-4.htm
What is MPEG-2/MPEG-4 AAC?
www.mp3tech.org,
accessed May 23, 2003.
Dan Jansson, "What is AAC?" Aug. 26, 2001,
www.dj-media.com/doc/what_is_aac.asp,
accessed May 25, 2003.
MPEG-4 Audio: AAC
www.apple.com/mpeg4/aac,
accessed May 24, 2003.