et40 - digital audio fundamentals - 2014 edition

David Javelosa
javelosa_david@smc.edu

Copyright © 2003 - 2014 David Javelosa unless otherwise stated.


week 02 - physics of sound

The Physics of Sound as Digital Information

If you were to videotape a film-based animation running at 15 frames a second with a camcorder running at 30 frames a second, you would be guaranteed a clear shot of every frame, if not two. Sampling sound works the same way: the sample rate has to be at least twice the highest frequency you want to capture. Since most material we are trying to sample is something we can hear, and hearing tops out at around 20 kHz, a 44.1 kHz sample rate is adequate without much concern. Sampling at lower rates requires some sort of hard filtering to eliminate the highs above half the sample rate, which would otherwise show up as false tones (aliasing) in the sampled sound. With the early samplers, a suggested practice was to first record your material to an audio cassette and then play the cassette into the sampler. The frequency response of the analog tape would be somewhat muted compared to the original material and so better suited to a lower sample rate. These days the recommended practice is to sample at as high a rate as is practical (44.1 kHz or even 48 kHz) and then digitally reduce the number of samples down to the preferred rate, and therefore the preferred size.
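The effect of sampling too slowly can be sketched numerically. The following is a minimal illustration, not taken from the course materials: the function names `nyquist_limit` and `alias_frequency` are made up for this example, and it assumes an ideal sampler with no filter in front of it.

```python
def nyquist_limit(sample_rate_hz):
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2


def alias_frequency(tone_hz, sample_rate_hz):
    """Frequency at which a pure tone appears after unfiltered sampling.

    Tones at or below the Nyquist limit come back unchanged; anything
    higher folds back down into the representable range.
    """
    folded = tone_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


print(nyquist_limit(44100))           # 22050.0
print(alias_frequency(5000, 22050))   # 5000
print(alias_frequency(15000, 22050))  # 7050
```

Under these assumptions, a 15 kHz tone sampled at 22.05 kHz comes back as a 7.05 kHz tone that was never in the original, which is why the highs have to be filtered out before sampling at a lower rate.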

Once again we run into the trade-off between quality and size, and digital audio gets it from both axes. Remember, whether we are taking 8 bit or 16 bit pictures of dynamic sound, we still have to take a lot of them per second. 44 thousand pictures a second at 16 bits each works out to about 5 megabytes a minute, and twice that for stereo. What this points out is that your regular audio CD of around an hour of music is clocking in at about 600 megs of digital data. This is pretty heavy-duty for the typical home computer, and for a CD-based game machine, there's just not that much room left for the game. Barring older hardware limitations (such as the earlier Macs with a top end of 22 kHz playback, older PC sound cards and game boxes), most of today's entertainment platforms can handle the higher sample playback rates. But do they want to?
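The arithmetic behind these figures can be checked with a short sketch. The function name `bytes_per_minute` is hypothetical, and the calculation assumes uncompressed audio at exactly 44,100 samples a second with no file header.

```python
def bytes_per_minute(sample_rate_hz, bit_depth, channels):
    """Size in bytes of one minute of uncompressed audio."""
    bytes_per_sample = bit_depth // 8
    return sample_rate_hz * bytes_per_sample * channels * 60


mono = bytes_per_minute(44100, 16, 1)
stereo = bytes_per_minute(44100, 16, 2)
print(mono)         # 5292000
print(stereo)       # 10584000
print(stereo * 60)  # 635040000
```

That is roughly 5.3 million bytes a minute in mono, 10.6 million in stereo, and about 635 million bytes for an hour of stereo, in line with the rounded figures in the text.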

As we are forced to deal with the size issue on this axis, there are a number of quality options that can be taken. The first level is to determine if you really need stereo CD quality sound, either as Red Book audio or as Yellow Book streaming audio. (It better be coming off a CD because 600 meg drives are still too pricey to give away!) By taking the track to mono, you cut the data size in half. By reducing the sample rate to 22 kHz, the data size is cut in half again, but no frequencies above 11 kHz survive. By reducing the bit depth from 16 to 8, the data is cut in half yet again, but the crunchiness of lower dynamic resolution is introduced. The quality takes a hit whichever way you go down. It has been noted by some, however, that 16 bit, 11 kHz audio actually sounds better than 8 bit, 22 kHz audio while taking up the same amount of space, but it truly depends on the material being sampled. The further down the sample rate goes, the duller the sound becomes as the highs get filtered out.
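Each of the three reductions above can be sketched on a plain list of sample values. This is only an illustration under stated assumptions: the helper names `to_mono`, `halve_rate` and `to_8bit` are invented here, the samples are assumed to be signed 16 bit integers, and the pair averaging stands in for the much more careful filtering a real editor would apply.

```python
def to_mono(left, right):
    """Average the two channels into one (halves the data)."""
    return [(l + r) // 2 for l, r in zip(left, right)]


def halve_rate(samples):
    """Average each adjacent pair into one sample (halves the data).

    The averaging is a very crude low-pass filter; a real converter
    would filter far more carefully before discarding samples.
    """
    return [(samples[i] + samples[i + 1]) // 2
            for i in range(0, len(samples) - 1, 2)]


def to_8bit(samples):
    """Keep only the top 8 bits of each signed 16 bit sample."""
    return [s >> 8 for s in samples]


left = [1000, 3000, 5000, 7000]
right = [3000, 5000, 7000, 9000]

mono = to_mono(left, right)   # [2000, 4000, 6000, 8000]
half = halve_rate(mono)       # [3000, 7000]
small = to_8bit(half)         # [11, 27]
```

Eight 16 bit values go in and two 8 bit values come out, an eighth of the original data, with detail lost at every step.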

An expansion on the techniques and tricks of dealing with digital audio size reduction is found in the Techniques section of the book. The down-sizing of audio should not be a thing to be afraid of. There are ways of making the sound passable to the human ear and still be musical and evocative. There are utilities for converting sound data from one platform to another and back again. Recent editing tools also offer built-in compression, which is a method of further reducing the size of your audio data. There are systems that allow several files to be batched and run through one process or another, saving time and tedium when dealing with hundreds of sound effects cues. Also, many systems allow for simultaneous playback of files of different sample rates, providing a way to put your best foot forward.

Fig. 09.03 Relative File Sizes of MIDI, MODS and Digital Audio

The thing to keep in mind is that there is always some happy medium between the level of quality the sound must be cut to and the data size it must fit into. The three-minute pop single originally evolved because that was the optimum amount of time for music on a 7-inch, 45 rpm vinyl record. As a culture, the audience always seems to be able to adapt. If a certain kind of music sounds good at 11 kHz, mono, 8 bit, then you can squeeze roughly two and a quarter minutes onto a 1.44 meg floppy disk. This could have been the 45 single of the digital age. However, with the development of lossy audio compression in the form of MP3s, and blistering advances in CPUs, realtime playback of compressed sound can now deliver near CD quality at a tenth the size.
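The floppy disk figure can be checked with a quick calculation. This sketch assumes that "11 kHz" means 11,025 samples a second, that the disk holds 1,440 KB, and that file headers and file system overhead are ignored; the names `FLOPPY_BYTES` and `seconds_that_fit` are made up for the example.

```python
FLOPPY_BYTES = 1440 * 1024  # formatted capacity of a "1.44 meg" floppy


def seconds_that_fit(capacity_bytes, sample_rate_hz, bit_depth, channels):
    """Seconds of uncompressed audio that fit in the given space."""
    bytes_per_second = sample_rate_hz * (bit_depth // 8) * channels
    return capacity_bytes / bytes_per_second


seconds = seconds_that_fit(FLOPPY_BYTES, 11025, 8, 1)
print(round(seconds))  # 134
```

Under these assumptions the disk holds about 134 seconds of 8 bit mono audio, a little over two minutes.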

Sound Designer, The Mac And Older Tools

Certain sound editing tools are obviously required for preparing digital audio data. In the beginning there were very high-end systems that ran on specialized, exotic platforms; for the most part these are now greatly overshadowed by the capabilities of today's software-based systems. Digidesign's Sound Designer was a groundbreaking product that even today is the staple of the digital audio specialist. Originally shipped as Sound Tools, the package included the Sound Accelerator hardware card and an interface for recording analog input and playing out audio. Because of the limited CPU capabilities of the Mac, the Sound Accelerator card worked as a dedicated audio co-processor for handling all the manipulation of the often large audio data. Sound Designer's graphic interface has become the standard against which many digital audio editors are compared. Currently, much of Sound Designer's functionality has been repackaged as the "Audio Suite" features in Pro Tools.

SoundEdit, and its several descendants, started off as a low-end editor for handling the 8 bit audio native to the Macintosh. Because of the earlier Mac's limited audio capabilities (8 bit, mono, 11-22 kHz), there was no need for a heavy-duty audio co-processor for SoundEdit to work. It could digitize with its own low-end hardware device (MacRecorder) and handled sound editing for a number of applications, including early multimedia and game products. As the Mac developed internal 16 bit capabilities and offered system extensions such as Sound Manager, the later SoundEdit 16 was able to provide a number of professional features such as 16 bit resolution, multiple hardware configurations and multiple track editing.

Alchemy by Passport Designs was another venerable audio editor that is part of the digital sound designer's arsenal for its unique features. One of the downsides to the program is that its entire operation is memory resident. Unlike Sound Designer, which can stream data off a hard drive, Alchemy specializes in editing sound files that are small enough to fit into the available RAM of the host Macintosh. Besides most of the common digital audio editing features, such as bit depth and sample rate conversion, Alchemy supports direct transfer to a number of external, professional music samplers. Supporting the MIDI Sample Transfer protocol, it was originally designed as a front end editor and central server to a network of studio samplers of different makes and models. Alchemy could store, recall and edit samples for an entire studio, and then distribute the material to the different instruments. For those without additional digitizing hardware, an external sampler could be used to digitize sound, and Alchemy could import it into the Mac via MIDI, edit it, and make it available for software production.

A later addition to the Macintosh audio suite was Opcode's Audio Shop. This tool is somewhat more limited in its editing capabilities but really shines as an archiving and reviewing utility. Capable of opening several sound file formats (proprietary formats from other editors, AIFF, and WAV from the PC), it also has the ability to stream audio from a hard drive without the need for any additional hardware. Depending on the Mac it is used on, it will also support both internal 16 bit audio and the Sound Accelerator card from Digidesign via Apple's Sound Manager. Its sound list features make this application invaluable for managing large numbers of cues in various formats, rates, depths and sizes. Although the editing interface is somewhat clumsy, there are even some usable signal processing features included.

Another widely used utility that has met an overwhelming need is WaveConvert by Waves. This simple application can handle all of the usual bit depth and sample rate conversions, format changes, and other important file manipulations needed for clean conversions. Its specialty, however, is its ability to batch process several files in one command. The repetitious task of pulling down menus and clicking boxes for the various file conversions is automated, allowing the user to set up a large number of files to go through several processes unattended, saving hours of time and boredom.
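The idea of batch processing can be sketched with Python's standard `wave` module. This is not how WaveConvert itself works; it is a minimal illustration that assumes a folder of uncompressed 16 bit WAV files, and the function names `convert_to_8bit` and `batch_convert` are hypothetical.

```python
import struct
import wave
from pathlib import Path


def convert_to_8bit(src, dst):
    """Rewrite one 16 bit WAV file as an 8 bit WAV file."""
    with wave.open(str(src), "rb") as reader:
        if reader.getsampwidth() != 2:
            raise ValueError("expected 16 bit audio: %s" % src)
        channels = reader.getnchannels()
        rate = reader.getframerate()
        raw = reader.readframes(reader.getnframes())

    # 16 bit WAV samples are signed and little-endian.
    samples = struct.unpack("<%dh" % (len(raw) // 2), raw)
    # 8 bit WAV samples are unsigned, so shift the range up by 128.
    eight_bit = bytes((s >> 8) + 128 for s in samples)

    with wave.open(str(dst), "wb") as writer:
        writer.setnchannels(channels)
        writer.setsampwidth(1)
        writer.setframerate(rate)
        writer.writeframes(eight_bit)


def batch_convert(src_dir, dst_dir):
    """Convert every .wav file in src_dir and return the new paths."""
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    converted = []
    for src in sorted(src_dir.glob("*.wav")):
        dst = dst_dir / src.name
        convert_to_8bit(src, dst)
        converted.append(dst)
    return converted
```

A single call such as `batch_convert("cues_16bit", "cues_8bit")` (both folder names are placeholders) would then process every cue in the folder unattended.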

The current leader in two-track digital audio editing is Peak by BIAS. It replicates much of the original Sound Designer functionality but does not require dedicated audio hardware.

With the rise of Windows as a professional development platform and the trend toward hardware independence, Sound Forge by Sonic Foundry has become the leading digital audio editor for the PC/Windows side of the world. Taking advantage of Windows' maturing graphic user interface, Sound Forge brings the editing techniques and features familiar from the Macintosh to the Windows arena. The biggest advantage is that, for the majority of software development done on the PC, audio data no longer needs to migrate from the Macintosh. For smaller files this was not as big a problem, but larger files would require robust networking systems to bring the data across to the PC. Batch processing is also available in Sound Forge, along with recent add-on features for RealAudio encoding, which compresses sound for streaming off the Internet.

Two other leading multi-track editors are also offered by Sonic Foundry: Acid and Vegas. Acid has become a popular tool for multi-tracking loops of music and has many of the editing features found in Pro Tools. Vegas started life as a "Pro Tools Killer" but has lived most of its life as a multi-track audio tool for video production. Neither of these applications requires dedicated audio hardware.


Software Basics & the Pro Tools Environment

  • Understanding the Pro Tools SESSION
  • Importing Audio Files, the difference between audio and MIDI
  • Wave forms and regions
  • Playlists, channels, channel strips
  • The mixer window and its hardware counterpart
  • Channel strip components
  • Wave forms and the Edit Window
  • Regions, Groups and Lists

Reading Assignment

Review: Pro Tools 8 for Macintosh & Windows OR Complete Pro Tools Handbook

  • Starting a New Session
  • Working with Tracks
  • Getting Ready to Record

