et39 - digital audio design for games - 2015 edition

David Javelosa


Copyright © 2015 David Javelosa unless otherwise stated.


week 03 - history of recording & the psychology of sound - Intro to Multitracking


1877 - Edison makes the first recording of a human voice ("Mary had a little lamb") on the first tinfoil cylinder phonograph Dec. 6 (the word "Halloo" may have been recorded in July on an early paper model derived from his 1876 telegraph repeater, but the paper has not survived) and filed for an American patent Dec. 24.

1881 - Charles Tainter at the Volta Lab made the first lateral-cut records, but without any practical machine to play them back.

1890 - The first "juke box" was the coin-operated cylinder phonograph with 4 listening tubes that earned over $1000 in its first 6 months of operation starting the previous November 23 in San Francisco's Palais Royal Saloon, setting off a boom in popularity for commercial nickel phonographs that kept the industry alive during the Depression Nineties.

Early 1900's - Phonographs were popular household items and for the first time people could listen to recordings of famous musicians.

1908 - The first film scores, music composed specifically for a picture, began to appear. In large cities, movies would be accompanied by a symphony orchestra (this is worth keeping in mind when you think of the resistance some people had to talking pictures). In smaller cities, the movie would be accompanied by a piano player.

1914 - Lee de Forest finally develops the electronic amplifying valve, or tube, so that sound can at last be successfully projected to an audience.

1920 - KDKA in Pittsburgh inaugurated commercial radio when it was the first radio station to receive its commercial call letters from the Department of Commerce Oct. 27; it began regular scheduled broadcasting Nov. 2 with the returns of the presidential election, and continued broadcasting every evening from 8:30-9:30 pm.

1926 - Warner Bros., then a minor studio undergoing financial difficulties, bought the disc-based VITAPHONE system, designed by AT&T, with financial backing from Wall Street. For $800,000 they secured rights to it. In 1926 Warner Bros. released Don Juan with John Barrymore. It had synced music, but it was preceded by a one-hour program of shorts and by a short segment in which the head of the MPPDA proclaimed, talking: "the beginning of a new era in music and motion pictures." It was a big hit with the public but its future was still uncertain.

1927 - The Jazz Singer was supposed to be a singing picture, not a talking one, but during the shooting Al Jolson ad-libbed some lines that remained in the final cut. He was not only talking, he was talking spontaneously. The film was a hit... everyone jumped on the bandwagon... there was no going back. But switching to sound was not easy... besides financial and technological problems.

1933 - Invention of multitrack recording system which allowed for separation of dialogue, music and effects. Before this musicals had to film and record at the same time. Up until this point there were no mixes.

1938- Les Paul; multi-track recording with discs; electric amplification for guitars

1939 - Multi-channel sound used with Fantasia, with a system called Fantasound. Disney was interested in emphasizing the directional character of a symphony orchestra, e.g., brasses clearly separated from strings. It was a double system based on a separate 35 mm interlocked print which had 3 optical tracks with a control track. For the LA premiere Disney added a primitive "surround" channel of 96 small speakers to pick up sound from one or more of the main channels, e.g., the choir was heard throughout the theater.

1951 - Mag sound (magnetic tape), developed in Germany during WWII, became the leading film audio technology. It was used as early as 1945 for non-sync sound (music especially). The system was very large and difficult to move.

1955 - A suitcase-sized recorder was designed for The Ten Commandments: sound for some shots was recorded in Egypt.

1966- The Beatles with George Martin; Sgt. Pepper's done with 4 track technology.

Early 1970's - There was a better sound system in the average American teenager's bedroom than in the neighborhood theater.

1974- Todd Rundgren; 24 track solo recording artist; "sounds of the studio"

1975 - Optical stereophonic sound on film pioneered by Dolby Laboratories. This system allowed the 2 optical tracks on the film to be encoded and split into 4 tracks: L, R, C, S (in a way based on the Quad system). It was cheaper and quicker than magnetic tracks. It also provided a broader dynamic and frequency range through the use of companding noise reduction.

1976- Brian Eno; recording studio as an instrument; Ambient series

1977 - Star Wars needed to get those low sounds up for the space battles. Because the 35 mm print only had room for 4 tracks, they reverted back to 70 mm with 6 magnetic stripes for play in certain specially equipped theaters: those that had left over equipment from the 50's. The sixth channel was dedicated to the lowest frequency creating a theater sub-woofer with its own amplification.

1979 - Superman and Apocalypse Now were released in Surround Sound, standardizing the 5.1 surround system used today. This led to the final developments in the Dolby surround system and THX. This was the first year an Oscar was given for Sound Design.

1989 - Digidesign Sound Tools on the Macintosh SE; digital audio editing for the desktop and project studio.

1993- Digidesign Pro Tools with TDM plug-ins; multi-track digital audio editing with accelerated software DSP.

sfxr - sound effect generator
by DrPetter, 2007-12-14
developed for LD48#10

Basic usage:

Start the application, then hit
some of the buttons on the left
side to generate random sounds
matching the button descriptions.

Press "Export .WAV" to save the
current sound as a WAV audio file.
Click the buttons below to change
WAV format in terms of bits per
sample and sample rate.

If you find a sound that is sort
of interesting but not quite what
you want, you can drag some sliders
around until it sounds better.

The Randomize button generates
something completely random.

Mutate slightly alters the current
parameters to automatically create
a variation of the sound.

Advanced usage:

Figure out what each slider does and
use them to adjust particular aspects
of the current sound...

Press the right mouse button on a slider
to reset it to a value of zero.

Press Space or Enter to play the current sound.

The Save/Load sound buttons allow saving
and loading of program parameters to work
on a sound over several sessions.

Volume setting is saved with the sound and
exported to WAV. If you increase it too much
there's a risk of clipping.

Some parameters influence the sound during
playback (particularly when using a non-zero
repeat speed), and dragging these sliders
can cause some interesting effects.

An external sound editor can also be used to capture
and edit sounds, and to string several sounds
together for more complex results.
To record this you will need to use an external
recording application, for instance Audacity.
Set the recording source in that application
to "Wave", "Stereo Mix", "Mixed Output" or similar.

Parameter description:
- The top four buttons select base waveform
- First four parameters control the volume envelope
  Attack is the beginning of the sound,
  longer attack means a smoother start.
  Sustain is how long the volume is held constant
  before fading out.
  Increase Sustain Punch to cause a popping
  effect with increased (and falling) volume
  during the sustain phase.
  Decay is the fade-out time.
- Next six are for controlling the sound pitch or
  frequency.
  Start frequency is pretty obvious. Has a large
  impact on the overall sound.
  Min frequency represents a cutoff that stops all
  sound if it's passed during a downward slide.
  Slide sets the speed at which the frequency should
  be swept (up or down).
  Delta slide is the "slide of slide", or rate of change
  in the slide speed.
  Vibrato depth/speed makes for an oscillating
  frequency effect at various strengths and rates.
- Then we have two parameters for causing an abrupt
  change in pitch after a certain delay.
  Amount is pitch change (up or down)
  and Speed indicates time to wait before changing
  the pitch.
- Following those are two parameters specific to the
  squarewave waveform.
  The duty cycle of a square describes its shape
  in terms of how large the positive vs negative
  sections are. It can be swept up or down by
  changing the second parameter.
- Repeat speed, when not zero, causes the frequency
  and duty parameters to be reset at regular intervals
  while the envelope and filter continue unhindered.
  This can make for some interesting pulsating effects.
- Phaser offset overlays a delayed copy of the audio
  stream on top of itself, resulting in a kind of tight
  reverb or sci-fi effect.
  This parameter can also be swept like many others.
- Finally, the bottom five sliders control two filters
  which are applied after all other effects.
  The first one is a resonant lowpass filter which has
  a sweepable cutoff frequency.
  The other is a highpass filter which can be used to
  remove undesired low frequency hum in "light" sounds.
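
To make a few of these parameters concrete, here is a minimal Python sketch of a square-wave generator with a volume envelope (attack, sustain, decay), a duty cycle, a frequency slide and a minimum-frequency cutoff. It is not sfxr's actual code: the function names, the parameter units (seconds, Hz, Hz per second) and the linear slide are assumptions chosen for illustration only.

```python
import math
import struct
import wave

SAMPLE_RATE = 44100  # samples per second (assumed export setting)


def envelope(t, attack, sustain, decay):
    """Volume (0..1) at time t for a simple attack/sustain/decay envelope."""
    if attack > 0 and t < attack:
        return t / attack                      # fade in
    if t < attack + sustain:
        return 1.0                             # hold
    if decay > 0 and t < attack + sustain + decay:
        return 1.0 - (t - attack - sustain) / decay  # fade out
    return 0.0


def synth(start_freq=880.0, slide=-400.0, min_freq=100.0, duty=0.5,
          attack=0.01, sustain=0.1, decay=0.3, volume=0.5):
    """Return a list of float samples in the range -volume..volume."""
    total = attack + sustain + decay
    samples = []
    phase = 0.0
    for n in range(int(total * SAMPLE_RATE)):
        t = n / SAMPLE_RATE
        freq = start_freq + slide * t          # linear slide in Hz per second (assumption)
        if freq < min_freq:
            break                              # min frequency cutoff stops all sound
        phase = (phase + freq / SAMPLE_RATE) % 1.0
        square = 1.0 if phase < duty else -1.0  # duty sets the positive share of each cycle
        samples.append(volume * envelope(t, attack, sustain, decay) * square)
    return samples


def write_wav(path, samples):
    """Save float samples as a mono 16-bit WAV file."""
    frames = b"".join(
        struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767)) for s in samples
    )
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)        # 2 bytes = 16 bits per sample
        f.setframerate(SAMPLE_RATE)
        f.writeframes(frames)


if __name__ == "__main__":
    write_wav("blip.wav", synth())
```

Changing `duty` away from 0.5 thins out the tone, and a steeper negative `slide` gives the falling "laser" character that the sfxr presets are known for.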


-Advances in technology have led audio producers towards more careful sound design.

-Bombarded by ever-deepening visual information, audiences must have heightened sound effects, if only to perceive them at all.

-Improved theater speaker systems make further demands on filmmakers who must stretch their sonic creativity to compensate and compete with home stereo systems.

-"Action" movies, whose dialogue is often trivialized, especially depend on music and sound effects to carry their emotive levels.

-We must consider the soundtrack in terms of "quality" of the sound (technical issue), and artistic quality (aesthetic issue).

-"As technology gets better, maintaining the illusion gets harder."

-When TV was black and white, and the sound came out of tinny speakers, it was easy to accept technical limitations. We knew that Lucy's hair and Ricky's skin weren't gray, but we did not care. Or we filled in the red and tan in our minds. Color television made it harder to suspend our disbelief. Although gray hair was acceptable for Lucy, orange wasn't. Lighting and makeup became much more important. The same thing happened to sound. The increased audio clarity of digital tape, better speakers and amplifiers on TV sets, and the prevalence of stereo conspire to let us hear more of the track. Likewise, gaps in audio and audio quality are "seamed over" between the ear and the brain.

- Lower quality sounds and technology can be used in the background because the ear is focused on the foreground sound's quality.





recording in Pro Tools (cd or midi source)

Digital Levels

When recording to an analog medium such as magnetic tape, recording engineers always try to keep their meters as close to 0 VU (stands for Volume Unit, which is based on electrical currents) as possible. This ensures a high signal-to-noise ratio while preserving enough headroom to keep the tape from saturating and distorting. Recording a few peaks that go above 0 usually doesn’t cause any problems since the tape saturation point is not an absolute.
In the digital realm, where amplitudes are stored as discrete numbers instead of continuous variables, things are quite different. Instead of having a flexible and forgiving recording ceiling, we have absolute maximum amplitudes, -32768 and 32767, in 16-bit audio. No stored signal can ever have a value above these numbers. Everything beyond gets clamped to these values, literally clipping off the wave peaks. This chopping effect can add large amounts of audible distortion. If the clipping is very short and infrequent such as during a very loud snare hit, it can go unnoticed. But in general, it is safe to say that digital audio has absolutely no headroom.
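
As a rough illustration of that hard ceiling, the Python sketch below clamps sample values to the 16-bit range. The helper names `clip_to_int16` and `clipped_sine` are made up for this example; real converters and editors do the clamping internally.

```python
import math

INT16_MIN, INT16_MAX = -32768, 32767  # absolute limits of 16-bit audio


def clip_to_int16(value):
    """Clamp a sample value to the 16-bit range, flattening anything beyond it."""
    return max(INT16_MIN, min(INT16_MAX, int(round(value))))


def clipped_sine(peak, points=16):
    """One cycle of a sine wave with the given peak, stored as 16-bit samples."""
    return [clip_to_int16(peak * math.sin(2 * math.pi * i / points))
            for i in range(points)]


# A sine recorded 50% too hot: every sample above the ceiling becomes 32767,
# so the rounded top of the wave turns into a flat plateau.
print(clipped_sine(1.5 * 32767))
```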

At what level, then, should a signal be recorded digitally? The standard method for digital metering is to use the maximum possible sample amplitude as a reference point. This value (32768) is referred to as 0 decibels, or 0 dB. Decibels are used to represent fractions logarithmically. In this case, the fraction is: sample amplitude divided by the maximum possible amplitude. The actual equation used to convert to decibels is: dB = 20 log10 (amplitude / 32768)
Say you have a sine wave with a peak amplitude of 50% of full scale. Plugging the numbers in gives you 20 log10 (0.50) = -6.0 dB. In fact, every time you divide a signal’s amplitude by two, you reduce its dB value by 6 dB. Likewise, doubling the amplitude of a signal increases its dB value by 6 dB. If you kept dividing your sine wave until its peak amplitude was equal to 1, you’d get the very lowest peak dB possible, -90.3 dB.
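
The conversion above can be checked with a few lines of Python. This is only a sketch of the formula in the text; the function name `amplitude_to_db` is an assumption, and 32768 is taken as full scale as in the passage.

```python
import math

FULL_SCALE = 32768  # 0 dB reference for 16-bit audio, as used in the text


def amplitude_to_db(amplitude, full_scale=FULL_SCALE):
    """Convert a peak sample amplitude to decibels relative to full scale."""
    if amplitude == 0:
        return float("-inf")  # silence is "minus infinity" dB
    return 20.0 * math.log10(abs(amplitude) / full_scale)


print(round(amplitude_to_db(16384), 1))  # half of full scale: -6.0
print(round(amplitude_to_db(1), 1))      # smallest non-zero peak: -90.3
```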

Why do we use dBs? Well, for one, it’s easier to say -90 dB than 0.000030 (1/32768). Decibels have been used for a very long time when dealing with sound pressure levels because of the huge range (about 120 dB) that the human ear can perceive. One confusing thing about using decibels is that 0% is referred to as minus infinity (-Inf. throughout this manual and in Sound Forge dialogs).
How do we measure the levels of a digital signal? Digital meters usually show the maximum instantaneous amplitudes in dB. This is called a peak meter. Peak meters are excellent for making sure that a recorded signal is never clipped. However, peak meters aren’t as precise as using RMS (Root mean square…another mathematical formula) power readings when trying to measure loudness. This can be appreciated by generating a sine wave and a square wave with the same peak amplitudes and noting the square wave is much louder. When using RMS power, a maximum-amplitude square wave will be 0 dB (by definition), while a maximum amplitude sine wave reaches only -3 dB.
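
The sine versus square comparison can be reproduced numerically. The sketch below is an illustration only: it uses floating-point samples with a peak of 1.0 rather than 16-bit integers, and the name `rms_db` is an assumption.

```python
import math


def rms_db(samples):
    """RMS power in dB, where a full-scale (peak 1.0) square wave reads 0 dB."""
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")
    return 10.0 * math.log10(mean_square)


N = 1000  # samples in one cycle
sine = [math.sin(2 * math.pi * i / N) for i in range(N)]
square = [1.0 if i < N // 2 else -1.0 for i in range(N)]

# Both waves have the same peak (1.0), but different RMS power.
print(round(rms_db(square), 2))  # 0.0
print(round(rms_db(sine), 2))    # -3.01
```

A peak meter would show both signals at the same level; only the RMS reading reflects that the square wave carries twice the power.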

Now, let’s get back to the real question: at what level should audio be digitized? If you know what the very loudest section of the audio is in advance, you can set your record levels so that the peak is as close to 0 dB as possible and you’ll have maximized the dynamic range of the digital medium. However, in most cases you don’t know in advance what the loudest level will be, so you should give yourself at least 3 to 6 dB of headroom for unexpected peaks (more when recording your easily over-excited drummer friend).
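
To see what 3 to 6 dB of headroom means in sample values, the decibel formula can be run in reverse. This is a small sketch, again assuming 32768 as full scale; `peak_for_headroom` is a hypothetical helper name.

```python
def peak_for_headroom(headroom_db, full_scale=32768):
    """Largest peak amplitude that still leaves headroom_db below 0 dB."""
    return full_scale * 10 ** (-headroom_db / 20.0)


for headroom in (3, 6):
    print(headroom, "dB of headroom -> peaks up to about",
          int(peak_for_headroom(headroom)))
```

With 6 dB of headroom the loudest expected peak sits at roughly half of full scale, which leaves room for an unexpected peak of about twice that amplitude before clipping.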

Now get in there and have some fun with Sound Forge.

Copyright ©1998 Sonic Foundry, Inc.

Audio Terminology Review

  • Frequency - the frequency of a sound (how high or low it is) is measured by how many times it completes a vibration or cycle per second. Measured in hertz. 20 hertz = 20 cycles per second.
  • The human ear can hear from 20 Hz to 20 kHz.
  • Harmonics are softer frequencies that are heard above the fundamental frequency of a sound.
  • Amplitude refers to the volume or loudness of a sound. We measure this in decibels.
  • Sample depth (bit depth) - the number of bits used to store each sample. The higher the bit depth, the greater the number of bits used to describe the difference in volume from the quietest to the loudest sound.
  • Sample rate - the number of snapshots or samples of the waveform taken per second. More samples per second give a more faithful reproduction of the waveform.
  • Professional quality audio is usually at a sample rate of either 44.1 kHz (44,100 times per second) for compact discs or 48 kHz for digital video, both at 16-bit dynamics. 22 kHz is a common sample rate for multimedia applications, as is 8-bit dynamics in extreme situations. Other super-high-end situations have worked at 96 kHz and beyond, as well as 32-bit dynamics.
  • Nyquist Limit - the highest frequency you can faithfully reproduce has to be less than one-half the sample rate. Therefore, the limit for a sound sampled at 44.1 kHz is 22.05 kHz.
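
The sample rate and sample depth terms above reduce to simple arithmetic, sketched here in Python. The function names are made up for this example, and the dynamic range figure is the usual theoretical rule of thumb (about 6 dB per bit), not a measurement of any particular device.

```python
import math


def nyquist_limit(sample_rate_hz):
    """Half the sample rate: frequencies must stay below this to be reproduced."""
    return sample_rate_hz / 2.0


def dynamic_range_db(bit_depth):
    """Theoretical ratio between the largest and smallest step, in dB."""
    return 20.0 * math.log10(2 ** bit_depth)


def alias_frequency(freq_hz, sample_rate_hz):
    """Frequency at which a tone shows up after sampling (folded below Nyquist)."""
    folded = freq_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


print(nyquist_limit(44100))              # 22050.0
print(round(dynamic_range_db(16), 1))    # 96.3
print(round(dynamic_range_db(8), 1))     # 48.2
print(alias_frequency(30000, 44100))     # 14100
```

The last line shows why the limit matters: a 30 kHz tone sampled at 44.1 kHz does not vanish, it comes back as a false 14.1 kHz tone.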

Intro to Pro Tools


  • Main Window
    • Time Display, Marker Bar, Ruler
    • Track List, Track View
    • Scrub Control, Transport Bar
    • Explorer Window area, Video screen
    • Status Bar
  • Tool Bar
    • File Management
    • Edit Commands
    • Snap to Grid, Crossfades, Lock events, group events
    • Envelope editing
    • Edit tools, zoom tools, context help

Reading Assignment

  • Pro Tools Help Files: Recording Basics
  • SFXR Help Files

Copyright © 2012 - 2015 David Javelosa