et39 - digital audio design for games - 2015 edition

David Javelosa


Copyright © 2015 David Javelosa unless otherwise stated.


week 13 - sound design for entertainment platforms and games


Game Platforms

  • Early Cartridge Players and Hand-Helds
  • Current Sample and CD Based Platforms
  • Production Techniques
  • Interactive Engines
  • Multi-Platform Development

What Is A Video Game?

A video game player is a "black box" electronic product that is designed to do one thing: entertain. Although it may have several components in common with desktop computers, it lacks their "open architecture" and general-purpose versatility. Game players are most commonly designed as a peripheral to the "installed base" of the television set, much like the VCR. More recently they have been designed with the "home theater" concept in mind, as part of the ultimate playback system, including a large screen and a powerful stereo or surround sound system.

The original video "game" was generally a cartridge-based software product engineered to be played on a machine with little or no instruction for the user. The "operating system" on a video game is always supposed to be completely transparent. The control interface is as simple and unobtrusive as possible. Customized settings are always optional and never mandatory. As a closed system, there are rarely compatibility problems. However, bad components are not easily accessible, and generally an entire system is replaced when faulty. The hardware is optimized for the performance specification, manufactured with the most cost-effective parts, and sometimes even sold at a loss to gain market share and platform dominance.

Music and audio synthesis, as mentioned earlier, was originally made possible with the introduction of analog systems composed of different voltage-controlled components, each designed to generate, shape, control and/or perform sound. The primary elements in a typical configuration of this system would include wave-shape generation (one for each voice), envelope shaping (time-based dynamic shaping), and some interface for controlling note pitch and event triggering. As this configuration was formalized in the commercial synthesizer industry, along with microchip development, a simple synthesizer system could be reproduced in a single digital chip.
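The three elements named above (wave-shape generation, envelope shaping, and pitch/trigger control) can be sketched in a few lines of Python. This is only an illustration of the signal flow, not a model of any particular chip: the sample rate, the linear attack/release envelope, and the use of MIDI-style note numbers for pitch are all assumptions made for the example.

```python
SAMPLE_RATE = 8000  # samples per second; an assumed value for illustration


def square_wave(freq_hz, n_samples, sample_rate=SAMPLE_RATE):
    """Wave-shape generation: one square-wave voice with values +1.0 or -1.0."""
    out = []
    for n in range(n_samples):
        phase = (n * freq_hz / sample_rate) % 1.0  # position within one cycle
        out.append(1.0 if phase < 0.5 else -1.0)
    return out


def envelope(n_samples, attack, release):
    """Envelope shaping: linear fade-in over `attack` samples, fade-out over `release`."""
    env = []
    for n in range(n_samples):
        gain = 1.0
        if attack and n < attack:
            gain = n / attack
        remaining = n_samples - 1 - n
        if release and remaining < release:
            gain = min(gain, remaining / release)
        env.append(gain)
    return env


def note_to_hz(note):
    """Pitch control: MIDI-style note number to frequency (note 69 = A at 440 Hz)."""
    return 440.0 * 2 ** ((note - 69) / 12)


def play_note(note, duration_s, sample_rate=SAMPLE_RATE):
    """Event trigger: render one note as a list of float samples in [-1, 1]."""
    n = int(duration_s * sample_rate)
    wave = square_wave(note_to_hz(note), n, sample_rate)
    env = envelope(n, attack=n // 10, release=n // 4)
    return [w * e for w, e in zip(wave, env)]


samples = play_note(69, 0.25)  # a quarter second of A440
```

A multi-voice chip would run several such generators at once and sum their outputs.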

With the economics of mass production, these sound chips became available to a number of different applications from toys to alarm clocks and eventually game machines and early computer products. One of the home computers to feature such an internal synthesizer was the Commodore 64, with its Sound Interface Device or SID chip. Comparable built-in sound chips could be found in the NES and later in the portable Atari Lynx. A simpler version (with no wave shape variation or built-in frequency modulation) known as the PSG (programmable sound generator) is featured in the Sega 8-bit machines, the Master System and Game Gear, and is included in the backward-compatible 16-bit Genesis.

The next generation in home entertainment technology points to a blurring of the dedicated game console and the home desktop computer. Titles are being developed for both directions and market trends are being projected for both to exist in the same households, each with similar and sometimes overlapping functions. These functions already include traditional game play, multimedia enrichment titles, digital video play and soon may include Internet browsing, on-line communication and commerce, and any number of productivity applications yet to establish themselves in the home market.

These consoles are divided into two categories: CD-ROM based and cartridge based. This time, however, they take consumer electronics to a level of technology that rivals professional desktop systems and even some high-end workstation capabilities. Their speed and optimized design truly break new ground for game play beyond the abilities of computer games, and most of them have been engineered for expansion and plug-in upgrades. Much more so than the earlier Sega CD/Genesis or TurboGrafx/Duo systems, these access ports are capable of introducing processor accelerators, video decompression like MPEG, modem and LAN support, cable functionality, as well as high-density storage media like CD-ROM and DVD. Many point to this generation of game box players or "tube-tops" as the first potential cross-over platform for the home digital system. Computer companies like Microsoft have thrown their hat in the ring with "computer-in-a-box" game machines such as the Xbox. Other major interests like toy manufacturer Bandai and database monolith Oracle have either embraced or considered this as their direction into the future home market.

Since the Sony PlayStation, almost all systems have been sample based and delivered on CD-ROM. What this means for the audio designer is that the functions of sound effects, music and dialogue are not only available from memory, but also as streaming data from the storage medium. Sounds, regardless of their function, are then divided into a few categories: one-shots, streaming, multi-stream, and segments. The latter two categories are addressed more in the interactive application of sound in games, and are treated specifically by interactive audio engines in consoles and by PC technologies like DirectMusic.

PC Gaming vs. Platforms

The next step up in PC sound cards was the introduction of sample-based or wavetable synthesis. These cards, such as the Roland Sound Canvas and the Creative AWE32, provide the most authentic and complex instrument sets currently available. By using digital audio recordings of instrument sounds, applying synthesizer parameters, and storing them all in ROM, the wavetable card provides an audio quality resembling that of many professional music synthesizer modules. This is no wonder, because versions of both external modules and internal sound cards from the same manufacturer will contain identical sample sets in ROM. Creative Labs, for instance, actually acquired synthesizer manufacturer E-mu as a result of their work together in wavetable development. The Roland sample sets appear in several hardware configurations as well as in software, in the Apple QuickTime Musical Instruments set.

Wavetable sound cards, like their FM synthesizer predecessors, support the MIDI specification and in particular the General MIDI specification. This guarantees that a pre-determined instrument number will call up the patch that most closely resembles that instrument's definition, regardless of whether the card uses 2- or 4-operator FM synthesis or sample-based technology. This particular subset of the MIDI spec is responsible for very wide support across different sound cards, drivers, and sound technologies in the PC game world, allowing music for common instrument sounds to be played transparently from one system to the next.
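As a rough illustration of how a fixed instrument number selects a patch, the sketch below builds the raw bytes of a MIDI Program Change message. The three program numbers are taken from the General MIDI Level 1 instrument list; the function name and the small lookup table are assumptions made for this example, not part of any library.

```python
# A few General MIDI Level 1 program numbers (1-based, as printed in the GM list).
GM_PROGRAMS = {
    "Acoustic Grand Piano": 1,
    "Acoustic Bass": 33,
    "Tenor Sax": 67,
}


def program_change(channel, instrument):
    """Build the 2-byte MIDI Program Change message for a 1-based channel (1-16)."""
    if not 1 <= channel <= 16:
        raise ValueError("MIDI channel must be 1-16")
    program = GM_PROGRAMS[instrument]
    status = 0xC0 | (channel - 1)        # 0xCn means Program Change on channel n
    return bytes([status, program - 1])  # the value sent on the wire is 0-based


msg = program_change(2, "Tenor Sax")  # bytes 0xC1, 0x42
```

Any General MIDI device receiving these two bytes should switch channel 2 to its own closest tenor sax patch, whether that patch is FM or sample based.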

The most important CD format for any musician or sound designer is of course Red Book. This is the CD audio standard for the common music CD that one listens to on the home stereo or in the car. The Red Book format is often combined with Yellow Book data to create a "hybrid" CD. Examples of this combination are very common on all platforms, for example classical music CD-ROMs, game console CD-ROMs, and some of what were originally considered "enhanced CDs". The big issue with these hybrids is that the first track is the Yellow Book partition and the remaining tracks contain CD audio. A standard audio CD player would see track 1 as a scratchy-sounding "blank". This format, however, continues to be popular in the game world because of the high quality of sound for either musical or environmental accompaniment. This, of course, assumes that everything else is playing from RAM.

For a computer musician, getting a gig doing strictly Red Book tracks is just like scoring a movie or getting a record contract, regardless of the platform. Related to this hybrid is the Blue Book specification. This is essentially the same thing as a Yellow/Red hybrid, except that the technology keeps an audio CD player from recognizing the Yellow Book data track. This transparent format has been embraced by the music recording industry for delivering a computer presentation on audio CDs that have no problems playing on regular stereos. As a concept it seems like a good idea, but it has proven to add little to audio CD sales for those that release them.

Mod Files and Direct Music

One of the most mysterious digital audio formats ever mentioned in the game and multimedia scene is that of the Mod file. Going back to the ancient days of the Amiga and living on through the SoundBlaster specification for the PC, this has remained a fairly underground format, being exchanged between European hackers and very early "net surfers".

The way music is stored in the Mod format is neither as musical instruction code (such as MIDI) nor as a straight digital recording, but as both. The key to conservation and compression of materials is to use your most expensive assets as many times as possible. The Mod file takes full advantage of this concept by manipulating a couple of handfuls of short samples and playing them back in a musical structure determined by fairly compact instruction code. The instructions themselves are NOT MIDI but bear a very close resemblance to it.
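The idea of a few short samples driven by compact instructions can be sketched as follows. This is a deliberately simplified, hypothetical structure and not the real Mod file layout: the sample data, the pattern rows, and the `render` function are all invented for illustration.

```python
# Hypothetical tracker-style song data (NOT the actual Mod file layout).
SAMPLES = {
    "kick": [0.9, 0.5, 0.1, 0.0],  # each sample is stored exactly once
    "hat": [0.3, -0.3],
}

# Each row is one time step: the name of a sample to trigger, or None for silence.
PATTERNS = [
    ["kick", "hat", None, "hat"],
]
ORDER = [0, 0, 0, 0]  # play pattern 0 four times in a row
ROW_LENGTH = 4        # output samples per row


def render(samples, patterns, order, row_length):
    """Expand the compact instructions into a full stream of output samples."""
    total_rows = sum(len(patterns[p]) for p in order)
    out = [0.0] * (total_rows * row_length)
    row_index = 0
    for p in order:
        for name in patterns[p]:
            if name is not None:
                start = row_index * row_length
                for i, value in enumerate(samples[name]):
                    if start + i < len(out):
                        out[start + i] += value  # mix the triggered sample in
            row_index += 1
    return out


song = render(SAMPLES, PATTERNS, ORDER, ROW_LENGTH)
stored = sum(len(s) for s in SAMPLES.values())
```

Here 6 stored sample values plus a handful of instructions expand into 64 output values, which is the storage trade the format is built around.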

A unique advantage of using Mods for creating music over other forms of synthesis is that in the digitized sample, one has all the ambiance and nuance of an actual recording. Simple synthesis requires additional signal processing or voice doubling to even come close. The inherent quality of lower-resolution samples also contributes to the stylistic genre of Hip Hop, Techno and European Disco. With the unquestionable dominance of the PC/DOS machine and the eventual development of the SoundBlaster sound card, there was no better playback system for the format to migrate to. A huge crop of players and editors have appeared in circulation on the net and otherwise, for DOS, Windows as well as the Macintosh.

This style of music and audio data can easily be produced in Pro Tools, using the Identify Beat command on a rhythm loop. By creating your composition in Pro Tools and keeping track of using the fewest number of loops for the greatest amount of repetition, you can see how the Mod file format conserves storage space with its audio data. Other similar routines can be found in ACID's "save as embedded" command, creating a complex performance with only one copy of each wave file used.

Microsoft's DirectMusic engine creates game music and sound effects using this very principle. Playback can be varied in a number of ways, including speed, pitch and musical variation, delivering a large number of audio experiences from very few audio assets. This technique can be heard very evidently in Halo on Xbox.
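One simple way a single stored sample can yield several different sounds is to read it back at a different rate, which changes speed and pitch together. The sketch below is a generic illustration using linear interpolation; it does not use or represent the DirectMusic API, and the function name is an assumption.

```python
def play_at_rate(samples, rate):
    """Read `samples` at `rate` times normal speed, interpolating between values.

    A rate of 2.0 plays twice as fast (an octave higher); 0.5 plays at half
    speed (an octave lower).
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    out = []
    last = len(samples) - 1
    pos = 0.0
    while pos <= last:
        i = int(pos)
        frac = pos - i
        nxt = samples[min(i + 1, last)]
        out.append(samples[i] * (1 - frac) + nxt * frac)
        pos += rate
    return out


ramp = [0.0, 1.0, 2.0, 3.0, 4.0]
faster = play_at_rate(ramp, 2.0)  # half as many output values
slower = play_at_rate(ramp, 0.5)  # roughly twice as many output values
```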

Similarly, reducing sample rate tends to be the most common technique for reclaiming RAM and storage space. Some sounds definitely reduce in rate better than others. For example, instrument sounds with higher frequencies need the higher sample rate for crispness and presence, while lower, muted instruments, such as basses, can be reduced significantly without any noticeable quality loss. Also look carefully at whether a file really needs to be in stereo or not. Many stereo samples carry redundant data on two channels.

The art of reusing samples as many times as possible in a composition can be justified as developing a theme, but it is the sign of a true compression master to create a tune of repetitious material that everybody likes. A sample can take on quite a different quality as a loop or played at different rates.
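The space reclaimed by lowering the sample rate or dropping to mono is easy to estimate for uncompressed PCM audio. The figures below (a 10-second sound, 16-bit samples, 44.1 kHz versus 22.05 kHz) are assumed values chosen only to show the arithmetic.

```python
def pcm_bytes(seconds, sample_rate, bits, channels):
    """Size in bytes of uncompressed PCM audio."""
    return int(seconds * sample_rate * (bits // 8) * channels)


def to_mono(left, right):
    """Mix two channels into one by averaging each pair of samples."""
    return [(l + r) / 2 for l, r in zip(left, right)]


full = pcm_bytes(10, 44100, 16, 2)     # stereo at 44.1 kHz: 1,764,000 bytes
reduced = pcm_bytes(10, 22050, 16, 1)  # mono at 22.05 kHz:    441,000 bytes
savings = 1 - reduced / full           # 0.75, i.e. three quarters reclaimed
```

Halving the rate and folding to mono each cut the size in half, so together they leave one quarter of the original. An actual rate conversion should be done in an audio editor, which filters the sound before discarding samples.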

In practice I have found that compression rarely improves sound quality. When used in combinations of compressed and non-compressed material, such as on the Sony PlayStation, there is definitely a benefit gained with the space saved. Another trick in this application is that lower quality, compressed sounds can always be improved with any real-time DSP available in the system, such as reverb or delay. The final mix for any game audio is how it is played from the box to its target system, whether that is a TV, home stereo, or desktop computer.

Exercise In Game Scoring

Determine how many levels of background music are going to be needed. In a game engine with limited memory or processing power, MIDI will be the most efficient playback method, using either custom samples or the General MIDI instrument set. For higher-performance playback, streaming audio or Mod files might be used. In the simplest playback engines, a short phrase of music can be stored in memory and looped in the background of the game level. More sophisticated playback engines, like DirectMusic, can vary the BGM by playing different versions or changing the audio stream to be specific to the game state.
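The logic of switching looped background music by game state can be sketched independently of any engine. The class below is hypothetical: the class name, the file names, and the state names are placeholders, and a real engine would start and stop actual playback where this sketch only records the choice.

```python
class BgmPlayer:
    """Tracks which looped background track should be playing for a game state."""

    def __init__(self, tracks, default):
        self.tracks = tracks    # mapping of game state -> track file name
        self.default = default  # fallback track for unknown states
        self.current = None

    def set_state(self, state):
        """Select the track for `state`; return True if the music changed."""
        track = self.tracks.get(state, self.default)
        if track == self.current:
            return False  # already looping this track, so keep it playing
        self.current = track
        return True


# Placeholder file names, one looped track per level.
player = BgmPlayer(
    {"start": "start.wav", "space": "space.wav", "jungle": "jungle.wav"},
    default="start.wav",
)
changed_first = player.set_state("space")  # music switches to space.wav
changed_again = player.set_state("space")  # no change, the loop continues
```

Returning a flag instead of restarting the track avoids an audible restart of the loop every time the game re-enters the same state.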

Start with a MIDI file

Using ACID, a MIDI file can be manipulated and saved as different versions of the original, keeping a common general feeling of the music but changed to fit the level. The simple editing in ACID allows for muting of tracks and changing instrument patches; for example:

- rearranging the MIDI file by muting all instruments except Bass and Drums
- using only certain passages of the MIDI file; such as rhythms or instrumental sections
- changing patches on certain instruments; such as changing the Bass to a Tenor Sax

Each version can then be rendered by ACID as a short WAV file and used as level music stored in memory (GameMaker example). Create a different arrangement of the MIDI data for each game level (start, space, jungle, future).

Variations with Audio and DSP

Using ACID or VEGAS, each of the rendered audio files can be further modified with effects; such as reverb, echo, filtering, EQ, etc. Using FX automation in either of these programs, very distinct and colorful characteristics can then be applied to each of the different versions of the audio files. These can then be assigned to each of the game levels, for instance:

- an unmodified version of the WAV file would be used in the Start level
- a version modified with reverb would be used for the Space level
- a version modified with an automated resonant filter would be used for the Future level
- an arrangement edited at the MIDI level would only have drums, and be used in the Desert level

Each of the rendered audio files would need to be either processed in ACID with FX at the mix level before rendering, or brought into Vegas or Sound Forge and modified with FX plugins, and then saved again. Each version would then be imported into the game engine and assigned to a level.
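To give a sense of what an echo effect does to the audio data, here is a minimal feedback delay written in plain Python. It is a generic textbook-style sketch, not the algorithm used by ACID, Vegas, or Sound Forge, and the parameter names and values are assumptions.

```python
def echo(samples, delay, feedback=0.5, mix=0.5):
    """Add decaying repeats of the input, spaced `delay` samples apart."""
    if delay <= 0:
        raise ValueError("delay must be at least 1 sample")
    buf = [0.0] * delay  # circular buffer holding the last `delay` samples
    out = []
    for n, dry in enumerate(samples):
        i = n % delay
        delayed = buf[i]                   # what was written `delay` samples ago
        buf[i] = dry + delayed * feedback  # feed part of the echo back in
        out.append(dry + delayed * mix)    # blend the echo with the dry signal
    return out


impulse = [1.0] + [0.0] * 7
wet = echo(impulse, delay=2, feedback=0.5, mix=0.5)
# wet == [1.0, 0.0, 0.5, 0.0, 0.25, 0.0, 0.125, 0.0]
```

Each repeat is half as loud as the one before it, which is the behavior of the feedback control on a typical delay plugin.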

Working with a Game Tool (GameMaker)

Gamemaker can be downloaded for free at:


From the example file provided in class, open the sound resources in the explorer window and LOAD sounds.

Audio must be loaded as a Normal sound; MIDI can be loaded as Background Music.

Copyright © 2012 - 2015 David Javelosa