SGI Audio Tools

From N64brew Wiki
Revision as of 06:17, 8 July 2021 by Danielface

Introduction

The Nintendo 64 SDK comes with two "batteries included" audio libraries, the SGI Audio Tools and the N64SoundTools. The SGI Audio Tools are a collection of command-line tools for preparing samples and MIDI sequences for playback on the Nintendo 64. The N64SoundTools was written by Acclaim Studios Manchester (formerly Software Creations) and encompasses a kind of DAW for authoring and editing songs for the Nintendo 64.

This article aims to give a step-by-step reference for authoring sounds and music with the SGI Audio Tools and playing them in a NuSystem-based game. While the SGI Audio Tools are not as intuitive and straightforward as the N64SoundTools, a collection of command-line tools has its advantages: they're entirely scriptable and can help automate compiling/editing of sound data.

Prerequisites

This article assumes you're familiar with the following terms/concepts:

  • Audio samples
  • ADSR and envelopes
  • Sample rate
  • MIDI
  • Linear predictive coding
  • AIFF format
  • the PATH environment variable (for program quick access)

If you're a bit unfamiliar, a quick search, tutorial, or Wikipedia skim should suffice.

This article assumes you have the SGI Audio Tools as part of the Nintendo 64 SDK. The programs in particular you're going to need are:

  • tabledesign
  • vadpcm_enc
  • ic
  • midicvt
  • midicomp
  • sbc

This article also assumes that you're using the aforementioned programs in a Windows 95-like environment. An emulator, such as Oracle VirtualBox, works fine too.

The n64decomp project has decompiled tabledesign and adpcm. It's possible to build those two particular programs yourself and run them in the environment of your choice, which might make your life a little easier!

A good warm-up for this article might be to compile and run the nu3 NuSystem sample, as it's more or less a "hello world" that plays the sort of audio files we're looking to generate. Take note of how the Makefile includes the audio library and how the spec file adds the sbk, ctl, and tbl files to the ROM. If you're able to compile/run nu3, even in an emulator, you'll be in a good place to test/debug/iterate on any issues that pop up in your program.

Authoring a Song with the SGI Audio Tools

Compressing Sequence Data

Converting Your MIDI File(s)

MIDI files are generally either Type 0 or Type 1. The former specifies all of the notes in a single "track" while the latter has multiple tracks, typically one for each instrument. The SGI tools require MIDI files to be in Type 0 and provide the midicvt tool to convert to it.

Programs such as MuseScore likely export in Type 1, so it's usually a good idea to convert to Type 0 before continuing.

For each of your original MIDI files, run the following:

   midicvt some_midi.mid some_midi_converted.mid

some_midi_converted.mid is the name of the converted file in this example.

Compressing Your MIDI File(s)

Once your MIDI files have been converted to Type 0, we'll be converting them to a compressed sequence format specialized for embedded playback on the Nintendo 64.

The NuSystem library is written to use compressed MIDI files for songs. If you inspect the library, you'll notice that it uses an ALCSPlayer for storing/playing songs. It is possible to use uncompressed Type 0 MIDI, but you'll need to look into editing/rebuilding NuSystem or writing your own audio code with Nintendo's core audio library.

For each of your converted MIDI files, run the following:

   midicomp some_midi_converted.mid some_midi_compressed.cmf

You'll now have one cmf file for each of your songs.
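
Since both steps are plain command-line calls, they're easy to script. The following is a minimal sketch that only builds the two command lines for a given MIDI file. The `_converted` and `_compressed` suffixes follow the examples above, and the function name `midi_commands` is made up for illustration.

```python
from pathlib import Path


def midi_commands(midi_path):
    """Return the midicvt and midicomp command lines for one MIDI file."""
    src = Path(midi_path)
    # Naming follows the examples in this article; any names would work.
    converted = src.with_name(src.stem + "_converted.mid")
    compressed = src.with_name(src.stem + "_compressed.cmf")
    return [
        ["midicvt", str(src), str(converted)],        # Type 1 -> Type 0
        ["midicomp", str(converted), str(compressed)],  # Type 0 -> compressed
    ]


for command in midi_commands("some_midi.mid"):
    print(" ".join(command))
```

Each returned list can be handed to `subprocess.run(command, check=True)` if the tools are on your PATH.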

Compiling Your MIDI File(s)

Now that we've compressed each MIDI file, it's time to compile them into one "song bank". This will be added to your ROM and loaded in at runtime. To do this, we'll be using the sbc tool.

Run the following command with each of your cmf files as parameters:

   sbc -Osongs.sbk first_song_compressed.cmf second_song_compressed.cmf third_song_compressed.cmf

The ordering is important here! Keep note of the order of each parameter, because when you select songs in your game's source code, you'll be indexing them as they're ordered here (e.g. first_song_compressed.cmf will be 0, second_song_compressed.cmf will be 1, etc.).

Note the lack of space between the -O flag and the output file name. This seems to be intended. 🤷
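
Because the song index depends entirely on argument order, it can help to generate the sbc command and a matching set of C constants from the same list. Here's a rough sketch of that idea. The `SONG_` macro naming and both function names are hypothetical and not something the SDK provides.

```python
import re


def sbc_command(cmf_files, output="songs.sbk"):
    """Build the sbc command line; the output name is glued to -O."""
    return ["sbc", "-O" + output] + list(cmf_files)


def song_index_header(cmf_files):
    """Emit one #define per song, numbered in command-line order."""
    lines = []
    for index, name in enumerate(cmf_files):
        stem = name.rsplit(".", 1)[0]
        # Turn the file name into a valid C identifier (hypothetical scheme).
        macro = "SONG_" + re.sub(r"[^A-Za-z0-9]", "_", stem).upper()
        lines.append(f"#define {macro} {index}")
    return "\n".join(lines)


songs = ["first_song_compressed.cmf", "second_song_compressed.cmf"]
print(" ".join(sbc_command(songs)))
print(song_index_header(songs))
```

Keeping a single list as the source of truth means that reordering songs updates both the bank and the indices your game code uses.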

Compressing Sounds

Converting samples with SoX

SoX bills itself as the Swiss Army knife of sound processing programs. Its uses include (but aren't limited to) converting audio between formats, applying effects, and even recording. Given that SoX is an open-source tool, it's well worth including in any game developer's setup.

The tabledesign and vadpcm_enc tools require audio samples to be in AIFF or AIFC. If the samples you're using are in a different format, such as WAV, you can use SoX to batch-convert your samples. It's also a good idea to resample each sample to the same sample rate, such as 32000 Hz. If you're generating your instrument bank file via a script, you can then hardcode the sample rate, which will let you spend less time coding/debugging.

To convert an arbitrary WAV file to a mono AIFF file with a sample rate of 32000 Hz, we can enter:

   sox some_file.wav -r 32000 -c 1 converted_file.aiff

This article assumes that the reader is converting their files to AIFF with SoX.
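
If you have a folder full of WAV files, the same SoX invocation can be generated per file. This is a small sketch, assuming you want the output next to the input with an .aiff extension. The function name `sox_command` and its defaults are illustrative only.

```python
from pathlib import Path


def sox_command(wav_path, sample_rate=32000, channels=1):
    """Build the SoX command that resamples one file to mono AIFF."""
    src = Path(wav_path)
    dst = src.with_suffix(".aiff")  # same name, new extension
    return ["sox", str(src), "-r", str(sample_rate), "-c", str(channels), str(dst)]


for wav in ["kick.wav", "snare.wav"]:  # hypothetical sample names
    print(" ".join(sox_command(wav)))
```

Using one shared `sample_rate` default here keeps every sample at the rate you'll later put in the bank section.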

Creating a code book for each file

You'll want to create a code book for each AIFF sample you want to use in your song. To do this, you'll run the tabledesign command on each of your samples and save the output of that program to a file.

For clarity, we'll give each code book the .table suffix, but this isn't required.

On each of your samples, run the following:

   tabledesign song_sample.aiff > song_sample.table

It's worth noting that by default tabledesign will print to STDOUT. The > operator for writing to a file should work on both Unix-like systems and Windows here.

Compressing each sample

Once we've created our code book(s), we'll want to convert our AIFF samples to Nintendo's compressed AIFC format. To do that, we'll be using vadpcm_enc.

On each of your samples, run the following:

   vadpcm_enc -c song_sample.table song_sample.aiff compressed_song_sample.aifc

The resulting .aifc file(s) will be compiled into a sound bank.
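
The code book and compression steps pair up naturally, so here's a sketch that runs both for one sample. It assumes tabledesign and vadpcm_enc are on your PATH and uses the same file naming as the examples above. The `run` parameter is a made-up convenience so the function can be dry-run without the tools installed.

```python
import subprocess
from pathlib import Path


def encode_sample(aiff_path, run=subprocess.run):
    """Create a code book for one AIFF sample, then compress it to AIFC."""
    src = Path(aiff_path)
    table = src.with_suffix(".table")
    aifc = src.with_name("compressed_" + src.stem + ".aifc")
    # tabledesign prints the code book to STDOUT, so redirect it into a
    # file; this is the scripted equivalent of the > operator.
    with open(table, "w") as table_file:
        run(["tabledesign", str(src)], stdout=table_file, check=True)
    run(["vadpcm_enc", "-c", str(table), str(src), str(aifc)], check=True)
    return aifc
```

Calling `encode_sample("song_sample.aiff")` would leave song_sample.table and compressed_song_sample.aifc next to the original sample.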

Authoring the Instrument Bank File

Before we Begin

This is likely the trickiest and most confusing part of the SGI Audio Tools, so be sure to take a break if you're finding yourself frustrated. Take comfort in that what you're feeling is pretty normal, and that others have been in the same spot.

Section 18.1.12 of the Nintendo 64 Programming Manual is a pretty comfortable overview of what each section of an instrument bank file does. It's not "correct" in certain areas though, and copy/pasting the shown examples won't always work with ic. A particular example is that the manual says to reference each instrument in your bank section with program when in fact you'll need to use instrument instead.

If you're looking for a reference of a working instrument bank, it's best to check the example banks at ultra/usr/src/pr/assets/banks/ included with the SDK. They'll run through ic fine and help clarify things for you.

The Instrument Bank File

An instrument bank file usually has the file extension of .ins. It contains the following sections:

  • envelope section(s), indicating an ADSR
  • keymap section(s), indicating the range of "piano keys" a sound occupies, as well as other data
  • sound section(s), indicating a sampled sound as well as the envelope and keymap it uses
  • instrument section(s), indicating a "MIDI instrument" with a volume, pan, and various sounds
  • a single bank section, indicating the sample rate and which instruments correspond to which MIDI instrument numbers in your sequences. This will include a specialized instrument for the drumset channel.

envelope

The SGI Audio Tools represent an ADSR with envelopes. Volume for each of the ADSR points ranges from 0 to 127. Time is modelled in microseconds for each of the ADSR points.

Different samples and sounds can use the same envelope, but your tracks will generally sound better if you ensure that each sample has a matching envelope.

An envelope looks like the following:

   envelope AnExampleEnvelope
   {
       attackTime    = 10000;
       attackVolume  = 127;
       decayTime     = 500000;
       decayVolume   = 100;
       releaseTime   = 200000;
   }

In the example above, the volume goes from 0 to 127 in 10000 microseconds (the attack), then decays to 100 over 500000 microseconds. When the sound using this envelope ends, the sound fades out over 200000 microseconds. These timings should generally match up to your sample.
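
To get a feel for the timeline, the sketch below evaluates the example envelope at a given time after the note starts. It assumes straight-line ramps between the points, which is a simplification for illustration only, since the actual ramp shape used by the audio library isn't covered here. The release stage is left out and the function name is made up.

```python
def envelope_volume(t_us, attack_time=10000, attack_volume=127,
                    decay_time=500000, decay_volume=100):
    """Approximate volume t_us microseconds after note-on (before release).

    Assumes linear ramps; defaults mirror AnExampleEnvelope above.
    """
    if t_us <= 0:
        return 0.0
    if t_us < attack_time:
        # Attack: ramp from 0 up to attackVolume.
        return attack_volume * t_us / attack_time
    if t_us < attack_time + decay_time:
        # Decay: ramp from attackVolume to decayVolume.
        progress = (t_us - attack_time) / decay_time
        return attack_volume + (decay_volume - attack_volume) * progress
    # After the decay the note holds at decayVolume until it is released.
    return float(decay_volume)


for t in (5000, 10000, 260000, 600000):
    print(t, envelope_volume(t))
```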

For more information, review the N64 SDK Documentation at 18.1.2.5.

keymap

A keymap represents a range of "keys" for a sound to cover. The MIDI standard numbers each pitch of the Western chromatic scale from 0 to 127, with 60 being middle C.

A keymap looks like the following:

   keymap AnExampleKeymap
   {
       velocityMin = 0;
       velocityMax = 127;
       keyMin      = 0;
       keyMax      = 127;
       keyBase     = 60;
       detune      = 0;
   }

The example above maps to every available pitch as keyMin is 0 and keyMax is 127. keyBase represents the "reference pitch" to scale when changing keys. In the example above, a sample with the pitch of middle C should be used. Samples at different frequencies will require a different keyBase value.
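
To make the role of keyBase concrete, here's the usual equal-temperament arithmetic for how far a sample has to be sped up or slowed down to reach another key. This is the standard 12-tone formula rather than anything taken from the SDK, and it ignores detune. The function name is illustrative.

```python
def pitch_ratio(key, key_base=60):
    """Playback-rate multiplier needed to play `key` from a keyBase sample.

    Each semitone is a factor of 2 ** (1 / 12), so 12 keys up doubles it.
    """
    return 2.0 ** ((key - key_base) / 12.0)


print(pitch_ratio(60))  # the sample's own pitch, played back unchanged
print(pitch_ratio(72))  # one octave up: twice as fast
print(pitch_ratio(48))  # one octave down: half speed
```

The further a note is from keyBase, the more the sample is stretched, which is why a single sample covering all 128 keys tends to sound worse at the extremes.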

For more information, review the N64 SDK Documentation at 18.1.2.4.

sound

A sound combines a keymap, envelope, and a compressed sample file together into a unit. A sound also has properties for stereo panning and volume from 0 to 127 each.

An example might look like:

   sound AnExampleSound
   {
       use ("your/particular/path/to/compressed_song_sample.aifc");
       pan    = 64;
       volume = 127;
       keymap = AnExampleKeymap;
       envelope = AnExampleEnvelope;
   }

Note how the keymap and envelope parts correspond to names of our examples above.

For more information, review the N64 SDK Documentation at 18.1.2.3.

instrument

An instrument models a single MIDI instrument. It consists of one or more sounds.

An example instrument might look like:

   instrument AnExampleInstrument
   {
       volume = 127;
       pan    = 64;
       sound  = AnExampleSound;
   }

Note how the sound property matches the name of an existing sound above.

An instrument can specify multiple sounds. For example, GenMidiBank.inst in the N64 SDK uses four sounds for a MIDI Cello:

   instrument Cello
   {
       volume = 127;
       pan    = 64;
       vibratoType  = 128;   /* 128, 129, 130, 131 */
       vibratoRate  = 222;   /* 0 to 255 */
       vibratoDepth = 6;    /* 0 to 255 */
       vibratoDelay = 1;     /* 1 to 255 */
       sound  = Cello00;
       sound  = Cello01;
       sound  = Cello02;
       sound  = Cello03;
   }

Each of the corresponding sounds has a different keymap and sample that covers a different range of notes. This can produce nicer-quality audio as the pitch of a sample doesn't need to be distorted as much. The tradeoff is that more audio memory is required for your instrument, especially at higher sampling frequencies such as 44100 Hz.
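
One way to pick keyMin/keyMax values for a multi-sample instrument is to give each key to the sample whose keyBase is nearest. The sketch below does that split for a list of base keys. The base keys used are made-up values, not the ones from GenMidiBank.inst, and the function name is hypothetical.

```python
def split_key_ranges(base_keys, lowest=0, highest=127):
    """Assign every key from lowest to highest to its nearest base key."""
    bases = sorted(base_keys)
    ranges = []
    key_min = lowest
    for i, base in enumerate(bases):
        if i + 1 < len(bases):
            # Split halfway between this sample and the next one up.
            key_max = (base + bases[i + 1]) // 2
        else:
            key_max = highest
        ranges.append({"keyBase": base, "keyMin": key_min, "keyMax": key_max})
        key_min = key_max + 1
    return ranges


for keymap in split_key_ranges([36, 48, 60, 72]):
    print(keymap)
```

With four samples an octave apart, no note in the middle of the range ends up more than about half an octave away from its sample's original pitch.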

For more information, review the N64 SDK Documentation at 18.1.2.2.

bank

The bank section is a collection of instruments and the final piece of the puzzle for our file. Each MIDI instrument number gets assigned a particular instrument.

An example bank might look like:

   bank SongBank 
   {
       sampleRate = 32000;
       percussionDefault = Percussion_Kit;
       instrument [0] = AnExampleInstrument;
       instrument [65] = AnExampleAltoSax;
       instrument [107] = AnExampleKoto;
   }

Here we associate various instruments with different MIDI instrument numbers. 0 represents MIDI notes that are played with an Acoustic Grand Piano, and we've told the audio library to use AnExampleInstrument as the voice of the Acoustic Grand Piano. MIDI notes that use instrument 65 get associated with an instrument called AnExampleAltoSax.

You'll want the value for sampleRate to match the frequency your sample files are tuned to. This is the reason we converted all of our samples to the same rate with SoX above.

percussionDefault is explained in the following section.

Note that you aren't required to have an instrument associated with every MIDI instrument number. If your song, for example, is only a solo piano piece then you'd only need to worry about an instrument for 0. It's best to only include sounds for MIDI instruments that you need. Anything else is audio memory that could be better spent elsewhere, such as sound effects or higher-frequency samples.

For more information, review the N64 SDK Documentation at 18.1.2.1. Note that the example for bank in the manual says to use program for referencing an instrument. This isn't correct and the ic tool will give you an error if program is used.
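
If you're generating your .ins file from a script, the bank section is a good candidate since it's just a table of numbers to names. Below is a minimal sketch that renders one from a Python dictionary, using the instrument keyword as described above. The function name and its parameters are illustrative.

```python
def bank_section(name, sample_rate, instruments, percussion=None):
    """Render a bank section from {midi_instrument_number: instrument_name}."""
    lines = [f"bank {name}", "{", f"    sampleRate = {sample_rate};"]
    if percussion is not None:
        lines.append(f"    percussionDefault = {percussion};")
    for number in sorted(instruments):
        if not 0 <= number <= 127:
            raise ValueError(f"MIDI instrument number out of range: {number}")
        lines.append(f"    instrument [{number}] = {instruments[number]};")
    lines.append("}")
    return "\n".join(lines)


print(bank_section(
    "SongBank",
    32000,
    {0: "AnExampleInstrument", 65: "AnExampleAltoSax", 107: "AnExampleKoto"},
    percussion="Percussion_Kit",
))
```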

Percussion Sounds

Percussion in MIDI is a bit unique in that channel 10 is reserved for percussion and that each note maps to a specific instrument. "Middle C" has a note number of 60 which is always a high bongo sound on the percussion channel.

To accommodate this, we create a special instrument for percussive sounds. Each different instrument will have its own sound, envelope, and keymap that only covers its corresponding key.

An example percussion setup for an Electric Bass Drum (MIDI key 36) might look like the following:

   keymap Percussive_Bass_Drum_1Keymap
   {
       velocityMin = 0;
       velocityMax = 127;
       keyMin      = 36;
       keyMax      = 36;
       keyBase     = 36;
       detune      = 0;
   }
   sound Percussive_Bass_Drum_1Sound
   {
       use ("electric_bass_drum_sample.aifc");
       pan    = 64;
       volume = 127;
       keymap = Percussive_Bass_Drum_1Keymap;
       envelope = SomeBassDrumEnvelope;
   }

Which would then integrate into an example percussion instrument:

   instrument Percussion_Kit
   {
       volume = 127;
       pan    = 64;
       sound = Percussive_Bass_Drum_1Sound;
       sound = Percussive_Acoustic_SnareSound;
       sound = Percussive_Low_TomSound;
       sound = Percussive_Open_Hi_HatSound;
       sound = Percussive_High_Mid_TomSound;
       sound = Percussive_Crash_Cymbal_1Sound;
       sound = Percussive_High_TomSound;
       sound = Percussive_Ride_Cymbal_1Sound;
       sound = Percussive_High_BongoSound;
       sound = Percussive_Low_BongoSound;
       sound = Percussive_CabasaSound;
       sound = Percussive_MaracasSound;
       sound = Percussive_ShakerSound;
   }

The bank would then set percussionDefault to be Percussion_Kit.
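
Since every percussion sound needs its own single-key keymap, writing them by hand gets repetitive. Here's a sketch that renders the keymap and sound pair for one percussion key, following the naming pattern of the example above. The function name is made up, and the sample path and envelope name are whatever you've defined elsewhere in your file.

```python
def percussion_sections(key, name, sample_path, envelope):
    """Render the keymap and sound sections for one percussion key."""
    keymap_name = f"Percussive_{name}Keymap"
    sound_name = f"Percussive_{name}Sound"
    return "\n".join([
        f"keymap {keymap_name}",
        "{",
        "    velocityMin = 0;",
        "    velocityMax = 127;",
        # The keymap covers exactly one key, which is also its base key.
        f"    keyMin      = {key};",
        f"    keyMax      = {key};",
        f"    keyBase     = {key};",
        "    detune      = 0;",
        "}",
        f"sound {sound_name}",
        "{",
        f'    use ("{sample_path}");',
        "    pan    = 64;",
        "    volume = 127;",
        f"    keymap = {keymap_name};",
        f"    envelope = {envelope};",
        "}",
    ])


print(percussion_sections(36, "Bass_Drum_1",
                          "electric_bass_drum_sample.aifc",
                          "SomeBassDrumEnvelope"))
```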

N64 SDK Example Instrument Bank

The N64 SDK has reference Instrument Banks at ultra/usr/src/pr/assets/banks. If you're ever stuck on how something should look or are getting errors with ic, they can be a helpful guide to see how things are done.

Compiling the Instrument Bank

We use the instrument compiler program (ic) to turn our Instrument Bank file into .tbl and .ctl files for use in-game.

Run ic on your .ins file like the following:

   ic -OSongBank SongBank.ins

Note that SongBank in this case is whatever you called your .ins file. Also, note the lack of space between the -O flag and the output name. This seems to be correct for the tool.

If your .ins file doesn't have any errors, ic should complete and you should then have .tbl and .ctl files in the same directory, with the name you put after the -O flag.

If there are syntax or other errors in your .ins file, ic will print an error message.

Even though the output can look garbled, try not to be discouraged! The final bit of output will show the line number of the error. The first place to look is often the associated line.

In the example this article was based on, the error message was caused by a missing semicolon on line 28/29.

Much like in the C family of programming languages, semicolons mark the end of statements. If you're missing one, the instrument compiler might treat two lines as a single statement. Be sure to check the lines above and below if you're not initially sure where the error might be.
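
A quick pre-check before running ic can catch the most common slip. The sketch below flags lines that look like statements (an assignment or a use line) but don't end in a semicolon. It's a rough heuristic and not a reimplementation of ic's parser: it only strips /* comments that start on the same line, and the function name is made up.

```python
def find_missing_semicolons(ins_text):
    """Return 1-based line numbers of statements lacking a trailing ';'."""
    problems = []
    for number, line in enumerate(ins_text.splitlines(), start=1):
        # Drop a trailing /* ... */ comment, then surrounding whitespace.
        code = line.split("/*", 1)[0].strip()
        if not code or code in ("{", "}"):
            continue
        # Section headers such as "envelope Name" contain no '='.
        is_statement = "=" in code or code.startswith("use")
        if is_statement and not code.endswith(";"):
            problems.append(number)
    return problems


example = """envelope AnExampleEnvelope
{
    attackTime    = 10000
    attackVolume  = 127;
}"""
print(find_missing_semicolons(example))
```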

Finishing up

Once we've completed the steps above, we should now have the following:

  • An sbk file that consists of our converted/compressed MIDI sequences
  • ctl and tbl files for our samples

We'll be including the above files in our ROM's spec file, then requesting the audio library to load and play them.

Playing a song with NuSystem

TODO

Linking the Audio Library

TODO

Setting up playback

TODO

Starting/stopping playback

TODO

Making an Instrument Bank for Sound Effects

TODO

Looping

Looping Compressed Audio

TODO

See 20.5 of the N64 SDK for more information on this.

Looping Non-Compressed Audio

As NuSystem is primarily compiled to use compressed sequenced audio, this is outside the scope of this article. However, 17.3.4 in the N64 SDK can help with non-compressed sequences.