2023-08-17 14:07:41 -04:00
# UXN Audio Proposal
## Problems

Currently the UXN audio device doesn't work very well for playing
complex music. There are a few reasons for this:

* Note duration is conflated with envelope shape
* Envelope resolution (about 67ms) limits tempos/subdivisions
* Microtonal music is not possible (according to the spec)
* Using the audio callback requires scheduling pauses/silence
## Proposal outline

One way to improve the situation is to disentangle the envelope
specification from the note duration, and more generally make it
easier to specify things that a composer will frequently need to
change (pitch, articulation, duration) without having to change the
underlying voice (waveform/envelope settings).

This proposal does four things:

1. Add a two-byte `duration` port that configures a note's duration
   in milliseconds. The longest possible note is about 65.5 seconds
   (`0xffff` ms).

2. Double the size of the `adsr` port. This means replacing the
   existing two-byte port with four one-byte ports for `attack`,
   `decay`, `sustain`, and `release`. Since we have 4 extra bits per
   stage, we can make the resolution of each stage finer, going from
   about 67ms to 10ms (so `0x01` means 10ms). The longest envelope
   stage is now about 2.6s (up from 1s previously).

3. Add a one-byte `mode` port, which declares what kind of note or
   sound is being played. This provides an easy way to specify
   different behaviors such as:
   * articulation (e.g. staccato, legato, etc.)
   * different sample rates (44100, 22050, 11025, or 5512 Hz)

4. Move the `volume` port to `0x35` and add a one-byte `detune` port.
   A zero value (`0x00`) indicates a "normal" semitone pitch, and
   non-zero values indicate a fractional amount to add. The pitch is
   raised by `detune/256` of a semitone (about 0.39 cents per step).
   For example, a value of `0x80` will raise the pitch by a
   quarter-tone. The port is placed just before `pitch` so that
   microtonal music can write a "micro-pitch" using one `DEO2`
   instruction.
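
The arithmetic behind item 2 can be sketched in Python. This is an illustration rather than part of the proposal: it assumes the existing `adsr` short stores attack, decay, sustain, and release from the highest nibble to the lowest, and that one old step lasts 1/15 of a second (inferred from the 1s maximum). `convert_adsr` is a hypothetical helper name.

```python
OLD_STEP_MS = 1000 / 15   # assumption: 15 steps span the old 1 s maximum
NEW_STEP_MS = 10          # proposed resolution of each one-byte stage

def convert_adsr(adsr: int) -> tuple:
    """Convert an old two-byte adsr value into four proposed one-byte values."""
    stages = []
    for shift in (12, 8, 4, 0):            # attack, decay, sustain, release (assumed order)
        nibble = (adsr >> shift) & 0xF
        ms = nibble * OLD_STEP_MS
        # Round to the nearest 10 ms step and clamp to one byte.
        stages.append(min(0xFF, round(ms / NEW_STEP_MS)))
    return tuple(stages)
```

For example, `convert_adsr(0xf123)` gives `(100, 7, 13, 20)`, i.e. stages of 1000ms, 70ms, 130ms, and 200ms.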
## Microtonal music

Here's how to encode the 17-tone equal temperament scale (17ET) as
`detune/pitch` pairs starting from middle C (`0x3c`, i.e. `#3c`).
Each step of the scale spans 70.588 cents, so we can get accurate
pitches and detunes by adding 70.588 for each step and dividing by
100: the quotient is the number of semitones to add to the root, and
the remainder (in cents) is scaled by 256/100 and rounded to give the
`detune` byte:

```
pitch  1: #003c  ( 0 semitones +  0.00 cents) -- root note is C (#3c)
pitch  2: #b53c  ( 0 semitones + 70.59 cents)
pitch  3: #693d  ( 1 semitones + 41.18 cents)
pitch  4: #1e3e  ( 2 semitones + 11.76 cents)
pitch  5: #d33e  ( 2 semitones + 82.35 cents)
pitch  6: #883f  ( 3 semitones + 52.94 cents)
pitch  7: #3c40  ( 4 semitones + 23.53 cents)
pitch  8: #f140  ( 4 semitones + 94.12 cents)
pitch  9: #a641  ( 5 semitones + 64.70 cents)
pitch 10: #5a42  ( 6 semitones + 35.29 cents)
pitch 11: #0f43  ( 7 semitones +  5.88 cents) -- almost perfect 5th (#43)
pitch 12: #c443  ( 7 semitones + 76.47 cents)
pitch 13: #7844  ( 8 semitones + 47.06 cents)
pitch 14: #2d45  ( 9 semitones + 17.64 cents)
pitch 15: #e245  ( 9 semitones + 88.23 cents)
pitch 16: #9746  (10 semitones + 58.82 cents)
pitch 17: #4b47  (11 semitones + 29.41 cents)
pitch 18: #0048  (12 semitones +  0.00 cents) -- octave is C (#48)
```
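
The table above can be reproduced with a few lines of Python. This is a sketch, not part of the proposal: `equal_temperament` is a hypothetical helper, and it works in units of 1/256 of a semitone (with integer rounding) instead of adding 70.588 cents at each step, which yields the same bytes for 17ET.

```python
def equal_temperament(steps: int, root: int = 0x3C) -> list:
    """Return detune/pitch shorts for one octave of an equal temperament."""
    shorts = []
    for k in range(steps + 1):             # steps + 1 entries, so the octave is included
        # Offset from the root in 1/256ths of a semitone, rounded to nearest.
        units = (2 * k * 12 * 256 + steps) // (2 * steps)
        semitones, detune = divmod(units, 256)
        # High byte is detune, low byte is pitch, matching one DEO2 write.
        shorts.append((detune << 8) | (root + semitones))
    return shorts

for i, value in enumerate(equal_temperament(17), start=1):
    print(f"pitch {i:2d}: #{value:04x}")
```

Working in 1/256-semitone units means a remainder that rounds up to a full semitone carries into the pitch byte automatically.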
While it's somewhat cumbersome to calculate these detune values in
advance, it only has to be done for one octave, and the resulting
microtonal pitches can be compactly stored and used.
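
To sanity-check a stored pair, it can help to convert it to a frequency. The sketch below assumes standard MIDI tuning (note 69 is A at 440 Hz), which the proposal does not state explicitly, and it ignores the special case where a pitch of `0x00` means silence. `frequency` is a hypothetical helper name.

```python
def frequency(pitch: int, detune: int = 0) -> float:
    """Frequency in Hz of a MIDI note raised by detune/256 of a semitone."""
    note = pitch & 0x7F                    # drop the loop bit, keep the 7-bit MIDI note
    # Assumption: equal-tempered MIDI tuning with note 69 at 440 Hz.
    return 440.0 * 2 ** ((note - 69 + detune / 256) / 12)
```

With these assumptions, `frequency(0x3c)` is about 261.63 Hz (middle C), and a detune of `0x80` raises any note by a quarter-tone, a frequency ratio of 2^(1/24).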
## Note duration and tempo

The `duration` and `vector` ports precisely specify the audio device
behavior. The given note should be played for the number of
milliseconds specified by `duration`, at which point the `vector`
should be called to play the next note (or next silence). If the
envelope specified by the ADSR ports is shorter than `duration`, the
articulation bits of `mode` define how to fill the remaining time. If
the envelope is longer, it will be shortened to fit, starting from
the end (R, then S) and truncating D and A only if necessary. If
`duration` is zero, the duration will be calculated dynamically from
ADSR, as it is now.

Composers can choose a duration for the smallest subdivision needed
(e.g. 125ms per 16th note to achieve 120 bpm) and then compute precise
durations for 8th notes, quarter notes, dotted-8th notes, whole notes,
and so on. Similarly, composers can use the same envelope with
staccato and legato notes to easily achieve different articulations
for different passages.
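
The tempo arithmetic described above can be sketched as follows. The helper name `note_durations` and the choice of a 16th note as the smallest subdivision are assumptions made for illustration only.

```python
def note_durations(bpm: int) -> dict:
    """Durations in ms for common note values, built from one 16th note."""
    # A quarter note lasts 60000 / bpm ms, so a 16th note is a quarter of that.
    sixteenth = round(15_000 / bpm)
    multiples = {
        "16th": 1, "8th": 2, "dotted 8th": 3,
        "quarter": 4, "half": 8, "whole": 16,
    }
    # Every value is a whole multiple of the rounded subdivision,
    # so longer notes never drift relative to shorter ones.
    return {name: sixteenth * n for name, n in multiples.items()}
```

At 120 bpm this gives 125ms for a 16th note and 2000ms for a whole note, both well within the two-byte `duration` range.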
## More flexible envelope and waveform settings

The new envelope duration range (10ms to about 2.6s per stage) allows
more complex envelopes to be specified, from slower builds and fades
to very fast attacks and releases. Similarly, allowing waveforms to be
specified at lower sampling rates potentially allows more interesting
percussion instruments to be specified without using too many bytes of
the ROM.

For comparison, the NES uses variable frequency samples to allow basic
voices/sounds without using too much space. For NTSC devices the
supported range is 4182-33144 Hz.
## Appendix A: proposed specification

    ADDR  SIZE     NAME            DESCRIPTION
    0x30  2 bytes  vector          callback address to use when note finishes playing
    0x32  2 bytes  duration (new)  duration to play sound in milliseconds (1ms resolution)
    0x34  1 byte   mode (new)      configures how to interpret addr/adsr/pitch (see below)
    0x35  1 byte   volume (moved)  4-bit volumes for left/right channels (6.7% resolution)
    0x36  1 byte   attack (new)    envelope: attack duration (vol 0-100%, 10ms resolution)
    0x37  1 byte   decay (new)     envelope: decay duration (vol 100-50%, 10ms resolution)
    0x38  1 byte   sustain (new)   envelope: sustain duration (vol 50%, 10ms resolution)
    0x39  1 byte   release (new)   envelope: release duration (vol 50-0%, 10ms resolution)
    0x3a  2 bytes  length          length of waveform data to read (in bytes)
    0x3c  2 bytes  addr            address to read waveform data from
    0x3e  1 byte   detune (new)    fraction of semitone to raise (0x80 gives a quarter tone)
    0x3f  1 byte   pitch           1-bit loop and 7-bit MIDI note (0x00 gives silence)
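
As a rough illustration of the layout above, the sixteen bytes of the proposed device page can be packed in Python with the standard `struct` module. `PAGE_FORMAT` and `pack_audio_page` are hypothetical names; the field order and sizes come from the table, and shorts are big-endian because that is how Uxn stores them.

```python
import struct

# Field order follows the proposed table, from 0x30 to 0x3f.
# ">" selects big-endian with no padding; H is a short, B is a byte.
PAGE_FORMAT = ">HHBBBBBBHHBB"

def pack_audio_page(vector, duration, mode, volume, attack, decay,
                    sustain, release, length, addr, detune, pitch) -> bytes:
    """Pack one note's settings into the 16-byte audio device page."""
    return struct.pack(PAGE_FORMAT, vector, duration, mode, volume, attack,
                       decay, sustain, release, length, addr, detune, pitch)
```

Because `detune` and `pitch` are the last two bytes, the final short of the page is exactly the "micro-pitch" value that a single `DEO2` to `0x3e` would write.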
MODES

Mode consists of the bits `xxWW LAAA` (`x` bits are unused).

The `A` bits correspond to articulation, which determines how to fill
extra space when the duration exceeds the envelope length, and also
which parts of the envelope to exclude (if any). Excluded stages are
denoted with an underscore:

```
0x00 (xxxx x000) regular       (ADSR, pads with silence)
0x01 (xxxx x001) short         (AD_R, pads with silence)
0x02 (xxxx x010) staccato      (_D_R, pads with silence)
0x03 (xxxx x011) staccatissimo (_D__, pads with silence)
0x04 (xxxx x100) legato        (ADSR, extends S)
0x05 (xxxx x101) begin slur    (ADS_, extends S)
0x06 (xxxx x110) slur          (__S_, extends S)
0x07 (xxxx x111) end slur      (__SR, extends S)
```
For shorter durations, the envelope is truncated starting from the
end. For example, with `short` articulation the release will be
truncated first, then the decay, and finally the attack.

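
The padding and truncation rules can be sketched in Python as below. This is one possible reading of the rules, not a reference implementation: `ARTICULATIONS` and `fit_envelope` are hypothetical names, and the sketch only computes stage lengths. How the volume ramp of a truncated stage should be rescaled is not covered here.

```python
# Stages kept by each articulation, and whether spare time extends sustain
# (True) or is padded with silence (False), following the table above.
ARTICULATIONS = {
    0: ("ADSR", False), 1: ("ADR", False), 2: ("DR", False), 3: ("D", False),
    4: ("ADSR", True),  5: ("ADS", True),  6: ("S", True),   7: ("SR", True),
}

def fit_envelope(articulation: int, adsr: tuple, duration_ms: int):
    """Return ({stage: ms}, silence_ms) for one note.

    adsr holds the four port values (attack, decay, sustain, release),
    each in 10 ms units.
    """
    kept, extend_sustain = ARTICULATIONS[articulation & 0x07]
    stages = {n: v * 10 for n, v in zip("ADSR", adsr) if n in kept}
    total = sum(stages.values())
    if duration_ms == 0:                   # zero: duration comes from the envelope
        return stages, 0
    if duration_ms >= total:
        spare = duration_ms - total
        if extend_sustain:
            stages["S"] += spare           # legato/slur: hold the sustain level
            return stages, 0
        return stages, spare               # otherwise pad with silence
    excess = total - duration_ms
    for name in reversed(list(stages)):    # truncate starting from the end
        cut = min(stages[name], excess)
        stages[name] -= cut
        excess -= cut
    return stages, 0
```

For instance, a `short` note with three 100ms stages and a 150ms duration loses its whole release and half of its decay, while its attack is left intact.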
The `L` bit corresponds to whether to loop or not:

```
0x00 (xxxx 0xxx) play once (do not loop)
0x08 (xxxx 1xxx) repeat note indefinitely
```

Looping will continue until a new `pitch` is written (at which point
that note's looping behavior will be used).

The `W` bits correspond to waveform type:

```
0x00 (xx00 xxxx) waveform sampled at 44100 Hz (44.1 kHz)
0x10 (xx01 xxxx) waveform sampled at 22050 Hz
0x20 (xx10 xxxx) waveform sampled at 11025 Hz
0x30 (xx11 xxxx) waveform sampled at 5512 Hz
```

Upsampling will be performed by repeating sample values as many times
as needed (2x, 4x, or 8x). The underlying sound engine is still
expected to play sounds at 44.1 kHz.
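
Upsampling by repetition is simple enough to show in a few lines of Python. The function names are hypothetical, and `w` here is the two-bit `W` field after it has been extracted from `mode`, so the sketch does not depend on where those bits sit in the byte.

```python
BASE_RATE = 44100  # rate the sound engine is expected to play at

def sample_rate(w: int) -> int:
    """Sample rate in Hz selected by the two-bit W field (0-3)."""
    return BASE_RATE >> (w & 0x03)         # 44100, 22050, 11025, 5512

def upsample(samples: bytes, w: int) -> bytes:
    """Repeat each sample 1, 2, 4, or 8 times to reach 44.1 kHz."""
    factor = 1 << (w & 0x03)
    return bytes(s for s in samples for _ in range(factor))
```

Halving the rate with a shift also explains the 5512 Hz figure: 44100 / 8 is 5512.5, which truncates to 5512.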
## Appendix B: not currently supported

There are some features which would be nice to add but which are not
strictly necessary and would require more significant changes. They
could potentially be supported in the future using additional bits
from the `mode` port, by new devices, or by a larger change.

* Vibrato/Tremolo
* Portamento/Glissando/Glide
* Effects (reverb, equalization, overdrive, etc.)
* Frequency generators/software synths

If we want to support even slower builds/fades we could use the
unused `mode` bits to change the units used by the envelope. That
would allow coarser resolutions and therefore longer stages.
## Appendix C: existing specification

    ADDR  SIZE     NAME      DESCRIPTION
    0x30  2 bytes  vector    callback address to use when note finishes playing
    0x32  2 bytes  position  read current position in sample
    0x34  1 byte   output    read envelope loudness at this moment (0x000 to 0x888)
    0x35           (unused)
    0x36           (unused)
    0x37           (unused)
    0x38  2 bytes  adsr      four 4-bit envelope values (attack/decay/sustain/release)
    0x3a  2 bytes  length    length of waveform data to read in bytes
    0x3c  2 bytes  addr      address to read waveform data from
    0x3e  1 byte   volume    4-bit volumes for left/right channels
    0x3f  1 byte   pitch     1-bit loop and 7-bit MIDI note