# UXN Audio Proposal (v2)
*(Updated with input from bd and neauoire)*
## Problems
Currently the UXN audio device doesn't work very well for playing
complex music. There are a few reasons for this:
* Note duration is conflated with envelope shape
* Envelope resolution (about 67ms per step) limits tempos/subdivisions
* Using the audio callback requires scheduling pauses/silence
## Proposal outline
One way to improve the situation is to disentangle the envelope
specification from the note duration, and more generally make it
easier to specify things that a composer will frequently need to
change (pitch, articulation, duration) without having to change the
underlying voice (waveform/envelope settings).
This proposal makes four changes:
1. Add a two-byte `duration` port that configures a note's duration
in milliseconds. The longest possible note is about 66 seconds.
2. Double the size of the `adsr` port. This means replacing the
existing two-byte port with four one-byte ports for `attack`,
`decay`, `sustain`, and `release`. Since we have 4 extra bits per
stage, we will shrink the step size of each stage from about 67ms to
10ms (so 0x01 means 10ms). The longest envelope stage is now about
2.6s (up from 1s previously). We special-case `sustain` and
instead treat its value as a fraction `x/255` (i.e. 0.0 to 1.0).
3. Move various ports around, both to improve the layout and prepare
for future additions. In particular an `expansion` port for
possible MIDI operations and a `detune` port for microtonal music
are likely (but are left unspecified by this proposal).
4. Recommend that emulators use a separate `wst` and `rst` for
evaluating the audio vector (when possible). Code run from the
audio vector should not expect to read existing values from `wst`
or `rst` (and should not leave values behind). This allows
emulators to use a separate audio thread for evaluating callbacks
without needing to pause other execution.
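
The figures quoted in items 1 and 2 follow from the port sizes. The short check below (Python is used here only for illustration) reproduces them; it assumes the existing 4-bit stages divide their one-second range into 15 equal steps, which the text above implies but does not state outright.

```python
# Ranges implied by the proposed port sizes (values taken from the text above).
DURATION_MAX_MS = 0xFFFF            # two-byte `duration` port, 1 ms per unit
STAGE_STEP_MS = 10                  # one-byte attack/decay/release ports
STAGE_MAX_MS = 0xFF * STAGE_STEP_MS

# Assumption: the existing 4-bit stage values split ~1 s into 15 equal steps.
OLD_STAGE_STEP_MS = 1000 / 0xF

print(DURATION_MAX_MS / 1000)       # longest note, in seconds
print(STAGE_MAX_MS / 1000)          # longest envelope stage, in seconds
print(round(OLD_STAGE_STEP_MS, 1))  # existing envelope step, in milliseconds
```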
## Note duration and tempo
The `duration` and `vector` ports precisely specify the audio device
behavior. The given note should be played for a number of milliseconds
specified by `duration`, at which point the `vector` should be called
to play the next note (or next silence). For example, if the duration
is `0x04b0` then the note should play for 1.2 seconds (1200 ms).
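
As a rough illustration of how a composer's tempo maps onto the `duration` port, the sketch below converts beats per minute into milliseconds and then into the two bytes written to the port. The helper names `note_duration_ms` and `duration_bytes` are hypothetical, and the big-endian byte order is an assumption based on how Uxn stores shorts.

```python
def note_duration_ms(bpm: float, beats: float = 1.0) -> int:
    """Length in milliseconds of a note lasting `beats` beats at tempo `bpm`."""
    ms = round(60_000 / bpm * beats)
    if not 0 <= ms <= 0xFFFF:
        raise ValueError(f"{ms} ms does not fit in the two-byte duration port")
    return ms


def duration_bytes(ms: int) -> bytes:
    """Encode a duration as two bytes (assumed big-endian, like Uxn shorts)."""
    return ms.to_bytes(2, "big")


quarter = note_duration_ms(120)            # one beat at 120 BPM
sixteenth = note_duration_ms(120, 0.25)    # a quarter of a beat at 120 BPM
print(quarter, duration_bytes(quarter).hex())
print(sixteenth, duration_bytes(sixteenth).hex())
```

One beat at 50 BPM comes out to 1200 ms, which matches the `0x04b0` example above.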
## More flexible envelope settings
The ADSR ports determine how loud the pitch should be at any given
moment. The ADR ports (`attack`, `decay`, and `release`) are all
specified in 10ms increments (e.g. `0x03` is 30ms). The S port for
`sustain` behaves differently: it specifies how much of the
"leftover" duration (the time not taken up by attack, decay, and
release) to hold before the release, as a fraction `x/255`.
So with a value of `0xff` the note would hold as long as possible, and
with `0x00` the release would occur just after the decay ends.
(If the duration is short, parts of the envelope may be truncated.)
Since each component has its own port, it's also much easier to adjust
one without having to fiddle with bit masks, shifting, etc.
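
One possible reading of the envelope rules is sketched below. Several details are assumptions rather than part of the proposal: the ramps are linear, the volume levels (0-100%, 100-50%, 50%, 50-0%) are taken from the port descriptions in Appendix A, and when the note is too short the stages are truncated in the order attack, decay, release. The names `Envelope`, `stage_lengths`, and `level_at` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Envelope:
    attack: int   # port value, 10 ms per unit
    decay: int    # port value, 10 ms per unit
    sustain: int  # port value, x/255 of the leftover time
    release: int  # port value, 10 ms per unit


def stage_lengths(env: Envelope, duration_ms: int) -> tuple[int, int, int, int]:
    """Return (attack, decay, sustain, release) lengths in milliseconds."""
    remaining = duration_ms
    # Assumed truncation order for short notes: attack, then decay, then release.
    a = min(env.attack * 10, remaining)
    remaining -= a
    d = min(env.decay * 10, remaining)
    remaining -= d
    r = min(env.release * 10, remaining)
    remaining -= r
    # Sustain takes x/255 of whatever time is left over.
    s = remaining * env.sustain // 255
    return a, d, s, r


def level_at(env: Envelope, duration_ms: int, t_ms: float) -> float:
    """Envelope level (0.0 to 1.0) at t_ms into the note, with linear ramps."""
    a, d, s, r = stage_lengths(env, duration_ms)
    if t_ms < 0:
        return 0.0
    if t_ms < a:
        return t_ms / a                   # attack: 0% -> 100%
    t_ms -= a
    if t_ms < d:
        return 1.0 - 0.5 * t_ms / d       # decay: 100% -> 50%
    t_ms -= d
    if t_ms < s:
        return 0.5                        # sustain: hold at 50%
    t_ms -= s
    if t_ms < r:
        return 0.5 * (1.0 - t_ms / r)     # release: 50% -> 0%
    return 0.0                            # silence until the vector fires


env = Envelope(attack=0x03, decay=0x05, sustain=0xFF, release=0x0A)
print(stage_lengths(env, 1200))
print(level_at(env, 1200, 15))
```

Under these assumptions, a 1200 ms note with `sustain` at `0xff` holds for the full leftover time, while `0x00` starts the release as soon as the decay ends and leaves the rest of the note silent.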
## Appendix A: proposed specification

| ADDR | SIZE    | NAME     | DESCRIPTION |
|------|---------|----------|-------------|
| 0x30 | 2 bytes | vector   | callback address to use when note finishes playing |
| 0x32 | 2 bytes | duration | duration to play sound in milliseconds (1ms resolution) |
| 0x34 | 1 byte  | attack   | envelope: attack duration (vol 0-100%, 10ms resolution) |
| 0x35 | 1 byte  | decay    | envelope: decay duration (vol 100-50%, 10ms resolution) |
| 0x36 | 1 byte  | sustain  | envelope: sustain fraction (vol 50%, x/255 of free time) |
| 0x37 | 1 byte  | release  | envelope: release duration (vol 50-0%, 10ms resolution) |
| 0x38 | 2 bytes | addr     | address to read waveform data from |
| 0x3a | 2 bytes | length   | length of waveform data to read (in bytes) |
| 0x3c | 1 byte  | volume   | 4-bit volumes for left/right channels (6.7% resolution) |
| 0x3d | 1 byte  |          | (unused - reserved for expansion) |
| 0x3e | 1 byte  | pitch    | 1-bit loop and 7-bit MIDI note (0x00 gives silence) |
| 0x3f | 1 byte  |          | (unused - reserved for detune) |
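
To make the proposed layout concrete, here is a small sketch that models the 16-byte device page as a `bytearray` and writes values into it by port name. The `PORTS` table mirrors the addresses above (as offsets from `0x30`). The `write_port` helper and the big-endian encoding of two-byte ports are assumptions for illustration, not part of the proposal.

```python
# Offsets relative to 0x30 and sizes in bytes, copied from the proposed table.
PORTS = {
    "vector":   (0x0, 2),
    "duration": (0x2, 2),
    "attack":   (0x4, 1),
    "decay":    (0x5, 1),
    "sustain":  (0x6, 1),
    "release":  (0x7, 1),
    "addr":     (0x8, 2),
    "length":   (0xA, 2),
    "volume":   (0xC, 1),
    "pitch":    (0xE, 1),
}


def write_port(page: bytearray, name: str, value: int) -> None:
    """Store `value` in the named port (two-byte ports assumed big-endian)."""
    offset, size = PORTS[name]
    page[offset:offset + size] = value.to_bytes(size, "big")


page = bytearray(16)                 # one audio device page, 0x30 to 0x3f
write_port(page, "duration", 1200)   # 1.2 second note
write_port(page, "attack", 0x03)     # 30 ms attack
write_port(page, "sustain", 0xFF)    # hold for all of the leftover time
print(page.hex(" "))
```

Because each envelope stage has its own port, changing the attack is a single one-byte write, with no masking or shifting.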
## Appendix B: existing specification

| ADDR | SIZE    | NAME     | DESCRIPTION |
|------|---------|----------|-------------|
| 0x30 | 2 bytes | vector   | callback address to use when note finishes playing |
| 0x32 | 2 bytes | position | read current position in sample |
| 0x34 | 1 byte  | output   | read envelope loudness at this moment (0x000 to 0x888) |
| 0x35 |         |          | (unused) |
| 0x36 |         |          | (unused) |
| 0x37 |         |          | (unused) |
| 0x38 | 2 bytes | adsr     | four 4-bit envelope values (attack/decay/sustain/release) |
| 0x3a | 2 bytes | length   | length of waveform data to read in bytes |
| 0x3c | 2 bytes | addr     | address to read waveform data from |
| 0x3e | 1 byte  | volume   | 4-bit volumes for left/right channels |
| 0x3f | 1 byte  | pitch    | 1-bit loop and 7-bit MIDI note |