UXN Audio Proposal (v2)
(Updated with input from bd and neauoire)
Problems
Currently the UXN audio device doesn't work very well for playing complex music. There are a few reasons for this:
- Note duration is conflated with envelope shape
- Envelope resolution (67ms) limits tempos/subdivisions
- Using audio callback requires scheduling pauses/silence
Proposal outline
One way to improve the situation is to disentangle the envelope specification from the note duration, and more generally make it easier to specify things that a composer will frequently need to change (pitch, articulation, duration) without having to change the underlying voice (waveform/envelope settings).
This proposal makes four changes:
- Add a two-byte duration port that configures a note's duration in milliseconds. The longest possible note is about 66 seconds.
- Double the size of the adsr port. This means replacing the existing two-byte port with four one-byte ports for attack, decay, sustain, and release. Since we have 4 extra bits per stage, we will shrink the step size of each stage from about 67ms to 10ms (so 0x01 means 10ms). The longest envelope stage is now about 2.6s (up from 1s previously). We special-case sustain and instead treat its value as a fraction x/255 (i.e. 0.0 to 1.0).
- Move various ports around, both to improve the layout and to prepare for future additions. In particular, an expansion port for possible MIDI operations and a detune port for microtonal music are likely (but are left unspecified by this proposal).
- Recommend that emulators use a separate wst and rst for evaluating the audio vector (when possible). Code run from the audio vector should not expect to read existing values from wst or rst (and should not leave values behind). This allows emulators to use a separate audio thread for evaluating callbacks without needing to pause other execution.
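The ranges quoted in this list follow from the port widths. A quick arithmetic check in Python is sketched below; the constant and function names are illustrative and not part of the proposal, and the existing envelope step is assumed to be 1/15 of a second (about 67ms), so that the largest 4-bit value gives one second.

```python
# Range arithmetic for the proposed ports.
# Names are illustrative; only the port widths and units come from the proposal.

DURATION_MAX = 0xFFFF            # two-byte duration port, 1 ms per unit
STAGE_MAX = 0xFF                 # one-byte attack/decay/release port
STAGE_STEP_MS = 10               # proposed envelope step
OLD_STAGE_MAX = 0xF              # existing 4-bit envelope value
OLD_STAGE_STEP_MS = 1000 / 15    # assumption: about 67 ms per step

# Longest note: 65535 ms, i.e. about 66 seconds.
longest_note_s = DURATION_MAX / 1000

# Longest envelope stage: 255 * 10 ms, i.e. about 2.6 seconds.
longest_stage_s = STAGE_MAX * STAGE_STEP_MS / 1000

# Longest stage under the existing 4-bit encoding: about 1 second.
old_longest_stage_s = OLD_STAGE_MAX * OLD_STAGE_STEP_MS / 1000


def sustain_fraction(value: int) -> float:
    """Interpret the sustain byte as the fraction x/255 (0.0 to 1.0)."""
    return (value & 0xFF) / 255
```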
Note duration and tempo
The duration and vector ports precisely specify the audio device behavior. The given note should be played for the number of milliseconds specified by duration, at which point the vector should be called to play the next note (or next silence).
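Because the duration is given in milliseconds, a composer's tool can derive it from a tempo and a note length. The following is a minimal sketch, assuming beats are quarter notes; the function names are hypothetical and not part of the proposal.

```python
def note_duration_ms(bpm: float, beats: float) -> int:
    """Milliseconds for a note lasting `beats` quarter-note beats at `bpm`."""
    if bpm <= 0:
        raise ValueError("bpm must be positive")
    ms = round(60_000 * beats / bpm)
    if not 0 <= ms <= 0xFFFF:
        raise ValueError("duration does not fit in the two-byte port")
    return ms


def duration_bytes(ms: int) -> bytes:
    """Encode a duration as two bytes, high byte first (Uxn shorts are big-endian)."""
    return ms.to_bytes(2, "big")
```

For example, a sixteenth note at 120 BPM is `note_duration_ms(120, 0.25)`, which is 125ms. That value falls between multiples of the existing 67ms envelope step, but it is directly expressible with a 1ms duration port.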
More flexible envelope settings
The ADSR ports determine how loud the pitch should be at any given moment. The ADR ports (attack, decay, and release) are all specified in 10ms increments (e.g. 0x03 is 30ms). The S port for sustain behaves differently: it specifies how much of the "leftover" duration to use before the release, as a fraction x/255. So with a value of 0xff the note would hold as long as possible, and with 0x00 the release would occur just after the decay ends.
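One possible reading of these rules is sketched below. It rests on several assumptions that the text does not settle: the "leftover" time is taken to be the note duration minus the attack, decay, and release times (clamped at zero), the stages are interpolated linearly, and the volume levels (0-100%, 100-50%, 50%, 50-0%) are the ones listed in the proposed port table. The function names are hypothetical.

```python
def envelope_schedule(duration_ms, attack, decay, sustain, release):
    """Return (attack_ms, decay_ms, sustain_ms, release_ms) for one note.

    attack, decay, and release are port bytes in 10 ms units;
    sustain is a port byte read as the fraction x/255 of the leftover time.
    """
    a = attack * 10
    d = decay * 10
    r = release * 10
    # Assumption: leftover is whatever remains after attack, decay, and release.
    leftover = max(0, duration_ms - (a + d + r))
    s = leftover * sustain // 255
    return a, d, s, r


def level_at(t_ms, schedule):
    """Loudness from 0.0 to 1.0 at time t_ms (assumes linear ramps)."""
    a, d, s, r = schedule
    if t_ms < 0:
        return 0.0
    if t_ms < a:                      # attack: 0% up to 100%
        return t_ms / a
    t_ms -= a
    if t_ms < d:                      # decay: 100% down to 50%
        return 1.0 - 0.5 * t_ms / d
    t_ms -= d
    if t_ms < s:                      # sustain: hold at 50%
        return 0.5
    t_ms -= s
    if t_ms < r:                      # release: 50% down to 0%
        return 0.5 * (1.0 - t_ms / r)
    return 0.0
```

With a 500ms note, attack 0x03, decay 0x05, and release 0x0a, the leftover time is 320ms. A sustain of 0xff holds for all of it, while 0x00 starts the release as soon as the decay ends.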
Since each component has its own port, it's also much easier to adjust one without having to fiddle with bit masks, shifting, etc.
Appendix A: proposed specification
ADDR  SIZE     NAME      DESCRIPTION
0x30  2 bytes  vector    callback address to use when note finishes playing
0x32  2 bytes  duration  duration to play sound in fractional seconds (1ms resolution)
0x34  1 byte   attack    envelope: attack duration (vol 0-100%, 10ms resolution)
0x35  1 byte   decay     envelope: decay duration (vol 100-50%, 10ms resolution)
0x36  1 byte   sustain   envelope: sustain duration (vol 50%, 10ms resolution)
0x37  1 byte   release   envelope: release duration (vol 50-0%, 10ms resolution)
0x38  2 bytes  addr      address to read waveform data from
0x3a  2 bytes  length    length of waveform data to read (in bytes)
0x3c  1 byte   volume    4-bit volumes for left/right channels (6.7% resolution)
0x3d  1 byte             (unused - reserved for expansion)
0x3e  1 byte   pitch     1-bit loop and 7-bit MIDI note (0x00 gives silence)
0x3f  1 byte             (unused - reserved for detune)
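The proposed layout fills the 16 bytes from 0x30 to 0x3f exactly. As a rough illustration, an emulator or test tool could decode that block as follows. The class and function names are hypothetical, and two details are assumptions rather than part of the table: that the left channel is the high nibble of volume, and that the loop bit is the top bit of pitch.

```python
import struct
from typing import NamedTuple


class AudioPorts(NamedTuple):
    vector: int
    duration: int
    attack: int
    decay: int
    sustain: int
    release: int
    addr: int
    length: int
    volume: int
    expansion: int   # 0x3d, reserved
    pitch: int
    detune: int      # 0x3f, reserved


# Big-endian, no padding: 2+2 + 1+1+1+1 + 2+2 + 1+1+1+1 = 16 bytes.
_LAYOUT = struct.Struct(">HHBBBBHHBBBB")


def decode_ports(page: bytes) -> AudioPorts:
    """Decode the 16 device bytes at 0x30..0x3f into named fields."""
    if len(page) != _LAYOUT.size:
        raise ValueError("expected exactly 16 bytes")
    return AudioPorts(*_LAYOUT.unpack(page))


def split_volume(volume: int):
    """Return (left, right) 4-bit volumes. Assumption: left is the high nibble."""
    return volume >> 4, volume & 0x0F


def split_pitch(pitch: int):
    """Return (loop_bit, midi_note). Assumption: the loop bit is the top bit."""
    return bool(pitch & 0x80), pitch & 0x7F
```

Each 4-bit volume has 15 steps above zero, which is where the 6.7% resolution in the table comes from (100% / 15).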
Appendix B: existing specification
ADDR  SIZE     NAME      DESCRIPTION
0x30  2 bytes  vector    callback address to use when note finishes playing
0x32  2 bytes  position  read current position in sample
0x34  1 byte   output    read envelope loudness at this moment (0x000 to 0x888)
0x35                     (unused)
0x36                     (unused)
0x37                     (unused)
0x38  2 bytes  adsr      four 4-bit envelope values (attack/decay/sustain/release)
0x3a  2 bytes  length    length of waveform data to read in bytes
0x3c  2 bytes  addr      address to read waveform data from
0x3e  1 byte   volume    4-bit volumes for left/right channels
0x3f  1 byte   pitch     1-bit loop and 7-bit MIDI note