Version 2 of audio spec.

2023-09-13 01:04:45 -04:00 · 2023-09-13 01:04:45 -04:00 · 4a50aca68b
parent ca2c9925b0
commit 4a50aca68b
2 changed files with 100 additions and 1 deletions
--- a/audio-v2.md
+++ b/audio-v2.md
@@ -0,0 +1,96 @@
+# UXN Audio Proposal (v2)
+
+*(Updated with input from bd and neauoire)*
+
+## Problems
+
+Currently the UXN audio device doesn't work very well for playing
+complex music. There are a few reasons for this:
+
+ * Note duration is conflated with envelope shape
+ * Envelope resolution (67ms) limits tempos/subdivisions
+ * Using audio callback requires scheduling pauses/silence
+ 
+## Proposal outline
+
+One way to improve the situation is to disentangle the envelope
+specification from the note duration, and more generally make it
+easier to specify things that a composer will frequently need to
+change (pitch, articulation, duration) without having to change the
+underlying voice (waveform/envelope settings).
+
+This proposal makes four changes:
+
+ 1. Add a two-byte `duration` port that configures a note's duration
+    in milliseconds. The longest possible note is about 65.5 seconds
+    (`0xffff` ms).
+
+ 2. Double the size of the `adsr` port. This means replacing the
+    existing two-byte port with four one-byte ports for `attack`,
+    `decay`, `sustain`, and `release`. Since we have 4 extra bits per
+    stage, we will refine the resolution of each stage from 67ms to
+    10ms (so 0x01 means 10ms). The longest envelope stage is now about
+    2.55s (up from 1s previously). We special-case `sustain` and
+    instead treat its value as a fraction `x/255` (i.e. 0.0 to 1.0).
+    
+ 3. Move various ports around, both to improve the layout and prepare
+    for future additions. In particular an `expansion` port for
+    possible MIDI operations and a `detune` port for microtonal music
+    are likely (but are left unspecified by this proposal).
+
+ 4. Recommend that emulators use a separate `wst` and `rst` for
+    evaluating the audio vector (when possible). Code run from the
+    audio vector should not expect to read existing values from `wst`
+    or `rst` (and should not leave values behind). This allows
+    emulators to use a separate audio thread for evaluating callbacks
+    without needing to pause other execution.
+
+## Note duration and tempo
+
+The `duration` and `vector` ports precisely specify the audio device
+behavior. The given note should be played for a number of milliseconds
+specified by `duration`, at which point the `vector` should be called
+to play the next note (or next silence).
+
+## More flexible envelope settings
+
+The ADSR ports determine how loud the pitch should be at any given
+moment. The ADR ports (`attack`, `decay`, and `release`) are all
+specified in 10ms increments (e.g. `0x03` is 30ms). The S port for
+`sustain` behaves differently: it specifies how much of the
+"leftover" duration to hold before the release, as a fraction `x/255`.
+So with a value of `0xff` the note would hold as long as possible, and
+with `0x00` the release would occur just after the decay ends.
+
+Since each component has its own port, it's also much easier to adjust
+one without having to fiddle with bit masks, shifting, etc.
+
+## Appendix A: proposed specification
+
+ADDR  SIZE     NAME      DESCRIPTION
+0x30  2 bytes  vector    callback address to use when note finishes playing
+0x32  2 bytes  duration  duration to play sound in milliseconds (1ms resolution)
+0x34  1 byte   attack    envelope: attack duration (vol 0-100%, 10ms resolution)
+0x35  1 byte   decay     envelope: decay duration (vol 100-50%, 10ms resolution)
+0x36  1 byte   sustain   envelope: sustain duration (vol 50%, fraction x/255 of leftover duration)
+0x37  1 byte   release   envelope: release duration (vol 50-0%, 10ms resolution)
+0x38  2 bytes  addr      address to read waveform data from
+0x3a  2 bytes  length    length of waveform data to read (in bytes)
+0x3c  1 byte   volume    4-bit volumes for left/right channels (6.7% resolution)
+0x3d  1 byte             (unused - reserved for expansion)
+0x3e  1 byte   pitch     1-bit loop and 7-bit MIDI note (0x00 gives silence)
+0x3f  1 byte             (unused - reserved for detune)
+
+## Appendix B: existing specification
+
+ADDR  SIZE     NAME      DESCRIPTION
+0x30  2 bytes  vector    callback address to use when note finishes playing
+0x32  2 bytes  position  read current position in sample
+0x34  1 byte   output    read envelope loudness at this moment (0x000 to 0x888)
+0x35                     (unused)
+0x36                     (unused)
+0x37                     (unused)
+0x38  2 bytes  adsr      four 4-bit envelope values (attack/decay/sustain/release)
+0x3a  2 bytes  length    length of waveform data to read in bytes
+0x3c  2 bytes  addr      address to read waveform data from
+0x3e  1 byte   volume    4-bit volumes for left/right channels
+0x3f  1 byte   pitch     1-bit loop and 7-bit MIDI note
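
The envelope arithmetic described in `audio-v2.md` can be checked with a short Python sketch. It is illustrative only and not part of the commit or the proposal: the function names `envelope_stages` and `envelope_gain` are hypothetical, the ramps are assumed to be linear, the "leftover" duration is assumed to be the note duration minus the attack, decay, and release stages (clamped at zero), and the note is assumed to be silent once the release ends or the duration runs out.

```python
MAX_DURATION_MS = 0xFFFF      # two-byte duration port, 1ms resolution
MAX_STAGE_MS = 0xFF * 10      # one-byte attack/decay/release, 10ms resolution


def envelope_stages(duration_ms, attack, decay, sustain, release):
    """Return (attack_ms, decay_ms, sustain_ms, release_ms) from port values."""
    attack_ms = attack * 10
    decay_ms = decay * 10
    release_ms = release * 10
    # Assumption: "leftover" is what remains of the note after A, D, and R.
    leftover_ms = max(0, duration_ms - attack_ms - decay_ms - release_ms)
    # sustain is a fraction x/255 of the leftover time, not a 10ms count.
    sustain_ms = leftover_ms * sustain / 255
    return attack_ms, decay_ms, sustain_ms, release_ms


def envelope_gain(t_ms, duration_ms, attack, decay, sustain, release):
    """Gain (0.0 to 1.0) at t_ms after note start, assuming linear ramps."""
    a, d, s, r = envelope_stages(duration_ms, attack, decay, sustain, release)
    if t_ms < 0 or t_ms >= duration_ms:
        return 0.0                      # outside the note
    if t_ms < a:
        return t_ms / a                 # attack: 0% -> 100%
    t = t_ms - a
    if t < d:
        return 1.0 - 0.5 * t / d        # decay: 100% -> 50%
    t -= d
    if t < s:
        return 0.5                      # sustain: hold at 50%
    t -= s
    if t < r:
        return 0.5 * (1.0 - t / r)      # release: 50% -> 0%
    return 0.0                          # silence until the vector fires


if __name__ == "__main__":
    # A 1000ms note with 100ms attack, 100ms decay, full sustain, 200ms release.
    for t in (50, 150, 500, 900):
        print(t, envelope_gain(t, 1000, 0x0A, 0x0A, 0xFF, 0x14))
```

With `sustain` at `0xff` the hold fills all 600ms of leftover time, so the release ends exactly when the note does. With `0x00` the release starts as soon as the decay ends, and the rest of the note is silent under the assumptions above.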
--- a/mksite.sh
+++ b/mksite.sh
@@ -1,6 +1,8 @@
 #!/bin/sh

-cp audio.md audio.txt
+for STEM in audio audio-v2; do
+  cp $STEM.md $STEM.txt
+done

 for NAME in about.txt asma.rom math32.tal test-math32.tal test-math32.py \
  primes32.tal regex.tal repl-regex.tal test-regex.tal grep.tal \
@@ -15,6 +17,7 @@ for NAME in about.txt asma.rom math32.tal test-math32.tal test-math32.py \
  deck.tal cards.tal card-sprites.tal mask-sprites.tal \
  testing.tal type-abc.tal tar.tal \
  audio.md audio.txt synthdemo.tal \
+  audio-v2.md audio-v2.txt \
 ; do
    echo "-> $NAME"
    cp $NAME /var/www/plastic-idolatry.com/html/erik/nxu