From 374e92d844e4c5b561b919e2b4004a5fa7255a96 Mon Sep 17 00:00:00 2001
From: d_m <d_m@plastic-idolatry.com>
Date: Mon, 21 Aug 2023 23:10:02 -0400
Subject: [PATCH] updates to spec

---
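
Note for reviewers: the `detune/pitch` words in the 17ET table can be
regenerated with a short script. This is only a sketch. It assumes the
detune byte counts 1/256ths of a semitone, which is inferred from the
listed values rather than stated in the spec, and `edo_words` is a
hypothetical helper name, not part of any existing tool.

```python
def edo_words(steps, root=0x3c):
    """Return one detune/pitch word per scale degree, root through octave.

    Assumption: the detune byte counts 1/256ths of a semitone.
    """
    words = []
    for i in range(steps + 1):
        # Offset from the root in 1/256ths of a semitone:
        # i * (1200 / steps) cents * (256 / 100) = i * 3072 / steps,
        # rounded to nearest using integer arithmetic only.
        total = (2 * i * 3072 + steps) // (2 * steps)
        semitones, detune = divmod(total, 256)
        # High byte is detune, low byte is pitch (e.g. #b53c).
        words.append((detune << 8) | (root + semitones))
    return words


for n, word in enumerate(edo_words(17), start=1):
    print(f"pitch {n:2d}: #{word:04x}")
```

Working in 1/256ths of a semitone and splitting with `divmod` keeps the
carry into the pitch byte correct if a detune value rounds up to a full
semitone.
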
 audio.md | 76 ++++++++++++++++++++++++++++++--------------------------
 1 file changed, 41 insertions(+), 35 deletions(-)
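
Note for reviewers: a minimal sketch of how an implementation might
unpack the revised mode byte. It assumes the bit patterns shown in
parentheses in the MODES tables are authoritative (articulation in
bits 0-2, loop in bit 3, waveform rate in bits 4-5). `decode_mode` is
a hypothetical name used only for illustration.

```python
SAMPLE_RATES = (44100, 22050, 11025, 5512)
ARTICULATIONS = (
    "regular", "short", "staccato", "staccatissimo",
    "legato", "begin slur", "slur", "end slur",
)


def decode_mode(mode):
    """Split a mode byte into (articulation, loop, sample_rate).

    Assumption: fields sit where the parenthesised bit patterns place them.
    """
    articulation = ARTICULATIONS[mode & 0x07]   # bits 0-2
    loop = bool(mode & 0x08)                    # bit 3
    rate = SAMPLE_RATES[(mode >> 4) & 0x03]     # bits 4-5
    return articulation, loop, rate


print(decode_mode(0x1c))
```

Under these assumptions `0x1c` (binary `0001 1100`) is a looping legato
note whose waveform is sampled at 22050 Hz.
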

diff --git a/audio.md b/audio.md
index ae0f801..23ebc41 100644
--- a/audio.md
+++ b/audio.md
@@ -33,9 +33,8 @@ This proposal does four things:
  3. Add a one-byte `mode` port, which declares what kind of note or
     sound is being played. This provides an easy way to specify
     different behaviors such as:
-     * staccato, legato, or standard playing styles
-     * different sample rates (44.1, 22.05, 11.025)
-     * looping or non-looping playback
+     * articulation (e.g. staccato, legato, etc.)
+     * different sample rates (44100, 22050, 11025, or 5512 Hz)
 
  4. Move the `volume` port to `0x5` and add a one-byte `detune` port.
     A zero value (`0x00`) indicates a "normal" semitone pitch, and
@@ -48,13 +47,13 @@ This proposal does four things:
 ## Microtonal music
 
 Here's how to encode the 17-tone equal temperment scale (17ET) as
-`detune/pitch` pairs starting from middle C (`0x3c`). Since each step
-of the scale consists of 70.588 cents, we can get accurate pitches and
-detunes by adding 70.588 for each step then dividing by 100 and using
-the quotient and remainder:
+`detune/pitch` pairs starting from middle C (`0x3c`, i.e. `#3c`).
+Since each step of the scale consists of 70.588 cents, we can get
+accurate pitches and detunes by adding 70.588 for each step then
+dividing by 100 and using the quotient and remainder:
 
 ```
-  pitch  1: #003c  (0 semitones +  0.00 cents)
+  pitch  1: #003c  (0 semitones +  0.00 cents) -- root note is C (#3c)
   pitch  2: #b53c  (0 semitones + 70.59 cents)
   pitch  3: #693d  (1 semitones + 41.18 cents)
   pitch  4: #1e3e  (2 semitones + 11.76 cents)
@@ -64,14 +63,14 @@ the quotient and remainder:
   pitch  8: #f140  (4 semitones + 94.12 cents)
   pitch  9: #a641  (5 semitones + 64.70 cents)
   pitch 10: #5a42  (6 semitones + 35.29 cents)
-  pitch 11: #0f43  (7 semitones +  5.88 cents)
+  pitch 11: #0f43  (7 semitones +  5.88 cents) -- almost perfect 5th (#43)
   pitch 12: #c443  (7 semitones + 76.47 cents)
   pitch 13: #7844  (8 semitones + 47.06 cents)
   pitch 14: #2d45  (9 semitones + 17.64 cents)
   pitch 15: #e245  (9 semitones + 88.23 cents)
   pitch 16: #9746 (10 semitones + 58.82 cents)
   pitch 17: #4b47 (11 semitones + 29.41 cents)
-  pitch 18: #0048 (12 semitones +  0.00 cents)
+  pitch 18: #0048 (12 semitones +  0.00 cents) -- octave is C (#48)
 ```
 
 While it's somewhat cumbersome to calculate these detune values in
@@ -87,7 +86,8 @@ to play the next note (or next silence). If the specified ADSR ports
 have a shorter duration, the *mode* defines how to extend the pitch
 (using the *note type* bits). If the ADSR ports have a longer
 duration, then the ADSR will be shortened to fit, starting with S/R
-but also truncating D and A if necessary.
+but also truncating D and A if necessary. If duration is zero, the
+note length is instead computed from the ADSR values, as it is today.
 
 Composers can choose a duration for the smallest subdivision needed
 (e.g. 125ms per 16th note to achieve 120 bpm) and then compute precise
@@ -127,45 +127,51 @@ ADDR  SIZE     NAME      DESCRIPTION
 
 MODES
 
-Mode consists of the bits `Lxxx WWNN`.
+Mode consists of the bits `xxWW LAAA`.
 
-The `N` bits correspond to note type:
+The `A` bits select the articulation. It determines how extra time is
+filled when the duration exceeds the envelope length, and which
+envelope stages (if any) are skipped; a skipped stage is shown as an
+underscore:
 
 ```
-0x00  (xxxx xx00)  standard note (uses ADSR, extends with silence)
-0x01  (xxxx xx01)  staccato note (uses ADR, ignores S, extends with silence)
-0x02  (xxxx xx10)  legato note (uses ADSR, extends S as needed)
-0x03  (xxxx xx11)  slurred note (uses S, ignores ADR, extends S as needed)
+0x00  (xxxx x000)  regular (ADSR, pads with silence)
+0x01  (xxxx x001)  short (AD_R, pads with silence)
+0x02  (xxxx x010)  staccato (_D_R, pads with silence)
+0x03  (xxxx x011)  staccatissimo (_D__, pads with silence)
+0x04  (xxxx x100)  legato (ADSR, extends S)
+0x05  (xxxx x101)  begin slur (ADS_, extends S)
+0x06  (xxxx x110)  slur (__S_, extends S)
+0x07  (xxxx x111)  end slur (__SR, extends S)
 ```
 
-Since we are no longer computing note duration from the ADSR
-durations, the note type specifies what to do when the duration is
-different than the envelope. For shorter durations, the sustain and/or
-release are truncated; for longer durations it varies by type.
+For shorter durations, the envelope is truncated starting from the
+end. For example, with `short` articulation the release will be
+truncated first, then the decay, and finally the attack.
+
+The `L` bit corresponds to whether to loop or not:
+
+```
+0x00  (xxxx 0xxx)  play once (do not loop)
+0x08  (xxxx 1xxx)  repeat note indefinitely
+```
+
+Looping will continue until a new `pitch` is written (at which point
+that note's looping behavior will be used).
 
 The `W` bits correspond to waveform type:
 
 ```
-0x00  (xxxx 00xx)  waveform sampled at 44100 Hz (44.1 kHz)
-0x40  (xxxx 01xx)  waveform sampled at 22050 Hz
-0x80  (xxxx 10xx)  waveform sampled at 11025 Hz
-0xc0  (xxxx 11xx)  waveform sampled at  5512 Hz
+0x00  (xx00 xxxx)  waveform sampled at 44100 Hz (44.1 kHz)
+0x10  (xx01 xxxx)  waveform sampled at 22050 Hz
+0x20  (xx10 xxxx)  waveform sampled at 11025 Hz
+0x30  (xx11 xxxx)  waveform sampled at  5512 Hz
 ```
 
 Upsampling will be performed by repeating sample values as many times
 as needed (2x, 4x, or 8x). The underlying sound engine is still
 expected to play sounds at 44.1 kHz.
 
-The `L` bit corresponds to whether to loop or not:
-
-```
-0x00 (0xxx xxxx) play once (do not loop)
-0x80 (1xxx xxxx) repeat note indefinitely
-```
-
-Looping will continue until a new `pitch` is written (at which point
-that note's looping behavior will be used).
-
 ## Appendix B: not currently supported
 
 There are some features which would be nice to add but which are not