updates to spec

2023-08-21 23:10:02 -04:00 · 2023-08-21 23:10:02 -04:00 · 374e92d844
parent 6afad8fe5f
commit 374e92d844
1 changed files with 41 additions and 35 deletions
--- a/audio.md
+++ b/audio.md
@ -33,9 +33,8 @@ This proposal does four things:
 3. Add a one-byte `mode` port, which declares what kind of note or
    sound is being played. This provides an easy way to specify
    different behaviors such as:
-     * staccato, legato, or standard playing styles
+     * articulation (e.g. staccato, legato)
-     * different sample rates (44.1, 22.05, 11.025)
+     * different sample rates (44.1k, 22.05k, 11.025k, 5.512k)
     * looping or non-looping playback
 4. Move the `volume` port to `0x5` and add a one-byte `detune` port.
    A zero value (`0x00`) indicates a "normal" semitone pitch, and
@ -48,13 +47,13 @@ This proposal does four things:
 ## Microtonal music
 Here's how to encode the 17-tone equal temperament scale (17ET) as
-`detune/pitch` pairs starting from middle C (`0x3c`). Since each step
+`detune/pitch` pairs starting from middle C (`0x3c`, i.e. `#3c`).
-of the scale consists of 70.588 cents, we can get accurate pitches and
+Since each step of the scale consists of 70.588 cents, we can get
-detunes by adding 70.588 for each step then dividing by 100 and using
+accurate pitches and detunes by adding 70.588 for each step then
-the quotient and remainder:
+dividing by 100 and using the quotient and remainder:
 ```
-  pitch  1: #003c  (0 semitones +  0.00 cents)
+  pitch  1: #003c  (0 semitones +  0.00 cents) -- root note is C (#3c)
  pitch  2: #b53c  (0 semitones + 70.59 cents)
  pitch  3: #693d  (1 semitones + 41.18 cents)
  pitch  4: #1e3e  (2 semitones + 11.76 cents)
@ -64,14 +63,14 @@ the quotient and remainder:
  pitch  8: #f140  (4 semitones + 94.12 cents)
  pitch  9: #a641  (5 semitones + 64.70 cents)
  pitch 10: #5a42  (6 semitones + 35.29 cents)
-  pitch 11: #0f43  (7 semitones +  5.88 cents)
+  pitch 11: #0f43  (7 semitones +  5.88 cents) -- almost perfect 5th (#43)
  pitch 12: #c443  (7 semitones + 76.47 cents)
  pitch 13: #7844  (8 semitones + 47.06 cents)
  pitch 14: #2d45  (9 semitones + 17.64 cents)
  pitch 15: #e245  (9 semitones + 88.23 cents)
  pitch 16: #9746 (10 semitones + 58.82 cents)
  pitch 17: #4b47 (11 semitones + 29.41 cents)
-  pitch 18: #0048 (12 semitones +  0.00 cents)
+  pitch 18: #0048 (12 semitones +  0.00 cents) -- octave is C (#48)
 ```
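
The table above can be reproduced with a short script. This is a minimal sketch, not part of the proposal: it assumes the `detune` byte divides one semitone into 256 equal steps (which matches the listed values, e.g. 70.59 cents gives `0xb5`), and the function name `equal_temperament` is hypothetical.

```python
# Assumption: detune is a fraction of a semitone in 1/256 steps (0x00..0xff).
UNITS_PER_OCTAVE = 12 * 256  # 12 semitones, 256 detune steps each


def equal_temperament(steps, root=0x3c):
    """Return detune/pitch shorts for one octave of an N-tone equal scale."""
    out = []
    for k in range(steps + 1):
        # Round k/steps of an octave to the nearest 1/256 of a semitone.
        units = (2 * UNITS_PER_OCTAVE * k + steps) // (2 * steps)
        semitones, detune = divmod(units, 256)
        # High byte is detune, low byte is the semitone pitch.
        out.append((detune << 8) | (root + semitones))
    return out


for i, value in enumerate(equal_temperament(17), start=1):
    print(f"pitch {i:2}: #{value:04x}")
```

Working in integer units of 1/256 semitone means a detune that rounds up to a full semitone carries into the pitch byte instead of overflowing.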
 While it's somewhat cumbersome to calculate these detune values in
@ -87,7 +86,8 @@ to play the next note (or next silence). If the specified ADSR ports
 have a shorter duration, the *mode* defines how to extend the pitch
 (using the *note type* bits). If the ADSR ports have a longer
 duration, then the ADSR will be shortened to fit, starting with S/R
-but also truncating D and A if necessary.
+but also truncating D and A if necessary. If duration is zero, the
+duration will be calculated dynamically from ADSR, as it is now.
 Composers can choose a duration for the smallest subdivision needed
 (e.g. 125ms per 16th note to achieve 120 bpm) and then compute precise
@ -127,45 +127,51 @@ ADDR  SIZE     NAME      DESCRIPTION
 MODES
-Mode consists of the bits `Lxxx WWNN`.
+Mode consists of the bits `LxWW xAAA`.
-The `N` bits correspond to note type:
+The `A` bits correspond to articulation, which determines how to fill
+extra space when the duration exceeds the envelope length, and also
+which parts of the envelope to exclude (if any). Excluded parts are
+denoted with an underscore:
 ```
-0x00  (xxxx xx00)  standard note (uses ADSR, extends with silence)
+0x00  (xxxx x000)  regular (ADSR, pads with silence)
-0x01  (xxxx xx01)  staccato note (uses ADR, ignores S, extends with silence)
+0x01  (xxxx x001)  short (AD_R, pads with silence)
-0x02  (xxxx xx10)  legato note (uses ADSR, extends S as needed)
+0x02  (xxxx x010)  staccato (_D_R, pads with silence)
-0x03  (xxxx xx11)  slurred note (uses S, ignores ADR, extends S as needed)
+0x03  (xxxx x011)  staccatissimo (_D__, pads with silence)
+0x04  (xxxx x100)  legato (ADSR, extends S)
+0x05  (xxxx x101)  begin slur (ADS_, extends S)
+0x06  (xxxx x110)  slur (__S_, extends S)
+0x07  (xxxx x111)  end slur (__SR, extends S)
 ```
-Since we are no longer computing note duration from the ADSR
+For shorter durations, the envelope is truncated starting from the
-durations, the note type specifies what to do when the duration is
+end. For example, with `short` articulation the release will be
-different than the envelope. For shorter durations, the sustain and/or
+truncated first, then the decay, and finally the attack.
-release are truncated; for longer durations it varies by type.
+
-The `L` bit corresponds to whether to loop or not:
-```
-0x00 (xxxx 0xxx) play once (do not loop)
-0x80 (xxxx 1xxx) repeat note indefinitely
-```
-Looping will continue until a new `pitch` is written (at which point
-that note's looping behavior will be used).
 The `W` bits correspond to waveform type:
 ```
-0x00  (xxxx 00xx)  waveform sampled at 44100 Hz (44.1 kHz)
+0x00  (xx00 xxxx)  waveform sampled at 44100 Hz (44.1 kHz)
-0x40  (xxxx 01xx)  waveform sampled at 22050 Hz
+0x10  (xx01 xxxx)  waveform sampled at 22050 Hz
-0x80  (xxxx 10xx)  waveform sampled at 11025 Hz
+0x20  (xx10 xxxx)  waveform sampled at 11025 Hz
-0xc0  (xxxx 11xx)  waveform sampled at  5512 Hz
+0x30  (xx11 xxxx)  waveform sampled at  5512 Hz
 ```
 Upsampling will be performed by repeating sample values as many times
 as needed (2x, 4x, or 8x). The underlying sound engine is still
 expected to play sounds at 44.1 kHz.
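
Upsampling by repetition is simple enough to show directly. The sketch below assumes samples are held in a plain list; `upsample` and `BASE_RATE` are hypothetical names, not part of the spec.

```python
BASE_RATE = 44100  # rate the sound engine is expected to play at


def upsample(samples, rate):
    """Repeat each sample so a waveform stored at `rate` plays at 44.1 kHz."""
    # 22050 -> 2x, 11025 -> 4x, 5512 -> 8x (44100 / 5512 rounds to 8).
    factor = round(BASE_RATE / rate)
    return [s for s in samples for _ in range(factor)]


print(upsample([10, 20], 22050))
```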
+The `L` bit corresponds to whether to loop or not:
+```
+0x00 (0xxx xxxx) play once (do not loop)
+0x80 (1xxx xxxx) repeat note indefinitely
+```
+Looping will continue until a new `pitch` is written (at which point
+that note's looping behavior will be used).
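
Putting the three fields together, a mode byte could be packed and unpacked as below. This is a sketch that follows the bit patterns in the tables above (`A` in bits 0-2, `W` in bits 4-5, `L` in bit 7); `encode_mode` and `decode_mode` are hypothetical helpers, and the unused `x` bits are assumed to be zero.

```python
ARTICULATIONS = ["regular", "short", "staccato", "staccatissimo",
                 "legato", "begin slur", "slur", "end slur"]
RATES = [44100, 22050, 11025, 5512]  # indexed by the two W bits


def encode_mode(articulation, rate=44100, loop=False):
    """Pack articulation, sample rate, and loop flag into one mode byte."""
    a = ARTICULATIONS.index(articulation)  # bits 0-2
    w = RATES.index(rate)                  # bits 4-5
    return (0x80 if loop else 0x00) | (w << 4) | a


def decode_mode(mode):
    """Unpack a mode byte into (articulation, rate, loop)."""
    return ARTICULATIONS[mode & 0x07], RATES[(mode >> 4) & 0x03], bool(mode & 0x80)


mode = encode_mode("legato", rate=22050, loop=True)
print(f"{mode:#04x}", decode_mode(mode))
```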
 ## Appendix B: not currently supported
 There are some features which would be nice to add but which are not