From 09e7609dc8725df7bba24d7e625429d2bb1477e6 Mon Sep 17 00:00:00 2001
From: d6
Date: Fri, 11 Nov 2022 21:00:50 -0500
Subject: [PATCH] uxn metadata proposal

---
 proposal.txt | 176 +++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 143 insertions(+), 33 deletions(-)

diff --git a/proposal.txt b/proposal.txt
index 2e6a9e8..576f2f0 100644
--- a/proposal.txt
+++ b/proposal.txt
@@ -1,37 +1,63 @@
-uxn rom metadata proposal
+UXN ROM METADATA PROPOSAL
 
 by d6
 
-currently uxn rom files are just the data that will be loaded into the VMs
-memory on start up (starting with address 0x100 since the zero page is
-skipped). this means that the maximum rom size is 65280 bytes, although
-most roms are much smaller since trailing zeros can be left out.
+TL;DR SUMMARY
 
-this simplicity is great, but comes with some (possible) downsides:
+i'm proposing adding four bytes to the start of a rom.
+
+  - "uxn0" means there is no additional metadata
+  - "uxn1" means there are up to 5904 bytes of additional metadata
+  - roms not starting with "uxn" are treated as having no metadata
+
+emulators will need to skip the metadata to load program memory. for
+"uxn0" that means skipping those four bytes. for "uxn1" it means
+reading the next two bytes and skipping the metadata based on that.
+for roms not starting with "uxn" no skipping is needed.
+
+this metadata could be used by other roms such as loader.tal,
+emulators, or even websites cataloging uxn roms.
+
+INTRODUCTION
+
+currently uxn rom files are just the data that will be loaded into the
+VM's memory on start up (starting with address 0x100 since the zero
+page is skipped). this means that the maximum rom size is 65280 bytes,
+although most roms are smaller since trailing zeros are left out.
+
+this simplicity is great, but comes with some downsides:
 
   - roms aren't identifiable beyond their file name
   - roms don't contain any attribution information, credits, or licenses
   - roms don't contain a version information (rom version or uxn version)
   - roms don't contain any icon or preview information
 
-while it would be nice to add all of these things that would create a major
-burden on assembler and emulator authors. i think there's an easier path
-forward.
+while it would be nice to just start requiring all of these things
+that would create a major burden on assembler and emulator authors. i
+think there's a smoother path forward.
+
+PROPOSAL
 
 i propose adding four bytes to the start of every rom:
 
   - the literal 3 bytes "uxn"
-  - a fourth mode byte
+  - a fourth metadata mode byte
 
-this proposal just covers mode 0 and mode 1, but in the future we have up to
-254 other modes to use (though we might choose to forbid those later to keep
-things simple).
+this proposal just covers metadata modes 0 and 1, but in the
+future we could have up to 254 other modes to use (though we might
+choose to forbid those later to keep things simple).
 
-the "uxn0" format would be exactly what we have now, just with those four
-bytes at the start. assembler authors could choose to only support creating
-"uxn0" roms without very much extra effort over what they do now. emulator
-authors could easily adapt their current work to read this format. rom files
-that lack a "uxn" at the start could continue to work as they do now.
+UXN0 FORMAT
+
+the "uxn0" format would be exactly what we have now, just with those
+four bytes at the start. assembler authors could choose to only
+support creating "uxn0" roms without very much extra effort over what
+they do now. emulator authors could easily adapt their current work to
+read this format. rom files that lack a "uxn" at the start would
+continue to work (though in the future we might choose to deprecate
+this).
+
+UXN1 FORMAT
 
 the "uxn1" format would provide some extra metadata:
 
@@ -43,22 +69,106 @@ the "uxn1" format would provide some extra metadata:
   - version (n bytes): the version string (ASCII/UTF-8)
   - author-size (1 byte): size of the author string in bytes
   - author (n bytes): the author string (ASCII/UTF-8)
-  - desc-size (2 bytes): size of the description string in bytes (4096 max)
-  - desc (n bytes): the description string (ASCII/UTF-8)
+  - desc-size (2 bytes): size of the description string in bytes
+  - desc (n bytes): the description string (ASCII/UTF-8) (4096 max)
   - icon-type (1 byte): the size and depth of the icon
-  - icon-palette (n bytes): the icon's color theme
-  - icon-chr (n bytes): the icon's CHR data
+  - icon-palette (n bytes): the icon's color theme (6 max)
+  - icon-data (n bytes): the icon's ICN or CHR data (1024 max)
 
-the minimal "uxn1" header size (assuming the strings and icon are all empty)
-would be 10 bytes (2 + 2 + 1 + 1 + 1 + 2 + 1). emulator implementors could
-read total-size and then seek past this metadata to read the rom data.
+we limit descriptions to a 4096 byte maximum. this helps put a
+reasonable upper bound on the size of metadata.
 
-the icon types would be:
+the minimal "uxn1" header size (assuming the strings and icon are all
+empty) would be 10 bytes (2 + 2 + 1 + 1 + 1 + 2 + 1). emulator
+implementors could read total-size and then seek past this metadata to
+read the rom data.
 
-  - 0x00: no icon provided (0-byte icon-palette, 0-byte icon-chr)
-  - 0x01: 1-bit 16x16 icon (2-byte icon-palette, 32-byte icon-chr)
-  - 0x02: 1-bit 32x32 icon (2-byte icon-palette, 128-byte icon-chr)
-  - 0x03: 1-bit 64x64 icon (2-byte icon-palette, 512-byte icon-chr)
-  - 0x81: 2-bit 16x16 icon (6-byte icon-palette, 64-byte icon-chr)
-  - 0x82: 2-bit 32x32 icon (6-byte icon-palette, 256-byte icon-chr)
-  - 0x83: 2-bit 64x64 icon (6-byte icon-palette, 1024-byte icon-chr)
+UXN1 ICON TYPES
+
+the icon types would be defined by:
+
+  - bit 8: is icon present? (0x80 yes, 0x00 no)
+  - bit 7: transparency of color 1? (0x40 transparent, 0x00 solid)
+  - bit 6: color depth? (0x20 2-bit color (CHR), 0x00 1-bit color (ICN))
+  - bits 3-5: unused
+  - bits 1-2: icon dimensions (0x00: 8x8, 0x01: 16x16, 0x02: 32x32, 0x03: 64x64)
+
+so in table form that would mean:
+
+  ICON  ICON       PALETTE  IMAGE DATA  TRANSPARENT
+  BYTE  FORMAT     SIZE     SIZE        COLOR 1?
+  0x00  no icon    0 bytes  0 bytes     n/a
+  0x80  8x8 ICN    3 bytes  8 bytes     no
+  0x81  16x16 ICN  3 bytes  32 bytes    no
+  0x82  32x32 ICN  3 bytes  128 bytes   no
+  0x83  64x64 ICN  3 bytes  512 bytes   no
+  0xa0  8x8 CHR    6 bytes  16 bytes    no
+  0xa1  16x16 CHR  6 bytes  64 bytes    no
+  0xa2  32x32 CHR  6 bytes  256 bytes   no
+  0xa3  64x64 CHR  6 bytes  1024 bytes  no
+  0xc0  8x8 ICN    3 bytes  8 bytes     yes
+  0xc1  16x16 ICN  3 bytes  32 bytes    yes
+  0xc2  32x32 ICN  3 bytes  128 bytes   yes
+  0xc3  64x64 ICN  3 bytes  512 bytes   yes
+  0xe0  8x8 CHR    6 bytes  16 bytes    yes
+  0xe1  16x16 CHR  6 bytes  64 bytes    yes
+  0xe2  32x32 CHR  6 bytes  256 bytes   yes
+  0xe3  64x64 CHR  6 bytes  1024 bytes  yes
+
+icons would be stored in 8x8 tiles, as mandated by the ICN and CHR
+formats. they would be read left-to-right, top-to-bottom. while
+external applications might have an easier time with other formats
+(e.g. BMP, PNG, etc.) this icon format is primarily designed for ease
+of use by other uxn roms running inside varvara.
+
+the maximal "uxn1" header size (assuming all strings are maximum
+length and using an icon format of 0xa3) would be 5904 bytes. while
+this is substantial it is unlikely to push most rom sizes over 64k
+(the point at which working with them from within varvara becomes
+annoying). since the maximum rom data size is 65280, authors are free
+to use up to 255 bytes of metadata while still keeping the total rom
+size under 65536.
+
+TEXT FORMATS
+
+uxn isn't likely to ever support unicode very well, so why did i write
+ASCII/UTF-8 for text formats? my take is that the text in roms is
+likely to be used both by uxn and by external systems that
+probably _can_ handle UTF-8. i expect most authors to stick to the
+ASCII subset that uxn can handle.
+
+we have some alternatives to consider:
+
+  (a) require 7-bit ASCII only
+  (b) add metadata about text encoding
+  (c) mandate another particular encoding (latin-1)
+  (d) leave the behavior of 8-bit values (0x80 - 0xff) unspecified
+
+i don't think (a) has any advantages over UTF-8 (uxn programs will still
+need to ignore 8-bit inputs in either case). option (b) sounds like a
+nightmare from within uxn programs and outside varvara UTF-8 is about
+as general as we really need to be in my opinion. option (c) doesn't
+feel much better than just limiting ourselves to ASCII (but maybe
+that's my own cultural bias speaking) and (d) sounds like a nightmare.
+so my take is that inside varvara only ASCII values are likely to be
+well-supported, but for display outside varvara UTF-8 feels like the
+best option (e.g. allowing authors to write their names correctly).
+
+another weirder option would be to provide graphical tiles or font
+data that authors could use to encode program text. embedding a
+font/tileset just for handling metadata feels very heavy but it would
+ensure authors can display metadata in any language they can draw.
+
+CONCLUSION
+
+adding metadata to roms is undoubtedly annoying but will pay real
+dividends as we move forward. among other things, it will:
+
+  - ensure authors are credited for their work
+  - display what license or copyright covers a rom
+  - let users reliably determine which rom version is newest
+  - allow launchers to display nice images and names
+  - provide dates, places, and other historical info
+  - let authors write dedications or nice messages
+
+thanks for considering this feature.
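
for emulator and tooling authors, here is a rough python sketch of the skip logic and the icon-type decoding described in the proposal. it is not part of the proposal itself, and it rests on assumptions the text leaves open: that the mode byte is the ASCII digit ("0" or "1"), that total-size is big-endian (like uxn shorts) and counts the whole metadata block including its own two bytes but not the "uxn1" prefix, that unknown modes should be rejected, and that any icon byte with the 0x80 bit clear means "no icon". the function names are made up for illustration.

```python
MAGIC = b"uxn"


def rom_data_offset(rom: bytes) -> int:
    """Return the offset where program data (loaded at 0x100) begins."""
    # roms that don't start with "uxn" are legacy roms: nothing to skip
    if len(rom) < 4 or rom[:3] != MAGIC:
        return 0
    mode = rom[3:4]  # ASSUMPTION: mode byte is an ASCII digit
    if mode == b"0":
        return 4  # "uxn0": skip only the four prefix bytes
    if mode == b"1":
        if len(rom) < 6:
            raise ValueError("truncated uxn1 header")
        # ASSUMPTION: big-endian, and counts itself but not the prefix
        total_size = int.from_bytes(rom[4:6], "big")
        offset = 4 + total_size
        if total_size < 2 or offset > len(rom):
            raise ValueError("total-size does not fit the rom")
        return offset
    # ASSUMPTION: the proposal doesn't say what to do with other modes
    raise ValueError("unsupported metadata mode")


def decode_icon_type(icon_type: int):
    """Decode the icon-type byte; returns None when no icon is present."""
    if not icon_type & 0x80:  # bit 8: icon present?
        return None
    is_chr = bool(icon_type & 0x20)  # bit 6: 2-bit CHR vs 1-bit ICN
    side = 8 << (icon_type & 0x03)  # bits 1-2: 8, 16, 32 or 64 pixels
    tiles = (side // 8) ** 2  # icons are stored as 8x8 tiles
    return {
        "side": side,
        "format": "CHR" if is_chr else "ICN",
        "transparent": bool(icon_type & 0x40),  # bit 7: color 1
        "palette_bytes": 6 if is_chr else 3,  # per the table above
        "data_bytes": tiles * (16 if is_chr else 8),
    }
```

an emulator would then load `rom[rom_data_offset(rom):]` at address 0x100 exactly as it does today. if total-size turns out to be defined differently (for example, excluding its own two bytes), only the `offset` line needs to change.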