Simple Uxn LZ Format
====================

Goals:

* Anyone can implement it
* Small source code size
* Easy to implement from Uxn
* Mildly better than RLE

Non-goals:

* High compression ratio
* High compression speed

Format
------

The format is a stream of commands. The first byte of the input encodes the first command. Keep reading commands from the input until there is no more input.

There are two commands: literal and dictionary.

```
                        Byte 1       Byte 2+n
                 ┌─────────────────┐ ┌─────
Literal          │ 0 x x x x x x x │ │ ....
(Always 1 byte)  └─────────────────┘ └─────
                  Length of literal  Bytes to copy to output
                 (Adjust by adding 1)

                        Byte 1              Byte 2
Dictionary       ┌─────────────────┐ ┌─────────────────┐
(2 bytes version)│ 1 0 x x x x x x │ │ x x x x x x x x │
                 └─────────────────┘ └─────────────────┘
                      Length of          Offset into
                  dictionary match       dictionary
                (Adjust by adding 4) (Adjust by adding 1)

                        Byte 1            Byte 2             Byte 3
Dictionary       ┌─────────────────┬─────────────────┐ ┌─────────────────┐
(3 bytes version)│ 1 1 x x x x x x │ x x x x x x x x │ │ x x x x x x x x │
                 └─────────────────┴─────────────────┘ └─────────────────┘
                      Length of dictionary match           Offset into
                         (Adjust by adding 4)              dictionary
                                                      (Adjust by adding 1)
```

* The maximum dictionary history size is 256 bytes.
* Dictionary offsets should be treated as the distance back from the end of the last byte that was output.
  * Example: an offset of 0 means go back by 1 byte into the history.
    * `a b c d e f|g`
  * Example: an offset of 5 means go back by 6 bytes into the history.
    * `a|b c d e f g`

22:56 < neauoire> how large do I make the dictionary?
22:57 < cancel> yeah. and the dictionary is just the previous 256 bytes of the file.
or, if you haven't progressed through 256 bytes yet, whatever you have
22:57 < cancel> so if you're 20 bytes into the file, your dictionary is the 20 bytes you've already processed
22:57 < cancel> if you're on the first byte of the file, your dictionary size is 0
22:57 < cancel> if you're on byte 500, the dictionary size is 256
22:58 < cancel> if your dictionary size is 0, you're definitely not gonna have a match
22:58 < cancel> if you don't have a match, you need to emit the literal command
22:58 < cancel> and then just slap some bytes down into the output
22:58 < cancel> but... how many?
22:59 < neauoire> it's designed to be stream right?
22:59 < neauoire> mhmm maybe not
22:59 < cancel> yeah, but you have to write the size of the literal first
22:59 < cancel> so... how big should the literal be?
22:59 < cancel> well, you don't know yet
23:00 < cancel> so, just write that the literal is 1 byte long, and then put that first byte of the file you were looking at for a match
23:01 < cancel> now, you're looking at the second byte of the file
23:01 < cancel> repeat the process above
23:01 < cancel> your dictionary is now size 1
23:01 < cancel> and it has that first character in it
23:01 < cancel> let's say your file is 'abcdefg'
23:01 < neauoire> yeah
23:01 < cancel> your dictionary is 'a'
23:01 < cancel> and the next character is 'b'
23:01 < cancel> well, there's no match in the dictionary.
23:02 < cancel> so you need to write a literal again...
23:02 < cancel> but the last thing you wrote was already a literal
23:02 < cancel> so just combine it with the previous literal
23:03 < cancel> ok
23:03 < cancel> you can make a 'compressed' file that doesn't actually compress
23:03 < cancel> it can just be all literals
23:03 < neauoire> it'll take me a while to even just accomplish this bit
23:03 < cancel> it will be bigger than the original input
23:03 < neauoire> ah yes
23:03 < cancel> but it will still be a usable file for the decompressor
23:03 < neauoire> let me try that
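
The command layout in the Format section, and the "all literals" compressor described in the chat, can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not a reference implementation: the function names `ulz_decode` and `ulz_encode_literals` are made up for this example, and the 3-byte dictionary command is assumed to store the high bits of the length in byte 1 and the low bits in byte 2, which is how the diagram reads but is not stated outright.

```python
def ulz_decode(data: bytes) -> bytes:
    """Expand a command stream laid out as in the Format section (hypothetical helper)."""
    out = bytearray()
    i = 0
    while i < len(data):
        cmd = data[i]
        i += 1
        if cmd & 0x80 == 0:
            # Literal: the low 7 bits plus 1 give the number of raw bytes that follow.
            length = cmd + 1
            if i + length > len(data):
                raise ValueError("truncated literal")
            out += data[i:i + length]
            i += length
            continue
        # Dictionary: 6-bit length, or 14-bit length when bit 6 is set.
        length = cmd & 0x3F
        if cmd & 0x40:
            if i >= len(data):
                raise ValueError("truncated length")
            # Assumption: byte 1 holds the high bits, byte 2 the low bits.
            length = (length << 8) | data[i]
            i += 1
        length += 4
        if i >= len(data):
            raise ValueError("truncated offset")
        distance = data[i] + 1  # an offset of 0 means one byte back
        i += 1
        if distance > len(out):
            raise ValueError("offset reaches before the start of the output")
        # Copy one byte at a time, because the match may overlap the bytes
        # that this same command is writing.
        for _ in range(length):
            out.append(out[-distance])
    return bytes(out)


def ulz_encode_literals(data: bytes) -> bytes:
    """Emit only literal commands: a valid stream that never gets smaller."""
    out = bytearray()
    for start in range(0, len(data), 128):
        chunk = data[start:start + 128]
        out.append(len(chunk) - 1)  # 0..127, top bit clear marks a literal
        out += chunk
    return bytes(out)
```

For example, `ulz_decode(bytes([0x00, 0x61, 0x80, 0x00]))` reads a one-byte literal `a` followed by a 4-byte match at offset 0, and expands to `aaaaa`. The literal-only encoder adds one header byte per 128 input bytes, so its output is always slightly larger than the input, as noted in the chat.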