The .xz file format specification version 1.0.0 is now

officially released. The format has been technically the same
since 2008-11-19, but now that it is frozen, people can start
using it without a fear that the format will break.
This commit is contained in:
Lasse Collin 2009-01-28 17:16:38 +02:00
parent c4683a660b
commit 982da7ed31
1 changed files with 49 additions and 35 deletions

View File

@ -1,10 +1,14 @@
The .xz File Format The .xz File Format
------------------- ===================
Version 1.0.0 (2009-01-14)
0. Preface 0. Preface
0.1. Notices and Acknowledgements 0.1. Notices and Acknowledgements
0.2. Changes 0.2. Getting the Latest Version
0.3. Version History
1. Conventions 1. Conventions
1.1. Byte and Its Representation 1.1. Byte and Its Representation
1.2. Multibyte Integers 1.2. Multibyte Integers
@ -61,29 +65,34 @@ The .xz File Format
this format replace the old .lzma format used by LZMA SDK and this format replace the old .lzma format used by LZMA SDK and
LZMA Utils. LZMA Utils.
IMPORTANT: The version described in this document is a
draft, NOT a final, official version. Changes
are possible.
0.1. Notices and Acknowledgements 0.1. Notices and Acknowledgements
This file format was designed by Lasse Collin This file format was designed by Lasse Collin
<lasse.collin@tukaani.org> and Igor Pavlov. <lasse.collin@tukaani.org> and Igor Pavlov.
Special thanks for helping with this document goes to Ville Special thanks for helping with this document goes to
Koskinen. Thanks for helping with this document goes to Ville Koskinen. Thanks for helping with this document goes to
Mark Adler, H. Peter Anvin, Mikko Pouru, and Lars Wirzenius. Mark Adler, H. Peter Anvin, Mikko Pouru, and Lars Wirzenius.
This document has been put into the public domain. This document has been put into the public domain.
0.2. Changes 0.2. Getting the Latest Version
Last modified: 2008-12-05 12:45+0200 The latest official version of this document can be downloaded
from <http://tukaani.org/xz/xz-file-format.txt>.
(A changelog will be kept once the first official version Specific versions of this document have a filename
is made.) xz-file-format-X.Y.Z.txt where X.Y.Z is the version number.
For example, the version 1.0.0 of this document is available
at <http://tukaani.org/xz/xz-file-format-1.0.0.txt>.
0.3. Version History
Version Date Description
1.0.0 2008-01-14 The first official version
1. Conventions 1. Conventions
@ -171,7 +180,7 @@ The .xz File Format
variable-length integers. The functions return the number of variable-length integers. The functions return the number of
bytes occupied by the integer (1-9), or zero on error. bytes occupied by the integer (1-9), or zero on error.
#include <sys/types.h> #include <stddef.h>
#include <inttypes.h> #include <inttypes.h>
size_t size_t
@ -429,9 +438,9 @@ The .xz File Format
Stream Padding. This can be convenient in certain situations Stream Padding. This can be convenient in certain situations
[GNU-tar]. [GNU-tar].
The possibility of Padding MUST be taken into account when The possibility of Stream Padding MUST be taken into account
designing an application that parses Streams backwards, and when designing an application that parses Streams backwards,
the application supports concatenated Streams. and the application supports concatenated Streams.
3. Block 3. Block
@ -599,19 +608,19 @@ The .xz File Format
The Check, when used, is calculated from the original The Check, when used, is calculated from the original
uncompressed data. If the calculated Check does not match the uncompressed data. If the calculated Check does not match the
stored one, the decoder MUST indicate an error. If the selected stored one, the decoder MUST indicate an error. If the selected
type of Check is not supported by the decoder, it MUST indicate type of Check is not supported by the decoder, it SHOULD
a warning or error. indicate a warning or error.
4. Index 4. Index
+-----------------+=========================+ +-----------------+===================+
| Index Indicator | Number of Index Records | | Index Indicator | Number of Records |
+-----------------+=========================+ +-----------------+===================+
+=================+=========+-+-+-+-+ +=================+===============+-+-+-+-+
---> | List of Records | Padding | CRC32 | ---> | List of Records | Index Padding | CRC32 |
+=================+=========+-+-+-+-+ +=================+===============+-+-+-+-+
Index serves several purporses. Using it, one can Index serves several purporses. Using it, one can
- verify that all Blocks in a Stream have been processed; - verify that all Blocks in a Stream have been processed;
@ -656,11 +665,11 @@ The .xz File Format
Unpadded Size and Uncompressed Size of the respective Blocks. Unpadded Size and Uncompressed Size of the respective Blocks.
Implementation hint: It is possible to verify the Index with Implementation hint: It is possible to verify the Index with
constant memory usage by calculating for example SHA256 of both constant memory usage by calculating for example SHA-256 of
the real size values and the List of Records, then comparing both the real size values and the List of Records, then
the check values. Implementing this using non-cryptographic comparing the hash values. Implementing this using
check like CRC32 SHOULD be avoided unless small code size is non-cryptographic hash like CRC32 SHOULD be avoided unless
important. small code size is important.
If the decoder supports random-access reading, it MUST verify If the decoder supports random-access reading, it MUST verify
that Unpadded Size and Uncompressed Size of every completely that Unpadded Size and Uncompressed Size of every completely
@ -898,7 +907,7 @@ The .xz File Format
Setting the start offset may be useful if an executable has Setting the start offset may be useful if an executable has
multiple sections, and there are many cross-section calls. multiple sections, and there are many cross-section calls.
Taking advantage of this feature usually requires usage of Taking advantage of this feature usually requires usage of
the Subblock filter. the Subblock filter, whose design is not complete yet.
5.3.3. Delta 5.3.3. Delta
@ -985,6 +994,7 @@ The .xz File Format
5.4.1. Reserved Custom Filter ID Ranges 5.4.1. Reserved Custom Filter ID Ranges
Range Description Range Description
0x0000_0300 - 0x0000_04FF Reserved to ease .7z compatibility
0x0002_0000 - 0x0007_FFFF Reserved to ease .7z compatibility 0x0002_0000 - 0x0007_FFFF Reserved to ease .7z compatibility
0x0200_0000 - 0x07FF_FFFF Reserved to ease .7z compatibility 0x0200_0000 - 0x07FF_FFFF Reserved to ease .7z compatibility
@ -1001,7 +1011,7 @@ The .xz File Format
the CRC32 and CRC64 values, and prints the calculated values the CRC32 and CRC64 values, and prints the calculated values
as big endian hexadecimal strings to standard output. as big endian hexadecimal strings to standard output.
#include <sys/types.h> #include <stddef.h>
#include <inttypes.h> #include <inttypes.h>
#include <stdio.h> #include <stdio.h>
@ -1067,7 +1077,8 @@ The .xz File Format
uint8_t buf[8192]; uint8_t buf[8192];
while (1) { while (1) {
const size_t buf_size = fread(buf, 1, 8192, stdin); const size_t buf_size
= fread(buf, 1, sizeof(buf), stdin);
if (buf_size == 0) if (buf_size == 0)
break; break;
@ -1092,6 +1103,9 @@ The .xz File Format
LZMA Utils - LZMA adapted to POSIX-like systems LZMA Utils - LZMA adapted to POSIX-like systems
http://tukaani.org/lzma/ http://tukaani.org/lzma/
XZ Utils - The next generation of LZMA Utils
http://tukaani.org/xz/
[RFC-1952] [RFC-1952]
GZIP file format specification version 4.3 GZIP file format specification version 4.3
http://www.ietf.org/rfc/rfc1952.txt http://www.ietf.org/rfc/rfc1952.txt
@ -1102,12 +1116,12 @@ The .xz File Format
http://www.ietf.org/rfc/rfc2119.txt http://www.ietf.org/rfc/rfc2119.txt
[GNU-tar] [GNU-tar]
GNU tar 1.20 manual GNU tar 1.21 manual
http://www.gnu.org/software/tar/manual/html_node/Blocking-Factor.html http://www.gnu.org/software/tar/manual/html_node/Blocking-Factor.html
- Node 9.4.2 "Blocking Factor", paragraph that begins - Node 9.4.2 "Blocking Factor", paragraph that begins
"gzip will complain about trailing garbage" "gzip will complain about trailing garbage"
- Note that this URL points to the latest version of the - Note that this URL points to the latest version of the
manual, and may some day not contain the note which is in manual, and may some day not contain the note which is in
1.20. For the exact version of the manual, download GNU 1.21. For the exact version of the manual, download GNU
tar 1.20: ftp://ftp.gnu.org/pub/gnu/tar/tar-1.20.tar.gz tar 1.21: ftp://ftp.gnu.org/pub/gnu/tar/tar-1.21.tar.gz