Commit Graph

303 Commits

Author SHA1 Message Date
Lasse Collin a08be1c420 xz: Add comments about stdin and src_st.st_size.
"xz -v < regular_file > out.xz" doesn't display the percentage
and estimated remaining time because it doesn't even try to
check the input file size when input is read from stdin.
This could be improved but for now there's just a comment
to remind about it.
2022-11-11 13:48:06 +02:00
Lasse Collin 3ee411cd1c xz: Fix displaying of file sizes in progress indicator in passthru mode.
It worked for one input file since the counters are zero when
xz starts but they weren't reset when starting a new file in
passthru mode. For example, if files A, B, and C are one byte each,
then "xz -dcvf A B C" would show file sizes as 1, 2, and 3 bytes
instead of 1, 1, and 1 byte.
2022-11-11 13:48:06 +02:00
Lasse Collin aa7fa9d960 xz: Add a comment why --to-stdout is not in --help.
It is on the man page still.
2022-11-11 13:48:06 +02:00
Jia Tan d4674dfbb7 xz: Avoid a compiler warning in progress_speed() in message.c.
This should be smaller too since it avoids the string constants.
2022-11-11 13:41:43 +02:00
Lasse Collin 01744b280c xz: Fix --single-stream with an empty .xz Stream.
Example:

    $ xz -dc --single-stream good-0-empty.xz
    xz: good-0-empty.xz: Internal error (bug)

The code, that is tries to catch some input file issues early,
didn't anticipate LZMA_STREAM_END which is possible in that
code only when --single-stream is used.
2022-11-11 13:38:34 +02:00
Lasse Collin a3e4606134 xz: Fix decompressor behavior if input uses an unsupported check type.
Now files with unsupported check will make xz display
a warning, set the exit status to 2 (unless --no-warn is used),
and then decompress the file normally. This is how it was
supposed to work since the beginning but this was broken by
the commit 231c3c7098, that is,
a little before 5.0.0 was released. The buggy behavior displayed
a message, set exit status 1 (error), and xz didn't attempt to
to decompress the file.

This doesn't matter today except for special builds that disable
CRC64 or SHA-256 at build time (but such builds should be used
in special situations only). The bug matters if new check type
is added in the future and an old xz version is used to decompress
such a file; however, it's likely that such files would use a new
filter too and an old xz wouldn't be able to decompress the file
anyway.

The first hunk in the commit is the actual fix. The second hunk
is a cleanup since LZMA_TELL_ANY_CHECK isn't used in xz.

There is a test file for unsupported check type but it wasn't
used by test_files.sh, perhaps due to different behavior between
xz and the simpler xzdec.
2022-11-11 13:38:34 +02:00
Lasse Collin 0b5e8c7e07 xz: Clarify the man page: input file isn't removed if an error occurs. 2022-11-11 13:30:44 +02:00
Lasse Collin 23b7416d5b xz: If input file cannot be removed, treat it as a warning, not error.
Treating it as a warning (message + exit status 2) matches gzip
and it seems more logical as at that point the output file has
already been successfully closed. When it's a warning it is
possible to suppress it with --no-warn.
2022-11-11 13:29:13 +02:00
Lasse Collin 63e3cdef80 xz: Make --keep accept symlinks, hardlinks, and setuid/setgid/sticky.
Previously this required using --force but that has other
effects too which might be undesirable. Changing the behavior
of --keep has a small risk of breaking existing scripts but
since this is a fairly special corner case I expect the
likehood of breakage to be low enough.

I think the new behavior is more logical. The only reason for
the old behavior was to be consistent with gzip and bzip2.

Thanks to Vincent Lefevre and Sebastian Andrzej Siewior.
2022-07-24 13:29:42 +03:00
Lasse Collin ca21733d24 xz: Change the coding style of the previous commit.
It isn't any better now but it's consistent with
the rest of the code base.
2022-07-12 19:01:09 +03:00
Alexander Bluhm 906b990b15 xz: Avoid fchown(2) failure.
OpenBSD does not allow to change the group of a file if the user
does not belong to this group.  In contrast to Linux, OpenBSD also
fails if the new group is the same as the old one.  Do not call
fchown(2) in this case, it would change nothing anyway.

This fixes an issue with Perl Alien::Build module.
https://github.com/PerlAlien/Alien-Build/issues/62
2022-07-12 19:01:09 +03:00
Lasse Collin 6e2cab8579 xz: Document the special memlimit case of 2000 MiB on MIPS32.
See commit 95806a8a52.
2022-07-12 18:57:21 +03:00
Ivan A. Melnikov 95806a8a52 Reduce maximum possible memory limit on MIPS32
Due to architectural limitations, address space available to a single
userspace process on MIPS32 is limited to 2 GiB, not 4, even on systems
that have more physical RAM -- e.g. 64-bit systems with 32-bit
userspace, or systems that use XPA (an extension similar to x86's PAE).

So, for MIPS32, we have to impose stronger memory limits. I've chosen
2000MiB to give the process some headroom.
2022-07-12 18:53:41 +03:00
Lasse Collin ca7bcdb30f xz: Avoid unneeded \f escapes on the man page.
I don't want to use \c in macro arguments but groff_man(7)
suggests that \f has better portability. \f would be needed
for the .TP strings for portability reasons anyway.

Thanks to Bjarni Ingi Gislason.
2022-07-12 18:10:08 +03:00
Lasse Collin 3b40a0792e xz: Use non-breaking spaces when intentionally using more than one space.
This silences some style checker warnings. Seems that spaces
in the beginning of a line don't need this treatment.

Thanks to Bjarni Ingi Gislason.
2022-07-12 18:10:08 +03:00
Lasse Collin d85699c36d xz: Protect the ellipsis (...) on the man page with \&.
This does it only when ... appears outside macro calls.

Thanks to Bjarni Ingi Gislason.
2022-07-12 18:10:08 +03:00
Lasse Collin d996ae6617 xz: Avoid the abbreviation "e.g." on the man page.
A few are simply omitted, most are converted to "for example"
and surrounded with commas. Sounds like that this is better
style, for example, man-pages(7) recommends avoiding such
abbreviations except in parenthesis.

Thanks to Bjarni Ingi Gislason.
2022-07-12 18:09:21 +03:00
Lasse Collin d16d0d198a xz man page: Change \- (minus) to \(en (en-dash) for a numeric range.
Docs of ancient troff/nroff mention \(em (em-dash) but not \(en
and \- was used for both minus and en-dash. I don't know how
portable \(en is nowadays but it can be changed back if someone
complains. At least GNU groff and OpenBSD's mandoc support it.

Thanks to Bjarni Ingi Gislason for the patch.
2022-07-12 18:09:21 +03:00
Bjarni Ingi Gislason 55c2555c5d src/xz/xz.1: Correct misused two-fonts macros
Output is from: test-groff -b -e -mandoc -T utf8 -rF0 -t -w w -z

  [ "test-groff" is a developmental version of "groff" ]

Input file is ./src/xz/xz.1

<src/xz/xz.1>:408 (macro BR): only 1 argument, but more are expected
<src/xz/xz.1>:1009 (macro BR): only 1 argument, but more are expected
<src/xz/xz.1>:1743 (macro BR): only 1 argument, but more are expected
<src/xz/xz.1>:1920 (macro BR): only 1 argument, but more are expected
<src/xz/xz.1>:2213 (macro BR): only 1 argument, but more are expected

  Output from nroff and troff is unchanged, except for a font change of a
full stop (.).

Signed-off-by: Bjarni Ingi Gislason <bjarniig@rhi.hi.is>
2022-07-12 17:42:59 +03:00
Lasse Collin 968bbfea09 Typo fixes from fossies.org.
https://fossies.org/linux/misc/xz-5.2.5.tar.xz/codespell.html
2020-03-23 18:08:31 +02:00
Lasse Collin 74a5af180a xz: Never use thousand separators in DJGPP builds.
DJGPP 2.05 added support for thousands separators but it's
broken at least under WinXP with Finnish locale that uses
a non-breaking space as the thousands separator. Workaround
by disabling thousands separators for DJGPP builds.
2020-03-11 22:38:25 +02:00
Lasse Collin c2cc64d78c xz: Silence a warning when sig_atomic_t is long int.
It can be true at least on z/OS.
2020-03-11 12:05:57 +02:00
Lasse Collin b6314aa275 xz: Avoid unneeded access of a volatile variable. 2020-03-11 12:05:57 +02:00
Lasse Collin 4b1447809f Build: Add support for translated man pages using po4a.
The dependency on po4a is optional. It's never required to install
the translated man pages when xz is built from a release tarball.
If po4a is missing when building from xz.git, the translated man
pages won't be generated but otherwise the build will work normally.

The translations are only updated automatically by autogen.sh and
by "make mydist". This makes it easy to keep po4a as an optional
dependency and ensures that I won't forget to put updated
translations to a release tarball.

The translated man pages aren't installed if --disable-nls is used.

The installation of translated man pages abuses Automake internals
by calling "install-man" with redefined dist_man_MANS and man_MANS.
This makes the hairy script code slightly less hairy. If it breaks
some day, this code needs to be fixed; don't blame Automake developers.

Also, this adds more quotes to the existing shell script code in
the Makefile.am "-hook"s.
2020-03-11 12:05:57 +02:00
Lasse Collin e1beaa74bc xz: Comment out annoying sandboxing messages. 2020-02-05 22:00:28 +02:00
Lasse Collin d0daa21792 xz: Limit --memlimit-compress to at most 4020 MiB for 32-bit xz.
See the code comment for reasoning. It's far from perfect but
hopefully good enough for certain cases while hopefully doing
nothing bad in other situations.

At presets -5 ... -9, 4020 MiB vs. 4096 MiB makes no difference
on how xz scales down the number of threads.

The limit has to be a few MiB below 4096 MiB because otherwise
things like "xz --lzma2=dict=500MiB" won't scale down the dict
size enough and xz cannot allocate enough memory. With
"ulimit -v $((4096 * 1024))" on x86-64, the limit in xz had
to be no more than 4085 MiB. Some safety margin is good though.

This is hack but it should be useful when running 32-bit xz on
a 64-bit kernel that gives full 4 GiB address space to xz.
Hopefully this is enough to solve this:

https://bugzilla.redhat.com/show_bug.cgi?id=1196786

FreeBSD has a patch that limits the result in tuklib_physmem()
to SIZE_MAX on 32-bit systems. While I think it's not the way
to do it, the results on --memlimit-compress have been good. This
commit should achieve practically identical results for compression
while leaving decompression and tuklib_physmem() and thus
lzma_physmem() unaffected.
2020-02-05 22:00:28 +02:00
Lasse Collin 4433c2dc57 xz: Set the --flush-timeout deadline when the first input byte arrives.
xz --flush-timeout=2000, old version:

  1. xz is started. The next flush will happen after two seconds.
  2. No input for one second.
  3. A burst of a few kilobytes of input.
  4. No input for one second.
  5. Two seconds have passed and flushing starts.

The first second counted towards the flush-timeout even though
there was no pending data. This can cause flushing to occur more
often than needed.

xz --flush-timeout=2000, after this commit:

  1. xz is started.
  2. No input for one second.
  3. A burst of a few kilobytes of input. The next flush will
     happen after two seconds counted from the time when the
     first bytes of the burst were read.
  4. No input for one second.
  5. No input for another second.
  6. Two seconds have passed and flushing starts.
2020-02-05 22:00:28 +02:00
Lasse Collin acc0ef3ac8 xz: Move flush_needed from mytime.h to file_pair struct in file_io.h. 2020-02-05 22:00:28 +02:00
Lasse Collin 4afe69d30b xz: coder.c: Make writing output a separate function.
The same code sequence repeats so it's nicer as a separate function.
Note that in one case there was no test for opt_mode != MODE_TEST,
but that was only because that condition would always be true, so
this commit doesn't change the behavior there.
2020-02-05 22:00:28 +02:00
Lasse Collin ec26f3ace5 xz: Fix semi-busy-waiting in xz --flush-timeout.
When input blocked, xz --flush-timeout=1 would wake up every
millisecond and initiate flushing which would have nothing to
flush and thus would just waste CPU time. The fix disables the
timeout when no input has been seen since the previous flush.
2020-02-05 22:00:28 +02:00
Lasse Collin 3891570324 xz: Refactor io_read() a bit. 2020-02-05 22:00:28 +02:00
Lasse Collin f6d2424534 xz: Update a comment in file_io.h. 2020-02-05 22:00:28 +02:00
Lasse Collin 15b55d5c63 xz: Move the setting of flush_needed in file_io.c to a nicer location. 2020-02-05 22:00:28 +02:00
Lasse Collin 267afcd995 xz: Silence a warning from clang -Wsign-conversion in main.c. 2019-12-31 22:25:42 +02:00
Lasse Collin c8cace3d6e xz: Fix an integer overflow with 32-bit off_t.
Or any off_t which isn't very big (like signed 64 bit integer
that most system have). A small off_t could overflow if the
file being decompressed had long enough run of zero bytes,
which would result in corrupt output.
2019-12-31 22:25:02 +02:00
Lasse Collin 37df03ce52 xz: Fix some of the warnings from -Wsign-conversion. 2019-12-31 22:19:18 +02:00
Lasse Collin 8d4906262b xz: Update xz man page date. 2019-07-13 17:54:52 +03:00
Antoine Cœur 0d318402f8 spelling 2019-07-13 17:53:33 +03:00
Lasse Collin 3ca432d9cc xz: Fix a crash in progress indicator when in passthru mode.
"xz -dcfv not_an_xz_file" crashed (all four options are
required to trigger it). It caused xz to call
lzma_get_progress(&strm, ...) when no coder was initialized
in strm. In this situation strm.internal is NULL which leads
to a crash in lzma_get_progress().

The bug was introduced when xz started using lzma_get_progress()
to get progress info for multi-threaded compression, so the
bug is present in versions 5.1.3alpha and higher.

Thanks to Filip Palian <Filip.Palian@pjwstk.edu.pl> for
the bug report.
2019-07-13 17:37:55 +03:00
Lasse Collin fcc419e3c3 xz: Update man page timestamp. 2019-07-13 17:36:27 +03:00
Pavel Raiskup 5a2fc3cd01 'have have' typos 2019-07-13 17:36:27 +03:00
Lasse Collin 06eebd4543 Fix or hide warnings from GCC 7's -Wimplicit-fallthrough. 2018-03-28 19:16:06 +03:00
Lasse Collin eb2ef4c79b xz: Fix "xz --list --robot missing_or_bad_file.xz".
It ended up printing an uninitialized char-array when trying to
print the check names (column 7) on the "totals" line.

This also changes the column 12 (minimum xz version) to
50000002 (xz 5.0.0) instead of 0 when there are no valid
input files.

Thanks to kidmin for the bug report.
2018-03-28 19:16:06 +03:00
Lasse Collin 70f4792119 Update the home page URLs to HTTPS. 2018-03-28 19:16:06 +03:00
Lasse Collin 2a4b2fa75d xz: Use POSIX_FADV_RANDOM for in "xz --list" mode.
xz --list is random access so POSIX_FADV_SEQUENTIAL was clearly
wrong.
2017-03-30 22:02:10 +03:00
Lasse Collin cae412b2b7 xz: Fix the Capsicum rights on user_abort_pipe. 2016-12-30 13:13:57 +02:00
Lasse Collin ce2542d220 xz: Add support for sandboxing with Capsicum (disabled by default).
In the v5.2 branch this feature is considered experimental
and thus disabled by default.

The sandboxing is used conditionally as described in main.c.
This isn't optimal but it was much easier to implement than
a full sandboxing solution and it still covers the most common
use cases where xz is writing to standard output. This should
have practically no effect on performance even with small files
as fork() isn't needed.

C and locale libraries can open files as needed. This has been
fine in the past, but it's a problem with things like Capsicum.
io_sandbox_enter() tries to ensure that various locale-related
files have been loaded before cap_enter() is called, but it's
possible that there are other similar problems which haven't
been seen yet.

Currently Capsicum is available on FreeBSD 10 and later
and there is a port to Linux too.

Thanks to Loganaden Velvindron for help.
2016-12-26 20:40:27 +02:00
Lasse Collin 51baf68437 xz: Fix copying of timestamps on Windows.
xz used to call utime() on Windows, but its result gets lost
on close(). Using _futime() seems to work.

Thanks to Martok for reporting the bug:
http://www.mail-archive.com/xz-devel@tukaani.org/msg00261.html
2016-06-30 21:00:49 +03:00
Lasse Collin 1ddc479851 xz: Silence warnings from -Wlogical-op.
Thanks to Evan Nemerson.
2016-06-28 21:11:02 +03:00
Lasse Collin be647ff5ed Build: Fix = to += for xz_SOURCES in src/xz/Makefile.am.
Thanks to Christian Kujau.
2016-06-28 21:09:46 +03:00