d6/xz-analysis-mirror

Commit Graph

Author	SHA1	Message	Date
Lasse Collin	c77fe55ddb	xz: Add a default soft memory usage limit for --threads=0. This is a soft limit in the sense that it only affects the number of threads. It never makes xz fail and it never makes xz change settings that would affect the compressed output. The idea is to make -T0 have more reasonable behavior when the system has very many cores or when memory-hungry compression options are used. This also helps with 32-bit xz, preventing it from running out of address space. The downside of this commit is that now the number of threads might become too low compared to what the user expected. I hope this is an acceptable compromise as the old behavior has been a source of well-argued complaints for a long time.	2022-04-14 14:20:46 +03:00
Lasse Collin	0adc13bfe3	xz: Make -T0 use multithreaded mode on single-core systems. The main problem with the old behavior is that the compressed output is different on single-core systems vs. multicore systems. This commit fixes it by making -T0 use one thread in multithreaded mode on single-core systems. The downside of this is that it uses more memory. However, if --memlimit-compress is used, xz can (thanks to the previous commit) still drop to the single-threaded mode.	2022-04-14 13:00:40 +03:00
Lasse Collin	898faa9728	xz: Changes to --memlimit-compress and --no-adjust. In single-threaded mode, --memlimit-compress can make xz scale down the LZMA2 dictionary size to meet the memory usage limit. This obviously affects the compressed output. However, if xz was in threaded mode, --memlimit-compress could make xz reduce the number of threads but it wouldn't make xz switch from multithreaded mode to single-threaded mode or scale down the LZMA2 dictionary size. This seemed illogical and there was even a "FIXME?" about it. Now --memlimit-compress can make xz switch to single-threaded mode if one thread in multithreaded mode uses too much memory. If memory usage is still too high, then the LZMA2 dictionary size can be scaled down too. The option --no-adjust was also changed so that it no longer prevents xz from scaling down the number of threads as that doesn't affect compressed output (only performance). After this commit --no-adjust only prevents adjustments that affect compressed output, that is, with --no-adjust xz won't switch from multithreaded mode to single-threaded mode and won't scale down the LZMA2 dictionary size. The man page wasn't updated yet.	2022-04-14 12:38:00 +03:00
Lasse Collin	cad299008c	xz: Add --memlimit-mt-decompress along with a default limit value. --memlimit-mt-decompress allows specifying the limit for multithreaded decompression. This matches memlimit_threading in liblzma. This limit can only affect the number of threads being used; it will never prevent xz from decompressing a file. The old --memlimit-decompress option is still used at the same time. If the value of --memlimit-decompress (the default value or one specified by the user) is less than the value of --memlimit-mt-decompress, then --memlimit-mt-decompress is reduced to match --memlimit-decompress. Man page wasn't updated yet.	2022-04-12 00:04:30 +03:00
Lasse Collin	6c6da57ae2	xz: Add initial support for threaded decompression. If threading support is enabled at build time, this will use lzma_stream_decoder_mt() even for single-threaded mode. With memlimit_threading=0 the behavior should be identical. This needs some work like adding --memlimit-threading=LIMIT. The original patch from Sebastian Andrzej Siewior included a method to get currently available RAM on Linux. It might be one way to go but as it is Linux-only, the available-RAM approach needs work for portability or using a fallback method on other OSes. The man page wasn't updated yet.	2022-03-07 00:36:16 +02:00
Lasse Collin	ba76d67585	xz: Set the --flush-timeout deadline when the first input byte arrives. xz --flush-timeout=2000, old version: 1. xz is started. The next flush will happen after two seconds. 2. No input for one second. 3. A burst of a few kilobytes of input. 4. No input for one second. 5. Two seconds have passed and flushing starts. The first second counted towards the flush-timeout even though there was no pending data. This can cause flushing to occur more often than needed. xz --flush-timeout=2000, after this commit: 1. xz is started. 2. No input for one second. 3. A burst of a few kilobytes of input. The next flush will happen after two seconds counted from the time when the first bytes of the burst were read. 4. No input for one second. 5. No input for another second. 6. Two seconds have passed and flushing starts.	2020-01-26 20:53:25 +02:00
Lasse Collin	fd47fd62bb	xz: Move flush_needed from mytime.h to file_pair struct in file_io.h.	2020-01-26 20:25:52 +02:00
Lasse Collin	8150356810	xz: coder.c: Make writing output a separate function. The same code sequence repeats so it's nicer as a separate function. Note that in one case there was no test for opt_mode != MODE_TEST, but that was only because that condition would always be true, so this commit doesn't change the behavior there.	2020-01-26 14:49:22 +02:00
Lasse Collin	5a49e081a0	xz: Fix semi-busy-waiting in xz --flush-timeout. When input blocked, xz --flush-timeout=1 would wake up every millisecond and initiate flushing which would have nothing to flush and thus would just waste CPU time. The fix disables the timeout when no input has been seen since the previous flush.	2020-01-26 14:13:42 +02:00
Lasse Collin	7883d73530	xz: Fix some of the warnings from -Wsign-conversion.	2019-06-23 23:19:34 +03:00
Antoine Cœur	2fb0ddaa55	spelling	2019-05-11 20:52:37 +03:00
Lasse Collin	b55d79461d	xz: Fix a crash in progress indicator when in passthru mode. "xz -dcfv not_an_xz_file" crashed (all four options are required to trigger it). It caused xz to call lzma_get_progress(&strm, ...) when no coder was initialized in strm. In this situation strm.internal is NULL which leads to a crash in lzma_get_progress(). The bug was introduced when xz started using lzma_get_progress() to get progress info for multi-threaded compression, so the bug is present in versions 5.1.3alpha and higher. Thanks to Filip Palian <Filip.Palian@pjwstk.edu.pl> for the bug report.	2018-12-20 20:39:20 +02:00
Lasse Collin	cb3111e3ed	xz: Make xz buildable even when encoders or decoders are disabled. The patch is quite long but it's mostly about adding new #ifdefs to omit code when encoders or decoders have been disabled. This adds two new #defines to config.h: HAVE_ENCODERS and HAVE_DECODERS.	2015-11-03 20:29:33 +02:00
Lasse Collin	6b5e3b9eff	xz: Add --ignore-check.	2014-08-05 22:32:36 +03:00
Lasse Collin	3ce3e79769	xz: Check for filter chain compatibility for --flush-timeout. This avoids LZMA_PROG_ERROR from lzma_code() with filter chains that don't support LZMA_SYNC_FLUSH.	2014-06-18 19:11:52 +03:00
Lasse Collin	8c19216bac	xz: Force single-threaded mode when --flush-timeout is used.	2014-06-09 21:21:24 +03:00
Lasse Collin	ed9ac85822	xz: Fix uint64_t vs. size_t which broke 32-bit build. Thanks to Christian Hesse.	2014-05-08 18:03:09 +03:00
Lasse Collin	3d5c090872	xz: Fix a comment.	2014-01-12 17:41:14 +02:00
Lasse Collin	dd750acbe2	xz: Make --block-list and --block-size work together in single-threaded. Previously, --block-list and --block-size only worked together in threaded mode. Boundaries are specified by --block-list, but --block-size specifies the maximum size for a Block. Now this works in single-threaded mode too. Thanks to James M Leddy for the original patch.	2013-11-12 16:29:48 +02:00
Lasse Collin	ba413da1d5	xz: Take advantage of LZMA_FULL_BARRIER with --block-list. Now if --block-list is used in threaded mode, the encoder won't need to flush at each Block boundary specified via --block-list. This improves performance a lot, making threading helpful with --block-list. The flush timer was reset after LZMA_FULL_FLUSH but since LZMA_FULL_BARRIER doesn't flush, resetting the timer is no longer done.	2013-10-22 19:51:55 +03:00
Lasse Collin	6b44b4a775	Add native threading support on Windows. Now liblzma only uses "mythread" functions and types which are defined in mythread.h matching the desired threading method. Before Windows Vista, there is no direct equivalent to pthread condition variables. Since this package doesn't use pthread_cond_broadcast(), pre-Vista threading can still be kept quite simple. The pre-Vista code doesn't use anything that wasn't already available in Windows 95, so the binaries should run even on Windows 95 if someone happens to care.	2013-09-17 11:52:28 +03:00
Lasse Collin	dee6ad3d59	xz: Add preliminary support for --flush-timeout=TIMEOUT. When --flush-timeout=TIMEOUT is used, xz will use LZMA_SYNC_FLUSH if read() would block and at least TIMEOUT milliseconds have elapsed since the previous flush. This can be useful in realtime-like use cases where the data is simultaneously decompressed by another process (possibly on a different computer). If new uncompressed input data is produced slowly, without this option xz could buffer the data for a long time until it would become decompressible from the output. If TIMEOUT is 0, the feature is disabled. This is the default. This commit affects the compression side. Using xz for the decompression side for the above purpose doesn't work so well yet because there is quite a bit of input and output buffering when decompressing. The --long-help or man page were not updated yet. The details of this feature may change.	2013-07-04 14:18:46 +03:00
Lasse Collin	ea00545bea	xz: Fix the test when to read more input. Testing for end of file was no longer correct after full flushing became possible with --block-size=SIZE and --block-list=SIZES. There was no bug in practice though because xz just made a few unneeded zero-byte reads.	2013-07-04 13:25:11 +03:00
Lasse Collin	736903c64b	xz: Move some of the timing code into mytime.[hc]. This switches units from microseconds to milliseconds. New clock_gettime(CLOCK_MONOTONIC) will be used if available. There is still a fallback to gettimeofday().	2013-07-04 12:51:57 +03:00
Lasse Collin	2fcda89939	xz: Fix interaction between preset and custom filter chains. There was somewhat illogical behavior when --extreme was specified and mixed with custom filter chains. Before this commit, "xz -9 --lzma2 -e" was equivalent to "xz --lzma2". After it is equivalent to "xz -6e" (all earlier preset options get forgotten when a custom filter chain is specified and the default preset is 6 to which -e is applied). I find this less illogical. This also affects the meaning of "xz -9e --lzma2 -7". Earlier it was equivalent to "xz -7e" (the -e specified before a custom filter chain wasn't forgotten). Now it is "xz -7". Note that "xz -7e" still is the same as "xz -e7". Hopefully very few cared about this in the first place, so pretty much no one should even notice this change. Thanks to Conley Moorhous.	2013-06-21 21:50:26 +03:00
Lasse Collin	88ccf47205	xz: Add incomplete support for --block-list. It's broken with threads and when also --block-size is used.	2012-07-03 21:16:39 +03:00
Lasse Collin	74d2bae4d3	xz: Fix xz on EBCDIC systems. Thanks to Chris Donawa.	2011-11-03 17:07:22 +02:00
Lasse Collin	4c6e146df9	Add underscores to attributes (__attribute__((__foo__))).	2011-05-17 11:54:38 +03:00
Lasse Collin	7a480e4859	xz: Fix input file position when --single-stream is used. Now the following works as you would expect: echo foo | xz > foo.xz echo bar | xz >> foo.xz ( xz -dc --single-stream ; xz -dc --single-stream ) < foo.xz Note that it doesn't work if the input is not seekable or if there is Stream Padding between the concatenated .xz Streams.	2011-05-01 12:24:23 +03:00
Lasse Collin	c29e6630c1	xz: Print the maximum number of worker threads in xz -vv.	2011-05-01 12:15:51 +03:00
Lasse Collin	24e0406c0f	xz: Add support for threaded compression.	2011-04-11 22:06:03 +03:00
Lasse Collin	9edd6ee895	xz: Change size_t to uint32_t in a few places.	2011-04-08 17:53:05 +03:00
Lasse Collin	411013ea45	xz: Fix a typo in a comment.	2011-04-08 17:48:41 +03:00
Lasse Collin	1ef3cf44a8	xz: Call lzma_end(&strm) before exiting if debugging is enabled.	2011-04-05 15:13:29 +03:00
Lasse Collin	923b22483b	xz: Add --block-size=SIZE. This uses LZMA_FULL_FLUSH every SIZE bytes of input. Man page wasn't updated yet.	2011-03-18 19:10:30 +02:00
Lasse Collin	57597d42ca	xz: Add --single-stream. This can be useful when there is garbage after the compressed stream (.xz, .lzma, or raw stream). Man page wasn't updated yet.	2011-03-18 18:19:19 +02:00
Lasse Collin	2fce9312f3	xz: Make -vv show also decompressor memory usage.	2010-09-03 15:54:40 +03:00
Lasse Collin	a848e47ced	xz: Make setting a preset override a custom filter chain. This is more logical behavior than ignoring preset level options once a custom filter chain has been specified.	2010-09-02 19:22:35 +03:00
Lasse Collin	b3ff7ba044	xz: Always warn if adjusting dictionary size due to memlimit.	2010-09-02 19:09:57 +03:00
Lasse Collin	792331bdee	Disable the memory usage limiter by default. For several people, the limiter causes bigger problems than it solves, so it is better to have it disabled by default. Those who want to have a limiter by default need to enable it via the environment variable XZ_DEFAULTS. Support for environment variable XZ_DEFAULTS was added. It is parsed before XZ_OPT and technically identical with it. The intended uses differ quite a bit though; see the man page. The memory usage limit can now be set separately for compression and decompression using --memlimit-compress and --memlimit-decompress. To set both at once, -M or --memlimit can be used. --memory was retained as a legacy alias for --memlimit for backwards compatibility. The semantics of --info-memory were changed in a backwards-incompatible way. Compatibility wasn't meaningful due to changes in the memory usage limiter functionality. The memory usage limiter info is no longer shown at the bottom of xz --long-help. The memory usage limiter support was removed completely from xzdec. xz's man page was updated to match the above changes. Various unrelated fixes were also made to the man page.	2010-08-07 20:45:18 +03:00
Lasse Collin	c15c42abb3	Add --no-adjust.	2010-06-15 14:06:29 +03:00
Lasse Collin	b6377fc990	Split message_filters(). message_filters_to_str() converts the filter chain to a string. message_filters_show() replaces the original message_filters(). uint32_to_optstr() was also added to show the dictionary size in nicer format when possible.	2010-05-16 18:42:22 +03:00
Lasse Collin	eb7d51a3fa	Collection of language fixes to comments and docs. Thanks to Jonathan Nieder.	2010-02-12 13:16:15 +02:00
Lasse Collin	34eb5e201d	Select the default integrity check type at runtime. Previously it was set statically to CRC64 or CRC32 depending on options passed to the configure script.	2010-01-31 19:52:38 +02:00
Lasse Collin	96a4f840e3	Improve displaying of the memory usage limit.	2010-01-31 18:17:50 +02:00
Lasse Collin	231c3c7098	Delay opening the destination file and other fixes. The opening of the destination file is now delayed a little. The coder is initialized, and if decompressing, the memory usage of the first Block is compared against the memory usage limit before the destination file is opened. This means that if --force was used, the old "target" file won't be deleted so easily when something goes wrong very early. Thanks to Mark K for the bug report. The above fix required some changes to progress message handling. Now there is a separate function for setting and printing the filename. It is used also in list.c. list_file() now handles stdin correctly (gives an error). A useless check for user_abort was removed from file_io.c.	2010-01-31 12:01:54 +02:00
Lasse Collin	0dd6d00766	Some improvements to printing sizes in xz.	2010-01-24 16:57:40 +02:00
Lasse Collin	919fbaff86	Add missing error check to coder.c. With bad luck this could cause a segfault due to reading (but not writing) past the end of the buffer.	2009-11-25 14:22:19 +02:00
Lasse Collin	465d1b0d65	Create sparse files by default when decompressing into a regular file. Sparse file creation can be disabled with --no-sparse. I don't promise yet that the name of this option won't change before 5.0.0. It's possible that the code, that checks when it is safe to use sparse output on stdout, is not good enough, and a more flexible command line option is needed to configure sparse file handling.	2009-11-25 11:19:20 +02:00
Lasse Collin	8f8ec942d6	Avoid internal error with --format=xz --lzma1.	2009-07-20 15:43:32 +03:00
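
The --flush-timeout change described in commit ba76d67585 can be illustrated with a small timing model. This is only a sketch under stated assumptions, not the code from xz: the function name `first_flush_time` and its parameters are hypothetical, time starts at 0 ms when xz starts, and the "read() would block" condition is taken as always satisfied at the deadline. It computes only the first flush and reproduces the two timelines given in that commit message.

```python
def first_flush_time(input_times_ms, timeout_ms, arm_on_first_input):
    """Return the time (ms since start) of the first flush, or None.

    Hypothetical model, not xz's implementation.
    arm_on_first_input=False: the deadline is counted from program start
    (the "old version" timeline in the commit message).
    arm_on_first_input=True: the deadline is counted from the arrival of
    the first input byte (the timeline "after this commit").
    """
    # A timeout of 0 disables the feature; without input nothing is flushed.
    if timeout_ms == 0 or not input_times_ms:
        return None

    first_input = min(input_times_ms)

    if arm_on_first_input:
        # The deadline is set when the first byte of pending data is read.
        return first_input + timeout_ms

    # Old model: the deadline runs from start, but a flush still needs
    # pending data, so it cannot happen before the first input arrives.
    return max(timeout_ms, first_input)


# Scenario from the commit message: --flush-timeout=2000, one burst at 1 s.
burst = [1000]
print(first_flush_time(burst, 2000, arm_on_first_input=False))  # 2000
print(first_flush_time(burst, 2000, arm_on_first_input=True))   # 3000
```

In the old model the idle first second counts towards the timeout, so the flush starts one second after the burst; in the new model the full two seconds elapse after the burst begins.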

53 Commits