xz and 7za compression programs
xz is a lossless data compression program and file format that incorporates the LZMA compression algorithm. xz is now used for distributing patches in many Linux
flavors. With GNU tar you can select xz compression with the -J (--xz) option. For 7za you usually need to pipe tar output into the
program.
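A minimal sketch of both approaches, assuming GNU tar or bsdtar with xz installed. The directory name demo_dir and the temporary work directory are made up for illustration, and the 7za part runs only if 7za is on the PATH.

```shell
#!/bin/sh
set -eu
# Skip quietly if the tools this sketch assumes are missing.
command -v xz >/dev/null 2>&1 || { echo "xz not found; skipping" >&2; exit 0; }
tar --version 2>/dev/null | grep -Eq 'GNU tar|bsdtar' || { echo "need GNU tar or bsdtar; skipping" >&2; exit 0; }

work=$(mktemp -d)                      # scratch area (hypothetical layout)
mkdir "$work/demo_dir"
printf 'hello\n' > "$work/demo_dir/a.txt"
cd "$work"

# tar calls xz itself when given -J (long form: --xz).
tar -cJf demo_dir.tar.xz demo_dir
xz_listing=$(tar -tJf demo_dir.tar.xz)
echo "$xz_listing"

# 7za is not selected by a tar option, so tar output is piped into it.
if command -v 7za >/dev/null 2>&1; then
    tar -cf - demo_dir | 7za a -si demo_dir.tar.7z >/dev/null
    7za x -so demo_dir.tar.7z | tar -tf -
fi
```

In both cases tar does the bundling and records the Unix metadata; the compressor only sees a single byte stream.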
7za is compatible with the xz format and is much (often 10 times) faster than xz. For some reason xz does not multithread well. Decompression speeds
are close, as decompression is almost never multithreaded.
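Whether xz uses more than one core depends on its version and on the options given. As a hedged sketch: XZ Utils 5.2 and later accept -T (--threads), where -T0 asks for as many worker threads as there are cores; older releases parse the option but still compress on a single thread. The sample file is generated only for illustration, and the 7za line assumes a build that supports the -mmt switch.

```shell
#!/bin/sh
set -eu
command -v xz >/dev/null 2>&1 || { echo "xz not found; skipping" >&2; exit 0; }

work=$(mktemp -d)
seq 1 200000 > "$work/sample.txt"      # compressible sample data (illustrative)

# -T0: use as many threads as there are cores; -c writes to stdout, so the
# input file is left in place.
xz -T0 -c "$work/sample.txt" > "$work/sample.txt.xz"

# 7za takes its thread setting through -mmt (assumed available in this build).
if command -v 7za >/dev/null 2>&1; then
    7za a -txz -mmt=on "$work/sample-7za.xz" "$work/sample.txt" >/dev/null
fi
```

A file this small fits in a single block, so it will not show a speedup; the point is only the syntax. Timing both commands on a multi-gigabyte file is the way to check the speed claim on a given machine.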
The xz format almost always achieves a better level of compression than the dominant gzip format, and in some cases produces files that are half
the size of files compressed with gzip.
It has become the standard for packages of the Debian family of systems (the deb file format), openSUSE, Fedora, Arch Linux, Slackware, FreeBSD,
Gentoo, GNOME, and TeX Live, and it is an option for compressing a compiled Linux kernel.
In December 2013, the maintainers of kernel.org (the Linux kernel archives) announced that they would use xz instead of
bzip2 as their compression tool from 2014 on.
One can think of both xz and 7za as stripped-down versions of the 7-Zip program.
xz compresses single files as input, and does not bundle multiple files into a single archive. It is therefore common to compress
a file that is itself an archive, such as those created by the tar or cpio Unix programs.
xz options are close to those of gzip, so an immediate "transfer of skills" is possible. 7za options are closer to those of the original 7-Zip program and
the RAR archiver.
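To illustrate the "transfer of skills", the following sketch runs gzip and xz side by side with the same switches (-9 for the level, -c for stdout, -t to test, -d to decompress). The file names are made up for the example, and the sizes printed at the end apply only to this particular sample.

```shell
#!/bin/sh
set -eu
for tool in gzip xz; do
    command -v "$tool" >/dev/null 2>&1 || { echo "$tool not found; skipping" >&2; exit 0; }
done

work=$(mktemp -d)
seq 1 50000 > "$work/data.txt"         # illustrative input

# Same switches, same meaning in both tools.
gzip -9 -c "$work/data.txt" > "$work/data.txt.gz"
xz   -9 -c "$work/data.txt" > "$work/data.txt.xz"

gzip -t "$work/data.txt.gz"            # integrity test
xz   -t "$work/data.txt.xz"

gzip -dc "$work/data.txt.gz" | cmp -s - "$work/data.txt" && echo "gzip round trip ok"
xz   -dc "$work/data.txt.xz" | cmp -s - "$work/data.txt" && echo "xz round trip ok"

# Sizes in bytes, for a rough comparison on this sample only.
wc -c "$work/data.txt" "$work/data.txt.gz" "$work/data.txt.xz"
```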
Format of files used
Although the original 7-Zip program, which implements LZMA2 compression, is able to produce small files at the cost of speed, it also created
its own unique archive format (7z), which was made primarily for Windows and does not directly support Unix file system metadata.
xz uses its own .xz format, which is common on Linux and which 7za can also read and write. xz and 7za are therefore compatible with
each other for .xz files: each can decompress .xz files compressed with the other program.
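The compatibility claim can be checked in both directions. This is a sketch under assumptions: both xz and 7za are installed, 7za accepts -txz to write an .xz file, and the file names are invented for the example.

```shell
#!/bin/sh
set -eu
for tool in xz 7za; do
    command -v "$tool" >/dev/null 2>&1 || { echo "$tool not found; skipping" >&2; exit 0; }
done

work=$(mktemp -d)
cd "$work"
seq 1 1000 > sample.txt                # illustrative input

# Direction 1: 7za writes an .xz file, xz reads it.
7za a -txz from-7za.xz sample.txt >/dev/null
xz -dc from-7za.xz | cmp -s - sample.txt && echo "xz can read 7za output"

# Direction 2: xz writes, 7za reads (-so sends the data to stdout).
xz -c sample.txt > from-xz.xz
7za x -so from-xz.xz | cmp -s - sample.txt && echo "7za can read xz output"
```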
Implementation
An implementation of the xz file format is freely available online as XZ Utils. It is licensed under the terms of the GNU LGPL and
GNU GPL, with the bulk of the software (e.g., liblzma) in the public domain.
GNU tar has supported using this software to handle xz files transparently since version 1.22.
FreeBSD tar has supported xz transparently since r191190 (released on April 17, 2009).
7-Zip has supported xz since version 9.04 beta (stable since 9.20).
7za man page
7za - A file archiver with highest compression ratio
7za [adeltux] [-] [SWITCH] <ARCHIVE_NAME> <ARGUMENTS>...
7-Zip is a file archiver with the highest compression ratio. The program supports 7z (which implements the LZMA compression
algorithm), ZIP, CAB, ARJ, GZIP, BZIP2, TAR, CPIO, RPM and DEB formats. The compression ratio of the new 7z format is 30-50%
better than that of the ZIP format.
- 7za is a stand-alone executable. 7za handles fewer archive formats than 7z, but does not need any others.
Function Letters
- a : Add
- d : Delete
- e : Extract
- l : List
- t : Test
- u : Update
- x : eXtract with full paths
Switches
- -ai[r[-|0]]{@listfile|!wildcard} : Include archives
- -ax[r[-|0]]{@listfile|!wildcard} : eXclude archives
- -bd : Disable percentage indicator
- -i[r[-|0]]{@listfile|!wildcard} : Include filenames
- -l : don't store symlinks; store the files/directories they point to (CAUTION: the scanning stage may never end because
of recursive symlinks like 'ln -s .. ldir')
- -m{Parameters} : Set compression method
- -mhe=on|off : 7z format only: enables or disables archive header encryption (Default: off)
- -o{Directory} : Set Output directory
- -p{Password} : Set Password
- -r[-|0] : Recurse subdirectories (CAUTION: this flag does not do what you think, avoid using it)
- -sfx[{name}] : Create SFX archive
- -si : Read data from StdIn (eg: tar cf - directory | 7za a -si directory.tar.7z)
- -so : Write data to StdOut (eg: % echo foo | 7z a dummy -tgzip -si -so > /dev/null)
- -slt : Sets technical mode for l (list) command
- -t{Type} : Type of archive (7z, zip, gzip, bzip2 or tar. 7z format is default)
- -v{Size}[b|k|m|g] : Create volumes
- -u[-][p#][q#][r#][x#][y#][z#][!newArchiveName] : Update options
- -w[path] : Set Working directory
- -x[r[-|0]]{@listfile|!wildcard} : Exclude filenames
- -y : Assume Yes on all queries
Diagnostics
7-Zip returns the following exit codes:
- 0 : Normal (no errors or warnings detected)
- 1 : Warning (non-fatal error(s)). For example, some files could not be read during compression, so they were not compressed
- 2 : Fatal error
- 7 : Bad command line parameters
- 8 : Not enough memory for operation
- 255 : User stopped the process with control-C (or similar)
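In a backup script these exit codes can be turned into readable messages. The sketch below is one possible way to do it; the function names describe_7za_status and run_7za are hypothetical, and treating status 1 (warning) as success is a policy choice, not something 7za prescribes.

```shell
#!/bin/sh
# Map a 7za exit status to the meaning listed above.
describe_7za_status() {
    case "$1" in
        0)   echo "ok" ;;
        1)   echo "warning: non-fatal errors, some files may have been skipped" ;;
        2)   echo "fatal error" ;;
        7)   echo "bad command line parameters" ;;
        8)   echo "not enough memory" ;;
        255) echo "stopped by user" ;;
        *)   echo "unexpected status $1" ;;
    esac
}

# Hypothetical wrapper: run 7za with the given arguments and report the result.
run_7za() {
    status=0
    7za "$@" >/dev/null 2>&1 || status=$?
    echo "7za: $(describe_7za_status "$status")"
    # Warnings (1) count as success here; tighten this if skipped files matter.
    [ "$status" -le 1 ]
}
```

A caller could then write, for example, run_7za a backup.7z some_dir || exit 1, where backup.7z and some_dir are placeholders.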
Backup and limitations
DO NOT USE the 7-zip format for backup purposes on Linux/Unix, because 7-zip does not store the owner/group of the file.
On Linux/Unix, in order to back up directories you must use tar:
- to back up a directory : tar cf - directory | 7za a -si directory.tar.7z
- to restore your backup : 7za x -so directory.tar.7z | tar xf -
If you want to send files and directories (without file ownership) to other Unix/MacOS/Windows users, you can use the
7-zip format.
- example : 7za a directory.7z directory
Do not use "-r", because this flag does not do what you think.
Do not use directory/* because of ".*" files (example : "directory/*" does not match "directory/.profile").
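The directory/* pitfall comes from the shell, not from 7za: the shell expands the pattern before 7za runs, and by default it skips names that start with a dot. A small sketch, using a throwaway directory whose file names are only examples, makes the difference visible without creating an archive at all.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
mkdir "$work/directory"
touch "$work/directory/.profile" "$work/directory/notes.txt"
cd "$work"

# What 7za would receive if called as: 7za a archive.7z directory/*
set -- directory/*
glob_count=$#
echo "directory/* expands to $glob_count name(s): $*"

# Counting the real entries (hidden ones included) shows what was missed.
all_count=$(find directory -mindepth 1 -maxdepth 1 | wc -l | tr -d ' ')
echo "directory actually holds $all_count entries"
```

Passing the directory name itself, as in the example above, avoids the problem because no shell pattern is involved.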
Example 1
- 7za a -t7z -m0=lzma -mx=9 -mfb=64 -md=32m -ms=on archive.7z dir1
- adds all files from directory "dir1" to archive archive.7z using "ultra settings"
- -t7z : 7z archive
- -m0=lzma : lzma method
- -mx=9 : level of compression = 9 (Ultra)
- -mfb=64 : number of fast bytes for LZMA = 64
- -md=32m : dictionary size = 32 megabytes
- -ms=on : solid archive = on
Example 2
- 7za a -sfx archive.exe dir1
- add all files from directory "dir1" to SFX archive archive.exe (Remark : SFX archive MUST end with ".exe")
Example 3
- 7za a -mhe=on -pmy_password archive.7z a_directory
- add all files from directory "a_directory" to the archive "archive.7z" (with data and header archive encryption on)
TIP:
7-zip's XZ compression on a multiprocessor system is often faster and compresses better than gzip (self.linuxadmin)
kristopolous 4 years ago
I did this a while back also. Here's a graph: http://i.imgur.com/gPOQBfG.png
X axis is compression level (min to max) Y is the size of the file that was compressed
I forget what the file was.
TyIzaeL 4 years ago
That is a great start (probably better than what I am doing). Do you have time comparisons as well?
kristopolous 4 years ago
http://www.reddit.com/r/linuxquestions/comments/1gdvnc/best_file_compression_format/caje4hm
there's the post
TyIzaeL 4 years ago
Very nice. I might work on something similar to this soon next time I'm bored.
kristopolous 4 years ago
nope.
TyIzaeL 4 years ago
That's a great point to consider among all of this. Compression is always a tradeoff between how much CPU and memory you want
to throw at something and how much space you would like to save. In my case, hammering the server for 3 minutes in order to take
a backup is necessary because the uncompressed data would bottleneck at the LAN speed.
randomfrequency 4 years ago
You might want to play with 'pigz' - it's gzip, multi-threaded. You can 'pv' to restrict the rate of the output, and it accepts
signals to control the rate limiting.
rrohbeck 4 years ago
Also pbzip2 -1 to -9 and pigz -1 to -9.
With -9 you can surely make backup CPU bound. I've given up on compression though: rsync is much faster than straight backup
and I use btrfs compression/deduplication/snapshotting on the backup server.
TyIzaeL 4 years ago
pigz -9 is already on the chart as pigz --best. I'm working on adding the others though.
TyIzaeL 4 years ago
I'm running gzip, bzip2, and pbzip2 now (not at the same time, of course) and will add results soon. But in my case the compression
keeps my db dumps from being IO bound by the 100mbit LAN connection. For example, lzop in the results above puts out 6041.632
megabits in 53.82 seconds for a total compressed data rate of 112 megabits per second, which would make the transfer IO bound.
Whereas the pigz example puts out 3339.872 megabits in 81.892 seconds, for an output data rate of 40.8 megabits per second. This
is just on my dual-core box with a static file, on the 8-core server I see the transfer takes a total of about three minutes.
It's probably being limited more by the rate at which the MySQL server can dump text from the database, but if there was no compression
it'd be limited by the LAN speed. If we were dumping 2.7GB over the LAN directly, we would need 122mbit/s of real throughput to
complete it in three minutes.
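The data rates quoted in this comment can be re-derived with a few lines of arithmetic. The sketch below uses awk for the division; the helper name rate is made up, and the last figure assumes 1 GB = 1024 MB and 8 bits per byte, which the comment does not state.

```shell
#!/bin/sh
# rate MEGABITS SECONDS -> megabits per second, one decimal place
rate() {
    LC_ALL=C awk -v mbit="$1" -v sec="$2" 'BEGIN { printf "%.1f\n", mbit / sec }'
}

lzop_rate=$(rate 6041.632 53.82)       # lzop figures from the comment
pigz_rate=$(rate 3339.872 81.892)      # pigz figures from the comment
# 2.7 GB sent uncompressed in 180 s (assumes 1 GB = 1024 MB, 8 bits per byte)
raw_rate=$(LC_ALL=C awk 'BEGIN { printf "%.1f\n", 2.7 * 1024 * 8 / 180 }')

echo "lzop: $lzop_rate Mbit/s"
echo "pigz: $pigz_rate Mbit/s"
echo "uncompressed dump: $raw_rate Mbit/s"
```

This prints 112.3, 40.8 and 122.9 Mbit/s, which agrees with the quoted figures to within rounding: only the lzop stream and the uncompressed dump exceed what a 100 Mbit link can carry.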
Shammyhealz 4 years ago
I thought the best compression was supposed to be LZMA? Which is what the .7z archives are. I have no idea of the relative speed
of LZMA and gzip
TyIzaeL 4 years ago
xz archives use the LZMA2 format (which is also used in 7z archives). LZMA2 speed seems to range from a little slower than gzip
to much slower than bzip2, but results in better compression all around.
primitive_screwhead 4 years ago
However LZMA2 decompression speed is generally much faster than bzip2, in my experience, though not as fast as gzip.
This is why we use it, as we decompress our data much more often than we compress it, and the space saving/decompression speed
tradeoff is much more favorable for us than either gzip or bzip2.
crustang 4 years ago
I mentioned how 7zip was superior to all other zip programs in /r/osx
a few days ago and my comment was buried in favor of the osx circlejerk .. it feels good seeing this data.
I love 7zip
RTFMorGTFO 4 years ago
Why... Tar supports xz, lzma, lzop, lzip, and any other kernel based compression algorithms. It's also much more likely to be preinstalled
on your given distro.
crustang 4 years ago
I've used 7zip at my old job for a backup of our business software's database. We needed speed, high level of compression, and
encryption. Portability wasn't high on the list since only a handful of machines needed access to the data. All machines were
multi-processor and 7zip gave us the best of everything given the requirements. I haven't really looked at anything deeply - including
tar, which my old boss didn't care for.
Copyright © 1996-2021 by Softpanorama Society. www.softpanorama.org
was initially created as a service to the (now defunct) UN Sustainable Development Networking Programme (SDNP)
without any remuneration. This document is an industrial compilation designed and created exclusively
for educational use and is distributed under the Softpanorama Content License.
Original materials copyright belong
to respective owners. Quotes are made for educational purposes only
in compliance with the fair use doctrine.
FAIR USE NOTICE This site contains
copyrighted material the use of which has not always been specifically
authorized by the copyright owner. We are making such material available
to advance understanding of computer science, IT technology, economic, scientific, and social
issues. We believe this constitutes a 'fair use' of any such
copyrighted material as provided by section 107 of the US Copyright Law according to which
such material can be distributed without profit exclusively for research and educational purposes.
This is a Spartan WHYFF (We Help You For Free)
site written by people for whom English is not a native language. Grammar and spelling errors should
be expected. The site contains some broken links as it develops like a living tree...
Disclaimer:
The statements, views and opinions presented on this web page are those of the author (or
referenced source) and are
not endorsed by, nor do they necessarily reflect, the opinions of the Softpanorama society. We do not warrant the correctness
of the information provided or its fitness for any purpose. The site uses AdSense so you need to be aware of Google privacy policy. If you do not want to be
tracked by Google please disable Javascript for this site. This site is perfectly usable without
Javascript.
Last modified: May, 31, 2018