C. Appendix: Magic compression and decompression

The use of automagic decompression of input files or compression of output files has been mentioned throughout. This appendix brings together this information in one place.

This behaviour applies to CQP as well as the various CWB utilities discussed in this manual.

Only single-file archives are supported. That is, CWB can read a file somefile.vrt if it has been compressed to somefile.vrt.gz, but not if it has been placed in a multi-file compressed archive such as somefiles.tar.gz.
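The distinction can be illustrated with a minimal sketch. It assumes gzip is installed, and uses a throwaway directory and a three-line stand-in file that are purely illustrative; the same one-file-per-archive pattern would apply to bzip2 and xz.

```shell
# Work in a throwaway directory so nothing existing is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A tiny stand-in for a real input file (illustrative content only).
printf 'the\nquick\nfox\n' > somefile.vrt

# Supported: one compressed file per input file.
# (-c writes to stdout, so the uncompressed original is kept.)
gzip -c somefile.vrt > somefile.vrt.gz

# Check that the result is a valid gzip file holding the same content.
gzip -t somefile.vrt.gz && echo "somefile.vrt.gz is OK"
gzip -dc somefile.vrt.gz | cmp -s - somefile.vrt && echo "content matches"

# Not supported: bundling one or more files into a tar archive, e.g.
#   tar -czf somefiles.tar.gz somefile.vrt
```

The resulting somefile.vrt.gz is the kind of file that can be passed to CWB in place of somefile.vrt.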

Automagic decompression requires the appropriate program to be installed and findable by CWB; that is, the executables must be in one of the directories named in the PATH environment variable. However, if for whatever reason your binaries for gzip/bzip2/xz are not in a standard location, and you either can't or don't want to modify your PATH variable, another solution is possible: set the environment variable CWB_COMPRESSOR_PATH to the directory that contains them.

This will make automagic decompression work as expected.
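Before changing any settings, it can help to see which of the three programs your shell can actually find. The following is a minimal sketch using only standard shell features; the function name check_on_path is hypothetical and is not part of CWB.

```shell
# Report, for each named program, whether it can be found on the current PATH.
check_on_path() {
    for prog in "$@"; do
        if command -v "$prog" >/dev/null 2>&1; then
            echo "$prog: found at $(command -v "$prog")"
        else
            echo "$prog: NOT found on PATH"
        fi
    done
}

check_on_path gzip bzip2 xz
```

Any program reported as not found either needs its directory added to PATH, or needs to be located via the CWB_COMPRESSOR_PATH environment variable described in this appendix.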

While gzip, bzip2, and xz are standard or widely available utilities on most Unix-like operating systems, they are not easily available everywhere, and on Windows in particular they may be very hard to install. In this case, an alternative is to use 7-zip.

7-zip is a free/open-source tool which can handle all three of the supported formats. It is easily installed on Windows (from https://www.7-zip.org/), and a package with ports of the non-GUI 7-zip executables is available for Unix-like systems as well (p7zip, installable via package managers or from http://p7zip.sourceforge.net/).

If you wish to use 7-zip you must, again, make sure that the directory containing its executable (7z, or 7z.exe on Windows) is on your PATH, or else use CWB_COMPRESSOR_PATH to specify its location.

You must then set the environment variable CWB_USE_7Z (normally to 1) to signal to CWB that it should use 7z rather than the other programs. Your command might then be:

$ CWB_USE_7Z=1 CWB_COMPRESSOR_PATH=/path/to/7zip/programs/ cwb-encode [...]
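The command above sets both variables for a single invocation only. If several CWB commands are to be run in the same shell session, the variables can instead be exported once, as in the sketch below. It relies only on standard shell behaviour; the directory is the same placeholder as above and must be replaced with the real location of your 7z executable.

```shell
# Set the variables once; every command started from this shell inherits them.
export CWB_USE_7Z=1
export CWB_COMPRESSOR_PATH=/path/to/7zip/programs/

# Child processes now see both settings without any per-command prefix.
sh -c 'echo "CWB_USE_7Z=$CWB_USE_7Z"; echo "CWB_COMPRESSOR_PATH=$CWB_COMPRESSOR_PATH"'
```

Exported variables last only until the shell exits, so a permanent setup would need the two export lines in your shell's startup file.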

The latest developments to automagic compression: