This section documents some technical limits of CWB. Most of these are due to the design of the index file format (notably the pervasive use of signed 32-bit integer values), but some additional restrictions are imposed by implementation decisions.
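To make the effect of signed 32-bit integers concrete, the following minimal sketch shows the range such a value can represent. The names INT32_MAX and fits_in_int32 are hypothetical helpers chosen for illustration and are not part of CWB; the sketch only describes the data type, and the actual CWB limits are set by the macros named in this section and may be lower.

```python
# Illustration only: range of a signed 32-bit integer.
# INT32_MAX and fits_in_int32 are hypothetical names, not CWB identifiers.
INT32_MIN = -(2 ** 31)
INT32_MAX = 2 ** 31 - 1  # largest value a signed 32-bit integer can hold


def fits_in_int32(value: int) -> bool:
    """Return True if value is representable as a signed 32-bit integer."""
    return INT32_MIN <= value <= INT32_MAX


if __name__ == "__main__":
    print(INT32_MAX)                     # 2147483647
    print(fits_in_int32(2_000_000_000))  # True
    print(fits_in_int32(3_000_000_000))  # False
```

Any count or offset stored in such a field therefore cannot exceed about 2.1 billion, which is why limits of this kind follow from the file format itself.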
CL_MAX_CORPUS_SIZE.
The length limit is determined by the macro CL_MAX_LINE_LENGTH in the CWB source code and can be increased if absolutely necessary.
This is strongly discouraged, as it may create index files that are not compatible with standard builds of CWB.
CL_MAX_FILENAME_LENGTH in the CWB source code. Its default value is currently 1024 bytes.
MAX_INPUT_LINE_LENGTH in the cwb-encode source code.