CWB Core
The CWB Core is the central component of the Corpus Workbench. The Core is often informally referred to simply as "CWB" since all other components depend on it. It consists of:
- A set of utility programs for setting up and manipulating corpus indexes
- CQP (the Corpus Query Processor): an interactive command-line corpus query application
- A C API to the corpus library (CL) used by these programs, which gives direct low-level access to CWB-indexed corpora
CWB was originally designed for Unix, and today is mostly run on Linux distributions, Mac OS, and Windows.
The following is a general overview of how the different versions of CWB relate to one another.
- Early versions: in-house at the IMS Stuttgart; now obsolete.
- Version 3.0: first open source release; no longer supported.
- Versions 3.1, 3.2: modernisation and new development; no longer supported.
- Version 3.4: preparation for a new stable release; only recent releases supported.
- Version 3.5: current stable release.
A detailed list of changes to the core over time can be found in the CHANGES files distributed along with its code.
How it works
The core of CWB is written (mostly) in C and manipulates indexes of corpus data at a low level. It consists of a library for direct data access (the Corpus Library, or CL), and programs that build on this library (CQP and the utilities). It is intentionally not very user-friendly. Most users interact with it via software that builds on top of it, particularly CWB/Perl and CQPweb.
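To give a rough feel for what "manipulating indexes of corpus data at a low level" can mean, here is a minimal toy sketch in C. It assumes a simplified model in which each token is stored as an integer ID pointing into a lexicon of distinct word forms. All names (`ToyAttribute`, `toy_cpos2str`, `toy_str2id`, `toy_id2freq`) and the sample data are hypothetical and made up for illustration; this is not the CL API, and the real on-disk index format is more elaborate.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for one indexed attribute (hypothetical, NOT a CL type). */
typedef struct {
    const char **lexicon;  /* lexicon ID -> word form */
    int lexicon_size;      /* number of distinct word forms */
    const int *tokens;     /* corpus position -> lexicon ID */
    int corpus_size;       /* number of tokens in the corpus */
} ToyAttribute;

/* Word form at a corpus position, or NULL if the position is out of range. */
const char *toy_cpos2str(const ToyAttribute *a, int cpos) {
    if (cpos < 0 || cpos >= a->corpus_size) return NULL;
    return a->lexicon[a->tokens[cpos]];
}

/* Lexicon ID of a word form (linear search), or -1 if it is not in the lexicon. */
int toy_str2id(const ToyAttribute *a, const char *s) {
    for (int id = 0; id < a->lexicon_size; id++)
        if (strcmp(a->lexicon[id], s) == 0) return id;
    return -1;
}

/* Frequency of a lexicon ID, counted by scanning the token stream. */
int toy_id2freq(const ToyAttribute *a, int id) {
    assert(id >= 0 && id < a->lexicon_size);
    int n = 0;
    for (int cpos = 0; cpos < a->corpus_size; cpos++)
        if (a->tokens[cpos] == id) n++;
    return n;
}

/* Made-up sample corpus: "the cat saw the dog". */
static const char *toy_lexicon[] = {"the", "cat", "saw", "dog"};
static const int toy_tokens[] = {0, 1, 2, 0, 3};
static const ToyAttribute toy_word = {toy_lexicon, 4, toy_tokens, 5};
```

With this sample data, `toy_cpos2str(&toy_word, 4)` looks up the last token and `toy_id2freq(&toy_word, toy_str2id(&toy_word, "the"))` counts occurrences of one word form. A real index would avoid the linear scans used here; they are kept only to make the sketch short.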
How to install
There are two ways to install CWB: from a packaged release for your operating system, or by compiling from the source code.
- Release packages and source code packages can be downloaded from here; note there are distinct instructions for different platforms.
- Advanced access to the repository version of the CWB Core is explained here.
Read more
To learn more about the CWB core, see:
- Evert and Hardie 2011 (see publications list)
- The two main manuals: the CQP Manual and the CWB Corpus Encoding Manual