Downloading and installation guide
This page explains installation of the different CWB projects one-by-one. Each section consists of download links, basic get-started instructions, and pointers on where to find more information.
There are also links to data packages - bundles of corpus data that you can use to test out CWB.
Other pages in this section detail alternate ways to get the Corpus Workbench:
- CQPwebInABox - a virtual machine image with CWB pre-installed
- Advanced instructions for access to the cutting-edge latest version, mostly for programmers
... plus information on the free software/open source licences under which CWB is published.
The recommended version of CWB is currently the stable version 3.5 (or 3.2, in the case of CQPweb). Some older versions, preserved for historical reasons only, are linked below.
We occasionally add bug-fix releases to a stable version (e.g. 3.5.1 and 3.5.2 after the stable version 3.5.0). You will normally want to download the most recent bug-fix release for your OS.
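To make the version ordering concrete, here is a minimal shell sketch that picks the most recent bug-fix release from a list of version strings. The function name latest_release is hypothetical (it is not part of CWB), and the sketch assumes GNU sort, whose -V option compares version numbers.

```shell
# latest_release is a hypothetical helper, not part of CWB.
# It prints the highest of the version strings given as arguments.
# Assumes GNU sort, whose -V option sorts by version number.
latest_release() {
    printf '%s\n' "$@" | sort -V | tail -n 1
}

# Example with the release numbers mentioned above.
latest_release 3.5.0 3.5.2 3.5.1
```

Version-aware sorting matters once a release series passes nine bug-fix releases: a plain text sort would rank 3.5.10 before 3.5.9.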
Installing the CWB Core
There are two ways to install CWB: from a release package for your operating system, or by compiling from the source code.
You can browse all the available release files on SourceForge, or just use the links below.
On Linux
You can install CWB very easily on Linux distributions that use either of the two most popular packaging systems.
These are the .deb system, for Debian, Ubuntu, and derivatives; and the .rpm system, for Fedora, Red Hat, and derivatives.
We also provide a PKGBUILD for Arch Linux.
Debian, Ubuntu, Linux Mint, and derivatives
- get the .deb: https://sourceforge.net/projects/cwb/files/cwb/cwb-3.5/deb/
- run the command
sudo dpkg -i NAME_OF_FILE.deb
Fedora, Red Hat, and derivatives:
- get the .rpm: https://sourceforge.net/projects/cwb/files/cwb/cwb-3.5/rpm/
- run the command
sudo dnf localinstall NAME_OF_FILE.rpm
Arch Linux, Manjaro, and friends:
- get the source code tarball: https://sourceforge.net/projects/cwb/files/cwb/cwb-3.5/source/
- extract the whole archive, open a terminal in its base folder, then:
cd packaging/pkgbuild_cwb
to go to the directory containing the PKGBUILD
makepkg -i
to build and install the package
- (see here for an alternative PKGBUILD that uses the cutting-edge code from the repo)
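If you are unsure which of the three sets of instructions applies to your distribution, the following sketch may help. It is an illustration only: the function name cwb_package_kind is hypothetical, and it assumes your system provides the standard /etc/os-release file with ID and ID_LIKE fields. Anything it does not recognise is reported as "source", meaning you should build from the source code instead.

```shell
# cwb_package_kind is a hypothetical helper, not part of CWB.
# It reads an os-release file (default: /etc/os-release) and prints which
# set of instructions applies: deb, rpm, pkgbuild, or source.
cwb_package_kind() {
    os_release_file="${1:-/etc/os-release}"
    if [ ! -r "$os_release_file" ]; then
        echo source
        return 0
    fi
    # Read the file in a subshell so that its variables do not leak out.
    ids=$(unset ID ID_LIKE; . "$os_release_file" 2>/dev/null; echo "${ID:-} ${ID_LIKE:-}")
    case " $ids " in
        *" debian "*|*" ubuntu "*)            echo deb ;;
        *" fedora "*|*" rhel "*|*" centos "*) echo rpm ;;
        *" arch "*)                           echo pkgbuild ;;
        *)                                    echo source ;;
    esac
}

# Report the package kind for the current system.
cwb_package_kind
```

Checking ID_LIKE as well as ID is what lets derivatives (for example Linux Mint or Manjaro) be matched through their parent distribution.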
Without package management
On other versions of Linux, or for more control over setup, you should instead download the source code (as a "tarball") and compile/install it. This process can also be used in other Unix and Unix-like environments, e.g. *BSD, Solaris, Cygwin, or MSYS2.
- get the source code tarball: https://sourceforge.net/projects/cwb/files/cwb/cwb-3.5/source/
- extract the whole archive into its own folder, e.g. with command
tar -xvzf XXX.tar.gz
- consult the files README and INSTALL (in the root of the folder you've just created) to find out how to build and install.
You'll need a working C compiler, as well as some other tools and libraries, in order to complete this procedure. This is explained in the INSTALL file.
(The notes on our page on live access to the development version may also be useful.)
There are ready-made compile & install scripts for some widely-used operating systems in the install-scripts/ subdirectory. For instance, a default build for most Linux distros including Debian, Ubuntu and Fedora can be compiled and installed with the single command:
sudo install-scripts/install-linux
The necessary prerequisites are automatically installed using the distro's package manager.
On Mac OS
Our recommended way of installing the CWB Core on Mac OS is via Homebrew. See the official website (https://brew.sh/) for instructions on how to get Homebrew.
To install the latest stable version of the CWB Core via Homebrew simply run the following command:
brew install cwb3
It's also possible to get the latest development version from our Subversion repository via Homebrew:
brew install cwb3 --head
(see here for more on this alternative Homebrew command, which uses the cutting-edge code from the repo)
If you don't want to use a package manager, or don't want to install the CWB permanently in your system, you can instead use the binary release of the latest stable version. This includes statically linked binaries for the CWB Core:
- get the .tar.gz: https://sourceforge.net/projects/cwb/files/cwb/cwb-3.5/darwin/
- run the command
sh install-cwb.sh
(with or without sudo, as you prefer)
A final alternative is to build from source on Mac OS; download and extract the tarball and then proceed as per the included instructions.
On Windows
CWB is Unix-native software, but there are multiple ways to run it on Windows.
You can use the Windows Subsystem for Linux. The way this works is that you enable WSL; this installs Ubuntu by default, but you can change the distribution to another one available in the Microsoft Store (e.g. Debian or Fedora).
At this point, you are effectively on Linux - so follow the instructions above for Linux systems! This is our recommended method for running CWB on Windows.
Other tools that give you a Unix-like environment within Windows are Cygwin and MSYS2. While we don't directly support CWB on these platforms, they should be as suitable as other Unix environments for building the system from a source code download (see above).
Finally, we provide a native Windows release. This can be installed on standard-issue Windows, without any Unix layer present. (It is compiled using MSYS2, but you don't need to have MSYS2 to run it.) To install this package:
- get the download (a .zip file): https://sourceforge.net/projects/cwb/files/cwb/cwb-3.5/windows/
- unzip the file and open the resulting folder in Explorer
- have a look at the README.txt and INSTALL.txt files...
- ... when you are ready to install, double-click on the install-cwb file.
You may also wish to install 7-zip, if you don't have it already (for reasons explained in the Appendix of the CWB Encoding Manual).
A warning: in the past, Windows users have had trouble with corpora that include accented characters or non-Latin alphabets. If you have corpora of this kind, then concordances, etc., may not render properly. If you use Windows Terminal (new in Windows 10) rather than the old Windows console, you may be able to avoid some or all of these problems. Ultimately, if you are going to be using the command line directly to access CWB, it's a far better idea to use the WSL if you can.
Checking it works
You may want to test your new CWB installation with one of the pre-indexed demo corpora. Download and unpack the English demo corpus (novels by Charles Dickens), change to the directory DemoCorpus/, and try the following commands:
cwb-describe-corpus -r registry -s DICKENS
cqp -eC -r registry -D DICKENS
The first command will print some information about the corpus and its attributes, while the second will start an interactive CQP session and activate the demo corpus DICKENS for queries.
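If either command is reported as not found, the CWB binaries may not be on your PATH. As a minimal sketch of how you might check this before looking further, the hypothetical helper below (missing_cwb_tools, not part of CWB) lists any of the named commands that the shell cannot find.

```shell
# missing_cwb_tools is a hypothetical helper, not part of CWB.
# It prints the name of every given command that is not found on PATH.
missing_cwb_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Check the two programs used in the test above.
missing=$(missing_cwb_tools cwb-describe-corpus cqp)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Not found on PATH:" $missing
else
    echo "cwb-describe-corpus and cqp are both on PATH"
fi
```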
Now follow the instructions in the CQP Query Language Manual for your first steps with the CWB, and read the Corpus Encoding Manual if you want to index your own corpora.
Installing CWB/Perl
CWB/Perl is broken into multiple independent packages so that you only need to install the functionality needed for your purposes. The base package CWB is strongly recommended for all CWB users (since it provides essential utilities such as cwb-make and cwb-regedit).
The other packages have additional prerequisites: CWB-CL requires a working C compiler for installation and CWB-Web depends on some external Perl packages available from CPAN. The CQi reference implementation CWB-CQI also works on client machines without a CWB installation.
As with almost any Perl package, the easiest way to install any or all of the CWB/Perl modules is via CPAN.
Instructions on getting it via CPAN here
If you are unable, or just don't want, to install CWB/Perl using CPAN, you can do pretty much the same thing manually: download the source code and compile/install it.
- download the tarball (a .tar.gz file): FILE LINK GOES HERE
- extract the full content, open a terminal, and access the folder for the package(s) you wish to install
- check out the README file(s)
- run the following commands (the same for any of the different packages):
perl Makefile.PL
make
sudo make install
Note: it is necessary to install the CWB core first, before you attempt installation of CWB/Perl.
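Because the three commands are the same for every package, the manual procedure can be scripted. The sketch below is an illustration only: cwb_perl_build_plan is a hypothetical helper, and the folder names CWB and CWB-CL are assumptions (check the names in the tarball you actually extracted). It only prints the commands, so that you can review them before running anything.

```shell
# cwb_perl_build_plan is a hypothetical helper, not part of CWB/Perl.
# For each package folder given, it prints one line that would change into
# that folder and run the three build commands listed above.
cwb_perl_build_plan() {
    for dir in "$@"; do
        printf "(cd '%s' && perl Makefile.PL && make && sudo make install)\n" "$dir"
    done
}

# The folder names here are assumptions; substitute the real ones.
cwb_perl_build_plan CWB CWB-CL
```

Once the printed lines look right, they can be run by piping the output to sh. Printing first is a deliberate choice, since the last step of each line uses sudo.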
Windows users: the CWB/Perl modules are not supported on Windows. Expert users might be able to get them to install, either using CPAN or not, but this is not recommended for non-experts. If you are on Windows, and plan to make substantial use of CWB/Perl, then it is highly advisable to use a Linux distribution with WSL, or to use a virtual machine (e.g. CQPwebInABox, which is Ubuntu-based).
Installing CQPweb
CQPweb requires the CWB core to be installed, but does not depend on CWB/Perl.
Being web software, written in a scripting language for the web, CQPweb is available only as source code; the source code is what you actually deploy.
You can download either version 3.2.43 (older, but more polished) or version 3.3.xx (the most up-to-date):
- the more stable v3.2.43 (.tar.gz file): download from SourceForge
- the more up-to-date v3.3.xx (.tar.gz file): FILE LINK GOES HERE
- once you've downloaded CQPweb, extract the entire folder from the .tar.gz
- now, look at the System Administrator's Manual
CQPweb's System Administrator's Manual has a chapter on installation. This manual comes as part of CQPweb itself, and is also available on this website.
For Docker fans: a CQPweb docker image is available (although we don't support this method of installation ourselves).
Other downloads
Pre-encoded corpora
- Novels by Charles Dickens (DICKENS, 37.5 MiB) for CQP tutorial
- Collection of German law texts (GERMAN-LAW, 20.9 MiB) for CQP tutorial
- Europarl 3 (EUROPARL-EN/DE/FR/ES/IT/NL, 2.2 GiB) with proceedings of the European Parliament in 6 languages
The download URL contains placeholders, which keep robots away and prevent accidental downloads. Substitute the release date (28 February 2010) for the placeholders.
- EUROPARL Web interface Europarl-GUI-2.2.102_patched.tar.gz (all platforms, 0.1 MiB)
This Web GUI requires compatible versions of the Perl API packages CWB, CWB-CL and CWB-Web to be installed. It is ideal for use with the pre-encoded Europarl version 3 corpus (see above). If you have downloaded the Europarl GUI source code before, please make sure to install the patched version above for full compatibility with the pre-indexed Europarl 3!
Import & export utilities
- BNC_encoder-0.9.2.zip (all platforms, 0.1 MiB) – recoding & indexing scripts for the British National Corpus 1994 (XML edition), designed for use in BNCweb
The crypt: some very old versions
These versions of CWB are now unsupported. They may be incompatible with later versions of other CWB tools.
Version 3.5 should be backwards-compatible with corpora indexed in these older versions, so you should not really need them for the majority of purposes.
Version 3.0 of the CWB core:
- cwb-3.0.0-osx-10.5-universal.tar.gz (Mac OS X Universal, 4.3 MiB)
- cwb-3.0.0-linux-i386.tar.gz (Linux Intel 32-bit, 7.3 MiB)
- cwb-3.0.0-linux-x86_64.tar.gz (Linux Intel 64-bit, 8.9 MiB)
- cwb-3.0.0-solaris-sparc.tar.gz (Solaris 8 SPARC, 4.4 MiB)
- cwb-3.0.3-source.tar.gz (source code, 1.6 MiB)
- Version 3.0 of the CWB includes the Editline (CSTR) library: Copyright © 1992 by Rich Salz & Simmule Turner, Copyright © 1999 by Alan W. Black, Richard Caley & J. G. Vons
- This library is used under the licensing conditions detailed in the respective ReadMe files in the source code of version 3.0 of the CWB core: editline/ReadMe.rsalz (Turner/Salz) – editline/ReadMe.cstr (CSTR)
Older versions of CWB-Perl:
- Perl-CWB-2.2.102.tar.gz (source code, 0.2 MiB) – base package
- Perl-CWB-CL-2.2.102.tar.gz (source code, 0.1 MiB) – low-level corpus access
- Perl-CWB-Web-2.2.102.tar.gz (source code, 0.1 MiB) – support package for Web GUIs
- Perl-CWB-CQI-2.2.102.tar.gz (source code, 0.1 MiB) – the CQi reference implementation
Some old release versions of CQPweb can be accessed here.
Older versions in the Subversion repository:
- svn export http://svn.code.sf.net/p/cwb/code/cwb/branches/3.0 cwb-3.0
svn export http://svn.code.sf.net/p/cwb/code/cwb/branches/3.1 cwb-3.1
(old versions of the CWB Core)
- svn export http://svn.code.sf.net/p/cwb/code/perl/branches/3.0/ cwb-perl-3.0
(old version of the CWB/Perl modules)
- svn export http://svn.code.sf.net/p/cwb/code/gui/cqpweb/branches/*** cqpweb-***
(old versions of CQPweb; replace *** with one of 3.0, 3.2.5, 3.2.6, or 3.2.11)
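To illustrate the *** substitution for CQPweb, the sketch below prints the full export command for each of the listed branch versions. The helper name cqpweb_export_commands is hypothetical; the URL and version numbers are the ones given above. Nothing is downloaded, since the commands are only printed.

```shell
# cqpweb_export_commands is a hypothetical helper, not part of CWB.
# It prints one "svn export" command per CQPweb branch version given,
# substituting the version for both occurrences of *** in the template.
cqpweb_export_commands() {
    base=http://svn.code.sf.net/p/cwb/code/gui/cqpweb/branches
    for version in "$@"; do
        echo "svn export $base/$version cqpweb-$version"
    done
}

# The four branch versions listed above.
cqpweb_export_commands 3.0 3.2.5 3.2.6 3.2.11
```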