5.3 “Translating” query results

A named query result can be “translated” to an aligned corpus, which allows more flexible display of the aligned regions, access to metadata, etc. (new in CQP v3.4.9).
Consider the following example:
> EUROPARL-DE;
> set Context 1 s;
> Zeit = [lemma = "Zeit"];
The NQR Zeit now contains all occurrences of the German word for time in the German part of EuroParl. The following command “translates” the NQR to the English part of EuroParl, i.e. it replaces each match by the complete aligned region in the target corpus (as would be displayed with show +europarl-en;).
> Time = from Zeit to EUROPARL-EN;
This creates a new NQR EUROPARL-EN:Time containing the aligned regions. You can now e.g. tabulate or count metadata:
> tabulate EUROPARL-EN:Time match text_date;
> group EUROPARL-EN:Time match text_date;
The somewhat arcane syntax of the command avoids the introduction of a new reserved keyword.
- while it looks similar to a corpus query or set operation, the assignment to a new NQR is mandatory (otherwise the parser won't accept the syntax)
- note that the new NQR must be specified as a short name; the name of the target corpus is implied and added automatically with the assignment
Some important details:
- matching ranges that are not aligned to the target corpus are silently discarded; you cannot expect the new NQR to contain the same number of hits as the original NQR
- if there are multiple matches in the same alignment bead, they will not be collapsed in the target corpus; i.e. the new NQR will contain several identical ranges
- in order to collate source matches with the aligned regions, make sure to discard unaligned hits from the original NQR first:
  > Zeit = [lemma = "Zeit"] :EUROPARL-EN [];
  or post-hoc as a subquery filter
  > Zeit;
  > ZeitAligned = <match> [] :EUROPARL-EN [] !;
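The details above can be illustrated with a small toy model in Python. This is only a sketch of the described behaviour, not CQP's implementation: the bead table, the corpus positions and the function names (find_bead, translate, keep_aligned) are all hypothetical, and the model assumes that a match is assigned to the bead containing its first token, which the text above does not specify.

```python
from bisect import bisect_right

# Hypothetical alignment beads (made-up positions):
# (source_start, source_end, target_start, target_end), inclusive,
# sorted by source_start and non-overlapping.
BEADS = [
    (0, 9, 0, 7),
    (10, 19, 8, 20),
    # source positions 20-29 are deliberately left unaligned
    (30, 39, 21, 30),
]


def find_bead(pos, beads=BEADS):
    """Return the bead whose source region contains pos, or None."""
    starts = [b[0] for b in beads]
    i = bisect_right(starts, pos) - 1
    if i >= 0 and beads[i][0] <= pos <= beads[i][1]:
        return beads[i]
    return None


def translate(matches, beads=BEADS):
    """Replace each (start, end) match by the target region of its bead.

    Assumption: the bead is looked up via the match start position.
    Unaligned matches are silently dropped; several matches in the
    same bead each yield their own copy of the target region.
    """
    result = []
    for start, _end in matches:
        bead = find_bead(start, beads)
        if bead is not None:
            result.append((bead[2], bead[3]))
    return result


def keep_aligned(matches, beads=BEADS):
    """Discard matches that fall outside every alignment bead."""
    return [m for m in matches if find_bead(m[0], beads) is not None]


# Five toy source matches: two in the first bead, one unaligned (25).
zeit = [(3, 3), (7, 7), (12, 12), (25, 25), (31, 31)]
time = translate(zeit)
zeit_aligned = keep_aligned(zeit)

print(len(zeit), len(time))           # sizes differ: one hit was unaligned
print(time)                           # the first target region appears twice
print(list(zip(zeit_aligned, time)))  # filtered source hits pair up 1:1
```

In this model the translated list is shorter than the original and contains a repeated range, while the filtered source list lines up one-to-one with the translated ranges. That is why unaligned hits have to be removed before collating.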
Do not cat the translated query directly (cat EUROPARL-EN:Time;) without first activating the target corpus, as this would corrupt the context descriptor (see Sec. 3.1). The correct procedure is
> EUROPARL-EN;
> cat Time;
You can now customize the KWIC display as desired.
It is safe, however, to apply dump, tabulate, group, count and similar operations to the qualified name EUROPARL-EN:Time without activating the target corpus first. Only commands that auto-print the NQR (including a bare sort or a set operation) trigger the bug.
This problem is mentioned here because users are most likely to be tempted to cat a qualified NQR directly when working with a set of aligned corpora.
As a second example, we will return to German translations of nuclear power.
> EUROPARL-DE;
> Other = from EUROPARL-EN:Other to EUROPARL-DE;
We can now run a subquery on the aligned regions in the German part of EuroParl in order to search for possible translations other than Kern- and Atom-. One possibility is that nuclear power plant has been translated into the acronym AKW (for Atomkraftwerk).
> Other;
> [lemma = "AKW"];
Further translation candidates can be found by computing a frequency breakdown of all nouns in the aligned sentences:
> N = [pos = "N.*"];
> group N match word;
We could have applied the same strategy to the NQR Nuke in order to determine the frequencies of different translation equivalents:
> Nuke = from EUROPARL-EN:Nuke to EUROPARL-DE;
> Nuke;
> TEs = "(Atom|Kern|AKW).*";
> group TEs match lemma;
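As a rough illustration of what the last two commands compute, the following Python sketch counts lemmas of tokens whose word form matches the regular expression. It is a toy model only: the token list is invented for the example and does not come from EuroParl, and group_by_lemma is a hypothetical helper. The model relies on the fact that a bare string in a CQP query is matched against the whole word form, which re.fullmatch imitates here.

```python
import re
from collections import Counter

# Invented (word, lemma) pairs standing in for tokens of the aligned
# German regions; these are NOT real EuroParl data.
tokens = [
    ("Atomkraft", "Atomkraft"),
    ("Kernkraftwerke", "Kernkraftwerk"),
    ("AKWs", "AKW"),
    ("Kernenergie", "Kernenergie"),
    ("Atomkraft", "Atomkraft"),
    ("Energie", "Energie"),
    ("Kernkraftwerks", "Kernkraftwerk"),
]

# Same pattern as in the query; it must match the entire word form.
pattern = re.compile(r"(Atom|Kern|AKW).*")


def group_by_lemma(tokens, pattern):
    """Count lemmas of all tokens whose word form matches the pattern."""
    return Counter(
        lemma for word, lemma in tokens if pattern.fullmatch(word)
    )


counts = group_by_lemma(tokens, pattern)
for lemma, freq in counts.most_common():
    print(lemma, freq)
```

Grouping by lemma rather than by word form merges inflected variants such as Kernkraftwerke and Kernkraftwerks into a single row of the frequency table.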