2.5 Accessing token-level annotations

specify p-attribute/value pairs (square brackets are required)
> [pos = "JJ"];   (find adjectives)
> [lemma = "go"];
"interesting" is an abbreviation for [word = "interesting"]
the implicit attribute in the abbreviated form can be changed with the DefaultNonbrackAttr option; for instance, enter
> set DefaultNonbrackAttr lemma;
to search for lemmatised words instead of surface forms
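for example (a sketch based only on the option described above; the remarks in parentheses are explanatory), the same abbreviated query is interpreted differently before and after the option is changed:
> "go";   (default setting: same as [word = "go"])
> set DefaultNonbrackAttr lemma;
> "go";   (now the same as [lemma = "go"])
> set DefaultNonbrackAttr word;   (back to the default setting)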
the %c and %d flags can be used with any attribute/value pair
> [lemma = "pole" %c];
values are interpreted as regular expressions, which the annotation string must match; add the %l flag to match a value literally:
> [word = "?" %l];
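a few further illustrations (sketches, assuming Penn Treebank-style part-of-speech tags as in the examples above; the actual results depend on the corpus and its tagset): the regular expression has to match the entire annotation string, not just a part of it
> [pos = "NN"];   (only the tag NN itself, not NNS or NNP)
> [pos = "NN.*"];   (every tag beginning with NN)
without the %l flag, special characters have to be escaped with a backslash in order to be matched literally
> [word = "\?"];   (same result as [word = "?" %l])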
the != operator: the annotation must not match the regular expression
[pos != "N.*"] $\to$ everything except nouns
[] matches any token ($\Rightarrow$ matchall pattern)
see Appendix A.2 for a list of useful part-of-speech tags and regular expressions
or explore tagging with the /codist[] macro (more on macros in Sections 6.4 and 6.5):
> /codist["whose", pos];
$\to$ finds all occurrences of the word "whose" and computes the frequency distribution of the part-of-speech tags assigned to it
use a similar macro to find inflected forms of go:
> /codist[lemma, "go", word];
$\to$ finds all tokens whose lemma attribute has the value "go" and computes the frequency distribution of the corresponding word forms
abort query evaluation with Ctrl-C
(this does not always work; press Ctrl-C twice to exit CQP immediately)