> "in" @[pos="DT"] [lemma="case"];
shown in bold font in KWIC display
> [pos="DT"] (@[pos="JJ.*"] ","?){2,} [pos="NNS?"];
> A = [pos="DT"] @[pos="JJ"]? [pos="NNS?"];
> size A;
> size A target;
@1
, but this can be changed with a user option (see Sec. 8.6 for details).
> "in" @[pos="DT"] @1[pos="J.*"]? [lemma="case"];
keyword is underlined in KWIC display
> sort by attribute on start point .. end point ;
both start point and end point are specified as an anchor, plus an optional offset in square brackets; for instance, match[-1] refers to the token before the start of the match, matchend to the last token of the match, matchend[1] to the first token after the match, and target[-2] to a position two tokens before the target anchor
NB: the target anchor should only be used in the sort key when it is always defined
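The anchor-plus-offset notation amounts to simple arithmetic on corpus positions, which can be sketched in Python as follows. This is only an illustration, not CQP's implementation: the function name `resolve_anchor` and the dictionary layout are hypothetical, and the sample positions are made-up values of the kind printed by the dump command.

```python
import re

# Hypothetical helper (not part of CQP): parse an endpoint such as
# "matchend[1]" and add the offset to the corresponding anchor position.
ANCHOR_RE = re.compile(r"^(match|matchend|target|keyword)(?:\[(-?\d+)\])?$")

def resolve_anchor(spec, anchors):
    m = ANCHOR_RE.match(spec)
    if m is None:
        raise ValueError(f"not an anchor specification: {spec!r}")
    name = m.group(1)
    offset = int(m.group(2) or 0)   # no brackets means offset 0
    base = anchors[name]
    if base < 0:
        # -1 marks an anchor that has not been set, e.g. an optional target
        raise ValueError(f"anchor {name!r} is undefined for this match")
    return base + offset

# Illustrative anchors for a single match (keyword not set)
anchors = {"match": 1924977, "matchend": 1924979,
           "target": 1924978, "keyword": -1}

print(resolve_anchor("match[-1]", anchors))    # 1924976: token before the match
print(resolve_anchor("matchend", anchors))     # 1924979: last token of the match
print(resolve_anchor("matchend[1]", anchors))  # 1924980: first token after the match
print(resolve_anchor("target[-2]", anchors))   # 1924976: two tokens before the target
```

The error raised for an unset anchor mirrors the NB above: a sort key that relies on the target is only meaningful when every match has a target.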
> [pos="DT"] [pos="JJ"]{2,} [pos="NNS?"];
> sort by word %cd on match[1] .. matchend[-1];
> sort by word %cd on match[-1] .. match[-42];
whereas the reverse option sorts on the left context by character:
> sort by word %cd on match[-42] .. match[-1] reverse;
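The difference between the two left-context sorts can be illustrated with a small Python sketch. It assumes that the by-token variant compares the context token by token, starting with the token nearest to the match, and that the reverse variant compares the context string character by character, starting from its last character. The space-joined key string and the simplified stand-in for %cd (case folding plus removal of combining marks) are assumptions made for illustration, not a description of CQP's internals.

```python
import unicodedata

def fold(s):
    # Rough stand-in for %cd (assumption): ignore case and diacritics
    decomposed = unicodedata.normalize("NFD", s.casefold())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def key_by_token(left_context):
    # like match[-1] .. match[-42]: nearest token first, whole tokens compared
    return [fold(tok) for tok in reversed(left_context)]

def key_by_character(left_context):
    # like match[-42] .. match[-1] reverse: key string read from its end
    return fold(" ".join(left_context))[::-1]

# Left contexts in corpus order (the last element directly precedes the match)
contexts = [["in", "the"], ["on", "a"], ["at", "the"], ["By", "a"]]

print(sorted(contexts, key=key_by_token))
# [['By', 'a'], ['on', 'a'], ['at', 'the'], ['in', 'the']]
print(sorted(contexts, key=key_by_character))
# [['on', 'a'], ['By', 'a'], ['in', 'the'], ['at', 'the']]
```

In the by-token ordering "By a" precedes "on a" because "by" sorts before "on"; in the by-character ordering the comparison runs over "a no" and "a yb", so "on a" comes first.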
> sort by word %cd;
> set ExternalSort on;
> sort by word %cd;
> set ExternalSort off;
> count by lemma on match[1] .. matchend[-1];
> A = "behind" @[pos="JJ"]? [pos="NNS?"];
> dump A;
> dump A 9 14;
(10th – 15th match)
the four columns correspond to the match, matchend, target and keyword (see Section 3.7) anchors; a value of -1 means that the anchor has not been set:
1019887 1019888      -1 -1
1924977 1924979 1924978 -1
1986623 1986624      -1 -1
2086708 2086710 2086709 -1
2087618 2087619      -1 -1
2122565 2122566      -1 -1
note that any prior sort or count command affects the ordering of the rows (so that the n-th row corresponds to the n-th line in a KWIC display obtained with cat)
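For readers who post-process dump output in a script, the four-column layout can be parsed with a few lines of Python. This is a sketch that assumes one match per line with four whitespace-separated integer columns; the names `DumpRow` and `parse_dump` are hypothetical and the sample rows are for illustration only.

```python
from typing import NamedTuple, Optional

class DumpRow(NamedTuple):
    match: int
    matchend: int
    target: Optional[int]    # None if the anchor has not been set
    keyword: Optional[int]   # None if the anchor has not been set

def parse_dump(text):
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue                      # skip blank lines
        if len(fields) != 4:
            raise ValueError(f"expected 4 columns, got {len(fields)}: {line!r}")
        match, matchend, target, keyword = (int(f) for f in fields)
        rows.append(DumpRow(match, matchend,
                            None if target < 0 else target,
                            None if keyword < 0 else keyword))
    return rows

sample = """\
1019887 1019888 -1 -1
1924977 1924979 1924978 -1
"""
rows = parse_dump(sample)
print(rows[0].target)   # None
print(rows[1].target)   # 1924978
```

Because the row order follows any prior sort or count command, the list index of a row can be used to line it up with the corresponding KWIC line.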
the output can be written (>) or appended (>>) to a file; if the first character of the filename is |, the output is sent to a pipe consisting of the command(s) that follow the |
> A = [pos="DT"] [pos="JJ.*"]* [pos="NNS?"];
> dump A > "| gawk '{print $2 - $1 + 1}' | sort -nr | uniq -c | less";
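The pipeline above computes the length of each match in tokens (matchend - match + 1) and tabulates how often each length occurs, longest first. The same computation can be sketched in Python for readers who prefer to post-process a saved dump file; the function name `match_length_distribution` is hypothetical, and the sample rows are illustrative rather than actual output of this query.

```python
from collections import Counter

def match_length_distribution(dump_lines):
    """Count matches by length in tokens, using the first two dump columns."""
    lengths = Counter()
    for line in dump_lines:
        fields = line.split()
        if len(fields) < 2:
            continue                      # skip blank or malformed lines
        start, end = int(fields[0]), int(fields[1])
        lengths[end - start + 1] += 1     # same arithmetic as $2 - $1 + 1
    # longest matches first, as with `sort -nr`
    return sorted(lengths.items(), reverse=True)

sample = [
    "1019887 1019888 -1 -1",
    "1924977 1924979 1924978 -1",
    "1986623 1986624 -1 -1",
    "2086708 2086710 2086709 -1",
    "2087618 2087619 -1 -1",
    "2122565 2122566 -1 -1",
]

for length, freq in match_length_distribution(sample):
    print(f"{freq:7d} {length}")          # count first, like `uniq -c`
```

For the six sample rows this reports two matches of length 3 and four matches of length 2.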