Language, computers & statistics: not just for geeks. Corpus linguistics in the A level classroom

Glossary of corpus linguistics terminology used here

Word sketches and colligation:
Sketch Engine’s corpora are all tagged (as are the corpora on CQPweb) with Part-of-Speech (POS) labels. Sketch Engine can read these tags and use them in its calculations to create “word sketches”: one-page summaries of a node’s collocates, grouped by their grammatical relationship to the node. Grammatical collocation patterns are known as colligations.
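To make the idea of a word sketch more concrete, here is a minimal Python sketch of the principle. It is not how Sketch Engine works internally: the mini corpus, the tag names and the function toy_word_sketch are all invented for illustration, and the two "grammatical relations" are crude adjacency rules (an adjective directly before the node counts as a modifier, a verb directly after it counts as a verb the node is subject of).

```python
from collections import Counter, defaultdict

# Hypothetical hand-tagged mini corpus: each sentence is a list of
# (word, POS tag) pairs. The tags are simplified and invented for this example.
TAGGED = [
    [("the", "DET"), ("old", "ADJ"), ("man", "NOUN"), ("said", "VERB"), ("nothing", "PRON")],
    [("a", "DET"), ("young", "ADJ"), ("man", "NOUN"), ("came", "VERB"), ("in", "ADV")],
    [("the", "DET"), ("old", "ADJ"), ("man", "NOUN"), ("came", "VERB"), ("back", "ADV")],
]


def toy_word_sketch(sentences, node):
    """Group the collocates of a noun node by a (very rough) grammatical relation."""
    sketch = defaultdict(Counter)
    for sent in sentences:
        for i, (word, tag) in enumerate(sent):
            if word != node or tag != "NOUN":
                continue
            # Assumption: an adjective directly before the node modifies it.
            if i > 0 and sent[i - 1][1] == "ADJ":
                sketch["modifier"][sent[i - 1][0]] += 1
            # Assumption: a verb directly after the node has the node as its subject.
            if i + 1 < len(sent) and sent[i + 1][1] == "VERB":
                sketch["subject_of"][sent[i + 1][0]] += 1
    return sketch


sketch = toy_word_sketch(TAGGED, "man")
for relation, counts in sketch.items():
    print(relation, counts.most_common())
```

The point of the example is only that the POS tags, not the words alone, decide which column a collocate ends up in.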

In the dashboard menu on the left of the screen, select “word sketch”. Create a word sketch for “man”. Sketch Engine will first give an overview for the node’s most common word class, which is “noun”. Make a note of how many times “man” appears as a noun and how many times as a verb in the BNC.
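Counting how often a word form occurs as each word class is only possible because every token carries a POS tag. A minimal Python sketch of that count follows; the token list, the tag names and the function pos_counts are made up for illustration and are not BNC data.

```python
from collections import Counter

# Hypothetical tagged tokens (not from the BNC): "man" occurs twice as a
# noun and once as a verb ("they man the boats").
TAGGED_TOKENS = [
    ("the", "DET"), ("man", "NOUN"), ("left", "VERB"),
    ("they", "PRON"), ("man", "VERB"), ("the", "DET"), ("boats", "NOUN"),
    ("a", "DET"), ("man", "NOUN"), ("waited", "VERB"),
]


def pos_counts(tokens, word):
    """Count how often `word` occurs with each POS tag."""
    return Counter(tag for w, tag in tokens if w.lower() == word)


counts = pos_counts(TAGGED_TOKENS, "man")
print(counts)
```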

Take a closer look at the colligation “man” as subject of verb – what are the most common verbs?

Also take a look at the modifiers of “man” – again, make a note of these.

What about the colligation “man” as object of verb – what do you notice here about the verbs? Also, what do you notice about how Sketch Engine’s word sketches interpret “object”?

Now look at the colligation “man is …”. Among the list is the noun “husband”. If you hover over the three dots next to it, you are given a menu with three options. Select the first one, “concordance”, making sure to choose the “open in a new window” option. What is the frequency of this collocation pattern in the whole corpus (it is shown at the top left of the concordance)? How many instances are there?
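A concordance lists every hit for a node with a little context on either side, and the frequency is simply the number of lines. The following Python sketch shows the idea on an invented sentence; the function kwic, the window width and the text are assumptions made for this example and have nothing to do with Sketch Engine’s own search.

```python
def kwic(tokens, node, width=3):
    """Return (left context, node, right context) for every hit of `node`."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines


# Invented example text, split into tokens on whitespace.
TEXT = "the man is a husband and the woman is a wife but the man is tired"
lines = kwic(TEXT.split(), "man")

for left, node, right in lines:
    # Right-align the left context so the node lines up in one column.
    print(f"{left:>20} [{node}] {right}")
print("hits:", len(lines))
```

Note that the search matches whole tokens, so “woman” is not counted as a hit for “man”.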

Click on the word sketch symbol (the third one down, a dot in a circle) in the dashboard menu bar on the left to make a new word sketch, this time of “woman”. Make a note of the verbs that have “woman” as subject and as object, respectively. What do you notice?

Also, what are the most common modifiers for the noun “woman”?

If you look at the colligation “woman is…” – you’ll find both “wife” and “housewife”. How many times do these occur in the corpus?


