Reading preferences of South West England: a computational view

Early findings from our BBC The Novels That Shaped Our World surveys show that readers in the South West of England rate the novels Cold Comfort Farm by Stella Gibbons and Master and Commander by Patrick O’Brian as more enjoyable than readers from other British regions do. One way of understanding this trend is through computational textual analysis.

What’s key?

To find out why Stella Gibbons’s satire of country life Cold Comfort Farm (1932) and Patrick O’Brian’s historical naval adventure Master and Commander (1969) appeal so much to readers in the South West of England, the Big Book Review team performed three forms of computational analysis on these novels. We measured how they perform in terms of linguistic complexity, lexical diversity and distinctive keywords, as compared to other novels of the same genre. Before we go into the results in more detail, it’s useful to understand some computational terminology:

  • Keyness is the frequency of a word in a text when compared with its frequency in a reference corpus; the output is a list of words that stand out in that text as compared to other texts. This can help us identify distinctive themes in novels, but also preferences in word choice, for instance the use of gendered language.
  • Linguistic complexity, or readability, is often measured by mean sentence length and mean syllables per word, and can help us quantify how hard a text is to read.
  • Lexical diversity refers to the richness of vocabulary in a text. An easy method to use is the Type-Token Ratio (TTR), which divides the number of types (distinct words) by the number of tokens (the total number of words).
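The complexity and diversity measures defined above can be sketched in a few lines of Python. This is only an illustration, not the team's actual pipeline: the tokeniser, the sentence splitter and the vowel-run syllable counter are all simplifying assumptions, and the function names are made up for this example.

```python
import re

def tokenize(text):
    """Lowercased word tokens (assumption: letters and apostrophes only)."""
    return re.findall(r"[a-z]+(?:['’][a-z]+)*", text.lower())

def split_sentences(text):
    """Naive sentence split on '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def count_syllables(word):
    """Rough heuristic: one syllable per run of vowels, minimum one."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def mean_sentence_length(text):
    """Average number of word tokens per sentence."""
    sentences = split_sentences(text)
    if not sentences:
        return 0.0
    return sum(len(tokenize(s)) for s in sentences) / len(sentences)

def mean_syllables_per_word(text):
    """Average (estimated) syllable count per word token."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return sum(count_syllables(t) for t in tokens) / len(tokens)

def type_token_ratio(text):
    """Distinct words (types) divided by total words (tokens)."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)

sample = "The captain stood on the deck. The deck was wet."
print(mean_sentence_length(sample))
print(mean_syllables_per_word(sample))
print(type_token_ratio(sample))
```

Because raw TTR tends to fall as a text grows longer, comparisons of this kind are usually made on samples of similar length; whether that was done for the figures in this post is not stated.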

Linguistic mimicry

The complexity and diversity results for Master and Commander are given in Figures 1-3, and those for Cold Comfort Farm in Figures 4-6.

The keyness analysis showed that Master and Commander stands out in terms of its distinctive use of (historical) seafaring language, such as captain, deck, masthead, aboard, starboard, admiralty, and sea. This analysis also flagged the use of a large number of multisyllabic words, which was later verified by the high mean syllables per word (see Fig. 2). What stood out as well was that the novel contains many references to place names, such as Barcelona, and to the landscape (or rather seascape), which is not unusual for novels in the adventure category.
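To give a feel for how a keyness list like this can be produced, here is a minimal sketch using the log-likelihood (G2) statistic, one common keyness measure. The post does not say which statistic the team used, so this choice is an assumption, as are the function names and the toy sentences.

```python
import math
from collections import Counter

def log_likelihood(a, b, c, d):
    """G2 keyness score for one word.

    a: count of the word in the target text, b: count in the reference corpus,
    c: total tokens in the target text,      d: total tokens in the reference corpus.
    """
    expected_target = c * (a + b) / (c + d)
    expected_reference = d * (a + b) / (c + d)
    score = 0.0
    if a > 0:
        score += a * math.log(a / expected_target)
    if b > 0:
        score += b * math.log(b / expected_reference)
    return 2 * score

def keywords(target_tokens, reference_tokens, top_n=10):
    """Words relatively more frequent in the target, ranked by G2 (highest first).

    Assumes both token lists are non-empty.
    """
    target = Counter(target_tokens)
    reference = Counter(reference_tokens)
    c, d = len(target_tokens), len(reference_tokens)
    scored = []
    for word, a in target.items():
        b = reference[word]  # Counter returns 0 for unseen words
        if a / c > b / d:    # keep only words over-represented in the target
            scored.append((word, log_likelihood(a, b, c, d)))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]

# Toy data standing in for a novel and a reference corpus (invented for illustration).
target = "the captain paced the deck while the captain watched the sea".split()
reference = "the aunt and the cousin sat in the kitchen while the rain fell".split()
for word, score in keywords(target, reference, top_n=3):
    print(word, round(score, 2))
```

In practice the reference corpus would be the other novels in the same genre, and common function words that appear at similar rates in both receive scores close to zero.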

Finally, Master and Commander is much more linguistically complex than most of the other adventure novels in our corpus (see Fig. 1). Only Scott’s Ivanhoe from 1819 is more complex, and the closeness in linguistic complexity suggests that O’Brian’s novel aims to capture the spirit of the writing of that time. Master and Commander is also rich in vocabulary (see Fig. 3), with a TTR of 0.17, outperformed only by the outlier, Kevin Barry’s City of Bohane, a contemporary novel known for its uniquely rich Irish English slang and vernacular.

Fig. 1: Mean sentence length
Fig. 2: Mean number of syllables per word
Fig. 3: Type-token ratio, or, lexical diversity

Gibbons’s novel stood out in its use of keywords such as female pronouns and words pointing to family relations, such as aunt, cousin, brethren and grandmother. As Gibbons’s provocative novel plays with the middlebrow genre that was associated with women’s writing, it is not surprising that complexity (see Figs. 4 and 5) is (deliberately) not high for Cold Comfort Farm: it sits in the middle of the road in terms of sentence and word length. This is a knowing choice on Gibbons’s part as, paradoxically, the novel has by far the highest degree of lexical diversity (see Fig. 6), which is due to its use of vernacular and invented words (neologisms, imaginary slang).

Fig. 4: Mean sentence length
Fig. 5: Mean number of syllables per word
Fig. 6: Type-token ratio, or, lexical diversity

Can these findings say anything conclusive about the literary taste of readers in the South West? The corpus is small and relatively arbitrary, but the extraordinary ratings that South West readers gave Gibbons and O’Brian seem to suggest that they have a special interest in sea life and family life, a preference for high lexical diversity, and a love of out-of-this-world words.

To help us understand more about the diversity in literary taste across the UK, please take part in the Big Book Survey and get some surprising book recommendations.
