Context Statement: This was my final project for my corpus linguistics class. For this report, I used both technology and critical thinking to compare and contrast linguistic features across different authors.


Given the wide array of literature that makes up fiction, linguists often focus on defining the characteristics of the genre as a whole rather than analyzing individual sub-genres. The goal of this study is to take sample works from three important fantasy authors from different time periods, compare them using a multi-dimensional analysis, and discover what lexical characteristics differentiate fantasy novels from each other as well as from general fiction.


To build the corpus used for this analysis, five sections were taken from each author, for a total of fifteen text files. The first novel chosen was the first book of The Lord of the Rings trilogy, The Fellowship of the Ring, written by J.R.R. Tolkien and published in 1954. The second novel was the first book of The Wheel of Time series, The Eye of the World, written by Robert Jordan and published in 1990. The third selection for the corpus was the first two books of the Inheritance Cycle, Eragon and Eldest, written by Christopher Paolini and published in 2002 and 2005, respectively.

One difference between the novels was chapter size: as the publication dates approached the present day, chapters became increasingly short. This was also a factor in using the first two Paolini books instead of one. To keep file size consistent between novels, at between 6,000 and 12,000 words per file, chapters were grouped at different ratios: one chapter per file for The Fellowship of the Ring, two chapters per file for The Eye of the World, and three chapters per file for the Inheritance Cycle books. In total, 45,709 words were sampled from Tolkien, 45,467 from Jordan, and 44,423 from Paolini, for a total corpus of 135,599 words.

While all three authors share the purpose of presenting narrative fiction set in a fantasy world, their audiences vary slightly. Tolkien, a professor of ancient languages, focuses much more on language, history, and poetry, writing for a more academic audience. Robert Jordan, while using simpler language than Tolkien, includes adult content, so his audience should be considered an adult one. Christopher Paolini, who was himself a very young author when writing these books, wrote for a young adult audience.

All three stories have the same premise. A young male protagonist who has grown up as a commoner is thrown into a journey with a party of companions when a dark evil threatens the land. This narrative similarity allowed for increased focus on the lexical differences between the pieces.


To analyze these novels, the multi-dimensional analysis developed by Dr. Douglas Biber was used. This analysis focused on Dr. Biber’s first three dimensions. Dimension 1 distinguishes affective, interactional, and generalized content from informational density and precision; Dimension 2 distinguishes between narrative and non-narrative discourse; Dimension 3 distinguishes elaborated reference from situation-dependent reference (Biber, Conrad, and Reppen, 152-153). All counts in this analysis were normed to a basis of 1,000 words, as each file was analyzed individually.

For Dimension 1, the linguistic features chosen were personal pronouns, private verbs, and attributive adjectives. For pronouns and private verbs, each file was checked for occurrences using the computer program AntConc. For the pronouns, a list including I, we, she, he, him, her, they, them, and you was used. Only uses as subjects or objects were counted, with occurrences as adjectives (such as possessive her) hand-edited out. For private verbs, a list of five common verbs was used, in both past and non-past forms: to hope, to know, to think, to feel, and to wish, with incorrect occurrences hand-edited out. For attributive adjectives, random 200-word samples were taken and hand-checked for occurrences. The number of occurrences was divided by the sample size and multiplied by 1,000.
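
The norming step can be illustrated with a short Python sketch. It is not the procedure used in the study, where counts were taken in AntConc and hand-edited. The function names `tokenize`, `count_from_list`, and `normed_count`, the simple tokenizer, and the sample sentence are all assumptions made for illustration.

```python
import re

# Pronoun list from the method description above.
PRONOUNS = {"i", "we", "she", "he", "him", "her", "they", "them", "you"}

def tokenize(text):
    """Split text into lowercase word tokens (a simplifying assumption)."""
    return re.findall(r"[a-z]+(?:[’'][a-z]+)*", text.lower())

def count_from_list(tokens, word_list):
    """Count tokens that appear in word_list, before any hand-editing."""
    return sum(1 for token in tokens if token in word_list)

def normed_count(occurrences, sample_size, basis=1000):
    """Occurrences divided by sample size, multiplied by the norming basis."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return occurrences / sample_size * basis

# Hypothetical sample sentence, not taken from the corpus.
sample = "She told him that they would wait, and he agreed."
tokens = tokenize(sample)
raw = count_from_list(tokens, PRONOUNS)
print(raw, len(tokens), round(normed_count(raw, len(tokens)), 1))  # 4 10 400.0
```

A list match like this cannot tell her as an object from her as a possessive, which is one reason the hand-editing step described above would still be needed.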

For Dimension 2, the linguistic features chosen were past-tense verbs and public verbs. The proportion of past-tense verbs was also taken from random 200-word samples and normed to 1,000. For public verbs, a list of five common verbs was used, in both past and non-past forms: to say, to tell, to ask, to answer, and to reply, with incorrect occurrences hand-edited out.

For Dimension 3, the linguistic features chosen were time/place adverbials and wh-relative clauses. For time/place adverbials, a list of common adverbs was used, including close, far, near, soon, and late. Searches in AntConc had the adverbs enclosed in asterisks to capture occurrences of different forms. Searches with *ly* were also made for adverbs such as quickly or slowly. All occurrences were hand-edited. For wh-relative clauses, each file was searched for occurrences of which, with incorrect occurrences hand-edited out.
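
A rough Python equivalent of the asterisk searches is sketched below. It assumes that an asterisk stands for zero or more word characters; AntConc's wildcard settings are configurable, so the program's actual behavior may differ. The helper names `wildcard_to_regex` and `find_matches` and the sample sentence are hypothetical.

```python
import re

def wildcard_to_regex(pattern):
    """Turn a pattern such as '*near*' into a whole-word regular expression.

    Assumption: '*' stands for zero or more word characters.
    """
    parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile(r"\b" + r"\w*".join(parts) + r"\b", re.IGNORECASE)

def find_matches(pattern, text):
    """Return every word in text that matches the wildcard pattern."""
    return wildcard_to_regex(pattern).findall(text)

# Hypothetical sample sentence, not taken from the corpus.
sample = "The family rode slowly and drew nearer to the nearby gate."
print(find_matches("*near*", sample))  # ['nearer', 'nearby']
print(find_matches("*ly*", sample))    # ['family', 'slowly']
```

The second search shows why the results had to be hand-edited: a pattern like *ly* also returns words such as family that are not adverbs.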

Finally, dimension scores were calculated using the normed counts of the linguistic features. Dimension 1: Personal Pronouns + Private Verbs – Attributive Adjectives; Dimension 2: Past-tense Verbs – Public Verbs; Dimension 3: Wh-relative Clause – Time/Place Adverbials.
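
These three formulas can be written out as a minimal Python sketch. The function names and the per-file values are hypothetical; only the Tolkien averages passed to `dimension_2` come from this report (Figure D, reading 9.6 × 10 as 96).

```python
from statistics import mean

def dimension_1(personal_pronouns, private_verbs, attributive_adjectives):
    """Positive scores lean interactive, negative scores informational."""
    return personal_pronouns + private_verbs - attributive_adjectives

def dimension_2(past_tense_verbs, public_verbs):
    """Past-tense verbs minus public verbs, as defined in this report."""
    return past_tense_verbs - public_verbs

def dimension_3(wh_relative_clauses, time_place_adverbials):
    """Positive scores lean elaborated, negative scores situation-dependent."""
    return wh_relative_clauses - time_place_adverbials

def author_score(file_scores):
    """Mean of the per-file scores for one author."""
    return mean(file_scores)

# Tolkien's averaged normed counts as reported in Figure D.
print(round(dimension_2(96.0, 10.4), 1))  # 85.6

# Hypothetical per-file Dimension 3 scores, for illustration only.
hypothetical_file_scores = [-4.0, -2.5, -3.0, -1.5, 0.5]
print(round(author_score(hypothetical_file_scores), 2))  # -2.1
```

The first result, 85.6, agrees with the rounded Tolkien score of 8.6 × 10 reported in Figure D.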

Analysis Results

Dimension 1

Dimension 1, interactive vs. informational, had surprising results, with Tolkien and Paolini scoring very strongly positive (interactive), at 22.85 and 34.38 respectively, while Jordan scored barely negative (informational), at -1.18. While Jordan’s text fell in the normal range reported by Biber for general fiction, between 0 and -5 (Biber, 152), Tolkien’s and Paolini’s texts were highly interactive. The first contributing factor is Jordan’s slightly higher use of attributive adjectives, as seen in Figure A. The other strong factor was the large amount of dialogue and inner monologue in Tolkien and Paolini, shown most clearly by personal pronoun usage. Figure B shows specifically the usage of I, he, and she.

Figure A: Attributive Adjective Usage, Average Norm Count (Normed by 1,000)

Figure B: Personal Pronoun Use (Normed by 1,000)


Tolkien, especially in the first half of the novel, used a very large amount of dialogue, resulting in the highest number of first-person pronouns. Paolini and Jordan, on the other hand, relied much more on third-person narration. As the protagonists were male, he was the most common pronoun. What differentiated Paolini from Jordan, however, was Paolini’s use of the inner, first-person monologue of Eragon, the protagonist. See Figure C below for text examples.

Another pattern that can be seen here is the increasing presence of female characters. In Tolkien, the whole adventuring party is male, and there are very large sections of the book without a female character. However, likely due to societal changes and the growth of the sub-genre, both Jordan and Paolini, while still featuring a male protagonist, are more inclusive of female characters.

As for private verbs, all three authors used them with that-clauses very infrequently, with norm counts of 2.38, 2.68, and 2.24 for Tolkien, Jordan, and Paolini, respectively.

Figure C: Text Samples

Tolkien: ‘I had to study you first, and make sure of you. The Enemy has set traps for me before now. As soon as I had made up my mind, I was ready to tell you whatever you asked. But I must admit,’ he added with a queer laugh, ‘that I hoped you would take to me for my own sake. A hunted man sometimes wearies of distrust and longs for friendship.’ 

Jordan: Moonlight flashed off steel. One Trolloc tumbled forward, rolling over and over before landing in a heap, while a second dropped to its knees with a scream, clawing at its back with both hands. The third snarled, baring a muzzleful of sharp teeth, but as its companions toppled it whirled away into the darkness. Thom’s hand made the whip-like motion again, and the Trolloc shrieked, but the shrieks faded into the distance as it ran.

Paolini: I can’t believe it, thought Eragon. They’re helping us get away! At the main gates, the soldier pointed and said, “Now, you walk through those and don’t try anything. We’ll be watching. If you have to come back, wait until morning.” “Of course,” promised Jeod. Eragon could feel the guards’ eyes boring into their backs as they hurried out of the castle.

Dimension 2

Figure D: Dimension 2 Factors (Normed by 1,000) and Scores (Mean Average)

            Past-Tense Verbs    Public Verbs    Score
Tolkien     9.6 × 10            10.4            8.6 × 10
Jordan      9.2 × 10            4.99            8.7 × 10
Paolini     8.3 × 10            8.34            7.5 × 10

As can be seen in Figure D, all three pieces are very heavily narrative as opposed to non-narrative. Returning to the text samples in Figure C, there is very heavy use of the past tense, especially in third-person narration. This is broken up only by the use of non-past forms in dialogue. Jordan has a significantly lower number of public verbs, given his focus on action instead of dialogue. Paolini’s slightly lower score is due, again, to the use of the protagonist’s inner monologues, which not only use the present tense but also focus on emotion rather than description or action.

Much like private verbs, while all the authors made fairly frequent use of to say in different tenses, public verbs with that-clauses were not especially common in the corpus. A more in-depth analysis of non-past verbs would have provided a closer counterpart to past-tense verbs and given a more balanced score.

Dimension 3

Figure E: Dimension 3 Factors and Score

Time/Place Adverbials (Normed to 1,000)    Wh-Relative Clauses (Normed to 1,000)    Score (Mean Average)

For Dimension 3, all three novels leaned toward situation-dependent reference rather than elaboration. It should be noted that there was a shift from the older novels to the newer ones, with adverbials used less and wh-relative clauses used more. One Paolini file, Paolini_5, was barely positive, with a score of 0.45.


While the corpus used for this analysis was extremely small compared to the large, diverse sub-genre it was taken from, it has shown several shifts within the genre over the last fifty years.

  1. A very strong shift between Tolkien and Jordan in the balance of dialogue and action, within increasingly shorter chapters. With the genre’s growing popularity across multiple platforms, it makes sense for the genre to become more action-focused for its larger audience. A shift back toward more first-person usage with Paolini’s inner monologues shows modern audiences’ increasing interest in internal conflict.
  2. More inclusive casts of characters, with female characters having more prominent roles. This is likely due to social changes within the U.S. as well as the growing popularity of the sub-genre.
  3. A strong focus on past-tense narrative, with a very slow trend toward more present-tense and private verbs. This also likely stems from a desire in modern audiences for more character development.
  4. A slow trend of moving away from situation-dependent reference toward elaboration.

As the sub-genre of Fantasy continues to grow and mature, it will be interesting to see if these trends continue, stop or reverse.


