By Florian Maas on April 6, 2022
Estimated Reading Time: 10 minutes
In my previous post, I did some initial analysis on the dataset that contains all poems by /u/poem_for_your_sprog. I noted there that the rhythm, or poetic meter, of the poems is one of the main characteristics that makes these poems so appealing to me. In this post, I want to get to know a bit more about poetic meter, and explore the meters that sprog utilizes in his poems on /r/AskReddit.
Before we dive into this post, take a look at the chart containing some of the most popular poems by /u/poem_for_your_sprog below and see what patterns you can discover. You can read the poems by hovering over the points in the plot.
Did you notice any patterns? The legend for this plot can be found a little later in this post, so if you are in a hurry; feel free to scroll down. Otherwise, continue reading while we find out more about what it is that makes some of these poems feel so similar.
A little disclaimer before we start; I did not know anything about poetry when I started this analysis. Well, that's not entirely true; I knew that in poems words usually rhyme. That's about it. Anway, I don't claim that this notebook is exhaustive nor that everything in it is correct, so if you have any feedback or suggestions; please do let me know! Oh, and by the way; all code can be found on GitHub.
Let's go exploring!
To explore poetic meter, let's start by taking a look at one of my favourite poems by /u/poem_for_your_sprog, which contains a message about dealing with regret:
I should have hurried youth, in truth,
And moved more quickly on -
I should have made the most of youth,
Before the time was gone.
I should have followed fancy, free,
Before it thought to fade -
I should have picked a good degree,
Or found myself a trade.
I should have stopped to stare above;
To share another's dreams -
I should have never welcomed love,
And lost it all, it seems.
No matter what the aim or end -
No matter what you do -
Regrets are part of life, my friend:
Don't let them conquer you.
One of the reasons I like this poem is because of its clear and uplifting message. But another reason that makes this poem stand out to me is it's very apparent rhythm. If you say this poem out loud, you are clearly able to hear it;
I SHOULD have HURried YOUTH, in TRUTH,
And MOVED more QUICKly ON -
Or, a more visual way of representing this:
Here, the syllables in the blue squares are stressed, and the syllables in the grey squares are unstressed. If you have trouble hearing the rhythm, try to read the lines out loud, stressing the syllables in the grey squares. Feels unnatural doesn't it? Now try the same, but stressing the words in the blue squares. That should be a lot more comfortable.
So one could say that the rhythm of the two lines above is unstressed-stressed-unstressed-stressed-unstressed-stressed-unstressed-stressed-unstressed-stressed-unstressed-stressed-unstressed-stressed. Saying that out loud does fill a little cumbersome though.
Luckily, as with any other art, science, religion or discipline, things in poetry tend to have names to make talking about them a bit easier. For example, the meter of the first line is iambic tetrameter, whereas the second line is iambic trimeter. To break that down: First, they are both iambic, because they consist of iambic feet. An iambic foot is a unstressed syllable, followed by a stressed syllable, so a grey square followed by a blue square in the visualization above. The first line consists out of four iambic feet (pentameter), and the second line consists out of three iambic feet (trimeter). The nomenclature for the number of feet per line is as follows:
Two other commonly used types of feet are the trochaic foot (stressed, unstressed), and the anapestic foot (unstressed, unstressed, stressed). To illustrate, examples of the three types of feet we have encountered until now are:
By combining these three feet in a large variety of ways, we can construct the meter for most of the poems by /u/poem_for_your_sprog. To perform analysis on the meters that sprog uses however, we should first identify which meter each poem uses. We could do that by reading each poem and assigning the meter by hand, but that will be very time consuming. So instead, I leveraged Python to do this for us. If you are interested in learning more about my approach for tackling this problem, see this post. If you are just interested in the results, feel free to skip that page and continue reading here.
The first thing that we might be interested in; what is Sprog's favourite meter? Or better; what are his ten most favourite meters?
Turns out sprog's most common poetic meter is a combined meter of iambic tetrameter and iambic trimeter; alternating lines with eight and six syllables. The graph requires some more explanation. First, the asterisks:
Then, we also see one type of metrical foot that we have not encountered before: The amphibrachic foot. An amphibrachic foot is syllables in the order: unstressed, stressed, unstressed. Basically, an amphibrachic dimeter (010010) and a anapestic dimeter** (01001) together make up an anapestic tetrameter** (01001001001).
If this sounds like French to you (I'm assuming you're not French here. If you are; je m'excuse. Please replace French with a random language that you barely speak), don't worry; let's look at some examples of the ten most common meters to make this all a bit more clear:
To get a better idea of the application of these meters in the poetry by sprog, and to get an idea of how well our meter-determining-logic performs, below is a plot of the ten most upvoted poems for each of the ten most commonly used meters.
Note that you can hover over the markers to read the poem, and you can hide/show a group of poems by clicking the corresponding entry in the legend on the right.
Judging from these examples, it seems our simple method for determining the meter of a poem actually performs quite well. Of course, this is only a small subset of the total poems and we don't see how many poems that should be in these groups we misclassified, but it'll have to do for now. (I just put that last statement there to keep my colleagues happy, otherwise they will start harassing me on monday. To all the others; it's amazing! Look at thow well it performs! Isn't it wonderful?).
Looking at the upvotes, it does seem like we can detect a pattern here, even though there are only ten markers per group in the plot. Poems written in certain meters seem to get more upvotes than poems in other meters. Initially, I thought about just looking at the average number of upvotes per group:
However, this might lead us to draw wrong conclusions, since the dataset contains many outliers. For example, sprog's most upvoted poem has over 37k upvotes. Since at the time of writing there are 55 other poems written in iambic dimeter this group will have on average approximately 673 (37,000 divided by 55) more upvotes than if this single poem had zero upvotes; that's a lot of impact for one observation! We will get more accurate results by looking at the median instead of the mean. A good way of visualizing this type of data is a boxplot, so let's try that out:
One note to fellow Redditors; please stop upvoting the 'i lik the bred'-poem. It's screwing up my y-axis.
Anyway; this plot gives us much better results. Some types of meter are more popular than others, but the differences are not as big as our initial plot based on the mean number of upvotes led us to believe.
Now let's take a look at what we encountered in my previous notebook. There, I created a histogram with the average number of characters per line per poem. We found that there was a large peak around 30 characters, and a smaller second peak around 46/47 characters. By some visual inspection of these poems that caused the second peak, we raised the hypothesis that this was mainly caused by poems in anapestic tetrameter. Since we have classified the poems now, we can actually validate this hypothesis. Below is a plot with multiple histograms of the average line length in characters per poem, overlayed on top of each other. Note that again you can hide or show histograms by clicking the corresponding entries in the legend on the left side.
And indeed; the peak around 46/47 characters exists completely out of poems in anapestic tetrameter** - at least, for the poems in the top 10 most commonly used meters, which are displayed here. It's also interesting to see that whereas most of the histograms follow approximately a normal distribution, some are left-skewed. This is because of the fact that some lines are split over multiple lines, as we saw before. By inspecting the plot above, we can see that this mostly happens in poems with anapestic tetrameter** and iambic tetrameter.
Now, let's see how the use of meter in sprog's poems over time has changed. Below are eleven timelines; one for each of the ten most commonly used meters, and one for all others. Each timeline shows for each month the percentage of poems that was written in that meter during that month.
The first thing that's very intesting to see; in the first few months, the category 'other' accounts for a very large fraction of poems. This seems to be mainly caused by the fact that meter was a less strong feature in the poems from that period, and there was not always a single recognizable meter in the poems. Basically, I think what we are seeing there is sprog developing his own characteristic style in which meter plays such an important role.
What is also interesting to see is that halfway 2016, sprog started experimenting a bit more. The use of the always winning combo of iambic tetrameter, iambic trimeter starts to decline. At the same time, new meters make their introduction around the same period:
That was all for now! I learned a lot about meter in poetry by creating this post, and I hope I was able to share some of that with you.