What data can teach us about Russian propaganda

May 28, 2022

Recent years have seen an increase in interest in propaganda and political rhetoric. In my field, philosophy of language, a change has occurred from the consideration of often technical and abstruse facts about how language works to how language is used to manipulate and subjugate. Thus Jason Stanley’s How Propaganda Works, building both on technical semantics and feminist and race-theoretic work, proposes a theory of, well, how propaganda works.

A whole slew of books in history, politics, and current events are similarly focused. A seminal such book, Timothy Snyder’s The Road to Unfreedom, traces recent developments in politics, and in how politics is spoken and thought of, from Ukraine and Russia to Western Europe to the US.

My aim in this brief post is to approach the topic of political rhetoric from what is, as far as I can tell, an unfamiliar angle. Much of the work above, which I admire, is qualitative. By that I mean it tends to focus on salient bits of language or rhetoric or forms of thought and subject them to analysis. Stanley considers ‘dog whistles’ or ‘code-words’, which are expressions that serve to say things to those with ears to hear: a classic example is ‘States’ rights’. In a recent article, Snyder considers the recent coinage of ‘rushism’ among Ukrainians to stand for Russian fascism.

A worthwhile corrective to that is to look at the big picture. The rhetorical effects of political language are unlikely to be exhausted by such attention-grabbing phrases. More likely, propaganda works, in part, in a day-to-day and more subtle way. (A rough analogue, if it helps: Foucault thought that power manifested in a society in decentred and unobtrusive ways: rather than guns and prisoners and flayings, we are subject to little but constant corrections ensuring we line up right. There is surely a linguistic analogue, and sifting through data might help us find it.)

To that end, using some software I talked about last week (see here), I took a quick look at a relatively large data set: the output of the Russian news agency TASS from 24 February to 1 May, and that of a Ukrainian agency, Ukrinform, for the same period. Although a proper analysis is something I lack both the time and the knowledge to do, I believe even an improper analysis, using stats implementable in Excel, yields insights. I will discuss three: whether Russia’s authoritarianism casts a shadow in the data; how prevalent bona fide rhetoric such as the talk of ‘denazification’ is; and how the Russian media attempted to deflect culpability and attention away from the Bucha massacre. I’ll treat them in turn.

Authoritarianism by numbers

A core story of the war is the siege of Mariupol. The theatre airstrike was international news; Telegram users who follow the ‘Mariupol Now’ (Мариуполь сейчас) channel see very close up the devastation and fear the besieged residents were faced with. Even Russia’s state-guided Perviy Kanal, a few days into the war, in a clip no longer accessible, drew attention to it, claiming it was a new Stalingrad and so on (source, unfortunately, this tweet from me. Archiving is important!)

It’s thus reasonable that any news agency would devote a lot of attention to it. A quick check suggests that Ukrinform does. I will make my spreadsheet available maybe tomorrow once I’ve tidied a bit, but as a way of comparing (a particularly keen person could do most of this by adding to the thing I posted last week), note that any news agency, during a war, is going to talk about its president. Common sense thus suggests we can use mentions of the president as a benchmark, as a way of getting our bearings: if a topic is talked about as much as the president, it is getting a good amount of coverage.

With that in mind, here’s a graph showing occurrences in Ukrinform, for the time period indicated, of ‘Zelensky’ vs ‘Mariupol’:

Mariupol is the bluey sort of colour. All of these graphs start with 24 February as 0 on the x-axis.

That seems reasonable. Now, let’s check occurrences of ‘Putin’ compared with Mariupol in TASS:

Putin is the not bluey sort of colour. I’m colour blind.

Arguably, this is significant. Surely, one might think, even though the leader of a country at war is important and deserving of a lot of attention, they ought not count as essentially always more important than perhaps the central theatre of the war.
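For anyone wanting to reproduce this kind of comparison outside a spreadsheet, here is a minimal sketch in Python. It is not the software from the previous post: the function names, the `(date, text)` input format, and the toy articles are all assumptions made for illustration. It counts substring occurrences per day, with 24 February as day 0, and reports on how many days the leader is mentioned more often than the topic.

```python
from collections import Counter
from datetime import date

START = date(2022, 2, 24)  # day 0 on the x-axis, as in the graphs above

def daily_counts(articles, term):
    """Count case-insensitive occurrences of `term` per day offset from START.

    `articles` is an iterable of (publication_date, text) pairs (assumed format).
    """
    counts = Counter()
    for published, text in articles:
        day = (published - START).days
        counts[day] += text.lower().count(term.lower())
    return counts

def days_leader_ahead(articles, leader, topic):
    """Return (days on which `leader` outnumbers `topic`, days with any data)."""
    leader_counts = daily_counts(articles, leader)
    topic_counts = daily_counts(articles, topic)
    days = sorted(set(leader_counts) | set(topic_counts))
    ahead = sum(1 for d in days if leader_counts[d] > topic_counts[d])
    return ahead, len(days)

# Toy, made-up articles purely to show the shape of the input.
sample = [
    (date(2022, 2, 24), "Putin said ... Putin added ... Mariupol"),
    (date(2022, 2, 25), "Mariupol under siege. Mariupol residents. Putin"),
]
print(days_leader_ahead(sample, "Putin", "Mariupol"))
```

On real TASS or Ukrinform text one would presumably search for Cyrillic stems rather than English names, so that inflected forms are caught as well.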

Denazification, Demilitarization, Genocide

Recall that Putin’s casus belli in his speeches in the last week of February was that, inter alia, he was going to ‘denazify’ and ‘demilitarize’ Ukraine, and that the Russian speakers in the east were undergoing ‘genocide’. These seem like paradigm propagandistic terms, used to evoke feeling while short-circuiting reasoning, indeed perhaps being baldfacedly bullshit in the technical sense of “unconcerned with truth or falsity” which is such a hallmark of much Russian political speech.

Now imagine (although I certainly haven’t done enough to substantiate this here) that propaganda was primarily about these sorts of flashy expressions. Then, maybe, one would find such terms occurring frequently even in the relatively sober state news agencies (the pundits are a different story, as anyone who has seen the clips from TV that get shared on social media will know).

But that’s not really what one finds. I compared the Russian and Ukrainian words for ‘denazification’, ‘demilitarization’, and ‘genocide’, and the results were somewhat sparse:

There’s just not much: that sort of terminology, it seems, is not a primary rhetorical feature of states’ political messaging. One thing to note is that the limits of my procedure are shown here: in most cases, the use of the relevant terms in the Ukrainian context was in quotations (as one would expect!), but that’s not something overly readily mineable from the text.
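A crude first pass at the quotation problem might look like the following Python sketch. It is a heuristic I am assuming for illustration, not something used to produce the counts above: it only recognises three paired styles of quotation mark, ignores nested or unbalanced quotes, and the sample sentence is made up.

```python
import re

# Paired quotation marks: guillemets, curly double quotes, straight double quotes.
QUOTED = re.compile(r'«[^»]*»|“[^”]*”|"[^"]*"')

def split_quoted(text, stem):
    """Return (inside, outside): occurrences of `stem` within vs. outside quotes."""
    text = text.lower()
    stem = stem.lower()
    inside = sum(m.group(0).count(stem) for m in QUOTED.finditer(text))
    return inside, text.count(stem) - inside

# Made-up Ukrainian sentence: one scare-quoted use, one plain use.
sample = "Росія заявляє про «денацифікацію» України. Денацифікація триває."
print(split_quoted(sample, "денацифік"))
```

A high inside-to-outside ratio for a term would be some evidence that an agency is reporting the other side’s vocabulary rather than adopting it, though reported speech without quotation marks would still slip through.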

Bucha

The final example in this quick tour is ‘Bucha’. As one would expect, it receives a lot of attention from Ukrinform around the time the massacre was made public.

Plausibly, this is what one would expect. How does TASS treat it? Well, I took a look to see how it occurred, and noticed something: many of the stories containing it also spoke of things like fake, provocation, lying, and so on. So I collected such words (“провокац”, “лжив”, “ложн”, “фейк”), and behold:

The occurrences of Bucha in the Ukrainian media (bluey) against a disjunction of lying words

That is to say, just as we see the spike in Ukrinform’s, and indeed the world’s, attention to Bucha, so we see a spike in Russian talk of lies and fakes.
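The ‘disjunction of lying words’ can be sketched as a single regular expression over the stems listed above. This is a minimal Python illustration under my own assumptions (the function names, the `(date, text)` input format, and the toy sentences are hypothetical), not the code that produced the graph.

```python
import re
from collections import Counter
from datetime import date

START = date(2022, 2, 24)  # day 0, as in the graphs
LYING_STEMS = ["провокац", "лжив", "ложн", "фейк"]  # stems from the post
LYING_RE = re.compile("|".join(map(re.escape, LYING_STEMS)), re.IGNORECASE)

def daily_stem_hits(articles):
    """Count matches of any lying-word stem per day offset from START."""
    counts = Counter()
    for published, text in articles:
        counts[(published - START).days] += len(LYING_RE.findall(text))
    return counts

def peak_day(counts):
    """Day offset with the most matches (earliest such day on ties)."""
    return max(sorted(counts), key=counts.get)

# Toy, made-up sentences purely to show the mechanics.
sample = [
    (date(2022, 4, 1), "Обычные новости дня."),
    (date(2022, 4, 4), "Это провокация и фейк. Ложные сообщения опровергнуты."),
]
hits = daily_stem_hits(sample)
print(hits, peak_day(hits))
```

Matching on stems rather than whole words catches inflected forms such as ‘провокация’ and ‘провокацией’, at the cost of the occasional false positive.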

There are many ways to respond to this. Maybe my methodology is bad (it’s famously easy to mess up when reasoning quantitatively); maybe I’ve messed up the coding along the way; maybe something else. All possible (but all find-outable, since all the software used to produce the above is available; again, see the previous post). In the meantime, I think we should take seriously the possibility that there’s an aspect of political rhetoric only findable if we use domain-specific knowledge in combination with bigger data sets than humanities people might be used to.

Written by Matthew McKeever