River: a tool for quantitative media analysis

Matthew McKeever
May 24, 2022

Last week I posted some things to quantitatively store, search, and present media, and in particular the output of the state news agencies of Russia and Ukraine.

As I indicated in that post, I was a bit dissatisfied with how it turned out: it wasn’t as accessible (to journalists, historians, and others who might not have a coding background) as I’d like, especially given the relatively meagre payoff. It still more or less requires knowledge of Ukrainian and Russian and access to Telegram, but that’s still tens of millions of people.

Accordingly, I went ahead and made a no-code version that captures most of the features of the other thing but is much less hassle to set up. It is called River and it can be found in an online single-page app here. The ugly code is here and anyone can do whatever they want with it. I hope it’s of some use to people. It’s not very pleasant to look at in various respects, some owing to my bad aesthetic eye, others to the time constraints of doing this in my spare time. One concrete change I do hope to make is adding a moving average (or, more generally, other tools from time series analysis) that might help display the data. In the remainder of this short post I want to set out my hopes for this sort of thing, partly because I’ve stalled out in trying to find the patterns in the data I’m looking for.

The main functionality: a graph of occurrences of words, plus the occurrences themselves, shown by clicking a day in the graph. The English-language Wikipedia is included for reference.
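For readers who do code, here is a rough sketch of the kind of computation that sits behind a graph like this, along with the moving average I mentioned wanting to add. It is not River's actual code: the function names, the `(date, text)` shape of the posts, and the trailing-window choice are all assumptions made for illustration.

```python
from collections import Counter
from datetime import timedelta


def daily_counts(posts, term):
    """Count case-insensitive occurrences of `term` per day.

    `posts` is an iterable of (date, text) pairs; this shape is an
    assumption for the sketch, not how River stores its data.
    """
    counts = Counter()
    needle = term.lower()
    for day, text in posts:
        counts[day] += text.lower().count(needle)
    return counts


def fill_missing_days(counts):
    """Return (days, values) over the full date range, with 0 for gaps."""
    if not counts:
        return [], []
    start, end = min(counts), max(counts)
    days = [start + timedelta(days=i) for i in range((end - start).days + 1)]
    return days, [counts.get(d, 0) for d in days]


def moving_average(values, window=3):
    """Trailing moving average; early points average what is available."""
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed
```

Plain substring counting like this will also match inflected forms that share the same stem, which is handy for Russian and Ukrainian but can overcount for short terms.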

In order to understand a conflict, you need to understand the way each side sees the conflict. That’s obvious: we can only make sense of people’s behaviour if we have a sense of their beliefs and goals. It’s quite hard to see how the Russian state sees the war, especially for us in the West. Partly this is owing to the language barrier, but of late even Russophones can’t readily access Perviy Kanal on YouTube or Russia Today. While some journalists do a good service in giving clips and such, this is information that wants to be free. That freedom might lead to the sort of impressive results we see in the OSINT community.

Telegram, with its neat archiving facilities (briefly mentioned in the previous post), helps with that. But only some. The problem is that once you get what you want, namely the other’s perspective, you’re left with the problem of what to do with it. The Russian sites still available post thousands of stories a day; news happens constantly, and it’s hard to keep track of, especially if you have an actual job that isn’t keeping track of it.

But this limitation in fact opens up possibilities. If you can’t manually comb through everything, you can maybe have a computer do it. And then, once you have a computer do it, you open up the possibility of discovering patterns in propaganda that might help understand and predict things.

Here’s the sort of thing I mean, and the sort of problem I’m interested in. The news agencies I consider mostly report on the same things, or at least there’s a lot of overlap in what they cover. If Macron makes a speech, both agencies will cover it. That fact (that there’s at least some shared ground) makes one think that the occurrences of (the Russian and Ukrainian versions of) ‘Macron’ should rise and fall together over time in the two sources. They should be correlated.
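One simple way to put a number on "correlated" is the Pearson correlation coefficient between the two agencies' daily counts for a term. The sketch below is only an illustration under that assumption; the function name is mine, and it expects two equal-length lists of daily counts covering the same days.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length series of daily counts.

    Returns None when either series is constant, since the coefficient
    is undefined in that case.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / sqrt(var_x * var_y)
```

A value near 1 would mean the two sources mention the term on the same days in similar proportions; a value near 0 would suggest the coverage is unrelated. Daily news counts are noisy and bursty, so this is a starting point rather than a verdict.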

But that won’t generalize. Ukrainians will use ‘DNR’ (Donetsk People’s Republic) less often than the Russians because they don’t recognize it. Russians, obviously enough, won’t talk about ‘occupiers’ (окупанти), a common term in Ukrainian media. Russians will downplay Bucha, Ukrainians the strong ruble. Noticing patterns like this, we can attempt to blend them into a theory of how the different media report on the war. There will be some shared topics, and some divergence owing to different perspectives. Developing a theory of that would be very useful, I reckon, as a sort of qualitative theory of political rhetoric, and that’s personally what I hope to do with this. If anyone knows the sort of theories or tools that would help in this endeavour, let me know.
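To make that sort of divergence measurable, one option is to compare how often a term appears per thousand words in each source, so that an agency that simply posts more doesn't look more interested in everything. This is a minimal sketch under my own assumptions: whitespace tokenization, substring matching to catch inflected forms, and a hypothetical function name.

```python
def rate_per_thousand(texts, term):
    """Occurrences of `term` per 1,000 whitespace-separated words.

    A word counts as a hit if it contains `term` as a substring, which
    is a crude way of catching inflected forms.
    """
    needle = term.lower()
    total_words = 0
    hits = 0
    for text in texts:
        words = text.lower().split()
        total_words += len(words)
        hits += sum(1 for word in words if needle in word)
    if total_words == 0:
        return 0.0
    return 1000 * hits / total_words
```

Computing this rate for the same term in each agency's output, and then looking at the ratio of the two rates, gives a rough picture of which terms one side favours and the other avoids.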
