So, on Wednesday I went along to a training session put on by our excellent library team as part of their series on ways for academic staff to raise their research profiles. This was the only one of the four I attended, partly because of time and partly because I’m probably a bit beyond the 101 seminar on how to use social media at this stage (she types optimistically). But bibliometrics are one of those things that turn up frequently in the pages of the Times Higher Education: hands are wrung in despair over what role they’ll play in the approaching REF assessment, and they are derided as statistically useless in one breath and praised as the future of research strategy in the next. It was about time that I actually found out what they are and how they work. I should give massive credit for what follows to our stellar library team, in particular Linda Norbury for all the work she put into pulling this workshop together.
Bibliometrics and Classicists
The major question for me, and for some of you reading this, was whether bibliometrics are one of those things that we as classicists have to care about. Some REF panels have decided to use bibliometric data (albeit sparingly) in their assessments this time around, which obviously raises the spectre of this becoming standard practice. Our REF panel is not one of them, and unless the tools available pick up significantly, it’s not going to be – at the moment, we are peculiarly poorly served by the major services which offer this sort of thing. They’ve got good coverage for the sciences; they’ve got good coverage for the social sciences; but the humanities are nowhere.
In some ways, this might be enough for you to throw up your hands, declare that there’s no point bending over backwards to learn about another science-generated form of measurement imposed on the discipline, and request that bibliometrics hie themselves to a nunnery. It’s tempting. Unfortunately, the funding landscape is starting to get a bit keen on this sort of data – and knowing why we don’t have it available is perhaps as useful in applications as being able to provide it, particularly for cross-disciplinary schemes. It’s a little frustrating to try out this stuff and realise that ‘your field’ isn’t being looked after properly, but being familiar with the principles now will mean that when the providers do eventually catch up, we’ll be ahead of the game.
If the throwing up your hands option still appeals, you can stop reading now.
What can bibliometrics tell you?
Bibliometrics can tell you two things – the impact rating of a journal, and the h-index of an individual researcher. Well, they can tell you more than that, but those are the two things that they’re most commonly used for.
Where can you find bibliometrics?
There are three major services for finding this data. The first is Thomson Reuters’ Web of Science, which is a subscription service; the second is SCImago, a free service that runs off the data in the subscription service Scopus; and the third is Publish or Perish, another free service, which works on the data stored in Google Scholar. As you may have noticed, each of these platforms is completely independent of the others; the underlying calculations that each platform makes are based on its own data, so it is quite possible to come up with different results for how much impact a journal has, or how many citations an individual article has received, because of differences in the underlying databases. I should note that the workshop didn’t really look at Publish or Perish, and spent more time on the Web of Science than on SCImago; I also don’t intend to give a guide on how to use each platform here. If you want to learn how to fiddle with them, there are some good Creative Commons resources available from the MyRI project in Ireland, which include online tutorials and other training materials.
Journals and the Impact Factor
The impact rating of a journal does not refer to impact in the REF sense of socio-economic impact outside the scholarly community, but to the impact of the journal on other scholars in the field – namely, how often articles from this journal are cited elsewhere. We’ve all heard the horror stories about colleagues being told that they have to publish in high impact journals (sometimes even which specific journals), but there is a sort of logic behind the madness. It boils down to the assumption that if someone publishes in the Journal of Scholarly Awesomeness, then the people who are doing the internal quality assessment of their REF submission can assume that the journal’s peer review process will have made sure that the article has a good chance of being rated three star or above. If, however, someone publishes in the Journal of Tea and Kittens, then the internal assessors will actually have to read the article and make a qualitative judgement themselves, which may be both time-consuming and inaccurate if it’s not their subject area. The rule of thumb also only goes in one direction – if something is published in a high impact journal then there’s a good case that it will demonstrate excellence, but that doesn’t mean that something published in a lower impact journal won’t demonstrate excellence.
The impact factor is calculated by taking the number of times that articles a journal published over the previous two years were cited in the current year by articles in the database, and dividing that by the number of articles the journal published over those two years. Humanities people will already be groaning, because for us it takes far more than two years for an article to get cited, not least because of the long lead-in times to our publications. However, in the sciences the turnaround is much faster, so we looked at examples with impact factors of 13 and 14. The important thing, of course, is to compare like with like, because different citation habits in different fields also affect citation rates – apparently in economics it’s usual practice to cite very sparingly, whereas in some of the scientific disciplines much more citing goes on, thus generating higher citation numbers and higher impact factors. However, once you’ve taken that into account, you can look for the journal with the highest impact factor in your field and go from there.
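Spelled out as a formula, the standard two-year impact factor works out to something like this (a generic sketch rather than any one database’s exact recipe):

\[
\mathrm{IF}_{Y} \;=\; \frac{C_{Y}}{N_{Y-1} + N_{Y-2}}
\]

where \(C_{Y}\) is the number of citations received in year \(Y\) by articles the journal published in years \(Y-1\) and \(Y-2\), and \(N_{Y-1}\) and \(N_{Y-2}\) are the numbers of articles it published in each of those years. So, to invent some numbers, a journal that published 50 articles across 2011–12 and picked up 100 citations to them in 2013 would have a 2013 impact factor of 2.0.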
SCImago has a slight variant on this; as well as offering the standard ‘add up everything and divide it’ option, it also offers something called the SCImago Journal Rank indicator (SJR), which takes into account the prestige of the citing journals – so, if a journal gets citations from other high impact journals, it will score a higher SJR than a journal with the same number of citations from lower impact journals.
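For the curious, the prestige-weighting idea is essentially the same trick as Google’s PageRank. The real SJR algorithm is considerably more elaborate than this, but here’s a toy sketch in Python of how a citation from a prestigious journal can be made to count for more than one from an obscure journal – the journal names, citation counts and damping parameter are all invented for the example:

```python
# Toy illustration of prestige-weighted citation scoring, loosely in the
# spirit of SCImago's SJR. The real algorithm is more elaborate; this just
# shows the PageRank-like idea that a citation from a prestigious journal
# counts for more. All names and numbers below are made up.

# citations[a][b] = number of times journal a cites journal b
citations = {
    "Journal of Scholarly Awesomeness": {"Journal of Tea and Kittens": 10,
                                         "Classical Quarterly-ish": 30},
    "Journal of Tea and Kittens": {"Journal of Scholarly Awesomeness": 5},
    "Classical Quarterly-ish": {"Journal of Scholarly Awesomeness": 20,
                                "Journal of Tea and Kittens": 5},
}

journals = list(citations)
prestige = {j: 1.0 / len(journals) for j in journals}  # start everyone equal
damping = 0.85  # standard PageRank-style damping, chosen arbitrarily here

for _ in range(50):  # iterate until the scores settle down
    new = {}
    for j in journals:
        # Prestige flowing in from each citing journal, weighted by the
        # share of that journal's outgoing citations which point at j.
        inflow = sum(
            prestige[citing] * cites.get(j, 0) / sum(cites.values())
            for citing, cites in citations.items()
            if sum(cites.values()) > 0
        )
        new[j] = (1 - damping) / len(journals) + damping * inflow
    prestige = new

for j, score in sorted(prestige.items(), key=lambda kv: -kv[1]):
    print(f"{j}: {score:.3f}")
```

The design point is simply that prestige flows along citations: a journal cited by already-prestigious journals ends up with a higher score than one with the same raw citation count from obscure sources.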
For classicists, however, this is all moot, because the Web of Science’s Journal Citation Reports system doesn’t cover our journals. They have a drop-down list of science journals, and a drop-down list of social science journals, but the humanities are conspicuous by their absence. I’m sure that if you cared enough you could farm data out of Google Scholar to pull together your own versions of impact factors, but it would be a very long and extremely boring job, and the data would be questionable. If you must look for the equivalent of an impact factor, the best alternative is the ERIH (European Reference Index for the Humanities) rankings, which do cover classics journals. They rank each journal as International 1, International 2, National or unranked, so you can get a sense of which band each journal sits in, albeit without statistical data to back up the judgement. It’s also worth saying that the ERIH rankings are much more subjective and rather more controversial, but they’re the best we’ve got available for the time being.
You and your h-index
The other thing you can use bibliometrics for is to calculate an individual researcher’s h-index. This isn’t a raw citation count: you have an h-index of h if h of your publications have each been cited at least h times by other people. The h comes from Hirsch, the chap who came up with the calculation – apparently there’s a whole bucket of maths behind this, but let’s not go there. Web of Science has a particularly straightforward way of doing this that does include humanities publications, but picks up some slightly odd things (articles in the TLS, for example). The good thing about the h-index is that it can only go up, never down; obviously that puts academics who’ve been around longer at an advantage, but it also means junior people have a ready-made excuse for a lower number.
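To make that definition concrete, here’s a minimal sketch of the calculation in Python – the citation counts are invented for the example:

```python
# A minimal sketch of the h-index: the largest h such that h of your papers
# have each been cited at least h times. Citation counts here are invented;
# in practice each service computes this from its own database, which is why
# the same person can end up with several different h-indexes.

def h_index(citation_counts):
    """Return the h-index for a list of per-paper citation counts."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # this paper has at least as many citations as its rank
        else:
            break
    return h

print(h_index([12, 7, 5, 3, 1]))  # -> 3: three papers with >= 3 citations each
print(h_index([0, 0]))            # -> 0 (my current score, alas)
```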
The samples that the workshop used, via a colleague in Chemistry, generated an h-index of 14, which was pretty scary given that my h-index is currently a whopping great zero. However, when I experimented with other classicists, the best h-index I could find was 5, with 4s and 1s being much more common among the other people I tried. Again, this makes sense in a humanities field: it takes a lot longer for our publications to generate citations, and my first article only appeared in 2012 – plus the h-index databases only look at articles, not at books, where we might expect our work to garner a fair number of citations. The fact that each of these services runs off its own data means that you should always say which service you’re taking your h-index from.
I certainly feel a lot better for knowing what bibliometrics are, how they work and how to generate them. But I come back to the same problem – they just aren’t reliable enough for Classics. I think I’m going to settle down with the Publish or Perish software at some point over the summer and explore that a little bit, but I don’t think it’s fundamentally going to change my view of things. Part of that is because of being comparatively junior and thus not having the glowing reward of finding that somebody has managed to cite my stuff (although it will be nice to know that I can look again in a couple of years, hopefully with a more positive result!). But part of it reflects the fact that this sort of number-crunching hasn’t yet come to define our lives in the humanities – and I sincerely hope that as long as our publication patterns remain in their current shape, it won’t.