Fifty percent of Bovine TB due to badgers? A spurious statistic and how it was created

A much-quoted statistic about Bovine TB (bTB) transmission is that ‘it has been estimated that 50% of bTB incidents could be attributed to infected badgers’.

This 50% figure has appeared in House of Commons debates on the subject, both in main debates and in committees, as well as in various DEFRA publications. As a simply understood and memorable figure amongst a welter of quite complex statistics, it now forms one of the main planks in the pro-cull argument. The fact that it was produced by an eminent statistician goes to strengthen it further.

But in fact the figure is based on fundamentally mis-applied statistics, and has arisen from a process of ‘sexing-up’ figures derived from a very thin set of data. I use that term as it reminds me of the 45-minute figure in the Iraq debate: spurious, simple to take on board, and crucial in convincing Parliament.

And so before the fifty percent figure goes any further in its potentially destructive journey, I would like to outline how it was arrived at, why it can be regarded as spurious, and how the statistics surrounding it have been ‘sexed-up’.

Derivation

The figure was originally published in a paper that was attempting to derive various figures from the Randomised Badger Culling Trial (RBCT) triplet data (Donnelly & Hone 2010). In this case the derivation uses the observed TB incidence in each triplet in the year before proactive culling began, and compares it to a prediction of the overall incidence of TB if no infection came from badgers.

The latter figure was projected from a mathematical model derived from a comparison of TB in the cull areas and the control areas. The same paper also outlined a model to take into account the prevalence of TB infection in the badgers as the culls began.

While the modelling is explained in some detail, there is one calculation that did not appear in the paper: that of the final fifty percent ‘estimate’. This was sent subsequently to DEFRA in a letter in January 2012 (link at end of post) from just one of the authors, Donnelly, and reveals not only how spurious this value is, but how the statistics have oddly been presented in a way which hides its imprecision: a bit of a surprise considering its author is a Professor of Statistical Epidemiology.

A Spurious Estimate

First let’s look at the data used, and the explanation of the estimate. This was produced in a table:

Trial area	A	B	C	D	E	F	G	H	I	J
Observed	0.113	0.099	0.076	0.113	0.034	0.029	0.029	0.175	0.150	0.070
Fitted	0.087	0.062	0.041	0.122	0.059	0.051	0.068	0.068	0.126	0.087
Estimated proportion attributed to infectious badgers*	60.6%	44.2%	16.8%	71.7%	41.2%	32.7%	49.0%	49.0%	72.7%	60.6%

The ‘Observed’ row is the per-herd bTB incidence in the year before the cull. The ‘Fitted’ row is the figure produced by the model as described in the 2010 paper, and is the figure used to produce the estimate.

The Estimated Proportion uses ‘the prediction, based on the same model, for each area had there been no badger-to-cattle transmission within the area (0.034)’. This is a direct quote from the letter. So for A it is simply (0.087 − 0.034)/0.087 as a percentage: 60.6% in the letter’s table (the rounded values shown here give about 60.9%). Simple!
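As a rough check, the calculation can be sketched in a few lines of Python. This assumes the rounded ‘Fitted’ values from the table and the 0.034 baseline; the name `badger_share` is only an illustrative label, not something taken from the letter or the paper. Because the inputs are rounded to three decimal places, the output will not match the letter’s percentages to the last digit, but the mean across the ten areas still comes out close to 50%.

```python
# Rounded 'Fitted' incidences from the table above (trial areas A to J).
FITTED = {
    "A": 0.087, "B": 0.062, "C": 0.041, "D": 0.122, "E": 0.059,
    "F": 0.051, "G": 0.068, "H": 0.068, "I": 0.126, "J": 0.087,
}

# Predicted incidence with no badger-to-cattle transmission (2010 abstract).
BASELINE = 0.034


def badger_share(fitted, baseline):
    """Fraction of the fitted incidence that lies above the no-badger baseline."""
    return (fitted - baseline) / fitted


shares = {area: badger_share(f, BASELINE) for area, f in FITTED.items()}
mean_share = sum(shares.values()) / len(shares)

for area, share in shares.items():
    print(f"{area}: {share:.1%}")
print(f"Mean across areas: {mean_share:.1%}")
```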

But where did the 0.034 figure come from? That isn’t stated in the 2012 letter, but if you turn to the original 2010 paper (and I wonder how many MPs have done this?) you can find it in the abstract. Quote: ‘Based on the model best fitting all the data, 3.4% of herds (95% CI: 0–6.7%) would be expected to have TB infection newly detected (i.e. to experience a TB herd breakdown) in a year, in the absence of transmission from badgers.’

It is the qualification of the figure, ‘(95% CI: 0–6.7%)’, that I want to draw attention to. It means, roughly, that the model gives 95% confidence that the true figure lies somewhere between 0 and 6.7%. Normal statistical practice, then, would be to use the two extremes to derive an estimated proportion from this prediction, so that you can state that it is very likely that the estimate lies between those two percentages.

Let’s do just that. Firstly, the zero: this means that 100% of bTB being due to badgers cannot be ruled out. So the estimated proportion in every case would be 100%.

Next, the 6.7%. Use this figure instead of the 3.4% and the table looks very different:

Trial area	A	B	C	D	E	F	G	H	I	J
Observed	0.113	0.099	0.076	0.113	0.034	0.029	0.029	0.175	0.150	0.070
Fitted	0.087	0.062	0.041	0.122	0.059	0.051	0.068	0.068	0.126	0.087
Estimated proportion attributed to infectious badgers*	23.0%	0%	0%	45.1%	0%	0%	1.5%	1.5%	46.8%	23.0%

The mean is now 14.1%. So the estimate has, statistically, a good chance of being anywhere between 14.1% and 100%!
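The endpoint substitution can be reproduced with a short Python sketch. It assumes the rounded ‘Fitted’ values from the table, floors negative proportions at zero (as the second table does), and uses the hypothetical helper names `badger_shares` and `mean_badger_share`. It only re-runs the same arithmetic at each end of the published interval; it is not a formal confidence interval for the mean.

```python
# Rounded 'Fitted' incidences for trial areas A to J, from the table.
FITTED = [0.087, 0.062, 0.041, 0.122, 0.059, 0.051, 0.068, 0.068, 0.126, 0.087]


def badger_shares(baseline):
    """Per-area share attributed to badgers, floored at 0 when the
    baseline exceeds the fitted incidence."""
    return [max(0.0, (f - baseline) / f) for f in FITTED]


def mean_badger_share(baseline):
    """Mean of the per-area shares for a given no-badger baseline."""
    shares = badger_shares(baseline)
    return sum(shares) / len(shares)


# Lower bound, central estimate and upper bound of the 95% CI (2010 abstract).
for label, baseline in [("lower", 0.0), ("central", 0.034), ("upper", 0.067)]:
    print(f"{label:8s} baseline={baseline:.3f} "
          f"mean share={mean_badger_share(baseline):.1%}")
```

A baseline of zero attributes everything to badgers, while the upper bound pulls the mean down to roughly 14%, which is the spread described above.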

If the ‘fitted’ figures were themselves calculated statistically, with confidence limits, there would be a further inaccuracy in the 50% figure, but one source of inaccuracy is perfectly good enough for this argument.

Sexing-up the stats

I am perfectly happy to have my method and maths taken apart, but, even with my errors, the point is that the 50% estimate was based on very little data. That is why the confidence limits are so wide apart.

The correct statistical conclusion, one that I hope, say, an A-level student would make, is that there is too little data to make any firm pronouncement. Although it looks as though SOME bTB is transmitted by badgers, the proportion cannot realistically be estimated from this dataset.

However, this does not suit DEFRA’s agenda of justifying a badger cull. How tempting is it, therefore, to, ahem, ‘forget’ to mention those wide confidence limits in a non-peer-reviewed letter presented to MPs? They can always be mentioned at a later date, if queried. And this, in fact, is what has happened in one of the subsequent committee minutes. But the main, firm-sounding, unqualified, sexed-up-by-omission figure of 50% transmission by badgers is what remains in the mind, and indeed still appears in unqualified form in publications like DEFRA’s consultation document, where they again present their pro-cull stance.

But the spurious accuracy goes further. Not only is the 0.034 predicted incidence figure presented without confidence limits in the letter, but it is re-stated as 0.03447. A note states that ‘I have provided further decimal places here to make the calculation clear.’

But don’t those extra decimal places make the figure look even more precise? What stunning accuracy the predictive model must be able to achieve! No mention that the confidence limits are over six thousand times as wide as the accuracy implied in that final significant figure. If ever a statistical figure had come out ‘sexed-up’ to show spurious validity, this is it.
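The ‘six thousand times’ comparison is simple arithmetic, sketched here in Python. It assumes that the precision implied by 0.03447 is one unit in its fifth decimal place; the variable names are illustrative only.

```python
# 95% confidence limits quoted in the 2010 abstract (as proportions).
ci_lower, ci_upper = 0.0, 0.067

# Assumed precision implied by quoting 0.03447: one unit in the
# fifth decimal place.
implied_precision = 1e-5

ci_width = ci_upper - ci_lower
ratio = ci_width / implied_precision

print(f"CI width: {ci_width:.3f}")
print(f"CI width is about {round(ratio)} times the implied precision")
```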

These two examples could be schoolboy-level mistakes in statistics: one might expect them from a naive but enthusiastic A-level student, and one would put red ink all over them. But coming from a Professor of Statistical Epidemiology, it seems somewhat unlikely that they are mere inadvertent errors.

Either way, in my view Professor Donnelly should be professionally ashamed of herself for allowing her statistics to produce a spurious result like this: a figure that is hardly better than guesswork, but that has been subsequently misused to great effect by the pro-cull lobby, and could well be seen as misleading Parliament.

Jamie McMillan

Briantspuddle, Dorset

16 September 2013

Christl A. Donnelly & Jim Hone (2010): Is There an Association between Levels of Bovine Tuberculosis in Cattle Herds and Badgers? (Statistical Communications in Infectious Diseases, Volume 2, Issue 1)

Donnelly (2012): Letter to DEFRA, 6 Jan 2012 9974_LettertoDefraregardingestimateoftheproportionduetobadgers-Jan2012

UPDATE 20 September

Here is a classic example of the use of the 50% pseudo-statistic to support culling:

http://www.tbfreeengland.co.uk/latest-news/tb-free-video-the-vets-perspective/

It was posted in one of DEFRA’s tweets which also quoted 50% as cast-iron.

Now it has been discredited, the pro-cull lobby and DEFRA will of course fight to rescue it, and it appears that the maths is being ground through again in an effort to narrow the confidence interval down from its near-random width of 86 percentage points.

24 thoughts on “Fifty percent of Bovine TB due to badgers? A spurious statistic and how it was created”

David Powell September 17, 2013 at 4:07 pm

Reblogged this on Causing offence and commented:
A further undermining of the case for a badger cull. How many cattle are infected with bTB by badgers? 50% say the government. And how do they arrive at that figure? Ah…

Jane Brown September 17, 2013 at 8:38 pm

Most of Donnelly’s papers (and few are peer reviewed) are rife with phrases like ‘it is believed’, ‘it is assumed’, ‘it is predicted’. Certainly such phrases have little place in ‘factual reference material’. Amazingly, if you take the time to read the references at the end of most of her documents, you will be surprised to see the same names and references noted again and again. Strangely, it appears that there is a small but vocal group of government ‘scientists’ whose ‘published research’ continuously feeds one another in a spiral of misinformation, misdirection and assumption. Sadly, I can only surmise that, in some manner, they are personally profiting from the perpetuation of this misinformation, and wildlife, as well as farmers, must suffer for it.

1. furtlefinch Post author September 17, 2013 at 9:35 pm
  
  Thank you. I am hoping that a demolition of this one projection will prompt others to take a closer look at the validity of the modelling in the 2010 paper.
  
Jane Brown September 17, 2013 at 9:59 pm

Sadly, I wonder if it will make any difference. You already have a myriad of independent scientists, vets, wildlife experts, biologists, etc., as well as online petitions of over 300,000 people who are advising that this entire process is a sham. Who is listening? No one. Instead the government and their associated ‘scientists’ continue on with this rhetoric. Frankly, what I cannot understand is why so few farmers, who are directly affected, appear able to read the published material themselves, and to understand not only that this is an ineffective and costly misdirection of resources, but that they are being undermined and isolated because of it. They should be demanding legislation for cattle vaccination immediately, as well as understanding the effects of bio-security and their responsibility to it, rather than accepting such a crock of rhetorical mumbo jumbo. Ireland is quietly continuing a similar process wherein the annihilation of this protected species is the goal. The Irish governmental ‘scientists’ are part of this Donnelly group – once again their non-peer reviewed publications are referenced in each of their respective reports. In the early 1990s, prior to the results of any trials, James O’Keeffe, an Irish Department of Agriculture vet, published a report condemning the badger as being the main transmitter of bTB, and this insulated group has manipulated facts, statistics and trials for the past 20 years in an effort to support that pre-conceived claim. This is propaganda and ‘science’ at its worst.

1. furtlefinch Post author September 20, 2013 at 12:35 am
  
  Absolutely right – science at its worst. As Donnelly is the only expert being quoted by DEFRA in support of the cull, in fact with the justification for the cull resting entirely on her shoulders, I am hoping that her rather weak-looking series of analyses will be scrutinised more carefully. This sham estimate was pretty easy to spot if you had the two separate documents. Maybe there are some Irish scientists outside your governmental group who could demolish more of her maths.
  
2. furtlefinch Post author October 3, 2013 at 10:17 am
  
  Jane, have you contact details or refs to papers of Eamonn Gormley re bTB in badgers in Ireland? Interesting results in talk to vaccination conference in London which I couldn’t get to. Thanks.
  BTW can’t get raw data apparently unless attached to research team, which I’m not!
  
  1. J Brown October 3, 2013 at 4:43 pm
    
    No, sadly, I don’t. I was corresponding with the UCD chief (at least in 2011-2012), James O’Keeffe, who was a main supporter of badger involvement in bTB, publishing a myriad of papers, few peer reviewed, indicating this involvement. My experience with UCD was that any vet who did not support this platform was summarily ousted from this small elite group. Torgerson comes to mind……
  2. J Brown October 3, 2013 at 7:53 pm
    
    Regarding the Gormley paper, I have asked around and was sent a one page doc (spewing the 50% incidence, by the way) from someone who had it on file – that’s all I can get a hold of. By the way, that person also suggested that you might want to contact Badgergate (http://www.badgergate.org.uk/) which they believe is run by two zoologists associated with the ZSL.
    The Gormley doc can be found here:
    http://www.agriculture.gov.ie/media/migration/animalhealthwelfare/diseasecontrols/tuberculosistbandbrucellosis/eradicationschemes/bovinetberadicationschemeconference/vaccinationofbadgersthestorysofarlcorner/l_corner.doc
  3. furtlefinch Post author October 3, 2013 at 10:53 pm
    
    Many thanks. Will catch up with this at the weekend.
Claire Mellish September 18, 2013 at 4:46 pm

I am amazed that the confidence intervals weren’t associated with the figures presented in the letter to Defra – not a responsible thing to do by any scientist.
I was completely unconvinced by the actual paper as well. The model assumed that cows did not re-infect badgers…
So the 50% is based on not much, really. Given cows are the main source and tend to hang around together, a 50% infection rate from badgers is pretty ridiculous. I’d like to see the authors try their ‘bend it like Beckham’ model fitting on cows and their transmission rates…

1. furtlefinch Post author September 20, 2013 at 12:49 am
  
  Thanks for the comment. The model/s may be biased, but for me it is the lack of data that is striking all the way through. Curves are projected from just six points. A few observations after the RBCT had finished are reprocessed into a whole raft of conclusions about perturbation, the effect of cull area, cull percentage and a projected bTB reduction of the much-quoted 16% (again usually with 95% confidence limits down to 7% missing). I was hoping some of the original ISG might publish a critique of the cherry-picked data and bent modelling.
  
furtlefinch Post author September 20, 2013 at 12:10 pm

I understand from Rosie Woodroffe on @ZSL twitterfeed that Prof Donnelly is as we speak bashing away at her dodgy estimate, no doubt to try and narrow down the confidence limits on it. Which I think tells us that directed, detailed criticism of bad science can strike home, and I hope others will pitch in in the same vein.

1. J Brown September 21, 2013 at 11:52 am
  
  Prof Donnelly has been publishing self-serving ‘science’ for quite some time, condoned by the insulated group of which she is a part and supported by misleading statistical analysis. It appears to be a lifetime calling. I wonder if there is some statute of limitations, beyond which her assumptions, and perhaps even her reputation, are finally totally discredited. Such sloppy analysis, bordering on propaganda, should not be allowed, or supported, by any scientific community.
  
Jennifer D. Parker September 26, 2013 at 5:47 pm

I was a teacher of research design and statistical methods at undergraduate and postgraduate levels, so I get the whole stats thing. It seems to me that this is an estimate of ‘best fit’ which means that under perfect conditions they can account for 50% of Bovine TB cases. Normally, the actual numbers used in this type of analysis should easily be transformed into actual observations of Bovine TB within each area, and THEN the best fits (or expected frequencies) are calculated, based on what has actually been observed (in this case the number of diagnosed cases of BTB). Either I am losing my touch, or this analysis has been done the wrong way round.

Jamie, if you can get the actual raw frequencies, I can re-calculate the stats and tell you the likelihood that Bovine TB and badgers are statistically associated and the level of probability that the results occurred by chance. It is also possible, and much more convincing, to talk of effect sizes. OK: 0.3 = weak, 0.5 = moderate and 0.8 is a strong effect size. So if this data was available, and we could re-analyse it, there would be a possibility of talking about the power of the effect size. If it is less than 0.5 it isn’t convincing. Imagine Owen Paterson having to say: “the association between badgers and bovine TB shows a weak effect size that accounts for an estimate of between 10% and 20% of cases.”

Another point I would like to make is that when they say they can account for 50% of the problem, they are omitting the other 50% of extraneous variables that aren’t accounted for in their analysis. They are essentially saying that there are many other ‘variables’ that could be equally to blame for BTB (e.g. cattle movement; any other animals that are in the area, rats for example, or dogs; anyone who travels to markets and then comes back to the farm, etc, etc, etc). The list is potentially endless and each farm, unless they have controlled for this, is in an ‘unclean environment’.

Finally, it seems very convenient to me that they blame 1 in 2 cases of BTB on badgers. In reality, stats are rarely this explanatory. I would also have to ask: Were they seeking to explain or to predict BTB? The difference between these two statistical possibilities is massive. One states cause (i.e. badgers cause TB), the other means that it is only PROBABLE that BTB is caused by badgers. If the study was probabilistic then badgers may be nothing to do with BTB! If there is any way to test another area, for a given period of time, with more controls in place and with actual badger TB also measured, then you could potentially blow this stuff out of the water. Is there any way someone could get funding to do a PROSPECTIVE study? Perhaps ask an animal charity? It looks to me like this is all ‘ex-post facto’ and therefore not valid, or controlled, analysis. Any data that was collected may be violated by any of the factors I have mentioned and others that I haven’t. It is worth noting that statistical tests are very sensitive to poor research design.

This researcher has an ethical responsibility to analyse data in a moral, and honest, way. If he isn’t doing that, he can be reported to whatever body he is a member of. It would be entirely reasonable to ask him to explain his findings and to let other scientists see his method, data, and the protocol it was subjected to. If he is unable to do this convincingly, he may be answerable to whoever oversees his work; possibly the organisation he works for? If asked, he should be able to provide anyone with this information and he should do so willingly. I have all my data kept and in order, just in case someone asks to see it. As a Scientist myself this is good etiquette.

I hope I haven’t wittered on too much? I am a girl, but I am a geek too.

1. furtlefinch Post author September 26, 2013 at 6:57 pm
  
  Jennifer – Extremely useful. Many thanks. I will see what else I can dig up and try and take in what you are saying. I am working on the famous 16% prediction at the moment and can almost certainly (fairly certainly if I am talking to a statistician) demolish that as well. But I am working slowly and fitfully at the moment.
  
donald rowe September 27, 2013 at 10:58 am

Good stats / bad stats will not get a proper outcome ever. Badgers and cattle get BTB from the same source, never mind the stats. The answer lies in the soil, it’s the pH that is causing most of the problems.

D

1. furtlefinch Post author September 27, 2013 at 10:44 pm
  
  Donald, thanks. You may be right at least for one factor. My point is that DEFRA is justifying the cull entirely from bad statistics and calling it science-based.
  
furtlefinch Post author October 3, 2013 at 10:46 am

Update, 3 October. I notice that Professor Donnelly has now deleted her twitter account. It appears that I have upset her with this criticism. I was hoping for a dialogue: either a rebuttal, or at least a defence of her figures. Alas, this is not to be.

1. J Brown October 3, 2013 at 4:40 pm
  
  Is this a surprise? Donnelly, as well as most of the scientists in her ‘corner’, does not want her research questioned – basically because it does not stand up to any analysis whatsoever. Far easier to delete an account than to admit that perhaps she has been subsidized in order to arrive at a requested, pre-conceived ‘conclusion’.
  