Sunday, February 07, 2016

Reproducibility and corrections in the scientific literature

Illustration by David Parkins from Allison et al
An excellent, though somewhat depressing, review article by Allison et al in Nature, "Reproducibility: A tragedy of errors", describes how difficult it is to get papers that are full of errors corrected or retracted.

They did manage, after some months, to get one paper retracted when they found that its analysis applied a mathematical model that overestimated effects by more than tenfold, but they have found it very difficult or impossible to secure retractions in over 25 other cases. They find that:
  • Editors are often unable or reluctant to take speedy and appropriate action
  • Where to send expressions of concern is unclear
  • Journals that acknowledged invalidating errors were reluctant to issue retractions
  • Journals charge authors to correct others' mistakes
  • No standard mechanism exists to request raw data
  • Informal expressions of concern are overlooked
This is pretty worrying. It also matches my own limited experience, from when Science published an absurd review article by Thomas Piketty. I wrote to the editors to the effect that:
Although Piketty and his co-author Saez admit that the data on which they draw are highly uncertain, there are no error bars in any of their graphs and no discussion of statistical significance. Claims in the paper (e.g. that wealth concentration in Europe “has been rising since the 1970s-80s”) are “supported” by just 3 data points in Fig 2 differing by much less than 10%, so cannot be significant even at the 5% level. Furthermore, Piketty’s key thesis, that r > g, is “supported” by Fig 4, which shows the two most recent data points where g > r and then two made-up points depending entirely on assumptions.

It now emerges that the data on which Piketty claims to have based his conclusions were not “official data and tax surveys” but data from these surveys to which manual adjustments were made to support his conclusions (Chris Giles, FT, http://blogs.ft.com/money-supply/2014/05/23/data-problems-with-capital-in-the-21st-century/) and, for the UK, using an official statistic specifically stated to be “not a suitable source” for this and declining to use the data collected for this purpose. If the correct figures and official statistics are used a significantly different picture of wealth distribution in the UK emerges.

Piketty responded to these criticisms admitting that his choice of data is questionable though defending his decisions since the official data “does not look particularly plausible”.  It may well be that the official data give estimates which are too low, and for a popular book that may be OK. But can we allow a paper to stand in Science where the data have been massaged thus? People assume that a Science paper meets the highest scientific standards. Any other paper with such problems would be retracted. Does the author of a fashionable best-seller get special treatment?
But they replied only that they would take no action, and that I should leave a comment below the article. This bogus article has since been cited over 100 times.
