(In their own words…)
Timothy Vines and colleagues did a study on how the availability of data sets in zoology changes through time. They gathered 516 papers published between 1991 and 2011. And then they tried to track the data down.
Even tracking down the authors was a challenge, never mind the actual data. As the years went by, a dwindling minority of papers were accompanied by author email addresses that still functioned.
Vines’ luck with data was even worse. In the end, only 37% of the data were still findable and retrievable, even for papers from 2011. And the proportion dropped with each earlier year. By the time they got to papers published in 1991, only 7% of the data could be confirmed to still exist and be retrievable. By then, few authors could be found, and most of them were reporting that their data were lost or inaccessible.
Researchers who had the data had died or retired, or the research had been done five computers and two universities ago. Or the data were in software or hardware that no one could access any more. As the stories and reasons kept coming, we were all wincing and more or less freaked, partly in personal recognition of life as we all know it, and partly at seeing the collective enormity of this problem tabulated. Human research in areas requiring that data be kept might fare better, but who knows? Vines thinks years from now people will look back and think it was silly not to publish data at the same time as the article.