New strategies are needed to avoid losing valuable online content, writes Amanda Lawrence for Australian Policy Online.
Link rot, disappearing documents, vanishing websites and changing content are endemic to the internet, creating a digital black hole. This is having a particularly negative impact on public policy research and implementation, threatening our capacity to know even our recent past or to understand the context and background to many political and policy decisions.
In a recent study posted on arXiv.org, researchers looked at web links posted in response to major news stories on Twitter and calculated that within two and a half years an average of 27% of content was lost while 41% had been archived. This problem is not limited to blogs and other social media sites, or to news stories. It is also an issue for substantial and important research and policy documents produced by all levels of government, academic centres, NGOs, think tanks and professional organisations – in other words, grey literature.
For example, in 2008 the Australian Government held a major convention in Canberra attended by 1002 delegates who aimed to “help shape a long term strategy for the nation’s future”. The 2020 Summit was a powerful statement by a newly elected government that they were interested in ideas and the sharing of information – but what has happened to all that sharing? Where are all those ideas?
You can still find information on Wikipedia about the 2020 Summit but if you click on the link to australia2020.gov.au you won’t see anything better than an error message saying “Server not found”. What’s happened to our history? When did this website and all its content disappear from this address? Why was it allowed to just disappear?
At Australian Policy Online we noticed the problem recently and immediately went searching for the various reports and briefing papers that were produced as part of the summit. At present the Copyright Act 1968 restricts us from holding full text copies of these documents in our database for distribution to the public unless we seek permission from the rights holders first - a time-consuming and often fruitless endeavour, as we are discovering with our efforts to digitise older, printed policy documents. The Copyright Act is currently subject to a review by the Australian Law Reform Commission looking into whether exceptions in the Act are adequate for the digital economy. This restriction on collecting and preserving non-commercial public interest content is an example of its limitations.
The other option that enables the holding of full text copies is the use of a Creative Commons licence. Creative Commons was not used in the case of the 2020 Summit final report – the Australian Government had not yet agreed with Nick Gruen’s Gov 2.0 recommendations that public sector information be released under a Creative Commons BY licence (a recommendation still regularly ignored by government departments and reviews anyway). Had such a licence been used, Australian Policy Online could have held a full text copy of the report and the website for ongoing access and linking.
Fortunately the National Library of Australia collects and stores some websites in its Pandora archive and has its own archived version of the summit site, so the record is not completely lost and APO’s links have been restored. But that won’t make the thousands of references to australia2020.gov.au in publications, media sites and reports any less dead. Despite saving the day in this case, the NLA is struggling to manage the “digital deluge”. According to Pam Gatenby, Assistant Director-General of Collections Management at the National Library, “it is increasingly difficult for us to carry out our role effectively as appropriate systems and practices are not in place at the national level.”
This is just one example of the kinds of valuable resources disappearing from our browsers every day. Earlier this year the entire federal parliament website was redeveloped, changing all links to Senate reports and Parliamentary Library briefings and leaving thousands of dead links and missing content in its wake. But don’t think it’s just government documents or just a government issue.
The digital black hole of public policy ‘grey literature’ and public interest websites is just one problem facing researchers, policy makers and information professionals in the internet age. Other issues include how to select and evaluate grey literature and online content for collection and for use as evidence in public policy; how to improve the way documents are produced and managed outside of formal publishing systems; and what kinds of infrastructure, policies and legislation would make research and other valuable content more accessible and usable for all.
If you are interested in any of these questions please join us in Canberra for the Where is the evidence conference: policy, research and the rise of grey literature at the National Library of Australia on Wednesday 10 October 2012. This conference is part of a three-year ARC Linkage project looking at the value of grey literature for public policy and the strategies needed to improve how it is collected and accessed. We are at the beginning of our search for solutions to the many pressing issues of grey literature and informal publishing online and we hope that those with a stake in the results will join us to help find the best way forward.
To register go to eidos.org.au/v2/grey-literature
For more information on this research project and grey literature go to greylitstrategies.info
Amanda Lawrence is Research Manager, Grey Literature Strategies and Managing Editor at Australian Policy Online (apo.org.au), which is part of Swinburne University's Institute for Social Research.
Image: Flickr / bennylin0724