A national showcase and national study, 19 June 2015, Hyatt Hotel, Canberra, media welcome
Share your data and boost science productivity is the message of a national workshop in Canberra today.
Over 40 data collections will be released – covering everything from cloud measurement to pavements and roads, ancient DNA, oral histories of Western Sydney and the changing coastline.
A national study conducted earlier this year showed that sharing and reusing data generated by publicly-funded research activities could boost Australian research output by between $1.4 billion and $4.9 billion.
Amongst the data collections being released are:
- Lifestyles of mosquitoes – the Vector-Borne Disease Network Digital Library at James Cook University contains curated data, tagged citations, articles and reports on entomology, epidemiology, demography and interventions to support malaria eradication.
- Tracking floods with tweets – PetaJakarta.org Major Open Data Collection – Hydrological Infrastructure Network for Jakarta, Indonesia. This collection, made available by the University of Wollongong, combines sophisticated hydrological modelling with real-time tweets from citizens of Jakarta to provide up-to-date information that enables a better-informed response to floods.
- Curtin University has collaborated with the Geological Survey of Western Australia to publish data that has been scanned to deliver mineralogical and geochemical data that will enable new exploration and activity in Western Australia.
New discoveries could be lurking in this data. For example, a ‘ghost’ found in recycled data from The Dish at Parkes may have been the signal of a collapsing neutron star.
Professor John Houghton (Victoria Institute of Strategic Economic Studies) and Dr Nicholas Gruen (CEO of Lateral Economics) conducted the study into the value of sharing data. Using conservative estimates they found that only 10-20 per cent of data used in public research in Australia is currently available for reuse.
The report shows that a relatively small investment in data policy and infrastructure can lead to a significant increase in value to Australian innovation, research, and the broader economy.
“Research funders can realise both economic and scientific benefits from open research data, and there are a growing number of funders mandating open data,” the report says. “Research institutions benefit from the enhanced visibility…that open data brings and from a reduced risk of inadvertently playing host to scientific fraud.”
The main findings of the report are significant: total spending on public research is about $13 billion annually, about $9 billion of which comes from the Federal Government.
Data reuse can also lead to exciting discoveries, as PhD candidate Emily Petroff from Swinburne University knows.
In 2007, when American astronomers were trawling through shared data collected by the Parkes Observatory, they discovered ‘fast radio bursts.’ They are thought to signal events in our universe, potentially billions of light-years away, where extreme energy is given off. But these newly-discovered bursts had never been seen ‘live,’ until a team led by Emily recorded one on the telescope at Parkes. She also reused archived data to track down the cause of other unusual radio bursts. The data showed these bursts usually happened at mealtimes on work days, and came from a microwave oven.
For the Open Research Data Report, John and Nicholas analysed Australia’s Commonwealth-funded research and agencies, including research commissioned through consultancies and in higher education institutions.
For case studies and the full report, visit: http://ands.org.au/resource/open-research-data.html
- Ross Wilkinson, Australian National Data Service, +61 419 534 163, email@example.com
- Niall Byrne (for ANDS), Science in Public, +61 417 131 977, email@example.com
The true value of information is starting to be captured
In May 2014, a ghost that had haunted radio telescope data materialised for the first time. A fast, powerful burst of radio waves that had travelled billions of light-years through space was picked up “live” by the Parkes Radio Telescope, known as The Dish, in central New South Wales.
Lasting only milliseconds, “fast radio bursts” (FRBs) were first revealed in 2007 by American astronomers combing archival data from Parkes for something unrelated. CSIRO researchers and others have since found six more examples in data now aggregated into the Parkes Pulsar Data Archive, a project funded by the Australian National Data Service (ANDS).
Astrophysicists think FRBs may signal extreme events in our universe such as the collapse of a neutron star to form a black hole, says Swinburne University PhD student, Emily Petroff. She led the group which recorded the new burst, and coordinated a world-wide consortium of 12 earth and space telescopes to gather data on it.
“These emissions encode information about what they have travelled through to reach us, about all the matter between where they occurred and here,” she says. So there is huge interest in FRBs. Yet no-one would know about them if astronomers around the world did not have access to the radio data accumulated by The Dish.
This is just one example of the benefits of documenting and storing data, and subsequently making it accessible to others. Another is the VECNet database at James Cook University, which brings together a library of information on malaria and the habits and ecology of its carriers, the hundreds of species of Anopheles mosquitoes that transmit it. It provides a test-bed for any strategy medical researchers might have for curbing the spread of malaria in a specified area at any time of year.
Now, for the first time, economists Prof John Houghton of the Victoria Institute of Strategic Economic Studies (VISES) and Dr Nicholas Gruen, CEO of Lateral Economics, have put figures on the value to the Australian economy of opening up research data to make such projects possible. Their report, commissioned by ANDS and entitled Open Research Data, shows there is much to be gained for Australia by increasing access to and sharing data through investing in infrastructure and framing the right policies to encourage such activity.
The study set out to estimate the total amount spent publicly on research and the value of the data created and analysed by research. It also evaluated the benefits of curating and openly sharing public research data.
Overall the report shows that a relatively small investment in data policy and infrastructure can lead to a significant increase in value to Australian innovation, research, and the broader economy.
‘Research funders can realise both economic and scientific benefits from open research data, and there are a growing number of funders mandating open data,’ the report says. ‘Research institutions benefit from the enhanced visibility … that open data brings and from a reduced risk of inadvertently playing host to scientific fraud.’
The main findings of the report are significant: total spending on public research is about $13 billion annually, about $9 billion of which comes from the Federal Government. The authors estimate that the value of the use of this data in public research is between $1.9 billion and $6 billion a year, but perhaps no more than 10 to 20 per cent of the information generated is made publicly available.
The report also suggests that the installation and maintenance of data infrastructure can add another $1.4 billion to $4.9 billion a year to the Australian economy. This figure includes time saved by data centre users, which can be reinvested in further research, as well as returns from the reuse of the information by researchers who could neither create nor obtain the data themselves.
Opening access to data and recycling it in this way clearly entails costs in terms of hardware, software, communications and personnel. But these are counterbalanced by benefits which include reusing data for other purposes and in new ways; avoiding the need to spend time and money recollecting standard information; putting together in novel ways data on different topics and/or from different sources; providing source material for education and training; and being able to check the original data sources leading to research conclusions.
For instance, the launch and operation of NASA’s Hubble Space Telescope has cost at least US$10 billion, according to the Final Report of the Independent Panel reviewing its successor, the James Webb Space Telescope. But only about US$75 million has been spent over about 30 years on developing and operating an archive to make the data that Hubble has generated publicly accessible.
In 2012, about half the published scientific papers based on Hubble observations reused data from that archive—in other words, data that had not initially been gathered for the purpose to which it was put. So, in that year by means of the archive, Hubble’s scientific productivity was doubled for less than one per cent of the cost of providing and operating the telescope.
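The arithmetic behind the Hubble comparison can be checked with a short sketch. It uses only the rounded figures quoted above; the function names are illustrative and do not come from the report, and the multiplier rests on the simplifying assumption that each published paper counts equally towards productivity.

```python
# Rounded figures quoted in the text (US dollars); both are approximations.
HUBBLE_COST_USD = 10e9    # "at least US$10 billion" to launch and operate
ARCHIVE_COST_USD = 75e6   # "about US$75 million" spent on the data archive
ARCHIVE_PAPER_FRACTION = 0.5  # "about half" of 2012 papers reused archive data


def archive_cost_share(archive_cost: float, telescope_cost: float) -> float:
    """Archive cost as a percentage of the telescope's cost."""
    return 100.0 * archive_cost / telescope_cost


def productivity_multiplier(archive_fraction: float) -> float:
    """Total papers divided by papers that did not rely on the archive.

    Assumption for illustration: every paper counts equally.
    """
    return 1.0 / (1.0 - archive_fraction)


share = archive_cost_share(ARCHIVE_COST_USD, HUBBLE_COST_USD)
multiplier = productivity_multiplier(ARCHIVE_PAPER_FRACTION)

print(f"Archive cost as a share of telescope cost: {share:.2f}%")
print(f"Paper output relative to non-archive papers: {multiplier:.1f}x")
```

On these figures the archive cost comes to under one per cent of the telescope's cost, while the papers drawing on it roughly double the output, which is the comparison the paragraph above makes.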
There are Australian examples of this effect. In 2008 the Australian Bureau of Statistics (ABS) began allowing unfettered use of its data, rather than selling access. A cost-benefit analysis of the provision of this ABS data by Prof Houghton, one of the authors of the Open Research Data report, showed that, while the Bureau lost the revenue it had previously made from selling its data, there were substantial savings in no longer needing to maintain a “shopfront” for sales and related inquiries. And the benefits to the Australian economy arising from the ABS making data freely available were likely to have been more than five times the costs, Houghton suggested, because the nation put that information to wider use. If that analysis is correct, the Australian Government would have gained substantially more from the tax revenue earned on the additional economic output than it lost on lower fee income.
Nationally, Australia needs to collect information not only on its population for planning purposes, but also to assist activities such as agriculture, mining and the conservation of its unique environment. It does so through bodies like Geoscience Australia and the CSIRO. The report suggests that opening up this data would provide many benefits to counterbalance the necessary cost of collecting the information in the first place.
Computers and information technology now allow data to be analysed, stored, processed and combined in many novel ways that were unthinkable even a decade ago. This not only allows recycling, but also encourages solving problems by combining data from different sources, both old and new.
Urban planning is an area in which this is becoming particularly useful. One recent international example is an agreement between the City of Boston and the phone app taxi service, Uber. The company provides the city with anonymised data on the time, place and length of rides that can be blended with public transport data to generate information useful to transport planners and traffic managers. Uber has suggested it could provide similar data to Australian cities.
Already the Australian Urban Research Infrastructure Network (AURIN) provides access to data on building stock, urban land use, thermal imaging, vegetation maps, public transport, public facilities and many other things. It also provides tools to analyse and work with this plethora of data. Partners such as the City of Melbourne are using this information to model their planning and policy initiatives.
It’s all about generating efficiencies—getting greater value out of existing information sources, rather than commissioning separate surveys. “Access to AURIN data enables us to have a much better evidence base to make decisions, to provide services in a more effective way, and ensure investments provide a better return to the community,” says Austin Ley, the City of Melbourne’s Manager City Research.
Opening up research data can clearly be a very worthwhile exercise.