The storage, reduction, and analysis of these complex data sets have typically been carried out by large teams of ‘insiders’: expert scientists who belong to collaborations organized around the production of scientific results. In the past few years, however, there has been a strong movement to broaden access to large-scale physics data in the US, driven both by ‘top down’ pressure (i.e., federal agency policies) and ‘bottom up’ pressure (i.e., ‘outsider’ researchers who want access to the data for their own research interests).
By and large, the movement toward broader access is a positive one, but it requires care in implementation. Providing open access to large, complex data sets is neither easy nor inexpensive. It requires technical effort (in long-term curation; data reduction and associated metadata production; data delivery in commonly used data formats; software to read and visualize the data; and associated documentation) as well as an understanding of the needs of the broader research community. There are cultural barriers to overcome (‘It’s my data; why should I have to give it to you?’) as well as implications for the intellectual property rights of the data producers.
In this talk, I will survey current trends in open access and use LIGO as a case study to illustrate both the benefits and the challenges of providing large data sets to the broader research community.