Research Data Lifecycle Management

Workshop on Research Data Lifecycle Management (RDLM 2011)

This material is based upon work supported by the National Science Foundation under Grant No. 1137007.

Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

This workshop was held on July 18-20, 2011
Princeton University, Princeton, NJ

Thanks to all who helped to make this workshop a success!

Workshop Recommendations Summary (posted Mar 2013)

We plan to expand the above summary and to make the workshop materials below accessible through more permanent URLs, suitable for citation, in a final workshop report to be issued shortly.

The objective of this National Science Foundation (NSF)-funded workshop was to bring together researchers, campus information technology (IT) leaders, and library/archive specialists to discuss data lifecycle management, specifically as it relates to computational science and engineering research data. Discussions focused on building a common understanding of best practices and funding models for selecting, storing, describing, and preserving these digital data. The workshop was also meant to help cultivate partnerships among these communities to foster continued developments in the preservation and sharing of research data.

Workshop Materials

Workshop Agenda (updated on 7-20-2011)

Wed 7/20 vendor panel

Dell presentation

Oracle presentation

IBM presentation

Wed 7/20 funding panel

Huerta presentation

Schopf presentation

Waters (no presentation)

Tue 7/19 presentations

Athey presentation

Goldstein presentation

Griffiths presentation

Michener presentation

Princeton campus map (posted on 7-15-2011)

Tue 7/19, 1-3pm Breakout Sessions Information (posted on 7-14-2011)

List of Workshop Attendees (updated on 7-19-2011)

Submitted Position Papers (updated on 7-22-2011)

Related events, organizations and reports (posted on 7-17-2011)

Online Participation (updated on 7-29-2011):

Video teams at Princeton streamed the video/audio for all sessions of the conference during Tue 7/19 and the first half of Wed 7/20, including the 2 separate hour-long sessions on Tuesday where participants "broke out" into 3-4 different rooms to discuss different topics each hour.
 
The videos of these sessions are being prepared and will be made available for viewing in the near future.
 
Online and on-site participants contributed to the workshop through questions and comments posted on Twitter (each Twitter post is limited to 140 characters). Each session, including each of the breakout session groups, had someone monitoring the Twitter feed. We recommended using http://www.monitter.com to monitor the Twitter feed for a specific tag.
 
The tag #rdlmw was used to post to all sessions except the breakout session groups.
 
The following tags were used to contribute to the 8 different breakout session groups on Tue 7/19:
 
(Please note: the tags below may not work when searching for them using http://www.monitter.com; an alternative is to use twitter.com and substitute a space for the dash to find these posts.)
 
1-2pm:
 
#rdlm-sec  for the Secure Research Data group (Palmer Room)
#rdlm-pol  for the Policy group (Witherspoon Room)
#rdlm-sel  for the Assessment and selection of research data group (Senior Room)
#rdlm-fund  for the Funding and operation of research data lifecycle management group (Prince William Ballroom)
 
2-3pm:
 
#rdlm-partlib  for the Partnering researchers, IT staff, librarians and archivists group (Prince William Ballroom)
#rdlm-stand  for the Standards for provenance, metadata and discoverability group (Palmer Room)
#rdlm-partfund  for the Partnering funding agencies, research institutions and communities group, which was merged with the Industrial and corporate partnerships group; the combined group met in the Senior Room and used the single tag #rdlm-partfund
 

Workshop Description

This workshop will bring together thought leaders on the topic of Research Data Lifecycle Management, including researchers, librarians, archivists, and IT professionals. Through the workshop program, attendees will be able to participate in the development of a data lifecycle management framework. By drawing together this diverse group of specialists, the workshop will be able to leverage the progress made to date by the digital curation, preservation and open repositories communities. The workshop will also promote the needed interaction, collaboration, and information sharing among diverse institutions and groups involved in High Performance Computing (HPC), research computing, IT, libraries, and archives. The goal is to develop a combination of policy and financial frameworks that ensures maintenance of important data over time scales longer than the career of any individual investigator. 

On-site participation will be limited to a total of seventy-five leaders with balanced representation from the following areas:

  • researchers who use computational resources to produce and access data sets;
  • IT professionals involved in research computing support; and
  • librarians/archivists who manage this type of data. 

In addition, video/audio streaming coupled with Twitter will be used to reach a much broader range of off-site participants. We hope to engage a diverse population of researchers and professionals involved in research data lifecycle management, so that varying perspectives and differing institutions are represented in the conversation.

Workshop topics may include, but are not limited to:

  • Funding and Operation of Research Data Lifecycle Management
  • Partnering Researchers, IT Staff, Librarians and Archivists
  • Assessment and Selection of Research Data
  • Policy
  • Partnering Funding Agencies, Research Institutions, and Communities
  • Standards for Provenance, Metadata, Discoverability
  • Secure Research Data
  • Industrial and Corporate Partnerships

A vendor panel discussion and a funding agency panel discussion are also planned.
 
Position papers, submitted by people involved in the production, use, and management of data used in research computing, will help to gather input from a broad community to seed the conversations at the workshop.

This NSF-funded workshop is a collaboration between the Coalition for Academic Scientific Computation (CASC) and the EDUCAUSE ACTI Campus Cyberinfrastructure Group, and will be hosted at the Nassau Inn, adjacent to Princeton University’s campus in Princeton, NJ, from Monday, July 18 through Wednesday, July 20. The workshop will include an informal reception at the Prospect House, Princeton University’s faculty/staff club, on Monday, July 18 at 5:30 pm. It will also include a dinner at Rat's, a restaurant on the Grounds for Sculpture in Hamilton, NJ, on Tuesday, July 19 at 5:30 pm.

The findings of the workshop will be described in a report written by the organizing committee and an invited group of participants. The report will be submitted to EDUCAUSE for publication and posted on the CASC website.

Please feel free to contact members of the organizing committee by email if you have any questions, concerns, or suggestions.

Organizing Committee 

Curt Hillegas, Ph.D. - Chair
Director of Research Computing, Princeton University

Rajendra (Raj) Bose, Ph.D.
Manager, Research Computing Services, Columbia University

Kerstin Lehnert, Ph.D.
Senior Research Scientist, Lamont-Doherty Earth Observatory, Columbia University

Clifford Lynch, Ph.D.
Executive Director, Coalition for Networked Information

Oren Sreebny
Senior Director, Emerging Technologies and Communications, IT Services, University of Chicago