Data Management Plan

Available at Berkeley Lab:

https://dmptool.org/

1. Data collected, generated, or used

During this project, data characterizing the thin adaptive mirror will be collected in the metrology laboratory and on the beamlines. LBNL, ANL, BNL and SLAC follow the practices described below for management of these data. The primary intellectual output of this effort will be shared with the scientific community at conferences and through publications in journals and proceedings.

Simulation and analysis programs will be written in languages including Python, MATLAB and IDL, with output data stored in commonly used formats to facilitate sharing and comparison among groups. Libraries of functional routines will be made available at the conclusion of the project, or at the time of peer-reviewed publication.

The calibrations and trained neural networks will be saved in commonly used formats. Data from autonomous experiments and wavefront engineering will be collected and stored using the latest data container available (e.g. Tiled), so that they can be parsed efficiently for integration into machine-learning-enabled data processing.

2. Standards

All raw data will be stored as electronic files and backed up at the facility where the data are taken. A copy of the most important data will be backed up at the other three facilities.

Metrology data typically come in the form of 3D surface profiles or, in some cases, 2D line scans. All data will be stored with descriptive text explaining the measurement conditions and the instrument used. The data will be stored in instrument-specific native formats or in widely used, lossless data formats including PNG, TIFF, and HDF5. ASCII or CSV formats will be used where appropriate.
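
As an illustration of storing a line scan together with its descriptive text, the sketch below writes a CSV file and a JSON sidecar file with the same base name. It is a minimal example under stated assumptions, not the project's actual tooling: the function names, the column names (position_mm, height_nm) and the metadata keys are hypothetical.

```python
import csv
import json
from pathlib import Path


def save_line_scan(csv_path, positions_mm, heights_nm, metadata):
    """Write a 2D line scan as CSV plus a JSON sidecar describing it.

    All names here (columns, metadata keys) are illustrative assumptions.
    """
    csv_path = Path(csv_path)
    with csv_path.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["position_mm", "height_nm"])
        writer.writerows(zip(positions_mm, heights_nm))
    # The sidecar shares the base name, so data and description stay paired.
    sidecar = csv_path.with_suffix(".json")
    sidecar.write_text(json.dumps(metadata, indent=2))
    return sidecar


def load_line_scan(csv_path):
    """Read back the scan rows and the metadata from the sidecar file."""
    csv_path = Path(csv_path)
    with csv_path.open(newline="") as f:
        reader = csv.DictReader(f)
        rows = [(float(r["position_mm"]), float(r["height_nm"])) for r in reader]
    metadata = json.loads(csv_path.with_suffix(".json").read_text())
    return rows, metadata
```

Keeping the description in a separate, human-readable file means the CSV stays trivially loadable in Python, MATLAB or IDL, at the cost of having two files to keep together.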

Additional documentation will be co-located as needed and deemed appropriate and will also be referenced in a README file.

Where applicable, raw data, processed data and metadata will be stored together in standard HDF5 files, and we will provide Python scripts to parse the data. These scripts will be integrated into the instrument controls for interoperability. Key data shared in publications will be associated with Digital Object Identifiers.
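
The sketch below shows one way raw data, processed data and metadata could sit in a single HDF5 file and be read back with a short Python script. It assumes the h5py and NumPy packages are available. The group and dataset names (raw/surface_height, processed/surface_height, metadata) and the function names are hypothetical and do not represent a layout the project has fixed.

```python
import h5py
import numpy as np


def write_measurement(path, raw, processed, metadata):
    """Store raw data, processed data and metadata in one HDF5 file.

    The layout below is an illustrative assumption, not a project standard.
    """
    with h5py.File(path, "w") as f:
        # Intermediate groups ("raw", "processed") are created automatically.
        f.create_dataset("raw/surface_height", data=raw, compression="gzip")
        f.create_dataset("processed/surface_height", data=processed,
                         compression="gzip")
        # Metadata are kept as attributes on a dedicated group.
        meta = f.create_group("metadata")
        for key, value in metadata.items():
            meta.attrs[key] = value


def read_measurement(path):
    """Return (raw, processed, metadata) from a file written above."""
    with h5py.File(path, "r") as f:
        raw = f["raw/surface_height"][()]
        processed = f["processed/surface_height"][()]
        metadata = dict(f["metadata"].attrs.items())
    return raw, processed, metadata
```

gzip compression is lossless, so the stored arrays round-trip exactly. Attributes suit short scalar or string metadata; longer documentation is better kept in a separate text or README file.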

3. Related tools, software and/or code

Code developed for this project will adhere to DOE guidelines and be assigned digital object identifiers through DOE CODE.

Simulation tools will be integrated into the existing software suites developed by DOE (OASYS, SRW) and, pending approval, into open-access software such as Sirepo.

Instrument controls software will be designed to provide an interface to existing instrument controls suites (such as Bluesky), and may be integrated into those suites, which are hosted in software repositories such as GitHub.

4. Data sharing

If other groups would like to use our data, we will make the data available where this is possible without infringing a non-disclosure agreement (NDA), and will consider case by case whether our contribution warrants co-authorship or a simple reference to our work.

Where applicable, the data will be shared through the Globus data management system for effective collaboration with participating facilities. Key data shared in publications will be made publicly available and associated with Digital Object Identifiers.

5. Data preservation

Long-term curation of the project’s collected data will be provided by the hosting institution where the data were collected. All the experiment data will be backed up, stored on the facilities’ servers and managed according to the standards of the hosting laboratories.

6. Data protection: security and integrity

No personal data of any form, protected intellectual property, or national security or economic competitiveness data will be collected in the course of this work.

The data will be backed up on hard drives, on Network Attached Storage at the ALS (LBNL), and on the long-term High Performance Storage System at NERSC (LBNL).
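
One common way to check that a backup copy still matches its source is to compare checksums. The sketch below is offered as an illustration only and is not a procedure stated in this plan: it builds a SHA-256 manifest of a directory with the Python standard library and lists the files whose backup copy is missing or differs. The function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_mismatches(source_manifest, backup_root):
    """List files that are missing from, or differ in, the backup copy."""
    backup_manifest = build_manifest(backup_root)
    return sorted(
        name
        for name, digest in source_manifest.items()
        if backup_manifest.get(name) != digest
    )
```

A manifest written at acquisition time can be stored next to the data, so that any of the backup locations can later be verified against it without access to the original disk.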

7. Oversight of data management

The Principal Investigator will ensure that data management is consistent with the DMP and will seek oversight from the ALS Computation group and the Instrument Controls groups at the collaborating facilities.

8. Rationale

The Principal Investigator will ensure that the methods can be reproduced at other beamlines and other facilities. The code will be shared among DOE facilities using the Globus data management system, which provides efficient controls for user access.

9. Intellectual property

Title to inventions that are conceived or first actually reduced to practice under this DOE award will vest in the United States as per 42 USC 5908.