Research Data Group
The Research Data Group (RDG), part of the Data Division, supports STFC facilities and programmes in managing their research data effectively, in conjunction with the data services and petascale storage provided within the Scientific Computing Department (SCD). We undertake research, software development, and the development and delivery of services to provide integrated and automated platforms across all areas of data management, including:
- Facilities Data Archiving. We develop and support systems which collect and store data generated from experiments by STFC facilities, for online use and archiving. In this work we collaborate closely with the Central Laser Facility, the Diamond Light Source and the ISIS Neutron Source.
- Data Cataloguing. We develop and support the ICAT Suite, a collection of integrated software tools for cataloguing facilities experiments, including data searching and browsing, fast data upload and download, and interfaces and portals to allow software analysis of data.
- Research Output Publication and Tracking. We support data publishing by issuing Digital Object Identifiers and by providing web-based systems for accessing data, working with the STFC facilities and other partners such as the Medical Research Council. RDG also supports the STFC Library and Information Services in capturing the published output of the STFC Laboratories via the ePubs system.
- Data Analysis. We develop and integrate advanced information management, HPC, cloud, and distributed processing technologies to enable novel data analytics capabilities in experimental and computational sciences. We have been building innovative solutions for large-scale science facilities that serve advanced materials science, engineering and other science communities.
- Data Preservation. Keeping data safe and usable is a complex activity; we have a number of European projects looking at preserving data in its experimental context, including APARSEN, SCAPE and SCIDIP-ES.
We work closely with the STFC facilities, including ISIS and CLF, and with the Diamond Light Source. We also have strong links with facilities throughout Europe and the world, especially via the PaN-Data Consortium.
RDG has the following longer-term goals:
- To provide long-term archiving and preservation of data in context, so that it can be kept safe and reused by future researchers, maximising the science which can be extracted from it.
- To give a complete picture of research outputs, using linked data technology to capture and publish the connections between the components of research.
- To provide access to SCD’s computing infrastructure through cloud interfaces, so that scientists can seamlessly access, process and explore their data.
- To develop and deliver advanced system integration and data analytics technologies that advance the state of the art in experimental, computational and data-driven science.