With the advance of new technologies, data volumes and the number of files are constantly increasing. Additionally, new regulations (e.g. GDPR) set strict requirements on the storage and use of privacy-sensitive data. Data management has therefore become an essential part of data-driven research.
In this course we will introduce how to manage data efficiently with the data management framework iRODS and how to build computational pipelines on HPC infrastructure that use this data. Topics in this course will include:
- Data Life Cycle and FAIR principles
- iRODS concepts
- iRODS graphical user interface
- Labeling data and searching for data in iRODS
- Building a computational pipeline that draws on data managed in iRODS