Concept for a modular, cloud-native image delivery service enabling access to and transformation of large image datasets, bridging storage and applications without data duplication.
dtool is a lightweight data management tool that packages metadata with immutable data to promote accessibility, interoperability, and reproducibility; dserver makes dtool datasets findable.
In this lightning talk, I will share my experience using DataLad, git-annex and ReproMan to run software pipelines on hundreds of fMRI datasets on an HPC cluster.
Scientific computing workflows have become increasingly complex, often comprising numerous interdependent tasks executed on distributed computing resources.