P62: How To Do Machine Learning on Big Clusters
Session: Poster Reception
Event Type: ACM Student Research Competition, Poster, Reception

Time: Tuesday, November 14th, 5:15pm - 7pm
Location: Four Seasons Ballroom
Description: Scientific pipelines, such as those in chemogenomics machine learning applications, are often composed of multiple interdependent data processing tasks. We are developing HyperLoom, a platform for defining and executing workflow pipelines in large-scale distributed environments. HyperLoom users can easily define dependencies between computational tasks and create a pipeline that can then be executed on HPC systems. The high-performance core of HyperLoom dynamically orchestrates the tasks over the available resources while respecting task requirements. The entire system was designed to have minimal overhead and to deal efficiently with varying task computation times. HyperLoom can execute pipelines that contain basic built-in tasks, user-defined Python tasks, tasks wrapping third-party applications, or a combination of those.
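
To make the idea of a dependency-driven pipeline concrete, the following is a minimal sketch of the general technique, not HyperLoom's API. It uses only the Python standard library (`graphlib`, available from Python 3.9, and `concurrent.futures`) and runs on a single machine with threads rather than on an HPC cluster. The names `run_pipeline`, `tasks`, and `deps` are hypothetical and chosen for illustration only.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter


def run_pipeline(tasks, deps, max_workers=4):
    """Run tasks in dependency order; independent tasks run concurrently.

    tasks: dict mapping task name -> callable taking a dict of dependency results
    deps:  dict mapping task name -> list of task names it depends on
    (Hypothetical interface for illustration, not HyperLoom's.)
    """
    sorter = TopologicalSorter({name: deps.get(name, []) for name in tasks})
    sorter.prepare()  # raises graphlib.CycleError if dependencies are cyclic
    results = {}
    running = {}  # maps each in-flight future to its task name
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Submit every task whose dependencies have all finished.
            for name in sorter.get_ready():
                inputs = {d: results[d] for d in deps.get(name, [])}
                running[pool.submit(tasks[name], inputs)] = name
            # Block until at least one in-flight task completes.
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for future in done:
                name = running.pop(future)
                results[name] = future.result()
                sorter.done(name)  # unlocks tasks that depended on this one
    return results


# A small diamond-shaped pipeline: "square" and "double" both depend on
# "load" and can run in parallel; "merge" waits for both of them.
tasks = {
    "load": lambda inputs: [1, 2, 3, 4],
    "square": lambda inputs: [x * x for x in inputs["load"]],
    "double": lambda inputs: [2 * x for x in inputs["load"]],
    "merge": lambda inputs: sum(inputs["square"]) + sum(inputs["double"]),
}
deps = {"square": ["load"], "double": ["load"], "merge": ["square", "double"]}

results = run_pipeline(tasks, deps)
print(results["merge"])  # prints 50
```

Because a task is submitted as soon as its dependencies complete, rather than in fixed stages, tasks with very different run times do not hold up unrelated branches of the pipeline. A distributed system would additionally have to place tasks on workers and move data between them, which this sketch does not attempt.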