Description
Workflows have proved to be a popular method for processing and analysing scientific data. The graphical programming approach they adopt enables non-programmers to rapidly construct and share sophisticated analysis pipelines. Although workflows have seen significant adoption for processing files in a batch fashion, their application to streams of data has been less widespread. Tools that can apply dataflows to streams of data do exist, but they generally operate only in the streaming domain.
This paper proposes extensions to the workflow engine provided by the e-Science Central platform that enable it to enact workflows containing mixtures of streaming and batch operations in a consistent manner, while still retaining the provenance capture, auditing and sharing features provided by the underlying platform.