Multiple Stream Job Performance Optimization with Source Operator Graph Transformations
Workshop: The Eighth International Workshop on Data-Intensive Computing in the Clouds
Authors: Miyuru Dayarathna (WSO2 Inc)
Abstract: Multiple distributed stream queries executed on stream processing systems need to be fine-tuned to the compute cluster to harness the full potential of the hardware they run on. In this paper, we describe an automatic technique for conducting such stream query optimization in the presence of multiple stream jobs. During this auto-tuning process, we identify the structure of each program and apply automatic program transformations to generate optimized, unified streaming jobs. The operators of the unified secondary sample application are grouped into PEs according to their performance characteristics and the topology of the stream graph, producing a high-performance stream query network. We implemented this multiple stream query optimization technique in a mechanism called Tahitica. We demonstrate our approach's ability to produce optimized stream query performance by comparing it to naive deployments of two real-world stream processing applications in the domains of healthcare and search advertising. Our stream query optimization approach achieved a 7.1% throughput improvement over a naive deployment.