Multiple Stream Job Performance Optimization with Source Operator Graph Transformations
Author/Presenter
Event Type
Workshop

Applications
Clouds and Distributed Computing
SIGHPC Workshop
Time: Sunday, November 12th, 11:10am - 11:40am
Location: 507
Description: Multiple distributed stream queries executed on stream processing systems need to be fine-tuned to the compute cluster in order to harness the full potential of the hardware they run on. In this paper, we describe an automatic technique for conducting such stream query optimization in the presence of multiple stream jobs. During this auto-tuning process, we identify the structure of each program and apply automatic program transformations to generate optimized, unified streaming jobs. The operators of the unified secondary sample application are grouped into processing elements (PEs) according to their performance characteristics and the topology of the stream graph, producing a high-performance stream query network. We implemented this multiple stream query optimization technique in a mechanism called Tahitica. We demonstrate our approach's ability to produce optimized stream query performance by comparing it to naive deployments of two real-world stream processing applications in the domains of healthcare and search advertising. Our stream query optimization approach achieved a 7.1% throughput improvement over a naive deployment.
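The two steps named in the description, unifying jobs around shared source operators and grouping operators into PEs, can be illustrated with a minimal sketch. This is not Tahitica's algorithm, which the abstract does not detail. The function names `unify_jobs` and `group_into_pes`, the `src:` naming convention, the per-operator cost numbers, and the greedy cost-budget rule are all assumptions made for illustration.

```python
from collections import defaultdict
from graphlib import TopologicalSorter  # Python 3.9+


def unify_jobs(jobs):
    """Merge several jobs into one edge list, sharing identical sources.

    Assumption: an operator whose name starts with "src:" is a source and
    is deduplicated by name; all other operators are prefixed with the job
    id so that they stay distinct in the unified graph.
    """
    def qualify(job_id, op):
        return op if op.startswith("src:") else f"{job_id}/{op}"

    edges = set()
    for job_id, job_edges in jobs.items():
        for up, down in job_edges:
            edges.add((qualify(job_id, up), qualify(job_id, down)))
    return sorted(edges)


def group_into_pes(edges, cost, budget):
    """Greedily fuse operators into PEs along the graph topology.

    Assumption: an operator with exactly one upstream operator joins that
    operator's PE if the PE's summed cost stays within `budget`; otherwise
    it starts a new PE. Returns a dict mapping operator -> PE index.
    """
    preds = defaultdict(list)
    nodes = set()
    for up, down in edges:
        preds[down].append(up)
        nodes.update((up, down))
    graph = {n: preds[n] for n in sorted(nodes)}

    pe_of, pe_cost = {}, []
    # Upstream operators are always visited before their downstream ones.
    for op in TopologicalSorter(graph).static_order():
        ups = graph[op]
        if len(ups) == 1:
            pe = pe_of[ups[0]]
            if pe_cost[pe] + cost[op] <= budget:
                pe_of[op] = pe
                pe_cost[pe] += cost[op]
                continue
        pe_of[op] = len(pe_cost)
        pe_cost.append(cost[op])
    return pe_of


# Two hypothetical jobs reading the same source stream.
jobs = {
    "A": [("src:events", "filter"), ("filter", "sink")],
    "B": [("src:events", "count"), ("count", "sink")],
}
unified = unify_jobs(jobs)
# Hypothetical relative costs per operator (not measured values).
cost = {"src:events": 1.0, "A/filter": 2.0, "A/sink": 1.0,
        "B/count": 4.0, "B/sink": 1.0}
placement = group_into_pes(unified, cost, budget=4.0)
```

In this toy input the shared source appears once in the unified graph, and the cheap operators of job A are fused with it while the expensive `count` operator of job B is placed in a PE of its own. A real optimizer would derive costs from profiling rather than from fixed numbers.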