Description: Earth system models are valuable tools for furthering our understanding of past, present, and future climate states. Because these models tend to be large and complex and are under near-constant development, quality assurance and subsequent debugging are critical parts of the development cycle. Here, we describe our multi-year effort to better evaluate the quality and "correctness" of the Community Earth System Model (CESM), a widely used climate model. Our approach relies on an initial coarse-grained, ensemble-based consistency test to determine code correctness, which has already proved successful in practice. The additional capability we seek is a means of easily tracing a coarse-grained failure to its root cause, and we discuss our strategy and promising efforts to date toward that goal.