CloudOpen Europe 2013: Efficient and Large-scale Infrastructure Monitoring with Tracing
Julien Desfossez on 21 October 2013
Tracing is a powerful tool for solving problems in high-performance multi-threaded applications. There are success stories of custom application tracers deployed in large distributed environments, but low-level system tracers are almost never deployed there. With the features added to LTTng over the past year, we can now efficiently extract relevant information from running production servers, remotely and in real time. We will demonstrate how LTTng can be deployed in a cloud infrastructure (OpenStack) to extract high-precision metrics remotely, how to enable and disable kernel and user-space events dynamically, and how to extract traces on crashes. This presentation will give system administrators a new perspective on how to monitor and debug production servers in large-scale data centers.