Tag: Detecting performance problems

Shades of Grey

System failures are often not black and white, but shades of grey (gray?). Detecting and alerting on “performance-challenged” system components is a lot more difficult than detecting black-or-white (catastrophic) failures. The metrics used are usually of the “time vs. latency” or “time vs. event count” variety, and they are often aggregated, frequently by averaging.
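To make the point about aggregation concrete, here is a minimal sketch in Python of how an averaged latency metric can blur a grey failure. The sample window, the numbers, and the `percentile` helper are all made up for illustration; they are assumptions, not data from a real system or the API of any monitoring tool.

```python
import statistics


def percentile(values, pct):
    """Nearest-rank percentile; pct is an integer from 1 to 100.

    Hypothetical helper for this sketch, not a library function.
    """
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100, avoiding floating-point rounding.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


# Assumed one-minute window: 95 requests at 20 ms, 5 requests at 2000 ms.
latencies_ms = [20] * 95 + [2000] * 5

mean_ms = statistics.mean(latencies_ms)
p50_ms = percentile(latencies_ms, 50)
p99_ms = percentile(latencies_ms, 99)

print(f"mean={mean_ms} ms, p50={p50_ms} ms, p99={p99_ms} ms")
```

With these assumed numbers the mean comes out at 119 ms, while the median is 20 ms and the 99th percentile is 2000 ms. An alert on the average alone sees a single middling value that describes neither the healthy requests nor the slow ones, which is one way a component can be "performance-challenged" without looking obviously broken.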