Many companies in the logging/monitoring space will try to sell you on AI and ML (Artificial Intelligence and Machine Learning) to find the abnormal.
Say you’re looking for the cause of a problem in log files. You’ve checked all the low-hanging fruit (the ERROR and WARN lines) and still can’t find what you’re looking for.
I would suggest that a different kind of A.I. works really well: Artificial Ignorance. It is one of the time-honoured ways of subtracting the normal from your data to get the abnormal. Take away the knowns to get the unknowns. When looking for a needle in a haystack, try removing all the hay. This technique was first written about (afaik) over 20 years ago!
Ideally, you should be able to replicate the procedure by filtering the variable data out of the log lines and sorting what remains by number of occurrences. Then develop and save search criteria that filter out the lines with the highest counts. What you should be left with are lines that you rarely (if ever) see. Maybe a DEBUG line that says something like “Should Never Happen”, but it did.
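To make that procedure concrete, here is a rough sketch in Python. It is only an illustration, not any provider's feature: the masking rules, the placeholder names (`<TS>`, `<IP>`, `<HEX>`, `<N>`), the helper functions and the sample log lines are all my own assumptions, and real logs will need their own masks.

```python
import re
from collections import Counter

# Hypothetical masking rules (assumptions, adjust for your own log format).
# Order matters: mask the most specific patterns first, bare numbers last.
MASKS = [
    (re.compile(r"\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}(?:[.,]\d+)?"), "<TS>"),
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<IP>"),
    (re.compile(r"\b0x[0-9a-fA-F]+\b"), "<HEX>"),
    (re.compile(r"\d+"), "<N>"),
]


def normalise(line):
    """Replace the variable data in a log line with fixed placeholders."""
    line = line.strip()
    for pattern, placeholder in MASKS:
        line = pattern.sub(placeholder, line)
    return line


def pattern_counts(lines):
    """Count how often each normalised pattern occurs, ignoring blank lines."""
    return Counter(normalise(line) for line in lines if line.strip())


def unknowns(lines, known):
    """Subtract the known patterns and return what is left, rarest first."""
    counts = pattern_counts(lines)
    leftover = [(p, n) for p, n in counts.items() if p not in known]
    return sorted(leftover, key=lambda item: item[1])


if __name__ == "__main__":
    # Made-up sample data, purely for illustration.
    sample = [
        "2024-01-05 10:00:01 INFO Request 101 served in 12 ms",
        "2024-01-05 10:00:02 INFO Request 102 served in 15 ms",
        "2024-01-05 10:00:03 INFO Connection from 10.0.0.7 accepted",
        "2024-01-05 10:00:04 INFO Request 103 served in 9 ms",
        "2024-01-05 10:00:05 DEBUG Should Never Happen: queue depth -1",
    ]

    # Step 1: look at the most common patterns and decide which are "hay".
    for pattern, count in pattern_counts(sample).most_common():
        print(count, pattern)

    # Step 2: save the boring patterns as known, then print what remains.
    known = {
        "<TS> INFO Request <N> served in <N> ms",
        "<TS> INFO Connection from <IP> accepted",
    }
    for pattern, count in unknowns(sample, known):
        print("UNKNOWN:", count, pattern)
```

The `known` set is the part worth saving between runs. Each time you decide a pattern is normal you add it, and the pile of hay gets a little smaller.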
How well does your logging provider handle this? I’ll be exploring this with different logging providers and writing about it shortly.