Your systems have drives set up in RAID configurations, and besides, you have data copied to redundant systems and backups, right? Safe? Maybe not. I recently found corruption in a quarter of a million files that had gone undetected for years!
RAID in redundant configurations only protects against drive failures; ’nuff said.
Backups will only save you if they actually work (you do trial restores, right?) and if you can detect data corruption.
Redundant copies? Again, you have to have a mechanism to detect corruption and, more importantly, actually use it.
In the case where I found a quarter of a million corrupt files, md5 hashes were stored in the metadata for the files, but they were being checked so slowly that the checking was ineffective. (There were probably a trillion or so files and petabytes of data in total. We’ll examine the kinds of corruption found in a future article.)
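To make the idea concrete, here is a minimal Python sketch of that kind of check. It assumes the stored hash lives in a Linux extended attribute named user.md5; that attribute name is my own placeholder, since the exact way the hashes were kept in metadata isn't important here.

```python
import hashlib
import os

def stored_md5(path):
    """Read a previously stored md5 hex digest from an extended attribute.

    The attribute name 'user.md5' is a placeholder for illustration;
    os.getxattr is Linux-only.
    """
    try:
        return os.getxattr(path, "user.md5").decode().strip()
    except OSError:
        return None  # no stored hash for this file

def actual_md5(path, chunk_size=1 << 20):
    """Recompute the md5 of the file contents, reading in 1 MiB chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def is_corrupt(path):
    """A file is flagged only if it has a stored hash and it no longer matches."""
    expected = stored_md5(path)
    return expected is not None and expected != actual_md5(path)
```

The catch, as I found, is not writing this check but running it often enough to cover all of your files before the next round of corruption arrives.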
If you do not have corruption detection built into the data and applications themselves, then you need some other method. Some filesystems, such as ZFS and btrfs, can detect corruption on their own.
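On those filesystems a scheduled "scrub" reads every block and verifies its checksum. Here is a rough sketch of kicking one off from Python, assuming a ZFS pool named tank and a btrfs filesystem mounted at /mnt/data; both names are placeholders, and scrubs normally require root.

```python
import subprocess

ZFS_POOL = "tank"          # placeholder pool name
BTRFS_MOUNT = "/mnt/data"  # placeholder mount point

# ZFS: start a scrub, which reads all data and verifies checksums.
subprocess.run(["zpool", "scrub", ZFS_POOL], check=True)

# btrfs equivalent: scrub the filesystem mounted at BTRFS_MOUNT.
subprocess.run(["btrfs", "scrub", "start", BTRFS_MOUNT], check=True)

# Check on progress and results later:
subprocess.run(["zpool", "status", ZFS_POOL], check=True)
subprocess.run(["btrfs", "scrub", "status", BTRFS_MOUNT], check=True)
```

In practice you would run only the commands for the filesystem you actually have, on a regular schedule, and alert on any reported checksum errors.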
In a future article I’ll present a simple script/solution that you can implement to at least detect corruption when it happens, so that your redundant copies can then be used to fix it. Otherwise, corrupt files may just sit there silently.
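In the meantime, here is a rough sketch of what such a detection pass could look like. This is not the script I’ll be presenting, just one possible shape for it: md5 hashes kept in a JSON manifest, where the manifest name and command-line interface are placeholders.

```python
import hashlib
import json
import os
import sys

MANIFEST = "manifest.json"  # placeholder: maps relative path -> md5 hex digest

def md5_of(path, chunk_size=1 << 20):
    """Hash file contents in 1 MiB chunks to avoid loading large files whole."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def build_manifest(root):
    """First run: record a hash for every file under root."""
    manifest = {}
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            manifest[os.path.relpath(path, root)] = md5_of(path)
    with open(MANIFEST, "w") as f:
        json.dump(manifest, f, indent=2)

def verify(root):
    """Later runs: flag files whose current hash no longer matches."""
    with open(MANIFEST) as f:
        manifest = json.load(f)
    for rel, expected in manifest.items():
        path = os.path.join(root, rel)
        if not os.path.exists(path):
            print(f"MISSING: {rel}")
        elif md5_of(path) != expected:
            print(f"CORRUPT: {rel}")

if __name__ == "__main__":
    # usage: python verify.py build|verify /path/to/data
    cmd, root = sys.argv[1], sys.argv[2]
    build_manifest(root) if cmd == "build" else verify(root)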