The reality of disk-based storage

…is that it sucks. Hard drives are crap. FC, SCSI, SAS, ATA, SATA, you name it, it’s unreliable crap.

Here’s what you don’t want to see on a NetApp filer, for example:

filer06*> vol status seq4
         Volume State      Status            Options
           seq4 online     raid_dp, trad     maxdirsize=10240
                           reconstruct
                           wafl inconsistent
                           ironing
                Containing aggregate:  

                Plex /seq/plex0: online, normal, active
                    RAID group /seq4/plex0/rg0: normal
                    RAID group /seq4/plex0/rg1: normal
                    RAID group /seq4/plex0/rg2: reconstruction 2% completed
                    RAID group /seq4/plex0/rg3: normal
                    RAID group /seq4/plex0/rg4: normal
                    RAID group /seq4/plex0/rg5: normal

Earlier, a “fake” triple-disk failure within a RAID group left the volume in a possibly inconsistent state, so the volume is being “ironed” to make sure everything is A-OK. (NetApp calls their filesystem WAFL, so those witty engineers called their fsck tool “wafliron”. Cute.) Of course, during something like a wafliron the disks get thrashed to all hell, and sure enough another disk crossed an error threshold and was failed as well. So now I’m waflironing *and* reconstructing on this volume. Yay!
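
If you'd rather not eyeball that output every ten minutes, something like the rough Python sketch below can pick out the ugly parts. It assumes you have already captured the `vol status` text in the layout shown above (how you pull it off the filer is left out entirely), and the function name and the list of trouble words are my own inventions for illustration, not anything NetApp ships.

```python
import re

# Status words from the output above that signal trouble.
# This list is an assumption for illustration, not an exhaustive set.
TROUBLE_WORDS = ("reconstruct", "wafl inconsistent", "ironing")

# Matches lines such as "RAID group /seq4/plex0/rg2: reconstruction 2% completed".
RG_LINE = re.compile(r"RAID group (\S+): (.+)")


def summarize_vol_status(text):
    """Hypothetical helper: return (trouble words seen, non-normal RAID groups)."""
    flags = []
    degraded = {}
    for line in text.splitlines():
        match = RG_LINE.search(line)
        if match:
            # RAID group lines are handled separately so that
            # "reconstruction ..." is not also counted as a status flag.
            name, state = match.group(1), match.group(2).strip()
            if state != "normal":
                degraded[name] = state
            continue
        for word in TROUBLE_WORDS:
            if word in line and word not in flags:
                flags.append(word)
    return flags, degraded


# Abbreviated copy of the output above, used only as sample input.
SAMPLE = """\
         Volume State      Status            Options
           seq4 online     raid_dp, trad     maxdirsize=10240
                           reconstruct
                           wafl inconsistent
                           ironing
                Plex /seq/plex0: online, normal, active
                    RAID group /seq4/plex0/rg0: normal
                    RAID group /seq4/plex0/rg2: reconstruction 2% completed
"""

if __name__ == "__main__":
    flags, degraded = summarize_vol_status(SAMPLE)
    print("status flags:", flags)
    print("degraded RAID groups:", degraded)
```

It is plain string matching against the text format, so treat it as a convenience for watching one sick volume rather than real monitoring.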

Needless to say, I’ll be camped out in front of this filer for the next 8 hours chanting storage voodoo prayers and sacrificing virgin hard drives in hopes of appeasing the storage gods.

