Yikes! The backup that succeeded, yet actually failed.

A few years ago, I wrote a backup program for internal use.  It's written in Perl and is extremely configurable.  Last year, one of the guys I sub-contract for asked if I could write a backup program for a client.  I suggested that I simply make a few tweaks to my program and license it to them.  He agreed.

The backup program has always worked for me, and my backups are routinely over 500 MB.  Anyway...the client had been using the backup successfully for the last year or so.  I kept getting the emails that said “backup succeeded”.  In fact, I got one this morning...well, guess what?  I got a call from the client today (he's really, really cool) in which he said “all of the backup folders are empty”.  After picking my jaw up from the floor and thinking “oh crap...WTF am I going to do if they've had a major disaster?”, I asked if I could connect remotely and troubleshoot the problem.  He said yes and away I went.

Because the backup had been running so well, I had archived the source code earlier this year, so I had to do some digging to find it.  I asked him to run a couple of tests (which worked).  I looked at the log files (my apps have really detailed log files)...then I had him kick off a manual backup and told him I'd dig through the code to find out what the deal was.

Well...I found the code and saw that it was doing the right thing...and if it wasn't, it would fail.  So...why the success emails?  I called him back to let him know the manual backup was working when he interrupted me and said “would it matter if the drive was low on disk space?”.  I said yes, and that's when he explained that the drive was down to only a couple hundred megabytes.  So...here's the deal...my backup zips the files...the zip process was succeeding (it was doing it all in RAM), but it was failing to actually write the files to disk.  As soon as he freed up some disk space (emptied the recycle bin, cleared the IE cache, etc.), everything started working.

So, now I need to dig into my backup program to find out if there is any way I can trap that condition so we stop getting the “success” emails if the backup didn't actually do anything.
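
Here's a rough sketch of the kind of check I have in mind. It's in Python rather than the Perl my backup program is actually written in, and the function names (has_free_space, verify_archive) are made up for illustration, so treat it as an idea rather than the fix. The assumption is that the backup knows which file names it meant to put in the archive. After the zip step reports success, re-open what actually landed on disk and confirm it is really there before sending any email.

```python
import os
import shutil
import zipfile


def has_free_space(dest_dir, needed_bytes):
    """Return True if the destination volume has at least needed_bytes free."""
    return shutil.disk_usage(dest_dir).free >= needed_bytes


def verify_archive(archive_path, expected_names):
    """Return a list of problems; an empty list means the archive looks good."""
    if not os.path.isfile(archive_path):
        return ["archive file was never written"]
    if os.path.getsize(archive_path) == 0:
        return ["archive file is empty"]
    if not zipfile.is_zipfile(archive_path):
        return ["archive is not a readable zip file"]

    problems = []
    with zipfile.ZipFile(archive_path) as zf:
        # testzip() re-reads every member and checks its CRC;
        # it returns the first bad member name, or None if all are fine.
        bad_member = zf.testzip()
        if bad_member is not None:
            problems.append("corrupt member: " + bad_member)
        # Confirm every file we meant to back up is actually in the archive.
        missing = set(expected_names) - set(zf.namelist())
        for name in sorted(missing):
            problems.append("missing from archive: " + name)
    return problems
```

The idea would be to call has_free_space before starting (as an early warning) and verify_archive afterwards, and only send the “backup succeeded” email when verify_archive returns an empty list. Anything else should go out as a failure, with the list of problems in the message.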

More on this topic later...

Posted on Friday, August 13, 2004 9:27 AM
