VPS Node Outage :: Node 2 :: 6AM CST : FRI 13 FEB 2009
At 6AM CST on Friday, 13 February 2009, a RAID card failed and caused severe file damage on node 2 of our VPS system. Our technicians first attempted to recover the fileset from the hardware node and to rebuild the machine with new parts.
It soon became clear that the server was unrecoverable. Technicians found heat damage to the motherboard and, unfortunately, to the hard drives. The faulty RAID card had allowed bad data to be written to the drives. Technicians replaced the bad hardware and attempted to recover the fileset, without success.
Technicians placed the server aside for possible data recovery attempts later.
A brand new quad-processor, quad-core Intel system with a 6-drive RAID 10 array and 32GB of RAM was built as a replacement. This machine is almost fully functional and ready to go live.
While the new machine is being prepared, we have activated our ZipSafe automated VPS backup system. VPS backups are stored off server and off site. These are incremental backups taken periodically for an event such as this. The backups are being restored to available nodes. Restores are delayed because the backups are kept off site for further protection and take some time to transfer. Because backups are taken periodically, the restored copy may not contain the very latest of your data.
Once the new node is completed, all VPS accounts will be migrated to the new server. This will be a live migration, so customers should not notice it. At that time, all accounts will be fully functional with our control panel again.
Affected customers will receive a 100% credit for a full month of service for this outage, per our SLA guidelines. SLA credits will be handled once this issue is fully resolved.
The good news: ZipSafe backups work. They add an extra layer of protection on top of customers' own backups.
Affected customers will be on a brand new VPS node with twice the capacity of the old system. It is also upgradeable to 96GB of RAM. Once on the new server, customer VPS accounts should be noticeably faster, as the new server has almost twice the processing power and RAM of the old system.
This system is nearly ready. Technicians have been burning it in while preparing it, to ensure it will be stable. We certainly do not want another outage.
This post will be updated with new information as soon as possible.