[rt-users] Migrating from Postgres to MySQL

Matt Simerson matt at corp.spry.com
Wed Jul 29 14:30:54 EDT 2009


On Jul 29, 2009, at 10:44 AM, William Graboyes wrote:

> Hi Matt,
>
> RAID is not the end-all be-all for disk safety, especially when you
> step into terabyte-class computing. Sorry, I am taking this a bit off
> topic. While RAID has its bonuses, there are drawbacks as well.
> Take your standard RAID 5 setup: 4 disks, 3 active, 1 hot spare.
> Now let's say that disk number 2 decides to release its
> smoke to the world (never a good thing). Your array is still
> alive and starts to rebuild onto disk 4 to make up for the
> death of disk 2. During the rebuild process, disk 1 comes across a
> bad sector and, poof, your data is gone. Just a word of warning: don't
> put all your data safety eggs into the RAID basket.

RAID poorly implemented is a placebo, and is often less reliable than  
a single disk. RAID properly implemented IS the end-all be-all of disk  
safety.

Your chosen argument is only valid against RAID level 5, which would
be a very poor choice for a database application. RAID-5 is a poor
choice in any environment where the data set is volatile and valuable,
especially when you consider the number of hours (or days) it takes to
rebuild a hot spare into the RAID-5 set. (HINT: test that before
deployment!) During that rebuild window, your system performance is
heavily degraded and the array is extremely vulnerable. If the
performance of your RAID system is halved, is that sufficient for your
application, or is your system effectively down during the rebuild
period? (HINT: test before deployment!)
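
To put rough numbers on that rebuild window, here is a minimal sketch in
Python. Everything numeric in it is an assumption for illustration, not a
measurement: the 1 TB disk size, the 30 MB/s sustained rebuild rate, and
the unrecoverable read error rate of 1 per 10^14 bits (a figure commonly
quoted on consumer drive data sheets). It also assumes read errors are
independent, which real disks do not guarantee. The function names are
made up for this example.

```python
import math


def rebuild_hours(disk_bytes, rebuild_bytes_per_sec):
    """Hours needed to rewrite one whole disk at a sustained rebuild rate."""
    return disk_bytes / rebuild_bytes_per_sec / 3600.0


def rebuild_read_error_probability(surviving_disks, disk_bytes, ure_per_bit=1e-14):
    """Chance of at least one unrecoverable read error during a RAID-5 rebuild.

    A RAID-5 rebuild has to read every surviving disk in full, so the
    number of bits read is surviving_disks * disk_bytes * 8.
    ure_per_bit is an ASSUMED error rate; check your drive's data sheet.
    """
    bits_read = surviving_disks * disk_bytes * 8
    # 1 - (1 - p)^n, written with log1p/expm1 to stay accurate for tiny p.
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))


if __name__ == "__main__":
    one_tb = 10**12  # assumed disk size
    # The quoted example: 3 active disks, one dead, so 2 survivors are read.
    print("rebuild hours:", rebuild_hours(one_tb, 30 * 10**6))
    print("P(read error):", rebuild_read_error_probability(2, one_tb))
```

None of this replaces timing a real rebuild on your own hardware under
production load. It only shows why the risk grows with disk size and with
the number of disks that must be read.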

But a RAID-1 or RAID-10 can be extremely robust, remaining online and
performing optimally during multiple catastrophic disk failures. You
can mirror the data to as many spindles as you need to ensure data
integrity. When disks fail, rebuilding a mirror disk usually takes
less than an hour.

The systems engineer has to choose between data integrity,  
performance, and storage efficiency.
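
A small sketch of that trade-off, again in Python. The function name and
layout labels are hypothetical, and it covers only two of the three
factors (storage efficiency and the number of disk failures a layout is
guaranteed to survive). Performance has to be measured, not computed.

```python
def raid_summary(level, disks, disk_tb, mirror_ways=2):
    """Usable capacity and guaranteed fault tolerance for a simple layout.

    level: "raid5", "raid1" (one n-way mirror) or "raid10"
    (striped groups of mirror_ways disks each). Hot spares are not counted.
    """
    if level == "raid5":
        if disks < 3:
            raise ValueError("RAID-5 needs at least 3 disks")
        usable = (disks - 1) * disk_tb  # one disk's worth of parity
        tolerated = 1
    elif level == "raid1":
        if disks < 2:
            raise ValueError("a mirror needs at least 2 disks")
        usable = disk_tb  # every disk holds the same data
        tolerated = disks - 1
    elif level == "raid10":
        if disks < 2 * mirror_ways or disks % mirror_ways:
            raise ValueError("disk count must be a multiple of mirror_ways")
        usable = (disks // mirror_ways) * disk_tb
        # Worst case: all failures land in the same mirror group.
        tolerated = mirror_ways - 1
    else:
        raise ValueError("unknown level: " + level)
    return {
        "usable_tb": usable,
        "efficiency": usable / (disks * disk_tb),
        "guaranteed_failures": tolerated,
    }


if __name__ == "__main__":
    for args in (("raid5", 3, 1.0), ("raid10", 4, 1.0), ("raid1", 3, 1.0)):
        print(args, raid_summary(*args))
```

The guaranteed_failures value is the worst case. A RAID-10 set often
survives more failures than that, as long as they hit different mirror
groups.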

One of my systems is about 2/3 full with 24.64 TB of data. Does that  
count as "terabyte class computing?"  :)  I'm using ZFS to manage  
it. :)  :)

Matt


