Split RAID: Proposal for archival RAID using incremental batch checksum
Greg Freemyer
greg.freemyer at gmail.com
Sat Nov 22 09:03:42 EST 2014
On Sat, Nov 22, 2014 at 8:22 AM, Anshuman Aggarwal
<anshuman.aggarwal at gmail.com> wrote:
> By not using stripes, we restrict writes to just one drive, with the
> XOR output going to the parity drive; that is what allows the delayed,
> batched checksum (resulting in fewer writes to the parity drive). The
> intention is that if a drive fails we might lose one or two movies,
> but the rest is restorable from parity.
>
> Another advantage over RAID 5 or RAID 6 is that in the event of
> multiple drive failures we only lose the content on the failed drives,
> not the whole cluster/RAID.
>
> Did I clarify better this time around?
I still don't understand the delayed checksum/parity.
With classic RAID 4, writing 1 GB of data to just D1 would first
require reading 1 GB from D1 and 1 GB from P, then writing 1 GB to
each of D1 and P: 4 GB worth of I/O in total.
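To make that 4 GB figure concrete, here is a minimal sketch in C of the
read-modify-write parity update that classic RAID 4/5 performs per block.
It is illustration only, not md's actual code; the block size and the
update_parity() helper are made up for this example:

/* Read-modify-write parity update for one rewritten data block:
 *   P_new = P_old ^ D_old ^ D_new
 * The array must read the old data block and the old parity block,
 * then write the new data block and the new parity block:
 * four block transfers for every block of new data. */
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

#define BLOCK 8   /* toy block size; real arrays use much larger chunks */

static void update_parity(uint8_t *parity, const uint8_t *old_data,
                          const uint8_t *new_data, size_t len)
{
    for (size_t i = 0; i < len; i++)
        parity[i] ^= old_data[i] ^ new_data[i];
}

int main(void)
{
    uint8_t d1_old[BLOCK] = { 1, 2, 3, 4, 5, 6, 7, 8 };
    uint8_t d2[BLOCK]     = { 9, 9, 9, 9, 9, 9, 9, 9 };
    uint8_t d1_new[BLOCK] = { 0xAA, 0xBB, 0xCC, 0xDD, 0, 0, 0, 0 };
    uint8_t parity[BLOCK];

    /* Initial parity over the two data drives. */
    for (size_t i = 0; i < BLOCK; i++)
        parity[i] = d1_old[i] ^ d2[i];

    /* Overwrite D1: read old D1 and old P, write new D1 and new P. */
    update_parity(parity, d1_old, d1_new, BLOCK);

    /* Check: D1 can still be rebuilt from P and D2 alone. */
    for (size_t i = 0; i < BLOCK; i++)
        printf("%02x %s\n", parity[i] ^ d2[i],
               (parity[i] ^ d2[i]) == d1_new[i] ? "ok" : "BAD");
    return 0;
}

Whatever the batching scheme is, it would have to avoid some of those
four transfers per block to come out ahead, which is what the questions
below are trying to pin down.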
With your proposal, if you stream 1 GB of data to a file on D1:
- Does the old/previous data on D1 have to be read?
- How much data goes to the parity drive?
- Does the old data on the parity drive have to be read?
- Why does delaying the parity update reduce that I/O volume compared
to RAID 4?
- In the event drive 1 fails, can its content be re-created from the
other drives?
Greg
--
Greg Freemyer