Aug 30, 2006

NetApp Seminar

Went to a small vendor seminar showcasing some NetApp technologies and came away with some interesting information:

* The probability of a write failure is pretty small (it's usually buried in the legalese small print), but that small possibility increases as disk space increases, which is why a generic RAID of small disks is more reliable than a RAID of really big disks. Those consumer-grade 500GB and 1TB disks are looking slightly less attractive now. If a disk dies and you go to reconstruct the array, the rebuild has to read every surviving disk end to end, so you could conceivably end up with a second failure due to a tiny write error, and then you're screwed.

* NetApp get around this with RAID-DP, their variation on RAID 6 (like RAID 5 but with two parity disks). Any performance hit (and it's significant if you set this up using a normal controller) is offset by NetApp's smart controller, which is why storage vendors charge a premium for data security.

* Fibre Channel is big with Unix shops and iSCSI is big with Windows shops. Surprisingly, NFS over IP is still popular in Unix-land too.

* Snapshotting now encompasses databases and mailstores. The snapshot facility imposes a much lower performance overhead than a similar EMC device (granted, they would say that). Apparently companies are moving away from tape-based backup to disk-based, keeping tapes around purely for occasional snapshots and compliance reasons.

* NetApp do 'thin provisioning': essentially you can lie about your storage capacity (present 1 physical TB as 2 virtual TB). This was apparently implemented in response to the lies developers would tell their admins, DBAs and storage managers. Once everyone had added in their own comfort factor, it was discovered that only about 40% of the capacity was utilised and the rest was wasted. Pooling storage in a NAS or SAN and over-subscribing it means you can shuffle the space around depending on your needs at the time. Apparently the key is the forecasting tools, which help you predict when you'll run out of space. It also tends to work better in multi-terabyte shops than in gigabyte shops.

* You can now stream snapshots between filers in different locations (for DR / BCP / replication) over any IP link (one NZ client does this over dialup to a location halfway around the world). This is possible because of the small 4K block size NetApp use for storage: at the device level it only replicates the changed blocks rather than the entire changed file.
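To get a feel for why block-level replication can work over such a thin link, here is a rough Python sketch of the idea. It is not how a filer actually does it (a real device tracks changed blocks in its snapshot metadata rather than comparing data), and the function names are made up for illustration. It just compares two copies of some data in 4K blocks and keeps only the blocks that differ.

```python
BLOCK_SIZE = 4096  # 4K blocks, the size mentioned at the seminar


def split_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Cut a byte string into fixed-size blocks (the last one may be short)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def changed_blocks(old: bytes, new: bytes,
                   block_size: int = BLOCK_SIZE) -> dict[int, bytes]:
    """Return {block_index: new_block} for every block that differs."""
    old_blocks = split_blocks(old, block_size)
    new_blocks = split_blocks(new, block_size)
    delta = {}
    for index, block in enumerate(new_blocks):
        # A block is "changed" if it is new or its contents differ.
        if index >= len(old_blocks) or old_blocks[index] != block:
            delta[index] = block
    return delta


def apply_delta(old: bytes, delta: dict[int, bytes], new_length: int,
                block_size: int = BLOCK_SIZE) -> bytes:
    """Rebuild the new copy on the remote side from the old copy plus the delta."""
    blocks = split_blocks(old[:new_length], block_size)
    for index, block in sorted(delta.items()):
        if index < len(blocks):
            blocks[index] = block
        else:
            blocks.append(block)
    return b"".join(blocks)[:new_length]


if __name__ == "__main__":
    old = bytes(1024 * 1024)            # a 1 MiB "file" of zeros
    buf = bytearray(old)
    buf[5000:5010] = b"x" * 10          # a 10-byte change in the middle
    new = bytes(buf)

    delta = changed_blocks(old, new)
    to_send = sum(len(block) for block in delta.values())
    print(f"{len(delta)} block(s), {to_send} bytes to send instead of {len(new)}")
    assert apply_delta(old, delta, len(new)) == new
```

In this toy case a 10-byte change in a 1 MiB file means shipping a single 4K block rather than the whole file, which is the kind of saving that makes a dialup link plausible.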

It's always nice to hear vendor 'war stories'. Apparently, after eBay had their extended site outage in 2001 they called in Oracle, who looked into the database side and found no problem with the backend software. Some more (extensive) digging pinpointed the fault in disk firmware code: when the disk faulted, the error was propagated up through the application layers and eventually killed the site. After this Oracle came up with their HARD initiative (essentially a database designed and implemented for the extremely paranoid), which computes its own checksums on data as it's written, so it provides an extra layer of protection on top of the storage layer.
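The checksum-on-write idea is easy to sketch. The Python below is only a generic illustration, not Oracle's actual block format or algorithm: it assumes a plain CRC32 (from the standard `zlib` module) stuck on the front of each block, and the names `seal`, `unseal` and `ChecksumError` are invented for the example.

```python
import zlib


class ChecksumError(Exception):
    """Raised when a stored block no longer matches its checksum."""


def seal(payload: bytes) -> bytes:
    """Prefix the payload with a 4-byte CRC32 computed at write time."""
    checksum = zlib.crc32(payload)
    return checksum.to_bytes(4, "big") + payload


def unseal(block: bytes) -> bytes:
    """Verify the CRC32 on read and return the payload, or raise ChecksumError."""
    if len(block) < 4:
        raise ChecksumError("block too short to hold a checksum")
    stored = int.from_bytes(block[:4], "big")
    payload = block[4:]
    if zlib.crc32(payload) != stored:
        raise ChecksumError("payload does not match stored checksum")
    return payload


if __name__ == "__main__":
    block = seal(b"some row data")
    assert unseal(block) == b"some row data"

    # Simulate a disk or firmware fault flipping one bit after the write.
    corrupted = bytearray(block)
    corrupted[-1] ^= 0x01
    try:
        unseal(bytes(corrupted))
    except ChecksumError as exc:
        print("caught corruption:", exc)
```

The point is that the layer that wrote the data can tell when something underneath has mangled it, instead of trusting the storage and passing the damage up the stack.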

Another interesting Oracle-specific tale outlined their datacenter, which uses blades and NetApp appliances extensively (storing petabytes of data). They worked through the economics of a Fibre Channel HBA infrastructure for their blades and went with NFS over IP instead, working out that 1 x blade + 2 FC HBAs (for redundancy) was much more expensive than 1 x blade + 2 built-in teamed Gb NICs (and they were willing to wear the performance penalty). NFS also allows them to manage a central pool of storage rather than carving out chunks for direct-attached storage. Interesting.

Apparently a big leap of faith for DBAs is to let the device handle and manage the storage rather than thinking about spindle count. Once they get over that they can forget about the storage and focus on the database.

Permalink | 2006.08.30-21:59.00