{"id":7435,"date":"2014-06-19T03:54:10","date_gmt":"2014-06-19T03:54:10","guid":{"rendered":"https:\/\/unknownerror.org\/index.php\/2014\/06\/19\/single-nfs-server-to-redunant-nfs-storage-collection-of-common-programming-errors\/"},"modified":"2014-06-19T03:54:10","modified_gmt":"2014-06-19T03:54:10","slug":"single-nfs-server-to-redunant-nfs-storage-collection-of-common-programming-errors","status":"publish","type":"post","link":"https:\/\/unknownerror.org\/index.php\/2014\/06\/19\/single-nfs-server-to-redunant-nfs-storage-collection-of-common-programming-errors\/","title":{"rendered":"Single NFS-server to redundant NFS-storage-Collection of common programming errors"},"content":{"rendered":"<p>Redundant NFS (in fact, <em>any<\/em> redundant storage) is not trivial.<br \/>\nPlan to spend a good amount of time (and capital) on this if you really want it to work well.<\/p>\n<p>There are generally two options available to you:<\/p>\n<h2>Option 1: Buy redundant storage devices<\/h2>\n<p>This is the fastest (and usually most expensive) option. Pick a vendor who makes a storage device with redundancy features that meet your needs, give them the company credit card, and try not to get tears on the invoice.<\/p>\n<p>The two major benefits of this route are that it&#8217;s fast (you get a pre-built solution you can just roll out by following the manual) and it&#8217;s supported (if you have a problem you call the vendor and yell until they fix it).<\/p>\n<h2>Option 2: Build it yourself<\/h2>\n<p>This site has a good outline of building a redundant iSCSI\/NFS cluster using Debian Linux. 
It&#8217;s from 2009, but the principles are sound.<br \/>\nSpecific step-by-step instructions on how to build this sort of environment are beyond the scope of Server Fault, but I can give you a rough outline of what you&#8217;ll need:<\/p>\n<ul>\n<li><strong>Shared (or replicated) storage<\/strong> In order to have redundancy on your storage layer you need to have the same data accessible from multiple locations &#8211; either by replicating it in real time, or by connecting everything to a shared pool of disks. A SAN is the usual way to meet the shared storage requirement. This is still a single point of failure, but when you put all your eggs in one of these baskets you make sure it&#8217;s a VERY good basket.\n<p>DRBD or ZFS replication can meet the requirement for replicated storage if you elect to go that route &#8211; it&#8217;s probably cheaper than a SAN, and both technologies have developed to a very reliable state.<\/p>\n<\/li>\n<li><strong>Multiple &#8220;front-end&#8221; systems<\/strong> Now that you have the storage worked out, you need to make it accessible through redundant &#8220;front-end&#8221; systems &#8211; these are the machines that are running the NFS server (or whatever you use to serve up the disk to clients).\n<p>You need at least two, running high-availability\/failover software so if\/when you lose one, the other can take over. 
IP failover is the &#8220;easy&#8221; option here (if one box goes down the other assumes the &#8220;live&#8221; IP address).<\/p>\n<\/li>\n<li><strong>Multiple physical paths to storage<\/strong> All the storage redundancy in the world doesn&#8217;t help you if everything goes through one wire.\n<p>You need to ensure that the client machines have multiple physical paths to get back to the storage front-ends; otherwise a failed switch leaves you with the same single-point-of-failure situation you&#8217;re trying to get out of.<\/p>\n<\/li>\n<\/ul>\n<p>Building your own redundant storage usually takes longer than a vendor solution, and you&#8217;re supporting it yourself (which means you need to be comfortable with the technology involved).<br \/>\nThe major advantages are cost (you can often build the environment more cheaply than vendor-provided solutions) and flexibility (you can tailor the solution to meet your needs and integrate with other parts of your environment &#8211; for example your backup system).<\/p>\n<h3>Stuff you need either way<\/h3>\n<p>You will need <strong>a test plan<\/strong> prior to going live in production.<br \/>\nIdeally you should have it before you even start your build-out (knowing what failures you&#8217;re defending against will help you design your system).<\/p>\n<p>Your goal in testing is to demonstrate that the absolute worst confluence of failures will not leave you in a position where you&#8217;re losing data (and ideally won&#8217;t cause an outage because your storage became inaccessible).<br \/>\nYou may not find or test <em>every<\/em> possible failure scenario, but write down all the ones you can think of and make sure to test them. 
You don&#8217;t want to wait until your first day of live production use to discover that losing one disk in the standby machine can cause the primary to crash &#8212; at that point it&#8217;s too late to fix.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Redundant NFS (in fact, any redundant storage) is not trivial. Plan to spend a good amount of time (and capital) on this if you really want it to work well. There are generally two options available to you: Option 1: Buy redundant storage devices This is the fastest (and usually most expensive) option. Pick a [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-7435","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/posts\/7435","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/comments?post=7435"}],"version-history":[{"count":0,"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/posts\/7435\/revisions"}],"wp:attachment":[{"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/media?parent=7435"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/categories?post=7435"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/tags?post=7435"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}