“A fault domain is a set of hardware components – computers, switches, and more – that share a single point of failure.”
Nutanix provides many ways of achieving resiliency and fault tolerance.
For ease of explanation, let us focus on how block awareness works on an RF=2 cluster. An RF=2 cluster stores two copies of the data at the 4MB extent group level: one copy on the local node and the other on a randomly chosen remote node.
In NOS 3.x and 4.0, a block is considered a fault domain. This means that if a single block fails, data, metadata, and configuration data remain available, because the replicas are placed in different blocks.
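To make the placement rule concrete, here is a minimal Python sketch of block-aware RF=2 replica selection. This is illustrative only, not Nutanix code; the node names and block labels are invented:

```python
import random

# Assumption for illustration: each node is labeled with the block
# (rackable unit) it lives in.
NODE_TO_BLOCK = {
    "node-A1": "block-A", "node-A2": "block-A",
    "node-B1": "block-B", "node-B2": "block-B",
    "node-C1": "block-C", "node-C2": "block-C",
}

def pick_replica_node(local_node: str) -> str:
    """Pick the second replica for a 4MB extent group written on local_node.

    The first copy stays local; the second goes to a random node in a
    *different* block, so losing any single block leaves one copy intact.
    """
    local_block = NODE_TO_BLOCK[local_node]
    candidates = [n for n, b in NODE_TO_BLOCK.items() if b != local_block]
    return random.choice(candidates)

print(pick_replica_node("node-A1"))  # always a node in block-B or block-C
```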
In future versions, the fault domain may become customizable, based on rack location, data center location, or an arbitrary set of nodes.
History:
NOS 3.x is block aware for extent groups (data) and metadata. Configuration servers can be configured or migrated to make them block aware. The oplog, however, is not block aware, so a block failure can leave some data unavailable.
4.0 - Block Awareness:
In NOS 4.0, this framework is extended to provide block awareness for the oplog as well.
If a new cluster is created with three uniform blocks, NOS will place the configuration nodes in different blocks, form a block-aware metadata ring, and place data (oplog and extent groups) in a block-aware fashion. After the cluster is formed, a Curator scan confirms that the cluster is block aware and updates the block-aware status.
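The ring-formation step can be pictured with a short sketch: with uniform blocks, interleaving nodes round-robin across blocks guarantees that adjacent ring positions, which hold replicas of the same metadata ranges, land in different blocks. The following Python is an illustrative sketch under that assumption, not the actual ring-formation algorithm:

```python
from itertools import cycle

# Assumption for illustration: three uniform blocks of two nodes each.
BLOCKS = {
    "block-A": ["node-A1", "node-A2"],
    "block-B": ["node-B1", "node-B2"],
    "block-C": ["node-C1", "node-C2"],
}

def block_aware_ring(blocks):
    """Interleave nodes round-robin across blocks to build the ring."""
    ring = []
    pools = {b: list(nodes) for b, nodes in blocks.items()}
    for block in cycle(sorted(pools)):
        if all(not p for p in pools.values()):
            break
        if pools[block]:
            ring.append(pools[block].pop(0))
    return ring

ring = block_aware_ring(BLOCKS)
print(ring)  # ['node-A1', 'node-B1', 'node-C1', 'node-A2', 'node-B2', 'node-C2']
# Adjacent ring neighbors are always in different blocks, so metadata
# replicas written to consecutive ring nodes survive a single block failure.
```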
Clusters formed in 3.x are already block aware for metadata, data, and configuration. After an upgrade to 4.x, new oplog writes are block aware, and since the oplog drains quickly, the cluster's oplog becomes block aware within a few hours of the upgrade.
For clusters created on pre-3.x NOS versions, there is a non-disruptive method to make the configuration and metadata ring block aware.
In NOS 4.0, the Prism UI shows whether the cluster is block and/or node fault tolerant.
Commands to verify the fault tolerance state:
- ncli cluster get-fault-tolerance-state
- ncli cluster get-domain-fault-tolerance-status type=rackable_unit
- ncli cluster get-domain-fault-tolerance-status type=node
- ncli rack ls (lists how many blocks are in the cluster)