InfoScale™ 9.0 Storage Foundation and High Availability Configuration and Upgrade Guide - Linux
- Section I. Introduction to SFHA
- Section II. Configuration of SFHA
- Preparing to configure
- Preparing to configure SFHA clusters for data integrity
- About planning to configure I/O fencing
- Setting up the CP server
- Configuring the CP server manually
- Configuring CP server using response files
- Configuring SFHA
- Configuring Storage Foundation High Availability using the installer
- Configuring a secure cluster node by node
- Completing the SFHA configuration
- Verifying and updating licenses on the system
- Configuring Storage Foundation High Availability using the installer
- Configuring SFHA clusters for data integrity
- Setting up disk-based I/O fencing using installer
- Setting up server-based I/O fencing using installer
- Manually configuring SFHA clusters for data integrity
- Setting up disk-based I/O fencing manually
- Setting up server-based I/O fencing manually
- Configuring server-based fencing on the SFHA cluster manually
- Setting up non-SCSI-3 fencing in virtual environments manually
- Setting up majority-based I/O fencing manually
- Performing an automated SFHA configuration using response files
- Performing an automated I/O fencing configuration using response files
- Section III. Upgrade of SFHA
- Planning to upgrade SFHA
- Preparing to upgrade SFHA
- Upgrading Storage Foundation and High Availability
- Performing a rolling upgrade of SFHA
- Performing a phased upgrade of SFHA
- About phased upgrade
- Performing a phased upgrade using the product installer
- Performing an automated SFHA upgrade using response files
- Upgrading SFHA using YUM
- Performing post-upgrade tasks
- Post-upgrade tasks when VCS agents for VVR are configured
- About enabling LDAP authentication for clusters that run in secure mode
- Planning to upgrade SFHA
- Section IV. Post-installation tasks
- Section V. Adding and removing nodes
- Adding a node to SFHA clusters
- Adding the node to a cluster manually
- Adding a node using response files
- Configuring server-based fencing on the new node
- Removing a node from SFHA clusters
- Removing a node from a SFHA cluster
- Removing a node from a SFHA cluster
- Adding a node to SFHA clusters
- Section VI. Configuration and upgrade reference
- Appendix A. Installation scripts
- Appendix B. SFHA services and ports
- Appendix C. Configuration files
- Appendix D. Configuring the secure shell or the remote shell for communications
- Appendix E. Sample SFHA cluster setup diagrams for CP server-based I/O fencing
- Appendix F. Configuring LLT over UDP
- Using the UDP layer for LLT
- Manually configuring LLT over UDP using IPv4
- Using the UDP layer of IPv6 for LLT
- Manually configuring LLT over UDP using IPv6
- About configuring LLT over UDP multiport
- Appendix G. Using LLT over RDMA
- Configuring LLT over RDMA
- Configuring RDMA over an Ethernet network
- Configuring RDMA over an InfiniBand network
- Tuning system performance
- Manually configuring LLT over RDMA
- Troubleshooting LLT over RDMA
About configuring SFHA clusters for data integrity
When a node fails, SFHA takes corrective action and configures its components to reflect the altered membership. If an actual node failure did not occur and if the symptoms were identical to those of a failed node, then such corrective action would cause a split-brain situation.
Some example scenarios that can cause such split-brain situations are as follows:
Broken set of private networks
If a system in a two-node cluster fails, the system stops sending heartbeats over the private interconnects. The remaining node then takes corrective action. The failure of the private interconnects, instead of the actual nodes, presents identical symptoms and causes each node to determine its peer has departed. This situation typically results in data corruption because both nodes try to take control of data storage in an uncoordinated manner.
System that appears to have a system-hang
If a system is so busy that it appears to stop responding, the other nodes could declare it as dead. This declaration may also occur for the nodes that use the hardware that supports a "break" and "resume" function. When a node drops to PROM level with a break and subsequently resumes operations, the other nodes may declare the system dead. They can declare it dead even if the system later returns and begins write operations.
I/O fencing is a feature that prevents data corruption in the event of a communication breakdown in a cluster. SFHA uses I/O fencing to remove the risk that is associated with split-brain. I/O fencing allows write access for members of the active cluster. It blocks access to storage from non-members so that even a node that is alive is unable to cause damage.
After you install Veritas InfoScale Enterprise and configure SFHA, you must configure I/O fencing in SFHA to ensure data integrity.