Cluster Server 8.0 Application Note: Dynamic Reconfiguration for Oracle Servers - Solaris
Performing dynamic reconfiguration on a CPU/memory board
You may want to remove a CPU/memory board that is malfunctioning, or reassign a board to another domain where it is needed more.
To reassign a board from one domain to another, you must unconfigure it from one domain and reassign it to another domain. This can be done without physically removing the board from its slot. To replace a board, however, you must unconfigure it from one domain, physically remove it, add its replacement board and reconfigure it to the domain.
Use the following procedures to dynamically reconfigure a CPU/memory board.
To determine the status of the board you are reconfiguring
- If necessary, log in as the administrator to the domain containing the CPU/memory board.
- Determine the attachment point of the board you are removing:
# cfgadm
Ap_Id     Type   Receptacle   Occupant     Cond
.
N0.SB2    CPU    connected    configured   ok
.
- Check whether the board has permanent memory.
See “To determine if the CPU/memory board has permanent memory”.
If the board you want to dynamically reconfigure contains permanent memory, first stop VCS using the procedures described in Stopping and starting VCS.
If the board you want to reconfigure does not contain permanent memory, you can proceed to dynamically reconfigure it.
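On a domain with many slots, the attachment point can be picked out of the listing by filtering on the Type column. The following is a minimal sketch, assuming the five-column cfgadm layout shown above; ap_ids_of_type is a hypothetical helper name, not part of cfgadm.

```shell
# ap_ids_of_type (hypothetical helper): read a cfgadm listing on stdin and
# print the Ap_Id of every row whose Type column equals $1.
# Assumes the column order shown in this note: Ap_Id Type Receptacle Occupant Cond.
ap_ids_of_type() {
    awk -v t="$1" '$2 == t { print $1 }'
}
```

On the domain, a possible use is: cfgadm | ap_ids_of_type CPU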
To unbind processes bound to CPU on the board
- To determine if any processes are bound to a CPU, enter:
# pbind -q
- If a process is bound to a CPU, the output indicates the process ID and the ID number of the CPU.
process id 650: 0
- If you see no output, or the output shows no processes bound to a CPU on the board you are reconfiguring, perform the steps in To unconfigure the board.
- Unbind all processes bound to the CPU on the board. For example, enter:
# pbind -u 650
- Rebind the processes to a processor on another board, if necessary. For example, bind process 650 to processor with ID 9, which is on another board, using the command:
# pbind -b 650 9
- If you attempt to unconfigure a board with processes bound to it, you receive a message that resembles:
cfgadm: Hardware specific failure: unconfigure SB15: Failed to off-line:dr@0:SB15::cpu3
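When many processes are bound, the pbind -q output can be filtered for the CPUs that sit on the board being removed. The following is a minimal sketch, assuming the "process id PID: CPU" line format shown above (other Solaris releases may print a different format) and assuming you already know which CPU IDs belong to the board; bound_pids is a hypothetical helper name.

```shell
# bound_pids (hypothetical helper): read `pbind -q` output on stdin and print
# the PID of every process bound to one of the CPU IDs given as arguments.
# Assumes lines of the form "process id 650: 0".
bound_pids() {
    cpus=" $* "
    while read -r w1 w2 pid cpu; do
        [ -n "$cpu" ] || continue      # skip blank or short lines
        pid=${pid%:}                   # "650:" -> "650"
        case "$cpus" in
            *" $cpu "*) echo "$pid" ;;
        esac
    done
}
```

For example, if the board holds CPUs 0 through 3, pbind -q | bound_pids 0 1 2 3 lists the processes that need pbind -u before the board can be unconfigured.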
To unconfigure the board
- Unconfigure and disconnect the board:
# cfgadm -v -c disconnect SB2
- If the board does not contain permanent memory, the command's output resembles the following with slight variations for each server:
request delete capacity (4 cpus)
request delete capacity (2097152 pages)
request delete capacity SB2 done
request offline SUNW_cpu/cpu448
request offline SUNW_cpu/cpu449
request offline SUNW_cpu/cpu450
request offline SUNW_cpu/cpu451
request offline SUNW_cpu/cpu448 done
request offline SUNW_cpu/cpu449 done
request offline SUNW_cpu/cpu450 done
request offline SUNW_cpu/cpu451 done
unconfigure SB2
unconfigure SB2 done
notify remove SUNW_cpu/cpu448
notify remove SUNW_cpu/cpu449
notify remove SUNW_cpu/cpu450
notify remove SUNW_cpu/cpu451
notify remove SUNW_cpu/cpu448 done
notify remove SUNW_cpu/cpu449 done
notify remove SUNW_cpu/cpu450 done
notify remove SUNW_cpu/cpu451 done
disconnect SB2
disconnect SB2 done
poweroff SB2
poweroff SB2 done
unassign SB2 skipped
Skip to the step that verifies the board is disconnected and unconfigured.
- If the board has permanent memory, the system prompts you to proceed:
System may be temporarily suspended; proceed (yes/no)?
If you answer yes, dynamic reconfiguration proceeds. The system is suspended during reconfiguration. When the system resumes operation on another board, the board you are reconfiguring is disconnected. If the disconnect operation succeeds, the output resembles the following, with slight variations for different servers:
request suspend SUNW_OS
request suspend SUNW_OS done
request delete capacity (2097152 pages)
request delete capacity SB15 done
request offline SUNW_cpu/cpu480
request offline SUNW_cpu/cpu481
request offline SUNW_cpu/cpu482
request offline SUNW_cpu/cpu483
request offline SUNW_cpu/cpu480 done
request offline SUNW_cpu/cpu481 done
request offline SUNW_cpu/cpu482 done
request offline SUNW_cpu/cpu483 done
unconfigure SB15
unconfigure SB15 done
notify remove SUNW_cpu/cpu480
notify remove SUNW_cpu/cpu481
notify remove SUNW_cpu/cpu482
notify remove SUNW_cpu/cpu483
notify remove SUNW_cpu/cpu480 done
notify remove SUNW_cpu/cpu481 done
notify remove SUNW_cpu/cpu482 done
notify remove SUNW_cpu/cpu483 done
disconnect SB15
disconnect SB15 done
poweroff SB15
Skip to the step that verifies the board is disconnected and unconfigured.
Note:
If there are real-time processes running on the board you are unconfiguring, the disconnect operation may not succeed. You must stop these processes in the appropriate manner before continuing with dynamic reconfiguration.
- If the board has real-time processes that must be stopped, the dynamic reconfiguration operation fails and the error message reports the PID of the running process. There may be slight variations in output for different Oracle Sun Enterprise servers.
For example:
.
.
notify remove SUNW_cpu/cpu481 done
notify remove SUNW_cpu/cpu482 done
notify remove SUNW_cpu/cpu483 done
cfgadm: Hardware specific failure: unconfigure SB15: Cannot quiesce realtime thread: 621
- To determine the name of the process, use the command:
# ps -ef | grep PID
- Stop the process in the appropriate manner. In this example, the process is stopped using the kill command:
# kill -9 PID
- Retry the disconnect command from the first step of this procedure.
- To verify the board is disconnected and unconfigured, use the cfgadm command:
# cfgadm
Ap_Id     Type   Receptacle     Occupant       Cond
.
N0.SB2    CPU    disconnected   unconfigured   unknown
.
Now you can remove the board from the slot, or reassign it to another domain.
Note:
Do not remove the board until you have verified it is disconnected.
- If you are replacing the board immediately, see To add a board to a domain. Otherwise, return the cluster to operation without replacing the disconnected CPU/memory board using the procedure in the following section.
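If the disconnect fails because of a real-time process, the PID can be extracted from the failure message instead of being read by eye. The following is a minimal sketch, assuming the message wording shown in the example above ("Cannot quiesce realtime thread: 621"); realtime_pid is a hypothetical helper name, and the wording may differ between servers.

```shell
# realtime_pid (hypothetical helper): read captured cfgadm output on stdin and
# print the number that follows "Cannot quiesce realtime thread:".
# Prints nothing if the message is not present.
realtime_pid() {
    sed -n 's/.*Cannot quiesce realtime thread: \([0-9][0-9]*\).*/\1/p'
}
```

One possible use is to capture both output streams of the disconnect attempt, for example cfgadm -v -c disconnect SB15 2>&1 | realtime_pid, and then inspect the reported process before deciding how to stop it.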
To add a board to a domain
- Log in as administrator to the domain where you plan to add or configure the boards.
- If you are adding a new or a replacement board to a domain (for example, dom1), verify the state of the slot that will contain the board.
To be configured with a new board, the slot must have the following states and condition:
Receptacle state: empty
Occupant state: unconfigured
Condition: unknown
Verify this by using the cfgadm command to list the slots, as in the following example. In the dom1 domain, slot SB2 is to contain the CPU board:
- Use the cfgadm command to connect and configure a CPU or memory board:
cfgadm -v -c configure SBx
For example:
# cfgadm -v -c configure SB2
assign SB2
assign SB2 done
poweron SB2
poweron SB2 done
test SB2
test SB2 done
connect SB2
connect SB2 done
configure SB2
configure SB2 done
notify online SUNW_cpu/cpu448
notify online SUNW_cpu/cpu449
notify online SUNW_cpu/cpu450
notify online SUNW_cpu/cpu451
notify add capacity (4 cpus)
notify add capacity (2097152 pages)
notify add capacity SB2 done
- Verify the new board has been connected and configured using the command cfgadm. For example:
# cfgadm
Ap_Id   Type   Receptacle   Occupant     Cond
.
SB2     CPU    connected    configured   ok
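Both verification steps in this note (after disconnecting a board and after configuring one) read the same two columns of the cfgadm listing, so the check can be scripted. The following is a minimal sketch, assuming the column order Ap_Id, Type, Receptacle, Occupant, Cond and assuming the board appears either as a bare name (SB2) or with a node prefix (N0.SB2); board_in_state is a hypothetical helper name.

```shell
# board_in_state (hypothetical helper): read a cfgadm listing on stdin and
# succeed only if board $1 has receptacle state $2 and occupant state $3.
# Fails if the board is not listed at all.
board_in_state() {
    awk -v b="$1" -v r="$2" -v o="$3" '
        $1 == b || $1 ~ ("\\." b "$") {
            ok = ($3 == r && $4 == o)
            exit
        }
        END { exit !ok }
    '
}
```

For example, cfgadm | board_in_state SB2 connected configured returns success once the new board is in service, and cfgadm | board_in_state SB2 disconnected unconfigured can gate physical removal of a board.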