LDOM: vxdisk list is not showing VxVM managed disks in the GUEST domain post reboot - vxdiskconfig required to detect and display VxVM initialized devices
Problem
Oracle VM Server for SPARC (also known as LDOMs) provides a multipathing feature called MPGROUPs. With Solaris 10 based GUEST domains, Veritas Volume Manager (VxVM) initialized disks are not always displayed following a reboot of the GUEST domain.
Manual commands such as vxdiskconfig (which runs devfsadm followed by vxdisk scandisks) are required for VxVM to detect and display the VxVM initialized disks.
ZFS disks and other disks not initialized by VxVM are displayed consistently following a reboot of the Solaris 10 GUEST domain.
Solaris 11 based GUEST domains do not encounter this issue.
Figure 1.0
Oracle released an interoperability enhancement for Solaris 11.3 with SRU 18.0.6 and higher. This interoperability enhancement enabled Veritas Dynamic Multi-pathing (DMP) to better handle the loss of Primary (Control) and Service I/O Domains. Oracle decided not to backport this functionality to Solaris 10 at this time.
Veritas recommends deploying Solaris 11 based GUESTs where possible to overcome this interoperability limitation and to benefit from the operating system enhancements.
For Solaris 11 based GUESTs, the use of MPGROUPs is not supported in any capacity or configuration.
This article outlines how Veritas approached the troubleshooting process and what Veritas encountered during its in-house testing efforts.
Cause
The issue appears to be confined to Solaris 10 GUEST deployments where the VDC (Virtual Disk Client) related devices are sometimes discovered too late in the boot sequence for VxVM to dynamically discover the virtually presented devices from the underlying I/O domains (Primary / Alternate).
This is highlighted by capturing the boot -v output from the console associated with the GUEST domain.
The impacted customers are raising support cases with Oracle support at this time to analyze the issue further.
Solution
A manual workaround of executing the VxVM command "vxdiskconfig" is required whenever the Solaris 10 based GUEST domain is rebooted.
Deleting /etc/path_to_inst and then performing a reconfiguration reboot (reboot -- -r) appears to ensure all device types are consistently displayed when the GUEST domain is restarted.
In this instance, the Primary (Control) and Service/Alternate I/O domains are presenting DMPNODEs from each I/O domain into respective MPGROUPs, up into the Solaris 10 based GUEST domain.
The Virtual disks made visible to the GUEST domain named scrappy can be reviewed from the Primary I/O domain by running "ldm list -o disk scrappy".
# ldm list -o disk scrappy
NAME
scrappy
DISK
NAME VOLUME TOUT ID DEVICE SERVER MPGROUP
scrappyboot scrappyboot@primary-vds0 30 1 disk@1 primary scrappyboot
clariion1_177 scrappy177@primary-vds0 30 2 disk@2 primary clariion1_177
clariion1_253 scrappy253@primary-vds0 30 3 disk@3 primary clariion1_253
clariion1_254 scrappy254@primary-vds0 30 4 disk@4 primary clariion1_254
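As an illustrative aside (not part of the Veritas procedure), the DISK-name-to-MPGROUP mapping in the output above can be pulled out with a short awk one-liner. The sketch below embeds this article's sample output in a here-document; on a live Primary domain you would pipe the real ldm output in instead:

```shell
# Sketch: list DISK name -> MPGROUP pairs from "ldm list -o disk" output.
# The here-document reproduces this article's sample output; on a live
# Primary domain you would instead run:
#   ldm list -o disk scrappy | awk 'NR > 4 { print $1, "->", $7 }'
awk 'NR > 4 { print $1, "->", $7 }' <<'EOF'
NAME
scrappy
DISK
NAME VOLUME TOUT ID DEVICE SERVER MPGROUP
scrappyboot scrappyboot@primary-vds0 30 1 disk@1 primary scrappyboot
clariion1_177 scrappy177@primary-vds0 30 2 disk@2 primary clariion1_177
clariion1_253 scrappy253@primary-vds0 30 3 disk@3 primary clariion1_253
clariion1_254 scrappy254@primary-vds0 30 4 disk@4 primary clariion1_254
EOF
```

The `NR > 4` skips the NAME/DISK banner lines and the column header, leaving only the per-disk rows.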
The Virtual disk instance IDs can also be viewed from within the GUEST domain, using the "echo | format" command.
# echo | format
Searching for disks...done
AVAILABLE DISK SELECTIONS:
0. c0d1 <DGC-VRAID-0533 cyl 2814 alt 2 hd 256 sec 64>
/virtual-devices@100/channel-devices@200/disk@1
1. c0d2 <DGC-VRAID-0533 cyl 32766 alt 2 hd 16 sec 12>
/virtual-devices@100/channel-devices@200/disk@2
2. c0d3 <DGC-VRAID-0533 cyl 32766 alt 2 hd 16 sec 12>
/virtual-devices@100/channel-devices@200/disk@3
3. c0d4 <DGC-VRAID-0533 cyl 32766 alt 2 hd 16 sec 12>
/virtual-devices@100/channel-devices@200/disk@4
Specify disk (enter its number): Specify disk (enter its number):
The Virtual disk instance ID is shown after the c#d<instance ID>.
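The instance-ID extraction can be sketched with sed. This is illustrative only; the article's sample format output is embedded in a here-document, and on a live GUEST you would pipe `echo | format` in instead:

```shell
# Sketch: extract the vdisk instance ID (the digits after "d" in c#d#)
# from "echo | format" disk lines. Live use might be:
#   echo | format | sed -n 's/.*c[0-9]*d\([0-9]*\).*/\1/p'
sed -n 's/.*c[0-9]*d\([0-9]*\).*/\1/p' <<'EOF'
0. c0d1 <DGC-VRAID-0533 cyl 2814 alt 2 hd 256 sec 64>
1. c0d2 <DGC-VRAID-0533 cyl 32766 alt 2 hd 16 sec 12>
2. c0d3 <DGC-VRAID-0533 cyl 32766 alt 2 hd 16 sec 12>
3. c0d4 <DGC-VRAID-0533 cyl 32766 alt 2 hd 16 sec 12>
EOF
```

The printed IDs (1 through 4) match the ID column shown by "ldm list -o disk scrappy" on the Primary domain.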
Example:
1. The Solaris 10 GUEST domain, scrappy has just been restarted:
VxVM is only reporting two Veritas disk access (da) names.
# vxdisk list
DEVICE TYPE DISK GROUP STATUS
emc_clariion0_47 auto:ZFS - - ZFS
emc_clariion1_177 auto:none - - online invalid thinrclm
2. When the OS device tree and VxVM are refreshed, all device types are discovered and displayed by VxVM.
# vxdiskconfig
VxVM INFO V-5-2-1401 This command may take a few minutes to complete execution
Executing Solaris command: devfsadm (part 1 of 2) at 16:26:07 BST
May 3 16:26:07 scrappy llt: LLT INFO V-14-1-10009 LLT 7.1.0 Protocol available
May 3 16:26:07 scrappy gab: GAB INFO V-15-1-20021 GAB 7.1.0 available
May 3 16:26:07 scrappy vxfen: NOTICE: VXFEN INFO V-11-1-56 VxFEN 7.1.0 loaded. Protocol versions supported: 10,20,30
Executing VxVM command: vxdctl enable (part 2 of 2) at 16:26:07 BST
Command completed at 16:26:08 BST
3. VxVM now displays all device types.
# vxdisk list
DEVICE TYPE DISK GROUP STATUS
emc_clariion0_47 auto:ZFS - - ZFS
emc_clariion1_177 auto:none - - online invalid thinrclm
emc_clariion1_253 auto:cdsdisk - - online thinrclm
emc_clariion1_254 auto:cdsdisk - - online thinrclm
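The difference between the two vxdisk list outputs (before and after vxdiskconfig) can be computed with standard tools. A minimal sketch, using this article's sample outputs as embedded data; on a live system you would capture the real "vxdisk list" output before and after the rescan:

```shell
# Sketch: show which DEVICE entries only appear after vxdiskconfig, by
# diffing the first column of two "vxdisk list" captures. Sample data is
# embedded; live use would capture "vxdisk list | awk 'NR>1{print $1}'"
# into the two files before and after running vxdiskconfig.
awk 'NR > 1 { print $1 }' > /tmp/vx_before <<'EOF'
DEVICE TYPE DISK GROUP STATUS
emc_clariion0_47 auto:ZFS - - ZFS
emc_clariion1_177 auto:none - - online invalid thinrclm
EOF
awk 'NR > 1 { print $1 }' > /tmp/vx_after <<'EOF'
DEVICE TYPE DISK GROUP STATUS
emc_clariion0_47 auto:ZFS - - ZFS
emc_clariion1_177 auto:none - - online invalid thinrclm
emc_clariion1_253 auto:cdsdisk - - online thinrclm
emc_clariion1_254 auto:cdsdisk - - online thinrclm
EOF
# comm -13 prints lines unique to the second (sorted) file, i.e. the
# devices that were only discovered after the rescan.
comm -13 /tmp/vx_before /tmp/vx_after
```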
Troubleshooting Steps:
1. From the I/O domain hosting the console connection to the GUEST domain, telnet to the required console port.
Primary Domain:
# ldm list
NAME STATE FLAGS CONS VCPU MEMORY UTIL NORM UPTIME
primary active -n-cv- UART 16 16G 0.2% 0.2% 4h 52m
altio active -n--v- 5000 16 16G 0.2% 0.1% 4h 7m
scooby active -n---- 5020 16 16G 0.1% 0.1% 7d 4h 11m
scrappy active -t--v- 5001 8 8G 12% 12% 1m
2. In this instance, the GUEST domain is named scrappy and has a console port of 5001.
# telnet 0 5001
Trying 0.0.0.0...
Connected to 0.
Escape character is '^]'.
Connecting to console "scrappy" in group "scrappy" ....
Press ~? for control options ..
login as: root
Using keyboard-interactive authentication.
Password:
Last login: Fri May 4 11:39:25 2018 from xxx.c
Oracle Corporation SunOS 5.10 Generic Patch January 2005
#
3. When rebooting the GUEST domain with the verbose boot option (reboot -- -v), it is possible to see when the Virtual Disk (vdisk) instances are made available during the boot sequence.
# reboot -- -v
<snippet>
May 4 11:40:11 scrappy reboot: rebooted by root
syncing file systems... done
rebooting...
Resetting...
NOTICE: Entering OpenBoot.
NOTICE: Fetching Guest MD from HV.
NOTICE: Starting additional cpus.
NOTICE: Initializing LDC services.
NOTICE: Probing PCI devices.
NOTICE: Finished PCI probing.
SPARC T5-2, No Keyboard
Copyright (c) 1998, 2014, Oracle and/or its affiliates. All rights reserved.
OpenBoot 4.36.2, 8.0000 GB memory available, Serial #83437971.
.
.
scsi_vhci0 at root
scsi_vhci0 is /scsi_vhci
virtual-device: cnex0
cnex0 is /virtual-devices@100/channel-devices@200
vdisk@1 is online using ldc@1,0
channel-device: vdc1
vdc1 is /virtual-devices@100/channel-devices@200/disk@1
root on rpool/ROOT/s10s_u10wos_17b fstype zfs
NOTICE: VxVM vxdmp V-5-0-1990 driver version VxVM 7.1.0.000 Multipathing Driver installed
NOTICE: VxVM vxio V-5-0-1990 driver version VxVM 7.1.0.000 I/O driver installed
NOTICE: VxVM vxspec V-5-0-1990 driver version VxVM 7.1.0.000 control/status driver installed
Hostname: scrappy
VxVM sysboot INFO V-5-2-3409 starting in boot mode...
vdisk@2 is online using ldc@6,0
channel-device: vdc2
vdc2 is /virtual-devices@100/channel-devices@200/disk@2
pseudo-device: devinfo0
devinfo0 is /pseudo/devinfo@0
NOTICE: VxVM vxdmp V-5-0-34 [Info] added disk array CKM00152901315, datype = EMC_CLARiiON
NOTICE: VxVM vxdmp V-5-0-34 [Info] added disk array CKM00153100195, datype = EMC_CLARiiON
NOTICE: VxVM vxdmp V-5-0-0 [Info] removed disk array FAKE_ENCLR_SNO, datype = FAKE_ARRAY
VxVM sysboot INFO V-5-2-3390 Starting restore daemon...
pseudo-device: zfs0
.
.
NOTICE: msgcnt 1 mesg 139: V-2-139: Loading VxFS version VxFS 7.1.0.000 SunOS 5.10
pseudo-device: vxcafs0
vxcafs0 is /pseudo/vxcafs@0
LLT INFO V-14-1-10009 LLT 7.1.0 Protocol available
GAB INFO V-15-1-20021 GAB 7.1.0 available
NOTICE: VXFEN INFO V-11-1-56 VxFEN 7.1.0 loaded. Protocol versions supported: 10,20,30
.
.
May 4 11:40:38 scrappy vdc: vdisk@4 is online using ldc@10,0
May 4 11:40:38 scrappy cnex: channel-device: vdc4
May 4 11:40:38 scrappy genunix: vdc4 is /virtual-devices@100/channel-devices@200/disk@4
May 4 11:40:42 scrappy pseudo: pseudo-device: dtrace0
May 4 11:40:42 scrappy genunix: dtrace0 is /pseudo/dtrace@0
May 4 11:40:42 scrappy vdc: vdisk@3 is offline
May 4 11:40:42 scrappy vdc: vdisk@4 is offline
May 4 11:40:43 scrappy llt: LLT Protocol unavailable
May 4 11:40:43 scrappy gab: GAB INFO V-15-1-20022 GAB unavailable
May 4 11:40:43 scrappy vxfen: NOTICE: VXFEN INFO V-11-1-30 VxFEN unloaded
May 4 11:40:43 scrappy pseudo: pseudo-device: devinfo0
May 4 11:40:43 scrappy genunix: devinfo0 is /pseudo/devinfo@0
May 4 11:41:04 scrappy vdc: vdisk@3 is online using ldc@8,0
May 4 11:41:04 scrappy cnex: channel-device: vdc3
May 4 11:41:04 scrappy genunix: vdc3 is /virtual-devices@100/channel-devices@200/disk@3
May 4 11:41:04 scrappy vdc: vdisk@4 is online using ldc@10,0
May 4 11:41:04 scrappy cnex: channel-device: vdc4
May 4 11:41:04 scrappy genunix: vdc4 is /virtual-devices@100/channel-devices@200/disk@4
NOTE: In this instance, we can see that the remaining vdc devices (vdisk@3 and vdisk@4) are discovered too late in the boot sequence.
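To make the late discovery easier to spot, the vdisk state transitions can be filtered out of the console log. A small sketch using sample vdc messages taken from the boot output above; on a live system you might run the same awk against /var/adm/messages instead:

```shell
# Sketch: print timestamp, vdisk instance, and state for each vdc state
# transition, so the online/offline/online sequence is visible at a glance.
# Sample syslog-format lines are embedded; live use might be:
#   awk '/vdc: vdisk@/ { print $3, $6, $8 }' /var/adm/messages
awk '/vdisk@/ { print $3, $6, $8 }' <<'EOF'
May 4 11:40:38 scrappy vdc: vdisk@4 is online using ldc@10,0
May 4 11:40:42 scrappy vdc: vdisk@3 is offline
May 4 11:40:42 scrappy vdc: vdisk@4 is offline
May 4 11:41:04 scrappy vdc: vdisk@3 is online using ldc@8,0
May 4 11:41:04 scrappy vdc: vdisk@4 is online using ldc@10,0
EOF
```

A vdisk whose final "online" timestamp lands after the VxVM sysboot messages is a candidate for the late-discovery behaviour described in this article.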
Additional Configuration Information:
For Solaris 10 GUEST environments only, Veritas recommends exporting Veritas DMP backend devices into an MPGROUP, which is then made visible to the Solaris 10 GUEST.
The MPGROUP requirement is critical for the presentation of the boot disk(s) up into the GUEST domain. Where Veritas CVM/CFS is used, it is recommended that MPGROUPs not be used for data disks, especially when fencing is also configured in the GUEST domain.
This configuration is to allow Solaris 10 Guest domains a greater chance of surviving the loss of a Control or Service (Alternate) I/O domain.
Sample LDOM commands:
Issued from the Control (Primary) domain.
Boot disk
# ldm add-vdsdev mpgroup=scrappyboot /dev/vx/dmp/emc_clariion0_47s2 scrappyboot@primary-vds0
# ldm add-vdsdev mpgroup=scrappyboot /dev/vx/dmp/emc_clariion0_47s2 scrappyboot@altio-vds0
# ldm add-vdisk timeout=30 scrappydisk scrappyboot@primary-vds0 scrappy
NOTE: Only a single Virtual disk resource is required when using MPGROUPs. The boot disk should only be managed by MPGROUPs with Solaris 10 GUEST deployments; with Solaris 11, the boot disk must not be configured with MPGROUPs.
Data Disks
The following commands can be used with all Solaris 11 deployments or Solaris 10 CVM/CFS GUEST configurations:
Exported from Primary VDS (Virtual Disk Server) I/O domain:
# ldm add-vdsdev /dev/vx/dmp/emc_clariion1_177s2 scrappy177@primary-vds0
# ldm add-vdsdev /dev/vx/dmp/emc_clariion1_253s2 scrappy253@primary-vds0
# ldm add-vdsdev /dev/vx/dmp/emc_clariion1_254s2 scrappy254@primary-vds0
Exported from Alternate/Secondary VDS (Virtual Disk Server) I/O domain:
# ldm add-vdsdev /dev/vx/dmp/emc_clariion1_177s2 scrappy177@altio-vds0
# ldm add-vdsdev /dev/vx/dmp/emc_clariion1_253s2 scrappy253@altio-vds0
# ldm add-vdsdev /dev/vx/dmp/emc_clariion1_254s2 scrappy254@altio-vds0
Virtual disk resources exported from the Primary VDS to the GUEST domain scrappy:
# ldm add-vdisk timeout=30 scrappy-177-pri scrappy177@primary-vds0 scrappy
# ldm add-vdisk timeout=30 scrappy-253-pri scrappy253@primary-vds0 scrappy
# ldm add-vdisk timeout=30 scrappy-254-pri scrappy254@primary-vds0 scrappy
Virtual disk resources exported from the Alternate (Secondary) VDS to the GUEST domain scrappy:
# ldm add-vdisk timeout=30 scrappy-177-alt scrappy177@altio-vds0 scrappy
# ldm add-vdisk timeout=30 scrappy-253-alt scrappy253@altio-vds0 scrappy
# ldm add-vdisk timeout=30 scrappy-254-alt scrappy254@altio-vds0 scrappy
NOTE: The Virtual disk names need to be unique when not using MPGROUPs.
For Solaris 10 GUEST environments only, MPGROUPs also need to be configured for data disks when CVM/CFS is not used.
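As a convenience only, the repetitive per-disk add-vdisk commands above can be generated with a small loop. This sketch echoes the commands (a dry run) rather than executing them, and reuses this article's example names (scrappy, primary-vds0, altio-vds0):

```shell
# Sketch: generate the six "ldm add-vdisk" commands for the data disks.
# Echo-only dry run; on a live Primary domain, review the output and then
# remove the echo (or pipe the output to sh) to execute.
for id in 177 253 254; do
  for vds in primary altio; do
    # Suffix pri/alt distinguishes the otherwise-identical vdisk names.
    if [ "$vds" = "primary" ]; then suffix=pri; else suffix=alt; fi
    echo ldm add-vdisk timeout=30 "scrappy-${id}-${suffix}" "scrappy${id}@${vds}-vds0" scrappy
  done
done
```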
To support the handling of Solaris ZFS devices, the DMP tunable dmp_native_support must be enabled in Solaris 11 GUESTs.
# vxdmpadm gettune dmp_native_support
# vxdmpadm settune dmp_native_support=on
NOTE: A series of Veritas Volume Manager (VxVM) patches were released to ensure ZFS devices are imported using DMP.
Please contact Veritas support to confirm you are running the required Veritas Volume Manager (VxVM) patch level.
MPGROUPs remain unsupported for Solaris 11 configurations. MPxIO is not supported with Solaris 10 or Solaris 11 LDOM configurations.