On the Solaris platform, commands like zpool create, zpool import, or other commands that change the SMI/EFI label, result in extremely poor performance of Veritas Volume Manager commands
Problem
On the Solaris platform, commands like zpool create, zpool import, or other commands that change the SMI/EFI label, result in extremely poor performance of VxVM (Veritas Volume Manager) commands.
VxVM commands may be delayed for a few minutes (depending on the related DMP I/O timeout values) while trying to read from a disk with a new EFI label. vxconfigd may be instructed to read a disk when certain VxVM commands (such as vxdisk list <disk access name of the problematic disk> and vxdisk -o alldgs list) are executed manually or via VCS (Veritas Cluster Server) monitor entry points, or when VOM (Veritas Operations Manager) tries to update the inventory, or status, of the disks through the vxlist command.
The following symptoms are observed as a result of this problem:
- Changing the label of a disk from SMI to EFI without informing VxVM through vxdisk scandisks can cause VxVM commands to take a long time to complete.
- VCS DiskGroup resource monitor entry points issue vx command timeouts and may result in a system panic.
- Commands that rely on DMP (Dynamic Multi-Pathing) devices, such as zpool import, may become unresponsive or take a long time to complete.
The following conditions may cause vxconfigd to take a long time to respond:
- A disk without a label is labeled as an EFI disk for the first time.
- A disk with an existing SMI/VTOC label is relabeled as an EFI disk via the format command.
- A disk with an existing SMI/VTOC label is relabeled as an EFI disk when a zpool is created on the disk.
Although the following operations do not appear to cause the problem during Veritas laboratory tests, we strongly recommend running vxdisk scandisks after the following operations as well:
- Removing the label (either SMI/VTOC or EFI) by clearing the data blocks at the beginning of the disk (using dd, for example)
- Relabeling an EFI disk to an SMI/VTOC or EFI label again.
Applies To
- Veritas Volume Manager (VxVM), running on Solaris
- A disk having its label changed is visible to VxVM. A disk is visible to VxVM if it is listed in the vxdisk list output.
- Labeling a disk that does not already have a label with an EFI label, or relabeling a disk from SMI/VTOC to an EFI label -- either manually, or via the zpool create command -- can cause the above-mentioned problem even if that disk has not been initialized as a VxVM disk through vxdisksetup.
Error Message
The following messages will be logged to the /var/adm/messages file.
Mar 12 16:19:41 server101 vxdmp: [ID 382146 kern.notice] NOTICE: VxVM vxdmp V-5-0-112 [Warn] disabled path 231/0x220 belonging to the dmpnode 292/0xb8 due to path ok timeout
Mar 12 16:19:46 server101 vxdmp: [ID 808532 kern.notice] NOTICE: VxVM vxdmp V-5-0-1957 [Failed] i/o failed on disk 292/0xb8 due to timeout threshold (300 sec)
Mar 12 16:19:46 server101 vxdmp: [ID 140891 kern.notice] NOTICE: VxVM vxdmp V-5-0-0 [Error] i/o error occurred (errno=0x206) on dmpnode 292/0xb8
Mar 12 16:23:05 server101 vxdmp: [ID 382146 kern.notice] NOTICE: VxVM vxdmp V-5-0-112 [Warn] disabled path 231/0x60 belonging to the dmpnode 292/0xb8 due to path ok timeout
Mar 12 16:23:05 server101 vxdmp: [ID 744425 kern.notice] NOTICE: VxVM vxdmp V-5-0-0 [Info] failover initiated for 292/0xb8
Mar 12 16:23:05 server101 vxdmp: [ID 744425 kern.notice] NOTICE: VxVM vxdmp V-5-0-0 [Info] curpri set to secondary for 292/0xb8
Mar 12 16:24:46 server101 vxdmp: [ID 808532 kern.notice] NOTICE: VxVM vxdmp V-5-0-1957 [Failed] i/o failed on disk 292/0xb8 due to timeout threshold (300 sec)
Mar 12 16:24:46 server101 vxdmp: [ID 140891 kern.notice] NOTICE: VxVM vxdmp V-5-0-0 [Error] i/o error occurred (errno=0x206) on dmpnode 292/0xb8
Mar 12 16:29:47 server101 vxdmp: [ID 808532 kern.notice] NOTICE: VxVM vxdmp V-5-0-1957 [Failed] i/o failed on disk 292/0xb8 due to timeout threshold (300 sec)
Mar 12 16:29:47 server101 vxdmp: [ID 140891 kern.notice] NOTICE: VxVM vxdmp V-5-0-0 [Error] i/o error occurred (errno=0x206) on dmpnode 292/0xb8
The following messages are logged to the /var/adm/vx/dmpevents.log.
Thu Mar 12 16:14:46.632: I/O error occurred on Path c5t500601603CE0325Dd5s2 belonging to Dmpnode emc_clariion0_407
Thu Mar 12 16:14:46.633: I/O error occurred on Path c4t500601603CE0325Dd5s2 belonging to Dmpnode emc_clariion0_407
Thu Mar 12 16:14:46.634: I/O analysis done as DMP_PATH_OKAY on Path c5t500601603CE0325Dd5s2 belonging to Dmpnode emc_clariion0_407
Thu Mar 12 16:14:46.635: I/O analysis done as DMP_PATH_OKAY on Path c4t500601603CE0325Dd5s2 belonging to Dmpnode emc_clariion0_407
Thu Mar 12 16:14:46.635: I/O retry(2) on Path c4t500601603CE0325Dd5s2 belonging to Dmpnode emc_clariion0_407
Thu Mar 12 16:14:46.635: Marked as ioerr Path c5t500601603CE0325Dd5s2 belonging to Dmpnode emc_clariion0_407
Thu Mar 12 16:14:46.638: I/O error occurred on Path c5t500601603CE0325Dd5s2 belonging to Dmpnode emc_clariion0_407
Thu Mar 12 16:14:46.638: Unmarked as ioerr Path c5t500601603CE0325Dd5s2 belonging to Dmpnode emc_clariion0_407
Thu Mar 12 16:14:46.640: I/O analysis done as DMP_PATH_OKAY on Path c5t500601603CE0325Dd5s2 belonging to Dmpnode emc_clariion0_407
Thu Mar 12 16:14:46.640: I/O retry(3) on Path c5t500601603CE0325Dd5s2 belonging to Dmpnode emc_clariion0_407
Thu Mar 12 16:19:41.724: Disabled Path c4t500601603CE0325Dd5s2 belonging to Dmpnode emc_clariion0_407 due to path ok timeout
Thu Mar 12 16:19:46.720: I/O error occurred (errno=0x206) on Dmpnode emc_clariion0_407
Thu Mar 12 16:23:05.813: Disabled Path c5t500601603CE0325Dd5s2 belonging to Dmpnode emc_clariion0_407 due to path ok timeout
Thu Mar 12 16:23:05.819: CURPRI set to secondary for Dmpnode emc_clariion0_407 without quiescing
Thu Mar 12 16:24:46.890: I/O error occurred (errno=0x206) on Dmpnode emc_clariion0_407
Thu Mar 12 16:29:47.030: I/O error occurred (errno=0x206) on Dmpnode emc_clariion0_407
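When checking whether a host is affected, it can help to count how often each dmpnode hit the I/O timeout threshold. The following is a minimal sketch, assuming the message format shown above (message ID V-5-0-1957 followed by "on disk <dmpnode>"); the function name dmp_timeouts is hypothetical and not part of VxVM.

```shell
#!/bin/sh
# Summarize DMP I/O timeout failures (message V-5-0-1957) per dmpnode.
# Usage: dmp_timeouts [logfile]   (defaults to /var/adm/messages)
dmp_timeouts() {
    log="${1:-/var/adm/messages}"
    # Matches lines such as:
    #   ... V-5-0-1957 [Failed] i/o failed on disk 292/0xb8 due to timeout threshold (300 sec)
    # The word after "disk" is taken as the dmpnode identifier.
    awk '/V-5-0-1957/ && /timeout threshold/ {
             for (i = 1; i < NF; i++)
                 if ($i == "disk") count[$(i + 1)]++
         }
         END { for (n in count) print n, count[n] }' "$log"
}
```

Running dmp_timeouts with no argument reads /var/adm/messages and prints one line per dmpnode with the number of timeout failures logged for it.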
Cause
Neither VxVM nor VxDMP monitors disk labels, so neither is aware of an underlying disk label change. When a disk label is changed to EFI without informing VxVM and VxDMP through the vxdisk scandisks command, VxDMP continues to use the SMI OS devices to issue I/Os. Those I/Os are delayed, and retried, for several minutes, causing vxconfigd to hang if it tries to read the disks.
Following an underlying disk label change, the OS reports an I/O error on a zero-size slice. DMP then issues a SCSI inquiry to identify the reason for the failure. The OS does not fail the SCSI inquiry; the inquiry succeeds, so DMP treats the path as DMP_PATH_OKAY. According to DMP policy, DMP keeps retrying the I/O until it reaches the maximum time limit (300 seconds). This change in OS behavior, where the I/O to the disk returns an error but the SCSI inquiry returns successfully, results in the related VxVM commands taking a long time to complete.
Oracle has acknowledged this change in behavior and is working on a fix via:
Bug 23557480: Veritas Compatibility Issue with ZFS filesystem
Solution
Until the fix (Bug 23557480) is obtained from Oracle, one of the following options can be used to avoid the issue.
- Change the DMP recoveryoption to fixedretry as described below.
- Run vxdisk scandisks immediately after changing labels from SMI to EFI, whether the label was changed manually or by creating a zpool.
Details
Option 1 - Change DMP recoveryoption to fixedretry
# vxdmpadm setattr enclosure <enclosure_name> recoveryoption=fixedretry retrycount=2
This needs to be set for all enclosures in the system. Confirm the change in the recoveryoption attribute:
# vxdmpadm getattr enclosure <enclosure_name> | grep recoveryoption
Example:
# vxdmpadm setattr enclosure emc_clariion0 recoveryoption=fixedretry retrycount=2
# vxdmpadm getattr enclosure emc_clariion0 | grep recoveryoption
emc_clariion0 recoveryoption[throttle] Nothrottle[0] Nothrottle[0]
emc_clariion0 recoveryoption[errorretry] Timebound[300] Fixed-Retry[2]
Option 2 - Run vxdisk scandisks immediately after changing labels from SMI to EFI (either manually or after creating a zpool)
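Because the attribute has to be set on every enclosure, a loop over the enclosure list can save some typing. The following is a minimal sketch, assuming that vxdmpadm listenclosure all prints a header row beginning with ENCLR_NAME, a rule of "=" characters, and then one enclosure per line with its name in the first column. The function names enclosure_names and set_fixedretry_all and the RUN variable are hypothetical helpers, not part of VxVM.

```shell
#!/bin/sh
# Print the enclosure names found in `vxdmpadm listenclosure all` output
# read from stdin. Assumption: the header row starts with ENCLR_NAME and is
# followed by a rule made of "=" characters; both are skipped.
enclosure_names() {
    awk 'NF > 0 && $1 != "ENCLR_NAME" && $1 !~ /^=+$/ { print $1 }'
}

# Apply the fixed-retry setting to every enclosure.
# Set RUN=echo to preview the commands without executing them.
set_fixedretry_all() {
    vxdmpadm listenclosure all | enclosure_names | while read -r encl; do
        ${RUN:-} vxdmpadm setattr enclosure "$encl" \
            recoveryoption=fixedretry retrycount=2
    done
}
```

Calling set_fixedretry_all with RUN=echo first shows the commands that would be issued; the result can then be confirmed per enclosure with the vxdmpadm getattr command shown above.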
Whenever a disk is labeled (with an EFI label, for example), or any disk characteristic is changed at the storage layer, run the vxdisk scandisks command immediately before running any other VxVM commands. The syntax is as follows.
# vxdisk scandisks device=<OS pathname>
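When the label change comes from creating a zpool, the two steps can be chained so that the rescan is never forgotten. The following is a minimal sketch under stated assumptions: the function name zpool_create_and_rescan and the RUN variable are hypothetical, and the sketch uses the untargeted form of vxdisk scandisks so that every path of a multipathed disk is covered rather than a single device= pathname.

```shell
#!/bin/sh
# Create a ZFS pool, then immediately inform VxVM about the label change.
# Usage: zpool_create_and_rescan <pool> <disk> [<disk> ...]
# Set RUN=echo to preview the commands without executing them.
zpool_create_and_rescan() {
    pool="$1"
    shift
    # Stop if pool creation fails; nothing was relabeled by this call then.
    ${RUN:-} zpool create "$pool" "$@" || return 1
    # Rescan before any other VxVM command gets a chance to run.
    ${RUN:-} vxdisk scandisks
}
```

A full rescan takes longer than a targeted one on hosts with many devices; the device=<OS pathname> form can be substituted when the affected paths are known.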
Regarding the handling of disk characteristic changes at the storage layer, refer to Chapter 10 ("Dynamic Reconfiguration of devices") in the Veritas Storage Foundation Administrator's Guide for Solaris.
Example:
Here is the procedure to label a disk with an EFI label and inform VxVM immediately about the change. The disk can be either without a label or with an SMI/VTOC label.
# vxdisk list emc_clariion0_407
Device:    emc_clariion0_407
devicetag: emc_clariion0_407
type:      auto
flags:     nolabel private autoconfig   <<< nolabel; without a valid disk label
pubpaths:  block=/dev/vx/dmp/emc_clariion0_407 char=/dev/vx/rdmp/emc_clariion0_407
guid:      -
udid:      DGC%5FRAID%200%5FCKM00090900285%5F60060160D3762400CF12DC595339E411
site:      -
errno:     Disk is not usable
Multipathing information:
numpaths:  4
c5t500601683CE0325Dd5 state=enabled type=secondary
c5t500601603CE0325Dd5 state=enabled type=primary
c4t500601683CE0325Dd5 state=enabled type=secondary
c4t500601603CE0325Dd5 state=enabled type=primary
# vxdisk list emc_clariion0_407
Device:    emc_clariion0_407
devicetag: emc_clariion0_407
type:      auto
info:      format=none   <<< with a valid label but not initialized as VxVM disk
flags:     online ready private autoconfig invalid   <<< online means the disk has a valid label
pubpaths:  block=/dev/vx/dmp/emc_clariion0_407s2 char=/dev/vx/rdmp/emc_clariion0_407s2
guid:      -
udid:      DGC%5FRAID%200%5FCKM00090900285%5F60060160D3762400CF12DC595339E411
site:      -
Multipathing information:
numpaths:  4
c5t500601683CE0325Dd5s2 state=enabled type=secondary   <<< "s2" means that the disk has an SMI/VTOC label
c5t500601603CE0325Dd5s2 state=enabled type=primary
c4t500601683CE0325Dd5s2 state=enabled type=secondary
c4t500601603CE0325Dd5s2 state=enabled type=primary
First, get the OS device paths that belong to the disk:
# echo $(vxdisk list emc_clariion0_407 | grep state= | awk '{print $1}')
c5t500601683CE0325Dd5s2 c5t500601603CE0325Dd5s2 c4t500601683CE0325Dd5s2 c4t500601603CE0325Dd5s2
Label all of the OS device paths with an EFI label.
# for d in c5t500601683CE0325Dd5s2 c5t500601603CE0325Dd5s2 c4t500601683CE0325Dd5s2 c4t500601603CE0325Dd5s2
> do
> format -e $d;
> done
selecting c5t500601683CE0325Dd5s2
[disk formatted]
format> label
[0] SMI Label
[1] EFI Label
Specify Label type[0]: 1
Warning: This disk has an SMI label. Changing to EFI label will erase all current partitions.
Continue? y
format> quit
selecting c5t500601603CE0325Dd5s2
[disk formatted]
format> label
[0] SMI Label
[1] EFI Label
Specify Label type[1]:
Ready to label disk, continue? y
format> quit
selecting c4t500601683CE0325Dd5s2
[disk formatted]
format> label
[0] SMI Label
[1] EFI Label
Specify Label type[1]:
Ready to label disk, continue? y
format> quit
selecting c4t500601603CE0325Dd5s2
[disk formatted]
format> label
[0] SMI Label
[1] EFI Label
Specify Label type[1]:
Ready to label disk, continue? y
format> quit
Note: From this point onward, don't run any other VxVM commands until the following vxdisk scandisks commands finish.
# for d in c5t500601683CE0325Dd5s2 c5t500601603CE0325Dd5s2 c4t500601683CE0325Dd5s2 c4t500601603CE0325Dd5s2
> do
> echo vxdisk scandisks device=$d
> vxdisk scandisks device=$d
> done
vxdisk scandisks device=c5t500601683CE0325Dd5s2
vxdisk scandisks device=c5t500601603CE0325Dd5s2
vxdisk scandisks device=c4t500601683CE0325Dd5s2
vxdisk scandisks device=c4t500601603CE0325Dd5s2
After the vxdisk scandisks commands finish, vxconfigd can access the disk without hanging.
# vxdisk list emc_clariion0_407
Device:    emc_clariion0_407
devicetag: emc_clariion0_407
type:      auto
info:      format=none
flags:     online ready private autoconfig invalid
pubpaths:  block=/dev/vx/dmp/emc_clariion0_407 char=/dev/vx/rdmp/emc_clariion0_407
guid:      -
udid:      DGC%5FRAID%200%5FCKM00090900285%5F60060160D3762400CF12DC595339E411
site:      -
Multipathing information:
numpaths:  4
c5t500601683CE0325Dd5 state=enabled type=secondary   <<< without "s2", the disk has an EFI label
c5t500601603CE0325Dd5 state=enabled type=primary
c4t500601683CE0325Dd5 state=enabled type=secondary
c4t500601603CE0325Dd5 state=enabled type=primary
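The "s2" convention annotated in the outputs above can be turned into a quick check of what VxVM currently believes about a disk. The following is a heuristic sketch only, based on the annotations in this article: a nolabel flag means no valid label, path names ending in "s2" indicate an SMI/VTOC label, and path names without "s2" indicate an EFI label. The function name label_from_paths is hypothetical.

```shell
#!/bin/sh
# Read `vxdisk list <dmpnode>` output on stdin and report the label type
# that VxVM currently associates with the disk.
label_from_paths() {
    awk '/^flags:/ && /nolabel/ { nolabel = 1 }
         /state=/ {
             paths++
             if ($1 ~ /s2$/) smi++      # "s2" suffix: SMI/VTOC device name
         }
         END {
             if (nolabel)           print "nolabel"
             else if (paths == 0)   print "unknown"
             else if (smi == paths) print "SMI"
             else if (smi == 0)     print "EFI"
             else                   print "mixed"
         }'
}
```

For example, vxdisk list emc_clariion0_407 | label_from_paths would be expected to print SMI before the relabel and EFI once the vxdisk scandisks commands have completed.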