Volume Manager unable to recognize recently replaced HBA card/adapter in IBM AIX server

Article: 100003317
Last Published: 2014-01-02
Product(s): InfoScale & Storage Foundation

Problem

 
 
The IBM AIX server is able to see the devices via both fscsi adapters, fscsi0 and fscsi1.


AIX adapter content

# lsdev -Cc adapter | grep fcs
fcs0      Available 2a-08 FC Adapter
fcs1      Available 3S-08 FC Adapter  

# lsparent -C -k iocb
fcs0 Available 2a-08 FC Adapter
fcs1 Available 3S-08 FC Adapter  

# lsparent -C -k fcp
fscsi1 Available 3S-08-01 FC SCSI I/O Controller Protocol Device
fscsi0 Available 2a-08-02 FC SCSI I/O Controller Protocol Device  
 
 
AIX device presentation

# lsdev -Cc disk
hdisk0  Available 2w-08-00-8,0 16 Bit LVD SCSI Disk Drive
hdisk1  Available 2w-08-00-9,0 16 Bit LVD SCSI Disk Drive
hdisk2  Available 2a-08-02     Hitachi Disk Array (Fibre)
hdisk3  Available 2a-08-02     Hitachi Disk Array (Fibre)
hdisk4  Available 2a-08-02     Hitachi Disk Array (Fibre)
hdisk5  Available 2a-08-02     Hitachi Disk Array (Fibre)
hdisk6  Available 2a-08-02     Hitachi Disk Array (Fibre)
hdisk7  Available 2a-08-02     Hitachi Disk Array (Fibre)
hdisk8  Available 2a-08-02     Hitachi Disk Array (Fibre)
hdisk9  Available 2a-08-02     Hitachi Disk Array (Fibre)
hdisk55 Available 3S-08-01     Hitachi Disk Array (Fibre)
hdisk56 Available 3S-08-01     Hitachi Disk Array (Fibre)
hdisk57 Available 3S-08-01     Hitachi Disk Array (Fibre)
hdisk58 Available 3S-08-01     Hitachi Disk Array (Fibre)
hdisk59 Available 3S-08-01     Hitachi Disk Array (Fibre)
hdisk60 Available 3S-08-01     Hitachi Disk Array (Fibre)
hdisk61 Available 3S-08-01     Hitachi Disk Array (Fibre)
hdisk62 Available 3S-08-01     Hitachi Disk Array (Fibre)  


SUMMARY:
AIX hdisk devices hdisk2 to hdisk9 are "Available" on fscsi0
AIX hdisk devices hdisk55 to hdisk62 are "Available" on fscsi1
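
The summary above can be reproduced mechanically. The following is a minimal sketch, assuming the three leading columns shown in the "lsdev -Cc disk" output above (name, state, location code); the function name count_disks_by_location is hypothetical and not part of AIX or Volume Manager.

```shell
# Count "Available" disks per location code.
# Reads "lsdev -Cc disk" style lines on stdin: name, state, location, description.
# count_disks_by_location is a hypothetical helper name used for illustration.
count_disks_by_location() {
    awk '$2 == "Available" { n[$3]++ }
         END { for (loc in n) print loc, n[loc] }' | sort
}

# Example use on a live system (not run here):
#   lsdev -Cc disk | count_disks_by_location
```

For the listing above, the two Fibre Channel location codes (2a-08-02 behind fscsi0 and 3S-08-01 behind fscsi1) should each be reported with the same number of disks.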



Volume Manager/DMP is failing to detect the devices presented on fscsi0


# vxdmpadm listctlr all
CTLR-NAME       ENCLR-TYPE      STATE      ENCLR-NAME
=====================================================
scsi0           Disk            ENABLED      disk
fscsi1          TagmaStore-USP  ENABLED      tagmastore-usp1
fscsi1          TagmaStore-USP  ENABLED      tagmastore-usp0


# vxdisk -e -o alldgs list
DEVICE                TYPE  DISK                GROUP  STATUS  OS_NATIVE_NAME  ATTR
disk_0                auto  -                   -      LVM     hdisk0          -
disk_1                auto  -                   -      LVM     hdisk1          -
tagmastore-usp0_206a  auto  tivspk8_62412_d3_c  tivdg  online  hdisk57         std
tagmastore-usp0_206b  auto  tivspk8_62412_d4_c  tivdg  online  hdisk58         std
tagmastore-usp0_2068  auto  tivspk8_62412_d1_c  tivdg  online  hdisk55         std
tagmastore-usp0_2069  auto  tivspk8_62412_d2_c  tivdg  online  hdisk56         std
tagmastore-usp1_206a  auto  tivspk8_62422_d3_p  tivdg  online  hdisk61         std
tagmastore-usp1_206b  auto  tivspk8_62422_d4_p  tivdg  online  hdisk62         std
tagmastore-usp1_2068  auto  tivspk8_62422_d1_p  tivdg  online  hdisk59         std
tagmastore-usp1_2069  auto  tivspk8_62422_d2_p  tivdg  online  hdisk60         std
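
The mismatch between the OS view and the DMP view can be detected by comparing the two controller listings. The sketch below assumes the output layouts shown above for "lsparent -C -k fcp" and "vxdmpadm listctlr all", saved to files; the function name missing_dmp_ctlrs and the file paths in the comments are hypothetical.

```shell
# Print fscsi controllers that AIX reports as Available but DMP does not list.
#   $1: file holding "lsparent -C -k fcp" output
#   $2: file holding "vxdmpadm listctlr all" output
# missing_dmp_ctlrs is a hypothetical helper name used for illustration.
missing_dmp_ctlrs() {
    awk '
        # First file: remember every Available fscsi controller known to AIX.
        NR == FNR { if ($1 ~ /^fscsi[0-9]+$/ && $2 == "Available") os[$1] = 1; next }
        # Second file: remember every fscsi controller known to DMP.
        $1 ~ /^fscsi[0-9]+$/ { dmp[$1] = 1 }
        END { for (c in os) if (!(c in dmp)) print c }
    ' "$1" "$2" | sort
}

# Example use on a live system (not run here):
#   lsparent -C -k fcp    > /tmp/os_ctlrs.txt
#   vxdmpadm listctlr all > /tmp/dmp_ctlrs.txt
#   missing_dmp_ctlrs /tmp/os_ctlrs.txt /tmp/dmp_ctlrs.txt
```

With the listings shown in this article, the only controller reported would be fscsi0.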

 

Error Message

None

Cause

Neither of the Volume Manager commands "vxdctl enable" and "vxdisk scandisks" is able to detect the newly presented HBA card "fcs0".

Recycling vxconfigd permits Volume Manager to detect the HBA card replacement, which neither "vxdctl enable" nor "vxdisk scandisks" is designed to do.
 


In this instance, the correct HBA replacement process was not followed.


Procedure to be adopted for Dynamic Reconfiguration of a Controller/HBA (Host Bus Adapter) in Multipath Configuration on AIX platform
http://www.symantec.com/business/support/index?page=content&id=TECH71772


Details:
The procedure detailed in this technote addresses an omission in the Volume Manager (VM) technical guides and documents, which do not describe the process. The procedure was recommended and approved by VM Engineering to avoid adverse conditions such as DMP (Dynamic Multi-Pathing) induced system panics and I/O (input/output) errors during Dynamic Reconfiguration operations such as replacing an HBA (Host Bus Adapter), moving an HBA to a different switch, or replacing controllers.


PROCEDURE:

1. Disable the controller in VM by removing its reference from DMP.
# vxdmpadm -f disable ctlr=fscsi

2. Remove the device references from the operating system (OS).
# rmdev -Rdl fscsi

3. Rescan the device tree and rebuild the DMP database.    <<<< this is the step that was missed !
# vxdctl enable

4. Perform the Dynamic Reconfiguration operation (upgrading firmware, replacing the HBA, etc.).
Perform any required changes and array management operations.

5. Reconfigure the devices in the OS.
# cfgmgr

6. Check that the new devices show up at the OS level.
# lsdev -Cc disk

7. Enable the controller in DMP.
# vxdmpadm enable ctlr=fscsi

8. Rescan the device tree and rebuild the DMP database.
# vxdctl enable
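
The command steps above (1 to 3 before the hardware change, 5 to 8 after it) could be wrapped in a small script so that none is skipped. This is only a sketch: the names run, hba_pre_replace, hba_post_replace and the DRY_RUN variable are hypothetical, and the controller name must be supplied by the operator. By default the commands are printed rather than executed.

```shell
# Print each command when DRY_RUN is 1 (the default here); execute it otherwise.
# run, hba_pre_replace, hba_post_replace and DRY_RUN are hypothetical names.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Steps 1 to 3: run BEFORE the HBA is replaced.
#   $1: controller name as shown by "vxdmpadm listctlr all"
hba_pre_replace() {
    [ -n "$1" ] || { echo "usage: hba_pre_replace <controller>" >&2; return 2; }
    ctlr=$1
    run vxdmpadm -f disable ctlr="$ctlr" || return 1   # step 1
    run rmdev -Rdl "$ctlr" || return 1                 # step 2
    run vxdctl enable                                  # step 3
}

# Steps 5 to 8: run AFTER the hardware work in step 4 is complete.
hba_post_replace() {
    [ -n "$1" ] || { echo "usage: hba_post_replace <controller>" >&2; return 2; }
    ctlr=$1
    run cfgmgr || return 1                             # step 5
    run lsdev -Cc disk || return 1                     # step 6
    run vxdmpadm enable ctlr="$ctlr" || return 1       # step 7
    run vxdctl enable                                  # step 8
}
```

Reviewing the dry-run output first, then repeating with DRY_RUN=0, keeps the operator in control of when each half runs around the physical replacement.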


 

Solution

Workaround:

In this instance VCS is operational. To avoid possible monitoring issues while vxconfigd is being recycled, it is strongly advised to disable VCS monitoring by running "hastop -local -force".


1.] # hastop -local -force

What to expect when using the "-force" option with the hastop command.  


The "-force" option to the hastop command does not decrease the time required to successfully stop VERITAS Cluster Server (VCS). The "-force" option allows the VCS software to be stopped, but the resources within the service groups remain running even after the VCS engine has been stopped. VCS resource monitoring is disabled as a result.

If the configuration file (main.cf) is in read-write mode and the "-force" option is used, the configuration file is marked as stale. When VCS is started again, it will not start if it reads a stale configuration file. VCS sends a warning message to users who try to use the hastop command with a stale configuration file.


2.] # vxconfigd -k

If a vxconfigd process is already running, the -k option kills it before any other startup processing.
This is useful for recovering from a hung vxconfigd process. Killing the old vxconfigd and starting a new one usually does not cause problems for volume devices that are being used by applications, or that contain mounted file systems.

Recycling vxconfigd will result in Volume Manager rescanning for HBA adapters, devices and so on.


3.] # hastart

To resume normal VCS monitoring and functionality.



Post "vxconfigd -k" Volume Manager output



# vxdmpadm listctlr all
CTLR-NAME       ENCLR-TYPE      STATE      ENCLR-NAME
=====================================================
scsi0           Disk            ENABLED      disk
fscsi1          TagmaStore-USP  ENABLED      tagmastore-usp1
fscsi0          TagmaStore-USP  ENABLED      tagmastore-usp1  <<< detected now
fscsi1          TagmaStore-USP  ENABLED      tagmastore-usp0
fscsi0          TagmaStore-USP  ENABLED      tagmastore-usp0  <<< detected  now


# vxdmpadm listenclosure all
ENCLR_NAME        ENCLR_TYPE      ENCLR_SNO  STATUS     ARRAY_TYPE  LUN_COUNT
==============================================================================
disk              Disk            DISKS      CONNECTED  Disk        2
tagmastore-usp1   TagmaStore-USP  0F3D6      CONNECTED  A/A         4
tagmastore-usp0   TagmaStore-USP  0F3CC      CONNECTED  A/A         4



Volume Manager/DMP is now reporting two paths per DANAME (disk access name).
 

# vxdisk path
SUBPATH  DANAME                DMNAME              GROUP  STATE
hdisk0   disk_0                -                   -      ENABLED
hdisk1   disk_1                -                   -      ENABLED
hdisk57  tagmastore-usp0_206a  tivspk8_62412_d3_c  tivdg  ENABLED
hdisk4   tagmastore-usp0_206a  tivspk8_62412_d3_c  tivdg  ENABLED
hdisk58  tagmastore-usp0_206b  tivspk8_62412_d4_c  tivdg  ENABLED
hdisk5   tagmastore-usp0_206b  tivspk8_62412_d4_c  tivdg  ENABLED
hdisk55  tagmastore-usp0_2068  tivspk8_62412_d1_c  tivdg  ENABLED
hdisk2   tagmastore-usp0_2068  tivspk8_62412_d1_c  tivdg  ENABLED
hdisk56  tagmastore-usp0_2069  tivspk8_62412_d2_c  tivdg  ENABLED
hdisk3   tagmastore-usp0_2069  tivspk8_62412_d2_c  tivdg  ENABLED
hdisk61  tagmastore-usp1_206a  tivspk8_62422_d3_p  tivdg  ENABLED
hdisk8   tagmastore-usp1_206a  tivspk8_62422_d3_p  tivdg  ENABLED
hdisk62  tagmastore-usp1_206b  tivspk8_62422_d4_p  tivdg  ENABLED
hdisk9   tagmastore-usp1_206b  tivspk8_62422_d4_p  tivdg  ENABLED
hdisk59  tagmastore-usp1_2068  tivspk8_62422_d1_p  tivdg  ENABLED
hdisk6   tagmastore-usp1_2068  tivspk8_62422_d1_p  tivdg  ENABLED
hdisk60  tagmastore-usp1_2069  tivspk8_62422_d2_p  tivdg  ENABLED
hdisk7   tagmastore-usp1_2069  tivspk8_62422_d2_p  tivdg  ENABLED
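
The two-paths-per-device check can also be automated. The sketch below assumes the "vxdisk path" layout shown above, using only the first two columns (SUBPATH, DANAME) and the last column (STATE); the function name check_path_count and its arguments are hypothetical.

```shell
# List DMP devices whose number of ENABLED paths differs from the expected count.
# Reads "vxdisk path" output on stdin.
#   $1: expected number of paths per device (default 2)
#   $2: optional DANAME prefix to restrict the check, e.g. tagmastore
# check_path_count is a hypothetical helper name used for illustration.
check_path_count() {
    awk -v want="${1:-2}" -v prefix="${2:-}" '
        $1 == "SUBPATH" { next }                            # skip the header line
        prefix != "" && index($2, prefix) != 1 { next }     # not in the selected set
        { seen[$2] = 1 }
        $NF == "ENABLED" { paths[$2]++ }
        END {
            for (d in seen)
                if (paths[d] + 0 != want + 0) print d, paths[d] + 0
        }
    ' | sort
}

# Example use on a live system (not run here):
#   vxdisk path | check_path_count 2 tagmastore
```

Empty output means every selected device has the expected number of enabled paths. The prefix argument keeps the single-path internal disks (disk_0, disk_1) out of the report.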


Applies To

 

IBM AIX Server running Storage Foundation HA 5.0 MP3 ( not rolling patches ) across two servers.

 

References

Etrack : 1535294
