Problem
On the Resiliency Manager (RM) 3.6 appliance, the Messaging service (rabbitmq) fails to start or crashes frequently.
Error Message
1) If the service fails to start, the Messaging service reports a "STOPPED" state in the output of the following command:
/opt/VRTSitrp/bin/itrpadm service --start mq
2) If the service keeps crashing, warnings are reported on the VRP Web Console->Risks page.
Cause
The issue occurs when the erlang process used by rabbitmq (the Messaging service) does not start at the same time as the rabbitmq service.
This stale erlang process causes rabbitmq to either fail to start or keep crashing.
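One way to spot a stale process is to compare the start times of the erlang and rabbitmq processes. The following is a minimal sketch, assuming a Linux procps ps that supports the lstart column. The function name show_mq_start_times is hypothetical and is not part of the appliance tooling.

```shell
#!/bin/sh
# Hypothetical helper: from `ps -eo pid,lstart,args` output on stdin, keep only
# the lines that mention erlang or rabbitmq, so their start times can be compared.
show_mq_start_times() {
    # NR > 1 skips the ps header line; the last test drops this awk process
    # itself in case it shows up in the listing.
    awk 'NR > 1 && tolower($0) ~ /erlang|rabbitmq/ && $0 !~ /awk/'
}

# Usage on the appliance (assumes procps ps with the lstart column):
#   ps -eo pid,lstart,args | show_mq_start_times
```

An erlang process whose start time is much earlier than the latest attempt to start the Messaging service is a likely candidate for the stale process described above.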
Solution
1) Log in to the Support shell of the RM appliance. Veritas Support assistance may be needed if this is the first time accessing the Support shell.
2) Run this command to check the state of the Messaging (mq) service.
/opt/VRTSitrp/bin/itrpadm service --status mq
NOTE: The status may show "STOPPED", or it may show "RUNNING" and the service then stops or crashes on its own after a few minutes.
3) In both cases, stop the Messaging service using the command below, so that itrpadm-monitor does not try to start rabbitmq
/opt/VRTSitrp/bin/itrpadm service --stop mq
4) Check whether an erlang process is running, using the command below
ps -ef | grep erlang
5) If the erlang process is running, run the command below to terminate it.
kill -9 <pid_erlang_from_step #4>
6) Start the Messaging service using the command below
/opt/VRTSitrp/bin/itrpadm service --start mq
7) Verify the status of the service after about 2 minutes, using the command below, to confirm that it remains in the "RUNNING" state.
/opt/VRTSitrp/bin/itrpadm service --status mq
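Steps 3 through 7 can be sketched as a small shell script. This is an illustration under stated assumptions, not a supported tool. Only the itrpadm invocations come from this article. The function names and the ITRPADM and WAIT_SECS variables are hypothetical, and the check assumes that the status output contains the literal word RUNNING.

```shell
#!/bin/sh
# Sketch of steps 3-7. Only the itrpadm commands come from the article;
# the function names and the ITRPADM/WAIT_SECS variables are hypothetical.
ITRPADM="${ITRPADM:-/opt/VRTSitrp/bin/itrpadm}"
WAIT_SECS="${WAIT_SECS:-120}"

# Read `ps -ef` output on stdin and print the PID (column 2) of every line
# that mentions erlang, skipping the grep/awk helper processes themselves.
erlang_pids() {
    awk 'NR > 1 && /erlang/ && $0 !~ /grep|awk/ { print $2 }'
}

# Succeed if the given status text reports RUNNING.
# Assumption: the status output contains the literal word RUNNING.
is_running() {
    printf '%s\n' "$1" | grep -q 'RUNNING'
}

restart_mq() {
    "$ITRPADM" service --stop mq                 # step 3
    for pid in $(ps -ef | erlang_pids); do       # steps 4 and 5
        echo "Terminating stale erlang process $pid"
        kill -9 "$pid"
    done
    "$ITRPADM" service --start mq                # step 6
    sleep "$WAIT_SECS"                           # step 7: wait about 2 minutes
    status=$("$ITRPADM" service --status mq 2>&1)
    echo "$status"
    is_running "$status"
}

# Run the procedure only when explicitly requested: sh script.sh --run
if [ "${1:-}" = "--run" ]; then
    restart_mq
fi
```

Filtering the ps -ef output through erlang_pids avoids passing the PID of the grep command itself to kill, which can happen when the PID is copied by hand from the step 4 output. Review the list of PIDs before terminating anything on a production appliance.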