NetBackup Snapshot Manager 10.3.x or higher upgrade may fail during the DB migration due to duplicate agentId
Problem
NetBackup Snapshot Manager 10.3.x or higher upgrade may fail during the DB migration due to duplicate agentId entries.
Error Message
Error during installation:
Starting container: flexsnap-postgresql ...done
Waiting for flexsnap-postgresql container to move to healthy state...Starting container: flexsnap-mongodb ...done
Waiting for flexsnap-mongodb container to move to healthy state...Data migration required from mongo database to postgresql database
Data migration is failed.
For more information, refer to the /cloudpoint/logs/flexsnap.log file.
Error in the flexsnap-cloudpoint.log:
DD/MM/YYYY HH:MM:SS e531c3719752 flexsnap-client[7] MainThread flexsnap.database.data_migration: INFO - Starting index creation and data migration for flexsnap DB
DD/MM/YYYY HH:MM:SS e531c3719752 flexsnap-client[7] MainThread flexsnap.database.data_migration: ERROR - DB Migration: Encountered operation failure error: column "agentId" of relation "agent" does not exist
LINE 2: ... "osName", "srcids", "lastMessage", "onHost", "agentId",...
^
Traceback (most recent call last):
File "/opt/VRTScloudpoint/lib/flexsnap/database/postgresql/operations.py", line 68, in _func
result = func(self, *args, **kwargs)
File "/opt/VRTScloudpoint/lib/flexsnap/database/postgresql/operations.py", line 138, in insert
cursor.execute(sql, data)
File "/usr/local/lib64/python3.9/site-packages/psycopg2/extras.py", line 236, in execute
return super().execute(query, vars)
psycopg2.errors.UndefinedColumn: column "agentId" of relation "agent" does not exist
LINE 2: ... "osName", "srcids", "lastMessage", "onHost", "agentId",...
^
Cause
A document in the MongoDB agent collection contains two agent ID fields whose names differ only in letter case: agentId and agentid. In the example below, both fields hold the same value.
{
"_id": {
"$oid": "5ee26894f3db80000197646f"
},
"status": "online",
"agentid": "agent.<guid>",
"pluginConfig": {
"linux": [
{
"templateSelector": {
"schemaVersion": 1
},
"configuration": {},
"errmsg": "",
"discovered_time": 1.7108590380880687e+09,
"configId": "linux.<guid>",
"status": "discovered",
"configHash": "<configHash>"
}
]
},
"hostname": "flexsnap-onhostagent",
"osName": "linux",
"srcids": [
"linux-fs-azure-vm-<id>",
"linux-host-azure-vm-<id>",
"linux-disk-azure-vm-<id>"
],
"lastMessage": 1710859369,
"onHost": true,
"agentId": "agent.<guid>",
"hostid": "azure-vm-<id>"
}
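The condition shown above can be checked programmatically. The following is a minimal Python sketch, not part of the product: the function name case_colliding_keys is hypothetical, and the sample is a shortened copy of the document above. It groups the top-level field names of one document by their lowercased form and reports any group with more than one spelling.

```python
import json
from collections import defaultdict


def case_colliding_keys(doc):
    """Return groups of top-level keys that differ only by letter case."""
    groups = defaultdict(list)
    for key in doc:
        # Keys that lowercase to the same string land in the same group.
        groups[key.lower()].append(key)
    return [sorted(keys) for keys in groups.values() if len(keys) > 1]


# Shortened copy of the agent document shown in the Cause section.
sample = json.loads('''
{
  "status": "online",
  "agentid": "agent.<guid>",
  "hostname": "flexsnap-onhostagent",
  "agentId": "agent.<guid>",
  "hostid": "azure-vm-<id>"
}
''')

print(case_colliding_keys(sample))
```

An empty list means the document has no field names that collide when letter case is ignored.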
Solution
To check whether your environment is affected by this issue, run the command for your container runtime:
DOCKER: # docker exec -it flexsnap-mongodb mongoexport --ssl --sslCAFile /cloudpoint/keys/cacert.pem --sslPEMKeyFile /cloudpoint/keys/mongodb.pem --host=flexsnap-mongodb --port=27017 -d flexsnap -c agent --pretty | grep -i agentId | awk '{print $2}' | uniq -d
PODMAN: # podman exec -it flexsnap-mongodb mongoexport --ssl --sslCAFile /cloudpoint/keys/cacert.pem --sslPEMKeyFile /cloudpoint/keys/mongodb.pem --host=flexsnap-mongodb --port=27017 -d flexsnap -c agent --pretty | grep -i agentId | awk '{print $2}' | uniq -d
If the command returns an agent ID value, the upgrade is likely to hit this issue. In that case, contact Veritas support for assistance.
Export the agent collection to /cloudpoint/agent.json and provide a copy of that file to support. To create the export, run:
DOCKER: # docker exec -it flexsnap-mongodb mongoexport --ssl --sslCAFile /cloudpoint/keys/cacert.pem --sslPEMKeyFile /cloudpoint/keys/mongodb.pem --host=flexsnap-mongodb --port=27017 -d flexsnap -c agent --pretty --out /cloudpoint/agent.json
PODMAN: # podman exec -it flexsnap-mongodb mongoexport --ssl --sslCAFile /cloudpoint/keys/cacert.pem --sslPEMKeyFile /cloudpoint/keys/mongodb.pem --host=flexsnap-mongodb --port=27017 -d flexsnap -c agent --pretty --out /cloudpoint/agent.json
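Before sending the export, it can be inspected locally to see which documents are affected. The sketch below is an illustration only, under the assumption that the exported file holds JSON documents written one after another rather than a single JSON array. The function names iter_documents and affected_ids are hypothetical, and the inline sample values are made up for the demonstration.

```python
import json


def iter_documents(text):
    """Yield each JSON document from text holding back-to-back JSON objects."""
    decoder = json.JSONDecoder()
    pos = 0
    while True:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            return
        doc, pos = decoder.raw_decode(text, pos)
        yield doc


def affected_ids(text):
    """Return the _id of every document holding both agentid and agentId."""
    return [
        doc.get("_id")
        for doc in iter_documents(text)
        if "agentid" in doc and "agentId" in doc
    ]


# Made-up sample: the first document has both spellings, the second does not.
sample = '''
{"_id": {"$oid": "1"}, "agentid": "agent.a", "agentId": "agent.a"}
{"_id": {"$oid": "2"}, "agentId": "agent.b"}
'''

print(affected_ids(sample))
```

To check a real export, read the file contents first and pass them to affected_ids, for example the text of /cloudpoint/agent.json.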
NOTE: If the system was originally installed as CloudPoint 2.x and has been upgraded since, review the related article section to make sure that any issues specific to those legacy versions are addressed during the upgrade.