vCenter Upgrade Error: Exception Occurred in install precheck phase
Error presented by VAMI Interface
Caveat
The steps below almost certainly bypass some form of pre-check. If this is a production system, please contact VMware support instead!
Troubleshooting
VCSA 7.0 has moved the upgrade process logging to a new location. The log itself is now at /storage/log/vmware/applmgmt/update_microservice.log (actual) or /var/log/vmware/applmgmt/update_microservice.log (symlink).
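To confirm on a given appliance that both paths really point at the same file, the symlink can be resolved from the bash shell. This is a minimal sketch: the helper name `resolve_log` is made up for illustration, and `readlink -f` is assumed to be the GNU coreutils version.

```shell
#!/bin/bash
# Hypothetical helper: print the canonical path behind a (possibly symlinked) file.
resolve_log() {
    readlink -f "$1"
}

# Example call on the appliance (not executed here):
#   resolve_log /var/log/vmware/applmgmt/update_microservice.log
```

If the two locations are the same file, calling the helper on either path prints the same result.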
update_microservice
This appears to be a rough order of operations with this new update process:
- Pre-Checks: First, the upgrade tries to identify the system being upgraded:
update_microservice:: precheckEventHandler: 148 - INFO - Precheck event happens
update_b2b:: precheck: 709 - DEBUG - Running update prechecks
update_b2b:: b2bRequirements: 479 - DEBUG - Running B2B Requirements hook and processing the results
update_b2b:: _runScriptHook: 330 - DEBUG - Running B2B script with hook CollectRequirementsHook
update_b2b:: _runScriptHook: 339 - DEBUG - update script output to file /var/log/vmware/applmgmt/upgrade_hook_CollectRequirementsHook
extensions:: _findExtension: 83 - DEBUG - Found script hook <module 'update_script' from '/storage/core/software-update/updates/7.0.1.00200/scripts/update_script.py'>:CollectRequirementsHook'
update_utils:: isGateway: 83 - DEBUG - Not running on a VMC Gateway appliance.
update_utils:: isB2BUpgrade: 72 - DEBUG - Bundle will execute upgrade: False
update_script:: collectRequirements: 492 - DEBUG - Checking verisons
update_script:: collectRequirements: 496 - DEBUG - Source VCSA version = 7.0.1.00100
update_script:: collectRequirements: 500 - INFO - Target VCSA version = 7.0.1.00200
update_utils:: getRPMBlacklist: 185 - DEBUG - vCSA deployment Type: embedded
update_b2b:: b2bRequirements: 493 - DEBUG - Getting packages excluding the ones in blacklist
From there, it picks up the scope for the upgrade, and verifies against common upgrade issues:
update_b2b:: b2bRequirements: 528 - DEBUG - Calculated packages list
update_b2b:: checkDisk: 423 - DEBUG - Checking for disk utilization
update_b2b:: checkDisk: 467 - DEBUG - CheckDisk completed, returning with selected disk partition /storage/updatemgr
update_b2b:: precheck: 740 - DEBUG - Estimating time to install..
update_b2b:: estimate_time: 679 - DEBUG - Estimating time required for rpm-update, services start-stop and reboot time if its required
update_b2b:: estimate_time: 682 - DEBUG - Calculating RPM installation time
update_b2b:: rpm_install_time: 587 - DEBUG - Reading all rpms present in rpm-manifest.json
update_b2b:: rpm_install_time: 588 - DEBUG - Estimating installation time for installed rpms and new rpms
update_b2b:: get_installed_rpms_list: 564 - DEBUG - Getting the list of installed RPMs along with the time of install
update_b2b:: get_installed_rpms_list: 578 - DEBUG - Completed getting the list of rpms, returning with the list: <class 'list'>
update_b2b:: rpm_install_time: 610 - DEBUG - Installation time estimated successfully, returning with time for installation 23
update_b2b:: estimate_time: 684 - DEBUG - Calculating time to start and stop services
update_b2b:: estimate_time_services: 620 - DEBUG - Estimating time for services-start and services-stop
update_b2b:: estimate_time_services: 640 - DEBUG - Completed estimating time for starting and stopping services, returning with the required time: 2
task_manager:: update: 80 - DEBUG - UpdateTask: status=SUCCEEDED, progress=100, message={'id': 'com.vmware.appliance.update.prechecks_task_ok', 'default_message': 'Prechecks completed', 'args': []}
In this case, everything looks good. I'm not really sure why it needs the SSO Administrator password, and there isn't much online about this. We're seeing three errors once the install actually starts:
update_b2b:: resumeStage:3431 - DEBUG - 'download' phase is 100% completed. checkAllRpmsArePresent
rpmfunctions:: checkAllRpmsArePresent: 308 - ERROR - Empty Stage location passed. This cannot be empty.
update_b2b:: resumeStage:3497 - ERROR - Exception in resume stage. Exception : {Package discrepency error, Cannot resume!}
task_manager:: update: 80 - DEBUG - UpdateTask: status=FAILED, progress=0, message={'id': 'com.vmware.appliance.plain_message', 'default_message': '%s', 'args': ['Package discrepency error, Cannot resume!']}
dbfunctions:: execute: 81 - DEBUG - Executing {SELECT CASE WHEN count(*) == 0 THEN 0 ELSE 1 END as status FROM progress WHERE _stagekey = 'patch-state' AND _message = 'Stage successful'}
functions:: get_resume_state: 340 - DEBUG - Resume needed in Stage phase
update_b2b:: install_with_resume:2477 - DEBUG - Installing version 7.0.1.00200
update_functions:: readJsonFile: 224 - ERROR - Can't read JSON file /storage/core/software-update/stage/stageDir.json [Errno 2] No such file or directory: '/storage/core/software-update/stage/stageDir.json'
task_manager:: update: 80 - DEBUG - UpdateTask: status=FAILED, progress=0, message={'id': 'com.vmware.appliance.not_staged', 'default_message': 'The update is not staged', 'args': []}
update_b2b:: installPrechecks:2146 - DEBUG - Exception occurred while checking for discrepancies Update not staged
task_manager:: update: 80 - DEBUG - UpdateTask: status=RESUMABLE, progress=0, message={'id': 'com.vmware.appliance.plain_message', 'default_message': '%s', 'args': ['Exception occurred in install precheck phase']}
This is pretty odd, because it's indicating a "resumable error" despite the fact that it cannot resume until a file lock is removed. Here are the errors I see:
- Empty Stage Location: Unsure what this means, given the context. Odds are the upgrade script cannot find out where to stage RPMs (Red Hat Package Manager).
- Package discrepancy error: It could be related to the above, or it could be a failed checksum. No other logging is generated by the agent to indicate what's wrong.
- Can't read JSON file /storage/core/software-update/stage/stageDir.json: This one's more actionable! It looks like there's no file by this name.
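A quick way to pull just these lines out of a noisy log is to filter on the `ERROR` level and on the `UpdateTask: status=` markers seen in the excerpts above. This is a rough sketch: the function names are made up, and the patterns are taken only from the log lines shown in this post, so other releases may format things differently.

```shell
#!/bin/bash
# Hypothetical helpers, matching only the log format shown in this post.

# Print every line logged at ERROR level.
update_errors() {
    grep ' - ERROR - ' "$1"
}

# Print each UpdateTask status in the order it was logged, one per line.
update_task_statuses() {
    grep -o 'UpdateTask: status=[A-Z]*' "$1" | sed 's/.*status=//'
}

# Example calls on the appliance (not executed here):
#   update_errors /var/log/vmware/applmgmt/update_microservice.log
#   update_task_statuses /var/log/vmware/applmgmt/update_microservice.log
```

Reading the statuses in sequence makes the failed-then-resumable pattern easy to spot without scrolling through the DEBUG output.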
Easter Egg: statsmoitor probably should be statsmonitor
Remediation
Allow the update to resume
VAMI saves the installation state as a file in /etc/applmgmt/appliance/software_update_state.conf:
{
 "state": "INSTALL_FAILED",
 "version": "7.0.1.00200",
 "latest_query_time": "2020-12-21T00:19:32Z",
 "operation_id": "/storage/core/software-update/install_operation"
}
VAMI will be stuck in a loop until you remove this file as root:
rm -rf /etc/applmgmt/appliance/software_update_state.conf
This will not necessarily resolve the issue that caused the failure, however; more work still needs to be done.
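If deleting the file outright feels too final, moving it aside presumably has the same effect, since the file no longer exists under the name VAMI reads, and it keeps a copy to show VMware support. This is a minimal sketch, meant to be run as root from the bash shell. The function names and the `.bak` suffix are my own choices for illustration, not anything VAMI looks for, and this variant is an assumption rather than a documented procedure.

```shell
#!/bin/bash
# Hypothetical helpers; the default path is the one quoted above.
STATE_FILE=/etc/applmgmt/appliance/software_update_state.conf

# Print the value of the "state" key, e.g. INSTALL_FAILED.
update_state() {
    sed -n 's/.*"state"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "${1:-$STATE_FILE}"
}

# Move the state file aside with a timestamped .bak suffix instead of deleting it.
park_update_state() {
    local f="${1:-$STATE_FILE}"
    [ -f "$f" ] || { echo "no state file at $f" >&2; return 1; }
    mv "$f" "$f.$(date +%Y%m%d%H%M%S).bak"
}
```

Calling `update_state` first records what VAMI thought had happened; `park_update_state` then clears the way for a retry while leaving the evidence on disk.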
Install via ISO
EDIT: The update ISO can be found at: https://my.vmware.com/group/vmware/patch#search
We're going to try a fallback method, attaching the upgrade ISO. The following snippet is from the vSphere UI, modifying vCenter's VM Hardware:
From there, simply click "Check CD-ROM" and it will immediately appear.
This time, we know what directories to search, so I'm going to watch the logs:
tail -f /var/log/vmware/applmgmt/update_microservice.log | grep -i err
Attempt via Command-line with ISO
VMware documents the following method to update via the command line: https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vcenter.upgrade.doc/GUID-8466F019-C57C-4344-9E15-8CFF74A6E4C2.html
Stage Packages
We're going to try and clear the (empty) workspace and try fresh, auto-accepting EULAs:
Command> software-packages unstage
Command> software-packages stage --iso --acceptEulas
 [2020-12-20T17:49:54.355] : ISO mounted successfully
 [2020-12-20T17:49:54.355] : UpdateInfo: Using product version 7.0.1.00100 and build 17004997
 [2020-12-20T17:49:55.355] : Target VCSA version = 7.0.1.00200
 [2020-12-20 17:49:55,169] : Running requirements script.....
 [2020-12-20T17:50:12.355] : Evaluating packages to stage...
 [2020-12-20T17:50:12.355] : Verifying staging area
 [2020-12-20T17:50:12.355] : ISO unmounted successfully
 [2020-12-20T17:50:12.355] : Staging process completed successfully
 [2020-12-20T17:50:12.355] : Answers for following questions have to be provided to install phase:
 Question:
 ID: vmdir.password
 Text: Single Sign-On administrator password
 Description: For the first instance of the identity domain, this is the password given to the Administrator account. Otherwise, this is the password of the Administrator account of the replication partner.
 Allowed values:
 Default value:

 [2020-12-20T17:50:12.355] : Execute software-packages validate to validate your input
Let's take a look at the update:
Command> software-packages list --staged
[2020-12-20T17:52:00.355] :
 category: Bugfix
 kb: https://docs.vmware.com/en/VMware-vSphere/7.0/rn/vsphere-vcenter-server-70u1c-release-notes.html
 leaf_services: ['vmware-pod', 'vsphere-ui', 'wcp']
 vendor: VMware, Inc.
 name: VC-7.0U1c
 tags: []
 version_supported: []
 size in MB: 5107
 releasedate: December 17, 2020
 executeurl: https://my.vmware.com/group/vmware/get-download?downloadGroup=VC70U1C
 version: 7.0.1.00200
 updateversion: True
 allowedSourceVersions: [7.0.0.0,]
 buildnumber: 17327517
 rebootrequired: False
 productname: VMware vCenter Server
 type: Update
 summary: {'id': 'patch.summary', 'translatable': 'In-place upgrade for vCenter appliances.', 'localized': 'In-place upgrade for vCenter appliances.'}
 severity: Critical
 TPP_ISO: False
 thirdPartyInstallation: False
 timeToInstall: 0
 requiredDiskSpace: {'/storage/core': 6.286324043273925, '/storage/seat': 228.3861328125}
 eulaAcceptTime: 2020-12-20 17:50:12 AKST
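Since the listing includes a requiredDiskSpace entry per mount point, it may be worth comparing free space against it before installing. The output above does not state the unit for those numbers, so the sketch below simply takes a whole number of megabytes and leaves the conversion (and rounding up) to you. The helper name `has_free_mb` is hypothetical, and the check is run from the bash shell.

```shell
#!/bin/bash
# Hypothetical helper: succeed if <mount> has at least <required_mb> MB available.
has_free_mb() {
    local mount="$1" required_mb="$2" avail_kb
    # POSIX output format in 1024-byte blocks; field 4 of the last line is "Available".
    avail_kb=$(df -Pk "$mount" | awk 'END { print $4 }')
    [ "$((avail_kb / 1024))" -ge "$required_mb" ]
}

# Example call on the appliance (threshold is an arbitrary illustration):
#   has_free_mb /storage/core 7000 && echo "enough space"
```

The function only returns an exit status, so it can gate the install step in a script or be chained with `&&` as in the commented example.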
Let's run it!
Command> software-packages install --staged
 [2020-12-20T17:53:52.355] : For the first instance of the identity domain, this is the password given to the Administrator account. Otherwise, this is the password of the Administrator account of the replication partner.
Enter Single Sign-On administrator password:

 [2020-12-20T17:54:02.355] : Validating software update payload
 [2020-12-20T17:54:02.355] : UpdateInfo: Using product version 7.0.1.00100 and build 17004997
 [2020-12-20 17:54:02,095] : Running validate script.....
 [2020-12-20T17:54:09.355] : Validation successful
 [2020-12-20 17:54:09,125] : Copying software packages [2020-12-20T17:54:09.355] : ISO mounted successfully
166/166
 [2020-12-20T17:57:31.355] : ISO unmounted successfully
 [2020-12-20 17:57:31,238] : Running system-prepare script.....
 [2020-12-20 17:57:40,289] : Running test transaction ....
 [2020-12-20 17:57:54,344] : Running prepatch script.....
 [2020-12-20 18:01:22,731] : Upgrading software packages ....
 [2020-12-20T18:07:39.355] : Setting appliance version to 7.0.1.00200 build 17327517
 [2020-12-20 18:07:39,538] : Running patch script.
....
 [2020-12-20 18:28:42,743] : Starting all services ....
 [2020-12-20T18:28:46.355] : Services started.
 [2020-12-20T18:28:46.355] : Installation process completed successfully
 [2020-12-20T18:28:46.355] : The following warnings have been found:
['\tWarning: \n\t\tsummary: Failed to start all services, will retry operation.\n']
Command> shutdown reboot -r "patch reboot"
Looks like the manual install worked for me - 7.0 U1c
TL;DR
rm -rf /etc/applmgmt/appliance/software_update_state.conf
grep -i error /var/log/vmware/applmgmt/update_microservice.log
exit
software-packages unstage
software-packages stage --iso --acceptEulas
software-packages list --staged
software-packages install --staged
shutdown reboot -r "patch reboot"