vCenter - File system /storage/log is low on storage space
After a recent VCSA reboot, I was seeing the infamous "no healthy upstream" error from vCenter.
The first place to check for issues like this is VMware's Virtual Appliance Management Interface (VAMI), reachable by default over HTTPS on port 5480. An administrator can log in to this interface with the appliance root password.
When reviewing this issue in the VAMI, I saw the following error: "File system /storage/log is low on storage space".
Now, VCSA by design automatically rotates most logs available on the appliance using the open-source tool logrotate, but nothing in this directory appears to be managed:
root@vcenter# grep /storage/log /etc/logrotate.d/*
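Because `grep` printing nothing is easy to misread, the same check can be wrapped in a small function that states the result explicitly. This is only a rough sketch: the function name `logrotate_covers` and its optional second argument are hypothetical, added for illustration, and `/etc/logrotate.d` is simply the directory searched above.

```shell
#!/bin/sh
# Hypothetical helper (not part of VCSA): report whether any logrotate
# configuration file mentions a given path.
# $1 = path to look for, $2 = config directory (defaults to /etc/logrotate.d)
logrotate_covers() {
    path="$1"
    confdir="${2:-/etc/logrotate.d}"
    # -r: recurse, -l: print only the names of matching files,
    # -F: treat the path as a literal string, not a regular expression
    if grep -rlF -- "$path" "$confdir" 2>/dev/null; then
        return 0
    fi
    echo "no logrotate config in $confdir mentions $path"
    return 1
}
```

Called as `logrotate_covers /storage/log`, it prints either the configuration files that reference the path or a one-line note that none do.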
I'd say this particular log partition is going to need some manual cleanup every now and then. To open up the CLI, SSH into vCenter and execute the following command:
Command> shell
Shell access is granted to root
First, let's get an idea of how full the disks are:
Note: The -m switch reports sizes in 1 MiB blocks (hence the 1M-blocks column header)
root@vcenter[~]# df -m
Filesystem 1M-blocks Used Available Use% Mounted on
devtmpfs 5982 0 5982 0% /dev
tmpfs 5993 1 5992 1% /dev/shm
tmpfs 5993 2 5992 1% /run
tmpfs 5993 0 5993 0% /sys/fs/cgroup
/dev/sda3 46988 7199 37374 17% /
tmpfs 5993 5 5988 1% /tmp
/dev/mapper/dblog_vg-dblog 15047 185 14080 2% /storage/dblog
/dev/mapper/vtsdb_vg-vtsdb 10008 68 9412 1% /storage/vtsdb
/dev/mapper/vtsdblog_vg-vtsdblog 4968 36 4661 1% /storage/vtsdblog
/dev/sda2 120 30 82 27% /boot
/dev/mapper/log_vg-log 10008 9475 6 100% /storage/log
/dev/mapper/core_vg-core 25063 45 23723 1% /storage/core
/dev/mapper/db_vg-db 10008 507 8974 6% /storage/db
/dev/mapper/updatemgr_vg-updatemgr 100273 1953 93185 3% /storage/updatemgr
/dev/mapper/netdump_vg-netdump 985 3 915 1% /storage/netdump
/dev/mapper/lifecycle_vg-lifecycle 100273 3364 91775 4% /storage/lifecycle
/dev/mapper/autodeploy_vg-autodeploy 10008 37 9444 1% /storage/autodeploy
/dev/mapper/imagebuilder_vg-imagebuilder 10008 37 9444 1% /storage/imagebuilder
/dev/mapper/seat_vg-seat 10008 1185 8295 13% /storage/seat
/dev/mapper/archive_vg-archive 50133 16373 31185 35% /storage/archive
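With this many mount points, the full one is easy to miss. Here is a minimal sketch of a filter, assuming `df` output in the column order shown above (usage percentage in column 5, mount point in column 6). The function name `full_filesystems` and the default threshold of 90 are hypothetical choices, not anything VCSA provides.

```shell
#!/bin/sh
# Hypothetical helper: print mount points at or above a usage threshold.
# Reads df-style output on stdin, so it can be tried on saved output too.
# $1 = threshold in percent (defaults to 90)
full_filesystems() {
    threshold="${1:-90}"
    awk -v limit="$threshold" '
        NR > 1 {                      # skip the header row
            use = $5
            sub(/%/, "", use)         # "100%" -> "100"
            if (use + 0 >= limit) print $6, $5
        }'
}
```

It could be fed with `df -mP | full_filesystems 90`; the `-P` switch asks `df` for the POSIX output format, which keeps each filesystem on a single line.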
The log partition is definitely full. To take an inventory of disk usage, we'll use the du utility with the -s (summarize) and -m (megabytes) switches, then pipe the output to sort with the -n (numeric) and -r (reverse) switches so the largest directories are listed first.
root@vcenter[/]# du -sm /storage/log/vmware/* | sort -n -r
2578 /storage/log/vmware/eam
2286 /storage/log/vmware/lookupsvc
785 /storage/log/vmware/sso
781 /storage/log/vmware/vsphere-ui
530 /storage/log/vmware/vmware-updatemgr
Examining these folders further, quite a few of the files are old and were never cleaned up by any rotation job. VMware publishes guidance on what is and isn't safe to remove. Generally, Linux has issues with files being deleted out from under a running process (the space is not released until the process closes the file), so stick to logs that have obviously already been rotated; those can be safely removed. If this is a production system, I'd recommend calling VMware GSS instead of taking it upon yourself. The command above (du -sm * | sort -nr) can be run from any working directory to see what is taking up the most space. Here are a few examples of what I deleted to make room:
rm -rf /storage/log/vmware/eam/web/localhost-2020-*
rm -rf /storage/log/vmware/eam/web/localhost_access.2020*
rm -rf /storage/log/vmware/eam/web/catalina-2020*
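Before running `rm` with a wildcard, it can help to preview exactly what would match. Here is a minimal sketch using `find`, assuming GNU find as shipped on most Linux systems. The function name `list_old_logs` and the 30-day default are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: list (but do not delete) files in a single directory
# whose names match a glob and that were last modified more than N days ago.
# $1 = directory, $2 = file name glob (quote it!), $3 = age in days (default 30)
list_old_logs() {
    dir="$1"
    pattern="$2"
    days="${3:-30}"
    # -maxdepth 1: do not descend into subdirectories
    # -type f:     regular files only
    # -mtime +N:   modified more than N days ago
    find "$dir" -maxdepth 1 -type f -name "$pattern" -mtime +"$days" -print
}
```

For example, `list_old_logs /storage/log/vmware/eam/web 'catalina-2020*' 30` prints the candidates. Once the list looks right, the same `find` expression with `-delete` in place of `-print` would remove them.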
From here, I like to verify that space is cleared:
root@vcenter[/]# df -m | grep /storage/log
/dev/mapper/log_vg-log 10008 5793 3688 62% /storage/log
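Had the Used figure not dropped, one common cause on Linux is a process still holding a deleted file open; the space is only released once that process closes the file or is restarted. Below is a rough sketch for spotting that situation by scanning `/proc`. It assumes a Linux system with GNU find and enough privileges (root) to read other processes' file descriptors, and `open_deleted_files` is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: list file descriptors that still point at deleted
# files under a given path prefix. Linux marks such links with " (deleted)".
# $1 = path prefix, e.g. /storage/log
open_deleted_files() {
    prefix="$1"
    # Each /proc/<pid>/fd/<n> entry is a symlink to the file that is open.
    # -lname matches against the link target. Errors from processes we may
    # not inspect, or that exit during the scan, are discarded.
    find /proc/[0-9]*/fd -lname "${prefix}*(deleted)" 2>/dev/null
    return 0
}
```

Running `open_deleted_files /storage/log` prints paths of the form `/proc/<pid>/fd/<n>`; the `<pid>` component identifies the process that would need to release the file.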
Catalina and Tomcat are, for practical purposes, names for the same thing: Catalina is the servlet container at the core of Apache Tomcat. This software package accepts inbound HTTP requests and routes them to specific applications, allowing many developers to build code without having to construct a soup-to-nuts HTTP server. Other similar (but more recent) projects include Python's Flask.
With HTTP proxies and servers, it is useful to keep comprehensive records of "who did what", both for security reasons ("whodunit") and for debugging. As a result, Tomcat is a serious log hog wherever it exists, and it almost never goes back to its old logs once they have rolled over. This is why I evaluated the change as relatively safe.
If this were not an appliance, I would have added a logrotate spec to automatically delete old files from this directory, but altering VCSA in this way is not recommended.