vCenter - File system /storage/log is low on storage space

After a recent VCSA reboot, I was seeing the infamous "no healthy upstream" error from vCenter.

The first place to check for issues like this is VMware's Virtual Appliance Management Interface (VAMI), located by default via HTTPS on port 5480. An administrator can use the appliance root password for this particular interface.

When reviewing this issue with the VAMI, I saw the following error:

[Screenshot: vCenter Standalone Appliance Management]

Now, VCSA by design automatically rotates most logs available on the appliance using the open-source tool logrotate, but nothing in this directory appears to be managed:

root@vcenter# grep /storage/log /etc/logrotate.d/*

I'd say this particular log partition is going to need some manual cleanup every now and then. To open up the CLI, SSH into vCenter and execute the following command:

Command> shell  
Shell access is granted to root

First, let's get an idea of how full the disks are:

Note: The -m switch reports sizes in 1 MiB blocks.

root@vcenter[~]# df -m  
Filesystem 1M-blocks Used Available Use% Mounted on  
devtmpfs 5982 0 5982 0% /dev  
tmpfs 5993 1 5992 1% /dev/shm  
tmpfs 5993 2 5992 1% /run  
tmpfs 5993 0 5993 0% /sys/fs/cgroup  
/dev/sda3 46988 7199 37374 17% /  
tmpfs 5993 5 5988 1% /tmp  
/dev/mapper/dblog_vg-dblog 15047 185 14080 2% /storage/dblog  
/dev/mapper/vtsdb_vg-vtsdb 10008 68 9412 1% /storage/vtsdb  
/dev/mapper/vtsdblog_vg-vtsdblog 4968 36 4661 1% /storage/vtsdblog  
/dev/sda2 120 30 82 27% /boot  
/dev/mapper/log_vg-log 10008 9475 6 100% /storage/log  
/dev/mapper/core_vg-core 25063 45 23723 1% /storage/core  
/dev/mapper/db_vg-db 10008 507 8974 6% /storage/db  
/dev/mapper/updatemgr_vg-updatemgr 100273 1953 93185 3% /storage/updatemgr  
/dev/mapper/netdump_vg-netdump 985 3 915 1% /storage/netdump  
/dev/mapper/lifecycle_vg-lifecycle 100273 3364 91775 4% /storage/lifecycle  
/dev/mapper/autodeploy_vg-autodeploy 10008 37 9444 1% /storage/autodeploy  
/dev/mapper/imagebuilder_vg-imagebuilder 10008 37 9444 1% /storage/imagebuilder  
/dev/mapper/seat_vg-seat 10008 1185 8295 13% /storage/seat  
/dev/mapper/archive_vg-archive 50133 16373 31185 35% /storage/archive
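With this many mounts, the full one is easy to miss. Here is a minimal sketch that filters df output down to filesystems at or above a usage threshold. The function name full_filesystems and the default threshold of 90 are my own choices, not anything shipped with VCSA:

```shell
# Hypothetical helper: print "mountpoint use%" for every filesystem whose
# usage is at or above a threshold (default 90).
# It reads df output on stdin, so it also works on saved output.
full_filesystems() {
    threshold="${1:-90}"
    awk -v t="$threshold" '
        NR > 1 {                      # skip the header line
            use = $5
            sub(/%/, "", use)         # "100%" -> "100"
            if (use + 0 >= t) print $6, $5
        }'
}

# On the appliance (-P keeps each filesystem on a single line):
#   df -P -m | full_filesystems 90
```

Mount points containing spaces would confuse the column split, but none of the VCSA mounts above have any.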

The log partition is definitely full. To take an inventory of disk usage, we'll use the du utility with the s (summarize) and m (megabytes) switches enabled, and then pass the output to sort with the n (numerical) and r (reverse) switches enabled so the largest directories are listed first.

root@vcenter[/]# du -sm /storage/log/vmware/* | sort -n -r  
2578 /storage/log/vmware/eam  
2286 /storage/log/vmware/lookupsvc  
785 /storage/log/vmware/sso  
781 /storage/log/vmware/vsphere-ui  
530 /storage/log/vmware/vmware-updatemgr
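For drilling down one directory at a time, the same pipeline can be wrapped in a small helper. This is only a sketch: top_dirs is a hypothetical name, and the default of five entries is arbitrary.

```shell
# Hypothetical helper: show the N largest entries (in MiB) directly under a
# directory, largest first. Defaults: current directory, 5 entries.
top_dirs() {
    dir="${1:-.}"
    count="${2:-5}"
    du -sm "$dir"/* | sort -n -r | head -n "$count"
}

# Example on the appliance:
#   top_dirs /storage/log/vmware/eam 3
```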

Examining these folders further, quite a few of the logs are old and were never cleaned out. VMware publishes guidance on what is and isn't safe to remove. Generally, deleting a file that a running process still has open causes trouble on Linux: the process keeps writing to the unlinked file, and the space is not released until it closes the handle. Logs that have already been rotated are no longer being written to, so they can be safely removed. If this is a production system, I'd recommend calling VMware GSS instead of taking it upon yourself.

The above command (du -sm * | sort -nr) can be used in any working directory to see what is consuming the most space. Here are a few examples of what I deleted to make room:

rm -rf /storage/log/vmware/eam/web/localhost-2020-*  
rm -rf /storage/log/vmware/eam/web/localhost_access.2020*  
rm -rf /storage/log/vmware/eam/web/catalina-2020*
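Since rm -rf with a wildcard is unforgiving, a dry run is a reasonable precaution. The sketch below lists matching files older than a cutoff, with their sizes, and deletes nothing. The helper name list_old_logs is hypothetical, and the 30-day default is my assumption rather than VMware guidance:

```shell
# Hypothetical helper: list regular files under a directory that match a
# name pattern and were last modified more than N days ago (default 30).
# Prints size in KiB and path for each match; nothing is removed.
list_old_logs() {
    dir="$1"
    pattern="$2"
    days="${3:-30}"
    find "$dir" -type f -name "$pattern" -mtime "+$days" -exec du -k {} +
}

# Example on the appliance:
#   list_old_logs /storage/log/vmware/eam/web 'catalina-2020*' 30
```

If the list looks right, the same pattern can then be handed to rm as shown above.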

From here, I like to verify that space is cleared:

root@vcenter[/]# df -m | grep /storage/log  
/dev/mapper/log_vg-log 10008 5793 3688 62% /storage/log

Catalina and Tomcat are, for practical purposes, names for the same thing: Catalina is the servlet container inside Apache Tomcat. This software package receives inbound HTTP requests and routes them to specific applications, allowing many developers to build code without having to construct a soup-to-nuts HTTP server. Other similar (but more recent) projects include Python's Flask.

With HTTP proxies and servers, it is useful to keep comprehensive records indicating "who did what", both for security reasons ("whodunit") and for debugging reasons. As a result, Tomcat is a serious log hog wherever it exists, and it almost never reads its old logs back. This is why I evaluated the change as relatively safe.

If this were not an appliance, I would have added a logrotate spec to automatically delete old files from this directory, but altering VCSA in this way is not recommended.