Thanks for your input, Nicholas. MUCH appreciated. I got your contact info from a recent VMware workshop that Ivoxy put on, so it's good to hear from you.
I'll dig through your recommendations more thoroughly, but here are some immediate thoughts and more information that might help with the conversation:
- It isn't a Windows VM; it's a Sophos UTM firewall device. It's some form of Linux that a company named Astaro hacked into a firewall product. They give away a home-use license for free, and it has proved to be a useful solution for my home lab needs. I was debating looking into pfSense too, I just haven't gotten to it yet.
- The firewall VM runs a "version" of VMware Tools natively. At least the vSphere/vCenter interface seems to think it does: it reports back version 2147483647 (Guest Managed).
- All vNICs on the VM are configured with VMXNET3. I try to go this route whenever I can.
- I like your idea of pinging the WAN IP and identifying when during the early morning it actually dies. What app would you recommend? Something as basic as a timestamp plus a ping -t command piped to a log file?
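To make the logging idea concrete, here is a rough Python sketch of what I have in mind: send one ping every few seconds and append a timestamped OK/FAIL line to a file. The function names, the 10-second interval, and the log file name are all my own placeholders, not anything from a particular tool, and the address in the usage note is a documentation-range stand-in for the real "2nd" WAN IP.

```python
import platform
import subprocess
import time
from datetime import datetime


def build_ping_command(host, system=None):
    """Return a single-probe ping command for the given (or current) OS."""
    system = system or platform.system()
    if system == "Windows":
        # -n 1: send one echo request; -w 2000: wait up to 2000 ms for a reply
        return ["ping", "-n", "1", "-w", "2000", host]
    # Linux (iputils) flags: -c 1: one echo request; -W 2: wait up to 2 s.
    # Other Unix pings may interpret -W differently, so adjust as needed.
    return ["ping", "-c", "1", "-W", "2", host]


def format_log_line(when, host, ok):
    """Build one log line such as '2024-01-02T03:04:05 203.0.113.10 FAIL'."""
    status = "OK" if ok else "FAIL"
    return f"{when.isoformat(timespec='seconds')} {host} {status}"


def ping_once(host):
    """Send one ping and report success based on the exit code.

    Assumption: exit code 0 means a reply came back. Windows ping reportedly
    returns 0 for some 'destination unreachable' replies, so treat the result
    as a rough indicator only.
    """
    result = subprocess.run(
        build_ping_command(host),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def monitor(host, log_path, interval=10.0, count=None):
    """Ping host every `interval` seconds, appending results to log_path.

    Runs until interrupted when count is None, otherwise stops after
    `count` probes.
    """
    sent = 0
    while count is None or sent < count:
        line = format_log_line(datetime.now(), host, ping_once(host))
        # Reopen the file on every write so each line is flushed to disk
        # even if the script is killed overnight.
        with open(log_path, "a") as log:
            log.write(line + "\n")
        sent += 1
        time.sleep(interval)
```

Calling monitor("203.0.113.10", "wan_ping.log") from a machine outside the firewall and leaving it running overnight should show the first FAIL line right around the time the address drops, which is the timestamp I'm after.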
I did a little more digging and can provide more specifics around the connectivity failure. I have two static IP addresses issued by my ISP, both on the same subnet. The WAN interface of this VM firewall is configured with a primary IP that is reliable. It's the "2nd" IP that seems to fail after about 24 hours, and a reboot of the VM resolves the issue. Sophos calls this "additional addresses", but you'll see similar configs with SonicWALL or other UTM products using a NAT policy.
Anyhow, I am thinking this is ARP related. I have an HP ProCurve switch serving as the backbone of the virtual infrastructure, with Rapid Spanning Tree enabled. A few things I have read recommend turning the BPDU filter on for all ports connected to VMware NICs, as well as killing spanning tree altogether for iSCSI-exclusive VLANs if my switch allows for that granularity. I found another recommendation about enabling "admin-edge-port" on each interface connected to VMware hosts.
This is a new area for me, but I was going to play around with it to see if there is a difference.
Thanks again,
-Adam