First, I made the ultimate mistake: multiple changes at once. When I first bought all my homelab components, I quickly put them together and ran the script from #TeamAlam. It worked great, and I was able to do my release testing for SummitRTS. However, I knew that I wanted to do more with the deployment scripts. Enough already, what’s the problem?
Historically, I have always run my own network and WIFI in my house, independent of my service provider. This way I don’t have to keep changing network settings, passwords, etc. on all of the connected devices if I switch companies. Over the past year I decided to reduce the number of WIFI devices and run Cat5/Cat6 through the house. Which is great; however, it can be a pain when you have two networks. So after some time listening to my wife complain that we could not print from our WIFI devices (because the printer was on a separate network), I decided to flatten all the networks and remove my personal gear.
I know what you’re thinking: why not just move the printer to the same network or create a route between the two networks? That would have been too easy! Actually, based on the location of the printer, it would have been a pain in the butt. So I did it. I flattened my network, and all my devices, including my home lab, were on the same network: the dreaded service provider’s network. I logged into my Comcast Xfinity router and edited the DHCP range to leave myself a block of addresses where I can set up my own DHCP or assign static addresses. So far so good. The wife can print, and all of my physical and WIFI devices are reconnected. Total time: 30 minutes. Hooray!
Notice I said physical and WIFI devices; I have not mentioned my virtual devices. Obviously my vCenter would no longer connect. I could have done the easy thing and just changed the IP address on the VCSA. However, I knew that I wanted to work on the deployment scripts, so off I went. I changed a few things, opened up PowerCLI, started the deployment script, and walked away. To my surprise, I came back to an error. I forget what it was, and I didn’t spend much time thinking about it. I ran the destroy script again, rebooted the ESX host, and started the deployment script. Same error. So I decided to google around a bit and quickly determined that the VCSA couldn’t start because of a DNS-related issue. WHAT?!?! The script uses an IP, not a DNS name. I triple-checked that! Honestly, the script hadn’t changed much from when it had worked a month ago.
Bottom line: I learned two things. First, Comcast Xfinity home routers do not natively support forward and reverse DNS. In fact, you cannot even add DNS records to them as of the date of this writing (unless someone can enlighten me). Second, VCSA requires forward and reverse DNS to be functioning on the network; otherwise the VM will deploy, but the services won’t start. I could probably fake out the VM with a hosts file entry, but I really don’t feel like doing that on every deployment. My goal is to deploy as quickly and easily as possible with no human interaction. I’ve redeployed it 4 times tonight. So my solution for now is to isolate my equipment from the Comcast network using an older NetGear R600 router I have, which can fully support forward and reverse DNS.
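Since a missing forward or reverse record only shows up after the VCSA has already deployed, a quick check before kicking off a deployment could save a redeploy or two. Here is a minimal Python sketch of that idea. It is not part of the deployment scripts; the function name check_dns is my own, and the hostname and IP in the usage note are hypothetical placeholders.

```python
import socket


def check_dns(hostname, expected_ip,
              forward=socket.gethostbyname,
              reverse=socket.gethostbyaddr):
    """Return a list of problems; an empty list means both lookups agree.

    The forward/reverse parameters default to the real resolver calls and
    exist only so the check can be exercised without a live DNS server.
    """
    problems = []

    # Forward lookup: name -> IP
    try:
        resolved_ip = forward(hostname)
        if resolved_ip != expected_ip:
            problems.append(
                f"forward: {hostname} -> {resolved_ip}, expected {expected_ip}")
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        problems.append(f"forward: {hostname} did not resolve ({exc})")

    # Reverse lookup: IP -> name (first element of the returned tuple)
    try:
        resolved_name = reverse(expected_ip)[0]
        # Compare case-insensitively and ignore a trailing dot
        if resolved_name.rstrip(".").lower() != hostname.rstrip(".").lower():
            problems.append(
                f"reverse: {expected_ip} -> {resolved_name}, expected {hostname}")
    except OSError as exc:  # socket.herror is a subclass of OSError
        problems.append(f"reverse: {expected_ip} has no PTR record ({exc})")

    return problems
```

Calling something like check_dns("vcsa.lab.local", "192.168.1.50") from a machine on the lab network should return an empty list when both records exist and match. Anything else is worth fixing before starting the deployment. Note that some resolvers return a short name instead of the FQDN on the reverse lookup, so a mismatch there may need a closer look rather than meaning DNS is broken.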
All sessions are not the same
I’m naive. I usually have 2 console windows open at all times: the 1st is for manually testing one-liners for small scripts, and the 2nd is for running fully automated scripts. With all of the things I was focused on, my sessions were not one of them. This error popped up in the console running my automated script:
So, being the naive person I am, I thought I blew up my ESXi NUC. No problem. I grabbed a second USB stick, loaded ESXi on it, and was quickly back at it. I ran the deployment script and got the same error. HUH? When I googled the error, the results led me to believe that the SDK service had not started on the ESX host or was broken. The IPs were good, and the web services had no problems. At the time I was also testing my one-liners for adding NFS storage in my manual console, and those worked fine. I should note that, upon manual inspection, the automated script did work, even though it was still throwing the SDK errors. Very odd. I didn’t test disconnecting the VI sessions and reconnecting in the same console window, but I would imagine that the PowerCLI session was toast.
Refresh your console windows if you start to get strange errors! Oh, and run as Administrator!!!
Dead NFS connection
Whenever I start working on a project, I turn my power strips on, then power on all my devices. ESXi would take 10-15 minutes to boot. It would hang at nfs4client; it would never fail, just take its sweet old time. It was never really a problem, since I always used the time to open up my programs, check Twitter, Facebook, etc. I can always find a way to avoid work!
Powering on my Synology first and waiting for it to fully boot (which doesn’t take long) resolves this problem. Once the Synology “beeps”, I can power on ESXi and it quickly boots to the login screen. The time difference is amazing!
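I do this by ear, but the same "wait for the NAS, then boot ESXi" ordering could be scripted. The Python sketch below is only an illustration and assumes the Synology serves NFS on the standard TCP port 2049. The function name wait_for_port and the hostname in the usage note are hypothetical.

```python
import socket
import time


def wait_for_port(host, port=2049, timeout=600.0, interval=5.0):
    """Poll until a TCP connection to host:port succeeds.

    Returns True as soon as the port accepts a connection, or False if
    it is still unreachable once `timeout` seconds have passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Port 2049 is the standard NFS port; a successful connect
            # means the NFS service is at least listening.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused or timed out: the NAS is probably still booting.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, wait_for_port("synology.lab.local") would block until the NAS answers or ten minutes have passed. An open port does not guarantee the export itself is ready, so treat it as a rough stand-in for the beep, not a replacement.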
Keep at it! Sometimes it’s a simple fix. Oh, and my deployment times keep getting faster! Also, keep your changes to a minimum when testing new things. Save yourself the frustration.