VMware Cloud Foundation, Will it run on Oracle Ravello? Part 5: Deploying SDDC Manager


Before we begin

If you have not checked out what this series is about then please take a look at the previous parts below.

Part 1: Planning

Part 2: Ancillary Services

Part 3: Management Cluster “Hardware”

Part 4: Management Cluster Software

Software Defined DataCentre Manager deployment

The aim of this part of the blog series is to configure the SDDC management cluster and get vSAN running on it for storage. The VMware Cloud Foundation deployment virtual machine will do this for us.

Before you go any further, make sure you grab a copy of the VMware Cloud Foundation deployment OVA from here. It’s about 9GB.

I would also recommend making a blueprint of your environment at this point. If the process goes wrong (like mine did) you have an easy way to get back to this point.

The stand-alone ESXi host that was deployed in part 2 of this blog series will host the deployment VM.

VCF_On_Ravello_34

Import the OVA and power it on. You should end up with something like this. Note the access URL.

Note: When defining network settings, ensure the NTP server is a local NTP source such as the ADDS server.

VCF_On_Ravello_35

Log into the web UI.

VCF_On_Ravello_36

Usefully, when you first log in, there is a checklist of action items that need to be completed before continuing. Every item I have not ticked below still needs to be rectified on the management hosts. Make sure you do the same.

VCF_On_Ravello_37

VCF_On_Ravello_38

One other item to check is whether the disks on the management cluster hosts are marked as SSD or HDD. They default to HDD, but we want them to appear as SSD drives so we can simulate an all-flash vSAN cluster. This is an easy task if the hosts are already connected to a vCenter server, but as they are not, it needs to be done via the command line. Check out this blog post from William Lam for details.

SSH onto the host and run the following command

This will list all the disks on the host.

VCF_On_Ravello_39

There are quite a few disks to configure. My list of commands ended up looking like this. I also marked the OS drive as an SSD.
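With this many disks it can be easier to generate the commands than to type them. The Python below is a rough sketch of that idea, not my exact list: for each device it prints an `esxcli storage nmp satp rule add` command with the `enable_ssd` option, followed by an `esxcli storage core claiming reclaim` for the same device. The device identifiers are hypothetical placeholders (take the real ones from `esxcli storage core device list` on each host), and `VMW_SATP_LOCAL` is an assumption about which SATP has claimed the local disks.

```python
# Sketch: build the esxcli commands that tag local disks as SSD.
# The device names used in the example are hypothetical placeholders.

def ssd_tag_commands(devices, satp="VMW_SATP_LOCAL"):
    """Return esxcli commands that mark each device as SSD and reclaim it."""
    commands = []
    for device in devices:
        # Add a claim rule that flags the device as SSD.
        commands.append(
            f"esxcli storage nmp satp rule add -s {satp} -d {device} -o enable_ssd"
        )
        # Reclaim the device so the new rule takes effect.
        commands.append(f"esxcli storage core claiming reclaim -d {device}")
    return commands


if __name__ == "__main__":
    example_devices = ["mpx.vmhba0:C0:T0:L0", "mpx.vmhba0:C0:T1:L0"]
    for line in ssd_tag_commands(example_devices):
        print(line)
```

Paste the printed lines into the SSH session on each host. Running `esxcli storage core device list -d <device>` afterwards should report `Is SSD: true` for every tagged device.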

Then check the disks are marked as SSD on the host.

VCF_On_Ravello_40

Moving on, back to the VCF builder. Tick all the boxes, click next, accept the EULA, click next.

VCF_On_Ravello_41

Continue until you end up here, where a configuration file has to be uploaded.

VCF_On_Ravello_42

Download the Excel spreadsheet. It looks like this. All fairly self-explanatory. You can also see why I have not deviated too much from the standard VLAN IDs, naming convention, etc.

VCF_On_Ravello_43

Note that you must have license keys for all of the components shown below. An SDDC-Manager key is not included in the vExpert license pack, but I was assured this should work without one. (It does.)

VCF_On_Ravello_44

Validation failure on the first try!

Summary of failures

  • Password policy not correct for SDDC-Manager
    • Updated password in the spreadsheet
  • DNS records not created for all VMs
    • Created DNS records on VCFDC01 for all components listed in the configuration spreadsheet
  • Network connectivity validation
    • States that the hosts are not accessible, although they are. One to look into
  • vSAN datastore validation
    • Boot disks are smaller than 16GB and need to grow. I have amended previous parts of this blog series to reflect the minimum 16GB disk
  • NTP configuration wrong
    • Can’t use an external NTP provider on the ESXi hosts or VCF builder. Changed to an internal NTP source
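The missing DNS records are the easiest of these to catch before running validation. As a minimal sketch (my own helper, not part of the VCF builder), the Python below takes the hostnames from the configuration spreadsheet and returns the ones that fail a forward lookup. It only checks forward resolution, and the `resolve` parameter exists purely so the lookup can be swapped out.

```python
import socket


def unresolved_hosts(hostnames, resolve=socket.gethostbyname):
    """Return the hostnames that fail a forward DNS lookup."""
    missing = []
    for name in hostnames:
        try:
            resolve(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            missing.append(name)
    return missing
```

Run it from a machine that uses the lab DNS server (VCFDC01 here), passing in the fully qualified names exactly as they appear in the spreadsheet. An empty list back means every name resolved.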

The test did pass for licenses without having an SDDC-Manager license defined.

After addressing most of the problems above the validation test looks like this.

VCF_On_Ravello_46

There are still issues with network connectivity validation, the vSAN disk check is complaining about the cache size on the OS disk, and NTP is still failing on the VCF builder VM. I will accept the errors and move on by clicking Acknowledge at the top right of the screen.

This allows me to start the bring up process, which is to configure and deploy everything we previously defined.

VCF_On_Ravello_47

This then starts a checklist of tasks to run through.

VCF_On_Ravello_48

As the process moves on, we start to see items marked as successfully deployed. Below, two Platform Services Controllers and a vCenter Server Appliance have been deployed on management host 1.

VCF_On_Ravello_49

VMs deployed.

VCF_On_Ravello_50

vSAN datastore created on host 1 for vCenter to reside on before the full cluster is created. Hopefully, the disk type will show as SSD and not unknown post-deployment.

VCF_On_Ravello_51

The deployment proceeds as far as the NSX deployment and then fails.

VCF_On_Ravello_52

Logging into the vCenter server reveals that the OVF file could not be deployed.

VCF_On_Ravello_53

I tried to create an empty folder on the vSAN datastore and hit the same issue: I cannot create any files on the vSAN datastore.

Troubleshooting

The first hit in Google for the error was this VMware KB, but unfortunately, this did not resolve the issue. A little further testing led me to discover that I could not send a jumbo packet with vmkping to the vSAN IP address on host 01, although this worked for the other 3 hosts. Could it be a connectivity issue?

I powered the whole cluster down and brought it back online. After this, I could get a jumbo packet vmkping through to the vSAN IP on host 01 and also noticed that the vSAN datastore was now reporting the correct amount of storage available. I suspect there were communication failures between the vSAN nodes causing issues with OVF deployment.
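For reference, a jumbo packet test only proves anything if the packet is not allowed to fragment and the payload is sized to fill the MTU. The sketch below works out that payload (MTU minus the 20-byte IPv4 header and the 8-byte ICMP header) and builds the matching `vmkping` command. The interface name and IP address are placeholders, and an MTU of 9000 is an assumption about the vSAN network.

```python
IP_HEADER_BYTES = 20    # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo header


def vmkping_command(target_ip, vmk_interface, mtu=9000):
    """Build a vmkping command that tests the full MTU without fragmentation."""
    payload = mtu - IP_HEADER_BYTES - ICMP_HEADER_BYTES
    # -I selects the VMkernel interface, -s sets the payload size,
    # -d sets the don't-fragment bit.
    return f"vmkping -I {vmk_interface} -s {payload} -d {target_ip}"


if __name__ == "__main__":
    # Placeholder interface and address; use the vSAN vmk and peer IP.
    print(vmkping_command("192.0.2.11", "vmk2"))
```

If this fails while a default-size vmkping to the same address works, something in the path is not passing jumbo frames.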

Creating an empty folder following the power cycle worked as expected.

VCF_On_Ravello_54

Deployment Continuation

The next step was NSX deployment. Most of this completed, except that one of the hosts purple screened during deployment, bringing down vSAN again. Unfortunately, this led to the vCenter Server Appliance becoming orphaned, which is the one VM you do not want orphaned on the vSAN datastore. There are a number of troubleshooting articles on checking vSAN state from the Ruby vSphere Console, but the Ruby vSphere Console runs on the vCenter appliance.

The VM in question on host 2.

VCF_On_Ravello_55

After much checking of the state of vSAN from ESXCLI, I decided to destroy the hosts and start again.

VCF_On_Ravello_56

TL;DR: each node in the vSAN cluster reported itself as the Master node, rather than the cluster having a Master, a Backup and Agent nodes. Try as I might, I could not remedy this issue. I did have some great help from the guys in the vSAN channel in the vExpert Slack, but my problem was one that had not been seen before :(.

So: redeploy the 4 vSAN hosts and the VCF builder VM, and start again. In the failed attempts above, I had two disk groups defined with 6 disks in use. Subsequent host deployment was reduced to a single disk group per host. Earlier blog posts on deploying hosts have been amended to reflect this setup.
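If you want to spot this condition quickly, the role each host thinks it has is on the `Local Node State` line of `esxcli vsan cluster get`. The sketch below is a hypothetical helper, not something I used at the time: it pulls that field out of the output captured from each host and flags the case where more than one host claims to be the master.

```python
def local_node_state(cluster_get_output):
    """Extract the 'Local Node State' value from `esxcli vsan cluster get` output."""
    for line in cluster_get_output.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "Local Node State":
            return value.strip()
    return None


def looks_partitioned(states):
    """Return True when more than one host claims the MASTER role."""
    return sum(1 for state in states if state == "MASTER") > 1
```

Collect the command output from all four hosts, pass each through `local_node_state`, and hand the resulting list to `looks_partitioned`. A healthy four-node cluster should give one `MASTER`, one `BACKUP` and two `AGENT`.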

Next crack at deploying VCF, yeah boi, success!

VCF_On_Ravello_57

And SDDC Manager is accessible. Happy days.

VCF_On_Ravello_58

 

Epilogue

A lot of time went into making this work. vSAN gave me the biggest headache: I deployed the cluster 4 times before deciding to drop back to a single disk group per host.

The next post will focus on making sure all the services are working correctly and addressable from SDDC Manager. Check it out in Part 6: SDDC Manager configuration.
