NGINX in TKG Cluster in vSphere Workload Management - Road to Kubeflow in vSphere

Welcome to the third part in this series. This series has the target to finally deploy Kubeflow in vSphere, but the post is only one of the steps required for our destination.

Don’t forget to catch up if you missed how we set up Workload Management (the supervisor cluster) or this brief differentiation between TKG Cluster and the Workload Management supervisor cluster.

What you’re going to learn

We’ll set up a TKG cluster within vSphere 7 in the supervisor cluster created by the Workload Management. For this to successfully work we need to create a content library, authenticate with the kubectl vSphere plugin and utilize TKG for obtaining all necessary information. Finally, we’ll make sure to deploy a NGINX. I give you an example on how to overcome the Pod Security Policy, which is enabled on TKG Clusters.

Necessary for completion:

For a successful NGINX Deployment, you’ll need the vSphere admin account, not the admin role, this is NOT enough. Theoretically it’s not necessary, for CREATING the TKG cluster, but you will have to guess some parameters. Unfortunately, you can’t access crucial information without the admin account. When you have obtained this information, each configured user later on may be enabled to create the TKG cluster on their own.

Additional make sure to download this binary TKG CLI . This is also technically not necessary, but it’ll ease the access to information.

Breaking down this tutorial

All steps for your TKG Cluster and the NGINX Deployment:

Subscribe the content library from VMware (for obtaining OVF Templates)
Create a namespace, where you want to deploy your TKG Cluster
Create a TKG Cluster
Create NGINX Deployment

1. Subscribing the content library from VMware

In the first post I explained, that the TKG Cluster is realized as VMs i.e. the Kubernetes control planes and worker nodes are VMs. Therefore, we need OVF templates, so TKG can set up these VMs. This step is solely needed, when you want to utilize TKG, otherwise this is not necessary.

Connect to vSphere Client and select Content Libraries via the Menu

setting up content library step 1 — setting up the content library step 1

Create a new Content Library

Choose a name, and on which vCenter Server to save, click next

Subscribe to the following URL:

https://wp-content.vmware.com/v2/latest/lib.json

Make sure that you select “immediately”, because if not, TKG won’t find the OVF templates, and when applying the TKG Cluster .yaml file, Kubernetes will throw an error!

Ignore this warning. The chosen URL is from VMware and a check via the browser against that URL shows no certificate problems. Click Yes!

Finally select the Storage Location and visit the last configuration step.

One last check. Pressing FINISH creates the Content Library

Choose your Cluster, where the Workload Management is configured in. Within Configuration > Namespaces > General you’ll set up the Content Library, solely needed for TKG.

The last step! Select your freshly created Content Library and Finish with OK.

2. Creating a namespace

A namespace in Workload Management is not exactly the same as a namespace in Kubernetes. You can’t create namespaces in the supervisor cluster. Workload Management does not mind your admin privileges, it’ll deny your request. Even though creating a namespace in Workload Management will create a namespace in the supervisor cluster, too.

Select Workload Management in the vSphere Client via the Menu

Select Namespaces followed by Create Namespace

Choose your ESXi Cluster, in which you deployed your Workload Management, put a namespace-name and remember it

Add a Permission, so someone can deploy into this cluster. This step can be skipped, when you connect as a vsphere.local Administrator to the cluster

Select an identity source, e.g. your domain. The identity source and the username are required later, when connecting against the cluster!

Finally configure the Storage

Provide any VM Storage Policy, you’ve maybe already configured. I’ve chosen the Storage Policy which is used for Workload Management.

Congratulations, the namespace is created and configured.

3. Create the TKG Cluster

Now we need the TKG cli, download it to your work machine, make sure it’s executable.

kubectl and the corresponding kubectl-vsphere plugin will be downloaded in the next few steps, as they’re given to us conveniently by Workload Management.

kubectl communicates with any cluster – based on information stored in the following file:

~/.kube/config

VMware provides a cli binary: kubectl-vsphere. This binary checks authentication and, based on authorization, configures your ~/.kube/config file.

After that it’s mostly regular kubectl. Only for the connection to the TKG cluster we’ll need to kubectl-vsphere binary again.

Screenshots now are sparse, as the console commands must be pasted anyways.

We left on namespaces (if you somehow manage to not find it again, don’t worry: Menu > Workload Management > Namespace > <your namespace>)

Make sure to remember the IP of the website that you’re currently accessing (it’s the IP of the supervisor cluster)

You could manually download the binaries, I’ll prefer using wget and creating my own scripts, so I copy the download link.

Unpack, move binaries to binary location and remove the remainder

Remember to change the IP to your supervisor cluster

{
  wget https://<IP-of-supervisor-cluster>/wcp/plugin/linux-amd64/vsphere-plugin.zip
  unzip vsphere-plugin.zip
  sudo mv bin/* /usr/local/bin
  rm -r bin
  rm vsphere-plugin.zip
}

Now utilize the vsphere plugin and connect as the vSphere administrator.

The next commands are all done on the supervisor cluster.

kubectl config use-context <supervisor-ip> #10.4.10.1

Now configure the TKG binary

tkg add mc <supervisor-ip> #10.4.10.1
tkg set mc <supervisor-ip> #10.4.10.1

Obtain configuration parameters,

we need the

storageclass (where can we save what)
virtualmachineclasses (how much ressources to allocate for the worker and master nodes of the tkg cluster)
virtualmachineimage (which image to use to deploy the worker and master nodes).

Save the circled information!

create tkg cluster get virtualmachineimages

create tkg cluster get virtual machine classes with extra information

kubectl get storageclass
kubectl get virtualmachineimages
kubectl get virtualmachineclasses
kubectl get virtualmachineclasses -o jsonpath="{range .items[*]}{.metadata.name}{'\t'}{.spec.hardware.cpus}{'\t'}{.spec.hardware.memory}{'\n'}"

Now create some environment variables. the SERVICE_DOMAIN must match cluster.local or kubeflow will later on have troubles! Exchange found storage classes. The service/cluster CIDR are TKG Cluster internal, choose whatever you want to.

export CONTROL_PLANE_STORAGE_CLASS=gold-tanzu-kubernetes-storage-policy
export WORKER_STORAGE_CLASS=gold-tanzu-kubernetes-storage-policy
export DEFAULT_STORAGE_CLASS=gold-tanzu-kubernetes-storage-policy
export STORAGE_CLASSES=
export SERVICE_DOMAIN=cluster.local #don't change this! kubeflow needs it.
export CONTROL_PLANE_VM_CLASS=guaranteed-medium
export WORKER_VM_CLASS=guaranteed-xlarge
export SERVICE_CIDR=100.64.0.0/13
export CLUSTER_CIDR=100.96.0.0/11

Create the TKG cluster, after you’ve exported all these variables.
This command creates a .yaml file and saves it in tkg-cluster-creation-small. Exchange your cluster name (kubeflow-tkg), the namespace (kubeflow-ns), the kubernetes-version (v1.17.8+vmware.1-tkg.1.5417466). You do find the kubernetes version with the command from step 8 (kubectl get virtualmachineimages).

tkg config cluster kubeflow-tkg --plan=dev \
--namespace=kubeflow-ns \
--kubernetes-version=v1.17.8+vmware.1-tkg.1.5417466 \
--controlplane-machine-count=1 \
--worker-machine-count=3 > tkg-cluster-creation-small.yaml

Apply the yaml

kubectl apply -f tkg-cluster-creation.yaml

Monitor Cluster creation

Now that the cluster is created, we’ll check vSphere… Once again we’ll open the namespace via Menu > Workload Management > Namespace > <your-namespace>.

Note down the name (1) of the namespace and the TKG Cluster name (4) to proceed.

When the cluster is green, meaning everything is up and running, you should proceed.

4. Create an NGINX Deployment

Creating a Pod in a TKG Cluster isn’t hard, it’ll work out of the box. But when creating a Deployment, for example, it’ll not work. We’ll hit the configured Pod Security Policy of VMware, because Service Accounts are utilized by Deployments. The Deployment will be created, but it won’t finish. When deploying Kubeflow we’ll fix this issue, too, so don’t stop reading now.

Let us connect to the freshly created TKG Cluster with the kubectl-vsphere plugin and the following command.

Let us connect to the freshly created TKG Cluster with the kubectl-vsphere plugin and the following command.

kubectl vsphere login --server=<supervisor-ip> \
--vsphere-username <your-configured-user@<your-domain> \
--tanzu-kubernetes-cluster-namespace=<your-namespace> \
--tanzu-kubernetes-cluster-name=<your-TKG-Cluster-name>

The supervisor-ip is the same as the last one! It’ll handle authentication and authorization for you!

console input and output for logging into the TKG Cluster

When you see this output message, you can proceed, kubectl is already configured to use the correct context. It’s still possible to switch out of the TKG Cluster back into the Supervisor Cluster, but that’s not necessary for us. Checking cluster health is free and fast, therefore type the following command for one last check.

kubectl get nodes

kubectl check nodes health — checking node health of the kubectl VMs

Create a Role Binding between Service Accounts and a new namespace

Full blown explanation for Pod Security Policies (PSP) is surely not my goal here. We’ll take the already specified PSP, attach them to all created Service Accounts (SA) from a new namespace (ns).

Obtain all available PSP

kubectl get psp

create nginx step 1 — obtaining the PSP name

Create a new namespace, save this name for later

kubectl create namespace test

Create a Role Binding between SA and PSP in the freshly created test namespace. Focus on the highlighted lines!

cat << EOF > test-rb.yaml
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: rb-all-sa_ns-test
  namespace: test
roleRef:
  kind: ClusterRole
  name: psp:vmware-system-privileged
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: Group
  apiGroup: rbac.authorization.k8s.io
  name: system:serviceaccounts:test
EOF

kubectl apply -f test-rb.yaml

rm test-rb.yaml

With that RoleBinding each SA of the NS test will have a privileged PSP attached. Proceed with your deployment.

kubectl run nginx --image=nginx

Final thoughts

This was a rather long post with many potential problems. They may or may not occur. I’m going to create a special post with some erros I’ve hit. Let’s be honest, it’s not a script, you may forget something, overread a sentence or whatever!

Additionally I’m not advising for any set up security in this post, especially regarding Pod Security Policy, please check this topic out for yourself!

NGINX in TKG Cluster in vSphere Workload Management – Road to Kubeflow in vSphere

What you’re going to learn

Necessary for completion:

Breaking down this tutorial

1. Subscribing the content library from VMware

2. Creating a namespace

3. Create the TKG Cluster

Monitor Cluster creation

4. Create an NGINX Deployment

Create a Role Binding between Service Accounts and a new namespace

Final thoughts

Kommentar absenden Antworten abbrechen

Search

Twitter Feed

Recent Posts

Recent Comments

Archives

Categories

Meta