TMC Data Protection is a relatively new feature in Tanzu Mission Control (TMC) that allows you to back up and restore your Kubernetes clusters, namespaces, or label-selected objects. It is based on Velero, is easy to configure, and even allows you to set schedules for regular backups. I've used Velero on its own in the past against workloads in TKG clusters, but this was my first time using it in a more automated fashion with TMC. I was thoroughly impressed and wanted to share the experience.
Create a TMC credential and AWS CloudFormation stack to be used by TMC Data Protection
These steps assume that you already have an IAM account set up on AWS. You can find detailed instructions for creating one in my previous blog post, How to deploy a TKG cluster on AWS using Tanzu Mission Control.
In the TMC UI, navigate to Administration and click the Create Account Credential dropdown and then choose AWS data protection credential.
Enter a descriptive name for the new credential and then click the Generate Template button.
Click the Next button.
The template that was created will be used to create a new CloudFormation stack. Head back to the main AWS page and navigate to CloudFormation under Management & Governance. Click the Create stack dropdown at the top right and then choose With new resources (standard).
Under Specify template, select Upload a template file. Click the Choose file button and then select the .template file that was created in TMC earlier.
Click the Next button.
Enter a name for the new stack and click the Next button.
Click the Next button on the Configure stack options page as there is nothing we need to change here.
Ensure everything is correct on the Review page. Click the checkbox at the bottom to acknowledge the IAM resources message and then click the Create stack button.
You should see that the creation is in progress.
You can click the refresh button and see various tasks running and completing.
Click on the Stack info link. When the Status is CREATE_COMPLETE, the stack is ready.
Click on the Outputs link and copy the ARN value shown there.
You can now head back to TMC and click the Next button in the AWS Configuration section.
Click the Create Credential button.
You should now see your credential listed on the Administration > Accounts page.
Enable Data Protection on your cluster
In the TMC UI, navigate to Clusters and click on your cluster. Click on the Enable Data Protection link at the bottom of the page.
Select the credential you just created and click the Enable button.
You’ll see that Data Protection is being enabled in the TMC UI.
If you navigate to Storage > S3 at AWS, you should see a new S3 bucket being created.
When Data Protection is fully enabled, you’ll see a Create Backup link in the Data Protection pane for your cluster.
With Data Protection enabled, you'll see a new namespace named velero with a single pod in it.
kubectl --kubeconfig=kubeconfig-clittle-test-cluster.yml -n velero get po
NAME                     READY   STATUS    RESTARTS   AGE
velero-88ff97748-85tq7   1/1     Running   0          38m
At this point, you should install the velero executable so that you can have better visibility into your backups and restores. It’s a single file and you can find simple instructions for installing it at https://velero.io/docs/v1.4/basic-install/#install-the-cli.
Backup a workload with Data Protection
If you have a workload deployed, you can click the Create Backup link to test out this functionality. I have a WordPress application deployed in this cluster so I’ll be using that for this exercise. I can see that the site is up and has some data via a post I created on it.
You have several options to choose from when deciding how you'll back up your workload. You can back up the entire cluster, a specific namespace, or resources based on labels. In my case, all resources associated with the WordPress application have the app=wordpress label applied, so I'll use that option. Click the Next button when you've configured what you'll be backing up.
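The label-based option is conceptually just a filter over each resource's labels. A minimal Python sketch of that idea (the resource list below is hypothetical, modeled on this exercise; it is not how TMC or Velero is implemented):

```python
# Toy illustration of label-based selection: keep only resources
# whose labels include app=wordpress.
resources = [
    {"kind": "Deployment", "name": "wordpress", "labels": {"app": "wordpress"}},
    {"kind": "Deployment", "name": "wordpress-mysql", "labels": {"app": "wordpress"}},
    {"kind": "Service", "name": "kubernetes", "labels": {}},
]

selected = [r for r in resources if r["labels"].get("app") == "wordpress"]
print([r["name"] for r in selected])  # ['wordpress', 'wordpress-mysql']
```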
We’re going to do a single backup so we can leave the Schedule type set to “Now” and click the Next button. I’ll circle back to some of the schedule options later.
The default retention period is 30 days and you can change this to suit your needs. Click the Next button when you’re ready to proceed.
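The 30-day default shows up in Velero's output as a TTL expressed in hours. A quick sanity check of that conversion:

```python
# TMC's 30-day retention corresponds to Velero's TTL of 720h0m0s.
days = 30
ttl = f"{days * 24}h0m0s"
print(ttl)  # 720h0m0s
```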
Give your backup a name and click the Create button. You’ll be redirected to the Data Protection page for your cluster where you can watch the progress of the backup.
When the backup is finished you can click on it to see more details about what was backed up.
You can also drill down into the S3 bucket that was created at AWS to see contents of the backup.
You can also use the velero executable to see the backup and its details.
velero --kubeconfig=kubeconfig-clittle-test-cluster.yml get backups
NAME                  STATUS      CREATED                         EXPIRES   STORAGE LOCATION   SELECTOR
wordpress-dp-backup   Completed   2020-08-07 14:34:56 -0600 MDT   29d       cjlittle-dp        app=wordpress
velero --kubeconfig=kubeconfig-clittle-test-cluster.yml describe backup wordpress-dp-backup
Name:         wordpress-dp-backup
Namespace:    velero
Labels:       tmc.cloud.vmware.com/creator=clittle
              tmc.cloud.vmware.com/managed=true
              velero.io/storage-location=cjlittle-dp
Annotations:  velero.io/source-cluster-k8s-gitversion=v1.18.5+vmware.1
              velero.io/source-cluster-k8s-major-version=1
              velero.io/source-cluster-k8s-minor-version=18

Phase:  Completed

Namespaces:
  Included:  *
  Excluded:  <none>

Resources:
  Included:        *
  Excluded:        <none>
  Cluster-scoped:  auto

Label selector:  app=wordpress

Storage Location:  cjlittle-dp

Snapshot PVs:  auto

TTL:  720h0m0s

Hooks:  <none>

Backup Format Version:  1

Started:    2020-08-07 14:34:56 -0600 MDT
Completed:  2020-08-07 14:35:02 -0600 MDT

Expiration:  2020-09-06 14:34:56 -0600 MDT

Persistent Volumes:  2 of 2 snapshots completed successfully (specify --details for more information)
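Note that the Expiration in this output is simply the Started time plus the 720-hour TTL, which you can verify with Python's datetime module:

```python
from datetime import datetime, timedelta

# Values taken from the describe output above (local time, MDT).
started = datetime(2020, 8, 7, 14, 34, 56)
ttl = timedelta(hours=720)  # the 720h0m0s TTL

expiration = started + ttl
print(expiration)  # 2020-09-06 14:34:56
```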
Restore the contents of a backup using Data Protection
I deleted my WordPress application after creating the backup.
kubectl --kubeconfig=kubeconfig-clittle-test-cluster.yml delete -k wordpress-aws/
storageclass.storage.k8s.io "k8s-policy" deleted
secret "mysql-pass-bd45fkk6kd" deleted
service "wordpress-mysql" deleted
service "wordpress" deleted
deployment.apps "wordpress-mysql" deleted
deployment.apps "wordpress" deleted
persistentvolumeclaim "mysql-pv-claim" deleted
persistentvolumeclaim "wp-pv-claim" deleted
kubectl --kubeconfig=kubeconfig-clittle-test-cluster.yml delete pv --all
persistentvolume "pvc-3a3eef27-e8fa-497c-9491-3bda00225be1" deleted
persistentvolume "pvc-4a476072-a8aa-403b-88eb-650e3b044054" deleted
kubectl --kubeconfig=kubeconfig-clittle-test-cluster.yml get deployment,replicaset,po,svc
NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
service/kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   4h15m
And if I try to access my WordPress site, it’s not there anymore.
Let’s hope that TMC Data Protection can get it back!
When you’re ready to perform a restore, you’ll need to navigate to Clusters, select your cluster and then select the Data Protection tab. From here you will click on the backup (in the Backups section) that you want to restore.
On the details page for your backup, click on the Restore Backup button.
You can choose to either restore the entire backup or a selection from the backup based on included namespaces or labels. For my example, there should be no difference between selecting the entire backup or using the label selection that was used when creating the backup (app=wordpress). Click the Next button when you're ready to move on.
Enter a name for your restore and click the Restore button.
You can follow the progress of the restore in the Restores pane of the Data Protection tab.
You can also use the velero CLI to get more information about the restore.
velero --kubeconfig=kubeconfig-clittle-test-cluster.yml get restores
NAME                   BACKUP                STATUS       WARNINGS   ERRORS   CREATED                         SELECTOR
wordpress-dp-restore   wordpress-dp-backup   InProgress   0          0        2020-08-07 14:56:31 -0600 MDT   app=wordpress
velero --kubeconfig=kubeconfig-clittle-test-cluster.yml describe restore wordpress-dp-restore
Name:         wordpress-dp-restore
Namespace:    velero
Labels:       tmc.cloud.vmware.com/creator=clittle
              tmc.cloud.vmware.com/managed=true
Annotations:  <none>

Phase:  InProgress

Backup:  wordpress-dp-backup

Namespaces:
  Included:  *
  Excluded:  vmware-system-tmc

Resources:
  Included:        *
  Excluded:        nodes, events, events.events.k8s.io, backups.velero.io, restores.velero.io, resticrepositories.velero.io
  Cluster-scoped:  included

Namespace mappings:  <none>

Label selector:  app=wordpress

Restore PVs:  auto
When the restore is done you’ll see the status as Ready on the Data Protection page.
And you should see all objects restored in your cluster.
kubectl --kubeconfig=kubeconfig-clittle-test-cluster.yml get deployment,replicaset,po,svc
NAME                              READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/wordpress         1/1     1            1           90s
deployment.apps/wordpress-mysql   1/1     1            1           90s

NAME                                         DESIRED   CURRENT   READY   AGE
replicaset.apps/wordpress-5fb76c9f88         1         1         1       90s
replicaset.apps/wordpress-mysql-5fcd84f896   1         1         1       90s

NAME                                   READY   STATUS    RESTARTS   AGE
pod/wordpress-5fb76c9f88-z7dvs         1/1     Running   0          90s
pod/wordpress-mysql-5fcd84f896-hv8h5   1/1     Running   0          90s

NAME                      TYPE           CLUSTER-IP      EXTERNAL-IP                                                               PORT(S)        AGE
service/kubernetes        ClusterIP      10.96.0.1       <none>                                                                    443/TCP        4h52m
service/wordpress         LoadBalancer   10.104.31.148   ab14057ea523e4721aeb075db9338685-1719023603.us-east-1.elb.amazonaws.com   80:30705/TCP   91s
service/wordpress-mysql   ClusterIP      None            <none>                                                                    3306/TCP       91s
The external address for the wordpress service has changed since the load balancer was recreated at AWS, but you can access the new address to validate that nothing was lost.
One thing is worth mentioning if you're going to try this exact scenario on your own: WordPress saves the load balancer URL internally, so you will likely have some issues accessing the site after the restore. You can create a local hosts entry that maps the old address to the new load balancer IP address to get back in. Once in, you can update the URL that WordPress has saved to the new one.
As I mentioned earlier, you can schedule backups to occur on a recurring basis. The options are hourly, daily, weekly, monthly, or a custom schedule that you can specify in cron format.
In this example, I'm creating an hourly backup that will run at 20 minutes past the hour and back up the entire cluster. As soon as the scheduled backup is created, a backup is taken and you will see the result of it in the Backups list. You'll also see your new schedule in the Schedules list.
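The schedule is a standard five-field cron expression; 20 * * * * means minute 20 of every hour, every day. A toy matcher for just the minute field (nothing like Velero's actual parser, which handles ranges, steps, and all five fields):

```python
from datetime import datetime

def matches_minute_field(expr: str, when: datetime) -> bool:
    """Check a time against the minute field of a 5-field cron expression."""
    minute_field = expr.split()[0]
    return minute_field == "*" or int(minute_field) == when.minute

# "20 * * * *" fires at 20 minutes past every hour
print(matches_minute_field("20 * * * *", datetime(2020, 8, 28, 9, 20)))  # True
print(matches_minute_field("20 * * * *", datetime(2020, 8, 28, 9, 21)))  # False
```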
From the command line, using the velero command, we can see the schedule that has been created and some details about it.
velero --kubeconfig=kubeconfig-clittle-test-cluster.yml get schedules
NAME                    STATUS    CREATED                         SCHEDULE     BACKUP TTL   LAST BACKUP   SELECTOR
hourly-cluster-backup   Enabled   2020-08-28 09:15:49 -0600 MDT   20 * * * *   720h0m0s     11m ago       <none>
velero --kubeconfig=kubeconfig-clittle-test-cluster.yml schedule describe hourly-cluster-backup
Name:         hourly-cluster-backup
Namespace:    velero
Labels:       tmc.cloud.vmware.com/creator=c_01EGTRHZ6NN94KB03RNHAPPZPE
              tmc.cloud.vmware.com/managed=true
Annotations:  <none>

Phase:  Enabled

Schedule:  20 * * * *

Backup Template:
  Namespaces:
    Included:  *
    Excluded:  <none>

  Resources:
    Included:        *
    Excluded:        <none>
    Cluster-scoped:  included

  Label selector:  <none>

  Storage Location:  cjlittle-dp

  Snapshot PVs:  false

  TTL:  720h0m0s

  Hooks:  <none>

Last Backup:  2020-08-28 09:20:04 -0600 MDT
I set the recurrence time such that a scheduled backup would fire off a few minutes after I created it, and we can see that there is a second cluster backup in the list at 20 minutes after the hour.
I decided to leave this running for a while to make sure it kept up with my schedule and so far nothing has been missed.
velero --kubeconfig=kubeconfig-clittle-test-cluster.yml get backups
NAME                                   STATUS      CREATED                         EXPIRES   STORAGE LOCATION   SELECTOR
hourly-cluster-backup-20200828232004   Completed   2020-08-28 17:20:04 -0600 MDT   29d       cjlittle-dp        <none>
hourly-cluster-backup-20200828222004   Completed   2020-08-28 16:20:04 -0600 MDT   29d       cjlittle-dp        <none>
hourly-cluster-backup-20200828212004   Completed   2020-08-28 15:20:04 -0600 MDT   29d       cjlittle-dp        <none>
hourly-cluster-backup-20200828202004   Completed   2020-08-28 14:20:04 -0600 MDT   29d       cjlittle-dp        <none>
hourly-cluster-backup-20200828192004   Completed   2020-08-28 13:20:04 -0600 MDT   29d       cjlittle-dp        <none>
hourly-cluster-backup-20200828182004   Completed   2020-08-28 12:20:04 -0600 MDT   29d       cjlittle-dp        <none>
hourly-cluster-backup-20200828172004   Completed   2020-08-28 11:20:04 -0600 MDT   29d       cjlittle-dp        <none>
hourly-cluster-backup-20200828162004   Completed   2020-08-28 10:20:04 -0600 MDT   29d       cjlittle-dp        <none>
hourly-cluster-backup-20200828152004   Completed   2020-08-28 09:20:04 -0600 MDT   29d       cjlittle-dp        <none>
hourly-cluster-backup-20200828151549   Completed   2020-08-28 09:15:49 -0600 MDT   29d       cjlittle-dp        <none>
wordpress-dp-backup                    Completed   2020-08-28 09:10:39 -0600 MDT   29d       cjlittle-dp        app=wordpress
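Notice that each scheduled backup name appends a timestamp suffix, which appears to be the creation time in UTC (it sits six hours ahead of the MDT times in the CREATED column). You can pull the suffix apart with Python:

```python
from datetime import datetime

# One of the scheduled backup names from the list above.
name = "hourly-cluster-backup-20200828232004"

# The suffix after the last hyphen looks like YYYYMMDDHHMMSS.
stamp = name.rsplit("-", 1)[1]
created = datetime.strptime(stamp, "%Y%m%d%H%M%S")
print(created)  # 2020-08-28 23:20:04
```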
And if we take a look at our S3 bucket on AWS, we can see that there is a separate folder created for each backup.