Enable and use Data Protection in TMC

TMC Data Protection is a relatively new feature in TMC that allows you to back up and restore your Kubernetes clusters, namespaces or label-specific objects. It is based on Velero, is easy to configure and even allows you to set schedules for regular backups. I’ve used Velero on its own in the past against workloads in TKG clusters but this was my first time using it in a more automated fashion with TMC. I was thoroughly impressed and wanted to share the experience.

Create a TMC credential and AWS CloudFormation stack to be used by TMC Data Protection

These steps assume that you already have an IAM account set up on AWS. You can find detailed instructions for creating one in my previous blog post, How to deploy a TKG cluster on AWS using Tanzu Mission Control.

In the TMC UI, navigate to Administration and click the Create Account Credential dropdown and then choose AWS data protection credential.

Enter a descriptive name for the new credential and then click the Generate Template button.

Click the Next button.

The template that was created will be used to create a new CloudFormation stack. Head back to the main AWS page and navigate to CloudFormation under Management & Governance. Click the Create stack dropdown at the top right and then choose With new resources (standard).

Under Specify template, select Upload a template file. Click the Choose file button and then select the .template file that was created in TMC earlier. 

Click the Next button.

Enter a name for the new stack and click the Next button.

Click the Next button on the Configure stack options page as there is nothing we need to change here.

Ensure everything is correct on the Review page. Click the checkbox at the bottom to acknowledge the IAM resources message and then click the Create stack button.

You should see that the creation is in progress.

You can click the refresh button and see various tasks running and completing.

Click on the Stack info link. When the Status is CREATE_COMPLETE, the stack is ready.

Click on the Outputs link and copy the ARN value noted.
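If you prefer the command line, the same output can also be read with the AWS CLI. The sketch below is only an illustration: the stack name is a placeholder, it assumes the stack exposes a single output, and looks_like_arn is a hypothetical helper (not part of TMC or the AWS CLI) that does a rough format check before you paste the value into TMC.

```shell
# Hypothetical helper: rough check that a value has the general ARN shape
# arn:partition:service:region:account-id:resource
looks_like_arn() {
  case "$1" in
    arn:*:*:*:*:?*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example usage (needs AWS credentials; the stack name is a placeholder):
#   arn=$(aws cloudformation describe-stacks --stack-name my-tmc-dp-stack \
#           --query 'Stacks[0].Outputs[0].OutputValue' --output text)
#   looks_like_arn "$arn" && echo "$arn"
```

The check only looks at the shape of the string, so it catches copy and paste slips but says nothing about whether the ARN actually exists.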

You can now head back to TMC and click the Next button in the AWS Configuration section.

Click the Create Credential button.

You should now see your credential listed on the Administration > Accounts page.

Enable Data Protection on your cluster

In the TMC UI, navigate to Clusters and click on your cluster. Click on the Enable Data Protection link at the bottom of the page.

Select the credential you just created and click the Enable button.

You’ll see that Data Protection is being enabled in the TMC UI.

If you navigate to Storage > S3 at AWS, you should see a new S3 bucket being created.

When Data Protection is fully enabled, you’ll see a Create Backup link in the Data Protection pane for your cluster.

With Data Protection enabled you’ll see that there is a new namespace named velero with a single pod in it.

kubectl --kubeconfig=kubeconfig-clittle-test-cluster.yml -n velero get po

NAME                     READY   STATUS    RESTARTS   AGE
velero-88ff97748-85tq7   1/1     Running   0          38m
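If you want to script this check rather than eyeball it, something like the following could work. It is a minimal sketch: all_pods_ready is a hypothetical helper name, and it simply parses the table that kubectl prints, assuming the default column layout shown above (NAME, READY, STATUS).

```shell
# Hypothetical helper: reads `kubectl get po` table output on stdin and
# succeeds only if at least one pod is listed and every pod is Running
# with all of its containers ready (for example 1/1).
all_pods_ready() {
  awk 'NR > 1 && NF {
         n++
         split($2, r, "/")
         if ($3 != "Running" || r[1] != r[2]) bad++
       }
       END { exit ((n == 0 || bad > 0) ? 1 : 0) }'
}

# Example usage against the cluster from this post:
#   kubectl --kubeconfig=kubeconfig-clittle-test-cluster.yml -n velero get po \
#     | all_pods_ready && echo "velero is ready"
```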

At this point, you should install the velero executable so that you can have better visibility into your backups and restores. It’s a single file and you can find simple instructions for installing it at https://velero.io/docs/v1.4/basic-install/#install-the-cli.

Back up a workload with Data Protection

If you have a workload deployed, you can click the Create Backup link to test out this functionality. I have a WordPress application deployed in this cluster so I’ll be using that for this exercise. I can see that the site is up and has some data via a post I created on it.

You have several options to choose from when deciding how you’ll back up your workload. You can back up the entire cluster, a specific namespace or resources based on labels. In my case, all resources associated with the WordPress application have the app=wordpress label applied so I’ll use that option. Click the Next button when you’ve configured what you’ll be backing up.

We’re going to do a single backup so we can leave the Schedule type set to “Now” and click the Next button. I’ll circle back to some of the schedule options later.

The default retention period is 30 days and you can change this to suit your needs. Click the Next button when you’re ready to proceed.

Give your backup a name and click the Create button. You’ll be redirected to the Data Protection page for your cluster where you can watch the progress of the backup.

When the backup is finished you can click on it to see more details about what was backed up.

You can also drill down into the S3 bucket that was created at AWS to see the contents of the backup.

You can also use the velero executable to see the backup and its details.

velero --kubeconfig=kubeconfig-clittle-test-cluster.yml get backups

NAME                  STATUS      CREATED                         EXPIRES   STORAGE LOCATION   SELECTOR
wordpress-dp-backup   Completed   2020-08-07 14:34:56 -0600 MDT   29d       cjlittle-dp        app=wordpress

velero --kubeconfig=kubeconfig-clittle-test-cluster.yml describe backup wordpress-dp-backup

Name:         wordpress-dp-backup
Namespace:    velero
Labels:       tmc.cloud.vmware.com/creator=clittle
Annotations:  velero.io/source-cluster-k8s-gitversion=v1.18.5+vmware.1

Phase:  Completed

Namespaces:
  Included:  *
  Excluded:  <none>

Resources:
  Included:        *
  Excluded:        <none>
  Cluster-scoped:  auto

Label selector:  app=wordpress

Storage Location:  cjlittle-dp

Snapshot PVs:  auto

TTL:  720h0m0s

Hooks:  <none>

Backup Format Version:  1

Started:    2020-08-07 14:34:56 -0600 MDT
Completed:  2020-08-07 14:35:02 -0600 MDT

Expiration:  2020-09-06 14:34:56 -0600 MDT

Persistent Volumes:  2 of 2 snapshots completed successfully (specify --details for more information)
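The TTL of 720h0m0s in the output above is just the 30-day retention period expressed in hours (30 × 24 = 720). As a small illustration, the conversion can be sketched in shell; days_to_ttl is a hypothetical helper name, not a velero command.

```shell
# Hypothetical helper: convert a retention period in days to the
# hours-based duration string that velero shows in the TTL field.
days_to_ttl() {
  printf '%dh0m0s\n' "$(( $1 * 24 ))"
}

# Example usage with the velero CLI directly (the backup name here is made
# up, and I have not checked how TMC treats backups created outside its UI):
#   velero --kubeconfig=kubeconfig-clittle-test-cluster.yml backup create \
#     wordpress-cli-backup --selector app=wordpress --ttl "$(days_to_ttl 30)"
```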

Restore the contents of a backup using Data Protection

I deleted my WordPress application after creating the backup.

kubectl --kubeconfig=kubeconfig-clittle-test-cluster.yml delete -k wordpress-aws/

storageclass.storage.k8s.io "k8s-policy" deleted
secret "mysql-pass-bd45fkk6kd" deleted
service "wordpress-mysql" deleted
service "wordpress" deleted
deployment.apps "wordpress-mysql" deleted
deployment.apps "wordpress" deleted
persistentvolumeclaim "mysql-pv-claim" deleted
persistentvolumeclaim "wp-pv-claim" deleted

kubectl --kubeconfig=kubeconfig-clittle-test-cluster.yml delete pv --all

persistentvolume "pvc-3a3eef27-e8fa-497c-9491-3bda00225be1" deleted
persistentvolume "pvc-4a476072-a8aa-403b-88eb-650e3b044054" deleted

kubectl --kubeconfig=kubeconfig-clittle-test-cluster.yml get deployment,replicaset,po,svc

NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
service/kubernetes   ClusterIP    <none>        443/TCP   4h15m

And if I try to access my WordPress site, it’s not there anymore.

Let’s hope that TMC Data Protection can get it back!

When you’re ready to perform a restore, you’ll need to navigate to Clusters, select your cluster and then select the Data Protection tab. From here you will click on the backup (in the Backups section) that you want to restore.

On the details page for your backup, click on the Restore Backup button.

You can choose to either restore the entire backup or a selection from the backup based on included namespaces or labels. For my example, there should be no difference between selecting the entire backup or using the label selection that was used when creating the backup (app=wordpress). Click the Next button when you’re ready to move on.

Enter a name for your restore and click the Restore button.

You can follow the progress of the restore in the Restores pane of the Data Protection tab.

You can also use the velero CLI to get more information about the restore.

velero --kubeconfig=kubeconfig-clittle-test-cluster.yml get restores

NAME                   BACKUP                STATUS       WARNINGS   ERRORS   CREATED                         SELECTOR
wordpress-dp-restore   wordpress-dp-backup   InProgress   0          0        2020-08-07 14:56:31 -0600 MDT   app=wordpress

velero --kubeconfig=kubeconfig-clittle-test-cluster.yml describe restore wordpress-dp-restore

Name:         wordpress-dp-restore
Namespace:    velero
Labels:       tmc.cloud.vmware.com/creator=clittle
Annotations:  <none>

Phase:  InProgress

Backup:  wordpress-dp-backup

Namespaces:
  Included:  *
  Excluded:  vmware-system-tmc

Resources:
  Included:        *
  Excluded:        nodes, events, events.events.k8s.io, backups.velero.io, restores.velero.io, resticrepositories.velero.io
  Cluster-scoped:  included

Namespace mappings:  <none>

Label selector:  app=wordpress

Restore PVs:  auto
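Rather than re-running the command by hand until the phase changes, you could pull the status out of the table and poll on it. This is a rough sketch: restore_status is a hypothetical helper that assumes the column layout of the get restores output shown above (NAME, BACKUP, STATUS).

```shell
# Hypothetical helper: reads `velero get restores` table output on stdin
# and prints the STATUS column for the named restore.
restore_status() {
  awk -v name="$1" '$1 == name { print $3 }'
}

# Example usage: wait until the restore leaves the InProgress phase.
#   while [ "$(velero --kubeconfig=kubeconfig-clittle-test-cluster.yml get restores \
#              | restore_status wordpress-dp-restore)" = "InProgress" ]; do
#     sleep 5
#   done
```

The loop only waits for the phase to change, so it is still worth checking the final status and the WARNINGS and ERRORS columns afterwards.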

When the restore is done you’ll see the status as Ready on the Data Protection page.

And you should see all objects restored in your cluster.

kubectl --kubeconfig=kubeconfig-clittle-test-cluster.yml get deployment,replicaset,po,svc

NAME                              READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/wordpress         1/1     1            1           90s
deployment.apps/wordpress-mysql   1/1     1            1           90s

NAME                                         DESIRED   CURRENT   READY   AGE
replicaset.apps/wordpress-5fb76c9f88         1         1         1       90s
replicaset.apps/wordpress-mysql-5fcd84f896   1         1         1       90s

NAME                                   READY   STATUS    RESTARTS   AGE
pod/wordpress-5fb76c9f88-z7dvs         1/1     Running   0          90s
pod/wordpress-mysql-5fcd84f896-hv8h5   1/1     Running   0          90s

NAME                      TYPE           CLUSTER-IP      EXTERNAL-IP                                                               PORT(S)        AGE
service/kubernetes        ClusterIP       <none>                                                                    443/TCP        4h52m
service/wordpress         LoadBalancer   ab14057ea523e4721aeb075db9338685-1719023603.us-east-1.elb.amazonaws.com   80:30705/TCP   91s
service/wordpress-mysql   ClusterIP      None            <none>                                                                    3306/TCP       91s

The external address for the wordpress service is changed since the Load Balancer was recreated at AWS but you can access the new address to validate that nothing was lost.

One thing is worth mentioning if you’re going to try this exact scenario on your own: WordPress saves the load balancer URL internally, so you will likely have some issues accessing the site after the restore. You can create a local hosts entry that maps the old address to the new load balancer IP address to get back in. Once in, you can update the URL that WordPress has saved to the new one.
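As a rough illustration of that hosts entry, the sketch below formats a line suitable for /etc/hosts. The hosts_entry helper is hypothetical, and the IP address and hostname in the usage comments are placeholders from documentation ranges, not values from my environment. Load balancer IP addresses at AWS can change, so treat the entry as temporary and remove it once the WordPress URL is updated.

```shell
# Hypothetical helper: print an /etc/hosts line mapping a hostname
# (the old load balancer address) to an IP (the new load balancer).
hosts_entry() {
  printf '%s\t%s\n' "$1" "$2"
}

# Example usage (placeholders throughout):
#   dig +short NEW-LOAD-BALANCER-HOSTNAME      # look up a current IP
#   hosts_entry 203.0.113.10 old-load-balancer.example.com | sudo tee -a /etc/hosts
```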

As I mentioned earlier, you can schedule backups to occur on a recurring basis. The options are hourly, daily, weekly, monthly or on a custom schedule that you can specify in cron format.

In this example, I’m creating an hourly backup that will run at 20 minutes past the hour and back up the entire cluster. As soon as the scheduled backup is created, a backup is taken and you will see the result of it in the Backups list. You’ll also see your new schedule in the Schedules list.

From the command line, using the velero command, we can see the schedule that has been created and some details about it.

velero --kubeconfig=kubeconfig-clittle-test-cluster.yml get schedules

NAME                    STATUS    CREATED                         SCHEDULE     BACKUP TTL   LAST BACKUP   SELECTOR
hourly-cluster-backup   Enabled   2020-08-28 09:15:49 -0600 MDT   20 * * * *   720h0m0s     11m ago       <none>

velero --kubeconfig=kubeconfig-clittle-test-cluster.yml schedule describe hourly-cluster-backup

Name:         hourly-cluster-backup
Namespace:    velero
Labels:       tmc.cloud.vmware.com/creator=c_01EGTRHZ6NN94KB03RNHAPPZPE
Annotations:  <none>

Phase:  Enabled

Schedule:  20 * * * *

Backup Template:
  Namespaces:
    Included:  *
    Excluded:  <none>

  Resources:
    Included:        *
    Excluded:        <none>
    Cluster-scoped:  included

  Label selector:  <none>

  Storage Location:  cjlittle-dp

  Snapshot PVs:  false

  TTL:  720h0m0s

  Hooks:  <none>

Last Backup:  2020-08-28 09:20:04 -0600 MDT
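The Schedule value above, 20 * * * *, is a standard five-field cron expression: minute, hour, day of month, month and day of week. An asterisk means "every", so this one reads as "minute 20 of every hour, every day". If the fields are hard to keep straight, a small helper like this hypothetical explain_cron can label them.

```shell
# Hypothetical helper: label the five fields of a cron expression.
# The here-document keeps the shell from expanding the asterisks.
explain_cron() {
  read -r minute hour dom month dow <<EOF
$1
EOF
  printf 'minute: %s\nhour: %s\nday of month: %s\nmonth: %s\nday of week: %s\n' \
    "$minute" "$hour" "$dom" "$month" "$dow"
}

# Example usage (quote the expression so the shell leaves it alone):
#   explain_cron '20 * * * *'
```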

I set the time for this recurrence so that a scheduled backup would fire off a few minutes after I created it, and we can see that there is a second cluster backup in the list at 20 minutes after the hour.

I decided to leave this running for a while to make sure it kept up with my schedule and so far nothing has been missed.

velero --kubeconfig=kubeconfig-clittle-test-cluster.yml get backups

NAME                                   STATUS      CREATED                         EXPIRES   STORAGE LOCATION   SELECTOR
hourly-cluster-backup-20200828232004   Completed   2020-08-28 17:20:04 -0600 MDT   29d       cjlittle-dp        <none>
hourly-cluster-backup-20200828222004   Completed   2020-08-28 16:20:04 -0600 MDT   29d       cjlittle-dp        <none>
hourly-cluster-backup-20200828212004   Completed   2020-08-28 15:20:04 -0600 MDT   29d       cjlittle-dp        <none>
hourly-cluster-backup-20200828202004   Completed   2020-08-28 14:20:04 -0600 MDT   29d       cjlittle-dp        <none>
hourly-cluster-backup-20200828192004   Completed   2020-08-28 13:20:04 -0600 MDT   29d       cjlittle-dp        <none>
hourly-cluster-backup-20200828182004   Completed   2020-08-28 12:20:04 -0600 MDT   29d       cjlittle-dp        <none>
hourly-cluster-backup-20200828172004   Completed   2020-08-28 11:20:04 -0600 MDT   29d       cjlittle-dp        <none>
hourly-cluster-backup-20200828162004   Completed   2020-08-28 10:20:04 -0600 MDT   29d       cjlittle-dp        <none>
hourly-cluster-backup-20200828152004   Completed   2020-08-28 09:20:04 -0600 MDT   29d       cjlittle-dp        <none>
hourly-cluster-backup-20200828151549   Completed   2020-08-28 09:15:49 -0600 MDT   29d       cjlittle-dp        <none>
wordpress-dp-backup                    Completed   2020-08-28 09:10:39 -0600 MDT   29d       cjlittle-dp        app=wordpress
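Judging from this listing, each scheduled backup is named after the schedule plus a timestamp suffix in YYYYMMDDhhmmss form, and the suffix appears to be in UTC: the backup created at 17:20:04 -0600 carries the suffix 20200828232004. The sketch below makes that suffix readable; backup_utc_time is a hypothetical helper and the UTC interpretation is my inference from the output above.

```shell
# Hypothetical helper: turn the timestamp suffix of a scheduled backup
# name into a readable date and time (assumed to be UTC).
backup_utc_time() {
  stamp=${1##*-}   # keep only the text after the last hyphen
  printf '%s\n' "$stamp" | sed -E \
    's/^([0-9]{4})([0-9]{2})([0-9]{2})([0-9]{2})([0-9]{2})([0-9]{2})$/\1-\2-\3 \4:\5:\6 UTC/'
}

# Example usage:
#   backup_utc_time hourly-cluster-backup-20200828232004
```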

And if we take a look at our S3 bucket on AWS, we can see that there is a separate folder created for each backup.
