Pivotal Engineering Journal

Technical articles from Pivotal engineers.

How we built a controller using KubeBuilder with Test Driven development, Part 1

Lessons we learned while building a controller with KubeBuilder


Who are we?

We are the Greenplum for Kubernetes team. We're working on a Kubernetes operator to run Greenplum, and connected components of Greenplum like PXF and GPText. We started with a controller for Greenplum, and recently used KubeBuilder to add controllers for PXF and GPText.

In this post, we review some of the key lessons we learned in the course of developing our KubeBuilder controllers. In Part 2, we cover our journey of discovery of how to unit test our new controllers.

Background

The Kubernetes operator pattern is used to extend Kubernetes with custom resources and APIs, as the native resources don’t have application-specific logic. An operator can declare a resource by submitting a CustomResourceDefinition to the api-server, and implement a control loop to listen for changes to those custom resources and react as necessary to manage underlying resources, which may be native Kubernetes resources, or entirely new resources like an external storage system.

We implemented our first controller for Greenplum using the Kubernetes code generators, workqueue, and hand-written code to react to create, update, and delete events from the Informer. This controller contains a lot of boilerplate and it was not easy to follow some of the controller best practices. For example, we reacted to create, update, and delete in different ways, making our controller edge driven rather than level driven. While we haven't experienced any issues with it thus far, edge triggering could make our controller less resilient should we find ourselves in unexpected states.

KubeBuilder is a framework for building operators that integrates controller-runtime to implement a lot of the responsibilities of an operator and its controllers1, including watching for resources, rate-limiting the control loop to reduce the chances of overloading the controller, and caching objects. Instead of relying on generated clients for custom types, KubeBuilder uses the controller-runtime client that works with any type registered. KubeBuilder generates scaffolding for controllers that need only one function implemented: Reconcile(). This made it much simpler for us to incorporate the best-practice controller principles into our new controllers.

Building the Controller

The KubeBuilder book covers in detail how to write a simple controller. Despite the thoroughness of the book, there were some key points that tripped us up.

Controller Basics

Our use case is a simple CustomResourceDefinition to parameterize construction of a Deployment and Service. After the reconciler is registered with the controller-runtime Manager, then its Reconcile() method will be called any time an object needs to be updated. The controller manager takes care of the basic controller details like listing and watching for changes to the objects, caching retrieved objects, and queuing and rate limiting reconciliations. When the controller manager determines that a change may need to be made, it calls the registered reconciler. The reconcile.Request parameter to Reconcile() contains only the namespace and name of the object to be reconciled. So, the first step is to Get the object. Then, based on its current contents, we can construct the desired state of our dependent objects (Deployment, Service), and ensure the desired state is set for those objects in the api-server.

Setting that desired state correctly and succinctly took us a few iterations, so below are some key details that we learned in the process, and finishing with an example Reconcile implementation.

CreateOrUpdate

CreateOrUpdate is a helper method provided by controller-runtime. Given an Object, it will either Create it or Update an existing object. Because of this functionality, it can be used for your dependent objects, and forms the core of many Reconcile() functions. However, its API makes it extremely easy to misuse without understanding some of its internals.

CreateOrUpdate() takes a callback, “mutate”, which is where all changes to the object must be performed. This point bears repeating: your mutate callback is the only place you should enact the contents of your object, aside from the name and namespace which must be filled in prior. Under the hood, CreateOrUpdate() first calls Get() on the object. If the object does not exist, Create() will be called. If it does exist, Update() will be called. Just before calling either Create() or Update(), the mutate callback will be called.

CreateOrUpdate(ctx, client, obj, mutate) client.Get(obj) mutate(obj) If obj did not exist, client.Create(obj) If obj did exist, client.Update(obj)

Initial testing may appear to work if you pass a no-op mutate and instead prepare obj before calling CreateOrUpdate(). When the client has no object to return, because it does not exist as in the Create() path, Get() does not modify obj, and obj will remain as it was before entering CreateOrUpdate(). The object would get passed to Create() in the same state as it was before CreateOrUpdate() was called. However, an issue arises during Update(). If the object exists (as in the Update() path), then Get() overwrites obj, merging the server's content over any content that was pre-filled. The Update() will never do anything, even if the actual object has diverged from the desired state. Therefore proper usage of “mutate” is essential to reconcile the object.

In your mutate callback, you should surgically modify individual fields of the object. Don't overwrite large chunks of the object, or the whole object, as we tried to do initially. Overwriting the object would discard the metadata field, and cause Update() to fail, or overwrite the status field, losing state. It could also interfere with other controllers trying to make their own modifications to the object. Kubernetes is a system of many controllers, so it is important for each controller to act in a way that will not interfere with others.

Garbage Collection, OwnerReferences and ContollerRefs

The OwnerReferences field within the metadata field of all Kubernetes objects declares that if the referred object (the owner) is deleted, then the object with the reference (the dependent) should also be deleted by garbage collection. A controller should create a ControllerRef (an OwnerReference with Controller: true) on objects it creates. There can be only one ControllerRef on an object in order to prevent controllers from fighting over an object.

For example, an ownerReferences entry is added to a Deployment that was created by a CustomResource controller, so that when the given CustomResource is deleted, the corresponding Deployment is also deleted.

The controller-runtime package provides another method, SetControllerReference, to add a ControllerRef to objects that a controller creates. It takes care of correctly appending to existing OwnerReferences and checking for an existing ControllerRef.

If we call SetControllerReference() on all objects we create during Reconcile(), then we can rely on the garbage collector, and we do not need to detect a deleted custom resource to delete its sub-resources.

Returning an error from Reconcile means requeue

Errors returned from Reconcile() will be logged. But returning an error from Reconcile() also means the reconciliation will be requeued. If that's not desirable, don't return an error. This means thinking carefully about simply returning downstream errors.

When an error is returned from Reconcile(), KubeBuilder logging of the error is very verbose, and includes a stack trace. The verbose log message may be difficult for a user to parse. Our experience with this was for a Secret that needs to be provided by the user for the PXF service to operate properly. If that Secret is missing (perhaps it was kubectl apply-ed at the same time but not available in the API yet), then we would return the error we got from Get()-ing the Secret. We did want the reconciliation to be requeued since we are not watching the Secret resource and would not otherwise reconcile again once the Secret did get created. To silence the stack trace, we decided to instead log a helpful error message, and return a Result with Requeue set to true.

Ignore NotFound errors

Case in point for not returning an error: A subtlety in the KubeBuilder book is how the authors ignore NotFound errors when Get()ting the reconciled object. Normally NotFound would be returned when the object has been deleted. Since we have arranged for the garbage collector to clean up our dependent objects, there is no need to do anything with NotFound errors, so we can ignore them. If the reconciled object was not deleted, but missing for some other reason, we should still ignore NotFound errors. We do not want to requeue the reconciliation by returning the error. If the object were to reappear in the API, the controller would get another reconciliation at that time. Requeuing the reconciliation would be a waste of time.

Example

To put all of the above together, here is a simplified example of Reconcile():

import (
	"github.com/pkg/errors"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"
)

func (r *CustomReconciler) Reconcile(req ctrl.Request) (ctrl.Result, error) {
	ctx := context.Background()
	// Get CustomResource
	var customResource myApi.CustomResource
	if err := r.Get(ctx, req.NamespacedName, &customResource); err != nil {
		if apierrs.IsNotFound(err) {
			return ctrl.Result{}, nil
		}
		return ctrl.Result{}, errors.Wrap(err, "unable to fetch CustomResource")
	}

	// CreateOrUpdate SERVICE
	var svc corev1.Service
	svc.Name = customResource.Name
	svc.Namespace = customResource.Namespace
	_, err := ctrl.CreateOrUpdate(ctx, r, &svc, func() error {
		ModifyService(customResource, &svc)
		return controllerutil.SetControllerReference(&customResource, &svc, r.Scheme)
	})
	if err != nil {
		return ctrl.Result{}, errors.Wrap(err, "unable to CreateOrUpdate Service")
	}

	// CreateOrUpdate DEPLOYMENT
	var app appsv1.Deployment
	app.Name = customResource.Name + "-app"
	app.Namespace = customResource.Namespace
	_, err = ctrl.CreateOrUpdate(ctx, r, &app, func() error {
		ModifyDeployment(customResource, &app)
		return controllerutil.SetControllerReference(&customResource, &app, r.Scheme)
	})
	if err != nil {
		return ctrl.Result{}, errors.Wrap(err, "unable to CreateOrUpdate Deployment")
	}

	return ctrl.Result{}, nil
}

func ModifyDeployment(cr myApi.CustomResource, deployment *appsv1.Deployment) {
	labels := generateLabels(cr.Name)
	if deployment.Labels == nil {
		deployment.Labels = make(map[string]string)
	}
	for k, v := range labels {
		deployment.Labels[k] = v
	}
	replicas := cr.Spec.Replicas
	deployment.Spec.Replicas = &replicas
	deployment.Spec.Template.Labels = labels

	templateSpec := &deployment.Spec.Template.Spec

	if len(templateSpec.Containers) == 0 {
		templateSpec.Containers = make([]corev1.Container, 1)
	}
	container := &templateSpec.Containers[0]

	container.Name = "myapp"
	container.Args = []string{"/opt/myapp/bin/myapp"}
	container.Image = "myrepo/myapp:v1.0"
	container.Resources = corev1.ResourceRequirements{
		Limits: corev1.ResourceList{
			corev1.ResourceCPU:    cr.Spec.CPU,
			corev1.ResourceMemory: cr.Spec.Memory,
		},
	}
}

Up Next

In our next post, we'll describe our journey of figuring out how to apply Test Driven Development to a KubeBuilder controller.


  1. This slide deck from Kubernetes Meetup Tokyo is a great overview based on the O'Reilly book Programming Kubernetes of what a K8s controller needs to do, and gives a sense of all the things that KubeBuilder takes care of for you.