How to Develop and Deploy a Kubernetes Operator in Red Hat OpenShift

Ashima
12 min read · Jul 12, 2022
Source: Image by Red Hat Developer

Introduction:

Kubernetes, also known as K8s, is an open-source container orchestration platform widely used to develop and deploy applications in cloud environments. It automates the deployment, scaling, and management of containerized applications, making them easier to deploy and operate in a microservice architecture. It also provides a way to extend the Kubernetes API with custom resources tailored to our requirements.

A Kubernetes Operator is a design pattern for extending the Kubernetes API with a custom resource that captures the operational aspects of an application. A custom resource definition (CRD) lists all the configurations of the operator and defines a custom resource (CR). The operator watches the CR and performs application-specific actions so that the current state of the resource matches the desired state.

Kubernetes Operator Flow

A Kubernetes operator continuously watches the custom resource and, when it detects a change, acts to bring the current state of the system in line with the desired one. This follows the control-loop idea from control theory, in which a controller keeps a system stable. In Kubernetes, a controller is a control loop that watches the state of the cluster and makes changes to move the current state closer to the desired state.
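The control loop described above can be sketched in a few lines of plain Go. This is only a toy model under stated assumptions: the state is a single replica count, and `reconcileStep` and `converge` are hypothetical names, not part of Kubernetes or the Operator-SDK.

```go
package main

import "fmt"

// reconcileStep moves the current replica count one step closer to the
// desired count and returns the new current value. It stands in for
// "take one action that brings the system closer to the desired state".
func reconcileStep(current, desired int) int {
	switch {
	case current < desired:
		return current + 1 // scale up by one
	case current > desired:
		return current - 1 // scale down by one
	default:
		return current // already converged, nothing to do
	}
}

// converge repeats reconcileStep until the current state matches the
// desired state. It returns the final state and the number of steps taken.
func converge(current, desired int) (final, steps int) {
	for current != desired {
		current = reconcileStep(current, desired)
		steps++
	}
	return current, steps
}

func main() {
	final, steps := converge(1, 3)
	fmt.Printf("converged to %d replicas in %d steps\n", final, steps)
}
```

A real controller does not loop until convergence in one call; it is invoked again whenever the watched state changes, but each invocation has the same shape as `reconcileStep`.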

Objectives:

  • Develop and deploy a Kubernetes operator using the operator-SDK framework.
  • Define RBAC roles and permissions so that the operator can perform the required actions.

Tools used to write a Kubernetes operator:

An operator can be written from scratch with the same tools used to write the core Kubernetes components, but some existing tools make the job easier by providing a scaffold for the operator:

  1. Operator-SDK: a command-line interface, based on Kubebuilder, for building and deploying operators.
  2. Kubebuilder: a tool for scaffolding Go-based operators.
  3. Controller-Runtime: a Go library that wraps the Kubernetes Go client library (client-go).

This tutorial uses the operator-SDK framework to develop and deploy a Go-based Kubernetes operator and to set its RBAC permissions.

Operator-SDK makes it easier to write a Kubernetes operator. We need to create three basic components of the control loop:

  • an API server to handle the incoming requests;
  • a data store that stores the configurations of applications and desired & actual state of the system — CRD and CR;
  • a controller which watches the CR and takes actions according to changes in the state of the resource.
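To make the three roles listed above concrete, here is a toy in-memory sketch in plain Go. Everything in it is an assumption made for illustration: `store`, `apply`, and `runController` are hypothetical names, and a buffered channel stands in for the watch mechanism. In a real cluster, the API server and data store are provided by Kubernetes itself.

```go
package main

import "fmt"

// store is a hypothetical stand-in for the data store: it keeps the desired
// and actual replica count for each named resource.
type store struct {
	desired map[string]int
	actual  map[string]int
	events  chan string // names of resources whose desired state changed
}

func newStore() *store {
	return &store{
		desired: map[string]int{},
		actual:  map[string]int{},
		events:  make(chan string, 16), // small buffer, enough for this demo
	}
}

// apply plays the role of the API server: it accepts a request, records the
// desired state, and emits a change event for the controller.
func (s *store) apply(name string, replicas int) {
	s.desired[name] = replicas
	s.events <- name
}

// runController plays the role of the controller: it drains the pending
// events and, for each one, makes the actual state match the desired state.
// It returns the number of events that required a change.
func runController(s *store) int {
	changes := 0
	for {
		select {
		case name := <-s.events:
			if s.actual[name] != s.desired[name] {
				s.actual[name] = s.desired[name]
				changes++
			}
		default:
			return changes // no more pending events
		}
	}
}

func main() {
	s := newStore()
	s.apply("customoperator-sample", 1)
	s.apply("customoperator-sample", 3)
	changes := runController(s)
	fmt.Println(s.actual["customoperator-sample"], changes)
}
```

Two requests arrive before the controller runs, yet only one change is made: the controller acts on the latest desired state rather than replaying every event.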

RBAC roles:

Role-based access control, also known as RBAC, is a method of granting access to resources based on the roles assigned to an individual user in an organization. In Kubernetes, roles are bound to users or service accounts, and the workloads running under those accounts can act on resources only as far as the granted permissions allow. The same applies to operators: the operator must be granted permissions on every resource it acts on. For example, if the operator is responsible for creating and deleting a Kubernetes resource, it must have the ‘create’ and ‘delete’ verbs for that specific resource.
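The permission check described above can be illustrated with a deliberately simplified model in Go. The `rule` type and the `allowed` function are hypothetical; real Kubernetes rules also carry API groups, resource names, and other fields, and the actual check is performed by the API server.

```go
package main

import "fmt"

// rule is a simplified, hypothetical model of an RBAC rule: a set of verbs
// allowed on a set of resources.
type rule struct {
	resources []string
	verbs     []string
}

// contains reports whether item is present in list.
func contains(list []string, item string) bool {
	for _, v := range list {
		if v == item {
			return true
		}
	}
	return false
}

// allowed reports whether any rule grants the verb on the resource.
func allowed(rules []rule, verb, resource string) bool {
	for _, r := range rules {
		if contains(r.resources, resource) && contains(r.verbs, verb) {
			return true
		}
	}
	return false
}

func main() {
	// An operator that creates and deletes deployments needs both verbs.
	rules := []rule{{
		resources: []string{"deployments"},
		verbs:     []string{"create", "delete"},
	}}
	fmt.Println(allowed(rules, "create", "deployments")) // granted
	fmt.Println(allowed(rules, "update", "deployments")) // not granted
}
```

With only ‘create’ and ‘delete’ granted, an attempt to update the deployment would be rejected, which is why every verb the controller uses has to be listed.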

Building a Go-based Kubernetes Operator:

Pre-requisites:

  • Operator-SDK installed; check the installed version with:
% operator-sdk version
operator-sdk version: "v1.10.0"
  • Kubectl installed
  • Golang v1.16.0+ installed
  • To install Golang follow the document: https://go.dev/doc/install
  • Docker v3.2.2+ installed
  • Access to Docker image repository (Docker Hub)
  • Admin access to a Kubernetes or ROKS (Red Hat OpenShift Kubernetes Service) cluster.

Steps to Develop and deploy the Operator:

We will develop and deploy an operator that creates a deployment and scales its replica count.

  1. Create a project:
    Make a new directory and init a project in it.
% mkdir custom-operator
% cd custom-operator
% export GO111MODULE=on
% operator-sdk init --domain example.com --repo github.com/username/custom-operator
Writing kustomize manifests for you to edit...
Writing scaffold for you to edit...
Get controller runtime:
$ go get sigs.k8s.io/controller-runtime@v0.12.1
Update dependencies:
$ go mod tidy
Next: define a resource with:
$ operator-sdk create api

The following command generates the scaffolding code of the operator:

% operator-sdk init --domain example.com --repo github.com/username/custom-operator

--domain: sets the domain used as the suffix of the operator's API group.
--repo: sets the Go module path of the project, typically the Git repository where we want to commit the project.

This will generate the following resources:

.
├── Dockerfile ## for the operator image, which will be built to deploy the operator
├── Makefile ## to build, deploy, test, undeploy the operator
├── PROJECT ## has operator configurations
├── config
│ ├── default
│ │ ├── kustomization.yaml
│ │ ├── manager_auth_proxy_patch.yaml
│ │ └── manager_config_patch.yaml
│ ├── manager
│ │ ├── controller_manager_config.yaml
│ │ ├── kustomization.yaml
│ │ └── manager.yaml
│ ├── manifests
│ │ └── kustomization.yaml
│ ├── prometheus
│ │ ├── kustomization.yaml
│ │ └── monitor.yaml
│ ├── rbac
│ │ ├── auth_proxy_client_clusterrole.yaml
│ │ ├── auth_proxy_role.yaml
│ │ ├── auth_proxy_role_binding.yaml
│ │ ├── auth_proxy_service.yaml
│ │ ├── kustomization.yaml
│ │ ├── leader_election_role.yaml
│ │ ├── leader_election_role_binding.yaml
│ │ ├── role_binding.yaml
│ │ └── service_account.yaml
│ └── scorecard
│   ├── bases
│   │ └── config.yaml
│   ├── kustomization.yaml
│   └── patches
│     ├── basic.config.yaml
│     └── olm.config.yaml
├── go.mod
├── go.sum
├── hack
│ └── boilerplate.go.txt ## a license file
└── main.go ## the main function of the operator

2. Create an API and controller:
Use the create command to generate CRD and controller of the operator.

% operator-sdk create api --group replica --version v1alpha1 --kind CustomOperator --resource=true --controller=true
Writing kustomize manifests for you to edit...
Writing scaffold for you to edit...
api/v1alpha1/customoperator_types.go
controllers/customoperator_controller.go
Update dependencies:
$ go mod tidy
Running make:
$ make generate
go: creating new go.mod: module tmp
Downloading sigs.k8s.io/controller-tools/cmd/controller-gen@v0.6.1
go: downloading sigs.k8s.io/controller-tools v0.6.1
go: downloading golang.org/x/tools v0.1.3
go: downloading k8s.io/utils v0.0.0-20201110183641-67b214c5f920
go: downloading golang.org/x/sys v0.0.0-20210510120138-977fb7262007
go: downloading github.com/json-iterator/go v1.1.10
go get: added sigs.k8s.io/controller-tools v0.6.1
/Users/username/path/custom-operator/bin/controller-gen object:headerFile="hack/boilerplate.go.txt" paths="./..."

--version: the version of the custom API.
--group: the API group in which the custom resource lives. It is prefixed to the domain, so here the full group is ‘replica.example.com’.
--kind: the type (kind) of the custom resource that the operator supports.
--resource and --controller: setting both flags to true generates the scaffolding for both the API and the controller.
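As a small illustration of how these flags combine, the sketch below builds the `apiVersion` string and the CRD name that appear later in this tutorial. The helper names `apiVersion` and `crdName` are hypothetical, and the lower-case plural `customoperators` is taken from the generated CRD file name shown in the next listing.

```go
package main

import "fmt"

// apiVersion joins group, domain, and version the way they appear in the
// apiVersion field of a custom resource: <group>.<domain>/<version>.
func apiVersion(group, domain, version string) string {
	return fmt.Sprintf("%s.%s/%s", group, domain, version)
}

// crdName joins the plural resource name with the full API group:
// <plural>.<group>.<domain>.
func crdName(plural, group, domain string) string {
	return fmt.Sprintf("%s.%s.%s", plural, group, domain)
}

func main() {
	fmt.Println(apiVersion("replica", "example.com", "v1alpha1"))
	fmt.Println(crdName("customoperators", "replica", "example.com"))
}
```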

This command will generate the following major directories:

.
.
├── api
│ └── v1alpha1
│   ├── groupversion_info.go
│   ├── customoperator_types.go
│   └── zz_generated.deepcopy.go
├── bin
│ └── controller-gen
├── config
│ ├── crd
│ │ ├── bases
│ │ │ └── replica.example.com_customoperators.yaml
...
│ ├── samples
│ │ ├── kustomization.yaml
│ │ └── replica_v1alpha1_customoperator.yaml
...
├── controllers
│ ├── customoperator_controller.go
│ └── suite_test.go
.
.

3. Defining the API:
In the previous step we created the structs that will define the Spec and Status fields of the CR of the operator in the file api/v1alpha1/customoperator_types.go:

// CustomOperatorSpec defines the desired state of CustomOperator
type CustomOperatorSpec struct {
	// INSERT ADDITIONAL SPEC FIELDS - desired state of cluster
	// Important: Run "make" to regenerate code after modifying this file

	// Foo is an example field of CustomOperator. Edit customoperator_types.go to remove/update
	Foo string `json:"foo,omitempty"`
}

// CustomOperatorStatus defines the observed state of CustomOperator
type CustomOperatorStatus struct {
	// INSERT ADDITIONAL STATUS FIELD - define observed state of cluster
	// Important: Run "make" to regenerate code after modifying this file
}

The Spec and Status structs represent the desired state and the observed state of the system, respectively. Let’s add a ‘Replicas’ field to the Spec struct to represent the desired number of replicas of the deployment. In this project we do not define any Status field, but one can be added as the application requires.

// CustomOperatorSpec defines the desired state of CustomOperator
type CustomOperatorSpec struct {
	// INSERT ADDITIONAL SPEC FIELDS - desired state of cluster
	// Important: Run "make" to regenerate code after modifying this file

	// Foo is an example field of CustomOperator. Edit customoperator_types.go to remove/update
	// Foo string `json:"foo,omitempty"`
	Replicas int32 `json:"replicas"`
}

// CustomOperatorStatus defines the observed state of CustomOperator
type CustomOperatorStatus struct {
	// INSERT ADDITIONAL STATUS FIELD - define observed state of cluster
	// Important: Run "make" to regenerate code after modifying this file
}
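The struct tag on ‘Replicas’ decides the key under which the field appears in the `spec` section of the custom resource. The standalone sketch below shows this with the standard `encoding/json` package; the struct is a trimmed copy of the one above, and `decodeSpec` and `encodeSpec` are hypothetical helpers written only for this illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// CustomOperatorSpec is a trimmed copy of the Spec struct, kept to the single
// field used in this tutorial.
type CustomOperatorSpec struct {
	Replicas int32 `json:"replicas"`
}

// decodeSpec parses the JSON form of a spec section into the struct.
func decodeSpec(raw string) (CustomOperatorSpec, error) {
	var spec CustomOperatorSpec
	err := json.Unmarshal([]byte(raw), &spec)
	return spec, err
}

// encodeSpec serializes the struct; the tag, not the Go field name,
// determines the key in the output.
func encodeSpec(spec CustomOperatorSpec) (string, error) {
	out, err := json.Marshal(spec)
	return string(out), err
}

func main() {
	spec, err := decodeSpec(`{"replicas": 3}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(spec.Replicas)

	out, err := encodeSpec(CustomOperatorSpec{Replicas: 1})
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

Because the tag has no `omitempty`, a value of 0 is still written out as `"replicas":0` instead of being dropped.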

After making all changes in the types.go file, run the following command to regenerate the generated code for the updated types (such as zz_generated.deepcopy.go):

% make generate

And to generate the manifests of the CRD, run the following command:

% make manifests

4. Create a controller
In step 2, code for the controller was generated in controllers/customoperator_controller.go file:

func (r *CustomOperatorReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	_ = log.FromContext(ctx)

	// your logic here

	return ctrl.Result{}, nil
}

Here, the ‘Reconcile’ function is responsible for reconciling and maintaining the desired state of the system. It is the method where the logic of the controller is implemented. Refer to the full code linked at the end of this tutorial for an operator that creates the deployment and updates the replica count of its pods to match the desired count.
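The decision that such a Reconcile function has to make can be sketched without any Kubernetes dependency. The code below is not the author's controller and does not use controller-runtime; `deployment`, `action`, and `reconcile` are hypothetical stand-ins that only model the create / update / do-nothing choice. The deployment name is the one that appears in the example output later in this tutorial.

```go
package main

import "fmt"

// deployment is a hypothetical, minimal stand-in for an apps/v1 Deployment.
type deployment struct {
	name     string
	replicas int32
}

// action describes what one reconcile pass decided to do.
type action string

const (
	actionCreate action = "create"
	actionUpdate action = "update"
	actionNone   action = "none"
)

// reconcile compares the desired replica count from the custom resource with
// the deployment found in the cluster (nil if it does not exist yet). It
// returns the deployment as it should be and the action needed to get there.
func reconcile(desired int32, existing *deployment) (*deployment, action) {
	if existing == nil {
		return &deployment{name: "custom-operator-deployment", replicas: desired}, actionCreate
	}
	if existing.replicas != desired {
		updated := *existing // copy, so the caller's value is not mutated
		updated.replicas = desired
		return &updated, actionUpdate
	}
	return existing, actionNone
}

func main() {
	dep, act := reconcile(1, nil) // no deployment yet
	fmt.Println(act, dep.replicas)
	dep, act = reconcile(3, dep) // CR changed from 1 to 3 replicas
	fmt.Println(act, dep.replicas)
	dep, act = reconcile(3, dep) // nothing changed
	fmt.Println(act, dep.replicas)
}
```

In the real controller, the create and update actions correspond to calls against the cluster, which is why the controller needs the matching RBAC verbs.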

To update the go-packages in the operator run:

% go mod download
% go mod tidy

Assign RBAC roles and permissions:

The controller of the operator requires permissions to perform actions on resources. The permissions given to the operator for the custom resource are declared as markers in the controller.go file:

//+kubebuilder:rbac:groups=replica.example.com,resources=customoperators,verbs=get;list;watch;create;update;patch;delete
//+kubebuilder:rbac:groups=replica.example.com,resources=customoperators/status,verbs=get;update;patch
//+kubebuilder:rbac:groups=replica.example.com,resources=customoperators/finalizers,verbs=update

When the manifests of the CRD were generated in step 3, the ClusterRole manifest ‘config/rbac/role.yaml’ was also generated. It lists the permissions given to the controller for various resources, and we can add more permissions as the operator requires. Note that `make manifests` regenerates this file from the kubebuilder RBAC markers in the controller, so a rule added by hand should also be declared as a marker or it may be overwritten. For instance, the CustomOperator controller written in step 4 creates a deployment resource and updates it to match the desired replica count. For the operator to perform these actions successfully, the controller needs the following permissions on the deployment resource:

- apiGroups:
  - apps
  resources:
  - deployments
  verbs:
  - create
  - get
  - list
  - update
  - watch
5. Build and Deploy the Operator:
There are three ways to deploy the operator:

I. Run the operator locally as a go program, outside the Kubernetes cluster. This method is useful during the development and testing of the operator.

% make install run

Example Output:

% make install run
/Users/username/path-to-operator/custom-operator/bin/controller-gen "crd:trivialVersions=true,preserveUnknownFields=false" rbac:roleName=manager-role webhook paths="./..." output:crd:artifacts:config=config/crd/bases
/Users/username/path-to-operator/custom-operator/bin/kustomize build config/crd | kubectl apply -f -
customresourcedefinition.apiextensions.k8s.io/customoperators.replica.example.com created
/Users/username/path-to-operator/custom-operator/bin/controller-gen object:headerFile="hack/boilerplate.go.txt" paths="./..."
go fmt ./...
go vet ./...
go run ./main.go
I1203 11:09:06.617233 29536 request.go:668] Waited for 1.04797997s due to client-side throttling, not priority and fairness, request: GET:https://sad073aa1c9351d1a93ef-6b64a6ccc9c596bf59a86625d8fa2202-ce00.us-east.satellite.appdomain.cloud:32622/apis/tuned.openshift.io/v1?timeout=32s
2021-12-03T11:09:08.557+0530 INFO controller-runtime.metrics metrics server is starting to listen {"addr": ":8080"}
2021-12-03T11:09:08.558+0530 INFO setup starting manager
2021-12-03T11:09:08.558+0530 INFO controller-runtime.manager starting metrics server {"path": "/metrics"}
2021-12-03T11:09:08.558+0530 INFO controller-runtime.manager.controller.customoperator Starting EventSource {"reconciler group": "replica.example.com", "reconciler kind": "CustomOperator", "source": "kind source: /, Kind="}
2021-12-03T11:09:08.558+0530 INFO controller-runtime.manager.controller.customoperator Starting Controller {"reconciler group": "replica.example.com", "reconciler kind": "CustomOperator"}
2021-12-03T11:09:09.159+0530 INFO controller-runtime.manager.controller.customoperator Starting workers {"reconciler group": "replica.example.com", "reconciler kind": "CustomOperator", "worker count": 1}

II. Deploy the operator as a deployment on the cluster. We will use this method to deploy our Custom Operator on the ROKS cluster.

Steps:

  • Run the following commands to build the operator image and push it to Docker Hub:
% make docker-build IMG=<registry>/<user>/<image_name>:<tag>
% make docker-push IMG=<registry>/<user>/<image_name>:<tag>

Note: If the build of the operator image fails with the following error from the controllers/suite_test.go file:

unable to start control plane itself: failed to start the controlplane. retried 5 times: fork/exec /usr/local/kubebuilder/bin/etcd: no such file or directory

replace the following code in the Makefile:

test: manifests generate fmt vet envtest ## Run tests.
	go test ./... -coverprofile cover.out

with:

# ENVTEST_K8S_VERSION refers to the version of kubebuilder assets to be downloaded by envtest binary.
ENVTEST_K8S_VERSION = 1.21
test: manifests generate fmt vet envtest ## Run tests.
	KUBEBUILDER_ASSETS="$(shell $(ENVTEST) use $(ENVTEST_K8S_VERSION) -p path)" go test ./... -coverprofile cover.out

  • Run the `make deploy` command with the operator image to deploy the operator in the cluster:
% make deploy IMG=<registry>/<user>/<image_name>:<tag>

Example Output:

% make deploy IMG=<registry>/<user>/<image_name>:<tag>
/Users/username/path-to-operator/custom-operator/bin/controller-gen "crd:trivialVersions=true,preserveUnknownFields=false" rbac:roleName=manager-role webhook paths="./..." output:crd:artifacts:config=config/crd/bases
cd config/manager && /Users/username/path-to-operator/custom-operator/bin/kustomize edit set image controller=<registry>/<user>/<image_name>:<tag>
/Users/username/path-to-operator/custom-operator/bin/kustomize build config/default | kubectl apply -f -
namespace/custom-operator-system created
customresourcedefinition.apiextensions.k8s.io/customoperators.replica.example.com configured
serviceaccount/custom-operator-controller-manager created
role.rbac.authorization.k8s.io/custom-operator-leader-election-role created
clusterrole.rbac.authorization.k8s.io/custom-operator-manager-role created
clusterrole.rbac.authorization.k8s.io/custom-operator-metrics-reader created
clusterrole.rbac.authorization.k8s.io/custom-operator-proxy-role created
rolebinding.rbac.authorization.k8s.io/custom-operator-leader-election-rolebinding created
clusterrolebinding.rbac.authorization.k8s.io/custom-operator-manager-rolebinding created
clusterrolebinding.rbac.authorization.k8s.io/custom-operator-proxy-rolebinding created
configmap/custom-operator-manager-config created
service/custom-operator-controller-manager-metrics-service created
deployment.apps/custom-operator-controller-manager created

This command creates the CRD of the operator and a namespace named `<project-name>-system`. It then creates the RBAC resources defined in the config/rbac manifests and starts the operator pods in this namespace.

  • Verify the Operator is deployed successfully and pods are running:
    Run:
% kubectl get all -n <project_name>-system 

Example Output:

% kubectl get all -n custom-operator-system
NAME                                                      READY   STATUS    RESTARTS   AGE
pod/custom-operator-controller-manager-5469f999c6-56589   2/2     Running   0          8m22s

NAME                                                         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/custom-operator-controller-manager-metrics-service   ClusterIP   172.21.97.166   <none>        8443/TCP   8m23s

NAME                                                 READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/custom-operator-controller-manager   1/1     1            1           8m23s

NAME                                                            DESIRED   CURRENT   READY   AGE
replicaset.apps/custom-operator-controller-manager-5469f999c6   1         1         1       8m23s

III. Deploy and manage the operator with OLM (Operator Lifecycle Manager). This method is used when deploying operators in production, as OLM provides additional features to manage the operator.

6. Check logs of operator pod:
Once the operator pods are running successfully, we can check and monitor the logs of the operator.

% kubectl logs -f -n <project_name>-system pod/<pod_name> -c manager

Example Output:

% kubectl logs -f -n custom-operator-system pod/custom-operator-controller-manager-54b7dcdf65-4cggp -c manager
I0112 07:41:08.690044 1 request.go:668] Waited for 1.198420577s due to client-side throttling, not priority and fairness, request: GET:https://172.21.0.1:443/apis/config.openshift.io/v1?timeout=32s
2022-01-12T07:41:10.496Z INFO controller-runtime.metrics metrics server is starting to listen {"addr": "127.0.0.1:8080"}
2022-01-12T07:41:10.497Z INFO setup starting manager
I0112 07:41:10.590427 1 leaderelection.go:243] attempting to acquire leader lease custom-operator-system/ad09e032.example.com...
2022-01-12T07:41:10.590Z INFO controller-runtime.manager starting metrics server {"path": "/metrics"}
I0112 07:41:11.005812 1 leaderelection.go:253] successfully acquired lease custom-operator-system/ad09e032.example.com
2022-01-12T07:41:11.006Z INFO controller-runtime.manager.controller.customoperator Starting EventSource {"reconciler group": "replica.example.com", "reconciler kind": "CustomOperator", "source": "kind source: /, Kind="}
2022-01-12T07:41:11.006Z INFO controller-runtime.manager.controller.customoperator Starting Controller {"reconciler group": "replica.example.com", "reconciler kind": "CustomOperator"}
2022-01-12T07:41:11.006Z DEBUG controller-runtime.manager.events Normal {"object": {"kind":"ConfigMap","namespace":"custom-operator-system","name":"ad09e032.example.com","uid":"8d66973b-56bd-4ad7-b4f0-af02e81fb1c3","apiVersion":"v1","resourceVersion":"73956434"}, "reason": "LeaderElection", "message": "custom-operator-controller-manager-54b7dcdf65-4cggp_45419a25-952e-43bd-acb3-923ae9d14536 became leader"}
2022-01-12T07:41:11.007Z DEBUG controller-runtime.manager.events Normal {"object": {"kind":"Lease","namespace":"custom-operator-system","name":"ad09e032.example.com","uid":"5e097408-9928-4b1d-ac1f-04bb644b143c","apiVersion":"coordination.k8s.io/v1","resourceVersion":"73956435"}, "reason": "LeaderElection", "message": "custom-operator-controller-manager-54b7dcdf65-4cggp_45419a25-952e-43bd-acb3-923ae9d14536 became leader"}
2022-01-12T07:41:11.290Z INFO controller-runtime.manager.controller.customoperator Starting workers {"reconciler group": "replica.example.com", "reconciler kind": "CustomOperator", "worker count": 1}

7. Creation of Custom Resource
To test the operator, we need to create a custom resource of the operator. Edit the yaml file config/samples/replica_v1alpha1_customoperator.yaml so that it has the specifications as follows:

apiVersion: replica.example.com/v1alpha1
kind: CustomOperator
metadata:
  name: customoperator-sample
spec:
  # Add fields here
  # foo: bar
  replicas: 1

Create the custom resource in the cluster:

% kubectl create -f config/samples/replica_v1alpha1_customoperator.yaml
customoperator.replica.example.com/customoperator-sample created

Check the presence of the custom resource in the cluster:

% kubectl get customoperator.replica.example.com/customoperator-sample
NAME AGE
customoperator-sample 26s

Ensure that the operator creates the required deployment and its corresponding pod(s):

% kubectl get deploy
NAME                         READY   UP-TO-DATE   AVAILABLE   AGE
custom-operator-deployment   1/1     1            1           20m
% kubectl get pods
NAME                                          READY   STATUS    RESTARTS   AGE
custom-operator-deployment-748c9fb686-s6pjq   1/1     Running   0          20m

8. Update the Custom Resource
Edit the custom resource created from ‘config/samples/replica_v1alpha1_customoperator.yaml’ and change the replicas from 1 to 3.

% kubectl edit -f config/samples/replica_v1alpha1_customoperator.yaml
customoperator.replica.example.com/customoperator-sample edited

% kubectl get customoperator.replica.example.com/customoperator-sample -o yaml
apiVersion: replica.example.com/v1alpha1
kind: CustomOperator
metadata:
  creationTimestamp: "2022-01-12T07:56:42Z"
  generation: 2
  name: customoperator-sample
  namespace: default
  resourceVersion: "73976268"
  selfLink: /apis/replica.example.com/v1alpha1/namespaces/default/customoperators/customoperator-sample
  uid: b6d974b0-cca9-4b47-a622-c821d525856b
spec:
  replicas: 3

Confirm that the operator has changed the number of pods to the desired number:

% kubectl get deploy
NAME                         READY   UP-TO-DATE   AVAILABLE   AGE
custom-operator-deployment   3/3     3            3           26m
% kubectl get pods
NAME                                          READY   STATUS    RESTARTS   AGE
custom-operator-deployment-748c9fb686-2sc66   1/1     Running   0          2m26s
custom-operator-deployment-748c9fb686-qcns9   1/1     Running   0          2m26s
custom-operator-deployment-748c9fb686-s6pjq   1/1     Running   0          26m

9. Cleanup:
To clean up all the resources created in this tutorial, first delete the custom resource and then undeploy the operator:

% kubectl delete -f config/samples/replica_v1alpha1_customoperator.yaml
customoperator.replica.example.com "customoperator-sample" deleted

% make undeploy

Full code for CustomOperator: https://github.com/ashimagarg27/custom-operator

Conclusion:

A Kubernetes Operator is a controller for managing and deploying applications on Kubernetes. In this tutorial we used the operator-SDK framework to write a Kubernetes operator in Go and showed how RBAC roles and permissions can be set so that the operator can perform the required actions on any resource.
