Building a Centralized Multi-cluster K8s ServiceBoard using K8s CRDs and Rancher

How and Why to Build a Centralized Multi-cluster K8s ServiceBoard

Able Lv
3 min read · Aug 2, 2022

Background and Problem

We have hundreds of microservices running on more than 80 Kubernetes clusters on GCP (Google Cloud Platform) and Alibaba Cloud.

Getting K8s service status, version, Ingress URLs, and so on from multiple clusters is time-consuming and inefficient.

Solution

We want to build a Centralized Multi-cluster K8s ServiceBoard that helps engineers quickly get K8s service status and information across all of our K8s clusters, such as service version, health status, K8s service name, and Ingress URL.

Here are three candidate architectures for building it.

Architecture 1: Based on collectors in each K8s cluster

We could develop a collector and deploy it to every K8s cluster. A collector is a K8s custom controller that watches Service and Pod changes and collects K8s service information.
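The collector's core job is to merge what it sees from Service, Pod, and Ingress objects into one record per service. Here is a minimal sketch of that merge step in Python; the `build_service_info` helper and its input shapes are illustrative assumptions rather than our actual implementation, and the output field names follow the ServiceInfo spec fields described later in this article.

```python
from datetime import datetime, timezone

def build_service_info(cluster_name, service, pods, ingress):
    """Merge raw Service/Pod/Ingress data into one ServiceInfo record.

    The input dicts mimic a small subset of the objects a K8s watch
    would deliver; the output mirrors the spec fields shown later
    (clusterName, service.*, ingress.*, pods.*, updatedAt).
    """
    return {
        "clusterName": cluster_name,
        "service": {
            "name": service["name"],
            "namespace": service["namespace"],
            "ports": service.get("ports", []),
        },
        "ingress": {
            "hosts": ingress.get("hosts", []),
            "ip": ingress.get("ip", ""),
        },
        "pods": {
            # Deduplicate image tags across replicas of the same service.
            "imageTags": sorted({p["imageTag"] for p in pods}),
            # A simple health rule: every pod must be Running.
            "status": "Running"
            if pods and all(p["phase"] == "Running" for p in pods)
            else "Degraded",
        },
        "updatedAt": datetime.now(timezone.utc).isoformat(),
    }
```

In a real collector, a function like this would be called from the controller's event handlers, and its output written back to the cluster as a Custom Resource.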

The architecture diagram is as follows:

Architecture 1: Based on collectors in each K8s cluster

However, with more than 80 K8s clusters, deploying, upgrading, and maintaining that many collectors is very difficult.

Architecture 2: Based on the Rancher platform

At Airwallex, we use Rancher as our unified K8s management platform. Rancher is a container management platform for managing multiple Kubernetes clusters. Rancher can communicate with all downstream K8s clusters via the tunnels established by the Rancher agents and server.

Therefore, we can connect all our K8s clusters with one collector using the Rancher platform.
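Concretely, Rancher exposes each downstream cluster's API server through a proxy path on the Rancher server, so a single collector can reach every cluster with one endpoint per cluster. A hypothetical kubeconfig entry for one such cluster might look like this (the host, cluster ID, and token are placeholders for your own Rancher setup):

```yaml
# Illustrative kubeconfig entry; values are placeholders.
apiVersion: v1
kind: Config
clusters:
- name: downstream-gcp-prod-1
  cluster:
    # Rancher proxies the downstream cluster's API at this path.
    server: https://<rancher-host>/k8s/clusters/<cluster-id>
users:
- name: serviceboard-collector
  user:
    token: <rancher-api-token>
contexts:
- name: gcp-prod-1
  context:
    cluster: downstream-gcp-prod-1
    user: serviceboard-collector
current-context: gcp-prod-1
```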

The architecture diagram is as follows:

Architecture 2: Based on the Rancher platform

Maintaining one collector is far easier than maintaining more than 80. Building on Rancher also makes the ServiceBoard server more extensible: in the future, it could not only read but also modify resources in our K8s clusters.

Architecture 3: Based on Rancher + Kubernetes CRDs

Developing and maintaining a custom dashboard UI is not easy. However, the Rancher Dashboard UI can display Kubernetes Custom Resource Definitions (CRDs) out of the box.

Therefore, we can build a single collector that watches changes in all clusters through the Rancher proxy and stores the collected service information in Kubernetes Custom Resources. The Rancher Dashboard UI then displays that information.
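A minimal sketch of what such a CRD could look like is below; the group `serviceboard.example.com` and the kind `ServiceInfo` are illustrative names, not necessarily the ones we use. The `additionalPrinterColumns` entries let the dashboard list key fields directly in the table view.

```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: serviceinfos.serviceboard.example.com
spec:
  group: serviceboard.example.com
  scope: Namespaced
  names:
    kind: ServiceInfo
    plural: serviceinfos
    singular: serviceinfo
  versions:
  - name: v1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            x-kubernetes-preserve-unknown-fields: true
          status:
            type: object
            x-kubernetes-preserve-unknown-fields: true
    # Extra columns shown when listing ServiceInfo objects in the UI.
    additionalPrinterColumns:
    - name: Cluster
      type: string
      jsonPath: .spec.clusterName
    - name: Status
      type: string
      jsonPath: .status.state
```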

The architecture diagram is as follows:

Architecture 3: Based on Rancher + Kubernetes CRDs

Result and Conclusion

Finally, we chose “Architecture 3: Based on Rancher + Kubernetes CRDs”. With Rancher Dashboard UI and Kubernetes CRDs, we no longer need to build frontend and backend services. Kubernetes and etcd serve as our backend server and database.

The Centralized K8s ServiceBoard pages are as follows:

ServiceBoard Overview

We can select multiple K8s clusters, search for services, and sort services by status on the ServiceBoard overview page.

ServiceBoard Overview Page

ServiceBoard Detail Page

Clicking a service name brings up the following page.

The spec field contains the service information, such as clusterName, service.name, service.namespace, service.ports, ingress.hosts, ingress.ip, pods.imageTags, pods.status, and updatedAt.

The status field contains the service status.
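Putting the two fields together, a single custom resource object might look like the following; the kind, API group, and all values are illustrative.

```yaml
apiVersion: serviceboard.example.com/v1
kind: ServiceInfo
metadata:
  name: gcp-prod-1.default.payments
spec:
  clusterName: gcp-prod-1
  service:
    name: payments
    namespace: default
    ports: [80, 443]
  ingress:
    hosts: [pay.example.com]
    ip: 10.0.0.1
  pods:
    imageTags: [v1.2.3]
    status: Running
  updatedAt: "2022-08-02T10:00:00Z"
status:
  state: Healthy
```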

ServiceBoard Details Page

All in all, with the introduction of the Centralized Multi-cluster K8s ServiceBoard, engineers can quickly and easily get service status and information across all of our K8s clusters.

