Inspecting container checkpoints with checkpointctl

One of the newer features in Kubernetes (1.30 and later) is the Kubelet Checkpoint API. This new API allows users to create a stateful copy of a running container, a functionality which is often used for forensics or for debugging.

In Kubernetes installations where this feature is enabled, a checkpoint can be created by accessing the respective Kubelet API via curl or similar. In the following example I am also using the Kubernetes API /proxy endpoint (the same can also be done on the Node locally via localhost:10250/checkpoint/...):

$ curl -k -X POST --header "Authorization: Bearer $TOKEN" "$KUBERNETES_API_URL/api/v1/nodes/$NODE_NAME/proxy/checkpoint/$NAMESPACE_NAME/$POD_NAME/$CONTAINER_NAME"
{"items":["/var/lib/kubelet/checkpoints/checkpoint-fedora-74d79dd7f4-csrmg_skrenger-container-2024-12-12T12:56:19Z.tar"]}
Read the rest of this entry

ERROR: release image arch amd64 does not match host arch arm64

Well, so I tried installing a new ARM-based OpenShift Container Platform cluster on AWS. To prepare, I created an install-config.yaml file and changed the controlPlane.architecture and the compute.architecture field to “arm64” and then launched the installer. That did not work, it still complains about the architecture:

$ ./openshift-install create cluster --dir=.
INFO Credentials loaded from the "default" profile in file "/home/simon/.aws/credentials" 
INFO Consuming Install Config from target directory 
INFO Creating infrastructure resources...         
INFO Waiting up to 20m0s (until 11:07AM) for the Kubernetes API at https://api.skrenger-arm.lab.example.com:6443... 
INFO Pulling VM console logs                      
INFO Pulling debug logs from the bootstrap machine 
ERROR Attempted to gather ClusterOperator status after installation failure: listing ClusterOperator objects: Get "https://api.skrenger-arm.lab.example.com:6443/apis/config.openshift.io/v1/clusteroperators": dial tcp 3.64.25.143:6443: connect: connection refused 
ERROR Bootstrap failed to complete: Get "https://api.skrenger-arm.lab.example.com:6443/version": dial tcp 3.68.144.150:6443: connect: connection refused 
ERROR Failed waiting for Kubernetes API. This error usually happens when there is a problem on the bootstrap host that prevents creating a temporary control plane. 
ERROR The bootstrap machine failed to download the release image 
INFO Pulling quay.io/openshift-release-dev/ocp-release@sha256:9ffb17b909a4fdef5324ba45ec6dd282985dd49d25b933ea401873183ef20bf8... 
INFO cfce1ab124f59e93a0f67d7e85283d524ddfd73a27d0535319d69d1dce746488 
INFO ERROR: release image arch amd64 does not match host arch arm64 
INFO Bootstrap gather logs captured here "/home/simon/Downloads/arm/log-bundle-20221124110737.tar.gz" 
Read the rest of this entry

OpenShift 4 – List installed Operators

In OpenShift Container Platform (OCP) 4, most of the functionality is controlled by Operators. To see the currently installed Operators and also their status, use the following command:

$ oc get clusteroperators
NAME                                       VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.6.4     True        False         False      12m
cloud-credential                           4.6.4     True        False         False      38m
cluster-autoscaler                         4.6.4     True        False         False      32m
config-operator                            4.6.4     True        False         False      33m
console                                    4.6.4     True        False         False      21m
csi-snapshot-controller                    4.6.4     True        False         False      27m
dns                                        4.6.4     True        False         False      31m
etcd                                       4.6.4     True        False         False      32m
image-registry                             4.6.4     True        False         False      25m
ingress                                    4.6.4     True        False         False      24m
insights                                   4.6.4     True        False         False      33m
kube-apiserver                             4.6.4     True        False         False      30m
kube-controller-manager                    4.6.4     True        False         False      31m
kube-scheduler                             4.6.4     True        False         False      31m
kube-storage-version-migrator              4.6.4     True        False         False      24m
machine-api                                4.6.4     True        False         False      27m
machine-approver                           4.6.4     True        False         False      32m
machine-config                             4.6.4     True        False         False      32m
marketplace                                4.6.4     True        False         False      32m
monitoring                                 4.6.4     True        False         False      23m
network                                    4.6.4     True        False         False      33m
node-tuning                                4.6.4     True        False         False      33m
openshift-apiserver                        4.6.4     True        False         False      27m
openshift-controller-manager               4.6.4     True        False         False      24m
openshift-samples                          4.6.4     True        False         False      26m
operator-lifecycle-manager                 4.6.4     True        False         False      32m
operator-lifecycle-manager-catalog         4.6.4     True        False         False      32m
operator-lifecycle-manager-packageserver   4.6.4     True        False         False      27m
service-ca                                 4.6.4     True        False         False      33m
storage                                    4.6.4     True        False         False      32m

You can find the description of the default Operators in the documentation.

This will only list the Red Hat Operators that are installed as part of the cluster. These are all controlled by the ClusterVersionOperator, which is the “Master-Operator” of the cluster controlling all others.

If you want to list all Operators that were installed via the Operator Lifecycle Manager (OLM), you can use the following command:

$ oc get subscriptions --all-namespaces

Hello world

My name is Simon Krenger, I am a Technical Account Manager (TAM) at Red Hat. I advise our customers in using Kubernetes, Containers, Linux and Open Source.

Elsewhere

  1. GitHub
  2. LinkedIn
  3. GitLab