1. Background and Problem Situation
While operating a recent project, a specific Pod in our Kubernetes environment failed to start and repeatedly reported an ImagePullBackOff error.
Because Pods that were already running kept working normally, we initially judged that the Registry itself had not failed.
However, after a new Deployment was rolled out, only the newly created Pods kept failing to pull their images and never started.
To find the root cause, we analyzed the kubelet and container runtime logs, which showed that SSL certificate verification was failing during TLS communication with the Private Registry.
Calling the Registry endpoint directly with the command "curl -v https://repo.example.com:11000/v2/" produced the following error message.
SSL certificate problem: unable to get local issuer certificate
This message does not mean that the certificate has expired. It means that the node could not find, in its local trust store, the CA (Certificate Authority) certificate that issued the Registry's SSL certificate.
In other words, the Registry server was using a valid certificate, but the issuing CA was missing from the trust store on the Kubernetes node, so TLS verification failed.
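When the same check is scripted across many nodes, curl's exit status is easier to act on than the error text. The sketch below maps a few exit codes from curl's documented list to a short diagnosis. The function name explain_curl_exit and the wording of each diagnosis are illustrative choices, not part of the original investigation.

```shell
# Map a curl exit status to a short diagnosis.
# The exit codes come from curl's documented list; the diagnosis text
# is our own summary and only covers a handful of common cases.
explain_curl_exit() {
  case "$1" in
    0)  echo "ok: the request completed, including TLS verification" ;;
    6)  echo "dns: could not resolve the registry host name" ;;
    7)  echo "network: could not connect to the registry port" ;;
    28) echo "timeout: the operation timed out" ;;
    35) echo "tls: the TLS handshake failed" ;;
    60) echo "trust: the server certificate could not be verified against the local CA store" ;;
    *)  echo "other: see the curl manual for exit code $1" ;;
  esac
}

# Usage on a node (not executed here):
#   curl -sS -o /dev/null https://repo.example.com:11000/v2/
#   explain_curl_exit $?
```

The "unable to get local issuer certificate" message in this incident corresponds to exit code 60, which points at the trust store rather than at the network or DNS.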
Private Registry environments commonly use internal-only certificates or certificates issued by a private certificate authority.
In that case, the operator must not only build the Registry but also register the CA certificate on the Kubernetes Worker Nodes and in the Container Runtime environment.
This step is often skipped during initial setup, and the problem then surfaces when new nodes are added or workloads are redeployed.
Existing Pods kept working while only new Pods failed because of image caching.
The Container Runtime reuses images that have already been downloaded to the node, so existing Pods can run without contacting the Registry.
Newly created Pods, on the other hand, had to contact the Registry to pull their images, and that is where the SSL verification error occurred.
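To confirm on a node whether a given image is already cached, the runtime's image list can be inspected. The following sketch assumes the tabular output of crictl images (a header line, then IMAGE and TAG as the first two columns) and uses a hypothetical image name. It only reads text from standard input, so it can be tried without a cluster.

```shell
# Succeed (exit 0) if the listing on stdin contains the given image and tag.
# Assumes the first line is a header and columns 1 and 2 are IMAGE and TAG,
# as printed by `crictl images`.
image_in_listing() (
  repo=$1
  tag=$2
  awk -v r="$repo" -v t="$tag" '
    NR > 1 && $1 == r && $2 == t { found = 1 }
    END { exit(found ? 0 : 1) }
  '
)

# Usage on a node (hypothetical image name, not executed here):
#   crictl images | image_in_listing repo.example.com:11000/app 1.0 \
#     && echo "cached: a Pod can start without contacting the Registry"
```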
Such issues are not limited to Kubernetes. They can occur in most container environments, including Docker, containerd, CRI-O, and k3s.
Understanding the SSL certificate system and the structure of the CA trust store is therefore crucial for a stable operational environment.
2. Problem Solving
(1) Check SSL Certificate
The first step in troubleshooting is to check which certificate the actual Registry server is using.
In production environments, the certificate chain often consists of multiple stages, and the same problem can occur if intermediate certificates (Intermediate CA) are missing.
First, use the openssl command to check the certificate's Subject and Issuer information.
openssl s_client -connect repo.example.com:11000 </dev/null 2>/dev/null | openssl x509 -noout -subject -issuer
The output example is as follows.
subject=CN=*.example.com
issuer=C=GB; O=Sectigo Limited; CN=Sectigo Public Server Authentication CA DV R36
The output shows that the certificate was issued by a Sectigo CA.
This step matters not only for checking the certificate's expiration but also for determining which certificate authority the node needs to trust.
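The subject and issuer lines can also be compared mechanically, which is handy when many registries have to be checked. The sketch below reads the two lines printed by openssl x509 -noout -subject -issuer from standard input. The function name describe_issuer is hypothetical, and treating an identical subject and issuer as a self-issued certificate is a simplifying assumption.

```shell
# Read "subject=..." and "issuer=..." lines from stdin and report
# whether the certificate names itself as issuer or names another CA.
describe_issuer() {
  awk '
    /^subject=/ { sub(/^subject=/, ""); subject = $0 }
    /^issuer=/  { sub(/^issuer=/, "");  issuer = $0 }
    END {
      if (issuer == "") { print "no issuer line found"; exit 1 }
      if (subject == issuer)
        print "self-issued: subject and issuer are identical"
      else
        print "issued by: " issuer
    }'
}

# Usage (not executed here):
#   openssl s_client -connect repo.example.com:11000 </dev/null 2>/dev/null \
#     | openssl x509 -noout -subject -issuer | describe_issuer
```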
Next, extract the actual certificates to a file.
openssl s_client -connect repo.example.com:11000 -showcerts </dev/null 2>/dev/null | sed -n '/-----BEGIN CERTIFICATE-----/,/-----END CERTIFICATE-----/p' > SectigoPublicServerAuthenticationCA.crt
This command saves every certificate that the Registry server presents during the TLS handshake to a file, starting with the server certificate itself.
Depending on the environment, intermediate certificates and sometimes the root certificate are included as well, so the file may contain several certificates.
The number of extracted certificates can be checked with the command below.
grep -c 'BEGIN CERTIFICATE' SectigoPublicServerAuthenticationCA.crt
If the chain consists of multiple certificates, make sure that all of them are included.
In particular, a missing intermediate certificate may go unnoticed in a browser but still cause verification failures in the Container Runtime.
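When the extracted file holds several certificates, it can help to split it into one file per certificate so that each one can be inspected on its own. The sketch below does this with awk. The function name split_pem_bundle and the output naming scheme (prefix-1.crt, prefix-2.crt, ...) are assumptions made for illustration.

```shell
# Split a PEM bundle into one file per certificate and print how many
# certificates were written. Output files are named <prefix>-<n>.crt.
split_pem_bundle() (
  bundle=$1
  prefix=$2
  awk -v prefix="$prefix" '
    /-----BEGIN CERTIFICATE-----/ { n++; out = prefix "-" n ".crt"; writing = 1 }
    writing { print > out }
    /-----END CERTIFICATE-----/ { writing = 0; close(out) }
    END { print n + 0 }
  ' "$bundle"
)

# Usage (not executed here): split the chain, then inspect each part.
#   split_pem_bundle SectigoPublicServerAuthenticationCA.crt chain
#   for f in chain-*.crt; do
#     openssl x509 -in "$f" -noout -subject -issuer -enddate
#   done
```

The first file is the server certificate and the following files are the certificates the server sent after it, in the order they were presented.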
(2) Registering CA Certificates on the System (Ubuntu / Debian Series)
The recommended solution is to register the CA certificate in the node's system trust store.
This is the safest way to communicate with the Registry in a production environment because TLS verification stays enabled.
First, copy the extracted certificate to the system CA directory.
sudo cp SectigoPublicServerAuthenticationCA.crt /usr/local/share/ca-certificates/
Then run the update-ca-certificates command to refresh the system trust store.
sudo update-ca-certificates
On Ubuntu or Debian-based systems, this command creates symbolic links inside /etc/ssl/certs and rebuilds the entire CA bundle.
You can check the registration status with the command below.
openssl verify -CAfile /etc/ssl/certs/ca-certificates.crt SectigoPublicServerAuthenticationCA.crt
If registration succeeded, an 'OK' message is displayed.
Restarting containerd or Docker afterwards, so that the runtime reloads the trust store, resolves the image pull issue in most environments.
Use this method in production environments.
The insecure settings described in sections (3) and (4) bypass certificate verification entirely, which leaves the node exposed to MITM (Man-In-The-Middle) attacks.
In addition, in Auto Scaling or new Node Provisioning environments, it is advisable to distribute CA certificates automatically with tools such as cloud-init, Ansible, Terraform, or Kubernetes bootstrap scripts.
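For automated provisioning, the copy and refresh steps are usually wrapped so that they can run repeatedly without side effects. The following is a minimal sketch of such a wrapper, not a finished provisioning script. The function name install_ca_cert and its optional second and third arguments (target directory and refresh command) are assumptions added so the logic can be tried outside a real node.

```shell
# Copy a CA certificate into the trust store directory and refresh the
# store, but only when the certificate is new or has changed.
# Defaults match the Ubuntu/Debian paths used above; run as root there.
install_ca_cert() (
  src=$1
  dest_dir=${2:-/usr/local/share/ca-certificates}
  refresh_cmd=${3:-update-ca-certificates}
  dest="$dest_dir/$(basename "$src")"

  # Nothing to do if an identical copy is already installed.
  if [ -f "$dest" ] && cmp -s "$src" "$dest"; then
    echo "unchanged"
    exit 0
  fi

  cp "$src" "$dest" && "$refresh_cmd" >/dev/null && echo "installed"
)

# Usage on a node (not executed here):
#   install_ca_cert SectigoPublicServerAuthenticationCA.crt
```

Printing "installed" or "unchanged" lets the calling tool decide whether the container runtime needs a restart.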
(3) containerd Insecure settings
In some environments direct SSH access to the node is not possible, or rapid recovery takes precedence because of an urgent failure.
In such cases, SSL verification can be temporarily disabled in the containerd settings.
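First, create a certs.d directory dedicated to the Registry. containerd reads this directory only when the registry config_path setting in its configuration points to /etc/containerd/certs.d.
sudo mkdir -p /etc/containerd/certs.d/repo.example.com:11000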
Next, create the hosts.toml file in that directory.
server = "https://repo.example.com:11000"
[host."https://repo.example.com:11000"]
  capabilities = ["pull", "resolve"]
  skip_verify = true
The key option here is skip_verify = true, which makes containerd skip verification of the Registry's certificate. After applying the setting, be sure to restart containerd.
sudo systemctl restart containerd
This method is useful for quick disaster recovery, but it is not recommended for long-term use in production environments.
In security audits or in financial and public-sector environments, an insecure configuration can itself be a policy violation. In addition, a connection to a Registry that presents a forged certificate would be accepted without verification, which is a security risk. It is therefore advisable to use this method only as a temporary measure or in restricted development environments.
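The directory and hosts.toml steps above can be combined into one small helper when the same temporary workaround has to be applied to several nodes. This is a sketch under assumptions: the function name write_hosts_toml and its optional base directory argument are hypothetical, and the generated file is exactly the skip_verify configuration shown in this section, so the same security caveats apply.

```shell
# Write the temporary skip_verify hosts.toml for one registry.
# $1: registry host and port, e.g. repo.example.com:11000
# $2: certs.d base directory (defaults to the path used in this section)
write_hosts_toml() (
  registry=$1
  base_dir=${2:-/etc/containerd/certs.d}
  dir="$base_dir/$registry"

  mkdir -p "$dir" || exit 1
  cat > "$dir/hosts.toml" <<EOF
server = "https://$registry"

[host."https://$registry"]
  capabilities = ["pull", "resolve"]
  skip_verify = true
EOF
)

# Usage on a node (run as root, not executed here):
#   write_hosts_toml repo.example.com:11000 && systemctl restart containerd
```

Removing the generated directory once the CA certificate has been registered restores normal TLS verification for that registry.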
(4) k3s registries.yaml Insecure configuration
In a k3s environment, unlike standard Kubernetes, the containerd configuration is generated and managed by k3s. Direct changes to the containerd settings may therefore be overwritten when k3s restarts.
In this case, use the /etc/rancher/k3s/registries.yaml file instead.
mirrors:
  "repo.example.com:11000":
    endpoint:
      - "https://repo.example.com:11000"
configs:
  "repo.example.com:11000":
    auth:
      username: "your-username"
      password: "your-password"
    tls:
      insecure_skip_verify: true
k3s applies this configuration automatically when it generates the containerd configuration.
After applying the settings, restart the k3s service (k3s-agent on agent nodes).
sudo systemctl restart k3s
Because k3s is widely used in Edge and lightweight Kubernetes environments, it is especially important to understand the structure of registries.yaml.
In production environments, it is also advisable to consider Kubernetes Secrets or Vault-based authentication integration instead of storing credentials (username/password) in plaintext.
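As long as credentials do live in registries.yaml, the file should at least be readable only by its owner. The sketch below generates the configuration shown in this section and restricts its permissions. The function name write_registries_yaml and the optional target path argument are assumptions for illustration, and values containing double quotes or backslashes would need YAML escaping that this sketch does not perform.

```shell
# Write registries.yaml for one registry with owner-only permissions.
# $1: registry host:port, $2: username, $3: password
# $4: target file (defaults to the k3s path used in this section)
write_registries_yaml() (
  registry=$1
  user=$2
  pass=$3
  target=${4:-/etc/rancher/k3s/registries.yaml}

  umask 077   # a newly created file is readable by its owner only
  cat > "$target" <<EOF
mirrors:
  "$registry":
    endpoint:
      - "https://$registry"
configs:
  "$registry":
    auth:
      username: "$user"
      password: "$pass"
    tls:
      insecure_skip_verify: true
EOF
  chmod 600 "$target"   # also tighten a file that already existed
)

# Usage on a server node (run as root, not executed here):
#   write_registries_yaml repo.example.com:11000 your-username your-password
#   systemctl restart k3s
```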
3. Summary
In this incident response, we analyzed and resolved a typical SSL certificate issue that can occur in a Private Registry environment.
The core issue was that the CA that issued the Registry server certificate was not in the trust store of the Kubernetes Node.
Existing Pods kept running thanks to the image cache, but new Pods hit an SSL verification failure when they accessed the Registry.
To resolve this, we first used openssl to check the certificate issuer and the certificate chain, then extracted the actual certificate and registered it in the system trust store.
For emergency response, we also reviewed the skip_verify setting in containerd's hosts.toml and the insecure_skip_verify setting in k3s's registries.yaml.
In a production environment, however, it is crucial to register CA certificates properly and keep TLS verification enabled.
The insecure settings should be used only as a temporary workaround, because long-term use leaves a security vulnerability in place.
Ultimately, we confirmed that stable Kubernetes operation requires understanding not just application deployment but also the SSL certificate system, the CA trust structure, and how the Container Runtime operates.
Jsia