Cloud-init works differently with COS images in GCP

Introduction

When working with VMs, you may come across cloud-init, a tool for provisioning your instances. It is well documented and widely used by cloud providers. You may be surprised, though, that in some situations it does not behave as documented. That was the case for me when working with Google Cloud Platform VMs that used COS images.

Cloud-init

Cloud-init is an open-source tool designed to automate the initialization of cloud instances and servers, making it easier to configure systems according to specific needs with minimal effort. It is widely used by developers, system administrators, and IT professionals to automate the configuration of virtual machines (VMs), cloud instances, and machines on a network. Cloud-init is supported across all major public cloud providers, provisioning systems for private cloud infrastructure, and bare-metal installations. It is also an industry standard for cross-platform cloud instance initialization, working across different distributions.

COS image

COS, or Container-Optimized OS, is an operating system image designed specifically for running containers on Google Cloud Platform. The primary purpose of COS is to provide an optimized environment for deploying and managing containers efficiently, securely, and quickly on GCP's Compute Engine VMs. COS is particularly well-suited for use cases that require running containers in a production environment, offering a balance between performance, security, and ease of management.

So what was unexpected?

With cloud-init you can configure modules that perform tasks for you at different stages of instance boot. One such module is runcmd.

The documentation states that runcmd runs only on the first boot of an instance (Module frequency: once-per-instance). Since it works this way, it is a good fit for configuring an instance and doing one-off tasks (e.g. enabling system services, running a custom one-time script, etc.).

That is not the case for COS images. If you did not notice it in the GCP docs I linked, I don't blame you. This detail is very easy to miss because it sits at the end of a note in one of the examples:

Note: Using systemctl enable in the runcmd section of cloud-init will not work as intended. This is because /etc on Container-Optimized OS is stateless, so commands such as systemctl enable/disable, that modify /etc, won't persist across reboots. See Disks and Filesystem for more information on Container-Optimized OS's filesystem. An alternative to systemctl enable is to use systemctl start, as in the example above. cloud-init modules such as write_files and runcmd, which are typically run once-per-instance on other distros, are run on every boot on Container-Optimized OS.
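If a task really must run only once even though runcmd fires on every boot, one option is to guard it yourself with a marker file. The snippet below is a minimal sketch, not something taken from the cloud-init or GCP docs: run_once is a hypothetical helper name, and it assumes the marker path you pass lives on a filesystem that survives reboots (the Disks and Filesystem page mentioned in the note describes which paths do on COS).

```shell
#!/bin/sh
# run_once MARKER COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND only if MARKER does not exist yet,
# then creates MARKER so that later boots skip the command.
# Assumption: MARKER is on storage that persists across reboots.
run_once() {
    marker="$1"
    shift
    if [ -e "$marker" ]; then
        # Marker left by an earlier successful run: nothing to do.
        return 0
    fi
    # Run the task; on failure, return its exit status and leave no marker,
    # so the task is retried on the next boot.
    "$@" || return $?
    mkdir -p "$(dirname "$marker")"
    : > "$marker"
}

# Example call (both paths are placeholders):
# run_once /path/on/persistent/disk/setup.done /path/to/setup.sh
```

The marker is only written after the command succeeds, so a failed setup is attempted again on the next boot instead of being silently skipped.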

I guess it's another detail to remember.