Deploying to GKE»
This guide provides a way to quickly get Spacelift up and running on a Google Kubernetes Engine (GKE) cluster. It shows a relatively simple networking setup where Spacelift is accessible via a public load balancer, but you can adjust this setup as long as Spacelift's basic networking requirements are still met.
To deploy Spacelift on GKE you need to take the following steps:
- Deploy your cluster and other infrastructure components.
- Push the Spacelift images to your artifact registry.
- Deploy the Spacelift backend services using our Helm chart.
Overview»
The illustration below shows what the infrastructure looks like when running Spacelift in GKE.
Networking»
Info
More details regarding networking requirements for Spacelift can be found on this page.
This section will solely focus on how the GCP infrastructure will be configured to meet Spacelift's requirements.
In this guide we'll create a new VPC network and subnetwork to allocate IPs for the nodes, pods, and services running in the cluster, and we'll create a VPC-native GKE cluster.
The Spacelift instance deployed in this guide is dual-stack IPv4/IPv6, which is why we allocate two static addresses and create two `Ingress` resources.
Outgoing IPv4 traffic is handled with SNAT, so we'll deploy a Cloud NAT gateway in GCP. For IPv6 traffic, each pod has its own publicly routable address, so no further action is needed there.
The database will be allocated a private IP in the VPC, and we'll connect to it from pods running in the cluster using `cloud-sql-proxy`.
Incoming HTTPS traffic will be handled by load balancers, which are automatically created when deploying Ingress resources in the cluster. They bind the reserved static IPv4 and IPv6 addresses that you can add to your DNS zone.
Warning
Although load balancers may appear ready to handle traffic immediately after setup, there is often a brief delay before traffic is routed correctly. If the setup is complete but you are experiencing connection reset issues, please wait at least five minutes before starting your investigation.
It's also possible to deploy Spacelift in an existing VPC. To do that, set the `enable_network` option to `false` and provide your own network.
Terraform code example
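The original example isn't reproduced here, but a minimal sketch looks like the following. Only `enable_network` is named in this guide; the other input names are assumptions, so check the module's documentation for the exact ones.

```hcl
# Sketch: deploy Spacelift into an existing VPC instead of creating a new one.
# All input names except enable_network are illustrative assumptions.
module "spacelift" {
  source = "github.com/spacelift-io/terraform-google-spacelift-selfhosted?ref=vX.Y.Z"

  project = var.project
  region  = var.region

  enable_network = false

  # References to your pre-existing networking resources.
  network    = google_compute_network.existing.self_link
  subnetwork = google_compute_subnetwork.existing.self_link
}
```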
Object Storage»
The Spacelift instance needs an object storage backend to store Terraform state files, run logs, and other artifacts. Several Google Storage buckets will be created in this guide. This is a hard requirement for running Spacelift.
More details about object storage requirements for Spacelift can be found here.
Database»
Spacelift requires a PostgreSQL database to operate. In this guide we'll create a new dedicated Cloud SQL instance. You can also reuse an existing instance and create a new database in it. In that case you'll have to adjust the database URL and other settings across the guide. It's also up to you to configure appropriate networking to expose this database to Spacelift's VPC.
You can switch the `enable_database` option to `false` in the Terraform module to skip creating a Cloud SQL instance.
More details about database requirements for Spacelift can be found here.
GKE»
In this guide, we'll deploy a new GKE Autopilot cluster to host Spacelift. The Spacelift application is deployed using a Helm chart, which installs three main components:
- The scheduler.
- The drain.
- The server.
The scheduler is the component that handles recurring tasks. It creates new entries in a message queue when a new task needs to be performed.
The drain is an async background processing component that picks up items from the queue and processes events.
The server hosts the Spacelift GraphQL API and REST API, and serves the embedded frontend assets. It also contains the MQTT server that handles interactions with workers. The server is exposed to the outside world using `Ingress` resources, and an MQTT `Service` exposes the broker to workers.
This MQTT Service is a `ClusterIP` by default, but it can be switched to a `LoadBalancer` if you need to expose the MQTT broker to workers outside the VPC.
If you already have a cluster running, it's also possible to deploy Spacelift into it. To do that, set the `enable_gke` option to `false` and provide a reference to your own resources. You may also want to disable creation of a new VPC by setting `enable_network` to `false`; see the networking section above for more details.
In that situation, you need to provide a `node_service_account` input referencing the service account used by your cluster nodes. It is used to grant your nodes permission to pull images from the Artifact Registry repository that will contain the Spacelift Docker images.
Terraform code example
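As a hedged sketch of what bringing your own cluster might look like (only `enable_gke`, `enable_network`, and `node_service_account` are named in this guide; the remaining input names are assumptions):

```hcl
# Sketch: deploy Spacelift into an existing GKE cluster and VPC.
module "spacelift" {
  source = "github.com/spacelift-io/terraform-google-spacelift-selfhosted?ref=vX.Y.Z"

  project = var.project
  region  = var.region

  # Skip cluster creation and grant your nodes pull access to the registry.
  enable_gke           = false
  node_service_account = google_service_account.gke_nodes.email

  # Optionally skip VPC creation too (see the networking section above).
  enable_network = false
  network        = google_compute_network.existing.self_link
  subnetwork     = google_compute_subnetwork.existing.self_link
}
```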
Workers»
In this guide Spacelift workers will also be deployed in GKE. That means that your Spacelift runs will be executed in the same environment as the app itself (we recommend using another K8s namespace).
If you want to run workers from outside the VPC, you can switch the `enable_external_workers` option of the Terraform module below. You'll then have to perform some extra steps and configuration while following the guide. If you're unsure what this entails, it is recommended to stick with the default option.
We highly recommend running your Spacelift workers within the same cluster, in a dedicated namespace. This simplifies the infrastructure deployment and makes it more secure since your runs are executed in the same environment.
Requirements»
Before proceeding with the next steps, the following tools must be installed on your computer.
- Google Cloud CLI.
- Docker.
- Helm.
- OpenTofu or Terraform.
Info
In the following sections of the guide, OpenTofu will be used to deploy the infrastructure needed for Spacelift. If you are using Terraform, simply swap `tofu` for `terraform`.
Generate encryption key»
Spacelift requires an RSA key to encrypt sensitive information stored in the Postgres database. Please follow the instructions in the RSA Encryption section of our reference documentation to generate a new key.
Deploy infrastructure»
We provide a Terraform module to help you deploy Spacelift's infrastructure requirements. Parts of this module can be disabled if you want to handle that piece of the infrastructure yourself. For example, you may want to disable the database if you already have a Cloud SQL instance and want to reuse it.
Before you start, set a few environment variables that will be used by the Spacelift modules:
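```shell
# Placeholder sketch: the original variable list is longer, and these names
# are assumptions; adjust them to the inputs the module actually expects.
export GCP_PROJECT_ID="my-spacelift-project"     # project to deploy into
export GCP_REGION="europe-west1"                 # region for regional resources
export SPACELIFT_DOMAIN="spacelift.example.com"  # hostname Spacelift is served on

# Initial admin credentials, only used for the very first login.
export ADMIN_USERNAME="admin"
export ADMIN_PASSWORD="a-strong-generated-password"

# Provided with your Spacelift self-hosted distribution.
export SPACELIFT_VERSION="vX.Y.Z"
export SPACELIFT_LICENSE_TOKEN="..."
```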
Note
The admin login/password combination is only used for the very first login to the Spacelift instance. It can be removed after the initial setup. More information can be found in the initial setup section.
Below is a small example of how to use this module:
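```hcl
# Condensed sketch: input names not mentioned elsewhere in this guide are
# assumptions; see the module documentation for the full set of options.
module "spacelift" {
  source = "github.com/spacelift-io/terraform-google-spacelift-selfhosted?ref=vX.Y.Z"

  project        = var.project         # e.g. taken from $GCP_PROJECT_ID
  region         = var.region          # e.g. taken from $GCP_REGION
  website_domain = var.website_domain  # e.g. taken from $SPACELIFT_DOMAIN

  # Toggles discussed in this guide; by default the module creates the
  # corresponding infrastructure.
  enable_network  = true
  enable_database = true
  enable_gke      = true
}
```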
Feel free to take a look at the documentation for the `terraform-google-spacelift-selfhosted` module before applying your infrastructure, in case there are any settings that you wish to adjust. Once you are ready, apply your changes:
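```shell
tofu init
tofu apply
```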
Info
If you encounter the following error: `googleapi: Error 400: Service account service-xxxxxxxxxxxx@service-networking.iam.gserviceaccount.com does not exist`, it means that the Service Networking service account was not created for your project.
To manually provision this service account, run the following command:
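```shell
# Assumption: $GCP_PROJECT_ID holds your project ID.
gcloud beta services identity create \
  --service=servicenetworking.googleapis.com \
  --project="${GCP_PROJECT_ID}"
```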
Once applied, you should grab all the variables that need to be exported in the shell used for the next steps. For convenience, we expose a `shell` output in Terraform that you can source directly.
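For example (the file name is arbitrary; the `shell` output name comes from the module):

```shell
tofu output -raw shell > spacelift.env
source spacelift.env
```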
Info
During this guide you'll export shell variables that will be useful in future steps. So please keep the same shell open for the entire guide.
Configure your DNS zone»
Configure the following records in your DNS zone. The `${PUBLIC_IP_ADDRESS}` and `${PUBLIC_IPV6_ADDRESS}` environment variables should be available in your shell from the previous step.
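The exact record set depends on your zone, but in zone-file notation it boils down to an A and an AAAA record for your Spacelift hostname (names and TTLs below are illustrative):

```text
; Substitute your actual hostname and the addresses from the Terraform output.
spacelift.example.com.  300  IN  A     ${PUBLIC_IP_ADDRESS}
spacelift.example.com.  300  IN  AAAA  ${PUBLIC_IPV6_ADDRESS}
```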
Info
It is useful to configure these records as soon as possible, since they will be used for the Let's Encrypt handshake later. Do this now, and continue the setup while the records propagate.
Configure database»
We need to grant the SQL user the privilege to create roles. The Spacelift application runs DB migrations on startup, and it needs to be able to create certain objects in the database.
As of now, there are two ways to do that: either in the UI, or locally using `cloud-sql-proxy`.
To use the UI, click on the database instance in the GCP console, then go to the Cloud SQL Studio tab on the left. Choose the `postgres` database, use `postgres` as the username, and use the root password from the Terraform module's output (`$DB_ROOT_PASSWORD`). Replace the variables with the values from the Terraform output, and execute the necessary queries.
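The original queries aren't reproduced here; a minimal sketch, assuming the application user is named `spacelift`, would be:

```sql
-- Allow the application user to create roles during startup migrations.
ALTER USER "spacelift" WITH CREATEROLE;
```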
Alternatively, you can run the same queries locally through `cloud-sql-proxy`. You can find more info about how to install it in the official documentation.
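Once installed, a hedged sketch of the local approach looks like this (`$INSTANCE_CONNECTION_NAME` is an assumed variable holding the `project:region:instance` identifier):

```shell
# Start the proxy in the background, listening on localhost:5432.
cloud-sql-proxy --port 5432 "${INSTANCE_CONNECTION_NAME}" &

# Connect as the postgres superuser and grant CREATEROLE, as above.
PGPASSWORD="${DB_ROOT_PASSWORD}" psql -h 127.0.0.1 -U postgres -d postgres \
  -c 'ALTER USER "spacelift" WITH CREATEROLE;'
```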
Push images to Artifact Registry»
From the previous `terraform apply` step, you need to grab the URL of the registry from the output and push the Spacelift Docker images to it.
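First, configure Docker to authenticate against the regional Artifact Registry host:

```shell
# Assumption: $GCP_REGION is the region used earlier in this guide.
gcloud auth configure-docker "${GCP_REGION}-docker.pkg.dev"
```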
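Then pull, retag, and push each image. The sketch below is illustrative: the source image reference comes from your Spacelift self-hosted distribution, and `$REPOSITORY_URL` is assumed to hold the registry URL from the Terraform output:

```shell
docker pull "${SOURCE_BACKEND_IMAGE}"
docker tag  "${SOURCE_BACKEND_IMAGE}" "${REPOSITORY_URL}/spacelift-backend:${SPACELIFT_VERSION}"
docker push "${REPOSITORY_URL}/spacelift-backend:${SPACELIFT_VERSION}"
# Repeat for the launcher image.
```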
Deploy Spacelift»
First, we need to configure Kubernetes credentials to interact with the GKE cluster.
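A sketch using `gcloud container clusters get-credentials` (the cluster name and location variables are assumed to come from the Terraform output; the kubeconfig path is arbitrary):

```shell
export KUBECONFIG="${HOME}/.kube/spacelift-gke"
gcloud container clusters get-credentials "${GKE_CLUSTER_NAME}" \
  --region "${GCP_REGION}" \
  --project "${GCP_PROJECT_ID}"
```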
Warning
Make sure the above `KUBECONFIG` environment variable is present when running the following helm commands.
Cert manager»
Spacelift should be served over HTTPS with valid certificates, so you need to provide them to the Ingress resources deployed by Spacelift. One simple way to achieve that is to use cert-manager to generate Let's Encrypt certificates.
If you already have cert-manager running in your cluster and know how to configure Certificates on Ingress resources, you can skip this step.
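A standard installation of cert-manager via its official Helm chart (add `--version` if you want to pin a specific release):

```shell
helm repo add jetstack https://charts.jetstack.io
helm repo update
helm install cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --create-namespace \
  --set installCRDs=true
```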
Info
Note that this command can take a few minutes to finish, as GKE scales itself up to be able to run the pods.
Next, we will configure an issuer to tell cert-manager how to generate certificates. In this guide we'll use ACME with Let's Encrypt and HTTP01 challenges.
Note
It is highly recommended to test against the Let's Encrypt staging environment before using the production environment. This will allow you to get things right before issuing trusted certificates and reduce the chance of hitting rate limits. Note that the staging root CA is untrusted by browsers, and Spacelift workers won't be able to connect to the server endpoint either.
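A sketch of a staging `ClusterIssuer` using HTTP01 challenges (the email is a placeholder, and the ingress solver settings are assumptions that must match how your Spacelift Ingress is served):

```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging
spec:
  acme:
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    email: admin@example.com  # placeholder
    privateKeySecretRef:
      name: letsencrypt-staging-account-key
    solvers:
      - http01:
          ingress:
            class: gce  # assumption: GKE's built-in ingress class
```

Apply it with `kubectl apply -f <file>.yaml`.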
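The production issuer is identical apart from the name and the ACME server URL:

```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-production
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@example.com  # placeholder
    privateKeySecretRef:
      name: letsencrypt-production-account-key
    solvers:
      - http01:
          ingress:
            class: gce  # assumption: GKE's built-in ingress class
```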
Install Spacelift»
Create Kubernetes namespace»
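Assuming `$K8S_NAMESPACE` was exported by the `shell` output sourced earlier:

```shell
kubectl create namespace "${K8S_NAMESPACE}"
```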
Create secrets»
The Spacelift services need various environment variables to be configured in order to function correctly. In this guide we will create three Kubernetes secrets to pass these variables to the Spacelift backend services:
- `spacelift-shared` - contains variables used by all services.
- `spacelift-server` - contains variables specific to the Spacelift server.
- `spacelift-drain` - contains variables specific to the Spacelift drain.
For convenience, the `terraform-google-spacelift-selfhosted` Terraform module provides a `kubernetes_secrets` output that you can pass to `kubectl apply` to create the secrets.
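For example, rendering the manifests from the output and piping them straight into `kubectl`:

```shell
# The -n flag is an assumption; drop it if the manifests set their own namespace.
tofu output -raw kubernetes_secrets | kubectl apply -n "${K8S_NAMESPACE}" -f -
```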
Deploy application»
You need to provide a number of configuration options to Helm to configure Spacelift correctly for your environment.
You can generate a Helm `values.yaml` file to use via the `helm_values` output variable of the `terraform-google-spacelift-selfhosted` Terraform module.
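For example:

```shell
tofu output -raw helm_values > values.yaml
```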
Feel free to take a look at this file to understand what is being configured. Once you're happy, run the following command to deploy Spacelift:
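```shell
# The chart repository URL and chart name below are assumptions; use the
# chart reference provided with your self-hosted distribution if it differs.
helm repo add spacelift https://downloads.spacelift.io/helm
helm upgrade --install spacelift spacelift/spacelift-self-hosted \
  --namespace "${K8S_NAMESPACE}" \
  --values values.yaml
```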
Tip
You can follow the deployment progress with: `kubectl logs -n ${K8S_NAMESPACE} deployments/spacelift-server`
Next steps»
Now that your Spacelift installation is up and running, take a look at the initial installation section for the next steps to take.
Create a worker pool»
We recommend that you deploy workers in a dedicated namespace.
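For example (assuming `$K8S_WORKER_POOL_NAMESPACE` holds your chosen namespace name; export it yourself if the earlier `shell` output did not set it):

```shell
kubectl create namespace "${K8S_WORKER_POOL_NAMESPACE}"
```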
Warning
When creating your `WorkerPool`, make sure to configure resource requests and limits. This is highly recommended because otherwise very high resource requests can be set automatically by your admission controller.
Also make sure to deploy the WorkerPool and its secrets into the namespace we just created by adding `-n ${K8S_WORKER_POOL_NAMESPACE}` to the commands in the guide below.
➡️ You need to follow this guide for configuring Kubernetes Workers.
Deletion / uninstall»
Before running `tofu destroy` on the infrastructure, we recommend that you remove the Spacelift resources from your Kubernetes cluster first. This is because the Spacelift Helm chart creates some GCP resources (such as a network endpoint group) that are not managed by Terraform. If you do not remove them from Kubernetes, `tofu destroy` will fail because some resources, like networks, cannot be removed while they are not empty.
Note
The `database_deletion_protection` variable in the Terraform module controls whether the database can be automatically deleted. If set to `true` (or omitted, as it defaults to `true`), the database is protected from deletion. This means running `tofu destroy` will not delete the database, and you will need to remove it manually.
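A sketch of the teardown order, reusing the release and namespace names assumed earlier in this guide:

```shell
# Remove the application first so controller-created GCP resources are cleaned up.
helm uninstall spacelift --namespace "${K8S_NAMESPACE}"
kubectl delete namespace "${K8S_NAMESPACE}" "${K8S_WORKER_POOL_NAMESPACE}"

# Then tear down the infrastructure.
tofu destroy
```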
Note
Namespace deletions in Kubernetes can take a while or even get stuck. If that happens, you need to remove the finalizers from the stuck resources.