Terraform GKE google_container_cluster - should the default node pool be removed?
The Terraform google_container_cluster example below removes the default node pool.
resource "google_container_cluster" "primary" {
  name     = "my-gke-cluster"
  location = "us-central1"

  # We can't create a cluster with no node pool defined, but we want to only use
  # separately managed node pools. So we create the smallest possible default
  # node pool and immediately delete it.
  remove_default_node_pool = true
  initial_node_count       = 1
}
However, since the default pool is removed, there may be no node on which to deploy the system pods.
Question
What is a good way to have a node on which to deploy the system pods?
- Keep a default pool with a node count of 1 and schedule the system pods on the default node. Set the autoscaling pool's minimum node count to 0 and do not schedule system pods on autoscaling nodes.
- Remove the default pool. Set the autoscaling pool's minimum node count to 1 so that system pods can be deployed on an autoscaling node.
Node pool definition
resource "google_container_node_pool" "primary" {
  name     = "${google_container_cluster.primary.name}-node-pool"
  project  = var.PROJECT_ID
  location = var.location
  cluster  = google_container_cluster.primary.name

  #--------------------------------------------------------------------------------
  # Node instantiation based on the auto-scaling setting.
  # node_count and autoscaling are mutually exclusive.
  #--------------------------------------------------------------------------------
  node_count = var.autoscaling ? null : var.num_nodes
  dynamic "autoscaling" {
    for_each = var.autoscaling ? [1] : []
    content {
      min_node_count = var.min_node_count # Set to 0 currently
      max_node_count = var.max_node_count
    }
  }

  #--------------------------------------------------------------------------------
  # Node configuration
  #--------------------------------------------------------------------------------
  node_config {
    #------------------------------------------------------------------------------
    # Service account whose roles the nodes assume
    #------------------------------------------------------------------------------
    service_account = var.service_account

    #------------------------------------------------------------------------------
    # Instance configuration
    #------------------------------------------------------------------------------
    machine_type = var.machine_type
    preemptible  = var.node_preemptive
    disk_size_gb = var.disk_size_gb
    disk_type    = var.disk_type
    metadata = {
      disable-legacy-endpoints = "true"
    }

    #------------------------------------------------------------------------------
    # The K8s labels (key/value pairs) applied to each node
    #------------------------------------------------------------------------------
    labels = var.labels

    oauth_scopes = [
      "https://www.googleapis.com/auth/logging.write",
      "https://www.googleapis.com/auth/monitoring",
    ]
    tags = var.tags
  }
}
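For reference, the variables used by this node pool might be declared as follows. This is a sketch: the variable names match the config above, but the types and defaults are assumptions.

```hcl
variable "autoscaling" {
  description = "Enable cluster autoscaling for this node pool."
  type        = bool
  default     = true
}

variable "min_node_count" {
  description = "Minimum node count when autoscaling is enabled."
  type        = number
  default     = 0
}

variable "max_node_count" {
  description = "Maximum node count when autoscaling is enabled."
  type        = number
  default     = 3
}

variable "num_nodes" {
  description = "Fixed node count when autoscaling is disabled."
  type        = number
  default     = 1
}
```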
Solution 1
Those are system pods that get deployed on each node, such as kube-dns and kube-proxy. However, components like metrics-server that run as a Deployment can run as a single replica on one of the nodes.
What is a good way to have a node on which to deploy the system pods?
Keep at least 1-2 nodes for system pods such as metrics-server, or ingress if you are installing it inside kube-system, etc.
Keep a default pool with a node count of 1 and schedule the system pods on the default node. Set the autoscaling pool's minimum node count to 0 and do not schedule system pods on autoscaling nodes.
If you set the autoscaling minimum node count to 0, the pool will scale down, but there is no guarantee about when. GKE has limitations around scaling nodes down that you should review first. That said, scaling down to zero is possible if you update the PodDisruptionBudgets and check the other limitations.
Limitations: https://cloud.google.com/kubernetes-engine/docs/concepts/cluster-autoscaler#limitations
Remove the default pool. Set the autoscaling pool's minimum node count to 1 so that system pods can be deployed on an autoscaling node.
This is a good approach: with a minimum count of 1, the system pods should be up and running on that node, and when you deploy your application the pool starts scaling up as needed while the system pods are already running.
You can also scale down to zero in that case: if autoscaling is enabled and you deploy the application, the autoscaler will bring up nodes and both the application and the system pods will start on the available nodes, so there is not much to worry about there either.
If you don't want to remove the default node pool and instead keep 1 node running, you can use this field: https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/container_cluster#remove_default_node_pool
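Concretely, the second option only changes the minimum node count in the node pool defined earlier. A sketch of the adjusted autoscaling block:

```hcl
# Inside the google_container_node_pool resource: keep at least one node
# so system pods always have somewhere to run, while still allowing
# scale-up under load.
autoscaling {
  min_node_count = 1 # was 0; guarantees a node for system pods
  max_node_count = var.max_node_count
}
```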
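Keeping the default pool means setting remove_default_node_pool to false (or leaving it unset) in the cluster resource. A sketch, with the node count value an assumption:

```hcl
resource "google_container_cluster" "primary" {
  name     = "my-gke-cluster"
  location = "us-central1"

  # Keep the default node pool, with a single node for system pods.
  remove_default_node_pool = false
  initial_node_count       = 1
}
```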
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow

