Skip to content

nodegroup

Inspect and operate on a cluster's managed nodegroups: list them with AMI freshness, describe one in depth, scale desired/min/max size (with optional PDB and health gating), and roll nodegroups to the latest recommended AMI with pre-flight health checks and live monitoring.

refresh nodegroup <list|describe|scale|update> [args] [flags]

The group has the alias ng. The cluster is a positional on most subcommands, or --cluster/-c, falling back to the active context.


list

List the managed nodegroups in a cluster with their status, instance type, node counts, and AMI freshness (whether each is on the latest recommended AMI).

The NODES column shows the desired node count by default. Add --check-readiness to measure real Kubernetes node readiness (Ready/desired) via the cluster API; when the cluster is unreachable it degrades to the desired count rather than reporting a fabricated ready figure.

refresh nodegroup list [cluster] [flags]

Flags

Flag Description
--cluster, -c EKS cluster name or pattern (or pass as positional)
--filter, -f Filter, key=value (keys: name, status, instanceType, amiStatus); repeatable
--check-readiness, -R Measure real Kubernetes node readiness (Ready/desired) via the cluster API
--kubeconfig Path to the kubeconfig for --check-readiness (defaults to $KUBECONFIG, then ~/.kube/config)
--sort Sort by field: name (default), status, instance, nodes
--desc Sort descending
--format, -o table (default), json, yaml, plain
--watch, -w Re-run and redraw every --watch-interval until interrupted
--watch-interval Refresh interval for --watch (default 10s)
--timeout, -t Operation timeout (env REFRESH_TIMEOUT)

Examples

# Only nodegroups on a stale AMI
refresh nodegroup list my-cluster --filter amiStatus=outdated

# Plain TSV for scripting
refresh nodegroup list my-cluster -o plain | awk '{print $1}'

# Watch a roll progress live
refresh nodegroup list my-cluster --watch

describe

Detailed information for one nodegroup: scaling config, instance type(s), AMI/release version and freshness, and optional per-instance and workload placement details.

refresh nodegroup describe [cluster] [nodegroup] [flags]

describe has the alias get. The nodegroup name may be the second positional or --nodegroup/-n.

Flags

Flag Description
--cluster, -c EKS cluster name
--nodegroup, -n Nodegroup name (or pass as second positional)
--show-instances, -I Include EC2 instance details
--show-workloads, -W Include workload/pod placement info
--format, -o table (default), json, yaml, plain
--timeout, -t Operation timeout (env REFRESH_TIMEOUT)

Examples

refresh nodegroup describe my-cluster ng-default
refresh nodegroup describe my-cluster ng-default --show-instances --show-workloads

scale

Change a managed nodegroup's desired/min/max size. Any subset of --desired/--min/--max may be set; unspecified bounds are left unchanged.

refresh nodegroup scale [cluster] -n <nodegroup> [flags]

Flags

Flag Description
--cluster, -c EKS cluster name (or pass as positional)
--nodegroup, -n Required. Nodegroup name
--desired Desired node count
--min Minimum node count
--max Maximum node count
--health-check Validate cluster health before and after scaling
--check-pdbs Validate Pod Disruption Budgets before scaling down
--wait Wait for the scaling operation to complete
--op-timeout Scaling operation timeout (default 5m)
--kubeconfig Kubeconfig for workload/PDB checks (defaults to $KUBECONFIG, then ~/.kube/config)
--dry-run Preview the scaling impact without executing
--timeout, -t Operation timeout (env REFRESH_TIMEOUT)

Preview which PDBs would block a scale-down

Combine --dry-run --check-pdbs to preview the specific Pod Disruption Budgets that would constrain a scale-down — before you touch anything.

Examples

# Scale up to 5 nodes
refresh nodegroup scale my-cluster -n ng-default --desired 5

# Safe scale-down: preview the PDBs that would constrain it
refresh nodegroup scale my-cluster -n ng-default --desired 2 --check-pdbs --dry-run

# Scale down for real, gated on PDBs, and wait for it to settle
refresh nodegroup scale my-cluster -n ng-default --desired 2 --check-pdbs --wait

update

Roll managed nodegroups to the latest recommended AMI, with pre-flight health gates and live monitoring. This is the flagship patch command.

refresh nodegroup update [cluster] [nodegroup] [flags]

update has the alias update-ami. Omitting the nodegroup updates all nodegroups in the cluster.

Custom-AMI nodegroups are skipped

Nodegroups whose AMI is managed via a launch template (AmiType=CUSTOM) are detected and skipped with guidance: their AMI rolls when you publish a new launch-template version, not via this command.

Fleet mode

--all-clusters discovers clusters across regions (scope with -r) and rolls them serially with one batch confirmation, an aggregate summary, and a worst-outcome exit code.

refresh nodegroup update --all-clusters --dry-run            # fleet-wide plan
refresh nodegroup update --all-clusters -r us-east-1 --yes   # execute in one region

Flags

Flag Description
--cluster, -c EKS cluster name or partial pattern (overrides kubeconfig; env EKS_CLUSTER_NAME)
--nodegroup, -n Nodegroup name or partial pattern (if unset, update all)
--all-clusters Fleet mode: roll matching nodegroups across all discovered clusters (serial); scope with -r
--region, -r Region(s) for --all-clusters discovery (default: partition EKS regions / REFRESH_EKS_REGIONS)
--dry-run, -d Preview changes without executing
--changelog In dry-run, print full amazon-eks-ami release notes between the current and target AMI
--force, -f Force the update where possible
--no-wait Don't wait for update completion (start-and-return)
--quiet, -q Minimal output
--skip-health-check, -s Skip pre-flight health validation
--health-only Run the health check only, don't update (exit 0=pass / 2=warn / 3=block)
--yes, -y Assume yes: skip confirmation prompts (multi-match selection, warn-level health) for CI
--require-healthy Treat warn-level health findings as a hard stop (exit 2) instead of prompting
--skip-verify Skip post-roll verification (nodes ACTIVE, no new stuck pods)
--kubeconfig Kubeconfig for workload/PDB checks (defaults to $KUBECONFIG, then ~/.kube/config)
--poll-interval, -p Polling interval for update status (default 15s)
--timeout, -t Max time to wait for update completion (default 40m)
--format, -o table (default) or json (a JSON run summary)

Unattended / CI

Without a TTY and without --yes, a run that would otherwise prompt fails fast. For cron, pair --yes with --require-healthy and -o json.

Exit-code contract

nodegroup update returns a meaningful exit code so unattended runs can branch:

Code Meaning
0 Success — updates started/completed as expected
2 Health warnings (with --health-only or --require-healthy)
3 Health blocked — a pre-flight check failed; nothing was rolled
4 One or more nodegroup updates failed to start
5 Post-roll verification found issues (nodes not Ready / newly-stuck pods)

See Exit codes for the full reference and a CI case example.

Examples

# Preview a single nodegroup roll with the AMI release notes
refresh nodegroup update my-cluster ng-default --dry-run --changelog

# Roll one nodegroup, requiring a clean health gate
refresh nodegroup update -c prod -n ng-default --require-healthy

# Health gate only — no roll (CI readiness check)
refresh nodegroup update -c prod --health-only -o json

# Unattended cron patch with a JSON summary
refresh nodegroup update -c prod --yes --require-healthy -o json

# Fleet-wide dry-run, then execute
refresh nodegroup update --all-clusters -r us-east-1 -r us-west-2 --dry-run
refresh nodegroup update --all-clusters -r us-east-1 -r us-west-2 --yes