How to Deploy an EKS Cluster with Pulumi (Python)
Deploy a production-ready Amazon EKS cluster in typed Python using the pulumi_eks component, wiring managed node groups, OIDC/IRSA, and a kubeconfig output. This walk-through is part of the AWS Provider Deep Dive within Pulumi Patterns & Provider Management, and it assumes the credential and state foundations covered there.
EKS is one of the highest-value targets for Pulumi Python: an EKS cluster is dozens of interdependent resources (VPC, IAM roles, control plane, node groups, OIDC provider, security groups), and managing them through a typed program means you get IDE completion, mypy validation, and pytest coverage instead of a sprawling HCL module. The pulumi_eks package wraps the raw pulumi_aws resources into a single high-level Cluster component, so most of the wiring is handled for you while still letting you drop down to the underlying resources when needed.
Prerequisites
- Python 3.9+ with a virtual environment activated.
- pulumi>=3.100, pulumi-aws>=6.0, pulumi-eks>=2.0, and pulumi-kubernetes>=4.0 pinned in requirements.txt (or pyproject.toml), plus pulumi-awsx for the VPC in step 2.
- AWS credentials resolvable by the provider (an assumed role or OIDC session; never static keys committed to source).
- IAM permissions for the deploying principal covering eks:*, ec2:* (VPC/subnet/security-group), iam:CreateRole/AttachRolePolicy/CreateOpenIDConnectProvider, and autoscaling:*.
- kubectl installed locally to verify the EKS cluster after deployment.
- A stack initialized and the region set: pulumi stack init dev && pulumi config set aws:region us-east-1.
CLI: Verify the toolchain and credentials before deploying.
python -c "import pulumi_eks, pulumi_aws; print('ok')"
aws sts get-caller-identity
Implementation
1. Define typed cluster configuration
Drive the EKS cluster shape from a frozen dataclass so every knob is validated and discoverable, rather than passing loose keyword arguments. Pull secrets and environment-specific values from Pulumi config where appropriate.
# config.py
# CLI: pulumi config set eks:minNodes 2
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional

import pulumi


@dataclass(frozen=True)
class EksConfig:
    name: str
    instance_type: str = "t3.medium"
    min_size: int = 2
    max_size: int = 4
    desired_size: int = 2
    k8s_version: str = "1.30"
    tags: dict[str, str] = field(default_factory=dict)


def _int_or(value: Optional[int], default: int) -> int:
    """Use the default only when the key is unset, so an explicit 0 survives."""
    return default if value is None else value


def load_eks_config() -> EksConfig:
    """Build a typed EKS config from Pulumi stack configuration."""
    cfg = pulumi.Config("eks")
    return EksConfig(
        name=cfg.get("name") or "app",
        instance_type=cfg.get("instanceType") or "t3.medium",
        min_size=_int_or(cfg.get_int("minNodes"), 2),
        max_size=_int_or(cfg.get_int("maxNodes"), 4),
        desired_size=_int_or(cfg.get_int("desiredNodes"), 2),
        k8s_version=cfg.get("k8sVersion") or "1.30",
        tags={"managed-by": "pulumi", "stack": pulumi.get_stack()},
    )

# Provider note: the AWS region comes from `aws:region`, not this dataclass.
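The dataclass above stores the values but does not check them against each other. A minimal sketch, assuming you want to fail fast before any AWS call is made: a __post_init__ hook that rejects inconsistent sizes and malformed versions. The class name ValidatedEksConfig and the specific rules are illustrative assumptions, not part of the original program.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ValidatedEksConfig:
    """Hypothetical variant of EksConfig that validates itself on construction."""

    name: str
    min_size: int = 2
    max_size: int = 4
    desired_size: int = 2
    k8s_version: str = "1.30"

    def __post_init__(self) -> None:
        # Node counts must be non-negative and ordered min <= desired <= max.
        if not 0 <= self.min_size <= self.desired_size <= self.max_size:
            raise ValueError(
                f"expected 0 <= min <= desired <= max, got "
                f"{self.min_size}/{self.desired_size}/{self.max_size}"
            )
        # Assumed rule: versions are pinned as MAJOR.MINOR, for example "1.30".
        if not re.fullmatch(r"\d+\.\d+", self.k8s_version):
            raise ValueError(f"k8s_version must look like '1.30', got {self.k8s_version!r}")


ok_config = ValidatedEksConfig(name="app")

try:
    ValidatedEksConfig(name="app", min_size=5, max_size=4, desired_size=5)
    rejected = False
except ValueError:
    rejected = True
```

Because the check runs at construction time, a bad stack value surfaces during pulumi preview rather than partway through an update.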
2. Provision a VPC for the EKS cluster
EKS needs subnets across at least two Availability Zones. The awsx (crosswalk) package builds a best-practice VPC—public subnets for load balancers, private subnets for nodes—in a few lines. If you prefer a hand-rolled network, see Building a Reusable VPC Component in Pulumi (Python) for the underlying pattern.
# network.py
# CLI: pulumi preview
from __future__ import annotations

import pulumi_awsx as awsx


def create_cluster_vpc(name: str) -> awsx.ec2.Vpc:
    """Create a multi-AZ VPC with public and private subnets for EKS."""
    return awsx.ec2.Vpc(
        f"{name}-vpc",
        cidr_block="10.0.0.0/16",
        number_of_availability_zones=2,
        # State implication: subnet IDs are Outputs that resolve during
        # deployment. Do not treat them as plain Python lists; pass the
        # Output lists straight into the cluster.
        nat_gateways=awsx.ec2.NatGatewayConfigurationArgs(
            strategy=awsx.ec2.NatGatewayStrategy.SINGLE,
        ),
    )
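awsx chooses the subnet layout itself, so the ranges it actually allocates may differ from what is shown here. To reason about how much address space a 10.0.0.0/16 VPC leaves per subnet, this sketch uses only the standard library. The function name plan_subnets and the /20 subnet size are assumptions for illustration.

```python
from __future__ import annotations

import ipaddress


def plan_subnets(vpc_cidr: str, az_count: int, new_prefix: int = 20) -> dict[str, list[str]]:
    """Carve one public and one private block per AZ out of the VPC range."""
    network = ipaddress.ip_network(vpc_cidr)
    blocks = network.subnets(new_prefix=new_prefix)  # lazy generator of sub-blocks
    plan: dict[str, list[str]] = {"public": [], "private": []}
    for kind in ("public", "private"):
        for _ in range(az_count):
            plan[kind].append(str(next(blocks)))
    return plan


plan = plan_subnets("10.0.0.0/16", az_count=2)
addresses_per_subnet = ipaddress.ip_network(plan["private"][0]).num_addresses
```

Each /20 holds 4096 addresses, which matters on EKS because the default VPC CNI assigns every pod an IP from the node's subnet.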
3. Create the EKS cluster with managed node groups and IRSA
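The eks.Cluster component provisions the control plane, the node IAM roles, the OIDC provider, and a default node group. Setting create_oidc_provider=True is what makes IAM Roles for Service Accounts (IRSA) work: pods can then assume scoped IAM roles instead of inheriting the node's instance profile. Depending on the pulumi-eks version, the default node group may be a self-managed Auto Scaling group rather than an EKS managed node group; check the component documentation, and if you specifically need managed node groups, set skip_default_node_group=True and add an eks.ManagedNodeGroup.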
# __main__.py
# CLI: pulumi up
from __future__ import annotations

import pulumi
import pulumi_eks as eks

from config import load_eks_config
from network import create_cluster_vpc

conf = load_eks_config()
vpc = create_cluster_vpc(conf.name)

cluster = eks.Cluster(
    conf.name,
    vpc_id=vpc.vpc_id,
    public_subnet_ids=vpc.public_subnet_ids,
    private_subnet_ids=vpc.private_subnet_ids,
    # Run nodes in private subnets only; expose via load balancers.
    node_associate_public_ip_address=False,
    version=conf.k8s_version,
    instance_type=conf.instance_type,
    desired_capacity=conf.desired_size,
    min_size=conf.min_size,
    max_size=conf.max_size,
    # Provider note: this creates the OIDC provider required for IRSA.
    create_oidc_provider=True,
    tags=conf.tags,
)

# State implication: kubeconfig contains a short-lived exec credential, not a
# static token. It is recorded in state, so treat the stack as sensitive.
pulumi.export("kubeconfig", pulumi.Output.secret(cluster.kubeconfig))
pulumi.export("cluster_name", cluster.eks_cluster.name)
pulumi.export("oidc_provider_arn", cluster.core.oidc_provider.arn)
4. Attach an IRSA role to a service account (optional but recommended)
Once create_oidc_provider=True is set, you can mint IAM roles that specific Kubernetes service accounts assume. This keeps pod permissions least-privilege instead of granting the whole node group broad access.
# irsa.py
# CLI: pulumi up
from __future__ import annotations

import json

import pulumi
import pulumi_aws as aws
import pulumi_eks as eks


def create_irsa_role(
    name: str,
    cluster: eks.Cluster,
    namespace: str,
    service_account: str,
    policy_arn: str,
) -> aws.iam.Role:
    """Create an IAM role assumable by one Kubernetes service account."""
    oidc_arn = cluster.core.oidc_provider.arn
    oidc_url = cluster.core.oidc_provider.url
    # Both values are Outputs, so the policy JSON is built inside apply().
    assume_policy = pulumi.Output.all(oidc_arn, oidc_url).apply(
        lambda args: json.dumps(
            {
                "Version": "2012-10-17",
                "Statement": [
                    {
                        "Effect": "Allow",
                        "Principal": {"Federated": args[0]},
                        "Action": "sts:AssumeRoleWithWebIdentity",
                        "Condition": {
                            "StringEquals": {
                                f"{args[1]}:sub": (
                                    f"system:serviceaccount:{namespace}:{service_account}"
                                )
                            }
                        },
                    }
                ],
            }
        )
    )
    role = aws.iam.Role(f"{name}-irsa", assume_role_policy=assume_policy)
    aws.iam.RolePolicyAttachment(
        f"{name}-irsa-attach", role=role.name, policy_arn=policy_arn
    )
    return role
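The trust policy is the part of IRSA that is easiest to get subtly wrong, so it can help to build it in a pure function that is testable without Pulumi. The sketch below is one way to do that, not the component's own API. The name build_irsa_trust_policy is hypothetical, the added :aud condition is an extra hardening step beyond the code above, and stripping a leading https:// is a defensive assumption in case the issuer URL carries a scheme.

```python
from __future__ import annotations

import json


def build_irsa_trust_policy(
    oidc_arn: str, oidc_url: str, namespace: str, service_account: str
) -> str:
    """Return the assume-role policy JSON for one Kubernetes service account."""
    # Condition keys use the issuer host and path without the URL scheme.
    scheme = "https://"
    issuer = oidc_url[len(scheme):] if oidc_url.startswith(scheme) else oidc_url
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Federated": oidc_arn},
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        f"{issuer}:sub": f"system:serviceaccount:{namespace}:{service_account}",
                        # Assumed addition: restrict the token audience to STS.
                        f"{issuer}:aud": "sts.amazonaws.com",
                    }
                },
            }
        ],
    }
    return json.dumps(policy)


# Placeholder identifiers for illustration only.
example_policy = build_irsa_trust_policy(
    "arn:aws:iam::123456789012:oidc-provider/oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE",
    "https://oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE",
    namespace="default",
    service_account="app",
)
```

Inside create_irsa_role, the same function could be called from the apply() lambda in place of the inline json.dumps.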
Verification
Confirm the EKS cluster is reachable and the node group is healthy. Pulumi exports the kubeconfig as a secret; write it to a temporary file to drive kubectl.
CLI: Pull the kubeconfig from the stack and check node status.
pulumi stack output kubeconfig --show-secrets > /tmp/kubeconfig.json
KUBECONFIG=/tmp/kubeconfig.json kubectl get nodes -o wide
KUBECONFIG=/tmp/kubeconfig.json kubectl get pods -A
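Before pointing kubectl at the file, it can be worth confirming that the kubeconfig authenticates through an exec plugin rather than an embedded token. The sketch below relies only on the standard kubeconfig layout (clusters, users, user.exec). The function name summarize_kubeconfig and the sample document are illustrative assumptions, not output captured from a real stack.

```python
from __future__ import annotations

import json
from typing import Any


def summarize_kubeconfig(raw: str) -> dict[str, Any]:
    """Report API servers and whether users authenticate via an exec plugin."""
    doc = json.loads(raw)
    servers = [entry["cluster"]["server"] for entry in doc.get("clusters", [])]
    users = [entry.get("user", {}) for entry in doc.get("users", [])]
    return {
        "servers": servers,
        "exec_auth": bool(users) and all("exec" in user for user in users),
        "static_token": any("token" in user for user in users),
    }


# Hand-written sample in the standard kubeconfig shape.
sample = json.dumps(
    {
        "apiVersion": "v1",
        "kind": "Config",
        "clusters": [
            {"name": "kubernetes", "cluster": {"server": "https://EXAMPLE.eks.amazonaws.com"}}
        ],
        "users": [
            {"name": "aws", "user": {"exec": {"command": "aws", "args": ["eks", "get-token"]}}}
        ],
    }
)
summary = summarize_kubeconfig(sample)
```

In practice you would read /tmp/kubeconfig.json and pass its contents to the function.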
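For an automated check, run a test under pulumi.runtime mocks so it executes in CI without touching AWS. The test shown here validates the typed configuration that drives the EKS cluster shape; assertions on the cluster resource itself follow the same pattern.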
# tests/test_eks.py
# CLI: pytest tests/test_eks.py -v
from __future__ import annotations

from typing import Any

import pulumi
import pytest


class _Mocks(pulumi.runtime.Mocks):
    """Return deterministic fake IDs and echo resource inputs back as state."""

    def new_resource(self, args: pulumi.runtime.MockResourceArgs) -> tuple[str, dict[str, Any]]:
        return (f"{args.name}-id", {**args.inputs})

    def call(self, args: pulumi.runtime.MockCallArgs) -> dict[str, Any]:
        return {}


@pytest.fixture(autouse=True)
def _set_mocks() -> None:
    pulumi.runtime.set_mocks(_Mocks(), preview=False)


# Nothing is awaited here, so a plain synchronous test is enough and
# avoids a dependency on pytest-asyncio.
def test_cluster_version_is_pinned() -> None:
    """Assert the config pins a Kubernetes version and has consistent sizes."""
    from config import load_eks_config

    conf = load_eks_config()
    assert conf.k8s_version, "Kubernetes version must be pinned, never default"
    assert conf.min_size <= conf.desired_size <= conf.max_size
Gotchas & Edge Cases
Treating subnet outputs as plain lists fails at runtime. vpc.private_subnet_ids is an Output[list[str]], not a Python list, and its value is not resolved while your program runs. Iterating over it or calling len() on it raises a TypeError, and indexing such as vpc.private_subnet_ids[0] yields another Output rather than a str, so it cannot be used where a plain string is required. Pass the whole Output straight into eks.Cluster, or transform it with .apply().
Forgetting create_oidc_provider breaks IRSA silently. Without the OIDC provider, pods fall back to the node instance role and your scoped IAM roles are simply never assumable. There is no error at deploy time; you only discover it when a pod gets AccessDenied. Always enable it up front if any workload needs AWS permissions.
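One way to catch this earlier than an AccessDenied is a startup check inside the workload. On EKS, a pod whose service account carries the eks.amazonaws.com/role-arn annotation receives the AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE environment variables. The helper below is a hypothetical sketch that reports which of them are missing.

```python
from __future__ import annotations

from typing import Mapping

# Environment variables injected into pods that use IRSA.
IRSA_ENV_VARS = ("AWS_ROLE_ARN", "AWS_WEB_IDENTITY_TOKEN_FILE")


def missing_irsa_env(env: Mapping[str, str]) -> list[str]:
    """Return the IRSA variables that are absent or empty in the given environment."""
    return [name for name in IRSA_ENV_VARS if not env.get(name)]


# Illustrative environments; a real check would pass os.environ.
configured = {
    "AWS_ROLE_ARN": "arn:aws:iam::123456789012:role/app-irsa",
    "AWS_WEB_IDENTITY_TOKEN_FILE": "/var/run/secrets/eks.amazonaws.com/serviceaccount/token",
}
unconfigured = {"HOME": "/root"}

missing_when_configured = missing_irsa_env(configured)
missing_when_unconfigured = missing_irsa_env(unconfigured)
```

An empty result does not prove the role is assumable, only that the injection happened; a non-empty result is a strong hint that the pod is falling back to the node role.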
The kubeconfig is sensitive and lives in state. cluster.kubeconfig embeds an aws eks get-token exec credential and cluster CA data. Export it wrapped in pulumi.Output.secret() and ensure your state backend is encrypted—see Securing Pulumi secrets with AWS KMS and HashiCorp Vault for KMS-backed state encryption.
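The same care applies once the kubeconfig leaves state: a file created by shell redirection typically inherits default permissions that other local users can read. A minimal sketch, assuming a POSIX system, that writes the file as owner-only; write_private_file is a hypothetical helper name.

```python
from __future__ import annotations

import os
import stat
import tempfile


def write_private_file(path: str, content: str) -> None:
    """Write content to path with owner-only read/write permissions (0600)."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(content)
    # os.open() does not tighten the mode of a file that already existed.
    os.chmod(path, 0o600)


with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "kubeconfig.json")
    write_private_file(target, '{"kind": "Config"}')
    mode = stat.S_IMODE(os.stat(target).st_mode)
    with open(target, encoding="utf-8") as handle:
        written = handle.read()
```

The content would come from pulumi stack output kubeconfig --show-secrets, captured by whatever script drives the verification.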
Frequently Asked Questions
Should I use pulumi_eks or build the EKS cluster from raw pulumi_aws resources?
Start with pulumi_eks. It encapsulates the control plane, node IAM roles, OIDC provider, and node group with sensible defaults, which removes most of the boilerplate and the easy-to-miss security-group wiring. Drop to raw pulumi_aws.eks resources only when you need a configuration the component does not expose, such as a fully custom launch template.
How do I roll the Kubernetes version without recreating the EKS cluster?
Bump version (and the node group's version) and run pulumi up. EKS performs an in-place control-plane upgrade, then node groups are replaced on a rolling basis. Always run pulumi preview first to confirm the control plane is updated rather than replaced, and upgrade one minor version at a time.
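The one-minor-version-at-a-time rule is easy to enforce before pulumi up ever runs. The following sketch assumes versions are written as MAJOR.MINOR strings, as in the k8sVersion config key; the function names are hypothetical.

```python
from __future__ import annotations


def parse_version(version: str) -> tuple[int, int]:
    """Split a 'MAJOR.MINOR' string such as '1.30' into integers."""
    major, minor = version.split(".")
    return int(major), int(minor)


def is_single_minor_upgrade(current: str, target: str) -> bool:
    """True only when target is exactly one minor version above current."""
    cur_major, cur_minor = parse_version(current)
    tgt_major, tgt_minor = parse_version(target)
    return cur_major == tgt_major and tgt_minor - cur_minor == 1


one_step = is_single_minor_upgrade("1.29", "1.30")
two_steps = is_single_minor_upgrade("1.28", "1.30")
downgrade = is_single_minor_upgrade("1.30", "1.29")
```

A guard like this could run in CI against the previous and proposed stack configuration, failing the pipeline when someone skips a minor version.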
Can I deploy this into a specific AWS account using an assumed role?
Yes. Instantiate an aws.Provider with assume_role and pass it via opts=pulumi.ResourceOptions(provider=...), exactly as described in managing multi-account AWS environments with Pulumi Python. Keep one stack per account so state stays isolated.
Why does kubectl time out even though pulumi up succeeded?
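Start by separating two failure modes. If kubectl cannot reach the API server at all, check that the cluster is ACTIVE and that its endpoint is reachable from your network (public access enabled, or a connection into the VPC when only private access is enabled); pulumi stack output plus aws eks describe-cluster will confirm the endpoint status. If the API responds but nodes never become Ready, the most common cause is nodes in private subnets without a route to the EKS control plane or to ECR: confirm the NAT gateway exists and that the EKS cluster security group allows node-to-control-plane traffic on port 443.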
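The describe-cluster check can be scripted. This sketch inspects the JSON returned by aws eks describe-cluster --name <cluster> --output json and flags conditions that would keep a local kubectl from connecting. The function name diagnose_endpoint and the sample payloads are illustrative; only the status, endpoint, and resourcesVpcConfig.endpointPublicAccess fields are read.

```python
from __future__ import annotations

from typing import Any


def diagnose_endpoint(description: dict[str, Any]) -> list[str]:
    """Return human-readable findings that could explain a kubectl timeout."""
    cluster = description.get("cluster", {})
    vpc_config = cluster.get("resourcesVpcConfig", {})
    findings: list[str] = []
    if cluster.get("status") != "ACTIVE":
        findings.append(f"cluster status is {cluster.get('status')!r}, expected 'ACTIVE'")
    if not cluster.get("endpoint"):
        findings.append("no API endpoint reported yet")
    if not vpc_config.get("endpointPublicAccess", False):
        findings.append("public endpoint access is disabled; kubectl must run inside the VPC")
    return findings


# Hand-written payloads shaped like describe-cluster output.
healthy = {
    "cluster": {
        "status": "ACTIVE",
        "endpoint": "https://EXAMPLE.eks.amazonaws.com",
        "resourcesVpcConfig": {"endpointPublicAccess": True},
    }
}
private_only = {
    "cluster": {
        "status": "ACTIVE",
        "endpoint": "https://EXAMPLE.eks.amazonaws.com",
        "resourcesVpcConfig": {"endpointPublicAccess": False},
    }
}

healthy_findings = diagnose_endpoint(healthy)
private_findings = diagnose_endpoint(private_only)
```

An empty list only rules out these endpoint-level causes; node routing and security-group problems still need to be checked separately.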
Related
- AWS Provider Deep Dive — the parent guide on provider initialization, credential routing, and state isolation.
- Managing multi-account AWS environments with Pulumi Python — deploy the EKS cluster into an isolated account with an assume-role provider.
- Securing Pulumi secrets with AWS KMS and HashiCorp Vault — encrypt the kubeconfig and workload secrets at rest in state.