Choosing a State Backend for Python IaC

Picking the wrong state backend locks your team into the wrong operational model: manual locking, no audit trail, or a single point of failure. This guide compares S3+DynamoDB, GCS, Terraform Cloud, Pulumi Cloud, and self-managed options against concrete criteria. It is part of Managing IaC State for Python Projects within Python IaC Fundamentals & Strategy.

The backend decision is hard to reverse cheaply once dozens of stacks depend on it, so make it deliberately. The right choice depends on which cloud you already run in, whether you want managed locking, and how much you value a hosted UI versus owning the storage.

Context

State backends differ on five axes that matter day to day: where the file lives, how locking is enforced, how secrets are encrypted, whether there is a hosted policy/UI layer, and the operational burden of running it. CDKTF state backends are standard Terraform backends, so the CDKTF-specific wiring is covered in State Backend Configuration for CDKTF; Pulumi backends are selected with pulumi login and organized as described in Pulumi Stack Architecture.

Prerequisites

  • A target cloud account (AWS, GCP) or a Pulumi Cloud / Terraform Cloud organization.
  • Python 3.9+ with the pulumi CLI or cdktf CLI plus the Terraform binary.
  • IAM permissions to create the storage and lock primitives (S3 bucket + DynamoDB table, or GCS bucket).
  • A decision on whether locking must be automatic (managed backends) or self-operated.

Decision Table

| Backend | Used by | Locking | Encryption | Hosted UI / policy | Operational burden |
|---|---|---|---|---|---|
| S3 + DynamoDB | CDKTF/Terraform, Pulumi | DynamoDB lock item | SSE-KMS | No | You run bucket + table |
| GCS | CDKTF/Terraform, Pulumi | Native object locking | CMEK | No | You run the bucket |
| Terraform Cloud | CDKTF/Terraform | Built-in | Managed | Yes (runs, policy) | Lowest (hosted) |
| Pulumi Cloud | Pulumi | Built-in | Managed (per-secret) | Yes (history, RBAC) | Lowest (hosted) |
| Self-managed (local/HTTP) | Either | Manual / none | Your responsibility | No | Highest |
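
The table can also be read as a small set of rules. The following is a minimal sketch of that reading, not a definitive selector: the `Needs` fields and the `suggest_backend` function are hypothetical names introduced here, and the rule order (hosted UI first, then cloud) is an assumption about how a team might weigh the criteria.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Needs:
    tool: str              # "pulumi" or "cdktf" (hypothetical labels)
    cloud: str             # "aws", "gcp", or anything else
    wants_hosted_ui: bool  # True if run history / RBAC / policy UI is required


def suggest_backend(needs: Needs) -> str:
    """Return the decision-table row name that matches the stated needs."""
    if needs.wants_hosted_ui:
        # Only the hosted rows offer a UI; the tool decides which one applies.
        return "Pulumi Cloud" if needs.tool == "pulumi" else "Terraform Cloud"
    if needs.cloud == "aws":
        return "S3 + DynamoDB"
    if needs.cloud == "gcp":
        return "GCS"
    # No managed primitives assumed: the team owns locking and encryption.
    return "Self-managed (local/HTTP)"


print(suggest_backend(Needs(tool="cdktf", cloud="aws", wants_hosted_ui=False)))
# prints: S3 + DynamoDB
```

Treat the result as a starting point for discussion; the encryption and operational-burden columns still need a human decision.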

Implementation

1. Configure an S3 + DynamoDB backend (AWS teams)

If your workloads already run in AWS, S3 with a DynamoDB lock table keeps state in the same trust boundary and is the cheapest managed-locking option.

# Provision once; reuse across stacks with a per-environment key.
aws s3api create-bucket --bucket my-iac-state --region us-east-1
aws s3api put-bucket-versioning --bucket my-iac-state \
  --versioning-configuration Status=Enabled
aws dynamodb create-table --table-name iac-locks \
  --attribute-definitions AttributeName=LockID,AttributeType=S \
  --key-schema AttributeName=LockID,KeyType=HASH --billing-mode PAY_PER_REQUEST

The backend block that points CDKTF at this bucket and table can be built in Python:

from dataclasses import dataclass

@dataclass(frozen=True)
class S3Backend:
    bucket: str
    region: str
    lock_table: str

def render(backend: S3Backend, env: str) -> dict[str, object]:
    # CLI Context: cdktf synth, then terraform -chdir=cdktf.out/stacks/<ns> init
    # State implication: DynamoDB enforces the lock; `encrypt` keeps state ciphertext at rest.
    return {"s3": {
        "bucket": backend.bucket,
        "key": f"iac/{env}/terraform.tfstate",
        "region": backend.region,
        "dynamodb_table": backend.lock_table,
        "encrypt": True,
    }}
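
To see how the per-environment key keeps stacks apart, the sketch below builds the same S3 backend settings for three environments and prints one of them in Terraform's JSON configuration shape (`terraform.backend.s3`). The helper names `s3_backend_settings` and `to_terraform_json` are hypothetical, the bucket and table names are the placeholders used above, and the output `cdktf synth` actually writes may contain additional fields.

```python
import json


def s3_backend_settings(bucket: str, region: str, lock_table: str, env: str) -> dict[str, object]:
    # Hypothetical helper: same fields as the S3 backend block shown above.
    return {
        "bucket": bucket,
        "key": f"iac/{env}/terraform.tfstate",
        "region": region,
        "dynamodb_table": lock_table,
        "encrypt": True,
    }


def to_terraform_json(settings: dict[str, object]) -> str:
    # Terraform's JSON syntax nests a backend under terraform.backend.<type>.
    return json.dumps({"terraform": {"backend": {"s3": settings}}}, indent=2)


envs = ["dev", "staging", "prod"]
configs = {
    env: s3_backend_settings("my-iac-state", "us-east-1", "iac-locks", env)
    for env in envs
}

# Every environment shares the bucket and lock table but owns a distinct state object.
keys = [str(cfg["key"]) for cfg in configs.values()]
assert len(set(keys)) == len(keys)

print(to_terraform_json(configs["dev"]))
```

Sharing one bucket and one lock table across environments keeps the infrastructure small, while the distinct keys mean a lock on one environment's state does not block the others.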

2. Select a managed backend for Pulumi

For teams that want zero locking infrastructure and a history UI, Pulumi Cloud is the default; an object store works when you want to own the bytes.

# Hosted managed backend (automatic locking, RBAC, history):
pulumi login
# Or own the storage in S3 (Pulumi keeps its lock files in the same bucket):
pulumi login s3://my-iac-state
# Provider note: switching backends later requires a stack export/import migration.
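
In CI it can help to derive the login command from configuration instead of hard-coding it. The sketch below builds the `pulumi login` argument list for the backends discussed here; `pulumi_login_args` and the `kind` labels are hypothetical names, and the bucket name is the placeholder used above.

```python
import shlex


def pulumi_login_args(kind: str, bucket: str = "") -> list[str]:
    """Build the `pulumi login` command line for a backend kind."""
    if kind == "cloud":
        # No URL argument: the CLI logs in to Pulumi Cloud by default.
        return ["pulumi", "login"]
    if kind in ("s3", "gcs"):
        if not bucket:
            raise ValueError("an object-store backend needs a bucket name")
        scheme = "s3" if kind == "s3" else "gs"
        return ["pulumi", "login", f"{scheme}://{bucket}"]
    raise ValueError(f"unknown backend kind: {kind!r}")


print(shlex.join(pulumi_login_args("s3", "my-iac-state")))
# prints: pulumi login s3://my-iac-state
```

The returned list can be passed to `subprocess.run(args, check=True)` so that a failed login stops the pipeline before any stack operation runs.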

3. Choose Terraform Cloud for CDKTF policy/runs

If you want remote execution, run history, and Sentinel/OPA policy gating without self-hosting, Terraform Cloud (renamed HCP Terraform in 2024) is the managed CDKTF backend.

from constructs import Construct
from cdktf import TerraformStack

class CloudBackedStack(TerraformStack):
    def __init__(self, scope: Construct, ns: str) -> None:
        super().__init__(scope, ns)
        # State implication: Terraform Cloud owns the state and the lock; runs can
        # execute remotely so local credentials never touch prod state directly.
        self.add_override("terraform.backend", {"remote": {
            "hostname": "app.terraform.io",
            "organization": "my-org",
            "workspaces": {"name": ns},
        }})
        # CLI Context: cdktf synth && cdktf deploy

Verification

# Confirm the backend is remote (not local) and a state object exists.
pulumi whoami -v                    # verbose output includes the active Pulumi backend URL
terraform -chdir=cdktf.out/stacks/dev state list   # non-empty => remote state populated
aws s3api head-object --bucket my-iac-state --key iac/dev/terraform.tfstate
# A successful head-object proves state landed in S3 rather than on local disk.
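
The "remote, not local" check can be automated once you have the backend URL the Pulumi CLI reports. The following is a minimal sketch that classifies a URL by its scheme; `describe_backend` and `is_remote` are hypothetical helper names, and the scheme list covers only the backends discussed in this guide.

```python
from urllib.parse import urlparse

# URL schemes for the backends covered in this guide.
REMOTE_SCHEMES = {
    "https": "hosted service",
    "s3": "S3 object store",
    "gs": "GCS object store",
}


def describe_backend(url: str) -> str:
    """Classify a backend URL as local, a known remote kind, or unknown."""
    scheme = urlparse(url).scheme
    if scheme == "file":
        return "local"
    return REMOTE_SCHEMES.get(scheme, "unknown")


def is_remote(url: str) -> bool:
    """True only for URLs that match a known remote backend scheme."""
    return describe_backend(url) not in ("local", "unknown")


for url in ("https://api.pulumi.com", "s3://my-iac-state", "file://~"):
    print(url, "->", describe_backend(url))
```

A CI job could fail fast when `is_remote` returns False, which catches a runner that silently fell back to local state.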

Gotchas & Edge Cases

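S3 without a DynamoDB table has no locking. An S3 backend missing dynamodb_table will happily allow concurrent writes, and whichever run finishes last overwrites the other, leaving state that no longer matches the real infrastructure. The exception is S3-native lock files (use_lockfile), available in Terraform 1.10 and later. Otherwise, always pair S3 with a lock table, and verify that the table name matches exactly and that its partition key is named LockID.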
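
A missing lock setting is easy to catch before `terraform init` ever runs. The sketch below inspects a backend block shaped like `{"s3": {...}}` and reports when no lock is configured; `lock_problems` is a hypothetical helper, and accepting `use_lockfile` (the S3-native locking option in newer Terraform releases) is an assumption that goes beyond the DynamoDB setup described in this guide.

```python
def lock_problems(backend: dict[str, dict[str, object]]) -> list[str]:
    """Return problems with an S3 backend's locking; an empty list means a lock is set."""
    s3 = backend.get("s3")
    if s3 is None:
        return []  # not an S3 backend, so this check does not apply

    table = s3.get("dynamodb_table")
    has_table = isinstance(table, str) and bool(table.strip())
    # Assumption: S3-native lock files also count as locking.
    has_lockfile = s3.get("use_lockfile") is True

    if has_table or has_lockfile:
        return []
    return ["s3 backend has no dynamodb_table (or use_lockfile): writes are not locked"]


print(lock_problems({"s3": {"bucket": "my-iac-state", "key": "iac/dev/terraform.tfstate"}}))
print(lock_problems({"s3": {"bucket": "my-iac-state", "dynamodb_table": "iac-locks"}}))
```

Running a check like this in a unit test for your stack code turns a silent misconfiguration into a failed build. It does not confirm that the named table exists, so keep the exact-name check against AWS as well.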

Pulumi Cloud secrets vs object-backend secrets. Pulumi Cloud encrypts secrets with a managed per-stack key; an S3/GCS Pulumi backend needs a passphrase or KMS key you supply. Losing that passphrase makes encrypted secrets unrecoverable.

Region mismatch causes silent latency, not errors. A state bucket in a distant region adds round-trip latency to every plan and apply without raising any error, so the slowdown is easy to miss. Co-locate the backend with the CI runners and the team.

Frequently Asked Questions

Can I use one S3 bucket for both Pulumi and CDKTF? Yes, with different key prefixes — Pulumi writes its checkpoint format and CDKTF writes Terraform .tfstate. Keep the prefixes distinct so the two tools never read each other's objects.

Is a hosted backend worth the cost over S3? If you need RBAC, run history, audit, and policy gates, the hosted backends remove real operational work. For a small team already in AWS, S3+DynamoDB is cheaper and sufficient.

How do I move off a backend I regret choosing? Use an export/import migration with verification, detailed in How to Migrate IaC State Between Backends. Plan for a brief freeze on deploys during cutover.

Does GCS support locking like DynamoDB? Yes. Terraform's GCS backend locks state natively by writing a lock object next to the state file in the bucket, so you do not need a separate lock table the way S3 does.