Securing Pulumi Secrets with AWS KMS and HashiCorp Vault

Production infrastructure demands cryptographic control over state files. Pulumi's default service-managed encryption lacks audit trails and cross-account portability. Migrating to AWS KMS or HashiCorp Vault enforces compliance boundaries. This guide—part of the AWS Provider Deep Dive within Pulumi Patterns & Provider Management—details atomic provider swaps, strict Python 3.9+ typing patterns, and state recovery workflows. It pairs naturally with managing multi-account AWS environments with Pulumi Python, where each account needs its own KMS key for cross-account state portability.

Environment Isolation & Python 3.9+ Baseline

Virtual Environment & Dependency Pinning

Infrastructure code requires deterministic dependency resolution. Floating versions introduce silent breaking changes during provider upgrades. Pin pulumi, boto3, and hvac in pyproject.toml or requirements.txt. Isolate each stack in a dedicated virtual environment.
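
A pinning policy can be checked mechanically before a stack is deployed. The sketch below is one hypothetical way to flag requirement lines that lack an exact == pin. The name find_unpinned and the simplified pattern are assumptions for illustration, and the check does not cover every pip requirement form (URLs, editable installs, hashes).

```python
import re

# Simplified pattern for "name==version", optionally with extras: "pkg[extra]==1.2.3".
_PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[A-Za-z0-9._,-]+\])?==[A-Za-z0-9.*+!_-]+$")


def find_unpinned(requirements_text: str) -> list[str]:
    """Return requirement lines that are not pinned to an exact version."""
    unpinned: list[str] = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options (-r, -e, ...)
            continue
        spec = line.split(";", 1)[0].replace(" ", "")  # ignore environment markers
        if not _PINNED.match(spec):
            unpinned.append(line)
    return unpinned
```

Running this over requirements.txt in CI and failing the build on a non-empty result keeps floating versions from reaching a deployment.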

CLI: Initialize and activate a clean Python environment.

python3.12 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Strict Type Checking with mypy

Dynamic typing obscures configuration resolution errors until deployment. Enforce mypy --strict in CI pipelines. Annotate all configuration loaders and resource constructors. Catch None propagation before the Pulumi engine evaluates the dependency graph.
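
The None-propagation point can be illustrated without Pulumi itself. In this minimal sketch, the hypothetical helper require_setting (not part of Pulumi) reads from a plain mapping that stands in for stack configuration. Under mypy --strict, returning the value without the guard would be rejected because Optional[str] is not str.

```python
from typing import Mapping, Optional


def require_setting(settings: Mapping[str, Optional[str]], key: str) -> str:
    """Return a non-empty setting or fail fast with a clear error."""
    value: Optional[str] = settings.get(key)
    if value is None or value == "":
        # Failing here stops a missing value from reaching a resource constructor.
        raise KeyError(f"missing required setting: {key}")
    # After the guard, mypy narrows the type from Optional[str] to str.
    return value
```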

IAM & Vault Auth Pre-Flight Checks

Authentication deadlocks halt stack operations mid-execution. Validate AWS IAM kms:Decrypt and kms:Encrypt permissions before initializing the provider. For Vault, verify AppRole or TLS certificate validity. Run a dry-run credential fetch to confirm network routing and policy attachment.
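
A dry-run credential check can be reduced to an encrypt/decrypt round trip on a throwaway value. The sketch below assumes the caller supplies two callables that wrap the real provider; round_trip_ok and its parameters are hypothetical names, not part of Pulumi, boto3, or hvac.

```python
from typing import Callable


def round_trip_ok(
    encrypt: Callable[[bytes], bytes],
    decrypt: Callable[[bytes], bytes],
    probe: bytes = b"pulumi-preflight",
) -> bool:
    """Return True only if the probe survives an encrypt/decrypt round trip."""
    try:
        return decrypt(encrypt(probe)) == probe
    except Exception:
        # Any failure (access denied, network error, expired token) fails the check.
        return False
```

With boto3, for example, the callables could wrap the KMS client's encrypt call (reading CiphertextBlob from the response) and decrypt call (reading Plaintext). A False result before pulumi up is cheaper to diagnose than a failure in the middle of an update.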

Migrating to AWS KMS Secrets Provider

CLI Provider Swap Command

State migration must remain atomic. The change-secrets-provider subcommand re-encrypts ciphertext values without altering resource URNs. Target a specific KMS alias, and pass awssdk=v2 so the provider uses version 2 of the AWS SDK for Go rather than the deprecated version 1.

CLI: Execute the atomic migration sequence.

pulumi stack change-secrets-provider "awskms://alias/pulumi-secrets-key?region=us-east-1&awssdk=v2"
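
When the same migration runs across several stacks or regions, assembling the provider URL in code avoids quoting and typo errors. This is a small sketch under that assumption; awskms_provider_url is a hypothetical helper, and it only reproduces the URL shape used in the command above.

```python
from urllib.parse import urlencode


def awskms_provider_url(alias: str, region: str, sdk_version: str = "v2") -> str:
    """Build an awskms:// secrets provider URL for a KMS key alias."""
    # Accept either "pulumi-secrets-key" or "alias/pulumi-secrets-key".
    name = alias if alias.startswith("alias/") else f"alias/{alias}"
    query = urlencode({"region": region, "awssdk": sdk_version})
    return f"awskms://{name}?{query}"
```

The result can be passed directly as the argument to pulumi stack change-secrets-provider.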

Typed Secret Retrieval in Python

Raw string interpolation bypasses Pulumi's secret masking engine. Keep sensitive values wrapped in pulumi.Output for their entire lifetime. config.require_secret() returns Output[str], and config.get_secret() returns Optional[Output[str]] (None when the key is unset); never assign either result to a plain str type annotation.

import pulumi
from typing import Dict

def get_db_credentials(config: pulumi.Config) -> Dict[str, pulumi.Output[str]]:
    """Retrieve typed database credentials with explicit secret wrapping.

    Both values are Output[str], not str, and are resolved asynchronously.
    Only "pass" is marked secret and appears as [secret] in pulumi preview
    output; "user" is a plain value lifted into an Output for a uniform type.
    """
    # Plain (non-secret) config value; raises if the key is missing.
    username: str = config.require("db_username")
    # Secret config value; stays wrapped so Pulumi can mask and encrypt it.
    password: pulumi.Output[str] = config.require_secret("db_password")

    return {
        "user": pulumi.Output.from_input(username),
        "pass": password,
    }

IAM Policy Scoping & Least Privilege

Broad KMS permissions violate zero-trust architectures. Scope policies to the specific key behind the alias, and grant only kms:Encrypt and kms:Decrypt (the permissions validated in the pre-flight checks) to the Pulumi CLI execution role. Cross-account decryption additionally requires a key policy statement or KMS grant in the key-owning account that allows the external role. Consult the AWS Provider Deep Dive for granular IAM policy templates and alias routing strategies.
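
As an illustration of least-privilege scoping, the sketch below renders an IAM policy document limited to kms:Encrypt and kms:Decrypt on a single key. The function name, the Sid, and the use of the kms:ResourceAliases condition key to tie the statement to an alias are assumptions for this example, not templates taken from the AWS Provider Deep Dive.

```python
import json


def pulumi_kms_policy(key_arn: str, alias: str) -> str:
    """Render an IAM policy allowing Encrypt/Decrypt on one aliased KMS key."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "PulumiSecretsProvider",
                "Effect": "Allow",
                "Action": ["kms:Encrypt", "kms:Decrypt"],
                # IAM policy statements identify KMS keys by key ARN, not by alias.
                "Resource": key_arn,
                # Assumption: also require that the key carries the expected alias.
                "Condition": {
                    "ForAnyValue:StringEquals": {"kms:ResourceAliases": alias}
                },
            }
        ],
    }
    return json.dumps(policy, indent=2)
```

Attach the rendered document only to the role that runs the Pulumi CLI, and review it against your organization's policy standards before use.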

Integrating HashiCorp Vault Secrets Provider

Vault Transit Engine Configuration

The transit backend provides encryption-as-a-service without persistent secret storage. Enable the transit secrets engine and create a dedicated named key for the stack. Configure key rotation policies to align with organizational compliance windows.

CLI: Provision the transit path and key.

vault secrets enable transit
vault write -f transit/keys/pulumi-stack type=aes256-gcm96

Token & Auth Method Mapping

Pulumi requires persistent authentication during stack operations. Map AppRole, TLS, or Kubernetes service accounts to the transit path. Align token TTLs with maximum deployment durations. Short-lived tokens trigger mid-apply 403 Forbidden failures.
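
TTL alignment can be checked before a deployment starts. The sketch below assumes the JSON printed by vault token lookup -format=json is passed in as a string and that its data object carries ttl (seconds remaining) and expire_time fields. The function name and the five-minute default margin are illustrative choices.

```python
import json


def token_ttl_sufficient(
    lookup_json: str, max_deploy_seconds: int, margin_seconds: int = 300
) -> bool:
    """Return True if the token should outlive the longest expected deployment."""
    data = json.loads(lookup_json)["data"]
    if data.get("expire_time") is None:
        # Assumption: a missing or null expire_time means the token does not expire.
        return True
    return int(data["ttl"]) >= max_deploy_seconds + margin_seconds
```

Failing the pipeline when this returns False surfaces an expiring token before the update begins rather than as a 403 partway through it.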

CLI: Once the transit engine is configured, switch the stack to use Vault transit as the secrets provider.

export VAULT_ADDR="https://vault.example.com"
export VAULT_TOKEN=""
pulumi stack change-secrets-provider "hashivault://pulumi-stack"

Python Fallback Typing Patterns

Dynamic secret resolution often requires conditional fallbacks. Use typing.Optional for values that may be absent. Validate secret presence before passing values to resource constructors.

from typing import Dict, Optional, Any
import pulumi

def resolve_vault_secrets(config: pulumi.Config) -> Dict[str, Any]:
    """Dynamically resolve Vault-backed secrets with safe fallback typing."""
    api_key: Optional[pulumi.Output[str]] = config.get_secret("vault_api_key")
    region: str = config.require("deployment_region")

    return {
        "api_key": api_key,
        "region": region,
        "fallback_enabled": api_key is not None,
    }

State Safety, Drift Detection & Safe Rollback

Pre-Migration State Snapshots

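A failed provider transition can leave secrets encrypted under a key the stack can no longer use. Export the current state before executing any migration command. The export keeps secret values encrypted unless --show-secrets is passed, so restoring it also requires access to the original secrets provider. Store snapshots in versioned, access-controlled artifact storage, and maintain immutable backups for compliance audits.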

CLI: Export stack state to a local artifact.

pulumi stack export --file state-pre-migration.json
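
A snapshot is only useful if it is complete and can later be shown to be unmodified. The sketch below parses the exported bytes, counts resources, and computes a SHA-256 digest to store alongside the artifact. It assumes the export is a JSON document with a top-level deployment object that holds a resources list; snapshot_fingerprint is a hypothetical helper.

```python
import hashlib
import json


def snapshot_fingerprint(raw: bytes) -> tuple[str, int]:
    """Return (sha256 hex digest, resource count) for an exported stack state."""
    document = json.loads(raw)  # raises ValueError if the export is truncated
    resources = document["deployment"].get("resources") or []
    return hashlib.sha256(raw).hexdigest(), len(resources)
```

For the file produced above, the input would be the bytes of state-pre-migration.json. Comparing the stored digest before a rollback confirms that the snapshot is the one taken before migration.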

Drift Detection via pulumi refresh

Post-migration state verification prevents silent configuration divergence. Run pulumi refresh to reconcile the stack's recorded state with live infrastructure. Review the diff output for unexpected resource changes or property resets.

Atomic State Import & Rollback

Decryption failures require immediate state restoration. Import the pre-migration snapshot to revert cryptographic bindings. Reference Pulumi Patterns & Provider Management for automated stack lifecycle governance and versioned state recovery pipelines.

CLI: Execute forced state rollback on failure.

pulumi stack import --file state-pre-migration.json --force
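
The export, swap, and rollback commands from this guide can be chained so that a failed swap restores the snapshot automatically. This is a minimal sketch, not a hardened pipeline: migrate_with_rollback and run_cli are hypothetical names, and the runner is injectable so the sequence can be tested without invoking the Pulumi CLI.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], int]


def run_cli(cmd: Sequence[str]) -> int:
    """Run a command and return its exit code."""
    return subprocess.run(list(cmd), check=False).returncode


def migrate_with_rollback(
    provider_url: str,
    snapshot: str = "state-pre-migration.json",
    run: Runner = run_cli,
) -> bool:
    """Export state, swap the secrets provider, and re-import on failure."""
    if run(["pulumi", "stack", "export", "--file", snapshot]) != 0:
        raise RuntimeError("export failed; the stack was not modified")
    if run(["pulumi", "stack", "change-secrets-provider", provider_url]) == 0:
        return True
    # The swap failed: restore the pre-migration snapshot.
    if run(["pulumi", "stack", "import", "--file", snapshot, "--force"]) != 0:
        raise RuntimeError("rollback import failed; restore the snapshot manually")
    return False
```

A return value of False means the swap failed and the snapshot was re-imported. An exception means manual intervention is needed.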

Testing Boundaries & Secret Masking Validation

pytest Isolation for IaC

Unit tests must never invoke live cloud providers. Isolate configuration parsing from resource provisioning logic. Mock the Pulumi config to simulate stack evaluation without network calls.

Mocking KMS/Vault Responses

Patch pulumi.Config using unittest.mock. Return deterministic values during test execution. Validate type coercion and error handling paths without exposing real credentials.
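
One way to apply this is to pass a mock object wherever the code expects a config. The sketch below uses only the standard library. load_settings is a hypothetical loader that mirrors the configuration keys used earlier, and make_config builds a stand-in exposing only require and get_secret. The stand-in raises KeyError for missing keys, which is an assumption that approximates the real behavior, not Pulumi's own exception type.

```python
from typing import Any, Mapping
from unittest.mock import MagicMock


def load_settings(config: Any) -> dict[str, Any]:
    """Hypothetical loader: pure config parsing, no resource construction."""
    return {
        "region": config.require("deployment_region"),
        "fallback_enabled": config.get_secret("vault_api_key") is not None,
    }


def make_config(values: Mapping[str, str]) -> MagicMock:
    """Build a config stand-in that returns deterministic test values."""
    config = MagicMock(spec_set=["require", "get_secret"])

    def _require(key: str) -> str:
        if key not in values:
            raise KeyError(key)  # stand-in for a missing-config error
        return values[key]

    config.require.side_effect = _require
    config.get_secret.side_effect = values.get
    return config
```

A pytest case then calls load_settings(make_config({...})) and asserts on the returned dictionary, so both the happy path and the missing-key path run without network access or real credentials.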

CLI Output Redaction Verification

Secret masking relies on Pulumi's internal serialization layer. Verify that pulumi preview and pulumi up outputs display [secret] placeholders for values retrieved via require_secret() or get_secret(). Do not attempt synchronous string operations on Output[str] objects—use .apply() for all transformations.
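
Redaction can also be verified automatically by scanning captured CLI output for known test values. The sketch below assumes the output of pulumi preview or pulumi up has already been captured as text and that the test uses dummy secrets. Both function names are hypothetical.

```python
from typing import Iterable


def find_leaked_secrets(cli_output: str, known_secrets: Iterable[str]) -> list[str]:
    """Return every known secret value that appears verbatim in the output."""
    return [secret for secret in known_secrets if secret and secret in cli_output]


def output_is_redacted(cli_output: str, known_secrets: Iterable[str]) -> bool:
    """True when no secret leaks and at least one [secret] placeholder is shown."""
    leaked = find_leaked_secrets(cli_output, known_secrets)
    return "[secret]" in cli_output and not leaked
```

Use dummy values only in such a check; real credentials should never be loaded into a test process just to confirm they are masked.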

Common Mistakes & Remediation

| Mistake | Remediation | Impact |
| --- | --- | --- |
| Using pulumi config set without --secret during migration | Always append --secret or enforce config.require_secret() in code. Verify ciphertext format in Pulumi.<stack>.yaml. | Plaintext secrets committed to VCS, triggering compliance violations and audit failures. |
| Skipping IAM policy scoping or Vault transit path validation | Apply least-privilege kms:Decrypt/kms:Encrypt or Vault transit/encrypt/* policies. Validate with aws kms describe-key or vault read transit/keys/pulumi-stack. | pulumi up fails with opaque AccessDenied or 403 Forbidden errors. |
| Assigning require_secret() result to str instead of Output[str] | Use Output[str] type annotation. Apply .apply() for any downstream string transformation. | mypy --strict rejects the annotation; at runtime, string operations on the Output raise TypeError or embed the object's string representation instead of the resolved value. |

Frequently Asked Questions

Can I migrate Pulumi secrets to KMS/Vault without recreating resources?

Yes. pulumi stack change-secrets-provider only re-encrypts state values. Resource IDs and URNs remain intact. Run pulumi preview after the swap to confirm that no resource changes are pending.

How does drift detection handle rotated KMS keys or Vault tokens?

Pulumi does not auto-detect key rotation. Implement CI/CD checks with pulumi refresh and monitor AWS CloudTrail or Vault audit logs for AccessDenied events during stack operations. Align token lifecycles with deployment windows.

What is the testing boundary for mocking KMS/Vault in Python IaC?

Use unittest.mock to patch pulumi.Config and return test values. Never mock the actual secrets provider. Test configuration resolution, type safety, and error propagation in strict isolation.

Conclusion

Migrating Pulumi secrets to KMS or Vault is a one-time atomic operation (change-secrets-provider) that brings key access policy and audit logging under your own control. The ongoing discipline (scoped IAM policies, token TTL alignment, pre-migration snapshots) is what keeps the encrypted state safe after migration. Invest in the validation workflow before executing the migration in production.