Securing Pulumi Secrets with AWS KMS and HashiCorp Vault
Production infrastructure demands cryptographic control over state files. Pulumi's default service-managed encryption lacks audit trails and cross-account portability. Migrating to AWS KMS or HashiCorp Vault enforces compliance boundaries. This guide—part of the AWS Provider Deep Dive within Pulumi Patterns & Provider Management—details atomic provider swaps, strict Python 3.9+ typing patterns, and state recovery workflows. It pairs naturally with managing multi-account AWS environments with Pulumi Python, where each account needs its own KMS key for cross-account state portability.
Environment Isolation & Python 3.9+ Baseline
Virtual Environment & Dependency Pinning
Infrastructure code requires deterministic dependency resolution. Floating versions introduce silent breaking changes during provider upgrades. Pin pulumi, boto3, and hvac in pyproject.toml or requirements.txt. Isolate each stack in a dedicated virtual environment.
CLI: Initialize and activate a clean Python environment.
    python3.12 -m venv .venv
    source .venv/bin/activate
    pip install -r requirements.txt
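One way to keep floating versions out of a stack is to check the requirements file before installing. The sketch below is an assumption, not part of Pulumi or pip: `find_unpinned` is a hypothetical helper that flags any requirement line not pinned with `==`.

```python
import re
from typing import List

# Matches "name==version", optionally with extras such as "name[extra]==version".
_PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]+\])?==[^=\s]+")


def find_unpinned(requirements_text: str) -> List[str]:
    """Return requirement lines that are not pinned with '=='."""
    unpinned: List[str] = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if not _PINNED.match(line):
            unpinned.append(line)
    return unpinned
```

A CI step could read requirements.txt, call `find_unpinned`, and fail the build if the returned list is not empty.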
Strict Type Checking with mypy
Dynamic typing obscures configuration resolution errors until deployment. Enforce mypy --strict in CI pipelines. Annotate all configuration loaders and resource constructors. Catch None propagation before the Pulumi engine evaluates the dependency graph.
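As a minimal sketch of what an annotated loader can look like, the example below narrows `Optional[str]` values to `str` before building a settings object. `StackSettings` and `load_settings` are hypothetical names, and the plain mapping stands in for whatever configuration source the stack uses.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class StackSettings:
    region: str
    kms_alias: str


def load_settings(raw: Mapping[str, Optional[str]]) -> StackSettings:
    """Narrow Optional[str] values to str before constructing settings."""
    region = raw.get("region")
    kms_alias = raw.get("kms_alias")
    # Without these checks, mypy reports an incompatible argument type,
    # because StackSettings expects str rather than Optional[str].
    if region is None:
        raise ValueError("missing required config key: region")
    if kms_alias is None:
        raise ValueError("missing required config key: kms_alias")
    return StackSettings(region=region, kms_alias=kms_alias)
```

The explicit `None` checks are what let the type checker prove that no `None` reaches the constructor.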
IAM & Vault Auth Pre-Flight Checks
Authentication deadlocks halt stack operations mid-execution. Validate AWS IAM kms:Decrypt and kms:Encrypt permissions before initializing the provider. For Vault, verify AppRole or TLS certificate validity. Run a dry-run credential fetch to confirm network routing and policy attachment.
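A pre-flight check for the KMS side could be sketched as an encrypt/decrypt round trip. The function name is hypothetical, and the client is passed in so the check can be exercised with a fake; in practice it is assumed to be a `boto3.client("kms")` object.

```python
from typing import Any


def kms_round_trip_ok(kms_client: Any, key_id: str) -> bool:
    """Encrypt and decrypt a probe value to exercise both permissions.

    kms_client is assumed to behave like boto3.client("kms"). Permission
    errors raised by the client propagate to the caller unchanged.
    """
    probe = b"pulumi-preflight"
    encrypted = kms_client.encrypt(KeyId=key_id, Plaintext=probe)
    decrypted = kms_client.decrypt(
        KeyId=key_id, CiphertextBlob=encrypted["CiphertextBlob"]
    )
    return bool(decrypted["Plaintext"] == probe)
```

Running this with the same role that the Pulumi CLI will assume exercises both kms:Encrypt and kms:Decrypt before any stack operation starts.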
Migrating to AWS KMS Secrets Provider
CLI Provider Swap Command
State migration must remain atomic. The change-secrets-provider subcommand re-encrypts ciphertext values without altering resource URNs. Target a specific KMS alias and specify the AWS SDK version to avoid legacy API deprecation.
CLI: Execute the atomic migration sequence.
pulumi stack change-secrets-provider "awskms://alias/pulumi-secrets-key?region=us-east-1&awssdk=v2"
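When the same command runs across several stacks, building the provider URL in code avoids typos in the query string. This is only a sketch: `awskms_provider_url` is a hypothetical helper that reproduces the URL shape shown above.

```python
from urllib.parse import urlencode


def awskms_provider_url(alias: str, region: str, sdk: str = "v2") -> str:
    """Build an awskms:// secrets provider URL for a KMS key alias."""
    query = urlencode({"region": region, "awssdk": sdk})
    return f"awskms://alias/{alias}?{query}"
```

Calling `awskms_provider_url("pulumi-secrets-key", "us-east-1")` yields the same string passed to the CLI in the command above.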
Typed Secret Retrieval in Python
Raw string interpolation bypasses Pulumi's secret masking engine. Wrap sensitive values in pulumi.Output types immediately. config.get_secret() and config.require_secret() both return Output[str]—never assign these to a plain str type annotation.
    from typing import Dict

    import pulumi


    def get_db_credentials(config: pulumi.Config) -> Dict[str, pulumi.Output[str]]:
        """Retrieve typed database credentials with explicit secret wrapping.

        Both return values are Output[str], not str, and are resolved
        asynchronously. Only "pass" is marked secret and appears as [secret]
        in pulumi preview output; "user" is a plain, non-secret Output.
        """
        # Non-secret value: require() returns a plain str.
        username: str = config.require("db_username")
        # Secret value: require_secret() returns an Output[str] flagged as secret.
        password: pulumi.Output[str] = config.require_secret("db_password")
        return {
            "user": pulumi.Output.from_input(username),
            "pass": password,
        }
IAM Policy Scoping & Least Privilege
Broad KMS permissions violate zero-trust architectures. Scope policies to specific key aliases and restrict kms:GenerateDataKey to the Pulumi CLI execution role. Cross-account decryption requires explicit grant propagation. Consult the AWS Provider Deep Dive for granular IAM policy templates and alias routing strategies.
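A scoped policy can be generated in code so that each stack role gets exactly one key. The sketch below is illustrative only: `scoped_kms_policy` is a hypothetical helper, and the key ARN in the usage note is a placeholder, not a real key.

```python
import json
from typing import Any, Dict


def scoped_kms_policy(key_arn: str) -> Dict[str, Any]:
    """Return an IAM policy document limited to a single KMS key."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "PulumiSecretsProvider",
                "Effect": "Allow",
                # Only the actions named in this guide; no kms:* wildcard.
                "Action": ["kms:Encrypt", "kms:Decrypt", "kms:GenerateDataKey"],
                "Resource": key_arn,
            }
        ],
    }


def policy_json(key_arn: str) -> str:
    """Serialize the policy for attachment to the CLI execution role."""
    return json.dumps(scoped_kms_policy(key_arn), indent=2)
```

For example, `policy_json("arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID")` produces a document that can be attached to the role running the Pulumi CLI.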
Integrating HashiCorp Vault Secrets Provider
Vault Transit Engine Configuration
The transit backend provides encryption-as-a-service without persistent secret storage. Enable the transit secrets engine and generate a dedicated keyring. Configure key rotation policies to align with organizational compliance windows.
CLI: Provision the transit path and key.
    vault secrets enable transit
    vault write -f transit/keys/pulumi-stack type=aes256-gcm96
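After provisioning the key, a round trip through the transit engine confirms that the path and policy work end to end. This sketch assumes a client that behaves like an authenticated `hvac.Client`; the function name is hypothetical.

```python
import base64
from typing import Any


def transit_round_trip_ok(vault_client: Any, key_name: str) -> bool:
    """Encrypt and decrypt a probe value through Vault transit.

    vault_client is assumed to behave like an authenticated hvac.Client.
    The transit API exchanges plaintext as base64-encoded strings.
    """
    probe = b"pulumi-preflight"
    encoded = base64.b64encode(probe).decode("ascii")
    encrypted = vault_client.secrets.transit.encrypt_data(
        name=key_name, plaintext=encoded
    )
    decrypted = vault_client.secrets.transit.decrypt_data(
        name=key_name, ciphertext=encrypted["data"]["ciphertext"]
    )
    return base64.b64decode(decrypted["data"]["plaintext"]) == probe
```

Passing the key name `pulumi-stack` checks the key created by the commands above.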
Token & Auth Method Mapping
Pulumi requires persistent authentication during stack operations. Map AppRole, TLS, or Kubernetes service accounts to the transit path. Align token TTLs with maximum deployment durations. Short-lived tokens trigger mid-apply 403 Forbidden failures.
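The TTL alignment can be checked before a deployment starts. The sketch below assumes the input is the response of a token self-lookup (for example hvac's `client.auth.token.lookup_self()`), where the remaining lifetime in seconds sits under `data.ttl`. The function name and the default safety margin are hypothetical.

```python
from typing import Any, Mapping


def token_outlives_deploy(
    lookup: Mapping[str, Any],
    max_deploy_seconds: int,
    margin_seconds: int = 300,
) -> bool:
    """Return True if the token should survive the longest expected deploy."""
    ttl = int(lookup["data"]["ttl"])
    if ttl == 0:
        # Assumption: Vault reports a TTL of 0 for tokens that never expire.
        return True
    return ttl >= max_deploy_seconds + margin_seconds
```

A pipeline could refuse to start `pulumi up` when this returns False and request a fresh token instead.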
Once the Vault transit engine is configured, use it as the Pulumi secrets provider:
CLI: Switch the stack to use Vault transit as the secrets provider.
    export VAULT_ADDR="https://vault.example.com"
    export VAULT_TOKEN=""
    pulumi stack change-secrets-provider "hashivault://pulumi-stack"
Python Fallback Typing Patterns
Dynamic secret resolution often requires conditional fallbacks. Use typing.Optional for values that may be absent. Validate secret presence before passing values to resource constructors.
    from typing import Any, Dict, Optional

    import pulumi


    def resolve_vault_secrets(config: pulumi.Config) -> Dict[str, Any]:
        """Dynamically resolve Vault-backed secrets with safe fallback typing."""
        # get_secret() returns None when the key is not set for this stack.
        api_key: Optional[pulumi.Output[str]] = config.get_secret("vault_api_key")
        region: str = config.require("deployment_region")
        return {
            "api_key": api_key,
            "region": region,
            "fallback_enabled": api_key is not None,
        }
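Validating presence before a value reaches a resource constructor can be done with a small generic guard. This is a sketch; `require_present` is a hypothetical helper, not a Pulumi API.

```python
from typing import Optional, TypeVar

T = TypeVar("T")


def require_present(value: Optional[T], name: str) -> T:
    """Narrow Optional[T] to T, failing early with a descriptive error."""
    if value is None:
        raise ValueError(f"required secret '{name}' is not set for this stack")
    return value
```

Because the return type is `T`, a type checker treats the result as non-optional, so the guarded value can be passed to arguments that do not accept `None`.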
State Safety, Drift Detection & Safe Rollback
Pre-Migration State Snapshots
Provider transitions introduce cryptographic incompatibilities. Export the current state before executing any migration command. Store snapshots in version-controlled artifact storage. Maintain immutable backups for compliance audits.
CLI: Export stack state to a local artifact.
pulumi stack export --file state-pre-migration.json
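To make later tampering or corruption of the snapshot detectable, a checksum can be stored next to the export. The helper below is a sketch with a hypothetical name; it writes a SHA-256 digest to a sidecar file.

```python
import hashlib
from pathlib import Path


def write_checksum(snapshot: Path) -> str:
    """Write '<digest>  <filename>' to a .sha256 sidecar and return the digest."""
    digest = hashlib.sha256(snapshot.read_bytes()).hexdigest()
    sidecar = snapshot.with_name(snapshot.name + ".sha256")
    sidecar.write_text(f"{digest}  {snapshot.name}\n")
    return digest
```

Calling `write_checksum(Path("state-pre-migration.json"))` after the export produces state-pre-migration.json.sha256, which can be archived together with the snapshot.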
Drift Detection via pulumi refresh
Post-migration state verification prevents silent configuration divergence. Run pulumi refresh to reconcile the local state file with live infrastructure. Review diff outputs for unexpected resource replacements or property resets.
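The refresh step can be wrapped as a pass/fail gate for CI. The sketch below assumes the `--expect-no-changes` flag of `pulumi refresh`, which makes the command exit with an error when the refresh detects changes. `refresh_is_clean` and the injectable `run` callable are hypothetical.

```python
import subprocess
from typing import Callable, List, Sequence

Runner = Callable[[Sequence[str]], int]


def _run(argv: Sequence[str]) -> int:
    """Run a command and return its exit code."""
    return subprocess.run(list(argv), check=False).returncode


def refresh_is_clean(stack: str, run: Runner = _run) -> bool:
    """Return True when pulumi refresh reports no drift for the stack."""
    argv: List[str] = [
        "pulumi", "refresh",
        "--stack", stack,
        "--yes",
        "--expect-no-changes",
    ]
    return run(argv) == 0
```

Injecting `run` keeps the function testable without invoking the CLI. A False result indicates drift or a failed refresh and should block the pipeline until the diff is reviewed.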
Atomic State Import & Rollback
Decryption failures require immediate state restoration. Import the pre-migration snapshot to revert cryptographic bindings. Reference Pulumi Patterns & Provider Management for automated stack lifecycle governance and versioned state recovery pipelines.
CLI: Execute forced state rollback on failure.
pulumi stack import --file state-pre-migration.json --force
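The export, swap, and rollback commands from this guide can be chained so that a failed swap restores the snapshot automatically. This is a sketch under stated assumptions: `migrate_with_rollback` and the injectable `run` callable are hypothetical, and only the three CLI commands shown in this guide are used.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], int]


def _run(argv: Sequence[str]) -> int:
    """Run a command and return its exit code."""
    return subprocess.run(list(argv), check=False).returncode


def migrate_with_rollback(
    provider_url: str,
    snapshot: str = "state-pre-migration.json",
    run: Runner = _run,
) -> bool:
    """Swap the secrets provider; restore the snapshot if the swap fails.

    Returns True when the swap succeeded, False when it failed and the
    pre-migration snapshot was imported again.
    """
    if run(["pulumi", "stack", "export", "--file", snapshot]) != 0:
        raise RuntimeError("state export failed; migration not attempted")
    if run(["pulumi", "stack", "change-secrets-provider", provider_url]) == 0:
        return True
    # The swap failed: revert to the snapshot taken a moment ago.
    if run(["pulumi", "stack", "import", "--file", snapshot, "--force"]) != 0:
        raise RuntimeError("rollback import failed; restore the state manually")
    return False
```

If the swap also modified the stack settings file before failing, that file may need to be restored from version control as well; the sketch does not attempt this.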
Testing Boundaries & Secret Masking Validation
pytest Isolation for IaC
Unit tests must never invoke live cloud providers. Isolate configuration parsing from resource provisioning logic. Mock the Pulumi config to simulate stack evaluation without network calls.
Mocking KMS/Vault Responses
Patch pulumi.Config using unittest.mock. Return deterministic values during test execution. Validate type coercion and error handling paths without exposing real credentials.
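A minimal sketch of this pattern uses `MagicMock` as a stand-in for the config object, so no Pulumi engine or network call is involved. `resolve_settings` is a hypothetical function under test, and the key names mirror the earlier examples.

```python
from typing import Any, Dict
from unittest.mock import MagicMock


def resolve_settings(config: Any) -> Dict[str, Any]:
    """Hypothetical function under test: reads two keys from a Config-like object."""
    return {
        "region": config.require("deployment_region"),
        "api_key": config.get_secret("vault_api_key"),
    }


def test_resolve_settings_without_api_key() -> None:
    config = MagicMock()
    config.require.return_value = "us-east-1"
    config.get_secret.return_value = None  # simulate an unset secret

    settings = resolve_settings(config)

    assert settings == {"region": "us-east-1", "api_key": None}
    config.require.assert_called_once_with("deployment_region")
    config.get_secret.assert_called_once_with("vault_api_key")
```

When the pulumi package is installed in the test environment, `MagicMock(spec=pulumi.Config)` additionally rejects calls to methods that the real class does not have.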
CLI Output Redaction Verification
Secret masking relies on Pulumi's internal serialization layer. Verify that pulumi preview and pulumi up outputs display [secret] placeholders for values retrieved via require_secret() or get_secret(). Do not attempt synchronous string operations on Output[str] objects—use .apply() for all transformations.
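Redaction can also be verified mechanically by scanning captured CLI output for known test values. This is a sketch: `find_leaked_secrets` is a hypothetical helper and should only be fed dummy secrets created for the test stack, never production credentials.

```python
from typing import Iterable, List


def find_leaked_secrets(cli_output: str, known_secrets: Iterable[str]) -> List[str]:
    """Return every known secret value that appears verbatim in the output."""
    # Empty strings are skipped because they would match any output.
    return [secret for secret in known_secrets if secret and secret in cli_output]
```

A test can capture the text printed by pulumi preview for a stack seeded with dummy secrets and assert that this function returns an empty list.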
Common Mistakes & Remediation
| Mistake | Remediation | Impact |
|---|---|---|
| Using pulumi config set without --secret during migration | Always append --secret or enforce config.require_secret() in code. Verify ciphertext format in Pulumi.<stack>.yaml. | Plaintext secrets committed to VCS, triggering compliance violations and audit failures. |
| Skipping IAM policy scoping or Vault transit path validation | Apply least-privilege kms:Decrypt/kms:Encrypt or Vault transit/encrypt/* policies. Validate with aws kms describe-key or vault read transit/keys/pulumi-stack. | CLI hangs on pulumi up with opaque AccessDenied or 403 Forbidden errors. |
| Assigning require_secret() result to str instead of Output[str] | Use Output[str] type annotation. Apply .apply() for any downstream string transformation. | Runtime TypeError during dependency resolution and failed resource graph compilation. |
Frequently Asked Questions
Can I migrate Pulumi secrets to KMS/Vault without recreating resources?
Yes. pulumi stack change-secrets-provider only re-encrypts state values. Resource IDs and URNs remain intact. After the swap, run pulumi preview and confirm that it proposes no resource changes before running the next pulumi up.
How does drift detection handle rotated KMS keys or Vault tokens?
Pulumi does not auto-detect key rotation. Implement CI/CD checks with pulumi refresh and monitor AWS CloudTrail or Vault audit logs for AccessDenied events during stack operations. Align token lifecycles with deployment windows.
What is the testing boundary for mocking KMS/Vault in Python IaC?
Use unittest.mock to patch pulumi.Config and return test values. Never mock the actual secrets provider. Test configuration resolution, type safety, and error propagation in strict isolation.
Conclusion
Migrating Pulumi secrets to KMS or Vault is a one-time atomic operation (change-secrets-provider) that significantly improves compliance posture. The ongoing discipline—scoped IAM policies, token TTL alignment, pre-migration snapshots—is what keeps the encrypted state safe after migration. Invest in the validation workflow before executing the migration in production.
Related
- AWS Provider Deep Dive — the parent guide on provider initialization, credential routing, and state isolation.
- Managing multi-account AWS environments with Pulumi Python — per-account stacks and assume-role providers that each consume their own KMS key.
- How to Deploy an EKS Cluster with Pulumi (Python) — a workload whose database passwords and tokens you would store as KMS-encrypted secrets.