Managing Multi-Account AWS Environments with Pulumi Python
Multi-account AWS architectures require strict isolation boundaries. Pulumi Python gives you a declarative, repeatable model for cross-account provisioning. Target Python 3.9+ and enforce static typing across the codebase. Pulumi runs one update at a time per stack, so each stack acts as a serialized unit of deployment.
The boundary between the Pulumi and CDKTF ecosystems remains distinct. Pulumi runs your Python program, and its engine drives resource providers that call the cloud APIs. CDKTF synthesizes Terraform JSON from your Python code, and Terraform then plans and applies it. Adopting Python for Infrastructure as Code (Pulumi / CDKTF) establishes a unified language baseline. Lifecycle management follows established Pulumi Patterns & Provider Management conventions.
Cross-Account IAM Role Architecture & Trust Boundaries
Cross-account deployments rely exclusively on temporary STS credentials. Long-lived access keys violate modern security baselines. You must enforce external ID rotation and strict session duration limits.
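Apply the following trust policy to each target account role. The account-root principal trusts any identity in the central account that is allowed to call sts:AssumeRole; to restrict assumption to your CI/CD runner or a developer identity, use that role's ARN as the principal instead. Substitute the external ID when you render the policy, because IAM does not interpolate Pulumi config references. Session length is capped by the role's maximum session duration setting (for example 3600 seconds), not by a trust policy condition.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::<central-account-id>:root"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "sts:ExternalId": "<your-external-id>"
        }
      }
    }
  ]
}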
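If the trust policy is rendered from code rather than edited by hand, a small helper keeps the principal and the external ID in one place. The following is a minimal sketch: the function name `build_trust_policy`, the example role ARN, and the example external ID are hypothetical and only illustrate the shape of the document.

```python
import json
from typing import Any, Dict


def build_trust_policy(principal_arn: str, external_id: str) -> Dict[str, Any]:
    """Return a trust policy that lets one principal assume the role.

    Hypothetical helper: the name and arguments are illustrative only.
    """
    if not external_id:
        raise ValueError("external_id must be a non-empty string")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": "sts:AssumeRole",
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


if __name__ == "__main__":
    policy = build_trust_policy(
        "arn:aws:iam::111111111111:role/ci-runner",  # placeholder ARN
        "example-external-id",  # placeholder value
    )
    print(json.dumps(policy, indent=2))
```

Rendering the document this way also makes it easy to serialize with `json.dumps` and pass to whatever creates the role.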
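Scope permissions using least-privilege IAM policies per account. Never attach AdministratorAccess to deployment roles. Grant only the specific ec2, s3, or iam actions and resource ARNs a stack needs; service-wide wildcards such as iam:* allow privilege escalation and are effectively administrative. Validate the trust boundary before deployment.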
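A deployment role's policy document can be checked for service-wide wildcards before it is attached. The sketch below is one way to do that in plain Python; the function name `find_wildcard_actions` is hypothetical, and the check only looks at Allow statements and their Action entries.

```python
from typing import Any, Dict, List


def find_wildcard_actions(policy: Dict[str, Any]) -> List[str]:
    """Return every allowed action that is "*" or ends in ":*"."""
    flagged: List[str] = []
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be a bare object
        statements = [statements]
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):  # Action may be a string or a list
            actions = [actions]
        flagged.extend(a for a in actions if a == "*" or a.endswith(":*"))
    return flagged
```

A pre-commit hook or CI step could fail whenever the returned list is non-empty.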
CLI: Verify STS assumption locally before invoking Pulumi.
aws sts assume-role \
  --role-arn arn:aws:iam::<target-account>:role/PulumiDeployer \
  --role-session-name validation-test \
  --external-id <your-external-id>
Dynamic Provider Instantiation with Python 3.9+ Typing
Static provider declarations fail in multi-account routing scenarios. You must instantiate pulumi_aws.Provider objects dynamically. Typed configuration prevents runtime TypeError exceptions during stack evaluation.
Define a strict routing schema using dataclasses and TypedDict. This lets a static type checker such as mypy validate provider arguments before deployment; the annotations themselves are not enforced at runtime.
# config/accounts.py
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Dict, Optional, TypedDict

import pulumi


class AssumeRoleConfig(TypedDict, total=False):
    role_arn: str
    session_name: str
    external_id: Optional[str]
    policy_arns: Optional[list[str]]


@dataclass(frozen=True)
class AccountRoute:
    account_id: str
    region: str
    assume_role: AssumeRoleConfig
    tags: Optional[Dict[str, str]] = None


def load_account_config() -> Dict[str, AccountRoute]:
    """Load multi-account routing from Pulumi config."""
    # require_object returns plain Python data. get_secret_object returns an
    # Output, which cannot be checked with isinstance() or iterated directly.
    raw: Any = pulumi.Config().require_object("accounts")
    if not isinstance(raw, dict):
        raise ValueError("Invalid accounts configuration format")
    routes: Dict[str, AccountRoute] = {}
    for key, val in raw.items():
        routes[key] = AccountRoute(
            account_id=str(val["account_id"]),  # YAML may parse IDs as integers
            region=val["region"],
            assume_role=val["assume_role"],
            tags=val.get("tags"),
        )
    return routes
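Because TypedDict annotations are only seen by the type checker, malformed config entries still reach the provider unless they are checked at runtime. The sketch below shows one possible pre-flight check on a single raw entry; the function name `validate_route` and the specific rules (12-digit account ID, non-empty `role_arn` and `external_id`) are assumptions chosen for illustration.

```python
import re
from typing import Any, Dict, List

ACCOUNT_ID_PATTERN = re.compile(r"\d{12}")  # AWS account IDs are 12 digits


def validate_route(alias: str, raw: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in one raw account entry."""
    problems: List[str] = []
    for key in ("account_id", "region", "assume_role"):
        if key not in raw:
            problems.append(f"{alias}: missing required key '{key}'")
    account_id = raw.get("account_id")
    if account_id is not None and not ACCOUNT_ID_PATTERN.fullmatch(str(account_id)):
        problems.append(f"{alias}: account_id must be a 12-digit string")
    assume_role = raw.get("assume_role")
    if isinstance(assume_role, dict):
        for key in ("role_arn", "external_id"):
            if not assume_role.get(key):
                problems.append(f"{alias}: assume_role.{key} is empty or missing")
    elif assume_role is not None:
        problems.append(f"{alias}: assume_role must be a mapping")
    return problems
```

Calling this for every entry before building routes turns a missing `external_id` into a clear configuration error instead of a failed role assumption later.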
Instantiate providers using the loaded schema. Pass the assume_role dictionary directly to the AWS provider constructor.
# providers/__init__.py
from __future__ import annotations

from typing import Dict

import pulumi_aws

from config.accounts import load_account_config


def create_account_providers() -> Dict[str, pulumi_aws.Provider]:
    """Factory function for cross-account AWS provider instantiation."""
    routes = load_account_config()
    providers: Dict[str, pulumi_aws.Provider] = {}
    for alias, route in routes.items():
        provider = pulumi_aws.Provider(
            alias,
            region=route.region,
            assume_role=route.assume_role,
            skip_credentials_validation=False,
            default_tags=pulumi_aws.ProviderDefaultTagsArgs(
                tags=route.tags or {}
            ),
        )
        providers[alias] = provider
    return providers
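The assume_role settings are handed to the AWS provider, which assumes the role through STS on top of the ambient credential chain. Resources fall back to the default provider unless you pass one explicitly, so set opts=pulumi.ResourceOptions(provider=...) on every resource constructor, or on a parent resource whose children inherit it. Refer to the AWS Provider Deep Dive for advanced credential chain resolution and regional endpoint routing.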
State Backend Isolation & Stack Organization
Shared state files cause catastrophic cross-account collisions. You must isolate state per account and environment. Pulumi Cloud and self-managed S3 backends both lock a stack while an update is running.
Initialize isolated stacks before provisioning. Never reuse stack names across AWS accounts.
CLI: Initialize environment-specific stacks.
pulumi stack init dev-us-east-1-account-a
pulumi stack init prod-us-west-2-account-b
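Stack names that encode environment, region, and account can be generated and checked for reuse in a few lines. The sketch below assumes the `<environment>-<region>-<account>` naming pattern used in the commands above; the function names `stack_name` and `reused_stack_names` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def stack_name(environment: str, region: str, account_alias: str) -> str:
    """Compose a stack name in the <environment>-<region>-<account> form."""
    return f"{environment}-{region}-{account_alias}"


def reused_stack_names(assignments: Iterable[Tuple[str, str]]) -> List[str]:
    """Return stack names that are assigned to more than one account ID.

    Each assignment is a (stack name, account ID) pair.
    """
    accounts_by_stack: Dict[str, Set[str]] = defaultdict(set)
    for name, account_id in assignments:
        accounts_by_stack[name].add(account_id)
    return sorted(name for name, ids in accounts_by_stack.items() if len(ids) > 1)
```

An empty result from `reused_stack_names` means every stack name maps to exactly one account.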
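Configure backend encryption at rest. For S3 backends, enable server-side encryption with KMS on the bucket and choose a secrets provider (for example awskms or a passphrase) for stack secrets. Pulumi's S3 backend takes its update locks through lock files stored in the bucket; it does not use a DynamoDB table or a LockID key, which belong to Terraform's S3 backend. Pulumi Cloud handles encryption automatically.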
Recovery workflows depend on state export capabilities. Corrupted state requires surgical JSON manipulation. Always verify state integrity before re-importing.
CLI: Export and re-import state for recovery.
# --show-secrets writes secret values in plaintext; protect or delete the file afterwards
pulumi stack export --show-secrets > state-backup.json
# Edit JSON manually to remove orphaned URNs
pulumi stack import --file state-backup.json
Drift Detection & Safe Rollback Strategies
Drift detection prevents configuration divergence. pulumi preview compares desired state against cached state. pulumi refresh reconciles cached state against live cloud resources.
Run refresh operations before every cross-account deployment. Stale state triggers duplicate resource creation. Orphaned infrastructure accumulates silently.
CLI: Execute pre-deployment drift detection.
pulumi refresh --yes --stack dev
pulumi preview --diff --stack dev
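In CI, the preview result can be turned into a pass/fail gate. The sketch below assumes the preview was run with `--json` and that the output carries a top-level `changeSummary` object mapping operation names to counts; verify that field name against the output of your CLI version before relying on it. The function name `pending_changes` is hypothetical.

```python
import json
from typing import Dict


def pending_changes(preview_json: str) -> Dict[str, int]:
    """Return the operation counts other than "same" from preview JSON.

    Assumes a top-level "changeSummary" object (an assumption, see lead-in).
    """
    document = json.loads(preview_json)
    summary = document.get("changeSummary") or {}
    return {op: count for op, count in summary.items() if op != "same" and count > 0}
```

An empty dictionary means the preview proposed no changes; anything else can fail the pipeline step or be posted for review.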
Surgical rollbacks require targeted resource updates. Use --target to isolate failing components. Never run pulumi destroy on shared stacks.
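CLI: Perform a targeted rollback after reverting the program change.
pulumi up --stack dev \
  --target urn:pulumi:dev::my-stack::aws:s3/bucket:Bucket::prod-data
# To force a replacement instead, pass the same URN to --target-replace; a replacement is not guaranteed to be zero-downtime.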
State corruption triage follows a strict sequence. Export the stack. Remove conflicting URNs. Validate with pulumi preview. Re-import only after diff verification.
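The "remove conflicting URNs" step can be scripted instead of hand-editing the export. The sketch below assumes the exported file has a `deployment.resources` list whose entries carry `urn` and optional `dependencies` fields; the function name `remove_urns` is hypothetical, and it deliberately refuses to remove a resource that is still the parent of a kept resource. Other reference fields in the checkpoint are not handled here.

```python
import copy
from typing import Any, Dict, Iterable


def remove_urns(state: Dict[str, Any], urns: Iterable[str]) -> Dict[str, Any]:
    """Return a copy of an exported stack with the given URNs removed."""
    doomed = set(urns)
    cleaned = copy.deepcopy(state)  # never mutate the backup in place
    deployment = cleaned.get("deployment")
    if not deployment:
        return cleaned
    kept = [r for r in deployment.get("resources", []) if r.get("urn") not in doomed]
    for resource in kept:
        if resource.get("parent") in doomed:
            raise ValueError(f"{resource.get('urn')} still has a removed parent")
        if "dependencies" in resource:
            # Drop references to removed resources so none are left dangling.
            resource["dependencies"] = [
                d for d in resource["dependencies"] if d not in doomed
            ]
    deployment["resources"] = kept
    return cleaned
```

Write the result to a new file, keep the original export as a backup, and import only after the preview diff looks right.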
Common engineering mistakes include omitting external_id in assume_role configuration. This triggers AccessDenied errors mid-stack. Always validate STS tokens before deployment.
Sharing a single state file across accounts causes resource deletion during pulumi destroy. Enforce a 1:1 stack-to-account mapping.
Untyped provider dictionaries cause silent misconfiguration. Run mypy --strict in pre-commit hooks.
Skipping pulumi refresh creates duplicate resources. Mandate refresh in CI/CD pre-deploy stages.
Validation & Testing Boundaries
Unit tests must validate routing without hitting AWS APIs. pulumi.runtime.set_mocks() intercepts resource creation calls. Assert provider assignment explicitly.
# tests/test_multi_account.py
from __future__ import annotations

import json
from typing import Any, Dict, Tuple

import pulumi
import pulumi_aws
import pytest

from providers import create_account_providers


class RoutingMocks(pulumi.runtime.Mocks):
    """Return each resource's inputs as its state without calling AWS."""

    def new_resource(self, args: pulumi.runtime.MockResourceArgs) -> Tuple[str, Dict[str, Any]]:
        return f"{args.name}-id", dict(args.inputs)

    def call(self, args: pulumi.runtime.MockCallArgs) -> Dict[str, Any]:
        return {}


class TestProviderRouting:
    @pytest.fixture(autouse=True)
    def setup_mocks(self):
        pulumi.runtime.set_mocks(RoutingMocks(), project="project", stack="test")
        # Placeholder routing data; load_account_config() reads the "accounts" key.
        pulumi.runtime.set_all_config({
            "project:accounts": json.dumps({
                "account_a": {
                    "account_id": "111111111111",
                    "region": "us-east-1",
                    "assume_role": {
                        "role_arn": "arn:aws:iam::111111111111:role/PulumiDeployer",
                        "external_id": "test-external-id",
                    },
                }
            })
        })
        yield

    @pulumi.runtime.test
    def test_provider_assigns_correct_account(self):
        """Assert the bucket is bound to the provider created for account_a."""
        providers = create_account_providers()
        target_provider = providers["account_a"]
        # Simulate resource creation with explicit provider
        test_bucket = pulumi_aws.s3.Bucket(
            "test-bucket",
            bucket="test-routing-validation",
            opts=pulumi.ResourceOptions(provider=target_provider),
        )
        # _provider is a private SDK attribute and may change between releases
        assert test_bucket._provider is target_provider
Integration tests rely on the Automation API (pulumi.automation). Spin up ephemeral stacks in CI pipelines. Gate merges on successful pulumi preview execution.
CLI: Enforce typing and preview gates in CI/CD.
mypy --strict config/ providers/ tests/
# --expect-no-changes fails the step if the preview proposes any change, so point it at a stack that is already deployed
pulumi preview --stack test --diff --expect-no-changes
Frequently Asked Questions
How do I recover from a corrupted Pulumi state file in a multi-account setup?
Export the last known good state using pulumi stack export. Manually edit the JSON to remove orphaned URNs. Re-import with pulumi stack import. Always verify with pulumi preview before applying changes.
Can I use a single Python script to deploy to multiple AWS accounts simultaneously?
Yes. Instantiate multiple pulumi_aws.Provider objects with distinct assume_role configurations. Pass them explicitly to resource constructors. Isolate stacks per account to maintain state safety.
How does Pulumi handle IAM role session expiration during long-running deployments?
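Pulumi does not manage token lifetime itself. The AWS provider's SDK typically re-assumes the role when session credentials near expiry, provided the base credentials are still valid. Set the role's maximum session duration to cover your longest deployment, and remember that role chaining caps a session at one hour. Ensure the base credential source, such as a CI token or SSO session, outlives the run. Lowering pulumi up --parallel reduces concurrent API calls but lengthens the run, so account for that when sizing the session.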
What is the safest way to test multi-account provider routing without incurring AWS costs?
Use pulumi.runtime.set_mocks() with pytest to simulate API calls. Validate that resources route to the correct provider ARN by asserting resource._provider in test cases. Run these tests before executing pulumi up.