
Multi-PKI on a Single CloudHSM Cluster — Why KMS Is the Wrong Tool for Certificate Authorities

AWS KMS can technically sign with asymmetric keys, but it speaks REST — not PKCS#11. For PKI workloads that need HSM-backed signing, key export, and multi-tenant isolation, CloudHSM's Crypto User model gives you partition-equivalent isolation without legacy constraints.

Alexandre Agius

AWS Solutions Architect

20 min read

You have a PKI. You put your CA keys in KMS. Everything works — kms:Sign returns valid signatures, your AWS-native code happily issues certificates, IAM gates everything cleanly. Until the day you bring in Evertrust, EJBCA, Microsoft AD CS, or any other PKI platform that wasn’t written specifically for the AWS SDK. And those platforms ask one simple question that KMS can’t answer: “where’s your PKCS#11 library?”

This is the second article in a small series about HSM-backed key management on AWS. The first one — When Your Keys Get Locked In — covered the case where KMS doesn’t let you export key material and the four ways out (BYOK fix, CloudHSM, XKS, Private CA). This one goes deeper on the path that most enterprises end up taking when they have a real PKI to operate: CloudHSM, with multiple PKIs sharing a single cluster, isolated through Crypto Users instead of partitions.

I’ll explain why KMS is the wrong interface for PKI even when it has the right cryptography, why CloudHSM’s lack of partitions is a feature, and how to lay out multiple CA hierarchies on one cluster without the multi-tenancy nightmares people imagine.

The Problem

There are two related problems that get conflated. Let’s separate them.

Problem 1: KMS speaks REST, not PKCS#11

KMS supports asymmetric Sign and Verify operations on RSA and ECC keys. From a pure cryptography standpoint, KMS can be a CA’s signing engine. But every off-the-shelf PKI platform — Evertrust Stream, EJBCA, Microsoft AD CS, OpenXPKI, Dogtag, Vault PKI — was designed around standardized HSM interfaces:

| PKI platform | Expected interface |
|---|---|
| Evertrust Stream | PKCS#11 |
| EJBCA / Keyfactor | PKCS#11 (P11-NG) or Java PKCS#11 provider |
| Microsoft AD CS | KSP / CNG |
| OpenXPKI | PKCS#11 |
| HashiCorp Vault PKI | PKCS#11 (auto-unseal also supports KMS, but CA signing prefers PKCS#11) |
| Custom OpenSSL CA tooling | OpenSSL Dynamic Engine (typically backed by PKCS#11) |

None of these speak aws kms sign --key-id ... --message ... natively. To get them to use a KMS key, you’d need a custom shim — a PKCS#11 driver that translates C_Sign into kms:Sign. Such things exist as community projects, but you’re now operating a glue layer in your CA’s signing path. That’s a single point of failure with no vendor support, sitting between FIPS-validated hardware and the auditor who wants to know how you sign certificates.
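To make the size of that glue layer concrete, here is a minimal sketch of one small slice of such a shim: translating the PKCS#11 mechanism a CA passes to C_SignInit into the parameters a kms:Sign call expects. The function name and the mapping table are my own illustration, not code from any existing driver. The algorithm names and request keys are the ones the KMS Sign API uses (as exposed by boto3's `kms.sign`), and the 4096-byte limit is the KMS cap on RAW messages.

```python
import hashlib

# Illustrative mapping: PKCS#11 signing mechanism -> KMS SigningAlgorithm.
# Anything outside this table simply has no KMS equivalent in this sketch.
MECHANISM_TO_KMS = {
    "CKM_SHA256_RSA_PKCS": "RSASSA_PKCS1_V1_5_SHA_256",
    "CKM_SHA384_RSA_PKCS": "RSASSA_PKCS1_V1_5_SHA_384",
    "CKM_SHA512_RSA_PKCS": "RSASSA_PKCS1_V1_5_SHA_512",
    "CKM_SHA256_RSA_PKCS_PSS": "RSASSA_PSS_SHA_256",
    "CKM_ECDSA_SHA256": "ECDSA_SHA_256",
    "CKM_ECDSA_SHA384": "ECDSA_SHA_384",
    "CKM_ECDSA_SHA512": "ECDSA_SHA_512",
}

KMS_RAW_MESSAGE_LIMIT = 4096  # bytes; larger inputs must be hashed locally


def build_kms_sign_request(key_id: str, mechanism: str, data: bytes) -> dict:
    """Build the keyword arguments for a KMS Sign call (hypothetical helper)."""
    try:
        algorithm = MECHANISM_TO_KMS[mechanism]
    except KeyError:
        raise ValueError(f"{mechanism} has no KMS equivalent") from None

    if len(data) <= KMS_RAW_MESSAGE_LIMIT:
        message, message_type = data, "RAW"
    else:
        # e.g. "ECDSA_SHA_384" -> "sha384"; KMS then signs the digest as-is.
        bits = algorithm.rsplit("_", 1)[1]
        message = hashlib.new("sha" + bits, data).digest()
        message_type = "DIGEST"

    return {
        "KeyId": key_id,
        "Message": message,
        "MessageType": message_type,
        "SigningAlgorithm": algorithm,
    }
```

The result can be passed as `boto3.client("kms").sign(**request)`. Even this slice hides a trap: KMS returns ECDSA signatures DER-encoded, while PKCS#11 callers expect the raw r||s concatenation from C_Sign, so a real shim also has to re-encode the output. Every one of those conversions sits in your CA's signing path.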

Problem 2: KMS hides the keys you said you wanted to control

The other thing PKI platforms expect is direct, fine-grained access to key attributes: setting CKA_EXTRACTABLE, CKA_TRUSTED, CKA_SIGN, CKA_WRAP_WITH_TRUSTED. These are PKCS#11 attributes that determine whether a CA private key can be wrapped out for escrow, whether it can be used to wrap subordinate CA keys, whether it can sign at all. KMS doesn’t expose them — it’s an opinionated abstraction. Useful for application encryption; restrictive for PKI.

And then there’s the export question. The previous article covered it in detail: KMS keys cannot be exported in any form. For a root CA private key — the trust anchor that 20-year certificate hierarchies depend on — that’s not always acceptable. Auditors, divestiture clauses, and multi-cloud strategies all want a way to take your root key with you. CloudHSM allows it through wrapKey provided the key was created with CKA_EXTRACTABLE=TRUE. KMS never allows it.
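The attribute logic behind that export rule is worth spelling out. The sketch below models the standard PKCS#11 conditions for wrapping a key out, using plain dictionaries in place of real key objects. The function name and the sample templates are hypothetical; the attribute names are the ones listed above, plus CKA_WRAP on the wrapping key.

```python
def can_wrap_out(target: dict, wrapping_key: dict) -> bool:
    """Return True if PKCS#11 attribute rules allow wrapping `target`.

    Both arguments are attribute dictionaries (a stand-in for HSM objects).
    """
    # The key being exported must have been created as extractable.
    if not target.get("CKA_EXTRACTABLE", False):
        return False
    # The wrapping key must be permitted to wrap at all.
    if not wrapping_key.get("CKA_WRAP", False):
        return False
    # If the target insists on a trusted wrapper, the wrapping key must
    # carry CKA_TRUSTED (which only an Admin can set on CloudHSM).
    if target.get("CKA_WRAP_WITH_TRUSTED", False):
        return bool(wrapping_key.get("CKA_TRUSTED", False))
    return True


# Sample templates (illustrative values, not a recommended policy).
portable_root = {"CKA_SIGN": True, "CKA_EXTRACTABLE": True,
                 "CKA_WRAP_WITH_TRUSTED": True}
locked_root = {"CKA_SIGN": True, "CKA_EXTRACTABLE": False}
escrow_kek = {"CKA_WRAP": True, "CKA_TRUSTED": True}
ad_hoc_kek = {"CKA_WRAP": True, "CKA_TRUSTED": False}

print(can_wrap_out(portable_root, escrow_kek))  # True
print(can_wrap_out(portable_root, ad_hoc_kek))  # False
print(can_wrap_out(locked_root, escrow_kek))    # False
```

The decision that matters is made at key generation: a root created without CKA_EXTRACTABLE stays where it is, and that is exactly the knob KMS never hands you.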

So the answer for serious PKI workloads is CloudHSM. The next question — the one this article is really about — is what most customers ask immediately after: “do we need one CloudHSM cluster per PKI, or can we share?”

The Real Question: One Cluster Per PKI?

The instinctive answer from anyone who’s worked with on-premises HSMs is “no, we need separate partitions.” On a Thales Luna or Entrust nShield, every PKI gets its own partition — a cryptographically isolated slice of the HSM with its own master key. Different PKI? Different partition. That’s how you keep tenants apart.

CloudHSM doesn’t have partitions. There is exactly one slot per cluster, full stop. So at first glance, it looks like the answer is “one cluster per PKI”, which gets expensive fast, since an HA cluster is ~$2,300/month minimum.

That first glance is wrong. CloudHSM replaces the partition concept with Crypto Users, and the isolation guarantees are equivalent for almost every threat model. Once you understand the model, multi-PKI on one cluster is not just possible — it’s the recommended pattern.

How CU-Based Isolation Works

A CloudHSM cluster has four user types living inside the HSM firmware (not in IAM):

| User type | Can do | Cannot do |
|---|---|---|
| Admin (Crypto Officer) | Create/delete users, manage passwords, configure quorum (M-of-N), set CKA_TRUSTED | Cannot use any key: no encrypt, decrypt, sign, verify |
| Crypto User (CU) | Generate, manage, share, and use keys it owns | Cannot manage other users |
| Appliance User (AU) | Synchronize keys between HSMs in the cluster (encrypted blobs only) | Cannot use keys, cannot read plaintext key material |
| Unactivated admin | Temporary user for first-time cluster activation | Anything else |

Two facts matter for the multi-PKI story:

  1. AWS itself can’t use your keys. The AU only moves encrypted “masked objects” between HSMs; it has no key-use permissions. AWS administrators have no operator interface that allows reading or using customer keys.
  2. Admins can manage users but cannot use keys. That separation is enforced by the HSM firmware itself, not by a wrapping software layer.

The actual isolation primitive is in the docs:

“In AWS CloudHSM, the crypto user (CU) who creates the key owns it. The owner can use the key share and key unshare commands to share and unshare the key with other CUs.”

Translated: every key on the HSM has exactly one owner CU and zero or more shared users. A CU that is neither owner nor in the shared list cannot enumerate, see, or use the key. C_FindObjects will literally not return a handle. This is enforced inside the HSM firmware, at the same level a Luna partition’s MTK enforces partition isolation.

The shared-user grant is restrictive too. A user with whom a key has been shared can:

  • Use it for crypto operations (sign, verify, encrypt, decrypt, HMAC)

A shared user cannot:

  • Delete, export, derive, wrap, share, or unshare the key
  • Modify any attribute

So a “share” is a minimum-privilege grant of cryptographic use — not ownership.

Cluster: cluster-prod (one slot, multiple HSMs across AZs)

├── CU: cu_pki_corporate
│     ├── corp_root_ca_2026                    [owns]
│     ├── corp_issuing_ca_users_2026           [owns]
│     └── corp_issuing_ca_devices_2026         [owns]

├── CU: cu_pki_iot
│     ├── iot_root_ca_2026                     [owns]
│     └── iot_issuing_ca_factory_a_2026        [owns]

└── CU: cu_pki_lab
      └── lab_root_ca_2026                     [owns]

cu_pki_iot logging in:    sees only its own keys, period.
                          Cannot list, sign with, or even
                          discover the existence of corp_*
                          or lab_* keys.

That’s the isolation guarantee. For a multi-PKI scenario where each PKI is operated by a different team or service, this is exactly what you want — and it’s what a partition gives you on a Luna HSM.
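If it helps to see the rule as code, here is a toy model of the ownership and sharing semantics described above. It is a sketch of the access rule only, not of how the firmware implements it; the class, the function, and the `cu_ocsp_responder` grant are illustrative.

```python
from dataclasses import dataclass, field


@dataclass
class HsmKey:
    label: str
    owner: str                      # exactly one owner CU
    shared_with: set = field(default_factory=set)

    def visible_to(self, cu: str) -> bool:
        # Owner and shared users can find and use the key; nobody else.
        return cu == self.owner or cu in self.shared_with

    def manageable_by(self, cu: str) -> bool:
        # Delete, export, wrap, share, unshare, change attributes: owner only.
        return cu == self.owner


def find_objects(keys: list, cu: str) -> list:
    """What an object search would return for a logged-in CU."""
    return [k.label for k in keys if k.visible_to(cu)]


CLUSTER_KEYS = [
    HsmKey("corp_root_ca_2026", "cu_pki_corporate"),
    HsmKey("corp_issuing_ca_users_2026", "cu_pki_corporate",
           shared_with={"cu_ocsp_responder"}),
    HsmKey("corp_issuing_ca_devices_2026", "cu_pki_corporate"),
    HsmKey("iot_root_ca_2026", "cu_pki_iot"),
    HsmKey("iot_issuing_ca_factory_a_2026", "cu_pki_iot"),
    HsmKey("lab_root_ca_2026", "cu_pki_lab"),
]

print(find_objects(CLUSTER_KEYS, "cu_pki_iot"))
print(find_objects(CLUSTER_KEYS, "cu_ocsp_responder"))
```

The first call lists only the two `iot_*` keys. The second lists the one shared issuing key, which `cu_ocsp_responder` can use but for which `manageable_by` returns False.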

Where the analogy breaks down

I want to be honest about the limits, because the security architects in the room will ask:

  • The isolation is logical, not physical. All keys live in the same FIPS 140-3 Level 3 boundary. A vendor firmware vulnerability (extremely rare on Marvell LiquidSecurity, but non-zero) would in principle expose every CU’s keys. If your regulator demands physical separation (rare outside payment HSM territory), separate clusters are the answer.
  • Admin compromise is total. A compromised Admin can create a new CU and share keys to it. Mitigation: enable quorum (M-of-N) on all admin operations. CloudHSM supports 2-of-2 to 8-of-8.
  • CU credential compromise = all of that CU’s keys. This is true on Luna too (compromise the partition’s PIN, you have its keys). The mitigation is identical: store CU passwords in Secrets Manager with rotation, and run each PKI’s signer process with its own CU on its own EC2 instance with its own IAM role.

For a typical enterprise multi-PKI estate (corporate users, IoT/devices, lab instruments, internal TLS), CU-based isolation is sufficient. For payment cryptography, eIDAS qualified trust services, or a divestiture-bound PKI, you may want a dedicated cluster.

The Architecture Patterns

Before diving into the three patterns, here is the full picture — what the deployment actually looks like once everything is wired up:

[Figure: Multi-PKI Architecture on AWS CloudHSM. A single cluster hosts multiple isolated PKIs via per-Crypto-User key ownership, with Evertrust/EJBCA connecting via PKCS#11.]

The diagram above shows the moving parts: a single CloudHSM cluster spanning two Availability Zones, an EC2-hosted PKI signer running the Client SDK 5 PKCS#11 library, per-CU passwords pulled from Secrets Manager, encrypted backups landing in S3, and the dual audit story — HSM audit logs in CloudWatch and control-plane events in CloudTrail. The “isolation layer” box on the right is logical, not physical: it’s the same key store on the same HSMs, sliced into namespaces by Crypto User ownership. The patterns below differ in how you carve up that namespace.

Pattern A: One CU per PKI tenant

The default pattern. Each PKI namespace gets a CU. All CAs of that PKI hierarchy (root + issuing) live as keys owned by that CU.

Cluster
├── CU cu_pki_corporate    → corporate hierarchy (root + issuing CAs)
├── CU cu_pki_iot          → IoT hierarchy
├── CU cu_pki_lab          → lab-instrument hierarchy
└── CU cu_pki_internal_tls → internal TLS issuing only

Operationally simple. The CU password is the PKI’s authentication boundary. Maps cleanly onto how Evertrust Stream and EJBCA think about “crypto tokens” (one PKCS#11 login = one logical key namespace). Compromise of one CU doesn’t expose the others.

The downside: inside one PKI, the root CA and the issuing CAs are reachable by the same credential. If you want to separate root from issuing, you go to Pattern B.

Pattern B: One CU per CA role

For high-assurance PKIs where the offline-root-online-issuing pattern matters:

CU cu_pki_corp_root              # used only during ceremonies
CU cu_pki_corp_issuing_users     # daily issuing, user certificates
CU cu_pki_corp_issuing_devices   # daily issuing, device certificates

The root CU’s password is held by 2-3 PKI officers under quorum, kept “offline” in the operational sense — no daemon ever logs in as it except during planned ceremonies (sub-CA enrolment, sub-CA rollover, root CRL signing). The issuing CUs are used continuously by the live PKI service.

This is the closest CloudHSM analog to the classical air-gapped-root pattern. The root key never gets shared, and the issuing CAs get their own credentials.

Pattern C: CloudHSM root + AWS Private CA subordinates

The pattern AWS itself documents in “Create a portable root CA using AWS CloudHSM and ACM Private CA”. CloudHSM holds the root key. The CSRs of subordinate CAs in AWS Private CA are signed by the CloudHSM root. Daily certificate issuance happens in Private CA (managed, IAM-controlled, no operational HSM concerns). The CloudHSM cluster can even be torn down between ceremonies — its encrypted backup stays in S3 with --never-expires, and a fresh cluster is provisioned only when the root must sign again.

This pattern naturally accommodates multi-PKI: N root CAs in CloudHSM (each owned by its own CU), each chained to M Private CA subordinates. CloudHSM is touched only during ceremonies.

The anti-pattern: one CU for everything

CU cu_pki_master   →   owns ALL keys for ALL PKIs

Tempting because it’s simple. It’s also wrong:

  • The audit story collapses — every signing event has the same username field.
  • Credential rotation affects every PKI simultaneously.
  • One compromised application credential exposes every CA in the cluster.

Don’t do this.
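This is easy to check mechanically if you keep a key inventory. The sketch below assumes a hypothetical inventory of (key label, PKI name, owner CU) tuples, which you would maintain yourself; it is not something CloudHSM exports in this shape. It flags any CU that owns keys from more than one PKI.

```python
from collections import defaultdict


def cus_spanning_multiple_pkis(inventory):
    """Return {cu: [pki, ...]} for every CU owning keys of more than one PKI.

    `inventory` is an iterable of (key_label, pki_name, owner_cu) tuples.
    """
    pkis_by_cu = defaultdict(set)
    for _label, pki, owner_cu in inventory:
        pkis_by_cu[owner_cu].add(pki)
    return {cu: sorted(pkis) for cu, pkis in pkis_by_cu.items()
            if len(pkis) > 1}


# Illustrative inventory: the last two rows reproduce the anti-pattern.
inventory = [
    ("corp_root_ca_2026", "corporate", "cu_pki_corporate"),
    ("iot_root_ca_2026", "iot", "cu_pki_iot"),
    ("lab_root_ca_2026", "lab", "cu_pki_master"),
    ("tls_issuing_ca_2026", "internal_tls", "cu_pki_master"),
]

print(cus_spanning_multiple_pkis(inventory))
# {'cu_pki_master': ['internal_tls', 'lab']}
```

An empty result is what Pattern A and Pattern B should both produce. Pattern B additionally splits one PKI across several CUs, which this check deliberately allows.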

Key Sharing — When and Why

key share is a primitive that confuses a lot of people. Here’s when it actually makes sense:

# As the owner CU, in CloudHSM CLI interactive mode:
aws-cloudhsm > login --username cu_pki_corporate --role crypto-user
aws-cloudhsm > key share \
    --filter attr.label="corp_issuing_ca_users_2026" attr.class=private-key \
    --username cu_ocsp_responder \
    --role crypto-user

After this, cu_ocsp_responder can sign with that key. It cannot export, delete, or re-share it. key unshare cleanly revokes the grant.

Legitimate use cases:

  • Zero-downtime credential rotation. The pattern from the AWS Security Blog: CU1 owns the keys, CU2 has them shared, both CUs are running. Rotate CU2’s password, switch traffic via load balancer, then rotate CU1. Documented in “AWS CloudHSM architectural considerations for crypto user credential rotation”.
  • Stand-by signers. A failover signer process running under a different CU can be pre-shared the same key for fast cutover.
  • Delegated OCSP responders. Where the OCSP responder needs to sign with a sub-CA’s key directly (small PKIs without a separate OCSP signing key).

What key share is not for: routine cross-PKI access. Different PKIs should not share keys. If you’re tempted to share, you probably want a separate sub-CA instead.

One subtle gotcha: there is no “verify only” share. The shared user gets full operational use of the key, gated by the key’s existing attributes. If you need finer-grained per-user permissions, the documented re:Post workaround is to wrap-out and re-import the key with restricted CKA_SIGN/CKA_DECRYPT/CKA_WRAP attributes per intended user — clunky, but functional.

And one rule of thumb: never share a Root CA key. A root should be operated by exactly one CU at any moment. If you need HA for the root, the answer is to issue a second sub-CA, not to share the root.

Evertrust + CloudHSM — The Practical Pattern

Most of my recent customer conversations on this topic have involved Evertrust Stream specifically, so let me ground the abstract patterns in a concrete deployment.

Evertrust Stream is a French PKI platform built around standard PKCS#11. Their own documentation says it explicitly: “Evertrust PKI integrates with any PKCS#11-compliant HSM out of the box. All major HSM vendors are supported without custom development. For cloud-native deployments, Evertrust also supports all major cloud HSM services (AWS, Azure, GCP).” The Stream platform is also designed for multi-CA hierarchies on top of one PKCS#11 connection.

That maps perfectly onto CloudHSM’s multi-CU model:

┌──────────────────────────────────────────────────┐
│ Evertrust Stream cluster (HA, 2 EC2 nodes)       │
│                                                  │
│  ┌────────────────────────────────────────────┐  │
│  │  CA hierarchy A → PKCS#11 login as cu_a    │  │
│  │  CA hierarchy B → PKCS#11 login as cu_b    │  │
│  │  CA hierarchy C → PKCS#11 login as cu_c    │  │
│  │                                            │  │
│  │  CU passwords pulled from Secrets Manager  │  │
│  │  at startup; rotated under Approach 2      │  │
│  └────────────────────────────────────────────┘  │
└────────────────────┬─────────────────────────────┘

                     │ PKCS#11 (CloudHSM Client SDK 5)
                     │ TLS to HSM ENI on TCP 2223-2225

┌──────────────────────────────────────────────────┐
│ Single CloudHSM cluster, 2 HSMs, 2 AZs           │
│                                                  │
│   cu_a owns CA-A keys                            │
│   cu_b owns CA-B keys                            │
│   cu_c owns CA-C keys                            │
│                                                  │
│   One slot. Three independent key namespaces.    │
└──────────────────────────────────────────────────┘

A few specifics that matter when you actually wire this up:

  • CloudHSM exposes exactly one PKCS#11 slot per cluster, presented as slot 1. This is the same constraint Keyfactor calls out in the EJBCA documentation. Slot-based isolation isn’t available — CU-based is.
  • Each Stream CA gets its own PKCS#11 login as a different CU. The CloudHSM PKCS#11 PIN convention is <cu_username>:<cu_password>. Stream’s per-CA cryptographic profile carries this PIN.
  • For exotic multi-cluster setups, Client SDK 5 supports multi-slot config (configure-pkcs11 add-cluster) — one process, two clusters, exposed as slot 1 and slot 2. Use this only when you’ve consciously decided to split PKIs across clusters.
  • Credentials live in Secrets Manager. The Stream EC2 instance has an IAM role with secretsmanager:GetSecretValue on the per-CU secret. Rotation runs Approach 2 from the AWS blog (dual CU + share + load-balanced cutover).
  • HSM key-ceremony quorum + Evertrust M-of-N ceremony combine cleanly. Both sides support it; align them in your ceremony script.

The same shape works for EJBCA, AD CS, OpenXPKI, or a custom Java-based CA. The interface is PKCS#11 (or KSP/CNG for Microsoft); the isolation model is per-CU.
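For the credential handling, a minimal sketch of the startup step might look like this. It assumes each CU's secret is stored as a JSON document with `username` and `password` fields, which is my own layout and not something CloudHSM or Stream mandates. `get_secret_value` is the standard boto3 Secrets Manager call, and the import is kept inside the function so the formatting logic can be exercised without AWS access.

```python
import json


def pkcs11_pin_from_secret(secret_string: str) -> str:
    """Turn a JSON secret into the CloudHSM PIN form <cu_username>:<cu_password>.

    Assumes a hypothetical secret layout: {"username": ..., "password": ...}.
    """
    secret = json.loads(secret_string)
    return f"{secret['username']}:{secret['password']}"


def fetch_pkcs11_pin(secret_id: str, region: str) -> str:
    """Fetch one CU's secret and format it as a PKCS#11 PIN."""
    import boto3  # third-party; only needed when actually calling AWS

    client = boto3.client("secretsmanager", region_name=region)
    response = client.get_secret_value(SecretId=secret_id)
    return pkcs11_pin_from_secret(response["SecretString"])


# Local example with a made-up secret value:
print(pkcs11_pin_from_secret('{"username": "cu_a", "password": "example-only"}'))
# cu_a:example-only
```

Scoping the instance role to a single secret ARN per CU keeps the IAM boundary aligned with the HSM boundary: the process that signs for hierarchy A can never read the PIN for hierarchy B.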

When to Split Into Separate Clusters

CloudHSM is billed per HSM-hour, not per CU, key, or operation. A two-HSM HA cluster costs the same whether it hosts one PKI or fifteen. Adding a second cluster doubles the steady-state cost: roughly +$2,300/month per region. So the default is to consolidate, not to split.
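The arithmetic is simple enough to run for your own PKI count. The sketch below uses the rough $2,300/month figure quoted in this article for a two-HSM cluster; treat it as a placeholder and substitute current pricing for your region.

```python
CLUSTER_MONTHLY_USD = 2300  # rough figure from this article, not a price quote


def monthly_cost(pki_count: int, one_cluster_per_pki: bool) -> int:
    """Steady-state monthly cost for a given number of PKIs."""
    clusters = pki_count if one_cluster_per_pki else 1
    return clusters * CLUSTER_MONTHLY_USD


for pki_count in (1, 5, 15):
    shared = monthly_cost(pki_count, one_cluster_per_pki=False)
    split = monthly_cost(pki_count, one_cluster_per_pki=True)
    print(f"{pki_count:>2} PKIs: shared ${shared:,}  split ${split:,}  "
          f"per-PKI on shared ${shared / pki_count:,.2f}")
```

At fifteen PKIs the split design comes to $34,500/month against $2,300 for the shared cluster, so each PKI's share of the consolidated cluster is about $153.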

That said, there are real reasons to split:

| Driver | One cluster, multi-CU | Separate clusters |
|---|---|---|
| Multiple PKIs in same security domain | Yes | |
| Cost sensitivity | Yes | |
| Different regulatory frames per PKI (eIDAS qualified vs internal) | possibly | Yes |
| “A breach of PKI-X must NEVER touch PKI-Y” (physical isolation) | partial | Yes |
| Prod / Non-Prod separation | | Yes (always) |
| Different AWS account ownership | partial | Yes |
| Divestiture potential (one PKI may leave with a divested entity) | acceptable | Yes |
| Different key-ceremony cadences with strict officer separation | with quorum | Yes |

A reasonable default for an enterprise:

  • Cluster #1 — Production hosting all production PKIs as separate CUs.
  • Cluster #2 — Non-production mirroring the structure for staging.
  • Optional Cluster #3 — Regulated if a specific PKI carries an isolation mandate (FDA 21 CFR Part 11 signing, eIDAS qualified trust service, payments).

Three clusters at most for the vast majority of customers. Not one per PKI.

The KMS Custom Key Store Question

A predictable follow-up question: “can we put KMS Custom Key Stores on top of the same CloudHSM cluster, one per PKI?”

Short answer: yes, but it doesn’t do what you want for CA signing.

A KMS Custom Key Store (CKS) backed by CloudHSM is bound 1:1 to a single cluster. KMS connects to the cluster as a single dedicated CU named kmsuser, and KMS keys created in this store generate AES-256 symmetric keys owned by kmsuser. The store only supports symmetric encryption keys — no asymmetric, no HMAC, no imported material, no auto-rotation, no multi-region.

Three implications:

  1. You can have many KMS keys per CKS, one per PKI tenant. Isolation between them is enforced at the KMS layer (key policies + IAM), not at the HSM CU layer. All those KMS keys are owned by the same kmsuser inside the HSM.
  2. CKS cannot host CA signing keys because asymmetric isn’t supported. Your RSA/ECDSA CA keys must be created via direct PKCS#11/JCE access against your per-PKI CUs.
  3. CKS coexists peacefully with your cu_pki_* users on the same cluster. kmsuser and the PKI CUs cannot see each other’s keys — same isolation rule as before.

So CKS has a place in a multi-PKI design, but for symmetric workloads adjacent to the PKI: encrypting the certificate database at rest, encrypting CRL archives, encrypting audit-log copies before shipping them to a tamper-proof account. Your CA private keys stay in CU-owned, PKCS#11-accessible territory.
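As a sketch of where the line falls, the helper below builds the arguments for a KMS `create_key` call targeting a CloudHSM-backed custom key store and refuses anything that is not a symmetric encryption key. The helper name and description text are hypothetical; `Origin`, `CustomKeyStoreId`, `KeySpec`, and `KeyUsage` are the real request parameters.

```python
def cks_create_key_kwargs(custom_key_store_id: str, tenant: str,
                          key_spec: str = "SYMMETRIC_DEFAULT") -> dict:
    """Arguments for kms.create_key in a CloudHSM custom key store.

    Raises ValueError for key specs the store cannot host, which is the
    reminder that CA signing keys belong to the per-PKI CUs instead.
    """
    if key_spec != "SYMMETRIC_DEFAULT":
        raise ValueError(
            f"{key_spec} cannot live in a custom key store; "
            "create it over PKCS#11/JCE under the PKI's own CU"
        )
    return {
        "Origin": "AWS_CLOUDHSM",
        "CustomKeyStoreId": custom_key_store_id,
        "KeySpec": key_spec,
        "KeyUsage": "ENCRYPT_DECRYPT",
        "Description": f"At-rest encryption for PKI tenant {tenant}",
    }


# The key store ID here is a made-up placeholder.
kwargs = cks_create_key_kwargs("cks-EXAMPLE", tenant="corporate")
print(kwargs["Origin"], kwargs["KeySpec"])
```

Passing the result as `boto3.client("kms").create_key(**kwargs)` would create one tenant's database-encryption key. Asking for `RSA_4096` fails before any API call is made.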

What I Learned

  • The right interface matters more than the right cryptography. KMS can sign, but it speaks REST. PKI platforms speak PKCS#11. Forcing them to bridge through a custom shim creates a glue layer that becomes the audit’s main concern, not the HSM. Choose the service that speaks your application’s language natively.

  • CloudHSM’s lack of partitions is a feature once you understand the CU model. Crypto Users give you the same logical isolation as partitions, with simpler operations (one cluster, one slot, many user namespaces) and better cost characteristics (you don’t pay per partition). The “we need partitions” objection comes from on-prem habits, not from a real isolation gap.

  • Multi-PKI on one cluster is the default; separate clusters are the exception. The default is driven by cost (HSM-hours, not key-counts) and by the strength of CU isolation. Split clusters only when you have a concrete reason: regulatory framing, physical isolation mandate, divestiture risk, or Prod/Non-Prod separation.

  • key share is for credential rotation, not for cross-PKI access. It’s a minimum-privilege use grant that pairs perfectly with the AWS-blog-recommended dual-CU rotation pattern. Different PKIs should not share keys; they should be different CUs that don’t share at all.

  • One giant CU is the worst answer. It collapses your audit narrative, makes credential rotation a global outage, and turns one compromised application into a total cluster compromise. If you find yourself drawing this on a whiteboard, you’ve already lost the design review.

  • Evertrust, EJBCA, AD CS, and OpenXPKI all map cleanly onto multi-CU CloudHSM because they were designed around PKCS#11 from day one. The architectural pattern (one PKI tenant = one PKCS#11 login as a dedicated CU) is consistent across all of them.

Key Takeaways

  • Pick the interface your PKI platform actually speaks. If your CA software wants PKCS#11, give it CloudHSM, not KMS. The native fit eliminates a glue layer that would otherwise be the weakest link in your signing path.
  • One CloudHSM cluster can host many PKIs through Crypto User isolation. Each PKI tenant gets its own CU; keys are owned by that CU and invisible to all others. This is the partition-equivalent in CloudHSM, enforced inside the HSM firmware.
  • Default to multi-PKI on one cluster; split only for concrete drivers. Cost, operational simplicity, and the strength of CU-based isolation make consolidation the right baseline. Separate clusters for regulatory, physical-isolation, divestiture, or Prod/Non-Prod requirements.
  • KMS Custom Key Store on CloudHSM is for symmetric workloads adjacent to the PKI, not for CA signing keys. CKS doesn’t support asymmetric.
  • Audit the share graph, not just the user list. CN_SHARE_OBJECT events in the CloudHSM HSM audit log are the multi-PKI watchpoint. Alarm on any unexpected share between CUs that aren’t part of your documented rotation pairs.
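For that last watchpoint, here is a minimal sketch of the alarm logic. It assumes you have already parsed the HSM audit log into dictionaries with `opcode`, `owner`, and `shared_with` fields; those field names are my own, and the parsing step is deliberately left out because it depends on your log pipeline. Only the CN_SHARE_OBJECT opcode name comes from the audit log itself.

```python
def unexpected_shares(events, allowed_pairs):
    """Return share events whose (owner, grantee) pair is not documented.

    `events` is an iterable of dicts with hypothetical keys
    'opcode', 'owner', 'shared_with'. `allowed_pairs` is an iterable of
    (owner_cu, grantee_cu) tuples from your rotation runbook.
    """
    allowed = set(allowed_pairs)
    return [
        event for event in events
        if event["opcode"] == "CN_SHARE_OBJECT"
        and (event["owner"], event["shared_with"]) not in allowed
    ]


# Documented rotation pair (illustrative names).
ROTATION_PAIRS = [("cu_pki_corporate", "cu_pki_corporate_b")]

events = [
    {"opcode": "CN_SHARE_OBJECT", "owner": "cu_pki_corporate",
     "shared_with": "cu_pki_corporate_b"},
    {"opcode": "CN_SHARE_OBJECT", "owner": "cu_pki_corporate",
     "shared_with": "cu_pki_iot"},
    {"opcode": "CN_LOGIN", "owner": "cu_pki_lab", "shared_with": None},
]

for event in unexpected_shares(events, ROTATION_PAIRS):
    print("ALERT:", event["owner"], "->", event["shared_with"])
# ALERT: cu_pki_corporate -> cu_pki_iot
```

The pairs are ordered on purpose: a grant from the standby CU back to the primary is a different event from the documented one and should also be questioned.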
