Multi-PKI on a Single CloudHSM Cluster — Why KMS Is the Wrong Tool for Certificate Authorities
AWS KMS can technically sign with asymmetric keys, but it speaks REST — not PKCS#11. For PKI workloads that need HSM-backed signing, key export, and multi-tenant isolation, CloudHSM's Crypto User model gives you partition-equivalent isolation without legacy constraints.
Table of Contents
- The Problem
- Problem 1: KMS speaks REST, not PKCS#11
- Problem 2: KMS hides the keys you said you wanted to control
- The Real Question: One Cluster Per PKI?
- How CU-Based Isolation Works
- Where the analogy breaks down
- The Architecture Patterns
- Pattern A: One CU per PKI tenant
- Pattern B: One CU per CA role
- Pattern C: CloudHSM root + AWS Private CA subordinates
- The anti-pattern: one CU for everything
- Key Sharing — When and Why
- Evertrust + CloudHSM — The Practical Pattern
- When to Split Into Separate Clusters
- The KMS Custom Key Store Question
- What I Learned
- Do It Yourself
- Key Takeaways
- Try It Now
You have a PKI. You put your CA keys in KMS. Everything works — kms:Sign returns valid signatures, your AWS-native code happily issues certificates, IAM gates everything cleanly. Until the day you bring in Evertrust, EJBCA, Microsoft AD CS, or any other PKI platform that wasn’t written specifically for the AWS SDK. And those platforms ask one simple question that KMS can’t answer: “where’s your PKCS#11 library?”
This is the second article in a small series about HSM-backed key management on AWS. The first one — When Your Keys Get Locked In — covered the case where KMS doesn’t let you export key material and the four ways out (BYOK fix, CloudHSM, XKS, Private CA). This one goes deeper on the path that most enterprises end up taking when they have a real PKI to operate: CloudHSM, with multiple PKIs sharing a single cluster, isolated through Crypto Users instead of partitions.
I’ll explain why KMS is the wrong interface for PKI even when it has the right cryptography, why CloudHSM’s lack of partitions is a feature, and how to lay out multiple CA hierarchies on one cluster without the multi-tenancy nightmares people imagine.
The Problem
There are two related problems that get conflated. Let’s separate them.
Problem 1: KMS speaks REST, not PKCS#11
KMS supports asymmetric Sign and Verify operations on RSA and ECC keys. From a pure cryptography standpoint, KMS can be a CA’s signing engine. But every off-the-shelf PKI platform — Evertrust Stream, EJBCA, Microsoft AD CS, OpenXPKI, Dogtag, Vault PKI — was designed around standardized HSM interfaces:
| PKI platform | Expected interface |
|---|---|
| Evertrust Stream | PKCS#11 |
| EJBCA / Keyfactor | PKCS#11 (P11-NG) or Java PKCS#11 provider |
| Microsoft AD CS | KSP / CNG |
| OpenXPKI | PKCS#11 |
| HashiCorp Vault PKI | PKCS#11 (auto-unseal also supports KMS, but CA signing prefers PKCS#11) |
| Custom OpenSSL CA tooling | OpenSSL Dynamic Engine (typically backed by PKCS#11) |
None of these speak aws kms sign --key-id ... --message ... natively. To get them to use a KMS key, you’d need a custom shim — a PKCS#11 driver that translates C_Sign into kms:Sign. Such things exist as community projects, but you’re now operating a glue layer in your CA’s signing path. That’s a single point of failure with no vendor support, sitting between FIPS-validated hardware and the auditor who wants to know how you sign certificates.
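To make the glue layer concrete, here is a minimal sketch of the translation such a shim performs. It is not taken from any existing project: `shim_sign` and the `MECHANISM_TO_KMS` table are hypothetical names, and `kms_client` is assumed to be a boto3 KMS client (or anything exposing the same `sign` call).

```python
import hashlib

# Hypothetical mapping from the PKCS#11 mechanism a CA requests to the
# KMS SigningAlgorithm string. Only two entries are shown.
MECHANISM_TO_KMS = {
    "CKM_SHA256_RSA_PKCS": "RSASSA_PKCS1_V1_5_SHA_256",
    "CKM_ECDSA_SHA256": "ECDSA_SHA_256",
}


def shim_sign(kms_client, key_id, mechanism, tbs_bytes):
    """Translate one C_Sign-style request into a kms:Sign call."""
    # A KeyError here means the CA asked for a mechanism the shim
    # does not know how to translate.
    algorithm = MECHANISM_TO_KMS[mechanism]
    # Hash locally and send the digest, so large to-be-signed
    # structures stay under the KMS message size limit.
    digest = hashlib.sha256(tbs_bytes).digest()
    response = kms_client.sign(
        KeyId=key_id,
        Message=digest,
        MessageType="DIGEST",
        SigningAlgorithm=algorithm,
    )
    return response["Signature"]
```

Every certificate and CRL your CA signs would pass through a function like this one, including its error handling, retries, and credential refresh, which is exactly the code an auditor will ask you to account for.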
Problem 2: KMS hides the keys you said you wanted to control
The other thing PKI platforms expect is direct, fine-grained access to key attributes: setting CKA_EXTRACTABLE, CKA_TRUSTED, CKA_SIGN, CKA_WRAP_WITH_TRUSTED. These are PKCS#11 attributes that determine whether a CA private key can be wrapped out for escrow, whether it can be used to wrap subordinate CA keys, whether it can sign at all. KMS doesn’t expose them — it’s an opinionated abstraction. Useful for application encryption; restrictive for PKI.
And then there’s the export question. The previous article covered it in detail: KMS keys cannot be exported in any form. For a root CA private key — the trust anchor that 20-year certificate hierarchies depend on — that’s not always acceptable. Auditors, divestiture clauses, and multi-cloud strategies all want a way to take your root key with you. CloudHSM allows it through wrapKey provided the key was created with CKA_EXTRACTABLE=TRUE. KMS never allows it.
So the answer for serious PKI workloads is CloudHSM. The next question — the one this article is really about — is what most customers ask immediately after: “do we need one CloudHSM cluster per PKI, or can we share?”
The Real Question: One Cluster Per PKI?
The instinctive answer from anyone who’s worked with on-premises HSMs is “no, we need separate partitions.” On a Thales Luna or Entrust nShield, every PKI gets its own partition — a cryptographically isolated slice of the HSM with its own master key. Different PKI? Different partition. That’s how you keep tenants apart.
CloudHSM doesn’t have partitions. There is exactly one slot per cluster, full stop. So at first glance, it looks like the answer is “one cluster per PKI”, which gets expensive fast, since an HA cluster is ~$2,300/month minimum.
That first glance is wrong. CloudHSM replaces the partition concept with Crypto Users, and the isolation guarantees are equivalent for almost every threat model. Once you understand the model, multi-PKI on one cluster is not just possible — it’s the recommended pattern.
How CU-Based Isolation Works
A CloudHSM cluster has four user types living inside the HSM firmware (not in IAM):
| User type | Can do | Cannot do |
|---|---|---|
| Admin (Crypto Officer) | Create/delete users, manage passwords, configure quorum (M-of-N), set CKA_TRUSTED | Cannot use any key — no encrypt, decrypt, sign, verify |
| Crypto User (CU) | Generate, manage, share, and use keys it owns | Cannot manage other users |
| Appliance User (AU) | Synchronize keys between HSMs in the cluster (encrypted blobs only) | Cannot use keys, cannot read plaintext key material |
| Unactivated admin | Temporary user for first-time cluster activation | Anything else |
Two facts matter for the multi-PKI story:
- AWS itself can’t use your keys. The AU only moves encrypted “masked objects” between HSMs; it has no key-use permissions. AWS administrators have no operator interface that allows reading or using customer keys.
- Admins can manage users but cannot use keys. That separation is enforced by the HSM firmware itself, not by a wrapping software layer.
The actual isolation primitive is in the docs:
“In AWS CloudHSM, the crypto user (CU) who creates the key owns it. The owner can use the key share and key unshare commands to share and unshare the key with other CUs.”
Translated: every key on the HSM has exactly one owner CU and zero or more shared users. A CU that is neither owner nor in the shared list cannot enumerate, see, or use the key. C_FindObjects will literally not return a handle. This is enforced inside the HSM firmware, at the same level a Luna partition’s MTK enforces partition isolation.
The shared-user grant is restrictive too. A user with whom a key has been shared can:
- Use it for crypto operations (sign, verify, encrypt, decrypt, HMAC)
A shared user cannot:
- Delete, export, derive, wrap, share, or unshare the key
- Modify any attribute
So a “share” is a minimum-privilege grant of cryptographic use — not ownership.
Cluster: cluster-prod (one slot, multiple HSMs across AZs)
│
├── CU: cu_pki_corporate
│ ├── corp_root_ca_2026 [owns]
│ ├── corp_issuing_ca_users_2026 [owns]
│ └── corp_issuing_ca_devices_2026 [owns]
│
├── CU: cu_pki_iot
│ ├── iot_root_ca_2026 [owns]
│ └── iot_issuing_ca_factory_a_2026 [owns]
│
└── CU: cu_pki_lab
└── lab_root_ca_2026 [owns]
cu_pki_iot logging in: sees only its own keys, period.
Cannot list, sign with, or even
discover the existence of corp_*
or lab_* keys.
That’s the isolation guarantee. For a multi-PKI scenario where each PKI is operated by a different team or service, this is exactly what you want — and it’s what a partition gives you on a Luna HSM.
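The ownership and sharing rules above can be written down as a toy model. This is a sketch for reasoning about the design, not the HSM's implementation: `HsmKey`, `visible_keys`, and `allowed` are hypothetical names, and the model ignores key attributes such as CKA_EXTRACTABLE that further restrict what even an owner can do.

```python
from dataclasses import dataclass, field

# Operations a shared (non-owner) CU may perform, per the rules above.
USE_OPERATIONS = {"sign", "verify", "encrypt", "decrypt", "hmac"}


@dataclass
class HsmKey:
    label: str
    owner: str
    shared_with: set = field(default_factory=set)


def visible_keys(keys, cu):
    """Labels a CU can find: keys it owns or that were shared with it."""
    return sorted(k.label for k in keys if cu == k.owner or cu in k.shared_with)


def allowed(key, cu, operation):
    """True if the CU may perform the operation in this simplified model."""
    if cu == key.owner:
        return True  # owner: full control (attribute checks not modelled)
    if cu in key.shared_with:
        return operation in USE_OPERATIONS  # shared user: crypto use only
    return False  # everyone else: the key does not exist


keys = [
    HsmKey("corp_root_ca_2026", "cu_pki_corporate"),
    HsmKey("iot_root_ca_2026", "cu_pki_iot"),
    HsmKey("iot_issuing_ca_factory_a_2026", "cu_pki_iot"),
    HsmKey("lab_root_ca_2026", "cu_pki_lab"),
]
print(visible_keys(keys, "cu_pki_iot"))
```

Running it lists only the two iot_* labels for cu_pki_iot, which is the behaviour you should expect to confirm against a real cluster before relying on it.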
Where the analogy breaks down
I want to be honest about the limits, because the security architects in the room will ask:
- The isolation is logical, not physical. All keys live in the same FIPS 140-3 Level 3 boundary. A vendor firmware vulnerability (extremely rare on Marvell LiquidSecurity, but non-zero) would in principle expose every CU’s keys. If your regulator demands physical separation (rare outside payment HSM territory), separate clusters are the answer.
- Admin compromise is total. A compromised Admin can create a new CU and share keys to it. Mitigation: enable quorum (M-of-N) on all admin operations. CloudHSM supports 2-of-2 to 8-of-8.
- CU credential compromise = all of that CU’s keys. This is true on Luna too (compromise the partition’s PIN, you have its keys). The mitigation is identical: store CU passwords in Secrets Manager with rotation, and run each PKI’s signer process with its own CU on its own EC2 instance with its own IAM role.
For a typical enterprise multi-PKI estate (corporate users, IoT/devices, lab instruments, internal TLS), CU-based isolation is sufficient. For payment cryptography, eIDAS qualified trust services, or a divestiture-bound PKI, you may want a dedicated cluster.
The Architecture Patterns
Before diving into the three patterns, here is the full picture — what the deployment actually looks like once everything is wired up:
The diagram above shows the moving parts: a single CloudHSM cluster spanning two Availability Zones, an EC2-hosted PKI signer running the Client SDK 5 PKCS#11 library, per-CU passwords pulled from Secrets Manager, encrypted backups landing in S3, and the dual audit story — HSM audit logs in CloudWatch and control-plane events in CloudTrail. The “isolation layer” box on the right is logical, not physical: it’s the same key store on the same HSMs, sliced into namespaces by Crypto User ownership. The patterns below differ in how you carve up that namespace.
Pattern A: One CU per PKI tenant
The default pattern. Each PKI namespace gets a CU. All CAs of that PKI hierarchy (root + issuing) live as keys owned by that CU.
Cluster
├── CU cu_pki_corporate → corporate hierarchy (root + issuing CAs)
├── CU cu_pki_iot → IoT hierarchy
├── CU cu_pki_lab → lab-instrument hierarchy
└── CU cu_pki_internal_tls → internal TLS issuing only
Operationally simple. The CU password is the PKI’s authentication boundary. Maps cleanly onto how Evertrust Stream and EJBCA think about “crypto tokens” (one PKCS#11 login = one logical key namespace). Compromise of one CU doesn’t expose the others.
The downside: inside one PKI, the root CA and the issuing CAs are reachable by the same credential. If you want to separate root from issuing, you go to Pattern B.
Pattern B: One CU per CA role
For high-assurance PKIs where the offline-root-online-issuing pattern matters:
CU cu_pki_corp_root # used only during ceremonies
CU cu_pki_corp_issuing_users # daily issuing, user certificates
CU cu_pki_corp_issuing_devices # daily issuing, device certificates
The root CU’s password is held by 2-3 PKI officers under quorum, kept “offline” in the operational sense — no daemon ever logs in as it except during planned ceremonies (sub-CA enrolment, sub-CA rollover, root CRL signing). The issuing CUs are used continuously by the live PKI service.
This is the closest CloudHSM analog to the classical air-gapped-root pattern. The root key never gets shared, and the issuing CAs get their own credentials.
Pattern C: CloudHSM root + AWS Private CA subordinates
The pattern AWS itself documents in “Create a portable root CA using AWS CloudHSM and ACM Private CA”. CloudHSM holds the root key. The CSRs of subordinate CAs in AWS Private CA are signed by the CloudHSM root. Daily certificate issuance happens in Private CA (managed, IAM-controlled, no operational HSM concerns). The CloudHSM cluster can even be torn down between ceremonies — its encrypted backup stays in S3 with --never-expires, and a fresh cluster is provisioned only when the root must sign again.
This pattern naturally accommodates multi-PKI: N root CAs in CloudHSM (each owned by its own CU), each chained to M Private CA subordinates. CloudHSM is touched only during ceremonies.
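Before tearing a ceremony cluster down, you want to be sure the backup you will restore from cannot age out. A minimal sketch of that step, assuming `hsm_client` is a boto3 `cloudhsmv2` client; the helper name `pin_latest_backup` is hypothetical, and pagination of the backup list is deliberately left out.

```python
def pin_latest_backup(hsm_client, cluster_id):
    """Mark the newest READY backup of a cluster as never-expiring."""
    response = hsm_client.describe_backups(
        Filters={"clusterIds": [cluster_id], "states": ["READY"]}
    )
    backups = response["Backups"]
    if not backups:
        raise RuntimeError(f"no READY backup found for {cluster_id}")
    # Pick the most recent backup by creation time.
    latest = max(backups, key=lambda b: b["CreateTimestamp"])
    hsm_client.modify_backup_attributes(
        BackupId=latest["BackupId"],
        NeverExpires=True,
    )
    return latest["BackupId"]
```

The returned backup ID is what you would record in the ceremony log, since it is the value a later cluster restore needs as its source backup.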
The anti-pattern: one CU for everything
CU cu_pki_master → owns ALL keys for ALL PKIs
Tempting because it’s simple. It’s also wrong:
- The audit story collapses: every signing event has the same username field.
- Credential rotation affects every PKI simultaneously.
- One compromised application credential exposes every CA in the cluster.
Don’t do this.
Key Sharing — When and Why
key share is a primitive that confuses a lot of people. Here’s when it actually makes sense:
# As the owner CU, in CloudHSM CLI interactive mode:
aws-cloudhsm > login --username cu_pki_corporate --role crypto-user
aws-cloudhsm > key share \
--filter attr.label="corp_issuing_ca_users_2026" attr.class=private-key \
--username cu_ocsp_responder \
--role crypto-user
After this, cu_ocsp_responder can sign with that key. It cannot export, delete, or re-share it. key unshare cleanly revokes the grant.
Legitimate use cases:
- Zero-downtime credential rotation. The pattern from the AWS Security Blog: CU1 owns the keys, CU2 has them shared, both CUs are running. Rotate CU2’s password, switch traffic via load balancer, then rotate CU1. Documented in “AWS CloudHSM architectural considerations for crypto user credential rotation”.
- Stand-by signers. A failover signer process running under a different CU can be pre-shared the same key for fast cutover.
- Delegated OCSP responders. Where the OCSP responder needs to sign with a sub-CA’s key directly (small PKIs without a separate OCSP signing key).
What key share is not for: routine cross-PKI access. Different PKIs should not share keys. If you’re tempted to share, you probably want a separate sub-CA instead.
One subtle gotcha: there is no “verify only” share. The shared user gets full operational use of the key, gated by the key’s existing attributes. If you need finer-grained per-user permissions, the documented re:Post workaround is to wrap-out and re-import the key with restricted CKA_SIGN/CKA_DECRYPT/CKA_WRAP attributes per intended user — clunky, but functional.
And one rule of thumb: never share a Root CA key. A root should be operated by exactly one CU at any moment. If you need HA for the root, the answer is to issue a second sub-CA, not to share the root.
Evertrust + CloudHSM — The Practical Pattern
Most of my recent customer conversations on this topic have involved Evertrust Stream specifically, so let me ground the abstract patterns in a concrete deployment.
Evertrust Stream is a French PKI platform built around standard PKCS#11. Their own documentation says it explicitly: “Evertrust PKI integrates with any PKCS#11-compliant HSM out of the box. All major HSM vendors are supported without custom development. For cloud-native deployments, Evertrust also supports all major cloud HSM services (AWS, Azure, GCP).” The Stream platform is also designed for multi-CA hierarchies on top of one PKCS#11 connection.
That maps perfectly onto CloudHSM’s multi-CU model:
┌──────────────────────────────────────────────────┐
│ Evertrust Stream cluster (HA, 2 EC2 nodes) │
│ │
│ ┌────────────────────────────────────────────┐ │
│ │ CA hierarchy A → PKCS#11 login as cu_a │ │
│ │ CA hierarchy B → PKCS#11 login as cu_b │ │
│ │ CA hierarchy C → PKCS#11 login as cu_c │ │
│ │ │ │
│ │ CU passwords pulled from Secrets Manager │ │
│ │ at startup; rotated under Approach 2 │ │
│ └────────────────────────────────────────────┘ │
└────────────────────┬─────────────────────────────┘
│
│ PKCS#11 (CloudHSM Client SDK 5)
│ TLS to HSM ENI on TCP 2223-2225
▼
┌──────────────────────────────────────────────────┐
│ Single CloudHSM cluster, 2 HSMs, 2 AZs │
│ │
│ cu_a owns CA-A keys │
│ cu_b owns CA-B keys │
│ cu_c owns CA-C keys │
│ │
│ One slot. Three independent key namespaces. │
└──────────────────────────────────────────────────┘
A few specifics that matter when you actually wire this up:
- CloudHSM exposes exactly one PKCS#11 slot per cluster, presented as slot 1. This is the same constraint Keyfactor calls out in the EJBCA documentation. Slot-based isolation isn’t available; CU-based isolation is.
- Each Stream CA gets its own PKCS#11 login as a different CU. The CloudHSM PKCS#11 PIN convention is <cu_username>:<cu_password>. Stream’s per-CA cryptographic profile carries this PIN.
- For exotic multi-cluster setups, Client SDK 5 supports multi-slot config (configure-pkcs11 add-cluster): one process, two clusters, exposed as slot 1 and slot 2. Use this only when you’ve consciously decided to split PKIs across clusters.
- Credentials live in Secrets Manager. The Stream EC2 instance has an IAM role with secretsmanager:GetSecretValue on the per-CU secret. Rotation runs Approach 2 from the AWS blog (dual CU + share + load-balanced cutover).
- HSM key-ceremony quorum + Evertrust M-of-N ceremony combine cleanly. Both sides support it; align them in your ceremony script.
The same shape works for EJBCA, AD CS, OpenXPKI, or a custom Java-based CA. The interface is PKCS#11 (or KSP/CNG for Microsoft); the isolation model is per-CU.
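The PIN convention is easy to get wrong in startup scripts, so here is a small sketch of assembling it from a Secrets Manager payload. The JSON shape (`username` and `password` fields) is an assumption for illustration, not a format mandated by CloudHSM or Evertrust, and `build_pkcs11_pin` is a hypothetical helper name.

```python
import json


def build_pkcs11_pin(secret_string):
    """Build the <cu_username>:<cu_password> PIN from a JSON secret."""
    secret = json.loads(secret_string)
    username = secret["username"]
    password = secret["password"]
    # Defensive check (an assumption of this sketch): a colon in the
    # username would make the PIN ambiguous to split.
    if ":" in username:
        raise ValueError("CU username must not contain ':'")
    return f"{username}:{password}"


# Example payload, as it might come back from GetSecretValue's SecretString.
print(build_pkcs11_pin('{"username": "cu_a", "password": "example-only"}'))
```

Keeping one secret per CU means the IAM policy on each signer's role can name exactly one secret ARN, which mirrors the per-CU isolation inside the HSM.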
When to Split Into Separate Clusters
CloudHSM is billed per HSM-hour, not per CU, key, or operation. A two-HSM HA cluster runs the same regardless of whether it hosts one PKI or fifteen. Adding a second cluster doubles the steady-state cost — roughly speaking, +$2,300/month per region. So the default is to consolidate, not to split.
That said, there are real reasons to split:
| Driver | One cluster, multi-CU | Separate clusters |
|---|---|---|
| Multiple PKIs in same security domain | Yes | |
| Cost sensitivity | Yes | |
| Different regulatory frames per PKI (eIDAS qualified vs internal) | possibly | Yes |
| “A breach of PKI-X must NEVER touch PKI-Y” (physical isolation) | partial | Yes |
| Prod / Non-Prod separation | | Yes (always) |
| Different AWS account ownership | partial | Yes |
| Divestiture potential (one PKI may leave with a divested entity) | acceptable | Yes |
| Different key-ceremony cadences with strict officer separation | with quorum | Yes |
A reasonable default for an enterprise:
- Cluster #1 — Production hosting all production PKIs as separate CUs.
- Cluster #2 — Non-production mirroring the structure for staging.
- Optional Cluster #3 — Regulated if a specific PKI carries an isolation mandate (FDA 21 CFR Part 11 signing, eIDAS qualified trust service, payments).
Three clusters at most for the vast majority of customers. Not one per PKI.
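To put rough numbers on the consolidate-by-default argument, the per-HSM-hour billing can be turned into a small calculation. The hourly rate below is an assumption chosen to land near the ~$2,300/month figure quoted above; check current pricing for your region before using it.

```python
HOURS_PER_MONTH = 730          # average hours in a month
ASSUMED_RATE_USD = 1.58        # per HSM-hour; illustrative, varies by region


def monthly_cost(clusters, hsms_per_cluster=2, rate=ASSUMED_RATE_USD):
    """Steady-state monthly cost: billed per HSM-hour, nothing else."""
    return clusters * hsms_per_cluster * rate * HOURS_PER_MONTH


def cost_per_pki(pki_count, clusters=1):
    """Share of the bill each PKI carries when they are consolidated."""
    return monthly_cost(clusters) / pki_count


for pkis in (1, 5, 15):
    print(f"{pkis:>2} PKIs on one cluster: "
          f"${cost_per_pki(pkis):,.0f} per PKI per month")
print(f"one cluster per PKI, 15 PKIs: ${monthly_cost(15):,.0f} per month")
```

The cluster bill is flat in the number of CUs and keys, so the per-PKI share falls as you consolidate, while one-cluster-per-PKI scales linearly with the PKI count.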
The KMS Custom Key Store Question
A predictable follow-up question: “can we put KMS Custom Key Stores on top of the same CloudHSM cluster, one per PKI?”
Short answer: yes, but it doesn’t do what you want for CA signing.
A KMS Custom Key Store (CKS) backed by CloudHSM is bound 1:1 to a single cluster. KMS connects to the cluster as a single dedicated CU named kmsuser, and KMS keys created in this store generate AES-256 symmetric keys owned by kmsuser. The store only supports symmetric encryption keys — no asymmetric, no HMAC, no imported material, no auto-rotation, no multi-region.
Three implications:
- You can have many KMS keys per CKS, one per PKI tenant. Isolation between them is enforced at the KMS layer (key policies + IAM), not at the HSM CU layer. All those KMS keys are owned by the same kmsuser inside the HSM.
- CKS cannot host CA signing keys because asymmetric keys aren’t supported. Your RSA/ECDSA CA keys must be created via direct PKCS#11/JCE access against your per-PKI CUs.
- CKS coexists peacefully with your cu_pki_* users on the same cluster. kmsuser and the PKI CUs cannot see each other’s keys, the same isolation rule as before.
So CKS has a place in a multi-PKI design, but for symmetric workloads adjacent to the PKI: encrypting the certificate database at rest, encrypting CRL archives, encrypting audit-log copies before shipping them to a tamper-proof account. Your CA private keys stay in CU-owned, PKCS#11-accessible territory.
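As a sketch of what one of those adjacent symmetric keys looks like in practice, the helper below builds the arguments for a KMS `create_key` call against a CloudHSM-backed store. The helper name, the description text, and the tenant naming are hypothetical; the argument names are the ones the KMS CreateKey API uses.

```python
def cks_key_request(custom_key_store_id, tenant, key_spec="SYMMETRIC_DEFAULT"):
    """Build create_key arguments for a key in a CloudHSM key store."""
    # The store only supports symmetric encryption keys, so refuse
    # anything else before the API call does.
    if key_spec != "SYMMETRIC_DEFAULT":
        raise ValueError(
            f"{key_spec} is not supported in a CloudHSM key store; "
            "create CA signing keys through PKCS#11 as a per-PKI CU instead"
        )
    return {
        "CustomKeyStoreId": custom_key_store_id,
        "Origin": "AWS_CLOUDHSM",
        "KeySpec": "SYMMETRIC_DEFAULT",
        "KeyUsage": "ENCRYPT_DECRYPT",
        "Description": f"At-rest encryption for PKI tenant {tenant}",
    }


# With boto3 this would be passed as: kms.create_key(**request)
request = cks_key_request("cks-example", "corporate")
print(request["Origin"], request["KeySpec"])
```

Per-tenant separation for these keys then comes from the key policy you attach to each one, not from anything inside the HSM.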
What I Learned
- The right interface matters more than the right cryptography. KMS can sign, but it speaks REST. PKI platforms speak PKCS#11. Forcing them to bridge through a custom shim creates a glue layer that becomes the audit’s main concern, not the HSM. Choose the service that speaks your application’s language natively.
- CloudHSM’s lack of partitions is a feature once you understand the CU model. Crypto Users give you the same logical isolation as partitions, with simpler operations (one cluster, one slot, many user namespaces) and better cost characteristics (you don’t pay per partition). The “we need partitions” objection comes from on-prem habits, not from a real isolation gap.
- Multi-PKI on one cluster is the default; separate clusters are the exception. The default is driven by cost (HSM-hours, not key-counts) and by the strength of CU isolation. Split clusters only when you have a concrete reason: regulatory framing, physical isolation mandate, divestiture risk, or Prod/Non-Prod separation.
- key share is for credential rotation, not for cross-PKI access. It’s a minimum-privilege use grant that pairs perfectly with the AWS-blog-recommended dual-CU rotation pattern. Different PKIs should not share keys; they should be different CUs that don’t share at all.
- One giant CU is the worst answer. It collapses your audit narrative, makes credential rotation a global outage, and turns one compromised application into a total cluster compromise. If you find yourself drawing this on a whiteboard, you’ve already lost the design review.
- Evertrust, EJBCA, AD CS, and OpenXPKI all map cleanly onto multi-CU CloudHSM because they were designed around PKCS#11 from day one. The architectural pattern (one PKI tenant = one PKCS#11 login as a dedicated CU) is consistent across all of them.
Do It Yourself
Key Takeaways
- Pick the interface your PKI platform actually speaks. If your CA software wants PKCS#11, give it CloudHSM, not KMS. The native fit eliminates a glue layer that would otherwise be the weakest link in your signing path.
- One CloudHSM cluster can host many PKIs through Crypto User isolation. Each PKI tenant gets its own CU; keys are owned by that CU and invisible to all others. This is the partition-equivalent in CloudHSM, enforced inside the HSM firmware.
- Default to multi-PKI on one cluster; split only for concrete drivers. Cost, operational simplicity, and the strength of CU-based isolation make consolidation the right baseline. Separate clusters for regulatory, physical-isolation, divestiture, or Prod/Non-Prod requirements.
- KMS Custom Key Store on CloudHSM is for symmetric workloads adjacent to the PKI, not for CA signing keys. CKS doesn’t support asymmetric.
- Audit the share graph, not just the user list. CN_SHARE_OBJECT events in the CloudHSM HSM audit log are the multi-PKI watchpoint. Alarm on any unexpected share between CUs that aren’t part of your documented rotation pairs.
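The share-graph check in the last takeaway can be sketched as a simple allow-list comparison. The event records here are assumed to be already parsed into dictionaries with `opcode`, `owner`, and `shared_with` fields; those field names are hypothetical and do not reflect the raw audit log layout, which you would need to parse and resolve to CU names first.

```python
def unexpected_shares(events, rotation_pairs):
    """Return share events between CUs that are not a documented pair."""
    # A rotation pair is unordered: either CU may own and share.
    allowed = {frozenset(pair) for pair in rotation_pairs}
    return [
        event
        for event in events
        if event["opcode"] == "CN_SHARE_OBJECT"
        and frozenset((event["owner"], event["shared_with"])) not in allowed
    ]


rotation_pairs = [("cu_pki_corporate", "cu_pki_corporate_b")]
events = [
    {"opcode": "CN_SHARE_OBJECT", "owner": "cu_pki_corporate",
     "shared_with": "cu_pki_corporate_b"},
    {"opcode": "CN_SHARE_OBJECT", "owner": "cu_pki_corporate",
     "shared_with": "cu_pki_iot"},
    {"opcode": "CN_GENERATE_KEY_PAIR", "owner": "cu_pki_lab",
     "shared_with": None},
]
for event in unexpected_shares(events, rotation_pairs):
    print("ALERT:", event["owner"], "->", event["shared_with"])
```

Keeping the allow-list in version control next to your rotation runbook gives reviewers one place to see which cross-CU grants are supposed to exist.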
Try It Now
- Provision a test cluster by following the CloudHSM Getting Started guide. Two hsm2m.medium HSMs across two AZs is enough to validate the multi-CU pattern end-to-end.
- Create three CUs (cu_pki_a, cu_pki_b, cu_pki_c) and generate RSA-3072 key pairs as each. Then log in as one and run key list --verbose to confirm you only see your own keys. The HSM user types reference and Keys in CloudHSM doc walk through the commands.
- Test key share between two of your CUs and verify the shared user can sign but not export. The share/unshare command reference has a worked example.
- Wire up PKCS#11 for your PKI platform. The CloudHSM PKCS#11 library docs and the multi-slot configuration guide cover Client SDK 5.
- Read the AWS Security Blog hybrid pattern: Create a portable root CA using AWS CloudHSM and ACM Private CA — the canonical reference for “CloudHSM root + Private CA subordinates”, which extends naturally to multi-PKI.
- Plan rotation early: AWS CloudHSM architectural considerations for crypto user credential rotation explains the dual-CU + key-share pattern that makes rotation a non-event for live PKIs.
- Confirm the audit story: filter your /aws/cloudhsm/<cluster-id> CloudWatch Logs group on CN_GENERATE_KEY_PAIR, CN_SIGN, and CN_SHARE_OBJECT to see how the HSM audit log captures per-CU activity. The full opcode list is in the audit log reference.
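For the audit-log step, a minimal sketch of the CloudWatch Logs query follows. It only builds the request arguments; the helper name `audit_filter_request` is hypothetical, and passing the result to a boto3 `logs` client's `filter_log_events` is left to you. The `?term` syntax is the CloudWatch filter-pattern form for matching any one of several terms.

```python
WATCHED_OPCODES = ("CN_GENERATE_KEY_PAIR", "CN_SIGN", "CN_SHARE_OBJECT")


def audit_filter_request(cluster_id, start_ms):
    """Build filter_log_events arguments for the HSM audit log group."""
    return {
        "logGroupName": f"/aws/cloudhsm/{cluster_id}",
        # "?A ?B ?C" matches events containing any one of the terms.
        "filterPattern": " ".join(f"?{opcode}" for opcode in WATCHED_OPCODES),
        # Start of the search window, in milliseconds since the epoch.
        "startTime": start_ms,
    }


# With boto3 this would be passed as: logs.filter_log_events(**request)
request = audit_filter_request("cluster-example", 0)
print(request["logGroupName"])
print(request["filterPattern"])
```

If an opcode never appears in the results, check the audit log reference to confirm the HSM records that operation at all before concluding nothing happened.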
