Phantom — Technical Architecture
Architecture assessment of the Phantom secrets injection system — the core product. Other solution concepts (Veilnet, Cloakfs, Specter, Lockbox) are preserved in the Idea Bin below.
Phantom Architecture
Verdict: Architecturally Sound
The trust model is correct: webhook = UX convenience, cryptographic attestation = security boundary. The explicit acknowledgment that cluster-admin, system:masters, and the cloud provider can all bypass the webhook — and the design that makes this bypass irrelevant — is the strongest architectural decision.
How It Works
Phantom is a Kubernetes operator that injects a sidecar via mutating webhook. The sidecar fetches secrets from an EU-hosted OpenBao/Vault instance directly into process memory — secrets never touch etcd, never enter Kubernetes Secrets, and the cloud provider never holds the keys.
Key Strengths
- Secrets never touch etcd. Eliminates an entire class of attacks (etcd dump, backup exfiltration, KMS compulsion). The correct approach for managed Kubernetes where you have zero control over the control plane.
- Three-tier caching is well-designed. Hot cache → sealed local cache → grace period progression is operationally sound. Sealed cache key derivation from SA token + cluster HMAC is reasonable.
- Circuit breaker on the webhook is the right pattern for fail-closed security products. The override escape hatch (namespace label) is correctly positioned as an auditable last resort.
- Canary injection via namespace labels is operationally mature thinking for a product that modifies every pod in the cluster.
Known Concerns
- gVisor as “optional lightweight sandbox” is undersold. Without it, a root-level attacker on the node can read process memory via `/proc/[pid]/mem`. The “optionally” qualifier weakens the story.
- The sidecar is a single point of failure per pod. If `phantom-proxy` crashes and the sealed cache is expired, the application loses access to all secrets. Consider a direct (attested) fallback path to OpenBao.
- Env var patching requires applications to use environment variables or a specific socket protocol. Applications that read secrets from files need a different mechanism — solvable but not addressed.
Technology Choices: Correct
Go for the webhook/operator/sidecar is the standard choice with first-class Kubernetes client libraries. OpenBao as the external secrets source is the right call. AMD SEV-SNP / Intel TDX for attestation is the correct hardware trust anchor.
Trust Model
“Webhook = UX, Crypto = Security” — Correct and Well-Reasoned
The document’s analysis of who can bypass the webhook and why that doesn’t matter (secrets aren’t in Kubernetes, attestation gates key release) is technically sound.
One Gap
The trust model assumes OpenBao is outside the cloud provider’s jurisdiction. If a customer misconfigures OpenBao to run inside the US cloud, the entire model collapses. The architecture should enforce or verify OpenBao’s location as part of the attestation flow.
Technical Feasibility
Phantom Complexity Breakdown
- Mutating webhook: well-understood pattern, excellent Go libraries. 2-3 weeks for a senior Go engineer.
- OpenBao integration (secret fetch, caching, renewal): 3-4 weeks. Three-tier cache adds complexity but is well-scoped.
- Sidecar injection with mesh awareness: 4-6 weeks. Compatibility matrix (Istio, Linkerd, OTel, Dapr, GKE FUSE) is the time sink.
- Circuit breaker + operator lifecycle: 2-3 weeks.
- Cross-provider testing matrix: 4-6 weeks. The hidden cost — testing on GKE Standard, GKE Autopilot, EKS (EC2 + Fargate), and AKS.
- Attestation (SEV-SNP/TDX): 6-8 weeks. Requires specialized knowledge.
- Total: ~5-7 months for production-ready Phantom with attestation.
What Needs More Research
- Nitro Enclaves integration — fundamentally different from SEV-SNP/TDX. Needs PoC before committing.
- eBPF memory-access monitoring — what can eBPF detect that’s actionable? Detect-and-alert vs. detect-and-block?
Key Technical Limitations
What Phantom Cannot Do
- Cannot protect data processed in cleartext. Once the app decrypts a secret, data exists in cleartext in application memory. TEE mitigates this but isn’t universal.
- Cannot protect against a compromised application. If the application is malicious (supply chain attack), it has legitimate access to decrypted secrets.
- Cannot protect Kubernetes metadata. Pod names, labels, annotations, network policies — all visible to the cloud provider.
- Cannot protect against hardware-level attacks on TEEs. AMD SEV-SNP and Intel TDX have had side-channel vulnerabilities (CacheWarp, speculative execution).
- Cannot enforce key sovereignty after key release. Once a secret is released into the sidecar’s memory, it’s in the cloud provider’s infrastructure.
- Cannot protect against legal coercion of the customer. This product protects against US extraterritorial reach, not all legal compulsion.
Protection Model Breakdown
| Scenario | Protected? | Why |
|---|---|---|
| Cloud provider dumps etcd | Yes | Secrets are never in etcd |
| Cloud provider reads node memory (no TEE) | No | Secrets in cleartext in process memory |
| Cloud provider reads node memory (with TEE) | Yes (probably) | TEE encrypts memory, but side-channel attacks exist |
| CLOUD Act subpoena for cloud provider | Yes | Provider has no keys or plaintext to hand over |
| Compromised application exfiltrates secrets | No | App has legitimate access to decrypted secrets |
| OpenBao in EU is compromised | No | All secrets exposed at the source |
| MITM on OpenBao connection (no TEE) | Partial | TLS protects transit, but endpoint isn’t verified without attestation |
| Kubernetes API server audit logs | Partial | Pod specs logged (env var names, not values if using socket refs) |
| Node-level debugger / ptrace | No (without TEE) | Standard OS access allows memory inspection |
Attacks NOT Defended Against
- Supply chain attacks on application container images
- Side-channel attacks on TEE implementations (timing, power analysis, cache-based)
- Social engineering of personnel with access to OpenBao
- Insider threats from the customer’s own team
- Network-level DDoS preventing connectivity to OpenBao
- Container escape followed by host memory access (without TEE)
- Coerced firmware updates on TEE hardware by cloud provider at government request
Cross-Provider Compatibility
Documented Provider Differences: Exceptionally Thorough
The level of detail on GKE private cluster firewall rules, AKS Admissions Enforcer behavior, EKS Fargate limitations, and marketplace packaging constraints is production-grade knowledge.
Phantom Cross-Provider Status
Phantom’s core architecture (webhook + sidecar + external secrets) works across all three providers with provider-specific code paths for networking, identity, and attestation.
Undocumented Issues to Address
- GKE Workload Identity Federation — default SA token behavior changes may affect sealed cache key derivation.
- EKS Pod Identity — sidecar’s OpenBao auth must support both traditional K8s auth and provider-specific identity federation.
- AKS Node Auto-Provisioning (Karpenter) — eBPF programs must handle different kernel versions within the same cluster.
- GKE Gateway API migration — network policies may need to understand Gateway API resources.
- EKS Access Entries — changes who can interact with the API server and bypass webhooks.
- Multi-tenant GKE clusters (GKE Enterprise) — fleet-level policies can override per-cluster webhook configurations.
- ARM / Graviton nodes — eBPF programs, sidecar images, and crypto must be multi-arch.
- Windows node pools — current architecture is Linux-only. Should be explicitly documented as unsupported.
- Spot / preemptible node eviction — sidecar must handle SIGTERM gracefully and clean up sealed cache.
- Network policies (Calico/Cilium) — sidecar needs explicit NetworkPolicy rules to reach OpenBao.
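The last point can be made concrete. A sketch of the egress rule the sidecar would need — the namespace, label, and OpenBao address below are illustrative placeholders, not Phantom's actual configuration:

```yaml
# Illustrative only: namespace, pod label, and the OpenBao endpoint
# are placeholders. Port 8200 is the OpenBao/Vault API default.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-phantom-egress-openbao
  namespace: production
spec:
  podSelector:
    matchLabels:
      phantom.io/injected: "true"   # hypothetical injection label
  policyTypes:
    - Egress
  egress:
    - to:
        - ipBlock:
            cidr: 203.0.113.10/32   # EU-hosted OpenBao (example address)
      ports:
        - protocol: TCP
          port: 8200
    - ports:                        # DNS resolution
        - protocol: UDP
          port: 53
```

Note that in a default-deny namespace, forgetting the DNS rule is the classic failure mode: the sidecar resolves the OpenBao hostname before it ever opens the TCP connection.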
Scalability Analysis
Pod Scale Assessment
| Component | 100 Pods | 1,000 Pods | 10,000 Pods |
|---|---|---|---|
| Webhook | Trivial | Fine | Needs horizontal scaling or namespace sharding |
| Sidecar (per-pod) | ~5 GB (50MB each) | ~50 GB | ~500 GB — significant |
| OpenBao connections | 100 concurrent | 1,000 (within HA capacity) | 10,000 — pooling mandatory |
| eBPF programs | Negligible | Moderate (per-node) | Same as 1K if node count is stable |
| Operator | Single replica | Single + leader election | May need sharded reconciliation |
OpenBao — Biggest Scalability Risk
OpenBao as Single External Dependency — Bottleneck Risk
10,000 pod restarts during a rolling deployment = 10,000 OpenBao requests in a short window. With 3 secrets/pod at a 5-minute TTL, renewals alone generate ~100 req/s steady-state (10,000 pods × 3 secrets / 300 s). A 3-node HA cluster handles this, but deployment bursts could saturate it.
Missing Mitigations
- Request coalescing — if 50 pods request the same secret simultaneously, OpenBao should be hit once, not 50 times.
- Batch secret fetch — 3 secrets in one API call instead of 3 sequential calls reduces connection overhead 3x.
- Staggered renewal — add jitter to TTL to spread renewal load.
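The coalescing mitigation can be sketched with the singleflight pattern. The hand-rolled version below is a minimal stdlib-only sketch (Go's `golang.org/x/sync/singleflight` provides a production implementation); it is illustrative, not Phantom's actual code:

```go
package main

import "sync"

// Group coalesces concurrent requests for the same key: callers that
// arrive while a fetch is in flight wait for, and share, its result
// instead of issuing duplicate requests to the backend.
type Group struct {
	mu    sync.Mutex
	calls map[string]*call
}

type call struct {
	wg  sync.WaitGroup
	val string
	err error
}

// Do invokes fetch for key, ensuring only one fetch per key runs at a
// time; concurrent callers for the same key block and share the result.
func (g *Group) Do(key string, fetch func() (string, error)) (string, error) {
	g.mu.Lock()
	if g.calls == nil {
		g.calls = make(map[string]*call)
	}
	if c, ok := g.calls[key]; ok {
		// A fetch for this key is already in flight: wait and share.
		g.mu.Unlock()
		c.wg.Wait()
		return c.val, c.err
	}
	c := new(call)
	c.wg.Add(1)
	g.calls[key] = c
	g.mu.Unlock()

	c.val, c.err = fetch()
	c.wg.Done()

	g.mu.Lock()
	delete(g.calls, key)
	g.mu.Unlock()
	return c.val, c.err
}
```

Staggered renewal then needs no machinery at all: pick each renewal deadline at a random point in a window (say, between 50% and 75% of the TTL) so fleet-wide renewals never align.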
eBPF Overhead at Scale
At 100 pods per node, a memory-access tracepoint on sys_read/sys_write could fire millions of times per second. Even incrementing a counter adds 50-200ns per syscall.
Recommendation
eBPF monitoring should be opt-in per namespace, not cluster-wide. The attestation + secret injection provides sufficient security without continuous syscall monitoring.
Tech Stack
Go — Right Choice for Phantom
Go is Correct For
- Webhook server (first-class `controller-runtime` support)
- Operator/controller (standard K8s operator pattern)
- Sidecar proxy (network I/O, gRPC)
Consider Rust For
- Sidecar if memory footprint becomes a scaling issue (5-10MB vs 30-50MB)
- Cryptographic hot paths (hardware acceleration)
OpenBao vs Alternatives
| Alternative | Pros | Cons |
|---|---|---|
| OpenBao (chosen) | Open-source fork, no BSL risk, proven at scale, transit engine | Younger project, smaller plugin ecosystem |
| HashiCorp Vault | Battle-tested, extensive ecosystem | BSL license — legal risk for commercial product |
| CyberArk Conjur | Enterprise pedigree, good K8s integration | Less flexible API, proprietary core |
| Cloud KMS (AWS/GCP/Azure) | Native integration, managed | Defeats the entire purpose |
| SOPS + Age/KMS | Simple, file-based | No dynamic secrets, no lease management |
| Infisical | Modern UI, good K8s integration | Less proven at scale, SaaS-first |
OpenBao is the Correct Choice
The only option that is: (a) open-source with permissive license, (b) proven at scale, (c) supports transit encryption + dynamic secrets + PKI, and (d) can be self-hosted in the customer’s jurisdiction.
eBPF vs Alternatives for Monitoring
| Alternative | Pros | Cons |
|---|---|---|
| eBPF (chosen) | Kernel-level visibility, low overhead, no app changes | Kernel version dependencies, CO-RE complexity |
| ptrace-based | Works everywhere | 10-100x performance overhead |
| seccomp-bpf | Blocks syscalls, no overhead for allowed calls | Binary allow/deny only, no monitoring |
| Falco (eBPF-based) | Mature, rule-based, good K8s integration | Additional dependency, overlap |
| auditd | Well-understood kernel audit subsystem | High overhead at scale, log-based |
Recommendation
Make eBPF monitoring a Phase 2 feature, not part of the MVP. If customers demand runtime visibility, integrate with Falco rather than building a custom monitoring framework.
MVP Scope — “Secrets That Never Touch etcd”
Phantom Core Components
- Mutating admission webhook that injects the `phantom-proxy` sidecar into labeled pods
- Sidecar that fetches secrets from external OpenBao and exposes them via environment variables, a Unix domain socket, and a mounted tmpfs file
- In-memory cache with TTL-based renewal (skip sealed local cache for MVP)
- Pre-flight connectivity check (Job-based, writes to ConfigMap)
- Helm chart designed for EKS add-on constraints (no hooks, no lookup)
- Single-provider launch: GKE Standard (simplest webhook behavior)
What to Cut from v1
| Feature | Cut? | Reason |
|---|---|---|
| TEE attestation (SEV-SNP/TDX) | Cut from MVP | Can be added as policy upgrade; injection works without it |
| Sealed local cache (tier 2) | Cut from MVP | In-memory cache + grace period is sufficient initially |
| eBPF monitoring | Cut from MVP | Defense-in-depth, not core value proposition |
| gVisor sandbox | Cut from MVP | TEE provides better guarantees anyway |
| Circuit breaker | Include | Critical for production safety |
| Canary injection | Cut from MVP | Nice-to-have, not launch-critical |
| Multi-provider support | GKE first | EKS in v1.1, AKS in v1.2 |
Critical Path to First Deployable Version
1. Project scaffolding, CI/CD, Helm chart skeleton
2. Mutating webhook (injection, namespace selection, fail-closed)
3. Sidecar (OpenBao auth, secret fetch, env var injection, socket API)
4. In-memory cache with TTL renewal, grace period
5. Pre-flight connectivity check Job
6. Circuit breaker implementation
7. Integration testing on GKE Standard (public + private clusters)
8. Documentation, Helm chart polish, beta program with 2-3 design partners
9. GKE Marketplace submission, public launch
Timeline: ~4 months to MVP with 3-4 engineers
This assumes full-time focus and no TEE/eBPF work.
Comparison to Alternatives
Alt 1: Full Confidential Computing (Just Use TEEs)
| Aspect | Phantom Approach | Full CC Approach |
|---|---|---|
| Secret protection | External OpenBao + attestation | Hardware memory encryption |
| Complexity | Custom webhook + sidecar | Node pool config only |
| Cross-provider | Works everywhere (with caveats) | GKE/AKS only; EKS different model |
| Cost | Software license + OpenBao ops | 6-10% perf overhead + higher instance cost |
| Protection scope | Secrets only | All memory, all computation |
Trade-off: Full CC is simpler but more expensive and less available. Phantom works on standard VMs and adds CC as optional enhancement — correct positioning for reaching the broadest market.
Alt 2: Sovereign Cloud (Use EU Providers)
| Aspect | Phantom | Sovereign Cloud |
|---|---|---|
| US access risk | Eliminated by crypto | Eliminated by jurisdiction |
| Cloud maturity | AWS/GCP/Azure (best-in-class) | EU providers lag in services and scale |
| Migration effort | Install operator + OpenBao | Full infrastructure migration |
| Multi-region/global | Yes (US clouds have global regions) | Limited to EU regions |
Alt 3: Client-Side Encryption Libraries
| Aspect | Phantom | Client-Side Libraries |
|---|---|---|
| Application changes | Zero (transparent) | Requires code changes in every app |
| Language support | Any (sidecar-based) | One library per language |
| Coverage | All pods automatically | Only integrated applications |
| Adoption friction | Low (label a namespace) | High (modify every application) |
Alt 4: VPN to On-Premises HSM
Technically works but adds significant operational complexity (VPN management, on-premises infrastructure, latency). Phantom’s managed OpenBao is operationally simpler. However, for customers with existing on-prem HSMs (banks, defense), this should be a supported deployment mode.
Technical Risks
High-Impact Risks
1. OpenBao Project Viability
Smaller contributor base than Vault. If the project loses momentum, you’re building on an under-maintained foundation. Mitigation: Abstract behind an interface; monitor activity; support upstream Vault as alternative backend.
2. TEE Vulnerability Disclosure
A major vulnerability in AMD SEV-SNP or Intel TDX (like CacheWarp) would undermine the attestation story. Mitigation: Position TEE as defense-in-depth, not sole guarantee. Maintain rapid response capability for advisories.
3. Cloud Provider API Changes
The three providers frequently change managed service behavior (AKS default egress removal, Kata CC sunset, GKE Autopilot restrictions). Mitigation: Aggressive compatibility testing in CI, pre-flight checks, and provider DevRel partnerships.
4. Webhook Stability Under Load
A crashed webhook will hold every pod Pending (fail-closed). Operationally catastrophic. Mitigation: Circuit breaker + bypass escape hatch. Add chaos testing to CI.
5. Secret Caching Correctness
Three-tier cache introduces eventual consistency. A rotated secret may be stale for up to 20 minutes — significant during breach response. Mitigation: Implement a “force rotation” signal from operator to sidecar that bypasses the cache.
Dependency Risks
| Dependency | Risk | Severity |
|---|---|---|
| OpenBao | Project momentum, fork sustainability | High |
| AMD SEV-SNP / Intel TDX | Hardware vulnerabilities, firmware updates | Medium |
| controller-runtime (Go) | Well-maintained by K8s SIG | Low |
| cilium/ebpf (Go) | Well-maintained, backed by Isovalent/Cisco | Low |
| SPIFFE/SPIRE | CNCF graduated, active development | Low |
| go-sev-guest | Smaller project, Google-maintained | Medium |
Architecture Improvements
Concrete changes that raise the architecture score to 8.5/10.
A1. DaemonSet Mode — Per-Node Secret Proxy
Offer a DaemonSet mode where one Phantom agent per node handles secrets for all pods via Unix domain socket.
┌──────────────────────────────────────┐
│ Node │
│ [Pod A] [Pod B] [Pod C] [Pod D] │
│ │ │ │ │ │
│ └───────┴───┬───────┘ UDS │
│ │ │
│ [Phantom DaemonSet, ~80MB] │
│ [ Shared Cache ] │
│ │ mTLS │
└────────────────┴─────────────────────┘
│
[ OpenBao EU ]
| Aspect | Sidecar Mode | DaemonSet Mode |
|---|---|---|
| Memory (100 nodes, 10K pods) | ~500 GB | ~8 GB |
| Pod isolation | Full (per-pod process) | Shared (node-level) |
| Blast radius of crash | 1 pod | All pods on node |
| Secret cache deduplication | No (same secret cached N times) | Yes (one copy per node) |
| Best for | High-security, <500 pods | High-density, >1000 pods |
A2. SecretProvider Interface Abstraction
Abstract the secrets backend behind a SecretProvider interface from day one to reduce OpenBao project risk and widen addressable market.
type SecretProvider interface {
GetSecret(ctx context.Context, path string, identity PodIdentity) (*Secret, error)
WatchSecret(ctx context.Context, path string) (<-chan SecretEvent, error)
RevokeLeases(ctx context.Context, identity PodIdentity) error
HealthCheck(ctx context.Context) error
}
| Provider | Priority | Sovereignty |
|---|---|---|
| openbao | v1.0 (launch) | Full (EU-hosted) |
| vault | v1.0 (launch) | Full (customer-controlled) |
| aws-secrets-manager | v1.2 | None (US jurisdiction) |
| gcp-secret-manager | v1.2 | None (US jurisdiction) |
| local-file | v1.0 (launch) | N/A (dev/testing) |
A3. Deterministic Compatibility Database
Replace the AI compatibility engine with a CI-verified YAML database of known Helm charts with tested injection results.
# compatibility-db/charts/bitnami/postgresql/16.4.0.yaml
chart:
repository: bitnami
name: postgresql
versions_tested: ["16.4.0", "16.3.x", "15.x"]
injection:
status: "compatible" # compatible | partial | incompatible | untested
mode: "sidecar" # sidecar | daemonset | both
testing:
method: "automated"
platform: "gke-standard"
k8s_versions: ["1.29", "1.30", "1.31"]
Advantages over AI: deterministic (same input → same output), auditable (CISOs can review), reproducible (CI link proves test), community-driven.
A4. Webhook-Free Mode via CSI Secret Store Driver
Webhook Mode (default)
- Fully transparent
- No app changes required
- Env + file injection
- Per-process isolation
CSI Mode (alternative)
- No webhook dependency
- Standard K8s pattern
- Works on restricted platforms
- Requires pod spec changes, file-based only
A5. Multi-Tenancy Architecture for Managed SaaS
Each customer gets their own OpenBao namespace with separate encryption keys, isolated metrics, and network-level separation.
| Layer | Isolation Mechanism |
|---|---|
| Secrets | Separate OpenBao namespace (/tenant-id/*), separate policies |
| Encryption keys | Per-tenant unseal keys, separate HSM slots |
| Authentication | Per-tenant Kubernetes auth mounts |
| Network | OpenBao policy: tenant A’s token cannot read /tenant-b/* |
| Audit | Per-tenant audit log bucket, customer-exportable |
| Metrics | tenant_id label on all metrics, per-tenant dashboards |
| Billing | Per-tenant secret access counters, usage tracking |
A6. Offline / Air-Gapped Deployment Mode
Fully self-contained deployment for government and defense customers.
| Component | Online Mode | Air-Gapped Mode |
|---|---|---|
| OpenBao | CloudCondom-managed SaaS | Customer-managed, on-prem |
| Unseal mechanism | CloudCondom HSM | Customer’s on-prem HSM (PKCS#11) |
| Container images | Public registry | Customer’s Harbor/registry mirror |
| Compatibility DB | Auto-updated from CDN | Manual update via USB/media transfer |
| Updates | Automated via Helm | Manual via air-gap transfer process |
A7. StatefulSet with Persistent Secrets
Tie the sealed cache to a PersistentVolumeClaim and use the StatefulSet’s stable pod identity as part of the cache key derivation.
# Cache key derivation for StatefulSets
cache_key = HKDF-SHA256(
ikm: service_account_token,
salt: cluster_hmac_key,
info: "phantom-statefulset:" + statefulset_name + ":" + pod_ordinal
)
Score Impact Summary
| Improvement | Weakness Addressed | Effort | Score Impact |
|---|---|---|---|
| A1. DaemonSet mode | Sidecar scalability | 2-3 weeks | +0.3 |
| A2. SecretProvider interface | OpenBao lock-in risk | 1 week + ongoing | +0.2 |
| A3. Compatibility DB | AI engine vaporware | 1-2 weeks | +0.2 |
| A4. CSI Secret Store mode | Webhook-only limitation | 2 weeks | +0.1 |
| A5. Multi-tenancy | SaaS architecture gap | Design only | +0.1 |
| A6. Air-gapped mode | Gov/defense market gap | 1 week | +0.05 |
| A7. StatefulSet support | Cache correctness gap | 1 week | +0.05 |
| Total | | ~8-10 weeks | +1.0 |
Verdict
Key Strengths
- Trust model is correct. “Webhook = UX, crypto = security boundary” is the right architecture.
- Exceptional provider-specific knowledge. Production-grade documentation on GKE/EKS/AKS quirks.
- Operationally mature design. Circuit breaker, canary injection, three-tier caching, pre-flight checks.
- Correct MVP prioritization. Starting with Phantom on managed K8s is the right call.
Key Weaknesses
- OpenBao SPOF risk inadequately addressed at scale (10K-pod burst scenario).
- EKS confidential computing story is weak. Nitro Enclaves are fundamentally different.
- Sidecar resource overhead at scale not addressed. 500GB reserved memory at 10K pods.
- No testing strategy. Missing fuzzing, property-based testing, formal verification for crypto paths.
Recommendations
- Ship Phantom alone. Nothing else in v1. Phantom on GKE Standard is the MVP. Add EKS in v1.1, AKS in v1.2.
- Add a DaemonSet mode as alternative to per-pod sidecars for customers with 1,000+ pods.
- Implement request coalescing and batch secret fetching to mitigate OpenBao bottleneck risk.
- Abstract the OpenBao dependency behind a `SecretProvider` interface from day one.
- Invest in a testing strategy proportional to security claims: fuzzing, property-based testing, integration tests on all providers.
- Be explicit about what’s not protected. Build honest security documentation that CISOs will trust.
Idea Bin — Future Solutions
These solutions were assessed during the initial review. They are not part of v1 but are preserved here for future reference.
Veilnet — Zero-Trust Encryption Mesh
Verdict: Architecturally sound but extremely ambitious. High risk of scope creep.
The core insight is correct: standard service mesh mTLS uses in-cluster CAs that the cloud provider can access. Replacing the trust root with an external, customer-controlled CA eliminates the MITM risk.
Strengths
- SPIFFE/SPIRE with an external trust root is the correct identity framework
- WireGuard as alternative to TLS — lower overhead for high-throughput services
- Running alongside existing meshes (double-wrap) is pragmatic
Concerns
- This is essentially building a service mesh — enormous engineering effort
- Performance impact of double encryption needs benchmarking
- Certificate rotation at scale with an external CA introduces latency
- 3x integration work for Istio, Linkerd, and Cilium
Feasibility: Very High complexity. 15-20 person-year effort. Not realistic with 5-7 people.
Cross-provider: Yes, at enormous cost. Core encryption is provider-agnostic but networking layer varies significantly.
Cloakfs — Transparent Filesystem Encryption
Verdict: Architecturally sound but operationally constrained by provider limitations.
The envelope encryption design (DEK/KEK hierarchy) is textbook correct. Key rotation without re-encryption is the right approach.
- GKE Autopilot blocks custom CSI drivers. Eliminates one of three major platforms unless you join Google’s partner allowlist.
- AKS removes custom CSI drivers on upgrade. Documented behavior — workaround (operator re-install) is fragile.
- EKS Fargate has no DaemonSet support. CSI node drivers require DaemonSets. Cloakfs is non-functional on Fargate.
- FUSE performance overhead (20-40% throughput hit) makes this unsuitable for I/O-intensive workloads.
Feasibility: High complexity. Risky — 6+ months. Needs provider partnership.
Cross-provider: No. CSI driver restrictions on GKE Autopilot and AKS are fundamental blockers.
Specter — Confidential Computing Orchestrator
Verdict: Correct concept, but provider fragmentation makes cross-provider delivery nearly impossible.
The per-namespace ProtectionPolicy CRD is well-designed, but fundamentally different TEE models across providers prevent a single abstraction.
| Provider | TEE Model | Status |
|---|---|---|
| GKE | AMD SEV-SNP + Intel TDX, per-node confidential VMs | Clean integration |
| AKS | Confidential VM node pools, Kata CC sunsetting Mar 2026 | Reduced to VM-only |
| EKS | Nitro Enclaves — fundamentally different model | Incompatible abstraction |
Feasibility: High complexity. Partial — GKE-only in 4-6 months. Three different confidential computing models = three different products.
Lockbox — etcd Proxy
Verdict: Correctly identified as non-viable on managed Kubernetes. Sound for self-managed only.
Lockbox cannot work on GKE/EKS/AKS because you have zero access to etcd. On self-managed clusters (kubeadm, k3s, Talos, RKE2), the etcd gRPC proxy approach is technically sound but extremely delicate. Phantom subsumes Lockbox’s goals on managed clusters.
Feasibility: Medium complexity. Yes for self-managed — 3-4 months. Niche market.
Cross-provider: N/A. Only viable on self-managed; no cross-provider concern.