The ephemeral infrastructure paradox: Why short-lived systems need stronger identity governance

In my experience leading engineering projects, I have encountered the same pattern repeatedly. We obsess over deployment speed. We measure success in commit velocity and uptime. But we rarely pause to ask the most uncomfortable question in the room: Who actually owns the identities we just spun up?

This silence isn’t malicious; it’s structural. We have optimized our entire software delivery lifecycle for the creation of resources, but we have almost no muscle memory for their destruction. We celebrate the “Hello World” of a new service, but we have no ceremony for its decommissioning.

For years, I watched this disconnect play out. I would sit in planning meetings where we architected systems to scale to thousands of pods in seconds. Yet, our security governance model was still stuck in the era of manual ticket approvals. We were building Ferraris and trying to secure them with bicycle locks.

The reality today is that our infrastructure has fundamentally changed, but our identity governance has not. We are still trying to audit ghosts. We attempt to secure ephemeral workloads that live for milliseconds using spreadsheets designed for servers that last for years.

The silent explosion we’re not tracking

If you look at the dashboards of any modern enterprise today, you will see a metric that should keep you up at night. It is not the number of employees you have. It is the number of things acting like employees.

For every human developer onboarded, we inadvertently create a massive number of machine identities. Service accounts, API keys and workload tokens accumulate like digital dust. Research suggests that non-human identities now outnumber humans by a factor of 10 to 1 or more.

Consider the lifecycle of a typical microservice. In its journey from a developer’s laptop to production, it might generate a dozen distinct identities: a GitHub token for the repository, a CI/CD service account for the build, a registry credential to push the container, and multiple runtime roles to access databases, queues and logging services.

The problem is not just volume; it is invisibility. When a developer leaves, HR triggers an offboarding process. Their email is cut, their badge stops working. But what about the five service accounts they hardcoded into a deployment script three years ago? Those usually stay active, unmonitored, waiting for someone to find them. Often, these “zombie identities” retain administrative privileges long after their original purpose has vanished, simply because no one is brave enough to turn them off.
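A minimal sketch of how these zombie identities could be surfaced, assuming you can export an inventory of machine identities with a recorded owner. The `MachineIdentity` fields and the `find_orphaned` helper are hypothetical names for illustration, not any vendor's API.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class MachineIdentity:
    name: str
    owner: Optional[str]  # person who created it; None if never recorded
    is_admin: bool


def find_orphaned(identities: Iterable[MachineIdentity],
                  active_staff: Set[str]) -> List[MachineIdentity]:
    """Return identities whose owner is unknown or no longer employed."""
    orphaned = [i for i in identities
                if i.owner is None or i.owner not in active_staff]
    # Admin identities sort first: they are the riskiest to leave running.
    return sorted(orphaned, key=lambda i: (not i.is_admin, i.name))


# Hypothetical inventory: bob has left the company, report-cron has no owner.
inventory = [
    MachineIdentity("ci-deployer", owner="alice", is_admin=False),
    MachineIdentity("legacy-backup", owner="bob", is_admin=True),
    MachineIdentity("report-cron", owner=None, is_admin=False),
]
orphans = find_orphaned(inventory, active_staff={"alice"})
print([i.name for i in orphans])  # ['legacy-backup', 'report-cron']
```

Hooking a check like this into the HR offboarding trigger would give machine identities the same exit process that badges and mailboxes already have.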

The “test tenant” trap

I have seen too many teams fall into the trap of thinking a test environment does not matter. “It’s just dev,” they say. “There’s no real customer data there.” This complacency is fatal because identity boundaries are rarely as clean as we think they are.

In reality, test environments are often where attackers go first. It is the path of least resistance. We saw this play out in the Microsoft Midnight Blizzard attack. The attackers did not burn a zero-day exploit to break down the front door; they found a legacy test tenant that nobody was watching closely.

After gaining a foothold there through a password spray attack, they abused a legacy test OAuth application, a non-human identity with elevated access, to pivot into production corporate email. These are not harmless leftovers. They are open backdoors. The danger lies in the relationships between environments. If a “test” CI/CD runner has permission to push to a “production” container registry, or if a developer reuses a password across both environments, the “test” label is nothing more than a false sense of security.
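One way to make those relationships visible is to list every grant that crosses an environment boundary. This is a rough sketch under the assumption that both principals and resources carry an environment label; the `Grant` record and its field names are hypothetical.

```python
from typing import Iterable, List, NamedTuple


class Grant(NamedTuple):
    principal: str
    principal_env: str
    resource: str
    resource_env: str
    action: str


def cross_environment_grants(grants: Iterable[Grant],
                             protected_env: str = "production") -> List[Grant]:
    """Return grants where a non-production identity can touch production."""
    return [g for g in grants
            if g.resource_env == protected_env
            and g.principal_env != protected_env]


# Hypothetical grants: the test runner can push to the production registry.
grants = [
    Grant("test-ci-runner", "test", "prod-registry", "production", "push"),
    Grant("test-ci-runner", "test", "test-registry", "test", "push"),
    Grant("prod-deployer", "production", "prod-registry", "production", "push"),
]
for g in cross_environment_grants(grants):
    print(f"{g.principal} ({g.principal_env}) can {g.action} to {g.resource}")
```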

Supply chain reliability is an identity problem

We also need to talk about the tools we trust. The Codecov incident shook the confidence of every engineering lead I know because it wasn’t a code vulnerability—it was a credential vulnerability.

Attackers exploited a flaw in Codecov’s Docker image creation process to extract a static credential. That credential let them modify the Bash Uploader script so that it sent environment variables, including secrets, from customers’ CI environments to a server the attackers controlled, effectively turning a trusted development tool into a data exfiltration engine.

This is the defining challenge of our decade. Our software supply chain is held together by thousands of API keys and secrets. If we continue to rely on long-lived static credentials to glue our pipelines together, we are building on sand. Every static key sitting in a repo—no matter how private you think it is—is a ticking time bomb. It only takes one developer to accidentally commit a .env file or one compromised S3 bucket to expose the keys to the kingdom.
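As a rough illustration of how cheaply such leaks can be caught before they land in a repo, here is a small scanner sketch. The two patterns (the `AKIA` prefix of long-lived AWS access key IDs and PEM private key headers) are only a starting set, and `scan_text` is a hypothetical helper; dedicated secret scanners cover far more formats.

```python
import re
from typing import List, Tuple

# Illustrative patterns only; a real scanner needs a much larger rule set.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
}


def scan_text(text: str) -> List[Tuple[int, str]]:
    """Return (line number, pattern name) for every suspected secret."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, label))
    return findings


# AKIAIOSFODNN7EXAMPLE is the placeholder key used in AWS documentation.
sample_env = "DEBUG=true\nAWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE\nPORT=8080\n"
print(scan_text(sample_env))  # [(2, 'aws_access_key_id')]
```

Run as a pre-commit hook or CI step, a check like this fails the build before the key ever reaches a shared branch.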

The AI acceleration

If managing static bots feels like drowning, the rise of agentic AI is about to hand us a firehose.

We are rushing to deploy AI agents that do not just chat—they execute code. These are autonomous workloads that can read databases and trigger API calls. An AI agent is effectively a highly privileged employee that works at machine speed. Unlike traditional automation scripts, which are deterministic and follow a strict set of instructions, AI agents are probabilistic. They make decisions based on context.

If an AI agent is tasked with “optimizing cloud spend,” and it has broad permissions, it might decide to shut down a critical production database because it deemed it “underutilized.” Or, if it is tricked by a prompt injection attack, it could be coerced into exfiltrating sensitive customer data.

If you have not solved identity governance for your existing microservices, you are not ready for autonomous AI. If an attacker compromises an AI agent, they inherit its identity. If that identity has broad access because “it was easier to configure that way,” you have automated your own data breach.
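A deny-by-default policy in front of the agent's tool calls is one way to keep that blast radius small. The sketch below is an assumption about how such a gate might look; `AgentPolicy`, `authorize` and the resource naming scheme are hypothetical and not part of any agent framework.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class AgentPolicy:
    allowed: FrozenSet[Tuple[str, str]]  # (action, resource prefix) pairs
    needs_approval: FrozenSet[str]       # actions that require human sign-off


def authorize(policy: AgentPolicy, action: str, resource: str) -> str:
    """Return 'allow', 'needs_approval' or 'deny' for a requested action."""
    in_scope = any(action == allowed_action and resource.startswith(prefix)
                   for allowed_action, prefix in policy.allowed)
    if not in_scope:
        return "deny"  # anything not explicitly listed is refused
    if action in policy.needs_approval:
        return "needs_approval"
    return "allow"


# Hypothetical cost-optimization agent: it may inspect any database,
# but may only stop dev databases, and only after a human approves.
cost_agent = AgentPolicy(
    allowed=frozenset({("describe", "db/"), ("stop", "db/dev-")}),
    needs_approval=frozenset({"stop"}),
)
print(authorize(cost_agent, "stop", "db/prod-orders"))  # deny
```

With this shape, a prompt-injected request to stop or export a production database fails at the policy gate regardless of what the model decides.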

The cultural cost of static security

Beyond the technical risks, there is a profound cultural cost to our current approach. When identity governance is slow, manual and ticket-based, it becomes an adversary to engineering velocity.

I have seen developers spend days waiting for a ticket to be approved just to get read access to an S3 bucket. This friction breeds Shadow IT. Developers, under pressure to ship, will bypass the official process. They will share static keys over Slack, hardcode credentials into their apps or reuse a high-privilege “admin” key for everything because it’s the path of least resistance.

Paradoxically, by trying to control everything with heavy-handed gates, we end up with less visibility and less control. The goal of modern identity governance shouldn’t be to say “no” more often; it should be to make the secure path the fastest path.

3 strategic shifts

How do we fix this? As illustrated in Figure 1, we need a framework that shifts from static reviews to continuous governance. There are no silver bullets, but three engineering principles consistently reduce risk without killing velocity.

Figure 1: Governance must move from static reviews to a continuous lifecycle of issuance, verification and automated expiration.

Niranjan Kumar Sharma

1. Identity must be cryptographic

We must stop relying on IP allowlists. In a world of dynamic containers, network location is a poor proxy for trust.

We need to move toward cryptographic identity. Every workload must present a verifiable identity document, whether it lives for five years or five milliseconds. Frameworks like SPIFFE allow us to issue short-lived identities to workloads automatically. This means we trust the software, not the network cable it is plugged into. This approach relies on “workload attestation”: the issuer verifies properties of the running process (for example, the binary it was started from) before issuing it a SPIFFE Verifiable Identity Document (SVID), typically an X.509 certificate or a JWT. If the workload fails attestation, for instance because its binary has been tampered with, it gets no identity and therefore no access.
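To show what replaces the IP allowlist, here is a minimal sketch of an authorization check keyed on a SPIFFE ID, which takes the form `spiffe://<trust domain>/<path>`. It assumes the SVID's signature has already been verified by a proper SPIFFE library; the helper names and the example trust domain are hypothetical.

```python
from typing import Set, Tuple
from urllib.parse import urlsplit


def parse_spiffe_id(spiffe_id: str) -> Tuple[str, str]:
    """Split a SPIFFE ID into (trust domain, path), rejecting malformed IDs."""
    parts = urlsplit(spiffe_id)
    if parts.scheme != "spiffe" or not parts.netloc:
        raise ValueError("not a SPIFFE ID")
    # SPIFFE IDs carry no query, fragment, user info or port.
    if parts.query or parts.fragment or "@" in parts.netloc or ":" in parts.netloc:
        raise ValueError("unexpected URI component in SPIFFE ID")
    return parts.netloc, parts.path


def is_authorized(spiffe_id: str, trust_domain: str,
                  allowed_paths: Set[str]) -> bool:
    """Allow only known workload paths inside our own trust domain."""
    try:
        domain, path = parse_spiffe_id(spiffe_id)
    except ValueError:
        return False
    return domain == trust_domain and path in allowed_paths


# Hypothetical trust domain and workload path.
print(is_authorized("spiffe://prod.example.org/payments/api",
                    "prod.example.org", {"/payments/api"}))  # True
```

The caller's network address never appears in the decision, so the check holds no matter which node or subnet the container lands on.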

2. Kill the static credential

Static keys are technical debt. They are the “password on a sticky note” of the cloud era.

We need to aggressively shorten the lifespan of credentials. If a human needs access, that access should expire at the end of the day. If a machine needs access, it should expire in minutes. When a credential works for only ten minutes, the window in which a stolen copy is useful shrinks to almost nothing. You fundamentally change the economics of the attack.
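The mechanics of an expiring credential can be sketched in a few lines. This toy example signs a subject and an expiry time with HMAC; the token format and both function names are assumptions for illustration, and in practice you would rely on your identity provider or cloud token service rather than a hand-rolled scheme.

```python
import hashlib
import hmac
from typing import Optional


def issue_token(signing_key: bytes, subject: str,
                ttl_seconds: int, now: float) -> str:
    """Issue a token of the form subject|expiry|signature."""
    expires = int(now) + ttl_seconds
    payload = f"{subject}|{expires}"
    sig = hmac.new(signing_key, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}|{sig}"


def verify_token(signing_key: bytes, token: str, now: float) -> Optional[str]:
    """Return the subject if the token is authentic and unexpired, else None."""
    try:
        subject, expires_str, sig = token.rsplit("|", 2)
        expires = int(expires_str)
    except ValueError:
        return None
    payload = f"{subject}|{expires_str}"
    expected = hmac.new(signing_key, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return None  # forged or tampered
    if now >= expires:
        return None  # expired: a stolen copy is now worthless
    return subject


key = b"demo-signing-key"  # held only by the issuer in this sketch
token = issue_token(key, "build-job-42", ttl_seconds=600, now=1_000)
print(verify_token(key, token, now=1_300))  # build-job-42
print(verify_token(key, token, now=1_600))  # None
```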

Practically, this means adopting standards like OIDC federation for your CI/CD pipelines. Instead of storing a long-lived AWS secret in your GitHub Actions settings, your build job presents a short-lived, signed OIDC token to AWS and receives temporary credentials in exchange. Those credentials expire on their own after a short, configurable session duration. This pattern, documented extensively by providers like AWS and GitHub, removes the long-lived secret from the pipeline and largely sidesteps the “secret zero” problem.
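The trust decision on the cloud side comes down to matching claims in the token against a policy. The sketch below imitates that check in plain Python, assuming the token's signature has already been verified against GitHub's published keys. The issuer URL and the `repo:<org>/<repo>:ref:...` subject format follow GitHub's OIDC documentation, while `claims_allow_role`, the wildcard matching and the example repository are simplifications of what a real trust policy does.

```python
from fnmatch import fnmatchcase
from typing import Iterable, Mapping

GITHUB_ISSUER = "https://token.actions.githubusercontent.com"


def claims_allow_role(claims: Mapping[str, object], audience: str,
                      subject_patterns: Iterable[str], now: float) -> bool:
    """Decide whether already-verified OIDC claims may assume a role."""
    if claims.get("iss") != GITHUB_ISSUER:
        return False
    if claims.get("aud") != audience:
        return False
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or now >= exp:
        return False  # missing or expired token
    sub = claims.get("sub")
    if not isinstance(sub, str):
        return False
    # Pin the role to specific repositories and branches, never to "*".
    return any(fnmatchcase(sub, pattern) for pattern in subject_patterns)


# Hypothetical claims from a build on the main branch of octo-org/octo-repo.
claims = {
    "iss": GITHUB_ISSUER,
    "aud": "sts.amazonaws.com",
    "sub": "repo:octo-org/octo-repo:ref:refs/heads/main",
    "exp": 2_000,
}
print(claims_allow_role(claims, "sts.amazonaws.com",
                        ["repo:octo-org/octo-repo:ref:refs/heads/main"],
                        now=1_900))  # True
```

The subject pattern is the part that deserves review: a pattern that matches every repository in an organization turns federation back into a shared key.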

3. Automate the cleanup

We cannot manually review 50,000 permissions. The math does not work.

We must use Cloud Infrastructure Entitlement Management (CIEM) to automate the cleanup. We need tools that look at what permissions a service account actually used in the last 90 days. If it hasn’t written to that S3 bucket in three months, revoke the permission automatically. Treat “Least Privilege” not as a philosophy, but as an automated garbage collection process.

This automation is critical because humans are naturally risk-averse. No engineer wants to be the one who caused an outage by deleting a key they thought was unused. Data-driven automation removes that fear, allowing us to prune privileges with confidence.
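The garbage-collection idea can be sketched as a function that compares granted permissions with last-use data and proposes revocations. The data shapes here are assumptions: a real CIEM tool or your cloud provider's access logs would supply the `last_used` timestamps, and `plan_revocations` is a hypothetical name.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Iterable, List, Tuple

GrantKey = Tuple[str, str]  # (principal, permission)


def plan_revocations(grants: Iterable[GrantKey],
                     last_used: Dict[GrantKey, datetime],
                     now: datetime,
                     max_idle_days: int = 90) -> List[GrantKey]:
    """Return grants that were never used or not used within the window."""
    cutoff = now - timedelta(days=max_idle_days)
    to_revoke = []
    for grant in grants:
        used_at = last_used.get(grant)
        if used_at is None or used_at < cutoff:
            to_revoke.append(grant)
    return to_revoke


# Hypothetical service account with one active and one stale permission.
now = datetime(2026, 2, 1, tzinfo=timezone.utc)
grants = [
    ("svc-reports", "s3:GetObject"),
    ("svc-reports", "s3:PutObject"),
]
last_used = {
    ("svc-reports", "s3:GetObject"): now - timedelta(days=2),
    ("svc-reports", "s3:PutObject"): now - timedelta(days=120),
}
print(plan_revocations(grants, last_used, now))
# [('svc-reports', 's3:PutObject')]
```

Returning a plan rather than revoking directly leaves room for a dry-run period, which is often what makes engineers comfortable enabling the automation at all.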

Final thoughts

The infrastructure we build has become ephemeral. Yet our mindset is still static.

We cannot continue to govern modern cloud environments with the tools of the past decade. By adopting cryptographic identity, eliminating static secrets and automating cleanup, we can build systems that are both fast and secure. The future of security is not about slowing down; it is about building guardrails that move as fast as we do.

This article is published as part of the Foundry Expert Contributor Network.