The Missing Layer in AI Infrastructure
The AI infrastructure stack has compute, orchestration, memory, and observability. But there's one critical layer that's still missing: identity and permission control for autonomous agents.
The modern AI infrastructure stack has never been richer. We have:
- Compute: GPU clouds at every price point
- Orchestration: LangGraph, CrewAI, AutoGen for multi-agent workflows
- Memory: Vector databases (Pinecone, Weaviate, Chroma) for long-term agent context
- Observability: LangSmith, Langfuse for tracing agent decisions
- Guardrails: Prompt injection detection, output filtering
Companies have raised hundreds of millions to fill each of these layers. The tooling is mature and battle-tested.
But there's one layer that nobody has built yet. The layer that sits between the agent and the action.
The Stack Has a Gap
Here's the modern agentic workflow:
```
User Intent
     ↓
LLM (reasoning)
     ↓
Orchestration (LangGraph)
     ↓
Tool Call
     ↓
[MISSING: Identity & Permission Gate]
     ↓
Production System (Stripe, CRM, APIs)
```
We've built everything around the agent. We haven't built the layer between the agent and the world.
When your agent decides to call `stripe.create_charge(amount=5000)`, nothing checks:
- Is this agent authorized to make charges at all?
- Is €5000 within its per-transaction policy?
- Is this action within the agent's current capability scope?
- Will this action be logged in a tamper-evident way?
The answer today is: none of these checks exist. The tool call goes straight through.
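To make the gap concrete, here is a sketch of the guard that is missing today. The names (`AGENT_POLICIES`, `guarded_charge`) are illustrative, not part of any real API; each branch corresponds to one of the checks listed above.

```python
# Hypothetical guard showing the checks that no layer performs today.
AGENT_POLICIES = {
    "agt_payment_processor": {
        "allowed_tools": {"charge_payment"},
        "max_per_tx": 100,  # EUR
    },
}

def guarded_charge(agent_id: str, amount: int) -> str:
    policy = AGENT_POLICIES.get(agent_id)
    if policy is None:
        # Check 1: is this agent authorized to act at all?
        return "DENY: unknown agent"
    if "charge_payment" not in policy["allowed_tools"]:
        # Check 3: is this action within the agent's capability scope?
        return "DENY: tool not allowed"
    if amount > policy["max_per_tx"]:
        # Check 2: is the amount within the per-transaction policy?
        return "DENY: over per-tx limit"
    # Check 4 (tamper-evident logging) would happen here before returning.
    return "ALLOW"
```

With this guard in place, the €5000 charge from the example above is rejected instead of going straight through.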
Why This Layer Is Hard
The reason this gap exists isn't laziness — it's that building identity infrastructure for agents is genuinely hard. There are several unsolved problems:
Problem 1: Agents Aren't Humans
Human identity systems assume a login flow. A human authenticates once, gets a session, and that session persists.
Agents don't log in. They spawn, execute, and terminate — sometimes thousands of times per day. Their "session" is a single task completion. Traditional IAM doesn't map onto this model.
Problem 2: Actions Are Dynamic
With humans, you can define roles: "admin", "editor", "viewer". The role doesn't change mid-session.
With agents, the required permissions are dynamic. An agent might need to:
- Read data (low risk)
- Process a refund under €100 (medium risk)
- Trigger a production deploy (high risk)
All in the same workflow. Static roles don't capture this granularity.
Problem 3: Speed
Agents operate at machine speed. An authorization check that adds 200ms to a human login is unnoticeable. An authorization check that adds 200ms to every agent action in a 1000-action workflow adds 3+ minutes of latency.
The identity layer needs to be fast — p99 under 20ms — or it won't be used.
Problem 4: Auditability at Scale
A human makes 50 decisions per day. An agent makes 50 decisions per minute. Traditional audit logging systems aren't designed for this volume, and most don't provide the tamper-evident guarantees required for compliance.
What the Missing Layer Looks Like
After working on this problem, we believe the missing layer needs five primitives:
1. Agent Identity (Who is acting?)
Each agent gets a cryptographic identity — an Ed25519 keypair. The public key is registered with the identity layer. Every action the agent takes is signed with the private key.
This gives you verifiable attribution. You know, with mathematical certainty, which agent performed which action.
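A minimal sketch of the sign-then-verify flow. Production systems would use real Ed25519 keypairs via a cryptography library; to keep this example dependency-free, HMAC-SHA256 with a registered shared secret stands in for the keypair. The attribution logic is the same: the agent signs each action, and the identity layer verifies the signature against the key registered for that agent.

```python
import hashlib
import hmac

# Stand-in for a registered Ed25519 public key: in production the agent
# signs with its private key and the identity layer verifies with the
# registered public key; a shared secret plays both roles here.
REGISTERED_KEYS = {"agt_01J": b"demo-secret"}

def sign_action(agent_id: str, action: str, payload: str) -> str:
    """Agent side: sign (agent, action, payload) before calling a tool."""
    key = REGISTERED_KEYS[agent_id]
    msg = f"{agent_id}|{action}|{payload}".encode()
    return hmac.new(key, msg, hashlib.sha256).hexdigest()

def verify_signature(agent_id: str, action: str, payload: str, sig: str) -> bool:
    """Identity layer side: verify attribution against the registered key."""
    key = REGISTERED_KEYS.get(agent_id)
    if key is None:
        return False
    msg = f"{agent_id}|{action}|{payload}".encode()
    expected = hmac.new(key, msg, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, sig)
```

Any change to the payload after signing, or any unregistered agent id, fails verification.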
2. Policy Engine (What is it allowed to do?)
Policies define the permission boundary for each agent:
```json
{
  "agent": "agt_payment_processor",
  "rules": {
    "allowed_tools": ["charge_payment", "issue_refund"],
    "spend_limits": { "max_per_tx": 100, "max_per_day": 1000 },
    "rate_limits": { "actions_per_minute": 20 }
  }
}
```
Not a role. A precise, agent-specific policy that can be versioned, audited, and updated without touching the agent code.
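A sketch of how an engine might evaluate that policy. The field names follow the JSON above; the `evaluate` function and its reason codes are hypothetical.

```python
POLICY = {
    "agent": "agt_payment_processor",
    "rules": {
        "allowed_tools": ["charge_payment", "issue_refund"],
        "spend_limits": {"max_per_tx": 100, "max_per_day": 1000},
        "rate_limits": {"actions_per_minute": 20},
    },
}

def evaluate(policy: dict, action: str, amount: int, spent_today: int) -> tuple:
    """Return (decision, reason_code) for one proposed action."""
    rules = policy["rules"]
    if action not in rules["allowed_tools"]:
        return ("DENY", "TOOL_NOT_ALLOWED")
    limits = rules["spend_limits"]
    if amount > limits["max_per_tx"]:
        return ("DENY", "PER_TX_LIMIT_EXCEEDED")
    if spent_today + amount > limits["max_per_day"]:
        return ("DENY", "DAILY_LIMIT_EXCEEDED")
    return ("ALLOW", "OK")
```

Because the policy is data, not code, it can be versioned and updated without redeploying the agent.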
3. Capability Tokens (What can it do right now?)
Inspired by capability-based security, agents operate with short-lived, scoped tokens:
```
Agent requests capability for "charge_payment" (TTL: 5min)
     ↓
Identity layer issues: cap_01J... (valid until 10:05am)
     ↓
Agent uses cap_01J... for all charge_payment calls until expiry
```
This is fundamentally different from API keys. The token:
- Expires in 5 minutes (not 5 years)
- Is scoped to one action type
- Carries the policy constraints inline
- Can be revoked individually
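The token lifecycle above can be sketched in a few lines. The `mint_capability` and `capability_allows` helpers are illustrative, assuming an in-memory revocation set:

```python
import secrets
import time

# Illustrative revocation list; a real system would back this with storage.
REVOKED = set()

def mint_capability(agent_id: str, action: str, ttl_seconds: int = 300) -> dict:
    """Issue a short-lived token scoped to a single action type."""
    return {
        "token_id": "cap_" + secrets.token_hex(8),
        "agent_id": agent_id,
        "action": action,                         # scoped to one action type
        "expires_at": time.time() + ttl_seconds,  # 5-minute TTL by default
    }

def capability_allows(cap: dict, agent_id: str, action: str) -> bool:
    """Check scope, expiry, and revocation before honoring the token."""
    if cap["token_id"] in REVOKED:
        return False  # individually revocable
    if time.time() >= cap["expires_at"]:
        return False  # expired
    return cap["agent_id"] == agent_id and cap["action"] == action
```

Note what an API key cannot do here: the token dies on its own in minutes, and revoking one token does not disturb any other agent or action type.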
4. Verification Gate (Should this action proceed?)
Before execution, every sensitive action passes through a verification gate:
```
POST /verify
{
  "agent_id": "agt_01J...",
  "action": "charge_payment",
  "payload": { "amount": 45 },
  "capability_token": "eyJ...",
  "signature": "base64..."
}

→ { "decision": "ALLOW", "audit_event_id": "evt_01J..." }
```
The gate checks the full chain: identity → capability → policy → quotas. It returns ALLOW, DENY, or PENDING_APPROVAL with a machine-readable reason code.
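The check chain the gate runs can be sketched as a single function. The parameters and reason codes here are hypothetical simplifications of what a real gate would load from its identity, capability, and policy stores:

```python
def verify(request: dict, *, known_agents: set, valid_caps: set,
           allowed_tools: set, tx_limit: int, quota_left: int) -> dict:
    """Sketch of the gate's chain: identity -> capability -> policy -> quotas."""
    if request["agent_id"] not in known_agents:
        return {"decision": "DENY", "reason_code": "UNKNOWN_AGENT"}
    if request["capability_token"] not in valid_caps:
        return {"decision": "DENY", "reason_code": "INVALID_CAPABILITY"}
    if request["action"] not in allowed_tools:
        return {"decision": "DENY", "reason_code": "TOOL_NOT_ALLOWED"}
    if request["payload"]["amount"] > tx_limit:
        return {"decision": "DENY", "reason_code": "PER_TX_LIMIT_EXCEEDED"}
    if quota_left <= 0:
        # An exhausted quota might escalate rather than hard-fail.
        return {"decision": "PENDING_APPROVAL", "reason_code": "QUOTA_EXHAUSTED"}
    return {"decision": "ALLOW", "reason_code": "OK"}
```

The ordering matters: cheap identity checks run first, so most bad requests are rejected before any policy evaluation happens.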
5. Audit Trail (What happened and why?)
Every verification decision is logged with hash-chain integrity. You get:
- Who acted (agent identity)
- What they did (action + payload hash)
- Whether it was allowed (decision + reason code)
- When (timestamp)
- Why it was allowed (policy version that approved it)
And because of the hash chain, you can prove that the log hasn't been tampered with.
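A hash chain of this kind can be sketched with nothing but a standard hashing library. Each entry's hash covers the previous entry's hash, so editing any past record invalidates everything after it:

```python
import hashlib
import json

def append_event(chain: list, event: dict) -> list:
    """Append an audit event whose hash covers the previous entry's hash."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    body = {"prev_hash": prev_hash, **event}
    digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    chain.append({**body, "hash": digest})
    return chain

def chain_is_intact(chain: list) -> bool:
    """Recompute every hash; any edit to a past entry breaks the chain."""
    prev_hash = "0" * 64
    for entry in chain:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if body.get("prev_hash") != prev_hash:
            return False
        digest = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()
        ).hexdigest()
        if digest != entry["hash"]:
            return False
        prev_hash = entry["hash"]
    return True
```

Flipping a single decision in an old entry, even without touching its hash field, makes `chain_is_intact` return `False`: that is the tamper evidence.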
The Reference Architecture
Here's what the stack looks like with the missing layer in place:
```
User Intent
     ↓
LLM (reasoning)
     ↓
Orchestration (LangGraph)
     ↓
Tool Call requested
     ↓
┌─────────────────────────────────┐
│ KYA Identity & Permission Layer │
│                                 │
│ 1. Verify agent identity        │
│ 2. Check capability token       │
│ 3. Evaluate policy              │
│ 4. Check quotas                 │
│ 5. Log to audit trail           │
│                                 │
│        → ALLOW / DENY           │
└─────────────────────────────────┘
     ↓ (only if ALLOW)
Production System
```
The identity layer is not in the critical path of the LLM reasoning — it only activates when a tool call is about to execute.
Integration Is One Function Call
The barrier to adoption needs to be as low as possible. In practice, adding the identity layer looks like this:
```python
from kya_sdk import KyaClient

kya = KyaClient(workspace_id="ws_01J...")

# Wrap your tool execution
async def execute_tool(agent_id, tool_name, payload, capability_token, signature):
    result = await kya.verify(
        agent_id=agent_id,
        action=tool_name,
        payload=payload,
        capability_token=capability_token,
        signature=signature,
    )
    if result.decision != "ALLOW":
        raise PermissionError(f"{tool_name} denied: {result.reason_code}")

    # Execute the actual tool
    return await tools[tool_name](payload)
```
One function. One check. The missing layer is now present.
Who Needs This Now
The teams that need this most urgently are building:
Fintech copilots: Agents that can initiate transactions, process refunds, or move funds. The blast radius of an unauthorized action is immediate and financial.
RevOps automation: Agents writing to CRMs, triggering outbound sequences, managing customer data. GDPR exposure without audit trails.
DevOps agents: Agents that can deploy, scale, or reconfigure infrastructure. A single unconstrained agent in prod is a nightmare scenario.
Healthcare workflows: Agents accessing patient records or triggering clinical workflows. HIPAA compliance requires audit evidence.
In all these cases, the question isn't "do we need identity and permission control for our agents?" The answer is obviously yes. The question is "why hasn't anyone built this yet?"
We're building it.
KYA is the open-source identity & permission layer for AI agents. Get started in 5 minutes →
Further Reading
- AI Agents Are the New Root Users — why running agents without identity control is catastrophic
- KYA Quickstart: Add the missing layer in 5 minutes
- KYA API Reference — full REST API documentation
- Securing LangChain agents with KYA
- OpenAI function calling security with KYA