For agent developers

Your agent is deterministic.
Your APIs aren’t.

That’s why your agent tests flake. Gostly records the real upstream once, replays it byte-for-byte. Same input ↦ same response. Every run.

The shape of the problem

“Use AI to build deterministic automations once that can reproduce work. Don’t use AI to do repeatable work itself.”

— r/agi, organic thread

The agent-test failure mode

Your agent worked on Tuesday. It failed Wednesday. The agent didn’t change.

The Stripe sandbox returned a slightly different idempotency-key response. The eval flagged the agent as regressed. The agent was fine. Stripe moved.

This isn’t a bug in your code. It’s a property of the system: a deterministic agent calling a non-deterministic upstream cannot have a deterministic test. The only fix is to pin what the upstream returns.

I built this because I lost three days last quarter debugging an agent that hadn’t actually changed. The upstream had. I didn’t want to repeat that.

How it works for agents

Record one run

Run your agent against the real upstream once. Gostly sits between the agent and the third-party API and captures every request and response.

Replay forever

Switch the proxy to MOCK. Your agent gets byte-for-byte the same upstream behavior every test run. No tokens spent, no rate limits hit, no flake from drift.

Redacted at capture

Auth headers are stripped before any bytes touch disk — 16 PII header classes redacted by structural invariant. Bodies are scrubbed for the obvious patterns before persistence.

Gap-fill when shape drifts

When the recording doesn't cover a new path the agent tries, Gostly proposes a patch grounded by the existing recording. You approve before it ships. No LLM in the deterministic hot path, ever.

Works with the stack you already have

Gostly is an HTTP proxy. If your agent speaks HTTP to a third-party — and it does — the proxy is transparent to it. Tested with:

  • Anthropic SDK (Claude, Claude Code)
  • OpenAI SDK
  • LangChain / LangGraph
  • Pydantic AI / Instructor
  • Cursor agent tool calls
  • Anything that speaks HTTP to a third-party

Pricing

$0 to start. Self-hosted OSS proxy. No license key needed.

$10/mo when you want the AI gap-fill on unrecorded paths and drift detection.

$79/seat when the team grows — SAML, OIDC, RBAC, audit log, shared adapters.

The $10 is locked through Dec 2026 if you sign up before Jul 1. We won’t move the line on you.

Same input. Same response. Every run.

Stop debugging upstream drift. Pin the recording. Ship the agent.