How It Works

Gostly is a transparent HTTP proxy with four operating modes: LEARN, MOCK, PASSTHROUGH, and TRANSITIONING. Understanding the pipeline between them is the key to getting the most out of the tool.

The pipeline

1. LEARN

The proxy forwards every request to your upstream and records the verbatim response. Your app sees no difference.

2. TRANSITION

Recorded traffic is scrubbed, pattern-extracted, and written to the mock library. While the write is in progress, the proxy briefly enters TRANSITIONING mode and returns 503 with a Retry-After header.

3. MOCK

All requests are served from the mock library. No upstream required. Unmatched requests fall through to AI generation if enabled.

LEARN mode — recording traffic

In LEARN mode the proxy is a transparent pass-through. Every inbound request is forwarded to the configured upstream URL. The response is returned to the caller and simultaneously written to a local JSONL file on disk:

# Each line in traffic/{service}.jsonl is one recorded interaction (pretty-printed here for readability)
{
  "timestamp": "2026-04-23T09:14:22Z",
  "method": "GET",
  "uri": "/users/42",
  "request_headers": { "accept": "application/json" },
  "request_body": null,
  "status": 200,
  "response_headers": { "content-type": "application/json" },
  "response_body": { "id": 42, "name": "Jane Smith", "role": "admin" }
}
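To inspect what a LEARN session captured, the file can be read one line at a time. The following is a minimal sketch that assumes only the field names shown in the record above; the helper name `summarize_traffic` and the file path in the usage comment are illustrative, not part of Gostly.

```python
import json
from collections import Counter
from typing import Iterable


def summarize_traffic(lines: Iterable[str]) -> Counter:
    """Count recorded interactions per (method, uri, status)."""
    counts: Counter = Counter()
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # tolerate blank lines between records
        record = json.loads(raw)  # JSONL: one JSON object per line
        counts[(record["method"], record["uri"], record["status"])] += 1
    return counts


# Usage with a hypothetical file path:
# with open("traffic/users.jsonl", encoding="utf-8") as fh:
#     for key, n in summarize_traffic(fh).most_common():
#         print(n, *key)
```

Accepting any iterable of lines keeps the helper usable with an open file handle or an in-memory list.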

On sensitive headers

Authorization tokens, cookies, API keys, and a baseline set of enterprise security headers are redacted by default on every sink that touches disk or leaves the box. The in-memory active-session store keeps them verbatim for replay fidelity, and that store never leaves the box. See the header redaction reference for the full list.
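As a rough illustration of redaction at a disk sink, the sketch below masks a small set of headers before a record is written. The header names in `SENSITIVE_HEADERS` and the placeholder value are assumptions chosen for the example; the authoritative list is the one in the header redaction reference.

```python
from typing import Mapping

# Assumption: an illustrative subset, not Gostly's actual redaction list.
SENSITIVE_HEADERS = frozenset({"authorization", "cookie", "set-cookie", "x-api-key"})


def redact_headers(headers: Mapping[str, str],
                   placeholder: str = "[REDACTED]") -> dict[str, str]:
    """Return a copy that is safe to persist; the input mapping is not modified."""
    return {
        # Header names are case-insensitive, so compare in lower case.
        name: placeholder if name.lower() in SENSITIVE_HEADERS else value
        for name, value in headers.items()
    }
```

Returning a copy mirrors the split described above: the in-memory session can keep the original values while only the redacted copy reaches disk.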

JSONL files live on your machine and are never transmitted anywhere. The verbatim format preserves full fidelity: tests that pattern-match on specific field values work correctly because the recorded data is production-accurate.

Transition — building the mock library

When you trigger a transition, the API reads the raw JSONL, runs it through a scrub pipeline, and writes the results to the Postgres-backed mock library:

Scrub

Request/response bodies are scanned for credentials, PII patterns, and any field paths you've configured. Matched values are replaced with [REDACTED]. The scrubbed_at timestamp is set — this is the permanent safety boundary.

Pattern extraction

URI paths are normalised to templates (e.g. /users/42 → /users/{id}). The extracted patterns drive AI training and smart-swap matching.

Mock library write

Scrubbed entries are inserted into the mock_library table. The proxy is signalled to reload — it reads the library and serves from it on the next request.
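The scrub and pattern-extraction steps above can be sketched in a few lines. This is a simplified illustration, not Gostly's pipeline: it assumes configured field paths use dotted notation (for example `contact.email`) and that only purely numeric path segments count as parameters. Both are assumptions made for the example.

```python
import copy
import re
from typing import Any

REDACTED = "[REDACTED]"


def scrub_paths(body: Any, paths: list[str]) -> Any:
    """Replace values at dotted field paths with [REDACTED]."""
    scrubbed = copy.deepcopy(body)  # never mutate the raw recording
    for path in paths:
        node = scrubbed
        *parents, leaf = path.split(".")
        for key in parents:
            node = node.get(key) if isinstance(node, dict) else None
            if node is None:
                break  # path does not exist in this body
        if isinstance(node, dict) and leaf in node:
            node[leaf] = REDACTED
    return scrubbed


def to_template(uri: str) -> str:
    """Normalise numeric path segments to {id}, e.g. /users/42 -> /users/{id}."""
    return "/".join(
        "{id}" if re.fullmatch(r"\d+", segment) else segment
        for segment in uri.split("/")
    )
```

A real extractor would likely also recognise UUIDs and other identifier shapes; the numeric-only rule is just enough to reproduce the `/users/42` example.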

During transition the proxy enters TRANSITIONING mode and returns 503 Service Unavailable with a Retry-After header. This is intentional — it prevents partial-library matches during the write.
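Clients or test harnesses that keep sending requests during a transition can honour the Retry-After header instead of failing on the 503. The helper below is a sketch of how a caller might turn that header into a sleep duration; it handles both forms the HTTP specification allows (delta-seconds and HTTP-date). Which form Gostly emits is not stated here, so the function name and the one-second default are assumptions.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_delay_seconds(retry_after: Optional[str],
                        now: Optional[datetime] = None,
                        default: float = 1.0) -> float:
    """Turn a Retry-After header value into a non-negative sleep duration."""
    if not retry_after:
        return default
    value = retry_after.strip()
    if value.isdigit():  # delta-seconds form, e.g. "5"
        return float(value)
    try:
        when = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return default  # unparseable header: fall back to the default
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())
```

A caller would sleep for the returned number of seconds and then retry the original request.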

Start a transition via the API (or use the dashboard at localhost:3000):

curl -X POST http://localhost:8000/v1/transition/start \
  -H "X-API-Key: $GHOST_API_KEY"
# Returns: { "job_id": "..." }

# Poll until complete (JOB_ID holds the job_id returned above)
curl "http://localhost:8000/v1/transition/$JOB_ID/status" \
  -H "X-API-Key: $GHOST_API_KEY"
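In a script, the polling step can be wrapped in a small loop. The sketch below is deliberately transport-agnostic: `fetch_status` is a caller-supplied function that performs the status request and returns the parsed JSON. The shape of the status payload is not documented in this section, so the `status` field and its `completed` / `failed` values are assumptions to adjust against the real response.

```python
import time
from typing import Callable


def wait_for_transition(fetch_status: Callable[[], dict],
                        interval: float = 2.0,
                        timeout: float = 300.0,
                        sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll until the job reports a terminal state or the timeout elapses."""
    waited = 0.0
    while True:
        payload = fetch_status()
        # Assumption: a "status" field with these terminal values.
        if payload.get("status") in {"completed", "failed"}:
            return payload
        if waited >= timeout:
            raise TimeoutError(f"transition still running after {waited:.0f}s")
        sleep(interval)
        waited += interval
```

Passing `sleep` as a parameter makes the loop easy to exercise in tests without real delays.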

MOCK mode — serving responses

Switch to MOCK mode via the dashboard or the API:

curl -X POST http://localhost:8000/v1/mode \
  -H "X-API-Key: $GHOST_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{"mode": "MOCK"}'

In MOCK mode the proxy matches each inbound request against the library using a tiered strategy. Earlier tiers are cheaper; later tiers are more capable. The label under each strategy name shows which plans include it:

Exact match

All tiers

Method + URI + request body hash matches a recorded entry exactly. Instant — O(1) hash lookup.

Session verbatim

All tiers

If the request was seen during the active LEARN session, the in-memory capture replays it byte-for-byte (bodies and headers, served with X-Ghost-Mock: session-verbatim). This buffer is RAM-only — it never leaves the box and resets on restart or a new LEARN window.

Statechart / resource

All tiers

A Harel statechart models resource lifecycles, so a POST /charges followed by GET /charges/{id} returns the created resource instead of a 404. PATCH/POST transitions rewrite the status field and tag the response with X-Ghost-Transition.

Smart swap

All tiers¹

URI path parameters are normalised to templates (/users/{id}) and matched structurally. A recording of /users/42 will serve a request to /users/99. Enable with SMART_SWAP_ENABLED=true on the proxy.

AI generation

Pro+

Last resort. With no recorded, session, statechart, or structural match, a fine-tuned model (or retrieval-augmented generation) generates a realistic response based on recorded patterns for this service. Free tier stops at smart swap.

¹ Smart swap is available on all tiers but requires SMART_SWAP_ENABLED=true on the proxy.
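The fall-through between tiers can be illustrated with the two cheapest strategies, exact match and smart swap. This is a conceptual sketch rather than the proxy's implementation: the class and method names are hypothetical, SHA-256 is an assumed choice of body hash, and only numeric path segments are treated as parameters.

```python
import hashlib
import re
from typing import Optional


def exact_key(method: str, uri: str, body: bytes = b"") -> str:
    """Key built from method + URI + request body hash."""
    return f"{method.upper()} {uri} {hashlib.sha256(body).hexdigest()}"


def to_template(uri: str) -> str:
    """Assumption: only numeric segments are path parameters."""
    return "/".join("{id}" if re.fullmatch(r"\d+", s) else s for s in uri.split("/"))


class MockLibrary:
    def __init__(self, smart_swap_enabled: bool = False) -> None:
        self.smart_swap_enabled = smart_swap_enabled
        self._exact: dict[str, dict] = {}
        self._by_template: dict[tuple[str, str], dict] = {}

    def record(self, method: str, uri: str, response: dict, body: bytes = b"") -> None:
        self._exact[exact_key(method, uri, body)] = response
        # Keep the first recording seen for each (method, template) pair.
        self._by_template.setdefault((method.upper(), to_template(uri)), response)

    def match(self, method: str, uri: str, body: bytes = b"") -> Optional[tuple[str, dict]]:
        """Return (tier_name, response), or None so the caller can fall through."""
        hit = self._exact.get(exact_key(method, uri, body))  # O(1) dict lookup
        if hit is not None:
            return ("exact", hit)
        if self.smart_swap_enabled:
            hit = self._by_template.get((method.upper(), to_template(uri)))
            if hit is not None:
                return ("smart-swap", hit)
        return None
```

With smart swap enabled, a library holding only `/users/42` answers a request for `/users/99` from the same recording. The sketch returns that recording unchanged; whether the real proxy rewrites identifiers inside the body is outside what this example covers.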

Chaos injection

Any matching tier can be wrapped with a chaos config that injects random latency, error rates, or specific status codes to simulate degraded upstream behaviour. Available on all plans.
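Conceptually, chaos injection is a wrapper around whatever would have produced the response. The sketch below shows the idea with a hypothetical `ChaosConfig`; the field names are invented for illustration and do not reflect Gostly's actual configuration keys, which are listed in the Configuration Reference.

```python
import random
import time
from dataclasses import dataclass
from typing import Callable

Response = tuple[int, dict]  # (status code, JSON body)


@dataclass
class ChaosConfig:
    # Hypothetical field names, for illustration only.
    error_rate: float = 0.0     # probability of replacing the response with an error
    error_status: int = 503     # status code to inject
    max_latency_s: float = 0.0  # upper bound for random added latency


def with_chaos(handler: Callable[[], Response], cfg: ChaosConfig,
               rng: random.Random,
               sleep: Callable[[float], None] = time.sleep) -> Callable[[], Response]:
    """Wrap a response handler with random latency and injected errors."""
    def wrapped() -> Response:
        if cfg.max_latency_s > 0:
            sleep(rng.uniform(0.0, cfg.max_latency_s))
        if rng.random() < cfg.error_rate:
            return (cfg.error_status, {"error": "injected by chaos config"})
        return handler()
    return wrapped
```

Passing in a seeded `random.Random` makes a chaos run reproducible, which helps when a flaky test needs to be replayed.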

AI pipeline (Pro+)

When a request has no recorded match, Gostly routes it to the inference server. The inference server runs two optional modes, both disabled by default and enabled via environment variables:

ENABLE_RAG=true

Loads the all-MiniLM-L6-v2 sentence encoder and builds a per-service semantic index from your mock library. Incoming requests are matched by cosine similarity: above 0.92 the recorded response is replayed directly; between 0.75 and 0.92 the closest recording becomes a grounded generation template; otherwise the request falls through to pure generation. This is the recommended first step.
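The threshold logic can be shown independently of the encoder. The sketch below takes precomputed embedding vectors as plain lists of floats and routes on the best cosine score using the 0.92 and 0.75 cut-offs described above. The function names and the strategy labels it returns are illustrative, and treating each threshold as a strict "greater than" comparison is an assumption.

```python
import math
from typing import Optional, Sequence

REPLAY_THRESHOLD = 0.92    # above this: replay the recorded response
TEMPLATE_THRESHOLD = 0.75  # above this: use the recording as a generation template


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def route(query: Sequence[float],
          index: dict[str, Sequence[float]]) -> tuple[str, Optional[str]]:
    """Pick a strategy from the best cosine score against the indexed entries."""
    best_id, best_score = None, -1.0
    for entry_id, vector in index.items():
        score = cosine_similarity(query, vector)
        if score > best_score:
            best_id, best_score = entry_id, score
    if best_score > REPLAY_THRESHOLD:
        return ("replay", best_id)
    if best_score > TEMPLATE_THRESHOLD:
        return ("grounded-generation", best_id)
    return ("pure-generation", None)
```

In practice the vectors would come from the sentence encoder and the index would hold one entry per mock-library record; a linear scan is used here only to keep the example short.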

ENABLE_GENERATION=true

Loads Qwen2.5-0.5B-Instruct (configurable via GEN_MODEL) and serves LoRA adapter responses. For teams with 50+ recorded interactions per endpoint, optional fine-tuning produces a per-service adapter that improves consistency. Requires ~2 GB RAM; the first request after startup may briefly return 503 while the model loads.

The AI pipeline is local by default — the inference server runs inside your Docker stack, so no request bodies or response contents leave the box. An optional BYO-key cloud-LLM backend (OpenAI, Azure, or Bedrock) exists but is off unless you explicitly configure it.

Next steps

Proxy Setup →

TLS termination, multiple upstream services, per-service modes.

Configuration Reference →

Every environment variable, scrub rule, and feature flag.