Core Concepts

Security Model

Gostly records your real API traffic, which means it touches the most sensitive data your services carry — credentials, tokens, and PII. The security model is built so the safe behaviour is the default behaviour: each deployment is self-hosted and single-tenant, credential headers are stripped before anything is written to disk, and body PII is scrubbed on the way into Postgres and into anything that ships. This page walks the boundaries: what stays verbatim, what gets scrubbed, and where the enforcement actually lives.

Self-hosted data locality

The licensed product ships as a Docker Compose stack and container images — there is no host CLI and no Gostly-hosted control plane in the traffic path. Your captured traffic, mock library, webhooks, and training data live in your own Postgres and on your own Docker volume (./data/). None of it is transmitted to Gostly. The only cloud-resident state on Gostly's side is account data, license validation, and billing.

Licensed product vs OSS proxy

Two distinct products share the proxy lineage. The licensed product on this site ships as Compose + registry images (no host CLI). The open-source proxy is a separate project distributed via Homebrew and a container registry, and it does ship a host CLI. The capabilities described on this page are the licensed product unless stated otherwise.

Because the data never leaves your infrastructure, there is no Gostly-side backup. Treat the ./data/ volume as your own operational artifact — back it up the way you would any Docker volume (volume snapshot, tar to your own object storage, etc.).

Single-tenant isolation, RLS as a defined policy

Each deployment is single-tenant. The application is multi-tenant aware — every tenant-scoped table carries a tenant_id column and every query filters on it — but in a self-hosted install the tenant id defaults to default, so your data never shares a database with another customer's. Application-level tenant scoping on every query is the authoritative isolation control.

Per-tenant Postgres Row-Level Security policies are defined on all tenant-scoped tables as a defense-in-depth layer. The control plane binds a per-request GUC (gostly.tenant_id) through a tenant-scoped database dependency, and the RLS policies are written to read that GUC in their USING / WITH CHECK clauses.

Be precise: RLS is a defined policy, not engine-enforced today

In the shipped Compose configuration the database connects as the table-owning role, and Postgres does not enforce RLS for a table owner unless the table is marked FORCE ROW LEVEL SECURITY — which it is not in the shipped config. So today the policies exist as a defined layer, but the running isolation rests on the application-level WHERE tenant_id filters plus the single-tenant deployment model. We describe this as single-tenant isolation with RLS defined as defense-in-depth, never as “engine-enforced RLS.” Wiring the policies to be engine-enforced (a non-owner login role, or FORCE ROW LEVEL SECURITY on every table) is tracked work.

The 16-header credential redaction floor

A fixed floor of 16 credential and session headers is scrubbed by default on every sink that touches disk or leaves the box: the on-disk replay JSONL and every shipped/exported stream (usage telemetry, wide-events). The floor is built into the proxy and is non-overridable downward — the REDACT_HEADERS env var and per-service redact_headers config can only add headers, never remove them.

# The 16-entry floor (matched case-insensitively):
authorization            proxy-authorization      cookie
set-cookie               x-api-key                x-auth-token
x-session-id             x-access-token           x-user-token
x-csrf-token             x-amz-security-token     x-amz-session-token
x-goog-api-key           grpc-metadata-authorization
api-key                  token

Matched header values are replaced with [REDACTED]; the header name is preserved so the shape of the request is intact. The floor is pinned at exactly 16 entries — changing the list is a deliberate change to the security claim, not an accident.
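
The "add only, never remove" rule can be pictured as a set union. The following is a minimal sketch, not the proxy's actual code: the function names are hypothetical, and it assumes operator additions arrive as a plain list of header names (how REDACT_HEADERS is parsed is not shown here).

```python
# Illustrative sketch only; names below are hypothetical, not Gostly's API.
REDACTION_FLOOR = frozenset({
    "authorization", "proxy-authorization", "cookie",
    "set-cookie", "x-api-key", "x-auth-token",
    "x-session-id", "x-access-token", "x-user-token",
    "x-csrf-token", "x-amz-security-token", "x-amz-session-token",
    "x-goog-api-key", "grpc-metadata-authorization",
    "api-key", "token",
})


def effective_redact_set(extra_headers=()):
    """Union of the floor and operator additions; additions cannot shrink it."""
    extras = {h.strip().lower() for h in extra_headers if h.strip()}
    return REDACTION_FLOOR | extras


def redact_headers(headers, extra_headers=()):
    """Return a copy with matched values replaced and header names preserved."""
    redact = effective_redact_set(extra_headers)
    return {
        name: "[REDACTED]" if name.lower() in redact else value
        for name, value in headers.items()
    }
```

Because the effective set is always a superset of the floor, no configuration value can cause a floor header to be written verbatim.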

The one verbatim exception is RAM-only

The single place credential headers stay verbatim is the in-memory active-session capture, which replays stateful flows (Set-Cookie, CSRF and OAuth exchanges) byte-perfect while a LEARN window is active. That buffer is RAM-only, bounded by MOCK_SESSION_MAX_BYTES (256 MiB default), and dropped on restart or on a new LEARN window — it is never written to disk and never shipped. An air-gapped self-host can opt to persist headers to the disk JSONL with MOCK_PERSIST_VERBATIM=true; the shipped telemetry/wide-events sink is always redacted with no opt-out.

Body PII: verbatim locally, scrubbed into Postgres and exports

Header redaction and body scrubbing are two separate boundaries. Header credentials are stripped at capture time on every disk/shipped sink. Body PII follows a different rule, by design:

Local replay JSONL

The agent's on-disk replay library keeps response bodies verbatim — exact-fidelity replay is the whole value, and this file lives on your volume and never leaves the box. (Credential headers are still stripped to the floor here by default.)

Postgres + exports

When the LEARN→MOCK transition writes the recorded traffic into the Postgres-backed mock library (and into anything that ships or exports, such as training data), request and response bodies pass through a structural PII scrubber first.

The scrubber runs entirely in-process with zero network calls. It is regex-based and matches structured PII that shows up in API payloads — 17 categories today: JWTs, Bearer and Basic auth values, well-known API key prefixes (Stripe, OpenAI, Anthropic, GitHub, AWS access-key IDs), payment cards (Visa / Mastercard / Amex / Discover / Diners), US SSN, email, phone (US + E.164), Bitcoin and Ethereum addresses, IBAN, and IPv4. Each match is replaced with a typed placeholder:

{ "card": "[CARD]", "ssn": "[SSN]", "email": "[EMAIL]",
  "token": "[REDACTED]", "ip": "[IPV4]", "wallet": "[ETH_ADDR]" }

On top of the regex pass, a list of sensitive JSON keys (password, secret, api_key, refresh_token, cvv, mfa_code, and ~30 more) is matched by name: the entire value is replaced with [REDACTED] regardless of its content, so a secret in an unexpected format is still caught.
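
The two passes (pattern match on values, name match on keys) can be sketched as below. This is a simplified illustration under stated assumptions: the three regexes are hypothetical stand-ins for three of the categories, and the key list is only the six names quoted above, not the shipped list.

```python
import re

# Hypothetical, simplified patterns; the shipped rules are not reproduced here.
PATTERNS = [
    (re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"), "[EMAIL]"),
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),
    (re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"), "[IPV4]"),
]
SENSITIVE_KEYS = {"password", "secret", "api_key", "refresh_token", "cvv", "mfa_code"}


def scrub(value, key=None):
    """Recursively scrub a decoded JSON value."""
    # Key-name pass: the whole value goes, whatever its type or format.
    if key is not None and key.lower() in SENSITIVE_KEYS:
        return "[REDACTED]"
    if isinstance(value, dict):
        return {k: scrub(v, k) for k, v in value.items()}
    if isinstance(value, list):
        return [scrub(v) for v in value]
    # Regex pass: replace each structured match with its typed placeholder.
    if isinstance(value, str):
        for pattern, placeholder in PATTERNS:
            value = pattern.sub(placeholder, value)
    return value
```

The key-name check runs before the type dispatch, which is what lets it catch a secret stored as a number or nested object rather than a recognisable string.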

What the regex scrubber does not catch

This is structural pattern matching, not a model. Free-text PII with no fixed shape — a person's name or an address buried in a description field — is not caught by the regex pass today; NER-based scrubbing for unstructured fields is on the roadmap. Once a body row is scrubbed, a scrubbed_at seal is set and the sync layer refuses to overwrite a scrubbed row with an unscrubbed one — the scrub is a one-way boundary, not a reversible filter.
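
The one-way seal can be illustrated with a small guard. This is a sketch only: the row is modelled as a dict, and the function names and the timestamp format are assumptions, not the sync layer's real interface.

```python
from datetime import datetime, timezone


def seal(row):
    """Mark a row as scrubbed. Assumes the body was already scrubbed."""
    return {**row, "scrubbed_at": datetime.now(timezone.utc).isoformat()}


def sync_row(existing, incoming):
    """Return the row that should be stored after a sync attempt."""
    already_sealed = existing is not None and existing.get("scrubbed_at")
    if already_sealed and not incoming.get("scrubbed_at"):
        # Refuse: scrubbed data is never replaced by unscrubbed data.
        return existing
    return incoming
```

A sealed row can still be replaced by another sealed row; only the scrubbed-to-unscrubbed direction is blocked.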

No LLM in the request hot path

In MOCK mode every request runs through a deterministic match cascade, cheapest tier first. AI generation is the last resort, and even then it is not synchronous on the request path: generation runs on a background worker behind a bounded queue, and the response is served from cache. Keeping the LLM out of the request hot path is an architectural invariant, not a tuning default. The tiers, in order:

Session verbatim

If the request was seen during the active LEARN session, the in-memory capture replays it byte-for-byte. RAM-only.

Exact match

Method + URI + request body hash matches a recorded entry exactly. O(1) lookup.

Resource store

A POST-created resource is linked to a later GET-by-id, so POST /charges then GET /charges/{id} returns the created resource instead of a 404.

Statechart

The agent-side statechart engine fires at request time on every tier with bundled fixtures (charge, customer, invoice, order, subscription), advancing resource lifecycle state on transition requests.

Smart swap

URI path parameters are normalised to templates and matched structurally, so a recording of /users/42 serves a request to /users/99.

AI inference (Pro+)

Only when nothing above matches. Generation is enqueued to a background worker; the response is served from cache. The model is never on the synchronous request path.

On the Free tier the cascade stops at smart swap. For regulated or latency-sensitive deployments the inference engine can be disabled entirely, and the stack then runs fully deterministically. In-process generation is off by default (ENABLE_GENERATION=false), and generation routes through a dedicated inference sidecar. ENABLE_RAG (retrieval) is on by default.
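
To make the "cheapest tier first" ordering concrete, here is a minimal sketch of two of the tiers, exact match and smart swap. It is an illustration under assumptions, not the agent's matcher: the class and function names are hypothetical, SHA-256 stands in for whatever body hash is actually used, and only purely numeric path segments are treated as parameters. The session, resource-store, statechart, and AI tiers are omitted.

```python
import hashlib
import re

# Hypothetical rule: a purely numeric path segment is a parameter.
_NUMERIC_SEGMENT = re.compile(r"^\d+$")


def path_template(path):
    """Normalise /users/42 to /users/{id}."""
    parts = path.split("/")
    return "/".join("{id}" if _NUMERIC_SEGMENT.match(p) else p for p in parts)


def exact_key(method, uri, body):
    """Method + URI + request body hash, used as a dict key for O(1) lookup."""
    return (method.upper(), uri, hashlib.sha256(body).hexdigest())


class Library:
    def __init__(self):
        self.exact = {}      # exact_key -> response
        self.templated = {}  # (method, path template) -> response

    def record(self, method, uri, body, response):
        self.exact[exact_key(method, uri, body)] = response
        # Keep the first recording seen for each template.
        self.templated.setdefault((method.upper(), path_template(uri)), response)

    def match(self, method, uri, body=b""):
        """Try the cheaper tier first; return (match_type, response)."""
        hit = self.exact.get(exact_key(method, uri, body))
        if hit is not None:
            return "exact", hit
        hit = self.templated.get((method.upper(), path_template(uri)))
        if hit is not None:
            return "smart_swap", hit
        return "miss", None
```

Every step here is a dictionary lookup, so the deterministic tiers resolve without any model call or network hop.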

LoRA adapters train on scrubbed data only

When fine-tuning is used, LoRA adapters train only on PII-scrubbed rows from the Postgres library — never on the verbatim local JSONL — and they are served from cache, self-hosted inside your stack. AI mock-repair, where present, surfaces operator proposals you approve or reject; it is off by default (ENABLE_AI_MOCK_REPAIR) and never mutates the library on its own.

Operating modes and the transition interstitial

The proxy has four modes: LEARN (record and forward), MOCK (serve from the library), PASSTHROUGH (forward without recording), and TRANSITIONING. The transition is a brief interstitial: while the LEARN→MOCK job scrubs the recorded traffic and writes it to Postgres, the proxy returns 503 Service Unavailable with a Retry-After header so no request matches a half-written library. This is the moment the body-scrub boundary is crossed — see How It Works for the full pipeline.
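
A client or test harness that may hit the proxy during the interstitial can simply wait and retry. The helper below is a hedged sketch of that pattern: the function name, the injected `send` callable, and the default delay are all assumptions made for illustration, and only the delta-seconds form of Retry-After is handled.

```python
import time


def call_with_retry(send, max_attempts=5, default_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-503 status or attempts run out.

    send is any callable returning (status, headers, body).
    """
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status != 503 or attempt == max_attempts:
            return status, headers, body
        retry_after = headers.get("Retry-After", "")
        # Retry-After may also be an HTTP date; this sketch handles seconds only.
        delay = float(retry_after) if retry_after.isdigit() else default_delay
        sleep(delay)
```

Passing `sleep` as a parameter keeps the helper easy to exercise in tests without real waiting.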

TLS interception is opt-in

Plain HTTP is served on :8080 and is the default path. A TLS-MITM listener runs on :8443 only when ENABLE_TLS_INTERCEPTION is set. The knob is tri-state: off (the default); true / lax (a listener failure is logged and plain HTTP keeps serving); or strict (a TLS failure exits the process, because “TLS is load-bearing”).

When interception is on, the agent mints per-host leaf certs from an embedded CA. Fetch and trust the CA once:

curl http://<agent>:8080/ca.crt > gostly-ca.crt
# then add gostly-ca.crt to the client/OS trust store

GET /ca.crt returns 503 while interception is off. Outbound TLS fingerprint impersonation (genuine Chrome / Firefox / Safari fingerprints toward the upstream) is a Pro+ feature and off unless a service is configured for it.

SSRF-guarded webhook replay

Webhook capture is automatic in the agent; replay is operator-triggered through the API (POST /v1/webhooks/{service_id}/{webhook_id}/replay) — the agent does not auto-replay. Because replay re-issues an outbound request to a caller-influenced URL, every replay target passes a shared SSRF guard before any socket opens. The same guard protects the canary replay primitive.

Scheme

Only http and https are accepted.

Host literals

localhost, ip6-localhost, ip6-loopback, and cloud metadata DNS names (metadata.google.internal and variants) are blocked.

Ports

SSH (22), SMTP (25), MySQL (3306), Postgres (5432), Redis (6379), Elasticsearch (9200), memcached (11211), and MongoDB (27017) are rejected on any host.

IP classification

Loopback, link-local (including 169.254.169.254 instance metadata), RFC1918 private space, IPv6 ULA, CGNAT (100.64/10), multicast, reserved, and unspecified are all blocked.

DNS rebinding

Every getaddrinfo result is classified. A hostname that returns any internal address is rejected outright; the guard does not fall back to the least-bad address.

The internal-address and metadata blocks apply in every environment by default — they are not relaxed just because a deployment is not in production. A developer who genuinely needs to replay against a local listener opts in explicitly with GOSTLY_ALLOW_LOCAL_REPLAY=1; the scheme and port guards are never relaxed.
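
The checks listed above compose into a single pre-flight function. What follows is a minimal sketch using only the Python standard library, not the shipped guard: the function names are hypothetical, the host list contains only the literals named above (not the "variants"), and the GOSTLY_ALLOW_LOCAL_REPLAY opt-in is deliberately left out.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

BLOCKED_PORTS = {22, 25, 3306, 5432, 6379, 9200, 11211, 27017}
BLOCKED_HOSTS = {"localhost", "ip6-localhost", "ip6-loopback", "metadata.google.internal"}
CGNAT = ipaddress.ip_network("100.64.0.0/10")


def is_internal(ip):
    """Classify one resolved address as internal (blocked) or not."""
    addr = ipaddress.ip_address(ip)
    # Unwrap IPv4-mapped IPv6 (::ffff:a.b.c.d) so the IPv4 rules apply.
    mapped = getattr(addr, "ipv4_mapped", None)
    if mapped is not None:
        addr = mapped
    return (
        addr.is_loopback or addr.is_link_local or addr.is_private
        or addr.is_multicast or addr.is_reserved or addr.is_unspecified
        or (addr.version == 4 and addr in CGNAT)
    )


def check_replay_target(url, resolve=socket.getaddrinfo):
    """Raise ValueError if url fails any guard; otherwise return the resolved IPs."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        raise ValueError("scheme not allowed")
    host = (parts.hostname or "").lower()
    if not host or host in BLOCKED_HOSTS:
        raise ValueError("host not allowed")
    port = parts.port or (443 if parts.scheme == "https" else 80)
    if port in BLOCKED_PORTS:
        raise ValueError("port not allowed")
    # Classify every result: one internal address rejects the whole hostname.
    ips = {info[4][0] for info in resolve(host, port, type=socket.SOCK_STREAM)}
    if not ips or any(is_internal(ip) for ip in ips):
        raise ValueError("resolves to an internal address")
    return sorted(ips)
```

A guard of this shape only holds if the subsequent connection is made to one of the vetted IPs; resolving the hostname a second time at connect time would reopen the rebinding window.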

Authentication, RBAC, and the audit log

SSO and role-based access ship in the web container's auth layer (Team tier). Three identity backends are supported: password (bcrypt, cost factor 12), SAML 2.0, and OIDC (authorization-code flow with JWKS validation). A four-rank role model gates every state-changing endpoint:

viewer  <  member  <  admin  <  owner

Roles are enforced server-side, not in the client. The owner rank carries a hard invariant — the system refuses to demote the last owner of a tenant. Authentication and activity events (logins, SSO, session-revokes, and mutations on services, mocks, repair proposals, and scrub configs) are written to an append-only audit log. Sessions use HttpOnly, Secure, SameSite=Lax cookies; API keys are stored as hashes and compared in constant time.
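
The rank ordering, the last-owner invariant, and the constant-time key check can each be expressed in a few lines. This is a sketch under assumptions, not the auth layer's code: the function names are hypothetical, membership is modelled as a plain dict, and SHA-256 is a stand-in for whatever hash the product actually stores.

```python
import hashlib
import hmac
from enum import IntEnum


class Role(IntEnum):
    VIEWER = 0
    MEMBER = 1
    ADMIN = 2
    OWNER = 3


def require(actor, minimum):
    """Server-side gate: raise unless the actor's rank meets the minimum."""
    if actor < minimum:
        raise PermissionError(f"{minimum.name.lower()} role required")


def change_role(members, user, new_role):
    """members maps user -> Role. Refuses to demote the last owner."""
    owners = [u for u, r in members.items() if r is Role.OWNER]
    if members[user] is Role.OWNER and new_role is not Role.OWNER and owners == [user]:
        raise ValueError("cannot demote the last owner")
    members[user] = new_role


def api_key_matches(presented, stored_hash_hex):
    """Hash the presented key and compare digests in constant time."""
    digest = hashlib.sha256(presented.encode()).hexdigest()
    return hmac.compare_digest(digest, stored_hash_hex)
```

Using `hmac.compare_digest` instead of `==` avoids leaking, through response timing, how many leading characters of the digest matched.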

What is captured, and what stays observability-only

Gostly records and replays HTTP and HTTPS (HTTP/1.1 and HTTP/2 over TLS). WebSocket frames are captured for observability only; they are not replayed. There is no gRPC, async-messaging, or database mocking today (roadmap). Cold-start seeding lets you bootstrap a library without live traffic by dragging a HAR, Postman collection, or OpenAPI spec into the dashboard, which sends a POST to /v1/seed/{har,postman,openapi}. Drift detection emits drift events plus a 0–100 freshness score and a sparkline trend so you can see when a recorded library is going stale against its upstream.

The telemetry boundary

Captured request bodies, response bodies, full URLs, credential headers, cookies, and API keys never leave your stack. The agent does export operational metrics (which can be opted out of) in Prometheus format on /metrics: request counts by match type, mock library size, IO errors, HTTP rate and latency, and the TLS subsystem family.

ghost_requests_total{match_type}      # one increment per match-path outcome
ghost_mock_library_size               # library size gauge
ghost_io_errors_total{operation}      # disk-sink open/write failures
axum_http_requests_total / _duration  # HTTP rate + latency
gostly_tls_*                          # TLS MITM subsystem (ALPN, cert cache, listener state)

These are counts and gauges — no request bodies, no raw identifiers. The full schema and the single environment variable that disables the usage stream are published on the telemetry page.

Next steps

How It Works →

The LEARN → MOCK pipeline and the full match cascade in detail.

Configuration Reference →

Every environment variable and feature flag, including the redaction add-ons.