Security Model
Gostly records your real API traffic, which means it touches the most sensitive data your services carry — credentials, tokens, and PII. The security model is built so the safe behaviour is the default behaviour: each deployment is self-hosted and single-tenant, credential headers are stripped before anything is written to disk, and body PII is scrubbed on the way into Postgres and into anything that ships. This page walks the boundaries: what stays verbatim, what gets scrubbed, and where the enforcement actually lives.
Self-hosted data locality
The licensed product ships as a Docker Compose stack and container images — there is no host CLI and no Gostly-hosted control plane in the traffic path. Your captured traffic, mock library, webhooks, and training data live in your own Postgres and on your own Docker volume (./data/). None of it is transmitted to Gostly. The only cloud-resident state on Gostly's side is account data, license validation, and billing.
Licensed product vs OSS proxy
Because the data never leaves your infrastructure, there is no Gostly-side backup. Treat the ./data/ volume as your own operational artifact — back it up the way you would any Docker volume (volume snapshot, tar to your own object storage, etc.).
Single-tenant isolation, RLS as a defined policy
Each deployment is single-tenant. The application is multi-tenant aware (every tenant-scoped table carries a tenant_id column and every query filters on it), but in a self-hosted install the tenant id defaults to the literal value default, so your data never shares a database with another customer's. Application-level tenant scoping on every query is the authoritative isolation control.
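A minimal sketch of what application-level tenant scoping can look like, assuming a hypothetical mocks table and using SQLite only so the example runs anywhere (the product itself uses Postgres). The class and method names are illustrative, not Gostly's code; the point is that callers cannot issue a query that omits the tenant_id filter.

```python
import sqlite3

DEFAULT_TENANT = "default"  # self-hosted installs use this tenant id


class TenantScopedDB:
    """Hypothetical wrapper: every statement carries the tenant_id filter."""

    def __init__(self, conn, tenant_id=DEFAULT_TENANT):
        self.conn = conn
        self.tenant_id = tenant_id

    def add_mock(self, name):
        # The tenant id is bound by the wrapper, never supplied by the caller.
        self.conn.execute(
            "INSERT INTO mocks (tenant_id, name) VALUES (?, ?)",
            (self.tenant_id, name),
        )

    def list_mocks(self):
        rows = self.conn.execute(
            "SELECT name FROM mocks WHERE tenant_id = ? ORDER BY name",
            (self.tenant_id,),
        )
        return [name for (name,) in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mocks (tenant_id TEXT NOT NULL, name TEXT NOT NULL)")
```

Two wrappers over the same connection with different tenant ids see disjoint rows, which is the property the WHERE tenant_id filter is there to provide.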
Per-tenant Postgres Row-Level Security policies are defined on all tenant-scoped tables as a defense-in-depth layer. The control plane binds a per-request GUC (gostly.tenant_id) through a tenant-scoped database dependency, and the RLS policies are written to read that GUC in their USING / WITH CHECK clauses.
Be precise: RLS is a defined policy, not engine-enforced today
FORCE ROW LEVEL SECURITY, which it is not in the shipped config. So today the policies exist as a defined layer, but the running isolation rests on the application-level WHERE tenant_id filters plus the single-tenant deployment model. We describe this as single-tenant isolation with RLS defined as defense-in-depth, never as “engine-enforced RLS.” Wiring the policies to be engine-enforced (a non-owner login role, or FORCE ROW LEVEL SECURITY on every table) is tracked work.
The 16-header credential redaction floor
A fixed floor of 16 credential and session headers is scrubbed by default on every sink that touches disk or leaves the box: the on-disk replay JSONL and every shipped/exported stream (usage telemetry, wide-events). The floor is built into the proxy and is non-overridable downward — the REDACT_HEADERS env var and per-service redact_headers config can only add headers, never remove them.
# The 16-entry floor (matched case-insensitively):
authorization
proxy-authorization
cookie
set-cookie
x-api-key
x-auth-token
x-session-id
x-access-token
x-user-token
x-csrf-token
x-amz-security-token
x-amz-session-token
x-goog-api-key
grpc-metadata-authorization
api-key
token
Matched header values are replaced with [REDACTED]; the header name is preserved so the shape of the request is intact. The floor is pinned at exactly 16 entries — changing the list is a deliberate change to the security claim, not an accident.
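The add-only behaviour described above can be sketched in a few lines. This is an illustration under assumptions, not the proxy's implementation: the function names are hypothetical, and extra_headers stands in for whatever REDACT_HEADERS or redact_headers supplies.

```python
# Hypothetical sketch; only the 16 header names come from the list above.
REDACTION_FLOOR = frozenset({
    "authorization", "proxy-authorization", "cookie", "set-cookie",
    "x-api-key", "x-auth-token", "x-session-id", "x-access-token",
    "x-user-token", "x-csrf-token", "x-amz-security-token",
    "x-amz-session-token", "x-goog-api-key",
    "grpc-metadata-authorization", "api-key", "token",
})


def effective_redact_set(extra_headers=()):
    """Union the floor with operator-supplied names; nothing can be removed."""
    extras = {h.strip().lower() for h in extra_headers if h.strip()}
    return REDACTION_FLOOR | extras


def redact_headers(headers, extra_headers=()):
    """Return a copy with matched values replaced and header names preserved."""
    redact = effective_redact_set(extra_headers)
    return {
        name: "[REDACTED]" if name.lower() in redact else value
        for name, value in headers.items()
    }
```

Because the effective set is a union, there is no input to effective_redact_set that yields fewer than the 16 floor entries, which mirrors the "non-overridable downward" claim.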
The one verbatim exception is RAM-only
MOCK_SESSION_MAX_BYTES (256 MiB default), and dropped on restart or on a new LEARN window; it is never written to disk and never shipped. An air-gapped self-host can opt to persist headers to the disk JSONL with MOCK_PERSIST_VERBATIM=true; the shipped telemetry/wide-events sink is always redacted with no opt-out.
Body PII: verbatim locally, scrubbed into Postgres and exports
Header redaction and body scrubbing are two separate boundaries. Header credentials are stripped at capture time on every disk/shipped sink. Body PII follows a different rule, by design:
Local replay JSONL
The agent's on-disk replay library keeps response bodies verbatim — exact-fidelity replay is the whole value, and this file lives on your volume and never leaves the box. (Credential headers are still stripped to the floor here by default.)
Postgres + exports
When the LEARN→MOCK transition writes the recorded traffic into the Postgres-backed mock library, and into anything that ships or exports, such as training data, request and response bodies pass through a structural PII scrubber first.
The scrubber runs entirely in-process with zero network calls. It is regex-based and matches structured PII that shows up in API payloads — 17 categories today: JWTs, Bearer and Basic auth values, well-known API key prefixes (Stripe, OpenAI, Anthropic, GitHub, AWS access-key IDs), payment cards (Visa / Mastercard / Amex / Discover / Diners), US SSN, email, phone (US + E.164), Bitcoin and Ethereum addresses, IBAN, and IPv4. Each match is replaced with a typed placeholder:
{ "card": "[CARD]", "ssn": "[SSN]", "email": "[EMAIL]",
"token": "[REDACTED]", "ip": "[IPV4]", "wallet": "[ETH_ADDR]" }
On top of the regex pass, a list of sensitive JSON keys (password, secret, api_key, refresh_token, cvv, mfa_code, and ~30 more) is matched by name: the entire value is replaced with [REDACTED] regardless of its content, so a secret in an unexpected format is still caught.
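To make the two passes concrete, here is a small sketch that combines a regex pass with key-name matching. It covers only four of the categories, the patterns are simplified assumptions rather than the shipped ones, and the key list is just the six names quoted above.

```python
import re

# Illustrative subset; real patterns would be stricter than these.
PATTERNS = [
    (re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"), "[EMAIL]"),
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),
    (re.compile(r"\b0x[0-9a-fA-F]{40}\b"), "[ETH_ADDR]"),
    (re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"), "[IPV4]"),
]
SENSITIVE_KEYS = {"password", "secret", "api_key", "refresh_token", "cvv", "mfa_code"}


def scrub(value, key=None):
    """Recursively scrub a decoded JSON value, entirely in-process."""
    # Key-name pass: the whole value goes, whatever its format.
    if key is not None and key.lower() in SENSITIVE_KEYS:
        return "[REDACTED]"
    if isinstance(value, dict):
        return {k: scrub(v, k) for k, v in value.items()}
    if isinstance(value, list):
        return [scrub(v) for v in value]
    if isinstance(value, str):
        # Regex pass: each match becomes a typed placeholder.
        for pattern, placeholder in PATTERNS:
            value = pattern.sub(placeholder, value)
    return value
```

The key-name check runs before the type dispatch, so a sensitive key holding a nested object or a number is replaced just like a string would be.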
What the regex scrubber does not catch
scrubbed_at seal is set and the sync layer refuses to overwrite a scrubbed row with an unscrubbed one; the scrub is a one-way boundary, not a reversible filter.
No LLM in the request hot path
In MOCK mode every request runs through a deterministic match cascade, cheapest tier first. AI generation is the last resort, and even then it is not synchronous on the request path: generation runs on a background worker behind a bounded queue, and the response is served from cache. That no LLM sits in the request hot path is an architectural invariant, not a tuning default. The tiers, in order:
Session verbatim
If the request was seen during the active LEARN session, the in-memory capture replays it byte-for-byte. RAM-only.
Exact match
Method + URI + request body hash matches a recorded entry exactly. O(1) lookup.
Resource store
A POST-created resource is linked to a later GET-by-id, so POST /charges then GET /charges/{id} returns the created resource instead of a 404.
Statechart
The agent-side statechart engine fires at request time on every tier with bundled fixtures (charge, customer, invoice, order, subscription), advancing resource lifecycle state on transition requests.
Smart swap
URI path parameters are normalised to templates and matched structurally, so a recording of /users/42 serves a request to /users/99.
AI inference (Pro+)
Only when nothing above matches. Generation is enqueued to a background worker; the response is served from cache. The model is never on the synchronous request path.
On the Free tier the cascade stops at smart swap. For regulated or latency-sensitive deployments the inference engine can be disabled entirely and the stack runs fully deterministically. In-process generation is off by default (ENABLE_GENERATION=false), and generation routes through a dedicated inference sidecar. ENABLE_RAG (retrieval) is on by default.
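The cascade can be pictured as a first-hit-wins lookup with an enqueue-only fallback. The sketch below models three of the tiers (session, exact match, smart swap) and omits the resource store and statechart for brevity. All names are hypothetical, SHA-256 is an assumed choice of body hash, and what the proxy actually returns on a cold miss is not stated on this page, so the sketch simply returns None.

```python
import hashlib
import re
from collections import deque


def exact_key(method, path, body=b""):
    """Method + URI + request body hash, used as an O(1) dictionary key."""
    return (method, path, hashlib.sha256(body).hexdigest())


def template(path):
    """Normalise numeric path segments: /users/42 -> /users/{id}."""
    return re.sub(r"/\d+(?=/|$)", "/{id}", path)


class Cascade:
    def __init__(self, ai_enabled=False, queue_limit=100):
        self.session = {}     # RAM-only captures from the active LEARN session
        self.exact = {}       # recorded entries keyed by exact_key
        self.templates = {}   # recorded entries keyed by (method, template)
        self.ai_enabled = ai_enabled
        self.queue = deque()  # bounded hand-off to a background worker
        self.queue_limit = queue_limit

    def record(self, method, path, body, response):
        self.exact[exact_key(method, path, body)] = response
        self.templates[(method, template(path))] = response

    def match(self, method, path, body=b""):
        key = exact_key(method, path, body)
        if key in self.session:
            return "session", self.session[key]
        if key in self.exact:
            return "exact", self.exact[key]
        hit = self.templates.get((method, template(path)))
        if hit is not None:
            return "smart_swap", hit
        # Last resort: enqueue only. No model call happens on this path.
        if self.ai_enabled and len(self.queue) < self.queue_limit:
            self.queue.append(key)
        return "miss", None
```

Note that match never blocks on generation: the most it does for an unmatched request is append a key to a bounded queue, which is the shape of the invariant described above.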
LoRA adapters train on scrubbed data only
ENABLE_AI_MOCK_REPAIR) and never mutates the library on its own.
Operating modes and the transition interstitial
The proxy has four modes: LEARN (record and forward), MOCK (serve from the library), PASSTHROUGH (forward without recording), and TRANSITIONING. The transition is a brief interstitial: while the LEARN→MOCK job scrubs the recorded traffic and writes it to Postgres, the proxy returns 503 Service Unavailable with a Retry-After header so no request matches a half-written library. This is the moment the body-scrub boundary is crossed — see How It Works for the full pipeline.
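A client or test harness that hits the proxy during the transition should treat the 503 as a signal to wait. The helper below is a hedged sketch of that client-side handling: send is a hypothetical callable returning (status, headers, body), the headers mapping is assumed to use the exact key Retry-After, and non-numeric values (such as an HTTP date) fall back to a one-second wait.

```python
import time


def call_with_retry(send, max_attempts=5, sleep=time.sleep):
    """Retry while the proxy answers 503, honouring Retry-After in seconds."""
    status, body = None, None
    for _ in range(max_attempts):
        status, headers, body = send()
        if status != 503:
            break
        delay = headers.get("Retry-After", "1")
        # Only the delta-seconds form is parsed here; anything else waits 1s.
        sleep(float(delay) if delay.isdigit() else 1.0)
    return status, body
```

Passing sleep as a parameter keeps the helper testable without real delays.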
TLS interception is opt-in
Plain HTTP is served on :8080 and is the default path. A TLS-MITM listener runs on :8443 only when ENABLE_TLS_INTERCEPTION is set. The knob is tri-state: default off; true / lax (a listener failure logs and plain HTTP keeps serving); or strict (a TLS failure exits the process: “TLS is load-bearing”).
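The tri-state knob can be modelled as a small enum. This is a sketch of the semantics described above, not the agent's parser: the names are hypothetical, and treating unset or unrecognised values as off is an assumption the page does not confirm.

```python
from enum import Enum


class TlsMode(Enum):
    OFF = "off"
    LAX = "lax"
    STRICT = "strict"


def parse_tls_mode(raw):
    """Map the ENABLE_TLS_INTERCEPTION value onto one of three states."""
    value = (raw or "").strip().lower()
    if value in ("true", "lax"):
        return TlsMode.LAX
    if value == "strict":
        return TlsMode.STRICT
    # Assumption: unset and unrecognised values both mean "off".
    return TlsMode.OFF


def on_listener_failure(mode):
    """Describe the reaction to a TLS listener failure for each state."""
    if mode is TlsMode.STRICT:
        return "exit"               # TLS is load-bearing
    if mode is TlsMode.LAX:
        return "log-and-keep-http"  # plain HTTP on :8080 keeps serving
    return "not-applicable"         # no TLS listener was started
```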
When interception is on, the agent mints per-host leaf certs from an embedded CA. Fetch and trust the CA once:
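# --fail makes curl exit non-zero instead of saving the 503 error body when interception is off
curl --fail http://<agent>:8080/ca.crt > gostly-ca.crt
# then add gostly-ca.crt to the client/OS trust store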
GET /ca.crt returns 503 while interception is off. Outbound TLS fingerprint impersonation (genuine Chrome / Firefox / Safari fingerprints toward the upstream) is a Pro+ feature and off unless a service is configured for it.
SSRF-guarded webhook replay
Webhook capture is automatic in the agent; replay is operator-triggered through the API (POST /v1/webhooks/{service_id}/{webhook_id}/replay) — the agent does not auto-replay. Because replay re-issues an outbound request to a caller-influenced URL, every replay target passes a shared SSRF guard before any socket opens. The same guard protects the canary replay primitive.
Scheme
Only http and https are accepted.
Host literals
localhost, ip6-localhost, ip6-loopback, and cloud metadata DNS names (metadata.google.internal and variants) are blocked.
Ports
SSH (22), SMTP (25), MySQL (3306), Postgres (5432), Redis (6379), Elasticsearch (9200), memcached (11211), and MongoDB (27017) are rejected on any host.
IP classification
Loopback, link-local (including 169.254.169.254 instance metadata), RFC1918 private space, IPv6 ULA, CGNAT (100.64/10), multicast, reserved, and unspecified are all blocked.
DNS rebinding
Every getaddrinfo result is classified: a hostname that resolves to any internal address is rejected outright, rather than having its least-bad address picked.
The internal-address and metadata blocks apply in every environment by default — they are not relaxed just because a deployment is not in production. A developer who genuinely needs to replay against a local listener opts in explicitly with GOSTLY_ALLOW_LOCAL_REPLAY=1; the scheme and port guards are never relaxed.
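The checks above compose into a single pre-flight function. What follows is a sketch under assumptions, not the shipped guard: function names are hypothetical, only one metadata hostname is listed (the variants are not enumerated on this page), and allow_local stands in for GOSTLY_ALLOW_LOCAL_REPLAY=1. The resolver is injectable so the logic can be exercised without real DNS.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

BLOCKED_HOSTS = {"localhost", "ip6-localhost", "ip6-loopback",
                 "metadata.google.internal"}
BLOCKED_PORTS = {22, 25, 3306, 5432, 6379, 9200, 11211, 27017}
CGNAT = ipaddress.ip_network("100.64.0.0/10")


def is_internal(ip):
    """Classify one address; IPv4-mapped IPv6 is unwrapped first."""
    if ip.version == 6 and ip.ipv4_mapped is not None:
        ip = ip.ipv4_mapped
    return (
        ip.is_loopback or ip.is_link_local or ip.is_private
        or ip.is_multicast or ip.is_reserved or ip.is_unspecified
        or (ip.version == 4 and ip in CGNAT)
    )


def resolve(host, port):
    """Return every address getaddrinfo yields, with any scope id stripped."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return [ipaddress.ip_address(info[4][0].split("%")[0]) for info in infos]


def check_replay_target(url, resolver=resolve, allow_local=False):
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False, "scheme"
    host = (parts.hostname or "").lower()
    if not host:
        return False, "host"
    port = parts.port or (443 if parts.scheme == "https" else 80)
    if port in BLOCKED_PORTS:
        return False, "port"
    if allow_local:
        # Scheme and port guards above still applied; only these relax.
        return True, "ok"
    if host in BLOCKED_HOSTS:
        return False, "host"
    try:
        addresses = [ipaddress.ip_address(host)]
    except ValueError:
        addresses = resolver(host, port)
    # One internal result is enough to reject the whole hostname.
    if not addresses or any(is_internal(ip) for ip in addresses):
        return False, "address"
    return True, "ok"
```

A production guard would also need to connect to the exact address it vetted, since resolving again at connect time reopens the rebinding window; this sketch stops at classification.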
Authentication, RBAC, and the audit log
SSO and role-based access ship in the web container's auth layer (Team tier). Three identity backends are supported: password (bcrypt, cost factor 12), SAML 2.0, and OIDC (authorization-code flow with JWKS validation). A four-rank role model gates every state-changing endpoint:
viewer < member < admin < owner
Roles are enforced server-side, not in the client. The owner rank carries a hard invariant — the system refuses to demote the last owner of a tenant. Authentication and activity events (logins, SSO, session-revokes, and mutations on services, mocks, repair proposals, and scrub configs) are written to an append-only audit log. Sessions use HttpOnly, Secure, SameSite=Lax cookies; API keys are stored as hashes and compared in constant time.
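The rank ordering and the last-owner invariant lend themselves to a short illustration. Everything here is a sketch with hypothetical names; in particular, SHA-256 is an assumed hash for stored API keys, since the page says only that keys are stored as hashes and compared in constant time.

```python
import hashlib
import hmac
from enum import IntEnum


class Role(IntEnum):
    VIEWER = 0
    MEMBER = 1
    ADMIN = 2
    OWNER = 3


def can(actor, required):
    """Server-side gate: the actor's rank must meet the endpoint's minimum."""
    return actor >= required


def change_role(members, user, new_role):
    """members maps user id -> Role; refuse to demote the last owner."""
    owners = [u for u, r in members.items() if r == Role.OWNER]
    if members[user] == Role.OWNER and new_role < Role.OWNER and owners == [user]:
        raise ValueError("cannot demote the last owner of a tenant")
    members[user] = new_role


def api_key_matches(presented, stored_hash_hex):
    """Hash the presented key, then compare digests in constant time."""
    digest = hashlib.sha256(presented.encode()).hexdigest()
    return hmac.compare_digest(digest, stored_hash_hex)
```

Using hmac.compare_digest avoids the early-exit behaviour of ==, so comparison time does not leak how many leading characters matched.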
What is captured, and what stays observability-only
Gostly records and replays HTTP and HTTPS: HTTP/1.1 and HTTP/2 over TLS. WebSocket frames are captured for observability only; they are not replayed. There is no gRPC, async-messaging, or database mocking today (roadmap). Cold-start seeding lets you bootstrap a library without live traffic by dragging a HAR, Postman collection, or OpenAPI spec into the dashboard, which sends a POST to /v1/seed/{har,postman,openapi}. Drift detection emits drift events plus a 0–100 freshness score and a sparkline trend so you can see when a recorded library is going stale against its upstream.
The telemetry boundary
Captured request bodies, response bodies, full URLs, credential headers, cookies, and API keys never leave your stack. The agent does export operational metrics, which you can opt out of, in Prometheus format on /metrics: request counts by match type, mock library size, IO errors, HTTP rate and latency, and the TLS subsystem family.
ghost_requests_total{match_type} # one increment per match-path outcome
ghost_mock_library_size # library size gauge
ghost_io_errors_total{operation} # disk-sink open/write failures
axum_http_requests_total / _duration # HTTP rate + latency
gostly_tls_* # TLS MITM subsystem (ALPN, cert cache, listener state)
These are counts and gauges: no request bodies, no raw identifiers. The full schema and the single environment variable that disables the usage stream are published on the telemetry page.