
Operating GroundShade

Daily operator guide. Configuration, defense levels, fast-lane allowlists, the admin dashboard, the metrics reference, and the tuning table.

updated 2026-05-29

This page covers running GroundShade in production. The wire formats and full design live in SPEC.md.

TL;DR

# Plain run with built-in defaults. Forwards to http://127.0.0.1:3000.
groundshade

# With a config file:
groundshade --config /etc/groundshade/config.yaml

The proxy listens on 0.0.0.0:8080. The admin port and dashboard live on 127.0.0.1:9090.

Configuration

YAML, loaded in priority order:

--config <path> flag
GROUNDSHADE_CONFIG env var
/etc/groundshade/config.yaml
./groundshade.yaml
Built-in defaults (zero-config)
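
The priority order above can be sketched as a first-match lookup. This is only an illustration, not GroundShade’s loader: resolve_config_path and its exists parameter are hypothetical names, and it assumes "resolves" means "the file exists". The real binary may treat an explicitly given but missing path as an error instead of falling through.

```python
import os
from typing import Callable, Mapping, Optional


def resolve_config_path(
    cli_path: Optional[str],
    env: Mapping[str, str],
    exists: Callable[[str], bool] = os.path.isfile,
) -> Optional[str]:
    """Return the first config source that resolves, or None for zero-config.

    Assumption: a source "resolves" when the file exists on disk.
    """
    candidates = [
        cli_path,                        # 1. --config <path> flag
        env.get("GROUNDSHADE_CONFIG"),   # 2. GROUNDSHADE_CONFIG env var
        "/etc/groundshade/config.yaml",  # 3. system-wide file
        "./groundshade.yaml",            # 4. file in the working directory
    ]
    for path in candidates:
        if path and exists(path):
            return path
    return None                          # 5. built-in defaults (zero-config)
```

A None result corresponds to zero-config mode, where the proxy runs on built-in defaults.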

examples/config/full.yaml documents every option at its default. examples/config/minimal.yaml is the smallest useful starting point.

Environment overrides

These env vars override the loaded YAML at startup and on every SIGHUP reload:

Variable	Purpose
`GROUNDSHADE_CONFIG`	Config file path
`GROUNDSHADE_LISTEN_HTTP`	Inbound HTTP listener
`GROUNDSHADE_LISTEN_ADMIN`	Admin listener
`GROUNDSHADE_ADMIN_TOKEN`	Bearer token gating `/admin/*` and `/metrics`
`GROUNDSHADE_UPSTREAM_URL`	Default upstream URL
`GROUNDSHADE_TRUSTED_PROXIES`	Comma-separated CIDRs
`GROUNDSHADE_TRUST_STATE_DIR`	Trust-key state directory
`GROUNDSHADE_TRUST_SECRET`	HMAC signing key (hex)
`GROUNDSHADE_LOG_FORMAT`	`json` or `text`
`GROUNDSHADE_LOG_IP_HASH`	`true` hashes client IPs in logs

The Docker image sets GROUNDSHADE_LISTEN_ADMIN=0.0.0.0:9090 and GROUNDSHADE_TRUST_STATE_DIR=/var/lib/groundshade. Outside Docker, the state directory defaults to $XDG_STATE_HOME/groundshade or $HOME/.local/state/groundshade.

Hot reload

SIGHUP re-reads the config file and reapplies environment overrides. The listeners drain gracefully and the proxy restarts on the same PID with the same trust signing key, so outstanding gs_trust cookies stay valid. Defense state, sliding windows, connection counts, and metric counters all reset.

A validation failure logs a warning and keeps the old config running. Bad reloads never take the proxy down.

To catch a bad config before you reload, validate it without starting the server:

groundshade --check-config --config /etc/groundshade/config.yaml
# exit 0 + "config OK" if valid, exit 1 + the error if not

It applies the same env overrides as a real start, so it checks the effective config. Run it as a container pre-flight (docker exec groundshade groundshade --check-config) before the SIGHUP.
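
A deploy script can chain the check and the reload so a bad file never reaches the running proxy. The helper below is a sketch under stated assumptions: check_then_reload is a hypothetical name, the caller already knows the proxy’s PID, and the script runs with the same environment as the proxy so the env overrides match. Only the --check-config and --config flags and the exit codes come from this page.

```python
import os
import signal
import subprocess


def check_then_reload(pid, config_path, run=subprocess.run, kill=os.kill):
    """Validate the config; send SIGHUP to `pid` only if validation passes.

    `run` and `kill` are injectable so the helper can be exercised without
    a real groundshade binary or process (an assumption of this sketch).
    """
    result = run(["groundshade", "--check-config", "--config", config_path])
    if result.returncode != 0:
        return False            # exit 1: leave the running proxy untouched
    kill(pid, signal.SIGHUP)    # exit 0: trigger the hot reload
    return True
```

The running proxy already survives a bad reload on its own; the pre-flight surfaces the error in the deploy script rather than only in the proxy’s log.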

The inbound port is unbound briefly during reload (under 100 ms in normal drains, up to 30 s if old connections take their time). For true zero-downtime, run two replicas behind a load balancer.

In zero-config mode (no config file resolves), SIGHUP is logged and ignored.

What it does

Calm state: GroundShade behaves as a normal reverse proxy. No challenge code runs.

Defense state: when a route’s origin pain crosses thresholds (p95 latency or 5xx rate over a 30 s window with at least 50 samples), the route lifts:

Level	Who sees a challenge (without a trust token)
L1	UAs matching `l1_ua_patterns`; forged-browser clients (UA claims browser, JA4 disagrees); optionally write methods if `l1_suspicion_methods` is set. v0.7.1’s `always_challenge_forged_browser` flag promotes the forged-browser arm to fire at every level, including Open.
L2	L1 scope plus thin clients (no `Referer`, no `Accept-Language`)
L3	Everyone except the fast lane

Defaults for l1_ua_patterns: headless, bot, crawl, spider, python, curl, go-http, libwww. l1_suspicion_methods is empty by default; opt in only for write-only APIs that never see a real browser POST.
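
The level table can be read as a single scope check. The sketch below is an illustration, not the proxy’s code: in_challenge_scope is a hypothetical function, UA matching is assumed to be a case-insensitive substring test, and a "thin client" is assumed to be one missing both Referer and Accept-Language. It covers levels Open through L3 only, assumes the fast lane and trust-token checks already ran, and does not model always_challenge_forged_browser.

```python
DEFAULT_L1_UA_PATTERNS = (
    "headless", "bot", "crawl", "spider", "python", "curl", "go-http", "libwww",
)


def in_challenge_scope(level, headers, forged_browser=False, method="GET",
                       ua_patterns=DEFAULT_L1_UA_PATTERNS, suspicion_methods=()):
    """level: 0 Open, 1 L1, 2 L2, 3 L3. headers: dict with lower-case keys."""
    if level <= 0:
        return False                      # Open: no challenge code runs
    if level == 3:
        return True                       # L3: everyone except the fast lane
    ua = headers.get("user-agent", "").lower()
    l1_scope = (
        any(pattern in ua for pattern in ua_patterns)  # assumed substring match
        or forged_browser                 # UA claims browser, JA4 disagrees
        or method in suspicion_methods    # empty by default
    )
    if level == 1:
        return l1_scope
    # L2: L1 scope plus thin clients (assumed: both headers absent).
    thin = "referer" not in headers and "accept-language" not in headers
    return l1_scope or thin
```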

Independent of level, two behavioural signals can short-circuit:

Rate signal (hard threshold). Fires when the 60 s sliding window per (route, client prefix) or (route, JA4) crosses defense.rate_signals.hard_threshold. The prefix is an IPv4 /24 or an IPv6 /56. Default: 1,000 requests.
Trustless persistence. Fires when a prefix has been challenged defense.trustless_persistence.threshold times without ever solving. Default: 20. The flag sticks until a single solve clears it.

The rate signal’s soft threshold acts as an extra L1 arm on the firing request only. No permanent state change.

Per-route policy knobs (v0.7.1)

Two config fields let you pin route policy without operator action:

defense.escalation.min_level (default open): the lowest level the route may fall to on cooldown. Setting l1 keeps a sensitive route in active defense even when the detector reports calm. The detector can still escalate above the floor; it just won’t step below it. Useful for admin panels, payment endpoints, and known scrape targets. Unlike shields_up, which pins the route at the ShieldsUp level, min_level only sets a floor and leaves the detector in charge of the upper levels.
defense.scope.always_challenge_forged_browser (default false): when true, requests classified as ForgedBrowser (browser-shaped UA paired with a script-tool JA4) are placed in challenge scope at every level, including Open. It pairs with the v0.6 rate signal: the rate signal catches volume; this flag catches polite low-rate forgers that fly under the hard threshold. It is a no-op under Cloudflare (CF) orange-cloud, where every request arrives with CF’s JA4 and the classifier returns Browser for all.

Both can be set per route through the routes list, so the same proxy can serve a relaxed default route and a strict /admin/* route:

routes:
  - match:
      path: "/admin/*"
    defense:
      escalation:
        min_level: l1
      scope:
        always_challenge_forged_browser: true

Fast lane

The fast lane always bypasses challenges. First match wins, in order:

Client IP matches fastlane.allow_ips (CIDRs, not spoofable when trusted_proxies is set correctly).
Path matches a configured feed glob (/feed, /rss, *.atom, etc.).
User-Agent substring matches fastlane.allow_user_agents (spoofable; pair with allow_ips if the threat model demands).
Authorization: ApiKey id:secret matches a configured key.
UA claims a known crawler (Google, Bing, DuckDuckGo, Apple, optionally Yandex) and the IP passes reverse-DNS plus forward-DNS verification.

fastlane.crawlers.yandex is false by default. The others are on.
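
The reverse-DNS plus forward-DNS check on crawlers follows a well-known pattern: look up the PTR name for the IP, confirm it sits under a domain the crawler operator publishes, then resolve that name forward and confirm it maps back to the same IP. The sketch below shows the pattern only; verify_crawler and the suffix list passed to it are hypothetical, and GroundShade’s own domain lists, caching, and IPv6 handling are not shown here (socket.gethostbyname_ex is IPv4-only).

```python
import socket


def verify_crawler(ip, suffixes,
                   reverse=socket.gethostbyaddr,
                   forward=socket.gethostbyname_ex):
    """Return True when `ip` passes reverse-DNS plus forward-DNS verification.

    `suffixes` is a caller-supplied tuple of allowed domain suffixes
    (illustrative; not GroundShade's real list).
    """
    try:
        hostname = reverse(ip)[0].lower().rstrip(".")   # PTR lookup
        if not hostname.endswith(tuple(suffixes)):
            return False                                # wrong domain
        addresses = forward(hostname)[2]                # forward A lookup
    except OSError:                                     # herror/gaierror
        return False
    return ip in addresses                              # must round-trip
```

The forward step matters because anyone who controls reverse DNS for their own address space can publish a PTR record under any name; only the crawler operator controls what that name resolves to.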

No-JS passage challenge (opt-in, v0.7.2)

The default HTML interstitial needs JavaScript to solve the proof-of-work. Clients with JS off (Tor Browser on Safer/Safest, NoScript users, text browsers, some accessibility setups) hit the interstitial and have no way through. The passage challenge gives them a path that does not need JavaScript.

It is a friction layer, not a bot detector. It raises cost on no-JS clients (a one-click form, a server-enforced wait, single-use tokens, IP-prefix + JA4 binding, a shorter trust TTL) but a patient client can still pass. The JS proof-of-work stays the stronger proof; do not treat the passage as a replacement.

Enable it per route with challenge.mode:

js (default): JS PoW only; no-JS clients dead-end. No change.
auto: JS clients solve the PoW as before; no-JS clients get the passage form in the <noscript> branch of the same page.
nojs: the passage form is the whole page, no PoW. For routes you know are no-JS (for example an onion address).

routes:
  - match:
      path: "/onion/**"
    challenge:
      # A per-route challenge block REPLACES the global one wholesale, so
      # copy any pow/interstitial settings you rely on into it too.
      mode: auto
      nojs:
        delay_secs: 5
        trust_ttl_secs: 600

The flow: a challenged no-JS client clicks Continue (a form with an invisible honeypot), waits out delay_secs while a CSS-only progress bar fills and the page auto-reloads, then is issued a gs_trust cookie and sent back to where it was (query string preserved). The wait is enforced server-side; there is no JS to gate the button. Tuning lives under challenge.nojs: delay_secs, redeem_window_secs, trust_ttl_secs, max_issued_per_prefix, issue_window_secs. Because the passage is a weaker proof, nojs.trust_ttl_secs must be <= trust.token_ttl_secs (validated at startup).
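
The server-enforced wait and single-use rule can be pictured as a small state check on each reload. This is a loose sketch, not the wire format in SPEC.md: redeem is a hypothetical function, the 60 s redeem window is a made-up default, and it assumes the redeem window starts when the wait ends. Signature, binding, honeypot, and per-prefix cap checks are left out.

```python
def redeem(token_id, issued_at, now, used, delay_secs=5, redeem_window_secs=60):
    """Classify one passage reload as 'replay', 'wait', 'expired', or 'ok'.

    `used` is a set of already-redeemed token ids (single-use enforcement).
    Assumption: the redeem window opens at issued_at + delay_secs.
    """
    if token_id in used:
        return "replay"                       # token already spent
    matures_at = issued_at + delay_secs
    if now < matures_at:
        return "wait"                         # serve the progress page again
    if now > matures_at + redeem_window_secs:
        return "expired"                      # client took too long
    used.add(token_id)                        # spend the token
    return "ok"                               # issue gs_trust and redirect
```

Because the timestamps are compared on the server, a client that skips the progress page and reloads early simply lands in the wait state again.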

Watch groundshade_challenges_issued_total{kind="nojs"}, groundshade_challenges_solved_total{kind="nojs"}, groundshade_challenge_failed_total{kind="nojs",reason=...}, and groundshade_passage_wait_total to see passage traffic and failures.

A note for Tor: behind one exit node many clients share an IP prefix and Tor Browser’s uniform JA4, so the binding is weak there; the single-use token and per-prefix issuance cap carry the anti-abuse weight.

Operator workflows

Engage shields under attack

curl -X POST http://127.0.0.1:9090/admin/shields \
  -H "Authorization: Bearer $GROUNDSHADE_ADMIN_TOKEN" \
  -H 'Content-Type: application/json' \
  -d '{"level":"up"}'

Add "route": "*|/api/*" to target one route. Disengage with {"level":"down"}.

Mint an API key for a partner

secret=$(openssl rand -hex 32)
echo "secret (give to partner): $secret"
cat <<EOF
fastlane:
  api_keys:
    - id: partner-acme
      secret_hash: "$(printf '%s' "$secret" | sha256sum | awk '{print $1}')"
      label: "Acme webhook receiver"
EOF

SIGHUP to reload. The partner sends Authorization: ApiKey partner-acme:<secret> on every request.

Shortcut: make api-key.
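
For tooling that mints keys outside a shell, the same steps translate directly to Python. The sketch mirrors the shell recipe above (32 random bytes as hex, SHA-256 over the hex string with no trailing newline); mint_api_key is a hypothetical helper, not something GroundShade ships.

```python
import hashlib
import secrets


def mint_api_key(key_id):
    """Return (secret, secret_hash, header) for a new fast-lane API key."""
    secret = secrets.token_hex(32)   # 64 hex chars, like `openssl rand -hex 32`
    # Hash the hex string itself, matching `printf '%s' "$secret" | sha256sum`.
    secret_hash = hashlib.sha256(secret.encode("ascii")).hexdigest()
    header = f"Authorization: ApiKey {key_id}:{secret}"
    return secret, secret_hash, header
```

Only secret_hash goes into fastlane.api_keys; the secret itself is handed to the partner and never stored in the config.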

Let a non-browser client through

Two paths:

Issue an API key (above). Use this for known clients.

Have the client solve the JSON challenge. Reference solvers ship at examples/solvers/:

token=$(./examples/solvers/solve.py https://your.site/api/endpoint)
curl -H "Authorization: ChallengeSolution $token" https://your.site/api/endpoint

Allowlist a monitor

For monitors that can’t send a custom auth header (UptimeRobot free tier, Uptime-Kuma, PageSpeed), use the fast-lane allowlists. Two independent layers; either match bypasses:

fastlane:
  allow_ips:
    - "63.143.42.240/28"      # UptimeRobot range 1
    - "69.162.124.224/28"     # UptimeRobot range 2
  allow_user_agents:
    - "UptimeRobot"
    - "Uptime-Kuma"
    - "PageSpeed"

Security note. UA substring match is forgeable. Treat it as ergonomics, not security. Pair allow_user_agents with allow_ips when the threat model demands it; an attacker then has to defeat both.

The dashboard’s fast-lane section shows hit counts per reason. If a monitor’s counter never moves, either the IP range is wrong or your fronting proxy is stripping the real client IP before it reaches GroundShade.

Read the dashboard

Open http://127.0.0.1:9090/ in a browser. With an admin token configured, you land on /admin/login; submitting the token sets an HttpOnly gs_admin cookie and redirects to the dashboard.

The page polls /admin/status, /admin/routes, and /metrics once per second and renders:

A hero showing the worst-route level and a one-line summary of connections, drop rate, and uptime.
Per-route cards with a 32 s sparkline of request rate, the shields toggle, and per-route signal tracked-key counts.
A traffic proportion bar (forwarded vs challenged) and three challenge funnels with drop/pass rates: browser (JS PoW), API (JSON), and the opt-in no-JS passage (issued, solved, wait, fail).
A FAST LANE row with per-reason counters.
A CLIENTS row classifying traffic by UA + JA4 family (browser, script, forged, bot, unknown).
A SIGNALS row with JA4 state, rate soft/hard hits, trustless hits, and tracked-key counts.
A CONNECTIONS row with active connections, refusals, and the self-throttle flag.

Point Prometheus at http://127.0.0.1:9090/metrics with the same Authorization: Bearer <admin-token> header.
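
For a quick check without Prometheus, a script can fetch the same endpoint and read the route levels. This is a minimal sketch: fetch_metrics and route_levels are hypothetical helpers, and the parser assumes plain samples with no timestamp and no escaped quotes inside the route label.

```python
import urllib.request


def fetch_metrics(token, url="http://127.0.0.1:9090/metrics"):
    """Fetch the Prometheus exposition text using the admin bearer token."""
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(req, timeout=5) as resp:
        return resp.read().decode("utf-8")


def route_levels(exposition):
    """Map each route label to its groundshade_route_level value."""
    levels = {}
    for line in exposition.splitlines():
        if not line.startswith("groundshade_route_level{"):
            continue                                  # skip comments, other metrics
        labels, value = line.rsplit(" ", 1)           # value is the last field
        route = labels.split('route="', 1)[1].split('"', 1)[0]
        levels[route] = int(float(value))             # 0 Open .. 4 ShieldsUp
    return levels
```

Calling route_levels(fetch_metrics(token)) yields a dict keyed by route, which is enough for a cron alert on any route sitting above Open.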

Admin endpoints

Method	Path	Purpose
`GET`	`/admin/status`	Version + counts
`GET`	`/admin/routes`	Per-route defense snapshot
`POST`	`/admin/shields`	Engage or disengage shields
`GET`	`/admin/login`	Login form
`POST`	`/admin/login`	Submit token, get cookie
`POST`	`/admin/logout`	Drop the cookie
`GET`	`/metrics`	Prometheus exposition

Metrics reference

Metric	Labels	Type	Meaning
`groundshade_requests_total`	`route`, `decision`, `level`	counter	Requests processed by decision (`forward`, `challenge_html`, `challenge_json`, `reject`) and route level
`groundshade_route_level`	`route`	gauge	Effective level: `0` Open, `1` L1, `2` L2, `3` L3, `4` ShieldsUp
`groundshade_challenges_issued_total`	`route`, `kind`	counter	Challenges minted; `kind` is `html`, `json`, or `nojs` (passage)
`groundshade_challenges_solved_total`	`route`, `kind`	counter	Successful solves; same `kind` set
`groundshade_challenge_failed_total`	`route`, `kind`, `reason`	counter	No-JS passage redemptions that failed (`reason`: `bad_signature`, `expired`, `replay`, `binding`, `bad_path`, `honeypot`, `cap`, `unauthorized`)
`groundshade_passage_wait_total`	`route`	counter	No-JS passage reloads served before maturity (the wait state)
`groundshade_tokens_issued_total`	`route`	counter	Trust tokens minted
`groundshade_fastlane_total`	`route`, `reason`	counter	Fast-lane hits by reason (`apikey`, `crawler`, `feed`, `ip_allowlist`, `ua_allowlist`)
`groundshade_connections_active`	(none)	gauge	Inbound TCP connections held
`groundshade_connections_rejected_total`	`reason`	counter	Connections refused at the accept layer
`groundshade_self_throttle`	(none)	gauge	`1` while the proxy is in self-throttle
`groundshade_client_family_total`	`family`	counter	Coarse classification: `browser`, `script`, `forged_browser`, `bot`, `unknown`
`groundshade_signals_immediate_total`	`route`, `reason`	counter	Challenged on sight by a signal (`rate_hard`, `trustless`)
`groundshade_signals_soft_noisy_total`	`route`	counter	Requests where the rate soft threshold fired (L1 scope arm)
`groundshade_signals_rate_tracked_keys`	`route`, `key`	gauge	Distinct keys tracked by the rate signal (`key` is `ip_prefix` or `ja4`)
`groundshade_signals_trustless_tracked_keys`	`route`	gauge	Distinct prefixes tracked by trustless persistence
`groundshade_ja4_detected`	(none)	gauge	`1` once at least one request has carried JA4

Behavioural signals

Both signals run after the fast lane on every non-bypass request.

Rate signal. Per route, a 60 s sliding window keyed in parallel by client prefix (IPv4 /24 or IPv6 /56) and JA4. Crossing soft_threshold (200 req/min) widens the L1 scope check by one arm on that request. Crossing hard_threshold (1,000 req/min) issues a challenge regardless of level.
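
As a rough model of the rate signal, the sketch below keeps a timestamp log per (route, key) and reports the worse of the two arms. It is an illustration under assumptions: RateSignal is a hypothetical class, "crossing" is taken to mean strictly exceeding the threshold, and a per-request timestamp deque stands in for whatever compact window structure the proxy really uses.

```python
from collections import defaultdict, deque


class RateSignal:
    def __init__(self, soft=200, hard=1000, window_secs=60):
        self.soft, self.hard, self.window = soft, hard, window_secs
        self.hits = defaultdict(deque)   # (route, kind, key) -> timestamps

    def observe(self, route, prefix, ja4, now):
        """Record one request; return 'hard', 'soft', or None."""
        worst = 0
        # The two arms are tracked in parallel; a missing JA4 skips that arm.
        for kind, key in (("ip_prefix", prefix), ("ja4", ja4)):
            if key is None:
                continue
            log = self.hits[(route, kind, key)]
            log.append(now)
            while log and log[0] <= now - self.window:
                log.popleft()            # drop hits older than the window
            worst = max(worst, len(log))
        if worst > self.hard:
            return "hard"                # challenge regardless of level
        if worst > self.soft:
            return "soft"                # extra L1 arm on this request only
        return None
```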

Trustless persistence. Per route, a per-prefix counter of challenges issued without ever solving. Past 20 (default), the prefix is challenged on sight. One solve clears it. The “ever earned trust” bit is sticky for the entry’s lifetime.
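
One way to read the trustless-persistence rules is as a counter plus a sticky flag. The sketch below is an interpretation, not the proxy’s implementation: TrustlessPersistence is a hypothetical class, "past 20" is taken as strictly greater than the threshold, and the sticky “ever earned trust” bit is modelled as exempting the prefix for as long as its entry exists. Entry eviction is not modelled.

```python
class TrustlessPersistence:
    def __init__(self, threshold=20):
        self.threshold = threshold
        self.unsolved = {}    # (route, prefix) -> challenges without a solve
        self.earned = set()   # (route, prefix) that solved at least once

    def on_challenge(self, route, prefix):
        key = (route, prefix)
        if key not in self.earned:               # sticky bit stops counting
            self.unsolved[key] = self.unsolved.get(key, 0) + 1

    def on_solve(self, route, prefix):
        key = (route, prefix)
        self.earned.add(key)                     # "ever earned trust"
        self.unsolved.pop(key, None)             # one solve clears the count

    def challenge_on_sight(self, route, prefix):
        key = (route, prefix)
        return key not in self.earned and self.unsolved.get(key, 0) > self.threshold
```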

Note on prefixes: trust-token IP binding uses v4 /24 and v6 /48 by default. The rate and trustless signals use v4 /24 and v6 /56. The v6 numbers differ on purpose; signals use RIPE’s recommended customer-prefix size.
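
To see what those prefix sizes mean for a concrete address, Python’s ipaddress module can compute them. client_prefix is a hypothetical helper; the defaults follow the signal sizes above (v4 /24, v6 /56), and passing v6_bits=48 gives the trust-token binding size instead.

```python
import ipaddress


def client_prefix(ip, v4_bits=24, v6_bits=56):
    """Return the network prefix an address falls into, as a CIDR string."""
    addr = ipaddress.ip_address(ip)
    bits = v4_bits if addr.version == 4 else v6_bits
    # strict=False masks off the host bits instead of raising an error.
    return str(ipaddress.ip_network(f"{ip}/{bits}", strict=False))
```

For example, client_prefix("2001:db8:1234:5678::1") falls in 2001:db8:1234:5600::/56 for the signals, while the same address with v6_bits=48 falls in 2001:db8:1234::/48 for token binding.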

SEO safety invariant

Both signals are consulted only after the fast lane. Verified crawlers, operator allowlists, feed paths, and API keys never reach the signal evaluator, so a misconfigured threshold cannot accidentally challenge a search-engine crawler. Tests in crates/groundshade-proxy/tests/e2e_signals_seo.rs lock the invariant.

JA4 availability

The proxy auto-detects whether your fronting proxy is forwarding X-JA4. After 100 requests (or 60 s), if no JA4 has arrived, it logs a single WARN and the per-JA4 arm goes silent. The per-IP/24 arm and trustless persistence keep working. The dashboard’s signals row shows the current state.
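
The detection rule can be sketched as a three-state tracker. This is an illustration only: Ja4Detector and its state names are hypothetical, and it assumes that a JA4 arriving after the warning still flips the state to detected, in line with the groundshade_ja4_detected gauge description.

```python
import logging


class Ja4Detector:
    def __init__(self, max_requests=100, max_secs=60, started_at=0.0):
        self.max_requests, self.max_secs = max_requests, max_secs
        self.started_at = started_at
        self.seen = 0
        self.state = "probing"            # 'probing' | 'detected' | 'absent'

    def observe(self, ja4, now):
        """Feed one request's X-JA4 value (or None); return the state."""
        self.seen += 1
        if ja4:
            self.state = "detected"       # per-JA4 arm is usable
        elif self.state == "probing" and (
            self.seen >= self.max_requests
            or now - self.started_at >= self.max_secs
        ):
            # Logged once: the state leaves 'probing' and never returns to it.
            logging.warning("no X-JA4 seen; per-JA4 rate arm is silent")
            self.state = "absent"
        return self.state
```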

Tuning

Symptom	Knob	Direction
Solves too slow on phones	`challenge.pow.leading_zero_bits`	Lower (16–17)
Bots find solving cheap	`challenge.pow.leading_zero_bits`	Raise (19–20)
False positives at L1	`defense.scope.l1_ua_patterns`	Trim
Headless setups pass too easily	`challenge.probe.min_score`	Raise from `0` to `5` or `10`
Origin still hot under heavy traffic	`defense.trigger.p95_latency_ms` / `err5xx_rate`	Lower
Proxy running out of FDs	`selfdef.max_connections_total`	Raise (after kernel ulimit)
`groundshade_connections_rejected_total{reason="per_ip"}` climbing behind a fronting proxy	`listen.trusted_proxies`	Add the proxy’s pinned `/32`
Sensitive route (admin, payment, scrape target) should stay in defense	`defense.escalation.min_level`	Set to `l1` (or higher) on that route
Forged-browser traffic walks through at Open level	`defense.scope.always_challenge_forged_browser`	Set `true` per route (no-op behind CF orange-cloud)
Real users tripping rate signal on rich pages	`defense.rate_signals.soft_threshold`	Raise
Polite scrapers slipping past	`defense.rate_signals.hard_threshold`	Lower (500–800)
Trustless flagging real users with stale cookies	`defense.trustless_persistence.threshold`	Raise (30–50)
Memory budget tight	`defense.rate_signals.max_keys_per_route`	Lower (20,000)

Logs

JSON to stdout. Default fields: method, host, path, status, duration, JA4, UA, and client_ip_hash. Client IPs are hashed with a daily-rotated salt. To log raw IPs, set observe.log_ip_hash: false and accept the GDPR responsibility.

Persistent state

Only one file:

state_dir/trust.key: 32 bytes, mode 0600. The HMAC signing key for trust tokens. Persists across restarts so outstanding cookies survive a redeploy.
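
A load-or-create routine for such a key file might look like the following. It is a sketch of the general pattern only: load_or_create_trust_key is a hypothetical function, and how GroundShade itself generates, encodes, or locks the file is not described on this page.

```python
import os


def load_or_create_trust_key(state_dir):
    """Return the 32-byte key in state_dir/trust.key, creating it if absent."""
    path = os.path.join(state_dir, "trust.key")
    try:
        with open(path, "rb") as f:
            key = f.read()
    except FileNotFoundError:
        key = os.urandom(32)                               # fresh random key
        # O_EXCL avoids clobbering a key another process just wrote;
        # 0o600 keeps the file readable by the owner only.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
        with os.fdopen(fd, "wb") as f:
            f.write(key)
    if len(key) != 32:
        raise ValueError(f"{path}: expected 32 bytes, got {len(key)}")
    return key
```

Under this pattern, deleting the file and restarting yields a new key, which is why rotation invalidates every outstanding cookie.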

No DB, no Redis, no on-disk logs unless you redirect stdout.

To rotate the key and invalidate every outstanding cookie, delete the file and restart. Or run shields-up with a fresh GROUNDSHADE_TRUST_SECRET.