operating groundshade
Daily operator guide. Configuration, defense levels, fast-lane allowlists, the admin dashboard, the metrics reference, and the tuning table.
This page covers running GroundShade in production. The wire formats and full design live in SPEC.md.
TL;DR
# Plain run with built-in defaults. Forwards to http://127.0.0.1:3000.
groundshade
# With a config file:
groundshade --config /etc/groundshade/config.yaml
The proxy listens on 0.0.0.0:8080. The admin port and dashboard
live on 127.0.0.1:9090.
Configuration
YAML, loaded in priority order:
- --config <path> flag
- GROUNDSHADE_CONFIG env var
- /etc/groundshade/config.yaml
- ./groundshade.yaml
- Built-in defaults (zero-config)

examples/config/full.yaml documents every option at its default. examples/config/minimal.yaml is the smallest useful starting point.
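The lookup order above can be sketched roughly as follows. This is an illustration, not GroundShade's code: `resolve_config_path` and its parameters are hypothetical names, and it assumes that an explicit flag or env var is used as given while the two default locations only count when the file exists.

```python
import os
from typing import Callable, Mapping, Optional

# Default search locations, in the order listed above.
DEFAULT_PATHS = ("/etc/groundshade/config.yaml", "./groundshade.yaml")


def resolve_config_path(
    cli_path: Optional[str],
    env: Mapping[str, str],
    exists: Callable[[str], bool] = os.path.exists,
) -> Optional[str]:
    """Return the config path to load, or None for built-in defaults.

    Assumption: an explicit --config or GROUNDSHADE_CONFIG value is used
    as given, while the default locations only count if they exist.
    """
    if cli_path:
        return cli_path
    env_path = env.get("GROUNDSHADE_CONFIG")
    if env_path:
        return env_path
    for candidate in DEFAULT_PATHS:
        if exists(candidate):
            return candidate
    return None  # zero-config mode
```

A `None` result corresponds to zero-config mode, where the proxy runs on built-in defaults.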
Environment overrides
These env vars override the loaded YAML at startup and on every
SIGHUP reload:
| Variable | Purpose |
|---|---|
| GROUNDSHADE_CONFIG | Config file path |
| GROUNDSHADE_LISTEN_HTTP | Inbound HTTP listener |
| GROUNDSHADE_LISTEN_ADMIN | Admin listener |
| GROUNDSHADE_ADMIN_TOKEN | Bearer token gating /admin/* and /metrics |
| GROUNDSHADE_UPSTREAM_URL | Default upstream URL |
| GROUNDSHADE_TRUSTED_PROXIES | Comma-separated CIDRs |
| GROUNDSHADE_TRUST_STATE_DIR | Trust-key state directory |
| GROUNDSHADE_TRUST_SECRET | HMAC signing key (hex) |
| GROUNDSHADE_LOG_FORMAT | json or text |
| GROUNDSHADE_LOG_IP_HASH | true hashes client IPs in logs |
The Docker image sets GROUNDSHADE_LISTEN_ADMIN=0.0.0.0:9090 and
GROUNDSHADE_TRUST_STATE_DIR=/var/lib/groundshade. Outside Docker,
the state directory defaults to $XDG_STATE_HOME/groundshade or
$HOME/.local/state/groundshade.
Hot reload
SIGHUP re-reads the config file and reapplies environment
overrides. The listeners drain gracefully and the proxy restarts on
the same PID with the same trust signing key, so outstanding
gs_trust cookies stay valid. Defense state, sliding windows,
connection counts, and metric counters all reset.
A validation failure logs a warning and keeps the old config running. Bad reloads never take the proxy down.
To catch a bad config before you reload, validate it without starting the server:
groundshade --check-config --config /etc/groundshade/config.yaml
# exit 0 + "config OK" if valid, exit 1 + the error if not
It applies the same env overrides as a real start, so it checks the
effective config. Run it as a container pre-flight
(docker exec groundshade groundshade --check-config) before the SIGHUP.
The inbound port is unbound briefly during reload (under 100 ms in normal drains, up to 30 s if old connections take their time). For true zero-downtime, run two replicas behind a load balancer.
In zero-config mode (no config file resolves), SIGHUP is logged and ignored.
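A validate-then-reload wrapper might look like the sketch below. It only uses the documented `--check-config` and `--config` flags and a plain SIGHUP; the function name, the `pid` argument, and the injectable `run`/`kill` hooks are hypothetical conveniences for illustration, and how you discover the proxy's PID is left to your process supervisor.

```python
import os
import signal
import subprocess
from typing import Callable


def validate_then_reload(
    pid: int,
    config_path: str,
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
    kill: Callable[[int, int], None] = os.kill,
) -> bool:
    """Run --check-config first; send SIGHUP only if it exits 0."""
    check = run(["groundshade", "--check-config", "--config", config_path])
    if check.returncode != 0:
        return False  # leave the running proxy untouched
    kill(pid, signal.SIGHUP)
    return True
```

Because the check applies the same env overrides as a real start, run it in the same environment as the proxy so both see the same effective config.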
What it does
Calm state: GroundShade behaves as a normal reverse proxy. No challenge code runs.
Defense state: when a route’s origin pain crosses thresholds (p95 latency or 5xx rate over a 30 s window with at least 50 samples), the route lifts:
| Level | Who sees a challenge (without a trust token) |
|---|---|
| L1 | UAs matching l1_ua_patterns; forged-browser clients (UA claims browser, JA4 disagrees); optionally write methods if l1_suspicion_methods is set. v0.7.1’s always_challenge_forged_browser flag promotes the forged-browser arm to fire at every level, including Open. |
| L2 | L1 scope plus thin clients (no Referer, no Accept-Language) |
| L3 | Everyone except the fast lane |
Defaults for l1_ua_patterns: headless, bot, crawl, spider,
python, curl, go-http, libwww. l1_suspicion_methods is
empty by default; opt in only for write-only APIs that never see a
real browser POST.
Independent of level, two behavioural signals can short-circuit:
- Rate signal (hard threshold). The 60 s sliding window per (route, IP /24) or (route, JA4) crossed defense.rate_signals.hard_threshold. Default: 1,000 requests.
- Trustless persistence. The /24 was challenged defense.trustless_persistence.threshold times without ever solving. Default: 20. Sticks until a single solve clears it.
The rate signal’s soft threshold acts as an extra L1 arm on the firing request only. No permanent state change.
Per-route policy knobs (v0.7.1)
Two config fields let you pin route policy without operator action:
- defense.escalation.min_level (default open): the level the route is allowed to fall to on cooldown. Setting l1 keeps a sensitive route in active defense even when the detector reports calm. The detector can still escalate above; it just won’t step below. Useful for admin panels, payment endpoints, and known scrape targets. The difference from shields_up is that the detector still drives the upper levels; shields_up pins all the way at shields_up.
- defense.scope.always_challenge_forged_browser (default false): when true, requests classified as ForgedBrowser (browser-shaped UA paired with a script-tool JA4) are placed in challenge scope at every level, including Open. Pairs with the v0.6 rate signal: rate catches volume; this flag catches polite low-rate forgers that fly under the hard threshold. No-op under CF orange-cloud, where every request arrives with CF’s JA4 and the classifier returns Browser for all.
Both are per-route overrides via the routes list, so the same proxy
can have a relaxed default route and a strict /admin/* route:
routes:
  - match:
      path: "/admin/*"
    defense:
      escalation:
        min_level: l1
      scope:
        always_challenge_forged_browser: true
Fast lane
The fast lane always bypasses challenges. First match wins, in order:
- Client IP matches fastlane.allow_ips (CIDRs, not spoofable when trusted_proxies is set correctly).
- Path matches a configured feed glob (/feed, /rss, *.atom, etc.).
- User-Agent substring matches fastlane.allow_user_agents (spoofable; pair with allow_ips if the threat model demands).
- Authorization: ApiKey id:secret matches a configured key.
- UA claims a known crawler (Google, Bing, DuckDuckGo, Apple, optionally Yandex) and the IP passes reverse-DNS plus forward-DNS verification.
fastlane.crawlers.yandex is false by default. The others are on.
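As a rough illustration of first-match-wins ordering, the sketch below evaluates only the first three checks (IP allowlist, feed glob, UA substring) and returns the reason label used by the fast-lane metric. It is not the proxy's implementation: `fastlane_reason` is a hypothetical name, the API-key and verified-crawler steps are omitted, and case-sensitive matching with shell-style globs is an assumption.

```python
import fnmatch
import ipaddress
from typing import Optional, Sequence


def fastlane_reason(
    client_ip: str,
    path: str,
    user_agent: str,
    allow_ips: Sequence[str],
    feed_globs: Sequence[str],
    allow_user_agents: Sequence[str],
) -> Optional[str]:
    """Return the first matching fast-lane reason, or None."""
    ip = ipaddress.ip_address(client_ip)
    # 1. IP allowlist: CIDR containment (mixed v4/v6 never matches).
    if any(ip in ipaddress.ip_network(cidr) for cidr in allow_ips):
        return "ip_allowlist"
    # 2. Feed paths: assumed shell-style, case-sensitive globs.
    if any(fnmatch.fnmatchcase(path, glob) for glob in feed_globs):
        return "feed"
    # 3. UA allowlist: plain substring match, hence spoofable.
    if any(token in user_agent for token in allow_user_agents):
        return "ua_allowlist"
    return None
```

Because the checks run in order, a monitor that matches both an allowlisted CIDR and an allowlisted UA is counted under `ip_allowlist`, which is worth remembering when reading the per-reason counters.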
No-JS passage challenge (opt-in, v0.7.2)
The default HTML interstitial needs JavaScript to solve the proof-of-work. Clients with JS off (Tor Browser on Safer/Safest, NoScript users, text browsers, some accessibility setups) hit the interstitial and have no way through. The passage challenge gives them a path that does not need JavaScript.
It is a friction layer, not a bot detector. It raises cost on no-JS clients (a one-click form, a server-enforced wait, single-use tokens, ip-prefix + JA4 binding, a shorter trust TTL) but a patient client can still pass. The JS proof-of-work stays the stronger proof; do not treat the passage as a replacement.
Enable it per route with challenge.mode:
- js (default): JS PoW only; no-JS clients dead-end. No change.
- auto: JS clients solve the PoW as before; no-JS clients get the passage form in the <noscript> branch of the same page.
- nojs: the passage form is the whole page, no PoW. For routes you know are no-JS (for example an onion address).
routes:
  - match:
      path: "/onion/**"
    challenge:
      # A per-route challenge block REPLACES the global one wholesale, so
      # copy any pow/interstitial settings you rely on into it too.
      mode: auto
      nojs:
        delay_secs: 5
        trust_ttl_secs: 600
The flow: a challenged no-JS client clicks Continue (a form with an
invisible honeypot), waits out delay_secs while a CSS-only progress
bar fills and the page auto-reloads, then is issued a gs_trust cookie
and sent back to where it was (query string preserved). The wait is
enforced server-side; there is no JS to gate the button. Tuning lives
under challenge.nojs: delay_secs, redeem_window_secs,
trust_ttl_secs, max_issued_per_prefix, issue_window_secs. Because
the passage is a weaker proof, nojs.trust_ttl_secs must be <= trust.token_ttl_secs (validated at startup).
Watch groundshade_challenges_issued_total{kind="nojs"},
groundshade_challenges_solved_total{kind="nojs"},
groundshade_challenge_failed_total{kind="nojs",reason=...}, and
groundshade_passage_wait_total to see passage traffic and failures.
A note for Tor: behind one exit node many clients share an IP prefix and Tor Browser’s uniform JA4, so the binding is weak there; the single-use token and per-prefix issuance cap carry the anti-abuse weight.
Operator workflows
Engage shields under attack
curl -X POST http://127.0.0.1:9090/admin/shields \
-H "Authorization: Bearer $GROUNDSHADE_ADMIN_TOKEN" \
-H 'Content-Type: application/json' \
-d '{"level":"up"}'
Add "route": "*|/api/*" to target one route. Disengage with
{"level":"down"}.
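For automation, the same call can be built from Python's standard library. The sketch below mirrors the curl command above and only constructs the request; the function name and the `admin_base` default are illustrative, and the response body is not described in this guide, so it is not parsed here.

```python
import json
import urllib.request
from typing import Optional


def build_shields_request(
    level: str,
    admin_token: str,
    route: Optional[str] = None,
    admin_base: str = "http://127.0.0.1:9090",
) -> urllib.request.Request:
    """Build (but do not send) the POST /admin/shields request."""
    body = {"level": level}
    if route is not None:
        body["route"] = route  # e.g. "*|/api/*" to target one route
    return urllib.request.Request(
        f"{admin_base}/admin/shields",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {admin_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Send it with `urllib.request.urlopen(build_shields_request("up", token), timeout=5)`; passing `"down"` disengages.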
Mint an API key for a partner
secret=$(openssl rand -hex 32)
echo "secret (give to partner): $secret"
cat <<EOF
fastlane:
  api_keys:
    - id: partner-acme
      secret_hash: "$(printf '%s' "$secret" | sha256sum | awk '{print $1}')"
      label: "Acme webhook receiver"
EOF
SIGHUP to reload. The partner sends
Authorization: ApiKey partner-acme:<secret> on every request.
Shortcut: make api-key.
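The same recipe in Python, for teams that script key issuance. `mint_api_key` reproduces the shell steps above (a 32-byte hex secret, then SHA-256 over the hex text with no trailing newline). `match_api_key` is only an assumed model of how an `ApiKey id:secret` header could be checked against `secret_hash`; the proxy's actual comparison is not documented here.

```python
import hashlib
import hmac
import secrets
from typing import Mapping, Optional, Tuple


def mint_api_key() -> Tuple[str, str]:
    """Return (secret, secret_hash), mirroring the shell recipe."""
    secret = secrets.token_hex(32)  # same shape as `openssl rand -hex 32`
    secret_hash = hashlib.sha256(secret.encode("ascii")).hexdigest()
    return secret, secret_hash


def match_api_key(authorization: str, keys: Mapping[str, str]) -> Optional[str]:
    """Return the key id if `ApiKey id:secret` matches a stored hash.

    `keys` maps id -> secret_hash. Assumed check, not the proxy's code.
    """
    scheme, _, credentials = authorization.partition(" ")
    if scheme != "ApiKey":
        return None
    key_id, sep, secret = credentials.partition(":")
    expected = keys.get(key_id)
    if not sep or expected is None:
        return None
    actual = hashlib.sha256(secret.encode("utf-8")).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    same = hmac.compare_digest(actual.encode("ascii"), expected.encode("utf-8"))
    return key_id if same else None
```

Only the hash goes into the config file; the plaintext secret is handed to the partner once and never stored on the proxy.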
Let a non-browser client through
Two paths:
- Issue an API key (above). Use this for known clients.
- Have the client solve the JSON challenge. Reference solvers ship at examples/solvers/:

    token=$(./examples/solvers/solve.py https://your.site/api/endpoint)
    curl -H "Authorization: ChallengeSolution $token" https://your.site/api/endpoint
Allowlist a monitor
For monitors that can’t send a custom auth header (UptimeRobot free tier, Uptime-Kuma, PageSpeed), use the fast-lane allowlists. Two independent layers; either match bypasses:
fastlane:
  allow_ips:
    - "63.143.42.240/28" # UptimeRobot range 1
    - "69.162.124.224/28" # UptimeRobot range 2
  allow_user_agents:
    - "UptimeRobot"
    - "Uptime-Kuma"
    - "PageSpeed"
Security note. UA substring match is forgeable. Treat it as ergonomics, not security. Pair allow_user_agents with allow_ips when the threat model demands it; an attacker then has to defeat both.
The dashboard’s fast-lane section shows hit counts per reason. If a monitor’s counter never moves, either the IP range is wrong or your fronting proxy is stripping the real client IP before it reaches GroundShade.
Read the dashboard
Open http://127.0.0.1:9090/ in a browser. With an admin token
configured, you land on /admin/login; submitting the token sets an
HttpOnly gs_admin cookie and redirects to the dashboard.
The page polls /admin/status, /admin/routes, and /metrics once
per second and renders:
- A hero showing the worst-route level and a one-line summary of connections, drop rate, and uptime.
- Per-route cards with a 32 s sparkline of request rate, the shields toggle, and per-route signal tracked-key counts.
- A traffic proportion bar (forwarded vs challenged) and three challenge funnels with drop/pass rates: browser (JS PoW), API (JSON), and the opt-in no-JS passage (issued, solved, wait, fail).
- A FAST LANE row with per-reason counters.
- A CLIENTS row classifying traffic by UA + JA4 family (browser, script, forged, bot, unknown).
- A SIGNALS row with JA4 state, rate soft/hard hits, trustless hits, and tracked-key counts.
- A CONNECTIONS row with active connections, refusals, and the self-throttle flag.
Point Prometheus at http://127.0.0.1:9090/metrics with the same
Authorization: Bearer <admin-token> header.
Admin endpoints
| Method | Path | Purpose |
|---|---|---|
| GET | /admin/status | Version + counts |
| GET | /admin/routes | Per-route defense snapshot |
| POST | /admin/shields | Engage or disengage shields |
| GET | /admin/login | Login form |
| POST | /admin/login | Submit token, get cookie |
| POST | /admin/logout | Drop the cookie |
| GET | /metrics | Prometheus exposition |
Metrics reference
| Metric | Labels | Type | Meaning |
|---|---|---|---|
| groundshade_requests_total | route, decision, level | counter | Requests processed by decision (forward, challenge_html, challenge_json, reject) and route level |
| groundshade_route_level | route | gauge | Effective level: 0 Open, 1 L1, 2 L2, 3 L3, 4 ShieldsUp |
| groundshade_challenges_issued_total | route, kind | counter | Challenges minted; kind is html, json, or nojs (passage) |
| groundshade_challenges_solved_total | route, kind | counter | Successful solves; same kind set |
| groundshade_challenge_failed_total | route, kind, reason | counter | No-JS passage redemptions that failed (reason: bad_signature, expired, replay, binding, bad_path, honeypot, cap, unauthorized) |
| groundshade_passage_wait_total | route | counter | No-JS passage reloads served before maturity (the wait state) |
| groundshade_tokens_issued_total | route | counter | Trust tokens minted |
| groundshade_fastlane_total | route, reason | counter | Fast-lane hits by reason (apikey, crawler, feed, ip_allowlist, ua_allowlist) |
| groundshade_connections_active | (none) | gauge | Inbound TCP connections held |
| groundshade_connections_rejected_total | reason | counter | Connections refused at the accept layer |
| groundshade_self_throttle | (none) | gauge | 1 while the proxy is in self-throttle |
| groundshade_client_family_total | family | counter | Coarse classification: browser, script, forged_browser, bot, unknown |
| groundshade_signals_immediate_total | route, reason | counter | Challenged on sight by a signal (rate_hard, trustless) |
| groundshade_signals_soft_noisy_total | route | counter | Requests where the rate soft threshold fired (L1 scope arm) |
| groundshade_signals_rate_tracked_keys | route, key | gauge | Distinct keys tracked by the rate signal (key is ip_prefix or ja4) |
| groundshade_signals_trustless_tracked_keys | route | gauge | Distinct prefixes tracked by trustless persistence |
| groundshade_ja4_detected | (none) | gauge | 1 once at least one request has carried JA4 |
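For quick scripting against /metrics without a Prometheus server, a small parser can turn the `groundshade_route_level` gauge back into level names. This is a sketch under assumptions: the helper name is hypothetical, the sample line format follows the standard Prometheus text exposition, and it assumes the gauge carries exactly the one `route` label listed in the table.

```python
import re
from typing import Dict

# Numeric gauge values and their names, from the table above.
LEVEL_NAMES = {0: "Open", 1: "L1", 2: "L2", 3: "L3", 4: "ShieldsUp"}
_ROUTE_LEVEL = re.compile(
    r'^groundshade_route_level\{route="(?P<route>[^"]*)"\}\s+(?P<value>\S+)'
)


def route_levels(exposition: str) -> Dict[str, str]:
    """Map each route label to its level name from /metrics text."""
    levels = {}
    for line in exposition.splitlines():
        match = _ROUTE_LEVEL.match(line)
        if match:
            value = int(float(match["value"]))
            levels[match["route"]] = LEVEL_NAMES.get(value, f"unknown({value})")
    return levels
```

Fetch the text with the same `Authorization: Bearer <admin-token>` header the dashboard and Prometheus use, then pass the body to `route_levels`.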
Behavioural signals
Both signals run after the fast lane on every non-bypass request.
Rate signal. Per route, a 60 s sliding window keyed in parallel
by client prefix (IPv4 /24 or IPv6 /56) and JA4. Crossing
soft_threshold (200 req/min) widens the L1 scope check by one arm
on that request. Crossing hard_threshold (1,000 req/min) issues a
challenge regardless of level.
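The sliding-window idea can be modelled in a few lines. This is a simplified sketch, not the proxy's data structure: `RateSignal` is a hypothetical class, it stores one timestamp per request rather than anything memory-bounded, and it assumes "crossing" means strictly more than the threshold.

```python
import time
from collections import defaultdict, deque
from typing import Deque, Dict, Hashable, Optional


class RateSignal:
    """Per-key sliding-window counter with soft and hard thresholds."""

    def __init__(self, window_secs: float = 60.0, soft: int = 200, hard: int = 1000):
        self.window_secs = window_secs
        self.soft = soft
        self.hard = hard
        self._hits: Dict[Hashable, Deque[float]] = defaultdict(deque)

    def observe(self, key: Hashable, now: Optional[float] = None) -> str:
        """Record one request; return "ok", "soft", or "hard"."""
        now = time.monotonic() if now is None else now
        hits = self._hits[key]
        hits.append(now)
        # Drop timestamps that have slid out of the window.
        while hits and hits[0] <= now - self.window_secs:
            hits.popleft()
        if len(hits) > self.hard:
            return "hard"
        if len(hits) > self.soft:
            return "soft"
        return "ok"
```

To mirror the parallel keying, call `observe` once with a (route, prefix) key and once with a (route, JA4) key, and act on the more severe of the two results.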
Trustless persistence. Per route, a per-prefix counter of challenges issued without ever solving. Past 20 (default), the prefix is challenged on sight. One solve clears it. The “ever earned trust” bit is sticky for the entry’s lifetime.
Note on prefixes: trust-token IP binding uses v4 /24 and v6 /48 by default. The rate and trustless signals use v4 /24 and v6 /56. The v6 numbers differ on purpose; signals use RIPE’s recommended customer-prefix size.
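To see which addresses share a signal key, the prefix can be computed with Python's `ipaddress` module. The helper name is hypothetical; the default widths are the signal prefixes quoted above, and passing `v6_bits=48` gives the trust-token binding width instead.

```python
import ipaddress


def signal_prefix(client_ip: str, v4_bits: int = 24, v6_bits: int = 56) -> str:
    """Collapse a client IP to the prefix used as a signal key."""
    ip = ipaddress.ip_address(client_ip)
    bits = v4_bits if ip.version == 4 else v6_bits
    # strict=False masks off the host bits instead of raising.
    return str(ipaddress.ip_network(f"{ip}/{bits}", strict=False))
```

For example, `signal_prefix("203.0.113.77")` yields `203.0.113.0/24`, so every client in that /24 counts against the same window.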
SEO safety invariant
Both signals are consulted only after the fast lane. Verified crawlers,
operator allowlists, feed paths, and API keys never reach the signal
evaluator. A misconfigured threshold cannot accidentally challenge a
search-engine crawler. Tests in
crates/groundshade-proxy/tests/e2e_signals_seo.rs
lock the invariant.
JA4 availability
The proxy auto-detects whether your fronting proxy is forwarding
X-JA4. After 100 requests (or 60 s), if no JA4 has arrived, it
logs a single WARN and the per-JA4 arm goes silent. The per-IP/24
arm and trustless persistence keep working. The dashboard’s signals
row shows the current state.
Tuning
| Symptom | Knob | Direction |
|---|---|---|
| Solves too slow on phones | challenge.pow.leading_zero_bits | Lower (16–17) |
| Bots find solving cheap | challenge.pow.leading_zero_bits | Raise (19–20) |
| False positives at L1 | defense.scope.l1_ua_patterns | Trim |
| Headless setups pass too easily | challenge.probe.min_score | Raise from 0 to 5 or 10 |
| Origin still hot under heavy traffic | defense.trigger.p95_latency_ms / err5xx_rate | Lower |
| Proxy running out of FDs | selfdef.max_connections_total | Raise (after kernel ulimit) |
| connections_rejected_total{reason="per_ip"} climbing behind a fronting proxy | listen.trusted_proxies | Add the proxy’s pinned /32 |
| Sensitive route (admin, payment, scrape target) should stay in defense | defense.escalation.min_level | Set to l1 (or higher) on that route |
| Forged-browser traffic walks through at Open level | defense.scope.always_challenge_forged_browser | Set true per route (no-op behind CF orange-cloud) |
| Real users tripping rate signal on rich pages | defense.rate_signals.soft_threshold | Raise |
| Polite scrapers slipping past | defense.rate_signals.hard_threshold | Lower (500–800) |
| Trustless flagging real users with stale cookies | defense.trustless_persistence.threshold | Raise (30–50) |
| Memory budget tight | defense.rate_signals.max_keys_per_route | Lower (20,000) |
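To get a feel for what one step of challenge.pow.leading_zero_bits costs a client, the sketch below brute-forces a generic hashcash-style puzzle. It is not GroundShade's wire format (that lives in SPEC.md): SHA-256, the `challenge + nonce` layout, and both function names are assumptions made for illustration.

```python
import hashlib
import itertools


def leading_zero_bits(digest: bytes) -> int:
    """Count zero bits at the front of a digest."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def solve(challenge: bytes, bits: int) -> int:
    """Brute-force a nonce; expected work is about 2**bits hashes."""
    for nonce in itertools.count():
        digest = hashlib.sha256(challenge + str(nonce).encode("ascii")).digest()
        if leading_zero_bits(digest) >= bits:
            return nonce
```

Each extra bit doubles the expected number of hashes, so moving from 16 to 20 bits multiplies client work by about 16. That is why the table suggests small steps in either direction.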
Logs
JSON to stdout. Default fields: method, host, path, status,
duration, JA4, UA, and client_ip_hash. Client IPs are hashed with
a daily-rotated salt. To log raw IPs, set observe.log_ip_hash: false and accept the GDPR responsibility.
Persistent state
Only one file:
- state_dir/trust.key: 32 bytes, mode 0600. The HMAC signing key for trust tokens. Persists across restarts so outstanding cookies survive a redeploy.
No DB, no Redis, no on-disk logs unless you redirect stdout.
To rotate the key and invalidate every outstanding cookie, delete
the file and restart. Or run shields-up with a fresh
GROUNDSHADE_TRUST_SECRET.