architecture
The mental model. Workspace shape, request lifecycle, in-memory state surfaces, concurrency model, and explicit non-goals.
This page is the mental model. Byte-level wire formats and the full design rationale live in SPEC.md.
Workspace
Three Cargo crates:
groundshade-core/       transport-agnostic brain (no hyper, no sockets)
  config       YAML schema, defaults, validation
  routing      host + path-glob matcher; compiles EffectivePolicy
  defense      origin-pain detector + 3-level state machine
  trust        HMAC-signed bearer tokens with binding + budget
  challenge    SHA-256 PoW, signed challenge tokens, interstitial HTML
  fastlane     verified crawlers (rDNS), API keys, feed paths
  fingerprint  JA4 parsing, UA classifier, client-family bucketing
  signals      rate signal + trustless persistence
  selfdef      ConnectionLimiter (per-IP + global caps + backpressure)
  observe      Prometheus metrics, IP hasher, webhook dispatcher
groundshade-proxy/      hyper-based binary
  proxy.rs     top-level lifecycle (start, shutdown, signals)
  server.rs    inbound + admin accept loops, graceful shutdown
  service.rs   per-request dispatch
  upstream.rs  hyper client, URI rewrite
  ws.rs        WebSocket / Upgrade pass-through
  hop.rs       hop-by-hop header stripping (RFC 9110 §7.6.1)
  body.rs      ProxyBody type alias
groundshade-dashboard/  inlined HTML + JS dashboard
Why the split
groundshade-core is transport-agnostic on purpose. Decisions take
a RequestFacts value (a borrowed view of the request) and return a
verdict. The v2 plugin work (Caddy module, nginx, HAProxy SPOA,
Envoy filter) wraps the same core. No rewrites.
groundshade-proxy owns everything that touches hyper or a socket:
accept loops, process lifecycle, graceful shutdown.
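A minimal sketch of that shape, under stated assumptions: the fields on RequestFacts, the Verdict variants, and the decide function are hypothetical names chosen for illustration, not the real definitions. The point is only that a borrowed view goes in and a plain value comes out, with no hyper or socket types in the signature.

```rust
/// Borrowed view of one request. Field names are illustrative only.
#[derive(Debug, Clone, Copy)]
pub struct RequestFacts<'a> {
    pub host: &'a str,
    pub path: &'a str,
    pub has_valid_trust: bool,
}

/// Hypothetical verdict: what the transport layer should do next.
#[derive(Debug, PartialEq, Eq)]
pub enum Verdict {
    Forward,
    Challenge,
}

/// Pure function of the facts: no I/O and no transport types, so a proxy
/// binary or a plugin shim could call it the same way.
pub fn decide(facts: &RequestFacts, under_pressure: bool) -> Verdict {
    if facts.has_valid_trust || !under_pressure {
        Verdict::Forward
    } else {
        Verdict::Challenge
    }
}
```

A front end would build the RequestFacts value from whatever request type it has, call decide, and translate the verdict back into its own response type.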
Request lifecycle
              ┌─────────────────────────────┐
              │  groundshade-proxy::server  │
              │         accept loop         │
              └──────────────┬──────────────┘
                             │ ConnectionLimiter::try_admit
                             │ (per-IP /24 + global cap)
                             ▼
              ┌─────────────────────────────┐
              │     hyper auto::Builder     │
              │     (h1/h2 negotiation +    │
              │      header_read_timeout)   │
              └──────────────┬──────────────┘
                             │ service_fn → handle()
                             ▼
┌──────────────────────────────────────────────────────────┐
│ groundshade-proxy::service::dispatch                     │
│                                                          │
│  1. /.well-known/groundshade/* ────▶ challenge/solve     │
│                                                          │
│  2. extract host + path                                  │
│  3. Router::resolve(host, path) → EffectivePolicy        │
│  4. compiled upstream lookup                             │
│  5. policy.bypass? → forward (no defense, no sample)     │
│                                                          │
│  6. FastLane::evaluate_sync                              │
│     (ip allowlist → feed glob → ua allowlist → apikey)   │
│     FastLane::evaluate_crawler (rDNS-verified bots)      │
│     ─▶ on match: forward, no sample, no signals          │
│                                                          │
│  7. ja4_availability.observe(ja4.is_some())              │
│  8. signals_evaluate(prefix, ja4, now_secs)              │
│     → SignalsVerdict { immediate, soft_noisy, rate,      │
│                        trustless }                       │
│                                                          │
│  9. trust cookie / Authorization: ChallengeSolution      │
│     valid? → signals_note_trust_earned, forward,         │
│              maybe renew                                 │
│                                                          │
│ 10. signals_verdict.immediate is Some?                   │
│     (rate_hard or trustless)                             │
│     → render interstitial OR JSON 401,                   │
│       signals_record_challenge                           │
│                                                          │
│ 11. level == Open? → forward + sample                    │
│                                                          │
│ 12. ChallengeSubject in scope for level                  │
│     OR signals_verdict.soft_noisy?                       │
│     no  → forward + sample                               │
│     yes → render interstitial OR JSON 401,               │
│           signals_record_challenge                       │
│                                                          │
└──────────────────────────────────────────────────────────┘
The challenge endpoints live at:
- GET /.well-known/groundshade/challenge issues a SHA-256 PoW offer.
- POST /.well-known/groundshade/solve redeems a solution for a gs_trust cookie.
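The ordering of steps 9 to 12 in the lifecycle diagram can be read as one if-else chain. The sketch below is an illustration under assumptions: Inputs, Outcome, late_steps, and the non-Open level names are hypothetical, and the real dispatch also records signals and renders responses, which is omitted here.

```rust
/// Defense level. Only `Open` is named in the diagram; the other two
/// variant names are placeholders for the remaining levels.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Level {
    Open,
    Guarded,
    Closed,
}

/// Hypothetical summary of what dispatch knows after step 8.
pub struct Inputs {
    pub trust_valid: bool,      // step 9: trust cookie or solution verified
    pub immediate: bool,        // step 10: signals_verdict.immediate is Some
    pub soft_noisy: bool,       // step 12: signals_verdict.soft_noisy
    pub level: Level,           // current level for the route
    pub subject_in_scope: bool, // step 12: subject is in scope for the level
}

#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    ForwardTrusted,   // step 9
    Challenge,        // steps 10 and 12 (yes)
    ForwardAndSample, // steps 11 and 12 (no)
}

/// Evaluates the checks in diagram order; the first match wins.
pub fn late_steps(i: &Inputs) -> Outcome {
    if i.trust_valid {
        Outcome::ForwardTrusted
    } else if i.immediate {
        Outcome::Challenge
    } else if i.level == Level::Open {
        Outcome::ForwardAndSample
    } else if i.subject_in_scope || i.soft_noisy {
        Outcome::Challenge
    } else {
        Outcome::ForwardAndSample
    }
}
```

Because step 11 is checked before step 12, a soft_noisy client is still forwarded while the route is Open; only an immediate verdict challenges at that level.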
State surfaces
In memory:
- Router. Small, immutable, built at startup. Cloned via Arc.
- DefenseRegistry. One entry per known route_id. Each holds a detector ring buffer (capped at 10,000 samples), a state machine, and a RouteSignals block.
- RouteSignals (per route). A RateState with two LruCaches (ClientPrefix → SlidingWindow and JA4 → SlidingWindow, default 50,000 keys each) plus a TrustlessState (ClientPrefix → ClientHistory, default 100,000 keys).
- Ja4Availability (global). Atomic counters for requests seen and requests with JA4, plus a one-shot warning flag.
- TrustIssuer. Stateless except for the signing key. Verification is pure compute.
- ChallengeIssuer. Stateless except for an LRU replay cache of solved (token, nonce) pairs.
- FastLane. Feed matcher (immutable), API-key table (immutable), IP allowlist (immutable, parsed once), UA allowlist (immutable), crawler verifier with a bounded LRU (24 h positive, 10 min negative).
- ConnectionLimiter. Atomic global counter, bounded per-IP LRU (cap 16,384 prefixes), and an Arc<IpAllowlist> for the trusted-peer cap bypass.
- Metrics. Prometheus registry with a fixed handful of families.
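To make the ClientPrefix → SlidingWindow surface concrete, here is a small sketch under assumptions. The window is modelled as a queue of timestamps, the prefix as three IPv4 octets, and the map as a plain HashMap with no eviction, whereas the list above describes a bounded LRU. The method names record and observe are hypothetical.

```rust
use std::collections::{HashMap, VecDeque};

/// Counts hits inside a trailing window of `window_secs` seconds.
pub struct SlidingWindow {
    window_secs: u64,
    hits: VecDeque<u64>,
}

impl SlidingWindow {
    pub fn new(window_secs: u64) -> Self {
        SlidingWindow { window_secs, hits: VecDeque::new() }
    }

    /// Records one hit at `now_secs` and returns the hits still in the window.
    pub fn record(&mut self, now_secs: u64) -> usize {
        self.hits.push_back(now_secs);
        // Evict timestamps that are `window_secs` or more behind `now_secs`.
        while let Some(&oldest) = self.hits.front() {
            if now_secs.saturating_sub(oldest) >= self.window_secs {
                self.hits.pop_front();
            } else {
                break;
            }
        }
        self.hits.len()
    }
}

/// Stand-in for the per-route rate state. Unbounded HashMap for brevity;
/// a real bounded cache would evict the least recently used prefix.
pub struct RateState {
    window_secs: u64,
    by_prefix: HashMap<[u8; 3], SlidingWindow>,
}

impl RateState {
    pub fn new(window_secs: u64) -> Self {
        RateState { window_secs, by_prefix: HashMap::new() }
    }

    /// Returns the current in-window hit count for `prefix`.
    pub fn observe(&mut self, prefix: [u8; 3], now_secs: u64) -> usize {
        let window = self.window_secs;
        self.by_prefix
            .entry(prefix)
            .or_insert_with(|| SlidingWindow::new(window))
            .record(now_secs)
    }
}
```

Each prefix keeps its own window, so one noisy prefix does not raise the count seen by another.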
On disk:
- state_dir/trust.key. 32 bytes, mode 0600. Created on first start.
That’s the entirety of GroundShade’s persistent state in v1.
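A load-or-create routine for such a key file could look like the sketch below. Only the size, the mode, and "created on first start" come from this page; the function name, the use of /dev/urandom as the randomness source, and the error handling are assumptions made to keep the example std-only and Unix-specific.

```rust
use std::fs::{self, File, OpenOptions};
use std::io::{self, Read, Write};
use std::os::unix::fs::OpenOptionsExt;
use std::path::Path;

const KEY_LEN: usize = 32;

/// Loads the key at `path`, or creates it with mode 0600 if it is absent.
/// Hypothetical helper: randomness comes from /dev/urandom (Unix only).
pub fn load_or_create_key(path: &Path) -> io::Result<[u8; KEY_LEN]> {
    match fs::read(path) {
        Ok(ref bytes) if bytes.len() == KEY_LEN => {
            let mut key = [0u8; KEY_LEN];
            key.copy_from_slice(bytes);
            Ok(key)
        }
        // Wrong length: refuse to guess rather than truncate or pad.
        Ok(_) => Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "key file is not 32 bytes",
        )),
        Err(ref e) if e.kind() == io::ErrorKind::NotFound => {
            let mut key = [0u8; KEY_LEN];
            File::open("/dev/urandom")?.read_exact(&mut key)?;
            // create_new fails if the file already exists, and the mode is
            // applied at creation so the key is never briefly world-readable.
            let mut file = OpenOptions::new()
                .write(true)
                .create_new(true)
                .mode(0o600)
                .open(path)?;
            file.write_all(&key)?;
            file.sync_all()?;
            Ok(key)
        }
        Err(e) => Err(e),
    }
}
```

Calling it twice with the same path returns the same 32 bytes, which is the property that lets tokens signed before a restart still verify afterwards.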
Concurrency
- Inbound. One tokio task per TCP connection (spawned by the accept loop). Hyper serves multiple requests per connection (h1 keep-alive, h2 multiplexing).
- Outbound. One shared hyper_util::client::legacy::Client pool per ProxyState. Upgrades (WebSocket, generic Upgrade) use their own short-lived hyper::client::conn::http1 connections; the pool can’t safely reuse them after a protocol switch.
- Background. One defense-tick task ticks once per second across all routes. One webhook-dispatcher task drains the event queue. One RSS-watermark monitor toggles backpressure on Linux.
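Before the accept loop spawns a task for a connection, the lifecycle diagram shows an admission check (ConnectionLimiter::try_admit, per-IP /24 plus a global cap). The sketch below shows one way such a check could work. It is not the real type: it handles IPv4 only, uses std::sync::Mutex and an unbounded HashMap instead of a bounded LRU, and the Permit guard is a hypothetical addition.

```rust
use std::collections::HashMap;
use std::net::Ipv4Addr;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;

pub struct ConnectionLimiter {
    global: AtomicUsize,
    global_cap: usize,
    per_prefix_cap: usize,
    per_prefix: Mutex<HashMap<[u8; 3], usize>>,
}

/// Held for the life of a connection; dropping it releases both counters.
pub struct Permit<'a> {
    limiter: &'a ConnectionLimiter,
    prefix: [u8; 3],
}

impl ConnectionLimiter {
    pub fn new(global_cap: usize, per_prefix_cap: usize) -> Self {
        ConnectionLimiter {
            global: AtomicUsize::new(0),
            global_cap,
            per_prefix_cap,
            per_prefix: Mutex::new(HashMap::new()),
        }
    }

    /// Returns a permit if both the global and the /24 cap have room.
    pub fn try_admit(&self, ip: Ipv4Addr) -> Option<Permit> {
        let o = ip.octets();
        let prefix = [o[0], o[1], o[2]];
        let cap = self.global_cap;
        // Reserve a global slot atomically; fail if the cap is reached.
        let reserved = self.global.fetch_update(Ordering::AcqRel, Ordering::Acquire, |n| {
            if n < cap { Some(n + 1) } else { None }
        });
        if reserved.is_err() {
            return None;
        }
        let mut map = self.per_prefix.lock().unwrap();
        let count = map.entry(prefix).or_insert(0);
        if *count >= self.per_prefix_cap {
            // Prefix is full: hand the global slot back.
            self.global.fetch_sub(1, Ordering::AcqRel);
            return None;
        }
        *count += 1;
        Some(Permit { limiter: self, prefix })
    }
}

impl<'a> Drop for Permit<'a> {
    fn drop(&mut self) {
        self.limiter.global.fetch_sub(1, Ordering::AcqRel);
        let mut map = self.limiter.per_prefix.lock().unwrap();
        let now_empty = match map.get_mut(&self.prefix) {
            Some(count) => {
                *count -= 1;
                *count == 0
            }
            None => false,
        };
        if now_empty {
            map.remove(&self.prefix);
        }
    }
}
```

Tying the release to Drop means a connection task that ends for any reason, including a panic that unwinds, gives its slot back without an explicit call.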
Every shared data structure with locking is documented at the lock
site. Most are parking_lot::Mutex (faster than std, doesn’t
poison).
Non-goals
- No global mutex. The router map is read-only after startup. The defense registry is per-route mutexed. Counters are atomics.
- No dynamic dispatch in the hot path. The decision tree is static enums and if-else chains.
- No unsafe. #![forbid(unsafe_code)] in groundshade-core and the proxy binary.
- No third-party network calls at startup or on the hot path beyond the system DNS resolver (for crawler verification) and configured webhook endpoints.
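The first item can be illustrated with a small sketch: a map that is built once and never mutated, holding one mutex per route, next to an atomic counter. All names here (Registry, RouteState, record, run_demo) are hypothetical, and std::sync::Mutex stands in for parking_lot::Mutex so the example needs no external crate.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, Mutex};
use std::thread;

/// Per-route mutable state; a Vec stands in for the detector ring buffer.
#[derive(Default)]
pub struct RouteState {
    pub samples: Vec<u32>,
}

pub struct Registry {
    // The map is filled at construction and only read afterwards, so
    // looking up a route takes no lock; each route carries its own mutex.
    routes: HashMap<String, Mutex<RouteState>>,
    requests_seen: AtomicU64,
}

impl Registry {
    pub fn new(route_ids: &[&str]) -> Self {
        let routes = route_ids
            .iter()
            .map(|id| (id.to_string(), Mutex::new(RouteState::default())))
            .collect();
        Registry { routes, requests_seen: AtomicU64::new(0) }
    }

    /// Records a sample for one route; returns false for an unknown route.
    pub fn record(&self, route_id: &str, sample: u32) -> bool {
        self.requests_seen.fetch_add(1, Ordering::Relaxed);
        match self.routes.get(route_id) {
            Some(slot) => {
                slot.lock().unwrap().samples.push(sample);
                true
            }
            None => false,
        }
    }

    pub fn sample_count(&self, route_id: &str) -> usize {
        self.routes
            .get(route_id)
            .map(|slot| slot.lock().unwrap().samples.len())
            .unwrap_or(0)
    }

    pub fn requests_seen(&self) -> u64 {
        self.requests_seen.load(Ordering::Relaxed)
    }
}

/// Two threads write to different routes and never contend on a shared lock.
pub fn run_demo() -> (usize, usize, u64) {
    let registry = Arc::new(Registry::new(&["api", "web"]));
    let handles: Vec<_> = ["api", "web"]
        .iter()
        .map(|&id| {
            let registry = Arc::clone(&registry);
            thread::spawn(move || {
                for i in 0..1000 {
                    registry.record(id, i);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    (
        registry.sample_count("api"),
        registry.sample_count("web"),
        registry.requests_seen(),
    )
}
```

Traffic on one route only ever waits on that route's lock, which is the property the per-route design is after.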