ADR-004 - Dashboard Architecture: Go + HTMX + TailwindCSS
Status: Accepted
Context
DroidFarm needs a web dashboard that:
- Shows all DevicePool resources with per-device state (Booting / Idle / Busy / Unhealthy)
- Lets operators click a device to open its WebRTC stream
- Allows creating TestSession resources via a form
- Displays TestSession status and artifact download links
- Provides live updates without requiring manual page refresh
The dashboard is an internal operations tool, not a consumer-facing product. It runs as a Kubernetes Deployment with read/write access to DroidFarm CRDs. It has no persistent state — all data comes from the Kubernetes API.
Decision
A Go HTTP server renders HTML with html/template. HTMX adds interactivity, and TailwindCSS (loaded from a CDN in development) provides styling.
Technology breakdown
| Layer | Choice |
|---|---|
| HTTP server | Go net/http + standard ServeMux |
| Templating | Go html/template, embedded FS |
| UI interactivity | HTMX v2 |
| Styling | TailwindCSS v3 (CDN in dev) |
| Live updates | Server-Sent Events (SSE) |
| Kubernetes client | k8s.io/client-go dynamic API |
| WebRTC streaming | Native browser, <iframe> or link to Cuttlefish relay URL |
Options Considered
Option A - Go + HTMX (chosen)
Server renders HTML using html/template. HTMX handles partial page updates (polling for device state, SSE for live events) via declarative HTML attributes. No build step, no JavaScript framework.
Pros:
- Zero frontend build tooling: no npm, webpack, or bundler. A single Docker image comes out of a multi-stage Go build.
- No JavaScript framework lock-in: the dashboard can be replaced or extended without migrating framework-specific state management.
- Consistent with the operator: Go is already the project language, so one skill set covers operator and dashboard.
- Small image: the compiled Go binary with embedded templates yields an image on the order of 10–15 MB.
- Security: html/template applies contextual auto-escaping, so CRD data rendered through templates is escaped by default, which greatly reduces XSS risk. The distroless/static:nonroot base image ships no shell or package manager.
- Operational simplicity: no Node.js runtime, and no CDN dependency in production (the TailwindCSS CLI can generate a static CSS bundle).
Cons:
- Richer interactivity is harder: complex UI interactions (drag-and-drop, real-time charts) require either more HTMX choreography or small <script> islands.
- Designer friction: Tailwind utility classes embedded in Go templates are less ergonomic than JSX.
- Server load for SSE: every open browser tab holds a goroutine and an HTTP connection. This is acceptable at DroidFarm scale (tens of concurrent operators, not thousands).
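The auto-escaping point listed under Pros can be illustrated with a minimal sketch. The `Device` type, its fields, and the inline template are hypothetical stand-ins for illustration; the real templates are loaded from an embedded FS and are not shown in this ADR.

```go
package main

import (
	"fmt"
	"html/template"
	"strings"
)

// Device is a hypothetical view model for one entry in the device grid.
type Device struct {
	Name  string
	State string
}

// deviceRow is a hypothetical fragment, parsed inline to keep the sketch
// self-contained (the dashboard itself embeds its template files).
var deviceRow = template.Must(template.New("row").Parse(
	`<li class="device">{{.Name}}: {{.State}}</li>`))

// renderRow executes the fragment and returns the resulting HTML.
func renderRow(d Device) (string, error) {
	var sb strings.Builder
	if err := deviceRow.Execute(&sb, d); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	// A device name carrying markup, as could arrive from CRD data.
	out, err := renderRow(Device{Name: `<script>alert(1)</script>`, State: "Idle"})
	if err != nil {
		panic(err)
	}
	// The angle brackets are emitted as &lt; and &gt;, not as live markup.
	fmt.Println(out)
}
```

Escaping only applies to values that pass through the template; wrapping a string in `template.HTML` opts out of it, so that type is best kept away from CRD-derived data.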
Option B - React (or Vue) SPA + REST API
Separate Node.js frontend, Go REST backend. Frontend compiled with Vite/Webpack.
Pros:
- Rich component model; better for complex interactive UIs.
- Type-safe (TypeScript) data contracts with OpenAPI codegen.
Cons:
- Doubles the technology stack: two build systems, two CI pipelines, two Dockerfiles, two sets of dependencies.
- Bundle management: React dependencies bloat the image and introduce frequent CVE churn unrelated to DroidFarm logic.
- Over-engineered for scope: the dashboard is a low-traffic internal tool, so a SPA's benefits (client-side routing, offline support) do not apply.
- Auth complexity: SPAs typically require CORS, JWT handling, and token storage considerations that a server-rendered app avoids.
Option C - Grafana as dashboard + Prometheus metrics
Use Grafana for observability, skip a bespoke UI.
Pros:
- Zero dashboard code to write.
- Grafana's Kubernetes plugin can display CRD state.
Cons:
- No action surface: Grafana is built for visualization, so creating TestSessions or opening WebRTC streams would require a custom panel plugin.
- Poor UX for device interaction: Grafana is not designed for device-farm-style workflows.
Grafana + Prometheus remains valuable for Phase 8 observability (boot time histograms, queue depth), but it does not replace the operational dashboard.
Architecture Decisions
No JavaScript framework
HTMX provides the interactivity required:
- hx-get + hx-trigger="every 5s" for device grid auto-refresh on pool detail pages.
- hx-get + hx-trigger="every 10s" for pools and sessions list pages.
- hx-post + hx-push-url for session creation without a full page reload.
- hx-ext="sse" + sse-connect="/api/v1/events" for live updates on all pages.
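As a rough sketch of the polling pattern in the first bullet, the server can return an HTML fragment that carries its own `hx-get` and `hx-trigger` attributes, so the grid keeps refreshing itself after every swap. The route `/pools/<name>/devices`, the `Device` type, and the `gridHandler` helper are assumptions made for illustration, not the dashboard's actual routes or types.

```go
package main

import (
	"fmt"
	"html/template"
	"net/http"
	"strings"
)

// Device is a hypothetical view model for one device in a pool.
type Device struct {
	Name  string
	State string
}

// grid is a hypothetical fragment. Because it replaces itself (outerHTML),
// the polling attributes must be present on every response.
var grid = template.Must(template.New("grid").Parse(
	`<div id="device-grid" hx-get="/pools/{{.Pool}}/devices" hx-trigger="every 5s" hx-swap="outerHTML">` +
		`{{range .Devices}}<span class="device">{{.Name}}: {{.State}}</span>{{end}}</div>`))

// renderGrid renders the self-refreshing device grid for one pool.
func renderGrid(pool string, devices []Device) (string, error) {
	data := struct {
		Pool    string
		Devices []Device
	}{pool, devices}
	var sb strings.Builder
	if err := grid.Execute(&sb, data); err != nil {
		return "", err
	}
	return sb.String(), nil
}

// gridHandler serves the fragment; list is a placeholder for whatever
// reads current device state from the Kubernetes API.
func gridHandler(pool string, list func() []Device) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		html, err := renderGrid(pool, list())
		if err != nil {
			http.Error(w, "render failed", http.StatusInternalServerError)
			return
		}
		w.Header().Set("Content-Type", "text/html; charset=utf-8")
		fmt.Fprint(w, html)
	}
}

func main() {
	html, err := renderGrid("pixel-8", []Device{{Name: "cvd-1", State: "Idle"}})
	if err != nil {
		panic(err)
	}
	fmt.Println(html)
}
```

Returning the full fragment on each poll keeps the handler stateless: the browser needs no client-side logic beyond HTMX itself.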
The WebRTC stream is loaded in an <iframe> pointing to the Cuttlefish WebRTC relay URL stored in DeviceStatus.streamURL. The browser handles all WebRTC negotiation natively; the dashboard does not mediate the media stream.
No database
All state lives in Kubernetes. The dashboard reads DevicePool and TestSession custom resources via the Kubernetes API (dynamic client). This means:
- No schema migrations.
- No database credentials to manage.
- Kubernetes RBAC governs access control.
- The dashboard can be restarted or scaled without data loss.
Server-Sent Events over WebSocket for live updates
SSE is simpler than WebSocket for one-directional server-to-client pushes:
- HTTP/1.1 compatible; no protocol upgrade needed.
- Automatic reconnection is handled by the browser's EventSource API.
- Proxies well through most ingress controllers once response buffering is disabled; for nginx-based ingress, setting the X-Accel-Buffering: no response header does this.
- One goroutine per subscriber; at DroidFarm scale this is negligible.
A single Broker struct fans out Kubernetes watch events to all connected browsers. The broker is started as a background goroutine in main.go and shared via the Handler struct.
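A minimal sketch of such a fan-out broker follows. It is an illustration under assumptions, not the dashboard's actual code: this version guards the subscriber set with a mutex instead of running its own goroutine loop, the event name `device` and the message payloads are made up, and the Kubernetes watch that would call `Publish` is omitted.

```go
package main

import (
	"fmt"
	"net/http"
	"sync"
)

// Broker fans out messages to every subscribed channel.
type Broker struct {
	mu   sync.Mutex
	subs map[chan string]struct{}
}

func NewBroker() *Broker {
	return &Broker{subs: make(map[chan string]struct{})}
}

// Subscribe registers a buffered channel and returns it with a cancel func.
func (b *Broker) Subscribe() (<-chan string, func()) {
	ch := make(chan string, 16)
	b.mu.Lock()
	b.subs[ch] = struct{}{}
	b.mu.Unlock()
	cancel := func() {
		b.mu.Lock()
		delete(b.subs, ch)
		b.mu.Unlock()
	}
	return ch, cancel
}

// Publish delivers msg to all subscribers. A subscriber whose buffer is
// full misses the message rather than blocking the publisher.
func (b *Broker) Publish(msg string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	for ch := range b.subs {
		select {
		case ch <- msg:
		default:
		}
	}
}

// ServeHTTP streams events in SSE wire format until the client disconnects.
// msg is assumed to be a single line; multi-line payloads would need one
// "data:" line per line of content.
func (b *Broker) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	flusher, ok := w.(http.Flusher)
	if !ok {
		http.Error(w, "streaming unsupported", http.StatusInternalServerError)
		return
	}
	w.Header().Set("Content-Type", "text/event-stream")
	w.Header().Set("Cache-Control", "no-cache")
	w.Header().Set("X-Accel-Buffering", "no")
	events, cancel := b.Subscribe()
	defer cancel()
	for {
		select {
		case <-r.Context().Done():
			return
		case msg := <-events:
			fmt.Fprintf(w, "event: device\ndata: %s\n\n", msg)
			flusher.Flush()
		}
	}
}

func main() {
	b := NewBroker()
	first, cancelFirst := b.Subscribe()
	second, cancelSecond := b.Subscribe()
	defer cancelFirst()
	defer cancelSecond()
	b.Publish("pixel-8/cvd-1 Idle")
	fmt.Println(<-first, "|", <-second)
}
```

Dropping messages for slow subscribers is a deliberate trade-off in this sketch: since pages also poll for state, a missed event is corrected on the next refresh, and one stalled tab cannot hold up the others.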
Kubernetes client: dynamic client, not code-generated
The dashboard uses k8s.io/client-go/dynamic rather than code-generated typed clients. This avoids importing the operator module (which would create a Go module dependency cycle). The trade-off is losing compile-time type safety for CRD fields, mitigated by:
- A thin k8s.Client wrapper that converts unstructured objects to typed Go structs.
- unstructured.NestedString / unstructured.NestedInt64 helpers with explicit field paths.
- CRD schema documentation in operator/api/v1alpha1/.
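The conversion step can be sketched without the Kubernetes libraries. The `nestedString` function below is a stdlib-only stand-in that mirrors the shape of `unstructured.NestedString` (value, found flag, error); it is not the real helper. The `Device` struct, the `name` and `state` field paths, and the rule that `streamURL` is optional are assumptions for illustration.

```go
package main

import "fmt"

// nestedString walks obj along fields and returns the string found there.
// Stand-in for unstructured.NestedString, written against plain maps.
func nestedString(obj map[string]interface{}, fields ...string) (string, bool, error) {
	var cur interface{} = obj
	for i, f := range fields {
		m, ok := cur.(map[string]interface{})
		if !ok {
			return "", false, fmt.Errorf("%v is not a map", fields[:i])
		}
		cur, ok = m[f]
		if !ok {
			return "", false, nil // path does not exist
		}
	}
	s, ok := cur.(string)
	if !ok {
		return "", false, fmt.Errorf("%v is not a string", fields)
	}
	return s, true, nil
}

// Device is a hypothetical typed view of one device status entry.
type Device struct {
	Name      string
	State     string
	StreamURL string
}

// parseDevice converts one untyped entry into a Device. name and state are
// treated as required; streamURL is treated as optional (an assumption).
func parseDevice(entry map[string]interface{}) (Device, error) {
	var d Device
	var found bool
	var err error
	if d.Name, found, err = nestedString(entry, "name"); err != nil || !found {
		return Device{}, fmt.Errorf("device name: found=%v err=%v", found, err)
	}
	if d.State, found, err = nestedString(entry, "state"); err != nil || !found {
		return Device{}, fmt.Errorf("device state: found=%v err=%v", found, err)
	}
	if d.StreamURL, _, err = nestedString(entry, "streamURL"); err != nil {
		return Device{}, fmt.Errorf("device streamURL: %v", err)
	}
	return d, nil
}

func main() {
	d, err := parseDevice(map[string]interface{}{
		"name":      "cvd-1",
		"state":     "Idle",
		"streamURL": "https://relay.example/cvd-1",
	})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", d)
}
```

Keeping every field path in one conversion function means a CRD schema change surfaces as a single failing parse rather than as scattered map lookups across handlers.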
Separate Go module
dashboard/go.mod declares module github.com/christopherime/droidfarm/dashboard. This is a deliberate choice: the dashboard and operator have overlapping but not identical dependency graphs. A separate module keeps go.sum lean for each binary and avoids pulling operator-only dependencies (controller-runtime's webhook server, leader election, etc.) into the dashboard image.
Consequences
Positive:
- The dashboard Dockerfile produces a ~15 MB image (Go binary + embedded templates + distroless base), versus ~300–500 MB for a Node.js + React SPA image.
- No npm audit or node_modules CVE noise in CI.
- The same developers who maintain the operator can own the dashboard.
- Adding a new CRD field to the UI is a small, localized change (the template plus its handler).
Negative:
- Real-time per-device metrics (CPU, RAM, FPS) would require additional HTMX polling or a charting library. hx-get polling works for this, but server load scales with the number of connected clients times the poll frequency, so halving the poll interval doubles the request rate.
- If the dashboard scope grows to include tenant management, an RBAC UI, or mobile support, a component framework would become justified. At that point, Option B (React + Go API) becomes the better choice.