WebShell

WebShell is a browser-facing terminal that reaches every registered edge through the same geminio tunnel the rest of the platform uses. There is no separate SSH bastion, no jumpbox, no inbound port. The edge keeps dialing out; the manager opens a multiplexed stream class for shell I/O.

Use cases:

  • The agent suggests a fix; you click "Open shell on edge-prod-04" and confirm the change without leaving the SPA.
  • A vendor or contractor needs a one-off look at a single host without VPN enrolment.
  • Incident-response: every command is recorded with the audit row of the originating session.

Architecture

text
browser ──WebSocket──> manager:/v1/webshell/ws
                          │
                          ├─ Router.Register(sessionID, sink, ActiveSession)
                          │
                          └─ geminio Stream (shell class)
                                  │
                                  ▼
                              edge agent
                                  └─ pty.Start("/bin/bash")

The manager-side router is in internal/manager/biz/webshell/router.go. It maintains a sessionID → Sink directory: WebSocket handlers register on connect, and the tunnel-incoming dispatcher routes the edge's output and exit pushes to the right browser.

go
// internal/manager/biz/webshell/router.go:57
type Router struct {
    mu          sync.RWMutex
    sinks       map[string]Sink
    meta        map[string]*ActiveSession // sessionID → metadata
    stdoutBytes sync.Map                  // sessionID → *uint64
}

The HTTP / WebSocket handler lives next door in internal/manager/server/webshell so the router stays HTTP-agnostic and unit-testable.
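As a rough illustration of the sessionID → Sink directory, the sketch below registers a sink on connect and routes edge output to it. The Sink method set, the miniRouter type, and the Unregister and Dispatch names are assumptions made for this example; the source only shows the Router struct and Router.Register.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Sink is a hypothetical shape for the browser-side endpoint; the real
// interface in router.go may differ.
type Sink interface {
	Write(p []byte) error
}

// bufferSink is a test double that collects everything routed to it.
type bufferSink struct{ data []byte }

func (b *bufferSink) Write(p []byte) error {
	b.data = append(b.data, p...)
	return nil
}

// miniRouter is a simplified stand-in for Router: it keeps only the
// sessionID → Sink map.
type miniRouter struct {
	mu    sync.RWMutex
	sinks map[string]Sink
}

func newMiniRouter() *miniRouter {
	return &miniRouter{sinks: make(map[string]Sink)}
}

var errUnknownSession = errors.New("unknown session")

// Register is what a WebSocket handler would call on connect.
func (r *miniRouter) Register(sessionID string, s Sink) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.sinks[sessionID] = s
}

// Unregister (hypothetical name) drops the sink when the browser goes away.
func (r *miniRouter) Unregister(sessionID string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	delete(r.sinks, sessionID)
}

// Dispatch (hypothetical name) routes one chunk of edge output to the
// sink registered for that session.
func (r *miniRouter) Dispatch(sessionID string, p []byte) error {
	r.mu.RLock()
	s, ok := r.sinks[sessionID]
	r.mu.RUnlock()
	if !ok {
		return errUnknownSession
	}
	return s.Write(p)
}

func main() {
	r := newMiniRouter()
	sink := &bufferSink{}
	r.Register("s1", sink)
	_ = r.Dispatch("s1", []byte("hello from edge"))
	fmt.Printf("%s\n", sink.data)
	r.Unregister("s1")
	fmt.Println(r.Dispatch("s1", []byte("late output")))
}
```

Because the sketch has no HTTP types in it, it can be exercised with a plain in-memory sink, which is the property the HTTP-agnostic split is meant to preserve.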

The two stream classes

The geminio tunnel multiplexes:

  1. Control class — JSON RPCs (skill execution, plugin signalling, alert evaluator probes).
  2. Shell class — raw byte streams (one per WebShell session, one per tail -f follower, etc.).

Splitting at the tunnel level matters because shell I/O is bursty and unframed; mixing it with the control RPCs starves the latter. Each class has its own backpressure budget.
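One way to picture per-class backpressure budgets is a bounded queue per class, so a burst of shell frames can exhaust only its own capacity. This is a sketch under assumptions: the mux type, the TryEnqueue name, and the budget sizes are invented for illustration and do not describe how geminio itself implements the split.

```go
package main

import "fmt"

type streamClass int

const (
	classControl streamClass = iota
	classShell
)

// mux holds one bounded queue per class; the capacities are illustrative.
type mux struct {
	queues map[streamClass]chan []byte
}

func newMux(controlBudget, shellBudget int) *mux {
	return &mux{queues: map[streamClass]chan []byte{
		classControl: make(chan []byte, controlBudget),
		classShell:   make(chan []byte, shellBudget),
	}}
}

// TryEnqueue reports false once the class has used up its own budget.
// It never blocks and never consumes another class's capacity.
func (m *mux) TryEnqueue(c streamClass, frame []byte) bool {
	select {
	case m.queues[c] <- frame:
		return true
	default:
		return false
	}
}

func main() {
	m := newMux(4, 2)
	for i := 0; i < 3; i++ {
		fmt.Println("shell frame accepted:", m.TryEnqueue(classShell, []byte("x")))
	}
	// The shell queue is now full, but control traffic is unaffected.
	fmt.Println("control frame accepted:", m.TryEnqueue(classControl, []byte(`{"rpc":"ping"}`)))
}
```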

Session metadata

Each live session has an ActiveSession:

go
// router.go:37
type ActiveSession struct {
    SessionID    string
    OngridUserID uint64
    SSHUser      string
    DeviceID     uint64
    EdgeID       uint64
    StartedAt    time.Time
    LastInputAt  time.Time // updated on every browser → edge frame
}

LastInputAt is updated on every keystroke (Router.TouchInput). An idle-timeout watchdog evicts sessions that have received no input within the configured limit, which defends against the "I closed the browser tab with a running command" leak.
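The idle check itself can be sketched as a comparison between the current time and LastInputAt. The session type below keeps only the two fields the check needs, and idleSessionIDs, demoIdle, and the 30-minute limit are hypothetical names and values chosen for the example.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// session keeps only the ActiveSession fields the watchdog needs.
type session struct {
	SessionID   string
	LastInputAt time.Time
}

// idleSessionIDs returns the IDs whose last input is older than limit,
// measured against now. The result is sorted for stable output.
func idleSessionIDs(sessions []session, now time.Time, limit time.Duration) []string {
	var idle []string
	for _, s := range sessions {
		if now.Sub(s.LastInputAt) > limit {
			idle = append(idle, s.SessionID)
		}
	}
	sort.Strings(idle)
	return idle
}

// demoIdle runs the check against a fixed clock so the result is deterministic.
func demoIdle() []string {
	now := time.Date(2024, 1, 1, 12, 0, 0, 0, time.UTC)
	sessions := []session{
		{SessionID: "active", LastInputAt: now.Add(-2 * time.Minute)},
		{SessionID: "abandoned", LastInputAt: now.Add(-45 * time.Minute)},
	}
	return idleSessionIDs(sessions, now, 30*time.Minute)
}

func main() {
	for _, id := range demoIdle() {
		fmt.Printf("would call Kill(%q) on session %s\n", "idle timeout", id)
	}
}
```

Passing now in as a parameter, rather than calling time.Now inside the check, keeps the eviction rule testable without sleeping.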

Audit recording

Two layers:

  1. Header row: the webshell_sessions table records who, when, which edge, exit code, and total bytes in/out.
  2. Stream recording: the manager-side Recorder interface takes every byte that crosses the wire (both directions, timestamped) and appends it to an asciinema-compatible cast file under /var/lib/ongrid/webshell-recordings/<session_id>.cast. The admin /admin/webshell page plays them back.

The Recorder interface is narrow on purpose — production uses a file sink; tests inject a fake; future cloud-blob backends drop in without touching the rest of the stack.
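To make the cast format concrete, here is a minimal sketch of a recorder that writes asciicast v2: one JSON header line, then one JSON array per event of the form [seconds, "o" or "i", data]. The Recorder method signature, castRecorder, and demoCast are assumptions for illustration; the source does not show the real interface, and whether production writes v2 specifically is also assumed.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"time"
)

// Recorder is a hypothetical shape for the narrow interface.
type Recorder interface {
	Record(at time.Time, fromBrowser bool, p []byte) error
}

// castRecorder appends asciicast v2 lines to any io.Writer (a file in
// production, a buffer in tests).
type castRecorder struct {
	w     io.Writer
	start time.Time
}

var _ Recorder = (*castRecorder)(nil)

// newCastRecorder writes the header line and returns the recorder.
func newCastRecorder(w io.Writer, start time.Time, cols, rows int) (*castRecorder, error) {
	header := map[string]any{
		"version":   2,
		"width":     cols,
		"height":    rows,
		"timestamp": start.Unix(),
	}
	if err := json.NewEncoder(w).Encode(header); err != nil {
		return nil, err
	}
	return &castRecorder{w: w, start: start}, nil
}

// Record appends one event. "i" marks browser input, "o" marks edge output.
// Bytes that are not valid UTF-8 are replaced by encoding/json, so this
// sketch is lossy for binary output.
func (c *castRecorder) Record(at time.Time, fromBrowser bool, p []byte) error {
	kind := "o"
	if fromBrowser {
		kind = "i"
	}
	event := []any{at.Sub(c.start).Seconds(), kind, string(p)}
	return json.NewEncoder(c.w).Encode(event)
}

// demoCast records one input and one output frame into a buffer.
func demoCast() string {
	var buf bytes.Buffer
	start := time.Unix(1700000000, 0)
	rec, err := newCastRecorder(&buf, start, 80, 24)
	if err != nil {
		return ""
	}
	_ = rec.Record(start.Add(500*time.Millisecond), true, []byte("ls\r"))
	_ = rec.Record(start.Add(750*time.Millisecond), false, []byte("file.txt\r\n"))
	return buf.String()
}

func main() {
	fmt.Print(demoCast())
}
```

Writing to an io.Writer rather than a file path is what lets a test inject a buffer and a future blob backend supply its own writer.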

Concurrency limits

Per-user cap: the WebSocket open handler calls Router.CountByUser and rejects over-cap connections with HTTP 429. The default cap is 5 (configurable). A per-edge cap defends against a runaway agent opening 100 concurrent shells.
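The per-user check can be sketched as a counter consulted before the upgrade. Everything here except the CountByUser name, the cap of 5, and the 429 status is an assumption: sessionCounter, tryAcquire, and openShell are hypothetical, and the sketch folds the count check and the increment into one locked step so two simultaneous opens cannot both slip under the cap.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

const maxShellsPerUser = 5 // default cap from the text

type sessionCounter struct {
	mu     sync.Mutex
	byUser map[uint64]int
}

func newSessionCounter() *sessionCounter {
	return &sessionCounter{byUser: make(map[uint64]int)}
}

// CountByUser reports how many live sessions a user currently holds.
func (c *sessionCounter) CountByUser(userID uint64) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.byUser[userID]
}

// tryAcquire (hypothetical) checks the cap and reserves a slot atomically.
func (c *sessionCounter) tryAcquire(userID uint64, limit int) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.byUser[userID] >= limit {
		return false
	}
	c.byUser[userID]++
	return true
}

// openShell rejects over-cap requests with 429 before any upgrade happens.
func openShell(c *sessionCounter, userID uint64) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if !c.tryAcquire(userID, maxShellsPerUser) {
			http.Error(w, "too many concurrent shells", http.StatusTooManyRequests)
			return
		}
		// The WebSocket upgrade would happen here; 200 is a stand-in.
		w.WriteHeader(http.StatusOK)
	}
}

// demoCap sends six open requests for the same user and collects the statuses.
func demoCap() []int {
	h := openShell(newSessionCounter(), 7)
	var codes []int
	for i := 0; i < 6; i++ {
		rec := httptest.NewRecorder()
		h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/v1/webshell/ws", nil))
		codes = append(codes, rec.Code)
	}
	return codes
}

func main() {
	fmt.Println(demoCap())
}
```

A real handler would also release the slot when the session ends; that bookkeeping is omitted here.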

Killing sessions

Three paths kill a session:

  1. Browser close — WebSocket disconnect propagates to the edge, which kill -HUPs the pty.
  2. Admin kill — the admin SPA calls Killer.Kill("admin terminated") on the Sink, which tunnels a close down to the edge. The reason is recorded in the session's exit row.
  3. Idle eviction — the watchdog fires Kill("idle timeout") on sessions whose LastInputAt is older than the idle limit.

go
// router.go:50
type Killer interface {
    Kill(reason string)
}

The manager-side handler installs a Killer when it registers the Sink. Any Sink that opts in becomes admin-killable; the rest are only browser-close-killable.
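The opt-in behaviour maps naturally onto a Go type assertion: a caller holding a Sink asks whether it also implements Killer. The sketch below assumes that mechanism. The Sink method set, both sink types, and killIfSupported are hypothetical; only the Killer interface is taken from the source.

```go
package main

import "fmt"

// Sink is a hypothetical shape; the real interface may differ.
type Sink interface {
	Write(p []byte) error
}

// Killer matches the interface shown in router.go.
type Killer interface {
	Kill(reason string)
}

// plainSink does not opt in, so it can only end via browser close.
type plainSink struct{}

func (plainSink) Write(p []byte) error { return nil }

// killableSink opts in by implementing Kill as well.
type killableSink struct{ killedWith string }

func (k *killableSink) Write(p []byte) error { return nil }
func (k *killableSink) Kill(reason string)   { k.killedWith = reason }

// killIfSupported reports whether the sink accepted the kill request.
func killIfSupported(s Sink, reason string) bool {
	k, ok := s.(Killer)
	if !ok {
		return false
	}
	k.Kill(reason)
	return true
}

func main() {
	ks := &killableSink{}
	fmt.Println(killIfSupported(ks, "admin terminated"), ks.killedWith)
	fmt.Println(killIfSupported(plainSink{}, "admin terminated"))
}
```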

Role gating

WebShell is gated on the admin role (ADR-022 RBAC). The user role can chat with the agent but cannot open shells; the viewer role can read recordings of past sessions but cannot open new ones. The gate runs at the HTTP handler entry, before the WebSocket upgrade.
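A gate that runs before the upgrade can be sketched as ordinary HTTP middleware. The assumptions are significant here: roleOf reads a demo header purely so the example is self-contained (a real deployment would take the role from the authenticated session), and requireAdmin, demoGate, and the 403 status are illustrative choices not stated in the source.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// roleOf is a hypothetical lookup. Trusting a request header is for
// demonstration only and would not be safe in production.
func roleOf(r *http.Request) string {
	return r.Header.Get("X-Demo-Role")
}

// requireAdmin rejects non-admin callers before next (the upgrade) runs.
func requireAdmin(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if roleOf(r) != "admin" {
			http.Error(w, "webshell requires the admin role", http.StatusForbidden)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// demoGate returns the status code a caller with the given role would see.
func demoGate(role string) int {
	upgrade := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK) // stand-in for the WebSocket upgrade
	})
	req := httptest.NewRequest(http.MethodGet, "/v1/webshell/ws", nil)
	req.Header.Set("X-Demo-Role", role)
	rec := httptest.NewRecorder()
	requireAdmin(upgrade).ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	for _, role := range []string{"admin", "user", "viewer"} {
		fmt.Println(role, demoGate(role))
	}
}
```

Rejecting at this point means a denied caller never holds an open WebSocket, so no session state has to be cleaned up.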

See also

  • Skills — the bash skill is the one-shot equivalent of WebShell (single command, no pty). Same audit substrate.
  • Edge install — getting a host's edge agent up so WebShell can reach it.
  • Architecture — where the geminio tunnel sits.