Caisey Blog

Technical MSPs · May 20, 2026

Why Caisey treats RPC as the source of truth for sendability

Most remote tools show green dots that lie. Caisey uses actual RPC bridge state to know if a command can reach an endpoint, eliminating false confidence in remote troubleshooting.
rpc · bridge · reliability · endpoint-connectivity · msp-operations

Every MSP technician has lived the green-dot lie. The console says the endpoint is online. You click connect. You wait. Nothing happens. The indicator flips to amber, then red, then maybe back to green again. You refresh. You try again. Five minutes later you discover the machine went to sleep twenty minutes ago and the status was stale the whole time.

This is not a minor UI annoyance. It is an operational tax that compounds across every technician, every shift, every urgent ticket. False confidence about endpoint reachability wastes more time than honest uncertainty. That is why Caisey does not treat presence indicators as authoritative. We treat RPC state as the source of truth for whether a command is actually sendable.

The problem with cosmetic online indicators

Traditional remote access tools derive "online" from heartbeats, TCP keepalives, or WebSocket connection state. These signals are cheap to maintain and cheap to break. A machine can appear online while its tunnel is half-dead, while a proxy has silently dropped the route, or while the agent process is hung in a way that preserves the socket but blocks all useful work.

The result is a dashboard full of false positives. Technicians learn to distrust the green dot, which means they also learn to ignore it. The indicator becomes decoration rather than information. When every status is suspect, technicians fall back to trial and error: attempt the action, wait for timeout, diagnose the failure, retry through a different path.

This pattern is especially painful for MSPs managing heterogeneous networks. A client machine might be behind CGNAT, on a VPN that flakes, or on a corporate network that aggressively times out idle connections. The visible "online" state often reflects the last successful heartbeat through a path that no longer exists.

What RPC-grounded sendability means

Caisey's bridge architecture does not ask "is this endpoint online?" It asks "can I deliver this specific command to this specific endpoint right now?" The difference is operational, not philosophical.

When a technician initiates an action through Caisey's cloud console, the request flows through a Cloudflare Worker control plane to a SQLite-backed Durable Object that tracks the endpoint's bridge state. The bridge is not a persistent tunnel. It is a reconnect-capable RPC channel that lazily establishes transport only when there is actual work to do. The Durable Object knows whether the bridge currently has an active WebRTC or WebSocket path, whether that path has passed a recent round-trip validation, and whether the endpoint runtime has acknowledged readiness within the RPC layer itself.
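To make those three checks concrete, here is a minimal sketch of how tracked bridge state could be folded into a single sendability value. It is an illustration only, not Caisey's actual code: the `BridgeState` shape, the `classifySendability` function, and the 30-second freshness window are all assumptions made for this example.

```typescript
// Hypothetical transports, named after the two paths mentioned above.
type BridgeTransport = "webrtc" | "websocket";

// Hypothetical shape of the state a Durable Object might keep per endpoint.
interface BridgeState {
  activeTransport: BridgeTransport | null; // null when no path is established
  lastRoundTripOkAt: number | null; // ms timestamp of the last validated round trip
  runtimeReadyAckAt: number | null; // ms timestamp of the last RPC-level readiness ack
}

type Sendability = "reachable" | "unverified" | "disconnected";

// Assumed freshness window; the real threshold is not stated in this post.
const FRESHNESS_MS = 30000;

function classifySendability(state: BridgeState, now: number): Sendability {
  // No transport at all: nothing can be delivered right now.
  if (state.activeTransport === null) return "disconnected";

  // A timestamp only counts if it exists and is recent enough.
  const fresh = (t: number | null): boolean =>
    t !== null && now - t <= FRESHNESS_MS;

  // An open socket alone is not enough: both the round trip and the
  // runtime's readiness ack must be recent.
  return fresh(state.lastRoundTripOkAt) && fresh(state.runtimeReadyAckAt)
    ? "reachable"
    : "unverified";
}
```

The point of the sketch is the middle case: an endpoint with an open transport but a stale round trip is reported as "unverified" rather than rounded up to "online".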

Sendability is determined by attempting to route the command and receiving an RPC-level acknowledgment. Not a heartbeat. Not a socket state. An actual request-response pair that confirms the endpoint is alive, the runtime is responsive, and the specific command can be queued for execution.

If the bridge is currently disconnected, the Durable Object holds the command and attempts reconnection through available transports. The technician sees "queued" or "connecting" rather than a misleading "online." There is no false confidence. There is no green dot that lies.
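The acknowledge-or-hold behavior can be sketched in a few lines. This is a simplified illustration under stated assumptions: `CommandHolder`, `SendFn`, and the status strings are hypothetical names, the send call is modeled as synchronous for brevity (a real RPC would be asynchronous), and the queue lives in memory rather than in durable storage.

```typescript
interface Command {
  id: string;
  name: string;
}

interface Ack {
  commandId: string;
}

// Hypothetical send function: returns an Ack when the endpoint runtime
// answers the request, or null when no acknowledgment arrives.
type SendFn = (cmd: Command) => Ack | null;

type ConsoleStatus = "acknowledged" | "queued";

class CommandHolder {
  private pending: Command[] = [];
  private send: SendFn;

  constructor(send: SendFn) {
    this.send = send;
  }

  // Sendability is decided by an actual request-response pair,
  // not by inspecting socket state.
  submit(cmd: Command): ConsoleStatus {
    const ack = this.send(cmd);
    if (ack !== null && ack.commandId === cmd.id) return "acknowledged";
    this.pending.push(cmd);
    return "queued";
  }

  // Called after a transport is re-established. Delivers held commands
  // in submission order and stops at the first one that is not acknowledged.
  flush(): string[] {
    const delivered: string[] = [];
    while (this.pending.length > 0) {
      const next = this.pending[0];
      const ack = this.send(next);
      if (ack === null || ack.commandId !== next.id) break;
      this.pending.shift();
      delivered.push(next.id);
    }
    return delivered;
  }

  get queuedCount(): number {
    return this.pending.length;
  }
}
```

In this sketch a command submitted while the endpoint is unreachable returns "queued" immediately, and a later `flush()` delivers it once the send function starts returning acknowledgments.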

Why lazy event bridges change the equation

Caisey's bridge is designed around lazy event delivery rather than persistent pipe maintenance. This is not an implementation detail. It is a structural choice that makes RPC-grounded sendability possible.

A persistent tunnel must keep state synchronized across network boundaries, handle NAT traversal continuously, and recover from middlebox interference without breaking the abstraction. These requirements create pressure to show optimistic connection state, because the cost of admitting uncertainty is high. If you have invested in maintaining a tunnel, you want to believe it is working.

A lazy bridge inverts this. The transport is ephemeral. It exists only to carry actual commands and events. There is no sunk cost in maintaining a connection that might be degraded. The system can afford to be honest about reachability because it does not need to pretend the tunnel is healthy to justify its own existence.

Reconnect handling becomes part of normal operation rather than exceptional recovery. The Durable Object sequences commands, tracks retry state, and surfaces explicit status to the console. A technician can see that a machine is "last reachable 4 minutes ago, retrying now" rather than guessing based on a color that has not updated.
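A status line like that can be derived from two facts: when the endpoint last acknowledged anything, and whether a retry is in progress. The sketch below is illustrative only. The function name, its parameters, and the one-minute cutoff for "reachable now" are assumptions, not Caisey's actual wording rules.

```typescript
// Builds a human-readable status from the last RPC acknowledgment time.
// lastAckAt and now are millisecond timestamps; lastAckAt is null when
// the endpoint has never acknowledged anything.
function describeReachability(
  lastAckAt: number | null,
  retrying: boolean,
  now: number
): string {
  if (lastAckAt === null) {
    return retrying ? "never reached, retrying now" : "never reached";
  }

  const minutes = Math.floor((now - lastAckAt) / 60000);

  // Assumed cutoff: an ack within the last minute counts as current.
  if (minutes < 1) return "reachable now";

  const unit = minutes === 1 ? "minute" : "minutes";
  const base = `last reachable ${minutes} ${unit} ago`;
  return retrying ? `${base}, retrying now` : base;
}
```

The label is computed from the age of a real acknowledgment each time it is rendered, so it cannot stay "green" after the underlying evidence has gone stale.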

What this means for technician workflow

The practical impact shows up in small moments that add up. A technician looking at a client's machine list sees actual sendability state: "reachable now," "queued for retry," or "last contact 12 minutes ago." They do not waste time attempting actions against endpoints that the system already knows are unreachable. They do not refresh dashboards hoping a stale indicator will update.

When a command is queued, the technician can continue working on other tickets rather than babysitting a connection attempt. The Durable Object handles retry with exponential backoff and transport rotation. When the endpoint comes back, the command executes and the result appears in the session history. The technician gets a notification, not a timeout dialog.
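Exponential backoff with transport rotation can be sketched as a pure function of the attempt number. The base delay, the cap, and the choice to alternate transports on every attempt are assumptions for illustration. Jitter, which production retry loops usually add, is left out to keep the example deterministic.

```typescript
type Transport = "webrtc" | "websocket";

interface RetryPlan {
  delayMs: number;
  transport: Transport;
}

// Assumed tuning values; the real ones are not stated in this post.
const BASE_DELAY_MS = 1000;
const MAX_DELAY_MS = 60000;

// attempt is zero-based: 0 is the first retry after the initial failure.
function planRetry(attempt: number): RetryPlan {
  // Double the delay on each attempt, capped so retries never stall for long.
  const delayMs = Math.min(BASE_DELAY_MS * 2 ** attempt, MAX_DELAY_MS);

  // Alternate between the two transports so a path that is blocked by a
  // middlebox does not get retried forever.
  const transport: Transport = attempt % 2 === 0 ? "webrtc" : "websocket";

  return { delayMs, transport };
}
```

With these values the first four retries wait 1, 2, 4, and 8 seconds, alternating WebRTC and WebSocket, and the delay stops growing once it reaches 60 seconds.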

For after-hours work, this is especially valuable. A patch deployment or configuration change can be submitted against a machine that is currently offline. The command waits in the bridge's durable queue. When the machine reconnects, perhaps because a user woke it in the morning, the action executes and the result is recorded. There is no need for the technician to be online at the same time as the endpoint.

The audit and accountability benefit

RPC-grounded sendability also produces better records. Every command attempt generates a traceable event: submitted, routed, acknowledged, executed, or failed with a specific reason. The session history shows not just what was attempted, but whether the endpoint was actually reachable at the time of the attempt.
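One way to picture such a record is as an append-only trail of stage events per command. The sketch below assumes the five stages form an ordered lifecycle in which any non-terminal stage may fail. That ordering, the `AuditEvent` shape, and the helper names are assumptions for illustration, not a description of Caisey's storage format.

```typescript
type Stage = "submitted" | "routed" | "acknowledged" | "executed" | "failed";

interface AuditEvent {
  commandId: string;
  stage: Stage;
  at: number; // ms timestamp
  reason?: string; // expected on "failed" events
}

// Assumed lifecycle: each stage lists the stages allowed to follow it.
const NEXT: Record<Stage, Stage[]> = {
  submitted: ["routed", "failed"],
  routed: ["acknowledged", "failed"],
  acknowledged: ["executed", "failed"],
  executed: [],
  failed: [],
};

// Returns a new trail with the event appended, or throws if the event
// would break the lifecycle or omit a failure reason.
function appendEvent(trail: AuditEvent[], event: AuditEvent): AuditEvent[] {
  const last = trail[trail.length - 1]; // undefined for an empty trail
  if (last === undefined) {
    if (event.stage !== "submitted") {
      throw new Error("a trail must start with 'submitted'");
    }
  } else if (NEXT[last.stage].indexOf(event.stage) === -1) {
    throw new Error(`invalid transition ${last.stage} -> ${event.stage}`);
  }
  if (event.stage === "failed" && !event.reason) {
    throw new Error("failed events need a reason");
  }
  return [...trail, event];
}

// The endpoint counts as reachable for this attempt only if it
// produced an RPC-level acknowledgment.
function wasReachable(trail: AuditEvent[]): boolean {
  return trail.some((e) => e.stage === "acknowledged");
}
```

A trail that goes from "submitted" straight to "failed" with the reason "endpoint unreachable" answers the reachability question directly, without anyone needing to remember what a status icon showed at the time.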

This matters for MSPs explaining service delivery to clients. "We tried to run the diagnostic at 2:15 AM but the machine was unreachable" is a defensible statement backed by RPC state. "The dashboard said it was online but we could not connect" is not. The difference shows up in SLA discussions, in dispute resolution, and in internal post-incident reviews.

Caisey's public reviewed transcript shares can include this reachability context. A client reviewing a support session sees that a command was queued during a network outage and executed automatically when connectivity was restored. The narrative is transparent. The technician's actions are documented against actual system state, not against a status icon that may have been wrong.

Building operational trust through honesty

The deeper point is about trust between technicians and their tools. MSPs run on thin margins and high ticket volumes. Every moment of friction, whether a false start, a stale indicator, or an unexplained timeout, erodes confidence in the remote troubleshooting stack. Technicians develop workarounds. They open multiple tools. They maintain mental checklists of which clients have reliable connectivity and which do not.

RPC-grounded sendability removes a category of uncertainty. The system tells you what it actually knows, not what it hopes is true. Technicians can plan around real state. Dispatchers can route tickets based on actual endpoint availability. Client communication can reference specific command status rather than vague "connection issues."

This is not a feature that shows up in marketing checklists. It is an architectural commitment that shapes every interaction with the console. Caisey built this way because we started from the assumption that MSPs need to trust their remote troubleshooting infrastructure the same way they trust their monitoring or their backup systems. Not because it is always available, but because it is honest about when it is not.