Security

Authentication

The WebSocketManager gRPC and HTTP endpoints accept connections authenticated via Dapr's service-to-service mutual TLS (mTLS), which secures all inter-service calls (API Gateway → WebSocketManager).

For external WebSocket servers, authentication is handled at the connection level:

  • Managed connections: Credentials can be embedded in the WebSocket URL or exchanged via the connection's message protocol
  • Binary pipeline: Messages are sent as raw bytes over WebSocket Binary or Text frames, as determined by the content_type field. Messages for external servers are not wrapped in a correlation envelope; correlation is handled via a 16-byte UUID prefix.
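
The 16-byte UUID prefix can be illustrated with a minimal sketch. The function names wrap and unwrap are hypothetical, and the sketch assumes the prefix is the UUID in RFC 4122 byte order placed directly in front of the payload; the actual byte order and the component that adds or strips the prefix are not specified here.

```python
import uuid

PREFIX_LEN = 16  # a UUID is 16 bytes long


def wrap(correlation_id: uuid.UUID, payload: bytes) -> bytes:
    """Prepend the correlation ID to the raw payload (assumed layout)."""
    return correlation_id.bytes + payload


def unwrap(frame: bytes) -> tuple[uuid.UUID, bytes]:
    """Split a frame into (correlation ID, payload)."""
    if len(frame) < PREFIX_LEN:
        raise ValueError("frame shorter than the 16-byte correlation prefix")
    return uuid.UUID(bytes=frame[:PREFIX_LEN]), frame[PREFIX_LEN:]
```

Note that a .NET Guid serializes its first three fields little-endian by default, so a real implementation would need both sides to agree on the byte order.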

For external client access to the WebSocketManager itself, authentication should be handled at the API Gateway layer. See API Gateway Security for JWT bearer token and role-based authorization configuration.

Connection Ownership

Each connection is owned by the service instance (Kubernetes pod) that created it. Only the owning instance can modify or disconnect a connection. The instance ID is derived from the HOSTNAME environment variable (Kubernetes pod name). If HOSTNAME is unset (e.g. local dotnet run), the instance ID falls back to a fresh GUID per process start — connections owned by previous local runs will be considered orphans and reclaimed by the reclaimer.
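
The fallback rule can be sketched as follows. The service itself is a .NET application, so this Python version only illustrates the logic; resolve_instance_id is a hypothetical name, and treating an empty HOSTNAME the same as an unset one is an assumption.

```python
import os
import uuid


def resolve_instance_id(env=None) -> str:
    """Use HOSTNAME when present, otherwise a fresh GUID for this process."""
    env = os.environ if env is None else env
    hostname = env.get("HOSTNAME", "").strip()
    # Assumption: an empty value counts as unset.
    return hostname if hostname else str(uuid.uuid4())
```

Because the fallback value changes on every start, two local runs never share an instance ID, which is why connections from an earlier run look orphaned.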

Pod replacement orphan risk

Kubernetes pod replacement (rolling update, eviction, node drain, HPA scale-down) usually changes the pod name. A StatefulSet pod keeps its name when it is recreated at the same ordinal, but a pod managed by a ReplicaSet is replaced by a pod with a newly generated name. In most deployments, the new pod therefore has a different name.

When that happens:

  1. The new pod's HOSTNAME is different from the dying pod's.
  2. All connections owned by the dying pod's HOSTNAME are now owned by an "unknown" instance from the new pod's perspective.
  3. The ConnectionReclaimerHostedService reclaims them after StaleConnectionTtl (60s default) unless the new pod knows the old pod is alive.

To detect dead-vs-alive peers without waiting for the full TTL, the InstanceHeartbeatService publishes this instance's heartbeat to the Dapr state store on INSTANCE_HEARTBEAT_INTERVAL_SECONDS (default 10s) and discovers peer heartbeats on INSTANCE_DISCOVERY_INTERVAL_SECONDS (default 30s). Peer heartbeats older than INSTANCE_STALE_THRESHOLD_SECONDS (default 60s) are ignored — the peer is treated as dead.

To disable heartbeat-based detection (e.g. to fall back to pure TTL-based reclamation), set INSTANCE_HEARTBEAT_INTERVAL_SECONDS=0.
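
A minimal sketch of the liveness check, assuming heartbeats are available as a mapping from instance ID to last-seen timestamp. The helper names and the data shape are hypothetical, and whether a heartbeat exactly at the threshold counts as alive is an assumption.

```python
import os
from datetime import datetime, timedelta, timezone


def read_seconds(name: str, default: int, env=None) -> int:
    """Read an interval setting in seconds, falling back to its default."""
    env = os.environ if env is None else env
    return int(env.get(name, default))


def heartbeat_enabled(interval_seconds: int) -> bool:
    """An interval of 0 disables heartbeat-based detection."""
    return interval_seconds > 0


def live_peers(heartbeats: dict, now: datetime, stale_threshold_seconds: int = 60) -> set:
    """Return the instance IDs whose last heartbeat is not older than the threshold."""
    cutoff = now - timedelta(seconds=stale_threshold_seconds)
    return {instance for instance, seen in heartbeats.items() if seen >= cutoff}
```

Any owner that is missing from the returned set would be treated as dead, so its connections become candidates for reclamation.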

During rolling updates, the ConnectionReclaimerHostedService reclaims connections from dead instances every 30 seconds. Straggler connections can be cleared on startup with CLEAR_ALL_ON_START=TRUE.

Configuration & Secrets

The WebSocketManager reads its configuration from the standard .NET configuration sources (in order: appsettings.json, environment variables, command-line), with later sources overriding earlier ones. The following keys control runtime behavior and are the only ones that carry security-sensitive values:

| Key | Default | Security role |
| --- | --- | --- |
| Cors:AllowedOrigins | [] | Restricts which browser origins can call the REST endpoints. Always set explicitly in production. |
| AllowedCodeSourceHosts | (empty) | Not applicable: WebSocketManager does not fetch worker code. |
| Logging:LogLevel:Default | Information | Set to Warning or higher in production to avoid leaking sensitive request data into logs. |
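
The way layered configuration sources resolve a key can be sketched as below. This simplified model is not the .NET implementation: it ignores case-insensitive key matching, and the helper names are hypothetical. It relies on two .NET conventions: a double underscore in an environment variable name stands for the ":" separator, and a source added later overrides an earlier one.

```python
def env_to_config(env: dict) -> dict:
    """Map environment variable names to .NET-style keys ('__' becomes ':')."""
    return {name.replace("__", ":"): value for name, value in env.items()}


def layered(*sources: dict) -> dict:
    """Merge configuration sources; later sources override earlier ones."""
    merged = {}
    for source in sources:
        merged.update(source)
    return merged


appsettings = {"Logging:LogLevel:Default": "Information"}
environment = env_to_config({"Logging__LogLevel__Default": "Warning"})
config = layered(appsettings, environment)
```

In this example the environment variable wins, so the effective default log level is Warning.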

Connection secrets for Dapr components (Redis password, broker credentials) are supplied at runtime through Dapr's secret store component (secretstores.kubernetes); never commit them to source control. Dapr resolves them by key (e.g., stateStore.password), and the application reads them via DaprClient.GetSecretAsync.
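
For illustration, the same lookup can be made against the Dapr sidecar's secrets HTTP API (GET /v1.0/secrets/<store>/<key>). The sketch below assumes the sidecar listens on DAPR_HTTP_PORT (3500 when unset); the store name passed in is whatever the secret store component is called in the deployment, which is not stated here.

```python
import json
import os
import urllib.parse
import urllib.request
from typing import Optional


def secret_url(store: str, key: str, port: Optional[int] = None) -> str:
    """Build the sidecar URL for one secret lookup."""
    port = port or int(os.environ.get("DAPR_HTTP_PORT", "3500"))
    store_part = urllib.parse.quote(store, safe="")
    key_part = urllib.parse.quote(key, safe="")
    return f"http://127.0.0.1:{port}/v1.0/secrets/{store_part}/{key_part}"


def get_secret(store: str, key: str) -> dict:
    """Fetch a secret from the local sidecar; requires a running Dapr sidecar."""
    with urllib.request.urlopen(secret_url(store, key), timeout=5) as response:
        return json.load(response)
```

Calling get_secret("kubernetes", "stateStore.password") would return a JSON object of secret values, where "kubernetes" is a hypothetical store name.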

Dapr Component Security

The WebSocketManager relies on Dapr for state storage and pub/sub messaging. Ensure Dapr components are configured with:

  • State store: Authentication enabled (Redis password for Valkey)
  • Pub/sub: Authentication enabled on the message broker
  • mTLS: Enabled in Dapr configuration (default since Dapr 1.0)

Audit & Logging

The WebSocketManager emits structured logs for every state-mutating operation (connect, disconnect, send, start_publish, stop_publish) and every reclaim sweep. Each log record includes the instance ID, connection ID, and a correlation ID derived from the incoming gRPC request metadata.
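
A structured record carrying those three identifiers might look like the following sketch. The JSON field names and the logger name are assumptions; the actual log schema is not documented here.

```python
import json
import logging

logger = logging.getLogger("websocketmanager.audit")  # hypothetical logger name


def audit_record(operation: str, instance_id: str, connection_id: str, correlation_id: str) -> str:
    """Serialize one state-mutating operation as a JSON log line."""
    return json.dumps(
        {
            "operation": operation,
            "instance_id": instance_id,
            "connection_id": connection_id,
            "correlation_id": correlation_id,
        },
        sort_keys=True,
    )


def log_operation(operation: str, instance_id: str, connection_id: str, correlation_id: str) -> None:
    """Emit the record at Information level."""
    logger.info(audit_record(operation, instance_id, connection_id, correlation_id))
```

Keeping the correlation ID in every record makes it possible to trace a single gRPC request across the connect, send, and disconnect entries it produced.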

Fire-and-forget logging calls (subscription health sweep, reclaim sweep) are wrapped in try/catch and downgrade to LogLevel.Warning if the underlying stream write fails — this prevents observability failures from cascading into service failures.
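
The guard pattern can be sketched as below, with report_safely as a hypothetical helper. It shows only the general idea of catching a reporting failure and logging a warning in its place; it is not the service's actual code.

```python
import logging

logger = logging.getLogger("websocketmanager.sweeps")  # hypothetical logger name


def report_safely(emit, description: str) -> bool:
    """Run a reporting callable; on failure, log a warning instead of raising."""
    try:
        emit()
        return True
    except Exception as exc:  # deliberately broad: reporting must never break the sweep
        logger.warning("%s could not be reported: %s", description, exc)
        return False
```

The caller can ignore the return value; it exists only so that tests or metrics can observe that a report was dropped.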

Reporting Issues

Report security concerns to the Virtufin security team.