WebSocket Fundamentals - Persistent Connection Patterns
Learning Objectives
Establish WebSocket connections to XRPL servers using client libraries and raw WebSocket APIs
Implement connection lifecycle handling (connect, disconnect, error states)
Design reconnection logic with exponential backoff and jitter
Manage subscriptions across connection interruptions
Build production-ready connection wrappers that handle real-world network conditions
If you've used REST APIs, WebSocket will feel unfamiliar. REST is stateless—each request is independent, and you don't care what happened before or after. WebSocket is stateful—you establish a connection, maintain it, and must handle what happens when it breaks.
The Core Challenge:
REST API:   Request → Response → Done
WEBSOCKET:  Connect → Maintain → Handle Failures → Reconnect
Your app must handle:
• Connection establishment
• Keeping connection alive
• Detecting disconnection
• Reconnecting automatically
• Resubscribing after reconnect
• Handling messages during reconnection
• Correlating responses to requests
Most XRPL tutorials show connection code that works perfectly in demos:
// The "hello world" that breaks in production
const client = new xrpl.Client('wss://s1.ripple.com:51233')
await client.connect()
const response = await client.request({ command: 'server_info' })
console.log(response)
This code assumes:
- Connection succeeds on first try
- Connection never drops
- No network interruptions
- Server is always available
None of these assumptions hold in production. This lesson teaches you to build code that handles reality.
The xrpl.js library abstracts much of WebSocket complexity, but you still need to understand what's happening:
```javascript
const xrpl = require('xrpl')

async function connect() {
  const client = new xrpl.Client('wss://s1.ripple.com:51233')

  // Event handlers should be registered BEFORE connect()
  client.on('connected', () => {
    console.log('Connected to XRPL')
  })
  client.on('disconnected', (code) => {
    console.log(`Disconnected with code: ${code}`)
  })
  client.on('error', (error) => {
    console.error('Connection error:', error)
  })

  try {
    await client.connect()
    console.log('Connection established')
    return client
  } catch (error) {
    console.error('Failed to connect:', error)
    throw error
  }
}
```
Critical Insight: Register event handlers BEFORE calling connect(). If you register them after, you might miss events that fire during connection establishment.
Understanding raw WebSocket helps when debugging or when library behavior isn't what you need:
```javascript
const WebSocket = require('ws')

function rawConnect(url) {
  return new Promise((resolve, reject) => {
    const ws = new WebSocket(url)
    ws.on('open', () => {
      console.log('WebSocket connection opened')
      resolve(ws)
    })
    ws.on('error', (error) => {
      console.error('WebSocket error:', error)
      reject(error)
    })
    ws.on('close', (code, reason) => {
      console.log(`WebSocket closed: ${code} - ${reason}`)
    })
    ws.on('message', (data) => {
      const message = JSON.parse(data)
      console.log('Received:', message)
    })
  })
}

// Making a request with raw WebSocket
let nextRequestId = 1 // Incrementing counter; Date.now() can collide within one millisecond

async function rawRequest(ws, command, params = {}) {
  const requestId = nextRequestId++
  return new Promise((resolve, reject) => {
    const handler = (data) => {
      const response = JSON.parse(data)
      if (response.id === requestId) {
        clearTimeout(timeout)
        ws.off('message', handler)
        resolve(response)
      }
    }
    const timeout = setTimeout(() => {
      ws.off('message', handler) // Avoid leaking the listener on timeout
      reject(new Error('Request timeout'))
    }, 10000)
    ws.on('message', handler)
    ws.send(JSON.stringify({
      id: requestId,
      command: command,
      ...params
    }))
  })
}
```
Why Raw WebSocket Matters:
- Debug library issues
- Implement in languages without good libraries
- Understand what libraries are doing under the hood
- Build custom optimizations when needed
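At the raw WebSocket level, every frame arrives through the same `message` event, so the first job is telling request responses apart from subscription pushes. The sketch below assumes rippled's message envelope, in which responses carry `type: 'response'` plus the `id` you sent, and stream messages carry a `type` such as `ledgerClosed` or `transaction`. The `routeMessage` function and the shape of its `handlers` object are hypothetical names chosen for illustration.

```javascript
// Hypothetical router: dispatches one raw frame to the matching handler.
function routeMessage(raw, handlers) {
  let message
  try {
    message = JSON.parse(raw)
  } catch (error) {
    return 'invalid' // Not JSON; ignore rather than crash the socket handler
  }
  if (message === null || typeof message !== 'object') {
    return 'invalid' // Valid JSON but not an object envelope
  }
  if (message.type === 'response') {
    // Responses echo the id from the request, which allows correlation
    if (handlers.onResponse) handlers.onResponse(message.id, message)
    return 'response'
  }
  // Anything else is treated as a pushed stream message
  if (handlers.onStream) handlers.onStream(message.type, message)
  return 'stream'
}

// Usage with hand-written sample frames
const seen = []
const handlers = {
  onResponse: (id, msg) => seen.push(`response:${id}:${msg.status}`),
  onStream: (type) => seen.push(`stream:${type}`)
}
routeMessage('{"id":7,"type":"response","status":"success","result":{}}', handlers)
routeMessage('{"type":"ledgerClosed","ledger_index":100}', handlers)
console.log(seen)
```

Keeping the routing in one pure function makes it easy to unit-test without a live socket.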
WebSocket connections go through distinct states:
CONNECTION STATE MACHINE:
┌──────────────┐
│  CONNECTING  │ ─────── Connection attempt in progress
└──────┬───────┘
       │ success
       ▼
┌──────────────┐
│     OPEN     │ ─────── Connected, can send/receive
└──────┬───────┘
       │ close event (intentional or error)
       ▼
┌──────────────┐
│    CLOSED    │ ─────── Connection terminated
└──────────────┘
WebSocket readyState values:
0 = CONNECTING
1 = OPEN
2 = CLOSING
3 = CLOSED
Checking Connection State:
// Raw WebSocket (ws library)
function isConnected(ws) {
  return ws && ws.readyState === WebSocket.OPEN
}

// xrpl.js client
function safeRequest(client, request) {
  if (!client.isConnected()) {
    throw new Error('Not connected to XRPL')
  }
  return client.request(request)
}
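For logging and debugging it can help to turn the numeric `readyState` into its name. The following is a minimal sketch; `describeReadyState` and `canSend` are hypothetical helpers, and plain objects stand in for real sockets so the example runs without a network connection.

```javascript
// readyState numbers as defined by the WebSocket API
const READY_STATE_NAMES = ['CONNECTING', 'OPEN', 'CLOSING', 'CLOSED']

// Hypothetical helper: human-readable state for log lines
function describeReadyState(ws) {
  if (!ws) return 'NO_SOCKET'
  return READY_STATE_NAMES[ws.readyState] || `UNKNOWN(${ws.readyState})`
}

// Hypothetical helper: only an OPEN socket accepts send()
function canSend(ws) {
  return Boolean(ws) && ws.readyState === 1
}

// Usage with plain objects standing in for sockets
console.log(describeReadyState({ readyState: 0 })) // CONNECTING
console.log(describeReadyState(null))              // NO_SOCKET
console.log(canSend({ readyState: 2 }))            // false
```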
Connections can hang indefinitely without proper timeouts:
async function connectWithTimeout(url, timeoutMs = 10000) {
  const client = new xrpl.Client(url)
  let timer
  const timeoutPromise = new Promise((_, reject) => {
    timer = setTimeout(() => {
      reject(new Error(`Connection timeout after ${timeoutMs}ms`))
    }, timeoutMs)
  })
  try {
    await Promise.race([
      client.connect(),
      timeoutPromise
    ])
    return client
  } catch (error) {
    // Clean up on failure
    try {
      await client.disconnect()
    } catch (disconnectError) {
      // Ignore disconnect errors during cleanup
    }
    throw error
  } finally {
    // Clear the timer so it does not stay pending after connect() settles
    clearTimeout(timer)
  }
}
Connections drop for many reasons:
DISCONNECTION CAUSES:
Network Issues
• Internet connectivity lost
• Route changes (mobile networks)
• Network congestion/packet loss
• Firewall/proxy interruptions
Server Side
• Server maintenance/restart
• Load balancer rotation
• Server overload
• Idle connection cleanup
Client Side
• Application backgrounded (mobile)
• System sleep/hibernate
• Resource constraints
• Intentional disconnect
Protocol Level
• Ping/pong timeout (no heartbeat response)
• TLS/SSL issues
• Message size exceeded
• Protocol violations
Key Insight
Disconnections are not errors to prevent—they're normal events to handle gracefully.
xrpl.js fires a disconnected event:
client.on('disconnected', (code) => {
  console.log(`Disconnected with code: ${code}`)
  // Common close codes:
  // 1000 - Normal closure
  // 1001 - Going away (server shutdown)
  // 1006 - Abnormal closure (no close frame)
  // 1011 - Server error
  // 1012 - Server restart
  if (code === 1000) {
    console.log('Clean disconnect, likely intentional')
  } else {
    console.log('Unexpected disconnect, should reconnect')
    initiateReconnect()
  }
})
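The close codes listed in the handler above can also be expressed as a lookup table, which keeps the reconnect decision in one place. This is a sketch only: `CLOSE_CODE_POLICY` and `policyForCloseCode` are hypothetical names, and the `minWaitMs` values are illustrative defaults rather than anything rippled prescribes.

```javascript
// Hypothetical policy table for the close codes discussed above.
// The wait values are illustrative assumptions, not server requirements.
const CLOSE_CODE_POLICY = {
  1000: { reconnect: false, minWaitMs: 0, note: 'normal closure' },
  1001: { reconnect: true, minWaitMs: 0, note: 'server going away' },
  1006: { reconnect: true, minWaitMs: 0, note: 'abnormal closure' },
  1011: { reconnect: true, minWaitMs: 5000, note: 'server error' },
  1012: { reconnect: true, minWaitMs: 2000, note: 'server restart' }
}

function policyForCloseCode(code) {
  // Unknown codes fall back to "reconnect with normal backoff"
  return CLOSE_CODE_POLICY[code] || { reconnect: true, minWaitMs: 0, note: 'unknown' }
}

// Usage
console.log(policyForCloseCode(1000).reconnect) // false
console.log(policyForCloseCode(1011).minWaitMs) // 5000
console.log(policyForCloseCode(4999).note)      // unknown
```

The `minWaitMs` value is meant as a floor that is added to, not a replacement for, the backoff delay.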
Close Codes Matter:
CODE   MEANING             ACTION
────────────────────────────────────────────────────────
1000   Normal closure      Don't reconnect (intentional)
1001   Server going away   Reconnect to different server
1006   Abnormal closure    Reconnect with backoff
1011   Server error        Wait longer, then reconnect
1012   Server restart      Wait, then reconnect
1013   Try again later     Wait, then reconnect
TCP connections can become "half-open": one side thinks it's connected, the other has disconnected. Without active checking, you might not know your connection is dead.
How rippled Handles This:
rippled implements WebSocket ping/pong frames. The server sends periodic ping frames; your client must respond with pong. If the server doesn't receive pong, it closes the connection.
Client-Side Heartbeat:
class HeartbeatManager {
  constructor(client, intervalMs = 30000) {
    this.client = client
    this.intervalMs = intervalMs
    this.heartbeatTimer = null
    this.lastPong = Date.now()
    this.missedPongs = 0
    this.maxMissedPongs = 3
  }

  start() {
    this.heartbeatTimer = setInterval(async () => {
      try {
        // The ping command is a lightweight way to check the connection
        const start = Date.now()
        await this.client.request({ command: 'ping' })
        this.lastPong = Date.now()
        this.missedPongs = 0
        const latency = this.lastPong - start
        console.log(`Heartbeat OK, latency: ${latency}ms`)
      } catch (error) {
        this.missedPongs++
        console.warn(`Heartbeat failed (${this.missedPongs}/${this.maxMissedPongs})`)
        if (this.missedPongs >= this.maxMissedPongs) {
          console.error('Too many missed heartbeats, connection presumed dead')
          this.stop() // Stop pinging a connection we are about to close
          this.client.disconnect().catch(() => {}) // Ignore errors during teardown
        }
      }
    }, this.intervalMs)
  }

  stop() {
    if (this.heartbeatTimer) {
      clearInterval(this.heartbeatTimer)
      this.heartbeatTimer = null
    }
  }
}
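An active ping is one way to detect a dead connection; a passive check can complement it. If the client is subscribed to the `ledger` stream, a message should arrive every few seconds, since XRPL ledgers typically close every 3 to 5 seconds, so a long silence suggests the connection is gone. The sketch below is an assumption-laden illustration: `LivenessTracker` is a hypothetical class, and the 15-second threshold is an arbitrary default you would tune.

```javascript
// Hypothetical passive liveness check based on time since the last message
class LivenessTracker {
  constructor(maxSilenceMs = 15000, now = Date.now) {
    this.maxSilenceMs = maxSilenceMs
    this.now = now // Injectable clock, handy for tests
    this.lastMessageAt = now()
  }

  // Call from every message handler (responses and stream events alike)
  recordMessage() {
    this.lastMessageAt = this.now()
  }

  // True when nothing has arrived for longer than the allowed silence
  isStale() {
    return this.now() - this.lastMessageAt > this.maxSilenceMs
  }
}

// Usage with a fake clock
let fakeTime = 0
const tracker = new LivenessTracker(15000, () => fakeTime)
fakeTime = 10000
console.log(tracker.isStale()) // false
fakeTime = 16000
console.log(tracker.isStale()) // true
tracker.recordMessage()
console.log(tracker.isStale()) // false
```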
// BAD: Reconnect immediately on disconnect
client.on('disconnected', async () => {
  await client.connect() // Don't do this!
})
Problems with immediate reconnection:
- If server is down, you hammer it with connection attempts
- If network is flaky, you create connection storms
- No backoff means no recovery time
- Could overwhelm both client and server
The standard pattern: wait longer after each failed attempt.
```javascript
class ExponentialBackoff {
  constructor(options = {}) {
    this.baseDelay = options.baseDelay || 1000   // Start with 1 second
    this.maxDelay = options.maxDelay || 60000    // Cap at 60 seconds
    this.multiplier = options.multiplier || 2    // Double each time
    this.attempt = 0
  }

  getDelay() {
    const delay = Math.min(
      this.baseDelay * Math.pow(this.multiplier, this.attempt),
      this.maxDelay
    )
    this.attempt++
    return delay
  }

  reset() {
    this.attempt = 0
  }
}

// Usage
const backoff = new ExponentialBackoff()
// First attempt: 1000ms
// Second: 2000ms
// Third: 4000ms
// Fourth: 8000ms
// Fifth: 16000ms
// ... until maxDelay (60000ms)
```
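To see how quickly the delays add up, the same formula can be run as a standalone function. This is a small sketch; `backoffSchedule` is a hypothetical helper that applies the same base, multiplier, and cap defaults as the class above, without jitter.

```javascript
// Hypothetical helper: delays for the first `attempts` retries, without jitter
function backoffSchedule({ baseDelay = 1000, multiplier = 2, maxDelay = 60000, attempts = 8 } = {}) {
  const delays = []
  for (let attempt = 0; attempt < attempts; attempt++) {
    delays.push(Math.min(baseDelay * Math.pow(multiplier, attempt), maxDelay))
  }
  return delays
}

const schedule = backoffSchedule()
const totalMs = schedule.reduce((sum, d) => sum + d, 0)
console.log(schedule) // [1000, 2000, 4000, 8000, 16000, 32000, 60000, 60000]
console.log(totalMs)  // 183000
```

With these defaults, eight attempts span roughly three minutes of waiting, which is worth knowing when choosing a maximum attempt count.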
Pure exponential backoff has a problem: if many clients disconnect simultaneously (for example, after a server restart), they all retry on the same schedule, creating a thundering herd.
Add randomness (jitter) to spread out reconnection attempts:
class BackoffWithJitter {
  constructor(options = {}) {
    this.baseDelay = options.baseDelay || 1000
    this.maxDelay = options.maxDelay || 60000
    this.multiplier = options.multiplier || 2
    this.jitterFactor = options.jitterFactor || 0.3 // ±30%
    this.attempt = 0
  }

  getDelay() {
    const baseDelay = Math.min(
      this.baseDelay * Math.pow(this.multiplier, this.attempt),
      this.maxDelay
    )
    // Add jitter: random value between -jitterFactor and +jitterFactor
    const jitter = baseDelay * this.jitterFactor * (Math.random() * 2 - 1)
    const delay = Math.max(0, baseDelay + jitter)
    this.attempt++
    return Math.round(delay)
  }

  reset() {
    this.attempt = 0
  }
}

// Example delays with 30% jitter:
// Base 1000ms → actual 700-1300ms
// Base 2000ms → actual 1400-2600ms
// Base 4000ms → actual 2800-5200ms
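The ±30% approach above is one of several jitter strategies. The AWS Architecture Blog article listed in the further reading describes a "full jitter" variant, where the delay is drawn uniformly between zero and the capped exponential value. The sketch below illustrates that variant; `fullJitterDelay` is a hypothetical name, and the injectable `random` parameter is an assumption added so the function can be tested deterministically.

```javascript
// Full jitter: uniform random delay in [0, min(maxDelay, baseDelay * 2^attempt))
function fullJitterDelay(attempt, { baseDelay = 1000, maxDelay = 60000, random = Math.random } = {}) {
  const ceiling = Math.min(baseDelay * Math.pow(2, attempt), maxDelay)
  return Math.floor(random() * ceiling)
}

// Usage: with a fixed "random" source the result is deterministic
console.log(fullJitterDelay(3, { random: () => 0.5 }))  // 4000
console.log(fullJitterDelay(10, { random: () => 0.5 })) // 30000

// In production, omit `random` so Math.random is used
console.log(fullJitterDelay(2) < 4000) // true
```

Full jitter spreads clients more widely than a ±30% band, at the cost of occasionally retrying almost immediately.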
Here's production-ready reconnection:
class ResilientXRPLClient {
constructor(url, options = {}) {
this.url = url
this.client = null
this.backoff = new BackoffWithJitter({
baseDelay: options.baseDelay || 1000,
maxDelay: options.maxDelay || 60000,
jitterFactor: options.jitterFactor || 0.3
})
this.maxAttempts = options.maxAttempts || 10
this.currentAttempt = 0
this.shouldReconnect = true
this.subscriptions = [] // Track active subscriptions
this.onReconnect = options.onReconnect || (() => {})
}
async connect() {
this.client = new xrpl.Client(this.url)
this.setupEventHandlers()
try {
await this.client.connect()
this.backoff.reset()
this.currentAttempt = 0
console.log('Connected to XRPL')
return this.client
} catch (error) {
console.error('Initial connection failed:', error)
throw error
}
}
setupEventHandlers() {
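this.client.on('disconnected', async (code) => {
console.log(`Disconnected with code: ${code}`)
if (code === 1000 || !this.shouldReconnect) {
console.log('Clean disconnect, not reconnecting')
return
}
try {
await this.attemptReconnect()
} catch (error) {
// attemptReconnect() throws once maxAttempts is exhausted; catch it here
// so the rejection does not go unhandled inside the event handler
console.error('Giving up on reconnection:', error.message)
}
})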
this.client.on('error', (error) => {
console.error('Connection error:', error)
})
}
async attemptReconnect() {
while (this.currentAttempt < this.maxAttempts && this.shouldReconnect) {
this.currentAttempt++
const delay = this.backoff.getDelay()
console.log(`Reconnection attempt ${this.currentAttempt}/${this.maxAttempts} in ${delay}ms`)
await this.sleep(delay)
if (!this.shouldReconnect) return
try {
this.client = new xrpl.Client(this.url)
this.setupEventHandlers()
await this.client.connect()
console.log('Reconnected successfully')
this.backoff.reset()
this.currentAttempt = 0
// Restore subscriptions
await this.restoreSubscriptions()
// Notify application of reconnection
this.onReconnect()
return
} catch (error) {
console.error(`Reconnection attempt ${this.currentAttempt} failed:`, error)
}
}
console.error('Max reconnection attempts reached')
throw new Error('Failed to reconnect after maximum attempts')
}
async restoreSubscriptions() {
for (const sub of this.subscriptions) {
try {
await this.client.request({
command: 'subscribe',
...sub.params
})
console.log(`Restored subscription: ${JSON.stringify(sub.params)}`)
} catch (error) {
console.error(`Failed to restore subscription:`, error)
}
}
}
// Track subscriptions for restoration after reconnect
async subscribe(params) {
const response = await this.client.request({
command: 'subscribe',
...params
})
// Store subscription for restoration
this.subscriptions.push({ params })
return response
}
async disconnect() {
this.shouldReconnect = false
if (this.client) {
await this.client.disconnect()
}
}
sleep(ms) {
return new Promise(resolve => setTimeout(resolve, ms))
}
isConnected() {
return this.client && this.client.isConnected()
}
async request(req) {
if (!this.isConnected()) {
throw new Error('Not connected')
}
return this.client.request(req)
}
}
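One detail in the class above: `subscribe()` appends to an array, so subscribing to the same stream twice stores it twice, and `restoreSubscriptions()` then sends one request per stored entry. A possible refinement is to merge the stored parameters into a single deduplicated request. The sketch below assumes only the `streams`, `accounts`, and `books` fields used in this lesson; `mergeSubscriptions` is a hypothetical helper.

```javascript
// Hypothetical helper: collapse stored subscriptions into one subscribe request
function mergeSubscriptions(subscriptions) {
  const streams = new Set()
  const accounts = new Set()
  const books = new Map() // Keyed by JSON so identical books collapse

  for (const { params } of subscriptions) {
    for (const s of params.streams || []) streams.add(s)
    for (const a of params.accounts || []) accounts.add(a)
    for (const b of params.books || []) books.set(JSON.stringify(b), b)
  }

  const request = { command: 'subscribe' }
  if (streams.size) request.streams = [...streams]
  if (accounts.size) request.accounts = [...accounts]
  if (books.size) request.books = [...books.values()]
  return request
}

// Usage
const merged = mergeSubscriptions([
  { params: { streams: ['ledger'] } },
  { params: { accounts: ['rN7n3473SaZBCG4dFL83w7a1RXtXtbk2D9'] } },
  { params: { streams: ['ledger'] } } // Duplicate, dropped by the Set
])
console.log(merged)
```

If the stored list is empty, the result contains only `command`, so a caller would skip sending it.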
Subscriptions tell the server to push updates when events occur:
// Subscribe to ledger closings
await client.request({
  command: 'subscribe',
  streams: ['ledger']
})

// Subscribe to specific account transactions
await client.request({
  command: 'subscribe',
  accounts: ['rN7n3473SaZBCG4dFL83w7a1RXtXtbk2D9']
})

// Subscribe to order book changes
await client.request({
  command: 'subscribe',
  books: [{
    taker_gets: { currency: 'XRP' },
    taker_pays: { currency: 'USD', issuer: 'rvYAfWj5gh67oV6fW32ZzP3Aw4Eubs59B' }
  }]
})
Events arrive as messages on the WebSocket connection:
// xrpl.js handles parsing; you register handlers
client.on('ledgerClosed', (ledger) => {
  console.log(`Ledger ${ledger.ledger_index} closed`)
  console.log(`Transactions: ${ledger.txn_count}`)
  // ledger_time counts seconds since the Ripple epoch (2000-01-01 UTC)
  console.log(`Time: ${new Date(ledger.ledger_time * 1000 + 946684800000)}`)
})

client.on('transaction', (tx) => {
  console.log(`Transaction: ${tx.transaction.hash}`)
  console.log(`Type: ${tx.transaction.TransactionType}`)
  console.log(`Result: ${tx.meta.TransactionResult}`)
})
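The `ledger_time` arithmetic in the handler above works because XRPL timestamps count seconds from 2000-01-01 UTC rather than from the Unix epoch. Pulling the conversion into a named function makes the intent clearer. This is a minimal sketch; `rippleTimeToDate` is a hypothetical helper name.

```javascript
// Seconds between the Unix epoch (1970-01-01) and the Ripple epoch (2000-01-01)
const RIPPLE_EPOCH_OFFSET_SECONDS = 946684800

// Convert an XRPL timestamp (seconds since the Ripple epoch) to a JavaScript Date
function rippleTimeToDate(rippleSeconds) {
  return new Date((rippleSeconds + RIPPLE_EPOCH_OFFSET_SECONDS) * 1000)
}

console.log(rippleTimeToDate(0).toISOString())     // 2000-01-01T00:00:00.000Z
console.log(rippleTimeToDate(86400).toISOString()) // 2000-01-02T00:00:00.000Z
```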
When connection drops and reconnects, you miss events that occurred during the gap:
```
Timeline:
──────────────────────────────────────────────────────────
t=0    Connected, subscribed to account
t=10   Ledger 100 closes (you receive notification)
t=15   Ledger 101 closes (you receive notification)
t=20   CONNECTION DROPS
t=21   Ledger 102 closes (you DON'T receive this)
t=25   Ledger 103 closes (you DON'T receive this)
t=30   Reconnected, resubscribed
t=35   Ledger 104 closes (you receive notification)
GAP: You missed ledgers 102 and 103
```
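The gap in the timeline can be computed directly from two numbers: the last ledger index seen before the drop and the first one seen after resubscribing. A minimal sketch, with `missedLedgers` as a hypothetical helper name:

```javascript
// Hypothetical helper: ledger indexes that closed between the last one seen
// before the drop and the first one seen after resubscribing
function missedLedgers(lastSeenBeforeDrop, firstSeenAfterReconnect) {
  const missed = []
  for (let i = lastSeenBeforeDrop + 1; i < firstSeenAfterReconnect; i++) {
    missed.push(i)
  }
  return missed
}

console.log(missedLedgers(101, 104)) // [102, 103]
console.log(missedLedgers(101, 102)) // []
```

An empty result means the stream resumed with the very next ledger and nothing needs backfilling.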
Handling the Gap:
class GapAwareSubscriptionManager {
  constructor(client, monitoredAccount) {
    this.client = client
    this.monitoredAccount = monitoredAccount // Account whose transactions are backfilled
    this.lastLedgerSeen = null
    this.lastDisconnectTime = null
  }

  // subscribe() and processTransaction() are application-specific and not shown here

  async handleReconnect() {
    if (this.lastLedgerSeen && this.lastDisconnectTime) {
      // Query for missed data
      await this.fillGap()
    }
    // Resubscribe
    await this.subscribe()
  }

  async fillGap() {
    console.log(`Filling gap from ledger ${this.lastLedgerSeen}`)
    // Get current validated ledger
    const serverInfo = await this.client.request({ command: 'server_info' })
    const currentLedger = serverInfo.result.info.validated_ledger.seq
    if (currentLedger > this.lastLedgerSeen) {
      // Query transactions we might have missed
      const response = await this.client.request({
        command: 'account_tx',
        account: this.monitoredAccount,
        ledger_index_min: this.lastLedgerSeen + 1,
        ledger_index_max: currentLedger
      })
      // Process missed transactions
      for (const tx of response.result.transactions) {
        console.log(`Gap fill: Found transaction ${tx.tx.hash}`)
        this.processTransaction(tx)
      }
    }
    this.lastLedgerSeen = currentLedger
  }

  onLedgerClosed(ledger) {
    this.lastLedgerSeen = ledger.ledger_index
  }

  onDisconnect() {
    this.lastDisconnectTime = Date.now()
  }
}
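The gap-filling query above reads a single `account_tx` response. After a long outage the result may span several pages: when more results remain, the response includes a `marker` value that you send back in the next request. The sketch below loops until no marker is returned. It assumes `client` is any object with an xrpl.js-style `request()` method; `fetchAllAccountTx` and the `maxPages` safety cap are hypothetical.

```javascript
// Hypothetical helper: follow account_tx pagination markers until exhausted
async function fetchAllAccountTx(client, params, maxPages = 20) {
  const transactions = []
  let marker
  for (let page = 0; page < maxPages; page++) {
    const request = { command: 'account_tx', ...params }
    if (marker !== undefined) request.marker = marker
    const response = await client.request(request)
    transactions.push(...response.result.transactions)
    marker = response.result.marker
    if (marker === undefined) break // No marker means this was the last page
  }
  return transactions
}

// Usage against a stub client that serves two pages
const pages = [
  { result: { transactions: ['tx1', 'tx2'], marker: { ledger: 5, seq: 1 } } },
  { result: { transactions: ['tx3'] } }
]
const stubClient = { request: async () => pages.shift() }
fetchAllAccountTx(stubClient, { account: 'rExample', ledger_index_min: 1, ledger_index_max: 10 })
  .then((all) => console.log(all.length)) // 3
```

If the loop stops at `maxPages` with a marker still present, the result is partial, so a production version would want to report that condition.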
```
DO:
✓ Track last seen ledger for gap detection
✓ Store subscription parameters for restoration
✓ Handle events idempotently (same event processed twice is safe)
✓ Log subscription state changes for debugging
DON'T:
✗ Assume you'll receive every event (gaps happen)
✗ Subscribe to everything (server may disconnect heavy subscribers)
✗ Process events synchronously if slow (buffer and process async)
✗ Ignore subscription errors (they indicate problems)
```
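Idempotent handling matters most when gap filling overlaps with the live stream, because the same transaction can then arrive twice. One simple way to achieve it is to remember recently processed transaction hashes. The sketch below is an illustration under assumptions: `SeenTransactions` is a hypothetical class, and the size limit is an arbitrary bound to keep memory use flat.

```javascript
// Hypothetical bounded set of recently processed transaction hashes
class SeenTransactions {
  constructor(maxSize = 10000) {
    this.maxSize = maxSize
    this.hashes = new Set() // A Set iterates in insertion order
  }

  // Returns true the first time a hash is offered, false on repeats
  markIfNew(hash) {
    if (this.hashes.has(hash)) return false
    this.hashes.add(hash)
    if (this.hashes.size > this.maxSize) {
      // Evict the oldest entry to keep memory bounded
      const oldest = this.hashes.values().next().value
      this.hashes.delete(oldest)
    }
    return true
  }
}

// Usage
const seenTx = new SeenTransactions(2)
console.log(seenTx.markIfNew('A')) // true
console.log(seenTx.markIfNew('A')) // false
seenTx.markIfNew('B')
seenTx.markIfNew('C')              // Evicts 'A'
console.log(seenTx.markIfNew('A')) // true again, because 'A' was evicted
```

A handler would call `markIfNew(hash)` first and skip processing when it returns false.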
Here's a full-featured connection manager combining all patterns:
const xrpl = require('xrpl')
const EventEmitter = require('events')
class ProductionXRPLClient extends EventEmitter {
constructor(servers, options = {}) {
super()
// Support multiple servers for failover
this.servers = Array.isArray(servers) ? servers : [servers]
this.currentServerIndex = 0
this.client = null
this.options = {
connectionTimeout: options.connectionTimeout || 10000,
requestTimeout: options.requestTimeout || 15000,
heartbeatInterval: options.heartbeatInterval || 30000,
maxReconnectAttempts: options.maxReconnectAttempts || 10,
baseReconnectDelay: options.baseReconnectDelay || 1000,
maxReconnectDelay: options.maxReconnectDelay || 60000,
...options
}
this.state = 'disconnected'
this.subscriptions = new Map()
this.lastLedgerIndex = null
this.heartbeatTimer = null
this.reconnectAttempts = 0
this.shouldReconnect = true
}
get currentServer() {
return this.servers[this.currentServerIndex]
}
nextServer() {
this.currentServerIndex = (this.currentServerIndex + 1) % this.servers.length
return this.currentServer
}
async connect() {
this.shouldReconnect = true
this.state = 'connecting'
try {
await this.establishConnection()
return this
} catch (error) {
this.state = 'disconnected'
throw error
}
}
async establishConnection() {
this.client = new xrpl.Client(this.currentServer, {
timeout: this.options.requestTimeout
})
this.setupEventHandlers()
// Connect with timeout
let timeoutId
const connectPromise = this.client.connect()
const timeoutPromise = new Promise((_, reject) => {
timeoutId = setTimeout(() => reject(new Error('Connection timeout')),
this.options.connectionTimeout)
})
try {
await Promise.race([connectPromise, timeoutPromise])
} finally {
clearTimeout(timeoutId) // Don't leave the timer pending after the race settles
}
this.state = 'connected'
this.reconnectAttempts = 0
// Start heartbeat
this.startHeartbeat()
// Restore subscriptions
await this.restoreSubscriptions()
// Fill any gaps
await this.fillGaps()
this.emit('connected', { server: this.currentServer })
console.log(`Connected to ${this.currentServer}`)
}
setupEventHandlers() {
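this.client.on('disconnected', async (code) => {
this.stopHeartbeat()
this.state = 'disconnected'
this.emit('disconnected', { code, server: this.currentServer })
if (code !== 1000 && this.shouldReconnect) {
try {
await this.handleReconnect()
} catch (error) {
// handleReconnect() has already emitted 'connectionFailed'; swallow the
// rejection here so it does not go unhandled inside the event handler
}
}
})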
this.client.on('error', (error) => {
this.emit('error', error)
})
this.client.on('ledgerClosed', (ledger) => {
this.lastLedgerIndex = ledger.ledger_index
this.emit('ledgerClosed', ledger)
})
this.client.on('transaction', (tx) => {
this.emit('transaction', tx)
})
}
async handleReconnect() {
this.state = 'reconnecting'
while (this.reconnectAttempts < this.options.maxReconnectAttempts &&
this.shouldReconnect) {
this.reconnectAttempts++
// Calculate delay with exponential backoff and jitter
const baseDelay = Math.min(
this.options.baseReconnectDelay * Math.pow(2, this.reconnectAttempts - 1),
this.options.maxReconnectDelay
)
const jitter = baseDelay * 0.3 * (Math.random() * 2 - 1)
const delay = Math.round(baseDelay + jitter)
console.log(`Reconnection attempt ${this.reconnectAttempts}/${this.options.maxReconnectAttempts} in ${delay}ms`)
this.emit('reconnecting', {
attempt: this.reconnectAttempts,
maxAttempts: this.options.maxReconnectAttempts,
delay
})
await this.sleep(delay)
if (!this.shouldReconnect) return
// Try next server if we've failed on current one
if (this.reconnectAttempts > 1 && this.servers.length > 1) {
const newServer = this.nextServer()
console.log(`Trying alternate server: ${newServer}`)
}
try {
await this.establishConnection()
this.emit('reconnected', {
server: this.currentServer,
attempts: this.reconnectAttempts
})
return
} catch (error) {
console.error(`Reconnection attempt ${this.reconnectAttempts} failed:`, error.message)
}
}
this.state = 'failed'
const error = new Error('Max reconnection attempts exceeded')
this.emit('connectionFailed', error)
throw error
}
startHeartbeat() {
this.heartbeatTimer = setInterval(async () => {
try {
await this.client.request({ command: 'ping' })
} catch (error) {
console.warn('Heartbeat failed:', error.message)
// Connection will detect failure and trigger reconnect
}
}, this.options.heartbeatInterval)
}
stopHeartbeat() {
if (this.heartbeatTimer) {
clearInterval(this.heartbeatTimer)
this.heartbeatTimer = null
}
}
async subscribe(id, params) {
const response = await this.client.request({
command: 'subscribe',
...params
})
// Store for restoration after reconnect
this.subscriptions.set(id, params)
return response
}
async unsubscribe(id) {
const params = this.subscriptions.get(id)
if (!params) return
await this.client.request({
command: 'unsubscribe',
...params
})
this.subscriptions.delete(id)
}
async restoreSubscriptions() {
for (const [id, params] of this.subscriptions) {
try {
await this.client.request({
command: 'subscribe',
...params
})
console.log(`Restored subscription: ${id}`)
} catch (error) {
console.error(`Failed to restore subscription ${id}:`, error.message)
}
}
}
async fillGaps() {
// Override in subclass for application-specific gap filling
}
async request(req) {
if (this.state !== 'connected') {
throw new Error(`Cannot make request: client is ${this.state}`)
}
return this.client.request(req)
}
async disconnect() {
this.shouldReconnect = false
this.stopHeartbeat()
if (this.client) {
await this.client.disconnect()
}
this.state = 'disconnected'
}
isConnected() {
return this.state === 'connected'
}
sleep(ms) {
return new Promise(resolve => setTimeout(resolve, ms))
}
}
module.exports = ProductionXRPLClient
Usage example (in a separate file from the class above):
```javascript
const ProductionXRPLClient = require('./ProductionXRPLClient')

async function main() {
  const client = new ProductionXRPLClient([
    'wss://s1.ripple.com:51233',
    'wss://s2.ripple.com:51233',
    'wss://xrplcluster.com'
  ], {
    maxReconnectAttempts: 15,
    heartbeatInterval: 30000
  })

  // Event handlers
  client.on('connected', ({ server }) => {
    console.log(`Connected to ${server}`)
  })
  client.on('disconnected', ({ code }) => {
    console.log(`Disconnected with code ${code}`)
  })
  client.on('reconnecting', ({ attempt, maxAttempts, delay }) => {
    console.log(`Reconnecting (${attempt}/${maxAttempts}) in ${delay}ms`)
  })
  client.on('reconnected', ({ server, attempts }) => {
    console.log(`Reconnected to ${server} after ${attempts} attempts`)
  })
  client.on('ledgerClosed', (ledger) => {
    console.log(`Ledger ${ledger.ledger_index} closed`)
  })
  // Without an 'error' listener, EventEmitter throws when 'error' is emitted
  client.on('error', (error) => {
    console.error('Client error:', error)
  })

  // Connect
  await client.connect()

  // Subscribe to ledger stream
  await client.subscribe('ledger', { streams: ['ledger'] })

  // Make requests
  const info = await client.request({ command: 'server_info' })
  console.log('Server version:', info.result.info.build_version)

  // Keep running
  process.on('SIGINT', async () => {
    console.log('Shutting down...')
    await client.disconnect()
    process.exit(0)
  })
}

main().catch(console.error)
```
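The `handleReconnect()` method in the class retries the current server once and then rotates to the next server on every later attempt. If you would rather give each server a few tries before moving on, the server choice can be expressed as a pure function of the attempt number. This is only a sketch: `serverForAttempt` and its `retriesPerServer` parameter are hypothetical, and it assumes the first entry in the list is the server that just dropped.

```javascript
// Hypothetical helper: pick a server for a 1-based reconnect attempt,
// trying each server `retriesPerServer` times before rotating
function serverForAttempt(servers, attempt, retriesPerServer = 2) {
  const index = Math.floor((attempt - 1) / retriesPerServer) % servers.length
  return servers[index]
}

// Usage
const servers = ['A', 'B', 'C']
const order = [1, 2, 3, 4, 5, 6, 7].map((n) => serverForAttempt(servers, n))
console.log(order.join(' ')) // A A B B C C A
```

Because the function has no state, the rotation order is easy to verify in a unit test before wiring it into the reconnect loop.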
✅ Exponential backoff with jitter is the standard: Industry-wide consensus that this prevents thundering herd and resource exhaustion
✅ WebSocket connections require active maintenance: Half-open connections are a real problem; heartbeats are necessary
✅ Subscriptions must survive reconnection: Applications that don't restore subscriptions silently fail
✅ Multiple servers improve reliability: Failover to alternate servers handles single-server outages
⚠️ Optimal heartbeat interval: 30 seconds is common, but depends on network and server configuration
⚠️ Maximum reconnection attempts: 10 is arbitrary; depends on application requirements and user experience
⚠️ Gap detection completeness: Some events may be impossible to recover after extended disconnection
⚠️ xrpl.js reconnection behavior: Library behavior may change between versions; test with your version
🔴 Assuming connection is stable: Production connections WILL drop; code must handle this
🔴 Not tracking subscriptions: After reconnect, you'll have no subscriptions and miss events
🔴 Immediate reconnection: Hammering servers without backoff can get you rate-limited or banned
🔴 Ignoring gaps: Business-critical applications must detect and fill subscription gaps
WebSocket connection management is unglamorous but essential. The code in this lesson isn't exciting—it's defensive programming against network reality. Skip this work at your peril: your demo will work perfectly, and your production system will silently stop receiving events when the network hiccups. Every hour spent on connection resilience saves days of debugging production incidents.
Assignment: Build a production-ready WebSocket client for XRPL.
Requirements:
Part 1: Core Connection Management (40%)
- Connects with configurable timeout
- Detects disconnection via close event and heartbeat failure
- Reconnects with exponential backoff (configurable base, max, jitter)
- Supports multiple servers with automatic failover
- Emits events for all state changes
Part 2: Subscription Management (30%)
- Track active subscriptions by ID
- Restore all subscriptions after reconnection
- Support subscribe/unsubscribe methods
- Log subscription state changes
Part 3: Gap Detection (30%)
- Track last seen ledger index
- Detect gaps after reconnection
- Log detected gaps (actual gap filling is application-specific)
Test Cases:
- Connects to server successfully
- Reconnects after server-initiated disconnect
- Backs off appropriately (verify timing)
- Fails over to alternate server
- Restores subscriptions after reconnect
- Detects ledger gaps
Grading Criteria:
- Correctness of reconnection logic: 30%
- Proper backoff implementation: 25%
- Subscription restoration: 25%
- Code quality and documentation: 20%
Time Investment: 3-4 hours
Submission: JavaScript/TypeScript module with usage example and test results
Value: This client becomes the foundation for all your XRPL applications. Invest in getting it right.
1. Reconnection Strategy (Tests Understanding):
Why is exponential backoff with jitter preferred over fixed-interval reconnection?
A) It's faster—you reconnect more quickly
B) It prevents thundering herd when many clients reconnect simultaneously
C) It's required by the WebSocket specification
D) It uses less bandwidth
Correct Answer: B
Explanation: When a server restarts, all connected clients disconnect simultaneously. With fixed intervals, they all try to reconnect at the same times, overwhelming the server. Exponential backoff spreads attempts over time, and jitter ensures clients don't synchronize on the exponential intervals. Option A is wrong—backoff is slower, not faster. Option C is wrong—this is a best practice, not a requirement. Option D is tangentially related but not the primary reason.
2. Subscription Behavior (Tests Knowledge):
What happens to your WebSocket subscriptions when the connection drops and reconnects?
A) They are automatically restored by the server
B) They persist because WebSocket maintains state
C) They are lost—you must resubscribe after reconnecting
D) They are queued and resume when connection is restored
Correct Answer: C
Explanation: WebSocket subscriptions exist only for the lifetime of a connection. When connection drops, the server forgets your subscriptions. After reconnecting, you have a new connection with no subscriptions. Your application must track subscriptions and restore them after reconnection. The server has no memory of previous connection state.
3. Gap Handling (Tests Critical Thinking):
Your payment monitoring application was disconnected for 30 seconds. During that time, ledgers 1000-1005 closed. After reconnecting, what should your application do?
A) Nothing—subscriptions will catch you up automatically
B) Query account_tx for the missed ledger range to find any transactions
C) Assume no payments were received since you didn't see them
D) Restart the application to ensure clean state
Correct Answer: B
Explanation: Subscriptions only deliver events in real-time; they don't provide history. During disconnection, you missed any events. For payment monitoring, you must query the ledger for transactions that occurred in the gap (ledgers 1000-1005) to ensure you don't miss payments. Option A is wrong—subscriptions don't catch up. Option C could cause you to miss real payments. Option D doesn't address the gap.
4. Heartbeat Purpose (Tests Comprehension):
What problem does implementing a client-side heartbeat solve?
A) It makes the connection faster
B) It detects half-open connections where the client thinks it's connected but the server has disconnected
C) It prevents the server from disconnecting idle clients
D) It reduces bandwidth usage
Correct Answer: B
Explanation: TCP connections can become "half-open"—one side has closed but the other hasn't received notification (network issue, abrupt failure). Without heartbeats, your application might try to use a dead connection indefinitely. Periodic heartbeats (ping/pong or request/response) detect this condition so you can reconnect. Option C is a secondary benefit but not the primary purpose.
5. Server Failover (Tests Application):
Your application is configured with three servers: [A, B, C]. Server A drops your connection. What's the correct failover behavior?
A) Immediately connect to server B
B) Retry server A with backoff, then try B, then C
C) Try all servers simultaneously and use first to connect
D) Alert an operator to manually select a server
Correct Answer: B
Explanation: Proper failover uses backoff before trying alternate servers. The first disconnection might be transient; immediately jumping to B abandons A too quickly. Retry A with backoff first, then cycle through other servers. Option A doesn't use backoff. Option C wastes resources. Option D isn't automated failover.
- RFC 6455 (WebSocket Protocol): https://tools.ietf.org/html/rfc6455
- MDN WebSocket Guide: https://developer.mozilla.org/en-US/docs/Web/API/WebSocket
- Subscribe Method: https://xrpl.org/subscribe.html
- WebSocket API: https://xrpl.org/websocket-api-tool.html
- "Release It!" by Michael Nygard (Circuit breaker, bulkhead patterns)
- AWS Architecture Blog: Exponential Backoff and Jitter
- GitHub: https://github.com/XRPLF/xrpl.js
- Client implementation for reference
For Next Lesson:
Lesson 3 covers JSON-RPC—the simpler but stateless alternative to WebSocket. We'll examine when stateless requests make more sense and how to implement them efficiently.
End of Lesson 2
Total words: ~5,200
Estimated completion time: 55 minutes reading + 3-4 hours for deliverable
- Transforms students from "it works in demo" to "it works in production" mindset
- Provides copy-paste production code (the deliverable is genuinely useful)
- Covers failure modes before students discover them in production
- Establishes patterns used throughout the course
- "My app stopped receiving payments" → No reconnection logic
- "Why did my app miss transactions?" → No gap detection
- "Server is getting hammered" → No backoff
- "Works on my machine" → No handling of real network conditions
Code Provided:
The lesson provides substantial code. This is intentional—WebSocket resilience is boilerplate that everyone needs. Students should focus on understanding and customizing, not reinventing.
Lesson 3 Setup:
After the complexity of WebSocket, Lesson 3's JSON-RPC will feel refreshingly simple. This contrast helps students appreciate when simpler approaches are appropriate.
Key Takeaways
WebSocket requires lifecycle management:
Connect, maintain, detect failures, reconnect—your code must handle the full lifecycle, not just the happy path.
Exponential backoff with jitter is mandatory:
Immediate reconnection causes thundering herd problems. Always back off, always add randomness.
Subscriptions don't survive disconnection:
You must track what you're subscribed to and restore subscriptions after reconnecting.
Gaps happen and must be handled:
During disconnection, you miss events. Production applications detect gaps and query for missed data.
Multiple servers provide resilience:
Configure fallback servers. If one is down, try another. Don't rely on a single endpoint.