Executive Summary
This document presents the architectural validation and results of modernizing a professional trading terminal for an international crypto broker. The terminal was migrated from a traditional React DOM architecture to an Off-Main-Thread (OMT) pattern built on the Canvas API, Web Workers, and Protocol Buffers.
Key Business Outcomes:
- Performance: stable 60/144 FPS while processing 5000+ deltas/sec stream
- Latency: Data-to-Pixel Latency < 16 ms (in-app processing time)
- Traffic: 70% payload reduction (JSON → Protobuf) — critical for mobile networks
- Safety: elimination of rounding errors in PnL (Decimal.js) and data integrity guarantee (Gap Detection)
Market Context: We do not trade at HFT speeds (that requires FPGAs and fiber optics); we display data that arrives at such speeds. Data-to-Pixel Latency < 16 ms means our code processes data within one frame (≈16.7 ms at 60 Hz). Network latency is beyond our control.
Technology Stack: TypeScript, React, Web Workers (OMT), OffscreenCanvas (Immediate Mode), Protobuf, BigInt (Matching Core) + Decimal.js (UI Formatting).
1. Problem Statement: React DOM Limits
1.1 Initial System State
The terminal was a classic React SPA that used useState/useReducer to manage quote state. The WebSocket connection processed the JSON stream on the Main Thread.
1.2 Identified Bottlenecks
Table 1. Legacy Terminal Performance Matrix
| Component | Problem | UX Impact |
|---|---|---|
| React Virtual DOM | Every price tick triggers full tree Reconciliation | Frame drops (Jank) |
| JSON Parsing | Synchronous deserialization blocks Event Loop | Click processing delay |
| Main Thread | All operations in single thread (Single-Threaded) | Input Lag up to 250ms |
| IEEE 754 Math | 0.1 + 0.2 ≠ 0.3 in JavaScript | PnL errors at $10k+ |
| Text Rendering | DOM Layout for each order book cell | 90%+ CPU on rendering |
1.3 Quantitative Problem Assessment
Data stream: 5,000 updates/sec (Order Book + Trades)
React setState: ~1ms per call
Reconciliation: ~5ms per 1000 elements
DOM Paint: ~10ms per frame
Total: 5000 × 1ms = 5 sec of setState work per second of wall-clock time
→ Impossible to process in real time
Conclusion: The React DOM architecture cannot keep up with a data stream at this frequency on a single thread. A fundamental redesign of the rendering pipeline is required.
2. Architectural Decisions
2.1 Off-Main-Thread Architecture (OMT)
We applied a pattern from the GameDev industry: separation into processing and rendering threads.
Fig. 1. Data flow architecture. Main Thread only handles input processing and displays ready bitmaps. All heavy lifting is offloaded to Workers.
2.2 Technology Stack Rationale
2.2.1 Canvas API vs React DOM
Table 2. Rendering Approach Comparison
| Criterion | React DOM | Canvas API (Imperative) |
|---|---|---|
| Update 1000 cells | ~15ms (Reconciliation) | ~0.5ms (fillRect batch) |
| Memory Overhead | 1 DOM Node = ~1KB | 1 Pixel = 4 bytes (RGBA) |
| Layout Thrashing | Yes (reflow on every tick) | No (pixel-perfect control) |
| Accessibility | Built-in | Requires ARIA overlay |
| Development Complexity | Low | High (GameDev patterns) |
Rendering Technology Selection
- Selected: Canvas API (Imperative). Pixel-perfect control without DOM reconciliation; a 1000-cell update takes ~0.5ms instead of ~15ms.
- Rejected: SVG. DOM overhead at high update frequencies.
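As a rough illustration of the imperative approach, the sketch below repaints order book rows with a few `fillRect`/`fillText` calls per row. `Ctx2D` is a hypothetical minimal subset of `CanvasRenderingContext2D` (so the sketch does not depend on DOM typings); `Row`, `drawOrderBook`, the colors, and the row geometry are assumptions for illustration, not the terminal's actual renderer.

```typescript
// Assumed minimal subset of CanvasRenderingContext2D used by this sketch.
interface Ctx2D {
  fillStyle: string | object;
  clearRect(x: number, y: number, w: number, h: number): void;
  fillRect(x: number, y: number, w: number, h: number): void;
  fillText(text: string, x: number, y: number): void;
}

// Hypothetical pre-formatted row; depth is the level's share of visible volume (0..1).
interface Row {
  price: string;
  size: string;
  depth: number;
  side: 'bid' | 'ask';
}

const ROW_HEIGHT = 18;

function drawOrderBook(ctx: Ctx2D, rows: Row[], width: number): void {
  // Immediate mode: wipe the area and repaint every row each frame.
  ctx.clearRect(0, 0, width, rows.length * ROW_HEIGHT);
  rows.forEach((row, i) => {
    const y = i * ROW_HEIGHT;
    // Depth bar behind the text.
    ctx.fillStyle = row.side === 'bid' ? 'rgba(0,160,80,0.25)' : 'rgba(200,40,40,0.25)';
    ctx.fillRect(0, y, width * row.depth, ROW_HEIGHT);
    // Text is drawn directly; no DOM node or layout pass is involved.
    ctx.fillStyle = '#e0e0e0';
    ctx.fillText(row.price, 8, y + 13);
    ctx.fillText(row.size, width / 2, y + 13);
  });
}
```

In a browser, the object returned by `canvas.getContext('2d')` would be passed as `ctx`.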
2.2.2 Protocol Buffers vs JSON
Table 3. Serialization Format Comparison
| Parameter | JSON | Protocol Buffers | Improvement |
|---|---|---|---|
| Packet size (Order Book) | 2.4 KB | 0.7 KB | 3.4× smaller |
| Parse time (1000 msg) | 45ms | 8ms | 5.6× faster |
| Typing | Runtime checks | Compile-time schema | Safer |
| Backward Compatibility | Fragile | Built-in (field numbers) | — |
Serialization Format Selection
- Selected: Protocol Buffers. Compile-time schema with type generation; 3.4× smaller packet size, 5.6× faster parsing.
- Rejected: MessagePack. No strict schema and no type generation.
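Part of the size difference comes from the wire format itself: Protocol Buffers replaces field names with numeric tags and encodes integers as base-128 varints. The sketch below hand-encodes a toy two-field message to show the effect. The message shape (`priceTicks`, `sizeLots`), the field numbers, and the sample values are hypothetical, and the ratio it prints is for this toy message only, not the measured figure in Table 3.

```typescript
// Base-128 varint as used by the Protocol Buffers wire format:
// 7 payload bits per byte, high bit set on every byte except the last.
function encodeVarint(value: number): number[] {
  if (!Number.isSafeInteger(value) || value < 0) {
    throw new RangeError('expects a non-negative safe integer');
  }
  const bytes: number[] = [];
  let v = value;
  while (v > 0x7f) {
    bytes.push((v % 128) | 0x80);
    v = Math.floor(v / 128);
  }
  bytes.push(v);
  return bytes;
}

// Hypothetical delta: field 1 = price in ticks, field 2 = size in lots (both wire type 0, varint).
function encodeDelta(priceTicks: number, sizeLots: number): Uint8Array {
  const tag = (field: number): number => (field << 3) | 0; // (field number << 3) | wire type
  return Uint8Array.from([
    tag(1), ...encodeVarint(priceTicks),
    tag(2), ...encodeVarint(sizeLots),
  ]);
}

const binary = encodeDelta(4250075, 120);
const json = JSON.stringify({ priceTicks: 4250075, sizeLots: 120 });
console.log(`binary: ${binary.length} bytes, JSON: ${json.length} bytes`);
```

In production the encoder would be generated from the `.proto` schema rather than written by hand; the manual version is only meant to make the byte layout visible.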
2.2.3 Web Workers vs Single Thread
Table 4. Concurrency Model Comparison
| Scenario | Single Thread | Web Workers |
|---|---|---|
| Flash Crash (50k msg/sec) | UI Freeze 5+ sec | UI responsive |
| CPU Utilization | 100% Main Thread | Distributed across cores |
| Input Latency | 100-500ms | < 16 ms |
| Debugging | Simple | Requires DevTools expertise |
3. Performance Mechanisms
3.1 Message Conflation (Stream Compression)
At 5000 msg/sec, rendering every update is pointless: a 60 Hz monitor displays only 60 frames per second.
Fig. 2. Conflation Pattern. 5000 incoming messages merge into 60 state snapshots, synchronized with monitor refresh rate.
Principle: The last value for each price level wins (Latest Value Wins). Snapshot dispatch is synchronized with requestAnimationFrame, so it happens at most once per display frame (60 times/sec at 60 Hz). Data is handed over as Transferable Objects for zero-copy transfer.
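    class OrderBookConflator {
      // Latest update per price level; a newer update overwrites the older one.
      private buffer = new Map<string, PriceLevel>();
      private frameRequested = false;

      onMessage(update: PriceLevelUpdate) {
        this.buffer.set(update.price, update);
        // Schedule at most one flush per display frame.
        if (!this.frameRequested) {
          this.frameRequested = true;
          requestAnimationFrame(() => this.flush());
        }
      }

      private flush() {
        const snapshot = Array.from(this.buffer.values());
        this.buffer.clear();
        this.frameRequested = false;
        // A plain array has no .buffer and is not transferable, so it is sent by
        // structured clone here. A zero-copy transfer requires packing the levels
        // into a typed array and listing its ArrayBuffer in the transfer list.
        renderWorker.postMessage({ type: 'SNAPSHOT', data: snapshot });
      }
    }
3.2 OffscreenCanvas & Zero-Copy Transfer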
Mechanism: Control of the canvas is transferred to the Render Worker via transferControlToOffscreen(). The resulting OffscreenCanvas is passed as a Transferable, so nothing is copied, and the Main Thread is freed from rendering work to handle user input. React remains only as a shell that hosts the canvas placeholder.
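The handover can be sketched as follows. `TransferableCanvas` and `RenderWorkerLike` are structural stand-ins for `HTMLCanvasElement` and `Worker` (assumed so that the sketch type-checks without DOM typings), and the `'INIT'` message type and the `attachRenderWorker` name are hypothetical.

```typescript
// Stand-in for HTMLCanvasElement: only the method this sketch needs.
interface TransferableCanvas {
  transferControlToOffscreen(): object;
}

// Stand-in for Worker: postMessage with a transfer list.
interface RenderWorkerLike {
  postMessage(message: unknown, transfer: object[]): void;
}

function attachRenderWorker(canvas: TransferableCanvas, worker: RenderWorkerLike): void {
  // After this call the main thread can no longer obtain a drawing context for the canvas.
  const offscreen = canvas.transferControlToOffscreen();
  // Listing the OffscreenCanvas in the transfer list moves ownership to the worker
  // instead of cloning it.
  worker.postMessage({ type: 'INIT', canvas: offscreen }, [offscreen]);
}
```

In a browser, the React shell would call this once with the real canvas element (for example from a ref) and the render worker instance; the worker would then call `getContext('2d')` on the received OffscreenCanvas.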
3.3 Safe Financial Math (BigInt + Decimal.js)
Critical Architectural Decision: 5000 msg/sec with 20+ fields each means roughly 100,000 operations/sec. Arbitrary-precision libraries (BigNumber) create new objects on the heap for every operation, which overloads the Garbage Collector.
IEEE 754 Problem: 0.1 + 0.2 !== 0.3 in JavaScript (the result is 0.30000000000000004). At billion-dollar volumes, this leads to PnL errors at $10k+.
Solution: Separation of concerns:
- BigInt (Scaled Integers) in the aggregation core (Conflation Engine): price is stored as a BigInt with a fixed scale of 10^8. An order of magnitude faster, with minimal GC load.
- Decimal.js only at the final formatting stage for the UI: precision: 20, ROUND_HALF_UP. Called 60 times/sec (frame rate), not 100,000 times/sec.
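A minimal sketch of the scaled-integer idea, assuming prices arrive as decimal strings. The helper names `toScaled` and `formatScaled` are hypothetical, and the plain string formatting here is only a stand-in for the Decimal.js formatting step, which this sketch does not use.

```typescript
const SCALE_DIGITS = 8;                    // fixed scale 10^8
const SCALE = BigInt('100000000');

// Parse a decimal string such as "0.1" into a scaled integer without going through a float.
function toScaled(text: string): bigint {
  const match = /^(-?)(\d+)(?:\.(\d{1,8}))?$/.exec(text);
  if (!match) throw new RangeError(`unsupported decimal: ${text}`);
  const [, sign, whole, frac = ''] = match;
  const magnitude = BigInt(whole) * SCALE + BigInt(frac.padEnd(SCALE_DIGITS, '0'));
  return sign === '-' ? -magnitude : magnitude;
}

// Render a scaled integer back to text with all 8 fractional digits.
function formatScaled(value: bigint): string {
  const negative = value < BigInt(0);
  const abs = negative ? -value : value;
  const whole = abs / SCALE;
  const frac = (abs % SCALE).toString().padStart(SCALE_DIGITS, '0');
  return `${negative ? '-' : ''}${whole}.${frac}`;
}

const sum = toScaled('0.1') + toScaled('0.2');
console.log(sum === toScaled('0.3'));  // true: integer addition is exact
console.log(formatScaled(sum));        // "0.30000000"
console.log(0.1 + 0.2 === 0.3);        // false: the IEEE 754 result is 0.30000000000000004
```

Addition and subtraction stay at the same scale; multiplying two scaled values would need a division by the scale afterwards, which is omitted here.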
3.4 Circuit Breaker (Overload Protection)
During a Flash Crash, the stream can spike to 50,000 msg/sec. Without protection, the terminal runs Out of Memory. The Circuit Breaker triggers when the queue exceeds a threshold (2000 messages): it flushes the buffer, requests a full snapshot from the server, and recovers in 1 second.
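One way to sketch this behavior is a bounded queue that drops its backlog and asks for a snapshot once the threshold is exceeded. The class name `CircuitBreakerQueue`, its methods, and the `requestSnapshot` callback are hypothetical; the 1-second recovery timing is not modeled.

```typescript
class CircuitBreakerQueue<T> {
  private queue: T[] = [];
  private tripped = false; // true while waiting for the full snapshot
  private readonly threshold: number;
  private readonly requestSnapshot: () => void;

  constructor(threshold: number, requestSnapshot: () => void) {
    this.threshold = threshold;
    this.requestSnapshot = requestSnapshot;
  }

  // Returns false when the message was dropped.
  push(message: T): boolean {
    if (this.tripped) return false; // deltas are discarded until the snapshot is applied
    this.queue.push(message);
    if (this.queue.length > this.threshold) {
      this.queue.length = 0;        // flush the backlog instead of growing without bound
      this.tripped = true;
      this.requestSnapshot();       // ask the server for a full order book
      return false;
    }
    return true;
  }

  // Called by the consumer to take everything queued so far.
  drain(): T[] {
    const batch = this.queue;
    this.queue = [];
    return batch;
  }

  // Called once the full snapshot has been applied; normal processing resumes.
  reset(): void {
    this.tripped = false;
  }
}
```

With the threshold from the text, the queue would be created as `new CircuitBreakerQueue(2000, requestFullSnapshot)`, where `requestFullSnapshot` stands for whatever call the terminal uses to ask the server for a snapshot.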
3.5 Gap Detection (Data Integrity Control)
At 5000 msg/sec, packets can be lost. If a single delta packet is lost, the entire Order Book becomes invalid and prices "drift".
Mechanism: Every message contains a sequence number. On gap detection:
- Small gap (≤5 packets): micro-snapshot request for missing data only
- Critical gap (>5 packets): full resync from server
Trader Guarantee: The terminal requests the missing data immediately and blocks the interface for milliseconds while it arrives. The trader never sees a false price.
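The decision rule above can be sketched as a small sequence tracker. The names `SequenceTracker`, `GapAction`, and `resumeAt` are hypothetical, and the handling of duplicate or stale sequence numbers (ignored here) is an assumption that the text does not specify.

```typescript
type GapAction =
  | { kind: 'apply' }
  | { kind: 'ignore' }                                    // duplicate or stale message
  | { kind: 'micro-snapshot'; from: number; to: number }  // request only the missing range
  | { kind: 'full-resync' };

class SequenceTracker {
  private expected: number | null = null;
  private readonly maxSmallGap: number;

  constructor(maxSmallGap = 5) {
    this.maxSmallGap = maxSmallGap;
  }

  check(seq: number): GapAction {
    // The first message, or the one we were waiting for: apply and advance.
    if (this.expected === null || seq === this.expected) {
      this.expected = seq + 1;
      return { kind: 'apply' };
    }
    if (seq < this.expected) return { kind: 'ignore' };
    const missing = seq - this.expected;
    if (missing <= this.maxSmallGap) {
      return { kind: 'micro-snapshot', from: this.expected, to: seq - 1 };
    }
    return { kind: 'full-resync' };
  }

  // Called after the missing range or a full snapshot has been applied.
  resumeAt(nextSeq: number): void {
    this.expected = nextSeq;
  }
}
```

On a gap the tracker deliberately does not advance: the caller holds the current message, repairs the book, calls `resumeAt`, and only then feeds the held message back through `check`.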
3.6 Object Pooling (Fighting Garbage Collector)
To reduce GC load, the workers use the Object Pooling pattern: message objects are reused instead of being created anew. 1000 objects are pre-allocated at startup and acquired/released in the hot path.
Result: Micro-freezes (Jank) caused by the garbage collector are eliminated. GC pauses dropped from 50-100ms to < 5ms.
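A minimal pool along these lines might look as follows. `ObjectPool`, `TickMessage`, and its fields are hypothetical names, and falling back to a fresh allocation when the pool is empty is an assumption rather than something the text specifies.

```typescript
class ObjectPool<T> {
  private readonly free: T[] = [];
  private readonly create: () => T;
  private readonly reset: (item: T) => void;

  constructor(create: () => T, reset: (item: T) => void, size = 1000) {
    this.create = create;
    this.reset = reset;
    // Pre-allocate once at startup so the hot path does not allocate.
    for (let i = 0; i < size; i++) this.free.push(create());
  }

  acquire(): T {
    // Assumption: allocate a new object if the pool is exhausted.
    return this.free.pop() ?? this.create();
  }

  release(item: T): void {
    this.reset(item); // clear stale data before the object is reused
    this.free.push(item);
  }

  get available(): number {
    return this.free.length;
  }
}

// Hypothetical message shape for illustration.
interface TickMessage {
  priceTicks: number;
  sizeLots: number;
  seq: number;
}

const pool = new ObjectPool<TickMessage>(
  () => ({ priceTicks: 0, sizeLots: 0, seq: 0 }),
  (m) => { m.priceTicks = 0; m.sizeLots = 0; m.seq = 0; },
);
```

In the hot path a decoder would call `pool.acquire()`, fill the fields in place, hand the object to the conflation step, and call `pool.release()` once the value has been merged.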
4. Results and Metrics
4.1 Performance Comparative Analysis
Table 5. Key Metrics: Legacy vs OMT Architecture
| Metric | React DOM (Legacy) | Canvas + Workers (New) | Improvement |
|---|---|---|---|
| FPS (5k msg/sec) | 10-15 (Janky) | 60 (Stable) | 4-6× |
| Input Latency | 100-250ms | < 16 ms | 6-15× |
| CPU Usage | 95% (Main Thread) | 30% (Distributed) | 3× |
| Memory | 150MB (DOM nodes) | 45MB (Buffers) | 3.3× |
| Network | 5 Mbps (JSON) | 1.5 Mbps (Protobuf) | 3.3× |
| Parse Time | 45ms/1000 msg | 8ms/1000 msg | 5.6× |
4.2 Data-to-Pixel Latency Breakdown
Table 6. In-App Processing Latency Decomposition (Client Processing Latency)
| Stage | Legacy | New Architecture | Savings |
|---|---|---|---|
| JSON Parse | 15ms | — | 15ms |
| Protobuf Decode | — | 2ms | — |
| React Reconciliation | 30ms | — | 30ms |
| Canvas Draw | — | 3ms | — |
| DOM Paint | 20ms | — | 20ms |
| Bitmap Transfer | — | 1ms | — |
| Total (Data-to-Pixel) | 65ms | 6ms | 59ms |
Note: Network RTT is excluded from the table because we control only in-app processing. A Data-to-Pixel latency below 16 ms means a message reaches the screen within one frame at 60 Hz.
4.3 Business Results
- VIP Retention: "lag" complaints reduced to zero
- Session Duration: +25% average time in terminal
- Trading Volume: +18% trade volume (UX correlation)
- Mobile Traffic: $15k/month CDN savings (Protobuf compression)
5. Risks and Mitigation
Table 7. Risk Matrix
| Risk | Probability | Impact | Mitigation |
|---|---|---|---|
| Safari without OffscreenCanvas | Medium | High | Fallback to Main Thread Canvas |
| Memory Leaks in Workers | Low | High | Object Pooling, profiling |
| Debugging complexity | High | Medium | Structured logging, Replay tools |
| Accessibility (a11y) | Medium | Medium | ARIA overlay for screen readers |
6. Conclusions and Recommendations
This architectural transformation confirmed that Off-Main-Thread Architecture using OffscreenCanvas, Web Workers, and Protocol Buffers is the only path to creating a professional terminal for high-frequency trading data visualization on the Web platform.
Key Takeaways:
- OffscreenCanvas eliminates DOM overhead and provides pixel-perfect control
- Web Workers isolate heavy computations from UI thread, guaranteeing responsiveness
- Protocol Buffers reduce traffic by about 3.3× and parsing time by 5.6×
- BigInt + Decimal.js — separation of concerns: speed in core, precision in UI
- Gap Detection + Object Pooling — data integrity guarantee and GC stability
- Conflation — mandatory pattern for high-frequency data streams
Recommendation: This architecture is applicable to any real-time application with a high update frequency: IoT monitoring, live dashboards, collaborative editors, online gaming.