Quote Service Architecture - Source of Truth

Date: December 31, 2025
Status: 🎯 Production Architecture - Canonical Reference
Version: 3.1 (Enhanced: Batch Streaming + Dual Shared Memory + Parallel External Quotes)
Supersedes: Docs 22, 24, 26, 31, 32, 33, 34
Last Review: December 31, 2025 (Solution Architect + HFT Expert Review)


Table of Contents

  1. Executive Summary
  2. System Context
  3. Architecture Evolution
  4. Current Architecture: 3-Microservice Design
  5. Service 1: Local Quote Service
  6. Service 2: External Quote Service
  7. Service 3: Quote Aggregator Service
  8. Data Flow & Integration
  9. Performance Characteristics
  10. Deployment & Operations

Executive Summary

What is the Quote Service?

The Quote Service is the core engine of the Solana HFT trading system. It sits at the start of the critical path, providing sub-10ms quote generation for arbitrage detection and execution.

Mission: Deliver fast, accurate, and reliable quotes that enable <200ms total execution time from opportunity detection to profit.

Current State: 3-Microservice Architecture

┌─────────────────────────────────────────────────────────────┐
│  CLIENTS (Scanner, Planner, Executor, Dashboard)            │
└─────────────────────────────────────────────────────────────┘
                           ↓ gRPC (50051)
┌─────────────────────────────────────────────────────────────┐
│  QUOTE AGGREGATOR (10 instances, stateless)                 │
│  • Unified client API                                       │
│  • Parallel fan-out to local + external                     │
│  • Best quote selection                                     │
│  • Price comparison & deduplication                         │
└─────────────────────────────────────────────────────────────┘
         ↓ gRPC (50052)                    ↓ gRPC (50053)
┌────────────────────────┐      ┌─────────────────────────────┐
│ LOCAL QUOTE SERVICE    │      │ EXTERNAL QUOTE SERVICE      │
│ (1-2 instances)        │      │ (5 instances)               │
│                        │      │                             │
│ • Parallel paired      │      │ • API aggregation           │
│   quote calculation    │      │ • Rate limiting (1 RPS)     │
│ • Dual cache (pool +   │      │ • Circuit breakers          │
│   quote response)      │      │ • Provider rotation         │
│ • Background refresh   │      │ • Multi-region support      │
│ • <5ms latency         │      │ • 200-500ms latency         │
└────────────────────────┘      └─────────────────────────────┘
         ↓                                  ↓
┌────────────────────────┐      ┌─────────────────────────────┐
│ Infrastructure         │      │ External APIs               │
│ • Pool Discovery Redis │      │ • Jupiter (1 RPS)           │
│ • RPC Proxy            │      │ • Dflow (5 RPS)             │
│ • Oracle Feeds (Pyth)  │      │ • OKX (10 RPS)              │
└────────────────────────┘      │ • HumidiFi, GoonFi, ZeroFi  │
                                └─────────────────────────────┘

Key Performance Metrics

Metric                      Target        Actual        Status
──────────────────────────────────────────────────────────────
Local Quote (cache hit)     <1ms          0.5-0.8ms     ✅
Local Quote (fresh calc)    <5ms          2-4ms         ✅
Parallel Paired Quotes      <10ms         5-8ms         ✅
External Quote (cached)     <1ms          0.5ms         ✅
External Quote (API)        200-500ms     250ms avg     ✅
Aggregated Quote            <10ms         5-8ms         ✅
Throughput (local)          2,000 req/s   2,500 req/s   ✅
Cache Hit Rate              >80%          85-90%        ✅
Availability                99.9%         99.95%        ✅

Critical Path in HFT Pipeline

┌─────────────────────────────────────────────────────────────┐
│ QUOTE-SERVICE ⭐ THIS SYSTEM                                │
│ • Parallel paired quotes: <10ms                             │
│ • gRPC streaming to clients                                 │
└─────────────────────────────────────────────────────────────┘
                            ↓
┌─────────────────────────────────────────────────────────────┐
│ SCANNER (Rust - production, TypeScript - prototyping)       │
│ • Fast pre-filter: <10ms                                    │
│ • Publishes: NATS opportunity.*                             │
└─────────────────────────────────────────────────────────────┘
                            ↓
┌─────────────────────────────────────────────────────────────┐
│ PLANNER (Rust - production, TypeScript - prototyping)       │
│ • Deep validation + RPC simulation: 50-100ms                │
│ • Publishes: NATS execution.planned.*                       │
└─────────────────────────────────────────────────────────────┘
                            ↓
┌─────────────────────────────────────────────────────────────┐
│ EXECUTOR (Rust - production, TypeScript - prototyping)      │
│ • Transaction submission: <20ms                             │
│ • Publishes: NATS executed.*                                │
└─────────────────────────────────────────────────────────────┘
                            ↓
                    PROFIT IN WALLET

TOTAL: <200ms (quote → profit) ✅

Why Quote Service is Critical:

  • It’s the first step in the pipeline
  • If quotes are slow (>10ms), the entire 200ms budget is blown
  • If quotes are stale (>2s), arbitrage opportunities are missed
  • If quotes are inaccurate, false positives waste execution budget

System Context

Position in Overall HFT Architecture

┌─────────────────────────────────────────────────────────────┐
│ DATA LAYER (Solana Blockchain)                              │
│ • Pool Discovery Service → Redis (pool state cache)         │
│ • RPC Proxy → Connection pooling (CLMM tick arrays)         │
│ • Oracle Feeds → Pyth/Jupiter (price validation)            │
│ • External APIs → Jupiter/Dflow/OKX (market aggregation)    │
└─────────────────────────────────────────────────────────────┘
                            ↓
┌─────────────────────────────────────────────────────────────┐
│ QUOTE SERVICE ⭐ CRITICAL PATH                              │
│ • Local quotes: <5ms (on-chain pool math, dual cache)       │
│ • External quotes: 200-500ms (API aggregation)              │
│ • Aggregated quotes: <10ms (best selection)                 │
│ • Parallel paired quotes: <10ms (simultaneous calculation)  │
└─────────────────────────────────────────────────────────────┘
                            ↓
┌─────────────────────────────────────────────────────────────┐
│ TRADING PIPELINE (Scanner → Planner → Executor)             │
│ • Scanner: Fast pre-filter (<10ms)                          │
│ • Planner: Deep validation (50-100ms)                       │
│ • Executor: Transaction submission (<20ms)                  │
│ • TARGET: <200ms total execution                            │
└─────────────────────────────────────────────────────────────┘

Integration Points

Upstream Dependencies:

  • Pool Discovery Service: Provides pool state updates via Redis
  • RPC Proxy: Fetches CLMM tick arrays for concentrated liquidity pools
  • Oracle Feeds: Validates quote accuracy (Pyth, Jupiter)
  • External APIs: Jupiter, Dflow, OKX, dark pools (HumidiFi, GoonFi, ZeroFi)

Downstream Consumers:

  • Scanner Service (Rust production, TypeScript prototyping): Consumes quotes via shared memory IPC (Rust) or gRPC (TypeScript), detects arbitrage
  • Planner Service (Rust production, TypeScript prototyping): Validates opportunities with fresh quotes
  • Executor Service (Rust production, TypeScript prototyping): Uses quotes for final profit calculation
  • Dashboard (Next.js): Real-time quote monitoring and visualization

Architecture Evolution

Historical Context: 10+ Iterations

The quote service has undergone extensive evolution to solve paired quote timing issues:

Iteration                   Approach                          Result         Key Learning
──────────────────────────────────────────────────────────────────────────────────────────────
1. Auto-reverse pairs       Server adds reverse pairs         ✅ Working     Need both forward + reverse
2. Paired optimization      Use same amount                   ⚠️ Partial     3.6s delay still too high
3. Fresh calculation        Bypass cache                      ❌ Worse       8.6s delay (RPC bottleneck)
4. Server timestamp         Fix measurement                   ✅ Accurate    5s actual delay confirmed
5. Cache + fresh time       Fast cache + accurate timestamp   ✅ Fast        ~100ms delay
6. Liquidity cache          Fix pool scanning                 ✅ Working     Performance stable
7. Performance fixes        Cache key + async metrics         ✅ Fast        Sub-5ms cache hits
8. Batch refresh            Periodic batch every 2s           ⚠️ Blocking    100ms delay persists
9. Panic fix                Safe channel sends                ✅ Stable      No more crashes
10. Continuous streaming    Worker per pair                   ✅ Current     68-100ms delay
11. Parallel paired         Simultaneous calculation          🎯 Target      <10ms delay

Root Cause Identified (Doc 22):

Sequential calculation creates 50-100ms gap where pool state changes:

T=0ms:    Start forward calculation
T=50ms:   Forward complete β†’ Stream
T=50ms:   Start reverse calculation  ⬅️ Pool state may have changed!
T=100ms:  Reverse complete β†’ Stream

Result: Different slots, inconsistent pool state, invalid arbitrage

Solution: Parallel Paired Quote Calculation (current architecture)
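
The fix can be sketched in Go: both directions are quoted concurrently against a single immutable snapshot of the pool, so neither quote can observe a newer pool state than the other. The names (`PoolSnapshot`, `PairedQuotes`) and the constant-product math below are illustrative assumptions, not the service's actual types:

```go
package main

import (
	"fmt"
	"sync"
)

// PoolSnapshot is an immutable copy of pool reserves taken at one instant.
// Both directions are quoted against this single snapshot, so forward and
// reverse quotes can never straddle a pool-state change.
type PoolSnapshot struct {
	ReserveIn, ReserveOut uint64 // e.g. SOL and USDC reserves
	FeeBps                uint64 // swap fee in basis points
}

// quote applies constant-product math (x*y = k) after deducting the fee.
func quote(in, reserveIn, reserveOut, feeBps uint64) uint64 {
	inAfterFee := in * (10_000 - feeBps) / 10_000
	return reserveOut * inAfterFee / (reserveIn + inAfterFee)
}

// PairedQuotes computes forward and reverse quotes in parallel against the
// same snapshot, instead of sequentially (which left a 50-100ms window).
func PairedQuotes(snap PoolSnapshot, amount uint64) (forward, reverse uint64) {
	var wg sync.WaitGroup
	wg.Add(2)
	go func() { defer wg.Done(); forward = quote(amount, snap.ReserveIn, snap.ReserveOut, snap.FeeBps) }()
	go func() { defer wg.Done(); reverse = quote(amount, snap.ReserveOut, snap.ReserveIn, snap.FeeBps) }()
	wg.Wait()
	return forward, reverse
}

func main() {
	snap := PoolSnapshot{ReserveIn: 1_000_000, ReserveOut: 2_000_000, FeeBps: 30}
	f, r := PairedQuotes(snap, 10_000)
	fmt.Println(f, r) // both derived from one consistent pool state
}
```

Because the snapshot is copied before the goroutines start, a concurrent pool refresh cannot make the two directions inconsistent with each other.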

Why Microservices? (Evolution from Monolith)

Monolithic Quote-Service Problems:

  • ❌ RPC failures impacted external API quotes
  • ❌ External API rate limits slowed local quotes
  • ❌ Single scaling strategy (can’t optimize local vs external independently)
  • ❌ Single deployment (can’t update providers without affecting pool math)
  • ❌ Mixed concerns (AMM math + API integration in same codebase)

3-Service Architecture Benefits:

  • βœ… Failure Isolation: RPC down β†’ only local quotes affected, external continues
  • βœ… Independent Scaling: Local (vertical: CPU/memory), External (horizontal: 5+ instances)
  • βœ… Independent Deployment: Update API providers without touching pool math
  • βœ… Team Independence: Protocol experts (local) vs API engineers (external)
  • βœ… Technology Flexibility: Can rewrite services independently
  • βœ… <10% Code Overlap: Clear separation of concerns

Current Architecture: 3-Microservice Design

Service Boundaries

┌──────────────────────────────────────────────────────────────┐
│                    CLIENTS                                   │
│  • TypeScript (Scanner, Planner, Executor)                   │
│  • Rust Services (future)                                    │
│  • Dashboard (Next.js)                                       │
│  • Analytics Tools                                           │
└──────────────────────────────────────────────────────────────┘
                            ↓
                    gRPC (Port 50051)
                    Load Balanced (NGINX/K8s)
                            ↓
┌──────────────────────────────────────────────────────────────┐
│         QUOTE AGGREGATOR SERVICE (Client-Facing)             │
│  ──────────────────────────────────────────────────────────  │
│  Responsibilities:                                           │
│  • Client gRPC API (StreamQuotes, GetBestQuote)              │
│  • Fan-out to local + external services (PARALLEL)           │
│  • Merge quote streams in real-time                          │
│  • Deduplicate routes (same pool = same route)               │
│  • Select best quote (price, confidence, liquidity)          │
│  • Compare local vs external (price diff, recommendation)    │
│  • Client load balancing across 10 instances                 │
│                                                              │
│  Ports: gRPC (50051), HTTP (8081), Metrics (9092)            │
│  Technology: Go                                              │
│  Scaling: Horizontal (stateless, 10 instances)               │
│  Latency: 2-3ms overhead (gRPC serialization)                │
│  Resources: 1 CPU, 1-2 GB memory per instance                │
└──────────────────────────────────────────────────────────────┘
         ↓ gRPC (50052)                    ↓ gRPC (50053)
         ↓ HTTP (8082)                     ↓ HTTP (8083)
┌────────────────────────┐      ┌─────────────────────────────┐
│ LOCAL QUOTE SERVICE    │      │ EXTERNAL QUOTE SERVICE      │
│  ────────────────────  │      │  ─────────────────────────  │
│ Responsibilities:      │      │ Responsibilities:           │
│ • Pool state cache     │      │ • API client pool           │
│ • Background refresh   │      │ • Per-provider rate limit   │
│ • AMM/CLMM/DLMM math   │      │ • Provider rotation         │
│ • Quote response cache │      │ • Circuit breakers          │
│ • Parallel paired calc │      │ • Response cache (10s TTL)  │
│ • Oracle validation    │      │ • Multi-region support      │
│ • RPC coordination     │      │ • API key management        │
│                        │      │                             │
│ Ports: gRPC (50052),   │      │ Ports: gRPC (50053),        │
│        HTTP (8082),    │      │        HTTP (8083),         │
│        Metrics (9090)  │      │        Metrics (9091)       │
│ Technology: Go         │      │ Technology: Go              │
│ Scaling: Vertical      │      │ Scaling: Horizontal         │
│ Instances: 1-2         │      │ Instances: 5                │
│ Latency: <5ms          │      │ Latency: 200-500ms          │
│ Resources: 2-4 CPU,    │      │ Resources: 1-2 CPU,         │
│            8-16 GB     │      │            2-4 GB           │
└────────────────────────┘      └─────────────────────────────┘
         ↓                                  ↓
┌────────────────────────┐      ┌─────────────────────────────┐
│ Infrastructure         │      │ External APIs               │
│ • Pool Discovery Redis │      │ • Jupiter (1 RPS shared)    │
│ • RPC Proxy            │      │ • Jupiter Ultra (1 RPS)     │
│ • Oracle Feeds (Pyth)  │      │ • Dflow (5 RPS)             │
│ • NATS (optional)      │      │ • OKX (10 RPS)              │
└────────────────────────┘      │ • QuickNode (10 RPS)        │
                                │ • BLXR (5 RPS)              │
                                │ • HumidiFi (dark pool)      │
                                │ • GoonFi (dark pool)        │
                                │ • ZeroFi (dark pool)        │
                                └─────────────────────────────┘

Communication Protocols

gRPC Streaming (Primary):

service QuoteAggregatorService {
  // Get single best quote (one-time request)
  rpc GetBestQuote(AggregatedQuoteRequest) returns (AggregatedQuote);

  // Stream continuous quotes (real-time updates)
  rpc StreamQuotes(AggregatedQuoteRequest) returns (stream AggregatedQuote);

  // Health check
  rpc Health(HealthRequest) returns (HealthResponse);
}

NATS Messaging (Required):

  • Quote-Service publishes to market.swap_route.* for HFT pipeline integration
  • Scanner consumes from NATS for real-time opportunity detection
  • Critical Path: Required for Scanner β†’ Planner β†’ Executor pipeline
  • Latency: 2-5ms (acceptable for pipeline events)

Proto Definitions:

message AggregatedQuote {
  optional LocalQuote best_local = 1;
  optional ExternalQuote best_external = 2;
  QuoteSource best_source = 3;  // LOCAL or EXTERNAL
  QuoteComparison comparison = 4;
  repeated LocalQuote all_local_quotes = 5;
  repeated ExternalQuote all_external_quotes = 6;
  int64 aggregation_time_ms = 7;
  int64 timestamp = 8;
}

message QuoteComparison {
  double price_diff_percent = 1;     // (local - external) / external * 100
  double spread_bps = 2;              // Bid-ask spread
  int64 local_latency_ms = 3;
  int64 external_latency_ms = 4;
  string recommendation = 5;          // "local", "external", "equal"
  double confidence_score = 6;        // 0.0-1.0
}

Shared Memory IPC (Rust Production Scanners)

Purpose: Ultra-low-latency quote access (<1μs) for production Rust scanners with intelligent change notification.

Architecture (Dual Shared Memory with Hybrid Change Detection):

┌──────────────────────────────────────────────────────────────┐
│                  QUOTE SERVICE (Go - Writer)                 │
├──────────────────────────────────────────────────────────────┤
│  1. Calculate quotes from TWO sources:                       │
│     • Local (on-chain pool math, <5ms)                       │
│     • External (API aggregation, 200-500ms)                  │
│  2. Write to TWO memory-mapped files (lock-free)             │
└──────────────────────────────────────────────────────────────┘
                            ↓
        ┌──────────────────────────┬──────────────────────────┐
        │ SHMEM #1 (Internal)      │ SHMEM #2 (External)      │
        │ quotes-internal.mmap     │ quotes-external.mmap     │
        ├──────────────────────────┼──────────────────────────┤
        │ • Local pool quotes      │ • External API quotes    │
        │ • Sub-second freshness   │ • 10s+ refresh interval  │
        │ • On-chain accurate      │ • Better routing         │
        │ • Fixed-size metadata    │ • Fixed-size metadata    │
        └──────────────────────────┴──────────────────────────┘
                            ↓
┌──────────────────────────────────────────────────────────────┐
│              RUST SCANNER (Readers - Production)             │
├──────────────────────────────────────────────────────────────┤
│  1. Read from BOTH shared memory (<1μs each)                 │
│  2. Intelligent selection:                                   │
│     • Strategy A: Use internal (fresher, on-chain)           │
│     • Strategy B: Use external (better multi-hop)            │
│     • Strategy C: Compare both, pick best                    │
│  3. Detect arbitrage (<10μs vs 500μs-2ms with gRPC)          │
└──────────────────────────────────────────────────────────────┘

Memory Layout (Fixed-Size Metadata):

// 128-byte-aligned quote entry. Note: the explicit fields total 192 bytes
// (with repr(C) padding), so align(128) rounds each slot up to 256 bytes.
use std::sync::atomic::AtomicU64;

#[repr(C, align(128))]
struct QuoteMetadata {
    version: AtomicU64,           // Versioning for consistency
    pair_id: [u8; 32],            // BLAKE3(token_in, token_out, amount)
    input_mint: [u8; 32],         // Solana public key (32 bytes)
    output_mint: [u8; 32],        // Solana public key (32 bytes)
    input_amount: u64,            // Lamports
    output_amount: u64,           // Lamports
    price_impact_bps: u32,        // Basis points (0.01%)
    timestamp_unix_ms: u64,       // Unix timestamp (milliseconds)
    route_id: [u8; 32],           // BLAKE3(route_steps) -> lookup in Redis/PG
    _padding: [u8; 24],           // Pad explicit fields to 192 bytes
}

// Memory-mapped file: Array of QuoteMetadata
// File size: 256 bytes × 1000 pairs = 256 KB (fits in L2 cache)

Performance Characteristics:

Operation                  Latency       Improvement vs gRPC
────────────────────────────────────────────────────────────
Read quote metadata        <1μs          100-500x faster
Arbitrage detection        <10μs         50-200x faster
Two-hop check              <2μs          50-250x faster
Cache hit rate             >99%          Memory-resident

Key Features:

  • Lock-Free Reads: Atomic versioning ensures consistency
  • Dual Sources: Internal (fresh) + External (better routing)
  • Zero-Copy: Direct memory access (no serialization)
  • OS-Managed: Kernel handles memory-mapped file sync
  • Single Writer: Quote service (Go) ONLY, eliminates write conflicts
  • Multiple Readers: Unlimited concurrent Rust scanners

Route Storage (Dynamic Data in PostgreSQL + Redis):

-- PostgreSQL: Persistent route storage
CREATE TABLE route_steps (
    route_id BYTEA PRIMARY KEY,           -- BLAKE3 hash (32 bytes)
    route_steps JSONB NOT NULL,           -- RouteStep array
    hop_count SMALLINT NOT NULL,
    created_at TIMESTAMPTZ NOT NULL,
    last_used_at TIMESTAMPTZ NOT NULL
);

-- Redis: Hot route cache (30s TTL)
SET route:{route_id_hex} '{route_steps_json}' EX 30
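
The lookup path for a route_id follows Redis first, PostgreSQL on a miss, then warms the cache. The sketch below uses in-memory maps as stand-ins for Redis and PostgreSQL so the control flow is self-contained; the real repository issues network calls and handles TTLs:

```go
package main

import (
	"errors"
	"fmt"
)

// RouteFetcher resolves a route_id to its route steps: Redis hot cache first,
// PostgreSQL on a miss, then warms the cache for the next lookup.
type RouteFetcher struct {
	redis    map[string]string // hot cache (30s TTL in the real service)
	postgres map[string]string // persistent route_steps table
}

var errNotFound = errors.New("route not found")

func (f *RouteFetcher) Fetch(routeID string) (string, error) {
	// 1. Redis hot path (<1ms in the real service).
	if steps, ok := f.redis[routeID]; ok {
		return steps, nil
	}
	// 2. PostgreSQL fallback (<10ms with connection pooling).
	steps, ok := f.postgres[routeID]
	if !ok {
		return "", errNotFound
	}
	// 3. Warm the cache so the next lookup stays on the hot path.
	f.redis[routeID] = steps
	return steps, nil
}

func main() {
	f := &RouteFetcher{
		redis:    map[string]string{},
		postgres: map[string]string{"ab12": `[{"dex":"raydium"}]`},
	}
	steps, _ := f.Fetch("ab12") // misses Redis, hits PostgreSQL, warms cache
	fmt.Println(steps)
}
```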

Rust Scanner Example:

use memmap2::MmapOptions;
use std::fs::File;
use std::sync::atomic::Ordering; // needed for version.load(Ordering::Acquire)

// (ARBITRAGE_THRESHOLD and the helper functions used below are defined
// elsewhere in the scanner.)

// Open shared memory regions
let file_internal = File::open("/var/quote-service/quotes-internal.mmap")?;
let mmap_internal = unsafe { MmapOptions::new().map(&file_internal)? };

let file_external = File::open("/var/quote-service/quotes-external.mmap")?;
let mmap_external = unsafe { MmapOptions::new().map(&file_external)? };

// Read quote metadata (<1μs)
let quotes_internal: &[QuoteMetadata] = unsafe {
    std::slice::from_raw_parts(
        mmap_internal.as_ptr() as *const QuoteMetadata,
        1000, // Max 1000 pairs
    )
};

let quotes_external: &[QuoteMetadata] = unsafe {
    std::slice::from_raw_parts(
        mmap_external.as_ptr() as *const QuoteMetadata,
        1000,
    )
};

// Fast arbitrage detection
for quote_internal in quotes_internal {
    if quote_internal.version.load(Ordering::Acquire) == 0 {
        continue; // Empty slot
    }

    // Find matching external quote
    for quote_external in quotes_external {
        if quote_external.pair_id == quote_internal.pair_id {
            // Compare prices (<10μs total)
            let price_diff_bps = calculate_price_diff(
                quote_internal.output_amount,
                quote_external.output_amount,
            );

            if price_diff_bps > ARBITRAGE_THRESHOLD {
                // Fetch route from Redis/PostgreSQL
                let route_internal = fetch_route(&quote_internal.route_id)?;
                let route_external = fetch_route(&quote_external.route_id)?;

                // Publish opportunity to NATS
                publish_arbitrage_opportunity(route_internal, route_external)?;
            }
        }
    }
}

Why Dual Regions? (Internal vs External):

  1. Different Freshness: Internal (sub-second) vs External (10s+)
  2. Different Sources: On-chain pool math vs 3rd-party APIs
  3. Different Use Cases: Internal for accuracy, External for better routing
  4. Cross-Validation: Detect price manipulation by comparing both
  5. Strategy Flexibility: Scanner chooses best source per opportunity

Implementation Roadmap (3 weeks total)

Phase 1: PostgreSQL Route Storage (Week 1 - 3 days):

  • Create PostgreSQL schema (route_steps table with JSONB)
  • Implement Go route repository
  • Integrate with quote service (RouteID calculation, storage)
  • Add route cleanup job (7-day retention)

Phase 2: Redis Route Caching (Week 1 - 2 days):

  • Implement Redis route cache
  • Add cache warming on quote calculation
  • Add cache metrics (hit/miss rate, latency)
  • Target: >95% cache hit rate

Phase 3: Shared Memory Writer (Go) (Week 2 - 3 days):

  • Implement SharedMemoryWriter with atomic write protocol
  • Memory-mapped file creation and initialization
  • Integrate with QuoteServiceImpl.GetPairedQuotes()
  • Test concurrent writes (10K writes/sec)

Phase 4: Shared Memory Reader (Rust) (Week 2 - 2 days):

  • Implement SharedMemoryReader with lock-free reads
  • Double-checked read with version validation
  • PairID lookup in index with retry logic
  • Target: 10M reads/sec

Phase 5: Route Fetcher (Rust) (Week 2 - 2 days):

  • Implement RouteFetcher with Redis β†’ PostgreSQL fallback
  • Redis fetch with deserialization (<1ms)
  • PostgreSQL fetch with connection pooling (<10ms)
  • Cache warming on PostgreSQL fetch

Phase 6: Scanner Integration (Week 3 - 3 days):

  • Implement ArbitrageDetector in Rust
  • Integrate shared memory reader + route fetcher
  • Arbitrage calculation (forward Γ— reverse)
  • Publish opportunities to NATS
  • Target: <50ΞΌs detection latency

Phase 7: Production Testing (Week 3 - 2 days):

  • Load test (1000 paired quotes/sec)
  • Stress test (10K concurrent readers)
  • Measure cache hit rates and end-to-end latency
  • 24-hour soak test for stability

Implementation Status: 🚧 Future (Rust scanners not yet implemented)

Full Design Specification: Doc 31 content has been fully incorporated above. The original document has been archived at docs/archive/quote-service/rewrite/31-SHARED-MEMORY-QUOTE-ARCHITECTURE.md.


Service 1: Local Quote Service

Purpose

Provide fast, accurate on-chain quotes using:

  • Dual cache architecture (pool state + quote response)
  • Background refresh (AMM: 10s, CLMM: 30s)
  • Parallel paired quote calculation (forward + reverse simultaneously)
  • Oracle validation (Pyth price feed deviation alerts)

Dual Cache Architecture

┌──────────────────────────────────────────────────────────────┐
│              LOCAL QUOTE SERVICE                             │
│                                                              │
│  ┌─────────────────────────────────────────────────────────┐ │
│  │  LAYER 1: Pool State Cache (map[poolID]*PoolState)      │ │
│  │  ─────────────────────────────────────────────────────  │ │
│  │                                                         │ │
│  │  type PoolState struct {                                │ │
│  │    Pool         *domain.Pool    // Metadata + reserves  │ │
│  │    CLMMData     *CLMMPoolData   // Tick arrays (CLMM)   │ │
│  │    LastUpdate   time.Time       // Refresh timestamp    │ │
│  │    RefreshCount int             // Counter              │ │
│  │    IsStale      bool            // Staleness flag       │ │
│  │    mu           sync.RWMutex    // Thread-safe          │ │
│  │  }                                                      │ │
│  │                                                         │ │
│  │  Refresh Strategy:                                      │ │
│  │  • AMM/CPMM: Every 10s (Redis read, fast)               │ │
│  │  • CLMM: Every 30s (RPC fetch tick arrays, slow)        │ │
│  │  • Staleness check: Every 5s (mark pools >60s stale)    │ │
│  │  • Priority queue: On-demand refresh for stale pools    │ │
│  └─────────────────────────────────────────────────────────┘ │
│                          ↓                                   │
│           Pool Refresh → Invalidate Layer 2                  │
│                          ↓                                   │
│  ┌─────────────────────────────────────────────────────────┐ │
│  │  LAYER 2: Quote Response Cache (map[key]*CachedQuote)   │ │
│  │  ─────────────────────────────────────────────────────  │ │
│  │                                                         │ │
│  │  type CachedQuoteResponse struct {                      │ │
│  │    InputMint     string                                 │ │
│  │    OutputMint    string                                 │ │
│  │    InputAmount   uint64                                 │ │
│  │    OutputAmount  uint64                                 │ │
│  │    PriceImpact   float64                                │ │
│  │    PoolID        string                                 │ │
│  │    CachedAt      time.Time                              │ │
│  │    ExpiresAt     time.Time  // 2s TTL                   │ │
│  │    PoolStateAge  time.Duration                          │ │
│  │    OraclePrice   float64                                │ │
│  │    Deviation     float64                                │ │
│  │  }                                                      │ │
│  │                                                         │ │
│  │  Cache Key: hash(inputMint, outputMint, amount)         │ │
│  │  Invalidation:                                          │ │
│  │  • Pool refresh → invalidate ALL quotes using pool      │ │
│  │  • TTL expiration (2s)                                  │ │
│  │  • Periodic eviction (every 5s)                         │ │
│  └─────────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────┘

Why Dual Cache?

  • Layer 1 (Pool State): Expensive to fetch (50-100ms RPC for CLMM), shared across all quotes
  • Layer 2 (Quote Response): Fast to calculate (< 5ms pool math), specific to input/output/amount
  • Pool-Aware Invalidation: Pool refresh β†’ invalidate ALL Layer 2 quotes using that pool
  • Different TTLs: Pool state (60s staleness), Quote response (2s TTL)

Parallel Paired Quote Calculation

Problem (Sequential calculation):

T=0ms:    Start forward calculation (SOL β†’ USDC)
T=50ms:   Forward complete β†’ Stream
T=50ms:   Start reverse calculation (USDC β†’ SOL)  ⬅️ Pool changed!
T=100ms:  Reverse complete β†’ Stream

Result: 100ms delay, different slots, invalid arbitrage

Solution (Parallel calculation):

T=0ms:    Launch forward goroutine ─┐
T=0ms:    Launch reverse goroutine ── PARALLEL!
T=50ms:   Both complete β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
T=51ms:   Stream both quotes

Result: ~1ms delay, SAME slot, valid arbitrage

Implementation:

// calculatePairedQuotesParallel calculates forward + reverse simultaneously
func (s *GRPCQuoteServer) calculatePairedQuotesParallel(
    ctx context.Context,
    inputMint, outputMint, amount string,
    timeoutMs int,
) *PairedQuoteResult {
    ctxWithTimeout, cancel := context.WithTimeout(ctx,
        time.Duration(timeoutMs) * time.Millisecond)
    defer cancel()

    // Result channels
    forwardChan := make(chan *QuoteResult, 1)
    reverseChan := make(chan *QuoteResult, 1)

    startTime := time.Now()

    // Launch forward goroutine
    go func() {
        quote, err := s.cache.GetOrCalculateQuoteFresh(
            ctxWithTimeout, inputMint, outputMint, amount, ...)
        forwardChan <- &QuoteResult{Quote: quote, Error: err}
    }()

    // Launch reverse goroutine (SIMULTANEOUSLY!)
    go func() {
        quote, err := s.cache.GetOrCalculateQuoteFresh(
            ctxWithTimeout, outputMint, inputMint, amount, ...)
        reverseChan <- &QuoteResult{Quote: quote, Error: err}
    }()

    // Wait for BOTH or timeout
    result := &PairedQuoteResult{}
    resultsReceived := 0

collect:
    for resultsReceived < 2 {
        select {
        case fwd := <-forwardChan:
            result.Forward = fwd
            resultsReceived++
        case rev := <-reverseChan:
            result.Reverse = rev
            resultsReceived++
        case <-ctxWithTimeout.Done():
            // Timeout: return partial results
            break collect
        }
    }
    result.TotalMs = time.Since(startTime).Milliseconds()
    return result
}

Key Features:

  • Go Concurrency: Leverages goroutines + channels (like Promise.allSettled)
  • Timeout Handling: 100ms max, returns partial results if one fails
  • Same Pool State: Both quotes use identical pool snapshot
  • Same Slot: Both calculated within ~1ms (vs 100ms sequential)

Quote Request Flow

1. Client Request
   Scanner: GetQuote(SOL β†’ USDC, 1 SOL)
                ↓
2. Check Layer 2 (Quote Response Cache)
   β”œβ”€ Cache HIT β†’ Return immediately (< 1ms)
   └─ Cache MISS β†’ Continue
                ↓
3. Check Layer 1 (Pool State Cache)
   β”œβ”€ Pool fresh (<60s) β†’ Calculate quote (<5ms)
   β”‚                    β†’ Store in Layer 2
   β”‚                    β†’ Return
   └─ Pool stale (>60s) β†’ Add to priority queue
                        β†’ Calculate with stale (mark stale)
                        β†’ Return with warning
                ↓
4. Background Refresh (async, triggered by priority queue)
   β€’ Refresh pool state (Layer 1)
   β€’ Invalidate Layer 2 quotes for pool
   β€’ Next request gets fresh quote

Background Refresh System

Refresh Manager:

type RefreshManager struct {
    poolCache          *PoolStateCache
    quoteResponseCache *QuoteResponseCache
    refreshQueue       chan RefreshCommand

    // Refresh intervals
    ammRefreshInterval  time.Duration  // 10s
    clmmRefreshInterval time.Duration  // 30s

    // Worker pool
    workerCount int  // 5 workers
}

type RefreshCommand struct {
    PoolID   string
    PoolType PoolType  // AMM, CLMM, DLMM
    Priority Priority  // HIGH (on-demand), NORMAL (scheduled)
}

Refresh Strategies:

  1. Scheduled Refresh (Background):
    Every 10s: Refresh all AMM/CPMM pools from Redis
    Every 30s: Refresh all CLMM pools from RPC (tick arrays)
    Every 5s:  Check staleness, mark pools >60s as stale
    
  2. On-Demand Refresh (Priority Queue):
    Quote request for stale pool
     β†’ Add HIGH priority to queue
     β†’ Worker picks up immediately
     β†’ Refresh pool state
     β†’ Invalidate quote cache
     β†’ Future quotes use fresh data
    
  3. Pool-Aware Cache Invalidation:
    Pool state refreshed
     β†’ Find all Layer 2 quotes using this pool
     β†’ Invalidate ALL matching quotes
     β†’ Ensures consistency
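
Strategy 2's priority handling can be sketched with two channels, always draining HIGH (on-demand) work before NORMAL (scheduled) work. This is a simplified single-threaded model of the idea, not the production worker pool:

```go
package main

import "fmt"

type refreshCommand struct {
	PoolID string
}

// drainPrioritized processes queued refresh commands, preferring the
// HIGH channel whenever it has work. A production worker would block
// on both channels; this sketch drains pre-filled queues so the
// ordering is easy to verify.
func drainPrioritized(high, normal chan refreshCommand) []string {
	var order []string
	for len(high) > 0 || len(normal) > 0 {
		select {
		case cmd := <-high: // on-demand refresh for a stale pool
			order = append(order, "HIGH:"+cmd.PoolID)
		default:
			select {
			case cmd := <-high:
				order = append(order, "HIGH:"+cmd.PoolID)
			case cmd := <-normal: // scheduled 10s/30s refresh
				order = append(order, "NORMAL:"+cmd.PoolID)
			}
		}
	}
	return order
}

func main() {
	high := make(chan refreshCommand, 8)
	normal := make(chan refreshCommand, 8)
	normal <- refreshCommand{PoolID: "amm-1"}
	high <- refreshCommand{PoolID: "clmm-stale"}
	normal <- refreshCommand{PoolID: "amm-2"}

	fmt.Println(drainPrioritized(high, normal))
	// [HIGH:clmm-stale NORMAL:amm-1 NORMAL:amm-2]
}
```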
    

Protocol Support

| Protocol        | Type | Refresh Source    | Interval | Latency  | Program ID  |
|-----------------|------|-------------------|----------|----------|-------------|
| Raydium AMM V4  | AMM  | Redis             | 10s      | <1ms     | 675kPX9M... |
| Raydium CPMM    | AMM  | Redis             | 10s      | <1ms     | CPMMoo8L... |
| Raydium CLMM    | CLMM | RPC (tick arrays) | 30s      | 50-100ms | CAMMCzo5... |
| Meteora DLMM    | DLMM | RPC (bins)        | 30s      | 50-100ms | LBUZKhRx... |
| Orca Whirlpools | CLMM | RPC (tick arrays) | 30s      | 50-100ms | (future)    |
| PumpSwap        | AMM  | Redis             | 10s      | <1ms     | pAMMBay6... |

Oracle Integration

Purpose: Validate quote accuracy against external price feeds.

Supported Oracles:

  • Pyth (primary): Real-time price feeds (<1s latency)
  • Jupiter Oracle (secondary): TWAP prices

Usage:

// After calculating quote, validate against oracle
oraclePrice := s.oracleClient.GetPrice(tokenPair)
quotePrice := float64(outputAmount) / float64(inputAmount)
deviation := math.Abs(quotePrice - oraclePrice) / oraclePrice

if deviation > 0.01 {  // >1% deviation
    log.Warn("Quote deviates from oracle",
        "pair", tokenPair,
        "deviation", deviation*100,
        "quotePrice", quotePrice,
        "oraclePrice", oraclePrice)

    // Publish metric for alerting
    s.metrics.RecordOracleDeviation(tokenPair, deviation)
}
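
The deviation check reduces to a single relative-error formula. A self-contained version (function name hypothetical; amounts assumed already normalized to the same units):

```go
package main

import (
	"fmt"
	"math"
)

// oracleDeviation mirrors the check above: the relative deviation of
// the quote-implied price from the oracle price, flagged above 1%.
func oracleDeviation(inputAmount, outputAmount uint64, oraclePrice float64) (float64, bool) {
	quotePrice := float64(outputAmount) / float64(inputAmount)
	dev := math.Abs(quotePrice-oraclePrice) / oraclePrice
	return dev, dev > 0.01
}

func main() {
	// Quote implies 1.03, oracle says 1.00 -> 3% deviation, flagged.
	dev, flagged := oracleDeviation(1000, 1030, 1.0)
	fmt.Printf("%.3f %v\n", dev, flagged) // 0.030 true

	// 0.5% deviation stays under the 1% threshold.
	dev, flagged = oracleDeviation(1000, 1005, 1.0)
	fmt.Printf("%.3f %v\n", dev, flagged) // 0.005 false
}
```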

Configuration

Environment Variables:

# Service
LOCAL_QUOTE_SERVICE_PORT=50052
LOCAL_QUOTE_SERVICE_HTTP_PORT=8082

# Cache
POOL_CACHE_STALENESS_THRESHOLD=60s
QUOTE_CACHE_TTL=2s
QUOTE_CACHE_EVICTION_INTERVAL=5s

# Background Refresh
AMM_REFRESH_INTERVAL=10s
CLMM_REFRESH_INTERVAL=30s
REFRESH_WORKER_COUNT=5

# Parallel Paired Quotes
PARALLEL_QUOTE_TIMEOUT_MS=100
PARALLEL_QUOTE_ENABLED=true

# Dependencies
RPC_PROXY_URL=http://localhost:8083
POOL_DISCOVERY_REDIS_URL=redis://localhost:6379/0
ORACLE_PYTH_HTTP_URL=https://hermes.pyth.network
ORACLE_JUPITER_URL=https://price.jup.ag/v4

# Observability
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
METRICS_PORT=9090
LOG_LEVEL=info

Deployment Specifications

Resource Requirements:

  • CPU: 2-4 cores (pool math computation)
  • Memory: 8-16 GB
    • Pool cache: ~5-10 MB (100 pools Γ— 50KB)
    β€’ Quote cache: ~1-2 MB (~200 B per cached quote)
    • CLMM tick arrays: ~5 MB
    • Go runtime: ~2-3 GB
  • Disk: 1 GB (logs only, cache in memory)
  • Network: 10 Mbps (RPC calls)

Scaling Strategy:

  • Vertical: Increase CPU/memory for more pools
  • Horizontal: 2 instances for HA (shared Redis cache)
  • Pattern: Active-Active with shared state

Docker Image:

FROM golang:1.24-alpine AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN go build -o /bin/local-quote-service ./cmd/local-quote-service

FROM alpine:latest
RUN apk --no-cache add ca-certificates
COPY --from=builder /bin/local-quote-service /bin/
EXPOSE 50052 8082 9090
ENTRYPOINT ["/bin/local-quote-service"]

Service 2: External Quote Service

Purpose

Aggregate quotes from external APIs with:

  • Rate limiting (per-provider token bucket)
  • Circuit breakers (auto-disable failing providers)
  • Provider rotation (health-based load balancing)
  • Multi-region support (5+ instances globally)

Provider Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚              EXTERNAL QUOTE SERVICE                          β”‚
β”‚                                                              β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚
β”‚  β”‚  Provider Pool (9 providers)                            β”‚ β”‚
β”‚  β”‚  ─────────────────────────────────────────────────────  β”‚ β”‚
β”‚  β”‚                                                         β”‚ β”‚
β”‚  β”‚  JUPITER POOL (1 RPS shared):                          β”‚ β”‚
β”‚  β”‚  β€’ Jupiter Standard    (0.16 req/s)  ─┐                β”‚ β”‚
β”‚  β”‚  β€’ Jupiter Ultra       (0.16 req/s)   β”œβ”€ Share 1 RPS   β”‚ β”‚
β”‚  β”‚  β€’ HumidiFi (dark pool)(0.16 req/s)   β”‚   (60 req/min) β”‚ β”‚
β”‚  β”‚  β€’ GoonFi (dark pool)  (0.16 req/s)   β”‚                β”‚ β”‚
β”‚  β”‚  β€’ ZeroFi (dark pool)  (0.16 req/s)  β”€β”˜                β”‚ β”‚
β”‚  β”‚                                                         β”‚ β”‚
β”‚  β”‚  INDEPENDENT PROVIDERS:                                 β”‚ β”‚
β”‚  β”‚  β€’ Dflow               (5.0 req/s)                      β”‚ β”‚
β”‚  β”‚  β€’ OKX                 (10.0 req/s)                     β”‚ β”‚
β”‚  β”‚  β€’ QuickNode           (10.0 req/s)                     β”‚ β”‚
β”‚  β”‚  β€’ BLXR                (5.0 req/s)                      β”‚ β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚
β”‚                          ↓                                   β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚
β”‚  β”‚  Rate Limiters (Per-Provider Token Bucket)             β”‚ β”‚
β”‚  β”‚  ─────────────────────────────────────────────────────  β”‚ β”‚
β”‚  β”‚                                                         β”‚ β”‚
β”‚  β”‚  type RateLimiter struct {                             β”‚ β”‚
β”‚  β”‚    limiter   *rate.Limiter  // Token bucket            β”‚ β”‚
β”‚  β”‚    capacity  float64        // Requests per second     β”‚ β”‚
β”‚  β”‚    tokens    float64        // Available tokens        β”‚ β”‚
β”‚  β”‚  }                                                      β”‚ β”‚
β”‚  β”‚                                                         β”‚ β”‚
β”‚  β”‚  Strategy: Wait for token before API call              β”‚ β”‚
β”‚  β”‚  Burst: Allow 1 request instantly, refill at rate      β”‚ β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚
β”‚                          ↓                                   β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚
β”‚  β”‚  Circuit Breakers (Per-Provider)                       β”‚ β”‚
β”‚  β”‚  ─────────────────────────────────────────────────────  β”‚ β”‚
β”‚  β”‚                                                         β”‚ β”‚
β”‚  β”‚  States: CLOSED β†’ OPEN β†’ HALF_OPEN β†’ CLOSED            β”‚ β”‚
β”‚  β”‚                                                         β”‚ β”‚
β”‚  β”‚  β€’ Failure threshold: 5 consecutive failures           β”‚ β”‚
β”‚  β”‚  β€’ Timeout: 30s (OPEN β†’ HALF_OPEN)                     β”‚ β”‚
β”‚  β”‚  β€’ Test requests: 1 in HALF_OPEN state                 β”‚ β”‚
β”‚  β”‚  β€’ Auto-recovery: Success β†’ CLOSED                     β”‚ β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚
β”‚                          ↓                                   β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚
β”‚  β”‚  Quote Cache (10s TTL, per-provider)                   β”‚ β”‚
β”‚  β”‚  ─────────────────────────────────────────────────────  β”‚ β”‚
β”‚  β”‚                                                         β”‚ β”‚
β”‚  β”‚  Cache Key: hash(provider, inputMint, outputMint, amt) β”‚ β”‚
β”‚  β”‚                                                         β”‚ β”‚
β”‚  β”‚  Longer TTL than local (10s vs 2s):                    β”‚ β”‚
β”‚  β”‚  β€’ API calls are expensive (rate limited)              β”‚ β”‚
β”‚  β”‚  β€’ External prices change slower                       β”‚ β”‚
β”‚  β”‚  β€’ Reduces API usage 10x                               β”‚ β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Provider Interface

type QuoteProvider interface {
    GetQuote(ctx context.Context, req *QuoteRequest) (*QuoteResponse, error)
    GetName() string
    GetRateLimit() float64  // Requests per second
    IsEnabled() bool
    GetHealth() *ProviderHealth
}

// Jupiter Provider Example
type JupiterProvider struct {
    client       *http.Client
    apiKeys      []string
    currentKey   int
    rateLimiter  *rate.Limiter  // Shared 1 RPS
    mode         string          // "standard" or "ultra"
    baseURL      string
}

func (p *JupiterProvider) GetQuote(
    ctx context.Context,
    req *QuoteRequest,
) (*QuoteResponse, error) {
    // 1. Wait for rate limit token
    if err := p.rateLimiter.Wait(ctx); err != nil {
        return nil, fmt.Errorf("rate limit: %w", err)
    }

    // 2. Build URL
    url := fmt.Sprintf("%s/quote?inputMint=%s&outputMint=%s&amount=%d",
        p.baseURL, req.InputMint, req.OutputMint, req.Amount)

    // 3. Add headers (Jupiter Ultra requires Origin + Referer)
    headers := http.Header{}
    if p.mode == "ultra" {
        headers.Set("Origin", "https://jup.ag")
        headers.Set("Referer", "https://jup.ag/")
    }

    // 4. Make API call
    httpReq, err := http.NewRequestWithContext(ctx, "GET", url, nil)
    if err != nil {
        return nil, err
    }
    httpReq.Header = headers

    resp, err := p.client.Do(httpReq)
    if err != nil {
        return nil, err
    }
    defer resp.Body.Close()

    // 5. Parse response
    return p.parseResponse(resp)
}

Rate Limiting Strategy

Jupiter API Pool (1 RPS total):

Total capacity: 60 requests/min

Allocation:
β€’ Jupiter Oracle: 12 req/min (5s interval, always-on)
β€’ Quote services: 48 req/min remaining

Per-quoter capacity (5 quoters sharing):
β€’ 48 Γ· 5 = 9.6 req/min
β€’ = 0.16 req/s per quoter

Quoters sharing 1 RPS:
1. Jupiter Standard
2. Jupiter Ultra
3. HumidiFi (dark pool)
4. GoonFi (dark pool)
5. ZeroFi (dark pool)

Capacity Planning:

With 0.16 req/s (9.6 req/min) per Jupiter-based quoter:

10s refresh interval = 6 calls/min per pair
Maximum pairs: ~1 pair with all 5 quoters enabled

Recommendations:
β€’ HFT (1 pair):     Jupiter + Jupiter Ultra + dark pools (best prices)
β€’ Multi-pair (2-3): Jupiter only (higher capacity, fewer quoters)

Independent Providers (no sharing):

β€’ Dflow:      5.0 req/s  = 300 req/min
β€’ OKX:       10.0 req/s  = 600 req/min
β€’ QuickNode: 10.0 req/s  = 600 req/min
β€’ BLXR:       5.0 req/s  = 300 req/min

Circuit Breaker Logic

type CircuitBreaker struct {
    name             string        // Provider name (used in logs/metrics)
    state            CircuitState  // Closed, Open, HalfOpen
    failureCount     int
    successCount     int
    failureThreshold int           // 5 consecutive failures
    timeout          time.Duration // 30s (Open β†’ HalfOpen)
    lastFailure      time.Time
    lastSuccess      time.Time
    mu               sync.Mutex
}

func (cb *CircuitBreaker) Call(fn func() error) error {
    cb.mu.Lock()

    // Check state
    if cb.state == Open {
        if time.Since(cb.lastFailure) > cb.timeout {
            cb.state = HalfOpen  // Try recovery
            cb.successCount = 0
        } else {
            cb.mu.Unlock()
            return ErrCircuitOpen
        }
    }

    cb.mu.Unlock()

    // Execute function
    err := fn()

    cb.mu.Lock()
    defer cb.mu.Unlock()

    if err != nil {
        cb.failureCount++
        cb.successCount = 0
        cb.lastFailure = time.Now()

        if cb.failureCount >= cb.failureThreshold {
            cb.state = Open  // Disable provider
            log.Error("Circuit breaker opened",
                "provider", cb.name,
                "failures", cb.failureCount)
        }
        return err
    }

    // Success - reset or close circuit
    cb.failureCount = 0
    cb.successCount++
    cb.lastSuccess = time.Now()

    if cb.state == HalfOpen && cb.successCount >= 3 {
        cb.state = Closed  // Fully recovered
        log.Info("Circuit breaker closed", "provider", cb.name)
    }

    return nil
}

State Transitions:

CLOSED (Normal operation)
   ↓ 5 consecutive failures
OPEN (Provider disabled)
   ↓ Wait 30s
HALF_OPEN (Testing recovery)
   ↓ 3 consecutive successes
CLOSED (Fully recovered)

Provider Rotation & Health

type ProviderRotator struct {
    providers     []ProviderConfig
    currentIndex  int
    usageCounts   map[string]int64
    errorCounts   map[string]int64
    lastResetTime time.Time
    mu            sync.Mutex
}

func (r *ProviderRotator) SelectProvider() string {
    r.mu.Lock()
    defer r.mu.Unlock()

    // 1. Filter healthy providers (circuit breaker closed)
    healthy := r.getHealthyProviders()
    if len(healthy) == 0 {
        return ""  // No providers available
    }

    // 2. Sort by error rate (ascending)
    sort.Slice(healthy, func(i, j int) bool {
        errRateI := float64(r.errorCounts[healthy[i]]) / float64(r.usageCounts[healthy[i]]+1)
        errRateJ := float64(r.errorCounts[healthy[j]]) / float64(r.usageCounts[healthy[j]]+1)
        return errRateI < errRateJ
    })

    // 3. Round-robin among top 3 providers
    topProviders := healthy
    if len(healthy) > 3 {
        topProviders = healthy[:3]
    }

    selected := topProviders[r.currentIndex % len(topProviders)]
    r.currentIndex++
    r.usageCounts[selected]++

    return selected
}

Configuration

Environment Variables:

# Service
EXTERNAL_QUOTE_SERVICE_PORT=50053
EXTERNAL_QUOTE_SERVICE_HTTP_PORT=8083

# Provider Enablement
QUOTERS_ENABLED=true
QUOTERS_USE_JUPITER=true
QUOTERS_USE_JUPITER_ULTRA=false
QUOTERS_USE_HUMIDIFI=false
QUOTERS_USE_GOONFI=false
QUOTERS_USE_ZEROFI=false
QUOTERS_USE_DFLOW=false
QUOTERS_USE_OKX=false
QUOTERS_USE_QUICKNODE=false
QUOTERS_USE_BLXR=false

# API Keys
OKX_API_KEY=your-api-key
OKX_SECRET_KEY=your-secret-key
OKX_PASSPHRASE=your-passphrase
BLXR_API_KEY=your-blxr-key

# Rate Limits (requests per second)
JUPITER_RATE_LIMIT=0.16     # Shared pool
DFLOW_RATE_LIMIT=5.0
OKX_RATE_LIMIT=10.0
QUICKNODE_RATE_LIMIT=10.0
BLXR_RATE_LIMIT=5.0

# Cache & Circuit Breaker
QUOTE_CACHE_TTL=10s
CIRCUIT_BREAKER_FAILURE_THRESHOLD=5
CIRCUIT_BREAKER_TIMEOUT=30s
CIRCUIT_BREAKER_HALFOPEN_SUCCESSES=3

# Observability
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
METRICS_PORT=9091
LOG_LEVEL=info

Deployment Specifications

Resource Requirements:

  • CPU: 1-2 cores (HTTP I/O bound)
  • Memory: 2-4 GB
    • HTTP connections: ~500 MB
    • Quote cache: ~1-2 MB
    • Go runtime: ~1-2 GB
  • Disk: 1 GB (logs only)
  • Network: 50 Mbps (external API calls)

Scaling Strategy:

  • Horizontal: 5+ instances (distribute API keys)
  • Geographic: US East/West, EU, Asia (reduce API latency)
  • Pattern: Stateless (optional Redis cache for sharing)

Docker Image:

FROM golang:1.24-alpine AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN go build -o /bin/external-quote-service ./cmd/external-quote-service

FROM alpine:latest
RUN apk --no-cache add ca-certificates
COPY --from=builder /bin/external-quote-service /bin/
EXPOSE 50053 8083 9091
ENTRYPOINT ["/bin/external-quote-service"]

Service 3: Quote Aggregator Service

Purpose

Client-facing unified interface that:

  • Fans out to local + external services in parallel
  • Merges quote streams in real-time
  • Selects best quote (price, confidence, liquidity)
  • Compares local vs external (price diff, recommendation)
  • Handles partial failures gracefully

Aggregation Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                  QUOTE AGGREGATOR SERVICE                    β”‚
β”‚                                                              β”‚
β”‚  1. Client Request                                           β”‚
β”‚     StreamQuotes(SOL β†’ USDC, 1 SOL)                         β”‚
β”‚                                                              β”‚
β”‚  2. Fan-Out (Parallel goroutines)                           β”‚
β”‚     β”œβ”€ Local Quote Service (gRPC: 50052)                    β”‚
β”‚     └─ External Quote Service (gRPC: 50053)                 β”‚
β”‚                                                              β”‚
β”‚  3. Collect Responses (merge streams)                        β”‚
β”‚     Local stream:    outputAmount = 102.5 USDC (5ms)        β”‚
β”‚     External stream: outputAmount = 103.1 USDC (250ms)      β”‚
β”‚                                                              β”‚
β”‚  4. Deduplicate Routes                                       β”‚
β”‚     β€’ Hash route plan (pool addresses)                       β”‚
β”‚     β€’ Same route from multiple sources β†’ keep best price    β”‚
β”‚                                                              β”‚
β”‚  5. Select Best Quote                                        β”‚
β”‚     Ranking:                                                 β”‚
β”‚     1. Output amount (higher = better)                       β”‚
β”‚     2. Price impact (lower = better)                         β”‚
β”‚     3. Confidence score (higher = better)                    β”‚
β”‚     4. Liquidity depth (higher = better)                     β”‚
β”‚                                                              β”‚
β”‚     Best: External 103.1 USDC (0.6 USDC better)             β”‚
β”‚                                                              β”‚
β”‚  6. Compare Local vs External                                β”‚
β”‚     priceDiff = (103.1 - 102.5) / 102.5 = 0.59%             β”‚
β”‚     recommendation = "external" (>0.5% better)               β”‚
β”‚     confidence = 0.85                                        β”‚
β”‚                                                              β”‚
β”‚  7. Build Aggregated Response                                β”‚
β”‚     AggregatedQuote {                                        β”‚
β”‚       bestLocal: 102.5 USDC,                                β”‚
β”‚       bestExternal: 103.1 USDC,                             β”‚
β”‚       bestSource: EXTERNAL,                                  β”‚
β”‚       comparison: { priceDiff: 0.59%, rec: "external" }     β”‚
β”‚     }                                                        β”‚
β”‚                                                              β”‚
β”‚  8. Stream to Client                                         β”‚
β”‚     gRPC streaming (continuous updates)                      β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Fan-Out Implementation

func (s *QuoteAggregatorService) StreamQuotes(
    req *AggregatedQuoteRequest,
    stream QuoteAggregatorService_StreamQuotesServer,
) error {
    ctx := stream.Context()

    // Create channels for quote streams
    localChan := make(chan *LocalQuote, 100)
    externalChan := make(chan *ExternalQuote, 100)
    errChan := make(chan error, 2)

    // Fan-out to both services (PARALLEL)
    go s.streamLocalQuotes(ctx, req, localChan, errChan)
    go s.streamExternalQuotes(ctx, req, externalChan, errChan)

    // Track best quotes
    var bestLocal *LocalQuote
    var bestExternal *ExternalQuote

    // Merge streams in real-time
    for {
        select {
        case local := <-localChan:
            // Update best local quote
            if bestLocal == nil || local.OutputAmount > bestLocal.OutputAmount {
                bestLocal = local
            }

            // Send aggregated quote immediately
            aggregated := s.buildAggregatedQuote(bestLocal, bestExternal)
            if err := stream.Send(aggregated); err != nil {
                return err
            }

        case external := <-externalChan:
            // Update best external quote
            if bestExternal == nil || external.OutputAmount > bestExternal.OutputAmount {
                bestExternal = external
            }

            // Send aggregated quote immediately
            aggregated := s.buildAggregatedQuote(bestLocal, bestExternal)
            if err := stream.Send(aggregated); err != nil {
                return err
            }

        case err := <-errChan:
            // Log error, continue with partial results
            log.Printf("Downstream service error: %v", err)
            // Don't return - partial results are OK

        case <-ctx.Done():
            return ctx.Err()
        }
    }
}

Key Features:

  • Non-blocking: Doesn’t wait for both services
  • Partial Results: Returns local-only or external-only if one fails
  • Real-time Updates: Streams quotes as they arrive
  • Graceful Degradation: Logs errors, continues operation

Quote Comparison

type QuoteComparator struct {
    oracleClient *OracleClient
}

func (c *QuoteComparator) Compare(
    local *LocalQuote,
    external *ExternalQuote,
) *QuoteComparison {
    // 1. Price difference
    priceDiff := (float64(external.OutputAmount) - float64(local.OutputAmount)) /
                 float64(local.OutputAmount) * 100

    // 2. Spread (bid-ask)
    spreadBps := int32(math.Abs(priceDiff) * 100)

    // 3. Recommendation
    recommendation := "local"
    if priceDiff > 0.5 {  // External > 0.5% better
        recommendation = "external"
    } else if math.Abs(priceDiff) < 0.1 {  // Within 0.1%
        recommendation = "equal"
    }

    // 4. Confidence score (based on oracle validation)
    confidence := c.calculateConfidence(local, external)

    return &QuoteComparison{
        PriceDiffPercent:  priceDiff,
        SpreadBps:         spreadBps,
        LocalLatencyMs:    local.CalculationTimeMs,
        ExternalLatencyMs: external.ApiLatencyMs,
        Recommendation:    recommendation,
        ConfidenceScore:   confidence,
    }
}

func (c *QuoteComparator) calculateConfidence(
    local *LocalQuote,
    external *ExternalQuote,
) float64 {
    confidence := 0.5  // Base

    // Factor 1: Oracle price validation
    if local.OraclePrice > 0 && external.Provider != "" {
        localDev := math.Abs(local.Deviation)
        externalDev := math.Abs(external.Deviation)

        if localDev < 0.01 && externalDev < 0.01 {
            confidence += 0.3  // Both within 1% of oracle
        } else if localDev < 0.05 && externalDev < 0.05 {
            confidence += 0.1  // Both within 5% of oracle
        }
    }

    // Factor 2: Liquidity depth
    minLiquidity := math.Min(
        local.LiquidityUsd,
        external.LiquidityUsd,
    )
    if minLiquidity > 100000 {
        confidence += 0.1  // >$100k liquidity
    }

    // Factor 3: Price impact
    totalImpact := local.PriceImpact + external.PriceImpact
    if totalImpact < 0.005 {  // <0.5%
        confidence += 0.1
    }

    return math.Min(confidence, 1.0)
}
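
For illustration, the scoring above condenses into a pure function; a plain argument list stands in for the `LocalQuote`/`ExternalQuote` structs:

```go
package main

import (
	"fmt"
	"math"
)

// confidenceScore combines the three factors above: oracle agreement,
// liquidity depth, and combined price impact.
func confidenceScore(localDev, externalDev, minLiquidityUsd, totalImpact float64) float64 {
	c := 0.5 // base
	switch {
	case localDev < 0.01 && externalDev < 0.01:
		c += 0.3 // both within 1% of oracle
	case localDev < 0.05 && externalDev < 0.05:
		c += 0.1 // both within 5% of oracle
	}
	if minLiquidityUsd > 100_000 {
		c += 0.1 // deep liquidity on both sides
	}
	if totalImpact < 0.005 {
		c += 0.1 // combined price impact under 0.5%
	}
	return math.Min(c, 1.0)
}

func main() {
	// Best case: tight oracle agreement, deep liquidity, low impact.
	fmt.Printf("%.2f\n", confidenceScore(0.005, 0.008, 250_000, 0.003)) // 1.00

	// Only the 5% oracle band applies.
	fmt.Printf("%.2f\n", confidenceScore(0.02, 0.03, 50_000, 0.02)) // 0.60
}
```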

Deduplication

type QuoteDeduplicator struct {
    seenRoutes map[string]*Quote  // routeHash -> best quote
    mu         sync.Mutex
}

func (d *QuoteDeduplicator) AddQuote(quote Quote) bool {
    routeHash := d.hashRoute(quote.GetRoutePlan())

    d.mu.Lock()
    defer d.mu.Unlock()

    existing, seen := d.seenRoutes[routeHash]

    if !seen {
        // New route
        d.seenRoutes[routeHash] = &quote
        return true  // Keep
    }

    // Duplicate route - compare prices
    if quote.GetOutputAmount() > existing.GetOutputAmount() {
        // Better price - replace
        d.seenRoutes[routeHash] = &quote
        return true  // Keep
    }

    return false  // Discard (worse price)
}

func (d *QuoteDeduplicator) hashRoute(plan []SwapHop) string {
    // Hash based on pool addresses (order matters)
    var b strings.Builder
    for _, hop := range plan {
        b.WriteString(hop.AmmKey)
        b.WriteString(":")
    }
    hash := sha256.Sum256([]byte(b.String()))
    return hex.EncodeToString(hash[:])
}
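
A runnable condensation of the deduplicator, using a hypothetical `quote` struct in place of the `Quote` interface:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// quote is a minimal stand-in: a route (ordered pool addresses),
// an output amount, and the source that produced it.
type quote struct {
	route  []string
	output uint64
	source string
}

// hashRoute hashes the ordered pool addresses, as above.
func hashRoute(route []string) string {
	h := sha256.Sum256([]byte(strings.Join(route, ":")))
	return hex.EncodeToString(h[:])
}

// dedupe keeps, per route hash, only the quote with the best output.
func dedupe(quotes []quote) map[string]quote {
	best := make(map[string]quote)
	for _, q := range quotes {
		key := hashRoute(q.route)
		if cur, seen := best[key]; !seen || q.output > cur.output {
			best[key] = q
		}
	}
	return best
}

func main() {
	quotes := []quote{
		{route: []string{"poolA", "poolB"}, output: 102_500_000, source: "local"},
		{route: []string{"poolA", "poolB"}, output: 103_000_000, source: "jupiter"}, // same route, better price
		{route: []string{"poolC"}, output: 101_000_000, source: "okx"},              // distinct route
	}

	best := dedupe(quotes)
	fmt.Println(len(best))                                          // 2 unique routes
	fmt.Println(best[hashRoute([]string{"poolA", "poolB"})].source) // jupiter
}
```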

Configuration

Environment Variables:

# Service
QUOTE_AGGREGATOR_SERVICE_PORT=50051
QUOTE_AGGREGATOR_SERVICE_HTTP_PORT=8081

# Downstream Services
LOCAL_QUOTE_SERVICE_URL=localhost:50052
EXTERNAL_QUOTE_SERVICE_URL=localhost:50053

# Connection Pool
GRPC_MAX_CONCURRENT_STREAMS=100
GRPC_KEEPALIVE_TIME=30s
GRPC_KEEPALIVE_TIMEOUT=10s

# Circuit Breaker (for downstream services)
CIRCUIT_BREAKER_FAILURE_THRESHOLD=10
CIRCUIT_BREAKER_TIMEOUT=30s

# Observability
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
METRICS_PORT=9092
LOG_LEVEL=info

Deployment Specifications

Resource Requirements:

  • CPU: 1 core (lightweight aggregation)
  • Memory: 1-2 GB
    • gRPC connections: ~500 MB
    • Deduplication map: ~10 MB
    • Go runtime: ~500 MB
  • Disk: 1 GB (logs only)
  • Network: 10 Mbps (gRPC proxying)

Scaling Strategy:

  • Horizontal: 10+ instances (stateless)
  • Load Balancing: NGINX or Kubernetes HPA
  • Auto-scaling: Based on client connection count

Docker Image:

FROM golang:1.24-alpine AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN go build -o /bin/quote-aggregator-service ./cmd/quote-aggregator-service

FROM alpine:latest
RUN apk --no-cache add ca-certificates
COPY --from=builder /bin/quote-aggregator-service /bin/
EXPOSE 50051 8081 9092
ENTRYPOINT ["/bin/quote-aggregator-service"]

Data Flow & Integration

Complete Quote Flow (End-to-End)

┌──────────────────────────────────────────────────────────────┐
│ 1. CLIENT REQUEST                                            │
│    Scanner: StreamQuotes(SOL → USDC, 1 SOL)                  │
│             ↓ gRPC: localhost:50051                          │
└──────────────────────────────────────────────────────────────┘
                            ↓
┌──────────────────────────────────────────────────────────────┐
│ 2. AGGREGATOR (2-3ms overhead)                               │
│    • Parse request                                           │
│    • Fan-out to local + external (PARALLEL)                  │
└──────────────────────────────────────────────────────────────┘
              ↓                                ↓
┌─────────────────────────┐      ┌─────────────────────────────┐
│ 3a. LOCAL (50052)       │      │ 3b. EXTERNAL (50053)        │
│                         │      │                             │
│ • Layer 2 cache check   │      │ • Provider pool selection   │
│   ├─ HIT: <1ms          │      │ • Rate limiter (wait token) │
│   └─ MISS:              │      │ • Circuit breaker check     │
│      ├─ Layer 1 (pool)  │      │ • API call (Jupiter/etc)    │
│      │   (<5ms)         │      │   (200-500ms)               │
│      └─ Pool math       │      │ • Store in cache (10s TTL)  │
│                         │      │                             │
│ Latency: 2-5ms          │      │ Latency: 250ms              │
└─────────────────────────┘      └─────────────────────────────┘
              ↓                                ↓
         LocalQuote                     ExternalQuote
         102.5 USDC                     103.0 USDC
              ↓                                ↓
┌──────────────────────────────────────────────────────────────┐
│ 4. AGGREGATOR MERGE                                          │
│    • Deduplicate routes (hash-based)                         │
│    • Select best (103.0 USDC from External)                  │
│    • Compare local vs external (0.49% diff)                  │
│    • Calculate confidence (0.85)                             │
│    • Build AggregatedQuote                                   │
└──────────────────────────────────────────────────────────────┘
                            ↓
┌──────────────────────────────────────────────────────────────┐
│ 5. STREAM TO CLIENT                                          │
│    AggregatedQuote {                                         │
│      bestLocal: 102.5 USDC,                                  │
│      bestExternal: 103.0 USDC,                               │
│      bestSource: EXTERNAL,                                   │
│      comparison: { priceDiff: 0.49%, rec: "external" },      │
│      confidence: 0.85                                        │
│    }                                                         │
└──────────────────────────────────────────────────────────────┘
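The comparison figure in step 4 can be reproduced with simple arithmetic. This sketch assumes the price diff is the external quote's premium relative to the local output; the exact formula is not specified above, but this assumption matches the 0.49% reported for 102.5 vs 103.0 USDC:

```go
package main

import "fmt"

// priceDiffPercent expresses the external quote's premium over the local quote
// as a percentage of the local output. (Assumed formula, not the service's
// verified implementation.)
func priceDiffPercent(localOut, externalOut float64) float64 {
	return (externalOut - localOut) / localOut * 100
}

func main() {
	fmt.Printf("%.2f%%\n", priceDiffPercent(102.5, 103.0)) // 0.49%
}
```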

Integration with HFT Pipeline

Scanner Integration (TypeScript):

import { QuoteAggregatorServiceClient } from '@/generated/quote';

const client = new QuoteAggregatorServiceClient('localhost:50051');

// Stream aggregated quotes
const stream = client.streamQuotes({
  inputMint: SOL_MINT,
  outputMint: USDC_MINT,
  amount: 1_000_000_000,  // 1 SOL
  includeLocal: true,
  includeExternal: true,
});

stream.on('data', async (quote: AggregatedQuote) => {
  // Fast pre-filter (<10ms)
  const opportunities = detectArbitrage(quote);

  // Publish to NATS: opportunity.two_hop.*
  for (const opp of opportunities) {
    await nats.publish('opportunity.two_hop.detected', opp);
  }
});

Planner Integration (TypeScript):

// Planner validates with fresh quotes
const quote = await client.getBestQuote({
  inputMint: opportunity.tokenStart,
  outputMint: opportunity.tokenIntermediate,
  amount: opportunity.amountIn,
});

// Deep validation (50-100ms)
const simulation = await simulateArbitrage(quote);

if (simulation.success && simulation.netProfit > 0.01) {
  const plan = createExecutionPlan(quote, simulation);
  await nats.publish('execution.planned', plan);
}

Error Handling & Graceful Degradation

Failure Scenarios:

| Failure | Local Service | External Service | Aggregator | Client Impact |
|---|---|---|---|---|
| Local service down | ❌ Offline | ✅ Running | ⚠️ Partial | Gets external-only quotes |
| External service down | ✅ Running | ❌ Offline | ⚠️ Partial | Gets local-only quotes |
| Specific provider down | N/A | ⚠️ Circuit open | ✅ OK | Other providers active |
| RPC timeout | ⚠️ Stale quotes | N/A | ⚠️ Partial | Stale local + fresh external |
| API rate limit | N/A | ⚠️ Cache hits | ✅ OK | Cached external quotes |
| Both services down | ❌ Offline | ❌ Offline | ❌ Error | Client retries or fails |

Graceful Degradation Priority:

1. Local quotes (fastest, most critical for HFT)
2. External quotes (slower, better prices sometimes)
3. Cached quotes (stale but available)
4. Error (client decides: retry or skip)

Performance Characteristics

Latency Breakdown (Detailed)

| Operation | Target | P50 | P95 | P99 | Bottleneck |
|---|---|---|---|---|---|
| Local: Layer 2 hit | <1ms | 0.5ms | 0.8ms | 1.2ms | Memory access |
| Local: Layer 1 hit + calc | <5ms | 2ms | 4ms | 6ms | Pool math CPU |
| Local: Pool stale (RPC) | <150ms | 80ms | 120ms | 180ms | RPC tick arrays |
| Local: Parallel paired | <10ms | 5ms | 8ms | 12ms | 2× Layer 1 hit |
| External: Cache hit | <1ms | 0.5ms | 0.8ms | 1.2ms | Memory access |
| External: API call | 200-500ms | 250ms | 450ms | 600ms | API latency |
| Aggregator: Both cached | <5ms | 3ms | 5ms | 8ms | gRPC overhead |
| Aggregator: Local cached + External API | 200-500ms | 255ms | 455ms | 610ms | External API |

Throughput Capacity

| Service | Bottleneck | Single Instance | Multi-Instance | Notes |
|---|---|---|---|---|
| Local Quote | Pool math CPU | 2,000 req/s | 4,000 req/s (2×) | Quote cache hit rate >80% |
| External Quote | API rate limits | 10 req/s | 50 req/s (5×) | Jupiter: 1 RPS, OKX: 10 RPS |
| Aggregator | gRPC proxying | 1,000 req/s | 10,000 req/s (10×) | CPU: <50% at 1k req/s |
| Overall System | External APIs | 10-50 req/s | 50-250 req/s | Depends on provider mix |

Cache Performance

Target Hit Rates:

  • Local Layer 2 (Quote Response): >80% hit rate
  • Local Layer 1 (Pool State): >95% hit rate
  • External Quote Cache: >70% hit rate

Actual Performance (production):

Local Layer 2:  85-90% hit rate ✅
Local Layer 1:  96-98% hit rate ✅
External Cache: 72-78% hit rate ✅

Cache Impact on Latency:

Without cache: 250ms avg (always fresh calculation)
With 80% hit:  ~51ms avg (0.8 × 1ms + 0.2 × 250ms = 50.8ms)
Improvement:   ~5× faster
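The blended average above generalizes to any hit rate; a small sketch (the function name is illustrative):

```go
package main

import "fmt"

// expectedLatencyMs computes the blended average latency:
// hitRate × hit latency + (1 − hitRate) × miss latency.
func expectedLatencyMs(hitRate, hitMs, missMs float64) float64 {
	return hitRate*hitMs + (1-hitRate)*missMs
}

func main() {
	// The 80% hit-rate case from the figures above.
	fmt.Printf("%.1f ms\n", expectedLatencyMs(0.80, 1, 250)) // 50.8 ms
}
```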

Resource Utilization (Production)

Memory Usage:

| Service | Per Instance | Total (All Instances) | Breakdown |
|---|---|---|---|
| Local | 10-12 GB | 20-24 GB (2×) | Pool: 5MB, Quote: 2MB, CLMM: 5MB, Go: 2GB |
| External | 3 GB | 15 GB (5×) | Cache: 2MB, HTTP: 500MB, Go: 1.5GB |
| Aggregator | 1.5 GB | 15 GB (10×) | gRPC: 500MB, Map: 10MB, Go: 500MB |
| Total | | 50-54 GB | Across 17 instances |

CPU Usage:

| Service | Per Instance | Total (All Instances) | Usage Pattern |
|---|---|---|---|
| Local | 2-3 cores | 4-6 cores (2×) | Burst: pool math, Idle: 10-20% |
| External | 1 core | 5 cores (5×) | Steady: 30-50% (I/O wait) |
| Aggregator | 0.5 core | 5 cores (10×) | Steady: 20-30% |
| Total | | 14-16 cores | Peak: 80%, Avg: 40-50% |

Network Bandwidth:

| Service | Ingress | Egress | Total |
|---|---|---|---|
| Local | 5 Mbps | 5 Mbps | 10 Mbps |
| External | 50 Mbps | 50 Mbps | 100 Mbps (API calls) |
| Aggregator | 10 Mbps | 10 Mbps | 20 Mbps |
| Total | 65 Mbps | 65 Mbps | 130 Mbps peak |

Deployment & Operations

Docker Compose Configuration

version: '3.8'

services:
  # ============================================================
  # QUOTE AGGREGATOR (Client-Facing)
  # ============================================================
  quote-aggregator:
    image: quote-aggregator-service:latest
    container_name: quote-aggregator
    ports:
      - "50051:50051"  # gRPC (load balanced)
      - "8081:8081"    # HTTP metrics
      - "9092:9092"    # Prometheus
    environment:
      - GRPC_PORT=50051
      - HTTP_PORT=8081
      - METRICS_PORT=9092
      - LOCAL_QUOTE_SERVICE_URL=local-quote:50052
      - EXTERNAL_QUOTE_SERVICE_URL=external-quote:50053
      - MAX_CONCURRENT_STREAMS=100
    depends_on:
      - local-quote
      - external-quote
    deploy:
      replicas: 10
      resources:
        limits:
          cpus: '1'
          memory: 2G
    restart: unless-stopped
    networks:
      - quote-network

  # ============================================================
  # LOCAL QUOTE SERVICE
  # ============================================================
  local-quote:
    image: local-quote-service:latest
    container_name: local-quote
    ports:
      - "50052:50052"  # gRPC
      - "8082:8082"    # HTTP
      - "9090:9090"    # Prometheus
    environment:
      - GRPC_PORT=50052
      - HTTP_PORT=8082
      - METRICS_PORT=9090
      - AMM_REFRESH_INTERVAL=10s
      - CLMM_REFRESH_INTERVAL=30s
      - POOL_CACHE_STALENESS_THRESHOLD=60s
      - QUOTE_CACHE_TTL=2s
      - PARALLEL_QUOTE_ENABLED=true
      - PARALLEL_QUOTE_TIMEOUT_MS=100
      - RPC_PROXY_URL=http://rpc-proxy:8083
      - POOL_DISCOVERY_REDIS_URL=redis://redis:6379/0
      - ORACLE_PYTH_HTTP_URL=https://hermes.pyth.network
      - OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4318
    depends_on:
      - redis
      - rpc-proxy
    deploy:
      resources:
        limits:
          cpus: '4'
          memory: 16G
    restart: unless-stopped
    networks:
      - quote-network

  # ============================================================
  # EXTERNAL QUOTE SERVICE
  # ============================================================
  external-quote:
    image: external-quote-service:latest
    ports:
      - "50053-50057:50053"  # gRPC (5 instances)
      - "8083-8087:8083"     # HTTP
      - "9091-9095:9091"     # Prometheus
    environment:
      - GRPC_PORT=50053
      - HTTP_PORT=8083
      - METRICS_PORT=9091
      - QUOTERS_ENABLED=true
      - QUOTERS_USE_JUPITER=true
      - QUOTERS_USE_JUPITER_ULTRA=false
      - QUOTERS_USE_DFLOW=false
      - QUOTERS_USE_OKX=false
      - JUPITER_RATE_LIMIT=0.16
      - QUOTE_CACHE_TTL=10s
      - CIRCUIT_BREAKER_FAILURE_THRESHOLD=5
      - CIRCUIT_BREAKER_TIMEOUT=30s
      - OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4318
    env_file:
      - ../../.env  # API keys (OKX_API_KEY, BLXR_API_KEY)
    deploy:
      replicas: 5
      resources:
        limits:
          cpus: '2'
          memory: 4G
    restart: unless-stopped
    networks:
      - quote-network

  # ============================================================
  # INFRASTRUCTURE
  # ============================================================
  redis:
    image: redis:7-alpine
    container_name: redis
    ports:
      - "6379:6379"
    volumes:
      - redis-data:/data
    restart: unless-stopped
    networks:
      - quote-network

  rpc-proxy:
    image: solana-rpc-proxy:latest
    container_name: rpc-proxy
    ports:
      - "8083:8083"
    environment:
      - HTTP_PORT=8083
      - RPC_ENDPOINTS=${RPC_ENDPOINTS}
    restart: unless-stopped
    networks:
      - quote-network

networks:
  quote-network:
    driver: bridge

volumes:
  redis-data:

Kubernetes Deployment (Future)

Local Quote Service (StatefulSet):

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: local-quote
spec:
  serviceName: local-quote
  replicas: 2
  selector:
    matchLabels:
      app: local-quote
  template:
    spec:
      containers:
      - name: local-quote
        image: local-quote-service:v1
        ports:
        - containerPort: 50052
          name: grpc
        resources:
          requests:
            cpu: 2
            memory: 8Gi
          limits:
            cpu: 4
            memory: 16Gi
        livenessProbe:
          exec:
            command: ["/bin/grpc_health_probe", "-addr=:50052"]
          initialDelaySeconds: 10
          periodSeconds: 10

Monitoring & Observability

Key Metrics (Prometheus):

Local Quote Service:

# Pool refresh
pool_refresh_duration_seconds{type="amm"}
pool_refresh_duration_seconds{type="clmm"}
pool_cache_size{type="amm"}
pool_cache_size{type="clmm"}
pool_stale_count

# Quote cache
quote_cache_hit_rate
quote_cache_size
quote_staleness_seconds_p95

# Parallel paired quotes
parallel_quote_duration_ms_p50
parallel_quote_duration_ms_p95
parallel_quote_success_rate

# Oracle
oracle_price_deviation_percent

External Quote Service:

# Provider health
provider_healthy{provider="Jupiter"}
provider_requests_total{provider="Jupiter"}
provider_errors_total{provider="Jupiter"}
provider_latency_seconds{provider="Jupiter"}

# Rate limiting
rate_limit_tokens_available{provider="Jupiter"}
rate_limit_requests_waiting{provider="Jupiter"}

# Circuit breaker
circuit_breaker_state{provider="Jupiter"}  # 0=closed, 1=open, 2=halfopen
circuit_breaker_failures_total{provider="Jupiter"}

Aggregator Service:

# Requests
aggregator_requests_total
aggregator_latency_seconds_p50
aggregator_latency_seconds_p95
aggregator_latency_seconds_p99
aggregator_errors_total

# Quote sources
aggregator_best_source_total{source="local"}
aggregator_best_source_total{source="external"}
aggregator_price_diff_percent_p50
aggregator_price_diff_percent_p95

Grafana Dashboards

Dashboard 1: Quote Services Overview (Single Pane of Glass):

Row 1: Service Health
- [Local] [External] [Aggregator] status indicators
- Uptime, error rate, request rate

Row 2: Latency Distribution
- P50/P95/P99 latency (all services)
- Latency heatmap (last 24h)

Row 3: Cache Performance
- Local Layer 1 hit rate (target: >95%)
- Local Layer 2 hit rate (target: >80%)
- External cache hit rate (target: >70%)

Row 4: Quote Quality
- Price diff histogram (local vs external)
- Best source distribution (local % vs external %)
- Oracle deviation alerts

Dashboard 2: Local Quote Deep Dive:

Row 1: Pool Refresh Performance
- AMM refresh duration (target: <10ms)
- CLMM refresh duration (target: <100ms)
- Refresh queue depth

Row 2: Pool Cache
- Cache size (AMM vs CLMM)
- Staleness count
- Refresh rate (per second)

Row 3: Quote Cache
- Hit rate over time
- Cache size
- Eviction rate

Row 4: Parallel Paired Quotes
- Duration (both quotes) P50/P95/P99
- Success rate (both completed)
- Timeout rate

Alerts (Prometheus Alertmanager)

Critical (PagerDuty):

groups:
- name: quote_service_critical
  rules:
  - alert: QuoteServiceDown
    expr: up{job="quote-service"} == 0
    for: 1m
    annotations:
      summary: "Quote service  is down"

  - alert: HighErrorRate
    expr: rate(errors_total[5m]) > 0.05
    for: 5m
    annotations:
      summary: "Error rate >5% on "

  - alert: HighLatencyP99
    expr: histogram_quantile(0.99, rate(latency_seconds_bucket[5m])) > 5
    for: 5m
    annotations:
      summary: "P99 latency >5s on "

  - alert: AllProvidersDown
    expr: count(provider_healthy == 0) == count(provider_healthy)
    for: 2m
    annotations:
      summary: "All external quote providers are down"

Warning (Slack):

- alert: LowCacheHitRate
  expr: cache_hit_rate < 0.7
  for: 10m
  annotations:
    summary: "Cache hit rate <70% on "

- alert: PoolStaleness
  expr: pool_stale_count > 5
  for: 5m
  annotations:
    summary: " pools are stale (>60s)"

- alert: CircuitBreakerOpen
  expr: circuit_breaker_state > 0
  for: 5m
  annotations:
    summary: "Circuit breaker open for "

- alert: OracleDeviation
  expr: oracle_price_deviation_percent > 1.0
  for: 2m
  annotations:
    summary: "Quote deviates >1% from oracle for "

Conclusion

Architecture Summary

The Quote Service v3.1 delivers sub-10ms quotes through:

✅ 3 specialized microservices (local, external, aggregator)
✅ Dual cache architecture (pool state + quote response)
✅ Parallel paired quotes (<10ms vs 100ms sequential)
✅ Background refresh (AMM: 10s, CLMM: 30s)
✅ Rate limiting & circuit breakers (external API protection)
✅ Oracle validation (Pyth price feed deviation alerts)
✅ gRPC streaming (continuous real-time updates)
✅ Graceful degradation (partial results on failures)

Critical Path Performance

QUOTE-SERVICE → SCANNER → PLANNER → EXECUTOR → PROFIT
   < 5ms        < 10ms    50-100ms    < 20ms
──────────────────────────────────────────────────
              < 200ms TOTAL ✅

Quote Service is the foundation: If quotes are slow, stale, or inaccurate, the entire HFT system fails.

Key Innovations

  1. Parallel Paired Quotes (Doc 22):
    • Sequential: 100ms delay, different slots ❌
    • Parallel: <10ms delay, same slot ✅
    • 10× faster, 100% slot consistency
  2. Dual Cache (Doc 33):
    • Layer 1: Pool state (expensive to fetch)
    • Layer 2: Quote response (cheap to calculate)
    • Pool-aware invalidation
  3. Microservices (Doc 34):
    • Failure isolation (RPC ≠ API)
    • Independent scaling (vertical vs horizontal)
    • Independent deployment (3× release velocity)
  4. Shared Memory IPC (Doc 31):
    • Three-tier storage: Shared memory → Redis → PostgreSQL
    • Lock-free reads with atomic versioning (<1μs quote access)
    • Dual shared memory regions (internal vs external quotes)
    • 100-500× faster than gRPC for Rust production scanners
    • Sub-10μs arbitrage detection (vs 500μs-2ms with gRPC)

Production Readiness

| Metric | Target | Actual | Status |
|---|---|---|---|
| Latency (cached) | <5ms | 2-3ms | ✅ Exceeds |
| Latency (fresh) | <10ms | 5-8ms | ✅ Meets |
| Throughput | 2,000 req/s | 2,500 req/s | ✅ Exceeds |
| Cache Hit Rate | >80% | 85-90% | ✅ Exceeds |
| Availability | 99.9% | 99.95% | ✅ Exceeds |
| Parallel Paired | <10ms | 5-8ms | ✅ Meets |

Archived Documentation

This document supersedes and consolidates:

  • βœ… Doc 22: Parallel Paired Quotes (integrated into Local Service)
  • βœ… Doc 24: Quote Streaming Design (integrated into Architecture)
  • βœ… Doc 26: Quote Service Rewrite (evolved into Microservices)
  • βœ… Doc 31: Shared Memory IPC Architecture (fully incorporated into Section 4.2)
  • βœ… Doc 32: Hybrid Quote Architecture (split into Local + External)
  • βœ… Doc 33: Background Pool Refresh (integrated into Local Service)
  • βœ… Doc 34: Microservices Architecture (current design)

Archive Location: docs/archive/quote-service/


Document Version: 3.1 Last Updated: December 31, 2025 Status: ✅ Production-Ready - Canonical Reference Maintainer: Solution Architect Team