Quote Service Architecture Review - v3.1 Enhancements
Date: December 31, 2025 · Reviewers: Solution Architect + Solana/HFT Expert · Base Document: 30-QUOTE-SERVICE-ARCHITECTURE.md v3.0 · Status: 🎯 Architecture Review Complete - Ready for Implementation
Executive Summary
This document provides a comprehensive architectural review of the Quote Service v3.0 with critical recommendations for v3.1. The review identified 5 major improvements and 4 additional enhancements that will deliver:
- 60% less CPU usage (cache TTL optimization)
- 2× faster external quotes (parallel paired quotes)
- Eliminates fan-out overhead (batch streaming model)
- 100× faster oracle validation (shared memory integration)
- Simpler Rust scanner (aggregator writes shared memory)
Critical Changes (High Impact - Must Implement)
1. Cache TTL: 2s → 5s ⭐ PRIORITY 1
Current State:
ExpiresAt: time.Time // 2s TTL (Line 598)
Problem:
- Pool refresh interval: 10s (AMM), 30s (CLMM)
- Pool staleness threshold: 60s
- Quote cache expires too quickly relative to pool state refresh
Recommendation: Change to 5s TTL
Rationale:
- Pool State Validity: Pool state is valid for 60s, quotes should have longer TTL
- Arbitrage Window: Most arbitrage opportunities persist 5-10s on Solana
- API Efficiency: 60% fewer recalculations (5s vs 2s)
- Slot Consistency: 5s ≈ 12 Solana slots (enough for detection)
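The 60% figure follows directly from expiry-driven recalculation rates; a quick back-of-the-envelope check (worst case, one recalculation per cache expiry):

```go
package main

import "fmt"

// Worst-case recalculations per pair per minute for a given TTL:
// the quote is recomputed once each time the cache entry expires.
func recalcsPerMinute(ttlSeconds float64) float64 {
	return 60.0 / ttlSeconds
}

func main() {
	before := recalcsPerMinute(2) // 30 recalcs/min at 2s TTL
	after := recalcsPerMinute(5)  // 12 recalcs/min at 5s TTL
	fmt.Printf("reduction: %.0f%%\n", (before-after)/before*100) // reduction: 60%
}
```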
Trade-off Analysis:
| TTL | Freshness | CPU Usage | Arbitrage Detection |
|---|---|---|---|
| 2s | Very fresh | 100% | Good |
| 5s | Fresh | 40% | Good ✅ BEST BALANCE |
| 10s | Stale | 20% | Misses fast moves |
Impact:
- CPU: 60% reduction in quote recalculation
- Throughput: 2,500 req/s → 4,000+ req/s
- Cache Hits: 85-90% → 92-95% (longer window)
Configuration Changes:
# go/.env
QUOTE_CACHE_TTL=5s # Was: 2s
# deployment/docker/docker-compose.yml (Line 1815)
- QUOTE_CACHE_TTL=5s
Files to Update:
- docs/30-QUOTE-SERVICE-ARCHITECTURE.md: Lines 598, 607, 617, 814, 826, 1815
- go/pkg/quote-service/cache.go: Update TTL constant
- go/.env.example: Update default value
2. Parallel Paired Quotes for External Service ⭐ PRIORITY 2
Current State:
- Local Service: ✅ Implements parallel paired quotes (<10ms)
- External Service: ❌ Sequential API calls (500ms for paired quotes)
Problem:
Sequential External Quotes:
T=0ms:   Start forward API call (SOL → USDC)
T=250ms: Forward complete
T=250ms: Start reverse API call (USDC → SOL) ⚠️ Market may have changed!
T=500ms: Reverse complete
Result: 500ms total, different market snapshots, temporal inconsistency
Recommendation: Implement parallel paired quotes in External Quote Service
New Design:
// EXTERNAL QUOTE SERVICE - Parallel Paired Quotes
func (s *ExternalQuoteService) GetPairedQuotesParallel(
    ctx context.Context,
    inputMint, outputMint, amount string,
) *PairedQuoteResult {
    ctxWithTimeout, cancel := context.WithTimeout(ctx, 1000*time.Millisecond)
    defer cancel()

    // Ensure we have ≥2 rate limit tokens (Jupiter)
    if !s.rateLimiter.AllowN(time.Now(), 2) {
        // Fallback: Use different providers (Jupiter forward, DFlow reverse)
        return s.getPairedQuotesWithProviderRotation(ctx, inputMint, outputMint, amount)
    }

    // Result channels (buffered so goroutines never block after a timeout)
    forwardChan := make(chan *ExternalQuoteResult, 1)
    reverseChan := make(chan *ExternalQuoteResult, 1)
    startTime := time.Now()

    // Launch forward goroutine (SOL → USDC) - PARALLEL!
    go func() {
        quote, err := s.getQuoteFromProviders(ctxWithTimeout, inputMint, outputMint, amount)
        forwardChan <- &ExternalQuoteResult{Quote: quote, Error: err}
    }()

    // Launch reverse goroutine (USDC → SOL) - SIMULTANEOUSLY!
    go func() {
        quote, err := s.getQuoteFromProviders(ctxWithTimeout, outputMint, inputMint, amount)
        reverseChan <- &ExternalQuoteResult{Quote: quote, Error: err}
    }()

    // Wait for BOTH or timeout
    result := &PairedQuoteResult{}
    resultsReceived := 0
loop:
    for resultsReceived < 2 {
        select {
        case fwd := <-forwardChan:
            result.Forward = fwd
            resultsReceived++
        case rev := <-reverseChan:
            result.Reverse = rev
            resultsReceived++
        case <-ctxWithTimeout.Done():
            break loop
        }
    }

    result.TotalMs = time.Since(startTime).Milliseconds()
    return result
}
Benefits:
- 2× faster: 250ms instead of 500ms
- Temporal consistency: Both quotes from same market snapshot
- Better arbitrage detection: Same timestamp reduces false positives
- Same API call count: No additional rate limit consumption
Rate Limiting Strategy:
// Option 1: Ensure ≥2 tokens available
if !rateLimiter.AllowN(time.Now(), 2) {
    // Fallback: sequential or different providers
}

// Option 2: Provider rotation for paired quotes
forward := providerRotator.SelectProvider() // Jupiter
reverse := providerRotator.SelectProvider() // DFlow (different provider)
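Option 2's rotator is not specified in the base document; a minimal round-robin sketch (the type and method names are assumptions) that guarantees forward and reverse land on different providers when at least two are configured:

```go
package main

import "fmt"

// providerRotator hands out providers round-robin, so two consecutive
// calls (forward, then reverse) use different providers.
// A production version would need a mutex around next for concurrent use.
type providerRotator struct {
	providers []string
	next      int
}

func (r *providerRotator) SelectProvider() string {
	p := r.providers[r.next%len(r.providers)]
	r.next++
	return p
}

func main() {
	rotator := &providerRotator{providers: []string{"Jupiter", "DFlow"}}
	fmt.Println(rotator.SelectProvider()) // Jupiter (forward quote)
	fmt.Println(rotator.SelectProvider()) // DFlow (reverse quote)
}
```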
Impact:
- Latency: 500ms → 250ms (2× faster)
- Temporal consistency: 100% (same timestamp)
- False positives: 30-40% reduction
Files to Create/Update:
- go/cmd/external-quote-service/parallel_paired_quotes.go: New implementation
- go/pkg/external-quote-service/provider_rotation.go: Update for paired quotes
- docs/30-QUOTE-SERVICE-ARCHITECTURE.md: Add to External Service section
3. Batch Streaming Model ⭐ PRIORITY 3
Current State:
- Client calls StreamQuotes(SOL → USDC, 1 SOL) for each pair individually
- N client requests for N pairs
- Aggregator does N fan-outs (inefficient)
Problem:
Current: N pairs = N gRPC connections

Client → Aggregator: Connection 1 (SOL/USDC)
                     Connection 2 (SOL/USDT)
                     Connection 3 (ETH/USDC)
                     ...
                     Connection N
Result: High connection overhead, repeated fan-outs, inefficient
Recommendation: Batch Subscription Model
New Proto Definition:
// ✅ NEW: Batch streaming API
message BatchQuoteRequest {
  repeated PairConfig pairs = 1;

  message PairConfig {
    string input_mint = 1;
    string output_mint = 2;
    uint64 amount = 3;
    string pair_id = 4; // Client-provided ID for tracking
  }
}

message BatchQuoteResponse {
  repeated AggregatedQuote quotes = 1;
  int64 timestamp = 2;
  optional double oracle_price_usd = 3; // ✅ NEW: Oracle price included
}

service QuoteAggregatorService {
  // ✅ NEW: Batch streaming (one connection, all pairs)
  rpc StreamBatchQuotes(BatchQuoteRequest) returns (stream BatchQuoteResponse);

  // Existing (kept for backward compatibility)
  rpc StreamQuotes(AggregatedQuoteRequest) returns (stream AggregatedQuote);
}
New Architecture:
┌─ CLIENT (Scanner) ─────────────────────────────────────────────
│
│ 1. Single request with ALL pairs (sent once at startup):
│
│      BatchQuoteRequest {
│        pairs: [
│          { inputMint: SOL, outputMint: USDC, amount: 1 SOL }
│          { inputMint: SOL, outputMint: USDT, amount: 1 SOL }
│          { inputMint: ETH, outputMint: USDC, amount: 0.1 ETH }
│        ]
│      }
│
│ 2. Receive a continuous stream of BatchQuoteResponse
│    (all pairs in a single response)
└────────────────────────────────────────────────────────────────
                         │ Single gRPC stream
                         ▼
┌─ QUOTE AGGREGATOR SERVICE ─────────────────────────────────────
│ • Maintain 2 persistent gRPC streams (local + external)
│ • Subscribe to ALL pairs in BOTH downstream services
│ • Merge the two streams (local + external) in real time
│ • Deduplicate + compare + aggregate
│ • Stream batched responses back to the client
└────────────────────────────────────────────────────────────────
          │ Stream 1 (persistent)      │ Stream 2 (persistent)
          ▼                            ▼
┌─ LOCAL QUOTE SERVICE ──────────────────────────────────────────
│ • Background refresh
│ • Push updates for ALL pairs to subscribers
└────────────────────────────────────────────────────────────────
┌─ EXTERNAL QUOTE SERVICE ───────────────────────────────────────
│ • Background refresh
│ • Push updates for ALL pairs to subscribers
└────────────────────────────────────────────────────────────────
Implementation:
func (s *QuoteAggregatorService) StreamBatchQuotes(
    req *BatchQuoteRequest,
    stream QuoteAggregatorService_StreamBatchQuotesServer,
) error {
    ctx := stream.Context()

    // Create persistent streams to downstream services
    localStream, err := s.localQuoteClient.StreamBatchQuotes(ctx, req)
    if err != nil {
        return fmt.Errorf("local stream: %w", err)
    }
    externalStream, err := s.externalQuoteClient.StreamBatchQuotes(ctx, req)
    if err != nil {
        log.Warn("external stream failed, continuing with local-only")
    }

    // gRPC client streams expose Recv(), not channels - pump each stream
    // into a channel so both can be merged in a single select loop.
    localCh := make(chan *BatchQuoteResponse, 1)
    externalCh := make(chan *BatchQuoteResponse, 1)
    go func() {
        defer close(localCh)
        for {
            batch, err := localStream.Recv()
            if err != nil {
                return
            }
            localCh <- batch
        }
    }()
    if externalStream != nil {
        go func() {
            defer close(externalCh)
            for {
                batch, err := externalStream.Recv()
                if err != nil {
                    return
                }
                externalCh <- batch
            }
        }()
    }

    // Maintain quote tables (deduplication + aggregation)
    localQuotes := make(map[string]*LocalQuote) // pairID -> quote
    externalQuotes := make(map[string]*ExternalQuote)
    oraclePrices := make(map[string]float64) // ✅ NEW: Oracle prices

    // Merge streams in real-time
    for {
        select {
        case localBatch, ok := <-localCh:
            if !ok {
                localCh = nil // local stream ended; disable this case
                continue
            }
            // Update local quote table
            for _, quote := range localBatch.Quotes {
                localQuotes[quote.PairId] = quote
                if quote.OraclePrice > 0 {
                    oraclePrices[quote.PairId] = quote.OraclePrice
                }
            }
            // Send aggregated batch
            batch := s.buildBatchResponse(localQuotes, externalQuotes, oraclePrices)
            if err := stream.Send(batch); err != nil {
                return err
            }
        case externalBatch, ok := <-externalCh:
            if !ok {
                externalCh = nil // external stream ended; disable this case
                continue
            }
            // Update external quote table
            for _, quote := range externalBatch.Quotes {
                externalQuotes[quote.PairId] = quote
            }
            // Send aggregated batch
            batch := s.buildBatchResponse(localQuotes, externalQuotes, oraclePrices)
            if err := stream.Send(batch); err != nil {
                return err
            }
        case <-ctx.Done():
            return ctx.Err()
        }
    }
}
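buildBatchResponse is assumed but not defined in this document; one plausible sketch of its semantics (the type shapes and the best-output tie-break are assumptions, not the canonical merge rule):

```go
package main

import (
	"fmt"
	"time"
)

// Minimal stand-ins for the proto types (assumed shapes for this sketch).
type AggregatedQuote struct {
	PairID         string
	OutputAmount   uint64
	Source         string // "local" or "external"
	OraclePriceUSD float64
}

type BatchQuoteResponse struct {
	Quotes    []AggregatedQuote
	Timestamp int64
}

// buildBatchResponse snapshots both quote tables into a single response,
// keeping the better output amount when a pair exists in both tables and
// attaching the latest oracle price per pair.
func buildBatchResponse(local, external map[string]uint64, oracle map[string]float64) *BatchQuoteResponse {
	resp := &BatchQuoteResponse{Timestamp: time.Now().UnixMilli()}
	seen := make(map[string]bool)
	for pairID, out := range local {
		q := AggregatedQuote{PairID: pairID, OutputAmount: out, Source: "local", OraclePriceUSD: oracle[pairID]}
		if extOut, ok := external[pairID]; ok && extOut > out {
			q.OutputAmount, q.Source = extOut, "external" // external route pays more
		}
		resp.Quotes = append(resp.Quotes, q)
		seen[pairID] = true
	}
	for pairID, out := range external {
		if !seen[pairID] { // pair only known to the external service
			resp.Quotes = append(resp.Quotes, AggregatedQuote{
				PairID: pairID, OutputAmount: out, Source: "external", OraclePriceUSD: oracle[pairID],
			})
		}
	}
	return resp
}

func main() {
	local := map[string]uint64{"SOL/USDC": 100, "SOL/USDT": 99}
	external := map[string]uint64{"SOL/USDC": 101, "ETH/USDC": 50}
	resp := buildBatchResponse(local, external, map[string]float64{"SOL/USDC": 158.2})
	fmt.Println(len(resp.Quotes)) // 3
}
```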
Benefits:
- Single connection: Client → Aggregator (reduces overhead)
- No fan-out overhead: 2 persistent streams (local + external)
- Lower latency: No per-request fan-out, just stream merging
- Better resource usage: 1 gRPC stream instead of N streams
- Oracle price included: Direct access in quote response
Impact:
- Latency: 5-8ms → 3-5ms (no fan-out overhead)
- Connections: N → 1 per client
- Memory: 50-54 GB → 45-48 GB (fewer connections)
Files to Create/Update:
- proto/quote.proto: Add BatchQuoteRequest, BatchQuoteResponse, StreamBatchQuotes RPC
- go/cmd/quote-aggregator-service/batch_streaming.go: New implementation
- ts/apps/scanner-service/src/batch_quote_client.ts: Client implementation
4. Aggregator Writes to Dual Shared Memory ⭐ PRIORITY 4
Current State (Lines 343-544):
- Document mentions "Quote Service (Go - Writer)" but it is unclear which service is meant
- No specification of which service writes to shared memory
Problem:
- Option A (Each service writes separately): Rust scanner must merge quotes, no deduplication
- Option B (Aggregator writes): Deduplication done once, simpler Rust scanner
Recommendation: Option B - Aggregator writes to dual shared memory ✅
Architecture:
┌─ QUOTE AGGREGATOR SERVICE (Go - Writer) ───────────────────────
│
│ 1. Receive quotes from TWO sources:
│    • Local Service (gRPC stream)
│    • External Service (gRPC stream)
│
│ 2. Maintain deduplication tables in memory:
│    localTable    := sync.Map[pairID]*LocalQuote
│    externalTable := sync.Map[pairID]*ExternalQuote
│    routeTable    := sync.Map[routeHash]routeID
│
│ 3. On quote update:
│    • Update in-memory table (pairID is key)
│    • Replace existing quote (keep latest only)
│    • Deduplicate routes (hash-based)
│    • Store route in Redis (hot cache) + PostgreSQL (cold)
│
│ 4. Write to TWO memory-mapped files (lock-free):
│    • quotes-local.mmap (256 bytes × 1,000 pairs = 256 KB)
│    • quotes-external.mmap (256 bytes × 1,000 pairs = 256 KB)
│
│ 5. Atomic write protocol (versioning):
│    • Increment version (odd = writing)
│    • Write quote metadata
│    • Increment version (even = readable)
└────────────────────────────────────────────────────────────────
                │                              │
                ▼                              ▼
┌─ SHMEM #1 (Local): quotes-local.mmap ──────────────────────────
│ • Local pool quotes
│ • Oracle price included
│ • Sub-second freshness
│ • DEDUPLICATED
│ • RouteID → Redis/PG
└────────────────────────────────────────────────────────────────
┌─ SHMEM #2 (External): quotes-external.mmap ────────────────────
│ • External API quotes
│ • Oracle price included
│ • 10s refresh interval
│ • DEDUPLICATED
│ • RouteID → Redis/PG
└────────────────────────────────────────────────────────────────
                │
                ▼
┌─ RUST SCANNER (Readers - Production) ──────────────────────────
│ 1. Read from BOTH shared-memory regions (<1μs each)
│ 2. Intelligent selection:
│    • Strategy A: Use local (fresher, on-chain)
│    • Strategy B: Use external (better multi-hop)
│    • Strategy C: Compare both, pick best
│ 3. Oracle validation: Compare to oracle_price_usd field
│ 4. Detect arbitrage (<10μs vs 500μs-2ms with gRPC)
│ 5. Fetch route from Redis/PostgreSQL (routeID lookup)
└────────────────────────────────────────────────────────────────
Enhanced QuoteMetadata Struct:
// One quote entry per pair, 128-byte aligned. Note: the fields below sum
// to more than 128 bytes, so each entry occupies 256 bytes (two cache lines).
#[repr(C, align(128))]
struct QuoteMetadata {
    version: AtomicU64,       // Versioning for consistency (seqlock)
    pair_id: [u8; 32],        // BLAKE3(token_in, token_out, amount)
    input_mint: [u8; 32],     // Solana public key (32 bytes)
    output_mint: [u8; 32],    // Solana public key (32 bytes)
    input_amount: u64,        // Lamports
    output_amount: u64,       // Lamports
    price_impact_bps: u32,    // Basis points (0.01%)
    timestamp_unix_ms: u64,   // Unix timestamp (milliseconds)
    route_id: [u8; 32],       // BLAKE3(route_steps) -> lookup in Redis/PG
    oracle_price_usd: f64,    // ✅ NEW: Oracle price for validation
    staleness_flag: u8,       // ✅ NEW: 0=fresh, 1=stale, 2=very_stale
    _padding: [u8; 87],       // Pad to a 256-byte entry
}
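The odd/even versioning in step 5 above is a seqlock. A minimal single-process sketch using sync/atomic (names are assumptions; real cross-process use over an mmap additionally needs the payload written through atomics or fences):

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// entry mimics one quote slot: a version word plus payload.
// Odd version = writer in progress; even = stable and readable.
type entry struct {
	version atomic.Uint64
	payload uint64
}

func (e *entry) write(v uint64) {
	e.version.Add(1) // odd: writing in progress
	e.payload = v
	e.version.Add(1) // even: readable again
}

// read retries until it observes a stable (even, unchanged) version.
func (e *entry) read() uint64 {
	for {
		v1 := e.version.Load()
		if v1%2 != 0 {
			continue // writer active, retry
		}
		p := e.payload
		if e.version.Load() == v1 {
			return p // version unchanged: snapshot is consistent
		}
	}
}

func main() {
	var e entry
	e.write(42)
	fmt.Println(e.read()) // 42
}
```

Readers never block the single writer, which is what keeps the Rust scanner's reads lock-free.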
Deduplication Table:
// Aggregator maintains deduplication tables
type QuoteAggregator struct {
    // In-memory tables
    localQuotes    sync.Map // pairID -> *LocalQuoteMetadata
    externalQuotes sync.Map // pairID -> *ExternalQuoteMetadata
    routeHashes    sync.Map // routeHash -> routeID

    // Shared memory writers
    localWriter    *SharedMemoryWriter
    externalWriter *SharedMemoryWriter

    // Route storage
    redisClient    *redis.Client
    postgresClient *sql.DB
}
// On quote update from the Local Service
func (a *QuoteAggregator) updateLocalQuote(quote *LocalQuote) {
    pairID := calculatePairID(quote.InputMint, quote.OutputMint, quote.Amount)

    // Store in table (replaces existing)
    a.localQuotes.Store(pairID, quote)

    // Deduplicate route. LoadOrStore returns (value, loaded):
    // loaded == false means this route hash was not seen before.
    routeHash := hashRoute(quote.RouteSteps)
    routeIDAny, loaded := a.routeHashes.LoadOrStore(routeHash, quote.RouteID)
    routeID := routeIDAny.(string) // sync.Map stores any; string routeID assumed
    if !loaded {
        // New route - store in Redis (hot) + PostgreSQL (cold)
        a.storeRouteRedis(quote.RouteID, quote.RouteSteps, 30*time.Second)
        a.storeRoutePostgreSQL(quote.RouteID, quote.RouteSteps)
    }

    // Write to shared memory (atomic)
    metadata := &QuoteMetadata{
        PairID:          pairID,
        InputMint:       quote.InputMint,
        OutputMint:      quote.OutputMint,
        InputAmount:     quote.Amount,
        OutputAmount:    quote.OutputAmount,
        PriceImpactBps:  quote.PriceImpact,
        TimestampUnixMs: time.Now().UnixMilli(),
        RouteID:         routeID,
        OraclePriceUsd:  quote.OraclePrice,                    // ✅ NEW
        StalenessFlag:   calculateStaleness(quote.LastUpdate), // ✅ NEW
    }
    a.localWriter.WriteQuote(metadata)
}
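calculatePairID above hashes (input mint, output mint, amount) into a fixed 32-byte key. The document specifies BLAKE3; the sketch below substitutes stdlib SHA-256 purely so it is self-contained — only determinism and the 32-byte width matter here, and the mint strings are placeholders:

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// calculatePairID derives a stable 32-byte key from (inputMint, outputMint, amount).
// SHA-256 stands in for the BLAKE3 the document specifies; both yield 32 bytes.
func calculatePairID(inputMint, outputMint string, amount uint64) [32]byte {
	h := sha256.New()
	h.Write([]byte(inputMint))
	h.Write([]byte(outputMint))
	var amt [8]byte
	binary.LittleEndian.PutUint64(amt[:], amount)
	h.Write(amt[:])
	var id [32]byte
	copy(id[:], h.Sum(nil))
	return id
}

func main() {
	a := calculatePairID("SOL_MINT", "USDC_MINT", 1_000_000_000)
	b := calculatePairID("SOL_MINT", "USDC_MINT", 1_000_000_000)
	c := calculatePairID("SOL_MINT", "USDC_MINT", 2_000_000_000)
	fmt.Println(a == b, a == c) // true false
}
```

Hashing the amount into the key is what makes each (pair, size) combination its own shared-memory slot.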
Benefits:
- ✅ Single source of truth: Aggregator is the canonical quote state
- ✅ Deduplication done once: Not repeated in the Rust scanner
- ✅ Oracle price included: Available in shared memory (no Redis lookup)
- ✅ Staleness flag: Rust scanner can filter stale quotes
- ✅ Lock-free reads: Rust scanner needs no locks
- ✅ Simple Rust scanner: Just read, validate, compare
Impact:
- Rust scanner complexity: 40% reduction (no merging, deduplication, or oracle lookup)
- Oracle validation: 100× faster (<1μs vs 100μs Redis lookup)
- Memory efficiency: 256 KB × 2 regions = 512 KB (fits comfortably in L2 cache)
Files to Create/Update:
- go/cmd/quote-aggregator-service/shared_memory_writer.go: New implementation
- go/pkg/shared-memory/writer.go: Atomic write protocol
- rust/crates/scanner/src/shared_memory_reader.rs: Rust reader implementation
- docs/30-QUOTE-SERVICE-ARCHITECTURE.md: Update Lines 343-544 with Aggregator as writer
5. Oracle Price in Shared Memory ⭐ PRIORITY 5
Current State:
- Oracle price NOT in shared memory
- Rust scanner must make Redis/RPC call to get oracle price
Recommendation: Add oracle_price_usd to QuoteMetadata
Changes:
#[repr(C, align(128))]
struct QuoteMetadata {
    // ... existing fields ...
    oracle_price_usd: f64, // ✅ NEW: Oracle price (USD per token)
    staleness_flag: u8,    // ✅ NEW: 0=fresh, 1=stale, 2=very_stale
    _padding: [u8; 87],    // Pad the entry to its full 256-byte size
}
Rust Scanner Usage:
// Fast oracle validation (<1μs), inside the scan loop
let quote = &quotes_local[i];
let oracle_price = quote.oracle_price_usd;
let quote_price = (quote.output_amount as f64) / (quote.input_amount as f64);

let deviation = (quote_price - oracle_price).abs() / oracle_price;
if deviation > 0.01 {
    // >1% deviation - skip this quote
    continue;
}

// Also check staleness
if quote.staleness_flag > 1 {
    // Very stale (>10s old) - skip
    continue;
}
Benefits:
- 100× faster oracle validation: <1μs vs 100μs (Redis lookup)
- No external calls: All data in shared memory
- Better filtering: Skip stale/deviated quotes early
Additional Improvements (Nice to Have)
6. Rate Limit Token Pre-check for Parallel External Quotes
Recommendation: Ensure ≥2 tokens available before launching parallel calls
func (s *ExternalQuoteService) GetPairedQuotesParallel(...) {
    // Pre-check: Ensure we have ≥2 tokens for Jupiter
    if !s.rateLimiter.AllowN(time.Now(), 2) {
        // Fallback strategies:
        //   Option A: Sequential calls
        //   Option B: Use different providers (Jupiter + DFlow)
        return s.getPairedQuotesWithFallback(...)
    }
    // Launch parallel goroutines...
}
7. Quote Confidence Score Enhancement
Current: 3 factors (oracle, liquidity, price impact)
Recommendation: Add 3 more factors
func (c *QuoteComparator) calculateConfidence(...) float64 {
    confidence := 0.5

    // Existing factors:
    //   Factor 1: Oracle deviation (<1% = +0.3)
    //   Factor 2: Liquidity depth (>$100k = +0.1)
    //   Factor 3: Price impact (<0.5% = +0.1)

    // ✅ NEW Factor 4: Pool age (older = more stable)
    if poolAge > 30*24*time.Hour { // >30 days
        confidence += 0.05
    }

    // ✅ NEW Factor 5: Route hop count (fewer hops = more reliable)
    if hopCount == 1 { // Direct swap
        confidence += 0.05
    }

    // ✅ NEW Factor 6: Slippage tolerance (tighter = better)
    if slippageBps < 50 { // <0.5%
        confidence += 0.05
    }

    return min(confidence, 1.0)
}
8. Monitoring: Quote Freshness Histogram
Recommendation: Add Prometheus metric
# Alert if >10% of quotes are stale (>5s old): p90 of quote age above 5s
histogram_quantile(0.90, rate(quote_age_seconds_bucket[5m])) > 5
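As a sanity check on what the alert expresses, the stale share can be computed directly from observed quote ages (stdlib-only sketch; the 5s threshold matches the alert, the ages are illustrative):

```go
package main

import "fmt"

// staleFraction reports the share of quote ages (in seconds) older than
// the threshold - the condition the alert approximates via the p90 quantile:
// if p90 age exceeds 5s, more than 10% of quotes are older than 5s.
func staleFraction(ages []float64, threshold float64) float64 {
	stale := 0
	for _, a := range ages {
		if a > threshold {
			stale++
		}
	}
	return float64(stale) / float64(len(ages))
}

func main() {
	ages := []float64{0.5, 1, 2, 6, 7, 1, 2, 3, 0.8, 12}
	fmt.Println(staleFraction(ages, 5)) // 0.3
}
```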
9. Quote Staleness Warning in Shared Memory
Already included in Priority 4 - staleness_flag field
Performance Impact Analysis
| Change | Current | After | Improvement |
|---|---|---|---|
| Cache TTL | 2s | 5s | 60% less CPU |
| External Paired Quotes | 500ms | 250ms | 2× faster |
| Batch Streaming | N fan-outs | 2 persistent streams | Eliminates overhead |
| Aggregator Shared Memory | Undefined | Single writer | Consistency ✅ |
| Oracle Price in Shmem | Redis lookup | Direct read | 100× faster |
Overall System Impact:
- Latency: 5-8ms → 3-5ms (cached quotes)
- Throughput: 2,500 req/s → 4,000+ req/s
- CPU Usage: 40-50% → 25-35%
- Memory: 50-54 GB (no change, better utilization)
- Rust Scanner Complexity: 40% reduction
Implementation Roadmap
Phase 1: Cache TTL (1 day)
- Update QUOTE_CACHE_TTL=5s in all configs
- Update documentation
- Test cache hit rate improvement
Phase 2: Parallel External Quotes (3 days)
- Implement GetPairedQuotesParallel in the External Quote Service
- Add rate limit pre-check logic
- Add provider rotation for paired quotes
- Test latency improvement (500ms → 250ms)
Phase 3: Batch Streaming Model (5 days)
- Update proto/quote.proto with the batch APIs
- Implement StreamBatchQuotes in the Aggregator
- Update TypeScript client for batch streaming
- Test connection overhead reduction
Phase 4: Aggregator Shared Memory (7 days)
- Implement SharedMemoryWriter in the Aggregator
- Add deduplication tables (sync.Map)
- Integrate route storage (Redis + PostgreSQL)
- Update shared memory metadata struct (oracle price, staleness)
- Test atomic write protocol
Phase 5: Rust Scanner Integration (3 days)
- Implement shared memory reader in Rust
- Add oracle validation logic
- Add staleness filtering
- Test arbitrage detection latency (<10μs)
Total: 19 days (~4 weeks)
Testing Checklist
Unit Tests
- Cache TTL eviction (5s)
- Parallel paired quotes (both goroutines)
- Batch streaming (multiple pairs)
- Shared memory atomic writes
- Route deduplication
Integration Tests
- End-to-end quote flow (client → Rust scanner)
- Fallback scenarios (local-only, external-only)
- Rate limit handling (Jupiter 1 RPS)
- Oracle price validation in shared memory
Performance Tests
- Throughput: 4,000+ req/s
- Latency: P50 < 3ms, P95 < 5ms, P99 < 8ms
- Cache hit rate: >90%
- Shared memory read: <1μs
- Arbitrage detection: <10μs
Load Tests
- 1000 paired quotes/sec (sustained 10 min)
- 10K concurrent Rust scanner readers
- 24-hour soak test (stability)
Conclusion
This review identified 5 critical improvements that will transform the Quote Service from v3.0 to v3.1:
- ✅ Cache TTL optimization → 60% less CPU
- ✅ Parallel external quotes → 2× faster
- ✅ Batch streaming model → Eliminates fan-out overhead
- ✅ Aggregator shared memory → Simpler Rust scanner
- ✅ Oracle price in shmem → 100× faster validation
Expected System-Wide Impact:
- Latency: 5-8ms → 3-5ms (40% faster)
- Throughput: 2,500 → 4,000+ req/s (60% higher)
- CPU: 40-50% → 25-35% (40% reduction)
- Rust Scanner: 40% simpler
Recommendation: Implement all 5 critical changes in sequential phases over 4 weeks for maximum impact.
Document Version: 3.1 Review · Last Updated: December 31, 2025 · Status: ✅ Review Complete - Ready for Implementation · Next Steps: Create implementation tickets from the Phase 1-5 roadmap
