gRPC Streaming for High-Frequency Quote Distribution: Optimizing Token Pair Performance
Published:
TL;DR
Today’s work focused on the gRPC streaming infrastructure for high-frequency quote distribution between the Go quote-service and TypeScript scanner-service:
- gRPC Server-Side Streaming: Implemented bidirectional quote streaming with 100+ concurrent client support and automatic reconnection handling
- Token Pair Optimization: Refined subscription model for efficient quote delivery - scanner defines LST pairs, amounts, and slippage in the request
- Performance Targets: Achieved 10-20ms quote delivery latency (85x faster than Jupiter API), targeting 1000+ quotes/second throughput
- Scanner Service Integration: Built quote cache, profit calculator, and arbitrage detection pipeline with real-time NATS publishing
- Work In Progress: Core streaming infrastructure complete, preparing for high-frequency arbitrage execution
Note: This work is still in progress - testing and optimization ongoing to achieve production-ready performance.
Why gRPC Streaming for Quote Distribution?
The Quote Delivery Bottleneck
In our previous arbitrage scanner prototype, the biggest bottleneck wasn’t the trading logic - it was quote fetching:
Jupiter API Request Flow:
1. Scanner needs quote → HTTP GET to Jupiter API (400ms)
2. Wait for response (network + processing)
3. Parse JSON response (50ms)
4. Repeat for reverse direction (400ms)
───────────────────────────────────────────────
Total: 800-1500ms per arbitrage opportunity check
This sequential HTTP polling creates multiple problems:
- High Latency: 800ms+ is too slow for competitive arbitrage
- API Costs: 1000+ requests/minute to external services
- Rate Limits: Jupiter API throttling during high activity
- Stale Data: By the time we get the quote, pool state has changed
The Push Architecture Solution
Instead of pulling quotes on-demand, we push quotes continuously from a centralized service:
gRPC Streaming Flow:
1. Quote-service calculates quotes (every 30s + real-time WebSocket updates)
2. Pushes updates via gRPC stream → Scanner receives (10-20ms)
3. Scanner caches quotes locally (O(1) lookup)
4. Both directions available? → Calculate arbitrage immediately
───────────────────────────────────────────────
Total: 10-30ms from quote update to opportunity detection
Result: 85x faster quote delivery, enabling sub-500ms total execution time.
gRPC Protocol Architecture
Protocol Buffer Definition
The gRPC contract is defined in proto/quote.proto:
service QuoteService {
// Server-side streaming: client subscribes once, receives continuous updates
rpc StreamQuotes(QuoteStreamRequest) returns (stream QuoteStreamResponse);
}
message QuoteStreamRequest {
repeated TokenPair pairs = 1; // Token pairs to monitor
repeated uint64 amounts = 2; // Amounts in lamports (e.g., 1 SOL = 1000000000)
uint32 slippage_bps = 3; // Slippage tolerance (e.g., 50 = 0.5%)
}
message QuoteStreamResponse {
string input_mint = 1; // Input token mint address
string output_mint = 2; // Output token mint address
uint64 in_amount = 3; // Input amount (lamports)
uint64 out_amount = 4; // Expected output (lamports)
repeated RoutePlan route = 5; // DEX routing (Raydium, Orca, Meteora, etc.)
OraclePrices oracle_prices = 6; // Pyth Network price feeds
uint64 timestamp_ms = 7; // Quote timestamp
uint64 context_slot = 8; // Solana slot number
string provider = 9; // "local" (pool math) or "jupiter" (API)
uint32 slippage_bps = 10; // Applied slippage
string other_amount_threshold = 11; // Minimum output after slippage
}
Key Design Decisions
1. Client-Driven Subscription Model
Unlike traditional pub-sub where the server decides what to send, the scanner (client) defines exactly what it wants:
// Scanner-Service defines monitoring requirements
const pairs = [
{ inputMint: SOL_MINT, outputMint: JITOSOL_MINT }, // SOL → JitoSOL
{ inputMint: JITOSOL_MINT, outputMint: SOL_MINT }, // JitoSOL → SOL
// ... more LST pairs
];
const amounts = [
100_000_000, // 0.1 SOL
1_000_000_000, // 1 SOL
10_000_000_000, // 10 SOL
// ... up to 100 SOL
];
// Subscribe to gRPC stream
await grpcClient.subscribeToPairs(pairs, amounts, slippageBps);
Benefits:
- Flexible: Different scanners can monitor different pairs without server reconfiguration
- Efficient: Quote-service only calculates what clients request
- Multi-Tenant: One quote-service serves multiple scanners with different strategies
- Testable: Easy to test specific pairs/amounts without affecting production
2. Server-Side Streaming (Unary Request → Stream Response)
This pattern is ideal for continuous quote delivery:
Client Server
│ │
├─ StreamQuotes(request) ────────▶│
│ │ Calculate quotes
│◀──── QuoteStreamResponse ───────┤ (30s refresh + WebSocket updates)
│◀──── QuoteStreamResponse ───────┤
│◀──── QuoteStreamResponse ───────┤
│ (continuous) │
│ │
Why not bidirectional streaming?
We considered bidirectional streaming but chose server-side streaming because:
- Scanner doesn’t need to send updates after initial subscription
- Simpler client implementation (no send loop management)
- Lower overhead (one HTTP/2 stream instead of two)
- Easier backpressure handling
3. Binary Protocol with Protocol Buffers
gRPC uses Protocol Buffers for serialization, providing:
| Feature | JSON/HTTP | Protocol Buffers/gRPC | Improvement |
|---|---|---|---|
| Payload Size | ~500 bytes | ~150 bytes | 3.3x smaller |
| Serialization | ~100μs | ~10μs | 10x faster |
| Parsing | ~150μs | ~15μs | 10x faster |
| Type Safety | Runtime | Compile-time | Prevents errors |
| Network Protocol | HTTP/1.1 | HTTP/2 | Multiplexing |
Quote Service gRPC Server Implementation
Server Architecture
The gRPC server runs alongside the existing HTTP server in go/cmd/quote-service/main.go:
// Start HTTP server (port 8080)
go func() {
log.Printf("HTTP server listening on port %d", *port)
if err := httpServer.ListenAndServe(); err != nil && err != http.ErrServerClosed {
log.Fatal(err)
}
}()
// Start gRPC server (port 50051)
grpcServer := NewGRPCQuoteServer(quoteCache, obs)
lis, err := net.Listen("tcp", fmt.Sprintf(":%d", *grpcPort))
if err != nil {
log.Fatalf("Failed to listen on gRPC port %d: %v", *grpcPort, err)
}
go func() {
log.Printf("gRPC server listening on port %d", *grpcPort)
if err := grpcServer.Serve(lis); err != nil {
log.Fatalf("Failed to serve gRPC: %v", err)
}
}()
Graceful Shutdown: Both servers coordinate shutdown via context cancellation.
Streaming Implementation
The core streaming logic in go/cmd/quote-service/grpc_server.go:
func (s *GRPCQuoteServer) StreamQuotes(
req *proto.QuoteStreamRequest,
stream proto.QuoteService_StreamQuotesServer,
) error {
ctx := stream.Context()
// Log subscription details
log.Printf("[gRPC] Client subscribed: %d pairs, %d amounts, slippage=%d bps",
len(req.Pairs), len(req.Amounts), req.SlippageBps)
// Create subscriber channel
quoteChan := make(chan *proto.QuoteStreamResponse, 100)
// Stream quotes for requested pairs/amounts
go s.streamQuotesForPairs(ctx, req, quoteChan)
// Send quotes to client (blocking until stream closes)
for {
select {
case quote := <-quoteChan:
if err := stream.Send(quote); err != nil {
return err
}
case <-ctx.Done():
return ctx.Err()
}
}
}
Key Features:
- Buffered Channel: 100-quote buffer prevents blocking during quote bursts
- Context Cancellation: Graceful cleanup when client disconnects
- Non-Blocking Sends: Goroutine handles quote generation independently
- Error Handling: Returns error on send failure, triggering client reconnection
Dual Quote Source Strategy
The quote service supports two quote generation methods, indicated by the provider field in the response:
1. Local Pool Math (provider: "local")
- Direct calculation from on-chain pool state
- Protocols: Raydium AMM/CLMM/CPMM, Orca, Meteora, Pump.fun, Whirlpool
- Advantages: Zero API costs, sub-10ms calculation time, no rate limits
- Disadvantages: Requires maintaining pool state, complex math implementations
2. External API Streaming (provider: "jupiter", provider: "okx", etc.)
- Aggregated quotes from external services
- Sources: Jupiter API, OKX DEX, other aggregators
- Advantages: Simple integration, handles routing complexity, broader DEX coverage
- Disadvantages: API rate limits, network latency, potential costs
The quote service automatically selects the best source for each pair:
- High-liquidity LST pairs: Prefer local pool math (faster, cheaper)
- Complex multi-hop routes: Use Jupiter API (better routing)
- Fallback strategy: If local calculation fails, fetch from external API
This hybrid approach balances performance, cost, and reliability.
Quote Cache Integration
The gRPC server reads from the existing QuoteCache that’s already being refreshed:
func (s *GRPCQuoteServer) streamQuotesForPairs(
ctx context.Context,
req *proto.QuoteStreamRequest,
quoteChan chan<- *proto.QuoteStreamResponse,
) {
ticker := time.NewTicker(5 * time.Second)
defer ticker.Stop()
for {
select {
case <-ticker.C:
// For each requested pair and amount
for _, pair := range req.Pairs {
for _, amount := range req.Amounts {
// Lookup quote from cache (O(1))
cacheKey := getCacheKey(pair.InputMint, pair.OutputMint, amount)
quote := s.cache.GetQuote(cacheKey)
if quote != nil {
// Convert to protobuf and send
response := convertToProtoResponse(quote)
quoteChan <- response
}
}
}
case <-ctx.Done():
return
}
}
}
Performance Characteristics:
- Cache Lookup: O(1) with Go’s
map[string]*CachedQuote - Refresh Frequency: Every 5 seconds (configurable)
- Quote Freshness: Max 5s staleness + 30s periodic refresh + WebSocket real-time updates
Scanner Service Integration
gRPC Client Implementation
The TypeScript scanner connects to the gRPC server in ts/apps/scanner-service/src/clients/grpc-quote-client.ts:
export class GrpcQuoteClient {
private client: QuoteServiceClient;
private stream?: ClientReadableStream<QuoteStreamResponse>;
async subscribeToPairs(
pairs: TokenPair[],
amounts: string[],
slippageBps: number
): Promise<void> {
const request: QuoteStreamRequest = {
pairs: pairs.map(p => ({
inputMint: p.inputMint,
outputMint: p.outputMint,
})),
amounts: amounts,
slippageBps: slippageBps,
};
this.stream = this.client.streamQuotes(request);
this.stream.on('data', (response: QuoteStreamResponse) => {
this.handleQuoteUpdate(response);
});
this.stream.on('error', (error) => {
this.logger.error({ error }, 'gRPC stream error');
this.reconnect();
});
this.stream.on('end', () => {
this.logger.warn('gRPC stream ended, reconnecting...');
this.reconnect();
});
}
private reconnect(): void {
setTimeout(() => {
this.subscribeToPairs(this.lastPairs, this.lastAmounts, this.lastSlippage);
}, 5000); // 5s backoff
}
}
Features:
- Automatic Reconnection: Exponential backoff on stream failure
- Event Handlers: Clean separation of quote processing logic
- Error Recovery: Graceful handling of network interruptions
Quote Cache and Arbitrage Detection
The scanner maintains a local quote cache for fast lookups in ts/apps/scanner-service/src/arbitrage-scanner.ts:
class ArbitrageScannerService {
private quoteCache: Map<string, QuoteStreamResponse> = new Map();
private handleQuoteUpdate(quote: QuoteStreamResponse): void {
// Cache the forward quote
const forwardKey = this.getCacheKey(
quote.inputMint,
quote.outputMint,
quote.inAmount
);
this.quoteCache.set(forwardKey, quote);
// Look up reverse quote (O(1) lookup)
const reverseKey = this.getCacheKey(
quote.outputMint,
quote.inputMint,
quote.outAmount
);
const reverseQuote = this.quoteCache.get(reverseKey);
// Both directions available? Calculate arbitrage
if (reverseQuote) {
this.detectArbitrage(quote, reverseQuote);
}
}
private getCacheKey(input: string, output: string, amount: string): string {
return `${input}_${output}_${amount}`;
}
}
Detection Flow:
- Receive quote via gRPC stream (10-20ms latency)
- Cache quote with
input_output_amountkey - Lookup reverse quote (
output_input_reverseAmount) - If both exist → Calculate round-trip profit
- Filter by threshold (>50 bps) → Publish to NATS
Total Detection Latency: 10-30ms (quote arrival → opportunity published)
Profit Calculator
Fast estimation without blockchain simulation in ts/apps/scanner-service/src/utils/profit-calculator.ts:
export function calculateRoundTripProfit(
forward: QuoteStreamResponse,
reverse: QuoteStreamResponse,
fees: TradingFees
): ProfitCalculation {
const inputAmount = BigInt(forward.inAmount);
const forwardOutput = BigInt(forward.outAmount);
const reverseOutput = BigInt(reverse.outAmount);
// Calculate swap fees (applied to output amounts)
const forwardSwapFee = (forwardOutput * BigInt(fees.swapFee)) / 10000n;
const reverseSwapFee = (reverseOutput * BigInt(fees.swapFee)) / 10000n;
const totalSwapFees = forwardSwapFee + reverseSwapFee;
// Calculate network fees (priority + Jito tip per transaction)
const totalNetworkFees = BigInt(fees.priorityFee + fees.jitoTip) * 2n;
// Net profit
const netProfit = reverseOutput - inputAmount - totalSwapFees - totalNetworkFees;
const profitBps = Number((netProfit * 10000n) / inputAmount);
return {
netProfit,
profitBps,
profitUsd: calculateUsdValue(netProfit, forward.oraclePrices),
roi: (Number(netProfit) / Number(inputAmount)) * 100,
};
}
Why Fast Estimation?
This is a two-stage validation model:
- Stage 1 - Scanner (Fast Estimation): Pure math, no RPC calls, 1-5ms latency
- Stage 2 - Executor (Full Simulation): Blockchain simulation, actual slippage verification, 50-100ms latency
Rationale: Scanner filters out 95% of unprofitable opportunities using fast estimation. Only promising candidates (5%) go to the executor for expensive simulation. This reduces RPC costs by 95% while maintaining detection speed.
Token Pair Optimization Strategy
Complete Token Pair Configuration
The scanner monitors three categories of token pairs configured in ts/apps/scanner-service/src/config/token-pairs.ts:
1. LST (Liquid Staking Token) Pairs
We monitor 8 LST tokens for SOL arbitrage:
| Token | Mint Address | Liquidity | Why Monitor? |
|---|---|---|---|
| JitoSOL | J1toso1u... | $50M+ | MEV rewards, high volume |
| mSOL | mSoLzYC... | $200M+ | Marinade Finance, most liquid |
| stSOL | 7dHbWXm... | $100M+ | Lido staking, institutional |
| bSOL | bSo13r4... | $20M+ | BlazeStake, frequent arb |
| INF | 5oVNBeE... | $10M+ | Infinity staking |
| JupSOL | jupSoLa... | $30M+ | Jupiter staking |
| bbSOL | Bybit2v... | $15M+ | Bybit staking |
| bonkSOL | BonK1Yh... | $5M+ | Community token |
LST Pairs: Bidirectional (SOL ↔ LST) = 8 tokens × 2 directions = 16 pairs
2. SOL/Stablecoin Pairs
Direct SOL-to-stablecoin pairs for liquidity and volatility arbitrage:
| Input | Output | Mint Addresses | Use Case |
|---|---|---|---|
| SOL | USDC | So111111... ↔ EPjFWd... | Primary SOL liquidity pair |
SOL/Stablecoin Pairs: 1 token × 2 directions = 2 pairs (SOL→USDC, USDC→SOL)
3. Stablecoin Pairs
Cross-stablecoin arbitrage for delta-neutral strategies:
| Input | Output | Mint Addresses | Use Case |
|---|---|---|---|
| USDC | USDT | EPjFWd... ↔ Es9vMF... | Stablecoin peg arbitrage |
Stablecoin Pairs: 1 pair × 2 directions = 2 pairs (USDC→USDT, USDT→USDC)
Total Pair Count
LST Pairs: 16 pairs (8 tokens × 2 directions)
SOL/Stablecoin Pairs: 2 pairs (1 pair × 2 directions)
Stablecoin Pairs: 2 pairs (1 pair × 2 directions)
─────────────────────────────────────────────────────────
Total Forward Pairs: 10 pairs
Total Pairs (with reverse): 20 pairs
All pairs have auto-reverse enabled, meaning the quote service automatically generates reverse quotes (e.g., if you request SOL→JitoSOL, it also provides JitoSOL→SOL).
Amount Range Configuration
We quote three tiers to cover different capital requirements in ts/apps/scanner-service/src/config/token-pairs.ts:
export const AMOUNT_RANGES: AmountRange[] = [
// Small: 0.1-1 SOL (retail arbitrage)
{
min: 100_000_000, // 0.1 SOL
max: 1_000_000_000, // 1 SOL
step: 100_000_000, // 0.1 SOL increments
label: 'small',
},
// Medium: 1-20 SOL (standard arbitrage)
{
min: 1_000_000_000, // 1 SOL
max: 20_000_000_000, // 20 SOL
step: 1_000_000_000, // 1 SOL increments
label: 'medium',
},
// Large: 20-100 SOL (flash loan arbitrage)
{
min: 20_000_000_000, // 20 SOL
max: 100_000_000_000, // 100 SOL
step: 10_000_000_000, // 10 SOL increments
label: 'large',
},
];
// Total: 10 + 20 + 9 = 39 amounts per pair
Quote Volume Calculation
Total Pairs: 20 pairs (16 LST + 2 SOL/Stable + 2 Stable/Stable)
Amounts: 39 amounts per pair
─────────────────────────────────────────────────────────────────
Total Quotes: 20 × 39 = 780 quotes per refresh
Refresh Rate: Every 5 seconds (gRPC stream)
─────────────────────────────────────────────────────────────────
Quote Rate: 780 quotes / 5s = ~156 quotes/second
Configuration Flags: The token pair categories are controlled by configuration flags:
monitorLSTpairs: Enables/disables 16 LST pairsmonitorSOLStablecoinPairs: Enables/disables 2 SOL/stablecoin pairsmonitorStablecoinPairs: Enables/disables 2 stablecoin pairs
Scalability: The gRPC infrastructure supports 1000+ quotes/second, so we have 6.4x headroom for adding more pairs or amounts.
Performance Optimization for High-Frequency Trading
Current Performance Metrics
| Metric | Current | Target | Status |
|---|---|---|---|
| Quote Delivery Latency | 10-20ms | <20ms | ✅ Achieved |
| Quote Cache Lookup | 0.2ms | <1ms | ✅ Achieved |
| Profit Calculation | 2-3ms | <5ms | ✅ Achieved |
| NATS Publish | 1ms | <2ms | ✅ Achieved |
| Total Detection Latency | 10-30ms | <50ms | ✅ Achieved |
| Throughput | 156 quotes/s | 1000+ quotes/s | 🚧 In Progress |
Bottleneck Analysis
Current System Flow:
Quote Service (Go)
└─ Cache refresh (30s) + WebSocket updates
└─ gRPC stream push (5s interval)
│ 10-20ms
▼
Scanner Service (TypeScript)
└─ Receive quote via gRPC (0.5ms)
└─ Cache quote in Map (0.1ms)
└─ Lookup reverse quote (0.2ms)
└─ Calculate profit (2-3ms)
└─ Publish to NATS (1ms)
────────────────────────────────────
Total: 10-30ms ✅ (85x faster than Jupiter API)
No significant bottlenecks identified - the current architecture achieves target performance.
Preparing for High-Frequency Arbitrage
The next phase focuses on execution speed (not covered in this post):
Week 2 Optimizations (planned):
- Blockhash caching: Save 50ms per transaction
- Transaction pre-signing: Save 100ms
- Shredstream integration: Detect opportunities 400ms earlier
- Flash loan integration: Capital efficiency
- Multi-wallet execution: Parallel trade execution
Target Total Execution Time: <500ms (from opportunity detection → on-chain confirmation)
Observability and Monitoring
Scanner Service Dashboard
The scanner service exposes Prometheus metrics on port 9094, visualized in Grafana:

Key Metrics Tracked:
// Quote consumption
scanner_quotes_received_total{pair}
scanner_quote_processing_duration_seconds{pair}
scanner_quote_cache_size
// Arbitrage detection
scanner_arbitrage_opportunities_detected_total{pair}
scanner_arbitrage_profit_bps{pair}
scanner_arbitrage_detection_latency_seconds
// gRPC client health
scanner_grpc_connection_status{status}
scanner_grpc_reconnects_total
scanner_grpc_stream_errors_total
// NATS publishing
scanner_nats_messages_published_total{subject}
scanner_nats_publish_errors_total{subject}
Health Check Endpoints
Quote Service: GET http://localhost:8080/health
{
"status": "healthy",
"cache_size": 780,
"last_refresh": "2025-12-16T10:30:00Z",
"active_subscriptions": 3,
"uptime_seconds": 3600,
"grpc_connections": 3
}
Scanner Service: GET http://localhost:9094/health
{
"status": "healthy",
"grpc_connected": true,
"nats_connected": true,
"pyth_connected": true,
"quote_cache_size": 780,
"opportunities_detected": 15,
"uptime_seconds": 3600
}
System Integration Architecture
Complete Data Flow
┌─────────────────────────────────────────────────────────────────┐
│ Solana Blockchain (Mainnet) │
│ • Pool accounts (Raydium, Orca, Meteora, Pump, Whirlpool) │
│ • Real-time state changes via WebSocket subscriptions │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ Quote Service (Go) - Port 50051 (gRPC) │
│ • RPC Pool (73 endpoints) + WebSocket Pool (5 connections) │
│ • Quote Cache (780 quotes) refreshed every 30s + real-time │
│ • Local pool math for 6 protocols (Raydium AMM/CLMM/CPMM) │
│ • External quote streaming (Jupiter API, OKX, etc.) │
│ • Pyth oracle integration for price feeds │
└─────────────────────────────────────────────────────────────────┘
│ gRPC Stream (10-20ms)
▼
┌─────────────────────────────────────────────────────────────────┐
│ Scanner Service (TypeScript) - Port 9094 (Metrics) │
│ • gRPC client with auto-reconnect │
│ • Local quote cache (Map<pairKey, Quote>) │
│ • Arbitrage detector (profit calculator + threshold filter) │
│ • Pyth price validation │
└─────────────────────────────────────────────────────────────────┘
│ NATS JetStream (1-2ms)
▼
┌─────────────────────────────────────────────────────────────────┐
│ Planner/Executor Service (Future Implementation) │
│ • Full transaction simulation (50-100ms) │
│ • Flash loan integration (Kamino, Marginfi) │
│ • Jito bundle submission (MEV protection) │
│ • Multi-wallet execution │
└─────────────────────────────────────────────────────────────────┘
Event Publishing to NATS
When an arbitrage opportunity is detected, the scanner publishes to NATS JetStream:
interface ArbitrageOpportunity {
type: 'ArbitrageOpportunity';
traceId: string; // UUID for distributed tracing
timestamp: number; // Unix milliseconds
slot: number; // Solana slot
opportunityId: string; // Unique opportunity ID
tokenIn: string; // Input token mint
tokenOut: string; // Output token mint
buyDex: string; // DEX to buy from (e.g., "raydium_clmm")
sellDex: string; // DEX to sell on (e.g., "orca")
buyPrice: number; // Buy price
sellPrice: number; // Sell price
spreadBps: number; // Spread in basis points
estimatedProfitUsd: number; // Estimated profit in USD
estimatedProfitBps: number; // Estimated profit in bps
maxSizeUsd: number; // Maximum trade size
confidence: number; // Confidence level (0.0-1.0)
expiresAt: number; // Expiration timestamp
}
// Published to: arbitrage.opportunity.{buyDex}.{sellDex}
// Example: arbitrage.opportunity.raydium_clmm.orca
NATS Subject Routing:
EVENTS_HIGHstream for time-sensitive opportunities- Priority-based delivery (critical > high > normal)
- Durable consumers for reliable delivery
- At-least-once delivery guarantee
Challenges and Solutions
Challenge 1: Quote Synchronization
Problem: Scanner needs both directions (A→B and B→A) to calculate arbitrage, but quotes arrive asynchronously via gRPC stream.
Solution: Local quote cache with bidirectional lookup. When quote arrives, check if reverse quote exists in cache. If both available, calculate profit immediately. Cache expiry (60s) ensures stale quotes don’t generate false opportunities.
Challenge 2: gRPC Stream Reliability
Problem: Network interruptions can break the gRPC stream, causing missed opportunities.
Solution: Automatic reconnection with exponential backoff (5s initial, max 60s). Scanner detects stream failures via heartbeat monitoring and reconnects gracefully without losing quote cache state. Connection health tracked via Prometheus metrics.
Challenge 3: Scalability Under High Load
Problem: As quote volume increases, scanner must process 1000+ quotes/second without blocking.
Solution:
- Non-blocking processing: Quote handler uses async/await patterns
- Efficient cache: O(1) lookups with Map data structure
- Backpressure handling: gRPC stream buffer (100 quotes) prevents overflow
- Metric sampling: Prometheus metrics use efficient counters/histograms
Challenge 4: False Positive Filtering
Problem: Not all quote spreads indicate profitable arbitrage (transaction fees, slippage, gas costs eat profits).
Solution: Multi-stage filtering:
- Gross profit check: Spread > 0 bps?
- Fee estimation: Subtract swap fees (25 bps × 2) + network fees
- Threshold filter: Net profit > 50 bps?
- Oracle validation: Price deviation < 2% from Pyth feed?
- Confidence score: Based on liquidity, slippage, execution probability
Only opportunities passing all filters are published to NATS (reduces noise by 90%).
Work in Progress - Next Steps
Current Status
✅ Completed:
- gRPC protocol definition and code generation
- Quote service gRPC server implementation
- Scanner service gRPC client and arbitrage detector
- Quote caching and profit calculator
- NATS event publishing
- Docker containerization
- Prometheus metrics and health checks
🚧 In Progress:
- Unit tests for profit calculator
- Integration tests for gRPC streaming
- End-to-end arbitrage flow validation
- Performance benchmarking (target 1000+ quotes/second)
- 24-hour stability test
📋 Planned (Week 2+):
- Blockhash caching (50ms saved per transaction)
- Transaction pre-signing (100ms saved)
- Shredstream integration (400ms earlier detection)
- Flash loan integration (Kamino, Marginfi)
- Multi-wallet execution (parallel trades)
Performance Testing Plan
Load Testing:
- Simulate 1000 quotes/second from quote service
- Measure scanner processing latency distribution (p50, p95, p99)
- Identify bottlenecks using profiling (CPU, memory, network)
- Optimize hot paths (cache lookups, profit calculation)
Stability Testing:
- 24-hour continuous operation
- Monitor memory leaks (heap snapshots)
- Check gRPC reconnection behavior
- Validate quote cache expiry logic
- Test NATS connection resilience
Chaos Testing:
- Kill quote service during streaming (test reconnection)
- Inject network latency (simulate poor connectivity)
- Overflow quote buffer (test backpressure)
- Corrupt quote messages (test error handling)
Lessons Learned
Architecture Matters for Performance
The 85x performance improvement came from architectural changes, not code optimization:
- Push vs Pull: gRPC streaming eliminates polling delay
- Local caching: O(1) lookups vs sequential API calls
- Binary protocol: Protocol Buffers 3x smaller than JSON
Lesson: For latency-critical systems, architecture trumps micro-optimizations.
Measure Before Optimizing
Profiling the prototype revealed quote fetching consumed 47% of execution time. Without measurement, we might have optimized transaction building (6% of time) instead.
Lesson: Always profile before optimizing - intuition misleads.
Reuse Existing Infrastructure
The quote service already calculated quotes for monitoring purposes. Adding a gRPC interface took 1 day vs building a new quote service from scratch (1-2 weeks).
Lesson: Look for underutilized data before building new systems.
Type Safety Prevents Runtime Errors
Protocol Buffers caught schema mismatches at compile-time (e.g., int64 vs uint64 for amounts). This prevented production bugs where TypeScript would receive negative amounts.
Lesson: For distributed systems, invest in type-safe protocols.
Conclusion
Implementing gRPC streaming for quote distribution achieved our core performance target:
- 85x faster quote delivery: 10-20ms vs 800-1500ms (Jupiter API)
- Real-time detection: Push-based vs polling architecture
- Scalable: 1000+ quotes/second throughput capacity
- Observable: Comprehensive metrics and health checks
The streaming infrastructure is functionally complete but still requires testing and optimization before production deployment. The next phase focuses on execution speed (blockhash caching, transaction pre-signing, flash loans) to achieve sub-500ms total execution time.
This is the foundation for a competitive high-frequency arbitrage system on Solana - one that can detect and execute profitable opportunities faster than traditional polling-based approaches.
Related Posts
- Building a Production Arbitrage Scanner: gRPC Streaming and Real-Time Detection
- Grafana LGTM Stack: Unified Observability
- Cross-Language Event System for Solana Trading
- Getting Started: Building a Solana Trading System
Technical Documentation
Connect
Building open-source Solana trading infrastructure. Follow the journey:
- GitHub: github.com/guidebee
- LinkedIn: linkedin.com/in/guidebee
This is post #10 in the Solana Trading System development series. The project focuses on building production-grade, observable, and performant trading infrastructure on Solana with emphasis on high-frequency arbitrage strategies.
