How I Built a Solana Trading System with AI as My Co-Developer
TL;DR
- AI was not a shortcut; used as a structured development methodology, it let one developer build what would normally need a team of five
- Four specialised AI roles replaced four human specialists: solution architect, Go developer, TypeScript developer, Rust developer
- Multiple LLMs as reviewers provided architecture validation that previously required a design review committee
- The system itself is agentic — the AI development approach shaped the runtime architecture (observe → decide → act)
A Different Kind of Solo Project
In late 2025, I started building a Solana high-frequency trading system. On paper it looked like a small project. In reality it turned out to be a polyglot monorepo spanning three languages — Go, TypeScript, and Rust — with a microservices architecture, a full observability stack, sub-500ms latency targets, and live on-chain execution.
That is the kind of project that normally requires a team of five or six engineers, six to twelve months, and a proper sprint board.
I built it in four months, mostly alone — a couple of hours on weekday evenings and some weekend time, treated as a hobby project rather than a full-time job.
The difference was how I used AI tools — not as autocomplete, not as a Stack Overflow replacement, but as a genuine development team where I was the lead and the AI filled the specialist roles.
I wrote about the career dimension of this transition in From Full-Stack Developer to AI Engineer: Is Now the Right Time to Make the Switch? — the short answer being: yes, now is exactly the right time. This post goes deeper into the technical reality of what that actually looks like day to day.
The Team I Never Had to Hire
The first thing I changed was how I thought about Claude Code (Anthropic’s AI coding CLI). Most developers use it like a very fast search engine — ask a question, get an answer, copy some code. I treated it differently: as a set of specialist colleagues who needed proper onboarding.
I created four distinct skill roles, each loaded with domain-specific expertise:
Solution Architect — responsible for system design, technology selection, and cross-service integration decisions. When I was unsure whether to use gRPC or NATS between services, the Architect role weighed the tradeoffs in the context of this specific system’s latency requirements and operational complexity.
Go Senior Developer — owned the quote service, DEX routing logic, and the pool pricing algorithms. This role knew which patterns were forbidden (blocking calls inside a hot loop), which libraries were preferred, and what 10ms latency actually means in Go concurrency terms.
TypeScript Senior Developer — owned the scanner, strategy, and executor services. Critically, this role knew that @solana/web3.js was banned in favour of @solana/kit — a non-obvious constraint that matters enormously for correctness and performance.
Rust Senior Developer — owned the RPC proxy and the Shredstream integration. This role understood zero-copy parsing, Tokio async patterns, and why naïve Rust is often slower than correct Rust.
Each skill file contained not just the role’s area of responsibility but also explicit forbidden patterns, performance targets, and naming conventions. The result: Claude Code’s output was consistent with production HFT requirements rather than with generic best practices learned from Stack Overflow posts written in 2019.
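For illustration, a hypothetical sketch of what one of these skill files might look like (the headings and wording are invented for this post; the constraints themselves are the ones described above):

```markdown
# Role: Go Senior Developer

## Owns
- Quote service, DEX routing logic, pool pricing algorithms

## Forbidden patterns
- Blocking calls inside a hot loop
- `@solana/web3.js` anywhere in the monorepo (TypeScript services use `@solana/kit`)

## Performance targets
- 10ms quote latency under normal load
```

The point is not the exact format but that the constraints are written down once and loaded into every session, instead of being re-explained by hand.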
Memory That Survives the Session
One of the hidden costs of AI-assisted development is the re-explanation problem. Every new conversation, you spend ten minutes re-establishing context: here is the system, here is the constraint, here is why we made that choice three weeks ago. It is exhausting and error-prone — you always leave something out, and the AI makes a suggestion that contradicts a decision you already made.
Claude Code’s persistent memory system changed this. Architectural decisions, debugging outcomes, and design constraints are stored and recalled across sessions automatically. Some examples of what was stored:
- “The scanner is a pre-filter only — `slippageBps: 0` is intentional, not a bug. Do not suggest adding slippage.”
- “Test baseline run on March 14 showed 142ms median latency on RPC calls — use this as the benchmark when evaluating optimisation suggestions.”
- “Jupiter Codama code generation requires `pnpm codama generate` from the root — do not suggest regenerating individual files.”
These sound like small things. Across a four-month project, they were the difference between coherent, consistent codebase evolution and a gradual drift towards contradictory implementations.
Using Multiple AIs as a Review Committee
Here is a practice I have not seen described elsewhere: using multiple frontier LLMs as independent reviewers before implementing architecture.
For each major design decision, I would write up the proposal and send it to four different models:
| Model | What I asked it to focus on |
|---|---|
| ChatGPT (GPT-4) | General correctness, race conditions, non-blocking patterns |
| Grok (xAI) | Performance optimisations and throughput bottlenecks |
| Qwen3-Max | Market regime awareness and operational resilience |
| DeepSeek | Risk management gaps and safety mechanisms |
The results were not just “AI says it looks good.” GPT-4 scored one architecture at 9.3/10 and identified two specific torn-read risks. Grok’s suggestions improved simulated success rates from 60–70% to 70–85%. DeepSeek found seven critical risk management gaps that I had not considered. Qwen3’s market regime detection suggestions added an estimated three to five percentage points of success probability.
This is what an architecture review committee does in a large engineering organisation. I ran one with AI instead of colleagues. It cost me an afternoon and a modest API bill. The alternative would have been shipping with those seven gaps undiscovered.
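The fan-out itself is trivial to automate. A minimal sketch, assuming each model is wrapped in an async `review` function of your own (no specific vendor SDK is implied; the reviewer names are placeholders):

```typescript
// A reviewer is any async function that takes a design proposal and
// returns that model's written review. How it calls the model is up to you.
type Reviewer = (proposal: string) => Promise<string>;

// Send the same proposal to every reviewer in parallel and collect
// the answers keyed by reviewer name.
async function reviewCommittee(
  proposal: string,
  reviewers: Record<string, Reviewer>,
): Promise<Record<string, string>> {
  const entries = await Promise.all(
    Object.entries(reviewers).map(
      async ([name, review]) => [name, await review(proposal)] as const,
    ),
  );
  return Object.fromEntries(entries);
}
```

Each reviewer gets the identical write-up, which is what makes the disagreements between them informative.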
The AI That Runs the Trading System Itself
There is a deeper layer to this story. The development approach shaped the runtime architecture.
The trading system itself is built as an agentic pipeline:
```
SCANNERS  →  PLANNERS  →  EXECUTORS
(Observe)    (Decide)      (Act)
```
Scanners continuously monitor live DEX pools and emit typed opportunity events over NATS. Planners validate those opportunities, calculate risk scores, and publish execution plans. Executors autonomously build, sign, and submit transactions.
The three-layer structure — observe, decide, act — is the same model used in modern AI agent frameworks. It was not a coincidence. After months of working with AI agents as development tools, the same architecture that made the AI tools reliable (separation of concerns, typed interfaces, explicit state) turned out to be the right architecture for autonomous trading too.
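That separation is easy to see in code. Here is a minimal, illustrative sketch of the typed hand-off between Observe and Decide; the field names and thresholds are assumptions for this post, not the real NATS schema:

```typescript
// Emitted by scanners (Observe). Fields are illustrative assumptions.
interface OpportunityEvent {
  poolA: string;
  poolB: string;
  spreadBps: number;  // observed price spread between the two pools
  observedAt: number; // unix timestamp in ms, stamped by the scanner
}

// Published by planners (Decide) for executors to act on.
interface ExecutionPlan {
  route: [string, string];
  sizeLamports: number;
  riskScore: number; // 0 (safe) .. 1 (worst acceptable)
}

// Minimal planner step: validate freshness and spread, then score risk.
// The 500ms budget and 10bps floor are placeholder numbers.
function planOpportunity(
  ev: OpportunityEvent,
  nowMs: number,
): ExecutionPlan | null {
  const ageMs = nowMs - ev.observedAt;
  if (ageMs > 500) return null;       // stale: outside the latency budget
  if (ev.spreadBps < 10) return null; // spread too thin to cover fees
  const riskScore = Math.min(1, ageMs / 500 + 10 / ev.spreadBps);
  return { route: [ev.poolA, ev.poolB], sizeLamports: 1_000_000, riskScore };
}
```

The typed interface between layers is the same discipline that made the AI skill files work: explicit contracts instead of implicit shared knowledge.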
There was also an earlier prototype: an agent that connected to TradingView chart data to interpret technical analysis signals — trend direction, momentum indicators — and feed them into the trading strategy. A system where one AI-flavoured component communicated with another. That felt like a preview of where software is going.
GitHub Copilot: Cost-Effective for the Routine Work
Running frontier AI models on everything gets expensive fast. One of the practical lessons from this project was that not every task needs the most capable model.
The approach I settled on: use Claude Code for complex, context-heavy work — multi-file refactors, architectural decisions, debugging sessions that require holding a lot of state in mind. Use GitHub Copilot for simpler, repetitive tasks where it is more cost-effective — boilerplate completions, straightforward pattern implementations, protocol buffer field definitions, retry loop scaffolding.
The two tools are not competing; they are covering different cost-complexity tiers. Routing tasks to the right tool keeps the overall API bill manageable without sacrificing quality on the work that actually needs a more capable model. For a hobby project with no budget, that cost discipline matters.
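As a concrete example of the routine tier, this is the kind of retry scaffolding I would hand to Copilot rather than a frontier model (a generic sketch, not code from the project):

```typescript
// Retry an async operation with exponential backoff.
// Delays grow as baseDelayMs * 2^attempt: 50ms, 100ms, 200ms, ...
async function withRetry<T>(
  fn: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 50,
): Promise<T> {
  let lastErr: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastErr = err;
      // Back off before the next attempt.
      await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** i));
    }
  }
  throw lastErr; // all attempts exhausted
}
```

Nothing here needs system-wide context or careful architectural judgement, which is exactly why it does not need the expensive model.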
OpenClaw: AI for Operations, Not Just Development
A few months in, I realised I needed AI not just to build the system but to run it.
I built an internal tool called OpenClaw — a local AI gateway that sits alongside the trading system in production. It connects a local DeepSeek model (via Ollama, running on my own hardware, not sent to any cloud API), GitHub Copilot’s API, and a Telegram bot.
The result: I can send a message to a Telegram channel and ask “what was the average arbitrage profit margin in the last hour?” and get a natural language answer sourced from live Prometheus metrics. I can ask “are there any anomalies in the RPC latency today?” and get a structured analysis.
This is a different kind of AI use — not development, not code generation, but operational intelligence. The trading system runs autonomously; OpenClaw is the interface through which I observe and understand what it is doing.
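As a sketch of the kind of check that sits behind an answer like that: the 142ms baseline comes from the project notes above, while the median-vs-baseline comparison and the 1.5x threshold are assumptions for illustration, not the real OpenClaw logic:

```typescript
// Flag an RPC latency anomaly if the median of recent samples drifts
// more than 50% above the recorded baseline (142ms from the test run).
function latencyAnomaly(samplesMs: number[], baselineMs = 142): boolean {
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const median = sorted[Math.floor(sorted.length / 2)];
  return median > baselineMs * 1.5;
}
```

The local model's job is then to wrap a verdict like this, plus the raw Prometheus numbers, in a readable sentence.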
What Actually Changes
I want to be direct about something: this is not a story about AI making development easy. It is a story about AI making ambitious development possible with smaller teams.
The hard parts were still hard. I spent days debugging why Jupiter’s /swap-instructions returns the wrong destination account for circular arbitrage routes. No AI solved that — I had to read the Borsh encoding, trace the account indexing through the merged route plan, and work out the fix. What AI did was handle the ninety percent of the work that is not that hard part: the boilerplate, the scaffolding, the known patterns, the consistency enforcement, the documentation. That freed my attention for the problems that actually required a human.
I also found that the quality of AI output depends almost entirely on the quality of constraints you give it. A vague prompt produces vague code. A prompt with explicit forbidden patterns, performance targets, and system-specific context produces code you can ship. The skill that matters is not knowing how to prompt — it is knowing your domain well enough to specify the constraints correctly.
What Comes Next
I wrote in the LinkedIn article that the transition from full-stack developer to AI engineer is not about learning AI — it is about learning how to direct AI. That is what this project taught me in practice.
The Solana trading system is still under active development — there is a lot left to build. The execution layer is validated, the pipeline architecture is proven, but the planner service, live risk controls, and competitive tip strategy are all still in progress. This is not a finished product. It is a moving target.
What AI tools change is the pace at which the target can be chased. When market conditions shift or a new DEX becomes relevant, I can pivot the implementation strategy in days rather than weeks. When a debugging session uncovers a new constraint, that constraint gets encoded into the AI’s memory and shapes every future suggestion. When a design decision needs revisiting, I can run it through the multi-LLM review process in an afternoon.
The project is still being built. But it is being built faster, and with the ability to change direction, than it would be without these tools. That is the real value — not automation, but acceleration and adaptability.
I do not think that model is unique to me or to this project. I think it is how a significant fraction of software will be built in the next three years.
Related Posts
- Scanner Service: AI Tools Reflection — post #24: First reflection on two months of AI-assisted development
- OpenClaw Setup: AI Monitoring for Solana Trading Bot — post #25: Building the operational AI layer
- OpenClaw: Cost-Effective AI Model Configuration — post #28: Balancing model cost and capability
- Planner Validation: Testing the Scanner→Planner→Executor Pipeline — post #27: What the AI-assisted architecture produced
Further Reading
- From Full-Stack Developer to AI Engineer: Is Now the Right Time? — LinkedIn article
- AI-Assisted Development Overview — Technical documentation
This is post #28 in the Solana Trading System development series. Follow along on GitHub or LinkedIn.
