Designing an Autonomous Trading Desk with AI Agents

The typical "AI trading bot" is a misnomer. It is usually a deterministic system — a chain of if-then-else rules dressed up with a machine learning model at one node. It does not reason. It does not adapt its approach when conditions change. It does not ask a colleague for a second opinion.

We wanted something fundamentally different. We wanted a trading desk — a group of specialists who collaborate, challenge each other, and reach collective decisions. Except the specialists are AI agents, each powered by Gemini 2.5 Pro, each with a distinct role, a distinct set of tools, and a distinct personality.

Figure: The 9-agent autonomous trading desk architecture. Alex supervises, traders specialize, Zainab guards. From left to right: the supervisor layer (Alex), trading swarm (Hugo, Bella, Mike, Victor, Ola, Azeez), risk oversight (Zainab), intelligence layer (Summy), and human override (Kenny via Telegram).

The Old Approach vs. The New

The old approach to algorithmic trading looks like this:

if momentum_score > 7:
    enter_long()
elif rsi < 30:
    enter_mean_reversion_long()
else:
    hold()

Market data flows into a feature pipeline, features feed a model, the model outputs a signal, the signal triggers execution. It is a linear pipeline with no feedback loops, no reasoning, and no ability to handle novel situations.

Our new approach is a multi-agent system built on LangGraph, where each agent is a stateful LLM-powered entity that reasons through problems, uses tools dynamically, and communicates with other agents through LangGraph's native handoff mechanisms. The contrast is stark:

Agent receives: "Scan for opportunities"
Agent reasons: "Let me check momentum..."
Agent calls: scan_momentum_tool()
Agent reasons: "SOL looks strong, R:R is 2.3. But let me check beta first."
Agent calls: get_beta("SOLUSDT")
Agent reasons: "High beta in current environment. I need smaller size."
Agent calls: request_approval(SOL, LONG, 1.5%, 180, 200, "Momentum 8.2, R:R 2.3, reduced for beta")

The agent is not just processing a signal. It is thinking about the signal in context.

The Nine Agents

Each agent on our desk has a distinct role, a set of tools, and specific criteria that govern when they act.

Alex — Portfolio Manager. Alex is the supervisor who orchestrates the entire desk. Built using create_supervisor() from the langgraph-supervisor library, Alex runs a check every 2 hours (more often during volatile conditions). His primary job is to consult Summy for the current market picture, then activate the appropriate specialists based on what opportunities exist. Alex controls capital allocation silently: traders do not know their allocation percentage and always operate as if they have 100% of their own budget. He escalates to Kenny only when there are real problems: broken tools, agent errors, or a drawdown breach.

Hugo — Directional/Momentum Trader. Hugo specializes in trend-following and momentum breakout strategies on 4h timeframes with 1h entry refinement. His entry criteria require a momentum score of at least 7.0 (from Summy's C++ scanner engine), a risk-reward ratio of at least 2.0, and higher-timeframe trend confirmation. He trades crypto perpetuals on Binance and Bybit, sizes at 2% per trade (max 4%), holds for hours to days (max 72h), and uses trailing stops set 2% from the extreme.
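Hugo's entry gate and trailing stop can be sketched as follows. This is a minimal illustration that assumes only the thresholds quoted above; the `Setup` fields and both function names are hypothetical and are not taken from the desk's actual code.

```python
from dataclasses import dataclass


@dataclass
class Setup:
    momentum_score: float  # score reported by the scanner
    risk_reward: float     # expected reward divided by risk
    htf_trend_up: bool     # higher-timeframe trend agrees with a long


def passes_entry_gate(setup: Setup) -> bool:
    """All three criteria must hold: momentum >= 7.0, R:R >= 2.0, HTF confirmation."""
    return (
        setup.momentum_score >= 7.0
        and setup.risk_reward >= 2.0
        and setup.htf_trend_up
    )


def trailing_stop_long(highest_price: float, trail_pct: float = 0.02) -> float:
    """For a long, the stop sits trail_pct below the highest price seen since entry."""
    return highest_price * (1.0 - trail_pct)
```

A short position would mirror the stop, trailing 2% above the lowest price seen since entry.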

Bella — Mean Reversion Trader. Bella fades extreme moves. Her sweet spot is an absolute z-score between 2.5 and 4.0: extended enough to warrant a fade, not so extreme that she is catching a falling knife. She uses up to 3x leverage (lower than Hugo due to counter-trend risk), targets holding periods of at most 24 hours, and exits when the absolute z-score reverts to 0.5. If the absolute z-score extends beyond 5.0, she treats it as setup failure and exits immediately.
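Bella's thresholds map naturally onto a small decision function. The sketch below assumes the thresholds apply to the absolute z-score (so that an oversold reading such as z = -2.8 qualifies); the function name and the action labels are hypothetical.

```python
def bella_action(z: float, in_position: bool) -> str:
    """Map a z-score to an action using the thresholds described above."""
    magnitude = abs(z)
    if in_position:
        if magnitude > 5.0:
            return "exit_failed_setup"  # the move kept extending: abandon the fade
        if magnitude <= 0.5:
            return "exit_target"        # price has reverted close to its mean
        return "hold"
    if 2.5 <= magnitude <= 4.0:
        # Fade the move: short when stretched upward, long when stretched downward.
        return "enter_short" if z > 0 else "enter_long"
    return "skip"
```

Readings between 4.0 and 5.0 produce no new entry here, which matches the "not so extreme" condition in the text.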

Mike — Portfolio Hedging Specialist. Mike manages beta-neutral long/short portfolios. He monitors the portfolio's net beta exposure continuously and deploys hedges when it drifts outside his target range. Unlike Hugo and Bella who trade directionally, Mike's goal is market neutrality — his longs and shorts should cancel out most market beta. He has a beta monitor that can wake him automatically if portfolio beta drifts beyond tolerance.
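The beta monitor can be illustrated with a weighted sum. This is a sketch under stated assumptions: positions are expressed as signed portfolio weights, the target is zero net beta, and the 0.10 tolerance is a hypothetical value, since the article does not give Mike's actual range.

```python
def net_beta(positions: dict[str, tuple[float, float]]) -> float:
    """positions maps symbol -> (signed weight, beta); longs positive, shorts negative."""
    return sum(weight * beta for weight, beta in positions.values())


def needs_hedge(positions: dict[str, tuple[float, float]], tolerance: float = 0.10) -> bool:
    """True when net beta has drifted further from zero than the tolerance."""
    return abs(net_beta(positions)) > tolerance
```

For example, a 20% long in an asset with beta 1.5 paired with a 25% short in an asset with beta 1.0 leaves a net beta of roughly 0.05, inside the assumed tolerance.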

Victor — Funding Arbitrage Trader. Victor exploits funding rate extremes on perpetual futures. When funding rates are significantly positive (longs are crowded and paying shorts), he goes SHORT to collect funding yield plus potential directional alpha. When funding is significantly negative, he goes LONG. Before entering any trade, he requires Summy to confirm that technical indicators align with his funding-directed thesis. This dual confirmation — funding + technicals — is his edge.
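Victor's dual confirmation reduces to two checks that must agree. In the sketch below, the 0.05% per 8h threshold for "significant" funding and the string-valued technical bias are assumptions made for illustration; the article does not specify either.

```python
from typing import Optional


def victor_direction(
    funding_rate_8h: float,
    technical_bias: str,
    threshold: float = 0.0005,  # 0.05% per 8h, a hypothetical cutoff
) -> Optional[str]:
    """Return "SHORT", "LONG", or None when funding and technicals do not both agree."""
    if funding_rate_8h >= threshold:
        wanted = "SHORT"  # longs are crowded and paying shorts
    elif funding_rate_8h <= -threshold:
        wanted = "LONG"   # shorts are crowded and paying longs
    else:
        return None       # funding is not extreme enough to trade
    return wanted if technical_bias == wanted else None
```

With ETH funding at 0.08% per 8h (0.0008), the function returns "SHORT" only if the technical bias supplied by the intelligence layer is also "SHORT".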

Ola — Liquidation/Momentum Scalper. Ola exploits liquidation cascades. When a large cascade is detected (more than 70% one-sided liquidation imbalance over 15 minutes), Ola enters in the cascade direction with market orders, holds for 5–15 minutes, and exits quickly. He bypasses Zainab's per-trade approval — instead, he operates under a rolling 24-hour loss limit of 5%. The execution engine automatically locks him out when this limit is breached.
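Ola's trigger and lockout can be sketched with two small functions. The assumptions here are that "one-sided imbalance" means the dominant side's share of total liquidated notional in the 15-minute window, that long liquidations push price down (so the cascade direction is short), and that losses are tracked as fractions of his budget. All names are hypothetical.

```python
from typing import Optional

DAY_SECONDS = 24 * 60 * 60


def cascade_direction(
    long_liq_usd: float, short_liq_usd: float, threshold: float = 0.70
) -> Optional[str]:
    """Return the cascade direction if one side exceeds the imbalance threshold."""
    total = long_liq_usd + short_liq_usd
    if total <= 0:
        return None
    dominant_share = max(long_liq_usd, short_liq_usd) / total
    if dominant_share <= threshold:
        return None
    # Liquidated longs are forced sellers, so the cascade drives price lower.
    return "SHORT" if long_liq_usd > short_liq_usd else "LONG"


def locked_out(
    pnl_events: list[tuple[float, float]], now: float, limit: float = 0.05
) -> bool:
    """pnl_events holds (unix_timestamp, pnl_fraction); lock when 24h losses reach the limit."""
    window_start = now - DAY_SECONDS
    rolling_pnl = sum(pnl for ts, pnl in pnl_events if ts >= window_start)
    return rolling_pnl <= -limit
```

Because the window is rolling rather than calendar-based, a lockout clears on its own once old losses age past 24 hours.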

Azeez — Range/Mean-Reversion Scalper. Azeez is Ola's counterpart in quiet markets. He trades bounces at range extremes with 10–30 minute holding periods and limit order execution. He checks orderbook depth to confirm visible support/resistance, watches for absence of liquidation cascades (which would indicate trending conditions that invalidate his range thesis), and uses Summy's z-score scanner to identify the most extreme boundary tests.

Zainab — Risk Manager. Zainab is the gatekeeper for all non-scalper trades. Every proposal from Hugo, Bella, Mike, and Victor must pass through her before execution. She evaluates cross-trader correlation (are multiple traders making correlated bets?), symbol concentration (is any single symbol overweighted across the portfolio?), total portfolio exposure (never above 80% of capital), and portfolio drawdown (halt everything at -5%). Her veto is absolute. If she rejects a trade, it does not happen — not even Alex can override her on risk grounds.
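Two of Zainab's four checks, the 80% exposure cap and the -5% drawdown halt, are simple enough to sketch. The function below is a hypothetical illustration that returns an approved size, which also models approval at reduced size; correlation and concentration checks are deliberately left out because the article gives no thresholds for them.

```python
def approved_size(
    requested: float,
    exposure: float,
    drawdown: float,
    max_exposure: float = 0.80,
    halt_drawdown: float = -0.05,
) -> float:
    """Return the approved position size as a fraction of capital (0.0 means rejected)."""
    if drawdown <= halt_drawdown:
        return 0.0  # portfolio-wide halt: nothing new is approved
    headroom = max_exposure - exposure
    # Approve at most the remaining headroom under the exposure cap.
    return max(0.0, min(requested, headroom))
```

A 4% request with the portfolio at 78% exposure would be trimmed to roughly 2%, while any request during a 6% drawdown is rejected outright.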

Summy — Market Intelligence. Summy is the desk's intelligence layer. She wraps our proprietary trading_intelligence system — a C++ divergence scanner engine, multi-timeframe technical analysis, dual-benchmark beta calculations (using both BTC and ETH as benchmarks), funding rate monitoring, and live exchange connectivity. Every other agent consults Summy before making decisions. She does not trade; she provides the data and analysis that enables others to trade well.
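The dual-benchmark beta idea can be illustrated with the textbook definition, beta = cov(asset, benchmark) / var(benchmark), computed once against BTC and once against ETH. This is a plain-Python sketch of that formula, not the proprietary C++ engine; the function names and the return-series inputs are assumptions.

```python
def beta(asset_returns: list[float], benchmark_returns: list[float]) -> float:
    """Covariance of asset and benchmark returns divided by benchmark variance."""
    n = len(asset_returns)
    if n != len(benchmark_returns) or n < 2:
        raise ValueError("need two return series of equal length (>= 2)")
    mean_a = sum(asset_returns) / n
    mean_b = sum(benchmark_returns) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(asset_returns, benchmark_returns))
    var = sum((b - mean_b) ** 2 for b in benchmark_returns)
    if var == 0:
        raise ValueError("benchmark returns have zero variance")
    return cov / var


def dual_beta(asset: list[float], btc: list[float], eth: list[float]) -> dict[str, float]:
    """Beta of the asset against each of the two benchmarks."""
    return {"BTC": beta(asset, btc), "ETH": beta(asset, eth)}
```

An asset whose returns are exactly twice BTC's in every period has a BTC beta of 2.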

How They Work Together

The agents communicate through LangGraph's handoff system. Any agent can call consult_agent() to request reasoning from another agent, or wake_agent() to trigger an asynchronous cycle. A typical trade flow:

Alex activates on his 2-hour supervisor cycle
Alex asks Summy: "Scan the full market. What do you see?"
Summy runs the C++ scanner engine across all Binance symbols, identifies 5 momentum signals, 2 z-score extremes, and an elevated funding rate on ETHUSDT
Alex publishes a market brief so all agents can access Summy's findings without re-consulting her
Alex wakes Hugo ("strong momentum in SOL, AVAX — check the brief"), wakes Bella ("DOGE oversold z=-2.8"), and wakes Victor ("ETH funding at 0.08% per 8h — confirm with Summy first")
Hugo activates, reads the brief, consults Summy for multi-timeframe analysis on SOL, gets a trade setup, requests approval from Zainab
Zainab checks cross-trader exposure, approves at slightly reduced size due to existing correlated positions
Hugo executes, sets a trailing stop
Zainab monitors ongoing exposure as other agents also trade

The agents run asynchronously in separate threads after Alex wakes them. Alex does not wait for their results — he trusts them to execute their mandates.
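The fire-and-forget wake pattern described above can be sketched with the standard library. This is a stand-in for illustration only: `wake` is a hypothetical helper, not the desk's wake_agent(), and each agent cycle is represented by a plain callable.

```python
import threading
from typing import Callable


def wake(
    name: str,
    task: Callable[[], str],
    results: dict[str, str],
    lock: threading.Lock,
) -> threading.Thread:
    """Start an agent cycle on its own thread and return immediately."""

    def run() -> None:
        outcome = task()
        with lock:  # protect the shared results dict across threads
            results[name] = outcome

    thread = threading.Thread(target=run, name=name, daemon=True)
    thread.start()
    return thread
```

The supervisor would call `wake(...)` for each trader and carry on without joining the threads; joining is only needed in tests or at shutdown.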

The LangGraph Architecture

LangGraph provides three key primitives that make this system work:

StateGraph with checkpointing. All agent state — positions, P&L, pending proposals, portfolio exposure — is stored in a shared SQLite checkpoint via SqliteSaver. If the system crashes, it resumes from exactly where it left off. This is critical for a 24/7 live trading system.
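To make the checkpoint-and-resume idea concrete, here is a standard-library sketch that persists state to SQLite keyed by a thread ID. It is deliberately not LangGraph's SqliteSaver; the table layout and function names are assumptions chosen to show the principle of saving state after each step and reloading it after a restart.

```python
import json
import sqlite3
from typing import Optional


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the checkpoint table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS checkpoints ("
        "thread_id TEXT PRIMARY KEY, state TEXT NOT NULL)"
    )
    return conn


def save_state(conn: sqlite3.Connection, thread_id: str, state: dict) -> None:
    """Overwrite the latest checkpoint for this thread inside a transaction."""
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO checkpoints (thread_id, state) VALUES (?, ?)",
            (thread_id, json.dumps(state)),
        )


def load_state(conn: sqlite3.Connection, thread_id: str) -> Optional[dict]:
    """Return the last saved state, or None if the thread has no checkpoint."""
    row = conn.execute(
        "SELECT state FROM checkpoints WHERE thread_id = ?", (thread_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None
```

With a file path instead of `:memory:`, a restarted process can call `load_state` and continue from the last saved positions and exposure.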

create_supervisor() and create_swarm(). Alex is built with create_supervisor(), which gives him control over the trading swarm. The swarm (Hugo, Bella, Mike, Victor, Ola, Azeez) is built with create_swarm(), which allows agents to hand off to each other dynamically based on what they discover.

LangGraph interrupt() for human-in-the-loop. Any agent can call interrupt() to pause execution and wait for human input. This is how the Telegram kill switch works — when a critical condition is detected, the system sends a message to the human operator and waits for a response before continuing.

Human-in-the-Loop via Telegram

Despite our confidence in the agents, we maintain a human-in-the-loop through Telegram integration. The human operator (referred to as Kenny in the system) receives notifications and can send commands.

Automatic escalation triggers:

- Any trade exceeding a size threshold ($500 notional in current configuration)
- Any asset the system has never traded before
- Portfolio drawdown exceeding 2% in a single day
- Agent errors or tool failures occurring three or more times consecutively
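The four escalation triggers can be expressed as one check that returns every reason that applies. The sketch below uses the thresholds listed above; the function name, the reason labels, and the convention of passing drawdown as a negative fraction are assumptions.

```python
def escalation_reasons(
    notional_usd: float,
    symbol: str,
    traded_before: set[str],
    daily_drawdown: float,
    consecutive_failures: int,
    max_notional_usd: float = 500.0,
) -> list[str]:
    """Return the triggers that fire; an empty list means no escalation is needed."""
    reasons = []
    if notional_usd > max_notional_usd:
        reasons.append("size")
    if symbol not in traded_before:
        reasons.append("new_asset")
    if daily_drawdown < -0.02:  # drawdown expressed as a negative fraction
        reasons.append("daily_drawdown")
    if consecutive_failures >= 3:
        reasons.append("repeated_failures")
    return reasons
```

Returning all matching reasons, rather than the first one, lets the notification tell the operator everything that is wrong in a single message.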

Kenny can respond with simple commands: approve, reject, halt all trading, or modify a specific proposal. The kill switch — a single Telegram command that halts all trading, closes all positions, and locks the system until manually resumed — is tested weekly.

The Capital Allocation System

Alex controls capital allocation through a hierarchical weight cascade: w_PM × w_T × w_S × internal sizing.

w_PM (Portfolio Manager weight): Set by Kenny. Default 70% — meaning 70% of total capital is active at any time, with 30% held as buffer
w_T (Trader weight): Set by Alex. Default allocations are Hugo 25%, Bella 20%, Mike 20%, Victor 15%, Ola 10%, Azeez 10%
w_S (Strategy weight): Set by each trader when deploying bots. A trader's active bots must sum to ≤ 100% of their allocation
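The cascade is a straight multiplication, which a short sketch makes explicit. The function names and the example figures below are illustrative assumptions; the final "internal sizing" factor (for example a per-trade percentage) is left out.

```python
def strategy_capital(total_capital: float, w_pm: float, w_t: float, w_s: float) -> float:
    """Capital available to one strategy: total x w_PM x w_T x w_S."""
    for weight in (w_pm, w_t, w_s):
        if not 0.0 <= weight <= 1.0:
            raise ValueError("weights must be fractions between 0 and 1")
    return total_capital * w_pm * w_t * w_s


def bots_within_budget(bot_weights: list[float]) -> bool:
    """A trader's active bot weights must sum to at most 100% of the allocation."""
    return sum(bot_weights) <= 1.0 + 1e-9  # small tolerance for float rounding
```

With hypothetical total capital of 100,000, the default w_PM of 70%, Hugo's default w_T of 25%, and a bot given half of his allocation, the strategy works with 100,000 × 0.70 × 0.25 × 0.50 = 8,750.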

Alex adjusts trader allocations every 12 hours based on a performance scorecard. Outperforming traders get more capital; underperforming ones get less. The guard rails prevent any trader from dropping below 10% or exceeding 45% of total allocation. Critically, traders are never told their allocation percentage — they always operate as if they have 100% of their own budget, which prevents conservative behavior driven by capital constraints.
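One simple way to honor the guard rails is to move capital between two traders in bounded steps. The article does not describe how the scorecard translates into weights, so the sketch below is an assumption: it shifts a hypothetical 5-point step from an underperformer to an outperformer and shrinks the step whenever the 10% floor or 45% ceiling would be crossed.

```python
def shift_allocation(
    alloc: dict[str, float],
    winner: str,
    loser: str,
    step: float = 0.05,
    floor: float = 0.10,
    ceiling: float = 0.45,
) -> dict[str, float]:
    """Move up to `step` of allocation from loser to winner without breaking the bounds."""
    room_down = alloc[loser] - floor     # how much the loser can still give up
    room_up = ceiling - alloc[winner]    # how much the winner can still receive
    moved = max(0.0, min(step, room_down, room_up))
    updated = dict(alloc)
    updated[winner] += moved
    updated[loser] -= moved
    return updated
```

Because the same amount is added and subtracted, the allocations still sum to 100% after every adjustment.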

Strategy Bots: Automated Within Agentic

On top of the agent layer sits a strategy bot system. Traders can deploy deterministic bots (trend-following algorithms, mean-reversion scalpers, Donchian breakout systems) that run autonomously within their allocated capital. The bots emit signals in a standardized format, which our Execution Alpha engine processes with intelligent limit order execution.

The agent controls the bots strategically: deploying them when market conditions match their logic, adjusting parameters, and shutting them down when they hit drawdown thresholds. The bot framework handles the mechanical execution; the agent handles the strategic judgment.
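The drawdown-based shutdown can be illustrated by comparing a bot's latest equity with its running peak. This is a sketch only: the -5% threshold is a hypothetical value, since the article does not state the per-bot limit, and equity values are assumed to be positive.

```python
def current_drawdown(equity: list[float]) -> float:
    """Latest equity relative to the highest equity seen so far (0.0 or negative)."""
    if not equity:
        return 0.0
    peak = max(equity)
    return (equity[-1] - peak) / peak


def should_stop_bot(equity: list[float], threshold: float = -0.05) -> bool:
    """True when the bot's drawdown from peak has reached the threshold."""
    return current_drawdown(equity) <= threshold
```

An agent would run a check like this each cycle and shut the bot down when it returns True, leaving order handling to the bot framework.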

What We Learned

Agent specialization matters more than we expected. Early versions had fewer, more general agents. They performed worse because their prompts had to handle too many scenarios and contained contradictions similar to the z-score bug described in our previous post. Agents with narrow, well-defined roles produce better reasoning because their prompts have no ambiguous overlap.

The risk manager must have absolute veto power, no exceptions. We experimented with giving the portfolio manager override authority on risk decisions. This led to trades that were individually justified but collectively dangerous — multiple correlated directional bets that looked independent on the surface but shared significant underlying risk. Zainab's veto is now architecturally enforced.

Consultation depth matters. We limit consultation chains to two levels (agent A consults agent B, B responds; B cannot initiate a new consultation from within that chain). Deeper chains create circular reasoning and token waste.
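The depth limit can be enforced by threading a counter through each consultation. The sketch below is a hypothetical illustration, not the desk's consult_agent(): each agent is modeled as a handler that receives the question and its current depth, and a consulted agent that tries to consult again is refused.

```python
from typing import Callable

Handler = Callable[[str, int], str]


class ConsultationDepthError(RuntimeError):
    """Raised when an agent tries to consult from inside a consultation."""


def consult(
    target: str,
    question: str,
    handlers: dict[str, Handler],
    depth: int = 0,
    max_depth: int = 1,
) -> str:
    """Ask `target` a question; handlers receive the depth they are running at."""
    if depth >= max_depth:
        raise ConsultationDepthError(f"{target} cannot be consulted at depth {depth}")
    return handlers[target](question, depth + 1)
```

Raising an error, rather than silently returning an empty answer, makes an attempted nested consultation visible in logs during debugging.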

Named agents are not just whimsy. Giving agents names and personalities was initially an aesthetic choice. It turned out to be functionally important for system design and debugging. When reviewing trade logs, a conversation between Alex, Hugo, and Zainab is immediately comprehensible. The names create a mental model that keeps the system legible as it grows more complex.

Asynchronous activation is essential for a 24/7 system. Synchronous handoffs — where Alex waits for each trader to finish before proceeding — would make the system impossibly slow in active markets. The async wake model (Alex wakes traders and moves on) allows the desk to work several opportunities simultaneously.

Current Status

The system is in live trading with small size. All nine agents are active. Hugo is running on 4-hour cycles, Bella on 1-hour cycles, and the scalpers (Ola and Azeez) on 15-minute and 30-minute cycles respectively. We are building toward full deployment as we validate performance across different market regimes.

The postmortem data from our earlier single-agent bot informed every architectural decision described here. That expensive failure was, ultimately, the most valuable research investment we have made.

Takeaways

Multi-agent systems are fundamentally different from traditional trading bots — they reason rather than react
LangGraph provides the orchestration layer needed for agent communication, state management, and human-in-the-loop
The risk manager must have absolute veto authority that cannot be overridden on portfolio-wide risk grounds
Specialization beats generalization — narrow roles produce better reasoning and eliminate prompt contradictions
Human-in-the-loop via Telegram provides a practical safety layer without bottlenecking execution
Named agents with distinct personalities improve system legibility and debugging

Related Articles

We Built an AI Trading Bot. It Lost Money. Here's What We Learned.

Building an AI Fundamental Analyst: Automated Equity Research on AAPL and NVDA