quorai / multi-agent trading sandbox
open source · educational

A quorum, not a chorus. Twenty-five LLM analysts argue across six schools.

Quorai is a research sandbox for AI-driven investment analysis. Twenty-five language-model agents — investor personas paired with quant, sentiment, and valuation specialists — are sorted into six schools of thought. Each school weighs its members by conviction, not by headcount; when the schools disagree, the disagreement itself becomes the signal. Nothing is ever actually traded.

Status: experimental
License: MIT
Stack: Python · LangGraph
Agents: 25 across 6 groups
Author: Nils Flaschel
§ 01  Premise

What this is, and what it isn't.

Quorai is not a product. It's a small open-source repo exploring one idea: can structured disagreement between LLM agents produce more honest trading signals than a single model talking to itself?

The name blends quorum — a deliberating body — with AI. That is the system's core mechanic: each agent embodies a known investment philosophy (Graham's margin of safety, Simons' stat-arb, Dalio's debt-cycle lens, and so on) and analyses the same ticker independently. The agents are sorted into six schools of thought; within each school, members' signals are combined weighted by conviction — not by headcount — into a single stance. When schools disagree, a moderator surfaces the split before a portfolio manager turns the result into a buy / sell / hold.

The project is inspired by virattt/ai-hedge-fund and TauricResearch/TradingAgents. It's meant as a study aid and a place to compare prompting strategies — not as a recommendation engine. No real orders are ever placed.

Two optional layers shape which agents run and how much weight they carry. A market-regime classifier reads the current SPY trend each morning and narrows the active analyst subset to the groups most relevant to that regime — reducing noise and token cost on bear or risk-off days. A conviction-weight feedback loop tracks each agent's rolling directional hit-rate across labelled historical signals and proportionally upweights the most accurate voices in the debate aggregation on the next run.

§ 02  Compare

Where Quorai sits among the multi-agent trading repos.

Quorai is one of several open-source multi-agent LLM trading projects. The two most-cited reference points are virattt/ai-hedge-fund — the popular persona-agent demo with a bundled web UI — and TauricResearch/TradingAgents — the LangGraph research framework behind arXiv:2412.20138. Quorai borrows from both and diverges from both.

Area | ai-hedge-fund | TradingAgents | Quorai
Agent roster | 13 personas + 4 analytical + risk + PM | 4 analysts + Bull/Bear researchers + Trader + Risk team + PM | 25 personas across 6 schools + analytical agents
Orchestration | Sequential agent workflow | LangGraph StateGraph | LangGraph StateGraph with parallel analyst fan-out
Debate mechanism |  | Bull vs Bear structured debate (max_debate_rounds configurable) | 6 strategy-group aggregation (confidence-weighted) + LLM moderator on contested tickers only
Reflection / memory |  | ✅ Decision log with realised return + alpha-vs-SPY injected into next run's Portfolio Manager prompt | ✅ Per-agent rolling hit-rate from forward-return labelling drives next run's debate weights
Point-in-time SEC EDGAR |  |  | ✅ Local SQLite XBRL cache; eliminates yfinance look-ahead bias on historical share counts and financials
Market regime gate |  |  | ✅ Daily SPY regime classifier (bull / bear / risk-off / neutral) gates active analyst subset and filters proposed orders
A/B comparison harness |  |  | backtester compare: side-by-side metrics for regime-on vs off, uniform vs conviction weights
Live / paper trading | Backtest only ("does not actually make any trades") | Backtest / simulated exchange only | ✅ Alpaca paper trading (live keys hard-blocked at the client layer)
Human approval gate |  |  | ✅ Telegram approve/reject per cycle + command inbox (pause, only-sells, skip-next, resume)
Risk controls | Single Risk Manager agent | Risk Management Team agents | ✅ 5 preset profiles bundling position sizing + per-order notional / qty / daily-loss caps
Per-agent model routing |  | Two-tier: deep_think_llm / quick_think_llm | ✅ Per-analyst override (--agent-model AGENT=model/PROVIDER)
Token telemetry + prompt caching |  |  | ✅ Per-agent input / output / cache tokens accumulated across the run; Anthropic prompt cache surfaced separately
Parallel ticker execution |  |  | ✅ Thread-pool fan-out via QUORAI_PARALLEL_TICKERS=N
Checkpoint / resume |  | ✅ LangGraph checkpoint per ticker (--checkpoint) | 
LLM provider coverage | OpenAI, Anthropic, Groq, DeepSeek, Ollama | OpenAI, Google, Anthropic, xAI, DeepSeek, Qwen, GLM, MiniMax, OpenRouter, Ollama, Azure | 14 providers: OpenAI, Anthropic, Groq, Gemini, DeepSeek, xAI, OpenRouter, Ollama (local / free), Alibaba, Azure, GigaChat, Meta, Mistral, Kimi + per-agent overrides
LLM-vs-math measurement |  |  | ✅ Deterministic math twin per agent; agreement rate, override accuracy, ΔIC / Δhit-rate / Δalpha "LLM premium"; auto-run after every backtest
Realistic cost model |  |  | ✅ Optional slippage / commission / short-borrow bps flow through NAV into Sharpe and alpha (--slippage-bps, --commission-bps, --borrow-bps-annual)
Docker |  |  | 
Academic paper |  | arXiv:2412.20138 | 
Web UI | ✅ Bundled, full-featured | None (CLI / TUI only) | Separate read-only inspector (quorai-ui)
MCP server |  |  | quorai-mcp via uvx: full analyst panel callable from Claude Code, Cursor, Cline, and any MCP host

Comparison based on the public ai-hedge-fund and TradingAgents READMEs at the time of writing. Corrections welcome via PR.

Which should you use?

  • ai-hedge-fund if you want a polished bundled web UI, the simplest "try LLM-driven trading" demo, or the largest roster of well-known investor personas out of the box.
  • TradingAgents if you want the academically-grounded Bull/Bear debate framework, the broadest LLM-provider matrix (Qwen, GLM, MiniMax, Azure, and more), checkpoint-resume reliability for long runs, or to build on a paper-backed research codebase.
  • Quorai if you want point-in-time SEC fundamentals with no look-ahead bias, agents weighted by realised forward-return accuracy, or a real (paper) broker connection with a human approval gate; or if you want to run ablation studies on regime selection, conviction weights, and analyst subsets, or to measure exactly how much alpha the LLM calls add over deterministic quant rules via the math-twin harness.
§ 03  Pipeline

One run, six stages.

A single ticker traverses the graph below once per trading day. Boxes are deterministic where the maths allows; only those that need judgement call an LLM.

trading-day graph
  • i/o
  • llm call
  • deterministic
  • fan-out
  1. 00 in
    Tickers + portfolio
    User-supplied universe.
  2. 01 math
    Preflight
    Deterministic SPY regime classifier; loads conviction weights.
  3. 02 llm × 25
    Analyst agents
    Personas score independently.
  4. 03 llm*
    Debate node
    Deterministic aggregation; LLM fires only on contested tickers.
  5. 04 math
    Risk manager
    Vol & correlation position limits.
  6. 05 llm
    Portfolio manager
    Final action + sizing + reasoning.
  7. 06 out
    Decision + log
    JSONL feeds back into next run.
  1. 01 preflight
    Classify the regime, load the weights.

    Each trading day the pipeline optionally classifies the current SPY regime (bull / bear / risk-off / neutral) and loads per-agent conviction weights built from prior labelled runs.

  2. 02 analyse
    Twenty-five personas read the same evidence.

    Each persona independently analyses each ticker from price history, fundamentals, insider trades, and news. When regime selection is active, only groups relevant to the current regime run — e.g. risk-off routes to macro, deep value, and sentiment.

  3. 03 debate
    Schools take sides — deterministically first.

    Within each of the six schools, members' signals are confidence-weighted (optionally multiplied by their rolling conviction score via --use-conviction-weights) and averaged into a group stance: ≥ +0.25 → bullish, ≤ −0.25 → bearish, otherwise neutral. Only when at least one school is bullish and another is bearish on the same ticker does an LLM moderator fire to name and summarise the split.

  4. 04 risk
    Pure maths — no LLM call.

    Volatility- and correlation-adjusted position limits are computed deterministically, so an excitable agent can't oversize a bad idea.

  5. 05 decide
    One final decision, written down.

    The portfolio manager emits a buy / sell / hold decision with sizing and reasoning. Per-agent signals are written to a JSONL log. To close the feedback loop, run the backtester feedback subcommand after each run to label signals with 1 d / 5 d / 20 d forward returns and write weights.json; then pass --use-conviction-weights on the next run.
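
The deterministic part of the debate stage can be sketched as follows. This is a minimal illustration, not the repo's code: the `Signal` class, the +1 / 0 / −1 direction encoding, the 0..1 confidence scale, and the normalisation by total weight are all assumptions; only the ±0.25 cut-offs and the "one school bullish, another bearish" trigger come from the description above.

```python
from dataclasses import dataclass

# Assumed numeric encoding of a direction; the source only names the labels.
DIRECTION = {"bullish": 1.0, "neutral": 0.0, "bearish": -1.0}


@dataclass
class Signal:
    agent: str
    school: str
    direction: str     # "bullish" | "neutral" | "bearish"
    confidence: float  # assumed to lie in 0..1


def group_stance(signals, conviction=None, cut=0.25):
    """Confidence-weighted average direction for one school, plus its label."""
    conviction = conviction or {}
    num = den = 0.0
    for s in signals:
        # Optional conviction multiplier; defaults to 1.0 when no weights are loaded.
        w = s.confidence * conviction.get(s.agent, 1.0)
        num += w * DIRECTION[s.direction]
        den += w
    score = num / den if den else 0.0
    if score >= cut:
        return score, "bullish"
    if score <= -cut:
        return score, "bearish"
    return score, "neutral"


def is_contested(signals, conviction=None):
    """True when at least one school is bullish and another is bearish."""
    by_school = {}
    for s in signals:
        by_school.setdefault(s.school, []).append(s)
    labels = {group_stance(v, conviction)[1] for v in by_school.values()}
    return "bullish" in labels and "bearish" in labels


panel = [
    Signal("ben_graham", "deep_value", "bullish", 0.8),
    Signal("michael_burry", "deep_value", "bearish", 0.2),
    Signal("jim_simons", "quant", "bearish", 0.9),
    Signal("ed_seykota", "quant", "neutral", 0.3),
]
print(is_contested(panel))  # the LLM moderator would only be called when this is True
```

In this sketch a loud minority can outvote a quiet majority inside a school, which is the "conviction, not headcount" behaviour the page describes.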

§ 04  Roster

The twenty-five.

Six schools of thought, twenty-five personas. Each persona has a fixed prompt; the pull-quotes summarise the lens they argue from.

deep value

Deep Value

4 agents

Find companies trading below intrinsic worth with a clear margin of safety, using hard numbers over narrative.

BG
Ben Graham
The Father of Value Investing
“Margin of safety first: I apply seven systematic rules — Graham Number, NCAV, P/E ≤ 15 — and never deviate for narrative.”
MB
Michael Burry
The Big Short Contrarian
“Hunt value with hard numbers — FCF yield, EV/EBIT, balance sheet. Be contrarian: hatred in the press is your friend when fundamentals are solid.”
MP
Mohnish Pabrai
The Dhandho Investor
“Heads I win, tails I don't lose much: clone great investors, demand 8%+ FCF yield, keep it simple, and wait for the price to come to you.”
JG
Joel Greenblatt
The Magic Formula Investor
“Magic Formula: rank by high ROIC plus high earnings yield, then wait for a special-situation catalyst — spin-off, insider buy — to unlock the value.”
quality compounders

Quality Compounders

4 agents

Long-term ownership of durable franchises with strong moats, exceptional management, and compounding economics.

AD
Aswath Damodaran
The Dean of Valuation
“Start with the company story, connect it to FCFF drivers — growth, margins, reinvestment, risk — and let the DCF decide the price you should pay.”
CM
Charlie Munger
The Rational Thinker
“Invert always: ask what makes it a terrible investment first, then let a lollapalooza of reinforcing moat signals earn your conviction.”
PF
Phil Fisher
The Scuttlebutt Investor
“Exceptional management + sustained R&D-led growth + consistent margins: pay a fair price for a great business and hold it for decades.”
WB
Warren Buffett
The Oracle of Omaha
“Buy wonderful businesses with durable moats and exceptional capital allocation at a price that gives a margin of safety, then never sell.”
growth and catalyst

Growth & Catalyst

4 agents

Disruptive innovation, activist catalysts, and outsized growth potential at a price the market hasn't yet recognised.

BA
Bill Ackman
The Activist Investor
“Identify the specific lever — board change, spin-off, buyback — that can re-rate a high-quality franchise by 50–100%.”
CW
Cathie Wood
The Queen of Growth Investing
“Bet on disruptive innovation with a massive TAM; accept the volatility — exponential growth rewards patience on a multi-year horizon.”
PL
Peter Lynch
The 10-Bagger Investor
“Buy what you know: if the PEG is cheap and the business is simple enough to explain to a twelve-year-old, it could be a ten-bagger.”
GA
Growth Analyst
Growth Specialist
“Follow revenue acceleration and expanding margins — when growth is durable and valuation is still reasonable, the compounding does the work.”
macro and cycle

Macro & Cycle

5 agents

Top-down regime analysis — debt cycles, liquidity flows, and tail-risk awareness — to time and size positions.

NT
Nassim Taleb
The Black Swan Risk Analyst
“Seek the antifragile: convex payoffs with bounded downside, avoid the fragile, and never mistake low volatility for safety.”
RJ
Rakesh Jhunjhunwala
The Big Bull Of India
“Structural tailwind + earnings quality + low leverage + market leadership + contrarian timing — that is how you find multi-decade compounders.”
SD
Stanley Druckenmiller
The Macro Investor
“Macro regime first: liquidity and rates determine the tide; when the thesis is clear, size very large and cut the moment it changes.”
RD
Ray Dalio
The All-Weather Macro Investor
“Understand the debt cycle and economic regime — resilient balance sheets bought cheaply in early-cycle fear are the foundation of all-weather returns.”
HM
Howard Marks
The Credit Cycle Master
“Second-level thinking: ask what the price implies and what the crowd is missing — the best entries come when fear drives pricing below fair value.”
quant systematic

Quant & Systematic

4 agents

Rule-based factor models, trend-following, and statistical arbitrage — data decides, narrative is noise.

TA
Technical Analyst
Chart Pattern Specialist
“Trend, momentum, mean-reversion, and volatility signals combined by weight — the tape tells the truth when charts are read without emotion.”
CA
Cliff Asness
The Quant Factor Investor
“Value + momentum + quality + low-vol all aligned is the highest conviction signal; never let narrative override the factor scores.”
ES
Ed Seykota
The Trend-Following Pioneer
“The trend is your friend until the bend: ride the Donchian breakout, size by ATR volatility, and cut losers the instant the trend reverses.”
JS
Jim Simons
The Quant / Stat-Arb Legend
“No narrative, only data: Hurst below 0.5 means mean-reversion is our edge — Z-scores, autocorrelation, and volume microstructure decide every trade.”
sentiment and analytical

Sentiment & Analytical

4 agents

Bottom-up specialists in fundamentals, valuation, growth signals, and market-sentiment indicators.

FA
Fundamentals Analyst
Financial Statement Specialist
“Score profitability, growth, financial health, and efficiency systematically — when three of four dimensions align bullish, the signal is robust.”
NSA
News Sentiment Analyst
News Sentiment Specialist
“Read the news flow quantitatively: positive sentiment breadth and rising coverage predict price momentum before it shows up in fundamentals.”
SA
Sentiment Analyst
Market Sentiment Specialist
“Market psychology leaves measurable footprints — insider activity, put/call ratios, short interest — read the crowd to trade against its worst impulses.”
VA
Valuation Analyst
Company Valuation Specialist
“Model DCF, owner earnings, EV/EBITDA, and P/E together — when multiple methods converge on undervaluation, the margin of safety is real.”
§ 05  Method

Under the debate, a quantitative core.

The schools argue in natural language, but each analyst calls into a small Python library that does the actual arithmetic. Two pieces are worth surfacing: how positions are sized, and how intrinsic value is estimated.

Position sizing

Every ticker gets a base notional limit, then two independent multipliers are applied. First, realised volatility — annualised as std(60-day returns) × √252 — buckets the name into one of four tiers:

Annualised vol | Multiplier
< 15 % | 1.25×
15 – 30 % | 1.0 − (vol − 0.15) × 0.5
30 – 50 % | 0.75 − (vol − 0.30) × 0.5
> 50 % | 0.50×

Second, the average correlation with already-open positions scales the limit again — concentrated bets get cut, uncorrelated additions get a small boost:

Avg correlation | Multiplier
≥ 0.80 | 0.70×
0.60 – 0.80 | 0.85×
0.40 – 0.60 | 1.00×
0.20 – 0.40 | 1.05×
< 0.20 | 1.10×

Final limit: base_limit × vol_multiplier × corr_multiplier, then clamped to available cash and margin.
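
The two multiplier tables and the final-limit formula translate almost directly into code. The sketch below is illustrative only: function names are hypothetical, the handling of exact tier boundaries (for example a volatility of exactly 50 %) is an assumption, the sample standard deviation is assumed, and the clamp covers available cash only, not margin.

```python
import math
import statistics


def annualised_vol(daily_returns):
    """std(daily returns) × √252; the sample standard deviation is an assumption."""
    return statistics.stdev(daily_returns) * math.sqrt(252)


def vol_multiplier(vol):
    """Volatility tiers from the table above (vol as a fraction, e.g. 0.20)."""
    if vol < 0.15:
        return 1.25
    if vol < 0.30:
        return 1.0 - (vol - 0.15) * 0.5
    if vol < 0.50:
        return 0.75 - (vol - 0.30) * 0.5
    return 0.50


def corr_multiplier(avg_corr):
    """Correlation tiers from the table above."""
    if avg_corr >= 0.80:
        return 0.70
    if avg_corr >= 0.60:
        return 0.85
    if avg_corr >= 0.40:
        return 1.00
    if avg_corr >= 0.20:
        return 1.05
    return 1.10


def position_limit(base_limit, vol, avg_corr, available_cash):
    """base_limit × vol_multiplier × corr_multiplier, clamped to available cash."""
    limit = base_limit * vol_multiplier(vol) * corr_multiplier(avg_corr)
    return max(0.0, min(limit, available_cash))


# A 20 %-vol name that is 0.70-correlated with the existing book:
print(position_limit(20_000, vol=0.20, avg_corr=0.70, available_cash=1_000_000))
```

Note that the tiers are not continuous at their edges (the multiplier steps down when volatility crosses 15 % and again at 30 %); the sketch reproduces the table as written rather than smoothing it.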

Valuation models

The valuation agent runs four methods in parallel and blends their outputs. Owner Earnings (Buffett): net_income + D&A − capex − Δworking_capital, projected ten years, discounted to present value, then haircut 25 % for margin of safety. DCF: Σ FCF_t / (1+r)^t + TV / (1+r)^n with terminal value TV = FCF_n × (1 + g) / (r − g).

EV/EBITDA cross-check: implied_equity = median_sector_EV/EBITDA × current_EBITDA − net_debt. Residual income (Edwards-Bell-Ohlson): RI_t = net_income_t − cost_of_equity × book_value_{t−1}, summed in present-value terms on top of current book value.

The four estimates are blended at fixed weights — DCF 35 % · Owner Earnings 35 % · EV/EBITDA 20 % · Residual Income 10 % — and the gap (weighted_intrinsic − market_cap) / market_cap drives the signal at a ±15 % threshold.
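
As a rough illustration of the DCF formula and the fixed-weight blend, consider the sketch below. The function and key names are hypothetical, and mapping a gap above +15 % to "bullish" and below −15 % to "bearish" is an assumption about how the threshold is applied.

```python
def dcf_value(fcfs, r, g):
    """Σ FCF_t / (1+r)^t + TV / (1+r)^n, with TV = FCF_n × (1 + g) / (r − g)."""
    if r <= g:
        raise ValueError("discount rate must exceed terminal growth")
    n = len(fcfs)
    pv = sum(f / (1 + r) ** t for t, f in enumerate(fcfs, start=1))
    terminal = fcfs[-1] * (1 + g) / (r - g)
    return pv + terminal / (1 + r) ** n


# Blend weights quoted in the text.
WEIGHTS = {
    "dcf": 0.35,
    "owner_earnings": 0.35,
    "ev_ebitda": 0.20,
    "residual_income": 0.10,
}


def blended_signal(estimates, market_cap, threshold=0.15):
    """Return (gap, label) where gap = (weighted_intrinsic − market_cap) / market_cap."""
    intrinsic = sum(WEIGHTS[k] * estimates[k] for k in WEIGHTS)
    gap = (intrinsic - market_cap) / market_cap
    if gap > threshold:
        return gap, "bullish"
    if gap < -threshold:
        return gap, "bearish"
    return gap, "neutral"


estimates = {
    "dcf": dcf_value([100.0, 105.0, 110.0], r=0.10, g=0.03),
    "owner_earnings": 1500.0,   # illustrative figures only
    "ev_ebitda": 1400.0,
    "residual_income": 1300.0,
}
print(blended_signal(estimates, market_cap=1200.0))
```

A useful sanity check on `dcf_value`: a single cash flow of 100 with zero growth discounted at 10 % is worth 100 / 0.10 = 1,000, the value of a level perpetuity.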

Multi-stage DCF and quality adjustment

The DCF agent runs three growth phases (high, fade, terminal) discounted at WACC. Before discounting, a quality adjustment scales the projected cash flows: quality_factor = max(0.7, 1 − fcf_volatility × 0.5) where fcf_volatility = std(FCF) / mean(FCF) (coefficient of variation). A scenario overlay then weights bear / base / bull growth assumptions at 20 % / 60 % / 20 % to capture tail outcomes without over-engineering the base case.
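
The quality adjustment and scenario overlay are simple enough to sketch directly. Names are hypothetical; using the population standard deviation and falling back to the 0.7 floor when mean FCF is not positive are assumptions, since the text does not cover those cases.

```python
import statistics


def quality_factor(fcf_history):
    """max(0.7, 1 − fcf_volatility × 0.5), with fcf_volatility = std(FCF) / mean(FCF)."""
    mean = statistics.mean(fcf_history)
    if mean <= 0:
        # Assumption: the coefficient of variation is undefined here, so use the floor.
        return 0.7
    fcf_volatility = statistics.pstdev(fcf_history) / mean
    return max(0.7, 1 - fcf_volatility * 0.5)


def scenario_value(bear, base, bull):
    """Weight bear / base / bull valuations at 20 % / 60 % / 20 %."""
    return 0.2 * bear + 0.6 * base + 0.2 * bull


steady = quality_factor([100.0, 100.0, 100.0])   # no volatility, no haircut
lumpy = quality_factor([10.0, 200.0, 10.0, 200.0])  # very volatile, hits the floor
print(steady, lumpy, scenario_value(800.0, 1000.0, 1300.0))
```

The floor means that even a company with very erratic cash flows keeps at least 70 % of its projected FCF in the valuation.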

WACC

cost_of_equity = RF + β × MRP (RF = 4.5 %, MRP = 6 %, β from TTM metrics). Cost of debt: max(RF + 0.01, RF + 10 / interest_coverage). Final WACC: (E/V) × CoE + (D/V) × CoD × (1 − 0.25), floored at 6 % and capped at 20 % to keep the discount rate within a plausible range regardless of data quality.
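
The WACC recipe above can be written out as a short function. This sketch follows the formulas literally; the function name and argument names are hypothetical, and rejecting a non-positive interest coverage is an assumption, because the text does not say what happens in that case.

```python
RF = 0.045    # risk-free rate from the text
MRP = 0.06    # market risk premium from the text
TAX = 0.25    # tax shield implied by the (1 − 0.25) factor


def wacc(beta, interest_coverage, equity, debt):
    """(E/V) × CoE + (D/V) × CoD × (1 − tax), floored at 6 % and capped at 20 %."""
    if interest_coverage <= 0:
        raise ValueError("interest coverage must be positive in this sketch")
    total = equity + debt
    if total <= 0:
        raise ValueError("equity + debt must be positive")
    cost_of_equity = RF + beta * MRP
    cost_of_debt = max(RF + 0.01, RF + 10 / interest_coverage)
    raw = (equity / total) * cost_of_equity + (debt / total) * cost_of_debt * (1 - TAX)
    return min(0.20, max(0.06, raw))


print(wacc(beta=1.0, interest_coverage=1000.0, equity=80.0, debt=20.0))
```

Taken literally, the `10 / interest_coverage` term is large for ordinary coverage ratios (a coverage of 5 adds 2.0 to the cost of debt), so for leveraged companies the 20 % cap will often be what determines the result.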

Conviction-weight feedback loop

Every per-agent, per-ticker signal written to the JSONL log can be labelled with 1 d / 5 d / 20 d forward returns via the backtester feedback subcommand. A rolling per-agent directional hit-rate is computed from those labels and serialised to src/feedback/weights.json. On the next run, passing --use-conviction-weights loads those weights and multiplies each agent's confidence by its historical accuracy before the debate aggregation, so agents that have been right are proportionally louder.
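
One way to picture the hit-rate computation is the sketch below. The record fields (`agent`, `signal`, `fwd_return`), the window length, and the choice to skip neutral signals and zero returns are all assumptions made for illustration; the actual log schema and weighting scheme are not documented here.

```python
from collections import defaultdict, deque


def rolling_hit_rates(records, window=50):
    """Per-agent share of recent directional signals whose sign matched the forward return.

    `records` is assumed to be in chronological order; only the last `window`
    directional signals per agent are kept.
    """
    recent = defaultdict(lambda: deque(maxlen=window))
    for r in records:
        if r["signal"] == "neutral" or r["fwd_return"] == 0:
            continue  # no direction to score
        hit = (r["signal"] == "bullish") == (r["fwd_return"] > 0)
        recent[r["agent"]].append(hit)
    return {agent: sum(hits) / len(hits) for agent, hits in recent.items()}


labelled = [
    {"agent": "ben_graham", "signal": "bullish", "fwd_return": 0.02},
    {"agent": "ben_graham", "signal": "bearish", "fwd_return": 0.01},
    {"agent": "jim_simons", "signal": "bearish", "fwd_return": -0.03},
    {"agent": "jim_simons", "signal": "neutral", "fwd_return": 0.05},
]
weights = rolling_hit_rates(labelled)
print(weights)
```

Under this scheme an agent's effective vote would be `confidence × weights[agent]`, which is the multiplication the paragraph above describes.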

§ 06  Measurement

Does the LLM actually beat the math?

Every instrumented persona analyst computes a deterministic "math twin" signal from its underlying quantitative inputs before making an LLM call. The score maps to a direction via per-agent calibrated thresholds (defaults: bull_cut = 0.60, bear_cut = 0.40) and is written to the signal log alongside the LLM signal and confidence — so both can be compared per agent, per ticker, and across the full run.
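
The score-to-direction mapping might look like the following. This is a sketch under assumptions: the score is taken to lie in 0..1, and the cut-offs are treated as inclusive; only the default values 0.60 and 0.40 come from the text.

```python
def math_direction(score, bull_cut=0.60, bear_cut=0.40):
    """Map a math-twin score to a direction using per-agent thresholds."""
    if bear_cut >= bull_cut:
        raise ValueError("bear_cut must be below bull_cut")
    if score >= bull_cut:
        return "bullish"
    if score <= bear_cut:
        return "bearish"
    return "neutral"


print(math_direction(0.72), math_direction(0.50), math_direction(0.31))
```

Calibrated per-agent thresholds would simply be passed in place of the defaults, for example `math_direction(score, bull_cut=0.55, bear_cut=0.45)`.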

LLM-premium metrics

The backtester llm-vs-math subcommand computes the following per agent from a labelled signal log:

Metric | Definition
Agreement rate | agree / total_directional: fraction of (bullish/bearish) records where signal == math_signal
Override count | Records where the LLM direction differs from the math twin
LLM override accuracy | llm_correct / override_count: among overrides, fraction where the LLM direction matched the forward-return sign
Math win rate on overrides | Fraction of overrides where math was right and LLM was wrong
Δhit-rate | llm_hit_rate − math_hit_rate per agent; LLM premium on directional accuracy
Δspread (IC) | llm_directional_spread − math_directional_spread; LLM premium on signal information coefficient
Δalpha | llm_alpha_vs_baseline − math_alpha_vs_baseline per agent
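
The first four metrics can be computed from a list of labelled records, as in the sketch below. The `signal` and `math_signal` field names appear in the definitions above; the `fwd_return` field, the function names, and counting a neutral math twin against a directional LLM signal as an override are assumptions.

```python
def _correct(direction, fwd_return):
    """True when a directional call matched the sign of the forward return."""
    return (direction == "bullish" and fwd_return > 0) or (
        direction == "bearish" and fwd_return < 0
    )


def llm_premium(records):
    """Agreement and override statistics for one agent's labelled signals."""
    directional = [r for r in records if r["signal"] in ("bullish", "bearish")]
    overrides = [r for r in directional if r["signal"] != r["math_signal"]]
    agree = len(directional) - len(overrides)
    llm_correct = sum(_correct(r["signal"], r["fwd_return"]) for r in overrides)
    math_wins = sum(
        _correct(r["math_signal"], r["fwd_return"])
        and not _correct(r["signal"], r["fwd_return"])
        for r in overrides
    )

    def ratio(n, d):
        return n / d if d else None  # undefined when there is nothing to divide by

    return {
        "agreement_rate": ratio(agree, len(directional)),
        "override_count": len(overrides),
        "llm_override_accuracy": ratio(llm_correct, len(overrides)),
        "math_win_rate_on_overrides": ratio(math_wins, len(overrides)),
    }


sample = [
    {"signal": "bullish", "math_signal": "bullish", "fwd_return": 0.01},
    {"signal": "bullish", "math_signal": "bearish", "fwd_return": 0.02},
    {"signal": "bearish", "math_signal": "bullish", "fwd_return": 0.03},
    {"signal": "neutral", "math_signal": "bullish", "fwd_return": 0.01},
]
stats = llm_premium(sample)
print(stats)
```

In the sample, the LLM overrides its math twin twice and is right once, so override accuracy and the math win rate are both one half.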

The backtester debate-impact subcommand replays group aggregation on both LLM and math-twin signal sets per stored cycle bundle and reports the mean |tilt delta| (LLM vs math panel stance), contested-set Jaccard overlap, and per-school group-stance flip rate. Together these answer: how much does the debate change what the math alone would have decided?

Automatic post-backtest analytics

Every backtester run automatically executes a 3-step analytics suite after the backtest completes:

  1. [1/3] School-debate impact — group-stance flip rates and panel-tilt delta vs the math-twin baseline.
  2. [2/3] Forward-return labeling + conviction weights — signals are labelled with 1 d / 5 d / 20 d returns; conviction weights are recomputed and written to feedback/weights.json.
  3. [3/3] LLM-vs-math per-agent analysis — the full premium table above. Step 3 is automatically skipped when the run window is younger than the forward-return horizon (5 trading days); re-run backtester feedback once the horizon has elapsed.

Each step is guarded — an analytics failure prints a warning but never crashes the run. Pass --no-analyze to skip the suite entirely. The two heavier remaining steps (calibrate-math and llm-ablation) are printed as copy-pasteable commands at the end of each run rather than run automatically.

Ablation harness

backtester llm-ablation reruns the backtest under several modes and reports P&L, Sharpe, drawdown, and token cost for each:

  • all-LLM — baseline: every agent uses its LLM call (default behaviour).
  • all-math — every agent uses its deterministic math-twin signal only; no LLM calls.
  • QUORAI_PM_MODE=rule — portfolio manager replaced by a deterministic tilt-following rule; no PM LLM call.
  • QUORAI_PM_DEBATE_CONTEXT=0 — PM prompt omits the debate-moderator summary.
  • leave-one-out (--full-loo) — each persona is individually replaced by its math twin to quantify its marginal contribution.

Math-twin thresholds can be calibrated from any labelled signal log: backtester calibrate-math grid-searches per-agent bull_cut / bear_cut values to maximise directional spread and writes results to feedback/math_thresholds.json.

§ 07  Safety

Hard stops first, soft limits on top.

The Alpaca wrapper refuses to construct a trading client at all unless ALPACA_PAPER=True, so live keys cannot be used to place orders. That single check is the base safety net; everything else is layered on top of it.
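
A guard of this kind can be very small. The sketch below is hypothetical: the function name, the case-insensitive comparison, and the injectable `env` mapping are assumptions, and it deliberately does not call any Alpaca SDK function.

```python
import os


def require_paper_mode(env=None):
    """Raise unless ALPACA_PAPER is set to a true value; call before building any client."""
    env = os.environ if env is None else env
    value = str(env.get("ALPACA_PAPER", "")).strip().lower()
    if value != "true":
        raise RuntimeError("refusing to build a trading client: ALPACA_PAPER is not True")
    return True


# A missing or false flag fails closed:
try:
    require_paper_mode({"ALPACA_PAPER": "False"})
except RuntimeError as exc:
    print(exc)
```

Treating an unset variable the same as `False` is what makes the check a hard stop rather than a default.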

Risk-profile presets

Pass --risk-profile to bundle four related caps in one flag. Individual caps are still overridable via env vars for a single run.

Profile | base_limit | Notional cap | Qty cap | Daily loss limit
conservative | 10 % | $5,000 | 500 shares | 2 %
cautious | 15 % | $7,500 | 750 shares | 3 %
balanced (default) | 20 % | $10,000 | 1,000 shares | 5 %
aggressive | 30 % | $20,000 | 2,000 shares | 8 %
speculative | 50 % | $50,000 | 5,000 shares | 15 %
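
For illustration, the preset table can be held in a small lookup and used for a per-order check. The class, dictionary, and function names are hypothetical, and the check shown covers only the notional and quantity caps.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskProfile:
    base_limit: float        # fraction, e.g. 0.20 for 20 %
    notional_cap: float      # USD per order
    qty_cap: int             # shares per order
    daily_loss_limit: float  # fraction of start-of-day equity


# Values copied from the preset table above.
PROFILES = {
    "conservative": RiskProfile(0.10, 5_000, 500, 0.02),
    "cautious": RiskProfile(0.15, 7_500, 750, 0.03),
    "balanced": RiskProfile(0.20, 10_000, 1_000, 0.05),
    "aggressive": RiskProfile(0.30, 20_000, 2_000, 0.08),
    "speculative": RiskProfile(0.50, 50_000, 5_000, 0.15),
}


def order_within_caps(profile_name, qty, price):
    """True when a single order respects both the quantity and notional caps."""
    p = PROFILES[profile_name]
    return qty <= p.qty_cap and qty * price <= p.notional_cap


# 50 shares at $150 is a $7,500 order:
print(order_within_caps("balanced", 50, 150.0))
print(order_within_caps("conservative", 50, 150.0))
```

Because the check is per order, several orders that each pass can still add up to far more than one cap in a single cycle.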

Telegram approval gate

Add --require-approval to send proposed orders as an inline Telegram message before any order is submitted. The gate is fail-closed: missing credentials, a Telegram error, a rejection, or a timeout (default 30 min) all abort with zero orders. The bot also reads a plain-text command inbox at the start of each run:

Message | Effect
accept only sales | Suppress all buy orders for the next run only
skip next day | Skip the next scheduled run entirely
pause | Pause all runs until you send continue
continue | Clear an active pause

Command state persists in logs/command_state.json across restarts.
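
The fail-closed behaviour of the approval gate can be sketched as a wrapper that releases orders only on an explicit approval. Everything here is hypothetical (the `Approval` enum, `gate_orders`, and the `ask` callback standing in for the Telegram round-trip); the point is only that every non-approval path returns zero orders.

```python
from enum import Enum


class Approval(Enum):
    APPROVED = "approved"
    REJECTED = "rejected"
    TIMEOUT = "timeout"
    NO_CREDENTIALS = "no_credentials"


def gate_orders(proposed, ask):
    """Return the proposed orders only if `ask` explicitly approves them.

    `ask` is a stand-in for the messaging round-trip; any exception it raises
    is treated the same as a rejection.
    """
    try:
        outcome = ask(proposed)
    except Exception:
        return []
    return list(proposed) if outcome is Approval.APPROVED else []


def broken_channel(orders):
    raise ConnectionError("messaging API unreachable")


orders = ["BUY 10 AAPL", "SELL 5 MSFT"]
print(gate_orders(orders, lambda o: Approval.APPROVED))
print(gate_orders(orders, lambda o: Approval.TIMEOUT))
print(gate_orders(orders, broken_channel))
```

Writing the gate so that approval is the single allowed path, rather than listing the failure cases to block, keeps newly added failure modes closed by default.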

Kill switch

Set KILL_SWITCH=true in .env to reject every order immediately — no LLM call, no broker call — until the flag is cleared.

Known limitations

  • Notional cap is per-order, not per-cycle. With N tickers, up to N × $10,000 (at the balanced preset) can be submitted in a single run before any cap fires.
  • Sub-$1 fractional buys are silently dropped. A tiny allocation on a high-priced stock rounds to 0.000 shares and is classified as skipped with no warning.
  • Daily-loss limit re-baselines if SOD equity file is missing. After a crash, the runner resets the baseline to current (already drawn-down) equity, effectively disabling the limit for that day. Use --catch-up to recover the prior-close equity from Alpaca's portfolio history instead.
§ 08  Use

Run it yourself.

Clone the repo, install with uv sync, and invoke one of the entry points below. You'll need a key for at least one LLM provider plus a Finnhub key for market data.

quickstart
uv run backtester --tickers AAPL,MSFT --model deepseek/deepseek-chat --model-provider OpenRouter --show-reasoning
risk profile
uv run backtester --tickers AAPL,MSFT --model deepseek/deepseek-chat --model-provider OpenRouter --risk-profile speculative
regime + conviction
uv run backtester --tickers AAPL,MSFT --model deepseek/deepseek-chat --model-provider OpenRouter --use-regime-selection --use-conviction-weights
score signals
uv run backtester feedback --signal-log logs/backtest/signals/signals-<run-id>.jsonl # writes weights.json
a / b compare
uv run backtester compare --mode both --tickers AAPL,MSFT --model deepseek/deepseek-chat --model-provider OpenRouter
paper live + Telegram
uv run python src/live_trading.py --tickers AAPL,MSFT --model deepseek/deepseek-chat --model-provider OpenRouter --require-approval --dry-run
claude code (mcp)
claude mcp add quorai uvx quorai-mcp
cost model
uv run backtester --tickers AAPL,MSFT --model deepseek/deepseek-chat --model-provider OpenRouter --slippage-bps 5 --commission-bps 1 --borrow-bps-annual 50
calibrate math twins
uv run backtester calibrate-math --signal-log logs/backtest/signals/signals-<run-id>.jsonl # grid-searches per-agent bull_cut / bear_cut
llm ablation
uv run backtester llm-ablation --tickers AAPL,MSFT --start-date 2026-01-01 --end-date 2026-03-31 # reruns with all-math / PM-rule / no-debate-context modes

  • --risk-profile accepts conservative, cautious, balanced (default), aggressive, or speculative; each preset bundles position-sizing limits, order-notional caps, and a daily loss limit.
  • --use-conviction-weights requires a prior feedback run to have produced weights.json.
  • Live trading is paper-only by construction (the Alpaca client refuses to connect to a live endpoint).
  • For historically accurate backtests, seed the SEC EDGAR fundamentals cache first: uv run python experiments/seed_sec_fundamentals.py --tickers AAPL,MSFT.
  • Every backtest automatically runs a 3-step analytics suite after the run completes (school-debate impact → forward-return labeling → LLM-vs-math); pass --no-analyze to opt out. The calibrate-math and llm-ablation steps are heavier, so they are printed as copy-pasteable commands at the end of each run rather than run automatically.
  • See the README for the full setup walkthrough.
  • For a browsable view of the cycle bundles written to logs/, see the read-only companion UI at quorai/quorai-ui (§ 09 below).
  • To expose the panel as a tool call to any MCP host (Claude Code, Claude Desktop, Cursor, Cline, …), run claude mcp add quorai uvx quorai-mcp; see the MCP server section of the README.

Educational and research use only. Quorai does not constitute investment advice. Agent signals are generated by language models and should not be used as the basis for real financial decisions. Past signals do not guarantee future performance.

§ 09  Inspector

Browse the run — don't grep the logs.

quorai-ui is a read-only companion web app (Next.js, runs on port 3030) that reads the cycle bundles quorai-app writes to logs/<mode>/runs/ and presents them as browsable, comparable views. It never touches the trading system — point it at the logs/ directory and go.

Figure: quorai-ui demo (runs list → run overview → cycle breakdown)

Six main views:

  • Runs list — searchable, sortable index of all backtest and live runs with at-a-glance return, Sharpe, risk profile, and timestamps.
  • Run overview — headline metrics (cycles, tokens, duration, model, status), risk profile, backtest results (return, Sharpe, Sortino, drawdown, baselines), equity-curve chart, and a per-cycle breakdown table.
  • Agents — the 25 persona analysts grouped by investing school, with a tickers × days decision matrix and per-cell reasoning popovers.
  • Schools — six investing schools with per-school signal heatmaps and a cross-school debate log for each cycle.
  • Cycle detail — full breakdown of a single trading cycle: regime indicators, portfolio before/after, analyst-signal heatmap, group consensus and debate, risk limits, portfolio-manager decisions, executed trades, and token-usage chart.
  • LLM-vs-Math analysis tabs — three optional tabs that mirror the measurement harness (§ 06): per-agent dual-attribution (Δhit-rate, Δspread, Δalpha), school-debate impact (tilt delta, stance flip rates, contested-set Jaccard), and a P&L ablation across modes. Each tab only renders when its backing report JSON exists for the run.

The app also ships a command palette (keyboard-driven navigation across runs) and uses the same paper / lab-notebook design language as the main repo. Setup requires only a single env var (QUORAI_LOGS_DIR) pointing at quorai-app's logs/ directory.

Figure: quorai-ui run overview (equity curve + per-cycle breakdown)
Figure: quorai-ui LLM-vs-Math analysis tab (per-agent Δhit-rate, Δspread, Δalpha)
Figure: quorai-ui Schools view (signal heatmap + cross-school debate log)
Figure: quorai-ui cycle detail view (regime, analyst signals, PM decisions, trades)