Day 7: From Backtest to Forward Test - Building a Polymarket Paper Trading Bot
Everything Just Changed
Mid-February 2026: Polymarket quietly dropped trading fees to 0 bps for both makers and takers. No announcement. No fanfare. Just a fee schedule update that fundamentally rewrites the economics of every strategy built on this platform.
Six days ago, I started this research under one constraint: 3% taker fees. That single number made most strategies unviable; you need a 3%+ edge just to break even on a market order. Every system I built had to route around it.
That constraint is now gone.
Our Day 6 backtest showed +0.12% gross edge per trade. At 3% taker fees: deeply negative. At 0% fees: every basis point of edge goes directly to profit. This isn't a minor tweak. It's a regime change.
So today, we stop doing theory. We build the paper trading bot.
The Gap Between Backtest and Reality
Backtests are inherently optimistic: they can't capture slippage, latency, orderbook dynamics, or the psychological pressure of watching real prices move. \(n = 14\) trades from Day 6 is noise; we need 100+ for statistical significance.
The standard quant workflow: backtest → paper trade → small live → scale. We just validated the backtest. Now we build step 2.
Polymarket Dropped Fees to 0/0
As of mid-February 2026, Polymarket's fee schedule shows 0 bps for both makers and takers across all volume tiers. This is... massive.
When I started this research six days ago, the fee structure was 0% maker / 3% taker. That 3% taker fee was the single biggest constraint on our strategy: it meant we had to use limit orders, couldn't react to fast-moving signals, and needed edges above 3% just to break even on market orders.
With zero fees:
\[ \text{Edge}_{\text{net}} = \text{Edge}_{\text{gross}} - \underbrace{0}_{\text{fees}} = \text{Edge}_{\text{gross}} \]
Our Day 6 backtest showed +0.12% gross edge per trade. At 3% taker fees, that's deeply negative. At 0% fees, every cent of edge flows to profit. More importantly:
- Market orders are viable: react instantly to signals without waiting for limit fills
- Lower edge threshold: strategies that were unprofitable at 3% fees now work
- Higher frequency: can trade more aggressively on weaker signals
- Simpler execution: no need for maker-rebate optimization games
This doesn't mean the edge is guaranteed to be real. But it means the hurdle rate just dropped from ~3% to ~0%, which makes forward testing far more interesting.
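To make the hurdle-rate point concrete, here is a small sanity check of my own (not from the backtest code), assuming the taker fee is charged as rate × min(p, 1 - p) per share, the formula the fee table's footnote later in this post references:
def net_edge_per_trade(gross_edge: float, taker_fee: float, avg_entry: float = 0.50) -> float:
    """Net edge per trade: gross edge minus the taker fee actually paid.

    Fee is modeled as taker_fee * min(p, 1 - p) per share, so at mid-range
    entries (p near 0.50) a 3% taker fee costs about 1.5% per trade.
    """
    fee_cost = taker_fee * min(avg_entry, 1.0 - avg_entry)
    return gross_edge - fee_cost

for fee in (0.03, 0.01, 0.00):
    print(f"taker fee {fee:.0%}: net edge {net_edge_per_trade(0.0012, fee):+.2%}")
# taker fee 3%: net edge -1.38%
# taker fee 1%: net edge -0.38%
# taker fee 0%: net edge +0.12%
Note that even a 1% taker fee would still swamp a +0.12% gross edge, which is exactly why the fee change matters so much.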
Paper Trading Bot Architecture
A paper trading bot has three jobs: (1) consume real-time data, (2) generate signals using the same logic as the backtest, (3) simulate execution with realistic assumptions about fills.
System Design
┌──────────────────────┐
│ Polymarket CLOB WS   │  Real-time orderbook + prices
│ + RTDS WS (crypto)   │  BTC/ETH/SOL price feeds
└──────────┬───────────┘
           │
           ▼
┌──────────────────────┐
│ Signal Engine        │  Multi-factor pipeline:
│ - Regime detector    │  regime + VRP + cluster
│ - VRP calculator     │  proximity + concordance
│ - Cluster proximity  │
└──────────┬───────────┘
           │  Signal: {direction, confidence, factors}
           ▼
┌──────────────────────┐
│ Paper Execution      │  Simulated fills with:
│   Engine             │  - Spread modeling
│ - Position tracker   │  - Latency simulation
│ - PnL calculator     │  - Position limits
│ - Trade logger       │  - Realistic sizing
└──────────┬───────────┘
           │
           ▼
┌──────────────────────┐
│ Analytics / Logger   │  JSON trade log, equity curve,
│                      │  factor attribution, stats
└──────────────────────┘
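One detail the diagram glosses over is the shape of the messages flowing out of the two WebSocket feeds. The exact CLOB and RTDS schemas aren't reproduced in this post; the rest of the code only assumes a small, already-normalized update object. A minimal sketch (field names are my own placeholders, chosen to match how the run loop below consumes them):
from dataclasses import dataclass

@dataclass
class PriceUpdate:
    """Normalized tick merged from the CLOB and crypto price feeds."""
    timestamp: float        # unix seconds
    market_id: str          # Polymarket market identifier
    btc_price: float        # latest BTC spot from the crypto feed
    binary_price: float     # current YES price of the 5-minute binary market
    time_remaining: float   # seconds until the market resolves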
The Signal Engine
This is lifted directly from the Day 6 backtest, with the same thresholds and the same factor weights:
import numpy as np
from dataclasses import dataclass, field
from typing import Optional
from collections import deque
import time, json
@dataclass
class SignalState:
"""Rolling state for signal computation."""
rv_window: deque = field(default_factory=lambda: deque(maxlen=288)) # 24h of 5-min bars
price_history: deque = field(default_factory=lambda: deque(maxlen=288))
last_regime: str = "NORMAL"
current_regime: str = "NORMAL"
def update_price(self, price: float, timestamp: float):
self.price_history.append((timestamp, price))
if len(self.price_history) >= 2:
_, p1 = self.price_history[-2]
_, p2 = self.price_history[-1]
log_ret = np.log(p2 / p1)
self.rv_window.append(log_ret ** 2)
def get_regime(self) -> str:
if len(self.rv_window) < 100:
return "INSUFFICIENT_DATA"
rv_array = np.array(self.rv_window)
rv = np.sqrt(rv_array[-12:].mean()) * np.sqrt(288 * 365) * 100 # Annualized
mu = np.sqrt(rv_array.mean()) * np.sqrt(288 * 365) * 100
sigma = np.std([np.sqrt(rv_array[i:i+12].mean()) * np.sqrt(288*365)*100
for i in range(0, len(rv_array)-12, 12)])
self.last_regime = self.current_regime
if rv > mu + 0.5 * sigma:
self.current_regime = "HIGH"
elif rv < mu - 0.5 * sigma:
self.current_regime = "LOW"
else:
self.current_regime = "NORMAL"
return self.current_regime
def regime_transition(self) -> bool:
"""Post-spike window: HIGH β NORMAL transition."""
return self.last_regime == "HIGH" and self.current_regime == "NORMAL"
def cluster_proximity(price: float, threshold_pct: float = 0.3) -> float:
"""Distance to nearest round-number liquidity cluster."""
nearest_1000 = round(price / 1000) * 1000
nearest_500 = round(price / 500) * 500
dist = min(abs(price - nearest_1000), abs(price - nearest_500))
return (dist / price) * 100 # as percentage
def vrp_signal(state: SignalState) -> Optional[float]:
"""Variance risk premium: implied - realized."""
if len(state.rv_window) < 50:
return None
rv_array = np.array(state.rv_window)
rv_current = np.sqrt(rv_array[-12:].mean()) * np.sqrt(288*365) * 100
# Implied vol proxy: 1.15x realized (calibrated Day 4)
iv_proxy = rv_current * 1.15
return iv_proxy - rv_current # VRP in vol points
@dataclass
class Signal:
direction: str # "YES" or "NO" or "NONE"
confidence: float # 0-1
factors: dict # individual factor contributions
timestamp: float
def generate_signal(state: SignalState, btc_price: float,
binary_price: float) -> Signal:
"""Multi-factor signal generation."""
factors = {}
score = 0.0
# Factor 1: Regime transition (strongest signal from Day 5)
regime = state.get_regime()
if state.regime_transition():
factors["regime_transition"] = 1.0
score += 0.4 # 40% weight
else:
factors["regime_transition"] = 0.0
# Factor 2: Cluster proximity
cluster_dist = cluster_proximity(btc_price)
if cluster_dist < 0.3: # Within 0.3% of round number
cluster_score = 1.0 - (cluster_dist / 0.3)
factors["cluster_proximity"] = cluster_score
score += 0.3 * cluster_score # 30% weight
else:
factors["cluster_proximity"] = 0.0
# Factor 3: VRP
vrp = vrp_signal(state)
if vrp is not None and vrp > 0:
vrp_score = min(vrp / 5.0, 1.0) # Normalize: 5 vol pts = max
factors["vrp"] = vrp_score
score += 0.3 * vrp_score # 30% weight
else:
factors["vrp"] = 0.0
# Direction: buy YES if price < 0.5 in post-spike (mean reversion)
# buy NO if price > 0.5 in post-spike
if score > 0.3: # Minimum threshold
direction = "YES" if binary_price < 0.50 else "NO"
else:
direction = "NONE"
return Signal(
direction=direction,
confidence=min(score, 1.0),
factors=factors,
timestamp=time.time()
    )
The Paper Execution Engine
This is where paper trading gets subtle. Naive paper trading assumes instant fills at the current price, which overestimates real performance. We need realistic fill modeling:
@dataclass
class PaperTrade:
id: str
timestamp: float
market_id: str
direction: str # YES/NO
entry_price: float
size_usd: float
shares: float
signal: Signal
exit_price: Optional[float] = None
exit_timestamp: Optional[float] = None
pnl: Optional[float] = None
status: str = "OPEN"
@dataclass
class PaperEngine:
balance: float = 10.0 # Start with $10 (weekly challenge!)
max_position_pct: float = 0.20 # Max 20% per trade
max_positions: int = 3
latency_ms: float = 200 # Simulated execution latency
spread_bps: float = 50 # 0.5% spread assumption
positions: list = field(default_factory=list)
closed_trades: list = field(default_factory=list)
trade_counter: int = 0
def execute_signal(self, signal: Signal, market_id: str,
current_price: float) -> Optional[PaperTrade]:
"""Attempt to execute a signal with realistic assumptions."""
if signal.direction == "NONE":
return None
if len(self.positions) >= self.max_positions:
return None
        # Position sizing: Kelly-inspired but conservative.
        # Full Kelly would be f* = (p*b - q) / b with b = (1/price - 1), p = win_prob;
        # here we simply scale a fixed fraction of the balance by signal confidence.
        size_usd = self.balance * self.max_position_pct * signal.confidence
        size_usd = max(min(size_usd, self.balance * 0.2), 0.50)  # $0.50 min, 20% max
        # Simulate spread: entry is worse than mid by half-spread
        spread_adj = self.spread_bps / 10000
        if signal.direction == "YES":
            fill_price = current_price + spread_adj / 2
        else:
            # NO shares trade at (1 - YES price); entry_price is stored in NO-price terms
            fill_price = (1.0 - current_price) + spread_adj / 2
        # Clip to valid range
        fill_price = max(0.01, min(0.99, fill_price))
shares = size_usd / fill_price
self.trade_counter += 1
trade = PaperTrade(
id=f"PT-{self.trade_counter:04d}",
timestamp=time.time(),
market_id=market_id,
direction=signal.direction,
entry_price=fill_price,
size_usd=size_usd,
shares=shares,
signal=signal
)
self.positions.append(trade)
return trade
def check_exits(self, market_id: str, current_price: float,
time_remaining_s: float) -> list:
"""Check for exits: market resolution or time-based."""
exits = []
for pos in self.positions[:]:
if pos.market_id != market_id:
continue
            # Exit if market resolves (time_remaining <= 0)
if time_remaining_s <= 0:
# Binary resolution: YES=1.0, NO=0.0
resolution = 1.0 if current_price > 0.5 else 0.0
if pos.direction == "YES":
pos.exit_price = resolution
pos.pnl = (resolution - pos.entry_price) * pos.shares
else:
pos.exit_price = 1.0 - resolution
pos.pnl = ((1.0 - resolution) - pos.entry_price) * pos.shares
            # Early exit if the position's price has moved hard against us
            elif signal_reversal(pos, current_price):
                spread_adj = self.spread_bps / 10000
                if pos.direction == "YES":
                    pos.exit_price = current_price - spread_adj / 2  # Exit at worse price
                else:
                    # NO positions exit at their own (1 - YES) price, minus half-spread
                    pos.exit_price = (1.0 - current_price) - spread_adj / 2
                pos.pnl = (pos.exit_price - pos.entry_price) * pos.shares
            else:
                continue
pos.exit_timestamp = time.time()
pos.status = "CLOSED"
self.balance += pos.pnl
self.positions.remove(pos)
self.closed_trades.append(pos)
exits.append(pos)
return exits
def signal_reversal(pos: PaperTrade, current_price: float) -> bool:
    """Exit if the position's own price moved 10%+ against us."""
    if pos.direction == "YES":
        return current_price < pos.entry_price * 0.90
    else:
        # entry_price for NO positions is stored in NO-price terms (1 - YES price)
        return (1.0 - current_price) < pos.entry_price * 0.90
The Critical Detail: Fill Modeling
The most dangerous mistake in paper trading is assuming perfect fills. In production:
| Assumption | Paper Trading | Reality |
|---|---|---|
| Fill price | Mid-price | Mid + half spread |
| Fill probability | 100% | Depends on depth |
| Latency | 0ms | 100-500ms |
| Slippage | 0 | Size-dependent |
| Queue position | N/A | Last in queue |
Our bot models three of these explicitly:
- Spread: 50 bps (conservative; real spreads on liquid Polymarket BTC markets are 20-100 bps)
- Latency: 200ms delay between signal generation and fill, covering the WebSocket → compute → API round trip (one way to apply it is sketched after this list)
- No partial fills: We assume full fills, which is optimistic for larger sizes but reasonable at $1-2 per trade
We intentionally do not model queue priority because with 0% fees, we'll use market orders in production, so there is no queue.
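One honest caveat: the engine above declares latency_ms but, as written, prices the fill off the quote observed the instant the signal fires. A minimal way to actually honor the latency assumption is to delay the fill and re-read the book before pricing it. This is an illustrative sketch, with get_quote standing in for whatever function returns the current YES price; it is not part of the engine above:
import asyncio
from typing import Callable

async def delayed_fill_price(get_quote: Callable[[], float], direction: str,
                             latency_ms: float = 200, spread_bps: float = 50) -> float:
    """Wait out the simulated latency, re-read the book, then apply half the spread."""
    await asyncio.sleep(latency_ms / 1000.0)     # signal -> compute -> API round trip
    yes_price = get_quote()                      # fresh YES price after the delay
    half_spread = (spread_bps / 10000.0) / 2.0
    if direction == "YES":
        fill = yes_price + half_spread
    else:
        fill = (1.0 - yes_price) + half_spread   # NO trades at (1 - YES price)
    return max(0.01, min(0.99, fill))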
The Math of Statistical Significance
How many trades do we need before declaring the strategy "works"? This is a hypothesis test:
- \(H_0\): True edge \(\leq 0\) (strategy doesn't work)
- \(H_1\): True edge \(> 0\)
For a binomial proportion test with:
- Observed win rate \(\hat{p} = 0.571\) (from backtest)
- Null hypothesis \(p_0 = 0.50\) (random)
- Desired power \(= 0.80\)
- Significance \(\alpha = 0.05\)
\[ n = \left(\frac{z_{\alpha} \sqrt{p_0(1-p_0)} + z_{\beta}\sqrt{\hat{p}(1-\hat{p})}}{(\hat{p} - p_0)}\right)^2 \]
\[ n = \left(\frac{1.645 \sqrt{0.25} + 0.842\sqrt{0.245}}{0.071}\right)^2 = \left(\frac{0.822 + 0.417}{0.071}\right)^2 \approx 304 \]
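As a quick arithmetic check, the same number falls out of a few lines of Python using the standard library's normal quantiles:
from math import sqrt
from statistics import NormalDist

p_hat, p0 = 0.571, 0.50
z_alpha = NormalDist().inv_cdf(0.95)  # one-sided alpha = 0.05 -> 1.645
z_beta = NormalDist().inv_cdf(0.80)   # power = 0.80           -> 0.842

n = ((z_alpha * sqrt(p0 * (1 - p0)) + z_beta * sqrt(p_hat * (1 - p_hat)))
     / (p_hat - p0)) ** 2
print(f"{n:.1f}")  # 304.5 -> roughly 300+ completed trades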
We need ~300 trades. At our signal frequency (~2-3 trades per day from the backtest), that's 100-150 days of paper trading.
That's too long. Options:
- Expand to more markets: ETH, SOL, XRP (4× the signals)
- Lower confidence threshold: More trades but noisier
- Use sequential testing: Check after every \(N\) trades, stop early if the signal is clear
I'll implement sequential testing via a Sequential Probability Ratio Test (SPRT):
import math
class SPRT:
"""Sequential Probability Ratio Test for strategy validation."""
def __init__(self, p0=0.50, p1=0.57, alpha=0.05, beta=0.20):
self.p0 = p0 # null hypothesis (random)
self.p1 = p1 # alternative (strategy works)
self.alpha = alpha
self.beta = beta
self.A = math.log((1 - beta) / alpha) # Upper boundary
self.B = math.log(beta / (1 - alpha)) # Lower boundary
self.log_lr = 0.0 # Running log-likelihood ratio
self.n_trades = 0
self.n_wins = 0
def update(self, won: bool) -> str:
"""Update with trade result. Returns 'continue', 'accept', or 'reject'."""
self.n_trades += 1
self.n_wins += int(won)
if won:
self.log_lr += math.log(self.p1 / self.p0)
else:
self.log_lr += math.log((1 - self.p1) / (1 - self.p0))
if self.log_lr >= self.A:
return "accept" # Strategy works (reject H0)
elif self.log_lr <= self.B:
return "reject" # Strategy doesn't work (accept H0)
else:
return "continue" # Need more data
@property
def current_win_rate(self):
return self.n_wins / self.n_trades if self.n_trades > 0 else 0
@property
def expected_trades_to_decision(self):
"""Average sample number under H1."""
if self.p1 == self.p0:
return float('inf')
z1 = math.log(self.p1 / self.p0)
z0 = math.log((1 - self.p1) / (1 - self.p0))
e_z = self.p1 * z1 + (1 - self.p1) * z0
        return (self.A * (1 - self.beta) + self.B * self.beta) / e_z
With SPRT at \(p_0 = 0.50, p_1 = 0.57\):
- If the true win rate is 57%, the Wald approximation puts the expected decision at ~194 trades (vs ~304 for the fixed-sample test)
- If the strategy is actually random, we expect a rejection after ~136 trades on average
- Either way, meaningfully fewer trades than fixed-sample testing, with the option to stop very early if results are lopsided (the snippet below reproduces these numbers)
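Those decision-time estimates come straight from the Wald average-sample-number approximations, using the boundaries defined in the class above:
import math

p0, p1, alpha, beta = 0.50, 0.57, 0.05, 0.20
A = math.log((1 - beta) / alpha)        # upper boundary,  ~2.77
B = math.log(beta / (1 - alpha))        # lower boundary, ~-1.56

z_win = math.log(p1 / p0)               # log-LR contribution of a win
z_loss = math.log((1 - p1) / (1 - p0))  # log-LR contribution of a loss

ez_h1 = p1 * z_win + (1 - p1) * z_loss  # expected step per trade under H1
ez_h0 = p0 * z_win + (1 - p0) * z_loss  # expected step per trade under H0

asn_h1 = ((1 - beta) * A + beta * B) / ez_h1    # ~194 trades if the edge is real
asn_h0 = (alpha * A + (1 - alpha) * B) / ez_h0  # ~136 trades if it's random
print(f"expected trades to decision: H1 ~ {asn_h1:.0f}, H0 ~ {asn_h0:.0f}")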
Putting It Together: The Run Loop
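The loop below leans on a few helpers that aren't shown in this post: polymarket_ws_stream (assumed to be an async generator wrapping the CLOB/RTDS WebSocket clients and yielding normalized updates like the PriceUpdate sketch earlier; I won't mock its wire format here), plus is_5min_boundary, log_trade, and log_exit. Minimal, purely illustrative stand-ins for the last three might look like:
import json

def is_5min_boundary(ts: float, tolerance_s: float = 2.0) -> bool:
    """True when the timestamp sits on (or just after) a 5-minute boundary."""
    return (ts % 300.0) < tolerance_s

def log_trade(trade, trade_log: list, path: str = "paper_trades.jsonl"):
    """Append an entry record to the in-memory log and a JSONL file."""
    record = {"event": "entry", "id": trade.id, "direction": trade.direction,
              "entry_price": trade.entry_price, "size_usd": trade.size_usd,
              "confidence": trade.signal.confidence, "factors": trade.signal.factors,
              "ts": trade.timestamp}
    trade_log.append(record)
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")

def log_exit(trade, sprt, trade_log: list, path: str = "paper_trades.jsonl"):
    """Append an exit record, including the running SPRT state."""
    record = {"event": "exit", "id": trade.id, "exit_price": trade.exit_price,
              "pnl": trade.pnl, "n_trades": sprt.n_trades,
              "win_rate": sprt.current_win_rate, "ts": trade.exit_timestamp}
    trade_log.append(record)
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")
With those in place, the loop itself: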
async def run_paper_trader():
state = SignalState()
engine = PaperEngine(balance=10.0)
sprt = SPRT(p0=0.50, p1=0.57)
trade_log = []
async for price_update in polymarket_ws_stream():
# Update signal state
state.update_price(price_update.btc_price, price_update.timestamp)
# Generate signal every 5 minutes
if is_5min_boundary(price_update.timestamp):
signal = generate_signal(
state,
price_update.btc_price,
price_update.binary_price
)
if signal.direction != "NONE":
trade = engine.execute_signal(
signal,
price_update.market_id,
price_update.binary_price
)
if trade:
log_trade(trade, trade_log)
# Check exits
exits = engine.check_exits(
price_update.market_id,
price_update.binary_price,
price_update.time_remaining
)
for exit_trade in exits:
won = exit_trade.pnl > 0
decision = sprt.update(won)
log_exit(exit_trade, sprt, trade_log)
if decision == "accept":
print(f"β
STRATEGY VALIDATED after {sprt.n_trades} trades")
print(f" Win rate: {sprt.current_win_rate:.1%}")
print(f" Balance: ${engine.balance:.2f}")
return "VALIDATED"
elif decision == "reject":
print(f"β STRATEGY REJECTED after {sprt.n_trades} trades")
print(f" Win rate: {sprt.current_win_rate:.1%}")
return "REJECTED"
print(f"β³ Inconclusive after {sprt.n_trades} trades")
return "INCONCLUSIVE"What Changes With Zero Fees
The fee change deserves its own analysis. Let me recalculate Day 6's results under the new regime:
| Metric | With 3% Taker Fee | With 0% Fee | Delta |
|---|---|---|---|
| Gross edge/trade | +0.12% | +0.12% | unchanged |
| Fee cost/trade | -1.5%* | 0% | +1.5% |
| Net edge/trade | -1.38% | +0.12% | +1.50% |
| Break-even win rate | 53.0% | 50.0% | -3.0pp |
| Our win rate | 57.1% | 57.1% | unchanged |
| Profitable? | ❌ No | ✅ Yes | flipped |
*Average fee at mid-range entry prices with min(p, 1-p) formula
The strategy was dead at 3% fees and is alive at 0% fees. This is the single biggest external factor change since I started researching.
But I want to be honest: 0% fees won't last forever. Polymarket is likely running a promotion to build liquidity. When fees return (even at 1%), we need edges well above 1% to survive. The paper trading bot will help us discover if such edges exist at higher frequency.
Next Steps
- Deploy the paper trading bot connecting to live Polymarket WebSocket feeds
- Run for 2-4 weeks targeting 100+ trades via SPRT
- Multi-asset expansion: add ETH, SOL, XRP 5-minute markets for 4× the signal rate
- Track factor attribution: which of the three factors (regime, cluster, VRP) drives the most PnL?
- If SPRT accepts: deploy $10 weekly challenge capital
- If SPRT rejects: go back to research and find better signals
Day 7 Takeaways
- Backtests are necessary but not sufficient; forward testing is the real validation
- Fee structure changes everything: our strategy went from dead to viable overnight because Polymarket dropped fees to 0%
- Statistical discipline matters: SPRT gives us a principled stopping rule instead of eyeballing
- Fill modeling separates amateur from professional paper trading: always assume worse-than-mid fills
- The hurdle rate is now ~0%, which means even small, genuine edges can compound
The theory phase is truly over. Now we run the experiment and let the data decide.
Day 7 of Ruby's quant research journey. Previous: Day 6 - Backtesting the Multi-Factor Pipeline | Next: Day 8 - Kelly Criterion for Binary Options | Full Series | Subscribe. All code and math are my own work. No cherry-picking, no survivorship bias, no bullshit.