Writing · February 12, 2026 · 8 min read

We built a bot that printed money on Polymarket's broken binary markets

How we scraped Polymarket's stale price feed and blended it with live exchange data to exploit mispriced crypto UP/DOWN markets, before the rest of the market caught up.

The Inefficiency

Polymarket runs 15-minute binary markets on crypto prices. The question is simple: will BTC be higher or lower than its opening price when the window closes?

Two tokens. UP and DOWN.

If BTC goes up   → UP pays $1.00, DOWN pays $0.00
If BTC goes down → DOWN pays $1.00, UP pays $0.00

In the early days, these markets were wildly inefficient. Retail traders would see a small uptick in BTC and panic-sell their DOWN tokens at $0.30, even when the price had barely moved and twelve minutes remained on the clock. The market was pricing sentiment, not probability.

We saw a gap between what the market believed and what the math said. So we built a system to quantify it and trade accordingly.

The Core Idea: Outrun the Oracle

Polymarket settles these markets using Chainlink price feeds: on-chain oracle data that determines whether the asset closed higher or lower than its opening price. Direct access to Chainlink's oracle is gated to institutional users. We didn't have it, but we didn't need it. By scraping Polymarket's own WebSocket, we could reverse-engineer the price data the platform was using to settle markets, and it was stale: typically 1-2 seconds behind, sometimes worse. In a 15-minute binary market where every tick matters, that delay is a gift.

We built a hybrid price feed. The idea is simple:

  1. Scrape Polymarket's WebSocket to extract the price data the platform was using. This gives us the “official” price, but it's stale.
  2. Simultaneously stream real-time book tickers from five exchanges: Binance, Coinbase, Kraken, OKX, and Bybit.
  3. When a new price tick arrives from Polymarket's feed, look at its timestamp to determine when it was actually generated.
  4. Look back in our exchange history buffers to find what those exchanges were trading at that same moment.
  5. Measure how much exchange prices have moved since that moment.
  6. Apply that movement to the stale price.

The formula:

hybrid_price = stale_price × mean(exchange_now / exchange_at_stale_price_generation)

Each exchange keeps a rolling 5-second history buffer at ~100ms granularity. About 50 data points per exchange. When Polymarket's feed tells us “BTC was $84,250 about 1.2 seconds ago,” we look back 1.2 seconds in each exchange buffer, find what they were quoting then, compare it to what they're quoting now, and compute the average ratio.
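A rolling buffer like the one described above could be sketched as follows. This is a minimal illustration, not our production code: the class name `ExchangeBuffer`, its methods, and the choice to return the most recent sample at or before the requested timestamp (rather than interpolating) are all assumptions made for the example.

```python
import bisect
from collections import deque


class ExchangeBuffer:
    """Rolling window of (timestamp_ms, price) samples for one exchange."""

    def __init__(self, window_ms=5_000):
        self.window_ms = window_ms
        self.samples = deque()  # ordered by timestamp, oldest first

    def add(self, ts_ms, price):
        """Append a new sample and evict anything older than the window."""
        self.samples.append((ts_ms, price))
        while self.samples and self.samples[0][0] < ts_ms - self.window_ms:
            self.samples.popleft()

    def price_at(self, ts_ms):
        """Latest sample at or before ts_ms, or None if the buffer
        does not reach back that far."""
        times = [t for t, _ in self.samples]
        i = bisect.bisect_right(times, ts_ms) - 1
        return self.samples[i][1] if i >= 0 else None

    def latest(self):
        """Most recent price, or None if the buffer is empty."""
        return self.samples[-1][1] if self.samples else None
```

At ~100ms granularity a 5-second window holds about 50 samples, so rebuilding the timestamp list on each lookup is cheap enough for a sketch.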

Polymarket feed says:  $84,250.00  (generated 1,200ms ago)

┌───────────────────────────────────────────────────────┐
│ Exchange   At generation    Now              Ratio    │
├───────────────────────────────────────────────────────┤
│ Binance    $84,248.50       $84,271.30       1.000271 │
│ Coinbase   $84,251.20       $84,273.80       1.000268 │
│ Kraken     $84,249.90       $84,270.10       1.000240 │
│ OKX        $84,250.10       $84,272.50       1.000266 │
│ Bybit      $84,249.00       $84,271.90       1.000272 │
└───────────────────────────────────────────────────────┘

Mean ratio: 1.000263

Hybrid price: $84,250.00 × 1.000263 = $84,272.16

Polymarket's feed was $22 behind the real market.
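The worked example above can be reproduced in a few lines. This is a sketch under stated assumptions: the function name `hybrid_price` and the dict-per-snapshot layout are illustrative, and exchanges missing from either snapshot are simply skipped.

```python
from statistics import mean


def hybrid_price(stale_price, then_prices, now_prices):
    """Scale the stale price by the mean exchange move since it was generated.

    then_prices / now_prices map exchange name -> quoted price.
    Falls back to the stale price if no exchange appears in both snapshots.
    """
    ratios = [
        now_prices[ex] / then_prices[ex]
        for ex in then_prices
        if ex in now_prices
    ]
    if not ratios:
        return stale_price
    return stale_price * mean(ratios)


# Figures from the table above.
then_prices = {"Binance": 84_248.50, "Coinbase": 84_251.20, "Kraken": 84_249.90,
               "OKX": 84_250.10, "Bybit": 84_249.00}
now_prices = {"Binance": 84_271.30, "Coinbase": 84_273.80, "Kraken": 84_270.10,
              "OKX": 84_272.50, "Bybit": 84_271.90}

price = hybrid_price(84_250.00, then_prices, now_prices)
print(f"hybrid price: ${price:,.2f}  (lead: ${price - 84_250.00:,.2f})")
```

Run on the table's numbers, this lands at roughly $84,272, about $22 above the stale print. The last cent or two can differ from the hand calculation because the code uses unrounded ratios.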

That $22 difference matters. In a market settling on whether BTC is above or below its opening price, being $22 closer to the truth than every other participant is the entire edge.

From Price to Probability

Knowing the current price isn't enough. We need to convert a price lead into a probability that's accurate enough to beat the market.

We used the normal CDF. The intuition: the percentage price change over a 15-minute window is roughly normally distributed. If the asset is currently above its opening price, we can ask “what's the probability it stays there given how far it's moved and how much time is left?”

P(Up) = Φ(z)

where:
  z = deviation / remaining_vol

  deviation = (current_price - open_price) / open_price
  remaining_vol = base_vol × (time_remaining_fraction ^ 0.65)
  time_remaining_fraction = time_remaining / 15 minutes
  base_vol ≈ 0.25% (empirical BTC 15-minute volatility)

The key insight is remaining volatility. As time runs out, remaining volatility shrinks, which pushes z toward the extremes and makes the probability more decisive.

Examples (BTC 15m, base_vol = 0.25%):

  +0.05% move, 4 min left   →  z ≈ 0.49  →  P(Up) ≈ 69%
  +0.10% move, 4 min left   →  z ≈ 0.97  →  P(Up) ≈ 83%
  +0.10% move, 1 min left   →  z ≈ 1.94  →  P(Up) ≈ 97%
  +0.02% move, 10 min left  →  z ≈ 0.12  →  P(Up) ≈ 55%

A 0.10% move with 1 minute left is almost a certainty. There simply isn't enough time for BTC to reverse. But the market would routinely price that at 80-85%. That's 12-17 cents of edge per token.
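The model above fits in a short function. This is a sketch using only the constants stated in the text; the function name `p_up` and the handling of an expired window are assumptions. With the parameters exactly as written it gives z ≈ 0.94 for a +0.10% move with 4 minutes left, near but not identical to the rounded examples above, so treat the constants as approximate.

```python
from statistics import NormalDist

BASE_VOL = 0.0025        # ~0.25% BTC 15-minute volatility, from the text
DECAY_EXPONENT = 0.65    # time-decay power, from the text
WINDOW_SECONDS = 15 * 60


def p_up(current_price, open_price, seconds_remaining):
    """Probability that the window closes above its opening price."""
    deviation = (current_price - open_price) / open_price
    frac = seconds_remaining / WINDOW_SECONDS
    if frac <= 0:
        # Window is over (assumed handling): the outcome is already decided.
        return 1.0 if deviation > 0 else 0.0
    remaining_vol = BASE_VOL * frac ** DECAY_EXPONENT
    z = deviation / remaining_vol
    return NormalDist().cdf(z)


for move, minutes in [(0.0005, 4), (0.0010, 4), (0.0010, 1), (0.0002, 10)]:
    p = p_up(100 * (1 + move), 100.0, minutes * 60)
    print(f"{move:+.2%} move, {minutes:>2} min left -> P(Up) = {p:.0%}")
```

Using `statistics.NormalDist` keeps the sketch dependency-free; any normal CDF implementation would do.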

We also layered a small trend signal on top. A momentum nudge from recent price ticks, capped at ±0.3 z-score units, to catch situations where the price was actively accelerating in one direction.
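The text does not spell out how the momentum signal was computed, so the following is purely illustrative: it assumes the nudge is the recent return expressed in units of remaining volatility, multiplied by a hypothetical `gain`, and then clamped to the ±0.3 cap mentioned above. Only the cap comes from the article.

```python
def trend_nudge(recent_prices, remaining_vol, gain=1.0, cap=0.3):
    """Momentum adjustment in z-score units, clamped to [-cap, +cap].

    recent_prices: oldest-to-newest prices from the last few ticks.
    gain: hypothetical scaling factor (not from the article).
    """
    if len(recent_prices) < 2 or remaining_vol <= 0:
        return 0.0
    recent_return = recent_prices[-1] / recent_prices[0] - 1
    raw = gain * recent_return / remaining_vol
    return max(-cap, min(cap, raw))


# The nudge is added to z before applying the normal CDF:
#   z_adjusted = z + trend_nudge(last_ticks, remaining_vol)
```

The clamp is the important part: however strong the recent move, the trend term can shift z by at most 0.3 in either direction.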

Time: The Most Important Variable

        Probability of UP holding (given +0.05% lead)

  100% ┤                                              •
       │                                           •
       │                                        •
   80% ┤                                    •
       │                                •
       │                           •
   60% ┤                     •
       │               •
       │         •
   50% ┤    •
       └────┬─────┬─────┬─────┬─────┬─────┬─────┬──┐
           0    2min   4min   6min   8min  10min 12min 15min
                              Time elapsed

The curve isn't linear. It accelerates. We modelled remaining volatility with a power of 0.65 rather than the standard square root (0.5) of time. This was calibrated empirically: crypto in these short windows showed slightly more mean-reversion than a pure random walk, so volatility decayed a bit faster than √t.
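To see what the exponent does, the sketch below compares the 0.65 power against a plain square root for the same +0.05% lead. The helper `p_hold` is a hypothetical name, and it is only defined while some time remains in the window.

```python
from statistics import NormalDist


def p_hold(lead, minutes_elapsed, exponent, base_vol=0.0025, window_min=15):
    """P(Up) for a given fractional lead; requires minutes_elapsed < window_min."""
    frac_left = (window_min - minutes_elapsed) / window_min
    remaining_vol = base_vol * frac_left ** exponent
    return NormalDist().cdf(lead / remaining_vol)


for m in (0, 4, 8, 12, 14):
    calibrated = p_hold(0.0005, m, 0.65)
    random_walk = p_hold(0.0005, m, 0.5)
    print(f"{m:>2} min elapsed: power 0.65 -> {calibrated:.0%}, sqrt -> {random_walk:.0%}")
```

Both curves start at the same point and rise as time runs out, but the 0.65 power shrinks remaining volatility faster, so the same lead converts into a higher probability late in the window.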

The market consistently lagged this reality. With 2 minutes left and the asset already up, traders would still price UP tokens at $0.75 when the math said $0.90+. That's where we made money.

Edge Detection and Execution

Edge is the gap between our model's probability and the market's price:

edge = model_probability - market_price

We only traded when the edge exceeded 5 percentage points, i.e. 5 cents per $1.00 token. That was aggressive enough to capture frequent opportunities and conservative enough to avoid noise.

Trade sizing scaled linearly with edge:

  $1 at min edge (5%) → $5 at 3× min edge (15%)
  Hard cap: $30 per 15-minute window
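The sizing rule can be sketched as a linear interpolation. Two details here are assumptions rather than things stated above: size is held at $5 for any edge beyond 15%, and the $30 cap is enforced by trimming the final trade to whatever budget remains in the window.

```python
MIN_EDGE = 0.05                  # minimum edge, from the text
MIN_SIZE, MAX_SIZE = 1.0, 5.0    # dollars at 1x and 3x the minimum edge
WINDOW_CAP = 30.0                # dollars per 15-minute window


def trade_size(edge, spent_this_window):
    """Dollar size for one trade, given the edge and spend so far this window."""
    if edge < MIN_EDGE:
        return 0.0
    # 0.0 at the minimum edge, 1.0 at 3x the minimum edge, clamped above that.
    scale = min((edge - MIN_EDGE) / (2 * MIN_EDGE), 1.0)
    size = MIN_SIZE + scale * (MAX_SIZE - MIN_SIZE)
    # Never exceed what is left of the per-window budget.
    return max(0.0, min(size, WINDOW_CAP - spent_this_window))
```

For example, a 10% edge sits halfway between the two anchor points and sizes at $3, and once $30 has been spent in a window the function returns zero regardless of edge.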

Every price tick from Polymarket's feed, roughly once per second, triggered a full recalculation: hybrid price, deviation, fair value, edge check, trade or pass.

The bot was tick-driven, not polling on a timer. When a new price arrived via the WebSocket, the entire pipeline ran immediately. Fresh hybrid price, new fair value estimate, trade decision, all within milliseconds.

For every Polymarket price tick:
  1. Compute hybrid price (stale price × exchange ratio)
  2. Calculate deviation from market opening price
  3. Run through Φ(z) model with time-adjusted volatility
  4. Compare fair value to Polymarket orderbook prices
  5. If edge > 5% and time remaining > 60s → execute

The 60-second cutoff prevented us from trading into illiquid end-of-window conditions where spreads blow out and fills become unreliable.
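The final decision step (steps 4 and 5 above) might look like this. It is a sketch: the `Order` type and `decide` function are hypothetical names, and it assumes the DOWN side is valued at 1 − P(Up) and compared against the DOWN ask, which the text implies but does not state.

```python
from dataclasses import dataclass
from typing import Optional

MIN_EDGE = 0.05          # minimum edge, from the text
MIN_SECONDS_LEFT = 60    # no trades in the final minute, from the text


@dataclass
class Order:
    token: str    # "UP" or "DOWN"
    price: float  # ask price we would pay
    edge: float   # model fair value minus ask


def decide(p_up: float, up_ask: float, down_ask: float,
           seconds_remaining: float) -> Optional[Order]:
    """Return the better-edged order if it clears the threshold, else None."""
    if seconds_remaining <= MIN_SECONDS_LEFT:
        return None
    candidates = [
        Order("UP", up_ask, p_up - up_ask),
        Order("DOWN", down_ask, (1.0 - p_up) - down_ask),
    ]
    best = max(candidates, key=lambda o: o.edge)
    return best if best.edge > MIN_EDGE else None


# Model says 90%, market asks $0.75 for UP with 2 minutes left.
print(decide(0.90, 0.75, 0.27, 120))
```

Checking both tokens on every tick means the same model output can trigger a DOWN purchase when the market is overpricing UP.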

Why It Worked

The edge came from a simple information asymmetry. The price data Polymarket used to settle markets had structural latency: typically 1-2 seconds, with a long tail of worse delays. Exchange WebSocket feeds from Binance and Coinbase were effectively real-time. By scraping Polymarket's feed and blending it with live exchange data, we had a better estimate of the current BTC price than anyone trading solely off Polymarket's stale prices.

In a binary market that settles on a single question (is BTC up or down from its open?) being $20 ahead of the oracle is enormous. It's the difference between a 65% fair value and a 75% fair value. That's 10 cents of edge on a $1.00 binary.

The inefficiency didn't last forever. As more participants entered, spreads tightened, prices became more rational, and the latency edge compressed. Market makers showed up. Pricing models improved. The window between “what the oracle says” and “what the truth is” narrowed to the point where the edge no longer justified the execution risk.

That's how markets work. Inefficiencies exist until someone builds a system to exploit them, then they disappear.

We built that system.
