How the UFC Verdict is Built — Edge Factors Methodology

Model update — May 17, 2026

Why the verdict shrank from ten edges to three

The May 16 Allen vs Costa card was the worst single result in the model's track record: 2 of 10 committed picks. We audited the verdict logic the next day. The finding was uncomfortable but clear.

Of the ten edge factors, seven sat at 52–57% per-fire accuracy — close enough to a coin flip that they were behaving like correlated noise in the Bayesian combiner. A 53% factor doesn't add information; it adds confidence theatre. When several stacked on the wrong side of an upset, they outvoted the genuinely predictive signals (Record, Cardio, Takedown Defense — all 60%+ per-fire).

The verdict now combines only those three. We tested every subset of the ten factors against 8,533 historical fights (research/subset-test.js):

Accuracy by edge subset

All ten edges: 65.8% on 8,367 predictions
Drop the two noisiest (age, stance/reach): 67.5%
Top 3 only (Record, Cardio, TD Defense): 68.4% ← shipped
Record alone (when it fires): 75.5% — strongest single signal
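
A subset sweep of this kind can be pictured as a bitmask loop: ten factors give 2^10 − 1 = 1,023 non-empty subsets, each scored against the same fights. The sketch below is illustrative only and is not the contents of research/subset-test.js. The function names, the `predict` callback, and the `fight.winner` field are all assumptions.

```javascript
// Hypothetical sketch: enumerate every non-empty subset of a factor list.
// Each integer from 1 to 2^n - 1 encodes one subset as a bitmask.
function allSubsets(names) {
  const subsets = [];
  for (let mask = 1; mask < (1 << names.length); mask++) {
    subsets.push(names.filter((_, i) => mask & (1 << i)));
  }
  return subsets;
}

// `predict(fight, subset)` is an assumed callback that returns the predicted
// winner's id, or null when no factor in the subset fires.
function scoreSubset(fights, subset, predict) {
  let correct = 0;
  let predictions = 0;
  for (const fight of fights) {
    const pick = predict(fight, subset);
    if (pick === null) continue; // abstentions are not counted as predictions
    predictions++;
    if (pick === fight.winner) correct++;
  }
  return {
    subset,
    predictions,
    accuracy: predictions ? correct / predictions : null,
  };
}
```

Counting abstentions separately matters here: a smaller subset makes fewer predictions, so accuracy and prediction count have to be read together.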

Calibration improved alongside accuracy. The 70–75% confidence band hit 82.9% accuracy before the change and 92.7% after. About 20% of fights now show "no edge" rather than a confident pick — those are genuine toss-ups where the prior model was gambling and losing.

Every pick committed before May 17 reflects the older ten-edge model. We don't retroactively re-grade — the track record is the track record. Cards from May 17 onward use the new three-edge model.

The unit

What an edge factor is

An edge is a single piece of statistical evidence that nudges the verdict toward one fighter or the other. Each factor outputs three things: who it favors, by how much (a percentage between 50% and 78%), and a one-line reason that you can read on the fight card.

Across all ten factors, most fights have 3–5 firing; fewer than 2 or more than 6 is rare. If no active factor fires, the verdict explicitly says "even matchup" rather than guessing: the model refuses to predict where it has no signal.
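
As a rough illustration of the three outputs and the no-signal fallback, here is a minimal sketch. The field names (`favors`, `pct`, `reason`) and both helper functions are assumptions made for this example, not the actual shapes used in edges.js.

```javascript
// Assumed shape of one factor's output (illustrative field names):
//   { favors: "A" or "B", pct: number from 0.50 to 0.78, reason: string }

// Run every factor and keep the ones that fire. Each factor function is
// assumed to return an edge object, or null when its trigger is not met.
function collectEdges(factors, fighterA, fighterB) {
  return factors
    .map((factor) => factor(fighterA, fighterB))
    .filter((edge) => edge !== null);
}

// With no firing factor, say so instead of guessing.
function describeVerdict(edges) {
  if (edges.length === 0) return "even matchup";
  return edges
    .map((e) => `${e.favors}: ${e.reason} (${Math.round(e.pct * 100)}%)`)
    .join("; ");
}
```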

The toolkit

The factors we measure

Every factor below is implemented in the public edges.js file. Active factors drive the current verdict. Retired factors are kept documented for transparency — they were dropped May 17, 2026 after the subset audit showed they were diluting the verdict more than informing it.

Factor	What it measures	Triggers when	Range	Status
Record	Pro W-L gap (Laplace-smoothed)	Adjusted win-rate gap of at least 8 percentage points between the two fighters	60–72%	Active
Cardio	Late-round output sustain	Cardio tier gap of 3+ (e.g. Tireless vs Fades). See cardio methodology.	54–58%	Active
TD defense	Takedown defense %	10+ percentage-point gap in takedown defense, when grappling is in play	52–56%	Active
Age	Years younger	1+ year age gap. Newcomer-discounted at gap < 9 when either fighter has <5 UFC fights.	52–70%	Retired May 17
Win streak	Current consecutive wins	One fighter on a 2+ win streak that's longer than the opponent's	53–63%	Retired May 17
Loss streak	Opponent on a skid	One fighter on a 2+ loss streak; fades the colder fighter	53–60%	Retired May 17
Post-loss	Layoff after a loss	Asymmetric post-loss state with 6+ months of rust	52–58%	Retired May 17
Stance + reach	Striking geometry	Mixed stances (southpaw vs orthodox) and/or 2"+ reach gap, when striking is in play	52–57%	Retired May 17
Strikes/min	SLpM differential	1.0+ strikes/min gap, when at least one fighter strikes meaningfully	52–55%	Retired May 17
TD accuracy	Takedown finish %	15+ percentage-point gap in takedown accuracy, when grappling is in play	52–56%	Retired May 17
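
To make the Record trigger concrete, here is a small sketch of a Laplace-smoothed win rate and the 8-point gap check. It assumes the common add-one form of the smoothing, (wins + 1) / (wins + losses + 2); the exact smoothing constants and the mapping from gap to the 60–72% range live in edges.js and are not reproduced here. Function and field names are hypothetical.

```javascript
// Add-one (Laplace) smoothing: one phantom win and one phantom loss, so a
// 3-0 newcomer is not treated as a 100% fighter. The add-one form is an
// assumption of this sketch.
function smoothedWinRate(wins, losses) {
  return (wins + 1) / (wins + losses + 2);
}

// Returns which side the Record trigger favors and the size of the gap, or
// null when the smoothed gap is under 8 percentage points.
function recordTrigger(a, b, minGap = 0.08) {
  const gap =
    smoothedWinRate(a.wins, a.losses) - smoothedWinRate(b.wins, b.losses);
  if (Math.abs(gap) < minGap) return null;
  return { favors: gap > 0 ? "A" : "B", gap: Math.abs(gap) };
}
```

Under these assumptions a 10-2 fighter against a 10-3 fighter does not fire (smoothed rates of about 78.6% and 73.3%), while 20-2 against 10-10 does.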

The math

How factors combine into a verdict

Each firing factor pushes the prior probability up or down using a Bayesian odds update. A single 60% edge raises the prior from 50% to 60%; two independent 60% edges raise it to roughly 69%, not 70%, because the update multiplies odds rather than adding percentage points: 1.5 × 1.5 = 2.25 to 1, or 2.25 ÷ 3.25 ≈ 69.2%.

Per-factor update

prior × factor_pct ÷ ( prior × factor_pct + (1 − prior) × (1 − factor_pct) )

Applied iteratively across every firing factor, starting from a 50/50 prior.
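
The per-factor update and its iteration can be sketched in a few lines. This is an illustration of the formula above, not a copy of edges.js. The function names are hypothetical, and the convention of passing an opposing edge as its complement (0.40 for a 60% edge against) is an assumption of this sketch.

```javascript
// One Bayesian odds update, matching the per-factor formula above.
// `prior` and `factorPct` are probabilities in (0, 1) for the same fighter.
function updatePrior(prior, factorPct) {
  const numerator = prior * factorPct;
  return numerator / (numerator + (1 - prior) * (1 - factorPct));
}

// Fold every firing factor into a 50/50 prior, one update at a time.
function combine(factorPcts) {
  return factorPcts.reduce(updatePrior, 0.5);
}

console.log(combine([0.6]).toFixed(3));      // one 60% edge
console.log(combine([0.6, 0.6]).toFixed(3)); // two 60% edges
console.log(combine([0.6, 0.4]).toFixed(3)); // one for, one against
```

Two 60% edges come out near 0.692, and a 60% edge for and against cancel back to 0.5.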

Then a hard cap is applied: the verdict can never exceed the strongest single factor by more than 6 percentage points, and it never exceeds 78%. This prevents runaway from stacked edges: ten weak edges shouldn't claim 95% confidence in a UFC fight where the floor of randomness is roughly 25%.
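
Read literally, the cap is the minimum of three numbers. The sketch below assumes every input is a probability for the favored fighter; the function name and parameter names are hypothetical, and how edges.js treats factors that favor the other fighter is not covered.

```javascript
// Cap as described above: at most 6 points over the strongest single
// factor, and never above 78%.
function capVerdict(combined, factorPcts, margin = 0.06, ceiling = 0.78) {
  if (factorPcts.length === 0) return 0.5; // nothing fired: stay at 50/50
  const strongest = Math.max(...factorPcts);
  return Math.min(combined, strongest + margin, ceiling);
}
```

Under this reading, two 60% edges that combine to about 69.2% would be trimmed to 66%, because the strongest single factor is 60%.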

Track record

What we measured against the full UFC history

The validation script in research/validate.js simulates every edges.js verdict against every decided UFC fight in the database. Headline numbers from the most recent run (after the May 17 rebuild):

Overall accuracy

68.4% correct on 6,655 historical predictions

The new model abstains on ~20% of fights (no Record, Cardio, or TD-defense edge fires) rather than guess. The remaining 6,655 are where the model has genuine signal, and on those it gains 2.6 percentage points over the old ten-edge version (65.8% → 68.4%).

Calibration check: in a well-calibrated model, predictions at confidence X% should win X% of the time. Here's what the most recent validation found:

Predicted confidence	Actual win rate	Fights in band
50–55%	55.9%	1,952
55–60%	64.7%	1,614
60–65%	72.7%	1,676
65–70%	81.6%	1,008
70–75%	92.7%	399
75%+	100.0%	6

Every confidence band lands inside or above its predicted range. High-confidence verdicts (70%+) are under-stated — when the model says it's confident, history says listen.
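
A calibration table like the one above can be produced by bucketing predictions into 5-point bands and counting wins per band. The following is a minimal sketch, not the code in research/validate.js. It assumes each prediction looks like `{ confidence: 0.63, correct: true }`, and the function name is hypothetical.

```javascript
// Group predictions into confidence bands and report each band's observed
// win rate next to its lower bound.
function calibrationTable(predictions, bandWidth = 0.05, floor = 0.5) {
  const bands = new Map();
  for (const { confidence, correct } of predictions) {
    // The small epsilon keeps boundary values such as 0.60 from slipping
    // into the band below through floating-point error.
    const index = Math.floor((confidence - floor) / bandWidth + 1e-9);
    const band = bands.get(index) ?? { fights: 0, wins: 0 };
    band.fights++;
    if (correct) band.wins++;
    bands.set(index, band);
  }
  return [...bands.entries()]
    .sort(([a], [b]) => a - b)
    .map(([index, { fights, wins }]) => ({
      from: floor + index * bandWidth,
      fights,
      winRate: wins / fights,
    }));
}
```

A band whose `winRate` sits above its upper bound is under-confident, which is the pattern the table shows at 70% and up.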

How we tune

A concrete example: the age recalibration

The age factor used to assign 55.9% to fighters with a 3–4 year age advantage. We extended the validation script to compute the actual historical win rate at every integer age gap among veteran fighters (5+ UFC fights on both sides), where the data is cleanest. The result:

Age gap → actual younger-fighter win rate (veterans only)

Gap 1–2 yrs: ~52% · model said 52% — on
Gap 3–4 yrs: ~59% · model said 55.9% — undersold
Gap 5–6 yrs: ~57% · model said 58.2% — slightly oversold
Gap 7–9 yrs: ~64% · model said 63.3% — on
Gap 10–11 yrs: ~62% · model said 65.2% — oversold
Gap 12+ yrs: ~70% · model said 65.2% — undersold

Source: research/validate.js. Veterans-only because newcomer fights are noisier (older newcomers often have more regional experience and outperform their age gap).
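
The measurement behind that table amounts to a filter plus a tally. The sketch below uses the tier boundaries listed above; the record shape (`ageGap`, `youngerWon`, `ufcFightsA`, `ufcFightsB`) and the function name are assumptions for illustration, not the actual research/validate.js code.

```javascript
// Tier boundaries taken from the list above; gaps are whole years.
const AGE_TIERS = [
  { label: "1-2", min: 1, max: 2 },
  { label: "3-4", min: 3, max: 4 },
  { label: "5-6", min: 5, max: 6 },
  { label: "7-9", min: 7, max: 9 },
  { label: "10-11", min: 10, max: 11 },
  { label: "12+", min: 12, max: Infinity },
];

// Each fight is assumed to look like
// { ageGap: 4, youngerWon: true, ufcFightsA: 8, ufcFightsB: 12 }.
function youngerWinRateByTier(fights, minUfcFights = 5) {
  const tally = new Map(
    AGE_TIERS.map((t) => [t.label, { fights: 0, youngerWins: 0 }])
  );
  for (const f of fights) {
    // Veterans only: both fighters need at least `minUfcFights` UFC bouts.
    if (f.ufcFightsA < minUfcFights || f.ufcFightsB < minUfcFights) continue;
    const tier = AGE_TIERS.find((t) => f.ageGap >= t.min && f.ageGap <= t.max);
    if (!tier) continue; // gaps under one year fall outside every tier
    const row = tally.get(tier.label);
    row.fights++;
    if (f.youngerWon) row.youngerWins++;
  }
  return [...tally].map(([label, { fights, youngerWins }]) => ({
    label,
    fights,
    winRate: fights ? youngerWins / fights : null,
  }));
}
```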

We shipped new tiers that match the measured rates. The newcomer discount used to apply uniformly; it now only fires at gap < 9 years because the data shows newcomer fights with 9+ year age gaps actually have higher win rates for the younger fighter, not lower. The age edge function is open in edges.js — every line of the recalibration is reviewable.

This is what "open methodology" means in practice. Factors aren't tuned to vibes; they're tuned to the database, the tuning process is documented in code, and the validation rerun confirms the change improved calibration.

Honest limitations

What this model can't do

Validation uses current career stats on historical fights

When we score the model against a 2018 fight, we use the fighters' current career averages, not their stats as they existed in 2018. That introduces a measurable hindsight bias. Year-over-year accuracy in the validation report drifts higher for recent years, confirming the effect. Real fair-replay accuracy is lower than the 68.4% headline. We flag this prominently in research/results.md.

Streak edges were silently broken for months

The win-streak / loss-streak / post-loss factors are listed in edges.js, but a data-shape mismatch meant they returned null on every fight in production until we found it. We fixed the bug and ran validation again. We mention this because "we found and fixed it" is the right framing for a working methodology, not "our model is flawless."

MMA has an irreducible floor of randomness

Roughly a quarter of UFC fights are won by the statistical underdog. No model — ours, the betting market's, or anyone else's — will ever predict more than ~75% of fights correctly. Anything claiming 90%+ accuracy on a meaningful sample is wrong or overfit.

Past performance is not a guarantee

Calibration on 8,000+ historical fights tells you the model's confidence numbers are honest. It does not tell you any individual upcoming fight will go the way the model predicts. Read the verdict alongside the matchup story. Stats for entertainment purposes only. 1-800-GAMBLER.
