How accurate is this WBGT — and how do we know?
Every live WBGT you see in an app is a calculation, not a thermometer reading. There's no black-globe sensor on your street corner — the number is modeled from weather data (temperature, humidity, wind, sunshine). So the fair question isn't "is it the real WBGT?" but "how close is the model?" Here's how we keep ours honest — in public.
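To make "a calculation, not a thermometer reading" concrete, here is a minimal sketch of the standard outdoor WBGT weighting: 70% natural wet-bulb temperature, 20% black-globe temperature, 10% air temperature. The function and parameter names are illustrative assumptions. The modeling work lies in estimating the wet-bulb and globe temperatures from ordinary weather data, and this page does not say which method is used for that step, so the sketch simply takes them as inputs.

```python
def outdoor_wbgt_c(natural_wet_bulb_c: float,
                   globe_temp_c: float,
                   air_temp_c: float) -> float:
    """Standard outdoor WBGT weighting (all temperatures in degrees C).

    The two hard-to-measure inputs (natural wet-bulb and black-globe
    temperature) are assumed to have been estimated already from
    temperature, humidity, wind, and sunshine.
    """
    return 0.7 * natural_wet_bulb_c + 0.2 * globe_temp_c + 0.1 * air_temp_c


# Illustrative inputs only, not measurements.
example_wbgt = outdoor_wbgt_c(natural_wet_bulb_c=25.0,
                              globe_temp_c=40.0,
                              air_temp_c=30.0)
print(f"WBGT is roughly {example_wbgt:.1f} C")
```

Because the globe term depends heavily on sunshine and wind, small differences in how those inputs are handled can shift the final number, which is why an independent comparison is useful.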
We grade ourselves against an independent answer key
The U.S. National Weather Service publishes its own WBGT forecast for locations across the country. That gives us an independent answer key to check ourselves against. We compare our number to theirs continuously, and we publish the raw comparison on our Live Accuracy page, updated through the day, instead of making a one-time claim.
(An honest caveat: the NWS figure is itself a model, so a gap measures method-and-input disagreement, not absolute truth. But it's a rigorous, independent yardstick — and the best public one there is.)
The accuracy flywheel
Every comparison is saved — not just the two numbers, but the full set of conditions behind them (temperature, humidity, wind, sunshine, time of day…). Over time that builds a growing, retained record of "here's what we predicted, here's the reference, and here were the exact conditions."
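As a rough illustration of what one saved comparison might look like, here is a minimal sketch of a record type. The class name, field names, and units are assumptions made for the example, not the actual schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ComparisonRecord:
    """One saved comparison: both numbers plus the conditions behind them."""
    city: str
    observed_at: datetime         # timezone-aware timestamp
    our_wbgt_c: float             # our modeled WBGT, degrees C
    reference_wbgt_c: float       # the reference WBGT, degrees C
    air_temp_c: float
    relative_humidity_pct: float
    wind_speed_ms: float
    solar_wm2: float              # incoming sunshine, W/m^2

    @property
    def error_c(self) -> float:
        """Signed error: positive means we ran warmer than the reference."""
        return self.our_wbgt_c - self.reference_wbgt_c


# Illustrative values only, not real data.
example = ComparisonRecord(
    city="Example City",
    observed_at=datetime(2024, 7, 1, 18, 0, tzinfo=timezone.utc),
    our_wbgt_c=29.0,
    reference_wbgt_c=28.0,
    air_temp_c=33.0,
    relative_humidity_pct=45.0,
    wind_speed_ms=1.2,
    solar_wm2=820.0,
)
print(example.error_c)
```

Keeping the conditions next to the two numbers is what makes it possible to ask later whether the error depends on sun, wind, or time of day.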
That dataset is what powers the next two steps — and it compounds: the longer it runs, the more it can teach us.
Our error wasn't random — it depended on conditions
When we looked across thousands of these comparisons, a clear pattern emerged. On average, our WBGT matched the reference almost perfectly. But that average was hiding something:
- In strong sun, we ran a touch warm (about +1.4 °C).
- At night, we ran a touch cool (about −1.4 °C).
- The warm bias was worst in calm air — up to a few degrees — which is exactly the situation that occasionally produced a scary "Extreme" reading on a day that felt fine.
Because the warm and cool errors cancel out on average, a single across-the-board nudge would do nothing. The fix had to depend on the conditions — chiefly how strong the sun is and how much wind there is.
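The cancellation argument can be checked with a few lines of arithmetic. The sketch below uses made-up error values chosen to mirror the roughly +1.4 °C and −1.4 °C biases described above; none of the numbers are real data.

```python
from statistics import mean

# Illustrative signed errors (ours minus reference, degrees C). Not real data.
sunny_errors = [1.4, 1.2, 1.6]
night_errors = [-1.4, -1.6, -1.2]
all_errors = sunny_errors + night_errors


def mean_abs(errors):
    """Typical size of the error, ignoring its sign."""
    return mean(abs(e) for e in errors)


# The overall average error is zero, so it looks as if nothing is wrong.
overall_bias = mean(all_errors)

# A single across-the-board nudge subtracts that zero bias: no change.
before = mean_abs(all_errors)
after_constant = mean_abs([e - overall_bias for e in all_errors])

# A condition-dependent nudge subtracts each group's own bias instead.
sunny_bias = mean(sunny_errors)
night_bias = mean(night_errors)
after_conditional = mean_abs(
    [e - sunny_bias for e in sunny_errors]
    + [e - night_bias for e in night_errors]
)

print(f"overall bias:          {overall_bias:+.2f}")
print(f"typical error before:  {before:.2f}")
print(f"after constant nudge:  {after_constant:.2f}")
print(f"after per-condition:   {after_conditional:.2f}")
```

In this toy example the constant nudge leaves the typical error unchanged, while the per-condition nudge shrinks it substantially.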
So we correct it — but only when it's proven to help
We learned a small correction from the data: how much to add or subtract for a given amount of sun and wind. In strong, calm sun it gently pulls the number down; on a still night it nudges it up.
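One simple way to picture such a correction is a lookup table: group past comparisons by sun and wind, record the average error in each group, and subtract it from new readings made under the same conditions. This is a sketch under stated assumptions, not the method used here. The bin edges (600 W/m² for "strong sun", 2 m/s for "calm") and all function names are hypothetical, and the training rows are invented.

```python
from collections import defaultdict
from statistics import mean


def condition_bin(solar_wm2: float, wind_ms: float) -> tuple:
    """Assign a reading to a (sun, wind) bin. Thresholds are hypothetical."""
    if solar_wm2 < 1:
        sun = "night"
    elif solar_wm2 >= 600:
        sun = "strong_sun"
    else:
        sun = "weak_sun"
    wind = "calm" if wind_ms < 2 else "breezy"
    return sun, wind


def learn_correction(records):
    """records: iterable of (solar_wm2, wind_ms, error_c) tuples.

    Returns the mean signed error for each bin that has data.
    """
    errors = defaultdict(list)
    for solar, wind, err in records:
        errors[condition_bin(solar, wind)].append(err)
    return {key: mean(values) for key, values in errors.items()}


def apply_correction(wbgt_c, solar_wm2, wind_ms, table):
    """Subtract the learned bias; leave the reading alone for unseen bins."""
    return wbgt_c - table.get(condition_bin(solar_wm2, wind_ms), 0.0)


# Invented training rows: (sunshine W/m^2, wind m/s, error in degrees C).
training = [(800, 0.5, 2.0), (820, 1.0, 2.4), (0, 0.5, -1.4), (0, 1.0, -1.6)]
table = learn_correction(training)

sunny_calm = apply_correction(30.0, 850, 0.8, table)    # pulled down
still_night = apply_correction(20.0, 0, 0.3, table)     # nudged up
unseen = apply_correction(25.0, 300, 5.0, table)        # no data, unchanged
print(sunny_calm, still_night, unseen)
```

A table like this is easy to inspect, which suits a correction that is meant to be shown publicly. A smooth function of sun and wind would be another reasonable choice.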
The careful part is how it earns trust. Before we apply the correction, we hold out entire cities it has never seen and check whether it improves the reading there. On our current data, the correction cuts the typical error by about 31% on those held-out cities. In other words, it's capturing a real effect of sun and wind that carries over to new places, not memorizing the handful of places we can check. And if the correction ever stops passing that test, it switches itself off.
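The hold-out check can be sketched as leave-one-city-out validation: fit the correction on every city but one, score it on the city left out, repeat for each city, and enable the correction only if the typical error on unseen cities goes down. Everything here is an assumption for illustration: the function names, the deliberately simple day/night binning, the invented rows, and the rule that any improvement above zero is enough.

```python
from collections import defaultdict
from statistics import mean


def sun_bin(solar_wm2: float) -> str:
    """Deliberately simple binning for the sketch: night versus day."""
    return "night" if solar_wm2 < 1 else "day"


def fit(rows):
    """rows: (city, solar_wm2, error_c). Returns mean signed error per bin."""
    by_bin = defaultdict(list)
    for _city, solar, err in rows:
        by_bin[sun_bin(solar)].append(err)
    return {b: mean(errs) for b, errs in by_bin.items()}


def held_out_improvement(rows) -> float:
    """Fractional drop in mean absolute error on cities unseen in fitting."""
    cities = sorted({city for city, _solar, _err in rows})
    raw, corrected = [], []
    for held_out in cities:
        train = [r for r in rows if r[0] != held_out]
        test = [r for r in rows if r[0] == held_out]
        table = fit(train)  # the held-out city contributes nothing here
        for _city, solar, err in test:
            raw.append(abs(err))
            corrected.append(abs(err - table.get(sun_bin(solar), 0.0)))
    return 1.0 - mean(corrected) / mean(raw)


def correction_enabled(rows, min_improvement: float = 0.0) -> bool:
    """Switch the correction off unless it helps on held-out cities."""
    return held_out_improvement(rows) > min_improvement


# Invented rows where the bias is consistent across cities.
consistent = [("A", 800, 1.5), ("A", 0, -1.3),
              ("B", 700, 1.3), ("B", 0, -1.5),
              ("C", 900, 1.4), ("C", 0, -1.4)]

# Invented rows where each city's bias points the opposite way.
inconsistent = [("A", 800, 1.0), ("A", 0, -1.0),
                ("B", 800, -1.0), ("B", 0, 1.0)]

print(correction_enabled(consistent))    # pattern transfers to unseen cities
print(correction_enabled(inconsistent))  # pattern does not transfer
```

Splitting by whole city, rather than by random rows, is what makes the test meaningful: a correction that only memorized local quirks would look fine on a random split but fail on a city it has never seen.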
You can see the exact correction — and that cross-check — on the Live Accuracy page.
Why a U.S. answer key helps the whole world
Here's the part we're genuinely proud of: the correction is keyed on sun and wind, and those exist everywhere. A lesson learned at U.S. reference cities ("in this much sun, with this little wind, the model tends to run a bit warm") should apply just as well in Lagos or Lahore. So an answer key that only exists for the United States quietly makes the reading better worldwide, including the many places where no one publishes a WBGT to check against.
That's the whole idea: treat accuracy as a living system, not a fixed claim. Measure it in public, learn where it's off, correct it carefully, and prove the fix travels.