OpenAI botches the charts in GPT-5 introduction

OpenAI introduced GPT-5 in a livestream and used a set of seemingly straightforward bar charts to present benchmark results. The point was to show the improved performance of GPT-5 over previous models. However, the labels do not remotely match the bar heights.

The bar for 69.1% is the same height as the one for 30.8% when the former should be more than twice the height of the latter. The bar for 52.8% is taller than the one for 69.1%. It’s off.
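For a sense of what the bars should have looked like, here is a minimal Python sketch, assuming a zero-based axis where bar height is proportional to the labeled value. The three percentages come from the chart; the function names and the drawn heights in the last check are made up for illustration and are not measurements from the livestream.

```python
def bar_heights(values, max_height=100.0):
    """Scale values to bar heights on a zero-based axis; the tallest bar gets max_height."""
    top = max(values)
    return [max_height * v / top for v in values]


def order_is_consistent(values, heights):
    """Return True if every pair of bars is ordered the same way as its labels."""
    pairs = list(zip(values, heights))
    return all(
        (v1 - v2) * (h1 - h2) > 0 or (v1 == v2 and h1 == h2)
        for i, (v1, h1) in enumerate(pairs)
        for v2, h2 in pairs[i + 1:]
    )


# Labels shown on the chart, in percent.
labels = [69.1, 30.8, 52.8]

# Heights the bars should have if drawn proportionally.
expected = bar_heights(labels)

# The 69.1 bar should be more than twice as tall as the 30.8 bar.
ratio = labels[0] / labels[1]

# Hypothetical drawn heights that mimic the problem described above:
# the first two bars are equal and the third is the tallest.
drawn = [50.0, 50.0, 80.0]

print(expected)
print(round(ratio, 2))
print(order_is_consistent(labels, expected))
print(order_is_consistent(labels, drawn))
```

The ratio comes out to roughly 2.24, and the ordering check passes for the proportional heights but fails for the drawn ones. A check like this only catches bars that are out of order or tied, so it is a sanity test, not a full validation of the chart.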

Catch the chart at 5:00 in the presentation below:

The chart at 30:30 (on deception of all things) is also not great, as the bar for 50.0 is shorter than the one for 47.4:

They don’t say if the charts were made with ChatGPT, but of course it’s not the best look to use a botched chart when talking about the accuracy of a tool that requires high user trust.
