Leaked xAI Grok 4 benchmarks suggest it is the leading model. Its Humanity's Last Exam (HLE) scores of 35%, and 45% with reasoning, would be a big improvement over about 21% for other top models.
If these leaked Grok 4 benchmarks are correct (95 on AIME, 88 on GPQA, 75 on SWE-bench), then xAI has the most powerful model on the market.
The GPQA score for Grok 4 and the SWE-bench score for Grok 4 Code would also top their respective rankings.
BREAKING 🚨 Grok 4 benchmarks now out and release is imminent anytime now. Are you ready?
Elon is going to celebrate America’s birthday with the most advanced AI ever from xAI team. The benchmarks are off the roof, a truly SOTA model
Grok-4 and Grok-4 Code
– 35% on HLE, 45%… https://t.co/6m5IbcSM6b pic.twitter.com/mcwhUjTY6y
— Prashant (@Prashant_1722) July 4, 2025
Brian Wang is a Futurist Thought Leader and a popular Science blogger with 1 million readers per month. His blog Nextbigfuture.com is ranked #1 Science News Blog. It covers many disruptive technologies and trends including Space, Robotics, Artificial Intelligence, Medicine, Anti-aging Biotechnology, and Nanotechnology.
Known for identifying cutting-edge technologies, he is currently a Co-Founder of a startup and fundraiser for high-potential early-stage companies. He is the Head of Research for Allocations for deep technology investments and an Angel Investor at Space Angels.
A frequent speaker at corporations, he has been a TEDx speaker, a Singularity University speaker, and a guest in numerous radio and podcast interviews. He is open to public speaking and advising engagements.