
The way we measure progress in AI is terrible

November 26, 2024


Every time a new AI model is released, it is typically touted as acing a series of benchmarks. OpenAI’s GPT-4o, for example, was launched in May with a compilation of results showing it topping every other AI company’s latest model in several tests. The problem, according to new research, is that these benchmarks are poorly designed, their results are hard to replicate, and the metrics they use are frequently arbitrary. That matters because AI models’ scores against these benchmarks will determine the level of scrutiny and regulation…

