An organization that develops math benchmarks for AI recently disclosed that it received funding from OpenAI, prompting accusations of impropriety from some in the AI community.
Epoch AI, a nonprofit primarily funded by research and grant-making foundations, disclosed on Dec. 20 that OpenAI supported the creation of FrontierMath. FrontierMath, a set of expert-level problems designed to measure AI's mathematical abilities, was one of the benchmarks OpenAI used to demonstrate its upcoming flagship AI, o3.
In a post on the forum LessWrong, a contractor for Epoch AI who goes by the username "Meemi" said several contributors to the FrontierMath benchmark were not informed of OpenAI's involvement until it was made public.
"The communication about this has not been transparent," Meemi said. "In my view, Epoch AI should have disclosed the OpenAI funding, and contractors should have transparent information about how their work might be used when choosing whether to work on a benchmark."
Some users on social media raised concerns that the secrecy could undermine FrontierMath's reputation as an objective benchmark. In addition to funding FrontierMath, OpenAI had access to many of the problems and solutions in the benchmark, a fact Epoch AI didn't reveal until December 20, when o3 was announced.
In response to Meemi's post, Epoch AI co-director Tamay Besiroglu insisted that FrontierMath's integrity had not been compromised, but admitted that Epoch AI "made a mistake" in not being more transparent.
"We were restricted from disclosing the partnership until o3 launched, and in hindsight we should have pushed to be transparent with the benchmark's contributors as soon as possible," Besiroglu wrote. "Our mathematicians deserved to know who might have access to their work. Even though we were contractually limited in what we could say, we should have made transparency with our contributors a non-negotiable part of our agreement with OpenAI."
Although OpenAI has access to FrontierMath, Besiroglu said, it has a "verbal agreement" with Epoch AI not to use FrontierMath's problem set to train its AI. (Training an AI on FrontierMath would be akin to teaching to the test.) Epoch AI also maintains a "separate holdout set" that serves as an additional safeguard for independently verifying FrontierMath benchmark results, Besiroglu said.
"OpenAI has fully supported our decision to maintain a separate, unseen holdout set," Besiroglu wrote.
Muddying the waters, however, Epoch AI lead mathematician Elliot Glazer noted in a post on Reddit that Epoch AI has not independently verified OpenAI's FrontierMath results for o3.
"My personal opinion is that [OpenAI's] scores are valid (meaning they weren't trained on the dataset), and that there's no incentive to falsify internal benchmark performance," Glazer said. "However, we can't vouch for the results until our independent evaluation is complete."
The saga is yet another example of the challenge of developing empirical benchmarks to evaluate AI, and of securing the resources needed for benchmark development without creating conflicts of interest.