A new AI framework automatically optimizes training data, model architectures, and learning algorithms – outperforming human-designed baselines

AI R&D works through a cycle of hypothesis, testing, and analysis – each step demanding significant manual engineering effort. A new framework from researchers at SII-GAIR aims to overcome that bottleneck by automating the full development loop across training data, model architecture, and learning algorithms.
The framework, called ASI-EVOLVE and developed by researchers at the Generative Artificial Intelligence Research Lab (SII-GAIR), is designed as an AI-for-AI research agent system. It uses a continuous "read-design-experiment-analyze" cycle to optimize the foundational AI stack.
In experiments, this self-improvement loop automatically discovered novel designs that outperformed human-designed baselines. The system generated a novel language model architecture, improved pre-training data pipelines to raise benchmark scores by more than 18 points, and designed highly efficient reinforcement learning algorithms.
For business teams that rely on iterative development cycles for their AI systems, the framework offers a way to reduce engineering overhead while matching or exceeding the performance of human-designed pipelines.
Data and design dilemma
Engineering teams can explore only a small portion of the potential design space for AI models at any given time. Implementing an experimental workflow demands expensive manual effort and frequent human intervention. And the insight gained from these costly cycles is often captured only as individual experience, making it difficult to systematically preserve and transfer to future projects or other teams. These constraints limit the pace and scale of AI innovation.
AI has made incredible strides in scientific discovery, from specialized tools like AlphaFold, which solves specific biological problems, to agent systems that tackle open scientific questions. However, current frameworks still struggle with open-ended AI innovation and are largely limited to incremental improvements within narrow constraints.
Developing foundational AI capabilities is deeply complex. It requires managing large, interdependent codebases, running heavy experiments that consume tens to hundreds of GPU hours, and analyzing multidimensional feedback from training dynamics.
“Existing frameworks have yet to demonstrate that AI can work effectively in this realm in a unified way, or that it can generate meaningful progress across the three fundamental pillars of AI development rather than within one limited context,” the researchers wrote.
How ASI-EVOLVE learns to research
To overcome the limitations of manual R&D, ASI-EVOLVE cycles continuously through prior knowledge, hypothesis generation, testing, and refinement. The system reads relevant prior knowledge and historical results from existing repositories, designs a candidate program that represents its next hypothesis, runs experiments to collect empirical signals, and distills the results into reusable, human-readable lessons that it feeds back into its knowledge base.
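The cycle described above can be pictured as a simple loop. The sketch below is a minimal, hypothetical illustration of that read-design-experiment-analyze pattern; none of the class or method names come from the actual ASI-EVOLVE codebase.

```python
# Minimal sketch of a read-design-experiment-analyze loop.
# All names and the toy scoring logic are hypothetical illustrations,
# not the ASI-EVOLVE implementation.

class ResearchLoop:
    def __init__(self, knowledge_base):
        self.kb = knowledge_base  # prior knowledge plus lessons from past runs

    def iterate(self):
        context = self.read()                 # retrieve relevant prior knowledge
        candidate = self.design(context)      # propose the next hypothesis as a program
        results = self.experiment(candidate)  # run the experiment
        lesson = self.analyze(results)        # distill results into a reusable lesson
        self.kb.append(lesson)                # feed the lesson back into memory
        return lesson

    def read(self):
        return list(self.kb)

    def design(self, context):
        # toy stand-in: a "candidate" is just a numbered hypothesis
        return {"id": len(self.kb), "hypothesis": f"variant-{len(self.kb)}"}

    def experiment(self, candidate):
        # toy stand-in for an expensive training run
        return {"candidate": candidate, "score": 0.5 + 0.01 * candidate["id"]}

    def analyze(self, results):
        return f"{results['candidate']['hypothesis']} scored {results['score']:.2f}"


loop = ResearchLoop(knowledge_base=[])
for _ in range(3):
    loop.iterate()
print(loop.kb)  # three accumulated, human-readable lessons
```

The key design point the paper emphasizes is the last step: lessons are appended back into the same store the next iteration reads from, so each cycle starts better informed than the previous one.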
Two key components drive ASI-EVOLVE. The "Cognition Base" acts as the system's domain expert. To accelerate the search, it is preloaded with human knowledge, task-specific heuristics, and known pitfalls extracted from the existing literature. This steers experiments in promising directions from the first iteration.
The second is the "Analyzer," which handles the complex, multifaceted feedback from each experiment. It processes raw training logs, benchmark results, and other metrics, distilling them into compact, actionable findings and causal analyses.
Several complementary modules round out the framework. The "Researcher" agent retrieves prior knowledge from the knowledge base and previous experimental results to generate new ideas, either modifying existing code or writing new programs.
The "Developer" module runs the actual experiments. Because AI training runs are incredibly expensive, the Developer is equipped with efficiency safeguards such as wall-clock limits and fast-fail rejection tests that filter out faulty candidate programs before they consume too many GPU hours.
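A fast-fail guard of the kind described can be sketched in a few lines: run a tiny, time-boxed trial before committing to a full training run. The function name and thresholds below are illustrative assumptions, not the framework's actual API.

```python
# Sketch of a fast-fail guard: reject a candidate that errors out or blows
# a wall-clock budget on a tiny trial, before it consumes real GPU hours.
# Names and thresholds are hypothetical, not from the ASI-EVOLVE codebase.
import time

def smoke_test(candidate_fn, time_budget_s=1.0):
    """Run a tiny trial; reject on error, timeout, or empty result."""
    start = time.monotonic()
    try:
        result = candidate_fn()
    except Exception:
        return False  # faulty candidate: reject immediately
    elapsed = time.monotonic() - start
    return elapsed <= time_budget_s and result is not None

def good_candidate():
    return {"loss": 2.3}  # stands in for a quick mini-batch run

def broken_candidate():
    raise ValueError("shape mismatch")  # stands in for a buggy program

print(smoke_test(good_candidate))    # True
print(smoke_test(broken_candidate))  # False
```

In a real system the trial would be a short training run on a small data shard, but the gating logic is the same: cheap rejection first, expensive evaluation only for survivors.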
Finally, the "Database" serves as the system's persistent memory, storing code, research artifacts, raw results, and final Analyzer reports across iterations, ensuring that knowledge accumulates systematically over time.
By combining these components, ASI-EVOLVE ensures that an AI agent systematically learns from complex, real-world test feedback without requiring constant human intervention.
While previous frameworks were designed to evolve candidate solutions, "ASI-EVOLVE evolves the understanding itself," the researchers wrote. "The experience and insights gathered are continuously stored and retrieved to inform future experiments, ensuring that the system grows not only in the quality of its solutions but in its ability to imagine where to search next."
ASI-EVOLVE in action
In their tests, the researchers showed that ASI-EVOLVE can effectively improve data processing, model architectures, and learning algorithms to create better AI systems.
In real-world business applications, high-quality data is a constant bottleneck. When tasked with designing cleaning strategies for a particular phase of a large pre-training corpus, ASI-EVOLVE examined data samples and identified quality issues such as HTML artifacts and formatting inconsistencies. The system independently generated custom cleaning rules, finding that systematic cleaning combined with domain-aware retention rules is more effective than aggressive filtering.
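Cleaning rules of the kind described, ones that normalize documents rather than discard them, might look like the following sketch. These specific regexes and entity replacements are illustrative assumptions; the article does not publish the system's actual rules.

```python
# Illustrative cleaning rules in the spirit described: strip HTML artifacts
# and normalize formatting while retaining the document, instead of
# aggressively filtering it out. The rules themselves are hypothetical.
import re

def clean_document(text):
    text = re.sub(r"<[^>]+>", " ", text)                      # drop HTML tags
    text = text.replace("&nbsp;", " ").replace("&amp;", "&")  # decode common entities
    text = re.sub(r"[ \t]+", " ", text)                       # collapse runs of spaces
    return text.strip()

raw = "<p>Revenue&nbsp;grew   5%</p>"
print(clean_document(raw))  # Revenue grew 5%
```

The contrast with aggressive filtering is that the document survives with its content intact; a filter-first pipeline would simply have thrown the HTML-laden sample away.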
In benchmark tests, 3B-parameter models trained on the AI-curated data saw an average score increase of about 4 points over models trained on raw data. The gains were greatest on knowledge-intensive tasks, with performance rising by more than 18 points on Massive Multitask Language Understanding (MMLU), an LLM benchmark spanning tasks across STEM, the humanities, and the social sciences.
Beyond data, the system proved highly effective at neural architecture design. Across 1,773 autonomous experiments, it produced 105 novel linear attention architectures that surpassed DeltaNet, a strong human-designed baseline. To achieve these results, ASI-EVOLVE developed multi-scale mechanisms that dynamically adjust the model's computational budget based on the specific input content.
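The idea of adjusting computational budget to the input can be illustrated with a toy allocator: spend more compute (here, layers) on longer inputs, up to a cap. This is a deliberately simplified analogy, not the discovered architectures' actual mechanism.

```python
# Toy sketch of input-dependent compute allocation, echoing the idea of
# dynamically adjusting a model's computational budget per input.
# The heuristic and all numbers are hypothetical.
def compute_budget(token_count, base_layers=4, max_layers=12):
    """Allocate one extra layer per 128 tokens, capped at max_layers."""
    extra = token_count // 128
    return min(base_layers + extra, max_layers)

print(compute_budget(64))    # 4  -- short input, base budget
print(compute_budget(1024))  # 12 -- long input, capped budget
```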
Finally, in reinforcement learning algorithm design, ASI-EVOLVE discovered novel methods. It designed algorithms that outperform the widely used GRPO algorithm on mathematical reasoning benchmarks such as AMC23 and AIME24. One successful variant introduced a "Budget Compressed Dynamic Radius" mechanism that keeps model updates within a defined budget, effectively stabilizing training on noisy data.
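The core intuition of keeping updates within a budget can be sketched as norm-constrained update scaling: if a proposed parameter update is too large, shrink it to fit the budget. This is an illustrative reconstruction of the general idea, not the paper's exact "Budget Compressed Dynamic Radius" algorithm.

```python
# Sketch of a budget-constrained update: scale a parameter update so its
# L2 norm never exceeds a fixed budget, damping outlier updates from noisy
# data. An illustrative reconstruction, not the published algorithm.
import math

def constrain_update(update, budget):
    """Scale `update` (a list of floats) so its L2 norm is at most `budget`."""
    norm = math.sqrt(sum(u * u for u in update))
    if norm <= budget:
        return update          # within budget: apply as-is
    scale = budget / norm
    return [u * scale for u in update]  # shrink onto the budget radius

print(constrain_update([3.0, 4.0], budget=1.0))  # norm 5.0 clipped to 1.0
print(constrain_update([0.1, 0.2], budget=1.0))  # already within budget, unchanged
```

The stabilizing effect comes from the asymmetry: well-behaved updates pass through untouched, while the occasional huge update from a noisy batch is bounded rather than allowed to derail training.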
What this means for enterprise AI
Enterprise AI workflows constantly require improvements to existing systems, from fine-tuning open-source models on proprietary data to making small changes to architectures and algorithms. The computational resources and engineering hours these efforts demand are often beyond the reach of most organizations. As a result, many are left running unoptimized versions of standard AI models.
The research team says the framework is designed to let businesses integrate proprietary domain knowledge into the knowledge repository and allow the automated loop to iterate on internal AI systems.
The research team has open-sourced the ASI-EVOLVE code, making the core framework available to developers and builders.



