Making AI-generated code more accurate in any language



Programmers can now use large language models (LLMs) to generate computer code more quickly. But this only makes a programmer’s life easier if that code follows the rules of the programming language and doesn’t cause the computer to crash.

Some methods exist for ensuring LLMs conform to the rules of whatever language they are generating text in, but many of these methods either distort the model’s intended meaning or are too time-consuming to be feasible for complex tasks.

A new approach developed by researchers at MIT and elsewhere automatically guides an LLM to generate text that adheres to the rules of the relevant language, such as a particular programming language, and is also error-free. The method lets an LLM allocate effort toward outputs that are most likely to be valid and accurate, while discarding unpromising outputs early in the process. This probabilistic approach boosts computational efficiency.

Thanks to these efficiency gains, the researchers’ architecture enabled small LLMs to outperform much larger models at producing accurate, properly structured outputs for several real-world use cases, including molecular biology and robotics.

In the long run, this new architecture could help nonexperts control AI-generated content. For instance, it could allow businesspeople to write complex queries in SQL, a language for database manipulation, using only natural language prompts.

“This work has implications beyond research. It could improve programming assistants, AI-powered data analysis, and scientific discovery tools by ensuring that AI-generated outputs remain both useful and correct,” says João Loula, an MIT graduate student and co-lead author of a paper on the framework.

Loula is joined on the paper by co-lead authors Benjamin LeBrun, a research assistant at the Mila-Quebec Artificial Intelligence Institute, and Li Du, a graduate student at Johns Hopkins University; co-senior authors Vikash Mansinghka ’05, MEng ’09, PhD ’09, a principal research scientist and leader of the Probabilistic Computing Project in the MIT Department of Brain and Cognitive Sciences, Alexander K. Lew SM ’20, an assistant professor at Yale University, Tim Vieira, a postdoc at ETH Zurich, and Timothy J. O’Donnell, an associate professor at McGill University and a Canada CIFAR AI Chair at Mila, who led the international team; as well as several others. The research will be presented at the International Conference on Learning Representations.

Enforcing structure and meaning

One common approach for controlling the structured text generated by LLMs involves checking an entire output, such as a block of computer code, to make sure it is valid and will run error-free. If not, the user must start again, racking up computational resources.

On the other hand, a programmer could stop partway through to check the output. While this ensures the code adheres to the programming language and is structurally valid, incrementally correcting the code may cause it to drift from the meaning the user intended, hurting its accuracy in the long run.
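To make the first, whole-output strategy concrete, here is a minimal sketch in Python. It is not the researchers’ implementation: the generate callable is a hypothetical stand-in for any LLM call that returns a string, and the structural check uses Python’s standard ast module, which can verify syntax but says nothing about meaning.

import ast

def is_valid_python(code: str) -> bool:
    """Structural check only: does the text parse as Python?
    Cheap to run, but it cannot tell whether the code means
    what the user intended."""
    try:
        ast.parse(code)
        return True
    except SyntaxError:
        return False

def generate_then_check(generate, prompt: str, max_tries: int = 10):
    """Whole-output baseline: produce a complete program, validate it,
    and start over on failure. Every rejected attempt discards all the
    computation spent producing it."""
    for _ in range(max_tries):
        candidate = generate(prompt)  # any LLM call returning a string
        if is_valid_python(candidate):
            return candidate
    return None  # budget exhausted without a valid program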

“It is much easier to enforce structure than meaning. We can quickly check whether something is in the right programming language, but to check its meaning you have to execute the code. Our work is also about dealing with these different types of information,” says Loula.

The researchers’ approach involves engineering knowledge into the LLM to steer it toward the most promising outputs. These outputs are more likely to follow the structural constraints defined by a user, and to have the meaning the user intends.

“We are not trying to train an LLM to do this. Instead, we are engineering some knowledge that an expert would have and combining it with the LLM’s knowledge,” adds Mansinghka.

They accomplish this using a technique called sequential Monte Carlo, which lets multiple parallel generations from an LLM compete with each other. The model dynamically allocates resources to different threads of parallel computation, based on how promising their outputs appear.

Each output is given a weight that represents how likely it is to be structurally valid and semantically accurate. At each step in the computation, the model focuses on those with higher weights and throws out the rest.
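The toy loop below illustrates that sequential Monte Carlo scheme; it is a sketch under simplifying assumptions, not the researchers’ implementation. The extend and weight callables are hypothetical stand-ins: one grows a partial output by one token, the other scores how likely the partial output is to lead to a valid, accurate result.

import random

def smc_generate(extend, weight, n_particles: int = 8, n_steps: int = 64):
    """Toy sequential Monte Carlo over partial text outputs.

    extend(prefix): returns the prefix grown by one token (one of the
        parallel generations).
    weight(prefix): returns a nonnegative score for how plausible it is
        that this prefix leads to a structurally valid, semantically
        accurate output; invalid prefixes should score near zero.
    """
    particles = [""] * n_particles
    for _ in range(n_steps):
        # Grow every partial output in parallel.
        particles = [extend(p) for p in particles]
        scores = [weight(p) for p in particles]
        if sum(scores) == 0:
            raise RuntimeError("every candidate became invalid")
        # Resample: high-weight outputs get duplicated and low-weight
        # ones are dropped, so computation flows to promising threads.
        particles = random.choices(particles, weights=scores, k=n_particles)
    return max(particles, key=weight)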

In a sense, it is as if the LLM has an expert looking over its shoulder to make sure it makes the right choices at each step, while keeping it focused on the overall goal. The user specifies the desired structure and meaning, as well as how to check the output, and the researchers’ architecture guides the LLM to do the rest.

“We’ve worked out the hard math so that, for any kinds of constraints you’d like to incorporate, you are going to get the proper weights. In the end, you get the right answer,” says Loula.

Boosting small models

To test their approach, they applied the framework to LLMs tasked with generating four types of outputs: Python code, SQL database queries, molecular structures, and plans for a robot to follow.

Compared to existing approaches, the researchers’ method performed more accurately while requiring less computation.

In Python code generation, for instance, the researchers’ architecture enabled a small, open-source model to outperform a specialized, commercial closed-source model that is more than double its size.

“We are very excited that we can allow these small models to punch way above their weight,” says Loula.

In the future, the researchers want to use their technique to control larger chunks of generated text, rather than working one small piece at a time. They also want to combine their method with learning, so that as a model’s outputs are controlled, it learns to be more accurate.

In the long run, this project could have broader applications for non-technical users. For instance, it could be combined with systems for automated data modeling and for querying generative models of databases.

The approach could also enable machine-assisted data analysis systems, where the user converses with software that accurately models both the meaning of the data and the questions the user asks.

“One of the fundamental questions of linguistics is how the meaning of words, phrases, and sentences can be grounded in models of the world, accounting for uncertainty and vagueness in meaning and reference. LLMs, which predict likely token sequences, do not address this problem. Work like ours is a step toward the deeper questions in linguistics and artificial intelligence needed to understand how machines can communicate about the world like we do,” says O’Donnell.

This research is funded, in part, by the Canada CIFAR AI Chairs Program, and by the Siegel Family Foundation via a gift to the MIT Siegel Family Quest for Intelligence.


