Researchers teach LLM to solve complex planning challenges | MIT News

Imagine a coffee company trying to optimize its supply chain. The company sources beans from three suppliers, roasts them at two facilities into either dark or light roast, and then ships the roasted coffee to three retail locations. The suppliers have different fixed capacities, and roasting costs and shipping costs vary from place to place.

The company seeks to minimize costs while meeting a 23 percent increase in demand.
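A problem like this can be written down as a small constrained optimization: choose how many units to buy from each supplier so that total cost is minimized, capacities are respected, and the higher demand is met. The numbers below are invented for illustration (the article gives none), and exhaustive search stands in for a real optimization solver:

```python
from itertools import product

# Hypothetical figures: three suppliers with fixed capacities and
# per-unit costs, and demand raised 23 percent above a base of 100.
capacity = [60, 50, 40]
cost = [2.0, 2.5, 1.8]
demand = round(100 * 1.23)  # 123 units

best = None
for x1, x2 in product(range(capacity[0] + 1), range(capacity[1] + 1)):
    x3 = demand - x1 - x2          # third supplier covers the rest
    if 0 <= x3 <= capacity[2]:     # feasible only within its capacity
        total = x1 * cost[0] + x2 * cost[1] + x3 * cost[2]
        if best is None or total < best[0]:
            best = (total, (x1, x2, x3))

print(best)  # cheapest feasible sourcing plan
```

With these made-up numbers the cheapest plan maxes out the two cheapest suppliers and takes the remainder from the most expensive one; a real solver reaches the same answer without enumerating every option.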

Wouldn’t it be easier for the company to just ask ChatGPT to come up with an optimal plan? In fact, for all their incredible capabilities, large language models (LLMs) often perform poorly when tasked with directly solving such complicated planning problems on their own.

Rather than trying to change the model to make an LLM a better planner, the MIT researchers took a different approach. They introduced a framework that guides an LLM to break down the problem the way a human would, and then automatically solve it using a powerful software tool.

A user only needs to describe the problem in natural language; no task-specific examples are required to train or prompt the LLM. The model encodes the user’s text prompt into a format that can be unraveled by an optimization solver designed to efficiently crack extremely tough planning challenges.

During the formulation process, the LLM checks its work at multiple intermediate steps to make sure the plan is described correctly to the solver. If it spots an error, rather than giving up, the LLM tries to fix the broken part of the formulation.

When the researchers tested their framework on nine complex challenges, such as minimizing the distance warehouse robots must travel to complete tasks, it achieved an 85 percent success rate, whereas the best baseline only achieved a 39 percent success rate.

The versatile framework can be applied to a variety of multi-stage planning tasks, such as scheduling airline crews and managing machine time in factories.

“In our research, we introduce a framework that essentially acts as a smart assistant for planning problems. It can figure out the best plan that meets all of your needs, even if the rules are complicated or unusual,” says Hao.

She is joined on the paper by Yang Zhang, a research scientist at the MIT-IBM Watson AI Lab; and senior author Chuchu Fan, an associate professor of aeronautics and astronautics and a principal investigator in the Laboratory for Information and Decision Systems (LIDS). The research will be presented at the International Conference on Learning Representations.

Optimization 101

Fan’s group develops algorithms that automatically solve what are known as combinatorial optimization problems. These vast problems have many interrelated decision variables, each with multiple options, which quickly add up to billions of potential choices.

Humans solve such problems by narrowing them down to a few options and then determining which one leads to the best overall plan. The researchers’ algorithmic solvers apply the same principles to optimization problems that are far too complex for a human to crack.

However, the solvers they develop tend to have a steep learning curve and are usually used only by experts.

“We wondered whether we could let nonexperts use these solving algorithms. In our lab, we take a domain expert’s problem and formalize it into a problem our solver can solve. Could we teach an LLM to do the same thing?” Fan says.

With the framework the researchers developed, called LLM-based formalized programming (LLMFP), a person provides a natural language description of the problem, background information about the task, and a query that describes their goal.

LLMFP then prompts the LLM to reason about the problem and determine the decision variables and key constraints that will shape the optimal solution.

LLMFP asks the LLM to detail the requirements of each variable before encoding the information into a mathematical formulation of an optimization problem. It writes code that encodes the problem and calls the attached optimization solver, which arrives at an ideal solution.
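The encode-and-solve step can be sketched in miniature. Here a formulation is just a structure holding decision variables with their domains, constraints, and an objective, and a generic brute-force routine plays the role of the optimization solver; the schema and names are illustrative, not LLMFP’s actual format:

```python
from itertools import product

# Illustrative formulation: minimize 2x + 3y subject to
# x + y >= 4 and x <= 3, with x, y drawn from small integer domains.
formulation = {
    "variables": {"x": range(0, 5), "y": range(0, 5)},
    "constraints": [lambda v: v["x"] + v["y"] >= 4,
                    lambda v: v["x"] <= 3],
    "objective": lambda v: 2 * v["x"] + 3 * v["y"],  # to minimize
}

def solve(f):
    """Enumerate every assignment, keep the cheapest feasible one."""
    names = list(f["variables"])
    best = None
    for values in product(*(f["variables"][n] for n in names)):
        v = dict(zip(names, values))
        if all(check(v) for check in f["constraints"]):
            score = f["objective"](v)
            if best is None or score < best[0]:
                best = (score, v)
    return best

print(solve(formulation))  # → (9, {'x': 3, 'y': 1})
```

The value of the real framework is that the LLM produces this formulation from plain text, while an industrial solver replaces the enumeration with far smarter search.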

“This is similar to how we teach undergraduate students about optimization problems at MIT. We don’t teach them just one domain. We teach them the methodology,” Fan adds.

As long as the inputs to the solver are correct, it will give the right answer; any mistakes in the solution come from errors in the formulation process.

To verify that the plan will work, LLMFP analyzes the solution and modifies any incorrect steps in the problem formulation. Once the plan passes this self-assessment, the solution is described to the user in natural language.

Complete the plan

This self-assessment module also allows the LLM to add any implicit constraints it missed the first time around, Hao says.

For instance, if the framework is optimizing a supply chain to minimize costs for a coffee shop, a human knows the coffee shop can’t ship a negative number of roasted beans, but an LLM might not realize that.

The self-evaluation step flags the error and prompts the model to correct it.
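That flag-and-repair loop can be sketched roughly as follows, with a toy formulation step standing in for the LLM and a check for negative shipments standing in for the self-assessment; every name here is hypothetical:

```python
def self_check(plan):
    """Flag violations of implicit, common-sense constraints."""
    issues = []
    if any(qty < 0 for qty in plan.values()):
        issues.append("negative shipment quantity")
    return issues

def formulate(known_issues):
    # Toy stand-in for the LLM's formulation step: without the
    # nonnegativity rule it "ships" a negative amount of beans.
    if "negative shipment quantity" in known_issues:
        return {"roasted_beans": 0}
    return {"roasted_beans": -20}

def solve_with_repair(max_rounds=3):
    known_issues = []
    plan = formulate(known_issues)
    for _ in range(max_rounds):
        issues = self_check(plan)
        if not issues:
            break
        known_issues.extend(issues)  # feed the flagged error back in
        plan = formulate(known_issues)
    return plan

print(solve_with_repair())  # → {'roasted_beans': 0}
```

The loop terminates either when the plan passes every check or after a bounded number of repair rounds, mirroring the idea of retrying a fix rather than giving up.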

“Plus, the LLM can adapt to the preferences of each user. If a model realizes a particular user does not like to change the time or budget of their travel plans, it can suggest changing things that fit the user’s needs,” says Fan.

In a series of tests, their framework achieved an average success rate of between 83 and 87 percent across nine diverse planning problems using several LLMs. While some baseline models were better at certain problems, LLMFP achieved an overall success rate about twice as high as the baseline techniques.

Unlike these other approaches, LLMFP doesn’t require domain-specific examples for training. It can find the optimal solution to a planning problem right out of the box.

In addition, users can adapt LLMFP to different optimization solvers by adjusting the prompts fed to the LLM.

“With LLMs, we have an opportunity to create an interface that allows people to use tools from other domains to solve problems in ways they might not have been thinking about before,” says Fan.

In the future, the researchers want to enable LLMFP to take images as input to supplement the descriptions of a planning problem. This would help the framework solve tasks that are particularly hard to fully describe with natural language.

This work was funded, in part, by the Office of Naval Research and the MIT-IBM Watson AI Lab.