Researchers find you don’t need a ton of data to train LLMs for reasoning tasks

by Investor News Today
February 15, 2025
in Technology



Large language models (LLMs) can learn complex reasoning tasks without relying on massive datasets, according to a new study by researchers at Shanghai Jiao Tong University. Their findings show that with just a small batch of well-curated examples, you can train an LLM for tasks that were thought to require tens of thousands of training instances.

This efficiency is due to the inherent knowledge that modern LLMs acquire during the pre-training phase. With new training methods becoming more data- and compute-efficient, enterprises may be able to create customized models without requiring access to the resources of large AI labs.

Less is more (LIMO)

In their study, the researchers challenge the assumption that you need large amounts of data to train LLMs for reasoning tasks. They introduce the concept of “less is more” (LIMO). Their work builds on top of previous research that showed LLMs could be aligned with human preferences with just a few examples.

Less is More (LIMO) for reasoning (source: arXiv)

In their experiments, they demonstrated that they could create a LIMO dataset for complex mathematical reasoning tasks with just a few hundred training examples. An LLM fine-tuned on the dataset was able to produce complex chain-of-thought (CoT) reasoning chains that enabled it to accomplish the tasks at a very high success rate.

For example, a Qwen2.5-32B-Instruct model fine-tuned on 817 training examples chosen based on LIMO reached 57.1% accuracy on the highly challenging AIME benchmark and 94.8% on MATH, outperforming models that were trained on 100 times more examples. It also scored higher on the benchmarks than reasoning models such as QwQ-32B-Preview (a version of the Qwen model that has been trained for reasoning) and OpenAI o1-preview, both of which were trained with larger data and compute resources.
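
The researchers' exact recipe aside, the mechanics of such a run are ordinary supervised fine-tuning on a few hundred curated problem–solution pairs. Below is a minimal sketch of what that could look like with Hugging Face's trl library; the dataset identifier, column names, and hyperparameters are illustrative assumptions, not the paper's actual configuration.

```python
# Minimal sketch: supervised fine-tuning on a small, curated reasoning dataset.
# The dataset id, column names, and hyperparameters below are illustrative
# assumptions, not the researchers' exact configuration.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# A few hundred curated problem/solution pairs with long reasoning chains.
dataset = load_dataset("GAIR/LIMO", split="train")  # assumed dataset id

def to_text(example):
    # Concatenate the problem and its full chain-of-thought solution
    # into one training string.
    return {"text": f"Problem:\n{example['question']}\n\nSolution:\n{example['solution']}"}

dataset = dataset.map(to_text)

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-32B-Instruct",  # base instruct model used in the experiments
    train_dataset=dataset,
    args=SFTConfig(
        output_dir="limo-sft",
        dataset_text_field="text",
        num_train_epochs=3,
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        learning_rate=1e-5,
        bf16=True,
    ),
)
trainer.train()
```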

Moreover, LIMO-trained models generalize to examples drastically different from their training data. For instance, on the OlympiadBench scientific benchmark, the LIMO model outperformed QwQ-32B-Preview, and on the challenging GPQA benchmark, it achieved 66.7% accuracy, close to OpenAI-o1-preview's leading score of 73.3%.

What does it mean for enterprise AI?

Customizing LLMs is an attractive use case for enterprise applications. Thanks to techniques such as retrieval-augmented generation (RAG) and in-context learning, LLMs can be customized to use bespoke data or perform new tasks without the need for expensive fine-tuning.
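
For tasks that do not require explicit reasoning training, that customization can be as light as placing a handful of examples in the prompt. Below is a minimal in-context learning sketch using an OpenAI-style chat client; the model name and the example classification task are placeholders, not from the study.

```python
# Minimal sketch: in-context learning, where customization happens in the
# prompt rather than through fine-tuning. The model name and the example
# task are placeholders, not from the study.
from openai import OpenAI

client = OpenAI()

few_shot_examples = [
    {"role": "user", "content": "Classify the ticket: 'My invoice total looks wrong.'"},
    {"role": "assistant", "content": "billing"},
    {"role": "user", "content": "Classify the ticket: 'The app crashes on login.'"},
    {"role": "assistant", "content": "technical"},
]

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[
        {"role": "system", "content": "You are a support-ticket classifier."},
        *few_shot_examples,
        {"role": "user", "content": "Classify the ticket: 'I want to change my plan.'"},
    ],
)
print(response.choices[0].message.content)
```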

However, reasoning tasks usually require training and fine-tuning LLMs. The widely held belief has been that such tasks require large volumes of training examples with highly detailed reasoning chains and solutions. Creating such datasets is slow and impractical for many applications and companies.

More recently, researchers have shown that pure reinforcement learning approaches can enable models to train themselves for reasoning tasks by generating many solutions and choosing the ones that work best. While this approach requires less manual effort, it still demands expensive compute resources that are beyond the reach of many enterprises.
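
The core loop behind such sample-and-select approaches is simple to describe even if the compute bill is not: draw many candidate solutions per problem, keep the ones a verifier accepts, and feed them back as training data. A minimal sketch follows; the sampler and verifier are hypothetical callables supplied by the user, not part of the paper's code.

```python
# Minimal sketch of the sample-and-select loop behind RL-style self-training:
# draw many candidate solutions per problem, keep only those a verifier
# accepts, and reuse them as training data for the next round.
import random
from typing import Callable, Dict, List

def collect_self_training_data(
    problems: List[str],
    sample_solution: Callable[[str], str],   # hypothetical: samples one solution from the model
    is_correct: Callable[[str, str], bool],  # hypothetical: verifies a (problem, solution) pair
    samples_per_problem: int = 16,
) -> List[Dict[str, str]]:
    accepted = []
    for problem in problems:
        candidates = [sample_solution(problem) for _ in range(samples_per_problem)]
        correct = [s for s in candidates if is_correct(problem, s)]
        if correct:
            # Keep one verified solution per problem as new training data.
            accepted.append({"problem": problem, "solution": random.choice(correct)})
    return accepted
```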

On the other hand, crafting a few hundred examples is an endeavor that many companies can tackle, bringing specialized reasoning models within the reach of a wider range of organizations.

“This discovery has profound implications for artificial intelligence research: It suggests that even competition-level complex reasoning abilities can be effectively elicited through minimal but curated training samples,” the researchers write.

Why LIMO works

In their experiments, the researchers identify two key reasons why LLMs can learn complex reasoning tasks with fewer examples.

First, state-of-the-art foundation models have been trained on a very large amount of mathematical content and code during pre-training. This means these LLMs already possess rich reasoning knowledge in their parameters that can be activated through carefully crafted examples.

Second, new post-training techniques have shown that allowing models to generate extended reasoning chains significantly improves their reasoning ability. In essence, giving the models more time to “think” allows them to unpack and apply their pre-trained knowledge more effectively.
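
In practice, giving a model more time to “think” largely means prompting for step-by-step reasoning and budgeting many output tokens. The sketch below illustrates the idea with the transformers text-generation pipeline; the model name, prompt, and token limit are illustrative assumptions.

```python
# Minimal sketch: allowing an extended chain of thought at inference time by
# prompting for step-by-step reasoning and budgeting many output tokens.
# Model name, prompt, and token limit are illustrative assumptions.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="Qwen/Qwen2.5-32B-Instruct",
    device_map="auto",  # spread the model across available GPUs
)

messages = [
    {"role": "system", "content": "Reason step by step before giving a final answer."},
    {"role": "user", "content": "How many positive divisors does 360 have?"},
]

# A large max_new_tokens leaves room for a long reasoning chain before the answer.
output = generator(messages, max_new_tokens=2048, do_sample=False)
print(output[0]["generated_text"][-1]["content"])
```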

“We hypothesize that successful reasoning emerges from the synergy of these two factors: rich pre-trained knowledge and sufficient computational resources at inference time,” the researchers write. “These developments collectively suggest a striking possibility: If models possess rich reasoning knowledge and are given adequate computational space, then activating their reasoning capabilities may require only a small number of high-quality training samples that encourage extended deliberation, rather than massive fine-tuning datasets.”

Choosing more complex problems to include in the training dataset can have a significant effect on the trained model's accuracy on reasoning tasks (source: arXiv)

According to the researchers' findings, creating useful LIMO datasets hinges on choosing the right problems and solutions. Data curators should prioritize challenging problems that require complex reasoning chains, diverse thought processes and knowledge integration. The problems should also deviate from the model's training distribution to encourage new reasoning approaches and push it toward generalization.

Accordingly, solutions should be clear and well-organized, with the reasoning steps adapted to the complexity of the problem. High-quality solutions should also provide strategic educational support by gradually building understanding through carefully structured explanations. One way such criteria could be applied in practice is sketched below.
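
These criteria translate naturally into a filtering step over a larger pool of candidate problems. The sketch below shows one way such a filter might look; the difficulty scores, thresholds, and field names are illustrative assumptions rather than the researchers' actual selection pipeline.

```python
# Minimal sketch of a curation filter based on these criteria. The difficulty
# scores, thresholds, and field names are illustrative assumptions, not the
# researchers' actual selection pipeline.
from typing import Dict, List

def select_limo_candidates(
    candidates: List[Dict],
    min_difficulty: float = 0.8,    # e.g., fraction of baseline attempts that fail
    min_solution_lines: int = 10,   # favor long, multi-step reference solutions
    max_selected: int = 800,
) -> List[Dict]:
    selected = []
    for item in candidates:
        hard_enough = item["difficulty"] >= min_difficulty
        detailed_solution = item["solution"].count("\n") + 1 >= min_solution_lines
        if hard_enough and detailed_solution:
            selected.append(item)
    # When trimming to the target dataset size, keep the hardest problems first.
    selected.sort(key=lambda item: item["difficulty"], reverse=True)
    return selected[:max_selected]
```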

“By focusing on a minimal yet meticulously curated set of reasoning chains, we embody the core principle of LIMO: High-quality demonstrations, rather than sheer data volume, are key to unlocking complex reasoning capabilities,” the researchers write.

The researchers have released the code and data used to train the LIMO models in their experiments. In the future, they plan to expand the concept to other domains and applications.
