• Latest
  • Trending
  • All
  • Market Updates
  • Cryptocurrency
  • Blockchain
  • Investing
  • Commodities
  • Personal Finance
  • Technology
  • Business
  • Real Estate
  • Finance
MiniMax-M2 is the new king of open source LLMs (especially for agentic tool calling)

MiniMax-M2 is the new king of open source LLMs (especially for agentic tool calling)

October 27, 2025
Japanese Yen supported by intervention fears, awaits Fed/BoJ decisions

Japanese Yen supported by intervention fears, awaits Fed/BoJ decisions

October 28, 2025
Nucor, Waste Management, F5, NXP and more

Nucor, Waste Management, F5, NXP and more

October 28, 2025
dYdX proposes $462K trader compensation after Oct. 10 chain halt

dYdX proposes $462K trader compensation after Oct. 10 chain halt

October 28, 2025
The best VPS hosting services for 2025: Expert tested

The best VPS hosting services for 2025: Expert tested

October 27, 2025
Soft Manager – Trading Ideas – 5 August 2025

Buy by Trend – My Trading – 27 October 2025

October 27, 2025
10 Best Ways to Invest $10,000 in 2025 – Make Your Money Grow

10 Best Ways to Invest $10,000 in 2025 – Make Your Money Grow

October 27, 2025
What Happened When Data Met Experience

What Happened When Data Met Experience

October 27, 2025
Beijing and Berlin play down diplomatic rift after German minister cancels China trip

Beijing and Berlin play down diplomatic rift after German minister cancels China trip

October 27, 2025
Qualcomm, Fermi, Organon and more

Qualcomm, Fermi, Organon and more

October 27, 2025
Amazon may cut 30,000 corporate jobs, starting Tuesday

Amazon may cut 30,000 corporate jobs, starting Tuesday

October 27, 2025
Bitcoin’s Illiquid Supply Drops By 62,000 BTC

Bitcoin’s Illiquid Supply Drops By 62,000 BTC

October 27, 2025
Poor Versus Rich Millionaires: Liquidity Is The Difference

Poor Versus Rich Millionaires: Liquidity Is The Difference

October 27, 2025
Tuesday, October 28, 2025
No Result
View All Result
InvestorNewsToday.com
  • Home
  • Market
  • Business
  • Finance
  • Investing
  • Real Estate
  • Commodities
  • Crypto
  • Blockchain
  • Personal Finance
  • Tech
InvestorNewsToday.com
No Result
View All Result
Home Technology

MiniMax-M2 is the new king of open source LLMs (especially for agentic tool calling)

by Investor News Today
October 27, 2025
in Technology
0
MiniMax-M2 is the new king of open source LLMs (especially for agentic tool calling)
492
SHARES
1.4k
VIEWS
Share on FacebookShare on Twitter



Be careful, DeepSeek and Qwen! There's a brand new king of open supply giant language fashions (LLMs), particularly in the case of one thing enterprises are more and more valuing: agentic software use — that’s, the power to go off and use different software program capabilities like net search or bespoke purposes — with out a lot human steerage.

That mannequin is none apart from MiniMax-M2, the most recent LLM from the Chinese language startup of the identical identify. And in a giant win for enterprises globally, the mannequin is offered below a permissive, enterprise-friendly MIT License, that means it’s made out there freely for builders to take, deploy, retrain, and use how they see match — even for industrial functions. It may be discovered on Hugging Face, GitHub and ModelScope, in addition to by way of MiniMax's API right here. It helps OpenAI and Anthropic API requirements, as nicely, making it simple for patrons of stated proprietary AI startups to shift out their fashions to MiniMax's API, if they need.

In response to impartial evaluations by Synthetic Evaluation, a third-party generative AI mannequin benchmarking and analysis group, M2 now ranks first amongst all open-weight techniques worldwide on the Intelligence Index—a composite measure of reasoning, coding, and task-execution efficiency.

In agentic benchmarks that measure how nicely a mannequin can plan, execute, and use exterior instruments—abilities that energy coding assistants and autonomous brokers—MiniMax’s personal reported outcomes, following the Synthetic Evaluation methodology, present τ²-Bench 77.2, BrowseComp 44.0, and FinSearchComp-global 65.5.

These scores place it at or close to the extent of prime proprietary techniques like GPT-5 (pondering) and Claude Sonnet 4.5, making MiniMax-M2 the highest-performing open mannequin but launched for real-world agentic and tool-calling duties.

What It Means For Enterprises and the AI Race

Constructed round an environment friendly Combination-of-Consultants (MoE) structure, MiniMax-M2 delivers high-end functionality for agentic and developer workflows whereas remaining sensible for enterprise deployment.

For technical decision-makers, the discharge marks an essential turning level for open fashions in enterprise settings. MiniMax-M2 combines frontier-level reasoning with a manageable activation footprint—simply 10 billion energetic parameters out of 230 billion whole.

This design permits enterprises to function superior reasoning and automation workloads on fewer GPUs, reaching near-state-of-the-art outcomes with out the infrastructure calls for or licensing prices related to proprietary frontier techniques.

Synthetic Evaluation’ information present that MiniMax-M2’s strengths transcend uncooked intelligence scores. The mannequin leads or carefully trails prime proprietary techniques equivalent to GPT-5 (pondering) and Claude Sonnet 4.5 throughout benchmarks for end-to-end coding, reasoning, and agentic software use.

Its efficiency in τ²-Bench, SWE-Bench, and BrowseComp signifies explicit benefits for organizations that rely on AI techniques able to planning, executing, and verifying complicated workflows—key capabilities for agentic and developer instruments inside enterprise environments.

As LLM engineer Pierre-Carl Langlais aka Alexander Doria posted on X: "MiniMax [is] making a case for mastering the expertise end-to-end to get precise agentic automation."

Compact Design, Scalable Efficiency

MiniMax-M2’s technical structure is a sparse Combination-of-Consultants mannequin with 230 billion whole parameters and 10 billion energetic per inference.

This configuration considerably reduces latency and compute necessities whereas sustaining broad common intelligence.

The design permits for responsive agent loops—compile–run–check or browse–retrieve–cite cycles—that execute sooner and extra predictably than denser fashions.

For enterprise expertise groups, this implies simpler scaling, decrease cloud prices, and diminished deployment friction. In response to Synthetic Evaluation, the mannequin could be served effectively on as few as 4 NVIDIA H100 GPUs at FP8 precision, a setup nicely inside attain for mid-size organizations or departmental AI clusters.

Benchmark Management Throughout Agentic and Coding Workflows

MiniMax’s benchmark suite highlights sturdy real-world efficiency throughout developer and agent environments. The determine beneath, launched with the mannequin, compares MiniMax-M2 (in pink) with a number of main proprietary and open fashions, together with GPT-5 (pondering), Claude Sonnet 4.5, Gemini 2.5 Professional, and DeepSeek-V3.2.

MiniMax-M2 achieves prime or near-top efficiency in lots of classes:

  • SWE-bench Verified: 69.4 — near GPT-5’s 74.9

  • ArtifactsBench: 66.8 — above Claude Sonnet 4.5 and DeepSeek-V3.2

  • τ²-Bench: 77.2 — approaching GPT-5’s 80.1

  • GAIA (textual content solely): 75.7 — surpassing DeepSeek-V3.2

  • BrowseComp: 44.0 — notably stronger than different open fashions

  • FinSearchComp-global: 65.5 — finest amongst examined open-weight techniques

These outcomes present MiniMax-M2’s functionality in executing complicated, tool-augmented duties throughout a number of languages and environments—abilities more and more related for automated help, R&D, and information evaluation inside enterprises.

Robust Displaying in Synthetic Evaluation’ Intelligence Index

The mannequin’s total intelligence profile is confirmed within the newest Synthetic Evaluation Intelligence Index v3.0, which aggregates efficiency throughout ten reasoning benchmarks together with MMLU-Professional, GPQA Diamond, AIME 2025, IFBench, and τ²-Bench Telecom.

MiniMax-M2 scored 61 factors, rating as the very best open-weight mannequin globally and following carefully behind GPT-5 (excessive) and Grok 4.

Synthetic Evaluation highlighted the mannequin’s steadiness between technical accuracy, reasoning depth, and utilized intelligence throughout domains. For enterprise customers, this consistency signifies a dependable mannequin basis appropriate for integration into software program engineering, buyer help, or information automation techniques.

Designed for Builders and Agentic Programs

MiniMax engineered M2 for end-to-end developer workflows, enabling multi-file code edits, automated testing, and regression restore immediately inside built-in improvement environments or CI/CD pipelines.

The mannequin additionally excels in agentic planning—dealing with duties that mix net search, command execution, and API calls whereas sustaining reasoning traceability.

These capabilities make MiniMax-M2 particularly priceless for enterprises exploring autonomous developer brokers, information evaluation assistants, or AI-augmented operational instruments.

Benchmarks equivalent to Terminal-Bench and BrowseComp reveal the mannequin’s capacity to adapt to incomplete information and get better gracefully from intermediate errors, enhancing reliability in manufacturing settings.

Interleaved Pondering and Structured Software Use

A particular facet of MiniMax-M2 is its interleaved pondering format, which maintains seen reasoning traces between <suppose>…</suppose> tags.

This permits the mannequin to plan and confirm steps throughout a number of exchanges, a vital characteristic for agentic reasoning. MiniMax advises retaining these segments when passing dialog historical past to protect the mannequin’s logic and continuity.

The corporate additionally gives a Software Calling Information on Hugging Face, detailing how builders can join exterior instruments and APIs by way of structured XML-style calls.

This performance permits MiniMax-M2 to function the reasoning core for bigger agent frameworks, executing dynamic duties equivalent to search, retrieval, and computation by way of exterior capabilities.

Open Supply Entry and Enterprise Deployment Choices

Enterprises can entry the mannequin by way of the MiniMax Open Platform API and MiniMax Agent interface (an internet chat just like ChatGPT), each at present free for a restricted time.

MiniMax recommends SGLang and vLLM for environment friendly serving, every providing day-one help for the mannequin’s distinctive interleaved reasoning and tool-calling construction.

Deployment guides and parameter configurations can be found by way of MiniMax’s documentation.

Price Effectivity and Token Economics

As Synthetic Evaluation famous, MiniMax’s API pricing is about at $0.30 per million enter tokens and $1.20 per million output tokens, among the many best within the open-model ecosystem.

Supplier

Mannequin (doc hyperlink)

Enter $/1M

Output $/1M

Notes

MiniMax

MiniMax-M2

$0.30

$1.20

Listed below “Chat Completion v2” for M2.

OpenAI

GPT-5

$1.25

$10.00

Flagship mannequin pricing on OpenAI’s API pricing web page.

OpenAI

GPT-5 mini

$0.25

$2.00

Cheaper tier for well-defined duties.

Anthropic

Claude Sonnet 4.5

$3.00

$15.00

Anthropic’s present per-MTok listing; long-context (>200K enter) makes use of a premium tier.

Google

Gemini 2.5 Flash (Preview)

$0.30

$2.50

Costs embrace “pondering tokens”; web page additionally lists cheaper Flash-Lite and a couple of.0 tiers.

xAI

Grok-4 Quick (reasoning)

$0.20

$0.50

“Quick” tier; xAI additionally lists Grok-4 at $3 / $15.

DeepSeek

DeepSeek-V3.2 (chat)

$0.28

$0.42

Cache-hit enter is $0.028; desk exhibits per-model particulars.

Qwen (Alibaba)

qwen-flash (Mannequin Studio)

from $0.022

from $0.216

Tiered by enter measurement (≤128K, ≤256K, ≤1M tokens); listed “Enter worth / Output worth per 1M”.

Cohere

Command R+ (Aug 2024)

$2.50

$10.00

First-party pricing web page additionally lists Command R ($0.50 / $1.50) and others.

Notes & caveats (for readers):

  • Costs are USD per million tokens and might change; examine linked pages for updates and area/endpoint nuances (e.g., Anthropic long-context >200K enter, Google Dwell API variants, cache reductions).

  • Distributors could invoice additional for server-side instruments (net search, code execution) or provide batch/context-cache reductions.

Whereas the mannequin produces longer, extra specific reasoning traces, its sparse activation and optimized compute design assist preserve a good cost-performance steadiness—a bonus for groups deploying interactive brokers or high-volume automation techniques.

Background on MiniMax — an Rising Chinese language Powerhouse

MiniMax has rapidly change into one of the carefully watched names in China’s fast-rising AI sector.

Backed by Alibaba and Tencent, the corporate moved from relative obscurity to worldwide recognition inside a 12 months—first by way of breakthroughs in AI video technology, then by way of a collection of open-weight giant language fashions (LLMs) aimed squarely at builders and enterprises.

The corporate first captured world consideration in late 2024 with its AI video technology software, “video-01,” which demonstrated the power to create dynamic, cinematic scenes in seconds. VentureBeat described how the mannequin’s launch sparked widespread curiosity after on-line creators started sharing lifelike, AI-generated footage—most memorably, a viral clip of a Star Wars lightsaber duel that drew hundreds of thousands of views in below two days.

CEO Yan Junjie emphasised that the system outperformed main Western instruments in producing human motion and expression, an space the place video AIs usually battle. The product, later commercialized by way of MiniMax’s Hailuo platform, showcased the startup’s technical confidence and artistic attain, serving to to determine China as a critical contender in generative video expertise.

By early 2025, MiniMax had turned its consideration to long-context language modeling, unveiling the MiniMax-01 collection, together with MiniMax-Textual content-01 and MiniMax-VL-01. These open-weight fashions launched an unprecedented 4-million-token context window, doubling the attain of Google’s Gemini 1.5 Professional and dwarfing OpenAI’s GPT-4o by greater than twentyfold.

The corporate continued its speedy cadence with the MiniMax-M1 launch in June 2025, a mannequin centered on long-context reasoning and reinforcement studying effectivity. M1 prolonged context capability to 1 million tokens and launched a hybrid Combination-of-Consultants design skilled utilizing a customized reinforcement-learning algorithm often known as CISPO. Remarkably, VentureBeat reported that MiniMax skilled M1 at a complete value of about $534,700, roughly one-tenth of DeepSeek’s R1 and much beneath the multimillion-dollar budgets typical for frontier-scale fashions.

For enterprises and technical groups, MiniMax’s trajectory alerts the arrival of a brand new technology of cost-efficient, open-weight fashions designed for real-world deployment. Its open licensing—starting from Apache 2.0 to MIT—offers companies freedom to customise, self-host, and fine-tune with out vendor lock-in or compliance restrictions.

Options equivalent to structured operate calling, long-context retention, and high-efficiency consideration architectures immediately deal with the wants of engineering teams managing multi-step reasoning techniques and data-intensive pipelines.

As MiniMax continues to increase its lineup, the corporate has emerged as a key world innovator in open-weight AI, combining formidable analysis with pragmatic engineering.

Open-Weight Management and Business Context

The discharge of MiniMax-M2 reinforces the rising management of Chinese language AI analysis teams in open-weight mannequin improvement.

Following earlier contributions from DeepSeek, Alibaba’s Qwen collection, and Moonshot AI, MiniMax’s entry continues the development towards open, environment friendly techniques designed for real-world use.

Synthetic Evaluation noticed that MiniMax-M2 exemplifies a broader shift in focus towards agentic functionality and reinforcement-learning refinement, prioritizing controllable reasoning and actual utility over uncooked mannequin measurement.

For enterprises, this implies entry to a state-of-the-art open mannequin that may be audited, fine-tuned, and deployed internally with full transparency.

By pairing sturdy benchmark efficiency with open licensing and environment friendly scaling, MiniMaxAI positions MiniMax-M2 as a sensible basis for clever techniques that suppose, act, and help with traceable logic—making it one of the enterprise-ready open AI fashions out there as we speak.



Source link

Tags: agenticcallingkingLLMsMiniMaxM2opensourceTool
Share197Tweet123
Previous Post

The best VPS hosting services for 2025: Expert tested

Next Post

dYdX proposes $462K trader compensation after Oct. 10 chain halt

Investor News Today

Investor News Today

Next Post
dYdX proposes $462K trader compensation after Oct. 10 chain halt

dYdX proposes $462K trader compensation after Oct. 10 chain halt

  • Trending
  • Comments
  • Latest
Private equity groups prepare to offload Ensemble Health for up to $12bn

Private equity groups prepare to offload Ensemble Health for up to $12bn

May 16, 2025
The human harbor: Navigating identity and meaning in the AI age

The human harbor: Navigating identity and meaning in the AI age

July 14, 2025
Equinor scales back renewables push 7 years after ditching ‘oil’ from its name

Equinor scales back renewables push 7 years after ditching ‘oil’ from its name

February 5, 2025
Niels Troost has a staggering story to tell about how he got sanctioned

Niels Troost has a staggering story to tell about how he got sanctioned

December 14, 2024
Why America’s economy is soaring ahead of its rivals

Why America’s economy is soaring ahead of its rivals

0
Dollar climbs after Donald Trump’s Brics tariff threat and French political woes

Dollar climbs after Donald Trump’s Brics tariff threat and French political woes

0
Nato chief Mark Rutte’s warning to Trump

Nato chief Mark Rutte’s warning to Trump

0
Top Federal Reserve official warns progress on taming US inflation ‘may be stalling’

Top Federal Reserve official warns progress on taming US inflation ‘may be stalling’

0
Japanese Yen supported by intervention fears, awaits Fed/BoJ decisions

Japanese Yen supported by intervention fears, awaits Fed/BoJ decisions

October 28, 2025
Nucor, Waste Management, F5, NXP and more

Nucor, Waste Management, F5, NXP and more

October 28, 2025
dYdX proposes $462K trader compensation after Oct. 10 chain halt

dYdX proposes $462K trader compensation after Oct. 10 chain halt

October 28, 2025
MiniMax-M2 is the new king of open source LLMs (especially for agentic tool calling)

MiniMax-M2 is the new king of open source LLMs (especially for agentic tool calling)

October 27, 2025

Live Prices

© 2024 Investor News Today

No Result
View All Result
  • Home
  • Market
  • Business
  • Finance
  • Investing
  • Real Estate
  • Commodities
  • Crypto
  • Blockchain
  • Personal Finance
  • Tech

© 2024 Investor News Today