Sarvam-M is a single, versatile model that supports both 'think' and 'non-think' modes. The think mode is for complex logical reasoning, mathematical problems, and coding tasks, while the non-think mode is for efficient general-purpose conversation.
Sarvam AI Official Blog
Key Facts
- Sarvam AI, a Bengaluru-based startup, has launched Sarvam-M, a 24-billion-parameter multilingual hybrid-reasoning Large Language Model based on the open-weight Mistral Small model from French firm Mistral AI.
- Sarvam-M was enhanced through a three-step post-training process: Supervised Fine-Tuning (SFT), Reinforcement Learning with Verifiable Rewards (RLVR), and inference optimisations.
- Compared with the base model, Sarvam-M posted a 20% average improvement on Indian language tasks, a 21.6% improvement on math tasks, and a 17.6% improvement on coding benchmarks.
- On combined Indian language and math tasks such as the romanised GSM-8K benchmark, Sarvam-M achieved an 86% improvement.
- Sarvam AI claims that Sarvam-M outperforms Meta's LLaMA-4 Scout on most benchmarks and rivals larger models like LLaMA-3.3 70B and Google's Gemma 3 27B.
- Sarvam-M is now publicly accessible via Sarvam's API and is available for download on Hugging Face for experimentation and integration.
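The headline gains above are reported against the base model's scores. A minimal sketch of how such relative-improvement percentages are computed (the function name and example scores below are hypothetical and purely illustrative, not figures from Sarvam's evaluation):

```python
def relative_improvement(new_score: float, base_score: float) -> float:
    """Percentage improvement of new_score over base_score."""
    return (new_score - base_score) / base_score * 100.0

# Hypothetical illustration: a base-model benchmark score of 50.0
# rising to 60.0 after post-training is a 20% relative improvement.
print(round(relative_improvement(60.0, 50.0), 1))
```

Note that vendors sometimes report absolute percentage-point gains instead of relative gains; the source does not specify which convention Sarvam used.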
Key Stats at a Glance
- Parameter size of Sarvam-M: 24 billion parameters
- Average improvement on Indian language benchmarks: 20%
- Improvement on math-related tasks: 21.6%
- Improvement on coding benchmarks: 17.6%
- Improvement on combined Indian language and math tasks (GSM-8K benchmark): 86%
