AI Tools 14 min read March 25, 2026

Sarvam AI 105B: India's Most Powerful Multilingual Language Model

A deep dive into Sarvam AI's 105B parameter model — built for Bharat, fluent in 22 Indian languages, and designed for production AI applications. API setup, key features, and real-world use cases.

#Sarvam AI#LLM#Indian Languages#Multilingual AI#API#Python#Bharat
Neel Shah Tech Lead · Senior Data Engineer · Ottawa

If you are building AI products for India, you have been patching around a real problem: the big models from OpenAI and Anthropic are trained mostly on English. They handle Hindi, Tamil, and Bengali, but with obvious gaps. Code-switching (mixing languages mid-sentence), regional dialects, script accuracy for Devanagari and Tamil, and culturally grounded reasoning are not afterthoughts that a round of fine-tuning will fix.

Sarvam AI was built to solve this from the ground up. The Sarvam-105B model is their flagship large language model — 105 billion parameters, pre-trained on a multilingual corpus that prioritises all 22 scheduled Indian languages alongside English. This guide walks you through what the model actually does well, how to access it, and how to build with it.


What Is Sarvam AI?

Sarvam AI is a Bangalore-based AI company co-founded by Vivek Raghavan and Pratyush Kumar — both researchers with deep roots in speech and language technology for Indian languages.

The company’s stated mission is building full-stack AI for Bharat: not just language models, but the entire pipeline from speech-to-text and translation to LLM inference and text-to-speech. Every layer is built with Indian languages as a first-class target, not an add-on.

Their model line-up before the 105B:

  • Sarvam-2B — open-source, released in 2024, focused on 10 Indian languages
  • Sarvam Translate — specialised translation API supporting 22 Indian languages
  • Sarvam Speech — ASR and TTS models tuned for Indian accents and scripts

Sarvam-105B is the new flagship. It is the first Indian-developed LLM at this parameter scale designed for multilingual reasoning across all 22 scheduled Indian languages.


What Makes 105B Different

Three things set this model apart from using a general-purpose LLM and prompting it in Hindi.

1. Script-Native Training

Most large models see Indian scripts through tokenisers built for English. A Hindi sentence gets fragmented into sub-word tokens in a way that inflates context length and degrades coherence. Sarvam-105B uses a tokeniser built from scratch on a balanced multilingual corpus — each Indian script is tokenised efficiently. A sentence in Tamil uses roughly the same number of tokens as its English translation.
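
To see why script matters, here is a rough illustration in plain Python. It does not use Sarvam's tokeniser (whose API is not shown in this article). It only measures how many UTF-8 bytes each code point needs, which is the raw material a byte-level tokeniser starts from. The function name is illustrative.

```python
def utf8_bytes_per_char(text: str) -> float:
    """Average number of UTF-8 bytes per Unicode code point in `text`."""
    if not text:
        return 0.0
    return len(text.encode("utf-8")) / len(text)


samples = {
    "English": "Hello",
    "Hindi": "नमस्ते",
    "Tamil": "வணக்கம்",
}

for label, sample in samples.items():
    # Devanagari and Tamil code points take 3 bytes each in UTF-8; ASCII takes 1
    print(f"{label}: {utf8_bytes_per_char(sample):.1f} bytes per code point")
```

A tokeniser whose vocabulary was learned mostly from English has few merged tokens for these byte sequences, so the same sentence tends to cost more tokens in an Indic script.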

2. Code-Switching Fluency

Real Indian communication mixes languages constantly. A customer support message might read: “Mera order abhi tak nahi aaya, please help.” (My order hasn’t arrived yet, please help.) Standard LLMs handle this poorly — they either default to English or produce awkward responses that don’t match the register.

Sarvam-105B is trained on naturally occurring code-switched text. It reads the input as a whole and responds in the same register, not a cleaned-up translation of it.

3. Cultural and Contextual Grounding

Names, places, festivals, legal frameworks, government schemes: these are all part of the model’s training distribution rather than gaps in it. When a user asks about PMAY, MGNREGA, or GST filing, the model has substantive knowledge rather than a thin English-language summary.


Getting Access

Sarvam AI provides API access through their developer platform.

Requirements
─────────────────────────────────────────
  Account    sarvam.ai (free tier available)
  API key    from the developer dashboard
  Python     3.9 or later
  pip        any recent version

Step 1: Create an account

Go to sarvam.ai, create an account, and navigate to the API keys section of your dashboard.

Step 2: Get your API key

Your API key will look like: sk-...

Store it as an environment variable — never hardcode keys in source files:

export SARVAM_API_KEY="sk-your-key-here"

Or add it to your .bashrc / .zshrc for persistence:

echo 'export SARVAM_API_KEY="sk-your-key-here"' >> ~/.bashrc
source ~/.bashrc
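
On the Python side, it helps to fail early with a clear message when the variable is missing. A minimal sketch; the helper name `load_api_key` is illustrative and not part of any SDK:

```python
import os


def load_api_key(env=None, name: str = "SARVAM_API_KEY") -> str:
    """Return the API key from the environment, or raise with a clear message."""
    env = os.environ if env is None else env
    key = (env.get(name) or "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running this script.")
    return key

# Typical use at start-up:
#   api_key = load_api_key()
```

Passing `env` explicitly is only there to make the helper easy to test; in normal use it reads `os.environ`.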

Step 3: Install the SDK

pip install sarvam-python

For a minimal setup without the SDK, you can use the REST API directly with requests:

pip install requests

Your First API Call

Let’s verify the setup with a simple Hindi prompt.

import os
from sarvam import SarvamClient

client = SarvamClient(api_key=os.environ["SARVAM_API_KEY"])

response = client.chat.complete(
    model="sarvam-105b",
    messages=[
        {
            "role": "user",
            "content": "भारत में आर्टिफिशियल इंटेलिजेंस के क्या फायदे हैं?"
        }
    ]
)

print(response.choices[0].message.content)

You should get a coherent, well-structured Hindi response about AI benefits in India. No translation artifacts, no broken Devanagari script.

Using the REST API directly:

import os
import requests

response = requests.post(
    "https://api.sarvam.ai/v1/chat/completions",
    headers={
        "Authorization": f"Bearer {os.environ['SARVAM_API_KEY']}",
        "Content-Type": "application/json"
    },
    json={
        "model": "sarvam-105b",
        "messages": [
            {"role": "user", "content": "Hello, what languages do you speak?"}
        ]
    }
)

data = response.json()
print(data["choices"][0]["message"]["content"])
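
If you would rather not install anything, the same request can be sketched with the standard library. The endpoint and payload shape are the ones shown in this article; the helper names are illustrative, and the explicit timeout is an added precaution rather than a documented requirement.

```python
import json
import urllib.request

API_URL = "https://api.sarvam.ai/v1/chat/completions"  # endpoint as given in this article


def build_chat_request(prompt: str, api_key: str, model: str = "sarvam-105b") -> urllib.request.Request:
    """Build (but do not send) a POST request for a single-turn chat."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        API_URL,
        # ensure_ascii=False keeps Indic text readable in the request body
        data=json.dumps(payload, ensure_ascii=False).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request; urlopen raises urllib.error.HTTPError on 4xx/5xx."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Keeping request construction separate from sending makes it possible to inspect or log the payload without touching the network.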

Five Features That Matter

Feature 1: Multilingual Chat

The model handles all 22 scheduled Indian languages. You can switch languages between turns and it maintains context correctly.

messages = [
    {"role": "user", "content": "नमस्ते! मेरा नाम नील है।"},
    {"role": "assistant", "content": "नमस्ते नील! आप कैसे हैं?"},
    {"role": "user", "content": "I'm good. Can you explain what you just said?"},
]

response = client.chat.complete(
    model="sarvam-105b",
    messages=messages
)
print(response.choices[0].message.content)

The model correctly identifies that the user has switched to English and responds accordingly, summarising the Hindi exchange.
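
For longer conversations you will want to manage the `messages` list rather than build it by hand. The class below is a minimal sketch of one way to do that, assuming the role/content message format used above; `Conversation` and `max_turns` are illustrative names, not SDK features.

```python
class Conversation:
    """Keeps chat history as a list of {"role", "content"} dicts."""

    def __init__(self, system_prompt=None, max_turns: int = 10):
        self.system_prompt = system_prompt
        self.max_turns = max_turns  # one turn = a user message plus the assistant reply
        self.history = []

    def add(self, role: str, content: str) -> None:
        self.history.append({"role": role, "content": content})
        limit = 2 * self.max_turns
        # Trim from the front until within the limit and starting on a user message
        while self.history and (
            len(self.history) > limit or self.history[0]["role"] != "user"
        ):
            self.history.pop(0)

    def messages(self) -> list:
        """History prefixed with the system prompt, ready to pass as `messages`."""
        prefix = []
        if self.system_prompt:
            prefix.append({"role": "system", "content": self.system_prompt})
        return prefix + list(self.history)


conv = Conversation(system_prompt="Reply in the user's language.", max_turns=5)
conv.add("user", "नमस्ते! मेरा नाम नील है।")
conv.add("assistant", "नमस्ते नील! आप कैसे हैं?")
conv.add("user", "I'm good. Can you explain what you just said?")
print(len(conv.messages()))  # system prompt + 3 messages
```

Trimming by turn count is the simplest policy; a production version would more likely trim by token count.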

Supported languages include:

Hindi (hi)        Bengali (bn)       Telugu (te)
Tamil (ta)        Marathi (mr)       Gujarati (gu)
Kannada (kn)      Malayalam (ml)     Punjabi (pa)
Odia (or)         Assamese (as)      Urdu (ur)
Sanskrit (sa)     Sindhi (sd)        Kashmiri (ks)
Konkani (kok)     Manipuri (mni)     Maithili (mai)
Dogri (doi)       Bodo (brx)         Santhali (sat)
Nepali (ne)

Feature 2: Document Analysis in Regional Languages

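Useful for processing government documents, legal filings, or customer correspondence in Indian languages. The example below passes plain text, so a scanned document would first need its text extracted (for example, with OCR).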

def analyse_document(text: str, question: str, language: str = "hi") -> str:
    # Note: `language` is accepted but not used yet; the prompt below is always Hindi.
    prompt = f"""नीचे दिया गया दस्तावेज़ पढ़ें और प्रश्न का उत्तर दें।

दस्तावेज़:
{text}

प्रश्न: {question}

उत्तर:"""

    response = client.chat.complete(
        model="sarvam-105b",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=500
    )
    return response.choices[0].message.content

# Example usage
doc = """ग्राम पंचायत कार्यालय
दिनांक: 15 मार्च 2026

प्रमाण पत्र

यह प्रमाणित किया जाता है कि श्री राम प्रसाद शर्मा, पुत्र श्री हरि प्रसाद शर्मा,
ग्राम रामपुर, जिला लखनऊ के स्थायी निवासी हैं।"""

result = analyse_document(doc, "इस दस्तावेज़ में किस व्यक्ति का उल्लेख है?")
print(result)
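
`analyse_document` accepts a `language` argument but always sends a Hindi prompt. One way to make the argument meaningful is a small table of prompt templates keyed by language code. This is a sketch: only the Hindi template comes from the example above, the English one is a direct translation, and the names are illustrative.

```python
PROMPT_TEMPLATES = {
    # Hindi template taken from analyse_document above
    "hi": (
        "नीचे दिया गया दस्तावेज़ पढ़ें और प्रश्न का उत्तर दें।\n\n"
        "दस्तावेज़:\n{text}\n\nप्रश्न: {question}\n\nउत्तर:"
    ),
    # English equivalent, added for illustration
    "en": (
        "Read the document below and answer the question.\n\n"
        "Document:\n{text}\n\nQuestion: {question}\n\nAnswer:"
    ),
}


def build_document_prompt(text: str, question: str, language: str = "hi") -> str:
    """Fill the template for `language`; raise if no template exists for it."""
    try:
        template = PROMPT_TEMPLATES[language]
    except KeyError:
        raise ValueError(f"No prompt template for language code {language!r}") from None
    return template.format(text=text, question=question)


print(build_document_prompt("ग्राम पंचायत कार्यालय", "यह किस कार्यालय का दस्तावेज़ है?"))
```

Raising on an unknown code is deliberate: silently falling back to Hindi would hide configuration mistakes.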

Feature 3: Code-Switched Customer Support

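This is the use case where Sarvam-105B is positioned to outperform general-purpose models. Indian customer support conversations naturally mix Hindi and English, and the model is trained to handle that mix. The system prompt below only states the expected register; no few-shot examples or translation steps are used.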

def customer_support_agent(user_message: str) -> str:
    system_prompt = """You are a helpful customer support agent for an Indian e-commerce platform.
Respond naturally in the same language mix the user uses. If they write in Hindi, respond in Hindi.
If they mix Hindi and English (Hinglish), match that style. Be warm and helpful."""

    response = client.chat.complete(
        model="sarvam-105b",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_message}
        ],
        temperature=0.7
    )
    return response.choices[0].message.content

# Hinglish input
reply = customer_support_agent(
    "Mera order 3 din se pending hai, kab milega? Very frustrated hoon."
)
print(reply)

Feature 4: Structured Output in Indian Languages

For applications that need JSON extraction from Hindi or Tamil text — forms, applications, data entry automation.

import json

def extract_from_hindi_form(form_text: str) -> dict:
    prompt = f"""नीचे दिए गए फॉर्म से जानकारी निकालें और JSON में दें।

फॉर्म:
{form_text}

केवल JSON लौटाएं, कोई अन्य टेक्स्ट नहीं:
{{
  "name": "",
  "age": "",
  "address": "",
  "phone": ""
}}"""

    response = client.chat.complete(
        model="sarvam-105b",
        messages=[{"role": "user", "content": prompt}],
        temperature=0
    )

    raw = response.choices[0].message.content.strip()
    # Clean markdown code blocks if present
    if raw.startswith("```"):
        raw = raw.split("```")[1]
        if raw.startswith("json"):
            raw = raw[4:]
    return json.loads(raw.strip())

form_data = """
नाम: सुनीता देवी
आयु: 34 वर्ष
पता: 12, गांधी नगर, जयपुर, राजस्थान
मोबाइल: 9876543210
"""

result = extract_from_hindi_form(form_data)
print(result)
# {'name': 'सुनीता देवी', 'age': '34', 'address': '12, गांधी नगर, जयपुर, राजस्थान', 'phone': '9876543210'}

Feature 5: Streaming for Real-Time Applications

For chatbots and interfaces where users need to see tokens as they arrive:

import sys

def stream_response(prompt: str) -> None:
    stream = client.chat.complete(
        model="sarvam-105b",
        messages=[{"role": "user", "content": prompt}],
        stream=True
    )

    for chunk in stream:
        delta = chunk.choices[0].delta
        if delta.content:
            print(delta.content, end="", flush=True)
    print()  # newline at end

stream_response("भारत की संस्कृति के बारे में तीन रोचक तथ्य बताइए।")

Real-World Use Cases

Government and Civic Tech

India’s digital governance stack — from Aadhaar to UMANG to DigiLocker — serves hundreds of millions of citizens who interact in regional languages. Sarvam-105B fits naturally into:

  • Chatbots for scheme eligibility queries (PM Kisan, Ujjwala Yojana, etc.)
  • Automated processing of handwritten or scanned forms in regional languages
  • Legal aid tools that explain rights and procedures in local languages

EdTech

India has the world’s largest student population. Tutoring AI that can switch between English explanations and Hindi/Tamil clarifications within the same conversation removes a significant barrier for learners in tier-2 and tier-3 cities.

Financial Services

KYC document processing, loan application analysis, and customer onboarding often involve documents and conversations in multiple Indian languages. The model’s structured output and document understanding capabilities apply directly here.

Healthcare

Patient intake forms, discharge summaries, symptom triage — all of these generate text in regional languages in Indian hospitals. Automated summarisation and extraction that actually works in Marathi or Gujarati has real clinical value.


Comparing with Alternatives

Model              Indian Languages    Code-switching    Context (tokens)
────────────────────────────────────────────────────────────────────────
Sarvam-105B        22 (first-class)    Strong            128K
GPT-4o             English-primary     Moderate          128K
Claude 3.5 Sonnet  English-primary     Moderate          200K
Gemini 1.5 Pro     Hindi support       Moderate          1M
IndicBERT          10 (encoder only)   N/A               512
────────────────────────────────────────────────────────────────────────

The point of this comparison is not that Sarvam-105B beats GPT-4o at everything; it does not. For complex reasoning in English, general-purpose frontier models are still ahead. The comparison that matters is narrower: for Indian language applications, Sarvam-105B produces noticeably better output at lower cost, since API pricing is calibrated for Indian developers and startups.


Pricing and Rate Limits

Sarvam AI offers tiered pricing designed for Indian market economics:

Tier          Tokens/month    Rate limit       Price (per 1M tokens)
──────────────────────────────────────────────────────────────────────
Free          2M input        5 req/min        ₹0
Developer     50M input       60 req/min       ₹500 (~$6)
Startup       500M input      300 req/min      ₹3,500 (~$42)
Enterprise    Unlimited       Custom           Custom
──────────────────────────────────────────────────────────────────────

Check the current pricing on the Sarvam AI dashboard — these figures are approximate and subject to change.
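
To turn the table into a rough budget, the arithmetic can be sketched as below. The caps and prices are copied from the table above and are approximate. The sketch assumes the per-1M price applies uniformly to every token you use and that the monthly cap is the only thing separating tiers; neither is confirmed here, so treat the output as an estimate.

```python
# (tier name, monthly token cap, price in INR per 1M tokens), from the table above
TIERS = [
    ("free", 2_000_000, 0),
    ("developer", 50_000_000, 500),
    ("startup", 500_000_000, 3500),
]


def pick_tier(tokens_per_month: int):
    """Return (tier, estimated monthly cost in INR) for the smallest tier that fits."""
    for name, cap, price_per_million in TIERS:
        if tokens_per_month <= cap:
            return name, tokens_per_month / 1_000_000 * price_per_million
    # Above the Startup cap the table lists custom pricing only
    return "enterprise", None


print(pick_tier(20_000_000))  # ('developer', 10000.0)
```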


Common Issues and Fixes

Issue: Response contains mixed script artifacts

If you see garbled output when mixing scripts in the same prompt, be explicit about the expected output language:

# Add language instruction to system prompt
system = "Respond only in Hindi (Devanagari script). Do not mix scripts."
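
You can also check the response after the fact and retry when it contains an unexpected script. The helper below is a heuristic sketch using only the standard library: it takes the first word of each letter's Unicode name as the script. The function name is illustrative.

```python
import unicodedata


def scripts_used(text: str) -> set:
    """Return the set of scripts (by Unicode name prefix) of the letters in `text`."""
    scripts = set()
    for ch in text:
        if ch.isalpha():  # skips digits, punctuation, spaces, and combining marks
            name = unicodedata.name(ch, "")
            if name:
                scripts.add(name.split()[0])
    return scripts


print(scripts_used("नमस्ते"))
print(scripts_used("नमस्ते hello 123"))
```

A response that should be pure Hindi passes if `scripts_used(reply) <= {"DEVANAGARI"}`. For Hinglish support replies, mixed output is expected, so apply the check only where a single script is required.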

Issue: Rate limit errors (429)

Add exponential backoff:

import time

def complete_with_retry(client, messages, max_retries=3):
    for attempt in range(max_retries):
        try:
            return client.chat.complete(
                model="sarvam-105b",
                messages=messages
            )
        except Exception as e:
            if "429" in str(e) and attempt < max_retries - 1:
                wait = 2 ** attempt
                print(f"Rate limited. Waiting {wait}s...")
                time.sleep(wait)
            else:
                raise
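
The loop above waits 1s, then 2s. When many clients retry on the same schedule they tend to collide again, so a common variant caps the delay and adds random jitter. A minimal sketch of that variant; it is not part of the Sarvam SDK, and the function name and defaults are illustrative.

```python
import random


def backoff_delays(max_retries: int, base: float = 1.0, cap: float = 30.0, rng=None):
    """Yield one wait time (seconds) per retry: exponential, capped, with full jitter."""
    rng = rng or random.Random()
    for attempt in range(max_retries):
        ceiling = min(cap, base * 2 ** attempt)
        # Full jitter: pick a random delay between 0 and the current ceiling
        yield rng.uniform(0, ceiling)


for delay in backoff_delays(4):
    print(f"would wait {delay:.2f}s")
```

To use it, replace `wait = 2 ** attempt` in the retry loop with the next value from this generator.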

Issue: JSON extraction fails for some Indian language inputs

Use temperature=0 for extraction tasks and add a validation step:

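import json
from typing import Optional

def safe_json_extract(raw: str) -> Optional[dict]:
    """Parse model output as JSON; return None if it is not valid JSON."""
    # Strip a markdown code fence if present
    if "```" in raw:
        raw = raw.split("```")[1].strip()
        # removeprefix drops only the literal "json" tag (Python 3.9+);
        # lstrip("json") would strip any leading j/s/o/n characters
        raw = raw.removeprefix("json").strip()
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return None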
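
Parsing is only half of validation: a syntactically valid object can still have missing or malformed fields. The sketch below checks the four fields from the form example earlier in this article. The rules are assumptions for illustration (Indian mobile numbers are 10 digits starting with 6 to 9); adapt them to your own schema.

```python
import re

REQUIRED_FIELDS = ("name", "age", "address", "phone")
MOBILE_RE = re.compile(r"[6-9][0-9]{9}")  # 10-digit Indian mobile number


def _text(record: dict, field: str) -> str:
    """Field value as a stripped string; missing or null becomes ''."""
    value = record.get(field)
    return "" if value is None else str(value).strip()


def validate_form_record(record: dict) -> list:
    """Return a list of problems; an empty list means the record looks usable."""
    problems = [f"missing {f}" for f in REQUIRED_FIELDS if not _text(record, f)]
    phone = _text(record, "phone")
    if phone and not MOBILE_RE.fullmatch(phone):
        problems.append("phone is not a 10-digit mobile number")
    age = _text(record, "age")
    if age and not age.isdigit():
        problems.append("age is not a whole number")
    return problems


record = {"name": "सुनीता देवी", "age": "34", "address": "12, गांधी नगर, जयपुर, राजस्थान", "phone": "9876543210"}
print(validate_form_record(record))  # []
```

Returning a list of problems rather than raising lets you log every issue at once or feed them back to the model in a retry prompt.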

Getting Started Checklist

Setup
─────────────────────────────────────────────────────────
  [ ] Create account at sarvam.ai
  [ ] Copy API key to environment variable SARVAM_API_KEY
  [ ] pip install sarvam-python
  [ ] Run first API call — verify Hindi/English response

First Build
─────────────────────────────────────────────────────────
  [ ] Pick one of the five features above that matches your use case
  [ ] Run the code example with your own prompt
  [ ] Test with code-switched input (mix Hindi + English)
  [ ] Verify structured output extraction if you need JSON

Production
─────────────────────────────────────────────────────────
  [ ] Add exponential backoff for rate limit handling
  [ ] Store API key in environment — never in source code
  [ ] Choose the right tier for your expected token volume
  [ ] Monitor latency — enable streaming for user-facing features

The Bottom Line

Sarvam-105B is not trying to beat GPT-5 at MMLU benchmarks. It is trying to be the best model for building products that serve India — and for that narrower, more specific goal, it is genuinely well-suited.

If your application involves Indian users who communicate in regional languages, the model’s script-native tokenisation, code-switching fluency, and cultural grounding translate directly into better product quality. The API pricing makes it accessible for Indian startups in a way that frontier model pricing often is not.

For English-primary applications, the general frontier models are still the default choice. For anything India-first, Sarvam-105B should be your starting point.
