About
BharatGen is a consortium under the Technology Innovation Hub at IIT Bombay and India's first government-funded multimodal Large Language Model (LLM) initiative for Indian languages, supported by the Department of Science and Technology (DST) under the National Mission on Interdisciplinary Cyber-Physical Systems (NM-ICPS). Officially launched in October 2024, BharatGen is developing India's sovereign AI ecosystem with foundational AI models fluent in understanding and speaking 22+ Indian languages, dialects, and cultural nuances. The initiative is building multimodal generative AI models across text, speech, and computer vision to revolutionize public service delivery, citizen engagement, and digital access across governance, agriculture, education, healthcare, and legal sectors. BharatGen is creating the world's largest India-centric dataset (Bharat Data Sagar) focusing on underrepresented Indian data tied to languages, culture, history, and philosophy. The organization is also investing in talent development through MTech/PhD funding, AI courses, hackathons, and workshops to position India as a global AI leader.
Mission
To develop and deploy advanced AI solutions that empower every Indian to connect, communicate, and innovate effortlessly by building India's first sovereign, multilingual, and culturally-rooted artificial intelligence ecosystem that serves as a public good for governance, agriculture, education, healthcare, and social equity across the nation.
Focus areas
Data Platform Engineer
Closed
We're seeking a skilled Data Platform Engineer to build scalable tools, platforms, and pipelines tailored for processing large-scale, multilingual, multimodal datasets critical for foundational AI models. In this role, you will build scalable data pipelines to ingest, transform, and prepare data from diverse sources (text, speech, images, and video), making it ready for Generative AI model training. Your work will involve developing and managing the underlying platform while addressing challenges like governance, security, observability, lineage, and scalability. The outcomes of your work will include efficient tools for data processing, a reliable data platform, and high-quality datasets tailored to the evolving needs of large-scale AI and LLM training.
Linguistic Data Operations Manager
Closed
The Linguistic Data Operations Manager will be responsible for scaling and managing a pool of ~200 freelancers across 22 Indian languages and for setting up and running data validation and model evaluation workflows in collaboration with Linguist leads for different models. This role requires strong project/operations management skills, with enough linguistic awareness to design processes that support annotation, evaluation, and corpus digitization at scale. This work will directly contribute to making India's AI ecosystem linguistically inclusive and globally competitive.
Senior Linguist (Speech)
Closed
The Speech Linguist Lead will own both pretraining data quality validation and model output evaluation for BharatGen's speech technology efforts. You will design validation frameworks for large-scale ASR/TTS datasets, define linguistic and acoustic quality standards, and evaluate model outputs for intelligibility, fluency, and naturalness. This role requires a linguist or speech technologist who can bridge linguistic theory, acoustic data understanding, and operational execution, collaborating closely with ML engineers, data collection teams, and freelance linguists.
Senior Linguist (Text LLM – Model Evaluation)
Closed
The Text LLM Model Evaluation Lead will own the end-to-end process of evaluating BharatGen's text-based large language models. You will design human evaluation frameworks, test sets, rubrics, and metrics that assess model outputs across multiple tasks and languages. Working closely with ML engineers, linguists, and data operations, you'll ensure that every model iteration is measured with rigor, fairness, and linguistic precision.
Agentic AI Engineer
Closed
We are looking for an Agentic AI Engineer to join our growing team to design and develop agentic AI systems that can plan, reason, and act. This includes single- and multi-agent AI systems and orchestration of workflows that involve retrieval-augmented generation (RAG), contextual awareness, reasoning, tool calling, and inter-agent communication. This role is ideal for software engineers with AI/ML development experience and a passion for transforming generative AI models into actionable, goal-driven systems capable of solving complex, real-world business problems.
Generative AI Engineer
Closed
We are looking for a Generative AI Engineer to join our growing team to help design, build, fine-tune, and deploy cutting-edge generative AI models and agentic systems. You will work on the full lifecycle of foundational model development, involving both large and small language models (LLMs and SLMs), to create scalable AI solutions that address diverse needs across different business domains. This role is ideal for proactive individuals with a strong foundation in machine learning and an experimental mindset, who are passionate about driving transformative advancements in generative AI from research to real-world production impact.
AI Full Stack Developer
Closed
As an AI Full Stack Developer, you will have the opportunity to build end-to-end applications that leverage generative AI models and agentic architectures to address real-world business needs across different domains. You will design and develop autonomous and human-in-the-loop agentic AI systems, web and mobile user interfaces, and scalable backend services and APIs. This role is ideal for software engineers who are passionate about crafting elegant user experiences, enjoy working across the stack, and want to convert generative AI capabilities into production-ready solutions.
AI Evaluation & Test Engineer
Closed
We are looking for an AI Evaluation & Test Engineer to join our growing team to ensure that our generative AI models and applications are safe, accurate, trustworthy, and deliver an elegant user experience. You will serve as the first customer of our AI systems. This role is ideal for product-minded engineers who obsess over product quality and customer-centricity, and are passionate about shaping the behavior of AI systems in the real world.
