About CognitiveLab

We're on a mission to unlock the full potential of AI through innovative solutions that bridge the gap between human cognition and machine intelligence.

CognitiveLab

An Open Source First AI Research Lab Building from India, for the World.

CognitiveLab is an open-source first AI research and product lab founded in Bangalore in May 2023. Our core mission is to build impactful AI technology that solves real-world problems, with a strong focus on democratizing access and fostering innovation through open collaboration.

We develop cutting-edge models and tools that bridge the gap between research and practical applications across diverse domains. While we're known for our multilingual AI work (such as Ambari and Project Nayana), our vision extends to widely adopted open-source software that transforms how organizations work with data and AI, such as OmniParse (6,000+ GitHub stars), and to educational initiatives. Shortly after inception, we were selected for the Microsoft for Startups program.

To sustain our research and open-source contributions, CognitiveLab partners with businesses through consulting services, helping organizations of all sizes implement AI solutions that create tangible business value - from early-stage MVPs to enterprise-ready production systems.

CognitiveLab is dedicated to developing state-of-the-art AI models in India that create tangible impact globally. We prioritize open-source development to foster innovation, accelerate progress, and ensure accessibility. We balance our focus between critical multilingual/Indic projects and broadly applicable AI tools like data parsers and educational resources (e.g., AI Engineering Academy).

TL;DR - Our Major Achievements

  • Nayana
    Revolutionary multilingual, multimodal model supporting 22 languages with SoTA OCR capabilities
  • OmniParse
    6,000+ GitHub stars, 10,000+ monthly users, recognized as one of the fastest-growing open-source repositories
  • Ambari
    India's first bilingual Kannada-English LLM, evaluated by NVIDIA and Microsoft, featured by Meta
  • Meta Grant
    Received Llama Impact Grant for advancing multilingual AI capabilities

The Language Gap & Global Access

Despite India's linguistic diversity (22 official languages), AI development has historically overlooked many regional languages, creating a digital divide for over 500 million people who are not fluent in English.

CognitiveLab aims to bridge this gap by building high-quality AI for underserved languages, ensuring technological equity while simultaneously creating solutions that solve real-world problems across industries and domains.

The Resource Imbalance & Open Source Need

Access to resources and datasets for cutting-edge AI has been limited and often controlled by large entities with less focus on open community involvement and practical applications.

We champion an open-source approach to prove impactful AI can emerge from India with focused engineering and collaboration, empowering local researchers, developers, and organizations to build solutions that matter.

Our Broader Vision

Beyond addressing language barriers, CognitiveLab exists to tackle fundamental challenges in how AI technology is developed and deployed:

Practical Problem Solving

We're committed to developing AI tools that solve tangible business and social challenges - from enterprise data processing to educational access - rather than pursuing purely academic research.

Ethical & Inclusive Innovation

Our products are designed with inclusivity and accessibility as core principles, ensuring AI benefits aren't limited by language, geography, or economic status.

Community-Powered Advancement

We believe in creating platforms and tools that enable broader participation in AI development, allowing a diverse community to contribute to and benefit from technological progress.

From a bootstrapped initiative in May 2023 to securing international grants, our journey reflects our growing impact. Here are some key milestones:

May 2023

Founding & Microsoft for Startups

CognitiveLab founded as an open-source first research lab in Bangalore. Accepted into the Microsoft for Startups program around the same time.

January 2024

Ambari Launch

Released India's first bilingual Kannada-English LLM (Ambari), achieving SoTA performance with limited resources.

March 2024

Indic LLM Infrastructure

Launched tools and benchmarks like the Indic LLM Leaderboard to support Indic language AI development.

May 2024

OmniParse Launch

Released OmniParse, an open-source data parsing tool that quickly gained traction (6,000+ GitHub stars).

September 2024

Nayana OCR @ NAACL

First paper on Nayana OCR accepted at the prestigious NAACL conference workshop.

April 2025

Meta Llama Impact Grant

Awarded the grant from Meta (Llama Impact Grant) to advance multilingual AI (Project Nayana). Public announcement on April 29, 2025.

Present Day

Ongoing Research

Continuing work on Nayana, OmniParse, Indic infrastructure, and exploring new frontiers in open-source AI.

01
Ambari

Ambari, India's first bilingual Kannada-English LLM, set a new benchmark as the SoTA model at the time of its launch. Trained on Azure's infrastructure with a modest budget of just $1,000, it showed that powerful AI can emerge even with limited resources.

SoTA bilingual Kannada-English model at launch time
Featured in Meta's keynote at the Build with AI summit
Evaluated in research papers by NVIDIA and Microsoft
Highlighted in official posts on India.gov.ai

02
OmniParse

An open-source tool designed to ingest and parse any type of data into a structured format. With 6,000+ GitHub stars and 10,000+ developers using it monthly, it's rapidly gaining traction in the AI space.

6,000+ GitHub stars and 10,000+ monthly users
Featured on prominent tech blogs like MarkTechPost
Gained 3,000 GitHub stars in just 2 days after launch
10K+ monthly Docker pulls by developers worldwide
Recognized as one of the fastest-growing open-source repositories of Q3 2024

03
Indic LLM Infrastructure

We've developed several tools to support Indic language AI development, including the Indic LLM Leaderboard, Indic Eval, and Indic Tokenizer.

Standardized benchmarking platform for Indic language models
Entire infrastructure hosted on Azure for scalability and reliability
Part of the Leaderboard Mission at People+AI
Featured by NASSCOM as Tech Maverick innovation

04
Project Nayana

A revolutionary multilingual, multimodal, multitask language model that supports 22 languages across text, audio, and vision.

Supports 22 languages with text, audio, and vision capabilities
Nayana OCR accepted at the prestigious NAACL Workshop
SoTA OCR model in 10 different Indic languages
Received the Llama Impact Grant from Meta
Available on Hugging Face for easy access and use

05
Llama Impact Grant

In recognition of our work, particularly with Project Nayana and the Indic LLM Leaderboard, CognitiveLab was awarded the prestigious Llama Impact Grant by Meta (public announcement: April 29, 2025). This significant support will accelerate our efforts in advancing multilingual and multimodal AI for diverse languages.

How We'll Utilize the Grant

Advancing Project Nayana and Indic language AI

  • Expand language coverage across all 22 official Indian languages
  • Deepen multimodal capabilities connecting text, audio, and visual modalities
  • Develop high-quality training data for low-resource languages
  • Enhance the Indic Tokenizer to better handle morphological richness
  • Create inference optimization techniques for deployment on constrained hardware

CognitiveLab's commitment to open-source AI is deeply rooted in our foundational philosophy of democratizing artificial intelligence. We believe that open source has the most direct and widespread impact, benefiting both developers and end-users.

Accessibility for All

Open-source AI fundamentally aligns with our mission to make advanced AI technologies accessible across diverse communities, especially in regions where language barriers have historically limited technological inclusion.

Democratization of Innovation

By embracing open-source models, we're enabling developers, researchers, and organizations throughout India and beyond to build on powerful foundations without prohibitive costs.

Community-Powered Development

The vibrant ecosystem around open-source models has accelerated our progress through collaborative debugging, shared improvements, and collective problem-solving that proprietary approaches simply cannot match.

Indigenous Innovation Focus

Open-source allows us to develop locally relevant AI solutions that address uniquely Indian challenges while contributing to the global AI ecosystem.

Opportunities

Join our team or support our research

Join our Team

Application Process

Please fill out the form below to show interest in our open positions. We will review your application and get back to you within 2-3 weeks.

Support Our Research

If you like our research and would like to sponsor our projects and open source initiatives, please get in touch. Your sponsorship will greatly help us continue developing innovative solutions and advancing the field of AI.

  • Support cutting-edge AI research
  • Contribute to open source development
  • Help make AI accessible to everyone

Get In Touch

Questions, ideas, or collaborations? Reach out; we're all ears!

Transforming Enterprises with AI Solutions at Scale

© 2025 CognitiveLab. All rights reserved.