
India’s homegrown AI: Tech Mahindra's 'Project Indus' could bridge the language gap that plagues ChatGPT | Interview

Tech Mahindra's 'Project Indus' pioneers India's Indic LLM, with insights from Nikhil Malhotra, Global Head at Maker's Lab, emphasising responsible AI and bias mitigation.

Tech Mahindra unveils 'Project Indus' Indic-based LLM
New Delhi, UPDATED: Sep 16, 2023 15:24 IST

Highlights

  • 'Project Indus' pioneers India's Hindi-focused LLM, democratising tech for non-English speakers
  • This story features an exclusive interview with Nikhil Malhotra, the Chief Innovation Officer of Tech Mahindra, who shares insights into Project Indus, its significance, and responsible AI practices
  • Project Indus employs a strong strategy, including a feedback loop and data cleansing, to ensure ethical AI with minimal bias and inaccuracies

In a remarkable advancement within India's AI sphere, Tech Mahindra has officially unveiled 'Project Indus,' the nation's first-ever Large Language Model (LLM) with a primary focus on Hindi and its diverse array of dialects.

Spearheading this ambitious endeavour is Nikhil Malhotra, the Chief Innovation Officer and the Global Head of Maker's Lab, a unique incubation space within Tech Mahindra. With over two decades of experience in the field, Nikhil also holds the prestigious title of World Economic Forum AI Fellow, where he dedicates his expertise to advancing responsible AI and Quantum ethics.


We got in touch with Malhotra for insights into the first-ever Indic-based foundational model.

Addressing the Language Gap

Project Indus, as described by Nikhil Malhotra, is a civilisational initiative tailored for India. The project has two clear objectives: first, to construct a language model deeply rooted in Indian culture, and second, to excel on prevailing benchmarks.

The team is currently considering a model size of 7 to 14 billion parameters, with the potential to extend to 40 billion if necessary. According to Malhotra, the parameter count is crucial for the model's versatility and its ability to cater to India's linguistic diversity.


In essence, this pioneering effort aims to bridge the massive language gap that persists in India, where only 10-20 percent of the population converses in English.

Unlocking opportunities with Indic LLM

The choice of an Indic LLM is not coincidental. While Large Language Models (LLMs) have predominantly favoured languages like English, German, French, and Spanish, India's linguistic diversity often goes unnoticed. Hindi, the third most spoken language worldwide, has been Tech Mahindra's initial focus.

However, the project's ultimate goal is to expand to other languages and dialects. This strategic decision could democratise technology and extend its reach to the last mile.

Advantages of the Indic LLM model

The advantages associated with the Indic LLM model are numerous. It promises to be culturally sensitive, aligning with local customs and norms, while also democratising AI for non-English speakers in India. Moreover, its versatility makes it invaluable for specialised industries like healthcare, retail, and tourism. Importantly, it offers a cost-effective solution for generating content in Indic languages.

Data collection for diverse dialects

Collecting data across the many dialects Project Indus aims to cover has been a challenge. The "Bhasha Daan" initiative encourages individuals to contribute data in their native languages. The datasets are drawn from sources such as Common Crawl, newspapers, wikis, and domain-specific material.


This diverse dataset ensures that the model is trained on a wide range of data types, enhancing its performance.

However, obtaining dialect-specific data poses a significant challenge because most websites predominantly offer content in mainstream languages.

Nikhil Malhotra, Global Head at Maker's Lab

Addressing inaccuracies & bias

Project Indus has a two-pronged approach to address inaccuracies and bias. It contextualises data and establishes guidelines for accurate and appropriate responses. Fact-checking measures are in place, along with a reinforcement learning feedback loop involving users to improve responses.

To mitigate bias, data cleaning is conducted during the initial collection phase, using a combination of human annotation and automated techniques.

AI as a job creator

Regarding concerns about AI's impact on jobs, Nikhil Malhotra believes that AI can create more job opportunities than it eliminates. As technology integrates into various sectors, new opportunities requiring new skill sets will emerge, leading to a more dynamic job market.

I believe that AI can create more job opportunities rather than eliminating them. With technology becoming a vital part of everything, it can lead to numerous new opportunities requiring new skill sets.

Nikhil Malhotra, Global Head at Maker's Lab

Project Indus stands as a pioneering initiative that promises to reshape the AI landscape in India. With a strong focus on linguistic diversity and responsible AI, Tech Mahindra is charting a course toward a more inclusive and technologically advanced future for the nation.

Published on: Sep 16, 2023 15:24 IST
Posted by: Minaal, Sep 16, 2023 15:24 IST
