How Google is trying to standardize Indian-English

A new feature on Google search allows users to learn an English word’s Indian pronunciation. But how does one standardize a language that’s spoken differently in different parts of the country?

A schoolboy in an English class in Rajasthan (iStock)

Indian internet users might have noticed a new feature Google added to its website last year. Every time you look up the pronunciation of a word, Google gives three options: one in British, one in American and one in Indian English.

Try ‘hair’. According to Google, it is pronounced ‘hei·uh’ in Indian-English, ‘heuh’ in British, and ‘hehr’ in American.

There are many more words like this: ‘plumber’, ‘flour’, ‘fair’, ‘brinjal’, ‘develop’ and so on. Each is pronounced differently in Indian-English than in American or British English. For each of them, along with the phonetic guide, a Google search for the pronunciation also shows a digital face that mimics the lip movements, to help the user say the word ‘correctly’.

It is an experiment in standardizing pronunciation of Indian-English, which itself is spoken differently across the country. There is literature documenting the evolution of Indian-English and its cultural capital, all the way back to the 19th century, but so far there has never been an effort to standardize its pronunciation. So how is Google going about it? And what prompted it to?

It started with building on the older ways of representing pronunciation, says Partha Talukdar, staff research scientist at Google Research, Bengaluru, and head of a group focused on Natural Language Understanding. “Earlier, pronunciations were done using a symbol system of IPA or international phonetic alphabet,” he adds. Dictionary users would recognize it as the phonetic notation listed next to every word. For example, the word ‘complicated’, as per the IPA, is pronounced ‘kɒmplɪkeɪtɪd’.



“The IPA isn’t accessible for most non-experts or non-linguists,” says Talukdar. “So one of the things we did was to make it more readable with the help of linguists. We also developed the lip movements using an internal technology.”

This greatly simplified how one understood phonetics of a word. If you look up ‘complicated’ on Google, it breaks down the pronunciation to ‘kawm·pluh·keit·uhd’.

It also enabled Google’s AI systems to read and pronounce words. “Then it was left to the voice part,” says Talukdar. Google hired a few male and female voiceover artists to record audio and speech samples. Machine learning systems extrapolated from these samples and applied what they learned to a broader set of words. “We [Google’s AI systems] could then take each one of those phonetic symbols and pronounce it in an Indian-English way.” Each word, Talukdar adds, is vetted by a team of linguists. “Only human vetted words where we’re confident that quality is good are those that finally surface,” he adds.

The conflict begins when one realizes there isn’t, and has never been, a single variant of Indian-English. Most Indian-English pronunciations on Google sound like how a north-Indian TV news anchor would say them. In a way, it’s similar to the accent preferences in the United Kingdom: until a few decades ago, the BBC allowed only RP accents (Received Pronunciation, also known as the Queen’s English) on its airwaves, which later became synonymous with the ‘posh’ British accent as we know it.



“This is a challenge,” admits Talukdar. “For all of variations we want to cover, if we want to take help of voice artists for different dialects, that’s not scalable. So the idea is, how can we take one representative database and adapt it to different situations, or dialects of a language using data collected from artists and making some technical alterations to that. That’s in the realm of research right now.”

Google says its reason for doing this is to promote language literacy, but it has just as much to do with India emerging as a market for global tech firms. India has 700 million internet users, and the number is rising fast. As per the 2011 census, there were nearly 127 million English speakers in India, including those who spoke it as a second or third language. Most higher education in India is conducted in English, so proficiency in the language is seen as a path to better career prospects. It has also led makers of AI (artificial intelligence) assistants, like Google Assistant, Amazon’s Alexa and Apple’s Siri, to develop voice programmes so their Indian users can connect with them in the dialect they are most familiar with: Indian-English.



To linguist Ranjan Kumar Auddy, such initiatives represent how far Indian-English has come over the years. In 2019, Auddy published the book In Search of Indian English: History, Politics and Indigenisation, an account of the development of this variety of English in colonial India.

“Until a few decades ago, the way Indians wrote and pronounced the English language was looked down upon,” he says. “They were considered as something not genuine. I remember reading an interview once where Anita Desai said that she sometimes felt she would be among the last generation of Indian writers to write in English.”

But things changed in the 1980s and 1990s, when Salman Rushdie’s Midnight’s Children and Arundhati Roy’s The God of Small Things achieved international acclaim. With economic reforms in 1991, the Indian markets opened up. With the turn of the century, the internet made borders fluid and English became the lingua franca in several regions of the world.

Soon, the concept of mother tongue or native speaker lost its reverence. The English language, too, started incorporating cultural flavours of the different regions in the world where it was spoken. This had only been a matter of time. In 1975, Nigerian novelist Chinua Achebe had already spoken of English taking on regional flavours as an artistic necessity: “I feel that the English language will be able to carry the weight of my African experience. But it will have to be a new English, still in full communion with its ancestral home but altered to suit its African surroundings.”



So while Indian English has gained widespread acceptance, Auddy agrees that it is difficult to standardize one variant of it. There will be some resistance from purists, too, he says. India does, after all, carry colonial baggage and often promotes British pronunciations, albeit in Indian accents. But in the end, he adds, what someone says will matter more than how they say it.

“A self-taught person cannot appropriate the so-called standard pronunciation,” he says. “But that does not mean she or he will be a less efficient user of the language. After some years, with change of time, perhaps we will not be bothered about the purity of pronunciation. And to me, it’d be a perfectly desirable change.”

