January 24, 2024

The latest version of OpenAI's chatbot, ChatGPT-4, performs well in English but struggles in other languages, scoring just 62% when tested in Telugu, an Indian language. This is because large language models are typically trained on text that is predominantly English. Poor performance in “low-resource” languages, those with relatively little digital text available for training, is an obstacle for those hoping to export AI to developing countries. Efforts to make AI more multilingual include developing machine-translation software and tokenizers optimized for specific languages. Researchers are also working to improve the datasets used to train language models and to adjust models after training. Obstacles remain, however, such as high illiteracy rates in some regions and a preference for voice messages over text. Despite these challenges, teaching AI to understand multiple languages is seen as a positive development.

