Gokul Karthik Kumar says his experience at the Mohamed bin Zayed University of Artificial Intelligence has prepared him well for a career in AI R&D
Ahead of Sunday’s commencement ceremony, computer vision master’s student Gokul Karthik Kumar spoke with Khaleej Times about his journey at the groundbreaking institution.
He said the university gave him the freedom to explore many areas of artificial intelligence (AI), including his passion for natural language processing.
“While my two-year major was in computer vision, my advisor supported me in pursuing projects in other fields such as natural language processing and speech processing, which was very fulfilling and helped me identify my current field of passion.”
A native of Tamil Nadu, India, Kumar emphasized that his experience at MBZUAI was “transformative” and prepared him for a future career in AI research and development.
“I learned from some of the most knowledgeable professors in AI. I also improved my research skills from problem identification to research proposal to presentation.”
Unique Artificial Intelligence Solution
Kumar has an extensive background in machine learning for text, image, speech and time series data and has worked with leading research organizations such as Microsoft Research India, TCS Research, MBZUAI and IIT Madras. He has won numerous hackathons, including the IEEE SLT 2022 International Hackathon in Qatar and eight national hackathons in the UAE and India, and he holds a US patent.
He has co-authored papers published at major conferences such as the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2023), the Annual Meeting of the Association for Computational Linguistics (ACL 2022), the Conference on Empirical Methods in Natural Language Processing (EMNLP 2022) and the International Joint Conference on Neural Networks (IJCNN).
After graduation, he will travel to Greece to attend ICASSP 2023, where he will deliver a presentation titled “Towards Building Text-to-Speech Systems for the Next Billion Users”.
“This project was initiated during my summer internship at Microsoft Research India, where I collaborated with my co-author, Praveen from IIT Madras. Our work involved a systematic evaluation of design choices for text-to-speech systems in order to publish state-of-the-art models for 13 Indian languages. Most open-source text-to-speech systems are available in English, but extending them to local languages could impact the masses, especially those who cannot read.”
Hate memes, Autodub
Kumar’s dissertation research explored effective representation methods for multilingual and multimodal data. His work addresses critical tasks such as question answering, hate meme classification, text-to-speech, and text-to-image retrieval. In today’s age of social media, where online bullying has become increasingly common, Kumar’s research is significant.
Hate memes, including those carrying hate speech against individuals on social media, pose a worrying challenge. While multiple techniques exist for classifying such memes, Kumar devised a straightforward method that effectively combines image and textual features to predict the likelihood that a meme is hateful. This can help social media platforms make informed decisions about what content they should and should not allow.
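For readers curious about what combining image and textual features can look like in practice, here is a minimal sketch of one common approach, often called late fusion. It is not Kumar’s actual method, which the article does not describe in detail. The feature vectors, weights and bias below are made-up toy values, and the function names `fuse_features` and `hate_probability` are hypothetical; a real system would obtain features from trained image and text encoders and learn the weights from labeled memes.

```python
import math
from typing import List, Sequence


def fuse_features(image_features: Sequence[float],
                  text_features: Sequence[float]) -> List[float]:
    """Concatenate image and text feature vectors into one fused vector."""
    return list(image_features) + list(text_features)


def hate_probability(fused: Sequence[float],
                     weights: Sequence[float],
                     bias: float) -> float:
    """Score the fused vector with a linear layer followed by a sigmoid."""
    if len(fused) != len(weights):
        raise ValueError("fused features and weights must have the same length")
    score = bias + sum(w * x for w, x in zip(weights, fused))
    return 1.0 / (1.0 + math.exp(-score))


# Toy inputs (assumed for illustration only, not real encoder outputs).
image_features = [0.2, 0.7]
text_features = [0.9, 0.1, 0.4]

# Hypothetical weights; in practice these would be learned from labeled data.
weights = [0.5, -0.3, 1.2, 0.0, 0.8]
bias = -1.0

fused = fuse_features(image_features, text_features)
probability = hate_probability(fused, weights, bias)
print(f"Estimated likelihood of hate: {probability:.3f}")
```

A platform could compare the resulting probability against a threshold of its choosing to decide whether a meme should be sent for human review.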
He was also involved in the co-development of the award-winning Autodub, a human-in-the-loop AI dubbing platform designed to remove language barriers in educational video content to enhance remote online learning in every corner of the globe. Autodub seamlessly integrates transcription, translation, voice-over and background audio separation to create accurate translations and promote accessibility for all. Since many educational videos are primarily in English, this can create barriers for non-native English speakers. Autodub offers a viable solution to this challenge.
“What really excites me about my future career is the opportunity to make a tangible impact. It would be amazing if I could develop something that would enhance the process and thus positively impact a large number of individuals. There are only a few fields or technologies that have the ability to create something that instantly captures widespread attention and sparks conversations in diverse communities.”
Joining G42 as a scientist
An avid follower of IPL cricket team Chennai Super Kings, he recalls as his favorite UAE memory seeing his team claim the season title in Dubai in 2021, which coincided with the start of his journey at the university. Even more exciting, just days before his graduation ceremony, his team won again.
His next challenge is to help develop large-scale language models in the UAE, a country that has made advancing AI a key priority.
“I will be joining G42’s Inception Institute of Artificial Intelligence (IIAI) as an applied scientist, where my focus will be on working collaboratively in teams to develop large-scale language models tailored for UAE-centric applications.”
Kumar was the first in his family to earn a master’s degree. He also holds a Bachelor of Information Technology from Anna University, Chennai. He is one of 59 computer vision, machine learning and natural language processing students graduating in 2023.