The School of Computer Science is pleased to present…

Optimizing LLMs: Harnessing Core Sub-Models for Efficient Training on New Tasks

MSc Thesis Defense by: Prithvi Rao Muthineni


Date: 07 June 2024

Time: 12:00 pm

Location: Essex Hall, Room 122


Abstract: Advances in natural language processing (NLP) have been pivotal in extending AI's capacity to understand human language. Large Language Models (LLMs) such as BERT, GPT, and RoBERTa exemplify this progress, setting new benchmarks across a spectrum of NLP tasks, including sentiment analysis. However, deploying such models in practice poses significant computational obstacles, as their large architectures demand substantial processing power and memory. Moreover, growing concern over the environmental impact of training large-scale NLP models underscores the need for sustainable AI development that mitigates the carbon emissions associated with training.

This thesis addresses the computational and practical impediments of deploying large-scale NLP models, focusing on the optimization of RoBERTa and ELECTRA. The research optimizes these models through novel model-pruning techniques and strategic fine-tuning, guided by the notion of a core sub-model embedded within trained LLMs. This core sub-model encapsulates generic language properties shared across NLP tasks and serves as a foundation for fine-tuning on new tasks. Because fine-tuning then updates only this subset of parameters, the process becomes substantially more efficient while preserving or improving task performance.
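The abstract does not specify the pruning criterion or the fine-tuning procedure; the following is a minimal sketch of the core-sub-model idea, assuming magnitude-based pruning and PyTorch with Hugging Face Transformers. The roberta-base checkpoint, the SPARSITY level, and the mask_grads helper are illustrative assumptions, not the thesis's actual method.

import torch
from transformers import RobertaForSequenceClassification

model = RobertaForSequenceClassification.from_pretrained("roberta-base", num_labels=2)

# 1. Identify a "core sub-model": within each encoder weight matrix, keep the
#    largest-magnitude weights and zero out the rest (illustrative 50% sparsity).
SPARSITY = 0.5  # fraction of encoder weights pruned away (assumption)
masks = {}
for name, param in model.roberta.encoder.named_parameters():
    if param.dim() < 2:          # skip biases and LayerNorm vectors
        continue
    k = int(param.numel() * SPARSITY)
    threshold = param.abs().flatten().kthvalue(k).values
    mask = (param.abs() > threshold).float()
    masks[name] = mask
    with torch.no_grad():
        param.mul_(mask)         # weights outside the core are set to zero

# 2. Fine-tune on a new task while updating only the surviving subset:
#    re-applying the masks to the gradients keeps pruned weights frozen at zero.
def mask_grads():
    for name, param in model.roberta.encoder.named_parameters():
        if name in masks and param.grad is not None:
            param.grad.mul_(masks[name])

In a training loop, mask_grads() would be called after loss.backward() and before optimizer.step(), so that only the core sub-model's weights, plus the task-specific classification head, are updated during fine-tuning on the new task.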


Thesis Committee:

Internal Reader: Dr. Olena Syrotkina

External Reader: Dr. Mohammad Hassanzadeh

Advisor: Dr. Robin Gras

Chair: Dr. Dan Wu

[Vector Institute logo]

MAC STUDENTS ONLY - Register here