The School of Computer Science is pleased to present…
MSc Thesis Defense by: Darpan Khanna
Date: Thursday, April 3, 2025
Time: 1:30 pm
Location: CS51 (Chrysler Hall South)
Accurate prediction of protein interaction sites is key to understanding the underlying mechanisms of many biological processes. Most state-of-the-art methods for predicting protein interaction sites accomplish this task by leveraging sequence features and incorporating spatial neighbourhood information. Some methods capture local and global structural, evolutionary, and sequence-based features; however, they overlook physicochemical properties that are crucial in protein interactions and struggle with feature redundancy, which limits their overall performance. To address these gaps, we aim to enhance the predictive power of graph neural networks by introducing new feature sets that significantly improve model performance. We propose an enhanced model, called Prediction of Protein Interactions based on Solvent Accessible Surface Area, Hydrogen-Bonding Propensity, and Electrostatic Potential Sites (PPISHES), which incorporates three fundamental physicochemical features (electrostatic potential, hydrogen-bonding propensity, and solvent-accessible surface area) to enrich the feature representation of protein structure, thereby improving site prediction for both obligate and non-obligate complexes. To enhance interpretability and feature selection, we applied feature ablation analysis, systematically masking individual features to identify those most important to the model. We evaluated PPISHES on widely used benchmark datasets, achieving a substantial improvement in the Area Under the Precision-Recall Curve (AUPRC), with increases of up to 42.8% and 29.3% for Test_315 and Test_71, respectively. PPISHES was also evaluated on key metrics including accuracy, precision, recall, area under the curve, and Matthews correlation coefficient, confirming its superior overall performance and surpassing current state-of-the-art methods.
Internal Reader: Dr. Alioune Ngom
External Reader: Dr. Kenneth Ng
Advisor: Dr. Luis Rueda
Chair: Dr. Muhammad Asaduzzaman