
Structural data science methods and software to study immunotherapeutic proteins

$151,389 | ZIH | FY2022 | CA | NIH

Division of Basic Sciences - NCI

Abstract

We have developed highly efficient, interactive web-based software for molecular visualization and structural analysis (iCn3D) [Wang et al. 2020, Wang et al. 2022] through an initial collaboration with the NCBI structure group. We successfully applied the software to study the structure and interactions of viral proteins with cell surface receptors [Youkharibache et al. 2020]. The iCn3D software is now becoming a collaborative research platform, as demonstrated by our recent sequence-structure analysis of SARS-CoV-2 and other beta coronaviruses, in which we identified specific sequence-structure micro-homologies in the receptor binding domain/motif (RBD/RBM) supersecondary structures of coronaviruses from SARS to MERS, OC43, HKU1, HKU4, and MHV [Youkharibache et al. 2020] that could be targeted by neutralizing antibodies or other therapeutic molecules. While performing this analysis, we also proposed corrections to structures whose errors were hiding sequence homologies, demonstrating the value of an integrated analysis approach for improving structures. We have implemented an innovative data sharing capability in iCn3D through a FAIR (Findable, Accessible, Interoperable, Reusable) mechanism. In fact, we go further than data sharing: entire analysis protocols are embedded in sharable permanent links for reproducibility, extensibility, and collaborative research. As the software becomes cross-disciplinary, it is also becoming a platform for integrating diverse data streams. Software development itself is evolving into a collaborative, open-source hub, with new development groups joining from both the intramural and extramural communities and collectively reaching out to a broader developer community through hackathons [https://www.iscb.org/ismb2020-program/ismb2020-hackathon] co-organized with intramural and extramural collaborators.
The fundamental basis of my research has been the study of the self-association determinants of molecular systems, especially proteins, as revealed by their structural symmetries at several levels of molecular organization [Youkharibache 2019; Youkharibache, Tran, and Abrol 2020]. The software we are developing to study molecular interactions, and the applications we are now tackling, are beginning to capture this vision. We are exploring initial implementations of symmetry analysis as a data-organizing mechanism, with the aim of developing therapeutics based on knowledge of molecular interactions. For example, while the symmetries of antibody heavy and light chains are well known, the individual immunoglobulin (Ig) domains themselves consist of intrinsically pseudo-symmetric protodomains [Youkharibache 2019], a largely ignored property that can open new routes to antibody engineering, especially for nanobodies. At the same time, many of the cell surface protein receptors, from T-cells to their target cells (TCRs, CD4, CD8, CD28, CTLA4, PD1, PDL1, etc.), are composed of Ig domains interacting through oligomeric pseudo-symmetric arrangements, which reveal the determinants of protein domain association in general and of Ig domain association in particular. We are assembling an Ig-centric database that will provide invaluable data for designing new Ig-based immunoreceptors and inhibitors. The Ig domain is by far the most common structural fold of the immunome, and its pseudo-symmetric assembly patterns are an invaluable guide to understanding and designing inhibitors and modulators. There are, however, other important folds on cell surfaces (GPCRs, MFS, SLCs, etc.) that are used as receptors for immune cell interactions, metabolic modulation, or viral entry. We have demonstrated that a wide range of polytopic membrane proteins, including GPCRs and SLCs, are indeed formed through a pseudo-symmetric assembly mechanism [Youkharibache, Tran, and Abrol 2020].
Second only to Ig-based proteins, GPCRs represent the most important subset of molecular scaffolds in the cell surfaceome/immunome, and SLCs are also high on the list. We had established earlier that protein domain pseudo-symmetries are found in 20% of known structures overall, that quasi-symmetry is found in a higher proportion of integral membrane proteins [Youkharibache, Tran, and Abrol 2020], and we are now seeing an even higher percentage across the proteins of the surfaceome, especially on immune cells. Our symmetry analysis gives us a decoding framework for studying molecular interactions, and we are actively developing methods and databases that can enable the design of new Ig-based receptors as anti-cancer therapeutics based on these ideas. The characterization of anti-CD19 and anti-BCMA CARs based on flexibility analysis has enabled us to support observations in ongoing clinical trials [Brudno et al. 2020]; at the same time, we have observed the spontaneous rearrangement of a CAR-T scFv in a crystal [PDB ID: 7JO8; Cheung et al. 2020], mediated by the quasi-symmetry of Ig domain association [Youkharibache 2019]. We are currently developing an algorithm to detect and characterize flexible parts of proteins and protein complexes in order to study protein folding and unfolding and conformational changes and, most importantly for some of our applications, to relate flexibility to its underlying sequence-structure determinants. We are also developing an annotated immunoprotein database that gathers all known structures containing interacting immunoglobulin domains, in order to study the interfaces at the heart of immune synapses between cells, primarily those involving T-cells and their receptors.
