Pattern Identification in Sequence Activity Data
National Institute Of Diabetes And Digestive And Kidney Diseases
Abstract
Current models for inferring co-evolutionary contacts from aligned protein sequence data use the principle of maximum entropy (ME), which assumes that the sequences are in equilibrium with respect to the evolutionary forces acting on them. This assumption raises two issues. First, HIV protein sequences cannot accurately be described as being in equilibrium, especially when the aim is to discover possible dynamic, mutation-driven escape pathways. Second, epistatic interactions are not necessarily symmetric, because proteins do not exist in isolation but must interact with other biological macromolecules. Two interacting residues therefore need not influence each other equally; for instance, one might make a structural contribution to the protein's function and the other a functional one. To overcome these limitations, we developed a methodology that describes the same physical system in a non-equilibrium state. This method, termed Expectation Reflection (ER), has been shown to improve upon the maximum-entropy approach and to successfully characterize epistatic interactions within the SARS-CoV-2 genome. The ER approach directly computes the conditional probability of observing a mutated amino acid at a position, given the rest of the sequence, in the context of a population of sequences. With this conditional probability we can directly calculate the likelihood of a specific residue, such as a drug resistance mutation (DRM), in the context of a given population. Working with the conditional probability also means that the inferred pairwise interactions are not necessarily symmetric. Ultimately, this methodology gives a more tractable, and more biologically appropriate, theoretical description of the transition energy of a given mutation, which is the basis of several analyses of DRMs in previous work.
Dr. Kearney has indicated to us that, at this point, the most valuable contribution of these analyses would be to help characterize broadly neutralizing antibodies. The success of antiretroviral treatment in people who comply with the drug regimen implies that, in these individuals, the mutation rate of the virus does not overwhelm the immune system. In other words, the therapeutic focus needs to shift to the development and characterization of broadly neutralizing antibodies for the population that is not adequately protected by the drug regimen. We are therefore focusing our work on the viral envelope glycoprotein Env.
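The idea of a per-position conditional probability with asymmetric pairwise interactions can be illustrated with a minimal Python sketch. This is not the ER inference procedure itself, which the abstract does not specify; it only assumes a softmax-style conditional model, and the function name `conditional_probs`, the `fields` and `couplings` containers, and all numeric values are hypothetical choices made for illustration.

```python
import math

ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"  # 20 amino acids plus the alignment gap symbol


def conditional_probs(seq, i, fields, couplings):
    """Return P(residue at position i = a | rest of seq) for every a in ALPHABET.

    Assumed (hypothetical) parameterization, not taken from the source:
      fields[i][a]              local bias for residue a at position i
      couplings[(i, j)][(a, b)] influence of residue b at position j on
                                residue a at position i
    Entries (i, j) and (j, i) are stored separately, so the two directions
    of an interaction are free to differ.
    """
    scores = []
    for a in ALPHABET:
        score = fields.get(i, {}).get(a, 0.0)
        for j, b in enumerate(seq):
            if j != i:
                score += couplings.get((i, j), {}).get((a, b), 0.0)
        scores.append(score)

    # Softmax over the alphabet, shifted by the maximum for numerical stability.
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return {a: w / total for a, w in zip(ALPHABET, weights)}


# Toy example: position 2 influences position 0, but not the other way round.
seq = "MKV"
fields = {}
couplings = {(0, 2): {("M", "V"): 2.0}}

p0 = conditional_probs(seq, 0, fields, couplings)  # conditioned on K at 1, V at 2
p2 = conditional_probs(seq, 2, fields, couplings)  # conditioned on M at 0, K at 1
```

In this toy setup, `p0["M"]` is raised above the uniform baseline because the valine at position 2 favors methionine at position 0, while `p2` stays uniform because no coupling was defined in the reverse direction. Storing the two directions under separate keys is the design choice that lets the sketch express the asymmetry the abstract describes.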