Classification Algorithms for Chemical Compounds

$283,196 · R01 · FY2005 · LM · NIH

University of Minnesota Twin Cities, Minneapolis, MN

Linked publications & trials

Paper 19764745, Paper 19566958, Paper 19072298, Paper 18402435, Paper 17042943

Abstract

Computational techniques that build models to assign chemical compounds to classes of interest have extensive applications in pharmaceutical research and are used at various phases of the drug development process. These techniques address a number of classification problems, such as predicting whether a chemical compound has the desired biological activity, determining whether it is toxic or non-toxic, and filtering large compound libraries for drug-like compounds. The overall goal of this proposal is to develop substructure-based classification algorithms for chemical compound datasets. The key elements of these algorithms are that they (i) use highly efficient substructure discovery algorithms to mine the chemical compounds and discover all substructures that can be critical for the classification task, (ii) use multiple criteria to generate a set of substructure-based features that simplify the compounds' representation while retaining and exposing the features responsible for the specific classification problem, and (iii) build predictive models by employing kernel-based methods that take into account the relationships between these substructures at different levels of granularity and complexity, as well as information provided by traditional descriptors.
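The three elements described in the abstract can be illustrated with a minimal Python sketch. It is not the project's actual algorithm: labeled paths stand in for general substructure mining, a support-and-class-rate filter stands in for the multiple selection criteria, and a min-max (Tanimoto-style) kernel with a mean-similarity rule stands in for a kernel method such as an SVM. Traditional descriptors are omitted. All function names, molecules, and class labels below are hypothetical toy data chosen only for illustration.

```python
from collections import Counter
from itertools import chain


def path_fragments(atoms, bonds, max_len=3):
    """Count labeled simple paths of up to max_len atoms (a stand-in for substructure mining)."""
    adj = {i: set() for i in range(len(atoms))}
    for a, b in bonds:
        adj[a].add(b)
        adj[b].add(a)
    counts = Counter()

    def extend(path):
        # Count each undirected path once: only in the direction whose start index is smaller.
        if path[0] <= path[-1]:
            fwd = "-".join(atoms[i] for i in path)
            rev = "-".join(atoms[i] for i in reversed(path))
            counts[min(fwd, rev)] += 1  # canonical label, independent of direction
        if len(path) < max_len:
            for nxt in adj[path[-1]]:
                if nxt not in path:
                    extend(path + [nxt])

    for start in range(len(atoms)):
        extend([start])
    return counts


def select_features(frag_counts, labels, min_support=2):
    """Keep fragments that are frequent enough and occur at different rates in the two classes."""
    pos = [fc for fc, y in zip(frag_counts, labels) if y == 1]
    neg = [fc for fc, y in zip(frag_counts, labels) if y == 0]
    selected = []
    for frag in sorted(set(chain.from_iterable(frag_counts))):
        support = sum(frag in fc for fc in frag_counts)
        rate_pos = sum(frag in fc for fc in pos) / len(pos)
        rate_neg = sum(frag in fc for fc in neg) / len(neg)
        if support >= min_support and rate_pos != rate_neg:
            selected.append(frag)
    return selected


def minmax_kernel(x, y, features):
    """Min-max similarity between two fragment-count vectors over the selected features."""
    num = sum(min(x[f], y[f]) for f in features)
    den = sum(max(x[f], y[f]) for f in features)
    return num / den if den else 0.0


def predict(query, train_counts, labels, features):
    """Assign the class whose training compounds are most similar on average (not an SVM)."""
    sims = {0: [], 1: []}
    for fc, y in zip(train_counts, labels):
        sims[y].append(minmax_kernel(query, fc, features))
    return max(sims, key=lambda c: sum(sims[c]) / len(sims[c]))


# Hypothetical toy molecules as (atom labels, bonds); the class labels are made up.
train = [
    (["C", "C", "N"], [(0, 1), (1, 2)]),
    (["C", "N", "C"], [(0, 1), (1, 2)]),
    (["C", "C", "O"], [(0, 1), (1, 2)]),
    (["C", "O", "C"], [(0, 1), (1, 2)]),
]
labels = [1, 1, 0, 0]

train_counts = [path_fragments(atoms, bonds) for atoms, bonds in train]
features = select_features(train_counts, labels)
query_counts = path_fragments(["C", "C", "C", "N"], [(0, 1), (1, 2), (2, 3)])
prediction = predict(query_counts, train_counts, labels, features)
```

On this toy data the filter discards fragments shared equally by both classes (such as a lone carbon) and keeps only those that separate them, which mirrors the idea of simplifying the representation while exposing class-relevant features. A real pipeline would replace each stage with a stronger counterpart, for example a frequent-subgraph miner and a support vector machine.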
