Recursive partitioning and ensemble methods for classifying an ordinal response

$75,000 | R03 | FY2009 | LM | NIH

Virginia Commonwealth University, Richmond VA

Investigators

Linked publications & trials

Paper 22611297 Paper 22359384 Paper 20625561 Paper 19939941 Paper 19697302

Abstract

DESCRIPTION (provided by applicant): This proposal is submitted in response to NOT-OD-09-058, NIH Announces the Availability of Recovery Act Funds for Competitive Revision Applications. Health status and outcomes are frequently measured on an ordinal scale. Examples include scoring methods for liver biopsy specimens from patients with chronic hepatitis, such as the Knodell hepatic activity index, the Ishak score, and the METAVIR score. In addition, tumor-node-metastasis stage for cancer patients is an ordinal-scaled measure. Moreover, the more recently advocated method for evaluating response to treatment in target tumor lesions is the Response Evaluation Criteria In Solid Tumors method, with ordinal outcomes defined as complete response, partial response, stable disease, and progressive disease. Traditional ordinal response modeling methods assume independence among the predictor variables and require that the number of samples (n) exceed the number of covariates (p). Both conditions are violated in high-throughput genomic studies. Our currently funded R03 grant, "Recursive partitioning and ensemble methods for classifying an ordinal response," consists of the following three specific aims: (SA.1) extend the recursive partitioning and random forest classification methodologies for predicting an ordinal response by developing computational tools for the R programming environment, including implementing our ordinal impurity criteria in rpart and in randomForest; (SA.2) evaluate the proposed ordinal classification methods in comparison to existing nominal and continuous response methods using simulated, benchmark, and gene expression datasets; and (SA.3) develop and evaluate methods for assessing variable importance when interest is in predicting an ordinal response.
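The ordinal impurity criteria named in SA.1 are not spelled out in this abstract. As a minimal sketch only, the block below assumes one commonly described option: a generalized Gini index whose misclassification cost is the absolute difference between integer scores assigned to the ordered categories. The function names, the score assignment, and the use of Python rather than R are all illustrative assumptions and are not taken from the grant.

```python
from collections import Counter

# Hypothetical integer scores for the ordered response categories listed in
# the abstract (complete response < partial response < stable disease <
# progressive disease). The scoring itself is an assumption.
SCORES = {"CR": 1, "PR": 2, "SD": 3, "PD": 4}


def nominal_gini(labels):
    """Standard Gini impurity: ignores the ordering of the categories."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def ordinal_gini(labels, scores):
    """Generalized Gini impurity with cost |s_j - s_k| between categories.

    Sums cost * p_j * p_k over all ordered pairs of categories, so mixing
    distant categories is penalized more than mixing adjacent ones.
    """
    n = len(labels)
    if n == 0:
        return 0.0
    props = {k: c / n for k, c in Counter(labels).items()}
    total = 0.0
    for j, pj in props.items():
        for k, pk in props.items():
            total += abs(scores[j] - scores[k]) * pj * pk
    return total


# Two candidate nodes with identical nominal impurity.
adjacent = ["CR", "PR"]  # neighbouring categories
distant = ["CR", "PD"]   # opposite ends of the scale

print(nominal_gini(adjacent), nominal_gini(distant))                   # 0.5 0.5
print(ordinal_gini(adjacent, SCORES), ordinal_gini(distant, SCORES))   # 0.5 1.5
```

Under these assumptions the nominal criterion cannot tell the two nodes apart, while the cost-weighted version treats the node mixing complete response with progressive disease as three times more impure.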
Recently, penalized models have been successfully applied to high-throughput genomic datasets in fitting linear, logistic, and Cox proportional hazards models with excellent performance. However, extension of penalized models to the ordinal response setting has not been described. Herein we propose to extend the L1 penalized method to ordinal response models to enable modeling of common ordinal response data when high-dimensional genomic data comprise the predictor space. This study will expand the scope of our current research by providing a model-based ordinal classification methodology applicable to high-dimensional datasets to accompany the heuristic-based classification tree and random forest ordinal methodologies considered in the parent grant. The specific aims of this competitive revision application are to: Aim 1) extend the L1 penalized methodology to enable predicting an ordinal response by developing computational tools for the R programming environment; Aim 2) using simulated, benchmark, and gene expression datasets, evaluate L1 penalized ordinal response models by comparing error rates from our L1 fitting algorithm to those obtained when using a forward variable selection modeling strategy and our ordinal random forest approach; and Aim 3) evaluate methods for assessing important covariates from L1 penalized ordinal response models.
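The abstract does not say which ordinal model the L1 penalty is applied to, nor how it is fitted. As a hedged illustration, the sketch below assumes a cumulative logit (proportional odds) form, P(Y <= k | x) = sigmoid(alpha_k - x'beta), and shows the penalized objective together with the soft-thresholding step that many L1 solvers use to set coefficients exactly to zero. All names are hypothetical, and this is not the applicants' fitting algorithm.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def category_probs(x, alphas, beta):
    """Category probabilities under an assumed cumulative logit model.

    alphas: strictly increasing thresholds (K - 1 values for K categories).
    beta:   one coefficient per covariate, shared across thresholds.
    """
    eta = sum(b * xi for b, xi in zip(beta, x))
    cum = [sigmoid(a - eta) for a in alphas] + [1.0]  # P(Y <= k)
    return [cum[0]] + [cum[k] - cum[k - 1] for k in range(1, len(cum))]


def penalized_nll(X, y, alphas, beta, lam):
    """Negative log-likelihood plus lam * sum(|beta_j|).

    y holds category indices 0..K-1; thresholds are left unpenalized.
    """
    nll = 0.0
    for x, yi in zip(X, y):
        nll -= math.log(category_probs(x, alphas, beta)[yi])
    return nll + lam * sum(abs(b) for b in beta)


def soft_threshold(v, t):
    """Proximal operator of t * |v|: shrink toward zero, clip at zero."""
    return math.copysign(max(abs(v) - t, 0.0), v)


# Tiny illustration with made-up numbers: two covariates, three categories.
X = [[1.0, 2.0], [0.5, -1.0]]
y = [1, 0]
alphas = [-1.0, 1.0]
beta = [0.8, -0.1]
print(penalized_nll(X, y, alphas, beta, lam=0.5))
print([soft_threshold(b, 0.2) for b in beta])  # small coefficient becomes zero
```

The L1 term is what makes this usable when p exceeds n: coefficients whose update falls below the threshold are set exactly to zero, so variable selection happens as part of fitting.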
