Abstract
Motivation: Advances in technology have generated larger omics datasets with potential applications for machine learning. In many datasets, however, cost and limited sample availability result in far more features than observations. Moreover, biological processes are associated with networks of core and peripheral genes, while traditional feature selection approaches capture only core genes. Results: To overcome these limitations, we present dRFEtools, which implements dynamic recursive feature elimination (RFE). Compared to standard RFE, it reduces computational time while retaining high accuracy, extends dynamic RFE to regression algorithms, and outputs the subsets of features that hold predictive power with and without peripheral features. dRFEtools integrates with scikit-learn (the popular Python machine learning platform) and thus provides new opportunities for dynamic RFE in large-scale omics data while enhancing its interpretability.
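The idea of dynamic RFE can be illustrated with a minimal sketch built only on scikit-learn. It assumes that "dynamic" means removing a fixed fraction of the features that remain at each round, so that many features are dropped early and few are dropped late. This is not the dRFEtools API: the function `dynamic_rfe`, the parameter `elim_rate`, the hold-out scoring, and the random forest estimator are all hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split


def dynamic_rfe(estimator, X, y, elim_rate=0.2, min_features=1, random_state=0):
    """Hypothetical helper (not part of dRFEtools).

    Repeatedly fits `estimator`, records a hold-out score, and drops
    a fraction `elim_rate` of the features that are still in play.
    """
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=random_state, stratify=y
    )
    keep = np.arange(X.shape[1])  # column indices still in play
    history = []                  # (n_features, score, kept indices) per round
    while True:
        estimator.fit(X_train[:, keep], y_train)
        score = estimator.score(X_test[:, keep], y_test)
        history.append((len(keep), score, keep.copy()))
        if len(keep) <= min_features:
            break
        # The number dropped shrinks as the feature set shrinks.
        n_drop = max(1, int(len(keep) * elim_rate))
        n_drop = min(n_drop, len(keep) - min_features)
        order = np.argsort(estimator.feature_importances_)  # least important first
        keep = keep[np.sort(order[n_drop:])]
    return history


# Synthetic data with many more features than informative signals.
X, y = make_classification(
    n_samples=200, n_features=100, n_informative=10, random_state=0
)
history = dynamic_rfe(
    RandomForestClassifier(n_estimators=50, random_state=0), X, y
)
sizes = [n for n, _, _ in history]
```

Under these assumptions, the 100-feature example needs far fewer model fits than removing one feature at a time would, which is where the computational saving comes from. The recorded score at each subset size can then be inspected to pick a small, highly predictive subset or a larger one that still performs well.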
Original language | English (US) |
---|---|
Article number | btad513 |
Journal | Bioinformatics |
Volume | 39 |
Issue number | 8 |
State | Published - Aug 2023 |
Funding
This work was supported by the Lieber Institute for Brain Development and the National Institute on Minority Health and Health Disparities of the National Institutes of Health [K99MD016964 to K.J.M.B.].
ASJC Scopus subject areas
- Statistics and Probability
- Biochemistry
- Molecular Biology
- Computer Science Applications
- Computational Theory and Mathematics
- Computational Mathematics