This study evaluates feature reduction (FR) techniques for improving drug response prediction (DRP) with machine learning on high-dimensional molecular data. The researchers compared nine knowledge-based and data-driven FR methods across thousands of tests on cancer cell lines and human tumor samples. Sparse principal components and Landmark genes performed well on cell lines, whereas transcription factor (TF) activities were the most effective for predicting responses in the more complex setting of clinical tumors. Across all tested scenarios, ridge regression emerged as the most robust machine learning model for handling correlated genetic features. These findings suggest that leveraging biological knowledge, particularly regulatory network activities, can improve both the interpretability and the accuracy of personalized medicine models. The research highlights the potential of compact, biologically relevant feature sets to bridge the gap between laboratory screens and clinical patient outcomes.
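To make the general workflow concrete, the following is a minimal sketch of a "reduce features, then fit ridge regression" pipeline. It is not the authors' code. Everything in it is an assumption for illustration: the data are synthetic, plain PCA stands in for the data-driven FR methods (the study's sparse principal components, Landmark genes, and TF activities are not reproduced here), the helper names `pca_reduce`, `ridge_fit`, and `ridge_predict` are hypothetical, and Pearson correlation is used only as a convenient score.

```python
import numpy as np

def pca_reduce(X_train, X_test, n_components):
    """Project both sets onto the top principal components of the training set."""
    mean = X_train.mean(axis=0)
    # SVD of the centered training matrix; rows of Vt are principal directions.
    _, _, Vt = np.linalg.svd(X_train - mean, full_matrices=False)
    W = Vt[:n_components].T
    return (X_train - mean) @ W, (X_test - mean) @ W

def ridge_fit(X, y, alpha):
    """Closed-form ridge regression with an unpenalized intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    # Solve (Xc^T Xc + alpha * I) w = Xc^T (y - y_mean).
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]),
                        Xc.T @ (y - y_mean))
    b = y_mean - x_mean @ w
    return w, b

def ridge_predict(X, w, b):
    return X @ w + b

# Synthetic stand-in for expression data (assumption): 500 correlated "genes"
# driven by 5 latent factors, with a response that depends on those factors.
rng = np.random.default_rng(0)
n_samples, n_genes, n_latent = 120, 500, 5
Z = rng.normal(size=(n_samples, n_latent))
X = Z @ rng.normal(size=(n_latent, n_genes)) + 0.1 * rng.normal(size=(n_samples, n_genes))
y = Z @ rng.normal(size=n_latent) + 0.1 * rng.normal(size=n_samples)

# Simple hold-out split; the reduction is fitted on the training samples only.
X_train, X_test = X[:90], X[90:]
y_train, y_test = y[:90], y[90:]

R_train, R_test = pca_reduce(X_train, X_test, n_components=10)
w, b = ridge_fit(R_train, y_train, alpha=1.0)
pred = ridge_predict(R_test, w, b)

pearson = np.corrcoef(pred, y_test)[0, 1]
print(f"reduced features: {R_train.shape[1]}, test Pearson r: {pearson:.3f}")
```

Fitting the reduction on the training samples alone keeps information from the held-out samples out of the model. A knowledge-based method would replace `pca_reduce` with a step that selects or aggregates genes using prior biological information.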
References:
Firoozbakht, F., Yousefi, B., Tsoy, O. et al. Comparative evaluation of feature reduction methods for drug response prediction. Sci Rep 14, 30885 (2024). doi.org

