TY - GEN
T1 - Classifying aging genes into DNA repair or non-DNA repair-related categories
AU - Fang, Yaping
AU - Wang, Xinkun
AU - Michaelis, Elias K.
AU - Fang, Jianwen
PY - 2013
Y1 - 2013
N2 - The elderly population in almost every country is growing faster than ever before. However, our knowledge about the aging process is still limited despite decades of studies on this topic. In this report, we focus on the gradual accumulation of DNA damage in cells, which is a key aspect of the aging process and one that underlies age-dependent functional decline in cells, tissues, and organs. To achieve the goal of discriminating DNA-repair from non-DNA-repair genes among currently known genes related to human aging, four machine learning methods were employed: Decision Trees, Naïve Bayes, Support Vector Machine, and Random Forest (RF). Among the four methods, the RF algorithm achieved a total accuracy (ACC) of 97.32% and an area under the receiver operating characteristic curve (AUC) of 0.98. These estimates were based on 18 selected attributes, including 10 Gene Ontology and 8 Protein-Protein Interaction (PPI) attributes. A predictive model built with only 15 PPI attributes achieved performance levels of ACC = 96.56% and AUC = 0.95. Systems biology analyses showed that the features of these attributes were related to cancer, genetic, developmental, and neurological disorders, as well as DNA replication/recombination/repair, cell cycle, cell death, and cell function maintenance. The results of this study indicate that genes indicative of aging may be successfully classified into DNA repair and non-DNA repair genes, and such successful classification may help identify pathways and biomarkers that are important to the aging process.
AB - The elderly population in almost every country is growing faster than ever before. However, our knowledge about the aging process is still limited despite decades of studies on this topic. In this report, we focus on the gradual accumulation of DNA damage in cells, which is a key aspect of the aging process and one that underlies age-dependent functional decline in cells, tissues, and organs. To achieve the goal of discriminating DNA-repair from non-DNA-repair genes among currently known genes related to human aging, four machine learning methods were employed: Decision Trees, Naïve Bayes, Support Vector Machine, and Random Forest (RF). Among the four methods, the RF algorithm achieved a total accuracy (ACC) of 97.32% and an area under the receiver operating characteristic curve (AUC) of 0.98. These estimates were based on 18 selected attributes, including 10 Gene Ontology and 8 Protein-Protein Interaction (PPI) attributes. A predictive model built with only 15 PPI attributes achieved performance levels of ACC = 96.56% and AUC = 0.95. Systems biology analyses showed that the features of these attributes were related to cancer, genetic, developmental, and neurological disorders, as well as DNA replication/recombination/repair, cell cycle, cell death, and cell function maintenance. The results of this study indicate that genes indicative of aging may be successfully classified into DNA repair and non-DNA repair genes, and such successful classification may help identify pathways and biomarkers that are important to the aging process.
KW - Aging
KW - Classification
KW - DNA-repair
KW - Feature selection
KW - Random Forest
UR - http://www.scopus.com/inward/record.url?scp=84883136909&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84883136909&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-39482-9_3
DO - 10.1007/978-3-642-39482-9_3
M3 - Conference contribution
AN - SCOPUS:84883136909
SN - 9783642394812
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 20
EP - 29
BT - Intelligent Computing Theories and Technology - 9th International Conference, ICIC 2013, Proceedings
T2 - 9th International Conference on Intelligent Computing, ICIC 2013
Y2 - 28 July 2013 through 31 July 2013
ER -