This study assessed the utility of nationwide, population-based administrative health data in predicting the future incidence of AD. Using machine learning, we predicted the future incidence of AD with an acceptable accuracy of 0.713 (AUC of 0.781) for one-year prediction. The accuracy of our models, which were trained on large nationwide samples, lends support to the potential utility of administrative data-based predictive models in AD. Despite the limitations inherent to administrative health data, such as the inability to directly ascertain clinical phenotypes, this study demonstrates their potential utility in AD risk prediction when combined with data-driven machine learning.
With AUCs of 0.898, 0.775, and 0.725 for predicting baseline, subsequent one-year, and four-year incident AD, respectively, our models are relatively accurate compared with those reported in the literature. In all-cause…