Research Interests
- Large-scale multivariate statistical learning
- Statistical machine learning
- Statistical computing and optimization
- Population health and healthcare analytics
- Ecological and environmental statistics
- Reinsel, G. C., Velu, R. P., and Chen, K. (2022) Multivariate Reduced-Rank Regression: Theory, Methods and Applications, 2nd Edition. Springer.
Selected Publications
Large-Scale Multivariate Statistical Learning
- Chen, J., Aseltine, R., Wang, F., and Chen, K. (2024) Tree-guided rare feature selection and logic aggregation with electronic health records data. Journal of the American Statistical Association, 119 (547):1765–1777.
- Chen, K., Dong, R., Xu, W., and Zheng, Z. (2022) Fast stagewise sparse factor regression. Journal of Machine Learning Research, 23(271):1–45.
- Liu, X., Ma, S., and Chen, K. (2021) Multivariate functional regression via a nested reduced-rank regularization. Journal of Computational & Graphical Statistics, 31(1):231–240.
- Uematsu, Y., Fan, Y., Chen, K., Lv, J., Lin, W. (2019) SOFAR: large-scale association network learning. IEEE Transactions on Information Theory, 65(8):4924–4939.
- Li, G., Liu, X., Chen, K. (2019) Integrative multi-view reduced-rank regression: Bridging group-sparse and low-rank models. Biometrics, 75(2):593–602.
- He, L., Chen, K., Xu, W., Zhou, J., Wang, F. (2018) Boosted sparse and low-rank tensor regression. Advances in Neural Information Processing Systems (NeurIPS) 31, 1017–1026.
- Luo, C., Liang, J., Li, G., Wang, F., Dey, D., Chen, K. (2018) Leveraging mixed and incomplete outcomes via a generalized reduced rank regression. Journal of Multivariate Analysis, 167:378–394.
- Mishra, A., Dey, D., Chen, K. (2017) Sequential co-sparse factor regression. Journal of Computational & Graphical Statistics, 26(4):814–825.
- She, Y., Chen, K. (2017) Robust reduced-rank regression. Biometrika, 104(3):633–647.
- Goh, G., Dey, D., Chen, K. (2017) Bayesian sparse reduced-rank regression. Journal of Multivariate Analysis, 157:14–28. (Student Paper Award, ASA Section on Bayesian Statistical Science)
- Luo, C., Liu, J., Dey, D., Chen, K. (2016) Canonical variate regression. Biostatistics, 17(3):468–483. (ICSA Student Paper Award)
- Mukherjee, A., Chen, K., Wang, N., Zhu, J. (2015) On the degrees of freedom of reduced-rank estimators in multivariate regression. Biometrika, 102(2):457–477.
- Chen, K., Dong, H., Chan, K.-S. (2013) Reduced rank regression via adaptive nuclear norm penalization. Biometrika, 100(4):901–920.
- Chen, K., Chan, K.-S., Stenseth, N. R. (2012) Reduced rank stochastic regression with a sparse singular value decomposition. Journal of the Royal Statistical Society (Series B), 74(2):203–221. (ENAR Distinguished Student Paper Award)
Statistical Machine Learning & Computing
- Xu, T., Chen, K., and Li, G. (2024) Tensor regression for incomplete observations with application to longitudinal studies. Annals of Applied Statistics, 18(2):1294–1318.
- Jin, J., Aseltine, R., Yan, J., and Chen, K. (2024) Transfer learning for large-scale quantile regression. Technometrics, 66(3):381–393. (Honorable Mention in Student Paper Competition, ASA Section on Risk Analysis)
- Cui, S., Liang, J., Pan, W., Chen, K., Zhang, C., and Wang, F. (2022) Collaboration equilibrium in federated learning. In KDD ’22: The 28th ACM SIGKDD Conference on Knowledge Discovery & Data Mining. ACM, 241–251.
- Liang, J., Chen, K., Lin, M., Zhang, C., Wang, F. (2018) Robust finite mixture regression for heterogeneous targets. Data Mining & Knowledge Discovery, 32(6):1509–1560.
- Li, G., Li, Y., and Chen, K. (2022) It’s all relative: regression analysis with compositional predictors. Biometrics, 79(2):1318–1329.
- Li, Y., Yu, C., Zhao, Y., Aseltine, R., Yao, W., Chen, K. (2021) Pursuing sources of heterogeneity in modeling clustered population. Biometrics, 78(2):716–729. (ENAR Distinguished Student Paper Award)
- Vaughan, G., Aseltine, R., Chen, K., Yan, J. (2017) Stagewise generalized estimation equations with grouped variables. Biometrics, 73:1332–1342. (Student Paper Award, ASA Mental Health Statistics Section)
- Dong, H., Chen, K., Linderoth, J. (2016) Regularization vs. relaxation: A conic optimization perspective of statistical variable selection. arXiv:1510.06083.
- Chen, K., Ma, Y. (2017) Analysis of double single index models. Scandinavian Journal of Statistics, 44(1):1–20.
- Chen, K., Chan, K.-S. (2011) Subset ARMA model selection via the adaptive Lasso. Statistics and Its Interface, 4:197–205.
Statistical Methods in Data Science
- Li, Y., Chen, K., Yan, J., and Zhang, X. (2023) Regularized fingerprinting in detection and attribution of climate change with weight matrix optimizing the efficiency in scaling factor estimation. Annals of Applied Statistics, 17(1):225–239.
- Zang, C., Zhang, H., Xu, J., Zhang, H., Fouladvand, S., Havaldar, S., Cheng, F., Chen, K., Chen, Y., Glicksberg, B. S., Chen, J., Bian, J., and Wang, F. (2023) High-throughput clinical trial emulation with real-world data and machine learning: A case study of drug repurposing for Alzheimer’s disease. Nature Communications, 14:8180.
- Xu, W., Chang, S., Li, Y., Doshi, R., Chen, K., Wang, F., and Aseltine, R. (2022) Improving suicide risk prediction via targeted data fusion: proof of concept using medical claims data. Journal of the American Medical Informatics Association, 29(3):500–511. (Featured article).
- Li, Y., Li, G., and Chen, K. (2022) Principal amalgamation analysis for microbiome data. Genes, 13(7):1139.
- Li, Y., Chen, K., Yan, J., and Zhang, X. (2021) Uncertainty in optimal fingerprinting is underestimated. Environmental Research Letters, 16(8):084043.
- Sun, Z., Xu, W., Cong, X., Li, G., Chen, K. (2020) Log-contrast regression with functional compositional predictors: Linking preterm infant’s gut microbiome trajectories in early postnatal period to neurobehavioral outcome. Annals of Applied Statistics, 14(3):1535–1556. (John van Ryzin Award & ENAR Distinguished Student Paper Award).
- Wang, W., Aseltine, R., Chen, K., Yan, J. (2020) Integrative survival analysis with uncertain event times in application to a suicide risk study. Annals of Applied Statistics, 14(1):51–73. (NESS Student Research Award)
- Chen, K., Mishra, N., Smyth, J., Bar, H., Schifano, E., Kuo, L., Chen, M.-H. (2018) A tailored multivariate mixture model for detecting proteins of concordant change in the pathogenesis of Necrotic Enteritis. Journal of the American Statistical Association, 113:546–559.
- Chen, K., Hoffman, E., Seetharaman, I., Jiao, F., Lin, C.L., Chan, K.-S. (2016) Linking lung airway structure to pulmonary function via composite bridge regression. Annals of Applied Statistics, 10(4):1880–1906.
- Chen, K., Chan, K.-S., Stenseth, N. R. (2014) Source-sink reconstruction through regularized multi-component regression analysis -- with application to assessing whether North Sea cod larvae contributed to local fjord cod in Skagerrak. Journal of the American Statistical Association, 109(506):560–573.
Applications - Data Driven Suicide Prevention
- Mitra, A., Chen, K., Liu, W., Kessler, R., and Yu, H. (2025) Post-discharge suicide prediction among US veterans using natural language processing-enriched social and behavioral determinants of health. npj Mental Health Research. Accepted.
- Zang, C., Hou, Y., Jin, J., Sacco, S., Chen, K., Aseltine, R., and Wang, F. (2024) Accuracy and generalizability of machine learning models for adolescent suicide prediction with longitudinal clinical records. Translational Psychiatry, 14:316.
- Sacco, S., Chen, K., Wang, F., and Aseltine, R. (2023) Target-based fusion using social determinants of health to enhance suicide prediction with electronic health records. PLoS ONE, 18(4):e0283595.
- Rawat, B. P. S., Reisman, J., Pogoda, T. K., Weisong, L., Rongali, S., Aseltine, R. H., Chen, K., Tsai, J., Berlowitz, D., Yu, H., and Carlson, K. (2023) Intentional self-harm among US veterans with traumatic brain injury and/or posttraumatic stress disorder: A retrospective cohort study 2008–2017. JMIR Public Health and Surveillance, 9:e42803.
- Mitra, A., Pradhan, R., Melamed, R. D., Chen, K., Hoaglin, D. C., Tucker, K. L., Reisman, J. I., Yang, Z., Liu, W., Tsai, J., and Yu, H. (2023) Associations between natural language processing (NLP) enriched social determinants of health and suicide death among US veterans. JAMA Network Open, 6(3):e233079.
- Luo, C., Chen, K., Doshi, R., Rickles, N., Chen, Y., Schwartz, H., and Aseltine, R. H. (2022) The association of prescription opioid use with suicide attempts: An analysis of statewide medical claims data. PLoS ONE, 17(6):e0269809.
- Chang, S., Aseltine, R., Riddhi, D., Chen, K., Rogers, S., and Wang, F. (2020) Machine learning for suicide risk prediction in children and adolescents with electronic health records. Translational Psychiatry, 10:413.
- Doshi, R., Chen, K., Wang, F., Schwartz, H., Herzog, A., Aseltine, R. (2020) Identifying risk factors for mortality among patients previously hospitalized for a suicide attempt. Scientific Reports, 10:15223.
- Chen, K., Aseltine, R. (2017) Using hospitalization and mortality data to target suicide prevention activities. Journal of Adolescent Health, 61:192–197.
Applications - Other
- Lin, Q., Dorsett, Y., Mirza, A., Tremlett, H., Piccio, L., Longbrake, E., Choileain, S. N., Hafler, D., Cox, L., Weiner, H., Yamamura, T., Chen, K., Wu, Y., and Zhou, Y. (2024) Meta-analysis identifies common gut microbiota signatures in patients with multiple sclerosis. Genome Medicine, 16:94.
- Sun, Y., Wang, Y., Zhu, H., Jin, N., Mohammad, A., Biyikli, N., Chen, O., Chen, K., and Zhao, J. (2022) Excitation wavelength-dependent photoluminescence decay of single quantum dots near plasmonic gold nanoparticles. Journal of Chemical Physics, 156:154701.
- Wang, J., Tang, K., Feng, K., Lin, X., Lv, W., Chen, K., and Wang, F. (2021) Impact of temperature and relative humidity on the transmission of COVID-19: a modeling study in China and the United States. BMJ Open, 11:e043863.
- Choi, S., Hoffman, E. A., Wenzel, S. E., Castro, M., Fain, S., Jarjour, N., Schiebler, M. L., Chen, K., and Lin, C.-L. (2017) Quantitative computed tomography imaging-based clustering differentiates asthmatic subgroups with distinctive clinical phenotypes. Journal of Allergy and Clinical Immunology, 140(3):690–700.
- Chen, Y., Chen, K., and Kalichman, S. C. (2017) Barriers to HIV medication adherence in the context of regimen simplification. Annals of Behavioral Medicine, 51(1):67–78.
- Chen, K., Ciannelli, L., Decker, M.B., Ladd, C., Cheng, W., Zhou, Z., Chan, K.S. (2014) Reconstructing source-sink dynamics in a population with a pelagic dispersal phase. PLoS ONE, 9(5):e95316.
- Elucidating the impact of nano-agrichemicals on paddy soil health and rice production through combined greenhouse studies and machine learning. USDA (2023-67021-39755). Co-PI (with Dr. Samuel Ma at TAMU); 04/01/2023–03/31/2027.
- Improving identification of pediatric patients at risk of child physical abuse. The Patterson Foundation. PI on sub-award (with Dr. Amy A. Hunter); 01/31/2023–01/30/2025.
- Integrative learning of fluorescence fluctuations in perovskite quantum dots using a data science assisted single-particle approach. National Science Foundation (CHE-2203854). Co-PI (with Dr. Jing Zhao); 09/01/2022–08/31/2025.
- Developing suicide risk algorithms for diverse clinical settings using data fusion. National Institutes of Health (R01-MH124740). MPI (with Dr. Robert Aseltine and Dr. Fei Wang); 09/16/2020–06/30/2024.
- Improving suicide prediction using NLP-derived social determinants of health. National Institutes of Health (R01-MH125027). PI on sub-award (PI: Dr. Hong Yu); 09/01/2020–06/30/2024.
- Improving the identification and management of suicide risk among patients using prescription opioids (HEAL Supplement). National Institutes of Health (R01-MH112148-03S1). PI on sub-award (PI: Dr. Robert Aseltine); 09/18/2020–06/30/2021.
- Reciprocal modulation of the microbiome and cellular senescence in metabolic dysfunction. National Institutes of Health (R01-AG068860). PI on sub-award (PI: Dr. Yanjiao Zhou); 09/10/2020–05/31/2025.
- Comprehensive heterogeneous response regression from complex data. National Science Foundation (IIS-1718798). PI; 09/01/2017–08/31/2020.
- Improving the identification of patients at risk of suicide. National Institutes of Health (R01-MH112148). PI on sub-award (PI: Dr. Robert Aseltine); 07/01/2017–06/30/2020.
- Integrative multivariate analysis with multi-view data. National Science Foundation (DMS-1613295). PI; 09/01/2016–08/31/2019.
- An integrative statistics-guided image-based multi-scale lung model. U.S. National Institutes of Health (U01-HL114494). PI on sub-award (PI: Dr. Ching-Long Lin); 08/01/2013–05/31/2018.