Research Interests
- Large-scale multivariate statistical learning
- Statistical machine learning
- Statistical computing and optimization
- Population health and healthcare analytics
- Ecological and environmental statistics
Research Grants
- Elucidating the impact of nano-agrichemicals on paddy soil health and rice production through combined greenhouse studies and machine learning. USDA (2023-67021-39755). Co-PI (with Dr. Samuel Ma at TAMU); 04/01/2023–03/31/2027.
- Improving identification of pediatric patients at risk of child physical abuse. The Patterson Foundation. PI on sub-award (with Dr. Amy A. Hunter); 01/31/2023–01/30/2025.
- Integrative learning of fluorescence fluctuations in perovskite quantum dots using a data science assisted single-particle approach. National Science Foundation (CHE-2203854). Co-PI (with Dr. Jing Zhao); 09/01/2022–08/31/2025.
- Developing suicide risk algorithms for diverse clinical settings using data fusion. National Institutes of Health (R01-MH124740). MPI (with Dr. Robert Aseltine and Dr. Fei Wang); 09/16/2020–06/30/2024.
- Improving suicide prediction using NLP-derived social determinants of health. National Institutes of Health (R01-MH125027). PI on sub-award (PI: Dr. Hong Yu); 09/01/2020–06/30/2024.
- Improving the identification and management of suicide risk among patients using prescription opioids (HEAL Supplement). National Institutes of Health (R01-MH112148-03S1). PI on sub-award (PI: Dr. Robert Aseltine); 09/18/2020–06/30/2021.
- Reciprocal modulation of the microbiome and cellular senescence in metabolic dysfunction. National Institutes of Health (R01-AG068860). PI on sub-award (PI: Dr. Yanjiao Zhou); 09/10/2020–05/31/2025.
- Comprehensive heterogeneous response regression from complex data. National Science Foundation (IIS-1718798). PI; 09/01/2017–08/31/2020.
- Improving the identification of patients at risk of suicide. National Institutes of Health (R01-MH112148). PI on sub-award (PI: Dr. Robert Aseltine); 07/01/2017–06/30/2020.
- Integrative multivariate analysis with multi-view data. National Science Foundation (DMS-1613295). PI; 09/01/2016–08/31/2019.
- An integrative statistics-guided image-based multi-scale lung model. U.S. National Institutes of Health (U01-HL114494). PI on sub-award (PI: Dr. Ching-Long Lin); 08/01/2013–05/31/2018.
Book
- Reinsel, G. C., Velu, R. P., and Chen, K. (2022) Multivariate Reduced-Rank Regression: Theory, Methods and Applications, 2nd Edition. Springer.
Selected Publications
Methodology & Theory
- Chen, J., Aseltine, R., Wang, F., and Chen, K. (2024) Tree-guided rare feature selection and logic aggregation with electronic health records data. Journal of the American Statistical Association.
- Jin, J., Aseltine, R., Yan, J., and Chen, K. (2024) Transfer learning for large-scale quantile regression. Technometrics.
- Xu, T., Chen, K., and Li, G. (2023) Tensor regression for incomplete observations with application to longitudinal studies. Annals of Applied Statistics. Accepted.
- Li, Y., Chen, K., Yan, J., and Zhang, X. (2023) Regularized fingerprinting in detection and attribution of climate change with weight matrix optimizing the efficiency in scaling factor estimation. Annals of Applied Statistics, 17(1):225–239.
- Chen, K., Dong, R., Xu, W., and Zheng, Z. (2022) Fast stagewise sparse factor regression. Journal of Machine Learning Research, 23(271):1–45.
- Li, G., Li, Y., and Chen, K. (2022) It’s all relative: regression analysis with compositional predictors. Biometrics, 79(2):1318–1329.
- Li, Y., Li, G., and Chen, K. (2022) Principal amalgamation analysis for microbiome data. Genes, 13(7):1139.
- Cui, S., Liang, J., Pan, W., Chen, K., Zhang, C., and Wang, F. (2022) Collaboration equilibrium in federated learning. In KDD ’22: The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. ACM.
- Liu, X., Cong, X., Li, G., Mass, K., and Chen, K. (2022) Multivariate log-contrast regression with sub-compositional predictors: testing the associations between preterm infant’s gut microbiomes and neurobehavioral outcomes. Statistics in Medicine, 41(3):580–594.
- Liu, X., Ma, S., and Chen, K. (2021) Multivariate functional regression via a nested reduced-rank regularization. Journal of Computational and Graphical Statistics, 31(1):231–240.
- Li, Y., Yu, C., Zhao, Y., Aseltine, R., Yao, W., Chen, K. (2021) Pursuing sources of heterogeneity in modeling clustered population. Biometrics, 78(2):716–729.
- Sun, Z., Xu, W., Cong, X., Li, G., Chen, K. (2020) Log-contrast regression with functional compositional predictors: Linking preterm infant’s gut microbiome trajectories in early postnatal period to neurobehavioral outcome. Annals of Applied Statistics, 14(3):1535–1556. (2020 John van Ryzin Award).
- Wang, W., Aseltine, R., Chen, K., Yan, J. (2020) Integrative survival analysis with uncertain event times in application to a suicide risk study. Annals of Applied Statistics, 14(1):51–73.
- Uematsu, Y., Fan, Y., Chen, K., Lv, J., Lin, W. (2019) SOFAR: large-scale association network learning. IEEE Transactions on Information Theory, 65(8):4924–4939.
- Li, G., Liu, X., Chen, K. (2019) Integrative multi-view reduced-rank regression: Bridging group-sparse and low-rank models. Biometrics, 75(2):593–602.
- He, L., Chen, K., Xu, W., Zhou, J., Wang, F. (2018) Boosted sparse and low-rank tensor regression. Advances in Neural Information Processing Systems (NeurIPS) 31, 1017–1026.
- Liang, J., Chen, K., Lin, M., Zhang, C., Wang, F. (2018) Robust finite mixture regression for heterogeneous targets. Data Mining and Knowledge Discovery, 32(6):1509–1560.
- Luo, C., Liang, J., Li, G., Wang, F., Dey, D., Chen, K. (2018) Leveraging mixed and incomplete outcomes via a generalized reduced rank regression. Journal of Multivariate Analysis, 167:378–394.
- Chen, K., Mishra, N., Smyth, J., Bar, H., Schifano, E., Kuo, L., Chen, M.-H. (2018) A tailored multivariate mixture model for detecting proteins of concordant change in the pathogenesis of Necrotic Enteritis. Journal of the American Statistical Association, 113:546–559.
- Mishra, A., Dey, D., Chen, K. (2017) Sequential co-sparse factor regression. Journal of Computational and Graphical Statistics, 26(4):814–825.
- She, Y., Chen, K. (2017) Robust reduced-rank regression. Biometrika, 104(3):633–647.
- Vaughan, G., Aseltine, R., Chen, K., Yan, J. (2017) Stagewise generalized estimation equations with grouped variables. Biometrics, 73:1332–1342.
- Goh, G., Dey, D., Chen, K. (2017) Bayesian sparse reduced-rank regression. Journal of Multivariate Analysis, 157:14–28.
- Chen, K., Ma, Y. (2017) Analysis of double single index models. Scandinavian Journal of Statistics, 44(1), 1-20.
- Dong, H., Chen, K., Linderoth, J. (2016) Regularization vs. relaxation: A conic optimization perspective of statistical variable selection. arXiv:1510.06083.
- Chen, K., Hoffman, E., Seetharaman, I., Jiao, F., Lin, C.L., Chan, K.-S. (2016) Linking lung airway structure to pulmonary function via composite bridge regression. Annals of Applied Statistics, 10(4), 1880-1906.
- Luo, C., Liu, J., Dey, D., Chen, K. (2016) Canonical variate regression. Biostatistics, 17(3), 468-483.
- Mukherjee, A., Chen, K., Wang, N., Zhu, J. (2015) On the degrees of freedom of reduced-rank estimators in multivariate regression. Biometrika, 102(2), 457-477.
- Chen, K., Chan, K.-S., Stenseth, N. R. (2014). Source-sink reconstruction through regularized multi-component regression analysis – with application to assessing whether North Sea cod larvae contributed to local fjord cod in Skagerrak. Journal of the American Statistical Association, 109(506), 560-573.
- Chen, K., Dong, H., Chan, K.-S. (2013). Reduced rank regression via adaptive nuclear norm penalization. Biometrika, 100(4), 901-920.
- Chen, K., Chan, K.-S., Stenseth, N. R. (2012). Reduced rank stochastic regression with a sparse singular value decomposition. Journal of the Royal Statistical Society (Series B), 74(2), 203-221.
- Chen, K., Chan, K.-S. (2011). Subset ARMA model selection via the adaptive Lasso. Statistics and Its Interface, 4, 197-205.
Application & Case Study
- Zang, C., Zhang, H., Xu, J., Zhang, H., Fouladvand, S., Havaldar, S., Cheng, F., Chen, K., Chen, Y., Glicksberg, B. S., Chen, J., Bian, J., and Wang, F. (2023) High-throughput clinical trial emulation with real-world data and machine learning: A case study of drug repurposing for Alzheimer’s disease. Nature Communications. Accepted.
- Sacco, S., Chen, K., Wang, F., and Aseltine, R. (2023) Target-based fusion using social determinants of health to enhance suicide prediction with electronic health records. PLoS ONE, 18(4):e0283595.
- Mitra, A., Pradhan†, R., Melamed, R. D., Chen, K., Hoaglin, D. C., Tucker, K. L., Reisman, J. I., Yang, Z., Liu, W., Tsai, J., and Yu, H. (2023) Associations between natural language processing (NLP) enriched social determinants of health and suicide death among US veterans. JAMA Network Open, 6(3):e233079.
- Luo, C., Chen, K., Doshi, R., Rickles, N., Chen, Y., Schwartz, H., and Aseltine, R. H. (2022) The association of prescription opioid use with suicide attempts: An analysis of statewide medical claims data. PLoS ONE, 17(6):e0269809.
- Xu, W., Chang, S., Li, Y., Doshi, R., Chen, K., Wang, F., and Aseltine, R. (2022) Improving suicide risk prediction via targeted data fusion: proof of concept using medical claims data. Journal of the American Medical Informatics Association, 29(3):500–511. (Featured article).
- Aseltine, R., Chen, K., Wang, F., and Jin, J. (2022) Harnessing big data in health care: Challenges in enhancing the clinical utility of patient data for suicide prevention. Connecticut Medicine, 86(1):61–66.
- Sun, Y., Wang, Y., Zhu, H., Jin, N., Mohammad, A., Biyikli, N., Chen, O., Chen, K., and Zhao, J. (2022) Excitation wavelength-dependent photoluminescence decay of single quantum dots near plasmonic gold nanoparticles. Journal of Chemical Physics, 156:154701.
- Wang, J., Tang, K., Feng, K., Lin, X., Lv, W., Chen, K., and Wang, F. (2021) Impact of temperature and relative humidity on the transmission of COVID-19: a modeling study in China and the United States. BMJ Open, 11:e043863.
- Chang, S., Aseltine, R., Riddhi†, D., Chen, K., Rogers, S., and Wang, F. (2020) Machine learning for suicide risk prediction in children and adolescents with electronic health records. Translational Psychiatry, 10, 413.
- Doshi, R., Chen, K., Wang, F., Schwartz, H., Herzog, A., Aseltine, R. (2020) Identifying risk factors for mortality among patients previously hospitalized for a suicide attempt. Scientific Reports, 10:15223.
- Li, X., Dou, F., Guo, J., Velarca, M. V., Chen, K., Gentry, T., McNear, D. (2020) Soil microbial community responses to nitrogen application in organic and conventional rice (Oryza sativa L.) production. Soil Science Society of America Journal. In press.
- Liu, Y., Huang, J., Urbanowicz, R. J., Chen, K., Manduchi, E., Greene, C. S., Scheet, P., Moore, J. H., Chen, Y. (2020) Embracing heterogeneity for finding genetic interactions in large-scale research consortia. Genetic Epidemiology, 44(1):52–66.
- Doshi, R., Aseltine, R., Wang, F., Schwartz, H., Rogers, S., Chen, K. (2018) Illustrating the role of health information exchange in a learning health system: Improving the identification and management of suicide risk. Connecticut Medicine, 82(6):327–333.
- Chen, K., Aseltine, R. (2017) Using hospitalization and mortality data to target suicide prevention activities. Journal of Adolescent Health, 61:192–197.
- Choi, S., Hoffman, E. A., Wenzel, S. E., Castro, M., Fain, S., Jarjour, N., Schiebler, M. L., Chen, K., and Lin, C.-L. (2017) Quantitative computed tomography imaging-based clustering differentiates asthmatic subgroups with distinctive clinical phenotypes. Journal of Allergy and Clinical Immunology, 140(3):690–700.
- Chen, Y., Chen, K., and Kalichman, S. C. (2017) Barriers to HIV medication adherence in the context of regimen simplification. Annals of Behavioral Medicine, 51(1):67–78.
- Dou, F., Soriano, J., Tabien, R., and Chen, K. (2016) Soil texture and cultivar effects on rice (Oryza sativa, L.) grain yield, yield components and water productivity in three water regimes. PLoS ONE, 11(3):e0150549.
- Choi, S., Hoffman, E.A., Wenzel, S.E., Castro, M., Fain, S., Jarjour, N., Schiebler, M.L., Chen, K., Lin, C.-L. (2015) Quantitative assessment of multiscale structural and functional alterations in asthmatic populations. Journal of Applied Physiology, 118(10), 1286-1298.
- Chen, K., Ciannelli, L., Decker, M.B., Ladd, C., Cheng, W., Zhou, Z., Chan, K.S. (2014) Reconstructing source-sink dynamics in a population with a pelagic dispersal phase. PLoS ONE, 9(5):e95316.
- Chen, K., Chan, K.-S., Bailey, K., Aydin, K., and Ciannelli, L. (2012) A probabilistic cellular automata approach for predator-prey interactions of arrowtooth flounder (Atheresthes stomias) and walleye pollock (Theragra chalcogramma) in the eastern Bering Sea. Canadian Journal of Fisheries and Aquatic Sciences, 69(2):259–272.