Chunhua Weng
Professor of Biomedical Informatics
Columbia University

Mission Statement:

My long-term research interest is to accelerate clinical and translational science using electronic data while minimizing study design biases and optimizing the generalizability of study results. My approach combines formal methods with socio-technical approaches, and I combine text knowledge engineering and health data analytics to improve the efficiency and generalizability of clinical research. My co-authors and I have created a distribution-based method for quantifying the collective generalizability of multiple clinical trials and a novel generalizability index for study traits (GIST), which have enabled scalable and proactive assessment of clinical study generalizability. My team also explores the symbiosis between knowledge representation and natural language processing for text knowledge engineering, as reflected in our work on EliXR. I aim to advance the field of clinical research informatics on several fronts, including text knowledge engineering, aggregate analysis of clinical studies, quality-aware computational reuse of electronic patient data and public data, and clinical research workflow optimization in patient care settings, toward the achievement of a learning health system. Currently, I also spend a significant amount of my time leading the Columbia eMERGE project as part of the national eMERGE network. I also participate in the Biomedical Data Translator consortium.

My current research focuses on these areas:

Openings for research officer, postdoc, and research assistant positions are available immediately to model patient populations and quantify the population representativeness of clinical studies using electronic data sources. Highly motivated individuals with a computing background and quantitative analytical skills are encouraged to apply. Prospective candidates can email me a CV with “job application” in the subject line.

Selected Publications (All my publications are listed chronologically here, or grouped by the above listed topics here):

Son JH, Xie G, Yuan C, Ena L, Li Z, Goldstein A, Huang L, Wang L, Shen F, Liu H, Mehl K, Groopman E, Marasa M, Kiryluk K, Gharavi AG, Chung WK, Hripcsak G, Friedman C, Weng C*, Wang K*,
Deep phenotyping on electronic health records facilitates genetic diagnosis by clinical exomes, American Journal of Human Genetics, 2018 Jul 5;103(1):58-73. doi: 10.1016/j.ajhg.2018.05.010. Epub 2018 Jun 28. PMID: 29961570 (*: equal-contribution corresponding author).

Yuan C, Ryan PB, Ta C, Guo Y, Li Z, Hardin J, Markadia R, Jin P, Shang N, Kang T, Weng C, Criteria2Query: A Natural Language Interface to Clinical Databases for Cohort Definition,
J Am Med Inform Assoc, in press, 2019.

Goldstein A, Venker E, Weng C, Evidence Appraisal: A Scoping Review, Conceptual Framework, and Research Agenda, J Am Med Inform Assoc, 2017.

Sen A, Goldstein A, Chakrabarti S, Shang N, Kang T, Yaman A, Ryan P, Weng C, The Representativeness of Eligible Patients in Type 2 Diabetes Trials: A Case Study Using GIST 2.0. J Am Med Inform Assoc, 2017.

Kang T, Zhang S, Tang Y, Hruby GW, Rusanov A, Elhadad N, Weng C, EliIE: An Open-Source Information Extraction System for Clinical Trial Eligibility Criteria, J Am Med Inform Assoc, 2017 Apr 1. doi: 10.1093/jamia/ocx019. [Epub ahead of print] PMID: 28379377.

Miotto R, Weng C, Case-based Reasoning Using Electronic Health Records Identifies Eligible Patients for Clinical Trials, J Am Med Inform Assoc, 2015.

Acknowledgment: Dr. Weng thanks the National Library of Medicine, the National Human Genome Research Institute, and the National Center for Advancing Translational Sciences for funding her research.

Contact Information:

Chunhua Weng, Ph.D., FACMI
Professor of Biomedical Informatics
Columbia University
622 West 168 Street, PH-20-room 407
New York, NY 10032
University email account ( chunhua

Last Updated: 07-2019