
Representation-learning based propensity score model for causal inference in high dimension

Development of a representation learning-based propensity score prediction model for causal inference using high-dimensional data

Abstract

There has been a surge in medical research attempting causal inference, driven by the growing adoption of electronic health records (EHRs) and the secondary use of large claims databases. Unlike in randomized clinical trials, treatment assignment in observational data is not independent of baseline characteristics. Hence, two key assumptions must be satisfied to estimate causal effects in observational studies: unconfoundedness and overlap. In most studies, unconfoundedness, rather than overlap, poses the greater challenge. Intuitively, unconfoundedness becomes more plausible as more covariates are included in the analysis. In this regard, the large-scale propensity score model (LSPS), which balances virtually all observed confounders, is preferable to a propensity score model that adjusts for only tens of expert-selected variables. However, LSPS often fails to balance the available covariates in high-dimensional, low-sample-size (HDLSS) data, i.e., p >> n. This weakness hinders its wide adoption across distributed research networks based on standardized clinical data.

This study therefore aims to develop a more robust propensity-score-based framework for causal inference in HDLSS data: the database-wide representation-learning-based propensity score model (RLPS). RLPS consists of two components: 1. a task-agnostic, database-wide asymmetrically stacked autoencoder (DASA) that abstracts high-dimensional features; and 2. a downstream Bayesian lasso that estimates the propensity score. DASA is trained in an unsupervised manner on a database-wide feature matrix to distill a condensed, meaningful representation. Once DASA is pretrained, its deep encoder maps the covariates into the condensed space, and the Bayesian lasso then estimates the propensity score as a downstream task. Finally, propensity score matching is conducted to estimate the average treatment effect.

The performance of RLPS was evaluated using two clinical cases: 1. a comparative cohort study of new users of angiotensin receptor blockers versus calcium channel blockers in hypertension; and 2. a comparative cohort study of new users of ranitidine versus other H2-receptor antagonists. In each case, 1,000 and 500 patients were randomly sampled 100 times from a single standardized EHR database of a tertiary hospital. Unconfoundedness, accuracy of risk estimates, and residual bias were compared between RLPS and LSPS. Compared with LSPS, RLPS identified more overlap and achieved better balance of a large set of covariates between the target and comparator cohorts. In most scenarios, RLPS performed better when empirical equipoise was present. RLPS can be an attractive alternative to LSPS in studies where the number of covariates exceeds the number of observations. Furthermore, RLPS may facilitate population-level estimation studies using the EHRs of single institutions across a distributed research network.
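To illustrate the pipeline described above, the sketch below approximates the RLPS workflow: unsupervised pretraining of an asymmetric autoencoder on a database-wide feature matrix, encoding study covariates into a condensed space, estimating propensity scores, and 1:1 propensity score matching to obtain an average treatment effect. This is a minimal, hypothetical sketch, not the thesis implementation: the layer sizes, hyperparameters, and function names are illustrative assumptions, and the Bayesian lasso is approximated by an L1-penalized (Laplace-prior MAP) logistic regression from scikit-learn rather than a full Bayesian model.

```python
# Minimal sketch of an RLPS-style pipeline (illustrative assumptions only).
# "DASA" is represented here by a plain asymmetric autoencoder
# (deep encoder, shallow decoder); the Bayesian lasso is approximated
# by an L1-penalized logistic regression.

import numpy as np
import torch
import torch.nn as nn
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors


class AsymmetricAutoencoder(nn.Module):
    """Deep encoder with a shallow decoder, loosely mirroring DASA."""

    def __init__(self, n_features: int, code_dim: int = 32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(n_features, 512), nn.ReLU(),
            nn.Linear(512, 128), nn.ReLU(),
            nn.Linear(128, code_dim),
        )
        self.decoder = nn.Linear(code_dim, n_features)  # shallow decoder

    def forward(self, x):
        code = self.encoder(x)
        return self.decoder(code), code


def pretrain_autoencoder(X: np.ndarray, epochs: int = 50) -> AsymmetricAutoencoder:
    """Task-agnostic, unsupervised pretraining on a database-wide feature matrix."""
    x = torch.tensor(X, dtype=torch.float32)
    model = AsymmetricAutoencoder(X.shape[1])
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        recon, _ = model(x)
        loss = loss_fn(recon, x)
        loss.backward()
        opt.step()
    return model


def estimate_ate(X_study, treatment, outcome, model):
    """Encode covariates, fit a propensity model, 1:1 match, and take a difference in means."""
    with torch.no_grad():
        _, code = model(torch.tensor(X_study, dtype=torch.float32))
    Z = code.numpy()

    # L1-penalized logistic regression as a stand-in for the Bayesian lasso.
    ps_model = LogisticRegression(penalty="l1", solver="liblinear", C=1.0)
    ps = ps_model.fit(Z, treatment).predict_proba(Z)[:, 1]

    # 1:1 nearest-neighbor matching on the propensity score.
    treated = np.where(treatment == 1)[0]
    control = np.where(treatment == 0)[0]
    nn_index = NearestNeighbors(n_neighbors=1).fit(ps[control].reshape(-1, 1))
    _, idx = nn_index.kneighbors(ps[treated].reshape(-1, 1))
    matched_control = control[idx.ravel()]

    # Average treatment effect in the matched sample (difference in mean outcomes).
    return outcome[treated].mean() - outcome[matched_control].mean()
```

Consistent with the abstract, the sketch separates the two stages: the autoencoder is pretrained once on the database-wide feature matrix and its encoder is then reused for each study cohort, with the downstream propensity model fitted only in the condensed representation space.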


Table of Contents

I. Introduction
A. Background
B. Purpose of study
II. Literature review
A. Assumptions for estimating average treatment effects and large-scale propensity score model
B. Causal inference in high dimension
C. Deep learning as representation learning
D. Neural network for causal inference
III. Database-wide representation learning-based propensity score model
A. Data preparation
B. Database-wide asymmetrically stacked autoencoder
C. Representation learning-based propensity score model
IV. Case studies
A. Definition of clinical case 1
B. Definition of clinical case 2
C. Large-scale propensity score model matching
D. Sampling and propensity score matching
E. Evaluation of c-statistics of propensity score models and overlap
F. Evaluation of unconfoundedness
G. Evaluation of accuracy in risk estimates
H. Evaluation of residual confounding
V. Results from case studies
A. Results from the full cohorts of case 1
B. Results from the sampled cohorts of case 1
C. Results from the whole cohorts of case 2
D. Results from the sampled cohorts of case 2
VI. Discussion
Limitations
VII. Conclusion
References
