Please feel free to email me at c.shi7@lse.ac.uk if you have any comments.
* indicates equal contribution
Liu, P., Shi, C. and Sun, W. Dual Active Learning for Reinforcement Learning from Human Feedback
Sun, K., Kong, L., Zhu, H. and Shi, C. Optimal Treatment Allocation Strategies for A/B Testing in Partially Observable Time Series Experiments. Python module ARMAdesign
slides
Hao, M*., Su, P*., Hu, L., Szabó, Z., Zhao, Q. and Shi, C. Off-policy Evaluation with Deeply-abstracted States. Python module state-abstraction
Dai, R*., Wang, J*., Zhou, F*., Luo, S., Qin, Q., Shi, C., and Zhu, H. Causal Deepsets for Off-policy Evaluation under Spatial or Spatio-temporal Interferences
Yang, Y., Shi, C., Yao, F., Wang, S. and Zhu, H. Spatially Randomized Designs Can Enhance Policy Evaluation
Aminian, G*., Behnamnia, A*., Vega, R., Toni, L., Shi, C., Rabiee, H., Rivasplata, O. and Rodrigues, M. Semi-supervised Batch Learning From Logged Data
Wang, D., Shi, C., Luo, S. and Sun, W. Pessimistic Causal Reinforcement Learning with Mediators for Confounded Offline Data
Ma, T*., Zhu, J*., Cai, H., Qi, Z., Chen, Y., Shi, C. and Laber, E. Sequential Knockoffs for Variable Selection in Reinforcement Learning (SEEK)
Yang, X., Shi, C., Luo, S., Wang, L. and Song, R. Quantile Off-Policy Evaluation via Deep Conditional Generative Learning
2023 JSM Student Paper Award
Uehara, M., Shi, C. and Kallus, N. A Review of Off-Policy Evaluation in Reinforcement Learning
Hu, L*., Li, M*., Shi, C., Wu, Z. and Fryzlewicz, P. Doubly Inhomogeneous Reinforcement Learning. Python module DIRL
slides presented at CMStatistics 2022.
Wang, J., Qi, Z. and Shi, C. Blessing from Experts: Super Reinforcement Learning in Confounded Environments
Li, M*., Shi, C*., Wu, Z. and Fryzlewicz, P. Testing Stationarity and Change Point Detection in Reinforcement Learning.
Python module CUSUM-RL
slides video presented at JSM 2022.
Luo, L*., Shi, C*., Wang, J*., Wu, Z. and Li, L. (2024+). Multivariate Dynamic Mediation Analysis under a Reinforcement Learning Framework, Annals of Statistics, accepted.
Yu, S., Fang, S., Peng, R., Qi, Z., Zhou, F. and Shi, C. (2024). Two-way Deconfounder for Off-policy Evaluation under Unmeasured Confounding, NeurIPS. Python module Two-way-deconfounder
Bian, Z., Shi, C., Qi, Z. and Wang, L. (2024+). Off-policy Evaluation in Doubly Inhomogeneous Environments, Journal of the American Statistical Association, accepted. Python module 2FEOPE
Li, T*., Shi, C*., Wen, Q., Sui, Y., Qin, Y., Lai, C. and Zhu, H. (2024). Combining Experimental and Historical Data for Policy Evaluation, ICML. Python module Data_Combination
Li, J., Shi, C., Li, L. and Collins, A. (2024). Dynamic noise estimation: A generalized method for modeling noise fluctuations in decision-making, Journal of Mathematical Psychology, 119, 102842. Python module dynamic_noise_estimation
Shi, C*., Qi, Z*., Wang, J. and Zhou, F. (2024). Value Enhancement of Reinforcement Learning via Efficient and Robust Trust Region Optimization, Journal of the American Statistical Association, 119, 2011-2025. Python module VEPO
Shi, C., Zhou, Y. and Li, L. (2024). Testing Directed Acyclic Graph via Structural, Supervised and Generative Adversarial Learning (SUGAR), Journal of the American Statistical Association, 119, 1833-1846. Python module SUGAR
slides presented at JSM 2021.
Li, T*., Shi, C*., Lu, Z., Li, Y. and Zhu, H. (2024). Evaluating Dynamic Conditional Quantile Treatment Effects with Applications in Ridesharing, Journal of the American Statistical Association, 119, 1736-1750. Python module CQSTVCM
Shi, C., Zhu, J., Shen, Y., Luo, S., Zhu, H. and Song, R. (2024). Off-Policy Confidence Interval Estimation with Confounded Markov Decision Process (COPE), Journal of the American Statistical Association, 119, 273-284. Python module COPE
Shi, C., Luo, S., Le, Y., Zhu, H. and Song, R. (2024). Statistically Efficient Advantage Learning for Offline Reinforcement Learning in Infinite Horizons (SEAL), Journal of the American Statistical Association, 119, 232-245. Python module SEAL
Luo, S*., Yang, Y*., Shi, C*., Yao, F., Ye, J. and Zhu, H. (2024). Policy Evaluation for Temporal and/or Spatial Dependent Experiments, Journal of the Royal Statistical Society, Series B, 86, 623–649. Python module STVCM.
Zhu, J*., Wan, R*., Qi, Z., Luo, S. and Shi, C. (2024). Robust Offline Reinforcement Learning with Heavy-Tailed Rewards, AISTATS. Python module ROOM.
Uehara, M., Kiyohara, H., Bennett, A., Chernozhukov, V., Jiang, N., Kallus, N., Shi, C. and Sun, W. (2023). Future-Dependent Value-Based Off-Policy Evaluation in POMDPs, NeurIPS (spotlight).
Li, T*., Shi, C*., Wang, J., Zhou, F. and Zhu, H. (2023). Optimal Treatment Allocation for Efficient Policy Evaluation in Sequential Decision Making, NeurIPS. Python module MDPdesign.
Zhou, Y., Shi, C., Li, L. and Yao, Q. (2023). Testing for the Markov Property in Time Series via Deep Conditional Generative Learning, Journal of the Royal Statistical Society, Series B, 85, 1204–1222. Python module markov_test
Shi, C., Wan, R., Song, G., Luo, S., Zhu, H. and Song, R. (2023). A Multi-Agent Reinforcement Learning Framework for Off-Policy Evaluation in Two-sided Markets, Annals of Applied Statistics, 17, 2701-2722. Python module CausalMARL
Shi, C*., Wang, X*., Luo, S., Zhu, H., Ye, J. and Song, R. (2023). Dynamic Causal Effects Evaluation in A/B Testing with a Reinforcement Learning Framework. Journal of the American Statistical Association, 118, 2059-2071.
Python module CausalRL
slides video presented at Online Causal Inference Seminar.
Wu, G., Song, G., Lv, X., Luo, S., Shi, C. and Zhu, H. (2023). DNet: Distributional Network for Distributional Individualized Treatment Effects, KDD.
Ge, L., Wang, J., Shi, C., Wu, Z. and Song, R. (2023). A Reinforcement Learning Framework for Dynamic Mediation Analysis, ICML. Python module MediationRL
2023 ICSA Student Paper Award
Yang, X., Zhu, J., Shi, C., Luo, S. and Song, R. (2023). An Instrumental Variable Approach to Confounded Off-Policy Evaluation, ICML. Python module IVMDP
Wang, J., Shi, C. and Wu, Z. (2023). A Robust Test for the Stationarity Assumption in Sequential Decision Making, ICML. Python module Double-CUSUM-RL
Li, J., Shi, C., Li, L. and Collins, A. (2023). A Generalized Method for Dynamic Noise Inference in Modeling Sequential Decision-making, CogSci.
Shi, C. (2023). The Impact of David Cox’s Work and Leadership on My Research, Harvard Data Science Review.
Gao, Y., Shi, C. and Song, R. (2023). Deep Spectral Q-learning with Application to Mobile Health, Stat, 12, e564.
2022 JSM Student Paper Award
Cai, H*., Shi, C*., Song, R. and Lu, W. (2023). Jump Interval-Learning for Individualized Decision Making with Continuous Treatments, Journal of Machine Learning Research, 24, 1–92. R Package JQL
Zhou, Y., Qi, Z., Shi, C. and Li, L. (2023). Optimizing Pessimism in Dynamic Treatment Regimes: A Bayesian Learning Approach, AISTATS. Python module PBL
Zhang, Y., Shi, C. and Luo, S. (2023). Conformal Off-Policy Prediction (COPP), AISTATS. R code COPP
Shi, C. and Li, L. (2022). Testing Mediation Effects Using Logic of Boolean Matrices (LOGAN), Journal of the American Statistical Association, 117, 2014-2027.
Python module LOGAN
slides presented at JSM 2021.
Shi, C., Zhang, S., Song, R. and Lu, W. (2022). Statistical Inference of the Value Function for Reinforcement Learning in Infinite Horizon Settings, Journal of the Royal Statistical Society, Series B, 84, 765-793.
Python module SAVE
slides presented at ICSA 2019.
Shi, C*., Uehara, M*., Huang, J. and Jiang, N. (2022). A Minimax Learning Approach to Off-Policy Evaluation in Confounded Partially Observable Markov Decision Processes, ICML (long talk, top 2% of submissions). Python module Confounded-POMDP-OPE
video presented at ICML.
Li, L., Shi, C., Guo, T. and Jagust, W. (2022). Sequential Pathway Inference for Multimodal Neuroimaging Analysis, Stat, 11, e433.
Python module LOGAN
slides presented at JSM 2021.
Shi, C., Xu, T., Bergsma, W. and Li, L. (2021). Double Generative Adversarial Networks for Conditional Independence Testing. Journal of Machine Learning Research, 22, 1-32. Python module dgcit
Shi, C., Luo, S., Zhu, H. and Song, R. (2021). An Online Sequential Test for Qualitative Treatment Effects. Journal of Machine Learning Research, 22, 1-51.
Cai, H*., Shi, C*., Song, R. and Lu, W. (2021). Deep Jump Learning for Off-Policy Evaluation in Continuous Treatment Settings, NeurIPS.
2021 ENAR Distinguished Student Paper Awards
Python module DJL
video presented at NeurIPS.
Wan, R*., Zhang, S*., Shi, C., Luo, S. and Song, R. (2021). Pattern Transfer Learning for Reinforcement Learning in Order Dispatching, IJCAI Reinforcement Learning for Intelligent Transportation Systems Workshop (best paper, spotlight).
video presented at the workshop.
Shi, C*., Wan, R*., Chernozhukov, V. and Song, R. (2021). Deeply-Debiased Off-Policy Interval Estimation, ICML (long talk, top 3% of submissions).
Python module D2OPE
video presented at ICML.
Shi, C., Song, R., Lu, W. and Li, R. (2021). Statistical Inference for High-Dimensional Models via Recursive Online-Score Estimation (ROSE), Journal of the American Statistical Association, 116, 1307-1318. R code for linear/logistic regression
Shi, C., Song, R. and Lu, W. (2021). Concordance and Value Information Criteria for Optimal Treatment Decision (CIVIC), Annals of Statistics, 49, 49-75.
Shi, C., Lu, W. and Song, R. (2020). Breaking the Curse of Nonregularity with Subagging: Inference of the Mean Outcome under Optimal Treatment Regimes, Journal of Machine Learning Research, 21, 1-67. R and C sample code subagging2.cpp sb.r
Shi, C., Wan, R., Song, R., Lu, W. and Leng, L. (2020). Does the Markov Decision Process Fit the Data: Testing for the Markov Property in Sequential Decision Making. ICML.
Python module TestMDP
slides video presented at CMStatistics 2020, ICML 2020, JSM 2020 and EYSM 2021.
Shi, C., Lu, W. and Song, R. (2020). A Sparse Random Projection-based Test for Overall Qualitative Treatment Effects, Journal of the American Statistical Association, 115, 1201-1213.
Shi, C., Song, R., Chen, Z. and Li, R. (2019). Linear Hypothesis Testing for High Dimensional Generalized Linear Models. Annals of Statistics, 47, 2671-2703.
2018 IMS travel award
R code for linear/logistic/Poisson regression
Shi, C., Lu, W., and Song, R. (2019). On Testing Conditional Qualitative Treatment Effects. Annals of Statistics, 47, 2348-2377.
2017 IMS travel award
slides presented at JSM 2017.
Shi, C., Lu, W. and Song, R. (2019). Determining the Number of Latent Factors in Multirelational Learning, Journal of Machine Learning Research, 20, 1-38.
Shi, C., Lu, W., and Song, R. (2018). A Massive Data Framework for M-estimators with Cubic-Rate. Journal of the American Statistical Association, 113, 1698-1709.
Shi, C., Song, R., Lu, W., and Fu, B. (2018). Maximin Projection Learning for Optimal Treatment Decision with Heterogeneous Individualized Treatment Effects. Journal of the Royal Statistical Society, Series B, 80, 681-702.
R package ITRLearn
slides presented at JSM 2016, poster presented at 2018 NCSU research symposium.
Shi, C., Fan, A., Song, R., and Lu, W. (2018). High-Dimensional A-Learning for Optimal Dynamic Treatment Regimes. Annals of Statistics, 46, 925-957.
R package ITRSelect
slides presented at ENAR 2016.
Shi, C., Song, R. and Lu, W. (2018). Discussion of “Optimal Treatment Allocations in Space and Time for On-Line Control of an Emerging Infectious Disease”, Journal of the Royal Statistical Society, Series C, 67, 743-789.
Shi, C., Song, R. and Lu, W. (2017). Discussion of “Random Projection Ensemble Classification”, Journal of the Royal Statistical Society, Series B, 79, 959-1035.
Shi, C., Song, R. and Lu, W. (2016). Robust Learning for Optimal Treatment Decision with NP-Dimensionality, Electronic Journal of Statistics, 10, 2894-2921.
Zhang, P., Qiu, Z. and Shi, C. (2016). simplexreg: An R Package for Regression Analysis of Proportional Data Using Simplex Distribution, Journal of Statistical Software, 71, 1-21. R package simplexreg