Causal Inference

Off-Policy Evaluation and Learning for Survival Outcomes under Censoring

Abstract Optimizing survival outcomes, such as patient survival or customer retention, is a critical objective in data-driven decision-making. Off-Policy Evaluation (OPE) provides a powerful framework for assessing such decision-making policies using logged data alone, without the need for costly or risky online experiments in high-stakes applications. However, typical estimators are not designed to handle right-censored survival outcomes, as they ignore unobserved survival times beyond the censoring time, leading to systematic underestimation of the true policy performance. To address this issue, we propose a novel framework for OPE and Off-Policy Learning (OPL) tailored for survival outcomes under censoring. Specifically, we introduce IPCW-IPS and IPCW-DR, which employ the Inverse Probability of Censoring Weighting technique to explicitly deal with censoring bias. We theoretically establish that our estimators are unbiased and that IPCW-DR achieves double robustness, ensuring consistency if either the propensity score or the outcome model is correct. Furthermore, we extend this framework to constrained OPL to optimize policy value under budget constraints. We demonstrate the effectiveness of our proposed methods through simulation studies and illustrate their practical impacts using public real-world data for both evaluation and learning tasks.

Mar 24, 2026
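
The idea behind the IPCW-IPS estimator in the abstract above can be sketched as follows. This is a generic combination of importance weighting and inverse probability of censoring weighting, not necessarily the exact estimator defined in the paper. The function name `ipcw_ips`, its arguments, and the choice of the observed survival time as the reward are all assumptions made for illustration, and the censoring survival probabilities are assumed to have been estimated elsewhere (for example with a Kaplan-Meier fit on the censoring times).

```python
def ipcw_ips(times, events, target_probs, logging_probs, censor_surv):
    """Hypothetical sketch of an IPCW-weighted IPS policy value estimate.

    times         : observed time for each logged sample (event or censoring)
    events        : 1 if the event was observed, 0 if the sample was censored
    target_probs  : probability the evaluated policy assigns to the logged action
    logging_probs : probability the logging policy assigned to the logged action
    censor_surv   : estimated probability of still being uncensored at times[i]
    """
    n = len(times)
    total = 0.0
    for t, d, pi_e, pi_0, g in zip(times, events, target_probs,
                                   logging_probs, censor_surv):
        if d:  # censored samples contribute zero; uncensored ones are up-weighted
            total += (pi_e / pi_0) * t / g
    return total / n


# Toy data (made up): the second sample is censored, the third is up-weighted.
value = ipcw_ips(
    times=[2.0, 4.0, 3.0, 5.0],
    events=[1, 0, 1, 1],
    target_probs=[0.5, 0.5, 0.5, 0.5],
    logging_probs=[0.5, 0.5, 0.5, 0.5],
    censor_surv=[1.0, 0.5, 0.5, 1.0],
)
print(value)
```

Dividing by the censoring survival probability lets each uncensored sample stand in for comparable samples that were censored, which is how this kind of weighting counteracts the underestimation the abstract describes.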

Causal Inference under Threshold Manipulation: Bayesian Mixture Modeling and Heterogeneous Treatment Effects

Abstract Many marketing applications, including credit card incentive programs, offer rewards to customers who exceed specific spending thresholds to encourage increased consumption. Quantifying the causal effect of these thresholds on customers is crucial for effective marketing strategy design. Although regression discontinuity design is a standard method for such causal inference tasks, its assumptions can be violated when customers, aware of the thresholds, strategically manipulate their spending to qualify for the rewards. To address this issue, we propose a novel framework for estimating the causal effect under threshold manipulation. The main idea is to model the observed spending distribution as a mixture of two distributions: one representing customers strategically affected by the threshold, and the other representing those unaffected. To fit the mixture model, we adopt a two-step Bayesian approach consisting of modeling non-bunching customers and fitting a mixture model to a sample around the threshold. We show posterior contraction of the resulting posterior distribution of the causal effect under large samples. Furthermore, we extend this framework to a hierarchical Bayesian setting to estimate heterogeneous causal effects across customer subgroups, allowing for stable inference even with small subgroup sample sizes. We demonstrate the effectiveness of our proposed methods through simulation studies and illustrate their practical implications using a real-world marketing dataset.

Mar 14, 2026
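
The mixture idea in the abstract above can be written compactly. The notation here is illustrative and not taken from the paper: $f$ denotes the observed spending density, $f_1$ the density for customers strategically affected by the threshold, $f_0$ the density for unaffected customers, and $\pi$ the share of affected customers.

```latex
f(y) = \pi \, f_1(y) + (1 - \pi) \, f_0(y), \qquad 0 \le \pi \le 1 .
```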

Wald-Difference-in-Differences Estimation without Individual-level Treatment Data

Abstract In-store advertising, such as digital signage and posters, is a crucial method that influences customer behavior. While effectiveness is often evaluated by displaying ads on a store-by-store basis and comparing outcomes for those exposed to ads and those not, obtaining individual ad exposure data is costly, making it difficult to conduct causal inference with individual-level treatment variables. A common approach to address this issue is to perform causal inference considering non-compliance, treating visitors to stores with advertising as the treatment group and similar customers who visited comparable stores as the control. In this setting, a popular estimator is the ratio of two Difference-in-Differences (DID) estimates: one for the outcome and one for the treatment variable. However, prior studies assumed the DID estimate for the treatment variable is known, which is not always true. To address this, we propose a causal inference method using the fact that, for binary treatment, the DID estimate of the treatment variable represents the change in the proportion of compliers in the treatment group. Our method uses a Gaussian Mixture Model to estimate this proportion. This approach allows estimation of the treatment effect on compliers even when individual ad exposure data is unobserved.

Feb 16, 2026
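
The ratio estimator mentioned in the abstract above, one DID estimate for the outcome divided by one for the treatment variable, can be sketched as follows. The function names, argument names, and numbers are hypothetical. In the paper's setting the treatment-side group means would not be observed directly; this sketch simply takes them as inputs.

```python
def did(treat_post, treat_pre, ctrl_post, ctrl_pre):
    """Difference-in-differences of four group means."""
    return (treat_post - treat_pre) - (ctrl_post - ctrl_pre)


def wald_did(outcome_means, treatment_means):
    """Ratio of the outcome DID to the treatment DID.

    Both arguments are dicts with keys treat_post, treat_pre,
    ctrl_post, ctrl_pre (an assumed layout for this sketch).
    """
    denominator = did(**treatment_means)
    if denominator == 0:
        raise ValueError("treatment DID is zero; the ratio is undefined")
    return did(**outcome_means) / denominator


# Made-up group means: average spend, and share of customers exposed to the ad.
outcome = {"treat_post": 12.0, "treat_pre": 10.0, "ctrl_post": 10.5, "ctrl_pre": 10.0}
exposure = {"treat_post": 0.25, "treat_pre": 0.0, "ctrl_post": 0.0, "ctrl_pre": 0.0}
effect = wald_did(outcome, exposure)
print(effect)
```

With a binary treatment, the denominator is the change in the share of exposed customers in the treatment group relative to the control group, so the ratio rescales the store-level effect to the customers who were actually exposed.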

Impact of lottery promotion wins and losses: evidence from a promotion in a mobile payment service

Abstract Purpose: Lottery promotion is a gambled price discount that provides random incentives for each consumer transaction. This study investigates how winning (i.e. being selected exactly once during the promotion) and losing (i.e. never being selected) influence consumer payment both during and after the promotion.

Dec 9, 2025

Multiple Treatments Causal Effects Estimation with Task Embeddings and Balanced Representation Learning

Abstract The simultaneous application of multiple treatments is increasingly common in many fields, such as healthcare and marketing. In such scenarios, it is important to estimate the single treatment effects and the interaction treatment effects that arise from treatment combinations. Previous studies have proposed using independent outcome networks with subnetworks for interactions, or combining task embedding networks that capture treatment similarity with variational autoencoders. However, these methods either lack parameter sharing among related treatments or estimate unnecessary latent variables, which reduces the accuracy of causal effect estimation. To address these issues, we propose a novel deep learning framework that incorporates a task embedding network and a representation learning network with the balancing penalty. The task embedding network enables parameter sharing across related treatment patterns because it encodes elements common to single effects and contributions specific to interaction effects. The representation learning network with the balancing penalty learns representations nonparametrically from observed covariates while reducing distances in representation distributions across different treatment patterns. This process mitigates selection bias and avoids model misspecification. Simulation studies demonstrate that the proposed method outperforms existing baselines, and application to real-world marketing datasets confirms the practical implications and utility of our framework.

Nov 12, 2025

Estimation of Single and Synergistic Treatment Effects under Multiple Treatments with Deep Neural Networks

Abstract The simultaneous application of multiple treatments is increasingly common in many fields, such as healthcare and marketing. In such scenarios, it is important to estimate not only the effect of each single treatment, but also the synergistic treatment effects that arise from combinations of treatments. Previous studies have proposed methods that combine a variational autoencoder with a task embedding network, which captures treatment similarities for multi-treatment causal inference. These methods assume the presence of unobserved covariates and regard observed data as proxies for those unobserved covariates. As a result, they may still learn unnecessary latent variables even when the covariates are observed. This model misspecification can lead to misleading estimates of causal effects. To address this issue, we propose a novel deep learning framework that simultaneously captures both single and synergistic treatment effects and mitigates selection bias, using a task embedding network and a representation learning network with the balancing penalty. The task embedding network ensures that similar treatments yield similar representations and outcomes, improving the estimation of both single and synergistic effects. The representation learning network with the balancing penalty directly learns representations from observed covariates and controls distributional differences across treatment patterns using Integral Probability Metrics, thereby reducing the risk of model misspecification due to erroneous latent structures. We evaluate our method using multiple simulation datasets and compare its performance with existing baselines. Our method consistently outperforms baselines by reducing estimation errors in both single and synergistic treatment effects across settings.

Aug 4, 2025
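
A balancing penalty of the kind described in the abstract above can be illustrated with one simple member of the Integral Probability Metric family: the squared linear-kernel MMD, which reduces to the squared distance between group means of the representations. The abstract does not say which metric the paper uses, so this choice, the function names, and the toy data are all assumptions.

```python
from itertools import combinations


def group_means(reps, patterns):
    """Mean representation vector for each treatment pattern."""
    sums, counts = {}, {}
    for rep, pattern in zip(reps, patterns):
        if pattern not in sums:
            sums[pattern] = [0.0] * len(rep)
            counts[pattern] = 0
        sums[pattern] = [s + v for s, v in zip(sums[pattern], rep)]
        counts[pattern] += 1
    return {p: [s / counts[p] for s in sums[p]] for p in sums}


def linear_mmd_penalty(reps, patterns):
    """Sum of squared mean differences over all pairs of treatment patterns."""
    means = group_means(reps, patterns)
    total = 0.0
    for a, b in combinations(sorted(means), 2):
        total += sum((x - y) ** 2 for x, y in zip(means[a], means[b]))
    return total


# Toy 2-d representations; each pattern is a tuple of binary treatment flags.
reps = [[0.0, 0.0], [2.0, 0.0], [1.0, 1.0]]
patterns = [(0, 0), (0, 0), (1, 0)]
penalty = linear_mmd_penalty(reps, patterns)
print(penalty)
```

Adding a term like this to the training loss pushes the representation distributions of different treatment patterns toward each other, which is the mechanism the abstract credits with mitigating selection bias.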

Causal Inference under Threshold Manipulation: A Bayesian Mixture Approach

Abstract Many marketing applications, including credit card incentive programs, offer rewards to customers exceeding specific spending thresholds to encourage increased consumption. Quantifying the causal effect of these thresholds on customers is crucial for effective marketing strategy design. While regression discontinuity design is a common method for such causal inference tasks, its assumptions can be violated when customers, aware of the thresholds, strategically manipulate their spending to qualify for the rewards. To address this issue, we propose a novel framework for estimating the causal effect of thresholds on customers under their manipulation. The core idea is to model the observed spending distribution as a mixture of two distributions: one representing customers strategically affected by the threshold and the other representing those unaffected. To fit the mixture model, we adopt a Bayesian approach, which enables valid causal effect estimation with proper uncertainty quantification. Furthermore, we extend this framework to a hierarchical Bayesian setting to estimate heterogeneous causal effects across customer subgroups, allowing for stable inference even with small subgroup sample sizes. We demonstrate the effectiveness of our proposed methods through simulation studies and show that our proposed framework yields more accurate estimates of the causal effect of thresholds on customers compared to naive regression discontinuity design methods.

Aug 4, 2025

Wald-Differences-in-Differences Estimation without Individual-Level Treatment Data

Abstract In-store advertising, such as digital signage and in-store posters, is a crucial advertising method that influences customer purchasing behavior. While its effectiveness is typically evaluated by displaying ads on a store-by-store basis and comparing the purchasing behavior of those exposed to ads with those who are not, obtaining ad exposure data for individual customers is costly, making it challenging to conduct accurate causal inference with individual-level treatment variables. A common approach to address this issue is to perform causal inference considering non-compliance, setting visitors to stores implementing an ad campaign as the treatment group and similar customers who have visited comparable stores as the control group. In this setting, a popular estimator is the ratio of two Differences-in-Differences (DID) estimates: one for the outcome variable and another for the treatment variable. However, previous studies assume that the DID estimate for the treatment variable is known from public data, which is not always the case. To overcome this limitation, we propose a method to estimate causal effects by utilizing the fact that, for binary treatment variables, the DID estimate of the treatment variable represents the change in the proportion of compliers in the treatment group. Our method leverages a Gaussian Mixture Model to estimate the proportion. This approach allows for the estimation of the treatment effect on the compliers even in advertising strategies where ad exposure data for individual customers is unavailable.

Mar 4, 2025