Workshop

Estimation of Single and Synergistic Treatment Effects under Multiple Treatments with Deep Neural Networks

Abstract The simultaneous application of multiple treatments is increasingly common in many fields, such as healthcare and marketing. In such scenarios, it is important to estimate not only each single treatment effect, but also the synergistic treatment effects that arise from combinations of treatments. Previous studies have proposed methods that combine a variational autoencoder with a task embedding network, which captures treatment similarities for multi-treatment causal inference. These methods assume the presence of unobserved covariates and regard observed data as proxies for those unobserved covariates. As a result, they may still learn unnecessary latent variables even when the covariates are observed. This model misspecification can lead to misleading estimates of causal effects. To address this issue, we propose a novel deep learning framework that simultaneously captures both single and synergistic treatment effects and mitigates selection bias, using a task embedding network and a representation learning network with a balancing penalty. The task embedding network ensures that similar treatments yield similar representations and outcomes, improving the estimation of both single and synergistic effects. The representation learning network with the balancing penalty learns representations directly from observed covariates and controls distributional differences across treatment patterns using Integral Probability Metrics, thereby reducing the risk of model misspecification due to erroneous latent structures. We evaluate our method using multiple simulation datasets and compare its performance with existing baselines. Our method consistently outperforms the baselines, reducing estimation errors in both single and synergistic treatment effects across settings.
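
As a rough illustration of a balancing penalty over treatment patterns, the sketch below sums a squared linear MMD (one simple Integral Probability Metric) over every pair of patterns. The abstract does not say which IPM is used or how patterns are paired, so the choice of linear MMD, the pairwise sum, and the names `linear_mmd_sq` and `balancing_penalty` are all assumptions made for illustration.

```python
from itertools import combinations


def linear_mmd_sq(reps_a, reps_b):
    """Squared linear MMD: squared distance between the two group means.

    Assumption: linear MMD stands in for the IPM; the abstract does not
    name a specific metric.
    """
    dim = len(reps_a[0])
    mean_a = [sum(r[j] for r in reps_a) / len(reps_a) for j in range(dim)]
    mean_b = [sum(r[j] for r in reps_b) / len(reps_b) for j in range(dim)]
    return sum((a - b) ** 2 for a, b in zip(mean_a, mean_b))


def balancing_penalty(reps_by_pattern):
    """Sum the discrepancy over all pairs of treatment patterns.

    reps_by_pattern maps a treatment pattern (hypothetical labels such as
    "10" for "first treatment only") to the learned representations of the
    units that received it.
    """
    patterns = sorted(reps_by_pattern)
    return sum(
        linear_mmd_sq(reps_by_pattern[p], reps_by_pattern[q])
        for p, q in combinations(patterns, 2)
    )


# Toy representations for three treatment patterns (made-up numbers).
reps = {
    "00": [[0.0, 0.0], [2.0, 0.0]],
    "10": [[1.0, 0.0], [1.0, 0.0]],
    "11": [[1.0, 3.0], [1.0, 5.0]],
}
penalty = balancing_penalty(reps)
```

In a training loop, a term like `penalty` would typically be added to the outcome loss with a weight, so that representations whose distribution differs strongly across treatment patterns are discouraged.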

Aug 4, 2025

Causal Inference under Threshold Manipulation: A Bayesian Mixture Approach

Abstract Many marketing applications, including credit card incentive programs, offer rewards to customers exceeding specific spending thresholds to encourage increased consumption. Quantifying the causal effect of these thresholds on customers is crucial for effective marketing strategy design. While regression discontinuity design is a common method for such causal inference tasks, its assumptions can be violated when customers, aware of the thresholds, strategically manipulate their spending to qualify for the rewards. To address this issue, we propose a novel framework for estimating the causal effect of thresholds on customers under their manipulation. The core idea is to model the observed spending distribution as a mixture of two distributions: one representing customers strategically affected by the threshold and the other representing those unaffected. To fit the mixture model, we adopt a Bayesian approach, which enables valid causal effect estimation with proper uncertainty quantification. Furthermore, we extend this framework to a hierarchical Bayesian setting to estimate heterogeneous causal effects across customer subgroups, allowing for stable inference even with small subgroup sample sizes. We demonstrate the effectiveness of our proposed methods through simulation studies and show that our proposed framework yields more accurate estimates of the causal effect of thresholds on customers compared to naive regression discontinuity design methods.
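
The two-component mixture idea can be sketched as follows. Gaussian components and the names `mixture_density` and `prob_affected` are assumptions made for illustration; the abstract does not state the component distributions, and the Bayesian fitting step itself is not shown here.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def mixture_density(x, weight_affected, affected, unaffected):
    """Observed spending density as a mixture of two components.

    affected and unaffected are (mean, sd) pairs; Gaussian components are
    an assumption, not something the abstract specifies.
    """
    return (weight_affected * normal_pdf(x, *affected)
            + (1.0 - weight_affected) * normal_pdf(x, *unaffected))


def prob_affected(x, weight_affected, affected, unaffected):
    """Probability that a customer spending x belongs to the affected group,
    given fixed mixture parameters (Bayes' rule)."""
    numerator = weight_affected * normal_pdf(x, *affected)
    return numerator / mixture_density(x, weight_affected, affected, unaffected)


# Hypothetical parameters: affected customers bunch just above a threshold
# of 100, while unaffected customers spend around 60.
p = prob_affected(100.0, 0.5, (100.0, 5.0), (60.0, 5.0))
```

In a Bayesian treatment, the mixture weight and component parameters would carry priors and be inferred from the data rather than fixed by hand as in this sketch.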

Aug 4, 2025

Wald-Differences-in-Differences Estimation without Individual-Level Treatment Data

Abstract In-store advertising, such as digital signage and in-store posters, is a crucial advertising method that influences customer purchasing behavior. While its effectiveness is typically evaluated by displaying ads on a store-by-store basis and comparing the purchasing behavior of those exposed to ads with that of those who are not, obtaining ad exposure data for individual customers is costly, making it challenging to conduct accurate causal inference with individual-level treatment variables. A common approach to address this issue is to perform causal inference considering non-compliance, setting visitors to stores implementing an ad campaign as the treatment group and similar customers who have visited comparable stores as the control group. In this setting, a popular estimator is the ratio of two Differences-in-Differences (DID) estimates: one for the outcome variable and another for the treatment variable. However, previous studies assume that the DID estimate for the treatment variable is known from public data, which is not always the case. To overcome this limitation, we propose a method to estimate causal effects by utilizing the fact that, for binary treatment variables, the DID estimate of the treatment variable represents the change in the proportion of compliers in the treatment group. Our method leverages a Gaussian Mixture Model to estimate the proportion. This approach allows for the estimation of the treatment effect on the compliers even in advertising strategies where ad exposure data for individual customers is unavailable.
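
The ratio estimator described above can be sketched in a few lines. This example assumes individual-level treatment data are available, which is exactly the setting the paper relaxes; the function names and the toy numbers are illustrative, and the Gaussian Mixture Model step is not shown.

```python
from statistics import mean


def did(pre_treated, post_treated, pre_control, post_control):
    """Difference-in-differences of group means."""
    change_treated = mean(post_treated) - mean(pre_treated)
    change_control = mean(post_control) - mean(pre_control)
    return change_treated - change_control


def wald_did(outcome, treatment):
    """Ratio of the DID on the outcome to the DID on the treatment.

    Both arguments are dicts with the keys pre_treated, post_treated,
    pre_control and post_control (a hypothetical layout).
    """
    return did(**outcome) / did(**treatment)


# Made-up purchase amounts and binary ad-exposure indicators.
outcome = {
    "pre_treated": [10.0, 12.0], "post_treated": [15.0, 17.0],
    "pre_control": [9.0, 11.0], "post_control": [10.0, 12.0],
}
treatment = {
    "pre_treated": [0.0, 0.0], "post_treated": [1.0, 0.0],
    "pre_control": [0.0, 0.0], "post_control": [0.0, 0.0],
}
effect_on_compliers = wald_did(outcome, treatment)
```

Here the outcome DID is 4 and the treatment DID is 0.5, so the estimated effect on compliers is 8. When exposure is unobserved, the denominator is the quantity that has to be estimated by other means.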

Mar 4, 2025

Stay Ahead of the Competition: An Approach for Churn Prediction by Leveraging Competitive Service App Usage Logs

Abstract With the widespread adoption of smartphones, users now have easy access to similar services, leading to increased churn. As a result, it has become essential for service providers to prevent churn caused by customers switching to competing services. The most common approach for service providers to prevent customer churn is to make churn predictions by monitoring customers’ usage patterns of their own services. However, despite the importance of insights into customers’ usage of competing services for customer retention, such information has yet to be integrated into churn prediction models due to the lack of suitable monitoring methods. Here, we propose an approach to predict user churn by leveraging event logs from smartphones and tablets. Unlike conventional churn prediction methods that rely solely on users’ usage patterns of the provider’s own service, our approach predicts churn by utilizing users’ usage patterns of competing services, including their trial use of a competitor’s service before switching to it. We evaluated the prototype prediction model using smartphone logs collected from NTT DOCOMO smartphone and tablet users who consented to data collection between April 2020 and March 2021. The results demonstrated that the proposed method achieved AUC values ranging from 0.844 to 0.923. Moreover, our approach improved on the performance of the conventional method, which predicts churn without leveraging features of the competitor’s app, by 1.8% to 7.5%.

Oct 8, 2023

Time-aware GCN: Representation Learning for Mobile App Usage Time-series Data

Abstract With the spread of smartphones, most users use many apps (applications) on their devices. As the way users use their apps reflects their personality, understanding app usage is becoming an increasingly interesting problem. To understand app usage, which consists of time-series data, we can use sequential models such as N-gram and long short-term memory (LSTM) to account for sequence characteristics. However, it is still challenging to reduce the impact of internal factors (e.g., users’ feelings) and external factors (e.g., notifications on their smartphones) on differences in the order of apps used over the short term. In this paper, we propose a novel method for representation learning of app usage, called Time-aware Graph Convolutional Networks (T-GCN), to address this problem. We evaluated the performance of T-GCN on a large-scale real-world dataset for the app usage prediction task. The results demonstrate that T-GCN achieves 3.6% higher accuracy than the LSTM model in accuracy@10.
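
For readers unfamiliar with the metric, accuracy@10 is commonly computed as the share of test cases in which the true next app appears among the model's ten highest-ranked predictions. The abstract does not define the metric, so this reading, the function name `accuracy_at_k`, and the toy data are assumptions.

```python
def accuracy_at_k(ranked_predictions, true_labels, k=10):
    """Fraction of cases whose true label is among the top-k predictions.

    ranked_predictions: one list per test case, best guess first.
    true_labels: the app that was actually used next in each case.
    """
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)


# Toy example with k=2 so the effect of the cutoff is easy to follow.
ranked = [["mail", "maps", "chat"], ["chat", "mail", "maps"]]
truth = ["maps", "maps"]
score = accuracy_at_k(ranked, truth, k=2)
```

In the toy example only the first case has the true app within the top two predictions, so the score is 0.5.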

Aug 24, 2020