Predictive LTV for UA optimization - go causal, or go home

Roi Shivek | December 6, 2024 · 6 min

Sustainable success in user acquisition (UA) optimization based on predicted lifetime value (pLTV) is hard to achieve without causal models, because causal models are more robust and adaptable to the inevitable changes in data distribution or environment. In this article, we introduce the notion of causal modeling and explain why adapting to dynamic environments leads to better outcomes in UA campaigns.

Reflecting the true value of your users to ad networks enables them to align their targeting strategies more effectively with your business goals. Most advertisers share generated revenue data for users acquired by the network during its standard 7-day conversion window. Some even share predicted long-term user value within that same 7-day window as it helps the network distinguish between users who generate immediate revenue and those who continue to deliver value over time. For instance, predicting a user will have a high LTV and sharing this knowledge with the network can prompt higher bids by the network algorithms for similar user types, increasing the likelihood of acquiring users who contribute sustained value to your brand moving forward.

Distribution shifts

Imagine you're running a mobile game app (it could be any other type of business). Initially, your user base might consist of a diverse mix of players - some who play casually and some who are highly engaged. This is your initial "distribution" of users.

You create a predictive model to identify high-value users based on their early behavior. Your model notices that users who play for more than 2 hours on their first day tend to become high-value customers. Based on this insight, you configure your ad network to optimize towards this “proxy event” that signals high early engagement. The network then uses this feedback loop to target users more likely to trigger this event, adjusting its bidding and targeting strategies accordingly. 

As a result of this targeting, you acquire more users who play a lot on their first day. This is where the distribution shift occurs: your original user base had a mix of different engagement levels, and your new acquisitions are predominantly highly engaged users. Your ad network finds cheap ways to deliver users who trigger your "high engagement" signal, but these are not the kind of highly engaged users your initial model was built on: the vast majority of them never convert to paying users.

This shift introduces problems as your prediction model overvalues early engagement, not realizing that some new users don't become paying customers as they are "binge players" who burn out quickly. Not only that, but you now miss out on potentially valuable users who start slowly but become dedicated players over time. What started as an effort to improve your acquisition ended up as a flop.
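The mechanism above can be made concrete with a small back-of-the-envelope calculation. The sketch below is illustrative only: the three user segments, their shares, and their probabilities are hypothetical numbers, and it assumes that within a segment paying is independent of triggering the proxy event. It shows how the payer rate among users who trigger the proxy can collapse when only the acquisition mix changes, while each segment behaves exactly as before.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    name: str
    share: float    # fraction of acquired users in this segment
    p_proxy: float  # chance of playing 2+ hours on day one
    p_pay: float    # chance of ever becoming a paying user


def payer_rate_given_proxy(segments):
    """P(pays | proxy fired), assuming paying depends only on the segment."""
    fired = sum(s.share * s.p_proxy for s in segments)
    fired_and_paid = sum(s.share * s.p_proxy * s.p_pay for s in segments)
    return fired_and_paid / fired


# Hypothetical numbers, chosen only to illustrate the mechanism.
before = [
    Segment("dedicated", 0.20, 0.80, 0.50),
    Segment("casual", 0.70, 0.10, 0.05),
    Segment("binge", 0.10, 0.90, 0.02),
]
# Same segments, same behavior; only the acquisition mix has changed
# because the network now targets users likely to fire the proxy event.
after = [
    Segment("dedicated", 0.15, 0.80, 0.50),
    Segment("casual", 0.15, 0.10, 0.05),
    Segment("binge", 0.70, 0.90, 0.02),
]

rate_before = payer_rate_given_proxy(before)
rate_after = payer_rate_given_proxy(after)
print(f"payer rate among proxy users before targeting: {rate_before:.3f}")
print(f"payer rate among proxy users after targeting:  {rate_after:.3f}")
```

A model trained on the "before" mix would keep assigning the earlier payer rate to every user who triggers the proxy, which is how it ends up overvaluing binge players.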

Causal models address this distribution shift by accurately identifying true causal factors. Instead of only relying on correlations (like high first-day engagement correlating with becoming a high-value customer), causal models attempt to understand why users become high-value customers. They might discover that it's not just the amount of time spent but also the variety of game features explored or the social connections made within the game that truly drive long-term value.

Causal models are designed to predict the effects of interventions - in your case, the intervention of changing your user acquisition strategy. They would account for the fact that targeting highly engaged early users might change the characteristics of your user base and would continually reassess the relationship between early engagement and long-term value. These models might notice that the correlation weakens over time and adjust predictions accordingly.
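The "continually reassess" idea can be illustrated with a simple monitoring check. This is not a causal model in itself, only a sketch of the signal such a model would react to. The cohort data, the field names, and the 50% tolerance are all hypothetical.

```python
def proxy_payer_rate(cohort):
    """Share of proxy-triggering users in a cohort who became payers."""
    return cohort["proxy_payers"] / cohort["proxy_users"]


def flag_weakening(cohorts, baseline_rate, tolerance=0.5):
    """Return labels of cohorts whose rate fell below tolerance * baseline."""
    return [
        c["label"]
        for c in cohorts
        if proxy_payer_rate(c) < tolerance * baseline_rate
    ]


# Hypothetical weekly acquisition cohorts after switching to proxy optimization.
cohorts = [
    {"label": "week 1", "proxy_users": 400, "proxy_payers": 100},
    {"label": "week 2", "proxy_users": 600, "proxy_payers": 120},
    {"label": "week 3", "proxy_users": 900, "proxy_payers": 90},
    {"label": "week 4", "proxy_users": 1200, "proxy_payers": 72},
]

baseline = proxy_payer_rate(cohorts[0])
stale = flag_weakening(cohorts, baseline)
print("cohorts where the proxy has lost predictive power:", stale)
```

In this toy data the proxy fires more often every week while the share of those users who pay keeps falling, which is the pattern described above.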

Hidden factors (confounders) can influence both early engagement and the likelihood of becoming a high-value user. For example, users who have played similar games might engage more early on and also be more likely to spend. A causal model would account for these hidden factors rather than attribute their effect to early engagement.
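One standard way to account for such a factor, once it has been measured, is to stratify on it and compare like with like. The sketch below uses hypothetical aggregated counts and assumes that prior experience with similar games is the only confounder and that it is observed. In these made-up numbers, the naive comparison suggests high day-one engagement raises the payer rate by 15 points, while the stratified comparison shows no effect at all.

```python
# Hypothetical counts:
# (played_similar_games, high_day1_engagement) -> (users, payers)
counts = {
    (True, True): (800, 240),
    (True, False): (200, 60),
    (False, True): (200, 10),
    (False, False): (800, 40),
}


def pooled_rate(counts, high):
    """Payer rate for one engagement level, ignoring the confounder."""
    users = payers = 0
    for (_, is_high), (n_users, n_payers) in counts.items():
        if is_high == high:
            users += n_users
            payers += n_payers
    return payers / users


def naive_difference(counts):
    """High-engagement payer rate minus low-engagement payer rate."""
    return pooled_rate(counts, True) - pooled_rate(counts, False)


def adjusted_difference(counts):
    """Per-stratum differences, averaged with stratum-size weights."""
    total_users = sum(n_users for n_users, _ in counts.values())
    effect = 0.0
    for experienced in (True, False):
        hi_users, hi_payers = counts[(experienced, True)]
        lo_users, lo_payers = counts[(experienced, False)]
        stratum_diff = hi_payers / hi_users - lo_payers / lo_users
        stratum_weight = (hi_users + lo_users) / total_users
        effect += stratum_weight * stratum_diff
    return effect


naive = naive_difference(counts)
adjusted = adjusted_difference(counts)
print(f"naive difference in payer rate:    {naive:.3f}")
print(f"adjusted difference in payer rate: {adjusted:.3f}")
```

The caveat is that this adjustment only works for factors you can observe. A truly hidden factor needs other tools, such as experiments or proxy variables.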

By addressing these aspects, causal models help maintain accurate predictions and effective targeting strategies even as the distribution of acquired users shifts, avoiding the pitfalls that a purely correlational model might fall into.

Feature Drift and Behavioral Shift Misalignments

Imagine you’re the proud owner of a brand behind a subscription-based fitness app focusing on home workouts. Your app's predictive model for user lifetime value (LTV) might heavily weigh factors like how often users log workouts at home and how many workout videos they watch.

Let's say you introduce a significant product update: a new feature for tracking outdoor runs using GPS. This major change to your product alters user behavior and value and directly affects the accuracy of your existing prediction model.

Consider that some users who previously did mostly home workouts start doing more outdoor runs. Your model might interpret this as decreased engagement (fewer home workout logs) and predict lower LTV for these users, even though they're more engaged thanks to the new feature. Meanwhile, the number of outdoor runs logged might become a strong indicator of user value, but your model wouldn't automatically pick up on this new relationship.

As your new outdoor running feature attracts a new segment of users who convert to paid subscriptions differently than the original user base, your existing model continues to use outdated conversion patterns. In this new reality, features that were previously highly predictive (like the number of workout videos watched) are now less important, while new features (like distance run outdoors) become crucial. A static model does not adapt to these shifts in feature importance on its own.

Causal models are better equipped to handle these changes because they attempt to understand why users find value in the app. They might recognize that "engagement with fitness activities" is the causal factor for retention and LTV, regardless of whether it's through home workouts or outdoor runs.
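A toy comparison makes the difference visible. In the sketch below, the scoring functions and their weights are hypothetical, and treating total fitness activity as the causal driver is an assumption taken from the example above rather than something the code establishes. It scores the same user before and after the GPS running feature ships.

```python
def legacy_score(user):
    """Score tied to the surface metrics that were predictive before the update."""
    return 2.0 * user["home_workouts"] + 0.5 * user["videos_watched"]


def activity_score(user):
    """Score tied to the assumed underlying driver: total fitness activity."""
    total_sessions = user["home_workouts"] + user["outdoor_runs"]
    return 2.0 * total_sessions + 0.5 * user["videos_watched"]


# The same hypothetical user, one week before and one week after the update.
before_update = {"home_workouts": 5, "outdoor_runs": 0, "videos_watched": 4}
after_update = {"home_workouts": 2, "outdoor_runs": 4, "videos_watched": 4}

for label, user in (("before", before_update), ("after", after_update)):
    print(
        f"{label}: legacy={legacy_score(user):.1f} "
        f"activity={activity_score(user):.1f}"
    )
```

The legacy score drops because home workout logs fell, even though the user is exercising more overall. The activity-based score moves in the same direction as the user's actual engagement.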

These models can adapt more quickly to new patterns of behavior and value creation introduced by product updates, helping maintain accurate predictions even as the product and user base evolve. This adaptability is crucial for maintaining effective user acquisition and retention strategies in a dynamic business or product environment.

Why go causal, or go home?

When investing in pLTV-based UA optimization, you commit to going the extra mile to achieve hyper-optimized campaigns. However, relying on non-causal models creates an inherent contradiction: pursuing optimization while using tools that struggle to adapt to shifting dynamics undermines the very goal of maximizing efficiency and impact.

Causal models focus on identifying the key features that truly affect outcomes. This makes them reliable even when data shifts are introduced by the very act of performing pLTV-based UA optimization. Predictions based on causal models are essential for anyone aiming to optimize UA performance successfully and consistently.