What is data readiness?

Why it matters

pLTV and value-based bidding fail quietly when data is almost good enough. A model trained on incomplete refunds learns the wrong economics. Missing platform-specific click identifiers hurt match quality on the network that needs them: GCLID (or Google enhanced-conversion keys) for Google, fbc/fbp for Meta CAPI. Absent fbc on non-Meta traffic is normal; missing hashed email or phone on purchase events is often the bigger drag on Event Match Quality (EMQ). Stale nightly batches create signal freshness problems that look like "the model doesn't work."
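One way to surface identifier gaps before a pilot is to measure the share of purchase events that carry each identifier. The following is a minimal sketch, not a Churney implementation: it assumes events are available as dictionaries, and the field names and the identifier_coverage helper are hypothetical and should be mapped to your own schema.

```python
from collections import Counter

# Hypothetical field names; map them to your own warehouse columns.
ID_FIELDS = ("user_id", "gclid", "fbc", "fbp", "hashed_email", "hashed_phone")


def identifier_coverage(events):
    """Return the share of events carrying a non-empty value for each ID field."""
    total = len(events)
    if total == 0:
        return {field: 0.0 for field in ID_FIELDS}
    present = Counter()
    for event in events:
        for field in ID_FIELDS:
            if event.get(field):  # missing, None, and "" all count as absent
                present[field] += 1
    return {field: present[field] / total for field in ID_FIELDS}


# Illustrative purchase events (made-up values).
purchases = [
    {"user_id": "u1", "gclid": "abc", "hashed_email": "e1"},
    {"user_id": "u2", "fbc": "fb.1.x", "fbp": "fb.1.y"},
    {"user_id": "u3", "hashed_email": "e3", "hashed_phone": "p3"},
    {"user_id": "u4"},
]
coverage = identifier_coverage(purchases)
for field, share in coverage.items():
    print(f"{field}: {share:.0%}")
```

Reading coverage per traffic source rather than in aggregate avoids treating an absent fbc on non-Meta traffic as a defect.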

Teams often jump to campaign settings before auditing readiness. The result is weeks of platform learning on noisy or biased inputs, followed by inconclusive holdouts and loss of stakeholder trust.

Data readiness is the gate between "we want pLTV" and "we can ship pLTV." Marketing analytics, data engineering, and performance marketing need a shared checklist before pilot kickoff.

Data readiness

Data readiness is the upstream prerequisite for the full activation chain:

Inventory: Map events, revenue definitions, IDs, and attribution fields in your data warehouse (typically 3–12 months of history for modeling).
Hygiene: Enforce append-only data, stable user ID, refund timing, and subscription state logic aligned to finance.
Modeling: Train user-level pLTV only after leakage checks and anchor-event coverage meet thresholds.
Signal design: Calibrate value scale, timing, and volume for platform learning.
Activation: Churney reads from your data warehouse and sends values directly to ad networks via Meta CAPI, Google Ads Conversion API, or app measurement paths; monitor match rate and freshness in pilot.

Skipping readiness steps produces models that rank poorly, signals that do not match users, and experiments that cannot be defended to finance.
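The leakage check in the Modeling step can be sketched as a timestamp comparison: any feature event recorded after a user's feature cutoff belongs to the label period and should not be used as a model input. This is an illustrative sketch under stated assumptions; the 7-day feature window, the field names, and the find_leaky_events helper are hypothetical.

```python
from datetime import datetime, timedelta


def find_leaky_events(feature_events, anchor_times, feature_window=timedelta(days=7)):
    """Return feature events that fall after each user's feature cutoff.

    The cutoff is the anchor time plus feature_window. Events after it belong
    to the label period, so using them as features would leak the outcome.
    """
    leaky = []
    for event in feature_events:
        anchor = anchor_times.get(event["user_id"])
        if anchor is None:
            # No anchor event for this user: skipped here, but worth counting
            # separately as an anchor-coverage gap.
            continue
        if event["ts"] > anchor + feature_window:
            leaky.append(event)
    return leaky


# Illustrative data (made-up values).
anchor_times = {"u1": datetime(2024, 1, 1), "u2": datetime(2024, 2, 1)}
feature_events = [
    {"user_id": "u1", "name": "add_to_cart", "ts": datetime(2024, 1, 3)},
    {"user_id": "u1", "name": "renewal", "ts": datetime(2024, 1, 10)},
    {"user_id": "u2", "name": "session", "ts": datetime(2024, 2, 5)},
]
leaky = find_leaky_events(feature_events, anchor_times)
print([event["name"] for event in leaky])
```

In this example only the renewal event falls after its user's cutoff, so it is the one flagged for exclusion from features.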

Next step: What data Churney needs

Category variants

Business model	Readiness priorities
Ecommerce / DTC	Order and refund events, net revenue definition, repeat purchase history, click IDs on web sessions.
Subscription app	Install and trial events, trial-to-paid transitions, renewal and churn, MMP postbacks plus data warehouse truth for revenue.
SaaS / PLG	Account-level IDs, product usage events, expansion revenue, longer sales cycles reflected in training windows.

Common mistakes

Starting with ad platform setup before ID audit. Match rate fails before the model is tested.
Inconsistent user ID across web and app. Breaks joins and duplicates customers in training data.
Batch-only updates with no SLA. Models and live signals drift from operational reality.
Training on gross revenue, activating on net. Calibration and finance readout disagree.
Missing attribution fields at the anchor event. pLTV scores cannot be connected to the ad touch that should receive credit.
Relying on MMP data alone for hybrid web-and-app businesses. App postbacks complement but rarely replace data warehouse history for web and cross-device journeys.
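The gross-versus-net mistake above is easier to catch when both figures are computed side by side from the same order and refund records. The following is a minimal sketch, assuming orders and refunds are lists of dictionaries with amounts stored as decimal strings; the field names and the revenue_by_user helper are hypothetical.

```python
from collections import defaultdict
from decimal import Decimal


def revenue_by_user(orders, refunds):
    """Return (gross, net) revenue per user, where net = gross minus refunds."""
    gross = defaultdict(Decimal)  # Decimal() defaults to 0
    net = defaultdict(Decimal)
    for order in orders:
        amount = Decimal(order["amount"])
        gross[order["user_id"]] += amount
        net[order["user_id"]] += amount
    for refund in refunds:
        net[refund["user_id"]] -= Decimal(refund["amount"])
    return dict(gross), dict(net)


# Illustrative records (made-up values).
orders = [
    {"user_id": "u1", "amount": "100.00"},
    {"user_id": "u1", "amount": "50.00"},
    {"user_id": "u2", "amount": "80.00"},
]
refunds = [{"user_id": "u1", "amount": "50.00"}]

gross, net = revenue_by_user(orders, refunds)
for user_id in sorted(gross):
    print(user_id, "gross:", gross[user_id], "net:", net[user_id])
```

Using Decimal rather than float keeps totals exact, which helps when the figures are reconciled against finance reports.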

Advertiser lens

Role	What they ask	What good looks like
Head of Performance / UA	Are we ready to pilot pLTV?	Signed checklist: IDs, volume, anchor event, and campaign consolidation plan.
VP Growth / CMO	What blocks go-live?	Dated readiness milestones with owners across marketing and data.
Marketing Analytics / Data Science	Is training data valid?	Leakage audit, label definition doc, and maturity window for evaluation.
Data Engineering	What do we pipe and when?	Daily append-only feeds, monitored pipelines, and identifier map documented.
Finance / Procurement	Is revenue data auditable?	Single source of truth for net revenue and refunds aligned to pilot success metrics.

FAQ

What is data readiness for pLTV?

Data readiness means your event, revenue, and identity data in the data warehouse are complete and reliable enough to train pLTV models and send calibrated value events to ad platforms with acceptable match rates and freshness.

How much history do you need?

Often 3–12 months of user-level events and revenue, depending on repeat cycles, subscription length, and seasonality. Shorter history can work for simple ecommerce; subscription and SaaS usually need more.
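A quick way to check available history is to measure the span between the earliest and latest event dates. This sketch assumes roughly 90 days as a stand-in for the 3-month lower bound; that threshold and the helper names are illustrative assumptions, not fixed requirements.

```python
from datetime import date


def history_days(event_dates):
    """Return the number of days between the earliest and latest event date."""
    if not event_dates:
        return 0
    return (max(event_dates) - min(event_dates)).days


def has_enough_history(event_dates, min_days=90):
    """Check the span against a minimum; 90 days is an assumed placeholder."""
    return history_days(event_dates) >= min_days


# Illustrative event dates (made-up values).
dates = [date(2024, 1, 1), date(2024, 3, 15), date(2024, 6, 30)]
print(history_days(dates), has_enough_history(dates))
```

A span check says nothing about gaps inside the window, so it is best paired with a per-month event count.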

What identifiers matter most?

A stable user ID internal to your business, plus ad platform identifiers where available (for example GCLID, fbc/fbp, hashed email or phone for CAPI). Gaps directly affect match rate and model labels.
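Hashed email is typically produced by normalizing the address and then applying SHA-256. A minimal sketch follows, assuming normalization means trimming whitespace and lowercasing; the helper names are hypothetical, and each platform's current documentation should be checked for additional normalization rules.

```python
import hashlib


def normalize_email(email):
    """Trim surrounding whitespace and lowercase the address (assumed rules)."""
    return email.strip().lower()


def hash_email(email):
    """Return the SHA-256 hex digest of the normalized email address."""
    return hashlib.sha256(normalize_email(email).encode("utf-8")).hexdigest()


# The same address in different casing or spacing yields the same hash.
print(hash_email("  Jane.Doe@Example.com ") == hash_email("jane.doe@example.com"))
```

Normalizing before hashing matters because two spellings of the same address otherwise produce different hashes and lower the match rate.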

Is data readiness only a data engineering job?

No. Marketing defines anchor events and success metrics. Analytics validates labels and leakage. Data engineering owns pipelines. Performance marketing owns campaign eligibility and pilot design.

How is data readiness different from signal health?

Readiness is whether source data can support modeling and activation. Signal health is whether events arriving at platforms stay accurate, fresh, and matched over time.
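Freshness applies on both sides of that line: to warehouse feeds before launch and to delivered events afterwards. The sketch below flags feeds whose last load is older than an allowed lag; the 24-hour limit, the feed names, and the stale_feeds helper are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def stale_feeds(last_loaded_at, now, max_lag=timedelta(hours=24)):
    """Return the sorted names of feeds whose last load is older than max_lag."""
    return sorted(
        name for name, loaded in last_loaded_at.items() if now - loaded > max_lag
    )


# Illustrative load times (made-up values), all in UTC.
now = datetime(2024, 5, 2, 12, 0, tzinfo=timezone.utc)
last_loaded_at = {
    "orders": datetime(2024, 5, 2, 6, 0, tzinfo=timezone.utc),
    "refunds": datetime(2024, 4, 30, 6, 0, tzinfo=timezone.utc),
    "subscriptions": datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc),
}
print(stale_feeds(last_loaded_at, now))
```

Passing the current time in as an argument, rather than reading the clock inside the function, keeps the check reproducible in tests.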

What happens if we are not ready?

Delay the pilot, fix IDs and feeds, or scope a limited proof on one channel or geo. Launching on weak data usually wastes the learning phase and produces inconclusive holdouts.

Where is the full Churney checklist?

See What data Churney needs for the operational checklist and onboarding steps.

Not the same as

Term	Difference
Signal health	Ongoing quality of events at platforms; readiness is upstream source fitness.
Data warehouse	The system that stores data; readiness is a state of that data relative to pLTV use cases.
Calibration	How well predictions match outcomes after modeling; readiness must exist before calibration matters.
Match rate	Platform-side user matching; readiness includes having the IDs that match rate depends on.