Customer Lifetime Value Prediction Model Builder
Build a predictive customer lifetime value (CLV) model using historical transaction data to forecast future revenue, segment customers by value tiers, and inform acquisition spend decisions.
Prompt Template
You are a data scientist specializing in customer analytics and predictive modeling. Build a customer lifetime value (CLV) prediction model from historical data.

**Business type:** [e.g., subscription SaaS with monthly and annual plans]
**Data available:** [e.g., 3 years of transaction history, 15,000 customers, fields: customer_id, signup_date, plan_type, monthly_revenue, churn_date, support_tickets, feature_usage_score]
**Current state:** [e.g., we calculate CLV retrospectively but can't predict it for new customers]
**Tools:** [e.g., Python (pandas, scikit-learn, lifetimes library) or SQL + spreadsheet]

**Key business questions:**
- What is a new customer worth in their first 12/24/36 months?
- Which customer segments have highest/lowest CLV?
- How much should we spend to acquire a customer in each segment?

Build:

1. **CLV Methodology Selection**: compare approaches with recommendation:
   - Historical (simple, backward-looking)
   - Probabilistic (BG/NBD + Gamma-Gamma)
   - ML-based (regression/survival analysis)
2. **Data Preparation Pipeline**: code for:
   - Feature engineering from raw transactions
   - Handling censored data (customers still active)
   - Train/test split strategy for temporal data
3. **Model Implementation**: full code for recommended approach:
   - Training pipeline
   - Prediction for individual customers
   - Confidence intervals on predictions
4. **Customer Segmentation**: value-based tiers:
   - Tier definitions and thresholds
   - Profile of each tier (demographics, behavior)
   - Migration patterns between tiers
5. **Business Application**: translating model to decisions:
   - Maximum CAC by segment
   - Retention investment prioritization
   - Revenue forecasting from current cohort
6. **Model Monitoring**: how to track prediction accuracy over time
Example Output
CLV Prediction Model: SaaS Subscription
Methodology: Survival Analysis + Revenue Regression
Why not simpler approaches?
| Method | Accuracy | Handles Censoring | New Customer Prediction |
|--------|----------|-------------------|------------------------|
| Historical average | Low | No | No |
| BG/NBD | Medium | Yes | Limited |
| **Survival + Regression** ✓ | **High** | **Yes** | **Yes, from Day 1** |
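To make the "Handles Censoring" column concrete, the following is a minimal sketch in plain Python of why a backward-looking average can understate customer lifetime. It compares the mean tenure of churned customers only (one common form of the naive historical approach, assumed here) with a Kaplan-Meier estimate that keeps still-active customers in the at-risk pool. The toy data and the helper names `kaplan_meier` and `restricted_mean_lifetime` are hypothetical and exist only for illustration.

```python
from collections import Counter


def kaplan_meier(durations, churned):
    """Return [(t, S(t))] at each distinct observed churn time."""
    events = Counter(t for t, c in zip(durations, churned) if c)
    curve = []
    surv = 1.0
    for t in sorted(events):
        # Everyone whose tenure reaches t is still at risk at t,
        # including active (censored) customers.
        at_risk = sum(1 for d in durations if d >= t)
        surv *= 1 - events[t] / at_risk
        curve.append((t, surv))
    return curve


def restricted_mean_lifetime(curve, horizon):
    """Area under the step survival curve from month 0 to `horizon`."""
    area, prev_t, prev_s = 0.0, 0, 1.0
    for t, s in curve:
        if t >= horizon:
            break
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    area += prev_s * (horizon - prev_t)
    return area


# Hypothetical toy data: tenure in months and whether churn was observed (1)
# or the customer is still active, i.e. censored (0).
durations = [3, 6, 6, 12, 18, 24, 30, 36, 36, 36]
churned = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]

# Naive estimate: average tenure over churned customers only.
naive_mean = sum(d for d, c in zip(durations, churned) if c) / sum(churned)

# Censoring-aware estimate: expected months retained within a 36-month horizon.
km_curve = kaplan_meier(durations, churned)
km_mean_36 = restricted_mean_lifetime(km_curve, horizon=36)

print(f"naive mean tenure: {naive_mean:.2f} months")
print(f"Kaplan-Meier restricted mean (36 mo): {km_mean_36:.2f} months")
```

On this toy data the censoring-aware estimate comes out at more than double the naive one, because the naive average discards the long-tenured customers who have not churned yet.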
Implementation
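import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

# Feature engineering (assumes df holds one row per customer)
# Active customers are censored: tenure runs to today and is_churned stays 0
tenure_days = (df['churn_date'].fillna(pd.Timestamp.now()) - df['signup_date']).dt.days
df['tenure_months'] = (tenure_days / 30.44).clip(lower=1 / 30.44)  # avoid zero durations
df['is_churned'] = df['churn_date'].notna().astype(int)
df['avg_monthly_revenue'] = df.groupby('customer_id')['monthly_revenue'].transform('mean')
# Note: a rate over full tenure leaks the outcome into the features;
# prefer a fixed early window (e.g. first 30 days) when that data is available
df['support_ticket_rate'] = df['support_tickets'] / df['tenure_months']
# Assumption: plan_type takes the values 'monthly' and 'annual'
df['plan_type_encoded'] = (df['plan_type'] == 'annual').astype(int)

# Survival model for retention probability
cph = CoxPHFitter()
cph.fit(df[['tenure_months', 'is_churned', 'plan_type_encoded',
            'feature_usage_score', 'support_ticket_rate']],
        duration_col='tenure_months', event_col='is_churned')

# Predict a 36-month survival curve per customer on a monthly grid
# (rows = months 1..36, columns = customers in df_new_customers)
months = np.arange(1, 37)
surv_funcs = cph.predict_survival_function(df_new_customers, times=months)

# CLV = sum of (survival_probability_month_t × expected_revenue_month_t)
# Assumption: monthly_revenue_predictions is a Series of predicted revenue per month,
# indexed like df_new_customers (output of the revenue regression)
clv_36m = surv_funcs.mul(monthly_revenue_predictions, axis='columns').sum(axis=0)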
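The CLV sum in the last comment can be checked by hand for a single customer. The sketch below, in plain Python, multiplies a monthly survival curve by expected monthly revenue and adds the terms up. The customer (a constant 3% monthly churn rate and $49 per month) is hypothetical, and the optional `annual_discount_rate` parameter is an assumption that goes beyond the formula above; its default of 0 reproduces the undiscounted sum.

```python
def clv_from_survival(survival_probs, monthly_revenue, annual_discount_rate=0.0):
    """Sum S(t) * revenue_t over the horizon, optionally discounted per month.

    survival_probs[i] is the probability the customer is still active in
    month i + 1; monthly_revenue[i] is the expected revenue for that month.
    """
    if len(survival_probs) != len(monthly_revenue):
        raise ValueError("survival_probs and monthly_revenue must be the same length")
    # Convert the annual rate to an equivalent monthly rate.
    monthly_rate = (1 + annual_discount_rate) ** (1 / 12) - 1
    total = 0.0
    for month, (surv, rev) in enumerate(zip(survival_probs, monthly_revenue), start=1):
        total += surv * rev / (1 + monthly_rate) ** month
    return total


# Hypothetical customer: 3% monthly churn, flat $49/month, 36-month horizon.
horizon = 36
survival = [0.97 ** t for t in range(1, horizon + 1)]
revenue = [49.0] * horizon

clv_36m = clv_from_survival(survival, revenue)
clv_36m_discounted = clv_from_survival(survival, revenue, annual_discount_rate=0.10)

print(f"36-month CLV (undiscounted): ${clv_36m:,.2f}")
print(f"36-month CLV (10% annual discount): ${clv_36m_discounted:,.2f}")
```

With a constant churn rate the undiscounted sum is a geometric series, which gives a quick sanity check against the closed form `49 * 0.97 * (1 - 0.97**36) / 0.03`.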
Customer Value Segments
| Tier | CLV Range (36mo) | % Customers | Avg Revenue/Mo | Max CAC |
|------|-----------------|-------------|----------------|----------|
| Platinum | >$2,000 | 8% | $89 | $600 |
| Gold | $800-$2,000 | 22% | $49 | $250 |
| Silver | $300-$800 | 45% | $29 | $100 |
| Bronze | <$300 | 25% | $19 | $40 |
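Applying the table to a predicted CLV can be as simple as a threshold lookup. The sketch below copies the tier boundaries and Max CAC values from the table. How exact boundary values are assigned (for example, whether a CLV of exactly $800 counts as Gold or Silver) is not specified by the table, so the choice made here is an assumption, as are the function names.

```python
# Max CAC per tier, copied from the segment table.
MAX_CAC = {"Platinum": 600, "Gold": 250, "Silver": 100, "Bronze": 40}


def assign_tier(clv_36m):
    """Map a predicted 36-month CLV (in dollars) to a value tier.

    Assumption: boundary values go to the higher tier, except $2,000,
    which the table places in Gold (Platinum is strictly above $2,000).
    """
    if clv_36m > 2000:
        return "Platinum"
    if clv_36m >= 800:
        return "Gold"
    if clv_36m >= 300:
        return "Silver"
    return "Bronze"


def acquisition_decision(clv_36m, quoted_cac):
    """Return (tier, within_budget) for a proposed acquisition cost."""
    tier = assign_tier(clv_36m)
    return tier, quoted_cac <= MAX_CAC[tier]


# Hypothetical predictions paired with the CAC a channel would charge.
for clv, cac in [(2500.0, 550.0), (1200.0, 300.0), (450.0, 90.0), (120.0, 40.0)]:
    tier, ok = acquisition_decision(clv, cac)
    print(f"CLV ${clv:,.0f} -> {tier}, CAC ${cac:,.0f} within cap: {ok}")
```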
Key Insight
**Feature usage score is the #1 predictor of CLV** — customers who use 5+ features in their first 30 days have 3.2x higher CLV than those using ≤2 features. → Invest in onboarding, not just acquisition.
Tips for Best Results
- 💡 Always account for censored data (active customers who haven't churned yet): dropping them, or treating their tenure to date as their full lifetime, drastically underestimates CLV
- 💡 Use temporal train/test splits, not random ones: training on 2023 data and testing on 2024 data mimics real prediction conditions
- 💡 CLV predictions are most valuable when tied to action: if you can't change your CAC or retention strategy based on the segments, the model is academic
- 💡 Recalculate CLV quarterly and track prediction drift: customer behavior changes and your model needs to keep up
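Two of these tips translate directly into code. The sketch below, in plain Python, shows one possible temporal split (by signup date, which is only one of several valid ways to split temporal data) and a simple drift check that compares predicted CLV with realized revenue using mean absolute percentage error. The records, the cutoff date, the error figures, and the 5-point tolerance are all hypothetical.

```python
from datetime import date


def temporal_split(customers, cutoff):
    """Train on customers who signed up before `cutoff`, test on the rest."""
    train = [c for c in customers if c["signup_date"] < cutoff]
    test = [c for c in customers if c["signup_date"] >= cutoff]
    return train, test


def mape(predicted, actual):
    """Mean absolute percentage error, skipping customers with zero actuals."""
    pairs = [(p, a) for p, a in zip(predicted, actual) if a != 0]
    if not pairs:
        raise ValueError("no non-zero actual values to compare against")
    return sum(abs(p - a) / abs(a) for p, a in pairs) / len(pairs)


def has_drifted(baseline_mape, current_mape, tolerance=0.05):
    """Flag drift when error worsens by more than `tolerance` (absolute)."""
    return current_mape - baseline_mape > tolerance


# Hypothetical customer records.
customers = [
    {"customer_id": 1, "signup_date": date(2023, 3, 1)},
    {"customer_id": 2, "signup_date": date(2023, 11, 15)},
    {"customer_id": 3, "signup_date": date(2024, 1, 1)},
    {"customer_id": 4, "signup_date": date(2024, 6, 20)},
]
train, test = temporal_split(customers, cutoff=date(2024, 1, 1))

# Hypothetical quarterly check: predicted vs. realized 12-month revenue.
baseline_error = mape(predicted=[110.0, 90.0], actual=[100.0, 100.0])
current_error = mape(predicted=[130.0, 75.0], actual=[100.0, 100.0])
needs_retraining = has_drifted(baseline_error, current_error)

print(f"train ids: {[c['customer_id'] for c in train]}")
print(f"test ids: {[c['customer_id'] for c in test]}")
print(f"baseline MAPE {baseline_error:.1%}, current MAPE {current_error:.1%}")
print(f"drift detected: {needs_retraining}")
```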