# Frequently Asked Questions

## How to interpret the results?

First, read the two files: `model_analysis.txt` (which explains results from Stage 1) and `optimisation_analysis.txt` (which explains results from Stage 2). 

### Understanding Beta Coefficients

Consider an example from the current model (`refined_model_v3`): `beta_configuration_accessories_to_offer_request`.

This coefficient in the Bayesian network model represents the strength and direction of the relationship between two events in the customer journey:

- `configuration_accessories` (the source event)
- `offer_request` (the target event)

Let's assume, hypothetically, the model output for this beta shows:

- Mean: 0.15
- Standard deviation: 0.05
- HDI (Highest Density Interval) from 3% to 97%: [0.05, 0.25]
- Median: 0.14

What does this mean? In practical terms, this beta coefficient tells us:

- The positive mean value (0.15) indicates that completing the accessories configuration positively influences the likelihood of making an offer request.
- The exact interpretation isn't a direct percentage increase in probability, but rather an increase in the *log-odds* of making an offer request. A positive coefficient means the probability increases.
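- Taken at face value, the effect looks positive (the 94% HDI [0.05, 0.25] is entirely above 0), and the likely magnitude of the effect on the log-odds scale is around 0.15, with some uncertainty (standard deviation 0.05). How much weight the HDI deserves as evidence depends on the prior, as discussed further down in this section.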
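
To make the log-odds interpretation concrete, the sketch below converts the hypothetical mean of 0.15 into a change in probability. It assumes a logistic link and a baseline probability of 0.10 for `offer_request`; the baseline value and the helper name `shift_probability` are illustrative assumptions, not outputs of the model.

```python
import math

def shift_probability(baseline_prob: float, beta: float) -> float:
    """Apply a log-odds shift of `beta` to a baseline probability."""
    log_odds = math.log(baseline_prob / (1.0 - baseline_prob))
    return 1.0 / (1.0 + math.exp(-(log_odds + beta)))

baseline = 0.10   # hypothetical baseline probability of an offer request
beta_mean = 0.15  # hypothetical posterior mean from the example above

new_prob = shift_probability(baseline, beta_mean)
odds_ratio = math.exp(beta_mean)

print(f"odds ratio: {odds_ratio:.3f}")                   # about 1.162
print(f"probability: {baseline:.3f} -> {new_prob:.3f}")  # about 0.100 -> 0.114
```

The same beta produces a different change in probability at a different baseline, which is why the coefficient should not be read as a fixed percentage-point uplift.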

These betas are crucial for:

- Understanding the customer journey: they show which actions influence other actions
- Identifying strong and weak points in the conversion funnel
- Making decisions about where to focus improvements or marketing efforts

For completeness, we should note that the HDI not crossing below 0 is partly due to our choice of prior. In our current model implementation, we use Half-Cauchy priors for the beta coefficients, which constrains the betas to be non-negative. This means that when we say "we're confident the effect is positive because the HDI doesn't cross zero", we need to be careful - this isn't really a finding from the data, but rather a constraint we've built into the model through our choice of prior.

Implications:
1. We can't truly discover negative relationships in the current model
2. The HDI not crossing zero is not evidence of a positive effect
3. We should consider using priors that allow for both positive and negative effects if we want to discover true directional relationships
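
The point about the prior can be checked without any data. The sketch below draws from a Half-Cauchy prior and from a zero-centred Normal prior and computes a 94% HDI for each. The scale of 1.0, the number of draws, and the simple `hdi` helper are assumptions for illustration; they are not the model's actual settings.

```python
import math
import random

def hdi(samples, mass=0.94):
    """Narrowest interval containing roughly `mass` of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    window = int(math.floor(mass * n))
    # Slide a fixed-size window over the sorted draws and keep the narrowest one.
    start = min(range(n - window), key=lambda i: ordered[i + window] - ordered[i])
    return ordered[start], ordered[start + window]

rng = random.Random(42)
n_draws = 20_000

# Half-Cauchy(scale=1) draws via the inverse CDF: tan(pi * u / 2), u in [0, 1).
half_cauchy = [math.tan(math.pi * rng.random() / 2.0) for _ in range(n_draws)]
# Normal(0, 1) draws: a prior that allows both signs.
normal = [rng.gauss(0.0, 1.0) for _ in range(n_draws)]

hc_low, hc_high = hdi(half_cauchy)
n_low, n_high = hdi(normal)

print(f"Half-Cauchy prior HDI: [{hc_low:.3f}, {hc_high:.3f}]")  # lower bound is never negative
print(f"Normal prior HDI:      [{n_low:.3f}, {n_high:.3f}]")    # spans zero
```

Because every Half-Cauchy draw is non-negative, the HDI excludes negative values before the model has seen a single observation.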

### Interpreting Optimization Results

The `optimal_budget_allocation.csv` file contains the recommended budget allocations:
- Each node is assigned a proportion of the total budget
- Higher allocations indicate touchpoints where additional investment will yield the greatest return
- The allocation takes into account both direct conversion effects and indirect effects through the customer journey
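
As a rough illustration of how the proportions translate into amounts, the sketch below parses a small in-memory CSV. The column names `node` and `allocation`, the `test_drive` row, the proportions, and the total budget are all hypothetical; check the actual header of `optimal_budget_allocation.csv` before adapting it.

```python
import csv
import io

# Hypothetical file contents: the column names and values are assumed.
csv_text = """node,allocation
configuration_accessories,0.50
offer_request,0.30
test_drive,0.20
"""

total_budget = 100_000.0  # hypothetical total budget

rows = list(csv.DictReader(io.StringIO(csv_text)))
proportions = {row["node"]: float(row["allocation"]) for row in rows}

# Proportions should sum to 1 (up to rounding) before being turned into amounts.
if abs(sum(proportions.values()) - 1.0) > 1e-6:
    raise ValueError("allocations do not sum to 1")

amounts = {node: share * total_budget for node, share in proportions.items()}
for node, amount in sorted(amounts.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{node}: {amount:,.0f}")
```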

## Is a non-negative prior justified?

Yes. We believe a priori that all relationships in the customer journey should be positive. Each action in a customer journey typically represents a step of engagement or interest, so it is logically consistent that more engagement at one step would increase (not decrease) the likelihood of engagement at subsequent steps.

Also, the journey from awareness to purchase is fundamentally a forward-moving process. While customers might drop off or abandon the journey, the presence of one positive action (like configuring a car) shouldn't directly cause a decrease in the probability of another positive action (like making a contact request).

Users who take action A have self-selected as more engaged users, making them inherently more likely (not less likely) to take action B. This creates a natural positive relationship between sequential actions.

Finally, our data captures positive events (actions taken) rather than negative events (actions avoided). In this framework, the absence of an action is represented by zero, making negative relationships less meaningful.

The key point is that while negative relationships could theoretically exist in other contexts (like when measuring satisfaction or when actions are mutually exclusive), in the specific context of a customer journey funnel, positive relationships are a reasonable prior assumption.

## Is the Half-Cauchy prior justified?

Yes. The model's earlier parametrisation used a Half-Gaussian (Half-Normal) prior; the Half-Cauchy is a better choice for our use case.

Customer journeys often have "power law" type behaviours where most touchpoints have moderate effects, but a few critical touchpoints can have outsized impacts. The heavy tails of the Half-Cauchy can capture these important edge cases.

We don't want to artificially constrain large effects when they truly exist in the data. The Half-Cauchy's heavier tails allow the model to learn these strong relationships if they're supported by the data, while still maintaining reasonable skepticism about extremely large effects.

The heavy tails also make the prior less sensitive to the chosen scale parameter. This is valuable because different types of touchpoints (e.g., configuration events vs contact requests) might naturally operate on different scales of impact.
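
The difference in tail weight can be quantified directly from the two distributions' closed-form tail probabilities. The sketch below compares the prior probability of a coefficient exceeding a given multiple of the scale; the scale of 1.0 and the function names are illustrative assumptions.

```python
import math

def half_normal_tail(x: float, scale: float = 1.0) -> float:
    """P(X > x) for a Half-Normal with the given scale."""
    return math.erfc(x / (scale * math.sqrt(2.0)))

def half_cauchy_tail(x: float, scale: float = 1.0) -> float:
    """P(X > x) for a Half-Cauchy with the given scale."""
    return 1.0 - (2.0 / math.pi) * math.atan(x / scale)

for multiple in (1, 3, 5, 10):
    print(
        f"P(beta > {multiple:>2} x scale): "
        f"Half-Normal = {half_normal_tail(multiple):.2e}, "
        f"Half-Cauchy = {half_cauchy_tail(multiple):.2e}"
    )
```

At five times the scale, the Half-Normal assigns a prior probability below one in a million, while the Half-Cauchy still assigns roughly 12.6%, so a strong effect needs far less data to overcome the prior.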

## Should the downstream nodes have the biggest weights?

Not necessarily. The optimization logic determines node importance based on several factors:

- The node's position in the funnel
- The strength of its connections (betas)
- Its baseline probability (related to alpha)
- Its role in the path to conversion

The model does not simply assign greater weights to downstream nodes. Instead, it employs a more sophisticated approach, calculating each node's importance based on its direct influence on the final outcome (`purchase_outcome`) and its indirect influence via intermediate nodes. This approach considers both _immediate effects_ AND _network effects_.

As a result, the allocation reported in the optimization analysis output should be strategic rather than purely sequential. For instance, if the transition from `configuration_accessories` to `offer_request` proves to be a significant bottleneck, the optimizer might allocate more budget to `configuration_accessories` than to `offer_request`.

Therefore, the optimal allocation does not automatically prioritize downstream nodes. Instead, it depends on the strength of connections between nodes, the presence of bottlenecks, the potential for network effects, and the conversion value at each stage.
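
One simplified way to picture direct plus indirect influence is to sum the products of betas along every path from a node to `purchase_outcome`. The sketch below does this on a small hypothetical graph. The beta values, the `contact_request` node, and the path-product rule itself are assumptions for illustration: the rule is a linear approximation that ignores the logistic link, and the library's actual importance calculation may differ.

```python
from functools import lru_cache

# Hypothetical betas for illustration only; not taken from the fitted model.
betas = {
    ("configuration_accessories", "offer_request"): 0.8,
    ("configuration_accessories", "contact_request"): 0.7,
    ("offer_request", "purchase_outcome"): 0.3,
    ("contact_request", "purchase_outcome"): 0.4,
}
TARGET = "purchase_outcome"

# Adjacency list: source node -> [(child node, beta), ...]
children = {}
for (source, target), beta in betas.items():
    children.setdefault(source, []).append((target, beta))

@lru_cache(maxsize=None)
def total_influence(node: str) -> float:
    """Sum of beta products over all directed paths from `node` to TARGET."""
    if node == TARGET:
        return 1.0
    return sum(beta * total_influence(child) for child, beta in children.get(node, []))

for node in ("configuration_accessories", "offer_request", "contact_request"):
    print(f"{node}: {total_influence(node):.2f}")
# configuration_accessories: 0.52
# offer_request: 0.30
# contact_request: 0.40
```

In this toy graph the upstream node scores highest because it feeds two paths to conversion, even though it has no direct edge to `purchase_outcome`.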

## What do convergence warnings mean?

MCMC convergence warnings indicate potential issues with the Bayesian model fitting process:

- **Divergent transitions**: These indicate regions of the posterior distribution that the sampler could not explore reliably, so estimates that depend on those regions may be biased. They often point to strong correlations between parameters or to a model that is too complex for the available data.
  
  *Solution*: Try increasing the `target_accept` parameter in the configuration or simplifying the model structure.

- **Low effective sample size**: This indicates that samples are highly correlated, reducing the effective amount of independent information.
  
  *Solution*: Increase the number of samples (`draws` parameter) or modify the model structure to reduce parameter correlations.
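
To illustrate why correlated samples carry less information, the sketch below compares a crude effective sample size estimate for independent draws and for a strongly autocorrelated chain. The estimator (truncating at the first non-positive autocorrelation) is a deliberately simple assumption; sampler diagnostics typically use more refined estimators, so the numbers are indicative only.

```python
import random

def effective_sample_size(chain, max_lag=200):
    """Crude ESS: n / (1 + 2 * sum of leading positive autocorrelations)."""
    n = len(chain)
    mean = sum(chain) / n
    centred = [x - mean for x in chain]
    variance = sum(c * c for c in centred) / n
    rho_sum = 0.0
    for lag in range(1, max_lag + 1):
        cov = sum(centred[i] * centred[i + lag] for i in range(n - lag)) / n
        rho = cov / variance
        if rho <= 0.0:
            break  # stop once the autocorrelation is indistinguishable from noise
        rho_sum += rho
    return n / (1.0 + 2.0 * rho_sum)

rng = random.Random(0)
n = 5_000

# Independent draws: each sample is new information.
independent = [rng.gauss(0.0, 1.0) for _ in range(n)]

# AR(1) chain: each draw depends strongly on the previous one.
correlated = [0.0]
for _ in range(n - 1):
    correlated.append(0.9 * correlated[-1] + rng.gauss(0.0, 1.0))

ess_independent = effective_sample_size(independent)
ess_correlated = effective_sample_size(correlated)
print(f"independent chain ESS: {ess_independent:.0f} of {n}")
print(f"correlated chain ESS:  {ess_correlated:.0f} of {n}")
```

Increasing `draws` raises the raw sample count, while reducing parameter correlations raises the share of those draws that count as independent.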

## The problem with exit nodes (actual car purchase)

The challenge we currently face is that our model uses a proxy for the true exit node. The `purchase_outcome` node in the current model is derived from the `SALES_FUNNEL == 'ClosedSuccessfully'` status in the `backend_leads` data, which might not capture *all* actual purchases.

Incorporating true, verified purchase data linked directly to the `USER_PSEUDO_ID` would significantly enhance the model's accuracy and utility in several ways:

1. **True value calculation**: With real purchase data, we could measure the actual beta coefficients between each stage of the funnel and final purchases
2. **Comprehensive path analysis**: We could identify multiple successful paths to purchase, recognizing that some customers might skip certain stages
3. **Time-to-purchase dynamics**: By understanding the time lag between different actions and actual purchases, we could identify which early actions lead to faster conversions
4. **Model improvements**: We would obtain more accurate beta coefficients, allowing for better optimization targets focused on actual revenue rather than proxy conversions
5. **Assumption validation**: We could test whether high-value proxy events, such as test drives, actually correlate strongly with purchases

Integrating actual purchase data would likely lead to different optimization recommendations, as we would be optimizing for actual purchases rather than proxy metrics.

## How would individual user-level data change the model?

Currently, we rely on aggregate data and assume a predefined DAG structure. Incorporating individual user-level data could significantly enhance the model's effectiveness:

1. **Path discovery**: Instead of assuming paths, we could discover the actual journeys customers take
2. **Causality vs. correlation**: We could examine whether test drives precede and predict purchases for the same users, rather than inferring this from aggregate co-occurrence
3. **Event sequence analysis**: User data could reveal whether the order of events impacts conversion rates
4. **Segment analysis**: We could identify different types of users with distinct optimal paths
5. **DAG structure reevaluation**: We might discover important loops or repeated patterns that are currently overlooked

Transitioning to user-level data analysis would provide a more nuanced understanding of customer behaviour, potentially showing that the customer journey is less linear than we assume.
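
As a minimal sketch of path discovery, the example below counts observed transitions between consecutive events in per-user sequences. The user IDs, the `test_drive` event, and the sequences themselves are hypothetical; real data would be keyed by `USER_PSEUDO_ID` and ordered by timestamp.

```python
from collections import Counter

# Hypothetical user-level event sequences, ordered by time.
journeys = {
    "user_a": ["configuration_accessories", "offer_request", "purchase_outcome"],
    "user_b": ["configuration_accessories", "test_drive",
               "configuration_accessories", "offer_request"],
    "user_c": ["test_drive", "offer_request", "purchase_outcome"],
}

# Count every (source, target) pair of consecutive events across all users.
transitions = Counter()
for events in journeys.values():
    transitions.update(zip(events, events[1:]))

for (source, target), count in transitions.most_common():
    print(f"{source} -> {target}: {count}")
```

In this toy data, `user_b` returns to `configuration_accessories` after `test_drive`, a backward transition that a predefined one-way graph would not represent.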

## What about the acyclicity assumption (the 'A' in DAG)?

If we observe significant feedback effects, we would need to move beyond the current Directed Acyclic Graph (DAG) model. Users may exhibit behaviours such as:

- Returning to configuration after taking test drives
- Engaging in multiple dealer interactions over time
- Iteratively comparing prices and configurations
- Revisiting research phases after initial interest

The current DAG model assumes a one-way flow through the funnel, failing to capture recursive behaviour. Alternative modelling approaches could include:

1. **Dynamic Graphical Models**: Can explicitly model time-dependent relationships and handle feedback loops
2. **Hidden Markov Models**: Effective for modelling state transitions and cyclical patterns
3. **Recurrent Neural Networks**: Robust for handling sequential data and capturing long-term effects

Adopting these alternatives would require different optimization algorithms, allowing for the optimization of long-term value and better handling of multi-touch attribution.
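
Before changing the modelling approach, it can help to check whether observed transitions actually violate acyclicity. The sketch below uses Python's standard `graphlib` module (Python 3.9+) to detect a cycle in a set of edges. The edge lists, the `test_drive` node, and the `find_cycle` helper are hypothetical and only illustrate the check.

```python
from graphlib import CycleError, TopologicalSorter

def find_cycle(edges):
    """Return a list of nodes forming a cycle, or None if the edges form a DAG."""
    sorter = TopologicalSorter()
    for source, target in edges:
        sorter.add(target, source)  # `source` must come before `target`
    try:
        sorter.prepare()  # raises CycleError if the graph contains a cycle
    except CycleError as error:
        return error.args[1]
    return None

# Hypothetical one-way funnel, as a DAG would assume.
assumed_edges = [
    ("configuration_accessories", "offer_request"),
    ("offer_request", "test_drive"),
    ("test_drive", "purchase_outcome"),
]

# Hypothetical observed behaviour: users return to configuration after a test drive.
observed_edges = assumed_edges + [("test_drive", "configuration_accessories")]

print(find_cycle(assumed_edges))   # None
print(find_cycle(observed_edges))  # a node list whose first and last entries match
```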

## How can I adapt the model for different countries?

Adapting the model for different countries (e.g., from Toyota GB to Toyota Germany) requires:

1. **DAG Structure Adaptation**: Revise the graph structure in `config.yml` to reflect the local website architecture
2. **Data Preparation**: Ensure data follows the same format but reflects local user behavior
3. **Cultural Factors**: Consider adjusting priors to account for cultural differences in the customer journey
4. **Validation**: Run comparative analysis against local data to validate the adapted model

## What if the optimizer is not finding good solutions?

If the Genetic Algorithm optimizer is struggling:

1. **Increase the number of generations**: Modify the `generations` parameter in the optimization configuration
2. **Adjust the population size**: A larger population can explore the solution space more thoroughly
3. **Check constraints**: Ensure that the constraints are not overly restrictive
4. **Examine parameter estimates**: Poor parameter estimates from the Bayesian model will lead to poor optimization results
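
To show how the number of generations and the population size affect the search, here is a minimal genetic algorithm sketch that allocates a budget across three nodes. The objective (square-root diminishing returns with hypothetical `weights`), the function name `run_ga`, and all parameter defaults are assumptions for illustration; this is not ConversionFlow's optimizer.

```python
import math
import random

def fitness(alloc, weights):
    """Hypothetical objective: diminishing returns on each node's budget share."""
    return sum(w * math.sqrt(a) for w, a in zip(weights, alloc))

def normalise(vec):
    """Rescale so the shares sum to 1 (the budget constraint)."""
    total = sum(vec)
    return [v / total for v in vec]

def run_ga(weights, generations=50, population_size=30, seed=0):
    rng = random.Random(seed)
    n = len(weights)
    population = [
        normalise([rng.random() + 1e-9 for _ in range(n)])
        for _ in range(population_size)
    ]
    for _ in range(generations):
        # Elitism: the better half survives unchanged.
        population.sort(key=lambda a: fitness(a, weights), reverse=True)
        elite = population[: population_size // 2]
        children = []
        while len(elite) + len(children) < population_size:
            p1, p2 = rng.sample(elite, 2)
            child = [(a + b) / 2.0 for a, b in zip(p1, p2)]              # crossover
            child = [max(1e-9, c + rng.gauss(0.0, 0.05)) for c in child]  # mutation
            children.append(normalise(child))
        population = elite + children
    best = max(population, key=lambda a: fitness(a, weights))
    return best, fitness(best, weights)

weights = [1.0, 2.0, 3.0]  # hypothetical per-node effectiveness
for gens in (5, 50):
    best, score = run_ga(weights, generations=gens)
    print(gens, [round(share, 3) for share in best], round(score, 4))
```

Because the elite is carried over unchanged, rerunning with the same seed and more generations cannot return a worse solution. If fitness stays flat as generations increase, the constraints or the parameter estimates are more likely culprits than the search budget.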

## How much data is needed for reliable results?

- For Bayesian Network inference, at least 1000 user sessions are recommended
- More complex models (with more nodes and edges) require more data
- For robust results, aim for at least 3 months of historical data
- Seasonal patterns may require a full year of data for accurate modeling

## Can I integrate this with our ad platforms directly?

Currently, the ConversionFlow library generates recommended budget allocations but does not provide direct integration with ad platforms. The typical workflow is:

1. Run the ConversionFlow pipeline to generate optimal budget allocations
2. Manually apply these recommendations to your ad platforms
3. Monitor performance and periodically re-run the pipeline with updated data

Direct integration would require additional development work specific to each platform's API.