Hypothesis Testing: Black Friday Buildup (Pragmatic Option)

This guide shows a simple, pragmatic way to test whether a pre-event buildup effect exists (e.g., customers delaying purchases before Black Friday) by configuring a neutral, heavy-tailed prior for the control variables.

Overview

Goal: Test whether a buildup dummy (e.g., black_friday_buildup) has a non‑zero effect on the target.
Approach: Use a StudentT prior on gamma_control so control coefficients can be positive or negative, with no bounds.
Why StudentT? Heavy tails (robust to outliers), support on the full real line, and a single scalar prior shared by all controls, so no vectorised prior is needed when adding one extra feature.

Requirements

Add your buildup dummy to the data (e.g., data-config/statlas_data.csv), column name: black_friday_buildup.
Expose it in the config under extra_features_cols.

Example (data-config/statlas_config_v3.yml):

extra_features_cols:
  - black_friday_buildup
  # (Optional) add an event-week dummy too, e.g. black_friday_event
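
If the dummy column does not exist yet, it can be derived from the calendar. The sketch below assumes weekly data keyed by Monday week-start dates and a two-week buildup window; both are illustrative choices, and the helper names black_friday and buildup_dummy are hypothetical, not part of the pipeline.

```python
from datetime import date, timedelta


def black_friday(year: int) -> date:
    """Day after US Thanksgiving (the fourth Thursday of November)."""
    nov1 = date(year, 11, 1)
    # weekday(): Monday=0 ... Thursday=3
    first_thursday = nov1 + timedelta(days=(3 - nov1.weekday()) % 7)
    return first_thursday + timedelta(weeks=3, days=1)


def buildup_dummy(week_starts, n_buildup_weeks=2):
    """Return 1 for weeks inside the buildup window, else 0.

    Assumption: each entry of week_starts is the Monday of a weekly row.
    The window covers the n_buildup_weeks weeks before the Black Friday week.
    """
    flags = []
    for week_start in week_starts:
        bf = black_friday(week_start.year)
        event_week_start = bf - timedelta(days=bf.weekday())  # Monday of BF week
        days_before_event_week = (event_week_start - week_start).days
        flags.append(int(0 < days_before_event_week <= 7 * n_buildup_weeks))
    return flags


weeks = [date(2023, 10, 30), date(2023, 11, 6), date(2023, 11, 13),
         date(2023, 11, 20), date(2023, 11, 27)]
print(buildup_dummy(weeks))  # [0, 1, 1, 0, 0]
```

The resulting 0/1 list would be written to the black_friday_buildup column of the data file; the week of Black Friday itself is left at 0 so it can be captured by a separate event dummy if desired.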

Configuration: Prior for Controls

Set a neutral, heavy‑tailed prior for all control variables via custom_priors.gamma_control:

custom_priors:
  gamma_control:
    dist: StudentT
    kwargs:
      nu: 3      # Heavy tails (robust to outliers)
      mu: 0      # Centered at zero (neutral prior)
      sigma: 1   # Moderate width (lets data speak)

Properties:

Allows negative values → Full real‑line support (no bounds needed)
Heavy tails (nu=3) → Robust to extreme values
Centered at zero → No directional bias
sigma=1 → Moderate width (enough for a simple hypothesis test on the scaled target)
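
To make these properties concrete, the following sketch draws from StudentT(nu=3, mu=0, sigma=1) with NumPy and compares it with a Normal(0, 1) prior. It is only an illustration of the prior's shape; it does not use the modelling pipeline, and the sample size and the threshold of 3 are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# StudentT(nu, mu, sigma) draws: a standard t variate scaled by sigma, shifted by mu.
nu, mu, sigma = 3, 0.0, 1.0
t_draws = mu + sigma * rng.standard_t(df=nu, size=n)
normal_draws = rng.normal(loc=mu, scale=sigma, size=n)

# Centered at zero: about half of the prior mass is negative.
frac_negative = (t_draws < 0).mean()

# Heavy tails: far more prior mass beyond |3| than under a normal prior.
t_tail = (np.abs(t_draws) > 3).mean()
normal_tail = (np.abs(normal_draws) > 3).mean()

print(f"P(gamma < 0)      StudentT: {frac_negative:.3f}")
print(f"P(|gamma| > 3)    StudentT: {t_tail:.4f}  Normal: {normal_tail:.4f}")
```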

Notes:

This applies the same prior to all controls. For this pragmatic test, that’s fine and avoids vectorised sigma.
If you later add an event dummy and want a wider prior for it, you can vectorise sigma in the exact order of extra_features_cols. For hypothesis testing with a single buildup dummy, keep it simple and skip vectorisation.
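
If vectorisation is needed later, one way to avoid an order mismatch is to build the sigma list from the column list rather than typing it by hand. The helper below is a hypothetical sketch (sigma_in_column_order is not part of the pipeline), and how the resulting list is written into custom_priors is left to the pipeline's own config format.

```python
def sigma_in_column_order(extra_features_cols, widths, default=1.0):
    """Return one sigma per control, in the exact order of extra_features_cols.

    widths maps column name -> sigma; columns not listed get `default`.
    """
    unknown = set(widths) - set(extra_features_cols)
    if unknown:
        # A typo in a column name would otherwise be silently ignored.
        raise KeyError(f"widths given for unknown columns: {sorted(unknown)}")
    return [float(widths.get(col, default)) for col in extra_features_cols]


cols = ["black_friday_buildup", "black_friday_event"]
sigmas = sigma_in_column_order(cols, {"black_friday_event": 2.0})
print(sigmas)  # [1.0, 2.0]
```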

Run the Model

Use your normal pipeline (e.g., python -u runme.py).
Ensure your config includes the extra_features_cols and custom_priors.gamma_control above.
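
A quick pre-flight check can catch a missing entry before a long run. The sketch below assumes the YAML config has already been loaded into a plain dict (for example with a YAML parser); check_config is a hypothetical helper, and it only inspects the two keys discussed in this guide.

```python
def check_config(cfg, column="black_friday_buildup"):
    """Return a list of problems found in an already-loaded config dict."""
    problems = []
    if column not in (cfg.get("extra_features_cols") or []):
        problems.append(f"{column!r} is missing from extra_features_cols")
    prior = (cfg.get("custom_priors") or {}).get("gamma_control")
    if prior is None:
        problems.append("custom_priors.gamma_control is not set")
    elif prior.get("dist") != "StudentT":
        problems.append(f"gamma_control uses {prior.get('dist')!r}, expected 'StudentT'")
    return problems


# Dict equivalent of the YAML fragments shown in this guide.
cfg = {
    "extra_features_cols": ["black_friday_buildup"],
    "custom_priors": {
        "gamma_control": {"dist": "StudentT", "kwargs": {"nu": 3, "mu": 0, "sigma": 1}},
    },
}
print(check_config(cfg))  # []
```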

Hypothesis Testing Interpretation

Let γ_buildup = gamma_control['black_friday_buildup'].

Expected Results

Buildup exists:
- Posterior mean is negative (e.g., −0.15)
- 95% credible interval excludes 0 on the negative side (e.g., [−0.25, −0.05])
- Interpretation: purchase delay (customers hold off buying)

No buildup effect:
- Posterior mean near 0 (e.g., −0.02)
- 95% credible interval includes 0 (e.g., [−0.10, 0.06])
- Interpretation: no evidence of purchase delay

Event effect (if modeled separately):
- gamma_control['black_friday_event'] is positive (e.g., +0.50)
- 95% credible interval excludes 0 on the positive side (e.g., [0.35, 0.65])
- Interpretation: event drives sales spike
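
The three cases above can be expressed as a small decision rule on posterior draws. This is a sketch, not part of the pipeline: classify_effect is a hypothetical helper, and the draws are synthetic normals chosen to resemble the example numbers rather than real model output.

```python
import numpy as np


def classify_effect(draws, level=0.95):
    """Label an effect from the sign of its equal-tailed credible interval."""
    draws = np.asarray(draws).ravel()
    tail = 100 * (1 - level) / 2
    low, high = np.percentile(draws, [tail, 100 - tail])
    if high < 0:
        return "negative"      # interval entirely below zero
    if low > 0:
        return "positive"      # interval entirely above zero
    return "inconclusive"      # interval includes zero


rng = np.random.default_rng(0)
buildup_like = rng.normal(-0.15, 0.05, size=4000)   # synthetic, not model output
null_like = rng.normal(-0.02, 0.04, size=4000)
event_like = rng.normal(0.50, 0.075, size=4000)

for name, draws in [("buildup", buildup_like), ("null", null_like), ("event", event_like)]:
    print(name, classify_effect(draws))
```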

Statistical Significance

95% credible interval excludes zero → the data support a non-zero effect (the Bayesian counterpart of significance at the 95% level)
Magnitude matters → compare effect sizes, not just significance
Compare buildup vs event → net effect = event boost − buildup delay
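
The net comparison is best done per posterior draw, so its uncertainty carries through. Because a purchase delay shows up as a negative buildup coefficient, "event boost − buildup delay" is the sum of the two coefficients. The sketch below uses synthetic, independent draws; with real output, pair the two coefficients draw by draw from the same posterior. net_effect and the period counts are illustrative assumptions.

```python
import numpy as np


def net_effect(event_draws, buildup_draws, n_event_periods=1, n_buildup_periods=1):
    """Per-draw net effect: event boost plus (negative) buildup effect.

    The period counts weight each coefficient by how many rows its dummy covers.
    """
    event_draws = np.asarray(event_draws)
    buildup_draws = np.asarray(buildup_draws)
    return n_event_periods * event_draws + n_buildup_periods * buildup_draws


rng = np.random.default_rng(1)
buildup = rng.normal(-0.15, 0.05, size=4000)   # synthetic, not model output
event = rng.normal(0.50, 0.075, size=4000)

net = net_effect(event, buildup)
low, high = np.percentile(net, [2.5, 97.5])
print(f"net: {net.mean():.3f} [{low:.3f}, {high:.3f}]  P(net>0)={(net > 0).mean():.3f}")
```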

Extracting Posterior Coefficients

Example after fitting (using the v2 model’s InferenceData):

import numpy as np
import xarray as xr

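# Stand-in with random draws so the snippet runs on its own;
# skipped when a fitted `model` already exists in the session.
if 'model' not in globals():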
    class _Idata:
        pass

    class _Model:
        pass

    _idata = _Idata()
    _idata.posterior = xr.Dataset(
        {
            'gamma_control': (
                ('chain', 'draw', 'control'),
                np.random.normal(size=(1, 50, 1)),
            )
        },
        coords={'control': ['black_friday_buildup']},
    )
    model = _Model()
    model.idata = _idata

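# Posterior draws for every control, indexed by chain, draw and control name.
coef = model.idata.posterior['gamma_control']
# Pool all chains and draws for the buildup dummy into one 1-D array.
vals = coef.sel(control='black_friday_buildup').values.flatten()
mean = np.mean(vals)
# Equal-tailed 95% credible interval.
ci_low, ci_high = np.percentile(vals, [2.5, 97.5])
# Posterior probability that the effect is negative (purchase delay).
p_neg = (vals < 0).mean()
print(f"buildup: {mean:.3f} [{ci_low:.3f}, {ci_high:.3f}]  P(γ<0)={p_neg:.3f}")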

Tips:

Report both the 95% credible interval and P(γ<0 | data) for a Bayesian view of evidence.
Coefficients are on the model’s scaled target space (target is max‑abs scaled in‑graph). For hypothesis testing (sign and non‑zero effect), this is generally sufficient.
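
If an effect size in original units is needed, a rough conversion is to multiply by the scaling constant. The sketch below assumes the target is divided by its maximum absolute value, the dummy enters as 0/1 without scaling, and the control term is additive in the scaled target; if the model differs on any of these points, the conversion does not apply. to_original_units is a hypothetical helper.

```python
import numpy as np


def to_original_units(gamma_draws, target):
    """Rescale coefficient draws from max-abs-scaled space to target units."""
    scale = np.max(np.abs(target))  # same constant as max-abs scaling of the target
    return np.asarray(gamma_draws) * scale


target = np.array([100.0, -400.0, 250.0])   # illustrative target values
gamma = np.array([-0.10, 0.20])             # illustrative coefficient draws
print(to_original_units(gamma, target))     # [-40.  80.]
```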

Common Pitfalls

Forgetting to add the column to extra_features_cols → the model won’t include your dummy.
Over-wide priors (e.g., very large sigma) → weaker regularisation, so more data is needed before the posterior separates a real effect from zero; start with sigma: 1.
Vectorised priors order mismatch → only vectorise if you really need different widths; keep it scalar for simple tests.