Payments Forecasting
Forecast gross payment volume (GPV) across merchants using panel data with 44 entities tracked over 41 time periods.
Dataset Exploration
The Payments dataset records gross payment volume per merchant and time period. The features are boolean flags for currency, territory, payment provider, and payment method. The target variable (gpv) is the gross payment volume.
Panel data structure: 44 merchants tracked over 41 time periods (1,804 total rows, 9 columns).
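A quick way to confirm the panel shape stated above is to count entities, periods, and rows, then check that every merchant appears once per period. The sketch below is illustrative only: `panel_shape` is a hypothetical helper, not part of the DATFID client, and the toy frame stands in for the real file.

```python
import pandas as pd

def panel_shape(df: pd.DataFrame, id_col: str = "id", time_col: str = "date") -> dict:
    """Summarise a panel: entity count, period count, rows, and balance."""
    n_entities = df[id_col].nunique()
    n_periods = df[time_col].nunique()
    # A well-formed panel has no repeated (entity, period) pair.
    has_duplicates = bool(df.duplicated([id_col, time_col]).any())
    # Balanced: every entity is observed exactly once in every period.
    balanced = (not has_duplicates) and len(df) == n_entities * n_periods
    return {
        "entities": n_entities,
        "periods": n_periods,
        "rows": len(df),
        "balanced": balanced,
    }

# Toy panel (assumption): 3 merchants x 4 periods instead of 44 x 41.
toy = pd.DataFrame({
    "id": [f"id{i}" for i in range(3) for _ in range(4)],
    "date": [44621, 44652, 44682, 44713] * 3,
    "gpv": [float(v) for v in range(12)],
})
summary = panel_shape(toy)
print(summary)
```

On the real data, a balanced panel would report 44 entities, 41 periods, and 1,804 rows (44 × 41).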
Fit Data (first 5 rows)
| id | date | gpv | currency_eur | territory_FR | territory_IT | pprovider_stripe | paymenttype_link | paymenttype_sepa_debit |
|---|---|---|---|---|---|---|---|---|
| id4 | 44621 | 332804.72 | false | false | false | true | false | false |
| id4 | 44652 | 25440.27 | false | false | false | true | false | false |
| id4 | 44805 | 33363.04 | false | false | false | true | false | false |
| id4 | 44682 | 10024.33 | false | false | false | true | false | false |
| id4 | 44896 | 1950.75 | false | false | false | true | false | false |
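The date values in the table look like Excel day serials rather than calendar dates. Assuming that is the case, and that the workbook uses Excel's 1900 date system, they can be converted for inspection with pandas. `excel_serial_to_datetime` is a hypothetical helper name, and nothing here implies the DATFID client needs this conversion.

```python
import pandas as pd

def excel_serial_to_datetime(serial: pd.Series) -> pd.Series:
    """Convert Excel day serials to timestamps (assumes the 1900 date system)."""
    # For dates after February 1900, Excel serials count days from 1899-12-30.
    return pd.to_datetime(serial, unit="D", origin="1899-12-30")

# The five date values shown in the fit-data preview.
sample = pd.Series([44621, 44652, 44805, 44682, 44896], name="date")
converted = excel_serial_to_datetime(sample)
print(converted.dt.strftime("%Y-%m-%d").tolist())
```

Under that assumption the five rows fall on the first day of March, April, September, May, and December 2022, which suggests monthly periods and shows that the preview rows are not in chronological order.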
Forecast Data (first 3 rows)
The forecast file covers future periods without the gpv target column.
| id | date | currency_eur | territory_FR | territory_IT | pprovider_stripe | paymenttype_link | paymenttype_sepa_debit |
|---|---|---|---|---|---|---|---|
| id0 | 45809 | false | false | false | true | false | false |
| id0 | 45839 | false | false | false | true | false | false |
| id0 | 45870 | false | false | false | true | false | false |
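Since the forecast file is described as the fit file's columns minus the target, a small check before forecasting can catch mismatches early. This is a sketch under that assumption; `check_forecast_columns` is a hypothetical helper and the two frames are toy stand-ins for the real files.

```python
import pandas as pd

def check_forecast_columns(df_fit: pd.DataFrame,
                           df_forecast: pd.DataFrame,
                           target: str = "gpv") -> tuple[list, list]:
    """Return (missing, extra) columns of the forecast frame relative to the fit frame."""
    expected = [c for c in df_fit.columns if c != target]
    missing = [c for c in expected if c not in df_forecast.columns]
    extra = [c for c in df_forecast.columns if c not in expected]
    return missing, extra

# Toy frames (assumption) with a subset of the real column names.
fit_toy = pd.DataFrame({
    "id": ["id4"], "date": [44621], "gpv": [332804.72],
    "territory_FR": [False], "paymenttype_link": [False],
})
forecast_toy = pd.DataFrame({
    "id": ["id0"], "date": [45809],
    "territory_FR": [False], "paymenttype_link": [False],
})
missing, extra = check_forecast_columns(fit_toy, forecast_toy)
print("missing:", missing, "extra:", extra)
```

Both lists being empty means the forecast frame carries every feature column and no target.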
Code Walkthrough
Step 1: Initialize
import pandas as pd
from datfid import DATFIDClient
client = DATFIDClient(token="your_DATFID_token")
Step 2: Fit the Model
url_fit = "https://raw.githubusercontent.com/datfid-valeriidashuk/sample-datasets/main/Payments.xlsx"
df = pd.read_excel(url_fit)
result = client.fit_model(
    df=df,
    id_col="id",
    time_col="date",
    y="gpv",
    current_features="all",
    filter_by_significance=True
)
Step 3: Forecast
url_forecast = "https://raw.githubusercontent.com/datfid-valeriidashuk/sample-datasets/main/Payments_forecast.xlsx"
df_forecast = pd.read_excel(url_forecast)
forecast = client.forecast_model(df_forecast=df_forecast)
Analysis Results (Model Fit)
Formula
gpv ~ α1*Intercept + α2*paymenttype_sepa_debit + α3*paymenttype_link + α4*territory_IT + α5*territory_FR
Alpha Estimates (Time-Invariant)
All features in this dataset are time-invariant (entity characteristics), so there are no time-varying betas.
| Variable | Estimate | T-stat | Interpretation |
|---|---|---|---|
| Intercept | 12,156.0 | 9.8 | Baseline gross payment volume (~12,156) for an entity with all four flags in the formula set to false, i.e. the reference payment type and territory. |
| paymenttype_sepa_debit | -1,346.6 | 0.205 | SEPA-debit payments run ~1,346.6 below the reference payment type, but with a t-statistic of about 0.2 the difference is not statistically significant. |
| paymenttype_link | -12,073.1 | 2.6 | Link payments run ~12,073.1 below the reference payment type. |
| territory_IT | -11,663.8 | 4.6 | Italian territory runs ~11,663.8 below the reference territory. |
| territory_FR | -9,513.5 | 3.8 | French territory runs ~9,513.5 below the reference territory. |
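To see how the estimates combine, the formula can be evaluated by hand for a given merchant profile: start from the intercept and add the estimate of each flag that is true. The sketch below does only that arithmetic on the point estimates from the table. `predict_gpv` is a hypothetical helper, not the DATFID forecasting routine, and the two profiles are made up for illustration.

```python
# Point estimates copied from the Alpha Estimates table.
ALPHA = {
    "Intercept": 12156.0,
    "paymenttype_sepa_debit": -1346.6,
    "paymenttype_link": -12073.1,
    "territory_IT": -11663.8,
    "territory_FR": -9513.5,
}

def predict_gpv(flags: dict) -> float:
    """Intercept plus the estimate of every flag set to True, rounded to 1 decimal."""
    total = ALPHA["Intercept"]
    for name, active in flags.items():
        if active:
            total += ALPHA[name]
    return round(total, 1)

reference = predict_gpv({})  # all flags false
fr_sepa = predict_gpv({"territory_FR": True, "paymenttype_sepa_debit": True})
it_link = predict_gpv({"territory_IT": True, "paymenttype_link": True})
print(reference, fr_sepa, it_link)
```

The French SEPA-debit profile comes out at 12,156.0 - 9,513.5 - 1,346.6 = 1,295.9. The Italian link profile comes out at 12,156.0 - 11,663.8 - 12,073.1 = -11,580.9, a negative volume, which is one sign that an additive model on these flags alone fits some merchant groups poorly.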
Model Performance
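The Payments dataset has high merchant-level variance and few features. Because every feature in the formula is a time-invariant flag, the formula yields the same value for a given merchant in every period and cannot capture changes over time. Adding time-varying features or lagged variables would likely improve model fit.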
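One common way to build lagged variables for panel data is to shift the target within each merchant after sorting by period. The sketch below shows that technique in pandas. `add_lags` and the `gpv_lag1` / `gpv_lag2` column names are assumptions for illustration. How the DATFID client would treat such columns, and how they would be supplied in the forecast file, is not covered by this walkthrough.

```python
import pandas as pd

def add_lags(df: pd.DataFrame, id_col: str = "id", time_col: str = "date",
             target: str = "gpv", lags: tuple = (1, 2)) -> pd.DataFrame:
    """Add per-entity lagged copies of the target; early periods get NaN."""
    # Sort so that shifting moves values along each merchant's own timeline.
    out = df.sort_values([id_col, time_col]).copy()
    for k in lags:
        # groupby keeps lags from leaking across merchants.
        out[f"{target}_lag{k}"] = out.groupby(id_col)[target].shift(k)
    return out

# Toy panel (assumption): two merchants, three periods, deliberately unsorted.
toy = pd.DataFrame({
    "id": ["id4", "id4", "id4", "id0", "id0", "id0"],
    "date": [44652, 44621, 44682, 44621, 44652, 44682],
    "gpv": [20.0, 10.0, 30.0, 1.0, 2.0, 3.0],
})
lagged = add_lags(toy)
print(lagged)
```

The first period of each merchant has no previous value, so its lag columns are NaN and those rows need to be dropped or imputed before fitting.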
Try it yourself: Select "Payments" in the Free Playground.