M5 Department Sales Forecasting
Forecast retail department-level sales based on the M5 forecasting competition data. 7 departments tracked over 1,941 time periods.
Dataset Exploration
The M5 Department dataset is based on the M5 forecasting competition and contains Walmart department-level sales data. Features include CPI, holiday events (sporting, cultural, national, religious), SNAP benefits by state, day of week, and month.
Panel data structure: 7 department aggregations tracked over 1,941 days (13,587 total rows, 13 columns). Each department (e.g. FOODS_1) is tracked daily.
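The panel shape quoted above can be checked before fitting. The sketch below runs on a small made-up frame that only borrows the `agg_id` and `ds` column names. The helper `panel_shape` is hypothetical and not part of DATFID.

```python
import pandas as pd


def panel_shape(df: pd.DataFrame, id_col: str = "agg_id", time_col: str = "ds") -> dict:
    """Summarise a long-format panel: groups, periods per group, rows, balance."""
    periods = df.groupby(id_col)[time_col].nunique()
    return {
        "n_groups": int(periods.size),
        "n_periods": int(periods.max()),
        "n_rows": int(len(df)),
        # Balanced here means: every group has the same number of distinct
        # periods, and there are no duplicate (id, time) rows.
        "balanced": bool(periods.nunique() == 1 and len(df) == periods.sum()),
    }


# Toy stand-in for the real file: 2 departments x 3 days (Sales values are made up).
toy = pd.DataFrame({
    "agg_id": ["FOODS_1"] * 3 + ["HOBBIES_1"] * 3,
    "ds": [40572, 40573, 40574] * 2,
    "Sales": [10, 12, 9, 4, 5, 3],
})
summary = panel_shape(toy)
print(summary)
```

If the description above holds, running the same helper on the full fit file should report 7 groups, 1,941 periods, and 13,587 rows (7 × 1,941).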
Fit Data (first 5 rows)
| agg_id | ds | CPI | Sporting | Cultural | National | Religious | snap_CA | snap_TX | snap_WI | wday | month | Sales |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| FOODS_1 | 40572 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 2343 |
| FOODS_1 | 40573 | 1.04 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 1 | 2216 |
| FOODS_1 | 40574 | 1.053 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 1 | 1657 |
| FOODS_1 | 40575 | 1.066 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 4 | 2 | 1508 |
| FOODS_1 | 40576 | 1.075 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 5 | 2 | 1209 |
Forecast Data (first 3 rows)
| agg_id | ds | CPI | Sporting | Cultural | National | Religious | snap_CA | snap_TX | snap_WI | wday | month | Sales |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| FOODS_1 | 42513 | 1.201 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 5 | 3246 |
| FOODS_1 | 42514 | 1.201 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 5 | 3270 |
| FOODS_1 | 42515 | 1.201 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 5 | 5 | 3274 |
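The `ds` values in both tables look like Excel day serials rather than calendar dates. The source does not say so, so the sketch below assumes Excel's 1900 date system. It converts the first `ds` value of each table and checks how far apart they are; `excel_serial_to_date` is a hypothetical helper name.

```python
from datetime import date, timedelta

# Excel's 1900 date system counts a non-existent 1900-02-29, so for modern
# dates the effective day-zero origin is 1899-12-30.
EXCEL_ORIGIN = date(1899, 12, 30)


def excel_serial_to_date(serial: int) -> date:
    """Convert an Excel day serial (1900 date system) to a calendar date."""
    return EXCEL_ORIGIN + timedelta(days=serial)


first_fit = excel_serial_to_date(40572)       # first row of the fit table
first_forecast = excel_serial_to_date(42513)  # first row of the forecast table

print(first_fit, first_fit.strftime("%A"))
print(first_forecast, first_forecast.strftime("%A"))
# Gap in days between the first fit row and the first forecast row.
print((first_forecast - first_fit).days)
```

Under that assumption the fit data starts on Saturday, 29 January 2011 (where `wday` is 1), and the forecast rows start 1,941 days later on Monday, 23 May 2016, directly after the fit window. In pandas the same conversion can be written as `pd.to_datetime(df["ds"], unit="D", origin="1899-12-30")`.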
Code Walkthrough
Step 1: Initialize
import pandas as pd
from datfid import DATFIDClient
# Create the client with your DATFID API token.
client = DATFIDClient(token="your_DATFID_token")
Step 2: Fit the Model
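# Load the fit data (pandas needs the openpyxl package to read .xlsx files).
url_fit = "https://raw.githubusercontent.com/datfid-valeriidashuk/sample-datasets/main/M5_Department.xlsx"
df = pd.read_excel(url_fit)
# Fit on the department panel: agg_id identifies each series, ds is the time index, Sales is the target.
result = client.fit_model(
    df=df,
    id_col="agg_id",
    time_col="ds",
    y="Sales",
    current_features="all",
    filter_by_significance=True
)
Step 3: Forecast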
# Load the forecast-period rows and request forecasts for them.
url_forecast = "https://raw.githubusercontent.com/datfid-valeriidashuk/sample-datasets/main/M5_Department_forecast.xlsx"
df_forecast = pd.read_excel(url_forecast)
forecast = client.forecast_model(df_forecast=df_forecast)
Analysis Results (Model Fit)
Formula
Sales ~ α1*Intercept + β1*CPI + β2*Sporting + β3*Cultural + β4*National + β5*Religious + β6*snap_CA + β7*snap_TX + β8*snap_WI + β9*wday + β10*month
Alpha Estimates (Time-Invariant)
| Variable | Estimate | T-stat | Interpretation |
|---|---|---|---|
| Intercept | -24,014.8 | 518.5 | Baseline department-level sales when every driver sits at zero. |
Beta Estimates (Time-Varying)
| Variable | Estimate | T-stat | Interpretation |
|---|---|---|---|
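| CPI | +25,321.8 | 45.2 | A 1-unit CPI increase lifts sales by ~25,321.8 units. CPI only moves between about 1.0 and 1.2 in the sample rows, so a 0.01 increase corresponds to ~253 units: inflation-linked retail pricing flowing through to revenue. |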
| Sporting | +51.1 | 0.4 | Sporting-event days are not statistically distinguishable from non-event days in this run. |
| Cultural | -205.3 | 2.3 | Cultural holidays subtract ~205.3 units from sales (stores reduce hours or shoppers stay home). |
| National | -758.6 | 9.7 | National holidays subtract ~758.6 units from sales (closures and reduced traffic). |
| Religious | -97.6 | 1.3 | Religious holidays move sales slightly down, but the effect is not statistically significant. |
| snap_CA | +212.0 | 7.1 | SNAP days in California add ~212.0 units of sales over a typical day. |
| snap_TX | +256.7 | 7.8 | SNAP days in Texas add ~256.7 units of sales over a typical day. |
| snap_WI | +242.7 | 7.4 | SNAP days in Wisconsin add ~242.7 units of sales over a typical day. |
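| wday | -242.3 | 38.6 | Each one-step increase in the wday index subtracts ~242.3 units. The index starts at 1 on the first fit row, which falls on a Saturday if ds is read as an Excel date serial, so weekend days sit at the low end of the index and carry the highest predicted sales. |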
| month | -16.4 | 4.5 | Each later month of the year shaves ~16.4 units off sales — a mild seasonal drift baked into the average. |
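To see how the estimates combine, the formula can be evaluated by hand for one row. This is a minimal sketch that treats the Alpha and Beta tables as a single pooled linear equation, exactly as the formula is written. Whether DATFID applies further department-specific adjustments is not visible in the tables, so the result is an illustration, not the library's forecast. The names `INTERCEPT`, `BETAS`, and `linear_prediction` are hypothetical.

```python
# Point estimates copied from the Alpha and Beta tables above.
INTERCEPT = -24014.8
BETAS = {
    "CPI": 25321.8, "Sporting": 51.1, "Cultural": -205.3, "National": -758.6,
    "Religious": -97.6, "snap_CA": 212.0, "snap_TX": 256.7, "snap_WI": 242.7,
    "wday": -242.3, "month": -16.4,
}


def linear_prediction(row: dict) -> float:
    """Evaluate intercept + sum(beta_k * x_k); features absent from row count as 0."""
    return INTERCEPT + sum(beta * row.get(name, 0.0) for name, beta in BETAS.items())


# Feature values from the first row of the forecast table
# (all event and SNAP flags are 0 there, so they are omitted).
row = {"CPI": 1.201, "wday": 3, "month": 5}
prediction = linear_prediction(row)
print(round(prediction, 1))  # 5587.8
```

The arithmetic is -24,014.8 + 25,321.8 × 1.201 - 242.3 × 3 - 16.4 × 5 ≈ 5,587.8. Because the intercept and the CPI term are both large and of opposite sign, small changes in CPI move the pooled prediction far more than any holiday or SNAP flag.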
Model Performance
The M5 dataset has high variance across departments (FOODS vs HOBBIES vs HOUSEHOLD), which explains the lower between-group R². Adding lagged features or filtering by significance can improve results.
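One way to add a lagged feature is to shift the target within each department before fitting. The sketch below uses plain pandas on a toy frame; `add_lag` is a hypothetical helper, and the HOBBIES_1 values are made up. Whether DATFID picks up the extra column through `current_features="all"` is an assumption, and the forecast frame would need the same column.

```python
import pandas as pd


def add_lag(df: pd.DataFrame, col: str, lag: int,
            id_col: str = "agg_id", time_col: str = "ds") -> pd.DataFrame:
    """Return a sorted copy with `<col>_lag<lag>`: each group's value `lag` rows earlier."""
    out = df.sort_values([id_col, time_col]).reset_index(drop=True)
    # Shift within each department so lags never leak across group boundaries.
    out[f"{col}_lag{lag}"] = out.groupby(id_col)[col].shift(lag)
    return out


# Toy frame with the same key columns as the M5 Department file.
toy = pd.DataFrame({
    "agg_id": ["FOODS_1", "FOODS_1", "FOODS_1", "HOBBIES_1", "HOBBIES_1", "HOBBIES_1"],
    "ds": [40572, 40573, 40574, 40572, 40573, 40574],
    "Sales": [2343, 2216, 1657, 500, 480, 450],
})
lagged = add_lag(toy, "Sales", lag=1)
# The first row of each department has no history, so its lag is NaN; drop those rows.
lagged = lagged.dropna(subset=["Sales_lag1"]).reset_index(drop=True)
print(lagged)
```

The shift counts rows, not calendar days, so it assumes each department has one row per day with no gaps. A lag of 7 would then line each day up with the same weekday one week earlier.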
Try it yourself: Select "M5 Department" in the Free Playground.