Python SDK
The official DATFID Python package for interpretable AI forecasting. Available on PyPI.
Installation
pip install datfid
Initialize the Client
import pandas as pd
from datfid import DATFIDClient
client = DATFIDClient(token="your_DATFID_token")
Need a token? See Get Your API Key.
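To keep the token out of source code, one option is to read it from an environment variable. This is a minimal sketch: the helper `load_token` and the variable name `DATFID_TOKEN` are assumptions made for this example, not part of the `datfid` package.

```python
import os


def load_token(env_var="DATFID_TOKEN"):
    """Read the API token from an environment variable (the variable name is an assumption)."""
    token = os.environ.get(env_var)
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable to your DATFID token")
    return token


# Usage with the client shown above (requires the datfid package):
# client = DATFIDClient(token=load_token())
```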
Supported File Formats
DATFID accepts Excel (.xlsx, .xls) and CSV (.csv) for both model fitting and forecasting inputs.
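A minimal sketch of loading either format into a DataFrame before calling the SDK. The helper name `load_table` is hypothetical and not part of the `datfid` package; it only dispatches on the file extension using the standard pandas readers.

```python
from pathlib import Path

import pandas as pd


def load_table(path):
    """Read an Excel or CSV file into a DataFrame (hypothetical helper, not part of datfid)."""
    suffix = Path(path).suffix.lower()
    if suffix in {".xlsx", ".xls"}:
        # Reading Excel requires an engine such as openpyxl (.xlsx) to be installed.
        return pd.read_excel(path)
    if suffix == ".csv":
        return pd.read_csv(path)
    raise ValueError(f"Unsupported file format: {suffix!r}")
```

Any other extension raises a ValueError, so an unsupported file is caught before any request is made.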
fit_model()
Trains an interpretable panel data model on your dataset and returns a formula with coefficients.
result = client.fit_model(
    df=df,
    id_col="Product",
    time_col="Time",
    y="Revenue",
    current_features="all",
    filter_by_significance=True,
    lag_y="1:3",
    lagged_features={"Income Level": "1:3"}
)
Parameters
| Parameter | Required | Description |
|---|---|---|
| df | Required | Pandas DataFrame with your historical panel data |
| id_col | Required | Column name identifying each entity (product, customer, store, etc.) |
| time_col | Required | Column name for the time dimension |
| y | Required | Column name of the target variable to predict |
| current_features | Optional | Which features to include. Use "all" for all columns, a list of column names for a subset, or omit to use only the mandatory columns (id, time, target). |
| filter_by_significance | Optional | When True, DATFID automatically removes statistically insignificant features, keeping only variables with a meaningful relationship to the target. This simplifies the model and reduces noise. Recommended for most use cases. |
| lag_y | Optional | Include lagged values of the target variable as features. For example, "1:3" uses the target's values from 1, 2, and 3 periods ago. Useful for autoregressive patterns. |
| lagged_features | Optional | Include lagged values of specific features. Pass a dict mapping feature names to lag ranges, e.g. {"Income Level": "1:3"}. Useful when the effect of a feature is delayed (e.g. marketing spend affects sales with a lag). |
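To make the lag notation concrete, here is a rough pandas sketch of what a range such as "1:3" expands to. It is an illustration only, not DATFID's internal implementation; the helpers `parse_lags` and `add_lags` and the `_lag{k}` column naming are assumptions made for this example, and the sketch assumes one row per entity and period with no gaps.

```python
import pandas as pd


def parse_lags(spec):
    """Expand a range such as "1:3" into [1, 2, 3] (both ends inclusive)."""
    start, end = (int(part) for part in spec.split(":"))
    return list(range(start, end + 1))


def add_lags(df, id_col, time_col, column, spec):
    """Return a sorted copy of df with lagged versions of `column`, computed per entity."""
    out = df.sort_values([id_col, time_col]).copy()
    for k in parse_lags(spec):
        # shift(k) within each entity gives the value from k rows (periods) earlier
        out[f"{column}_lag{k}"] = out.groupby(id_col)[column].shift(k)
    return out


panel = pd.DataFrame({
    "Product": ["A"] * 4 + ["B"] * 4,
    "Time": [1, 2, 3, 4] * 2,
    "Revenue": [10, 12, 15, 11, 20, 21, 19, 25],
})
lagged = add_lags(panel, "Product", "Time", "Revenue", "1:3")
print(lagged)
```

In this sketch the first k rows of each entity have no value for lag k and come out as NaN, so long lag ranges on short panels leave few complete rows.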
Return Value — result
fit_model() returns a result object. In Python or Google Colab, typing result. shows all available attributes via autocomplete. Here is what each one contains:
| Attribute | Type | Description |
|---|---|---|
| result.formula | str | The fitted model formula as a human-readable string, e.g. Revenue = α + 0.42·Price + … |
| result.alpha | DataFrame | Entity-level intercepts (α coefficients). One row per entity, capturing the inherent, time-invariant baseline of each entity. |
| result.beta | DataFrame | Feature coefficients (β). Each row is a predictor with its estimated effect, standard error, t-statistic, and p-value. |
| result.Performance | dict | Overall model quality metrics: R², MAE, MSE, and RMSE computed on the training set. |
| result.R2_individual | DataFrame | Per-entity R² values. Shows how well the model fits each individual entity in the panel. |
| result.df | DataFrame | The processed training DataFrame used for fitting, after column selection and lag construction. |
| result.ID / result.Id | str | The name of the entity ID column used during fitting. |
| result.headers_alpha | list[str] | Column headers for the alpha table (entity ID + coefficient column names). |
| result.headers_beta | list[str] | Column headers for the beta table (variable name, estimate, SE, t-stat, p-value). |
| result.dropped_cols | list[str] | Features that were dropped (e.g. due to filter_by_significance=True or multicollinearity). |
| result.errors | list[str] | Any non-fatal warnings or messages generated during fitting (e.g. near-singular columns). |
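For readers unfamiliar with the metrics in result.Performance, the sketch below computes R², MAE, MSE, and RMSE from actual and predicted values using their standard textbook definitions. The function name `regression_metrics` is hypothetical and the input numbers are made up; how DATFID computes these values internally is not described on this page.

```python
import math


def regression_metrics(actual, predicted):
    """Standard regression metrics for two equal-length sequences of numbers."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    ss_res = sum(e * e for e in errors)
    # R² = 1 - SS_res / SS_tot; raises ZeroDivisionError if all actual values are identical
    r2 = 1 - ss_res / ss_tot
    return {"R2": r2, "MAE": mae, "MSE": mse, "RMSE": math.sqrt(mse)}


print(regression_metrics([10, 12, 15, 11], [11, 12, 14, 11]))
```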
print(result.formula)
# e.g. Revenue = α + 0.42·Price + 1.8·Promo + …
print(result.Performance)
# {'R2': 0.91, 'MAE': 142.3, 'MSE': 38201.0, 'RMSE': 195.4}
print(result.alpha.head())
# entity-level intercepts
print(result.dropped_cols)
# ['Inflation Rate'] -- removed as insignificant
Individual Entity Access — result['entity_name'].attribute
You can drill into any single entity by indexing result with the entity's name and chaining the attribute directly. All the same attributes are available, but scoped to that one entity only:
| Attribute | Type | Description |
|---|---|---|
| result['ind1'].formula | str | The model formula with this entity's specific α substituted in. |
| result['ind1'].alpha | float / Series | This entity's intercept (α), its time-invariant baseline value. |
| result['ind1'].beta | DataFrame | Feature coefficients, identical to the global result.beta (coefficients are shared across entities in a panel model). |
| result['ind1'].Performance | dict | Model quality metrics (R², MAE, MSE, RMSE) computed only on this entity's rows. |
| result['ind1'].R2_individual | float | The R² score for this specific entity. |
| result['ind1'].df | DataFrame | The training data rows belonging to this entity only. |
| result['ind1'].dropped_cols | list[str] | Features removed during fitting (same as global result.dropped_cols). |
| result['ind1'].errors | list[str] | Any warnings specific to this entity's data (e.g. too few observations). |
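The relationship between the shared β coefficients and the per-entity α can be sketched in plain Python. The numbers come from the illustrative formula on this page and cover only the two terms it shows (the "…" terms are left out), so treat the result as a toy value rather than real model output. The `predict` helper and the second entity "ind2" are hypothetical.

```python
def predict(alpha, betas, features):
    """Entity intercept plus shared coefficients times feature values."""
    return alpha + sum(coef * features[name] for name, coef in betas.items())


# Shared across all entities (illustrative values from the example formula)
betas = {"Price": 0.42, "Promo": 1.8}
# Entity-specific intercepts; "ind2" is made up for contrast
alphas = {"ind1": 18450.2, "ind2": 9120.0}

row = {"Price": 100.0, "Promo": 1.0}
for entity, alpha in alphas.items():
    # Same features and same betas; only the baseline differs
    print(entity, round(predict(alpha, betas, row), 2))
```

With identical feature values, the two predictions differ by exactly the gap between the two intercepts, which is what "time-invariant baseline" means in practice.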
print(result["ind1"].formula)
# Revenue = 18,450.2 + 0.42·Price + 1.8·Promo + …
#           ^^^^^^^^ this entity's specific alpha
print(result["ind1"].Performance)
# {'R2': 0.94, 'MAE': 118.7, 'MSE': 29004.0, 'RMSE': 170.3}
print(result["ind1"].alpha)
# 18450.2 -- ind1's baseline, independent of time
print(result["ind1"].df.head())
# rows for ind1 only
forecast_model()
Generates predictions using the previously fitted model. Requires a forecast DataFrame that defines which entities and time periods to predict.
df_forecast = pd.read_excel("forecast_data.xlsx")
forecast = client.forecast_model(df_forecast=df_forecast)
Parameters
| Parameter | Required | Description |
|---|---|---|
| df_forecast | Required | DataFrame with the same entity/time structure as the fit data, covering the periods you want predictions for. Must include feature columns if the model uses them. |
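Before calling forecast_model(), it can help to confirm that the forecast frame carries the columns the fitted model needs. The following is a sketch using only pandas; `missing_columns` is a hypothetical helper, and the column names and values are sample data chosen for illustration.

```python
import pandas as pd


def missing_columns(df_forecast, required):
    """Return the required column names that are absent from df_forecast."""
    return [col for col in required if col not in df_forecast.columns]


df_forecast = pd.DataFrame({
    "Product": ["A", "A", "B", "B"],
    "Time": [5, 6, 5, 6],
    "Price": [9.5, 9.5, 12.0, 11.5],
})

# id and time columns plus whichever features the fitted model kept
required = ["Product", "Time", "Price", "Income Level"]
print(missing_columns(df_forecast, required))
# ['Income Level']
```

An empty list means every required column is present; anything else names the columns to add before forecasting.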
Use Case Examples
See the SDK in action with real datasets, analysis results, and forecast outputs:
- → Revenue Forecasting — Food & Beverages industry
- → Loan Probability — Banking sector with lagged features
- → Energy Electricity — Regional demand forecasting
- → Insurance — Premium pricing & risk
- → M5 Department — Retail department-level sales
- → Payments — Transaction volume forecasting
- → Venture Capital — Investment risk scoring
Want to try without code? The Free Playground exposes all the same parameters (feature selection, lags, filter by significance) through a point-and-click UI — no SDK installation required.