Python SDK

The official DATFID Python package for interpretable AI forecasting. Available on PyPI.

Installation

Terminal
pip install datfid

Initialize the Client

Python
import pandas as pd
from datfid import DATFIDClient

client = DATFIDClient(token="your_DATFID_token")

Need a token? See Get Your API Key.
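
To keep the token out of source code, it can be read from an environment variable instead of being typed into the script. The sketch below is one way to do that; the variable name DATFID_TOKEN and the helper read_token are assumptions made for this example and are not part of the DATFID SDK.

```python
import os


def read_token(env=None, name="DATFID_TOKEN"):
    """Return the API token stored in an environment variable.

    DATFID_TOKEN is a hypothetical variable name chosen for this example.
    """
    env = os.environ if env is None else env
    token = env.get(name, "").strip()
    if not token:
        raise RuntimeError(f"Set the {name} environment variable to your DATFID token")
    return token
```

With the variable set, the client could then be created as DATFIDClient(token=read_token()).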

Supported File Formats

DATFID accepts Excel (.xlsx, .xls) and CSV (.csv) for both model fitting and forecasting inputs.
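
As a sketch of how a file in one of these formats might be turned into the DataFrame the client expects, the helper below picks a pandas reader from the file extension. The names reader_for and load_table are hypothetical and not part of the DATFID SDK; pd.read_csv and pd.read_excel are standard pandas functions.

```python
from pathlib import Path

# Map each supported extension to the name of the pandas reader that handles it.
READERS = {".csv": "read_csv", ".xlsx": "read_excel", ".xls": "read_excel"}


def reader_for(path):
    """Return the pandas reader name for a supported file, else raise ValueError."""
    suffix = Path(path).suffix.lower()
    if suffix not in READERS:
        raise ValueError(f"Unsupported file format: {path}")
    return READERS[suffix]


def load_table(path):
    """Load a supported file into a pandas DataFrame (hypothetical helper)."""
    import pandas as pd  # imported here so reader_for works without pandas

    return getattr(pd, reader_for(path))(path)
```

Note that pandas relies on a separate Excel engine (for example openpyxl for .xlsx files), which may need to be installed alongside it.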

fit_model()

Trains an interpretable panel data model on your dataset and returns a formula with coefficients.

Python
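# df is a pandas DataFrame holding your historical panel data,
# e.g. loaded with pd.read_excel(...) or pd.read_csv(...)
result = client.fit_model(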
    df=df,
    id_col="Product",
    time_col="Time",
    y="Revenue",
    current_features="all",
    filter_by_significance=True,
    lag_y="1:3",
    lagged_features={"Income Level": "1:3"}
)

Parameters

| Parameter | Required | Description |
| --- | --- | --- |
| df | Required | Pandas DataFrame with your historical panel data |
| id_col | Required | Column name identifying each entity (product, customer, store, etc.) |
| time_col | Required | Column name for the time dimension |
| y | Required | Column name of the target variable to predict |
| current_features | Optional | Which features to include. Use "all" for all columns, a list of column names for a subset, or omit to use only the mandatory columns (id, time, target). |
| filter_by_significance | Optional | When True, DATFID automatically removes statistically insignificant features, keeping only variables with a meaningful relationship to the target. Simplifies the model and avoids noise. Recommended for most use cases. |
| lag_y | Optional | Include lagged values of the target variable as features. E.g. "1:3" uses the target's values from 1, 2, and 3 periods ago. Useful for autoregressive patterns. |
| lagged_features | Optional | Include lagged values of specific features. Pass a dict mapping feature names to lag ranges, e.g. {"Income Level": "1:3"}. Useful when the effect of a feature is delayed (e.g. marketing spend affects sales with a lag). |
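
To make the lag notation concrete, the sketch below expands a range string such as "1:3" into individual lags and builds the lagged values for one entity's series. It is an illustration in plain Python, not DATFID's internal code: parse_lag_range, lagged_series, and the Revenue_lag column names are hypothetical, and the range is assumed to be inclusive on both ends, as the description of "1:3" suggests.

```python
def parse_lag_range(spec):
    """Expand a lag range such as "1:3" into [1, 2, 3] (both ends inclusive)."""
    start, end = (int(part) for part in spec.split(":"))
    if start < 1 or end < start:
        raise ValueError(f"Invalid lag range: {spec!r}")
    return list(range(start, end + 1))


def lagged_series(values, lag):
    """Shift a time-ordered series back by `lag` periods, padding with None."""
    n = len(values)
    return [None] * min(lag, n) + list(values[: max(n - lag, 0)])


# Revenue for a single entity, ordered by time (made-up numbers).
revenue = [100, 110, 125, 130, 150]

lags = parse_lag_range("1:3")
lag_columns = {f"Revenue_lag{k}": lagged_series(revenue, k) for k in lags}

for name, column in lag_columns.items():
    print(name, column)
```

Lags only make sense within a single entity, so one product's history should never spill into another's. In pandas the equivalent per-entity shift would be df.groupby("Product")["Revenue"].shift(k). The first k rows of each entity have no lagged value; how DATFID treats those rows is not stated here.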

Return Value — result

fit_model() returns a result object. In an interactive Python session or Google Colab, typing result. shows all available attributes via autocomplete. Each attribute contains the following:

| Attribute | Type | Description |
| --- | --- | --- |
| result.formula | str | The fitted model formula as a human-readable string, e.g. Revenue = α + 0.42·Price + … |
| result.alpha | DataFrame | Entity-level intercepts (α coefficients). One row per entity, capturing the inherent, time-invariant baseline of each entity. |
| result.beta | DataFrame | Feature coefficients (β). Each row is a predictor with its estimated effect, standard error, t-statistic, and p-value. |
| result.Performance | dict | Overall model quality metrics: R², MAE, MSE, and RMSE computed on the training set. |
| result.R2_individual | DataFrame | Per-entity R² values. Shows how well the model fits each individual entity in the panel. |
| result.df | DataFrame | The processed training DataFrame used for fitting, after column selection and lag construction. |
| result.ID / result.Id | str | The name of the entity ID column used during fitting. |
| result.headers_alpha | list[str] | Column headers for the alpha table (entity ID + coefficient column names). |
| result.headers_beta | list[str] | Column headers for the beta table (variable name, estimate, SE, t-stat, p-value). |
| result.dropped_cols | list[str] | Features that were dropped (e.g. due to filter_by_significance=True or multicollinearity). |
| result.errors | list[str] | Any non-fatal warnings or messages generated during fitting (e.g. near-singular columns). |

Python – quick inspection
print(result.formula)
# e.g. Revenue = α + 0.42·Price + 1.8·Promo + …

print(result.Performance)
# {'R2': 0.91, 'MAE': 142.3, 'MSE': 38201.0, 'RMSE': 195.4}

print(result.alpha.head())
# entity-level intercepts

print(result.dropped_cols)
# ['Inflation Rate']  -- removed as insignificant
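
For readers who want to sanity-check the numbers in result.Performance, the sketch below computes R², MAE, MSE, and RMSE from their standard textbook definitions. Whether DATFID uses exactly these formulas is an assumption; the function name regression_metrics and the sample values are made up for illustration.

```python
import math


def regression_metrics(actual, predicted):
    """Return R2, MAE, MSE and RMSE using the standard definitions."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)  # total variation
    ss_res = sum(e * e for e in errors)                   # unexplained variation
    return {"R2": 1 - ss_res / ss_tot, "MAE": mae, "MSE": mse, "RMSE": math.sqrt(mse)}


# Made-up observed and fitted values for a single series.
actual = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 13.0, 16.0]
metrics = regression_metrics(actual, predicted)
print(metrics)
```

RMSE is the square root of MSE, so the two always move together; MAE is less sensitive to a few large errors than either of them.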

Individual Entity Access — result['entity_name'].attribute

You can drill into any single entity by indexing result with the entity's name and chaining the attribute directly. All the same attributes are available, but scoped to that one entity only:

| Attribute | Type | Description |
| --- | --- | --- |
| result['ind1'].formula | str | The model formula with this entity's specific α substituted in. |
| result['ind1'].alpha | float / Series | This entity's intercept (α), its time-invariant baseline value. |
| result['ind1'].beta | DataFrame | Feature coefficients, the same as the global result.beta (coefficients are shared across entities in a panel model). |
| result['ind1'].Performance | dict | Model quality metrics (R², MAE, MSE, RMSE) computed only on this entity's rows. |
| result['ind1'].R2_individual | float | The R² score for this specific entity. |
| result['ind1'].df | DataFrame | The training data rows belonging to this entity only. |
| result['ind1'].dropped_cols | list[str] | Features removed during fitting (same as global result.dropped_cols). |
| result['ind1'].errors | list[str] | Any warnings specific to this entity's data (e.g. too few observations). |

Python – entity drill-down example
print(result["ind1"].formula)
# Revenue = 18,450.2 + 0.42·Price + 1.8·Promo + …
#            ^^^^^^^^ this entity's specific alpha

print(result["ind1"].Performance)
# {'R2': 0.94, 'MAE': 118.7, 'MSE': 29004.0, 'RMSE': 170.3}

print(result["ind1"].alpha)
# 18450.2   -- ind1's baseline, independent of time

print(result["ind1"].df.head())
# rows for ind1 only
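
The entity-level formula can be read as a fixed-effects prediction: the entity's own intercept plus the shared coefficients multiplied by the feature values. The sketch below evaluates such a formula by hand. It uses only the two coefficients visible in the example output above (the remaining terms are elided there, so they are left out), and the feature values and the predict helper are made up for illustration.

```python
def predict(alpha, betas, features):
    """Evaluate alpha + sum(beta_k * x_k) for one entity and one time period."""
    return alpha + sum(coef * features[name] for name, coef in betas.items())


alpha_ind1 = 18450.2                        # entity-specific intercept for "ind1"
betas = {"Price": 0.42, "Promo": 1.8}       # coefficients shared by all entities
features = {"Price": 100.0, "Promo": 1.0}   # made-up inputs for one period

prediction = predict(alpha_ind1, betas, features)
print(round(prediction, 2))
```

Swapping in another entity's α while keeping the same betas gives that entity's prediction, which is what makes the model easy to interpret entity by entity.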

forecast_model()

Generates predictions using the previously fitted model. Requires a forecast DataFrame that defines which entities and time periods to predict.

Python
df_forecast = pd.read_excel("forecast_data.xlsx")

forecast = client.forecast_model(df_forecast=df_forecast)

Parameters

| Parameter | Required | Description |
| --- | --- | --- |
| df_forecast | Required | DataFrame with the same entity/time structure as the fit data, covering the periods you want predictions for. Must include feature columns if the model uses them. |
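
One way to assemble such a forecast input is to pair every entity with each future period. The sketch below builds those rows in plain Python, using the Product and Time column names from the fit_model() example; the entity names, the periods, and the build_forecast_rows helper are hypothetical. Any feature columns the model relies on would still need to be filled in with your own planned or expected values.

```python
from itertools import product


def build_forecast_rows(entities, periods, id_col="Product", time_col="Time"):
    """Return one row (dict) per entity and future period."""
    return [{id_col: e, time_col: t} for e, t in product(entities, periods)]


# Two made-up entities and three made-up future periods.
rows = build_forecast_rows(["ind1", "ind2"], [2025, 2026, 2027])
for row in rows:
    print(row)
```

Passing the list to pd.DataFrame(rows) turns it into a DataFrame that could then be supplied as df_forecast.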

Use Case Examples

See the SDK in action with real datasets, analysis results, and forecast outputs:

Want to try without code? The Free Playground exposes all the same parameters (feature selection, lags, filter by significance) through a point-and-click UI — no SDK installation required.