PredictiveModelAgent

The PredictiveModelAgent is a specialized Crowe agent that focuses on time-series forecasting, trend detection, and predictive analytics. It is designed for financial markets, demand planning, and operational forecasting, leveraging both statistical models and machine learning techniques.

Overview

The PredictiveModelAgent operates through three main phases:

Data Ingestion — collects and preprocesses time-series data from various sources.
Model Execution — runs forecasting models such as ARIMA, Prophet, or LSTM.
Result Analysis — interprets predictions, detects anomalies, and generates insights.

This makes it useful wherever decisions depend on expected future values, such as demand planning or operational forecasting.

Initialization

from crowe.agents import PredictiveModelAgent

agent = PredictiveModelAgent(
    agent_name="predictive-model-agent",
    model_name="prophet",
    forecast_horizon=30,
    anomaly_detection=True
)

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| agent_name | str | "predictive-model-agent" | Name of the agent |
| model_name | str | "prophet" | Forecasting model to use |
| forecast_horizon | int | 30 | Number of future periods to predict |
| anomaly_detection | bool | False | Whether to run anomaly detection on predictions |
| data_source | str | None | Optional source identifier for data ingestion |
| max_loops | int | 1 | Max iterations for model execution |

Methods

`load_data`

Purpose: Load and normalize time-series data from a given source (CSV, DB, API, or in-memory object).

Signature

data = agent.load_data(
    source: "str | dict | pd.DataFrame",
    *,
    timestamp_col: str = "ds",
    value_col: str = "y",
    freq: str | None = None,
    tz: str | None = None,
    parse_dates: bool = True,
    resample: str | None = None,
    fill_na: float | str | None = "ffill"
) -> pd.DataFrame

Parameters

source: Path/URL, connector config dict (e.g., SQL query), or a preloaded DataFrame.
timestamp_col / value_col: Column names to map to (ds, y).
freq: Expected frequency (e.g., "D", "H"). If None, infer.
tz: Force a timezone for timestamps.
parse_dates: Parse timestamp column to datetime.
resample: Optional resampling frequency (e.g., "D").
fill_na: Strategy ("ffill", "bfill", numeric), or None to leave as-is.

Returns: A clean DataFrame with columns ds (datetime, sorted, unique) and y (float).

Notes & Edge Cases

Deduplicates on ds (keeps last).
If gaps are present and resample is provided, fills missing timestamps.
Raises ValueError if timestamp_col/value_col missing.

Example

df = agent.load_data("prices.csv", timestamp_col="date", value_col="close", resample="D")
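
The cleaning rules above (deduplicate on ds keeping the last row, sort, fill gaps on a regular grid with forward-fill) can be illustrated without the agent. This is a minimal sketch in plain Python, not the library's implementation: `normalize_series` is a hypothetical helper, and it assumes every timestamp already falls on the daily grid.

```python
from datetime import date, timedelta

def normalize_series(rows, step=timedelta(days=1)):
    """Deduplicate on timestamp (keep last), sort, and forward-fill gaps.

    `rows` is an iterable of (timestamp, value) pairs. Hypothetical helper:
    it assumes all timestamps lie on a grid with spacing `step`.
    """
    latest = {}
    for ts, value in rows:
        latest[ts] = float(value)  # later rows overwrite earlier duplicates
    if not latest:
        return []
    ordered = sorted(latest)
    filled = []
    current = ordered[0]
    last_value = latest[current]
    while current <= ordered[-1]:
        if current in latest:
            last_value = latest[current]
        filled.append((current, last_value))  # missing steps reuse last_value
        current += step
    return filled

raw = [
    (date(2024, 1, 3), 12.0),
    (date(2024, 1, 1), 10.0),
    (date(2024, 1, 1), 10.5),  # duplicate timestamp: this later row is kept
    (date(2024, 1, 4), 13.0),
]
clean = normalize_series(raw)
```

Here January 2 is absent from the input, so it is created and carries the January 1 value forward. Long gaps filled this way can hide outages, which is why they are worth auditing.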

`train_model`

Purpose: Fit the configured forecasting model on historical data.

Signature

agent.train_model(
    data: "pd.DataFrame",
    *,
    seasonal: str | None = None,    # e.g., "daily", "weekly", "yearly"
    regressors: list[str] | None = None,
    cross_validate: bool = False,
    cv_folds: int = 3,
    cv_horizon: int | None = None,
    metrics: Sequence[str] = ("mae", "rmse", "mape")
) -> None

Parameters

data: Must include ds, y; optional exogenous columns (regressors).
seasonal: Seasonality to model (e.g., "daily", "weekly", "yearly"); how it is applied depends on the model.
regressors: Additional features to include.
cross_validate: If True, performs rolling backtests.
cv_folds: Number of folds for CV.
cv_horizon: Steps ahead per fold (default derived from agent config).
metrics: Metrics to compute during CV.

Behavior

Stores fitted model + training diagnostics on the agent.
If CV is enabled, stores per-fold metrics and the data needed for error plots.

Example

agent.train_model(df, regressors=["volume"], cross_validate=True, cv_folds=5, cv_horizon=14)
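
To make the rolling backtest concrete, the sketch below shows one common way to lay out folds (an expanding training window with fixed-length test blocks) and to compute MAE, RMSE, and MAPE per fold. It is an illustration under assumptions, not the agent's internals: `rolling_origin_folds` and `score` are hypothetical names, and a naive "repeat the last value" forecast stands in for the real model.

```python
import math

def rolling_origin_folds(n_obs, cv_folds=3, cv_horizon=14):
    """Return (train_end, test_end) index pairs for expanding-window backtests."""
    first_train_end = n_obs - cv_folds * cv_horizon
    if first_train_end < 1:
        raise ValueError("not enough observations for the requested folds")
    return [
        (first_train_end + k * cv_horizon, first_train_end + (k + 1) * cv_horizon)
        for k in range(cv_folds)
    ]

def score(actual, predicted):
    """Compute MAE, RMSE, and MAPE (in percent) for one fold."""
    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)
    return {
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
        # MAPE is undefined when an actual value is zero; this sketch assumes none are.
        "mape": 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n,
    }

series = [float(v) for v in range(1, 61)]  # toy upward trend, 60 observations
folds = rolling_origin_folds(len(series), cv_folds=3, cv_horizon=10)

fold_metrics = []
for train_end, test_end in folds:
    train, test = series[:train_end], series[train_end:test_end]
    naive = [train[-1]] * len(test)  # stand-in model: repeat the last training value
    fold_metrics.append(score(test, naive))
```

Each fold trains only on data that precedes its test block, so the metrics reflect out-of-sample error rather than fit quality.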

`forecast`

Purpose: Generate forward predictions for the requested horizon.

Signature

predictions = agent.forecast(
    horizon: int | None = None,
    *,
    include_components: bool = True,
    include_intervals: bool = True,
    alpha: float = 0.1,
    future_exogenous: "pd.DataFrame | None" = None
) -> "pd.DataFrame"

Parameters

horizon: Steps ahead; defaults to agent’s forecast_horizon.
include_components: Return trend/seasonality/regressor contributions if available.
include_intervals: Return prediction intervals.
alpha: Significance level for the prediction intervals; the interval covers 1 - alpha (e.g., 0.1 → 90% interval).
future_exogenous: Future regressor values aligned on ds.

Returns: A DataFrame with at least:

ds, yhat (point forecast)
Optional: yhat_lower, yhat_upper, trend, season_*, regressor_*

Notes

Validates the horizon: it must be ≥ 1.
If future_exogenous provided, columns must match trained regressors.

Example

pred = agent.forecast(30, include_components=True, alpha=0.2)
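
The relationship between alpha and interval coverage can be shown with a small calculation. The sketch below assumes Gaussian residuals with constant variance, which is a simplification: the configured model may compute its intervals differently (for example by simulation). `prediction_interval` is a hypothetical helper, not part of the agent.

```python
from statistics import NormalDist, stdev

def prediction_interval(yhat, residuals, alpha=0.1):
    """Return (lower, upper) for a two-sided (1 - alpha) interval around yhat."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be strictly between 0 and 1")
    # Two-sided interval: alpha / 2 of the probability mass sits in each tail.
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    sigma = stdev(residuals)  # sample standard deviation of past errors
    return yhat - z * sigma, yhat + z * sigma

residuals = [-2.0, -1.0, 0.0, 1.0, 2.0]
lower_90, upper_90 = prediction_interval(100.0, residuals, alpha=0.1)
lower_80, upper_80 = prediction_interval(100.0, residuals, alpha=0.2)
```

A smaller alpha gives a wider interval: the 90% band (alpha=0.1) fully contains the 80% band (alpha=0.2).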

`detect_anomalies`

Purpose: Flag unusual values in either historical or forecast series.

Signature

anomalies = agent.detect_anomalies(
    predictions: "pd.DataFrame",
    *,
    method: str = "residual_zscore",  # "residual_zscore" | "iqr" | "prophet"
    z_thresh: float = 3.0,
    iqr_coeff: float = 1.5,
    use_confidence: bool = True
) -> list[dict]

Parameters

predictions: DataFrame including ds, yhat, optionally y (actuals), and intervals.
method:
- "residual_zscore" → z-score on residuals (y - yhat)
- "iqr" → interquartile range on residuals
- "prophet" → outliers where y falls outside interval (if available)
z_thresh / iqr_coeff: Sensitivity controls.
use_confidence: If intervals exist, leverage them to reduce false positives.

Returns: A list of dicts like:

[{"ds": "...", "y": float, "yhat": float, "residual": float, "reason": "z>3"}]

Example

anoms = agent.detect_anomalies(pred, method="prophet")
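
The "residual_zscore" and "iqr" methods both operate on residuals (y - yhat). The sketch below shows the two rules side by side in plain Python. It is an assumption about how such checks are typically written, not the agent's code; `zscore_anomalies` and `iqr_anomalies` are hypothetical names, and rows are plain dicts rather than DataFrame rows.

```python
from statistics import mean, pstdev, quantiles

def zscore_anomalies(rows, z_thresh=3.0):
    """Flag rows whose residual is more than z_thresh standard deviations from the mean."""
    residuals = [r["y"] - r["yhat"] for r in rows]
    mu, sigma = mean(residuals), pstdev(residuals)
    if sigma == 0:
        return []  # all residuals identical: nothing stands out
    return [
        {**row, "residual": res, "reason": f"|z|>{z_thresh}"}
        for row, res in zip(rows, residuals)
        if abs((res - mu) / sigma) > z_thresh
    ]

def iqr_anomalies(rows, iqr_coeff=1.5):
    """Flag rows whose residual falls outside [Q1 - c*IQR, Q3 + c*IQR]."""
    residuals = [r["y"] - r["yhat"] for r in rows]
    q1, _, q3 = quantiles(residuals, n=4)
    spread = q3 - q1
    low, high = q1 - iqr_coeff * spread, q3 + iqr_coeff * spread
    return [
        {**row, "residual": res, "reason": "outside IQR fence"}
        for row, res in zip(rows, residuals)
        if res < low or res > high
    ]

# Toy data: small cyclic noise around a flat forecast, with one spike on day 13.
noise = [-0.2, -0.1, 0.0, 0.1, 0.2]
rows = [
    {"ds": f"2024-01-{i + 1:02d}", "y": 100.0 + noise[i % 5], "yhat": 100.0}
    for i in range(20)
]
rows[12]["y"] = 105.0

z_hits = zscore_anomalies(rows)
iqr_hits = iqr_anomalies(rows)
```

The z-score rule is sensitive to the outlier itself, because a large spike inflates the standard deviation it is measured against. The IQR rule is more robust when the series contains several extreme points.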

`run`

Purpose: End-to-end pipeline: load → train → forecast → (optional) anomalies → assemble report.

Signature

result = agent.run(
    task: str | dict,
    *,
    detect_outliers: bool = True,
    return_artifacts: bool = False
) -> dict

Task Formats

String: natural language instruction (e.g., "Forecast next 14 days for BTC price from prices.csv").
Dict example:

{
  "source": "prices.csv",
  "timestamp_col": "date",
  "value_col": "close",
  "horizon": 14,
  "regressors": ["volume"],
  "anomaly_detection": True
}

Returns: A structured dict:

{
  "model": {...},           # model metadata (type, params, train range)
  "data_summary": {...},    # counts, freq, gaps
  "forecast": pd.DataFrame, # or serialized if configured
  "anomalies": list,        # optional
  "metrics": {...},         # CV/backtest metrics if computed
  "notes": list[str]        # warnings, assumptions
}

Example

result = agent.run({
    "source": "prices.csv",
    "timestamp_col": "date",
    "value_col": "close",
    "horizon": 21,
    "regressors": ["volume"],
    "anomaly_detection": True
})
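
Before passing a dict task to run, it can help to validate it and fill in defaults on the caller's side. The sketch below is one way to do that. `normalize_task` is a hypothetical helper, and the defaults it applies are assumptions taken from the documented signatures ("ds", "y", a horizon of 30, anomaly detection off), not confirmed behavior of run.

```python
# Assumed defaults, mirroring the documented method signatures.
TASK_DEFAULTS = {
    "timestamp_col": "ds",
    "value_col": "y",
    "horizon": 30,
    "regressors": (),
    "anomaly_detection": False,
}
REQUIRED_KEYS = ("source",)

def normalize_task(task):
    """Validate a dict task and return a copy with defaults filled in."""
    missing = [k for k in REQUIRED_KEYS if k not in task]
    if missing:
        raise ValueError(f"task is missing required keys: {missing}")
    unknown = sorted(set(task) - set(TASK_DEFAULTS) - set(REQUIRED_KEYS))
    if unknown:
        raise ValueError(f"task has unknown keys: {unknown}")
    merged = {**TASK_DEFAULTS, **task}
    merged["regressors"] = list(merged["regressors"])  # always hand back a fresh list
    if not isinstance(merged["horizon"], int) or merged["horizon"] < 1:
        raise ValueError("horizon must be an integer >= 1")
    return merged

task = normalize_task({
    "source": "prices.csv",
    "timestamp_col": "date",
    "value_col": "close",
    "horizon": 21,
    "regressors": ["volume"],
})
```

Rejecting unknown keys early catches typos such as "horizen" that would otherwise be silently ignored.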

Additional Tips

Feature drift: Periodically retrain if data distribution shifts.
Missing data: Prefer resampling + forward-fill, but audit long gaps.
Exogenous inputs: Validate alignment and scale consistently.
Metrics you care about: For business ops, monitor MAPE/MASE, not just RMSE.
Reproducibility: Persist model configs, seed, training window, and version.
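
MASE, mentioned in the metrics tip, scales the forecast's mean absolute error by the in-sample error of a naive forecast, so values below 1 mean the model beats the naive baseline. The sketch below follows that standard definition. The `mase` function is a hypothetical helper, not something the agent is documented to expose.

```python
def mase(train, actual, predicted, season=1):
    """Mean absolute scaled error relative to an in-sample (seasonal) naive forecast."""
    if len(train) <= season:
        raise ValueError("training series is too short for the chosen season")
    # In-sample naive errors: each point is predicted by the value `season` steps earlier.
    naive_errors = [abs(train[i] - train[i - season]) for i in range(season, len(train))]
    scale = sum(naive_errors) / len(naive_errors)
    if scale == 0:
        raise ValueError("training series is constant; MASE is undefined")
    forecast_mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
    return forecast_mae / scale

train = [10.0, 12.0, 11.0, 13.0, 12.0]  # naive one-step errors: 2, 1, 2, 1
score = mase(train, actual=[14.0, 13.0], predicted=[13.0, 13.0])
```

Unlike MAPE, MASE stays well defined when actual values are zero or close to zero, which matters for intermittent demand series.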

Example Usage

from crowe.agents import PredictiveModelAgent

agent = PredictiveModelAgent(
    model_name="lstm",
    forecast_horizon=14,
    anomaly_detection=True
)

result = agent.run("Forecast next two weeks' cryptocurrency prices.")
print(result)

Memory System

The PredictiveModelAgent includes a DataCacheMemory system that stores recent datasets and prediction results for quick re-analysis.

Memory Features

Stores processed datasets for reuse
Caches model parameters to avoid retraining when unnecessary
Supports retrieval of past prediction runs for comparison
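
The internals of DataCacheMemory are not described here, so the following is only a sketch of the general idea: key each run by a stable hash of its configuration and keep a bounded number of recent results. `RunCache` and `config_key` are hypothetical names and do not reflect the real class.

```python
import hashlib
import json
from collections import OrderedDict

def config_key(config):
    """Build a stable key from a config dict, independent of key order."""
    payload = json.dumps(config, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

class RunCache:
    """Small least-recently-used cache of results, keyed by run configuration."""

    def __init__(self, max_entries=8):
        self._entries = OrderedDict()
        self._max_entries = max_entries

    def put(self, config, result):
        key = config_key(config)
        self._entries[key] = result
        self._entries.move_to_end(key)  # mark as most recently used
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict the least recently used

    def get(self, config):
        key = config_key(config)
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)
        return self._entries[key]

cache = RunCache(max_entries=2)
cache.put({"model_name": "prophet", "horizon": 14}, "run-a")
cache.put({"model_name": "lstm", "horizon": 14}, "run-b")
cache.get({"horizon": 14, "model_name": "prophet"})  # touch run-a; key order is irrelevant
cache.put({"model_name": "arima", "horizon": 14}, "run-c")  # evicts run-b
```

Hashing the full configuration means a change to any parameter, such as the horizon, produces a different key, so stale results are never returned for a different setup.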

Best Practices

Use high-quality, clean data for better forecasting accuracy
Select the appropriate model based on data seasonality and trends
Enable anomaly detection for early warning systems
Monitor performance periodically and update models with new data
