PredictiveModelAgent

The PredictiveModelAgent is a specialized Crowe agent that focuses on time-series forecasting, trend detection, and predictive analytics. It is designed for financial markets, demand planning, and operational forecasting, leveraging both statistical models and machine learning techniques.


Overview

The PredictiveModelAgent operates through three main phases:

  • Data Ingestion — collects and preprocesses time-series data from various sources.

  • Model Execution — runs forecasting models such as ARIMA, Prophet, or LSTM.

  • Result Analysis — interprets predictions, detects anomalies, and generates insights.

Together, these phases turn raw time-series data into forecasts and anomaly flags that can feed planning and trading decisions.


Initialization

from crowe.agents import PredictiveModelAgent

agent = PredictiveModelAgent(
    agent_name="predictive-model-agent",
    model_name="prophet",
    forecast_horizon=30,
    anomaly_detection=True
)

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| agent_name | str | "predictive-model-agent" | Name of the agent |
| model_name | str | "prophet" | Forecasting model to use |
| forecast_horizon | int | 30 | Number of future periods to predict |
| anomaly_detection | bool | False | Whether to run anomaly detection on predictions |
| data_source | str | None | Optional source identifier for data ingestion |
| max_loops | int | 1 | Max iterations for model execution |


Methods

load_data

Purpose: Load and normalize time-series data from a given source (CSV, DB, API, or in-memory object).

Signature

data = agent.load_data(
    source: "str | dict | pd.DataFrame",
    *,
    timestamp_col: str = "ds",
    value_col: str = "y",
    freq: str | None = None,
    tz: str | None = None,
    parse_dates: bool = True,
    resample: str | None = None,
    fill_na: float | str | None = "ffill"
) -> "pd.DataFrame"

Parameters

  • source: Path/URL, connector config dict (e.g., SQL query), or a preloaded DataFrame.

  • timestamp_col / value_col: Column names to map to (ds, y).

  • freq: Expected frequency (e.g., "D", "H"). If None, the frequency is inferred from the timestamps.

  • tz: Force a timezone for timestamps.

  • parse_dates: Parse timestamp column to datetime.

  • resample: Optional resampling frequency (e.g., "D").

  • fill_na: Strategy ("ffill", "bfill", numeric), or None to leave as-is.

Returns: A clean DataFrame with columns ds (datetime, sorted, unique) and y (float).

Notes & Edge Cases

  • Deduplicates on ds (keeps last).

  • If gaps are present and resample is provided, fills missing timestamps.

  • Raises ValueError if timestamp_col/value_col missing.

Example

df = agent.load_data("prices.csv", timestamp_col="date", value_col="close", resample="D")
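
The normalization rules listed under Notes & Edge Cases can be illustrated without the agent. The following is a minimal sketch using only the Python standard library, assuming daily data and forward-fill; the function name normalize_daily and the sample rows are hypothetical and are not part of the Crowe API, which returns a DataFrame rather than a list of tuples.

```python
from datetime import date, timedelta


def normalize_daily(rows):
    """Deduplicate on timestamp (keep last), sort, and forward-fill daily gaps.

    `rows` is an iterable of (iso_date_string, value) pairs.
    Hypothetical helper for illustration only.
    """
    latest = {}
    for ds, y in rows:
        # Later rows overwrite earlier ones, so duplicates keep the last value.
        latest[date.fromisoformat(ds)] = float(y)
    if not latest:
        return []

    out = []
    day, end = min(latest), max(latest)
    last_value = None
    while day <= end:
        if day in latest:
            last_value = latest[day]
        # Days with no observation reuse the most recent value (forward fill).
        out.append((day, last_value))
        day += timedelta(days=1)
    return out


rows = [
    ("2024-01-03", 12.0),
    ("2024-01-01", 10.0),
    ("2024-01-01", 11.0),  # duplicate timestamp: this later row wins
    ("2024-01-05", 14.0),
]
clean = normalize_daily(rows)
```

Here the duplicate on 2024-01-01 resolves to the later row, and the missing days 2024-01-02 and 2024-01-04 are filled from the previous observation. Long gaps filled this way deserve a manual check before training.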

train_model

Purpose: Fit the configured forecasting model on historical data.

Signature

agent.train_model(
    data: "pd.DataFrame",
    *,
    seasonal: str | None = None,    # e.g., "daily", "weekly", "yearly"
    regressors: list[str] | None = None,
    cross_validate: bool = False,
    cv_folds: int = 3,
    cv_horizon: int | None = None,
    metrics: "Sequence[str]" = ("mae", "rmse", "mape")
) -> None

Parameters

  • data: Must include ds, y; optional exogenous columns (regressors).

  • seasonal: Seasonal prior (model dependent).

  • regressors: Additional features to include.

  • cross_validate: If True, performs rolling backtests.

  • cv_folds: Number of folds for CV.

  • cv_horizon: Steps ahead per fold (default derived from agent config).

  • metrics: Metrics to compute during CV.

Behavior

  • Stores fitted model + training diagnostics on the agent.

  • If CV is enabled, stores per-fold metrics and the data needed for error plots.

Example

agent.train_model(df, regressors=["volume"], cross_validate=True, cv_folds=5, cv_horizon=14)
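
To make the rolling backtest concrete, here is a minimal sketch of rolling-origin evaluation in plain Python. It assumes a naive last-value forecaster in place of the configured model, and the name rolling_backtest is hypothetical; how the agent actually places its fold cutoffs is not specified in this document.

```python
import math


def rolling_backtest(y, folds=3, horizon=2):
    """Rolling-origin backtest with a naive last-value forecaster.

    Each fold trains on y[:cutoff] and scores the next `horizon` points.
    MAPE assumes the actual values are nonzero.
    """
    results = []
    for k in range(folds, 0, -1):
        cutoff = len(y) - k * horizon  # training data ends here (exclusive)
        if cutoff < 1:
            raise ValueError("series too short for the requested folds/horizon")
        train, test = y[:cutoff], y[cutoff:cutoff + horizon]
        preds = [train[-1]] * len(test)  # naive forecast: repeat the last value
        errors = [a - p for a, p in zip(test, preds)]
        results.append({
            "cutoff": cutoff,
            "mae": sum(abs(e) for e in errors) / len(errors),
            "rmse": math.sqrt(sum(e * e for e in errors) / len(errors)),
            "mape": sum(abs(e / a) for e, a in zip(errors, test)) / len(errors),
        })
    return results


series = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0]
fold_metrics = rolling_backtest(series, folds=3, horizon=2)
```

Each fold only sees data before its cutoff, which is what distinguishes a backtest from ordinary shuffled cross-validation on time-series data.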

forecast

Purpose: Generate forward predictions for the requested horizon.

Signature

predictions = agent.forecast(
    horizon: int | None = None,
    *,
    include_components: bool = True,
    include_intervals: bool = True,
    alpha: float = 0.1,
    future_exogenous: "pd.DataFrame | None" = None
) -> "pd.DataFrame"

Parameters

  • horizon: Steps ahead; defaults to agent’s forecast_horizon.

  • include_components: Return trend/seasonality/regressor contributions if available.

  • include_intervals: Return prediction intervals.

  • alpha: Significance level for the prediction intervals; nominal coverage is 1 - alpha (e.g., 0.1 → 90% interval).

  • future_exogenous: Future regressor values aligned on ds.

Returns: A DataFrame with at least:

  • ds, yhat (point forecast)

  • Optional: yhat_lower, yhat_upper, trend, season_*, regressor_*

Notes

  • Validates the horizon: it must be ≥ 1.

  • If future_exogenous provided, columns must match trained regressors.

Example

pred = agent.forecast(30, include_components=True, alpha=0.2)
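
The relationship between alpha and interval coverage can be shown with a small worked example. This sketch assumes normally distributed residuals and a constant interval width across the horizon, which is a simplification; the underlying models (Prophet, ARIMA, LSTM) may compute their intervals differently, and normal_interval is a hypothetical helper name.

```python
from statistics import NormalDist, stdev


def normal_interval(yhat, residuals, alpha=0.1):
    """Return (lower, upper) bounds with nominal coverage 1 - alpha.

    Assumes normal residuals; the width does not grow with the horizon.
    """
    if not 0 < alpha < 1:
        raise ValueError("alpha must be in (0, 1)")
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    sigma = stdev(residuals)                 # sample std of in-sample residuals
    return [(p - z * sigma, p + z * sigma) for p in yhat]


residuals = [-1.0, 0.5, 1.0, -0.5, 0.0]
bounds_90 = normal_interval([100.0, 101.0], residuals, alpha=0.1)
bounds_80 = normal_interval([100.0, 101.0], residuals, alpha=0.2)
```

A larger alpha gives a narrower interval: with these residuals the alpha=0.2 bounds sit inside the alpha=0.1 bounds.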

detect_anomalies

Purpose: Flag unusual values in either historical or forecast series.

Signature

anomalies = agent.detect_anomalies(
    predictions: "pd.DataFrame",
    *,
    method: str = "residual_zscore",  # "residual_zscore" | "iqr" | "prophet"
    z_thresh: float = 3.0,
    iqr_coeff: float = 1.5,
    use_confidence: bool = True
) -> list[dict]

Parameters

  • predictions: DF including ds, yhat, optionally y (actuals), and intervals.

  • method:

    • "residual_zscore" → z-score on residuals (y - yhat)

    • "iqr" → interquartile range on residuals

    • "prophet" → outliers where y falls outside interval (if available)

  • z_thresh / iqr_coeff: Sensitivity controls.

  • use_confidence: If intervals exist, leverage them to reduce false positives.

Returns: A list of dicts like:

[{"ds": "...", "y": float, "yhat": float, "residual": float, "reason": "z>3"}]

Example

anoms = agent.detect_anomalies(pred, method="prophet")
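
As a rough illustration of the "residual_zscore" and "iqr" methods, the sketch below flags residual outliers using only the standard library. It is an assumption about how such rules are commonly implemented, not the agent's actual code; flag_anomalies and the synthetic rows are hypothetical, and the output simply mirrors the dict shape shown above.

```python
from statistics import mean, pstdev, quantiles


def flag_anomalies(rows, method="residual_zscore", z_thresh=3.0, iqr_coeff=1.5):
    """rows: list of dicts with keys ds, y, yhat. Returns the flagged rows."""
    residuals = [r["y"] - r["yhat"] for r in rows]
    if method == "residual_zscore":
        mu, sigma = mean(residuals), pstdev(residuals)

        def is_outlier(res):
            # A zero spread means every residual is identical: nothing to flag.
            return sigma > 0 and abs(res - mu) / sigma > z_thresh

        reason = f"z>{z_thresh:g}"
    elif method == "iqr":
        q1, _, q3 = quantiles(residuals, n=4)
        low = q1 - iqr_coeff * (q3 - q1)
        high = q3 + iqr_coeff * (q3 - q1)

        def is_outlier(res):
            return res < low or res > high

        reason = f"outside {iqr_coeff:g}*IQR"
    else:
        raise ValueError(f"unknown method: {method}")
    return [
        {"ds": r["ds"], "y": r["y"], "yhat": r["yhat"], "residual": res, "reason": reason}
        for r, res in zip(rows, residuals)
        if is_outlier(res)
    ]


# Synthetic data: small alternating residuals with one spike on 2024-01-11.
rows = [
    {"ds": f"2024-01-{i + 1:02d}", "yhat": 100.0, "y": 100.1 if i % 2 == 0 else 99.9}
    for i in range(20)
]
rows[10]["y"] = 105.0

z_hits = flag_anomalies(rows, method="residual_zscore")
iqr_hits = flag_anomalies(rows, method="iqr")
```

Note that a single extreme value inflates the standard deviation, so the z-score rule becomes less sensitive on short series; the IQR rule is more robust in that situation.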

run

Purpose: End-to-end pipeline: load → train → forecast → (optional) anomalies → assemble report.

Signature

result = agent.run(
    task: str | dict,
    *,
    detect_outliers: bool = True,
    return_artifacts: bool = False
) -> dict

Task Formats

  • String: natural language instruction (e.g., "Forecast next 14 days for BTC price from prices.csv").

  • Dict example:

{
  "source": "prices.csv",
  "timestamp_col": "date",
  "value_col": "close",
  "horizon": 14,
  "regressors": ["volume"],
  "anomaly_detection": True
}

Returns: A structured dict:

{
  "model": {...},           # model metadata (type, params, train range)
  "data_summary": {...},    # counts, freq, gaps
  "forecast": pd.DataFrame, # or serialized if configured
  "anomalies": list,        # optional
  "metrics": {...},         # CV/backtest metrics if computed
  "notes": list[str]        # warnings, assumptions
}

Example

result = agent.run({
    "source": "prices.csv",
    "timestamp_col": "date",
    "value_col": "close",
    "horizon": 21,
    "regressors": ["volume"],
    "anomaly_detection": True
})
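
One way to picture how a dict task maps onto the pipeline stages is to split its keys by stage before calling each method. The sketch below is purely illustrative: split_task, the key groupings, and the default horizon of 30 (taken from the forecast_horizon default) are assumptions, and the agent's real routing logic is not documented here.

```python
# Hypothetical grouping of task keys by pipeline stage.
LOAD_KEYS = {"source", "timestamp_col", "value_col"}
TRAIN_KEYS = {"regressors"}


def split_task(task, default_horizon=30):
    """Split a dict task into keyword groups for each pipeline stage."""
    if "source" not in task:
        raise ValueError("task dict needs a 'source' key")
    known = LOAD_KEYS | TRAIN_KEYS | {"horizon", "anomaly_detection"}
    unknown = sorted(set(task) - known)
    if unknown:
        # Fail early on typos instead of silently ignoring them.
        raise ValueError(f"unknown task keys: {unknown}")
    return {
        "load": {k: task[k] for k in LOAD_KEYS if k in task},
        "train": {k: task[k] for k in TRAIN_KEYS if k in task},
        "forecast": {"horizon": task.get("horizon", default_horizon)},
        "detect_anomalies": bool(task.get("anomaly_detection", False)),
    }


plan = split_task({
    "source": "prices.csv",
    "timestamp_col": "date",
    "value_col": "close",
    "horizon": 21,
    "regressors": ["volume"],
    "anomaly_detection": True,
})
```

With a plan like this, the load, train, and forecast groups would be passed to load_data, train_model, and forecast respectively, and the final flag decides whether detect_anomalies runs.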

Additional Tips

  • Feature drift: Periodically retrain if data distribution shifts.

  • Missing data: Prefer resampling + forward-fill, but audit long gaps.

  • Exogenous inputs: Validate alignment and scale consistently.

  • Metrics you care about: For business ops, monitor MAPE/MASE, not just RMSE.

  • Reproducibility: Persist model configs, seed, training window, and version.
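
For the metrics tip, the following sketch computes MAPE and MASE side by side. MASE here follows the usual definition, forecast MAE divided by the in-sample MAE of a (seasonal) naive forecast; the function names and sample numbers are illustrative and are not taken from the agent.

```python
def mape(actual, predicted):
    """Mean absolute percentage error; assumes nonzero actual values."""
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def mase(actual, predicted, train, season=1):
    """Forecast MAE scaled by the in-sample MAE of a seasonal-naive forecast."""
    naive_errors = [abs(train[i] - train[i - season]) for i in range(season, len(train))]
    scale = sum(naive_errors) / len(naive_errors)
    mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
    return mae / scale


train = [10.0, 12.0, 11.0, 13.0, 12.0]
actual = [14.0, 13.0]
predicted = [13.0, 13.5]

mape_value = mape(actual, predicted)
mase_value = mase(actual, predicted, train)
```

A MASE below 1 means the forecast beats the naive baseline on average. Unlike MAPE, it stays defined when actual values are zero, as long as the training series is not constant.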


Example Usage

from crowe.agents import PredictiveModelAgent

agent = PredictiveModelAgent(
    model_name="lstm",
    forecast_horizon=14,
    anomaly_detection=True
)

result = agent.run("Forecast next two weeks' cryptocurrency prices.")
print(result)

Memory System

The PredictiveModelAgent includes a DataCacheMemory system that stores recent datasets and prediction results for quick re-analysis.

Memory Features

  • Stores processed datasets for reuse

  • Caches model parameters to avoid retraining when unnecessary

  • Supports retrieval of past prediction runs for comparison
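
The idea of caching to avoid unnecessary retraining can be sketched as a lookup keyed by the model configuration and a fingerprint of the data. This is not the DataCacheMemory implementation, whose interface is not documented here; RunCache, its methods, and the fingerprint strings are hypothetical.

```python
import hashlib
import json


class RunCache:
    """In-memory cache keyed by a hash of the model config and data fingerprint."""

    def __init__(self):
        self._store = {}

    @staticmethod
    def make_key(config, data_fingerprint):
        # sort_keys makes the key independent of dict insertion order.
        payload = json.dumps({"config": config, "data": data_fingerprint}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get_or_train(self, config, data_fingerprint, train_fn):
        key = self.make_key(config, data_fingerprint)
        if key not in self._store:
            self._store[key] = train_fn()  # train only on a cache miss
        return self._store[key]


calls = []


def fake_train():
    """Stand-in for an expensive training step."""
    calls.append(1)
    return {"fitted": True, "run": len(calls)}


cache = RunCache()
first = cache.get_or_train({"model_name": "prophet", "horizon": 30}, "prices-v1", fake_train)
second = cache.get_or_train({"horizon": 30, "model_name": "prophet"}, "prices-v1", fake_train)
third = cache.get_or_train({"model_name": "prophet", "horizon": 14}, "prices-v1", fake_train)
```

The second call reuses the first result even though the config keys arrive in a different order, while changing the horizon produces a new key and triggers retraining.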


Best Practices

  • Use high-quality, clean data for better forecasting accuracy

  • Select the appropriate model based on data seasonality and trends

  • Enable anomaly detection for early warning systems

  • Monitor performance periodically and update models with new data
