PredictiveModelAgent
The PredictiveModelAgent is a specialized Crowe agent that focuses on time-series forecasting, trend detection, and predictive analytics. It is designed for financial markets, demand planning, and operational forecasting, leveraging both statistical models and machine learning techniques.
Overview
The PredictiveModelAgent operates through three main phases:
Data Ingestion — collects and preprocesses time-series data from various sources.
Model Execution — runs forecasting models such as ARIMA, Prophet, or LSTM.
Result Analysis — interprets predictions, detects anomalies, and generates insights.
This makes it an ideal tool for data-driven decision making in environments where future trends matter.
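The three phases can be pictured as three small functions chained together. The sketch below is only an illustration in plain Python: the function names (`ingest`, `execute`, `analyze`) are hypothetical, and the "model" is a stand-in that repeats the mean of the last three points rather than ARIMA, Prophet, or LSTM.

```python
from statistics import mean

def ingest(raw):
    """Phase 1 (hypothetical): drop missing values and sort by timestamp."""
    return sorted((t, float(v)) for t, v in raw if v is not None)

def execute(series, horizon):
    """Phase 2 (stand-in model): repeat the mean of the last 3 points."""
    level = mean(v for _, v in series[-3:])
    last_t = series[-1][0]
    return [(last_t + step, level) for step in range(1, horizon + 1)]

def analyze(series, forecast):
    """Phase 3 (hypothetical): compare the forecast with the last observation."""
    last_value = series[-1][1]
    direction = "up" if forecast[0][1] > last_value else "down or flat"
    return {"last_value": last_value, "forecast": forecast, "direction": direction}

# Integer timestamps keep the example short; one value is missing.
raw = [(3, 12.0), (1, 10.0), (2, None), (4, 14.0), (5, 13.0)]
series = ingest(raw)
report = analyze(series, execute(series, horizon=2))
print(report)
```

A real agent would swap the stand-in model for a fitted forecaster, but the data flow between the phases stays the same.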
Initialization
from crowe.agents import PredictiveModelAgent
agent = PredictiveModelAgent(
    agent_name="predictive-model-agent",
    model_name="prophet",
    forecast_horizon=30,
    anomaly_detection=True,
)
Parameters
agent_name (str, default "predictive-model-agent"): Name of the agent.
model_name (str, default "prophet"): Forecasting model to use.
forecast_horizon (int, default 30): Number of future periods to predict.
anomaly_detection (bool, default False): Whether to run anomaly detection on predictions.
data_source (str, default None): Optional source identifier for data ingestion.
max_loops (int, default 1): Maximum iterations for model execution.
Methods
load_data
Purpose: Load and normalize time-series data from a given source (CSV, DB, API, or in-memory object).
Signature
data = agent.load_data(
    source: str | dict | "pd.DataFrame",
    *,
    timestamp_col: str = "ds",
    value_col: str = "y",
    freq: str | None = None,
    tz: str | None = None,
    parse_dates: bool = True,
    resample: str | None = None,
    fill_na: float | str | None = "ffill"
) -> pd.DataFrame
Parameters
source
: Path/URL, connector config dict (e.g., SQL query), or a preloaded DataFrame.timestamp_col
/value_col
: Column names to map to(ds, y)
.freq
: Expected frequency (e.g.,"D"
,"H"
). IfNone
, infer.tz
: Force a timezone for timestamps.parse_dates
: Parse timestamp column todatetime
.resample
: Optional resampling frequency (e.g.,"D"
).fill_na
: Strategy ("ffill"
,"bfill"
, numeric), orNone
to leave as-is.
Returns
A clean DataFrame with columns ds (datetime, sorted, unique) and y (float).
Notes & Edge Cases
Deduplicates on ds (keeps last).
If gaps are present and resample is provided, fills missing timestamps.
Raises ValueError if timestamp_col / value_col is missing.
Example
df = agent.load_data("prices.csv", timestamp_col="date", value_col="close", resample="D")
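The normalization rules listed under Notes & Edge Cases (deduplicate on the timestamp keeping the last row, sort, fill gaps, raise ValueError on missing columns) can be sketched without pandas. This is not the agent's implementation: `normalize` is a hypothetical helper, rows are plain dicts, the step is assumed to be one day, and gaps are forward-filled.

```python
from datetime import date, timedelta

def normalize(rows, timestamp_col="ds", value_col="y", step=timedelta(days=1)):
    """Hypothetical helper: dedupe on timestamp (keep last), sort, forward-fill gaps."""
    latest = {}
    for row in rows:
        if timestamp_col not in row or value_col not in row:
            raise ValueError(f"row is missing {timestamp_col!r} or {value_col!r}")
        # A later row with the same timestamp overwrites the earlier one.
        latest[row[timestamp_col]] = float(row[value_col])

    filled = []
    for ds, y in sorted(latest.items()):
        # Insert any missing timestamps, carrying the previous value forward.
        while filled and filled[-1][0] + step < ds:
            filled.append((filled[-1][0] + step, filled[-1][1]))
        filled.append((ds, y))
    return [{"ds": ds, "y": y} for ds, y in filled]

rows = [
    {"date": date(2024, 1, 3), "close": 12},
    {"date": date(2024, 1, 1), "close": 10},
    {"date": date(2024, 1, 1), "close": 11},  # duplicate timestamp, kept
    {"date": date(2024, 1, 5), "close": 15},
]
clean = normalize(rows, timestamp_col="date", value_col="close")
for row in clean:
    print(row["ds"], row["y"])
```

Forward-filling is convenient for prices but can hide long outages, so it is worth counting how many rows were synthesized before training on them.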
train_model
Purpose: Fit the configured forecasting model on historical data.
Signature
agent.train_model(
    data: "pd.DataFrame",
    *,
    seasonal: str | None = None,  # e.g., "daily", "weekly", "yearly"
    regressors: list[str] | None = None,
    cross_validate: bool = False,
    cv_folds: int = 3,
    cv_horizon: int | None = None,
    metrics: list[str] = ("mae", "rmse", "mape")
) -> None
Parameters
data: Must include ds, y; optional exogenous columns (regressors).
seasonal: Seasonal prior (model dependent).
regressors: Additional features to include.
cross_validate: If True, performs rolling backtests.
cv_folds: Number of folds for CV.
cv_horizon: Steps ahead per fold (default derived from agent config).
metrics: Metrics to compute during CV.
Behavior
Stores the fitted model and training diagnostics on the agent.
If cross-validation is enabled, also stores per-fold metrics and the data needed for error plots.
Example
agent.train_model(df, regressors=["volume"], cross_validate=True, cv_folds=5, cv_horizon=14)
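To make the rolling backtest concrete, here is a minimal sketch of expanding-window cross-validation with `cv_folds` folds of `cv_horizon` steps each. It is an assumption about how such a backtest is commonly structured, not the agent's actual procedure; `naive_forecast` and `rolling_backtest` are hypothetical names, and the model is a last-value stand-in.

```python
from math import sqrt

def naive_forecast(train, horizon):
    """Stand-in model: repeat the last observed value."""
    return [train[-1]] * horizon

def rolling_backtest(y, cv_folds=3, cv_horizon=2):
    """Score the stand-in model on cv_folds expanding-window splits."""
    fold_metrics = []
    for fold in range(cv_folds, 0, -1):
        cutoff = len(y) - fold * cv_horizon  # train on y[:cutoff]
        if cutoff < 1:
            raise ValueError("not enough history for the requested folds")
        actual = y[cutoff:cutoff + cv_horizon]
        predicted = naive_forecast(y[:cutoff], cv_horizon)
        errors = [a - p for a, p in zip(actual, predicted)]
        fold_metrics.append({
            "cutoff": cutoff,
            "mae": sum(abs(e) for e in errors) / len(errors),
            "rmse": sqrt(sum(e * e for e in errors) / len(errors)),
            # MAPE assumes no actual value is zero.
            "mape": sum(abs(e / a) for e, a in zip(errors, actual)) / len(errors),
        })
    return fold_metrics

history = [10.0, 12.0, 11.0, 13.0, 14.0, 13.0, 15.0, 16.0]
folds = rolling_backtest(history, cv_folds=3, cv_horizon=2)
for fold in folds:
    print(fold)
```

Each fold trains only on data before its cutoff, so no future values leak into the fit, which is the property that distinguishes a backtest from ordinary shuffled cross-validation.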
forecast
Purpose: Generate forward predictions for the requested horizon.
Signature
predictions = agent.forecast(
    horizon: int | None = None,
    *,
    include_components: bool = True,
    include_intervals: bool = True,
    alpha: float = 0.1,
    future_exogenous: "pd.DataFrame" | None = None
) -> "pd.DataFrame"
Parameters
horizon: Steps ahead; defaults to the agent's forecast_horizon.
include_components: Return trend/seasonality/regressor contributions if available.
include_intervals: Return prediction intervals.
alpha: Significance level for the intervals (e.g., 0.1 → 90% interval); smaller values give wider intervals.
future_exogenous: Future regressor values aligned on ds.
Returns
A DataFrame with at least:
ds, yhat (point forecast)
Optional: yhat_lower, yhat_upper, trend, season_*, regressor_*
Notes
Validates the horizon: it must be ≥ 1.
If future_exogenous is provided, its columns must match the trained regressors.
Example
pred = agent.forecast(30, include_components=True, alpha=0.2)
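The relationship between alpha and the width of the bounds in yhat_lower / yhat_upper can be illustrated with a normal error model. This is an assumption for the sake of the example: how the agent (or the underlying model) actually derives its intervals is not specified here, and `prediction_interval` and `sigma` are hypothetical names.

```python
from statistics import NormalDist

def prediction_interval(yhat, sigma, alpha=0.1):
    """Return (lower, upper) covering 1 - alpha, assuming normal forecast errors."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must be in (0, 1)")
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    return yhat - z * sigma, yhat + z * sigma

# alpha=0.1 gives a 90% interval; alpha=0.2 gives a narrower 80% interval.
lower_90, upper_90 = prediction_interval(yhat=100.0, sigma=5.0, alpha=0.1)
lower_80, upper_80 = prediction_interval(yhat=100.0, sigma=5.0, alpha=0.2)
print(round(lower_90, 2), round(upper_90, 2))
print(round(lower_80, 2), round(upper_80, 2))
```

Under this model, raising alpha from 0.1 to 0.2 shrinks the interval, which is why the example call above with alpha=0.2 would be expected to return tighter bounds than the default.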
detect_anomalies
Purpose: Flag unusual values in either historical or forecast series.
Signature
anomalies = agent.detect_anomalies(
    predictions: "pd.DataFrame",
    *,
    method: str = "residual_zscore",  # "residual_zscore" | "iqr" | "prophet"
    z_thresh: float = 3.0,
    iqr_coeff: float = 1.5,
    use_confidence: bool = True
) -> list[dict]
Parameters
predictions: DataFrame including ds, yhat, optionally y (actuals), and intervals.
method:
    "residual_zscore" → z-score on residuals (y - yhat)
    "iqr" → interquartile range on residuals
    "prophet" → outliers where y falls outside the interval (if available)
z_thresh / iqr_coeff: Sensitivity controls.
use_confidence: If intervals exist, leverage them to reduce false positives.
Returns
A list of dicts like:
[{"ds": "...", "y": float, "yhat": float, "residual": float, "reason": "z>3"}]
Example
anoms = agent.detect_anomalies(pred, method="prophet")
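Two of the three methods, the residual z-score and the IQR rule, are simple enough to sketch in plain Python. The function below is a hypothetical illustration, not the agent's code: it takes a list of dicts instead of a DataFrame, uses the population standard deviation, and relies on the default quartile method of `statistics.quantiles`. The interval-based "prophet" method is not covered.

```python
from statistics import mean, pstdev, quantiles

def detect(rows, method="residual_zscore", z_thresh=3.0, iqr_coeff=1.5):
    """Hypothetical sketch: flag rows whose residual (y - yhat) is unusual."""
    residuals = [r["y"] - r["yhat"] for r in rows]
    if method == "residual_zscore":
        mu, sd = mean(residuals), pstdev(residuals)
        flags = [sd > 0 and abs(e - mu) / sd > z_thresh for e in residuals]
        reason = f"z>{z_thresh:g}"
    elif method == "iqr":
        q1, _, q3 = quantiles(residuals, n=4)
        low = q1 - iqr_coeff * (q3 - q1)
        high = q3 + iqr_coeff * (q3 - q1)
        flags = [e < low or e > high for e in residuals]
        reason = f"outside {iqr_coeff:g}*IQR"
    else:
        raise ValueError(f"unsupported method in this sketch: {method!r}")
    return [
        {**row, "residual": e, "reason": reason}
        for row, e, flagged in zip(rows, residuals, flags)
        if flagged
    ]

# Twelve points with small alternating errors and one spike at ds=7.
rows = [{"ds": i, "yhat": 100.0, "y": 100.5 if i % 2 == 0 else 99.5} for i in range(12)]
rows[7]["y"] = 110.0
by_z = detect(rows, method="residual_zscore")
by_iqr = detect(rows, method="iqr")
print(by_z)
print(by_iqr)
```

Note that with a population z-score a single outlier among n points can reach at most roughly sqrt(n - 1), so a threshold of 3.0 cannot fire on very short series; the IQR rule does not have that limitation.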
run
Purpose: End-to-end pipeline: load → train → forecast → (optional) anomalies → assemble report.
Signature
result = agent.run(
    task: str | dict,
    *,
    detect_outliers: bool = True,
    return_artifacts: bool = False
) -> dict
Task Formats
String: natural language instruction (e.g., "Forecast next 14 days for BTC price from prices.csv").
Dict example:
{
    "source": "prices.csv",
    "timestamp_col": "date",
    "value_col": "close",
    "horizon": 14,
    "regressors": ["volume"],
    "anomaly_detection": True
}
Returns
A structured dict:
{
    "model": {...},            # model metadata (type, params, train range)
    "data_summary": {...},     # counts, freq, gaps
    "forecast": pd.DataFrame,  # or serialized if configured
    "anomalies": list,         # optional
    "metrics": {...},          # CV/backtest metrics if computed
    "notes": list[str]         # warnings, assumptions
}
Example
result = agent.run({
    "source": "prices.csv",
    "timestamp_col": "date",
    "value_col": "close",
    "horizon": 21,
    "regressors": ["volume"],
    "anomaly_detection": True
})
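Before a dict task reaches the pipeline, it is useful to fill in defaults and reject malformed input early. The sketch below shows one way to do that; `normalize_task` is a hypothetical helper, and the defaults are taken from the parameter descriptions on this page (ds, y, a horizon of 30, anomaly detection off), not from the agent's source. String tasks are out of scope here.

```python
# Defaults assumed from the documented parameters; the agent may differ.
DEFAULTS = {
    "timestamp_col": "ds",
    "value_col": "y",
    "horizon": 30,
    "regressors": None,
    "anomaly_detection": False,
}

def normalize_task(task):
    """Hypothetical helper: validate a dict task and fill in defaults."""
    if not isinstance(task, dict):
        raise TypeError("this sketch only handles dict tasks")
    if "source" not in task:
        raise ValueError("task needs a 'source'")
    unknown = set(task) - set(DEFAULTS) - {"source"}
    if unknown:
        raise ValueError(f"unknown task keys: {sorted(unknown)}")
    merged = {**DEFAULTS, **task}  # explicit task values win over defaults
    if merged["horizon"] < 1:
        raise ValueError("horizon must be >= 1")
    return merged

full = normalize_task({
    "source": "prices.csv",
    "timestamp_col": "date",
    "value_col": "close",
    "horizon": 21,
    "regressors": ["volume"],
    "anomaly_detection": True,
})
minimal = normalize_task({"source": "prices.csv"})
print(full)
print(minimal)
```

Rejecting unknown keys catches typos such as "horizion" that would otherwise be silently ignored and replaced by the default.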
Additional Tips
Feature drift: Periodically retrain if data distribution shifts.
Missing data: Prefer resampling + forward-fill, but audit long gaps.
Exogenous inputs: Validate alignment and scale consistently.
Metrics you care about: For business ops, monitor MAPE/MASE, not just RMSE.
Reproducibility: Persist model configs, seed, training window, and version.
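For the metrics tip, MAPE and MASE are straightforward to compute by hand. The sketch below uses the standard definitions (MASE scales the forecast MAE by the in-sample MAE of a naive forecast); the function names and sample numbers are illustrative only.

```python
def mape(actual, predicted):
    """Mean absolute percentage error; assumes no actual value is zero."""
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def mase(actual, predicted, train, season=1):
    """Forecast MAE divided by the in-sample MAE of a (seasonal) naive forecast."""
    naive_errors = [abs(train[i] - train[i - season]) for i in range(season, len(train))]
    scale = sum(naive_errors) / len(naive_errors)
    mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
    return mae / scale

train = [10.0, 12.0, 11.0, 13.0, 12.0]
actual = [14.0, 13.0]
predicted = [13.0, 13.0]
print("MAPE:", mape(actual, predicted))
print("MASE:", mase(actual, predicted, train))
```

A MASE below 1 means the forecast beats the naive baseline on average, which makes it easier to compare across series with different scales than RMSE.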
Example Usage
from crowe.agents import PredictiveModelAgent
agent = PredictiveModelAgent(
    model_name="lstm",
    forecast_horizon=14,
    anomaly_detection=True,
)
result = agent.run("Forecast next two weeks' cryptocurrency prices.")
print(result)
Memory System
The PredictiveModelAgent includes a DataCacheMemory system that stores recent datasets and prediction results for quick re-analysis.
Memory Features
Stores processed datasets for reuse
Caches model parameters to avoid retraining when unnecessary
Supports retrieval of past prediction runs for comparison
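The internals of DataCacheMemory are not described here, so the following is only a sketch of the general idea behind these features: results are stored under a key derived from the run configuration, and the least recently used entry is evicted when the cache is full. `RunCache` and its methods are hypothetical names, not the Crowe API.

```python
import hashlib
import json
from collections import OrderedDict

class RunCache:
    """Hypothetical LRU cache keyed by a hash of the run configuration."""

    def __init__(self, max_entries=8):
        self.max_entries = max_entries
        self._store = OrderedDict()

    @staticmethod
    def key_for(config):
        # Sorting keys makes the hash independent of dict ordering.
        payload = json.dumps(config, sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def get(self, config):
        key = self.key_for(config)
        if key not in self._store:
            return None
        self._store.move_to_end(key)  # mark as most recently used
        return self._store[key]

    def put(self, config, result):
        key = self.key_for(config)
        self._store[key] = result
        self._store.move_to_end(key)
        while len(self._store) > self.max_entries:
            self._store.popitem(last=False)  # evict least recently used

cache = RunCache(max_entries=2)
cfg_a = {"model_name": "prophet", "forecast_horizon": 30}
cfg_b = {"model_name": "lstm", "forecast_horizon": 14}
cfg_c = {"model_name": "arima", "forecast_horizon": 7}
cache.put(cfg_a, "run-a")
cache.put(cfg_b, "run-b")
cache.get(cfg_a)           # touch cfg_a so cfg_b becomes the oldest entry
cache.put(cfg_c, "run-c")  # evicts cfg_b
print(cache.get(cfg_a), cache.get(cfg_b), cache.get(cfg_c))
```

Keying on the full configuration means a change to any parameter, such as the horizon, produces a cache miss and a fresh run rather than a stale result.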
Best Practices
Use high-quality, clean data for better forecasting accuracy
Select the appropriate model based on data seasonality and trends
Enable anomaly detection for early warning systems
Monitor performance periodically and update models with new data