SelfCritiqueAgent

The SelfCritiqueAgent is a Crowe-native, reflection-driven agent that improves outputs through drafting, structured critique, self-reflection, and targeted revision. It iterates over a task multiple times, learning from its own feedback and retained memories to deliver higher-quality responses.

Overview

SelfCritiqueAgent employs three internal roles:

  • Author – produces the initial draft/output

  • Critic – evaluates the draft against explicit quality criteria

  • Reflector – distills lessons and patterns to guide revisions and future outputs
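Conceptually, one Author → Critic → Reflector pass can be pictured with a minimal sketch. The three roles below are plain stub functions standing in for model calls; the scoring rule and prompt shapes are illustrative assumptions, not crowe's implementation.

```python
# Minimal sketch of one Author -> Critic -> Reflector pass.
# Each "role" is a stub function standing in for a model call.

def author(task: str) -> str:
    # Author: produce an initial draft.
    return f"Draft answer for: {task}"

def critic(task: str, draft: str) -> tuple[str, float]:
    # Critic: a real critic scores against quality criteria;
    # this stub only checks that the draft mentions the task.
    ok = task in draft
    feedback = "Covers the task." if ok else "Misses the task."
    return feedback, 1.0 if ok else 0.3

def reflector(feedback: str) -> str:
    # Reflector: distill the critique into a reusable lesson.
    return f"Lesson: {feedback}"

task = "Explain recursion."
draft = author(task)
feedback, score = critic(task, draft)
reflection = reflector(feedback)
```

In the real agent, the reflection and feedback then feed a revision step, closing the loop described above.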

Initialization

from crowe.agents import SelfCritiqueAgent

agent = SelfCritiqueAgent(
    agent_name="self-critique-agent",
    system_prompt="...",          # Optional: custom system prompt
    model_name="openai/gpt-4o",
    max_cycles=3,
    memory_capacity=128
)

Parameters

  • agent_name (str, default "self-critique-agent") – Agent identifier for logs and routing

  • system_prompt (str, default SC_PROMPT_DEFAULT) – Base system prompt for all sub-roles

  • model_name (str, default "openai/gpt-4o") – Backbone model used by the roles

  • max_cycles (int, default 3) – Maximum iterative improvement cycles

  • memory_capacity (int, default 128) – Long-term memory capacity (items)

  • return_list (bool, default False) – If True, return the conversation as a list

  • return_dict (bool, default False) – If True, return the conversation as a dict

Methods

draft

Generate an initial draft for a task (Author role).

response = agent.draft(
    task: str,
    context: dict | None = None
) -> str

Args

  • task: The problem or instruction.

  • context (optional): Extra hints, constraints, or retrieved facts.


critique

Evaluate a response against objective criteria (Critic role).

feedback, score = agent.critique(
    task: str,
    response: str,
    rubric: dict | None = None
) -> tuple[str, float]

Returns

  • feedback: Structured critique text.

  • score: A float in [0, 1] indicating quality.
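A rubric can be any criterion-to-weight mapping. One plausible way such a rubric might reduce per-criterion marks to a single score in [0, 1] is a weighted sum; the criterion names, weights, and aggregation rule below are assumptions for illustration only.

```python
# Sketch: reducing per-criterion marks to one score in [0, 1].
# Criterion names and weights are illustrative, not crowe's defaults.

rubric = {"accuracy": 0.5, "clarity": 0.3, "completeness": 0.2}
marks = {"accuracy": 1.0, "clarity": 0.5, "completeness": 1.0}  # from the Critic

score = sum(rubric[c] * marks[c] for c in rubric)  # ~0.85
```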


self_reflect

Produce self-reflection statements that generalize beyond a single attempt (Reflector role).

reflection = agent.self_reflect(
    task: str,
    response: str,
    feedback: str
) -> str

revise

Revise a response using the latest critique and reflection.

revised = agent.revise(
    task: str,
    original: str,
    feedback: str,
    reflection: str
) -> str

cycle

Run one improvement cycle: draft/critique/reflect/revise.

result = agent.cycle(
    task: str,
    iteration: int = 0,
    previous: str | None = None,
    rubric: dict | None = None
) -> dict

Returns a dictionary containing:

  • task, draft, feedback, reflection, revised, score, iteration
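Cycles are typically chained until the score clears a quality bar or max_cycles is reached. The control flow can be sketched as below; `fake_cycle` stands in for agent.cycle (returning the documented keys), and the 0.9 threshold and improving-score schedule are assumptions.

```python
# Sketch: chaining cycles until quality clears a bar or max_cycles is hit.
# `fake_cycle` stands in for agent.cycle and returns the documented keys.

def fake_cycle(task, iteration=0, previous=None):
    score = 0.6 + 0.2 * iteration  # pretend each revision improves
    return {
        "task": task, "draft": previous or "draft", "feedback": "...",
        "reflection": "...", "revised": f"revision {iteration}",
        "score": score, "iteration": iteration,
    }

max_cycles, threshold = 3, 0.9
result, previous = None, None
for i in range(max_cycles):
    result = fake_cycle("Explain BFS.", iteration=i, previous=previous)
    previous = result["revised"]      # feed the revision into the next cycle
    if result["score"] >= threshold:  # stop early once quality is sufficient
        break
```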


run_batch

Execute the iterative process for a list of tasks.

results = agent.run_batch(
    tasks: list[str],
    include_intermediates: bool = False,
    rubric: dict | None = None
) -> list[dict] | list[str]

Returns

  • If include_intermediates=False: List of final revised responses

  • If include_intermediates=True: Full histories per task

Example Usage

from crowe.agents import SelfCritiqueAgent

agent = SelfCritiqueAgent(
    agent_name="self-critique-agent",
    model_name="openai/gpt-4o",
    max_cycles=3
)

tasks = [
    "Explain quantum computing to a beginner.",
    "Write a Python function to sort a list of dictionaries by a key."
]

results = agent.run_batch(tasks)

for i, (task, result) in enumerate(zip(tasks, results), start=1):
    print(f"\nTask {i}: {task}")
    print(f"Response: {result}")

Memory System

SelfCritiqueAgent ships with InsightMemory, maintaining both short-term and long-term records of drafts, critiques, and reflections to guide future work.

Memory Features

  • Short-term traces for current session context

  • Long-term insights for durable lessons and reusable patterns

  • Capacity limits with automatic pruning

  • Relevance retrieval for similar future tasks

  • Similarity deduplication to avoid memory bloat
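The capacity-pruning and deduplication behavior can be pictured with a toy sketch. This is not InsightMemory's implementation: the dedup rule here is exact-match only, whereas InsightMemory also uses similarity-based dedup and relevance retrieval.

```python
from collections import deque

# Toy capacity-limited memory with exact-duplicate rejection.
# Shows only the pruning/dedup shape, not InsightMemory itself.

class TinyMemory:
    def __init__(self, capacity: int):
        self.items = deque(maxlen=capacity)  # oldest entries pruned first

    def add(self, insight: str) -> bool:
        if insight in self.items:            # reject exact duplicates
            return False
        self.items.append(insight)
        return True

mem = TinyMemory(capacity=2)
mem.add("prefer concrete examples")
mem.add("prefer concrete examples")          # rejected as a duplicate
mem.add("state assumptions explicitly")
mem.add("cite sources")                      # capacity hit: evicts the oldest
```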

Best Practices

  • Define clear tasks: Supply constraints, audience, tone, and format.

  • Tune cycles: Increase max_cycles for complex or high-stakes outputs.

  • Curate memory: Adjust memory_capacity and periodically purge.

  • Choose the right model: Favor models with strong reasoning and instruction-following.

  • Harden for production: Add timeouts, retries, and validation if outputs feed downstream systems.
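For the production-hardening point, a retry-with-validation wrapper around any generation call might look like the sketch below. The retry policy, validator, and `generate` callable are application choices, not part of crowe's API.

```python
import time

# Sketch: retry a flaky generation call and validate its output before
# it reaches downstream systems. `generate` stands in for an agent call.

def with_retries(generate, validate, attempts=3, delay=0.0):
    last_err = None
    for _ in range(attempts):
        try:
            out = generate()
            if validate(out):
                return out
            last_err = ValueError("validation failed")
        except Exception as err:
            last_err = err
        time.sleep(delay)  # back off before the next attempt
    raise RuntimeError("all attempts failed") from last_err

calls = iter(["", "a valid answer"])  # first output fails validation
result = with_retries(lambda: next(calls), validate=lambda s: bool(s.strip()))
```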
