Usage Modes
MAC supports three prompting modes and works with any task you can evaluate with a scoring function.
Mode 1: Auto-Adapt (Recommended Starting Point)
Just give MAC a task description and your data. The meta-model handles all prompt engineering automatically.
```python
from mac import MAC

compiler = MAC(
    model="Qwen/Qwen3-8B",
    base_url="http://localhost:8000/v1",  # vLLM local server
    mac_model="gpt-4o",  # strong MAC agents
    task_description="Extract named entities from legal text.",
    rule_type="NER extraction rules",
)
optimized = compiler.compile(trainset=train, holdout=holdout, metric=metric)
```
What the meta-model does
Before training starts, a meta-model (defaults to mac_model) reads your task description, inspects real data samples, examines the output format, and rewrites four internal agent prompts -- Annotator, Decision Agent, Rule Proposer, and Rule Editor -- to be specific to your task. This is a structural rewrite, not a word swap.
The adaptation step fires once before epoch 1. You can watch it live in the terminal:
```
1/4 ✓ annotator adapted (12s)
2/4 ✓ decision adapted (18s)
3/4 ✓ rule_proposer adapted (24s)
4/4 ✓ rule_editor adapted (31s)
```
After this, every agent in the network operates with task-specific instructions, so the rules they propose and the decisions they make are grounded in your domain -- not generic placeholders.
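Conceptually, the adaptation step is one meta-model call per agent prompt. A minimal sketch of that loop, where `meta_llm`, `generic_prompts`, and `samples` are illustrative names, not MAC internals:

```python
# Hypothetical sketch of the one-time adaptation pass before epoch 1.
# `meta_llm`, `generic_prompts`, and `samples` are illustrative; the
# real MAC internals may differ.
AGENTS = ["annotator", "decision", "rule_proposer", "rule_editor"]

def adapt_agent_prompts(meta_llm, task_description, samples, generic_prompts):
    adapted = {}
    for name in AGENTS:
        request = (
            f"Task: {task_description}\n"
            f"Data samples: {samples[:3]}\n"
            f"Rewrite this generic '{name}' prompt so its instructions, "
            f"terminology, and output format match the task:\n"
            f"{generic_prompts[name]}"
        )
        adapted[name] = meta_llm(request)  # one meta-model call per agent
    return adapted
```

The key point the sketch captures is that each of the four prompts is rewritten against the same task description and data samples, so the agents stay mutually consistent.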
vLLM local server
Point base_url at any OpenAI-compatible endpoint. The most common setup runs vllm serve and passes the server URL:
```python
MAC(
    model="Qwen/Qwen3-8B",
    base_url="http://localhost:8000/v1",  # default vLLM port
    # or another local OpenAI-compatible server:
    # base_url="http://localhost:11434/v1",  # Ollama
    # base_url="http://localhost:8080/v1",   # LM Studio
    mac_model="gpt-4o",
)
```
Note that mac_base_url does not fall back to base_url: the MAC agents are expected to call a stronger cloud model, not your local vLLM server.
Mode 2: Custom Prompt
You write the full prompt and mark exactly where MAC should inject the learned rules:
```python
compiler = MAC(
    model="gpt-4o",
    task_prompt=(
        "You are an expert math solver.\n\n"
        "{{CONSTITUTION_BLOCK}}\n\n"
        "Return JSON: {\"reasoning\": \"...\", \"answer\": 42}"
    ),
    task_description="Competition math problems requiring multi-step arithmetic.",
    rule_type="math reasoning rules",
)
optimized = compiler.compile(trainset=train, holdout=holdout, metric=metric)
```
The {{CONSTITUTION_BLOCK}} tag
At inference time, {{CONSTITUTION_BLOCK}} is replaced with the learned constitution formatted as a numbered rule list:
```
Rules:
1. If the problem asks for a remaining total, sum each sub-total after applying per-category operations.
2. If the calculated value is a decimal and the context implies whole numbers, round to the nearest integer.
```
If the constitution is still empty (first few batches of epoch 1), the placeholder is replaced with an empty string -- your prompt works normally with no rules injected, and MAC improves from there.
Place {{CONSTITUTION_BLOCK}} where rules make the most sense in your prompt. For structured-output tasks, put it before the JSON schema. For reasoning tasks, put it after the system persona but before any examples.
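The substitution itself is plain string replacement. A minimal sketch of the behavior described above, using a hypothetical helper name that is not part of the MAC API:

```python
def render_constitution(template: str, rules: list[str]) -> str:
    # Hypothetical helper mirroring the documented behavior;
    # not part of the MAC API.
    if rules:
        block = "Rules:\n" + "\n".join(
            f"{i}. {rule}" for i, rule in enumerate(rules, 1)
        )
    else:
        block = ""  # empty constitution: the prompt runs with no rules injected
    return template.replace("{{CONSTITUTION_BLOCK}}", block)
```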
Mode 3: YOLO (Fully Automatic)
The simplest possible invocation. No prompt, no task scaffold -- just a description, data, and a metric:
```python
compiler = MAC(
    model="Qwen/Qwen3-8B",
    base_url="http://localhost:8000/v1",
    mac_model="gpt-4o",
    task_description="Answer multi-hop questions about Wikipedia articles.",
    rule_type="multi-hop QA rules",
)
result = compiler.compile(trainset=train, holdout=holdout, metric=exact_match)
```
MAC builds the initial prompt from scratch, adapts all four agent prompts, and starts learning rules. You get a constitution and an optimized prompt back without writing a single line of prompt engineering.
Use this to get a strong baseline fast. Once you see what rules MAC learns, you can switch to Custom Prompt mode to lock in a prompt structure and refine further.
Comparing the Modes
| | Auto-Adapt | Custom Prompt | YOLO |
|---|---|---|---|
| You write the prompt | No | Yes | No |
| Meta-model adapts agent prompts | Yes | Yes | Yes |
| Constitution injected automatically | Yes | At {{CONSTITUTION_BLOCK}} | Yes |
| Best for | Fast iteration | Precise control | Quickest start |
What MAC Returns
compiler.compile() returns an OptimizedPrompt object:
```python
result = compiler.compile(trainset=train, holdout=holdout, metric=metric)

print(result.prompt)        # the full optimized prompt
print(result.constitution)  # list of learned rules (strings)
print(result.score)         # final holdout score
```
The rules in result.constitution are plain English. You can read them, edit them by hand, share them with your team, or paste them into a different model's prompt without any MAC infrastructure.
Works with Any Task
MAC is task-agnostic. Beyond the benchmarks:
- Phrase extraction -- extract key phrases, entities, or spans from documents
- Classification -- sentiment, intent, topic, toxicity
- Structured output -- JSON extraction, table filling, slot-filling
- Tool calling -- API call generation (see Tool-MAC in paper results)
- Math / reasoning -- step-by-step arithmetic, formal verification
- Custom domains -- anything you can score with a Python function
The only requirements are a training set of examples and a scoring function that returns a number per example. MAC does the rest.
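A metric can be as simple as exact match. The sketch below assumes each example is a dict with an "answer" key and that the metric receives an (example, prediction) pair; the exact signature MAC expects may differ, so check its docs:

```python
def exact_match(example: dict, prediction: str) -> float:
    # Per-example score in [0, 1]. The (example, prediction) signature
    # and the "answer" key are illustrative assumptions, not the
    # confirmed MAC interface.
    return float(prediction.strip().lower() == example["answer"].strip().lower())
```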