Usage Modes
MAC supports three prompting modes and works with any task you can evaluate with a scoring function.
Mode 1: Auto-Adapt (Recommended Starting Point)
Just give MAC a task description and your data. The meta-model handles all prompt engineering automatically.
```python
from mac import MAC

compiler = MAC(
    model="Qwen/Qwen3-8B",
    base_url="http://localhost:8000/v1",  # vLLM local server
    mac_model="gpt-4o",  # strong MAC agents
    task_description="Extract named entities from legal text.",
    rule_type="NER extraction rules",
)
optimized = compiler.compile(trainset=train, holdout=holdout, metric=metric)
```
What the meta-model does
Before training starts, a meta-model (defaults to mac_model) reads your task description, inspects real data samples, examines the output format, and rewrites four internal agent prompts -- Annotator, Decision Agent, Rule Proposer, and Rule Editor -- to be specific to your task. This is a structural rewrite, not a word swap.
The adaptation step fires once before epoch 1. You can watch it live in the terminal:
```
1/4 ✓ annotator adapted (12s)
2/4 ✓ decision adapted (18s)
3/4 ✓ rule_proposer adapted (24s)
4/4 ✓ rule_editor adapted (31s)
```
After this, every agent in the network operates with task-specific instructions, so the rules they propose and the decisions they make are grounded in your domain -- not generic placeholders.
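Conceptually, the adaptation step is one meta-model call per agent prompt. A minimal sketch of that loop, where `meta_llm`, `generic_prompts`, and `samples` are illustrative names, not MAC internals:

```python
# Hypothetical sketch of the one-time adaptation pass before epoch 1.
# `meta_llm`, `generic_prompts`, and `samples` are illustrative; the
# real MAC internals may differ.
AGENTS = ["annotator", "decision", "rule_proposer", "rule_editor"]

def adapt_agent_prompts(meta_llm, task_description, samples, generic_prompts):
    adapted = {}
    for name in AGENTS:
        request = (
            f"Task: {task_description}\n"
            f"Data samples: {samples[:3]}\n"
            f"Rewrite this generic '{name}' prompt so its instructions, "
            f"terminology, and output format match the task:\n"
            f"{generic_prompts[name]}"
        )
        adapted[name] = meta_llm(request)  # one meta-model call per agent
    return adapted
```

The key point the sketch captures is that each of the four prompts is rewritten against the same task description and data samples, so the agents stay mutually consistent.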
vLLM local server
Point base_url at any OpenAI-compatible endpoint. The most common setup runs vllm serve and passes the server URL:
```python
MAC(
    model="Qwen/Qwen3-8B",
    base_url="http://localhost:8000/v1",  # default vLLM port
    # or another local OpenAI-compatible server:
    # base_url="http://localhost:11434/v1",  # Ollama
    # base_url="http://localhost:8080/v1",   # LM Studio
    mac_model="gpt-4o",
)
```
Note that mac_base_url does not fall back to base_url: the MAC agents are expected to call a stronger cloud model, not your local vLLM server.
Mode 2: Custom Prompt
You write the full prompt and mark exactly where MAC should inject the learned rules:
```python
compiler = MAC(
    model="gpt-4o",
    task_prompt=(
        "You are an expert math solver.\n\n"
        "{{CONSTITUTION_BLOCK}}\n\n"
        "Return JSON: {\"reasoning\": \"...\", \"answer\": 42}"
    ),
    task_description="Competition math problems requiring multi-step arithmetic.",
    rule_type="math reasoning rules",
)
optimized = compiler.compile(trainset=train, holdout=holdout, metric=metric)
```
The {{CONSTITUTION_BLOCK}} tag
At inference time, {{CONSTITUTION_BLOCK}} is replaced with the learned constitution formatted as a numbered rule list:
```
Rules:
1. If the problem asks for a remaining total, sum each sub-total after applying per-category operations.
2. If the calculated value is a decimal and the context implies whole numbers, round to the nearest integer.
```
If the constitution is still empty (first few batches of epoch 1), the placeholder is replaced with an empty string -- your prompt works normally with no rules injected, and MAC improves from there.
Place {{CONSTITUTION_BLOCK}} where rules make the most sense in your prompt. For structured-output tasks, put it before the JSON schema. For reasoning tasks, put it after the system persona but before any examples.
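The substitution itself is plain string replacement. A minimal sketch of the behavior described above, using a hypothetical helper name that is not part of the MAC API:

```python
def render_constitution(template: str, rules: list[str]) -> str:
    # Hypothetical helper mirroring the documented behavior;
    # not part of the MAC API.
    if rules:
        block = "Rules:\n" + "\n".join(
            f"{i}. {rule}" for i, rule in enumerate(rules, 1)
        )
    else:
        block = ""  # empty constitution: the prompt runs with no rules injected
    return template.replace("{{CONSTITUTION_BLOCK}}", block)
```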
Mode 3: YOLO (Fully Automatic)
The simplest possible invocation. No prompt, no task scaffold -- just a description, data, and a metric:
```python
compiler = MAC(
    model="Qwen/Qwen3-8B",
    base_url="http://localhost:8000/v1",
    mac_model="gpt-4o",
    task_description="Answer multi-hop questions about Wikipedia articles.",
    rule_type="multi-hop QA rules",
)
result = compiler.compile(trainset=train, holdout=holdout, metric=exact_match)
```
MAC builds the initial prompt from scratch, adapts all four agent prompts, and starts learning rules. You get a constitution and an optimized prompt back without writing a single line of prompt engineering.
Use this to get a strong baseline fast. Once you see what rules MAC learns, you can switch to Custom Prompt mode to lock in a prompt structure and refine further.
Comparing the Modes
| | Auto-Adapt | Custom Prompt | YOLO |
|---|---|---|---|
| You write the prompt | No | Yes | No |
| Meta-model adapts agent prompts | Yes | Yes | Yes |
| Constitution injected automatically | Yes | At {{CONSTITUTION_BLOCK}} | Yes |
| Best for | Fast iteration | Precise control | Quickest start |
What MAC Returns
compiler.compile() returns an OptimizedPrompt object:
```python
result = compiler.compile(trainset=train, holdout=holdout, metric=metric)

print(result.prompt)        # the full optimized prompt
print(result.constitution)  # list of learned rules (strings)
print(result.score)         # final holdout score
```
The rules in result.constitution are plain English. You can read them, edit them by hand, share them with your team, or paste them into a different model's prompt without any MAC infrastructure.
Works with Any Task
MAC is task-agnostic. Beyond the benchmarks:
- Phrase extraction -- extract key phrases, entities, or spans from documents
- Classification -- sentiment, intent, topic, toxicity
- Structured output -- JSON extraction, table filling, slot-filling
- Tool calling -- API call generation (see Tool-MAC in paper results)
- Math / reasoning -- step-by-step arithmetic, formal verification
- Custom domains -- anything you can score with a Python function
The only requirements are a training set of examples and a scoring function that returns a number per example. MAC does the rest.
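A metric can be as simple as exact match. The sketch below assumes each example is a dict with an "answer" key and that the metric receives an (example, prediction) pair; the exact signature MAC expects may differ, so check its docs:

```python
def exact_match(example: dict, prediction: str) -> float:
    # Per-example score in [0, 1]. The (example, prediction) signature
    # and the "answer" key are illustrative assumptions, not the
    # confirmed MAC interface.
    return float(prediction.strip().lower() == example["answer"].strip().lower())
```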