Graft User Guide
Graft compiles .gft files into multi-agent pipeline structures that Claude Code can execute directly. Best for natural-language I/O pipelines — code review, ideation, content generation, data analysis — where agents exchange structured JSON. In coding workflows, use Graft for the NL sub-steps (review, analysis, planning) while running code execution directly.
Write .gft file → graft compile → .claude/ structure generated → Run in Claude Code
1. Installation
npm install -g @jsleekr/graft
Requires Node.js 20+. Verify:
graft --version
2. Create or Configure a Project
New project
graft init my-project
cd my-project
Creates:
my-project/
pipeline.gft ← Two-node pipeline template
.claude/CLAUDE.md ← Full .gft syntax spec (so Claude Code understands Graft)
Existing project
cd your-existing-project
graft init
This adds .claude/CLAUDE.md with the Graft spec. If .claude/CLAUDE.md already exists, the Graft section is appended. Your existing files are untouched.
2.1 Claude Code Native Integration
The generated .claude/CLAUDE.md contains the complete .gft language reference. Any Claude Code session in this project automatically understands Graft syntax.
claude # Open Claude Code
Then just say what you want:
“Make a data analysis pipeline where a classifier runs first, then statistical and trend analyzers run in parallel, then a report writer combines everything.”
Claude Code will write the .gft file and run graft compile — no DSL learning required.
3. Writing .gft Files
3.1 Context — Define Input Data
context UserRequest(max_tokens: 500) {
question: String
}
- max_tokens: Maximum token count for this data (used in cost analysis)
- Field types: `String`, `Int`, `Float`, `Bool`, `List<T>`, `Map<K, V>`, `Optional<T>`
- Special types: `Float(0..1)` (range-bounded), `enum(low, medium, high)`, `TokenBounded<String, 100>`
3.2 Node — Define an Agent
node Analyst(model: sonnet, budget: 4k/2k) {
reads: [UserRequest]
produces Analysis {
answer: String
confidence: Float(0..1)
}
}
- model: Choose from `haiku`, `sonnet`, `opus`
- budget: `input_tokens/output_tokens` (k = 1000)
- reads: Data this agent reads (context, another node's produces, or memory)
- produces: JSON schema this agent outputs
- tools: `[file_read, file_write, terminal]` (optional)
- on_failure: `retry(N)`, `fallback(NodeName)`, `skip`, `abort` (optional)
3.3 Edge — Data Flow + Transforms Between Nodes
edge Analyst -> Reviewer | select(answer, confidence) | compact
When data flows from Analyst to Reviewer, keep only answer and confidence fields, then remove empty values.
Available transforms:
| Transform | Description | Example |
|---|---|---|
| `select(field1, field2)` | Keep only specified fields | `select(answer, confidence)` |
| `drop(field)` | Remove a specific field | `drop(reasoning_trace)` |
| `compact` | Recursively remove nulls, empty strings, empty arrays, empty objects + minify JSON | `compact` |
| `filter(field, condition)` | Filter array field by condition | `filter(issues, severity >= medium)` |
| `truncate(N)` | Proportionally reduce content to fit N tokens | `truncate(500)` |
Transforms are chainable with pipes (|): select(a, b) | drop(c) | compact
Why this matters: Passing full context between agents wastes tokens. Edge transforms forward only the data the next agent needs, reducing cost.
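As a concrete illustration, a `select | compact` chain behaves roughly like the following Node.js sketch. The function names mirror the transform names, but this is not the exact code Graft generates:

```javascript
// select(answer, confidence): keep only the listed fields
function select(obj, fields) {
  return Object.fromEntries(
    fields.filter((f) => f in obj).map((f) => [f, obj[f]])
  );
}

// compact: recursively drop nulls, empty strings, empty arrays/objects
function compact(value) {
  if (Array.isArray(value)) {
    const arr = value.map(compact).filter((v) => v !== undefined);
    return arr.length ? arr : undefined;
  }
  if (value !== null && typeof value === 'object') {
    const entries = Object.entries(value)
      .map(([k, v]) => [k, compact(v)])
      .filter(([, v]) => v !== undefined);
    return entries.length ? Object.fromEntries(entries) : undefined;
  }
  return value === null || value === '' ? undefined : value;
}

// select(answer, confidence) | compact
const input = { answer: 'Use a DAG', confidence: 0.9, reasoning_trace: '...' };
const transformed = compact(select(input, ['answer', 'confidence']));
```

Here `transformed` keeps only `answer` and `confidence`; any field that was null or empty would be dropped before the JSON is forwarded.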
3.4 Graph — Define Execution Order
graph Pipeline(input: UserRequest, output: Analysis, budget: 10k) {
Analyst -> Reviewer -> done
}
- input: Pipeline input (context name)
- output: Final output (produces name)
- budget: Total token budget
Parallel execution:
graph Review(input: PR, output: FinalReview, budget: 40k) {
parallel { SecurityReviewer LogicReviewer PerfReviewer }
-> SeniorReviewer -> done
}
All three reviewers run concurrently. Once all complete, SeniorReviewer executes.
3.5 Conditional Routing
edge RiskAssessor -> {
when risk_score > 0.7 -> DetailedReviewer
when risk_score > 0.3 -> StandardReviewer
else -> AutoApprove
}
Evaluates risk_score from RiskAssessor’s output to determine the next node.
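Conceptually, the generated router evaluates the `when` clauses top to bottom and records the first match. This sketch is illustrative only (the real hook reads the node's JSON output and records the decision under `.graft/session/routing/`):

```javascript
// Illustrative routing logic for the RiskAssessor edge above
// (not the exact generated hook code).
function route(output) {
  // 'when' clauses are evaluated top to bottom; first match wins
  if (output.risk_score > 0.7) return 'DetailedReviewer';
  if (output.risk_score > 0.3) return 'StandardReviewer';
  return 'AutoApprove'; // else branch
}
```

Note the ordering matters: a `risk_score` of 0.8 satisfies both conditions but routes to `DetailedReviewer` because its clause appears first.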
3.6 Memory — Persistent State Across Runs
memory ConversationLog(max_tokens: 2k, storage: file) {
turns: List<Turn { role: String content: String }>
summary: Optional<String>
}
node Responder(model: sonnet, budget: 4k/2k) {
reads: [ConversationLog]
writes: [ConversationLog]
produces Response { reply: String }
}
Memory is stored as JSON files in .graft/memory/ and persists between pipeline runs.
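A memory write conceptually appends to the persisted JSON and re-saves it. The sketch below is illustrative — the `maxTurns` cap is an assumption for demonstration; the actual store enforces the declared `max_tokens`, not a turn count:

```javascript
// Sketch of a memory update for ConversationLog (illustrative).
// State round-trips through .graft/memory/ConversationLog.json between runs.
function appendTurn(memory, turn, maxTurns = 50) {
  const turns = [...(memory.turns ?? []), turn];
  // Bound growth (hypothetical cap; the real store enforces max_tokens)
  return { ...memory, turns: turns.slice(-maxTurns) };
}

const before = { turns: [{ role: 'user', content: 'hi' }], summary: null };
const after = appendTurn(before, { role: 'assistant', content: 'hello' });
```

Because the update returns a new object, the previous state stays intact until the file is rewritten.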
3.7 Import — Split Across Files
import { UserMessage, SystemConfig } from "./shared.gft"
Import contexts, nodes, etc. from other .gft files.
4. Compile
graft compile pipeline.gft
Generated file structure:
.claude/
  agents/
    analyst.md             ← Agent definition (prompt, model, tools, schema)
    reviewer.md
  hooks/
    analyst-to-reviewer.js ← Edge transform (Node.js script)
  CLAUDE.md                ← Orchestration plan (execution order, I/O paths)
  settings.json            ← Model routing, hook registration, token budget
.graft/
  session/
    node_outputs/          ← Directory where each node's output is stored
    routing/               ← Conditional routing decisions (when conditional edges exist)
  token_log.txt            ← Token usage log
  memory/                  ← Memory state (when memory declarations exist)
Specify output directory
graft compile pipeline.gft --out-dir ./my-output
5. Running in Claude Code
5.1 Create Input
echo '{"question": "What are the differences between TypeScript and JavaScript?"}' > .graft/session/input.json
5.2 Launch Claude Code
claude
Then instruct Claude Code:
Follow the execution plan in .claude/CLAUDE.md. The input is at .graft/session/input.json.
5.3 What Happens
Claude Code reads .claude/CLAUDE.md and executes:
- Step 1: Runs the Analyst agent → saves output to `.graft/session/node_outputs/analyst.json`
- Hook auto-fires: When `analyst.json` is created, `analyst-to-reviewer.js` runs automatically → creates `analyst_to_reviewer.json` (transformed data)
- Step 2: Reviewer agent reads the transformed input → saves output to `.graft/session/node_outputs/reviewer.json`
5.4 Check Results
cat .graft/session/node_outputs/reviewer.json
5.5 Check Token Usage
cat .graft/token_log.txt
6. Other CLI Commands
graft check — Validate Only
graft check pipeline.gft
Parses, scope-checks, type-checks, and analyzes tokens without generating files. Useful in CI.
graft run — Compile + Execute + Validate
graft run pipeline.gft --input input.json --dry-run
- `--input`: Input JSON file
- `--dry-run`: Simulate execution without spawning subprocesses
- `--json`: Machine-readable JSON output
- `--verbose`: Print execution details
- `--timeout <seconds>`: Subprocess timeout (default: 300)
How graft run works under the hood:
- Compiles the `.gft` file (same as `graft compile`)
- For each node in the execution plan, spawns a `claude` CLI subprocess
- Each subprocess gets the agent's prompt, reads its input, and produces JSON output
- Edge transforms run between nodes (same JavaScript hooks as in compiled output)
- Conditional routing is evaluated after each node completes
- Result formatter shows a human-readable summary with node table and token usage bar
- Quality validator checks every node output against the `.gft` schema
- Feedback engine suggests concrete `.gft` modifications for any issues found
In --dry-run mode, no subprocesses are spawned — the pipeline structure is validated and execution is simulated with placeholder outputs.
Example output:
Graph 'DataAnalysis' completed in 15.2s
✓ Classifier haiku 1.2s 1,200 tok
✓ StatAnalyzer sonnet 3.4s 5,420 tok
✓ TrendAnalyzer sonnet 3.1s 5,420 tok
✓ ReportWriter opus 7.5s 15,520 tok
Token usage: 27,560 / 40,000 (69%)
[████████████████████░░░░░░░░░░]
── Quality Check ─────────────────────────────────
✓ executive_summary OK [ReportWriter.executive_summary]
✓ findings OK [ReportWriter.findings]
⚠ Field 'recommendations' is empty [ReportWriter.recommendations]
✓ Token budget OK: 69%
────────────────────────────────────────────────
Quality: 75% (3/4 checks passed)
── Suggestions ───────────────────────────────────
⚠ Field 'recommendations' is empty.
→ Increase ReportWriter output budget: budget: 10k/10k
────────────────────────────────────────────────
graft generate — Natural Language to .gft
graft generate "code review pipeline with parallel security and logic reviewers"
graft generate "chatbot with conversation memory" --output chatbot.gft
Calls Claude Code as a subprocess to generate a .gft file from a natural language description. Validates the output with the Graft compiler and retries up to 2 times on parse failure.
For environments with Claude Code already open, just describe what you want in conversation — graft init has already injected the syntax spec.
graft watch — File Watcher + Auto-Recompile
graft watch pipeline.gft
Automatically recompiles whenever the .gft file changes. Useful during development alongside your editor.
graft visualize — DAG Visualization
graft visualize pipeline.gft
Outputs the pipeline structure as a Mermaid diagram:
graph TD
Analyst["Analyst<br/><small>sonnet</small>"]
Reviewer["Reviewer<br/><small>haiku</small>"]
Analyst -->|select → compact| Reviewer
Paste into GitHub README/docs for rendered diagrams, or use Mermaid Live Editor to preview.
graft fmt — Format .gft Source
graft fmt pipeline.gft # Print formatted output to stdout
graft fmt pipeline.gft -w # Write formatted output back to file
graft fmt pipeline.gft --check # Check if already formatted (exit 1 if not)
Parses the .gft file and pretty-prints it with consistent indentation, spacing, and ordering.
graft test — Pipeline Testing
graft test pipeline.gft # Auto-generate test input
graft test pipeline.gft --input '{"question":"hi"}' # Explicit input
graft test pipeline.gft --verbose # Show node outputs
Runs the pipeline in dry-run mode and validates all node outputs against their produces schemas. If no --input is provided, generates minimal valid test data from the graph’s input context schema.
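When no `--input` is supplied, minimal test data is derived from the input context's field types. Conceptually that looks like this sketch (type handling simplified; not Graft's actual generator):

```javascript
// Produce a minimal placeholder value for a simplified type name
// (illustrative; real .gft types include ranges, enums, and structs).
function minimalValue(type) {
  switch (type) {
    case 'String': return 'test';
    case 'Int': return 0;
    case 'Float': return 0.0;
    case 'Bool': return false;
    default:
      if (type.startsWith('List')) return [];
      if (type.startsWith('Optional')) return null;
      return {};
  }
}

// Build a minimal input object from a { field: type } schema
function minimalInput(schema) {
  return Object.fromEntries(
    Object.entries(schema).map(([field, type]) => [field, minimalValue(type)])
  );
}
```

For the UserRequest context from section 3.1, `minimalInput({ question: 'String' })` yields a valid one-field input object.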
Multi-Backend Compilation
graft compile pipeline.gft --backend claude # Default: Claude Code harness
graft compile pipeline.gft --backend generic # Tool-agnostic output
The generic backend produces tool-agnostic markdown agents and documentation, useful as a starting point for adapting to other AI coding assistants.
7. Generated File Details
.claude/agents/*.md — Agent Definitions
---
model: claude-sonnet-4-20250514
tools: []
---
# Analyst Agent
## Task
Read the input and produce a structured Analysis.
## Input
- `.graft/session/input.json` (UserRequest)
## Output Schema
Write your output as JSON to `.graft/session/node_outputs/analyst.json`:
{
"answer": "<string>",
"confidence": <number 0-1>
}
When done, output: ===NODE_COMPLETE:analyst===
Claude Code reads the model from frontmatter and uses the content as the agent prompt.
.claude/hooks/*.js — Edge Transform Hooks
Registered as PostToolUse hooks. When Claude Code writes a node output using the Write tool, the hook fires automatically.
Example (analyst-to-reviewer.js):
const fs = require('fs');

const INPUT = '.graft/session/node_outputs/analyst.json';
const OUTPUT = '.graft/session/node_outputs/analyst_to_reviewer.json';

// select(answer, confidence)
const data = JSON.parse(fs.readFileSync(INPUT, 'utf-8'));
let result = {
  "answer": data["answer"],
  "confidence": data["confidence"]
};

// compact — helper emitted inline by the compiler
result = compact(result);
fs.writeFileSync(OUTPUT, JSON.stringify(result));
.claude/CLAUDE.md — Orchestration Plan
Describes execution order for Claude Code: which agents to run, in what order, with what inputs/outputs, and token budgets per step.
.claude/settings.json — Settings
{
"model": "claude-sonnet-4-20250514",
"permissions": { "allow": ["Read", "Write", "Edit", "Bash", "Skill"] },
"graft": {
"budget": { "total": 10000 },
"model_routing": {
"default": "claude-sonnet-4-20250514",
"overrides": { "reviewer": "claude-haiku-4-5-20251001" }
}
},
"hooks": {
"PostToolUse": [{
"matcher": "Write",
"hooks": [{
"type": "command",
"command": "node .claude/hooks/analyst-to-reviewer.js",
"if": "Write(.graft/session/node_outputs/analyst.json)"
}]
}]
}
}
8. Examples
Code Review Pipeline
Three reviewers analyze a PR in parallel, then a senior reviewer makes the final call.
graph AdversarialReview(input: PullRequest, output: FinalReview, budget: 40k) {
parallel { SecurityReviewer LogicReviewer PerfReviewer }
-> SeniorReviewer -> done
}
Edge transforms forward only key findings to the senior reviewer:
edge SecurityReviewer -> SeniorReviewer | select(vulnerabilities, severity) | compact
edge LogicReviewer -> SeniorReviewer | select(bugs, edge_cases) | compact
Full source: examples/code-review.gft
Content Pipeline
Research → Draft → Edit → Metadata extraction. Memory references previously published articles.
graph ContentPipeline(input: Brief, output: ArticleMetadata, budget: 40k) {
Researcher -> Drafter -> Editor -> MetadataExtractor -> done
}
Full source: examples/content-pipeline.gft
Data Analysis Pipeline
Classify data → parallel analysis (statistics + trends) → write report.
graph DataAnalysis(input: RawData, output: AnalysisReport, budget: 50k) {
Classifier
-> parallel { StatAnalyzer TrendAnalyzer }
-> ReportWriter -> done
}
Full source: examples/data-analysis.gft
Conditional Routing
Route to different reviewers based on risk assessment:
edge RiskAssessor -> {
when risk_score > 0.7 -> DetailedReviewer
when risk_score > 0.3 -> StandardReviewer
else -> AutoApprove
}
Full source: benchmarks/correctness/conditional_edge.gft
9. Type System
String — String
Int — Integer
Float — Float
Float(0..1) — Range-bounded float
Bool — Boolean
List<T> — List
Map<K, V> — Map
Optional<T> — Optional value
TokenBounded<String, 100> — Token-bounded string
enum(low, medium, high) — Inline enum
Issue { file: FilePath, severity: enum(low, medium, high) } — Inline struct
Domain Types
- FilePath — File path
- FileDiff — Diff text
- TestFile — Test file
- IssueRef — Issue reference
10. Error Messages
Graft provides rustc-style error messages:
error[SCOPE_UNDEFINED_REF]: 'Inpt' is not declared as a context, produces output, or memory
--> pipeline.gft:6:11
|
6 | reads: [Inpt]
| ^^^^
|
= help: did you mean 'Input'?
Typos are caught with fuzzy matching suggestions.
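The "did you mean" hint comes from fuzzy matching the unknown name against all declared names — roughly an edit-distance comparison like this sketch (illustrative; Graft's matcher may use a different metric or threshold):

```javascript
// Levenshtein edit distance between two identifiers
function editDistance(a, b) {
  const dp = Array.from({ length: a.length + 1 }, (_, i) =>
    Array.from({ length: b.length + 1 }, (_, j) => (i === 0 ? j : j === 0 ? i : 0))
  );
  for (let i = 1; i <= a.length; i++)
    for (let j = 1; j <= b.length; j++)
      dp[i][j] = Math.min(
        dp[i - 1][j] + 1,                                  // deletion
        dp[i][j - 1] + 1,                                  // insertion
        dp[i - 1][j - 1] + (a[i - 1] === b[j - 1] ? 0 : 1) // substitution
      );
  return dp[a.length][b.length];
}

// Suggest the closest declared name if it is close enough
// (the distance-2 cutoff here is an assumption)
function didYouMean(name, declared) {
  const best = declared
    .map((d) => [d, editDistance(name, d)])
    .sort((x, y) => x[1] - y[1])[0];
  return best && best[1] <= 2 ? best[0] : null;
}
```

For the error above, `didYouMean('Inpt', ['Input', 'Analysis'])` returns `'Input'` because the two names differ by a single insertion.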
11. Programmatic API
import { compile } from '@jsleekr/graft/compiler';
import { Executor } from '@jsleekr/graft/runtime';
import type { Program } from '@jsleekr/graft/types';
const result = compile(source, 'pipeline.gft');
if (result.success) {
console.log(`Parsed ${result.program.nodes.length} nodes`);
console.log(`Token estimate: ${result.report.bestCase}`);
for (const file of result.files) {
console.log(`Generated: ${file.path}`);
}
}
12. The Result Loop
After every graft run, three modules process the results automatically:
12.1 Result Formatter
Converts raw execution data into a human-readable summary:
- Node table with status icon, model, duration, and token count
- Token usage bar with percentage and warnings at 80%/90%/95%
- Final output field summary (arrays show count, strings truncated)
Use --json for machine-readable output.
12.2 Quality Validator
Checks every node’s output against the .gft schema:
| Check | What it validates |
|---|---|
| Schema | Are all declared produces fields present? |
| Type | Is a String actually a string? Is a List actually an array? |
| Range | Is Float(0..1) actually between 0 and 1? |
| Empty | Are lists or strings unexpectedly empty? |
| Budget | Did token usage exceed 80% (warn) or 95% (fail)? |
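In spirit, the per-field checks work like this sketch (an illustrative helper, not the validator's actual API):

```javascript
// Check one output field against a simplified spec.
// Returns 'ok' or a short description of the first failed check.
function checkField(name, value, spec) {
  if (value === undefined) return `missing: ${name}`;          // Schema check
  if (spec.type === 'String' && typeof value !== 'string') return `type: ${name}`;
  if (spec.type === 'List' && !Array.isArray(value)) return `type: ${name}`;
  if (spec.range && (value < spec.range[0] || value > spec.range[1]))
    return `range: ${name}`;                                   // e.g. Float(0..1)
  if ((Array.isArray(value) || typeof value === 'string') && value.length === 0)
    return `empty: ${name}`;                                   // Empty check
  return 'ok';
}
```

For example, a `confidence` of 1.2 against `Float(0..1)` fails the range check, and an empty `recommendations` list triggers the empty-field warning shown in the `graft run` output above.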
12.3 Feedback Engine
When quality issues are found, suggests specific .gft modifications:
| Problem | Suggestion |
|---|---|
| Empty field | Increase node output budget |
| Budget exhaustion | Add edge transforms (truncate, compact) |
| Node failure | Add on_failure: retry(2) |
| Type mismatch | Check produces schema |
| Missing field | Check node prompt or input data |
The feedback maps to concrete .gft changes you can apply directly, then rerun.
13. Using Graft in Coding Workflows
Coding tasks mix natural-language steps (planning, review, analysis) with code execution steps (writing files, running tests). Graft handles the NL parts:
Plan (NL) → Code (direct) → Review (NL) → Fix (direct) → Final Review (NL)
   ↑                            ↑                              ↑
optional                      Graft                          Graft
The pattern:
- Plan — optionally use a Graft pipeline to generate a structured plan
- Code — run code execution directly through Claude Code (file writes, tests, builds)
- Review — use a Graft review pipeline (`graft run review.gft`) with parallel reviewers
- Fix — apply fixes directly through Claude Code
- Final Review — run the Graft pipeline again to validate
Example: code review after implementation
// review.gft — run after code is written
context CodeChanges(max_tokens: 3k) {
diff: String
description: String
}
node SecurityReviewer(model: sonnet, budget: 6k/3k) {
reads: [CodeChanges]
produces SecurityAnalysis { vulnerabilities: List<String>, severity: String }
}
node LogicReviewer(model: sonnet, budget: 6k/3k) {
reads: [CodeChanges]
produces LogicAnalysis { bugs: List<String>, edge_cases: List<String> }
}
node SeniorReviewer(model: opus, budget: 8k/4k) {
reads: [CodeChanges, SecurityAnalysis, LogicAnalysis]
produces FinalReview { approved: Bool, action_items: List<String> }
}
edge SecurityReviewer -> SeniorReviewer | select(vulnerabilities, severity) | compact
edge LogicReviewer -> SeniorReviewer | select(bugs, edge_cases) | compact
graph Review(input: CodeChanges, output: FinalReview, budget: 30k) {
parallel { SecurityReviewer LogicReviewer }
-> SeniorReviewer -> done
}
# After writing code, run the review pipeline
graft run review.gft --input '{"diff": "...", "description": "..."}'
This gives you structured, parallel code review with token budget control — while the actual coding stays in direct Claude Code sessions.
14. Execution Model and Limitations
How Graft Differs from Runtime Orchestrators
Graft is a compiler, not a runtime orchestrator like LangGraph or CrewAI.
| | Graft | LangGraph / CrewAI |
|---|---|---|
| Execution control | LLM follows generated instructions | Deterministic state machine |
| Token optimization | Compile-time analysis + edge transforms | Manual / none |
| Runtime dependency | Claude Code | Python runtime |
| Deployment | `.claude/` files, zero runtime | Application server |
What this means in practice:
- The orchestration plan (`.claude/CLAUDE.md`) is a natural-language prompt. Claude Code interprets it, but there's no hard guarantee of exact execution order.
- Edge transform hooks (`.claude/hooks/*.js`) are deterministic — they run as Node.js scripts triggered by PostToolUse events.
- Model routing and permissions (`.claude/settings.json`) are deterministic.
- The compile-time token analysis is deterministic.
In other words: the data pipeline is deterministic; the orchestration is best-effort.
Known Limitations
- Non-deterministic orchestration: Claude Code usually follows `CLAUDE.md` faithfully, but complex pipelines may require explicit re-prompting.
- Claude Code dependency: If the `.claude/` structure format changes upstream, Graft's codegen must be updated.
- Single provider: Only Anthropic Claude models are supported currently.
- Memory: Only JSON file storage works. Database backends are specified but not implemented.
- Conditional edge codegen: Router hooks evaluate conditions correctly, but the orchestration plan may need Claude Code to read the routing file manually in complex cases.
15. Troubleshooting
Compile error: “is not declared”
A name in reads references something that doesn’t exist. Check your context, produces, and memory names.
Hooks not firing
Verify the hook is registered in settings.json under hooks.PostToolUse. Check that the if field path matches the actual output path.
Token budget exceeded warning
Run graft check to see token analysis. Either increase the node budget or add edge transforms to reduce the data passed between agents.
Claude Code not following the plan
Open .claude/CLAUDE.md directly to inspect the execution plan. Explicitly instruct Claude Code: “Read .claude/CLAUDE.md and follow the execution plan.”
Path issues on Windows
Graft uses POSIX paths internally. On Windows, paths are normalized automatically — no extra configuration needed.