API Reference

This section provides detailed API documentation for BRAID-DSPy.

Core Classes

BraidReasoning

Main module for BRAID reasoning in DSPy.

class braid.module.BraidReasoning(*args, **kwargs)[source]

Bases: Module

BRAID reasoning module for DSPy.

This module implements the BRAID (Bounded Reasoning for Autonomous Inference and Decisions) architecture: 1. Planning Phase: Generate a Guided Reasoning Diagram (GRD) in Mermaid format 2. Execution Phase: Execute the GRD step by step to solve the problem

Example

>>> import dspy
>>> from braid import BraidReasoning
>>>
>>> lm = dspy.OpenAI(model="gpt-4")
>>> dspy.configure(lm=lm)
>>>
>>> braid = BraidReasoning()
>>> result = braid(problem="If a train travels 120 km in 2 hours, what is its speed?")
>>> print(result.answer)
>>> print(result.grd)

__init__(use_generator: bool = True, max_execution_steps: int = 20, validate_grd: bool = True)[source]

Initialize the BRAID reasoning module.

Parameters:

use_generator – Whether to use GRDGenerator for planning (True) or direct LLM call (False)
max_execution_steps – Maximum number of steps to execute
validate_grd – Whether to validate GRD syntax before execution

forward(problem: str, grd: str | None = None, problem_type: str | None = None) → BraidResult[source]

Execute BRAID reasoning on a problem.

Parameters:

problem – The problem to solve
grd – Optional pre-generated GRD (if None, will be generated)
problem_type – Optional problem type hint for generation

Returns:

BraidResult object containing GRD, reasoning steps, and answer

__call__(problem: str, **kwargs) → BraidResult[source]: Make the module callable.

BraidResult

Result object returned by BraidReasoning module.

class braid.module.BraidResult(problem: str, grd: str, parsed_grd: GRDStructure | None, reasoning_steps: List[Dict[str, Any]], answer: str, execution_trace: List[Dict[str, Any]], valid: bool, error: str | None = None)[source]

Result object returned by BraidReasoning module.

problem: str

grd: str

parsed_grd: GRDStructure | None

reasoning_steps: List[Dict[str, Any]]

answer: str

execution_trace: List[Dict[str, Any]]

valid: bool

error: str | None = None

__init__(problem: str, grd: str, parsed_grd: GRDStructure | None, reasoning_steps: List[Dict[str, Any]], answer: str, execution_trace: List[Dict[str, Any]], valid: bool, error: str | None = None) → None

Parser

MermaidParser

Parser for Mermaid flowchart diagrams.

class braid.parser.MermaidParser[source]

Bases: object

Parser for Mermaid flowchart diagrams.

NODE_PATTERNS = [('(\\w+)\\[\\[(.*?)\\]\\]', NodeType.SUBROUTINE), ('(\\w+)\\[\$(.*?)\$\\]', NodeType.STADIUM), ('(\\w+)\$\\((.*?)\$\\)', NodeType.CIRCLE), ('(\\w+)\\{(.*?)\\}', NodeType.DIAMOND), ('(\\w+)\\{\\{(.*?)\\}\\}', NodeType.HEXAGON), ('(\\w+)\$(.*?)\$', NodeType.ROUNDED), ('(\\w+)\\[(.*?)\\]', NodeType.RECTANGLE), ('(\\w+)\\[/?(.*?)[/\\\\]\\]', NodeType.PARALLELOGRAM)]

EDGE_PATTERN = '(\\w+)\\s*(--[>]|==[>])\\s*(\\w+)|(\\w+)\\s*--[->]\\s*\\|\\s*(.*?)\\s*\\|\\s*(\\w+)'

__init__()[source]: Initialize the parser.

parse(mermaid_code: str) → GRDStructure[source]

Parse Mermaid flowchart code into a GRD structure.

Parameters:: mermaid_code – Mermaid diagram code (flowchart format)
Returns:: GRDStructure object containing parsed nodes and edges
Raises:: ValueError – If the Mermaid code is invalid or cannot be parsed

validate(mermaid_code: str) → Tuple[bool, str | None][source]

Validate Mermaid code syntax.

Parameters:: mermaid_code – Mermaid diagram code to validate
Returns:: Tuple of (is_valid, error_message)

extract_execution_steps(grd: GRDStructure) → List[Dict[str, Any]][source]

Extract execution steps from a GRD structure.

Parameters:: grd – Parsed GRD structure
Returns:: List of execution steps with node information

GRDStructure

Structure representing a parsed GRD.

class braid.parser.GRDStructure(nodes: ~typing.List[~braid.parser.GRDNode] = <factory>, edges: ~typing.List[~braid.parser.GRDEdge] = <factory>, start_nodes: ~typing.List[str] = <factory>, end_nodes: ~typing.List[str] = <factory>, metadata: ~typing.Dict[str, ~typing.Any] = <factory>)[source]

Complete structure of a Guided Reasoning Diagram.

nodes: List[GRDNode]

edges: List[GRDEdge]

start_nodes: List[str]

end_nodes: List[str]

metadata: Dict[str, Any]

get_node_by_id(node_id: str) → GRDNode | None[source]: Get a node by its ID.

get_outgoing_edges(node_id: str) → List[GRDEdge][source]: Get all edges outgoing from a node.

get_incoming_edges(node_id: str) → List[GRDEdge][source]: Get all edges incoming to a node.

get_execution_order() → List[str][source]: Get the execution order of nodes using topological sort. Returns a list of node IDs in execution order.

__init__(nodes: ~typing.List[~braid.parser.GRDNode] = <factory>, edges: ~typing.List[~braid.parser.GRDEdge] = <factory>, start_nodes: ~typing.List[str] = <factory>, end_nodes: ~typing.List[str] = <factory>, metadata: ~typing.Dict[str, ~typing.Any] = <factory>) → None

Generator

GRDGenerator

Generator for Guided Reasoning Diagrams.

class braid.generator.GRDGenerator(examples: List[Dict[str, str]] | None = None, max_retries: int = 3, temperature: float = 0.3, use_dspy_predict: bool = True)[source]

Bases: object

Generator for Guided Reasoning Diagrams in Mermaid format.

DEFAULT_EXAMPLES = [{'grd': '```mermaid\nflowchart TD\n Start[Read and understand problem] --> Extract[Extract given values]\n Extract --> Identify[Identify what to find]\n Identify --> Formula[Recall speed formula]\n Formula --> Apply[Apply: divide distance by time]\n Apply --> Calculate[Perform the division]\n Calculate --> Verify[Verify units are correct]\n Verify --> Answer[State the final speed]\n```', 'problem': 'If a train travels 120 km in 2 hours, what is its speed?'}, {'grd': '```mermaid\nflowchart TD\n Start[Analyze the equation] --> Goal[Goal: isolate x]\n Goal --> Subtract[Subtract constant from both sides]\n Subtract --> Simplify1[Simplify the right side]\n Simplify1 --> Divide[Divide both sides by coefficient]\n Divide --> Simplify2[Simplify to get x value]\n Simplify2 --> Check[Verify: substitute back]\n Check --> Answer[State the solution]\n```', 'problem': 'Solve: 3x + 5 = 14'}, {'grd': '```mermaid\nflowchart TD\n Start[Understand the scenario] --> Values[Identify: price and quantity]\n Values --> Operation[Determine operation: multiplication]\n Operation --> Calculate[Multiply price by quantity]\n Calculate --> Answer[State total cost]\n```', 'problem': 'A store sells apples at $2 each. If John buys 5 apples, how much does he pay?'}]

__init__(examples: List[Dict[str, str]] | None = None, max_retries: int = 3, temperature: float = 0.3, use_dspy_predict: bool = True)[source]

Initialize the GRD Generator.

Parameters:

examples – Few-shot examples for GRD generation
max_retries – Maximum number of retries if generation fails
temperature – Temperature for LLM generation (lower = more deterministic)
use_dspy_predict – Whether to use DSPy’s Predict API (recommended)

generate(problem: str, problem_type: str | None = None, custom_instructions: str | None = None) → Dict[str, Any][source]

Generate a GRD for a given problem.

Parameters:

problem – The problem to solve
problem_type – Optional type hint (e.g., “math”, “logic”, “reasoning”)
custom_instructions – Optional custom instructions for generation

Returns:

grd: Mermaid code string
raw_response: Raw LLM response
parsed_structure: Parsed GRDStructure object
valid: Whether the GRD is valid

Return type:

Dictionary containing

add_example(problem: str, grd: str)[source]

Add a custom example to the generator.

Parameters:

problem – Example problem
grd – Example GRD in Mermaid format

get_template(problem_type: str) → str | None[source]

Get a template GRD for a specific problem type.

Parameters:: problem_type – Type of problem (e.g., “math”, “logic”, “reasoning”)
Returns:: Template Mermaid code or None

Optimizer

BraidOptimizer

BRAID-aware optimizer for DSPy.

class braid.optimizer.BraidOptimizer(*args, **kwargs)[source]

Bases: Module

BRAID-aware optimizer for DSPy.

This optimizer extends DSPy’s optimization capabilities by: 1. Optimizing GRD generation quality 2. Optimizing step-by-step execution 3. Providing GRD-specific metrics

__init__(base_optimizer: Module | None = None, grd_quality_weight: float = 0.5, execution_quality_weight: float = 0.5)[source]

Initialize the BRAID optimizer.

Parameters:

base_optimizer – Base DSPy optimizer to use (e.g., MIPROv2)
grd_quality_weight – Weight for GRD quality in optimization
execution_quality_weight – Weight for execution quality in optimization

optimize(module: BraidReasoning, trainset: List[Dict[str, Any]], metric: Callable | None = None, num_threads: int = 1) → BraidReasoning[source]

Optimize a BraidReasoning module.

Parameters:

module – The BraidReasoning module to optimize
trainset – Training examples with ‘problem’ and optionally ‘answer’ keys
metric – Optional custom metric function
num_threads – Number of threads for parallel optimization

Returns:

Optimized BraidReasoning module

evaluate(module: BraidReasoning, testset: List[Dict[str, Any]], metric: Callable | None = None) → Dict[str, float][source]

Evaluate a BraidReasoning module on a test set.

Parameters:

module – The BraidReasoning module to evaluate
testset – Test examples with ‘problem’ and optionally ‘answer’ keys
metric – Optional custom metric function

Returns:

Dictionary of evaluation metrics

GRDMetrics

Metrics for evaluating GRD quality.

class braid.optimizer.GRDMetrics[source]

Metrics for evaluating GRD quality.

Includes both structural metrics and BRAID protocol compliance metrics: - Structural validity - Completeness - Execution traceability - Atomicity (token density) - Masking compliance - Procedural scaffolding

static structural_validity(grd: str) → float[source]

Evaluate structural validity of a GRD.

Returns:: Score between 0.0 and 1.0 (1.0 = perfectly valid)

static completeness(grd_structure: GRDStructure) → float[source]

Evaluate completeness of a GRD (has start, end, reasonable number of steps).

Returns:: Score between 0.0 and 1.0

static execution_traceability(grd_structure: GRDStructure) → float[source]

Evaluate how traceable/executable the GRD is.

Returns:: Score between 0.0 and 1.0

static atomicity_score(grd_structure: GRDStructure, max_tokens: int = 15) → float[source]

Evaluate node atomicity (token density) compliance.

According to BRAID research, nano-scale models perform best when node labels contain fewer than 15 tokens.

Parameters:

grd_structure – Parsed GRD structure
max_tokens – Maximum tokens allowed per node

Returns:

Score between 0.0 and 1.0 (1.0 = all nodes within limit)

static masking_compliance(grd: str) → float[source]

Evaluate compliance with numerical masking protocol.

Detects potential answer leakage where computed values appear in node labels.

Parameters:: grd – Mermaid GRD code
Returns:: Score between 0.0 and 1.0 (1.0 = no leakage detected)

static procedural_scaffolding_score(grd: str) → float[source]

Evaluate adherence to procedural scaffolding rules.

Good GRDs describe HOW to solve, not WHAT the answer is.

Parameters:: grd – Mermaid GRD code
Returns:: Score between 0.0 and 1.0

static overall_quality(grd: str, grd_structure: GRDStructure | None = None) → float[source]

Calculate overall GRD quality score.

Parameters:

grd – Mermaid code string
grd_structure – Optional pre-parsed structure

Returns:

Overall quality score between 0.0 and 1.0

static detailed_quality_report(grd: str, grd_structure: GRDStructure | None = None) → Dict[str, float][source]

Get a detailed breakdown of all quality metrics.

Parameters:

grd – Mermaid code string
grd_structure – Optional pre-parsed structure

Returns:

Dictionary with all individual metric scores

Signatures

BraidPlanSignature

Signature for GRD planning phase.

class braid.signatures.BraidPlanSignature(*, problem: str, grd: str)[source]

Bases: Signature

Signature for GRD planning phase.

This signature defines the input/output structure for generating a Guided Reasoning Diagram (GRD) from a problem statement.

BRAID Protocol Requirements: - Procedural Scaffolding: Describe HOW to solve, not WHAT the answer is - Atomicity: Keep each node under 15 tokens for optimal performance - No Answer Leakage: Never include computed values in the diagram

problem: str

grd: str

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

BraidExecuteSignature

Signature for GRD execution phase.

class braid.signatures.BraidExecuteSignature(*, problem: str, grd: str, current_step: str, previous_results: str = '', step_result: str)[source]

Bases: Signature

Signature for GRD execution phase.

This signature defines the input/output structure for executing a Guided Reasoning Diagram step by step.

problem: str

grd: str

current_step: str

previous_results: str

step_result: str

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

BRAID Protocol Modules

NumericalMasker

Prevents answer leakage by masking numerical values in GRDs.

class braid.masking.NumericalMasker(preserve_step_numbers: bool = True, min_value_to_mask: float | None = None, custom_patterns: List[Tuple[str, str]] | None = None)[source]

Bases: object

Masks numerical values in Mermaid diagrams to prevent answer leakage.

The BRAID architecture requires that the Architect model creates a procedural scaffold without computing actual values. This class detects and replaces numerical values with placeholders like {{VALUE_1}}, {{VALUE_2}}, etc.

Example

>>> masker = NumericalMasker()
>>> result = masker.mask("Calculate[120 ÷ 2 = 60 km/h]")
>>> print(result.masked)
Calculate[{{VALUE_1}} ÷ {{VALUE_2}} = {{VALUE_3}} km/h]

PLACEHOLDER_PREFIX = '{{VALUE_'

PLACEHOLDER_SUFFIX = '}}'

NUMERICAL_PATTERNS = [('\\$\\s*[\\d,]+\\.?\\d*', 'currency_usd'), ('€\\s*[\\d,]+\\.?\\d*', 'currency_eur'), ('£\\s*[\\d,]+\\.?\\d*', 'currency_gbp'), ('₺\\s*[\\d,]+\\.?\\d*', 'currency_try'), ('[\\d,]+\\.?\\d*\\s*%', 'percentage'), ('[\\d,]+\\.?\\d*\\s*(?:km/h|mph|m/s)', 'speed'), ('[\\d,]+\\.?\\d*\\s*(?:km|m|cm|mm|mi|ft|in)', 'distance'), ('[\\d,]+\\.?\\d*\\s*(?:kg|g|mg|lb|oz)', 'weight'), ('[\\d,]+\\.?\\d*\\s*(?:hours?|hrs?|minutes?|mins?|seconds?|secs?|days?)', 'time'), ('[\\d,]+\\.?\\d*\\s*(?:liters?|L|ml|gallons?|gal)', 'volume'), ('[\\d,]+\\.?\\d*\\s*[eE][+-]?\\d+', 'scientific'), ('\\d+\\s*/\\s*\\d+', 'fraction'), ('[\\d,]+\\.\\d+', 'decimal'), ('\\b\\d{1,}(?:,\\d{3})*\\b', 'integer')]

EXCLUDE_PATTERNS = ['Step\\s*\\d+', 'Node\\s*\\d+', '\\bA\\d+\\b', '\\bB\\d+\\b', '\\bC\\d+\\b', 'flowchart\\s+\\w+', 'graph\\s+\\w+']

__init__(preserve_step_numbers: bool = True, min_value_to_mask: float | None = None, custom_patterns: List[Tuple[str, str]] | None = None)[source]

Initialize the NumericalMasker.

Parameters:

preserve_step_numbers – If True, don’t mask step/node numbers
min_value_to_mask – Minimum value to mask (smaller values preserved)
custom_patterns – Additional patterns to detect (pattern, category)

mask(text: str) → MaskingResult[source]

Mask all numerical values in the text.

Parameters:: text – Text containing numerical values to mask
Returns:: MaskingResult with masked text and value mapping

unmask(text: str, value_mapping: Dict[str, str], computed_values: Dict[str, str] | None = None) → UnmaskingResult[source]

Restore masked values in the text.

Parameters:

text – Text with placeholders
value_mapping – Original placeholder to value mapping
computed_values – Optional computed values to use instead of originals

Returns:

UnmaskingResult with unmasked text

detect_leakage(grd: str) → List[Dict[str, str]][source]

Detect potential answer leakage in a GRD.

This method finds numerical values that might be computed answers rather than problem inputs.

Parameters:: grd – Mermaid GRD code to analyze
Returns:: List of detected potential leaks with context

mask_grd_nodes(mermaid_code: str, preserve_problem_values: bool = True) → MaskingResult[source]

Mask numerical values specifically in GRD node labels.

This method is more targeted than mask(), focusing only on content within node brackets [text], (text), {text}, etc.

Parameters:

mermaid_code – Complete Mermaid GRD code
preserve_problem_values – If True, try to preserve problem input values

Returns:

MaskingResult with masked GRD

AtomicityValidator

Validates node token density (≤15 tokens per node).

class braid.validators.AtomicityValidator(max_tokens_per_node: int = 15, strict_mode: bool = False)[source]

Bases: object

Validates node atomicity in GRDs.

According to BRAID research, nano-scale models achieve highest accuracy when node labels contain fewer than 15 tokens. This validator checks and enforces this constraint.

Example

>>> validator = AtomicityValidator()
>>> result = validator.validate_node(node)
>>> if not result.valid:
...     print(result.issues[0].suggestion)

DEFAULT_MAX_TOKENS = 15

__init__(max_tokens_per_node: int = 15, strict_mode: bool = False)[source]

Initialize the AtomicityValidator.

Parameters:

max_tokens_per_node – Maximum allowed tokens per node label
strict_mode – If True, treat violations as errors; otherwise warnings

count_tokens(text: str) → int[source]

Count tokens in a text string.

Uses a simple whitespace + punctuation tokenization that approximates what most LLMs would produce. For more accurate counting, consider using the tiktoken library with a specific model’s tokenizer.

Parameters:: text – Text to tokenize
Returns:: Number of tokens

validate_node(node: GRDNode) → ValidationResult[source]

Validate a single node’s atomicity.

Parameters:: node – GRDNode to validate
Returns:: ValidationResult with any issues found

validate_grd(grd: GRDStructure) → ValidationResult[source]

Validate all nodes in a GRD for atomicity.

Parameters:: grd – GRDStructure to validate
Returns:: ValidationResult with all issues found

ProceduralScaffoldingValidator

Validates procedural scaffolding and detects answer leakage.

class braid.validators.ProceduralScaffoldingValidator(strict_mode: bool = False)[source]

Bases: object

Validates that GRDs follow procedural scaffolding rules.

The BRAID protocol requires that GRDs describe HOW to solve a problem, not WHAT the answer is. This validator detects answer leakage and ensures nodes describe actions rather than computed values.

LEAKAGE_PATTERNS = [('=\\s*\\d+', 'EQUALS_VALUE', 'Avoid computed values in node labels'), ('(?:answer|result|solution)\\s*[:=]?\\s*\\d+', 'LABELED_ANSWER', "Don't include answers in scaffolding"), ('\\d+\\s*(?:km/h|mph|m/s|kg|lb)', 'UNIT_VALUE', 'Use placeholders instead of computed values with units'), ('(?:total|sum|difference|product)\\s*[:=]?\\s*\\d+', 'COMPUTED_AGGREGATE', 'Describe the computation, not the result')]

SCAFFOLDING_PATTERNS = ['(?:calculate|compute|find|determine|solve)\\s+', '(?:divide|multiply|add|subtract)\\s+', '(?:compare|check|verify|validate)\\s+', '(?:extract|identify|locate)\\s+', '(?:apply|use|utilize)\\s+']

__init__(strict_mode: bool = False)[source]

Initialize the ProceduralScaffoldingValidator.

Parameters:: strict_mode – If True, treat leakage as errors

validate_node(node: GRDNode) → ValidationResult[source]

Validate a single node for procedural scaffolding compliance.

Parameters:: node – GRDNode to validate
Returns:: ValidationResult with any issues found

validate_grd(grd: GRDStructure) → ValidationResult[source]

Validate all nodes in a GRD for procedural scaffolding.

Parameters:: grd – GRDStructure to validate
Returns:: ValidationResult with all issues found

GRDValidator

Comprehensive GRD validator combining all checks.

class braid.validators.GRDValidator(max_tokens_per_node: int = 15, strict_atomicity: bool = False, strict_scaffolding: bool = False)[source]

Bases: object

Comprehensive GRD validator combining all validation rules.

This is the main entry point for validating GRDs according to BRAID protocol requirements.

__init__(max_tokens_per_node: int = 15, strict_atomicity: bool = False, strict_scaffolding: bool = False)[source]

Initialize the GRDValidator.

Parameters:

max_tokens_per_node – Maximum tokens allowed per node
strict_atomicity – Treat atomicity violations as errors
strict_scaffolding – Treat scaffolding violations as errors

validate(grd: GRDStructure) → ValidationResult[source]

Perform comprehensive validation on a GRD.

Parameters:: grd – GRDStructure to validate
Returns:: Combined ValidationResult from all validators

validate_and_report(grd: GRDStructure) → str[source]

Validate a GRD and return a formatted report.

Parameters:: grd – GRDStructure to validate
Returns:: Markdown-formatted validation report

StatefulExecutionEngine

Dynamic GRD execution engine with state management.

class braid.engine.StatefulExecutionEngine(grd: GRDStructure, max_iterations_per_node: int = 3, max_total_steps: int = 50)[source]

Bases: object

Stateful execution engine for GRDs.

Unlike simple topological sorting, this engine: - Maintains execution state across steps - Supports conditional branching - Handles cycles for critic/verification loops - Provides runtime condition evaluation

DEFAULT_MAX_ITERATIONS = 3

DEFAULT_MAX_TOTAL_STEPS = 50

__init__(grd: GRDStructure, max_iterations_per_node: int = 3, max_total_steps: int = 50)[source]

Initialize the execution engine.

Parameters:

grd – The GRD structure to execute
max_iterations_per_node – Max times a single node can be executed
max_total_steps – Maximum total execution steps

reset() → None[source]: Reset the execution state.

execute(problem: str, executor: Callable[[GRDNode, Dict[str, Any]], str], initial_context: Dict[str, Any] | None = None) → ExecutionResult[source]

Execute the GRD step by step.

Parameters:

problem – The problem being solved
executor – Function that executes a single node
initial_context – Optional initial context

Returns:

ExecutionResult with complete execution details

can_reach(from_node: str, to_node: str) → bool[source]: Check if to_node is reachable from from_node.

has_cycles() → bool[source]: Check if the GRD contains cycles.

detect_cycles() → List[List[str]][source]: Detect all cycles in the GRD.

CriticDetector

Identifies critic/verification nodes in GRDs.

class braid.critic.CriticDetector[source]

Bases: object

Detects and classifies critic nodes in GRDs.

Critic nodes are special nodes that verify previous computations and can trigger retries if verification fails.

CRITIC_PATTERNS = {CriticType.CONFIRMATION: [re.compile('^Confirm[:\\s]', re.IGNORECASE), re.compile('^Make sure[:\\s]', re.IGNORECASE), re.compile('^Is this correct', re.IGNORECASE)], CriticType.REVIEW: [re.compile('^Review[:\\s]', re.IGNORECASE), re.compile('^Examine[:\\s]', re.IGNORECASE), re.compile('^Inspect[:\\s]', re.IGNORECASE)], CriticType.VALIDATION: [re.compile('^Validate[:\\s]', re.IGNORECASE), re.compile('^Ensure[:\\s]', re.IGNORECASE), re.compile('^Assert[:\\s]', re.IGNORECASE)], CriticType.VERIFICATION: [re.compile('^Check[:\\s]', re.IGNORECASE), re.compile('^Verify[:\\s]', re.IGNORECASE), re.compile('^Double[- ]?check', re.IGNORECASE)]}

FAILURE_EDGE_PATTERNS = [re.compile('fail', re.IGNORECASE), re.compile('error', re.IGNORECASE), re.compile('retry', re.IGNORECASE), re.compile('incorrect', re.IGNORECASE), re.compile('wrong', re.IGNORECASE), re.compile('no\\s*$', re.IGNORECASE)]

is_critic_node(node: GRDNode) → bool[source]: Check if a node is a critic node.

get_critic_type(node: GRDNode) → CriticType | None[source]: Get the type of critic node.

detect_critics(grd: GRDStructure) → List[CriticNode][source]

Detect all critic nodes in a GRD.

Parameters:: grd – GRDStructure to analyze
Returns:: List of detected CriticNodes with their metadata

get_feedback_loops(grd: GRDStructure) → List[Tuple[CriticNode, List[str]]][source]

Identify feedback loops in the GRD.

A feedback loop is a path from a critic node back to a previous node.

Returns:: List of (critic_node, loop_path) tuples

CriticExecutor

Manages execution with critic feedback loops.

class braid.critic.CriticExecutor(grd: GRDStructure, max_retries: int = 2)[source]

Bases: object

Executes GRDs with critic feedback loops.

This executor handles the complete cycle of: 1. Executing normal nodes 2. Executing critic nodes 3. Processing critic feedback 4. Retrying on failure (up to max retries)

DEFAULT_MAX_RETRIES = 2

__init__(grd: GRDStructure, max_retries: int = 2)[source]

Initialize the CriticExecutor.

Parameters:

grd – The GRD structure to execute
max_retries – Maximum number of retries per critic failure

is_critic_node(node_id: str) → bool[source]: Check if a node ID is a critic node.

get_critic(node_id: str) → CriticNode | None[source]: Get critic node by ID.

process_critic_output(critic: CriticNode, output: str, context: Dict[str, Any], retry_count: int) → Tuple[bool, str | None, Dict[str, Any]][source]

Process the output from a critic node.

Parameters:

critic – The critic node that was executed
output – Output from critic execution
context – Current execution context
retry_count – Number of retries already attempted

Returns:

Tuple of (should_continue, next_node_id, updated_context)

execute_with_feedback(problem: str, executor: Callable[[GRDNode, Dict[str, Any]], str], initial_context: Dict[str, Any] | None = None) → FeedbackLoopResult[source]

Execute the GRD with critic feedback loops.

Parameters:

problem – The problem being solved
executor – Function to execute a single node
initial_context – Optional initial context

Returns:

FeedbackLoopResult with complete execution details

PPDAnalyzer

Performance-per-Dollar analysis.

class braid.metrics.PPDAnalyzer(architect_model: str = 'gpt-4', solver_model: str = 'gpt-3.5-turbo', custom_configs: Dict[str, ModelConfig] | None = None)[source]

Bases: object

Performance-per-Dollar analyzer for BRAID executions.

This class tracks token usage and costs across the planning and execution phases, and provides metrics for comparing with baseline models.

Example

>>> analyzer = PPDAnalyzer(
...     architect_model="gpt-4",
...     solver_model="gpt-3.5-turbo"
... )
>>> analyzer.track_usage(TokenUsage(100, 50), "planning")
>>> report = analyzer.generate_report(accuracy=0.95)
>>> print(f"PPD Score: {report.ppd_score}")

MODEL_CONFIGS: Dict[str, ModelConfig] = {'claude-3-haiku': ModelConfig(model_id='claude-3-haiku', input_cost_per_1m=0.25, output_cost_per_1m=1.25, provider='anthropic'), 'claude-3-opus': ModelConfig(model_id='claude-3-opus', input_cost_per_1m=15.0, output_cost_per_1m=75.0, provider='anthropic'), 'claude-3-sonnet': ModelConfig(model_id='claude-3-sonnet', input_cost_per_1m=3.0, output_cost_per_1m=15.0, provider='anthropic'), 'claude-3.5-haiku': ModelConfig(model_id='claude-3.5-haiku', input_cost_per_1m=0.8, output_cost_per_1m=4.0, provider='anthropic'), 'claude-3.5-sonnet': ModelConfig(model_id='claude-3.5-sonnet', input_cost_per_1m=3.0, output_cost_per_1m=15.0, provider='anthropic'), 'claude-3.7-sonnet': ModelConfig(model_id='claude-3.7-sonnet', input_cost_per_1m=3.0, output_cost_per_1m=15.0, provider='anthropic'), 'claude-4.5-haiku': ModelConfig(model_id='claude-4.5-haiku', input_cost_per_1m=1.0, output_cost_per_1m=5.0, provider='anthropic'), 'claude-4.5-opus': ModelConfig(model_id='claude-4.5-opus', input_cost_per_1m=5.0, output_cost_per_1m=25.0, provider='anthropic'), 'claude-4.5-sonnet': ModelConfig(model_id='claude-4.5-sonnet', input_cost_per_1m=3.0, output_cost_per_1m=15.0, provider='anthropic'), 'deepseek-r1': ModelConfig(model_id='deepseek-r1', input_cost_per_1m=0.55, output_cost_per_1m=2.19, provider='local'), 'deepseek-v3': ModelConfig(model_id='deepseek-v3', input_cost_per_1m=0.28, output_cost_per_1m=0.42, provider='local'), 'gemini-1.5-flash': ModelConfig(model_id='gemini-1.5-flash', input_cost_per_1m=0.075, output_cost_per_1m=0.3, provider='google'), 'gemini-1.5-pro': ModelConfig(model_id='gemini-1.5-pro', input_cost_per_1m=1.25, output_cost_per_1m=5.0, provider='google'), 'gemini-2.0-flash': ModelConfig(model_id='gemini-2.0-flash', input_cost_per_1m=0.1, output_cost_per_1m=0.4, provider='google'), 'gemini-2.0-flash-lite': ModelConfig(model_id='gemini-2.0-flash-lite', input_cost_per_1m=0.075, output_cost_per_1m=0.3, provider='google'), 'gemini-2.0-pro-exp': ModelConfig(model_id='gemini-2.0-pro-exp', input_cost_per_1m=0.0, output_cost_per_1m=0.0, provider='google'), 'gemini-2.5-flash': ModelConfig(model_id='gemini-2.5-flash', input_cost_per_1m=0.3, output_cost_per_1m=2.5, provider='google'), 'gemini-2.5-pro': ModelConfig(model_id='gemini-2.5-pro', input_cost_per_1m=1.25, output_cost_per_1m=10.0, provider='google'), 'gemini-3.0-flash': ModelConfig(model_id='gemini-3.0-flash', input_cost_per_1m=0.5, output_cost_per_1m=3.0, provider='google'), 'gemini-3.0-pro': ModelConfig(model_id='gemini-3.0-pro', input_cost_per_1m=2.0, output_cost_per_1m=12.0, provider='google'), 'gpt-3.5-turbo': ModelConfig(model_id='gpt-3.5-turbo', input_cost_per_1m=0.5, output_cost_per_1m=1.5, provider='openai'), 'gpt-4': ModelConfig(model_id='gpt-4', input_cost_per_1m=30.0, output_cost_per_1m=60.0, provider='openai'), 'gpt-4-turbo': ModelConfig(model_id='gpt-4-turbo', input_cost_per_1m=10.0, output_cost_per_1m=30.0, provider='openai'), 'gpt-4-turbo-preview': ModelConfig(model_id='gpt-4-turbo-preview', input_cost_per_1m=10.0, output_cost_per_1m=30.0, provider='openai'), 'gpt-4o': ModelConfig(model_id='gpt-4o', input_cost_per_1m=2.5, output_cost_per_1m=10.0, provider='openai'), 'gpt-4o-mini': ModelConfig(model_id='gpt-4o-mini', input_cost_per_1m=0.15, output_cost_per_1m=0.6, provider='openai'), 'llama-3.3-70b': ModelConfig(model_id='llama-3.3-70b', input_cost_per_1m=0.1, output_cost_per_1m=0.4, provider='local'), 'llama-4-behemoth': ModelConfig(model_id='llama-4-behemoth', input_cost_per_1m=3.5, output_cost_per_1m=3.5, provider='local'), 'llama-4-maverick': ModelConfig(model_id='llama-4-maverick', input_cost_per_1m=0.22, output_cost_per_1m=0.85, provider='local'), 'llama-4-scout': ModelConfig(model_id='llama-4-scout', input_cost_per_1m=0.1, output_cost_per_1m=0.34, provider='local'), 'o1': ModelConfig(model_id='o1', input_cost_per_1m=15.0, output_cost_per_1m=60.0, provider='openai'), 'o1-mini': ModelConfig(model_id='o1-mini', input_cost_per_1m=0.15, output_cost_per_1m=0.6, provider='openai'), 'o1-preview': ModelConfig(model_id='o1-preview', input_cost_per_1m=15.0, output_cost_per_1m=60.0, provider='openai'), 'o3': ModelConfig(model_id='o3', input_cost_per_1m=2.0, output_cost_per_1m=8.0, provider='openai'), 'o3-mini': ModelConfig(model_id='o3-mini', input_cost_per_1m=1.1, output_cost_per_1m=4.4, provider='openai')}

__init__(architect_model: str = 'gpt-4', solver_model: str = 'gpt-3.5-turbo', custom_configs: Dict[str, ModelConfig] | None = None)[source]

Initialize the PPD Analyzer.

Parameters:

architect_model – Model used for GRD planning phase
solver_model – Model used for GRD execution phase
custom_configs – Optional custom model configurations

get_model_config(model_id: str) → ModelConfig[source]: Get configuration for a model.

calculate_cost(usage: TokenUsage, model_id: str) → float[source]

Calculate cost for given token usage.

Parameters:

usage – Token usage to calculate cost for
model_id – Model ID to use for pricing

Returns:

Cost in USD

track_usage(usage: TokenUsage, phase: str, step_id: str | None = None, latency_ms: float = 0.0) → StepMetrics[source]

Track token usage for a step.

Parameters:

usage – Token usage for this step
phase – “planning” or “execution”
step_id – Optional step identifier
latency_ms – Latency in milliseconds

Returns:

StepMetrics for this step

get_cost_analysis() → CostAnalysis[source]

Get complete cost analysis for all tracked usage.

Returns:: CostAnalysis with complete breakdown

estimate_baseline_cost(baseline_model: str, problem_complexity_tokens: int = 500, response_tokens: int = 200) → float[source]

Estimate cost for solving with a single baseline model.

This estimates what it would cost to solve the problem using a single model without BRAID’s split architecture.

Parameters:

baseline_model – Model to use as baseline
problem_complexity_tokens – Estimated input tokens
response_tokens – Estimated response tokens

Returns:

Estimated cost in USD

calculate_ppd_score(accuracy: float, total_cost: float | None = None) → float[source]

Calculate Performance-per-Dollar score.

PPD = Accuracy / Cost

Higher is better. A score of 100 means 100% accuracy at $0.01 cost.

Parameters:

accuracy – Accuracy between 0.0 and 1.0
total_cost – Optional override for total cost

Returns:

PPD score

compare_with_baseline(accuracy: float, baseline_model: str, baseline_accuracy: float | None = None) → PPDReport[source]

Compare BRAID execution with a baseline model.

Parameters:

accuracy – BRAID accuracy
baseline_model – Model to compare against
baseline_accuracy – Baseline model accuracy (if known)

Returns:

PPDReport with comparison metrics

generate_report(accuracy: float, baseline_model: str | None = None, format: str = 'markdown') → str[source]

Generate a human-readable performance report.

Parameters:

accuracy – Achieved accuracy
baseline_model – Optional model for comparison
format – Output format (“markdown” or “text”)

Returns:

Formatted report string

reset() → None[source]: Reset all tracking data.

SyntheticDataGenerator

Generates BRAID-compliant training data.

class braid.training.SyntheticDataGenerator(validate_output: bool = True, max_tokens_per_node: int = 15)[source]

Bases: object

Generates synthetic training data for Architect models.

This generator creates problem-GRD pairs following BRAID protocol: - Procedural scaffolding (describe HOW, not WHAT) - Atomic nodes (≤15 tokens per node) - No answer leakage

MATH_TEMPLATES = [{'answer_fn': <function SyntheticDataGenerator.<lambda>>, 'grd_template': 'flowchart TD\n Start[Read and analyze problem] --> Extract[Extract: distance and time values]\n Extract --> Identify[Identify: need to find speed]\n Identify --> Formula[Recall speed formula]\n Formula --> Apply[Apply: divide distance by time]\n Apply --> Units[Verify units are correct]\n Units --> Answer[State the final speed]', 'template': 'If a {vehicle} travels {distance} km in {time} hours, what is its speed?', 'variables': {'distance': [60, 120, 180, 240, 300, 450, 600], 'time': [1, 2, 3, 4, 5, 6], 'vehicle': ['car', 'train', 'bus', 'bicycle', 'plane']}}, {'answer_fn': <function SyntheticDataGenerator.<lambda>>, 'grd_template': 'flowchart TD\n Start[Analyze the equation] --> Goal[Goal: isolate x]\n Goal --> Subtract[Subtract constant from both sides]\n Subtract --> Simplify1[Simplify right side]\n Simplify1 --> Divide[Divide by coefficient]\n Divide --> Simplify2[Calculate x value]\n Simplify2 --> Check[Verify by substitution]\n Check --> Answer[State solution]', 'template': 'Solve: {a}x + {b} = {c}', 'variables': {'a': [2, 3, 4, 5, 6], 'b': [1, 2, 3, 4, 5, 7, 8, 10], 'c': [10, 12, 14, 15, 18, 20, 22, 25]}}, {'answer_fn': <function SyntheticDataGenerator.<lambda>>, 'grd_template': 'flowchart TD\n Start[Understand the scenario] --> Values[Identify: unit price and quantity]\n Values --> Operation[Determine operation needed]\n Operation --> Calculate[Multiply price by quantity]\n Calculate --> Format[Format as currency]\n Format --> Answer[State total cost]', 'template': 'A store sells {item} at ${price} each. If {name} buys {quantity}, how much does {pronoun} pay?', 'variables': {'item': ['apples', 'oranges', 'books', 'pens', 'notebooks'], 'name': ['John', 'Maria', 'Alex', 'Sarah'], 'price': [2, 3, 5, 8, 10, 15], 'pronoun': ['he', 'she', 'they'], 'quantity': [3, 4, 5, 6, 7, 8, 10]}}]

LOGIC_TEMPLATES = [{'answer_fn': <function SyntheticDataGenerator.<lambda>>, 'grd_template': 'flowchart TD\n Start[Identify premises] --> P1[Premise 1: All A are B]\n P1 --> P2[Premise 2: X is A]\n P2 --> Apply[Apply syllogistic reasoning]\n Apply --> Deduce[Deduce: X must be B]\n Deduce --> Answer[State conclusion]', 'template': 'If all {category_a} are {category_b}, and {item} is a {category_a}, what can we conclude?', 'variables': {'category_a': ['dogs', 'cats', 'birds', 'mammals'], 'category_b': ['animals', 'living things', 'creatures'], 'item': ['Rex', 'Fluffy', 'Tweety', 'Max']}}]

REASONING_TEMPLATES = [{'answer_fn': <function SyntheticDataGenerator.<lambda>>, 'grd_template': 'flowchart TD\n Start[Understand the situation] --> Initial[Identify initial count]\n Initial --> Change[Identify the change]\n Change --> Operation[Determine: addition needed]\n Operation --> Calculate[Add the quantities]\n Calculate --> Answer[State final count]', 'template': '{person} has {count} {items}. {person2} gives {person} {more} more. How many does {person} have now?', 'variables': {'count': [3, 5, 7, 10, 12], 'items': ['apples', 'books', 'coins', 'marbles'], 'more': [2, 3, 4, 5], 'person': ['Alice', 'Bob', 'Charlie', 'Diana'], 'person2': ['Bob', 'Carol', 'David', 'Eve']}}]

__init__(validate_output: bool = True, max_tokens_per_node: int = 15)[source]

Initialize the synthetic data generator.

Parameters:

validate_output – Whether to validate generated samples
max_tokens_per_node – Maximum tokens per node for validation

generate_math_samples(count: int) → List[TrainingSample][source]

Generate math problem samples.

Parameters:: count – Number of samples to generate
Returns:: List of TrainingSample objects

generate_logic_samples(count: int) → List[TrainingSample][source]

Generate logic problem samples.

Parameters:: count – Number of samples to generate
Returns:: List of TrainingSample objects

generate_reasoning_samples(count: int) → List[TrainingSample][source]

Generate general reasoning samples.

Parameters:: count – Number of samples to generate
Returns:: List of TrainingSample objects

generate_mixed_samples(count: int, math_ratio: float = 0.4, logic_ratio: float = 0.3, reasoning_ratio: float = 0.3) → List[TrainingSample][source]

Generate a mixed dataset of samples.

Parameters:

count – Total number of samples
math_ratio – Proportion of math problems
logic_ratio – Proportion of logic problems
reasoning_ratio – Proportion of reasoning problems

Returns:

List of TrainingSample objects

validate_samples(samples: List[TrainingSample]) → Tuple[List[TrainingSample], List[TrainingSample]][source]

Validate samples against BRAID protocol rules.

Parameters:: samples – Samples to validate
Returns:: Tuple of (valid_samples, invalid_samples)

ArchitectTrainer

Fine-tuning utilities for Architect models.

class braid.training.ArchitectTrainer[source]

Bases: object

Utilities for training/fine-tuning Architect models.

Supports: - Creating DSPy examples for BootstrapFewShot - Preparing fine-tuning datasets - Calculating dataset statistics

__init__()[source]: Initialize the trainer.

create_dspy_examples(samples: List[TrainingSample]) → List[Any][source]

Create DSPy Example objects from training samples.

Parameters:: samples – Training samples to convert
Returns:: List of dspy.Example objects

prepare_openai_finetune_dataset(samples: List[TrainingSample], system_prompt: str | None = None) → List[Dict[str, Any]][source]

Prepare dataset in OpenAI fine-tuning format.

Parameters:

samples – Training samples
system_prompt – Optional system prompt

Returns:

List of conversation dictionaries

calculate_dataset_stats(samples: List[TrainingSample]) → DatasetStats[source]

Calculate statistics for a dataset.

Parameters:: samples – Training samples
Returns:: DatasetStats object

generate_training_dataset(size: int = 100, output_path: str | None = None, format: str = 'jsonl') → List[TrainingSample][source]

Generate and optionally save a training dataset.

Parameters:

size – Number of samples to generate
output_path – Optional path to save the dataset
format – Output format (“jsonl” or “json”)

Returns:

List of generated samples