API Reference
This section provides detailed API documentation for BRAID-DSPy.
Core Classes
BraidReasoning
Main module for BRAID reasoning in DSPy.
- class braid.module.BraidReasoning(*args, **kwargs)[source]
Bases:
ModuleBRAID reasoning module for DSPy.
This module implements the BRAID (Bounded Reasoning for Autonomous Inference and Decisions) architecture: 1. Planning Phase: Generate a Guided Reasoning Diagram (GRD) in Mermaid format 2. Execution Phase: Execute the GRD step by step to solve the problem
Example
>>> import dspy >>> from braid import BraidReasoning >>> >>> lm = dspy.OpenAI(model="gpt-4") >>> dspy.configure(lm=lm) >>> >>> braid = BraidReasoning() >>> result = braid(problem="If a train travels 120 km in 2 hours, what is its speed?") >>> print(result.answer) >>> print(result.grd)
- __init__(use_generator: bool = True, max_execution_steps: int = 20, validate_grd: bool = True)[source]
Initialize the BRAID reasoning module.
- Parameters:
use_generator – Whether to use GRDGenerator for planning (True) or direct LLM call (False)
max_execution_steps – Maximum number of steps to execute
validate_grd – Whether to validate GRD syntax before execution
- forward(problem: str, grd: str | None = None, problem_type: str | None = None) BraidResult[source]
Execute BRAID reasoning on a problem.
- Parameters:
problem – The problem to solve
grd – Optional pre-generated GRD (if None, will be generated)
problem_type – Optional problem type hint for generation
- Returns:
BraidResult object containing GRD, reasoning steps, and answer
- __call__(problem: str, **kwargs) BraidResult[source]
Make the module callable.
BraidResult
Result object returned by BraidReasoning module.
- class braid.module.BraidResult(problem: str, grd: str, parsed_grd: GRDStructure | None, reasoning_steps: List[Dict[str, Any]], answer: str, execution_trace: List[Dict[str, Any]], valid: bool, error: str | None = None)[source]
Result object returned by BraidReasoning module.
- parsed_grd: GRDStructure | None
Parser
MermaidParser
Parser for Mermaid flowchart diagrams.
- class braid.parser.MermaidParser[source]
Bases:
objectParser for Mermaid flowchart diagrams.
- NODE_PATTERNS = [('(\\w+)\\[\\[(.*?)\\]\\]', NodeType.SUBROUTINE), ('(\\w+)\\[\\((.*?)\\)\\]', NodeType.STADIUM), ('(\\w+)\\(\\((.*?)\\)\\)', NodeType.CIRCLE), ('(\\w+)\\{(.*?)\\}', NodeType.DIAMOND), ('(\\w+)\\{\\{(.*?)\\}\\}', NodeType.HEXAGON), ('(\\w+)\\((.*?)\\)', NodeType.ROUNDED), ('(\\w+)\\[(.*?)\\]', NodeType.RECTANGLE), ('(\\w+)\\[/?(.*?)[/\\\\]\\]', NodeType.PARALLELOGRAM)]
- EDGE_PATTERN = '(\\w+)\\s*(--[>]|==[>])\\s*(\\w+)|(\\w+)\\s*--[->]\\s*\\|\\s*(.*?)\\s*\\|\\s*(\\w+)'
- parse(mermaid_code: str) GRDStructure[source]
Parse Mermaid flowchart code into a GRD structure.
- Parameters:
mermaid_code – Mermaid diagram code (flowchart format)
- Returns:
GRDStructure object containing parsed nodes and edges
- Raises:
ValueError – If the Mermaid code is invalid or cannot be parsed
GRDStructure
Structure representing a parsed GRD.
- class braid.parser.GRDStructure(nodes: ~typing.List[~braid.parser.GRDNode] = <factory>, edges: ~typing.List[~braid.parser.GRDEdge] = <factory>, start_nodes: ~typing.List[str] = <factory>, end_nodes: ~typing.List[str] = <factory>, metadata: ~typing.Dict[str, ~typing.Any] = <factory>)[source]
Complete structure of a Guided Reasoning Diagram.
Generator
GRDGenerator
Generator for Guided Reasoning Diagrams.
- class braid.generator.GRDGenerator(examples: List[Dict[str, str]] | None = None, max_retries: int = 3, temperature: float = 0.3, use_dspy_predict: bool = True)[source]
Bases:
objectGenerator for Guided Reasoning Diagrams in Mermaid format.
- DEFAULT_EXAMPLES = [{'grd': '```mermaid\nflowchart TD\n Start[Read and understand problem] --> Extract[Extract given values]\n Extract --> Identify[Identify what to find]\n Identify --> Formula[Recall speed formula]\n Formula --> Apply[Apply: divide distance by time]\n Apply --> Calculate[Perform the division]\n Calculate --> Verify[Verify units are correct]\n Verify --> Answer[State the final speed]\n```', 'problem': 'If a train travels 120 km in 2 hours, what is its speed?'}, {'grd': '```mermaid\nflowchart TD\n Start[Analyze the equation] --> Goal[Goal: isolate x]\n Goal --> Subtract[Subtract constant from both sides]\n Subtract --> Simplify1[Simplify the right side]\n Simplify1 --> Divide[Divide both sides by coefficient]\n Divide --> Simplify2[Simplify to get x value]\n Simplify2 --> Check[Verify: substitute back]\n Check --> Answer[State the solution]\n```', 'problem': 'Solve: 3x + 5 = 14'}, {'grd': '```mermaid\nflowchart TD\n Start[Understand the scenario] --> Values[Identify: price and quantity]\n Values --> Operation[Determine operation: multiplication]\n Operation --> Calculate[Multiply price by quantity]\n Calculate --> Answer[State total cost]\n```', 'problem': 'A store sells apples at $2 each. If John buys 5 apples, how much does he pay?'}]
- __init__(examples: List[Dict[str, str]] | None = None, max_retries: int = 3, temperature: float = 0.3, use_dspy_predict: bool = True)[source]
Initialize the GRD Generator.
- Parameters:
examples – Few-shot examples for GRD generation
max_retries – Maximum number of retries if generation fails
temperature – Temperature for LLM generation (lower = more deterministic)
use_dspy_predict – Whether to use DSPy’s Predict API (recommended)
- generate(problem: str, problem_type: str | None = None, custom_instructions: str | None = None) Dict[str, Any][source]
Generate a GRD for a given problem.
- Parameters:
problem – The problem to solve
problem_type – Optional type hint (e.g., “math”, “logic”, “reasoning”)
custom_instructions – Optional custom instructions for generation
- Returns:
grd: Mermaid code string
raw_response: Raw LLM response
parsed_structure: Parsed GRDStructure object
valid: Whether the GRD is valid
- Return type:
Dictionary containing
Optimizer
BraidOptimizer
BRAID-aware optimizer for DSPy.
- class braid.optimizer.BraidOptimizer(*args, **kwargs)[source]
Bases:
ModuleBRAID-aware optimizer for DSPy.
This optimizer extends DSPy’s optimization capabilities by: 1. Optimizing GRD generation quality 2. Optimizing step-by-step execution 3. Providing GRD-specific metrics
- __init__(base_optimizer: Module | None = None, grd_quality_weight: float = 0.5, execution_quality_weight: float = 0.5)[source]
Initialize the BRAID optimizer.
- Parameters:
base_optimizer – Base DSPy optimizer to use (e.g., MIPROv2)
grd_quality_weight – Weight for GRD quality in optimization
execution_quality_weight – Weight for execution quality in optimization
- optimize(module: BraidReasoning, trainset: List[Dict[str, Any]], metric: Callable | None = None, num_threads: int = 1) BraidReasoning[source]
Optimize a BraidReasoning module.
- Parameters:
module – The BraidReasoning module to optimize
trainset – Training examples with ‘problem’ and optionally ‘answer’ keys
metric – Optional custom metric function
num_threads – Number of threads for parallel optimization
- Returns:
Optimized BraidReasoning module
- evaluate(module: BraidReasoning, testset: List[Dict[str, Any]], metric: Callable | None = None) Dict[str, float][source]
Evaluate a BraidReasoning module on a test set.
- Parameters:
module – The BraidReasoning module to evaluate
testset – Test examples with ‘problem’ and optionally ‘answer’ keys
metric – Optional custom metric function
- Returns:
Dictionary of evaluation metrics
GRDMetrics
Metrics for evaluating GRD quality.
- class braid.optimizer.GRDMetrics[source]
Metrics for evaluating GRD quality.
Includes both structural metrics and BRAID protocol compliance metrics: - Structural validity - Completeness - Execution traceability - Atomicity (token density) - Masking compliance - Procedural scaffolding
- static structural_validity(grd: str) float[source]
Evaluate structural validity of a GRD.
- Returns:
Score between 0.0 and 1.0 (1.0 = perfectly valid)
- static completeness(grd_structure: GRDStructure) float[source]
Evaluate completeness of a GRD (has start, end, reasonable number of steps).
- Returns:
Score between 0.0 and 1.0
- static execution_traceability(grd_structure: GRDStructure) float[source]
Evaluate how traceable/executable the GRD is.
- Returns:
Score between 0.0 and 1.0
- static atomicity_score(grd_structure: GRDStructure, max_tokens: int = 15) float[source]
Evaluate node atomicity (token density) compliance.
According to BRAID research, nano-scale models perform best when node labels contain fewer than 15 tokens.
- Parameters:
grd_structure – Parsed GRD structure
max_tokens – Maximum tokens allowed per node
- Returns:
Score between 0.0 and 1.0 (1.0 = all nodes within limit)
- static masking_compliance(grd: str) float[source]
Evaluate compliance with numerical masking protocol.
Detects potential answer leakage where computed values appear in node labels.
- Parameters:
grd – Mermaid GRD code
- Returns:
Score between 0.0 and 1.0 (1.0 = no leakage detected)
- static procedural_scaffolding_score(grd: str) float[source]
Evaluate adherence to procedural scaffolding rules.
Good GRDs describe HOW to solve, not WHAT the answer is.
- Parameters:
grd – Mermaid GRD code
- Returns:
Score between 0.0 and 1.0
Signatures
BraidPlanSignature
Signature for GRD planning phase.
- class braid.signatures.BraidPlanSignature(*, problem: str, grd: str)[source]
Bases:
SignatureSignature for GRD planning phase.
This signature defines the input/output structure for generating a Guided Reasoning Diagram (GRD) from a problem statement.
BRAID Protocol Requirements: - Procedural Scaffolding: Describe HOW to solve, not WHAT the answer is - Atomicity: Keep each node under 15 tokens for optimal performance - No Answer Leakage: Never include computed values in the diagram
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
BraidExecuteSignature
Signature for GRD execution phase.
- class braid.signatures.BraidExecuteSignature(*, problem: str, grd: str, current_step: str, previous_results: str = '', step_result: str)[source]
Bases:
SignatureSignature for GRD execution phase.
This signature defines the input/output structure for executing a Guided Reasoning Diagram step by step.
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
BRAID Protocol Modules
NumericalMasker
Prevents answer leakage by masking numerical values in GRDs.
- class braid.masking.NumericalMasker(preserve_step_numbers: bool = True, min_value_to_mask: float | None = None, custom_patterns: List[Tuple[str, str]] | None = None)[source]
Bases:
objectMasks numerical values in Mermaid diagrams to prevent answer leakage.
The BRAID architecture requires that the Architect model creates a procedural scaffold without computing actual values. This class detects and replaces numerical values with placeholders like {{VALUE_1}}, {{VALUE_2}}, etc.
Example
>>> masker = NumericalMasker() >>> result = masker.mask("Calculate[120 ÷ 2 = 60 km/h]") >>> print(result.masked) Calculate[{{VALUE_1}} ÷ {{VALUE_2}} = {{VALUE_3}} km/h]
- PLACEHOLDER_PREFIX = '{{VALUE_'
- PLACEHOLDER_SUFFIX = '}}'
- NUMERICAL_PATTERNS = [('\\$\\s*[\\d,]+\\.?\\d*', 'currency_usd'), ('€\\s*[\\d,]+\\.?\\d*', 'currency_eur'), ('£\\s*[\\d,]+\\.?\\d*', 'currency_gbp'), ('₺\\s*[\\d,]+\\.?\\d*', 'currency_try'), ('[\\d,]+\\.?\\d*\\s*%', 'percentage'), ('[\\d,]+\\.?\\d*\\s*(?:km/h|mph|m/s)', 'speed'), ('[\\d,]+\\.?\\d*\\s*(?:km|m|cm|mm|mi|ft|in)', 'distance'), ('[\\d,]+\\.?\\d*\\s*(?:kg|g|mg|lb|oz)', 'weight'), ('[\\d,]+\\.?\\d*\\s*(?:hours?|hrs?|minutes?|mins?|seconds?|secs?|days?)', 'time'), ('[\\d,]+\\.?\\d*\\s*(?:liters?|L|ml|gallons?|gal)', 'volume'), ('[\\d,]+\\.?\\d*\\s*[eE][+-]?\\d+', 'scientific'), ('\\d+\\s*/\\s*\\d+', 'fraction'), ('[\\d,]+\\.\\d+', 'decimal'), ('\\b\\d{1,}(?:,\\d{3})*\\b', 'integer')]
- EXCLUDE_PATTERNS = ['Step\\s*\\d+', 'Node\\s*\\d+', '\\bA\\d+\\b', '\\bB\\d+\\b', '\\bC\\d+\\b', 'flowchart\\s+\\w+', 'graph\\s+\\w+']
- __init__(preserve_step_numbers: bool = True, min_value_to_mask: float | None = None, custom_patterns: List[Tuple[str, str]] | None = None)[source]
Initialize the NumericalMasker.
- Parameters:
preserve_step_numbers – If True, don’t mask step/node numbers
min_value_to_mask – Minimum value to mask (smaller values preserved)
custom_patterns – Additional patterns to detect (pattern, category)
- mask(text: str) MaskingResult[source]
Mask all numerical values in the text.
- Parameters:
text – Text containing numerical values to mask
- Returns:
MaskingResult with masked text and value mapping
- unmask(text: str, value_mapping: Dict[str, str], computed_values: Dict[str, str] | None = None) UnmaskingResult[source]
Restore masked values in the text.
- Parameters:
text – Text with placeholders
value_mapping – Original placeholder to value mapping
computed_values – Optional computed values to use instead of originals
- Returns:
UnmaskingResult with unmasked text
- detect_leakage(grd: str) List[Dict[str, str]][source]
Detect potential answer leakage in a GRD.
This method finds numerical values that might be computed answers rather than problem inputs.
- Parameters:
grd – Mermaid GRD code to analyze
- Returns:
List of detected potential leaks with context
- mask_grd_nodes(mermaid_code: str, preserve_problem_values: bool = True) MaskingResult[source]
Mask numerical values specifically in GRD node labels.
This method is more targeted than mask(), focusing only on content within node brackets [text], (text), {text}, etc.
- Parameters:
mermaid_code – Complete Mermaid GRD code
preserve_problem_values – If True, try to preserve problem input values
- Returns:
MaskingResult with masked GRD
AtomicityValidator
Validates node token density (≤15 tokens per node).
- class braid.validators.AtomicityValidator(max_tokens_per_node: int = 15, strict_mode: bool = False)[source]
Bases:
objectValidates node atomicity in GRDs.
According to BRAID research, nano-scale models achieve highest accuracy when node labels contain fewer than 15 tokens. This validator checks and enforces this constraint.
Example
>>> validator = AtomicityValidator() >>> result = validator.validate_node(node) >>> if not result.valid: ... print(result.issues[0].suggestion)
- DEFAULT_MAX_TOKENS = 15
- __init__(max_tokens_per_node: int = 15, strict_mode: bool = False)[source]
Initialize the AtomicityValidator.
- Parameters:
max_tokens_per_node – Maximum allowed tokens per node label
strict_mode – If True, treat violations as errors; otherwise warnings
- count_tokens(text: str) int[source]
Count tokens in a text string.
Uses a simple whitespace + punctuation tokenization that approximates what most LLMs would produce. For more accurate counting, consider using the tiktoken library with a specific model’s tokenizer.
- Parameters:
text – Text to tokenize
- Returns:
Number of tokens
- validate_node(node: GRDNode) ValidationResult[source]
Validate a single node’s atomicity.
- Parameters:
node – GRDNode to validate
- Returns:
ValidationResult with any issues found
- validate_grd(grd: GRDStructure) ValidationResult[source]
Validate all nodes in a GRD for atomicity.
- Parameters:
grd – GRDStructure to validate
- Returns:
ValidationResult with all issues found
ProceduralScaffoldingValidator
Validates procedural scaffolding and detects answer leakage.
- class braid.validators.ProceduralScaffoldingValidator(strict_mode: bool = False)[source]
Bases:
objectValidates that GRDs follow procedural scaffolding rules.
The BRAID protocol requires that GRDs describe HOW to solve a problem, not WHAT the answer is. This validator detects answer leakage and ensures nodes describe actions rather than computed values.
- LEAKAGE_PATTERNS = [('=\\s*\\d+', 'EQUALS_VALUE', 'Avoid computed values in node labels'), ('(?:answer|result|solution)\\s*[:=]?\\s*\\d+', 'LABELED_ANSWER', "Don't include answers in scaffolding"), ('\\d+\\s*(?:km/h|mph|m/s|kg|lb)', 'UNIT_VALUE', 'Use placeholders instead of computed values with units'), ('(?:total|sum|difference|product)\\s*[:=]?\\s*\\d+', 'COMPUTED_AGGREGATE', 'Describe the computation, not the result')]
- SCAFFOLDING_PATTERNS = ['(?:calculate|compute|find|determine|solve)\\s+', '(?:divide|multiply|add|subtract)\\s+', '(?:compare|check|verify|validate)\\s+', '(?:extract|identify|locate)\\s+', '(?:apply|use|utilize)\\s+']
- __init__(strict_mode: bool = False)[source]
Initialize the ProceduralScaffoldingValidator.
- Parameters:
strict_mode – If True, treat leakage as errors
- validate_node(node: GRDNode) ValidationResult[source]
Validate a single node for procedural scaffolding compliance.
- Parameters:
node – GRDNode to validate
- Returns:
ValidationResult with any issues found
- validate_grd(grd: GRDStructure) ValidationResult[source]
Validate all nodes in a GRD for procedural scaffolding.
- Parameters:
grd – GRDStructure to validate
- Returns:
ValidationResult with all issues found
GRDValidator
Comprehensive GRD validator combining all checks.
- class braid.validators.GRDValidator(max_tokens_per_node: int = 15, strict_atomicity: bool = False, strict_scaffolding: bool = False)[source]
Bases:
objectComprehensive GRD validator combining all validation rules.
This is the main entry point for validating GRDs according to BRAID protocol requirements.
- __init__(max_tokens_per_node: int = 15, strict_atomicity: bool = False, strict_scaffolding: bool = False)[source]
Initialize the GRDValidator.
- Parameters:
max_tokens_per_node – Maximum tokens allowed per node
strict_atomicity – Treat atomicity violations as errors
strict_scaffolding – Treat scaffolding violations as errors
- validate(grd: GRDStructure) ValidationResult[source]
Perform comprehensive validation on a GRD.
- Parameters:
grd – GRDStructure to validate
- Returns:
Combined ValidationResult from all validators
- validate_and_report(grd: GRDStructure) str[source]
Validate a GRD and return a formatted report.
- Parameters:
grd – GRDStructure to validate
- Returns:
Markdown-formatted validation report
StatefulExecutionEngine
Dynamic GRD execution engine with state management.
- class braid.engine.StatefulExecutionEngine(grd: GRDStructure, max_iterations_per_node: int = 3, max_total_steps: int = 50)[source]
Bases:
objectStateful execution engine for GRDs.
Unlike simple topological sorting, this engine: - Maintains execution state across steps - Supports conditional branching - Handles cycles for critic/verification loops - Provides runtime condition evaluation
- DEFAULT_MAX_ITERATIONS = 3
- DEFAULT_MAX_TOTAL_STEPS = 50
- __init__(grd: GRDStructure, max_iterations_per_node: int = 3, max_total_steps: int = 50)[source]
Initialize the execution engine.
- Parameters:
grd – The GRD structure to execute
max_iterations_per_node – Max times a single node can be executed
max_total_steps – Maximum total execution steps
- execute(problem: str, executor: Callable[[GRDNode, Dict[str, Any]], str], initial_context: Dict[str, Any] | None = None) ExecutionResult[source]
Execute the GRD step by step.
- Parameters:
problem – The problem being solved
executor – Function that executes a single node
initial_context – Optional initial context
- Returns:
ExecutionResult with complete execution details
CriticDetector
Identifies critic/verification nodes in GRDs.
- class braid.critic.CriticDetector[source]
Bases:
objectDetects and classifies critic nodes in GRDs.
Critic nodes are special nodes that verify previous computations and can trigger retries if verification fails.
- CRITIC_PATTERNS = {CriticType.CONFIRMATION: [re.compile('^Confirm[:\\s]', re.IGNORECASE), re.compile('^Make sure[:\\s]', re.IGNORECASE), re.compile('^Is this correct', re.IGNORECASE)], CriticType.REVIEW: [re.compile('^Review[:\\s]', re.IGNORECASE), re.compile('^Examine[:\\s]', re.IGNORECASE), re.compile('^Inspect[:\\s]', re.IGNORECASE)], CriticType.VALIDATION: [re.compile('^Validate[:\\s]', re.IGNORECASE), re.compile('^Ensure[:\\s]', re.IGNORECASE), re.compile('^Assert[:\\s]', re.IGNORECASE)], CriticType.VERIFICATION: [re.compile('^Check[:\\s]', re.IGNORECASE), re.compile('^Verify[:\\s]', re.IGNORECASE), re.compile('^Double[- ]?check', re.IGNORECASE)]}
- FAILURE_EDGE_PATTERNS = [re.compile('fail', re.IGNORECASE), re.compile('error', re.IGNORECASE), re.compile('retry', re.IGNORECASE), re.compile('incorrect', re.IGNORECASE), re.compile('wrong', re.IGNORECASE), re.compile('no\\s*$', re.IGNORECASE)]
- detect_critics(grd: GRDStructure) List[CriticNode][source]
Detect all critic nodes in a GRD.
- Parameters:
grd – GRDStructure to analyze
- Returns:
List of detected CriticNodes with their metadata
CriticExecutor
Manages execution with critic feedback loops.
- class braid.critic.CriticExecutor(grd: GRDStructure, max_retries: int = 2)[source]
Bases:
objectExecutes GRDs with critic feedback loops.
This executor handles the complete cycle of: 1. Executing normal nodes 2. Executing critic nodes 3. Processing critic feedback 4. Retrying on failure (up to max retries)
- DEFAULT_MAX_RETRIES = 2
- __init__(grd: GRDStructure, max_retries: int = 2)[source]
Initialize the CriticExecutor.
- Parameters:
grd – The GRD structure to execute
max_retries – Maximum number of retries per critic failure
- process_critic_output(critic: CriticNode, output: str, context: Dict[str, Any], retry_count: int) Tuple[bool, str | None, Dict[str, Any]][source]
Process the output from a critic node.
- Parameters:
critic – The critic node that was executed
output – Output from critic execution
context – Current execution context
retry_count – Number of retries already attempted
- Returns:
Tuple of (should_continue, next_node_id, updated_context)
- execute_with_feedback(problem: str, executor: Callable[[GRDNode, Dict[str, Any]], str], initial_context: Dict[str, Any] | None = None) FeedbackLoopResult[source]
Execute the GRD with critic feedback loops.
- Parameters:
problem – The problem being solved
executor – Function to execute a single node
initial_context – Optional initial context
- Returns:
FeedbackLoopResult with complete execution details
PPDAnalyzer
Performance-per-Dollar analysis.
- class braid.metrics.PPDAnalyzer(architect_model: str = 'gpt-4', solver_model: str = 'gpt-3.5-turbo', custom_configs: Dict[str, ModelConfig] | None = None)[source]
Bases:
objectPerformance-per-Dollar analyzer for BRAID executions.
This class tracks token usage and costs across the planning and execution phases, and provides metrics for comparing with baseline models.
Example
>>> analyzer = PPDAnalyzer( ... architect_model="gpt-4", ... solver_model="gpt-3.5-turbo" ... ) >>> analyzer.track_usage(TokenUsage(100, 50), "planning") >>> report = analyzer.generate_report(accuracy=0.95) >>> print(f"PPD Score: {report.ppd_score}")
- MODEL_CONFIGS: Dict[str, ModelConfig] = {'claude-3-haiku': ModelConfig(model_id='claude-3-haiku', input_cost_per_1m=0.25, output_cost_per_1m=1.25, provider='anthropic'), 'claude-3-opus': ModelConfig(model_id='claude-3-opus', input_cost_per_1m=15.0, output_cost_per_1m=75.0, provider='anthropic'), 'claude-3-sonnet': ModelConfig(model_id='claude-3-sonnet', input_cost_per_1m=3.0, output_cost_per_1m=15.0, provider='anthropic'), 'claude-3.5-haiku': ModelConfig(model_id='claude-3.5-haiku', input_cost_per_1m=0.8, output_cost_per_1m=4.0, provider='anthropic'), 'claude-3.5-sonnet': ModelConfig(model_id='claude-3.5-sonnet', input_cost_per_1m=3.0, output_cost_per_1m=15.0, provider='anthropic'), 'claude-3.7-sonnet': ModelConfig(model_id='claude-3.7-sonnet', input_cost_per_1m=3.0, output_cost_per_1m=15.0, provider='anthropic'), 'claude-4.5-haiku': ModelConfig(model_id='claude-4.5-haiku', input_cost_per_1m=1.0, output_cost_per_1m=5.0, provider='anthropic'), 'claude-4.5-opus': ModelConfig(model_id='claude-4.5-opus', input_cost_per_1m=5.0, output_cost_per_1m=25.0, provider='anthropic'), 'claude-4.5-sonnet': ModelConfig(model_id='claude-4.5-sonnet', input_cost_per_1m=3.0, output_cost_per_1m=15.0, provider='anthropic'), 'deepseek-r1': ModelConfig(model_id='deepseek-r1', input_cost_per_1m=0.55, output_cost_per_1m=2.19, provider='local'), 'deepseek-v3': ModelConfig(model_id='deepseek-v3', input_cost_per_1m=0.28, output_cost_per_1m=0.42, provider='local'), 'gemini-1.5-flash': ModelConfig(model_id='gemini-1.5-flash', input_cost_per_1m=0.075, output_cost_per_1m=0.3, provider='google'), 'gemini-1.5-pro': ModelConfig(model_id='gemini-1.5-pro', input_cost_per_1m=1.25, output_cost_per_1m=5.0, provider='google'), 'gemini-2.0-flash': ModelConfig(model_id='gemini-2.0-flash', input_cost_per_1m=0.1, output_cost_per_1m=0.4, provider='google'), 'gemini-2.0-flash-lite': ModelConfig(model_id='gemini-2.0-flash-lite', input_cost_per_1m=0.075, output_cost_per_1m=0.3, provider='google'), 'gemini-2.0-pro-exp': ModelConfig(model_id='gemini-2.0-pro-exp', input_cost_per_1m=0.0, output_cost_per_1m=0.0, provider='google'), 'gemini-2.5-flash': ModelConfig(model_id='gemini-2.5-flash', input_cost_per_1m=0.3, output_cost_per_1m=2.5, provider='google'), 'gemini-2.5-pro': ModelConfig(model_id='gemini-2.5-pro', input_cost_per_1m=1.25, output_cost_per_1m=10.0, provider='google'), 'gemini-3.0-flash': ModelConfig(model_id='gemini-3.0-flash', input_cost_per_1m=0.5, output_cost_per_1m=3.0, provider='google'), 'gemini-3.0-pro': ModelConfig(model_id='gemini-3.0-pro', input_cost_per_1m=2.0, output_cost_per_1m=12.0, provider='google'), 'gpt-3.5-turbo': ModelConfig(model_id='gpt-3.5-turbo', input_cost_per_1m=0.5, output_cost_per_1m=1.5, provider='openai'), 'gpt-4': ModelConfig(model_id='gpt-4', input_cost_per_1m=30.0, output_cost_per_1m=60.0, provider='openai'), 'gpt-4-turbo': ModelConfig(model_id='gpt-4-turbo', input_cost_per_1m=10.0, output_cost_per_1m=30.0, provider='openai'), 'gpt-4-turbo-preview': ModelConfig(model_id='gpt-4-turbo-preview', input_cost_per_1m=10.0, output_cost_per_1m=30.0, provider='openai'), 'gpt-4o': ModelConfig(model_id='gpt-4o', input_cost_per_1m=2.5, output_cost_per_1m=10.0, provider='openai'), 'gpt-4o-mini': ModelConfig(model_id='gpt-4o-mini', input_cost_per_1m=0.15, output_cost_per_1m=0.6, provider='openai'), 'llama-3.3-70b': ModelConfig(model_id='llama-3.3-70b', input_cost_per_1m=0.1, output_cost_per_1m=0.4, provider='local'), 'llama-4-behemoth': ModelConfig(model_id='llama-4-behemoth', input_cost_per_1m=3.5, output_cost_per_1m=3.5, provider='local'), 'llama-4-maverick': ModelConfig(model_id='llama-4-maverick', input_cost_per_1m=0.22, output_cost_per_1m=0.85, provider='local'), 'llama-4-scout': ModelConfig(model_id='llama-4-scout', input_cost_per_1m=0.1, output_cost_per_1m=0.34, provider='local'), 'o1': ModelConfig(model_id='o1', input_cost_per_1m=15.0, output_cost_per_1m=60.0, provider='openai'), 'o1-mini': ModelConfig(model_id='o1-mini', input_cost_per_1m=0.15, output_cost_per_1m=0.6, provider='openai'), 'o1-preview': ModelConfig(model_id='o1-preview', input_cost_per_1m=15.0, output_cost_per_1m=60.0, provider='openai'), 'o3': ModelConfig(model_id='o3', input_cost_per_1m=2.0, output_cost_per_1m=8.0, provider='openai'), 'o3-mini': ModelConfig(model_id='o3-mini', input_cost_per_1m=1.1, output_cost_per_1m=4.4, provider='openai')}
- __init__(architect_model: str = 'gpt-4', solver_model: str = 'gpt-3.5-turbo', custom_configs: Dict[str, ModelConfig] | None = None)[source]
Initialize the PPD Analyzer.
- Parameters:
architect_model – Model used for GRD planning phase
solver_model – Model used for GRD execution phase
custom_configs – Optional custom model configurations
- calculate_cost(usage: TokenUsage, model_id: str) float[source]
Calculate cost for given token usage.
- Parameters:
usage – Token usage to calculate cost for
model_id – Model ID to use for pricing
- Returns:
Cost in USD
- track_usage(usage: TokenUsage, phase: str, step_id: str | None = None, latency_ms: float = 0.0) StepMetrics[source]
Track token usage for a step.
- Parameters:
usage – Token usage for this step
phase – “planning” or “execution”
step_id – Optional step identifier
latency_ms – Latency in milliseconds
- Returns:
StepMetrics for this step
- get_cost_analysis() CostAnalysis[source]
Get complete cost analysis for all tracked usage.
- Returns:
CostAnalysis with complete breakdown
- estimate_baseline_cost(baseline_model: str, problem_complexity_tokens: int = 500, response_tokens: int = 200) float[source]
Estimate cost for solving with a single baseline model.
This estimates what it would cost to solve the problem using a single model without BRAID’s split architecture.
- Parameters:
baseline_model – Model to use as baseline
problem_complexity_tokens – Estimated input tokens
response_tokens – Estimated response tokens
- Returns:
Estimated cost in USD
- calculate_ppd_score(accuracy: float, total_cost: float | None = None) float[source]
Calculate Performance-per-Dollar score.
PPD = Accuracy / Cost
Higher is better. A score of 100 means 100% accuracy at $0.01 cost.
- Parameters:
accuracy – Accuracy between 0.0 and 1.0
total_cost – Optional override for total cost
- Returns:
PPD score
- compare_with_baseline(accuracy: float, baseline_model: str, baseline_accuracy: float | None = None) PPDReport[source]
Compare BRAID execution with a baseline model.
- Parameters:
accuracy – BRAID accuracy
baseline_model – Model to compare against
baseline_accuracy – Baseline model accuracy (if known)
- Returns:
PPDReport with comparison metrics
- generate_report(accuracy: float, baseline_model: str | None = None, format: str = 'markdown') str[source]
Generate a human-readable performance report.
- Parameters:
accuracy – Achieved accuracy
baseline_model – Optional model for comparison
format – Output format (“markdown” or “text”)
- Returns:
Formatted report string
SyntheticDataGenerator
Generates BRAID-compliant training data.
- class braid.training.SyntheticDataGenerator(validate_output: bool = True, max_tokens_per_node: int = 15)[source]
Bases:
objectGenerates synthetic training data for Architect models.
This generator creates problem-GRD pairs following BRAID protocol: - Procedural scaffolding (describe HOW, not WHAT) - Atomic nodes (≤15 tokens per node) - No answer leakage
- MATH_TEMPLATES = [{'answer_fn': <function SyntheticDataGenerator.<lambda>>, 'grd_template': 'flowchart TD\n Start[Read and analyze problem] --> Extract[Extract: distance and time values]\n Extract --> Identify[Identify: need to find speed]\n Identify --> Formula[Recall speed formula]\n Formula --> Apply[Apply: divide distance by time]\n Apply --> Units[Verify units are correct]\n Units --> Answer[State the final speed]', 'template': 'If a {vehicle} travels {distance} km in {time} hours, what is its speed?', 'variables': {'distance': [60, 120, 180, 240, 300, 450, 600], 'time': [1, 2, 3, 4, 5, 6], 'vehicle': ['car', 'train', 'bus', 'bicycle', 'plane']}}, {'answer_fn': <function SyntheticDataGenerator.<lambda>>, 'grd_template': 'flowchart TD\n Start[Analyze the equation] --> Goal[Goal: isolate x]\n Goal --> Subtract[Subtract constant from both sides]\n Subtract --> Simplify1[Simplify right side]\n Simplify1 --> Divide[Divide by coefficient]\n Divide --> Simplify2[Calculate x value]\n Simplify2 --> Check[Verify by substitution]\n Check --> Answer[State solution]', 'template': 'Solve: {a}x + {b} = {c}', 'variables': {'a': [2, 3, 4, 5, 6], 'b': [1, 2, 3, 4, 5, 7, 8, 10], 'c': [10, 12, 14, 15, 18, 20, 22, 25]}}, {'answer_fn': <function SyntheticDataGenerator.<lambda>>, 'grd_template': 'flowchart TD\n Start[Understand the scenario] --> Values[Identify: unit price and quantity]\n Values --> Operation[Determine operation needed]\n Operation --> Calculate[Multiply price by quantity]\n Calculate --> Format[Format as currency]\n Format --> Answer[State total cost]', 'template': 'A store sells {item} at ${price} each. If {name} buys {quantity}, how much does {pronoun} pay?', 'variables': {'item': ['apples', 'oranges', 'books', 'pens', 'notebooks'], 'name': ['John', 'Maria', 'Alex', 'Sarah'], 'price': [2, 3, 5, 8, 10, 15], 'pronoun': ['he', 'she', 'they'], 'quantity': [3, 4, 5, 6, 7, 8, 10]}}]
- LOGIC_TEMPLATES = [{'answer_fn': <function SyntheticDataGenerator.<lambda>>, 'grd_template': 'flowchart TD\n Start[Identify premises] --> P1[Premise 1: All A are B]\n P1 --> P2[Premise 2: X is A]\n P2 --> Apply[Apply syllogistic reasoning]\n Apply --> Deduce[Deduce: X must be B]\n Deduce --> Answer[State conclusion]', 'template': 'If all {category_a} are {category_b}, and {item} is a {category_a}, what can we conclude?', 'variables': {'category_a': ['dogs', 'cats', 'birds', 'mammals'], 'category_b': ['animals', 'living things', 'creatures'], 'item': ['Rex', 'Fluffy', 'Tweety', 'Max']}}]
- REASONING_TEMPLATES = [{'answer_fn': <function SyntheticDataGenerator.<lambda>>, 'grd_template': 'flowchart TD\n Start[Understand the situation] --> Initial[Identify initial count]\n Initial --> Change[Identify the change]\n Change --> Operation[Determine: addition needed]\n Operation --> Calculate[Add the quantities]\n Calculate --> Answer[State final count]', 'template': '{person} has {count} {items}. {person2} gives {person} {more} more. How many does {person} have now?', 'variables': {'count': [3, 5, 7, 10, 12], 'items': ['apples', 'books', 'coins', 'marbles'], 'more': [2, 3, 4, 5], 'person': ['Alice', 'Bob', 'Charlie', 'Diana'], 'person2': ['Bob', 'Carol', 'David', 'Eve']}}]
- __init__(validate_output: bool = True, max_tokens_per_node: int = 15)[source]
Initialize the synthetic data generator.
- Parameters:
validate_output – Whether to validate generated samples
max_tokens_per_node – Maximum tokens per node for validation
- generate_math_samples(count: int) List[TrainingSample][source]
Generate math problem samples.
- Parameters:
count – Number of samples to generate
- Returns:
List of TrainingSample objects
- generate_logic_samples(count: int) List[TrainingSample][source]
Generate logic problem samples.
- Parameters:
count – Number of samples to generate
- Returns:
List of TrainingSample objects
- generate_reasoning_samples(count: int) List[TrainingSample][source]
Generate general reasoning samples.
- Parameters:
count – Number of samples to generate
- Returns:
List of TrainingSample objects
- generate_mixed_samples(count: int, math_ratio: float = 0.4, logic_ratio: float = 0.3, reasoning_ratio: float = 0.3) List[TrainingSample][source]
Generate a mixed dataset of samples.
- Parameters:
count – Total number of samples
math_ratio – Proportion of math problems
logic_ratio – Proportion of logic problems
reasoning_ratio – Proportion of reasoning problems
- Returns:
List of TrainingSample objects
ArchitectTrainer
Fine-tuning utilities for Architect models.
- class braid.training.ArchitectTrainer[source]
Bases:
objectUtilities for training/fine-tuning Architect models.
Supports: - Creating DSPy examples for BootstrapFewShot - Preparing fine-tuning datasets - Calculating dataset statistics
- create_dspy_examples(samples: List[TrainingSample]) List[Any][source]
Create DSPy Example objects from training samples.
- Parameters:
samples – Training samples to convert
- Returns:
List of dspy.Example objects
- prepare_openai_finetune_dataset(samples: List[TrainingSample], system_prompt: str | None = None) List[Dict[str, Any]][source]
Prepare dataset in OpenAI fine-tuning format.
- Parameters:
samples – Training samples
system_prompt – Optional system prompt
- Returns:
List of conversation dictionaries
- calculate_dataset_stats(samples: List[TrainingSample]) DatasetStats[source]
Calculate statistics for a dataset.
- Parameters:
samples – Training samples
- Returns:
DatasetStats object
- generate_training_dataset(size: int = 100, output_path: str | None = None, format: str = 'jsonl') List[TrainingSample][source]
Generate and optionally save a training dataset.
- Parameters:
size – Number of samples to generate
output_path – Optional path to save the dataset
format – Output format (“jsonl” or “json”)
- Returns:
List of generated samples