BLOGE DSL Language Specification

| Field | Value |
|---|---|
| Spec Version | 1.0.0 |
| BLOGE DSL Version | 1.0.0 |
| Date | 2026-03-07 |
| Status | STABLE |

Table of Contents

  1. Introduction
  2. Lexical Specification
  3. Syntax Specification
  4. AST Specification
  5. Compilation Semantics
  6. Error Reporting
  7. Appendix A: Complete EBNF Grammar Quick-Reference
  8. Appendix B: Example Index
  9. Appendix C: DSL vs Java Fluent API Comparison
  10. Appendix D: Reserved Keywords & Future Extensions

1. Introduction

BLOGE (Biz Orchestration Graph Engine) DSL is an external domain-specific language for declaratively defining business logic orchestration graphs. A .bloge file describes a directed acyclic graph (DAG) of operator nodes, transform blocks, data flow bindings, conditional branches, resilience policies, and schema declarations. The DSL compiler translates .bloge source into the bloge-core Graph runtime model.

This specification is derived from the shipping implementation (Lexer.java, Parser.java, DslCompiler.java, the session extension compiler, and the shared conformance fixtures). It is the stable source of truth for lexical rules, syntax grammar, AST structure, graph compilation, session compilation, and the higher-order expression features now supported by the Java and TypeScript parsers.

Scope: This document covers the DSL language and its compilation to Graph and session-extension runtime models. Low-level runtime scheduling details (virtual-thread dispatch, persistence adapters, and execution listeners) remain implementation concerns outside the language contract.


2. Lexical Specification

The BLOGE DSL lexer (Lexer.java) is a hand-written character-by-character scanner that produces a list of Token records. Each token carries a TokenType, the raw lexeme text, and 1-based line:column source position.

2.1 Token Types

Defined in TokenType.java:

Keywords (50)

| Token | Lexeme | Description |
|---|---|---|
| GRAPH | graph | Graph definition |
| NODE | node | Node definition |
| BRANCH | branch | Branch definition |
| ON | on | Branch condition marker |
| INPUT | input | Input block / input schema |
| DEPENDS_ON | depends_on | Explicit dependency declaration |
| TIMEOUT | timeout | Timeout configuration |
| RETRY | retry | Retry configuration |
| FALLBACK | fallback | Fallback value |
| TRUE | true | Boolean literal true |
| FALSE | false | Boolean literal false |
| SCHEMA | schema | Schema definition |
| OUTPUT | output | Output schema / output path |
| OTHERWISE | otherwise | Default branch or transition case |
| WHEN | when | when {} expression |
| TRANSFORM | transform | Transform block definition |
| FOREACH | foreach | Collection fan-out block |
| SEQUENTIAL | sequential | Sequential foreach modifier |
| ITEMS | items | Reserved foreach helper identifier |
| IN | in | Foreach binding separator |
| SESSION | session | Session extension definition |
| PHASE | phase | Session phase definition |
| ROUND | round | Round-body marker inside a phase |
| THEN | then | Phase transition declaration |
| YIELD_ON | yield_on | Phase yield event list |
| IDLE_TIMEOUT | idle_timeout | Session idle timeout |
| TIMEOUT_ACTION | timeout_action | Session timeout policy reference |
| MAX_ROUNDS | max_rounds | Session or phase round limit |
| MAX_HISTORY | max_history | Session history retention limit |
| ON_ROUND_FAILURE | on_round_failure | Phase failure policy |
| LOOP | loop | Iterative hyper-node block |
| UNTIL | until | Loop or phase termination condition |
| CARRY | carry | Loop carry state block |
| MAX_ITERATIONS | max_iterations | Loop max iteration count |
| SCOPE | scope | Scope mode (parent / isolated) |
| WAIT | wait | Timer-based suspension node |
| AFTER | after | Wait dependency marker |
| SIGNAL_KEY | signal_key | Wait correlation key |
| ON_TIMEOUT | on_timeout | Wait/await timeout payload block |
| ON_FIRE | on_fire | Wait fire payload block |
| DEADLINE | deadline | Deadline timer constructor |
| CRON | cron | Cron timer constructor |
| AWAIT | await | External-event await node |
| EVENT | event | Await event matcher |
| WHERE | where | Await correlation predicate |
| MODE | mode | Await aggregation mode |
| STREAM | stream | Streaming member prefix / stream path |
| BUFFER | buffer | Streaming buffer size |
| LET | let | Transform local binding |
| SCRIPT | script | Sandboxed script node |

Literals and string forms (5)

| Token | Description |
|---|---|
| IDENT | Identifier (not matched to a keyword in the current token) |
| STRING | Double-quoted string literal |
| NUMBER | Integer or decimal number |
| DURATION | Number followed by a duration suffix (ms, s, m, h, d) |
| TRIPLE_STRING | Triple-quoted multiline string used by script code blocks |

Punctuation (27)

| Token | Lexeme | Description |
|---|---|---|
| LBRACE | { | Left brace |
| RBRACE | } | Right brace |
| LBRACKET | [ | Left bracket |
| RBRACKET | ] | Right bracket |
| LPAREN | ( | Left parenthesis |
| RPAREN | ) | Right parenthesis |
| EQUALS | = | Assignment |
| ARROW | -> | Branch, transition, or lambda arrow |
| DOT | . | Path separator / method-call chain |
| COMMA | , | List separator |
| COLON | : | Type or binding separator |
| QUESTION | ? | Optional marker / ternary |
| PLUS | + | Addition / string concatenation |
| MINUS | - | Subtraction / unary negation |
| STAR | * | Multiplication |
| SLASH | / | Division |
| PERCENT | % | Modulo |
| BANG | ! | Logical NOT |
| EQ_EQ | == | Equality comparison |
| BANG_EQ | != | Inequality comparison |
| GT | > | Greater than |
| LT | < | Less than |
| GT_EQ | >= | Greater than or equal |
| LT_EQ | <= | Less than or equal |
| DOUBLE_QUESTION | ?? | Null coalescing |
| AMP_AMP | && | Logical AND |
| PIPE_PIPE | \|\| | Logical OR |

Comments and special tokens (4)

| Token | Description |
|---|---|
| LINE_COMMENT | // ... comment, discarded by the parser |
| BLOCK_COMMENT | /* ... */ comment, discarded by the parser |
| DOC_COMMENT | /// or /** */ documentation comment preserved in the AST |
| EOF | End-of-file marker |

2.2 Lexical Rules

Identifiers and Keywords

IdentStart    = [a-zA-Z_]
IdentContinue = [a-zA-Z0-9_]
Identifier    = IdentStart IdentContinue*

After scanning an identifier, the lexer checks it against the keyword map. If matched, the corresponding keyword TokenType is emitted; otherwise IDENT is emitted. Note that depends_on is a single keyword token containing an underscore — the underscore is a valid IdentContinue character, so depends_on is scanned as a single identifier then matched as the DEPENDS_ON keyword.
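The scan-then-lookup order described above can be sketched as follows. This is a minimal illustration in Python, not the code in Lexer.java; the function name scan_identifier, the tuple return shape, and the five-entry keyword map are assumptions made for the example.

```python
import re

# Illustrative subset of the keyword map; the full table has 50 entries.
KEYWORDS = {
    "graph": "GRAPH",
    "node": "NODE",
    "depends_on": "DEPENDS_ON",
    "yield_on": "YIELD_ON",
    "max_rounds": "MAX_ROUNDS",
}

# Identifier = IdentStart IdentContinue*
IDENT_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def scan_identifier(source, pos):
    """Scan an identifier at pos and return (token_type, lexeme, next_pos)."""
    match = IDENT_RE.match(source, pos)
    if match is None:
        raise ValueError(f"no identifier at offset {pos}")
    lexeme = match.group()
    # The keyword lookup runs only after the whole identifier is consumed,
    # which is why "depends_on" is a single token rather than three.
    return KEYWORDS.get(lexeme, "IDENT"), lexeme, match.end()
```

Because the lookup uses the complete lexeme, a longer name such as depends_onX falls through to IDENT instead of being split at the keyword boundary.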

String Literals

StringLiteral = '"' StringChar* '"'
StringChar    = EscapeSeq | <any char except '"' and '\'>
EscapeSeq     = '\"' | '\n' | '\t' | '\\'
  • Strings are delimited by double quotes (")
  • Strings may span multiple lines (newlines inside strings increment the line counter)
  • Supported escape sequences: \" → double quote, \n → newline, \t → tab, \\ → backslash
  • An unterminated string (reaching EOF without closing ") produces a lexer error
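The string rules above can be sketched as a small scanner. This is an illustrative Python version, not the shipping lexer; scan_string and its return shape are hypothetical, and rejecting an escape outside the four listed ones is an assumption, since the text does not say how other escapes are handled.

```python
# The four escape sequences listed in the grammar.
ESCAPES = {'"': '"', "n": "\n", "t": "\t", "\\": "\\"}


def scan_string(source, pos, line=1):
    """Scan a double-quoted string that starts at source[pos].

    Returns (decoded_value, next_pos, line_after).
    """
    if pos >= len(source) or source[pos] != '"':
        raise ValueError("expected opening quote")
    chars = []
    i = pos + 1
    while i < len(source):
        ch = source[i]
        if ch == '"':
            return "".join(chars), i + 1, line
        if ch == "\\":
            nxt = source[i + 1] if i + 1 < len(source) else ""
            if nxt not in ESCAPES:
                # Assumption: unlisted escapes are rejected.
                raise ValueError(f"unsupported escape at line {line}")
            chars.append(ESCAPES[nxt])
            i += 2
            continue
        if ch == "\n":
            line += 1  # multi-line strings advance the line counter
        chars.append(ch)
        i += 1
    # Reached EOF without a closing quote.
    raise ValueError("unterminated string literal")
```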

Number Literals

NumberLiteral = Digits ('.' Digits)?
Digits        = [0-9]+
  • Both integers (42) and decimals (3.14) are supported
  • A decimal point requires at least one digit on each side
  • The number is stored as its raw lexeme text; parsing to numeric types occurs later

Duration Literals

DurationLiteral = Digits Suffix
Suffix          = 'ms' | 's' | 'm' | 'h' | 'd'
  • A duration is a number immediately followed by a time suffix: ms (milliseconds), s (seconds), m (minutes), h (hours), or d (days)
  • Examples: 100ms, 3s, 5m, 24h
  • The suffix must not be followed by an identifier-continue character (to distinguish 5m from 5myVar)
  • The ms suffix is checked first (longest match) to avoid ambiguity with m
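The number and duration rules can be combined into one scanning step, sketched below in Python. The function names are hypothetical, and the suffix set follows the DURATION row in §2.1 (ms, s, m, h, d); the order of the single-character suffixes after ms is an assumption that does not affect the result.

```python
SUFFIXES = ("ms", "s", "m", "h", "d")  # "ms" first: longest match wins


def is_digit(ch):
    return "0" <= ch <= "9"


def is_ident_continue(ch):
    return ch == "_" or (ch.isascii() and ch.isalnum())


def scan_number_or_duration(source, pos):
    """Return (token_type, lexeme, next_pos) for the numeric token at pos."""
    i = pos
    while i < len(source) and is_digit(source[i]):
        i += 1
    if i == pos:
        raise ValueError("expected a digit")
    is_decimal = False
    # A '.' belongs to the number only when a digit follows it.
    if i + 1 < len(source) and source[i] == "." and is_digit(source[i + 1]):
        is_decimal = True
        i += 1
        while i < len(source) and is_digit(source[i]):
            i += 1
    if not is_decimal:  # DurationLiteral = Digits Suffix
        for suffix in SUFFIXES:
            end = i + len(suffix)
            followed_by_ident = end < len(source) and is_ident_continue(source[end])
            if source.startswith(suffix, i) and not followed_by_ident:
                return "DURATION", source[pos:end], end
    return "NUMBER", source[pos:i], i
```

With this shape, 5myVar yields NUMBER 5 and leaves myVar for the identifier scanner, while 5m followed by whitespace or punctuation yields a DURATION.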

Comments

The lexer supports four comment forms:

| Form | Syntax | Behavior |
|---|---|---|
| Line | // ... | Discarded (ignored) |
| Block | /* ... */ | Discarded (ignored), supports nesting |
| Doc-line | /// ... | Emitted as DOC_COMMENT token |
| Doc-block | /** ... */ | Emitted as DOC_COMMENT token |

Line comment (//): All characters from // to end-of-line are discarded.

Block comment (/* ... */): Supports nested block comments. A depth counter tracks /* (depth++) and */ (depth--). Newlines inside block comments increment the line counter. An unterminated block comment (depth > 0 at EOF) produces a lexer error.

Doc-line comment (///): After consuming the third /, an optional leading space is skipped. The remaining text to end-of-line is captured as the DOC_COMMENT lexeme. Multiple consecutive /// lines produce multiple DOC_COMMENT tokens (merged later by the parser).

Doc-block comment (/** ... */): After consuming /**, all content up to */ is captured. Leading whitespace and optional * prefixes on continuation lines are stripped. The final text is trimmed of leading/trailing blank lines. Emitted as a single DOC_COMMENT token.

Note: /**/ (an empty block comment, four characters) is treated as a regular block comment: after /* is matched, peek() == '*' and peekNext() == '/', so the doc-block branch is not taken.
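The depth-counter handling of nested block comments can be sketched as below. This is a Python illustration under the rules stated in this section; skip_block_comment and its return shape are hypothetical names, and doc-block detection is left out.

```python
def skip_block_comment(source, pos, line=1):
    """Skip a (possibly nested) block comment that starts at source[pos].

    Returns (next_pos, line_after).
    """
    if not source.startswith("/*", pos):
        raise ValueError("expected '/*'")
    depth = 1
    i = pos + 2
    while i < len(source):
        if source.startswith("/*", i):
            depth += 1  # nested opener
            i += 2
        elif source.startswith("*/", i):
            depth -= 1
            i += 2
            if depth == 0:
                return i, line
        else:
            if source[i] == "\n":
                line += 1  # newlines inside comments still count
            i += 1
    # depth > 0 at EOF
    raise ValueError("unterminated block comment")
```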

Whitespace

  • Space (' '), tab (\t), and carriage return (\r) are silently skipped
  • Newline (\n) increments the line counter and resets the column counter to 1

2.3 Lexical Disambiguation & Priority

| Ambiguity | Resolution Rule |
|---|---|
| /// vs // | After matching //, check if next char is /. If so → doc-line comment; otherwise → line comment |
| /** vs /* | After matching /*, check if next char is * and char after that is not /. If so → doc-block comment; otherwise → block comment |
| - vs -> | After matching -, check if next char is >. If so → ARROW; otherwise → MINUS. The lexer always emits the token; the parser determines whether MINUS is valid in the current position (it is valid in expression contexts, invalid elsewhere). |
| / vs // vs /* | After matching /, check next char: / → comment, * → block comment, otherwise → SLASH. As with MINUS, the lexer always emits SLASH; validity depends on parser context (expression position). |
| = vs == | After matching =, check if next char is =. If so → EQ_EQ; otherwise → EQUALS |
| ! vs != | After matching !, check if next char is =. If so → BANG_EQ; otherwise → BANG |
| > vs >= | After matching >, check if next char is =. If so → GT_EQ; otherwise → GT |
| < vs <= | After matching <, check if next char is =. If so → LT_EQ; otherwise → LT |
| ? vs ?? | After matching ?, check if next char is ?. If so → DOUBLE_QUESTION; otherwise → QUESTION |
| & vs && | & alone is a lexer error; && → AMP_AMP |
| \| vs \|\| | \| alone is a lexer error; \|\| → PIPE_PIPE |
| depends_on as keyword | The underscore _ is a valid identifier character, so depends_on is scanned as a single identifier token and then matched as the DEPENDS_ON keyword |
| 5m vs 5myVar | After a number followed by m, check if the next char is an IdentContinue. If so → treat as NUMBER + IDENT; if not → treat as DURATION |
| 5ms duration | ms suffix is checked before single-character m and s suffixes (longest-match priority) |
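The one-character lookahead used for the operator rows can be sketched with two lookup tables. This is an illustrative Python version covering only the operators that have a two-character form; scan_operator is a hypothetical name and the error text is invented for the example.

```python
TWO_CHAR = {
    "->": "ARROW", "==": "EQ_EQ", "!=": "BANG_EQ", ">=": "GT_EQ",
    "<=": "LT_EQ", "??": "DOUBLE_QUESTION", "&&": "AMP_AMP", "||": "PIPE_PIPE",
}
ONE_CHAR = {
    "-": "MINUS", "=": "EQUALS", "!": "BANG",
    ">": "GT", "<": "LT", "?": "QUESTION",
}


def scan_operator(source, pos):
    """Return (token_type, lexeme, next_pos), preferring the longer match."""
    pair = source[pos:pos + 2]
    if pair in TWO_CHAR:
        return TWO_CHAR[pair], pair, pos + 2
    ch = source[pos]
    if ch in ONE_CHAR:
        return ONE_CHAR[ch], ch, pos + 1
    # A lone '&' or '|' has no single-character token, so it ends up here.
    raise ValueError(f"unexpected character {ch!r}")
```

Checking the two-character table first is equivalent to the "match one char, then peek" formulation in the table, and it makes the lone & and | cases fall out as errors without special handling.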

3. Syntax Specification

The BLOGE DSL parser (Parser.java) is a hand-written recursive descent parser that consumes a list of tokens and produces an AST rooted at either GraphDef or ExtensionDef (for top-level session documents).

3.1 EBNF Grammar

Program           = GraphDef
                  | SessionDef

GraphDef          = DocComment? "graph" IDENT "{" Member* "}"

SessionDef        = DocComment? "session" IDENT "{" SessionMember* "}"

SessionMember     = SessionProperty
                  | PhaseDef
                  | CommentNode

SessionProperty   = "idle_timeout" "=" DURATION
                  | "timeout_action" "=" STRING
                  | "max_rounds" "=" NUMBER
                  | "max_history" "=" NUMBER

PhaseDef          = DocComment? "phase" IDENT "{" PhaseBody "}"

PhaseBody         = ( PhaseProperty
                    | RoundBlock
                    | Member
                    )*

PhaseProperty     = "max_rounds" "=" NUMBER
                  | "on_round_failure" "=" IDENT
                  | "yield_on" "=" "[" IdentList? "]"
                  | "until" Expression
                  | ThenProperty

ThenProperty      = "then" "->" IDENT
                  | "then" "{" ( ( Expression | "otherwise" ) "->" IDENT )* "}"

RoundBlock        = "round" "{" Member* "}"

Member            = NodeDef
                  | BranchDef
                  | TransformDef
                  | SchemaDef
                  | ForEachDef
                  | LoopDef
                  | WaitDef
                  | AwaitDef
                  | ScriptDef
                  | StreamMember
                  | CommentNode

StreamMember      = DocComment? "stream" ( NodeDef
                                           | ForEachDef
                                           | LoopDef )

NodeDef           = DocComment? "node" IDENT ":" IDENT "{" NodeBody "}"

NodeBody          = ( InputBlock
                    | InputSchemaBlock
                    | OutputDecl
                    | DependsOn
                    | TimeoutField
                    | RetryField
                    | FallbackField
                    | ScopeField
                    | BufferField
                    )*

InputBlock        = "input" "{" ( DocComment? IDENT "=" Expression )* "}"

InputSchemaBlock  = "input" "{" FieldDeclaration* "}"
                    (* Disambiguated from InputBlock by lookahead:
                       if first IDENT is followed by "=" → InputBlock;
                       if first IDENT is followed by ":" → InputSchemaBlock *)

OutputDecl        = "output" "{" FieldDeclaration* "}"
                  | "output" ":" IDENT

DependsOn         = "depends_on" "=" "[" IdentList? "]"

IdentList         = IDENT ( "," IDENT )*

TimeoutField      = "timeout" "=" DURATION

RetryField        = "retry" "=" "{" RetryConfig "}"

RetryConfig       = ( RetryKey ":" RetryValue ","? )*

RetryKey          = "attempts" | "backoff" | "strategy"

RetryValue        = NUMBER                    (* for "attempts" *)
                  | DURATION                  (* for "backoff" *)
                  | IDENT                     (* for "strategy" *)

FallbackField     = "fallback" "=" Expression

ScopeField        = "scope" "=" ( "parent" | "isolated" )
                    (* Controls scope visibility for sub-graph constructs. *)

BufferField       = "buffer" "=" NUMBER

ForEachDef        = DocComment? "foreach" IDENT ":" ForEachBinding "in" Expression
                    "sequential"? "{" ( ScopeField | BufferField | Member )* "}"

ForEachBinding    = "(" IDENT ( "," IDENT )? ")"   (* tuple binding with optional index *)
                  | IDENT                           (* bare item variable *)

LoopDef           = DocComment? "loop" IDENT "{" LoopBody "}"

LoopBody          = ( "max_iterations" "=" NUMBER
                    | "delay" "=" DURATION
                    | DependsOn
                    | ScopeField
                    | BufferField
                    | Member
                    | CarryBlock
                    | UntilClause
                    )*

CarryBlock        = "carry" "{" ( IDENT ":" Expression ","? )* "}"

UntilClause       = "until" Expression

WaitDef           = DocComment? "wait" IDENT "=" WaitTimer "after" IDENT ( "{" WaitBody "}" )?

WaitTimer         = DURATION
                  | "deadline" "(" STRING ")"
                  | "cron" "(" STRING ")"

WaitBody          = ( "signal_key" "=" Expression
                    | "on_timeout" PayloadBlock
                    | "on_fire" PayloadBlock
                    )*

AwaitDef          = DocComment? "await" IDENT "{" AwaitBody "}"

AwaitBody         = ( "mode" "=" IDENT
                    | DependsOn
                    | EventMatcher
                    | TimeoutField
                    | "on_timeout" PayloadBlock
                    )*

EventMatcher      = "event" STRING ( "where" IDENT "=" Expression )?
                    ( "{" "optional" "=" "true" "}" )?

ScriptDef         = DocComment? "script" IDENT "{" ScriptBody "}"

ScriptBody        = ( ScriptLang
                    | InputBlock
                    | ScriptOutputSchema
                    | ScriptCode
                    | TimeoutField
                    )*

ScriptLang        = "lang" "=" STRING

ScriptOutputSchema = "output_schema" ( "{" FieldDeclaration* "}"
                                      | "=" IDENT )

ScriptCode        = "code" "=" TRIPLE_STRING

PayloadBlock      = "{" ( IDENT "=" Expression )* "}"

BranchDef         = DocComment? "branch" "on" PathExpression "{" BranchBody "}"

BranchBody        = ( BranchCase | OtherwiseCase )*

BranchCase        = DocComment? Expression "->" IDENT

OtherwiseCase     = "otherwise" "->" IDENT

TransformDef      = DocComment? "transform" IDENT "{" TransformBody "}"

TransformBody     = LetBinding* TransformField*

LetBinding        = "let" IDENT "=" Expression

TransformField    = DocComment? IDENT ( ":" TypeRef )? "=" Expression

TypeRef           = IDENT "?"?

SchemaDef         = "schema" IDENT "{" FieldDeclaration* "}"

FieldDeclaration  = IDENT ":" ( IDENT "?"?
                               | "{" FieldDeclaration* "}" )

(* --- Enhanced Expression Grammar --- *)
(* Operator precedence (lowest to highest):
     1. when expression
     2. ternary conditional (?:)
     3. null coalescing (??)
     4. logical OR (||)
     5. logical AND (&&)
     6. equality (==, !=)
     7. comparison (>, <, >=, <=)
     8. additive (+, -)
     9. multiplicative (*, /, %)
    10. unary (!, -)
    11. primary (literals, paths, function calls, grouping, when) *)

Expression        = LambdaExpr
                  | WhenExpr
                  | ConditionalExpr

LambdaExpr        = ( IDENT | "(" ( IDENT ( "," IDENT )* )? ")" ) "->" Expression

ConditionalExpr   = NullCoalesce ( "?" Expression ":" Expression )?

NullCoalesce      = LogicalOr ( "??" LogicalOr )*

LogicalOr         = LogicalAnd ( "||" LogicalAnd )*

LogicalAnd        = Equality ( "&&" Equality )*

Equality          = Comparison ( ( "==" | "!=" ) Comparison )?

Comparison        = Additive ( ( ">" | "<" | ">=" | "<=" ) Additive )?

Additive          = Multiplicative ( ( "+" | "-" ) Multiplicative )*

Multiplicative    = Unary ( ( "*" | "/" | "%" ) Unary )*

Unary             = ( "!" | "-" ) Unary
                  | Primary

Primary           = Atomic ( "." MethodCall )*

MethodCall        = IDENT "(" ( Expression ( "," Expression )* )? ")"

Atomic            = FunctionCall
                  | PathExpression
                  | StringLiteral
                  | NumberLiteral
                  | BooleanLiteral
                  | DurationLiteral
                  | ObjectLiteral
                  | ArrayLiteral
                  | GroupExpr

FunctionCall      = IDENT "(" ( Expression ( "," Expression )* )? ")"

GroupExpr         = "(" Expression ")"

WhenExpr          = "when" WhenSubject? "{" WhenClause* OtherwiseClause? "}"

WhenSubject       = Expression

WhenClause        = Expression "->" Expression

OtherwiseClause   = "otherwise" "->" Expression

PathExpression    = IDENT ( "." IDENT )*

StringLiteral     = STRING

NumberLiteral     = NUMBER

BooleanLiteral    = "true" | "false"

DurationLiteral   = DURATION

ObjectLiteral     = "{" ( IDENT ":" Expression ( "," IDENT ":" Expression )* )? "}"

ArrayLiteral      = "[" ( Expression ( "," Expression )* )? "]"

DocComment        = DOC_COMMENT+

CommentNode       = (* A DocComment that is not followed by "node", "branch",
                       "transform", or "schema" — becomes a standalone
                       CommentNode in the AST *)
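The layering from ConditionalExpr down to Unary can be sketched as a table-driven recursive descent parser. This Python sketch is not Parser.java: it works on pre-split token strings, returns nested tuples instead of AST records, and omits lambda, when, method calls, and literals. MiniParser and LEVELS are hypothetical names.

```python
# Binary levels from lowest to highest precedence. The flag marks levels
# the grammar lets repeat ("*") rather than occur at most once ("?").
LEVELS = [
    ({"??"}, True),
    ({"||"}, True),
    ({"&&"}, True),
    ({"==", "!="}, False),
    ({">", "<", ">=", "<="}, False),
    ({"+", "-"}, True),
    ({"*", "/", "%"}, True),
]


class MiniParser:
    def __init__(self, tokens):
        self.tokens = list(tokens)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.tokens[self.pos]
        self.pos += 1
        return token

    def expect(self, lexeme):
        if self.peek() != lexeme:
            raise SyntaxError(f"expected {lexeme!r}")
        self.advance()

    def parse_conditional(self):
        cond = self.parse_binary(0)
        if self.peek() == "?":
            self.advance()
            then_branch = self.parse_conditional()
            self.expect(":")
            return ("?:", cond, then_branch, self.parse_conditional())
        return cond

    def parse_binary(self, level):
        if level == len(LEVELS):
            return self.parse_unary()
        ops, repeatable = LEVELS[level]
        left = self.parse_binary(level + 1)
        while self.peek() in ops:
            op = self.advance()
            left = (op, left, self.parse_binary(level + 1))
            if not repeatable:
                break  # equality and comparison do not chain
        return left

    def parse_unary(self):
        if self.peek() in ("!", "-"):
            op = self.advance()
            return (op, self.parse_unary())
        return self.parse_atom()

    def parse_atom(self):
        if self.peek() is None:
            raise SyntaxError("unexpected end of input")
        if self.peek() == "(":
            self.advance()
            inner = self.parse_conditional()
            self.expect(")")
            return inner
        return self.advance()
```

Each level calls the next-higher level for its operands, so tighter-binding operators end up deeper in the tree; a + b * c parses with the multiplication nested under the addition.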

3.2 Syntax Railroad Descriptions

GraphDef

──▶ DocComment? ──▶ "graph" ──▶ IDENT ──▶ "{" ──▶ Member* ──▶ "}" ──▶

NodeDef

──▶ DocComment? ──▶ "node" ──▶ IDENT ──▶ ":" ──▶ IDENT ──▶ "{" ──▶ NodeBody ──▶ "}" ──▶
                                │ (id)           │ (operatorRef)

BranchDef

──▶ DocComment? ──▶ "branch" ──▶ "on" ──▶ PathExpr ──▶ "{" ──┬──▶ BranchCase ──┬──▶ "}" ──▶
                                                              │                 │
                                                              ├──▶ Otherwise ───┤
                                                              └─────────────────┘

TransformDef

──▶ DocComment? ──▶ "transform" ──▶ IDENT ──▶ "{" ──┬──▶ TransformField ──┬──▶ "}" ──▶
                                                     │                     │
                                                     └─────────────────────┘
TransformField:
──▶ DocComment? ──▶ IDENT ──▶ ( ":" TypeRef )? ──▶ "=" ──▶ Expression ──▶

Expression (Enhanced)

                ┌──▶ WhenExpr ─────────────────────────┐
                │                                       │
──▶ ────────────┤                                       ├──▶
                │                                       │
                └──▶ ConditionalExpr ──────────────────┘

ConditionalExpr:
──▶ NullCoalesce ──┬──▶ "?" ──▶ Expression ──▶ ":" ──▶ Expression ──▶

                   └──▶ (none) ──▶

NullCoalesce:
──▶ LogicalOr ──┬──▶ "??" ──▶ LogicalOr ──┬──▶
                │                           │
                └───────────────────────────┘

Primary:
         ┌──▶ FunctionCall ────┐
         ├──▶ PathExpression ──┤
         ├──▶ StringLiteral ───┤
         ├──▶ NumberLiteral ───┤
──▶ ─────┼──▶ BooleanLiteral ──┼──▶
         ├──▶ DurationLiteral ─┤
         ├──▶ ObjectLiteral ───┤
         ├──▶ ArrayLiteral ────┤
         └──▶ "(" Expr ")" ───┘

FunctionCall:
──▶ IDENT ──▶ "(" ──┬──▶ Expression ──┬──▶ "," ──▶ Expression ──┬──▶ ")" ──▶
                     │                 └─────────────────────────┘
                     └──▶ (empty) ──────────────────────────────────▶

WhenExpr:
──▶ "when" ──▶ Subject? ──▶ "{" ──┬──▶ Expr "->" Expr ──┬──▶ OtherwiseClause? ──▶ "}" ──▶
                                   │                      │
                                   └──────────────────────┘

3.3 Contextual Keywords

All 50 keyword tokens are contextual: they carry special meaning only in specific syntactic positions but can still be accepted by expectIdent() when the grammar expects an identifier. This allows domain names such as phase, mode, or transform to remain usable where the parser is looking for IDs rather than grammar markers.

Positions where keywords are treated as identifiers:

  • Graph, phase, session, node, transform, foreach, loop, wait, await, and script IDs
  • Operator references after node <id> : <operatorRef>
  • Schema names and field names
  • Retry strategy names
  • Function arguments and path segments inside expressions
  • Event correlation keys inside await event ... where <key> = ...

This design means the lexer can stay strict while the parser still supports natural domain naming.

Semantic reservation still applies: contextual keywords are not the same as semantically safe identifiers. Some names remain reserved by later compiler stages. For example, node IDs cannot use ctx, prev, carry, loopIteration, stream, or output, and lambda parameters cannot shadow reserved DSL keywords.
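The split between syntactic acceptance and semantic reservation can be sketched as two separate checks. This Python sketch only mirrors the behavior described for expectIdent(); the token representation, the keyword subset, the check_node_id helper, and both error messages are assumptions for illustration.

```python
# Illustrative subset of keyword token types; the real table has 50 entries.
KEYWORD_TYPES = {"NODE", "PHASE", "MODE", "TRANSFORM", "STREAM"}

# Names the text lists as unusable for node IDs.
RESERVED_NODE_IDS = {"ctx", "prev", "carry", "loopIteration", "stream", "output"}


def expect_ident(tokens, pos):
    """Accept IDENT or any keyword token where an identifier is expected.

    tokens is a list of (token_type, lexeme) pairs.
    Returns (name, next_pos).
    """
    token_type, lexeme = tokens[pos]
    if token_type == "IDENT" or token_type in KEYWORD_TYPES:
        return lexeme, pos + 1
    raise SyntaxError(f"expected identifier, found {token_type}")


def check_node_id(name):
    """Later-stage check: parseable names can still be semantically reserved."""
    if name in RESERVED_NODE_IDS:
        raise ValueError(f"'{name}' is reserved and cannot be used as a node ID")
    return name
```

Under this split, node phase : mode { ... } parses, because both phase and mode are accepted in identifier position, while a node named stream passes the parser and is rejected by the later check.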

3.4 Session Extension Syntax

session is the stable top-level extension form for long-running conversational or multi-round flows. A .bloge source file may now contain either a graph or a session as its root production.

session onboarding {
  idle_timeout = 30m
  max_rounds = 10

  phase collectProfile {
    yield_on = [profile_submitted]
    then -> review

    round {
      node askProfile : AskProfileOperator {}
      wait profileTimeout = 24h after askProfile {
        signal_key = ctx.sessionId
      }
    }
  }

  phase review {
    on_round_failure = retry_phase
    until ctx.approved

    node evaluate : EvaluateProfileOperator {}
    then {
      ctx.approved -> complete
      otherwise -> collectProfile
    }
  }
}

Key structural rules:

  1. Session-level properties (idle_timeout, timeout_action, max_rounds, max_history) live directly under session.
  2. phase blocks are ordered children of the session and are represented as nested ExtensionDef nodes.
  3. A phase is either a once phase (direct Member children) or a round phase (round { ... }), never both.
  4. then supports either a direct phase target (then -> nextPhase) or a case table that lowers into an ordered list of transition objects.
  5. Phase bodies can contain the same executable members as graphs, including node, branch, transform, foreach, loop, wait, await, script, and streaming variants.

4. AST Specification

4.1 AST Node Hierarchy

The AST is defined via Java sealed interface hierarchies in AstNode.java and Expression.java. All AST nodes carry source position (line, column) for diagnostics, conformance snapshots, and downstream compilation.

AstNode (sealed interface)
├── GraphDef(name, members: List<AstNode>, description, line, column)
├── ExtensionDef(kind, id, properties: Map<String, Expression>,
│                children: List<AstNode>, description, line, column)
├── NodeDef(id, operatorRef, input, dependsOn, timeout, retry, fallback,
│           inputSchema, outputSchema, scope, streaming, bufferSize,
│           description, line, column)
├── BranchDef(condition: Expression, cases: List<BranchCase>,
│             otherwise, description, line, column)
├── InputBlock(bindings: Map<String, Expression>,
│              fieldComments: Map<String, String>, line, column)
├── SchemaDef(name, body: SchemaDeclaration, line, column)
├── TransformDef(id, letBindings: List<LetBinding>,
│                fields: List<TransformField>, description, line, column)
├── ForEachDef(id, itemsExpr: Expression, sequential, itemVar, indexVar,
│              scope, streaming, bufferSize, body: List<AstNode>,
│              description, line, column)
├── LoopDef(id, maxIterations, delay, dependsOn, scope, streaming,
│           bufferSize, body: List<AstNode>, carryDef, untilCondition,
│           description, line, column)
├── WaitDef(id, timerExpr, afterNode, signalKey,
│           onTimeoutPayload, onFirePayload, description, line, column)
├── AwaitDef(id, aggregationMode, events: List<EventMatcherDef>, timeout,
│            onTimeoutPayload, dependsOn, description, line, column)
├── ScriptDef(id, lang, input, outputSchema, code, timeout,
│             description, line, column)
├── CarryDef(bindings, line, column)
├── UntilDef(condition, line, column)
└── CommentNode(text, style, line, column)

(helper records, not AstNode)
├── BranchCase(value: Expression, target, description)
├── EventMatcherDef(eventName, correlationKey, expectedValue, optional, line, column)
├── TransformField(name, typeAnnotation, value: Expression, description, line, column)
├── LetBinding(name, value: Expression, line, column)
├── RetryDef(attempts, backoff: DurationValue, strategy)
├── FallbackDef(value: Expression)
└── DurationValue(amount, unit)

SchemaDeclaration (sealed interface)
├── InlineSchema(fields: List<FieldDeclaration>, line, column)
└── SchemaRef(name, line, column)

Expression (sealed interface)
├── ContextPath(segments, line, column)
├── NodeOutputPath(nodeId, segments, line, column)
├── NodeStreamPath(nodeId, segments, line, column)
├── TransformFieldPath(transformId, fieldName, line, column)
├── ItemPath(segments, line, column)
├── ItemIndex(line, column)
├── LoopPrevPath(nodeId, segments, line, column)
├── LoopCarryPath(segments, line, column)
├── LoopIterationRef(line, column)
├── LambdaParamPath(paramName, segments, line, column)
├── StringLiteral(value, line, column)
├── NumberLiteral(value, line, column)
├── BooleanLiteral(value, line, column)
├── DurationLiteral(duration, line, column)
├── ObjectLiteral(fields, line, column)
├── ArrayLiteral(elements, line, column)
├── BinaryOp(left, op, right, line, column)
├── UnaryOp(op, operand, line, column)
├── ConditionalExpr(condition, thenBranch, elseBranch, line, column)
├── NullCoalesce(primary, fallback, line, column)
├── FunctionCall(name, args, line, column)
├── WhenExpr(subject, clauses, otherwise, line, column)
├── LambdaExpr(params, body, line, column)
├── MethodCallExpr(receiver, method, args, line, column)
└── GroupExpr(inner, line, column)

WhenClause(condition: Expression, result: Expression)
BinaryOperator(enum): PLUS, MINUS, STAR, SLASH, PERCENT,
                      EQ_EQ, BANG_EQ, GT, LT, GT_EQ, LT_EQ,
                      AMP_AMP, PIPE_PIPE
UnaryOperator(enum): NEGATE, NOT

4.2 Syntax → AST Mapping

| Syntax Production | AST Node / Helper | Notes |
|---|---|---|
| GraphDef | GraphDef | members contains executable graph members and detached CommentNodes |
| SessionDef | ExtensionDef(kind="session") | Session-level properties are stored in properties; phases become children |
| PhaseDef | ExtensionDef(kind="phase") | phase_type, then, until, yield_on, and round/once metadata are normalized into properties |
| RoundBlock | ExtensionDef(kind="phase") child list | round { ... } does not emit a dedicated node; its members replace the phase child list |
| NodeDef | NodeDef | Input, schemas, resilience, scope, streaming, and buffer metadata map to dedicated fields |
| StreamMember | NodeDef / ForEachDef / LoopDef | stream is represented by streaming=true on the wrapped member |
| ForEachDef | ForEachDef | itemVar, optional indexVar, sequential, scope, bufferSize, and child body are preserved explicitly |
| LoopDef | LoopDef | carry lowers into CarryDef; until lowers into untilCondition |
| WaitDef | WaitDef | Timer constructor is stored as an Expression; payload blocks are Map<String, Expression> |
| AwaitDef | AwaitDef | Event matchers lower into EventMatcherDef; mode defaults to "and" if omitted |
| ScriptDef | ScriptDef | code preserves raw triple-string content; output_schema may be inline or by reference |
| BranchDef | BranchDef | condition is any path-capable expression accepted by parseExpression() |
| InputBlock | InputBlock | Key-value map; values are any Expression subtype |
| TransformDef | TransformDef | let bindings and output fields are stored separately |
| TransformField | TransformField | typeAnnotation is nullable; value is any Expression |
| SchemaDef | SchemaDef | Top-level standalone schema; body is typically InlineSchema |
| OutputDecl (inline) | InlineSchema | output { field: Type } |
| OutputDecl (ref) | SchemaRef | output: SchemaName |
| InputSchemaBlock | InlineSchema | Stored as NodeDef.inputSchema |
| TimeoutField | DurationValue | timeout = 3s → DurationValue(3, "s") |
| RetryField | RetryDef | Missing strategy defaults to fixed in compiler normalization |
| FallbackField | FallbackDef | Wraps any Expression |
| FieldDeclaration | FieldDeclaration | name: Type? marks the field optional; nested object schemas lower into InlineSchema |
| Expression (path) | Path-aware Expression subtype | See §4.3 for classification rules |
| Expression (lambda) | LambdaExpr | Only valid where a higher-order collection method expects a lambda |
| Expression (method) | MethodCallExpr | Receiver-style collection helpers such as items.map(x -> ...) |

DurationValue Parsing

The DurationValue is parsed from the DURATION token lexeme:

| Lexeme | Parsed As |
|---|---|
| 100ms | DurationValue(100, "ms") |
| 3s | DurationValue(3, "s") |
| 5m | DurationValue(5, "m") |

Suffix matching order: ms → s → m (longest-suffix-first).
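Splitting a DURATION lexeme into amount and unit can be sketched as below. This is a Python illustration, not the parser's code; parse_duration is a hypothetical name, and the NamedTuple stands in for the DurationValue record. The h and d units come from the DURATION row in §2.1, and their position after m in the matching order is an assumption.

```python
from typing import NamedTuple


class DurationValue(NamedTuple):
    amount: int
    unit: str


# Longest suffix first, so "100ms" is not read as "100m" + "s".
SUFFIX_ORDER = ("ms", "s", "m", "h", "d")


def parse_duration(lexeme):
    """Split a DURATION lexeme such as "100ms" into (amount, unit)."""
    for unit in SUFFIX_ORDER:
        if lexeme.endswith(unit):
            digits = lexeme[: -len(unit)]
            if digits.isascii() and digits.isdigit():
                return DurationValue(int(digits), unit)
    raise ValueError(f"not a duration lexeme: {lexeme!r}")
```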

4.3 Expression Path Resolution Rules

When parsing a PathExpression (an IDENT followed by zero or more .IDENT segments), the parser classifies the result into one of several path-aware Expression subtypes based on the following rules, applied in order:

Rule 1: ctx prefix → ContextPath

ctx.request.userId  →  ContextPath(segments=["request", "userId"])
ctx.items           →  ContextPath(segments=["items"])

If the first segment is "ctx", the expression is a ContextPath. The "ctx" prefix is stripped from the segments list.

Rule 2: <nodeId>.output.<segments> → NodeOutputPath (explicit)

fetchUser.output.id       →  NodeOutputPath(nodeId="fetchUser", segments=["id"])
fetchUser.output          →  NodeOutputPath(nodeId="fetchUser", segments=[])
calcPrice.output.total    →  NodeOutputPath(nodeId="calcPrice", segments=["total"])

If the path has ≥2 segments and the second segment is "output", it is a NodeOutputPath. The node ID is the first segment, and the segments list starts after "output".

Rule 3: Single identifier → ContextPath

someVar  →  ContextPath(segments=["someVar"])

If the path has exactly one segment (not "ctx"), it is a ContextPath with that single segment.

Rule 4: <transformId>.<fieldName> → TransformFieldPath

orderSummary.customerName  →  TransformFieldPath(transformId="orderSummary", fieldName="customerName")
riskMetrics.score          →  TransformFieldPath(transformId="riskMetrics", fieldName="score")

If the path has exactly two segments and the first segment matches a declared transform ID in the current graph, it is a TransformFieldPath. The first segment is the transform ID, the second segment is the field name.

Design decision: Transform references use transformId.fieldName (without .output. infix), distinguishing them from node references (nodeId.output.field). This is because transforms do not have an explicit input {} block — they only expose computed output fields — making the .output. segment redundant and visually noisy.

Rule 5: <nodeId>.<segments> → NodeOutputPath (implicit)

fetchUser.name       →  NodeOutputPath(nodeId="fetchUser", segments=["name"])
a.result.data        →  NodeOutputPath(nodeId="a", segments=["result", "data"])

Any multi-segment path that doesn't match Rules 1–4 is treated as a NodeOutputPath where the first segment is the node ID and the remaining segments are the output path. This is the implicit form (without the output keyword).

Disambiguation note: When a two-segment path x.y could match both Rule 4 (transform) and Rule 5 (implicit node output), the compiler resolves it based on which IDs are declared. If x is declared as both a transform ID and a node ID, a compile-time error is raised: "Transform '<id>' conflicts with node of the same name" (see §6.3). Naming conventions — noun phrases for transforms (orderSummary) vs verb phrases for nodes (fetchUser) — are recommended but not enforced by the compiler.

Rule 6: <nodeId>.stream or <nodeId>.stream.<fieldName> → NodeStreamPath

processData.stream            →  NodeStreamPath(nodeId="processData", segments=[])
processData.stream.chunk      →  NodeStreamPath(nodeId="processData", segments=["chunk"])

If the path has ≥2 segments and the second segment is "stream", it is a NodeStreamPath. The node ID is the first segment. The segments list contains everything after "stream". This form is only valid when the upstream node is declared as a streaming node; at runtime, results.getChannel(nodeId) is called to retrieve the live NodeChannel<?> for streaming consumption.

Rule 7 (foreach body): <itemVar>.<path> → ItemPath

Inside a foreach body, the item variable binding declared in foreach id : (itemVar) in ... creates an implicit item reference:

// foreach orders : (order) in ctx.orders { ... }
order.customerId    →  ItemPath(segments=["customerId"])
order               →  ItemPath(segments=[])           // entire item

At runtime the compiled extractor reads the current item from GraphContext key __item__. The item variable name (order in the example) is lexically matched during parsing; any path whose first segment equals itemVar is classified as ItemPath.

Rule 8 (foreach body): <indexVar> → ItemIndex

Inside a foreach body with a two-binding form (itemVar, indexVar), the second identifier is the item index reference:

// foreach orders : (order, idx) in ctx.orders { ... }
idx   →  ItemIndex

At runtime the compiled extractor reads the 0-based integer from GraphContext key __itemIndex__.

Rule 9 (loop body): prev.<nodeId>.<path> → LoopPrevPath

Inside a loop body, the special prefix prev refers to the previous iteration's terminal node outputs:

// inside loop body:
prev.checkStatus.status      →  LoopPrevPath(nodeId="checkStatus", segments=["status"])
prev.checkStatus              →  LoopPrevPath(nodeId="checkStatus", segments=[])

At runtime the compiled extractor reads from GraphContext key __prev__ (a Map<String,Object> of the previous iteration's node outputs).

Rule 10 (loop body): carry.<path> → LoopCarryPath

Inside a loop body, the special prefix carry refers to the carry state passed from the previous iteration (or the initial input of the loop node):

// inside loop body:
carry.retryCount      →  LoopCarryPath(segments=["retryCount"])
carry                  →  LoopCarryPath(segments=[])   // entire carry map

At runtime the compiled extractor reads from GraphContext key __carry__.

Rule 11 (loop body): loopIteration → LoopIterationRef

Inside a loop body, the bare identifier loopIteration refers to the 0-based iteration counter:

loopIteration    →  LoopIterationRef

At runtime the compiled extractor reads from GraphContext key __loopIteration__ (an Integer).

Reserved identifier summary: ctx, prev, carry, loopIteration, stream, and output are reserved implicit identifiers in the DSL expression layer. They cannot be used as node IDs or lambda parameter names. See the Implicit Declarations Registry (docs/implicit-declarations-registry.md) for the complete list of reserved names and their runtime lifecycle.
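
As a rough illustration of Rules 1 through 6, the sketch below classifies a dotted path outside foreach and loop bodies. The class name PathClassifierSketch, the classify signature, and the record shapes are hypothetical stand-ins, not the shipping parser types. It also assumes that the stream check (Rule 6) runs before the implicit fallback (Rule 5), since stream is a reserved identifier; the real precedence may differ.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of path classification; not the shipping Parser.java.
final class PathClassifierSketch {
    interface PathExpr {}
    record ContextPath(List<String> segments) implements PathExpr {}
    record NodeOutputPath(String nodeId, List<String> segments) implements PathExpr {}
    record NodeStreamPath(String nodeId, List<String> segments) implements PathExpr {}
    record TransformFieldPath(String transformId, String fieldName) implements PathExpr {}

    static PathExpr classify(String dottedPath, Set<String> transformIds) {
        List<String> p = Arrays.asList(dottedPath.split("\\."));
        int n = p.size();
        String head = p.get(0);
        // Rule 1: strip the ctx prefix.
        if (head.equals("ctx")) return new ContextPath(p.subList(1, n));
        // Rule 2: explicit node output, segments start after "output".
        if (n >= 2 && p.get(1).equals("output")) return new NodeOutputPath(head, p.subList(2, n));
        // Rule 6 (assumed to take precedence over Rule 5): live stream channel.
        if (n >= 2 && p.get(1).equals("stream")) return new NodeStreamPath(head, p.subList(2, n));
        // Rule 3: a single identifier is a context lookup.
        if (n == 1) return new ContextPath(p);
        // Rule 4: two segments whose head is a declared transform ID.
        if (n == 2 && transformIds.contains(head)) return new TransformFieldPath(head, p.get(1));
        // Rule 5: implicit node output.
        return new NodeOutputPath(head, p.subList(1, n));
    }

    public static void main(String[] args) {
        Set<String> transforms = Set.of("orderSummary");
        System.out.println(classify("ctx.request.userId", transforms));
        System.out.println(classify("orderSummary.customerName", transforms));
        System.out.println(classify("a.result.data", transforms));
    }
}
```

The contextual rules (7 through 11) would need the enclosing foreach or loop scope as an extra input, so they are left out of this sketch.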

4.4 Doc-Comment Attachment Rules

Documentation comments (/// and /** */) follow a prefix attachment strategy:

  1. Consumption: the parser calls consumeDocComments() which collects consecutive DOC_COMMENT tokens and merges their lexemes with newline separators.
  2. Attachment targets: a collected doc-comment is attached to the immediately following syntax element when that element has a description field or field-level comment slot.
  3. Standalone comment nodes: if a doc-comment is not followed by an attachable target, the parser emits a CommentNode so documentation is not silently discarded.
  4. Forwarding: member-level parsers use the pendingDocComment handoff so nested parsing methods can still observe the prefix comment that was consumed by the outer dispatcher.

Concrete attachment targets include:

  • graph, session, phase, node, branch, transform, foreach, loop, wait, await, and script definitions
  • branch case entries (BranchCase.description)
  • input {} field comments (InputBlock.fieldComments)
  • transform field descriptions (TransformField.description)

4.5 Session Extension AST Mapping

Sessions are intentionally modeled with the generic ExtensionDef node so future language extensions can reuse the same structural contract.

| Syntax | AST form | Notes |
| --- | --- | --- |
| session onboarding { ... } | ExtensionDef(kind="session", id="onboarding", ...) | Session properties are stored in properties; phases become ordered children |
| phase review { ... } | ExtensionDef(kind="phase", id="review", ...) | phase_type is synthesized as once or round during parsing |
| yield_on = [a, b] | properties["yield_on"] = ArrayLiteral([...]) | Identifiers are normalized into string literals inside the array |
| then -> nextPhase | properties["then"] = StringLiteral("nextPhase") | Single-target transition |
| then { expr -> a otherwise -> b } | properties["then"] = ArrayLiteral([ObjectLiteral(...) ...]) | Each object contains condition and target fields |
| round { ... } | child members of the phase ExtensionDef | No standalone RoundDef record is emitted |

This shape keeps the parser and conformance fixtures stable while allowing the dedicated session compiler to interpret extension properties with richer runtime semantics.


5. Compilation Semantics

5.1 Compilation Pipeline Overview

The GraphLoader provides the entry point for the three-stage compilation pipeline:

Source (.bloge text)
         │
         ▼
┌─────────────────────────┐
│  Lexer.tokenize()       │  →  List<Token>
└─────────────────────────┘
         │
         ▼
┌─────────────────────────┐
│  Parser.parse()         │  →  GraphDef (AST)
└─────────────────────────┘
         │
         ▼
┌─────────────────────────┐
│  DslCompiler.compile()  │  →  Graph (runtime model)
└─────────────────────────┘

The DslCompiler.compile() stage processes the AST in the following order:

  1. Schema registration — collect top-level SchemaDef entries into a named schema registry
  2. Node compilation — compile NodeDef entries into NodeSpec instances (operator resolution, input assembly, resilience config, schema binding)
  3. Transform compilation — compile TransformDef entries into virtual NodeSpec instances backed by TransformOperator (expression compilation, dependency inference, schema generation); see §5.8
  4. Transform ordering — topological sort of transforms using Kahn's algorithm with cycle detection
  5. Branch compilation — compile BranchDef entries into Edge.Conditional edges
  6. Dependency merging — merge explicit (depends_on) and implicit (expression-inferred) dependencies; add transform-originated dependencies
  7. Graph assembly — construct the Graph runtime model with all NodeSpec and Edge entries; DAG validation

The GraphLoader also supports file-watching mode (watch(Path, Consumer<Graph>)) that monitors a directory for .bloge file changes and recompiles on update.

5.2 Operator Resolution

Each NodeDef.operatorRef is resolved against the OperatorRegistry:

java
registry.contains(nd.operatorRef())  // must return true
  • If the operator is not registered, a GraphDefinitionException is thrown: "Node '<id>' references unregistered operator '<ref>'"
  • The operator reference is a simple string name (e.g., "FetchUserOperator")
  • Operator metadata (input/output schemas) may be auto-introspected via registry.metadata(operatorRef) for schema enrichment

5.3 Implicit Dependency Inference

Dependencies between nodes are established through two mechanisms, merged and deduplicated:

Explicit Dependencies

bloge
depends_on = [fetchUser, calcPrice]

Produces DirectEdge(from, to) for each listed node ID.

Implicit Dependencies

When compiling input block expressions, any NodeOutputPath or TransformFieldPath expression automatically registers the referenced node or transform as an implicit dependency:

bloge
input {
    userId = fetchUser.output.id    // implicit dep on "fetchUser"
    total  = calcPrice.output.total // implicit dep on "calcPrice"
    name   = orderSummary.customerName // implicit dep on transform "orderSummary"
}

The compiler tracks implicit dependencies per node in a Map<String, Set<String>>. After all input blocks are compiled, explicit and implicit dependencies are merged (using LinkedHashSet for deduplication) to produce the final edge list.

Transform Dependencies

Transform blocks also participate in dependency inference. Each TransformDef is compiled into a virtual NodeSpec, and any NodeOutputPath or TransformFieldPath in its field expressions creates an implicit dependency from the transform to the referenced node or transform. See §5.8 for details.

If any dependency (explicit or implicit) references a non-existent node, a GraphDefinitionException is thrown: "Node '<id>' depends on non-existent node '<dep>'".
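
The merge-and-validate step described in this section can be sketched as follows. This is a minimal illustration under stated assumptions: DependencyMergeSketch and its merge method are hypothetical names, and IllegalStateException stands in for GraphDefinitionException so the block compiles on its own.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; the shipping DslCompiler may structure this differently.
final class DependencyMergeSketch {
    /** Merges one node's explicit and implicit dependencies, keeping first-seen order. */
    static Set<String> merge(String nodeId,
                             List<String> explicit,
                             Set<String> implicit,
                             Set<String> declaredIds) {
        // LinkedHashSet deduplicates while preserving insertion order.
        Set<String> merged = new LinkedHashSet<>(explicit);
        merged.addAll(implicit);
        for (String dep : merged) {
            if (!declaredIds.contains(dep)) {
                // Stand-in for GraphDefinitionException.
                throw new IllegalStateException(
                        "Node '" + nodeId + "' depends on non-existent node '" + dep + "'");
            }
        }
        return merged;
    }
}
```

A dependency listed in depends_on and also referenced in an input expression therefore yields a single edge.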

5.4 Expression Compilation

Each Expression AST node is compiled into a BiFunction<NodeResults, GraphContext, Object> — a runtime extractor function.

| Expression Type | Compilation Strategy |
| --- | --- |
| ContextPath | Navigate GraphContext with gc.get(...), then continue through Map lookup or reflective property access. |
| NodeOutputPath | Read results.getRaw(nodeId) and walk remaining segments with the cached property accessor. |
| NodeStreamPath | Register an implicit dependency on nodeId, then read results.getChannel(nodeId) to expose the live stream channel or a derived stream field. |
| TransformFieldPath | Read the transform's virtual-node output map and extract the referenced field. |
| ItemPath | Resolve from the current foreach item bound in GraphContext reserved keys. |
| ItemIndex | Read the current foreach index from reserved keys. |
| LoopPrevPath | Resolve against the previous iteration snapshot stored under the loop reserved context. |
| LoopCarryPath | Resolve against the current loop carry map. |
| LoopIterationRef | Read the current loop iteration counter from reserved keys. |
| LambdaParamPath | Resolve against the lexical lambda parameter frame introduced by a higher-order method call. |
| StringLiteral | Return the constant string value. |
| NumberLiteral | Return the constant double value. |
| BooleanLiteral | Return the constant boolean value. |
| DurationLiteral | Return a java.time.Duration via DurationValue.toDuration(). |
| ObjectLiteral | Compile each field expression and build a LinkedHashMap<String, Object> at runtime. |
| ArrayLiteral | Compile each element expression and build a runtime List<Object>. |
| BinaryOp | Compile both sides, apply arithmetic/comparison/logical coercion rules, and short-circuit logical operators. |
| UnaryOp | Compile the operand and apply numeric negation or boolean inversion. |
| ConditionalExpr | Compile condition/then/else extractors and evaluate only the selected branch at runtime. |
| NullCoalesce | Evaluate the primary value first and only evaluate the fallback when the primary returns null. |
| FunctionCall | Resolve an ExpressionFunction, compile arguments, and invoke apply(Object...) with the evaluated values. |
| WhenExpr | Compile the optional subject plus clause/result expressions, then execute ordered matching with optional otherwise. |
| LambdaExpr | Compile into an internal callable object captured by higher-order collection methods; standalone lambda values are rejected in node and wait/await payload bindings. |
| MethodCallExpr | Compile the receiver and arguments, ensure the method name is in the allowed collection-method set, and dispatch to CollectionOps.invoke(...). |
| GroupExpr | Compile the inner expression transparently; grouping only preserves precedence. |

Higher-order collection methods and lambda semantics

Receiver-style collection helpers currently support: map, filter, flatMap, reduce, groupBy, sortBy, find, any, all, zip, associate, take, drop, chunked, windowed, distinctBy, minBy, maxBy, count, sumBy, and partition.

Lambda-specific rules:

  • LambdaExpr is only meaningful as an argument to a higher-order collection method.
  • reduce requires an init value plus a two-parameter lambda (acc, x) -> ....
  • associate requires two lambdas: one for the key and one for the value.
  • Lambda parameter names participate in reserved-key validation so they cannot shadow DSL runtime identifiers.

The compiled extractors are packaged into a CompiledInputAssembler (via CompiledInputAssembler.ofMap()) which implements the InputAssembler<Map<String, Object>> interface.

Expression Evaluation Semantics

Truthiness: For ConditionalExpr, WhenExpr (Form A), and logical operators, values are coerced to boolean: null → false, Boolean → as-is, Number → false if 0, String → false if empty, all others → true.

Null propagation: Arithmetic and comparison operators return null if either operand is null. The null-coalescing operator (??) is the exception: NullCoalesce returns the fallback when the primary is null.

Type coercion for +: If either operand is a String, the other is converted via String.valueOf() and the result is string concatenation. Otherwise, both operands are coerced to Number for arithmetic addition.
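
The three rules above can be sketched in a few lines of Java. EvalSemanticsSketch and its method names are hypothetical. Two details are assumptions rather than documented behavior: the null check in plus runs before the string check, and non-string operands are taken to already be Number instances.

```java
import java.util.function.Supplier;

// Hypothetical sketch of the documented evaluation rules.
final class EvalSemanticsSketch {
    /** Truthiness coercion used by conditionals and logical operators. */
    static boolean truthy(Object v) {
        if (v == null) return false;
        if (v instanceof Boolean b) return b;
        if (v instanceof Number n) return n.doubleValue() != 0.0;
        if (v instanceof String s) return !s.isEmpty();
        return true;
    }

    /** The + operator: null propagation, then string concatenation, then numeric addition. */
    static Object plus(Object left, Object right) {
        if (left == null || right == null) return null; // assumption: null wins over concatenation
        if (left instanceof String || right instanceof String) {
            return String.valueOf(left) + String.valueOf(right);
        }
        // Assumption: both operands are Number instances at this point.
        return ((Number) left).doubleValue() + ((Number) right).doubleValue();
    }

    /** The ?? operator: the fallback is only evaluated when the primary is null. */
    static Object coalesce(Object primary, Supplier<Object> fallback) {
        return primary != null ? primary : fallback.get();
    }
}
```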

Property Access Chain

The CachedPropertyAccessor.getProperty(obj, name) utility resolves properties in this order:

  1. Map.get(key) — if the object is a Map
  2. Record component accessor — if the object is a Java record
  3. Getter method (getXxx) — standard JavaBean convention
  4. Public field — direct field access
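
A minimal reflective sketch of this resolution order is shown below. PropertyAccessSketch is a hypothetical name, the caching implied by CachedPropertyAccessor is omitted, and returning null when nothing matches is an assumption. The nested User and Legacy types exist only to exercise the lookup.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.lang.reflect.RecordComponent;
import java.util.Map;

// Hypothetical, uncached sketch of the four-step property lookup.
final class PropertyAccessSketch {
    static Object getProperty(Object obj, String name) {
        if (obj == null || name.isEmpty()) return null;
        // 1. Map lookup
        if (obj instanceof Map<?, ?> map) return map.get(name);
        Class<?> type = obj.getClass();
        try {
            // 2. Record component accessor
            if (type.isRecord()) {
                for (RecordComponent rc : type.getRecordComponents()) {
                    if (rc.getName().equals(name)) return rc.getAccessor().invoke(obj);
                }
            }
            // 3. JavaBean getter (getXxx)
            String getter = "get" + Character.toUpperCase(name.charAt(0)) + name.substring(1);
            for (Method m : type.getMethods()) {
                if (m.getName().equals(getter) && m.getParameterCount() == 0) return m.invoke(obj);
            }
            // 4. Public field
            for (Field f : type.getFields()) {
                if (f.getName().equals(name)) return f.get(obj);
            }
            return null; // assumption: an unresolved property yields null
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot read property '" + name + "'", e);
        }
    }

    // Sample types used only for illustration.
    record User(String name) {}
    static final class Legacy {
        public int code = 7;
        public String getLabel() { return "legacy"; }
    }
}
```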

5.5 Branch Compilation

A BranchDef is compiled into an Edge.Conditional (implementing ConditionalEdge):

BranchDef  →  Edge.Conditional(fromNodeId, conditionField, branches, otherwise)

Condition Expression

The branch condition (branch on <expr>) must be a NodeOutputPath expression or, for a condition computed by a transform, a TransformFieldPath (transforms compile to virtual nodes; see §5.8). The compiler extracts:

  • fromNodeId: the referenced node or transform ID
  • conditionField: the dot-joined path segments (e.g., "status", "output.mode"), or null if segments are empty

If the condition is neither of these, a GraphDefinitionException is thrown.

Case Predicate Building

Each BranchCase value is compiled into a Predicate<Object> with type-lenient comparison:

| Case Value Type | Predicate Logic |
| --- | --- |
| BooleanLiteral | If runtime value is Boolean → direct == comparison; otherwise → case-insensitive String.valueOf() comparison |
| StringLiteral | sl.value().equals(val) or sl.value().equals(String.valueOf(val)) |
| NumberLiteral | If runtime value is Number → doubleValue() comparison; otherwise → false |
| Other | Objects.equals(val, evaluateLiteral(caseValue)) fallback |
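
The type-lenient comparison rules can be illustrated with plain java.util.function.Predicate instances. CasePredicateSketch and its factory method names are hypothetical; the shipping compiler builds these predicates from BranchCase AST values rather than from raw Java literals.

```java
import java.util.function.Predicate;

// Hypothetical sketch of the per-literal predicate rules.
final class CasePredicateSketch {
    /** Boolean case: direct comparison for Boolean values, else case-insensitive text match. */
    static Predicate<Object> forBoolean(boolean expected) {
        return val -> (val instanceof Boolean b)
                ? b == expected
                : String.valueOf(expected).equalsIgnoreCase(String.valueOf(val));
    }

    /** String case: matches the value itself or its String.valueOf() form. */
    static Predicate<Object> forString(String expected) {
        return val -> expected.equals(val) || expected.equals(String.valueOf(val));
    }

    /** Number case: compares doubleValue(); non-numeric runtime values never match. */
    static Predicate<Object> forNumber(double expected) {
        return val -> val instanceof Number n && n.doubleValue() == expected;
    }
}
```

Under these rules a case value of "5" matches the integer 5, while a numeric case value of 1 does not match the string "1".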

Otherwise Target

If an otherwise clause is present, its target node ID is stored as the conditional edge's default branch. If the otherwise target references a non-existent node, a GraphDefinitionException is thrown.

Validation

  • The fromNodeId must exist in the node set
  • Each branch case target must exist in the node set
  • The otherwise target (if present) must exist in the node set

5.6 Resilience Configuration Compilation

Node resilience settings are compiled into a ResilienceConfig record:

Timeout

bloge
timeout = 3s

DurationValue(3, "s").toDuration() → java.time.Duration.ofSeconds(3)

| DSL Unit | Java Conversion |
| --- | --- |
| ms | Duration.ofMillis(amount) |
| s | Duration.ofSeconds(amount) |
| m | Duration.ofMinutes(amount) |

Retry

bloge
retry = {
    attempts: 3
    backoff: 200ms
    strategy: exponential
}

→ ResilienceConfig(retryAttempts=3, retryBackoff=Duration.ofMillis(200), backoffStrategy=EXPONENTIAL, ...)

| Config Key | Type | Default | Description |
| --- | --- | --- | --- |
| attempts | int | 0 | Max retry attempts |
| backoff | Duration | 100ms | Backoff delay |
| strategy | String | "fixed" | Backoff strategy name |

Strategy string mapping:

| DSL Value | BackoffStrategy Enum |
| --- | --- |
| "exponential" | EXPONENTIAL |
| "jitter" | JITTER |
| "fixed" | FIXED |
| (any other) | FIXED (default) |

The strategy field is optional; if omitted, defaults to "fixed".
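
The unit and strategy mappings above amount to two small lookups, sketched here with hypothetical names (ResilienceMappingSketch, toDuration, toStrategy). The enum is redeclared locally so the block compiles on its own. Exact, case-sensitive matching and the exception for an unknown unit are assumptions; the specification does not state either.

```java
import java.time.Duration;

// Hypothetical sketch of the duration-unit and backoff-strategy mappings.
final class ResilienceMappingSketch {
    enum BackoffStrategy { FIXED, EXPONENTIAL, JITTER }

    static Duration toDuration(long amount, String unit) {
        switch (unit) {
            case "ms": return Duration.ofMillis(amount);
            case "s":  return Duration.ofSeconds(amount);
            case "m":  return Duration.ofMinutes(amount);
            // Assumption: units outside the table are rejected.
            default:   throw new IllegalArgumentException("Unsupported duration unit: " + unit);
        }
    }

    static BackoffStrategy toStrategy(String name) {
        if (name == null) return BackoffStrategy.FIXED; // omitted strategy defaults to fixed
        switch (name) {
            case "exponential": return BackoffStrategy.EXPONENTIAL;
            case "jitter":      return BackoffStrategy.JITTER;
            default:            return BackoffStrategy.FIXED; // "fixed" and any other value
        }
    }
}
```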

Fallback

bloge
fallback = { status: "error", code: 500 }
fallback = "default_value"

The fallback expression is eagerly evaluated at compile time (not at runtime) via evaluateFallbackExpression(). The resulting value is wrapped in a Supplier<?> closure:

| Expression Type | Fallback Value |
| --- | --- |
| StringLiteral | The string value |
| NumberLiteral | The double value |
| BooleanLiteral | The boolean value |
| ObjectLiteral | LinkedHashMap<String, Object> (recursively evaluated) |
| ArrayLiteral | ArrayList<Object> (recursively evaluated) |
| Other | null |

5.7 Schema Resolution & Validation

Schema Declarations

Schemas can be declared in four forms:

  1. Top-level named schema (SchemaDef):

    bloge
    schema UserOutput {
        id: Int
        name: String
        email: String?
    }
  2. Inline output schema (within a node):

    bloge
    output {
        id: Int
        name: String
    }
  3. Schema reference (within a node):

    bloge
    output: UserOutput
  4. Inline input schema (within a node, disambiguated from InputBlock by lookahead):

    bloge
    input {
        name: String
        age: Int
    }

Named Schema Registry

Top-level SchemaDef members are registered in a Map<String, SchemaDescriptor> during compilation. When a SchemaRef is encountered, it is resolved against this registry. If the referenced schema is not found, a GraphDefinitionException is thrown: "Referenced schema '<name>' not found (at <line>:<column>)".

Type Name Mapping

Field type names in schema declarations are mapped to Java types:

| DSL Type | Java Type |
| --- | --- |
| String | String.class |
| Int | Integer.class |
| Integer | Integer.class |
| Long | Long.class |
| Double | Double.class |
| Float | Float.class |
| Boolean | Boolean.class |
| Bool | Boolean.class |
| Number | Number.class |
| Object | Object.class |
| Map | Map.class |
| List | List.class |
| (unknown) | Object.class |

Optional Fields

A field whose type is suffixed with ? is marked as required=false:

bloge
email: String?    // required=false
name: String      // required=true (default)

Nested Schemas

Fields can have nested inline schemas:

bloge
address: {
    street: String
    city: String
    zip: String?
}

This produces a FieldDeclaration with typeName="Object" and nested=InlineSchema(...).

Schema Path Validation

When SchemaValidationLevel is not OFF, the compiler validates NodeOutputPath expressions against declared output schemas:

  1. For each NodeOutputPath in input bindings, look up the referenced node's outputSchema
  2. Walk each path segment against the schema, verifying that each field exists
  3. If a field is not found in the schema:
    • SchemaValidationLevel.WARN → log a warning
    • SchemaValidationLevel.ERROR → collect the error; after all validations, throw GraphDefinitionException with all errors

Validation is skipped for nodes with OpaqueSchema (no declared schema) output.

Schema Validation Levels

| Level | Behavior |
| --- | --- |
| OFF | No schema path validation at compile time |
| WARN | Log warnings for invalid paths (default) |
| ERROR | Throw GraphDefinitionException for invalid paths |

Auto-Introspection

If a node has no explicit input/output schema declaration (OpaqueSchema), the compiler attempts to auto-introspect schemas from the operator's metadata via registry.metadata(operatorRef). This allows operators that declare their own schemas to have them used for path validation without explicit DSL declarations.

5.8 Transform Compilation

A TransformDef is compiled into a virtual NodeSpec backed by the framework's built-in TransformOperator. This approach provides observability (transform input/output visible in execution logs and Studio) at the cost of negligible overhead.

Compilation Strategy

Each TransformDef is compiled as follows:

  1. Expression compilation — each TransformField.value expression is compiled into a BiFunction<NodeResults, GraphContext, Object> using the same expression compilation pipeline as node input blocks (§5.4), including all enhanced expression types (binary/unary ops, function calls, when, ??, ?:)
  2. Dependency inference — all NodeOutputPath and TransformFieldPath references within field expressions are collected as implicit dependencies of the transform's virtual node
  3. Input assembler generation — field extractors are packaged into a CompiledTransformAssembler (implements InputAssembler<Map<String, Object>>), which evaluates all field expressions and returns a Map<String, Object> as the transform output
  4. NodeSpec generation — a NodeSpec is created with:
    • id = transform ID
    • operator = TransformOperator (a built-in operator that returns input directly — all computation happens in the assembler)
    • metadata = {"__kind__" → "transform"} (used by Studio for differentiated visual rendering and excluded from complexity metrics)
    • No resilience configuration (no timeout, retry, fallback)
  5. Schema generation — a StructuredSchema is auto-generated from the transform's fields using type inference (§5.10)

TransformOperator

java
public final class TransformOperator implements Operator<Map<String,Object>, Map<String,Object>> {
    @Override
    public Map<String,Object> execute(Map<String,Object> input, OperatorContext ctx) {
        return input; // passthrough — all computation is in the InputAssembler
    }

    @Override
    public Idempotency idempotency() { return Idempotency.IDEMPOTENT; }

    public SideEffectType sideEffectType() { return SideEffectType.READ_ONLY; }
}

Transform Ordering & Cycle Detection

Transforms may reference other transforms (e.g., transformA.field used in transformB's expression). The compiler performs topological sort on all transforms using Kahn's algorithm:

  1. Build a dependency graph among transforms
  2. Detect cycles — if a cycle is found, throw GraphDefinitionException: "Circular dependency detected among transforms: [transformA → transformB → transformA]"
  3. Compile transforms in topological order so that downstream transforms can reference upstream transforms' schemas during type inference
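
For reference, Kahn's algorithm over transform dependencies can be sketched as below. TransformOrderSketch is a hypothetical name, and IllegalStateException stands in for GraphDefinitionException. Unlike the documented message, this sketch reports the set of transforms left unsorted, not the exact cycle path.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of transform ordering with cycle detection.
final class TransformOrderSketch {
    /** deps maps each transform ID to the IDs it references (nodes or transforms). */
    static List<String> sort(Map<String, Set<String>> deps) {
        Map<String, Integer> pending = new LinkedHashMap<>();      // unresolved transform deps
        Map<String, List<String>> dependents = new HashMap<>();   // reverse edges
        for (Map.Entry<String, Set<String>> e : deps.entrySet()) {
            int count = 0;
            for (String dep : e.getValue()) {
                if (!deps.containsKey(dep)) continue; // a node reference, not a transform
                count++;
                dependents.computeIfAbsent(dep, k -> new ArrayList<>()).add(e.getKey());
            }
            pending.put(e.getKey(), count);
        }
        Deque<String> ready = new ArrayDeque<>();
        pending.forEach((id, n) -> { if (n == 0) ready.add(id); });
        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String id = ready.remove();
            order.add(id);
            for (String next : dependents.getOrDefault(id, List.of())) {
                if (pending.merge(next, -1, Integer::sum) == 0) ready.add(next);
            }
        }
        if (order.size() < pending.size()) {
            List<String> stuck = new ArrayList<>(pending.keySet());
            stuck.removeAll(order);
            throw new IllegalStateException(
                    "Circular dependency detected among transforms: " + stuck);
        }
        return order;
    }
}
```

Any transform that never reaches a pending count of zero must sit on a cycle or behind one, which is why comparing the sorted count with the total is sufficient for detection.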

Let Bindings (§5.8.1)

Transform blocks support let bindings to name intermediate computation results. A let binding is an immutable, declarative name binding scoped to the transform evaluation — it does not declare a mutable variable.

Syntax:

bloge
transform summary {
  let subtotal = fetchProducts.output.items.sumBy(x -> x.price)
  let discount = when { ctx.tier == "premium" -> 0.15   otherwise -> 0.0 }
  total    = subtotal * (1.0 - discount)
  itemCount = size(fetchProducts.output.items)
}

Rules:

  • let bindings must appear before field assignments in the transform body
  • let bindings can reference earlier let bindings (forward references are not allowed)
  • A let binding name must not conflict with: node IDs, reserved DSL keywords (ctx, prev, carry, loopIteration, stream, output, let), or other let binding names in the same transform
  • let values are not exposed in NodeResults — they are internal to the transform evaluation scope and invisible to other nodes
  • let expressions follow the same purity rules as transform field expressions

AST representation:

  • TransformDef has a letBindings: List<LetBinding> field (in addition to fields)
  • LetBinding(name: String, value: Expression, line: int, column: int)

Transform Constraints

| Constraint | Rule |
| --- | --- |
| Pure functions only | ExpressionFunction calls in transforms must have isPure() == true; calling an impure function produces a compile error |
| No circular references | Transform A referencing transform B requires B to not reference A directly or transitively |
| Upstream SKIP propagation | If all upstream dependencies of a transform are SKIPPED at runtime, the transform is also SKIPPED |
| No resilience configuration | timeout, retry, fallback blocks are not allowed in transform definitions |
| No depends_on | Dependencies are automatically inferred from expressions; explicit depends_on is not supported |
| Referenceable by branch on | branch on transformId.fieldName is valid — the transform compiles to a NodeSpec, enabling branch conditions on computed fields |

Transform DSL Example

bloge
graph orderProcess {
  node fetchUser : FetchUserOperator { input { userId = ctx.userId } }
  node fetchProducts : FetchProductsOperator { input { cartId = ctx.cartId } }
  node calcPrice : CalcPriceOperator { input { items = fetchProducts.output.items } }

  /// Aggregate order summary data from multiple upstream nodes
  transform orderSummary {
    /// Full customer display name
    customerName = concat(fetchUser.output.firstName, " ", fetchUser.output.lastName)
    itemCount    = size(fetchProducts.output.items)
    totalWithTax = calcPrice.output.subtotal * 1.08
    isPremium    = fetchUser.output.vipLevel > 3
    tier: String = when { isPremium -> "premium"   otherwise -> "standard" }
  }

  node createOrder : CreateOrderOperator {
    input {
      customer = orderSummary.customerName
      total    = orderSummary.totalWithTax
    }
  }
}

node / branch on / transform Orthogonal Relationship

| Dimension | node | branch on | transform |
| --- | --- | --- | --- |
| Semantics | Execute an operator | Select execution path | Compute data fields |
| DAG element | Concrete node | Conditional edge | Virtual node |
| Input | input {} / depends_on | Single node, single field | Multi-upstream, multi-field (auto-inferred) |
| Output | Operator output | Routes to N-choose-1 node | Named field set |
| Reference syntax | nodeId.output.field | Implicit (target nodes selected/skipped) | transformId.field |
| Resilience config | ✅ (timeout, retry, fallback) | | ❌ (not applicable) |
| Studio visual | Solid rectangle (blue) | Diamond (orange) | Dashed rounded rectangle (light purple) |

Common Composition Patterns

  • Transform → Node: Transform adapts data, then feeds to an operator node
  • Transform → Branch: Transform computes a condition field, branch on references it for routing (limited to "calculator-level" simple conditions; domain-knowledge-based decisions should be operators)
  • Branch → Transform: After branch selects a path, transform assembles data for downstream nodes
  • Transform → Transform: Chained transforms (compiler sorts + detects cycles)

5.9 Scope Mode Compilation

The scope property controls scope visibility for sub-graph constructs (loop, foreach, subgraph nodes). It determines whether the sub-graph body can see parent graph node outputs and inherits the parent GraphContext.

Default Scope Matrix

| Construct | Default | Rationale |
| --- | --- | --- |
| loop | parent | Inline body, lexically nested — natural to see parent |
| foreach | parent | Inline body, lexically nested — same as loop |
| subgraph("name") | isolated | External/reusable graph — encapsulation by default |

Compilation Behavior

| Scope Mode | Behavior |
| --- | --- |
| parent | The compiler collects NodeOutputPath references to parent graph nodes from the sub-graph body expressions (collectNodeOutputRefs). These references are registered with the sub-compiler via withParentNodeIds(). At runtime, parent node outputs are injected into the sub-graph context as __parentOutput_<nodeId>__ keys. The sub-graph's compiled expressions read from these context keys instead of NodeResults. |
| isolated | The sub-graph is compiled as a fresh, independent graph with no awareness of parent scope. Only explicit input {} bindings are visible. No __parentOutput_* keys are injected. |

ForEach Parent Scope

When foreach has scope = parent (the default):

  1. collectNodeOutputRefs(body) scans body node input expressions for NodeOutputPath references to parent graph nodes
  2. The sub-compiler is configured with withParentNodeIds(parentRefNodeIds) so expressions like parentNode.output.field compile to context reads
  3. The ForEachOperator receives both the items list and parent node outputs in a combined input map (__items__ + __parentOutput_* keys)
  4. Each item execution merges the parent graph context and parent outputs into the sub-graph context before running

ForEach Execution Semantics

  • itemsExpr is compiled once into an extractor that produces the collection fed to the hyper-node.
  • The foreach body is compiled as a nested graph named <foreachId>__subgraph.
  • sequential switches the runtime from default fan-out execution to ordered item-by-item execution.
  • stream foreach selects StreamingForEachOperator; non-streaming foreach selects ForEachOperator.
  • buffer = N is stored in NodeMetadata and forwarded to the streaming runtime as buffer sizing metadata.
  • The foreach node itself exposes an OpaqueSchema output because the aggregate result shape depends on the nested graph and runtime fan-out policy.

Loop Execution Semantics

  • The loop body is compiled as a nested graph named <loopId>__subgraph.
  • max_iterations becomes a hard runtime guard, while delay is converted to a Duration inserted between iterations.
  • until is compiled into a predicate over the current iteration's node outputs and only terminates the loop when the expression evaluates to Boolean.TRUE.
  • carry { ... } is compiled into a carry-mapper; when omitted, the runtime carries forward the previous iteration's full output map.
  • Explicit depends_on plus any parent-scope references become loop-node dependencies in the outer graph.
  • stream loop selects StreamingLoopOperator; non-streaming loop selects LoopOperator.
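
The interplay of max_iterations, until, and carry can be sketched as a plain Java loop. Everything here is illustrative: LoopSemanticsSketch and the functional parameter shapes are hypothetical, delay is omitted, and two behaviors are assumptions (until is checked before the carry is updated, and hitting the iteration guard returns the last outputs instead of raising an error).

```java
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.Function;
import java.util.function.UnaryOperator;

// Hypothetical sketch of the loop runtime contract; not the shipping LoopOperator.
final class LoopSemanticsSketch {
    static Map<String, Object> run(
            Map<String, Object> initialCarry,
            int maxIterations,
            BiFunction<Map<String, Object>, Integer, Map<String, Object>> body,
            Function<Map<String, Object>, Object> until,
            UnaryOperator<Map<String, Object>> carryMapper) {
        Map<String, Object> carry = initialCarry;
        Map<String, Object> outputs = Map.of();
        for (int i = 0; i < maxIterations; i++) {           // max_iterations is a hard guard
            outputs = body.apply(carry, i);                  // i plays the role of loopIteration
            // Only Boolean.TRUE terminates; other truthy values do not.
            if (Boolean.TRUE.equals(until.apply(outputs))) break;
            // With no carry block, the full output map is carried forward.
            carry = (carryMapper != null) ? carryMapper.apply(outputs) : outputs;
        }
        return outputs;
    }
}
```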

Streaming Semantics

  • stream is a member prefix on node, foreach, and loop; it does not create a separate AST node type.
  • The parser records streaming intent as streaming=true plus an optional bufferSize on the wrapped member.
  • nodeId.stream and nodeId.stream.field compile to NodeStreamPath, which also records a stream dependency so the runtime can wire the live channel correctly.
  • Standard consumer nodes keep their normal operator references; stream behavior is carried through NodeMetadata, NodeResults.getChannel(nodeId), and the streaming operator implementations.

SubGraph Parent Scope

When a subgraph("name") node has scope = parent:

  1. The SubGraphOperator merges the parent GraphContext into the sub-graph context before overlaying the explicit input {} values
  2. This allows the child graph's operators to access parent-level context entries

5.10 Expression Type Inference

The compiler infers types for transform fields and validates expression type consistency. Type inference is used to auto-generate StructuredSchema for transforms and to validate downstream path references.

Inference Rules

| Expression | Inferred Type |
| --- | --- |
| StringLiteral | String |
| NumberLiteral | Number |
| BooleanLiteral | Boolean |
| DurationLiteral | Duration |
| NodeOutputPath | Looked up from upstream operator output schema |
| NodeStreamPath | Object |
| TransformFieldPath | Recursively inferred from the referenced transform field |
| ContextPath | Unknown |
| ItemPath / LoopPrevPath / LoopCarryPath | Object |
| ItemIndex / LoopIterationRef | Integer |
| LambdaParamPath | Object |
| a + b (both Number) | Number |
| a + b (either side String) | String |
| a * b, a - b, a / b, a % b | Number |
| a > b, a == b, a != b, etc. | Boolean |
| a && b, a \|\| b, !a | Boolean |
| a ?? b | Type of the non-Unknown side; if both known, their common compatible type |
| cond ? a : b | Common type of both branches |
| when { ... } | Common type of all right-hand-side expressions |
| func(args...) | ExpressionFunction.returnType(String...) result |
| MethodCallExpr | Method-specific (List, Map, Boolean, or Object) |
| LambdaExpr | Function |
| ObjectLiteral | Object |
| ArrayLiteral | List |

Unknown Type Handling

When inference produces Unknown, behavior depends on the SchemaValidationLevel:

| Level | Behavior for Unknown type |
| --- | --- |
| OFF | Ignored — no schema generated for Unknown fields |
| WARN | Compile warning suggesting explicit type annotation (field: Type = expr) |
| ERROR | Compile error requiring explicit type annotation |

Explicit Type Annotation

Transform fields support optional explicit type annotations that override inference:

bloge
transform orderSummary {
  tier: String = when { isPremium -> "premium"   otherwise -> "standard" }
  score: Double = riskCalc.output.rawScore * 0.85
}

When an explicit annotation is present, the compiler verifies that the inferred type is compatible with the declared type. A mismatch produces a compile warning (at WARN level) or error (at ERROR level).

5.11 Built-in Function Library

Enhanced expressions support function calls within input {} blocks and transform blocks. Functions are resolved against an ExpressionFunction registry. The framework provides built-in functions; custom functions can be registered via the ExpressionFunction SPI.

ExpressionFunction Interface

java
public interface ExpressionFunction {
    String name();
    Object apply(Object... args);

    /**
     * Report the result type for the given argument types.
     * Implementations must return an explicit type name instead of silently
     * defaulting to Unknown.
     */
    String returnType(String... argTypes);

    /** Whether this function has no side effects. */
    default boolean isPure() { return true; }
}

Built-in Functions

| Category | Functions |
| --- | --- |
| String | concat(s...), substring(s, start, end?), uppercase(s) / upper(s), lowercase(s) / lower(s), trim(s), replace(s, target, replacement), replaceAll(s, regex, replacement), startsWith(s, prefix), endsWith(s, suffix), split(s, delimiter), join(list, delimiter), length(s) / len(s), indexOf(s, target), matches(s, regex), padLeft(s, length, padChar?), padRight(s, length, padChar?) |
| Math | abs(n), min(a, b), max(a, b), round(n), ceil(n), floor(n), sum(list), avg(list), clamp(n, min, max), pow(base, exp) |
| Collection | size(c), contains(c, elem), first(c), last(c), isEmpty(c), distinct(list), flatten(listOfLists), sort(list), take(list, n), drop(list, n), any(list, value), reverse(list), all(list, value) |
| Map/Object | keys(map), values(map), entries(map), merge(map1, map2), has(map, key) |
| Null | coalesce(a, b, ...), isNull(v), isNotNull(v) |
| Type | toString(v), toNumber(v), toBoolean(v), toInt(v), typeOf(v) |
| Date/Time | now() ⚠, today() ⚠, formatDate(isoStr, pattern), parseDate(str, pattern), addDuration(isoStr, duration), diffDuration(iso1, iso2, unit) |
| ID/Utility | uuid() ⚠, format(template, args...) |
| Hash | md5(s), sha1(s), sha256(s), sha512(s) |
| Encoding | base64Encode(s), base64Decode(s), hexEncode(s), hexDecode(s), urlEncode(s), urlDecode(s) |
| Crypto (SPI) | hmacSha256(key, data), hmacSha512(key, data), aesEncrypt(key, data), aesDecrypt(key, data); all require SecretProvider SPI |
| JSON (SPI) | toJson(v), fromJson(jsonStr); both require JsonCodec SPI |
| Secrets (SPI) | secret(name) ⚠: resolves a named secret via SecretProvider SPI; impure (treated like uuid()) |

Aliases: upper → uppercase, lower → lowercase, len → length. All alias names are accepted.

Date/Time Design: All date/time functions operate on ISO-8601 strings (e.g., "2024-01-15T10:30:00Z"), keeping the DSL type system to String/Number/Boolean without introducing new value types. addDuration accepts DSL duration literals ("2h", "30m", "1d") or ISO-8601 durations ("PT2H").

⚠ Impure Functions: Functions marked with ⚠ (now, today, uuid, secret) are non-deterministic (impure). They are not allowed in transform blocks; the compiler rejects them with a GraphDefinitionException. In input {} blocks, they are evaluated once per node execution (not re-evaluated on retry).

Collection method calls and lambdas

Collection method calls are receiver-based (orders.map(o -> o.id)) rather than registry-based. They are compiled separately from ExpressionFunction calls and currently dispatch through CollectionOps.

Additional constraints:

  • method names must come from the allowed collection-method set
  • the receiver must evaluate to a List<?>; otherwise the runtime result is null
  • lambdas are expression-bodied only and capture the surrounding reserved execution context lexically
  • standalone lambda literals are rejected for ordinary node input bindings and wait/await payload fields
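The receiver rule above (a non-list receiver yields null rather than an error) can be sketched as follows. `CollectionOpsSketch` is a hypothetical stand-in written for this illustration; it is not the real CollectionOps class, whose signatures are not shown in this specification.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical stand-in showing receiver-style dispatch for a map(...) method call. */
final class CollectionOpsSketch {
    /** Applies the lambda to each element; returns null when the receiver is not a List. */
    static List<Object> map(Object receiver, Function<Object, Object> lambda) {
        if (!(receiver instanceof List<?>)) {
            return null; // per the constraint list: non-list receivers evaluate to null
        }
        List<Object> out = new ArrayList<>();
        for (Object item : (List<?>) receiver) {
            out.add(lambda.apply(item));
        }
        return out;
    }
}
```

Under this sketch, `orders.map(o -> o.id)` on a list behaves like a plain element-wise mapping, while the same call on a scalar receiver silently produces null.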

5.11.1 Impure Functions

Four built-in functions are impure (non-deterministic):

| Function | Return | Behavior |
| --- | --- | --- |
| now() | ISO-8601 timestamp string | Returns current instant at evaluation time |
| today() | yyyy-MM-dd date string | Returns current date at evaluation time |
| uuid() | UUID v4 string | Generates a random UUID |
| secret(name) | String | Resolves a named secret via SecretProvider SPI; non-deterministic |

Restrictions and semantics:

  1. Transform blocks: Impure functions are forbidden. The compiler checks isPure() and throws: "Function '<name>' is not pure and cannot be used in transform blocks"
  2. Input blocks: Impure functions are allowed. They are evaluated once when InputAssembler.assemble() is called, which happens before the operator executes — specifically before the retry wrapper. Retry does not re-evaluate input expressions.
  3. Concurrent evaluation: Sibling nodes evaluated concurrently may get different now() values. If a consistent timestamp is needed across nodes, compute it in an upstream node's output or pass it via ctx.
  4. Replay/Audit: Since impure functions produce different results on each execution, workflows using them are not fully deterministic. For auditability, consider passing generated values (timestamps, IDs) via ctx instead.
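The "evaluated once, before the retry wrapper" rule in item 2 can be sketched like this. The class and method names are hypothetical, and the retry loop is a deliberately simplified stand-in for the real resilience wrapper (no backoff, no timeout).

```java
import java.util.Map;
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical sketch: inputs are assembled once, and only the operator call is retried. */
final class InputOnceSketch {
    static <R> R runWithRetry(Supplier<Map<String, Object>> assembleInputs,
                              Function<Map<String, Object>, R> operator,
                              int attempts) {
        // Impure functions such as uuid() or now() would run here, exactly once.
        Map<String, Object> inputs = assembleInputs.get();
        RuntimeException last = null;
        for (int i = 0; i < attempts; i++) {
            try {
                return operator.apply(inputs); // every attempt sees the same input map
            } catch (RuntimeException e) {
                last = e;
            }
        }
        if (last == null) throw new IllegalArgumentException("attempts must be positive");
        throw last;
    }
}
```

Because the input map is built outside the loop, a request ID generated by `uuid()` stays stable across attempts, which is usually what an idempotency key needs.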

Custom Function Registration

Business teams can register custom functions via the ExpressionFunction SPI (e.g., formatCurrency(), maskPhone()). Custom functions:

  • Must implement the ExpressionFunction interface
  • Are registered in the ExpressionFunction registry (discoverable via ServiceLoader or Spring auto-configuration)
  • If used in transform blocks, must declare isPure() == true; impure functions in transforms produce a compile error
  • Must implement returnType() so type inference stays explicit and deterministic
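A custom function that follows these rules might look like the sketch below. The nested `ExpressionFunction` interface is copied from the listing earlier in this section so the example compiles on its own; `MaskPhone` and its masking rule (keep the last four characters) are hypothetical, and the map-backed `Registry` is only a stand-in for the real registry that is discovered via ServiceLoader or Spring auto-configuration.

```java
import java.util.HashMap;
import java.util.Map;

final class CustomFunctionSketch {
    /** Copied from the interface listing in this section. */
    interface ExpressionFunction {
        String name();
        Object apply(Object... args);
        String returnType(String... argTypes);
        default boolean isPure() { return true; }
    }

    /** Hypothetical custom function: masks all but the last four characters. */
    static final class MaskPhone implements ExpressionFunction {
        @Override public String name() { return "maskPhone"; }

        @Override public Object apply(Object... args) {
            if (args.length != 1 || args[0] == null) return null;
            String s = args[0].toString();
            if (s.length() <= 4) return s;
            return "*".repeat(s.length() - 4) + s.substring(s.length() - 4);
        }

        // Explicit return type, so inference never falls back to Unknown.
        @Override public String returnType(String... argTypes) { return "String"; }
        // isPure() keeps its default of true, so the function is usable in transforms.
    }

    /** Hypothetical in-memory registry used only for this illustration. */
    static final class Registry {
        private final Map<String, ExpressionFunction> byName = new HashMap<>();
        void register(ExpressionFunction fn) { byName.put(fn.name(), fn); }
        ExpressionFunction lookup(String name) { return byName.get(name); }
    }
}
```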

Expression Power Boundary (Design Red Lines)

The enhanced expression system is intentionally bounded to preserve the declarative nature of the DSL:

  • Allowed: ??, ?:, when {}, comparison (>, <, ==, !=, >=, <=), logical (&&, ||, !), arithmetic (+, -, *, /, %), ExpressionFunction calls, receiver-style collection method calls, single-expression lambdas, and parentheses grouping
  • Prohibited: if/else statement blocks, for/while statements, multi-statement sequential execution, statement-scoped variable declarations outside transform let, try/catch, and named user-defined functions

Every line in transform and input {} blocks must maintain the fieldName = <expression> declarative form. All conditional logic must be expressions (returning a value), not statements (with side effects).

when Expression Details

The when expression has two forms, both using the same -> and otherwise syntax as branch on for consistency, though the semantics are orthogonal:

Form A — No subject (each clause left-hand side is a boolean condition):

bloge
tier = when {
  score > 0.8 -> "critical"
  score > 0.5 -> "elevated"
  otherwise   -> "low"
}

Form B — With subject (each clause left-hand side is a match value):

bloge
warehouse = when fetchOrder.output.region {
  "华东" -> "SH-01"
  "华南" -> "GZ-02"
  otherwise -> "DEFAULT"
}

when vs branch on orthogonality: branch on is the control plane (decides which nodes execute/skip); when is the data plane (decides what value a field takes). branch on only appears at graph body top level; when only appears in expression position. They share syntax style (-> arrows + otherwise) but are semantically completely different.
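For readers who think in Java, the two when forms shown above correspond roughly to the methods below. This is a sketch, not the compiler's lowering: it assumes clauses are tested in source order with the first match winning, and that Form B compares the subject to each match value by equality. The region strings are written as Unicode escapes for the literals used in the Form B example.

```java
/** Hypothetical Java equivalents of the two when examples. */
final class WhenSketch {
    /** Form A: each clause is a boolean condition, checked top to bottom. */
    static String tier(double score) {
        if (score > 0.8) return "critical";
        if (score > 0.5) return "elevated";
        return "low"; // otherwise
    }

    /** Form B: the subject is compared against each match value. */
    static String warehouse(String region) {
        if ("\u534e\u4e1c".equals(region)) return "SH-01"; // first match value in the example
        if ("\u534e\u5357".equals(region)) return "GZ-02"; // second match value in the example
        return "DEFAULT"; // otherwise
    }
}
```

Both methods return a value and touch no control flow outside themselves, which is the data-plane property that separates when from branch on.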


5.12 Session Extension Compilation

Session sources are compiled outside the ordinary Graph DAG path. The parser still emits generic ExtensionDef nodes, but the dedicated session compiler interprets them as a structured long-running state machine.

Compilation flow

  1. Parse the root session into ExtensionDef(kind="session").
  2. Normalize session properties such as idle_timeout, timeout_action, max_rounds, and max_history.
  3. Compile each phase child into a PhaseSpec, preserving source order for deterministic transition resolution.
  4. Interpret phase_type as either:
    • once: direct child members run once when the phase activates
    • round: the round { ... } child list becomes the body that repeats until the until condition succeeds or phase limits are hit
  5. Lower the then property into either a default target or an ordered list of conditional transitions.
  6. Reuse the standard node, branch, transform, loop, wait, await, and script compilers for phase members so graph semantics stay consistent inside a phase.

Session-specific semantic rules

  • exactly one top-level session root is allowed in a session document
  • phases are addressable by ID, and every then target must resolve to a declared phase
  • a round phase cannot mix round { ... } with direct executable phase members
  • yield_on lowers to a string list consumed by the session runtime when deciding whether to suspend and wait for external signals
  • until is compiled as an expression against the phase/session execution context and is evaluated after each round or once-phase completion as appropriate
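The ordered-transition lowering of then can be sketched as a first-match scan. The names `Transition` and `resolve` are hypothetical, and conditions are modeled as Java predicates over a context map rather than compiled DSL expressions.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical sketch of resolving a phase's next target from ordered transitions. */
final class ThenResolutionSketch {
    /** Pairs a compiled condition with the ID of the target phase. */
    static final class Transition {
        final Predicate<Map<String, Object>> condition;
        final String target;

        Transition(Predicate<Map<String, Object>> condition, String target) {
            this.condition = condition;
            this.target = target;
        }
    }

    /** Source order is preserved, so the first matching transition wins. */
    static String resolve(List<Transition> ordered, String otherwise, Map<String, Object> ctx) {
        for (Transition t : ordered) {
            if (t.condition.test(ctx)) return t.target;
        }
        return otherwise; // default target when no condition matches
    }
}
```

Keeping the list in source order is what makes resolution deterministic when more than one condition is true.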

Runtime contract boundary

This specification intentionally stops at the compiled session model boundary. Persistence, wake-up delivery, history truncation, and durable resume behavior are owned by bloge-core-ext, bloge-durable, and their codecs, not by the DSL grammar itself.

5.13 Script Node

Script nodes allow embedding dynamic business logic as sandboxed Groovy code directly inside a .bloge graph. They are intended for hot-updatable rules that change faster than the deployment cycle (pricing tiers, risk thresholds, compliance rules), not for general computation.

Dependency: Script nodes require the optional bloge-script module. Without it the compiler throws a GraphDefinitionException explaining how to enable script support.

Syntax

bloge
/// Optional doc comment
script <id> {
  lang         = "groovy"           // optional; default "groovy"
  timeout      = 5s                 // strongly recommended
  input {                           // optional; maps context fields into script variables
    <binding> = <expression>
    ...
  }
  output_schema {                   // optional; validates return value
    <field>: <Type>
    ...
  }
  code = """
    // Groovy code here
    // Last expression is the return value (must be a Map)
    return [field: value]
  """
}

Execution Model

  1. Input assembly: the input block expressions are evaluated against the current graph context, producing a Map<String, Object>.
  2. Script execution: the ScriptOperator calls ScriptEngine.execute(code, inputs). Inputs are wrapped with Collections.unmodifiableMap before binding.
  3. Output: the last evaluated expression in the Groovy script becomes the return value. It must be a Map; otherwise a ScriptExecutionException is thrown.
  4. Schema validation: if output_schema is declared, the return map is validated against it.
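Steps 2 and 3 can be sketched in plain Java. The script is modeled here as a `Function` instead of Groovy code, `ScriptExecutionSketch` is a hypothetical name, and `IllegalStateException` stands in for ScriptExecutionException, whose constructor is not shown in this specification.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch of read-only input binding and the Map return check. */
final class ScriptExecutionSketch {
    static Map<?, ?> run(Function<Map<String, Object>, Object> script,
                         Map<String, Object> inputs) {
        // Step 2: the script only ever sees an unmodifiable view of a defensive copy.
        Map<String, Object> readOnly = Collections.unmodifiableMap(new HashMap<>(inputs));
        Object result = script.apply(readOnly);
        // Step 3: anything other than a Map is rejected.
        if (!(result instanceof Map)) {
            throw new IllegalStateException("Script must return a Map, got "
                    + (result == null ? "null" : result.getClass().getSimpleName()));
        }
        return (Map<?, ?>) result;
    }
}
```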

Security Model (Three-Layer Sandbox)

| Layer | Mechanism | What It Blocks |
| --- | --- | --- |
| A: Compile-time AST | SecureASTCustomizer | Dangerous imports (java.io, java.net, java.lang.Runtime, etc.), all static imports, dangerous receivers (File, Socket, URL, Runtime, Thread) |
| B: Class-loading | SandboxedGroovyClassLoader | Any class not in java.lang/util/math/time, groovy.lang/util; blocks reflection, java.io, java.net, javax, groovy.grape, sun.* |
| C: Runtime | Immutable inputs + timeout | Inputs wrapped as unmodifiable; timeout enforced by ResilientOperatorWrapper (default 5 s) |

Operator Registration

The bloge-script module provides GroovyScriptOperatorFactory which implements the ScriptOperatorFactory SPI. Pass it to the compiler:

java
DslCompiler compiler = new DslCompiler(registry)
    .withScriptOperatorFactory(new GroovyScriptOperatorFactory());

With Spring Boot, simply add bloge-script to the classpath — BlogeAutoConfiguration auto-wires the factory via @ConditionalOnClass.

Compilation Behavior

  • A NodeSpec is created with operator ref __script__:<id>.
  • Metadata keys: kind=script, __script_lang__, __script_code__.
  • The ScriptEngine caches compiled classes by SHA-256 hash of the code string — hot-reload only re-compiles when the code changes.
  • Script nodes can have depends_on inferred from their input block expressions (same as regular nodes).
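The hash-keyed caching described above can be sketched with the JDK's `MessageDigest`. `ScriptCacheSketch` and its `compiler` function are hypothetical; the real ScriptEngine's cache structure and eviction policy are not described in this specification.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical sketch: compiled artifacts are cached by the SHA-256 of the code string. */
final class ScriptCacheSketch<C> {
    private final Map<String, C> cache = new ConcurrentHashMap<>();
    private final Function<String, C> compiler; // stand-in for the real Groovy compile step

    ScriptCacheSketch(Function<String, C> compiler) {
        this.compiler = compiler;
    }

    /** Lower-case hex SHA-256 of the UTF-8 bytes of the code string. */
    static String sha256Hex(String code) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(code.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) sb.append(String.format("%02x", b));
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is required on every Java platform", e);
        }
    }

    /** Compiles only when no entry exists for this exact code text. */
    C getOrCompile(String code) {
        return cache.computeIfAbsent(sha256Hex(code), key -> compiler.apply(code));
    }
}
```

With this shape, a hot reload that delivers byte-identical code hits the cache, and any edit to the code string produces a new key and a fresh compilation.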

Lint Rules

| Rule ID | Severity | Condition |
| --- | --- | --- |
| max-script-nodes | WARNING | Graph has more than 2 script nodes |
| script-timeout-required | WARNING | Script node has no timeout |
| script-line-limit | INFO | Script code exceeds 30 lines |

6. Error Reporting

All errors carry precise source positions (line:column) for diagnostic messages.

6.1 Lexer Errors

The lexer throws ParseException for the following conditions:

| Error | Trigger | Message Format |
| --- | --- | --- |
| Unterminated string literal | Reaching EOF while inside a "..." string | `"[line:col] Unterminated string literal"` |
| Unterminated block comment | Reaching EOF with block-comment depth > 0 | `"[line:col] Unterminated block comment"` |
| Unexpected character | Any character not matching any token rule | `"[line:col] Unexpected character '<c>'"` |
| Bare `&` | `&` not followed by `&` | `"[line:col] Unexpected character '&'. Did you mean '&&'?"` |
| Bare `\|` | `\|` not followed by `\|` | `"[line:col] Unexpected character '\|'. Did you mean '\|\|'?"` |

Note: With enhanced expressions, - and / are now valid operator tokens (MINUS and SLASH respectively) and are always emitted by the lexer. Previously these were lexer errors when not part of -> or comment syntax. Invalid usage of these tokens (e.g., MINUS where the parser expects a member keyword) is now caught by the parser rather than the lexer.

6.2 Parser Errors

The parser uses two error handling mechanisms:

ParseException with Position

The expect() method throws ParseException when the expected token type does not match the current token:

"[line:col] <message>, got <actualType>('<actualLexeme>')"

Examples:

  • "[3:5] Expected 'graph', got IDENT('myNode')"
  • "[7:12] Expected '{' after graph name, got COLON(':')"

Error Collection and Synchronization

The parser collects multiple errors rather than aborting on the first one:

  1. When a ParseException is caught during member parsing, its message is added to the errors list
  2. The synchronize() method skips tokens until it finds a synchronization point such as SESSION, PHASE, ROUND, NODE, BRANCH, TRANSFORM, SCHEMA, FOREACH, LOOP, WAIT, AWAIT, STREAM, or RBRACE
  3. Parsing continues from the synchronization point
  4. After parsing completes, if the errors list is non-empty, a single ParseException(List<String>) is thrown containing all collected error messages:
Parse errors:
[3:5] Expected ':' after node id, got IDENT('opA')
[7:1] Expected 'node', 'branch', 'transform', or 'schema', got IDENT('invalid')
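The collect-and-synchronize loop can be sketched on a toy grammar where every member is `node IDENT`. Everything here is hypothetical: tokens are plain strings, the synchronization set is a small subset of the one listed above, and errors are returned rather than thrown as a single ParseException.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of panic-mode recovery over a toy "node IDENT" grammar. */
final class RecoverySketch {
    static final Set<String> SYNC = Set.of("node", "branch", "transform", "schema", "}");

    final List<String> nodeIds = new ArrayList<>();
    final List<String> errors = new ArrayList<>();

    static RecoverySketch parse(List<String> tokens) {
        RecoverySketch result = new RecoverySketch();
        int pos = 0;
        while (pos < tokens.size()) {
            try {
                if (!"node".equals(tokens.get(pos))) {
                    throw new IllegalStateException(
                            "Expected 'node', got '" + tokens.get(pos) + "'");
                }
                if (pos + 1 >= tokens.size()) {
                    throw new IllegalStateException("Expected node id, got EOF");
                }
                result.nodeIds.add(tokens.get(pos + 1));
                pos += 2;
            } catch (IllegalStateException e) {
                result.errors.add(e.getMessage()); // step 1: collect, do not abort
                pos++; // always consume the offending token so the loop makes progress
                // step 2: synchronize by skipping to the next synchronization point
                while (pos < tokens.size() && !SYNC.contains(tokens.get(pos))) pos++;
            }
        }
        return result; // the real parser would throw here if errors is non-empty
    }
}
```

Given the tokens `node a oops x node b bad node c`, the sketch reports two errors and still recovers all three node IDs, which is the point of synchronizing instead of stopping at the first failure.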

6.3 Compiler Errors

The compiler (DslCompiler) throws GraphDefinitionException for semantic errors:

| Error | Trigger | Message |
| --- | --- | --- |
| Unregistered operator | NodeDef.operatorRef not in OperatorRegistry | `"Node '<id>' references unregistered operator '<ref>'"` |
| Non-existent dependency | depends_on or implicit dep references unknown node | `"Node '<id>' depends on non-existent node '<dep>'"` |
| Invalid branch condition | Branch condition is not a NodeOutputPath or TransformFieldPath | `"Branch condition must be a node output path or transform field path expression at <line>:<col>"` |
| Non-existent branch source | Branch condition references unknown node | `"Branch references non-existent node '<id>'"` |
| Non-existent branch target | Branch case target references unknown node | `"Branch target '<id>' references non-existent node"` |
| Non-existent otherwise target | otherwise target references unknown node | `"Branch otherwise target '<id>' references non-existent node"` |
| Schema reference not found | SchemaRef name not in named schema registry | `"Referenced schema '<name>' not found (at <line>:<col>)"` |
| Schema path validation failure | NodeOutputPath segment not in upstream output schema | `"Node '<id>' input binding '<field>': path '<path>' — field '<segment>' not found in output schema of '<nodeId>'"` |
| Cycle detection | DAG validation in Graph constructor detects a cycle | (thrown by bloge-core Graph model) |
| Transform circular dependency | Transform A references transform B which references A | `"Circular dependency detected among transforms: [transformA → transformB → transformA]"` |
| Impure function in transform | Transform field expression calls a function with isPure() == false | `"Transform '<id>' field '<field>': function '<name>' is not pure and cannot be used in transform blocks"` |
| Unresolved function | Function name not found in ExpressionFunction registry | `"Unknown function '<name>' at <line>:<col>"` |
| Transform type mismatch | Explicit type annotation incompatible with inferred type | `"Transform '<id>' field '<field>': declared type '<declared>' is incompatible with inferred type '<inferred>'"` |
| Duplicate transform/node ID | Transform ID conflicts with a node ID | `"Transform '<id>' conflicts with node of the same name"` |
| Non-existent transform reference | TransformFieldPath references unknown transform | `"Reference to non-existent transform '<id>' at <line>:<col>"` |
| Non-existent transform field | TransformFieldPath references unknown field in transform | `"Transform '<id>' has no field '<field>' at <line>:<col>"` |
| Reserved node ID | Node ID matches a reserved DSL keyword (ctx, prev, carry, loopIteration, stream, output) | `"[line:col] Node id '<id>' is a reserved DSL keyword and cannot be used as a node id"` |
| Reserved lambda parameter | Lambda parameter matches a reserved DSL keyword | `"Lambda parameter '<param>' conflicts with the reserved DSL keyword '<param>'"` |
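The transform circular-dependency check in the table can be sketched as a depth-first search that reports the offending path. This is an assumed algorithm chosen for illustration; the specification does not say how DslCompiler detects the cycle, and `TransformCycleSketch` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical DFS sketch that returns a cycle path such as [a, b, a]. */
final class TransformCycleSketch {
    /** deps maps a transform ID to the transform IDs its fields reference. */
    static List<String> findCycle(Map<String, List<String>> deps) {
        Set<String> done = new HashSet<>();
        for (String start : deps.keySet()) {
            List<String> cycle = visit(start, deps, new ArrayList<>(), done);
            if (!cycle.isEmpty()) return cycle;
        }
        return Collections.emptyList();
    }

    private static List<String> visit(String node, Map<String, List<String>> deps,
                                      List<String> path, Set<String> done) {
        if (done.contains(node)) return Collections.emptyList(); // fully explored earlier
        int seenAt = path.indexOf(node);
        if (seenAt >= 0) {
            // The node is already on the current path: close the loop and report it.
            List<String> cycle = new ArrayList<>(path.subList(seenAt, path.size()));
            cycle.add(node);
            return cycle;
        }
        path.add(node);
        for (String next : deps.getOrDefault(node, Collections.emptyList())) {
            List<String> cycle = visit(next, deps, path, done);
            if (!cycle.isEmpty()) return cycle;
        }
        path.remove(path.size() - 1);
        done.add(node);
        return Collections.emptyList();
    }
}
```

Joining the returned path with an arrow separator gives a message in the same shape as the table's example, `[transformA → transformB → transformA]`.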

Appendix A: Complete EBNF Grammar Quick-Reference

ebnf
Program           = GraphDef | SessionDef

GraphDef          = DocComment? "graph" IDENT "{" Member* "}"
SessionDef        = DocComment? "session" IDENT "{" SessionMember* "}"
SessionMember     = SessionProperty | PhaseDef | CommentNode
SessionProperty   = "idle_timeout" "=" DURATION
                  | "timeout_action" "=" STRING
                  | "max_rounds" "=" NUMBER
                  | "max_history" "=" NUMBER
PhaseDef          = DocComment? "phase" IDENT "{" PhaseBody "}"
PhaseBody         = ( PhaseProperty | RoundBlock | Member )*
PhaseProperty     = "max_rounds" "=" NUMBER
                  | "on_round_failure" "=" IDENT
                  | "yield_on" "=" "[" ( IDENT ( "," IDENT )* )? "]"
                  | "until" Expression
                  | ThenProperty
ThenProperty      = "then" "->" IDENT
                  | "then" "{" ( ( Expression | "otherwise" ) "->" IDENT )* "}"
RoundBlock        = "round" "{" Member* "}"

Member            = NodeDef | BranchDef | TransformDef | SchemaDef | CommentNode
                  | ForEachDef | LoopDef | WaitDef | AwaitDef | ScriptDef | StreamMember
DocComment        = DOC_COMMENT+
StreamMember      = DocComment? "stream" ( NodeDef | ForEachDef | LoopDef )

NodeDef           = DocComment? "node" IDENT ":" IDENT "{" NodeBody "}"
NodeBody          = ( InputBlock | InputSchemaBlock | OutputDecl | DependsOn
                    | TimeoutField | RetryField | FallbackField | ScopeField | BufferField )*

InputBlock        = "input" "{" ( DocComment? IDENT "=" Expression )* "}"
InputSchemaBlock  = "input" "{" FieldDeclaration* "}"
OutputDecl        = "output" ( "{" FieldDeclaration* "}" | ":" IDENT )
DependsOn         = "depends_on" "=" "[" ( IDENT ( "," IDENT )* )? "]"
TimeoutField      = "timeout" "=" DURATION
RetryField        = "retry" "=" "{" ( IDENT ":" ( NUMBER | DURATION | IDENT ) ","? )* "}"
FallbackField     = "fallback" "=" Expression
ScopeField        = "scope" "=" ( "parent" | "isolated" )
BufferField       = "buffer" "=" NUMBER

ForEachDef        = DocComment? "foreach" IDENT ":" ( "(" IDENT ( "," IDENT )? ")" | IDENT )
                    "in" Expression "sequential"? "{" ( ScopeField | BufferField | Member )* "}"
LoopDef           = DocComment? "loop" IDENT "{" ( "max_iterations" "=" NUMBER
                    | "delay" "=" DURATION | DependsOn | ScopeField | BufferField
                    | Member | CarryBlock | UntilClause )* "}"
CarryBlock        = "carry" "{" ( IDENT ":" Expression ","? )* "}"
UntilClause       = "until" Expression

WaitDef           = DocComment? "wait" IDENT "=" WaitTimer "after" IDENT ( "{" WaitBody "}" )?
WaitTimer         = DURATION | "deadline" "(" STRING ")" | "cron" "(" STRING ")"
WaitBody          = ( "signal_key" "=" Expression | "on_timeout" PayloadBlock | "on_fire" PayloadBlock )*

AwaitDef          = DocComment? "await" IDENT "{" AwaitBody "}"
AwaitBody         = ( "mode" "=" IDENT | DependsOn | EventMatcher | TimeoutField | "on_timeout" PayloadBlock )*
EventMatcher      = "event" STRING ( "where" IDENT "=" Expression )?
                    ( "{" "optional" "=" "true" "}" )?

ScriptDef         = DocComment? "script" IDENT "{" ScriptBody "}"
ScriptBody        = ( "lang" "=" STRING | InputBlock
                    | "output_schema" ( "{" FieldDeclaration* "}" | "=" IDENT )
                    | "code" "=" TRIPLE_STRING | TimeoutField )*
PayloadBlock      = "{" ( IDENT "=" Expression )* "}"

BranchDef         = DocComment? "branch" "on" PathExpression "{" BranchBody "}"
BranchBody        = ( DocComment? Expression "->" IDENT | "otherwise" "->" IDENT )*

TransformDef      = DocComment? "transform" IDENT "{" LetBinding* TransformField* "}"
LetBinding        = "let" IDENT "=" Expression
TransformField    = DocComment? IDENT ( ":" IDENT "?"? )? "=" Expression

SchemaDef         = "schema" IDENT "{" FieldDeclaration* "}"
FieldDeclaration  = IDENT ":" ( IDENT "?"? | "{" FieldDeclaration* "}" )

Expression        = LambdaExpr | WhenExpr | ConditionalExpr
LambdaExpr        = ( IDENT | "(" ( IDENT ( "," IDENT )* )? ")" ) "->" Expression
ConditionalExpr   = NullCoalesce ( "?" Expression ":" Expression )?
NullCoalesce      = LogicalOr ( "??" LogicalOr )*
LogicalOr         = LogicalAnd ( "||" LogicalAnd )*
LogicalAnd        = Equality ( "&&" Equality )*
Equality          = Comparison ( ( "==" | "!=" ) Comparison )?
Comparison        = Additive ( ( ">" | "<" | ">=" | "<=" ) Additive )?
Additive          = Multiplicative ( ( "+" | "-" ) Multiplicative )*
Multiplicative    = Unary ( ( "*" | "/" | "%" ) Unary )*
Unary             = ( "!" | "-" ) Unary | Primary
Primary           = Atomic ( "." MethodCall )*
MethodCall        = IDENT "(" ( Expression ( "," Expression )* )? ")"
Atomic            = FunctionCall | PathExpression | STRING | NUMBER | DURATION
                  | "true" | "false" | ObjectLiteral | ArrayLiteral | "(" Expression ")"
FunctionCall      = IDENT "(" ( Expression ( "," Expression )* )? ")"
WhenExpr          = "when" Expression? "{" ( Expression "->" Expression )* ( "otherwise" "->" Expression )? "}"
PathExpression    = IDENT ( "." IDENT )*
ObjectLiteral     = "{" ( IDENT ":" Expression ( "," IDENT ":" Expression )* )? "}"
ArrayLiteral      = "[" ( Expression ( "," Expression )* )? "]"
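The layering of Additive, Multiplicative, and Unary in the grammar is what gives the operators their precedence. A minimal recursive-descent sketch over a numeric subset shows the effect; it is hypothetical, works directly on characters rather than on Lexer tokens, and evaluates instead of building AST nodes.

```java
/** Hypothetical evaluator for the Additive / Multiplicative / Unary subset of the grammar. */
final class ArithmeticSketch {
    private final String src;
    private int pos = 0;

    private ArithmeticSketch(String src) { this.src = src; }

    static double eval(String src) {
        ArithmeticSketch p = new ArithmeticSketch(src);
        double value = p.additive();
        p.skipSpaces();
        if (p.pos != p.src.length()) {
            throw new IllegalArgumentException("Unexpected input at " + p.pos);
        }
        return value;
    }

    // Additive = Multiplicative ( ( "+" | "-" ) Multiplicative )*
    private double additive() {
        double left = multiplicative();
        while (true) {
            if (accept('+')) left += multiplicative();
            else if (accept('-')) left -= multiplicative();
            else return left;
        }
    }

    // Multiplicative = Unary ( ( "*" | "/" | "%" ) Unary )*
    private double multiplicative() {
        double left = unary();
        while (true) {
            if (accept('*')) left *= unary();
            else if (accept('/')) left /= unary();
            else if (accept('%')) left %= unary();
            else return left;
        }
    }

    // Unary = ( "!" | "-" ) Unary | Primary   (only "-" is modeled here)
    private double unary() {
        if (accept('-')) return -unary();
        return atomic();
    }

    // Atomic = NUMBER | "(" Expression ")"   (subset of the full production)
    private double atomic() {
        if (accept('(')) {
            double value = additive();
            if (!accept(')')) throw new IllegalArgumentException("Expected ')' at " + pos);
            return value;
        }
        skipSpaces();
        int start = pos;
        while (pos < src.length()
                && (Character.isDigit(src.charAt(pos)) || src.charAt(pos) == '.')) pos++;
        if (start == pos) throw new IllegalArgumentException("Expected number at " + pos);
        return Double.parseDouble(src.substring(start, pos));
    }

    private boolean accept(char c) {
        skipSpaces();
        if (pos < src.length() && src.charAt(pos) == c) { pos++; return true; }
        return false;
    }

    private void skipSpaces() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) pos++;
    }
}
```

Because `additive()` calls `multiplicative()` for each operand, `1 + 2 * 3` groups as `1 + (2 * 3)`, and the `*` loops make `10 - 4 - 3` left-associative.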

Appendix B: Example Index

The following 8 example files are located in bloge-examples/src/main/resources/bloge/:

| File | Graph Name | Description | Key Features Demonstrated |
| --- | --- | --- | --- |
| order-process.bloge | orderProcess | E-commerce order flow: user/product fetch → pricing → credit check → create/reject | Implicit deps, branch on boolean, retry with jitter strategy, fallback with object literal |
| bff-dashboard.bloge | bffDashboard | Backend-for-frontend aggregation: parallel fetch of profile, orders, recommendations, etc. | Fan-out parallelism, multiple fallbacks with array/object literals, no branching |
| loan-approval.bloge | loanApproval | Loan application: credit check, fraud detection, income verification → risk aggregation | 4-way fan-out, otherwise branch, multiple retry/fallback configs |
| ticket-routing.bloge | ticketRouting | Customer ticket routing: sentiment analysis → priority classification → agent assignment | otherwise → autoResolve, exponential retry, fallback with mixed types |
| food-order.bloge | foodOrderProcess | Food delivery: inventory/kitchen/delivery checks → acceptance → payment/dispatch | Branch on true/false boolean values, deep fan-out, post-branch parallel nodes |
| claim-processing.bloge | claimProcessing | Insurance claim flow: policy validation, document review, history check → risk assessment | 3-way fan-in/fan-out, otherwise → investigateClaim, exponential retry |
| shipment-planning.bloge | shipmentPlanning | Shipment logistics: warehouse/carrier/route optimization → cost calculation → dispatch mode | 3-way fan-in, otherwise → dispatchConsolidated, fallback with decimal numbers |
| online-triage.bloge | onlineTriage | Medical triage: patient records → symptom analysis → AI diagnosis → routing | ctx path with domain context (ctx.chiefComplaint), high timeout (10s), 3-way branch |

Appendix C: DSL vs Java Fluent API Comparison

Using the orderProcess example to illustrate the equivalence between DSL syntax and the Java fluent graph builder:

DSL (.bloge)

bloge
graph orderProcess {
  node fetchUser : FetchUserOperator {
    input {
      userId = ctx.userId
    }
    timeout = 3s
    retry = { attempts: 2, backoff: 200ms, strategy: exponential }
  }

  node fetchProducts : FetchProductsOperator {
    input {
      productIds = ctx.productIds
    }
    timeout = 5s
  }

  node checkCredit : CheckCreditOperator {
    input {
      user     = fetchUser.output
      products = fetchProducts.output
    }
  }

  branch on checkCredit.output.approved {
    true  -> createOrder
    otherwise -> rejectOrder
  }

  node createOrder : CreateOrderOperator {}
  node rejectOrder : RejectOrderOperator {}
}

Java Fluent API (equivalent)

java
Graph graph = Graph.builder("orderProcess")
    .node("fetchUser", fetchUserOperator)
        .input((results, ctx) -> Map.of("userId", ctx.get("userId", Object.class)))
        .timeout(Duration.ofSeconds(3))
        .retry(2, Duration.ofMillis(200), BackoffStrategy.EXPONENTIAL)
    .node("fetchProducts", fetchProductsOperator)
        .input((results, ctx) -> Map.of("productIds", ctx.get("productIds", Object.class)))
        .timeout(Duration.ofSeconds(5))
    .node("checkCredit", checkCreditOperator)
        .dependsOn("fetchUser", "fetchProducts")
        .input((results, ctx) -> Map.of(
            "user", results.getRaw("fetchUser"),
            "products", results.getRaw("fetchProducts")))
    .branch("checkCredit")
        .on("approved")
        .when(value -> Boolean.TRUE.equals(value), "createOrder")
        .otherwise("rejectOrder")
    .node("createOrder", createOrderOperator)
    .node("rejectOrder", rejectOrderOperator)
    .build();

Key Differences

| Aspect | DSL | Java API |
| --- | --- | --- |
| Dependency decl | `depends_on = [...]` or implicit | `.dependsOn(...)` explicit only |
| Input binding | `field = path.expression` | `.field("name", (results, ctx) -> ...)` |
| Enhanced expressions | `a + b`, `??`, `when {}`, `func()` | Arbitrary Java lambdas |
| Transform blocks | `transform id { field = expr }` | Manual virtual node creation with TransformOperator |
| Branch predicate | Literal matching ("ok", true) | `Predicate<Object>` lambda |
| Schema declaration | `output { field: Type }` | Programmatic SchemaDescriptor construction |
| Resilience | Inline DSL syntax | Builder method chain |
| Doc-comments | `///` prefix comments | Not supported in API |

Appendix D: Reserved Keywords & Future Extensions

Current Reserved Keywords

All 50 keywords listed in §2.1 are reserved at the lexer level (they produce keyword tokens). However, as described in §3.3, they remain contextual — usable as identifiers in many grammar positions even though later semantic validation may still reserve some runtime names.

Newly Specified Features (v1.0.0)

The following previously partial or draft features are fully specified in this stable version:

| Feature | Status | Description | Spec Section |
| --- | --- | --- | --- |
| Session root and phase syntax | Specified | Stable session / phase / round grammar and lowering model | §3.1, §3.4, §4.5, §5.12 |
| Higher-order expressions | Specified | Lambdas plus receiver-style collection methods | §3.1, §5.4, §5.10, §5.11 |
| Streaming members and paths | Specified | stream node, stream foreach, stream loop, and node.stream references | §3.1, §4.1, §4.3, §5.4 |
| Transform let bindings | Specified | Immutable transform-local intermediate values | §3.1, §4.1, §5.8 |
| Wait / await nodes | Specified | Timer suspension and correlated event waiting | §3.1, §4.1, §4.2, §5.12 |
| Script nodes | Specified | Sandboxed Groovy script execution via optional bloge-script module | §3.1, §4.2, §5.13 |
| Scope and buffer controls | Specified | `scope = parent \| isolated` and streaming `buffer = N` metadata | |
| SPI function typing contract | Specified | ExpressionFunction.returnType(...) is mandatory for explicit type reporting | §5.10, §5.11 |
| Shared parser conformance basis | Specified | The spec now matches the cross-language fixture and AST baseline suite | §1, §4, §5 |

Future Extension Points

The following features are referenced in design documents as potential future additions but are not yet specified:

| Feature | Status | Description |
| --- | --- | --- |
| AccessorCodeGenerator | Planned | Bytecode generation via Java ClassFile API to replace reflective property access with generated accessor classes |
| Sub-graph mechanism | Specified | SubGraphOperator enabling nested graph execution via `node x : subgraph("name") {}` syntax with optional `scope = parent` |
| Import/include | Not planned | Multi-file graph composition |
| Graph complexity validation | Planned | Compile-time enforcement of node count, path depth, branch nesting, and fan-out limits |
| Operator behavior contracts | Planned | Idempotency and SideEffectType declarations on operators, influencing runtime resilience defaults |
| Schema versioning | Planned | SchemaDescriptor.version field with upstream/downstream compatibility validation |

The grammar has syntactic room for these extensions:

  • New expression forms can be added as additional permitted subtypes of the sealed Expression type
  • New node body clauses can be added without breaking existing syntax
  • New graph body member types (like transform) can be added to the Member production