BLOGE DSL Language Specification
| Field | Value |
|---|---|
| Spec Version | 1.0.0 |
| BLOGE DSL Version | 1.0.0 |
| Date | 2026-03-07 |
| Status | STABLE |
Table of Contents
- Introduction
- Lexical Specification
- Syntax Specification
- AST Specification
- Compilation Semantics
- 5.1 Compilation Pipeline Overview
- 5.2 Operator Resolution
- 5.3 Implicit Dependency Inference
- 5.4 Expression Compilation
- 5.5 Branch Compilation
- 5.6 Resilience Configuration Compilation
- 5.7 Schema Resolution & Validation
- 5.8 Transform Compilation
- 5.9 Scope Mode Compilation
- 5.10 Expression Type Inference
- 5.11 Built-in Function Library
- 5.12 Session Extension Compilation
- 5.13 Script Node
- Error Reporting
- 6.1 Lexer Errors
- 6.2 Parser Errors
- 6.3 Compiler Errors
- Appendix A: Complete EBNF Grammar Quick-Reference
- Appendix B: Example Index
- Appendix C: DSL vs Java Fluent API Comparison
- Appendix D: Reserved Keywords & Future Extensions
1. Introduction
BLOGE (Biz Orchestration Graph Engine) DSL is an external domain-specific language for declaratively defining business logic orchestration graphs. A .bloge file describes a directed acyclic graph (DAG) of operator nodes, transform blocks, data flow bindings, conditional branches, resilience policies, and schema declarations. The DSL compiler translates .bloge source into the bloge-core Graph runtime model.
This specification is derived from the shipping implementation (Lexer.java, Parser.java, DslCompiler.java, the session extension compiler, and the shared conformance fixtures). It is the stable source of truth for lexical rules, syntax grammar, AST structure, graph compilation, session compilation, and the higher-order expression features now supported by the Java and TypeScript parsers.
Scope: This document covers the DSL language and its compilation to Graph and session-extension runtime models. Low-level runtime scheduling details (virtual-thread dispatch, persistence adapters, and execution listeners) remain implementation concerns outside the language contract.
2. Lexical Specification
The BLOGE DSL lexer (Lexer.java) is a hand-written character-by-character scanner that produces a list of Token records. Each token carries a TokenType, the raw lexeme text, and 1-based line:column source position.
2.1 Token Types
Defined in TokenType.java:
Keywords (50)
| Token | Lexeme | Description |
|---|---|---|
| GRAPH | graph | Graph definition |
| NODE | node | Node definition |
| BRANCH | branch | Branch definition |
| ON | on | Branch condition marker |
| INPUT | input | Input block / input schema |
| DEPENDS_ON | depends_on | Explicit dependency declaration |
| TIMEOUT | timeout | Timeout configuration |
| RETRY | retry | Retry configuration |
| FALLBACK | fallback | Fallback value |
| TRUE | true | Boolean literal true |
| FALSE | false | Boolean literal false |
| SCHEMA | schema | Schema definition |
| OUTPUT | output | Output schema / output path |
| OTHERWISE | otherwise | Default branch or transition case |
| WHEN | when | when {} expression |
| TRANSFORM | transform | Transform block definition |
| FOREACH | foreach | Collection fan-out block |
| SEQUENTIAL | sequential | Sequential foreach modifier |
| ITEMS | items | Reserved foreach helper identifier |
| IN | in | Foreach binding separator |
| SESSION | session | Session extension definition |
| PHASE | phase | Session phase definition |
| ROUND | round | Round-body marker inside a phase |
| THEN | then | Phase transition declaration |
| YIELD_ON | yield_on | Phase yield event list |
| IDLE_TIMEOUT | idle_timeout | Session idle timeout |
| TIMEOUT_ACTION | timeout_action | Session timeout policy reference |
| MAX_ROUNDS | max_rounds | Session or phase round limit |
| MAX_HISTORY | max_history | Session history retention limit |
| ON_ROUND_FAILURE | on_round_failure | Phase failure policy |
| LOOP | loop | Iterative hyper-node block |
| UNTIL | until | Loop or phase termination condition |
| CARRY | carry | Loop carry state block |
| MAX_ITERATIONS | max_iterations | Loop max iteration count |
| SCOPE | scope | Scope mode (parent / isolated) |
| WAIT | wait | Timer-based suspension node |
| AFTER | after | Wait dependency marker |
| SIGNAL_KEY | signal_key | Wait correlation key |
| ON_TIMEOUT | on_timeout | Wait/await timeout payload block |
| ON_FIRE | on_fire | Wait fire payload block |
| DEADLINE | deadline | Deadline timer constructor |
| CRON | cron | Cron timer constructor |
| AWAIT | await | External-event await node |
| EVENT | event | Await event matcher |
| WHERE | where | Await correlation predicate |
| MODE | mode | Await aggregation mode |
| STREAM | stream | Streaming member prefix / stream path |
| BUFFER | buffer | Streaming buffer size |
| LET | let | Transform local binding |
| SCRIPT | script | Sandboxed script node |
Literals and string forms (5)
| Token | Description |
|---|---|
| IDENT | Identifier (not matched to a keyword in the current token) |
| STRING | Double-quoted string literal |
| NUMBER | Integer or decimal number |
| DURATION | Number followed by a duration suffix (ms, s, m, h, d) |
| TRIPLE_STRING | Triple-quoted multiline string used by script code blocks |
Punctuation (27)
| Token | Lexeme | Description |
|---|---|---|
| LBRACE | { | Left brace |
| RBRACE | } | Right brace |
| LBRACKET | [ | Left bracket |
| RBRACKET | ] | Right bracket |
| LPAREN | ( | Left parenthesis |
| RPAREN | ) | Right parenthesis |
| EQUALS | = | Assignment |
| ARROW | -> | Branch, transition, or lambda arrow |
| DOT | . | Path separator / method-call chain |
| COMMA | , | List separator |
| COLON | : | Type or binding separator |
| QUESTION | ? | Optional marker / ternary |
| PLUS | + | Addition / string concatenation |
| MINUS | - | Subtraction / unary negation |
| STAR | * | Multiplication |
| SLASH | / | Division |
| PERCENT | % | Modulo |
| BANG | ! | Logical NOT |
| EQ_EQ | == | Equality comparison |
| BANG_EQ | != | Inequality comparison |
| GT | > | Greater than |
| LT | < | Less than |
| GT_EQ | >= | Greater than or equal |
| LT_EQ | <= | Less than or equal |
| DOUBLE_QUESTION | ?? | Null coalescing |
| AMP_AMP | && | Logical AND |
| PIPE_PIPE | \|\| | Logical OR |
Comments and special tokens (4)
| Token | Description |
|---|---|
| LINE_COMMENT | // ... comment, discarded by the parser |
| BLOCK_COMMENT | /* ... */ comment, discarded by the parser |
| DOC_COMMENT | /// or /** */ documentation comment preserved in the AST |
| EOF | End-of-file marker |
2.2 Lexical Rules
Identifiers and Keywords
IdentStart = [a-zA-Z_]
IdentContinue = [a-zA-Z0-9_]
Identifier = IdentStart IdentContinue*
After scanning an identifier, the lexer checks it against the keyword map. If it matches, the corresponding keyword TokenType is emitted; otherwise IDENT is emitted. Note that depends_on is a single keyword token containing an underscore: because the underscore is a valid IdentContinue character, depends_on is scanned as one identifier and then matched as the DEPENDS_ON keyword.
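The scan-then-lookup order described above can be illustrated with a small sketch. It is written in Python for brevity and is not the shipping Lexer.java; the function name scan_identifier is hypothetical, and KEYWORDS holds only a handful of the 50 keywords.

```python
import re

# Illustrative subset of the keyword map; the real map covers all 50 keywords.
KEYWORDS = {"graph": "GRAPH", "node": "NODE", "depends_on": "DEPENDS_ON", "true": "TRUE"}

# IdentStart IdentContinue*
IDENT_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def scan_identifier(source, pos):
    """Scan one identifier at pos and return (token_type, lexeme, next_pos)."""
    match = IDENT_RE.match(source, pos)
    if match is None:
        raise ValueError(f"no identifier at offset {pos}")
    lexeme = match.group()
    # The keyword lookup runs only after the whole identifier has been scanned,
    # so "depends_only" stays an IDENT while "depends_on" becomes DEPENDS_ON.
    return KEYWORDS.get(lexeme, "IDENT"), lexeme, match.end()
```

Because the lookup sees the complete lexeme, a keyword that is merely a prefix of a longer identifier never produces a keyword token.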
String Literals
StringLiteral = '"' StringChar* '"'
StringChar = EscapeSeq | <any char except '"' and '\'>
EscapeSeq = '\"' | '\n' | '\t' | '\\'
- Strings are delimited by double quotes (")
- Strings may span multiple lines (newlines inside strings increment the line counter)
- Supported escape sequences: \" → ", \n → newline, \t → tab, \\ → backslash
- An unterminated string (reaching EOF without closing ") produces a lexer error
Number Literals
NumberLiteral = Digits ('.' Digits)?
Digits = [0-9]+
- Both integers (42) and decimals (3.14) are supported
- A decimal point requires at least one digit on each side
- The number is stored as its raw lexeme text; parsing to numeric types occurs later
Duration Literals
DurationLiteral = Digits Suffix
Suffix = 'ms' | 's' | 'm'
- A duration is a number immediately followed by a time suffix: ms (milliseconds), s (seconds), or m (minutes)
- Examples: 100ms, 3s, 5m
- The suffix must not be followed by an identifier-continue character (to distinguish 5m from 5myVar)
- The ms suffix is checked first (longest match) to avoid ambiguity with m
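The number and duration rules above can be combined into one scanning step. The sketch below is a hedged Python illustration, not the actual lexer: scan_number_or_duration is a hypothetical name, the suffix set follows the grammar in this subsection (the token table in §2.1 additionally lists h and d, which are left out here), and it assumes a decimal number never takes a suffix because DurationLiteral is defined over Digits.

```python
import re

# NumberLiteral = Digits ('.' Digits)?
NUMBER_RE = re.compile(r"[0-9]+(?:\.[0-9]+)?")
# Suffixes from the grammar above, longest first so "ms" wins over "m".
SUFFIXES = ("ms", "s", "m")

def is_ident_continue(ch):
    return ch.isascii() and (ch.isalnum() or ch == "_")

def scan_number_or_duration(source, pos):
    """Return (token_type, lexeme, next_pos) for a NUMBER or DURATION at pos."""
    match = NUMBER_RE.match(source, pos)
    if match is None:
        raise ValueError(f"no number at offset {pos}")
    lexeme, end = match.group(), match.end()
    if "." not in lexeme:  # assumption: only integer literals can become durations
        for suffix in SUFFIXES:
            if source.startswith(suffix, end):
                after = end + len(suffix)
                # Reject the suffix when an identifier character follows (5myVar).
                if after < len(source) and is_ident_continue(source[after]):
                    continue
                return "DURATION", source[pos:after], after
    return "NUMBER", lexeme, end
```

For 5myVar the sketch returns the NUMBER 5 and leaves myVar for the identifier scanner, which matches the NUMBER + IDENT outcome described in §2.3.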
Comments
The lexer supports four comment forms:
| Form | Syntax | Behavior |
|---|---|---|
| Line | // ... | Discarded (ignored) |
| Block | /* ... */ | Discarded (ignored), supports nesting |
| Doc-line | /// ... | Emitted as DOC_COMMENT token |
| Doc-block | /** ... */ | Emitted as DOC_COMMENT token |
Line comment (//): All characters from // to end-of-line are discarded.
Block comment (/* ... */): Supports nested block comments. A depth counter tracks /* (depth++) and */ (depth--). Newlines inside block comments increment the line counter. An unterminated block comment (depth > 0 at EOF) produces a lexer error.
Doc-line comment (///): After consuming the third /, an optional leading space is skipped. The remaining text to end-of-line is captured as the DOC_COMMENT lexeme. Multiple consecutive /// lines produce multiple DOC_COMMENT tokens (merged later by the parser).
Doc-block comment (/** ... */): After consuming /**, all content up to */ is captured. Leading whitespace and optional * prefixes on continuation lines are stripped. The final text is trimmed of leading/trailing blank lines. Emitted as a single DOC_COMMENT token.
Note: /**/ (a four-character sequence) is treated as a regular block comment because, after /* is consumed, peek() == '*' and peekNext() == '/', so the doc-block branch is not taken.
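The depth-counter approach to nested block comments, and the lookahead that separates a doc-block opener from an empty /**/ comment, can be sketched as follows. This is an illustrative Python version under stated assumptions: the function names are hypothetical, and the error type and message are placeholders rather than the lexer's real diagnostics.

```python
def skip_block_comment(source, pos, line=1):
    """Skip a (possibly nested) block comment that starts at pos with '/*'.

    Returns (next_pos, line), where line accounts for newlines inside the comment.
    """
    if not source.startswith("/*", pos):
        raise ValueError("expected '/*'")
    depth = 1
    i = pos + 2
    while i < len(source):
        if source.startswith("/*", i):
            depth += 1          # nested opener
            i += 2
        elif source.startswith("*/", i):
            depth -= 1          # closer for the innermost open comment
            i += 2
            if depth == 0:
                return i, line
        else:
            if source[i] == "\n":
                line += 1
            i += 1
    # depth > 0 at EOF
    raise SyntaxError(f"line {line}: unterminated block comment")

def is_doc_block_start(source, pos):
    """True when '/*' at pos is followed by '*' and the char after that is not '/'."""
    return source.startswith("/**", pos) and not source.startswith("/**/", pos)
```

The comment is only complete once the counter returns to zero, so an inner */ does not end the outer comment.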
Whitespace
- Space (' '), tab (\t), and carriage return (\r) are silently skipped
- Newline (\n) increments the line counter and resets the column counter to 1
2.3 Lexical Disambiguation & Priority
| Ambiguity | Resolution Rule |
|---|---|
| /// vs // | After matching //, check if next char is /. If so → doc-line comment; otherwise → line comment |
| /** vs /* | After matching /*, check if next char is * and char after that is not /. If so → doc-block comment; otherwise → block comment |
| - vs -> | After matching -, check if next char is >. If so → ARROW; otherwise → MINUS. The lexer always emits the token; the parser determines whether MINUS is valid in the current position (it is valid in expression contexts, invalid elsewhere). |
| / vs // vs /* | After matching /, check next char: / → comment, * → block comment, otherwise → SLASH. As with MINUS, the lexer always emits SLASH; validity depends on parser context (expression position). |
| = vs == | After matching =, check if next char is =. If so → EQ_EQ; otherwise → EQUALS |
| ! vs != | After matching !, check if next char is =. If so → BANG_EQ; otherwise → BANG |
| > vs >= | After matching >, check if next char is =. If so → GT_EQ; otherwise → GT |
| < vs <= | After matching <, check if next char is =. If so → LT_EQ; otherwise → LT |
| ? vs ?? | After matching ?, check if next char is ?. If so → DOUBLE_QUESTION; otherwise → QUESTION |
| & vs && | & alone is a lexer error; && → AMP_AMP |
| \| vs \|\| | \| alone is a lexer error; \|\| → PIPE_PIPE |
| depends_on as keyword | The underscore _ is a valid identifier character, so depends_on is scanned as a single identifier token and then matched as the DEPENDS_ON keyword |
| 5m vs 5myVar | After a number followed by m, check if the next char is an IdentContinue. If so → treat as NUMBER + IDENT; if not → treat as DURATION |
| 5ms duration | ms suffix is checked before single-character m and s suffixes (longest-match priority) |
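Most rows in the table above follow the same one-character lookahead pattern, which can be expressed as a small lookup table. The Python sketch below is illustrative only; scan_operator and the TWO_CHAR table are hypothetical names, and the comment-related rows (/, //, /*) are deliberately left out.

```python
# first char -> (expected second char, two-char token type, one-char token type)
TWO_CHAR = {
    "-": (">", "ARROW", "MINUS"),
    "=": ("=", "EQ_EQ", "EQUALS"),
    "!": ("=", "BANG_EQ", "BANG"),
    ">": ("=", "GT_EQ", "GT"),
    "<": ("=", "LT_EQ", "LT"),
    "?": ("?", "DOUBLE_QUESTION", "QUESTION"),
    "&": ("&", "AMP_AMP", None),    # a lone '&' is a lexer error
    "|": ("|", "PIPE_PIPE", None),  # a lone '|' is a lexer error
}

def scan_operator(source, pos):
    """Return (token_type, lexeme, next_pos) for the operator starting at pos."""
    first = source[pos]
    second, two_type, one_type = TWO_CHAR[first]
    # Peek one character ahead and prefer the longer token.
    if source.startswith(second, pos + 1):
        return two_type, source[pos:pos + 2], pos + 2
    if one_type is None:
        raise SyntaxError(f"unexpected character {first!r} at offset {pos}")
    return one_type, first, pos + 1
```

The lexer side stays context-free here: whether a MINUS is legal at that position is left to the parser, as the table notes.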
3. Syntax Specification
The BLOGE DSL parser (Parser.java) is a hand-written recursive descent parser that consumes a list of tokens and produces an AST rooted at either GraphDef or ExtensionDef (for top-level session documents).
3.1 EBNF Grammar
Program = GraphDef
| SessionDef
GraphDef = DocComment? "graph" IDENT "{" Member* "}"
SessionDef = DocComment? "session" IDENT "{" SessionMember* "}"
SessionMember = SessionProperty
| PhaseDef
| CommentNode
SessionProperty = "idle_timeout" "=" DURATION
| "timeout_action" "=" STRING
| "max_rounds" "=" NUMBER
| "max_history" "=" NUMBER
PhaseDef = DocComment? "phase" IDENT "{" PhaseBody "}"
PhaseBody = ( PhaseProperty
| RoundBlock
| Member
)*
PhaseProperty = "max_rounds" "=" NUMBER
| "on_round_failure" "=" IDENT
| "yield_on" "=" "[" IdentList? "]"
| "until" Expression
| ThenProperty
ThenProperty = "then" "->" IDENT
| "then" "{" ( ( Expression | "otherwise" ) "->" IDENT )* "}"
RoundBlock = "round" "{" Member* "}"
Member = NodeDef
| BranchDef
| TransformDef
| SchemaDef
| ForEachDef
| LoopDef
| WaitDef
| AwaitDef
| ScriptDef
| StreamMember
| CommentNode
StreamMember = DocComment? "stream" ( NodeDef
| ForEachDef
| LoopDef )
NodeDef = DocComment? "node" IDENT ":" IDENT "{" NodeBody "}"
NodeBody = ( InputBlock
| InputSchemaBlock
| OutputDecl
| DependsOn
| TimeoutField
| RetryField
| FallbackField
| ScopeField
| BufferField
)*
InputBlock = "input" "{" ( DocComment? IDENT "=" Expression )* "}"
InputSchemaBlock = "input" "{" FieldDeclaration* "}"
(* Disambiguated from InputBlock by lookahead:
if first IDENT is followed by "=" → InputBlock;
if first IDENT is followed by ":" → InputSchemaBlock *)
OutputDecl = "output" "{" FieldDeclaration* "}"
| "output" ":" IDENT
DependsOn = "depends_on" "=" "[" IdentList? "]"
IdentList = IDENT ( "," IDENT )*
TimeoutField = "timeout" "=" DURATION
RetryField = "retry" "=" "{" RetryConfig "}"
RetryConfig = ( RetryKey ":" RetryValue ","? )*
RetryKey = "attempts" | "backoff" | "strategy"
RetryValue = NUMBER (* for "attempts" *)
| DURATION (* for "backoff" *)
| IDENT (* for "strategy" *)
FallbackField = "fallback" "=" Expression
ScopeField = "scope" "=" ( "parent" | "isolated" )
(* Controls scope visibility for sub-graph constructs. *)
BufferField = "buffer" "=" NUMBER
ForEachDef = DocComment? "foreach" IDENT ":" ForEachBinding "in" Expression
"sequential"? "{" ( ScopeField | BufferField | Member )* "}"
ForEachBinding = "(" IDENT ( "," IDENT )? ")" (* tuple binding with optional index *)
| IDENT (* bare item variable *)
LoopDef = DocComment? "loop" IDENT "{" LoopBody "}"
LoopBody = ( "max_iterations" "=" NUMBER
| "delay" "=" DURATION
| DependsOn
| ScopeField
| BufferField
| Member
| CarryBlock
| UntilClause
)*
CarryBlock = "carry" "{" ( IDENT ":" Expression ","? )* "}"
UntilClause = "until" Expression
WaitDef = DocComment? "wait" IDENT "=" WaitTimer "after" IDENT ( "{" WaitBody "}" )?
WaitTimer = DURATION
| "deadline" "(" STRING ")"
| "cron" "(" STRING ")"
WaitBody = ( "signal_key" "=" Expression
| "on_timeout" PayloadBlock
| "on_fire" PayloadBlock
)*
AwaitDef = DocComment? "await" IDENT "{" AwaitBody "}"
AwaitBody = ( "mode" "=" IDENT
| DependsOn
| EventMatcher
| TimeoutField
| "on_timeout" PayloadBlock
)*
EventMatcher = "event" STRING ( "where" IDENT "=" Expression )?
( "{" "optional" "=" "true" "}" )?
ScriptDef = DocComment? "script" IDENT "{" ScriptBody "}"
ScriptBody = ( ScriptLang
| InputBlock
| ScriptOutputSchema
| ScriptCode
| TimeoutField
)*
ScriptLang = "lang" "=" STRING
ScriptOutputSchema = "output_schema" ( "{" FieldDeclaration* "}"
| "=" IDENT )
ScriptCode = "code" "=" TRIPLE_STRING
PayloadBlock = "{" ( IDENT "=" Expression )* "}"
BranchDef = DocComment? "branch" "on" PathExpression "{" BranchBody "}"
BranchBody = ( BranchCase | OtherwiseCase )*
BranchCase = DocComment? Expression "->" IDENT
OtherwiseCase = "otherwise" "->" IDENT
TransformDef = DocComment? "transform" IDENT "{" TransformBody "}"
TransformBody = LetBinding* TransformField*
LetBinding = "let" IDENT "=" Expression
TransformField = DocComment? IDENT ( ":" TypeRef )? "=" Expression
TypeRef = IDENT "?"?
SchemaDef = "schema" IDENT "{" FieldDeclaration* "}"
FieldDeclaration = IDENT ":" ( IDENT "?"?
| "{" FieldDeclaration* "}" )
(* --- Enhanced Expression Grammar --- *)
(* Operator precedence (lowest to highest):
1. when expression
2. ternary conditional (?:)
3. null coalescing (??)
4. logical OR (||)
5. logical AND (&&)
6. equality (==, !=)
7. comparison (>, <, >=, <=)
8. additive (+, -)
9. multiplicative (*, /, %)
10. unary (!, -)
11. primary (literals, paths, function calls, grouping, when) *)
Expression = LambdaExpr
| WhenExpr
| ConditionalExpr
LambdaExpr = ( IDENT | "(" ( IDENT ( "," IDENT )* )? ")" ) "->" Expression
ConditionalExpr = NullCoalesce ( "?" Expression ":" Expression )?
NullCoalesce = LogicalOr ( "??" LogicalOr )*
LogicalOr = LogicalAnd ( "||" LogicalAnd )*
LogicalAnd = Equality ( "&&" Equality )*
Equality = Comparison ( ( "==" | "!=" ) Comparison )?
Comparison = Additive ( ( ">" | "<" | ">=" | "<=" ) Additive )?
Additive = Multiplicative ( ( "+" | "-" ) Multiplicative )*
Multiplicative = Unary ( ( "*" | "/" | "%" ) Unary )*
Unary = ( "!" | "-" ) Unary
| Primary
Primary = Atomic ( "." MethodCall )*
MethodCall = IDENT "(" ( Expression ( "," Expression )* )? ")"
Atomic = FunctionCall
| PathExpression
| StringLiteral
| NumberLiteral
| BooleanLiteral
| DurationLiteral
| ObjectLiteral
| ArrayLiteral
| GroupExpr
FunctionCall = IDENT "(" ( Expression ( "," Expression )* )? ")"
GroupExpr = "(" Expression ")"
WhenExpr = "when" WhenSubject? "{" WhenClause* OtherwiseClause? "}"
WhenSubject = Expression
WhenClause = Expression "->" Expression
OtherwiseClause = "otherwise" "->" Expression
PathExpression = IDENT ( "." IDENT )*
StringLiteral = STRING
NumberLiteral = NUMBER
BooleanLiteral = "true" | "false"
DurationLiteral = DURATION
ObjectLiteral = "{" ( IDENT ":" Expression ( "," IDENT ":" Expression )* )? "}"
ArrayLiteral = "[" ( Expression ( "," Expression )* )? "]"
DocComment = DOC_COMMENT+
CommentNode = (* A DocComment that is not followed by "node", "branch",
"transform", or "schema" — becomes a standalone
CommentNode in the AST *)
3.2 Syntax Railroad Descriptions
GraphDef
──▶ DocComment? ──▶ "graph" ──▶ IDENT ──▶ "{" ──▶ Member* ──▶ "}" ──▶
NodeDef
──▶ DocComment? ──▶ "node" ──▶ IDENT ──▶ ":" ──▶ IDENT ──▶ "{" ──▶ NodeBody ──▶ "}" ──▶
│ (id) │ (operatorRef)
BranchDef
──▶ DocComment? ──▶ "branch" ──▶ "on" ──▶ PathExpr ──▶ "{" ──┬──▶ BranchCase ──┬──▶ "}" ──▶
│ │
├──▶ Otherwise ───┤
└─────────────────┘
TransformDef
──▶ DocComment? ──▶ "transform" ──▶ IDENT ──▶ "{" ──┬──▶ TransformField ──┬──▶ "}" ──▶
│ │
└─────────────────────┘
TransformField:
──▶ DocComment? ──▶ IDENT ──▶ ( ":" TypeRef )? ──▶ "=" ──▶ Expression ──▶
Expression (Enhanced)
┌──▶ WhenExpr ─────────────────────────┐
│ │
──▶ ────────────┤ ├──▶
│ │
└──▶ ConditionalExpr ──────────────────┘
ConditionalExpr:
──▶ NullCoalesce ──┬──▶ "?" ──▶ Expression ──▶ ":" ──▶ Expression ──▶
│
└──▶ (none) ──▶
NullCoalesce:
──▶ LogicalOr ──┬──▶ "??" ──▶ LogicalOr ──┬──▶
│ │
└───────────────────────────┘
Primary:
┌──▶ FunctionCall ────┐
├──▶ PathExpression ──┤
├──▶ StringLiteral ───┤
├──▶ NumberLiteral ───┤
──▶ ─────┼──▶ BooleanLiteral ──┼──▶
├──▶ DurationLiteral ─┤
├──▶ ObjectLiteral ───┤
├──▶ ArrayLiteral ────┤
└──▶ "(" Expr ")" ───┘
FunctionCall:
──▶ IDENT ──▶ "(" ──┬──▶ Expression ──┬──▶ "," ──▶ Expression ──┬──▶ ")" ──▶
│ └─────────────────────────┘
└──▶ (empty) ──────────────────────────────────▶
WhenExpr:
──▶ "when" ──▶ Subject? ──▶ "{" ──┬──▶ Expr "->" Expr ──┬──▶ OtherwiseClause? ──▶ "}" ──▶
│ │
└──────────────────────┘
3.3 Contextual Keywords
All 50 keyword tokens are contextual: they carry special meaning only in specific syntactic positions but can still be accepted by expectIdent() when the grammar expects an identifier. This allows domain names such as phase, mode, or transform to remain usable where the parser is looking for IDs rather than grammar markers.
Positions where keywords are treated as identifiers:
- Graph, phase, session, node, transform, foreach, loop, wait, await, and script IDs
- Operator references after node <id> : <operatorRef>
- Schema names and field names
- Retry strategy names
- Function arguments and path segments inside expressions
- Event correlation keys inside await event ... where <key> = ...
This design means the lexer can stay strict while the parser still supports natural domain naming.
Semantic reservation still applies: contextual keywords are not the same as semantically safe identifiers. Some names remain reserved by later compiler stages. For example, node IDs cannot use ctx, prev, carry, loopIteration, stream, or output, and lambda parameters cannot shadow reserved DSL keywords.
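The split between syntactic acceptance and semantic reservation described in this section can be sketched in two small functions. This is a Python illustration under stated assumptions: the Token shape, the KEYWORD_TYPES subset, and check_node_id are hypothetical, and only expectIdent() is a name taken from the text above.

```python
from collections import namedtuple

# Hypothetical token shape: type, raw lexeme, and 1-based position.
Token = namedtuple("Token", "type lexeme line column")

# Illustrative subset of the 50 keyword token types.
KEYWORD_TYPES = {"PHASE", "MODE", "TRANSFORM", "NODE", "STREAM", "OUTPUT"}
# Names this section lists as unusable for node IDs.
RESERVED_NODE_IDS = {"ctx", "prev", "carry", "loopIteration", "stream", "output"}

def expect_ident(token):
    """Parser step: accept IDENT or any keyword token where an identifier is expected."""
    if token.type == "IDENT" or token.type in KEYWORD_TYPES:
        return token.lexeme
    raise SyntaxError(f"{token.line}:{token.column}: expected identifier, found {token.type}")

def check_node_id(name):
    """Later compiler step: a syntactically valid ID can still be reserved."""
    if name in RESERVED_NODE_IDS:
        raise ValueError(f"'{name}' is reserved and cannot be used as a node ID")
    return name
```

Under this split, a node named phase passes both steps, while a node named stream is accepted by the parser step and rejected by the later check.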
3.4 Session Extension Syntax
session is the stable top-level extension form for long-running conversational or multi-round flows. A .bloge source file may now contain either a graph or a session as its root production.
session onboarding {
  idle_timeout = 30m
  max_rounds = 10
  phase collectProfile {
    yield_on = [profile_submitted]
    then -> review
    round {
      node askProfile : AskProfileOperator {}
      wait profileTimeout = 24h after askProfile {
        signal_key = ctx.sessionId
      }
    }
  }
  phase review {
    on_round_failure = retry_phase
    until ctx.approved
    node evaluate : EvaluateProfileOperator {}
    then {
      ctx.approved -> complete
      otherwise -> collectProfile
    }
  }
}
Key structural rules:
- Session-level properties (idle_timeout, timeout_action, max_rounds, max_history) live directly under session.
- phase blocks are ordered children of the session and are represented as nested ExtensionDef nodes.
- A phase is either a once phase (direct Member children) or a round phase (round { ... }), never both.
- then supports either a direct phase target (then -> nextPhase) or a case table that lowers into an ordered list of transition objects.
- Phase bodies can contain the same executable members as graphs, including node, branch, transform, foreach, loop, wait, await, script, and streaming variants.
4. AST Specification
4.1 AST Node Hierarchy
The AST is defined via Java sealed interface hierarchies in AstNode.java and Expression.java. All AST nodes carry source position (line, column) for diagnostics, conformance snapshots, and downstream compilation.
AstNode (sealed interface)
├── GraphDef(name, members: List<AstNode>, description, line, column)
├── ExtensionDef(kind, id, properties: Map<String, Expression>,
│ children: List<AstNode>, description, line, column)
├── NodeDef(id, operatorRef, input, dependsOn, timeout, retry, fallback,
│ inputSchema, outputSchema, scope, streaming, bufferSize,
│ description, line, column)
├── BranchDef(condition: Expression, cases: List<BranchCase>,
│ otherwise, description, line, column)
├── InputBlock(bindings: Map<String, Expression>,
│ fieldComments: Map<String, String>, line, column)
├── SchemaDef(name, body: SchemaDeclaration, line, column)
├── TransformDef(id, letBindings: List<LetBinding>,
│ fields: List<TransformField>, description, line, column)
├── ForEachDef(id, itemsExpr: Expression, sequential, itemVar, indexVar,
│ scope, streaming, bufferSize, body: List<AstNode>,
│ description, line, column)
├── LoopDef(id, maxIterations, delay, dependsOn, scope, streaming,
│ bufferSize, body: List<AstNode>, carryDef, untilCondition,
│ description, line, column)
├── WaitDef(id, timerExpr, afterNode, signalKey,
│ onTimeoutPayload, onFirePayload, description, line, column)
├── AwaitDef(id, aggregationMode, events: List<EventMatcherDef>, timeout,
│ onTimeoutPayload, dependsOn, description, line, column)
├── ScriptDef(id, lang, input, outputSchema, code, timeout,
│ description, line, column)
├── CarryDef(bindings, line, column)
├── UntilDef(condition, line, column)
└── CommentNode(text, style, line, column)
(helper records, not AstNode)
├── BranchCase(value: Expression, target, description)
├── EventMatcherDef(eventName, correlationKey, expectedValue, optional, line, column)
├── TransformField(name, typeAnnotation, value: Expression, description, line, column)
├── LetBinding(name, value: Expression, line, column)
├── RetryDef(attempts, backoff: DurationValue, strategy)
├── FallbackDef(value: Expression)
└── DurationValue(amount, unit)
SchemaDeclaration (sealed interface)
├── InlineSchema(fields: List<FieldDeclaration>, line, column)
└── SchemaRef(name, line, column)
Expression (sealed interface)
├── ContextPath(segments, line, column)
├── NodeOutputPath(nodeId, segments, line, column)
├── NodeStreamPath(nodeId, segments, line, column)
├── TransformFieldPath(transformId, fieldName, line, column)
├── ItemPath(segments, line, column)
├── ItemIndex(line, column)
├── LoopPrevPath(nodeId, segments, line, column)
├── LoopCarryPath(segments, line, column)
├── LoopIterationRef(line, column)
├── LambdaParamPath(paramName, segments, line, column)
├── StringLiteral(value, line, column)
├── NumberLiteral(value, line, column)
├── BooleanLiteral(value, line, column)
├── DurationLiteral(duration, line, column)
├── ObjectLiteral(fields, line, column)
├── ArrayLiteral(elements, line, column)
├── BinaryOp(left, op, right, line, column)
├── UnaryOp(op, operand, line, column)
├── ConditionalExpr(condition, thenBranch, elseBranch, line, column)
├── NullCoalesce(primary, fallback, line, column)
├── FunctionCall(name, args, line, column)
├── WhenExpr(subject, clauses, otherwise, line, column)
├── LambdaExpr(params, body, line, column)
├── MethodCallExpr(receiver, method, args, line, column)
└── GroupExpr(inner, line, column)
WhenClause(condition: Expression, result: Expression)
BinaryOperator(enum): PLUS, MINUS, STAR, SLASH, PERCENT,
EQ_EQ, BANG_EQ, GT, LT, GT_EQ, LT_EQ,
AMP_AMP, PIPE_PIPE
UnaryOperator(enum): NEGATE, NOT
4.2 Syntax → AST Mapping
| Syntax Production | AST Node / Helper | Notes |
|---|---|---|
| GraphDef | GraphDef | members contains executable graph members and detached CommentNodes |
| SessionDef | ExtensionDef(kind="session") | Session-level properties are stored in properties; phases become children |
| PhaseDef | ExtensionDef(kind="phase") | phase_type, then, until, yield_on, and round/once metadata are normalized into properties |
| RoundBlock | ExtensionDef(kind="phase") child list | round { ... } does not emit a dedicated node; its members replace the phase child list |
| NodeDef | NodeDef | Input, schemas, resilience, scope, streaming, and buffer metadata map to dedicated fields |
| StreamMember | NodeDef / ForEachDef / LoopDef | stream is represented by streaming=true on the wrapped member |
| ForEachDef | ForEachDef | itemVar, optional indexVar, sequential, scope, bufferSize, and child body are preserved explicitly |
| LoopDef | LoopDef | carry lowers into CarryDef; until lowers into untilCondition |
| WaitDef | WaitDef | Timer constructor is stored as an Expression; payload blocks are Map<String, Expression> |
| AwaitDef | AwaitDef | Event matchers lower into EventMatcherDef; mode defaults to "and" if omitted |
| ScriptDef | ScriptDef | code preserves raw triple-string content; output_schema may be inline or by reference |
| BranchDef | BranchDef | condition is any path-capable expression accepted by parseExpression() |
| InputBlock | InputBlock | Key-value map; values are any Expression subtype |
| TransformDef | TransformDef | let bindings and output fields are stored separately |
| TransformField | TransformField | typeAnnotation is nullable; value is any Expression |
| SchemaDef | SchemaDef | Top-level standalone schema; body is typically InlineSchema |
| OutputDecl (inline) | InlineSchema | output { field: Type } |
| OutputDecl (ref) | SchemaRef | output: SchemaName |
| InputSchemaBlock | InlineSchema | Stored as NodeDef.inputSchema |
| TimeoutField | DurationValue | timeout = 3s → DurationValue(3, "s") |
| RetryField | RetryDef | Missing strategy defaults to "fixed" in compiler normalization |
| FallbackField | FallbackDef | Wraps any Expression |
| FieldDeclaration | FieldDeclaration | name: Type? marks the field optional; nested object schemas lower into InlineSchema |
| Expression (path) | Path-aware Expression subtype | See §4.3 for classification rules |
| Expression (lambda) | LambdaExpr | Only valid where a higher-order collection method expects a lambda |
| Expression (method) | MethodCallExpr | Receiver-style collection helpers such as items.map(x -> ...) |
DurationValue Parsing
The DurationValue is parsed from the DURATION token lexeme:
| Lexeme | Parsed As |
|---|---|
| 100ms | DurationValue(100, "ms") |
| 3s | DurationValue(3, "s") |
| 5m | DurationValue(5, "m") |
Suffix matching order: ms → s → m (longest-suffix-first).
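As a rough illustration of the table above, splitting a DURATION lexeme into amount and unit might look like the following. The function name parse_duration is hypothetical and the tuple return value stands in for the DurationValue record; only the three suffixes listed in this subsection are handled.

```python
def parse_duration(lexeme):
    """Split a DURATION lexeme into (amount, unit), e.g. "100ms" -> (100, "ms")."""
    for unit in ("ms", "s", "m"):  # longest suffix first
        if lexeme.endswith(unit):
            digits = lexeme[: -len(unit)]
            if digits.isdigit():
                return int(digits), unit
    raise ValueError(f"not a duration lexeme: {lexeme!r}")
```

Checking ms before s keeps 100ms from being read as the amount "100m" with unit s.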
4.3 Expression Path Resolution Rules
When parsing a PathExpression (an IDENT followed by zero or more .IDENT segments), the parser classifies the result into one of several path-aware Expression subtypes based on the following rules, applied in order:
Rule 1: ctx prefix → ContextPath
ctx.request.userId → ContextPath(segments=["request", "userId"])
ctx.items → ContextPath(segments=["items"])
If the first segment is "ctx", the expression is a ContextPath. The "ctx" prefix is stripped from the segments list.
Rule 2: <nodeId>.output.<segments> → NodeOutputPath (explicit)
fetchUser.output.id → NodeOutputPath(nodeId="fetchUser", segments=["id"])
fetchUser.output → NodeOutputPath(nodeId="fetchUser", segments=[])
calcPrice.output.total → NodeOutputPath(nodeId="calcPrice", segments=["total"])
If the path has ≥2 segments and the second segment is "output", it is a NodeOutputPath. The node ID is the first segment, and the segments list starts after "output".
Rule 3: Single identifier → ContextPath
someVar → ContextPath(segments=["someVar"])
If the path has exactly one segment (not "ctx"), it is a ContextPath with that single segment.
Rule 4: <transformId>.<fieldName> → TransformFieldPath
orderSummary.customerName → TransformFieldPath(transformId="orderSummary", fieldName="customerName")
riskMetrics.score → TransformFieldPath(transformId="riskMetrics", fieldName="score")
If the path has exactly two segments and the first segment matches a declared transform ID in the current graph, it is a TransformFieldPath. The first segment is the transform ID, the second segment is the field name.
Design decision: Transform references use transformId.fieldName (without the .output. infix), distinguishing them from node references (nodeId.output.field). Transforms do not have an explicit input {} block and only expose computed output fields, which makes the .output. segment redundant and visually noisy.
Rule 5: <nodeId>.<segments> → NodeOutputPath (implicit)
fetchUser.name → NodeOutputPath(nodeId="fetchUser", segments=["name"])
a.result.data → NodeOutputPath(nodeId="a", segments=["result", "data"])
Any multi-segment path that doesn't match Rules 1–4 is treated as a NodeOutputPath where the first segment is the node ID and the remaining segments are the output path. This is the implicit form (without the output keyword).
Disambiguation note: When a two-segment path x.y could match both Rule 4 (transform) and Rule 5 (implicit node output), the compiler resolves it based on which IDs are declared. If x is declared as both a transform ID and a node ID, a compile-time error is raised: "Transform '<id>' conflicts with node of the same name" (see §6.3). Naming conventions (noun phrases for transforms such as orderSummary, verb phrases for nodes such as fetchUser) are recommended but not enforced by the compiler.
Rule 6: <nodeId>.stream or <nodeId>.stream.<fieldName> → NodeStreamPath
processData.stream → NodeStreamPath(nodeId="processData", segments=[])
processData.stream.chunk → NodeStreamPath(nodeId="processData", segments=["chunk"])
If the path has ≥2 segments and the second segment is "stream", it is a NodeStreamPath. The node ID is the first segment. The segments list contains everything after "stream". This form is only valid when the upstream node is declared as a streaming node; at runtime, results.getChannel(nodeId) is called to retrieve the live NodeChannel<?> for streaming consumption.
Rule 7 (foreach body): <itemVar>.<path> → ItemPath
Inside a foreach body, the item variable binding declared in foreach id : (itemVar) in ... creates an implicit item reference:
// foreach orders : (order) in ctx.orders { ... }
order.customerId → ItemPath(segments=["customerId"])
order → ItemPath(segments=[]) // entire item
At runtime, the compiled expression reads the current item from the GraphContext key __item__. The item variable name (order in the example) is lexically matched during parsing; any path whose first segment equals itemVar is classified as ItemPath.
Rule 8 (foreach body): <indexVar> → ItemIndex
Inside a foreach body with a two-binding form (itemVar, indexVar), the second identifier is the item index reference:
// foreach orders : (order, idx) in ctx.orders { ... }
idx → ItemIndex
At runtime, the compiled expression reads the 0-based integer from the GraphContext key __itemIndex__.
Rule 9 (loop body): prev.<nodeId>.<path> → LoopPrevPath
Inside a loop body, the special prefix prev refers to the previous iteration's terminal node outputs:
// inside loop body:
prev.checkStatus.status → LoopPrevPath(nodeId="checkStatus", segments=["status"])
prev.checkStatus → LoopPrevPath(nodeId="checkStatus", segments=[])
At runtime the compiled extractor reads from GraphContext key __prev__ (a Map<String,Object> of the previous iteration's node outputs).
Rule 10 (loop body): carry.<path> → LoopCarryPath
Inside a loop body, the special prefix carry refers to the carry state passed from the previous iteration (or the initial input of the loop node):
// inside loop body:
carry.retryCount → LoopCarryPath(segments=["retryCount"])
carry → LoopCarryPath(segments=[]) // entire carry map
At runtime the compiled extractor reads from GraphContext key __carry__.
Rule 11 (loop body): loopIteration → LoopIterationRef
Inside a loop body, the bare identifier loopIteration refers to the 0-based iteration counter:
loopIteration → LoopIterationRef
At runtime the compiled extractor reads from GraphContext key __loopIteration__ (an Integer).
Reserved identifier summary:
ctx, prev, carry, loopIteration, stream, and output are reserved implicit identifiers in the DSL expression layer. They cannot be used as node IDs or lambda parameter names. See the Implicit Declarations Registry (docs/implicit-declarations-registry.md) for the complete list of reserved names and their runtime lifecycle.
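As an illustration of Rules 6 through 11, the following sketch classifies a dotted path from its segments and the enclosing lexical scope. It is not the Parser.java implementation: the class, the Kind enum, the Scope record, and the order in which the rules are tried are all assumptions made for this example, and Rules 1 through 5 are collapsed into OTHER.

```java
import java.util.List;

/** Hypothetical classifier for Rules 6-11; names and rule precedence are illustrative assumptions. */
final class PathClassifierSketch {
    enum Kind { NODE_STREAM, ITEM, ITEM_INDEX, LOOP_PREV, LOOP_CARRY, LOOP_ITERATION, OTHER }

    /** Lexical scope: itemVar/indexVar are null outside a foreach body; inLoop marks a loop body. */
    record Scope(String itemVar, String indexVar, boolean inLoop) {}

    static Kind classify(List<String> segs, Scope scope) {
        String head = segs.get(0);
        // Rules 7 and 8: foreach bindings are matched lexically on the first segment.
        if (head.equals(scope.itemVar())) return Kind.ITEM;
        if (segs.size() == 1 && head.equals(scope.indexVar())) return Kind.ITEM_INDEX;
        if (scope.inLoop()) {
            // Rule 9 needs at least prev.<nodeId>.
            if (head.equals("prev") && segs.size() >= 2) return Kind.LOOP_PREV;
            // Rule 10: carry alone refers to the entire carry map.
            if (head.equals("carry")) return Kind.LOOP_CARRY;
            // Rule 11: bare identifier only.
            if (segs.size() == 1 && head.equals("loopIteration")) return Kind.LOOP_ITERATION;
        }
        // Rule 6: a second segment equal to "stream" marks a stream path.
        if (segs.size() >= 2 && segs.get(1).equals("stream")) return Kind.NODE_STREAM;
        return Kind.OTHER; // Rules 1-5 are outside the scope of this sketch
    }
}
```

For example, classify(List.of("order", "customerId"), new Scope("order", "idx", false)) yields ITEM, while the same path outside a foreach body falls through to OTHER.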
4.4 Doc-Comment Attachment Rules
Documentation comments (/// and /** */) follow a prefix attachment strategy:
- Consumption: the parser calls consumeDocComments(), which collects consecutive DOC_COMMENT tokens and merges their lexemes with newline separators.
- Attachment targets: a collected doc-comment is attached to the immediately following syntax element when that element has a description field or field-level comment slot.
- Standalone comment nodes: if a doc-comment is not followed by an attachable target, the parser emits a CommentNode so documentation is not silently discarded.
- Forwarding: member-level parsers use the pendingDocComment handoff so nested parsing methods can still observe the prefix comment that was consumed by the outer dispatcher.
Concrete attachment targets include:
- graph, session, phase, node, branch, transform, foreach, loop, wait, await, and script definitions
- branch case entries (BranchCase.description)
- input {} field comments (InputBlock.fieldComments)
- transform field descriptions (TransformField.description)
4.5 Session Extension AST Mapping
Sessions are intentionally modeled with the generic ExtensionDef node so future language extensions can reuse the same structural contract.
| Syntax | AST form | Notes |
|---|---|---|
session onboarding { ... } | ExtensionDef(kind="session", id="onboarding", ...) | Session properties are stored in properties; phases become ordered children |
phase review { ... } | ExtensionDef(kind="phase", id="review", ...) | phase_type is synthesized as once or round during parsing |
yield_on = [a, b] | properties["yield_on"] = ArrayLiteral([...]) | Identifiers are normalized into string literals inside the array |
then -> nextPhase | properties["then"] = StringLiteral("nextPhase") | Single-target transition |
then { expr -> a otherwise -> b } | properties["then"] = ArrayLiteral([ObjectLiteral(...) ...]) | Each object contains condition and target fields |
round { ... } | child members of the phase ExtensionDef | No standalone RoundDef record is emitted |
This shape keeps the parser and conformance fixtures stable while allowing the dedicated session compiler to interpret extension properties with richer runtime semantics.
5. Compilation Semantics
5.1 Compilation Pipeline Overview
The GraphLoader provides the entry point for the three-stage compilation pipeline:
Source (.bloge text)
│
▼
┌──────────────────┐
│ Lexer.tokenize() │ → List<Token>
└──────────────────┘
│
▼
┌──────────────────┐
│ Parser.parse() │ → GraphDef (AST)
└──────────────────┘
│
▼
┌──────────────────────┐
│ DslCompiler.compile()│ → Graph (runtime model)
└──────────────────────┘
The DslCompiler.compile() stage processes the AST in the following order:
- Schema registration: collect top-level SchemaDef entries into a named schema registry
- Node compilation: compile NodeDef entries into NodeSpec instances (operator resolution, input assembly, resilience config, schema binding)
- Transform compilation: compile TransformDef entries into virtual NodeSpec instances backed by TransformOperator (expression compilation, dependency inference, schema generation); see §5.8
- Transform ordering: topological sort of transforms using Kahn's algorithm with cycle detection
- Branch compilation: compile BranchDef entries into Edge.Conditional edges
- Dependency merging: merge explicit (depends_on) and implicit (expression-inferred) dependencies; add transform-originated dependencies
- Graph assembly: construct the Graph runtime model with all NodeSpec and Edge entries; DAG validation
The GraphLoader also supports file-watching mode (watch(Path, Consumer<Graph>)) that monitors a directory for .bloge file changes and recompiles on update.
5.2 Operator Resolution
Each NodeDef.operatorRef is resolved against the OperatorRegistry:
registry.contains(nd.operatorRef()) // must return true
- If the operator is not registered, a GraphDefinitionException is thrown: "Node '<id>' references unregistered operator '<ref>'"
- The operator reference is a simple string name (e.g., "FetchUserOperator")
- Operator metadata (input/output schemas) may be auto-introspected via registry.metadata(operatorRef) for schema enrichment
5.3 Implicit Dependency Inference
Dependencies between nodes are established through two mechanisms, merged and deduplicated:
Explicit Dependencies
depends_on = [fetchUser, calcPrice]
Produces DirectEdge(from, to) for each listed node ID.
Implicit Dependencies
When compiling input block expressions, any NodeOutputPath or TransformFieldPath expression automatically registers the referenced node or transform as an implicit dependency:
input {
userId = fetchUser.output.id // implicit dep on "fetchUser"
total = calcPrice.output.total // implicit dep on "calcPrice"
name = orderSummary.customerName // implicit dep on transform "orderSummary"
}
The compiler tracks implicit dependencies per node in a Map<String, Set<String>>. After all input blocks are compiled, explicit and implicit dependencies are merged (using LinkedHashSet for deduplication) to produce the final edge list.
Transform Dependencies
Transform blocks also participate in dependency inference. Each TransformDef is compiled into a virtual NodeSpec, and any NodeOutputPath or TransformFieldPath in its field expressions creates an implicit dependency from the transform to the referenced node or transform. See §5.8 for details.
If any dependency (explicit or implicit) references a non-existent node, a GraphDefinitionException is thrown: "Node '<id>' depends on non-existent node '<dep>'".
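The merge step described in this section might look roughly like the sketch below. The class and method names are hypothetical, explicit dependencies are assumed to be listed before implicit ones, and IllegalStateException stands in for GraphDefinitionException so the example compiles on its own.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of dependency merging for a single node. */
final class DependencyMergeSketch {
    static List<String> merge(String nodeId, List<String> explicit,
                              Set<String> implicit, Set<String> knownIds) {
        // LinkedHashSet removes duplicates while preserving first-seen order.
        Set<String> merged = new LinkedHashSet<>(explicit);
        merged.addAll(implicit);
        for (String dep : merged) {
            if (!knownIds.contains(dep)) {
                // Stand-in for GraphDefinitionException; message text follows the spec.
                throw new IllegalStateException(
                    "Node '" + nodeId + "' depends on non-existent node '" + dep + "'");
            }
        }
        return List.copyOf(merged);
    }
}
```

A node with depends_on = [fetchUser, calcPrice] whose input block also references calcPrice and orderSummary would end up with three dependencies in this sketch: fetchUser, calcPrice, orderSummary.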
5.4 Expression Compilation
Each Expression AST node is compiled into a BiFunction<NodeResults, GraphContext, Object> — a runtime extractor function.
| Expression Type | Compilation Strategy |
|---|---|
ContextPath | Navigate GraphContext with gc.get(...), then continue through Map lookup or reflective property access. |
NodeOutputPath | Read results.getRaw(nodeId) and walk remaining segments with the cached property accessor. |
NodeStreamPath | Register an implicit dependency on nodeId, then read results.getChannel(nodeId) to expose the live stream channel or a derived stream field. |
TransformFieldPath | Read the transform's virtual-node output map and extract the referenced field. |
ItemPath | Resolve from the current foreach item bound in GraphContext reserved keys. |
ItemIndex | Read the current foreach index from reserved keys. |
LoopPrevPath | Resolve against the previous iteration snapshot stored under the loop reserved context. |
LoopCarryPath | Resolve against the current loop carry map. |
LoopIterationRef | Read the current loop iteration counter from reserved keys. |
LambdaParamPath | Resolve against the lexical lambda parameter frame introduced by a higher-order method call. |
StringLiteral | Return the constant string value. |
NumberLiteral | Return the constant double value. |
BooleanLiteral | Return the constant boolean value. |
DurationLiteral | Return a java.time.Duration via DurationValue.toDuration(). |
ObjectLiteral | Compile each field expression and build a LinkedHashMap<String, Object> at runtime. |
ArrayLiteral | Compile each element expression and build a runtime List<Object>. |
BinaryOp | Compile both sides, apply arithmetic/comparison/logical coercion rules, and short-circuit logical operators. |
UnaryOp | Compile the operand and apply numeric negation or boolean inversion. |
ConditionalExpr | Compile condition/then/else extractors and evaluate only the selected branch at runtime. |
NullCoalesce | Evaluate the primary value first and only evaluate the fallback when the primary returns null. |
FunctionCall | Resolve an ExpressionFunction, compile arguments, and invoke apply(Object...) with the evaluated values. |
WhenExpr | Compile the optional subject plus clause/result expressions, then execute ordered matching with optional otherwise. |
LambdaExpr | Compile into an internal callable object captured by higher-order collection methods; standalone lambda values are rejected in node and wait/await payload bindings. |
MethodCallExpr | Compile the receiver and arguments, ensure the method name is in the allowed collection-method set, and dispatch to CollectionOps.invoke(...). |
GroupExpr | Compile the inner expression transparently; grouping only preserves precedence. |
Higher-order collection methods and lambda semantics
Receiver-style collection helpers currently support: map, filter, flatMap, reduce, groupBy, sortBy, find, any, all, zip, associate, take, drop, chunked, windowed, distinctBy, minBy, maxBy, count, sumBy, and partition.
Lambda-specific rules:
- LambdaExpr is only meaningful as an argument to a higher-order collection method.
- reduce requires an init value plus a two-parameter lambda: (acc, x) -> ...
- associate requires two lambdas: one for the key and one for the value.
- Lambda parameter names participate in reserved-key validation so they cannot shadow DSL runtime identifiers.
The compiled extractors are packaged into a CompiledInputAssembler (via CompiledInputAssembler.ofMap()) which implements the InputAssembler<Map<String, Object>> interface.
Expression Evaluation Semantics
Truthiness: For ConditionalExpr, WhenExpr (Form A), and logical operators, values are coerced to boolean: null → false, Boolean → as-is, Number → false if 0, String → false if empty, all others → true.
Null propagation: Arithmetic and comparison operators return null if either operand is null (except ?? which explicitly handles null). NullCoalesce (??) returns the fallback when the primary is null.
Type coercion for +: If either operand is a String, the other is converted via String.valueOf() and the result is string concatenation. Otherwise, both operands are coerced to Number for arithmetic addition.
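The three rules above can be illustrated with a small sketch. The class and method names are hypothetical. The spec does not say whether null propagation or the String rule wins when one operand of + is null and the other is a String, so checking null first is an assumption here, as is rejecting operands that are neither String nor Number.

```java
import java.util.function.Supplier;

/** Hypothetical sketch of truthiness, + coercion, and ?? evaluation. */
final class EvalSemanticsSketch {
    /** Truthiness coercion as listed in the Truthiness rule. */
    static boolean truthy(Object v) {
        if (v == null) return false;
        if (v instanceof Boolean b) return b;
        if (v instanceof Number n) return n.doubleValue() != 0.0;
        if (v instanceof String s) return !s.isEmpty();
        return true;
    }

    /** The + operator. Checking null before the String rule is an assumption of this sketch. */
    static Object plus(Object a, Object b) {
        if (a == null || b == null) return null;
        if (a instanceof String || b instanceof String) {
            return String.valueOf(a) + String.valueOf(b);
        }
        if (a instanceof Number x && b instanceof Number y) {
            return x.doubleValue() + y.doubleValue();
        }
        // Sketch only: other operand types are rejected rather than coerced.
        throw new IllegalArgumentException("Unsupported operands for +");
    }

    /** The ?? operator: the fallback is evaluated only when the primary is null. */
    static Object coalesce(Object primary, Supplier<Object> fallback) {
        return primary != null ? primary : fallback.get();
    }
}
```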
Property Access Chain
The CachedPropertyAccessor.getProperty(obj, name) utility resolves properties in this order:
- Map.get(key): if the object is a Map
- Record component accessor: if the object is a Java record
- Getter method (getXxx): standard JavaBean convention
- Public field: direct field access
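A reflective lookup that follows this resolution order could be sketched as below. This is not the CachedPropertyAccessor source: it omits caching entirely, returns null when nothing matches (an assumption), and the User and Bean types exist only to exercise the example.

```java
import java.lang.reflect.RecordComponent;
import java.util.Map;

/** Hypothetical, uncached sketch of the four-step property resolution order. */
final class PropertyAccessSketch {
    record User(String id) {}                      // demo type for step 2
    static final class Bean {                      // demo type for steps 3 and 4
        public final int age = 7;
        public String getName() { return "Ada"; }
    }

    static Object getProperty(Object obj, String name) {
        if (obj == null) return null;
        if (obj instanceof Map<?, ?> map) return map.get(name);        // 1. Map lookup
        Class<?> type = obj.getClass();
        try {
            if (type.isRecord()) {                                     // 2. record accessor
                for (RecordComponent rc : type.getRecordComponents()) {
                    if (rc.getName().equals(name)) return rc.getAccessor().invoke(obj);
                }
            }
            String getter = "get" + Character.toUpperCase(name.charAt(0)) + name.substring(1);
            try {
                return type.getMethod(getter).invoke(obj);             // 3. JavaBean getter
            } catch (NoSuchMethodException e) {
                // no getter; fall through to the field lookup
            }
            try {
                return type.getField(name).get(obj);                   // 4. public field
            } catch (NoSuchFieldException e) {
                return null;                                           // assumption: unresolved yields null
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot read property '" + name + "'", e);
        }
    }
}
```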
5.5 Branch Compilation
A BranchDef is compiled into an Edge.Conditional (implementing ConditionalEdge):
BranchDef → Edge.Conditional(fromNodeId, conditionField, branches, otherwise)
Condition Expression
The branch condition (branch on <expr>) must be a NodeOutputPath expression. The compiler extracts:
- fromNodeId: the referenced node ID
- conditionField: the dot-joined path segments (e.g., "status", "output.mode"), or null if segments are empty
If the condition is not a NodeOutputPath, a GraphDefinitionException is thrown.
Case Predicate Building
Each BranchCase value is compiled into a Predicate<Object> with type-lenient comparison:
| Case Value Type | Predicate Logic |
|---|---|
BooleanLiteral | If runtime value is Boolean → direct == comparison; otherwise → case-insensitive String.valueOf() comparison |
StringLiteral | sl.value().equals(val) or sl.value().equals(String.valueOf(val)) |
NumberLiteral | If runtime value is Number → doubleValue() comparison; otherwise → false |
| Other | Objects.equals(val, evaluateLiteral(caseValue)) fallback |
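The type-lenient comparisons in the table can be expressed as small predicate factories. The sketch below mirrors the first three rows only; the class and method names are hypothetical and the compiler's actual code may be structured differently.

```java
import java.util.function.Predicate;

/** Hypothetical predicate factories mirroring the case-value table. */
final class CasePredicateSketch {
    /** BooleanLiteral: direct comparison for Boolean, else case-insensitive string comparison. */
    static Predicate<Object> forBoolean(boolean expected) {
        return val -> val instanceof Boolean b
                ? b == expected
                : String.valueOf(expected).equalsIgnoreCase(String.valueOf(val));
    }

    /** StringLiteral: equal to the value itself or to its String.valueOf() form. */
    static Predicate<Object> forString(String expected) {
        return val -> expected.equals(val) || expected.equals(String.valueOf(val));
    }

    /** NumberLiteral: doubleValue() comparison for Number, false for anything else. */
    static Predicate<Object> forNumber(double expected) {
        return val -> val instanceof Number n && n.doubleValue() == expected;
    }
}
```

Under these rules a case value of "42" matches the runtime Integer 42, while a numeric case value 1 does not match the runtime string "1".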
Otherwise Target
If an otherwise clause is present, its target node ID is stored as the conditional edge's default branch. If the otherwise target references a non-existent node, a GraphDefinitionException is thrown.
Validation
- The fromNodeId must exist in the node set
- Each branch case target must exist in the node set
- The otherwise target (if present) must exist in the node set
5.6 Resilience Configuration Compilation
Node resilience settings are compiled into a ResilienceConfig record:
Timeout
timeout = 3s
→ DurationValue(3, "s").toDuration() → java.time.Duration.ofSeconds(3)
| DSL Unit | Java Conversion |
|---|---|
ms | Duration.ofMillis(amount) |
s | Duration.ofSeconds(amount) |
m | Duration.ofMinutes(amount) |
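The unit table translates directly into a switch over the unit suffix. The following is a minimal sketch, not the DurationValue source; the class name is hypothetical and rejecting unknown units with an exception is an assumption.

```java
import java.time.Duration;

/** Hypothetical sketch of the DSL unit to java.time.Duration conversion. */
final class DurationUnitSketch {
    static Duration toDuration(long amount, String unit) {
        return switch (unit) {
            case "ms" -> Duration.ofMillis(amount);
            case "s" -> Duration.ofSeconds(amount);
            case "m" -> Duration.ofMinutes(amount);
            // Assumption: the spec lists only these three units, so anything else is rejected.
            default -> throw new IllegalArgumentException("Unsupported duration unit: " + unit);
        };
    }
}
```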
Retry
retry = {
attempts: 3
backoff: 200ms
strategy: exponential
}
→ ResilienceConfig(retryAttempts=3, retryBackoff=Duration.ofMillis(200), backoffStrategy=EXPONENTIAL, ...)
| Config Key | Type | Default | Description |
|---|---|---|---|
attempts | int | 0 | Max retry attempts |
backoff | Duration | 100ms | Backoff delay |
strategy | String | "fixed" | Backoff strategy name |
Strategy string mapping:
| DSL Value | BackoffStrategy Enum |
|---|---|
"exponential" | EXPONENTIAL |
"jitter" | JITTER |
"fixed" | FIXED |
| (any other) | FIXED (default) |
The strategy field is optional; if omitted, defaults to "fixed".
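The strategy mapping, including its fallback to FIXED, can be sketched as follows. The enum is redeclared locally so the example compiles on its own, and exact case-sensitive matching of the DSL value is an assumption.

```java
/** Hypothetical sketch of the strategy string mapping; the enum is a local stand-in. */
final class BackoffStrategySketch {
    enum BackoffStrategy { FIXED, EXPONENTIAL, JITTER }

    static BackoffStrategy parse(String dslValue) {
        if (dslValue == null) return BackoffStrategy.FIXED; // strategy omitted
        return switch (dslValue) {
            case "exponential" -> BackoffStrategy.EXPONENTIAL;
            case "jitter" -> BackoffStrategy.JITTER;
            default -> BackoffStrategy.FIXED; // "fixed" and any unrecognized value
        };
    }
}
```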
Fallback
fallback = { status: "error", code: 500 }
fallback = "default_value"
The fallback expression is eagerly evaluated at compile time (not at runtime) via evaluateFallbackExpression(). The resulting value is wrapped in a Supplier<?> closure:
| Expression Type | Fallback Value |
|---|---|
StringLiteral | The string value |
NumberLiteral | The double value |
BooleanLiteral | The boolean value |
ObjectLiteral | LinkedHashMap<String, Object> (recursively evaluated) |
ArrayLiteral | ArrayList<Object> (recursively evaluated) |
| Other | null |
5.7 Schema Resolution & Validation
Schema Declarations
Schemas can be declared in four forms:
- Top-level named schema (SchemaDef):
schema UserOutput {
id: Int
name: String
email: String?
}
- Inline output schema (within a node):
output {
id: Int
name: String
}
- Schema reference (within a node):
output: UserOutput
- Inline input schema (within a node, disambiguated from InputBlock by lookahead):
input {
name: String
age: Int
}
Named Schema Registry
Top-level SchemaDef members are registered in a Map<String, SchemaDescriptor> during compilation. When a SchemaRef is encountered, it is resolved against this registry. If the referenced schema is not found, a GraphDefinitionException is thrown: "Referenced schema '<name>' not found (at <line>:<column>)".
Type Name Mapping
Field type names in schema declarations are mapped to Java types:
| DSL Type | Java Type |
|---|---|
String | String.class |
Int | Integer.class |
Integer | Integer.class |
Long | Long.class |
Double | Double.class |
Float | Float.class |
Boolean | Boolean.class |
Bool | Boolean.class |
Number | Number.class |
Object | Object.class |
Map | Map.class |
List | List.class |
| (unknown) | Object.class |
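The mapping table, including the Object.class fallback for unknown names, amounts to a lookup with a default. The class and method names in this sketch are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the DSL type name to Java class lookup. */
final class TypeNameMappingSketch {
    private static final Map<String, Class<?>> TYPES = new HashMap<>();
    static {
        TYPES.put("String", String.class);
        TYPES.put("Int", Integer.class);
        TYPES.put("Integer", Integer.class);
        TYPES.put("Long", Long.class);
        TYPES.put("Double", Double.class);
        TYPES.put("Float", Float.class);
        TYPES.put("Boolean", Boolean.class);
        TYPES.put("Bool", Boolean.class);
        TYPES.put("Number", Number.class);
        TYPES.put("Object", Object.class);
        TYPES.put("Map", Map.class);
        TYPES.put("List", List.class);
    }

    /** Unknown type names fall back to Object.class, as in the last table row. */
    static Class<?> resolve(String dslType) {
        return TYPES.getOrDefault(dslType, Object.class);
    }
}
```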
Optional Fields
A field suffixed with ? is marked as required=false:
email: String? // required=false
name: String // required=true (default)
Nested Schemas
Fields can have nested inline schemas:
address: {
street: String
city: String
zip: String?
}
This produces a FieldDeclaration with typeName="Object" and nested=InlineSchema(...).
Schema Path Validation
When SchemaValidationLevel is not OFF, the compiler validates NodeOutputPath expressions against declared output schemas:
- For each NodeOutputPath in input bindings, look up the referenced node's outputSchema
- Walk each path segment against the schema, verifying that each field exists
- If a field is not found in the schema:
  - SchemaValidationLevel.WARN → log a warning
  - SchemaValidationLevel.ERROR → collect the error; after all validations, throw GraphDefinitionException with all errors
Validation is skipped for nodes with OpaqueSchema (no declared schema) output.
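The path walk and the level-dependent reporting might be sketched as below. Everything here is illustrative: the schema is modeled as nested maps rather than SchemaDescriptor objects, the problem message wording is invented, System.err stands in for the logger, and IllegalStateException stands in for GraphDefinitionException.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of schema path validation over a nested-map schema model. */
final class SchemaPathValidationSketch {
    enum Level { OFF, WARN, ERROR }

    /** Walks the segments; nested schemas are modeled as Map values, leaf types as Strings. */
    static List<String> validate(String nodeId, List<String> segments, Map<String, Object> schema) {
        List<String> problems = new ArrayList<>();
        Object current = schema;
        for (String seg : segments) {
            if (!(current instanceof Map<?, ?> fields)) break; // leaf reached; deeper segments unchecked
            if (!fields.containsKey(seg)) {
                problems.add("Field '" + seg + "' not found in output schema of node '" + nodeId + "'");
                break;
            }
            current = fields.get(seg);
        }
        return problems;
    }

    /** WARN prints each problem; ERROR throws once with all collected problems. */
    static void report(Level level, List<String> problems) {
        if (level == Level.OFF || problems.isEmpty()) return;
        if (level == Level.WARN) {
            problems.forEach(p -> System.err.println("WARN: " + p));
            return;
        }
        throw new IllegalStateException(String.join("; ", problems));
    }
}
```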
Schema Validation Levels
| Level | Behavior |
|---|---|
OFF | No schema path validation at compile time |
WARN | Log warnings for invalid paths (default) |
ERROR | Throw GraphDefinitionException for invalid paths |
Auto-Introspection
If a node has no explicit input/output schema declaration (OpaqueSchema), the compiler attempts to auto-introspect schemas from the operator's metadata via registry.metadata(operatorRef). This allows operators that declare their own schemas to have them used for path validation without explicit DSL declarations.
5.8 Transform Compilation
A TransformDef is compiled into a virtual NodeSpec backed by the framework's built-in TransformOperator. This approach provides observability (transform input/output visible in execution logs and Studio) at the cost of negligible overhead.
Compilation Strategy
Each TransformDef is compiled as follows:
- Expression compilation: each TransformField.value expression is compiled into a BiFunction<NodeResults, GraphContext, Object> using the same expression compilation pipeline as node input blocks (§5.4), including all enhanced expression types (binary/unary ops, function calls, when, ??, ?:)
- Dependency inference: all NodeOutputPath and TransformFieldPath references within field expressions are collected as implicit dependencies of the transform's virtual node
- Input assembler generation: field extractors are packaged into a CompiledTransformAssembler (implements InputAssembler<Map<String, Object>>), which evaluates all field expressions and returns a Map<String, Object> as the transform output
- NodeSpec generation: a NodeSpec is created with:
  - id = transform ID
  - operator = TransformOperator (a built-in operator that returns input directly; all computation happens in the assembler)
  - metadata = {"__kind__" → "transform"} (used by Studio for differentiated visual rendering and excluded from complexity metrics)
  - No resilience configuration (no timeout, retry, fallback)
- Schema generation: a StructuredSchema is auto-generated from the transform's fields using type inference (§5.10)
TransformOperator
public final class TransformOperator implements Operator<Map<String,Object>, Map<String,Object>> {
@Override
public Map<String,Object> execute(Map<String,Object> input, OperatorContext ctx) {
return input; // passthrough — all computation is in the InputAssembler
}
@Override
public Idempotency idempotency() { return Idempotency.IDEMPOTENT; }
@Override
public SideEffectType sideEffectType() { return SideEffectType.READ_ONLY; }
}
Transform Ordering & Cycle Detection
Transforms may reference other transforms (e.g., transformA.field used in transformB's expression). The compiler performs topological sort on all transforms using Kahn's algorithm:
- Build a dependency graph among transforms
- Detect cycles: if a cycle is found, throw GraphDefinitionException: "Circular dependency detected among transforms: [transformA → transformB → transformA]"
- Compile transforms in topological order so that downstream transforms can reference upstream transforms' schemas during type inference
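The ordering step could be sketched as follows. This is a minimal, hypothetical illustration of Kahn's algorithm, not the DslCompiler source: the class name, the map-based input, and the use of IllegalStateException in place of GraphDefinitionException are assumptions. It also assumes that every referenced transform ID appears as a key, and it reports the unresolved IDs rather than the arrow-formatted cycle path shown in the error message above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of transform ordering with Kahn's algorithm. */
final class TransformOrderSketch {
    /** deps maps each transform ID to the transform IDs it references. */
    static List<String> order(Map<String, Set<String>> deps) {
        Map<String, Integer> inDegree = new LinkedHashMap<>();
        Map<String, List<String>> dependents = new HashMap<>();
        for (Map.Entry<String, Set<String>> e : deps.entrySet()) {
            inDegree.put(e.getKey(), e.getValue().size());
            for (String dep : e.getValue()) {
                dependents.computeIfAbsent(dep, k -> new ArrayList<>()).add(e.getKey());
            }
        }
        // Start with transforms that reference no other transform.
        Deque<String> ready = new ArrayDeque<>();
        inDegree.forEach((id, degree) -> { if (degree == 0) ready.add(id); });

        List<String> sorted = new ArrayList<>();
        while (!ready.isEmpty()) {
            String id = ready.poll();
            sorted.add(id);
            for (String next : dependents.getOrDefault(id, List.of())) {
                // Releasing one dependency; a transform is ready once its count reaches zero.
                if (inDegree.merge(next, -1, Integer::sum) == 0) ready.add(next);
            }
        }
        if (sorted.size() != inDegree.size()) {
            List<String> stuck = new ArrayList<>(inDegree.keySet());
            stuck.removeAll(sorted);
            throw new IllegalStateException("Circular dependency detected among transforms: " + stuck);
        }
        return sorted;
    }
}
```

Any transform left unsorted after the queue drains must sit on a cycle or depend on one, which is why comparing the sorted count with the total is enough to detect the error.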
Let Bindings (§5.8.1)
Transform blocks support let bindings to name intermediate computation results. A let binding is an immutable, declarative name binding scoped to the transform evaluation — it does not declare a mutable variable.
Syntax:
transform summary {
let subtotal = fetchProducts.output.items.sumBy(x -> x.price)
let discount = when { ctx.tier == "premium" -> 0.15 otherwise -> 0.0 }
total = subtotal * (1.0 - discount)
itemCount = size(fetchProducts.output.items)
}
Rules:
- let bindings must appear before field assignments in the transform body
- let bindings can reference earlier let bindings (forward references are not allowed)
- A let binding name must not conflict with: node IDs, reserved DSL keywords (ctx, prev, carry, loopIteration, stream, output, let), or other let binding names in the same transform
- let values are not exposed in NodeResults; they are internal to the transform evaluation scope and invisible to other nodes
- let expressions follow the same purity rules as transform field expressions
AST representation:
- TransformDef has a letBindings: List<LetBinding> field (in addition to fields)
- LetBinding(name: String, value: Expression, line: int, column: int)
Transform Constraints
| Constraint | Rule |
|---|---|
| Pure functions only | ExpressionFunction calls in transforms must have isPure() == true; calling an impure function produces a compile error |
| No circular references | Transform A referencing transform B requires B to not reference A directly or transitively |
| Upstream SKIP propagation | If all upstream dependencies of a transform are SKIPPED at runtime, the transform is also SKIPPED |
| No resilience configuration | timeout, retry, fallback blocks are not allowed in transform definitions |
No depends_on | Dependencies are automatically inferred from expressions; explicit depends_on is not supported |
Referenceable by branch on | branch on transformId.fieldName is valid — the transform compiles to a NodeSpec, enabling branch conditions on computed fields |
Transform DSL Example
graph orderProcess {
node fetchUser : FetchUserOperator { input { userId = ctx.userId } }
node fetchProducts : FetchProductsOperator { input { cartId = ctx.cartId } }
node calcPrice : CalcPriceOperator { input { items = fetchProducts.output.items } }
/// Aggregate order summary data from multiple upstream nodes
transform orderSummary {
/// Full customer display name
customerName = concat(fetchUser.output.firstName, " ", fetchUser.output.lastName)
itemCount = size(fetchProducts.output.items)
totalWithTax = calcPrice.output.subtotal * 1.08
isPremium = fetchUser.output.vipLevel > 3
tier: String = when { isPremium -> "premium" otherwise -> "standard" }
}
node createOrder : CreateOrderOperator {
input {
customer = orderSummary.customerName
total = orderSummary.totalWithTax
}
}
}
node / branch on / transform Orthogonal Relationship
| Dimension | node | branch on | transform |
|---|---|---|---|
| Semantics | Execute an operator | Select execution path | Compute data fields |
| DAG element | Concrete node | Conditional edge | Virtual node |
| Input | input {} / depends_on | Single node, single field | Multi-upstream, multi-field (auto-inferred) |
| Output | Operator output | Routes to N-choose-1 node | Named field set |
| Reference syntax | nodeId.output.field | Implicit (target nodes selected/skipped) | transformId.field |
| Resilience config | ✅ (timeout, retry, fallback) | — | ❌ (not applicable) |
| Studio visual | Solid rectangle (blue) | Diamond (orange) | Dashed rounded rectangle (light purple) |
Common Composition Patterns
- Transform → Node: Transform adapts data, then feeds to an operator node
- Transform → Branch: Transform computes a condition field, and branch on references it for routing (limited to "calculator-level" simple conditions; domain-knowledge-based decisions should be operators)
- Branch → Transform: After branch selects a path, transform assembles data for downstream nodes
- Transform → Transform: Chained transforms (compiler sorts + detects cycles)
5.9 Scope Mode Compilation
The scope property controls scope visibility for sub-graph constructs (loop, foreach, subgraph nodes). It determines whether the sub-graph body can see parent graph node outputs and inherits the parent GraphContext.
Default Scope Matrix
| Construct | Default | Rationale |
|---|---|---|
loop | parent | Inline body, lexically nested — natural to see parent |
foreach | parent | Inline body, lexically nested — same as loop |
subgraph("name") | isolated | External/reusable graph — encapsulation by default |
Compilation Behavior
| Scope Mode | Behavior |
|---|---|
parent | The compiler collects NodeOutputPath references to parent graph nodes from the sub-graph body expressions (collectNodeOutputRefs). These references are registered with the sub-compiler via withParentNodeIds(). At runtime, parent node outputs are injected into the sub-graph context as __parentOutput_<nodeId>__ keys. The sub-graph's compiled expressions read from these context keys instead of NodeResults. |
isolated | The sub-graph is compiled as a fresh, independent graph with no awareness of parent scope. Only explicit input {} bindings are visible. No __parentOutput_* keys are injected. |
ForEach Parent Scope
When foreach has scope = parent (the default):
- collectNodeOutputRefs(body) scans body node input expressions for NodeOutputPath references to parent graph nodes
- The sub-compiler is configured with withParentNodeIds(parentRefNodeIds) so expressions like parentNode.output.field compile to context reads
- The ForEachOperator receives both the items list and parent node outputs in a combined input map (__items__ + __parentOutput_* keys)
- Each item execution merges the parent graph context and parent outputs into the sub-graph context before running
ForEach Execution Semantics
- itemsExpr is compiled once into an extractor that produces the collection fed to the hyper-node.
- The foreach body is compiled as a nested graph named <foreachId>__subgraph.
- sequential switches the runtime from default fan-out execution to ordered item-by-item execution.
- stream foreach selects StreamingForEachOperator; non-streaming foreach selects ForEachOperator.
- buffer = N is stored in NodeMetadata and forwarded to the streaming runtime as buffer sizing metadata.
- The foreach node itself exposes an OpaqueSchema output because the aggregate result shape depends on the nested graph and runtime fan-out policy.
Loop Execution Semantics
- The loop body is compiled as a nested graph named <loopId>__subgraph.
- max_iterations becomes a hard runtime guard, while delay is converted to a Duration inserted between iterations.
- until is compiled into a predicate over the current iteration's node outputs and only terminates the loop when the expression evaluates to Boolean.TRUE.
- carry { ... } is compiled into a carry-mapper; when omitted, the runtime carries forward the previous iteration's full output map.
- Explicit depends_on plus any parent-scope references become loop-node dependencies in the outer graph.
- stream loop selects StreamingLoopOperator; non-streaming loop selects LoopOperator.
Streaming Semantics
- stream is a member prefix on node, foreach, and loop; it does not create a separate AST node type.
- The parser records streaming intent as streaming=true plus an optional bufferSize on the wrapped member.
- nodeId.stream and nodeId.stream.field compile to NodeStreamPath, which also records a stream dependency so the runtime can wire the live channel correctly.
- Standard consumer nodes keep their normal operator references; stream behavior is carried through NodeMetadata, NodeResults.getChannel(nodeId), and the streaming operator implementations.
SubGraph Parent Scope
When a subgraph("name") node has scope = parent:
- The SubGraphOperator merges the parent GraphContext into the sub-graph context before overlaying the explicit input {} values
- This allows the child graph's operators to access parent-level context entries
5.10 Expression Type Inference
The compiler infers types for transform fields and validates expression type consistency. Type inference is used to auto-generate StructuredSchema for transforms and to validate downstream path references.
Inference Rules
| Expression | Inferred Type |
|---|---|
StringLiteral | String |
NumberLiteral | Number |
BooleanLiteral | Boolean |
DurationLiteral | Duration |
NodeOutputPath | Looked up from upstream operator output schema |
NodeStreamPath | Object |
TransformFieldPath | Recursively inferred from the referenced transform field |
ContextPath | Unknown |
ItemPath / LoopPrevPath / LoopCarryPath | Object |
ItemIndex / LoopIterationRef | Integer |
LambdaParamPath | Object |
a + b (both Number) | Number |
a + b (either side String) | String |
a * b, a - b, a / b, a % b | Number |
a > b, a == b, a != b, etc. | Boolean |
a && b, a || b, !a | Boolean |
a ?? b | Type of the non-Unknown side; if both known, their common compatible type |
cond ? a : b | Common type of both branches |
when { ... } | Common type of all right-hand-side expressions |
func(args...) | ExpressionFunction.returnType(String...) result |
MethodCallExpr | Method-specific (List, Map, Boolean, or Object) |
LambdaExpr | Function |
ObjectLiteral | Object |
ArrayLiteral | List |
Unknown Type Handling
When inference produces Unknown, behavior depends on the SchemaValidationLevel:
| Level | Behavior for Unknown type |
|---|---|
OFF | Ignored — no schema generated for Unknown fields |
WARN | Compile warning suggesting explicit type annotation (field: Type = expr) |
ERROR | Compile error requiring explicit type annotation |
Explicit Type Annotation
Transform fields support optional explicit type annotations that override inference:
transform orderSummary {
tier: String = when { isPremium -> "premium" otherwise -> "standard" }
score: Double = riskCalc.output.rawScore * 0.85
}
When an explicit annotation is present, the compiler verifies that the inferred type is compatible with the declared type. A mismatch produces a compile warning (at WARN level) or error (at ERROR level).
5.11 Built-in Function Library
Enhanced expressions support function calls within input {} blocks and transform blocks. Functions are resolved against an ExpressionFunction registry. The framework provides built-in functions; custom functions can be registered via the ExpressionFunction SPI.
ExpressionFunction Interface
public interface ExpressionFunction {
String name();
Object apply(Object... args);
/**
* Report the result type for the given argument types.
* Implementations must return an explicit type name instead of silently
* defaulting to Unknown.
*/
String returnType(String... argTypes);
/** Whether this function has no side effects. */
default boolean isPure() { return true; }
}
Built-in Functions
| Category | Functions |
|---|---|
| String | concat(s...), substring(s, start, end?), uppercase(s) / upper(s), lowercase(s) / lower(s), trim(s), replace(s, target, replacement), replaceAll(s, regex, replacement), startsWith(s, prefix), endsWith(s, suffix), split(s, delimiter), join(list, delimiter), length(s) / len(s), indexOf(s, target), matches(s, regex), padLeft(s, length, padChar?), padRight(s, length, padChar?) |
| Math | abs(n), min(a, b), max(a, b), round(n), ceil(n), floor(n), sum(list), avg(list), clamp(n, min, max), pow(base, exp) |
| Collection | size(c), contains(c, elem), first(c), last(c), isEmpty(c), distinct(list), flatten(listOfLists), sort(list), take(list, n), drop(list, n), any(list, value), reverse(list), all(list, value) |
| Map/Object | keys(map), values(map), entries(map), merge(map1, map2), has(map, key) |
| Null | coalesce(a, b, ...), isNull(v), isNotNull(v) |
| Type | toString(v), toNumber(v), toBoolean(v), toInt(v), typeOf(v) |
| Date/Time | now() ⚠, today() ⚠, formatDate(isoStr, pattern), parseDate(str, pattern), addDuration(isoStr, duration), diffDuration(iso1, iso2, unit) |
| ID/Utility | uuid() ⚠, format(template, args...) |
| Hash | md5(s), sha1(s), sha256(s), sha512(s) |
| Encoding | base64Encode(s), base64Decode(s), hexEncode(s), hexDecode(s), urlEncode(s), urlDecode(s) |
| Crypto (SPI) | hmacSha256(key, data), hmacSha512(key, data), aesEncrypt(key, data), aesDecrypt(key, data) — require SecretProvider SPI |
| JSON (SPI) | toJson(v), fromJson(jsonStr) — require JsonCodec SPI |
| Secrets (SPI) | secret(name) ⚠ — resolves a named secret via SecretProvider SPI; impure (treated like uuid()) |
Aliases: upper→uppercase, lower→lowercase, len→length. All alias names are accepted.
Date/Time Design: All date/time functions operate on ISO-8601 strings (e.g., "2024-01-15T10:30:00Z"), keeping the DSL type system to String/Number/Boolean without introducing new value types. addDuration accepts DSL duration literals ("2h", "30m", "1d") or ISO-8601 durations ("PT2H").
⚠ Impure Functions: Functions marked with ⚠ (now, today, uuid, secret) are non-deterministic (impure). They are not allowed in transform blocks; the compiler rejects them with a GraphDefinitionException. In input {} blocks, they are evaluated once per node execution (not re-evaluated on retry).
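The string-in, string-out design described in the Date/Time note can be illustrated with a small Java sketch. This is not the BLOGE implementation of addDuration: the class name, the helper names, and the set of supported unit suffixes (s, m, h, d) are assumptions made for illustration.

```java
import java.time.Duration;
import java.time.Instant;

/** Illustrative only: a hypothetical stand-in for an addDuration-style function. */
final class AddDurationSketch {

    /** Parses "45s", "30m", "2h", "1d" style literals, or ISO-8601 durations such as "PT2H". */
    static Duration parseDuration(String text) {
        if (text.startsWith("P")) {
            return Duration.parse(text); // ISO-8601 form
        }
        // Assumption: a single trailing unit character preceded by an integer amount.
        long amount = Long.parseLong(text.substring(0, text.length() - 1));
        char unit = text.charAt(text.length() - 1);
        switch (unit) {
            case 's': return Duration.ofSeconds(amount);
            case 'm': return Duration.ofMinutes(amount);
            case 'h': return Duration.ofHours(amount);
            case 'd': return Duration.ofDays(amount);
            default: throw new IllegalArgumentException("Unsupported duration unit: " + unit);
        }
    }

    /** The timestamp enters and leaves as ISO-8601 text, so no date value type is exposed. */
    static String addDuration(String isoInstant, String duration) {
        return Instant.parse(isoInstant).plus(parseDuration(duration)).toString();
    }
}
```

Under these assumptions, addDuration("2024-01-15T10:30:00Z", "2h") and addDuration("2024-01-15T10:30:00Z", "PT2H") both yield "2024-01-15T12:30:00Z".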
Collection method calls and lambdas
Collection method calls are receiver-based (orders.map(o -> o.id)) rather than registry-based. They are compiled separately from ExpressionFunction calls and currently dispatch through CollectionOps.
Additional constraints:
- method names must come from the allowed collection-method set
- the receiver must evaluate to a List<?>; otherwise the runtime result is null
- lambdas are expression-bodied only and capture the surrounding reserved execution context lexically
- standalone lambda literals are rejected for ordinary node input bindings and wait/await payload fields
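The receiver rule above (a non-list receiver yields null rather than an error) can be sketched in Java as follows. The class and method names are hypothetical; the actual CollectionOps API is not shown in this specification.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Illustrative only: a hypothetical receiver-based map(), not the CollectionOps API. */
final class CollectionMapSketch {

    /** Applies the lambda to each element; returns null when the receiver is not a List. */
    static Object map(Object receiver, Function<Object, Object> lambda) {
        if (!(receiver instanceof List<?>)) {
            return null; // non-list receivers produce null, per the constraint above
        }
        List<Object> mapped = new ArrayList<>();
        for (Object element : (List<?>) receiver) {
            mapped.add(lambda.apply(element));
        }
        return mapped;
    }
}
```

Because the result is null instead of an exception, a binding such as orders.map(o -> o.id) can be combined with ?? to supply a default when the receiver is missing or has the wrong shape.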
5.11.1 Impure Functions
Four built-in functions are impure (non-deterministic):
| Function | Return | Behavior |
|---|---|---|
| now() | ISO-8601 timestamp string | Returns current instant at evaluation time |
| today() | yyyy-MM-dd date string | Returns current date at evaluation time |
| uuid() | UUID v4 string | Generates a random UUID |
| secret(name) | String | Resolves a named secret via SecretProvider SPI; non-deterministic |
Restrictions and semantics:
- Transform blocks: Impure functions are forbidden. The compiler checks isPure() and throws: "Function '<name>' is not pure and cannot be used in transform blocks"
- Input blocks: Impure functions are allowed. They are evaluated once when InputAssembler.assemble() is called, which happens before the operator executes, specifically before the retry wrapper. Retry does not re-evaluate input expressions.
- Concurrent evaluation: Sibling nodes evaluated concurrently may get different now() values. If a consistent timestamp is needed across nodes, compute it in an upstream node's output or pass it via ctx.
- Replay/Audit: Since impure functions produce different results on each execution, workflows using them are not fully deterministic. For auditability, consider passing generated values (timestamps, IDs) via ctx instead.
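The ordering described for input blocks (assemble once, then retry) can be sketched as below. The class, method, and parameter names are hypothetical; the sketch only models the ordering and does not reproduce InputAssembler or the resilience wrapper.

```java
import java.util.Map;
import java.util.function.Function;
import java.util.function.Supplier;

/** Illustrative only: input is assembled once, outside the retry loop. */
final class RetryInputSketch {

    static <R> R runWithRetry(Supplier<Map<String, Object>> inputAssembler,
                              Function<Map<String, Object>, R> operator,
                              int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        // Impure expressions such as uuid() or now() would be evaluated here, exactly once.
        Map<String, Object> input = inputAssembler.get();
        RuntimeException lastFailure = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return operator.apply(input); // every attempt sees the same input map
            } catch (RuntimeException e) {
                lastFailure = e;
            }
        }
        throw lastFailure;
    }
}
```

With this ordering, a request ID produced by uuid() in an input block stays the same across all attempts of one node execution, which is what makes it usable as an idempotency key.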
Custom Function Registration
Business teams can register custom functions via the ExpressionFunction SPI (e.g., formatCurrency(), maskPhone()). Custom functions:
- Must implement the ExpressionFunction interface
- Are registered in the ExpressionFunction registry (discoverable via ServiceLoader or Spring auto-configuration)
- If used in transform blocks, must declare isPure() == true; impure functions in transforms produce a compile error
- Must implement returnType() so type inference stays explicit and deterministic
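A custom function such as maskPhone() might look roughly like the sketch below. Only returnType(String...) and isPure() are taken from the interface excerpt in this section; the stand-in interface SketchExpressionFunction and its name() and apply() methods are assumptions, since the full ExpressionFunction signature is not reproduced here.

```java
/** Hypothetical stand-in for the real SPI; name() and apply() are assumed, not documented. */
interface SketchExpressionFunction {
    String name();
    Object apply(Object... args);
    String returnType(String... argTypes);
    default boolean isPure() { return true; }
}

/** Masks all but the last four characters of a phone number. */
final class MaskPhoneFunction implements SketchExpressionFunction {

    @Override
    public String name() { return "maskPhone"; }

    @Override
    public Object apply(Object... args) {
        String phone = String.valueOf(args[0]);
        int visible = Math.min(4, phone.length());
        return "*".repeat(phone.length() - visible) + phone.substring(phone.length() - visible);
    }

    /** Always reports an explicit type name rather than falling back to Unknown. */
    @Override
    public String returnType(String... argTypes) { return "String"; }
}
```

The function inherits isPure() == true, so under the rules above it would be usable in transform blocks as well as input blocks.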
Expression Power Boundary (Design Red Lines)
The enhanced expression system is intentionally bounded to preserve the declarative nature of the DSL:
- ✅ Allowed: ??, ?:, when {}, comparison (>, <, ==, !=, >=, <=), logical (&&, ||, !), arithmetic (+, -, *, /, %), ExpressionFunction calls, receiver-style collection method calls, single-expression lambdas, and parentheses grouping
- ❌ Prohibited: if/else statement blocks, for/while statements, multi-statement sequential execution, statement-scoped variable declarations outside transform let, try/catch, and named user-defined functions
Every line in transform and input {} blocks must maintain the fieldName = <expression> declarative form. All conditional logic must be expressions (returning a value), not statements (with side effects).
when Expression Details
The when expression has two forms, both using the same -> and otherwise syntax as branch on for consistency, though the semantics are orthogonal:
Form A — No subject (each clause left-hand side is a boolean condition):
tier = when {
score > 0.8 -> "critical"
score > 0.5 -> "elevated"
otherwise -> "low"
}
Form B — With subject (each clause left-hand side is a match value):
warehouse = when fetchOrder.output.region {
"East China" -> "SH-01"
"South China" -> "GZ-02"
otherwise -> "DEFAULT"
}
when vs branch on orthogonality: branch on is the control plane (decides which nodes execute/skip); when is the data plane (decides what value a field takes). branch on only appears at graph body top level; when only appears in expression position. They share syntax style (-> arrows + otherwise) but are semantically completely different.
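The two when forms can be read as ordinary value-returning selection. The Java sketch below assumes that clauses are tested in source order and that the first match wins; that ordering is an assumption of this sketch, and all class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;
import java.util.function.BooleanSupplier;

/** Illustrative only: when as a pure expression that always yields a value. */
final class WhenSketch {

    /** Form A: each clause key is a boolean condition, tried in insertion order. */
    static <T> T whenConditions(LinkedHashMap<BooleanSupplier, T> clauses, T otherwise) {
        for (Map.Entry<BooleanSupplier, T> clause : clauses.entrySet()) {
            if (clause.getKey().getAsBoolean()) {
                return clause.getValue();
            }
        }
        return otherwise;
    }

    /** Form B: each clause key is a match value compared against the subject. */
    static <T> T whenSubject(Object subject, Map<?, T> clauses, T otherwise) {
        for (Map.Entry<?, T> clause : clauses.entrySet()) {
            if (Objects.equals(subject, clause.getKey())) {
                return clause.getValue();
            }
        }
        return otherwise;
    }

    /** Mirrors the Form A example: score thresholds mapped to a tier label. */
    static String tier(double score) {
        LinkedHashMap<BooleanSupplier, String> clauses = new LinkedHashMap<>();
        clauses.put(() -> score > 0.8, "critical");
        clauses.put(() -> score > 0.5, "elevated");
        return whenConditions(clauses, "low");
    }
}
```

Neither method triggers or skips any work; it only picks a value, which is the data-plane role described above.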
5.12 Session Extension Compilation
Session sources are compiled outside the ordinary Graph DAG path. The parser still emits generic ExtensionDef nodes, but the dedicated session compiler interprets them as a structured long-running state machine.
Compilation flow
- Parse the root session into ExtensionDef(kind="session").
- Normalize session properties such as idle_timeout, timeout_action, max_rounds, and max_history.
- Compile each phase child into a PhaseSpec, preserving source order for deterministic transition resolution.
- Interpret phase_type as either:
  - once: direct child members run once when the phase activates
  - round: the round { ... } child list becomes the body that can repeat until the until condition succeeds or phase limits are hit
- Lower then into either a default target or an ordered list of conditional transitions.
- Reuse the standard node, branch, transform, loop, wait, await, and script compilers for phase members so graph semantics stay consistent inside a phase.
Session-specific semantic rules
- exactly one top-level session root is allowed in a session document
- phases are addressable by ID, and every then target must resolve to a declared phase
- a round phase cannot mix round { ... } with direct executable phase members
- yield_on lowers to a string list consumed by the session runtime when deciding whether to suspend and wait for external signals
- until is compiled as an expression against the phase/session execution context and is evaluated after each round or once-phase completion as appropriate
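The rule that then targets must resolve to declared phases amounts to a simple reference check. The sketch below is a minimal illustration under assumed names; the input shape (phase ID mapped to its then targets) and the message wording are hypothetical and are not the session compiler's actual model or diagnostics.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative only: checks that every then target names a declared phase. */
final class ThenTargetCheckSketch {

    /**
     * @param phaseTargets each declared phase ID mapped to the phase IDs its then transitions name
     * @return one message per unresolved target; empty when all targets resolve
     */
    static List<String> unresolvedTargets(Map<String, List<String>> phaseTargets) {
        Set<String> declared = phaseTargets.keySet();
        List<String> problems = new ArrayList<>();
        for (Map.Entry<String, List<String>> phase : phaseTargets.entrySet()) {
            for (String target : phase.getValue()) {
                if (!declared.contains(target)) {
                    // Hypothetical wording; the real compiler message is not specified here.
                    problems.add("Phase '" + phase.getKey()
                            + "' targets undeclared phase '" + target + "'");
                }
            }
        }
        return problems;
    }
}
```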
Runtime contract boundary
This specification intentionally stops at the compiled session model boundary. Persistence, wake-up delivery, history truncation, and durable resume behavior are owned by bloge-core-ext, bloge-durable, and their codecs, not by the DSL grammar itself.
5.13 Script Node
Script nodes allow embedding dynamic business logic as sandboxed Groovy code directly inside a .bloge graph. They are intended for hot-updatable rules that change faster than the deployment cycle (pricing tiers, risk thresholds, compliance rules), not for general computation.
Dependency: Script nodes require the optional bloge-script module. Without it the compiler throws a GraphDefinitionException explaining how to enable script support.
Syntax
/// Optional doc comment
script <id> {
lang = "groovy" // optional; default "groovy"
timeout = 5s // strongly recommended
input { // optional; maps context fields into script variables
<binding> = <expression>
...
}
output_schema { // optional; validates return value
<field>: <Type>
...
}
code = """
// Groovy code here
// Last expression is the return value (must be a Map)
return [field: value]
"""
}
Execution Model
- Input assembly: the input block expressions are evaluated against the current graph context, producing a Map<String, Object>.
- Script execution: the ScriptOperator calls ScriptEngine.execute(code, inputs). Inputs are wrapped in Collections.unmodifiableMap before binding.
- Output: the last evaluated expression in the Groovy script becomes the return value. It must be a Map; otherwise a ScriptExecutionException is thrown.
- Schema validation: if output_schema is declared, the return map is validated against it.
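The input-wrapping and return-type steps above can be sketched without a Groovy engine by treating the script as a plain function. Everything named here is hypothetical: the script is modeled as a java.util.function.Function, and IllegalStateException stands in for ScriptExecutionException, whose constructor is not shown in this specification.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Illustrative only: read-only inputs in, Map-typed result out. */
final class ScriptRunSketch {

    static Map<?, ?> run(Function<Map<String, Object>, Object> script, Map<String, Object> inputs) {
        // Defensive copy plus unmodifiable view: the script cannot mutate graph state.
        Map<String, Object> readOnly = Collections.unmodifiableMap(new HashMap<>(inputs));
        Object result = script.apply(readOnly);
        if (!(result instanceof Map)) {
            // Stand-in for ScriptExecutionException in this sketch.
            throw new IllegalStateException("Script must return a Map, got: "
                    + (result == null ? "null" : result.getClass().getName()));
        }
        return (Map<?, ?>) result;
    }
}
```

A script that tries to write to its inputs fails with UnsupportedOperationException, and a script that returns a scalar is rejected before any schema validation would run.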
Security Model (Three-Layer Sandbox)
| Layer | Mechanism | What It Blocks |
|---|---|---|
| A — Compile-time AST | SecureASTCustomizer | Dangerous imports (java.io, java.net, java.lang.Runtime, etc.), all static imports, dangerous receivers (File, Socket, URL, Runtime, Thread) |
| B — Class-loading | SandboxedGroovyClassLoader | Any class not in java.lang/util/math/time, groovy.lang/util; blocks reflection, java.io, java.net, javax, groovy.grape, sun.* |
| C — Runtime | Immutable inputs + timeout | Inputs wrapped as unmodifiable; timeout enforced by ResilientOperatorWrapper (default 5 s) |
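Layer B in the table is essentially a package allowlist applied at class-loading time. The sketch below shows one way such a check could be expressed; it is not the SandboxedGroovyClassLoader implementation, and the exact prefix lists are assumptions derived from the table row.

```java
import java.util.List;

/** Illustrative only: a prefix allowlist with an explicit block for reflection. */
final class SandboxAllowlistSketch {

    // Assumed from the table: java.lang/util/math/time and groovy.lang/util are loadable.
    private static final List<String> ALLOWED_PREFIXES = List.of(
            "java.lang.", "java.util.", "java.math.", "java.time.",
            "groovy.lang.", "groovy.util.");

    // Reflection lives under java.lang, so it must be blocked before the allowlist is consulted.
    private static final List<String> BLOCKED_PREFIXES = List.of("java.lang.reflect.");

    static boolean isLoadable(String className) {
        for (String blocked : BLOCKED_PREFIXES) {
            if (className.startsWith(blocked)) {
                return false;
            }
        }
        for (String allowed : ALLOWED_PREFIXES) {
            if (className.startsWith(allowed)) {
                return true;
            }
        }
        return false; // default deny: anything not explicitly allowed is rejected
    }
}
```

This sketch models layer B only. Receivers such as Runtime and Thread sit under java.lang and would pass a pure prefix check, which is why the table assigns them to the compile-time layer A.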
Operator Registration
The bloge-script module provides GroovyScriptOperatorFactory which implements the ScriptOperatorFactory SPI. Pass it to the compiler:
DslCompiler compiler = new DslCompiler(registry)
    .withScriptOperatorFactory(new GroovyScriptOperatorFactory());
With Spring Boot, simply add bloge-script to the classpath; BlogeAutoConfiguration auto-wires the factory via @ConditionalOnClass.
Compilation Behavior
- A NodeSpec is created with operator ref __script__:<id>.
- Metadata keys: kind=script, __script_lang__, __script_code__.
- The ScriptEngine caches compiled classes by SHA-256 hash of the code string, so hot-reload only re-compiles when the code changes.
- Script nodes can have depends_on inferred from their input block expressions (same as regular nodes).
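The content-hash caching described above can be sketched with the standard MessageDigest API. The class below is a hypothetical illustration, not the ScriptEngine itself: the compile step is passed in as a function and a counter is kept only to make the cache behavior observable.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Illustrative only: compiled artifacts keyed by the SHA-256 of the source text. */
final class ScriptCacheSketch<C> {
    private final Map<String, C> cache = new ConcurrentHashMap<>();
    private final AtomicInteger compilations = new AtomicInteger();
    private final Function<String, C> compiler;

    ScriptCacheSketch(Function<String, C> compiler) {
        this.compiler = compiler;
    }

    static String sha256Hex(String code) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(code.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is unavailable", e);
        }
    }

    /** Compiles only when no entry exists for this exact code string. */
    C compiled(String code) {
        return cache.computeIfAbsent(sha256Hex(code), key -> {
            compilations.incrementAndGet();
            return compiler.apply(code);
        });
    }

    int compilationCount() {
        return compilations.get();
    }
}
```

Keying on the hash rather than the node ID means that reloading a graph with unchanged script text costs a map lookup, while any edit to the code produces a new key and a fresh compilation.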
Lint Rules
| Rule ID | Severity | Condition |
|---|---|---|
| max-script-nodes | WARNING | Graph has more than 2 script nodes |
| script-timeout-required | WARNING | Script node has no timeout |
| script-line-limit | INFO | Script code exceeds 30 lines |
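The three lint conditions in the table are straightforward to express as a pass over the script nodes of a graph. The sketch below uses a hypothetical ScriptNode holder and a hypothetical finding format; how the real linter counts lines or renders findings is not specified here.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: applies the three script lint rules to a list of script nodes. */
final class ScriptLintSketch {

    /** Hypothetical minimal view of a script node. */
    static final class ScriptNode {
        final String id;
        final boolean hasTimeout;
        final String code;

        ScriptNode(String id, boolean hasTimeout, String code) {
            this.id = id;
            this.hasTimeout = hasTimeout;
            this.code = code;
        }
    }

    static List<String> lint(List<ScriptNode> scripts) {
        List<String> findings = new ArrayList<>();
        if (scripts.size() > 2) {
            findings.add("WARNING max-script-nodes");
        }
        for (ScriptNode script : scripts) {
            if (!script.hasTimeout) {
                findings.add("WARNING script-timeout-required: " + script.id);
            }
            // Assumption: lines are counted by line terminators in the raw code string.
            if (script.code.lines().count() > 30) {
                findings.add("INFO script-line-limit: " + script.id);
            }
        }
        return findings;
    }
}
```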
6. Error Reporting
All errors carry precise source positions (line:column) for diagnostic messages.
6.1 Lexer Errors
The lexer throws ParseException for the following conditions:
| Error | Trigger | Message Format |
|---|---|---|
| Unterminated string literal | Reaching EOF while inside a "..." string | "[line:col] Unterminated string literal" |
| Unterminated block comment | Reaching EOF with block-comment depth > 0 | "[line:col] Unterminated block comment" |
| Unexpected character | Any character not matching any token rule | "[line:col] Unexpected character '<c>'" |
| Bare & | & not followed by & | "[line:col] Unexpected character '&'. Did you mean '&&'?" |
| Bare \| | \| not followed by \| | "[line:col] Unexpected character '\|'. Did you mean '\|\|'?" |
Note: With enhanced expressions, - and / are now valid operator tokens (MINUS and SLASH respectively) and are always emitted by the lexer. Previously these were lexer errors when not part of -> or comment syntax. Invalid usage of these tokens (e.g., MINUS where the parser expects a member keyword) is now caught by the parser rather than the lexer.
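The bare & and bare | rows in the table describe a one-character lookahead. A minimal sketch of that check follows; the token names AND and OR are hypothetical, and IllegalStateException stands in for ParseException, whose constructor is not shown in this section.

```java
/** Illustrative only: one-character lookahead for the logical operators. */
final class BareOperatorSketch {

    /**
     * Scans the logical operator starting at index.
     * Returns a hypothetical token name, or throws with the message format from the table.
     */
    static String scanLogical(String source, int index, int line, int col) {
        char c = source.charAt(index);
        if (c != '&' && c != '|') {
            throw new IllegalArgumentException("Not a logical operator start: " + c);
        }
        boolean doubled = index + 1 < source.length() && source.charAt(index + 1) == c;
        if (doubled) {
            return c == '&' ? "AND" : "OR";
        }
        // Stand-in for ParseException; the position is the 1-based line:column of the character.
        throw new IllegalStateException("[" + line + ":" + col + "] Unexpected character '" + c
                + "'. Did you mean '" + c + c + "'?");
    }
}
```

Emitting the suggestion at the lexer level is possible here because, unlike - and /, a single & or | has no valid meaning anywhere in the grammar.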
6.2 Parser Errors
The parser uses two error handling mechanisms:
ParseException with Position
The expect() method throws ParseException when the expected token type does not match the current token:
"[line:col] <message>, got <actualType>('<actualLexeme>')"
Examples:
- "[3:5] Expected 'graph', got IDENT('myNode')"
- "[7:12] Expected '{' after graph name, got COLON(':')"
Error Collection and Synchronization
The parser collects multiple errors rather than aborting on the first one:
- When a ParseException is caught during member parsing, its message is added to the errors list
- The synchronize() method skips tokens until it finds a synchronization point such as SESSION, PHASE, ROUND, NODE, BRANCH, TRANSFORM, SCHEMA, FOREACH, LOOP, WAIT, AWAIT, STREAM, or RBRACE
- Parsing continues from the synchronization point
- After parsing completes, if the errors list is non-empty, a single ParseException(List<String>) is thrown containing all collected error messages:
Parse errors:
[3:5] Expected ':' after node id, got IDENT('opA')
[7:1] Expected 'node', 'branch', 'transform', or 'schema', got IDENT('invalid')
6.3 Compiler Errors
The compiler (DslCompiler) throws GraphDefinitionException for semantic errors:
| Error | Trigger | Message |
|---|---|---|
| Unregistered operator | NodeDef.operatorRef not in OperatorRegistry | "Node '<id>' references unregistered operator '<ref>'" |
| Non-existent dependency | depends_on or implicit dep references unknown node | "Node '<id>' depends on non-existent node '<dep>'" |
| Invalid branch condition | Branch condition is not a NodeOutputPath or TransformFieldPath | "Branch condition must be a node output path or transform field path expression at <line>:<col>" |
| Non-existent branch source | Branch condition references unknown node | "Branch references non-existent node '<id>'" |
| Non-existent branch target | Branch case target references unknown node | "Branch target '<id>' references non-existent node" |
| Non-existent otherwise target | otherwise target references unknown node | "Branch otherwise target '<id>' references non-existent node" |
| Schema reference not found | SchemaRef name not in named schema registry | "Referenced schema '<name>' not found (at <line>:<col>)" |
| Schema path validation failure | NodeOutputPath segment not in upstream output schema | "Node '<id>' input binding '<field>': path '<path>' — field '<segment>' not found in output schema of '<nodeId>'" |
| Cycle detection | DAG validation in Graph constructor detects a cycle | (thrown by bloge-core Graph model) |
| Transform circular dependency | Transform A references transform B which references A | "Circular dependency detected among transforms: [transformA → transformB → transformA]" |
| Impure function in transform | Transform field expression calls a function with isPure() == false | "Transform '<id>' field '<field>': function '<name>' is not pure and cannot be used in transform blocks" |
| Unresolved function | Function name not found in ExpressionFunction registry | "Unknown function '<name>' at <line>:<col>" |
| Transform type mismatch | Explicit type annotation incompatible with inferred type | "Transform '<id>' field '<field>': declared type '<declared>' is incompatible with inferred type '<inferred>'" |
| Duplicate transform/node ID | Transform ID conflicts with a node ID | "Transform '<id>' conflicts with node of the same name" |
| Non-existent transform reference | TransformFieldPath references unknown transform | "Reference to non-existent transform '<id>' at <line>:<col>" |
| Non-existent transform field | TransformFieldPath references unknown field in transform | "Transform '<id>' has no field '<field>' at <line>:<col>" |
| Reserved node ID | node ID matches a reserved DSL keyword (ctx, prev, carry, loopIteration, stream, output) | "[line:col] Node id '<id>' is a reserved DSL keyword and cannot be used as a node id" |
| Reserved lambda parameter | Lambda parameter matches a reserved DSL keyword | "Lambda parameter '<param>' conflicts with the reserved DSL keyword '<param>'" |
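As an illustration of the transform circular dependency row, the sketch below finds a cycle with a depth-first search and renders it in the message format from the table. The class and method names are hypothetical, and the compiler's actual detection algorithm may differ.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative only: depth-first cycle search over transform-to-transform references. */
final class TransformCycleSketch {

    /** Returns the first cycle found as a path such as [a, b, a], or an empty list if none. */
    static List<String> findCycle(Map<String, List<String>> references) {
        Set<String> done = new HashSet<>();
        for (String start : references.keySet()) {
            List<String> cycle = visit(start, references, new ArrayList<>(), done);
            if (!cycle.isEmpty()) {
                return cycle;
            }
        }
        return List.of();
    }

    private static List<String> visit(String id, Map<String, List<String>> references,
                                      List<String> path, Set<String> done) {
        int seenAt = path.indexOf(id);
        if (seenAt >= 0) {
            // The id is already on the current path: close the loop and report it.
            List<String> cycle = new ArrayList<>(path.subList(seenAt, path.size()));
            cycle.add(id);
            return cycle;
        }
        if (done.contains(id)) {
            return List.of(); // fully explored earlier, no cycle through this id
        }
        path.add(id);
        for (String next : references.getOrDefault(id, List.of())) {
            List<String> cycle = visit(next, references, path, done);
            if (!cycle.isEmpty()) {
                return cycle;
            }
        }
        path.remove(path.size() - 1);
        done.add(id);
        return List.of();
    }

    /** Renders the cycle using the arrow-separated format from the table (\u2192 is the arrow). */
    static String message(List<String> cycle) {
        return "Circular dependency detected among transforms: ["
                + String.join(" \u2192 ", cycle) + "]";
    }
}
```

Reporting the full path rather than only the offending pair makes longer cycles (A to B to C to A) diagnosable from the message alone.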
Appendix A: Complete EBNF Grammar Quick-Reference
Program = GraphDef | SessionDef
GraphDef = DocComment? "graph" IDENT "{" Member* "}"
SessionDef = DocComment? "session" IDENT "{" SessionMember* "}"
SessionMember = SessionProperty | PhaseDef | CommentNode
SessionProperty = "idle_timeout" "=" DURATION
| "timeout_action" "=" STRING
| "max_rounds" "=" NUMBER
| "max_history" "=" NUMBER
PhaseDef = DocComment? "phase" IDENT "{" PhaseBody "}"
PhaseBody = ( PhaseProperty | RoundBlock | Member )*
PhaseProperty = "max_rounds" "=" NUMBER
| "on_round_failure" "=" IDENT
| "yield_on" "=" "[" ( IDENT ( "," IDENT )* )? "]"
| "until" Expression
| ThenProperty
ThenProperty = "then" "->" IDENT
| "then" "{" ( ( Expression | "otherwise" ) "->" IDENT )* "}"
RoundBlock = "round" "{" Member* "}"
Member = NodeDef | BranchDef | TransformDef | SchemaDef | CommentNode
| ForEachDef | LoopDef | WaitDef | AwaitDef | ScriptDef | StreamMember
DocComment = DOC_COMMENT+
StreamMember = DocComment? "stream" ( NodeDef | ForEachDef | LoopDef )
NodeDef = DocComment? "node" IDENT ":" IDENT "{" NodeBody "}"
NodeBody = ( InputBlock | InputSchemaBlock | OutputDecl | DependsOn
| TimeoutField | RetryField | FallbackField | ScopeField | BufferField )*
InputBlock = "input" "{" ( DocComment? IDENT "=" Expression )* "}"
InputSchemaBlock = "input" "{" FieldDeclaration* "}"
OutputDecl = "output" ( "{" FieldDeclaration* "}" | ":" IDENT )
DependsOn = "depends_on" "=" "[" ( IDENT ( "," IDENT )* )? "]"
TimeoutField = "timeout" "=" DURATION
RetryField = "retry" "=" "{" ( IDENT ":" ( NUMBER | DURATION | IDENT ) ","? )* "}"
FallbackField = "fallback" "=" Expression
ScopeField = "scope" "=" ( "parent" | "isolated" )
BufferField = "buffer" "=" NUMBER
ForEachDef = DocComment? "foreach" IDENT ":" ( "(" IDENT ( "," IDENT )? ")" | IDENT )
"in" Expression "sequential"? "{" ( ScopeField | BufferField | Member )* "}"
LoopDef = DocComment? "loop" IDENT "{" ( "max_iterations" "=" NUMBER
| "delay" "=" DURATION | DependsOn | ScopeField | BufferField
| Member | CarryBlock | UntilClause )* "}"
CarryBlock = "carry" "{" ( IDENT ":" Expression ","? )* "}"
UntilClause = "until" Expression
WaitDef = DocComment? "wait" IDENT "=" WaitTimer "after" IDENT ( "{" WaitBody "}" )?
WaitTimer = DURATION | "deadline" "(" STRING ")" | "cron" "(" STRING ")"
WaitBody = ( "signal_key" "=" Expression | "on_timeout" PayloadBlock | "on_fire" PayloadBlock )*
AwaitDef = DocComment? "await" IDENT "{" AwaitBody "}"
AwaitBody = ( "mode" "=" IDENT | DependsOn | EventMatcher | TimeoutField | "on_timeout" PayloadBlock )*
EventMatcher = "event" STRING ( "where" IDENT "=" Expression )?
( "{" "optional" "=" "true" "}" )?
ScriptDef = DocComment? "script" IDENT "{" ScriptBody "}"
ScriptBody = ( "lang" "=" STRING | InputBlock
| "output_schema" ( "{" FieldDeclaration* "}" | "=" IDENT )
| "code" "=" TRIPLE_STRING | TimeoutField )*
PayloadBlock = "{" ( IDENT "=" Expression )* "}"
BranchDef = DocComment? "branch" "on" PathExpression "{" BranchBody "}"
BranchBody = ( DocComment? Expression "->" IDENT | "otherwise" "->" IDENT )*
TransformDef = DocComment? "transform" IDENT "{" LetBinding* TransformField* "}"
LetBinding = "let" IDENT "=" Expression
TransformField = DocComment? IDENT ( ":" IDENT "?"? )? "=" Expression
SchemaDef = "schema" IDENT "{" FieldDeclaration* "}"
FieldDeclaration = IDENT ":" ( IDENT "?"? | "{" FieldDeclaration* "}" )
Expression = LambdaExpr | WhenExpr | ConditionalExpr
LambdaExpr = ( IDENT | "(" ( IDENT ( "," IDENT )* )? ")" ) "->" Expression
ConditionalExpr = NullCoalesce ( "?" Expression ":" Expression )?
NullCoalesce = LogicalOr ( "??" LogicalOr )*
LogicalOr = LogicalAnd ( "||" LogicalAnd )*
LogicalAnd = Equality ( "&&" Equality )*
Equality = Comparison ( ( "==" | "!=" ) Comparison )?
Comparison = Additive ( ( ">" | "<" | ">=" | "<=" ) Additive )?
Additive = Multiplicative ( ( "+" | "-" ) Multiplicative )*
Multiplicative = Unary ( ( "*" | "/" | "%" ) Unary )*
Unary = ( "!" | "-" ) Unary | Primary
Primary = Atomic ( "." MethodCall )*
MethodCall = IDENT "(" ( Expression ( "," Expression )* )? ")"
Atomic = FunctionCall | PathExpression | STRING | NUMBER | DURATION
| "true" | "false" | ObjectLiteral | ArrayLiteral | "(" Expression ")"
FunctionCall = IDENT "(" ( Expression ( "," Expression )* )? ")"
WhenExpr = "when" Expression? "{" ( Expression "->" Expression )* ( "otherwise" "->" Expression )? "}"
PathExpression = IDENT ( "." IDENT )*
ObjectLiteral = "{" ( IDENT ":" Expression ( "," IDENT ":" Expression )* )? "}"
ArrayLiteral = "[" ( Expression ( "," Expression )* )? "]"
Appendix B: Example Index
The following 8 example files are located in bloge-examples/src/main/resources/bloge/:
| File | Graph Name | Description | Key Features Demonstrated |
|---|---|---|---|
| order-process.bloge | orderProcess | E-commerce order flow: user/product fetch → pricing → credit check → create/reject | Implicit deps, branch on boolean, retry with jitter strategy, fallback with object literal |
| bff-dashboard.bloge | bffDashboard | Backend-for-frontend aggregation: parallel fetch of profile, orders, recommendations, etc. | Fan-out parallelism, multiple fallbacks with array/object literals, no branching |
| loan-approval.bloge | loanApproval | Loan application: credit check, fraud detection, income verification → risk aggregation | 4-way fan-out, otherwise branch, multiple retry/fallback configs |
| ticket-routing.bloge | ticketRouting | Customer ticket routing: sentiment analysis → priority classification → agent assignment | otherwise → autoResolve, exponential retry, fallback with mixed types |
| food-order.bloge | foodOrderProcess | Food delivery: inventory/kitchen/delivery checks → acceptance → payment/dispatch | Branch on true/false boolean values, deep fan-out, post-branch parallel nodes |
| claim-processing.bloge | claimProcessing | Insurance claim flow: policy validation, document review, history check → risk assessment | 3-way fan-in/fan-out, otherwise → investigateClaim, exponential retry |
| shipment-planning.bloge | shipmentPlanning | Shipment logistics: warehouse/carrier/route optimization → cost calculation → dispatch mode | 3-way fan-in, otherwise → dispatchConsolidated, fallback with decimal numbers |
| online-triage.bloge | onlineTriage | Medical triage: patient records → symptom analysis → AI diagnosis → routing | ctx path with domain context (ctx.chiefComplaint), high timeout (10s), 3-way branch |
Appendix C: DSL vs Java Fluent API Comparison
Using the orderProcess example to illustrate the equivalence between DSL syntax and the Java fluent graph builder:
DSL (.bloge)
graph orderProcess {
node fetchUser : FetchUserOperator {
input {
userId = ctx.userId
}
timeout = 3s
retry = { attempts: 2, backoff: 200ms, strategy: exponential }
}
node fetchProducts : FetchProductsOperator {
input {
productIds = ctx.productIds
}
timeout = 5s
}
node checkCredit : CheckCreditOperator {
input {
user = fetchUser.output
products = fetchProducts.output
}
}
branch on checkCredit.output.approved {
true -> createOrder
otherwise -> rejectOrder
}
node createOrder : CreateOrderOperator {}
node rejectOrder : RejectOrderOperator {}
}
Java Fluent API (equivalent)
Graph graph = Graph.builder("orderProcess")
.node("fetchUser", fetchUserOperator)
.input((results, ctx) -> Map.of("userId", ctx.get("userId", Object.class)))
.timeout(Duration.ofSeconds(3))
.retry(2, Duration.ofMillis(200), BackoffStrategy.EXPONENTIAL)
.node("fetchProducts", fetchProductsOperator)
.input((results, ctx) -> Map.of("productIds", ctx.get("productIds", Object.class)))
.timeout(Duration.ofSeconds(5))
.node("checkCredit", checkCreditOperator)
.dependsOn("fetchUser", "fetchProducts")
.input((results, ctx) -> Map.of(
"user", results.getRaw("fetchUser"),
"products", results.getRaw("fetchProducts")))
.branch("checkCredit")
.on("approved")
.when(value -> Boolean.TRUE.equals(value), "createOrder")
.otherwise("rejectOrder")
.node("createOrder", createOrderOperator)
.node("rejectOrder", rejectOrderOperator)
    .build();
Key Differences
| Aspect | DSL | Java API |
|---|---|---|
| Dependency decl | depends_on = [...] or implicit | .dependsOn(...) explicit only |
| Input binding | field = path.expression | .field("name", (results, ctx) -> ...) |
| Enhanced expressions | a + b, ??, when {}, func() | Arbitrary Java lambdas |
| Transform blocks | transform id { field = expr } | Manual virtual node creation with TransformOperator |
| Branch predicate | Literal matching ("ok", true) | Predicate<Object> lambda |
| Schema declaration | output { field: Type } | Programmatic SchemaDescriptor construction |
| Resilience | Inline DSL syntax | Builder method chain |
| Doc-comments | /// prefix comments | Not supported in API |
Appendix D: Reserved Keywords & Future Extensions
Current Reserved Keywords
All 50 keywords listed in §2.1 are reserved at the lexer level (they produce keyword tokens). However, as described in §3.3, they remain contextual — usable as identifiers in many grammar positions even though later semantic validation may still reserve some runtime names.
Newly Specified Features (v1.0.0)
The following previously partial or draft features are fully specified in this stable version:
| Feature | Status | Description | Spec Section |
|---|---|---|---|
| Session root and phase syntax | Specified | Stable session / phase / round grammar and lowering model | §3.1, §3.4, §4.5, §5.12 |
| Higher-order expressions | Specified | Lambdas plus receiver-style collection methods | §3.1, §5.4, §5.10, §5.11 |
| Streaming members and paths | Specified | stream node, stream foreach, stream loop, and node.stream references | §3.1, §4.1, §4.3, §5.4 |
| Transform let bindings | Specified | Immutable transform-local intermediate values | §3.1, §4.1, §5.8 |
| Wait / await nodes | Specified | Timer suspension and correlated event waiting | §3.1, §4.1, §4.2, §5.12 |
| Script nodes | Specified | Sandboxed Groovy script execution via optional bloge-script module | §3.1, §4.2, §5.13 |
| Scope and buffer controls | Specified | `scope = parent \| isolated` and streaming `buffer = N` metadata | |
| SPI function typing contract | Specified | ExpressionFunction.returnType(...) is mandatory for explicit type reporting | §5.10, §5.11 |
| Shared parser conformance basis | Specified | The spec now matches the cross-language fixture and AST baseline suite | §1, §4, §5 |
Future Extension Points
The following features are referenced in design documents as potential future additions but are not yet specified:
| Feature | Status | Description |
|---|---|---|
| AccessorCodeGenerator | Planned | Bytecode generation via Java ClassFile API to replace reflective property access with generated accessor classes |
| Sub-graph mechanism | Specified | SubGraphOperator enabling nested graph execution via node x : subgraph("name") {} syntax with optional scope = parent |
| Import/include | Not planned | Multi-file graph composition |
| Graph complexity validation | Planned | Compile-time enforcement of node count, path depth, branch nesting, and fan-out limits |
| Operator behavior contracts | Planned | Idempotency and SideEffectType declarations on operators, influencing runtime resilience defaults |
| Schema versioning | Planned | SchemaDescriptor.version field with upstream/downstream compatibility validation |
The grammar has syntactic room for these extensions:
- New expression forms can be added as additional Expression sealed permits
- New node body clauses can be added without breaking existing syntax
- New graph body member types (like transform) can be added to the Member production