Schema & I/O
BLOGE treats operator inputs and outputs as contracts that should be visible to both humans and tooling. The schema system exists to remove guesswork from graph authoring, validation, testing, and visualization.
Why explicit schemas matter
Without schemas, downstream consumers have to read operator source code to discover output shape. That makes DSL authoring weaker, runtime mismatches harder to diagnose, and visual tooling less useful.
With explicit or inferred schemas, BLOGE can:
- validate field paths during graph build and DSL compilation
- document operator inputs and outputs automatically
- surface shape information in Studio and metadata JSON
- detect mismatches earlier, before production traffic hits a broken path
Core schema model
BLOGE's schema layer is centered around SchemaDescriptor.
| Type | Meaning |
|---|---|
| StructuredSchema | Field-level schema with nested structure |
| TypedSchema | Simple type reference when field expansion is unnecessary |
| OpaqueSchema | Escape hatch when shape is unknown or intentionally undeclared |
A structured schema is composed of FieldDescriptor records describing field name, type, required-ness, optional nested schema, and documentation.
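The model described above can be pictured as a small type hierarchy. The sketch below is illustrative only — it defines minimal stand-ins for SchemaDescriptor, the three schema kinds, and FieldDescriptor based solely on the table and description above; the real BLOGE types may be shaped differently.

```java
import java.util.List;

// Illustrative sketch of the schema model; actual BLOGE type shapes may differ.
public class SchemaModelSketch {

    // Common supertype for all schema descriptors.
    sealed interface SchemaDescriptor
            permits StructuredSchema, TypedSchema, OpaqueSchema {}

    // Field-level schema with nested structure.
    record StructuredSchema(List<FieldDescriptor> fields) implements SchemaDescriptor {}

    // Simple type reference when field expansion is unnecessary.
    record TypedSchema(String typeName) implements SchemaDescriptor {}

    // Escape hatch when shape is unknown or intentionally undeclared.
    record OpaqueSchema() implements SchemaDescriptor {}

    // Field name, type, required-ness, optional nested schema, and documentation.
    record FieldDescriptor(String name, String type, boolean required,
                           SchemaDescriptor nested, String doc) {}

    public static void main(String[] args) {
        SchemaDescriptor userOutput = new StructuredSchema(List.of(
                new FieldDescriptor("id", "Int", true, null, "user id"),
                new FieldDescriptor("email", "String", false, null, "optional email")));
        System.out.println(((StructuredSchema) userOutput).fields().size()); // 2
    }
}
```

Modeling the descriptors as a sealed hierarchy lets tooling switch exhaustively over the three schema kinds.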
Where schemas come from
BLOGE can discover schema from several sources.
Java-side inference
For Java operators, the runtime can infer schema from the Operator<I, O> generic types.
Typical behavior:
- Java records -> recursive field extraction
- POJOs -> getter-based inspection
- Map<String, Object> -> usually degrades to OpaqueSchema
Operators can override inference by implementing SchemaAware and returning explicit schemas.
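Record-based inference can be sketched with nothing but standard Java reflection. The snippet below shows the general idea — extracting a field list from record components via Class.getRecordComponents(); the FieldInfo type and inferFields method are illustrative stand-ins, not BLOGE API.

```java
import java.lang.reflect.RecordComponent;
import java.util.ArrayList;
import java.util.List;

// Sketch of record-based schema inference using plain Java reflection.
// FieldInfo and inferFields are illustrative names, not BLOGE API.
public class RecordInferenceSketch {

    record FieldInfo(String name, String type) {}

    record UserView(String id, String name, String email) {}

    static List<FieldInfo> inferFields(Class<?> type) {
        List<FieldInfo> fields = new ArrayList<>();
        if (type.isRecord()) {
            // Each record component carries the name and type of one field.
            for (RecordComponent rc : type.getRecordComponents()) {
                fields.add(new FieldInfo(rc.getName(), rc.getType().getSimpleName()));
            }
        }
        // Non-record types would fall back to getter inspection, or OpaqueSchema.
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(inferFields(UserView.class));
        // [FieldInfo[name=id, type=String], ...]
    }
}
```

A real implementation would also recurse into record-typed components to build the nested schema.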
DSL-side declarations
The DSL can declare schemas inline or through reusable schema blocks.
```
schema UserOutput {
  id: Int
  name: String
  email: String?
}

node fetchUser {
  operator: "FetchUserOperator"
  output: UserOutput
}
```

Inline declarations are especially useful when a graph is defined externally and the compiler should validate field paths before runtime.
Transform inference
transform fields also participate in the schema system. Their types can come from:
- explicit type annotations
- expression type inference
- an Unknown fallback when inference is incomplete
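The precedence among those three sources can be sketched as follows. The lookup table and method names here are hypothetical stand-ins for real expression inference; only the ordering — annotation first, then inference, then Unknown — comes from the list above.

```java
import java.util.Map;
import java.util.Optional;

// Sketch of transform type resolution: explicit annotation wins, then
// expression inference, then the Unknown fallback. Illustrative only.
public class TransformTypeSketch {

    // Toy expression-type table standing in for real expression inference.
    static final Map<String, String> EXPRESSION_TYPES =
            Map.of("user.id", "String", "user.age + 1", "Int");

    static String resolveType(Optional<String> annotation, String expression) {
        return annotation
                .or(() -> Optional.ofNullable(EXPRESSION_TYPES.get(expression)))
                .orElse("Unknown");
    }

    public static void main(String[] args) {
        System.out.println(resolveType(Optional.of("Long"), "user.id"));  // Long
        System.out.println(resolveType(Optional.empty(), "user.id"));     // String
        System.out.println(resolveType(Optional.empty(), "mystery()"));   // Unknown
    }
}
```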
Validation stages
BLOGE can validate schemas at three stages.
| Stage | What it checks |
|---|---|
| Graph build time | Compatibility across edges and referenced fields |
| DSL compile time | Path validity, declared schema references, and binding compatibility |
| Runtime | Actual values versus declared required fields and types |
Validation strictness is controlled by SchemaValidationLevel, which has three settings: OFF, WARN, and ERROR.
This lets teams start with warnings and tighten enforcement as contracts stabilize.
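The behavioral difference between the three levels at a validation site can be sketched like this; the enum name comes from the text above, but the report method and its behavior are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of how the three strictness levels could behave at a validation
// site: OFF skips the check, WARN records it, ERROR fails fast.
public class ValidationLevelSketch {

    enum SchemaValidationLevel { OFF, WARN, ERROR }

    static final List<String> warnings = new ArrayList<>();

    static void report(SchemaValidationLevel level, String problem) {
        switch (level) {
            case OFF -> { /* validation disabled, problem ignored */ }
            case WARN -> warnings.add(problem);
            case ERROR -> throw new IllegalStateException(problem);
        }
    }

    public static void main(String[] args) {
        report(SchemaValidationLevel.OFF, "missing field: email");
        report(SchemaValidationLevel.WARN, "missing field: email");
        System.out.println(warnings); // [missing field: email]
        try {
            report(SchemaValidationLevel.ERROR, "missing field: email");
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Starting at WARN in staging, then switching to ERROR once contracts settle, follows the progression the text describes.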
Example: inferred Java schema
```java
public record UserQuery(String userId) {}

public record UserView(String id, String name, String email) {}

public final class FetchUserOperator implements Operator<UserQuery, UserView> {

    private final UserService userService;

    public FetchUserOperator(UserService userService) {
        this.userService = userService;
    }

    @Override
    public UserView execute(UserQuery input, OperatorContext ctx) {
        return userService.find(input.userId());
    }
}
```

Here BLOGE can infer both input and output schemas automatically from the record components.
Example: metadata export
The Maven plugin can export operator schema into operator-metadata.json for visual tooling:
```json
{
  "name": "FetchUserOperator",
  "inputSchema": {
    "kind": "structured",
    "fields": [{ "name": "userId", "type": "String", "required": true }]
  },
  "outputSchema": {
    "kind": "structured",
    "fields": [
      { "name": "id", "type": "String", "required": true },
      { "name": "name", "type": "String", "required": true },
      { "name": "email", "type": "String", "required": false }
    ]
  }
}
```

Studio can then use this data for operator catalogs, field completion, and data-flow visualization.
Practical guidance
- Prefer Java records for clean inference whenever possible.
- Use explicit schemas when a graph is authored in DSL and path validation matters.
- Treat OpaqueSchema as a temporary escape hatch, not the default destination.
- Version breaking output changes intentionally instead of mutating contracts invisibly.
- Keep transform outputs typed when they become shared downstream dependencies.
What schema validation is not
Schema validation does not replace domain validation. It answers questions like:
- does fetchUser.output.address.city exist?
- is this field optional or required?
- does the downstream node expect a compatible shape?
It does not decide whether a value is semantically correct for the business domain.
Next steps
- Export schema catalogs in Maven Plugin & Lint
- Visualize them in Bloge Studio
- See data shaping guidance in Design Principles