ADR-0018: Composition Operators
Date: 2026-04-06 Status: Accepted
Context
jPipe files can declare a justification (or template) as the result of composing two or more existing models via a composition operator:
    justification refined is refine(minimal, refinement) {
        hook: "minimal/e"
    }
The grammar (JPipe.g4) already supports this syntax. The compiler must:
- Dispatch the operator call to the correct implementation at model-build time.
- Remain non-destructive — source models must not be mutated.
- Be extensible — adding a new operator must not require changes to the compilation pipeline.
- Preserve the original element names in the compiled Unit so that post-composition lookups can resolve them even after elements are renamed during merging.
Decision
1. Match-and-merge framework (jpipe-operators module)
Composition is modelled as a two-phase match-and-merge algorithm owned by
an abstract CompositionOperator class (Template Method pattern):
- EquivalenceRelation: a binary predicate (a, b) → boolean that decides whether two SourcedElements belong to the same equivalence class. CompositionOperator partitions all source elements automatically using an O(n²) representative-based scan; callers need only supply the predicate.
- MergeFunction: receives one equivalence class (ElementGroup) and creates the merged element via Command objects. It must register old ids → new id in the supplied AliasRegistry and must not emit AddSupport commands (edge reconstruction is automatic).
- SourcedElement: a 3-component record (element, source, location) pairing a JustificationElement with the JustificationModel it was taken from and its original SourceLocation. The location is looked up from the compilation unit's location registry when apply() builds the sourced-element list, so MergeFunction implementations can forward it to creation commands and have it recorded in the result model's symbol-table entry.
- CompositionOperator.apply(): the sealed template method. Four overloads are provided for call-site convenience; all delegate to the full 5-argument form:
  - apply(resultName, sources, arguments): no location information (unit tests).
  - apply(resultName, sources, arguments, location): attaches a SourceLocation to the result model declaration but supplies no element locations.
  - apply(resultName, sources, arguments, location, knownLocations): the canonical form; knownLocations is a Map<String, SourceLocation> keyed on "modelName/elementId" (the same format used by Unit.locations()). ApplyOperator.expand() passes context.locations() here so that copied elements carry their source file positions into the result model.
- Phase 1: creates elements, populates the AliasRegistry, and emits RegisterAlias commands to persist aliases in the Unit.
- Phase 2: reconstructs support edges by translating original endpoints through the registry, with set-based deduplication to prevent duplicate edges when multiple source models share the same edge.
- Subclasses provide equivalenceRelation(sources, arguments), mergeFunction(sources, arguments), createResultModel(name, location, sources, arguments), and optionally requiredArguments() (missing required keys throw InvalidOperatorCallException). All hook methods receive the full sources list so that operators can inspect source types and distinguish elements by their origin model without breaking the sealed apply().
- additionalCommands(resultName, sources, aliases, args): optional hook called after Phase 2 is complete. The default implementation returns List.of(). Override it to inject synthesized elements (e.g. a new aggregating strategy and conclusion) and cross-model edges that have no counterpart in any source model. Because it runs after Phase 2, all source-derived elements and edges are already present; the returned commands are appended at the end.
- Phase 4: automatic unification runs in ApplyOperator.expand() after apply() returns, via Unifier.unify(). It is not part of the sealed apply() template but is always applied to the command list before the engine sees it. See §2 below.
- ModelReplicator: a stateless utility that generates commands to copy a model's elements and edges non-destructively, following the same qualified-id convention as JustificationModel.inline().
- OperatorRegistry: a name → CompositionOperator map populated at compiler startup. Currently populated with hardcoded built-in operators in CompilerFactory.builtInOperators(); a service-loader extension point is deferred to a later ADR.
- Aliases in Unit: Unit gains recordAlias(model, oldId, newId), resolveAlias(model, id), and aliases() (an unmodifiable view of the full alias map, keyed on "model/oldId"), backed by a flat Map<String, String>. A new RegisterAlias command writes alias entries so the mapping survives in the compiled unit. aliases() is used by DiagnosticReport to include alias entries in the symbol table.
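The representative-based partition mentioned above can be sketched as follows. This is a minimal illustration rather than the jPipe implementation: the class name PartitionSketch and its generic signature are assumptions, and java.util.function.BiPredicate stands in for EquivalenceRelation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiPredicate;

/** Hypothetical sketch of an O(n^2) representative-based partition. */
final class PartitionSketch {

    private PartitionSketch() { }

    /**
     * Groups items into classes. Each item is compared against the first
     * member (the representative) of every existing class and joins the
     * first class whose representative matches; otherwise it starts a new
     * class. Assumes the predicate behaves as an equivalence relation.
     */
    static <T> List<List<T>> partition(List<T> items, BiPredicate<T, T> equivalent) {
        List<List<T>> classes = new ArrayList<>();
        for (T item : items) {
            List<T> home = null;
            for (List<T> candidate : classes) {
                if (equivalent.test(candidate.get(0), item)) {
                    home = candidate;
                    break;
                }
            }
            if (home == null) {
                home = new ArrayList<>();
                classes.add(home);
            }
            home.add(item);
        }
        return classes;
    }

    public static void main(String[] args) {
        // Elements are "model/id" strings; two match when the id part is equal.
        List<String> elements = List.of("minimal/e", "refinement/c", "other/e");
        BiPredicate<String, String> sameShortId = (a, b) ->
                a.substring(a.indexOf('/') + 1).equals(b.substring(b.indexOf('/') + 1));
        System.out.println(partition(elements, sameShortId));
        // prints [[minimal/e, other/e], [refinement/c]]
    }
}
```

Comparing only against representatives keeps the caller's contract small (a single predicate), at the cost of a quadratic worst case when every element lands in its own class.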
2. Automatic post-composition unification (jpipe-operators)
ApplyOperator.expand() passes the command list returned by apply() through
a Unifier instance before returning it to the engine. The Unifier performs
a Phase 4 transformation on the command list:
- Algorithm: scans the list for element-creation commands (CreateConclusion, CreateStrategy, CreateEvidence, CreateSubConclusion, CreateAbstractSupport), partitions them into equivalence classes using an EquivalenceRelation looked up in UnificationEquivalenceRegistry, and for each class with more than one member:
  - Creates a new synthesized element with id "unified_N" (N is a 0-based counter, incremented once per merged group within one unify() call).
  - Removes all original Create* commands for the group members.
  - Rewrites all AddSupport commands referencing removed ids to use the new "unified_N" id, deduplicating the rewritten edges.
  - Appends RegisterAlias(resultName, oldId, "unified_N") for every removed id so the alias map in Unit remains consistent.
- Config parameters (read from the operator's rule_config map):
  - unifyBy: name of the equivalence relation (default: "sameLabel"). Throws InvalidOperatorCallException if the name is not registered.
  - unifyExclude: comma-separated list of result-model element ids that must not participate in unification (default: empty). Excluded elements remain as-is and do not block other elements from merging.
- UnificationEquivalenceRegistry: a name-to-EquivalenceRelation map populated at compiler startup, following the same pattern as OperatorRegistry. Currently registers "sameLabel" → SameLabel. SameShortId is intentionally absent; it is reserved for Phase 1 operator equivalence only. New equivalence relations (e.g. Levenshtein distance) can be added by registering them in CompilerFactory.builtInUnificationEquivalences().
- Partitions: a package-private utility class extracted from CompositionOperator so that both CompositionOperator (Phase 1) and Unifier (Phase 4) share the same O(n²) representative-based partition algorithm without duplication.
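To make the unification steps concrete, here is a hedged sketch over a simplified data model. It is not the jPipe Unifier: the real one rewrites a command list, while this sketch works on plain Element and Edge records (both hypothetical), hardcodes the "sameLabel" behaviour by grouping on labels with a map, and takes the unifyExclude ids as a set.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of label-based unification over a simplified model. */
final class UnifierSketch {

    record Element(String id, String label) { }
    record Edge(String from, String to) { }
    record Result(List<Element> elements, Set<Edge> edges, Map<String, String> aliases) { }

    private UnifierSketch() { }

    static Result unify(List<Element> elements, List<Edge> edges, Set<String> excluded) {
        // Group non-excluded elements by label, keeping first-seen order.
        Map<String, List<Element>> byLabel = new LinkedHashMap<>();
        for (Element e : elements) {
            if (!excluded.contains(e.id())) {
                byLabel.computeIfAbsent(e.label(), k -> new ArrayList<>()).add(e);
            }
        }
        // Every class with more than one member becomes one "unified_N" element.
        Map<String, String> aliases = new LinkedHashMap<>();
        List<Element> synthesized = new ArrayList<>();
        int counter = 0;
        for (List<Element> group : byLabel.values()) {
            if (group.size() > 1) {
                String newId = "unified_" + counter++;
                synthesized.add(new Element(newId, group.get(0).label()));
                for (Element member : group) {
                    aliases.put(member.id(), newId); // old id -> merged id
                }
            }
        }
        // Keep the elements that were not merged, then add the synthesized ones.
        List<Element> kept = new ArrayList<>();
        for (Element e : elements) {
            if (!aliases.containsKey(e.id())) {
                kept.add(e);
            }
        }
        kept.addAll(synthesized);
        // Rewrite edge endpoints through the alias map; the set drops duplicates.
        Set<Edge> rewritten = new LinkedHashSet<>();
        for (Edge edge : edges) {
            rewritten.add(new Edge(
                    aliases.getOrDefault(edge.from(), edge.from()),
                    aliases.getOrDefault(edge.to(), edge.to())));
        }
        return new Result(kept, rewritten, aliases);
    }

    public static void main(String[] args) {
        Result r = unify(
                List.of(new Element("a_e1", "Data is valid"),
                        new Element("b_e1", "Data is valid"),
                        new Element("s", "Review results")),
                List.of(new Edge("a_e1", "s"), new Edge("b_e1", "s")),
                Set.of());
        System.out.println(r.aliases()); // prints {a_e1=unified_0, b_e1=unified_0}
        System.out.println(r.edges().size()); // prints 1
    }
}
```

In the example, the two edges into s collapse into one because both endpoints translate to the same pair after rewriting, which is the effect the deduplication step is meant to guarantee.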
4. ApplyOperator implements MacroCommand (jpipe-operators)
When the parser produces a justification or template with ctx.operator != null,
ActionListProvider emits an ApplyOperator command instead of
CreateJustification / CreateTemplate.
ApplyOperator implements the MacroCommand interface already used by the engine
for deferred expansion:
- condition(): defers until all source model names are present in the Unit.
- expand(Unit): looks up the operator by name in the OperatorRegistry, gathers source models from the unit, and delegates to the 5-argument operator.apply(), passing the stored SourceLocation and context.locations() so that element locations are threaded through to the result model. Returns the resulting List<Command> for the engine to splice at the front of the queue.
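The deferral and splice behaviour can be illustrated with a toy engine. Everything here is an assumption made for illustration: ToyUnit, ToyCommand, ToyMacro, CreateModel and Apply are hypothetical stand-ins, and moving a not-yet-ready command to the back of the queue is one plausible deferral strategy, not necessarily the one the jPipe engine uses.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of deferred macro expansion; not the jPipe engine. */
final class MacroEngineSketch {

    /** Toy stand-in for the compilation unit: known model names plus a log. */
    static final class ToyUnit {
        final Set<String> models = new HashSet<>();
        final List<String> log = new ArrayList<>();
    }

    interface ToyCommand {
        default boolean condition(ToyUnit unit) { return true; }
        void execute(ToyUnit unit);
    }

    interface ToyMacro extends ToyCommand {
        List<ToyCommand> expand(ToyUnit unit);
        @Override default void execute(ToyUnit unit) { }
    }

    record CreateModel(String name) implements ToyCommand {
        @Override public void execute(ToyUnit unit) {
            unit.models.add(name);
            unit.log.add("create " + name);
        }
    }

    record Apply(String result, List<String> sources) implements ToyMacro {
        @Override public boolean condition(ToyUnit unit) {
            return unit.models.containsAll(sources); // wait for every source model
        }
        @Override public List<ToyCommand> expand(ToyUnit unit) {
            return List.of(new CreateModel(result));
        }
    }

    static void run(List<ToyCommand> commands, ToyUnit unit) {
        Deque<ToyCommand> queue = new ArrayDeque<>(commands);
        int deferrals = 0;
        while (!queue.isEmpty()) {
            ToyCommand next = queue.pollFirst();
            if (!next.condition(unit)) {
                // Not ready yet: move it to the back and retry later.
                queue.addLast(next);
                if (++deferrals > queue.size()) {
                    throw new IllegalStateException("no command can make progress");
                }
                continue;
            }
            deferrals = 0;
            if (next instanceof ToyMacro macro) {
                // Splice the expansion at the front, preserving its order.
                List<ToyCommand> expansion = macro.expand(unit);
                for (int i = expansion.size() - 1; i >= 0; i--) {
                    queue.addFirst(expansion.get(i));
                }
            } else {
                next.execute(unit);
            }
        }
    }

    public static void main(String[] args) {
        ToyUnit unit = new ToyUnit();
        run(List.of(
                new Apply("refined", List.of("minimal", "refinement")),
                new CreateModel("minimal"),
                new CreateModel("refinement")), unit);
        System.out.println(unit.log);
        // prints [create minimal, create refinement, create refined]
    }
}
```

The Apply command is listed first but runs last, which mirrors how an operator call can appear in a file before the models it composes.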
5. Compiler integration (jpipe-compiler)
- ActionListProvider receives an OperatorRegistry at construction (field named operators).
- In enterJustification / enterTemplate, when ctx.operator != null, all data needed for ApplyOperator (result name, operator name, source names, config map) is read eagerly from the already-built parse tree context. The method returns early without updating buildContext, since there is no body to parse.
- enterRule_config becomes a no-op (config was consumed in the parent callback).
- CompilerFactory.parsingChain() passes builtInOperators() to ActionListProvider.
6. Symbol table for operator-created models
DiagnosticReport builds the symbol table by iterating unit.getModels()
directly rather than only the recorded-location registry. For each element:
- If unit.locationOf(modelName, elementId) returns a known location (threaded from the source model via knownLocations), that location is displayed.
- Otherwise the element is marked [synthesized], indicating it was created by the operator with no corresponding source position (e.g. a merged SubConclusion).
Aliases are shown after the element list for each model, formatted as
oldId → newId [alias].
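The lookup-or-[synthesized] rule and the alias listing can be sketched as below. This is an illustration under assumptions: SymbolTableSketch and render are hypothetical names, locations are plain strings rather than SourceLocation objects, and the exact column layout of the real DiagnosticReport is not known. Only the "modelName/elementId" and "model/oldId" key formats come from the text above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of symbol-table rendering for a single model. */
final class SymbolTableSketch {

    private SymbolTableSketch() { }

    /**
     * @param locations known positions keyed on "modelName/elementId"
     * @param aliases   alias map keyed on "model/oldId", value is the new id
     */
    static List<String> render(String model, List<String> elementIds,
                               Map<String, String> locations,
                               Map<String, String> aliases) {
        List<String> lines = new ArrayList<>();
        // Elements first: show the known location or mark as synthesized.
        for (String id : elementIds) {
            String location = locations.get(model + "/" + id);
            lines.add(id + " " + (location != null ? location : "[synthesized]"));
        }
        // Then the aliases that belong to this model ("\u2192" is the arrow).
        String prefix = model + "/";
        for (Map.Entry<String, String> alias : aliases.entrySet()) {
            if (alias.getKey().startsWith(prefix)) {
                lines.add(alias.getKey().substring(prefix.length())
                        + " \u2192 " + alias.getValue() + " [alias]");
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        List<String> lines = render("refined",
                List.of("c", "merged"),
                Map.of("refined/c", "line 3"),
                Map.of("refined/e", "merged"));
        lines.forEach(System.out::println);
    }
}
```

Iterating the model's own element list, rather than the location registry, is what lets elements without a recorded position still show up in the output.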
Rationale
- Encoding the operator call as a MacroCommand reuses the engine's existing deferred-execution mechanism (also used by ImplementsTemplate) without introducing new scheduling logic.
- Collecting rule_config eagerly in enterJustification avoids accumulating state across child callbacks and is safe because the ANTLR parse tree is fully built before the walker fires any listener method.
- Separating the equivalence predicate from the merge function makes both independently testable and reusable across different operator implementations.
- OperatorRegistry as an explicit constructor dependency of ActionListProvider (rather than a static singleton) makes the available operator set visible at construction time and mockable in tests.
- Persisting aliases to Unit via RegisterAlias commands keeps the alias data in the same place as all other compiled model state, consistent with how recordLocation / locationOf track source positions.
- Passing knownLocations as a Map<String, SourceLocation> (rather than Unit itself) to apply() keeps the operator framework decoupled from the full compilation unit and limits the operator's read access to location data only.
- Building the symbol table from unit.getModels() rather than from unit.locations() ensures that operator-created models, which may have no element-level location entries, still appear with their complete element list.
Consequences
- Two built-in operators ship with the framework:
  - refine(base, refinement): merges a hook element from base with the conclusion of refinement into a single SubConclusion; requires a hook argument in "modelName/elementId" form.
  - assemble(src₁, …, srcₙ): demotes each source's conclusion to a SubConclusion, wires all demoted conclusions through a synthesized aggregating Strategy, and tops them with a synthesized global Conclusion; requires conclusionLabel and strategyLabel arguments. The result is a Template if any source is a Template, otherwise a Justification.
- Post-composition unification always runs after every operator call. By default (unifyBy: "sameLabel") any two result-model elements with identical labels are merged into a single synthesized element with id "unified_N" (N = 0-based counter per group). Both original ids are registered as aliases. Set unifyExclude to a comma-separated id list to protect specific elements from unification. Specify unifyBy to use an alternative equivalence relation registered in UnificationEquivalenceRegistry.
- Adding a new built-in unification equivalence requires only implementing EquivalenceRelation and registering it in CompilerFactory.builtInUnificationEquivalences().
- Adding a new built-in operator requires only implementing CompositionOperator and registering it in CompilerFactory.builtInOperators().
- ApplyOperator is the only command that holds a reference to OperatorRegistry; all other commands remain independent of the operator framework.
- Future service-loader extensibility can be added inside builtInOperators() without changing any other class.
- Merge functions that accidentally emit AddSupport violate the MergeFunction contract and will produce duplicate edges; this is documented in the contract Javadoc but not enforced at runtime.
- An operator call with an unknown operator name causes expand() to throw InvalidOperatorCallException. The engine wraps this in a CompilationException via the standard fire() / exception-wrapping mechanism.
- Elements copied from source models appear in the result model's symbol table with their original source positions. Truly synthesized elements (e.g. a merged SubConclusion produced by the refine operator) appear as [synthesized]. Aliases (old id → merged id) are shown separately per model.
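The registration and failure behaviour described in these consequences can be sketched with a toy registry. All types here are hypothetical stand-ins: ToyOperator is far simpler than CompositionOperator, the nested InvalidOperatorCallException only borrows the name of the real exception (whose constructors are not shown in this ADR), and both checks live in a single call method for brevity, whereas the text above places the unknown-name check in expand().

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of a name-to-operator registry with call validation. */
final class OperatorRegistrySketch {

    /** Local stand-in that only shares its name with the real exception. */
    static final class InvalidOperatorCallException extends RuntimeException {
        InvalidOperatorCallException(String message) { super(message); }
    }

    interface ToyOperator {
        default Set<String> requiredArguments() { return Set.of(); }
        String apply(String resultName, List<String> sources, Map<String, String> arguments);
    }

    /** Toy "refine": requires a hook argument and just describes the call. */
    static final class Refine implements ToyOperator {
        @Override public Set<String> requiredArguments() { return Set.of("hook"); }
        @Override public String apply(String resultName, List<String> sources,
                                      Map<String, String> arguments) {
            return resultName + " = refine" + sources + " hook=" + arguments.get("hook");
        }
    }

    private final Map<String, ToyOperator> operators = new HashMap<>();

    void register(String name, ToyOperator operator) {
        operators.put(name, operator);
    }

    String call(String name, String resultName, List<String> sources,
                Map<String, String> arguments) {
        ToyOperator operator = operators.get(name);
        if (operator == null) {
            throw new InvalidOperatorCallException("unknown operator: " + name);
        }
        for (String key : operator.requiredArguments()) {
            if (!arguments.containsKey(key)) {
                throw new InvalidOperatorCallException("missing argument: " + key);
            }
        }
        return operator.apply(resultName, sources, arguments);
    }

    public static void main(String[] args) {
        OperatorRegistrySketch registry = new OperatorRegistrySketch();
        registry.register("refine", new Refine());
        System.out.println(registry.call("refine", "refined",
                List.of("minimal", "refinement"), Map.of("hook", "minimal/e")));
        // prints refined = refine[minimal, refinement] hook=minimal/e
    }
}
```

Adding an operator in this sketch touches only the registration site, which is the extensibility property the consequences list claims for builtInOperators().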