CEDAR Template Model Specification

This specification defines the structural model for the CEDAR Template Model and its concrete JSON wire form.

It separates schema definition, presentation structure, reusable artifacts, contextual embedding, and instance data, and is layered as an abstract grammar paired with a JSON wire grammar, encoding rules, host-language bindings, and a normative validation algorithm.

The core concepts are Artifact, SchemaArtifact, Template, Field, PresentationComponent, EmbeddedArtifact, TemplateInstance, and InstanceValue. Every concrete artifact carries a top-level ModelVersion identifying the version of the CEDAR structural model it conforms to.

Scope

This specification defines:

  • the core metamodel and abstract grammar
  • artifact metadata, identity, lifecycle, and versioning
  • the field-spec system (twenty concrete field families)
  • the presentation-component model
  • the instance model
  • the JSON wire form (encoding rules, kind discriminator, wrapper collapse, property-name map)
  • encoding and decoding semantics, including a normative error model
  • a canonical validation algorithm with explicit error reports
  • host-language binding idioms for TypeScript, Java, and Python
  • a derived RDF projection of Value instances
  • a cross-language conformance test suite

Document Structure

  • notation.md — notation conventions used throughout the specification.
  • metamodel.md — conceptual overview: principal categories, the field hierarchy, the layered specification, and cross-cutting conventions.
  • grammar.md — the abstract EBNF-style grammar, including the FieldSpec system, defaults, primitive lexical-form productions, and related constraints.
  • wire-grammar.md — the JSON wire grammar: kind rule, wrapper collapse, encoding rules, and the property-name map for every production.
  • serialization.md — encoding and decoding semantics: round-tripping, the wrapping principle, the error model, and worked examples.
  • bindings.md — host-language idioms (TypeScript, Java, Python) and codebase-organisation guidance.
  • validation.md — the canonical validation algorithm, with per-step error reports.
  • presentation.md — the PresentationComponent family.
  • instances.md — TemplateInstance and InstanceValue semantics, including the explicit “defaults are not part of instances” rule.
  • rdf-projection.md — the derived projection from CEDAR Value instances to RDF.
  • index-of-productions.md — auto-generated A–Z index of every production in the specification.

A cross-language conformance test suite accompanies the specification: 114 fixtures (91 valid round-trip cases, 23 invalid cases with expected-error reports) embedded into serialization.md §8 and intended as a binding-acceptance contract.

Core Design Principles

  • Schema definition MUST be separated from instance data.
  • Semantic structure MUST be separated from presentation.
  • Templates MUST contain embedded artifacts rather than directly containing Field, Template, or PresentationComponent.
  • PresentationComponent MUST NOT contribute instance values.
  • Defaults are UI/UX initialisation only and never appear in TemplateInstance artifacts or in the RDF projection.
  • Terminology MUST remain stable across this specification.

Open Questions

  • Should the model support template-local (on-the-fly) fields without identity or versioning? See issue #1.
  • Are the Name, Description, PreferredLabel, and AlternativeLabel properties on ArtifactMetadata all pulling their weight, or is there redundancy worth simplifying? See issue #2.
  • Should instance structures eventually allow path-based keys in addition to EmbeddedArtifactKey?
  • Should option sets for some FieldSpec variants become reusable artifacts?

Notation

This section sets out the terminology and naming conventions shared across the rest of the specification.

The production notation for the abstract grammar is defined in spec/grammar.md.

Conformance Language

The capitalised words MUST, MUST NOT, SHOULD, and MAY express normative requirements.

Naming Conventions

Defined terms use the terminology in this specification exactly. In particular, the following terms are normative and stable.

Schema and artifact terms:

  • Artifact
  • SchemaArtifact
  • Template
  • Field
  • PresentationComponent

Embedding terms:

  • EmbeddedArtifact
  • EmbeddedField
  • EmbeddedTemplate
  • EmbeddedPresentationComponent
  • EmbeddedArtifactKey

Instance terms:

  • TemplateInstance
  • InstanceValue
  • FieldValue
  • NestedTemplateInstance

Typing terms:

  • FieldSpec

Value Notation

Value denotes an instance-level data value in the grammar.

Each Value family carries its content directly: a LexicalForm, an optional LanguageTag, an explicit datatype IRI (where one is configurable), or a boolean payload, depending on the family. There is no separate RDF-Literal layer in the abstract grammar; an RDF projection is defined separately in rdf-projection.md.

The normative structure and semantics of values are defined in the Values section of grammar.md.

Metamodel

Overview

This section provides a conceptual overview of the CEDAR Template Model. Its purpose is to describe the principal categories of constructs, the relationships among them, and the design rationale behind key decisions. It is intended as a companion to the formal abstract grammar defined in spec/grammar.md, which is the normative specification. Readers seeking precise structural definitions, production rules, or normative constraints should consult grammar.md directly.

The CEDAR Template Model is organised around three principal concerns: reusable schema artifacts that define structure, embedding constructs that contextualise those artifacts within a specific template, and template instances that record data conforming to a template.

Principal Categories

Artifact is the broadest category in the model. Every artifact carries a repository-assigned identifier, descriptive metadata, lifecycle metadata, and zero or more annotations. SchemaArtifact, PresentationComponent, and TemplateInstance are the three principal subclasses.

A SchemaArtifact is a reusable artifact that defines schema structure. Template and Field are the two concrete schema artifact kinds. Both carry versioning metadata — semantic version, publication status, optional lineage references — in addition to the common artifact metadata; see grammar.md for the normative shape. Independently of schema versioning, every concrete Artifact (every Template, every TemplateInstance, every Field, and every PresentationComponent) carries a top-level ModelVersion identifying the version of the CEDAR structural model the artifact conforms to.

A Template is the central container of the model. It specifies an ordered arrangement of EmbeddedArtifact constructs and defines the schema that TemplateInstance constructs must conform to.

A Field is an abstract category refined into a fixed set of typed concrete variants. Each concrete field carries a matching FieldSpec that specifies its value semantics and configuration: the field artifact carries identity, metadata, and lifecycle information, while the FieldSpec carries value rules and rendering properties. The full set of concrete variants, their groupings under abstract sub-categories (NumericField, TemporalField, EnumField, ContactField, ExternalAuthorityField), and the rationale behind the splits are documented in grammar.md and indexed in the Field Families chapter.

A PresentationComponent is a reusable non-data-bearing artifact that contributes presentational or instructional structure within a template. Examples include rich text, images, YouTube videos, section breaks, and page breaks. Presentation components do not produce instance values.

An EmbeddedArtifact contextualises a reusable artifact within a specific Template. There are three forms, and they carry different subsets of template-local properties:

  • EmbeddedField carries the full property set: an EmbeddedArtifactKey, a typed reference to the embedded Field, and optional ValueRequirement, Cardinality, Visibility, family-typed defaultValue, LabelOverride, HelpTextOverride, and Property (a semantic property IRI for the embedding site).
  • EmbeddedTemplate carries the embedding key, the embedded template’s identifier, and optional ValueRequirement, Cardinality, Visibility, LabelOverride, and Property. It carries no defaultValue (templates do not have value-typed defaults).
  • EmbeddedPresentationComponent carries only the embedding key, the embedded presentation component’s identifier, and an optional Visibility. It contributes no instance data and exists purely to contribute presentational structure.

An EmbeddedArtifactKey is the local identifier of an EmbeddedArtifact within its containing Template. It is the mechanism that connects template structure to instance structure.

A TemplateInstance is an artifact that records data conforming to a Template. It contains FieldValue and NestedTemplateInstance constructs keyed by EmbeddedArtifactKey, corresponding to the data-bearing embedded artifacts of the referenced template.
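The role of EmbeddedArtifactKey as the join between template structure and instance structure can be illustrated with a minimal sketch. The `EmbeddedArtifact` class, its `form` strings, and the `unknown_instance_keys` helper are hypothetical simplifications introduced for illustration; they are not part of the specification or of any binding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmbeddedArtifact:
    """Simplified stand-in for an EmbeddedArtifact (hypothetical shape)."""
    key: str   # the EmbeddedArtifactKey, local to the containing template
    form: str  # "field", "template", or "presentation" (illustrative labels)


def unknown_instance_keys(embedded: list[EmbeddedArtifact],
                          instance_values: dict[str, object]) -> list[str]:
    """Return instance keys that match no data-bearing embedded artifact."""
    # Presentation components contribute no instance data, so their keys
    # are excluded from the set of keys an instance may use.
    data_bearing = {e.key for e in embedded if e.form != "presentation"}
    return sorted(k for k in instance_values if k not in data_bearing)


template_children = [
    EmbeddedArtifact("title", "field"),
    EmbeddedArtifact("intro_text", "presentation"),
    EmbeddedArtifact("address", "template"),
]
instance = {
    "title": "Study A",
    "address": {"city": "Palo Alto"},
    "intro_text": "should not be here",
}

print(unknown_instance_keys(template_children, instance))  # ['intro_text']
```

The sketch checks key membership only; the normative validation algorithm is defined in validation.md.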

The diagram below sketches how the principal categories connect at runtime. Schema-side classes (definitions) are on the right; instance-side classes (data records) are on the left. The horizontal arrows show the two cross-side links: a TemplateInstance is bound to its Template by IRI (templateRef), and each FieldValue is joined to its corresponding EmbeddedField by an EmbeddedArtifactKey. The schema-side downward chain (Template → EmbeddedField → Field → FieldSpec) is the structural surface a template author defines; the instance-side downward chain (TemplateInstance → FieldValue → Value) is the runtime data the schema admits.

[Figure: relationships between Template, EmbeddedField, Field, FieldSpec, TemplateInstance, FieldValue, and Value]

For the within-Field typed-variant hierarchy (the 20 concrete field families and their abstract groupings), see the next section.

Field Hierarchy

The diagram below shows the complete Field hierarchy and the FieldSpec each concrete field variant carries.

classDiagram
  class Field {
    <<abstract>>
  }
  class NumericField {
    <<abstract>>
  }
  class TemporalField {
    <<abstract>>
  }
  class EnumField {
    <<abstract>>
  }
  class ContactField {
    <<abstract>>
  }
  class ExternalAuthorityField {
    <<abstract>>
  }

  class TextField
  class DateField
  class TimeField
  class DateTimeField
  class ControlledTermField
  class SingleValuedEnumField
  class MultiValuedEnumField
  class LinkField
  class EmailField
  class PhoneNumberField
  class OrcidField
  class RorField
  class DoiField
  class PubMedIdField
  class RridField
  class NihGrantIdField
  class AttributeValueField
  class IntegerNumberField
  class RealNumberField
  class BooleanField

  class TextFieldSpec
  class IntegerNumberFieldSpec
  class RealNumberFieldSpec
  class BooleanFieldSpec
  class DateFieldSpec
  class TimeFieldSpec
  class DateTimeFieldSpec
  class ControlledTermFieldSpec
  class SingleValuedEnumFieldSpec
  class MultiValuedEnumFieldSpec
  class LinkFieldSpec
  class EmailFieldSpec
  class PhoneNumberFieldSpec
  class OrcidFieldSpec
  class RorFieldSpec
  class DoiFieldSpec
  class PubMedIdFieldSpec
  class RridFieldSpec
  class NihGrantIdFieldSpec
  class AttributeValueFieldSpec

  Field <|-- TextField
  Field <|-- NumericField
  Field <|-- BooleanField
  Field <|-- TemporalField
  Field <|-- ControlledTermField
  Field <|-- EnumField
  Field <|-- LinkField
  Field <|-- ContactField
  Field <|-- ExternalAuthorityField
  Field <|-- AttributeValueField

  NumericField <|-- IntegerNumberField
  NumericField <|-- RealNumberField

  TemporalField <|-- DateField
  TemporalField <|-- TimeField
  TemporalField <|-- DateTimeField

  EnumField <|-- SingleValuedEnumField
  EnumField <|-- MultiValuedEnumField

  ContactField <|-- EmailField
  ContactField <|-- PhoneNumberField

  ExternalAuthorityField <|-- OrcidField
  ExternalAuthorityField <|-- RorField
  ExternalAuthorityField <|-- DoiField
  ExternalAuthorityField <|-- PubMedIdField
  ExternalAuthorityField <|-- RridField
  ExternalAuthorityField <|-- NihGrantIdField

  TextField --> TextFieldSpec : carries
  IntegerNumberField --> IntegerNumberFieldSpec : carries
  RealNumberField --> RealNumberFieldSpec : carries
  BooleanField --> BooleanFieldSpec : carries
  DateField --> DateFieldSpec : carries
  TimeField --> TimeFieldSpec : carries
  DateTimeField --> DateTimeFieldSpec : carries
  ControlledTermField --> ControlledTermFieldSpec : carries
  SingleValuedEnumField --> SingleValuedEnumFieldSpec : carries
  MultiValuedEnumField --> MultiValuedEnumFieldSpec : carries
  LinkField --> LinkFieldSpec : carries
  EmailField --> EmailFieldSpec : carries
  PhoneNumberField --> PhoneNumberFieldSpec : carries
  OrcidField --> OrcidFieldSpec : carries
  RorField --> RorFieldSpec : carries
  DoiField --> DoiFieldSpec : carries
  PubMedIdField --> PubMedIdFieldSpec : carries
  RridField --> RridFieldSpec : carries
  NihGrantIdField --> NihGrantIdFieldSpec : carries
  AttributeValueField --> AttributeValueFieldSpec : carries
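As a cross-check on the diagram above, the hierarchy can be transcribed into plain data. The sketch below is illustrative only: the `FIELD_GROUPS` and `STANDALONE_FIELDS` names and the two helper functions are hypothetical, while the family names are copied from the diagram.

```python
# Abstract groupings and their concrete members, transcribed from the diagram.
FIELD_GROUPS: dict[str, list[str]] = {
    "NumericField": ["IntegerNumberField", "RealNumberField"],
    "TemporalField": ["DateField", "TimeField", "DateTimeField"],
    "EnumField": ["SingleValuedEnumField", "MultiValuedEnumField"],
    "ContactField": ["EmailField", "PhoneNumberField"],
    "ExternalAuthorityField": [
        "OrcidField", "RorField", "DoiField",
        "PubMedIdField", "RridField", "NihGrantIdField",
    ],
}

# Concrete families that sit directly under Field.
STANDALONE_FIELDS = [
    "TextField", "BooleanField", "ControlledTermField",
    "LinkField", "AttributeValueField",
]


def concrete_fields() -> list[str]:
    """List every concrete field family exactly once."""
    out = list(STANDALONE_FIELDS)
    for members in FIELD_GROUPS.values():
        out.extend(members)
    return out


def field_spec_name(field_name: str) -> str:
    """Each concrete XxxField carries the matching XxxFieldSpec."""
    if field_name not in concrete_fields():
        raise ValueError(f"{field_name} is not a concrete field family")
    return field_name + "Spec"


print(len(concrete_fields()))        # 20
print(field_spec_name("DateField"))  # DateFieldSpec
```

Abstract groupings such as NumericField are rejected by `field_spec_name` because only concrete families carry a FieldSpec.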

Layered Specification

The CEDAR Template Model is specified across four normative chapters, each with a different concern:

| Chapter | Concern |
| --- | --- |
| grammar.md | The abstract grammar: the productions, the categories, and the structural relationships that constitute the model. The authoritative definition. |
| wire-grammar.md | The JSON wire form: the concrete shape every production takes when encoded as JSON, plus the encoding rules (kind discriminator, wrapper collapse, property names). |
| serialization.md | Encoding and decoding semantics: round-tripping, the error model, NFC normalisation, integer-string fallback, default-value semantics. |
| bindings.md | Host-language idioms for TypeScript, Java, and Python, plus codebase-organisation guidance. |

A reusable conformance test suite accompanies the specification, embedded into serialization.md §8 via mdBook {{#include}}. It defines a cross-binding acceptance contract.

Cross-Cutting Conventions

A few structural conventions thread through every chapter:

  • The kind discriminator. Every member of a kind-discriminated union (e.g. every Field family, every Value family, every EmbeddedField family) carries a kind property identifying its production, at every position it occupies on the wire. Productions that are not members of any kind-discriminated union (Cardinality, Annotation, LabelOverride, Property, etc.) never carry kind. The rule is uniform; see wire-grammar.md §1.5.
  • Two-layer default values. Every concrete field family except AttributeValueField carries two layers of optional default value: a field-level default on the reusable Field’s FieldSpec, and an embedding-level default on the EmbeddedXxxField inside a Template. The embedding-level default overrides the field-level default when both are present. Defaults are UI/UX initialisation only — they do not appear in TemplateInstance artifacts and do not affect the RDF projection. See grammar.md §Defaults.
  • Pinned lexical-form productions. The grammar’s primitive string types (SemanticVersion, IriString, Bcp47Tag, Iso8601DateTimeLexicalForm, AsciiIdentifier, IntegerLexicalForm) are normatively pinned to specific external specifications and regular expressions. See grammar.md §Primitive String Types.
  • The error model. Conforming decoders and encoders report errors in three normatively-defined categories — wireShape, lexical, and structural — each with a JSON-pointer path locating the offending slot. See serialization.md §9.
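The precedence rule for two-layer default values can be stated in a few lines. This is a minimal sketch: the `initial_value` function and its parameter names are hypothetical, and default values are modelled as plain strings rather than the family-typed values the grammar defines.

```python
from typing import Optional


def initial_value(field_default: Optional[str],
                  embedding_default: Optional[str]) -> Optional[str]:
    """Pick the value used to pre-populate an input at render time.

    The embedding-level default overrides the field-level default when
    both are present. The result initialises the UI only; it is not a
    statement about what a TemplateInstance contains.
    """
    if embedding_default is not None:
        return embedding_default
    return field_default


print(initial_value("N/A", "Unknown"))  # Unknown
print(initial_value("N/A", None))       # N/A
print(initial_value(None, None))        # None
```

AttributeValueField is outside the scope of this sketch, since it is the one family that carries no default value.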

Abstract Grammar

This section defines the abstract structure of the CEDAR Template Model using an EBNF-style grammar.

The grammar defines the abstract syntactic structure of the model. It specifies the kinds of constructs that exist and how they are composed, but it does not define a concrete textual or data serialization such as JSON, YAML, RDF, or a functional-style syntax.

Accordingly, a production in this grammar describes abstract structure rather than a directly parseable text form. In particular, a production such as Template ::= template( ... ) does not mean:

  • the literal token template must appear in a file
  • parentheses must appear in a file
  • whitespace must be used in a particular way in a file
  • the production is itself a concrete serialization format

The following notation is used throughout this grammar:

::=    defined as
|      alternative production
X*     zero or more occurrences of X
X+     one or more occurrences of X
[X]    optional occurrence of X
(...)  groups the named components of an abstract constructor form

Whitespace separates symbols within a production.

Production names use UpperCamelCase. A production name denotes the abstract category being defined, such as Template, Field, or DateFieldSpec.

Abstract constructor forms use lower_snake_case. In this document, a constructor form is the schematic form used to show how an abstract construct is composed, such as template(...), field(...), or date_field_spec(...). The difference between UpperCamelCase production names and lower_snake_case constructor forms is purely a visual distinction used to make it clear when the grammar is naming a category and when it is showing the abstract form of a construct belonging to that category.

For example, in the production

Template ::= template(
               TemplateId
               ModelVersion
               CatalogMetadata
               SchemaArtifactVersioning
               Title
               [TemplateRenderingHint]
               [Header]
               [Footer]
               EmbeddedArtifact*
             )

Template is the production being defined, while template(...) denotes the abstract constructor form of that construct; in other words, it shows the components of a Template and how they are composed.
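To make the distinction between abstract structure and concrete serialisation tangible, the sketch below holds one construct in memory and renders it two different ways. The `Template` dataclass is a deliberately reduced, hypothetical shape (three components with invented Python names), and neither rendering is the CEDAR JSON wire form defined in wire-grammar.md.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Template:
    """Reduced, hypothetical in-memory form of the abstract Template construct."""
    template_id: str
    title: dict[str, str]  # language tag -> localised text
    embedded_artifacts: list[str] = field(default_factory=list)


t = Template("https://example.org/templates/1", {"en": "Study"}, ["title", "pi"])

# Two unrelated concrete renderings of the same abstract structure.
as_json = json.dumps(asdict(t), sort_keys=True)
as_lines = "\n".join(f"{k}: {v}" for k, v in asdict(t).items())

# Neither rendering contains the constructor token used in the grammar.
print("template(" in as_json, "template(" in as_lines)  # False False
```

The constructor form in the grammar names the components and their composition; how they appear in a file is left to a separate encoding layer.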

A conceptual overview of the model — describing the principal categories, their relationships, and the design rationale behind key decisions — is provided in spec/metamodel.md. The present document is the normative formal specification.

Kernel Grammar

The kernel grammar defines the primary abstract categories of the model and the core schema-level structure that connects them. It introduces reusable schema artifacts, templates, and the embedding constructs through which templates assemble fields, nested templates, and presentation components. Subsequent sections refine the metadata, field-spec families, instance structures, and supporting constructs referenced here.

The diagram below gives an overview of the kernel. Template is the central container: it holds an ordered sequence of EmbeddedArtifact constructs, each of which contextualises a reusable artifact — a Field, a nested Template, or a PresentationComponent — within that specific template. A TemplateInstance records data conforming to a Template. Concrete Field variants and FieldSpec configurations are omitted for clarity.

%%{init: {'themeVariables': {'fontSize': '12px'}}}%%
classDiagram
  class Artifact {
    <<abstract>>
  }
  class SchemaArtifact {
    <<abstract>>
  }
  class Field {
    <<abstract>>
  }
  class Template {
    TemplateId
    ModelVersion
    CatalogMetadata
    SchemaArtifactVersioning
    Title
    [TemplateRenderingHint]
    [Header]
    [Footer]
  }
  class PresentationComponent {
    PresentationComponentId
    ModelVersion
    CatalogMetadata
  }
  class TemplateInstance {
    TemplateInstanceId
    ModelVersion
    CatalogMetadata
  }

  class EmbeddedArtifact {
    <<abstract>>
  }
  class EmbeddedField {
    EmbeddedArtifactKey
    [ValueRequirement]
    [Cardinality]
    [Visibility]
    [defaultValue]
    [LabelOverride]
    [HelpTextOverride]
    [Property]
  }
  class EmbeddedTemplate {
    EmbeddedArtifactKey
    [ValueRequirement]
    [Cardinality]
    [Visibility]
    [LabelOverride]
    [Property]
  }
  class EmbeddedPresentationComponent {
    EmbeddedArtifactKey
    [Visibility]
    [LabelOverride]
  }
  class Property {
    PropertyIri
    [PropertyLabel]
  }

  Artifact <|-- SchemaArtifact
  Artifact <|-- PresentationComponent
  Artifact <|-- TemplateInstance

  SchemaArtifact <|-- Field
  SchemaArtifact <|-- Template

  Template "1" *-- "0..*" EmbeddedArtifact : contains ordered

  EmbeddedArtifact <|-- EmbeddedField
  EmbeddedArtifact <|-- EmbeddedTemplate
  EmbeddedArtifact <|-- EmbeddedPresentationComponent

  EmbeddedField --> Field : references
  EmbeddedTemplate --> Template : references
  EmbeddedPresentationComponent --> PresentationComponent : references

  EmbeddedField ..> Property : carries
  EmbeddedTemplate ..> Property : carries

  TemplateInstance --> Template : conforms to

Core Structure

This subsection establishes the top-level taxonomy of the model and introduces its two principal concrete schema artifacts. Artifact is the broadest category, encompassing reusable schema artifacts, presentation components, and template instances. Template is defined here as the central container that organises embedded artifacts into a structured form. Field is introduced as an abstract category whose concrete variants are defined in the following subsection.

Artifact ::= SchemaArtifact
           | PresentationComponent
           | TemplateInstance

SchemaArtifact ::= Field
                 | Template

Template is a concrete schema artifact and the central container of the model. It assembles EmbeddedArtifact constructs into a structured form and defines the schema that TemplateInstance constructs conform to.

Template ::= template(
               TemplateId
               ModelVersion
               CatalogMetadata
               SchemaArtifactVersioning
               Title
               [TemplateRenderingHint]
               [Header]
               [Footer]
               EmbeddedArtifact*
             )

Title ::= title(
            MultilingualString
          )

Label ::= label(
            MultilingualString
          )

Header ::= header(
             MultilingualString
           )

Footer ::= footer(
             MultilingualString
           )

TemplateRenderingHint ::= template_rendering_hint(
                            [HelpDisplayMode]
                          )

HelpDisplayMode ::= "inline" | "tooltip" | "both" | "none"

Header and Footer denote optional human-readable textual content displayed at the top and bottom of a rendered template respectively. Each is a MultilingualString carrying one or more language-tagged localizations of the same conceptual text.

TemplateRenderingHint carries form-level UX configuration. Distinct from the per-field-spec RenderingHint family, which configures how a single field is rendered, TemplateRenderingHint configures behaviour that applies to the form as a whole. Currently the only slot is HelpDisplayMode; future revisions may add further form-level UX switches, each with its own cascade rule for embedded templates.

HelpDisplayMode selects how field HelpText — and any per-embedding HelpTextOverride — is presented at form-render time:

  • "inline" — HelpText renders as visible text adjacent to the field, typically beneath the input.
  • "tooltip" — HelpText renders as a hover/focus tooltip, triggered by a ? icon or similar affordance.
  • "both" — both presentations are emitted. Useful for accessibility contexts where redundancy is preferred.
  • "none" — the field’s HelpText is not displayed at form-render time. The content remains part of the model (visible to alternative renderers, to the RDF projection, and to catalog displays) but the form-rendering layer suppresses it.

When HelpDisplayMode is absent — either because the Template carries no TemplateRenderingHint, or because the hint omits the slot — the default behaviour is "inline".

The cascade rule for nested templates is a rendering-time concern, not a structural validation constraint, and is normatively stated in presentation.md: when a Template is embedded inside another Template, the inner template’s HelpDisplayMode is ignored for help-text rendering; the enclosing template’s setting applies to every field within the rendered form, including fields contributed by nested templates. The inner template’s own HelpDisplayMode applies only when the template is rendered standalone.
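The default and the cascade rule described above can be sketched as a small resolver. The function names and the list-based representation of nesting are hypothetical. The sketch also assumes that, with more than one level of nesting, the outermost rendered template's setting governs, and that an enclosing template with no declared mode falls back to "inline"; presentation.md is the normative source for both points.

```python
from typing import Optional

HELP_DISPLAY_MODES = {"inline", "tooltip", "both", "none"}
DEFAULT_HELP_DISPLAY_MODE = "inline"


def own_mode(declared: Optional[str]) -> str:
    """Resolve a template's own mode; an absent slot falls back to "inline"."""
    if declared is None:
        return DEFAULT_HELP_DISPLAY_MODE
    if declared not in HELP_DISPLAY_MODES:
        raise ValueError(f"unknown HelpDisplayMode: {declared!r}")
    return declared


def effective_mode(template_chain: list[Optional[str]]) -> str:
    """Resolve the mode applied to a field at render time.

    template_chain lists the declared modes from the outermost rendered
    template down to the template that directly contains the field.
    Inner settings are ignored for help-text rendering.
    """
    if not template_chain:
        raise ValueError("at least one template is required")
    return own_mode(template_chain[0])


print(effective_mode(["tooltip", "none"]))  # tooltip
print(effective_mode([None, "both"]))       # inline
print(effective_mode(["none"]))             # none
```

A chain of length one corresponds to standalone rendering, where the template's own HelpDisplayMode applies.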

The following productions introduce the abstract field categories. Field remains an abstract category, while the intermediate categories group related concrete field artifacts for readability and shared semantics.

Field ::= TextField
        | NumericField
        | BooleanField
        | TemporalField
        | ControlledTermField
        | EnumField
        | LinkField
        | ContactField
        | ExternalAuthorityField
        | AttributeValueField

NumericField ::= IntegerNumberField
               | RealNumberField

TemporalField ::= DateField
                | TimeField
                | DateTimeField

EnumField ::= SingleValuedEnumField
            | MultiValuedEnumField

ContactField ::= EmailField
               | PhoneNumberField

ExternalAuthorityField ::= OrcidField
                         | RorField
                         | DoiField
                         | PubMedIdField
                         | RridField
                         | NihGrantIdField

Concrete Field Artifacts

Each concrete Field variant carries six components: a typed artifact identifier that permanently identifies the reusable field; a ModelVersion identifying the version of the CEDAR structural model the artifact conforms to; CatalogMetadata providing the descriptive, lifecycle, and annotation metadata used in catalog and registry contexts; SchemaArtifactVersioning providing the version, status, and lineage information common to all schema artifacts; a typed FieldSpec that specifies the value semantics and configuration for that field category; and a Label that carries the rendered question text shown to users at data-entry time. The identifier, FieldSpec, and Label are specific to each concrete variant; ModelVersion, CatalogMetadata, and SchemaArtifactVersioning are uniform across all fields. Each concrete Field MAY additionally carry an optional HelpText. The groupings below mirror the abstract Field hierarchy defined in Core Structure.
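The six required components and the optional HelpText can be mirrored in a host-language record. The sketch below is hypothetical: the attribute names are invented, and every component is typed as a placeholder (`str` or `dict`) rather than the structured types the grammar defines. It is not the binding shape described in bindings.md.

```python
from dataclasses import MISSING, dataclass, fields
from typing import Optional


@dataclass
class TextField:
    """Placeholder mirror of a concrete field artifact."""
    text_field_id: str                # typed artifact identifier
    model_version: str                # ModelVersion
    catalog_metadata: dict            # CatalogMetadata
    versioning: dict                  # SchemaArtifactVersioning
    field_spec: dict                  # TextFieldSpec
    label: dict                       # Label
    help_text: Optional[dict] = None  # optional HelpText


# Components without a default are the required ones.
required = [
    f.name for f in fields(TextField)
    if f.default is MISSING and f.default_factory is MISSING
]
print(len(required))            # 6
print("help_text" in required)  # False
```

The other concrete families would follow the same pattern, differing only in the identifier and FieldSpec types.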

TextField, BooleanField, and the two numeric field families (IntegerNumberField and RealNumberField) are the simple scalar field families. Each carries the most basic value semantics: free text, true/false, exact integer values, and real-valued numbers respectively.

TextField ::= text_field(
               TextFieldId
               ModelVersion
               CatalogMetadata
               SchemaArtifactVersioning
               TextFieldSpec
               Label
               [HelpText]
             )

BooleanField ::= boolean_field(
                  BooleanFieldId
                  ModelVersion
                  CatalogMetadata
                  SchemaArtifactVersioning
                  BooleanFieldSpec
                  Label
                  [HelpText]
                )

The numeric field variants correspond to the NumericField abstract category. They share the broader concept of numeric content but split semantically: IntegerNumberField carries arbitrary-precision integer values (no fractional part); RealNumberField carries real-valued numbers (arbitrary-precision decimal, or IEEE 754 single- or double-precision floating point). The split is principled: integer arithmetic is exact and closed under the usual operations, whereas real-valued arithmetic carries approximation concerns. See Field Specs for the per-family configuration.

IntegerNumberField ::= integer_number_field(
                         IntegerNumberFieldId
                         ModelVersion
                         CatalogMetadata
                         SchemaArtifactVersioning
                         IntegerNumberFieldSpec
                         Label
                         [HelpText]
                       )

RealNumberField ::= real_number_field(
                      RealNumberFieldId
                      ModelVersion
                      CatalogMetadata
                      SchemaArtifactVersioning
                      RealNumberFieldSpec
                      Label
                      [HelpText]
                    )
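The rationale for splitting integer and real-valued fields can be observed directly in a host language. The sketch below uses Python's built-in `int`, `decimal.Decimal`, and `float` as stand-ins for arbitrary-precision integers, arbitrary-precision decimals, and IEEE 754 double precision; these are analogies for illustration, not the datatypes the specification assigns to each family.

```python
from decimal import Decimal

# Arbitrary-precision integers: arithmetic is exact at any magnitude.
big = 2**64 + 1
print(big - 2**64)  # 1

# Binary floating point: 0.1 and 0.2 have no exact representation.
print(0.1 + 0.2 == 0.3)  # False

# Decimal arithmetic on the same lexical forms is exact.
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))  # True

# Converting a large integer to double precision silently loses the low bits.
print(float(big) == float(2**64))  # True
```

The last line shows why an integer-valued field should not be round-tripped through a floating-point representation.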

The temporal field variants correspond to the TemporalField abstract category. Each is typed to a distinct temporal semantic — date, time of day, or combined date-time — and carries its own FieldSpec with precision and rendering options appropriate to that category.

DateField ::= date_field(
               DateFieldId
               ModelVersion
               CatalogMetadata
               SchemaArtifactVersioning
               DateFieldSpec
               Label
               [HelpText]
             )

TimeField ::= time_field(
               TimeFieldId
               ModelVersion
               CatalogMetadata
               SchemaArtifactVersioning
               TimeFieldSpec
               Label
               [HelpText]
             )

DateTimeField ::= date_time_field(
                   DateTimeFieldId
                   ModelVersion
                   CatalogMetadata
                   SchemaArtifactVersioning
                   DateTimeFieldSpec
                   Label
                   [HelpText]
                 )

ControlledTermField supports values drawn from declared ontology sources. LinkField carries a single IRI-valued hyperlink.

ControlledTermField ::= controlled_term_field(
                          ControlledTermFieldId
                          ModelVersion
                          CatalogMetadata
                          SchemaArtifactVersioning
                          ControlledTermFieldSpec
                          Label
                          [HelpText]
                        )

LinkField ::= link_field(
               LinkFieldId
               ModelVersion
               CatalogMetadata
               SchemaArtifactVersioning
               LinkFieldSpec
               Label
               [HelpText]
             )

SingleValuedEnumField and MultiValuedEnumField are the two concrete variants of the EnumField abstract category. They differ in whether they permit exactly one selection or multiple simultaneous selections from a declared set of permissible values. The permitted values are declared in the corresponding FieldSpec (SingleValuedEnumFieldSpec or MultiValuedEnumFieldSpec), and instance values are validated against that set.

SingleValuedEnumField ::= single_valued_enum_field(
                            SingleValuedEnumFieldId
                            ModelVersion
                            CatalogMetadata
                            SchemaArtifactVersioning
                            SingleValuedEnumFieldSpec
                            Label
                            [HelpText]
                          )

MultiValuedEnumField ::= multi_valued_enum_field(
                           MultiValuedEnumFieldId
                           ModelVersion
                           CatalogMetadata
                           SchemaArtifactVersioning
                           MultiValuedEnumFieldSpec
                           Label
                           [HelpText]
                         )
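The difference between the two enum families, together with the instance-level check against the permitted values, can be sketched as follows. The `enum_errors` function, its parameters, and its message strings are hypothetical. The sketch does not address whether an empty selection is acceptable for a multi-valued field, which would depend on ValueRequirement and Cardinality, and its messages are not the error reports defined in validation.md.

```python
def enum_errors(selected: list[str],
                permitted: set[str],
                multi_valued: bool) -> list[str]:
    """Check a selection against a declared set of permitted values."""
    errors = []
    # A single-valued enum admits exactly one selection.
    if not multi_valued and len(selected) != 1:
        errors.append(f"expected exactly one selection, got {len(selected)}")
    # Every selected value must come from the declared set.
    for value in selected:
        if value not in permitted:
            errors.append(f"value {value!r} is not a permitted value")
    return errors


colours = {"red", "green", "blue"}

print(enum_errors(["red"], colours, multi_valued=False))
# []
print(enum_errors(["red", "blue"], colours, multi_valued=False))
# ['expected exactly one selection, got 2']
print(enum_errors(["red", "teal"], colours, multi_valued=True))
# ["value 'teal' is not a permitted value"]
```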

The contact field variants correspond to the ContactField abstract category and represent human contact identifiers.

EmailField ::= email_field(
                EmailFieldId
                ModelVersion
                CatalogMetadata
                SchemaArtifactVersioning
                EmailFieldSpec
                Label
                [HelpText]
              )

PhoneNumberField ↗ EmbeddedPhoneNumberField ::= phone_number_field(
                      PhoneNumberFieldId
                      ModelVersion
                      CatalogMetadata
                      SchemaArtifactVersioning
                      PhoneNumberFieldSpec
                      Label
                      [HelpText]
                    )

The external authority field variants correspond to the ExternalAuthorityField abstract category. Each represents an identifier issued by a specific external authority system, as described in the External Authority Values section. Each external authority field is associated with format validation specific to its identifier scheme and supports integration with the corresponding resolution service for identifier lookup and verification.

OrcidField ↗ EmbeddedOrcidField ::= orcid_field(
                OrcidFieldId
                ModelVersion
                CatalogMetadata
                SchemaArtifactVersioning
                OrcidFieldSpec
                Label
                [HelpText]
              )

RorField ↗ EmbeddedRorField ::= ror_field(
              RorFieldId
              ModelVersion
              CatalogMetadata
              SchemaArtifactVersioning
              RorFieldSpec
              Label
              [HelpText]
            )

DoiField ↗ EmbeddedDoiField ::= doi_field(
              DoiFieldId
              ModelVersion
              CatalogMetadata
              SchemaArtifactVersioning
              DoiFieldSpec
              Label
              [HelpText]
            )

PubMedIdField ↗ EmbeddedPubMedIdField ::= pub_med_id_field(
                    PubMedIdFieldId
                    ModelVersion
                    CatalogMetadata
                    SchemaArtifactVersioning
                    PubMedIdFieldSpec
                    Label
                    [HelpText]
                  )

RridField ↗ EmbeddedRridField ::= rrid_field(
               RridFieldId
               ModelVersion
               CatalogMetadata
               SchemaArtifactVersioning
               RridFieldSpec
               Label
               [HelpText]
             )

NihGrantIdField ↗ EmbeddedNihGrantIdField ::= nih_grant_id_field(
                     NihGrantIdFieldId
                     ModelVersion
                     CatalogMetadata
                     SchemaArtifactVersioning
                     NihGrantIdFieldSpec
                     Label
                     [HelpText]
                   )

AttributeValueField supports open-ended name-value pair data whose attribute names are not fixed at schema definition time.

AttributeValueField ↗ EmbeddedAttributeValueField ::= attribute_value_field(
                          AttributeValueFieldId
                          ModelVersion
                          CatalogMetadata
                          SchemaArtifactVersioning
                          AttributeValueFieldSpec
                          Label
                          [HelpText]
                        )

The concrete field artifacts defined above are reusable schema-level constructs. A reusable Field deliberately does not carry template-local keying, cardinality, visibility, or label override — those properties belong to the embedding context, not to the reusable artifact. To appear within a Template, each field must be included via an Embedded Artifacts construct, which adds that template-local context and governs how the field participates in that specific template.

Each concrete Field artifact MAY carry a HelpText slot. HelpText is authored guidance about what the field is asking for and how to answer: text typically rendered alongside the field at form-render time as inline help, as a hover tooltip, or both, controlled by the enclosing Template’s HelpDisplayMode. HelpText is distinct from Description: Description is the artifact-catalog explanation seen when browsing the field registry; HelpText is the guidance, written by the form author, that is seen at data-entry time. The two roles often share text but serve different audiences.

HelpText ::= help_text( MultilingualString )

HelpText carries a MultilingualString value: localized authored guidance that may be presented in one or more natural languages. The enclosing Template’s HelpDisplayMode selects the presentation; absence of HelpDisplayMode defaults to "inline" rendering. The "none" arm suppresses rendering but preserves the content in the model (visible to alternative renderers, RDF projection, and catalog displays).

A per-embedding override is also defined: an EmbeddedField MAY carry an optional HelpTextOverride that replaces the field’s canonical HelpText at that embedding site only, mirroring the existing LabelOverride precedent. See Embedded Artifacts for the embedding-site shape.
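The interaction between HelpText, HelpTextOverride, and HelpDisplayMode can be sketched as a small resolution function. This is an illustrative Python sketch under stated assumptions: the function name, the dictionary result, and the modelling of a MultilingualString as a language-tag-to-text mapping are all hypothetical. Only the "inline" default and the "none" arm are taken from the text above.

```python
from typing import Dict, Optional

# Assumption: a MultilingualString is modelled as {language tag: text}.
MultilingualString = Dict[str, str]


def resolve_help(
    field_help: Optional[MultilingualString],
    help_override: Optional[MultilingualString],
    display_mode: Optional[str] = None,
) -> dict:
    """Pick the effective help text and decide whether it is rendered."""
    # The override, when present, replaces the field's canonical HelpText
    # at this embedding site only.
    text = help_override if help_override is not None else field_help
    # An absent HelpDisplayMode defaults to "inline".
    mode = display_mode if display_mode is not None else "inline"
    # "none" suppresses rendering but keeps the content in the model.
    render = mode != "none" and text is not None
    return {"mode": mode, "text": text, "render": render}
```

Note that the "none" case still returns the text, so alternative renderers and catalog displays can use it.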

Embedded Artifacts

An EmbeddedArtifact contextualises a reusable artifact within a specific Template, adding template-local properties that govern how the artifact participates in that context. There are three forms: EmbeddedField, which embeds a data-bearing field; EmbeddedTemplate, which nests a template within the containing template; and EmbeddedPresentationComponent, which contributes presentational structure without producing instance data.

The sequence of EmbeddedArtifact constructs within a Template is significant. The order in which they appear determines the presentation order of embedded artifacts in a rendered template. Conforming implementations MUST preserve this order.

EmbeddedArtifact ::= EmbeddedField
                   | EmbeddedTemplate
                   | EmbeddedPresentationComponent

EmbeddedField ::= EmbeddedTextField
                | EmbeddedIntegerNumberField
                | EmbeddedRealNumberField
                | EmbeddedBooleanField
                | EmbeddedDateField
                | EmbeddedTimeField
                | EmbeddedDateTimeField
                | EmbeddedControlledTermField
                | EmbeddedSingleValuedEnumField
                | EmbeddedMultiValuedEnumField
                | EmbeddedLinkField
                | EmbeddedEmailField
                | EmbeddedPhoneNumberField
                | EmbeddedOrcidField
                | EmbeddedRorField
                | EmbeddedDoiField
                | EmbeddedPubMedIdField
                | EmbeddedRridField
                | EmbeddedNihGrantIdField
                | EmbeddedAttributeValueField

Every concrete EmbeddedField variant follows the same structural pattern. Each carries: an EmbeddedArtifactKey uniquely identifying the embedding site within the containing Template; a typed field reference identifying the reusable Field being embedded; an optional ValueRequirement specifying whether a value is required, recommended, or optional; an optional Cardinality bounding the permitted number of values; an optional Visibility controlling whether the field is shown in rendered interfaces; an optional defaultValue providing an embedding-specific default whose type is the family-specific Value type (e.g. TextValue for EmbeddedTextField, DateValue for EmbeddedDateField); an optional LabelOverride allowing the template to override the field’s label in this context; an optional HelpTextOverride replacing the field’s HelpText at this embedding site; and an optional Property associating a semantic property IRI with the embedding site. Apart from the exceptions noted below, the only variations across concrete EmbeddedField variants are the typed field reference and the typed default value, both of which match the value family of the referenced field.

Several variants depart from this pattern. EmbeddedBooleanField and EmbeddedSingleValuedEnumField each omit the [Cardinality] slot. A boolean field is inherently single-valued; its ValueRequirement slot already distinguishes the meaningful states (required, recommended, optional). A SingleValuedEnumField is similarly single-valued by construction; multi-valued enum embedding is expressed only through EmbeddedMultiValuedEnumField. EmbeddedMultiValuedEnumField further differs in that its embedding-level default is a sequence (EnumValue*) rather than a single optional value, parallel to how multi-valued enum instance values appear as a sequence in FieldValue. Finally, the EmbeddedAttributeValueField production carries no default-value slot.

EmbeddedTextField ↗ TextField ::= embedded_text_field(
                        EmbeddedArtifactKey
                        TextFieldId
                        [ValueRequirement]
                        [Cardinality]
                        [Visibility]
                        [TextValue]
                        [LabelOverride]
                        [HelpTextOverride]
                        [Property]
                      )

EmbeddedIntegerNumberField ↗ IntegerNumberField ::= embedded_integer_number_field(
                                 EmbeddedArtifactKey
                                 IntegerNumberFieldId
                                 [ValueRequirement]
                                 [Cardinality]
                                 [Visibility]
                                 [IntegerNumberValue]
                                 [LabelOverride]
                                 [HelpTextOverride]
                                 [Property]
                               )

EmbeddedRealNumberField ↗ RealNumberField ::= embedded_real_number_field(
                              EmbeddedArtifactKey
                              RealNumberFieldId
                              [ValueRequirement]
                              [Cardinality]
                              [Visibility]
                              [RealNumberValue]
                              [LabelOverride]
                              [HelpTextOverride]
                              [Property]
                            )

EmbeddedBooleanField ↗ BooleanField ::= embedded_boolean_field(
                           EmbeddedArtifactKey
                           BooleanFieldId
                           [ValueRequirement]
                           [Visibility]
                           [BooleanValue]
                           [LabelOverride]
                           [HelpTextOverride]
                           [Property]
                         )

EmbeddedDateField ↗ DateField ::= embedded_date_field(
                        EmbeddedArtifactKey
                        DateFieldId
                        [ValueRequirement]
                        [Cardinality]
                        [Visibility]
                        [DateValue]
                        [LabelOverride]
                        [HelpTextOverride]
                        [Property]
                      )

EmbeddedTimeField ↗ TimeField ::= embedded_time_field(
                        EmbeddedArtifactKey
                        TimeFieldId
                        [ValueRequirement]
                        [Cardinality]
                        [Visibility]
                        [TimeValue]
                        [LabelOverride]
                        [HelpTextOverride]
                        [Property]
                      )

EmbeddedDateTimeField ↗ DateTimeField ::= embedded_date_time_field(
                            EmbeddedArtifactKey
                            DateTimeFieldId
                            [ValueRequirement]
                            [Cardinality]
                            [Visibility]
                            [DateTimeValue]
                            [LabelOverride]
                            [HelpTextOverride]
                            [Property]
                          )

EmbeddedControlledTermField ↗ ControlledTermField ::= embedded_controlled_term_field(
                                  EmbeddedArtifactKey
                                  ControlledTermFieldId
                                  [ValueRequirement]
                                  [Cardinality]
                                  [Visibility]
                                  [ControlledTermValue]
                                  [LabelOverride]
                                  [HelpTextOverride]
                                  [Property]
                                )

EmbeddedSingleValuedEnumField ↗ SingleValuedEnumField ::= embedded_single_valued_enum_field(
                                    EmbeddedArtifactKey
                                    SingleValuedEnumFieldId
                                    [ValueRequirement]
                                    [Visibility]
                                    [EnumValue]
                                    [LabelOverride]
                                    [HelpTextOverride]
                                    [Property]
                                  )

EmbeddedMultiValuedEnumField ↗ MultiValuedEnumField ::= embedded_multi_valued_enum_field(
                                   EmbeddedArtifactKey
                                   MultiValuedEnumFieldId
                                   [ValueRequirement]
                                   [Cardinality]
                                   [Visibility]
                                   EnumValue*
                                   [LabelOverride]
                                   [HelpTextOverride]
                                   [Property]
                                 )

EmbeddedLinkField ↗ LinkField ::= embedded_link_field(
                        EmbeddedArtifactKey
                        LinkFieldId
                        [ValueRequirement]
                        [Cardinality]
                        [Visibility]
                        [LinkValue]
                        [LabelOverride]
                        [HelpTextOverride]
                        [Property]
                      )

EmbeddedEmailField ↗ EmailField ::= embedded_email_field(
                         EmbeddedArtifactKey
                         EmailFieldId
                         [ValueRequirement]
                         [Cardinality]
                         [Visibility]
                         [EmailValue]
                         [LabelOverride]
                         [HelpTextOverride]
                         [Property]
                       )

EmbeddedPhoneNumberField ↗ PhoneNumberField ::= embedded_phone_number_field(
                               EmbeddedArtifactKey
                               PhoneNumberFieldId
                               [ValueRequirement]
                               [Cardinality]
                               [Visibility]
                               [PhoneNumberValue]
                               [LabelOverride]
                               [HelpTextOverride]
                               [Property]
                             )

EmbeddedOrcidField ↗ OrcidField ::= embedded_orcid_field(
                         EmbeddedArtifactKey
                         OrcidFieldId
                         [ValueRequirement]
                         [Cardinality]
                         [Visibility]
                         [OrcidValue]
                         [LabelOverride]
                         [HelpTextOverride]
                         [Property]
                       )

EmbeddedRorField ↗ RorField ::= embedded_ror_field(
                       EmbeddedArtifactKey
                       RorFieldId
                       [ValueRequirement]
                       [Cardinality]
                       [Visibility]
                       [RorValue]
                       [LabelOverride]
                       [HelpTextOverride]
                       [Property]
                     )

EmbeddedDoiField ↗ DoiField ::= embedded_doi_field(
                       EmbeddedArtifactKey
                       DoiFieldId
                       [ValueRequirement]
                       [Cardinality]
                       [Visibility]
                       [DoiValue]
                       [LabelOverride]
                       [HelpTextOverride]
                       [Property]
                     )

EmbeddedPubMedIdField ↗ PubMedIdField ::= embedded_pub_med_id_field(
                            EmbeddedArtifactKey
                            PubMedIdFieldId
                            [ValueRequirement]
                            [Cardinality]
                            [Visibility]
                            [PubMedIdValue]
                            [LabelOverride]
                            [HelpTextOverride]
                            [Property]
                          )

EmbeddedRridField ↗ RridField ::= embedded_rrid_field(
                        EmbeddedArtifactKey
                        RridFieldId
                        [ValueRequirement]
                        [Cardinality]
                        [Visibility]
                        [RridValue]
                        [LabelOverride]
                        [HelpTextOverride]
                        [Property]
                      )

EmbeddedNihGrantIdField ↗ NihGrantIdField ::= embedded_nih_grant_id_field(
                               EmbeddedArtifactKey
                               NihGrantIdFieldId
                               [ValueRequirement]
                               [Cardinality]
                               [Visibility]
                               [NihGrantIdValue]
                               [LabelOverride]
                               [HelpTextOverride]
                               [Property]
                             )

EmbeddedAttributeValueField ↗ AttributeValueField ::= embedded_attribute_value_field(
                                  EmbeddedArtifactKey
                                  AttributeValueFieldId
                                  [ValueRequirement]
                                  [Cardinality]
                                  [Visibility]
                                  [LabelOverride]
                                  [HelpTextOverride]
                                  [Property]
                                )
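To make the shared slot pattern and its exceptions concrete, here is a hedged Python sketch of three embedded field shapes. The class and attribute names, the Cardinality shape (min_count, max_count), and the use of plain strings for values, labels, and IRIs are simplifying assumptions for illustration; they are not the host-language binding defined by this specification.

```python
from dataclasses import dataclass, fields
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Cardinality:
    # Hypothetical shape: the Cardinality production is defined elsewhere.
    min_count: int = 0
    max_count: Optional[int] = None  # None stands for "unbounded" in this sketch


@dataclass(frozen=True)
class EmbeddedTextField:
    key: str                                  # EmbeddedArtifactKey
    artifact_ref: str                         # TextFieldId, held as an IRI string
    value_requirement: Optional[str] = None   # "required" | "recommended" | "optional"
    cardinality: Optional[Cardinality] = None
    visibility: Optional[str] = None
    default_value: Optional[str] = None       # stands in for TextValue
    label_override: Optional[str] = None
    help_text_override: Optional[str] = None
    property_iri: Optional[str] = None        # Property


@dataclass(frozen=True)
class EmbeddedBooleanField:
    key: str
    artifact_ref: str                         # BooleanFieldId
    value_requirement: Optional[str] = None
    visibility: Optional[str] = None          # note: no cardinality slot
    default_value: Optional[bool] = None      # stands in for BooleanValue
    label_override: Optional[str] = None
    help_text_override: Optional[str] = None
    property_iri: Optional[str] = None


@dataclass(frozen=True)
class EmbeddedMultiValuedEnumField:
    key: str
    artifact_ref: str                         # MultiValuedEnumFieldId
    value_requirement: Optional[str] = None
    cardinality: Optional[Cardinality] = None
    visibility: Optional[str] = None
    default_values: Tuple[str, ...] = ()      # EnumValue*: a sequence, possibly empty
    label_override: Optional[str] = None
    help_text_override: Optional[str] = None
    property_iri: Optional[str] = None


def slot_names(cls) -> List[str]:
    """List the slot names declared on one of the sketch classes."""
    return [f.name for f in fields(cls)]
```

Comparing `slot_names(EmbeddedTextField)` with `slot_names(EmbeddedBooleanField)` shows the omitted cardinality slot directly.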

EmbeddedTemplate and EmbeddedPresentationComponent follow a similar pattern to embedded fields but differ in what embedding properties they carry. EmbeddedTemplate supports cardinality to permit multiple nested instances of the referenced template, carries no typed default value, and carries an optional Property associating a semantic property IRI with the embedding site. EmbeddedPresentationComponent carries no value requirement, cardinality, default value, label override, or property, because it contributes no instance data and exists purely to contribute presentational structure. The only embedding-level property it carries is Visibility.

EmbeddedTemplate ::= embedded_template(
                       EmbeddedArtifactKey
                       TemplateId
                       [ValueRequirement]
                       [Cardinality]
                       [Visibility]
                       [LabelOverride]
                       [Property]
                     )

EmbeddedPresentationComponent ::= embedded_presentation_component(
                                    EmbeddedArtifactKey
                                    PresentationComponentId
                                    [Visibility]
                                  )
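The ordering and key-uniqueness requirements on embedded artifacts can be illustrated with a short Python sketch. The class names mirror the three EmbeddedArtifact forms, but their attributes are reduced to a key and a reference, and the helper function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class EmbeddedField:
    key: str
    artifact_ref: str


@dataclass(frozen=True)
class EmbeddedTemplate:
    key: str
    artifact_ref: str


@dataclass(frozen=True)
class EmbeddedPresentationComponent:
    key: str
    artifact_ref: str


EmbeddedArtifact = Union[EmbeddedField, EmbeddedTemplate, EmbeddedPresentationComponent]


def presentation_order(embedded: List[EmbeddedArtifact]) -> List[str]:
    """Presentation order is simply the order of the sequence; never sort it."""
    return [e.key for e in embedded]


def data_bearing_keys(embedded: List[EmbeddedArtifact]) -> List[str]:
    """Keys of embeddings that can produce instance data (fields and templates)."""
    return [e.key for e in embedded if not isinstance(e, EmbeddedPresentationComponent)]


def duplicate_keys(embedded: List[EmbeddedArtifact]) -> List[str]:
    """Keys that appear more than once; each key must be unique within a Template."""
    seen, dupes = set(), []
    for e in embedded:
        if e.key in seen and e.key not in dupes:
            dupes.append(e.key)
        seen.add(e.key)
    return dupes
```

Holding the embeddings in a list (rather than a map keyed by EmbeddedArtifactKey) is one straightforward way to preserve the significant order.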

Artifact Identity

Artifact identity defines the typed identifiers by which artifacts and artifact references are denoted in the model. These identity constructs are distinct from descriptive metadata, lifecycle metadata, versioning, and annotations.

Each field kind has its own typed identifier rather than sharing a single generic FieldId. This provides strong typing: an EmbeddedTextField can only carry a TextFieldId at its artifactRef slot, an EmbeddedDateField can only carry a DateFieldId, and so on, making it structurally impossible to embed a field of the wrong type. TemplateId, PresentationComponentId, and TemplateInstanceId follow the same pattern for the same reason.

Identifiers serve two roles: at the definition site of a reusable artifact (e.g. Field.id, Template.id) they permanently name the artifact; at the embedding site (e.g. EmbeddedField.artifactRef, EmbeddedTemplate.artifactRef) they reference the artifact being embedded. The same typed identifier production is used at both positions; the role distinction is conveyed by the surrounding production’s component name.

FieldId ::= TextFieldId
          | IntegerNumberFieldId
          | RealNumberFieldId
          | BooleanFieldId
          | DateFieldId
          | TimeFieldId
          | DateTimeFieldId
          | ControlledTermFieldId
          | SingleValuedEnumFieldId
          | MultiValuedEnumFieldId
          | LinkFieldId
          | EmailFieldId
          | PhoneNumberFieldId
          | OrcidFieldId
          | RorFieldId
          | DoiFieldId
          | PubMedIdFieldId
          | RridFieldId
          | NihGrantIdFieldId
          | AttributeValueFieldId

TextFieldId ::= text_field_id( Iri )

IntegerNumberFieldId ::= integer_number_field_id( Iri )

RealNumberFieldId ::= real_number_field_id( Iri )

BooleanFieldId ::= boolean_field_id( Iri )

DateFieldId ::= date_field_id( Iri )

TimeFieldId ::= time_field_id( Iri )

DateTimeFieldId ::= date_time_field_id( Iri )

ControlledTermFieldId ::= controlled_term_field_id( Iri )

SingleValuedEnumFieldId ::= single_valued_enum_field_id( Iri )

MultiValuedEnumFieldId ::= multi_valued_enum_field_id( Iri )

LinkFieldId ::= link_field_id( Iri )

EmailFieldId ::= email_field_id( Iri )

PhoneNumberFieldId ::= phone_number_field_id( Iri )

OrcidFieldId ::= orcid_field_id( Iri )

RorFieldId ::= ror_field_id( Iri )

DoiFieldId ::= doi_field_id( Iri )

PubMedIdFieldId ::= pub_med_id_field_id( Iri )

RridFieldId ::= rrid_field_id( Iri )

NihGrantIdFieldId ::= nih_grant_id_field_id( Iri )

AttributeValueFieldId ::= attribute_value_field_id( Iri )

TemplateId ::= template_id( Iri )

PresentationComponentId ::= presentation_component_id( Iri )

TemplateInstanceId ::= template_instance_id( Iri )

All artifact identifier productions are IRI-valued. See Iri.

Concrete serializations need not preserve the per-family identifier distinctions drawn here. In the JSON wire encoding, every artifact identifier — whether a per-family FieldId variant such as TextFieldId or SingleValuedEnumFieldId, or one of the non-field identifiers TemplateId, PresentationComponentId, and TemplateInstanceId — is encoded as a bare IRI string with no per-family discriminator. The field family of a FieldId reference is recovered from the kind of the enclosing Field or EmbeddedField. See wire-grammar.md §5 and serialization.md.
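The following Python sketch illustrates both halves of this design: distinct identifier types per family in the abstract model, and a bare IRI string on the wire whose family is recovered from the enclosing kind. The kind strings used as dictionary keys are hypothetical placeholders; the actual kind values are defined in wire-grammar.md.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextFieldId:
    iri: str


@dataclass(frozen=True)
class DateFieldId:
    iri: str


# Hypothetical kind strings, for illustration only.
ID_TYPE_BY_EMBEDDED_KIND = {
    "EmbeddedTextField": TextFieldId,
    "EmbeddedDateField": DateFieldId,
}


def encode_id(artifact_id) -> str:
    """On the wire, every identifier is a bare IRI string with no discriminator."""
    return artifact_id.iri


def decode_field_ref(enclosing_kind: str, iri: str):
    """Recover the typed identifier from the kind of the enclosing construct."""
    return ID_TYPE_BY_EMBEDDED_KIND[enclosing_kind](iri)
```

Because the two identifier classes are distinct types, a `TextFieldId` and a `DateFieldId` wrapping the same IRI do not compare equal, which is the typing guarantee the abstract model aims for.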

Artifact Metadata

Artifact metadata defines descriptive information, lifecycle information, versioning, and annotations. CatalogMetadata is uniform across every artifact kind and provides the common catalog-oriented metadata that all artifacts carry in addition to their identity: descriptive properties (preferred catalog label, description, identifier, alternative labels), lifecycle metadata, and annotations. Schema artifacts (Field, Template) additionally carry SchemaArtifactVersioning as a separate top-level slot recording version, status, and lineage.

Aggregate Structure

This subsection identifies how the metadata categories are grouped at the artifact level. CatalogMetadata carries the catalog-oriented properties of an artifact — descriptive properties (preferred catalog label, description, identifier, alternative labels), lifecycle metadata, and annotations — directly as members. It is uniform across every artifact kind: Field, Template, PresentationComponent, and TemplateInstance all carry the same CatalogMetadata shape.

The schema artifacts (Field and Template) additionally carry SchemaArtifactVersioning as a separate top-level slot on the artifact itself; non-schema artifacts (PresentationComponent, TemplateInstance) do not carry versioning.

CatalogMetadata is distinct from an artifact’s rendered display name. A Field carries a top-level Label slot (the rendered question text); a Template carries a top-level Title slot (the rendered form title); a TemplateInstance MAY carry an optional Label (a user-supplied instance name); a PresentationComponent carries no rendered display name at all. These rendered slots are defined on the per-artifact productions in Field Artifacts, Core Structure, Instances, and Presentation Components respectively.

CatalogMetadata ::= catalog_metadata(
                      [PreferredLabel]
                      [Description]
                      [Identifier]
                      AlternativeLabel*
                      LifecycleMetadata
                      Annotation*
                    )

Descriptive Metadata

The descriptive metadata of an artifact comprises a set of human-oriented properties carried directly by CatalogMetadata. These properties support naming, explanatory text, and external or local identifiers used for cataloging. PreferredLabel, when present, is the artifact’s preferred display name in catalog and registry contexts (e.g., browsing the field registry or listing templates) — distinct from the artifact’s rendered display name, which lives in a top-level slot on the artifact itself (Field.label, Template.title, TemplateInstance.label). Authors typically populate PreferredLabel with the same text as the rendered slot; the two are separate so they MAY differ when needed (for example, a field whose registry name is "Comment field (v1.2)" may render in forms as just "Comment"). Description, when present, is an extended textual explanation of the artifact’s purpose and content, intended for catalog display. Identifier, when present, is a user-specified external identifier intended for integration with institutional or external systems. AlternativeLabel, when present, provides additional display labels for the artifact (synonyms, abbreviations, legacy labels carried forward from prior versions of the model).

Description ::= description(
                  MultilingualString
                )

Identifier ::= identifier(
                 string
               )

Description carries a MultilingualString value: human-readable text that may be presented in one or more natural languages. Identifier carries an arbitrary Unicode string value: it is a technical user-supplied key intended for integration with external systems and is not a human-display label, so it is not multilingual. PreferredLabel is defined in the Controlled Term Value section; AlternativeLabel is defined in the Label Override section. Both are MultilingualString-valued.

Lifecycle Metadata

LifecycleMetadata identifies when an artifact was created and modified, and which agents were responsible for those actions.

LifecycleMetadata ::= lifecycle_metadata(
                        CreatedOn
                        CreatedBy
                        ModifiedOn
                        ModifiedBy
                      )

CreatedOn ::= IsoDateTimeStamp

CreatedBy ::= Iri

ModifiedOn ::= IsoDateTimeStamp

ModifiedBy ::= Iri

CreatedOn and ModifiedOn MUST be ISO 8601 date-time timestamps.

CreatedBy and ModifiedBy denote IRIs identifying the responsible agents.

See IsoDateTimeStamp and Iri.
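A lifecycle check along these lines might look like the following Python sketch. It approximates the IsoDateTimeStamp lexical form with the standard library's `datetime.fromisoformat`, which is an assumption: the normative lexical form is given by the IsoDateTimeStamp production, and the camelCase property names and function names here are hypothetical.

```python
from datetime import datetime
from typing import Dict, List


def parse_timestamp(text: str) -> datetime:
    """Parse an ISO 8601 date-time; raises ValueError when the text is not one."""
    # Require a time part so that a bare date is not accepted as a date-time.
    if "T" not in text:
        raise ValueError("expected a date-time, got %r" % text)
    # Older Python versions reject a trailing "Z", so normalise it to an offset.
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"
    return datetime.fromisoformat(text)


def check_lifecycle(meta: Dict[str, str]) -> List[str]:
    """Return a list of problems found in a LifecycleMetadata-like mapping."""
    errors: List[str] = []
    for name in ("createdOn", "modifiedOn"):
        try:
            parse_timestamp(meta[name])
        except KeyError:
            errors.append("%s is missing" % name)
        except ValueError as exc:
            errors.append("%s is not a valid timestamp: %s" % (name, exc))
    # All four components are mandatory in the LifecycleMetadata production.
    for name in ("createdBy", "modifiedBy"):
        if not meta.get(name):
            errors.append("%s is missing" % name)
    return errors
```

The sketch does not validate that CreatedBy and ModifiedBy are well-formed IRIs; that check belongs with the Iri production.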

Schema Versioning

SchemaArtifactVersioning identifies version-related metadata specific to reusable schema artifacts. It captures artifact version, publication status, and optional derivation links to earlier or source artifacts.

SchemaArtifactVersioning ::= schema_artifact_versioning(
                               Version
                               Status
                               [PreviousVersion]
                               [DerivedFrom]
                             )

Version ::= version(
              SemanticVersion
            )

Status ::= "draft" | "published"

ModelVersion ::= model_version(
                   SemanticVersion
                 )

PreviousVersion ::= previous_version(
                      Iri
                    )

DerivedFrom ::= derived_from(
                  Iri
                )

Version denotes a Semantic Versioning 2.0.0 version identifier.

Status denotes the publication status of a reusable schema artifact and is restricted to draft or published.

PreviousVersion and DerivedFrom denote IRIs identifying related source or predecessor artifacts.

The combined meaning of these fields and their interaction with artifact identity is specified in Versioning Model below.
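As a hedged illustration of the lexical constraints on Version and Status, the Python sketch below checks a version string against the regular expression suggested by the Semantic Versioning 2.0.0 specification and checks Status against its two permitted values. The function names and the dictionary keys are assumptions made for this example.

```python
import re
from typing import Dict, List

# Pattern adapted from the regular expression suggested by semver.org for 2.0.0.
SEMVER = re.compile(
    r"(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)"
    r"(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)"
    r"(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?"
    r"(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?"
)

ALLOWED_STATUS = ("draft", "published")


def is_semver(text: str) -> bool:
    """True when the whole string is a Semantic Versioning 2.0.0 identifier."""
    return SEMVER.fullmatch(text) is not None


def check_versioning(versioning: Dict[str, str]) -> List[str]:
    """Return problems found in a SchemaArtifactVersioning-like mapping."""
    errors: List[str] = []
    if not is_semver(versioning.get("version", "")):
        errors.append("version is not a valid SemanticVersion")
    if versioning.get("status") not in ALLOWED_STATUS:
        errors.append("status must be 'draft' or 'published'")
    return errors
```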

Versioning Model

The CEDAR versioning model rests on one guiding rule: identity is per-version. Every version of a Field or Template is itself a distinct Artifact with its own IRI. There is no separate “version-independent” identifier for the conceptual artifact; what holds successive versions together is the PreviousVersion link from each artifact to the one it replaces.

Identity and immutability. Every reusable schema artifact (every Field and every Template) is identified by a single SchemaArtifactId (a FieldId or TemplateId). That IRI denotes one specific version: distinct versions of “the same” artifact are distinct artifacts in the model, each with its own IRI. A published artifact MUST be treated as immutable — once Status is "published", the content addressed by its IRI MUST NOT change. A draft artifact MAY be edited in place while its Status remains "draft". The transition from draft to published is one-way: an artifact whose Status is "published" MUST NOT transition back to "draft".

Creating a new version. To produce a revised version of a published artifact, mint a new IRI, allocate a new artifact at that IRI with Status set to "draft", and set PreviousVersion to the IRI of the artifact being revised. Editing happens on the new draft; once the new artifact is itself published, it joins the version chain and becomes immutable in turn. The published predecessor is unaffected by the existence of its successor: it remains addressable at its own IRI and continues to be a valid target for TemplateInstance references.
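The revision procedure can be sketched in Python as follows. The SchemaArtifact class, its attribute names, and the `publish` and `revise` helpers are hypothetical; minting the new IRI is left to the caller. Refusing to revise a draft is a choice made by this sketch, on the reasoning that a draft may simply be edited in place.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class SchemaArtifact:
    id: str                                  # per-version IRI
    version: str                             # SemanticVersion, advisory
    status: str                              # "draft" or "published"
    previous_version: Optional[str] = None   # IRI of the artifact being replaced


def publish(artifact: SchemaArtifact) -> SchemaArtifact:
    """Draft to published is the only permitted status transition."""
    return replace(artifact, status="published")


def revise(published: SchemaArtifact, new_iri: str, new_version: str) -> SchemaArtifact:
    """Allocate a new draft that succeeds a published artifact."""
    if published.status != "published":
        raise ValueError("only a published artifact is revised; edit a draft in place")
    if new_iri == published.id:
        raise ValueError("a new version needs a newly minted IRI")
    # The predecessor object is not modified; the successor points back to it.
    return SchemaArtifact(
        id=new_iri,
        version=new_version,
        status="draft",
        previous_version=published.id,
    )
```

Using frozen dataclasses means the published predecessor cannot be mutated by accident while its successor is being drafted.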

Version chains. Successive versions of an artifact form a version chain: a sequence of distinct artifacts, each with its own IRI, linked by PreviousVersion. Artifact B is the immediate successor of artifact A when B.previousVersion = A.id. The first artifact in a chain MUST omit PreviousVersion. Every subsequent artifact in the chain MUST set PreviousVersion to the IRI of its immediate predecessor. A chain is therefore a singly-linked list of IRIs, traversable backwards from any version to the original.
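Traversing a chain backwards can be sketched as a walk over PreviousVersion links. In this Python sketch the mapping from each artifact IRI to its PreviousVersion IRI is an assumed in-memory stand-in for whatever store resolves artifacts; the function name is hypothetical, and the cycle guard is a defensive addition rather than a rule stated by the specification.

```python
from typing import Dict, List, Optional


def chain_back(start_iri: str, previous_of: Dict[str, Optional[str]]) -> List[str]:
    """Return the IRIs from start_iri back to the first version in its chain."""
    chain: List[str] = []
    seen = set()
    current: Optional[str] = start_iri
    while current is not None:
        if current in seen:
            raise ValueError("cycle detected at %s" % current)
        seen.add(current)
        chain.append(current)
        # A missing entry or None means there is no recorded predecessor.
        current = previous_of.get(current)
    return chain
```

For example, with links `v3 -> v2 -> v1`, the walk starting at `v3` visits the three versions from newest to oldest.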

The role of Version. Version carries a Semantic Versioning identifier as advisory metadata describing this artifact’s place in its chain (e.g. 1.0.0 → 1.1.0 for a backwards-compatible change, 1.0.0 → 2.0.0 for a breaking change). The pairing of IRI and PreviousVersion is what authoritatively establishes the chain; Version is descriptive and is not load-bearing for chain identity. Successive artifacts in a chain SHOULD carry monotonically increasing SemanticVersion values, but this specification does not impose a structural constraint to that effect.

Derivation versus succession. DerivedFrom and PreviousVersion are distinct relationships and answer different questions. PreviousVersion records succession within a single version chain: the successor is intended to replace its predecessor as the same conceptual artifact evolves. DerivedFrom records non-version lineage: the new artifact is a fork or adaptation, authored by copying or modifying an existing artifact, but it is not the next version of that artifact. A fork begins its own independent version chain. Typical uses of DerivedFrom include adopting a community-published template into an institutional namespace or spawning a specialised variant of an existing field. An artifact MAY carry both PreviousVersion and DerivedFrom simultaneously: the artifact succeeds another within its own chain and was originally derived from a separate source artifact. The two relationships are independent, with one restriction: when both are present, PreviousVersion and DerivedFrom MUST NOT carry the same IRI value, because a single artifact cannot be both the predecessor and the fork source of the same successor.

Summary of normative rules.

  1. Every version of a Field or Template MUST have a distinct IRI.
  2. A published artifact MUST NOT change at its IRI.
  3. A published artifact MUST NOT transition back to draft.
  4. The first artifact in a version chain MUST omit PreviousVersion. Every other artifact in the chain MUST set PreviousVersion to the IRI of its immediate predecessor in that chain.
  5. When both PreviousVersion and DerivedFrom are present on the same artifact, they MUST NOT carry the same IRI.
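
The chain rules above can be sketched in a few lines of Python. This is an illustrative sketch only: the ArtifactVersion record, the in-memory store keyed by IRI, and the function names are assumptions made for the example and are not defined by this specification.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ArtifactVersion:
    """Hypothetical record holding only the fields the chain rules mention."""
    id: str
    previous_version: Optional[str] = None
    derived_from: Optional[str] = None


def walk_chain(start_id: str, store: Dict[str, ArtifactVersion]) -> List[str]:
    """Follow PreviousVersion links from start_id back to the first version."""
    chain: List[str] = []
    current: Optional[str] = start_id
    while current is not None:
        if current in chain:
            raise ValueError(f"cycle in version chain at {current}")
        chain.append(current)
        current = store[current].previous_version  # KeyError if a link dangles
    return chain


def violates_rule_5(artifact: ArtifactVersion) -> bool:
    """True when PreviousVersion and DerivedFrom carry the same IRI."""
    return (
        artifact.previous_version is not None
        and artifact.previous_version == artifact.derived_from
    )
```

The walk ends at the artifact that omits PreviousVersion, which rule 4 identifies as the first artifact in the chain.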

Annotations

Annotation provides an extensible metadata mechanism for additional named metadata values that are not captured by the core descriptive, lifecycle, or versioning structures. The first Iri identifies the annotation property — the predicate IRI under which the annotation is asserted. The AnnotationValue is the associated metadata value, currently a string-bearing scalar or an IRI. This supports linking to external resources such as DOIs and grant identifiers, as well as storing institutional metadata.

Annotation ::= annotation(
                 Iri
                 AnnotationValue
               )

AnnotationValue ::= AnnotationStringValue
                  | AnnotationIriValue

AnnotationStringValue ::= annotation_string_value(
                            LexicalForm
                            [LanguageTag]
                          )

AnnotationIriValue ::= annotation_iri_value(
                         Iri
                       )

AnnotationValue is a discriminated union over named annotation-value productions. The two currently defined variants represent text-valued and IRI-valued annotations: AnnotationStringValue carries a lexical form with an optional language tag; AnnotationIriValue carries an IRI denoting a resource. AnnotationStringValue does not carry an explicit datatype; lexically-typed annotations are not modelled at this position, since annotation metadata is by convention either text or IRI-valued.

The variant family is open to extension. A future revision MAY introduce additional AnnotationXxxValue productions (for example, integer- or real-number-valued annotations) without breaking the existing variants.
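
As a rough illustration of the discriminated union, the productions above might be mirrored in Python as follows. The class and attribute names are assumptions chosen for this sketch; they are not the host-language binding defined elsewhere in the specification, and the describe helper exists only to show dispatch on the variant.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class AnnotationStringValue:
    lexical_form: str
    language_tag: Optional[str] = None  # optional BCP 47 tag


@dataclass(frozen=True)
class AnnotationIriValue:
    iri: str


# The union is open to extension: further variants could be added here.
AnnotationValue = Union[AnnotationStringValue, AnnotationIriValue]


@dataclass(frozen=True)
class Annotation:
    property_iri: str        # the predicate IRI of the annotation
    value: AnnotationValue


def describe(annotation: Annotation) -> str:
    """Render an annotation in a Turtle-like form (illustrative only)."""
    value = annotation.value
    if isinstance(value, AnnotationIriValue):
        return f"<{annotation.property_iri}> <{value.iri}>"
    if isinstance(value, AnnotationStringValue):
        tag = f"@{value.language_tag}" if value.language_tag else ""
        return f'<{annotation.property_iri}> "{value.lexical_form}"{tag}'
    raise TypeError(f"unknown annotation value variant: {type(value).__name__}")
```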

Scalar and Datatype Leaves

The following productions define the primitive leaf types used throughout this grammar. They represent the atomic constructs from which all other productions are built: IRIs, typed string domains, lexical forms, multilingual textual metadata, numeric and temporal datatype IRIs, and textual metadata values.

Primitive String Types

The following nonterminals are the string-valued leaf types referenced by the productions in this section. Each is pinned to a specific external specification or regular expression so that implementations can validate inputs unambiguously.

  • SemanticVersion — a Semantic Versioning 2.0.0 string. MUST conform to the Semantic Versioning 2.0.0 specification at semver.org, specifically the regular expression in the SemVer FAQ (https://semver.org/#is-there-a-suggested-regular-expression-regex-to-check-a-semver-string). Examples: 1.0.0, 2.0.0-alpha.1, 1.0.0+build.7.

  • IriString — the lexical form of an IRI as defined by RFC 3987 §2.2 (the IRI production). The IRI MUST be absolute (carry a scheme); relative IRIs are not permitted at any wire-form position. Implementations SHOULD use the RFC 3987 IRI ABNF; a permissive practical regex is ^[A-Za-z][A-Za-z0-9+.\-]*:[^\s<>"]+$ but this is not sufficient for full conformance.

  • Bcp47Tag — a well-formed BCP 47 language tag per RFC 5646, specifically the Language-Tag production. Implementations SHOULD validate against the IANA Language Subtag Registry; a syntactic-only check (well-formedness without registry lookup) is acceptable as a baseline. Examples: en, en-US, zh-Hant-TW, de-CH-1901.

  • Iso8601DateTimeLexicalForm — an ISO 8601 combined date-and-time string in the extended format with full date and full time, with or without a UTC offset. The accepted shapes are:

    • YYYY-MM-DDTHH:MM:SS (no offset)
    • YYYY-MM-DDTHH:MM:SS.sss (fractional seconds, 1–9 digits)
    • YYYY-MM-DDTHH:MM:SSZ (UTC)
    • YYYY-MM-DDTHH:MM:SS±HH:MM (offset)
    • the same shapes with .sss fractional seconds combined with the timezone form.

    This corresponds to the XSD dateTime lexical form (XSD 1.1 §3.3.7). Examples: 2026-05-08T14:30:00Z, 2026-05-08T14:30:00.123-07:00.

  • AsciiIdentifier — a string matching the regular expression ^[A-Za-z][A-Za-z0-9_-]*$: an ASCII letter followed by zero or more ASCII letters, digits, underscores, or hyphens. Length is unbounded. Examples: topic, field-1, Member_42.

  • IntegerLexicalForm — a base-10 signed integer literal matching the regular expression ^-?(0|[1-9][0-9]*)$: an optional leading minus sign followed by either 0 or a non-zero digit and zero or more digits. Leading zeros and a leading + are not permitted. Magnitude is unbounded. The using context may further restrict the sign — NonNegativeInteger rejects values with a leading minus sign; signed bounds productions accept it.
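
The regular expressions quoted above translate directly into Python. The sketch below is a syntactic baseline only: the function names are hypothetical, the IRI check uses the permissive practical regex (which the text notes is not sufficient for full conformance), and the date-time check is a shape check written for this example that does not validate ranges such as month or hour values.

```python
import re

# fullmatch() is used throughout, so the ^ and $ anchors are unnecessary.
ASCII_IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_-]*")
INTEGER_LEXICAL_FORM = re.compile(r"-?(0|[1-9][0-9]*)")
PERMISSIVE_IRI = re.compile(r'[A-Za-z][A-Za-z0-9+.\-]*:[^\s<>"]+')
# Full date, full time, optional 1-9 fractional digits, optional Z or offset.
ISO_DATE_TIME_SHAPE = re.compile(
    r"[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}"
    r"(\.[0-9]{1,9})?(Z|[+-][0-9]{2}:[0-9]{2})?"
)


def is_ascii_identifier(text: str) -> bool:
    return ASCII_IDENTIFIER.fullmatch(text) is not None


def is_integer_lexical_form(text: str) -> bool:
    return INTEGER_LEXICAL_FORM.fullmatch(text) is not None


def is_permissive_iri(text: str) -> bool:
    return PERMISSIVE_IRI.fullmatch(text) is not None


def has_iso_date_time_shape(text: str) -> bool:
    return ISO_DATE_TIME_SHAPE.fullmatch(text) is not None
```

Using fullmatch rather than match with a trailing $ avoids accepting a string that ends in a newline, which Python's $ would otherwise tolerate.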

Core IRI and String Types

This subsection defines the fundamental IRI, string, and numeric leaf types that appear throughout the grammar. Iri is the base construct for all IRI-valued positions. TermIri is a specialised IRI form for controlled-vocabulary references. LanguageTag and LexicalForm are leaf string types used by Value constructs that carry localized or lexically-typed content. IsoDateTimeStamp carries ISO 8601 date-time values used in lifecycle metadata. NonNegativeInteger supports field-spec constraints.

Iri ::= iri(
          IriString
        )

TermIri ::= term_iri(
              Iri
            )

LanguageTag ::= language_tag(
                  Bcp47Tag
                )

LexicalForm ::= lexical_form(
                  string
                )

IsoDateTimeStamp ::= iso_date_time_stamp(
                       Iso8601DateTimeLexicalForm
                     )

NonNegativeInteger ::= non_negative_integer(
                         IntegerLexicalForm
                       )

Iri denotes an Internationalized Resource Identifier. It corresponds to the xsd:anyURI datatype; implementations MAY represent it as a plain string provided it is a syntactically valid IRI.

TermIri denotes an Iri that identifies a term in a controlled vocabulary or ontology. It is used in ControlledTermValue, ControlledTermClass, and Meaning.

LanguageTag denotes a well-formed BCP 47 language tag.

LexicalForm denotes a Unicode string and SHOULD be in Unicode Normalization Form C.

IsoDateTimeStamp denotes an ISO 8601 date-time lexical form.

NonNegativeInteger denotes an integer greater than or equal to zero.

Multilingual Strings

LangString and MultilingualString are the constructs used at every grammar position that carries human-display text. They distinguish localizations of one conceptual string from technical Unicode-string keys (which remain plain string-valued; see Identifier and the controlled-term-source identifiers in Controlled Term Sources).

LangString ::= lang_string(
                 string
                 Bcp47Tag
               )

MultilingualString ::= multilingual_string(
                         LangString+
                       )

LangString pairs a textual value with a BCP 47 language tag identifying its natural language.

MultilingualString denotes a non-empty set of LangString entries representing localizations of one conceptual string. The entries’ language tags MUST be unique within a MultilingualString (case-folded comparison): the construct represents a set of localizations, not a list of phrasings within a single language.

The 'und' (undetermined) BCP 47 subtag MAY be used to denote a LangString whose natural language is unspecified. Implementations MAY use 'und' as the default tag when constructing a MultilingualString from a bare string with no language information.

MultilingualString differs from a single language-tagged scalar value (such as TextValue with a LanguageTag) in that it carries an unweighted localization set — multiple language tags coexist for the same conceptual string at metadata positions such as Template.header or CatalogMetadata.preferredLabel.
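
The non-emptiness and tag-uniqueness constraints can be sketched as a small constructor. The representation of a LangString as a (text, tag) pair and both function names are assumptions for this example; tag well-formedness is not checked here.

```python
from typing import Iterable, Tuple

LangString = Tuple[str, str]  # (text, BCP 47 tag); a hypothetical representation


def make_multilingual_string(entries: Iterable[LangString]) -> Tuple[LangString, ...]:
    """Build a MultilingualString, enforcing non-emptiness and unique tags."""
    result = tuple(entries)
    if not result:
        raise ValueError("a MultilingualString must contain at least one LangString")
    seen = set()
    for _text, tag in result:
        key = tag.casefold()  # tags are compared case-insensitively
        if key in seen:
            raise ValueError(f"duplicate language tag: {tag}")
        seen.add(key)
    return result


def from_bare_string(text: str) -> Tuple[LangString, ...]:
    """Wrap a bare string using 'und' as the default tag."""
    return make_multilingual_string([(text, "und")])
```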

Numeric Datatype Kind

IntegerNumberValue is fixed to a single integer category; its datatype is implicit and is not a configurable component of the production. RealNumberValue carries an explicit RealNumberDatatypeKind chosen from three alternatives — decimal, float, or double. The kind names are CEDAR-native enum values; their corresponding XSD datatype IRIs are defined externally to the abstract grammar by rdf-projection.md.

RealNumberDatatypeKind ::= "decimal" | "float" | "double"

decimal denotes exact arbitrary-precision decimal numbers. float and double denote IEEE 754 single- and double-precision floating-point numbers respectively.

This specification narrows the supported numeric kinds to four (one integer kind plus the three real-number kinds). Earlier drafts admitted the full XSD numeric hierarchy (16 datatypes including long, short, byte, the signed/unsigned bounded subtypes, and the sign-constrained subtypes such as nonNegativeInteger); those are not part of the conforming set. Sign and range constraints are expressed via IntegerNumberMinValue / IntegerNumberMaxValue (or the real-valued equivalents). Bit-precision distinctions are not modelled at the type level; decimal covers exact arbitrary precision when needed, and float / double cover IEEE 754 single- and double-precision when storage precision matters.

Values

This section defines the Value types that represent instance-level data. Value constructs appear in FieldValue instances and as typed default values in EmbeddedArtifact properties. The value types are defined here independently of the FieldSpec productions that constrain them; the normative mapping between each FieldSpec and its permitted Value form is given in the Field Spec And Value Correspondence section.

Value ::= TextValue
        | NumericValue
        | BooleanValue
        | DateValue
        | TimeValue
        | DateTimeValue
        | ControlledTermValue
        | EnumValue
        | LinkValue
        | EmailValue
        | PhoneNumberValue
        | ExternalAuthorityValue
        | AttributeValue

NumericValue ::= IntegerNumberValue
               | RealNumberValue

Scalar Values

TextValue, BooleanValue, and the two numeric value forms (IntegerNumberValue and RealNumberValue) are the simplest value types. Each carries the family-specific content directly: a lexical form for the string-bearing variants, a boolean payload for BooleanValue. TextValue carries an optional LanguageTag; when present, the value is a language-tagged string, when absent, a plain string. IntegerNumberValue carries a base-10 integer lexical form; its category is implicit and not carried as a component. RealNumberValue carries a base-10 real-valued lexical form paired with an explicit RealNumberDatatypeKind (decimal, float, or double).

TextValue ::= text_value(
                LexicalForm
                [LanguageTag]
              )

IntegerNumberValue ::= integer_number_value(
                         LexicalForm
                       )

RealNumberValue ::= real_number_value(
                      LexicalForm
                      RealNumberDatatypeKind
                    )

BooleanValue ::= boolean_value(
                   boolean
                 )

IntegerNumberValue’s lexical form MUST be a base-10 integer literal (per the IntegerLexicalForm primitive in §Primitive String Types). RealNumberValue’s lexical form is a base-10 real-valued literal whose admissible form depends on the carried datatype: decimal admits an arbitrary-precision decimal lexical form; float and double admit IEEE 754-style lexical forms (including special values such as INF, -INF, and NaN).

NumericValue is the abstract category admitting IntegerNumberValue and RealNumberValue; the two are distinct concrete value types and a FieldValue carrying numeric content discriminates between them by kind.

The lexical form of any string-bearing value SHOULD be in Unicode Normalization Form C.

A Value whose lexical form lies outside the lexical space of its declared datatype is ill-typed: it is not syntactically ill-formed but does not determine a valid value. Implementations MUST accept ill-typed values and MAY produce warnings when encountering them. The corresponding RDF projection (see rdf-projection.md) preserves the ill-typed lexical form.
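
A minimal sketch of the accept-and-warn behaviour for ill-typed numeric values follows. The integer pattern is the IntegerLexicalForm regex from this specification; the decimal and float/double patterns are assumptions modelled loosely on the XSD lexical spaces, and the function names are hypothetical.

```python
import re
import warnings

_DECIMAL = r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)"

# Assumed lexical patterns per kind; only "integer" is quoted from the text.
LEXICAL_PATTERNS = {
    "integer": re.compile(r"-?(0|[1-9][0-9]*)"),
    "decimal": re.compile(_DECIMAL),
    "float": re.compile(rf"{_DECIMAL}([eE][+-]?[0-9]+)?|-?INF|NaN"),
}
LEXICAL_PATTERNS["double"] = LEXICAL_PATTERNS["float"]


def is_well_typed(lexical_form: str, kind: str) -> bool:
    """True when the lexical form lies in the assumed lexical space of kind."""
    return LEXICAL_PATTERNS[kind].fullmatch(lexical_form) is not None


def accept_value(lexical_form: str, kind: str):
    """Accept the value unconditionally, warning if it is ill-typed."""
    if not is_well_typed(lexical_form, kind):
        warnings.warn(f"ill-typed {kind} value: {lexical_form!r}")
    return (lexical_form, kind)  # the lexical form is preserved as given
```

The value is returned unchanged in both cases, reflecting the requirement that ill-typed values be accepted rather than rejected.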

Temporal Values

Temporal values represent date, time, and date-time data, corresponding directly to DateFieldSpec, TimeFieldSpec, and DateTimeFieldSpec respectively. DateValue is further refined into three precision variants — YearValue, YearMonthValue, and FullDateValue. Each temporal Value variant carries a LexicalForm directly; the temporal category is fixed by the variant’s kind. FullDateValue carries an ISO 8601 calendar-date lexical form; TimeValue carries an ISO 8601 time-of-day lexical form; DateTimeValue carries an ISO 8601 combined date-time lexical form. YearValue and YearMonthValue carry plain strings matching the patterns YYYY and YYYY-MM respectively. The RDF projection of these values is defined separately in rdf-projection.md.

DateValue ::= YearValue
            | YearMonthValue
            | FullDateValue

YearValue ::= year_value(
                LexicalForm    (* matches YYYY, e.g. "2024" *)
              )

YearMonthValue ::= year_month_value(
                     LexicalForm    (* matches YYYY-MM, e.g. "2024-06" *)
                   )

FullDateValue ::= full_date_value(
                    LexicalForm
                  )

TimeValue ::= time_value(
                LexicalForm
              )

DateTimeValue ::= date_time_value(
                    LexicalForm
                  )
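
The three DateValue precisions can be told apart by lexical shape. The sketch below checks a lexical form against the variant it is declared as; the dictionary keys reuse the production names for readability, and restricting months to 01-12 and days to 01-31 is an assumption of this example rather than something the grammar states.

```python
import re

# Shape checks only: calendar validity (e.g. 30 February) is not verified.
DATE_VARIANT_PATTERNS = {
    "YearValue": re.compile(r"[0-9]{4}"),
    "YearMonthValue": re.compile(r"[0-9]{4}-(0[1-9]|1[0-2])"),
    "FullDateValue": re.compile(
        r"[0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])"
    ),
}


def matches_date_variant(variant: str, lexical_form: str) -> bool:
    """True when lexical_form has the shape expected for the given variant."""
    return DATE_VARIANT_PATTERNS[variant].fullmatch(lexical_form) is not None
```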

Controlled Term Value

A controlled term value identifies a term drawn from an ontology, branch, class set, or value set declared in the corresponding ControlledTermFieldSpec. It carries a TermIri identifying the term, together with an optional human-readable Label and optional Notation and PreferredLabel terminology metadata from the source ontology. Label is the display label intended for end-user presentation; Notation is a symbolic code (typically a SKOS notation) bound to the term; PreferredLabel is the ontology’s own preferred label for the term, distinct from the display Label that may have been customized for the surrounding context.

A ControlledTermValue MAY omit Label: a consumer that has access to the source ontology can resolve the term’s display label from the TermIri. Producers SHOULD include Label when it is known at the point of value construction so that downstream consumers without ontology access can render the value.

Label ::= label(
            MultilingualString
          )

Notation ::= notation(
               string
             )

PreferredLabel ::= preferred_label(
                     MultilingualString
                   )

ControlledTermValue ::= controlled_term_value(
                          TermIri
                          [Label]
                          [Notation]
                          [PreferredLabel]
                        )

Label and PreferredLabel are MultilingualString values: each carries one or more language-tagged localizations of the term’s display label. Notation is a plain Unicode string: it is a technical symbolic code (typically a SKOS notation) rather than human-display text, and is therefore not multilingual.

Enum Value

An enum value carries a selection from the permissible values declared by an EnumFieldSpec. Every enum value is identified by a Token — a non-empty Unicode string that serves as the canonical key of one of the enum spec’s PermissibleValue entries. A conforming instance value MUST equal the Token of one of the referenced spec’s permissible values.

EnumValue ::= enum_value(
                Token
              )

Token is the leaf type used as the canonical key of an enum selection. It is defined in the Field Specs section alongside the related leaf productions (PermissibleValue, Meaning) used by EnumFieldSpec.

Link Value

A link value represents a hyperlink or URL-valued field. It carries an Iri identifying the linked resource and an optional Label providing a human-readable display label for the link.

LinkValue ::= link_value(
                Iri
                [Label]
              )

Label is the same MultilingualString-valued production used by ControlledTermValue, PermissibleValue, and the external-authority value types: a label is treated uniformly as a localizable display string. A hyperlink’s display text MAY therefore carry one or more language-tagged localizations.

Contact Values

Contact values represent human contact identifiers. EmailValue carries an email address as a plain LexicalForm; PhoneNumberValue carries a telephone number as a plain LexicalForm. Format validation is left to implementations.

EmailValue ::= email_value(
                 LexicalForm
               )

PhoneNumberValue ::= phone_number_value(
                       LexicalForm
                     )

External Authority Values

External authority values represent identifiers issued by recognised external authority systems. Each concrete value type carries a typed IRI specialised for its authority together with an optional human-readable Label. The typed IRI signals the expected identifier scheme; format conformance for each authority may be enforced by profile-specific or implementation-specific validation rules.

ExternalAuthorityValue ::= OrcidValue
                         | RorValue
                         | DoiValue
                         | PubMedIdValue
                         | RridValue
                         | NihGrantIdValue

OrcidValue ::= orcid_value(
                 OrcidIri
                 [Label]
               )

RorValue ::= ror_value(
               RorIri
               [Label]
             )

DoiValue ::= doi_value(
               DoiIri
               [Label]
             )

PubMedIdValue ::= pub_med_id_value(
                    PubMedIri
                    [Label]
                  )

RridValue ::= rrid_value(
                RridIri
                [Label]
              )

NihGrantIdValue ::= nih_grant_id_value(
                      NihGrantIri
                      [Label]
                    )

OrcidIri    ::= orcid_iri( Iri )
RorIri      ::= ror_iri( Iri )
DoiIri      ::= doi_iri( Iri )
PubMedIri   ::= pub_med_iri( Iri )
RridIri     ::= rrid_iri( Iri )
NihGrantIri ::= nih_grant_iri( Iri )

| Typed IRI | Authority | IRI Pattern |
|---|---|---|
| OrcidIri | ORCID: identifies a researcher by ORCID iD | https://orcid.org/\d{4}-\d{4}-\d{4}-\d{3}[\dX] |
| RorIri | Research Organization Registry: identifies a research organisation by ROR ID | https://ror.org/0[a-z0-9]{8} |
| DoiIri | Digital Object Identifier: identifies a digital object by DOI | https://doi.org/10\.\d{4,}/.+ |
| PubMedIri | PubMed: identifies a PubMed article | https://pubmed.ncbi.nlm.nih.gov/\d+ |
| RridIri | Research Resource Identifier: identifies a research resource by RRID | https://identifiers.org/RRID:[A-Z]+_\d+ |
| NihGrantIri | NIH: identifies an NIH-funded grant | unspecified |

The final character of an ORCID iD MAY be X, serving as an ISO 7064 Mod 11-2 check character.
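
The IRI patterns for the typed IRIs can be applied as an implementation-specific check along the following lines. This is a sketch under assumptions: the function names are hypothetical, the dots in host names are escaped here (the listed patterns leave them unescaped), and NihGrantIri is left unchecked because no pattern is specified for it. The check-character routine follows the ISO 7064 Mod 11-2 scheme that ORCID uses.

```python
import re

AUTHORITY_PATTERNS = {
    "OrcidIri": re.compile(
        r"https://orcid\.org/[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]"
    ),
    "RorIri": re.compile(r"https://ror\.org/0[a-z0-9]{8}"),
    "DoiIri": re.compile(r"https://doi\.org/10\.[0-9]{4,}/.+"),
    "PubMedIri": re.compile(r"https://pubmed\.ncbi\.nlm\.nih\.gov/[0-9]+"),
    "RridIri": re.compile(r"https://identifiers\.org/RRID:[A-Z]+_[0-9]+"),
}


def matches_authority_pattern(typed_iri: str, iri: str) -> bool:
    """Check iri against the pattern for typed_iri, if one is defined."""
    pattern = AUTHORITY_PATTERNS.get(typed_iri)
    if pattern is None:  # e.g. NihGrantIri: no pattern is specified
        return True
    return pattern.fullmatch(iri) is not None


def orcid_check_char(base_digits: str) -> str:
    """Compute the ISO 7064 Mod 11-2 check character for 15 base digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def orcid_checksum_ok(iri: str) -> bool:
    """Verify the final character of an ORCID IRI that already matches the pattern."""
    digits = iri.rsplit("/", 1)[-1].replace("-", "")
    return orcid_check_char(digits[:15]) == digits[15]
```

The checksum test is deliberately separate from the pattern test, since the text leaves format conformance to profile-specific or implementation-specific rules.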

Attribute Value

An attribute value is a name-value pair used to represent arbitrary named properties whose names are not known at schema definition time. AttributeName carries the name of the attribute as a Unicode string. The value component is itself a Value, permitting attribute values to carry any value type including nested attribute values. Nesting depth is unbounded at the model level; concrete implementations MAY impose practical limits.

AttributeName ::= attribute_name(
                    string
                  )

AttributeValue ::= attribute_value(
                     AttributeName
                     Value
                   )

Embedded Artifact Properties

Embedded artifact properties define the contextual information carried by an EmbeddedArtifact within a Template. These properties govern how a referenced reusable artifact is used in that template context, including key, reference, requirement, cardinality, visibility, defaults, and label override, and they are distinct from the intrinsic properties of the referenced reusable artifact itself.

Embedded Artifact Key

An EmbeddedArtifactKey is the local identifier of an EmbeddedArtifact within a Template. It is the key by which an embedded field, embedded template, or embedded presentation component is distinguished from other embedded artifacts in the same template. This key is also the mechanism that connects template structure to instance structure: FieldValue and NestedTemplateInstance use EmbeddedArtifactKey to identify which embedded artifact in the template they correspond to.

EmbeddedArtifactKey ::= embedded_artifact_key(
                          AsciiIdentifier
                        )

EmbeddedArtifactKey MUST match the pattern [A-Za-z][A-Za-z0-9_-]*: it MUST begin with an ASCII letter followed by zero or more ASCII letters, digits, underscores, or hyphens.

EmbeddedArtifactKey values are local to a Template and MUST be unique within that Template.

EmbeddedArtifactKey is distinct from artifact identifiers such as FieldId and TemplateId. It identifies the embedding site within a template rather than the reusable artifact being referenced. The same reusable Field may be embedded more than once in a Template under different keys, and each key independently identifies that embedding site in both the template structure and any corresponding TemplateInstance.
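
The pattern and uniqueness constraints on keys could be checked together as in this sketch. The function name and error strings are hypothetical, and comparing keys as exact, case-sensitive strings is an assumption of the example.

```python
import re
from typing import Iterable, List

KEY_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_-]*")


def check_embedded_keys(keys: Iterable[str]) -> List[str]:
    """Return error messages for malformed or repeated keys in one Template."""
    errors: List[str] = []
    seen = set()
    for key in keys:
        if KEY_PATTERN.fullmatch(key) is None:
            errors.append(f"invalid key: {key!r}")
        if key in seen:
            errors.append(f"duplicate key: {key!r}")
        seen.add(key)
    return errors
```

Embedding the same reusable Field twice is fine as long as the two embedding sites carry different keys, so the check operates on keys alone and never on the referenced artifact identifiers.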

Requirements

ValueRequirement identifies whether a value is required, recommended, or optional in the embedding context. Required means that a value must be supplied for conformance. Recommended and Optional are identical for conformance purposes: absence of a value MUST NOT cause conformance failure in either case. The distinction is one of authoring guidance only: implementations SHOULD encourage entry for Recommended fields and MAY issue warnings when such fields are left empty.

ValueRequirement ::= "required" | "recommended" | "optional"

When ValueRequirement is absent from an EmbeddedArtifact, the default is "optional".

Cardinality

Cardinality identifies the permitted number of occurrences for the embedded artifact in the embedding context.

Cardinality ::= cardinality(
                  MinCardinality
                  [MaxCardinality]
                )

MinCardinality ::= min_cardinality(
                     NonNegativeInteger
                   )

MaxCardinality ::= max_cardinality(
                     NonNegativeInteger
                   )

When MaxCardinality is absent from a present Cardinality, the cardinality is unbounded above: any number of occurrences greater than or equal to the specified MinCardinality is permitted. Unboundedness is therefore expressed by omission of MaxCardinality rather than by a distinct construct.

When Cardinality is absent from an EmbeddedArtifact, the implied default is min_cardinality(1) with max_cardinality(1): the embedded artifact MUST appear exactly once.

ValueRequirement and Cardinality are orthogonal. ValueRequirement governs whether the user is obligated to supply any values at all. Cardinality governs the permitted count of values if any are supplied. A field may therefore be Optional — meaning the user is not required to fill it in — while carrying a min_cardinality greater than one, meaning that if values are supplied, at least that many must be present. For example, a primer pair field might be Optional but carry min_cardinality(2), because a primer pair is only interpretable when both the forward and reverse primers are specified together.
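
One way to read the interaction of ValueRequirement and Cardinality is sketched below. It assumes that zero supplied values is acceptable unless the requirement is "required", and that cardinality bounds apply only once at least one value is present; the Cardinality class, function name, and messages are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Cardinality:
    min: int
    max: Optional[int] = None  # None means unbounded above


DEFAULT_CARDINALITY = Cardinality(1, 1)  # used when Cardinality is absent


def check_occurrences(
    count: int,
    requirement: str = "optional",          # default when absent
    cardinality: Optional[Cardinality] = None,
) -> List[str]:
    """Return conformance errors for the number of values supplied."""
    card = DEFAULT_CARDINALITY if cardinality is None else cardinality
    if count == 0:
        # Only "required" makes absence an error; "recommended" behaves
        # like "optional" for conformance purposes.
        return ["a value is required"] if requirement == "required" else []
    errors: List[str] = []
    if count < card.min:
        errors.append(f"expected at least {card.min} occurrences, found {count}")
    if card.max is not None and count > card.max:
        errors.append(f"expected at most {card.max} occurrences, found {count}")
    return errors
```

Under this reading the primer-pair example yields no error for zero values, an error for one value, and no error for two.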

Visibility

Visibility determines whether the embedded artifact is shown in rendered interfaces. It is modeled as an embedding property rather than as a rendering hint because it applies to any kind of embedded artifact, not only to fields.

Visibility ::= "visible" | "hidden"

When Visibility is absent from an EmbeddedArtifact, the default is "visible".

Defaults

A default value is a value used to pre-populate a field at instance-creation time when no explicit value has yet been supplied by the user. Defaults exist at two layers:

  • A field-level default lives on the reusable Field’s FieldSpec. It is set when the field is authored and is shared by every Template that embeds the field.
  • An embedding-level default lives on the EmbeddedXxxField inside a Template. It overrides the field-level default for embeddings within that one Template.

Every concrete field family carries an optional default at both layers, with one exception: AttributeValueField carries no default at either layer (an AttributeValue instance is a per-instance pairing of an attribute name and a value, and a default is not meaningful).

The two default-value types match: at each layer the slot is typed with the family-specific Value type. The per-family typing is:

| Family | Field-level slot (on XxxFieldSpec) | Embedding-level slot (on EmbeddedXxxField) |
|---|---|---|
| Text | [TextValue] | [TextValue] |
| IntegerNumber | [IntegerNumberValue] | [IntegerNumberValue] |
| RealNumber | [RealNumberValue] | [RealNumberValue] |
| Boolean | [BooleanValue] | [BooleanValue] |
| Date | [DateValue] | [DateValue] (polymorphic: YearValue, YearMonthValue, or FullDateValue) |
| Time | [TimeValue] | [TimeValue] |
| DateTime | [DateTimeValue] | [DateTimeValue] |
| ControlledTerm | [ControlledTermValue] | [ControlledTermValue] |
| SingleValuedEnum | [EnumValue] | [EnumValue] |
| MultiValuedEnum | [EnumValue*] (zero or more) | [EnumValue*] (zero or more) |
| Link | [LinkValue] | [LinkValue] |
| Email | [EmailValue] | [EmailValue] |
| PhoneNumber | [PhoneNumberValue] | [PhoneNumberValue] |
| Orcid | [OrcidValue] | [OrcidValue] |
| Ror | [RorValue] | [RorValue] |
| Doi | [DoiValue] | [DoiValue] |
| PubMedId | [PubMedIdValue] | [PubMedIdValue] |
| Rrid | [RridValue] | [RridValue] |
| NihGrantId | [NihGrantIdValue] | [NihGrantIdValue] |
| AttributeValue | (no default) | (no default) |

The shape is uniform across layers: every default at every layer is the family’s Value type. For the enum families this means the field-level default is an EnumValue (or sequence of EnumValue) — the same kind-tagged object form that appears at the embedding level. The Token carried inside each default EnumValue MUST equal the Token of one of the spec’s PermissibleValue+ entries; for MultiValuedEnumFieldSpec the sequence MUST NOT contain duplicate tokens.
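
The token constraints on enum defaults might be checked as follows. This is a sketch: the function name, the flat list-of-tokens representation, and the rule that a single-valued enum admits at most one default token are assumptions made for the example.

```python
from typing import Iterable, List, Sequence


def check_enum_default(
    default_tokens: Sequence[str],
    permissible_tokens: Iterable[str],
    multi_valued: bool,
) -> List[str]:
    """Return errors for an enum default given the spec's permissible tokens."""
    errors: List[str] = []
    if not multi_valued and len(default_tokens) > 1:
        errors.append("a single-valued enum admits at most one default")
    allowed = set(permissible_tokens)
    seen = set()
    for token in default_tokens:
        if token not in allowed:
            errors.append(f"unknown token: {token!r}")
        if token in seen:
            errors.append(f"duplicate token: {token!r}")
        seen.add(token)
    return errors
```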

Precedence and absence semantics. Both layers are independent and optional. The four cases:

| Field-level | Embedding-level | Effective default |
|---|---|---|
| absent | absent | none: the field has no default |
| present | absent | the field-level default |
| absent | present | the embedding-level default |
| present | present | the embedding-level default (it overrides the field-level default) |

There is no mechanism for an embedding to unset a field-level default: replacing a field-level default with no default at all is not expressible in this version of the model.

Defaults are UI/UX initialisation only. A default value’s sole role is to seed an instance’s value at creation time, so that a user-facing form can pre-fill the corresponding input. Defaults do not appear in the wire form of TemplateInstance artifacts and do not affect the RDF projection. When an instance is created and the user accepts the default without modification, the resulting FieldValue carries the default value as if the user had typed it in by hand; from the instance’s perspective the default and a user-supplied identical value are indistinguishable. When an instance is created and the user does not supply a value (and the field is not required), the corresponding FieldValue is omitted entirely — the default does not appear by virtue of having existed.
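
The four precedence cases reduce to a single fallback. In this sketch None stands for an absent default, the function name is hypothetical, and the dictionaries used as values are placeholders rather than the wire form.

```python
from typing import Optional, TypeVar

V = TypeVar("V")


def effective_default(
    field_level: Optional[V],
    embedding_level: Optional[V],
) -> Optional[V]:
    """Return the embedding-level default if present, else the field-level one."""
    # "is not None" rather than truthiness, so an empty multi-valued enum
    # default at the embedding level is treated as present (an assumption).
    if embedding_level is not None:
        return embedding_level
    return field_level  # may itself be None: the field has no default


# Example: the embedding overrides the reusable field's default.
field_default = {"kind": "TextValue", "value": "unknown"}     # placeholder shape
embedding_default = {"kind": "TextValue", "value": "n/a"}     # placeholder shape
seed = effective_default(field_default, embedding_default)
```

Because None is the only way to express absence here, the sketch shares the model's limitation: an embedding cannot unset a field-level default.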

Label Override

LabelOverride provides template-specific labeling for an embedded artifact. This allows a template to override the default label of the referenced reusable artifact in that embedding context.

AlternativeLabel ::= alternative_label(
                       MultilingualString
                     )

LabelOverride ::= label_override(
                    Label
                    AlternativeLabel*
                  )

AlternativeLabel is a MultilingualString: each entry is itself a localization set for one alternative phrasing of the artifact’s display label.

Help Text Override

HelpTextOverride provides template-specific authored guidance for an embedded field. When present, it replaces the field’s canonical HelpText for that embedding context only. The reusable Field’s HelpText remains the canonical content for all other embedding contexts (and for the field rendered standalone).

HelpTextOverride ::= help_text_override( MultilingualString )

HelpTextOverride is a MultilingualString: it carries the same kind of authored guidance as HelpText, but scoped to a single embedding site. The override’s presentation — inline, tooltip, both, or none — is selected by the enclosing Template’s HelpDisplayMode exactly as for the underlying HelpText.

The precedence rule is straightforward: at an embedding site, the renderer displays the embedding’s HelpTextOverride if present, otherwise the referenced Field’s HelpText, otherwise nothing. The override replaces the field’s HelpText as a whole rather than merging with it: localizations present in the field’s HelpText but absent from the embedding’s HelpTextOverride do not fall back.
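
The replace-not-merge behaviour is easy to get wrong, so a small sketch may help. It assumes a MultilingualString is represented as a dictionary from language tag to text and that lookup is by exact tag; both the representation and the function names are hypothetical.

```python
from typing import Dict, Optional

Localizations = Dict[str, str]  # language tag -> text (assumed representation)


def effective_help_text(
    field_help: Optional[Localizations],
    override: Optional[Localizations],
) -> Optional[Localizations]:
    """Pick the override wholesale if present, else the field's HelpText."""
    return override if override is not None else field_help


def help_for_language(
    field_help: Optional[Localizations],
    override: Optional[Localizations],
    tag: str,
) -> Optional[str]:
    """Look up one localization; never falls back from override to field."""
    chosen = effective_help_text(field_help, override)
    if chosen is None:
        return None
    return chosen.get(tag)
```

With a field HelpText in English and German and an override in English only, a German lookup returns nothing rather than the field's German text.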

Properties

A Property associates a semantic property IRI with an EmbeddedField or EmbeddedTemplate within a specific Template. The property IRI identifies the RDF property that the embedded artifact’s value represents in that template context. The optional PropertyLabel provides a human-readable label for the property.

Property is an embedding-level construct. It is distinct from the intrinsic metadata of the referenced Field or Template artifact. The same reusable artifact may be embedded in different templates under different property IRIs.

Property ::= property(
               PropertyIri
               [PropertyLabel]
             )

PropertyIri   ::= property_iri( Iri )
PropertyLabel ::= property_label( MultilingualString )

PropertyLabel is a MultilingualString carrying one or more language-tagged localizations of the property’s human-readable label.

Field Specs

A FieldSpec is the semantic configuration block carried by a concrete Field artifact. It specifies what kind of value the field accepts, any constraints on that value, and any compatible rendering hints for presentation. Each concrete Field variant carries exactly one FieldSpec that matches its kind: a TextField carries a TextFieldSpec, a DateField carries a DateFieldSpec, and so on. The correspondence between each FieldSpec and its permitted Value form is given in the Field Spec And Value Correspondence section.

One might ask why FieldSpec exists as a separate construct rather than folding its content directly into the concrete Field artifact. The answer is separation of concerns: the concrete field artifact — TextField, DateField, and so on — answers the question “what kind of reusable field is this?” and carries the artifact’s identity, catalog metadata, versioning, and the rendered question-text label. The FieldSpec answers the separate question “what are the value rules and rendering-compatible properties for this kind of field?” Keeping these concerns distinct means that artifact identity, catalog metadata, and lifecycle/versioning information remain uniform across all field kinds, while value semantics and field-specific configuration vary per family through FieldSpec.

FieldSpec productions are grouped here by field family, mirroring the abstract Field hierarchy in the Kernel Grammar. Temporal field specs, which carry additional precision and rendering configuration, are detailed in the Temporal Field Specs subsection. Controlled term source declarations, which specify the ontological authorities from which controlled-term values may be drawn, are covered in the Controlled Term Sources subsection. Rendering hints for all field families are defined in the Rendering Hints subsection, with the exception of temporal rendering hints which are defined alongside their field specs.

FieldSpec ::= TextFieldSpec
            | NumericFieldSpec
            | BooleanFieldSpec
            | TemporalFieldSpec
            | ControlledTermFieldSpec
            | EnumFieldSpec
            | LinkFieldSpec
            | ContactFieldSpec
            | ExternalAuthorityFieldSpec
            | AttributeValueFieldSpec

NumericFieldSpec ::= IntegerNumberFieldSpec
                   | RealNumberFieldSpec

TextFieldSpec ::= text_field_spec(
                    [TextValue]
                    [MinLength]
                    [MaxLength]
                    [ValidationRegex]
                    [LangTagRequirement]
                    [TextRenderingHint]
                  )

LangTagRequirement ::= "langTagRequired"
                     | "langTagOptional"
                     | "langTagForbidden"

IntegerNumberFieldSpec ::= integer_number_field_spec(
                             [IntegerNumberValue]
                             [Unit]
                             [IntegerNumberMinValue]
                             [IntegerNumberMaxValue]
                             [NumericRenderingHint]
                           )

RealNumberFieldSpec ::= real_number_field_spec(
                          RealNumberDatatypeKind
                          [RealNumberValue]
                          [Unit]
                          [RealNumberMinValue]
                          [RealNumberMaxValue]
                          [NumericRenderingHint]
                        )

Unit ::= unit(
           Iri
           [Label]
         )

MinLength ::= min_length(
                NonNegativeInteger
              )

MaxLength ::= max_length(
                NonNegativeInteger
              )

ValidationRegex ::= validation_regex(
                      string
                    )

IntegerNumberMinValue ::= integer_number_min_value(
                            IntegerNumberValue
                          )

IntegerNumberMaxValue ::= integer_number_max_value(
                            IntegerNumberValue
                          )

RealNumberMinValue ::= real_number_min_value(
                         RealNumberValue
                       )

RealNumberMaxValue ::= real_number_max_value(
                         RealNumberValue
                       )

BooleanFieldSpec ::= boolean_field_spec(
                       [BooleanValue]
                       [BooleanRenderingHint]
                     )

TemporalFieldSpec ::= DateFieldSpec
                    | TimeFieldSpec
                    | DateTimeFieldSpec

ControlledTermFieldSpec ::= controlled_term_field_spec(
                              [ControlledTermValue]
                              ControlledTermSource+
                              [ControlledTermRenderingHint]
                            )

EnumFieldSpec ::= SingleValuedEnumFieldSpec
                | MultiValuedEnumFieldSpec

SingleValuedEnumFieldSpec ::= single_valued_enum_field_spec(
                                PermissibleValue+
                                [EnumValue]
                                [SingleValuedEnumRenderingHint]
                              )

MultiValuedEnumFieldSpec ::= multi_valued_enum_field_spec(
                               PermissibleValue+
                               EnumValue*
                               [MultiValuedEnumRenderingHint]
                             )

PermissibleValue ::= permissible_value(
                       Token
                       [Label]
                       [Description]
                       Meaning*
                     )

Token ::= token(
            string
          )

Meaning ::= meaning(
              TermIri
              [Label]
            )

LinkFieldSpec ::= link_field_spec(
                    [LinkValue]
                    [LinkRenderingHint]
                  )

ContactFieldSpec ::= EmailFieldSpec
                   | PhoneNumberFieldSpec

EmailFieldSpec ::= email_field_spec(
                     [EmailValue]
                     [EmailRenderingHint]
                   )

PhoneNumberFieldSpec ::= phone_number_field_spec(
                           [PhoneNumberValue]
                           [PhoneNumberRenderingHint]
                         )

ExternalAuthorityFieldSpec ::= OrcidFieldSpec
                             | RorFieldSpec
                             | DoiFieldSpec
                             | PubMedIdFieldSpec
                             | RridFieldSpec
                             | NihGrantIdFieldSpec

OrcidFieldSpec ::= orcid_field_spec(
                     [OrcidValue]
                     [OrcidRenderingHint]
                   )

RorFieldSpec ::= ror_field_spec(
                   [RorValue]
                   [RorRenderingHint]
                 )

DoiFieldSpec ::= doi_field_spec(
                   [DoiValue]
                   [DoiRenderingHint]
                 )

PubMedIdFieldSpec ::= pub_med_id_field_spec(
                        [PubMedIdValue]
                        [PubMedIdRenderingHint]
                      )

RridFieldSpec ::= rrid_field_spec(
                    [RridValue]
                    [RridRenderingHint]
                  )

NihGrantIdFieldSpec ::= nih_grant_id_field_spec(
                          [NihGrantIdValue]
                          [NihGrantIdRenderingHint]
                        )

AttributeValueFieldSpec ::= attribute_value_field_spec()

Unit denotes an identified measurement or quantity unit optionally paired with a human-readable label.

The current placement of Unit on IntegerNumberFieldSpec and RealNumberFieldSpec is a pragmatic compromise. A later revision may introduce a distinct QuantityFieldSpec to model numeric values with fixed units more explicitly.

IntegerNumberMinValue and IntegerNumberMaxValue specify inclusive lower and upper bounds on the integer values that an IntegerNumberField accepts. Both are expressed as IntegerNumberValue constructs. RealNumberMinValue and RealNumberMaxValue are the analogous bounds on RealNumberField and carry RealNumberValue constructs whose RealNumberDatatypeKind matches the field’s declared datatype.
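
The inclusive-bound rule can be illustrated with a small Python sketch. The helper name and the use of plain int values in place of IntegerNumberValue constructs are assumptions made for this illustration; they are not part of the specification.

```python
from typing import Optional


def integer_within_bounds(
    value: int,
    min_value: Optional[int] = None,
    max_value: Optional[int] = None,
) -> bool:
    """Return True when value satisfies the optional inclusive bounds.

    Hypothetical helper: plain ints stand in for IntegerNumberValue,
    IntegerNumberMinValue, and IntegerNumberMaxValue.
    """
    # An absent bound imposes no constraint on that side.
    if min_value is not None and value < min_value:
        return False
    if max_value is not None and value > max_value:
        return False
    return True
```

Because both bounds are inclusive, a value equal to either bound is accepted. A check for RealNumberField would follow the same shape, with the additional requirement that the bound's RealNumberDatatypeKind match the field's declared datatype.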

A RealNumberFieldSpec MAY use the family-shared NumericRenderingHint; if it carries a non-zero decimalPlaces rendering hint, the hint applies to display rounding only and does not constrain the lexical form of submitted values. IntegerNumberFieldSpec MAY also use NumericRenderingHint; a decimalPlaces value other than 0 on an integer field is harmless (display only) and SHOULD be omitted when not meaningful.

EnumFieldSpec is refined along a single dimension: cardinality. SingleValuedEnumFieldSpec permits exactly one selection; MultiValuedEnumFieldSpec permits zero or more simultaneous selections (subject to the embedding’s Cardinality). The two specs share a common option model: every permissible value is a PermissibleValue carrying a canonical Token key together with optional human-readable Label and Description localizations and zero or more Meaning entries that bind the token to ontology terms. The Token strings of a spec’s permissible values MUST be unique within that spec; the spec’s PermissibleValue+ is the closed set of values an instance may carry.

The two enum specs each carry a field-level default per the Defaults section: SingleValuedEnumFieldSpec an optional [EnumValue], MultiValuedEnumFieldSpec a (possibly empty) EnumValue*. The Token carried inside each default EnumValue MUST equal the Token of one of the spec’s PermissibleValue+ entries; for MultiValuedEnumFieldSpec the sequence MUST NOT contain duplicate tokens.
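
The token and default rules above might be checked along the following lines. This is a minimal sketch: the function name, the use of plain strings for Token, and the error messages are all hypothetical and are not drawn from the normative validation algorithm.

```python
from collections import Counter
from typing import List, Sequence


def check_enum_spec(
    permissible_tokens: Sequence[str],
    default_tokens: Sequence[str],
    multi_valued: bool,
) -> List[str]:
    """Collect violations of the token and default rules for an enum spec."""
    errors: List[str] = []
    # PermissibleValue+ requires at least one entry.
    if not permissible_tokens:
        errors.append("at least one permissible value is required")
    # Token strings must be unique within the spec.
    for token, count in Counter(permissible_tokens).items():
        if count > 1:
            errors.append(f"duplicate permissible token: {token}")
    # A single-valued spec carries at most one default EnumValue.
    if not multi_valued and len(default_tokens) > 1:
        errors.append("single-valued spec carries more than one default")
    # Every default must name one of the permissible tokens.
    allowed = set(permissible_tokens)
    for token in default_tokens:
        if token not in allowed:
            errors.append(f"default token not permissible: {token}")
    # Multi-valued defaults must not repeat a token.
    if multi_valued:
        for token, count in Counter(default_tokens).items():
            if count > 1:
                errors.append(f"duplicate default token: {token}")
    return errors
```

An empty result means the spec satisfies the rules covered here; the function reports every violation it finds rather than stopping at the first.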

A Meaning carried by a PermissibleValue binds the token to a term IRI in an external vocabulary or ontology. A permissible value MAY carry zero, one, or several Meaning entries. Each Meaning MAY additionally carry an optional Label recording the bound term’s human-readable label (in the same way ControlledTermValue.Label caches the term’s label inline) so that consumers without ontology access can render the bound term’s display name. The Meaning.Label is the label of the bound term, distinct from the surrounding PermissibleValue.Label which is the display label of the permissible value itself. When the RDF projection is applied (see rdf-projection.md), an EnumValue whose token matches a PermissibleValue carrying one or more Meaning entries projects as the corresponding term IRIs; an EnumValue whose matching permissible value carries no Meaning projects as a plain string literal.
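
The projection rule in the last sentence can be sketched as follows. The authoritative definition is in rdf-projection.md; the tuple tags "iri" and "literal", the dictionary shape, and the choice of the token string as the literal's content are assumptions made for this illustration, and the IRIs in the usage note are placeholders.

```python
from typing import Dict, List, Tuple


def project_enum_value(
    token: str,
    meanings_by_token: Dict[str, List[str]],
) -> List[Tuple[str, str]]:
    """Project an enum token to term IRIs or to a plain string literal.

    meanings_by_token is a hypothetical lookup from each permissible
    Token to the term IRIs of its Meaning entries (possibly empty).
    """
    if token not in meanings_by_token:
        raise ValueError(f"token is not a permissible value: {token}")
    iris = meanings_by_token[token]
    if iris:
        # One or more Meaning entries: project as the bound term IRIs.
        return [("iri", iri) for iri in iris]
    # No Meaning entries: project as a plain string literal
    # (assumed here to carry the token string itself).
    return [("literal", token)]
```

For example, with `{"yes": ["https://example.org/terms/Yes"], "unknown": []}`, the token "yes" projects to one IRI and the token "unknown" projects to a literal.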

ControlledTermSource is defined in Controlled Term Sources.

Temporal Field Specs

TemporalFieldSpec denotes temporal-valued fields and is refined into strongly typed date, time, and date-time forms. This section groups the temporal field-spec productions together with their compatible rendering hints and value-type constraints.

DateFieldSpec ::= date_field_spec(
                    DateValueType
                    [DateValue]
                    [DateRenderingHint]
                  )

DateValueType ::= "year" | "yearMonth" | "fullDate"

TimeFieldSpec ::= time_field_spec(
                    [TimeValue]
                    [TimePrecision]
                    [TimezoneRequirement]
                    [TimeRenderingHint]
                  )

TimePrecision ::= "hourMinute" | "hourMinuteSecond" | "hourMinuteSecondFraction"

TimezoneRequirement ::= "timezoneRequired" | "timezoneNotRequired"

TimePrecision identifies the finest time precision permitted by a TimeFieldSpec.

"hourMinute", "hourMinuteSecond", and "hourMinuteSecondFraction" identify time values constrained respectively to hour-and-minute precision, second precision, and fractional-second precision.

TimezoneRequirement identifies whether timezone information is required by the field spec.

The declared TimePrecision determines the required lexical form of conforming TimeValue constructs. Finer components than the declared precision MUST be omitted entirely; zeroing them is not equivalent to omitting them. Specifically:

  • "hourMinute": TimeValue MUST carry only hour and minute components (HH:MM).
  • "hourMinuteSecond": TimeValue MUST carry hour, minute, and second components (HH:MM:SS), with no fractional seconds.
  • "hourMinuteSecondFraction": TimeValue MAY carry a fractional seconds component.

When TimePrecision is absent from a TimeFieldSpec, no precision constraint applies and any well-formed TimeValue is conforming.
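
The strict-truncation rule for TimeValue can be approximated with regular expressions, as in the sketch below. Several details are assumptions rather than statements of the specification: the optional timezone suffix is taken to be Z or a ±HH:MM offset, leap seconds are ignored, "hourMinuteSecondFraction" is read as requiring seconds with an optional fraction, and TimezoneRequirement is not enforced. The function and pattern names are hypothetical.

```python
import re
from typing import Optional

# Building blocks (assumed lexical shapes, not taken from the grammar).
_HM = r"(?:[01]\d|2[0-3]):[0-5]\d"            # HH:MM
_S = r":[0-5]\d"                               # :SS
_TZ = r"(?:Z|[+-](?:[01]\d|2[0-3]):[0-5]\d)?"  # optional timezone suffix

_PATTERNS = {
    "hourMinute": re.compile(rf"{_HM}{_TZ}"),
    "hourMinuteSecond": re.compile(rf"{_HM}{_S}{_TZ}"),
    "hourMinuteSecondFraction": re.compile(rf"{_HM}{_S}(?:\.\d+)?{_TZ}"),
}
# With no declared precision, any of the three shapes is accepted.
_ANY = re.compile(rf"{_HM}(?:{_S}(?:\.\d+)?)?{_TZ}")


def time_matches_precision(lexical: str, precision: Optional[str] = None) -> bool:
    """Return True when the lexical form fits the declared TimePrecision."""
    pattern = _ANY if precision is None else _PATTERNS[precision]
    return pattern.fullmatch(lexical) is not None
```

Under these assumptions, "14:30:00" is rejected for "hourMinute" even though its seconds are zero, which reflects the rule that finer components must be omitted rather than zeroed.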

The same strict-truncation rule applies to DateTimeValueType for DateTimeValue constructs:

  • "dateHourMinute": the time component of DateTimeValue MUST carry only hour and minute (YYYY-MM-DDTHH:MM).
  • "dateHourMinuteSecond": the time component MUST carry hour, minute, and second (YYYY-MM-DDTHH:MM:SS), with no fractional seconds.
  • "dateHourMinuteSecondFraction": the time component MAY carry a fractional seconds component.

DateTimeFieldSpec ::= date_time_field_spec(
                        DateTimeValueType
                        [DateTimeValue]
                        [TimezoneRequirement]
                        [DateTimeRenderingHint]
                      )

DateTimeValueType ::= "dateHourMinute" | "dateHourMinuteSecond" | "dateHourMinuteSecondFraction"

DateTimeValueType identifies the finest permitted date-time precision.

"dateHourMinute", "dateHourMinuteSecond", and "dateHourMinuteSecondFraction" identify date-time values constrained respectively to minute precision, second precision, and fractional-second precision.

DateRenderingHint ::= date_rendering_hint(
                        [DateComponentOrder]
                        [Placeholder]
                      )

DateComponentOrder ::= "dayMonthYear" | "monthDayYear" | "yearMonthDay"

TimeRenderingHint ::= time_rendering_hint(
                        [TimeFormat]
                        [Placeholder]
                      )

DateTimeRenderingHint ::= date_time_rendering_hint(
                            [TimeFormat]
                            [Placeholder]
                          )

TimeFormat ::= "twelveHour" | "twentyFourHour"

DateComponentOrder identifies whether a date is rendered or acquired in day-month-year, month-day-year, or year-month-day order.

Controlled Term Sources

Controlled term sources define the ontological authorities from which controlled-term values may be drawn. A ControlledTermFieldSpec requires one or more ControlledTermSource entries. Each source specifies either an entire ontology, a branch of an ontology rooted at a given term, a set of individual ontology classes, or an external value set. TermIri is defined in the Scalar and Datatype Leaves section.

ControlledTermSource ::= OntologySource
                       | BranchSource
                       | ClassSource
                       | ValueSetSource

OntologySource ::= ontology_source(
                     OntologyReference
                   )

OntologyReference ::= ontology_reference(
                        OntologyIri
                        [OntologyDisplayHint]
                      )

OntologyDisplayHint ::= ontology_display_hint(
                          [OntologyAcronym]
                          [OntologyName]
                        )

BranchSource ::= branch_source(
                   OntologyReference
                   RootTermIri
                   [RootTermLabel]
                   [MaxTraversalDepth]
                 )

ClassSource ::= class_source(
                  ControlledTermClass+
                )

ControlledTermClass ::= controlled_term_class(
                          TermIri
                          [Label]
                          OntologyReference
                        )

ValueSetSource ::= value_set_source(
                     ValueSetIdentifier
                     [ValueSetName]
                     [ValueSetIri]
                   )

OntologyAcronym ::= ontology_acronym(
                      string
                    )

OntologyName ::= ontology_name(
                   MultilingualString
                 )

OntologyIri ::= ontology_iri(
                  Iri
                )

RootTermIri ::= root_term_iri(
                  Iri
                )

RootTermLabel ::= root_term_label(
                    MultilingualString
                  )

MaxTraversalDepth ::= max_traversal_depth(
                        NonNegativeInteger
                      )

ValueSetIdentifier ::= value_set_identifier(
                         string
                       )

ValueSetName ::= value_set_name(
                   MultilingualString
                 )

ValueSetIri ::= value_set_iri(
                  Iri
                )

OntologyIri, RootTermIri, and ValueSetIri denote IRIs used in controlled-term source specifications.

OntologyName, RootTermLabel, and ValueSetName are human-readable display names and carry MultilingualString values: each may be presented in one or more natural languages. OntologyAcronym and ValueSetIdentifier are technical short-form identifiers (e.g. an ontology acronym like "NCIT", a value-set key) and remain plain Unicode strings.

MaxTraversalDepth denotes a non-negative traversal-depth limit for branch-based controlled-term sources. When MaxTraversalDepth is absent, no depth limit applies and any descendant of the root term is admissible. A value of zero restricts the source to the root term itself.
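
The effect of MaxTraversalDepth can be sketched as a breadth-first walk over a term hierarchy. The toy hierarchy representation (a dictionary from a term to its direct children), the function name, and the reading that a depth of 1 admits the root plus its direct children are assumptions for this illustration; the specification fixes only the absent case and the zero case.

```python
from collections import deque
from typing import Dict, List, Optional, Set


def admissible_terms(
    children: Dict[str, List[str]],
    root: str,
    max_depth: Optional[int] = None,
) -> Set[str]:
    """Collect terms reachable from root within max_depth levels.

    max_depth=None means no limit; max_depth=0 admits only the root.
    """
    admitted = {root}
    queue = deque([(root, 0)])
    while queue:
        term, depth = queue.popleft()
        # Do not descend below the declared depth limit.
        if max_depth is not None and depth >= max_depth:
            continue
        for child in children.get(term, []):
            if child not in admitted:
                admitted.add(child)
                queue.append((child, depth + 1))
    return admitted
```

Breadth-first order means each term is reached by its shortest path from the root, so a term reachable by several paths is admitted if any path fits within the limit.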

When OntologyDisplayHint is present on an OntologyReference, at least one of its OntologyAcronym or OntologyName components MUST be present. A display hint with neither component is non-conforming.

A ControlledTermClass SHOULD include a Label. The label is captured at the time the class is declared as a source, when the term’s display text is typically known; consumers without ontology access rely on this label to render the class. Conforming producers MAY omit the label when it is not available, in which case downstream consumers must resolve the label from the term IRI by other means. The same recommendation applies to BranchSource.RootTermLabel: producers SHOULD include it when declaring a branch source.

Rendering Hints

A RenderingHint is an optional presentational instruction carried by a FieldSpec that tells a rendering implementation how to display the field. Rendering hints are strictly presentational: they do not affect the meaning, structure, or validation of field values. Each rendering hint is typed to a specific FieldSpec family, so only compatible hint-and-field-spec combinations are expressible. For example, a TextRenderingHint may only appear on a TextFieldSpec, and a SingleValuedEnumRenderingHint may only appear on a SingleValuedEnumFieldSpec. Note that temporal rendering hints (DateRenderingHint, TimeRenderingHint, and DateTimeRenderingHint) are defined alongside their respective field specs in the Temporal Field Specs subsection.

RenderingHint ::= TextRenderingHint
                | SingleValuedEnumRenderingHint
                | MultiValuedEnumRenderingHint
                | NumericRenderingHint
                | BooleanRenderingHint
                | DateRenderingHint
                | TimeRenderingHint
                | DateTimeRenderingHint
                | ControlledTermRenderingHint
                | EmailRenderingHint
                | PhoneNumberRenderingHint
                | LinkRenderingHint
                | OrcidRenderingHint
                | RorRenderingHint
                | DoiRenderingHint
                | PubMedIdRenderingHint
                | RridRenderingHint
                | NihGrantIdRenderingHint

TextRenderingHint ::= text_rendering_hint(
                        [TextLineMode]
                        [Placeholder]
                      )

TextLineMode ::= "singleLine" | "multiLine"

SingleValuedEnumRenderingHint ::= "radio" | "dropdown"

MultiValuedEnumRenderingHint ::= "checkbox" | "multiSelect"

NumericRenderingHint ::= numeric_rendering_hint(
                           [DecimalPlaces]
                           [Placeholder]
                         )

DecimalPlaces ::= decimal_places(
                    NonNegativeInteger
                  )

BooleanRenderingHint ::= "checkbox" | "toggle" | "radio" | "dropdown"

ControlledTermRenderingHint ::= controlled_term_rendering_hint( [Placeholder] )
EmailRenderingHint          ::= email_rendering_hint(            [Placeholder] )
PhoneNumberRenderingHint    ::= phone_number_rendering_hint(     [Placeholder] )
LinkRenderingHint           ::= link_rendering_hint(             [Placeholder] )
OrcidRenderingHint          ::= orcid_rendering_hint(            [Placeholder] )
RorRenderingHint            ::= ror_rendering_hint(              [Placeholder] )
DoiRenderingHint            ::= doi_rendering_hint(              [Placeholder] )
PubMedIdRenderingHint       ::= pub_med_id_rendering_hint(       [Placeholder] )
RridRenderingHint           ::= rrid_rendering_hint(             [Placeholder] )
NihGrantIdRenderingHint     ::= nih_grant_id_rendering_hint(     [Placeholder] )

Placeholder ::= placeholder( MultilingualString )

Placeholder is a MultilingualString-valued production carrying sample-input text shown inside an empty text-entry widget. It is a purely presentational demonstration of format, distinct from HelpText, which carries semantic content about the field’s meaning. Placeholder content is not validated against the field spec’s lexical-form constraints; a placeholder of "YYYY-MM-DD" may appear on a date field whose values must conform to ISO 8601, since the placeholder is a demonstration of the expected lexical shape, not an instance of one.

Placeholder appears as an optional slot on every rendering hint attached to a text-entry-capable field family: TextRenderingHint, NumericRenderingHint, DateRenderingHint, TimeRenderingHint, DateTimeRenderingHint, plus the ten rendering hints introduced for ControlledTermField, EmailField, PhoneNumberField, LinkField, and the six identifier families. It does NOT appear on BooleanRenderingHint, SingleValuedEnumRenderingHint, or MultiValuedEnumRenderingHint, since those widgets are not text-entry surfaces.

Note on TextRenderingHint shape. In earlier revisions of this spec, TextRenderingHint was a bare string enum ("singleLine" | "multiLine"). It has been restructured into a structured object carrying an optional TextLineMode (the former enum content) plus the optional Placeholder slot. This is a wire-form-breaking change for templates that carry the bare-string form; such templates require migration to the object form before they will decode under this revision of the spec.

This specification draws a strict distinction between semantic structure and presentation. Semantic distinctions MUST be modeled in FieldSpec when they affect the meaning, cardinality, or value structure of a field. This includes distinctions such as single-valued versus multi-valued enum, date versus time versus date-time, and permitted temporal precision. Purely presentational distinctions MUST NOT be modeled as separate field specs. Instead, distinctions such as single-line versus multi-line text entry, date component ordering, and 12-hour versus 24-hour time display MUST be expressed only through compatible typed rendering hints.

Accordingly, TextFieldSpec is a single semantic field spec whose single-line and multi-line display forms are represented by TextRenderingHint.

A TextFieldSpec MAY additionally define a default text value, minimum length, maximum length, validating regular expression, and a LangTagRequirement constraining the presence of the lang slot on conforming TextValue instances.

LangTagRequirement identifies whether the lang slot of a TextValue is required, optional, or forbidden by the field spec:

  • "langTagRequired" — every TextValue admitted by this field MUST carry a lang slot with a well-formed BCP 47 tag. Suitable for fields whose values are natural-language text that authors expect to be language-tagged (e.g., titles, abstracts, captions).
  • "langTagOptional" — every TextValue admitted MAY carry a lang slot. This matches the default behaviour when LangTagRequirement is absent and is provided for explicitness.
  • "langTagForbidden" — every TextValue admitted MUST NOT carry a lang slot. Suitable for fields whose values are technical identifiers, slugs, query fragments, or other strings for which a natural-language tag has no meaning.

When LangTagRequirement is absent from a TextFieldSpec, the constraint behaves as "langTagOptional" (the historical default).

The LangTagRequirement constraint applies to each TextValue individually: in a multi-valued field, every value MUST satisfy the constraint independently. The constraint also applies to the field-spec-level defaultValue (when present) and to any embedding-level defaultValue carried by an EmbeddedTextField.
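
A presence check for the lang slot might look like the sketch below. It covers only presence or absence of the slot; BCP 47 well-formedness is not checked. The function names and the use of None to represent an absent lang slot are assumptions for this illustration.

```python
from typing import Iterable, Optional


def lang_slot_conforms(lang: Optional[str], requirement: Optional[str] = None) -> bool:
    """Check one TextValue's lang slot against a LangTagRequirement.

    lang is None when the TextValue carries no lang slot.
    """
    # An absent requirement behaves as "langTagOptional".
    effective = requirement or "langTagOptional"
    if effective == "langTagRequired":
        return lang is not None
    if effective == "langTagForbidden":
        return lang is None
    if effective == "langTagOptional":
        return True
    raise ValueError(f"unknown LangTagRequirement: {requirement}")


def all_values_conform(
    langs: Iterable[Optional[str]],
    requirement: Optional[str] = None,
) -> bool:
    """Apply the constraint to every value of a multi-valued field."""
    return all(lang_slot_conforms(lang, requirement) for lang in langs)
```

The same per-value check would also be applied to a field-spec-level or embedding-level default value when one is present.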

Similarly, EnumFieldSpec distinguishes SingleValuedEnumFieldSpec from MultiValuedEnumFieldSpec semantically, while the rendering hint determines whether the UI uses radio buttons, dropdown, checkboxes, or multi-select presentation. Typed rendering hints make incompatible combinations structurally invalid.

Temporal semantics are also split structurally: DateFieldSpec, TimeFieldSpec, and DateTimeFieldSpec are distinct semantic field specs, and each carries only the rendering hints and temporal options that are meaningful for that temporal category.

The current rendering vocabulary is explicit but deliberately small: numeric fields use NumericRenderingHint (which carries an optional DecimalPlaces for display-time rounding); date fields use DateRenderingHint (with optional DateComponentOrder); time fields use TimeRenderingHint (with optional TimeFormat); and date-time fields use DateTimeRenderingHint (also with optional TimeFormat).

DecimalPlaces is a presentation concern, not a value-semantics constraint. Conforming consumers SHOULD use it to control display rounding and MAY use it as a UX-level input nicety (e.g., limiting the number of digits an end-user can type after the decimal point). It does not constrain the lexical form of a submitted RealNumberValue; conforming validators MUST NOT reject a value purely on grounds of decimal-places mismatch with the rendering hint. The slot is meaningful for RealNumberFieldSpec; on IntegerNumberFieldSpec it is harmless and conventionally omitted.
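
The display-only role of DecimalPlaces can be illustrated with Python's decimal module. The function name is hypothetical, and the half-up rounding mode is an assumption; the specification does not prescribe a rounding mode here.

```python
from decimal import Decimal, ROUND_HALF_UP


def display_form(lexical: str, decimal_places: int) -> str:
    """Render a numeric lexical form rounded to decimal_places for display.

    The stored value is untouched; only the rendered string is rounded.
    """
    # Decimal(1).scaleb(-2) has exponent -2, so quantize keeps two places.
    quantum = Decimal(1).scaleb(-decimal_places)
    return str(Decimal(lexical).quantize(quantum, rounding=ROUND_HALF_UP))
```

A submitted value of "3.14159" on a field whose hint carries two decimal places would remain "3.14159" in the instance and be shown as "3.14"; a validator must still accept the original lexical form.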

BooleanRenderingHint admits four widget choices — checkbox, toggle, radio, and dropdown — distinguished by how they handle the unset state of a boolean field. A boolean field has three observable states at the UI: a value of true, a value of false, and no value supplied (the user has not yet asserted either). The four widget choices differ in whether they can faithfully represent the unset state:

  • radio (a Yes / No radio pair) and dropdown (a Yes / No dropdown with no initial selection) admit three observable states — Yes selected, No selected, and neither selected — and so faithfully represent the unset case.
  • checkbox and toggle admit only two observable states (checked / unchecked, or on / off) and so cannot distinguish false from unset. They SHOULD be used only when the field’s ValueRequirement is required (so unset is not a valid resting state) or when the surrounding application is content to interpret unset as false.

The unset state is structurally represented in the value model by absence of a FieldValue for the embedding’s key, not by a third value within BooleanValue. BooleanValue.value carries true | false only.
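
The distinction between two-state and three-state widgets could be surfaced by authoring tools roughly as follows. This is a sketch only: the function names and the advisory text are hypothetical, and a plain boolean stands in for the field's ValueRequirement.

```python
from typing import Optional

THREE_STATE_WIDGETS = {"radio", "dropdown"}   # can show Yes, No, or neither
TWO_STATE_WIDGETS = {"checkbox", "toggle"}    # cannot show the unset state


def widget_represents_unset(hint: str) -> bool:
    """Return True when the widget can faithfully show the unset state."""
    if hint in THREE_STATE_WIDGETS:
        return True
    if hint in TWO_STATE_WIDGETS:
        return False
    raise ValueError(f"unknown BooleanRenderingHint: {hint}")


def widget_advisory(hint: str, value_required: bool) -> Optional[str]:
    """Return an advisory when a two-state widget is used on an optional field."""
    if widget_represents_unset(hint) or value_required:
        return None
    return f"'{hint}' cannot distinguish false from unset on an optional field"
```

The result is an advisory rather than an error, because the specification permits two-state widgets on optional fields when the application is content to interpret unset as false.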

Presentation Components

A PresentationComponent is a reusable artifact that contributes presentation or instructional structure to a rendered template without introducing data-bearing content. It is distinct from SchemaArtifact: where Template and Field define the structure and semantics of instance data, PresentationComponent exists purely to guide, organise, or annotate the rendered form — for example by embedding rich text instructions, illustrative images, video content, or structural breaks between sections.

PresentationComponent carries its own identity, metadata, and lifecycle information as an Artifact, making it independently reusable across multiple templates. It appears within a template only through EmbeddedPresentationComponent, which contributes no InstanceValue and is therefore invisible to the instance model. A conforming TemplateInstance MUST NOT contain an InstanceValue for an EmbeddedPresentationComponent.

The following concrete variants are defined:

PresentationComponent ::= RichTextComponent
                        | ImageComponent
                        | YoutubeVideoComponent
                        | SectionBreakComponent
                        | PageBreakComponent

RichTextComponent ::= rich_text_component(
                        PresentationComponentId
                        ModelVersion
                        CatalogMetadata
                        HtmlContent
                      )

ImageComponent ::= image_component(
                     PresentationComponentId
                     ModelVersion
                     CatalogMetadata
                     Iri
                     [Label]
                     [Description]
                   )

YoutubeVideoComponent ::= you_tube_video_component(
                            PresentationComponentId
                            ModelVersion
                            CatalogMetadata
                            Iri
                            [Label]
                            [Description]
                          )

SectionBreakComponent ::= section_break_component(
                            PresentationComponentId
                            ModelVersion
                            CatalogMetadata
                          )

PageBreakComponent ::= page_break_component(
                         PresentationComponentId
                         ModelVersion
                         CatalogMetadata
                       )

HtmlContent ::= html_content(
                  string
                )

HtmlContent denotes an HTML fragment represented as a Unicode string and used by a RichTextComponent.

The permitted HTML feature set and any sanitization requirements are outside the scope of this abstract specification and SHOULD be defined by concrete serialization specifications that build on this model.

The Iri slot on ImageComponent and YoutubeVideoComponent identifies the image or video resource referenced by the corresponding presentation component.

Label and Description on ImageComponent and YoutubeVideoComponent carry accessibility metadata. Label is a short alternative-text label (the image’s alt text or the video’s caption title); Description is a longer textual description for screen readers and other assistive technologies. Both are MultilingualString values, allowing localized accessibility text. Conforming producers SHOULD provide a Label for every ImageComponent and YoutubeVideoComponent that conveys meaningful content; decorative images MAY omit the label to indicate that no alternative text is needed.

Field Spec And Value Correspondence

The FieldSpec carried by a Field determines the Value form that MUST appear in any FieldValue corresponding to an embedding of that field. This is a normative constraint: a FieldValue that carries a Value of the wrong form for the referenced field’s FieldSpec is non-conforming.

The correspondence is applied through the EmbeddedArtifactKey chain. A FieldValue in a TemplateInstance carries an EmbeddedArtifactKey that identifies an EmbeddedField in the referenced Template. That EmbeddedField references a reusable Field, which carries a FieldSpec. It is that FieldSpec that determines the permitted Value form for the FieldValue. The correspondence therefore spans the full path from instance value through embedding context to reusable field definition.
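
The lookup path described above can be sketched in Python. The dataclasses, the dictionary-based template representation, and the function names are hypothetical simplifications; production names are used as plain strings, and the mapping follows the correspondence given in this section.

```python
from dataclasses import dataclass
from typing import Dict

# FieldSpec production name -> permitted Value production name.
VALUE_FORM_BY_FIELD_SPEC: Dict[str, str] = {
    "TextFieldSpec": "TextValue",
    "IntegerNumberFieldSpec": "IntegerNumberValue",
    "RealNumberFieldSpec": "RealNumberValue",
    "BooleanFieldSpec": "BooleanValue",
    "DateFieldSpec": "DateValue",
    "TimeFieldSpec": "TimeValue",
    "DateTimeFieldSpec": "DateTimeValue",
    "ControlledTermFieldSpec": "ControlledTermValue",
    "SingleValuedEnumFieldSpec": "EnumValue",
    "MultiValuedEnumFieldSpec": "EnumValue",
    "LinkFieldSpec": "LinkValue",
    "EmailFieldSpec": "EmailValue",
    "PhoneNumberFieldSpec": "PhoneNumberValue",
    "OrcidFieldSpec": "OrcidValue",
    "RorFieldSpec": "RorValue",
    "DoiFieldSpec": "DoiValue",
    "PubMedIdFieldSpec": "PubMedIdValue",
    "RridFieldSpec": "RridValue",
    "NihGrantIdFieldSpec": "NihGrantIdValue",
    "AttributeValueFieldSpec": "AttributeValue",
}


@dataclass(frozen=True)
class Field:
    field_spec: str  # name of the FieldSpec production the field carries


@dataclass(frozen=True)
class EmbeddedField:
    key: str      # the EmbeddedArtifactKey
    field: Field  # the reusable Field this embedding references


def permitted_value_form(template: Dict[str, EmbeddedField], key: str) -> str:
    """Follow key -> EmbeddedField -> Field -> FieldSpec -> Value form."""
    embedded = template[key]
    return VALUE_FORM_BY_FIELD_SPEC[embedded.field.field_spec]


def value_form_conforms(
    template: Dict[str, EmbeddedField], key: str, value_form: str
) -> bool:
    """Return True when a FieldValue's Value form matches the FieldSpec."""
    return permitted_value_form(template, key) == value_form
```

For a template that embeds a text field under the key title, `permitted_value_form(template, "title")` yields "TextValue", and a FieldValue carrying a DateValue under that key would be reported as non-conforming.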

The table below gives the complete correspondence. The Field Family column identifies the abstract category in the Field hierarchy to which the concrete field belongs; families group field kinds that share related value semantics. Where a field is a direct subclass of Field with no intermediate abstract category, this column is left blank.

| Field Family | FieldSpec | Value |
|---|---|---|
|  | TextFieldSpec | TextValue |
| NumericField | IntegerNumberFieldSpec | IntegerNumberValue |
| NumericField | RealNumberFieldSpec | RealNumberValue |
|  | BooleanFieldSpec | BooleanValue |
| TemporalField | DateFieldSpec | DateValue |
| TemporalField | TimeFieldSpec | TimeValue |
| TemporalField | DateTimeFieldSpec | DateTimeValue |
|  | ControlledTermFieldSpec | ControlledTermValue |
| EnumField | SingleValuedEnumFieldSpec | EnumValue |
| EnumField | MultiValuedEnumFieldSpec | EnumValue |
|  | LinkFieldSpec | LinkValue |
| ContactField | EmailFieldSpec | EmailValue |
| ContactField | PhoneNumberFieldSpec | PhoneNumberValue |
| ExternalAuthorityField | OrcidFieldSpec | OrcidValue |
| ExternalAuthorityField | RorFieldSpec | RorValue |
| ExternalAuthorityField | DoiFieldSpec | DoiValue |
| ExternalAuthorityField | PubMedIdFieldSpec | PubMedIdValue |
| ExternalAuthorityField | RridFieldSpec | RridValue |
| ExternalAuthorityField | NihGrantIdFieldSpec | NihGrantIdValue |
|  | AttributeValueFieldSpec | AttributeValue |

The two concrete enum field specs share a single value type, EnumValue. The cardinality distinction — single versus multiple — is not visible in the value type itself but in the count of values permitted per FieldValue: a SingleValuedEnumFieldSpec permits exactly one EnumValue, while a MultiValuedEnumFieldSpec permits one or more (subject to the embedding’s Cardinality). This cardinality constraint is enforced at validation rather than through distinct value types.

Instances

A TemplateInstance is an Artifact that records data conforming to a specific Template. Instance productions are defined here separately from schema and presentation productions so that the schema model and the instance model can be read independently.

Because TemplateInstance is a full Artifact, it carries a TemplateInstanceId together with CatalogMetadata, which holds its descriptive and lifecycle metadata. This means instances are independently identifiable, catalogable artifacts in their own right rather than anonymous data records. They can be referenced, versioned, and tracked just as templates and fields can.

A TemplateInstance contains zero or more InstanceValue constructs, each keyed by an EmbeddedArtifactKey identifying the corresponding embedded artifact in the referenced template. There are two forms: FieldValue, which carries one or more typed values for an EmbeddedField, and NestedTemplateInstance, which carries a nested collection of InstanceValue constructs for an EmbeddedTemplate. EmbeddedPresentationComponent constructs produce no InstanceValue and are absent from the instance model entirely.

TemplateInstance ::= template_instance(
                       TemplateInstanceId
                       ModelVersion
                       CatalogMetadata
                       TemplateId
                       [Label]
                       InstanceValue*
                     )

InstanceValue ::= FieldValue
                | NestedTemplateInstance

FieldValue ::= field_value(
                 EmbeddedArtifactKey
                 Value+
               )

NestedTemplateInstance ::= nested_template_instance(
                             EmbeddedArtifactKey
                             InstanceValue*
                           )

TemplateId is the persistent schema link that ties a TemplateInstance to the Template it was created from. It is the basis for all validation and interpretation of instance content: the EmbeddedArtifactKey values in FieldValue and NestedTemplateInstance constructs are only meaningful in relation to the embedded artifacts of that specific template.

Each FieldValue’s EmbeddedArtifactKey MUST identify an EmbeddedField in the referenced Template. Each NestedTemplateInstance’s EmbeddedArtifactKey MUST identify an EmbeddedTemplate. An EmbeddedArtifactKey that identifies an EmbeddedPresentationComponent MUST NOT appear as the key of any InstanceValue. The full instance alignment constraints are specified in spec/validation.md.
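
The key-alignment rules above can be sketched as a small check. This is a minimal illustration under stated assumptions, not the normative algorithm of spec/validation.md: the template is reduced to a hypothetical map from each EmbeddedArtifactKey to the category of its embedded artifact, and each InstanceValue to a hypothetical (form, key) pair.

```python
from typing import Dict, List, Tuple

# Hypothetical in-memory shapes, chosen for illustration only.
# Which embedded-artifact category each InstanceValue form must point at:
EXPECTED_CATEGORY = {
    "FieldValue": "field",
    "NestedTemplateInstance": "template",
}

def misaligned_keys(embedded: Dict[str, str],
                    values: List[Tuple[str, str]]) -> List[str]:
    """Return the keys whose InstanceValue form does not match the template.

    A key that is unknown, or that identifies a presentation component,
    never matches and is therefore reported.
    """
    errors = []
    for form, key in values:
        if embedded.get(key) != EXPECTED_CATEGORY[form]:
            errors.append(key)
    return errors

embedded = {"title": "field", "study_arm": "template", "intro": "presentation"}
ok = [("FieldValue", "title"), ("NestedTemplateInstance", "study_arm")]
bad = [("FieldValue", "intro"), ("NestedTemplateInstance", "title")]
```

Here `misaligned_keys(embedded, ok)` is empty, while `bad` is reported for both entries: one keys a presentation component and the other uses the nested form for a field.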

To make the abstract structure concrete, consider a Template containing two EmbeddedTextField constructs keyed title and description, and one EmbeddedTemplate keyed study_arm with a maximum cardinality of three. A conforming TemplateInstance for that template would contain two FieldValue constructs — one keyed title carrying a TextValue, one keyed description carrying a TextValue — and between one and three NestedTemplateInstance constructs each keyed study_arm, where each NestedTemplateInstance contains its own InstanceValue constructs corresponding to the embedded artifacts of the nested template.

For a multi-valued EmbeddedField, all values for a single field occurrence are collected within a single FieldValue using Value+. For a multi-valued EmbeddedTemplate, multiplicity is represented by multiple NestedTemplateInstance constructs sharing the same EmbeddedArtifactKey within the containing TemplateInstance. This asymmetry reflects the structural difference between scalar repetition (multiple values for one field) and structural repetition (multiple complete nested instances for one embedded template). In both cases the number of values or instances MUST satisfy the Cardinality constraints defined by the corresponding EmbeddedField or EmbeddedTemplate; see spec/validation.md for the normative multiplicity rules.

NestedTemplateInstance is the recursive construct that supports arbitrarily deep nested template structure: because a NestedTemplateInstance itself contains InstanceValue*, and an InstanceValue may itself be a NestedTemplateInstance, template nesting can be as deep as the schema requires.
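
The difference between scalar and structural repetition shows up in where the count is taken. The sketch below is illustrative only: the `key` property name on the FieldValue, the use of plain strings in place of Value objects, and the helper name are assumptions, while the `min`/`max` shape follows the Cardinality wire form.

```python
from collections import Counter
from typing import Dict, Optional

def within_cardinality(count: int, cardinality: Dict[str, int]) -> bool:
    """Check a count against a Cardinality; a missing "max" means unbounded."""
    upper: Optional[int] = cardinality.get("max")
    return count >= cardinality["min"] and (upper is None or count <= upper)

# Scalar repetition: ONE FieldValue, with several values inside it.
# (Values are simplified to strings here; real values are Value objects.)
field_value = {"key": "keyword", "values": ["asthma", "cohort"]}
field_count = len(field_value["values"])

# Structural repetition: SEVERAL NestedTemplateInstance constructs that
# share one EmbeddedArtifactKey; the count is the number of constructs.
nested_keys = ["study_arm", "study_arm", "study_arm"]
nested_count = Counter(nested_keys)["study_arm"]
```

With these inputs `field_count` is 2 and `nested_count` is 3, and each is compared against the Cardinality of its own embedding.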

Instance conformance may be enforced at data-entry time, preventing submission of a non-conforming instance, or retrospectively, by validating existing instances against their referenced template. Both modes apply the same conformance rules; the distinction is an implementation concern rather than a model-level distinction.

Absence of a value for an optional field is represented by omitting the FieldValue entirely rather than including an empty one; hence FieldValue requires Value+. Note that concrete serializations and authoring tools may have their own conventions for representing absence — for example, a JSON serialization may choose to omit a key entirely or include it with a null value — but such distinctions are a concern of the serialization layer and do not affect the abstract model defined here.

Open Questions

  • Should embedded artifacts always refer to reusable artifacts by explicit reference construct, or does the CEDAR model require some embeddings to support inline artifact definition?
  • Should PresentationComponent remain a direct subclass of Artifact, or should a later revision introduce an intermediate superclass for reusable non-schema artifacts? This would make the distinction between reusable schema artifacts such as Template and Field and reusable non-schema artifacts such as rich text, images, videos, and section breaks more explicit in the hierarchy.
  • Should a later revision introduce a distinct QuantityFieldSpec rather than attaching optional Unit information directly to IntegerNumberFieldSpec and RealNumberFieldSpec? The current model permits fixed units on both numeric field families as a pragmatic compromise, but a dedicated quantity field spec may provide a cleaner semantic distinction for numeric values that are intrinsically unit-bearing.

Cedar Wire Grammar

This file is a formal, JSON-shaped grammar that mirrors grammar.md production-for-production. It is the source of truth for the wire shape of every abstract grammar production. serialization.md is its companion: it carries the encoding philosophy, JSON-specific rules (property naming, NFC normalisation, etc.), worked examples, and cross-references, but does not duplicate per-production shape information.

For every XxxYyy ::= in grammar.md there is exactly one XxxYyy ::: in this file, and vice versa.

Status: hand-maintained, eventually generated. This file is currently authored in lock-step with grammar.md. The longer-term direction is to derive it mechanically from grammar.md plus the property-name map (§14) plus the encoding rules (§1.7). Until that generator exists, the file is hand-maintained; the §14 property-name map and the §1.7 encoding rules together define what such a generator would need to know.

1. Notation

Each line takes one of two forms:

production_name ::: type-expression

production_name ::: type-expression
  // inline constraints on this production

(The placeholder production_name is shown in lower_snake_case here purely to keep it out of the formal-production count; real wire productions use UpperCamelCase.) The ::: separator (three colons) distinguishes a wire-format production from an abstract grammar production (::= in grammar.md). A wire production names the JSON shape that encodes the corresponding abstract production.

1.1 Type expressions

| Form | Meaning |
| --- | --- |
| string, number, boolean, null | The corresponding JSON primitive. |
| "literal" | A string-literal type — the JSON value MUST equal the literal. Used for kind discriminators. |
| ProductionName | Reference to another wire production. |
| array<T> | A JSON array; each element is T. |
| nonEmptyArray<T> | An array of T with at least one element. |
| object { … } | A JSON object. Property syntax in §1.2. |
| T \| U | A union. Discrimination strategy is documented inline (see §1.3). |

1.2 Object property syntax

Within object { … }:

property: Type        // required
property?: Type       // optional; encoded only when present
"property": "literal" // a fixed string-literal value (used for kind)

Property order in the notation is informational. JSON does not preserve key order, and conforming encoders MAY emit properties in any order unless an inline constraint says otherwise (no current production requires a specific order).

1.3 Unions

some_union ::: A | B | C
  // discriminator: kind

(Placeholder shown in lower_snake_case for the same reason as in §1.)

Two discrimination strategies are recognised, declared inline:

  • discriminator: kind — every member is an object production whose shape includes a kind: "MemberName" literal property. Decoders pick the variant by reading kind.
  • discriminator: position — members are distinguished by the enclosing property name and the surrounding context, not by anything on the encoded object itself. Used at singleton positions where the abstract grammar admits exactly one production at the property.

If no discriminator is declared, kind is the default.
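
A decoder for a discriminator: kind union can be sketched as a lookup on the kind property. The registry, the function name, and the tuple results below are hypothetical; only the kind literals and the TextValue and BooleanValue property names come from this specification.

```python
from typing import Any, Callable, Dict

# Hypothetical decoder registry keyed by the wire "kind" literal.
DECODERS: Dict[str, Callable[[Dict[str, Any]], Any]] = {
    "BooleanValue": lambda o: ("bool", o["value"]),
    "TextValue": lambda o: ("text", o["value"], o.get("lang")),
}

def decode_union_member(obj: Dict[str, Any]) -> Any:
    """Pick the variant by reading kind, then delegate to its decoder."""
    kind = obj.get("kind")
    if kind not in DECODERS:
        raise ValueError(f"unknown or missing kind: {kind!r}")
    return DECODERS[kind](obj)
```

A discriminator: position union needs no such lookup, because the enclosing property already fixes the production to decode.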

1.4 Inline constraints

Constraints that cannot be expressed in the type expression appear as single-line //-prefixed comments immediately below the production:

MultilingualString ::: nonEmptyArray<LangString>
  // lang tags MUST be unique within the array (case-folded)

Constraints are normative.

1.5 The kind rule

Rule. A wire object carries a "kind": "X" property if and only if its abstract grammar production is a member of some discriminator: kind union — regardless of the position the object occupies in the wire form. Productions that are not members of any discriminator: kind union (Cardinality, Annotation, LabelOverride, Property, CatalogMetadata, LifecycleMetadata, SchemaArtifactVersioning, Unit, OntologyReference, OntologyDisplayHint, ControlledTermClass, PermissibleValue, Meaning, and the temporal RenderingHint object variants) never carry kind.

This rule is purely a property of the production: it does not depend on where in the document the object appears. A given production either always carries kind on the wire or never does. In particular, singleton positions — slots where the enclosing context already fixes the family — make no difference to whether kind is carried; a polymorphic-union member retains its kind even when the slot’s type pins the family unambiguously. The kind is then redundant for decoding (the family is recoverable from the slot type) but is retained because uniformity of the rule is more valuable than the small wire-size saving.

Terms.

  • Singleton position — a property slot in a wire object where the abstract grammar admits exactly one production (e.g. EmbeddedField.cardinality admits only Cardinality, EmbeddedTextField.defaultValue admits only TextValue).
  • Singleton-only production — an abstract production that appears only at singleton positions and is never a member of a discriminator: kind union (e.g. Cardinality, Annotation, LabelOverride). Equivalently: the productions enumerated in the Rule above.

Worked examples. Two cases illustrate the rule.

Case 1 — polymorphic-union member always carries kind. TextValue is a member of the Value union (which uses discriminator: kind). At the polymorphic FieldValue.values[*] position the wire form is:

{ "kind": "TextValue", "value": "Hello", "lang": "en" }

At the singleton EmbeddedTextField.defaultValue position, where the enclosing EmbeddedTextField.kind already fixes the family, the wire form is the same:

{
  "kind": "EmbeddedTextField",
  "key": "comment",
  "artifactRef": "https://example.org/fields/comment",
  "defaultValue": { "kind": "TextValue", "value": "Initial", "lang": "en" }
}

The inner "kind": "TextValue" is structurally redundant at this slot but is retained because TextValue is a polymorphic-union member and the rule is uniform across positions.

Case 2 — singleton-only production never carries kind. Cardinality is not a member of any discriminator: kind union — it appears only at singleton positions (e.g. EmbeddedField.cardinality, EmbeddedTemplate.cardinality). Its wire form never carries kind:

{
  "kind": "EmbeddedTextField",
  "key": "alias",
  "artifactRef": "https://example.org/fields/alias",
  "cardinality": { "min": 0, "max": 3 }
}

Wire vs. in-memory. The kind rule constrains the wire form, not the in-memory form of any host-language binding. Bindings MAY carry synthetic kind (or any other) discriminator fields on their in-memory representations of singleton-only productions — e.g. Cardinality, Annotation — for runtime introspection, type-guard ergonomics, or debugging. Any such synthetic discriminator MUST be stripped before encoding and MUST NOT appear on the wire; the converse is also possible (a binding’s in-memory type may omit a kind it chooses to recover from context, provided the encoder restores it). (See bindings.md §2.1 for examples.)
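
As an illustration of the wire versus in-memory distinction, the sketch below assumes a binding that keeps a synthetic kind on its in-memory Cardinality and strips it at encode time. The function name and the dict-based representation are hypothetical and are not taken from bindings.md.

```python
from typing import Any, Dict

def encode_singleton_only(in_memory: Dict[str, Any]) -> Dict[str, Any]:
    """Return the wire object for a singleton-only production.

    Any synthetic "kind" kept for runtime introspection is dropped so
    that it never reaches the wire.
    """
    return {name: value for name, value in in_memory.items() if name != "kind"}

# In-memory form with a synthetic discriminator (an assumed binding choice).
cardinality_in_memory = {"kind": "Cardinality", "min": 0, "max": 3}
cardinality_wire = encode_singleton_only(cardinality_in_memory)
```

The resulting `cardinality_wire` carries only `min` and `max`, matching the Case 2 wire form.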

1.6 Collapsed wrappers

A typed singleton wrapper is an abstract grammar production whose constructor form has exactly one component. The inner component may be a primitive lexical category (string, number, boolean), another typed singleton wrapper, or a composite production such as MultilingualString. For example:

Iri        ::= iri(IriString)
TemplateId ::= template_id(IriString)
Label      ::= label(MultilingualString)

In the abstract grammar these productions exist to give a value a role: Iri is a syntactically valid IRI, TemplateId is specifically the identifier of a template, and Label is a label rather than an arbitrary multilingual string. The abstract grammar treats these roles as distinct types so that, e.g., a TemplateId cannot be substituted for a FieldId even though both reduce to a string at the wire level.

On the wire, however, this typed-role information is recovered from the surrounding context (the property name and the abstract grammar production at that slot). The wrapper therefore collapses to its inner type at encode time and disappears from the JSON, leaving only the inner value (a primitive, an array, or whichever shape the inner type encodes to). The wire grammar still names the wrapper production where the abstract grammar does, so that slot types in composite productions remain isomorphic to the abstract grammar’s component types — but the wrapper’s wire form ::: is the wire form of whatever it carries.

The wrappers fall into four groups by inner type:

  • IRI-typed (string, syntactically valid IRI per RFC 3987): Iri, TermIri, every XxxFieldId, TemplateId, TemplateInstanceId, PresentationComponentId, PropertyIri, OrcidIri, RorIri, DoiIri, PubMedIri, RridIri, NihGrantIri, OntologyIri, RootTermIri, ValueSetIri, PreviousVersion, DerivedFrom, CreatedBy, ModifiedBy.
  • Other strings (string): LanguageTag, LexicalForm, IsoDateTimeStamp, OntologyAcronym, ValueSetIdentifier, Notation, Identifier, AttributeName, HtmlContent, EmbeddedArtifactKey, ValidationRegex, Token, Version, ModelVersion, CreatedOn, ModifiedOn.
  • Numbers: NonNegativeInteger, MinCardinality, MaxCardinality, MinLength, MaxLength, DecimalPlaces, MaxTraversalDepth.
  • MultilingualString-typed (the inner type is itself a composite, encoded as a nonEmptyArray<LangString> per the MultilingualString wire production; the wrapper carries no additional wire shape): Name, Description, PreferredLabel, AlternativeLabel, Label, PropertyLabel, OntologyName, ValueSetName, RootTermLabel, Header, Footer.

Version and ModelVersion carry SemanticVersion 2.0.0 lexical strings. CreatedOn and ModifiedOn carry ISO 8601 date-time lexical strings. CreatedBy, ModifiedBy, PreviousVersion, and DerivedFrom carry IRIs.
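
One way a binding might mirror wrapper collapse is with distinct static types that are plain strings at run time. The sketch below uses Python's typing.NewType for that purpose; this is an illustrative choice, not a binding idiom prescribed by this specification, and the example IRI is made up.

```python
import json
from typing import NewType

# Distinct roles for a type checker, but ordinary str objects at run time.
TemplateId = NewType("TemplateId", str)
FieldId = NewType("FieldId", str)

template_id = TemplateId("https://example.org/templates/study")

# The wrapper leaves no trace in the JSON: only the inner string is emitted.
encoded = json.dumps({"artifactRef": template_id})
```

A static type checker would flag passing `template_id` where a FieldId is expected, while the encoded JSON contains only the IRI string.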

1.7 Encoding rules

This section summarises the rules a generator would apply to derive wire-grammar.md from grammar.md plus the property-name map (§14). The rules are also the framing under which the file should be read: each ::: production in the rest of the file is what these rules produce when applied to the corresponding ::= production in grammar.md.

  1. Production naming. Every abstract production XxxYyy ::= ... in grammar.md becomes a wire production XxxYyy ::: ... with the same name.

  2. Object-form productions. A production that composes one or more named components encodes as object { ... } with property names drawn from the property-name map (§14). When such a production is a member of a kind-discriminated union, its object additionally carries "kind": "XxxYyy" (see rule 7).

  3. Optional components. A grammar.md [X] component becomes an optional wire property prop?: X and is omitted from the JSON when absent.

  4. Repeated components. A grammar.md X* becomes a wire array<X>; a grammar.md X+ becomes a wire nonEmptyArray<X>. Some sequence positions are encoded as omittable optional arrays per the wrapping principle of serialization.md §5 — altLabels?: array<AlternativeLabel> and annotations?: array<Annotation> on CatalogMetadata are SHOULD-omitted when empty, and the spec-level MultiValuedEnumFieldSpec.defaultValues is similarly optional. These exceptions are flagged at the production sites with inline constraints.

  5. Collapsed wrappers. Productions whose abstract form is a single-component wrapper around a primitive collapse to that primitive on the wire (§1.6). Their ::: definitions remain in this file for completeness and for use as type names at slot positions in composite productions: every slot in an object { ... } is typed with the abstract grammar’s component name (e.g. key: EmbeddedArtifactKey rather than key: string). This makes the wire form’s slot types isomorphic to the abstract grammar’s component types, even where the encoding bottoms out at a JSON primitive.

  6. Discriminator strategies. Two strategies are recognised, declared inline on the union: discriminator: kind (default) and discriminator: position. See §1.3.

  7. The kind rule. A kind: "X" literal property appears on a wire object if and only if its production is a member of some discriminator: kind union, regardless of position. Productions not so used (Cardinality, Annotation, LabelOverride, Property, etc.) never carry kind. See §1.5 for the full statement.

  8. Primitive bottom-out. Where the abstract grammar uses a bare primitive type (string, boolean, number) without a typed wrapper, the wire form uses that primitive directly (e.g. Cardinality.min: number, BooleanValue.value: boolean).

The wrapping principle that underlies rule 5 is given normatively in serialization.md §5; this section restates only the form in which it appears in the wire grammar.


2. Scalar and Datatype Leaves

The grammar’s primitive string types (SemanticVersion, IriString, Bcp47Tag, Iso8601DateTimeLexicalForm, AsciiIdentifier, IntegerLexicalForm) are abstract leaves with no ::= production; on the wire they all encode as string, with constraints noted at each site that uses them.

2.1 Core IRI and string types

Iri ::: string
  // a syntactically valid IRI per RFC 3987. At every position in the
  // model where the grammar uses Iri the wire form is a JSON string.

TermIri ::: Iri
  // a documented role; encodes as Iri

LanguageTag ::: string
  // a well-formed BCP 47 language tag

LexicalForm ::: string
  // a Unicode string; SHOULD be in Unicode Normalization Form C

IsoDateTimeStamp ::: string
  // an ISO 8601 date-time lexical form

NonNegativeInteger ::: number
  // a JSON number that is a non-negative integer
  // values exceeding 2^53 - 1 MUST be encoded as a string
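
The NonNegativeInteger constraints can be sketched as an encoder helper. The function name is hypothetical, and the choice of a base-10 digit string for values above 2^53 - 1 is an assumption: the constraint above requires a string but does not state its lexical form.

```python
from typing import Union

MAX_SAFE_INTEGER = 2**53 - 1  # largest integer the constraint allows as a JSON number

def encode_non_negative_integer(n: int) -> Union[int, str]:
    """Encode n as a JSON number, or as a string above 2^53 - 1."""
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if isinstance(n, bool) or not isinstance(n, int) or n < 0:
        raise ValueError(f"not a non-negative integer: {n!r}")
    # Assumption: the string form is the plain base-10 digit sequence.
    return n if n <= MAX_SAFE_INTEGER else str(n)
```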

2.2 Multilingual strings

LangString ::: object {
  value: string
  lang: string
}
  // lang MUST be a well-formed BCP 47 tag

MultilingualString ::: nonEmptyArray<LangString>
  // lang tags MUST be unique within the array (case-folded comparison)
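
The two MultilingualString constraints, non-emptiness and case-folded uniqueness of lang tags, can be checked as follows. This is a minimal sketch: the function name and error strings are hypothetical, and BCP 47 well-formedness of each tag is deliberately not checked here.

```python
from typing import Dict, List

def multilingual_string_errors(entries: List[Dict[str, str]]) -> List[str]:
    """Report violations of the MultilingualString constraints."""
    errors = []
    if not entries:
        errors.append("array must be non-empty")
    seen = set()
    for entry in entries:
        # Compare tags case-insensitively, so "en" and "EN" collide.
        folded = entry["lang"].casefold()
        if folded in seen:
            errors.append(f"duplicate lang tag: {entry['lang']}")
        seen.add(folded)
    return errors
```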

2.3 Numeric datatype kind

RealNumberDatatypeKind ::: "decimal" | "float" | "double"
  // CEDAR-native enum naming the three real-number kinds.
  // The mapping to XSD datatype IRIs is defined separately in
  // rdf-projection.md and is out of scope for the wire form.

IntegerNumberValue is fixed to a single integer category and carries no datatype slot on the wire. Temporal Value variants (FullDateValue, TimeValue, DateTimeValue) likewise carry no datatype slot — the temporal category is fixed by the variant’s kind.


3. Values

Value ::: TextValue | NumericValue | BooleanValue
        | DateValue | TimeValue | DateTimeValue
        | ControlledTermValue | EnumValue | LinkValue
        | EmailValue | PhoneNumberValue | ExternalAuthorityValue
        | AttributeValue
  // discriminator: kind
  // NumericValue, DateValue, and ExternalAuthorityValue are themselves
  // unions; their members supply the kind discriminator directly

NumericValue ::: IntegerNumberValue | RealNumberValue
  // discriminator: kind

3.1 Scalar values

Scalar Value variants carry their content directly. There is no inner literal wrapper. TextValue carries an optional lang for language-tagged text; IntegerNumberValue carries a base-10 integer lexical form (datatype is fixed at xsd:integer and not carried); RealNumberValue carries a real-valued lexical form paired with the required datatype enum (decimal | float | double); BooleanValue carries a JSON boolean.

TextValue ::: object {
  "kind": "TextValue"
  value: LexicalForm
  lang?: LanguageTag
}
  // lang, when present, MUST be a well-formed BCP 47 tag
  // value MUST be in Unicode Normalization Form C

IntegerNumberValue ::: object {
  "kind": "IntegerNumberValue"
  value: LexicalForm
}
  // value is a base-10 integer lexical form
  // datatype is implicit (xsd:integer) and not carried on the wire

RealNumberValue ::: object {
  "kind": "RealNumberValue"
  value: LexicalForm
  datatype: RealNumberDatatypeKind
}
  // value is a base-10 real-valued lexical form
  // datatype names the XSD datatype (xsd:decimal, xsd:float, or xsd:double)

BooleanValue ::: object {
  "kind": "BooleanValue"
  value: boolean
}
  // value is a JSON boolean (true or false)
  // datatype is implicit (xsd:boolean) and not carried on the wire
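
A TextValue encoder might look like the sketch below. The helper name is hypothetical, and normalising the input to NFC (rather than rejecting non-NFC input) is a design choice made for the example, not a requirement stated here.

```python
import unicodedata
from typing import Any, Dict, Optional

def make_text_value(text: str, lang: Optional[str] = None) -> Dict[str, Any]:
    """Build a TextValue wire object with its value in NFC."""
    wire: Dict[str, Any] = {
        "kind": "TextValue",
        "value": unicodedata.normalize("NFC", text),
    }
    # lang is optional and is encoded only when present.
    if lang is not None:
        wire["lang"] = lang
    return wire

# "e" followed by U+0301 (combining acute) composes to U+00E9 under NFC.
cafe = make_text_value("Cafe\u0301", "fr")
```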

3.2 Temporal values

Each temporal Value variant carries its lexical form directly. The datatype is fixed by the variant’s kind and is not carried on the wire.

DateValue ::: YearValue | YearMonthValue | FullDateValue
  // discriminator: kind

YearValue ::: object {
  "kind": "YearValue"
  value: LexicalForm
}
  // value matches YYYY

YearMonthValue ::: object {
  "kind": "YearMonthValue"
  value: LexicalForm
}
  // value matches YYYY-MM

FullDateValue ::: object {
  "kind": "FullDateValue"
  value: LexicalForm
}
  // value is an xsd:date lexical form (YYYY-MM-DD with optional zone)

TimeValue ::: object {
  "kind": "TimeValue"
  value: LexicalForm
}
  // value is an xsd:time lexical form

DateTimeValue ::: object {
  "kind": "DateTimeValue"
  value: LexicalForm
}
  // value is an xsd:dateTime lexical form
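
Choosing the DateValue arm from a lexical form can be sketched with regular expressions. The patterns below are simplified approximations written for illustration: they do not validate day-of-month ranges or the full xsd:date lexical space, and the function name is hypothetical.

```python
import re

# Simplified patterns; not a complete xsd:date validator.
_PATTERNS = [
    ("YearValue", re.compile(r"\d{4}")),
    ("YearMonthValue", re.compile(r"\d{4}-(0[1-9]|1[0-2])")),
    ("FullDateValue", re.compile(r"\d{4}-\d{2}-\d{2}(Z|[+-]\d{2}:\d{2})?")),
]

def date_value(lexical: str) -> dict:
    """Return the DateValue wire object whose arm matches the lexical form."""
    for kind, pattern in _PATTERNS:
        if pattern.fullmatch(lexical):
            return {"kind": kind, "value": lexical}
    raise ValueError(f"not a recognised date lexical form: {lexical!r}")
```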

3.3 Controlled-term value

Label ::: MultilingualString
Notation ::: string
PreferredLabel ::: MultilingualString

ControlledTermValue ::: object {
  "kind": "ControlledTermValue"
  term: TermIri
  label?: Label
  notation?: Notation
  preferredLabel?: PreferredLabel
}
  // term is a TermIri (an Iri identifying the term)

3.4 Enum value

EnumValue ::: object {
  "kind": "EnumValue"
  value: Token
}
  // value is the canonical Token of one of the referenced
  // EnumFieldSpec's PermissibleValue entries
  // value MUST be a non-empty Unicode string

EnumValue.value carries the wire-form of the abstract grammar’s Token slot — the wire property name is value for consistency with other Value variants, while the abstract production names the slot Token. Token is defined in §7 alongside PermissibleValue.

3.5 Link value

LinkValue ::: object {
  "kind": "LinkValue"
  iri: Iri
  label?: Label
}

3.6 Contact values

EmailValue ::: object {
  "kind": "EmailValue"
  value: LexicalForm
}

PhoneNumberValue ::: object {
  "kind": "PhoneNumberValue"
  value: LexicalForm
}

3.7 External authority values

ExternalAuthorityValue ::: OrcidValue | RorValue | DoiValue
                         | PubMedIdValue | RridValue | NihGrantIdValue
  // discriminator: kind

OrcidValue ::: object {
  "kind": "OrcidValue"
  iri: OrcidIri
  label?: Label
}

RorValue ::: object {
  "kind": "RorValue"
  iri: RorIri
  label?: Label
}

DoiValue ::: object {
  "kind": "DoiValue"
  iri: DoiIri
  label?: Label
}

PubMedIdValue ::: object {
  "kind": "PubMedIdValue"
  iri: PubMedIri
  label?: Label
}

RridValue ::: object {
  "kind": "RridValue"
  iri: RridIri
  label?: Label
}

NihGrantIdValue ::: object {
  "kind": "NihGrantIdValue"
  iri: NihGrantIri
  label?: Label
}

The typed external-authority IRI productions collapse to plain string IRIs on the wire — see §1.6.

OrcidIri ::: Iri
RorIri ::: Iri
DoiIri ::: Iri
PubMedIri ::: Iri
RridIri ::: Iri
NihGrantIri ::: Iri

3.8 Attribute value

AttributeName ::: string

AttributeValue ::: object {
  "kind": "AttributeValue"
  name: AttributeName
  value: Value
}
  // value is a tagged Value carrying its kind discriminator per §1.5.

4. Identifiers (artifact)

Each artifact identifier wire-encodes as an Iri (which itself collapses to a plain string IRI per §1.6); the abstract grammar’s typed-role distinction is not visible on the wire.

FieldId is the umbrella union of the twenty typed XxxFieldId families per grammar.md; on the wire its encoding is just the encoding of whichever family member is at the slot position, which in every case is Iri. The wire grammar therefore lists FieldId ::: Iri alongside each typed family for consistency.

FieldId ::: Iri
TextFieldId ::: Iri
IntegerNumberFieldId ::: Iri
RealNumberFieldId ::: Iri
BooleanFieldId ::: Iri
DateFieldId ::: Iri
TimeFieldId ::: Iri
DateTimeFieldId ::: Iri
ControlledTermFieldId ::: Iri
SingleValuedEnumFieldId ::: Iri
MultiValuedEnumFieldId ::: Iri
LinkFieldId ::: Iri
EmailFieldId ::: Iri
PhoneNumberFieldId ::: Iri
OrcidFieldId ::: Iri
RorFieldId ::: Iri
DoiFieldId ::: Iri
PubMedIdFieldId ::: Iri
RridFieldId ::: Iri
NihGrantIdFieldId ::: Iri
AttributeValueFieldId ::: Iri

TemplateId ::: Iri
PresentationComponentId ::: Iri
TemplateInstanceId ::: Iri

The family of an identifier is recovered from the kind discriminator on the enclosing object — Field and EmbeddedField for FieldId variants, Template and EmbeddedTemplate for TemplateId, PresentationComponent and EmbeddedPresentationComponent for PresentationComponentId, and TemplateInstance for TemplateInstanceId. The identifier shape itself carries no family information.

The same identifier productions serve at both the definition site of a reusable artifact (e.g. Field.id, Template.id) and the reference site where it is embedded (e.g. EmbeddedField.artifactRef, EmbeddedTemplate.artifactRef); the abstract grammar does not distinguish reference-typed productions from identity-typed ones, and on the wire both positions encode as a plain IRI string.


5. Catalog Metadata

5.1 Aggregate structure

CatalogMetadata is flat on the wire: its descriptive properties (preferredLabel, description, identifier, altLabels), its lifecycle slot, and its annotations slot are all direct members of the same object — there is no descriptiveMetadata wrapper.

Description ::: MultilingualString
Identifier ::: string
AlternativeLabel ::: MultilingualString

CatalogMetadata ::: object {
  preferredLabel?: PreferredLabel
  description?: Description
  identifier?: Identifier
  altLabels?: array<AlternativeLabel>
  lifecycle: LifecycleMetadata
  annotations?: array<Annotation>
}
  // preferredLabel is the artifact's catalog-display name. It is
  // distinct from a schema artifact's *rendered* display name, which
  // lives on a top-level slot on the artifact itself (Field.label,
  // Template.title, TemplateInstance.label).
  // altLabels SHOULD be omitted from the wire when empty; it round-trips
  // as an empty array in memory
  // annotations SHOULD be omitted from the wire when empty; it round-trips
  // as an empty array in memory
  // the grammar's Description, PreferredLabel, and AlternativeLabel
  // productions are MultilingualString-typed wrappers that collapse on
  // the wire (§1.6); the type names appear here for parity with the
  // abstract grammar's component naming
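
The SHOULD-omit-when-empty behaviour of altLabels and annotations can be sketched as an encode/decode pair. The function names and the dict-based in-memory form are assumptions made for the example; the lifecycle values shown in the usage are placeholders.

```python
from typing import Any, Dict

_OMIT_WHEN_EMPTY = ("altLabels", "annotations")

def encode_catalog_metadata(metadata: Dict[str, Any]) -> Dict[str, Any]:
    """Drop empty altLabels/annotations arrays before writing the wire form."""
    wire = dict(metadata)
    for name in _OMIT_WHEN_EMPTY:
        if not wire.get(name):
            wire.pop(name, None)
    return wire

def decode_catalog_metadata(wire: Dict[str, Any]) -> Dict[str, Any]:
    """Restore omitted arrays as empty lists in memory."""
    metadata = dict(wire)
    for name in _OMIT_WHEN_EMPTY:
        metadata.setdefault(name, [])
    return metadata

lifecycle = {
    "createdOn": "2024-01-01T00:00:00Z",
    "createdBy": "https://example.org/users/1",
    "modifiedOn": "2024-01-02T00:00:00Z",
    "modifiedBy": "https://example.org/users/1",
}
in_memory = {"lifecycle": lifecycle, "altLabels": [], "annotations": []}
wire = encode_catalog_metadata(in_memory)
```

Decoding `wire` yields an object equal to `in_memory`, so the empty arrays round-trip even though they never appear in the JSON.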

CatalogMetadata is uniform across every artifact kind: Field, Template, PresentationComponent, and TemplateInstance all carry the same CatalogMetadata shape under the wire-form metadata key.

Schema artifacts (Field, Template) additionally carry SchemaArtifactVersioning as a separate top-level versioning slot on the artifact itself; non-schema artifacts (PresentationComponent, TemplateInstance) do not carry versioning. The SchemaArtifactMetadata wrapper production used in prior revisions of this specification is removed: in the new shape, schema artifacts carry metadata and versioning as parallel top-level slots rather than as a single metadata-wrapped blob.

5.2 Lifecycle metadata

CreatedOn ::: string
CreatedBy ::: string
ModifiedOn ::: string
ModifiedBy ::: string

LifecycleMetadata ::: object {
  createdOn: CreatedOn
  createdBy: CreatedBy
  modifiedOn: ModifiedOn
  modifiedBy: ModifiedBy
}
  // createdOn and modifiedOn carry IsoDateTimeStamp values
  // createdBy and modifiedBy carry agent Iri values

5.3 Schema versioning

SchemaArtifactVersioning ::: object {
  version: Version
  status: Status
  previousVersion?: PreviousVersion
  derivedFrom?: DerivedFrom
}
  // version is a SemanticVersion lexical form
  // when both previousVersion and derivedFrom are present, they MUST
  // NOT carry the same IRI (per grammar.md §Schema Artifact
  // Versioning); succession and derivation are mutually exclusive at
  // any single point

Version ::: string
ModelVersion ::: string
  // a SemanticVersion 2.0.0 lexical form; carried directly on every
  // concrete artifact wire object as the top-level `modelVersion` slot
PreviousVersion ::: Iri
DerivedFrom ::: Iri

Status ::: "draft" | "published"
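
The inline constraints on SchemaArtifactVersioning and Status can be checked as sketched below. The function name and error messages are hypothetical, and the check does not validate the SemanticVersion lexical form of version.

```python
from typing import Any, Dict, List

def versioning_errors(versioning: Dict[str, Any]) -> List[str]:
    """Report violations of the Status and previousVersion/derivedFrom rules."""
    errors = []
    if versioning.get("status") not in ("draft", "published"):
        errors.append("status must be 'draft' or 'published'")
    previous = versioning.get("previousVersion")
    # Only a violation when both are present and carry the same IRI.
    if previous is not None and previous == versioning.get("derivedFrom"):
        errors.append("previousVersion and derivedFrom must not carry the same IRI")
    return errors
```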

5.4 Annotations

Annotation ::: object {
  property: Iri
  body: AnnotationValue
}
  // property is the annotation-property Iri (the grammar's bare Iri
  // collapses to a string per §1.6)

AnnotationValue ::: AnnotationStringValue | AnnotationIriValue
  // discriminator: kind

AnnotationStringValue ::: object {
  "kind": "AnnotationStringValue"
  value: LexicalForm
  lang?: LanguageTag
}
  // lang, when present, MUST be a well-formed BCP 47 tag
  // value MUST be in Unicode Normalization Form C

AnnotationIriValue ::: object {
  "kind": "AnnotationIriValue"
  iri: Iri
}
  // iri carries an Iri value (RFC 3987)

6. Embedded Artifact Properties

6.1 Embedded artifact key

EmbeddedArtifactKey ::: string
  // matches the pattern [A-Za-z][A-Za-z0-9_-]*
  // unique within the containing Template (constraint enforced on Template)
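
Both EmbeddedArtifactKey constraints, the lexical pattern and uniqueness within the containing Template, can be checked over the list of keys a template declares. The function name and error strings below are hypothetical.

```python
import re
from typing import List

# The pattern stated in the inline constraint above.
_KEY = re.compile(r"[A-Za-z][A-Za-z0-9_-]*")

def key_errors(keys: List[str]) -> List[str]:
    """Report malformed and duplicate keys among a template's embeddings."""
    errors, seen = [], set()
    for key in keys:
        if not _KEY.fullmatch(key):
            errors.append(f"malformed key: {key!r}")
        if key in seen:
            errors.append(f"duplicate key: {key!r}")
        seen.add(key)
    return errors
```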

6.2 Requirements

ValueRequirement ::: "required" | "recommended" | "optional"

6.3 Cardinality

Cardinality ::: object {
  min: MinCardinality
  max?: MaxCardinality
}
  // min is a non-negative integer
  // max omitted ⇒ unbounded above (per grammar.md §Cardinality)

MinCardinality ::: number
MaxCardinality ::: number

6.4 Visibility

Visibility ::: "visible" | "hidden"

6.5 Defaults

Defaults are specified at two layers, with parallel typing per family. See grammar.md §Defaults for the abstract grammar’s full treatment, including precedence and the UI/UX-only semantics; this section gives the wire form.

Embedding-level defaults. The optional defaultValue slot on each EmbeddedXxxField is typed family-by-family with the family’s Value type. There is no DefaultValue union and no per-family XxxDefaultValue wrapper on the wire: the defaultValue JSON encodes directly as the corresponding family’s Value. Per the kind rule (§1.5), every Value family is a member of the Value discriminator-kind union, so every embedding-level defaultValue carries a kind discriminator on the wire.

| Embedded field | defaultValue wire form |
| --- | --- |
| EmbeddedTextField | TextValue: { "kind": "TextValue", "value": …, "lang"?: … } |
| EmbeddedIntegerNumberField | IntegerNumberValue: { "kind": "IntegerNumberValue", "value": … } |
| EmbeddedRealNumberField | RealNumberValue: { "kind": "RealNumberValue", "value": …, "datatype": … } |
| EmbeddedBooleanField | BooleanValue: { "kind": "BooleanValue", "value": … } (value is a JSON boolean) |
| EmbeddedDateField | one of the DateValue arms: { "kind": "YearValue" \| "YearMonthValue" \| "FullDateValue", "value": … } |
| EmbeddedTimeField | TimeValue: { "kind": "TimeValue", "value": … } |
| EmbeddedDateTimeField | DateTimeValue: { "kind": "DateTimeValue", "value": … } |
| EmbeddedControlledTermField | ControlledTermValue: { "kind": "ControlledTermValue", … } |
| EmbeddedSingleValuedEnumField | EnumValue: { "kind": "EnumValue", "value": … } |
| EmbeddedMultiValuedEnumField | array<EnumValue>: each element { "kind": "EnumValue", "value": … } |
| EmbeddedLinkField | LinkValue: { "kind": "LinkValue", … } |
| EmbeddedEmailField | EmailValue: { "kind": "EmailValue", "value": … } |
| EmbeddedPhoneNumberField | PhoneNumberValue: { "kind": "PhoneNumberValue", "value": … } |
| EmbeddedOrcidField | OrcidValue: { "kind": "OrcidValue", … } |
| EmbeddedRorField | RorValue: { "kind": "RorValue", … } |
| EmbeddedDoiField | DoiValue: { "kind": "DoiValue", … } |
| EmbeddedPubMedIdField | PubMedIdValue: { "kind": "PubMedIdValue", … } |
| EmbeddedRridField | RridValue: { "kind": "RridValue", … } |
| EmbeddedNihGrantIdField | NihGrantIdValue: { "kind": "NihGrantIdValue", … } |

EmbeddedAttributeValueField has no defaultValue slot (per §9).
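The family-by-family typing of the embedding-level defaultValue can be expressed as a lookup from the embedded field's kind to the Value kinds it admits. The sketch below checks only the kind discriminator; the names `EXPECTED_DEFAULT_KINDS` and `check_embedding_default` are hypothetical, and the inner shape of each Value is not examined.

```python
from typing import Any, List

DATE_ARMS = {"YearValue", "YearMonthValue", "FullDateValue"}

# Embedded field kind -> Value kinds its defaultValue may carry.
EXPECTED_DEFAULT_KINDS = {
    "EmbeddedTextField": {"TextValue"},
    "EmbeddedIntegerNumberField": {"IntegerNumberValue"},
    "EmbeddedRealNumberField": {"RealNumberValue"},
    "EmbeddedBooleanField": {"BooleanValue"},
    "EmbeddedDateField": DATE_ARMS,
    "EmbeddedTimeField": {"TimeValue"},
    "EmbeddedDateTimeField": {"DateTimeValue"},
    "EmbeddedControlledTermField": {"ControlledTermValue"},
    "EmbeddedSingleValuedEnumField": {"EnumValue"},
    "EmbeddedMultiValuedEnumField": {"EnumValue"},
    "EmbeddedLinkField": {"LinkValue"},
    "EmbeddedEmailField": {"EmailValue"},
    "EmbeddedPhoneNumberField": {"PhoneNumberValue"},
    "EmbeddedOrcidField": {"OrcidValue"},
    "EmbeddedRorField": {"RorValue"},
    "EmbeddedDoiField": {"DoiValue"},
    "EmbeddedPubMedIdField": {"PubMedIdValue"},
    "EmbeddedRridField": {"RridValue"},
    "EmbeddedNihGrantIdField": {"NihGrantIdValue"},
}


def check_embedding_default(embedded_kind: str, default_value: Any) -> List[str]:
    """Check the kind discriminator(s) of an embedding-level defaultValue."""
    if embedded_kind == "EmbeddedAttributeValueField":
        return ["EmbeddedAttributeValueField has no defaultValue slot"]
    allowed = EXPECTED_DEFAULT_KINDS.get(embedded_kind)
    if allowed is None:
        return [f"unknown embedded field kind: {embedded_kind!r}"]
    if embedded_kind == "EmbeddedMultiValuedEnumField":
        # The multi-valued enum default is an array of tagged EnumValue entries.
        if not isinstance(default_value, list):
            return ["defaultValue must be an array of EnumValue"]
        items = default_value
    else:
        items = [default_value]
    problems: List[str] = []
    for item in items:
        kind = item.get("kind") if isinstance(item, dict) else None
        if kind not in allowed:
            problems.append(f"expected kind in {sorted(allowed)}, got {kind!r}")
    return problems
```

A default without a `kind` property, such as `{"value": True}` on an EmbeddedBooleanField, is reported because every embedding-level default carries the discriminator.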

Field-level defaults. Every XxxFieldSpec except AttributeValueFieldSpec carries an optional default slot whose type matches its embedding-level counterpart. The slot is named defaultValue in every family except MultiValuedEnumFieldSpec, where it is named defaultValues. The two layers are independent: a field MAY ship with a field-level default and a Template embedding that field MAY override that default with an embedding-level defaultValue (see grammar.md §Defaults for the full precedence rule). The wire shapes are identical to the embedding-level table above, with the following per-family details:

  • TextFieldSpec.defaultValue?: TextValue

  • IntegerNumberFieldSpec.defaultValue?: IntegerNumberValue

  • RealNumberFieldSpec.defaultValue?: RealNumberValue

  • BooleanFieldSpec.defaultValue?: BooleanValue

  • DateFieldSpec.defaultValue?: DateValue (the arm MUST be consistent with dateValueType)

  • TimeFieldSpec.defaultValue?: TimeValue

  • DateTimeFieldSpec.defaultValue?: DateTimeValue

  • ControlledTermFieldSpec.defaultValue?: ControlledTermValue

  • LinkFieldSpec.defaultValue?: LinkValue

  • EmailFieldSpec.defaultValue?: EmailValue

  • PhoneNumberFieldSpec.defaultValue?: PhoneNumberValue

  • OrcidFieldSpec.defaultValue?: OrcidValue

  • RorFieldSpec.defaultValue?: RorValue

  • DoiFieldSpec.defaultValue?: DoiValue

  • PubMedIdFieldSpec.defaultValue?: PubMedIdValue

  • RridFieldSpec.defaultValue?: RridValue

  • NihGrantIdFieldSpec.defaultValue?: NihGrantIdValue

  • SingleValuedEnumFieldSpec.defaultValue?: EnumValue — a tagged EnumValue whose value MUST equal the Token of one of the spec’s permissible-value entries.

  • MultiValuedEnumFieldSpec.defaultValues?: array<EnumValue> — a (possibly empty) JSON array of tagged EnumValue entries; each value MUST equal the Token of one of the spec’s permissible-value entries, and the array MUST NOT contain duplicate value entries.

AttributeValueFieldSpec carries no field-level default.

6.6 Label override

LabelOverride ::: object {
  label: Label
  altLabels: array<AlternativeLabel>
}
  // altLabels MAY be empty

6.7 Help text

HelpText ::: MultilingualString
HelpTextOverride ::: MultilingualString

Both productions collapse on the wire per the wrapper-collapse rule (§1.6): a MultilingualString is encoded as a non-empty array of LangString entries. HelpText is carried by the reusable Field artifact (slot helpText?); HelpTextOverride is carried by each EmbeddedXxxField (slot helpTextOverride?).

6.8 Properties

Property ::: object {
  iri: PropertyIri
  label?: PropertyLabel
}
  // iri carries the PropertyIri; label is the optional PropertyLabel

PropertyIri ::: Iri
PropertyLabel ::: MultilingualString

7. Field Specs

FieldSpec ::: TextFieldSpec | NumericFieldSpec | BooleanFieldSpec
            | TemporalFieldSpec
            | ControlledTermFieldSpec | EnumFieldSpec | LinkFieldSpec
            | ContactFieldSpec | ExternalAuthorityFieldSpec
            | AttributeValueFieldSpec
  // discriminator: kind
  // NumericFieldSpec, TemporalFieldSpec, EnumFieldSpec, ContactFieldSpec,
  // and ExternalAuthorityFieldSpec are unions; their members supply
  // the kind discriminator directly

NumericFieldSpec ::: IntegerNumberFieldSpec | RealNumberFieldSpec
  // discriminator: kind

TextFieldSpec ::: object {
  "kind": "TextFieldSpec"
  defaultValue?: TextValue
  minLength?: MinLength
  maxLength?: MaxLength
  validationRegex?: ValidationRegex
  langTagRequirement?: LangTagRequirement
  renderingHint?: TextRenderingHint
}
  // defaultValue, when present, encodes as a tagged TextValue per
  // the kind rule (§1.5): `{ "kind": "TextValue", "value": ..., "lang"?: ... }`.
  // See §6.5 for default-value semantics across all field families.

LangTagRequirement ::: "langTagRequired" | "langTagOptional" | "langTagForbidden"

IntegerNumberFieldSpec ::: object {
  "kind": "IntegerNumberFieldSpec"
  defaultValue?: IntegerNumberValue
  unit?: Unit
  minValue?: IntegerNumberMinValue
  maxValue?: IntegerNumberMaxValue
  renderingHint?: NumericRenderingHint
}

RealNumberFieldSpec ::: object {
  "kind": "RealNumberFieldSpec"
  datatype: RealNumberDatatypeKind
  defaultValue?: RealNumberValue
  unit?: Unit
  minValue?: RealNumberMinValue
  maxValue?: RealNumberMaxValue
  renderingHint?: NumericRenderingHint
}

BooleanFieldSpec ::: object {
  "kind": "BooleanFieldSpec"
  defaultValue?: BooleanValue
  renderingHint?: BooleanRenderingHint
}

Unit ::: object {
  iri: Iri
  label?: Label
}

MinLength ::: number
MaxLength ::: number
ValidationRegex ::: string
DecimalPlaces ::: number
IntegerNumberMinValue ::: IntegerNumberValue
IntegerNumberMaxValue ::: IntegerNumberValue
RealNumberMinValue ::: RealNumberValue
RealNumberMaxValue ::: RealNumberValue

7.1 Temporal field specs

TemporalFieldSpec ::: DateFieldSpec | TimeFieldSpec | DateTimeFieldSpec
  // discriminator: kind

DateFieldSpec ::: object {
  "kind": "DateFieldSpec"
  dateValueType: DateValueType
  defaultValue?: DateValue
  renderingHint?: DateRenderingHint
}
  // defaultValue, when present, MUST be a DateValue arm consistent
  // with dateValueType (e.g. dateValueType "year" admits only YearValue).

DateValueType ::: "year" | "yearMonth" | "fullDate"
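The consistency requirement between dateValueType and the defaultValue arm can be sketched as a lookup. The source states only that "year" admits only YearValue; the pairings of "yearMonth" with YearMonthValue and "fullDate" with FullDateValue are inferred from the names and should be treated as an assumption. The function name and the sample `value` strings are illustrative, and lexical forms are not checked.

```python
# dateValueType -> the single DateValue arm assumed to be consistent with it.
ARM_FOR_DATE_VALUE_TYPE = {
    "year": "YearValue",
    "yearMonth": "YearMonthValue",  # inferred pairing (assumption)
    "fullDate": "FullDateValue",    # inferred pairing (assumption)
}


def date_default_is_consistent(spec: dict) -> bool:
    """True when a DateFieldSpec has no default or its arm matches dateValueType."""
    if "defaultValue" not in spec:
        return True  # the slot is optional
    expected = ARM_FOR_DATE_VALUE_TYPE.get(spec.get("dateValueType"))
    default = spec["defaultValue"]
    return (
        expected is not None
        and isinstance(default, dict)
        and default.get("kind") == expected
    )


ok_spec = {
    "kind": "DateFieldSpec",
    "dateValueType": "year",
    "defaultValue": {"kind": "YearValue", "value": "2024"},
}
bad_spec = {
    "kind": "DateFieldSpec",
    "dateValueType": "year",
    "defaultValue": {"kind": "FullDateValue", "value": "2024-01-31"},
}
```

Here `ok_spec` passes the check and `bad_spec` fails it, since a FullDateValue default is not the arm selected by `"year"`.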

TimeFieldSpec ::: object {
  "kind": "TimeFieldSpec"
  defaultValue?: TimeValue
  timePrecision?: TimePrecision
  timezoneRequirement?: TimezoneRequirement
  renderingHint?: TimeRenderingHint
}

TimePrecision ::: "hourMinute" | "hourMinuteSecond" | "hourMinuteSecondFraction"

TimezoneRequirement ::: "timezoneRequired" | "timezoneNotRequired"

DateTimeFieldSpec ::: object {
  "kind": "DateTimeFieldSpec"
  dateTimeValueType: DateTimeValueType
  defaultValue?: DateTimeValue
  timezoneRequirement?: TimezoneRequirement
  renderingHint?: DateTimeRenderingHint
}

DateTimeValueType ::: "dateHourMinute" | "dateHourMinuteSecond"
                    | "dateHourMinuteSecondFraction"

DateRenderingHint ::: object {
  componentOrder?: DateComponentOrder
  placeholder?: Placeholder
}

DateComponentOrder ::: "dayMonthYear" | "monthDayYear" | "yearMonthDay"

TimeRenderingHint ::: object {
  timeFormat?: TimeFormat
  placeholder?: Placeholder
}

DateTimeRenderingHint ::: object {
  timeFormat?: TimeFormat
  placeholder?: Placeholder
}

TimeFormat ::: "twelveHour" | "twentyFourHour"

7.2 Controlled term field spec

ControlledTermFieldSpec ::: object {
  "kind": "ControlledTermFieldSpec"
  defaultValue?: ControlledTermValue
  sources: nonEmptyArray<ControlledTermSource>
  renderingHint?: ControlledTermRenderingHint
}
  // defaultValue.term, when present, SHOULD belong to one of the
  // declared sources, but the structural model does not enforce this

7.3 Enum field specs

EnumFieldSpec ::: SingleValuedEnumFieldSpec | MultiValuedEnumFieldSpec
  // discriminator: kind

SingleValuedEnumFieldSpec ::: object {
  "kind": "SingleValuedEnumFieldSpec"
  permissibleValues: nonEmptyArray<PermissibleValue>
  defaultValue?: EnumValue
  renderingHint?: SingleValuedEnumRenderingHint
}
  // defaultValue.value, when present, MUST equal the `value` of one
  // of the permissibleValues entries

MultiValuedEnumFieldSpec ::: object {
  "kind": "MultiValuedEnumFieldSpec"
  permissibleValues: nonEmptyArray<PermissibleValue>
  defaultValues?: array<EnumValue>
  renderingHint?: MultiValuedEnumRenderingHint
}
  // defaultValues, when present, is a (possibly empty) array of
  // EnumValue entries; each defaultValues[i].value MUST equal the
  // `value` of one of the permissibleValues entries; the array MUST
  // NOT contain duplicate `value` entries

PermissibleValue ::: object {
  value: Token
  label?: Label
  description?: Description
  meanings?: array<Meaning>
}
  // value carries the canonical Token of the permissible value and
  // MUST be a non-empty Unicode string
  // value MUST be unique within the enclosing spec's permissibleValues
  // meanings, when present, is a (possibly empty) array of Meaning
  // objects binding the token to ontology terms; SHOULD be omitted
  // when empty

Token ::: string
  // a non-empty Unicode string serving as the canonical key of a
  // PermissibleValue or the value carried by an EnumValue
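The token constraints on enum field specs (non-empty, unique permissible values; defaults drawn from the permissible values; no duplicate defaults) lend themselves to a short check. This is a sketch under the assumption that the spec has already been decoded into Python dictionaries; `check_enum_spec` is a hypothetical name, and labels, descriptions, and meanings are not inspected.

```python
from typing import List


def check_enum_spec(spec: dict) -> List[str]:
    """Check the token constraints of a decoded enum field spec."""
    problems: List[str] = []
    entries = spec.get("permissibleValues", [])
    tokens = [e.get("value") if isinstance(e, dict) else None for e in entries]
    if not tokens:
        problems.append("permissibleValues must be non-empty")
    if any(not isinstance(t, str) or t == "" for t in tokens):
        problems.append("every permissible value must be a non-empty string")
    valid = [t for t in tokens if isinstance(t, str) and t]
    if len(set(valid)) != len(valid):
        problems.append("permissible values must be unique")

    # Single-valued specs use defaultValue; multi-valued specs use defaultValues.
    if spec.get("kind") == "SingleValuedEnumFieldSpec":
        defaults = [spec["defaultValue"]] if "defaultValue" in spec else []
    else:
        defaults = spec.get("defaultValues", [])
    chosen = [d.get("value") if isinstance(d, dict) else None for d in defaults]
    for token in chosen:
        if token not in valid:
            problems.append(f"default {token!r} is not a permissible value")
    seen = [t for t in chosen if isinstance(t, str)]
    if len(set(seen)) != len(seen):
        problems.append("defaultValues must not contain duplicate values")
    return problems


spec = {
    "kind": "MultiValuedEnumFieldSpec",
    "permissibleValues": [{"value": "red"}, {"value": "green"}],
    "defaultValues": [
        {"kind": "EnumValue", "value": "red"},
        {"kind": "EnumValue", "value": "red"},
    ],
}
```

For the sample `spec`, the only reported problem is the duplicated default, because `"red"` is itself a permissible value.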

Meaning ::: object {
  iri: TermIri
  label?: Label
}
  // iri carries the TermIri of the bound ontology term
  // label, when present, is the cached human-readable label of the
  // bound term (distinct from the enclosing PermissibleValue's label,
  // which is the label of the permissible value itself)

7.4 Other field specs

LinkFieldSpec ::: object {
  "kind": "LinkFieldSpec"
  defaultValue?: LinkValue
  renderingHint?: LinkRenderingHint
}

ContactFieldSpec ::: EmailFieldSpec | PhoneNumberFieldSpec
  // discriminator: kind

EmailFieldSpec ::: object {
  "kind": "EmailFieldSpec"
  defaultValue?: EmailValue
  renderingHint?: EmailRenderingHint
}

PhoneNumberFieldSpec ::: object {
  "kind": "PhoneNumberFieldSpec"
  defaultValue?: PhoneNumberValue
  renderingHint?: PhoneNumberRenderingHint
}

ExternalAuthorityFieldSpec ::: OrcidFieldSpec | RorFieldSpec | DoiFieldSpec
                             | PubMedIdFieldSpec | RridFieldSpec
                             | NihGrantIdFieldSpec
  // discriminator: kind

OrcidFieldSpec ::: object {
  "kind": "OrcidFieldSpec"
  defaultValue?: OrcidValue
  renderingHint?: OrcidRenderingHint
}

RorFieldSpec ::: object {
  "kind": "RorFieldSpec"
  defaultValue?: RorValue
  renderingHint?: RorRenderingHint
}

DoiFieldSpec ::: object {
  "kind": "DoiFieldSpec"
  defaultValue?: DoiValue
  renderingHint?: DoiRenderingHint
}

PubMedIdFieldSpec ::: object {
  "kind": "PubMedIdFieldSpec"
  defaultValue?: PubMedIdValue
  renderingHint?: PubMedIdRenderingHint
}

RridFieldSpec ::: object {
  "kind": "RridFieldSpec"
  defaultValue?: RridValue
  renderingHint?: RridRenderingHint
}

NihGrantIdFieldSpec ::: object {
  "kind": "NihGrantIdFieldSpec"
  defaultValue?: NihGrantIdValue
  renderingHint?: NihGrantIdRenderingHint
}

AttributeValueFieldSpec ::: object {
  "kind": "AttributeValueFieldSpec"
}
  // AttributeValueFieldSpec carries no defaultValue; an AttributeValue
  // is a per-instance pairing of a name and a value, and a default is
  // not meaningful here (see grammar.md §Defaults).

7.5 Controlled term sources

ControlledTermSource ::: OntologySource | BranchSource
                       | ClassSource | ValueSetSource
  // discriminator: kind

OntologySource ::: object {
  "kind": "OntologySource"
  ontology: OntologyReference
}

OntologyReference ::: object {
  iri: OntologyIri
  displayHint?: OntologyDisplayHint
}

OntologyDisplayHint ::: object {
  acronym?: OntologyAcronym
  name?: OntologyName
}
  // at least one of acronym, name MUST be present

BranchSource ::: object {
  "kind": "BranchSource"
  ontology: OntologyReference
  rootTermIri: RootTermIri
  rootTermLabel?: RootTermLabel
  maxTraversalDepth?: MaxTraversalDepth
}
  // rootTermLabel SHOULD be present (captured at source-declaration time)
  // but MAY be omitted when the term's display text is not available

ClassSource ::: object {
  "kind": "ClassSource"
  classes: nonEmptyArray<ControlledTermClass>
}

ControlledTermClass ::: object {
  term: TermIri
  label?: Label
  ontology: OntologyReference
}
  // term is a TermIri
  // label SHOULD be present (captured at source-declaration time)
  // but MAY be omitted when the term's display text is not available

ValueSetSource ::: object {
  "kind": "ValueSetSource"
  identifier: ValueSetIdentifier
  name?: ValueSetName
  iri?: ValueSetIri
}

OntologyAcronym ::: string
OntologyName ::: MultilingualString
OntologyIri ::: Iri
RootTermIri ::: Iri
RootTermLabel ::: MultilingualString
MaxTraversalDepth ::: number
ValueSetIdentifier ::: string
ValueSetName ::: MultilingualString
ValueSetIri ::: Iri

The leaf productions used by the controlled-term sources collapse on the wire per §1.6; their ::: definitions are listed alongside the source productions for slot-type reference.
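As an illustration of how the kind discriminator selects among the four source productions, the sketch below dispatches on `kind` and checks the required properties of each variant, plus the OntologyDisplayHint rule that at least one of acronym or name is present. The names `REQUIRED_BY_KIND` and `check_source` are hypothetical, and the IRI in the usage example is a placeholder, not a real ontology.

```python
from typing import List

# Required (non-optional) properties of each ControlledTermSource variant.
REQUIRED_BY_KIND = {
    "OntologySource": ["ontology"],
    "BranchSource": ["ontology", "rootTermIri"],
    "ClassSource": ["classes"],
    "ValueSetSource": ["identifier"],
}


def check_source(src: dict) -> List[str]:
    """Check required properties and display hints of one source object."""
    kind = src.get("kind")
    required = REQUIRED_BY_KIND.get(kind)
    if required is None:
        return [f"unknown ControlledTermSource kind: {kind!r}"]
    problems = [f"{kind} is missing {name!r}" for name in required if name not in src]

    # Collect every OntologyReference reachable from this source.
    refs = []
    if isinstance(src.get("ontology"), dict):
        refs.append(src["ontology"])
    if kind == "ClassSource":
        classes = src.get("classes")
        if isinstance(classes, list):
            if not classes:
                problems.append("classes must be a non-empty array")
            for cls in classes:
                if isinstance(cls, dict) and isinstance(cls.get("ontology"), dict):
                    refs.append(cls["ontology"])

    for ref in refs:
        hint = ref.get("displayHint")
        if isinstance(hint, dict) and not ({"acronym", "name"} & set(hint)):
            problems.append("displayHint needs at least one of acronym, name")
    return problems


branch = {
    "kind": "BranchSource",
    "ontology": {"iri": "https://example.org/onto", "displayHint": {}},
}
```

The sample `branch` object yields two problems: the missing `rootTermIri` and the empty display hint.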

7.6 Rendering hints

The RenderingHint union is heterogeneous: SingleValuedEnumRenderingHint, MultiValuedEnumRenderingHint, and BooleanRenderingHint encode as flat strings, while every other member (TextRenderingHint, NumericRenderingHint, DateRenderingHint, TimeRenderingHint, DateTimeRenderingHint, and the placeholder-only hints) encodes as an object that can carry configuration. Because some members are strings (which cannot carry a "kind" property), and because the same string can belong to more than one member ("checkbox", for example, is both a MultiValuedEnumRenderingHint and a BooleanRenderingHint), the union uses discriminator: position (§1.3): the decoder identifies the variant from the enclosing FieldSpec’s family. For example, the value at TextFieldSpec.renderingHint is decoded as a TextRenderingHint, the value at SingleValuedEnumFieldSpec.renderingHint as a SingleValuedEnumRenderingHint, and so on.

RenderingHint ::: TextRenderingHint | SingleValuedEnumRenderingHint
                | MultiValuedEnumRenderingHint | NumericRenderingHint
                | BooleanRenderingHint
                | DateRenderingHint | TimeRenderingHint | DateTimeRenderingHint
                | ControlledTermRenderingHint
                | EmailRenderingHint | PhoneNumberRenderingHint
                | LinkRenderingHint
                | OrcidRenderingHint | RorRenderingHint | DoiRenderingHint
                | PubMedIdRenderingHint | RridRenderingHint
                | NihGrantIdRenderingHint
  // discriminator: position
  // resolved by the renderingHint property of the enclosing FieldSpec

TextRenderingHint ::: object {
  lineMode?: TextLineMode
  placeholder?: Placeholder
}

TextLineMode ::: "singleLine" | "multiLine"

SingleValuedEnumRenderingHint ::: "radio" | "dropdown"

MultiValuedEnumRenderingHint ::: "checkbox" | "multiSelect"

NumericRenderingHint ::: object {
  decimalPlaces?: DecimalPlaces
  placeholder?: Placeholder
}
  // decimalPlaces, when present, MUST be a non-negative integer
  // it is a presentation concern (display rounding); it does NOT
  // constrain the lexical form of submitted values

BooleanRenderingHint ::: "checkbox" | "toggle" | "radio" | "dropdown"

ControlledTermRenderingHint ::: object { placeholder?: Placeholder }
EmailRenderingHint          ::: object { placeholder?: Placeholder }
PhoneNumberRenderingHint    ::: object { placeholder?: Placeholder }
LinkRenderingHint           ::: object { placeholder?: Placeholder }
OrcidRenderingHint          ::: object { placeholder?: Placeholder }
RorRenderingHint            ::: object { placeholder?: Placeholder }
DoiRenderingHint            ::: object { placeholder?: Placeholder }
PubMedIdRenderingHint       ::: object { placeholder?: Placeholder }
RridRenderingHint           ::: object { placeholder?: Placeholder }
NihGrantIdRenderingHint     ::: object { placeholder?: Placeholder }

Placeholder ::: MultilingualString

Placeholder collapses on the wire per the wrapper-collapse rule (§1.6).
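Positional discrimination of rendering hints can be sketched as a lookup keyed by the enclosing FieldSpec's kind. The tables below are transcribed from the productions in this section; the function name `rendering_hint_variant` is hypothetical, and the sketch checks only string membership or object-ness, not the inner properties of object hints.

```python
from typing import Optional

# FieldSpec kinds whose renderingHint is a flat string, with the allowed strings.
STRING_HINTS = {
    "SingleValuedEnumFieldSpec": {"radio", "dropdown"},
    "MultiValuedEnumFieldSpec": {"checkbox", "multiSelect"},
    "BooleanFieldSpec": {"checkbox", "toggle", "radio", "dropdown"},
}

# FieldSpec kind -> RenderingHint variant found at its renderingHint slot.
HINT_VARIANT = {
    "TextFieldSpec": "TextRenderingHint",
    "IntegerNumberFieldSpec": "NumericRenderingHint",
    "RealNumberFieldSpec": "NumericRenderingHint",
    "BooleanFieldSpec": "BooleanRenderingHint",
    "DateFieldSpec": "DateRenderingHint",
    "TimeFieldSpec": "TimeRenderingHint",
    "DateTimeFieldSpec": "DateTimeRenderingHint",
    "ControlledTermFieldSpec": "ControlledTermRenderingHint",
    "SingleValuedEnumFieldSpec": "SingleValuedEnumRenderingHint",
    "MultiValuedEnumFieldSpec": "MultiValuedEnumRenderingHint",
    "LinkFieldSpec": "LinkRenderingHint",
    "EmailFieldSpec": "EmailRenderingHint",
    "PhoneNumberFieldSpec": "PhoneNumberRenderingHint",
    "OrcidFieldSpec": "OrcidRenderingHint",
    "RorFieldSpec": "RorRenderingHint",
    "DoiFieldSpec": "DoiRenderingHint",
    "PubMedIdFieldSpec": "PubMedIdRenderingHint",
    "RridFieldSpec": "RridRenderingHint",
    "NihGrantIdFieldSpec": "NihGrantIdRenderingHint",
}


def rendering_hint_variant(field_spec: dict) -> Optional[str]:
    """Name the RenderingHint variant at field_spec.renderingHint, if any."""
    if "renderingHint" not in field_spec:
        return None
    kind = field_spec.get("kind")
    hint = field_spec["renderingHint"]
    variant = HINT_VARIANT.get(kind)
    if variant is None:
        raise ValueError(f"{kind!r} has no renderingHint slot")
    if kind in STRING_HINTS:
        if not isinstance(hint, str) or hint not in STRING_HINTS[kind]:
            raise ValueError(f"{variant} must be one of {sorted(STRING_HINTS[kind])}")
    elif not isinstance(hint, dict):
        raise ValueError(f"{variant} must be a JSON object")
    return variant
```

The same wire value resolves differently by position: `"checkbox"` is a BooleanRenderingHint under a BooleanFieldSpec and a MultiValuedEnumRenderingHint under a MultiValuedEnumFieldSpec.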


8. Field artifacts

Field ::: TextField | NumericField | BooleanField
        | DateField | TimeField | DateTimeField
        | ControlledTermField
        | SingleValuedEnumField | MultiValuedEnumField
        | LinkField | EmailField | PhoneNumberField
        | OrcidField | RorField | DoiField | PubMedIdField
        | RridField | NihGrantIdField | AttributeValueField
  // discriminator: kind
  // NumericField is itself a union of IntegerNumberField and RealNumberField

NumericField ::: IntegerNumberField | RealNumberField
  // discriminator: kind

TemporalField ::: DateField | TimeField | DateTimeField
  // discriminator: kind
  // a documented intermediate category; the wire form is just the variant

EnumField ::: SingleValuedEnumField | MultiValuedEnumField
  // discriminator: kind

ContactField ::: EmailField | PhoneNumberField
  // discriminator: kind

ExternalAuthorityField ::: OrcidField | RorField | DoiField
                         | PubMedIdField | RridField | NihGrantIdField
  // discriminator: kind

TextField ↗ EmbeddedTextField ::: object {
  "kind": "TextField"
  id: TextFieldId
  modelVersion: ModelVersion
  metadata: CatalogMetadata
  versioning: SchemaArtifactVersioning
  fieldSpec: TextFieldSpec
  label: Label
  helpText?: HelpText
}
  // modelVersion is a SemanticVersion 2.0.0 lexical form

IntegerNumberField ↗ EmbeddedIntegerNumberField ::: object {
  "kind": "IntegerNumberField"
  id: IntegerNumberFieldId
  modelVersion: ModelVersion
  metadata: CatalogMetadata
  versioning: SchemaArtifactVersioning
  fieldSpec: IntegerNumberFieldSpec
  label: Label
  helpText?: HelpText
}
  // modelVersion is a SemanticVersion 2.0.0 lexical form

RealNumberField ↗ EmbeddedRealNumberField ::: object {
  "kind": "RealNumberField"
  id: RealNumberFieldId
  modelVersion: ModelVersion
  metadata: CatalogMetadata
  versioning: SchemaArtifactVersioning
  fieldSpec: RealNumberFieldSpec
  label: Label
  helpText?: HelpText
}
  // modelVersion is a SemanticVersion 2.0.0 lexical form

BooleanField ↗ EmbeddedBooleanField ::: object {
  "kind": "BooleanField"
  id: BooleanFieldId
  modelVersion: ModelVersion
  metadata: CatalogMetadata
  versioning: SchemaArtifactVersioning
  fieldSpec: BooleanFieldSpec
  label: Label
  helpText?: HelpText
}
  // modelVersion is a SemanticVersion 2.0.0 lexical form

DateField ↗ EmbeddedDateField ::: object {
  "kind": "DateField"
  id: DateFieldId
  modelVersion: ModelVersion
  metadata: CatalogMetadata
  versioning: SchemaArtifactVersioning
  fieldSpec: DateFieldSpec
  label: Label
  helpText?: HelpText
}
  // modelVersion is a SemanticVersion 2.0.0 lexical form

TimeField ↗ EmbeddedTimeField ::: object {
  "kind": "TimeField"
  id: TimeFieldId
  modelVersion: ModelVersion
  metadata: CatalogMetadata
  versioning: SchemaArtifactVersioning
  fieldSpec: TimeFieldSpec
  label: Label
  helpText?: HelpText
}
  // modelVersion is a SemanticVersion 2.0.0 lexical form

DateTimeField ↗ EmbeddedDateTimeField ::: object {
  "kind": "DateTimeField"
  id: DateTimeFieldId
  modelVersion: ModelVersion
  metadata: CatalogMetadata
  versioning: SchemaArtifactVersioning
  fieldSpec: DateTimeFieldSpec
  label: Label
  helpText?: HelpText
}
  // modelVersion is a SemanticVersion 2.0.0 lexical form

ControlledTermField ↗ EmbeddedControlledTermField ::: object {
  "kind": "ControlledTermField"
  id: ControlledTermFieldId
  modelVersion: ModelVersion
  metadata: CatalogMetadata
  versioning: SchemaArtifactVersioning
  fieldSpec: ControlledTermFieldSpec
  label: Label
  helpText?: HelpText
}
  // modelVersion is a SemanticVersion 2.0.0 lexical form

SingleValuedEnumField ↗ EmbeddedSingleValuedEnumField ::: object {
  "kind": "SingleValuedEnumField"
  id: SingleValuedEnumFieldId
  modelVersion: ModelVersion
  metadata: CatalogMetadata
  versioning: SchemaArtifactVersioning
  fieldSpec: SingleValuedEnumFieldSpec
  label: Label
  helpText?: HelpText
}
  // modelVersion is a SemanticVersion 2.0.0 lexical form

MultiValuedEnumField ↗ EmbeddedMultiValuedEnumField ::: object {
  "kind": "MultiValuedEnumField"
  id: MultiValuedEnumFieldId
  modelVersion: ModelVersion
  metadata: CatalogMetadata
  versioning: SchemaArtifactVersioning
  fieldSpec: MultiValuedEnumFieldSpec
  label: Label
  helpText?: HelpText
}
  // modelVersion is a SemanticVersion 2.0.0 lexical form

LinkField ↗ EmbeddedLinkField ::: object {
  "kind": "LinkField"
  id: LinkFieldId
  modelVersion: ModelVersion
  metadata: CatalogMetadata
  versioning: SchemaArtifactVersioning
  fieldSpec: LinkFieldSpec
  label: Label
  helpText?: HelpText
}
  // modelVersion is a SemanticVersion 2.0.0 lexical form

EmailField ↗ EmbeddedEmailField ::: object {
  "kind": "EmailField"
  id: EmailFieldId
  modelVersion: ModelVersion
  metadata: CatalogMetadata
  versioning: SchemaArtifactVersioning
  fieldSpec: EmailFieldSpec
  label: Label
  helpText?: HelpText
}
  // modelVersion is a SemanticVersion 2.0.0 lexical form

PhoneNumberField ↗ EmbeddedPhoneNumberField ::: object {
  "kind": "PhoneNumberField"
  id: PhoneNumberFieldId
  modelVersion: ModelVersion
  metadata: CatalogMetadata
  versioning: SchemaArtifactVersioning
  fieldSpec: PhoneNumberFieldSpec
  label: Label
  helpText?: HelpText
}
  // modelVersion is a SemanticVersion 2.0.0 lexical form

OrcidField ↗ EmbeddedOrcidField ::: object {
  "kind": "OrcidField"
  id: OrcidFieldId
  modelVersion: ModelVersion
  metadata: CatalogMetadata
  versioning: SchemaArtifactVersioning
  fieldSpec: OrcidFieldSpec
  label: Label
  helpText?: HelpText
}
  // modelVersion is a SemanticVersion 2.0.0 lexical form

RorField ↗ EmbeddedRorField ::: object {
  "kind": "RorField"
  id: RorFieldId
  modelVersion: ModelVersion
  metadata: CatalogMetadata
  versioning: SchemaArtifactVersioning
  fieldSpec: RorFieldSpec
  label: Label
  helpText?: HelpText
}
  // modelVersion is a SemanticVersion 2.0.0 lexical form

DoiField ↗ EmbeddedDoiField ::: object {
  "kind": "DoiField"
  id: DoiFieldId
  modelVersion: ModelVersion
  metadata: CatalogMetadata
  versioning: SchemaArtifactVersioning
  fieldSpec: DoiFieldSpec
  label: Label
  helpText?: HelpText
}
  // modelVersion is a SemanticVersion 2.0.0 lexical form

PubMedIdField ↗ EmbeddedPubMedIdField ::: object {
  "kind": "PubMedIdField"
  id: PubMedIdFieldId
  modelVersion: ModelVersion
  metadata: CatalogMetadata
  versioning: SchemaArtifactVersioning
  fieldSpec: PubMedIdFieldSpec
  label: Label
  helpText?: HelpText
}
  // modelVersion is a SemanticVersion 2.0.0 lexical form

RridField ↗ EmbeddedRridField ::: object {
  "kind": "RridField"
  id: RridFieldId
  modelVersion: ModelVersion
  metadata: CatalogMetadata
  versioning: SchemaArtifactVersioning
  fieldSpec: RridFieldSpec
  label: Label
  helpText?: HelpText
}
  // modelVersion is a SemanticVersion 2.0.0 lexical form

NihGrantIdField ↗ EmbeddedNihGrantIdField ::: object {
  "kind": "NihGrantIdField"
  id: NihGrantIdFieldId
  modelVersion: ModelVersion
  metadata: CatalogMetadata
  versioning: SchemaArtifactVersioning
  fieldSpec: NihGrantIdFieldSpec
  label: Label
  helpText?: HelpText
}
  // modelVersion is a SemanticVersion 2.0.0 lexical form

AttributeValueField ↗ EmbeddedAttributeValueField ::: object {
  "kind": "AttributeValueField"
  id: AttributeValueFieldId
  modelVersion: ModelVersion
  metadata: CatalogMetadata
  versioning: SchemaArtifactVersioning
  fieldSpec: AttributeValueFieldSpec
  label: Label
  helpText?: HelpText
}
  // modelVersion is a SemanticVersion 2.0.0 lexical form

9. Embedded artifacts

Most embedded-field productions follow the same ten-property template (kind, key, artifactRef, valueRequirement?, cardinality?, visibility?, defaultValue?, labelOverride?, helpTextOverride?, property?), with the per-family typing applied at artifactRef and defaultValue. Four families deviate from this template; the deviations are listed here so an implementer can scan them in one place rather than spotting them inside the per-family productions below.

Family                         Deviation
EmbeddedBooleanField           omits cardinality (booleans are inherently single-valued)
EmbeddedSingleValuedEnumField  omits cardinality (single-valued is implicit, parallel to boolean)
EmbeddedMultiValuedEnumField   defaultValue?: array<EnumValue> rather than a singular Value (multi-valued enum admits a list of pre-selected tokens; each element is a tagged EnumValue per §1.5)
EmbeddedAttributeValueField    omits defaultValue (attribute-value fields have no spec-level default)

EmbeddedTemplate and EmbeddedPresentationComponent follow their own shapes; see the per-production definitions later in this section.
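The shared template and its deviations can be captured as a base property list plus per-family omissions, following the per-family productions in this section. This is a sketch with hypothetical names (`BASE_PROPERTIES`, `allowed_properties`, `unexpected_properties`); it covers embedded fields only, and whether a decoder rejects properties outside the production is governed by the encoding rules, not by this example.

```python
from typing import List

# Property names shared by the embedded-field productions, in production order.
BASE_PROPERTIES = (
    "kind", "key", "artifactRef", "valueRequirement", "cardinality",
    "visibility", "defaultValue", "labelOverride", "helpTextOverride",
    "property",
)

# Families that drop a slot from the shared template.
OMITTED = {
    "EmbeddedBooleanField": {"cardinality"},
    "EmbeddedSingleValuedEnumField": {"cardinality"},
    "EmbeddedAttributeValueField": {"defaultValue"},
}


def allowed_properties(embedded_kind: str) -> List[str]:
    """List the property names the production for embedded_kind declares."""
    if not (embedded_kind.startswith("Embedded") and embedded_kind.endswith("Field")):
        # EmbeddedTemplate and EmbeddedPresentationComponent have their own shapes.
        raise ValueError(f"{embedded_kind!r} is not an embedded field kind")
    omitted = OMITTED.get(embedded_kind, set())
    return [name for name in BASE_PROPERTIES if name not in omitted]


def unexpected_properties(embedded: dict) -> List[str]:
    """Return the properties of a wire object that its production does not declare."""
    return sorted(set(embedded) - set(allowed_properties(embedded["kind"])))


embedded_flag = {
    "kind": "EmbeddedBooleanField",
    "key": "consent",
    "artifactRef": "https://example.org/fields/consent",  # placeholder id
    "cardinality": {"min": 1},
}
```

For `embedded_flag`, `unexpected_properties` returns `["cardinality"]`, since boolean embeddings carry no cardinality slot. The EmbeddedMultiValuedEnumField deviation changes only the type of defaultValue, so it does not appear in `OMITTED`.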

EmbeddedArtifact ::: EmbeddedField | EmbeddedTemplate
                   | EmbeddedPresentationComponent
  // discriminator: kind

EmbeddedField ::: EmbeddedTextField
                | EmbeddedIntegerNumberField | EmbeddedRealNumberField
                | EmbeddedBooleanField
                | EmbeddedDateField | EmbeddedTimeField | EmbeddedDateTimeField
                | EmbeddedControlledTermField
                | EmbeddedSingleValuedEnumField | EmbeddedMultiValuedEnumField
                | EmbeddedLinkField
                | EmbeddedEmailField | EmbeddedPhoneNumberField
                | EmbeddedOrcidField | EmbeddedRorField | EmbeddedDoiField
                | EmbeddedPubMedIdField | EmbeddedRridField
                | EmbeddedNihGrantIdField
                | EmbeddedAttributeValueField
  // discriminator: kind

EmbeddedTextField ↗ TextField ::: object {
  "kind": "EmbeddedTextField"
  key: EmbeddedArtifactKey
  artifactRef: TextFieldId
  valueRequirement?: ValueRequirement
  cardinality?: Cardinality
  visibility?: Visibility
  defaultValue?: TextValue
  labelOverride?: LabelOverride
  helpTextOverride?: HelpTextOverride
  property?: Property
}

EmbeddedIntegerNumberField ↗ IntegerNumberField ::: object {
  "kind": "EmbeddedIntegerNumberField"
  key: EmbeddedArtifactKey
  artifactRef: IntegerNumberFieldId
  valueRequirement?: ValueRequirement
  cardinality?: Cardinality
  visibility?: Visibility
  defaultValue?: IntegerNumberValue
  labelOverride?: LabelOverride
  helpTextOverride?: HelpTextOverride
  property?: Property
}

EmbeddedRealNumberField ↗ RealNumberField ::: object {
  "kind": "EmbeddedRealNumberField"
  key: EmbeddedArtifactKey
  artifactRef: RealNumberFieldId
  valueRequirement?: ValueRequirement
  cardinality?: Cardinality
  visibility?: Visibility
  defaultValue?: RealNumberValue
  labelOverride?: LabelOverride
  helpTextOverride?: HelpTextOverride
  property?: Property
}

EmbeddedBooleanField ↗ BooleanField ::: object {
  "kind": "EmbeddedBooleanField"
  key: EmbeddedArtifactKey
  artifactRef: BooleanFieldId
  valueRequirement?: ValueRequirement
  visibility?: Visibility
  defaultValue?: BooleanValue
  labelOverride?: LabelOverride
  helpTextOverride?: HelpTextOverride
  property?: Property
}
  // boolean embeddings carry no cardinality slot per grammar.md
  // (booleans are inherently single-valued)

EmbeddedDateField ↗ DateField ::: object {
  "kind": "EmbeddedDateField"
  key: EmbeddedArtifactKey
  artifactRef: DateFieldId
  valueRequirement?: ValueRequirement
  cardinality?: Cardinality
  visibility?: Visibility
  defaultValue?: DateValue
  labelOverride?: LabelOverride
  helpTextOverride?: HelpTextOverride
  property?: Property
}

EmbeddedTimeField ↗ TimeField ::: object {
  "kind": "EmbeddedTimeField"
  key: EmbeddedArtifactKey
  artifactRef: TimeFieldId
  valueRequirement?: ValueRequirement
  cardinality?: Cardinality
  visibility?: Visibility
  defaultValue?: TimeValue
  labelOverride?: LabelOverride
  helpTextOverride?: HelpTextOverride
  property?: Property
}

EmbeddedDateTimeField ↗ DateTimeField ::: object {
  "kind": "EmbeddedDateTimeField"
  key: EmbeddedArtifactKey
  artifactRef: DateTimeFieldId
  valueRequirement?: ValueRequirement
  cardinality?: Cardinality
  visibility?: Visibility
  defaultValue?: DateTimeValue
  labelOverride?: LabelOverride
  helpTextOverride?: HelpTextOverride
  property?: Property
}

EmbeddedControlledTermField ↗ ControlledTermField ::: object {
  "kind": "EmbeddedControlledTermField"
  key: EmbeddedArtifactKey
  artifactRef: ControlledTermFieldId
  valueRequirement?: ValueRequirement
  cardinality?: Cardinality
  visibility?: Visibility
  defaultValue?: ControlledTermValue
  labelOverride?: LabelOverride
  helpTextOverride?: HelpTextOverride
  property?: Property
}

EmbeddedSingleValuedEnumField ↗ SingleValuedEnumField ::: object {
  "kind": "EmbeddedSingleValuedEnumField"
  key: EmbeddedArtifactKey
  artifactRef: SingleValuedEnumFieldId
  valueRequirement?: ValueRequirement
  visibility?: Visibility
  defaultValue?: EnumValue
  labelOverride?: LabelOverride
  helpTextOverride?: HelpTextOverride
  property?: Property
}
  // single-valued enum embeddings carry no cardinality slot per
  // grammar.md (single-valued enum is implicit, parallel to boolean)

EmbeddedMultiValuedEnumField ↗ MultiValuedEnumField ::: object {
  "kind": "EmbeddedMultiValuedEnumField"
  key: EmbeddedArtifactKey
  artifactRef: MultiValuedEnumFieldId
  valueRequirement?: ValueRequirement
  cardinality?: Cardinality
  visibility?: Visibility
  defaultValue?: array<EnumValue>
  labelOverride?: LabelOverride
  helpTextOverride?: HelpTextOverride
  property?: Property
}
  // defaultValue is a (possibly empty) array of EnumValue entries;
  // each element is a tagged EnumValue per the kind rule (§1.5).
  // The array MUST NOT contain duplicate `value` entries.

EmbeddedLinkField ↗ LinkField ::: object {
  "kind": "EmbeddedLinkField"
  key: EmbeddedArtifactKey
  artifactRef: LinkFieldId
  valueRequirement?: ValueRequirement
  cardinality?: Cardinality
  visibility?: Visibility
  defaultValue?: LinkValue
  labelOverride?: LabelOverride
  helpTextOverride?: HelpTextOverride
  property?: Property
}

EmbeddedEmailField ↗ EmailField ::: object {
  "kind": "EmbeddedEmailField"
  key: EmbeddedArtifactKey
  artifactRef: EmailFieldId
  valueRequirement?: ValueRequirement
  cardinality?: Cardinality
  visibility?: Visibility
  defaultValue?: EmailValue
  labelOverride?: LabelOverride
  helpTextOverride?: HelpTextOverride
  property?: Property
}

EmbeddedPhoneNumberField ↗ PhoneNumberField ::: object {
  "kind": "EmbeddedPhoneNumberField"
  key: EmbeddedArtifactKey
  artifactRef: PhoneNumberFieldId
  valueRequirement?: ValueRequirement
  cardinality?: Cardinality
  visibility?: Visibility
  defaultValue?: PhoneNumberValue
  labelOverride?: LabelOverride
  helpTextOverride?: HelpTextOverride
  property?: Property
}

EmbeddedOrcidField ↗ OrcidField ::: object {
  "kind": "EmbeddedOrcidField"
  key: EmbeddedArtifactKey
  artifactRef: OrcidFieldId
  valueRequirement?: ValueRequirement
  cardinality?: Cardinality
  visibility?: Visibility
  defaultValue?: OrcidValue
  labelOverride?: LabelOverride
  helpTextOverride?: HelpTextOverride
  property?: Property
}

EmbeddedRorField ↗ RorField ::: object {
  "kind": "EmbeddedRorField"
  key: EmbeddedArtifactKey
  artifactRef: RorFieldId
  valueRequirement?: ValueRequirement
  cardinality?: Cardinality
  visibility?: Visibility
  defaultValue?: RorValue
  labelOverride?: LabelOverride
  helpTextOverride?: HelpTextOverride
  property?: Property
}

EmbeddedDoiField ↗ DoiField ::: object {
  "kind": "EmbeddedDoiField"
  key: EmbeddedArtifactKey
  artifactRef: DoiFieldId
  valueRequirement?: ValueRequirement
  cardinality?: Cardinality
  visibility?: Visibility
  defaultValue?: DoiValue
  labelOverride?: LabelOverride
  helpTextOverride?: HelpTextOverride
  property?: Property
}

EmbeddedPubMedIdField ↗ PubMedIdField ::: object {
  "kind": "EmbeddedPubMedIdField"
  key: EmbeddedArtifactKey
  artifactRef: PubMedIdFieldId
  valueRequirement?: ValueRequirement
  cardinality?: Cardinality
  visibility?: Visibility
  defaultValue?: PubMedIdValue
  labelOverride?: LabelOverride
  helpTextOverride?: HelpTextOverride
  property?: Property
}

EmbeddedRridField ↗ RridField ::: object {
  "kind": "EmbeddedRridField"
  key: EmbeddedArtifactKey
  artifactRef: RridFieldId
  valueRequirement?: ValueRequirement
  cardinality?: Cardinality
  visibility?: Visibility
  defaultValue?: RridValue
  labelOverride?: LabelOverride
  helpTextOverride?: HelpTextOverride
  property?: Property
}

EmbeddedNihGrantIdField ↗ NihGrantIdField ::: object {
  "kind": "EmbeddedNihGrantIdField"
  key: EmbeddedArtifactKey
  artifactRef: NihGrantIdFieldId
  valueRequirement?: ValueRequirement
  cardinality?: Cardinality
  visibility?: Visibility
  defaultValue?: NihGrantIdValue
  labelOverride?: LabelOverride
  helpTextOverride?: HelpTextOverride
  property?: Property
}

EmbeddedAttributeValueField ↗ AttributeValueField ::: object {
  "kind": "EmbeddedAttributeValueField"
  key: EmbeddedArtifactKey
  artifactRef: AttributeValueFieldId
  valueRequirement?: ValueRequirement
  cardinality?: Cardinality
  visibility?: Visibility
  labelOverride?: LabelOverride
  helpTextOverride?: HelpTextOverride
  property?: Property
}
  // attribute-value embeddings carry no defaultValue per grammar.md

EmbeddedTemplate ::: object {
  "kind": "EmbeddedTemplate"
  key: EmbeddedArtifactKey
  artifactRef: TemplateId
  valueRequirement?: ValueRequirement
  cardinality?: Cardinality
  visibility?: Visibility
  labelOverride?: LabelOverride
  property?: Property
}

EmbeddedPresentationComponent ::: object {
  "kind": "EmbeddedPresentationComponent"
  key: EmbeddedArtifactKey
  artifactRef: PresentationComponentId
  visibility?: Visibility
}

10. Presentation Components

PresentationComponent ::: RichTextComponent | ImageComponent
                        | YoutubeVideoComponent
                        | SectionBreakComponent | PageBreakComponent
  // discriminator: kind

RichTextComponent ::: object {
  "kind": "RichTextComponent"
  id: PresentationComponentId
  modelVersion: ModelVersion
  metadata: CatalogMetadata
  html: HtmlContent
}
  // modelVersion is a SemanticVersion 2.0.0 lexical form

ImageComponent ::: object {
  "kind": "ImageComponent"
  id: PresentationComponentId
  modelVersion: ModelVersion
  metadata: CatalogMetadata
  image: Iri
  label?: Label
  description?: Description
}
  // modelVersion is a SemanticVersion 2.0.0 lexical form
  // image is an Iri identifying the image resource
  // label, when present, is short alt-text accessibility metadata
  // description, when present, is longer accessibility-focused text

YoutubeVideoComponent ::: object {
  "kind": "YoutubeVideoComponent"
  id: PresentationComponentId
  modelVersion: ModelVersion
  metadata: CatalogMetadata
  video: Iri
  label?: Label
  description?: Description
}
  // modelVersion is a SemanticVersion 2.0.0 lexical form
  // video is an Iri identifying the video resource
  // label, when present, is short alt-text / caption-title accessibility metadata
  // description, when present, is longer accessibility-focused text

SectionBreakComponent ::: object {
  "kind": "SectionBreakComponent"
  id: PresentationComponentId
  modelVersion: ModelVersion
  metadata: CatalogMetadata
}
  // modelVersion is a SemanticVersion 2.0.0 lexical form

PageBreakComponent ::: object {
  "kind": "PageBreakComponent"
  id: PresentationComponentId
  modelVersion: ModelVersion
  metadata: CatalogMetadata
}
  // modelVersion is a SemanticVersion 2.0.0 lexical form

HtmlContent ::: string

11. Templates and Top-Level Artifacts

Artifact ::: SchemaArtifact | PresentationComponent | TemplateInstance
  // discriminator: kind
  // kind ∈ {"Field", "Template", "RichTextComponent", "ImageComponent",
  //         "YoutubeVideoComponent", "SectionBreakComponent",
  //         "PageBreakComponent", "TemplateInstance"}

SchemaArtifact ::: Field | Template
  // discriminator: kind

Template ::: object {
  "kind": "Template"
  id: TemplateId
  modelVersion: ModelVersion
  metadata: CatalogMetadata
  versioning: SchemaArtifactVersioning
  title: Title
  renderingHint?: TemplateRenderingHint
  header?: Header
  footer?: Footer
  members: array<EmbeddedArtifact>
}
  // modelVersion is a SemanticVersion 2.0.0 lexical form
  // EmbeddedArtifact keys (each member's `key` property) MUST be unique
  // within `members` (per grammar.md §Embedded Artifact Key)
  // the order of `members` MUST be preserved

TemplateRenderingHint ::: object {
  helpDisplayMode?: HelpDisplayMode
}

HelpDisplayMode ::: "inline" | "tooltip" | "both" | "none"

Title ::: MultilingualString

Header ::: MultilingualString
Footer ::: MultilingualString
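
The key-uniqueness constraint on Template.members can be checked mechanically. The following is a minimal sketch, not a normative algorithm: the helper name duplicate_member_keys, the example.org IRIs, and the abbreviated template (required slots such as id, modelVersion, metadata, versioning, and title are left out) are all assumptions made for illustration.

```python
from collections import Counter


def duplicate_member_keys(template: dict) -> list[str]:
    """Return keys that occur more than once in `members`, in first-seen order."""
    counts = Counter(member["key"] for member in template["members"])
    return [key for key, n in counts.items() if n > 1]


# Hypothetical, abbreviated template: only the slots needed for the check.
template = {
    "kind": "Template",
    "members": [
        {"kind": "EmbeddedDoiField", "key": "doi",
         "artifactRef": "https://example.org/fields/doi"},
        {"kind": "EmbeddedRorField", "key": "funder",
         "artifactRef": "https://example.org/fields/ror"},
        # Reuses the key "doi", which violates the uniqueness rule.
        {"kind": "EmbeddedRorField", "key": "doi",
         "artifactRef": "https://example.org/fields/ror"},
    ],
}

print(duplicate_member_keys(template))
```

Because members is a JSON array, a decoder that keeps the list as-is also satisfies the order-preservation requirement without extra work.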

12. Instances

TemplateInstance ::: object {
  "kind": "TemplateInstance"
  id: TemplateInstanceId
  modelVersion: ModelVersion
  metadata: CatalogMetadata
  templateRef: TemplateId
  label?: Label
  values: array<InstanceValue>
}
  // modelVersion is a SemanticVersion 2.0.0 lexical form
  // metadata is CatalogMetadata; instances do not carry schema
  // versioning, so there is no top-level versioning slot
  // label, when present, is a user-supplied name for this instance,
  // shown in catalog listings or detail views

InstanceValue ::: FieldValue | NestedTemplateInstance
  // discriminator: kind

FieldValue ::: object {
  "kind": "FieldValue"
  key: EmbeddedArtifactKey
  values: nonEmptyArray<Value>
}
  // values MUST be non-empty (per grammar's Value+; absence of a value is
  // represented by omitting the FieldValue entirely)

NestedTemplateInstance ::: object {
  "kind": "NestedTemplateInstance"
  key: EmbeddedArtifactKey
  values: array<InstanceValue>
}
  // values MAY be empty
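
The two InstanceValue shapes differ in whether values may be empty. A minimal recursive check might look like the sketch below; the function name instance_value_errors, the path format in the messages, and the sample keys are illustrative assumptions, not part of the specification.

```python
def instance_value_errors(value: dict, path: str = "values") -> list[str]:
    """Collect violations of the FieldValue / NestedTemplateInstance shape rules."""
    kind = value.get("kind")
    here = f"{path}[{value.get('key')!r}]"
    if kind == "FieldValue":
        # nonEmptyArray<Value>: absence is expressed by omitting the FieldValue.
        if not value.get("values"):
            return [f"{here}: FieldValue.values must be non-empty"]
        return []
    if kind == "NestedTemplateInstance":
        # array<InstanceValue>: may be empty; recurse into each child.
        errors: list[str] = []
        for child in value.get("values", []):
            errors.extend(instance_value_errors(child, f"{here}.values"))
        return errors
    return [f"{here}: unknown InstanceValue kind {kind!r}"]


values = [
    {"kind": "FieldValue", "key": "title",
     "values": [{"kind": "TextValue", "value": "Sample study"}]},
    {"kind": "NestedTemplateInstance", "key": "contact", "values": [
        {"kind": "FieldValue", "key": "email", "values": []},  # violates the rule
    ]},
]
errors = [e for v in values for e in instance_value_errors(v)]
print(errors)
```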

13. Cross-reference

For the JSON-encoding rules that frame this grammar — property naming (lowerCamelCase), Unicode normalisation, big-integer string fallback, implementation-extension prefixes, and worked end-to-end examples — see serialization.md. For the abstract grammar this file mirrors, see grammar.md. For conformance rules, see validation.md.


14. Property-name map

This section makes the implicit map between abstract grammar component slots and JSON property names explicit. Each entry lists, for one abstract production, the abstract component types in their grammar-defined order paired with the wire property name used to encode that component.

The list covers every abstract production in grammar.md that has at least one component. Productions whose abstract form has no components (e.g. AttributeValueFieldSpec; see §14.8) and pure-union or enum-string productions (e.g. Value, ValueRequirement) carry no property-name mapping and are not listed.

Conventions:

  • Each entry leads with the abstract production name in bold and, in parentheses, the name of the corresponding lower_snake_case constructor form from grammar.md, e.g. Template (template), YoutubeVideoComponent (you_tube_video_component). The parenthesised name is informational, included so a reader cross-referencing this section against grammar.md can match ::= productions to entries here without manually re-deriving the snake_case form. It does not appear on the wire and has no normative effect.
  • Component order follows grammar.md. Component-index numbering is zero-based.
  • Optional [X] and repeated X* / X+ components are noted alongside the component type.
  • The mapping records the wire property name; whether the encoded object carries a kind discriminator at that slot is determined separately by the kind rule (§1.5) and is not duplicated here.
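
One way to read the map is as a lookup from component position to wire property name. The sketch below assumes a hypothetical PROPERTY_NAMES table holding only the Cardinality and Property entries from §14.5, and uses None to stand for an absent optional component; neither convention is mandated by this specification.

```python
# Hypothetical excerpt of the property-name map: production -> names by index.
PROPERTY_NAMES = {
    "Cardinality": ("min", "max"),
    "Property": ("iri", "label"),
}


def to_wire(production: str, components: tuple) -> dict:
    """Pair positional components with their wire property names."""
    names = PROPERTY_NAMES[production]
    if len(components) != len(names):
        raise ValueError(f"{production} expects {len(names)} components")
    # Absent optionals (None here) are omitted rather than emitted as null.
    return {name: c for name, c in zip(names, components) if c is not None}


print(to_wire("Cardinality", (1, None)))
print(to_wire("Property", ("https://schema.org/name", None)))
```

Whether the resulting object also carries a kind discriminator is a separate question answered by the kind rule, so the helper deliberately leaves it out.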

14.1 Top-level artifacts and templates

Template (template):
  0. TemplateId → id
  1. ModelVersion → modelVersion
  2. CatalogMetadata → metadata
  3. SchemaArtifactVersioning → versioning
  4. Title → title
  5. [TemplateRenderingHint] → renderingHint?
  6. [Header] → header?
  7. [Footer] → footer?
  8. EmbeddedArtifact* → members

TemplateRenderingHint (template_rendering_hint):
  0. [HelpDisplayMode] → helpDisplayMode?

TemplateInstance (template_instance):
  0. TemplateInstanceId → id
  1. ModelVersion → modelVersion
  2. CatalogMetadata → metadata
  3. TemplateId → templateRef
  4. [Label] → label?
  5. InstanceValue* → values

14.2 Field artifacts

Every concrete Field production has the same six-component shape: (<Family>FieldId, ModelVersion, CatalogMetadata, SchemaArtifactVersioning, <Family>FieldSpec, Label), with an optional seventh HelpText slot. For all of TextField, IntegerNumberField, RealNumberField, BooleanField, DateField, TimeField, DateTimeField, ControlledTermField, SingleValuedEnumField, MultiValuedEnumField, LinkField, EmailField, PhoneNumberField, OrcidField, RorField, DoiField, PubMedIdField, RridField, NihGrantIdField, and AttributeValueField:

  1. <Family>FieldId → id
  2. ModelVersion → modelVersion
  3. CatalogMetadata → metadata
  4. SchemaArtifactVersioning → versioning
  5. <Family>FieldSpec → fieldSpec
  6. Label → label
  7. [HelpText] → helpText?

14.3 Embedded artifacts

Every concrete EmbeddedXxxField production follows the same pattern, with the per-family typed-id and typed-default-value slots:

  1. EmbeddedArtifactKey → key
  2. <Family>FieldId → artifactRef
  3. [ValueRequirement] → valueRequirement?
  4. [Cardinality] → cardinality? (omitted on EmbeddedBooleanField and EmbeddedSingleValuedEnumField)
  5. [Visibility] → visibility?
  6. [<Family>Value] → defaultValue? (omitted on EmbeddedAttributeValueField; on EmbeddedMultiValuedEnumField the slot is EnumValue* → defaultValue?: array<EnumValue>)
  7. [LabelOverride] → labelOverride?
  8. [HelpTextOverride] → helpTextOverride?
  9. [Property] → property?

(Component indices are renumbered to skip slots a particular family omits, per the per-family abstract production. The list above gives the canonical ordering common to the family.)

EmbeddedTemplate (embedded_template):
  0. EmbeddedArtifactKey → key
  1. TemplateId → artifactRef
  2. [ValueRequirement] → valueRequirement?
  3. [Cardinality] → cardinality?
  4. [Visibility] → visibility?
  5. [LabelOverride] → labelOverride?
  6. [Property] → property?

EmbeddedPresentationComponent (embedded_presentation_component):
  0. EmbeddedArtifactKey → key
  1. PresentationComponentId → artifactRef
  2. [Visibility] → visibility?

14.4 Catalog metadata

CatalogMetadata (catalog_metadata):
  0. [PreferredLabel] → preferredLabel?
  1. [Description] → description?
  2. [Identifier] → identifier?
  3. AlternativeLabel* → altLabels? (SHOULD-omitted when empty per §1.7 rule 4)
  4. LifecycleMetadata → lifecycle
  5. Annotation* → annotations? (SHOULD-omitted when empty)

On schema artifacts, SchemaArtifactVersioning appears as a separate top-level versioning slot on the artifact rather than being nested inside metadata.

LifecycleMetadata (lifecycle_metadata):
  0. CreatedOn → createdOn
  1. CreatedBy → createdBy
  2. ModifiedOn → modifiedOn
  3. ModifiedBy → modifiedBy

SchemaArtifactVersioning (schema_artifact_versioning):
  0. Version → version
  1. Status → status
  2. [PreviousVersion] → previousVersion?
  3. [DerivedFrom] → derivedFrom?

Annotation (annotation):
  0. Iri → property
  1. AnnotationValue → body

AnnotationStringValue (annotation_string_value):
  0. LexicalForm → value
  1. [LanguageTag] → lang?

AnnotationIriValue (annotation_iri_value):
  0. Iri → iri

14.5 Embedded artifact properties

Cardinality (cardinality):
  0. MinCardinality → min
  1. [MaxCardinality] → max?

LabelOverride (label_override):
  0. Label → label
  1. AlternativeLabel* → altLabels

Property (property):
  0. PropertyIri → iri
  1. [PropertyLabel] → label?

14.6 Multilingual strings

LangString (lang_string):
  0. string → value
  1. Bcp47Tag → lang

14.7 Values

TextValue (text_value):
  0. LexicalForm → value
  1. [LanguageTag] → lang?

IntegerNumberValue (integer_number_value):
  0. LexicalForm → value

RealNumberValue (real_number_value):
  0. LexicalForm → value
  1. RealNumberDatatypeKind → datatype

BooleanValue (boolean_value):
  0. boolean → value

YearValue (year_value):
  0. LexicalForm → value

YearMonthValue (year_month_value):
  0. LexicalForm → value

FullDateValue (full_date_value):
  0. LexicalForm → value

TimeValue (time_value):
  0. LexicalForm → value

DateTimeValue (date_time_value):
  0. LexicalForm → value

ControlledTermValue (controlled_term_value):
  0. TermIri → term
  1. [Label] → label?
  2. [Notation] → notation?
  3. [PreferredLabel] → preferredLabel?

EnumValue (enum_value):
  0. Token → value

LinkValue (link_value):
  0. Iri → iri
  1. [Label] → label?

EmailValue (email_value):
  0. LexicalForm → value

PhoneNumberValue (phone_number_value):
  0. LexicalForm → value

OrcidValue (orcid_value):
  0. OrcidIri → iri
  1. [Label] → label?

RorValue (ror_value):
  0. RorIri → iri
  1. [Label] → label?

DoiValue (doi_value):
  0. DoiIri → iri
  1. [Label] → label?

PubMedIdValue (pub_med_id_value):
  0. PubMedIri → iri
  1. [Label] → label?

RridValue (rrid_value):
  0. RridIri → iri
  1. [Label] → label?

NihGrantIdValue (nih_grant_id_value):
  0. NihGrantIri → iri
  1. [Label] → label?

AttributeValue (attribute_value):
  0. AttributeName → name
  1. Value → value

14.8 Field specs

TextFieldSpec (text_field_spec):
  0. [TextValue] → defaultValue?
  1. [MinLength] → minLength?
  2. [MaxLength] → maxLength?
  3. [ValidationRegex] → validationRegex?
  4. [LangTagRequirement] → langTagRequirement?
  5. [TextRenderingHint] → renderingHint?

IntegerNumberFieldSpec (integer_number_field_spec):
  0. [IntegerNumberValue] → defaultValue?
  1. [Unit] → unit?
  2. [IntegerNumberMinValue] → minValue?
  3. [IntegerNumberMaxValue] → maxValue?
  4. [NumericRenderingHint] → renderingHint?

RealNumberFieldSpec (real_number_field_spec):
  0. RealNumberDatatypeKind → datatype
  1. [RealNumberValue] → defaultValue?
  2. [Unit] → unit?
  3. [RealNumberMinValue] → minValue?
  4. [RealNumberMaxValue] → maxValue?
  5. [NumericRenderingHint] → renderingHint?

BooleanFieldSpec (boolean_field_spec):
  0. [BooleanValue] → defaultValue?
  1. [BooleanRenderingHint] → renderingHint?

Unit (unit):
  0. Iri → iri
  1. [Label] → label?

DateFieldSpec (date_field_spec):
  0. DateValueType → dateValueType
  1. [DateValue] → defaultValue?
  2. [DateRenderingHint] → renderingHint?

TimeFieldSpec (time_field_spec):
  0. [TimeValue] → defaultValue?
  1. [TimePrecision] → timePrecision?
  2. [TimezoneRequirement] → timezoneRequirement?
  3. [TimeRenderingHint] → renderingHint?

DateTimeFieldSpec (date_time_field_spec):
  0. DateTimeValueType → dateTimeValueType
  1. [DateTimeValue] → defaultValue?
  2. [TimezoneRequirement] → timezoneRequirement?
  3. [DateTimeRenderingHint] → renderingHint?

ControlledTermFieldSpec (controlled_term_field_spec):
  0. [ControlledTermValue] → defaultValue?
  1. ControlledTermSource+ → sources
  2. [ControlledTermRenderingHint] → renderingHint?

SingleValuedEnumFieldSpec (single_valued_enum_field_spec):
  0. PermissibleValue+ → permissibleValues
  1. [EnumValue] → defaultValue?
  2. [SingleValuedEnumRenderingHint] → renderingHint?

MultiValuedEnumFieldSpec (multi_valued_enum_field_spec):
  0. PermissibleValue+ → permissibleValues
  1. EnumValue* → defaultValues? (SHOULD-omitted when empty per §1.7 rule 4)
  2. [MultiValuedEnumRenderingHint] → renderingHint?

PermissibleValue (permissible_value):
  0. Token → value
  1. [Label] → label?
  2. [Description] → description?
  3. Meaning* → meanings? (SHOULD-omitted when empty)

Meaning (meaning):
  0. TermIri → iri
  1. [Label] → label?

TextRenderingHint (text_rendering_hint):
  0. [TextLineMode] → lineMode?
  1. [Placeholder] → placeholder?

DateRenderingHint (date_rendering_hint):
  0. [DateComponentOrder] → componentOrder?
  1. [Placeholder] → placeholder?

TimeRenderingHint (time_rendering_hint):
  0. [TimeFormat] → timeFormat?
  1. [Placeholder] → placeholder?

DateTimeRenderingHint (date_time_rendering_hint):
  0. [TimeFormat] → timeFormat?
  1. [Placeholder] → placeholder?

NumericRenderingHint (numeric_rendering_hint):
  0. [DecimalPlaces] → decimalPlaces?
  1. [Placeholder] → placeholder?

The ten new rendering hints introduced for previously hint-less families each carry a single optional slot:

  • ControlledTermRenderingHint (controlled_term_rendering_hint): [Placeholder] → placeholder?
  • EmailRenderingHint (email_rendering_hint): [Placeholder] → placeholder?
  • PhoneNumberRenderingHint (phone_number_rendering_hint): [Placeholder] → placeholder?
  • LinkRenderingHint (link_rendering_hint): [Placeholder] → placeholder?
  • OrcidRenderingHint (orcid_rendering_hint): [Placeholder] → placeholder?
  • RorRenderingHint (ror_rendering_hint): [Placeholder] → placeholder?
  • DoiRenderingHint (doi_rendering_hint): [Placeholder] → placeholder?
  • PubMedIdRenderingHint (pub_med_id_rendering_hint): [Placeholder] → placeholder?
  • RridRenderingHint (rrid_rendering_hint): [Placeholder] → placeholder?
  • NihGrantIdRenderingHint (nih_grant_id_rendering_hint): [Placeholder] → placeholder?

LinkFieldSpec (link_field_spec):
  0. [LinkValue] → defaultValue?
  1. [LinkRenderingHint] → renderingHint?

EmailFieldSpec (email_field_spec):
  0. [EmailValue] → defaultValue?
  1. [EmailRenderingHint] → renderingHint?

PhoneNumberFieldSpec (phone_number_field_spec):
  0. [PhoneNumberValue] → defaultValue?
  1. [PhoneNumberRenderingHint] → renderingHint?

OrcidFieldSpec (orcid_field_spec):
  0. [OrcidValue] → defaultValue?
  1. [OrcidRenderingHint] → renderingHint?

RorFieldSpec (ror_field_spec):
  0. [RorValue] → defaultValue?
  1. [RorRenderingHint] → renderingHint?

DoiFieldSpec (doi_field_spec):
  0. [DoiValue] → defaultValue?
  1. [DoiRenderingHint] → renderingHint?

PubMedIdFieldSpec (pub_med_id_field_spec):
  0. [PubMedIdValue] → defaultValue?
  1. [PubMedIdRenderingHint] → renderingHint?

RridFieldSpec (rrid_field_spec):
  0. [RridValue] → defaultValue?
  1. [RridRenderingHint] → renderingHint?

NihGrantIdFieldSpec (nih_grant_id_field_spec):
  0. [NihGrantIdValue] → defaultValue?
  1. [NihGrantIdRenderingHint] → renderingHint?

AttributeValueFieldSpec carries no components and has no entry here.

14.9 Controlled term sources

OntologySource (ontology_source):
  0. OntologyReference → ontology

OntologyReference (ontology_reference):
  0. OntologyIri → iri
  1. [OntologyDisplayHint] → displayHint?

OntologyDisplayHint (ontology_display_hint):
  0. [OntologyAcronym] → acronym?
  1. [OntologyName] → name?

BranchSource (branch_source):
  0. OntologyReference → ontology
  1. RootTermIri → rootTermIri
  2. [RootTermLabel] → rootTermLabel?
  3. [MaxTraversalDepth] → maxTraversalDepth?

ClassSource (class_source):
  0. ControlledTermClass+ → classes

ControlledTermClass (controlled_term_class):
  0. TermIri → term
  1. [Label] → label?
  2. OntologyReference → ontology

ValueSetSource (value_set_source):
  0. ValueSetIdentifier → identifier
  1. [ValueSetName] → name?
  2. [ValueSetIri] → iri?

14.10 Presentation components

RichTextComponent (rich_text_component):
  0. PresentationComponentId → id
  1. ModelVersion → modelVersion
  2. CatalogMetadata → metadata
  3. HtmlContent → html

ImageComponent (image_component):
  0. PresentationComponentId → id
  1. ModelVersion → modelVersion
  2. CatalogMetadata → metadata
  3. Iri → image
  4. [Label] → label?
  5. [Description] → description?

YoutubeVideoComponent (you_tube_video_component):
  0. PresentationComponentId → id
  1. ModelVersion → modelVersion
  2. CatalogMetadata → metadata
  3. Iri → video
  4. [Label] → label?
  5. [Description] → description?

SectionBreakComponent (section_break_component):
  0. PresentationComponentId → id
  1. ModelVersion → modelVersion
  2. CatalogMetadata → metadata

PageBreakComponent (page_break_component):
  0. PresentationComponentId → id
  1. ModelVersion → modelVersion
  2. CatalogMetadata → metadata

14.11 Instances

FieldValue (field_value):
  0. EmbeddedArtifactKey → key
  1. Value+ → values

NestedTemplateInstance (nested_template_instance):
  0. EmbeddedArtifactKey → key
  1. InstanceValue* → values

14.12 Collapsed-wrapper productions

The single-component wrapper productions enumerated in §1.6 — every XxxFieldId, TemplateId, TemplateInstanceId, PresentationComponentId, Iri, TermIri, LanguageTag, LexicalForm, IsoDateTimeStamp, NonNegativeInteger, MinCardinality, MaxCardinality, MinLength, MaxLength, DecimalPlaces, MaxTraversalDepth, the typed external-authority IRIs, Name, Description, PreferredLabel, AlternativeLabel, Label, PropertyLabel, OntologyName, OntologyAcronym, OntologyIri, RootTermIri, RootTermLabel, ValueSetIdentifier, ValueSetName, ValueSetIri, Notation, Identifier, AttributeName, EmbeddedArtifactKey, ValidationRegex, Token, Header, Footer, Version, ModelVersion, CreatedOn, CreatedBy, ModifiedOn, ModifiedBy, PreviousVersion, DerivedFrom, PropertyIri, and HtmlContent — collapse to their inner primitive on the wire and have no per-production property name. The single component appears directly at the slot in the enclosing production whose property name is given by that production’s mapping.
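
To illustrate the collapse, the sketch below models MinCardinality and MaxCardinality as explicit wrappers on the abstract side and shows them flattening to plain numbers under the min and max properties of Cardinality. The dataclass layout and the encode_cardinality helper are assumptions for illustration; in particular, NonNegativeInteger holds a Python int here as a simplification of the grammar's integer lexical form.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NonNegativeInteger:
    value: int  # simplification: the grammar carries an integer lexical form


@dataclass(frozen=True)
class MinCardinality:
    inner: NonNegativeInteger


@dataclass(frozen=True)
class MaxCardinality:
    inner: NonNegativeInteger


@dataclass(frozen=True)
class Cardinality:
    min: MinCardinality
    max: Optional[MaxCardinality] = None


def encode_cardinality(c: Cardinality) -> dict:
    """Wrappers vanish on the wire; only the inner primitive is written."""
    wire = {"min": c.min.inner.value}
    if c.max is not None:
        wire["max"] = c.max.inner.value
    return wire


card = Cardinality(MinCardinality(NonNegativeInteger(1)),
                   MaxCardinality(NonNegativeInteger(3)))
print(encode_cardinality(card))
```

The wrapper names never reach the wire: the property names min and max in the enclosing Cardinality object are what identify the collapsed productions.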

JSON Serialization

This document defines a normative JSON wire format for the CEDAR Template Model. Conforming implementations in any host language MUST produce and consume documents that follow the encoding defined here, so that artifacts can be exchanged between implementations with no information loss.

This document is a companion to, but not part of, the abstract grammar. The abstract grammar in grammar.md defines what a CEDAR template is; wire-grammar.md defines the JSON shape of every grammar production; this document defines the encoding rules and conventions that frame those shapes, plus illustrative examples.

1. Purpose and Scope

1.1 Purpose

The CEDAR Structural Model is intentionally serialization-agnostic at the grammar level. Implementations in different host languages may realize abstract constructs as language-idiomatic data structures (TypeScript interfaces, Java records, Python dataclasses, etc.). For two implementations to exchange artifacts, a common wire format is required.

This document defines that common wire format using JSON (RFC 8259) as the target encoding. The format is:

  • Native — encodes the Structural Model directly, without conflating schema, schema-of-schemas, and presentation concerns.
  • Lossless — every abstract construct encodes to exactly one JSON value, and every conforming JSON value decodes to exactly one abstract construct.
  • Round-trippable — encoding then decoding yields the same abstract construct.

1.2 Relationship to other specifications

grammar.md is the authoritative definition of the abstract Structural Model. This document defines an encoding of that model and does not extend or modify it. Where the grammar permits multiple equivalent abstract forms, this document selects exactly one wire form.

wire-grammar.md is the formal source of truth for the JSON shape of every grammar production. It mirrors grammar.md one-to-one and uses a compact JSON-shaped notation. Per-production property tables formerly in §6 of this document have moved there. The present document carries the encoding philosophy, JSON-specific rules, and worked examples.

validation.md defines the conformance rules a Structural Model artifact must satisfy. This document does not define validation; a JSON document MAY be wire-format-conformant yet fail Structural Model validation, and vice versa.

ctm-1.6.0-serialization.md defines a one-directional, lossy mapping from the Structural Model to the legacy CEDAR Template Model 1.6.0 JSON-LD format. This is a separate concern; the encoding defined in the present document is independent of CTM 1.6.0 and not interconvertible with it.

Note on JSON-LD shape parallel

The string-bearing and IRI-bearing Value shapes defined below are structurally similar to JSON-LD’s term forms — value/lang/datatype parallel JSON-LD’s @value/@language/@type, and iri parallels @id. This similarity is incidental: the wire form is CEDAR-native and stands on its own. RDF interoperability is provided by a separate derived projection (see rdf-projection.md).

Conforming documents are not JSON-LD. They carry no @context, are not interpretable as RDF graphs without external schema knowledge, and do not follow JSON-LD’s compaction, expansion, or framing algorithms. A future JSON-LD encoding parallel to (and convertible to/from) the native form defined here MAY be defined; that work is out of scope for this document.

1.3 Scope

In scope:

  • The JSON encoding rules (property naming, NFC normalisation, integer handling) that frame the shapes formally defined in wire-grammar.md.
  • Discriminator placement (the kind / position rules).
  • The wrapping principle that determines which productions are tagged JSON objects vs flat JSON values.
  • Worked end-to-end examples.

Out of scope:

  • Per-production property tables. Those live normatively in wire-grammar.md.
  • JSON-LD, RDF, or other RDF-graph representations.
  • YAML, msgpack, CBOR, or other non-JSON encodings.
  • Validation conformance (validation.md).
  • Storage and transport concerns (file naming, MIME types, HTTP headers, etc.).
  • Per-language implementation concerns: decoder/encoder code structure, error-reporting conventions, partial-decoding strategies, in-memory data shapes, and similar realization decisions. These are addressed in language-specific binding documents (forthcoming).

2. Conformance Language

The words MUST, MUST NOT, SHOULD, SHOULD NOT, and MAY are used in the sense of RFC 2119 and RFC 8174.

A conforming JSON document is a JSON value that satisfies every encoding rule in this document, matches the wire shape defined for some production in wire-grammar.md, and corresponds to some abstract Structural Model construct as defined in grammar.md.

A conforming implementation is software that, when given an abstract Structural Model construct, produces a conforming JSON document; and when given a conforming JSON document, decodes it to the corresponding abstract construct.

3. Conventions

3.1 Production references

Production names from grammar.md and wire-grammar.md appear in UpperCamelCase. Constructor forms from grammar.md appear in lower_snake_case. Concrete JSON property names appear in lowerCamelCase.

3.2 JSON terminology

The terms object, array, string, number, boolean, null, and value refer to JSON values per RFC 8259. The terms property, member, and element refer to the structural components of those values.

3.3 Property naming

Property names within tagged objects MUST be lowerCamelCase translations of the corresponding component names in the production. Where a component name in the grammar is itself an UpperCamelCase production name (e.g. EmbeddedArtifactKey), the JSON property uses the role-name from the production (e.g. key) rather than the production name itself. The canonical property name for any production component is the one given in its wire-grammar.md entry.

3.4 Examples

JSON examples appear in fenced code blocks marked json. Examples are illustrative only; the normative content is the corresponding wire-grammar.md entry.

Examples may use placeholders of the form <ProductionName> to denote the JSON encoding of a production at the surrounding position. A placeholder is resolved by replacing it with the encoding defined for that production in wire-grammar.md. The * and + suffixes (e.g. <Annotation>*, <EnumValue>+) denote sequences per §4.3: zero-or-more and one-or-more respectively.

4. General Encoding Rules

4.1 Tagged and untagged objects

JSON objects in the wire format are either tagged — carrying a "kind" property — or untagged — without "kind". Whether an object is tagged is determined by its production: every member of a discriminator: kind union is tagged at every position; every other production is untagged at every position. See §4.4 for the rule.

When an object is tagged, the value of "kind" MUST be the production name from grammar.md, transcribed in UpperCamelCase exactly as the grammar names it. For example, "TextValue" for the TextValue production. The grammar’s lower_snake_case constructor forms (e.g. text_value(...)) describe abstract composition and do not appear on the wire.

A conforming implementation MUST reject any object whose tagged-or-untagged status does not match its production (per §4.4), whose "kind" value (when tagged) does not match any production known to the implementation, or whose other properties do not match the wire-grammar entry for the named production.
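
A decoder can apply these rejection rules with a small table keyed by kind. The sketch below is illustrative only: the SHAPES table is a hypothetical two-entry excerpt (TextValue and EnumValue, using the property names from the wire grammar's property-name map), check_tagged is an invented helper name, and the handling of extension properties described in §4.7 is left out for brevity.

```python
# Hypothetical subset of the wire grammar: kind -> (required, optional) names.
SHAPES = {
    "TextValue": ({"value"}, {"lang"}),
    "EnumValue": ({"value"}, set()),
}


def check_tagged(obj: dict) -> str:
    """Return the kind of a tagged object, or raise ValueError if it is malformed."""
    kind = obj.get("kind")
    if kind is None:
        raise ValueError("tagged object is missing 'kind'")
    if kind not in SHAPES:
        # Matching is exact: "text_value" is a constructor form, not a wire kind.
        raise ValueError(f"unknown kind {kind!r}")
    required, optional = SHAPES[kind]
    names = set(obj) - {"kind"}
    missing = required - names
    unexpected = names - required - optional
    if missing or unexpected:
        raise ValueError(
            f"{kind}: missing={sorted(missing)} unexpected={sorted(unexpected)}"
        )
    return kind


print(check_tagged({"kind": "TextValue", "value": "hello", "lang": "en"}))
```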

4.2 Optional components

A grammar component marked [X] (optional) MUST be omitted from its enclosing JSON object when not present. A conforming implementation MUST NOT emit null or an empty string in place of an absent optional component.

A conforming implementation MUST treat the absence of an optional property as equivalent to that component not being present in the abstract construct.

On decode, a conforming implementation MUST reject any document in which an optional property is present with the JSON value null. The two conforming wire forms for an absent optional are: the property is omitted entirely, or the enclosing object is itself absent. Treating null as equivalent to absent is non-conforming because it admits two distinct wire forms for the same abstract state, breaking round-trip equality.
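
The encode and decode sides of this rule can be sketched as follows. Both helper names are hypothetical. The null scan flags every null-valued property it finds, which rests on the assumption that no production in the wire grammar uses JSON null as a legitimate value.

```python
from typing import Optional


def encode_text_value(value: str, lang: Optional[str] = None) -> dict:
    """Encode a TextValue, omitting `lang` entirely when it is absent."""
    wire = {"kind": "TextValue", "value": value}
    if lang is not None:
        wire["lang"] = lang
    return wire


def null_property_paths(node, path: str = "$") -> list:
    """List the paths of object properties whose value is JSON null."""
    found = []
    if isinstance(node, dict):
        for name, child in node.items():
            child_path = f"{path}.{name}"
            if child is None:
                found.append(child_path)
            else:
                found.extend(null_property_paths(child, child_path))
    elif isinstance(node, list):
        for i, child in enumerate(node):
            found.extend(null_property_paths(child, f"{path}[{i}]"))
    return found


print(encode_text_value("hello"))
print(null_property_paths({"kind": "TextValue", "value": "hello", "lang": None}))
```

A decoder built this way would reject the second document instead of silently treating "lang": null as absent.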

4.3 Sequence components

A grammar component marked X* (zero or more) is encoded as a JSON array. The array MAY be empty.

A grammar component marked X+ (one or more) is encoded as a JSON array. The array MUST contain at least one element. In wire-grammar.md these are written nonEmptyArray<X>.

The order of elements in the JSON array MUST match the order of components in the abstract construct. A conforming implementation MUST preserve this order through encode and decode.
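
As a rough illustration, a decoder for an X+ slot only needs an emptiness check on top of ordinary array handling, since Python lists and the standard json module both keep element order. The helper name decode_non_empty_array and the sample permissibleValues document are assumptions for this sketch.

```python
import json


def decode_non_empty_array(raw, slot: str) -> list:
    """Decode an X+ slot: a JSON array with at least one element, order kept."""
    if not isinstance(raw, list):
        raise TypeError(f"{slot}: expected a JSON array")
    if not raw:
        raise ValueError(f"{slot}: X+ components need at least one element")
    return list(raw)


doc = json.loads(
    '{"permissibleValues": [{"value": "low"}, {"value": "mid"}, {"value": "high"}]}'
)
values = decode_non_empty_array(doc["permissibleValues"], "permissibleValues")
order = [v["value"] for v in values]
round_tripped = json.loads(json.dumps(doc))
print(order)
```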

4.4 Discriminator placement

A JSON object’s discriminator presence depends on its production, not on the position it occupies in the document. Per wire-grammar.md §1.5, every production is either a member of some discriminator: kind union or it is not, and the encoding follows uniformly:

Polymorphic-union members — productions that appear as alternatives in a discriminator: kind union (e.g. Value, FieldSpec, Annotation.body: AnnotationValue, EmbeddedField, EmbeddedArtifact, every Field family, every Value family) — MUST encode as a tagged JSON object carrying "kind": "<ProductionName>". The discriminator is present even when the surrounding context (the enclosing object’s kind and property name) would already determine the family — for example, EmbeddedTextField.defaultValue carries "kind": "TextValue" even though EmbeddedTextField.kind already pins the family. Uniformity of the rule is preferred over the small wire-size saving.

Singleton-only productions — productions that never appear as members of any discriminator: kind union (Cardinality, Property, LabelOverride, CatalogMetadata, LifecycleMetadata, SchemaArtifactVersioning, Annotation, Unit, OntologyReference, OntologyDisplayHint, ControlledTermClass, PermissibleValue, Meaning, and the temporal RenderingHint object variants) — MUST encode as untagged JSON objects whose properties correspond to the production’s components. A "kind" property MUST NOT appear.

The rule applies recursively: a tagged object whose own components include further composite objects follows the same rule for each of those components, with the encoding determined by each inner production’s own discriminator-union membership.

Position-discriminated unions

A few unions occupy fixed singleton positions where the enclosing production and the property name together determine the variant. For example, the RenderingHint variant is determined by the FieldSpec family of the parent object. These wire entries are flagged // discriminator: position in wire-grammar.md.

Implementations MUST NOT rely on JSON property ordering to discriminate alternatives.
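
The contrast between tagged and untagged components can be seen on a single embedded field. In the sketch below, the TAGGED_SLOTS and UNTAGGED_SLOTS sets are a hypothetical, hand-picked excerpt keyed by property name, the artifactRef IRI is invented, and tagging_errors is an illustrative helper rather than a prescribed algorithm.

```python
# Hypothetical excerpt: which slots of an embedded field hold union members.
TAGGED_SLOTS = {"defaultValue"}                               # member of Value
UNTAGGED_SLOTS = {"cardinality", "property", "labelOverride"}  # singleton-only


def tagging_errors(embedded: dict) -> list:
    """Report components whose tagged/untagged status contradicts the rule."""
    errors = []
    if "kind" not in embedded:
        errors.append("embedded artifact must carry 'kind'")
    for name, value in embedded.items():
        if not isinstance(value, dict):
            continue
        has_kind = "kind" in value
        if name in TAGGED_SLOTS and not has_kind:
            errors.append(f"{name}: union member must carry 'kind'")
        if name in UNTAGGED_SLOTS and has_kind:
            errors.append(f"{name}: singleton-only production must not carry 'kind'")
    return errors


embedded = {
    "kind": "EmbeddedTextField",
    "key": "title",
    "artifactRef": "https://example.org/fields/title",
    "cardinality": {"min": 1},                                   # untagged
    "defaultValue": {"kind": "TextValue", "value": "Untitled"},  # tagged
}
print(tagging_errors(embedded))
```

Note that the check never looks at property order, in line with the requirement that ordering must not be used to discriminate alternatives.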

4.5 String values

Strings are JSON strings encoded in UTF-8. Lexical-form strings (e.g. the value property of a TextValue) MUST be transmitted in Unicode Normalization Form C (NFC). A conforming encoder MUST emit NFC. A conforming decoder receiving non-NFC input handles it per §9.6.
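
On the encoder side, NFC normalisation is typically a single library call. The sketch below uses Python's standard unicodedata module; the helper names to_nfc and is_nfc are illustrative, and unicodedata.is_normalized requires Python 3.8 or later.

```python
import unicodedata


def to_nfc(text: str) -> str:
    """Return the NFC form of a lexical-form string before it is emitted."""
    return unicodedata.normalize("NFC", text)


def is_nfc(text: str) -> bool:
    """Check whether a received string is already in NFC."""
    return unicodedata.is_normalized("NFC", text)


decomposed = "Cafe\u0301"  # "e" followed by U+0301 COMBINING ACUTE ACCENT
composed = to_nfc(decomposed)
print(is_nfc(decomposed), is_nfc(composed), len(decomposed), len(composed))
```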

4.6 Number values

Integer-valued grammar productions (e.g. NonNegativeInteger) are encoded as JSON numbers without a fractional part or exponent. Implementations MUST encode integer values that fit within JSON Number’s safe integer range (the integers in the closed interval [−(2^53 − 1), 2^53 − 1]) without loss. Values outside that range fall under §5.1 below — the wire grammar permits a JSON-string fallback, but implementations MAY refuse to encode out-of-range values since no current use site exercises this case.

Decimal-valued grammar productions are encoded as JSON numbers in standard decimal notation per RFC 8259.
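
The safe-integer boundary can be enforced with a simple range check before encoding. This sketch takes the option of refusing out-of-range values rather than emitting a string fallback; encode_integer and MAX_SAFE_INTEGER are illustrative names, not identifiers defined by this specification.

```python
import json

MAX_SAFE_INTEGER = 2**53 - 1  # upper bound of the closed safe-integer interval


def encode_integer(n: int) -> str:
    """Encode an integer as a JSON number, refusing values outside the safe range."""
    if isinstance(n, bool) or not isinstance(n, int):
        raise TypeError("expected an integer")
    if abs(n) > MAX_SAFE_INTEGER:
        # This sketch refuses; a string fallback is the other permitted route.
        raise ValueError(f"{n} is outside the safe integer range")
    return json.dumps(n)


print(encode_integer(MAX_SAFE_INTEGER))
```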

4.7 Implementation freedom

A conforming implementation MAY add JSON properties beyond those defined here for non-normative purposes (annotations, hashes, signatures, etc.), provided those properties begin with _ or $ to avoid collision with future normative additions. Decoders MUST ignore such properties. Decoders encountering a property whose name does not begin with _ or $ and is not declared by the production at the position MUST report a wire-shape error per §9.5.

A conforming implementation MAY emit JSON object properties in any order; the wire format is order-independent at the object level.
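
A decoder might separate extension properties from normative ones along these lines. The function name split_properties and the caller-supplied declared set are assumptions; for a tagged object the caller would include "kind" in that set.

```python
def split_properties(obj: dict, declared: set) -> tuple:
    """Drop _/$-prefixed extensions and collect undeclared property names."""
    normative = {}
    undeclared = []
    for name, value in obj.items():
        if name.startswith(("_", "$")):
            continue  # implementation extension: ignored by decoders
        if name in declared:
            normative[name] = value
        else:
            undeclared.append(name)  # to be reported as a wire-shape error
    return normative, undeclared


cardinality = {"min": 1, "_hash": "abc123", "maximum": 3}
normative, undeclared = split_properties(cardinality, {"min", "max"})
print(normative, undeclared)
```

Here _hash is skipped silently, while maximum is surfaced because it is neither declared for Cardinality nor prefixed as an extension.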

5. The Wrapping Principle

The grammar uses constructor forms uniformly to define every production, including productions that consist of a single component of a primitive type. For example:

Header ::= header( MultilingualString )
NonNegativeInteger ::= non_negative_integer( IntegerLexicalForm )
EmbeddedArtifactKey ::= embedded_artifact_key( AsciiIdentifier )

A literal translation would encode each such production as a tagged JSON object with a single payload property. This document does not require that. Instead, the wrapping principle applies:

A production is encoded as a tagged JSON object only when wrapping carries information beyond the production’s payload. Otherwise, the production is encoded as the JSON value of its single component, and the production’s identity is communicated by the property name in the enclosing object.

A production carries information beyond its payload, and so MUST be encoded as a tagged object, when at least one of the following holds:

  • (a) Composite structure. The production has more than one named component (e.g. Cardinality, Property, LabelOverride, every Value family).

  • (b) Discriminated union membership. The production participates in a union where alternatives must be distinguished at decode time (e.g. Value, every artifact’s kind, the twenty Field family variants). The discriminator is "kind".

  • (c) Lexical-form preservation. The production carries lexical content whose preservation requires more than a JSON primitive can express (e.g. LangString carries a lexical form and a language tag; both must be present in the wire form).

A production that satisfies none of these is encoded flat: the JSON value at the corresponding property position in the enclosing object is the JSON encoding of the production’s single component, with no "kind" wrapper.

The full list of productions that collapse this way is given in §1.6 of wire-grammar.md. At a glance:

  • All MultilingualString-typed wrappers (Header, Footer, Name, Description, PreferredLabel, AlternativeLabel, Label, PropertyLabel, OntologyName, RootTermLabel, ValueSetName) flatten to a JSON array of LangString entries.
  • All single-Iri wrappers (artifact identifiers and references, PropertyIri, the typed external-authority IRIs, OntologyIri, etc.) flatten to a plain JSON string.
  • All single-NonNegativeInteger wrappers (MinLength, MaxLength, MinCardinality, MaxCardinality, DecimalPlaces, MaxTraversalDepth) flatten to a plain JSON number.
  • Plain-string wrappers (Identifier, Notation, OntologyAcronym, ValueSetIdentifier, HtmlContent) flatten to a plain JSON string.
  • Enum-style productions (Status, ValueRequirement, Visibility, DateValueType, TimePrecision, DateTimeValueType, TimezoneRequirement, DateComponentOrder, TimeFormat, TextRenderingHint, SingleValuedEnumRenderingHint, MultiValuedEnumRenderingHint, BooleanRenderingHint, RealNumberDatatypeKind) flatten to a JSON string drawn from a fixed set.
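
The three conditions (a) to (c) amount to a simple decision procedure. The sketch below is illustrative only; the ProductionTraits record and the function collapses_to_payload are hypothetical names, and the authoritative list of collapsing productions remains the one in wire-grammar.md.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ProductionTraits:
    component_count: int              # number of named components, condition (a)
    in_kind_union: bool               # member of a kind-discriminated union, condition (b)
    needs_lexical_preservation: bool  # more than a JSON primitive can express, condition (c)

def collapses_to_payload(traits: ProductionTraits) -> bool:
    """True when the production is encoded as the JSON value of its single component."""
    composite = traits.component_count > 1
    return not (composite or traits.in_kind_union or traits.needs_lexical_preservation)

header = ProductionTraits(1, False, False)       # Header ::= header( MultilingualString )
cardinality = ProductionTraits(2, False, False)  # more than one named component
text_value = ProductionTraits(2, True, False)    # member of the Value union
lang_string = ProductionTraits(2, False, True)   # lexical form plus language tag
```

Under these assumed traits only Header collapses; the other three keep an object shape on the wire.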

5.1 Lexical-form preservation

Big integers. NonNegativeInteger values that exceed JSON Number’s safe integer range (the magnitude bound 2^53 − 1) MAY be encoded as JSON strings rather than numbers. A decoder MUST accept both forms. In practice this case does not arise for the model’s current use sites (length bounds, cardinality bounds, traversal depths, numeric precision are all small); implementations MAY refuse to encode an out-of-range value rather than fall back to the string form. If a future use site introduces values that routinely exceed the safe range, this section will be revisited to make the string fallback a MUST.
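
The number-or-string rule can be sketched as follows. This is a minimal Python illustration, not a normative algorithm: the function names and the allow_string_fallback flag are hypothetical, and the sketch assumes the string form consists of ASCII decimal digits only (the exact IntegerLexicalForm is defined in grammar.md).

```python
MAX_SAFE_INTEGER = 2**53 - 1  # upper bound of the JSON safe integer range

def encode_non_negative_integer(value: int, allow_string_fallback: bool = True):
    """Encode as a JSON number when safe; otherwise fall back to a string or refuse."""
    if isinstance(value, bool) or not isinstance(value, int) or value < 0:
        raise ValueError("expected a non-negative integer")
    if value <= MAX_SAFE_INTEGER:
        return value
    if allow_string_fallback:
        return str(value)  # permitted JSON-string fallback
    raise OverflowError("value exceeds the safe integer range")

def decode_non_negative_integer(wire) -> int:
    """Accept both the JSON-number form and the JSON-string form."""
    if isinstance(wire, bool):
        raise ValueError("booleans are not integers on the wire")
    if isinstance(wire, int) and wire >= 0:
        return wire
    if isinstance(wire, str) and wire.isascii() and wire.isdigit():
        return int(wire)  # assumption: digits-only string form
    raise ValueError(f"not a NonNegativeInteger: {wire!r}")
```

An implementation that prefers to refuse out-of-range values, as the text allows, would call the encoder with allow_string_fallback=False.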

6. Per-Production Encoding (Examples)

Detailed wire shapes for every production are normatively specified in wire-grammar.md. This section gives illustrative JSON examples — one per family of related productions — and documents only those JSON-encoding-specific rules that aren’t expressible in the wire-grammar notation.

6.1 Identifiers

Every artifact identifier is encoded as a plain JSON string carrying the IRI. The kind of identifier is communicated by the surrounding context (the property name on the enclosing object, plus the kind discriminator of the enclosing artifact).

"https://example.org/fields/title"

A FieldId appears in only two grammar positions: as Field.id (the artifact’s own identity) and as EmbeddedField.artifactRef (a reference to the embedded artifact). Both surrounding constructs carry a kind discriminator that conveys the field family. The twenty permitted family-bearing kind values for Field variants are: "TextField", "IntegerNumberField", "RealNumberField", "BooleanField", "DateField", "TimeField", "DateTimeField", "ControlledTermField", "SingleValuedEnumField", "MultiValuedEnumField", "LinkField", "EmailField", "PhoneNumberField", "OrcidField", "RorField", "DoiField", "PubMedIdField", "RridField", "NihGrantIdField", and "AttributeValueField". The corresponding EmbeddedField variants add the prefix Embedded (e.g. "EmbeddedTextField").

The IRI placed at a FieldId position MUST belong to a field of the family declared by the surrounding kind. This is a structural-invariant constraint (per §9.1 category 3); a conforming encoder enforces it before emitting the wire form, and a conforming decoder reports a structural error against the path of the offending position if it is violated.

6.2 Multilingual strings

A MultilingualString is encoded as a non-empty JSON array of untagged LangString objects. Neither MultilingualString nor LangString is a member of any discriminator: kind union (per §4.4), so neither carries a kind discriminator on the wire.

[{ "value": "Hello", "lang": "en" }, { "value": "Bonjour", "lang": "fr" }]

The BCP 47 'und' (undetermined) subtag MAY be used when the natural language is unspecified.

MultilingualString and a single language-tagged TextValue share the {value, lang} shape but are structurally distinct: a TextValue is a single tagged value object (carrying kind: "TextValue"), whereas a MultilingualString is an array of one or more untagged {value, lang} entries. Encoders MUST NOT collapse a single-entry MultilingualString into a bare LangString object, and decoders MUST NOT promote a single LangString into a MultilingualString array.
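
The shape rules for MultilingualString can be checked mechanically. The following Python sketch is illustrative; the function name multilingual_string_errors and the error-message format are hypothetical, and the sketch covers only the shape rules stated in this section (non-empty array, untagged entries, value and lang both present).

```python
def multilingual_string_errors(wire) -> list:
    """Return shape errors for a candidate MultilingualString wire value."""
    if not isinstance(wire, list):
        # A bare LangString object must not stand in for the array.
        return ["expected a JSON array of LangString objects"]
    if not wire:
        return ["array must be non-empty"]
    errors = []
    for i, entry in enumerate(wire):
        if not isinstance(entry, dict):
            errors.append(f"[{i}]: expected an object")
            continue
        if "kind" in entry:
            errors.append(f"[{i}]: LangString must not carry kind")
        for prop in ("value", "lang"):
            if not isinstance(entry.get(prop), str):
                errors.append(f"[{i}].{prop}: expected a string")
    return errors

valid = [{"value": "Hello", "lang": "en"}, {"value": "Bonjour", "lang": "fr"}]
bare = {"value": "Hello", "lang": "en"}
```

Here valid passes, while bare is rejected rather than promoted to a one-element array, matching the MUST NOT rules above.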

6.3 Values

Each Value family is encoded as a tagged object that carries its content directly. The full set of variants is given in wire-grammar.md §3.

{ "kind": "TextValue", "value": "Jane Smith" }
{ "kind": "TextValue", "value": "Jane Smith", "lang": "en" }
{ "kind": "IntegerNumberValue", "value": "42" }
{ "kind": "RealNumberValue", "value": "3.14", "datatype": "decimal" }
{ "kind": "BooleanValue", "value": true }
{ "kind": "YearValue", "value": "2024" }
{ "kind": "FullDateValue", "value": "2024-06-15" }
{ "kind": "TimeValue", "value": "10:30:00" }
{ "kind": "DateTimeValue", "value": "2024-06-15T10:30:00Z" }
{ "kind": "ControlledTermValue", "term": "http://example.org/term/1", "label": [{ "value": "Term 1", "lang": "en" }] }
{ "kind": "EnumValue", "value": "professor" }
{ "kind": "LinkValue", "iri": "https://example.org/page" }
{ "kind": "EmailValue", "value": "jane@example.org" }
{ "kind": "OrcidValue", "iri": "https://orcid.org/0000-0002-1825-0097", "label": [{ "value": "Josiah Carberry", "lang": "en" }] }
{ "kind": "AttributeValue", "name": "https://example.org/p/color", "value": { "kind": "TextValue", "value": "blue" } }

6.4 Metadata and annotations

LifecycleMetadata, SchemaArtifactVersioning, and CatalogMetadata are singleton-only productions (never members of any discriminator: kind union per §4.4), so they encode as untagged JSON objects. The descriptive properties of an artifact (preferredLabel, description, identifier, altLabels) sit directly on CatalogMetadata rather than under a descriptiveMetadata wrapper. On schema artifacts, SchemaArtifactVersioning appears as a separate top-level versioning slot on the artifact rather than nested inside metadata.

{
  "preferredLabel": [{ "value": "Full Name", "lang": "en" }],
  "description": [{ "value": "Full legal name.", "lang": "en" }],
  "lifecycle": {
    "createdOn": "2024-01-01T00:00:00Z",
    "createdBy": "https://orcid.org/0000-0002-1825-0097",
    "modifiedOn": "2024-06-15T12:30:00Z",
    "modifiedBy": "https://orcid.org/0000-0002-1825-0097"
  },
  "annotations": [
    {
      "property": "https://example.org/annotation-properties/notes",
      "body": { "kind": "AnnotationStringValue", "value": "An institutional note." }
    }
  ]
}

AnnotationValue is a kind-discriminated polymorphic union over named annotation-value variants. Two variants are currently defined: AnnotationStringValue (a lexical form with optional language tag) and AnnotationIriValue (an IRI):

{ "kind": "AnnotationStringValue", "value": "An institutional note." }
{ "kind": "AnnotationStringValue", "value": "Une note institutionnelle.", "lang": "fr" }
{ "kind": "AnnotationIriValue", "iri": "https://example.org/related-resource" }

The wire-form property name on Annotation is body (for the grammar’s AnnotationValue component), following the W3C Web Annotation convention.

The AnnotationValue variant family is open to extension: future revisions of this specification MAY introduce additional AnnotationXxxValue variants. Conforming decoders MUST reject documents whose body.kind is not a known variant.
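
Decoding Annotation.body is a dispatch on kind that rejects unknown variants. The sketch below is a minimal Python illustration; the function name decode_annotation_value and the tuple-based result are hypothetical choices, not a binding defined by this specification.

```python
def decode_annotation_value(body: dict) -> tuple:
    """Dispatch on body.kind; reject any variant that is not currently defined."""
    kind = body.get("kind")
    if kind == "AnnotationStringValue":
        # lang is optional, so a missing tag is reported as None
        return ("string", body["value"], body.get("lang"))
    if kind == "AnnotationIriValue":
        return ("iri", body["iri"])
    raise ValueError(f"unknown AnnotationValue kind: {kind!r}")

note = decode_annotation_value({"kind": "AnnotationStringValue", "value": "An institutional note."})
link = decode_annotation_value({"kind": "AnnotationIriValue", "iri": "https://example.org/related-resource"})
```

When a future revision adds a variant, only the dispatch needs a new branch; documents using the new kind are rejected by older decoders, as required.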

6.5 Embedded artifact properties

Cardinality, Property, LabelOverride, and Unit are singleton-only productions (per §4.4) and encode as untagged JSON objects. EmbeddedArtifactKey flattens to a plain JSON string. ValueRequirement and Visibility flatten to JSON enum strings.

{ "min": 0, "max": 5 }
{ "iri": "https://schema.org/name", "label": [{ "value": "name", "lang": "en" }] }
{ "label": [{ "value": "Custom Label", "lang": "en" }], "altLabels": [] }
"required"

6.6 Field specs

Each concrete FieldSpec is encoded as a tagged object whose "kind" matches the spec’s grammar production name. Optional configuration properties are omitted when absent. Every XxxFieldSpec (except AttributeValueFieldSpec) carries an optional defaultValue slot whose type matches the family’s Value; see §6.8 for the per-family table and the precedence rule against an embedding-level defaultValue on the corresponding EmbeddedXxxField.

{ "kind": "TextFieldSpec", "minLength": 1, "maxLength": 200, "renderingHint": "singleLine" }
{ "kind": "IntegerNumberFieldSpec", "minValue": { "kind": "IntegerNumberValue", "value": "0" } }
{ "kind": "DateFieldSpec", "dateValueType": "fullDate", "renderingHint": { "componentOrder": "dayMonthYear" } }
{ "kind": "SingleValuedEnumFieldSpec",
  "permissibleValues": [
    { "value": "yes", "label": [{ "value": "Yes", "lang": "en" }] },
    { "value": "no",  "label": [{ "value": "No",  "lang": "en" }] }
  ],
  "defaultValue": { "kind": "EnumValue", "value": "yes" },
  "renderingHint": "radio"
}
{ "kind": "MultiValuedEnumFieldSpec",
  "permissibleValues": [
    { "value": "active",  "label": [{ "value": "Active",  "lang": "en" }],
      "meanings": ["http://example.org/active-1"] },
    { "value": "retired", "label": [{ "value": "Retired", "lang": "en" }] }
  ],
  "defaultValues": [],
  "renderingHint": "checkbox"
}
{ "kind": "ControlledTermFieldSpec", "sources": [
  { "kind": "OntologySource", "ontology": { "iri": "http://purl.obolibrary.org/obo/ncit.owl",
    "displayHint": { "acronym": "NCIT", "name": [{ "value": "NCI Thesaurus", "lang": "en" }] } } }
] }

A SingleValuedEnumFieldSpec’s defaultValue is a single tagged EnumValue whose value matches one of the permissible values’ tokens; a MultiValuedEnumFieldSpec’s defaultValues is a (possibly empty) array of such tagged EnumValue entries, with no duplicate value entries. An OntologyDisplayHint MUST carry at least one of acronym or name (a constraint enforced by wire-grammar.md).

The flat-string rendering hints (TextRenderingHint, SingleValuedEnumRenderingHint, MultiValuedEnumRenderingHint, BooleanRenderingHint) appear directly as JSON enum strings; the object-shaped rendering hints (NumericRenderingHint, DateRenderingHint, TimeRenderingHint, DateTimeRenderingHint) are JSON objects with optional configuration slots.

6.7 Field artifacts and embedded artifacts

A Field artifact (shown for the text family; the other nineteen families substitute "IntegerNumberField", "RealNumberField", "BooleanField", "DateField", etc. for kind):

{
  "kind": "TextField",
  "id": "<FieldId>",
  "modelVersion": "<SemanticVersion>",
  "metadata": "<CatalogMetadata>",
  "versioning": "<SchemaArtifactVersioning>",
  "fieldSpec": "<FieldSpec>",
  "label": "<MultilingualString>"
}

The modelVersion property is a top-level property of every concrete artifact (Template, TemplateInstance, every XxxField, and every PresentationComponent variant). It is encoded as a JSON string carrying a Semantic Versioning 2.0.0 lexical form and identifies the version of the CEDAR structural model the artifact conforms to. By convention it is placed immediately after id and before metadata; because JSON object property order is not significant (§4.7), this placement is not a conformance requirement.

The kind value MUST match the family of the nested fieldSpec. Conforming encoders MUST ensure that the IRI placed at id belongs to a field of the same family.
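
The family-match rule between a Field’s kind and its nested fieldSpec can be sketched as below. This Python sketch is illustrative: the function name field_family_errors is hypothetical, and it assumes that the spec kind for a family is the Field kind followed by Spec (as in "TextField" and "TextFieldSpec"), which is consistent with the examples in §6.6 but should be confirmed against wire-grammar.md.

```python
FIELD_KINDS = {
    "TextField", "IntegerNumberField", "RealNumberField", "BooleanField",
    "DateField", "TimeField", "DateTimeField", "ControlledTermField",
    "SingleValuedEnumField", "MultiValuedEnumField", "LinkField", "EmailField",
    "PhoneNumberField", "OrcidField", "RorField", "DoiField", "PubMedIdField",
    "RridField", "NihGrantIdField", "AttributeValueField",
}

def field_family_errors(field: dict) -> list:
    """Check that a Field's kind and its fieldSpec's kind name the same family."""
    kind = field.get("kind")
    if kind not in FIELD_KINDS:
        return [f"kind: {kind!r} is not one of the twenty Field kinds"]
    spec = field.get("fieldSpec")
    spec_kind = spec.get("kind") if isinstance(spec, dict) else None
    expected = kind + "Spec"  # assumption: XxxField pairs with XxxFieldSpec
    if spec_kind != expected:
        return [f"fieldSpec.kind: expected {expected!r}, found {spec_kind!r}"]
    return []
```

Checking that the IRI at id belongs to a field of the same family needs access to the referenced artifacts and is therefore outside this purely local check.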

An EmbeddedField (shown for the text family; substitute "EmbeddedIntegerNumberField", "EmbeddedRealNumberField", "EmbeddedBooleanField", "EmbeddedDateField", etc. for the other nineteen families):

{
  "kind": "EmbeddedTextField",
  "key": "<EmbeddedArtifactKey>",
  "artifactRef": "<FieldId>",
  "valueRequirement": "required",
  "cardinality": { "min": 1, "max": 1 },
  "property": { "iri": "https://schema.org/name" }
}

An EmbeddedAttributeValueField MUST NOT carry a defaultValue property.

An EmbeddedTemplate and an EmbeddedPresentationComponent follow the same pattern:

{
  "kind": "EmbeddedTemplate",
  "key": "<EmbeddedArtifactKey>",
  "artifactRef": "<TemplateId>",
  "cardinality": { "min": 0 }
}
{
  "kind": "EmbeddedPresentationComponent",
  "key": "<EmbeddedArtifactKey>",
  "artifactRef": "<PresentationComponentId>",
  "visibility": "visible"
}

6.8 Default values

A default value is a value used to pre-populate a field at instance-creation time when no explicit value has yet been supplied by the user. Defaults exist at two layers:

  • Field-level defaults, on the reusable Field’s FieldSpec (XxxFieldSpec.defaultValue), shared by every Template that embeds the field.
  • Embedding-level defaults, on the EmbeddedXxxField inside a Template (EmbeddedXxxField.defaultValue), specific to that one embedding.

Every concrete field family carries an optional default at both layers, with one exception: AttributeValueField carries no default at either layer (an AttributeValue is a per-instance pairing of a name and a value, and a default is not meaningful).

Defaults are UI/UX initialisation only. A default’s sole role is to seed an instance’s value at creation time. Defaults do not appear in the wire form of TemplateInstance artifacts and do not affect the RDF projection. When an instance is created and the user accepts the default without modification, the resulting FieldValue carries the default value as if the user had typed it in by hand; from the instance’s perspective the default and a user-supplied identical value are indistinguishable. When an instance is created and the user does not supply a value (and the field is not required), the corresponding FieldValue is omitted entirely — the default does not appear by virtue of having existed.

Wire form. Both layers use the same Value-typed wire shape: there is no DefaultValue wrapper. Every Value is a member of the Value polymorphic union, so per the kind rule (wire-grammar.md §1.5) every defaultValue carries a kind discriminator — at both layers, regardless of whether the enclosing context already pins the family. The discriminator is structurally redundant at slots whose enclosing XxxFieldSpec.kind or EmbeddedXxxField.kind already determines the family, but is retained for uniformity with Value’s appearance at the polymorphic positions where the kind genuinely discriminates (e.g. FieldValue.values[*] in instances).

MultiValuedEnumFieldSpec.defaultValues and EmbeddedMultiValuedEnumField.defaultValue are the two slots whose wire form is a JSON array rather than a single object: each carries an array of tagged EnumValue entries.

For the enum families specifically, the structural-invariant constraint that the default reference one of the spec’s permissibleValues applies to the inner value (the Token):

  • SingleValuedEnumFieldSpec.defaultValue?: EnumValue — a tagged EnumValue whose value MUST equal the Token of one of the spec’s permissible-value entries.
  • MultiValuedEnumFieldSpec.defaultValues?: array<EnumValue> — a (possibly empty) JSON array of tagged EnumValue entries; each entry’s value MUST equal the Token of one of the spec’s permissible-value entries, and the array MUST NOT contain duplicate value entries.

The same constraint applies at the corresponding embedding-level slots (EmbeddedSingleValuedEnumField.defaultValue and EmbeddedMultiValuedEnumField.defaultValue).
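
The enum-default constraints can be checked with one helper that serves both the single-valued and the multi-valued case. The sketch below is illustrative Python; the function name enum_default_errors is hypothetical, and a single-valued default is checked by passing it as a one-element list.

```python
def enum_default_errors(permissible_values: list, defaults: list) -> list:
    """Check tagged EnumValue defaults against the spec's permissible tokens."""
    tokens = {entry["value"] for entry in permissible_values}
    errors, seen = [], set()
    for i, default in enumerate(defaults):
        if default.get("kind") != "EnumValue":
            errors.append(f"[{i}]: expected kind EnumValue")
            continue
        token = default.get("value")
        if token not in tokens:
            errors.append(f"[{i}]: {token!r} is not a permissible value")
        if token in seen:
            errors.append(f"[{i}]: duplicate value {token!r}")
        seen.add(token)
    return errors

permissible = [{"value": "active"}, {"value": "retired"}]
ok = enum_default_errors(permissible, [
    {"kind": "EnumValue", "value": "active"},
    {"kind": "EnumValue", "value": "retired"},
])
```

The same helper applies unchanged at the embedding level, provided the permissible values are taken from the referenced field’s spec.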

Examples by family — at every layer (field-level on XxxFieldSpec.defaultValue, embedding-level on EmbeddedXxxField.defaultValue) the wire shape is identical:

// TextValue (field-level on TextFieldSpec, embedding-level on EmbeddedTextField)
"defaultValue": { "kind": "TextValue", "value": "Stanford University" }
"defaultValue": { "kind": "TextValue", "value": "Bonjour", "lang": "fr" }

// IntegerNumberValue
"defaultValue": { "kind": "IntegerNumberValue", "value": "42" }

// RealNumberValue
"defaultValue": { "kind": "RealNumberValue", "value": "3.14", "datatype": "decimal" }

// BooleanValue
"defaultValue": { "kind": "BooleanValue", "value": true }

// DateValue (kind discriminates the arm; the arm MUST be consistent with the spec's dateValueType)
"defaultValue": { "kind": "FullDateValue", "value": "2024-06-15" }
"defaultValue": { "kind": "YearValue", "value": "2024" }

// TimeValue
"defaultValue": { "kind": "TimeValue", "value": "10:30:00" }

// DateTimeValue
"defaultValue": { "kind": "DateTimeValue", "value": "2024-06-15T10:30:00Z" }

// ControlledTermValue
"defaultValue": {
  "kind": "ControlledTermValue",
  "term": "http://purl.obolibrary.org/obo/UBERON_0000955",
  "label": [{ "value": "brain", "lang": "en" }]
}

// EnumValue (single) — both layers use the same shape
"defaultValue": { "kind": "EnumValue", "value": "yes" }

// array<EnumValue> — both layers use the same shape; MultiValuedEnumFieldSpec calls the slot defaultValues
"defaultValues": [
  { "kind": "EnumValue", "value": "active" },
  { "kind": "EnumValue", "value": "retired" }
]

// LinkValue
"defaultValue": { "kind": "LinkValue", "iri": "https://example.org", "label": [{ "value": "Example", "lang": "en" }] }

// EmailValue
"defaultValue": { "kind": "EmailValue", "value": "jane@example.org" }

// PhoneNumberValue
"defaultValue": { "kind": "PhoneNumberValue", "value": "+1-650-555-0123" }

// OrcidValue
"defaultValue": {
  "kind": "OrcidValue",
  "iri": "https://orcid.org/0000-0002-1825-0097",
  "label": [{ "value": "Josiah Carberry", "lang": "en" }]
}

// RorValue / DoiValue / PubMedIdValue / RridValue / NihGrantIdValue — analogous, each tagged with its family's kind

Precedence. When both a field-level default (on the referenced Field’s FieldSpec) and an embedding-level default (on the EmbeddedXxxField) are present for the same field, the embedding-level default wins. When only one is present, that one applies. When neither is present, the field has no default. There is no mechanism for an embedding to unset a field-level default; an embedding wishing to override with a different default supplies its own defaultValue, but cannot say “no default here.” See grammar.md §Defaults for the full table.
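
The precedence rule reduces to a short lookup. This is a minimal Python sketch with a hypothetical function name, effective_default; it takes the wire objects of the referenced field’s FieldSpec and of the EmbeddedXxxField, and accounts for the multi-valued enum spec naming its slot defaultValues.

```python
def effective_default(field_spec: dict, embedded_field: dict):
    """Return the default that applies to one embedding, or None if there is none."""
    if "defaultValue" in embedded_field:
        return embedded_field["defaultValue"]  # embedding-level default wins
    for slot in ("defaultValue", "defaultValues"):
        if slot in field_spec:
            return field_spec[slot]  # fall back to the field-level default
    return None

spec = {"kind": "TextFieldSpec", "defaultValue": {"kind": "TextValue", "value": "Stanford University"}}
embedding = {"kind": "EmbeddedTextField", "key": "affiliation",
             "defaultValue": {"kind": "TextValue", "value": "Example University"}}
```

Since presence of the embedding-level slot is the only override signal, the sketch has no way to express "no default here", mirroring the limitation described above.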

6.9 Templates

{
  "kind": "Template",
  "id": "<TemplateId>",
  "modelVersion": "<SemanticVersion>",
  "metadata": "<CatalogMetadata>",
  "versioning": "<SchemaArtifactVersioning>",
  "title": [{ "value": "Form Title", "lang": "en" }],
  "header": [{ "value": "Template Header Text", "lang": "en" }],
  "members": ["<EmbeddedArtifact>*"]
}

The members array MUST preserve order. The EmbeddedArtifactKey values within members MUST be unique; a conforming encoder MUST verify uniqueness before producing the JSON, and a conforming decoder MUST reject input that violates this constraint.
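
The key-uniqueness check on members is a simple count. The Python sketch below is illustrative; the function name duplicate_member_keys is hypothetical.

```python
from collections import Counter

def duplicate_member_keys(template: dict) -> list:
    """Return every EmbeddedArtifactKey that occurs more than once in members."""
    counts = Counter(member["key"] for member in template.get("members", []))
    return sorted(key for key, n in counts.items() if n > 1)

template = {
    "kind": "Template",
    "members": [
        {"kind": "EmbeddedTextField", "key": "comment"},
        {"kind": "EmbeddedDateField", "key": "observed"},
        {"kind": "EmbeddedTextField", "key": "comment"},
    ],
}
duplicates = duplicate_member_keys(template)
```

An encoder would refuse to emit, and a decoder would reject, whenever the returned list is non-empty.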

6.10 Presentation components

{ "kind": "RichTextComponent", "id": "<PresentationComponentId>", "modelVersion": "<SemanticVersion>", "metadata": "<CatalogMetadata>", "html": "<p>Hello</p>" }
{ "kind": "ImageComponent", "id": "<PresentationComponentId>", "modelVersion": "<SemanticVersion>", "metadata": "<CatalogMetadata>", "image": "https://example.org/image.png" }
{ "kind": "SectionBreakComponent", "id": "<PresentationComponentId>", "modelVersion": "<SemanticVersion>", "metadata": "<CatalogMetadata>" }

6.11 Instances

{
  "kind": "TemplateInstance",
  "id": "<TemplateInstanceId>",
  "modelVersion": "<SemanticVersion>",
  "metadata": "<CatalogMetadata>",
  "templateRef": "<TemplateId>",
  "label": [{ "value": "Optional user-supplied instance label", "lang": "en" }],
  "values": ["<InstanceValue>*"]
}

TemplateInstance.metadata is CatalogMetadata; instances do not carry schema versioning, so there is no top-level versioning slot. The optional label slot, when present, carries a user-supplied name for the instance, shown in catalog listings or detail views.

{ "kind": "FieldValue", "key": "<EmbeddedArtifactKey>", "values": ["<Value>+"] }

FieldValue.values MUST be a non-empty array; absence of a value is represented by omitting the FieldValue entirely.

{ "kind": "NestedTemplateInstance", "key": "<EmbeddedArtifactKey>", "values": ["<InstanceValue>*"] }

The values array of a TemplateInstance MUST satisfy the structural invariants defined in grammar.md §Instances: a given EmbeddedArtifactKey appears as the key of at most one FieldValue; a given EmbeddedArtifactKey does not appear as the key of both a FieldValue and a NestedTemplateInstance; multiple NestedTemplateInstance entries sharing a key are permitted.
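
The three invariants on values can be checked in a single pass. The following Python sketch is illustrative; the function name instance_value_errors and the message format are hypothetical, and the sketch checks only one level (it does not descend into nested instances).

```python
def instance_value_errors(values: list) -> list:
    """Check the key invariants on a TemplateInstance.values array."""
    errors = []
    field_keys, nested_keys = set(), set()
    for i, entry in enumerate(values):
        kind, key = entry.get("kind"), entry.get("key")
        if kind == "FieldValue":
            if key in field_keys:
                errors.append(f"[{i}]: more than one FieldValue for key {key!r}")
            field_keys.add(key)
            if not entry.get("values"):
                errors.append(f"[{i}]: FieldValue.values must be non-empty")
        elif kind == "NestedTemplateInstance":
            nested_keys.add(key)  # repeats of the same key are permitted
    for key in sorted(field_keys & nested_keys, key=repr):
        errors.append(f"key {key!r} used by both FieldValue and NestedTemplateInstance")
    return errors

text = {"kind": "TextValue", "value": "x"}
```

Repeated NestedTemplateInstance entries pass silently because sharing a key is how multiple nested instances of one embedding are expressed.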

7. Round-Tripping

A conforming encode-decode round-trip MUST preserve:

  • Every component value of every abstract construct, including lexical content of literals and IRI strings.
  • The order of every sequence component (* and +).
  • The presence-or-absence of every optional component.

A conforming encode-decode round-trip need not preserve:

  • JSON object property order within a single tagged object.
  • Whitespace between JSON tokens.
  • Implementation-specific properties beginning with _ or $ per §4.7 (these are explicitly outside the conformance contract).

Two conforming JSON documents that differ only in JSON object property order or non-significant whitespace MUST decode to the same abstract construct.
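
A round-trip comparison consistent with these rules can be sketched in Python. The names strip_extensions and wire_equivalent are hypothetical; the sketch assumes that comparing parsed JSON after dropping _ and $ properties is an acceptable stand-in for comparing abstract constructs, which is a simplification (it does not, for example, apply any decoding rules from §9).

```python
import json

def strip_extensions(node):
    """Recursively drop properties whose names start with '_' or '$'."""
    if isinstance(node, dict):
        return {k: strip_extensions(v) for k, v in node.items()
                if not k.startswith(("_", "$"))}
    if isinstance(node, list):
        return [strip_extensions(v) for v in node]  # array order is preserved
    return node

def wire_equivalent(a_text: str, b_text: str) -> bool:
    """Compare two JSON documents, ignoring property order and whitespace."""
    return strip_extensions(json.loads(a_text)) == strip_extensions(json.loads(b_text))

a = '{"kind":"TextValue","value":"x","_hash":"1"}'
b = '{ "value": "x", "kind": "TextValue" }'
```

Python dictionaries compare equal regardless of insertion order while lists compare element by element, which matches the distinction between object property order (not significant) and sequence order (significant).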

8. Examples

This section walks through one fully elaborated example end to end: a realistic Template, a TemplateInstance that conforms to it, a round-trip equality check, and two known-bad inputs that exercise the error model from §9. The goal is to give implementers a concrete fixture they can decode and re-encode against, and to make every cross-section reference (the kind rule, the wrapping principle, the structural-invariant constraints) visible at one position in the wire form.

The JSON in this section is embedded from machine-readable test fixtures under spec/normative-tests/. A binding SHOULD treat that directory as a cross-language acceptance suite: every binding MUST decode every file under valid/, encode the result back to JSON, and verify §7 round-trip equivalence; every binding MUST decode every file under invalid/<case>/input.json and report at least the errors listed in invalid/<case>/expected-errors.json. The test fixtures are the authoritative source — this section embeds them via mdBook {{#include}} so the rendered prose and the test data cannot drift apart.

The example is deliberately compact rather than minimal: every wire shape this spec defines that is reachable from a Template appears at least once. The companion TemplateInstance exercises every value shape that is reachable from a FieldValue. Smaller variations (empty members, no annotations, single-language title) are straightforward subsets of the larger artifact and are not separately illustrated.

8.1 A Template exercising the principal wire shapes

The Template below describes a single patient observation through five embedded fields: a free-text comment, a single-valued enum severity, a date observed, an integer-valued count of repeated occurrences, and a controlled-term diagnosis. It carries a multi-language title (the rendered form heading), a description, a separate top-level versioning slot, a lifecycle, and two annotations on the metadata.

{
  "kind": "Template",
  "id": "https://example.org/templates/patient-observation",
  "modelVersion": "2.0.0",
  "metadata": {
    "description": [
      {
        "value": "A single observation made about a patient.",
        "lang": "en"
      }
    ],
    "altLabels": [
      [
        {
          "value": "Clinical observation",
          "lang": "en"
        }
      ],
      [
        {
          "value": "Patient note",
          "lang": "en"
        }
      ]
    ],
    "lifecycle": {
      "createdOn": "2026-01-15T09:30:00Z",
      "createdBy": "https://example.org/users/alice",
      "modifiedOn": "2026-04-02T16:12:00Z",
      "modifiedBy": "https://example.org/users/bob"
    },
    "annotations": [
      {
        "property": "https://purl.org/dc/terms/license",
        "body": {
          "kind": "AnnotationIriValue",
          "iri": "https://creativecommons.org/licenses/by/4.0/"
        }
      },
      {
        "property": "https://schema.org/keywords",
        "body": {
          "kind": "AnnotationStringValue",
          "value": "patient,observation,clinical",
          "lang": "en"
        }
      }
    ]
  },
  "versioning": {
    "version": "1.2.0",
    "status": "published",
    "previousVersion": "https://example.org/templates/patient-observation/v/1.1.0"
  },
  "title": [
    {
      "value": "Patient observation",
      "lang": "en"
    },
    {
      "value": "Beobachtung des Patienten",
      "lang": "de"
    }
  ],
  "header": [
    {
      "value": "Record one observation per submission.",
      "lang": "en"
    }
  ],
  "members": [
    {
      "kind": "EmbeddedTextField",
      "key": "comment",
      "artifactRef": "https://example.org/fields/comment",
      "valueRequirement": "recommended",
      "cardinality": {
        "min": 0,
        "max": 1
      },
      "visibility": "visible",
      "labelOverride": {
        "label": [
          {
            "value": "Free-text comment",
            "lang": "en"
          }
        ],
        "altLabels": []
      },
      "property": {
        "iri": "https://schema.org/comment"
      }
    },
    {
      "kind": "EmbeddedSingleValuedEnumField",
      "key": "severity",
      "artifactRef": "https://example.org/fields/severity",
      "valueRequirement": "required",
      "visibility": "visible",
      "defaultValue": {
        "kind": "EnumValue",
        "value": "moderate"
      },
      "property": {
        "iri": "https://example.org/ontology/severity"
      }
    },
    {
      "kind": "EmbeddedDateField",
      "key": "observed",
      "artifactRef": "https://example.org/fields/observed",
      "valueRequirement": "required",
      "cardinality": {
        "min": 1,
        "max": 1
      },
      "visibility": "visible",
      "defaultValue": {
        "kind": "FullDateValue",
        "value": "2026-01-01"
      },
      "property": {
        "iri": "https://schema.org/observationDate"
      }
    },
    {
      "kind": "EmbeddedIntegerNumberField",
      "key": "occurrences",
      "artifactRef": "https://example.org/fields/occurrences",
      "valueRequirement": "optional",
      "cardinality": {
        "min": 0,
        "max": 1
      },
      "visibility": "visible",
      "defaultValue": {
        "kind": "IntegerNumberValue",
        "value": "1"
      },
      "property": {
        "iri": "https://example.org/ontology/occurrenceCount"
      }
    },
    {
      "kind": "EmbeddedControlledTermField",
      "key": "diagnosis",
      "artifactRef": "https://example.org/fields/diagnosis",
      "valueRequirement": "required",
      "cardinality": {
        "min": 1
      },
      "visibility": "visible",
      "property": {
        "iri": "https://example.org/ontology/diagnosis"
      }
    }
  ]
}

A few things in the above artifact are worth highlighting because they exercise specific rules:

  • Top-level layout. metadata carries CatalogMetadata (descriptive properties, lifecycle, annotations). versioning is a separate top-level slot, not nested inside metadata. title carries the rendered form heading and is also a separate top-level slot (see §6.9 / wire-grammar §5.1).
  • Multilingual content. title and description are MultilingualString arrays. Each altLabels element on metadata is itself a MultilingualString, so altLabels is an array of arrays. The two entries on title carry distinct language tags (en and de), which exercises the unique-lang-tag invariant (§9.1 category 3).
  • AnnotationValue polymorphism. Annotation.body is a discriminator: kind union with AnnotationStringValue and AnnotationIriValue arms; the wire form carries the discriminator per §1.5 of wire-grammar.md.
  • defaultValue kind discriminators. Every defaultValue on every EmbeddedXxxField carries a kind discriminator per the rule in wire-grammar.md §1.5 — for example { "kind": "EnumValue", "value": "moderate" } on EmbeddedSingleValuedEnumField, { "kind": "IntegerNumberValue", "value": "1" } on EmbeddedIntegerNumberField, and { "kind": "FullDateValue", "value": "2026-01-01" } on EmbeddedDateField. The discriminator is structurally redundant at slots whose enclosing EmbeddedXxxField.kind already fixes the family (everywhere except EmbeddedDateField), but is retained for uniformity with Value’s appearance at polymorphic positions such as FieldValue.values[*] in instances.
  • Identifier IRIs. Every artifactRef is an IRI string that belongs to a field of the family declared by the surrounding kind (§6.1, §9.1 category 3). A conforming encoder verifies this before emitting the wire form; a conforming decoder reports a structural-invariant error if the IRI belongs to a field of a different family.
  • Cardinality ranges. comment admits zero or one (min: 0, max: 1); observed requires exactly one; occurrences is optional with at most one; diagnosis requires at least one with no upper bound (max omitted, meaning unbounded). Cardinality appears at singleton positions only and never carries kind per §1.5.
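
The cardinality ranges above can be evaluated with a short helper. This Python sketch is illustrative; the function name satisfies_cardinality is hypothetical, and it covers only the case where a cardinality object is present (the behaviour when cardinality is omitted is defined elsewhere in the specification).

```python
def satisfies_cardinality(count: int, cardinality: dict) -> bool:
    """Check a value count against { "min": ..., "max"?: ... }."""
    if count < cardinality["min"]:
        return False
    maximum = cardinality.get("max")  # an omitted max means unbounded
    return maximum is None or count <= maximum

comment = {"min": 0, "max": 1}
observed = {"min": 1, "max": 1}
diagnosis = {"min": 1}
```

With these ranges, zero comment values and two diagnosis values are acceptable, while two observed values are not.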

8.2 kind discriminators, two examples

Per the kind rule (§1.5 of wire-grammar.md), every member of a discriminator: kind union carries "kind" on the wire — at every position. Two examples illustrate.

Example 1 — Value at a polymorphic position. In a TemplateInstance, the FieldValue.values slot is a nonEmptyArray<Value>. The decoder uses the array element’s kind to pick the union arm:

{
  "kind": "FieldValue",
  "key": "severity",
  "values": [ { "kind": "EnumValue", "value": "severe" } ]
}

Example 2 — Value at a singleton position. In an EmbeddedSingleValuedEnumField, the defaultValue slot’s type is the single concrete EnumValue production: the enclosing EmbeddedSingleValuedEnumField.kind already determines the family. The kind discriminator is therefore structurally redundant at this slot — but is still emitted, because EnumValue is a member of the Value polymorphic union and the rule is uniform across positions:

{
  "kind": "EmbeddedSingleValuedEnumField",
  "key": "severity",
  "artifactRef": "https://example.org/fields/severity",
  "defaultValue": { "kind": "EnumValue", "value": "moderate" }
}

The same pattern applies at every other singleton-Value slot: EmbeddedTextField.defaultValue carries "kind": "TextValue", EmbeddedIntegerNumberField.defaultValue carries "kind": "IntegerNumberValue", IntegerNumberFieldSpec.minValue carries "kind": "IntegerNumberValue", and so on. The wire-size cost is small (one extra short property per Value object) and the simplification at the spec level is that there is exactly one encoding rule for Value, applicable everywhere.

EmbeddedMultiValuedEnumField.defaultValue is the array case: each element of the array is itself a tagged EnumValue:

"defaultValue": [
  { "kind": "EnumValue", "value": "active" },
  { "kind": "EnumValue", "value": "retired" }
]

By contrast, Cardinality, Annotation, LabelOverride, Property, and the other singleton-only productions enumerated in §1.5 are not members of any discriminator: kind union, so they never carry "kind" regardless of position. Cardinality is always { "min": …, "max"?: … }; never { "kind": "Cardinality", … }.

8.3 A TemplateInstance for the above Template

The instance below conforms to the Template of §8.1: it carries a FieldValue for each required EmbeddedField and for each optional EmbeddedField that has a value, omits the optional comment, and carries two diagnosis terms (diagnosis admits min: 1 with an unbounded max).

{
  "kind": "TemplateInstance",
  "id": "https://example.org/instances/observation-42",
  "modelVersion": "2.0.0",
  "metadata": {
    "preferredLabel": [
      {
        "value": "Observation #42",
        "lang": "en"
      }
    ],
    "lifecycle": {
      "createdOn": "2026-04-15T10:22:00Z",
      "createdBy": "https://example.org/users/alice",
      "modifiedOn": "2026-04-15T10:22:00Z",
      "modifiedBy": "https://example.org/users/alice"
    }
  },
  "templateRef": "https://example.org/templates/patient-observation",
  "values": [
    {
      "kind": "FieldValue",
      "key": "severity",
      "values": [
        {
          "kind": "EnumValue",
          "value": "severe"
        }
      ]
    },
    {
      "kind": "FieldValue",
      "key": "observed",
      "values": [
        {
          "kind": "FullDateValue",
          "value": "2026-04-14"
        }
      ]
    },
    {
      "kind": "FieldValue",
      "key": "occurrences",
      "values": [
        {
          "kind": "IntegerNumberValue",
          "value": "3"
        }
      ]
    },
    {
      "kind": "FieldValue",
      "key": "diagnosis",
      "values": [
        {
          "kind": "ControlledTermValue",
          "term": "https://www.snomed.org/snomed-ct/concept/22298006",
          "label": [
            {
              "value": "Myocardial infarction",
              "lang": "en"
            }
          ]
        },
        {
          "kind": "ControlledTermValue",
          "term": "https://www.snomed.org/snomed-ct/concept/49601007",
          "label": [
            {
              "value": "Disorder of cardiovascular system",
              "lang": "en"
            }
          ]
        }
      ]
    }
  ]
}

Notes:

  • Instance metadata. TemplateInstance.metadata is CatalogMetadata. Instances do not carry schema versioning, so there is no top-level versioning slot — the schema’s version is fixed by templateRef.
  • FieldValue.values is non-empty. Per the abstract grammar’s Value+ constraint, every FieldValue carries at least one value; absence of a value for a key is represented by omitting the FieldValue entry entirely (the comment key here). This is the reason valueRequirement is enforced at instance-validation time rather than wire-shape time: the wire grammar does not require a FieldValue for every EmbeddedField.
  • FieldValue.values[*] carries kind. The values inside FieldValue.values are members of the Value polymorphic union; every entry carries its kind discriminator (per §1.5 of wire-grammar.md). The same kind-bearing shape appears at every other Value slot — EmbeddedXxxField.defaultValue in the template above, TextFieldSpec.defaultValue on a standalone TextField, the IntegerNumberFieldSpec.minValue/maxValue bounds — because the rule is uniform across positions.

8.4 Round-tripping

Decoding the §8.1 Template JSON and re-encoding the resulting in-memory value MUST produce a JSON document that is equal to the input under §7’s equivalence (object property order and whitespace are not significant). A binding’s round-trip test SHOULD therefore:

  1. Parse the input to a JSON tree and to its in-memory model representation.
  2. Re-encode the in-memory representation to JSON.
  3. Compare the two JSON trees property-set-equally (recursive set equality on object members, sequence equality on arrays).

A binding MAY canonicalise property order on encode (e.g. always emit kind first, then required fields in grammar order, then optionals alphabetically); the canonical form is not normative under §7 — only its decode-equivalence to the input is.
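The comparison in step 3 can be sketched in Python as below. This is an illustration only, not part of the specification: the name json_equivalent is hypothetical, and the handling of numbers and booleans is an assumption, since the equivalence itself is defined in §7.

```python
import json


def json_equivalent(a, b):
    """Compare two parsed JSON trees: member order is ignored, array order is significant."""
    # bool is a subclass of int in Python, so handle it first to keep true != 1.
    if isinstance(a, bool) or isinstance(b, bool):
        return isinstance(a, bool) and isinstance(b, bool) and a == b
    if isinstance(a, dict) and isinstance(b, dict):
        return a.keys() == b.keys() and all(json_equivalent(a[k], b[k]) for k in a)
    if isinstance(a, list) and isinstance(b, list):
        return len(a) == len(b) and all(json_equivalent(x, y) for x, y in zip(a, b))
    if isinstance(a, (dict, list)) or isinstance(b, (dict, list)):
        return False
    # Strings, numbers, and null: plain equality (assumption: 1 and 1.0 compare equal).
    return a == b


original = json.loads(
    '{"kind": "FieldValue", "key": "severity",'
    ' "values": [{"kind": "EnumValue", "value": "severe"}]}'
)
reordered = json.loads(
    '{"values": [{"value": "severe", "kind": "EnumValue"}],'
    ' "key": "severity", "kind": "FieldValue"}'
)
same = json_equivalent(original, reordered)
```

Here same is True because only the member order differs; swapping two array elements would make the comparison fail.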

8.5 Known-bad inputs

The two inputs below exercise the §9 error model. Each is presented with the expected reported errors per §9.3 (the four required fields). A conforming decoder operating in collected mode (the default per §9.4) MUST report all the listed errors before raising or returning.

Input 1 — wire-shape error (unknown kind discriminator).

{
  "kind": "TemplateInstance",
  "id": "https://example.org/instances/i1",
  "modelVersion": "2.0.0",
  "metadata": {
    "preferredLabel": [
      {
        "value": "x",
        "lang": "en"
      }
    ],
    "lifecycle": {
      "createdOn": "2026-04-15T10:22:00Z",
      "createdBy": "https://example.org/u",
      "modifiedOn": "2026-04-15T10:22:00Z",
      "modifiedBy": "https://example.org/u"
    }
  },
  "templateRef": "https://example.org/templates/patient-observation",
  "values": [
    {
      "kind": "FieldValue",
      "key": "severity",
      "values": [
        {
          "kind": "MysteryValue",
          "value": "severe"
        }
      ]
    }
  ]
}

Expected error report:

| category | path | production | message |
|---|---|---|---|
| wireShape | /values/0/values/0 | Value | kind: "MysteryValue" is not a recognised Value variant |

The decoder MUST NOT silently substitute a default variant or treat the input as a generic object (§9.5).

Input 2 — structural-invariant error (FieldId family mismatch and duplicate embedded-artifact key). This input has two errors at distinct positions; both must be reported. The same IRI https://example.org/fields/foo is used as artifactRef from two embeddings whose kinds declare different field families — once as a TextField, once as a DateField. A single field identifier cannot belong to two field families, so one of the two references must be wrong; conformance requires the binding to detect and report this without consulting an external registry.

{
  "kind": "Template",
  "id": "https://example.org/templates/x",
  "modelVersion": "2.0.0",
  "metadata": {
    "lifecycle": {
      "createdOn": "2026-01-15T09:30:00Z",
      "createdBy": "https://example.org/u",
      "modifiedOn": "2026-01-15T09:30:00Z",
      "modifiedBy": "https://example.org/u"
    }
  },
  "versioning": {
    "version": "1.0.0",
    "status": "draft"
  },
  "title": [
    {
      "value": "x",
      "lang": "en"
    }
  ],
  "members": [
    {
      "kind": "EmbeddedTextField",
      "key": "duplicate",
      "artifactRef": "https://example.org/fields/foo"
    },
    {
      "kind": "EmbeddedDateField",
      "key": "duplicate",
      "artifactRef": "https://example.org/fields/foo"
    }
  ]
}

Expected error report (collected mode):

| category | path | production | message |
|---|---|---|---|
| structural | /members/1/artifactRef | EmbeddedDateField | artifactRef "https://example.org/fields/foo" is also referenced at /members/0/artifactRef as a TextField; a FieldId cannot belong to two field families |
| structural | /members/1/key | Template | EmbeddedArtifact.key "duplicate" is not unique within the enclosing Template (also at /members/0/key) |

The duplicate-key error is reported against the second occurrence, not the first; the first occurrence’s path is included in the message for traceability. The family-mismatch error is reported against the second occurrence by the same convention.

A binding may also surface additional implementation-specific fields (error code, original JSON value, etc.); the four columns above are the required minimum per §9.3. The expected errors live as a JSON file alongside the input under spec/normative-tests/invalid/02-fieldid-family-mismatch-and-duplicate-key/expected-errors.json, where the messageRegex field gives the regex a binding’s reported message MUST match (literal equality is not required — wording is informational, the regex pins the substantive content). The same convention applies to the §8.5 first input and to all future invalid fixtures.
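A conformance harness might compare a binding's reported errors against such a fixture roughly as follows. The layout of expected-errors.json is not shown in this section, so the sketch assumes a list of objects with category, path, production, and messageRegex members. The function name matches_expected and the use of an unanchored regex search are also assumptions.

```python
import re


def matches_expected(reported, expected):
    """Return True if every expected error is matched by at least one reported error."""

    def hit(exp, rep):
        return (
            rep["category"] == exp["category"]
            and rep["path"] == exp["path"]
            and rep["production"] == exp["production"]
            # Wording is informational; only the regex must match (search is an assumption).
            and re.search(exp["messageRegex"], rep["message"]) is not None
        )

    return all(any(hit(exp, rep) for rep in reported) for exp in expected)


# Hypothetical fixture entry and a binding's reported error for the duplicate key.
expected = [
    {"category": "structural", "path": "/members/1/key", "production": "Template",
     "messageRegex": r"duplicate.*not unique"},
]
reported = [
    {"category": "structural", "path": "/members/1/key", "production": "Template",
     "message": 'EmbeddedArtifact.key "duplicate" is not unique within the enclosing '
                "Template (also at /members/0/key)"},
]
ok = matches_expected(reported, expected)
```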

9. Errors

This section specifies the error model for conforming encoders and decoders: the categories of error each side reports, the common shape of an error, and the policy on fail-fast vs collected reporting. The intent is cross-binding parity: TypeScript, Java, and Python bindings given the same malformed input report the same set of errors at the same wire-form locations, even if they surface those errors through different host-language exception types.

The error model defined here is normative: the binding contract covers not only what is encoded and decoded, but how failures are reported.

9.1 Error categories

Three categories of error are recognised:

  1. Wire-shape error. The JSON does not match the wire production that should appear at the position. Examples:
    • A property whose declared type is string is encoded as a JSON number.
    • A polymorphic union slot carries a kind that is not one of the declared variants.
    • A required property is missing, or a property is present that is not declared by the production at the position (excluding _/$-prefixed extension properties per §4.7).
    • A nonEmptyArray<X> slot carries [].
  2. Lexical error. A wire value is well-formed JSON of the right shape, but its lexical content does not match the production’s lexical category. Examples:
    • A LanguageTag string that is not a valid BCP 47 tag (per RFC 5646).
    • An Iri string that is not a syntactically valid absolute IRI (per RFC 3987).
    • An EmbeddedArtifactKey that does not match ^[A-Za-z][A-Za-z0-9_-]*$.
    • A LexicalForm integer string with a leading zero, leading sign, or non-decimal digit (per grammar.md §Primitive String Types).
    • A SemanticVersion string that does not conform to Semantic Versioning 2.0.0.
    • An Iso8601DateTimeLexicalForm string outside the XSD dateTime extended form.
  3. Structural-invariant error. The shape and lexical content are each individually valid, but a constraint that crosses positions is violated. Examples:
    • Two EmbeddedArtifact.key values within the same Template are equal.
    • The IRI placed at an EmbeddedField.artifactRef belongs to a field of a different family than the enclosing kind declares.
    • Cardinality.min > Cardinality.max.
    • An OntologyDisplayHint carries neither acronym nor name.
    • Two LangString.lang tags within the same MultilingualString are equal under case-folded comparison.
    • Two PermissibleValue.value tokens within the same enum spec are equal.
    • A MultiValuedEnumFieldSpec.defaultValues array contains two EnumValue entries with the same value.
    • A field-level or embedding-level defaultValue Token does not equal any PermissibleValue.value of the spec.
    • A DateFieldSpec.defaultValue arm is inconsistent with the spec’s dateValueType (e.g. dateValueType: "year" paired with a FullDateValue default).
    • A SchemaArtifactVersioning carrying both previousVersion and derivedFrom with the same IRI.

A single malformed input may produce errors in more than one category at distinct positions. An encoder reports the same three categories when given an in-memory value that does not satisfy them.

9.2 Error path

Every error MUST carry a path that locates it within the wire form. The path is a JSON Pointer per RFC 6901 (a slash-prefixed sequence of decoded property names and decimal array indices), relative to the root of the wire document being decoded or encoded. For example:

  • "" — the document root.
  • "/members/3/defaultValue" — the defaultValue property of the fourth element of the root-level members array.
  • "/metadata/annotations/0/body/value" — the value property of the body of the first annotation in the root metadata.

The decoder MUST report the path that names the innermost property or array index where the error was detected, not a parent. An encoder reports the path the property would have occupied in the wire form the encoder is producing.

When a wire-shape error concerns an array that has no element to point at (e.g. a nonEmptyArray<X> violation reported on []), the path names the array property itself, with no trailing index.
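A path can be assembled from the property names and array indices visited during decoding. The helper below is a minimal sketch (the name to_json_pointer is hypothetical) that applies the RFC 6901 escaping rules, so a property name containing "/" or "~" still yields an unambiguous pointer.

```python
def to_json_pointer(segments):
    """Build an RFC 6901 JSON Pointer from property names and array indices."""

    def escape(segment):
        # RFC 6901: "~" must be escaped first (as "~0"), then "/" (as "~1").
        return str(segment).replace("~", "~0").replace("/", "~1")

    return "".join("/" + escape(s) for s in segments)


root = to_json_pointer([])
default_value = to_json_pointer(["members", 3, "defaultValue"])
escaped = to_json_pointer(["a/b", "c~d"])
```

With these inputs, root is the empty string (the document root), default_value is "/members/3/defaultValue", and escaped is "/a~1b/c~0d".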

9.3 Error report shape

The minimum information an error MUST carry is:

| Field | Type | Description |
|---|---|---|
| category | one of "wireShape", "lexical", "structural" | the §9.1 category |
| path | string | a JSON Pointer per §9.2 |
| production | string | the wire grammar production at path (e.g. "Cardinality", "LangString", "EmbeddedTextField") |
| message | string | a human-readable explanation |

Bindings MAY carry additional fields — for example a machine-readable error code, the offending JSON value, or a chain of nested causes — but the four fields above are the lower bound on what every binding MUST surface.

The host-language form is binding-specific:

  • TypeScript. A class extending CedarConstructionError (or a sibling CedarDecodeError / CedarEncodeError if the binding prefers per-direction types). Properties are surfaced as instance fields.
  • Java. A subclass of RuntimeException (e.g. CedarDecodeException, CedarEncodeException). The four required fields appear as record components or accessor methods.
  • Python. A subclass of Exception carrying the four fields as attributes.
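For the Python case, one possible shape is sketched below. The class names CedarError and CedarDecodeError are illustrative choices for this sketch, not names mandated for the Python binding. The wrapping exception carries a list of individual errors, in line with the collected reporting mode of §9.4.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CedarError:
    """One error report with the four required fields."""
    category: str    # "wireShape", "lexical", or "structural"
    path: str        # JSON Pointer into the wire form
    production: str  # wire grammar production at path
    message: str     # human-readable explanation


class CedarDecodeError(Exception):
    """Top-level exception wrapping every error found during one decode."""

    def __init__(self, errors):
        self.errors = list(errors)
        super().__init__(f"{len(self.errors)} error(s) found while decoding")


error = CedarError(
    category="wireShape",
    path="/values/0/values/0",
    production="Value",
    message='kind: "MysteryValue" is not a recognised Value variant',
)
exc = CedarDecodeError([error])
```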

9.4 Fail-fast vs collected reporting

The default reporting mode is collected: a decoder or encoder MUST attempt to validate the entire input and report every error it finds before raising or returning. The thrown error type is therefore a collection of one or more individual errors; bindings whose host language idiomatically raises a single exception SHOULD wrap the collection in a single top-level exception whose message summarises the count and whose fields carry the list.

Bindings MAY additionally expose a fail-fast mode that raises on the first error encountered. Fail-fast mode is a performance and UX convenience for interactive use; the wire-form contract itself is defined in terms of the collected mode.

A decoder operating in collected mode MUST NOT short-circuit on a wire-shape error within an array element: each element is independent and must be checked. It MAY short-circuit if continuing past a wire-shape error would require the decoder to fabricate values (e.g. the property type is a polymorphic union and the kind discriminator is absent or unrecognised; in this case the decoder cannot know which arm’s properties to validate).
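The difference between the two modes can be illustrated with a small collector. Everything named here (ErrorCollector, check_value_kinds, the use of ValueError for fail-fast) is hypothetical, and KNOWN_VALUE_KINDS lists only the Value kinds that appear in this document's examples rather than the full union.

```python
class ErrorCollector:
    """Collects error reports; in fail-fast mode it raises on the first one."""

    def __init__(self, fail_fast=False):
        self.fail_fast = fail_fast
        self.errors = []

    def report(self, category, path, production, message):
        self.errors.append(
            {"category": category, "path": path, "production": production, "message": message}
        )
        if self.fail_fast:
            raise ValueError(f"{path}: {message}")


# Illustrative subset of Value kinds (assumption: the real union is larger).
KNOWN_VALUE_KINDS = {"EnumValue", "FullDateValue", "IntegerNumberValue", "ControlledTermValue"}


def check_value_kinds(values, base_path, collector):
    """Check each array element independently; one bad element does not hide the next."""
    for i, value in enumerate(values):
        kind = value.get("kind")
        if kind not in KNOWN_VALUE_KINDS:
            collector.report("wireShape", f"{base_path}/{i}", "Value",
                             f"kind: {kind!r} is not a recognised Value variant")


values = [{"kind": "MysteryValue"}, {"kind": "EnumValue", "value": "severe"}, {"value": "x"}]
collected = ErrorCollector()
check_value_kinds(values, "/values/0/values", collected)
```

In collected mode both the first and the third element are reported. Constructing the collector with fail_fast=True would stop after the first report, which is a subset of the collected result.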

9.5 Decoder strictness for unknown discriminators

When a discriminator: kind union encounters a kind value that is not one of the declared variants, the decoder MUST report a wire-shape error and MUST NOT silently substitute a default variant or treat the input as a generic object. An unknown kind is a clear breaking-change indicator (per §11) and the decoder is in no position to recover.

When a property whose name is not declared by the production at the position is present, the decoder MUST report a wire-shape error, unless the property name begins with _ or $ (per §4.7), in which case the decoder MUST ignore it.
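The undeclared-property rule can be sketched as a filter over an object's member names. The function name undeclared_properties is hypothetical; the declared set for Cardinality follows the { "min", "max" } shape given in §8.2.

```python
def undeclared_properties(obj, declared):
    """Names in obj that are neither declared nor reserved for extensions ("_" or "$" prefix)."""
    return [
        name for name in obj
        if name not in declared and not name.startswith(("_", "$"))
    ]


# Cardinality never carries "kind", so "kind" is reported; "_note" is an ignored extension.
cardinality = {"min": 1, "max": 3, "_note": "local annotation", "kind": "Cardinality"}
offending = undeclared_properties(cardinality, declared={"min", "max"})
```

Each name left in offending would be reported as a wire-shape error at its own path.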

9.6 NFC normalisation

A decoder receiving a string that is not in Unicode NFC SHOULD normalise it to NFC silently and continue, recording the non-normalisation as a non-fatal warning if the binding’s API supports warnings. A decoder MAY instead raise a wire-shape error; both behaviours are conforming. Encoders MUST emit NFC strings (per §4.5).
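In Python, the normalise-and-continue behaviour can be sketched with the standard unicodedata module (is_normalized requires Python 3.8 or later). The function name normalise_nfc and the returned flag, which a binding could turn into a warning, are illustrative.

```python
import unicodedata


def normalise_nfc(text):
    """Return (nfc_text, was_changed) for a decoded string."""
    if unicodedata.is_normalized("NFC", text):
        return text, False
    return unicodedata.normalize("NFC", text), True


# "e" followed by a combining acute accent is not in NFC; the composed form is.
decomposed = "e\u0301"
normalised, changed = normalise_nfc(decomposed)
```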


10. Reserved Property Names

The property name kind is reserved by this specification at all object-level positions. Implementations MUST NOT reuse this name for non-normative purposes.

The property name prefixes _ and $ are reserved for implementation-specific extensions per §4.7.

All other property names are scoped to their containing tagged object’s production and have no global meaning.

11. Versioning

This document defines version 1.0 of the JSON serialization. The version of the wire format itself is not encoded in conforming JSON documents; it is the responsibility of the surrounding storage or transport layer (file path conventions, MIME parameters, registry metadata, etc.) to communicate which version of this specification a document conforms to.

A future revision of this document MAY add new productions or new tagged-object kinds without a version bump, provided existing conforming documents remain conforming. A revision that changes the encoding of an existing production, removes a production, or changes the meaning of a property MUST bump the version.

12. Open Questions

  • Should this document define an explicit version-discrimination property (e.g. "$schema") at the root of every conforming document, parallel to the JSON Schema convention?
  • Should the wrapping principle in §5 be made into a normative algorithm rather than a checklist of properties?
  • Should the encoding distinguish “absent optional component” from “present optional component with the default value” in productions that carry defaults (e.g. ValueRequirement)? Current rule: omit if absent; encode the default when explicitly present. This may need to be made unambiguous per production.
  • Should NonNegativeInteger use a string encoding even within the safe-integer range, to make the wire format consistent across implementations whose host language has no JSON-Number-like type?

Validation

Overview

Validation in the CEDAR Template Model consists of structural conformance to the abstract grammar and satisfaction of well-formedness conditions that are not expressed directly in grammar productions. The Canonical Validation Algorithm section defines a two-phase procedural algorithm that operationalises all normative rules in this document.

Relationship to the wire-form error model

This document and serialization.md §9 describe two complementary layers of conformance checking:

  • The wire-form error model (serialization.md §9) governs decoder and encoder behaviour at the JSON boundary. It defines three error categories (wireShape, lexical, structural) and a JSON-pointer-based path format for locating each error.
  • This validation algorithm governs post-decode checking on in-memory values. It assumes a successful decode has already produced syntactically well-formed structures and verifies the cross-cutting rules that bind those structures together (key uniqueness, cardinality, instance alignment, field-spec compatibility, and so on).

The two layers overlap in scope: many of the structural-invariant constraints listed in §9.1 are also Phase 1 checks here, because a conforming decoder operating in collected mode applies them at decode time. Implementations MAY perform validation entirely at decode (folding Phase 1 into the decoder) or entirely after decode (running Phase 1 as a separate pass). Either approach is conforming.

When this document refers to a constraint that is also enumerated in serialization.md §9.1, the wire-form error category and path semantics from §9 apply. Reported errors SHOULD use the four-field shape from §9.3 (category, path, production, message).

Well-Formedness Conditions

The conditions below are organised by structural concern. Each subsection corresponds to one of the §9.1 categories — primarily structural-invariant (cross-position constraints that the grammar alone cannot express) but with a few lexical constraints (regex-based well-formedness of pinned primitive types) called out where they are most natural to state.

EmbeddedArtifactKey Uniqueness

Within a single Template, each EmbeddedArtifact MUST have a unique EmbeddedArtifactKey. The uniqueness constraint is local to that template level and does not extend across nested template boundaries. Accordingly, an embedded template MAY contain EmbeddedArtifactKey values that are identical to keys used in its containing template, because each template defines its own local key space.

Each EmbeddedArtifactKey MUST conform to the AsciiIdentifier lexical form (per grammar.md) — the regular expression ^[A-Za-z][A-Za-z0-9_-]*$.
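Both key conditions can be checked in one pass over a Template's members array. The sketch below is illustrative: check_member_keys is a hypothetical name, errors are returned as simple (category, path, message) tuples, and a duplicate is reported against the later occurrence, following the convention used for the known-bad inputs in serialization.md §8.5.

```python
import re

# fullmatch is used instead of ^...$ because Python's "$" also matches before a trailing newline.
KEY_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_-]*")


def check_member_keys(members, base_path="/members"):
    """Return (category, path, message) tuples for malformed or duplicate keys."""
    errors = []
    first_seen = {}
    for i, member in enumerate(members):
        key = member["key"]
        path = f"{base_path}/{i}/key"
        if KEY_PATTERN.fullmatch(key) is None:
            errors.append(("lexical", path, f'key "{key}" is not an AsciiIdentifier'))
        if key in first_seen:
            errors.append(
                ("structural", path, f'key "{key}" is not unique (also at {first_seen[key]})')
            )
        else:
            first_seen[key] = path
    return errors


members = [
    {"kind": "EmbeddedTextField", "key": "duplicate"},
    {"kind": "EmbeddedDateField", "key": "duplicate"},
    {"kind": "EmbeddedTextField", "key": "9lives"},
]
errors = check_member_keys(members)
```

Because the key space is local to each Template, a nested template would be checked with its own call and its own base path.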

Embedding References

Each EmbeddedField MUST reference a Field.

Each EmbeddedTemplate MUST reference a Template.

Each EmbeddedPresentationComponent MUST reference a PresentationComponent.

Cardinality Consistency

If an embedding defines minimum and maximum cardinality, the minimum cardinality MUST NOT exceed the maximum cardinality.

ValueRequirement and Cardinality are orthogonal: ValueRequirement governs whether any values must be supplied at all; Cardinality governs the permitted count if values are supplied.

If an embedding is marked “required”, its minimum cardinality MUST be at least one. For EmbeddedTemplate, this means at least one NestedTemplateInstance keyed to that embedding MUST be present in the TemplateInstance.

If an embedding is marked “recommended”, absence of a value MUST NOT by itself cause conformance failure, though implementations MAY issue warnings or other authoring guidance.

If an embedding is marked “optional”, absence of a value MUST NOT by itself cause conformance failure.

If values are present for a “recommended” or “optional” embedding, their count MUST satisfy the Cardinality constraints of that embedding.

Cardinality Defaults and Multiplicity

When Cardinality is absent from an EmbeddedArtifact, the implied default cardinality is min_cardinality(1) with max_cardinality(1): the embedded artifact MUST appear exactly once.

An EmbeddedField is single-valued if its effective maximum cardinality is max_cardinality(1).

An EmbeddedField is multi-valued if its effective maximum cardinality is greater than one or is UnboundedCardinality.
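These defaults can be sketched as follows. The function names are hypothetical, and the sketch assumes that an unbounded maximum appears on the wire as a Cardinality object with no max member, which this section does not state explicitly.

```python
def effective_cardinality(embedded):
    """Return (min, max) for an embedding; max is None when unbounded."""
    cardinality = embedded.get("cardinality")
    if cardinality is None:
        return 1, 1  # implied default: exactly once
    # Assumption: an absent "max" encodes UnboundedCardinality.
    return cardinality["min"], cardinality.get("max")


def is_multi_valued(embedded):
    _, maximum = effective_cardinality(embedded)
    return maximum is None or maximum > 1


def count_within_cardinality(embedded, count):
    minimum, maximum = effective_cardinality(embedded)
    return count >= minimum and (maximum is None or count <= maximum)


severity = {"kind": "EmbeddedSingleValuedEnumField", "key": "severity"}
diagnosis = {"kind": "EmbeddedControlledTermField", "key": "diagnosis", "cardinality": {"min": 1}}
```

Under these assumptions severity is single-valued with an effective cardinality of (1, 1), while diagnosis is multi-valued and accepts any count of one or more.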

Versioning

Version and ModelVersion MUST conform to the SemanticVersion lexical form (per grammar.md) — Semantic Versioning 2.0.0.

ModelVersion is a top-level component of every concrete Artifact (every Template, every TemplateInstance, every Field, and every PresentationComponent); it is not a component of SchemaArtifactVersioning.

Status MUST be either draft or published.

SchemaArtifactVersioning.previousVersion and SchemaArtifactVersioning.derivedFrom, when both present on the same artifact, MUST NOT carry the same IRI value (per grammar.md §Schema Artifact Versioning).

Instance Alignment

Each FieldValue in a TemplateInstance MUST reference the EmbeddedArtifactKey of an EmbeddedField in the referenced Template.

Each NestedTemplateInstance in a TemplateInstance MUST reference the EmbeddedArtifactKey of an EmbeddedTemplate in the referenced Template.

TemplateInstance MUST NOT contain an InstanceValue for an EmbeddedPresentationComponent.

Field Spec Compatibility

Values in a FieldValue MUST satisfy the FieldSpec and any field-spec-specific properties of the referenced Field.

The contained values MUST follow the FieldSpec-to-Value correspondence defined in grammar.md:

| FieldSpec | Required Value type |
|---|---|
| TextFieldSpec | TextValue |
| IntegerNumberFieldSpec | IntegerNumberValue |
| RealNumberFieldSpec | RealNumberValue |
| BooleanFieldSpec | BooleanValue |
| DateFieldSpec | DateValue (YearValue / YearMonthValue / FullDateValue per dateValueType) |
| TimeFieldSpec | TimeValue |
| DateTimeFieldSpec | DateTimeValue |
| ControlledTermFieldSpec | ControlledTermValue |
| SingleValuedEnumFieldSpec / MultiValuedEnumFieldSpec | EnumValue |
| LinkFieldSpec | LinkValue |
| EmailFieldSpec | EmailValue |
| PhoneNumberFieldSpec | PhoneNumberValue |
| OrcidFieldSpec | OrcidValue |
| RorFieldSpec | RorValue |
| DoiFieldSpec | DoiValue |
| PubMedIdFieldSpec | PubMedIdValue |
| RridFieldSpec | RridValue |
| NihGrantIdFieldSpec | NihGrantIdValue |
| AttributeValueFieldSpec | AttributeValue |

Additional well-formedness conditions apply per family, as described below.

For text values:

  • TextValue MUST carry a lexical form; it MAY carry a language tag
  • TextFieldSpec.defaultValue, if present, MUST be a TextValue
  • if both MinLength and MaxLength are present, MinLength MUST NOT exceed MaxLength
  • if MinLength is present, each TextValue lexical form MUST have length greater than or equal to that minimum
  • if MaxLength is present, each TextValue lexical form MUST have length less than or equal to that maximum
  • if ValidationRegex is present, each TextValue lexical form MUST match that regular expression
  • TextFieldSpec.defaultValue, if present, MUST satisfy any defined MinLength, MaxLength, and ValidationRegex
  • TextValue lexical forms SHOULD be in Unicode Normalization Form C
  • when present, TextValue.lang MUST be non-empty and well-formed according to BCP 47
  • if LangTagRequirement is “langTagRequired”, each TextValue MUST carry a lang slot
  • if LangTagRequirement is “langTagForbidden”, each TextValue MUST NOT carry a lang slot
  • TextFieldSpec.defaultValue, if present, MUST satisfy any defined LangTagRequirement
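The per-value text conditions above can be sketched as a single check. The function and parameter names are hypothetical. Two further assumptions are labelled in the code: length is counted in Unicode code points, and ValidationRegex must match the whole lexical form. BCP 47 well-formedness of the language tag is not checked here.

```python
import re


def text_value_problems(lexical_form, lang=None, *, min_length=None, max_length=None,
                        validation_regex=None, lang_tag_requirement=None):
    """Return a list of human-readable problems for one TextValue."""
    problems = []
    length = len(lexical_form)  # assumption: length counted in Unicode code points
    if min_length is not None and length < min_length:
        problems.append(f"length {length} is below MinLength {min_length}")
    if max_length is not None and length > max_length:
        problems.append(f"length {length} exceeds MaxLength {max_length}")
    # Assumption: the regex must match the entire lexical form.
    if validation_regex is not None and re.fullmatch(validation_regex, lexical_form) is None:
        problems.append("lexical form does not match ValidationRegex")
    if lang_tag_requirement == "langTagRequired" and lang is None:
        problems.append("lang is required but absent")
    if lang_tag_requirement == "langTagForbidden" and lang is not None:
        problems.append("lang is forbidden but present")
    return problems


ok = text_value_problems("abc", min_length=2, max_length=5)
too_short = text_value_problems("a", min_length=2)
```

The same function can be applied to a default value, since a default must satisfy the same bounds, pattern, and language-tag requirement as any other TextValue.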

For integer-number values:

  • IntegerNumberValue MUST carry a base-10 integer lexical form; its datatype is implicitly xsd:integer
  • if both IntegerNumberMinValue and IntegerNumberMaxValue are present on the field spec, IntegerNumberMinValue MUST NOT exceed IntegerNumberMaxValue
  • if IntegerNumberMinValue is present, each IntegerNumberValue MUST be greater than or equal to that minimum
  • if IntegerNumberMaxValue is present, each IntegerNumberValue MUST be less than or equal to that maximum

For real-number values:

  • RealNumberValue MUST carry a real-valued lexical form together with a RealNumberDatatypeKind (one of decimal, float, or double)
  • a RealNumberValue’s datatype MUST equal the datatype declared on the enclosing RealNumberFieldSpec
  • if both RealNumberMinValue and RealNumberMaxValue are present on the field spec, RealNumberMinValue MUST NOT exceed RealNumberMaxValue
  • if RealNumberMinValue is present, each RealNumberValue MUST be greater than or equal to that minimum
  • if RealNumberMaxValue is present, each RealNumberValue MUST be less than or equal to that maximum

For boolean values:

  • BooleanValue MUST carry a boolean payload; its datatype is implicitly xsd:boolean

For date values:

  • DateFieldSpec with dateValueType: “year” MUST use YearValue, whose lexical form MUST match the pattern YYYY (a four-digit Gregorian year)
  • DateFieldSpec with dateValueType: “yearMonth” MUST use YearMonthValue, whose lexical form MUST match the pattern YYYY-MM (with month in the range 01 to 12)
  • DateFieldSpec with dateValueType: “fullDate” MUST use FullDateValue, whose lexical form MUST be a well-formed xsd:date lexical form (YYYY-MM-DD with optional zone offset)
  • DateFieldSpec.defaultValue, if present, MUST carry a DateValue arm consistent with dateValueType; that is, dateValueType: “year” admits only YearValue, dateValueType: “yearMonth” admits only YearMonthValue, dateValueType: “fullDate” admits only FullDateValue. The same constraint applies to EmbeddedDateField.defaultValue.
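The arm-consistency rule amounts to a lookup from dateValueType to the one admitted Value kind. The sketch below uses hypothetical names and assumes that YearValue and YearMonthValue carry their lexical form in a value property, as FullDateValue does in the instance example. Checking the full xsd:date lexical form is left out.

```python
import re

REQUIRED_KIND = {
    "year": "YearValue",
    "yearMonth": "YearMonthValue",
    "fullDate": "FullDateValue",
}
# Lexical patterns for the two simple arms; xsd:date checking is omitted from this sketch.
LEXICAL_PATTERN = {
    "YearValue": r"[0-9]{4}",
    "YearMonthValue": r"[0-9]{4}-(0[1-9]|1[0-2])",
}


def date_value_problems(date_value_type, value):
    """Return problems for one DateValue (or default) against a spec's dateValueType."""
    required = REQUIRED_KIND[date_value_type]
    kind = value.get("kind")
    if kind != required:
        return [f"dateValueType {date_value_type!r} admits only {required}, got {kind!r}"]
    pattern = LEXICAL_PATTERN.get(required)
    if pattern is not None and re.fullmatch(pattern, value.get("value", "")) is None:
        return [f"{value.get('value')!r} is not a valid {required} lexical form"]
    return []


mismatch = date_value_problems("year", {"kind": "FullDateValue", "value": "2026-04-14"})
bad_month = date_value_problems("yearMonth", {"kind": "YearMonthValue", "value": "2026-13"})
```

Both calls return one problem: the first because a FullDateValue is paired with dateValueType "year", the second because 13 is not a valid month.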

For time values:

  • TimeValue MUST carry a well-formed xsd:time lexical form
  • TimeFieldSpec values MUST conform to any stated TimePrecision

For date-time values:

  • DateTimeValue MUST carry a well-formed xsd:dateTime lexical form
  • DateTimeFieldSpec values MUST conform to the stated DateTimeValueType

For enum values:

  • A FieldValue for a SingleValuedEnumFieldSpec MUST contain exactly one EnumValue
  • A FieldValue for a MultiValuedEnumFieldSpec MUST contain one or more EnumValue constructs (subject to the Cardinality of the embedding)
  • Each EnumValue.value (a Token) MUST equal the canonical Token of one of the referenced spec’s PermissibleValue entries
  • The Token strings of an EnumFieldSpec’s PermissibleValue+ MUST be unique within that spec
  • SingleValuedEnumFieldSpec.defaultValue, if present, MUST be an EnumValue whose value equals the Token of one of its PermissibleValue entries
  • MultiValuedEnumFieldSpec.defaultValues, if present, MUST be a (possibly empty) list of EnumValue constructs each whose value equals the Token of one of its PermissibleValue entries; the list MUST NOT contain duplicate value entries

An EnumValue matches a PermissibleValue if and only if the value’s Token string equals the permissible value’s Token string (compared character by character).
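The enum conditions reduce to exact string comparison on Tokens. A minimal sketch, with the hypothetical name enum_problems and Tokens passed as plain strings, is given below; Python's string equality compares code point by code point, so "Active" does not match "active".

```python
def enum_problems(permissible_tokens, value_tokens):
    """Check EnumValue tokens (or multi-valued defaults) against a spec's permissible tokens."""
    problems = []
    if len(set(permissible_tokens)) != len(permissible_tokens):
        problems.append("PermissibleValue tokens are not unique within the spec")
    seen = set()
    for token in value_tokens:
        if token not in permissible_tokens:
            problems.append(f"{token!r} does not equal any PermissibleValue token")
        if token in seen:
            problems.append(f"{token!r} appears more than once")
        seen.add(token)
    return problems


permissible = ["active", "retired"]
fine = enum_problems(permissible, ["active", "retired"])
wrong_case = enum_problems(permissible, ["Active"])
repeated = enum_problems(permissible, ["active", "active"])
```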

For controlled-term values:

  • ControlledTermValue MUST include a term identifier and SHOULD include a human-readable label

For contact values:

  • EmailValue MUST carry a non-empty lexical form
  • PhoneNumberValue MUST carry a non-empty lexical form

For external authority values:

  • OrcidValue MUST include an OrcidIri
  • RorValue MUST include a RorIri
  • DoiValue MUST include a DoiIri
  • PubMedIdValue MUST include a PubMedIri
  • RridValue MUST include an RridIri
  • NihGrantIdValue MUST include a NihGrantIri
  • these values MAY additionally include a human-readable Label

For string-bearing values generally:

  • lexical forms MUST be in Unicode Normalization Form C (per serialization.md §4.5)
  • when present, language tags MUST conform to the Bcp47Tag lexical form (per grammar.md — RFC 5646)

For default values (both layers):

The model carries default values at two layers, and validation rules apply uniformly across the two:

  • A field-level default lives on the reusable Field’s FieldSpec (XxxFieldSpec.defaultValue), shared by every Template that embeds the field. Every concrete XxxFieldSpec except AttributeValueFieldSpec admits an optional default.
  • An embedding-level default lives on the EmbeddedXxxField inside a Template (EmbeddedXxxField.defaultValue), specific to that one embedding.

The well-formedness conditions:

  • A default value, at either layer, MUST be the family-specific Value type as given in grammar.md.
  • A default MUST satisfy every well-formedness condition that a corresponding FieldValue would satisfy for the same FieldSpec (length bounds, numeric bounds, datatype consistency, lexical-form constraints, and so on).
  • Enum defaults at either layer MUST be EnumValue constructs (single for SingleValuedEnumField/Spec, a possibly-empty list for MultiValuedEnumField/Spec) whose value equals the Token of one of the spec’s PermissibleValue entries; the multi-valued list MUST NOT contain duplicate value entries.
  • When both a field-level and an embedding-level default are present for the same field, the embedding-level default takes precedence (see grammar.md).
  • AttributeValueFieldSpec and EmbeddedAttributeValueField carry no defaults at either layer.

For multiplicity:

  • if an EmbeddedField is single-valued, its corresponding FieldValue MUST NOT contain more than one value
  • if an EmbeddedField is multi-valued, the number of values in its FieldValue MUST satisfy the embedding cardinality constraints
  • if an EmbeddedTemplate has multiplicity greater than one, the number of corresponding NestedTemplateInstance constructs MUST satisfy the embedding cardinality constraints

Rendering Hint Compatibility

Any rendering hint used by the model MUST be compatible with the associated FieldSpec:

| Rendering hint | Permitted on |
|---|---|
| TextRenderingHint | TextFieldSpec |
| SingleValuedEnumRenderingHint | SingleValuedEnumFieldSpec |
| MultiValuedEnumRenderingHint | MultiValuedEnumFieldSpec |
| BooleanRenderingHint | BooleanFieldSpec |
| NumericRenderingHint | IntegerNumberFieldSpec, RealNumberFieldSpec |
| DateRenderingHint | DateFieldSpec |
| TimeRenderingHint | TimeFieldSpec |
| DateTimeRenderingHint | DateTimeFieldSpec |

Controlled Term Value Structure

If a value conforms to ControlledTermFieldSpec, the value MUST include a term identifier and SHOULD include a human-readable label.

A ControlledTermFieldSpec.defaultValue or EmbeddedControlledTermField.defaultValue, if present, SHOULD identify a term drawn from one of the declared ControlledTermSource entries of the referenced ControlledTermFieldSpec. Verifying source membership requires resolving the TermIri against an external ontology and is outside the scope of the canonical algorithm; see Out of Scope.

Canonical Validation Algorithm

The canonical validation algorithm consists of two phases that MUST be applied in order. Phase 1 validates the well-formedness of a Template and the artifacts it references. Phase 2 validates that a TemplateInstance conforms to a well-formed Template. Phase 2 MUST NOT be applied unless Phase 1 has passed without error.

Both phases are defined as error-collecting: all violations MUST be reported rather than stopping at the first failure. Implementations MAY additionally offer a fail-fast mode for performance, but the set of errors reported MUST be a subset of those that the collecting mode would report.

The algorithm is expressed as a set of named subroutines. Each subroutine takes typed inputs and produces a (possibly empty) set of errors. Verify denotes a hard constraint: failure produces an error. Warn denotes a SHOULD constraint: failure produces a warning. The notation count(X) denotes the number of elements of kind X, and len(s) denotes the length in characters of string s.

Reporting errors

Every Verify step in the algorithm has an associated error report that a conforming binding MUST surface on failure. Every Warn step has an associated warning report. Each step states its report inline as an On failure: line directly under the step.

Each report uses the four-field shape from serialization.md §9.3:

  • category — one of wireShape, lexical, or structural. Most validation reports are structural (cross-position constraint); a few are lexical (regex / well-formedness of a primitive type).
  • path — a JSON Pointer locating the offending slot in the wire form being validated.
  • production — the wire-grammar production at the path.
  • message — a human-readable explanation. The wording given in this document is recommended; bindings MAY use different text and SHOULD include enough detail to support diagnosis.

Path conventions. Subroutines describe paths relative to their input, using a placeholder for the input and slot accessors after slashes:

  • <input> — the subroutine’s input parameter, e.g. <embedded>, <template>, <fieldSpec>.
  • <input>/slotName — a property slot.
  • <input>/arrayName/<i> — an element of an array (with <i> an index variable).
  • <input>/arrayName/<i>/inner — a nested slot inside the i-th element.

The caller of a subroutine substitutes the actual JSON Pointer of its input for the placeholder. For example, when validate_cardinality_consistency runs against template.members[2], an error reported at <embedded>/cardinality/min becomes /members/2/cardinality/min in the surfaced report.

When a subroutine S₁ calls another subroutine S₂ and S₂ reports an error at path <S₂.input>/foo, the surfaced path is <S₁.input>/<path-to-S₂.input>/foo. Each layer prepends its own input path. For example, validate_default_value calls a family-specific value-validator with the default value as input; an error from the inner validator at <value>/value is surfaced at <embedded>/defaultValue/value.
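
The path-prefixing rule can be sketched as follows. This is a minimal illustration under stated assumptions: the `Report` class and the `rebase` helper are hypothetical names, and the example message is borrowed from the text-value validator.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Report:
    """Four-field report shape; the class name is illustrative."""
    category: str    # "wireShape", "lexical", or "structural"
    path: str        # path relative to the reporting subroutine's input
    production: str
    message: str


def rebase(reports: list[Report], prefix: str) -> list[Report]:
    """Prepend the caller's path to every report from an inner subroutine."""
    return [replace(r, path=prefix + r.path) for r in reports]


# An inner value-validator reports relative to its own input (<value>).
inner = [Report("structural", "/value", "TextValue",
                "value length above TextFieldSpec.maxLength")]

# validate_default_value roots the inner path at <embedded>/defaultValue ...
at_embedded = rebase(inner, "/defaultValue")
# ... and the template-level caller roots it at the embedding's own pointer.
surfaced = rebase(at_embedded, "/members/2")
print(surfaced[0].path)  # /members/2/defaultValue/value
```

Keeping reports relative until the caller rebases them means no subroutine needs to know where its input sits in the enclosing document.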

Warning reports follow the same shape but are emitted through the binding’s warning channel rather than its error channel.

External resolution

Several Verify steps require resolving an artifact-reference IRI to its definition — for example, validate_embedding_reference verifies that embedded.artifactRef “identifies an existing <Family>Field”. Resolution is outside the scope of this specification. A conforming validator is given an external resolver function

resolve(iri: Iri) → Artifact | null

that returns the artifact referenced by an IRI, or null if no such artifact is known. The validator MUST use this resolver to resolve every EmbeddedField.artifactRef, every EmbeddedTemplate.artifactRef, every EmbeddedPresentationComponent.artifactRef, and every TemplateInstance.templateRef.

How the resolver is implemented is a binding concern, not a model concern. Plausible implementations:

  • A registry-backed resolver that looks up artifacts in a local catalogue.
  • A document-local resolver that finds artifacts inlined in the same input document.
  • A network-backed resolver that dereferences HTTP IRIs.

When resolve(iri) returns null, the surfaced error is:

structural at the relevant artifactRef slot, production naming the embedding’s family, message “artifactRef does not resolve to an artifact”.

When resolve(iri) returns an artifact of the wrong family (e.g. a TextField is returned for an EmbeddedDateField.artifactRef), the surfaced error is the family-mismatch error already documented at validate_embedding_reference.

Implementations MAY operate without a resolver, in which case all Verify <…> identifying an existing <Family> steps are SKIPPED and any conformance claim must be qualified accordingly. This is a partial-validation mode appropriate for syntactic linting; full conformance requires a resolver.
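
A registry-backed resolver and the two reference failure modes can be sketched as follows. The dictionary shape for artifacts (`id` and `kind` keys), the `registry_resolver` and `check_reference` helpers, and the example.org IRIs are assumptions made for this sketch; the specification leaves the resolver's implementation to the binding.

```python
from typing import Callable, Optional

# Artifacts are modelled as plain dicts with "id" and "kind" keys; this
# shape is an assumption made for the sketch only.
Artifact = dict
Resolver = Callable[[str], Optional[Artifact]]


def registry_resolver(catalogue: dict[str, Artifact]) -> Resolver:
    """Build a registry-backed resolver over a local catalogue."""
    return lambda iri: catalogue.get(iri)


def check_reference(iri: str, expected_kind: str,
                    resolve: Optional[Resolver]) -> list[str]:
    """Return error messages for one artifactRef; empty when it passes."""
    if resolve is None:
        return []  # partial-validation mode: resolution steps are skipped
    artifact = resolve(iri)
    if artifact is None:
        return ["artifactRef does not resolve to an artifact"]
    if artifact["kind"] != expected_kind:
        return ["artifactRef resolves to an artifact of the wrong family "
                f"(expected {expected_kind}, got {artifact['kind']})"]
    return []


TITLE = "https://example.org/fields/title"  # hypothetical IRI
resolve = registry_resolver({TITLE: {"id": TITLE, "kind": "TextField"}})

print(check_reference(TITLE, "TextField", resolve))
print(check_reference(TITLE, "DateField", resolve))
print(check_reference("https://example.org/fields/missing", "TextField", resolve))
```

Passing `None` as the resolver models the partial-validation mode: the reference checks report nothing, so a clean result in that mode says nothing about whether references resolve.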

Lexical-form precision

Several Verify steps appeal to lexical-form well-formedness for the primitive types pinned in grammar.md §Primitive String Types. For interoperability across implementations, the lexical-form predicates resolve as follows:

Lexical form                           Authoritative grammar
SemanticVersion                        The regular expression at semver.org.
IriString                              The IRI ABNF in RFC 3987 §2.2. The IRI MUST be absolute (carry a scheme). Implementations MAY use a permissive scheme-and-non-whitespace check as a fast pre-filter, but a conforming validator MUST be capable of full RFC 3987 conformance on demand.
Bcp47Tag                               The Language-Tag production of RFC 5646. Implementations MAY validate against the IANA Language Subtag Registry; a syntactic-only check is acceptable as a baseline.
IntegerLexicalForm                     Regex ^-?(0|[1-9][0-9]*)$. No leading +, no leading zeros (other than the literal 0), no whitespace. Magnitude is unbounded.
AsciiIdentifier                        Regex ^[A-Za-z][A-Za-z0-9_-]*$. Length is unbounded.
Iso8601DateTimeLexicalForm             The dateTime lexical form from XML Schema 1.1 Part 2 §3.3.7, extended format.
xsd:date lexical form                  XML Schema 1.1 Part 2 §3.3.9.
xsd:time lexical form                  XML Schema 1.1 Part 2 §3.3.8.
xsd:dateTime lexical form              XML Schema 1.1 Part 2 §3.3.7.
xsd:decimal lexical form               XML Schema 1.1 Part 2 §3.3.3.
xsd:float / xsd:double lexical form    XML Schema 1.1 Part 2 §3.3.4 (float) and §3.3.5 (double). The special values INF, -INF, and NaN are part of the lexical space.

A conforming validator MUST treat the cited grammar as authoritative; a value is well-formed if and only if it matches the cited grammar. This pins the predicate so two independently-implemented validators agree on every input.
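
The two regex-defined predicates can be sketched in Python as follows. The function names are illustrative. The sketch uses `re.fullmatch` rather than the anchored patterns because, in Python, `$` also matches just before a trailing newline, which would admit `"42\n"`.

```python
import re

# Same patterns as in the table, without ^...$ because fullmatch anchors both ends.
INTEGER_LEXICAL_FORM = re.compile(r"-?(0|[1-9][0-9]*)")
ASCII_IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_-]*")


def is_integer_lexical_form(s: str) -> bool:
    """True when s matches IntegerLexicalForm exactly."""
    return INTEGER_LEXICAL_FORM.fullmatch(s) is not None


def is_ascii_identifier(s: str) -> bool:
    """True when s matches AsciiIdentifier exactly."""
    return ASCII_IDENTIFIER.fullmatch(s) is not None


for candidate in ["0", "-17", "+1", "007", "42\n"]:
    print(repr(candidate), is_integer_lexical_form(candidate))
for candidate in ["title_1", "a-b", "1title", ""]:
    print(repr(candidate), is_ascii_identifier(candidate))
```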


Phase 1: Schema Validation

Entry Point

validate_schema(template: Template)

Entry point for schema validation.

  1. Run validate_model_version(template.model_version) and validate_schema_artifact_versioning(template.versioning).
  2. If template.template_rendering_hint is present: run validate_template_rendering_hint(template.template_rendering_hint).
  3. Let fields = the set of Field artifacts referenced by EmbeddedField constructs in template.
  4. For each field in fields: run validate_model_version(field.model_version), validate_schema_artifact_versioning(field.versioning), and validate_field_spec(field.field_spec).
  5. Let pcs = the set of PresentationComponent artifacts referenced by EmbeddedPresentationComponent constructs in template.
  6. For each component in pcs: run validate_model_version(component.model_version). PresentationComponent does not carry SchemaArtifactVersioning, so no versioning validation step applies.
  7. Run validate_embedded_artifact_keys(template).
  8. For each embedded in template.embedded_artifacts:
    1. Run validate_embedding_reference(embedded).
    2. Run validate_cardinality_consistency(embedded).
    3. If embedded is an EmbeddedField: run validate_rendering_hints(embedded).
    4. If embedded.default_value is present: run validate_default_value(embedded.default_value, embedded).
    5. If embedded is an EmbeddedTemplate: run validate_schema(embedded.referenced_template).

Metadata and Key Validation

validate_schema_artifact_versioning(versioning: SchemaArtifactVersioning)

Applies the Versioning rules to the SchemaArtifactVersioning slot carried by each schema artifact (Template, Field). PresentationComponent and TemplateInstance do not carry SchemaArtifactVersioning; this subroutine is not invoked for them.

  1. Let version = versioning.version. Verify version conforms to the SemanticVersion lexical form (Semantic Versioning 2.0.0).
    On failure
    category
    lexical
    path
    <versioning>/version
    production
    SchemaArtifactVersioning
    message
    "version is not a valid SemanticVersion 2.0.0 string"
  2. Let status = versioning.status. Verify status ∈ { draft, published }.
    On failure
    category
    wireShape
    path
    <versioning>/status
    production
    SchemaArtifactVersioning
    message
    "status must be 'draft' or 'published'"
  3. If both versioning.previous_version and versioning.derived_from are present: verify they do not carry the same IRI value.
    On failure
    category
    structural
    path
    <versioning>/derivedFrom
    production
    SchemaArtifactVersioning
    message
    "previousVersion and derivedFrom MUST NOT carry the same IRI"

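The three steps of validate_schema_artifact_versioning can be sketched as follows. The sketch assumes the versioning slot is available as a decoded dictionary using the wire property names that appear in the report paths above (`version`, `status`, `previousVersion`, `derivedFrom`); the pattern is the one published at semver.org, and the example.org IRI is hypothetical.

```python
import re

# The semver.org pattern, applied with fullmatch; re.ASCII keeps \d to 0-9.
SEMVER = re.compile(
    r"(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)"
    r"(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)"
    r"(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?"
    r"(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?",
    re.ASCII,
)


def validate_schema_artifact_versioning(versioning: dict) -> list[dict]:
    errors: list[dict] = []

    def report(category: str, slot: str, message: str) -> None:
        errors.append({"category": category,
                       "path": f"<versioning>/{slot}",
                       "production": "SchemaArtifactVersioning",
                       "message": message})

    version = versioning.get("version")
    if not isinstance(version, str) or SEMVER.fullmatch(version) is None:
        report("lexical", "version",
               "version is not a valid SemanticVersion 2.0.0 string")
    if versioning.get("status") not in ("draft", "published"):
        report("wireShape", "status", "status must be 'draft' or 'published'")
    previous = versioning.get("previousVersion")
    derived = versioning.get("derivedFrom")
    if previous is not None and derived is not None and previous == derived:
        report("structural", "derivedFrom",
               "previousVersion and derivedFrom MUST NOT carry the same IRI")
    return errors


bad = {"version": "1.0", "status": "retired",
       "previousVersion": "https://example.org/t/1",
       "derivedFrom": "https://example.org/t/1"}
for error in validate_schema_artifact_versioning(bad):
    print(error["category"], error["path"])
```

All three checks run regardless of earlier failures, in keeping with the error-collecting definition of the algorithm.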
validate_template_rendering_hint(hint: TemplateRenderingHint)
  1. If hint.help_display_mode is present: verify it is one of “inline”, “tooltip”, “both”, “none”.
    On failure
    category
    wireShape
    path
    <hint>/helpDisplayMode
    production
    HelpDisplayMode
    message
    "unknown HelpDisplayMode value"

validate_model_version(modelVersion: ModelVersion)

Applies the Versioning rules to the artifact-level ModelVersion carried directly by every concrete Artifact.

  1. Verify modelVersion conforms to the SemanticVersion lexical form (Semantic Versioning 2.0.0).
    On failure
    category
    lexical
    path
    <modelVersion>
    production
    naming the enclosing artifact (e.g. TextField, Template)
    message
    "modelVersion is not a valid SemanticVersion 2.0.0 string"

validate_embedded_artifact_keys(template: Template)

Applies the EmbeddedArtifactKey Uniqueness rules.

  1. Let keys = the sequence of EmbeddedArtifactKey values across all EmbeddedArtifact constructs in template.
  2. For each key k in keys: verify k conforms to the AsciiIdentifier lexical form (regex ^[A-Za-z][A-Za-z0-9_-]*$).
    On failure
    category
    lexical
    path
    <template>/members/<i>/key
    production
    naming the embedded artifact at index <i>
    message
    "EmbeddedArtifactKey does not match the AsciiIdentifier pattern"
  3. Verify all values in keys are distinct: for each pair (kᵢ, kⱼ) where i ≠ j as positions but kᵢ = kⱼ as values, report a duplicate-key error. Key uniqueness is scoped to template; the same key may appear in a nested template without conflict.
    On failure
    category
    structural
    path
    <template>/members/<j>/key (the second occurrence)
    production
    Template
    message
    "EmbeddedArtifact.key is not unique within the enclosing Template (also at /members/<i>/key)"

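A sketch of the key checks follows, assuming the template's members are available as a list of dictionaries with `key` and `kind` properties (the `kind` values shown are illustrative). The duplicate report is attached to the second occurrence and points back at the first.

```python
import re

ASCII_IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_-]*")


def validate_embedded_artifact_keys(members: list[dict]) -> list[dict]:
    errors: list[dict] = []
    first_seen: dict[str, int] = {}  # key value -> index of first occurrence
    for j, member in enumerate(members):
        key = member["key"]
        path = f"<template>/members/{j}/key"
        if ASCII_IDENTIFIER.fullmatch(key) is None:
            errors.append({
                "category": "lexical", "path": path,
                "production": member["kind"],
                "message": "EmbeddedArtifactKey does not match the "
                           "AsciiIdentifier pattern"})
        if key in first_seen:
            i = first_seen[key]
            errors.append({
                "category": "structural", "path": path,
                "production": "Template",
                "message": "EmbeddedArtifact.key is not unique within the "
                           f"enclosing Template (also at /members/{i}/key)"})
        else:
            first_seen[key] = j
    return errors


members = [
    {"key": "title", "kind": "EmbeddedTextField"},
    {"key": "9lives", "kind": "EmbeddedTextField"},
    {"key": "title", "kind": "EmbeddedDateField"},
]
for error in validate_embedded_artifact_keys(members):
    print(error["path"], error["message"])
```

Only the members of the one template are scanned, so a nested template that reuses a key is checked separately when its own schema is validated.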
Reference and Cardinality Validation

validate_embedding_reference(embedded: EmbeddedArtifact)

Applies the Embedding References rules.

Each step below resolves embedded.artifactRef via the external resolver resolve(iri) (see External resolution) and verifies the resolved artifact’s family. If the validator was given no resolver, all steps are SKIPPED.

For each step below, two failure modes are possible:

On failure (unresolved)
category
structural
path
<embedded>/artifactRef
production
naming embedded's family
message
"artifactRef does not resolve to an artifact"
On failure (family mismatch)
category
structural
path
<embedded>/artifactRef
production
naming embedded's family
message
"artifactRef resolves to an artifact of the wrong family (expected <Family>, got <ResolvedFamily>)"
  1. If embedded is an EmbeddedTextField: verify embedded.artifactRef is a TextFieldId identifying an existing TextField.
  2. If embedded is an EmbeddedIntegerNumberField: verify embedded.artifactRef is an IntegerNumberFieldId identifying an existing IntegerNumberField.
  3. If embedded is an EmbeddedRealNumberField: verify embedded.artifactRef is a RealNumberFieldId identifying an existing RealNumberField.
  4. If embedded is an EmbeddedBooleanField: verify embedded.artifactRef is a BooleanFieldId identifying an existing BooleanField.
  5. If embedded is an EmbeddedDateField: verify embedded.artifactRef is a DateFieldId identifying an existing DateField.
  6. If embedded is an EmbeddedTimeField: verify embedded.artifactRef is a TimeFieldId identifying an existing TimeField.
  7. If embedded is an EmbeddedDateTimeField: verify embedded.artifactRef is a DateTimeFieldId identifying an existing DateTimeField.
  8. If embedded is an EmbeddedControlledTermField: verify embedded.artifactRef is a ControlledTermFieldId identifying an existing ControlledTermField.
  9. If embedded is an EmbeddedSingleValuedEnumField: verify embedded.artifactRef is a SingleValuedEnumFieldId identifying an existing SingleValuedEnumField.
  10. If embedded is an EmbeddedMultiValuedEnumField: verify embedded.artifactRef is a MultiValuedEnumFieldId identifying an existing MultiValuedEnumField.
  11. If embedded is an EmbeddedLinkField: verify embedded.artifactRef is a LinkFieldId identifying an existing LinkField.
  12. If embedded is an EmbeddedEmailField: verify embedded.artifactRef is an EmailFieldId identifying an existing EmailField.
  13. If embedded is an EmbeddedPhoneNumberField: verify embedded.artifactRef is a PhoneNumberFieldId identifying an existing PhoneNumberField.
  14. If embedded is an EmbeddedOrcidField: verify embedded.artifactRef is an OrcidFieldId identifying an existing OrcidField.
  15. If embedded is an EmbeddedRorField: verify embedded.artifactRef is a RorFieldId identifying an existing RorField.
  16. If embedded is an EmbeddedDoiField: verify embedded.artifactRef is a DoiFieldId identifying an existing DoiField.
  17. If embedded is an EmbeddedPubMedIdField: verify embedded.artifactRef is a PubMedIdFieldId identifying an existing PubMedIdField.
  18. If embedded is an EmbeddedRridField: verify embedded.artifactRef is an RridFieldId identifying an existing RridField.
  19. If embedded is an EmbeddedNihGrantIdField: verify embedded.artifactRef is a NihGrantIdFieldId identifying an existing NihGrantIdField.
  20. If embedded is an EmbeddedAttributeValueField: verify embedded.artifactRef is an AttributeValueFieldId identifying an existing AttributeValueField.
  21. If embedded is an EmbeddedTemplate: verify embedded.artifactRef is a TemplateId identifying an existing Template.
  22. If embedded is an EmbeddedPresentationComponent: verify embedded.artifactRef is a PresentationComponentId identifying an existing PresentationComponent.

validate_cardinality_consistency(embedded: EmbeddedArtifact)

Applies the Cardinality Consistency rules.

  1. Let min = embedded.cardinality.min_cardinality if embedded.cardinality is present, else 1.
  2. Let max = embedded.cardinality.max_cardinality if embedded.cardinality is present, else 1. If max is UnboundedCardinality, let max = ∞.
  3. Verify min ≤ max.
    On failure
    category
    structural
    path
    <embedded>/cardinality
    production
    Cardinality
    message
    "min must not exceed max"
  4. Let req = embedded.value_requirement if present, else “optional”.
  5. If req = “required”: verify min ≥ 1.
    On failure
    category
    structural
    path
    <embedded>/cardinality/min
    production
    Cardinality
    message
    "required embedding must have min cardinality of at least 1"

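The cardinality checks can be sketched as follows. The sketch assumes a decoded dictionary with `cardinality` (`min`, `max`) and `valueRequirement` properties; how UnboundedCardinality appears on the wire is not restated in this section, so the string `"unbounded"` is a hypothetical stand-in.

```python
import math


def validate_cardinality_consistency(embedded: dict) -> list[dict]:
    cardinality = embedded.get("cardinality") or {}
    lo = cardinality.get("min", 1)   # defaults to 1 when absent
    hi = cardinality.get("max", 1)
    if hi == "unbounded":            # hypothetical stand-in for UnboundedCardinality
        hi = math.inf
    errors: list[dict] = []
    if not lo <= hi:
        errors.append({"category": "structural",
                       "path": "<embedded>/cardinality",
                       "production": "Cardinality",
                       "message": "min must not exceed max"})
    req = embedded.get("valueRequirement", "optional")
    if req == "required" and not lo >= 1:
        errors.append({"category": "structural",
                       "path": "<embedded>/cardinality/min",
                       "production": "Cardinality",
                       "message": "required embedding must have min "
                                  "cardinality of at least 1"})
    return errors


print(validate_cardinality_consistency({"cardinality": {"min": 3, "max": 2}}))
print(validate_cardinality_consistency(
    {"valueRequirement": "required",
     "cardinality": {"min": 0, "max": "unbounded"}}))
```

Representing an unbounded maximum as `math.inf` lets the ordinary `<=` comparison cover both the bounded and unbounded cases.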
Field Spec Validation

Applies the Field Spec Compatibility rules. See also Field Specs in the abstract grammar.

validate_field_spec(fieldSpec: FieldSpec)

Dispatch on the kind of fieldSpec:


validate_text_field_spec(fieldSpec: TextFieldSpec)
  1. If both fieldSpec.min_length and fieldSpec.max_length are present: verify fieldSpec.min_length ≤ fieldSpec.max_length.
    On failure
    category
    structural
    path
    <fieldSpec>/minLength
    production
    TextFieldSpec
    message
    "minLength must not exceed maxLength"
  2. If fieldSpec.lang_tag_requirement is present: verify it is one of “langTagRequired”, “langTagOptional”, “langTagForbidden”.
    On failure
    category
    wireShape
    path
    <fieldSpec>/langTagRequirement
    production
    LangTagRequirement
    message
    "unknown LangTagRequirement value"

validate_integer_number_field_spec(fieldSpec: IntegerNumberFieldSpec)
  1. If both fieldSpec.min_value and fieldSpec.max_value are present: verify fieldSpec.min_value ≤ fieldSpec.max_value.
    On failure
    category
    structural
    path
    <fieldSpec>/minValue
    production
    IntegerNumberFieldSpec
    message
    "minValue must not exceed maxValue"

validate_real_number_field_spec(fieldSpec: RealNumberFieldSpec)
  1. If both fieldSpec.min_value and fieldSpec.max_value are present: verify fieldSpec.min_value ≤ fieldSpec.max_value.
    On failure
    category
    structural
    path
    <fieldSpec>/minValue
    production
    RealNumberFieldSpec
    message
    "minValue must not exceed maxValue"

validate_enum_field_spec(fieldSpec: EnumFieldSpec)
  1. Let tokens = the sequence of pv.value values across all pv in fieldSpec.permissible_values.
  2. Verify all values in tokens are distinct: report a duplicate-token error for any pair sharing the same token string.
    On failure
    category
    structural
    path
    <fieldSpec>/permissibleValues/<j>/value (the second occurrence)
    production
    naming fieldSpec's kind
    message
    "PermissibleValue.value is not unique within the enclosing spec (also at /permissibleValues/<i>/value)"
  3. For each pv in fieldSpec.permissible_values: verify pv.value is a non-empty Unicode string.
    On failure
    category
    wireShape
    path
    <fieldSpec>/permissibleValues/<i>/value
    production
    PermissibleValue
    message
    "value must be a non-empty Unicode string"
  4. For each pv in fieldSpec.permissible_values, for each m in pv.meanings: verify m.iri is a syntactically valid IRI.
    On failure
    category
    lexical
    path
    <fieldSpec>/permissibleValues/<i>/meanings/<j>/iri
    production
    Meaning
    message
    "iri is not a valid IRI"
  5. If fieldSpec is a SingleValuedEnumFieldSpec and fieldSpec.default_value is present: verify fieldSpec.default_value is an EnumValue and that fieldSpec.default_value.value ∈ tokens.
    On failure
    category
    structural
    path
    <fieldSpec>/defaultValue/value
    production
    SingleValuedEnumFieldSpec
    message
    "defaultValue does not match any of the spec's permissibleValues"
  6. If fieldSpec is a MultiValuedEnumFieldSpec and fieldSpec.default_values is present:
    1. Verify each entry is an EnumValue and that its value ∈ tokens.
      On failure
      category
      structural
      path
      <fieldSpec>/defaultValues/<i>/value
      production
      MultiValuedEnumFieldSpec
      message
      "defaultValues entry does not match any of the spec's permissibleValues"
    2. Verify all entries’ value strings are distinct.
      On failure
      category
      structural
      path
      <fieldSpec>/defaultValues/<j>/value (the second occurrence)
      production
      MultiValuedEnumFieldSpec
      message
      "defaultValues contains duplicate entries (also at /defaultValues/<i>/value)"

Default Value Validation

validate_default_value(defaultValue: Value, embedded: EmbeddedArtifact)

Let fieldSpec = the FieldSpec of the Field referenced by embedded.

  1. Verify defaultValue is of the family-specific Value type for fieldSpec: TextValue for TextFieldSpec, IntegerNumberValue for IntegerNumberFieldSpec, RealNumberValue for RealNumberFieldSpec, BooleanValue for BooleanFieldSpec, DateValue for DateFieldSpec, TimeValue for TimeFieldSpec, DateTimeValue for DateTimeFieldSpec, ControlledTermValue for ControlledTermFieldSpec, EnumValue for SingleValuedEnumFieldSpec, a sequence of EnumValue for MultiValuedEnumFieldSpec, LinkValue for LinkFieldSpec, EmailValue for EmailFieldSpec, PhoneNumberValue for PhoneNumberFieldSpec, and the corresponding external-authority Value types for the external-authority field specs. AttributeValueFieldSpec does not admit a default value.
    On failure
    category
    wireShape
    path
    <embedded>/defaultValue
    production
    naming embedded's family
    message
    "defaultValue must be a <FamilyValue> (got <kind>)"
  2. Apply the family-specific validate_xxx_value(defaultValue, fieldSpec) procedure to defaultValue. The default value MUST satisfy every constraint that a FieldValue carrying the same Value would satisfy. Errors reported by the inner subroutine are surfaced verbatim, with the path rooted at <embedded>/defaultValue.
  3. If embedded is an EmbeddedSingleValuedEnumField: verify defaultValue is a single EnumValue (not a sequence).
    On failure
    category
    wireShape
    path
    <embedded>/defaultValue
    production
    EmbeddedSingleValuedEnumField
    message
    "defaultValue must be a single EnumValue, not a sequence"
  4. If embedded is an EmbeddedMultiValuedEnumField: verify defaultValue is a (possibly empty) sequence of EnumValue constructs and that no two entries share the same value.
    On failure (shape)
    category
    wireShape
    path
    <embedded>/defaultValue
    production
    EmbeddedMultiValuedEnumField
    message
    "defaultValue must be an array of EnumValue"
    On failure (duplicate)
    category
    structural
    path
    <embedded>/defaultValue/<j>/value (the second occurrence)
    production
    EmbeddedMultiValuedEnumField
    message
    "defaultValue contains duplicate entries (also at /defaultValue/<i>/value)"

Rendering Hint Validation

validate_rendering_hints(embedded: EmbeddedField)

Applies the Rendering Hint Compatibility rules.

Let fieldSpec = the FieldSpec of the Field referenced by embedded.

For each step below, on failure: structural at the rendering-hint slot’s path (e.g. <embedded>/renderingHint), production naming embedded’s family, message “<HintKind> is not compatible with <FieldSpecKind>”.

  1. If embedded carries a TextRenderingHint: verify fieldSpec is TextFieldSpec.
  2. If embedded carries a SingleValuedEnumRenderingHint: verify fieldSpec is SingleValuedEnumFieldSpec.
  3. If embedded carries a MultiValuedEnumRenderingHint: verify fieldSpec is MultiValuedEnumFieldSpec.
  4. If embedded carries a NumericRenderingHint: verify fieldSpec is IntegerNumberFieldSpec or RealNumberFieldSpec.
  5. If embedded carries a DateRenderingHint: verify fieldSpec is DateFieldSpec.
  6. If embedded carries a TimeRenderingHint: verify fieldSpec is TimeFieldSpec.
  7. If embedded carries a DateTimeRenderingHint: verify fieldSpec is DateTimeFieldSpec.
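
The compatibility check lends itself to a table-driven sketch. The mapping below is transcribed from the rendering-hint compatibility table earlier in this document; the function name and the choice to pass kinds as strings are assumptions.

```python
# Rendering hint kind -> FieldSpec kinds it is permitted on.
PERMITTED_ON = {
    "TextRenderingHint": {"TextFieldSpec"},
    "SingleValuedEnumRenderingHint": {"SingleValuedEnumFieldSpec"},
    "MultiValuedEnumRenderingHint": {"MultiValuedEnumFieldSpec"},
    "BooleanRenderingHint": {"BooleanFieldSpec"},
    "NumericRenderingHint": {"IntegerNumberFieldSpec", "RealNumberFieldSpec"},
    "DateRenderingHint": {"DateFieldSpec"},
    "TimeRenderingHint": {"TimeFieldSpec"},
    "DateTimeRenderingHint": {"DateTimeFieldSpec"},
}


def check_rendering_hint(hint_kind: str, field_spec_kind: str) -> list[str]:
    """Return the incompatibility message, or an empty list when compatible."""
    permitted = PERMITTED_ON.get(hint_kind)
    if permitted is not None and field_spec_kind not in permitted:
        return [f"{hint_kind} is not compatible with {field_spec_kind}"]
    return []


print(check_rendering_hint("NumericRenderingHint", "RealNumberFieldSpec"))
print(check_rendering_hint("DateRenderingHint", "TextFieldSpec"))
```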

Phase 2: Instance Validation

Entry Point

validate_instance(instance: TemplateInstance, template: Template)

Entry point for instance validation.

  1. Run validate_model_version(instance.model_version).
  2. Run validate_instance_alignment(instance, template).
  3. Run validate_field_presence_and_cardinality(instance, template).
  4. For each fieldValue in instance.instance_values where fieldValue is a FieldValue:
    1. Let embeddedField = the EmbeddedField in template whose key = fieldValue.key.
    2. Run validate_field_value(fieldValue, embeddedField).
  5. Run validate_nested_template_presence_and_cardinality(instance, template).
  6. For each nestedInstance in instance.instance_values where nestedInstance is a NestedTemplateInstance:
    1. Let embeddedTemplate = the EmbeddedTemplate in template whose key = nestedInstance.key.
    2. Let referencedTemplate = the Template identified by embeddedTemplate.artifactRef.
    3. Run validate_instance(nestedInstance, referencedTemplate).

Structural Alignment

validate_instance_alignment(instance: TemplateInstance, template: Template)

Applies the Instance Alignment rules.

  1. Let field_keys = { embedded.key | embedded ∈ template.embedded_artifacts, embedded is EmbeddedField }.
  2. Let template_keys = { embedded.key | embedded ∈ template.embedded_artifacts, embedded is EmbeddedTemplate }.
  3. Let pc_keys = { embedded.key | embedded ∈ template.embedded_artifacts, embedded is EmbeddedPresentationComponent }.
  4. For each fieldValue in instance.instance_values where fieldValue is a FieldValue: verify fieldValue.key ∈ field_keys.
    On failure
    category
    structural
    path
    <instance>/values/<i>/key
    production
    FieldValue
    message
    "FieldValue.key does not identify any EmbeddedField in the referenced Template"
  5. For each nestedInstance in instance.instance_values where nestedInstance is a NestedTemplateInstance: verify nestedInstance.key ∈ template_keys.
    On failure
    category
    structural
    path
    <instance>/values/<i>/key
    production
    NestedTemplateInstance
    message
    "NestedTemplateInstance.key does not identify any EmbeddedTemplate in the referenced Template"
  6. For each instanceValue in instance.instance_values: verify instanceValue.key ∉ pc_keys.
    On failure
    category
    structural
    path
    <instance>/values/<i>/key
    production
    naming instanceValue's kind
    message
    "InstanceValue keyed to an EmbeddedPresentationComponent — presentation components do not produce instance values"

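The alignment steps can be sketched with three key sets and one pass over the instance values. The dictionary shapes are assumptions: embedded artifacts carry a hypothetical `family` tag (`field`, `template`, or `presentationComponent`) and instance values carry a `kind`. Messages are shortened for the sketch.

```python
def validate_instance_alignment(instance_values: list[dict],
                                embedded_artifacts: list[dict]) -> list[dict]:
    def keys_of(family: str) -> set[str]:
        return {e["key"] for e in embedded_artifacts if e["family"] == family}

    field_keys = keys_of("field")
    template_keys = keys_of("template")
    pc_keys = keys_of("presentationComponent")

    errors: list[dict] = []
    for i, iv in enumerate(instance_values):
        kind, key = iv["kind"], iv["key"]

        def report(message: str) -> None:
            errors.append({"category": "structural",
                           "path": f"<instance>/values/{i}/key",
                           "production": kind, "message": message})

        if kind == "FieldValue" and key not in field_keys:
            report("key does not identify any EmbeddedField")
        if kind == "NestedTemplateInstance" and key not in template_keys:
            report("key does not identify any EmbeddedTemplate")
        if key in pc_keys:
            report("keyed to an EmbeddedPresentationComponent")
    return errors


embedded = [{"key": "title", "family": "field"},
            {"key": "address", "family": "template"},
            {"key": "banner", "family": "presentationComponent"}]
values = [{"kind": "FieldValue", "key": "title"},
          {"kind": "FieldValue", "key": "subtitle"},
          {"kind": "FieldValue", "key": "banner"}]
for error in validate_instance_alignment(values, embedded):
    print(error["path"], error["message"])
```

Read literally, the steps produce two reports for a FieldValue keyed to a presentation component: one because the key is not a field key, and one because it is a presentation-component key. The sketch preserves that behaviour.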
Field Presence and Cardinality

validate_field_presence_and_cardinality(instance: TemplateInstance, template: Template)

Applies the Cardinality Consistency and Cardinality Defaults and Multiplicity rules.

For each embeddedField in template.embedded_artifacts where embeddedField is an EmbeddedField:

  1. Let eff_min = embeddedField.cardinality.min_cardinality if present, else 1.
  2. Let eff_max = embeddedField.cardinality.max_cardinality if present, else 1. If eff_max is UnboundedCardinality, let eff_max = ∞.
  3. Let req = embeddedField.value_requirement if present, else “optional”.
  4. Let fieldValue = the FieldValue in instance with key = embeddedField.key, or absent if none exists.
  5. If req = “required”:
    1. Verify fieldValue ≠ absent.
      On failure
      category
      structural
      path
      <instance>/values
      production
      TemplateInstance
      message
      "required field <embeddedField.key> is missing from the instance"
    2. Verify count(fieldValue.values) ≥ eff_min.
      On failure
      category
      structural
      path
      <fieldValue>/values
      production
      FieldValue
      message
      "value count below required minimum cardinality (got <n>, expected ≥ <eff_min>)"
    3. If eff_max ≠ ∞: verify count(fieldValue.values) ≤ eff_max.
      On failure
      category
      structural
      path
      <fieldValue>/values
      production
      FieldValue
      message
      "value count above maximum cardinality (got <n>, expected ≤ <eff_max>)"
  6. If req = “recommended” or req = “optional”:
    1. If fieldValue ≠ absent:
      1. Verify count(fieldValue.values) ≥ eff_min.
        On failure
        category
        structural
        path
        <fieldValue>/values
        production
        FieldValue
        message
        "value count below minimum cardinality (got <n>, expected ≥ <eff_min>)"
      2. If eff_max ≠ ∞: verify count(fieldValue.values) ≤ eff_max.
        On failure
        category
        structural
        path
        <fieldValue>/values
        production
        FieldValue
        message
        "value count above maximum cardinality (got <n>, expected ≤ <eff_max>)"

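The per-field presence and cardinality logic can be sketched as follows, using the same assumed dictionary shapes as elsewhere (`cardinality.min`, `cardinality.max`, `valueRequirement`, and `"unbounded"` as a hypothetical stand-in for UnboundedCardinality). When a required field is absent, the sketch reports only the missing-field error, on the assumption that the count checks have nothing to count.

```python
import math
from typing import Optional


def check_field_cardinality(embedded_field: dict,
                            field_value: Optional[dict]) -> list[str]:
    """Check one EmbeddedField against its FieldValue (None means absent)."""
    cardinality = embedded_field.get("cardinality") or {}
    eff_min = cardinality.get("min", 1)
    eff_max = cardinality.get("max", 1)
    if eff_max == "unbounded":  # hypothetical stand-in for UnboundedCardinality
        eff_max = math.inf
    req = embedded_field.get("valueRequirement", "optional")

    if field_value is None:
        if req == "required":
            return [f"required field {embedded_field['key']} "
                    "is missing from the instance"]
        return []  # recommended and optional fields may be omitted entirely

    n = len(field_value["values"])
    qualifier = "required minimum" if req == "required" else "minimum"
    errors = []
    if n < eff_min:
        errors.append(f"value count below {qualifier} cardinality "
                      f"(got {n}, expected ≥ {eff_min})")
    if eff_max != math.inf and n > eff_max:
        errors.append("value count above maximum cardinality "
                      f"(got {n}, expected ≤ {eff_max})")
    return errors


tags = {"key": "tags", "cardinality": {"min": 2, "max": 3}}
print(check_field_cardinality(tags, {"values": ["a"]}))
print(check_field_cardinality({"key": "title", "valueRequirement": "required"}, None))
```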
Field Value Validation

validate_field_value(fieldValue: FieldValue, embeddedField: EmbeddedField)
  1. Let fieldSpec = the FieldSpec of the Field referenced by embeddedField.
  2. For each value in fieldValue.values: run validate_value(value, fieldSpec).

validate_value(value: Value, fieldSpec: FieldSpec)

Dispatch on the kind of fieldSpec:


validate_text_value(value: TextValue, fieldSpec: TextFieldSpec)
  1. Let lexicalForm = value.value.
  2. If fieldSpec.min_length is present: verify len(lexicalForm) ≥ fieldSpec.min_length.
    On failure
    category
    structural
    path
    <value>/value
    production
    TextValue
    message
    "value length below TextFieldSpec.minLength"
  3. If fieldSpec.max_length is present: verify len(lexicalForm) ≤ fieldSpec.max_length.
    On failure
    category
    structural
    path
    <value>/value
    production
    TextValue
    message
    "value length above TextFieldSpec.maxLength"
  4. If fieldSpec.validation_regex is present: verify lexicalForm matches fieldSpec.validation_regex.
    On failure
    category
    structural
    path
    <value>/value
    production
    TextValue
    message
    "value does not match TextFieldSpec.validationRegex"
  5. If value.lang is present: verify it conforms to the Bcp47Tag lexical form (RFC 5646).
    On failure
    category
    lexical
    path
    <value>/lang
    production
    TextValue
    message
    "lang is not a well-formed BCP 47 tag"
  6. If fieldSpec.lang_tag_requirement = “langTagRequired”: verify value.lang is present.
    On failure
    category
    structural
    path
    <value>/lang
    production
    TextValue
    message
    "lang tag missing; TextFieldSpec.langTagRequirement is 'langTagRequired'"
  7. If fieldSpec.lang_tag_requirement = “langTagForbidden”: verify value.lang is absent.
    On failure
    category
    structural
    path
    <value>/lang
    production
    TextValue
    message
    "lang tag present; TextFieldSpec.langTagRequirement is 'langTagForbidden'"

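A partial sketch of validate_text_value follows. Several choices here are assumptions rather than statements of the specification: values and specs are decoded dictionaries with camelCase wire names, length is counted in Unicode code points (what Python's `len` returns), the regex must match the whole string, and the BCP 47 well-formedness check of step 5 is omitted.

```python
import re


def validate_text_value(value: dict, field_spec: dict) -> list[dict]:
    errors: list[dict] = []

    def report(slot: str, message: str) -> None:
        errors.append({"category": "structural", "path": f"<value>/{slot}",
                       "production": "TextValue", "message": message})

    text = value["value"]
    if "minLength" in field_spec and len(text) < field_spec["minLength"]:
        report("value", "value length below TextFieldSpec.minLength")
    if "maxLength" in field_spec and len(text) > field_spec["maxLength"]:
        report("value", "value length above TextFieldSpec.maxLength")
    regex = field_spec.get("validationRegex")
    if regex is not None and re.fullmatch(regex, text) is None:
        report("value", "value does not match TextFieldSpec.validationRegex")

    lang = value.get("lang")
    requirement = field_spec.get("langTagRequirement")
    if requirement == "langTagRequired" and lang is None:
        report("lang", "lang tag missing; TextFieldSpec.langTagRequirement "
                       "is 'langTagRequired'")
    if requirement == "langTagForbidden" and lang is not None:
        report("lang", "lang tag present; TextFieldSpec.langTagRequirement "
                       "is 'langTagForbidden'")
    return errors


spec = {"minLength": 3, "maxLength": 8, "validationRegex": "[a-z]+",
        "langTagRequirement": "langTagRequired"}
for error in validate_text_value({"value": "Hi"}, spec):
    print(error["path"], error["message"])
```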
validate_integer_number_value(value: IntegerNumberValue, fieldSpec: IntegerNumberFieldSpec)
  1. Verify value.value conforms to the IntegerLexicalForm (regex ^-?(0|[1-9][0-9]*)$). Let n = its integer value.
    On failure
    category
    lexical
    path
    <value>/value
    production
    IntegerNumberValue
    message
    "value is not a well-formed IntegerLexicalForm"
  2. If fieldSpec.min_value is present: verify n ≥ fieldSpec.min_value.value (compared as integers).
    On failure
    category
    structural
    path
    <value>/value
    production
    IntegerNumberValue
    message
    "value below IntegerNumberFieldSpec.minValue"
  3. If fieldSpec.max_value is present: verify n ≤ fieldSpec.max_value.value (compared as integers).
    On failure
    category
    structural
    path
    <value>/value
    production
    IntegerNumberValue
    message
    "value above IntegerNumberFieldSpec.maxValue"

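The integer checks can be sketched as follows. The sketch assumes the value and both bounds arrive as lexical strings and that the bounds have already been validated; the function signature is illustrative.

```python
import re
from typing import Optional

INTEGER_LEXICAL_FORM = re.compile(r"-?(0|[1-9][0-9]*)")


def validate_integer_number_value(lexical: str,
                                  min_value: Optional[str] = None,
                                  max_value: Optional[str] = None) -> list[str]:
    if INTEGER_LEXICAL_FORM.fullmatch(lexical) is None:
        # Without a well-formed lexical form there is no integer to compare.
        return ["value is not a well-formed IntegerLexicalForm"]
    n = int(lexical)  # Python integers have arbitrary precision
    errors = []
    if min_value is not None and n < int(min_value):
        errors.append("value below IntegerNumberFieldSpec.minValue")
    if max_value is not None and n > int(max_value):
        errors.append("value above IntegerNumberFieldSpec.maxValue")
    return errors


big = "1" + "0" * 30  # well beyond the 64-bit range
print(validate_integer_number_value(big, min_value="0"))
print(validate_integer_number_value("-5", min_value="0", max_value="10"))
print(validate_integer_number_value("007"))
```

Comparing as arbitrary-precision integers, rather than as floats or fixed-width integers, keeps the check consistent with the unbounded magnitude of IntegerLexicalForm. One caveat for Python bindings: recent CPython releases cap `int()` string conversion at 4300 digits by default, which can be adjusted with `sys.set_int_max_str_digits`.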
validate_real_number_value(value: RealNumberValue, fieldSpec: RealNumberFieldSpec)
  1. Verify value.datatype = fieldSpec.datatype (one of decimal, float, double).
    On failure
    category
    structural
    path
    <value>/datatype
    production
    RealNumberValue
    message
    "datatype does not match the enclosing RealNumberFieldSpec.datatype"
  2. Verify value.value is a well-formed lexical form for that datatype. Let n = its numeric value.
    On failure
    category
    lexical
    path
    <value>/value
    production
    RealNumberValue
    message
    "value is not a well-formed lexical form for datatype <datatype>"
  3. If fieldSpec.min_value is present: verify n ≥ fieldSpec.min_value.value (compared as numbers under fieldSpec.datatype’s ordering).
    On failure
    category
    structural
    path
    <value>/value
    production
    RealNumberValue
    message
    "value below RealNumberFieldSpec.minValue"
  4. If fieldSpec.max_value is present: verify n ≤ fieldSpec.max_value.value (compared as numbers under fieldSpec.datatype’s ordering).
    On failure
    category
    structural
    path
    <value>/value
    production
    RealNumberValue
    message
    "value above RealNumberFieldSpec.maxValue"

Comparison semantics for float and double. The numeric value n MAY be NaN, +INF, or -INF (these are part of the xsd:float and xsd:double lexical spaces). The bound comparisons in steps 3 and 4 follow IEEE 754 ordering:

  • If n is NaN, every comparison n ≥ x and n ≤ x is false. A NaN value therefore violates any present minValue or maxValue bound and reports the corresponding bound-failure error.
  • If n is +INF, then n ≥ x is true for every finite x and n ≤ x is true only when x is +INF.
  • If n is -INF, then n ≤ x is true for every finite x and n ≥ x is true only when x is -INF.

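This convention follows the ordinary IEEE 754 comparison predicates, under which NaN is unordered with respect to every value. It is not the IEEE 754 totalOrder predicate, which assigns NaN a position in the ordering. Bindings SHOULD use their host language’s IEEE 754-compliant comparison primitives.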
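
The effect of these rules on the bound checks can be sketched in Python, whose `float` comparisons follow IEEE 754. The helper name is illustrative. The check is written as "fail unless n ≥ min" rather than "fail if n < min"; the two differ exactly for NaN, where every comparison is false.

```python
import math
from typing import Optional


def satisfies_bounds(n: float,
                     min_value: Optional[float] = None,
                     max_value: Optional[float] = None) -> bool:
    """True when n meets every bound that is present."""
    if min_value is not None and not (n >= min_value):
        return False  # also taken when n is NaN
    if max_value is not None and not (n <= max_value):
        return False  # also taken when n is NaN
    return True


nan, inf = math.nan, math.inf
print(satisfies_bounds(nan, min_value=0.0))    # False: NaN violates any bound
print(satisfies_bounds(nan))                   # True: no bound is present
print(satisfies_bounds(inf, max_value=1e308))  # False
print(satisfies_bounds(inf, max_value=inf))    # True
print(satisfies_bounds(-inf, min_value=-inf))  # True
```

A check written as `if n < min_value: fail` would silently accept NaN, because `nan < x` is false for every x.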


validate_boolean_value(value: BooleanValue, fieldSpec: BooleanFieldSpec)
  1. Verify value.value is true or false.
    On failure
    category
    wireShape
    path
    <value>/value
    production
    BooleanValue
    message
    "value must be a JSON boolean"

validate_date_value(value: DateValue, fieldSpec: DateFieldSpec)
  1. If fieldSpec.date_value_type = “year”: verify value is a YearValue whose value matches [0-9]{4}.
    On failure (arm)
    category
    structural
    path
    <value>
    production
    DateValue
    message
    "DateFieldSpec.dateValueType 'year' admits only YearValue"
    On failure (lexical)
    category
    lexical
    path
    <value>/value
    production
    YearValue
    message
    "value does not match YYYY"
  2. If fieldSpec.date_value_type = “yearMonth”: verify value is a YearMonthValue whose value matches [0-9]{4}-(0[1-9]|1[0-2]).
    On failure (arm)
    category
    structural
    path
    <value>
    production
    DateValue
    message
    "DateFieldSpec.dateValueType 'yearMonth' admits only YearMonthValue"
    On failure (lexical)
    category
    lexical
    path
    <value>/value
    production
    YearMonthValue
    message
    "value does not match YYYY-MM"
  3. If fieldSpec.date_value_type = “fullDate”: verify value is a FullDateValue whose value is a well-formed xsd:date lexical form.
    On failure (arm)
    category
    structural
    path
    <value>
    production
    DateValue
    message
    "DateFieldSpec.dateValueType 'fullDate' admits only FullDateValue"
    On failure (lexical)
    category
    lexical
    path
    <value>/value
    production
    FullDateValue
    message
    "value is not a well-formed xsd:date lexical form"

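The three arms above share one shape: first check that the value's variant matches the dateValueType, then check the lexical form. A minimal Python sketch follows, assuming values are decoded JSON objects carrying "kind" and "value". The helper names are hypothetical, and the fullDate check is a deliberate simplification that accepts only four-digit years with an optional timezone, which is narrower than the full xsd:date lexical space.

```python
import re
from datetime import date
from typing import List, Tuple

YEAR = re.compile(r"[0-9]{4}")
YEAR_MONTH = re.compile(r"[0-9]{4}-(0[1-9]|1[0-2])")
# Simplified xsd:date: YYYY-MM-DD plus an optional timezone (assumption of this sketch).
FULL_DATE = re.compile(r"([0-9]{4})-([0-9]{2})-([0-9]{2})(Z|[+-][0-9]{2}:[0-9]{2})?")

EXPECTED_KIND = {"year": "YearValue", "yearMonth": "YearMonthValue", "fullDate": "FullDateValue"}
LEXICAL_MESSAGE = {
    "year": "value does not match YYYY",
    "yearMonth": "value does not match YYYY-MM",
    "fullDate": "value is not a well-formed xsd:date lexical form",
}


def is_full_date(text: str) -> bool:
    """Check the simplified pattern, then reject impossible dates such as 2024-02-30."""
    m = FULL_DATE.fullmatch(text)
    if m is None:
        return False
    try:
        date(int(m.group(1)), int(m.group(2)), int(m.group(3)))
    except ValueError:
        return False
    return True


def validate_date_value(value: dict, date_value_type: str) -> List[Tuple[str, str]]:
    """Return (category, message) pairs; an empty list means the value is valid."""
    expected = EXPECTED_KIND[date_value_type]
    if value.get("kind") != expected:  # arm failure
        return [("structural",
                 f"DateFieldSpec.dateValueType '{date_value_type}' admits only {expected}")]
    text = value.get("value")
    if not isinstance(text, str):
        ok = False
    elif date_value_type == "year":
        ok = YEAR.fullmatch(text) is not None
    elif date_value_type == "yearMonth":
        ok = YEAR_MONTH.fullmatch(text) is not None
    else:
        ok = is_full_date(text)
    return [] if ok else [("lexical", LEXICAL_MESSAGE[date_value_type])]
```

The arm check runs first, so a mismatched variant reports only the structural error and the lexical check is skipped.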
validate_time_value(value: TimeValue, fieldSpec: TimeFieldSpec)

For each step below that verifies a precision constraint, on failure: structural at <value>/value, production TimeValue, message “value does not match the precision required by TimeFieldSpec.timePrecision”. For lexical-form failures (xsd:time ill-formedness), the category is lexical instead.

  1. Let t = value.value.
  2. If fieldSpec.time_precision = “hourMinute”: verify t contains only hour and minute components (form HH:MM; no seconds or fractional seconds present).
  3. If fieldSpec.time_precision = “hourMinuteSecond”: verify t contains hour, minute, and second components (form HH:MM:SS; no fractional seconds present).
  4. If fieldSpec.time_precision = “hourMinuteSecondFraction”: verify t is a well-formed xsd:time lexical form; fractional seconds are permitted.
  5. If fieldSpec.time_precision is absent: verify t is a well-formed xsd:time lexical form.
  6. If fieldSpec.timezone_requirement = “timezoneRequired”: verify t includes a timezone designator.
    On failure
    category
    structural
    path
    <value>/value
    production
    TimeValue
    message
    "timezone designator missing; TimeFieldSpec.timezoneRequirement is 'timezoneRequired'"

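As an illustration of steps 2 to 6, the precision and timezone checks can be approximated with regular expressions. This Python sketch rests on assumptions not stated in the algorithm: hours are limited to 00 to 23 (so the xsd:time form 24:00:00 is rejected), a timezone suffix is accepted after every precision form including HH:MM, and the names matches_precision and has_timezone are hypothetical.

```python
import re
from typing import Optional

_HM = r"(?:[01][0-9]|2[0-3]):[0-5][0-9]"          # HH:MM
_TZ = r"(?:Z|[+-][0-9]{2}:[0-9]{2})?"             # optional timezone designator

_PATTERNS = {
    "hourMinute": re.compile(_HM + _TZ),
    "hourMinuteSecond": re.compile(_HM + r":[0-5][0-9]" + _TZ),
    "hourMinuteSecondFraction": re.compile(_HM + r":[0-5][0-9](?:\.[0-9]+)?" + _TZ),
}


def matches_precision(t: str, precision: Optional[str] = None) -> bool:
    """True when t has the form required by the given timePrecision.

    An absent precision falls back to the most permissive form, which in this
    sketch stands in for "well-formed xsd:time".
    """
    pattern = _PATTERNS[precision or "hourMinuteSecondFraction"]
    return pattern.fullmatch(t) is not None


def has_timezone(t: str) -> bool:
    """True when t ends with 'Z' or a +hh:mm / -hh:mm offset."""
    return re.search(r"(?:Z|[+-][0-9]{2}:[0-9]{2})$", t) is not None
```

A caller would report the structural precision error when matches_precision returns False and the timezone error when timezoneRequirement is "timezoneRequired" and has_timezone returns False.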
validate_datetime_value(value: DateTimeValue, fieldSpec: DateTimeFieldSpec)

For each step below that verifies a precision constraint, on failure: structural at <value>/value, production DateTimeValue, message “value does not match the precision required by DateTimeFieldSpec.dateTimeValueType”. For lexical-form failures (xsd:dateTime ill-formedness), the category is lexical instead.

  1. Let dt = value.value.
  2. If fieldSpec.datetime_value_type = “dateHourMinute”: verify the time component of dt contains only hour and minute (form …THH:MM; no seconds present).
  3. If fieldSpec.datetime_value_type = “dateHourMinuteSecond”: verify the time component contains hour, minute, and second (form …THH:MM:SS; no fractional seconds present).
  4. If fieldSpec.datetime_value_type = “dateHourMinuteSecondFraction”: verify dt is a well-formed xsd:dateTime lexical form; fractional seconds are permitted.
  5. If fieldSpec.timezone_requirement = “timezoneRequired”: verify dt includes a timezone designator.
    On failure
    category
    structural
    path
    <value>/value
    production
    DateTimeValue
    message
    "timezone designator missing; DateTimeFieldSpec.timezoneRequirement is 'timezoneRequired'"

validate_controlled_term_value(value: ControlledTermValue, fieldSpec: ControlledTermFieldSpec)
  1. Verify value.term_iri is present.
    On failure
    category
    wireShape
    path
    <value>/term
    production
    ControlledTermValue
    message
    "term is required"
  2. Warn if value.label is absent.
    On warning
    category
    structural
    path
    <value>/label
    production
    ControlledTermValue
    message
    "label SHOULD be present so consumers without ontology access can render the term"

Note: validation of value.term_iri against fieldSpec.controlled_term_sources requires an external ontology resolver and is outside the scope of this algorithm; see Out of Scope.


validate_enum_value(value: EnumValue, fieldSpec: EnumFieldSpec)
  1. Verify there exists a pv in fieldSpec.permissible_values such that value.value = pv.value (string equality, character by character).
    On failure
    category
    structural
    path
    <value>/value
    production
    EnumValue
    message
    "value does not match any of the spec's permissibleValues tokens"

validate_link_value(value: LinkValue)
  1. Verify value.iri is present and is a well-formed IRI.
    On failure (missing)
    category
    wireShape
    path
    <value>/iri
    production
    LinkValue
    message
    "iri is required"
    On failure (malformed)
    category
    lexical
    path
    <value>/iri
    production
    LinkValue
    message
    "iri is not a valid IRI"

validate_contact_value(value: ContactValue)
  1. If value is an EmailValue: verify value.value is a non-empty lexical form.
    On failure
    category
    wireShape
    path
    <value>/value
    production
    EmailValue
    message
    "value must be a non-empty string"
  2. If value is a PhoneNumberValue: verify value.value is a non-empty lexical form.
    On failure
    category
    wireShape
    path
    <value>/value
    production
    PhoneNumberValue
    message
    "value must be a non-empty string"

validate_external_authority_value(value: ExternalAuthorityValue, fieldSpec: ExternalAuthorityFieldSpec)

Each external-authority Value carries a typed IRI specialised for its authority. The lexical patterns below are recommended (suitable for syntactic conformance checking) but are not structurally normative beyond Iri well-formedness; binding-level validators MAY apply stricter checks.

| Field spec | Required IRI | Recommended pattern |
| --- | --- | --- |
| OrcidFieldSpec | OrcidIri | https://orcid\.org/\d{4}-\d{4}-\d{4}-\d{3}[0-9X] |
| RorFieldSpec | RorIri | https://ror\.org/0[a-hj-km-np-tv-z0-9]{6}[0-9]{2} |
| DoiFieldSpec | DoiIri | https://doi\.org/10\.\d{4,9}/.+ |
| PubMedIdFieldSpec | PubMedIri | https://pubmed\.ncbi\.nlm\.nih\.gov/\d+ |
| RridFieldSpec | RridIri | https://identifiers\.org/RRID:[A-Z]+_\d+ |
| NihGrantIdFieldSpec | NihGrantIri | (see Out of Scope) |

In every case the procedure is: verify value is the corresponding XxxValue and that its iri slot is present and is a well-formed Iri per grammar.md §Primitive String Types. Implementations MAY additionally check the recommended pattern.

On failure (missing iri)
category
wireShape
path
<value>/iri
production
naming value's family
message
"iri is required"
On failure (not a well-formed Iri)
category
lexical
path
<value>/iri
production
naming value's family
message
"iri is not a valid Iri"
On failure (recommended pattern, when implementations check)
category
lexical
path
<value>/iri
production
naming value's family
message
"iri does not match the recommended pattern for <Authority>"

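The optional recommended-pattern check can be sketched as a lookup from value kind to a compiled pattern. In this Python illustration the patterns are copied from the table above; the function name matches_recommended_pattern is hypothetical, the check is applied as a full match (an assumption), and kinds without a pattern, such as NihGrantIdValue, are treated as always passing.

```python
import re

# Patterns from the "Recommended pattern" column; re.ASCII keeps \d to [0-9].
RECOMMENDED_PATTERNS = {
    "OrcidValue": re.compile(r"https://orcid\.org/\d{4}-\d{4}-\d{4}-\d{3}[0-9X]", re.ASCII),
    "RorValue": re.compile(r"https://ror\.org/0[a-hj-km-np-tv-z0-9]{6}[0-9]{2}", re.ASCII),
    "DoiValue": re.compile(r"https://doi\.org/10\.\d{4,9}/.+", re.ASCII),
    "PubMedIdValue": re.compile(r"https://pubmed\.ncbi\.nlm\.nih\.gov/\d+", re.ASCII),
    "RridValue": re.compile(r"https://identifiers\.org/RRID:[A-Z]+_\d+", re.ASCII),
}


def matches_recommended_pattern(kind: str, iri: str) -> bool:
    """True when iri matches the recommended pattern for kind, or no pattern is defined."""
    pattern = RECOMMENDED_PATTERNS.get(kind)
    if pattern is None:
        return True  # e.g. NihGrantIdValue: pattern currently unspecified
    return pattern.fullmatch(iri) is not None
```

Because the patterns are recommendations, a False result here maps to the third error report above (lexical, "iri does not match the recommended pattern") only in implementations that choose to check it.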
validate_attribute_value(value: AttributeValue)
  1. Verify value.name is present and contains a non-empty string.
    On failure
    category
    wireShape
    path
    <value>/name
    production
    AttributeValue
    message
    "name must be a non-empty string"
  2. Verify value.value is present and is a well-formed Value.
    On failure
    category
    wireShape
    path
    <value>/value
    production
    AttributeValue
    message
    "value is required and must be a Value"
  3. If value.value is an AttributeValue: run validate_attribute_value(value.value).
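The recursion in step 3 can be sketched as follows. This is a minimal Python illustration over decoded JSON objects: the "well-formed Value" check of step 2 is approximated by "is an object carrying a kind property" (an assumption; a real validator would dispatch to the per-variant procedures), and errors are returned as hypothetical (category, path, message) tuples.

```python
from typing import List, Tuple

Error = Tuple[str, str, str]  # (category, path, message)


def validate_attribute_value(value: dict, path: str = "<value>") -> List[Error]:
    """Validate an AttributeValue and any AttributeValue nested in its value slot."""
    errors: List[Error] = []

    # Step 1: name must be a non-empty string.
    name = value.get("name")
    if not isinstance(name, str) or name == "":
        errors.append(("wireShape", f"{path}/name", "name must be a non-empty string"))

    # Step 2 (approximated): value must be present and look like a Value object.
    inner = value.get("value")
    if not isinstance(inner, dict) or "kind" not in inner:
        errors.append(("wireShape", f"{path}/value", "value is required and must be a Value"))
    # Step 3: recurse when the nested value is itself an AttributeValue.
    elif inner["kind"] == "AttributeValue":
        errors.extend(validate_attribute_value(inner, f"{path}/value"))

    return errors
```

Extending the path on each recursive call keeps error locations unambiguous when attribute values are nested several levels deep.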

Nested Template Validation

validate_nested_template_presence_and_cardinality(instance: TemplateInstance, template: Template)

Applies the Cardinality Consistency and Cardinality Defaults and Multiplicity rules.

For each embeddedTemplate in template.embedded_artifacts where embeddedTemplate is an EmbeddedTemplate:

  1. Let eff_min = embeddedTemplate.cardinality.min_cardinality if present, else 1.
  2. Let eff_max = embeddedTemplate.cardinality.max_cardinality if present, else 1. If eff_max is UnboundedCardinality, let eff_max = ∞.
  3. Let req = embeddedTemplate.value_requirement if present, else “optional”.
  4. Let n = count({ nestedInstance | nestedInstance ∈ instance.instance_values, nestedInstance is NestedTemplateInstance, nestedInstance.key = embeddedTemplate.key }).
  5. If req = “required”:
    1. Verify n ≥ eff_min.
      On failure
      category
      structural
      path
      <instance>/values
      production
      TemplateInstance
      message
      "required NestedTemplateInstance count below minimum (got <n>, expected ≥ <eff_min>) for key '<embeddedTemplate.key>'"
    2. If eff_max ≠ ∞: verify n ≤ eff_max.
      On failure
      category
      structural
      path
      <instance>/values
      production
      TemplateInstance
      message
      "NestedTemplateInstance count above maximum (got <n>, expected ≤ <eff_max>) for key '<embeddedTemplate.key>'"
  6. If req = “recommended” or req = “optional”:
    1. If n > 0:
      1. Verify n ≥ eff_min.
        On failure
        category
        structural
        path
        <instance>/values
        production
        TemplateInstance
        message
        "NestedTemplateInstance count below minimum (got <n>, expected ≥ <eff_min>) for key '<embeddedTemplate.key>'"
      2. If eff_max ≠ ∞: verify n ≤ eff_max.
        On failure
        category
        structural
        path
        <instance>/values
        production
        TemplateInstance
        message
        "NestedTemplateInstance count above maximum (got <n>, expected ≤ <eff_max>) for key '<embeddedTemplate.key>'"

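The counting logic above reduces to a small function once n has been computed. The Python sketch below is illustrative only: the name check_nested_count is hypothetical, math.inf stands in for UnboundedCardinality, and the short strings "below-minimum" and "above-maximum" replace the full error reports for brevity.

```python
import math
from typing import List, Optional


def check_nested_count(n: int,
                       min_cardinality: Optional[int] = None,
                       max_cardinality: Optional[float] = None,
                       value_requirement: Optional[str] = None) -> List[str]:
    """Apply the effective-cardinality rules to a NestedTemplateInstance count n."""
    eff_min = 1 if min_cardinality is None else min_cardinality
    eff_max = 1 if max_cardinality is None else max_cardinality  # math.inf = unbounded
    req = value_requirement or "optional"

    # "recommended" and "optional" embeddings are only checked when instances exist.
    if req != "required" and n == 0:
        return []

    errors = []
    if n < eff_min:
        errors.append("below-minimum")
    if eff_max != math.inf and n > eff_max:
        errors.append("above-maximum")
    return errors
```

The early return captures the difference between the two branches: a required embedding with no instances fails the minimum check, while an optional or recommended one is simply absent.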
Out of Scope

The following checks are outside the scope of the canonical algorithm and are not required for conformance:

  • ControlledTermSource membership — verifying that a ControlledTermValue’s TermIri is drawn from a declared ontology, branch, class set, or value set requires an external ontology resolver and is not defined here.
  • NIH Grant ID pattern — the lexical pattern for NihGrantIri is currently unspecified.
  • AttributeValueField name validation — attribute names are not fixed at schema definition time and cannot be structurally validated against the schema.

Open Questions

  • Which validation rules should be mandatory in the core specification versus deferred to profile-specific extensions?

RDF Projection

This section defines a projection from CEDAR Value instances to RDF. The projection is a derived view: CEDAR’s abstract grammar and wire form are CEDAR-native, and RDF is one consumer of the data, not the substrate of it. RDF tooling that consumes CEDAR instance data uses this projection; tooling that does not need RDF ignores it.

The projection is

  • total — every Value admitted by the abstract grammar projects to a unique RDF term (literal or IRI) plus zero or more accompanying triples,
  • deterministic — given the same input Value, every conforming projection produces the same RDF, and
  • mechanical — the rules below are the entire definition; no interpretive judgement is required.

The projection is informative with respect to the abstract grammar and wire grammar — it does not constrain how Value instances are encoded on the wire or represented in memory. It is normative for any RDF emitter that claims to project CEDAR instance data: a conforming emitter MUST produce the RDF specified here.

Vocabularies

The projection uses the following IRI prefixes:

| Prefix | IRI |
| --- | --- |
| xsd: | http://www.w3.org/2001/XMLSchema# |
| rdf: | http://www.w3.org/1999/02/22-rdf-syntax-ns# |
| rdfs: | http://www.w3.org/2000/01/rdf-schema# |
| skos: | http://www.w3.org/2004/02/skos-core# |
| dc: | http://purl.org/dc/terms/ |

No CEDAR-specific RDF vocabulary is introduced; the projection uses only RDF, RDFS, SKOS, XSD, and Dublin Core terms.

Per-variant projection

Each Value variant projects to a single RDF term. The “RDF term” column gives the produced node. The “Accompanying triples” column lists triples that travel with the term when the term is the object of an enclosing statement (for example, the value of an EmbeddedField instance). The exact subject/predicate of the enclosing statement is determined by the surrounding structure and is out of scope for this section.

Scalar values

| Value variant | RDF term | Accompanying triples |
| --- | --- | --- |
| TextValue { value, lang } (lang present) | "value"@lang (rdf:langString) | none |
| TextValue { value } (lang absent) | "value"^^xsd:string | none |
| IntegerNumberValue { value } | "value"^^xsd:integer | none |
| RealNumberValue { value, datatype } | "value"^^xsd:<datatype> | none |
| BooleanValue { value } | "value"^^xsd:boolean ("true" or "false") | none |

For RealNumberValue, the <datatype> placeholder is the lexical name of the carried RealNumberDatatypeKind (decimal, float, or double), expanded against xsd:.

When the originating TextFieldSpec carries LangTagRequirement, the projection is pinned to a single RDF literal shape: "langTagRequired" always projects to rdf:langString literals; "langTagForbidden" always projects to xsd:string literals. "langTagOptional" (the default) admits either shape and projects each TextValue according to whether its lang slot is present.
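The scalar rules are mechanical enough to sketch directly. The Python function below is a hedged illustration: it renders each term in Turtle-style prefixed notation, assumes values arrive as decoded JSON objects with "kind", "value", and (where applicable) "lang" or "datatype" properties, and does not model the LangTagRequirement pinning. The names project_scalar and _escape are hypothetical.

```python
def _escape(text: str) -> str:
    """Escape backslashes and double quotes for a quoted RDF literal (minimal)."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def project_scalar(value: dict) -> str:
    """Project a scalar Value to its RDF literal, written with the xsd: prefix."""
    kind = value["kind"]
    if kind == "TextValue":
        lexical = _escape(value["value"])
        if "lang" in value:
            return f'"{lexical}"@{value["lang"]}'      # rdf:langString
        return f'"{lexical}"^^xsd:string'
    if kind == "IntegerNumberValue":
        return f'"{value["value"]}"^^xsd:integer'
    if kind == "RealNumberValue":
        # datatype is one of decimal, float, double, expanded against xsd:
        return f'"{value["value"]}"^^xsd:{value["datatype"]}'
    if kind == "BooleanValue":
        return '"true"^^xsd:boolean' if value["value"] else '"false"^^xsd:boolean'
    raise ValueError(f"not a scalar Value kind: {kind}")
```

Each branch returns exactly one term and no accompanying triples, matching the "none" entries in the table.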

Temporal values

| Value variant | RDF term |
| --- | --- |
| YearValue { value } | "value"^^xsd:string |
| YearMonthValue { value } | "value"^^xsd:string |
| FullDateValue { value } | "value"^^xsd:date |
| TimeValue { value } | "value"^^xsd:time |
| DateTimeValue { value } | "value"^^xsd:dateTime |

YearValue and YearMonthValue project to xsd:string literals. The temporal nature of the value is recoverable from the surrounding FieldSpec if needed; the projection does not introduce xsd:gYear or xsd:gYearMonth typed literals.

Contact values

| Value variant | RDF term |
| --- | --- |
| EmailValue { value } | "value"^^xsd:string |
| PhoneNumberValue { value } | "value"^^xsd:string |

IRI-bearing values

LinkValue, OrcidValue, RorValue, DoiValue, PubMedIdValue, RridValue, and NihGrantIdValue each project to a plain RDF IRI node.

| Value variant | RDF term | Accompanying triples |
| --- | --- | --- |
| LinkValue { iri, label } | <iri> | if label present: <iri> rdfs:label "label" |
| OrcidValue { iri, label } | <iri> | if label present: <iri> rdfs:label "label" |
| RorValue { iri, label } | <iri> | if label present: <iri> rdfs:label "label" |
| DoiValue { iri, label } | <iri> | if label present: <iri> rdfs:label "label" |
| PubMedIdValue { iri, label } | <iri> | if label present: <iri> rdfs:label "label" |
| RridValue { iri, label } | <iri> | if label present: <iri> rdfs:label "label" |
| NihGrantIdValue { iri, label } | <iri> | if label present: <iri> rdfs:label "label" |

The label is a MultilingualString on every IRI-bearing value. Each localization produces a separate rdfs:label triple. A label with no localizations (a single und-tagged entry) produces a single rdfs:label "label"@und triple.

Controlled-term values

ControlledTermValue projects to the term IRI together with optional metadata triples drawn from the optional Label, Notation, and PreferredLabel slots:

| Slot | Triple emitted (when present) |
| --- | --- |
| label | <term> rdfs:label "label"@lang for each localization in the MultilingualString |
| notation | <term> skos:notation "notation"^^xsd:string |
| preferredLabel | <term> skos:prefLabel "preferredLabel"@lang for each localization in the MultilingualString |

The accompanying-triple count is therefore variable: zero (no optional slots), one, two, or more (when label or preferred label carry several localizations).

Enum values

An enum value’s RDF projection requires the surrounding EnumFieldSpec context: the value carries a bare Token, and the spec’s PermissibleValue+ list supplies the per-token Label, Description, and Meaning metadata that the projection draws on. This is the only Value whose RDF lift cannot be determined from the value alone.

  • EnumValue { value: T } projects as follows:
    1. Look up T in the referenced EnumFieldSpec’s PermissibleValue entries to obtain the matching pv.
    2. If pv carries one or more Meaning entries, project as one RDF IRI node per Meaning — i.e. an enum value with n meanings projects to n IRI nodes. Each IRI node carries rdfs:label triples drawn from the matching Meaning’s own label (one triple per localization in the MultilingualString); if the Meaning carries no label, rdfs:label triples are drawn from the enclosing pv.label instead, providing a fallback display label when the bound term’s own label is not cached. dc:description triples are drawn from pv.description (one per localization). When this rule yields more than one RDF term, the surrounding statement that targets the enum value is duplicated once per term.
    3. If pv carries no Meaning, project as "T"^^xsd:string. The accompanying rdfs:label and dc:description triples are not emitted in this case (the value is a bare lexical token).

A conforming RDF emitter MUST therefore have access to the EnumFieldSpec of the surrounding EmbeddedField when projecting an EnumValue. RDF emitters that lift CEDAR data without schema context cannot project enum values faithfully.
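The three-step enum rule can be sketched as follows. This Python illustration simplifies the data shapes considerably and should be read as an assumption-laden outline: permissible values are plain dicts with "value", optional "label" and "description" (each a list of (text, lang) pairs standing in for MultilingualString), and optional "meanings" (dicts with "iri" and optional "label"). None of these property names are taken from the wire grammar.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]


def project_enum(token: str, permissible_values: List[dict]) -> Tuple[List[str], List[Triple]]:
    """Return (RDF terms, accompanying triples) for an EnumValue carrying token."""
    # Step 1: find the matching permissible value.
    pv = next((p for p in permissible_values if p["value"] == token), None)
    if pv is None:
        raise ValueError(f"token {token!r} is not a permissible value")

    meanings = pv.get("meanings", [])
    # Step 3: no Meaning, so the value projects to a bare string literal.
    if not meanings:
        return [f'"{token}"^^xsd:string'], []

    # Step 2: one IRI node per Meaning, with labels and descriptions attached.
    terms: List[str] = []
    triples: List[Triple] = []
    for meaning in meanings:
        node = f'<{meaning["iri"]}>'
        terms.append(node)
        # Prefer the Meaning's own label; fall back to the permissible value's label.
        labels = meaning.get("label") or pv.get("label", [])
        for text, lang in labels:
            triples.append((node, "rdfs:label", f'"{text}"@{lang}'))
        for text, lang in pv.get("description", []):
            triples.append((node, "dc:description", f'"{text}"@{lang}'))
    return terms, triples
```

When more than one term is returned, the caller is responsible for duplicating the enclosing statement once per term, as step 2 requires.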

Attribute value

AttributeValue { name, value } carries an attribute name and a nested value. The grammar types name as a Unicode string (the AttributeName production), not an Iri — attribute names are not constrained to be IRIs at the abstract-grammar level.

The projection treats name as the IRI of the predicate connecting the enclosing subject to the projected value; the projected RDF term is the projection of the nested value. The wrapper introduces one triple of the form <subject> <name> <projected-value>, where <subject> is supplied by the enclosing structure. The accompanying triples of the nested value (if any) travel with the projected value as for any other position.

For projection to succeed, the name string MUST be resolvable to a syntactically valid IRI — either because it is already an absolute IRI, or because the consuming tool resolves a relative name against an enclosing namespace before projection. CEDAR data whose name strings cannot be resolved this way is not projectable to RDF; tooling SHOULD either supply a default namespace or refuse to project such instances.

Annotation

An Annotation carries a property IRI and a body of polymorphic kind (AnnotationStringValue or AnnotationIriValue per grammar.md §Annotations). On any artifact carrying annotations (Field, Template, PresentationComponent), the annotation projects to a single triple whose subject is the artifact’s IRI, predicate is the annotation property, and object depends on the body kind:

| Annotation body kind | RDF term for the object |
| --- | --- |
| AnnotationStringValue { value, lang } (lang present) | "value"@lang (rdf:langString) |
| AnnotationStringValue { value } (lang absent) | "value"^^xsd:string |
| AnnotationIriValue { iri } | <iri> |

Each annotation produces exactly one triple. Multiple annotations on the same artifact produce one triple each. Annotations are projected only when the surrounding artifact is itself projected; the wrapping Annotation carries no other RDF presence.

Round-trip and faithfulness

The projection is forward-only by design: it converts CEDAR Value instances into RDF. The reverse direction (lifting an arbitrary RDF graph back into CEDAR Value instances) is not defined by this specification. RDF data produced by this projection MAY be re-ingested into CEDAR by tooling that knows the source FieldSpec for each value position; in the absence of FieldSpec context the reverse direction is ambiguous.

Within the projection itself, CEDAR-side identity is preserved: two CEDAR Value instances with identical content project to RDF terms that are RDF-term-equal. Two CEDAR Value instances differing in any structural component project to RDF terms that differ in either the term itself or in the accompanying triples.

Non-projected information

The following CEDAR information is not carried by the projection:

  • the kind discriminator of each Value variant — it is not preserved as an RDF triple. Variants whose RDF terms coincide (for example, EmailValue and PhoneNumberValue both projecting to xsd:string literals) cannot be distinguished from RDF alone,
  • presentation hints, label overrides, visibility, and other embedding-level configuration carried by EmbeddedField properties — the projection covers Value content only,
  • field-spec metadata such as units, validation regexes, or rendering hints — these are properties of the schema, not of the value,
  • default values at either layer (XxxFieldSpec.defaultValue and EmbeddedXxxField.defaultValue) — defaults are UI/UX initialisation only and never appear in TemplateInstance artifacts (see grammar.md §Defaults and instances.md). The projection sees only the values an instance actually carries; defaults that were accepted are projected as the chosen value (indistinguishable from a user-typed identical value), and defaults that were not accepted are simply absent.

Tooling that requires faithful round-tripping of these CEDAR-native concerns SHOULD work directly with the wire form rather than relying on the RDF projection.

Host-Language Bindings

This document gives guidance on how to map the abstract grammar (grammar.md) and the JSON wire format (wire-grammar.md) onto host-language types and idioms in TypeScript, Java, and Python.

1. Purpose and Scope

The CEDAR Structural Model is layered:

  • grammar.md defines what the model is — the abstract productions, their components, and the structural invariants they satisfy.
  • wire-grammar.md defines what the JSON looks like — exactly one JSON shape per abstract production, with discriminator placement and inline constraints.
  • serialization.md defines the encoding rules that frame the wire shapes — property naming, NFC normalisation, big-integer fallback, the wrapping principle.
  • This document defines how those JSON shapes become host-language values — the in-memory types a binding library exposes, and the idioms it follows.

Where the prior three documents are normative, this one is recommendation-grade. A binding conforms by realising the meta-categories in §2 with idioms compatible with the spirit of the recommendations below; deviations are allowed but SHOULD be documented in the binding’s own README.

In scope: TypeScript, Java (17+), Python (3.11+).

Out of scope (for now): Rust, Go, C#, Swift, Kotlin, and other languages. New languages can be added by following the meta-pattern in §2 — for each category, name an idiomatic realisation that preserves the wire round-trip and the construction-time invariants.

The reference TypeScript implementation is cedar-ts (npm package @metadatacenter/cedar-model); see §5. For idioms not covered explicitly here, cedar-ts is the source of truth on the TS side.


2. Meta-Categories

Each subsection below covers one structural pattern that recurs across the wire grammar. For every category we give:

  • a one-paragraph definition in terms of the grammar / wire-grammar;
  • a TypeScript idiom (reflecting cedar-ts);
  • a Java idiom (Java 17+, Jackson 2.x with jackson-databind and jackson-datatype-jdk8);
  • a Python idiom (Pydantic v2; attrs / dataclass mentioned where appropriate);
  • validation guidance — when and where the binding enforces the associated constraints;
  • a small worked example translating the same abstract production three ways.

Reading note: Jackson-annotation density. The Java idioms in this section annotate every record component explicitly with @JsonProperty(...) and every mapping constructor with @JsonCreator. The intent is unambiguous wire-to-Java mapping that does not depend on parameter-name reflection (the -parameters compiler flag) or on positional binding. A real binding MAY rely on Jackson defaults — e.g. record component name introspection plus implicit canonical-constructor binding — and elide most of these annotations; the explicit form here is the lower bound on what the binding contract requires, not the only style permitted.

Forward references. §2 mentions cedar-ts module names (leaves/, embedded/, etc.) and a few productions (EmbeddedArtifact, Template.members) before they are introduced in detail. The cedar-ts module layout is in §5; the productions themselves are defined in grammar.md and wire-grammar.md.

2.1 Plain object production

What it is. A wire production written as T ::: object { ... } with no "kind": "..." literal property. These are the singleton-only productions of the wire grammar — productions that never appear as alternatives in any discriminator: kind union, and therefore never carry kind on the wire (per the kind rule, wire-grammar.md §1.5). Examples: Cardinality, Property, LabelOverride, LifecycleMetadata, SchemaArtifactVersioning, Annotation, Unit, OntologyReference, OntologyDisplayHint, ControlledTermClass, PermissibleValue, Meaning.

TypeScript idiom. A readonly interface plus a constructor function. No kind field on the interface.

export interface Cardinality {
  readonly min: number;
  readonly max?: number;
}

export interface CardinalityInit {
  readonly min: number;
  readonly max?: number;
}

export function cardinality(init: CardinalityInit): Cardinality {
  const out: { min: number; max?: number } = {
    min: assertNonNegativeInteger(init.min),
  };
  if (init.max !== undefined) out.max = assertNonNegativeInteger(init.max);
  return out;
}

Java idiom. A record whose components mirror the wire properties. No Jackson type info is needed because the value lives at a singleton position and is decoded by its enclosing field’s static type.

public record Cardinality(
        @JsonProperty("min") int min,
        @JsonProperty("max") @JsonInclude(NON_NULL) Integer max) {
    @JsonCreator
    public Cardinality {
        if (min < 0) throw new CedarConstructionException("Cardinality.min must be >= 0");
        if (max != null && max < 0) throw new CedarConstructionException("Cardinality.max must be >= 0");
    }
}

Python idiom. A Pydantic v2 model with frozen=True and aliases for any name that differs from snake_case.

from pydantic import BaseModel, ConfigDict, Field

class Cardinality(BaseModel):
    model_config = ConfigDict(frozen=True, populate_by_name=True)
    min: int = Field(ge=0)
    max: int | None = Field(default=None, ge=0)

Validation guidance. Range checks (e.g., min >= 0) and any inline constraints from wire-grammar.md apply at construction time. The constructed value is always valid; downstream code never has to revalidate.

Worked example: Cardinality { min: number; max?: number }. The wire shape is { "min": 0, "max": 5 }; the three idioms above produce that JSON via their language’s natural serializer (TS via plain JSON.stringify; Java via Jackson default mapper; Python via model_dump_json()).

2.2 Discriminated union with kind tag

What it is. A wire production written as T ::: A | B | … with either an explicit // discriminator: kind comment or no discriminator comment at all (in which case kind is the default per wire-grammar.md §1.3). Each member is an object production whose shape includes a "kind": "MemberName" literal property. Examples: Value, FieldSpec, EmbeddedArtifact, ControlledTermSource, PresentationComponent, InstanceValue, SchemaArtifact, Artifact, ExternalAuthorityValue, DateValue.

TypeScript idiom. A discriminated (tagged) union of interfaces, all sharing a kind: "..." field as their literal-typed discriminant. Construction goes through per-variant factory functions; type-narrowing is by switch on value.kind.

export interface TextValue {
  readonly kind: 'TextValue';
  readonly value: string;
  readonly lang?: LanguageTag;
}
export interface IntegerNumberValue {
  readonly kind: 'IntegerNumberValue';
  readonly value: string;
}
export type Value = TextValue | IntegerNumberValue /* | … */;

export function textValue(value: string, lang?: LanguageTag): TextValue {
  return lang === undefined
    ? { kind: 'TextValue', value }
    : { kind: 'TextValue', value, lang };
}

Java idiom. A sealed interface with one record per variant and Jackson’s polymorphic-type annotations using the property name kind.

@JsonTypeInfo(use = JsonTypeInfo.Id.NAME, property = "kind")
@JsonSubTypes({
    @JsonSubTypes.Type(value = TextValue.class, name = "TextValue"),
    @JsonSubTypes.Type(value = IntegerNumberValue.class, name = "IntegerNumberValue")
})
public sealed interface Value permits TextValue, IntegerNumberValue { }

@JsonTypeName("TextValue")
public record TextValue(
        @JsonProperty("value") String value,
        @JsonProperty("lang") @JsonInclude(NON_ABSENT) Optional<String> lang)
        implements Value {
    @JsonCreator
    public TextValue { }
}

@JsonTypeName("IntegerNumberValue")
public record IntegerNumberValue(@JsonProperty("value") String value)
        implements Value {
    @JsonCreator
    public IntegerNumberValue { }
}

Python idiom. A discriminated Union annotated with pydantic.Discriminator("kind"). Each variant carries a kind: Literal["..."] field.

from typing import Literal, Annotated, Union
from pydantic import BaseModel, ConfigDict, Discriminator

class TextValue(BaseModel):
    model_config = ConfigDict(frozen=True)
    kind: Literal["TextValue"] = "TextValue"
    value: str
    lang: str | None = None

class IntegerNumberValue(BaseModel):
    model_config = ConfigDict(frozen=True)
    kind: Literal["IntegerNumberValue"] = "IntegerNumberValue"
    value: str

Value = Annotated[Union[TextValue, IntegerNumberValue], Discriminator("kind")]

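For top-level decoding, wrap the union in a pydantic.RootModel[Value] and call model_validate_json(...) on that root model, or use pydantic.TypeAdapter(Value).validate_json(...). The Annotated union alias itself is not a model class and has no model_validate_json method.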

Validation guidance. The decoder rejects any input whose kind value is not a known member. Encoders MUST emit kind with the exact production name (no aliasing). The construction-time invariants of each variant apply normally.

Worked example: Value (subset: TextValue | IntegerNumberValue). Wire shape: {"kind": "TextValue", "value": "hi"}. All three idioms decode that JSON to a value whose static type is Value and whose runtime narrowing predicate (value.kind === 'TextValue' / instanceof TextValue / isinstance(v, TextValue)) returns true.

Java note: nested sealed interfaces and Jackson dispatch tables. Where a sealed union permits another sealed union as a member, the outer union’s @JsonSubTypes SHOULD enumerate all leaf concrete records directly — a flat dispatch table — rather than delegating to the intermediate sealed interface.

Rationale. Nested-@JsonTypeInfo delegation through an intermediate sealed interface is fragile in Jackson 2.x: the resolver re-enters the deserializer chain at the inner interface, which can fight with @JsonTypeName on the leaves and produce spurious failures. The wire form already requires kind to be one of the leaf names (never an intermediate-group name), so a flat dispatch table is correct by construction.

Example. EmbeddedArtifact is a sealed union over EmbeddedField and EmbeddedPresentationComponent, with EmbeddedField itself sealed over the 20 family records (EmbeddedTextField, EmbeddedIntegerNumberField, etc.). The Jackson registration on EmbeddedArtifact should list every leaf record (all 20 EmbeddedXxxField records plus every EmbeddedXxxComponent record) in @JsonSubTypes, not the intermediate EmbeddedField interface.

2.3 Position-discriminated union

What it is. A wire production written as T ::: A | B | … and explicitly declared // discriminator: position in the wire grammar. The variant is determined entirely by the enclosing property and surrounding context; the encoded object itself carries no discriminator. The principal example is RenderingHint inside the various FieldSpec families: each FieldSpec family fixes which RenderingHint variant is permitted at its renderingHint slot, so the rendering hint encodes without a kind tag.

Note: position-discriminated vs. positionally-determinate. A position-discriminated union (this section) is one the wire grammar declares with // discriminator: position, and whose members therefore carry no kind on the wire. This is distinct from the larger class of unions whose use sites happen to be positionally determinate but which the wire grammar declares with the default discriminator: kind. The umbrella FieldSpec union is a good example of the latter: every actual use site (XxxField.fieldSpec) is typed with the per-family concrete production (TextFieldSpec, IntegerNumberFieldSpec, …) so the variant could in principle be recovered positionally, but FieldSpec is discriminator: kind so every member still carries "kind" on the wire per the kind rule (wire-grammar.md §1.5). Use this section only for productions explicitly flagged // discriminator: position.

Bindings can usually realise each use site as a single concrete class, since the position fixes the variant. There is no need for a runtime union at the use site.

TypeScript idiom. A single concrete interface per use site; the abstract union (RenderingHint) exists only as a documentation alias and is not used as a runtime narrowing target.

Java idiom. A concrete record per use site. The intermediate union interface MAY exist purely for documentation but does not need Jackson polymorphism.

Python idiom. A concrete BaseModel per use site; the abstract union may be exposed as a TypeAlias for documentation.

Validation guidance. None special — the variant is fixed by the enclosing property’s type. Decoders SHOULD NOT attempt cross-variant disambiguation at this position.

Worked example: DateRenderingHint inside DateFieldSpec.renderingHint. Wire: {"kind":"DateFieldSpec","dateValueType":"fullDate","renderingHint":{"componentOrder":"dayMonthYear"}}. The renderingHint property is statically typed as DateRenderingHint; no kind tag appears on the inner object since the position fixes the variant.
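As a rough illustration of decoding this wire form, the sketch below uses plain dataclasses rather than Pydantic to stay dependency-free. The function name decode_date_field_spec is hypothetical, and treating renderingHint as optional is an assumption of the sketch. The point is that the inner object is decoded without consulting any kind property.

```python
import json
from dataclasses import dataclass
from typing import Any, Optional

@dataclass(frozen=True)
class DateRenderingHint:
    component_order: str  # wire name: componentOrder

@dataclass(frozen=True)
class DateFieldSpec:
    date_value_type: str  # wire name: dateValueType
    rendering_hint: Optional[DateRenderingHint] = None  # assumed optional here

def decode_date_field_spec(obj: dict[str, Any]) -> DateFieldSpec:
    if obj.get("kind") != "DateFieldSpec":
        raise ValueError("expected kind 'DateFieldSpec'")
    hint = None
    if "renderingHint" in obj:
        hint_obj = obj["renderingHint"]
        if not isinstance(hint_obj, dict):
            raise ValueError("renderingHint must be a JSON object")
        # The position fixes the variant: no "kind" is read from the inner object.
        hint = DateRenderingHint(component_order=hint_obj["componentOrder"])
    return DateFieldSpec(date_value_type=obj["dateValueType"], rendering_hint=hint)

wire = (
    '{"kind":"DateFieldSpec","dateValueType":"fullDate",'
    '"renderingHint":{"componentOrder":"dayMonthYear"}}'
)
spec = decode_date_field_spec(json.loads(wire))
```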

2.4 Typed primitive wrapper

What it is. A wire production written as T ::: string (or number) where T names a specialised role for the primitive — Iri, FieldId, TemplateId, OrcidIri, LanguageTag, Bcp47Tag, etc. On the wire these collapse to the underlying primitive; in the abstract grammar they are typed roles whose constraints (IRI well-formedness, BCP 47, ASCII identifier shape) MUST be enforced at decode time.

The trade-off: typed wrappers catch role mismatches (passing a TemplateId where a FieldId is expected) at compile time, at the cost of some construction friction. Plain strings are ergonomic but cede that protection to runtime checks. Bindings MAY choose either end of this spectrum; cedar-ts wraps strongly.

TypeScript idiom. Two patterns are in common use: a structural object wrapper carrying kind, and a TypeScript branded type (a compile-time-only role tag added to a primitive via a phantom __brand property of a string-literal type — the value is still a plain string at runtime, but the type system treats Iri and a generic string as distinct types). The structural wrapper costs one allocation per identifier but gives full structural typing in IDEs; the branded type costs nothing at runtime but enforces the role only at compile time. cedar-ts uses Option A.

Option A — structural wrapper. The cedar-ts choice for Iri, FieldId, and the typed-id families:

export interface Iri { readonly kind: 'Iri'; readonly value: string; }
export function iri(value: string): Iri {
  return { kind: 'Iri', value: parseIriString(value) };
}

cedar-ts’s FieldId family uses this form with a per-family kind discriminant so the twenty families remain distinguishable in the type system.

Option B — branded type. A lighter alternative; the value is a bare string at runtime, the type system enforces the role at compile time:

export type Iri = string & { readonly __brand: 'Iri' };
export function iri(value: string): Iri { /* validate */ return value as Iri; }

Java idiom. A dedicated value record:

public record Iri(@JsonValue String value) {
    public Iri {
        if (!IriSyntax.isValid(value)) throw new CedarConstructionException("Invalid IRI: " + value);
    }
    @JsonCreator public static Iri of(String value) { return new Iri(value); }
}

@JsonValue / @JsonCreator collapses to and from a bare JSON string so the wire form remains primitive while the in-memory type is nominal.

Python idiom. typing.NewType('Iri', str) for nominal typing in static analysis; the runtime value is a plain str and serialises as such.

from typing import NewType
Iri = NewType("Iri", str)

def iri(value: str) -> Iri:
    if not is_iri(value):
        raise CedarConstructionError(f"Invalid IRI: {value!r}")
    return Iri(value)

For richer runtime validation, a Pydantic BaseModel wrapper or an Annotated[str, AfterValidator(...)] form is also fine; the NewType form is the lightest.

Validation guidance. All typed primitive wrappers MUST enforce their syntactic constraints (RFC 3987 for IRI; BCP 47 for language tags; the ASCII pattern [A-Za-z][A-Za-z0-9_-]* for EmbeddedArtifactKey; SemVer for Version / ModelVersion) at the constructor.
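For the simplest of these constraints, the ASCII identifier pattern, a constructor-time check might look like the sketch below. The factory name embedded_artifact_key is an assumption of this example; the pattern itself is the one quoted above.

```python
import re
from typing import NewType

EmbeddedArtifactKey = NewType("EmbeddedArtifactKey", str)

class CedarConstructionError(Exception):
    pass

# The ASCII identifier pattern quoted in the validation guidance.
_KEY_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_-]*")

def embedded_artifact_key(value: str) -> EmbeddedArtifactKey:
    # fullmatch anchors the pattern at both ends, so trailing junk is rejected.
    if not isinstance(value, str) or _KEY_PATTERN.fullmatch(value) is None:
        raise CedarConstructionError(f"Invalid EmbeddedArtifactKey: {value!r}")
    return EmbeddedArtifactKey(value)
```

Because NewType adds no runtime wrapper, the returned value is an ordinary str and serialises as a bare JSON string.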

Worked example: Iri and FieldId. Iri wire form: "https://example.org/x". FieldId wire form: also "https://example.org/x" — the family is recovered from the surrounding kind. Bindings reconstruct the typed form by combining the JSON string with the static type at the use site.

2.5 MultilingualString

What it is. A MultilingualString is an array of one or more {value, lang} localizations of the same conceptual string — for example, the English, French, and German labels for one field. The wire production is MultilingualString ::: nonEmptyArray<LangString>, with two invariants:

  • The array MUST be non-empty.
  • Lang tags MUST be unique within the array (case-folded, see wire-grammar.md §2.2).

A MultilingualString is distinct from a single language-tagged TextValue. A TextValue is one tagged object carrying kind, value, and lang — a single localized string. A MultilingualString is an array of localizations of the same conceptual string. The two are not interchangeable.

The (value, lang) pattern recurs across all three target languages and deserves its own section because the non-empty-and-unique-lang invariants need explicit support.

TypeScript idiom. A readonly array alias plus a constructor that enforces invariants and returns a frozen array. cedar-ts accepts a range of input shapes — bare string, {value, lang}, [value, lang], { [lang]: value } map, or an array of any of those — and normalises them to the canonical array form.

export type MultilingualString = readonly LangString[];
export interface LangString { readonly value: string; readonly lang: string; }

export function multilingualString(input: MultilingualStringInput): MultilingualString {
  // normalise, BCP 47-validate every lang, dedup-check, freeze, return.
}

Java idiom. Two records, with the outer carrying the invariants:

public record LangString(
        @JsonProperty("value") String value,
        @JsonProperty("lang") String lang) {
    @JsonCreator
    public LangString { /* BCP 47 check on lang */ }
}

public record MultilingualString(@JsonValue List<LangString> entries) {
    public MultilingualString {
        if (entries == null || entries.isEmpty())
            throw new CedarConstructionException("MultilingualString must be non-empty");
        var seen = new java.util.HashSet<String>();
        for (var e : entries) {
            if (!seen.add(e.lang().toLowerCase(Locale.ROOT)))
                throw new CedarConstructionException("Duplicate lang tag: " + e.lang());
        }
        entries = List.copyOf(entries);
    }
    @JsonCreator public static MultilingualString of(List<LangString> entries) { return new MultilingualString(entries); }
}

A NonEmptyList<T> helper type is a reasonable cross-cutting abstraction if the binding has several non-empty arrays to model.

Python idiom. A Pydantic model with a model_validator(mode="after") enforcing non-empty and unique lang tags:

from pydantic import BaseModel, ConfigDict, RootModel, model_validator

class LangString(BaseModel):
    model_config = ConfigDict(frozen=True)
    value: str
    lang: str  # validate BCP 47 with a field validator

class MultilingualString(RootModel[list[LangString]]):
    model_config = ConfigDict(frozen=True)

    @model_validator(mode="after")
    def _check(self):
        entries = self.root
        if not entries:
            raise CedarConstructionError("MultilingualString must be non-empty")
        seen = set()
        for e in entries:
            key = e.lang.lower()
            if key in seen:
                raise CedarConstructionError(f"Duplicate lang tag: {e.lang!r}")
            seen.add(key)
        return self

attrs with __attrs_post_init__ is a lighter alternative; the recommendation is Pydantic for the JSON round-trip story.

Validation guidance. Validate at construction. A constructed MultilingualString is always non-empty and always lang-unique.

2.6 Optional component

What it is. A grammar production component marked [X]. On the wire the property is encoded only when present (serialization.md §4.2): conforming encoders MUST NOT emit null or empty strings in place of an absent optional.

TypeScript idiom. prop?: T. The interface treats omission and undefined identically; encoders skip the property at JSON write time. Use JSON.stringify with no null-injection logic; the property is naturally absent from the serialised output.

Java idiom. Prefer @Nullable T over Optional<T> in record components. Jackson handles null/missing properties on records cleanly with @JsonInclude(JsonInclude.Include.NON_NULL) at either the component or class level. Optional<T> works but interacts awkwardly with records (Jackson needs the jackson-datatype-jdk8 module registered before it handles Optional correctly) and adds a layer of allocation per access.

public record Cardinality(
        @JsonProperty("min") int min,
        @JsonProperty("max") @JsonInclude(NON_NULL) Integer max) { … }

Python idiom. T | None with default None; Pydantic omits None-valued fields when the model is serialised with model_dump_json(exclude_none=True).

from pydantic import BaseModel, ConfigDict, Field

class Cardinality(BaseModel):
    model_config = ConfigDict(frozen=True)
    min: int = Field(ge=0)
    max: int | None = Field(default=None, ge=0)

# Round-trip omits `max` when None:
c = Cardinality(min=0)
c.model_dump_json(exclude_none=True)  # '{"min":0}'

Pydantic v2's ConfigDict has no setting that makes exclude_none the default. To make the omission implicit, override model_dump_json (and model_dump) in a shared base class so that exclude_none defaults to True.

Validation guidance. Decoders MUST treat "prop": null as an encoding error (per serialization.md §4.2), distinct from omission of the property.
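The distinction between an omitted property and an explicit null can be sketched as follows. CedarDecodeError and read_optional_int are hypothetical names chosen for this example; the sentinel object is what lets the decoder tell the two cases apart.

```python
import json
from typing import Any, Optional

class CedarDecodeError(Exception):  # hypothetical error type for this sketch
    pass

_MISSING = object()  # sentinel: distinguishes "absent" from "present but null"

def read_optional_int(obj: dict[str, Any], prop: str) -> Optional[int]:
    value = obj.get(prop, _MISSING)
    if value is _MISSING:
        return None  # property omitted: the optional component is absent
    if value is None:
        raise CedarDecodeError(f'"{prop}": null is not a valid encoding of an absent optional')
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise CedarDecodeError(f'"{prop}" must be an integer')
    return value

absent = read_optional_int(json.loads('{"min":0}'), "max")
present = read_optional_int(json.loads('{"min":0,"max":3}'), "max")
```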

2.7 String enum

What it is. A wire production T ::: "a" | "b" | … whose values are drawn from a fixed set. All values are lowerCamelCase per serialization.md §3.3. Examples: Status, ValueRequirement, Visibility, DateValueType, DateComponentOrder, TimeFormat, TimePrecision, DateTimeValueType, TimezoneRequirement, RealNumberDatatypeKind (three values), the flat-string rendering hints (TextRenderingHint, SingleValuedEnumRenderingHint, MultiValuedEnumRenderingHint, BooleanRenderingHint).

TypeScript idiom. A string-literal union. cedar-ts also exports a frozen array of permitted values and an isXxx type guard.

export type Status = 'draft' | 'published';
export const STATUSES: readonly Status[] = Object.freeze(['draft', 'published']);
export const isStatus = (x: unknown): x is Status =>
  typeof x === 'string' && (STATUSES as readonly string[]).includes(x);

Java idiom. A Java enum whose constants are uppercase by convention, with @JsonProperty annotations mapping each constant to its lowerCamelCase wire value:

public enum Status {
    @JsonProperty("draft") DRAFT,
    @JsonProperty("published") PUBLISHED
}

Jackson uses the annotation for both serialization and deserialization. An unknown wire value yields Jackson’s standard InvalidFormatException. Bindings that prefer to surface custom errors, or that need a wire accessor on the enum (e.g. for non-Jackson code paths), can use the @JsonValue / @JsonCreator pair instead.

Python idiom. enum.StrEnum (Python 3.11+); Pydantic accepts and emits the string form directly.

from enum import StrEnum
class Status(StrEnum):
    DRAFT = "draft"
    PUBLISHED = "published"

Validation guidance. Decoders MUST reject string values not in the declared set. The enum surface must be closed: future wire-grammar additions trigger a binding version bump.
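A closed-set decoder can be sketched in a few lines. This version declares the enum with str and Enum as bases so that it also runs on Python versions before 3.11; decode_status is a hypothetical helper name.

```python
from enum import Enum

class Status(str, Enum):
    DRAFT = "draft"
    PUBLISHED = "published"

def decode_status(raw: object) -> Status:
    if not isinstance(raw, str):
        raise ValueError(f"Status must be a JSON string, got {type(raw).__name__}")
    try:
        # Enum lookup by value is exact and case-sensitive.
        return Status(raw)
    except ValueError:
        raise ValueError(f"Unknown Status value: {raw!r}") from None
```

Lookup by value is case-sensitive, so "Draft" is rejected just like any other string outside the declared set.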

2.8 Repeated component

What it is. A grammar component marked X* (zero-or-more) or X+ (one-or-more). On the wire both encode as JSON arrays (wire-grammar.md §1.1, §4.3); X+ is written as nonEmptyArray<X> and carries the non-empty invariant. Order MUST be preserved through encode and decode.

TypeScript idiom. readonly T[] with an explicit non-empty check in the constructor for X+ cases. cedar-ts uses Object.freeze on constructed arrays where the position carries an invariant (MultilingualString, embedded in Template).

Java idiom. List<T>. Jackson handles arrays out of the box. For non-empty cases, validate at the constructor with if (list.isEmpty()) throw … and store as List.copyOf(list) to enforce immutability.

Python idiom. list[T]. Pydantic handles arrays out of the box. For non-empty, use Field(min_length=1) or a model_validator.

Validation guidance. Decoders MUST reject empty arrays at nonEmptyArray<X> positions. Encoders MUST preserve element order.

2.9 Constraints

What it is. Inline //-comments on wire-grammar.md productions declare constraints not expressible in the type expression. The constraint kinds in current use are:

  • Lexical-form well-formedness. Values matching a syntactic category — e.g. BCP 47 language tags, the ASCII identifier pattern [A-Za-z][A-Za-z0-9_-]* for EmbeddedArtifactKey, RFC 3987 IRIs, SemVer for Version / ModelVersion, ISO 8601 date-time stamps.
  • Uniqueness across a collection. Distinctness of an entry’s identifying property within an array or set — e.g. EmbeddedArtifact.key values must be unique within a Template, LangString.lang tags must be unique within a MultilingualString, PermissibleValue.value tokens must be unique within an enum spec.
  • At-least-one-of. A production with all-optional components requires at least one to be present — e.g. OntologyDisplayHint requires at least one of acronym or name.
  • Value relationships across slots. A value at one slot must agree with another slot — e.g. the IRI placed at a field’s id MUST belong to a field of the same family as the enclosing kind.
  • Numeric ordering. Where a production carries paired bounds, the lower bound must not exceed the upper — e.g. Cardinality.min ≤ Cardinality.max.

Bindings SHOULD validate at construction time and throw a binding-specific exception type. A constructed instance is then always valid; downstream code may rely on the construction guarantee.

One canonical exception class per binding is recommended:

// TypeScript
export class CedarConstructionError extends Error {
  constructor(message: string) { super(message); this.name = 'CedarConstructionError'; }
}
// Java
public class CedarConstructionException extends RuntimeException {
  public CedarConstructionException(String message) { super(message); }
}
# Python
class CedarConstructionError(Exception):
    pass

Validation guidance. Validate eagerly at construction. Lazy validation (deferring checks until access) is discouraged: the model is value-typed; an invalid value should never exist in the runtime heap. Where validation depends on a wider context (e.g., embedded-key uniqueness depends on the whole Template.members array), perform the check in the enclosing constructor.
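Two of the constraint kinds listed above, numeric ordering and at-least-one-of, can be enforced eagerly as in the sketch below. It uses frozen dataclasses instead of Pydantic to stay dependency-free; the slot names follow the examples in this section, and everything else is illustrative.

```python
from dataclasses import dataclass
from typing import Optional

class CedarConstructionError(Exception):
    pass

@dataclass(frozen=True)
class Cardinality:
    min: int
    max: Optional[int] = None  # optional on the wire

    def __post_init__(self) -> None:
        if self.min < 0 or (self.max is not None and self.max < 0):
            raise CedarConstructionError("Cardinality bounds must be non-negative")
        # Numeric ordering: the lower bound must not exceed the upper bound.
        if self.max is not None and self.min > self.max:
            raise CedarConstructionError("Cardinality.min must not exceed Cardinality.max")

@dataclass(frozen=True)
class OntologyDisplayHint:
    acronym: Optional[str] = None
    name: Optional[str] = None

    def __post_init__(self) -> None:
        # At-least-one-of: all components are optional, but not all may be absent.
        if self.acronym is None and self.name is None:
            raise CedarConstructionError("OntologyDisplayHint requires acronym or name")
```

Because the checks run in __post_init__, an instance that violates either constraint is never returned to the caller.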

2.10 Widening constructors

What it is. An ergonomic pattern in which a constructor accepts a broader set of input shapes than its return type: iri() accepts Iri | string; multilingualString() accepts string | LangString | { [lang]: string } | LangString[]; property() accepts string | Iri | PropertyInit. The widened constructor narrows to the canonical wire shape, validating along the way. When the input is already in canonical form the constructor SHOULD return it unchanged (so e.g. iri(iri(s)) is well-defined and equivalent to iri(s)); this lets callers chain widening constructors without redundancy concerns.

This is recommended-but-not-required. A binding’s narrow (canonical) constructor — taking exactly the wire-grammar shape — MUST exist; widening factories are convenience layers on top.

TypeScript idiom. Function overloads or a single union-typed input parameter; the function dispatches on typeof / structural shape.

Java idiom. Static factory overloads on the record: Iri.of(String), Iri.of(URI). Avoid widening the canonical record constructor itself, which Jackson uses; add overloads as static methods so the wire-shape constructor remains unambiguous.

Python idiom. Module-level factory functions accepting Union types; the canonical Pydantic model constructor remains for the narrow shape. Avoid __init__ overloading via sentinels; prefer explicit factory functions (iri.from_string, etc.).
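The idempotence property of a widening constructor can be sketched as follows. This example wraps the string in a small frozen dataclass, rather than the NewType form shown in §2.4, so that the canonical form is detectable at runtime. The scheme check is a placeholder assumption and is not RFC 3987 validation.

```python
import re
from dataclasses import dataclass
from typing import Union

class CedarConstructionError(Exception):
    pass

# Placeholder check only: requires a URI-style scheme prefix such as "https:".
# A real binding would validate against RFC 3987.
_SCHEME = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*:")

@dataclass(frozen=True)
class Iri:
    value: str

def iri(value: Union[Iri, str]) -> Iri:
    if isinstance(value, Iri):
        return value  # already canonical: return it unchanged
    if not isinstance(value, str) or _SCHEME.match(value) is None:
        raise CedarConstructionError(f"Invalid IRI: {value!r}")
    return Iri(value)

once = iri("https://example.org/x")
twice = iri(once)
```

Here twice is the same object as once, so chaining the constructor neither re-validates nor re-allocates.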

2.11 Immutability

Immutable-by-default is strongly recommended for all binding types. A CEDAR artifact is a value; mutability is a hazard.

  • TypeScript: readonly on every interface property; Object.freeze() on constructed instances and on any nested arrays. cedar-ts freezes invariant-bearing arrays (MultilingualString, Template.members).
  • Java: record types are immutable by language design; for non-record classes use final fields, no setters, and defensive copies on collections (List.copyOf, Set.copyOf, Map.copyOf).
  • Python: Pydantic models with model_config = ConfigDict(frozen=True); dataclasses with @dataclass(frozen=True).

Equality is structural: two values with the same component values are equal regardless of allocation identity. Records and Pydantic models provide this automatically; TypeScript bindings need a deep structural-equality helper if equality is meaningful at call sites, since === compares object identity.
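Both properties can be seen with frozen dataclasses from the Python standard library. The Label class below is a hypothetical stand-in for any binding type with a nested collection.

```python
from dataclasses import FrozenInstanceError, dataclass

@dataclass(frozen=True)
class LangString:
    value: str
    lang: str

@dataclass(frozen=True)
class Label:  # hypothetical container type for this sketch
    entries: tuple[LangString, ...]  # a tuple keeps the nested collection immutable

a = Label((LangString("Title", "en"),))
b = Label((LangString("Title", "en"),))

same_value = a == b         # structural equality across separate allocations
same_object = a is b        # identity differs

try:
    a.entries = ()          # any assignment on a frozen instance is rejected
    mutated = True
except FrozenInstanceError:
    mutated = False
```

Using a tuple rather than a list for the nested collection also keeps the instance hashable, so values can be used as dictionary keys or set members.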

2.12 Override-precedence accessors

What it is. Several grammar slots come in pairs: a canonical value on a reusable artifact and an optional override on the embedding site. The override wins per the spec’s two-layer precedence rule (grammar.md §Defaults; presentation.md §Help-Text Rendering). The current pairs:

| Reusable artifact slot | Embedding-site override slot |
| --- | --- |
| Field.fieldSpec.defaultValue | EmbeddedXxxField.defaultValue |
| Field.label / Field.metadata.altLabels | EmbeddedXxxField.labelOverride.label / altLabels |
| Field.helpText | EmbeddedXxxField.helpTextOverride |

Binding guidance. Bindings SHOULD expose a small convenience accessor per pair so call sites do not re-implement the precedence rule. The accessor takes the EmbeddedField and the resolved Field and returns the effective value:

// TypeScript
function resolvedHelpText(
  embedded: EmbeddedTextField,
  field: TextField,
): MultilingualString | undefined {
  return embedded.helpTextOverride ?? field.helpText;
}
// Java
public static Optional<MultilingualString> resolvedHelpText(
    EmbeddedTextField embedded, TextField field) {
  return Optional.ofNullable(embedded.helpTextOverride())
                 .or(() -> Optional.ofNullable(field.helpText()));
}
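# Python
def resolved_help_text(
    embedded: EmbeddedTextField, field: TextField
) -> MultilingualString | None:
    # Test for absence explicitly; `or` would also discard a falsy override.
    if embedded.help_text_override is not None:
        return embedded.help_text_override
    return field.help_text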

The same pattern applies to defaultValue and labelOverride.label, each with its own accessor. Replace, not merge: the override replaces the canonical value at the embedding site; partial localization fallback (e.g., one language overridden, others falling through) is not part of the precedence rule. Bindings MUST NOT synthesise such fallback.
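The replace-not-merge rule can be made concrete with a small sketch. Here tuples of (value, lang) pairs stand in for MultilingualString, and the sample help texts are invented for illustration.

```python
from typing import Optional

# (value, lang) pairs stand in for the binding's MultilingualString type.
MultilingualString = tuple[tuple[str, str], ...]

def resolved_help_text(
    override: Optional[MultilingualString],
    canonical: Optional[MultilingualString],
) -> Optional[MultilingualString]:
    # Replace, not merge: a present override is returned whole, and no
    # per-language fallback to the canonical value is synthesised.
    return override if override is not None else canonical

canonical = (("Enter the title", "en"), ("Saisissez le titre", "fr"))
override = (("Title of the dataset", "en"),)

effective = resolved_help_text(override, canonical)
fallback = resolved_help_text(None, canonical)
```

The effective value contains only the English entry from the override. The French entry of the canonical value does not fall through.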


3. Naming Conventions per Language

| Language | Types | Functions / methods / properties | Constants |
| --- | --- | --- | --- |
| TypeScript | UpperCamelCase | lowerCamelCase | SCREAMING_SNAKE_CASE |
| Java | UpperCamelCase | lowerCamelCase | SCREAMING_SNAKE_CASE |
| Python | UpperCamelCase | snake_case | SCREAMING_SNAKE_CASE |

Reserved-word collisions (Java). As of the current model no grammar property name collides with a Java reserved word. (Verified by cross-referencing every property name in wire-grammar.md against the full Java reserved-word list.) A future grammar property whose name collides with a Java reserved word SHOULD be escaped by either renaming the Java field to a non-reserved synonym (e.g. isFoo for a wire foo) or using a leading underscore (_foo), in either case mapping back to the wire name via @JsonProperty("foo"). The wire name remains canonical.

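Property naming (Python). Pydantic models can use Field(alias='lowerCamelName') together with model_config = ConfigDict(populate_by_name=True) to expose Python snake_case attribute names while preserving the wire’s lowerCamelCase. Encoders must then serialise with by_alias=True (for example model_dump_json(by_alias=True)), because Pydantic emits attribute names by default. This is the recommended pattern: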

from typing import Literal

from pydantic import BaseModel, ConfigDict, Field

class SchemaArtifactVersioning(BaseModel):
    model_config = ConfigDict(populate_by_name=True, frozen=True)
    version: str
    status: Status
    previous_version: str | None = Field(default=None, alias="previousVersion")
    derived_from: str | None = Field(default=None, alias="derivedFrom")


class TextField(BaseModel):
    model_config = ConfigDict(populate_by_name=True, frozen=True)
    kind: Literal["TextField"] = "TextField"
    id: str
    model_version: str = Field(alias="modelVersion")
    metadata: CatalogMetadata
    versioning: SchemaArtifactVersioning
    field_spec: TextFieldSpec = Field(alias="fieldSpec")
    label: MultilingualString

model_version is a top-level field on every concrete artifact class (Template, TemplateInstance, every XxxField, and every PresentationComponent variant); it is no longer nested inside SchemaArtifactVersioning.

A binding MAY instead expose lowerCamelCase Python attribute names to avoid the alias layer; the alias approach is recommended for PEP 8 conformance on the Python surface.


4. Codebase Organisation

Bindings SHOULD organise the source tree so that everything specific to a single field family lives together. A field family is the twenty-way grouping introduced in grammar.md §3.2: TextField, IntegerNumberField, RealNumberField, BooleanField, DateField, TimeField, DateTimeField, ControlledTermField, SingleValuedEnumField, MultiValuedEnumField, LinkField, EmailField, PhoneNumberField, OrcidField, RorField, DoiField, PubMedIdField, RridField, NihGrantIdField, and AttributeValueField.

The “everything specific to a family” set comprises, at minimum, the family’s:

  • typed identifier (TextFieldId, etc.)
  • field artifact type (TextField, etc.)
  • field spec type (TextFieldSpec, etc.)
  • embedded field artifact type (EmbeddedTextField, etc.)
  • per-family value type (TextValue, etc.)
  • any per-family rendering hint (TextRenderingHint, etc.)
  • per-family construction helpers (widening constructors, type guards, JSON adapters, validators)

Per-language convention:

  • TypeScript. A single file per family — text-field.ts, integer-number-field.ts, controlled-term-field.ts, etc. — that exports every type and helper in the list above. Cross-family abstractions (the Field union, the EmbeddedField union, cross-cutting helpers) live in their own files and re-export from the per-family files where appropriate.
  • Java. A single package per family — package org.example.cedar.field.text, org.example.cedar.field.integer, org.example.cedar.field.controlledterm, etc. — containing the family’s records, sealed interface members, type-info annotations, and per-family helpers. The umbrella Field sealed interface and cross-family abstractions live in a parent package (org.example.cedar.field).
  • Python. A single module per family — cedar/field/text_field.py, etc. — paralleling the TypeScript layout.

The motivation is locality: any change to a field family — adding a constraint, renaming a property, introducing a new rendering hint — should touch one file (TS, Python) or one package (Java), not many. This also makes it straightforward for a reader to trace a wire property like defaultValue from its appearance in a SingleValuedEnumFieldSpec block to the family’s EnumValue type without navigating across the codebase.

Bindings MAY group cross-family abstractions (the Field union, the EmbeddedField union, the Value union, Cardinality, CatalogMetadata, SchemaArtifactVersioning, etc.) however they like; only family-specific code is constrained by this guideline.


5. The Reference TypeScript Binding

The reference TypeScript implementation is cedar-ts, published as @metadatacenter/cedar-model on npm. It is the source of truth for any TypeScript-specific idiom not covered explicitly in this document.

High-level structure (the src/ tree mirrors the grammar layering):

  • leaves/: primitive validators and typed leaves (Iri, LanguageTag, IsoDateTimeStamp, ASCII-id, BCP 47, SemVer, integer).
  • multilingual.ts: MultilingualString and LangString.
  • values/: the Value family. Each Value variant carries its family-specific content directly (lexical form, language tag, datatype, or boolean payload, as appropriate); there is no separate Literal layer.
  • identity.ts: artifact identifiers (FieldId, TemplateId, PresentationComponentId, TemplateInstanceId).
  • metadata/: LifecycleMetadata, SchemaArtifactVersioning, Annotation.
  • field-specs/: FieldSpec family.
  • fields.ts: Field family.
  • embedded/: EmbeddedField, EmbeddedTemplate, EmbeddedPresentationComponent, plus Cardinality, Property, LabelOverride, Visibility, ValueRequirement.
  • presentation/: PresentationComponent family.
  • instances/: TemplateInstance, FieldValue, NestedTemplateInstance.
  • template.ts: Template.
  • index.ts: public API surface.

Conventions adopted by cedar-ts (already documented in §2 above):

  • readonly on all interface properties; Object.freeze on invariant-bearing arrays.
  • A canonical xxxInit interface alongside each Xxx interface, giving the construction-time input shape that may differ from the output (e.g., accepts Iri | string where the output stores Iri).
  • A widening constructor function (e.g. cardinality(init), multilingualString(input)) per production.
  • A type guard (isXxx) per polymorphic production.
  • A single CedarConstructionError thrown for all construction-time invariant failures.

6. Open Issues per Language

Java.

  • For NonEmptyList<T>-style helper types used as a MultilingualString substrate (or anywhere a non-empty collection invariant must be enforced), prefer a plain final class or sealed interface over a record. Records lock the component layout into the canonical constructor signature, which constrains the API for static factories (NonEmptyList.of(t), NonEmptyList.of(t1, t2, …)), varargs construction, and any desire to implement List<T> directly — all easier on a final class.
  • Records cannot be null-rejected at the canonical constructor in a way Jackson respects without extra annotations; combining @JsonInclude(NON_NULL) with explicit checks in the canonical constructor body is the established pattern.

Python.

  • Pydantic v1 and v2 differ significantly in discriminated-union handling. Bindings SHOULD target v2; the recommendations in §2 use v2 exclusively.
  • Optional[T] vs T | None is style-only since Python 3.10; prefer T | None for new code.
  • enum.StrEnum requires Python 3.11+. Bindings targeting earlier versions SHOULD declare the enum with both str and enum.Enum as bases (class Status(str, Enum)).

All bindings.

  • JSON numbers exceeding 2^53 − 1 in NonNegativeInteger slots: the wire grammar allows the string-fallback encoding (wire-grammar.md §2.1, serialization.md §5.1). Bindings SHOULD use BigInt (TS), BigInteger (Java), or int (Python ints are unbounded) on the binding side. The current model does not have any use site that actually exercises this — length bounds, cardinality bounds, traversal depths, numeric precision are all small — but the encoder MUST be capable of emitting the string form when given an out-of-range value.
  • Round-trip ordering of optional properties within a tagged object is not significant; bindings MUST NOT rely on JSON property order for correctness (per serialization.md §4.7).
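A sketch of the string fallback on the encoder and decoder side follows. It assumes the fallback is the plain decimal digit string; the authoritative form is the one defined in wire-grammar.md §2.1 and serialization.md §5.1. The function names are hypothetical.

```python
import json
from typing import Union

MAX_SAFE_INTEGER = 2**53 - 1  # largest integer a JSON number is trusted to carry exactly

def encode_non_negative_integer(n: int) -> Union[int, str]:
    if isinstance(n, bool) or not isinstance(n, int) or n < 0:
        raise ValueError(f"not a non-negative integer: {n!r}")
    # ASSUMPTION: out-of-range values are emitted as their decimal digit string.
    return n if n <= MAX_SAFE_INTEGER else str(n)

def decode_non_negative_integer(raw: object) -> int:
    if isinstance(raw, bool):
        raise ValueError("booleans are not integers on the wire")
    if isinstance(raw, int):
        value = raw
    elif isinstance(raw, str) and raw.isascii() and raw.isdigit():
        value = int(raw)  # Python ints are unbounded, so no precision is lost
    else:
        raise ValueError(f"not a non-negative integer: {raw!r}")
    if value < 0:
        raise ValueError(f"not a non-negative integer: {raw!r}")
    return value

small = json.dumps({"max": encode_non_negative_integer(2**53 - 1)})
large = json.dumps({"max": encode_non_negative_integer(2**53)})
```

With these definitions, small serialises the bound as a JSON number and large serialises it as a JSON string.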

7. Reading wire-grammar.md as a Binding Implementer

A short cheat-sheet that maps wire-grammar.md notation to the meta-categories above, so an implementer encountering a production can quickly classify it:

| wire-grammar.md shape | Category |
| --- | --- |
| T ::: string / number / boolean / null | Primitive (or typed primitive wrapper, §2.4) |
| T ::: array<X> | Repeated component (§2.8) |
| T ::: nonEmptyArray<X> | Repeated component (§2.8); §2.5 for MultilingualString specifically |
| T ::: object { … } with no "kind": "..." literal property | Plain object production (§2.1) |
| T ::: object { … } with a "kind": "..." literal property | Member of a kind-discriminated union (§2.2) |
| T ::: A \| B \| … (default discriminator: kind) | Kind-discriminated union (§2.2) |
| T ::: A \| B \| … declared // discriminator: position | Position-discriminated union (§2.3) |
| T ::: "a" \| "b" \| … | String enum (§2.7) |
| T ::: SomeOtherProduction (collapsed wrapper, e.g. PreferredLabel ::: MultilingualString) | The wrapper carries no extra information; bind it as the inner type’s idiom. |

Optional components are marked with ? on the property (prop?: Type) — see §2.6. Inline //-comments declare constraints to enforce at construction (§2.9).


8. Cross-References

Instances

Overview

A TemplateInstance is an Artifact that conforms to a Template.

The structure of a TemplateInstance is determined by the embedded data-bearing artifacts of the referenced Template.

TemplateInstance

A TemplateInstance carries a TemplateInstanceId, a ModelVersion (the version of the CEDAR structural model the instance conforms to, hoisted to top-level on every concrete artifact), an ArtifactMetadata block, a reference to the Template it conforms to, and zero or more InstanceValue constructs.

TemplateInstance carries ArtifactMetadata rather than SchemaArtifactMetadata: instances do not carry schema versioning. The Template they reference fixes the schema version.

Each InstanceValue corresponds to an embedded artifact in the referenced Template that contributes data.

The template reference is persistent and provides the basis for validation and interpretation of instance content.

InstanceValue

InstanceValue has two forms:

  • FieldValue
  • NestedTemplateInstance

PresentationComponent does not correspond to any InstanceValue.

FieldValue

A FieldValue associates an EmbeddedArtifactKey with one or more values for an EmbeddedField.

The key identifies the embedding site within the containing Template, which allows the same referenced Field to appear in multiple contexts without ambiguity.

FieldValue may contain multiple values when the corresponding EmbeddedField permits multiplicity.

The permitted form of each contained value is determined by the FieldSpec of the referenced Field. Each value in FieldValue.values is a member of the Value polymorphic union and therefore carries a kind discriminator on the wire (per wire-grammar.md §1.5). A decoder reads kind to pick the union arm; the resulting arm MUST match the family expected by the referenced FieldSpec (e.g. a FieldValue for a TextFieldSpec carries TextValue entries with "kind": "TextValue").

For EnumFieldSpec, every contained value is a tagged EnumValue ({ "kind": "EnumValue", "value": "<Token>" }) whose value MUST equal the canonical Token of one of the referenced spec’s PermissibleValue entries. A SingleValuedEnumFieldSpec permits exactly one such EnumValue per FieldValue; a MultiValuedEnumFieldSpec permits one or more, subject to the embedding’s Cardinality.
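The enum rules above can be sketched as a check over decoded JSON values. The function name and its parameters are hypothetical, and the embedding's Cardinality check is deliberately left out of this sketch.

```python
from typing import Any

def check_enum_field_values(
    values: list[dict[str, Any]],
    permissible_tokens: frozenset[str],
    single_valued: bool,
) -> None:
    """Raise ValueError if the values violate the enum rules sketched here."""
    if not values:
        raise ValueError("a FieldValue carries one or more values")
    if single_valued and len(values) != 1:
        raise ValueError("a single-valued enum permits exactly one EnumValue")
    for entry in values:
        if entry.get("kind") != "EnumValue":
            raise ValueError(f"expected kind 'EnumValue', got {entry.get('kind')!r}")
        token = entry.get("value")
        # The token must equal one of the spec's permissible value tokens.
        if not isinstance(token, str) or token not in permissible_tokens:
            raise ValueError(f"token {token!r} is not a permissible value")

tokens = frozenset({"red", "green", "blue"})
check_enum_field_values([{"kind": "EnumValue", "value": "red"}], tokens, single_valued=True)
```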

Defaults are not part of instances

A TemplateInstance records the values a user supplied; it does not record default values. Defaults specified at the field-level (XxxFieldSpec.defaultValue) or embedding-level (EmbeddedXxxField.defaultValue) are UI/UX initialisation only — they pre-populate the form a user fills in, but the resulting instance carries the user’s chosen value as if the user had typed it in by hand.

Two consequences:

  • A user who accepts a default without modification produces a FieldValue carrying that value verbatim. From the instance’s perspective the default and a user-supplied identical value are indistinguishable.
  • A user who supplies no value (and the field is not required) produces no FieldValue for that key. The default does not appear by virtue of having existed; absence in the instance means absence, not “use the default.”

This matters for downstream consumers: the absence of a FieldValue for a given EmbeddedField is unambiguous evidence that no value was supplied, and never an implicit reference to a default.

See also grammar.md §Defaults and serialization.md §6.8.

NestedTemplateInstance

A NestedTemplateInstance associates an EmbeddedArtifactKey with nested InstanceValue constructs corresponding to an EmbeddedTemplate.

This provides recursive instance structure aligned with recursive template structure.

Conformance

A TemplateInstance MUST conform to the structure implied by its referenced Template.

A conforming instance MUST use EmbeddedArtifactKey values that identify embedded data-bearing artifacts in that template context.

Textual instance values MAY include language tags.

TextValue carries a lexical form and an optional language tag.

Numeric instance values carry a lexical form together with the corresponding XSD datatype: IntegerNumberValue is fixed at xsd:integer; RealNumberValue carries an explicit datatype (xsd:decimal, xsd:float, or xsd:double).

Date, time, and date-time instance values are represented separately by DateValue, TimeValue, and DateTimeValue, each carrying its own lexical form. Within DateValue, YearValue and YearMonthValue carry plain strings matching YYYY and YYYY-MM respectively; FullDateValue carries an xsd:date lexical form.
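
As a rough illustration of the two plain-string shapes, the following sketch checks candidate lexical forms for YearValue and YearMonthValue. The function names are hypothetical, and restricting the month to 01 through 12 is an assumption; the text above only states the YYYY and YYYY-MM shapes.

```python
import re

# ASCII digits only; Python's \d would also accept non-ASCII decimal digits.
YEAR_PATTERN = re.compile(r"[0-9]{4}")
# Assumption: months are limited to 01..12.
YEAR_MONTH_PATTERN = re.compile(r"[0-9]{4}-(0[1-9]|1[0-2])")


def is_year_lexical_form(text: str) -> bool:
    """True when text has the YYYY shape."""
    return YEAR_PATTERN.fullmatch(text) is not None


def is_year_month_lexical_form(text: str) -> bool:
    """True when text has the YYYY-MM shape."""
    return YEAR_MONTH_PATTERN.fullmatch(text) is not None
```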

Controlled term instance values SHOULD preserve both a term Iri and a human-readable label. They MAY additionally preserve notation and preferred label information from the source terminology.

External authority instance values SHOULD preserve both the typed authority IRI (OrcidIri, RorIri, DoiIri, PubMedIri, RridIri, or NihGrantIri as appropriate) and, where available, a human-readable label.

Open Questions

  • Should TemplateInstance permit partial conformance during authoring workflows, or should the model define only fully conforming instances?

Presentation Components

Overview

PresentationComponent defines reusable presentation or instructional content that may appear within a Template through EmbeddedPresentationComponent.

PresentationComponent is distinct from Field and MUST NOT be treated as a data-bearing schema construct. It is also distinct from SchemaArtifact: presentation components carry plain ArtifactMetadata rather than SchemaArtifactMetadata, since they do not participate in schema versioning.

Artifact shape

Every concrete PresentationComponent carries the following common slots:

  • PresentationComponentId — the artifact’s identity IRI.
  • ModelVersion — the version of the CEDAR structural model the artifact conforms to (hoisted to top-level on every concrete artifact).
  • ArtifactMetadata — the artifact’s name, description, lifecycle, optional annotations, etc.
  • a per-variant body: the substantive content of the component (HTML, image IRI, video IRI, or — for the structural break components — empty).

Defined Components

This specification defines the following PresentationComponent variants:

| Variant | Body |
| --- | --- |
| RichTextComponent | HtmlContent (an HTML string for rendered presentation) |
| ImageComponent | Iri for the image source, with optional Label and Description |
| YoutubeVideoComponent | Iri for the video source, with optional Label and Description |
| SectionBreakComponent | (no body); contributes sectional separation in a rendered form |
| PageBreakComponent | (no body); contributes pagination structure |

These constructs replace the older practice of treating static presentation constructs as field variants.

Embedding

Presentation constructs appear in a Template only through EmbeddedPresentationComponent.

An EmbeddedPresentationComponent carries:

  • EmbeddedArtifactKey — the local key identifying this embedding within the containing Template.
  • PresentationComponentId — the artifactRef to the reusable PresentationComponent being embedded.
  • optional Visibility — the rendering visibility of the embedded component.

It does not carry a value requirement, cardinality, default value, label override, or semantic property IRI: the component contributes no instance data and exists purely to contribute presentational structure.

Instance Semantics

PresentationComponent does not produce InstanceValue.

Conforming implementations MUST NOT create FieldValue, NestedTemplateInstance, or any other InstanceValue for a PresentationComponent. The EmbeddedArtifactKey of an EmbeddedPresentationComponent MUST NOT appear as the key of any InstanceValue in a conforming TemplateInstance.
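
One way to check the second requirement is a simple key intersection. This is a sketch under the assumption that both key sets have already been collected from the TemplateInstance and the Template; the function name is hypothetical.

```python
from typing import Iterable


def misplaced_presentation_keys(instance_value_keys: Iterable[str],
                                presentation_component_keys: Iterable[str]) -> list[str]:
    """Return the EmbeddedPresentationComponent keys that wrongly appear
    as InstanceValue keys. An empty list means the rule is satisfied."""
    return sorted(set(instance_value_keys) & set(presentation_component_keys))
```

For example, an instance keyed by `title` and `intro`, checked against a template whose presentation components are keyed `intro` and `page_break`, reports `["intro"]`.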

Help-Text Rendering

This section is normative for conforming form renderers. The structural model carries help-text content on the Field artifact (HelpText) and optional per-embedding overrides on EmbeddedField (HelpTextOverride). How that content is presented at form-render time is governed by the enclosing Template’s HelpDisplayMode.

Effective help-text resolution

At each EmbeddedField site, the effective help text is determined as follows:

  1. If the EmbeddedField carries a HelpTextOverride: the effective help text is the override’s value.
  2. Otherwise, if the referenced Field carries a HelpText: the effective help text is the field’s value.
  3. Otherwise, the effective help text is empty; the renderer displays no help for this field regardless of HelpDisplayMode.

The override is replace, not merge: localizations present in the field’s HelpText but absent from the embedding’s HelpTextOverride do not fall back into the resolved content.
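
The three-step resolution can be sketched as follows. The representation of a MultilingualString as a dict from language tag to text is an assumption made for illustration, as is the function name.

```python
from typing import Optional

# Assumed representation: language tag -> localized text.
MultilingualString = dict[str, str]


def effective_help_text(override: Optional[MultilingualString],
                        field_help: Optional[MultilingualString]) -> MultilingualString:
    """Resolve help text at one EmbeddedField site.

    The override replaces the field's HelpText wholesale; localizations are
    never merged across the two sources.
    """
    if override is not None:
        return override
    if field_help is not None:
        return field_help
    return {}  # empty: the renderer displays no help for this field
```

With an override of `{"en": "Short title"}` and a field HelpText of `{"en": "Title", "fr": "Titre"}`, the result is `{"en": "Short title"}`; the French entry does not fall back in.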

Display-mode selection

The presentation of effective help text at a given embedding site is governed by the HelpDisplayMode resolved per the cascade rule below:

  • "inline" — render the effective help text as visible text adjacent to the field, typically beneath the input. This is the default when no mode is set.
  • "tooltip" — render as a hover/focus tooltip, triggered by a ? icon or other discoverable affordance. Conforming renderers MUST also make the text available to assistive technologies.
  • "both" — emit both the inline rendering and the tooltip rendering. Recommended for accessibility-sensitive contexts where redundancy is preferred.
  • "none" — do not render the effective help text at form-render time. The content remains part of the model and is available to alternative renderers (catalog browsers, RDF projectors, etc.).

Cascade rule for nested templates

HelpDisplayMode cascades from the outermost Template in a form to every field rendered within that form, including fields contributed by nested templates referenced via EmbeddedTemplate. Specifically:

  • When a Template T_outer embeds another Template T_inner via EmbeddedTemplate, the renderer MUST use T_outer’s HelpDisplayMode (or its default if unset) when rendering fields contributed by T_inner.
  • T_inner’s own HelpDisplayMode is ignored for help-text rendering at that embedding site.
  • T_inner’s HelpDisplayMode applies only when T_inner is rendered standalone (e.g., previewed in authoring tooling as a reusable artifact, or used as the top-level template in another context).

This rule is specific to HelpDisplayMode. Future TemplateRenderingHint slots may define different cascade behaviour and MUST state their cascade rule explicitly.

When HelpDisplayMode is absent from a Template — either because the template carries no TemplateRenderingHint, or because the hint omits the slot — the resolved mode is "inline".
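
A minimal sketch of the cascade, assuming the renderer tracks the chain of templates from the outermost form template down to the one that contributes the field. The function name and the list representation are illustrative assumptions.

```python
from typing import Optional, Sequence

DEFAULT_HELP_DISPLAY_MODE = "inline"


def resolve_help_display_mode(template_modes: Sequence[Optional[str]]) -> str:
    """Resolve the HelpDisplayMode for a field.

    template_modes[0] is the outermost Template's HelpDisplayMode (None when
    unset); later entries belong to nested templates and are ignored.
    """
    if not template_modes:
        raise ValueError("at least the outermost template is required")
    outermost = template_modes[0]
    return outermost if outermost is not None else DEFAULT_HELP_DISPLAY_MODE
```

For a field contributed by a nested template, `resolve_help_display_mode(["tooltip", "none"])` gives `"tooltip"`, and `resolve_help_display_mode([None, "both"])` gives `"inline"`.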

Placeholder Rendering

This section is normative for conforming form renderers. The structural model carries placeholder content on rendering-hint productions attached to text-entry-capable field families (see grammar.md §Field Specs and §Rendering Hints, and the new Placeholder production). How that content is presented at form-render time is governed by the rules below.

What Placeholder is

Placeholder is a MultilingualString-valued slot on every rendering hint attached to a text-entry-capable field family. It carries sample input text — typically a short format demonstration such as "YYYY-MM-DD", "john.doe@example.com", or "https://orcid.org/0000-0000-0000-0000" — intended to be displayed inside an empty text-entry widget and to disappear once the user begins typing.

Placeholder is not semantic content about the field’s meaning; that is the role of HelpText. The two slots may coexist on the same field: HelpText explains what the field is for, Placeholder demonstrates what the typed input looks like.

Rendering requirements

Conforming renderers:

  • SHOULD display the effective Placeholder content inside text-entry input widgets when those inputs are empty.
  • MUST NOT display Placeholder content in a way that could be mistaken for a user-supplied value. Placeholders MUST be visually distinguishable from real input — conventionally via reduced opacity, italics, or a contrasting style.
  • MAY omit Placeholder rendering when accessibility concerns warrant (some screen readers handle the HTML placeholder attribute poorly). When Placeholder is omitted from the visual rendering, the renderer SHOULD ensure the same content is available through HelpText or another accessible affordance if it conveys information the user otherwise lacks.

Localization selection

When Placeholder carries multiple language-tagged localizations, the renderer selects the entry whose LanguageTag best matches the user’s preferred display language, falling back per the spec’s existing MultilingualString-localization-preference rules. No new rule is introduced for Placeholder; it follows the same selection convention as HelpText, PreferredLabel, and other MultilingualString-valued display content.

Relationship to value validation

Placeholder content is purely presentational. It is not validated against the field spec’s value constraints (validationRegex, langTagRequirement, timezoneRequirement, minLength, maxLength, etc.). A placeholder of "YYYY-MM-DD" may appear on a date field whose values are constrained to ISO 8601 — the placeholder is a demonstration of the expected lexical shape, not an instance of one. Conforming validators MUST NOT apply field-spec value constraints to placeholder content.

Open Questions

  • Model revision candidate: The current model requires all PresentationComponent variants to carry full reusable artifact identity. This is uniform but may be unnecessarily heavy for simple structural elements such as PageBreakComponent, which carry no meaningful content and are unlikely to be shared across templates. A future revision should consider whether lightweight inline-only variants could be introduced for such cases, and define the criteria for determining which components warrant reusable identity.
  • Which presentation-specific properties belong on the reusable PresentationComponent versus on EmbeddedPresentationComponent?

Field Families

This is a navigation index for the 20 concrete field families defined by the spec. Each row links to the family’s four principal productions in grammar.md:

  • Field artifact — the family’s XxxField, a standalone reusable artifact.
  • Field spec — the family’s XxxFieldSpec, carried by the standalone Field artifact.
  • Value — the family’s instance-value production (XxxValue), carried by a FieldValue in a TemplateInstance.
  • Embedded form — the family’s EmbeddedXxxField, used inside a Template’s members to reference the standalone field.

The conformance fixture column points at the per-family Template + Instance pair under normative-tests/valid/ and the standalone Field artifact.

Scalar text and numeric

Temporal

| Family | Field artifact | Field spec | Value | Embedded form | Fixtures |
| --- | --- | --- | --- | --- | --- |
| Date | DateField | DateFieldSpec | DateValue (three arms: FullDateValue, YearValue, YearMonthValue) | EmbeddedDateField | 1318, 54 |
| Time | TimeField | TimeFieldSpec | TimeValue | EmbeddedTimeField | 1920, 55 |
| Date-time | DateTimeField | DateTimeFieldSpec | DateTimeValue | EmbeddedDateTimeField | 2122, 56 |

Controlled vocabulary

Reference and contact

External-authority identifiers

Open-ended

| Family | Field artifact | Field spec | Value | Embedded form | Fixtures |
| --- | --- | --- | --- | --- | --- |
| Attribute value | AttributeValueField | AttributeValueFieldSpec | AttributeValue | EmbeddedAttributeValueField | 4748, 72 |

Notes on the groupings

The groupings above are presentational, not normative. The spec does not partition field families into categories at the grammar level; every family is structurally a peer of every other under FieldSpec. The groupings here exist only to make this index easier to scan.

Four families deviate from the common embedding shape, as described in serialization.md §9. These deviations are worth recalling at the embedding site:

  • EmbeddedBooleanField and EmbeddedSingleValuedEnumField omit cardinality (single-valued by construction).
  • EmbeddedMultiValuedEnumField.defaultValue is EnumValue* (a sequence).
  • EmbeddedAttributeValueField omits defaultValue.

The six external-authority identifier families (ORCID, ROR, DOI, PubMed, RRID, NIH grant) all share an identical XxxFieldSpec shape — { "kind": "<Family>FieldSpec" } with no per-family slots — but distinct value productions carrying typed IRIs.

Index of Productions

An alphabetical index of every production defined in this specification. The contents of this page are generated automatically from the EBNF blocks across all chapters.

CTM 1.6.0 Serialization Mapping

1. Purpose

This document specifies a one-directional, function-based mapping from the CEDAR Structural Model (defined in spec/grammar.md) to CTM 1.6.0 JSON-LD format. The Structural Model remains the authoritative definition of the model; this document defines how constructs in that model are encoded as CTM 1.6.0 JSON-LD values. Each encoding function takes one or more abstract grammar constructs as arguments and produces a JSON value. The functions are defined precisely enough to be directly implementable.

What CTM 1.6.0 is

CTM 1.6.0 (CEDAR Template Model version 1.6.0) is the concrete JSON-LD format used by the CEDAR Workbench to store and exchange metadata templates and their filled-in instances. A CTM 1.6.0 document is a JSON object that simultaneously serves three roles: it is a JSON-LD document (it carries @context, @id, and @type for RDF interpretation), a JSON Schema document (it carries $schema, type, properties, and required so that conforming instances can be validated), and a CEDAR-specific descriptor (it carries _valueConstraints and _ui keys understood by CEDAR tooling). These three concerns are all mixed into the same flat JSON object rather than being kept separate.

The abstract model vs. the serialization

The CEDAR Structural Model (spec/grammar.md) is the authoritative, format-independent definition of what a template means. It describes templates, fields, embedded artifacts, and instances in abstract terms — without committing to any particular wire format. This document defines how to translate that abstract model into CTM 1.6.0 JSON-LD. The mapping is one-directional (abstract → concrete) and lossy in places: some Structural Model constructs have no CTM 1.6.0 equivalent and are dropped (see Section 14).

Caution: This mapping is not round-trippable. Encoding a Structural Model construct to CTM 1.6.0 and then decoding back will not always recover the original construct. See Section 14 for a full list of known gaps and lossy areas before implementing.

Key structural ideas

Templates and fields are separate reusable artifacts. In the Structural Model, a Template does not contain Field objects directly — it contains EmbeddedField references that point to separately-defined Field artifacts. When encoding, information from both the embedding (EmbeddedField) and the referenced definition (Field) must be combined. Most field schema content (value shape, value constraints, UI hints) ends up inside the template’s "properties" object, keyed by the embedding’s key identifier.

Embedded artifact information is distributed across four top-level keys. For each field or nested template in a template, there is no single output key that corresponds to it. Instead its information is spread across "properties" (the field schema), "required" (whether it is mandatory), "_ui" (display order and label overrides), and "@context" (the property IRI mapping for JSON-LD). Understanding this distribution is essential to reading the encoding functions correctly.

The field object carries both schema structure and rendering hints. Each field’s entry inside "properties" is itself a JSON object that combines JSON Schema structure (what type of value the field holds, expressed via "properties", "required", "additionalProperties") with CTM-specific keys ("_valueConstraints" for validation rules such as required/optional, numeric type, or controlled term sources; "_ui" for rendering instructions such as input type and visibility). These are merged into a single flat field object.

Instance values are plain JSON-LD objects. A template instance is a flat JSON object whose keys are the field key identifiers from the template. Each key maps to a small JSON-LD value object — typically { "@value": "..." } for text and numeric fields, or { "@id": "..." } for IRI-valued fields. Multi-valued fields produce a JSON array of such objects. The template’s "@context" is reused in the instance so that each field key resolves to its property IRI for RDF interpretation.

Call graph

The diagram below shows the main call relationships between encoding functions. Nodes marked ×N represent a group of similar functions; see the relevant section for the individual entries. Dashed arrows indicate recursion.

flowchart TD
    ET(["encode_template"])
    ETI(["encode_template_instance"])

    subgraph S5["§5 · Metadata"]
        EAM["encode_artifact_metadata"]
        ECM["encode_catalog_metadata"]
        ETPR["encode_temporal_provenance"]
        ESV["encode_schema_artifact_versioning"]
        EAM --> ECM
        EAM --> ETPR
        EAM --> ESV
    end

    subgraph S6["§6 · Template Structure"]
        ETC["encode_template_context"]
        ETP["encode_template_properties"]
        ETR["encode_template_required"]
        ETUI["encode_template_ui"]
        ETC --> EPCE["encode_property_context_entry"]
    end

    subgraph S7["§7 · Embedded Artifacts"]
        EEAS["encode_embedded_artifact_schema"]
        EEFS["encode_embedded_field_schema"]
        EETS["encode_embedded_template_schema"]
        EEPCS["encode_embedded_presentation\n_component_schema"]
        EEAS --> EEFS
        EEAS --> EETS
        EEAS --> EEPCS
    end

    subgraph S8["§8 · Field"]
        EF["encode_field"]
    end

    subgraph S9["§9 · Field Specs"]
        EFT["encode_*_field_spec ×13"]
        EEC["encode_embedding_constraints"]
        EEUI["encode_embedding_ui"]
        EFT --> EEC
        EFT --> EEUI
    end

    ETE["encode_template_element §10"]

    subgraph S11["§11 · Values"]
        EV["encode_value"]
        EVX["encode_*_value ×12"]
        EV --> EVX
    end

    subgraph S12["§12 · Instance"]
        EFV["encode_field_value"]
        ENTS["encode_nested_template_instance_slot"]
    end

    ET --> EAM
    ET --> ETC
    ET --> ETP
    ET --> ETR
    ET --> ETUI
    ETP --> EEAS
    EEFS --> EF
    EETS --> ETE
    ETE -->|"reuses"| ETC
    ETE -->|"reuses"| ETP
    ETE -->|"reuses"| ETR
    ETE -->|"reuses"| ETUI
    ETE --> EAM
    EF --> EAM
    EF -->|"dispatch"| EFT

    ETI --> ETC
    ETI --> EAM
    ETI --> EFV
    ETI --> ENTS
    EFV --> EV
    ENTS -.->|"recursive"| ETI

2. Conventions

Function Signature Form

encode_X(x: X) → JSON-kind

X is a grammar production name, x is the parameter, and JSON-kind is one of: Object, String, Array, Number, Boolean, or null.

JSON Notation in Function Bodies

  • { k₁: v₁, k₂: v₂ } — JSON object literal
  • [ v₁, v₂ ] — JSON array
  • "..." — literal string
  • null — JSON null
  • omit — the key is absent from the output (not even present as null)

Accessor Notation

Dot notation is used on grammar constructs, e.g. T.schema_artifact_metadata or E.embedded_artifact_key. Where a grammar construct wraps a primitive string (e.g. Identifier ::= identifier(string)), write D.identifier.string to reach the string value.

Helper Functions

  • key(E) — the ASCII identifier string of E’s EmbeddedArtifactKey; defined as E.embedded_artifact_key.ascii_identifier
  • iri(I) — the IRI string of an Iri construct; defined as I.iri_string
  • merge(a, b, ...) — merge JSON objects left-to-right; later objects take precedence on key conflicts
  • if P then k: v — include key k with value v only when predicate P holds; otherwise omit
  • [ x(E) for each E in xs ] — JSON array built by evaluating x(E) for each element E of sequence xs
  • { k(E): v(E) for each E in xs } — JSON object built from key-value pairs, one per element (inline form); xs must be a plain sequence with no inline filter. In multi-line blocks, the iteration clause comes first: { for each E in xs: k(E): v(E) }
  • let x = expr — within a function body, binds the name x to the value of expr; x may then be used in subsequent expressions in the same body
  • [ E in xs | P(E) ] — the subsequence of xs retaining only those elements for which predicate P(E) holds; used in let bindings to pre-filter before passing to a comprehension
  • xs ++ ys — concatenation of arrays xs and ys
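
For readers implementing the mapping, two of these helpers translate directly into Python. This is a sketch only: the `OMIT` sentinel and `drop_omitted` are hypothetical names introduced here to model the difference between an absent key and a key present with null.

```python
def merge(*objects: dict) -> dict:
    """Merge JSON objects left-to-right; later objects win on key conflicts."""
    result: dict = {}
    for obj in objects:
        result.update(obj)
    return result


# Sentinel for "omit": the key is absent from the output, not present as null.
OMIT = object()


def drop_omitted(obj: dict) -> dict:
    """Remove keys whose value is OMIT; keys holding None (JSON null) stay."""
    return {key: value for key, value in obj.items() if value is not OMIT}
```

For instance, `merge({"a": 1, "b": 2}, {"b": 3})` yields `{"a": 1, "b": 3}`, and `drop_omitted({"x": None, "y": OMIT})` yields `{"x": None}`.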

Cardinality Helper

  • is_multi(E) — true if E.cardinality is present and max_cardinality is either UnboundedCardinality or a NonNegativeInteger greater than 1

Default Conventions

  • When Cardinality is absent, effective min = 1, effective max = 1 (single-valued).
  • When ValueRequirement is absent, effective requirement is Optional.
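
The cardinality helper and the two defaults can be sketched as below. The tuple representation of Cardinality, the `UNBOUNDED` marker, and the string `"Optional"` are assumptions made for the example rather than names taken from the grammar.

```python
from typing import Optional, Union

# Assumed marker for UnboundedCardinality.
UNBOUNDED = "UnboundedCardinality"
# Assumed representation: (min_cardinality, max_cardinality).
Cardinality = tuple[int, Union[int, str]]


def is_multi(cardinality: Optional[Cardinality]) -> bool:
    """True when cardinality is present and its max is unbounded or above 1."""
    if cardinality is None:
        return False
    _, maximum = cardinality
    return maximum == UNBOUNDED or (isinstance(maximum, int) and maximum > 1)


def effective_cardinality(cardinality: Optional[Cardinality]) -> Cardinality:
    """Absent Cardinality means min = 1, max = 1 (single-valued)."""
    return cardinality if cardinality is not None else (1, 1)


def effective_requirement(requirement: Optional[str]) -> str:
    """Absent ValueRequirement means Optional."""
    return requirement if requirement is not None else "Optional"
```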

3. Worked Example

This section traces a minimal template and a corresponding instance through the encoding functions. The goal is to show concretely what the abstract model constructs look like as CTM 1.6.0 JSON-LD, and which functions are responsible for each part of the output.

3.1 The Example Model

Template — “Sample Record”

| Property | Value |
| --- | --- |
| template_id | https://repo.example.org/templates/sample-record |
| Name | "Sample Record" |
| Description | "A minimal metadata template for biological samples" |
| Version | 1.0.0 |
| Status | "draft" |
| Model version | 1.6.0 |
| Created / modified | 2024-01-15T10:00:00Z by https://orcid.example.org/0000-0001-2345-6789 |

Two embedded fields:

| Key | Property IRI | ValueRequirement | FieldSpec |
| --- | --- | --- | --- |
| title | https://schema.org/name | "required" | TextFieldSpec (single line) |
| count | https://example.org/sampleCount | Optional | IntegerNumberFieldSpec |

Instance — “Sample 42”

| Property | Value |
| --- | --- |
| template_instance_id | https://repo.example.org/instances/abc123 |
| schema:name | "Sample 42" |
| Based on | the template above |
| Created / modified | 2024-03-10T09:30:00Z by https://orcid.example.org/0000-0001-2345-6789 |
| title value | TextValue "Mouse Sample 42" |
| count value | NumericValue 5 (xsd:integer) |

3.2 Encoding the Template

encode_template(T) assembles the output by calling several sub-functions and merging their results. The annotations below identify the responsible function for each part.

{
  // encode_template — fixed identity and schema keys
  "@id":    "https://repo.example.org/templates/sample-record",
  "@type":  "https://schema.metadatacenter.org/core/Template",
  "$schema": "http://json-schema.org/draft-04/schema#",
  "type":   "object",
  "title":  "Sample Record",
  "description": "A minimal metadata template for biological samples",
  "additionalProperties": false,

  // encode_template_context — STANDARD_NS plus one entry per embedded field
  // that carries a Property (both do here); encode_property_context_entry
  // returns a plain IRI string when no property_label is present
  "@context": {
    "schema":   "http://schema.org/",
    "pav":      "http://purl.org/pav/",
    "oslc":     "http://open-services.net/ns/core#",
    "bibo":     "http://purl.org/ontology/bibo/",
    "rdfs":     "http://www.w3.org/2000/01/rdf-schema#",
    "skos":     "http://www.w3.org/2004/02/skos/core#",
    "xsd":      "http://www.w3.org/2001/XMLSchema#",
    "title":    "https://schema.org/name",
    "count":    "https://example.org/sampleCount"
  },

  // encode_template_properties — fixed instance-metadata entries followed
  // by one entry per embedded artifact (encode_embedded_field_schema for each)
  "properties": {
    "@context":           { "type": ["object", "null"] },
    "@id":                { "type": "string", "format": "uri" },
    "schema:isBasedOn":   { "type": "string", "format": "uri" },
    "schema:name":        { "type": "string" },
    "schema:description": { "type": ["string", "null"] },
    "pav:createdOn":      { "type": ["string", "null"], "format": "date-time" },
    "pav:createdBy":      { "type": ["string", "null"], "format": "uri" },
    "pav:lastUpdatedOn":  { "type": ["string", "null"], "format": "date-time" },
    "oslc:modifiedBy":    { "type": ["string", "null"], "format": "uri" },
    "title": { /* encode_embedded_field_schema — see Section 3.3 */ },
    "count": { /* encode_embedded_field_schema — see Section 3.3 */ }
  },

  // encode_template_required — fixed keys plus "title" (the only required field)
  "required": [
    "@context", "@id", "schema:isBasedOn", "schema:name",
    "schema:description", "pav:createdOn", "pav:createdBy",
    "pav:lastUpdatedOn", "oslc:modifiedBy",
    "title"
  ],

  // encode_template_ui — order reflects embedded_artifacts sequence
  "_ui": { "order": ["title", "count"] },

  // encode_artifact_metadata — metadata keys merged at top level
  "schema:name":        "Sample Record",
  "schema:description": "A minimal metadata template for biological samples",
  "pav:version":        "1.0.0",
  "bibo:status":        "bibo:draft",
  "schema:schemaVersion": "1.6.0",
  "pav:createdOn":      "2024-01-15T10:00:00Z",
  "pav:createdBy":      "https://orcid.example.org/0000-0001-2345-6789",
  "pav:lastUpdatedOn":  "2024-01-15T10:00:00Z",
  "oslc:modifiedBy":    "https://orcid.example.org/0000-0001-2345-6789"
}

3.3 Encoding the Embedded Fields

Both fields are single-valued (is_multi = false), so encode_embedded_field_schema returns the field object directly with no array wrapper.

title field: encode_text_field_spec applies STRING_VALUE_SHAPE. encode_embedding_constraints sets requiredValue: true (the embedding is "required"). encode_text_rendering_hint returns "textfield" (absent hint defaults to single-line).

{
  "@id":    "https://repo.example.org/fields/title",
  "@type":  "https://schema.metadatacenter.org/core/TemplateField",
  "@context": { /* STANDARD_NS */ },
  "$schema": "http://json-schema.org/draft-04/schema#",
  "type":   "object",
  "title":  "Title",
  "description": "",
  "properties": {
    "@type":  { "oneOf": [{ "type": "string", "format": "uri" }, { "type": "null" }] },
    "@value": { "type": ["string", "null"] }
  },
  "required": ["@value"],
  "additionalProperties": false,
  "_valueConstraints": { "requiredValue": true },
  "_ui":   { "inputType": "textfield" },
  // encode_artifact_metadata for the field:
  "schema:name": "Title", "schema:description": null,
  "pav:version": "1.0.0", "bibo:status": "bibo:draft",
  "schema:schemaVersion": "1.6.0",
  "pav:createdOn": "2024-01-15T10:00:00Z", ...
}

count field: encode_integer_number_field_spec applies NUMBER_VALUE_SHAPE and emits "xsd:integer" for the datatype slot (an integer-number field’s category is fixed). encode_embedding_constraints sets requiredValue: false (Optional).

{
  "@id":    "https://repo.example.org/fields/count",
  "@type":  "https://schema.metadatacenter.org/core/TemplateField",
  "@context": { /* STANDARD_NS */ },
  "$schema": "http://json-schema.org/draft-04/schema#",
  "type":   "object",
  "title":  "Sample Count",
  "description": "",
  "properties": {
    "@type":  { "oneOf": [{ "type": "string", "format": "uri" }, { "type": "null" }] },
    "@value": { "type": ["number", "null"] }
  },
  "required": ["@value"],
  "additionalProperties": false,
  "_valueConstraints": { "requiredValue": false, "numberType": "xsd:integer" },
  "_ui":   { "inputType": "numeric" },
  "schema:name": "Sample Count", ...
}

3.4 Encoding the Instance

encode_template_instance(I, T) reuses the template context and maps each FieldValue using encode_field_value → encode_value.

{
  // reuses encode_template_context(T) — same @context as the template
  "@context": {
    "schema": "http://schema.org/", /* ... STANDARD_NS ... */
    "title":  "https://schema.org/name",
    "count":  "https://example.org/sampleCount"
  },
  "@id":              "https://repo.example.org/instances/abc123",
  "schema:isBasedOn": "https://repo.example.org/templates/sample-record",

  // encode_artifact_metadata
  "schema:name":        "Sample 42",
  "schema:description": null,
  "pav:createdOn":      "2024-03-10T09:30:00Z",
  "pav:createdBy":      "https://orcid.example.org/0000-0001-2345-6789",
  "pav:lastUpdatedOn":  "2024-03-10T09:30:00Z",
  "oslc:modifiedBy":    "https://orcid.example.org/0000-0001-2345-6789",

  // encode_field_value → encode_text_value (no language tag)
  "title": { "@value": "Mouse Sample 42" },

  // encode_field_value → encode_integer_number_value
  "count": { "@value": "5", "@type": "xsd:integer" }
}
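
The two value objects in the instance above can be reproduced with a small sketch. The function bodies are inferred from this worked example only; the parameter names and the `multi` flag are assumptions, and language-tagged text values are not covered.

```python
def encode_text_value(lexical_form: str) -> dict:
    """Text value without a language tag, as in the example above."""
    return {"@value": lexical_form}


def encode_integer_number_value(lexical_form: str) -> dict:
    """Integer values keep their lexical form as a string plus the datatype."""
    return {"@value": lexical_form, "@type": "xsd:integer"}


def encode_field_value(encoded_values: list[dict], multi: bool):
    """Multi-valued fields yield a JSON array; single-valued ones a bare object."""
    if multi:
        return list(encoded_values)
    return encoded_values[0]


instance_fragment = {
    "title": encode_field_value([encode_text_value("Mouse Sample 42")], multi=False),
    "count": encode_field_value([encode_integer_number_value("5")], multi=False),
}
```

Passing `multi=True` for a field whose embedding is multi-valued would wrap the same value objects in an array instead.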

4. Standard Namespace Context Object

STANDARD_NS is the following JSON object. It is included in every @context produced by this mapping.

{
  "schema":   "http://schema.org/",
  "pav":      "http://purl.org/pav/",
  "oslc":     "http://open-services.net/ns/core#",
  "bibo":     "http://purl.org/ontology/bibo/",
  "rdfs":     "http://www.w3.org/2000/01/rdf-schema#",
  "skos":     "http://www.w3.org/2004/02/skos/core#",
  "xsd":      "http://www.w3.org/2001/XMLSchema#"
}

STATIC_FIELD_NS is the smaller @context used by StaticTemplateField objects (presentation components). It omits rdfs, skos, and xsd.

{
  "schema":   "http://schema.org/",
  "pav":      "http://purl.org/pav/",
  "bibo":     "http://purl.org/ontology/bibo/",
  "oslc":     "http://open-services.net/ns/core#"
}
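
As an illustration of how STANDARD_NS feeds a template's @context in the worked example of Section 3.2, the sketch below merges it with one entry per embedded field that carries a Property. The dict-based signature is an assumption, and only the plain-IRI case (no property_label) is shown.

```python
STANDARD_NS = {
    "schema": "http://schema.org/",
    "pav": "http://purl.org/pav/",
    "oslc": "http://open-services.net/ns/core#",
    "bibo": "http://purl.org/ontology/bibo/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}


def encode_template_context(property_iris: dict[str, str]) -> dict[str, str]:
    """Build an @context: STANDARD_NS first, then key -> property IRI entries.

    property_iris maps each embedding key to its property IRI (assumed input
    shape); embeddings without a Property are simply not in the mapping.
    """
    return {**STANDARD_NS, **property_iris}


context = encode_template_context({
    "title": "https://schema.org/name",
    "count": "https://example.org/sampleCount",
})
```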

5. Metadata Encoding Functions

encode_artifact_metadata(A: Artifact) → Object

CTM 1.6.0 artifacts carry both human-readable metadata and (for schema artifacts) versioning information at the top level of their JSON object. The Structural Model factors these concerns differently: CatalogMetadata carries descriptive properties and lifecycle, SchemaArtifactVersioning is a parallel top-level slot on schema artifacts, and Label/Title are rendered-name slots that live as top-level slots on the artifact itself (Field.label, Template.title, optional TemplateInstance.label).

The CTM 1.6.0 encoder flattens all of these into a single flat property set on the artifact’s JSON object:

merge(
  encode_catalog_metadata(A.catalog_metadata, rendered_name_of(A)),
  encode_temporal_provenance(A.catalog_metadata.lifecycle),
  A is SchemaArtifact ? encode_schema_artifact_versioning(A.versioning) : {}
)

Where rendered_name_of(A) selects the artifact’s rendered display name according to the artifact kind:

  • For a Field: A.label (always present).
  • For a Template: A.title (always present).
  • For a TemplateInstance: A.label if present, otherwise A.catalog_metadata.preferred_label if present, otherwise the artifact id slug.
  • For a PresentationComponent: A.catalog_metadata.preferred_label if present, otherwise the artifact id slug.

Calls: encode_catalog_metadata, encode_temporal_provenance, encode_schema_artifact_versioning
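
The rendered_name_of selection can be sketched as follows, assuming artifacts are plain dicts with a "kind" entry. The dict layout and the `id_slug` helper are hypothetical; in particular, taking the last path segment of the artifact IRI as the slug is an assumption, since the slug derivation is not defined here.

```python
def id_slug(artifact_id: str) -> str:
    """Assumed slug rule: the last path segment of the artifact IRI."""
    return artifact_id.rstrip("/").rsplit("/", 1)[-1]


def rendered_name_of(artifact: dict):
    """Select the rendered display name according to the artifact kind.

    Returns a MultilingualString (a dict here) or, as a last resort, the slug.
    """
    kind = artifact["kind"]
    catalog = artifact.get("catalog_metadata") or {}
    preferred = catalog.get("preferred_label")
    if kind == "Field":
        return artifact["label"]   # always present
    if kind == "Template":
        return artifact["title"]   # always present
    if kind == "TemplateInstance":
        return artifact.get("label") or preferred or id_slug(artifact["id"])
    if kind == "PresentationComponent":
        return preferred or id_slug(artifact["id"])
    raise ValueError(f"unknown artifact kind: {kind}")
```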


encode_catalog_metadata(C: CatalogMetadata, rendered: MultilingualString or null) → Object

Encodes the human-readable identity of an artifact. The schema:name, schema:description, and rdfs:label keys are always written; schema:identifier appears only when set in the Structural Model.

CTM 1.6.0 requires a single-string schema:name. The Structural Model carries multiple candidate sources for the artifact’s display name (the rendered slot Label/Title on artifacts that have one, plus the optional catalog slot CatalogMetadata.preferred_label). The rendered parameter is the rendered name chosen for this artifact by encode_artifact_metadata’s rendered_name_of rule. The encoder flattens it to a single string by selecting the en localization if present, else the first localization entry. For round-trip stability, rdfs:label is also always written: it carries the flattened CatalogMetadata.preferred_label when that slot is present, and otherwise the same flattened string as schema:name.

Returns a JSON object with the following keys:

| Key | Value | Condition |
| --- | --- | --- |
| "schema:name" | flatten_to_string(rendered) (prefer en, else first entry) | Always present; falls back to artifact id slug if rendered is null |
| "schema:description" | C.description.unicode_string | null if C.description absent |
| "schema:identifier" | C.identifier.unicode_string | Omit if C.identifier absent |
| "rdfs:label" | flatten_to_string(C.preferred_label) if present, else flatten_to_string(rendered) | Always present |

AlternativeLabel values on CatalogMetadata have no direct CTM 1.6.0 equivalent and are omitted on encode.
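
As an illustration of the flattening rule and the key conditions listed for encode_catalog_metadata, here is a hedged Python sketch. It assumes a MultilingualString is modeled as an insertion-ordered dict from language tag to text, that CatalogMetadata is a plain dict, and that the id slug is passed in as an extra id_slug parameter; none of these representations come from the specification itself.

```python
from typing import Optional


def flatten_to_string(value: dict) -> str:
    """Prefer the "en" localization, else the first localization entry."""
    if "en" in value:
        return value["en"]
    return next(iter(value.values()))


def encode_catalog_metadata(catalog: dict, rendered: Optional[dict], id_slug: str) -> dict:
    # schema:name falls back to the id slug when no rendered name is supplied.
    name = flatten_to_string(rendered) if rendered else id_slug
    out = {
        "schema:name": name,
        "schema:description": catalog.get("description"),  # None serializes as JSON null
    }
    if catalog.get("identifier") is not None:
        out["schema:identifier"] = catalog["identifier"]
    preferred = catalog.get("preferred_label")
    # rdfs:label prefers the catalog label, else repeats the flattened name.
    out["rdfs:label"] = flatten_to_string(preferred) if preferred else name
    return out
```

For example, a catalog carrying only a French preferred_label alongside a rendered name with an en entry yields different strings for schema:name and rdfs:label, which is the divergence the reverse-direction import preserves.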

Reverse direction (CTM 1.6.0 import). When importing a CTM 1.6.0 document into the Structural Model, the CTM 1.6.0 schema:name is mapped to the artifact’s rendered slot — label for a Field or TemplateInstance, title for a Template. For a PresentationComponent (which has no rendered slot), schema:name is mapped to CatalogMetadata.preferred_label. If the legacy document carries a non-empty rdfs:label distinct from schema:name, the importer maps rdfs:label to CatalogMetadata.preferred_label so that the registry display name and the rendered display name can diverge after import; otherwise preferred_label is left absent.


encode_temporal_provenance(P: TemporalProvenance) → Object

Records when an artifact was created and last modified, and by whom. All four keys are always present: pav:createdOn and pav:lastUpdatedOn carry ISO 8601 date-time strings, and pav:createdBy and oslc:modifiedBy carry IRI strings.

Returns a JSON object with the following keys:

| Key | Value |
| --- | --- |
| "pav:createdOn" | P.created_on.iso_8601_date_time_lexical_form |
| "pav:createdBy" | iri(P.created_by) |
| "pav:lastUpdatedOn" | P.modified_on.iso_8601_date_time_lexical_form |
| "oslc:modifiedBy" | iri(P.modified_by) |

encode_schema_artifact_versioning(V: SchemaArtifactVersioning) → Object

Encodes the version number, publication status, and schema format version of a schema artifact. Optional pav:previousVersion and pav:derivedFrom links are included only when the Structural Model carries them.

Returns a JSON object with the following keys:

| Key | Value | Condition |
| --- | --- | --- |
| "pav:version" | V.version.semantic_version | Always present |
| "bibo:status" | encode_status(V.status) | Always present |
| "schema:schemaVersion" | V.model_version.semantic_version | Always present |
| "pav:previousVersion" | iri(V.previous_version.iri) | Omit if V.previous_version absent |
| "pav:derivedFrom" | iri(V.derived_from.iri) | Omit if V.derived_from absent |

Calls: encode_status
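
A minimal Python sketch of this fragment follows. It assumes the versioning record is a plain dict whose values are already strings (semantic versions and IRIs), and it inlines the two bibo: strings that encode_status returns; the dict keys are hypothetical.

```python
# Status strings as listed for encode_status in this section.
STATUS_TO_BIBO = {"draft": "bibo:draft", "published": "bibo:published"}


def encode_schema_artifact_versioning(versioning: dict) -> dict:
    """Build the versioning fragment; optional links are omitted when absent."""
    out = {
        "pav:version": versioning["version"],
        "bibo:status": STATUS_TO_BIBO[versioning["status"]],
        "schema:schemaVersion": versioning["model_version"],
    }
    optional_links = {
        "pav:previousVersion": "previous_version",
        "pav:derivedFrom": "derived_from",
    }
    for out_key, source_key in optional_links.items():
        if versioning.get(source_key) is not None:
            out[out_key] = versioning[source_key]
    return out
```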


encode_status(S: Status) → String

Maps the two-valued Status enumeration to its corresponding bibo: vocabulary string.

Returns the string corresponding to the Status kind:

| Status kind | Returns |
| --- | --- |
| "draft" | "bibo:draft" |
| "published" | "bibo:published" |

6. Template Encoding

encode_template(T: Template) → Object

The top-level template object is the root of a CTM 1.6.0 template document. It is produced by merging several independently constructed fragments into one flat JSON object.

A key characteristic of this encoding is that information about each embedded field is spread across multiple top-level keys — it does not appear under a single nested key. For each embedded field E referencing a field F:

  • "properties" receives an entry at key(E) containing the full field schema (value shape, constraints, and UI hints) produced by encode_embedded_field_schema.
  • "required" receives key(E) if the embedding’s value requirement is "required".
  • "_ui" receives key(E) in its "order" array (and optionally in "propertyLabels"), derived from the embedding itself.
  • "@context" receives key(E) mapped to the field’s property IRI, if the embedding carries a Property.

This means the _ui key contains ordering and display information drawn from the embedding (EmbeddedField), while properties[key(E)] contains schema information drawn from the referenced Field. There is no single place in the output that corresponds one-to-one with an EmbeddedField — the embedding’s information is hoisted and distributed across these four top-level keys.

merge(
  {
    "@id":    iri(T.template_id),
    "@type":  "https://schema.metadatacenter.org/core/Template",
    "@context": encode_template_context(T),
    "$schema": "http://json-schema.org/draft-04/schema#",
    "type":   "object",
    "title":  T.schema_artifact_metadata.artifact_metadata.descriptive_metadata.name.unicode_string,
    "description": T.schema_artifact_metadata.artifact_metadata.descriptive_metadata.description.unicode_string
                   if description is present, else "",
    "properties":  encode_template_properties(T),
    "required":    encode_template_required(T),
    "additionalProperties": false,
    "_ui":    encode_template_ui(T)
  },
  encode_artifact_metadata(T)
)

Calls: encode_template_context, encode_template_properties, encode_template_required, encode_template_ui, encode_artifact_metadata


encode_template_context(T: Template) → Object

The @context maps compact term names to full IRIs for JSON-LD interpretation. Every template context begins with STANDARD_NS. For each data-bearing embedded artifact that carries a Property, an additional entry maps the artifact’s key string to its property IRI — or to a labelled mapping object if a property_label is also present. Artifacts without a Property (such as presentation components) contribute no context entry.

let embedded_properties = [ E in T.embedded_artifacts
                           | (E is EmbeddedField or E is EmbeddedTemplate) and E.property is present ]

merge(
  STANDARD_NS,
  { key(E): encode_property_context_entry(E.property) for each E in embedded_properties }
)

Calls: encode_property_context_entry


encode_property_context_entry(P: Property) → String or Object

Determines the form of a single entry in the template’s @context. When only a property IRI is available the entry is a plain string. When a human-readable label is also present the entry is an object with both @id and rdfs:label to support labelled JSON-LD mapping.

| Condition | Returns |
| --- | --- |
| P.property_label absent | iri(P.property_iri.iri) |
| P.property_label present | { "@id": iri(P.property_iri.iri), "rdfs:label": P.property_label.unicode_string } |
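
The two return forms can be sketched in Python as below. The function takes the IRI and the optional label as plain strings, which is an assumption made for the example rather than the Structural Model's actual Property shape.

```python
from typing import Optional, Union


def encode_property_context_entry(
    property_iri: str, property_label: Optional[str] = None
) -> Union[str, dict]:
    """Return a plain IRI string, or a labelled mapping when a label exists."""
    if property_label is None:
        return property_iri
    return {"@id": property_iri, "rdfs:label": property_label}
```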

encode_template_properties(T: Template) → Object

Produces the "properties" object for the template’s JSON Schema layer. This is one of the primary sites where embedded artifacts are encoded: each embedded artifact in the template contributes exactly one entry here, keyed by its EmbeddedArtifactKey.

The output has two parts merged together:

  1. Fixed instance-metadata entries. Nine fixed keys (@context, @id, schema:isBasedOn, schema:name, schema:description, and the four provenance keys) are always present. These define the schema for the instance-level metadata properties that every CTM 1.6.0 instance must carry, regardless of what fields the template defines.

  2. One entry per embedded artifact. For each EmbeddedArtifact E in the template, the entry at key(E) is produced by encode_embedded_artifact_schema(E). For an EmbeddedField this ultimately encodes the value shape, value constraints, and UI input type of the referenced Field, so the bulk of the field encoding (what kind of value it holds, what type annotations are required, whether it is multi-valued) is expressed here inside properties, not at the top level. For an EmbeddedTemplate the entry contains the full nested element schema. For an EmbeddedPresentationComponent the entry is the StaticTemplateField object produced by encode_embedded_presentation_component_schema, which carries no value shape.

merge(
  {
    "@context": { "type": ["object", "null"] },
    "@id":      { "type": "string", "format": "uri" },
    "schema:isBasedOn":   { "type": "string", "format": "uri" },
    "schema:name":        { "type": "string" },
    "schema:description": { "type": ["string", "null"] },
    "pav:createdOn":      { "type": ["string", "null"], "format": "date-time" },
    "pav:createdBy":      { "type": ["string", "null"], "format": "uri" },
    "pav:lastUpdatedOn":  { "type": ["string", "null"], "format": "date-time" },
    "oslc:modifiedBy":    { "type": ["string", "null"], "format": "uri" }
  },
  {
    for each E in T.embedded_artifacts:
      key(E): encode_embedded_artifact_schema(E)
  }
)

Calls: encode_embedded_artifact_schema


encode_template_required(T: Template) → Array

Builds the required array for the template’s JSON Schema. The fixed instance-metadata keys are always required. In addition, any data-bearing embedded artifact whose effective ValueRequirement is "required" contributes its key to this array.

let required_embs = [ E in T.embedded_artifacts
                    | (E is EmbeddedField or E is EmbeddedTemplate)
                      and effective value_requirement of E is "required" ]

[ "@context", "@id", "schema:isBasedOn", "schema:name",
  "schema:description", "pav:createdOn", "pav:createdBy",
  "pav:lastUpdatedOn", "oslc:modifiedBy" ]
++ [ key(E) for each E in required_embs ]
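
The same construction as a hedged Python sketch. It assumes each embedded artifact is a dict with hypothetical "kind" and "key" entries and that "value_requirement" already holds the effective requirement; how the effective value is computed is outside this example.

```python
FIXED_REQUIRED_KEYS = [
    "@context", "@id", "schema:isBasedOn", "schema:name",
    "schema:description", "pav:createdOn", "pav:createdBy",
    "pav:lastUpdatedOn", "oslc:modifiedBy",
]

DATA_BEARING_KINDS = ("EmbeddedField", "EmbeddedTemplate")


def encode_template_required(embedded_artifacts: list) -> list:
    """Fixed instance-metadata keys, then keys of required data-bearing embeddings."""
    required_keys = [
        e["key"]
        for e in embedded_artifacts
        if e["kind"] in DATA_BEARING_KINDS and e.get("value_requirement") == "required"
    ]
    return FIXED_REQUIRED_KEYS + required_keys
```

Presentation components never contribute a key here, even if a caller marks one as required, because they hold no instance data.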

encode_template_ui(T: Template) → Object

Encodes the _ui object for the template. The order entry lists all embedded artifact keys in their sequence order, controlling display order in rendering tools. When any embedding carries a label override, a propertyLabels map is also included. Header and Footer on the template are encoded as "header" and "footer" string keys when present.

let label_embs = [ E in T.embedded_artifacts | E.label_override is present ]

merge(
  { "order": [ key(E) for each E in T.embedded_artifacts ] },
  if label_embs is non-empty:
  {
    "propertyLabels": { key(E): E.label_override.label.unicode_string for each E in label_embs }
  },
  if T.header is present: { "header": T.header.unicode_string },
  if T.footer is present: { "footer": T.footer.unicode_string }
)
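
A hedged Python rendering of the conditional merge above. Embedded artifacts are assumed to be dicts with a "key" and an optional "label_override" string, and header and footer are passed as optional strings; these shapes are illustrative only.

```python
from typing import Optional


def encode_template_ui(
    embedded_artifacts: list,
    header: Optional[str] = None,
    footer: Optional[str] = None,
) -> dict:
    """Build the template _ui object; optional keys appear only when set."""
    ui = {"order": [e["key"] for e in embedded_artifacts]}
    labels = {
        e["key"]: e["label_override"]
        for e in embedded_artifacts
        if e.get("label_override") is not None
    }
    if labels:  # propertyLabels is written only when some override exists
        ui["propertyLabels"] = labels
    if header is not None:
        ui["header"] = header
    if footer is not None:
        ui["footer"] = footer
    return ui
```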

7. Embedded Artifact Schema Encoding

These functions produce the value placed at properties[key(E)] within the containing template or template element.

encode_embedded_artifact_schema(E: EmbeddedArtifact) → Object

Selects the appropriate encoding function based on whether the embedded artifact is a field, a nested template, or a presentation component. The result becomes the value placed at the artifact’s key in the parent template’s properties object.

Dispatches to the encoding function for the EmbeddedArtifact kind:

| EmbeddedArtifact kind | Encoding function |
| --- | --- |
| EmbeddedField | encode_embedded_field_schema(E) |
| EmbeddedTemplate | encode_embedded_template_schema(E) |
| EmbeddedPresentationComponent | encode_embedded_presentation_component_schema(E) |

Calls: encode_embedded_field_schema, encode_embedded_template_schema, encode_embedded_presentation_component_schema


encode_embedded_field_schema(E: EmbeddedField) → Object

This function is the bridge between the abstract EmbeddedField and the CTM 1.6.0 JSON Schema representation that an instance validator will actually use. Its job is to produce the value that goes at properties[key(E)] in the containing template.

There are two distinct concerns to resolve here:

1. Single-valued vs. multi-valued. The Structural Model represents cardinality on the EmbeddedField (the embedding), not on the Field definition itself. CTM 1.6.0 expresses multi-valued fields by wrapping the field schema in a JSON Schema array object ("type": "array", "items": ...), with optional minItems and maxItems bounds. Single-valued fields need no wrapper — the field object is used directly. This wrapping decision is therefore made here, at the embedding level, where the cardinality information lives. The is_multi(E) helper encapsulates this check.

2. Merging embedding context into the field encoding. The EmbeddedField also carries embedding-specific properties — most notably whether the field is required and whether it is hidden — that are not part of the reusable Field definition. These are passed down to encode_field via the E parameter so they can be incorporated into _valueConstraints and _ui within the field object itself.

let field_obj = encode_field(referenced_field(E), E)

where referenced_field(E) is the Field identified by the reference in E.

if is_multi(E):
  {
    "type": "array",
    "items": field_obj,
    if E.cardinality.min_cardinality is present:
      "minItems": E.cardinality.min_cardinality.non_negative_integer.integer_lexical_form (as integer),
    if E.cardinality.max_cardinality is present and not UnboundedCardinality:
      "maxItems": E.cardinality.max_cardinality.non_negative_integer.integer_lexical_form (as integer)
  }

else (single-valued):
  field_obj

Calls: encode_field
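
The wrapping decision can be sketched in Python as follows. The helper name wrap_if_multi is hypothetical, the result of is_multi(E) is passed in as a boolean because its definition lies outside this section, and max_items=None stands in for both an absent maximum and UnboundedCardinality.

```python
from typing import Optional


def wrap_if_multi(
    item_schema: dict,
    is_multi: bool,
    min_items: Optional[int] = None,
    max_items: Optional[int] = None,
) -> dict:
    """Return the schema unchanged when single-valued, else an array wrapper."""
    if not is_multi:
        return item_schema
    wrapper = {"type": "array", "items": item_schema}
    if min_items is not None:
        wrapper["minItems"] = min_items
    if max_items is not None:  # None also models UnboundedCardinality here
        wrapper["maxItems"] = max_items
    return wrapper
```

The same helper would apply unchanged to nested template elements, since the array descriptor does not depend on what the items are.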


encode_embedded_template_schema(E: EmbeddedTemplate) → Object

Parallel to encode_embedded_field_schema, but for nested template elements. Single-valued embeddings return the element object directly; multi-valued embeddings (determined by is_multi(E)) wrap it in an array descriptor with cardinality bounds.

let elem_obj = encode_template_element(referenced_template(E), E)

where referenced_template(E) is the Template identified by the reference in E.

if is_multi(E):
  {
    "type": "array",
    "items": elem_obj,
    if E.cardinality.min_cardinality is present:
      "minItems": E.cardinality.min_cardinality.non_negative_integer.integer_lexical_form (as integer),
    if E.cardinality.max_cardinality is present and not UnboundedCardinality:
      "maxItems": E.cardinality.max_cardinality.non_negative_integer.integer_lexical_form (as integer)
  }

else:
  elem_obj

Calls: encode_template_element


encode_embedded_presentation_component_schema(E: EmbeddedPresentationComponent) → Object

Presentation components are encoded as StaticTemplateField objects — regular field-like objects with a specific @type and no value shape, required array, or _valueConstraints. The component’s content (HTML, image URL, YouTube identifier) is stored in _ui._content.

let pc_obj = encode_presentation_component(referenced_presentation_component(E), E)

where referenced_presentation_component(E) is the PresentationComponent identified by the reference in E.

pc_obj

Calls: encode_presentation_component


encode_presentation_component(PC: PresentationComponent, E: EmbeddedPresentationComponent) → Object

Produces a StaticTemplateField object. Unlike regular fields, this object carries no "properties", "required", or "_valueConstraints" keys — the component holds no instance data. The @context is the smaller STATIC_FIELD_NS rather than STANDARD_NS.

merge(
  {
    "@id":    iri(PC.presentation_component_id),
    "@type":  "https://schema.metadatacenter.org/core/StaticTemplateField",
    "@context": STATIC_FIELD_NS,
    "$schema": "http://json-schema.org/draft-04/schema#",
    "type":   "object",
    "title":  PC.schema_artifact_metadata.artifact_metadata.descriptive_metadata.name.unicode_string,
    "description": PC.schema_artifact_metadata.artifact_metadata.descriptive_metadata.description.unicode_string
                   if present, else "",
    "additionalProperties": false,
    "_ui":    encode_presentation_component_ui(PC)
  },
  encode_artifact_metadata(PC)
)

Calls: encode_presentation_component_ui, encode_artifact_metadata


encode_presentation_component_ui(PC: PresentationComponent) → Object

Returns the _ui object for a static field. All component kinds carry "inputType" and "_content".

| PresentationComponent kind | "inputType" | "_content" |
| --- | --- | --- |
| PageBreakComponent | "page-break" | null |
| SectionBreakComponent | "section-break" | null |
| RichTextComponent | "richtext" | PC.html_content.unicode_string |
| ImageComponent | "image" | iri(PC.iri) |
| YoutubeVideoComponent | "youtube" | iri(PC.iri) |

ImageComponent.label, ImageComponent.description, YoutubeVideoComponent.label, and YoutubeVideoComponent.description accessibility metadata are not surfaced in CTM 1.6.0 output (the legacy form has no slot for them). See Section 14, Known Gaps.
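
The per-kind mapping can be illustrated with a short Python sketch. Components are assumed to be dicts with a hypothetical "kind" key plus "html_content" or "iri" where applicable; label and description are deliberately ignored, matching the gap noted above.

```python
INPUT_TYPE_BY_KIND = {
    "PageBreakComponent": "page-break",
    "SectionBreakComponent": "section-break",
    "RichTextComponent": "richtext",
    "ImageComponent": "image",
    "YoutubeVideoComponent": "youtube",
}


def encode_presentation_component_ui(component: dict) -> dict:
    """Return the _ui object: every kind carries inputType and _content."""
    kind = component["kind"]
    input_type = INPUT_TYPE_BY_KIND[kind]  # KeyError on an unknown kind
    if kind == "RichTextComponent":
        content = component["html_content"]
    elif kind in ("ImageComponent", "YoutubeVideoComponent"):
        content = component["iri"]
    else:
        content = None  # page and section breaks have no content
    return {"inputType": input_type, "_content": content}
```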


8. Field Encoding

encode_field(F: Field, E: EmbeddedField) → Object

A CTM 1.6.0 field object merges fixed structural keys (@id, @type, $schema, type, title, description), the artifact metadata block, and the field-spec-specific encoding. The embedding E is passed to encode_field_spec because properties such as requiredValue and hidden depend on how the field is embedded rather than on the field definition itself.

merge(
  {
    "@id":   iri(F.field_id),
    "@type": "https://schema.metadatacenter.org/core/TemplateField",
    "@context": STANDARD_NS,
    "$schema": "http://json-schema.org/draft-04/schema#",
    "type":  "object",
    "title": F.schema_artifact_metadata.artifact_metadata.descriptive_metadata.name.unicode_string,
    "description": F.schema_artifact_metadata.artifact_metadata.descriptive_metadata.description.unicode_string
                   if description is present, else ""
  },
  encode_artifact_metadata(F),
  encode_field_spec(F.field_spec, E)
)

encode_field_spec(FT: FieldSpec, E: EmbeddedField) → Object is defined per field spec in Section 9 using a common skeleton with per-type value shape and constraint entries.

Calls: encode_artifact_metadata, encode_field_spec


9. Field Spec Encoding

Skeleton

Every standard field spec encoding function returns a fragment — an object with five keys — that gets merged into the full field object by encode_field. The skeleton below shows the structure, with placeholders for the parts that vary per field spec:

{
  "properties":           <value-shape>,
  "required":             <required>,
  "additionalProperties": false,
  "_valueConstraints":    merge(encode_embedding_constraints(E), <vc-extras>),
  "_ui":                  merge(encode_embedding_ui(E), <ui-extras>)
}

The placeholders mean:

  • <value-shape> — a JSON Schema properties object describing the keys an instance value for this field must (or may) carry. For example, a text field’s instance value is a JSON object with "@value" and optionally "@type"; a controlled term field’s value uses "@id" and "rdfs:label" instead. Three named shapes (STRING_VALUE_SHAPE, NUMBER_VALUE_SHAPE, IRI_VALUE_SHAPE) cover most field specs; each is defined below.

  • <required> — the JSON Schema required array listing which keys from the value shape must be present in an instance value. Most field specs require ["@value"] or []; the exact list is given per field spec.

  • <vc-extras> — additional keys to merge into _valueConstraints beyond the base requiredValue flag. For example, a numeric field adds "numberType" here; a text field may add "defaultValue", "minLength", etc. When a field spec has no extras, _valueConstraints is just encode_embedding_constraints(E) directly.

  • <ui-extras> — additional keys to merge into _ui beyond the base hidden flag. At minimum, every field spec adds "inputType" here. Temporal fields also add "temporalGranularity" and similar hints.

Field specs that do not follow this skeleton (multi-valued enum and attribute-value) are noted explicitly in their entries.
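
The skeleton can be sketched as a small Python builder. The function name field_spec_fragment is hypothetical, and the merge is modeled as a shallow dict merge in which extras are applied after the embedding base; that precedence is an assumption, since this section does not say how key collisions are resolved.

```python
from typing import Optional


def field_spec_fragment(
    value_shape: dict,
    required: list,
    embedding_constraints: dict,
    embedding_ui: dict,
    vc_extras: Optional[dict] = None,
    ui_extras: Optional[dict] = None,
) -> dict:
    """Assemble the five-key fragment that encode_field merges into a field object."""
    return {
        "properties": value_shape,
        "required": required,
        "additionalProperties": False,
        "_valueConstraints": {**embedding_constraints, **(vc_extras or {})},
        "_ui": {**embedding_ui, **(ui_extras or {})},
    }
```

When a field spec has no constraint extras, the "_valueConstraints" entry is just a copy of the embedding constraints.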

Value Shapes

A value shape is a JSON Schema properties object that defines what keys an instance value object for this field spec must or may contain. Rather than repeat the same structures throughout, three shapes are named here and referenced by the per-field-spec entries.

STRING_VALUE_SHAPE — used by text, date, time, datetime, email, and phone number fields. Instance values carry a string "@value" and an optional "@type" IRI for typed literals:

{
  "@type":  { "oneOf": [{ "type": "string", "format": "uri" }, { "type": "null" }] },
  "@value": { "type": ["string", "null"] }
}

NUMBER_VALUE_SHAPE — used by numeric fields. Instance values carry a numeric "@value" and an "@type" IRI identifying the XSD numeric datatype:

{
  "@type":  { "oneOf": [{ "type": "string", "format": "uri" }, { "type": "null" }] },
  "@value": { "type": ["number", "null"] }
}

IRI_VALUE_SHAPE — used by controlled term, link, and external authority fields. Instance values carry an "@id" IRI rather than an "@value" string, plus an optional human-readable "rdfs:label":

{
  "@type":      { "oneOf": [{ "type": "string", "format": "uri" }, { "type": "null" }] },
  "@id":        { "type": "string", "format": "uri" },
  "rdfs:label": { "type": ["string", "null"] }
}

Embedding Helper Functions

The two helpers below produce the base content of _valueConstraints and _ui from the EmbeddedField context. Every standard field spec merges these as the starting point before adding its own extras.

encode_embedding_constraints(E: EmbeddedField) → Object

Returns { "requiredValue": V } where V depends on the effective value requirement:

| Effective ValueRequirement | "requiredValue" |
| --- | --- |
| "required" | true |
| "recommended" or "optional" | false |

Caution: "recommended" and "optional" are both encoded as "requiredValue": false, so the two are indistinguishable in CTM 1.6.0 output. "requiredValue" is a CEDAR tooling hint, not a JSON Schema concept. Enforcement is handled separately by the JSON Schema "required" array (produced by encode_template_required), which likewise distinguishes only "required" from everything else.
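
A small Python sketch makes the information loss concrete. The function takes the effective value requirement as a plain string, which is an assumption for the example.

```python
VALUE_REQUIREMENTS = ("required", "recommended", "optional")


def encode_embedding_constraints(effective_value_requirement: str) -> dict:
    """Collapse the three-valued requirement into the boolean requiredValue flag."""
    if effective_value_requirement not in VALUE_REQUIREMENTS:
        raise ValueError(f"unknown value requirement: {effective_value_requirement}")
    return {"requiredValue": effective_value_requirement == "required"}
```

Because "recommended" and "optional" produce identical output, a decoder working only from the CTM 1.6.0 form cannot tell which of the two the original embedding carried.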

encode_embedding_ui(E: EmbeddedField) → Object

Returns a JSON object with the following keys:

| Key | Value | Condition |
| --- | --- | --- |
| "hidden" | true | Only when E.visibility = "hidden"; omit otherwise |

Field Spec Definitions

encode_text_field_spec(FT: TextFieldSpec, E: EmbeddedField) → Object

Text fields accept free-form string input. The rendering hint determines whether the input is single-line (textfield) or multi-line (textarea), defaulting to single-line when absent. Optional constraints — default value, length bounds, and a validation regex — are written to _valueConstraints only when present in the field definition.

Value shape: STRING_VALUE_SHAPE | Required: ["@value"]

_valueConstraints extras:

| Key | Value | Condition |
| --- | --- | --- |
| "defaultValue" | FT.default_value.text_value.lexical_form.unicode_string | Omit if absent |
| "minLength" | FT.min_length.non_negative_integer (as integer) | Omit if absent |
| "maxLength" | FT.max_length.non_negative_integer (as integer) | Omit if absent |
| "regex" | FT.validation_regex.regex_pattern.unicode_string | Omit if absent |

_ui extras: { "inputType": encode_text_rendering_hint(FT.text_rendering_hint) }

Calls: encode_embedding_constraints, encode_embedding_ui, encode_text_rendering_hint
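
The text-specific extras can be sketched in Python as below. The function name encode_text_extras and the flat dict representation of the field spec (with already-unwrapped values) are assumptions for the example; the rendering-hint mapping follows the rule stated above that an absent hint means single-line.

```python
TEXT_INPUT_TYPE = {None: "textfield", "singleLine": "textfield", "multiLine": "textarea"}

# Output key -> hypothetical key in the dict-based field spec.
TEXT_CONSTRAINT_KEYS = {
    "defaultValue": "default_value",
    "minLength": "min_length",
    "maxLength": "max_length",
    "regex": "validation_regex",
}


def encode_text_extras(spec: dict) -> tuple:
    """Return (vc_extras, ui_extras) for a text field spec."""
    vc_extras = {
        out_key: spec[source_key]
        for out_key, source_key in TEXT_CONSTRAINT_KEYS.items()
        if spec.get(source_key) is not None  # keeps a legitimate minLength of 0
    }
    ui_extras = {"inputType": TEXT_INPUT_TYPE[spec.get("text_rendering_hint")]}
    return vc_extras, ui_extras
```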

encode_text_rendering_hint(hint: TextRenderingHint or absent) → String

Returns the string corresponding to the hint value:

| TextRenderingHint value | Returns |
| --- | --- |
| "singleLine" or absent | "textfield" |
| "multiLine" | "textarea" |

encode_integer_number_field_spec(FT: IntegerNumberFieldSpec, E: EmbeddedField) → Object

Integer-number fields hold base-10 integer lexical values. The numberType key is always written and carries "xsd:integer"; the integer category is fixed by the field family.

Value shape: NUMBER_VALUE_SHAPE | Required: ["@value"]

_valueConstraints extras:

| Key | Value | Condition |
| --- | --- | --- |
| "numberType" | "xsd:integer" | Always present |
| "unitOfMeasure" | iri(FT.unit.iri) | Omit if absent |
| "minValue" | FT.integer_number_min_value.integer_number_value.value (as integer) | Omit if absent |
| "maxValue" | FT.integer_number_max_value.integer_number_value.value (as integer) | Omit if absent |

_ui extras: { "inputType": "numeric" }

Unit carries an Iri in the Structural Model; CTM 1.6.0 unitOfMeasure is a plain string. The IRI string value is used directly.

Calls: encode_embedding_constraints, encode_embedding_ui

encode_real_number_field_spec(FT: RealNumberFieldSpec, E: EmbeddedField) → Object

Real-number fields hold lexical values for one of three real-number kinds (decimal, float, double). The numberType key carries the corresponding XSD datatype IRI string.

Value shape: NUMBER_VALUE_SHAPE | Required: ["@value"]

_valueConstraints extras:

| Key | Value | Condition |
| --- | --- | --- |
| "numberType" | encode_real_number_datatype(FT.datatype) | Always present |
| "unitOfMeasure" | iri(FT.unit.iri) | Omit if absent |
| "minValue" | FT.real_number_min_value.real_number_value.value (as number) | Omit if absent |
| "maxValue" | FT.real_number_max_value.real_number_value.value (as number) | Omit if absent |

A decimalPlaces hint, when present on the field’s NumericRenderingHint, is emitted under _ui rather than _valueConstraints.

_ui extras: { "inputType": "numeric", "decimalPlaces": FT.rendering_hint.decimal_places (as integer; omit if absent) }

Calls: encode_embedding_constraints, encode_embedding_ui, encode_real_number_datatype

encode_real_number_datatype(K: RealNumberDatatypeKind) → String

Returns the XSD datatype IRI string corresponding to the CEDAR-native RealNumberDatatypeKind:

| RealNumberDatatypeKind | Returns |
| --- | --- |
| "decimal" | "xsd:decimal" |
| "float" | "xsd:float" |
| "double" | "xsd:double" |

encode_date_field_spec(FT: DateFieldSpec, E: EmbeddedField) → Object

Date fields encode values at year, year-month, or full-date precision. Both _valueConstraints.temporalType (the XSD datatype) and _ui.temporalGranularity are derived from the same DateValueType. An optional dateFormat hint controls the display ordering of day, month, and year components.

Value shape: STRING_VALUE_SHAPE | Required: ["@value"]

_valueConstraints extras: { "temporalType": encode_date_value_type(FT.date_value_type) }

_ui extras:

| Key | Value | Condition |
| --- | --- | --- |
| "inputType" | "temporal" | Always present |
| "temporalGranularity" | encode_date_granularity(FT.date_value_type) | Always present |
| "dateFormat" | encode_date_format(FT.date_rendering_hint.date_format) | Omit if FT.date_rendering_hint absent or date_format absent |

Calls: encode_embedding_constraints, encode_embedding_ui, encode_date_value_type, encode_date_granularity, encode_date_format
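
The following Python sketch shows how temporalType and temporalGranularity both derive from one DateValueType. It uses the mapping values given for encode_date_value_type, encode_date_granularity, and encode_date_format in this section; the combined function name encode_date_extras and its plain-string parameters are assumptions.

```python
from typing import Optional

# DateValueType kind -> (temporalType, temporalGranularity)
DATE_VALUE_TYPES = {
    "year": ("xsd:gYear", "year"),
    "yearMonth": ("xsd:gYearMonth", "month"),
    "fullDate": ("xsd:date", "day"),
}

DATE_FORMATS = {
    "dayMonthYear": "D/M/YYYY",
    "monthDayYear": "M/D/YYYY",
    "yearMonthDay": "YYYY/M/D",
}


def encode_date_extras(date_value_type: str, date_format: Optional[str] = None) -> tuple:
    """Return (vc_extras, ui_extras) for a date field spec."""
    temporal_type, granularity = DATE_VALUE_TYPES[date_value_type]
    ui_extras = {"inputType": "temporal", "temporalGranularity": granularity}
    if date_format is not None:  # dateFormat is omitted when no hint is given
        ui_extras["dateFormat"] = DATE_FORMATS[date_format]
    return {"temporalType": temporal_type}, ui_extras
```

Keeping both derived values in one lookup table avoids the two mappings drifting apart.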

encode_date_value_type(DVT: DateValueType) → String

Returns the XSD datatype string for the DateValueType kind:

| DateValueType kind | Returns |
| --- | --- |
| "year" | "xsd:gYear" |
| "yearMonth" | "xsd:gYearMonth" |
| "fullDate" | "xsd:date" |

encode_date_granularity(DVT: DateValueType) → String

Returns the temporalGranularity string for the DateValueType kind:

| DateValueType kind | Returns |
| --- | --- |
| "year" | "year" |
| "yearMonth" | "month" |
| "fullDate" | "day" |

encode_date_format(DF: DateComponentOrder) → String

Returns the dateFormat string for the DateComponentOrder kind:

| DateComponentOrder kind | Returns |
| --- | --- |
| "dayMonthYear" | "D/M/YYYY" |
| "monthDayYear" | "M/D/YYYY" |
| "yearMonthDay" | "YYYY/M/D" |

encode_time_field_spec(FT: TimeFieldSpec, E: EmbeddedField) → Object

Time fields always use the xsd:time datatype. The temporalGranularity and optional timezone and format hints are placed in _ui. The timezoneEnabled key is only written when the timezone requirement is explicitly stated; it is omitted when unset.

Value shape: STRING_VALUE_SHAPE | Required: ["@value"]

_valueConstraints extras: { "temporalType": "xsd:time" }

_ui extras:

| Key | Value | Condition |
| --- | --- | --- |
| "inputType" | "temporal" | Always present |
| "temporalGranularity" | encode_time_precision(FT.time_precision) | Always present |
| "timezoneEnabled" | true | Only when FT.timezone_requirement = "timezoneRequired" |
| "timezoneEnabled" | false | Only when FT.timezone_requirement = "timezoneNotRequired" |
| "inputTimeFormat" | "12h" | Only when FT.time_rendering_hint.time_format = "twelveHour" |
| "inputTimeFormat" | "24h" | Only when FT.time_rendering_hint.time_format = "twentyFourHour" |

Calls: encode_embedding_constraints, encode_embedding_ui, encode_time_precision

encode_time_precision(TP: TimePrecision or absent) → String

Returns the temporalGranularity string for the TimePrecision kind:

| TimePrecision kind | Returns |
| --- | --- |
| "hourMinute" | "minute" |
| "hourMinuteSecond" | "second" |
| "hourMinuteSecondFraction" | "decimalSecond" |
| absent | "decimalSecond" |
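
The _ui extras for a time field, including the default granularity for an absent TimePrecision, can be sketched in Python. The function name encode_time_ui_extras and its plain-string parameters are assumptions; None models an absent value.

```python
from typing import Optional

TIME_GRANULARITY = {
    "hourMinute": "minute",
    "hourMinuteSecond": "second",
    "hourMinuteSecondFraction": "decimalSecond",
    None: "decimalSecond",  # absent precision defaults to the finest granularity
}
TIMEZONE_ENABLED = {"timezoneRequired": True, "timezoneNotRequired": False}
TIME_FORMAT = {"twelveHour": "12h", "twentyFourHour": "24h"}


def encode_time_ui_extras(
    time_precision: Optional[str] = None,
    timezone_requirement: Optional[str] = None,
    time_format: Optional[str] = None,
) -> dict:
    """Build the _ui extras; timezone and format keys are omitted when unset."""
    ui = {
        "inputType": "temporal",
        "temporalGranularity": TIME_GRANULARITY[time_precision],
    }
    if timezone_requirement is not None:
        ui["timezoneEnabled"] = TIMEZONE_ENABLED[timezone_requirement]
    if time_format is not None:
        ui["inputTimeFormat"] = TIME_FORMAT[time_format]
    return ui
```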

encode_datetime_field_spec(FT: DateTimeFieldSpec, E: EmbeddedField) → Object

Date-time fields always use the xsd:dateTime datatype. They follow the same pattern as time fields for timezone and format hints, with granularity derived from DateTimeValueType rather than TimePrecision.

Value shape: STRING_VALUE_SHAPE | Required: ["@value"]

_valueConstraints extras: { "temporalType": "xsd:dateTime" }

_ui extras:

| Key | Value | Condition |
| --- | --- | --- |
| "inputType" | "temporal" | Always present |
| "temporalGranularity" | encode_datetime_value_type(FT.datetime_value_type) | Always present |
| "timezoneEnabled" | true | Only when FT.timezone_requirement = "timezoneRequired" |
| "timezoneEnabled" | false | Only when FT.timezone_requirement = "timezoneNotRequired" |
| "inputTimeFormat" | "12h" | Only when FT.date_time_rendering_hint.time_format = "twelveHour" |
| "inputTimeFormat" | "24h" | Only when FT.date_time_rendering_hint.time_format = "twentyFourHour" |

Calls: encode_embedding_constraints, encode_embedding_ui, encode_datetime_value_type

encode_datetime_value_type(DVT: DateTimeValueType) → String

Returns the temporalGranularity string for the DateTimeValueType kind:

| DateTimeValueType kind | Returns |
| --- | --- |
| "dateHourMinute" | "minute" |
| "dateHourMinuteSecond" | "second" |
| "dateHourMinuteSecondFraction" | "decimalSecond" |

encode_controlled_term_field_spec(FT: ControlledTermFieldSpec, E: EmbeddedField) → Object

Controlled term fields constrain values to terms drawn from ontologies, branches of ontologies, named classes, or value sets. The four _valueConstraints list keys (ontologies, branches, classes, valueSets) are always present, each holding an array that is empty when no sources of that kind are configured. The multipleChoice: false flag distinguishes this from multi-valued enum fields.

Value shape: IRI_VALUE_SHAPE | Required: []

_valueConstraints extras:

| Key | Value | Condition |
| --- | --- | --- |
| "multipleChoice" | false | Always present |
| "ontologies" | [ encode_ontology_source(S) for each OntologySource S in FT.controlled_term_sources ] | Always present |
| "branches" | [ encode_branch_source(S) for each BranchSource S in FT.controlled_term_sources ] | Always present |
| "classes" | [ encode_class_source_entry(C) for each ClassSource S in FT.controlled_term_sources, for each C in S.controlled_term_classes ] | Always present |
| "valueSets" | [ encode_value_set_source(S) for each ValueSetSource S in FT.controlled_term_sources ] | Always present |

_ui extras: { "inputType": "textfield" }

Calls: encode_embedding_constraints, encode_embedding_ui, encode_ontology_source, encode_branch_source, encode_class_source_entry, encode_value_set_source
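
The partitioning of a mixed source list into the four always-present arrays can be sketched in Python. Sources are assumed to be dicts with a hypothetical "kind" key and already-unwrapped string values, and each per-source encoder is reduced to its always-present keys, so optional keys such as "acronym" or "maxDepth" are left out of this illustration.

```python
def encode_controlled_term_vc_extras(sources: list) -> dict:
    """Partition sources by kind; all four list keys exist even when empty."""
    extras = {
        "multipleChoice": False,
        "ontologies": [],
        "branches": [],
        "classes": [],
        "valueSets": [],
    }
    for source in sources:
        kind = source["kind"]
        if kind == "OntologySource":
            extras["ontologies"].append({"uri": source["ontology_iri"]})
        elif kind == "BranchSource":
            extras["branches"].append({
                "uri": source["ontology_iri"],
                "rootTermUri": source["root_term_iri"],
                "rootTermLabel": source["root_term_label"],
            })
        elif kind == "ClassSource":
            # One ClassSource may contribute several entries to "classes".
            for cls in source["classes"]:
                extras["classes"].append({
                    "uri": cls["term_iri"],
                    "label": cls["label"],
                    "prefLabel": cls["label"],
                    "type": "OntologyClass",
                    "source": cls["ontology_iri"],
                })
        elif kind == "ValueSetSource":
            extras["valueSets"].append({"identifier": source["identifier"]})
        else:
            raise ValueError(f"unknown source kind: {kind}")
    return extras
```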

encode_ontology_source(S: OntologySource) → Object

Returns a JSON object with the following keys:

| Key | Value | Condition |
| --- | --- | --- |
| "uri" | iri(S.ontology_reference.ontology_iri.iri) | Always present |
| "acronym" | S.ontology_reference.ontology_display_hint.ontology_acronym.unicode_string | Omit if absent |
| "name" | S.ontology_reference.ontology_display_hint.ontology_name.unicode_string | Omit if absent |

encode_branch_source(S: BranchSource) → Object

Returns a JSON object with the following keys:

| Key | Value | Condition |
| --- | --- | --- |
| "uri" | iri(S.ontology_reference.ontology_iri.iri) | Always present |
| "acronym" | S.ontology_reference.ontology_display_hint.ontology_acronym.unicode_string | Omit if absent |
| "rootTermUri" | iri(S.root_term_iri.iri) | Always present |
| "rootTermLabel" | S.root_term_label.unicode_string | Always present |
| "maxDepth" | S.max_traversal_depth.non_negative_integer (as integer) | Omit if absent |

encode_class_source_entry(C: ControlledTermClass) → Object

Returns a JSON object with the following keys:

| Key | Value | Condition |
| --- | --- | --- |
| "uri" | iri(C.term_iri.iri) | Always present |
| "label" | C.label.unicode_string | Always present |
| "prefLabel" | C.label.unicode_string | Always present |
| "type" | "OntologyClass" | Always present |
| "source" | iri(C.ontology_reference.ontology_iri.iri) | Always present |

encode_value_set_source(S: ValueSetSource) → Object

Returns a JSON object with the following keys:

| Key | Value | Condition |
| --- | --- | --- |
| "identifier" | S.value_set_identifier.unicode_string | Always present |
| "name" | S.value_set_name.unicode_string | Omit if absent |
| "uri" | iri(S.value_set_iri.iri) | Omit if absent |

encode_single_valued_enum_field_spec(FT: SingleValuedEnumFieldSpec, E: EmbeddedField) → Object

SingleValuedEnumFieldSpec declares a closed list of PermissibleValue entries. CTM 1.6.0 has no native equivalent for the Structural Model’s enum-with-meanings construct: this encoder maps the spec into the legacy "literals" list, using each permissible value’s canonical Token as the legacy literal label. Per-value Label, Description, and Meaning metadata is dropped (see Section 14, Known Gaps). The multipleChoice: false flag distinguishes this from the multi-valued enum case.

Value shape: STRING_VALUE_SHAPE | Required: []

_valueConstraints extras:

| Key | Value | Condition |
| --- | --- | --- |
| "multipleChoice" | false | Always present |
| "literals" | [ encode_permissible_value(PV) for each PV in FT.permissible_values ] | Always present |
| "defaultValue" | FT.default_value.token.string | Omit if absent |

_ui extras: { "inputType": encode_single_valued_enum_rendering_hint(FT.rendering_hint) }

Calls: encode_embedding_constraints, encode_embedding_ui, encode_single_valued_enum_rendering_hint, encode_permissible_value

encode_single_valued_enum_rendering_hint(hint: SingleValuedEnumRenderingHint or absent) → String

Returns the inputType string for the hint value:

| SingleValuedEnumRenderingHint value | Returns |
| --- | --- |
| "radio" or absent | "radio" |
| "dropdown" | "list" |

encode_permissible_value(PV: PermissibleValue) → Object

Encodes a single PermissibleValue from a SingleValuedEnumFieldSpec or MultiValuedEnumFieldSpec as a CTM 1.6.0 literals-array entry. The legacy entry carries a single label string; the encoder uses the permissible value’s Token as that label. The Token is the canonical wire-form key in the Structural Model and remains the value submitted in instances.

PermissibleValue.label and PermissibleValue.description localizations are dropped — CTM 1.6.0 has no slot for them on a literals entry. PermissibleValue.meanings is also dropped: CTM 1.6.0 literal options carry no ontology binding. See Section 14, Known Gaps.

| Key | Value | Condition |
| --- | --- | --- |
| "label" | PV.token.string | Always present |

selectedByDefault is no longer encoded per option. The Structural Model represents enum defaults at the spec level (SingleValuedEnumFieldSpec.defaultValue / MultiValuedEnumFieldSpec.defaultValues); these are emitted via defaultValue / defaultValues keys in _valueConstraints rather than as per-option flags. CTM 1.6.0 tooling support for those keys is not guaranteed (see Section 14).
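The single-valued enum extras described above can be sketched in Python as follows. This is a minimal illustration, not the reference encoder: permissible values are reduced to their Token strings, and the helper name single_valued_enum_extras and its parameters are hypothetical, introduced only for this example.

```python
from typing import Optional, Sequence, Tuple

# Legacy inputType for each SingleValuedEnumRenderingHint value.
_SINGLE_HINTS = {"radio": "radio", "dropdown": "list"}


def encode_permissible_value(token: str) -> dict:
    # Only the Token survives; label, description, and meanings are dropped.
    return {"label": token}


def encode_single_valued_enum_rendering_hint(hint: Optional[str]) -> str:
    # An absent hint falls back to "radio".
    return "radio" if hint is None else _SINGLE_HINTS[hint]


def single_valued_enum_extras(
    tokens: Sequence[str],
    default_token: Optional[str] = None,
    hint: Optional[str] = None,
) -> Tuple[dict, dict]:
    """Hypothetical helper: returns (_valueConstraints extras, _ui extras)."""
    vc = {
        "multipleChoice": False,
        "literals": [encode_permissible_value(t) for t in tokens],
    }
    # The default is emitted at the spec level, not as a per-option flag.
    if default_token is not None:
        vc["defaultValue"] = default_token
    ui = {"inputType": encode_single_valued_enum_rendering_hint(hint)}
    return vc, ui
```

Calling single_valued_enum_extras(["low", "high"], default_token="low", hint="dropdown") yields a literals list of two label-only entries, a "defaultValue" of "low", and an inputType of "list".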


encode_multi_valued_enum_field_spec(FT: MultiValuedEnumFieldSpec, E: EmbeddedField) → Object

Multi-valued enum fields allow instances to carry zero or more selected permissible values, so the value schema is wrapped in a JSON Schema array with minItems: 0. The multipleChoice: true flag distinguishes this from single-valued enum fields.

As with encode_single_valued_enum_field_spec, per-value Label, Description, and Meaning metadata is dropped from the legacy literals entries (see Section 14).

This field spec does not follow the standard skeleton. It wraps the value schema in an array:

{
  "type": "array",
  "minItems": 0,
  "items": {
    "type": "object",
    "properties": { "@value": { "type": ["string", "null"] } },
    "required": [],
    "additionalProperties": false
  },
  "_valueConstraints": merge(encode_embedding_constraints(E), <vc-extras>),
  "_ui":               merge(encode_embedding_ui(E), <ui-extras>)
}

_valueConstraints extras:

| Key | Value | Condition |
| --- | --- | --- |
| "multipleChoice" | true | Always present |
| "literals" | [ encode_permissible_value(PV) for each PV in FT.permissible_values ] | Always present |
| "defaultValues" | [ T.string for each T in FT.default_values ] | Omit if absent or empty |

_ui extras: { "inputType": encode_multi_valued_enum_rendering_hint(FT.rendering_hint) }

Calls: encode_embedding_constraints, encode_embedding_ui, encode_multi_valued_enum_rendering_hint, encode_permissible_value
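A minimal Python sketch of the array-wrapped form follows. It makes several assumptions: permissible values are passed as plain Token strings, the results of encode_embedding_constraints and encode_embedding_ui are supplied as ready-made dicts, merge is taken to mean "later keys win", and the rendering-hint mapping is the one given for encode_multi_valued_enum_rendering_hint. The parameter names are hypothetical.

```python
from typing import Mapping, Optional, Sequence

# Legacy inputType for each MultiValuedEnumRenderingHint value.
_MULTI_HINTS = {"checkbox": "checkbox", "multiSelect": "list"}


def encode_multi_valued_enum_field_spec(
    tokens: Sequence[str],
    default_tokens: Sequence[str] = (),
    hint: Optional[str] = None,
    embedding_constraints: Optional[Mapping] = None,  # assumed pre-encoded
    embedding_ui: Optional[Mapping] = None,           # assumed pre-encoded
) -> dict:
    # Start from the embedding-level output, then layer the extras on top.
    vc = dict(embedding_constraints or {})
    vc["multipleChoice"] = True
    vc["literals"] = [{"label": t} for t in tokens]
    # "defaultValues" is omitted when absent or empty.
    if default_tokens:
        vc["defaultValues"] = list(default_tokens)

    ui = dict(embedding_ui or {})
    ui["inputType"] = "checkbox" if hint is None else _MULTI_HINTS[hint]

    return {
        "type": "array",
        "minItems": 0,
        "items": {
            "type": "object",
            "properties": {"@value": {"type": ["string", "null"]}},
            "required": [],
            "additionalProperties": False,
        },
        "_valueConstraints": vc,
        "_ui": ui,
    }
```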

encode_multi_valued_enum_rendering_hint(hint: MultiValuedEnumRenderingHint or absent) → String

Returns the inputType string for the hint value:

| MultiValuedEnumRenderingHint value | Returns |
| --- | --- |
| "checkbox" or absent | "checkbox" |
| "multiSelect" | "list" |

encode_link_field_spec(FT: LinkFieldSpec, E: EmbeddedField) → Object

Link fields hold a URI value with an optional human-readable label. They use IRI_VALUE_SHAPE and the link input type with no additional value constraints.

Value shape: IRI_VALUE_SHAPE | Required: [] | _valueConstraints extras: none | _ui extras: { "inputType": "link" }

Calls: encode_embedding_constraints, encode_embedding_ui


encode_email_field_spec(FT: EmailFieldSpec, E: EmbeddedField) → Object

Email fields hold a string value interpreted as an email address. They use STRING_VALUE_SHAPE and the email input type with no additional value constraints.

Value shape: STRING_VALUE_SHAPE | Required: [] | _valueConstraints extras: none | _ui extras: { "inputType": "email" }

Calls: encode_embedding_constraints, encode_embedding_ui


encode_phone_number_field_spec(FT: PhoneNumberFieldSpec, E: EmbeddedField) → Object

Phone number fields hold a string value interpreted as a phone number. They use STRING_VALUE_SHAPE and the phone-number input type with no additional value constraints.

Value shape: STRING_VALUE_SHAPE | Required: [] | _valueConstraints extras: none | _ui extras: { "inputType": "phone-number" }

Calls: encode_embedding_constraints, encode_embedding_ui


External Authority Field Specs

External authority fields identify entities from well-known registries such as ORCID, ROR, DOI, PubMed, RRID, and NIH Grant. All six types use IRI_VALUE_SHAPE and differ only in the inputType string written to _ui. They share the same skeleton entry:

Value shape: IRI_VALUE_SHAPE | Required: [] | _valueConstraints extras: none | _ui extras: { "inputType": encode_external_authority_input_type(FT) }

encode_external_authority_field_spec(FT: ExternalAuthorityFieldSpec, E: EmbeddedField) → Object

Applies the skeleton with the above parameters.

Calls: encode_embedding_constraints, encode_embedding_ui, encode_external_authority_input_type

encode_external_authority_input_type(FT: ExternalAuthorityFieldSpec) → String

Returns the inputType string for the field spec kind:

| ExternalAuthorityFieldSpec kind | Returns |
| --- | --- |
| OrcidFieldSpec | "orcid" |
| RorFieldSpec | "ror" |
| DoiFieldSpec | "doi" |
| PubMedIdFieldSpec | "pubmed" |
| RridFieldSpec | "rrid" |
| NihGrantIdFieldSpec | "nih-grant" |

Caution: The inputType string values for external authority fields are not standardised in the published CTM 1.6.0 specification. The values in the table above reflect common practice but MUST be confirmed against the deployed CTM 1.6.0 implementation before use. Encoding with incorrect inputType values may cause CEDAR tooling to misrender or reject these fields.
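Because these inputType strings need to be confirmed per deployment, one way to structure the lookup is to keep the table as data and allow it to be overridden. The sketch below is illustrative only: field spec kinds are represented as plain strings, and the overrides parameter is a hypothetical addition, not part of the specified function signature.

```python
from typing import Mapping, Optional

# Defaults taken from the table above; unconfirmed against CTM 1.6.0 tooling.
DEFAULT_INPUT_TYPES = {
    "OrcidFieldSpec": "orcid",
    "RorFieldSpec": "ror",
    "DoiFieldSpec": "doi",
    "PubMedIdFieldSpec": "pubmed",
    "RridFieldSpec": "rrid",
    "NihGrantIdFieldSpec": "nih-grant",
}


def encode_external_authority_input_type(
    kind: str,
    overrides: Optional[Mapping[str, str]] = None,  # hypothetical hook
) -> str:
    # Deployment-specific values replace the defaults where supplied.
    table = {**DEFAULT_INPUT_TYPES, **(overrides or {})}
    try:
        return table[kind]
    except KeyError:
        raise ValueError(f"not an external authority field spec kind: {kind}") from None
```

Keeping the mapping in one overridable table means a deployment that uses different strings can correct them without touching the encoder logic.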


encode_attribute_value_field_spec(FT: AttributeValueFieldSpec, E: EmbeddedField) → Object

Attribute-value fields hold dynamic key-value pairs whose attribute names are not known at schema definition time. CTM 1.6.0 represents this with a top-level array type and defers the dynamic key handling to the instance level via additionalProperties.

This field spec does not follow the standard skeleton. It uses a top-level array type:

{
  "type": "array",
  "items": { "type": "string" },
  "minItems": 0,
  "additionalProperties": false,
  "_valueConstraints": merge(encode_embedding_constraints(E), { "requiredValue": false }),
  "_ui":               merge(encode_embedding_ui(E), { "inputType": "attribute-value" })
}

The instance representation of AttributeValue fields in CTM 1.6.0 uses additionalProperties at the instance level rather than a structured value schema. See Section 14, Known Gaps.

Calls: encode_embedding_constraints, encode_embedding_ui


10. Template Element Encoding

When a Template is referenced by an EmbeddedTemplate, it is encoded as a CTM 1.6.0 template element object.

encode_template_element(T: Template, E: EmbeddedTemplate) → Object

The encoding is identical to that of a top-level template, except that @type becomes TemplateElement.

merge(
  {
    "@id":    iri(T.template_id),
    "@type":  "https://schema.metadatacenter.org/core/TemplateElement",
    "@context": encode_template_context(T),
    "$schema": "http://json-schema.org/draft-04/schema#",
    "type":   "object",
    "title":  T.schema_artifact_metadata.artifact_metadata.descriptive_metadata.name.unicode_string,
    "description": T.schema_artifact_metadata.artifact_metadata.descriptive_metadata.description.unicode_string
                   if description is present, else "",
    "properties":  encode_template_properties(T),
    "required":    encode_template_required(T),
    "additionalProperties": false,
    "_ui":    encode_template_ui(T)
  },
  encode_artifact_metadata(T)
)

encode_template_context, encode_template_properties, encode_template_required, and encode_template_ui are as defined in Section 6 and operate identically on Template constructs whether they are top-level templates or nested template elements.

Calls: encode_template_context, encode_template_properties, encode_template_required, encode_template_ui, encode_artifact_metadata


11. Value Encoding (Instance Level)

These functions encode Value constructs as they appear within a TemplateInstance.

encode_value(V: Value) → Object

All value types are encoded as JSON objects, though the specific keys differ by type. This function dispatches on the Value kind to the corresponding type-specific encoder:

| Value kind | Encoding function |
| --- | --- |
| TextValue | encode_text_value(V) |
| IntegerNumberValue | encode_integer_number_value(V) |
| RealNumberValue | encode_real_number_value(V) |
| BooleanValue | encode_boolean_value(V) |
| DateValue | encode_date_value(V) |
| TimeValue | encode_time_value(V) |
| DateTimeValue | encode_datetime_value(V) |
| ControlledTermValue | encode_controlled_term_value(V) |
| EnumValue | encode_enum_value(V) |
| LinkValue | encode_link_value(V) |
| EmailValue | encode_email_value(V) |
| PhoneNumberValue | encode_phone_number_value(V) |
| ExternalAuthorityValue | encode_external_authority_value(V) |
| AttributeValue | encode_attribute_value(V) |

Calls: encode_text_value, encode_integer_number_value, encode_real_number_value, encode_boolean_value, encode_date_value, encode_time_value, encode_datetime_value, encode_controlled_term_value, encode_enum_value, encode_link_value, encode_email_value, encode_phone_number_value, encode_external_authority_value, encode_attribute_value


encode_text_value(V: TextValue) → Object

Returns a JSON object whose keys depend on whether V carries a language tag:

| Condition | "@value" source | "@language" |
| --- | --- | --- |
| V.lang absent | V.value.unicode_string | Omit |
| V.lang present | V.value.unicode_string | V.lang.bcp_47_tag |
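The two cases can be sketched in Python as below. This is a simplified illustration that takes the lexical form and the optional BCP 47 tag as plain strings rather than as the spec's nested TextValue construct.

```python
from typing import Optional


def encode_text_value(value: str, lang: Optional[str] = None) -> dict:
    # "@value" is always written.
    out = {"@value": value}
    # "@language" appears only when the value carries a language tag.
    if lang is not None:
        out["@language"] = lang
    return out
```

For example, encode_text_value("Bonjour", "fr") produces an object with both keys, while encode_text_value("Hello") produces an object with "@value" alone.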

encode_integer_number_value(V: IntegerNumberValue) → Object

Integer-number instance values carry a base-10 integer lexical form. The XSD datatype IRI is fixed at "xsd:integer".

{
  "@value": V.value.unicode_string,
  "@type":  "xsd:integer"
}

encode_real_number_value(V: RealNumberValue) → Object

Real-number instance values carry both a lexical form and an explicit RealNumberDatatypeKind. The kind is mapped to the corresponding XSD datatype IRI string by encode_real_number_datatype.

{
  "@value": V.value.unicode_string,
  "@type":  encode_real_number_datatype(V.datatype)
}

encode_date_value(V: DateValue) → Object

Returns { "@value": <literal>, "@type": <xsd-type> } where the sources depend on the DateValue kind:

| DateValue kind | "@value" source | "@type" |
| --- | --- | --- |
| YearValue | V.value | "xsd:gYear" |
| YearMonthValue | V.value | "xsd:gYearMonth" |
| FullDateValue | V.full_date_literal.lexical_form.string | "xsd:date" |
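The kind-to-datatype mapping above lends itself to a table-driven encoder. The following Python sketch assumes the DateValue kind is passed as a string and that the lexical form has already been extracted from the value; both simplifications are for illustration only.

```python
# XSD datatype emitted under "@type" for each DateValue kind.
_DATE_TYPES = {
    "YearValue": "xsd:gYear",
    "YearMonthValue": "xsd:gYearMonth",
    "FullDateValue": "xsd:date",
}


def encode_date_value(kind: str, lexical_form: str) -> dict:
    # The lexical form is written unchanged; only "@type" varies by kind.
    if kind not in _DATE_TYPES:
        raise ValueError(f"unknown DateValue kind: {kind}")
    return {"@value": lexical_form, "@type": _DATE_TYPES[kind]}
```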

encode_time_value(V: TimeValue) → Object

Time instance values always use the xsd:time datatype. The lexical form is written directly from the time literal.

{ "@value": V.time_literal.lexical_form.unicode_string, "@type": "xsd:time" }

encode_datetime_value(V: DateTimeValue) → Object

Date-time instance values always use the xsd:dateTime datatype. The lexical form is written directly from the date-time literal.

{ "@value": V.date_time_literal.lexical_form.unicode_string, "@type": "xsd:dateTime" }

encode_controlled_term_value(V: ControlledTermValue) → Object

Returns a JSON object with the following keys:

| Key | Value | Condition |
| --- | --- | --- |
| "@id" | iri(V.term_iri.iri) | Always present |
| "rdfs:label" | V.label.unicode_string | Omit if absent |
| "skos:notation" | V.notation.unicode_string | Omit if absent |
| "skos:prefLabel" | V.preferred_label.unicode_string | Omit if absent |

encode_enum_value(V: EnumValue) → Object

Encodes an EnumValue as a CTM 1.6.0 string-shaped JSON-LD value. The Token carried by the EnumValue is emitted under "@value". CTM 1.6.0 has no native concept of an enum value distinct from a string literal — the legacy form treats the submitted token as a plain string, with conformance to the spec’s permissible-value list enforced at the schema layer (the literals array under _valueConstraints).

{ "@value": V.token.string }

Per-value Meaning bindings carried by the source spec are not surfaced at the instance: the legacy wire form has no slot for them. Consumers that need ontology meanings MUST consult the source EnumFieldSpec.

Calls: none.


encode_link_value(V: LinkValue) → Object

Returns a JSON object with the following keys:

| Key | Value | Condition |
| --- | --- | --- |
| "@id" | iri(V.iri) | Always present |
| "rdfs:label" | first localization of V.label (lexical form only) | Omit if V.label absent |

CTM 1.6.0’s rdfs:label slot accepts a single string only. When V.label is a multi-localization MultilingualString, the first entry is emitted; remaining localizations are dropped. See Section 14, Known Gaps.
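The first-localization rule can be illustrated with a short Python sketch. It assumes, purely for the example, that a MultilingualString is modelled as an ordered list of (text, language tag) pairs and that an empty list is treated the same as an absent label.

```python
from typing import Optional, Sequence, Tuple

# Assumed stand-in for MultilingualString: ordered (text, lang) pairs.
Localizations = Sequence[Tuple[str, Optional[str]]]


def encode_link_value(iri: str, label: Optional[Localizations] = None) -> dict:
    out = {"@id": iri}
    if label:
        # Only the first localization's text is kept; its language tag and
        # all remaining localizations are dropped.
        first_text, _lang = label[0]
        out["rdfs:label"] = first_text
    return out
```

With a label of [("Home page", "en"), ("Page d'accueil", "fr")], only "Home page" reaches the output, which is the lossy behaviour recorded under Known Gaps.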


encode_email_value(V: EmailValue) → Object

Email instance values are plain string objects with a single @value key. No type annotation is included.

{ "@value": V.simple_literal.lexical_form.unicode_string }

encode_phone_number_value(V: PhoneNumberValue) → Object

Phone number instance values are plain string objects with a single @value key. No type annotation is included.

{ "@value": V.simple_literal.lexical_form.unicode_string }

encode_external_authority_value(V: ExternalAuthorityValue) → Object

Each kind produces { "@id": <iri>, "rdfs:label": <label> } where "rdfs:label" is omitted when V.label is absent.

| ExternalAuthorityValue kind | "@id" source |
| --- | --- |
| OrcidValue | iri(V.orcid_iri.iri) |
| RorValue | iri(V.ror_iri.iri) |
| DoiValue | iri(V.doi_iri.iri) |
| PubMedIdValue | iri(V.pub_med_iri.iri) |
| RridValue | iri(V.rrid_iri.iri) |
| NihGrantIdValue | iri(V.nih_grant_iri.iri) |

encode_attribute_value(V: AttributeValue) → Object

{ V.attribute_name.unicode_string: encode_value(V.value) }

Nested AttributeValue constructs produce nested objects. Multiple AttributeValue entries for the same instance field are merged into a single flat or nested JSON object in the CTM 1.6.0 representation.
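A rough Python sketch of the merge behaviour follows. The inner values are assumed to be already encoded, merge_attribute_values is a hypothetical helper name, and the handling of duplicate attribute names (the later entry wins) is an assumption, since the text above does not specify it.

```python
from typing import Iterable, Tuple


def encode_attribute_value(name: str, encoded_value: dict) -> dict:
    # A single AttributeValue becomes a one-key object.
    return {name: encoded_value}


def merge_attribute_values(entries: Iterable[Tuple[str, dict]]) -> dict:
    """Hypothetical helper: fold several AttributeValue entries for one
    field into a single JSON object."""
    merged: dict = {}
    for name, encoded_value in entries:
        # Assumption: on a repeated attribute name, the later entry wins.
        merged.update(encode_attribute_value(name, encoded_value))
    return merged
```

A nested AttributeValue simply appears as an inner one-key object, for example {"outer": {"inner": {"@value": "x"}}}.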


12. Instance Encoding

encode_template_instance(I: TemplateInstance, T: Template) → Object

A template instance is encoded by reusing the template’s @context, writing instance identity and provenance metadata, and then encoding each field value and nested template instance slot. The template T is required as a parameter because the context and embedded artifact structure are derived from it rather than from the instance itself.

let fvs  = [ IV in I.instance_values | IV is FieldValue ]
let ntis = [ IV in I.instance_values | IV is NestedTemplateInstance ]
let emb_fields     = [ E in T.embedded_artifacts | E is EmbeddedField ]
let emb_templates  = [ E in T.embedded_artifacts | E is EmbeddedTemplate ]

merge(
  {
    "@context":         encode_template_context(T),
    "@id":              iri(I.template_instance_id),
    "schema:isBasedOn": iri(T.template_id)
  },
  encode_artifact_metadata(I.artifact_metadata),
  { for each EF in emb_fields:
      EF.key: encode_field_value(fv(EF), EF) },
  { for each ET in emb_templates:
      ET.key: encode_nested_template_instance_slot(ntis_for(ET), ET) }
)

where fv(EF) denotes the FieldValue in fvs whose key equals EF.key, and ntis_for(ET) denotes [ NTI in ntis | NTI.key = ET.key ].

Calls: encode_template_context, encode_artifact_metadata, encode_field_value, encode_nested_template_instance_slot


encode_field_value(FV: FieldValue, EF: EmbeddedField) → Object or Array

Encodes a single field’s data within an instance. When the field is multi-valued (per is_multi(EF)) the result is a JSON array of encoded values; when single-valued it is a single encoded value object.

Caution: Consumers of CTM 1.6.0 instances must handle both forms at any given field key — either a plain JSON object or a JSON array. A consumer that always expects an object will silently misread or discard data for multi-valued fields. The cardinality information needed to know which form to expect is carried in the template schema (the "type": "array" wrapper on the field entry in "properties"), not in the instance itself.

if is_multi(EF):
  [ encode_value(V) for each V in FV.values ]

else:
  encode_value(first(FV.values))

Calls: encode_value
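The object-or-array behaviour, and one defensive way for a consumer to cope with it, can be sketched in Python. Values are assumed to be already encoded, the is_multi flag is passed directly instead of being derived from the embedding, and values_at is a hypothetical consumer-side helper that is not part of this specification.

```python
from typing import List, Sequence, Union


def encode_field_value(encoded_values: Sequence[dict], is_multi: bool) -> Union[dict, List[dict]]:
    if is_multi:
        # Multi-valued fields always produce an array, even for one value.
        return list(encoded_values)
    # Single-valued fields produce the first value as a bare object.
    # This assumes at least one value is present.
    return encoded_values[0]


def values_at(instance: dict, key: str) -> List[dict]:
    """Hypothetical consumer helper: read a field as a list regardless of
    whether the instance holds an object or an array at that key."""
    raw = instance[key]
    return raw if isinstance(raw, list) else [raw]
```

Normalising to a list hides the shape difference but not the cardinality: a consumer that needs to know whether the field was declared multi-valued still has to consult the template schema.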


encode_nested_template_instance_slot(NTIs: NestedTemplateInstance+, ET: EmbeddedTemplate) → Object or Array

Encodes a nested template slot within a parent instance. Multi-valued embeddings (per is_multi(ET)) produce a JSON array of encoded child instances; single-valued embeddings produce a single child instance object. Encoding recurses through encode_template_instance.

Let RT = the referenced Template of ET.

if is_multi(ET):
  [ encode_template_instance(NTI, RT) for each NTI in NTIs ]

else:
  encode_template_instance(first(NTIs), RT)

Calls: encode_template_instance
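The mutual recursion between instance encoding and nested-slot encoding can be sketched as follows. This is a heavily simplified Python illustration: templates and instances are plain dicts with hypothetical keys ("id", "fields", "templates", "values", "nested"), field values are assumed to be pre-encoded, and the "@context" and artifact metadata steps are left out.

```python
from typing import List, Union


def encode_template_instance(instance: dict, template: dict) -> dict:
    out = {"@id": instance["id"], "schema:isBasedOn": template["id"]}
    # Field slots: array when multi-valued, bare object otherwise.
    for ef in template.get("fields", []):
        values = instance["values"][ef["key"]]
        out[ef["key"]] = list(values) if ef["multi"] else values[0]
    # Nested template slots recurse via the slot encoder.
    for et in template.get("templates", []):
        children = instance["nested"][et["key"]]
        out[et["key"]] = encode_nested_template_instance_slot(children, et)
    return out


def encode_nested_template_instance_slot(
    children: List[dict], et: dict
) -> Union[dict, List[dict]]:
    rt = et["template"]  # the referenced Template of the embedding
    if et["multi"]:
        return [encode_template_instance(c, rt) for c in children]
    # Single-valued embedding; assumes at least one child instance.
    return encode_template_instance(children[0], rt)
```

Each child is encoded against the referenced template rather than the parent, so "schema:isBasedOn" on a nested instance points at the nested template's identifier.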


13. Annotations

Annotation constructs on CatalogMetadata have no standardised CTM 1.6.0 equivalent. They are encoded as top-level properties on the artifact object, using the annotation's property IRI as the JSON key.

encode_annotation(A: Annotation) → { key: value }

key:   iri(A.property)

value: if A.body is AnnotationStringValue:
         { "@value": A.body.value, "@language"?: A.body.lang }
         (or the raw lexical form string if simpler form is preferred)
       if A.body is AnnotationIriValue:
         iri(A.body.iri)

Implementations SHOULD confirm that annotation IRI keys are valid within the CTM 1.6.0 @context before including them.
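A small Python sketch of the annotation mapping is given below. It emits only the object form for string bodies, not the raw-string alternative, and it models the annotation body as a tagged tuple, which is an assumption made for this example and not the Structural Model's representation.

```python
from typing import Optional, Tuple, Union

# Assumed stand-ins for AnnotationStringValue and AnnotationIriValue.
StringBody = Tuple[str, str, Optional[str]]  # ("string", value, lang or None)
IriBody = Tuple[str, str]                    # ("iri", iri)


def encode_annotation(property_iri: str, body: Union[StringBody, IriBody]) -> dict:
    kind = body[0]
    if kind == "string":
        _, value, lang = body
        encoded = {"@value": value}
        # "@language" is optional and written only when a tag is present.
        if lang is not None:
            encoded["@language"] = lang
        return {property_iri: encoded}
    if kind == "iri":
        # IRI bodies are written as the bare IRI string.
        return {property_iri: body[1]}
    raise ValueError(f"unknown annotation body kind: {kind}")
```

The returned one-key object is intended to be merged into the artifact object at the top level, keyed by the property IRI.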


14. Known Gaps and Lossy Areas

  1. skos:prefLabel on StaticTemplateField — Real CTM 1.6.0 output includes a skos:prefLabel key at the top level of static field objects (presentation components). This is not currently produced by encode_presentation_component because encode_artifact_metadata maps preferred labels to rdfs:label. The relationship between the Structural Model’s preferred_label and CTM 1.6.0’s skos:prefLabel on static fields needs clarification.

  2. propertyDescriptions in _ui — Real CTM 1.6.0 templates include a "propertyDescriptions" map inside _ui, keyed by embedded artifact key, containing the description/help text for each field. This is not currently produced by encode_template_ui. The source of these descriptions (whether from the EmbeddedField or the referenced Field) needs to be confirmed and the function updated accordingly.

  3. AlternativeLabel* on DescriptiveMetadata — No CTM 1.6.0 equivalent; omitted.

  4. PermissibleValue metadata — CTM 1.6.0 literals-array entries carry only a single label string, with no slot for per-value Description, ontology Meaning bindings, or multilingual Label localizations. encode_permissible_value drops all of these and emits the value’s canonical Token as the legacy label. Spec-level enum defaults (SingleValuedEnumFieldSpec.defaultValue and MultiValuedEnumFieldSpec.defaultValues) are emitted as defaultValue / defaultValues keys under _valueConstraints; CTM 1.6.0 tooling support for those keys is not guaranteed. The legacy selectedByDefault per-option flag is no longer produced — the Structural Model now represents enum defaults exclusively at the spec level. Embedding-level defaults (EmbeddedSingleValuedEnumField.defaultValue / EmbeddedMultiValuedEnumField.defaultValue) have no CTM 1.6.0 equivalent and are dropped.

  5. Default values for link, email, phone number, and external authority field specs — CTM 1.6.0 _valueConstraints.defaultValue is primarily defined for text fields. Default value encoding for LinkDefaultValue, EmailDefaultValue, PhoneNumberDefaultValue, and external authority defaults is implementation-defined.

  6. AttributeValue instance representation — CTM 1.6.0 uses additionalProperties on the instance object for attribute-value fields. The instance-level encoding of AttributeValue injects key-value pairs directly into the parent instance object rather than nesting them under a field key.

  7. Recommended vs Optional — Both map to "requiredValue": false in _valueConstraints and neither contributes to the "required" array. The distinction is entirely lost in CTM 1.6.0 output.

  8. Unit as IRI — CTM 1.6.0 unitOfMeasure is a plain string. The IRI string value is used directly; any human-readable label associated with Unit is omitted.

  9. Language-tagged text values — CTM 1.6.0 does not model language-tagged strings explicitly. The @language key is included in the encoded value object as a JSON-LD extension; support in CTM 1.6.0 tooling is not guaranteed.

  10. External authority inputType values — The inputType string values for ORCID, ROR, DOI, PubMed, RRID, and NIH Grant fields are not standardised in the published CTM 1.6.0 specification and SHOULD be confirmed against the deployed implementation.

  11. ImageComponent and YoutubeVideoComponent accessibility metadata — The label (alt text / caption title) and description (longer accessibility text) slots on ImageComponent and YoutubeVideoComponent have no CTM 1.6.0 equivalent and are dropped. Conforming consumers that require accessibility metadata MUST work with the Structural Model wire form rather than the CTM 1.6.0 mapping.