CEDAR Template Model Specification
This specification defines the structural model for the CEDAR Template Model and its concrete JSON wire form.
It separates schema definition, presentation structure, reusable artifacts, contextual embedding, and instance data, and is layered as an abstract grammar paired with a JSON wire grammar, encoding rules, host-language bindings, and a normative validation algorithm.
The core concepts are Artifact, SchemaArtifact, Template, Field, PresentationComponent, EmbeddedArtifact, TemplateInstance, and InstanceValue. Every concrete artifact carries a top-level ModelVersion identifying the version of the CEDAR structural model it conforms to.
Scope
This specification defines:
- the core metamodel and abstract grammar
- artifact metadata, identity, lifecycle, and versioning
- the field-spec system (twenty concrete field families)
- the presentation-component model
- the instance model
- the JSON wire form (encoding rules, kind discriminator, wrapper collapse, property-name map)
- encoding and decoding semantics, including a normative error model
- a canonical validation algorithm with explicit error reports
- host-language binding idioms for TypeScript, Java, and Python
- a derived RDF projection of Value instances
- a cross-language conformance test suite
Document Structure
- notation.md — notation conventions used throughout the specification.
- metamodel.md — conceptual overview: principal categories, the field hierarchy, the layered specification, and cross-cutting conventions.
- grammar.md — the abstract EBNF-style grammar, including the FieldSpec system, defaults, primitive lexical-form productions, and related constraints.
- wire-grammar.md — the JSON wire grammar: kind rule, wrapper collapse, encoding rules, and the property-name map for every production.
- serialization.md — encoding and decoding semantics: round-tripping, the wrapping principle, the error model, and worked examples.
- bindings.md — host-language idioms (TypeScript, Java, Python) and codebase-organisation guidance.
- validation.md — the canonical validation algorithm, with per-step error reports.
- presentation.md — the PresentationComponent family.
- instances.md — TemplateInstance and InstanceValue semantics, including the explicit “defaults are not part of instances” rule.
- rdf-projection.md — the derived projection from CEDAR Value instances to RDF.
- index-of-productions.md — auto-generated A–Z index of every production in the specification.
A cross-language conformance test suite accompanies the specification: 114 fixtures (91 valid round-trip cases, 23 invalid cases with expected-error reports) embedded into serialization.md §8 and intended as a binding-acceptance contract.
Core Design Principles
- Schema definition MUST be separated from instance data.
- Semantic structure MUST be separated from presentation.
- Templates MUST contain embedded artifacts rather than directly containing Field, Template, or PresentationComponent.
- PresentationComponent MUST NOT contribute instance values.
- Defaults are UI/UX initialisation only and never appear in TemplateInstance artifacts or in the RDF projection.
- Terminology MUST remain stable across this specification.
Open Questions
- Should the model support template-local (on-the-fly) fields without identity or versioning? See issue #1.
- Are the Name, Description, PreferredLabel, and AlternativeLabel properties on ArtifactMetadata all pulling their weight, or is there redundancy worth simplifying? See issue #2.
- Should instance structures eventually allow path-based keys in addition to EmbeddedArtifactKey?
- Should option sets for some FieldSpec variants become reusable artifacts?
Notation
This section states the terminology and naming conventions shared across the rest of the specification.
The production notation for the abstract grammar is defined in spec/grammar.md.
Conformance Language
The words MUST, MUST NOT, SHOULD, and MAY are used to express normative requirements when appropriate.
Naming Conventions
Defined terms use the terminology in this specification exactly. In particular, the following terms are normative and stable.
Schema and artifact terms:
- Artifact
- SchemaArtifact
- Template
- Field
- PresentationComponent
Embedding terms:
- EmbeddedArtifact
- EmbeddedField
- EmbeddedTemplate
- EmbeddedPresentationComponent
- EmbeddedArtifactKey
Instance terms:
- TemplateInstance
- InstanceValue
- FieldValue
- NestedTemplateInstance
Typing terms:
- FieldSpec
Value Notation
Value denotes an instance-level data value in the grammar.
Each Value family carries its content directly: a LexicalForm, an optional LanguageTag, an explicit datatype IRI (where one is configurable), or a boolean payload, depending on the family. There is no separate RDF-Literal layer in the abstract grammar; an RDF projection is defined separately in rdf-projection.md.
The normative structure and semantics of values are defined in the Values section of grammar.md.
Metamodel
Overview
This section provides a conceptual overview of the CEDAR Template Model. Its purpose is to describe the principal categories of constructs, the relationships among them, and the design rationale behind key decisions. It is intended as a companion to the formal abstract grammar defined in spec/grammar.md, which is the normative specification. Readers seeking precise structural definitions, production rules, or normative constraints should consult grammar.md directly.
The CEDAR Template Model is organised around three principal concerns: reusable schema artifacts that define structure, embedding constructs that contextualise those artifacts within a specific template, and template instances that record data conforming to a template.
Principal Categories
Artifact is the broadest category in the model. Every artifact carries a repository-assigned identifier, descriptive metadata, lifecycle metadata, and zero or more annotations. SchemaArtifact, PresentationComponent, and TemplateInstance are the three principal subclasses.
A SchemaArtifact is a reusable artifact that defines schema structure. Template and Field are the two concrete schema artifact kinds. Both carry versioning metadata — semantic version, publication status, optional lineage references — in addition to the common artifact metadata; see grammar.md for the normative shape. Independently of schema versioning, every concrete Artifact (every Template, every TemplateInstance, every Field, and every PresentationComponent) carries a top-level ModelVersion identifying the version of the CEDAR structural model the artifact conforms to.
A Template is the central container of the model. It specifies an ordered arrangement of EmbeddedArtifact constructs and defines the schema that TemplateInstance constructs must conform to.
A Field is an abstract category refined into a fixed set of typed concrete variants. Each concrete field carries a matching FieldSpec that specifies its value semantics and configuration: the field artifact carries identity, metadata, and lifecycle information, while the FieldSpec carries value rules and rendering properties. The full set of concrete variants, their groupings under abstract sub-categories (NumericField, TemporalField, EnumField, ContactField, ExternalAuthorityField), and the rationale behind the splits are documented in grammar.md and indexed in the Field Families chapter.
A PresentationComponent is a reusable non-data-bearing artifact that contributes presentational or instructional structure within a template. Examples include rich text, images, YouTube videos, section breaks, and page breaks. Presentation components do not produce instance values.
An EmbeddedArtifact contextualises a reusable artifact within a specific Template. There are three forms, and they carry different subsets of template-local properties:
- EmbeddedField carries the full property set: an EmbeddedArtifactKey, a typed reference to the embedded Field, and optional ValueRequirement, Cardinality, Visibility, family-typed defaultValue, LabelOverride, HelpTextOverride, and Property (a semantic property IRI for the embedding site).
- EmbeddedTemplate carries the embedding key, the embedded template’s identifier, and optional ValueRequirement, Cardinality, Visibility, LabelOverride, and Property. It carries no defaultValue (templates do not have value-typed defaults).
- EmbeddedPresentationComponent carries only the embedding key, the embedded presentation component’s identifier, and an optional Visibility. It contributes no instance data and exists purely to contribute presentational structure.
An EmbeddedArtifactKey is the local identifier of an EmbeddedArtifact within its containing Template. It is the mechanism that connects template structure to instance structure.
A TemplateInstance is an artifact that records data conforming to a Template. It contains FieldValue and NestedTemplateInstance constructs keyed by EmbeddedArtifactKey, corresponding to the data-bearing embedded artifacts of the referenced template.
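The key-based link between template structure and instance structure can be sketched in Python as follows. This is an illustrative sketch only: the class `Embedded`, the `form` labels, and both helper functions are hypothetical names, not part of the specification, and the check covers nothing beyond key correspondence.

```python
from __future__ import annotations
from dataclasses import dataclass

# Hypothetical, heavily simplified stand-in for an EmbeddedArtifact.
@dataclass(frozen=True)
class Embedded:
    key: str   # plays the role of EmbeddedArtifactKey
    form: str  # "field", "template", or "presentation" (assumed labels)

def data_bearing_keys(embedded: list[Embedded]) -> set[str]:
    """Keys that may appear in an instance: fields and nested templates only."""
    return {e.key for e in embedded if e.form in ("field", "template")}

def unexpected_keys(embedded: list[Embedded], instance: dict) -> set[str]:
    """Instance keys that match no data-bearing embedded artifact."""
    return set(instance) - data_bearing_keys(embedded)

template = [
    Embedded("title", "field"),
    Embedded("intro", "presentation"),  # contributes no instance data
    Embedded("address", "template"),
]
instance = {"title": "Study A", "address": {"city": "Oslo"}, "intro": "text"}
```

Here `unexpected_keys(template, instance)` flags `intro`, because a presentation component never contributes an instance value.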
The diagram below sketches how the principal categories connect at runtime. Schema-side classes (definitions) are on the right; instance-side classes (data records) are on the left. The horizontal arrows show the two cross-side links: a TemplateInstance is bound to its Template by IRI (templateRef), and each FieldValue is joined to its corresponding EmbeddedField by an EmbeddedArtifactKey. The schema-side downward chain (Template → EmbeddedField → Field → FieldSpec) is the structural surface a template author defines; the instance-side downward chain (TemplateInstance → FieldValue → Value) is the runtime data the schema admits.
For the within-Field typed-variant hierarchy (the 20 concrete field families and their abstract groupings), see the next section.
Field Hierarchy
The diagram below shows the complete Field hierarchy and the FieldSpec each concrete field variant carries.
classDiagram
class Field {
<<abstract>>
}
class TemporalField {
<<abstract>>
}
class EnumField {
<<abstract>>
}
class ContactField {
<<abstract>>
}
class ExternalAuthorityField {
<<abstract>>
}
class TextField
class NumericField {
<<abstract>>
}
class DateField
class TimeField
class DateTimeField
class ControlledTermField
class SingleValuedEnumField
class MultiValuedEnumField
class LinkField
class EmailField
class PhoneNumberField
class OrcidField
class RorField
class DoiField
class PubMedIdField
class RridField
class NihGrantIdField
class AttributeValueField
class IntegerNumberField
class RealNumberField
class BooleanField
class TextFieldSpec
class IntegerNumberFieldSpec
class RealNumberFieldSpec
class BooleanFieldSpec
class DateFieldSpec
class TimeFieldSpec
class DateTimeFieldSpec
class ControlledTermFieldSpec
class SingleValuedEnumFieldSpec
class MultiValuedEnumFieldSpec
class LinkFieldSpec
class EmailFieldSpec
class PhoneNumberFieldSpec
class OrcidFieldSpec
class RorFieldSpec
class DoiFieldSpec
class PubMedIdFieldSpec
class RridFieldSpec
class NihGrantIdFieldSpec
class AttributeValueFieldSpec
Field <|-- TextField
Field <|-- NumericField
Field <|-- BooleanField
Field <|-- TemporalField
Field <|-- ControlledTermField
Field <|-- EnumField
Field <|-- LinkField
Field <|-- ContactField
Field <|-- ExternalAuthorityField
Field <|-- AttributeValueField
NumericField <|-- IntegerNumberField
NumericField <|-- RealNumberField
TemporalField <|-- DateField
TemporalField <|-- TimeField
TemporalField <|-- DateTimeField
EnumField <|-- SingleValuedEnumField
EnumField <|-- MultiValuedEnumField
ContactField <|-- EmailField
ContactField <|-- PhoneNumberField
ExternalAuthorityField <|-- OrcidField
ExternalAuthorityField <|-- RorField
ExternalAuthorityField <|-- DoiField
ExternalAuthorityField <|-- PubMedIdField
ExternalAuthorityField <|-- RridField
ExternalAuthorityField <|-- NihGrantIdField
TextField --> TextFieldSpec : carries
IntegerNumberField --> IntegerNumberFieldSpec : carries
RealNumberField --> RealNumberFieldSpec : carries
BooleanField --> BooleanFieldSpec : carries
DateField --> DateFieldSpec : carries
TimeField --> TimeFieldSpec : carries
DateTimeField --> DateTimeFieldSpec : carries
ControlledTermField --> ControlledTermFieldSpec : carries
SingleValuedEnumField --> SingleValuedEnumFieldSpec : carries
MultiValuedEnumField --> MultiValuedEnumFieldSpec : carries
LinkField --> LinkFieldSpec : carries
EmailField --> EmailFieldSpec : carries
PhoneNumberField --> PhoneNumberFieldSpec : carries
OrcidField --> OrcidFieldSpec : carries
RorField --> RorFieldSpec : carries
DoiField --> DoiFieldSpec : carries
PubMedIdField --> PubMedIdFieldSpec : carries
RridField --> RridFieldSpec : carries
NihGrantIdField --> NihGrantIdFieldSpec : carries
AttributeValueField --> AttributeValueFieldSpec : carries
Layered Specification
The CEDAR Template Model is specified across four normative chapters, each with a different concern:
| Chapter | Concern |
|---|---|
| grammar.md | The abstract grammar — the productions, the categories, and the structural relationships that constitute the model. The authoritative definition. |
| wire-grammar.md | The JSON wire form — the concrete shape every production takes when encoded as JSON, plus the encoding rules (kind discriminator, wrapper collapse, property names). |
| serialization.md | Encoding and decoding semantics — round-tripping, the error model, NFC normalisation, integer-string fallback, default-value semantics. |
| bindings.md | Host-language idioms for TypeScript, Java, and Python, plus codebase-organisation guidance. |
A reusable conformance test suite accompanies the specification, embedded into serialization.md §8 via mdBook {{#include}}. It defines a cross-binding acceptance contract.
Cross-Cutting Conventions
A few structural conventions thread through every chapter:
- The kind discriminator. Every member of a discriminator: kind union (e.g. every Field family, every Value family, every EmbeddedField family) carries a kind property identifying its production, at every position it occupies on the wire. Productions that are not members of any kind-discriminated union (Cardinality, Annotation, LabelOverride, Property, etc.) never carry kind. The rule is uniform — see wire-grammar.md §1.5.
- Two-layer default values. Every concrete field family except AttributeValueField carries two layers of optional default value: a field-level default on the reusable Field’s FieldSpec, and an embedding-level default on the EmbeddedXxxField inside a Template. The embedding-level default overrides the field-level default when both are present. Defaults are UI/UX initialisation only — they do not appear in TemplateInstance artifacts and do not affect the RDF projection. See grammar.md §Defaults.
- Pinned lexical-form productions. The grammar’s primitive string types (SemanticVersion, IriString, Bcp47Tag, Iso8601DateTimeLexicalForm, AsciiIdentifier, IntegerLexicalForm) are normatively pinned to specific external specifications and regular expressions. See grammar.md §Primitive String Types.
- The error model. Conforming decoders and encoders report errors in three normatively-defined categories — wireShape, lexical, and structural — each with a JSON-pointer path locating the offending slot. See serialization.md §9.
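As an illustration of how the kind discriminator and the error model might interact, the Python sketch below reports union members whose kind is missing or unrecognised. Only the three category names come from this specification. The `ErrorReport` shape, the helper names, the sample kind strings, and the choice to classify a bad kind as wireShape are assumptions; serialization.md §9 remains the normative reference.

```python
from __future__ import annotations
from dataclasses import dataclass

# Category names as listed in the text; everything else here is assumed.
CATEGORIES = ("wireShape", "lexical", "structural")

@dataclass(frozen=True)
class ErrorReport:
    category: str
    path: str      # JSON Pointer to the offending slot
    message: str

def pointer(*tokens) -> str:
    """Build a JSON Pointer, escaping '~' as '~0' and '/' as '~1'."""
    return "".join(
        "/" + str(t).replace("~", "~0").replace("/", "~1") for t in tokens
    )

def check_kinds(members: list, known: set[str], base: tuple) -> list[ErrorReport]:
    """Report union members whose 'kind' is missing or not in `known`."""
    errors = []
    for i, member in enumerate(members):
        kind = member.get("kind") if isinstance(member, dict) else None
        if kind not in known:
            # Assumption: a bad discriminator counts as a wireShape error.
            errors.append(ErrorReport(
                "wireShape", pointer(*base, i, "kind"),
                f"missing or unknown kind: {kind!r}"))
    return errors
```

For example, `check_kinds([{"kind": "TextField"}, {"label": "x"}], {"TextField"}, ("embedded",))` yields a single report whose path is `/embedded/1/kind`.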
Abstract Grammar
This section defines the abstract structure of the CEDAR Template Model using an EBNF-style grammar.
The grammar defines the abstract syntactic structure of the model. It specifies the kinds of constructs that exist and how they are composed, but it does not define a concrete textual or data serialization such as JSON, YAML, RDF, or a functional-style syntax.
Accordingly, a production in this grammar describes abstract structure rather than a directly parseable text form. In particular, a production such as Template ::= template( ... ) does not mean:
- the literal token template must appear in a file
- parentheses must appear in a file
- whitespace must be used in a particular way in a file
- the production is itself a concrete serialization format
The following notation is used throughout this grammar:
::=     defined as
|       alternative production
X*      zero or more occurrences of X
X+      one or more occurrences of X
[X]     optional occurrence of X
(...)   groups the named components of an abstract constructor form
Whitespace separates symbols within a production.
Production names use UpperCamelCase. A production name denotes the abstract category being defined, such as Template, Field, or DateFieldSpec.
Abstract constructor forms use lower_snake_case. In this document, a constructor form is the schematic form used to show how an abstract construct is composed, such as template(...), field(...), or date_field_spec(...). The difference between UpperCamelCase production names and lower_snake_case constructor forms is purely a visual distinction used to make it clear when the grammar is naming a category and when it is showing the abstract form of a construct belonging to that category.
For example, in the production
Template ::= template(
TemplateId
CatalogMetadata
SchemaArtifactVersioning
Title
[TemplateRenderingHint]
EmbeddedArtifact*
)
Template is the production being defined, while template(...) denotes the abstract constructor form of that construct; in other words, it shows the components of a Template and how they are composed.
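The relationship between a production name and its constructor form can be sketched with a small Python helper. The function name `constructor_form` is hypothetical, and the conversion rule is inferred from the productions shown in this document rather than stated normatively.

```python
import re

def constructor_form(production_name: str) -> str:
    """Convert an UpperCamelCase production name to lower_snake_case."""
    # Insert "_" before every uppercase letter except the first character.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", production_name).lower()

examples = ["Template", "DateFieldSpec", "PubMedIdField"]
forms = [constructor_form(name) for name in examples]
```

Under this rule, `DateFieldSpec` maps to `date_field_spec` and `PubMedIdField` maps to `pub_med_id_field`, matching the constructor forms used in the grammar.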
A conceptual overview of the model — describing the principal categories, their relationships, and the design rationale behind key decisions — is provided in spec/metamodel.md. The present document is the normative formal specification.
Contents
- Kernel Grammar
- Artifact Identity
- Artifact Metadata
- Scalar and Datatype Leaves
- Values
- Embedded Artifact Properties
- Field Specs
- Presentation Components
- Field Spec And Value Correspondence
- Instances
- Open Questions
Kernel Grammar
The kernel grammar defines the primary abstract categories of the model and the core schema-level structure that connects them. It introduces reusable schema artifacts, templates, and the embedding constructs through which templates assemble fields, nested templates, and presentation components. Subsequent sections refine the metadata, field-spec families, instance structures, and supporting constructs referenced here.
The diagram below gives an overview of the kernel. Template is the central container: it holds an ordered sequence of EmbeddedArtifact constructs, each of which contextualises a reusable artifact — a Field, a nested Template, or a PresentationComponent — within that specific template. A TemplateInstance records data conforming to a Template. Concrete Field variants and FieldSpec configurations are omitted for clarity.
%%{init: {'themeVariables': {'fontSize': '12px'}}}%%
classDiagram
class Artifact {
<<abstract>>
}
class SchemaArtifact {
<<abstract>>
}
class Field {
<<abstract>>
}
class Template {
TemplateId
ModelVersion
CatalogMetadata
SchemaArtifactVersioning
Title
[TemplateRenderingHint]
[Header]
[Footer]
}
class PresentationComponent {
PresentationComponentId
ModelVersion
CatalogMetadata
}
class TemplateInstance {
TemplateInstanceId
ModelVersion
CatalogMetadata
}
class EmbeddedArtifact {
<<abstract>>
}
class EmbeddedField {
EmbeddedArtifactKey
[ValueRequirement]
[Cardinality]
[Visibility]
[defaultValue]
[LabelOverride]
[HelpTextOverride]
[Property]
}
class EmbeddedTemplate {
EmbeddedArtifactKey
[ValueRequirement]
[Cardinality]
[Visibility]
[LabelOverride]
[Property]
}
class EmbeddedPresentationComponent {
EmbeddedArtifactKey
[Visibility]
[LabelOverride]
}
class Property {
PropertyIri
[PropertyLabel]
}
Artifact <|-- SchemaArtifact
Artifact <|-- PresentationComponent
Artifact <|-- TemplateInstance
SchemaArtifact <|-- Field
SchemaArtifact <|-- Template
Template "1" *-- "0..*" EmbeddedArtifact : contains ordered
EmbeddedArtifact <|-- EmbeddedField
EmbeddedArtifact <|-- EmbeddedTemplate
EmbeddedArtifact <|-- EmbeddedPresentationComponent
EmbeddedField --> Field : references
EmbeddedTemplate --> Template : references
EmbeddedPresentationComponent --> PresentationComponent : references
EmbeddedField ..> Property : carries
EmbeddedTemplate ..> Property : carries
TemplateInstance --> Template : conforms to
Core Structure
This subsection establishes the top-level taxonomy of the model and introduces its two principal concrete schema artifacts. Artifact is the broadest category, encompassing reusable schema artifacts, presentation components, and template instances. Template is defined here as the central container that organises embedded artifacts into a structured form. Field is introduced as an abstract category whose concrete variants are defined in the following subsection.
Artifact ::= SchemaArtifact
| PresentationComponent
| TemplateInstance
SchemaArtifact ::= Field
| Template
Template is a concrete schema artifact and the central container of the model. It assembles EmbeddedArtifact constructs into a structured form and defines the schema that TemplateInstance constructs conform to.
Template ::= template(
TemplateId
ModelVersion
CatalogMetadata
SchemaArtifactVersioning
Title
[TemplateRenderingHint]
[Header]
[Footer]
EmbeddedArtifact*
)
Title ::= title(
MultilingualString
)
Label ::= label(
MultilingualString
)
Header ::= header(
MultilingualString
)
Footer ::= footer(
MultilingualString
)
TemplateRenderingHint ::= template_rendering_hint(
[HelpDisplayMode]
)
HelpDisplayMode ::= "inline" | "tooltip" | "both" | "none"
Header and Footer denote optional human-readable textual content displayed at the top and bottom of a rendered template respectively. Each is a MultilingualString carrying one or more language-tagged localizations of the same conceptual text.
TemplateRenderingHint carries form-level UX configuration. Distinct from the per-field-spec RenderingHint family, which configures how a single field is rendered, TemplateRenderingHint configures behaviour that applies to the form as a whole. Currently the only slot is HelpDisplayMode; future revisions may add further form-level UX switches, each with its own cascade rule for embedded templates.
HelpDisplayMode selects how field HelpText — and any per-embedding HelpTextOverride — is presented at form-render time:
- "inline" — HelpText renders as visible text adjacent to the field, typically beneath the input.
- "tooltip" — HelpText renders as a hover/focus tooltip, triggered by a ? icon or similar affordance.
- "both" — both presentations are emitted. Useful for accessibility contexts where redundancy is preferred.
- "none" — the field’s HelpText is not displayed at form-render time. The content remains part of the model (visible to alternative renderers, to the RDF projection, and to catalog displays) but the form-rendering layer suppresses it.
When HelpDisplayMode is absent — either because the Template carries no TemplateRenderingHint, or because the hint omits the slot — the default behaviour is "inline".
The cascade rule for nested templates is a rendering-time concern, not a structural validation constraint, and is normatively stated in presentation.md: when a Template is embedded inside another Template, the inner template’s HelpDisplayMode is ignored for help-text rendering; the enclosing template’s setting applies to every field within the rendered form, including fields contributed by nested templates. The inner template’s own HelpDisplayMode applies only when the template is rendered standalone.
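The default and the cascade rule can be sketched together in Python. The property name `helpDisplayMode`, the dictionary representation of a rendering hint, and both function names are assumptions made for illustration; the wire-form property names are defined in wire-grammar.md.

```python
from __future__ import annotations
from typing import Optional

HELP_DISPLAY_MODES = ("inline", "tooltip", "both", "none")
DEFAULT_MODE = "inline"  # applies when the hint or the slot is absent

def own_mode(rendering_hint: Optional[dict]) -> str:
    """Mode declared by one template, falling back to the default."""
    if not rendering_hint:
        return DEFAULT_MODE
    return rendering_hint.get("helpDisplayMode") or DEFAULT_MODE

def effective_mode(hint_chain: list[Optional[dict]]) -> str:
    """Mode applied to a field during rendering.

    `hint_chain` lists rendering hints from the outermost (rendered)
    template down to the template that directly embeds the field.
    Only the outermost entry is consulted; nested settings are ignored.
    """
    return own_mode(hint_chain[0])
```

A field inside a nested template that declares "both" still renders with "none" when the enclosing template declares "none", and with "inline" when the enclosing template declares nothing.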
The following productions introduce the abstract field categories. Field remains an abstract category, while the intermediate categories group related concrete field artifacts for readability and shared semantics.
Field ::= TextField
| NumericField
| BooleanField
| TemporalField
| ControlledTermField
| EnumField
| LinkField
| ContactField
| ExternalAuthorityField
| AttributeValueField
NumericField ::= IntegerNumberField
| RealNumberField
TemporalField ::= DateField
| TimeField
| DateTimeField
EnumField ::= SingleValuedEnumField
| MultiValuedEnumField
ContactField ::= EmailField
| PhoneNumberField
ExternalAuthorityField ::= OrcidField
| RorField
| DoiField
| PubMedIdField
| RridField
| NihGrantIdField
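The field hierarchy above can be mirrored as plain data, which makes the count of concrete families easy to check. The Python sketch below copies the family names from the productions; the dictionary layout and the helper names are illustrative assumptions.

```python
from __future__ import annotations
from typing import Optional

# Concrete families per abstract category; None marks direct children of Field.
FIELD_GROUPS: dict[Optional[str], list[str]] = {
    None: ["TextField", "BooleanField", "ControlledTermField",
           "LinkField", "AttributeValueField"],
    "NumericField": ["IntegerNumberField", "RealNumberField"],
    "TemporalField": ["DateField", "TimeField", "DateTimeField"],
    "EnumField": ["SingleValuedEnumField", "MultiValuedEnumField"],
    "ContactField": ["EmailField", "PhoneNumberField"],
    "ExternalAuthorityField": ["OrcidField", "RorField", "DoiField",
                               "PubMedIdField", "RridField", "NihGrantIdField"],
}

def concrete_families() -> list[str]:
    """All concrete field families, sorted by name."""
    return sorted(name for group in FIELD_GROUPS.values() for name in group)

def abstract_parent(family: str) -> Optional[str]:
    """Intermediate abstract category of a family, or None if there is none."""
    for parent, members in FIELD_GROUPS.items():
        if family in members:
            return parent
    raise KeyError(family)
```

`len(concrete_families())` evaluates to 20, in line with the twenty concrete field families named in the scope of this specification.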
Concrete Field Artifacts
Each concrete Field variant carries six components: a typed artifact identifier that permanently identifies the reusable field; a ModelVersion identifying the version of the CEDAR structural model the artifact conforms to; CatalogMetadata providing the descriptive, lifecycle, and annotation metadata used in catalog and registry contexts; SchemaArtifactVersioning providing the version, status, and lineage information common to all schema artifacts; a typed FieldSpec that specifies the value semantics and configuration for that field category; and a Label that carries the rendered question text shown to users at data-entry time. The identifier, FieldSpec, and Label are specific to each concrete variant; ModelVersion, CatalogMetadata, and SchemaArtifactVersioning are uniform across all fields. Each concrete Field MAY additionally carry an optional HelpText. The groupings below mirror the abstract Field hierarchy defined in Core Structure.
TextField, BooleanField, and the two numeric field families (IntegerNumberField and RealNumberField) are the simple scalar field families. Each carries the most basic value semantics — free text, true / false, exact integer values, and real-valued numbers respectively.
TextField ::= text_field(
TextFieldId
ModelVersion
CatalogMetadata
SchemaArtifactVersioning
TextFieldSpec
Label
[HelpText]
)
BooleanField ::= boolean_field(
BooleanFieldId
ModelVersion
CatalogMetadata
SchemaArtifactVersioning
BooleanFieldSpec
Label
[HelpText]
)
The numeric field variants correspond to the NumericField abstract category. They share the broader concept of numeric content but split semantically: IntegerNumberField carries arbitrary-precision integer values (no fractional part); RealNumberField carries real-valued numbers (decimal arbitrary precision, or IEEE 754 single- or double-precision floating point). The split is principled: integer arithmetic is exact and closed under the usual operations, whereas real-valued arithmetic carries approximation concerns. See Field Specs for the per-family configuration.
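The rationale for the split can be seen with Python's built-in numeric types, used here only as stand-ins: `int` for arbitrary-precision integers, `float` for IEEE 754 double precision, and `decimal.Decimal` for decimal arithmetic. How the specification's own lexical forms and datatypes map to host-language types is a matter for bindings.md, so treat this as an illustrative sketch.

```python
from decimal import Decimal

# Integer arithmetic is exact, even far beyond 64-bit range.
big = 2**200 + 1
integer_round_trip = (big * 3) // 3 == big

# Binary floating point cannot represent 0.1, 0.2, or 0.3 exactly.
float_sum_matches = (0.1 + 0.2 == 0.3)

# Decimal arithmetic represents these short decimal fractions exactly.
decimal_sum_matches = Decimal("0.1") + Decimal("0.2") == Decimal("0.3")

# A double cannot distinguish 2**53 + 1 from 2**53.
double_collapses = float(2**53 + 1) == float(2**53)
```

Here `integer_round_trip`, `decimal_sum_matches`, and `double_collapses` are true while `float_sum_matches` is false, which is the kind of approximation concern that motivates keeping integer and real-valued fields apart.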
IntegerNumberField ::= integer_number_field(
IntegerNumberFieldId
ModelVersion
CatalogMetadata
SchemaArtifactVersioning
IntegerNumberFieldSpec
Label
[HelpText]
)
RealNumberField ::= real_number_field(
RealNumberFieldId
ModelVersion
CatalogMetadata
SchemaArtifactVersioning
RealNumberFieldSpec
Label
[HelpText]
)
The temporal field variants correspond to the TemporalField abstract category. Each is typed to a distinct temporal semantic — date, time of day, or combined date-time — and carries its own FieldSpec with precision and rendering options appropriate to that category.
DateField ::= date_field(
DateFieldId
ModelVersion
CatalogMetadata
SchemaArtifactVersioning
DateFieldSpec
Label
[HelpText]
)
TimeField ::= time_field(
TimeFieldId
ModelVersion
CatalogMetadata
SchemaArtifactVersioning
TimeFieldSpec
Label
[HelpText]
)
DateTimeField ::= date_time_field(
DateTimeFieldId
ModelVersion
CatalogMetadata
SchemaArtifactVersioning
DateTimeFieldSpec
Label
[HelpText]
)
ControlledTermField supports values drawn from declared ontology sources. LinkField carries a single IRI-valued hyperlink.
ControlledTermField ::= controlled_term_field(
ControlledTermFieldId
ModelVersion
CatalogMetadata
SchemaArtifactVersioning
ControlledTermFieldSpec
Label
[HelpText]
)
LinkField ::= link_field(
LinkFieldId
ModelVersion
CatalogMetadata
SchemaArtifactVersioning
LinkFieldSpec
Label
[HelpText]
)
SingleValuedEnumField and MultiValuedEnumField correspond to the EnumField abstract category and are the two concrete enum field variants. They differ in whether they permit exactly one selection or multiple simultaneous selections from a declared set of permissible values. The permitted values are declared in the corresponding EnumFieldSpec, and instance values are validated against that set.
SingleValuedEnumField ::= single_valued_enum_field(
SingleValuedEnumFieldId
ModelVersion
CatalogMetadata
SchemaArtifactVersioning
SingleValuedEnumFieldSpec
Label
[HelpText]
)
MultiValuedEnumField ::= multi_valued_enum_field(
MultiValuedEnumFieldId
ModelVersion
CatalogMetadata
SchemaArtifactVersioning
MultiValuedEnumFieldSpec
Label
[HelpText]
)
The contact field variants correspond to the ContactField abstract category and represent human contact identifiers.
EmailField ::= email_field(
EmailFieldId
ModelVersion
CatalogMetadata
SchemaArtifactVersioning
EmailFieldSpec
Label
[HelpText]
)
PhoneNumberField ::= phone_number_field(
PhoneNumberFieldId
ModelVersion
CatalogMetadata
SchemaArtifactVersioning
PhoneNumberFieldSpec
Label
[HelpText]
)
The external authority field variants correspond to the ExternalAuthorityField abstract category. Each represents an identifier issued by a specific external authority system, as described in the External Authority Values section. Each external authority field is associated with format validation specific to its identifier scheme and supports integration with the corresponding resolution service for identifier lookup and verification.
OrcidField ::= orcid_field(
OrcidFieldId
ModelVersion
CatalogMetadata
SchemaArtifactVersioning
OrcidFieldSpec
Label
[HelpText]
)
RorField ::= ror_field(
RorFieldId
ModelVersion
CatalogMetadata
SchemaArtifactVersioning
RorFieldSpec
Label
[HelpText]
)
DoiField ::= doi_field(
DoiFieldId
ModelVersion
CatalogMetadata
SchemaArtifactVersioning
DoiFieldSpec
Label
[HelpText]
)
PubMedIdField ::= pub_med_id_field(
PubMedIdFieldId
ModelVersion
CatalogMetadata
SchemaArtifactVersioning
PubMedIdFieldSpec
Label
[HelpText]
)
RridField ::= rrid_field(
RridFieldId
ModelVersion
CatalogMetadata
SchemaArtifactVersioning
RridFieldSpec
Label
[HelpText]
)
NihGrantIdField ::= nih_grant_id_field(
NihGrantIdFieldId
ModelVersion
CatalogMetadata
SchemaArtifactVersioning
NihGrantIdFieldSpec
Label
[HelpText]
)
AttributeValueField supports open-ended name-value pair data whose attribute names are not fixed at schema definition time.
AttributeValueField ::= attribute_value_field(
AttributeValueFieldId
ModelVersion
CatalogMetadata
SchemaArtifactVersioning
AttributeValueFieldSpec
Label
[HelpText]
)
The concrete field artifacts defined above are reusable schema-level constructs. A reusable Field deliberately does not carry template-local keying, cardinality, visibility, or label override — those properties belong to the embedding context, not to the reusable artifact. To appear within a Template, each field must be included via an EmbeddedField construct (see Embedded Artifacts), which adds that template-local context and governs how the field participates in that specific template.
Each concrete Field artifact MAY carry an optional HelpText slot. HelpText is authored guidance about what the field is asking for and how to answer — text typically rendered alongside the field at form-render time as inline help, as a hover tooltip, or both, controlled by the enclosing Template’s HelpDisplayMode. HelpText is distinct from Description: Description is the artifact-catalog explanation seen when browsing the field registry; HelpText is the guidance shown to the person filling in the form at data-entry time. The two roles often share text but serve different audiences.
HelpText ::= help_text( MultilingualString )
HelpText carries a MultilingualString value: localized authored guidance that may be presented in one or more natural languages. The enclosing Template’s HelpDisplayMode selects the presentation; absence of HelpDisplayMode defaults to "inline" rendering. The "none" arm suppresses rendering but preserves the content in the model (visible to alternative renderers, RDF projection, and catalog displays).
A per-embedding override is also defined: an EmbeddedField MAY carry an optional HelpTextOverride that replaces the field’s canonical HelpText at that embedding site only, mirroring the existing LabelOverride precedent. See Embedded Artifacts for the embedding-site shape.
Embedded Artifacts
An EmbeddedArtifact contextualises a reusable artifact within a specific Template, adding template-local properties that govern how the artifact participates in that context. There are three forms: EmbeddedField, which embeds a data-bearing field; EmbeddedTemplate, which nests a template within the containing template; and EmbeddedPresentationComponent, which contributes presentational structure without producing instance data.
The sequence of EmbeddedArtifact constructs within a Template is significant. The order in which they appear determines the presentation order of embedded artifacts in a rendered template. Conforming implementations MUST preserve this order.
EmbeddedArtifact ::= EmbeddedField
| EmbeddedTemplate
| EmbeddedPresentationComponent
EmbeddedField ::= EmbeddedTextField
| EmbeddedIntegerNumberField
| EmbeddedRealNumberField
| EmbeddedBooleanField
| EmbeddedDateField
| EmbeddedTimeField
| EmbeddedDateTimeField
| EmbeddedControlledTermField
| EmbeddedSingleValuedEnumField
| EmbeddedMultiValuedEnumField
| EmbeddedLinkField
| EmbeddedEmailField
| EmbeddedPhoneNumberField
| EmbeddedOrcidField
| EmbeddedRorField
| EmbeddedDoiField
| EmbeddedPubMedIdField
| EmbeddedRridField
| EmbeddedNihGrantIdField
| EmbeddedAttributeValueField
Every concrete EmbeddedField variant follows the same structural pattern. Each carries: an EmbeddedArtifactKey uniquely identifying the embedding site within the containing Template; a typed field reference identifying the reusable Field being embedded; an optional ValueRequirement specifying whether a value is required, recommended, or optional; an optional Cardinality bounding the permitted number of values; an optional Visibility controlling whether the field is shown in rendered interfaces; an optional defaultValue providing an embedding-specific default whose type is the family-specific Value type (e.g. TextValue for EmbeddedTextField, DateValue for EmbeddedDateField); an optional LabelOverride allowing the template to override the field’s label in this context; an optional HelpTextOverride replacing the field’s HelpText at this embedding site; and an optional Property associating a semantic property IRI with the embedding site. The only variations across concrete EmbeddedField variants are the typed field reference and the typed default value, both of which match the value family of the referenced field.
Two variants, EmbeddedBooleanField and EmbeddedSingleValuedEnumField, depart from this pattern by omitting the [Cardinality] slot. A boolean field is inherently single-valued, and its ValueRequirement slot already distinguishes the meaningful states (required, recommended, optional). A SingleValuedEnumField is similarly single-valued by construction; multi-valued enum embedding is expressed only through EmbeddedMultiValuedEnumField. EmbeddedMultiValuedEnumField further differs in that its embedding-level default is a sequence (EnumValue*) rather than a single optional value, parallel to how multi-valued enum instance values appear as a sequence in FieldValue. In addition, the EmbeddedAttributeValueField production carries no default-value slot.
EmbeddedTextField ↗ TextField ::= embedded_text_field(
EmbeddedArtifactKey
TextFieldId
[ValueRequirement]
[Cardinality]
[Visibility]
[TextValue]
[LabelOverride]
[HelpTextOverride]
[Property]
)
EmbeddedIntegerNumberField ↗ IntegerNumberField ::= embedded_integer_number_field(
EmbeddedArtifactKey
IntegerNumberFieldId
[ValueRequirement]
[Cardinality]
[Visibility]
[IntegerNumberValue]
[LabelOverride]
[HelpTextOverride]
[Property]
)
EmbeddedRealNumberField ↗ RealNumberField ::= embedded_real_number_field(
EmbeddedArtifactKey
RealNumberFieldId
[ValueRequirement]
[Cardinality]
[Visibility]
[RealNumberValue]
[LabelOverride]
[HelpTextOverride]
[Property]
)
EmbeddedBooleanField ↗ BooleanField ::= embedded_boolean_field(
EmbeddedArtifactKey
BooleanFieldId
[ValueRequirement]
[Visibility]
[BooleanValue]
[LabelOverride]
[HelpTextOverride]
[Property]
)
EmbeddedDateField ↗ DateField ::= embedded_date_field(
EmbeddedArtifactKey
DateFieldId
[ValueRequirement]
[Cardinality]
[Visibility]
[DateValue]
[LabelOverride]
[HelpTextOverride]
[Property]
)
EmbeddedTimeField ↗ TimeField ::= embedded_time_field(
EmbeddedArtifactKey
TimeFieldId
[ValueRequirement]
[Cardinality]
[Visibility]
[TimeValue]
[LabelOverride]
[HelpTextOverride]
[Property]
)
EmbeddedDateTimeField ↗ DateTimeField ::= embedded_date_time_field(
EmbeddedArtifactKey
DateTimeFieldId
[ValueRequirement]
[Cardinality]
[Visibility]
[DateTimeValue]
[LabelOverride]
[HelpTextOverride]
[Property]
)
EmbeddedControlledTermField ↗ ControlledTermField ::= embedded_controlled_term_field(
EmbeddedArtifactKey
ControlledTermFieldId
[ValueRequirement]
[Cardinality]
[Visibility]
[ControlledTermValue]
[LabelOverride]
[HelpTextOverride]
[Property]
)
EmbeddedSingleValuedEnumField ↗ SingleValuedEnumField ::= embedded_single_valued_enum_field(
EmbeddedArtifactKey
SingleValuedEnumFieldId
[ValueRequirement]
[Visibility]
[EnumValue]
[LabelOverride]
[HelpTextOverride]
[Property]
)
EmbeddedMultiValuedEnumField ↗ MultiValuedEnumField ::= embedded_multi_valued_enum_field(
EmbeddedArtifactKey
MultiValuedEnumFieldId
[ValueRequirement]
[Cardinality]
[Visibility]
EnumValue*
[LabelOverride]
[HelpTextOverride]
[Property]
)
EmbeddedLinkField ↗ LinkField ::= embedded_link_field(
EmbeddedArtifactKey
LinkFieldId
[ValueRequirement]
[Cardinality]
[Visibility]
[LinkValue]
[LabelOverride]
[HelpTextOverride]
[Property]
)
EmbeddedEmailField ↗ EmailField ::= embedded_email_field(
EmbeddedArtifactKey
EmailFieldId
[ValueRequirement]
[Cardinality]
[Visibility]
[EmailValue]
[LabelOverride]
[HelpTextOverride]
[Property]
)
EmbeddedPhoneNumberField ↗ PhoneNumberField ::= embedded_phone_number_field(
EmbeddedArtifactKey
PhoneNumberFieldId
[ValueRequirement]
[Cardinality]
[Visibility]
[PhoneNumberValue]
[LabelOverride]
[HelpTextOverride]
[Property]
)
EmbeddedOrcidField ↗ OrcidField ::= embedded_orcid_field(
EmbeddedArtifactKey
OrcidFieldId
[ValueRequirement]
[Cardinality]
[Visibility]
[OrcidValue]
[LabelOverride]
[HelpTextOverride]
[Property]
)
EmbeddedRorField ↗ RorField ::= embedded_ror_field(
EmbeddedArtifactKey
RorFieldId
[ValueRequirement]
[Cardinality]
[Visibility]
[RorValue]
[LabelOverride]
[HelpTextOverride]
[Property]
)
EmbeddedDoiField ↗ DoiField ::= embedded_doi_field(
EmbeddedArtifactKey
DoiFieldId
[ValueRequirement]
[Cardinality]
[Visibility]
[DoiValue]
[LabelOverride]
[HelpTextOverride]
[Property]
)
EmbeddedPubMedIdField ↗ PubMedIdField ::= embedded_pub_med_id_field(
EmbeddedArtifactKey
PubMedIdFieldId
[ValueRequirement]
[Cardinality]
[Visibility]
[PubMedIdValue]
[LabelOverride]
[HelpTextOverride]
[Property]
)
EmbeddedRridField ↗ RridField ::= embedded_rrid_field(
EmbeddedArtifactKey
RridFieldId
[ValueRequirement]
[Cardinality]
[Visibility]
[RridValue]
[LabelOverride]
[HelpTextOverride]
[Property]
)
EmbeddedNihGrantIdField ↗ NihGrantIdField ::= embedded_nih_grant_id_field(
EmbeddedArtifactKey
NihGrantIdFieldId
[ValueRequirement]
[Cardinality]
[Visibility]
[NihGrantIdValue]
[LabelOverride]
[HelpTextOverride]
[Property]
)
EmbeddedAttributeValueField ↗ AttributeValueField ::= embedded_attribute_value_field(
EmbeddedArtifactKey
AttributeValueFieldId
[ValueRequirement]
[Cardinality]
[Visibility]
[LabelOverride]
[HelpTextOverride]
[Property]
)
EmbeddedTemplate and EmbeddedPresentationComponent follow a similar pattern to embedded fields but differ in which embedding properties they carry. EmbeddedTemplate supports cardinality to permit multiple nested instances of the referenced template, carries no typed default value, and carries an optional Property associating a semantic property IRI with the embedding site. EmbeddedPresentationComponent carries no value requirement, cardinality, default value, label override, or property, because it contributes no instance data and exists purely to contribute presentational structure. The only embedding-level property it carries is Visibility.
EmbeddedTemplate ::= embedded_template(
EmbeddedArtifactKey
TemplateId
[ValueRequirement]
[Cardinality]
[Visibility]
[LabelOverride]
[Property]
)
EmbeddedPresentationComponent ::= embedded_presentation_component(
EmbeddedArtifactKey
PresentationComponentId
[Visibility]
)
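The slot differences between the embedded productions above can be summarised in a small lookup table. The sketch below is illustrative only: the slot names are assumed camelCase renderings of the production components (the text itself names only artifactRef and defaultValue), and an unknown kind is treated as following the common pattern, which a real validator would probably reject instead.

```python
# Slots of the common EmbeddedField pattern, in production order.
# Assumption: these camelCase names are illustrative, not the wire property names.
COMMON_SLOTS = (
    "key", "artifactRef", "valueRequirement", "cardinality", "visibility",
    "defaultValue", "labelOverride", "helpTextOverride", "property",
)

# Slots that each production leaves out relative to the common pattern.
OMITTED_SLOTS = {
    "EmbeddedBooleanField": {"cardinality"},
    "EmbeddedSingleValuedEnumField": {"cardinality"},
    "EmbeddedAttributeValueField": {"defaultValue"},
    "EmbeddedTemplate": {"defaultValue", "helpTextOverride"},
    "EmbeddedPresentationComponent": {
        "valueRequirement", "cardinality", "defaultValue",
        "labelOverride", "helpTextOverride", "property",
    },
}


def allowed_slots(kind: str) -> tuple[str, ...]:
    """Return the slots the named embedded production may carry."""
    omitted = OMITTED_SLOTS.get(kind, set())
    return tuple(slot for slot in COMMON_SLOTS if slot not in omitted)


def unexpected_slots(kind: str, embedding: dict) -> list[str]:
    """List slots present in an embedding that its production does not define."""
    allowed = set(allowed_slots(kind))
    return sorted(slot for slot in embedding if slot not in allowed)
```

For example, `unexpected_slots("EmbeddedBooleanField", {"key": "k", "cardinality": {}})` reports the cardinality slot, which the boolean production omits.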
Artifact Identity
Artifact identity defines the typed identifiers by which artifacts and artifact references are denoted in the model. These identity constructs are distinct from descriptive metadata, lifecycle metadata, versioning, and annotations.
Each field kind has its own typed identifier rather than sharing a single generic FieldId. This provides strong typing: an EmbeddedTextField can only carry a TextFieldId at its artifactRef slot, an EmbeddedDateField can only carry a DateFieldId, and so on, making it structurally impossible to embed a field of the wrong type. TemplateId, PresentationComponentId, and TemplateInstanceId follow the same pattern for the same reason.
Identifiers serve two roles: at the definition site of a reusable artifact (e.g. Field.id, Template.id) they permanently name the artifact; at the embedding site (e.g. EmbeddedField.artifactRef, EmbeddedTemplate.artifactRef) they reference the artifact being embedded. The same typed identifier production is used at both positions; the role distinction is conveyed by the surrounding production’s component name.
FieldId ::= TextFieldId
| IntegerNumberFieldId
| RealNumberFieldId
| BooleanFieldId
| DateFieldId
| TimeFieldId
| DateTimeFieldId
| ControlledTermFieldId
| SingleValuedEnumFieldId
| MultiValuedEnumFieldId
| LinkFieldId
| EmailFieldId
| PhoneNumberFieldId
| OrcidFieldId
| RorFieldId
| DoiFieldId
| PubMedIdFieldId
| RridFieldId
| NihGrantIdFieldId
| AttributeValueFieldId
TextFieldId ::= text_field_id( Iri )
IntegerNumberFieldId ::= integer_number_field_id( Iri )
RealNumberFieldId ::= real_number_field_id( Iri )
BooleanFieldId ::= boolean_field_id( Iri )
DateFieldId ::= date_field_id( Iri )
TimeFieldId ::= time_field_id( Iri )
DateTimeFieldId ::= date_time_field_id( Iri )
ControlledTermFieldId ::= controlled_term_field_id( Iri )
SingleValuedEnumFieldId ::= single_valued_enum_field_id( Iri )
MultiValuedEnumFieldId ::= multi_valued_enum_field_id( Iri )
LinkFieldId ::= link_field_id( Iri )
EmailFieldId ::= email_field_id( Iri )
PhoneNumberFieldId ::= phone_number_field_id( Iri )
OrcidFieldId ::= orcid_field_id( Iri )
RorFieldId ::= ror_field_id( Iri )
DoiFieldId ::= doi_field_id( Iri )
PubMedIdFieldId ::= pub_med_id_field_id( Iri )
RridFieldId ::= rrid_field_id( Iri )
NihGrantIdFieldId ::= nih_grant_id_field_id( Iri )
AttributeValueFieldId ::= attribute_value_field_id( Iri )
TemplateId ::= template_id( Iri )
PresentationComponentId ::= presentation_component_id( Iri )
TemplateInstanceId ::= template_instance_id( Iri )
All artifact identifier productions are IRI-valued. See Iri.
Concrete serializations need not preserve the per-family identifier distinctions drawn here. In the JSON wire encoding, every artifact identifier — whether a per-family FieldId variant such as TextFieldId or SingleValuedEnumFieldId, or one of the non-field identifiers TemplateId, PresentationComponentId, and TemplateInstanceId — is encoded as a bare IRI string with no per-family discriminator. The field family of a FieldId reference is recovered from the kind of the enclosing Field or EmbeddedField. See wire-grammar.md §5 and serialization.md.
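A decoder can recover the typed identifier from the enclosing kind, as the paragraph above describes. The sketch below shows the idea for two families only. The kind strings used as dictionary keys are hypothetical placeholders, since the actual wire discriminator values are defined in wire-grammar.md, and the function name is likewise an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextFieldId:
    iri: str


@dataclass(frozen=True)
class DateFieldId:
    iri: str


# Hypothetical kind strings; the real discriminator values live in the wire grammar.
ID_TYPE_BY_EMBEDDING_KIND = {
    "EmbeddedTextField": TextFieldId,
    "EmbeddedDateField": DateFieldId,
}


def decode_artifact_ref(enclosing_kind: str, bare_iri: str):
    """Wrap a bare IRI string in the identifier type implied by the enclosing kind."""
    try:
        id_type = ID_TYPE_BY_EMBEDDING_KIND[enclosing_kind]
    except KeyError:
        raise ValueError(f"no identifier type known for kind {enclosing_kind!r}")
    return id_type(bare_iri)
```

Because each identifier is its own class, two identifiers with the same IRI but different families compare unequal, which mirrors the strong typing of the abstract grammar.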
Artifact Metadata
Artifact metadata defines descriptive information, lifecycle information, versioning, and annotations. CatalogMetadata is uniform across every artifact kind and provides the catalog-oriented metadata that every artifact carries alongside its identity: descriptive properties (preferred catalog label, description, identifier, alternative labels), lifecycle metadata, and annotations. Schema artifacts (Field, Template) additionally carry SchemaArtifactVersioning as a separate top-level slot recording version, status, and lineage.
Aggregate Structure
This subsection identifies how the metadata categories are grouped at the artifact level. CatalogMetadata carries the catalog-oriented properties of an artifact — descriptive properties (preferred catalog label, description, identifier, alternative labels), lifecycle metadata, and annotations — directly as members. It is uniform across every artifact kind: Field, Template, PresentationComponent, and TemplateInstance all carry the same CatalogMetadata shape.
The schema artifacts (Field and Template) additionally carry SchemaArtifactVersioning as a separate top-level slot on the artifact itself; non-schema artifacts (PresentationComponent, TemplateInstance) do not carry versioning.
CatalogMetadata is distinct from an artifact’s rendered display name. A Field carries a top-level Label slot (the rendered question text); a Template carries a top-level Title slot (the rendered form title); a TemplateInstance MAY carry an optional Label (a user-supplied instance name); a PresentationComponent carries no rendered display name at all. These rendered slots are defined on the per-artifact productions in Field Artifacts, Core Structure, Instances, and Presentation Components respectively.
CatalogMetadata ::= catalog_metadata(
[PreferredLabel]
[Description]
[Identifier]
AlternativeLabel*
LifecycleMetadata
Annotation*
)
Descriptive Metadata
The descriptive metadata of an artifact comprises a set of human-oriented properties carried directly by CatalogMetadata. These properties support naming, explanatory text, and external or local identifiers used for cataloging. PreferredLabel, when present, is the artifact’s preferred display name in catalog and registry contexts (e.g., browsing the field registry or listing templates) — distinct from the artifact’s rendered display name, which lives in a top-level slot on the artifact itself (Field.label, Template.title, TemplateInstance.label). Authors typically populate PreferredLabel with the same text as the rendered slot; the two are separate so they MAY differ when needed (for example, a field whose registry name is "Comment field (v1.2)" may render in forms as just "Comment"). Description, when present, is an extended textual explanation of the artifact’s purpose and content, intended for catalog display. Identifier, when present, is a user-specified external identifier intended for integration with institutional or external systems. AlternativeLabel, when present, provides additional display labels for the artifact (synonyms, abbreviations, legacy labels carried forward from prior versions of the model).
Description ::= description(
MultilingualString
)
Identifier ::= identifier(
string
)
Description carries a MultilingualString value: human-readable text that may be presented in one or more natural languages. Identifier carries an arbitrary Unicode string value: it is a technical user-supplied key intended for integration with external systems and is not a human-display label, so it is not multilingual. PreferredLabel is defined in the Controlled Term Value section; AlternativeLabel is defined in the Label Override section. Both are MultilingualString-valued.
Lifecycle Metadata
LifecycleMetadata identifies when an artifact was created and modified, and which agents were responsible for those actions.
LifecycleMetadata ::= lifecycle_metadata(
CreatedOn
CreatedBy
ModifiedOn
ModifiedBy
)
CreatedOn ::= IsoDateTimeStamp
CreatedBy ::= Iri
ModifiedOn ::= IsoDateTimeStamp
ModifiedBy ::= Iri
CreatedOn and ModifiedOn MUST be ISO 8601 date-time timestamps.
CreatedBy and ModifiedBy denote IRIs identifying the responsible agents.
See IsoDateTimeStamp and Iri.
Schema Versioning
SchemaArtifactVersioning identifies version-related metadata specific to reusable schema artifacts. It captures artifact version, publication status, and optional derivation links to earlier or source artifacts.
SchemaArtifactVersioning ::= schema_artifact_versioning(
Version
Status
[PreviousVersion]
[DerivedFrom]
)
Version ::= version(
SemanticVersion
)
Status ::= "draft" | "published"
ModelVersion ::= model_version(
SemanticVersion
)
PreviousVersion ::= previous_version(
Iri
)
DerivedFrom ::= derived_from(
Iri
)
Version denotes a Semantic Versioning 2.0.0 version identifier.
Status denotes the publication status of a reusable schema artifact and is restricted to draft or published.
PreviousVersion and DerivedFrom denote IRIs identifying related source or predecessor artifacts.
The combined meaning of these fields and their interaction with artifact identity is specified in Versioning Model below.
Versioning Model
The CEDAR versioning model rests on one guiding rule: identity is per-version. Every version of a Field or Template is itself a distinct Artifact with its own IRI. There is no separate “version-independent” identifier for the conceptual artifact; what holds successive versions together is the PreviousVersion link from each artifact to the one it replaces.
Identity and immutability. Every reusable schema artifact (every Field and every Template) is identified by a single SchemaArtifactId (a FieldId or TemplateId). That IRI denotes one specific version: distinct versions of “the same” artifact are distinct artifacts in the model, each with its own IRI. A published artifact MUST be treated as immutable — once Status is "published", the content addressed by its IRI MUST NOT change. A draft artifact MAY be edited in place while its Status remains "draft". The transition from draft to published is one-way: an artifact whose Status is "published" MUST NOT transition back to "draft".
Creating a new version. To produce a revised version of a published artifact, mint a new IRI, allocate a new artifact at that IRI with Status set to "draft", and set PreviousVersion to the IRI of the artifact being revised. Editing happens on the new draft; once the new artifact is itself published, it joins the version chain and becomes immutable in turn. The published predecessor is unaffected by the existence of its successor: it remains addressable at its own IRI and continues to be a valid target for TemplateInstance references.
Version chains. Successive versions of an artifact form a version chain: a sequence of distinct artifacts, each with its own IRI, linked by PreviousVersion. Artifact B is the immediate successor of artifact A when B.previousVersion = A.id. The first artifact in a chain MUST omit PreviousVersion. Every subsequent artifact in the chain MUST set PreviousVersion to the IRI of its immediate predecessor. A chain is therefore a singly-linked list of IRIs, traversable backwards from any version to the original.
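Backward traversal of a version chain can be sketched as a walk over PreviousVersion links. The function name and the dictionary-based store (mapping each artifact IRI to its PreviousVersion IRI, or None for the first artifact) are assumptions made for illustration; the cycle check is a defensive addition, not a rule stated above.

```python
from typing import Optional


def chain_back_to_origin(
    start_iri: str, previous_version: dict[str, Optional[str]]
) -> list[str]:
    """Follow PreviousVersion links from start_iri back to the first artifact."""
    chain: list[str] = []
    seen: set[str] = set()
    current: Optional[str] = start_iri
    while current is not None:
        if current in seen:
            # Defensive: a well-formed chain is a singly-linked list without cycles.
            raise ValueError(f"cycle detected at {current}")
        seen.add(current)
        chain.append(current)
        # A missing entry raises KeyError: the predecessor is unknown to this store.
        current = previous_version[current]
    return chain
```

The returned list starts at the requested version and ends at the artifact that omits PreviousVersion.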
The role of Version. Version carries a Semantic Versioning identifier as advisory metadata describing this artifact’s place in its chain (e.g. 1.0.0 → 1.1.0 for a backwards-compatible change, 1.0.0 → 2.0.0 for a breaking change). The pairing of IRI and PreviousVersion is what authoritatively establishes the chain; Version is descriptive and is not load-bearing for chain identity. Successive artifacts in a chain SHOULD carry monotonically increasing SemanticVersion values, but this specification does not impose a structural constraint to that effect.
Derivation versus succession. DerivedFrom and PreviousVersion are distinct relationships and answer different questions. PreviousVersion records succession within a single version chain: the successor is intended to replace its predecessor as the same conceptual artifact evolves. DerivedFrom records non-version lineage: the new artifact is a fork or adaptation, authored by copying or modifying an existing artifact, but it is not the next version of that artifact. A fork begins its own independent version chain. Typical uses of DerivedFrom include adopting a community-published template into an institutional namespace or spawning a specialised variant of an existing field. An artifact MAY carry both PreviousVersion and DerivedFrom simultaneously: the artifact succeeds another within its own chain and was originally derived from a separate source artifact. The two relationships are independent, but when both are present they MUST NOT carry the same IRI value: one artifact cannot be both the predecessor and the derivation source of the same successor.
Summary of normative rules.
- Every version of a Field or Template MUST have a distinct IRI.
- A published artifact MUST NOT change at its IRI.
- A published artifact MUST NOT transition back to draft.
- The first artifact in a version chain MUST omit PreviousVersion. Every other artifact in the chain MUST set PreviousVersion to the IRI of its immediate predecessor in that chain.
- When both PreviousVersion and DerivedFrom are present on the same artifact, they MUST NOT carry the same IRI.
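Some of the rules summarised above can be checked locally on a single artifact or a single update. The following sketch covers the status vocabulary, the PreviousVersion/DerivedFrom distinctness rule, and the two rules about published artifacts. The class name, function names, and message strings are assumptions of the sketch, and it does not attempt the chain-wide rules, which need access to the whole chain.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Versioning:
    status: str
    previous_version: Optional[str] = None
    derived_from: Optional[str] = None


def versioning_problems(v: Versioning) -> list[str]:
    """Check rules that can be decided from one artifact's versioning slots."""
    problems = []
    if v.status not in ("draft", "published"):
        problems.append(f"unknown Status: {v.status}")
    if v.previous_version is not None and v.previous_version == v.derived_from:
        problems.append("PreviousVersion and DerivedFrom carry the same IRI")
    return problems


def update_problems(old_status: str, new_status: str, content_changed: bool) -> list[str]:
    """Check an in-place update of the artifact stored at one IRI."""
    problems = []
    if old_status == "published" and new_status == "draft":
        problems.append("a published artifact must not transition back to draft")
    if old_status == "published" and content_changed:
        problems.append("a published artifact must not change at its IRI")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every violation in a single pass.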
Annotations
Annotation provides an extensible metadata mechanism for additional named metadata values that are not captured by the core descriptive, lifecycle, or versioning structures. The first Iri identifies the annotation property — the predicate IRI under which the annotation is asserted. The AnnotationValue is the associated metadata value, currently a string-bearing scalar or an IRI. This supports linking to external resources such as DOIs and grant identifiers, as well as storing institutional metadata.
Annotation ::= annotation(
Iri
AnnotationValue
)
AnnotationValue ::= AnnotationStringValue
| AnnotationIriValue
AnnotationStringValue ::= annotation_string_value(
LexicalForm
[LanguageTag]
)
AnnotationIriValue ::= annotation_iri_value(
Iri
)
AnnotationValue is a discriminated union over named annotation-value productions. The two currently defined variants represent text-valued and IRI-valued annotations: AnnotationStringValue carries a lexical form with an optional language tag; AnnotationIriValue carries an IRI denoting a resource. AnnotationStringValue does not carry an explicit datatype; lexically-typed annotations are not modelled at this position, since annotation metadata is by convention either text or IRI-valued.
The variant family is open to extension. A future revision MAY introduce additional AnnotationXxxValue productions (for example, integer- or real-number-valued annotations) without breaking the existing variants.
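One way to model this discriminated union in a host language is with one class per variant and a type-based dispatch. The sketch below is an illustration under assumed names (the attribute names and the `describe` helper are not part of the specification), and the output format of `describe` is purely for demonstration.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class AnnotationStringValue:
    lexical_form: str
    language_tag: Optional[str] = None


@dataclass(frozen=True)
class AnnotationIriValue:
    iri: str


AnnotationValue = Union[AnnotationStringValue, AnnotationIriValue]


@dataclass(frozen=True)
class Annotation:
    property_iri: str  # the annotation property (predicate) IRI
    value: AnnotationValue


def describe(annotation: Annotation) -> str:
    """Render an annotation as text, dispatching on the value variant."""
    value = annotation.value
    if isinstance(value, AnnotationStringValue):
        suffix = f"@{value.language_tag}" if value.language_tag else ""
        return f'{annotation.property_iri} "{value.lexical_form}"{suffix}'
    if isinstance(value, AnnotationIriValue):
        return f"{annotation.property_iri} <{value.iri}>"
    # A future AnnotationXxxValue variant would need its own branch here.
    raise TypeError(f"unsupported annotation value: {type(value).__name__}")
```

The final `raise` makes the open-ended nature of the variant family explicit: adding a variant does not change the existing branches.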
Scalar and Datatype Leaves
The following productions define the primitive leaf types used throughout this grammar. They represent the atomic constructs from which all other productions are built: IRIs, typed string domains, lexical forms, multilingual textual metadata, numeric and temporal datatype IRIs, and textual metadata values.
Primitive String Types
The following nonterminals are the string-valued leaf types referenced by the productions in this section. Each is pinned to a specific external specification or regular expression so that implementations can validate inputs unambiguously.
- SemanticVersion: a Semantic Versioning 2.0.0 string. MUST conform to the Semantic Versioning 2.0.0 specification at semver.org, specifically the regular expression in the SemVer FAQ (https://semver.org/#is-there-a-suggested-regular-expression-regex-to-check-a-semver-string). Examples: 1.0.0, 2.0.0-alpha.1, 1.0.0+build.7.
- IriString: the lexical form of an IRI as defined by RFC 3987 §2.2 (the IRI production). The IRI MUST be absolute (carry a scheme); relative IRIs are not permitted at any wire-form position. Implementations SHOULD use the RFC 3987 IRI ABNF; a permissive practical regex is ^[A-Za-z][A-Za-z0-9+.\-]*:[^\s<>"]+$ but this is not sufficient for full conformance.
- Bcp47Tag: a well-formed BCP 47 language tag per RFC 5646, specifically the Language-Tag production. Implementations SHOULD validate against the IANA Language Subtag Registry; a syntactic-only check (well-formedness without registry lookup) is acceptable as a baseline. Examples: en, en-US, zh-Hant-TW, de-CH-1901.
- Iso8601DateTimeLexicalForm: an ISO 8601 combined date-and-time string in the extended format with full date and full time, with or without a UTC offset. The accepted shapes are:
  - YYYY-MM-DDTHH:MM:SS (no offset)
  - YYYY-MM-DDTHH:MM:SS.sss (fractional seconds, 1–9 digits)
  - YYYY-MM-DDTHH:MM:SSZ (UTC)
  - YYYY-MM-DDTHH:MM:SS±HH:MM (offset)
  - the same shapes with .sss fractional seconds combined with the timezone form.

  This corresponds to the XSD dateTime lexical form (XSD 1.1 §3.3.7). Examples: 2026-05-08T14:30:00Z, 2026-05-08T14:30:00.123-07:00.
- AsciiIdentifier: a string matching the regular expression ^[A-Za-z][A-Za-z0-9_-]*$: an ASCII letter followed by zero or more ASCII letters, digits, underscores, or hyphens. Length is unbounded. Examples: topic, field-1, Member_42.
- IntegerLexicalForm: a base-10 signed integer literal matching the regular expression ^-?(0|[1-9][0-9]*)$: an optional leading minus sign followed by either 0 or a non-zero digit and zero or more digits. Leading zeros and a leading + are not permitted. Magnitude is unbounded. The using context may further restrict the sign: NonNegativeInteger rejects values with a leading minus sign; signed bounds productions accept it.
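Several of the primitives above come with a regular expression, so a baseline syntactic check is straightforward. The sketch below uses the AsciiIdentifier, IntegerLexicalForm, and permissive IRI expressions as given. The date-time pattern is an assumption written to match the listed shapes, and it checks shape only (it does not reject out-of-range months or hours). The function names are illustrative, and `fullmatch` is used so that a trailing newline is not silently accepted.

```python
import re

ASCII_IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_-]*")
INTEGER_LEXICAL_FORM = re.compile(r"-?(0|[1-9][0-9]*)")
# Permissive IRI shape from the text; not sufficient for full RFC 3987 conformance.
PERMISSIVE_IRI = re.compile(r'[A-Za-z][A-Za-z0-9+.\-]*:[^\s<>"]+')
# Assumed pattern for the listed shapes; syntactic only, no range checks.
ISO_DATE_TIME = re.compile(
    r"[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}"
    r"(\.[0-9]{1,9})?(Z|[+-][0-9]{2}:[0-9]{2})?"
)


def is_ascii_identifier(s: str) -> bool:
    return ASCII_IDENTIFIER.fullmatch(s) is not None


def is_integer_lexical_form(s: str) -> bool:
    return INTEGER_LEXICAL_FORM.fullmatch(s) is not None


def is_non_negative_integer(s: str) -> bool:
    # NonNegativeInteger rejects any value with a leading minus sign.
    return is_integer_lexical_form(s) and not s.startswith("-")


def has_absolute_iri_shape(s: str) -> bool:
    return PERMISSIVE_IRI.fullmatch(s) is not None


def is_iso_date_time_shape(s: str) -> bool:
    return ISO_DATE_TIME.fullmatch(s) is not None
```

These checks are a baseline only; SemanticVersion and Bcp47Tag are left out because their validation is delegated to the external specifications named above.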
Core IRI and String Types
This subsection defines the fundamental IRI, string, and numeric leaf types that appear throughout the grammar. Iri is the base construct for all IRI-valued positions. TermIri is a specialised IRI form for controlled-vocabulary references. LanguageTag and LexicalForm are leaf string types used by Value constructs that carry localized or lexically-typed content. IsoDateTimeStamp carries ISO 8601 date-time values used in lifecycle metadata. NonNegativeInteger supports field-spec constraints.
Iri ::= iri(
IriString
)
TermIri ::= term_iri(
Iri
)
LanguageTag ::= language_tag(
Bcp47Tag
)
LexicalForm ::= lexical_form(
string
)
IsoDateTimeStamp ::= iso_date_time_stamp(
Iso8601DateTimeLexicalForm
)
NonNegativeInteger ::= non_negative_integer(
IntegerLexicalForm
)
Iri denotes an Internationalized Resource Identifier. It corresponds to the xsd:anyURI datatype; implementations MAY represent it as a plain string provided it is a syntactically valid IRI.
TermIri denotes an Iri that identifies a term in a controlled vocabulary or ontology. It is used in ControlledTermValue, ControlledTermClass, and Meaning.
LanguageTag denotes a well-formed BCP 47 language tag.
LexicalForm denotes a Unicode string and SHOULD be in Unicode Normalization Form C.
IsoDateTimeStamp denotes an ISO 8601 date-time lexical form.
NonNegativeInteger denotes an integer greater than or equal to zero.
Multilingual Strings
LangString and MultilingualString are the constructs used at every grammar position that carries human-display text. They distinguish localizations of one conceptual string from technical Unicode-string keys (which remain plain string-valued; see Identifier and the controlled-term-source identifiers in Controlled Term Sources).
LangString ::= lang_string(
string
Bcp47Tag
)
MultilingualString ::= multilingual_string(
LangString+
)
LangString pairs a textual value with a BCP 47 language tag identifying its natural language.
MultilingualString denotes a non-empty set of LangString entries representing localizations of one conceptual string. The entries’ language tags MUST be unique within a MultilingualString (case-folded comparison): the construct represents a set of localizations, not a list of phrasings within a single language.
The 'und' (undetermined) BCP 47 subtag MAY be used to denote a LangString whose natural language is unspecified. Implementations MAY use 'und' as the default tag when constructing a MultilingualString from a bare string with no language information.
MultilingualString differs from a single language-tagged scalar value (such as TextValue with a LanguageTag) in that it carries an unweighted localization set — multiple language tags coexist for the same conceptual string at metadata positions such as Template.header or CatalogMetadata.preferredLabel.
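The two structural constraints on MultilingualString (non-empty, and unique language tags under case-folded comparison) can be checked as in the sketch below. Representing a LangString as a (text, tag) tuple and the function names are assumptions made for illustration.

```python
def multilingual_string_problems(entries: list[tuple[str, str]]) -> list[str]:
    """Check a list of (text, language tag) pairs against the MultilingualString rules."""
    problems = []
    if not entries:
        problems.append("a MultilingualString must contain at least one LangString")
    seen: set[str] = set()
    for _text, tag in entries:
        folded = tag.casefold()  # tags are compared case-insensitively
        if folded in seen:
            problems.append(f"duplicate language tag: {tag}")
        seen.add(folded)
    return problems


def from_bare_string(text: str) -> list[tuple[str, str]]:
    """Wrap a bare string using 'und' when no language information is available."""
    return [(text, "und")]
```

For instance, a set containing both "en" and "EN" is reported as a duplicate, while "en" and "en-US" are distinct tags and may coexist.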
Numeric Datatype Kind
IntegerNumberValue is fixed to a single integer category; its datatype is implicit and is not a configurable component of the production. RealNumberValue carries an explicit RealNumberDatatypeKind chosen from three alternatives — decimal, float, or double. The kind names are CEDAR-native enum values; their corresponding XSD datatype IRIs are defined externally to the abstract grammar by rdf-projection.md.
RealNumberDatatypeKind ::= "decimal" | "float" | "double"
decimal denotes exact arbitrary-precision decimal numbers. float and double denote IEEE 754 single- and double-precision floating-point numbers respectively.
This specification narrows the supported numeric kinds to four (one integer kind plus the three real-number kinds). Earlier drafts admitted the full XSD numeric hierarchy (16 datatypes including long, short, byte, the signed/unsigned bounded subtypes, and the sign-constrained subtypes such as nonNegativeInteger); those are not part of the conforming set. Sign and range constraints are expressed via IntegerNumberMinValue / IntegerNumberMaxValue (or the real-valued equivalents). Bit-precision distinctions are not modelled at the type level; decimal covers exact arbitrary precision when needed, and float / double cover IEEE 754 single- and double-precision when storage precision matters.
Values
This section defines the Value types that represent instance-level data. Value constructs appear in FieldValue instances and as typed default values in EmbeddedArtifact properties. The value types are defined here independently of the FieldSpec productions that constrain them; the normative mapping between each FieldSpec and its permitted Value form is given in the Field Spec And Value Correspondence section.
Value ::= TextValue
| NumericValue
| BooleanValue
| DateValue
| TimeValue
| DateTimeValue
| ControlledTermValue
| EnumValue
| LinkValue
| EmailValue
| PhoneNumberValue
| ExternalAuthorityValue
| AttributeValue
NumericValue ::= IntegerNumberValue
| RealNumberValue
Scalar Values
TextValue, BooleanValue, and the two numeric value forms (IntegerNumberValue and RealNumberValue) are the simplest value types. Each carries the family-specific content directly: a lexical form for the string-bearing variants, a boolean payload for BooleanValue. TextValue carries an optional LanguageTag; when present, the value is a language-tagged string, when absent, a plain string. IntegerNumberValue carries a base-10 integer lexical form; its category is implicit and not carried as a component. RealNumberValue carries a base-10 real-valued lexical form paired with an explicit RealNumberDatatypeKind (decimal, float, or double).
TextValue ::= text_value(
LexicalForm
[LanguageTag]
)
IntegerNumberValue ::= integer_number_value(
LexicalForm
)
RealNumberValue ::= real_number_value(
LexicalForm
RealNumberDatatypeKind
)
BooleanValue ::= boolean_value(
boolean
)
IntegerNumberValue’s lexical form MUST be a base-10 integer literal (per the IntegerLexicalForm primitive in §Primitive String Types). RealNumberValue’s lexical form is a base-10 real-valued literal whose admissible form depends on the carried datatype: decimal admits an arbitrary-precision decimal lexical form; float and double admit IEEE 754-style lexical forms (including special values such as INF, -INF, and NaN).
NumericValue is the abstract category admitting IntegerNumberValue and RealNumberValue; the two are distinct concrete value types and a FieldValue carrying numeric content discriminates between them by kind.
The lexical form of any string-bearing value SHOULD be in Unicode Normalization Form C.
A Value whose lexical form lies outside the lexical space of its declared datatype is ill-typed: it is not syntactically ill-formed but does not determine a valid value. Implementations MUST accept ill-typed values and MAY produce warnings when encountering them. The corresponding RDF projection (see rdf-projection.md) preserves the ill-typed lexical form.
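The accept-but-warn behaviour for ill-typed values might look like the sketch below. The regular expressions are assumptions that approximate the lexical spaces described above (the integer pattern is the IntegerLexicalForm expression, and the floating-point pattern admits only the special values the text names: INF, -INF, and NaN). The "integer" key stands in for IntegerNumberValue, whose datatype is implicit, and the function names are illustrative.

```python
import re

_DECIMAL = r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)"
_FLOATING = _DECIMAL + r"([eE][+-]?[0-9]+)?|-?INF|NaN"

# Assumed approximations of each lexical space; not normative.
LEXICAL_SPACES = {
    "integer": re.compile(r"-?(0|[1-9][0-9]*)"),
    "decimal": re.compile(_DECIMAL),
    "float": re.compile(_FLOATING),
    "double": re.compile(_FLOATING),
}


def is_well_typed(lexical_form: str, kind: str) -> bool:
    """True when the lexical form lies inside the lexical space for the kind."""
    return LEXICAL_SPACES[kind].fullmatch(lexical_form) is not None


def lexical_warnings(lexical_form: str, kind: str) -> list[str]:
    """Ill-typed values are accepted; this only reports them as warnings."""
    if is_well_typed(lexical_form, kind):
        return []
    return [f"lexical form {lexical_form!r} is ill-typed for {kind}"]
```

The key design point is that `lexical_warnings` never raises for an ill-typed value: the value is kept as-is and the caller decides whether to surface the warning.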
Temporal Values
Temporal values represent date, time, and date-time data, corresponding directly to DateFieldSpec, TimeFieldSpec, and DateTimeFieldSpec respectively. DateValue is further refined into three precision variants — YearValue, YearMonthValue, and FullDateValue. Each temporal Value variant carries a LexicalForm directly; the temporal category is fixed by the variant’s kind. FullDateValue carries an ISO 8601 calendar-date lexical form; TimeValue carries an ISO 8601 time-of-day lexical form; DateTimeValue carries an ISO 8601 combined date-time lexical form. YearValue and YearMonthValue carry plain strings matching the patterns YYYY and YYYY-MM respectively. The RDF projection of these values is defined separately in rdf-projection.md.
DateValue ::= YearValue
| YearMonthValue
| FullDateValue
YearValue ::= year_value(
LexicalForm ( matches YYYY, e.g. "2024" )
)
YearMonthValue ::= year_month_value(
LexicalForm ( matches YYYY-MM, e.g. "2024-06" )
)
FullDateValue ::= full_date_value(
LexicalForm
)
TimeValue ::= time_value(
LexicalForm
)
DateTimeValue ::= date_time_value(
LexicalForm
)
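The lexical constraints on the three DateValue variants can be sketched as a check keyed by variant name. `conforms` is a hypothetical helper; the calendar check (a real month, a real day of that month) is an assumption that is stricter than the bare YYYY and YYYY-MM patterns stated above.

```python
import re
from datetime import date

# One pattern per precision variant; the variant is fixed by the value's kind.
DATE_PATTERNS = {
    "YearValue": re.compile(r"([0-9]{4})"),
    "YearMonthValue": re.compile(r"([0-9]{4})-([0-9]{2})"),
    "FullDateValue": re.compile(r"([0-9]{4})-([0-9]{2})-([0-9]{2})"),
}


def conforms(variant: str, lexical_form: str) -> bool:
    """Check a lexical form against the pattern of the given variant."""
    match = DATE_PATTERNS[variant].fullmatch(lexical_form)
    if match is None:
        return False
    parts = [int(group) for group in match.groups()]
    # Pad a missing month or day with 1 so one calendar check covers all variants.
    year, month, day = (parts + [1, 1])[:3]
    try:
        date(year, month, day)
    except ValueError:
        return False
    return True
```

Because the variant is supplied by the caller rather than inferred, a form such as "2024-06" is rejected for YearValue even though it is a valid YearMonthValue.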
Controlled Term Value
A controlled term value identifies a term drawn from an ontology, branch, class set, or value set declared in the corresponding ControlledTermFieldSpec. It carries a TermIri identifying the term, together with an optional human-readable Label and optional Notation and PreferredLabel terminology metadata from the source ontology. Label is the display label intended for end-user presentation; Notation is a symbolic code (typically a SKOS notation) bound to the term; PreferredLabel is the ontology’s own preferred label for the term, distinct from the display Label that may have been customized for the surrounding context.
A ControlledTermValue MAY omit Label: a consumer that has access to the source ontology can resolve the term’s display label from the TermIri. Producers SHOULD include Label when it is known at the point of value construction so that downstream consumers without ontology access can render the value.
Label ::= label(
MultilingualString
)
Notation ::= notation(
string
)
PreferredLabel ::= preferred_label(
MultilingualString
)
ControlledTermValue ::= controlled_term_value(
TermIri
[Label]
[Notation]
[PreferredLabel]
)
Label and PreferredLabel are MultilingualString values: each carries one or more language-tagged localizations of the term’s display label. Notation is a plain Unicode string: it is a technical symbolic code (typically a SKOS notation) rather than human-display text, and is therefore not multilingual.
Enum Value
An enum value carries a selection from the permissible values declared by an EnumFieldSpec. Every enum value is identified by a Token — a non-empty Unicode string that serves as the canonical key of one of the enum spec’s PermissibleValue entries. A conforming instance value MUST equal the Token of one of the referenced spec’s permissible values.
EnumValue ::= enum_value(
Token
)
Token is the leaf type used as the canonical key of an enum selection. It is defined in the Field Specs section alongside the related leaf productions (PermissibleValue, Meaning) used by EnumFieldSpec.
Link Value
A link value represents a hyperlink or URL-valued field. It carries an Iri identifying the linked resource and an optional Label providing a human-readable display label for the link.
LinkValue ::= link_value(
Iri
[Label]
)
Label is the same MultilingualString-valued production used by ControlledTermValue, PermissibleValue, and the external-authority value types: a label is treated uniformly as a localizable display string. A hyperlink’s display text MAY therefore carry one or more language-tagged localizations.
Contact Values
Contact values represent human contact identifiers. EmailValue carries an email address as a plain LexicalForm; PhoneNumberValue carries a telephone number as a plain LexicalForm. Format validation is left to implementations.
EmailValue ::= email_value(
LexicalForm
)
PhoneNumberValue ::= phone_number_value(
LexicalForm
)
External Authority Values
External authority values represent identifiers issued by recognised external authority systems. Each concrete value type carries a typed IRI specialised for its authority together with an optional human-readable Label. The typed IRI signals the expected identifier scheme; format conformance for each authority may be enforced by profile-specific or implementation-specific validation rules.
ExternalAuthorityValue ::= OrcidValue
| RorValue
| DoiValue
| PubMedIdValue
| RridValue
| NihGrantIdValue
OrcidValue ::= orcid_value(
OrcidIri
[Label]
)
RorValue ::= ror_value(
RorIri
[Label]
)
DoiValue ::= doi_value(
DoiIri
[Label]
)
PubMedIdValue ::= pub_med_id_value(
PubMedIri
[Label]
)
RridValue ::= rrid_value(
RridIri
[Label]
)
NihGrantIdValue ::= nih_grant_id_value(
NihGrantIri
[Label]
)
OrcidIri ::= orcid_iri( Iri )
RorIri ::= ror_iri( Iri )
DoiIri ::= doi_iri( Iri )
PubMedIri ::= pub_med_iri( Iri )
RridIri ::= rrid_iri( Iri )
NihGrantIri ::= nih_grant_iri( Iri )
| Typed IRI | Authority | IRI Pattern |
|---|---|---|
| OrcidIri | ORCID: identifies a researcher by ORCID iD | https://orcid.org/\d{4}-\d{4}-\d{4}-\d{3}[\dX] |
| RorIri | Research Organization Registry: identifies a research organisation by ROR ID | https://ror.org/0[a-z0-9]{8} |
| DoiIri | Digital Object Identifier: identifies a digital object by DOI | https://doi.org/10\.\d{4,}/.+ |
| PubMedIri | PubMed: identifies a PubMed article | https://pubmed.ncbi.nlm.nih.gov/\d+ |
| RridIri | Research Resource Identifier: identifies a research resource by RRID | https://identifiers.org/RRID:[A-Z]+_\d+ |
| NihGrantIri | NIH: identifies an NIH-funded grant | unspecified |
The final character of an ORCID iD is an ISO 7064 MOD 11-2 check character. It MAY therefore be X, which stands for the check value 10.
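As an illustration of implementation-specific format validation, the sketch below matches an OrcidIri against the pattern in the table and then verifies the ISO 7064 MOD 11-2 check character. The function names are hypothetical, and the checksum step is an optional extra that this specification does not mandate.

```python
import re

# The table's pattern, with the host dot escaped and ASCII digits only.
ORCID_IRI = re.compile(
    r"https://orcid\.org/([0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X])"
)


def orcid_check_character(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for digit in base_digits:
        total = (total + int(digit)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid_iri(iri: str) -> bool:
    """Pattern match first, then compare the computed check character."""
    match = ORCID_IRI.fullmatch(iri)
    if match is None:
        return False
    compact = match.group(1).replace("-", "")
    return orcid_check_character(compact[:-1]) == compact[-1]
```

A pattern-only check would accept any sixteen-character identifier of the right shape; adding the checksum catches single-digit transcription errors.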
Attribute Value
An attribute value is a name-value pair used to represent arbitrary named properties whose names are not known at schema definition time. AttributeName carries the name of the attribute as a Unicode string. The value component is itself a Value, permitting attribute values to carry any value type including nested attribute values. Nesting depth is unbounded at the model level; concrete implementations MAY impose practical limits.
AttributeName ::= attribute_name(
string
)
AttributeValue ::= attribute_value(
AttributeName
Value
)
Embedded Artifact Properties
Embedded artifact properties define the contextual information carried by an EmbeddedArtifact within a Template. These properties govern how a referenced reusable artifact is used in that template context, including key, reference, requirement, cardinality, visibility, defaults, and label override, and they are distinct from the intrinsic properties of the referenced reusable artifact itself.
Embedded Artifact Key
An EmbeddedArtifactKey is the local identifier of an EmbeddedArtifact within a Template. It is the key by which an embedded field, embedded template, or embedded presentation component is distinguished from other embedded artifacts in the same template. This key is also the mechanism that connects template structure to instance structure: FieldValue and NestedTemplateInstance use EmbeddedArtifactKey to identify which embedded artifact in the template they correspond to.
EmbeddedArtifactKey ::= embedded_artifact_key(
AsciiIdentifier
)
EmbeddedArtifactKey MUST match the pattern [A-Za-z][A-Za-z0-9_-]*: it MUST begin with an ASCII letter followed by zero or more ASCII letters, digits, underscores, or hyphens.
EmbeddedArtifactKey values are local to a Template and MUST be unique within that Template.
EmbeddedArtifactKey is distinct from artifact identifiers such as FieldId and TemplateId. It identifies the embedding site within a template rather than the reusable artifact being referenced. The same reusable Field may be embedded more than once in a Template under different keys, and each key independently identifies that embedding site in both the template structure and any corresponding TemplateInstance.
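The two key constraints (the lexical pattern and uniqueness within one Template) can be checked together. The following is a minimal sketch; `key_errors` and its message strings are hypothetical and do not reflect the normative error model.

```python
import re
from collections import Counter

# Pattern quoted from the constraint on EmbeddedArtifactKey.
KEY_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_-]*")


def key_errors(keys):
    """Return messages for malformed keys and for keys used more than once."""
    errors = [
        f"malformed key: {key!r}"
        for key in keys
        if KEY_PATTERN.fullmatch(key) is None
    ]
    errors += [
        f"duplicate key: {key!r}"
        for key, count in Counter(keys).items()
        if count > 1
    ]
    return errors
```

The input is the list of keys of all embedded artifacts in a single Template; the same reusable Field may legitimately appear twice as long as its two embedding keys differ.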
Requirements
ValueRequirement identifies whether a value is required, recommended, or optional in the embedding context. Required means that a value must be supplied for conformance. Recommended and Optional are identical for conformance purposes: absence of a value MUST NOT cause conformance failure in either case. The distinction is one of authoring guidance only: implementations SHOULD encourage entry for Recommended fields and MAY issue warnings when such fields are left empty.
ValueRequirement ::= "required" | "recommended" | "optional"
When ValueRequirement is absent from an EmbeddedArtifact, the default is "optional".
Cardinality
Cardinality identifies the permitted number of occurrences for the embedded artifact in the embedding context.
Cardinality ::= cardinality(
MinCardinality
[MaxCardinality]
)
MinCardinality ::= min_cardinality(
NonNegativeInteger
)
MaxCardinality ::= max_cardinality(
NonNegativeInteger
)
When MaxCardinality is absent from a present Cardinality, the cardinality is unbounded above: any number of occurrences greater than or equal to the specified MinCardinality is permitted. Unboundedness is therefore expressed by omission of MaxCardinality rather than by a distinct construct.
When Cardinality is absent from an EmbeddedArtifact, the implied default is min_cardinality(1) with max_cardinality(1): the embedded artifact MUST appear exactly once.
ValueRequirement and Cardinality are orthogonal. ValueRequirement governs whether the user is obligated to supply any values at all. Cardinality governs the permitted count of values if any are supplied. A field may therefore be Optional — meaning the user is not required to fill it in — while carrying a min_cardinality greater than one, meaning that if values are supplied, at least that many must be present. For example, a primer pair field might be Optional but carry min_cardinality(2), because a primer pair is only interpretable when both the forward and reverse primers are specified together.
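One reading of how the two properties interact is sketched below. It is an assumption-laden illustration, not the normative validation algorithm: `occurrence_errors` is a hypothetical helper, an absent Cardinality is modelled as `None`, and an absent MaxCardinality as `None` in the second tuple position.

```python
from typing import List, Optional, Tuple


def occurrence_errors(
    count: int,
    requirement: str = "optional",
    cardinality: Optional[Tuple[int, Optional[int]]] = None,
) -> List[str]:
    """Check a number of supplied values against requirement and cardinality."""
    # Absent Cardinality defaults to exactly one occurrence.
    min_card, max_card = cardinality if cardinality is not None else (1, 1)
    if count == 0:
        # Only ValueRequirement decides whether supplying nothing is an error.
        return ["required value missing"] if requirement == "required" else []
    errors = []
    if count < min_card:
        errors.append(f"{count} values supplied, at least {min_card} expected")
    if max_card is not None and count > max_card:
        errors.append(f"{count} values supplied, at most {max_card} allowed")
    return errors
```

For the primer pair example, `occurrence_errors(0, "optional", (2, None))` reports nothing, while supplying a single primer reports a minimum-cardinality violation.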
Visibility
Visibility determines whether the embedded artifact is shown in rendered interfaces. It is modeled as an embedding property rather than as a rendering hint because it applies to any kind of embedded artifact, not only to fields.
Visibility ::= "visible" | "hidden"
When Visibility is absent from an EmbeddedArtifact, the default is "visible".
Defaults
A default value is a value used to pre-populate a field at instance-creation time when no explicit value has yet been supplied by the user. Defaults exist at two layers:
- A field-level default lives on the reusable Field’s FieldSpec. It is set when the field is authored and is shared by every Template that embeds the field.
- An embedding-level default lives on the EmbeddedXxxField inside a Template. It overrides the field-level default for embeddings within that one Template.
Every concrete field family carries an optional default at both layers, with one exception: AttributeValueField carries no default at either layer (an AttributeValue instance is a per-instance pairing of an attribute name and a value, and a default is not meaningful).
The two default-value types match: at each layer the slot is typed with the family-specific Value type. The per-family typing is:
| Family | Field-level slot (on XxxFieldSpec) | Embedding-level slot (on EmbeddedXxxField) |
|---|---|---|
| Text | [TextValue] | [TextValue] |
| IntegerNumber | [IntegerNumberValue] | [IntegerNumberValue] |
| RealNumber | [RealNumberValue] | [RealNumberValue] |
| Boolean | [BooleanValue] | [BooleanValue] |
| Date | [DateValue] | [DateValue] (polymorphic: YearValue, YearMonthValue, or FullDateValue) |
| Time | [TimeValue] | [TimeValue] |
| DateTime | [DateTimeValue] | [DateTimeValue] |
| ControlledTerm | [ControlledTermValue] | [ControlledTermValue] |
| SingleValuedEnum | [EnumValue] | [EnumValue] |
| MultiValuedEnum | [EnumValue*] (zero or more) | [EnumValue*] (zero or more) |
| Link | [LinkValue] | [LinkValue] |
| Email | [EmailValue] | [EmailValue] |
| PhoneNumber | [PhoneNumberValue] | [PhoneNumberValue] |
| Orcid | [OrcidValue] | [OrcidValue] |
| Ror | [RorValue] | [RorValue] |
| Doi | [DoiValue] | [DoiValue] |
| PubMedId | [PubMedIdValue] | [PubMedIdValue] |
| Rrid | [RridValue] | [RridValue] |
| NihGrantId | [NihGrantIdValue] | [NihGrantIdValue] |
| AttributeValue | (no default) | (no default) |
The shape is uniform across layers: every default at every layer is the family’s Value type. For the enum families this means the field-level default is an EnumValue (or sequence of EnumValue) — the same kind-tagged object form that appears at the embedding level. The Token carried inside each default EnumValue MUST equal the Token of one of the spec’s PermissibleValue+ entries; for MultiValuedEnumFieldSpec the sequence MUST NOT contain duplicate tokens.
Precedence and absence semantics. Both layers are independent and optional. The four cases:
| Field-level | Embedding-level | Effective default |
|---|---|---|
| absent | absent | none — the field has no default |
| present | absent | the field-level default |
| absent | present | the embedding-level default |
| present | present | the embedding-level default (it overrides the field-level default) |
There is no mechanism for an embedding to unset a field-level default. An embedding that wishes to override a field-level default with no default at all is not expressible in this version of the model.
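The four-case precedence table reduces to a single fallback. In this sketch, `effective_default` is a hypothetical helper, defaults are treated as opaque values, and absence at either layer is modelled as `None`.

```python
def effective_default(field_level=None, embedding_level=None):
    """Resolve the default that seeds a new instance value.

    The embedding-level default wins whenever it is present; otherwise the
    field-level default applies; otherwise there is no default (None).
    """
    if embedding_level is not None:
        return embedding_level
    return field_level
```

Because `None` already means "absent" at the embedding level, this representation has no way to say "present, but explicitly no default", which mirrors the stated limitation that an embedding cannot unset a field-level default.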
Defaults are UI/UX initialisation only. A default value’s sole role is to seed an instance’s value at creation time, so that a user-facing form can pre-fill the corresponding input. Defaults do not appear in the wire form of TemplateInstance artifacts and do not affect the RDF projection. When an instance is created and the user accepts the default without modification, the resulting FieldValue carries the default value as if the user had typed it in by hand; from the instance’s perspective the default and a user-supplied identical value are indistinguishable. When an instance is created and the user does not supply a value (and the field is not required), the corresponding FieldValue is omitted entirely — the default does not appear by virtue of having existed.
Label Override
LabelOverride provides template-specific labeling for an embedded artifact. This allows a template to override the default label of the referenced reusable artifact in that embedding context.
AlternativeLabel ::= alternative_label(
MultilingualString
)
LabelOverride ::= label_override(
Label
AlternativeLabel
)
AlternativeLabel is a MultilingualString: each entry is itself a localization set for one alternative phrasing of the artifact’s display label.
Help Text Override
HelpTextOverride provides template-specific authored guidance for an embedded field. When present, it replaces the field’s canonical HelpText for that embedding context only. The reusable Field’s HelpText remains the canonical content for all other embedding contexts (and for the field rendered standalone).
HelpTextOverride ::= help_text_override( MultilingualString )
HelpTextOverride is a MultilingualString: it carries the same kind of authored guidance as HelpText, but scoped to a single embedding site. The override’s presentation — inline, tooltip, both, or none — is selected by the enclosing Template’s HelpDisplayMode exactly as for the underlying HelpText.
The precedence rule is straightforward: at an embedding site, the renderer displays the embedding’s HelpTextOverride if present, otherwise the referenced Field’s HelpText, otherwise nothing. The override is replace, not merge: localizations present in the field’s HelpText but absent from the embedding’s HelpTextOverride do not fall back.
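The replace-not-merge rule can be illustrated with a small lookup. The sketch assumes a MultilingualString is modelled as a plain mapping from language tag to text, which is a simplification of the actual production; `displayed_help_text` is a hypothetical name.

```python
def displayed_help_text(field_help, override, lang):
    """Pick the help text shown at one embedding site for one language.

    The override, when present, replaces the field's HelpText as a whole,
    so a language missing from the override does not fall back to the field.
    """
    source = override if override is not None else field_help
    if source is None:
        return None
    return source.get(lang)


field_help = {"en": "Enter the sample ID.", "fr": "Saisissez l'identifiant."}
override = {"en": "Use the lab-assigned ID."}
```

With these sample mappings, a French-language rendering of the overridden embedding shows no help text at all, even though the underlying field carries a French localization.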
Properties
A Property associates a semantic property IRI with an EmbeddedField or EmbeddedTemplate within a specific Template. The property IRI identifies the RDF property that the embedded artifact’s value represents in that template context. The optional PropertyLabel provides a human-readable label for the property.
Property is an embedding-level construct. It is distinct from the intrinsic metadata of the referenced Field or Template artifact. The same reusable artifact may be embedded in different templates under different property IRIs.
Property ::= property(
PropertyIri
[PropertyLabel]
)
PropertyIri ::= property_iri( Iri )
PropertyLabel ::= property_label( MultilingualString )
PropertyLabel is a MultilingualString carrying one or more language-tagged localizations of the property’s human-readable label.
Field Specs
A FieldSpec is the semantic configuration block carried by a concrete Field artifact. It specifies what kind of value the field accepts, any constraints on that value, and any compatible rendering hints for presentation. Each concrete Field variant carries exactly one FieldSpec that matches its kind: a TextField carries a TextFieldSpec, a DateField carries a DateFieldSpec, and so on. The correspondence between each FieldSpec and its permitted Value form is given in the Field Spec And Value Correspondence section.
One might ask why FieldSpec exists as a separate construct rather than folding its content directly into the concrete Field artifact. The answer is separation of concerns: the concrete field artifact — TextField, DateField, and so on — answers the question “what kind of reusable field is this?” and carries the artifact’s identity, catalog metadata, versioning, and the rendered question-text label. The FieldSpec answers the separate question “what are the value rules and rendering-compatible properties for this kind of field?” Keeping these concerns distinct means that artifact identity, catalog metadata, and lifecycle/versioning information remain uniform across all field kinds, while value semantics and field-specific configuration vary per family through FieldSpec.
FieldSpec productions are grouped here by field family, mirroring the abstract Field hierarchy in the Kernel Grammar. Temporal field specs, which carry additional precision and rendering configuration, are detailed in the Temporal Field Specs subsection. Controlled term source declarations, which specify the ontological authorities from which controlled-term values may be drawn, are covered in the Controlled Term Sources subsection. Rendering hints for all field families are defined in the Rendering Hints subsection, with the exception of temporal rendering hints which are defined alongside their field specs.
FieldSpec ::= TextFieldSpec
| NumericFieldSpec
| BooleanFieldSpec
| TemporalFieldSpec
| ControlledTermFieldSpec
| EnumFieldSpec
| LinkFieldSpec
| ContactFieldSpec
| ExternalAuthorityFieldSpec
| AttributeValueFieldSpec
NumericFieldSpec ::= IntegerNumberFieldSpec
| RealNumberFieldSpec
TextFieldSpec ::= text_field_spec(
[TextValue]
[MinLength]
[MaxLength]
[ValidationRegex]
[LangTagRequirement]
[TextRenderingHint]
)
LangTagRequirement ::= "langTagRequired"
| "langTagOptional"
| "langTagForbidden"
IntegerNumberFieldSpec ::= integer_number_field_spec(
[IntegerNumberValue]
[Unit]
[IntegerNumberMinValue]
[IntegerNumberMaxValue]
[NumericRenderingHint]
)
RealNumberFieldSpec ::= real_number_field_spec(
RealNumberDatatypeKind
[RealNumberValue]
[Unit]
[RealNumberMinValue]
[RealNumberMaxValue]
[NumericRenderingHint]
)
Unit ::= unit(
Iri
[Label]
)
MinLength ::= min_length(
NonNegativeInteger
)
MaxLength ::= max_length(
NonNegativeInteger
)
ValidationRegex ::= validation_regex(
string
)
IntegerNumberMinValue ::= integer_number_min_value(
IntegerNumberValue
)
IntegerNumberMaxValue ::= integer_number_max_value(
IntegerNumberValue
)
RealNumberMinValue ::= real_number_min_value(
RealNumberValue
)
RealNumberMaxValue ::= real_number_max_value(
RealNumberValue
)
BooleanFieldSpec ::= boolean_field_spec(
[BooleanValue]
[BooleanRenderingHint]
)
TemporalFieldSpec ::= DateFieldSpec
| TimeFieldSpec
| DateTimeFieldSpec
ControlledTermFieldSpec ::= controlled_term_field_spec(
[ControlledTermValue]
ControlledTermSource+
[ControlledTermRenderingHint]
)
EnumFieldSpec ::= SingleValuedEnumFieldSpec
| MultiValuedEnumFieldSpec
SingleValuedEnumFieldSpec ::= single_valued_enum_field_spec(
PermissibleValue+
[EnumValue]
[SingleValuedEnumRenderingHint]
)
MultiValuedEnumFieldSpec ::= multi_valued_enum_field_spec(
PermissibleValue+
EnumValue*
[MultiValuedEnumRenderingHint]
)
PermissibleValue ::= permissible_value(
Token
[Label]
[Description]
Meaning*
)
Token ::= token(
string
)
Meaning ::= meaning(
TermIri
[Label]
)
LinkFieldSpec ::= link_field_spec(
[LinkValue]
[LinkRenderingHint]
)
ContactFieldSpec ::= EmailFieldSpec
| PhoneNumberFieldSpec
EmailFieldSpec ::= email_field_spec(
[EmailValue]
[EmailRenderingHint]
)
PhoneNumberFieldSpec ::= phone_number_field_spec(
[PhoneNumberValue]
[PhoneNumberRenderingHint]
)
ExternalAuthorityFieldSpec ::= OrcidFieldSpec
| RorFieldSpec
| DoiFieldSpec
| PubMedIdFieldSpec
| RridFieldSpec
| NihGrantIdFieldSpec
OrcidFieldSpec ::= orcid_field_spec(
[OrcidValue]
[OrcidRenderingHint]
)
RorFieldSpec ::= ror_field_spec(
[RorValue]
[RorRenderingHint]
)
DoiFieldSpec ::= doi_field_spec(
[DoiValue]
[DoiRenderingHint]
)
PubMedIdFieldSpec ::= pub_med_id_field_spec(
[PubMedIdValue]
[PubMedIdRenderingHint]
)
RridFieldSpec ::= rrid_field_spec(
[RridValue]
[RridRenderingHint]
)
NihGrantIdFieldSpec ::= nih_grant_id_field_spec(
[NihGrantIdValue]
[NihGrantIdRenderingHint]
)
AttributeValueFieldSpec ::= attribute_value_field_spec()
Unit denotes an identified measurement or quantity unit optionally paired with a human-readable label.
The current placement of Unit on IntegerNumberFieldSpec and RealNumberFieldSpec is a pragmatic compromise. A later revision may introduce a distinct QuantityFieldSpec to model numeric values with fixed units more explicitly.
IntegerNumberMinValue and IntegerNumberMaxValue specify inclusive lower and upper bounds on the integer values that an IntegerNumberField accepts. Both are expressed as IntegerNumberValue constructs. RealNumberMinValue and RealNumberMaxValue are the analogous bounds on RealNumberField and carry RealNumberValue constructs whose RealNumberDatatypeKind matches the field’s declared datatype.
A RealNumberFieldSpec MAY use the family-shared NumericRenderingHint; if it carries a non-zero decimalPlaces rendering hint, the hint applies to display rounding only and does not constrain the lexical form of submitted values. IntegerNumberFieldSpec MAY also use NumericRenderingHint; a decimalPlaces value other than 0 on an integer field is harmless (display only) and SHOULD be omitted when not meaningful.
EnumFieldSpec is refined along a single dimension: cardinality. SingleValuedEnumFieldSpec permits exactly one selection; MultiValuedEnumFieldSpec permits zero or more simultaneous selections (subject to the embedding’s Cardinality). The two specs share a common option model: every permissible value is a PermissibleValue carrying a canonical Token key together with optional human-readable Label and Description localizations and zero or more Meaning entries that bind the token to ontology terms. The Token strings of a spec’s permissible values MUST be unique within that spec; the spec’s PermissibleValue+ is the closed set of values an instance may carry.
The two enum specs each carry a field-level default per the Defaults section: SingleValuedEnumFieldSpec an optional [EnumValue], MultiValuedEnumFieldSpec a (possibly empty) EnumValue*. The Token carried inside each default EnumValue MUST equal the Token of one of the spec’s PermissibleValue+ entries; for MultiValuedEnumFieldSpec the sequence MUST NOT contain duplicate tokens.
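The token constraints on an enum spec and its field-level default can be sketched as a single check. `enum_default_errors` is a hypothetical helper that works on bare token strings rather than full PermissibleValue and EnumValue objects; the message strings are illustrative only.

```python
def enum_default_errors(permissible_tokens, default_tokens, multi_valued):
    """Validate default tokens against the closed set of permissible tokens."""
    errors = []
    # Token strings must be unique within one spec.
    if len(set(permissible_tokens)) != len(permissible_tokens):
        errors.append("permissible value tokens are not unique")
    # A single-valued spec carries at most one default EnumValue.
    if not multi_valued and len(default_tokens) > 1:
        errors.append("single-valued enum carries more than one default")
    # Every default token must name one of the permissible values.
    for token in default_tokens:
        if token not in permissible_tokens:
            errors.append(f"default token {token!r} is not a permissible value")
    # A multi-valued default sequence must not repeat a token.
    if multi_valued and len(set(default_tokens)) != len(default_tokens):
        errors.append("default sequence contains duplicate tokens")
    return errors
```

An empty default sequence on a multi-valued spec passes every check, which is consistent with the default being a possibly empty EnumValue*.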
A Meaning carried by a PermissibleValue binds the token to a term IRI in an external vocabulary or ontology. A permissible value MAY carry zero, one, or several Meaning entries. Each Meaning MAY additionally carry an optional Label recording the bound term’s human-readable label (in the same way ControlledTermValue.Label caches the term’s label inline) so that consumers without ontology access can render the bound term’s display name. The Meaning.Label is the label of the bound term, distinct from the surrounding PermissibleValue.Label which is the display label of the permissible value itself. When the RDF projection is applied (see rdf-projection.md), an EnumValue whose token matches a PermissibleValue carrying one or more Meaning entries projects as the corresponding term IRIs; an EnumValue whose matching permissible value carries no Meaning projects as a plain string literal.
ControlledTermSource is defined in Controlled Term Sources.
Temporal Field Specs
TemporalFieldSpec denotes temporal-valued fields and is refined into strongly typed date, time, and date-time forms. This section groups the temporal field-spec productions together with their compatible rendering hints and value-type constraints.
DateFieldSpec ::= date_field_spec(
DateValueType
[DateValue]
[DateRenderingHint]
)
DateValueType ::= "year" | "yearMonth" | "fullDate"
TimeFieldSpec ::= time_field_spec(
[TimeValue]
[TimePrecision]
[TimezoneRequirement]
[TimeRenderingHint]
)
TimePrecision ::= "hourMinute" | "hourMinuteSecond" | "hourMinuteSecondFraction"
TimezoneRequirement ::= "timezoneRequired" | "timezoneNotRequired"
TimePrecision identifies the finest time precision permitted by a TimeFieldSpec.
"hourMinute", "hourMinuteSecond", and "hourMinuteSecondFraction" identify time values constrained respectively to hour-and-minute precision, second precision, and fractional-second precision.
TimezoneRequirement identifies whether timezone information is required by the field spec.
The declared TimePrecision determines the required lexical form of conforming TimeValue constructs. Finer components than the declared precision MUST be omitted entirely; zeroing them is not equivalent to omitting them. Specifically:
- "hourMinute": TimeValue MUST carry only hour and minute components (HH:MM).
- "hourMinuteSecond": TimeValue MUST carry hour, minute, and second components (HH:MM:SS), with no fractional seconds.
- "hourMinuteSecondFraction": TimeValue MAY carry a fractional seconds component.
When TimePrecision is absent from a TimeFieldSpec, no precision constraint applies and any well-formed TimeValue is conforming.
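The strict-truncation rule for TimePrecision can be sketched with one pattern per precision. Several details here are assumptions rather than statements of this specification: the timezone suffix syntax (`Z` or a `±HH:MM` offset), the requirement that "hourMinuteSecondFraction" still carries seconds, and the absence of range checks on hour, minute, and second. `time_conforms` is a hypothetical name.

```python
import re

# Assumed optional timezone designator.
TIMEZONE = r"(?:Z|[+-][0-9]{2}:[0-9]{2})?"

TIME_PATTERNS = {
    "hourMinute": re.compile(r"[0-9]{2}:[0-9]{2}" + TIMEZONE),
    "hourMinuteSecond": re.compile(r"[0-9]{2}:[0-9]{2}:[0-9]{2}" + TIMEZONE),
    "hourMinuteSecondFraction": re.compile(
        r"[0-9]{2}:[0-9]{2}:[0-9]{2}(?:\.[0-9]+)?" + TIMEZONE
    ),
}

# Used when TimePrecision is absent: any of the shapes above is acceptable.
ANY_TIME = re.compile(
    r"[0-9]{2}:[0-9]{2}(?::[0-9]{2}(?:\.[0-9]+)?)?" + TIMEZONE
)


def time_conforms(lexical_form: str, precision=None) -> bool:
    """Check a TimeValue lexical form against an optional TimePrecision."""
    pattern = TIME_PATTERNS[precision] if precision is not None else ANY_TIME
    return pattern.fullmatch(lexical_form) is not None
```

Under "hourMinute", the form "14:30:00" is rejected even though its seconds are zero, which is the behaviour the omission rule calls for.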
The same strict-truncation rule applies to DateTimeValueType for DateTimeValue constructs:
- "dateHourMinute": the time component of DateTimeValue MUST carry only hour and minute (YYYY-MM-DDTHH:MM).
- "dateHourMinuteSecond": the time component MUST carry hour, minute, and second (YYYY-MM-DDTHH:MM:SS), with no fractional seconds.
- "dateHourMinuteSecondFraction": the time component MAY carry a fractional seconds component.
DateTimeFieldSpec ::= date_time_field_spec(
DateTimeValueType
[DateTimeValue]
[TimezoneRequirement]
[DateTimeRenderingHint]
)
DateTimeValueType ::= "dateHourMinute" | "dateHourMinuteSecond" | "dateHourMinuteSecondFraction"
DateTimeValueType identifies the finest permitted date-time precision.
"dateHourMinute", "dateHourMinuteSecond", and "dateHourMinuteSecondFraction" identify date-time values constrained respectively to minute precision, second precision, and fractional-second precision.
DateRenderingHint ::= date_rendering_hint(
[DateComponentOrder]
[Placeholder]
)
DateComponentOrder ::= "dayMonthYear" | "monthDayYear" | "yearMonthDay"
TimeRenderingHint ::= time_rendering_hint(
[TimeFormat]
[Placeholder]
)
DateTimeRenderingHint ::= date_time_rendering_hint(
[TimeFormat]
[Placeholder]
)
TimeFormat ::= "twelveHour" | "twentyFourHour"
DateComponentOrder identifies whether a date is rendered or acquired in day-month-year, month-day-year, or year-month-day order.
Controlled Term Sources
Controlled term sources define the ontological authorities from which controlled-term values may be drawn. A ControlledTermFieldSpec requires one or more ControlledTermSource entries. Each source specifies either an entire ontology, a branch of an ontology rooted at a given term, a set of individual ontology classes, or an external value set. TermIri is defined in the Scalar and Datatype Leaves section.
ControlledTermSource ::= OntologySource
| BranchSource
| ClassSource
| ValueSetSource
OntologySource ::= ontology_source(
OntologyReference
)
OntologyReference ::= ontology_reference(
OntologyIri
[OntologyDisplayHint]
)
OntologyDisplayHint ::= ontology_display_hint(
[OntologyAcronym]
[OntologyName]
)
BranchSource ::= branch_source(
OntologyReference
RootTermIri
[RootTermLabel]
[MaxTraversalDepth]
)
ClassSource ::= class_source(
ControlledTermClass+
)
ControlledTermClass ::= controlled_term_class(
TermIri
[Label]
OntologyReference
)
ValueSetSource ::= value_set_source(
ValueSetIdentifier
[ValueSetName]
[ValueSetIri]
)
OntologyAcronym ::= ontology_acronym(
string
)
OntologyName ::= ontology_name(
MultilingualString
)
OntologyIri ::= ontology_iri(
Iri
)
RootTermIri ::= root_term_iri(
Iri
)
RootTermLabel ::= root_term_label(
MultilingualString
)
MaxTraversalDepth ::= max_traversal_depth(
NonNegativeInteger
)
ValueSetIdentifier ::= value_set_identifier(
string
)
ValueSetName ::= value_set_name(
MultilingualString
)
ValueSetIri ::= value_set_iri(
Iri
)
OntologyIri, RootTermIri, and ValueSetIri denote IRIs used in controlled-term source specifications.
OntologyName, RootTermLabel, and ValueSetName are human-readable display names and carry MultilingualString values: each may be presented in one or more natural languages. OntologyAcronym and ValueSetIdentifier are technical short-form identifiers (e.g. an ontology acronym like "NCIT", a value-set key) and remain plain Unicode strings.
MaxTraversalDepth denotes a non-negative traversal-depth limit for branch-based controlled-term sources. When MaxTraversalDepth is absent, no depth limit applies and any descendant of the root term is admissible. A value of zero restricts the source to the root term itself.
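The effect of MaxTraversalDepth on a BranchSource can be sketched as a depth-limited breadth-first walk. The sketch assumes that depth is counted in edges from the root term (so depth 1 admits direct children) and that the ontology is available as a simple parent-to-children mapping; `admissible_terms` and the sample hierarchy are hypothetical.

```python
from collections import deque


def admissible_terms(children, root, max_depth=None):
    """Collect the root term and its descendants up to an optional depth."""
    admitted = {root}
    queue = deque([(root, 0)])
    while queue:
        term, depth = queue.popleft()
        # Stop expanding once the depth limit is reached; None means no limit.
        if max_depth is not None and depth >= max_depth:
            continue
        for child in children.get(term, ()):
            if child not in admitted:
                admitted.add(child)
                queue.append((child, depth + 1))
    return admitted


hierarchy = {
    "disease": ["cancer", "infection"],
    "cancer": ["carcinoma"],
    "carcinoma": ["adenocarcinoma"],
}
```

Breadth-first order means each term is first reached by its shortest path from the root, so a term reachable by both a short and a long path is admitted according to the short one.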
When OntologyDisplayHint is present on an OntologyReference, at least one of its OntologyAcronym or OntologyName components MUST be present. A display hint with neither component is non-conforming.
A ControlledTermClass SHOULD include a Label. The label is captured at the time the class is declared as a source, when the term’s display text is typically known; consumers without ontology access rely on this label to render the class. Conforming producers MAY omit the label when it is not available, in which case downstream consumers must resolve the label from the term IRI by other means. The same recommendation applies to BranchSource.RootTermLabel: producers SHOULD include it when declaring a branch source.
Rendering Hints
A RenderingHint is an optional presentational instruction carried by a FieldSpec that tells a rendering implementation how to display the field. Rendering hints are strictly presentational: they do not affect the meaning, structure, or validation of field values. Each rendering hint is typed to a specific FieldSpec family, so only compatible hint-and-field-spec combinations are expressible. For example, a TextRenderingHint may only appear on a TextFieldSpec, and a SingleValuedEnumRenderingHint may only appear on a SingleValuedEnumFieldSpec. Note that temporal rendering hints (DateRenderingHint, TimeRenderingHint, and DateTimeRenderingHint) are defined alongside their respective field specs in the Temporal Field Specs subsection.
RenderingHint ::= TextRenderingHint
| SingleValuedEnumRenderingHint
| MultiValuedEnumRenderingHint
| NumericRenderingHint
| BooleanRenderingHint
| DateRenderingHint
| TimeRenderingHint
| DateTimeRenderingHint
| ControlledTermRenderingHint
| EmailRenderingHint
| PhoneNumberRenderingHint
| LinkRenderingHint
| OrcidRenderingHint
| RorRenderingHint
| DoiRenderingHint
| PubMedIdRenderingHint
| RridRenderingHint
| NihGrantIdRenderingHint
TextRenderingHint ::= text_rendering_hint(
[TextLineMode]
[Placeholder]
)
TextLineMode ::= "singleLine" | "multiLine"
SingleValuedEnumRenderingHint ::= "radio" | "dropdown"
MultiValuedEnumRenderingHint ::= "checkbox" | "multiSelect"
NumericRenderingHint ::= numeric_rendering_hint(
[DecimalPlaces]
[Placeholder]
)
DecimalPlaces ::= decimal_places(
NonNegativeInteger
)
BooleanRenderingHint ::= "checkbox" | "toggle" | "radio" | "dropdown"
ControlledTermRenderingHint ::= controlled_term_rendering_hint( [Placeholder] )
EmailRenderingHint ::= email_rendering_hint( [Placeholder] )
PhoneNumberRenderingHint ::= phone_number_rendering_hint( [Placeholder] )
LinkRenderingHint ::= link_rendering_hint( [Placeholder] )
OrcidRenderingHint ::= orcid_rendering_hint( [Placeholder] )
RorRenderingHint ::= ror_rendering_hint( [Placeholder] )
DoiRenderingHint ::= doi_rendering_hint( [Placeholder] )
PubMedIdRenderingHint ::= pub_med_id_rendering_hint( [Placeholder] )
RridRenderingHint ::= rrid_rendering_hint( [Placeholder] )
NihGrantIdRenderingHint ::= nih_grant_id_rendering_hint( [Placeholder] )
Placeholder ::= placeholder( MultilingualString )
Placeholder is a MultilingualString-valued production carrying sample-input text shown inside an empty text-entry widget. It is purely presentational format demonstration — distinct from HelpText, which carries semantic content about the field’s meaning. Placeholder content is not validated against the field spec’s lexical-form constraints; a placeholder of "YYYY-MM-DD" may appear on a date field whose values must conform to ISO 8601, since the placeholder is a demonstration of the expected lexical shape, not an instance of one.
Placeholder appears as an optional slot on every rendering hint attached to a text-entry-capable field family: TextRenderingHint, NumericRenderingHint, DateRenderingHint, TimeRenderingHint, DateTimeRenderingHint, plus the ten rendering hints introduced for ControlledTermField, EmailField, PhoneNumberField, LinkField, and the six identifier families. It does NOT appear on BooleanRenderingHint, SingleValuedEnumRenderingHint, or MultiValuedEnumRenderingHint, since those widgets are not text-entry surfaces.
Note on TextRenderingHint shape. In earlier revisions of this spec, TextRenderingHint was a bare string enum ("singleLine" | "multiLine"). It has been restructured into a structured object carrying an optional TextLineMode (the former enum content) plus the optional Placeholder slot. This is a wire-form-breaking change for templates that carry the bare-string form; such templates require migration to the object form before they will decode under this revision of the spec.
This specification draws a strict distinction between semantic structure and presentation. Semantic distinctions MUST be modeled in FieldSpec when they affect the meaning, cardinality, or value structure of a field. This includes distinctions such as single-valued versus multi-valued enum, date versus time versus date-time, and permitted temporal precision. Purely presentational distinctions MUST NOT be modeled as separate field specs. Instead, distinctions such as single-line versus multi-line text entry, date component ordering, and 12-hour versus 24-hour time display MUST be expressed only through compatible typed rendering hints.
Accordingly, TextFieldSpec is a single semantic field spec whose single-line and multi-line display forms are represented by TextRenderingHint.
A TextFieldSpec MAY additionally define a default text value, minimum length, maximum length, validating regular expression, and a LangTagRequirement constraining the presence of the lang slot on conforming TextValue instances.
LangTagRequirement identifies whether the lang slot of a TextValue is required, optional, or forbidden by the field spec:
- "langTagRequired": every TextValue admitted by this field MUST carry a lang slot with a well-formed BCP 47 tag. Suitable for fields whose values are natural-language text that authors expect to be language-tagged (e.g., titles, abstracts, captions).
- "langTagOptional": every TextValue admitted MAY carry a lang slot. This matches the default behaviour when LangTagRequirement is absent and is provided for explicitness.
- "langTagForbidden": every TextValue admitted MUST NOT carry a lang slot. Suitable for fields whose values are technical identifiers, slugs, query fragments, or other strings for which a natural-language tag has no meaning.
When LangTagRequirement is absent from a TextFieldSpec, the constraint behaves as "langTagOptional" (the historical default).
The LangTagRequirement constraint applies to each TextValue individually: in a multi-valued field, every value MUST satisfy the constraint independently. The constraint also applies to the field-spec-level defaultValue (when present) and to any embedding-level defaultValue carried by an EmbeddedTextField.
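The three LangTagRequirement settings and the absent-means-optional default can be sketched as a small check. This is only an illustration: the function names are hypothetical, and the sketch tests only the presence of the lang slot, not whether the tag is well-formed BCP 47.

```python
from typing import Iterable, Optional

# Hypothetical helper names; only the three string constants come from the spec.
def lang_slot_conforms(requirement: Optional[str], lang: Optional[str]) -> bool:
    """Return True if one TextValue's lang slot satisfies the requirement."""
    # An absent LangTagRequirement behaves as "langTagOptional".
    effective = requirement if requirement is not None else "langTagOptional"
    if effective == "langTagRequired":
        return lang is not None
    if effective == "langTagForbidden":
        return lang is None
    if effective == "langTagOptional":
        return True
    raise ValueError(f"unknown LangTagRequirement: {requirement!r}")


def all_values_conform(requirement: Optional[str],
                       langs: Iterable[Optional[str]]) -> bool:
    """Apply the constraint to every value of a multi-valued field independently."""
    return all(lang_slot_conforms(requirement, lang) for lang in langs)
```

The same check would apply to a field-spec-level or embedding-level defaultValue, since the constraint covers those as well.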
Similarly, EnumFieldSpec distinguishes SingleValuedEnumFieldSpec from MultiValuedEnumFieldSpec semantically, while the rendering hint determines whether the UI uses radio buttons, dropdown, checkboxes, or multi-select presentation. Typed rendering hints make incompatible combinations structurally invalid.
Temporal semantics are also split structurally: DateFieldSpec, TimeFieldSpec, and DateTimeFieldSpec are distinct semantic field specs, and each carries only the rendering hints and temporal options that are meaningful for that temporal category.
The current rendering vocabulary is explicit but deliberately small: numeric fields use NumericRenderingHint (which carries an optional DecimalPlaces for display-time rounding); date fields use DateRenderingHint (with optional DateComponentOrder); time fields use TimeRenderingHint (with optional TimeFormat); and date-time fields use DateTimeRenderingHint (also with optional TimeFormat).
DecimalPlaces is a presentation concern, not a value-semantics constraint. Conforming consumers SHOULD use it to control display rounding and MAY use it as a UX-level input nicety (e.g., limiting the number of digits an end-user can type after the decimal point). It does not constrain the lexical form of a submitted RealNumberValue; conforming validators MUST NOT reject a value purely on grounds of decimal-places mismatch with the rendering hint. The slot is meaningful for RealNumberFieldSpec; on IntegerNumberFieldSpec it is harmless and conventionally omitted.
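A minimal sketch of display-time rounding follows. It assumes a finite base-10 lexical form and a half-up rounding mode; the specification does not name a rounding mode, so that choice and the function name are assumptions. The stored lexical form is never modified.

```python
from decimal import Decimal, ROUND_HALF_UP
from typing import Optional


def display_form(lexical_form: str, decimal_places: Optional[int]) -> str:
    """Render a real-number lexical form for display only.

    The submitted value keeps its original lexical form; only the
    string shown to the user is rounded.
    """
    if decimal_places is None:
        return lexical_form  # no hint: show the value as submitted
    quantum = Decimal(1).scaleb(-decimal_places)  # e.g. 2 -> Decimal("0.01")
    # ROUND_HALF_UP is an assumed choice, not mandated by the spec.
    return str(Decimal(lexical_form).quantize(quantum, rounding=ROUND_HALF_UP))
```

A validator would not call this function at all: a value such as "3.14159" remains conforming on a field whose hint carries two decimal places.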
BooleanRenderingHint admits four widget choices — checkbox, toggle, radio, and dropdown — distinguished by how they handle the unset state of a boolean field. A boolean field has three observable states at the UI: a value of true, a value of false, and no value supplied (the user has not yet asserted either). The four widget choices differ in whether they can faithfully represent the unset state:
- radio (a Yes / No radio pair) and dropdown (a Yes / No dropdown with no initial selection) admit three observable states (Yes selected, No selected, and neither selected) and so faithfully represent the unset case.
- checkbox and toggle admit only two observable states (checked/unchecked, or on/off) and so cannot distinguish false from unset. They SHOULD be used only when the field’s ValueRequirement is required (so unset is not a valid resting state) or when the surrounding application is content to interpret unset as false.
The unset state is structurally represented in the value model by absence of a FieldValue for the embedding’s key, not by a third value within BooleanValue. BooleanValue.value carries true | false only.
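The widget guidance above can be expressed as a small advisory check. The function names and the unset_means_false flag are hypothetical; only the four widget names come from the grammar.

```python
# Widgets grouped by whether they can display "neither selected".
THREE_STATE_WIDGETS = frozenset({"radio", "dropdown"})
TWO_STATE_WIDGETS = frozenset({"checkbox", "toggle"})


def can_show_unset(hint: str) -> bool:
    """True if the widget can represent the unset state faithfully."""
    if hint in THREE_STATE_WIDGETS:
        return True
    if hint in TWO_STATE_WIDGETS:
        return False
    raise ValueError(f"unknown BooleanRenderingHint: {hint!r}")


def hint_is_advisable(hint: str, value_required: bool,
                      unset_means_false: bool = False) -> bool:
    """Apply the SHOULD-level guidance for two-state widgets.

    value_required models a ValueRequirement of required;
    unset_means_false models an application that treats unset as false.
    """
    return can_show_unset(hint) or value_required or unset_means_false
```

Because the guidance is a SHOULD, a tool would more plausibly surface a failed check as a warning than as a validation error.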
Presentation Components
A PresentationComponent is a reusable artifact that contributes presentation or instructional structure to a rendered template without introducing data-bearing content. It is distinct from SchemaArtifact: where Template and Field define the structure and semantics of instance data, PresentationComponent exists purely to guide, organise, or annotate the rendered form — for example by embedding rich text instructions, illustrative images, video content, or structural breaks between sections.
PresentationComponent carries its own identity, metadata, and lifecycle information as an Artifact, making it independently reusable across multiple templates. It appears within a template only through EmbeddedPresentationComponent, which contributes no InstanceValue and is therefore invisible to the instance model. A conforming TemplateInstance MUST NOT contain an InstanceValue for an EmbeddedPresentationComponent.
The following concrete variants are defined:
PresentationComponent ::= RichTextComponent
| ImageComponent
| YoutubeVideoComponent
| SectionBreakComponent
| PageBreakComponent
RichTextComponent ::= rich_text_component(
PresentationComponentId
ModelVersion
CatalogMetadata
HtmlContent
)
ImageComponent ::= image_component(
PresentationComponentId
ModelVersion
CatalogMetadata
Iri
[Label]
[Description]
)
YoutubeVideoComponent ::= you_tube_video_component(
PresentationComponentId
ModelVersion
CatalogMetadata
Iri
[Label]
[Description]
)
SectionBreakComponent ::= section_break_component(
PresentationComponentId
ModelVersion
CatalogMetadata
)
PageBreakComponent ::= page_break_component(
PresentationComponentId
ModelVersion
CatalogMetadata
)
HtmlContent ::= html_content(
string
)
HtmlContent denotes an HTML fragment represented as a Unicode string and used by a RichTextComponent.
The permitted HTML feature set and any sanitization requirements are outside the scope of this abstract specification and SHOULD be defined by concrete serialization specifications that build on this model.
The Iri slot on ImageComponent and YoutubeVideoComponent identifies the image or video resource referenced by the corresponding presentation component.
Label and Description on ImageComponent and YoutubeVideoComponent carry accessibility metadata. Label is a short alternative-text label (the image’s alt text or the video’s caption title); Description is a longer textual description for screen readers and other assistive technologies. Both are MultilingualString values, allowing localized accessibility text. Conforming producers SHOULD provide a Label for every ImageComponent and YoutubeVideoComponent that conveys meaningful content; decorative images MAY omit the label to indicate that no alternative text is needed.
Field Spec And Value Correspondence
The FieldSpec carried by a Field determines the Value form that MUST appear in any FieldValue corresponding to an embedding of that field. This is a normative constraint: a FieldValue that carries a Value of the wrong form for the referenced field’s FieldSpec is non-conforming.
The correspondence is applied through the EmbeddedArtifactKey chain. A FieldValue in a TemplateInstance carries an EmbeddedArtifactKey that identifies an EmbeddedField in the referenced Template. That EmbeddedField references a reusable Field, which carries a FieldSpec. It is that FieldSpec that determines the permitted Value form for the FieldValue. The correspondence therefore spans the full path from instance value through embedding context to reusable field definition.
The table below gives the complete correspondence. The Field Family column identifies the abstract category in the Field hierarchy to which the concrete field belongs; families group field kinds that share related value semantics. Where a field is a direct subclass of Field with no intermediate abstract category, this column is left blank.
| Field Family | FieldSpec | Value |
|---|---|---|
| | TextFieldSpec | TextValue |
| NumericField | IntegerNumberFieldSpec | IntegerNumberValue |
| NumericField | RealNumberFieldSpec | RealNumberValue |
| | BooleanFieldSpec | BooleanValue |
| TemporalField | DateFieldSpec | DateValue |
| TemporalField | TimeFieldSpec | TimeValue |
| TemporalField | DateTimeFieldSpec | DateTimeValue |
| | ControlledTermFieldSpec | ControlledTermValue |
| EnumField | SingleValuedEnumFieldSpec | EnumValue |
| EnumField | MultiValuedEnumFieldSpec | EnumValue |
| | LinkFieldSpec | LinkValue |
| ContactField | EmailFieldSpec | EmailValue |
| ContactField | PhoneNumberFieldSpec | PhoneNumberValue |
| ExternalAuthorityField | OrcidFieldSpec | OrcidValue |
| ExternalAuthorityField | RorFieldSpec | RorValue |
| ExternalAuthorityField | DoiFieldSpec | DoiValue |
| ExternalAuthorityField | PubMedIdFieldSpec | PubMedIdValue |
| ExternalAuthorityField | RridFieldSpec | RridValue |
| ExternalAuthorityField | NihGrantIdFieldSpec | NihGrantIdValue |
| | AttributeValueFieldSpec | AttributeValue |
The two concrete enum field specs share a single value type, EnumValue. The cardinality distinction — single versus multiple — is not visible in the value type itself but in the count of values permitted per FieldValue: a SingleValuedEnumFieldSpec permits exactly one EnumValue, while a MultiValuedEnumFieldSpec permits one or more (subject to the embedding’s Cardinality). This cardinality constraint is enforced at validation rather than through distinct value types.
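The correspondence lends itself to a lookup table. The sketch below is illustrative only: it assumes field specs and values are identified by strings equal to their production names, it lists only part of the table, and it expands DateValue into its three wire-level variants. It ignores any temporal-precision restriction a DateFieldSpec may impose.

```python
# Partial mapping from FieldSpec to permitted Value kinds (assumed string names).
PERMITTED_VALUE_KINDS = {
    "TextFieldSpec": {"TextValue"},
    "IntegerNumberFieldSpec": {"IntegerNumberValue"},
    "RealNumberFieldSpec": {"RealNumberValue"},
    "BooleanFieldSpec": {"BooleanValue"},
    # DateValue is a union of three variants in the wire grammar.
    "DateFieldSpec": {"YearValue", "YearMonthValue", "FullDateValue"},
    "TimeFieldSpec": {"TimeValue"},
    "DateTimeFieldSpec": {"DateTimeValue"},
    "SingleValuedEnumFieldSpec": {"EnumValue"},
    "MultiValuedEnumFieldSpec": {"EnumValue"},
}


def value_kind_conforms(field_spec_kind: str, value_kind: str) -> bool:
    """True if a value of value_kind may appear under field_spec_kind."""
    if field_spec_kind not in PERMITTED_VALUE_KINDS:
        raise KeyError(f"field spec not covered by this sketch: {field_spec_kind}")
    return value_kind in PERMITTED_VALUE_KINDS[field_spec_kind]


def enum_count_conforms(field_spec_kind: str, count: int) -> bool:
    """Check the per-FieldValue count rule for the two enum field specs."""
    if field_spec_kind == "SingleValuedEnumFieldSpec":
        return count == 1
    if field_spec_kind == "MultiValuedEnumFieldSpec":
        return count >= 1  # the embedding's Cardinality may restrict this further
    raise ValueError(f"not an enum field spec: {field_spec_kind}")
```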
Instances
A TemplateInstance is an Artifact that records data conforming to a specific Template. Instance productions are defined here separately from schema and presentation productions so that the schema model and the instance model can be read independently.
Because TemplateInstance is a full Artifact, it carries CatalogMetadata — a TemplateInstanceId, descriptive metadata, and lifecycle metadata. This means instances are independently identifiable, catalogable artifacts in their own right rather than anonymous data records. They can be referenced, versioned, and tracked just as templates and fields can.
A TemplateInstance contains zero or more InstanceValue constructs, each keyed by an EmbeddedArtifactKey identifying the corresponding embedded artifact in the referenced template. There are two forms: FieldValue, which carries one or more typed values for an EmbeddedField, and NestedTemplateInstance, which carries a nested collection of InstanceValue constructs for an EmbeddedTemplate. EmbeddedPresentationComponent constructs produce no InstanceValue and are absent from the instance model entirely.
TemplateInstance ::= template_instance(
TemplateInstanceId
ModelVersion
CatalogMetadata
TemplateId
[Label]
InstanceValue*
)
InstanceValue ::= FieldValue
| NestedTemplateInstance
FieldValue ::= field_value(
EmbeddedArtifactKey
Value+
)
NestedTemplateInstance ::= nested_template_instance(
EmbeddedArtifactKey
InstanceValue*
)
TemplateId is the persistent schema link that ties a TemplateInstance to the Template it was created from. It is the basis for all validation and interpretation of instance content: the EmbeddedArtifactKey values in FieldValue and NestedTemplateInstance constructs are only meaningful in relation to the embedded artifacts of that specific template.
Each FieldValue’s EmbeddedArtifactKey MUST identify an EmbeddedField in the referenced Template. Each NestedTemplateInstance’s EmbeddedArtifactKey MUST identify an EmbeddedTemplate. An EmbeddedArtifactKey that identifies an EmbeddedPresentationComponent MUST NOT appear as the key of any InstanceValue. The full instance alignment constraints are specified in spec/validation.md.
To make the abstract structure concrete, consider a Template containing two EmbeddedTextField constructs keyed title and description, and one EmbeddedTemplate keyed study_arm with a maximum cardinality of three. A conforming TemplateInstance for that template would contain two FieldValue constructs — one keyed title carrying a TextValue, one keyed description carrying a TextValue — and between one and three NestedTemplateInstance constructs each keyed study_arm, where each NestedTemplateInstance contains its own InstanceValue constructs corresponding to the embedded artifacts of the nested template.
For multi-valued EmbeddedField, all values for a single field occurrence are collected within a single FieldValue using Value+. For multi-valued EmbeddedTemplate, multiplicity is represented by multiple NestedTemplateInstance constructs sharing the same EmbeddedArtifactKey within the containing TemplateInstance. This asymmetry reflects the structural difference between scalar repetition (multiple values for one field) and structural repetition (multiple complete nested instances for one embedded template). In both cases the number of values or instances MUST satisfy the Cardinality constraints defined by the corresponding EmbeddedField or EmbeddedTemplate; see spec/validation.md for the normative multiplicity rules.
NestedTemplateInstance is the recursive construct that supports arbitrarily deep nested template structure: because a NestedTemplateInstance itself contains InstanceValue*, and InstanceValue may contain further NestedTemplateInstance constructs, template nesting can be as deep as the schema requires.
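The asymmetry between scalar and structural repetition can be illustrated by counting occurrences per key. The dictionary shapes below are a sketch, not the normative wire form: the key property on instance values, the values property on a nested instance, and the treatment of a missing max as unbounded are all assumptions.

```python
from collections import Counter
from typing import Any, Dict, List


def occurrence_counts(instance_values: List[Dict[str, Any]]) -> Counter:
    """Count values per field key and nested instances per template key."""
    counts: Counter = Counter()
    for iv in instance_values:
        if iv["kind"] == "FieldValue":
            # Scalar repetition: every value sits inside one FieldValue.
            counts[iv["key"]] += len(iv["values"])
        elif iv["kind"] == "NestedTemplateInstance":
            # Structural repetition: one construct per nested instance.
            counts[iv["key"]] += 1
        else:
            raise ValueError(f"unexpected kind: {iv['kind']!r}")
    return counts


def within_cardinality(count: int, cardinality: Dict[str, int]) -> bool:
    """Check a count against a {min, max} pair; absent max is treated as unbounded."""
    upper = cardinality.get("max")
    return count >= cardinality["min"] and (upper is None or count <= upper)


# The title / description / study_arm example, with two study arms.
example = [
    {"kind": "FieldValue", "key": "title",
     "values": [{"kind": "TextValue", "value": "Trial A"}]},
    {"kind": "FieldValue", "key": "description",
     "values": [{"kind": "TextValue", "value": "A short description"}]},
    {"kind": "NestedTemplateInstance", "key": "study_arm", "values": []},
    {"kind": "NestedTemplateInstance", "key": "study_arm", "values": []},
]
```

Applying occurrence_counts to the example gives one value each for title and description and two occurrences for study_arm, which falls within a cardinality of one to three.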
Instance conformance may be enforced at data-entry time, preventing submission of a non-conforming instance, or retrospectively, by validating existing instances against their referenced template. Both modes apply the same conformance rules; the distinction is an implementation concern rather than a model-level distinction.
Absence of a value for an optional field is represented by omitting the FieldValue entirely rather than including an empty one; hence FieldValue requires Value+. Note that concrete serializations and authoring tools may have their own conventions for representing absence — for example, a JSON serialization may choose to omit a key entirely or include it with a null value — but such distinctions are a concern of the serialization layer and do not affect the abstract model defined here.
Open Questions
- Should embedded artifacts always refer to reusable artifacts by explicit reference construct, or does the CEDAR model require some embeddings to support inline artifact definition?
- Should PresentationComponent remain a direct subclass of Artifact, or should a later revision introduce an intermediate superclass for reusable non-schema artifacts? This would make the distinction between reusable schema artifacts such as Template and Field and reusable non-schema artifacts such as rich text, images, videos, and section breaks more explicit in the hierarchy.
- Should a later revision introduce a distinct QuantityFieldSpec rather than attaching optional Unit information directly to IntegerNumberFieldSpec and RealNumberFieldSpec? The current model permits fixed units on both numeric field families as a pragmatic compromise, but a dedicated quantity field spec may provide a cleaner semantic distinction for numeric values that are intrinsically unit-bearing.
Cedar Wire Grammar
This file is a formal, JSON-shaped grammar that mirrors grammar.md
production-for-production. It is the source of truth for the wire shape
of every abstract grammar production. serialization.md is its
companion: it carries the encoding philosophy, JSON-specific rules
(property naming, NFC normalisation, etc.), worked examples, and
cross-references, but does not duplicate per-production shape
information.
For every XxxYyy ::= in grammar.md there is exactly one
XxxYyy ::: in this file, and vice versa.
Status: hand-maintained, eventually generated. This file is currently authored in lock-step with grammar.md. The longer-term direction is to derive it mechanically from grammar.md plus the property-name map (§14) plus the encoding rules (§1.7). Until that generator exists, the file is hand-maintained; the §14 property-name map and the §1.7 encoding rules together define what such a generator would need to know.
1. Notation
Each line takes one of two forms:
production_name ::: type-expression
production_name ::: type-expression
// inline constraints on this production
(The placeholder production_name is shown in lower_snake_case here
purely to keep it out of the formal-production count; real wire
productions use UpperCamelCase.) The ::: separator (three colons)
distinguishes a wire-format production from an abstract grammar
production (::= in grammar.md). A wire production names the JSON
shape that encodes the corresponding abstract production.
1.1 Type expressions
| Form | Meaning |
|---|---|
| string, number, boolean, null | The corresponding JSON primitive. |
| "literal" | A string-literal type: the JSON value MUST equal the literal. Used for kind discriminators. |
| ProductionName | Reference to another wire production. |
| array<T> | A JSON array; each element is T. |
| nonEmptyArray<T> | An array of T with at least one element. |
| object { … } | A JSON object. Property syntax in §1.2. |
| T \| U | A union. Discrimination strategy is documented inline (see §1.3). |
1.2 Object property syntax
Within object { … }:
property: Type // required
property?: Type // optional; encoded only when present
"property": "literal" // a fixed string-literal value (used for kind)
Property order in the notation is informational. JSON does not preserve key order, and conforming encoders MAY emit properties in any order unless an inline constraint says otherwise (no current production requires a specific order).
1.3 Unions
some_union ::: A | B | C
// discriminator: kind
(Placeholder shown in lower_snake_case for the same reason as in §1.)
Two discrimination strategies are recognised, declared inline:
- discriminator: kind. Every member is an object production whose shape includes a kind: "MemberName" literal property. Decoders pick the variant by reading kind.
- discriminator: position. Members are distinguished by the enclosing property name and the surrounding context, not by anything on the encoded object itself. Used at singleton positions where the abstract grammar admits exactly one production at the property.
If no discriminator is declared, kind is the default.
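A decoder for a discriminator: kind union can dispatch on the kind property. The sketch below covers only two members of the Value union, and the dataclass and function names are illustrative rather than taken from any binding.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional


@dataclass(frozen=True)
class TextValue:
    value: str
    lang: Optional[str] = None


@dataclass(frozen=True)
class BooleanValue:
    value: bool


# One decoder per union member, keyed by the wire-level kind literal.
DECODERS: Dict[str, Callable[[Dict[str, Any]], Any]] = {
    "TextValue": lambda o: TextValue(o["value"], o.get("lang")),
    "BooleanValue": lambda o: BooleanValue(o["value"]),
}


def decode_value(obj: Dict[str, Any]) -> Any:
    """Pick the variant by reading kind, then decode the remaining properties."""
    kind = obj.get("kind")
    if kind not in DECODERS:
        raise ValueError(f"unknown or missing kind: {kind!r}")
    return DECODERS[kind](obj)
```

With discriminator: position there is nothing to dispatch on; the decoder for the enclosing object would call the single applicable decoder for that property directly.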
1.4 Inline constraints
Constraints that cannot be expressed in the type expression appear as
single-line //-prefixed comments immediately below the production:
MultilingualString ::: nonEmptyArray<LangString>
// lang tags MUST be unique within the array (case-folded)
Constraints are normative.
1.5 The kind rule
Rule. A wire object carries a "kind": "X" property if and only
if its abstract grammar production is a member of some
discriminator: kind union — regardless of the position the object
occupies in the wire form. Productions that are not members of any
discriminator: kind union (Cardinality, Annotation,
LabelOverride, Property, CatalogMetadata,
LifecycleMetadata, SchemaArtifactVersioning,
Unit, OntologyReference, OntologyDisplayHint,
ControlledTermClass, PermissibleValue, Meaning, and the
temporal RenderingHint object variants) never carry kind.
This rule is purely a property of the production: it does not depend
on where in the document the object appears. A given production
either always carries kind on the wire or never does. In particular,
singleton positions — slots where the enclosing context already
fixes the family — make no difference to whether kind is carried;
a polymorphic-union member retains its kind even when the slot’s
type pins the family unambiguously. The kind is then redundant for
decoding (the family is recoverable from the slot type) but is
retained because uniformity of the rule is more valuable than the
small wire-size saving.
Terms.
- Singleton position: a property slot in a wire object where the
abstract grammar admits exactly one production (e.g.
EmbeddedField.cardinality admits only Cardinality,
EmbeddedTextField.defaultValue admits only TextValue).
- Singleton-only production: an abstract production that appears
only at singleton positions and is never a member of a
discriminator: kind union (e.g. Cardinality, Annotation,
LabelOverride). Equivalently: the productions enumerated in the Rule above.
Worked examples. Two cases illustrate the rule.
Case 1 — polymorphic-union member always carries kind. TextValue
is a member of the Value union (which uses discriminator: kind).
At the polymorphic FieldValue.values[*] position the wire form is:
{ "kind": "TextValue", "value": "Hello", "lang": "en" }
At the singleton EmbeddedTextField.defaultValue position, where the
enclosing EmbeddedTextField.kind already fixes the family, the
wire form is the same:
{
"kind": "EmbeddedTextField",
"key": "comment",
"artifactRef": "https://example.org/fields/comment",
"defaultValue": { "kind": "TextValue", "value": "Initial", "lang": "en" }
}
The inner "kind": "TextValue" is structurally redundant at this
slot but is retained because TextValue is a polymorphic-union
member and the rule is uniform across positions.
Case 2 — singleton-only production never carries kind.
Cardinality is not a member of any discriminator: kind union — it
appears only at singleton positions (e.g.
EmbeddedField.cardinality, EmbeddedTemplate.cardinality). Its
wire form never carries kind:
{
"kind": "EmbeddedTextField",
"key": "alias",
"artifactRef": "https://example.org/fields/alias",
"cardinality": { "min": 0, "max": 3 }
}
Wire vs. in-memory. The kind rule constrains the wire form,
not the in-memory form of any host-language binding. Bindings
MAY carry synthetic kind (or any other) discriminator fields on
their in-memory representations of singleton-only productions —
e.g. Cardinality, Annotation — for runtime introspection,
type-guard ergonomics, or debugging. Any such synthetic
discriminator MUST be stripped before encoding and MUST NOT appear
on the wire; the converse is also possible (a binding’s in-memory
type may omit a kind it chooses to recover from context, provided
the encoder restores it). (See bindings.md §2.1
for examples.)
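As an illustration of the wire versus in-memory distinction, the sketch below gives an in-memory Cardinality a synthetic kind field and drops it at encode time. The class layout and function names are hypothetical and are not taken from bindings.md.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class Cardinality:
    min: int
    max: int
    # Synthetic discriminator for in-memory introspection only.
    kind: str = "Cardinality"


def encode_cardinality(c: Cardinality) -> Dict[str, Any]:
    """Encode a singleton-only production; the synthetic kind is stripped."""
    return {"min": c.min, "max": c.max}


def encode_embedded_text_field(key: str, artifact_ref: str,
                               cardinality: Cardinality) -> Dict[str, Any]:
    """The enclosing object is a union member, so it does carry kind."""
    return {
        "kind": "EmbeddedTextField",
        "key": key,
        "artifactRef": artifact_ref,
        "cardinality": encode_cardinality(cardinality),
    }
```

Encoding the alias field from Case 2 with Cardinality(0, 3) reproduces the wire object shown there, with no kind inside cardinality.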
1.6 Collapsed wrappers
A typed singleton wrapper is an abstract grammar production whose
constructor form has exactly one component. The inner component may
be a primitive lexical category (string, number, boolean), another
typed singleton wrapper, or a composite production such as
MultilingualString. For example:
Iri ::= iri(IriString)
TemplateId ::= template_id(IriString)
Label ::= label(MultilingualString)
In the abstract grammar these productions exist to give a value a
role — Iri is a syntactically valid IRI, TemplateId is
specifically the identifier of a template, Label is a label rather
than an arbitrary multilingual string. The abstract grammar treats
these roles as distinct types so that, e.g., a TemplateId cannot
be substituted for a FieldId even though both reduce to a string at
the wire level.
On the wire, however, this typed-role information is recovered from
the surrounding context (the property name and the abstract grammar
production at that slot). The wrapper therefore collapses to its inner
type at encode time and disappears from the JSON, leaving only the
inner value (a primitive, an array, or whichever shape the inner type
encodes to). The wire grammar still names the wrapper production where
the abstract grammar does, so that slot types in composite productions
remain isomorphic to the abstract grammar’s component types — but the
wrapper’s wire form ::: is the wire form of whatever it carries.
The wrappers fall into four groups by inner type:
- IRI-typed (string, syntactically valid IRI per RFC 3987): Iri, TermIri, every XxxFieldId, TemplateId, TemplateInstanceId, PresentationComponentId, PropertyIri, OrcidIri, RorIri, DoiIri, PubMedIri, RridIri, NihGrantIri, OntologyIri, RootTermIri, ValueSetIri, PreviousVersion, DerivedFrom, CreatedBy, ModifiedBy.
- Other strings (string): LanguageTag, LexicalForm, IsoDateTimeStamp, OntologyAcronym, ValueSetIdentifier, Notation, Identifier, AttributeName, HtmlContent, EmbeddedArtifactKey, ValidationRegex, Token, Version, ModelVersion, CreatedOn, ModifiedOn.
- Numbers: NonNegativeInteger, MinCardinality, MaxCardinality, MinLength, MaxLength, DecimalPlaces, MaxTraversalDepth.
- MultilingualString-typed (the inner type is itself a composite, encoded as a nonEmptyArray<LangString> per the MultilingualString wire production; the wrapper carries no additional wire shape): Name, Description, PreferredLabel, AlternativeLabel, Label, PropertyLabel, OntologyName, ValueSetName, RootTermLabel, Header, Footer.
Version and ModelVersion carry SemanticVersion 2.0.0 lexical
strings. CreatedOn and ModifiedOn carry ISO 8601 date-time
lexical strings. CreatedBy, ModifiedBy, PreviousVersion, and
DerivedFrom carry IRIs.
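Wrapper collapse can be sketched with two in-memory wrapper types that stay distinct in the host language but encode to their inner value. The class layout is illustrative, and the property names templateId and label in the usage below are placeholders, since the property-name map (§14) is not reproduced here.

```python
import json
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class TemplateId:
    iri: str  # inner IriString


@dataclass(frozen=True)
class Label:
    entries: Tuple[Tuple[str, str], ...]  # (value, lang) pairs


def encode_template_id(t: TemplateId) -> str:
    """IRI-typed wrapper: collapses to a bare JSON string."""
    return t.iri


def encode_label(label: Label) -> List[Dict[str, str]]:
    """MultilingualString-typed wrapper: collapses to an array of LangString."""
    return [{"value": value, "lang": lang} for value, lang in label.entries]


# Hypothetical enclosing slots; the role of each value is carried by its property name.
slots = {
    "templateId": encode_template_id(TemplateId("https://example.org/templates/t1")),
    "label": encode_label(Label((("Study", "en"), ("Étude", "fr")))),
}
wire = json.dumps(slots, ensure_ascii=False)
```

Nothing in the resulting JSON names TemplateId or Label; a decoder recovers the role from the property at which each value appears.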
1.7 Encoding rules
This section summarises the rules a generator would apply to derive
wire-grammar.md from grammar.md plus the property-name map (§14).
The rules are also the framing under which the file should be read:
each ::: production in the rest of the file is what these rules
produce when applied to the corresponding ::= production in
grammar.md.
1. Production naming. Every abstract production XxxYyy ::= ... in grammar.md becomes a wire production XxxYyy ::: ... with the same name.
2. Object-form productions. A production that composes one or more named components encodes as object { ... } with property names drawn from the property-name map (§14). When such a production is a member of a kind-discriminated union, its object additionally carries "kind": "XxxYyy" (see rule 7).
3. Optional components. A grammar.md [X] component becomes an optional wire property prop?: X and is omitted from the JSON when absent.
4. Repeated components. A grammar.md X* becomes a wire array<X>; a grammar.md X+ becomes a wire nonEmptyArray<X>. Some sequence positions are encoded as omittable optional arrays per the wrapping principle of serialization.md §5: altLabels?: array<AlternativeLabel> and annotations?: array<Annotation> on CatalogMetadata are SHOULD-omitted when empty, and the spec-level MultiValuedEnumFieldSpec.defaultValues is similarly optional. These exceptions are flagged at the production sites with inline constraints.
5. Collapsed wrappers. Productions whose abstract form is a single-component wrapper around a primitive collapse to that primitive on the wire (§1.6). Their ::: definitions remain in this file for completeness and for use as type names at slot positions in composite productions: every slot in an object { ... } is typed with the abstract grammar’s component name (e.g. key: EmbeddedArtifactKey rather than key: string). This makes the wire form’s slot types isomorphic to the abstract grammar’s component types, even where the encoding bottoms out at a JSON primitive.
6. Discriminator strategies. Two strategies are recognised, declared inline on the union: discriminator: kind (default) and discriminator: position. See §1.3.
7. The kind rule. A kind: "X" literal property appears on a wire object if and only if its production is a member of some discriminator: kind union, regardless of position. Productions not so used (Cardinality, Annotation, LabelOverride, Property, etc.) never carry kind. See §1.5 for the full statement.
8. Primitive bottom-out. Where the abstract grammar uses a bare primitive type (string, boolean, number) without a typed wrapper, the wire form uses that primitive directly (e.g. Cardinality.min: number, BooleanValue.value: boolean).
The wrapping principle that underlies rule 5 is given normatively in
serialization.md §5; this section restates only
the form in which it appears in the wire grammar.
2. Scalar and Datatype Leaves
The grammar’s primitive string types (SemanticVersion, IriString,
Bcp47Tag, Iso8601DateTimeLexicalForm, AsciiIdentifier,
IntegerLexicalForm) are abstract leaves with no ::= production;
on the wire they all encode as string, with constraints noted at
each site that uses them.
2.1 Core IRI and string types
Iri ::: string
// a syntactically valid IRI per RFC 3987. At every position in the
// model where the grammar uses Iri the wire form is a JSON string.
TermIri ::: Iri
// a documented role; encodes as Iri
LanguageTag ::: string
// a well-formed BCP 47 language tag
LexicalForm ::: string
// a Unicode string; SHOULD be in Unicode Normalization Form C
IsoDateTimeStamp ::: string
// an ISO 8601 date-time lexical form
NonNegativeInteger ::: number
// a JSON number that is a non-negative integer
// values exceeding 2^53 - 1 MUST be encoded as a string
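The large-value rule on NonNegativeInteger can be sketched as follows. The function name is hypothetical, and the base-10 string form for oversized values is an assumption; the constraint above only says such values are encoded as a string.

```python
from typing import Union

# Largest integer that is exactly representable as an IEEE 754 double.
MAX_SAFE_INTEGER = 2**53 - 1


def encode_non_negative_integer(n: int) -> Union[int, str]:
    """Return the JSON-ready form of a NonNegativeInteger."""
    if n < 0:
        raise ValueError("NonNegativeInteger must not be negative")
    if n > MAX_SAFE_INTEGER:
        return str(n)  # assumed base-10 string form
    return n
```

A decoder following the same convention would accept either a JSON number or a string at this position and convert both to an integer.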
2.2 Multilingual strings
LangString ::: object {
value: string
lang: string
}
// lang MUST be a well-formed BCP 47 tag
MultilingualString ::: nonEmptyArray<LangString>
// lang tags MUST be unique within the array (case-folded comparison)
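The two structural constraints on MultilingualString (non-empty, and lang tags unique under case folding) can be checked as sketched below. The function name and error strings are illustrative, and the sketch does not test whether each tag is well-formed BCP 47.

```python
from typing import Dict, List


def multilingual_string_errors(entries: List[Dict[str, str]]) -> List[str]:
    """Return a list of constraint violations; an empty list means conforming."""
    errors: List[str] = []
    if not entries:
        errors.append("MultilingualString must contain at least one LangString")
    seen = set()
    for entry in entries:
        folded = entry["lang"].casefold()  # case-folded comparison
        if folded in seen:
            errors.append(f"duplicate lang tag: {entry['lang']}")
        seen.add(folded)
    return errors
```

Under this check, the tags "en-GB" and "en-gb" count as duplicates, while "en" and "en-GB" do not.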
2.3 Numeric datatype kind
RealNumberDatatypeKind ::: "decimal" | "float" | "double"
// CEDAR-native enum naming the three real-number kinds.
// The mapping to XSD datatype IRIs is defined separately in
// rdf-projection.md and is out of scope for the wire form.
IntegerNumberValue is fixed to a single integer category and carries
no datatype slot on the wire. Temporal Value variants
(FullDateValue, TimeValue, DateTimeValue) likewise carry no
datatype slot — the temporal category is fixed by the variant’s
kind.
3. Values
Value ::: TextValue | NumericValue | BooleanValue
| DateValue | TimeValue | DateTimeValue
| ControlledTermValue | EnumValue | LinkValue
| EmailValue | PhoneNumberValue | ExternalAuthorityValue
| AttributeValue
// discriminator: kind
// NumericValue, DateValue, and ExternalAuthorityValue are themselves
// unions; their members supply the kind discriminator directly
NumericValue ::: IntegerNumberValue | RealNumberValue
// discriminator: kind
3.1 Scalar values
Scalar Value variants carry their content directly. There is no inner
literal wrapper. TextValue carries an optional lang for
language-tagged text; IntegerNumberValue carries a base-10 integer
lexical form (datatype is fixed at xsd:integer and not carried);
RealNumberValue carries a real-valued lexical form paired with the
required datatype enum (decimal | float | double); BooleanValue
carries a JSON boolean.
TextValue ::: object {
"kind": "TextValue"
value: LexicalForm
lang?: LanguageTag
}
// lang, when present, MUST be a well-formed BCP 47 tag
// value MUST be in Unicode Normalization Form C
IntegerNumberValue ::: object {
"kind": "IntegerNumberValue"
value: LexicalForm
}
// value is a base-10 integer lexical form
// datatype is implicit (xsd:integer) and not carried on the wire
RealNumberValue ::: object {
"kind": "RealNumberValue"
value: LexicalForm
datatype: RealNumberDatatypeKind
}
// value is a base-10 real-valued lexical form
// datatype is the RealNumberDatatypeKind token (decimal, float, or double)
// naming the corresponding XSD datatype (xsd:decimal, xsd:float, or xsd:double)
BooleanValue ::: object {
"kind": "BooleanValue"
value: boolean
}
// value is a JSON boolean (true or false)
// datatype is implicit (xsd:boolean) and not carried on the wire
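The scalar shapes above can be checked with a short dispatcher on kind. This is a sketch under stated assumptions: `check_scalar` is a hypothetical helper, LexicalForm is assumed to arrive as a JSON string, and the integer pattern is an assumed reading of "base-10 integer lexical form" (the authoritative lexical productions live in grammar.md). The real-number lexical form is not checked here.

```python
import re
import unicodedata

REAL_KINDS = {"decimal", "float", "double"}  # RealNumberDatatypeKind tokens
# Assumed shape of a base-10 integer lexical form: optional sign, then digits.
_INTEGER = re.compile(r"[+-]?[0-9]+")


def check_scalar(obj):
    """Return True when obj has the wire shape of one scalar Value variant."""
    kind = obj.get("kind")
    value = obj.get("value")
    if kind == "TextValue":
        # value MUST be in Unicode Normalization Form C
        return isinstance(value, str) and unicodedata.is_normalized("NFC", value)
    if kind == "IntegerNumberValue":
        # no datatype slot: the integer category is fixed by the kind
        return isinstance(value, str) and _INTEGER.fullmatch(value) is not None
    if kind == "RealNumberValue":
        # datatype is required and must be one of the three enum tokens
        return isinstance(value, str) and obj.get("datatype") in REAL_KINDS
    if kind == "BooleanValue":
        # a JSON boolean, not the strings "true" / "false"
        return isinstance(value, bool)
    return False
```

Carrying the integer as a string means a value such as `"9007199254740993"` survives decoding without passing through a floating-point number.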
3.2 Temporal values
Each temporal Value variant carries its lexical form directly. The
datatype is fixed by the variant’s kind and is not carried on the
wire.
DateValue ::: YearValue YearMonthValue FullDateValue
// discriminator: kind
YearValue ::: object {
"kind": "YearValue"
value: LexicalForm
}
// value matches YYYY
YearMonthValue ::: object {
"kind": "YearMonthValue"
value: LexicalForm
}
// value matches YYYY-MM
FullDateValue ::: object {
"kind": "FullDateValue"
value: LexicalForm
}
// value is an xsd:date lexical form (YYYY-MM-DD with optional zone)
TimeValue ::: object {
"kind": "TimeValue"
value: LexicalForm
}
// value is an xsd:time lexical form
DateTimeValue ::: object {
"kind": "DateTimeValue"
value: LexicalForm
}
// value is an xsd:dateTime lexical form
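Because the temporal category is fixed by kind, a decoder can select the expected lexical shape from the discriminator alone. The sketch below covers only the three DateValue arms; `matches_date_shape` is a hypothetical name, and the patterns are approximations that check shape rather than calendar validity (for instance, month 13 would pass).

```python
import re

# Approximate lexical shapes per kind (assumption: four-digit years only).
_DATE_PATTERNS = {
    "YearValue": re.compile(r"[0-9]{4}"),                      # YYYY
    "YearMonthValue": re.compile(r"[0-9]{4}-[0-9]{2}"),        # YYYY-MM
    "FullDateValue": re.compile(                               # YYYY-MM-DD, optional zone
        r"[0-9]{4}-[0-9]{2}-[0-9]{2}(Z|[+-][0-9]{2}:[0-9]{2})?"
    ),
}


def matches_date_shape(value_obj):
    """Return True when value_obj is a DateValue arm with a matching lexical shape."""
    pattern = _DATE_PATTERNS.get(value_obj.get("kind"))
    value = value_obj.get("value")
    if pattern is None or not isinstance(value, str):
        return False
    return pattern.fullmatch(value) is not None
```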
3.3 Controlled-term value
Label ::: MultilingualString
Notation ::: string
PreferredLabel ::: MultilingualString
ControlledTermValue ::: object {
"kind": "ControlledTermValue"
term: TermIri
label: Label
notation: Notation
preferredLabel: PreferredLabel
}
// term is a TermIri (an Iri identifying the term)
3.4 Enum value
EnumValue ::: object {
"kind": "EnumValue"
value: Token
}
// value is the canonical Token of one of the referenced
// EnumFieldSpec's PermissibleValue entries
// value MUST be a non-empty Unicode string
EnumValue.value carries the wire form of the abstract grammar’s
Token slot: the wire property name is value for consistency with
other Value variants, while the abstract production names the slot
Token. Token is defined in §7.3 alongside PermissibleValue.
3.5 Link value
LinkValue ::: object {
"kind": "LinkValue"
iri: Iri
label: Label
}
3.6 Contact values
EmailValue ::: object {
"kind": "EmailValue"
value: LexicalForm
}
PhoneNumberValue ::: object {
"kind": "PhoneNumberValue"
value: LexicalForm
}
3.7 External authority values
ExternalAuthorityValue ::: OrcidValue RorValue DoiValue
PubMedIdValue RridValue NihGrantIdValue
// discriminator: kind
OrcidValue ::: object {
"kind": "OrcidValue"
iri: OrcidIri
label: Label
}
RorValue ::: object {
"kind": "RorValue"
iri: RorIri
label: Label
}
DoiValue ::: object {
"kind": "DoiValue"
iri: DoiIri
label: Label
}
PubMedIdValue ::: object {
"kind": "PubMedIdValue"
iri: PubMedIri
label: Label
}
RridValue ::: object {
"kind": "RridValue"
iri: RridIri
label: Label
}
NihGrantIdValue ::: object {
"kind": "NihGrantIdValue"
iri: NihGrantIri
label: Label
}
The typed external-authority IRI productions collapse to plain string IRIs on the wire — see §1.6.
OrcidIri ::: Iri
RorIri ::: Iri
DoiIri ::: Iri
PubMedIri ::: Iri
RridIri ::: Iri
NihGrantIri ::: Iri
3.8 Attribute value
AttributeName ::: string
AttributeValue ::: object {
"kind": "AttributeValue"
name: AttributeName
value: Value
}
// value is a tagged Value carrying its kind discriminator per §1.5.
4. Identifiers (artifact)
Each artifact identifier wire-encodes as an Iri (which itself
collapses to a plain string IRI per §1.6); the abstract grammar’s
typed-role distinction is not visible on the wire.
FieldId is the umbrella union of the twenty typed XxxFieldId families per grammar.md; on the wire its encoding is just the encoding of whichever family member is at the slot position, which in every case is Iri. The wire grammar therefore lists `FieldId ::: Iri` alongside each typed family for consistency.
FieldId ::: Iri
TextFieldId ::: Iri
IntegerNumberFieldId ::: Iri
RealNumberFieldId ::: Iri
BooleanFieldId ::: Iri
DateFieldId ::: Iri
TimeFieldId ::: Iri
DateTimeFieldId ::: Iri
ControlledTermFieldId ::: Iri
SingleValuedEnumFieldId ::: Iri
MultiValuedEnumFieldId ::: Iri
LinkFieldId ::: Iri
EmailFieldId ::: Iri
PhoneNumberFieldId ::: Iri
OrcidFieldId ::: Iri
RorFieldId ::: Iri
DoiFieldId ::: Iri
PubMedIdFieldId ::: Iri
RridFieldId ::: Iri
NihGrantIdFieldId ::: Iri
AttributeValueFieldId ::: Iri
TemplateId ::: Iri
PresentationComponentId ::: Iri
TemplateInstanceId ::: Iri
The family of an identifier is recovered from the kind discriminator
on the enclosing object — Field and EmbeddedField for FieldId
variants, Template and EmbeddedTemplate for TemplateId,
PresentationComponent and EmbeddedPresentationComponent for
PresentationComponentId, and TemplateInstance for
TemplateInstanceId. The identifier shape itself carries no family
information.
The same identifier productions serve at both the definition site
of a reusable artifact (e.g. Field.id, Template.id) and the
reference site where it is embedded (e.g.
EmbeddedField.artifactRef, EmbeddedTemplate.artifactRef); the
abstract grammar does not distinguish reference-typed productions from
identity-typed ones, and on the wire both positions encode as a plain
IRI string.
5. Catalog Metadata
5.1 Aggregate structure
CatalogMetadata is flat on the wire: its descriptive properties
(preferredLabel, description, identifier, altLabels), its
lifecycle slot, and its annotations slot are all direct
members of the same object — there is no descriptiveMetadata
wrapper.
Description ::: MultilingualString
Identifier ::: string
AlternativeLabel ::: MultilingualString
CatalogMetadata ::: object {
preferredLabel: PreferredLabel
description: Description
identifier: Identifier
altLabels: array<AlternativeLabel>
lifecycle: LifecycleMetadata
annotations: array<Annotation>
}
// preferredLabel is the artifact's catalog-display name. It is
// distinct from a schema artifact's *rendered* display name, which
// lives on a top-level slot on the artifact itself (Field.label,
// Template.title, TemplateInstance.label).
// altLabels SHOULD be omitted from the wire when empty; it round-trips
// as an empty array in memory
// annotations SHOULD be omitted from the wire when empty; it round-trips
// as an empty array in memory
// the grammar's Description, PreferredLabel, and AlternativeLabel
// productions are MultilingualString-typed wrappers that collapse on
// the wire (§1.6); the type names appear here for parity with the
// abstract grammar's component naming
CatalogMetadata is uniform across every artifact kind: Field,
Template, PresentationComponent, and TemplateInstance all carry
the same CatalogMetadata shape under the wire-form metadata key.
Schema artifacts (Field, Template) additionally carry
SchemaArtifactVersioning as a separate top-level versioning slot
on the artifact itself; non-schema artifacts (PresentationComponent,
TemplateInstance) do not carry versioning. The
SchemaArtifactMetadata wrapper production used in prior revisions
of this specification is removed: in the new shape, schema artifacts
carry metadata and versioning as parallel top-level slots rather
than as a single metadata-wrapped blob.
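The omit-when-empty guidance for altLabels and annotations can be sketched as a pair of encode and decode helpers. The function names are hypothetical, and the in-memory form is modelled as a plain dict purely for illustration.

```python
# Slots that SHOULD be omitted from the wire when empty and that
# round-trip as an empty array in memory.
_OMIT_WHEN_EMPTY = ("altLabels", "annotations")


def encode_catalog_metadata(meta):
    """Return the wire object, dropping empty altLabels / annotations."""
    wire = dict(meta)
    for slot in _OMIT_WHEN_EMPTY:
        if wire.get(slot) == []:
            del wire[slot]
    return wire


def decode_catalog_metadata(wire):
    """Return the in-memory form, restoring omitted slots as empty arrays."""
    meta = dict(wire)
    for slot in _OMIT_WHEN_EMPTY:
        meta.setdefault(slot, [])
    return meta
```

With these two helpers, decoding an encoded object yields the original in-memory value whether or not the arrays were empty.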
5.2 Lifecycle metadata
CreatedOn ::: string
CreatedBy ::: string
ModifiedOn ::: string
ModifiedBy ::: string
LifecycleMetadata ::: object {
createdOn: CreatedOn
createdBy: CreatedBy
modifiedOn: ModifiedOn
modifiedBy: ModifiedBy
}
// createdOn and modifiedOn carry IsoDateTimeStamp values
// createdBy and modifiedBy carry agent Iri values
5.3 Schema versioning
SchemaArtifactVersioning ::: object {
version: Version
status: Status
previousVersion: PreviousVersion
derivedFrom: DerivedFrom
}
// version is a SemanticVersion lexical form
// when both previousVersion and derivedFrom are present, they MUST
// NOT carry the same IRI (per grammar.md §Schema Artifact
// Versioning); succession and derivation are mutually exclusive at
// any single point
Version ::: string
ModelVersion ::: string
// a SemanticVersion 2.0.0 lexical form; carried directly on every
// concrete artifact wire object as the top-level `modelVersion` slot
PreviousVersion ::: Iri
DerivedFrom ::: Iri
Status ::: "draft" "published"
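The versioning constraints above can be sketched as follows. The helper name `check_versioning` is hypothetical; the sketch assumes version and status are required, and the pattern is a simplified approximation of the Semantic Versioning 2.0.0 grammar rather than the full production.

```python
import re

# Simplified approximation of a SemanticVersion 2.0.0 lexical form (assumption):
# MAJOR.MINOR.PATCH with optional pre-release and build suffixes.
_SEMVER = re.compile(
    r"(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)"
    r"(-[0-9A-Za-z.-]+)?(\+[0-9A-Za-z.-]+)?"
)


def check_versioning(versioning):
    """Return a list of problems found in a SchemaArtifactVersioning object."""
    problems = []
    version = versioning.get("version")
    if not isinstance(version, str) or _SEMVER.fullmatch(version) is None:
        problems.append("version is not a SemanticVersion lexical form")
    if versioning.get("status") not in ("draft", "published"):
        problems.append("status must be 'draft' or 'published'")
    previous = versioning.get("previousVersion")
    derived = versioning.get("derivedFrom")
    # Succession and derivation MUST NOT point at the same IRI.
    if previous is not None and previous == derived:
        problems.append("previousVersion and derivedFrom carry the same IRI")
    return problems
```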
5.4 Annotations
Annotation ::: object {
property: Iri
body: AnnotationValue
}
// property is the annotation-property Iri (the grammar's bare Iri
// collapses to a string per §1.6)
AnnotationValue ::: AnnotationStringValue AnnotationIriValue
// discriminator: kind
AnnotationStringValue ::: object {
"kind": "AnnotationStringValue"
value: LexicalForm
lang: LanguageTag
}
// lang, when present, MUST be a well-formed BCP 47 tag
// value MUST be in Unicode Normalization Form C
AnnotationIriValue ::: object {
"kind": "AnnotationIriValue"
iri: Iri
}
// iri carries an Iri value (RFC 3987)
6. Embedded Artifact Properties
6.1 Embedded artifact key
EmbeddedArtifactKey ::: string
// matches the pattern [A-Za-z][A-Za-z0-9_-]*
// unique within the containing Template (constraint enforced on Template)
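Both key constraints (the lexical pattern and uniqueness within the containing Template) can be checked in one pass. A minimal sketch, with `check_keys` as a hypothetical helper that receives the keys of a Template's embedded artifacts in order:

```python
import re

# Pattern given for EmbeddedArtifactKey.
_KEY = re.compile(r"[A-Za-z][A-Za-z0-9_-]*")


def check_keys(keys):
    """Return problems for malformed or repeated embedded-artifact keys."""
    problems = []
    seen = set()
    for key in keys:
        if _KEY.fullmatch(key) is None:
            problems.append(f"malformed key: {key!r}")
        elif key in seen:
            problems.append(f"duplicate key: {key!r}")
        seen.add(key)
    return problems
```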
6.2 Requirements
ValueRequirement ::: "required" "recommended" "optional"
6.3 Cardinality
Cardinality ::: object {
min: MinCardinality
max: MaxCardinality
}
// min is a non-negative integer
// max omitted ⇒ unbounded above (per grammar.md §Cardinality)
MinCardinality ::: number
MaxCardinality ::: number
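The reading of an omitted max as "unbounded above" can be made concrete with a small predicate. This sketch assumes min is present on the object; `count_within_cardinality` is a hypothetical name.

```python
def count_within_cardinality(count, cardinality):
    """Return True when count satisfies the Cardinality wire object."""
    minimum = cardinality["min"]      # non-negative integer (assumed present)
    maximum = cardinality.get("max")  # omitted means unbounded above
    if count < minimum:
        return False
    return maximum is None or count <= maximum
```

For example, `{"min": 1}` admits any count of one or more, while `{"min": 0, "max": 2}` rejects a count of three.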
6.4 Visibility
Visibility ::: "visible" "hidden"
6.5 Defaults
Defaults are specified at two layers, with parallel typing per
family. See grammar.md §Defaults for the abstract grammar’s full
treatment, including precedence and the UI/UX-only semantics; this
section gives the wire form.
Embedding-level defaults. The optional defaultValue slot on
each EmbeddedXxxField is typed family-by-family with the family’s
Value type. There is no DefaultValue union and no per-family
XxxDefaultValue wrapper on the wire: the defaultValue JSON
encodes directly as the corresponding family’s Value. Per the
kind rule (§1.5), every Value family is a member of the Value
discriminator-kind union, so every embedding-level defaultValue
carries a kind discriminator on the wire.
| Embedded field | defaultValue wire form |
|---|---|
| EmbeddedTextField | TextValue: { "kind": "TextValue", "value": …, "lang"?: … } |
| EmbeddedIntegerNumberField | IntegerNumberValue: { "kind": "IntegerNumberValue", "value": … } |
| EmbeddedRealNumberField | RealNumberValue: { "kind": "RealNumberValue", "value": …, "datatype": … } |
| EmbeddedBooleanField | BooleanValue: { "kind": "BooleanValue", "value": … } (value is a JSON boolean) |
| EmbeddedDateField | one of the DateValue arms: { "kind": "YearValue" \| "YearMonthValue" \| "FullDateValue", "value": … } |
| EmbeddedTimeField | TimeValue: { "kind": "TimeValue", "value": … } |
| EmbeddedDateTimeField | DateTimeValue: { "kind": "DateTimeValue", "value": … } |
| EmbeddedControlledTermField | ControlledTermValue: { "kind": "ControlledTermValue", … } |
| EmbeddedSingleValuedEnumField | EnumValue: { "kind": "EnumValue", "value": … } |
| EmbeddedMultiValuedEnumField | array<EnumValue>: each element { "kind": "EnumValue", "value": … } |
| EmbeddedLinkField | LinkValue: { "kind": "LinkValue", … } |
| EmbeddedEmailField | EmailValue: { "kind": "EmailValue", "value": … } |
| EmbeddedPhoneNumberField | PhoneNumberValue: { "kind": "PhoneNumberValue", "value": … } |
| EmbeddedOrcidField | OrcidValue: { "kind": "OrcidValue", … } |
| EmbeddedRorField | RorValue: { "kind": "RorValue", … } |
| EmbeddedDoiField | DoiValue: { "kind": "DoiValue", … } |
| EmbeddedPubMedIdField | PubMedIdValue: { "kind": "PubMedIdValue", … } |
| EmbeddedRridField | RridValue: { "kind": "RridValue", … } |
| EmbeddedNihGrantIdField | NihGrantIdValue: { "kind": "NihGrantIdValue", … } |
EmbeddedAttributeValueField has no defaultValue slot (per §9).
Field-level defaults. Every XxxFieldSpec (with one exception)
carries an optional defaultValue slot whose type matches its
embedding-level counterpart. The two layers are independent: a
field MAY ship with a field-level default and a Template embedding
that field MAY override that default with an embedding-level
defaultValue (see grammar.md §Defaults for the full precedence
rule). The wire shapes are identical to the embedding-level table
above, with the following per-family details:
- TextFieldSpec.defaultValue?: TextValue
- IntegerNumberFieldSpec.defaultValue?: IntegerNumberValue
- RealNumberFieldSpec.defaultValue?: RealNumberValue
- BooleanFieldSpec.defaultValue?: BooleanValue
- DateFieldSpec.defaultValue?: DateValue (the arm MUST be consistent with dateValueType)
- TimeFieldSpec.defaultValue?: TimeValue
- DateTimeFieldSpec.defaultValue?: DateTimeValue
- ControlledTermFieldSpec.defaultValue?: ControlledTermValue
- LinkFieldSpec.defaultValue?: LinkValue
- EmailFieldSpec.defaultValue?: EmailValue
- PhoneNumberFieldSpec.defaultValue?: PhoneNumberValue
- OrcidFieldSpec.defaultValue?: OrcidValue
- RorFieldSpec.defaultValue?: RorValue
- DoiFieldSpec.defaultValue?: DoiValue
- PubMedIdFieldSpec.defaultValue?: PubMedIdValue
- RridFieldSpec.defaultValue?: RridValue
- NihGrantIdFieldSpec.defaultValue?: NihGrantIdValue
- SingleValuedEnumFieldSpec.defaultValue?: EnumValue, a tagged EnumValue whose value MUST equal the Token of one of the spec’s permissible-value entries.
- MultiValuedEnumFieldSpec.defaultValues?: array<EnumValue>, a (possibly empty) JSON array of tagged EnumValue entries; each value MUST equal the Token of one of the spec’s permissible-value entries, and the array MUST NOT contain duplicate value entries.
AttributeValueFieldSpec carries no field-level default.
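The relationship between the two default layers can be sketched as a lookup that prefers the embedding-level slot. This covers only the override case described in this section; the full precedence rule is in grammar.md §Defaults and may include conditions not modelled here. The helper name `effective_default` is hypothetical.

```python
def effective_default(field_spec, embedded_field):
    """Return the default a form would pre-fill, or None when neither layer sets one."""
    # An embedding-level defaultValue overrides the field-level default.
    if "defaultValue" in embedded_field:
        return embedded_field["defaultValue"]
    # MultiValuedEnumFieldSpec names its field-level slot defaultValues (plural).
    if field_spec.get("kind") == "MultiValuedEnumFieldSpec":
        return field_spec.get("defaultValues")
    return field_spec.get("defaultValue")
```

In both layers the returned value is a tagged Value (or an array of tagged EnumValue entries), so the caller can dispatch on its kind without knowing which layer supplied it.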
6.6 Label override
LabelOverride ::: object {
label: Label
altLabels: array<AlternativeLabel>
}
// altLabels MAY be empty
6.7 Help text
HelpText ::: MultilingualString
HelpTextOverride ::: MultilingualString
Both productions collapse on the wire per the wrapper-collapse rule (§1.6): a MultilingualString is encoded as a non-empty array of LangString entries. HelpText is carried by the reusable Field artifact (slot helpText?); HelpTextOverride is carried by each EmbeddedXxxField (slot helpTextOverride?).
6.8 Properties
Property ::: object {
iri: PropertyIri
label: PropertyLabel
}
// iri carries the PropertyIri; label is the optional PropertyLabel
PropertyIri ::: Iri
PropertyLabel ::: MultilingualString
7. Field Specs
FieldSpec ::: TextFieldSpec NumericFieldSpec BooleanFieldSpec
TemporalFieldSpec
ControlledTermFieldSpec EnumFieldSpec LinkFieldSpec
ContactFieldSpec ExternalAuthorityFieldSpec
AttributeValueFieldSpec
// discriminator: kind
// NumericFieldSpec, TemporalFieldSpec, EnumFieldSpec, ContactFieldSpec,
// and ExternalAuthorityFieldSpec are unions; their members supply
// the kind discriminator directly
NumericFieldSpec ::: IntegerNumberFieldSpec RealNumberFieldSpec
// discriminator: kind
TextFieldSpec ::: object {
"kind": "TextFieldSpec"
defaultValue: TextValue
minLength: MinLength
maxLength: MaxLength
validationRegex: ValidationRegex
langTagRequirement: LangTagRequirement
renderingHint: TextRenderingHint
}
LangTagRequirement ::: "langTagRequired" "langTagOptional" "langTagForbidden"
// defaultValue, when present, encodes as a tagged TextValue per
// the kind rule (§1.5): `{ "kind": "TextValue", "value": ..., "lang"?: ... }`.
// See §6.5 for default-value semantics across all field families.
IntegerNumberFieldSpec ::: object {
"kind": "IntegerNumberFieldSpec"
defaultValue: IntegerNumberValue
unit: Unit
minValue: IntegerNumberMinValue
maxValue: IntegerNumberMaxValue
renderingHint: NumericRenderingHint
}
RealNumberFieldSpec ::: object {
"kind": "RealNumberFieldSpec"
datatype: RealNumberDatatypeKind
defaultValue: RealNumberValue
unit: Unit
minValue: RealNumberMinValue
maxValue: RealNumberMaxValue
renderingHint: NumericRenderingHint
}
BooleanFieldSpec ::: object {
"kind": "BooleanFieldSpec"
defaultValue: BooleanValue
renderingHint: BooleanRenderingHint
}
Unit ::: object {
iri: Iri
label: Label
}
MinLength ::: number
MaxLength ::: number
ValidationRegex ::: string
DecimalPlaces ::: number
IntegerNumberMinValue ::: IntegerNumberValue
IntegerNumberMaxValue ::: IntegerNumberValue
RealNumberMinValue ::: RealNumberValue
RealNumberMaxValue ::: RealNumberValue
7.1 Temporal field specs
TemporalFieldSpec ::: DateFieldSpec TimeFieldSpec DateTimeFieldSpec
// discriminator: kind
DateFieldSpec ::: object {
"kind": "DateFieldSpec"
dateValueType: DateValueType
defaultValue: DateValue
renderingHint: DateRenderingHint
}
// defaultValue, when present, MUST be a DateValue arm consistent
// with dateValueType (e.g. dateValueType "year" admits only YearValue).
DateValueType ::: "year" "yearMonth" "fullDate"
TimeFieldSpec ::: object {
"kind": "TimeFieldSpec"
defaultValue: TimeValue
timePrecision: TimePrecision
timezoneRequirement: TimezoneRequirement
renderingHint: TimeRenderingHint
}
TimePrecision ::: "hourMinute" "hourMinuteSecond" "hourMinuteSecondFraction"
TimezoneRequirement ::: "timezoneRequired" "timezoneNotRequired"
DateTimeFieldSpec ::: object {
"kind": "DateTimeFieldSpec"
dateTimeValueType: DateTimeValueType
defaultValue: DateTimeValue
timezoneRequirement: TimezoneRequirement
renderingHint: DateTimeRenderingHint
}
DateTimeValueType ::: "dateHourMinute" "dateHourMinuteSecond"
"dateHourMinuteSecondFraction"
DateRenderingHint ::: object {
componentOrder: DateComponentOrder
placeholder: Placeholder
}
DateComponentOrder ::: "dayMonthYear" "monthDayYear" "yearMonthDay"
TimeRenderingHint ::: object {
timeFormat: TimeFormat
placeholder: Placeholder
}
DateTimeRenderingHint ::: object {
timeFormat: TimeFormat
placeholder: Placeholder
}
TimeFormat ::: "twelveHour" "twentyFourHour"
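The DateFieldSpec rule that defaultValue must be an arm consistent with dateValueType can be sketched as a table lookup. The text gives only the "year" pairing explicitly; the other two pairings in `_ARM_FOR` are an assumption drawn from the parallel naming, and `default_is_consistent` is a hypothetical helper.

```python
# dateValueType token -> the DateValue arm it admits.
# Only "year" -> YearValue is stated explicitly; the rest are assumed by analogy.
_ARM_FOR = {
    "year": "YearValue",
    "yearMonth": "YearMonthValue",
    "fullDate": "FullDateValue",
}


def default_is_consistent(date_field_spec):
    """Return True when the spec has no default or its arm matches dateValueType."""
    default = date_field_spec.get("defaultValue")
    if default is None:
        return True
    expected = _ARM_FOR.get(date_field_spec.get("dateValueType"))
    return expected is not None and default.get("kind") == expected
```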
7.2 Controlled term field spec
ControlledTermFieldSpec ::: object {
"kind": "ControlledTermFieldSpec"
defaultValue: ControlledTermValue
sources: nonEmptyArray<ControlledTermSource>
renderingHint: ControlledTermRenderingHint
}
// defaultValue.term, when present, SHOULD belong to one of the
// declared sources, but the structural model does not enforce this
7.3 Enum field specs
EnumFieldSpec ::: SingleValuedEnumFieldSpec MultiValuedEnumFieldSpec
// discriminator: kind
SingleValuedEnumFieldSpec ::: object {
"kind": "SingleValuedEnumFieldSpec"
permissibleValues: nonEmptyArray<PermissibleValue>
defaultValue: EnumValue
renderingHint: SingleValuedEnumRenderingHint
}
// defaultValue.value, when present, MUST equal the `value` of one
// of the permissibleValues entries
MultiValuedEnumFieldSpec ::: object {
"kind": "MultiValuedEnumFieldSpec"
permissibleValues: nonEmptyArray<PermissibleValue>
defaultValues: array<EnumValue>
renderingHint: MultiValuedEnumRenderingHint
}
// defaultValues, when present, is a (possibly empty) array of
// EnumValue entries; each defaultValues[i].value MUST equal the
// `value` of one of the permissibleValues entries; the array MUST
// NOT contain duplicate `value` entries
PermissibleValue ::: object {
value: Token
label: Label
description: Description
meanings: array<Meaning>
}
// value carries the canonical Token of the permissible value and
// MUST be a non-empty Unicode string
// value MUST be unique within the enclosing spec's permissibleValues
// meanings, when present, is a (possibly empty) array of Meaning
// objects binding the token to ontology terms; SHOULD be omitted
// when empty
Token ::: string
// a non-empty Unicode string serving as the canonical key of a
// PermissibleValue or the value carried by an EnumValue
Meaning ::: object {
iri: TermIri
label: Label
}
// iri carries the TermIri of the bound ontology term
// label, when present, is the cached human-readable label of the
// bound term (distinct from the enclosing PermissibleValue's label,
// which is the label of the permissible value itself)
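The enum constraints in this section (non-empty and unique tokens, defaults drawn from the permissible values, no repeated defaults) can be gathered into one check. A minimal sketch, with `check_enum_spec` as a hypothetical helper that assumes every `value` is either a string or missing:

```python
def check_enum_spec(spec):
    """Return a list of problems found in a single- or multi-valued enum field spec."""
    problems = []
    tokens = [pv.get("value") for pv in spec.get("permissibleValues", [])]
    if not tokens:
        problems.append("permissibleValues must be non-empty")
    if any(not isinstance(t, str) or t == "" for t in tokens):
        problems.append("every permissible value needs a non-empty string token")
    valid = [t for t in tokens if isinstance(t, str)]
    if len(set(valid)) != len(valid):
        problems.append("permissible value tokens must be unique")

    # Single-valued specs carry defaultValue; multi-valued specs carry defaultValues.
    if spec.get("kind") == "MultiValuedEnumFieldSpec":
        defaults = spec.get("defaultValues", [])
    else:
        defaults = [spec["defaultValue"]] if "defaultValue" in spec else []
    chosen = [d.get("value") for d in defaults]
    for value in chosen:
        if value not in valid:
            problems.append(f"default {value!r} is not a permissible value")
    if len(set(chosen)) != len(chosen):
        problems.append("defaultValues must not repeat a value")
    return problems
```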
7.4 Other field specs
LinkFieldSpec ::: object {
"kind": "LinkFieldSpec"
defaultValue: LinkValue
renderingHint: LinkRenderingHint
}
ContactFieldSpec ::: EmailFieldSpec PhoneNumberFieldSpec
// discriminator: kind
EmailFieldSpec ::: object {
"kind": "EmailFieldSpec"
defaultValue: EmailValue
renderingHint: EmailRenderingHint
}
PhoneNumberFieldSpec ::: object {
"kind": "PhoneNumberFieldSpec"
defaultValue: PhoneNumberValue
renderingHint: PhoneNumberRenderingHint
}
ExternalAuthorityFieldSpec ::: OrcidFieldSpec RorFieldSpec DoiFieldSpec
PubMedIdFieldSpec RridFieldSpec
NihGrantIdFieldSpec
// discriminator: kind
OrcidFieldSpec ::: object {
"kind": "OrcidFieldSpec"
defaultValue: OrcidValue
renderingHint: OrcidRenderingHint
}
RorFieldSpec ::: object {
"kind": "RorFieldSpec"
defaultValue: RorValue
renderingHint: RorRenderingHint
}
DoiFieldSpec ::: object {
"kind": "DoiFieldSpec"
defaultValue: DoiValue
renderingHint: DoiRenderingHint
}
PubMedIdFieldSpec ::: object {
"kind": "PubMedIdFieldSpec"
defaultValue: PubMedIdValue
renderingHint: PubMedIdRenderingHint
}
RridFieldSpec ::: object {
"kind": "RridFieldSpec"
defaultValue: RridValue
renderingHint: RridRenderingHint
}
NihGrantIdFieldSpec ::: object {
"kind": "NihGrantIdFieldSpec"
defaultValue: NihGrantIdValue
renderingHint: NihGrantIdRenderingHint
}
AttributeValueFieldSpec ::: object {
"kind": "AttributeValueFieldSpec"
}
// AttributeValueFieldSpec carries no defaultValue; an AttributeValue
// is a per-instance pairing of a name and a value, and a default is
// not meaningful here (see grammar.md §Defaults).
7.5 Controlled term sources
ControlledTermSource ::: OntologySource BranchSource
ClassSource ValueSetSource
// discriminator: kind
OntologySource ::: object {
"kind": "OntologySource"
ontology: OntologyReference
}
OntologyReference ::: object {
iri: OntologyIri
displayHint: OntologyDisplayHint
}
OntologyDisplayHint ::: object {
acronym: OntologyAcronym
name: OntologyName
}
// at least one of acronym, name MUST be present
BranchSource ::: object {
"kind": "BranchSource"
ontology: OntologyReference
rootTermIri: RootTermIri
rootTermLabel: RootTermLabel
maxTraversalDepth: MaxTraversalDepth
}
// rootTermLabel SHOULD be present (captured at source-declaration time)
// but MAY be omitted when the term's display text is not available
ClassSource ::: object {
"kind": "ClassSource"
classes: nonEmptyArray<ControlledTermClass>
}
ControlledTermClass ::: object {
term: TermIri
label: Label
ontology: OntologyReference
}
// term is a TermIri
// label SHOULD be present (captured at source-declaration time)
// but MAY be omitted when the term's display text is not available
ValueSetSource ::: object {
"kind": "ValueSetSource"
identifier: ValueSetIdentifier
name: ValueSetName
iri: ValueSetIri
}
OntologyAcronym ::: string
OntologyName ::: MultilingualString
OntologyIri ::: Iri
RootTermIri ::: Iri
RootTermLabel ::: MultilingualString
MaxTraversalDepth ::: number
ValueSetIdentifier ::: string
ValueSetName ::: MultilingualString
ValueSetIri ::: Iri
The leaf productions used by the controlled-term sources collapse on
the wire per §1.6; their ::: definitions are listed alongside the
source productions for slot-type reference.
7.6 Rendering hints
The RenderingHint union is heterogeneous: the enum and boolean hints
(SingleValuedEnumRenderingHint, MultiValuedEnumRenderingHint,
BooleanRenderingHint) encode as flat strings, while every other member,
including TextRenderingHint, NumericRenderingHint, DateRenderingHint,
TimeRenderingHint, and DateTimeRenderingHint, encodes as an object
that can carry configuration. Because some members are strings (which
cannot carry a "kind" property), the union uses
discriminator: position (§1.3): the decoder identifies the variant
from the enclosing FieldSpec’s family. For example, the value at
TextFieldSpec.renderingHint is decoded as a TextRenderingHint, the
value at SingleValuedEnumFieldSpec.renderingHint as a
SingleValuedEnumRenderingHint, and so on.
RenderingHint ::: TextRenderingHint SingleValuedEnumRenderingHint
MultiValuedEnumRenderingHint NumericRenderingHint
BooleanRenderingHint
DateRenderingHint TimeRenderingHint DateTimeRenderingHint
ControlledTermRenderingHint
EmailRenderingHint PhoneNumberRenderingHint
LinkRenderingHint
OrcidRenderingHint RorRenderingHint DoiRenderingHint
PubMedIdRenderingHint RridRenderingHint
NihGrantIdRenderingHint
// discriminator: position
// resolved by the renderingHint property of the enclosing FieldSpec
TextRenderingHint ::: object {
lineMode: TextLineMode
placeholder: Placeholder
}
TextLineMode ::: "singleLine" "multiLine"
SingleValuedEnumRenderingHint ::: "radio" "dropdown"
MultiValuedEnumRenderingHint ::: "checkbox" "multiSelect"
NumericRenderingHint ::: object {
decimalPlaces: DecimalPlaces
placeholder: Placeholder
}
// decimalPlaces, when present, MUST be a non-negative integer
// it is a presentation concern (display rounding); it does NOT
// constrain the lexical form of submitted values
BooleanRenderingHint ::: "checkbox" "toggle" "radio" "dropdown"
ControlledTermRenderingHint ::: object { placeholder: Placeholder }
EmailRenderingHint ::: object { placeholder: Placeholder }
PhoneNumberRenderingHint ::: object { placeholder: Placeholder }
LinkRenderingHint ::: object { placeholder: Placeholder }
OrcidRenderingHint ::: object { placeholder: Placeholder }
RorRenderingHint ::: object { placeholder: Placeholder }
DoiRenderingHint ::: object { placeholder: Placeholder }
PubMedIdRenderingHint ::: object { placeholder: Placeholder }
RridRenderingHint ::: object { placeholder: Placeholder }
NihGrantIdRenderingHint ::: object { placeholder: Placeholder }
Placeholder ::: MultilingualString
Placeholder collapses on the wire per the wrapper-collapse rule (§1.6).
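Position-based discrimination of rendering hints can be sketched by keying the expected shape on the enclosing FieldSpec's kind. The string sets below are copied from the productions in this section; `check_rendering_hint` is a hypothetical helper and checks only string-versus-object shape, not the properties inside object hints.

```python
# Field-spec kinds whose renderingHint is a flat string, with the allowed tokens.
_STRING_HINTS = {
    "SingleValuedEnumFieldSpec": {"radio", "dropdown"},
    "MultiValuedEnumFieldSpec": {"checkbox", "multiSelect"},
    "BooleanFieldSpec": {"checkbox", "toggle", "radio", "dropdown"},
}


def check_rendering_hint(field_spec):
    """Return True when renderingHint is absent or has the shape its position implies."""
    hint = field_spec.get("renderingHint")
    if hint is None:
        return True
    allowed = _STRING_HINTS.get(field_spec.get("kind"))
    if allowed is not None:
        # The variant is identified by position, so the string needs no "kind".
        return isinstance(hint, str) and hint in allowed
    # Every other hint in this section is an object.
    return isinstance(hint, dict)
```

Note that the same string can be valid in one position and invalid in another: `"checkbox"` is accepted under BooleanFieldSpec and MultiValuedEnumFieldSpec but not under SingleValuedEnumFieldSpec.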
8. Field artifacts
Field ::: TextField NumericField BooleanField
DateField TimeField DateTimeField
ControlledTermField
SingleValuedEnumField MultiValuedEnumField
LinkField EmailField PhoneNumberField
OrcidField RorField DoiField PubMedIdField
RridField NihGrantIdField AttributeValueField
// discriminator: kind
// NumericField is itself a union of IntegerNumberField and RealNumberField
NumericField ::: IntegerNumberField RealNumberField
// discriminator: kind
TemporalField ::: DateField TimeField DateTimeField
// discriminator: kind
// a documented intermediate category; the wire form is just the variant
EnumField ::: SingleValuedEnumField MultiValuedEnumField
// discriminator: kind
ContactField ::: EmailField PhoneNumberField
// discriminator: kind
ExternalAuthorityField ::: OrcidField RorField DoiField
PubMedIdField RridField NihGrantIdField
// discriminator: kind
TextField ↗ EmbeddedTextField ::: object {
"kind": "TextField"
id: TextFieldId
modelVersion: ModelVersion
metadata: CatalogMetadata
versioning: SchemaArtifactVersioning
fieldSpec: TextFieldSpec
label: Label
helpText: HelpText
}
// modelVersion is a SemanticVersion 2.0.0 lexical form
IntegerNumberField ↗ EmbeddedIntegerNumberField ::: object {
"kind": "IntegerNumberField"
id: IntegerNumberFieldId
modelVersion: ModelVersion
metadata: CatalogMetadata
versioning: SchemaArtifactVersioning
fieldSpec: IntegerNumberFieldSpec
label: Label
helpText: HelpText
}
// modelVersion is a SemanticVersion 2.0.0 lexical form
RealNumberField ↗ EmbeddedRealNumberField ::: object {
"kind": "RealNumberField"
id: RealNumberFieldId
modelVersion: ModelVersion
metadata: CatalogMetadata
versioning: SchemaArtifactVersioning
fieldSpec: RealNumberFieldSpec
label: Label
helpText: HelpText
}
// modelVersion is a SemanticVersion 2.0.0 lexical form
BooleanField ↗ EmbeddedBooleanField ::: object {
"kind": "BooleanField"
id: BooleanFieldId
modelVersion: ModelVersion
metadata: CatalogMetadata
versioning: SchemaArtifactVersioning
fieldSpec: BooleanFieldSpec
label: Label
helpText: HelpText
}
// modelVersion is a SemanticVersion 2.0.0 lexical form
DateField ↗ EmbeddedDateField ::: object {
"kind": "DateField"
id: DateFieldId
modelVersion: ModelVersion
metadata: CatalogMetadata
versioning: SchemaArtifactVersioning
fieldSpec: DateFieldSpec
label: Label
helpText: HelpText
}
// modelVersion is a SemanticVersion 2.0.0 lexical form
TimeField ↗ EmbeddedTimeField ::: object {
"kind": "TimeField"
id: TimeFieldId
modelVersion: ModelVersion
metadata: CatalogMetadata
versioning: SchemaArtifactVersioning
fieldSpec: TimeFieldSpec
label: Label
helpText: HelpText
}
// modelVersion is a SemanticVersion 2.0.0 lexical form
DateTimeField ↗ EmbeddedDateTimeField ::: object {
"kind": "DateTimeField"
id: DateTimeFieldId
modelVersion: ModelVersion
metadata: CatalogMetadata
versioning: SchemaArtifactVersioning
fieldSpec: DateTimeFieldSpec
label: Label
helpText: HelpText
}
// modelVersion is a SemanticVersion 2.0.0 lexical form
ControlledTermField ↗ EmbeddedControlledTermField ::: object {
"kind": "ControlledTermField"
id: ControlledTermFieldId
modelVersion: ModelVersion
metadata: CatalogMetadata
versioning: SchemaArtifactVersioning
fieldSpec: ControlledTermFieldSpec
label: Label
helpText: HelpText
}
// modelVersion is a SemanticVersion 2.0.0 lexical form
SingleValuedEnumField ↗ EmbeddedSingleValuedEnumField ::: object {
"kind": "SingleValuedEnumField"
id: SingleValuedEnumFieldId
modelVersion: ModelVersion
metadata: CatalogMetadata
versioning: SchemaArtifactVersioning
fieldSpec: SingleValuedEnumFieldSpec
label: Label
helpText: HelpText
}
// modelVersion is a SemanticVersion 2.0.0 lexical form
MultiValuedEnumField ↗ EmbeddedMultiValuedEnumField ::: object {
"kind": "MultiValuedEnumField"
id: MultiValuedEnumFieldId
modelVersion: ModelVersion
metadata: CatalogMetadata
versioning: SchemaArtifactVersioning
fieldSpec: MultiValuedEnumFieldSpec
label: Label
helpText: HelpText
}
// modelVersion is a SemanticVersion 2.0.0 lexical form
LinkField ↗ EmbeddedLinkField ::: object {
"kind": "LinkField"
id: LinkFieldId
modelVersion: ModelVersion
metadata: CatalogMetadata
versioning: SchemaArtifactVersioning
fieldSpec: LinkFieldSpec
label: Label
helpText: HelpText
}
// modelVersion is a SemanticVersion 2.0.0 lexical form
EmailField ↗ EmbeddedEmailField ::: object {
"kind": "EmailField"
id: EmailFieldId
modelVersion: ModelVersion
metadata: CatalogMetadata
versioning: SchemaArtifactVersioning
fieldSpec: EmailFieldSpec
label: Label
helpText: HelpText
}
// modelVersion is a SemanticVersion 2.0.0 lexical form
PhoneNumberField ↗ EmbeddedPhoneNumberField ::: object {
"kind": "PhoneNumberField"
id: PhoneNumberFieldId
modelVersion: ModelVersion
metadata: CatalogMetadata
versioning: SchemaArtifactVersioning
fieldSpec: PhoneNumberFieldSpec
label: Label
helpText: HelpText
}
// modelVersion is a SemanticVersion 2.0.0 lexical form
OrcidField ↗ EmbeddedOrcidField ::: object {
"kind": "OrcidField"
id: OrcidFieldId
modelVersion: ModelVersion
metadata: CatalogMetadata
versioning: SchemaArtifactVersioning
fieldSpec: OrcidFieldSpec
label: Label
helpText: HelpText
}
// modelVersion is a SemanticVersion 2.0.0 lexical form
RorField ↗ EmbeddedRorField ::: object {
"kind": "RorField"
id: RorFieldId
modelVersion: ModelVersion
metadata: CatalogMetadata
versioning: SchemaArtifactVersioning
fieldSpec: RorFieldSpec
label: Label
helpText: HelpText
}
// modelVersion is a SemanticVersion 2.0.0 lexical form
DoiField ↗ EmbeddedDoiField ::: object {
"kind": "DoiField"
id: DoiFieldId
modelVersion: ModelVersion
metadata: CatalogMetadata
versioning: SchemaArtifactVersioning
fieldSpec: DoiFieldSpec
label: Label
helpText: HelpText
}
// modelVersion is a SemanticVersion 2.0.0 lexical form
PubMedIdField ↗ EmbeddedPubMedIdField ::: object {
"kind": "PubMedIdField"
id: PubMedIdFieldId
modelVersion: ModelVersion
metadata: CatalogMetadata
versioning: SchemaArtifactVersioning
fieldSpec: PubMedIdFieldSpec
label: Label
helpText: HelpText
}
// modelVersion is a SemanticVersion 2.0.0 lexical form
RridField ↗ EmbeddedRridField ::: object {
"kind": "RridField"
id: RridFieldId
modelVersion: ModelVersion
metadata: CatalogMetadata
versioning: SchemaArtifactVersioning
fieldSpec: RridFieldSpec
label: Label
helpText: HelpText
}
// modelVersion is a SemanticVersion 2.0.0 lexical form
NihGrantIdField ↗ EmbeddedNihGrantIdField ::: object {
"kind": "NihGrantIdField"
id: NihGrantIdFieldId
modelVersion: ModelVersion
metadata: CatalogMetadata
versioning: SchemaArtifactVersioning
fieldSpec: NihGrantIdFieldSpec
label: Label
helpText: HelpText
}
// modelVersion is a SemanticVersion 2.0.0 lexical form
AttributeValueField ↗ EmbeddedAttributeValueField ::: object {
"kind": "AttributeValueField"
id: AttributeValueFieldId
modelVersion: ModelVersion
metadata: CatalogMetadata
versioning: SchemaArtifactVersioning
fieldSpec: AttributeValueFieldSpec
label: Label
helpText: HelpText
}
// modelVersion is a SemanticVersion 2.0.0 lexical form
9. Embedded artifacts
Most embedded-field productions follow the same ten-property template
(kind, key, artifactRef, valueRequirement?, cardinality?,
visibility?, defaultValue?, labelOverride?, helpTextOverride?,
property?), with the per-family typing applied at artifactRef and
defaultValue. Four families deviate from this template; the deviations
are listed here so an implementer can scan them in one place rather
than spotting them inside the per-family productions below.
| Family | Deviation |
|---|---|
| EmbeddedBooleanField | omits cardinality (booleans are inherently single-valued) |
| EmbeddedSingleValuedEnumField | omits cardinality (single-valued is implicit, parallel to boolean) |
| EmbeddedMultiValuedEnumField | defaultValue?: array<EnumValue> rather than a singular Value (multi-valued enum admits a list of pre-selected tokens; each element is a tagged EnumValue per §1.5) |
| EmbeddedAttributeValueField | omits defaultValue (attribute-value fields have no spec-level default) |
EmbeddedTemplate and EmbeddedPresentationComponent follow their own
shapes; see the per-production definitions later in this section.
EmbeddedArtifact ::: EmbeddedField EmbeddedTemplate
EmbeddedPresentationComponent
// discriminator: kind
EmbeddedField ::: EmbeddedTextField
EmbeddedIntegerNumberField EmbeddedRealNumberField
EmbeddedBooleanField
EmbeddedDateField EmbeddedTimeField EmbeddedDateTimeField
EmbeddedControlledTermField
EmbeddedSingleValuedEnumField EmbeddedMultiValuedEnumField
EmbeddedLinkField
EmbeddedEmailField EmbeddedPhoneNumberField
EmbeddedOrcidField EmbeddedRorField EmbeddedDoiField
EmbeddedPubMedIdField EmbeddedRridField
EmbeddedNihGrantIdField
EmbeddedAttributeValueField
// discriminator: kind
EmbeddedTextField ↗ TextField ::: object {
"kind": "EmbeddedTextField"
key: EmbeddedArtifactKey
artifactRef: TextFieldId
valueRequirement: ValueRequirement
cardinality: Cardinality
visibility: Visibility
defaultValue: TextValue
labelOverride: LabelOverride
helpTextOverride: HelpTextOverride
property: Property
}
EmbeddedIntegerNumberField ↗ IntegerNumberField ::: object {
"kind": "EmbeddedIntegerNumberField"
key: EmbeddedArtifactKey
artifactRef: IntegerNumberFieldId
valueRequirement: ValueRequirement
cardinality: Cardinality
visibility: Visibility
defaultValue: IntegerNumberValue
labelOverride: LabelOverride
helpTextOverride: HelpTextOverride
property: Property
}
EmbeddedRealNumberField ↗ RealNumberField ::: object {
"kind": "EmbeddedRealNumberField"
key: EmbeddedArtifactKey
artifactRef: RealNumberFieldId
valueRequirement: ValueRequirement
cardinality: Cardinality
visibility: Visibility
defaultValue: RealNumberValue
labelOverride: LabelOverride
helpTextOverride: HelpTextOverride
property: Property
}
EmbeddedBooleanField ↗ BooleanField ::: object {
"kind": "EmbeddedBooleanField"
key: EmbeddedArtifactKey
artifactRef: BooleanFieldId
valueRequirement: ValueRequirement
visibility: Visibility
defaultValue: BooleanValue
labelOverride: LabelOverride
helpTextOverride: HelpTextOverride
property: Property
}
// boolean embeddings carry no cardinality slot per grammar.md
// (booleans are inherently single-valued)
EmbeddedDateField ↗ DateField ::: object {
"kind": "EmbeddedDateField"
key: EmbeddedArtifactKey
artifactRef: DateFieldId
valueRequirement: ValueRequirement
cardinality: Cardinality
visibility: Visibility
defaultValue: DateValue
labelOverride: LabelOverride
helpTextOverride: HelpTextOverride
property: Property
}
EmbeddedTimeField ↗ TimeField ::: object {
"kind": "EmbeddedTimeField"
key: EmbeddedArtifactKey
artifactRef: TimeFieldId
valueRequirement: ValueRequirement
cardinality: Cardinality
visibility: Visibility
defaultValue: TimeValue
labelOverride: LabelOverride
helpTextOverride: HelpTextOverride
property: Property
}
EmbeddedDateTimeField ↗ DateTimeField ::: object {
"kind": "EmbeddedDateTimeField"
key: EmbeddedArtifactKey
artifactRef: DateTimeFieldId
valueRequirement: ValueRequirement
cardinality: Cardinality
visibility: Visibility
defaultValue: DateTimeValue
labelOverride: LabelOverride
helpTextOverride: HelpTextOverride
property: Property
}
EmbeddedControlledTermField ↗ ControlledTermField ::: object {
"kind": "EmbeddedControlledTermField"
key: EmbeddedArtifactKey
artifactRef: ControlledTermFieldId
valueRequirement: ValueRequirement
cardinality: Cardinality
visibility: Visibility
defaultValue: ControlledTermValue
labelOverride: LabelOverride
helpTextOverride: HelpTextOverride
property: Property
}
EmbeddedSingleValuedEnumField ↗ SingleValuedEnumField ::: object {
"kind": "EmbeddedSingleValuedEnumField"
key: EmbeddedArtifactKey
artifactRef: SingleValuedEnumFieldId
valueRequirement: ValueRequirement
visibility: Visibility
defaultValue: EnumValue
labelOverride: LabelOverride
helpTextOverride: HelpTextOverride
property: Property
}
// single-valued enum embeddings carry no cardinality slot per
// grammar.md (single-valued enum is implicit, parallel to boolean)
EmbeddedMultiValuedEnumField ↗ MultiValuedEnumField ::: object {
"kind": "EmbeddedMultiValuedEnumField"
key: EmbeddedArtifactKey
artifactRef: MultiValuedEnumFieldId
valueRequirement: ValueRequirement
cardinality: Cardinality
visibility: Visibility
defaultValue: array<EnumValue>
labelOverride: LabelOverride
helpTextOverride: HelpTextOverride
property: Property
}
// defaultValue is a (possibly empty) array of EnumValue entries;
// each element is a tagged EnumValue per the kind rule (§1.5).
// The array MUST NOT contain duplicate `value` entries.
EmbeddedLinkField ↗ LinkField ::: object {
"kind": "EmbeddedLinkField"
key: EmbeddedArtifactKey
artifactRef: LinkFieldId
valueRequirement: ValueRequirement
cardinality: Cardinality
visibility: Visibility
defaultValue: LinkValue
labelOverride: LabelOverride
helpTextOverride: HelpTextOverride
property: Property
}
EmbeddedEmailField ↗ EmailField ::: object {
"kind": "EmbeddedEmailField"
key: EmbeddedArtifactKey
artifactRef: EmailFieldId
valueRequirement: ValueRequirement
cardinality: Cardinality
visibility: Visibility
defaultValue: EmailValue
labelOverride: LabelOverride
helpTextOverride: HelpTextOverride
property: Property
}
EmbeddedPhoneNumberField ↗ PhoneNumberField ::: object {
"kind": "EmbeddedPhoneNumberField"
key: EmbeddedArtifactKey
artifactRef: PhoneNumberFieldId
valueRequirement: ValueRequirement
cardinality: Cardinality
visibility: Visibility
defaultValue: PhoneNumberValue
labelOverride: LabelOverride
helpTextOverride: HelpTextOverride
property: Property
}
EmbeddedOrcidField ↗ OrcidField ::: object {
"kind": "EmbeddedOrcidField"
key: EmbeddedArtifactKey
artifactRef: OrcidFieldId
valueRequirement: ValueRequirement
cardinality: Cardinality
visibility: Visibility
defaultValue: OrcidValue
labelOverride: LabelOverride
helpTextOverride: HelpTextOverride
property: Property
}
EmbeddedRorField ↗ RorField ::: object {
"kind": "EmbeddedRorField"
key: EmbeddedArtifactKey
artifactRef: RorFieldId
valueRequirement: ValueRequirement
cardinality: Cardinality
visibility: Visibility
defaultValue: RorValue
labelOverride: LabelOverride
helpTextOverride: HelpTextOverride
property: Property
}
EmbeddedDoiField ↗ DoiField ::: object {
"kind": "EmbeddedDoiField"
key: EmbeddedArtifactKey
artifactRef: DoiFieldId
valueRequirement: ValueRequirement
cardinality: Cardinality
visibility: Visibility
defaultValue: DoiValue
labelOverride: LabelOverride
helpTextOverride: HelpTextOverride
property: Property
}
EmbeddedPubMedIdField ↗ PubMedIdField ::: object {
"kind": "EmbeddedPubMedIdField"
key: EmbeddedArtifactKey
artifactRef: PubMedIdFieldId
valueRequirement: ValueRequirement
cardinality: Cardinality
visibility: Visibility
defaultValue: PubMedIdValue
labelOverride: LabelOverride
helpTextOverride: HelpTextOverride
property: Property
}
EmbeddedRridField ↗ RridField ::: object {
"kind": "EmbeddedRridField"
key: EmbeddedArtifactKey
artifactRef: RridFieldId
valueRequirement: ValueRequirement
cardinality: Cardinality
visibility: Visibility
defaultValue: RridValue
labelOverride: LabelOverride
helpTextOverride: HelpTextOverride
property: Property
}
EmbeddedNihGrantIdField ↗ NihGrantIdField ::: object {
"kind": "EmbeddedNihGrantIdField"
key: EmbeddedArtifactKey
artifactRef: NihGrantIdFieldId
valueRequirement: ValueRequirement
cardinality: Cardinality
visibility: Visibility
defaultValue: NihGrantIdValue
labelOverride: LabelOverride
helpTextOverride: HelpTextOverride
property: Property
}
EmbeddedAttributeValueField ↗ AttributeValueField ::: object {
"kind": "EmbeddedAttributeValueField"
key: EmbeddedArtifactKey
artifactRef: AttributeValueFieldId
valueRequirement: ValueRequirement
cardinality: Cardinality
visibility: Visibility
labelOverride: LabelOverride
helpTextOverride: HelpTextOverride
property: Property
}
// attribute-value embeddings carry no defaultValue per grammar.md
EmbeddedTemplate ::: object {
"kind": "EmbeddedTemplate"
key: EmbeddedArtifactKey
artifactRef: TemplateId
valueRequirement: ValueRequirement
cardinality: Cardinality
visibility: Visibility
labelOverride: LabelOverride
property: Property
}
EmbeddedPresentationComponent ::: object {
"kind": "EmbeddedPresentationComponent"
key: EmbeddedArtifactKey
artifactRef: PresentationComponentId
visibility: Visibility
}
10. Presentation Components
PresentationComponent ::: RichTextComponent ImageComponent
YoutubeVideoComponent
SectionBreakComponent PageBreakComponent
// discriminator: kind
RichTextComponent ::: object {
"kind": "RichTextComponent"
id: PresentationComponentId
modelVersion: ModelVersion
metadata: CatalogMetadata
html: HtmlContent
}
// modelVersion is a SemanticVersion 2.0.0 lexical form
ImageComponent ::: object {
"kind": "ImageComponent"
id: PresentationComponentId
modelVersion: ModelVersion
metadata: CatalogMetadata
image: Iri
label: Label
description: Description
}
// modelVersion is a SemanticVersion 2.0.0 lexical form
// image is an Iri identifying the image resource
// label, when present, is short alt-text accessibility metadata
// description, when present, is longer accessibility-focused text
YoutubeVideoComponent ::: object {
"kind": "YoutubeVideoComponent"
id: PresentationComponentId
modelVersion: ModelVersion
metadata: CatalogMetadata
video: Iri
label: Label
description: Description
}
// modelVersion is a SemanticVersion 2.0.0 lexical form
// video is an Iri identifying the video resource
// label, when present, is short alt-text / caption-title accessibility metadata
// description, when present, is longer accessibility-focused text
SectionBreakComponent ::: object {
"kind": "SectionBreakComponent"
id: PresentationComponentId
modelVersion: ModelVersion
metadata: CatalogMetadata
}
// modelVersion is a SemanticVersion 2.0.0 lexical form
PageBreakComponent ::: object {
"kind": "PageBreakComponent"
id: PresentationComponentId
modelVersion: ModelVersion
metadata: CatalogMetadata
}
// modelVersion is a SemanticVersion 2.0.0 lexical form
HtmlContent ::: string
11. Templates and Top-Level Artifacts
Artifact ::: SchemaArtifact PresentationComponent TemplateInstance
// discriminator: kind
// kind ∈ {the twenty concrete <Family>Field kinds (e.g. "TextField"),
//         "Template", "RichTextComponent", "ImageComponent",
//         "YoutubeVideoComponent", "SectionBreakComponent",
//         "PageBreakComponent", "TemplateInstance"}
SchemaArtifact ::: Field Template
// discriminator: kind
Template ::: object {
"kind": "Template"
id: TemplateId
modelVersion: ModelVersion
metadata: CatalogMetadata
versioning: SchemaArtifactVersioning
title: Title
renderingHint: TemplateRenderingHint
header: Header
footer: Footer
members: array<EmbeddedArtifact>
}
// modelVersion is a SemanticVersion 2.0.0 lexical form
// EmbeddedArtifact keys (each member's `key` property) MUST be unique
// within `members` (per grammar.md §Embedded Artifact Key)
// the order of `members` MUST be preserved
TemplateRenderingHint ::: object {
helpDisplayMode: HelpDisplayMode
}
HelpDisplayMode ::: "inline" "tooltip" "both" "none"
Title ::: MultilingualString
Header ::: MultilingualString
Footer ::: MultilingualString
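The two constraints stated in the Template comments (unique `key` values within `members`, and preserved member order) can be checked mechanically. The following is a minimal Python sketch, not part of the normative grammar. The function name `duplicate_member_keys` and the sample keys are illustrative assumptions, and the sample template is abbreviated to the properties the check reads.

```python
import json
from collections import Counter


def duplicate_member_keys(template: dict) -> list[str]:
    """Return every EmbeddedArtifact key that occurs more than once in members."""
    counts = Counter(member["key"] for member in template["members"])
    return [key for key, count in counts.items() if count > 1]


# Hypothetical, abbreviated template: only "kind", "key" and "members" are shown.
template = {
    "kind": "Template",
    "members": [
        {"kind": "EmbeddedTextField", "key": "title"},
        {"kind": "EmbeddedDateField", "key": "collected"},
        {"kind": "EmbeddedTextField", "key": "title"},
    ],
}

print(duplicate_member_keys(template))  # ['title']

# JSON arrays are ordered, so a plain round trip keeps the member order intact.
round_tripped = json.loads(json.dumps(template))
print([m["key"] for m in round_tripped["members"]])  # ['title', 'collected', 'title']
```

A non-empty result means the template violates the uniqueness rule; how that is reported is left to validation.md.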
12. Instances
TemplateInstance ::: object {
"kind": "TemplateInstance"
id: TemplateInstanceId
modelVersion: ModelVersion
metadata: CatalogMetadata
templateRef: TemplateId
label: Label
values: array<InstanceValue>
}
// modelVersion is a SemanticVersion 2.0.0 lexical form
// metadata is CatalogMetadata; instances do not carry schema
// versioning, so there is no top-level versioning slot
// label, when present, is a user-supplied name for this instance,
// shown in catalog listings or detail views
InstanceValue ::: FieldValue NestedTemplateInstance
// discriminator: kind
FieldValue ::: object {
"kind": "FieldValue"
key: EmbeddedArtifactKey
values: nonEmptyArray<Value>
}
// values MUST be non-empty (per grammar's Value+; absence of a value is
// represented by omitting the FieldValue entirely)
NestedTemplateInstance ::: object {
"kind": "NestedTemplateInstance"
key: EmbeddedArtifactKey
values: array<InstanceValue>
}
// values MAY be empty
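The asymmetry between the two InstanceValue shapes (FieldValue.values must be non-empty, NestedTemplateInstance.values may be empty) is easy to get wrong in a recursive walk. The sketch below is one possible way to check it in Python. The function name, the message format, and the sample keys are assumptions made for illustration, and the check does not verify that `values` is present on a NestedTemplateInstance.

```python
def check_instance_values(values: list, path: str = "values") -> list[str]:
    """Walk an array<InstanceValue> and report empty FieldValue.values arrays."""
    problems = []
    for index, item in enumerate(values):
        here = f"{path}[{index}]"
        kind = item.get("kind")
        if kind == "FieldValue":
            # Absence of a value is expressed by omitting the FieldValue itself.
            if not item.get("values"):
                problems.append(f"{here}: FieldValue.values must be non-empty")
        elif kind == "NestedTemplateInstance":
            # An empty values array is allowed here; recurse into whatever is present.
            problems.extend(
                check_instance_values(item.get("values", []), f"{here}.values")
            )
        else:
            problems.append(f"{here}: unknown InstanceValue kind {kind!r}")
    return problems


# Hypothetical instance content.
sample = [
    {"kind": "FieldValue", "key": "title",
     "values": [{"kind": "TextValue", "value": "Study A"}]},
    {"kind": "NestedTemplateInstance", "key": "contact", "values": []},
    {"kind": "NestedTemplateInstance", "key": "funding",
     "values": [{"kind": "FieldValue", "key": "grant", "values": []}]},
]

print(check_instance_values(sample))
# ['values[2].values[0]: FieldValue.values must be non-empty']
```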
13. Cross-reference
For the JSON-encoding rules that frame this grammar — property naming
(lowerCamelCase), Unicode normalisation, big-integer string fallback,
implementation-extension prefixes, and worked end-to-end examples —
see serialization.md. For the abstract grammar
this file mirrors, see grammar.md. For conformance
rules, see validation.md.
14. Property-name map
This section makes the implicit map between abstract grammar component slots and JSON property names explicit. Each entry lists, for one abstract production, the abstract component types in their grammar-defined order paired with the wire property name used to encode that component.
The list covers every abstract production in grammar.md that has at
least one component. Productions whose abstract form has no components
(e.g. AttributeValueFieldSpec ::= attribute_value_field_spec()) and
pure-union or enum-string productions (e.g. Value, ValueRequirement)
carry no property-name mapping and are not listed.
Conventions:
- Each entry leads with the abstract production name in bold and, in parentheses, the name of the corresponding lower_snake_case constructor form from grammar.md, e.g. Template (template), YoutubeVideoComponent (you_tube_video_component). The parenthesised name is informational, included so a reader cross-referencing this section against grammar.md can match ::= productions to entries here without manually re-deriving the snake_case form. It does not appear on the wire and has no normative effect.
- Component order follows grammar.md. Component-index numbering is zero-based.
- Optional [X] and repeated X* / X+ components are noted alongside the component type.
- The mapping records the wire property name; whether the encoded object carries a kind discriminator at that slot is determined separately by the kind rule (§1.5) and is not duplicated here.
14.1 Top-level artifacts and templates
Template (template):
0. TemplateId → id
1. ModelVersion → modelVersion
2. CatalogMetadata → metadata
3. SchemaArtifactVersioning → versioning
4. Title → title
5. [TemplateRenderingHint] → renderingHint?
6. [Header] → header?
7. [Footer] → footer?
8. EmbeddedArtifact* → members
TemplateRenderingHint (template_rendering_hint):
0. [HelpDisplayMode] → helpDisplayMode?
TemplateInstance (template_instance):
0. TemplateInstanceId → id
1. ModelVersion → modelVersion
2. CatalogMetadata → metadata
3. TemplateId → templateRef
4. [Label] → label?
5. InstanceValue* → values
14.2 Field artifacts
Every concrete Field production has the same six-component shape:
(<Family>FieldId, ModelVersion, CatalogMetadata, SchemaArtifactVersioning, <Family>FieldSpec, Label),
with an optional seventh HelpText slot. For all of TextField,
IntegerNumberField, RealNumberField, BooleanField, DateField,
TimeField, DateTimeField, ControlledTermField,
SingleValuedEnumField, MultiValuedEnumField, LinkField,
EmailField, PhoneNumberField, OrcidField, RorField,
DoiField, PubMedIdField, RridField, NihGrantIdField, and
AttributeValueField:
0. <Family>FieldId → id
1. ModelVersion → modelVersion
2. CatalogMetadata → metadata
3. SchemaArtifactVersioning → versioning
4. <Family>FieldSpec → fieldSpec
5. Label → label
6. [HelpText] → helpText?
14.3 Embedded artifacts
Every concrete EmbeddedXxxField production follows the same pattern,
with the per-family typed-id and typed-default-value slots:
0. EmbeddedArtifactKey → key
1. <Family>FieldId → artifactRef
2. [ValueRequirement] → valueRequirement?
3. [Cardinality] → cardinality? (omitted on EmbeddedBooleanField and EmbeddedSingleValuedEnumField)
4. [Visibility] → visibility?
5. [<Family>Value] → defaultValue? (omitted on EmbeddedAttributeValueField; on EmbeddedMultiValuedEnumField the slot is EnumValue* → defaultValue?: array<EnumValue>)
6. [LabelOverride] → labelOverride?
7. [HelpTextOverride] → helpTextOverride?
8. [Property] → property?
(Component indices are renumbered to skip slots a particular family omits, per the per-family abstract production. The list above gives the canonical ordering common to the family.)
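To make the renumbering concrete, the sketch below derives the per-family component indices from the canonical ordering by dropping the slots a family omits. It is an illustrative Python sketch only; the names `CANONICAL_SLOTS`, `OMITTED_SLOTS`, and `component_indices` are assumptions, not identifiers defined by this specification.

```python
# Canonical wire-property order shared by the embedded-field family.
CANONICAL_SLOTS = [
    "key", "artifactRef", "valueRequirement", "cardinality", "visibility",
    "defaultValue", "labelOverride", "helpTextOverride", "property",
]

# Slots that particular families omit (see the notes on the list above).
OMITTED_SLOTS = {
    "EmbeddedBooleanField": {"cardinality"},
    "EmbeddedSingleValuedEnumField": {"cardinality"},
    "EmbeddedAttributeValueField": {"defaultValue"},
}


def component_indices(production: str) -> dict[int, str]:
    """Map zero-based component index to wire property name for one family."""
    omitted = OMITTED_SLOTS.get(production, set())
    kept = [slot for slot in CANONICAL_SLOTS if slot not in omitted]
    return dict(enumerate(kept))


print(component_indices("EmbeddedTextField")[3])     # cardinality
print(component_indices("EmbeddedBooleanField")[3])  # visibility
```

For EmbeddedBooleanField, index 3 is `visibility` because the `cardinality` slot is absent and every later slot shifts down by one.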
EmbeddedTemplate (embedded_template):
0. EmbeddedArtifactKey → key
1. TemplateId → artifactRef
2. [ValueRequirement] → valueRequirement?
3. [Cardinality] → cardinality?
4. [Visibility] → visibility?
5. [LabelOverride] → labelOverride?
6. [Property] → property?
EmbeddedPresentationComponent (embedded_presentation_component):
0. EmbeddedArtifactKey → key
1. PresentationComponentId → artifactRef
2. [Visibility] → visibility?
14.4 Catalog metadata
CatalogMetadata (catalog_metadata):
0. [PreferredLabel] → preferredLabel?
1. [Description] → description?
2. [Identifier] → identifier?
3. AlternativeLabel* → altLabels? (SHOULD-omitted when empty per §1.7 rule 4)
4. LifecycleMetadata → lifecycle
5. Annotation* → annotations? (SHOULD-omitted when empty)
On schema artifacts, SchemaArtifactVersioning appears as a separate
top-level versioning slot on the artifact rather than being nested
inside metadata.
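The CatalogMetadata mapping mixes a required slot (`lifecycle`), optional slots, and repeated slots that SHOULD be omitted when empty. One way an encoder might honour that is sketched below in Python. The function name and parameter names are assumptions, the arguments are treated as already-encoded wire values, and the lifecycle content is a hypothetical placeholder.

```python
def encode_catalog_metadata(lifecycle, preferred_label=None, description=None,
                            identifier=None, alt_labels=(), annotations=()):
    """Build the metadata object, leaving out absent and empty slots."""
    wire = {}
    if preferred_label is not None:
        wire["preferredLabel"] = preferred_label
    if description is not None:
        wire["description"] = description
    if identifier is not None:
        wire["identifier"] = identifier
    if alt_labels:  # SHOULD be omitted when empty
        wire["altLabels"] = list(alt_labels)
    wire["lifecycle"] = lifecycle  # the only slot that is always present
    if annotations:  # SHOULD be omitted when empty
        wire["annotations"] = list(annotations)
    return wire


# Hypothetical lifecycle values, for illustration only.
lifecycle = {
    "createdOn": "2024-01-01T00:00:00Z",
    "createdBy": "https://example.org/users/1",
    "modifiedOn": "2024-01-02T00:00:00Z",
    "modifiedBy": "https://example.org/users/1",
}

print(list(encode_catalog_metadata(lifecycle)))  # ['lifecycle']
```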
LifecycleMetadata (lifecycle_metadata):
0. CreatedOn → createdOn
1. CreatedBy → createdBy
2. ModifiedOn → modifiedOn
3. ModifiedBy → modifiedBy
SchemaArtifactVersioning (schema_artifact_versioning):
0. Version → version
1. Status → status
2. [PreviousVersion] → previousVersion?
3. [DerivedFrom] → derivedFrom?
Annotation (annotation):
0. Iri → property
1. AnnotationValue → body
AnnotationStringValue (annotation_string_value):
0. LexicalForm → value
1. [LanguageTag] → lang?
AnnotationIriValue (annotation_iri_value):
0. Iri → iri
14.5 Embedded artifact properties
Cardinality (cardinality):
0. MinCardinality → min
1. [MaxCardinality] → max?
LabelOverride (label_override):
0. Label → label
1. AlternativeLabel* → altLabels
Property (property):
0. PropertyIri → iri
1. [PropertyLabel] → label?
14.6 Multilingual strings
LangString (lang_string):
0. string → value
1. Bcp47Tag → lang
14.7 Values
TextValue (text_value):
0. LexicalForm → value
1. [LanguageTag] → lang?
IntegerNumberValue (integer_number_value):
0. LexicalForm → value
RealNumberValue (real_number_value):
0. LexicalForm → value
1. RealNumberDatatypeKind → datatype
BooleanValue (boolean_value):
0. boolean → value
YearValue (year_value):
0. LexicalForm → value
YearMonthValue (year_month_value):
0. LexicalForm → value
FullDateValue (full_date_value):
0. LexicalForm → value
TimeValue (time_value):
0. LexicalForm → value
DateTimeValue (date_time_value):
0. LexicalForm → value
ControlledTermValue (controlled_term_value):
0. TermIri → term
1. [Label] → label?
2. [Notation] → notation?
3. [PreferredLabel] → preferredLabel?
EnumValue (enum_value):
0. Token → value
LinkValue (link_value):
0. Iri → iri
1. [Label] → label?
EmailValue (email_value):
0. LexicalForm → value
PhoneNumberValue (phone_number_value):
0. LexicalForm → value
OrcidValue (orcid_value):
0. OrcidIri → iri
1. [Label] → label?
RorValue (ror_value):
0. RorIri → iri
1. [Label] → label?
DoiValue (doi_value):
0. DoiIri → iri
1. [Label] → label?
PubMedIdValue (pub_med_id_value):
0. PubMedIri → iri
1. [Label] → label?
RridValue (rrid_value):
0. RridIri → iri
1. [Label] → label?
NihGrantIdValue (nih_grant_id_value):
0. NihGrantIri → iri
1. [Label] → label?
AttributeValue (attribute_value):
0. AttributeName → name
1. Value → value
14.8 Field specs
TextFieldSpec (text_field_spec):
0. [TextValue] → defaultValue?
1. [MinLength] → minLength?
2. [MaxLength] → maxLength?
3. [ValidationRegex] → validationRegex?
4. [LangTagRequirement] → langTagRequirement?
5. [TextRenderingHint] → renderingHint?
IntegerNumberFieldSpec (integer_number_field_spec):
0. [IntegerNumberValue] → defaultValue?
1. [Unit] → unit?
2. [IntegerNumberMinValue] → minValue?
3. [IntegerNumberMaxValue] → maxValue?
4. [NumericRenderingHint] → renderingHint?
RealNumberFieldSpec (real_number_field_spec):
0. RealNumberDatatypeKind → datatype
1. [RealNumberValue] → defaultValue?
2. [Unit] → unit?
3. [RealNumberMinValue] → minValue?
4. [RealNumberMaxValue] → maxValue?
5. [NumericRenderingHint] → renderingHint?
BooleanFieldSpec (boolean_field_spec):
0. [BooleanValue] → defaultValue?
1. [BooleanRenderingHint] → renderingHint?
Unit (unit):
0. Iri → iri
1. [Label] → label?
DateFieldSpec (date_field_spec):
0. DateValueType → dateValueType
1. [DateValue] → defaultValue?
2. [DateRenderingHint] → renderingHint?
TimeFieldSpec (time_field_spec):
0. [TimeValue] → defaultValue?
1. [TimePrecision] → timePrecision?
2. [TimezoneRequirement] → timezoneRequirement?
3. [TimeRenderingHint] → renderingHint?
DateTimeFieldSpec (date_time_field_spec):
0. DateTimeValueType → dateTimeValueType
1. [DateTimeValue] → defaultValue?
2. [TimezoneRequirement] → timezoneRequirement?
3. [DateTimeRenderingHint] → renderingHint?
ControlledTermFieldSpec (controlled_term_field_spec):
0. [ControlledTermValue] → defaultValue?
1. ControlledTermSource+ → sources
2. [ControlledTermRenderingHint] → renderingHint?
SingleValuedEnumFieldSpec (single_valued_enum_field_spec):
0. PermissibleValue+ → permissibleValues
1. [EnumValue] → defaultValue?
2. [SingleValuedEnumRenderingHint] → renderingHint?
MultiValuedEnumFieldSpec (multi_valued_enum_field_spec):
0. PermissibleValue+ → permissibleValues
1. EnumValue* → defaultValues? (SHOULD-omitted when empty per §1.7 rule 4)
2. [MultiValuedEnumRenderingHint] → renderingHint?
PermissibleValue (permissible_value):
0. Token → value
1. [Label] → label?
2. [Description] → description?
3. Meaning* → meanings? (SHOULD-omitted when empty)
Meaning (meaning):
0. TermIri → iri
1. [Label] → label?
TextRenderingHint (text_rendering_hint):
0. [TextLineMode] → lineMode?
1. [Placeholder] → placeholder?
DateRenderingHint (date_rendering_hint):
0. [DateComponentOrder] → componentOrder?
1. [Placeholder] → placeholder?
TimeRenderingHint (time_rendering_hint):
0. [TimeFormat] → timeFormat?
1. [Placeholder] → placeholder?
DateTimeRenderingHint (date_time_rendering_hint):
0. [TimeFormat] → timeFormat?
1. [Placeholder] → placeholder?
NumericRenderingHint (numeric_rendering_hint):
0. [DecimalPlaces] → decimalPlaces?
1. [Placeholder] → placeholder?
The ten new rendering hints introduced for previously hint-less families each carry a single optional slot:
ControlledTermRenderingHint (controlled_term_rendering_hint): [Placeholder] → placeholder?
EmailRenderingHint (email_rendering_hint): [Placeholder] → placeholder?
PhoneNumberRenderingHint (phone_number_rendering_hint): [Placeholder] → placeholder?
LinkRenderingHint (link_rendering_hint): [Placeholder] → placeholder?
OrcidRenderingHint (orcid_rendering_hint): [Placeholder] → placeholder?
RorRenderingHint (ror_rendering_hint): [Placeholder] → placeholder?
DoiRenderingHint (doi_rendering_hint): [Placeholder] → placeholder?
PubMedIdRenderingHint (pub_med_id_rendering_hint): [Placeholder] → placeholder?
RridRenderingHint (rrid_rendering_hint): [Placeholder] → placeholder?
NihGrantIdRenderingHint (nih_grant_id_rendering_hint): [Placeholder] → placeholder?
LinkFieldSpec (link_field_spec):
0. [LinkValue] → defaultValue?
1. [LinkRenderingHint] → renderingHint?
EmailFieldSpec (email_field_spec):
0. [EmailValue] → defaultValue?
1. [EmailRenderingHint] → renderingHint?
PhoneNumberFieldSpec (phone_number_field_spec):
0. [PhoneNumberValue] → defaultValue?
1. [PhoneNumberRenderingHint] → renderingHint?
OrcidFieldSpec (orcid_field_spec):
0. [OrcidValue] → defaultValue?
1. [OrcidRenderingHint] → renderingHint?
RorFieldSpec (ror_field_spec):
0. [RorValue] → defaultValue?
1. [RorRenderingHint] → renderingHint?
DoiFieldSpec (doi_field_spec):
0. [DoiValue] → defaultValue?
1. [DoiRenderingHint] → renderingHint?
PubMedIdFieldSpec (pub_med_id_field_spec):
0. [PubMedIdValue] → defaultValue?
1. [PubMedIdRenderingHint] → renderingHint?
RridFieldSpec (rrid_field_spec):
0. [RridValue] → defaultValue?
1. [RridRenderingHint] → renderingHint?
NihGrantIdFieldSpec (nih_grant_id_field_spec):
0. [NihGrantIdValue] → defaultValue?
1. [NihGrantIdRenderingHint] → renderingHint?
AttributeValueFieldSpec carries no components and has no entry here.
14.9 Controlled term sources
OntologySource (ontology_source):
0. OntologyReference → ontology
OntologyReference (ontology_reference):
0. OntologyIri → iri
1. [OntologyDisplayHint] → displayHint?
OntologyDisplayHint (ontology_display_hint):
0. [OntologyAcronym] → acronym?
1. [OntologyName] → name?
BranchSource (branch_source):
0. OntologyReference → ontology
1. RootTermIri → rootTermIri
2. [RootTermLabel] → rootTermLabel?
3. [MaxTraversalDepth] → maxTraversalDepth?
ClassSource (class_source):
0. ControlledTermClass+ → classes
ControlledTermClass (controlled_term_class):
0. TermIri → term
1. [Label] → label?
2. OntologyReference → ontology
ValueSetSource (value_set_source):
0. ValueSetIdentifier → identifier
1. [ValueSetName] → name?
2. [ValueSetIri] → iri?
14.10 Presentation components
RichTextComponent (rich_text_component):
0. PresentationComponentId → id
1. ModelVersion → modelVersion
2. CatalogMetadata → metadata
3. HtmlContent → html
ImageComponent (image_component):
0. PresentationComponentId → id
1. ModelVersion → modelVersion
2. CatalogMetadata → metadata
3. Iri → image
4. [Label] → label?
5. [Description] → description?
YoutubeVideoComponent (you_tube_video_component):
0. PresentationComponentId → id
1. ModelVersion → modelVersion
2. CatalogMetadata → metadata
3. Iri → video
4. [Label] → label?
5. [Description] → description?
SectionBreakComponent (section_break_component):
0. PresentationComponentId → id
1. ModelVersion → modelVersion
2. CatalogMetadata → metadata
PageBreakComponent (page_break_component):
0. PresentationComponentId → id
1. ModelVersion → modelVersion
2. CatalogMetadata → metadata
14.11 Instances
FieldValue (field_value):
0. EmbeddedArtifactKey → key
1. Value+ → values
NestedTemplateInstance (nested_template_instance):
0. EmbeddedArtifactKey → key
1. InstanceValue* → values
14.12 Collapsed-wrapper productions
The single-component wrapper productions enumerated in §1.6 — every
XxxFieldId, TemplateId, TemplateInstanceId,
PresentationComponentId, Iri, TermIri, LanguageTag,
LexicalForm, IsoDateTimeStamp, NonNegativeInteger,
MinCardinality, MaxCardinality, MinLength, MaxLength,
DecimalPlaces, MaxTraversalDepth, the typed external-authority
IRIs, Name, Description, PreferredLabel, AlternativeLabel,
Label, PropertyLabel, OntologyName, OntologyAcronym,
OntologyIri, RootTermIri, RootTermLabel, ValueSetIdentifier,
ValueSetName, ValueSetIri, Notation, Identifier,
AttributeName, EmbeddedArtifactKey, ValidationRegex, Token,
Header, Footer, Version, ModelVersion, CreatedOn,
CreatedBy, ModifiedOn, ModifiedBy, PreviousVersion,
DerivedFrom, PropertyIri, and HtmlContent — collapse to their
inner primitive on the wire and have no per-production property name.
The single component appears directly at the slot in the enclosing
production whose property name is given by that production’s mapping.
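As an illustration of wrapper collapse, the Python sketch below models MinCardinality and MaxCardinality as distinct wrapper types in memory and emits only their inner primitives at the `min` and `max` slots of Cardinality. It assumes, for the sake of the example, that both wrappers hold non-negative integers and that Cardinality itself is an untagged object; the class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MinCardinality:
    value: int  # assumed inner primitive


@dataclass(frozen=True)
class MaxCardinality:
    value: int  # assumed inner primitive


@dataclass(frozen=True)
class Cardinality:
    min: MinCardinality
    max: Optional[MaxCardinality] = None


def encode_cardinality(cardinality: Cardinality) -> dict:
    """Emit the wire object; wrappers collapse to their inner primitive."""
    wire = {"min": cardinality.min.value}
    if cardinality.max is not None:
        wire["max"] = cardinality.max.value
    return wire


print(encode_cardinality(Cardinality(MinCardinality(1), MaxCardinality(3))))
# {'min': 1, 'max': 3}
print(encode_cardinality(Cardinality(MinCardinality(0))))
# {'min': 0}
```

The wrapper types never surface as nested objects such as `{"min": {"value": 1}}`; the property name comes from the enclosing Cardinality mapping.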
JSON Serialization
This document defines a normative JSON wire format for the CEDAR Template Model. Conforming implementations in any host language MUST produce and consume documents that follow the encoding defined here, so that artifacts can be exchanged between implementations with no information loss.
This document is a companion to, but not part of, the abstract grammar. The abstract grammar in grammar.md defines what a CEDAR template is; wire-grammar.md defines the JSON shape of every grammar production; this document defines the encoding rules and conventions that frame those shapes, plus illustrative examples.
1. Purpose and Scope
1.1 Purpose
The CEDAR Structural Model is intentionally serialization-agnostic at the grammar level. Implementations in different host languages may realize abstract constructs as language-idiomatic data structures (TypeScript interfaces, Java records, Python dataclasses, etc.). For two implementations to exchange artifacts, a common wire format is required.
This document defines that common wire format using JSON (RFC 8259) as the target encoding. The format is:
- Native — encodes the Structural Model directly, without conflating schema, schema-of-schemas, and presentation concerns.
- Lossless — every abstract construct encodes to exactly one JSON value, and every conforming JSON value decodes to exactly one abstract construct.
- Round-trippable — encoding then decoding yields the same abstract construct.
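The round-trip property can be illustrated with a single production. The sketch below, in Python, encodes and decodes a TextValue using the property names from wire-grammar.md (`value`, `lang`). The dataclass and function names are assumptions chosen for the example, and both components are assumed to collapse to JSON strings.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TextValue:
    value: str
    lang: Optional[str] = None


def encode_text_value(text_value: TextValue) -> dict:
    """Encode to the tagged wire object, omitting lang when absent."""
    wire = {"kind": "TextValue", "value": text_value.value}
    if text_value.lang is not None:
        wire["lang"] = text_value.lang
    return wire


def decode_text_value(wire: dict) -> TextValue:
    """Decode a tagged wire object back to the in-memory construct."""
    if wire.get("kind") != "TextValue":
        raise ValueError(f"expected kind 'TextValue', got {wire.get('kind')!r}")
    return TextValue(value=wire["value"], lang=wire.get("lang"))


original = TextValue("Hello", "en")
document = json.dumps(encode_text_value(original))

# Encoding then decoding yields an equal construct.
assert decode_text_value(json.loads(document)) == original
```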
1.2 Relationship to other specifications
grammar.md is the authoritative definition of the abstract Structural Model. This document defines an encoding of that model and does not extend or modify it. Where the grammar permits multiple equivalent abstract forms, this document selects exactly one wire form.
wire-grammar.md is the formal source of truth for the JSON shape of every grammar production. It mirrors grammar.md one-to-one and uses a compact JSON-shaped notation. Per-production property tables formerly in §6 of this document have moved there. The present document carries the encoding philosophy, JSON-specific rules, and worked examples.
validation.md defines the conformance rules a Structural Model artifact must satisfy. This document does not define validation; a JSON document MAY be wire-format-conformant yet fail Structural Model validation, and vice versa.
ctm-1.6.0-serialization.md defines a one-directional, lossy mapping from the Structural Model to the legacy CEDAR Template Model 1.6.0 JSON-LD format. This is a separate concern; the encoding defined in the present document is independent of CTM 1.6.0 and not interconvertible with it.
Note on JSON-LD shape parallel
The string-bearing and IRI-bearing Value shapes defined below are structurally similar to JSON-LD’s term forms — value/lang/datatype parallel JSON-LD’s @value/@language/@type, and iri parallels @id. This similarity is incidental: the wire form is CEDAR-native and stands on its own. RDF interoperability is provided by a separate derived projection (see rdf-projection.md).
Conforming documents are not JSON-LD. They carry no @context, are not interpretable as RDF graphs without external schema knowledge, and do not follow JSON-LD’s compaction, expansion, or framing algorithms. A future JSON-LD encoding parallel to (and convertible to/from) the native form defined here MAY be defined; that work is out of scope for this document.
1.3 Scope
In scope:
- The JSON encoding rules (property naming, NFC normalisation, integer handling) that frame the shapes formally defined in wire-grammar.md.
- Discriminator placement (the kind / position rules).
- The wrapping principle that determines which productions are tagged JSON objects vs flat JSON values.
- Worked end-to-end examples.
Out of scope:
- Per-production property tables. Those live normatively in wire-grammar.md.
- JSON-LD, RDF, or other RDF-graph representations.
- YAML, msgpack, CBOR, or other non-JSON encodings.
- Validation conformance (validation.md).
- Storage and transport concerns (file naming, MIME types, HTTP headers, etc.).
- Per-language implementation concerns: decoder/encoder code structure, error-reporting conventions, partial-decoding strategies, in-memory data shapes, and similar realization decisions. These are addressed in language-specific binding documents (forthcoming).
2. Conformance Language
The words MUST, MUST NOT, SHOULD, SHOULD NOT, and MAY are used in the sense of RFC 2119 and RFC 8174.
A conforming JSON document is a JSON value that satisfies every encoding rule in this document, matches the wire shape defined for some production in wire-grammar.md, and corresponds to some abstract Structural Model construct as defined in grammar.md.
A conforming implementation is software that, when given an abstract Structural Model construct, produces a conforming JSON document; and when given a conforming JSON document, decodes it to the corresponding abstract construct.
3. Conventions
3.1 Production references
Production names from grammar.md and wire-grammar.md appear in UpperCamelCase. Constructor forms from grammar.md appear in lower_snake_case. Concrete JSON property names appear in lowerCamelCase.
3.2 JSON terminology
The terms object, array, string, number, boolean, null, and value refer to JSON values per RFC 8259. The terms property, member, and element refer to the structural components of those values.
3.3 Property naming
Property names within tagged objects MUST be lowerCamelCase translations of the corresponding component names in the production. Where a component name in the grammar is itself an UpperCamelCase production name (e.g. EmbeddedArtifactKey), the JSON property uses the role-name from the production (e.g. key) rather than the production name itself. The canonical property name for any production component is the one given in its wire-grammar.md entry.
3.4 Examples
JSON examples appear in fenced code blocks marked json. Examples are illustrative only; the normative content is the corresponding wire-grammar.md entry.
Examples may use placeholders of the form <ProductionName> to denote the JSON encoding of a production at the surrounding position. A placeholder is resolved by replacing it with the encoding defined for that production in wire-grammar.md. The * and + suffixes (e.g. <Annotation>*, <EnumValue>+) denote sequences per §4.4 — zero-or-more and one-or-more respectively.
4. General Encoding Rules
4.1 Tagged and untagged objects
JSON objects in the wire format are either tagged — carrying a "kind" property — or untagged — without "kind". Whether an object is tagged is determined by its production: every member of a discriminator: kind union is tagged at every position; every other production is untagged at every position. See §4.4 for the rule.
When an object is tagged, the value of "kind" MUST be the production name from grammar.md, transcribed in UpperCamelCase exactly as the grammar names it. For example, "TextValue" for the TextValue production. The grammar’s lower_snake_case constructor forms (e.g. text_value(...)) describe abstract composition and do not appear on the wire.
A conforming implementation MUST reject any object whose tagged-or-untagged status does not match its production (per §4.4), whose "kind" value (when tagged) does not match any production known to the implementation, or whose other properties do not match the wire-grammar entry for the named production.
4.2 Optional components
A grammar component marked [X] (optional) MUST be omitted from its enclosing JSON object when not present. A conforming implementation MUST NOT emit null or an empty string in place of an absent optional component.
A conforming implementation MUST treat the absence of an optional property as equivalent to that component not being present in the abstract construct.
On decode, a conforming implementation MUST reject any document in which an optional property is present with the JSON value null. The two conforming wire forms for an absent optional are: the property is omitted entirely, or the enclosing object is itself absent. Treating null as equivalent to absent is non-conforming because it admits two distinct wire forms for the same abstract state, breaking round-trip equality.
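The omit-when-absent rule can be sketched in Python as follows. This is a minimal illustration, assuming decoded JSON objects are represented as plain dicts; the helper names `put_optional` and `reject_null_optionals` are hypothetical and do not come from this specification or from any binding.

```python
from typing import Any, Dict, Iterable, Optional


def put_optional(target: Dict[str, Any], name: str, value: Optional[Any]) -> None:
    """Encoder side: add `name` to `target` only when the component is present."""
    if value is not None:
        target[name] = value


def reject_null_optionals(obj: Dict[str, Any], optional_names: Iterable[str]) -> None:
    """Decoder side: raise ValueError if an optional property carries JSON null."""
    for name in optional_names:
        if name in obj and obj[name] is None:
            raise ValueError(f"optional property {name!r} must be omitted, not null")


# Hypothetical usage: maxLength is absent, so nothing is emitted for it.
spec: Dict[str, Any] = {"kind": "TextFieldSpec"}
put_optional(spec, "minLength", 1)
put_optional(spec, "maxLength", None)
```

Keeping a single wire form for "absent" is what lets a decoder map property absence directly onto the abstract construct without a second code path for null.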
4.3 Sequence components
A grammar component marked X* (zero or more) is encoded as a JSON array. The array MAY be empty.
A grammar component marked X+ (one or more) is encoded as a JSON array. The array MUST contain at least one element. In wire-grammar.md these are written nonEmptyArray<X>.
The order of elements in the JSON array MUST match the order of components in the abstract construct. A conforming implementation MUST preserve this order through encode and decode.
4.4 Discriminator placement
A JSON object’s discriminator presence depends on its production, not on the position it occupies in the document. Per wire-grammar.md §1.5, every production is either a member of some discriminator: kind union or it is not, and the encoding follows uniformly:
Polymorphic-union members — productions that appear as alternatives in a discriminator: kind union (e.g. Value, FieldSpec, Annotation.body: AnnotationValue, EmbeddedField, EmbeddedArtifact, every Field family, every Value family) — MUST encode as a tagged JSON object carrying "kind": "<ProductionName>". The discriminator is present even when the surrounding context (the enclosing object’s kind and property name) would already determine the family — for example, EmbeddedTextField.defaultValue carries "kind": "TextValue" even though EmbeddedTextField.kind already pins the family. Uniformity of the rule is preferred over the small wire-size saving.
Singleton-only productions — productions that never appear as members of any discriminator: kind union (Cardinality, Property, LabelOverride, CatalogMetadata, LifecycleMetadata, SchemaArtifactVersioning, Annotation, Unit, OntologyReference, OntologyDisplayHint, ControlledTermClass, PermissibleValue, Meaning, and the temporal RenderingHint object variants) — MUST encode as untagged JSON objects whose properties correspond to the production’s components. A "kind" property MUST NOT appear.
The rule applies recursively: a tagged object whose own components include further composite objects follows the same rule for each of those components, with the encoding determined by each inner production’s own discriminator-union membership.
Position-discriminated unions
A few unions occupy fixed singleton positions where the surrounding property name fully determines the variant. For example, RenderingHint is determined by which FieldSpec family the parent is. These wire entries are flagged // discriminator: position in wire-grammar.md.
Implementations MUST NOT rely on JSON property ordering to discriminate alternatives.
4.5 String values
Strings are JSON strings encoded in UTF-8. Lexical-form strings (e.g. the value property of a TextValue) MUST be transmitted in Unicode Normalization Form C (NFC). A conforming encoder MUST emit NFC. A conforming decoder receiving non-NFC input handles it per §9.6.
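As a hedged illustration of the NFC requirement, the sketch below uses Python's standard `unicodedata` module. The function names `encode_lexical_form` and `is_nfc` are hypothetical; what a decoder does after detecting non-NFC input is governed by §9.6 and is not shown here.

```python
import unicodedata


def encode_lexical_form(text: str) -> str:
    """Encoder side: return `text` in Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", text)


def is_nfc(text: str) -> bool:
    """Decoder side: report whether `text` is already in NFC."""
    return unicodedata.normalize("NFC", text) == text


# 'e' followed by U+0301 COMBINING ACUTE ACCENT is not in NFC;
# normalisation composes the pair into the single code point U+00E9.
decomposed = "Cafe\u0301"
composed = encode_lexical_form(decomposed)
```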
4.6 Number values
Integer-valued grammar productions (e.g. NonNegativeInteger) are encoded as JSON numbers without a fractional part or exponent. Implementations MUST encode integer values that fit within JSON Number’s safe integer range (the integers in the closed interval [−(2^53 − 1), 2^53 − 1]) without loss. Values outside that range fall under §5.1 below — the wire grammar permits a JSON-string fallback, but implementations MAY refuse to encode out-of-range values since no current use site exercises this case.
Decimal-valued grammar productions are encoded as JSON numbers in standard decimal notation per RFC 8259.
4.7 Implementation freedom
A conforming implementation MAY add JSON properties beyond those defined here for non-normative purposes (annotations, hashes, signatures, etc.), provided those properties begin with _ or $ to avoid collision with future normative additions. Decoders MUST ignore such properties. Decoders encountering a property whose name does not begin with _ or $ and is not declared by the production at the position MUST report a wire-shape error per §9.5.
A conforming implementation MAY emit JSON object properties in any order; the wire format is order-independent at the object level.
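A decoder-side property check consistent with this rule might look like the following sketch. It assumes the set of declared property names for the production at the current position is already known (the normative sets live in wire-grammar.md); `check_properties` is a hypothetical helper, and the `{"min", "max"}` set is taken only from the Cardinality example in §6.5.

```python
from typing import Any, Dict, List, Set


def check_properties(obj: Dict[str, Any], declared: Set[str]) -> List[str]:
    """Return the names of properties that should raise a wire-shape error.

    Properties starting with `_` or `$` are implementation-specific
    extensions and are ignored; declared properties are accepted.
    """
    unknown = []
    for name in obj:
        if name in declared or name.startswith(("_", "$")):
            continue
        unknown.append(name)
    return unknown


# "_hash" is ignored as an extension; "maximum" is not declared and is reported.
cardinality = {"min": 0, "max": 5, "_hash": "abc123", "maximum": 5}
unknown = check_properties(cardinality, {"min", "max"})
```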
5. The Wrapping Principle
The grammar uses constructor forms uniformly to define every production, including productions that consist of a single component of a primitive type. For example:
Header ::= header( MultilingualString )
NonNegativeInteger ::= non_negative_integer( IntegerLexicalForm )
EmbeddedArtifactKey ::= embedded_artifact_key( AsciiIdentifier )
A literal translation would encode each such production as a tagged JSON object with a single payload property. This document does not require that. Instead, the wrapping principle applies:
A production is encoded as a tagged JSON object only when wrapping carries information beyond the production’s payload. Otherwise, the production is encoded as the JSON value of its single component, and the production’s identity is communicated by the property name in the enclosing object.
A production carries information beyond its payload, and so MUST be encoded as a tagged object, when at least one of the following holds:
- (a) Composite structure. The production has more than one named component (e.g. Cardinality, Property, LabelOverride, every Value family).
- (b) Discriminated union membership. The production participates in a union where alternatives must be distinguished at decode time (e.g. Value, every artifact’s kind, the twenty Field family variants). The discriminator is "kind".
- (c) Lexical-form preservation. The production carries lexical content whose preservation requires more than a JSON primitive can express (e.g. LangString carries a lexical form and a language tag; both must be present in the wire form).
A production that satisfies none of these is encoded flat: the JSON value at the corresponding property position in the enclosing object is the JSON encoding of the production’s single component, with no "kind" wrapper.
The full list of productions that collapse this way is given in §1.6 of wire-grammar.md. At a glance:
- All MultilingualString-typed wrappers (Header, Footer, Name, Description, PreferredLabel, AlternativeLabel, Label, PropertyLabel, OntologyName, RootTermLabel, ValueSetName) flatten to a JSON array of LangString entries.
- All single-Iri wrappers (artifact identifiers and references, PropertyIri, the typed external-authority IRIs, OntologyIri, etc.) flatten to a plain JSON string.
- All single-NonNegativeInteger wrappers (MinLength, MaxLength, MinCardinality, MaxCardinality, DecimalPlaces, MaxTraversalDepth) flatten to a plain JSON number.
- Plain-string wrappers (Identifier, Notation, OntologyAcronym, ValueSetIdentifier, HtmlContent) flatten to a plain JSON string.
- Enum-style productions (Status, ValueRequirement, Visibility, DateValueType, TimePrecision, DateTimeValueType, TimezoneRequirement, DateComponentOrder, TimeFormat, TextRenderingHint, SingleValuedEnumRenderingHint, MultiValuedEnumRenderingHint, BooleanRenderingHint, RealNumberDatatypeKind) flatten to a JSON string drawn from a fixed set.
5.1 Lexical-form preservation
Big integers. NonNegativeInteger values that exceed JSON Number’s
safe integer range (the magnitude bound 2^53 − 1) MAY be encoded as
JSON strings rather than numbers. A decoder MUST accept both forms.
In practice this case does not arise for the model’s current use
sites (length bounds, cardinality bounds, traversal depths, numeric
precision are all small); implementations MAY refuse to encode an
out-of-range value rather than fall back to the string form. If a
future use site introduces values that routinely exceed the safe
range, this section will be revisited to make the string fallback a
MUST.
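The number-or-string handling described above can be sketched as follows. This is an illustration only: the function names are hypothetical, and the digits-only check on the string form is an assumption standing in for the IntegerLexicalForm production defined in grammar.md.

```python
from typing import Union

# Upper bound of JSON Number's safe integer range.
MAX_SAFE_INTEGER = 2**53 - 1


def encode_non_negative_integer(n: int) -> Union[int, str]:
    """Emit a JSON number when safe, otherwise fall back to a JSON string."""
    if n < 0:
        raise ValueError("NonNegativeInteger cannot be negative")
    return n if n <= MAX_SAFE_INTEGER else str(n)


def decode_non_negative_integer(raw: Union[int, str]) -> int:
    """Accept both the JSON-number form and the JSON-string fallback."""
    if isinstance(raw, bool):  # bool is a subclass of int in Python
        raise ValueError("expected a number or string, got a boolean")
    if isinstance(raw, int):
        value = raw
    elif isinstance(raw, str) and raw.isascii() and raw.isdigit():
        value = int(raw)
    else:
        raise ValueError(f"not a non-negative integer: {raw!r}")
    if value < 0:
        raise ValueError("NonNegativeInteger cannot be negative")
    return value
```

An implementation that prefers to refuse out-of-range values, as this section permits, would raise in `encode_non_negative_integer` instead of returning the string form.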
6. Per-Production Encoding (Examples)
Detailed wire shapes for every production are normatively specified in wire-grammar.md. This section gives illustrative JSON examples — one per family of related productions — and documents only those JSON-encoding-specific rules that aren’t expressible in the wire-grammar notation.
6.1 Identifiers
Every artifact identifier is encoded as a plain JSON string carrying the IRI. The kind of identifier is communicated by the surrounding context (the property name on the enclosing object, plus the kind discriminator of the enclosing artifact).
"https://example.org/fields/title"
A FieldId appears in only two grammar positions: as Field.id (the artifact’s own identity) and as EmbeddedField.artifactRef (a reference to the embedded artifact). Both surrounding constructs carry a kind discriminator that conveys the field family. The twenty permitted family-bearing kind values for Field variants are: "TextField", "IntegerNumberField", "RealNumberField", "BooleanField", "DateField", "TimeField", "DateTimeField", "ControlledTermField", "SingleValuedEnumField", "MultiValuedEnumField", "LinkField", "EmailField", "PhoneNumberField", "OrcidField", "RorField", "DoiField", "PubMedIdField", "RridField", "NihGrantIdField", and "AttributeValueField". The corresponding EmbeddedField variants prefix Embedded (e.g. "EmbeddedTextField").
The IRI placed at a FieldId position MUST belong to a field of the family declared by the surrounding kind. This is a structural-invariant constraint (per §9.1 category 3); a conforming encoder enforces it before emitting the wire form, and a conforming decoder reports a structural error against path if it is violated.
6.2 Multilingual strings
A MultilingualString is encoded as a non-empty JSON array of untagged LangString objects. Neither MultilingualString nor LangString is a member of any discriminator: kind union (per §4.4), so neither carries a kind discriminator on the wire.
[{ "value": "Hello", "lang": "en" }, { "value": "Bonjour", "lang": "fr" }]
The BCP 47 'und' (undetermined) subtag MAY be used when the natural language is unspecified.
MultilingualString and a single language-tagged TextValue share the {value, lang} shape but are structurally distinct: a TextValue is a single tagged value object (carrying kind: "TextValue"), whereas a MultilingualString is an array of one or more untagged {value, lang} entries. Encoders MUST NOT collapse a single-entry MultilingualString into a bare LangString object, and decoders MUST NOT promote a single LangString into a MultilingualString array.
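A shape check for the MultilingualString wire form might be sketched as below. The function name and error strings are hypothetical; the check covers only the structural points stated in this section (non-empty array, untagged entries, both `value` and `lang` present) and does not validate BCP 47 tags.

```python
from typing import Any, List


def check_multilingual_string(value: Any) -> List[str]:
    """Return a list of shape problems; an empty list means the shape is acceptable."""
    if not isinstance(value, list) or not value:
        return ["expected a non-empty array of LangString objects"]
    errors = []
    for i, entry in enumerate(value):
        if not isinstance(entry, dict):
            errors.append(f"[{i}]: expected an object")
            continue
        if "kind" in entry:
            errors.append(f"[{i}]: LangString is untagged and must not carry 'kind'")
        for name in ("value", "lang"):
            if not isinstance(entry.get(name), str):
                errors.append(f"[{i}].{name}: expected a string")
    return errors


ok = check_multilingual_string([{"value": "Hello", "lang": "en"}])
# A bare LangString object is not a MultilingualString and is reported.
bare = check_multilingual_string({"value": "Hello", "lang": "en"})
```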
6.3 Values
Each Value family is encoded as a tagged object that carries its content directly. The full set of variants is given in wire-grammar.md §3.
{ "kind": "TextValue", "value": "Jane Smith" }
{ "kind": "TextValue", "value": "Jane Smith", "lang": "en" }
{ "kind": "IntegerNumberValue", "value": "42" }
{ "kind": "RealNumberValue", "value": "3.14", "datatype": "decimal" }
{ "kind": "BooleanValue", "value": true }
{ "kind": "YearValue", "value": "2024" }
{ "kind": "FullDateValue", "value": "2024-06-15" }
{ "kind": "TimeValue", "value": "10:30:00" }
{ "kind": "DateTimeValue", "value": "2024-06-15T10:30:00Z" }
{ "kind": "ControlledTermValue", "term": "http://example.org/term/1", "label": [{ "value": "Term 1", "lang": "en" }] }
{ "kind": "EnumValue", "value": "professor" }
{ "kind": "LinkValue", "iri": "https://example.org/page" }
{ "kind": "EmailValue", "value": "jane@example.org" }
{ "kind": "OrcidValue", "iri": "https://orcid.org/0000-0002-1825-0097", "label": [{ "value": "Josiah Carberry", "lang": "en" }] }
{ "kind": "AttributeValue", "name": "https://example.org/p/color", "value": { "kind": "TextValue", "value": "blue" } }
6.4 Metadata and annotations
LifecycleMetadata, SchemaArtifactVersioning, and CatalogMetadata are singleton-only productions (never members of any discriminator: kind union per §4.4), so they encode as untagged JSON objects. The descriptive properties of an artifact (preferredLabel, description, identifier, altLabels) sit directly on CatalogMetadata rather than under a descriptiveMetadata wrapper. On schema artifacts, SchemaArtifactVersioning appears as a separate top-level versioning slot on the artifact rather than nested inside metadata.
{
"preferredLabel": [{ "value": "Full Name", "lang": "en" }],
"description": [{ "value": "Full legal name.", "lang": "en" }],
"lifecycle": {
"createdOn": "2024-01-01T00:00:00Z",
"createdBy": "https://orcid.org/0000-0002-1825-0097",
"modifiedOn": "2024-06-15T12:30:00Z",
"modifiedBy": "https://orcid.org/0000-0002-1825-0097"
},
"annotations": [
{
"property": "https://example.org/annotation-properties/notes",
"body": { "kind": "AnnotationStringValue", "value": "An institutional note." }
}
]
}
AnnotationValue is a kind-discriminated polymorphic union over named annotation-value variants. Two variants are currently defined: AnnotationStringValue (a lexical form with optional language tag) and AnnotationIriValue (an IRI):
{ "kind": "AnnotationStringValue", "value": "An institutional note." }
{ "kind": "AnnotationStringValue", "value": "Une note institutionnelle.", "lang": "fr" }
{ "kind": "AnnotationIriValue", "iri": "https://example.org/related-resource" }
The wire-form property name on Annotation is body (for the grammar’s AnnotationValue component) — following the W3C Web Annotations convention.
The AnnotationValue variant family is open to extension: future revisions of this specification MAY introduce additional AnnotationXxxValue variants. Conforming decoders MUST reject documents whose body.kind is not a known variant.
6.5 Embedded artifact properties
Cardinality, Property, LabelOverride, and Unit are singleton-only productions (per §4.4) and encode as untagged JSON objects. EmbeddedArtifactKey flattens to a plain JSON string. ValueRequirement and Visibility flatten to JSON enum strings.
{ "min": 0, "max": 5 }
{ "iri": "https://schema.org/name", "label": [{ "value": "name", "lang": "en" }] }
{ "label": [{ "value": "Custom Label", "lang": "en" }], "altLabels": [] }
"required"
6.6 Field specs
Each concrete FieldSpec is encoded as a tagged object whose "kind" matches the spec’s grammar production name. Optional configuration properties are omitted when absent. Every XxxFieldSpec (except AttributeValueFieldSpec) carries an optional defaultValue slot whose type matches the family’s Value; see §6.8 for the per-family table and the precedence rule against an embedding-level defaultValue on the corresponding EmbeddedXxxField.
{ "kind": "TextFieldSpec", "minLength": 1, "maxLength": 200, "renderingHint": "singleLine" }
{ "kind": "IntegerNumberFieldSpec", "minValue": { "kind": "IntegerNumberValue", "value": "0" } }
{ "kind": "DateFieldSpec", "dateValueType": "fullDate", "renderingHint": { "componentOrder": "dayMonthYear" } }
{ "kind": "SingleValuedEnumFieldSpec",
"permissibleValues": [
{ "value": "yes", "label": [{ "value": "Yes", "lang": "en" }] },
{ "value": "no", "label": [{ "value": "No", "lang": "en" }] }
],
"defaultValue": { "kind": "EnumValue", "value": "yes" },
"renderingHint": "radio"
}
{ "kind": "MultiValuedEnumFieldSpec",
"permissibleValues": [
{ "value": "active", "label": [{ "value": "Active", "lang": "en" }],
"meanings": ["http://example.org/active-1"] },
{ "value": "retired", "label": [{ "value": "Retired", "lang": "en" }] }
],
"defaultValues": [],
"renderingHint": "checkbox"
}
{ "kind": "ControlledTermFieldSpec", "sources": [
{ "kind": "OntologySource", "ontology": { "iri": "http://purl.obolibrary.org/obo/ncit.owl",
"displayHint": { "acronym": "NCIT", "name": [{ "value": "NCI Thesaurus", "lang": "en" }] } } }
] }
A SingleValuedEnumFieldSpec’s defaultValue is a single tagged EnumValue whose value matches one of the permissible values’ tokens; a MultiValuedEnumFieldSpec’s defaultValues is a (possibly empty) array of such tagged EnumValue entries, with no duplicate value entries. An OntologyDisplayHint MUST carry at least one of acronym or name (a constraint enforced by wire-grammar.md).
The flat-string rendering hints (TextRenderingHint, SingleValuedEnumRenderingHint, MultiValuedEnumRenderingHint, BooleanRenderingHint) appear directly as JSON enum strings; the object-shaped rendering hints (NumericRenderingHint, DateRenderingHint, TimeRenderingHint, DateTimeRenderingHint) are JSON objects with optional configuration slots.
6.7 Field artifacts and embedded artifacts
A Field artifact (shown for the text family; the other nineteen families substitute "IntegerNumberField", "RealNumberField", "BooleanField", "DateField", etc. for kind):
{
"kind": "TextField",
"id": "<FieldId>",
"modelVersion": "<SemanticVersion>",
"metadata": "<CatalogMetadata>",
"versioning": "<SchemaArtifactVersioning>",
"fieldSpec": "<FieldSpec>",
"label": "<MultilingualString>"
}
The modelVersion property is a top-level property of every concrete artifact (Template, TemplateInstance, every XxxField, and every PresentationComponent variant). It is encoded as a JSON string carrying a Semantic Versioning 2.0.0 lexical form and identifies the version of the CEDAR structural model the artifact conforms to. The position is immediately after id and before metadata.
The kind value MUST match the family of the nested fieldSpec. Conforming encoders MUST ensure that the IRI placed at id belongs to a field of the same family.
An EmbeddedField (shown for the text family; substitute "EmbeddedIntegerNumberField", "EmbeddedRealNumberField", "EmbeddedBooleanField", "EmbeddedDateField", etc. for the other nineteen families):
{
"kind": "EmbeddedTextField",
"key": "<EmbeddedArtifactKey>",
"artifactRef": "<FieldId>",
"valueRequirement": "required",
"cardinality": { "min": 1, "max": 1 },
"property": { "iri": "https://schema.org/name" }
}
An EmbeddedAttributeValueField MUST NOT carry a defaultValue property.
{
"kind": "EmbeddedTemplate",
"key": "<EmbeddedArtifactKey>",
"artifactRef": "<TemplateId>",
"cardinality": { "min": 0 }
}
{
"kind": "EmbeddedPresentationComponent",
"key": "<EmbeddedArtifactKey>",
"artifactRef": "<PresentationComponentId>",
"visibility": "visible"
}
6.8 Default values
A default value is a value used to pre-populate a field at instance-creation time when no explicit value has yet been supplied by the user. Defaults exist at two layers:
- Field-level defaults, on the reusable Field’s FieldSpec (XxxFieldSpec.defaultValue), shared by every Template that embeds the field.
- Embedding-level defaults, on the EmbeddedXxxField inside a Template (EmbeddedXxxField.defaultValue), specific to that one embedding.
Every concrete field family carries an optional default at both layers, with one exception: AttributeValueField carries no default at either layer (an AttributeValue is a per-instance pairing of a name and a value, and a default is not meaningful).
Defaults are UI/UX initialisation only. A default’s sole role is to seed an instance’s value at creation time. Defaults do not appear in the wire form of TemplateInstance artifacts and do not affect the RDF projection. When an instance is created and the user accepts the default without modification, the resulting FieldValue carries the default value as if the user had typed it in by hand; from the instance’s perspective the default and a user-supplied identical value are indistinguishable. When an instance is created and the user does not supply a value (and the field is not required), the corresponding FieldValue is omitted entirely — the default does not appear by virtue of having existed.
Wire form. Both layers use the same Value-typed wire shape: there is no DefaultValue wrapper. Every Value is a member of the Value polymorphic union, so per the kind rule (wire-grammar.md §1.5) every defaultValue carries a kind discriminator — at both layers, regardless of whether the enclosing context already pins the family. The discriminator is structurally redundant at slots whose enclosing XxxFieldSpec.kind or EmbeddedXxxField.kind already determines the family, but is retained for uniformity with Value’s appearance at the polymorphic positions where the kind genuinely discriminates (e.g. FieldValue.values[*] in instances).
MultiValuedEnumFieldSpec.defaultValues and EmbeddedMultiValuedEnumField.defaultValue are the two slots whose wire form is a JSON array rather than a single object: each carries an array of tagged EnumValue entries.
For the enum families specifically, the structural-invariant constraint that the default reference one of the spec’s permissibleValues applies to the inner value (the Token):
- SingleValuedEnumFieldSpec.defaultValue?: EnumValue is a tagged EnumValue whose value MUST equal the Token of one of the spec’s permissible-value entries.
- MultiValuedEnumFieldSpec.defaultValues?: array<EnumValue> is a (possibly empty) JSON array of tagged EnumValue entries; each entry’s value MUST equal the Token of one of the spec’s permissible-value entries, and the array MUST NOT contain duplicate value entries.
The same constraint applies at the corresponding embedding-level slots (EmbeddedSingleValuedEnumField.defaultValue and EmbeddedMultiValuedEnumField.defaultValue).
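The enum-default constraint can be checked with a short routine such as the sketch below. It assumes the permissible values and defaults are already decoded into dicts of the shapes shown in §6.6; `check_enum_defaults` is a hypothetical helper. A single-valued default can be checked by passing it as a one-element list.

```python
from typing import Any, Dict, List


def check_enum_defaults(
    permissible_values: List[Dict[str, Any]],
    defaults: List[Dict[str, Any]],
) -> List[str]:
    """Return violations of the enum-default structural invariant."""
    tokens = {pv["value"] for pv in permissible_values}
    seen = set()
    errors = []
    for i, default in enumerate(defaults):
        token = default.get("value")
        if default.get("kind") != "EnumValue":
            errors.append(f"[{i}]: expected a tagged EnumValue")
        if token not in tokens:
            errors.append(f"[{i}]: {token!r} is not a permissible value")
        if token in seen:
            errors.append(f"[{i}]: duplicate default {token!r}")
        seen.add(token)
    return errors


permissible = [{"value": "active"}, {"value": "retired"}]
good = check_enum_defaults(permissible, [{"kind": "EnumValue", "value": "active"}])
bad = check_enum_defaults(
    permissible,
    [{"kind": "EnumValue", "value": "active"},
     {"kind": "EnumValue", "value": "active"},
     {"kind": "EnumValue", "value": "unknown"}],
)
```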
Examples by family — at every layer (field-level on XxxFieldSpec.defaultValue, embedding-level on EmbeddedXxxField.defaultValue) the wire shape is identical:
// TextValue (field-level on TextFieldSpec, embedding-level on EmbeddedTextField)
"defaultValue": { "kind": "TextValue", "value": "Stanford University" }
"defaultValue": { "kind": "TextValue", "value": "Bonjour", "lang": "fr" }
// IntegerNumberValue
"defaultValue": { "kind": "IntegerNumberValue", "value": "42" }
// RealNumberValue
"defaultValue": { "kind": "RealNumberValue", "value": "3.14", "datatype": "decimal" }
// BooleanValue
"defaultValue": { "kind": "BooleanValue", "value": true }
// DateValue (kind discriminates the arm; the arm MUST be consistent with the spec's dateValueType)
"defaultValue": { "kind": "FullDateValue", "value": "2024-06-15" }
"defaultValue": { "kind": "YearValue", "value": "2024" }
// TimeValue
"defaultValue": { "kind": "TimeValue", "value": "10:30:00" }
// DateTimeValue
"defaultValue": { "kind": "DateTimeValue", "value": "2024-06-15T10:30:00Z" }
// ControlledTermValue
"defaultValue": {
"kind": "ControlledTermValue",
"term": "http://purl.obolibrary.org/obo/UBERON_0000955",
"label": [{ "value": "brain", "lang": "en" }]
}
// EnumValue (single) — both layers use the same shape
"defaultValue": { "kind": "EnumValue", "value": "yes" }
// array<EnumValue> — both layers use the same shape; MultiValuedEnumFieldSpec calls the slot defaultValues
"defaultValues": [
{ "kind": "EnumValue", "value": "active" },
{ "kind": "EnumValue", "value": "retired" }
]
// LinkValue
"defaultValue": { "kind": "LinkValue", "iri": "https://example.org", "label": [{ "value": "Example", "lang": "en" }] }
// EmailValue
"defaultValue": { "kind": "EmailValue", "value": "jane@example.org" }
// PhoneNumberValue
"defaultValue": { "kind": "PhoneNumberValue", "value": "+1-650-555-0123" }
// OrcidValue
"defaultValue": {
"kind": "OrcidValue",
"iri": "https://orcid.org/0000-0002-1825-0097",
"label": [{ "value": "Josiah Carberry", "lang": "en" }]
}
// RorValue / DoiValue / PubMedIdValue / RridValue / NihGrantIdValue — analogous, each tagged with its family's kind
Precedence. When both a field-level default (on the referenced Field’s FieldSpec) and an embedding-level default (on the EmbeddedXxxField) are present for the same field, the embedding-level default wins. When only one is present, that one applies. When neither is present, the field has no default. There is no mechanism for an embedding to unset a field-level default; an embedding wishing to override with a different default supplies its own defaultValue, but cannot say “no default here.” See grammar.md §Defaults for the full table.
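The precedence rule amounts to a two-step lookup, sketched here under the assumption that the referenced Field's FieldSpec and the EmbeddedXxxField are both available as decoded dicts. `effective_default` is a hypothetical name; the normative table is in grammar.md.

```python
from typing import Any, Dict, Optional


def effective_default(field_spec: Dict[str, Any], embedded_field: Dict[str, Any]) -> Optional[Any]:
    """Return the default that applies to one embedding, or None if there is none."""
    # The embedding-level default wins whenever it is present.
    if "defaultValue" in embedded_field:
        return embedded_field["defaultValue"]
    # Otherwise fall back to the field-level default. MultiValuedEnumFieldSpec
    # names its slot "defaultValues"; every other spec uses "defaultValue".
    for name in ("defaultValue", "defaultValues"):
        if name in field_spec:
            return field_spec[name]
    return None


spec = {"kind": "TextFieldSpec", "defaultValue": {"kind": "TextValue", "value": "Stanford University"}}
embedding = {"kind": "EmbeddedTextField", "defaultValue": {"kind": "TextValue", "value": "Bonjour", "lang": "fr"}}
chosen = effective_default(spec, embedding)
```

Because absence of `defaultValue` on the embedding always falls through to the field-level default, the lookup has no branch that could express "no default here", which mirrors the limitation stated above.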
6.9 Templates
{
"kind": "Template",
"id": "<TemplateId>",
"modelVersion": "<SemanticVersion>",
"metadata": "<CatalogMetadata>",
"versioning": "<SchemaArtifactVersioning>",
"title": [{ "value": "Form Title", "lang": "en" }],
"header": [{ "value": "Template Header Text", "lang": "en" }],
"members": ["<EmbeddedArtifact>*"]
}
The members array MUST preserve order. The EmbeddedArtifactKey values within members MUST be unique; a conforming encoder MUST verify uniqueness before producing the JSON, and a conforming decoder MUST reject input that violates this constraint.
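A uniqueness check over `members` could be sketched as follows, assuming the Template has been parsed into a dict; `duplicate_member_keys` is a hypothetical helper that an encoder or decoder might call before accepting the artifact.

```python
from collections import Counter
from typing import Any, Dict, List


def duplicate_member_keys(template: Dict[str, Any]) -> List[str]:
    """Return every EmbeddedArtifactKey that occurs more than once in `members`."""
    counts = Counter(member["key"] for member in template.get("members", []))
    return sorted(key for key, n in counts.items() if n > 1)


template = {
    "kind": "Template",
    "members": [
        {"kind": "EmbeddedTextField", "key": "comment"},
        {"kind": "EmbeddedDateField", "key": "observed"},
        {"kind": "EmbeddedTextField", "key": "comment"},
    ],
}
duplicates = duplicate_member_keys(template)
```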
6.10 Presentation components
{ "kind": "RichTextComponent", "id": "<PresentationComponentId>", "modelVersion": "<SemanticVersion>", "metadata": "<CatalogMetadata>", "html": "<p>Hello</p>" }
{ "kind": "ImageComponent", "id": "<PresentationComponentId>", "modelVersion": "<SemanticVersion>", "metadata": "<CatalogMetadata>", "image": "https://example.org/image.png" }
{ "kind": "SectionBreakComponent", "id": "<PresentationComponentId>", "modelVersion": "<SemanticVersion>", "metadata": "<CatalogMetadata>" }
6.11 Instances
{
"kind": "TemplateInstance",
"id": "<TemplateInstanceId>",
"modelVersion": "<SemanticVersion>",
"metadata": "<CatalogMetadata>",
"templateRef": "<TemplateId>",
"label": [{ "value": "Optional user-supplied instance label", "lang": "en" }],
"values": ["<InstanceValue>*"]
}
TemplateInstance.metadata is CatalogMetadata; instances do not carry schema versioning, so there is no top-level versioning slot. The optional label slot, when present, carries a user-supplied name for the instance, shown in catalog listings or detail views.
{ "kind": "FieldValue", "key": "<EmbeddedArtifactKey>", "values": ["<Value>+"] }
FieldValue.values MUST be a non-empty array; absence of a value is represented by omitting the FieldValue entirely.
{ "kind": "NestedTemplateInstance", "key": "<EmbeddedArtifactKey>", "values": ["<InstanceValue>*"] }
The values array of a TemplateInstance MUST satisfy the structural invariants defined in grammar.md §Instances: a given EmbeddedArtifactKey appears as the key of at most one FieldValue; a given EmbeddedArtifactKey does not appear as the key of both a FieldValue and a NestedTemplateInstance; multiple NestedTemplateInstance entries sharing a key are permitted.
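The three key invariants, together with the non-empty rule for FieldValue.values, can be checked in a single pass. The sketch below is illustrative only: it assumes decoded dicts, inspects a single nesting level, and uses a hypothetical function name and error wording.

```python
from typing import Any, Dict, List


def check_instance_values(values: List[Dict[str, Any]]) -> List[str]:
    """Return violations of the instance key invariants for one `values` array."""
    field_keys, nested_keys, errors = set(), set(), []
    for i, entry in enumerate(values):
        key, kind = entry["key"], entry["kind"]
        if kind == "FieldValue":
            if key in field_keys:
                errors.append(f"[{i}]: second FieldValue for key {key!r}")
            if key in nested_keys:
                errors.append(f"[{i}]: key {key!r} already used by a NestedTemplateInstance")
            if not entry.get("values"):
                errors.append(f"[{i}]: FieldValue.values must be non-empty")
            field_keys.add(key)
        elif kind == "NestedTemplateInstance":
            # Repeated NestedTemplateInstance entries with the same key are permitted.
            if key in field_keys:
                errors.append(f"[{i}]: key {key!r} already used by a FieldValue")
            nested_keys.add(key)
    return errors


valid = check_instance_values([
    {"kind": "FieldValue", "key": "comment", "values": [{"kind": "TextValue", "value": "ok"}]},
    {"kind": "NestedTemplateInstance", "key": "address", "values": []},
    {"kind": "NestedTemplateInstance", "key": "address", "values": []},
])
```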
7. Round-Tripping
A conforming encode-decode round-trip MUST preserve:
- Every component value of every abstract construct, including lexical content of literals and IRI strings.
- The order of every sequence component (* and +).
- The presence-or-absence of every optional component.
A conforming encode-decode round-trip is not required to preserve:
- JSON object property order within a single tagged object.
- Whitespace between JSON tokens.
- Implementation-specific properties beginning with _ or $ per §4.7 (these are explicitly outside the conformance contract).
Two conforming JSON documents that differ only in JSON object property order or non-significant whitespace MUST decode to the same abstract construct.
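A JSON-level approximation of this equivalence can be sketched with Python's standard `json` module. It compares parsed documents rather than abstract constructs, so it is only a stand-in for a real decode; the function names are hypothetical. Python dict equality ignores property order, while list equality respects element order, which matches the two rules above.

```python
import json
from typing import Any


def strip_extensions(value: Any) -> Any:
    """Recursively drop `_`/`$`-prefixed properties, which are outside the contract."""
    if isinstance(value, dict):
        return {
            name: strip_extensions(item)
            for name, item in value.items()
            if not name.startswith(("_", "$"))
        }
    if isinstance(value, list):
        return [strip_extensions(item) for item in value]
    return value


def equivalent(doc_a: str, doc_b: str) -> bool:
    """Compare two JSON texts, ignoring property order, whitespace, and extensions."""
    return strip_extensions(json.loads(doc_a)) == strip_extensions(json.loads(doc_b))


a = '{"kind":"TextValue","value":"Jane Smith"}'
b = '{ "value": "Jane Smith",\n  "kind": "TextValue", "_hash": "abc" }'
same = equivalent(a, b)
```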
8. Examples
This section walks through one fully-elaborated example end-to-end —
a realistic Template, a TemplateInstance that conforms to it, a
round-trip equality check, and two known-bad inputs that exercise the
error model from §9. The goal is to give implementers a concrete
fixture they can decode-and-encode against, and to make every cross-
section reference (the kind rule, wrapping principle,
structural-invariant constraints) visible at one position in the
wire form.
The JSON in this section is embedded from machine-readable test
fixtures under
spec/normative-tests/. A binding
SHOULD treat that directory as a cross-language acceptance suite:
every binding MUST decode every file under valid/, encode the
result back to JSON, and verify §7 round-trip equivalence; every
binding MUST decode every file under invalid/<case>/input.json
and report at least the errors listed in
invalid/<case>/expected-errors.json. The test fixtures are the
authoritative source — this section embeds them via mdBook
{{#include}} so the rendered prose and the test data cannot
drift apart.
The example is deliberately compact rather than minimal: every wire
shape this spec defines that is reachable from a Template appears
at least once. The companion TemplateInstance exercises every value
shape that is reachable from a FieldValue. Smaller variations
(empty members, no annotations, single-language title) are
straightforward subsets of the larger artifact and are not
separately illustrated.
8.1 A Template exercising the principal wire shapes
The Template below describes a single patient observation: an
identifier, a free-text comment, a single-valued enum severity, a
date observed, an integer-valued count of repeated occurrences (with
unit and bounds), and a controlled-term diagnosis. It carries a
multi-language title (the rendered form heading) and description,
a separate top-level versioning slot, a lifecycle, and two
annotations on the metadata.
{
"kind": "Template",
"id": "https://example.org/templates/patient-observation",
"modelVersion": "2.0.0",
"metadata": {
"description": [
{
"value": "A single observation made about a patient.",
"lang": "en"
}
],
"altLabels": [
[
{
"value": "Clinical observation",
"lang": "en"
}
],
[
{
"value": "Patient note",
"lang": "en"
}
]
],
"lifecycle": {
"createdOn": "2026-01-15T09:30:00Z",
"createdBy": "https://example.org/users/alice",
"modifiedOn": "2026-04-02T16:12:00Z",
"modifiedBy": "https://example.org/users/bob"
},
"annotations": [
{
"property": "https://purl.org/dc/terms/license",
"body": {
"kind": "AnnotationIriValue",
"iri": "https://creativecommons.org/licenses/by/4.0/"
}
},
{
"property": "https://schema.org/keywords",
"body": {
"kind": "AnnotationStringValue",
"value": "patient,observation,clinical",
"lang": "en"
}
}
]
},
"versioning": {
"version": "1.2.0",
"status": "published",
"previousVersion": "https://example.org/templates/patient-observation/v/1.1.0"
},
"title": [
{
"value": "Patient observation",
"lang": "en"
},
{
"value": "Beobachtung des Patienten",
"lang": "de"
}
],
"header": [
{
"value": "Record one observation per submission.",
"lang": "en"
}
],
"members": [
{
"kind": "EmbeddedTextField",
"key": "comment",
"artifactRef": "https://example.org/fields/comment",
"valueRequirement": "recommended",
"cardinality": {
"min": 0,
"max": 1
},
"visibility": "visible",
"labelOverride": {
"label": [
{
"value": "Free-text comment",
"lang": "en"
}
],
"altLabels": []
},
"property": {
"iri": "https://schema.org/comment"
}
},
{
"kind": "EmbeddedSingleValuedEnumField",
"key": "severity",
"artifactRef": "https://example.org/fields/severity",
"valueRequirement": "required",
"visibility": "visible",
"defaultValue": {
"kind": "EnumValue",
"value": "moderate"
},
"property": {
"iri": "https://example.org/ontology/severity"
}
},
{
"kind": "EmbeddedDateField",
"key": "observed",
"artifactRef": "https://example.org/fields/observed",
"valueRequirement": "required",
"cardinality": {
"min": 1,
"max": 1
},
"visibility": "visible",
"defaultValue": {
"kind": "FullDateValue",
"value": "2026-01-01"
},
"property": {
"iri": "https://schema.org/observationDate"
}
},
{
"kind": "EmbeddedIntegerNumberField",
"key": "occurrences",
"artifactRef": "https://example.org/fields/occurrences",
"valueRequirement": "optional",
"cardinality": {
"min": 0,
"max": 1
},
"visibility": "visible",
"defaultValue": {
"kind": "IntegerNumberValue",
"value": "1"
},
"property": {
"iri": "https://example.org/ontology/occurrenceCount"
}
},
{
"kind": "EmbeddedControlledTermField",
"key": "diagnosis",
"artifactRef": "https://example.org/fields/diagnosis",
"valueRequirement": "required",
"cardinality": {
"min": 1
},
"visibility": "visible",
"property": {
"iri": "https://example.org/ontology/diagnosis"
}
}
]
}
A few things in the above artifact are worth highlighting because they exercise specific rules:
- Top-level layout. metadata carries CatalogMetadata (descriptive properties, lifecycle, annotations). versioning is a separate top-level slot, not nested inside metadata. title carries the rendered form heading and is also a separate top-level slot (see §6.9 / wire-grammar §5.1).
- Multilingual content. title and description are MultilingualString arrays. Each altLabels element on metadata is itself a MultilingualString, so altLabels is an array of arrays. Two of the language-tagged entries on title exercise the unique-lang-tag invariant (§9.1 category 3).
- AnnotationValue polymorphism. Annotation.body is a discriminator: kind union with AnnotationStringValue and AnnotationIriValue arms; the wire form carries the discriminator per §1.5 of wire-grammar.md.
- defaultValue kind discriminators. Every defaultValue on every EmbeddedXxxField carries a kind discriminator per the rule in wire-grammar.md §1.5: for example { "kind": "EnumValue", "value": "moderate" } on EmbeddedSingleValuedEnumField, { "kind": "IntegerNumberValue", "value": "1" } on EmbeddedIntegerNumberField, and { "kind": "FullDateValue", "value": "2026-01-01" } on EmbeddedDateField. The discriminator is structurally redundant at slots whose enclosing EmbeddedXxxField.kind already fixes the family (everywhere except EmbeddedDateField), but is retained for uniformity with Value’s appearance at polymorphic positions such as FieldValue.values[*] in instances.
- Identifier IRIs. Every artifactRef is an IRI string that belongs to a field of the family declared by the surrounding kind (§6.1, §9.1 category 3). A conforming encoder verifies this before emit; a conforming decoder reports a structural-invariant error if it does not.
- Cardinality ranges. comment admits zero or one (min: 0, max: 1); observed requires exactly one; occurrences is optional with at most one; diagnosis requires at least one with no upper bound (max omitted, meaning unbounded). Cardinality appears at singleton positions only and never carries kind per §1.5.
8.2 kind discriminators, two examples
Per the kind rule (§1.5 of wire-grammar.md), every member of a
discriminator: kind union carries "kind" on the wire — at every
position. Two examples illustrate.
Example 1 — Value at a polymorphic position. In a
TemplateInstance, the FieldValue.values slot is a
nonEmptyArray<Value>. The decoder uses the array element’s kind
to pick the union arm:
{
"kind": "FieldValue",
"key": "severity",
"values": [ { "kind": "EnumValue", "value": "severe" } ]
}
Example 2 — Value at a singleton position. In an
EmbeddedSingleValuedEnumField, the defaultValue slot’s type is
the single concrete EnumValue production: the enclosing
EmbeddedSingleValuedEnumField.kind already determines the family.
The kind discriminator is therefore structurally redundant at
this slot — but is still emitted, because EnumValue is a member of
the Value polymorphic union and the rule is uniform across
positions:
{
"kind": "EmbeddedSingleValuedEnumField",
"key": "severity",
"artifactRef": "https://example.org/fields/severity",
"defaultValue": { "kind": "EnumValue", "value": "moderate" }
}
The same pattern applies at every other singleton-Value slot:
EmbeddedTextField.defaultValue carries "kind": "TextValue",
EmbeddedIntegerNumberField.defaultValue carries "kind": "IntegerNumberValue", IntegerNumberFieldSpec.minValue carries
"kind": "IntegerNumberValue", and so on. The wire-size cost is
small (one extra short property per Value object) and the
simplification at the spec level is that there is exactly one
encoding rule for Value, applicable everywhere.
EmbeddedMultiValuedEnumField.defaultValue is the array case: each
element of the array is itself a tagged EnumValue:
"defaultValue": [
{ "kind": "EnumValue", "value": "active" },
{ "kind": "EnumValue", "value": "retired" }
]
By contrast, Cardinality, Annotation, LabelOverride, Property,
and the other singleton-only productions enumerated in §1.5 are
not members of any discriminator: kind union, so they never
carry "kind" regardless of position. Cardinality is always
{ "min": …, "max"?: … }; never { "kind": "Cardinality", … }.
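Because the rule is uniform, a binding can decode every Value slot through one dispatch table. The following is a minimal Python sketch, not taken from any CEDAR binding: decode_value and VALUE_DECODERS are hypothetical names, and the registry covers only three of the Value kinds named in this section.

```python
from typing import Any, Callable, Dict

# Hypothetical registry: one decoder per Value kind. Only three kinds
# are shown; a real binding would register every member of the union.
VALUE_DECODERS: Dict[str, Callable[[Dict[str, Any]], tuple]] = {
    "EnumValue": lambda obj: ("enum", obj["value"]),
    "IntegerNumberValue": lambda obj: ("integer", int(obj["value"])),
    "FullDateValue": lambda obj: ("fullDate", obj["value"]),
}


def decode_value(obj: Dict[str, Any]) -> tuple:
    """Decode a Value object the same way at every position."""
    kind = obj.get("kind")
    decoder = VALUE_DECODERS.get(kind) if isinstance(kind, str) else None
    if decoder is None:
        # Missing or unknown discriminator: never guess a variant.
        raise ValueError(f"kind: {kind!r} is not a recognised Value variant")
    return decoder(obj)


# The same function serves a polymorphic slot (FieldValue.values[*]) ...
field_value = {"kind": "FieldValue", "key": "severity",
               "values": [{"kind": "EnumValue", "value": "severe"}]}
decoded_values = [decode_value(v) for v in field_value["values"]]

# ... and a singleton slot (EmbeddedSingleValuedEnumField.defaultValue).
embedded = {"kind": "EmbeddedSingleValuedEnumField", "key": "severity",
            "artifactRef": "https://example.org/fields/severity",
            "defaultValue": {"kind": "EnumValue", "value": "moderate"}}
decoded_default = decode_value(embedded["defaultValue"])
```

The tuple results are placeholders for whatever in-memory types a binding actually uses.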
8.3 A TemplateInstance for the above Template
The instance below conforms to the Template of §8.1: it carries one
value for each required EmbeddedField and for the optional
occurrences, omits comment (whose valueRequirement is recommended,
so its absence is not an error), and carries two diagnosis terms
(since diagnosis admits min: 1 with unbounded max).
{
"kind": "TemplateInstance",
"id": "https://example.org/instances/observation-42",
"modelVersion": "2.0.0",
"metadata": {
"preferredLabel": [
{
"value": "Observation #42",
"lang": "en"
}
],
"lifecycle": {
"createdOn": "2026-04-15T10:22:00Z",
"createdBy": "https://example.org/users/alice",
"modifiedOn": "2026-04-15T10:22:00Z",
"modifiedBy": "https://example.org/users/alice"
}
},
"templateRef": "https://example.org/templates/patient-observation",
"values": [
{
"kind": "FieldValue",
"key": "severity",
"values": [
{
"kind": "EnumValue",
"value": "severe"
}
]
},
{
"kind": "FieldValue",
"key": "observed",
"values": [
{
"kind": "FullDateValue",
"value": "2026-04-14"
}
]
},
{
"kind": "FieldValue",
"key": "occurrences",
"values": [
{
"kind": "IntegerNumberValue",
"value": "3"
}
]
},
{
"kind": "FieldValue",
"key": "diagnosis",
"values": [
{
"kind": "ControlledTermValue",
"term": "https://www.snomed.org/snomed-ct/concept/22298006",
"label": [
{
"value": "Myocardial infarction",
"lang": "en"
}
]
},
{
"kind": "ControlledTermValue",
"term": "https://www.snomed.org/snomed-ct/concept/49601007",
"label": [
{
"value": "Disorder of cardiovascular system",
"lang": "en"
}
]
}
]
}
]
}
Notes:
- Instance metadata. TemplateInstance.metadata is CatalogMetadata. Instances do not carry schema versioning, so there is no top-level versioning slot; the schema’s version is fixed by templateRef.
- FieldValue.values is non-empty. Per the abstract grammar’s Value+ constraint, every FieldValue carries at least one value; absence of a value for a key is represented by omitting the FieldValue entry entirely (the comment key here). This is the reason valueRequirement is enforced at instance-validation time rather than wire-shape time: the wire grammar does not require a FieldValue for every EmbeddedField.
- FieldValue.values[*] carries kind. The values inside FieldValue.values are members of the Value polymorphic union; every entry carries its kind discriminator (per §1.5 of wire-grammar.md). The same kind-bearing shape appears at every other Value slot (EmbeddedXxxField.defaultValue in the template above, TextFieldSpec.defaultValue on a standalone TextField, the IntegerNumberFieldSpec.minValue / maxValue bounds) because the rule is uniform across positions.
8.4 Round-tripping
Decoding the §8.1 Template JSON and re-encoding the resulting
in-memory value MUST produce a JSON document that is equal to the
input under §7’s equivalence (object property order and whitespace
are not significant). A binding’s round-trip test SHOULD therefore:
- Parse the input to a JSON tree and to its in-memory model representation.
- Re-encode the in-memory representation to JSON.
- Compare the input JSON tree with the tree parsed from the re-encoded output, using recursive set equality on object members and sequence equality on arrays.
A binding MAY canonicalise property order on encode (e.g. always emit
kind first, then required fields in grammar order, then optionals
alphabetically); the canonical form is not normative under §7 — only
its decode-equivalence to the input is.
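The comparison step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a normative algorithm: json_equivalent and round_trips are hypothetical helper names, decode and encode stand in for the binding's own functions, and the sketch treats scalars of different JSON types (for example the string "1" and the number 1) as unequal.

```python
import json
from typing import Any, Callable


def json_equivalent(a: Any, b: Any) -> bool:
    """Objects compare by member set, arrays by sequence."""
    if isinstance(a, dict) and isinstance(b, dict):
        return a.keys() == b.keys() and all(
            json_equivalent(a[k], b[k]) for k in a
        )
    if isinstance(a, list) and isinstance(b, list):
        return len(a) == len(b) and all(
            json_equivalent(x, y) for x, y in zip(a, b)
        )
    # Scalars must agree in type as well as value (an assumption of
    # this sketch), so that True and 1 are not conflated.
    return type(a) is type(b) and a == b


def round_trips(text: str,
                decode: Callable[[Any], Any],
                encode: Callable[[Any], str]) -> bool:
    """Parse, decode, re-encode, and compare the two JSON trees."""
    original_tree = json.loads(text)
    model = decode(original_tree)
    return json_equivalent(original_tree, json.loads(encode(model)))
```

An encoder that reorders properties or changes whitespace still passes, which matches the statement that neither is significant.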
8.5 Known-bad inputs
The two inputs below exercise the §9 error model. Each is presented with the expected reported errors per §9.3 (the four required fields). A conforming decoder operating in collected mode (the default per §9.4) MUST report all the listed errors before raising or returning.
Input 1 — wire-shape error (unknown kind discriminator).
{
"kind": "TemplateInstance",
"id": "https://example.org/instances/i1",
"modelVersion": "2.0.0",
"metadata": {
"preferredLabel": [
{
"value": "x",
"lang": "en"
}
],
"lifecycle": {
"createdOn": "2026-04-15T10:22:00Z",
"createdBy": "https://example.org/u",
"modifiedOn": "2026-04-15T10:22:00Z",
"modifiedBy": "https://example.org/u"
}
},
"templateRef": "https://example.org/templates/patient-observation",
"values": [
{
"kind": "FieldValue",
"key": "severity",
"values": [
{
"kind": "MysteryValue",
"value": "severe"
}
]
}
]
}
Expected error report:
| category | path | production | message |
|---|---|---|---|
| wireShape | /values/0/values/0 | Value | kind: "MysteryValue" is not a recognised Value variant |
The decoder MUST NOT silently substitute a default variant or treat the input as a generic object (§9.5).
Input 2 — structural-invariant error (FieldId family mismatch and
duplicate embedded-artifact key). This input has two errors at
distinct positions; both must be reported. The same IRI
https://example.org/fields/foo is used as artifactRef from
two embeddings whose kinds declare different field families —
once as a TextField, once as a DateField. A single field
identifier cannot belong to two field families, so one of the two
references must be wrong; conformance requires the binding to
detect and report this without consulting an external registry.
{
"kind": "Template",
"id": "https://example.org/templates/x",
"modelVersion": "2.0.0",
"metadata": {
"lifecycle": {
"createdOn": "2026-01-15T09:30:00Z",
"createdBy": "https://example.org/u",
"modifiedOn": "2026-01-15T09:30:00Z",
"modifiedBy": "https://example.org/u"
}
},
"versioning": {
"version": "1.0.0",
"status": "draft"
},
"title": [
{
"value": "x",
"lang": "en"
}
],
"members": [
{
"kind": "EmbeddedTextField",
"key": "duplicate",
"artifactRef": "https://example.org/fields/foo"
},
{
"kind": "EmbeddedDateField",
"key": "duplicate",
"artifactRef": "https://example.org/fields/foo"
}
]
}
Expected error report (collected mode):
| category | path | production | message |
|---|---|---|---|
| structural | /members/1/artifactRef | EmbeddedDateField | artifactRef "https://example.org/fields/foo" is also referenced at /members/0/artifactRef as a TextField; a FieldId cannot belong to two field families |
| structural | /members/1/key | Template | EmbeddedArtifact.key "duplicate" is not unique within the enclosing Template (also at /members/0/key) |
The duplicate-key error is reported against the second occurrence,
not the first; the first occurrence’s path is included in the
message for traceability. The family-mismatch error is reported
against the second occurrence by the same convention.
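Both checks need only the members array itself, which is why no external registry is required. A minimal Python sketch follows; check_members and family_of are hypothetical names, and the sketch assumes that an embedding kind of the form EmbeddedXxxField names the family XxxField and that every member carries kind, key, and artifactRef.

```python
from typing import Any, Dict, List, Tuple


def family_of(kind: str) -> str:
    """Assumed convention: 'EmbeddedTextField' embeds a 'TextField'."""
    prefix = "Embedded"
    return kind[len(prefix):] if kind.startswith(prefix) else kind


def check_members(members: List[Dict[str, Any]]) -> List[Dict[str, str]]:
    errors: List[Dict[str, str]] = []
    first_ref: Dict[str, Tuple[int, str]] = {}  # artifactRef -> (index, family)
    first_key: Dict[str, int] = {}              # key -> index of first use
    for i, member in enumerate(members):
        kind, key, ref = member["kind"], member["key"], member["artifactRef"]
        family = family_of(kind)
        # Family mismatch: reported against the later occurrence.
        if ref in first_ref and first_ref[ref][1] != family:
            j, other = first_ref[ref]
            errors.append({
                "category": "structural",
                "path": f"/members/{i}/artifactRef",
                "production": kind,
                "message": f'artifactRef "{ref}" is also referenced at '
                           f"/members/{j}/artifactRef as a {other}",
            })
        first_ref.setdefault(ref, (i, family))
        # Duplicate key: also reported against the later occurrence.
        if key in first_key:
            errors.append({
                "category": "structural",
                "path": f"/members/{i}/key",
                "production": "Template",
                "message": f'EmbeddedArtifact.key "{key}" is not unique '
                           f"(also at /members/{first_key[key]}/key)",
            })
        first_key.setdefault(key, i)
    return errors


members = [
    {"kind": "EmbeddedTextField", "key": "duplicate",
     "artifactRef": "https://example.org/fields/foo"},
    {"kind": "EmbeddedDateField", "key": "duplicate",
     "artifactRef": "https://example.org/fields/foo"},
]
report = check_members(members)
```

The message wording here is illustrative; only the category, path, and production follow the table above.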
A binding may also surface additional implementation-specific fields
(error code, original JSON value, etc.); the four columns above are
the required minimum per §9.3. The expected errors live as a JSON
file alongside the input under
spec/normative-tests/invalid/02-fieldid-family-mismatch-and-duplicate-key/expected-errors.json,
where the messageRegex field gives the regex a binding’s
reported message MUST match (literal equality is not required —
wording is informational, the regex pins the substantive content).
The same convention applies to the §8.5 first input and to all
future invalid fixtures.
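A conformance harness can compare a binding's report against a fixture as sketched below. The layout of expected-errors.json is an assumption here (a list of objects with category, path, production, and messageRegex); only the messageRegex field is described above. The helper name missing_expected is hypothetical, and the sketch uses an unanchored regex search.

```python
import re
from typing import Dict, List


def missing_expected(reported: List[Dict[str, str]],
                     expected: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return the expected entries that no reported error satisfies."""
    def satisfies(err: Dict[str, str], exp: Dict[str, str]) -> bool:
        return (err["category"] == exp["category"]
                and err["path"] == exp["path"]
                and err["production"] == exp["production"]
                # Wording may differ; the regex pins the substance.
                and re.search(exp["messageRegex"], err["message"]) is not None)

    return [exp for exp in expected
            if not any(satisfies(err, exp) for err in reported)]
```

Extra reported errors do not cause a failure in this sketch, which mirrors the "at least the errors listed" requirement.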
9. Errors
This section specifies the error model for conforming encoders and decoders: the categories of error each side reports, the common shape of an error, and the policy on fail-fast vs collected reporting. The intent is cross-binding parity: TypeScript, Java, and Python bindings given the same malformed input report the same set of errors at the same wire-form locations, even if they surface those errors through different host-language exception types.
The error model defined here is normative: the binding contract covers not only what is encoded and decoded, but how failures are reported.
9.1 Error categories
Three categories of error are recognised:
- Wire-shape error. The JSON does not match the wire production that should appear at the position. Examples:
  - A property whose declared type is string is encoded as a JSON number.
  - A polymorphic union slot carries a kind that is not one of the declared variants.
  - A required property is missing, or a property is present that is not declared by the production at the position (excluding _/$-prefixed extension properties per §4.7).
  - A nonEmptyArray<X> slot carries [].
- Lexical error. A wire value is well-formed JSON of the right shape, but its lexical content does not match the production’s lexical category. Examples:
  - A LanguageTag string that is not a valid BCP 47 tag (per RFC 5646).
  - An Iri string that is not a syntactically valid absolute IRI (per RFC 3987).
  - An EmbeddedArtifactKey that does not match ^[A-Za-z][A-Za-z0-9_-]*$.
  - A LexicalForm integer string with a leading zero, leading sign, or non-decimal digit (per grammar.md §Primitive String Types).
  - A SemanticVersion string that does not conform to Semantic Versioning 2.0.0.
  - An Iso8601DateTimeLexicalForm string outside the XSD dateTime extended form.
- Structural-invariant error. The shape and lexical content are each individually valid, but a constraint that crosses positions is violated. Examples:
  - Two EmbeddedArtifact.key values within the same Template are equal.
  - The IRI placed at an EmbeddedField.artifactRef belongs to a field of a different family than the enclosing kind declares.
  - Cardinality.min > Cardinality.max.
  - An OntologyDisplayHint carries neither acronym nor name.
  - Two LangString.lang tags within the same MultilingualString are equal under case-folded comparison.
  - Two PermissibleValue.value tokens within the same enum spec are equal.
  - A MultiValuedEnumFieldSpec.defaultValues array contains two EnumValue entries with the same value.
  - A field-level or embedding-level defaultValue Token does not equal any PermissibleValue.value of the spec.
  - A DateFieldSpec.defaultValue arm is inconsistent with the spec’s dateValueType (e.g. dateValueType: "year" paired with a FullDateValue default).
  - A SchemaArtifactVersioning carrying both previousVersion and derivedFrom with the same IRI.
A single malformed input may produce errors in more than one category at distinct positions. An encoder reports the same three categories when given an in-memory value that does not satisfy them.
9.2 Error path
Every error MUST carry a path that locates it within the wire form. The path is a JSON Pointer per RFC 6901 (a slash-prefixed sequence of decoded property names and decimal array indices), relative to the root of the wire document being decoded or encoded. For example:
- "": the document root.
- "/members/3/defaultValue": the defaultValue property of the fourth element of the root-level members array.
- "/metadata/annotations/0/body/value": the value property of the body of the first annotation in the root metadata.
The decoder MUST report the path that names the innermost property or array index where the error was detected, not a parent. An encoder reports the path the property would have occupied in the wire form the encoder is producing.
When a wire-shape error refers to an array index that has not yet
been written (e.g. a nonEmptyArray<X> violation reported on []),
the path names the array property itself, with no trailing index.
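A decoder typically tracks its position as a list of property names and array indices and renders the pointer only when an error is reported. A small Python sketch, where json_pointer is a hypothetical helper name and the escaping of "~" and "/" follows RFC 6901:

```python
from typing import Iterable, Union


def json_pointer(segments: Iterable[Union[str, int]]) -> str:
    """Build an RFC 6901 JSON Pointer from property names and indices."""
    parts = []
    for seg in segments:
        if isinstance(seg, int):
            parts.append(str(seg))  # decimal array index
        else:
            # RFC 6901 escaping: "~" becomes "~0" first, then "/" becomes "~1".
            parts.append(seg.replace("~", "~0").replace("/", "~1"))
    return "".join("/" + part for part in parts)


root = json_pointer([])
default_path = json_pointer(["members", 3, "defaultValue"])
```

An empty segment list yields the empty string, which is the pointer for the document root.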
9.3 Error report shape
The minimum information an error MUST carry is:
| Field | Type | Description |
|---|---|---|
| category | one of "wireShape", "lexical", "structural" | the §9.1 category |
| path | string | a JSON Pointer per §9.2 |
| production | string | the wire grammar production at path (e.g. "Cardinality", "LangString", "EmbeddedTextField") |
| message | string | a human-readable explanation |
Bindings MAY carry additional fields — for example a machine-readable error code, the offending JSON value, or a chain of nested causes — but the four fields above are the lower bound on what every binding MUST surface.
The host-language form is binding-specific:
- TypeScript. A class extending CedarConstructionError (or a sibling CedarDecodeError / CedarEncodeError if the binding prefers per-direction types). Properties are surfaced as instance fields.
- Java. A subclass of RuntimeException (e.g. CedarDecodeException, CedarEncodeException). The four required fields appear as record components or accessor methods.
- Python. A subclass of Exception carrying the four fields as attributes.
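For the Python case, a minimal sketch of such an exception might look as follows. The class name CedarDecodeError is borrowed from the TypeScript and Java names above and is an assumption for Python; the constructor signature and the formatted summary string are likewise illustrative.

```python
class CedarDecodeError(Exception):
    """Hypothetical Python error type carrying the four required fields."""

    CATEGORIES = ("wireShape", "lexical", "structural")

    def __init__(self, category: str, path: str,
                 production: str, message: str) -> None:
        if category not in self.CATEGORIES:
            raise ValueError(f"unknown error category: {category!r}")
        # The summary passed to Exception is for humans; the attributes
        # below are what tooling should read.
        super().__init__(
            f"{category} error at {path!r} ({production}): {message}"
        )
        self.category = category
        self.path = path
        self.production = production
        self.message = message


error = CedarDecodeError(
    "wireShape",
    "/values/0/values/0",
    "Value",
    'kind: "MysteryValue" is not a recognised Value variant',
)
```

Additional attributes (an error code, the offending JSON value) can be added without affecting the four required ones.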
9.4 Fail-fast vs collected reporting
The default reporting mode is collected: a decoder or encoder MUST attempt to validate the entire input and report every error it finds before raising or returning. The thrown error type is therefore a collection of one or more individual errors; bindings whose host language idiomatically raises a single exception SHOULD wrap the collection in one top-level exception whose message summarises the count and whose fields carry the list.
Bindings MAY additionally expose a fail-fast mode that raises on the first error encountered. Fail-fast mode is a performance and UX convenience for interactive use; the wire-form contract itself is defined in terms of the collected mode.
A decoder operating in collected mode MUST NOT short-circuit on a
wire-shape error within an array element: each element is independent
and must be checked. It MAY short-circuit if continuing past a
wire-shape error would require the decoder to fabricate values
(e.g. the property type is a polymorphic union and the kind
discriminator is absent or unrecognised; in this case the decoder
cannot know which arm’s properties to validate).
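The two modes can share one reporting path. The sketch below is illustrative only: ErrorRecord, CedarErrors, Collector, and check_cardinalities are hypothetical names, and the single check shown (Cardinality.min > Cardinality.max on each member) stands in for a full decoder.

```python
from typing import Any, Dict, List, NamedTuple


class ErrorRecord(NamedTuple):
    category: str
    path: str
    production: str
    message: str


class CedarErrors(Exception):
    """Hypothetical top-level exception wrapping one or more errors."""

    def __init__(self, errors: List[ErrorRecord]) -> None:
        self.errors = list(errors)
        super().__init__(f"{len(self.errors)} error(s) reported")


class Collector:
    def __init__(self, fail_fast: bool = False) -> None:
        self.fail_fast = fail_fast
        self.errors: List[ErrorRecord] = []

    def report(self, category: str, path: str,
               production: str, message: str) -> None:
        self.errors.append(ErrorRecord(category, path, production, message))
        if self.fail_fast:
            raise CedarErrors(self.errors)  # stop at the first error

    def finish(self) -> None:
        if self.errors:
            raise CedarErrors(self.errors)  # collected mode: raise once


def check_cardinalities(members: List[Dict[str, Any]],
                        collector: Collector) -> None:
    # Each array element is checked independently of the others.
    for i, member in enumerate(members):
        card = member.get("cardinality", {})
        if "max" in card and card["min"] > card["max"]:
            collector.report("structural", f"/members/{i}/cardinality",
                             "Cardinality", "min exceeds max")
    collector.finish()
```

With the default Collector() every offending member is listed; with Collector(fail_fast=True) only the first is.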
9.5 Decoder strictness for unknown discriminators
When a discriminator: kind union encounters a kind value that is
not one of the declared variants, the decoder MUST report a
wire-shape error and MUST NOT silently substitute a default variant or
treat the input as a generic object. An unknown kind is a clear
breaking-change indicator (per §11) and the decoder is in no position
to recover.
When a property whose name is not declared by the production at the
position is present, the decoder MUST report a wire-shape error,
unless the property name begins with _ or $ (per §4.7), in which
case the decoder MUST ignore it.
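The property rule reduces to a small filter. In this Python sketch, unknown_properties is a hypothetical helper and the declared set for Cardinality (min, max) is taken from the examples in §8:

```python
from typing import Any, Dict, List, Set


def unknown_properties(obj: Dict[str, Any], declared: Set[str]) -> List[str]:
    """Names that must be reported as wire-shape errors."""
    return sorted(
        name for name in obj
        if name not in declared
        # Extension properties are ignored, never reported.
        and not name.startswith(("_", "$"))
    )


cardinality = {"min": 0, "max": 1, "kind": "Cardinality",
               "_note": "tooling hint", "$ext": 1}
offending = unknown_properties(cardinality, {"min", "max"})
```

Here only kind is reported: Cardinality never carries it, while the two prefixed names are skipped.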
9.6 NFC normalisation
A decoder receiving a string that is not in Unicode NFC SHOULD normalise it to NFC silently and continue, recording the non-normalisation as a non-fatal warning if the binding’s API supports warnings. A decoder MAY instead raise a wire-shape error; this is implementation freedom. Encoders MUST emit NFC strings (per §4.5).
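The lenient decoder behaviour can be sketched with Python's standard unicodedata module. The helper name to_nfc and the (string, warning flag) return shape are assumptions of this sketch; unicodedata.is_normalized requires Python 3.8 or later.

```python
import unicodedata
from typing import Tuple


def to_nfc(text: str) -> Tuple[str, bool]:
    """Return the NFC form and whether a warning should be recorded."""
    if unicodedata.is_normalized("NFC", text):
        return text, False
    return unicodedata.normalize("NFC", text), True


# "e" followed by U+0301 COMBINING ACUTE ACCENT is not in NFC.
normalised, warned = to_nfc("caf\u0065\u0301")
```

A strict decoder would raise a wire-shape error where this sketch returns the warning flag.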
10. Reserved Property Names
The property name kind is reserved by this specification at all object-level positions. Implementations MUST NOT reuse this name for non-normative purposes.
The property name prefixes _ and $ are reserved for implementation-specific extensions per §4.7.
All other property names are scoped to their containing tagged object’s production and have no global meaning.
11. Versioning
This document defines version 1.0 of the JSON serialization. The version of the wire format itself is not encoded in conforming JSON documents; it is the responsibility of the surrounding storage or transport layer (file path conventions, MIME parameters, registry metadata, etc.) to communicate which version of this specification a document conforms to.
A future revision of this document MAY add new productions or new tagged-object kinds without a version bump, provided existing conforming documents remain conforming. A revision that changes the encoding of an existing production, removes a production, or changes the meaning of a property MUST bump the version.
12. Open Questions
- Should this document define an explicit version-discrimination property (e.g. "$schema") at the root of every conforming document, parallel to the JSON Schema convention?
- Should the wrapping principle in §5 be made into a normative algorithm rather than a checklist of properties?
- Should the encoding distinguish “absent optional component” from “present optional component with the default value” in productions that carry defaults (e.g. ValueRequirement)? Current rule: omit if absent; encode the default when explicitly present. This may need to be made unambiguous per production.
- Should NonNegativeInteger use a string encoding even within the safe-integer range, to make the wire format consistent across implementations whose host language has no JSON-Number-like type?
Validation
Overview
Validation in the CEDAR Template Model consists of structural conformance to the abstract grammar and satisfaction of well-formedness conditions that are not expressed directly in grammar productions. The Canonical Validation Algorithm section defines a two-phase procedural algorithm that operationalises all normative rules in this document.
Contents
- Relationship to the wire-form error model
- Well-Formedness Conditions
- Canonical Validation Algorithm
- Open Questions
Relationship to the wire-form error model
This document and serialization.md §9 describe two complementary layers of conformance checking:
- The wire-form error model (serialization.md §9) governs decoder and encoder behaviour at the JSON boundary. It defines three error categories (wireShape, lexical, structural) and a JSON-pointer-based path format for locating each error.
- This validation algorithm governs post-decode checking on in-memory values. It assumes a successful decode has already produced syntactically well-formed structures and verifies the cross-cutting rules that bind those structures together (key uniqueness, cardinality, instance alignment, field-spec compatibility, and so on).
The two layers overlap in scope: many of the structural-invariant constraints listed in §9.1 are also Phase 1 checks here, because a conforming decoder operating in collected mode applies them at decode time. Implementations MAY perform validation entirely at decode (folding Phase 1 into the decoder) or entirely after decode (running Phase 1 as a separate pass). Either approach is conforming.
When this document refers to a constraint that is also enumerated in serialization.md §9.1, the wire-form error category and path semantics from §9 apply. Reported errors SHOULD use the four-field shape from §9.3 (category, path, production, message).
Well-Formedness Conditions
The conditions below are organised by structural concern. Each subsection corresponds to one of the §9.1 categories — primarily structural-invariant (cross-position constraints that the grammar alone cannot express) but with a few lexical constraints (regex-based well-formedness of pinned primitive types) called out where they are most natural to state.
EmbeddedArtifactKey Uniqueness
Within a single Template, each EmbeddedArtifact MUST have a unique EmbeddedArtifactKey. The uniqueness constraint is local to that template level and does not extend across nested template boundaries. Accordingly, an embedded template MAY contain EmbeddedArtifactKey values that are identical to keys used in its containing template, because each template defines its own local key space.
Each EmbeddedArtifactKey MUST conform to the AsciiIdentifier lexical form (per grammar.md) — the regular expression ^[A-Za-z][A-Za-z0-9_-]*$.
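A lexical check for keys is a one-line regular-expression match. In the Python sketch below, is_valid_key is a hypothetical helper; fullmatch is used because a trailing "$" in Python's re module would otherwise also accept a key followed by a newline.

```python
import re

# The AsciiIdentifier pattern, written without anchors because
# fullmatch() already requires the whole string to match.
ASCII_IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_-]*")


def is_valid_key(key: str) -> bool:
    return ASCII_IDENTIFIER.fullmatch(key) is not None


results = {key: is_valid_key(key)
           for key in ["comment", "a-b_c", "9lives", "", "key\n"]}
```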
Embedding References
Each EmbeddedField MUST reference a Field.
Each EmbeddedTemplate MUST reference a Template.
Each EmbeddedPresentationComponent MUST reference a PresentationComponent.
Cardinality Consistency
If an embedding defines minimum and maximum cardinality, the minimum cardinality MUST NOT exceed the maximum cardinality.
ValueRequirement and Cardinality are orthogonal: ValueRequirement governs whether any values must be supplied at all; Cardinality governs the permitted count if values are supplied.
If an embedding is marked “required”, its minimum cardinality MUST be at least one. For EmbeddedTemplate, this means at least one NestedTemplateInstance keyed to that embedding MUST be present in the TemplateInstance.
If an embedding is marked “recommended”, absence of a value MUST NOT by itself cause conformance failure, though implementations MAY issue warnings or other authoring guidance.
If an embedding is marked “optional”, absence of a value MUST NOT by itself cause conformance failure.
If values are present for a “recommended” or “optional” embedding, their count MUST satisfy the Cardinality constraints of that embedding.
Cardinality Defaults and Multiplicity
When Cardinality is absent from an EmbeddedArtifact, the implied default cardinality is min_cardinality(1) with max_cardinality(1): the embedded artifact MUST appear exactly once.
An EmbeddedField is single-valued if its effective maximum cardinality is max_cardinality(1).
An EmbeddedField is multi-valued if its effective maximum cardinality is greater than one or is UnboundedCardinality.
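The default and the multiplicity rules can be sketched as follows. The function names are hypothetical; the sketch assumes the wire shape used in the examples (a cardinality object with min and an optional max, where an omitted max means unbounded) and assumes valueRequirement is present on the embedding.

```python
from typing import Any, Dict, Optional, Tuple


def effective_cardinality(embedding: Dict[str, Any]) -> Tuple[int, Optional[int]]:
    """Return (min, max); max is None when unbounded."""
    card = embedding.get("cardinality")
    if card is None:
        return 1, 1  # implied default: exactly once
    return card["min"], card.get("max")


def is_multi_valued(embedding: Dict[str, Any]) -> bool:
    _, maximum = effective_cardinality(embedding)
    return maximum is None or maximum > 1


def count_conforms(embedding: Dict[str, Any], count: int) -> bool:
    """Check a supplied value count against requirement and cardinality."""
    if count == 0:
        # Absence fails only for a required embedding.
        return embedding["valueRequirement"] != "required"
    minimum, maximum = effective_cardinality(embedding)
    return minimum <= count and (maximum is None or count <= maximum)


severity = {"key": "severity", "valueRequirement": "required"}
diagnosis = {"key": "diagnosis", "valueRequirement": "required",
             "cardinality": {"min": 1}}
comment = {"key": "comment", "valueRequirement": "recommended",
           "cardinality": {"min": 0, "max": 1}}
```

Under these assumptions severity is single-valued with an effective range of exactly one, while diagnosis is multi-valued.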
Versioning
Version and ModelVersion MUST conform to the SemanticVersion lexical form (per grammar.md) — Semantic Versioning 2.0.0.
ModelVersion is a top-level component of every concrete Artifact (every Template, TemplateInstance, every Field, and every PresentationComponent); it is not a component of SchemaArtifactVersioning.
Status MUST be either draft or published.
SchemaArtifactVersioning.previousVersion and SchemaArtifactVersioning.derivedFrom, when both present on the same artifact, MUST NOT carry the same IRI value (per grammar.md §Schema Artifact Versioning).
Instance Alignment
Each FieldValue in a TemplateInstance MUST reference the EmbeddedArtifactKey of an EmbeddedField in the referenced Template.
Each NestedTemplateInstance in a TemplateInstance MUST reference the EmbeddedArtifactKey of an EmbeddedTemplate in the referenced Template.
TemplateInstance MUST NOT contain an InstanceValue for an EmbeddedPresentationComponent.
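The first alignment rule can be checked with a set lookup. The sketch below covers FieldValue entries only, because the wire shape of NestedTemplateInstance is not shown in this section. It assumes that every EmbeddedField kind has the form Embedded...Field; alignment_errors and is_embedded_field are hypothetical names.

```python
from typing import Any, Dict, List


def is_embedded_field(kind: str) -> bool:
    # Assumption: EmbeddedField kinds look like "EmbeddedTextField".
    return kind.startswith("Embedded") and kind.endswith("Field")


def alignment_errors(template: Dict[str, Any],
                     instance: Dict[str, Any]) -> List[Dict[str, str]]:
    field_keys = {m["key"] for m in template["members"]
                  if is_embedded_field(m["kind"])}
    errors = []
    for i, entry in enumerate(instance["values"]):
        if entry["kind"] == "FieldValue" and entry["key"] not in field_keys:
            errors.append({
                "category": "structural",
                "path": f"/values/{i}/key",
                "production": "FieldValue",
                "message": f'key "{entry["key"]}" names no EmbeddedField '
                           "in the referenced Template",
            })
    return errors


template = {"members": [
    {"kind": "EmbeddedSingleValuedEnumField", "key": "severity"},
    {"kind": "EmbeddedDateField", "key": "observed"},
]}
instance = {"values": [
    {"kind": "FieldValue", "key": "severity", "values": []},
    {"kind": "FieldValue", "key": "weight", "values": []},
]}
problems = alignment_errors(template, instance)
```

The empty values arrays are placeholders to keep the sample short; they would themselves be wire-shape errors in a real document.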
Field Spec Compatibility
Values in a FieldValue MUST satisfy the FieldSpec and any field-spec-specific properties of the referenced Field.
The contained values MUST follow the FieldSpec-to-Value correspondence defined in grammar.md:
| FieldSpec | Required Value type |
|---|---|
| TextFieldSpec | TextValue |
| IntegerNumberFieldSpec | IntegerNumberValue |
| RealNumberFieldSpec | RealNumberValue |
| BooleanFieldSpec | BooleanValue |
| DateFieldSpec | DateValue (YearValue / YearMonthValue / FullDateValue per dateValueType) |
| TimeFieldSpec | TimeValue |
| DateTimeFieldSpec | DateTimeValue |
| ControlledTermFieldSpec | ControlledTermValue |
| SingleValuedEnumFieldSpec / MultiValuedEnumFieldSpec | EnumValue |
| LinkFieldSpec | LinkValue |
| EmailFieldSpec | EmailValue |
| PhoneNumberFieldSpec | PhoneNumberValue |
| OrcidFieldSpec | OrcidValue |
| RorFieldSpec | RorValue |
| DoiFieldSpec | DoiValue |
| PubMedIdFieldSpec | PubMedIdValue |
| RridFieldSpec | RridValue |
| NihGrantIdFieldSpec | NihGrantIdValue |
| AttributeValueFieldSpec | AttributeValue |
Additional well-formedness conditions apply per family, as described below.
For text values:
- TextValue MUST carry a lexical form; it MAY carry a language tag
- TextFieldSpec.defaultValue, if present, MUST be a TextValue
- if both MinLength and MaxLength are present, MinLength MUST NOT exceed MaxLength
- if MinLength is present, each TextValue lexical form MUST have length greater than or equal to that minimum
- if MaxLength is present, each TextValue lexical form MUST have length less than or equal to that maximum
- if ValidationRegex is present, each TextValue lexical form MUST match that regular expression
- TextFieldSpec.defaultValue, if present, MUST satisfy any defined MinLength, MaxLength, and ValidationRegex
- TextValue lexical forms SHOULD be in Unicode Normalization Form C
- when present, TextValue.lang MUST be non-empty and well-formed according to BCP 47
- if LangTagRequirement is “langTagRequired”, each TextValue MUST carry a lang slot
- if LangTagRequirement is “langTagForbidden”, each TextValue MUST NOT carry a lang slot
- TextFieldSpec.defaultValue, if present, MUST satisfy any defined LangTagRequirement
For integer-number values:
- IntegerNumberValue MUST carry a base-10 integer lexical form; its datatype is implicitly xsd:integer
- if both IntegerNumberMinValue and IntegerNumberMaxValue are present on the field spec, IntegerNumberMinValue MUST NOT exceed IntegerNumberMaxValue
- if IntegerNumberMinValue is present, each IntegerNumberValue MUST be greater than or equal to that minimum
- if IntegerNumberMaxValue is present, each IntegerNumberValue MUST be less than or equal to that maximum
For real-number values:
- RealNumberValue MUST carry a real-valued lexical form together with a RealNumberDatatypeKind (one of decimal, float, or double)
- a RealNumberValue’s datatype MUST equal the datatype declared on the enclosing RealNumberFieldSpec
- if both RealNumberMinValue and RealNumberMaxValue are present on the field spec, RealNumberMinValue MUST NOT exceed RealNumberMaxValue
- if RealNumberMinValue is present, each RealNumberValue MUST be greater than or equal to that minimum
- if RealNumberMaxValue is present, each RealNumberValue MUST be less than or equal to that maximum
For boolean values:
- BooleanValue MUST carry a boolean payload; its datatype is implicitly xsd:boolean
For date values:
- DateFieldSpec with dateValueType: “year” MUST use YearValue, whose lexical form MUST match the pattern YYYY (a four-digit Gregorian year)
- DateFieldSpec with dateValueType: “yearMonth” MUST use YearMonthValue, whose lexical form MUST match the pattern YYYY-MM (with month in 01–12)
- DateFieldSpec with dateValueType: “fullDate” MUST use FullDateValue, whose lexical form MUST be a well-formed xsd:date lexical form (YYYY-MM-DD with optional zone offset)
- DateFieldSpec.defaultValue, if present, MUST carry a DateValue arm consistent with dateValueType: “year” admits only YearValue, “yearMonth” admits only YearMonthValue, and “fullDate” admits only FullDateValue. The same constraint applies to EmbeddedDateField.defaultValue.
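The arm-consistency and lexical rules for dates can be sketched together. This is a simplified check under stated assumptions: date_value_ok is a hypothetical helper, the DateValue is assumed to carry kind and value, the zone offset is matched by shape only (its numeric range is not checked), and calendar validity relies on Python's datetime.date, which rejects year 0000.

```python
import re
from datetime import date
from typing import Any, Dict

YEAR = re.compile(r"[0-9]{4}")
YEAR_MONTH = re.compile(r"[0-9]{4}-(0[1-9]|1[0-2])")
FULL_DATE = re.compile(
    r"([0-9]{4})-([0-9]{2})-([0-9]{2})(Z|[+-][0-9]{2}:[0-9]{2})?"
)

# dateValueType -> the only DateValue arm it admits
EXPECTED_KIND = {
    "year": "YearValue",
    "yearMonth": "YearMonthValue",
    "fullDate": "FullDateValue",
}


def date_value_ok(value: Dict[str, Any], date_value_type: str) -> bool:
    if value["kind"] != EXPECTED_KIND[date_value_type]:
        return False  # arm inconsistent with dateValueType
    text = value["value"]
    if date_value_type == "year":
        return YEAR.fullmatch(text) is not None
    if date_value_type == "yearMonth":
        return YEAR_MONTH.fullmatch(text) is not None
    match = FULL_DATE.fullmatch(text)
    if match is None:
        return False
    try:
        # Rejects impossible dates such as 2026-02-30.
        date(int(match.group(1)), int(match.group(2)), int(match.group(3)))
    except ValueError:
        return False
    return True
```

For example, the FullDateValue default from §8.1 passes for “fullDate” but fails for “year”, because the arm does not match.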
For time values:
- TimeValue MUST carry a well-formed xsd:time lexical form
- TimeFieldSpec values MUST conform to any stated TimePrecision
For date-time values:
- DateTimeValue MUST carry a well-formed xsd:dateTime lexical form
- DateTimeFieldSpec values MUST conform to the stated DateTimeValueType
For enum values:
- A FieldValue for a SingleValuedEnumFieldSpec MUST contain exactly one EnumValue
- A FieldValue for a MultiValuedEnumFieldSpec MUST contain one or more EnumValue constructs (subject to the Cardinality of the embedding)
- Each EnumValue.value (a Token) MUST equal the canonical Token of one of the referenced spec’s PermissibleValue entries
- The Token strings of an EnumFieldSpec’s PermissibleValue+ MUST be unique within that spec
- SingleValuedEnumFieldSpec.defaultValue, if present, MUST be an EnumValue whose value equals the Token of one of its PermissibleValue entries
- MultiValuedEnumFieldSpec.defaultValues, if present, MUST be a (possibly empty) list of EnumValue constructs each whose value equals the Token of one of its PermissibleValue entries; the list MUST NOT contain duplicate value entries
An EnumValue matches a PermissibleValue if and only if the value’s Token string equals the permissible value’s Token string (compared character by character).
For controlled-term values:
- ControlledTermValue MUST include a term identifier and SHOULD include a human-readable label
For contact values:
- EmailValue MUST carry a non-empty lexical form
- PhoneNumberValue MUST carry a non-empty lexical form
For external authority values:
- OrcidValue MUST include an OrcidIri
- RorValue MUST include a RorIri
- DoiValue MUST include a DoiIri
- PubMedIdValue MUST include a PubMedIri
- RridValue MUST include an RridIri
- NihGrantIdValue MUST include a NihGrantIri
- these values MAY additionally include a human-readable Label
For string-bearing values generally:
- lexical forms MUST be in Unicode Normalization Form C (per serialization.md §4.5)
- when present, language tags MUST conform to the Bcp47Tag lexical form (per grammar.md, RFC 5646)
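The Normalization Form C requirement can be tested without normalizing the string first. This is a small sketch using Python's standard `unicodedata` module; the helper name `is_nfc` and the sample strings are illustrative only.

```python
import unicodedata


def is_nfc(lexical_form: str) -> bool:
    """Return True when the string is already in Unicode Normalization Form C."""
    # unicodedata.is_normalized is available from Python 3.8 onwards.
    return unicodedata.is_normalized("NFC", lexical_form)


composed = "caf\u00e9"     # e-acute as a single code point
decomposed = "cafe\u0301"  # plain e followed by a combining acute accent
```

Both sample strings render identically, but only the composed one satisfies the rule; a validator would report the decomposed form rather than silently normalizing it.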
For default values (both layers):
The model carries default values at two layers, and validation rules apply uniformly across the two:
- A field-level default lives on the reusable Field’s FieldSpec (XxxFieldSpec.defaultValue), shared by every Template that embeds the field. Every concrete XxxFieldSpec except AttributeValueFieldSpec admits an optional default.
- An embedding-level default lives on the EmbeddedXxxField inside a Template (EmbeddedXxxField.defaultValue), specific to that one embedding.
The well-formedness conditions:
- A default value, at either layer, MUST be the family-specific Value type as given in grammar.md.
- A default MUST satisfy every well-formedness condition that a corresponding FieldValue would satisfy for the same FieldSpec (length bounds, numeric bounds, datatype consistency, lexical-form constraints, and so on).
- Enum defaults at either layer MUST be EnumValue constructs (single for SingleValuedEnumField/Spec, a possibly-empty list for MultiValuedEnumField/Spec) whose value equals the Token of one of the spec’s PermissibleValue entries; the multi-valued list MUST NOT contain duplicate value entries.
- When both a field-level and an embedding-level default are present for the same field, the embedding-level default takes precedence (see grammar.md).
- AttributeValueFieldSpec and EmbeddedAttributeValueField carry no defaults at either layer.
For multiplicity:
- if an EmbeddedField is single-valued, its corresponding FieldValue MUST NOT contain more than one value
- if an EmbeddedField is multi-valued, the number of values in its FieldValue MUST satisfy the embedding cardinality constraints
- if an EmbeddedTemplate has multiplicity greater than one, the number of corresponding NestedTemplateInstance constructs MUST satisfy the embedding cardinality constraints
Rendering Hint Compatibility
Any rendering hint used by the model MUST be compatible with the associated FieldSpec:
| Rendering hint | Permitted on |
|---|---|
| TextRenderingHint | TextFieldSpec |
| SingleValuedEnumRenderingHint | SingleValuedEnumFieldSpec |
| MultiValuedEnumRenderingHint | MultiValuedEnumFieldSpec |
| BooleanRenderingHint | BooleanFieldSpec |
| NumericRenderingHint | IntegerNumberFieldSpec, RealNumberFieldSpec |
| DateRenderingHint | DateFieldSpec |
| TimeRenderingHint | TimeFieldSpec |
| DateTimeRenderingHint | DateTimeFieldSpec |
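The compatibility table lends itself to a lookup structure. The sketch below transcribes the table into a dictionary keyed by hint kind; the names `PERMITTED_FIELD_SPECS` and `hint_is_compatible` are hypothetical, and treating an unknown hint kind as incompatible is an assumption rather than something the table states.

```python
# Mapping transcribed from the compatibility table (kinds represented as strings).
PERMITTED_FIELD_SPECS = {
    "TextRenderingHint": {"TextFieldSpec"},
    "SingleValuedEnumRenderingHint": {"SingleValuedEnumFieldSpec"},
    "MultiValuedEnumRenderingHint": {"MultiValuedEnumFieldSpec"},
    "BooleanRenderingHint": {"BooleanFieldSpec"},
    "NumericRenderingHint": {"IntegerNumberFieldSpec", "RealNumberFieldSpec"},
    "DateRenderingHint": {"DateFieldSpec"},
    "TimeRenderingHint": {"TimeFieldSpec"},
    "DateTimeRenderingHint": {"DateTimeFieldSpec"},
}


def hint_is_compatible(hint_kind: str, field_spec_kind: str) -> bool:
    """Return True when the hint kind is permitted on the field-spec kind."""
    # Assumption: a hint kind missing from the table is compatible with nothing.
    return field_spec_kind in PERMITTED_FIELD_SPECS.get(hint_kind, set())
```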
Controlled Term Value Structure
If a value conforms to ControlledTermFieldSpec, the value MUST include a term identifier and SHOULD include a human-readable label.
A ControlledTermFieldSpec.defaultValue or EmbeddedControlledTermField.defaultValue, if present, SHOULD identify a term drawn from one of the declared ControlledTermSource entries of the referenced ControlledTermFieldSpec. Verifying source membership requires resolving the TermIri against an external ontology and is outside the scope of the canonical algorithm; see Out of Scope.
Canonical Validation Algorithm
The canonical validation algorithm consists of two phases that MUST be applied in order. Phase 1 validates the well-formedness of a Template and the artifacts it references. Phase 2 validates that a TemplateInstance conforms to a well-formed Template. Phase 2 MUST NOT be applied unless Phase 1 has passed without error.
Both phases are defined as error-collecting: all violations MUST be reported rather than stopping at the first failure. Implementations MAY additionally offer a fail-fast mode for performance, but the set of errors reported MUST be a subset of those that the collecting mode would report.
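The relationship between the collecting mode and an optional fail-fast mode can be illustrated with a tiny driver. This is a sketch under stated assumptions: each check is modelled as a zero-argument callable returning a list of error strings, and `run_checks` is a hypothetical name, not part of the specification.

```python
from typing import Callable, Iterable, List

# Assumed shape: every check returns its (possibly empty) list of errors.
Check = Callable[[], List[str]]


def run_checks(checks: Iterable[Check], fail_fast: bool = False) -> List[str]:
    """Run checks in order, collecting errors; optionally stop after the first failure."""
    errors: List[str] = []
    for check in checks:
        errors.extend(check())
        if fail_fast and errors:
            break  # stop at the first check that reported anything
    return errors
```

Because the fail-fast path runs the same checks in the same order and merely stops early, whatever it reports is also reported by the collecting path.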
The algorithm is expressed as a set of named subroutines. Each subroutine takes typed inputs and produces a (possibly empty) set of errors. Verify denotes a hard constraint: failure produces an error. Warn denotes a SHOULD constraint: failure produces a warning. The notation count(X) denotes the number of elements of kind X, and len(s) denotes the length in characters of string s.
Reporting errors
Every Verify step in the algorithm has an associated error report that a conforming binding MUST surface on failure. Every Warn step has an associated warning report. Each step states its report inline as an On failure: line directly under the step.
Each report uses the four-field shape from serialization.md §9.3:
- category: one of wireShape, lexical, or structural. Most validation reports are structural (cross-position constraint); a few are lexical (regex / well-formedness of a primitive type).
- path: a JSON Pointer locating the offending slot in the wire form being validated.
- production: the wire-grammar production at the path.
- message: a human-readable explanation. The wording given in this document is recommended; bindings MAY use different text and SHOULD include enough detail to support diagnosis.
Path conventions. Subroutines describe paths relative to their input, using a placeholder for the input and slot accessors after slashes:
- <input>: the subroutine’s input parameter, e.g. <embedded>, <template>, <fieldSpec>.
- <input>/slotName: a property slot.
- <input>/arrayName/<i>: an element of an array (with <i> an index variable).
- <input>/arrayName/<i>/inner: a nested slot inside the i-th element.
The caller of a subroutine replaces the placeholder with the actual JSON Pointer of its input. For example, when validate_cardinality_consistency runs against template.members[2], an error reported at <embedded>/cardinality/min becomes /members/2/cardinality/min in the surfaced report.
When a subroutine S₁ calls another subroutine S₂ and S₂ reports an error at path <S₂.input>/foo, the surfaced path is <S₁.input>/<path-to-S₂.input>/foo. Each layer prepends its own input path. For example, validate_default_value calls a family-specific value-validator with the default value as input; an error from the inner validator at <value>/value is surfaced at <embedded>/defaultValue/value.
Warning reports follow the same shape but are emitted through the binding’s warning channel rather than its error channel.
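The four-field report shape and the path-prepending rule can be sketched as follows. The `Report` class and `rebase` helper are hypothetical names chosen for illustration; paths are stored relative to the subroutine's input (the part after the placeholder), which is one possible reading of the conventions above.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Report:
    category: str    # "wireShape", "lexical", or "structural"
    path: str        # JSON Pointer, relative to the subroutine's input
    production: str  # wire-grammar production at the path
    message: str     # human-readable explanation


def rebase(report: Report, input_pointer: str) -> Report:
    """Prepend the caller's JSON Pointer to a subroutine-relative path."""
    return replace(report, path=input_pointer + report.path)


# An inner value-validator reports relative to its own input (<value>).
inner = Report("lexical", "/value", "IntegerNumberValue",
               "value is not a well-formed IntegerLexicalForm")
# validate_default_value roots it at <embedded>/defaultValue ...
middle = rebase(inner, "/defaultValue")
# ... and the template-level caller roots that at /members/2.
surfaced = rebase(middle, "/members/2")
```

Each layer only needs to know the pointer of the input it passed down, which keeps subroutines independent of where they are called from.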
External resolution
Several Verify steps require resolving an artifact-reference IRI to its definition — for example, validate_embedding_reference verifies that embedded.artifactRef “identifies an existing <Family>Field”. Resolution is outside the scope of this specification. A conforming validator is given an external resolver function
resolve(iri: Iri) → Artifact | null
that returns the artifact referenced by an IRI, or null if no such artifact is known. The validator MUST use this resolver to resolve every EmbeddedField.artifactRef, every EmbeddedTemplate.artifactRef, every EmbeddedPresentationComponent.artifactRef, and every TemplateInstance.templateRef.
How the resolver is implemented is a binding concern, not a model concern. Plausible implementations:
- A registry-backed resolver that looks up artifacts in a local catalogue.
- A document-local resolver that finds artifacts inlined in the same input document.
- A network-backed resolver that dereferences HTTP IRIs.
When resolve(iri) returns null, the surfaced error is:
- structural at the relevant artifactRef slot, production naming the embedding’s family, message “artifactRef does not resolve to an artifact”.
When resolve(iri) returns an artifact of the wrong family (e.g. a TextField is returned for an EmbeddedDateField.artifactRef), the surfaced error is the family-mismatch error already documented at validate_embedding_reference.
Implementations MAY operate without a resolver, in which case all “Verify <…> identifying an existing <Family>” steps are SKIPPED and any conformance claim must be qualified accordingly. This is a partial-validation mode appropriate for syntactic linting; full conformance requires a resolver.
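A registry-backed resolver and the three outcomes described above (skipped, unresolved, wrong family) might look like the sketch below. Everything here is illustrative: artifacts are modelled as plain dictionaries with a `family` entry, and `make_registry_resolver` and `check_reference` are hypothetical names rather than part of any binding.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical stand-in for an Artifact: a dict carrying its family name.
Artifact = Dict[str, str]
Resolver = Callable[[str], Optional[Artifact]]


def make_registry_resolver(catalogue: Dict[str, Artifact]) -> Resolver:
    """Build a resolver backed by a local catalogue keyed by IRI."""
    return lambda iri: catalogue.get(iri)


def check_reference(iri: str, expected_family: str,
                    resolve: Optional[Resolver]) -> List[str]:
    """Return the error messages for one artifactRef (empty when it is fine)."""
    if resolve is None:
        return []  # partial-validation mode: the step is skipped
    artifact = resolve(iri)
    if artifact is None:
        return ["artifactRef does not resolve to an artifact"]
    if artifact["family"] != expected_family:
        return [
            "artifactRef resolves to an artifact of the wrong family "
            f"(expected {expected_family}, got {artifact['family']})"
        ]
    return []
```

Passing the resolver as a plain function keeps the validator independent of whether lookup is local, document-scoped, or network-backed.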
Lexical-form precision
Several Verify steps appeal to lexical-form well-formedness for the primitive types pinned in grammar.md §Primitive String Types. For interoperability across implementations, the lexical-form predicates resolve as follows:
| Lexical form | Authoritative grammar |
|---|---|
| SemanticVersion | The regular expression at semver.org. |
| IriString | The IRI ABNF in RFC 3987 §2.2. The IRI MUST be absolute (carry a scheme). Implementations MAY use a permissive scheme-and-non-whitespace check as a fast pre-filter, but a conforming validator MUST be capable of full RFC 3987 conformance on demand. |
| Bcp47Tag | The Language-Tag production of RFC 5646. Implementations MAY validate against the IANA Language Subtag Registry; a syntactic-only check is acceptable as a baseline. |
| IntegerLexicalForm | Regex ^-?(0\|[1-9][0-9]*)$. No leading +, no leading zeros (other than the literal 0), no whitespace. Magnitude is unbounded. |
| AsciiIdentifier | Regex ^[A-Za-z][A-Za-z0-9_-]*$. Length is unbounded. |
| Iso8601DateTimeLexicalForm | The dateTime lexical form from XML Schema 1.1 Part 2 §3.3.7, extended format. |
| xsd:date lexical form | XML Schema 1.1 Part 2 §3.3.9. |
| xsd:time lexical form | XML Schema 1.1 Part 2 §3.3.8. |
| xsd:dateTime lexical form | XML Schema 1.1 Part 2 §3.3.7. |
| xsd:decimal lexical form | XML Schema 1.1 Part 2 §3.3.3. |
| xsd:float / xsd:double lexical form | XML Schema 1.1 Part 2 §3.3.4 and §3.3.5. The special values INF, -INF, and NaN are part of the lexical space. |
A conforming validator MUST treat the cited grammar as authoritative; a value is well-formed if and only if it matches the cited grammar. This pins the predicate so two independently-implemented validators agree on every input.
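For the two forms that the table pins to a literal regex, the predicates are short. This is a minimal sketch with hypothetical function names; it uses `fullmatch` instead of the `^…$` anchors because Python's `$` also matches just before a trailing newline, which would wrongly accept a value such as `"12\n"`.

```python
import re

# Bodies of the regexes given in the table; fullmatch supplies the anchoring.
INTEGER_LEXICAL_FORM = re.compile(r"-?(0|[1-9][0-9]*)")
ASCII_IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_-]*")


def is_integer_lexical_form(s: str) -> bool:
    """True when s is a well-formed IntegerLexicalForm."""
    return INTEGER_LEXICAL_FORM.fullmatch(s) is not None


def is_ascii_identifier(s: str) -> bool:
    """True when s is a well-formed AsciiIdentifier."""
    return ASCII_IDENTIFIER.fullmatch(s) is not None
```

Since magnitude is unbounded, a well-formed integer can then be converted with Python's arbitrary-precision `int` without overflow concerns.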
Phase 1: Schema Validation
Entry Point
validate_schema(template: Template)
Entry point for schema validation.
- Run validate_model_version(template.model_version) and validate_schema_artifact_versioning(template.versioning).
- If template.template_rendering_hint is present: run validate_template_rendering_hint(template.template_rendering_hint).
- Let fields = the set of Field artifacts referenced by EmbeddedField constructs in template.
- For each field in fields: run validate_model_version(field.model_version), validate_schema_artifact_versioning(field.versioning), and validate_field_spec(field.field_spec).
- Let pcs = the set of PresentationComponent artifacts referenced by EmbeddedPresentationComponent constructs in template.
- For each component in pcs: run validate_model_version(component.model_version). PresentationComponent does not carry SchemaArtifactVersioning, so no versioning validation step applies.
- Run validate_embedded_artifact_keys(template).
- For each embedded in template.embedded_artifacts:
  - Run validate_embedding_reference(embedded).
  - Run validate_cardinality_consistency(embedded).
  - If embedded is an EmbeddedField: run validate_rendering_hints(embedded).
  - If embedded.default_value is present: run validate_default_value(embedded.default_value, embedded).
  - If embedded is an EmbeddedTemplate: run validate_schema(embedded.referenced_template).
Metadata and Key Validation
validate_schema_artifact_versioning(versioning: SchemaArtifactVersioning)
Applies the Versioning rules to the SchemaArtifactVersioning slot carried by each schema artifact (Template, Field). PresentationComponent and TemplateInstance do not carry SchemaArtifactVersioning; this subroutine is not invoked for them.
- Let version = versioning.version. Verify version conforms to the SemanticVersion lexical form (Semantic Versioning 2.0.0).
  On failure:
  - category: lexical
  - path: <versioning>/version
  - production: SchemaArtifactVersioning
  - message: "version is not a valid SemanticVersion 2.0.0 string"
- Let status = versioning.status. Verify status ∈ { draft, published }.
  On failure:
  - category: wireShape
  - path: <versioning>/status
  - production: SchemaArtifactVersioning
  - message: "status must be 'draft' or 'published'"
- If both versioning.previous_version and versioning.derived_from are present: verify they do not carry the same IRI value.
  On failure:
  - category: structural
  - path: <versioning>/derivedFrom
  - production: SchemaArtifactVersioning
  - message: "previousVersion and derivedFrom MUST NOT carry the same IRI"
validate_template_rendering_hint(hint: TemplateRenderingHint)
- If hint.help_display_mode is present: verify it is one of “inline”, “tooltip”, “both”, “none”.
  On failure:
  - category: wireShape
  - path: <hint>/helpDisplayMode
  - production: HelpDisplayMode
  - message: "unknown HelpDisplayMode value"
validate_model_version(modelVersion: ModelVersion)
Applies the Versioning rules to the artifact-level ModelVersion carried directly by every concrete Artifact.
- Verify modelVersion conforms to the SemanticVersion lexical form (Semantic Versioning 2.0.0).
  On failure:
  - category: lexical
  - path: <modelVersion>
  - production: naming the enclosing artifact (e.g. TextField, Template)
  - message: "modelVersion is not a valid SemanticVersion 2.0.0 string"
validate_embedded_artifact_keys(template: Template)
Applies the EmbeddedArtifactKey Uniqueness rules.
- Let keys = the sequence of EmbeddedArtifactKey values across all EmbeddedArtifact constructs in template.
- For each key k in keys: verify k conforms to the AsciiIdentifier lexical form (regex ^[A-Za-z][A-Za-z0-9_-]*$).
  On failure:
  - category: lexical
  - path: <template>/members/<i>/key
  - production: naming the embedded artifact at index <i>
  - message: "EmbeddedArtifactKey does not match the AsciiIdentifier pattern"
- Verify all values in keys are distinct: for each pair (k₁, k₂) where k₁ ≠ k₂ as positions but k₁ = k₂ as values, report a duplicate-key error. Key uniqueness is scoped to template; the same key may appear in a nested template without conflict.
  On failure:
  - category: structural
  - path: <template>/members/<j>/key (the second occurrence)
  - production: Template
  - message: "EmbeddedArtifact.key is not unique within the enclosing Template (also at /members/<i>/key)"
Reference and Cardinality Validation
validate_embedding_reference(embedded: EmbeddedArtifact)
Applies the Embedding References rules.
Each step below resolves embedded.artifactRef via the external resolver resolve(iri) (see External resolution) and verifies the resolved artifact’s family. If the validator was given no resolver, all steps are SKIPPED.
For each step below, two failure modes are possible:
- Unresolved reference:
  - category: structural
  - path: <embedded>/artifactRef
  - production: naming embedded's family
  - message: "artifactRef does not resolve to an artifact"
- Family mismatch:
  - category: structural
  - path: <embedded>/artifactRef
  - production: naming embedded's family
  - message: "artifactRef resolves to an artifact of the wrong family (expected <Family>, got <ResolvedFamily>)"
- If embedded is an EmbeddedTextField: verify embedded.artifactRef is a TextFieldId identifying an existing TextField.
- If embedded is an EmbeddedIntegerNumberField: verify embedded.artifactRef is an IntegerNumberFieldId identifying an existing IntegerNumberField.
- If embedded is an EmbeddedRealNumberField: verify embedded.artifactRef is a RealNumberFieldId identifying an existing RealNumberField.
- If embedded is an EmbeddedBooleanField: verify embedded.artifactRef is a BooleanFieldId identifying an existing BooleanField.
- If embedded is an EmbeddedDateField: verify embedded.artifactRef is a DateFieldId identifying an existing DateField.
- If embedded is an EmbeddedTimeField: verify embedded.artifactRef is a TimeFieldId identifying an existing TimeField.
- If embedded is an EmbeddedDateTimeField: verify embedded.artifactRef is a DateTimeFieldId identifying an existing DateTimeField.
- If embedded is an EmbeddedControlledTermField: verify embedded.artifactRef is a ControlledTermFieldId identifying an existing ControlledTermField.
- If embedded is an EmbeddedSingleValuedEnumField: verify embedded.artifactRef is a SingleValuedEnumFieldId identifying an existing SingleValuedEnumField.
- If embedded is an EmbeddedMultiValuedEnumField: verify embedded.artifactRef is a MultiValuedEnumFieldId identifying an existing MultiValuedEnumField.
- If embedded is an EmbeddedLinkField: verify embedded.artifactRef is a LinkFieldId identifying an existing LinkField.
- If embedded is an EmbeddedEmailField: verify embedded.artifactRef is an EmailFieldId identifying an existing EmailField.
- If embedded is an EmbeddedPhoneNumberField: verify embedded.artifactRef is a PhoneNumberFieldId identifying an existing PhoneNumberField.
- If embedded is an EmbeddedOrcidField: verify embedded.artifactRef is an OrcidFieldId identifying an existing OrcidField.
- If embedded is an EmbeddedRorField: verify embedded.artifactRef is a RorFieldId identifying an existing RorField.
- If embedded is an EmbeddedDoiField: verify embedded.artifactRef is a DoiFieldId identifying an existing DoiField.
- If embedded is an EmbeddedPubMedIdField: verify embedded.artifactRef is a PubMedIdFieldId identifying an existing PubMedIdField.
- If embedded is an EmbeddedRridField: verify embedded.artifactRef is an RridFieldId identifying an existing RridField.
- If embedded is an EmbeddedNihGrantIdField: verify embedded.artifactRef is a NihGrantIdFieldId identifying an existing NihGrantIdField.
- If embedded is an EmbeddedAttributeValueField: verify embedded.artifactRef is an AttributeValueFieldId identifying an existing AttributeValueField.
- If embedded is an EmbeddedTemplate: verify embedded.artifactRef is a TemplateId identifying an existing Template.
- If embedded is an EmbeddedPresentationComponent: verify embedded.artifactRef is a PresentationComponentId identifying an existing PresentationComponent.
validate_cardinality_consistency(embedded: EmbeddedArtifact)
Applies the Cardinality Consistency rules.
- Let min = embedded.cardinality.min_cardinality if embedded.cardinality is present, else 1.
- Let max = embedded.cardinality.max_cardinality if embedded.cardinality is present, else 1. If max is UnboundedCardinality, let max = ∞.
- Verify min ≤ max.
  On failure:
  - category: structural
  - path: <embedded>/cardinality
  - production: Cardinality
  - message: "min must not exceed max"
- Let req = embedded.value_requirement if present, else “optional”.
- If req = “required”: verify min ≥ 1.
  On failure:
  - category: structural
  - path: <embedded>/cardinality/min
  - production: Cardinality
  - message: "required embedding must have min cardinality of at least 1"
Field Spec Validation
Applies the Field Spec Compatibility rules. See also Field Specs in the abstract grammar.
validate_field_spec(fieldSpec: FieldSpec)
Dispatch on the kind of fieldSpec:
- If fieldSpec is TextFieldSpec: run validate_text_field_spec(fieldSpec).
- If fieldSpec is IntegerNumberFieldSpec: run validate_integer_number_field_spec(fieldSpec).
- If fieldSpec is RealNumberFieldSpec: run validate_real_number_field_spec(fieldSpec).
- If fieldSpec is SingleValuedEnumFieldSpec or MultiValuedEnumFieldSpec: run validate_enum_field_spec(fieldSpec).
- All other field specs have no additional schema-level well-formedness checks beyond structural grammar conformance.
validate_text_field_spec(fieldSpec: TextFieldSpec)
- If both fieldSpec.min_length and fieldSpec.max_length are present: verify fieldSpec.min_length ≤ fieldSpec.max_length.
  On failure:
  - category: structural
  - path: <fieldSpec>/minLength
  - production: TextFieldSpec
  - message: "minLength must not exceed maxLength"
- If fieldSpec.lang_tag_requirement is present: verify it is one of “langTagRequired”, “langTagOptional”, “langTagForbidden”.
  On failure:
  - category: wireShape
  - path: <fieldSpec>/langTagRequirement
  - production: LangTagRequirement
  - message: "unknown LangTagRequirement value"
validate_integer_number_field_spec(fieldSpec: IntegerNumberFieldSpec)
- If both fieldSpec.min_value and fieldSpec.max_value are present: verify fieldSpec.min_value ≤ fieldSpec.max_value.
  On failure:
  - category: structural
  - path: <fieldSpec>/minValue
  - production: IntegerNumberFieldSpec
  - message: "minValue must not exceed maxValue"
validate_real_number_field_spec(fieldSpec: RealNumberFieldSpec)
- If both fieldSpec.min_value and fieldSpec.max_value are present: verify fieldSpec.min_value ≤ fieldSpec.max_value.
  On failure:
  - category: structural
  - path: <fieldSpec>/minValue
  - production: RealNumberFieldSpec
  - message: "minValue must not exceed maxValue"
validate_enum_field_spec(fieldSpec: EnumFieldSpec)
- Let tokens = the sequence of pv.value values across all pv in fieldSpec.permissible_values.
- Verify all values in tokens are distinct: report a duplicate-token error for any pair sharing the same token string.
  On failure:
  - category: structural
  - path: <fieldSpec>/permissibleValues/<j>/value (the second occurrence)
  - production: naming fieldSpec's kind
  - message: "PermissibleValue.value is not unique within the enclosing spec (also at /permissibleValues/<i>/value)"
- For each pv in fieldSpec.permissible_values: verify pv.value is a non-empty Unicode string.
  On failure:
  - category: wireShape
  - path: <fieldSpec>/permissibleValues/<i>/value
  - production: PermissibleValue
  - message: "value must be a non-empty Unicode string"
- For each pv in fieldSpec.permissible_values, for each m in pv.meanings: verify m.iri is a syntactically valid IRI.
  On failure:
  - category: lexical
  - path: <fieldSpec>/permissibleValues/<i>/meanings/<j>/iri
  - production: Meaning
  - message: "iri is not a valid IRI"
- If fieldSpec is a SingleValuedEnumFieldSpec and fieldSpec.default_value is present: verify fieldSpec.default_value is an EnumValue and that fieldSpec.default_value.value ∈ tokens.
  On failure:
  - category: structural
  - path: <fieldSpec>/defaultValue/value
  - production: SingleValuedEnumFieldSpec
  - message: "defaultValue does not match any of the spec's permissibleValues"
- If fieldSpec is a MultiValuedEnumFieldSpec and fieldSpec.default_values is present:
  - Verify each entry is an EnumValue and that its value ∈ tokens.
    On failure:
    - category: structural
    - path: <fieldSpec>/defaultValues/<i>/value
    - production: MultiValuedEnumFieldSpec
    - message: "defaultValues entry does not match any of the spec's permissibleValues"
  - Verify all entries’ value strings are distinct.
    On failure:
    - category: structural
    - path: <fieldSpec>/defaultValues/<j>/value (the second occurrence)
    - production: MultiValuedEnumFieldSpec
    - message: "defaultValues contains duplicate entries (also at /defaultValues/<i>/value)"
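The token-uniqueness and default-membership steps for a multi-valued enum spec can be sketched as below. This is an illustration only: tokens and defaults are modelled as plain strings, the function name `validate_enum_tokens` is hypothetical, and only the path and message of each report are returned. A duplicate is reported at its later occurrence and points back at the first one.

```python
from typing import Dict, List, Sequence, Tuple


def validate_enum_tokens(tokens: Sequence[str],
                         default_values: Sequence[str] = ()) -> List[Tuple[str, str]]:
    """Return (path, message) pairs for duplicate tokens and bad defaults."""
    errors: List[Tuple[str, str]] = []
    first_token: Dict[str, int] = {}  # token -> index of its first occurrence
    for j, token in enumerate(tokens):
        if token in first_token:
            errors.append((
                f"/permissibleValues/{j}/value",
                "PermissibleValue.value is not unique within the enclosing spec "
                f"(also at /permissibleValues/{first_token[token]}/value)",
            ))
        else:
            first_token[token] = j
    first_default: Dict[str, int] = {}
    for j, value in enumerate(default_values):
        if value not in first_token:
            errors.append((
                f"/defaultValues/{j}/value",
                "defaultValues entry does not match any of the spec's permissibleValues",
            ))
        if value in first_default:
            errors.append((
                f"/defaultValues/{j}/value",
                f"defaultValues contains duplicate entries (also at /defaultValues/{first_default[value]}/value)",
            ))
        else:
            first_default[value] = j
    return errors
```

Python string equality compares code point by code point, which lines up with the character-by-character Token comparison, provided both strings are already in the same normalization form.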
Default Value Validation
validate_default_value(defaultValue: Value, embedded: EmbeddedArtifact)
Let fieldSpec = the FieldSpec of the Field referenced by embedded.
- Verify defaultValue is of the family-specific Value type for fieldSpec: TextValue for TextFieldSpec, IntegerNumberValue for IntegerNumberFieldSpec, RealNumberValue for RealNumberFieldSpec, BooleanValue for BooleanFieldSpec, DateValue for DateFieldSpec, TimeValue for TimeFieldSpec, DateTimeValue for DateTimeFieldSpec, ControlledTermValue for ControlledTermFieldSpec, EnumValue for SingleValuedEnumFieldSpec, a sequence of EnumValue for MultiValuedEnumFieldSpec, LinkValue for LinkFieldSpec, EmailValue for EmailFieldSpec, PhoneNumberValue for PhoneNumberFieldSpec, and the corresponding external-authority Value types for the external-authority field specs. AttributeValueFieldSpec does not admit a default value.
  On failure:
  - category: wireShape
  - path: <embedded>/defaultValue
  - production: naming embedded's family
  - message: "defaultValue must be a <FamilyValue> (got <kind>)"
- Apply the family-specific validate_xxx_value(defaultValue, fieldSpec) procedure to defaultValue. The default value MUST satisfy every constraint that a FieldValue carrying the same Value would satisfy. Errors reported by the inner subroutine are surfaced verbatim, with the path rooted at <embedded>/defaultValue.
- If embedded is an EmbeddedSingleValuedEnumField: verify defaultValue is a single EnumValue (not a sequence).
  On failure:
  - category: wireShape
  - path: <embedded>/defaultValue
  - production: EmbeddedSingleValuedEnumField
  - message: "defaultValue must be a single EnumValue, not a sequence"
- If embedded is an EmbeddedMultiValuedEnumField: verify defaultValue is a (possibly empty) sequence of EnumValue constructs and that no two entries share the same value.
  On failure (shape):
  - category: wireShape
  - path: <embedded>/defaultValue
  - production: EmbeddedMultiValuedEnumField
  - message: "defaultValue must be an array of EnumValue"

  On failure (duplicate):
  - category: structural
  - path: <embedded>/defaultValue/<j>/value (the second occurrence)
  - production: EmbeddedMultiValuedEnumField
  - message: "defaultValue contains duplicate entries (also at /defaultValue/<i>/value)"
Rendering Hint Validation
validate_rendering_hints(embedded: EmbeddedField)
Applies the Rendering Hint Compatibility rules.
Let fieldSpec = the FieldSpec of the Field referenced by embedded.
For each step below, on failure: structural at the rendering-hint slot’s path (e.g. <embedded>/renderingHint), production naming embedded’s family, message “<HintKind> is not compatible with <FieldSpecKind>”.
- If embedded carries a TextRenderingHint: verify fieldSpec is TextFieldSpec.
- If embedded carries a SingleValuedEnumRenderingHint: verify fieldSpec is SingleValuedEnumFieldSpec.
- If embedded carries a MultiValuedEnumRenderingHint: verify fieldSpec is MultiValuedEnumFieldSpec.
- If embedded carries a NumericRenderingHint: verify fieldSpec is IntegerNumberFieldSpec or RealNumberFieldSpec.
- If embedded carries a DateRenderingHint: verify fieldSpec is DateFieldSpec.
- If embedded carries a TimeRenderingHint: verify fieldSpec is TimeFieldSpec.
- If embedded carries a DateTimeRenderingHint: verify fieldSpec is DateTimeFieldSpec.
- If embedded carries a BooleanRenderingHint: verify fieldSpec is BooleanFieldSpec.
Phase 2: Instance Validation
Entry Point
validate_instance(instance: TemplateInstance, template: Template)
Entry point for instance validation.
- Run validate_model_version(instance.model_version).
- Run validate_instance_alignment(instance, template).
- Run validate_field_presence_and_cardinality(instance, template).
- For each fieldValue in instance.instance_values where fieldValue is a FieldValue:
  - Let embeddedField = the EmbeddedField in template whose key = fieldValue.key.
  - Run validate_field_value(fieldValue, embeddedField).
- Run validate_nested_template_presence_and_cardinality(instance, template).
- For each nestedInstance in instance.instance_values where nestedInstance is a NestedTemplateInstance:
  - Let embeddedTemplate = the EmbeddedTemplate in template whose key = nestedInstance.key.
  - Let referencedTemplate = the Template identified by embeddedTemplate.artifactRef.
  - Run validate_instance(nestedInstance, referencedTemplate).
Structural Alignment
validate_instance_alignment(instance: TemplateInstance, template: Template)
Applies the Instance Alignment rules.
- Let field_keys = { embedded.key | embedded ∈ template.embedded_artifacts, embedded is EmbeddedField }.
- Let template_keys = { embedded.key | embedded ∈ template.embedded_artifacts, embedded is EmbeddedTemplate }.
- Let pc_keys = { embedded.key | embedded ∈ template.embedded_artifacts, embedded is EmbeddedPresentationComponent }.
- For each fieldValue in instance.instance_values where fieldValue is a FieldValue: verify fieldValue.key ∈ field_keys.
  On failure:
  - category: structural
  - path: <instance>/values/<i>/key
  - production: FieldValue
  - message: "FieldValue.key does not identify any EmbeddedField in the referenced Template"
- For each nestedInstance in instance.instance_values where nestedInstance is a NestedTemplateInstance: verify nestedInstance.key ∈ template_keys.
  On failure:
  - category: structural
  - path: <instance>/values/<i>/key
  - production: NestedTemplateInstance
  - message: "NestedTemplateInstance.key does not identify any EmbeddedTemplate in the referenced Template"
- For each instanceValue in instance.instance_values: verify instanceValue.key ∉ pc_keys.
  On failure:
  - category: structural
  - path: <instance>/values/<i>/key
  - production: naming instanceValue's kind
  - message: "InstanceValue keyed to an EmbeddedPresentationComponent — presentation components do not produce instance values"
Field Presence and Cardinality
validate_field_presence_and_cardinality(instance: TemplateInstance, template: Template)
Applies the Cardinality Consistency and Cardinality Defaults and Multiplicity rules.
For each embeddedField in template.embedded_artifacts where embeddedField is an EmbeddedField:
- Let eff_min = embeddedField.cardinality.min_cardinality if present, else 1.
- Let eff_max = embeddedField.cardinality.max_cardinality if present, else 1. If eff_max is UnboundedCardinality, let eff_max = ∞.
- Let req = embeddedField.value_requirement if present, else “optional”.
- Let fieldValue = the FieldValue in instance with key = embeddedField.key, or absent if none exists.
- If req = “required”:
  - Verify fieldValue ≠ absent.
    On failure:
    - category: structural
    - path: <instance>/values
    - production: TemplateInstance
    - message: "required field <embeddedField.key> is missing from the instance"
  - Verify count(fieldValue.values) ≥ eff_min.
    On failure:
    - category: structural
    - path: <fieldValue>/values
    - production: FieldValue
    - message: "value count below required minimum cardinality (got <n>, expected ≥ <eff_min>)"
  - If eff_max ≠ ∞: verify count(fieldValue.values) ≤ eff_max.
    On failure:
    - category: structural
    - path: <fieldValue>/values
    - production: FieldValue
    - message: "value count above maximum cardinality (got <n>, expected ≤ <eff_max>)"
- If req = “recommended” or req = “optional”:
  - If fieldValue ≠ absent:
    - Verify count(fieldValue.values) ≥ eff_min.
      On failure:
      - category: structural
      - path: <fieldValue>/values
      - production: FieldValue
      - message: "value count below minimum cardinality (got <n>, expected ≥ <eff_min>)"
    - If eff_max ≠ ∞: verify count(fieldValue.values) ≤ eff_max.
      On failure:
      - category: structural
      - path: <fieldValue>/values
      - production: FieldValue
      - message: "value count above maximum cardinality (got <n>, expected ≤ <eff_max>)"
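The presence and cardinality steps for a single embedded field reduce to a few comparisons. The sketch below is illustrative: `validate_field_presence` is a hypothetical name, an absent FieldValue is modelled as `None`, and an unbounded maximum as `math.inf`. It also assumes that the count checks are skipped when a required field is absent, since there is nothing to count; the steps above do not state this explicitly.

```python
import math
from typing import List, Optional, Sequence


def validate_field_presence(values: Optional[Sequence[object]],
                            eff_min: int = 1,
                            eff_max: float = 1,
                            req: str = "optional") -> List[str]:
    """Return messages for one embedded field; values is None when the FieldValue is absent."""
    if values is None:
        # Only a required field must be present; count checks are skipped (assumption).
        return ["required field is missing from the instance"] if req == "required" else []
    errors: List[str] = []
    n = len(values)
    label = "required minimum" if req == "required" else "minimum"
    if n < eff_min:
        errors.append(f"value count below {label} cardinality (got {n}, expected ≥ {eff_min})")
    if eff_max != math.inf and n > eff_max:
        errors.append(f"value count above maximum cardinality (got {n}, expected ≤ {eff_max})")
    return errors
```

The defaults mirror the effective values in the steps: a minimum of 1, a maximum of 1, and a value requirement of “optional” when none is declared.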
Field Value Validation
validate_field_value(fieldValue: FieldValue, embeddedField: EmbeddedField)
- Let fieldSpec = the FieldSpec of the Field referenced by embeddedField.
- For each value in fieldValue.values: run validate_value(value, fieldSpec).
validate_value(value: Value, fieldSpec: FieldSpec)
Dispatch on the kind of fieldSpec:
- TextFieldSpec → validate_text_value(value, fieldSpec)
- IntegerNumberFieldSpec → validate_integer_number_value(value, fieldSpec)
- RealNumberFieldSpec → validate_real_number_value(value, fieldSpec)
- BooleanFieldSpec → validate_boolean_value(value, fieldSpec)
- DateFieldSpec → validate_date_value(value, fieldSpec)
- TimeFieldSpec → validate_time_value(value, fieldSpec)
- DateTimeFieldSpec → validate_datetime_value(value, fieldSpec)
- ControlledTermFieldSpec → validate_controlled_term_value(value, fieldSpec)
- SingleValuedEnumFieldSpec or MultiValuedEnumFieldSpec → validate_enum_value(value, fieldSpec)
- LinkFieldSpec → validate_link_value(value)
- EmailFieldSpec or PhoneNumberFieldSpec → validate_contact_value(value)
- OrcidFieldSpec, RorFieldSpec, DoiFieldSpec, PubMedIdFieldSpec, RridFieldSpec, or NihGrantIdFieldSpec → validate_external_authority_value(value, fieldSpec)
- AttributeValueFieldSpec → validate_attribute_value(value)
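One way to realise this dispatch is a lookup table from field-spec kind to subroutine name. The sketch below only maps names to names so that it stays self-contained; the table `VALIDATOR_FOR_SPEC` and the function `validator_name` are hypothetical, and a real binding would map to the callables themselves.

```python
# Dispatch table transcribed from the list above (kinds and subroutines as strings).
VALIDATOR_FOR_SPEC = {
    "TextFieldSpec": "validate_text_value",
    "IntegerNumberFieldSpec": "validate_integer_number_value",
    "RealNumberFieldSpec": "validate_real_number_value",
    "BooleanFieldSpec": "validate_boolean_value",
    "DateFieldSpec": "validate_date_value",
    "TimeFieldSpec": "validate_time_value",
    "DateTimeFieldSpec": "validate_datetime_value",
    "ControlledTermFieldSpec": "validate_controlled_term_value",
    "SingleValuedEnumFieldSpec": "validate_enum_value",
    "MultiValuedEnumFieldSpec": "validate_enum_value",
    "LinkFieldSpec": "validate_link_value",
    "AttributeValueFieldSpec": "validate_attribute_value",
}
# Families that share one subroutine are filled in by group.
for kind in ("EmailFieldSpec", "PhoneNumberFieldSpec"):
    VALIDATOR_FOR_SPEC[kind] = "validate_contact_value"
for kind in ("OrcidFieldSpec", "RorFieldSpec", "DoiFieldSpec",
             "PubMedIdFieldSpec", "RridFieldSpec", "NihGrantIdFieldSpec"):
    VALIDATOR_FOR_SPEC[kind] = "validate_external_authority_value"


def validator_name(field_spec_kind: str) -> str:
    """Return the subroutine name for a field-spec kind; raises KeyError if unknown."""
    return VALIDATOR_FOR_SPEC[field_spec_kind]
```

The table has one entry per concrete field family, so its size can serve as a quick completeness check against the twenty families the model defines.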
validate_text_value(value: TextValue, fieldSpec: TextFieldSpec)
- Let lexicalForm = value.value.
- If fieldSpec.min_length is present: verify len(lexicalForm) ≥ fieldSpec.min_length.
  On failure:
  - category: structural
  - path: <value>/value
  - production: TextValue
  - message: "value length below TextFieldSpec.minLength"
- If fieldSpec.max_length is present: verify len(lexicalForm) ≤ fieldSpec.max_length.
  On failure:
  - category: structural
  - path: <value>/value
  - production: TextValue
  - message: "value length above TextFieldSpec.maxLength"
- If fieldSpec.validation_regex is present: verify lexicalForm matches fieldSpec.validation_regex.
  On failure:
  - category: structural
  - path: <value>/value
  - production: TextValue
  - message: "value does not match TextFieldSpec.validationRegex"
- If value.lang is present: verify it conforms to the Bcp47Tag lexical form (RFC 5646).
  On failure:
  - category: lexical
  - path: <value>/lang
  - production: TextValue
  - message: "lang is not a well-formed BCP 47 tag"
- If fieldSpec.lang_tag_requirement = “langTagRequired”: verify value.lang is present.
  On failure:
  - category: structural
  - path: <value>/lang
  - production: TextValue
  - message: "lang tag missing; TextFieldSpec.langTagRequirement is 'langTagRequired'"
- If fieldSpec.lang_tag_requirement = “langTagForbidden”: verify value.lang is absent.
  On failure:
  - category: structural
  - path: <value>/lang
  - production: TextValue
  - message: "lang tag present; TextFieldSpec.langTagRequirement is 'langTagForbidden'"
validate_integer_number_value(value: IntegerNumberValue, fieldSpec: IntegerNumberFieldSpec)
- Verify value.value conforms to the IntegerLexicalForm (regex ^-?(0|[1-9][0-9]*)$). Let n = its integer value.
  On failure: category lexical; path <value>/value; production IntegerNumberValue; message "value is not a well-formed IntegerLexicalForm".
- If fieldSpec.min_value is present: verify n ≥ fieldSpec.min_value.value (compared as integers).
  On failure: category structural; path <value>/value; production IntegerNumberValue; message "value below IntegerNumberFieldSpec.minValue".
- If fieldSpec.max_value is present: verify n ≤ fieldSpec.max_value.value (compared as integers).
  On failure: category structural; path <value>/value; production IntegerNumberValue; message "value above IntegerNumberFieldSpec.maxValue".
validate_real_number_value(value: RealNumberValue, fieldSpec: RealNumberFieldSpec)
- Verify value.datatype = fieldSpec.datatype (one of decimal, float, double).
  On failure: category structural; path <value>/datatype; production RealNumberValue; message "datatype does not match the enclosing RealNumberFieldSpec.datatype".
- Verify value.value is a well-formed lexical form for that datatype. Let n = its numeric value.
  On failure: category lexical; path <value>/value; production RealNumberValue; message "value is not a well-formed lexical form for datatype <datatype>".
- If fieldSpec.min_value is present: verify n ≥ fieldSpec.min_value.value (compared as numbers under fieldSpec.datatype’s ordering).
  On failure: category structural; path <value>/value; production RealNumberValue; message "value below RealNumberFieldSpec.minValue".
- If fieldSpec.max_value is present: verify n ≤ fieldSpec.max_value.value (compared as numbers under fieldSpec.datatype’s ordering).
  On failure: category structural; path <value>/value; production RealNumberValue; message "value above RealNumberFieldSpec.maxValue".
Comparison semantics for float and double. The numeric value n MAY be NaN, +INF, or -INF (these are part of the xsd:float and xsd:double lexical spaces). The bound comparisons in steps 3 and 4 follow IEEE 754 ordering:
- If n is NaN, every comparison n ≥ x and n ≤ x is false. A NaN value therefore violates any present minValue or maxValue bound and reports the corresponding bound-failure error.
- If n is +INF, then n ≥ x is true for every finite x and n ≤ x is true only when x is +INF.
- If n is -INF, then n ≤ x is true for every finite x and n ≥ x is true only when x is -INF.
This convention matches the ordinary IEEE 754 comparison predicates, under which NaN is unordered. It is not the IEEE 754 totalOrder relation, which assigns NaN a position in the order. Bindings SHOULD use their host language’s IEEE 754-compliant comparison primitives.
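The convention can be illustrated in Python, whose float type is an IEEE 754 double on mainstream platforms. The within_bounds helper is hypothetical and mirrors only the two bound checks.

```python
from typing import Optional


def within_bounds(n: float,
                  min_value: Optional[float] = None,
                  max_value: Optional[float] = None) -> bool:
    """Return True when n satisfies every bound that is present."""
    # Written as "not (n >= bound)" on purpose: "n < bound" is also false
    # for NaN, so the naive form would let NaN pass a present bound.
    if min_value is not None and not (n >= min_value):
        return False
    if max_value is not None and not (n <= max_value):
        return False
    return True


nan = float("nan")
inf = float("inf")
```

With these definitions, within_bounds(nan, min_value=0.0) and within_bounds(nan, max_value=0.0) are both False, while within_bounds(nan) is True because no bound is present to violate.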
validate_boolean_value(value: BooleanValue, fieldSpec: BooleanFieldSpec)
- Verify value.value is true or false.
  On failure: category wireShape; path <value>/value; production BooleanValue; message "value must be a JSON boolean".
validate_date_value(value: DateValue, fieldSpec: DateFieldSpec)
- If fieldSpec.date_value_type = “year”: verify value is a YearValue whose value matches [0-9]{4}.
  On failure (arm): category structural; path <value>; production DateValue; message "DateFieldSpec.dateValueType 'year' admits only YearValue".
  On failure (lexical): category lexical; path <value>/value; production YearValue; message "value does not match YYYY".
- If fieldSpec.date_value_type = “yearMonth”: verify value is a YearMonthValue whose value matches [0-9]{4}-(0[1-9]|1[0-2]).
  On failure (arm): category structural; path <value>; production DateValue; message "DateFieldSpec.dateValueType 'yearMonth' admits only YearMonthValue".
  On failure (lexical): category lexical; path <value>/value; production YearMonthValue; message "value does not match YYYY-MM".
- If fieldSpec.date_value_type = “fullDate”: verify value is a FullDateValue whose value is a well-formed xsd:date lexical form.
  On failure (arm): category structural; path <value>; production DateValue; message "DateFieldSpec.dateValueType 'fullDate' admits only FullDateValue".
  On failure (lexical): category lexical; path <value>/value; production FullDateValue; message "value is not a well-formed xsd:date lexical form".
validate_time_value(value: TimeValue, fieldSpec: TimeFieldSpec)
For each step below that verifies a precision constraint, on failure: structural at <value>/value, production TimeValue, message “value does not match the precision required by TimeFieldSpec.timePrecision”. For lexical-form failures (xsd:time ill-formedness), the category is lexical instead.
- Let t = value.value.
- If fieldSpec.time_precision = “hourMinute”: verify t contains only hour and minute components (form HH:MM; no seconds or fractional seconds present).
- If fieldSpec.time_precision = “hourMinuteSecond”: verify t contains hour, minute, and second components (form HH:MM:SS; no fractional seconds present).
- If fieldSpec.time_precision = “hourMinuteSecondFraction”: verify t is a well-formed xsd:time lexical form; fractional seconds are permitted.
- If fieldSpec.time_precision is absent: verify t is a well-formed xsd:time lexical form.
- If fieldSpec.timezone_requirement = “timezoneRequired”: verify t includes a timezone designator.
  On failure: category structural; path <value>/value; production TimeValue; message "timezone designator missing; TimeFieldSpec.timezoneRequirement is 'timezoneRequired'".
validate_datetime_value(value: DateTimeValue, fieldSpec: DateTimeFieldSpec)
For each step below that verifies a precision constraint, on failure: structural at <value>/value, production DateTimeValue, message “value does not match the precision required by DateTimeFieldSpec.dateTimeValueType”. For lexical-form failures (xsd:dateTime ill-formedness), the category is lexical instead.
- Let dt = value.value.
- If fieldSpec.datetime_value_type = “dateHourMinute”: verify the time component of dt contains only hour and minute (form …THH:MM; no seconds present).
- If fieldSpec.datetime_value_type = “dateHourMinuteSecond”: verify the time component contains hour, minute, and second (form …THH:MM:SS; no fractional seconds present).
- If fieldSpec.datetime_value_type = “dateHourMinuteSecondFraction”: verify dt is a well-formed xsd:dateTime lexical form; fractional seconds are permitted.
- If fieldSpec.timezone_requirement = “timezoneRequired”: verify dt includes a timezone designator.
  On failure: category structural; path <value>/value; production DateTimeValue; message "timezone designator missing; DateTimeFieldSpec.timezoneRequirement is 'timezoneRequired'".
validate_controlled_term_value(value: ControlledTermValue, fieldSpec: ControlledTermFieldSpec)
- Verify value.term_iri is present.
  On failure: category wireShape; path <value>/term; production ControlledTermValue; message "term is required".
- Warn if value.label is absent.
  On warning: category structural; path <value>/label; production ControlledTermValue; message "label SHOULD be present so consumers without ontology access can render the term".
Note: validation of value.term_iri against fieldSpec.controlled_term_sources requires an external ontology resolver and is outside the scope of this algorithm; see Out of Scope.
validate_enum_value(value: EnumValue, fieldSpec: EnumFieldSpec)
- Verify there exists a pv in fieldSpec.permissible_values such that value.value = pv.value (string equality, character by character).
  On failure: category structural; path <value>/value; production EnumValue; message "value does not match any of the spec's permissibleValues tokens".
validate_link_value(value: LinkValue)
- Verify value.iri is present and is a well-formed IRI.
  On failure (missing): category wireShape; path <value>/iri; production LinkValue; message "iri is required".
  On failure (malformed): category lexical; path <value>/iri; production LinkValue; message "iri is not a valid IRI".
validate_contact_value(value: ContactValue)
- If value is an EmailValue: verify value.value is a non-empty lexical form.
  On failure: category wireShape; path <value>/value; production EmailValue; message "value must be a non-empty string".
- If value is a PhoneNumberValue: verify value.value is a non-empty lexical form.
  On failure: category wireShape; path <value>/value; production PhoneNumberValue; message "value must be a non-empty string".
validate_external_authority_value(value: ExternalAuthorityValue, fieldSpec: ExternalAuthorityFieldSpec)
Each external-authority Value carries a typed IRI specialised for its authority. The lexical patterns below are recommended (suitable for syntactic conformance checking) but are not structurally normative beyond Iri well-formedness; binding-level validators MAY apply stricter checks.
| Field spec | Required IRI | Recommended pattern |
|---|---|---|
| OrcidFieldSpec | OrcidIri | https://orcid\.org/\d{4}-\d{4}-\d{4}-\d{3}[0-9X] |
| RorFieldSpec | RorIri | https://ror\.org/0[a-hj-km-np-tv-z0-9]{6}[0-9]{2} |
| DoiFieldSpec | DoiIri | https://doi\.org/10\.\d{4,9}/.+ |
| PubMedIdFieldSpec | PubMedIri | https://pubmed\.ncbi\.nlm\.nih\.gov/\d+ |
| RridFieldSpec | RridIri | https://identifiers\.org/RRID:[A-Z]+_\d+ |
| NihGrantIdFieldSpec | NihGrantIri | (see Out of Scope) |
In every case the procedure is: verify value is the corresponding XxxValue and that its iri slot is present and is a well-formed Iri per grammar.md §Primitive String Types. Implementations MAY additionally check the recommended pattern.
- On failure (iri missing): category wireShape; path <value>/iri; production the production naming value's family; message "iri is required".
- On failure (iri malformed): category lexical; path <value>/iri; production the production naming value's family; message "iri is not a valid Iri".
- On failure (recommended pattern not matched, where an implementation checks it): category lexical; path <value>/iri; production the production naming value's family; message "iri does not match the recommended pattern for <Authority>".
validate_attribute_value(value: AttributeValue)
- Verify value.name is present and contains a non-empty string.
  On failure: category wireShape; path <value>/name; production AttributeValue; message "name must be a non-empty string".
- Verify value.value is present and is a well-formed Value.
  On failure: category wireShape; path <value>/value; production AttributeValue; message "value is required and must be a Value".
- If value.value is an AttributeValue: run validate_attribute_value(value.value).
Nested Template Validation
validate_nested_template_presence_and_cardinality(instance: TemplateInstance, template: Template)
Applies the Cardinality Consistency and Cardinality Defaults and Multiplicity rules.
For each embeddedTemplate in template.embedded_artifacts where embeddedTemplate is an EmbeddedTemplate:
- Let eff_min = embeddedTemplate.cardinality.min_cardinality if present, else 1.
- Let eff_max = embeddedTemplate.cardinality.max_cardinality if present, else 1. If eff_max is UnboundedCardinality, let eff_max = ∞.
- Let req = embeddedTemplate.value_requirement if present, else “optional”.
- Let n = count({ nestedInstance | nestedInstance ∈ instance.instance_values, nestedInstance is NestedTemplateInstance, nestedInstance.key = embeddedTemplate.key }).
- If req = “required”:
  - Verify n ≥ eff_min.
    On failure: category structural; path <instance>/values; production TemplateInstance; message "required NestedTemplateInstance count below minimum (got <n>, expected ≥ <eff_min>) for key '<embeddedTemplate.key>'".
  - If eff_max ≠ ∞: verify n ≤ eff_max.
    On failure: category structural; path <instance>/values; production TemplateInstance; message "NestedTemplateInstance count above maximum (got <n>, expected ≤ <eff_max>) for key '<embeddedTemplate.key>'".
- If req = “recommended” or req = “optional”:
  - If n > 0:
    - Verify n ≥ eff_min.
      On failure: category structural; path <instance>/values; production TemplateInstance; message "NestedTemplateInstance count below minimum (got <n>, expected ≥ <eff_min>) for key '<embeddedTemplate.key>'".
    - If eff_max ≠ ∞: verify n ≤ eff_max.
      On failure: category structural; path <instance>/values; production TemplateInstance; message "NestedTemplateInstance count above maximum (got <n>, expected ≤ <eff_max>) for key '<embeddedTemplate.key>'".
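The presence and cardinality rule for one embedded template can be sketched as a small Python function. The function name and parameter names are hypothetical; math.inf stands in for UnboundedCardinality, and the count n is assumed to have been computed by the caller.

```python
import math
from typing import List, Optional


def check_nested_count(n: int, key: str, req: str = "optional",
                       min_cardinality: Optional[int] = None,
                       max_cardinality: Optional[float] = None) -> List[str]:
    eff_min = 1 if min_cardinality is None else min_cardinality
    eff_max = 1 if max_cardinality is None else max_cardinality  # math.inf = unbounded
    required = req == "required"
    if not required and n == 0:
        return []  # recommended or optional, and absent: nothing to check
    prefix = "required " if required else ""
    messages: List[str] = []
    if n < eff_min:
        messages.append(
            f"{prefix}NestedTemplateInstance count below minimum "
            f"(got {n}, expected ≥ {eff_min}) for key '{key}'")
    if eff_max != math.inf and n > eff_max:
        messages.append(
            f"NestedTemplateInstance count above maximum "
            f"(got {n}, expected ≤ {eff_max}) for key '{key}'")
    return messages
```

The only difference between the required and non-required branches is that an absent non-required template skips both checks, so the sketch folds the two branches into one code path.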
Out of Scope
The following checks are outside the scope of the canonical algorithm and are not required for conformance:
- ControlledTermSource membership: verifying that a ControlledTermValue’s TermIri is drawn from a declared ontology, branch, class set, or value set requires an external ontology resolver and is not defined here.
- NIH Grant ID pattern: the lexical pattern for NihGrantIri is currently unspecified.
- AttributeValueField name validation: attribute names are not fixed at schema definition time and cannot be structurally validated against the schema.
Open Questions
- Which validation rules should be mandatory in the core specification versus deferred to profile-specific extensions?
RDF Projection
This section defines a projection from CEDAR Value instances to RDF. The projection is a derived view: CEDAR’s abstract grammar and wire form are CEDAR-native, and RDF is one consumer of the data, not the substrate of it. RDF tooling that consumes CEDAR instance data uses this projection; tooling that does not need RDF ignores it.
The projection is:
- total: every Value admitted by the abstract grammar projects to a unique RDF term (literal or IRI) plus zero or more accompanying triples,
- deterministic: given the same input Value, every conforming projection produces the same RDF, and
- mechanical: the rules below are the entire definition; no interpretive judgement is required.
The projection is informative with respect to the abstract grammar and wire grammar — it does not constrain how Value instances are encoded on the wire or represented in memory. It is normative for any RDF emitter that claims to project CEDAR instance data: a conforming emitter MUST produce the RDF specified here.
Vocabularies
The projection uses the following IRI prefixes:
| Prefix | IRI |
|---|---|
| xsd: | http://www.w3.org/2001/XMLSchema# |
| rdf: | http://www.w3.org/1999/02/22-rdf-syntax-ns# |
| rdfs: | http://www.w3.org/2000/01/rdf-schema# |
| skos: | http://www.w3.org/2004/02/skos/core# |
| dc: | http://purl.org/dc/terms/ |
No CEDAR-specific RDF vocabulary is introduced; the projection uses only RDF, RDFS, SKOS, XSD, and Dublin Core terms.
Per-variant projection
Each Value variant projects to a single RDF term. The “RDF term” column gives the produced node. The “Accompanying triples” column lists triples that travel with the term when the term is the object of an enclosing statement (for example, the value of an EmbeddedField instance). The exact subject/predicate of the enclosing statement is determined by the surrounding structure and is out of scope for this section.
Scalar values
| Value variant | RDF term | Accompanying triples |
|---|---|---|
| TextValue { value, lang } (lang present) | "value"@lang (rdf:langString) | none |
| TextValue { value } (lang absent) | "value"^^xsd:string | none |
| IntegerNumberValue { value } | "value"^^xsd:integer | none |
| RealNumberValue { value, datatype } | "value"^^xsd:<datatype> | none |
| BooleanValue { value } | "value"^^xsd:boolean ("true" or "false") | none |
For RealNumberValue, the <datatype> placeholder is the lexical name of the carried RealNumberDatatypeKind (decimal, float, or double), expanded against xsd:.
When the originating TextFieldSpec carries LangTagRequirement, the projection is pinned to a single RDF literal shape: "langTagRequired" always projects to rdf:langString literals; "langTagForbidden" always projects to xsd:string literals. "langTagOptional" (the default) admits either shape and projects each TextValue according to whether its lang slot is present.
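The scalar rows can be sketched as a function that renders one RDF term in N-Triples-style syntax. This is an illustrative sketch under assumptions: values are taken as decoded wire objects (Python dicts) whose property names follow the table, the function name is hypothetical, and escaping covers only the most common characters.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"


def _escape(text: str) -> str:
    # Minimal literal escaping (assumption: enough for this sketch only).
    return (text.replace("\\", "\\\\").replace('"', '\\"')
                .replace("\n", "\\n").replace("\r", "\\r"))


def project_scalar(value: dict) -> str:
    kind = value["kind"]
    if kind == "TextValue":
        lexical = _escape(value["value"])
        if "lang" in value:
            return f'"{lexical}"@{value["lang"]}'  # rdf:langString
        return f'"{lexical}"^^<{XSD}string>'
    if kind == "IntegerNumberValue":
        return f'"{value["value"]}"^^<{XSD}integer>'
    if kind == "RealNumberValue":
        # datatype is one of decimal, float, double, expanded against xsd:
        return f'"{value["value"]}"^^<{XSD}{value["datatype"]}>'
    if kind == "BooleanValue":
        return f'"{"true" if value["value"] else "false"}"^^<{XSD}boolean>'
    raise ValueError(f"not a scalar Value kind: {kind!r}")
```

Under these assumptions, a TextValue with value "hi" and lang "en" renders as "hi"@en, and the same value without lang renders as an xsd:string literal.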
Temporal values
| Value variant | RDF term |
|---|---|
| YearValue { value } | "value"^^xsd:string |
| YearMonthValue { value } | "value"^^xsd:string |
| FullDateValue { value } | "value"^^xsd:date |
| TimeValue { value } | "value"^^xsd:time |
| DateTimeValue { value } | "value"^^xsd:dateTime |
YearValue and YearMonthValue project to xsd:string literals. The temporal nature of the value is recoverable from the surrounding FieldSpec if needed; the projection does not introduce xsd:gYear or xsd:gYearMonth typed literals.
Contact values
| Value variant | RDF term |
|---|---|
| EmailValue { value } | "value"^^xsd:string |
| PhoneNumberValue { value } | "value"^^xsd:string |
IRI-bearing values
LinkValue, OrcidValue, RorValue, DoiValue, PubMedIdValue, RridValue, and NihGrantIdValue each project to a plain RDF IRI node.
| Value variant | RDF term | Accompanying triples |
|---|---|---|
| LinkValue { iri, label } | <iri> | if label present: <iri> rdfs:label "label" |
| OrcidValue { iri, label } | <iri> | if label present: <iri> rdfs:label "label" |
| RorValue { iri, label } | <iri> | if label present: <iri> rdfs:label "label" |
| DoiValue { iri, label } | <iri> | if label present: <iri> rdfs:label "label" |
| PubMedIdValue { iri, label } | <iri> | if label present: <iri> rdfs:label "label" |
| RridValue { iri, label } | <iri> | if label present: <iri> rdfs:label "label" |
| NihGrantIdValue { iri, label } | <iri> | if label present: <iri> rdfs:label "label" |
The label is a MultilingualString on every IRI-bearing value. Each localization produces a separate rdfs:label triple. A label with no localizations (a single und-tagged entry) produces a single rdfs:label "label"@und triple.
Controlled-term values
ControlledTermValue projects to the term IRI together with optional metadata triples drawn from the optional Label, Notation, and PreferredLabel slots:
| Slot | Triple emitted (when present) |
|---|---|
| label | <term> rdfs:label "label"@lang for each localization in the MultilingualString |
| notation | <term> skos:notation "notation"^^xsd:string |
| preferredLabel | <term> skos:prefLabel "preferredLabel"@lang for each localization in the MultilingualString |
The accompanying-triple count is therefore variable: zero (no optional slots), one, two, or more (when label or preferred label carry several localizations).
Enum values
An enum value’s RDF projection requires the surrounding EnumFieldSpec context: the value carries a bare Token, and the spec’s PermissibleValue+ list supplies the per-token Label, Description, and Meaning metadata that the projection draws on. This is the only Value whose RDF lift cannot be determined from the value alone.
EnumValue { value: T } projects as follows:
- Look up T in the referenced EnumFieldSpec’s PermissibleValue entries to obtain the matching pv.
- If pv carries one or more Meaning entries, project as one RDF IRI node per Meaning (that is, an enum value with n meanings projects to n IRI nodes). Each IRI node carries rdfs:label triples drawn from the matching Meaning’s own label (one triple per localization in the MultilingualString); if the Meaning carries no label, rdfs:label triples are drawn from the enclosing pv.label instead, providing a fallback display label when the bound term’s own label is not cached. dc:description triples are drawn from pv.description (one per localization). When this rule yields more than one RDF term, the surrounding statement that targets the enum value is duplicated once per term.
- If pv carries no Meaning, project as "T"^^xsd:string. The accompanying rdfs:label and dc:description triples are not emitted in this case (the value is a bare lexical token).
A conforming RDF emitter MUST therefore have access to the EnumFieldSpec of the surrounding EmbeddedField when projecting an EnumValue. RDF emitters that lift CEDAR data without schema context cannot project enum values faithfully.
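The enum rule can be sketched in Python as below. The data shapes are assumptions made for illustration: each permissible value is a dict with value, optional label and description (each a language-tag-to-text mapping standing in for MultilingualString), and optional meanings (each with an iri and optional label). The function name is hypothetical, literal escaping is omitted, and attaching the dc:description triples to each Meaning node is one reading of the rule.

```python
from typing import Dict, List, Tuple

RDFS_LABEL = "<http://www.w3.org/2000/01/rdf-schema#label>"
DC_DESCRIPTION = "<http://purl.org/dc/terms/description>"
XSD_STRING = "<http://www.w3.org/2001/XMLSchema#string>"

Triple = Tuple[str, str, str]


def project_enum_value(token: str,
                       permissible_values: List[dict]) -> List[Tuple[str, List[Triple]]]:
    """Return one (term, accompanying triples) pair per projected RDF term."""
    pv = next((p for p in permissible_values if p["value"] == token), None)
    if pv is None:
        raise ValueError(f"token {token!r} is not among the permissible values")
    meanings = pv.get("meanings") or []
    if not meanings:
        return [(f'"{token}"^^{XSD_STRING}', [])]  # bare token, no extra triples
    results = []
    for meaning in meanings:
        node = f'<{meaning["iri"]}>'
        # The Meaning's own label wins; pv.label is the fallback.
        labels: Dict[str, str] = meaning.get("label") or pv.get("label") or {}
        triples = [(node, RDFS_LABEL, f'"{text}"@{lang}') for lang, text in labels.items()]
        triples += [(node, DC_DESCRIPTION, f'"{text}"@{lang}')
                    for lang, text in (pv.get("description") or {}).items()]
        results.append((node, triples))
    return results
```

Returning a list makes the duplication rule explicit: a caller emits the enclosing statement once per returned term.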
Attribute value
AttributeValue { name, value } carries an attribute name and a nested value. The grammar types name as a Unicode string (the AttributeName production), not an Iri — attribute names are not constrained to be IRIs at the abstract-grammar level.
The projection treats name as the IRI of the predicate connecting the enclosing subject to the projected value; the projected RDF term is the projection of the nested value. The wrapper introduces one triple of the form <subject> <name> <projected-value>, where <subject> is supplied by the enclosing structure. The accompanying triples of the nested value (if any) travel with the projected value as for any other position.
For projection to succeed, the name string MUST be resolvable to a syntactically valid IRI — either because it is already an absolute IRI, or because the consuming tool resolves a relative name against an enclosing namespace before projection. CEDAR data whose name strings cannot be resolved this way is not projectable to RDF; tooling SHOULD either supply a default namespace or refuse to project such instances.
Annotation
An Annotation carries a property IRI and a body of polymorphic kind (AnnotationStringValue or AnnotationIriValue per grammar.md §Annotations). On any artifact carrying annotations (Field, Template, PresentationComponent), the annotation projects to a single triple whose subject is the artifact’s IRI, predicate is the annotation property, and object depends on the body kind:
| Annotation body kind | RDF term for the object |
|---|---|
| AnnotationStringValue { value, lang } (lang present) | "value"@lang (rdf:langString) |
| AnnotationStringValue { value } (lang absent) | "value"^^xsd:string |
| AnnotationIriValue { iri } | <iri> |
Each annotation produces exactly one triple. Multiple annotations on the same artifact produce one triple each. Annotations are projected only when the surrounding artifact is itself projected; the wrapping Annotation carries no other RDF presence.
Round-trip and faithfulness
The projection is forward-only by design: it converts CEDAR Value instances into RDF. The reverse direction (lifting an arbitrary RDF graph back into CEDAR Value instances) is not defined by this specification. RDF data produced by this projection MAY be re-ingested into CEDAR by tooling that knows the source FieldSpec for each value position; in the absence of FieldSpec context the reverse direction is ambiguous.
Within the projection itself, CEDAR-side identity is preserved: two CEDAR Value instances with identical content project to RDF terms that are RDF-term-equal. Two CEDAR Value instances of the same variant that differ in any projected component project to RDF terms that differ in either the term itself or in the accompanying triples. Values of different variants can still coincide, because the kind discriminator is not projected (see Non-projected information).
Non-projected information
The following CEDAR information is not carried by the projection:
- the kind discriminator of each Value variant: it is not preserved as an RDF triple. Variants whose RDF terms coincide (for example, EmailValue and PhoneNumberValue both projecting to xsd:string literals) cannot be distinguished from RDF alone,
- presentation hints, label overrides, visibility, and other embedding-level configuration carried by EmbeddedField properties: the projection covers Value content only,
- field-spec metadata such as units, validation regexes, or rendering hints: these are properties of the schema, not of the value,
- default values at either layer (XxxFieldSpec.defaultValue and EmbeddedXxxField.defaultValue): defaults are UI/UX initialisation only and never appear in TemplateInstance artifacts (see grammar.md §Defaults and instances.md). The projection sees only the values an instance actually carries; defaults that were accepted are projected as the chosen value (indistinguishable from a user-typed identical value), and defaults that were not accepted are simply absent.
Tooling that requires faithful round-tripping of these CEDAR-native concerns SHOULD work directly with the wire form rather than relying on the RDF projection.
Host-Language Bindings
This document gives guidance on how to map the abstract grammar
(grammar.md) and the JSON wire format
(wire-grammar.md) onto host-language types and
idioms in TypeScript, Java, and Python.
1. Purpose and Scope
The CEDAR Structural Model is layered:
- grammar.md defines what the model is: the abstract productions, their components, and the structural invariants they satisfy.
- wire-grammar.md defines what the JSON looks like: exactly one JSON shape per abstract production, with discriminator placement and inline constraints.
- serialization.md defines the encoding rules that frame the wire shapes: property naming, NFC normalisation, big-integer fallback, the wrapping principle.
- This document defines how those JSON shapes become host-language values: the in-memory types a binding library exposes, and the idioms it follows.
Where the prior three documents are normative, this one is recommendation-grade. A binding conforms by realising the meta-categories in §2 with idioms compatible with the spirit of the recommendations below; deviations are allowed but SHOULD be documented in the binding’s own README.
In scope: TypeScript, Java (17+), Python (3.11+).
Out of scope (for now): Rust, Go, C#, Swift, Kotlin, and other languages. New languages can be added by following the meta-pattern in §2 — for each category, name an idiomatic realisation that preserves the wire round-trip and the construction-time invariants.
The reference TypeScript implementation is
cedar-ts (npm package
@metadatacenter/cedar-model); see §5. For idioms not covered
explicitly here, cedar-ts is the source of truth on the TS side.
2. Meta-Categories
Each subsection below covers one structural pattern that recurs across the wire grammar. For every category we give:
- a one-paragraph definition in terms of the grammar / wire-grammar;
- a TypeScript idiom (reflecting cedar-ts);
- a Java idiom (Java 17+, Jackson 2.x with jackson-databind and jackson-datatype-jdk8);
- a Python idiom (Pydantic v2; attrs/dataclass mentioned where appropriate);
- validation guidance: when and where the binding enforces the associated constraints;
- a small worked example translating the same abstract production three ways.
Reading note: Jackson-annotation density. The Java idioms in this section annotate every record component explicitly with @JsonProperty(...) and every mapping constructor with @JsonCreator. The intent is unambiguous wire-to-Java mapping that does not depend on parameter-name reflection (the -parameters compiler flag) or on positional binding. A real binding MAY rely on Jackson defaults (e.g. record component name introspection plus implicit canonical-constructor binding) and elide most of these annotations; the explicit form here is the lower bound on what the binding contract requires, not the only style permitted.
Forward references. §2 mentions cedar-ts module names (leaves/, embedded/, etc.) and a few productions (EmbeddedArtifact, Template.members) before they are introduced in detail. The cedar-ts module layout is in §5; the productions themselves are defined in grammar.md and wire-grammar.md.
2.1 Plain object production
What it is. A wire production written as T ::: object { ... }
with no "kind": "..." literal property. These are the
singleton-only productions of the wire grammar — productions that
never appear as alternatives in any discriminator: kind union, and
therefore never carry kind on the wire (per the kind rule,
wire-grammar.md §1.5). Examples: Cardinality,
Property, LabelOverride, LifecycleMetadata,
SchemaArtifactVersioning, Annotation, Unit, OntologyReference,
OntologyDisplayHint, ControlledTermClass, PermissibleValue,
Meaning.
TypeScript idiom. A readonly interface plus a constructor
function. No kind field on the interface.
export interface Cardinality {
  readonly min: number;
  readonly max?: number;
}
export interface CardinalityInit {
  readonly min: number;
  readonly max?: number;
}
export function cardinality(init: CardinalityInit): Cardinality {
  const out: { min: number; max?: number } = {
    min: assertNonNegativeInteger(init.min),
  };
  if (init.max !== undefined) out.max = assertNonNegativeInteger(init.max);
  return out;
}
Java idiom. A record whose components mirror the wire properties.
No Jackson type info is needed because the value lives at a singleton
position and is decoded by its enclosing field’s static type.
public record Cardinality(
    @JsonProperty("min") int min,
    @JsonProperty("max") @JsonInclude(NON_NULL) Integer max) {
  @JsonCreator
  public Cardinality {
    if (min < 0) throw new CedarConstructionException("Cardinality.min must be >= 0");
    if (max != null && max < 0) throw new CedarConstructionException("Cardinality.max must be >= 0");
  }
}
Python idiom. A Pydantic v2 model with frozen=True and aliases
for any name that differs from snake_case.
from pydantic import BaseModel, ConfigDict, Field

class Cardinality(BaseModel):
    model_config = ConfigDict(frozen=True, populate_by_name=True)
    min: int = Field(ge=0)
    max: int | None = Field(default=None, ge=0)
Validation guidance. Range checks (e.g., min >= 0) and any
inline constraints from wire-grammar.md apply at construction time.
The constructed value is always valid; downstream code never has to
revalidate.
Worked example: Cardinality { min: number; max?: number }. The
wire shape is { "min": 0, "max": 5 }; the three idioms above produce
that JSON via their language’s natural serializer (TS via plain
JSON.stringify; Java via Jackson default mapper; Python via
model_dump_json()).
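Where a binding prefers the standard library to Pydantic, the same production can be sketched with a frozen dataclass. This is an illustrative alternative under stated assumptions, not a recommended idiom from this document: the to_wire method name is hypothetical, and omitting max from the wire object when it is absent mirrors the NON_NULL behaviour of the Java idiom above.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cardinality:
    min: int
    max: Optional[int] = None

    def __post_init__(self) -> None:
        # Construction-time validation: a constructed value is always valid.
        for name in ("min", "max"):
            v = getattr(self, name)
            if name == "max" and v is None:
                continue
            if isinstance(v, bool) or not isinstance(v, int) or v < 0:
                raise ValueError(f"Cardinality.{name} must be a non-negative integer")

    def to_wire(self) -> dict:
        # Hypothetical helper: build the JSON object, omitting an absent max.
        wire = {"min": self.min}
        if self.max is not None:
            wire["max"] = self.max
        return wire


encoded = json.dumps(Cardinality(min=0, max=5).to_wire())
```

The explicit bool check is there because Python treats True and False as integers, which would otherwise let Cardinality(min=True) through.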
2.2 Discriminated union with kind tag
What it is. A wire production written as T ::: A | B | … with
either an explicit // discriminator: kind comment or no discriminator
comment at all (in which case kind is the default per
wire-grammar.md §1.3). Each member is an object
production whose shape includes a "kind": "MemberName" literal
property. Examples: Value, FieldSpec, EmbeddedArtifact,
ControlledTermSource, PresentationComponent, InstanceValue,
SchemaArtifact, Artifact,
ExternalAuthorityValue, DateValue.
TypeScript idiom. A discriminated (tagged) union of interfaces, all
sharing a kind: "..." field as their literal-typed discriminant.
Construction goes through per-variant factory functions; type-narrowing
is by switch on value.kind.
export interface TextValue {
  readonly kind: 'TextValue';
  readonly value: string;
  readonly lang?: LanguageTag;
}
export interface IntegerNumberValue {
  readonly kind: 'IntegerNumberValue';
  readonly value: string;
}
export type Value = TextValue | IntegerNumberValue /* | … */;
export function textValue(value: string, lang?: LanguageTag): TextValue {
  return lang === undefined
    ? { kind: 'TextValue', value }
    : { kind: 'TextValue', value, lang };
}
Java idiom. A sealed interface with one record per variant and
Jackson’s polymorphic-type annotations using the property name kind.
@JsonTypeInfo(use = JsonTypeInfo.Id.NAME, property = "kind")
@JsonSubTypes({
    @JsonSubTypes.Type(value = TextValue.class, name = "TextValue"),
    @JsonSubTypes.Type(value = IntegerNumberValue.class, name = "IntegerNumberValue")
})
public sealed interface Value permits TextValue, IntegerNumberValue { }
@JsonTypeName("TextValue")
public record TextValue(
    @JsonProperty("value") String value,
    @JsonProperty("lang") @JsonInclude(NON_ABSENT) Optional<String> lang)
    implements Value {
  @JsonCreator
  public TextValue { }
}
@JsonTypeName("IntegerNumberValue")
public record IntegerNumberValue(@JsonProperty("value") String value)
    implements Value {
  @JsonCreator
  public IntegerNumberValue { }
}
Python idiom. A discriminated Union annotated with
pydantic.Discriminator("kind"). Each variant carries a kind: Literal["..."] field.
from typing import Literal, Annotated, Union
from pydantic import BaseModel, ConfigDict, Discriminator

class TextValue(BaseModel):
    model_config = ConfigDict(frozen=True)
    kind: Literal["TextValue"] = "TextValue"
    value: str
    lang: str | None = None

class IntegerNumberValue(BaseModel):
    model_config = ConfigDict(frozen=True)
    kind: Literal["IntegerNumberValue"] = "IntegerNumberValue"
    value: str

Value = Annotated[Union[TextValue, IntegerNumberValue], Discriminator("kind")]
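The bare Annotated alias has no model_validate_json method. For
top-level decoding, either subclass pydantic.RootModel[Value] and call
model_validate_json(...) on that subclass, or call
pydantic.TypeAdapter(Value).validate_json(...).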
Validation guidance. The decoder rejects any input whose kind
value is not a known member. Encoders MUST emit kind with the exact
production name (no aliasing). The construction-time invariants of
each variant apply normally.
Worked example: Value (subset: TextValue | IntegerNumberValue). Wire
shape: {"kind": "TextValue", "value": "hi"}. All three idioms decode
that JSON to a value whose static type is Value and whose runtime
narrowing predicate (value.kind === 'TextValue' / instanceof TextValue
/ isinstance(v, TextValue)) returns true.
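The decode step in the worked example can be sketched without any library support. This is a minimal, dependency-free illustration using dataclasses rather than Pydantic; the decode_value function and the _DECODERS table are hypothetical names, not part of any CEDAR binding, and the handling of "lang": null is omitted for brevity.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Union


@dataclass(frozen=True)
class TextValue:
    value: str
    lang: Optional[str] = None
    kind: str = "TextValue"


@dataclass(frozen=True)
class IntegerNumberValue:
    value: str
    kind: str = "IntegerNumberValue"


Value = Union[TextValue, IntegerNumberValue]

# Flat dispatch table keyed by the exact production name (no aliases).
_DECODERS: Dict[str, Callable[[dict], Value]] = {
    "TextValue": lambda o: TextValue(value=o["value"], lang=o.get("lang")),
    "IntegerNumberValue": lambda o: IntegerNumberValue(value=o["value"]),
}


def decode_value(text: str) -> Value:
    """Decode one JSON-encoded Value, rejecting unknown or missing kinds."""
    obj = json.loads(text)
    kind = obj.get("kind") if isinstance(obj, dict) else None
    if not isinstance(kind, str) or kind not in _DECODERS:
        raise ValueError(f"Unknown or missing kind: {kind!r}")
    return _DECODERS[kind](obj)
```

Decoding {"kind": "TextValue", "value": "hi"} yields a TextValue, while a misspelled or absent kind raises before any variant is constructed.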
Java note: nested sealed interfaces and Jackson dispatch tables.
Where a sealed union permits another sealed union as a member, the
outer union’s @JsonSubTypes SHOULD enumerate all leaf concrete
records directly — a flat dispatch table — rather than delegating
to the intermediate sealed interface.
Rationale. Nested-@JsonTypeInfo delegation through an intermediate
sealed interface is fragile in Jackson 2.x: the resolver re-enters
the deserializer chain at the inner interface, which can fight with
@JsonTypeName on the leaves and produce spurious failures. The
wire form already requires kind to be one of the leaf names (never
an intermediate-group name), so a flat dispatch table is correct by
construction.
Example. EmbeddedArtifact is a sealed union over EmbeddedField
and EmbeddedPresentationComponent, with EmbeddedField itself
sealed over the 20 family records (EmbeddedTextField,
EmbeddedIntegerNumberField, etc.). The Jackson registration on
EmbeddedArtifact should list every leaf record (all 20
EmbeddedXxxField records plus every EmbeddedXxxComponent
record) in @JsonSubTypes, not the intermediate EmbeddedField
interface.
2.3 Position-discriminated union
What it is. A wire production written as T ::: A | B | … and
explicitly declared // discriminator: position in the wire
grammar. The variant is determined entirely by the enclosing property
and surrounding context; the encoded object itself carries no
discriminator. The principal example is RenderingHint inside the
various FieldSpec families: each FieldSpec family fixes which
RenderingHint variant is permitted at its renderingHint slot, so
the rendering hint encodes without a kind tag.
Note: position-discriminated vs. positionally-determinate. A position-discriminated union (this section) is one the wire grammar declares with
// discriminator: position, and whose members therefore carry no kind on the wire. This is distinct from the larger class of unions whose use sites happen to be positionally determinate but which the wire grammar declares with the default discriminator: kind. The umbrella FieldSpec union is a good example of the latter: every actual use site (XxxField.fieldSpec) is typed with the per-family concrete production (TextFieldSpec, IntegerNumberFieldSpec, …) so the variant could in principle be recovered positionally, but FieldSpec is discriminator: kind so every member still carries "kind" on the wire per the kind rule (§1.5). Use this section only for productions explicitly flagged // discriminator: position.
Bindings can usually realise each use site as a single concrete class, since the position fixes the variant. There is no need for a runtime union at the use site.
TypeScript idiom. A single concrete interface per use site; the
abstract union (RenderingHint) exists only as a documentation alias
and is not used as a runtime narrowing target.
Java idiom. A concrete record per use site. The intermediate union interface MAY exist purely for documentation but does not need Jackson polymorphism.
Python idiom. A concrete BaseModel per use site; the abstract
union may be exposed as a TypeAlias for documentation.
Validation guidance. None special — the variant is fixed by the enclosing property’s type. Decoders SHOULD NOT attempt cross-variant disambiguation at this position.
Worked example: DateRenderingHint inside DateFieldSpec.renderingHint.
Wire: {"kind":"DateFieldSpec","dateValueType":"fullDate","renderingHint":{"componentOrder":"dayMonthYear"}}. The
renderingHint property is statically typed as DateRenderingHint;
no kind tag appears on the inner object since the position fixes
the variant.
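A sketch of how a decoder might handle this slot, using plain dataclasses. The function name decode_date_field_spec is hypothetical, and treating renderingHint as optional is an assumption made for the example; only the wire property names are taken from the worked example above.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DateRenderingHint:
    component_order: str  # wire name: componentOrder


@dataclass(frozen=True)
class DateFieldSpec:
    date_value_type: str  # wire name: dateValueType
    rendering_hint: Optional[DateRenderingHint] = None


def decode_date_field_spec(obj: dict) -> DateFieldSpec:
    # The outer object is a kind-discriminated member, so "kind" is checked.
    if obj.get("kind") != "DateFieldSpec":
        raise ValueError("expected kind 'DateFieldSpec'")
    hint = None
    if "renderingHint" in obj:
        hint_obj = obj["renderingHint"]
        if not isinstance(hint_obj, dict):
            raise ValueError("renderingHint must be a JSON object")
        # The slot fixes the variant: no "kind" lookup and no
        # cross-variant probing of the inner object.
        hint = DateRenderingHint(component_order=hint_obj["componentOrder"])
    return DateFieldSpec(date_value_type=obj["dateValueType"], rendering_hint=hint)
```

The inner object is constructed directly as a DateRenderingHint because the enclosing property's type is the only discriminator available.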
2.4 Typed primitive wrapper
What it is. A wire production written as T ::: string (or
number) where T names a specialised role for the primitive — Iri,
FieldId, TemplateId, OrcidIri, LanguageTag, Bcp47Tag, etc.
On the wire these collapse to the underlying primitive; in the
abstract grammar they are typed roles whose constraints (IRI
well-formedness, BCP 47, ASCII identifier shape) MUST be enforced at
decode time.
The trade-off: typed wrappers catch role mismatches (passing a
TemplateId where a FieldId is expected) at compile time, at the
cost of some construction friction. Plain strings are ergonomic but
cede that protection to runtime checks. Bindings MAY choose either
end of this spectrum; cedar-ts wraps strongly.
TypeScript idiom. Two patterns are in common use: a structural
object wrapper carrying kind, and a TypeScript branded type
(a compile-time-only role tag added to a primitive via a phantom
__brand property of a string-literal type — the value is still a
plain string at runtime, but the type system treats Iri and a
generic string as distinct types). The structural wrapper costs one
allocation per identifier but gives full structural typing in IDEs;
the branded type costs nothing at runtime but enforces the role only
at compile time. cedar-ts uses Option A.
Option A — structural wrapper. The cedar-ts choice for Iri,
FieldId, and the typed-id families:
export interface Iri { readonly kind: 'Iri'; readonly value: string; }
export function iri(value: string): Iri {
return { kind: 'Iri', value: parseIriString(value) };
}
cedar-ts’s FieldId family uses this form with a per-family kind
discriminant so the twenty families remain distinguishable in the type
system.
Option B — branded type. A lighter alternative; the value is a bare string at runtime, the type system enforces the role at compile time:
export type Iri = string & { readonly __brand: 'Iri' };
export function iri(value: string): Iri { /* validate */ return value as Iri; }
Java idiom. A dedicated value record:
public record Iri(@JsonValue String value) {
public Iri {
if (!IriSyntax.isValid(value)) throw new CedarConstructionException("Invalid IRI: " + value);
}
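// Explicit delegating mode avoids Jackson's single-argument creator ambiguity.
@JsonCreator(mode = JsonCreator.Mode.DELEGATING) public static Iri of(String value) { return new Iri(value); }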
}
@JsonValue / @JsonCreator collapses to and from a bare JSON string
so the wire form remains primitive while the in-memory type is
nominal.
Python idiom. typing.NewType('Iri', str) for nominal typing in
static analysis; the runtime value is a plain str and serialises as
such.
from typing import NewType
Iri = NewType("Iri", str)
def iri(value: str) -> Iri:
    if not is_iri(value):
        raise CedarConstructionError(f"Invalid IRI: {value!r}")
    return Iri(value)
For richer runtime validation, a Pydantic BaseModel wrapper or an
Annotated[str, AfterValidator(...)] form is also fine; the
NewType form is the lightest.
Validation guidance. All typed primitive wrappers MUST enforce their
syntactic constraints (RFC 3987 for IRI; BCP 47 for language tags; the
ASCII pattern [A-Za-z][A-Za-z0-9_-]* for EmbeddedArtifactKey;
SemVer for Version / ModelVersion) at the constructor.
Worked example: Iri and FieldId. Iri wire form: "https://example.org/x".
FieldId wire form: also "https://example.org/x" — the family is
recovered from the surrounding kind. Bindings reconstruct the
typed form by combining the JSON string with the static type at the
use site.
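The validation guidance above names a concrete ASCII pattern for EmbeddedArtifactKey, which makes it a convenient wrapper to illustrate. The following is a sketch only: the factory name embedded_artifact_key is hypothetical, and the exception class is redefined locally so the block stands alone.

```python
import re
from typing import NewType

EmbeddedArtifactKey = NewType("EmbeddedArtifactKey", str)

# Pattern quoted in the validation guidance: [A-Za-z][A-Za-z0-9_-]*
_KEY_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_-]*")


class CedarConstructionError(Exception):
    pass


def embedded_artifact_key(value: str) -> EmbeddedArtifactKey:
    """Validate at construction; the result is a plain str at runtime."""
    if not isinstance(value, str) or _KEY_PATTERN.fullmatch(value) is None:
        raise CedarConstructionError(f"Invalid EmbeddedArtifactKey: {value!r}")
    return EmbeddedArtifactKey(value)
```

Because NewType erases at runtime, the returned value serialises as a bare JSON string, while a static type checker still distinguishes it from an arbitrary str.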
2.5 MultilingualString
What it is. A MultilingualString is an array of one or more
{value, lang} localizations of the same conceptual string — for
example, the English, French, and German labels for one field. The
wire production is MultilingualString ::: nonEmptyArray<LangString>,
with two invariants:
- The array MUST be non-empty.
- Lang tags MUST be unique within the array (case-folded, see
wire-grammar.md §2.2).
A MultilingualString is distinct from a single language-tagged
TextValue. A TextValue is one tagged object carrying kind,
value, and lang — a single localized string. A
MultilingualString is an array of localizations of the same
conceptual string. The two are not interchangeable.
The (value, lang) pattern recurs across all three target languages
and deserves its own section because the non-empty-and-unique-lang
invariants need explicit support.
TypeScript idiom. A readonly array alias plus a constructor that
enforces invariants and returns a frozen array. cedar-ts accepts a
range of input shapes — bare string, {value, lang}, [value, lang],
{ [lang]: value } map, or an array of any of those — and normalises
them to the canonical array form.
export type MultilingualString = readonly LangString[];
export interface LangString { readonly value: string; readonly lang: string; }
export function multilingualString(input: MultilingualStringInput): MultilingualString {
// normalise, BCP 47-validate every lang, dedup-check, freeze, return.
}
Java idiom. Two records, with the outer carrying the invariants:
public record LangString(
@JsonProperty("value") String value,
@JsonProperty("lang") String lang) {
@JsonCreator
public LangString { /* BCP 47 check on lang */ }
}
public record MultilingualString(@JsonValue List<LangString> entries) {
public MultilingualString {
if (entries == null || entries.isEmpty())
throw new CedarConstructionException("MultilingualString must be non-empty");
var seen = new java.util.HashSet<String>();
for (var e : entries) {
if (!seen.add(e.lang().toLowerCase(Locale.ROOT)))
throw new CedarConstructionException("Duplicate lang tag: " + e.lang());
}
entries = List.copyOf(entries);
}
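// Explicit delegating mode avoids Jackson's single-argument creator ambiguity.
@JsonCreator(mode = JsonCreator.Mode.DELEGATING) public static MultilingualString of(List<LangString> entries) { return new MultilingualString(entries); }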
}
A NonEmptyList<T> helper type is a reasonable cross-cutting
abstraction if the binding has several non-empty arrays to model.
Python idiom. A Pydantic model with a model_validator(mode="after")
enforcing non-empty and unique lang tags:
from pydantic import BaseModel, ConfigDict, RootModel, model_validator
class LangString(BaseModel):
    model_config = ConfigDict(frozen=True)
    value: str
    lang: str  # validate BCP 47 with a field validator
class MultilingualString(RootModel[list[LangString]]):
    model_config = ConfigDict(frozen=True)
    @model_validator(mode="after")
    def _check(self):
        entries = self.root
        if not entries:
            raise CedarConstructionError("MultilingualString must be non-empty")
        seen = set()
        for e in entries:
            key = e.lang.lower()
            if key in seen:
                raise CedarConstructionError(f"Duplicate lang tag: {e.lang!r}")
            seen.add(key)
        return self
attrs with __attrs_post_init__ is a lighter alternative; the
recommendation is Pydantic for the JSON round-trip story.
Validation guidance. Validate at construction. A constructed
MultilingualString is always non-empty and always lang-unique.
2.6 Optional component
What it is. A grammar production component marked [X]. On the
wire the property is encoded only when present
(serialization.md §4.2): conforming encoders
MUST NOT emit null or empty strings in place of an absent optional.
TypeScript idiom. prop?: T. The interface treats omission and
undefined identically; encoders skip the property at JSON write time.
Use JSON.stringify with no null-injection logic; the property is
naturally absent from the serialised output.
Java idiom. Prefer @Nullable T over Optional<T> in record
components. Jackson handles null/missing properties on records
cleanly with @JsonInclude(JsonInclude.Include.NON_NULL) at either
the field or class level. Optional<T> works but interacts awkwardly
with records (Jackson must be configured to recognise empty
Optionals) and adds a layer of allocation per access.
public record Cardinality(
@JsonProperty("min") int min,
@JsonProperty("max") @JsonInclude(NON_NULL) Integer max) { … }
Python idiom. T | None with default None. Pydantic omits None-valued
fields from the output only when asked to: pass exclude_none=True to
model_dump_json().
class Cardinality(BaseModel):
    model_config = ConfigDict(frozen=True)
    min: int = Field(ge=0)
    max: int | None = Field(default=None, ge=0)
# Round-trip omits `max` when None:
c = Cardinality(min=0)
c.model_dump_json(exclude_none=True) # '{"min":0}'
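Pydantic v2’s ConfigDict has no json_dumps_kwargs setting, so
exclude_none cannot be switched on through model_config. To make the
omission implicit, override model_dump_json (and model_dump) in a shared
base class so that exclude_none defaults to True.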
Validation guidance. Decoders MUST treat "prop": null as an
encoding error (per serialization.md §4.2),
distinct from omission of the property.
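The distinction between an explicit null and an omitted property can be sketched with a hand-written decoder. This is an illustration only: CedarDecodeError, decode_cardinality and encode_cardinality are hypothetical names, and the spec's actual error model is defined in serialization.md, not here.

```python
from dataclasses import dataclass
from typing import Any, Dict, Mapping, Optional


class CedarDecodeError(Exception):
    """Hypothetical decode-time error type used only in this sketch."""


@dataclass(frozen=True)
class Cardinality:
    min: int
    max: Optional[int] = None


def decode_cardinality(obj: Mapping[str, Any]) -> Cardinality:
    # An explicit null is an encoding error, not an absent optional.
    if "max" in obj and obj["max"] is None:
        raise CedarDecodeError('"max": null is not permitted; omit the property')
    return Cardinality(min=obj["min"], max=obj.get("max"))


def encode_cardinality(c: Cardinality) -> Dict[str, int]:
    # Absent optionals are skipped entirely rather than written as null.
    out = {"min": c.min}
    if c.max is not None:
        out["max"] = c.max
    return out
```

Checking key membership before reading the value is what separates the two cases; a plain obj.get("max") alone would conflate them.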
2.7 String enum
What it is. A wire production T ::: "a" | "b" | … whose values
are drawn from a fixed set. All values are lowerCamelCase per
serialization.md §3.3. Examples: Status,
ValueRequirement, Visibility, DateValueType, DateComponentOrder,
TimeFormat, TimePrecision, DateTimeValueType,
TimezoneRequirement, RealNumberDatatypeKind (three values), the
flat-string rendering hints (TextRenderingHint,
SingleValuedEnumRenderingHint, MultiValuedEnumRenderingHint,
BooleanRenderingHint).
TypeScript idiom. A string-literal union. cedar-ts also exports a
frozen array of permitted values and an isXxx type guard.
export type Status = 'draft' | 'published';
export const STATUSES: readonly Status[] = Object.freeze(['draft', 'published']);
export const isStatus = (x: unknown): x is Status =>
typeof x === 'string' && (STATUSES as readonly string[]).includes(x);
Java idiom. A Java enum whose constants are uppercase by
convention, with @JsonProperty annotations mapping each constant to
its lowerCamelCase wire value:
public enum Status {
@JsonProperty("draft") DRAFT,
@JsonProperty("published") PUBLISHED
}
Jackson uses the annotation for both serialization and
deserialization. An unknown wire value yields Jackson’s standard
InvalidFormatException. Bindings that prefer to surface custom
errors, or that need a wire accessor on the enum (e.g. for non-Jackson
code paths), can use the @JsonValue / @JsonCreator pair instead.
Python idiom. enum.StrEnum (Python 3.11+); Pydantic accepts and
emits the string form directly.
from enum import StrEnum
class Status(StrEnum):
    DRAFT = "draft"
    PUBLISHED = "published"
Validation guidance. Decoders MUST reject string values not in the declared set. The enum surface must be closed: future wire-grammar additions trigger a binding version bump.
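For interpreters older than Python 3.11, where enum.StrEnum is unavailable, a closed enum can be sketched by subclassing both str and enum.Enum. The decode_status helper is a hypothetical name; it shows the rejection of values outside the declared set.

```python
from enum import Enum


class Status(str, Enum):
    DRAFT = "draft"
    PUBLISHED = "published"


def decode_status(raw: object) -> Status:
    """Map a wire string to Status, rejecting anything outside the set."""
    if not isinstance(raw, str):
        raise ValueError(f"Status must be a string, got {type(raw).__name__}")
    try:
        return Status(raw)  # lookup by value; case-sensitive
    except ValueError:
        raise ValueError(f"Unknown Status value: {raw!r}") from None
```

Lookup is by exact value, so "Draft" is rejected just as an unknown token would be.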
2.8 Repeated component
What it is. A grammar component marked X* (zero-or-more) or X+
(one-or-more). On the wire both encode as JSON arrays
(wire-grammar.md §1.1, §4.3); X+ is written as
nonEmptyArray<X> and carries the non-empty invariant. Order MUST be
preserved through encode and decode.
TypeScript idiom. readonly T[] with an explicit non-empty check
in the constructor for X+ cases. cedar-ts uses Object.freeze on
constructed arrays where the position carries an invariant
(MultilingualString, embedded in Template).
Java idiom. List<T>. Jackson handles arrays out of the box. For
non-empty cases, validate at the constructor with if (list.isEmpty()) throw … and store as List.copyOf(list) to enforce
immutability.
Python idiom. list[T]. Pydantic handles arrays out of the box.
For non-empty, use Field(min_length=1) or a model_validator.
Validation guidance. Decoders MUST reject empty arrays at
nonEmptyArray<X> positions. Encoders MUST preserve element order.
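Both requirements (reject empty arrays, preserve order) fit in a small generic helper. The name decode_non_empty_array is hypothetical, and returning a tuple is one possible way to keep the decoded sequence immutable.

```python
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def decode_non_empty_array(
    raw: object, decode_item: Callable[[object], T]
) -> Tuple[T, ...]:
    """Decode a nonEmptyArray<X> position, keeping element order."""
    if not isinstance(raw, list):
        raise ValueError("expected a JSON array")
    if not raw:
        raise ValueError("nonEmptyArray position must not be empty")
    # Iterating the list in place preserves the wire order.
    return tuple(decode_item(item) for item in raw)
```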
2.9 Constraints
What it is. Inline //-comments on wire-grammar.md productions
declare constraints not expressible in the type expression. The
constraint kinds in current use are:
- Lexical-form well-formedness. Values matching a syntactic category, e.g. BCP 47 language tags, the ASCII identifier pattern [A-Za-z][A-Za-z0-9_-]* for EmbeddedArtifactKey, RFC 3987 IRIs, SemVer for Version / ModelVersion, ISO 8601 date-time stamps.
- Uniqueness across a collection. Distinctness of an entry’s identifying property within an array or set, e.g. EmbeddedArtifact.key values must be unique within a Template, LangString.lang tags must be unique within a MultilingualString, PermissibleValue.value tokens must be unique within an enum spec.
- At-least-one-of. A production with all-optional components requires at least one to be present, e.g. OntologyDisplayHint requires at least one of acronym or name.
- Value relationships across slots. A value at one slot must agree with another slot, e.g. the IRI placed at a field’s id MUST belong to a field of the same family as the enclosing kind.
- Numeric ordering. Where a production carries paired bounds, the lower bound must not exceed the upper, e.g. Cardinality.min ≤ Cardinality.max.
Bindings SHOULD validate at construction time and throw a binding-specific exception type. A constructed instance is then always valid; downstream code may rely on the construction guarantee.
One canonical exception class per binding is recommended:
// TypeScript
export class CedarConstructionError extends Error {
  constructor(message: string) { super(message); this.name = 'CedarConstructionError'; }
}
// Java
public class CedarConstructionException extends RuntimeException {
  public CedarConstructionException(String message) { super(message); }
}
# Python
class CedarConstructionError(Exception):
    pass
Validation guidance. Validate eagerly at construction. Lazy
validation (deferring checks until access) is discouraged: the model
is value-typed; an invalid value should never exist in the runtime
heap. Where validation depends on a wider context (e.g., embedded-key
uniqueness depends on the whole Template.members array), perform
the check in the enclosing constructor.
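Eager validation, including a context-dependent check performed in the enclosing constructor, might look like the following sketch. It uses frozen dataclasses with __post_init__; EmbeddedArtifact and Template are reduced here to the single property each check needs and are not the full productions.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


class CedarConstructionError(Exception):
    pass


@dataclass(frozen=True)
class Cardinality:
    min: int
    max: Optional[int] = None

    def __post_init__(self) -> None:
        # Numeric ordering: the lower bound must not exceed the upper.
        if self.max is not None and self.min > self.max:
            raise CedarConstructionError("Cardinality.min must not exceed max")


@dataclass(frozen=True)
class EmbeddedArtifact:  # simplified stand-in: only the key is modelled
    key: str


@dataclass(frozen=True)
class Template:  # simplified stand-in: only members is modelled
    members: Tuple[EmbeddedArtifact, ...]

    def __post_init__(self) -> None:
        # Uniqueness needs the whole array, so it lives on the encloser.
        seen = set()
        for member in self.members:
            if member.key in seen:
                raise CedarConstructionError(f"Duplicate key: {member.key!r}")
            seen.add(member.key)
```

An EmbeddedArtifact cannot know about its siblings, which is why the uniqueness check sits on Template rather than on the member type.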
2.10 Widening constructors
What it is. An ergonomic pattern in which a constructor accepts a
broader set of input shapes than its return type: iri() accepts
Iri | string; multilingualString() accepts string | LangString | { [lang]: string } | LangString[]; property() accepts string | Iri | PropertyInit. The widened constructor narrows to the canonical wire
shape, validating along the way. When the input is already in
canonical form the constructor SHOULD return it unchanged (so e.g.
iri(iri(s)) is well-defined and equivalent to iri(s)); this lets
callers chain widening constructors without redundancy concerns.
This is recommended-but-not-required. A binding’s narrow (canonical) constructor — taking exactly the wire-grammar shape — MUST exist; widening factories are convenience layers on top.
TypeScript idiom. Function overloads or a single union-typed input
parameter; the function dispatches on typeof / structural shape.
Java idiom. Static factory overloads on the record:
Iri.of(String), Iri.of(URI). Avoid widening the canonical record
constructor itself, which Jackson uses; add overloads as static
methods so the wire-shape constructor remains unambiguous.
Python idiom. Module-level factory functions accepting Union
types; the canonical Pydantic model constructor remains for the
narrow shape. Avoid __init__ overloading via sentinels; prefer
explicit factory functions (iri.from_string, etc.).
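A sketch of a widening factory that returns already-canonical input unchanged. It assumes a structural Iri wrapper (a frozen dataclass) rather than the NewType form, since a NewType value cannot be told apart from a plain str at runtime. The well-formedness check is a deliberately crude placeholder, not an RFC 3987 validator.

```python
from dataclasses import dataclass
from typing import Union


class CedarConstructionError(Exception):
    pass


@dataclass(frozen=True)
class Iri:
    value: str

    def __post_init__(self) -> None:
        # Placeholder check only; a real binding would apply RFC 3987.
        ok = isinstance(self.value, str) and ":" in self.value and " " not in self.value
        if not ok:
            raise CedarConstructionError(f"Invalid IRI: {self.value!r}")


def iri(value: Union[Iri, str]) -> Iri:
    """Widening factory: accepts Iri | str, always returns Iri."""
    if isinstance(value, Iri):
        return value  # already canonical: returned unchanged
    return Iri(value)
```

With this shape, iri(iri(s)) returns the same object as the inner call, so chaining widening factories costs nothing and cannot re-validate inconsistently.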
2.11 Immutability
Strongly recommend immutable-by-default for all binding types. A CEDAR artifact is a value; mutability is a hazard.
- TypeScript: readonly on every interface property; Object.freeze() on constructed instances and on any nested arrays. cedar-ts freezes invariant-bearing arrays (MultilingualString, Template.members).
- Java: record types are immutable by language design; for non-record classes use final fields, no setters, and defensive copies on collections (List.copyOf, Set.copyOf, Map.copyOf).
- Python: Pydantic models with model_config = ConfigDict(frozen=True); dataclasses with @dataclass(frozen=True).
Equality is structural: two values with the same component values are equal regardless of allocation identity. Records and Pydantic models provide this automatically; TypeScript bindings need a structural (deep) equality helper if equality is meaningful at call sites, because === compares object identity and a shallow comparison does not descend into nested values.
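Both properties, immutability and structural equality, can be observed with frozen dataclasses from the standard library. The Label holder type is hypothetical and exists only to give the example a nested value.

```python
from dataclasses import FrozenInstanceError, dataclass
from typing import Tuple


@dataclass(frozen=True)
class LangString:
    value: str
    lang: str


@dataclass(frozen=True)
class Label:  # hypothetical holder type, for illustration only
    entries: Tuple[LangString, ...]


def try_mutate(label: Label) -> bool:
    """Return True if assignment to a frozen instance was rejected."""
    try:
        label.entries = ()  # type: ignore[misc]
    except FrozenInstanceError:
        return True
    return False
```

Two separately allocated Label values with equal components compare equal and hash alike. Using a tuple rather than a list for the nested collection keeps the nested level immutable as well.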
2.12 Override-precedence accessors
What it is. Several grammar slots come in pairs: a canonical value
on a reusable artifact and an optional override on the embedding site.
The override wins per the spec’s two-layer precedence rule
(grammar.md §Defaults; presentation.md
§Help-Text Rendering). The current pairs:
| Reusable artifact slot | Embedding-site override slot |
|---|---|
| Field.fieldSpec.defaultValue | EmbeddedXxxField.defaultValue |
| Field.label / Field.metadata.altLabels | EmbeddedXxxField.labelOverride.label / altLabels |
| Field.helpText | EmbeddedXxxField.helpTextOverride |
Binding guidance. Bindings SHOULD expose a small convenience
accessor per pair so call sites do not re-implement the precedence
rule. The accessor takes the EmbeddedField and the resolved Field
and returns the effective value:
// TypeScript
function resolvedHelpText(
embedded: EmbeddedTextField,
field: TextField,
): MultilingualString | undefined {
return embedded.helpTextOverride ?? field.helpText;
}
// Java
public static Optional<MultilingualString> resolvedHelpText(
EmbeddedTextField embedded, TextField field) {
return Optional.ofNullable(embedded.helpTextOverride())
.or(() -> Optional.ofNullable(field.helpText()));
}
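# Python
def resolved_help_text(
    embedded: EmbeddedTextField, field: TextField
) -> MultilingualString | None:
    # Explicit None check: `or` would also fall through on any falsy override.
    if embedded.help_text_override is not None:
        return embedded.help_text_override
    return field.help_text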
The same pattern applies to defaultValue and labelOverride.label,
each with its own accessor. Replace, not merge: the override
replaces the canonical value at the embedding site; partial
localization fallback (e.g., one language overridden, others falling
through) is not part of the precedence rule. Bindings MUST NOT
synthesise such fallback.
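The replace-not-merge rule is easiest to see with a canonical value that carries two languages and an override that carries one. In this sketch the types are reduced to the single slot involved, and a MultilingualString is modelled as a tuple of (value, lang) pairs purely for brevity.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

LangString = Tuple[str, str]  # (value, lang), simplified for the sketch
MultilingualString = Tuple[LangString, ...]


@dataclass(frozen=True)
class TextField:
    help_text: Optional[MultilingualString] = None


@dataclass(frozen=True)
class EmbeddedTextField:
    help_text_override: Optional[MultilingualString] = None


def resolved_help_text(
    embedded: EmbeddedTextField, field: TextField
) -> Optional[MultilingualString]:
    # The override replaces the canonical value as a whole; languages
    # missing from the override do NOT fall through to the field.
    if embedded.help_text_override is not None:
        return embedded.help_text_override
    return field.help_text
```

When the field offers English and French help text and the embedding overrides only English, the resolved value contains English alone.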
3. Naming Conventions per Language
| Language | Types | Functions / methods / properties | Constants |
|---|---|---|---|
| TypeScript | UpperCamelCase | lowerCamelCase | SCREAMING_SNAKE_CASE |
| Java | UpperCamelCase | lowerCamelCase | SCREAMING_SNAKE_CASE |
| Python | UpperCamelCase | snake_case | SCREAMING_SNAKE_CASE |
Reserved-word collisions (Java). As of the current model no
grammar property name collides with a Java reserved word. (Verified
by cross-referencing every property name in
wire-grammar.md against the full Java
reserved-word list.) A future grammar property whose name collides
with a Java reserved word SHOULD be escaped by either renaming the
Java field to a non-reserved synonym (e.g. isFoo for a wire foo)
or using a leading underscore (_foo), in either case mapping back
to the wire name via @JsonProperty("foo"). The wire name remains
canonical.
Property naming (Python). Pydantic models can use Field(alias='lowerCamelName') together with model_config = ConfigDict(populate_by_name=True) to expose Python snake_case attribute names
while preserving the wire’s lowerCamelCase. This is the recommended
pattern:
class SchemaArtifactVersioning(BaseModel):
    model_config = ConfigDict(populate_by_name=True, frozen=True)
    version: str
    status: Status
    previous_version: str | None = Field(default=None, alias="previousVersion")
    derived_from: str | None = Field(default=None, alias="derivedFrom")
class TextField(BaseModel):
    model_config = ConfigDict(populate_by_name=True, frozen=True)
    kind: Literal["TextField"] = "TextField"
    id: str
    model_version: str = Field(alias="modelVersion")
    metadata: CatalogMetadata
    versioning: SchemaArtifactVersioning
    field_spec: TextFieldSpec = Field(alias="fieldSpec")
    label: MultilingualString
model_version is a top-level field on every concrete artifact class
(Template, TemplateInstance, every XxxField, and every
PresentationComponent variant); it is no longer nested inside
SchemaArtifactVersioning.
A binding MAY instead expose lowerCamelCase Python attribute names
to avoid the alias layer; the alias approach is recommended for
PEP 8 conformance on the Python surface.
4. Codebase Organisation
Bindings SHOULD organise the source tree so that everything specific
to a single field family lives together. A field family is the
twenty-way grouping introduced in grammar.md §3.2: TextField,
IntegerNumberField, RealNumberField, BooleanField, DateField,
TimeField, DateTimeField, ControlledTermField,
SingleValuedEnumField, MultiValuedEnumField, LinkField,
EmailField, PhoneNumberField, OrcidField, RorField, DoiField,
PubMedIdField, RridField, NihGrantIdField, and
AttributeValueField.
The “everything specific to a family” set comprises, at minimum, the family’s:
- typed identifier (TextFieldId, etc.)
- field artifact type (TextField, etc.)
- field spec type (TextFieldSpec, etc.)
- embedded field artifact type (EmbeddedTextField, etc.)
- per-family value type (TextValue, etc.)
- any per-family rendering hint (TextRenderingHint, etc.)
- per-family construction helpers (widening constructors, type guards, JSON adapters, validators)
Per-language convention:
- TypeScript. A single file per family (text-field.ts, integer-number-field.ts, controlled-term-field.ts, etc.) that exports every type and helper in the list above. Cross-family abstractions (the Field union, the EmbeddedField union, cross-cutting helpers) live in their own files and re-export from the per-family files where appropriate.
- Java. A single package per family (package org.example.cedar.field.text, org.example.cedar.field.integer, org.example.cedar.field.controlledterm, etc.) containing the family’s records, sealed interface members, type-info annotations, and per-family helpers. The umbrella Field sealed interface and cross-family abstractions live in a parent package (org.example.cedar.field).
- Python. A single module per family (cedar/field/text_field.py, etc.), paralleling the TypeScript layout.
The motivation is locality: any change to a field family — adding a
constraint, renaming a property, introducing a new rendering hint —
should touch one file (TS, Python) or one package (Java), not many.
This also makes it straightforward for a reader to trace a wire
property like defaultValue from its appearance in a
SingleValuedEnumFieldSpec block to the family’s EnumValue type
without navigating across the codebase.
Bindings MAY group cross-family abstractions (the Field union, the
EmbeddedField union, the Value union, Cardinality,
CatalogMetadata, SchemaArtifactVersioning, etc.) however they
like; only family-specific code is constrained by this guideline.
5. The Reference TypeScript Binding
The reference TypeScript implementation is
cedar-ts, published as
@metadatacenter/cedar-model on npm. It is the source of truth for
any TypeScript-specific idiom not covered explicitly in this document.
High-level structure (the src/ tree mirrors the grammar layering):
- leaves/: primitive validators and typed leaves (Iri, LanguageTag, IsoDateTimeStamp, ASCII-id, BCP 47, SemVer, integer).
- multilingual.ts: MultilingualString and LangString.
- values/: the Value family. Each Value variant carries its family-specific content directly (lexical form, language tag, datatype, or boolean payload, as appropriate); there is no separate Literal layer.
- identity.ts: artifact identifiers (FieldId, TemplateId, PresentationComponentId, TemplateInstanceId).
- metadata/: LifecycleMetadata, SchemaArtifactVersioning, Annotation.
- field-specs/: FieldSpec family.
- fields.ts: Field family.
- embedded/: EmbeddedField, EmbeddedTemplate, EmbeddedPresentationComponent, plus Cardinality, Property, LabelOverride, Visibility, ValueRequirement.
- presentation/: PresentationComponent family.
- instances/: TemplateInstance, FieldValue, NestedTemplateInstance.
- template.ts: Template.
- index.ts: public API surface.
Conventions adopted by cedar-ts (already documented in §2 above):
- readonly on all interface properties; Object.freeze on invariant-bearing arrays.
- A canonical xxxInit interface alongside each Xxx interface, giving the construction-time input shape that may differ from the output (e.g., accepts Iri | string where the output stores Iri).
- A widening constructor function (e.g. cardinality(init), multilingualString(input)) per production.
- A type guard (isXxx) per polymorphic production.
- A single CedarConstructionError thrown for all construction-time invariant failures.
6. Open Issues per Language
Java.
- For NonEmptyList<T>-style helper types used as a MultilingualString substrate (or anywhere a non-empty collection invariant must be enforced), prefer a plain final class or sealed interface over a record. Records lock the component layout into the canonical constructor signature, which constrains the API for static factories (NonEmptyList.of(t), NonEmptyList.of(t1, t2, …)), varargs construction, and any desire to implement List<T> directly; all of these are easier on a final class.
- Records cannot be null-rejected at the canonical constructor in a way Jackson respects without extra annotations; combining @JsonInclude(NON_NULL) with explicit checks in the canonical constructor body is the established pattern.
Python.
- Pydantic v1 vs v2 differs significantly in discriminated-union handling. Bindings SHOULD target v2; the recommendations in §2 use v2 exclusively.
- Optional[T] vs T | None is style-only since Python 3.10; prefer T | None for new code.
- enum.StrEnum requires Python 3.11+. Bindings targeting earlier versions SHOULD use enum.Enum subclassing str.
All bindings.
- JSON numbers exceeding 2^53 − 1 in NonNegativeInteger slots: the wire grammar allows the string-fallback encoding (wire-grammar.md §2.1, serialization.md §5.1). Bindings SHOULD use BigInt (TS), BigInteger (Java), or int (Python ints are unbounded) on the binding side. The current model does not have any use site that actually exercises this (length bounds, cardinality bounds, traversal depths, and numeric precision are all small), but the encoder MUST be capable of emitting the string form when given an out-of-range value.
- Round-trip ordering of optional properties within a tagged object is not significant; bindings MUST NOT rely on JSON property order for correctness (per serialization.md §4.7).
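The string-fallback requirement for out-of-range NonNegativeInteger values can be sketched as an encoder and decoder pair. The exact lexical form is defined in wire-grammar.md §2.1; this example assumes a plain decimal string without leading zeros, and both function names are hypothetical.

```python
import re
from typing import Union

MAX_SAFE_INTEGER = 2**53 - 1  # largest integer a JSON number carries safely

# Assumed lexical form for the fallback: decimal digits, no leading zeros.
_DECIMAL = re.compile(r"0|[1-9][0-9]*")


def encode_non_negative_integer(n: int) -> Union[int, str]:
    """Emit a JSON number when safe, otherwise the string form."""
    if isinstance(n, bool) or not isinstance(n, int) or n < 0:
        raise ValueError(f"Not a non-negative integer: {n!r}")
    return n if n <= MAX_SAFE_INTEGER else str(n)


def decode_non_negative_integer(raw: object) -> int:
    """Accept either wire form and return an unbounded Python int."""
    if isinstance(raw, bool):
        raise ValueError("booleans are not integers on the wire")
    if isinstance(raw, int) and raw >= 0:
        return raw
    if isinstance(raw, str) and _DECIMAL.fullmatch(raw):
        return int(raw)
    raise ValueError(f"Not a non-negative integer: {raw!r}")
```

The explicit bool check matters in Python because bool is a subclass of int, so True would otherwise pass as 1.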
7. Reading wire-grammar.md as a Binding Implementer
A short cheat-sheet that maps wire-grammar.md notation to the
meta-categories above, so an implementer encountering a production can
quickly classify it:
| wire-grammar.md shape | Category |
|---|---|
| T ::: string / number / boolean / null | Primitive (or typed primitive wrapper, §2.4) |
| T ::: array<X> | Repeated component (§2.8) |
| T ::: nonEmptyArray<X> | Repeated component (§2.8); §2.5 for MultilingualString specifically |
| T ::: object { … } with no "kind": "..." literal property | Plain object production (§2.1) |
| T ::: object { … } with a "kind": "..." literal property | Member of a kind-discriminated union (§2.2) |
| `T ::: A \| B \| …` with the default discriminator: kind | Kind-discriminated union (§2.2) |
| `T ::: A \| B \| …` declared // discriminator: position | Position-discriminated union (§2.3) |
| `T ::: "a" \| "b" \| …` | String enum (§2.7) |
| T ::: SomeOtherProduction (collapsed wrapper, e.g. PreferredLabel ::: MultilingualString) | The wrapper carries no extra information; bind it as the inner type’s idiom. |
Optional components are marked with ? on the property
(prop?: Type) — see §2.6. Inline //-comments declare constraints to
enforce at construction (§2.9).
8. Cross-References
- Abstract grammar: grammar.md
- JSON wire shapes: wire-grammar.md
- JSON encoding rules: serialization.md
- Conformance rules: validation.md
- Reference TypeScript binding: cedar-ts (npm @metadatacenter/cedar-model)
Instances
Overview
A TemplateInstance is an Artifact that conforms to a Template.
The structure of a TemplateInstance is determined by the embedded data-bearing artifacts of the referenced Template.
TemplateInstance
A TemplateInstance carries a TemplateInstanceId, a ModelVersion (the version of the CEDAR structural model the instance conforms to, hoisted to top-level on every concrete artifact), an ArtifactMetadata block, a reference to the Template it conforms to, and zero or more InstanceValue constructs.
TemplateInstance carries ArtifactMetadata rather than SchemaArtifactMetadata: instances do not carry schema versioning. The Template they reference fixes the schema version.
Each InstanceValue corresponds to an embedded artifact in the referenced Template that contributes data.
The template reference is persistent and provides the basis for validation and interpretation of instance content.
InstanceValue
InstanceValue has two forms:
- FieldValue
- NestedTemplateInstance
PresentationComponent does not correspond to any InstanceValue.
FieldValue
A FieldValue associates an EmbeddedArtifactKey with one or more values for an EmbeddedField.
The key identifies the embedding site within the containing Template, which allows the same referenced Field to appear in multiple contexts without ambiguity.
FieldValue may contain multiple values when the corresponding EmbeddedField permits multiplicity.
The permitted form of each contained value is determined by the FieldSpec of the referenced Field. Each value in FieldValue.values is a member of the Value polymorphic union and therefore carries a kind discriminator on the wire (per wire-grammar.md §1.5). A decoder reads kind to pick the union arm; the resulting arm MUST match the family expected by the referenced FieldSpec (e.g. a FieldValue for a TextFieldSpec carries TextValue entries with "kind": "TextValue").
For EnumFieldSpec, every contained value is a tagged EnumValue ({ "kind": "EnumValue", "value": "<Token>" }) whose value MUST equal the canonical Token of one of the referenced spec’s PermissibleValue entries. A SingleValuedEnumFieldSpec permits exactly one such EnumValue per FieldValue; a MultiValuedEnumFieldSpec permits one or more, subject to the embedding’s Cardinality.
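The kind check and the enum-token rule described above can be sketched as follows. This is a minimal illustration, assuming values have already been decoded into plain dictionaries; the function name `check_field_value`, its parameters, and the idea of passing the permissible tokens as a set are hypothetical and are not defined by this specification.

```python
from typing import Iterable, List, Optional


def check_field_value(values: List[dict],
                      expected_kind: str,
                      permissible_tokens: Optional[Iterable[str]] = None) -> List[str]:
    """Return a list of problems; an empty list means every value passed.

    `expected_kind` is the Value arm implied by the referenced FieldSpec
    (for example "TextValue" for a TextFieldSpec). `permissible_tokens` is
    only consulted for EnumValue entries.
    """
    problems: List[str] = []
    tokens = set(permissible_tokens) if permissible_tokens is not None else None
    for index, value in enumerate(values):
        kind = value.get("kind")
        if kind != expected_kind:
            problems.append(
                f"values[{index}]: kind {kind!r} does not match {expected_kind!r}")
            continue
        # For enum fields, the token must be one of the spec's permissible values.
        if kind == "EnumValue" and tokens is not None:
            if value.get("value") not in tokens:
                problems.append(
                    f"values[{index}]: token {value.get('value')!r} is not permissible")
    return problems
```

Cardinality checks (exactly one EnumValue for a single-valued enum, and so on) are deliberately left out of this sketch.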
Defaults are not part of instances
A TemplateInstance records the values a user supplied; it does not record default values. Defaults specified at the field-level (XxxFieldSpec.defaultValue) or embedding-level (EmbeddedXxxField.defaultValue) are UI/UX initialisation only — they pre-populate the form a user fills in, but the resulting instance carries the user’s chosen value as if the user had typed it in by hand.
Two consequences:
- A user who accepts a default without modification produces a FieldValue carrying that value verbatim. From the instance’s perspective the default and a user-supplied identical value are indistinguishable.
- A user who supplies no value (and the field is not required) produces no FieldValue for that key. The default does not appear by virtue of having existed; absence in the instance means absence, not “use the default.”
This matters for downstream consumers: the absence of a FieldValue for a given EmbeddedField is unambiguous evidence that no value was supplied, and never an implicit reference to a default.
See also grammar.md §Defaults and serialization.md §6.8.
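For a downstream consumer, the rule that absence means absence might look like the lookup below. This is a sketch under stated assumptions: the instance values are held as a list of dictionaries, and the property names `key` and `values` and the `"kind": "FieldValue"` tag are illustrative stand-ins for whatever the wire grammar actually prescribes. The helper name `supplied_values` is hypothetical.

```python
from typing import List, Optional


def supplied_values(instance_values: List[dict], key: str) -> Optional[list]:
    """Return the values recorded for `key`, or None when no FieldValue exists.

    No field-level or embedding-level default is consulted here: a missing
    FieldValue means the user supplied nothing.
    """
    for entry in instance_values:
        if entry.get("kind") == "FieldValue" and entry.get("key") == key:
            return entry.get("values", [])
    return None
```

Returning `None` rather than a default keeps the distinction between "no value supplied" and "value equal to the default" visible to the caller.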
NestedTemplateInstance
A NestedTemplateInstance associates an EmbeddedArtifactKey with nested InstanceValue constructs corresponding to an EmbeddedTemplate.
This provides recursive instance structure aligned with recursive template structure.
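The recursive alignment can be illustrated with a small traversal that yields every FieldValue together with the path of keys leading to it. The dictionary shape used here (`kind`, `key`, and a `values` list holding nested InstanceValue constructs) is an assumption made for the example, and `iter_field_values` is a hypothetical helper, not part of any binding.

```python
from typing import Iterator, List, Tuple

KeyPath = Tuple[str, ...]


def iter_field_values(instance_values: List[dict],
                      path: KeyPath = ()) -> Iterator[Tuple[KeyPath, dict]]:
    """Yield (key path, FieldValue) pairs, descending into nested instances."""
    for entry in instance_values:
        here = path + (entry["key"],)
        if entry.get("kind") == "NestedTemplateInstance":
            # Nested instance structure mirrors nested template structure,
            # so the same function handles every level.
            yield from iter_field_values(entry.get("values", []), here)
        else:
            yield here, entry
```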
Conformance
A TemplateInstance MUST conform to the structure implied by its referenced Template.
A conforming instance MUST use EmbeddedArtifactKey values that identify embedded data-bearing artifacts in that template context.
Textual instance values MAY include language tags.
TextValue carries a lexical form and an optional language tag.
Numeric instance values carry a lexical form together with the corresponding XSD datatype: IntegerNumberValue is fixed at xsd:integer; RealNumberValue carries an explicit datatype (xsd:decimal, xsd:float, or xsd:double).
Date, time, and date-time instance values are represented separately by DateValue, TimeValue, and DateTimeValue, each carrying its own lexical form. Within DateValue, YearValue and YearMonthValue carry plain strings matching YYYY and YYYY-MM respectively; FullDateValue carries an xsd:date lexical form.
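As a rough illustration of the three DateValue arms, the check below tests a lexical form against the shape expected for a given arm. It is a simplified sketch: the arm names are used as plain strings, the full-date check accepts only four-digit years without a timezone offset (xsd:date itself permits more), and `matches_date_arm` is a hypothetical helper name.

```python
import re
from datetime import date

YEAR = re.compile(r"[0-9]{4}")
YEAR_MONTH = re.compile(r"[0-9]{4}-(0[1-9]|1[0-2])")
FULL_DATE = re.compile(r"([0-9]{4})-([0-9]{2})-([0-9]{2})")


def matches_date_arm(arm: str, lexical_form: str) -> bool:
    """Return True if `lexical_form` has the shape expected for `arm`."""
    if arm == "YearValue":
        return YEAR.fullmatch(lexical_form) is not None
    if arm == "YearMonthValue":
        return YEAR_MONTH.fullmatch(lexical_form) is not None
    if arm == "FullDateValue":
        match = FULL_DATE.fullmatch(lexical_form)
        if match is None:
            return False
        try:
            # datetime.date rejects impossible calendar dates such as 2023-02-29.
            date(int(match.group(1)), int(match.group(2)), int(match.group(3)))
        except ValueError:
            return False
        return True
    return False
```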
Controlled term instance values SHOULD preserve both a term Iri and a human-readable label. They MAY additionally preserve notation and preferred label information from the source terminology.
External authority instance values SHOULD preserve both the typed authority IRI (OrcidIri, RorIri, DoiIri, PubMedIri, RridIri, or NihGrantIri as appropriate) and, where available, a human-readable label.
Open Questions
- Should TemplateInstance permit partial conformance during authoring workflows, or should the model define only fully conforming instances?
Presentation Components
Overview
PresentationComponent defines reusable presentation or instructional content that may appear within a Template through EmbeddedPresentationComponent.
PresentationComponent is distinct from Field and MUST NOT be treated as a data-bearing schema construct. It is also distinct from SchemaArtifact: presentation components carry plain ArtifactMetadata rather than SchemaArtifactMetadata, since they do not participate in schema versioning.
Artifact shape
Every concrete PresentationComponent carries the following common slots:
- PresentationComponentId: the artifact’s identity IRI.
- ModelVersion: the version of the CEDAR structural model the artifact conforms to (hoisted to top-level on every concrete artifact).
- ArtifactMetadata: the artifact’s name, description, lifecycle, optional annotations, etc.
- a per-variant body: the substantive content of the component (HTML, image IRI, video IRI, or, for the structural break components, empty).
Defined Components
This specification defines the following PresentationComponent variants:
| Variant | Body |
|---|---|
| RichTextComponent | HtmlContent (an HTML string for rendered presentation) |
| ImageComponent | Iri for the image source, with optional Label and Description |
| YoutubeVideoComponent | Iri for the video source, with optional Label and Description |
| SectionBreakComponent | (no body); contributes sectional separation in a rendered form |
| PageBreakComponent | (no body); contributes pagination structure |
These constructs replace the older practice of treating static presentation constructs as field variants.
Embedding
Presentation constructs appear in a Template only through EmbeddedPresentationComponent.
An EmbeddedPresentationComponent carries:
- EmbeddedArtifactKey: the local key identifying this embedding within the containing Template.
- PresentationComponentId: the artifactRef to the reusable PresentationComponent being embedded.
- optional Visibility: the rendering visibility of the embedded component.
It does not carry a value requirement, cardinality, default value, label override, or semantic property IRI: the component contributes no instance data and exists purely to contribute presentational structure.
Instance Semantics
PresentationComponent does not produce InstanceValue.
Conforming implementations MUST NOT create FieldValue, NestedTemplateInstance, or any other InstanceValue for a PresentationComponent. The EmbeddedArtifactKey of an EmbeddedPresentationComponent MUST NOT appear as the key of any InstanceValue in a conforming TemplateInstance.
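A validator could enforce the key restriction with a check along these lines. The sketch assumes the caller has already collected the keys used by the instance and the keys of the template's EmbeddedPresentationComponent entries; the function name `presentation_key_violations` and the example key names are hypothetical.

```python
from typing import Iterable, List, Set


def presentation_key_violations(instance_keys: Iterable[str],
                                presentation_keys: Set[str]) -> List[str]:
    """Return the instance keys that name an EmbeddedPresentationComponent.

    A conforming TemplateInstance yields an empty list.
    """
    return sorted(key for key in instance_keys if key in presentation_keys)
```

For example, `presentation_key_violations(["title", "intro_text"], {"intro_text", "page_break_1"})` reports `["intro_text"]` as an offending key.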
Help-Text Rendering
This section is normative for conforming form renderers. The structural model carries help-text content on the Field artifact (HelpText) and optional per-embedding overrides on EmbeddedField (HelpTextOverride). How that content is presented at form-render time is governed by the enclosing Template’s HelpDisplayMode.
Effective help-text resolution
At each EmbeddedField site, the effective help text is determined as follows:
- If the EmbeddedField carries a HelpTextOverride: the effective help text is the override’s value.
- Otherwise, if the referenced Field carries a HelpText: the effective help text is the field’s value.
- Otherwise, the effective help text is empty; the renderer displays no help for this field regardless of HelpDisplayMode.
The override is replace, not merge: localizations present in the field’s HelpText but absent from the embedding’s HelpTextOverride do not fall back into the resolved content.
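The resolution steps and the replace-not-merge behaviour can be sketched as below. For brevity the example models a MultilingualString as a dictionary from language tag to text, which is an illustrative simplification rather than the wire shape; `effective_help_text` is a hypothetical helper name.

```python
from typing import Dict, Optional

# Illustrative stand-in: language tag -> localized text.
MultilingualString = Dict[str, str]


def effective_help_text(override: Optional[MultilingualString],
                        field_help: Optional[MultilingualString]) -> MultilingualString:
    """Resolve help text for one EmbeddedField site."""
    if override is not None:
        # Replace, not merge: localizations present only on the Field are dropped.
        return dict(override)
    if field_help is not None:
        return dict(field_help)
    return {}
```

With a field-level help text in English and French and an override that supplies English only, the resolved content contains just the English override; the French text does not fall back in.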
Display-mode selection
The presentation of effective help text at a given embedding site is governed by the HelpDisplayMode resolved per the cascade rule below:
- "inline": render the effective help text as visible text adjacent to the field, typically beneath the input. This is the default when no mode is set.
- "tooltip": render as a hover/focus tooltip, triggered by a ? icon or other discoverable affordance. Conforming renderers MUST also make the text available to assistive technologies.
- "both": emit both the inline rendering and the tooltip rendering. Recommended for accessibility-sensitive contexts where redundancy is preferred.
- "none": do not render the effective help text at form-render time. The content remains part of the model and is available to alternative renderers (catalog browsers, RDF projectors, etc.).
Cascade rule for nested templates
HelpDisplayMode cascades from the outermost Template in a form to every field rendered within that form, including fields contributed by nested templates referenced via EmbeddedTemplate. Specifically:
- When a Template T_outer embeds another Template T_inner via EmbeddedTemplate, the renderer MUST use T_outer’s HelpDisplayMode (or its default if unset) when rendering fields contributed by T_inner.
- T_inner’s own HelpDisplayMode is ignored for help-text rendering at that embedding site.
- T_inner’s HelpDisplayMode applies only when T_inner is rendered standalone (e.g., previewed in authoring tooling as a reusable artifact, or used as the top-level template in another context).
This rule is specific to HelpDisplayMode. Future TemplateRenderingHint slots may define different cascade behaviour and MUST state their cascade rule explicitly.
When HelpDisplayMode is absent from a Template — either because the template carries no TemplateRenderingHint, or because the hint omits the slot — the resolved mode is "inline".
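The cascade rule reduces to "the outermost template decides, and an unset mode means inline". A minimal sketch follows, assuming the renderer tracks the chain of templates from outermost to innermost as a list of optional mode strings; the function and constant names are hypothetical.

```python
from typing import List, Optional

DEFAULT_HELP_DISPLAY_MODE = "inline"


def resolve_help_display_mode(template_modes: List[Optional[str]]) -> str:
    """Resolve the mode for a field rendered inside a chain of templates.

    `template_modes[0]` is the outermost Template's HelpDisplayMode, or None
    if it is unset. Modes of nested templates are ignored at this site.
    """
    if not template_modes:
        raise ValueError("at least the outermost template is required")
    outermost = template_modes[0]
    return outermost if outermost is not None else DEFAULT_HELP_DISPLAY_MODE
```

Note that an unset outer mode resolves to the default and does not fall through to the inner template's mode.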
Placeholder Rendering
This section is normative for conforming form renderers. The structural model carries placeholder content on rendering-hint productions attached to text-entry-capable field families (see grammar.md §Field Specs and §Rendering Hints, and the new Placeholder production). How that content is presented at form-render time is governed by the rules below.
What Placeholder is
Placeholder is a MultilingualString-valued slot on every rendering hint attached to a text-entry-capable field family. It carries sample input text — typically a short format demonstration such as "YYYY-MM-DD", "john.doe@example.com", or "https://orcid.org/0000-0000-0000-0000" — intended to be displayed inside an empty text-entry widget and to disappear once the user begins typing.
Placeholder is not semantic content about the field’s meaning; that is the role of HelpText. The two slots may coexist on the same field: HelpText explains what the field is for, Placeholder demonstrates what the typed input looks like.
Rendering requirements
Conforming renderers:
- SHOULD display the effective Placeholder content inside text-entry input widgets when those inputs are empty.
- MUST NOT display Placeholder content in a way that could be mistaken for a user-supplied value. Placeholders MUST be visually distinguishable from real input, conventionally via reduced opacity, italics, or a contrasting style.
- MAY omit Placeholder rendering when accessibility concerns warrant (some screen readers handle the HTML placeholder attribute poorly). When Placeholder is omitted from the visual rendering, the renderer SHOULD ensure the same content is available through HelpText or another accessible affordance if it conveys information the user otherwise lacks.
Localization selection
When Placeholder carries multiple language-tagged localizations, the renderer selects the entry whose LanguageTag best matches the user’s preferred display language, falling back per the spec’s existing MultilingualString-localization-preference rules. No new rule is introduced for Placeholder; it follows the same selection convention as HelpText, PreferredLabel, and other MultilingualString-valued display content.
Relationship to value validation
Placeholder content is purely presentational. It is not validated against the field spec’s value constraints (validationRegex, langTagRequirement, timezoneRequirement, minLength, maxLength, etc.). A placeholder of "YYYY-MM-DD" may appear on a date field whose values are constrained to ISO 8601 — the placeholder is a demonstration of the expected lexical shape, not an instance of one. Conforming validators MUST NOT apply field-spec value constraints to placeholder content.
Open Questions
- Model revision candidate: The current model requires all PresentationComponent variants to carry full reusable artifact identity. This is uniform but may be unnecessarily heavy for simple structural elements such as PageBreakComponent, which carry no meaningful content and are unlikely to be shared across templates. A future revision should consider whether lightweight inline-only variants could be introduced for such cases, and define the criteria for determining which components warrant reusable identity.
- Which presentation-specific properties belong on the reusable PresentationComponent versus on EmbeddedPresentationComponent?
Field Families
This is a navigation index for the 20 concrete field families defined by the spec. Each row links to the family’s four principal productions in grammar.md:
- Field artifact: the family’s XxxField, a standalone reusable artifact.
- Field spec: the family’s XxxFieldSpec, carried by the standalone Field artifact.
- Value: the family’s instance-value production (XxxValue), carried by a FieldValue in a TemplateInstance.
- Embedded form: the family’s EmbeddedXxxField, used inside a Template’s members to reference the standalone field.
The conformance fixture column points at the per-family Template + Instance pair under normative-tests/valid/ and the standalone Field artifact.
Scalar text and numeric
| Family | Field artifact | Field spec | Value | Embedded form | Fixtures |
|---|---|---|---|---|---|
| Text | TextField | TextFieldSpec | TextValue | EmbeddedTextField | 03–04, 49 |
| Integer number | IntegerNumberField | IntegerNumberFieldSpec | IntegerNumberValue | EmbeddedIntegerNumberField | 05–06, 50 |
| Real number | RealNumberField | RealNumberFieldSpec | RealNumberValue | EmbeddedRealNumberField | 07–10, 51–52 |
| Boolean | BooleanField | BooleanFieldSpec | BooleanValue | EmbeddedBooleanField | 11–12, 53 |
Temporal
| Family | Field artifact | Field spec | Value | Embedded form | Fixtures |
|---|---|---|---|---|---|
| Date | DateField | DateFieldSpec | DateValue (three arms: FullDateValue, YearValue, YearMonthValue) | EmbeddedDateField | 13–18, 54 |
| Time | TimeField | TimeFieldSpec | TimeValue | EmbeddedTimeField | 19–20, 55 |
| Date-time | DateTimeField | DateTimeFieldSpec | DateTimeValue | EmbeddedDateTimeField | 21–22, 56 |
Controlled vocabulary
| Family | Field artifact | Field spec | Value | Embedded form | Fixtures |
|---|---|---|---|---|---|
| Controlled term | ControlledTermField | ControlledTermFieldSpec | ControlledTermValue | EmbeddedControlledTermField | 23–24, 57–60 |
| Single-valued enum | SingleValuedEnumField | SingleValuedEnumFieldSpec | EnumValue | EmbeddedSingleValuedEnumField | 25–26, 61 |
| Multi-valued enum | MultiValuedEnumField | MultiValuedEnumFieldSpec | EnumValue (multi-valued) | EmbeddedMultiValuedEnumField | 27–28, 62 |
Reference and contact
| Family | Field artifact | Field spec | Value | Embedded form | Fixtures |
|---|---|---|---|---|---|
| Link | LinkField | LinkFieldSpec | LinkValue | EmbeddedLinkField | 29–30, 63 |
| Email | EmailField | EmailFieldSpec | EmailValue | EmbeddedEmailField | 31–32, 64 |
| Phone number | PhoneNumberField | PhoneNumberFieldSpec | PhoneNumberValue | EmbeddedPhoneNumberField | 33–34, 65 |
External-authority identifiers
| Family | Field artifact | Field spec | Value | Embedded form | Fixtures |
|---|---|---|---|---|---|
| ORCID | OrcidField | OrcidFieldSpec | OrcidValue | EmbeddedOrcidField | 35–36, 66 |
| ROR | RorField | RorFieldSpec | RorValue | EmbeddedRorField | 37–38, 67 |
| DOI | DoiField | DoiFieldSpec | DoiValue | EmbeddedDoiField | 39–40, 68 |
| PubMed | PubMedIdField | PubMedIdFieldSpec | PubMedIdValue | EmbeddedPubMedIdField | 41–42, 69 |
| RRID | RridField | RridFieldSpec | RridValue | EmbeddedRridField | 43–44, 70 |
| NIH grant | NihGrantIdField | NihGrantIdFieldSpec | NihGrantIdValue | EmbeddedNihGrantIdField | 45–46, 71 |
Open-ended
| Family | Field artifact | Field spec | Value | Embedded form | Fixtures |
|---|---|---|---|---|---|
| Attribute value | AttributeValueField | AttributeValueFieldSpec | AttributeValue | EmbeddedAttributeValueField | 47–48, 72 |
Notes on the groupings
The groupings above are presentational, not normative. The spec does not partition field families into categories at the grammar level; every family is structurally a peer of every other under FieldSpec. The groupings here exist only to make this index easier to scan.
The four family-specific deviations listed in serialization.md §9 are worth recalling at the embedding site:
- EmbeddedBooleanField and EmbeddedSingleValuedEnumField omit cardinality (single-valued by construction).
- EmbeddedMultiValuedEnumField.defaultValue is EnumValue* (a sequence).
- EmbeddedAttributeValueField omits defaultValue.
The six external-authority identifier families (ORCID, ROR, DOI, PubMed, RRID, NIH grant) all share an identical XxxFieldSpec shape — { "kind": "<Family>FieldSpec" } with no per-family slots — but distinct value productions carrying typed IRIs.
Index of Productions
An alphabetical index of every production defined in this specification. The contents of this page are generated automatically from the EBNF blocks across all chapters.
A · B · C · D · E · F · H · I · L · M · N · O · P · R · S · T · U · V · Y
A
- AlternativeLabel — Grammar, Wire Grammar
- Annotation — Grammar, Wire Grammar
- AnnotationIriValue — Grammar, Wire Grammar
- AnnotationStringValue — Grammar, Wire Grammar
- AnnotationValue — Grammar, Wire Grammar
- Artifact — Grammar, Wire Grammar
- AttributeName — Grammar, Wire Grammar
- AttributeValue — Grammar, Wire Grammar
- AttributeValueField — Grammar, Wire Grammar
- AttributeValueFieldId — Grammar, Wire Grammar
- AttributeValueFieldSpec — Grammar, Wire Grammar
B
- BooleanField — Grammar, Wire Grammar
- BooleanFieldId — Grammar, Wire Grammar
- BooleanFieldSpec — Grammar, Wire Grammar
- BooleanRenderingHint — Grammar, Wire Grammar
- BooleanValue — Grammar, Wire Grammar
- BranchSource — Grammar, Wire Grammar
C
- Cardinality — Grammar, Wire Grammar
- CatalogMetadata — Grammar, Wire Grammar
- ClassSource — Grammar, Wire Grammar
- ContactField — Grammar, Wire Grammar
- ContactFieldSpec — Grammar, Wire Grammar
- ControlledTermClass — Grammar, Wire Grammar
- ControlledTermField — Grammar, Wire Grammar
- ControlledTermFieldId — Grammar, Wire Grammar
- ControlledTermFieldSpec — Grammar, Wire Grammar
- ControlledTermRenderingHint — Grammar, Wire Grammar
- ControlledTermSource — Grammar, Wire Grammar
- ControlledTermValue — Grammar, Wire Grammar
- CreatedBy — Grammar, Wire Grammar
- CreatedOn — Grammar, Wire Grammar
D
- DateComponentOrder — Grammar, Wire Grammar
- DateField — Grammar, Wire Grammar
- DateFieldId — Grammar, Wire Grammar
- DateFieldSpec — Grammar, Wire Grammar
- DateRenderingHint — Grammar, Wire Grammar
- DateTimeField — Grammar, Wire Grammar
- DateTimeFieldId — Grammar, Wire Grammar
- DateTimeFieldSpec — Grammar, Wire Grammar
- DateTimeRenderingHint — Grammar, Wire Grammar
- DateTimeValue — Grammar, Wire Grammar
- DateTimeValueType — Grammar, Wire Grammar
- DateValue — Grammar, Wire Grammar
- DateValueType — Grammar, Wire Grammar
- DecimalPlaces — Grammar, Wire Grammar
- DerivedFrom — Grammar, Wire Grammar
- Description — Grammar, Wire Grammar
- DoiField — Grammar, Wire Grammar
- DoiFieldId — Grammar, Wire Grammar
- DoiFieldSpec — Grammar, Wire Grammar
- DoiIri — Grammar, Wire Grammar
- DoiRenderingHint — Grammar, Wire Grammar
- DoiValue — Grammar, Wire Grammar
E
- EmailField — Grammar, Wire Grammar
- EmailFieldId — Grammar, Wire Grammar
- EmailFieldSpec — Grammar, Wire Grammar
- EmailRenderingHint — Grammar, Wire Grammar
- EmailValue — Grammar, Wire Grammar
- EmbeddedArtifact — Grammar, Wire Grammar
- EmbeddedArtifactKey — Grammar, Serialization, Wire Grammar
- EmbeddedAttributeValueField — Grammar, Wire Grammar
- EmbeddedBooleanField — Grammar, Wire Grammar
- EmbeddedControlledTermField — Grammar, Wire Grammar
- EmbeddedDateField — Grammar, Wire Grammar
- EmbeddedDateTimeField — Grammar, Wire Grammar
- EmbeddedDoiField — Grammar, Wire Grammar
- EmbeddedEmailField — Grammar, Wire Grammar
- EmbeddedField — Grammar, Wire Grammar
- EmbeddedIntegerNumberField — Grammar, Wire Grammar
- EmbeddedLinkField — Grammar, Wire Grammar
- EmbeddedMultiValuedEnumField — Grammar, Wire Grammar
- EmbeddedNihGrantIdField — Grammar, Wire Grammar
- EmbeddedOrcidField — Grammar, Wire Grammar
- EmbeddedPhoneNumberField — Grammar, Wire Grammar
- EmbeddedPresentationComponent — Grammar, Wire Grammar
- EmbeddedPubMedIdField — Grammar, Wire Grammar
- EmbeddedRealNumberField — Grammar, Wire Grammar
- EmbeddedRorField — Grammar, Wire Grammar
- EmbeddedRridField — Grammar, Wire Grammar
- EmbeddedSingleValuedEnumField — Grammar, Wire Grammar
- EmbeddedTemplate — Grammar, Wire Grammar
- EmbeddedTextField — Grammar, Wire Grammar
- EmbeddedTimeField — Grammar, Wire Grammar
- EnumField — Grammar, Wire Grammar
- EnumFieldSpec — Grammar, Wire Grammar
- EnumValue — Grammar, Wire Grammar
- ExternalAuthorityField — Grammar, Wire Grammar
- ExternalAuthorityFieldSpec — Grammar, Wire Grammar
- ExternalAuthorityValue — Grammar, Wire Grammar
F
- Field — Grammar, Wire Grammar
- FieldId — Grammar, Wire Grammar
- FieldSpec — Grammar, Wire Grammar
- FieldValue — Grammar, Wire Grammar
- Footer — Grammar, Wire Grammar
- FullDateValue — Grammar, Wire Grammar
H
- Header — Grammar, Serialization, Wire Grammar
- HelpDisplayMode — Grammar, Wire Grammar
- HelpText — Grammar, Wire Grammar
- HelpTextOverride — Grammar, Wire Grammar
- HtmlContent — Grammar, Wire Grammar
I
- Identifier — Grammar, Wire Grammar
- ImageComponent — Grammar, Wire Grammar
- InstanceValue — Grammar, Wire Grammar
- IntegerNumberField — Grammar, Wire Grammar
- IntegerNumberFieldId — Grammar, Wire Grammar
- IntegerNumberFieldSpec — Grammar, Wire Grammar
- IntegerNumberMaxValue — Grammar, Wire Grammar
- IntegerNumberMinValue — Grammar, Wire Grammar
- IntegerNumberValue — Grammar, Wire Grammar
- Iri — Grammar, Wire Grammar
- IsoDateTimeStamp — Grammar, Wire Grammar
L
- Label — Grammar, Wire Grammar
- LabelOverride — Grammar, Wire Grammar
- LangString — Grammar, Wire Grammar
- LangTagRequirement — Grammar, Wire Grammar
- LanguageTag — Grammar, Wire Grammar
- LexicalForm — Grammar, Wire Grammar
- LifecycleMetadata — Grammar, Wire Grammar
- LinkField — Grammar, Wire Grammar
- LinkFieldId — Grammar, Wire Grammar
- LinkFieldSpec — Grammar, Wire Grammar
- LinkRenderingHint — Grammar, Wire Grammar
- LinkValue — Grammar, Wire Grammar
M
- MaxCardinality — Grammar, Wire Grammar
- MaxLength — Grammar, Wire Grammar
- MaxTraversalDepth — Grammar, Wire Grammar
- Meaning — Grammar, Wire Grammar
- MinCardinality — Grammar, Wire Grammar
- MinLength — Grammar, Wire Grammar
- ModelVersion — Grammar, Wire Grammar
- ModifiedBy — Grammar, Wire Grammar
- ModifiedOn — Grammar, Wire Grammar
- MultilingualString — Grammar, Wire Grammar
- MultiValuedEnumField — Grammar, Wire Grammar
- MultiValuedEnumFieldId — Grammar, Wire Grammar
- MultiValuedEnumFieldSpec — Grammar, Wire Grammar
- MultiValuedEnumRenderingHint — Grammar, Wire Grammar
N
- NestedTemplateInstance — Grammar, Wire Grammar
- NihGrantIdField — Grammar, Wire Grammar
- NihGrantIdFieldId — Grammar, Wire Grammar
- NihGrantIdFieldSpec — Grammar, Wire Grammar
- NihGrantIdRenderingHint — Grammar, Wire Grammar
- NihGrantIdValue — Grammar, Wire Grammar
- NihGrantIri — Grammar, Wire Grammar
- NonNegativeInteger — Grammar, Serialization, Wire Grammar
- Notation — Grammar, Wire Grammar
- NumericField — Grammar, Wire Grammar
- NumericFieldSpec — Grammar, Wire Grammar
- NumericRenderingHint — Grammar, Wire Grammar
- NumericValue — Grammar, Wire Grammar
O
- OntologyAcronym — Grammar, Wire Grammar
- OntologyDisplayHint — Grammar, Wire Grammar
- OntologyIri — Grammar, Wire Grammar
- OntologyName — Grammar, Wire Grammar
- OntologyReference — Grammar, Wire Grammar
- OntologySource — Grammar, Wire Grammar
- OrcidField — Grammar, Wire Grammar
- OrcidFieldId — Grammar, Wire Grammar
- OrcidFieldSpec — Grammar, Wire Grammar
- OrcidIri — Grammar, Wire Grammar
- OrcidRenderingHint — Grammar, Wire Grammar
- OrcidValue — Grammar, Wire Grammar
P
- PageBreakComponent — Grammar, Wire Grammar
- PermissibleValue — Grammar, Wire Grammar
- PhoneNumberField — Grammar, Wire Grammar
- PhoneNumberFieldId — Grammar, Wire Grammar
- PhoneNumberFieldSpec — Grammar, Wire Grammar
- PhoneNumberRenderingHint — Grammar, Wire Grammar
- PhoneNumberValue — Grammar, Wire Grammar
- Placeholder — Grammar, Wire Grammar
- PreferredLabel — Grammar, Wire Grammar
- PresentationComponent — Grammar, Wire Grammar
- PresentationComponentId — Grammar, Wire Grammar
- PreviousVersion — Grammar, Wire Grammar
- Property — Grammar, Wire Grammar
- PropertyIri — Grammar, Wire Grammar
- PropertyLabel — Grammar, Wire Grammar
- PubMedIdField — Grammar, Wire Grammar
- PubMedIdFieldId — Grammar, Wire Grammar
- PubMedIdFieldSpec — Grammar, Wire Grammar
- PubMedIdRenderingHint — Grammar, Wire Grammar
- PubMedIdValue — Grammar, Wire Grammar
- PubMedIri — Grammar, Wire Grammar
R
- RealNumberDatatypeKind — Grammar, Wire Grammar
- RealNumberField — Grammar, Wire Grammar
- RealNumberFieldId — Grammar, Wire Grammar
- RealNumberFieldSpec — Grammar, Wire Grammar
- RealNumberMaxValue — Grammar, Wire Grammar
- RealNumberMinValue — Grammar, Wire Grammar
- RealNumberValue — Grammar, Wire Grammar
- RenderingHint — Grammar, Wire Grammar
- RichTextComponent — Grammar, Wire Grammar
- RootTermIri — Grammar, Wire Grammar
- RootTermLabel — Grammar, Wire Grammar
- RorField — Grammar, Wire Grammar
- RorFieldId — Grammar, Wire Grammar
- RorFieldSpec — Grammar, Wire Grammar
- RorIri — Grammar, Wire Grammar
- RorRenderingHint — Grammar, Wire Grammar
- RorValue — Grammar, Wire Grammar
- RridField — Grammar, Wire Grammar
- RridFieldId — Grammar, Wire Grammar
- RridFieldSpec — Grammar, Wire Grammar
- RridIri — Grammar, Wire Grammar
- RridRenderingHint — Grammar, Wire Grammar
- RridValue — Grammar, Wire Grammar
S
- SchemaArtifact — Grammar, Wire Grammar
- SchemaArtifactVersioning — Grammar, Wire Grammar
- SectionBreakComponent — Grammar, Wire Grammar
- SingleValuedEnumField — Grammar, Wire Grammar
- SingleValuedEnumFieldId — Grammar, Wire Grammar
- SingleValuedEnumFieldSpec — Grammar, Wire Grammar
- SingleValuedEnumRenderingHint — Grammar, Wire Grammar
- Status — Grammar, Wire Grammar
T
- Template — Grammar, Wire Grammar
- TemplateId — Grammar, Wire Grammar
- TemplateInstance — Grammar, Wire Grammar
- TemplateInstanceId — Grammar, Wire Grammar
- TemplateRenderingHint — Grammar, Wire Grammar
- TemporalField — Grammar, Wire Grammar
- TemporalFieldSpec — Grammar, Wire Grammar
- TermIri — Grammar, Wire Grammar
- TextField — Grammar, Wire Grammar
- TextFieldId — Grammar, Wire Grammar
- TextFieldSpec — Grammar, Wire Grammar
- TextLineMode — Grammar, Wire Grammar
- TextRenderingHint — Grammar, Wire Grammar
- TextValue — Grammar, Wire Grammar
- TimeField — Grammar, Wire Grammar
- TimeFieldId — Grammar, Wire Grammar
- TimeFieldSpec — Grammar, Wire Grammar
- TimeFormat — Grammar, Wire Grammar
- TimePrecision — Grammar, Wire Grammar
- TimeRenderingHint — Grammar, Wire Grammar
- TimeValue — Grammar, Wire Grammar
- TimezoneRequirement — Grammar, Wire Grammar
- Title — Grammar, Wire Grammar
- Token — Grammar, Wire Grammar
U
- Unit — Grammar, Wire Grammar
V
- ValidationRegex — Grammar, Wire Grammar
- Value — Grammar, Wire Grammar
- ValueRequirement — Grammar, Wire Grammar
- ValueSetIdentifier — Grammar, Wire Grammar
- ValueSetIri — Grammar, Wire Grammar
- ValueSetName — Grammar, Wire Grammar
- ValueSetSource — Grammar, Wire Grammar
- Version — Grammar, Wire Grammar
- Visibility — Grammar, Wire Grammar
Y
- YearMonthValue — Grammar, Wire Grammar
- YearValue — Grammar, Wire Grammar
- YoutubeVideoComponent — Grammar, Wire Grammar
CTM 1.6.0 Serialization Mapping
1. Purpose
This document specifies a one-directional, function-based mapping from the CEDAR Structural Model (defined in spec/grammar.md) to CTM 1.6.0 JSON-LD format. The Structural Model remains the authoritative definition of the model; this document defines how constructs in that model are encoded as CTM 1.6.0 JSON-LD values. Each encoding function takes one or more abstract grammar constructs as arguments and produces a JSON value. The functions are defined precisely enough to be directly implementable.
What CTM 1.6.0 is
CTM 1.6.0 (CEDAR Template Model version 1.6.0) is the concrete JSON-LD format used by the CEDAR Workbench to store and exchange metadata templates and their filled-in instances. A CTM 1.6.0 document is a JSON object that simultaneously serves three roles: it is a JSON-LD document (it carries @context, @id, and @type for RDF interpretation), a JSON Schema document (it carries $schema, type, properties, and required so that conforming instances can be validated), and a CEDAR-specific descriptor (it carries _valueConstraints and _ui keys understood by CEDAR tooling). These three concerns are all mixed into the same flat JSON object rather than being kept separate.
The abstract model vs. the serialization
The CEDAR Structural Model (spec/grammar.md) is the authoritative, format-independent definition of what a template means. It describes templates, fields, embedded artifacts, and instances in abstract terms — without committing to any particular wire format. This document defines how to translate that abstract model into CTM 1.6.0 JSON-LD. The mapping is one-directional (abstract → concrete) and lossy in places: some Structural Model constructs have no CTM 1.6.0 equivalent and are dropped (see Section 14).
Caution: This mapping is not round-trippable. Encoding a Structural Model construct to CTM 1.6.0 and then decoding back will not always recover the original construct. See Section 14 for a full list of known gaps and lossy areas before implementing.
Key structural ideas
Templates and fields are separate reusable artifacts. In the Structural Model, a Template does not contain Field objects directly — it contains EmbeddedField references that point to separately-defined Field artifacts. When encoding, information from both the embedding (EmbeddedField) and the referenced definition (Field) must be combined. Most field schema content (value shape, value constraints, UI hints) ends up inside the template’s "properties" object, keyed by the embedding’s key identifier.
Embedded artifact information is distributed across four top-level keys. For each field or nested template in a template, there is no single output key that corresponds to it. Instead its information is spread across "properties" (the field schema), "required" (whether it is mandatory), "_ui" (display order and label overrides), and "@context" (the property IRI mapping for JSON-LD). Understanding this distribution is essential to reading the encoding functions correctly.
The field object carries both schema structure and rendering hints. Each field’s entry inside "properties" is itself a JSON object that combines JSON Schema structure (what type of value the field holds, expressed via "properties", "required", "additionalProperties") with CTM-specific keys ("_valueConstraints" for validation rules such as required/optional, numeric type, or controlled term sources; "_ui" for rendering instructions such as input type and visibility). These are merged into a single flat field object.
Instance values are plain JSON-LD objects. A template instance is a flat JSON object whose keys are the field key identifiers from the template. Each key maps to a small JSON-LD value object — typically { "@value": "..." } for text and numeric fields, or { "@id": "..." } for IRI-valued fields. Multi-valued fields produce a JSON array of such objects. The template’s "@context" is reused in the instance so that each field key resolves to its property IRI for RDF interpretation.
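The shape of an instance slot described above can be illustrated with a short sketch. It is not one of the encoding functions defined later in this document: the helper names `to_jsonld_value` and `to_instance_slot` are hypothetical, values are passed as plain strings, and datatype and language annotations are ignored.

```python
from typing import Any, Dict, List, Union

JsonObject = Dict[str, Any]


def to_jsonld_value(value: str, iri_valued: bool = False) -> JsonObject:
    """Wrap one value as a JSON-LD value object."""
    return {"@id": value} if iri_valued else {"@value": value}


def to_instance_slot(values: List[str],
                     multi_valued: bool,
                     iri_valued: bool = False) -> Union[JsonObject, List[JsonObject]]:
    """Build the JSON placed under a field key in the instance object."""
    encoded = [to_jsonld_value(v, iri_valued) for v in values]
    if multi_valued:
        # Multi-valued fields produce an array of value objects.
        return encoded
    if len(encoded) != 1:
        raise ValueError("a single-valued field needs exactly one value")
    return encoded[0]
```

A text field holding "hello" would thus appear as `{"@value": "hello"}`, while a multi-valued field holding "a" and "b" would appear as a two-element array of such objects.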
Call graph
The diagram below shows the main call relationships between encoding functions. Nodes marked ×N represent a group of similar functions; see the relevant section for the individual entries. Dashed arrows indicate recursion.
flowchart TD
ET(["encode_template"])
ETI(["encode_template_instance"])
subgraph S5["§5 · Metadata"]
EAM["encode_artifact_metadata"]
ECM["encode_catalog_metadata"]
ETPR["encode_temporal_provenance"]
ESV["encode_schema_artifact_versioning"]
EAM --> ECM
EAM --> ETPR
EAM --> ESV
end
subgraph S6["§6 · Template Structure"]
ETC["encode_template_context"]
ETP["encode_template_properties"]
ETR["encode_template_required"]
ETUI["encode_template_ui"]
ETC --> EPCE["encode_property_context_entry"]
end
subgraph S7["§7 · Embedded Artifacts"]
EEAS["encode_embedded_artifact_schema"]
EEFS["encode_embedded_field_schema"]
EETS["encode_embedded_template_schema"]
EEPCS["encode_embedded_presentation\n_component_schema"]
EEAS --> EEFS
EEAS --> EETS
EEAS --> EEPCS
end
subgraph S8["§8 · Field"]
EF["encode_field"]
end
subgraph S9["§9 · Field Specs"]
EFT["encode_*_field_spec ×13"]
EEC["encode_embedding_constraints"]
EEUI["encode_embedding_ui"]
EFT --> EEC
EFT --> EEUI
end
ETE["encode_template_element §10"]
subgraph S11["§11 · Values"]
EV["encode_value"]
EVX["encode_*_value ×12"]
EV --> EVX
end
subgraph S12["§12 · Instance"]
EFV["encode_field_value"]
ENTS["encode_nested_template_instance_slot"]
end
ET --> EAM
ET --> ETC
ET --> ETP
ET --> ETR
ET --> ETUI
ETP --> EEAS
EEFS --> EF
EETS --> ETE
ETE -->|"reuses"| ETC
ETE -->|"reuses"| ETP
ETE -->|"reuses"| ETR
ETE -->|"reuses"| ETUI
ETE --> EAM
EF --> EAM
EF -->|"dispatch"| EFT
ETI --> ETC
ETI --> EAM
ETI --> EFV
ETI --> ENTS
EFV --> EV
ENTS -.->|"recursive"| ETI
2. Conventions
Function Signature Form
encode_X(x: X) → JSON-kind
X is a grammar production name, x is the parameter, and JSON-kind is one of: Object, String, Array, Number, Boolean, or null.
JSON Notation in Function Bodies
- `{ k₁: v₁, k₂: v₂ }`: JSON object literal
- `[ v₁, v₂ ]`: JSON array
- `"..."`: literal string
- `null`: JSON null
- `omit`: the key is absent from the output (not even present as null)
Accessor Notation
Dot notation is used on grammar constructs, e.g. T.schema_artifact_metadata or E.embedded_artifact_key. Where a grammar construct wraps a primitive string (e.g. Identifier ::= identifier(string)), write D.identifier.string to reach the string value.
Helper Functions
- `key(E)`: the ASCII identifier string of E’s EmbeddedArtifactKey; defined as `E.embedded_artifact_key.ascii_identifier`
- `iri(I)`: the IRI string of an Iri construct; defined as `I.iri_string`
- `merge(a, b, ...)`: merge JSON objects left-to-right; later objects take precedence on key conflicts
- `if P then k: v`: include key k with value v only when predicate P holds; otherwise omit
- `[ x(E) for each E in xs ]`: JSON array built by evaluating x(E) for each element E of sequence xs
- `{ k(E): v(E) for each E in xs }`: JSON object built from key-value pairs, one per element (inline form); xs must be a plain sequence with no inline filter. In multi-line blocks, the iteration clause comes first: `{ for each E in xs: k(E): v(E) }`
- `let x = expr`: within a function body, binds the name x to the value of expr; x may then be used in subsequent expressions in the same body
- `[ E in xs | P(E) ]`: the subsequence of xs retaining only those elements for which predicate P(E) holds; used in let bindings to pre-filter before passing to a comprehension
- `xs ++ ys`: concatenation of arrays xs and ys
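As an illustration only, the notation above maps naturally onto Python dictionaries and comprehensions. The sketch below assumes JSON objects are represented as plain dict values; the OMIT sentinel, the conditional helper, and the sample data are hypothetical and not part of the specification.

```python
from typing import Any, Dict

# Hypothetical sentinel for the spec's "omit": the key is left out entirely.
OMIT = object()


def merge(*objects: Dict[str, Any]) -> Dict[str, Any]:
    """Merge JSON objects left-to-right; later objects win on key conflicts."""
    result: Dict[str, Any] = {}
    for obj in objects:
        for key, value in obj.items():
            if value is not OMIT:  # an omitted entry contributes nothing
                result[key] = value
    return result


def conditional(predicate: bool, key: str, value: Any) -> Dict[str, Any]:
    """'if P then k: v': a one-entry object when P holds, otherwise empty."""
    return {key: value} if predicate else {}


# '[ E in xs | P(E) ]' and '{ k(E): v(E) for each E in xs }' as comprehensions.
xs = [1, 2, 3, 4]
evens = [e for e in xs if e % 2 == 0]
squares = {str(e): e * e for e in evens}

merged = merge({"a": 1, "b": 2}, {"b": 3}, conditional(False, "c", 4))
```

Note that a key explicitly set to null (None) is kept by merge; only omitted keys are dropped, which mirrors the distinction the notation draws between null and omit.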
Cardinality Helper
- `is_multi(E)`: true if E.cardinality is present and its max_cardinality is either UnboundedCardinality or a NonNegativeInteger greater than 1
Default Conventions
- When Cardinality is absent, effective min = 1, effective max = 1 (single-valued).
- When ValueRequirement is absent, effective requirement is Optional.
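The cardinality helper and the two defaults can be sketched as follows. This is a minimal illustration, assuming a simplified in-memory model: the Cardinality and Embedding dataclasses, the UNBOUNDED marker, and the lowercase requirement strings are hypothetical stand-ins for the grammar constructs.

```python
from dataclasses import dataclass
from typing import Optional, Union

UNBOUNDED = "unbounded"  # hypothetical stand-in for UnboundedCardinality


@dataclass
class Cardinality:
    min_cardinality: Optional[int] = None
    max_cardinality: Union[int, str, None] = None


@dataclass
class Embedding:
    cardinality: Optional[Cardinality] = None
    value_requirement: Optional[str] = None


def is_multi(e: Embedding) -> bool:
    """True only when a cardinality is present and its max is unbounded or > 1."""
    if e.cardinality is None:
        return False  # absent cardinality means min = max = 1
    mx = e.cardinality.max_cardinality
    return mx == UNBOUNDED or (isinstance(mx, int) and mx > 1)


def effective_requirement(e: Embedding) -> str:
    """An absent ValueRequirement defaults to optional."""
    return e.value_requirement or "optional"
```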
3. Worked Example
This section traces a minimal template and a corresponding instance through the encoding functions. The goal is to show concretely what the abstract model constructs look like as CTM 1.6.0 JSON-LD, and which functions are responsible for each part of the output.
3.1 The Example Model
Template — “Sample Record”
| Property | Value |
|---|---|
| template_id | https://repo.example.org/templates/sample-record |
| Name | "Sample Record" |
| Description | "A minimal metadata template for biological samples" |
| Version | 1.0.0 |
| Status | "draft" |
| Model version | 1.6.0 |
| Created / modified | 2024-01-15T10:00:00Z by https://orcid.example.org/0000-0001-2345-6789 |
Two embedded fields:
| Key | Property IRI | ValueRequirement | FieldSpec |
|---|---|---|---|
| title | https://schema.org/name | "required" | TextFieldSpec (single line) |
| count | https://example.org/sampleCount | Optional | IntegerNumberFieldSpec |
Instance — “Sample 42”
| Property | Value |
|---|---|
| template_instance_id | https://repo.example.org/instances/abc123 |
| schema:name | "Sample 42" |
| Based on | the template above |
| Created / modified | 2024-03-10T09:30:00Z by https://orcid.example.org/0000-0001-2345-6789 |
| title value | TextValue: "Mouse Sample 42" |
| count value | NumericValue: 5 (xsd:integer) |
3.2 Encoding the Template
encode_template(T) assembles the output by calling several sub-functions and merging their results. The annotations below identify the responsible function for each part.
{
// encode_template — fixed identity and schema keys
"@id": "https://repo.example.org/templates/sample-record",
"@type": "https://schema.metadatacenter.org/core/Template",
"$schema": "http://json-schema.org/draft-04/schema#",
"type": "object",
"title": "Sample Record",
"description": "A minimal metadata template for biological samples",
"additionalProperties": false,
// encode_template_context — STANDARD_NS plus one entry per embedded field
// that carries a Property (both do here); encode_property_context_entry
// returns a plain IRI string when no property_label is present
"@context": {
"schema": "http://schema.org/",
"pav": "http://purl.org/pav/",
"oslc": "http://open-services.net/ns/core#",
"bibo": "http://purl.org/ontology/bibo/",
"rdfs": "http://www.w3.org/2000/01/rdf-schema#",
"skos": "http://www.w3.org/2004/02/skos/core#",
"xsd": "http://www.w3.org/2001/XMLSchema#",
"title": "https://schema.org/name",
"count": "https://example.org/sampleCount"
},
// encode_template_properties — fixed instance-metadata entries followed
// by one entry per embedded artifact (encode_embedded_field_schema for each)
"properties": {
"@context": { "type": ["object", "null"] },
"@id": { "type": "string", "format": "uri" },
"schema:isBasedOn": { "type": "string", "format": "uri" },
"schema:name": { "type": "string" },
"schema:description": { "type": ["string", "null"] },
"pav:createdOn": { "type": ["string", "null"], "format": "date-time" },
"pav:createdBy": { "type": ["string", "null"], "format": "uri" },
"pav:lastUpdatedOn": { "type": ["string", "null"], "format": "date-time" },
"oslc:modifiedBy": { "type": ["string", "null"], "format": "uri" },
"title": { /* encode_embedded_field_schema — see Section 3.3 */ },
"count": { /* encode_embedded_field_schema — see Section 3.3 */ }
},
// encode_template_required — fixed keys plus "title" (the only required field)
"required": [
"@context", "@id", "schema:isBasedOn", "schema:name",
"schema:description", "pav:createdOn", "pav:createdBy",
"pav:lastUpdatedOn", "oslc:modifiedBy",
"title"
],
// encode_template_ui — order reflects embedded_artifacts sequence
"_ui": { "order": ["title", "count"] },
// encode_artifact_metadata — metadata keys merged at top level
"schema:name": "Sample Record",
"schema:description": "A minimal metadata template for biological samples",
"pav:version": "1.0.0",
"bibo:status": "bibo:draft",
"schema:schemaVersion": "1.6.0",
"pav:createdOn": "2024-01-15T10:00:00Z",
"pav:createdBy": "https://orcid.example.org/0000-0001-2345-6789",
"pav:lastUpdatedOn": "2024-01-15T10:00:00Z",
"oslc:modifiedBy": "https://orcid.example.org/0000-0001-2345-6789"
}
3.3 Encoding the Embedded Fields
Both fields are single-valued (is_multi = false), so encode_embedded_field_schema returns the field object directly with no array wrapper.
title field — encode_text_field_spec applies STRING_VALUE_SHAPE. encode_embedding_constraints sets requiredValue: true (the embedding is "required"). encode_text_rendering_hint returns "textfield" (absent hint defaults to single-line).
{
"@id": "https://repo.example.org/fields/title",
"@type": "https://schema.metadatacenter.org/core/TemplateField",
"@context": { /* STANDARD_NS */ },
"$schema": "http://json-schema.org/draft-04/schema#",
"type": "object",
"title": "Title",
"description": "",
"properties": {
"@type": { "oneOf": [{ "type": "string", "format": "uri" }, { "type": "null" }] },
"@value": { "type": ["string", "null"] }
},
"required": ["@value"],
"additionalProperties": false,
"_valueConstraints": { "requiredValue": true },
"_ui": { "inputType": "textfield" },
// encode_artifact_metadata for the field:
"schema:name": "Title", "schema:description": null,
"pav:version": "1.0.0", "bibo:status": "bibo:draft",
"schema:schemaVersion": "1.6.0",
"pav:createdOn": "2024-01-15T10:00:00Z", ...
}
count field — encode_integer_number_field_spec applies NUMBER_VALUE_SHAPE and emits "xsd:integer" for the datatype slot (an integer-number field’s category is fixed). encode_embedding_constraints sets requiredValue: false (Optional).
{
"@id": "https://repo.example.org/fields/count",
"@type": "https://schema.metadatacenter.org/core/TemplateField",
"@context": { /* STANDARD_NS */ },
"$schema": "http://json-schema.org/draft-04/schema#",
"type": "object",
"title": "Sample Count",
"description": "",
"properties": {
"@type": { "oneOf": [{ "type": "string", "format": "uri" }, { "type": "null" }] },
"@value": { "type": ["number", "null"] }
},
"required": ["@value"],
"additionalProperties": false,
"_valueConstraints": { "requiredValue": false, "numberType": "xsd:integer" },
"_ui": { "inputType": "numeric" },
"schema:name": "Sample Count", ...
}
3.4 Encoding the Instance
encode_template_instance(I, T) reuses the template context and maps each FieldValue using encode_field_value → encode_value.
{
// reuses encode_template_context(T) — same @context as the template
"@context": {
"schema": "http://schema.org/", /* ... STANDARD_NS ... */
"title": "https://schema.org/name",
"count": "https://example.org/sampleCount"
},
"@id": "https://repo.example.org/instances/abc123",
"schema:isBasedOn": "https://repo.example.org/templates/sample-record",
// encode_artifact_metadata
"schema:name": "Sample 42",
"schema:description": null,
"pav:createdOn": "2024-03-10T09:30:00Z",
"pav:createdBy": "https://orcid.example.org/0000-0001-2345-6789",
"pav:lastUpdatedOn": "2024-03-10T09:30:00Z",
"oslc:modifiedBy": "https://orcid.example.org/0000-0001-2345-6789",
// encode_field_value → encode_text_value (no language tag)
"title": { "@value": "Mouse Sample 42" },
// encode_field_value → encode_integer_number_value
"count": { "@value": "5", "@type": "xsd:integer" }
}
4. Standard Namespace Context Object
STANDARD_NS is the following JSON object. It is included in every @context produced by this mapping.
{
"schema": "http://schema.org/",
"pav": "http://purl.org/pav/",
"oslc": "http://open-services.net/ns/core#",
"bibo": "http://purl.org/ontology/bibo/",
"rdfs": "http://www.w3.org/2000/01/rdf-schema#",
"skos": "http://www.w3.org/2004/02/skos/core#",
"xsd": "http://www.w3.org/2001/XMLSchema#"
}
STATIC_FIELD_NS is the smaller @context used by StaticTemplateField objects (presentation components). It omits rdfs, skos, and xsd.
{
"schema": "http://schema.org/",
"pav": "http://purl.org/pav/",
"bibo": "http://purl.org/ontology/bibo/",
"oslc": "http://open-services.net/ns/core#"
}
5. Metadata Encoding Functions
encode_artifact_metadata(A: Artifact) → Object
CTM 1.6.0 artifacts carry both human-readable metadata and (for schema artifacts) versioning information at the top level of their JSON object. The Structural Model factors these concerns differently: CatalogMetadata carries descriptive properties and lifecycle, SchemaArtifactVersioning is a parallel top-level slot on schema artifacts, and Label/Title are rendered-name slots that live as top-level slots on the artifact itself (Field.label, Template.title, optional TemplateInstance.label).
The CTM 1.6.0 encoder flattens all of these into a single flat property set on the artifact’s JSON object:
merge(
encode_catalog_metadata(A.catalog_metadata, rendered_name_of(A)),
encode_temporal_provenance(A.catalog_metadata.lifecycle),
A is SchemaArtifact ? encode_schema_artifact_versioning(A.versioning) : {}
)
Where rendered_name_of(A) selects the artifact’s rendered display name according to the artifact kind:
- For a Field: A.label (always present).
- For a Template: A.title (always present).
- For a TemplateInstance: A.label if present, otherwise A.catalog_metadata.preferred_label if present, otherwise the artifact id slug.
- For a PresentationComponent: A.catalog_metadata.preferred_label if present, otherwise the artifact id slug.
Calls: encode_catalog_metadata, encode_temporal_provenance, encode_schema_artifact_versioning
encode_catalog_metadata(C: CatalogMetadata, rendered: MultilingualString or null) → Object
Encodes the human-readable identity of an artifact. The schema:name and schema:description keys are always written; schema:identifier and rdfs:label appear only when set in the Structural Model.
CTM 1.6.0 requires a single-string schema:name. The Structural Model carries multiple candidate sources for the artifact’s display name (the rendered slot Label/Title on artifacts that have one, plus the optional catalog slot CatalogMetadata.preferred_label). The rendered parameter is the rendered name chosen for this artifact by encode_artifact_metadata’s rendered_name_of rule. The encoder flattens it to a single string by selecting the en localization if present, else the first localization entry. The same flattened string is also written to rdfs:label for round-trip stability.
Returns a JSON object with the following keys:
| Key | Value | Condition |
|---|---|---|
| "schema:name" | flatten_to_string(rendered), preferring en, else first entry | Always present; falls back to artifact id slug if rendered is null |
| "schema:description" | C.description.unicode_string | null if C.description absent |
| "schema:identifier" | C.identifier.unicode_string | Omit if C.identifier absent |
| "rdfs:label" | flatten_to_string(C.preferred_label) if present, else flatten_to_string(rendered) | Always present |
AlternativeLabel values on CatalogMetadata have no direct CTM 1.6.0 equivalent and are omitted on encode.
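The flattening rule for the display name can be sketched as below. This is an illustration under stated assumptions: a MultilingualString is modelled as an ordered list of (language tag, text) pairs, and schema_name and its id_slug parameter are hypothetical names introduced here.

```python
from typing import List, Optional, Tuple

# Assumption: a MultilingualString is an ordered list of (language tag, text) pairs.
Localizations = List[Tuple[str, str]]


def flatten_to_string(ms: Localizations) -> str:
    """Prefer the 'en' localization; otherwise take the first entry."""
    for lang, text in ms:
        if lang == "en":
            return text
    return ms[0][1]


def schema_name(rendered: Optional[Localizations], id_slug: str) -> str:
    """Value for "schema:name": the flattened rendered name, else the id slug."""
    return flatten_to_string(rendered) if rendered else id_slug
```

For example, a rendered name carrying French and English localizations flattens to the English text, while one carrying only French and German localizations flattens to whichever entry comes first.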
Reverse direction (CTM 1.6.0 import). When importing a CTM 1.6.0 document into the Structural Model, the CTM 1.6.0 schema:name is mapped to the artifact’s rendered slot — label for a Field or TemplateInstance, title for a Template. For a PresentationComponent (which has no rendered slot), schema:name is mapped to CatalogMetadata.preferred_label. If the legacy document carries a non-empty rdfs:label distinct from schema:name, the importer maps rdfs:label to CatalogMetadata.preferred_label so that the registry display name and the rendered display name can diverge after import; otherwise preferred_label is left absent.
encode_temporal_provenance(P: TemporalProvenance) → Object
Records when an artifact was created and last modified, and by whom. All four keys are always present; values are ISO 8601 date-time strings and IRI strings respectively.
Returns a JSON object with the following keys:
| Key | Value |
|---|---|
| "pav:createdOn" | P.created_on.iso_8601_date_time_lexical_form |
| "pav:createdBy" | iri(P.created_by) |
| "pav:lastUpdatedOn" | P.modified_on.iso_8601_date_time_lexical_form |
| "oslc:modifiedBy" | iri(P.modified_by) |
encode_schema_artifact_versioning(V: SchemaArtifactVersioning) → Object
Encodes the version number, publication status, and schema format version of a schema artifact. Optional pav:previousVersion and pav:derivedFrom links are included only when the Structural Model carries them.
Returns a JSON object with the following keys:
| Key | Value | Condition |
|---|---|---|
| "pav:version" | V.version.semantic_version | Always present |
| "bibo:status" | encode_status(V.status) | Always present |
| "schema:schemaVersion" | V.model_version.semantic_version | Always present |
| "pav:previousVersion" | iri(V.previous_version.iri) | Omit if V.previous_version absent |
| "pav:derivedFrom" | iri(V.derived_from.iri) | Omit if V.derived_from absent |
Calls: encode_status
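A minimal sketch of the versioning encoder follows, with the two-entry status mapping of encode_status inlined as a dictionary. It assumes the version, status, and IRI values have already been reduced to plain strings; the parameter names are illustrative rather than taken from the grammar.

```python
from typing import Dict, Optional

# The two-valued Status enumeration mapped to its bibo: vocabulary string.
STATUS_TO_BIBO = {"draft": "bibo:draft", "published": "bibo:published"}


def encode_schema_artifact_versioning(
    version: str,
    status: str,
    model_version: str,
    previous_version: Optional[str] = None,
    derived_from: Optional[str] = None,
) -> Dict[str, str]:
    out = {
        "pav:version": version,
        "bibo:status": STATUS_TO_BIBO[status],
        "schema:schemaVersion": model_version,
    }
    # Optional links are omitted (not written as null) when absent.
    if previous_version is not None:
        out["pav:previousVersion"] = previous_version
    if derived_from is not None:
        out["pav:derivedFrom"] = derived_from
    return out


# Values from the "Sample Record" worked example.
sample = encode_schema_artifact_versioning("1.0.0", "draft", "1.6.0")
```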
encode_status(S: Status) → String
Maps the two-valued Status enumeration to its corresponding bibo: vocabulary string.
Returns the string corresponding to the Status kind:
| Status kind | Returns |
|---|---|
| "draft" | "bibo:draft" |
| "published" | "bibo:published" |
6. Template Encoding
encode_template(T: Template) → Object
The top-level template object is the root of a CTM 1.6.0 template document. It is produced by merging several independently constructed fragments into one flat JSON object.
A key characteristic of this encoding is that information about each embedded field is spread across multiple top-level keys; it does not appear under a single nested key. For each embedded field E referencing a field F:
- "properties" receives an entry at key(E) containing the full field schema (value shape, constraints, and UI hints) produced by encode_embedded_field_schema.
- "required" receives key(E) if the embedding’s value requirement is "required".
- "_ui" receives key(E) in its "order" array (and optionally in "propertyLabels"), derived from the embedding itself.
- "@context" receives key(E) mapped to the field’s property IRI, if the embedding carries a Property.
This means the _ui key contains ordering and display information drawn from the embedding (EmbeddedField), while properties[key(E)] contains schema information drawn from the referenced Field. There is no single place in the output that corresponds one-to-one with an EmbeddedField — the embedding’s information is hoisted and distributed across these four top-level keys.
merge(
{
"@id": iri(T.template_id),
"@type": "https://schema.metadatacenter.org/core/Template",
"@context": encode_template_context(T),
"$schema": "http://json-schema.org/draft-04/schema#",
"type": "object",
"title": T.schema_artifact_metadata.artifact_metadata.descriptive_metadata.name.unicode_string,
"description": T.schema_artifact_metadata.artifact_metadata.descriptive_metadata.description.unicode_string
if description is present, else "",
"properties": encode_template_properties(T),
"required": encode_template_required(T),
"additionalProperties": false,
"_ui": encode_template_ui(T)
},
encode_artifact_metadata(T)
)
Calls: encode_template_context, encode_template_properties, encode_template_required, encode_template_ui, encode_artifact_metadata
encode_template_context(T: Template) → Object
The @context maps compact term names to full IRIs for JSON-LD interpretation. Every template context begins with STANDARD_NS. For each data-bearing embedded artifact that carries a Property, an additional entry maps the artifact’s key string to its property IRI — or to a labelled mapping object if a property_label is also present. Artifacts without a Property (such as presentation components) contribute no context entry.
let embedded_properties = [ E in T.embedded_artifacts
| (E is EmbeddedField or E is EmbeddedTemplate) and E.property is present ]
merge(
STANDARD_NS,
{ key(E): encode_property_context_entry(E.property) for each E in embedded_properties }
)
Calls: encode_property_context_entry
encode_property_context_entry(P: Property) → String or Object
Determines the form of a single entry in the template’s @context. When only a property IRI is available the entry is a plain string. When a human-readable label is also present the entry is an object with both @id and rdfs:label to support labelled JSON-LD mapping.
| Condition | Returns |
|---|---|
| P.property_label absent | iri(P.property_iri.iri) |
| P.property_label present | { "@id": iri(P.property_iri.iri), "rdfs:label": P.property_label.unicode_string } |
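The context construction can be sketched as follows, under the assumption that each embedded artifact is reduced to a small dictionary with hypothetical "key", "kind", "property_iri", and "property_label" entries. The sample data mirrors the worked example plus one presentation component.

```python
from typing import Any, Dict, List, Optional, Union

STANDARD_NS = {
    "schema": "http://schema.org/",
    "pav": "http://purl.org/pav/",
    "oslc": "http://open-services.net/ns/core#",
    "bibo": "http://purl.org/ontology/bibo/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}


def encode_property_context_entry(
    property_iri: str, property_label: Optional[str] = None
) -> Union[str, Dict[str, str]]:
    """Plain IRI string without a label; labelled mapping object with one."""
    if property_label is None:
        return property_iri
    return {"@id": property_iri, "rdfs:label": property_label}


def encode_template_context(embedded: List[Dict[str, Any]]) -> Dict[str, Any]:
    # Only data-bearing embeddings that carry a Property contribute an entry.
    with_property = [
        e for e in embedded
        if e["kind"] in ("field", "template") and e.get("property_iri")
    ]
    entries = {
        e["key"]: encode_property_context_entry(e["property_iri"], e.get("property_label"))
        for e in with_property
    }
    return {**STANDARD_NS, **entries}


embedded = [
    {"key": "title", "kind": "field", "property_iri": "https://schema.org/name"},
    {"key": "count", "kind": "field", "property_iri": "https://example.org/sampleCount",
     "property_label": "sample count"},
    {"key": "intro", "kind": "presentation"},
]
context = encode_template_context(embedded)
```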
encode_template_properties(T: Template) → Object
Produces the "properties" object for the template’s JSON Schema layer. This is one of the primary sites where embedded artifacts are encoded — each EmbeddedField and EmbeddedTemplate in the template contributes exactly one entry here, keyed by its EmbeddedArtifactKey.
The output has two parts merged together:
1. Fixed instance-metadata entries. Nine fixed keys (@context, @id, schema:isBasedOn, schema:name, schema:description, and the four provenance keys) are always present. These define the schema for the instance-level metadata properties that every CTM 1.6.0 instance must carry, regardless of what fields the template defines.
2. One entry per embedded artifact. For each EmbeddedArtifact E in the template, the entry at key(E) is produced by encode_embedded_artifact_schema(E). For an EmbeddedField this ultimately encodes the value shape, value constraints, and UI input type of the referenced Field, so the bulk of the field encoding (what kind of value it holds, what type annotations are required, whether it is multi-valued) is expressed here inside properties, not at the top level. For an EmbeddedTemplate the entry contains the full nested element schema. For an EmbeddedPresentationComponent the entry is the StaticTemplateField object produced by encode_embedded_presentation_component_schema (Section 7).
merge(
{
"@context": { "type": ["object", "null"] },
"@id": { "type": "string", "format": "uri" },
"schema:isBasedOn": { "type": "string", "format": "uri" },
"schema:name": { "type": "string" },
"schema:description": { "type": ["string", "null"] },
"pav:createdOn": { "type": ["string", "null"], "format": "date-time" },
"pav:createdBy": { "type": ["string", "null"], "format": "uri" },
"pav:lastUpdatedOn": { "type": ["string", "null"], "format": "date-time" },
"oslc:modifiedBy": { "type": ["string", "null"], "format": "uri" }
},
{
for each E in T.embedded_artifacts:
key(E): encode_embedded_artifact_schema(E)
}
)
Calls: encode_embedded_artifact_schema
encode_template_required(T: Template) → Array
Builds the required array for the template’s JSON Schema. The fixed instance-metadata keys are always required. In addition, any data-bearing embedded artifact whose effective ValueRequirement is "required" contributes its key to this array.
let required_embs = [ E in T.embedded_artifacts
| (E is EmbeddedField or E is EmbeddedTemplate)
and effective value_requirement of E is "required" ]
[ "@context", "@id", "schema:isBasedOn", "schema:name",
"schema:description", "pav:createdOn", "pav:createdBy",
"pav:lastUpdatedOn", "oslc:modifiedBy" ]
++ [ key(E) for each E in required_embs ]
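A short sketch of the same computation, assuming each embedded artifact is reduced to a dictionary with hypothetical "key", "kind", and optional "value_requirement" entries (an absent requirement is treated as optional, per the default conventions):

```python
from typing import Any, Dict, List

FIXED_REQUIRED = [
    "@context", "@id", "schema:isBasedOn", "schema:name",
    "schema:description", "pav:createdOn", "pav:createdBy",
    "pav:lastUpdatedOn", "oslc:modifiedBy",
]


def encode_template_required(embedded: List[Dict[str, Any]]) -> List[str]:
    required_embs = [
        e for e in embedded
        if e["kind"] in ("field", "template")
        and e.get("value_requirement", "optional") == "required"
    ]
    # Fixed keys first, then required embedding keys in sequence order.
    return FIXED_REQUIRED + [e["key"] for e in required_embs]


# The two embedded fields of the worked example.
required = encode_template_required([
    {"key": "title", "kind": "field", "value_requirement": "required"},
    {"key": "count", "kind": "field"},
])
```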
encode_template_ui(T: Template) → Object
Encodes the _ui object for the template. The order entry lists all embedded artifact keys in their sequence order, controlling display order in rendering tools. When any embedding carries a label override, a propertyLabels map is also included. Header and Footer on the template are encoded as "header" and "footer" string keys when present.
let label_embs = [ E in T.embedded_artifacts | E.label_override is present ]
merge(
{ "order": [ key(E) for each E in T.embedded_artifacts ] },
if label_embs is non-empty:
{
"propertyLabels": { key(E): E.label_override.label.unicode_string for each E in label_embs }
},
if T.header is present: { "header": T.header.unicode_string },
if T.footer is present: { "footer": T.footer.unicode_string }
)
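The same assembly can be sketched in Python. The dictionary representation of embeddings and the "label_override" entry name are assumptions made for this illustration only.

```python
from typing import Any, Dict, List, Optional


def encode_template_ui(
    embedded: List[Dict[str, Any]],
    header: Optional[str] = None,
    footer: Optional[str] = None,
) -> Dict[str, Any]:
    # "order" always lists every embedded artifact key in sequence order.
    ui: Dict[str, Any] = {"order": [e["key"] for e in embedded]}
    labels = {
        e["key"]: e["label_override"]
        for e in embedded
        if e.get("label_override") is not None
    }
    if labels:  # "propertyLabels" appears only when some override exists
        ui["propertyLabels"] = labels
    if header is not None:
        ui["header"] = header
    if footer is not None:
        ui["footer"] = footer
    return ui


plain_ui = encode_template_ui([{"key": "title"}, {"key": "count"}])
labelled_ui = encode_template_ui(
    [{"key": "title", "label_override": "Sample title"}, {"key": "count"}],
    header="Sample Record",
)
```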
7. Embedded Artifact Schema Encoding
These functions produce the value placed at properties[key(E)] within the containing template or template element.
encode_embedded_artifact_schema(E: EmbeddedArtifact) → Object
Selects the appropriate encoding function based on whether the embedded artifact is a field, a nested template, or a presentation component. The result becomes the value placed at the artifact’s key in the parent template’s properties object.
Dispatches to the encoding function for the EmbeddedArtifact kind:
| EmbeddedArtifact kind | Encoding function |
|---|---|
| EmbeddedField | encode_embedded_field_schema(E) |
| EmbeddedTemplate | encode_embedded_template_schema(E) |
| EmbeddedPresentationComponent | encode_embedded_presentation_component_schema(E) |
Calls: encode_embedded_field_schema, encode_embedded_template_schema, encode_embedded_presentation_component_schema
encode_embedded_field_schema(E: EmbeddedField) → Object
This function is the bridge between the abstract EmbeddedField and the CTM 1.6.0 JSON Schema representation that an instance validator will actually use. Its job is to produce the value that goes at properties[key(E)] in the containing template.
There are two distinct concerns to resolve here:
1. Single-valued vs. multi-valued. The Structural Model represents cardinality on the EmbeddedField (the embedding), not on the Field definition itself. CTM 1.6.0 expresses multi-valued fields by wrapping the field schema in a JSON Schema array object ("type": "array", "items": ...), with optional minItems and maxItems bounds. Single-valued fields need no wrapper — the field object is used directly. This wrapping decision is therefore made here, at the embedding level, where the cardinality information lives. The is_multi(E) helper encapsulates this check.
2. Merging embedding context into the field encoding. The EmbeddedField also carries embedding-specific properties — most notably whether the field is required and whether it is hidden — that are not part of the reusable Field definition. These are passed down to encode_field via the E parameter so they can be incorporated into _valueConstraints and _ui within the field object itself.
let field_obj = encode_field(referenced_field(E), E)
where referenced_field(E) is the Field identified by the reference in E.
if is_multi(E):
{
"type": "array",
"items": field_obj,
if E.cardinality.min_cardinality is present:
"minItems": E.cardinality.min_cardinality.non_negative_integer.integer_lexical_form (as integer),
if E.cardinality.max_cardinality is present and not UnboundedCardinality:
"maxItems": E.cardinality.max_cardinality.non_negative_integer.integer_lexical_form (as integer)
}
else (single-valued):
field_obj
Calls: encode_field
encode_embedded_template_schema(E: EmbeddedTemplate) → Object
Parallel to encode_embedded_field_schema, but for nested template elements. Single-valued embeddings return the element object directly; multi-valued embeddings (determined by is_multi(E)) wrap it in an array descriptor with cardinality bounds.
let elem_obj = encode_template_element(referenced_template(E), E)
where referenced_template(E) is the Template identified by the reference in E.
if is_multi(E):
{
"type": "array",
"items": elem_obj,
if E.cardinality.min_cardinality is present:
"minItems": E.cardinality.min_cardinality.non_negative_integer.integer_lexical_form (as integer),
if E.cardinality.max_cardinality is present and not UnboundedCardinality:
"maxItems": E.cardinality.max_cardinality.non_negative_integer.integer_lexical_form (as integer)
}
else:
elem_obj
Calls: encode_template_element
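The array wrapping shared by encode_embedded_field_schema and encode_embedded_template_schema can be sketched as one helper. This is an illustration that assumes cardinality bounds have already been converted from their lexical form to Python integers; wrap_for_cardinality and UNBOUNDED are hypothetical names.

```python
from typing import Any, Dict, Optional, Union

UNBOUNDED = "unbounded"  # hypothetical stand-in for UnboundedCardinality


def wrap_for_cardinality(
    item_obj: Dict[str, Any],
    min_cardinality: Optional[int] = None,
    max_cardinality: Union[int, str, None] = None,
) -> Dict[str, Any]:
    """Return item_obj directly when single-valued, else an array descriptor."""
    multi = max_cardinality == UNBOUNDED or (
        isinstance(max_cardinality, int) and max_cardinality > 1
    )
    if not multi:
        return item_obj
    wrapper: Dict[str, Any] = {"type": "array", "items": item_obj}
    if min_cardinality is not None:
        wrapper["minItems"] = min_cardinality
    if max_cardinality != UNBOUNDED:  # unbounded max emits no "maxItems"
        wrapper["maxItems"] = max_cardinality
    return wrapper
```

Keeping the wrapper separate from the field or element encoder reflects the point made above: cardinality belongs to the embedding, so the reusable definition is encoded once and wrapped only where it is embedded.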
encode_embedded_presentation_component_schema(E: EmbeddedPresentationComponent) → Object
Presentation components are encoded as StaticTemplateField objects — regular field-like objects with a specific @type and no value shape, required array, or _valueConstraints. The component’s content (HTML, image URL, YouTube identifier) is stored in _ui._content.
let pc_obj = encode_presentation_component(referenced_presentation_component(E), E)
where referenced_presentation_component(E) is the PresentationComponent identified by the reference in E.
pc_obj
Calls: encode_presentation_component
encode_presentation_component(PC: PresentationComponent, E: EmbeddedPresentationComponent) → Object
Produces a StaticTemplateField object. Unlike regular fields, this object carries no "properties", "required", or "_valueConstraints" keys — the component holds no instance data. The @context is the smaller STATIC_FIELD_NS rather than STANDARD_NS.
merge(
{
"@id": iri(PC.presentation_component_id),
"@type": "https://schema.metadatacenter.org/core/StaticTemplateField",
"@context": STATIC_FIELD_NS,
"$schema": "http://json-schema.org/draft-04/schema#",
"type": "object",
"title": PC.schema_artifact_metadata.artifact_metadata.descriptive_metadata.name.unicode_string,
"description": PC.schema_artifact_metadata.artifact_metadata.descriptive_metadata.description.unicode_string
if present, else "",
"additionalProperties": false,
"_ui": encode_presentation_component_ui(PC)
},
encode_artifact_metadata(PC)
)
Calls: encode_presentation_component_ui, encode_artifact_metadata
encode_presentation_component_ui(PC: PresentationComponent) → Object
Returns the _ui object for a static field. All component kinds carry "inputType" and "_content".
| PresentationComponent kind | "inputType" | "_content" |
|---|---|---|
| PageBreakComponent | "page-break" | null |
| SectionBreakComponent | "section-break" | null |
| RichTextComponent | "richtext" | PC.html_content.unicode_string |
| ImageComponent | "image" | iri(PC.iri) |
| YoutubeVideoComponent | "youtube" | iri(PC.iri) |
ImageComponent.label, ImageComponent.description, YoutubeVideoComponent.label, and YoutubeVideoComponent.description accessibility metadata are not surfaced in CTM 1.6.0 output (the legacy form has no slot for them). See Section 14, Known Gaps.
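The dispatch table above can be sketched as a lookup. The sketch assumes the component kind is passed as a string and that its content (HTML or IRI) has already been extracted as a plain string; both are simplifications for illustration.

```python
from typing import Any, Dict, Optional

INPUT_TYPES = {
    "PageBreakComponent": "page-break",
    "SectionBreakComponent": "section-break",
    "RichTextComponent": "richtext",
    "ImageComponent": "image",
    "YoutubeVideoComponent": "youtube",
}

# Kinds whose "_content" carries a value; the break components carry null.
KINDS_WITH_CONTENT = {"RichTextComponent", "ImageComponent", "YoutubeVideoComponent"}


def encode_presentation_component_ui(
    kind: str, content: Optional[str] = None
) -> Dict[str, Any]:
    return {
        "inputType": INPUT_TYPES[kind],
        "_content": content if kind in KINDS_WITH_CONTENT else None,
    }


rich = encode_presentation_component_ui("RichTextComponent", "<p>Instructions</p>")
page_break = encode_presentation_component_ui("PageBreakComponent")
```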
8. Field Encoding
encode_field(F: Field, E: EmbeddedField) → Object
A CTM 1.6.0 field object merges fixed structural keys (@id, @type, $schema, type, title, description), the artifact metadata block, and the field-spec-specific encoding. The embedding E is passed to encode_field_spec because properties such as requiredValue and hidden depend on how the field is embedded rather than on the field definition itself.
merge(
{
"@id": iri(F.field_id),
"@type": "https://schema.metadatacenter.org/core/TemplateField",
"@context": STANDARD_NS,
"$schema": "http://json-schema.org/draft-04/schema#",
"type": "object",
"title": F.schema_artifact_metadata.artifact_metadata.descriptive_metadata.name.unicode_string,
"description": F.schema_artifact_metadata.artifact_metadata.descriptive_metadata.description.unicode_string
if description is present, else ""
},
encode_artifact_metadata(F),
encode_field_spec(F.field_spec, E)
)
encode_field_spec(FT: FieldSpec, E: EmbeddedField) → Object is defined per field spec in Section 9 using a common skeleton with per-type value shape and constraint entries.
Calls: encode_artifact_metadata, encode_field_spec
9. Field Spec Encoding
Skeleton
Every standard field spec encoding function returns a fragment — an object with five keys — that gets merged into the full field object by encode_field. The skeleton below shows the structure, with placeholders for the parts that vary per field spec:
{
"properties": <value-shape>,
"required": <required>,
"additionalProperties": false,
"_valueConstraints": merge(encode_embedding_constraints(E), <vc-extras>),
"_ui": merge(encode_embedding_ui(E), <ui-extras>)
}
The placeholders mean:
- `<value-shape>`: a JSON Schema properties object describing the keys an instance value for this field must (or may) carry. For example, a text field’s instance value is a JSON object with "@value" and optionally "@type"; a controlled term field’s value uses "@id" and "rdfs:label" instead. Three named shapes (STRING_VALUE_SHAPE, NUMBER_VALUE_SHAPE, IRI_VALUE_SHAPE) cover most field specs; each is defined below.
- `<required>`: the JSON Schema required array listing which keys from the value shape must be present in an instance value. Most field specs require ["@value"] or []; the exact list is given per field spec.
- `<vc-extras>`: additional keys to merge into _valueConstraints beyond the base requiredValue flag. For example, a numeric field adds "numberType" here; a text field may add "defaultValue", "minLength", etc. When a field spec has no extras, _valueConstraints is just encode_embedding_constraints(E) directly.
- `<ui-extras>`: additional keys to merge into _ui beyond the base hidden flag. At minimum, every field spec adds "inputType" here. Temporal fields also add "temporalGranularity" and similar hints.
Field specs that do not follow this skeleton (multi-valued enum and attribute-value) are noted explicitly in their entries.
Value Shapes
A value shape is a JSON Schema properties object that defines what keys an instance value object for this field spec must or may contain. Rather than repeat the same structures throughout, three shapes are named here and referenced by the per-field-spec entries.
STRING_VALUE_SHAPE — used by text, date, time, datetime, email, and phone number fields. Instance values carry a string "@value" and an optional "@type" IRI for typed literals:
{
"@type": { "oneOf": [{ "type": "string", "format": "uri" }, { "type": "null" }] },
"@value": { "type": ["string", "null"] }
}
NUMBER_VALUE_SHAPE — used by numeric fields. Instance values carry a numeric "@value" and an "@type" IRI identifying the XSD numeric datatype:
{
"@type": { "oneOf": [{ "type": "string", "format": "uri" }, { "type": "null" }] },
"@value": { "type": ["number", "null"] }
}
IRI_VALUE_SHAPE — used by controlled term, link, and external authority fields. Instance values carry an "@id" IRI rather than an "@value" string, plus an optional human-readable "rdfs:label":
{
"@type": { "oneOf": [{ "type": "string", "format": "uri" }, { "type": "null" }] },
"@id": { "type": "string", "format": "uri" },
"rdfs:label": { "type": ["string", "null"] }
}
Embedding Helper Functions
The two helpers below produce the base content of _valueConstraints and _ui from the EmbeddedField context. Every standard field spec merges these as the starting point before adding its own extras.
encode_embedding_constraints(E: EmbeddedField) → Object
Returns { "requiredValue": V } where V depends on the effective value requirement:
| Effective ValueRequirement | "requiredValue" |
|---|---|
| "required" | true |
| "recommended" or "optional" | false |
Caution: The "recommended" and "optional" distinctions from the Structural Model are both encoded as "requiredValue": false and are therefore indistinguishable in CTM 1.6.0 output. "requiredValue" is a CEDAR tooling hint only, not a JSON Schema concept. Enforcement is handled separately by the JSON Schema "required" array (produced by encode_template_required), which likewise distinguishes only "required" from everything else. As a result, the "recommended"/"optional" distinction is entirely lost in this encoding.
encode_embedding_ui(E: EmbeddedField) → Object
Returns a JSON object with the following keys:
| Key | Value | Condition |
|---|---|---|
| "hidden" | true | Only when E.visibility = "hidden"; omit otherwise |
Field Spec Definitions
encode_text_field_spec(FT: TextFieldSpec, E: EmbeddedField) → Object
Text fields accept free-form string input. The rendering hint determines whether the input is single-line (textfield) or multi-line (textarea), defaulting to single-line when absent. Optional constraints — default value, length bounds, and a validation regex — are written to _valueConstraints only when present in the field definition.
Value shape: STRING_VALUE_SHAPE | Required: ["@value"]
_valueConstraints extras:
| Key | Value | Condition |
|---|---|---|
| "defaultValue" | FT.default_value.text_value.lexical_form.unicode_string | Omit if absent |
| "minLength" | FT.min_length.non_negative_integer (as integer) | Omit if absent |
| "maxLength" | FT.max_length.non_negative_integer (as integer) | Omit if absent |
| "regex" | FT.validation_regex.regex_pattern.unicode_string | Omit if absent |
_ui extras: { "inputType": encode_text_rendering_hint(FT.text_rendering_hint) }
Calls: encode_embedding_constraints, encode_embedding_ui, encode_text_rendering_hint
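The omit-if-absent rule for the text constraints can be sketched in Python as below. The helper names and keyword parameters are hypothetical, plain already-unwrapped values stand in for the Structural Model accessors, and absence is modelled as None (an assumption of this sketch).

```python
from typing import Any, Optional

# Mirrors the encode_text_rendering_hint mapping; None models an absent hint.
TEXT_INPUT_TYPES = {None: "textfield", "singleLine": "textfield", "multiLine": "textarea"}


def text_vc_extras(
    default_value: Optional[str] = None,
    min_length: Optional[int] = None,
    max_length: Optional[int] = None,
    regex: Optional[str] = None,
) -> dict[str, Any]:
    """Return only the constraint keys whose source value is present."""
    candidates = {
        "defaultValue": default_value,
        "minLength": min_length,
        "maxLength": max_length,
        "regex": regex,
    }
    # Compare against None rather than truthiness so that a minLength of 0
    # or an empty default string is still written.
    return {key: value for key, value in candidates.items() if value is not None}


def text_ui_extras(hint: Optional[str] = None) -> dict[str, str]:
    """Return the _ui extras for a text field (hypothetical helper)."""
    return {"inputType": TEXT_INPUT_TYPES[hint]}


extras = text_vc_extras(min_length=0, regex="^[A-Z]+$")
```

The explicit None check matters because 0 is a legitimate lower bound; a truthiness test would silently drop it.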
encode_text_rendering_hint(hint: TextRenderingHint or absent) → String
Returns the string corresponding to the hint value:
| TextRenderingHint value | Returns |
|---|---|
| "singleLine" or absent | "textfield" |
| "multiLine" | "textarea" |
encode_integer_number_field_spec(FT: IntegerNumberFieldSpec, E: EmbeddedField) → Object
Integer-number fields hold base-10 integer lexical values. The numberType key is always written and carries "xsd:integer"; the integer category is fixed by the field family.
Value shape: NUMBER_VALUE_SHAPE | Required: ["@value"]
_valueConstraints extras:
| Key | Value | Condition |
|---|---|---|
| "numberType" | "xsd:integer" | Always present |
| "unitOfMeasure" | iri(FT.unit.iri) | Omit if absent |
| "minValue" | FT.integer_number_min_value.integer_number_value.value (as integer) | Omit if absent |
| "maxValue" | FT.integer_number_max_value.integer_number_value.value (as integer) | Omit if absent |
_ui extras: { "inputType": "numeric" }
Unit carries an Iri in the Structural Model; CTM 1.6.0 unitOfMeasure is a plain string. The IRI string value is used directly.
Calls: encode_embedding_constraints, encode_embedding_ui
encode_real_number_field_spec(FT: RealNumberFieldSpec, E: EmbeddedField) → Object
Real-number fields hold lexical values for one of three real-number kinds (decimal, float, double). The numberType key carries the corresponding XSD datatype IRI string.
Value shape: NUMBER_VALUE_SHAPE | Required: ["@value"]
_valueConstraints extras:
| Key | Value | Condition |
|---|---|---|
| "numberType" | encode_real_number_datatype(FT.datatype) | Always present |
| "unitOfMeasure" | iri(FT.unit.iri) | Omit if absent |
| "minValue" | FT.real_number_min_value.real_number_value.value (as number) | Omit if absent |
| "maxValue" | FT.real_number_max_value.real_number_value.value (as number) | Omit if absent |
A decimalPlaces hint, when present on the field’s NumericRenderingHint, is emitted under _ui rather than _valueConstraints.
_ui extras: { "inputType": "numeric", "decimalPlaces": FT.rendering_hint.decimal_places (as integer; omit if absent) }
Calls: encode_embedding_constraints, encode_embedding_ui, encode_real_number_datatype
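The integer and real-number entries share the same constraint layout, differing only in how numberType is obtained. A minimal Python sketch follows, assuming plain values in place of the Structural Model accessors and None for absence; numeric_vc_extras and real_number_ui_extras are hypothetical names.

```python
from typing import Any, Optional, Union

Number = Union[int, float]

# Mirrors the encode_real_number_datatype mapping.
REAL_NUMBER_DATATYPES = {"decimal": "xsd:decimal", "float": "xsd:float", "double": "xsd:double"}


def numeric_vc_extras(
    number_type: str,
    unit_iri: Optional[str] = None,
    min_value: Optional[Number] = None,
    max_value: Optional[Number] = None,
) -> dict[str, Any]:
    """numberType is always written; the other keys only when present."""
    extras: dict[str, Any] = {"numberType": number_type}
    optional = {"unitOfMeasure": unit_iri, "minValue": min_value, "maxValue": max_value}
    extras.update({key: value for key, value in optional.items() if value is not None})
    return extras


def real_number_ui_extras(decimal_places: Optional[int] = None) -> dict[str, Any]:
    """decimalPlaces is a rendering hint, so it is placed under _ui."""
    ui: dict[str, Any] = {"inputType": "numeric"}
    if decimal_places is not None:
        ui["decimalPlaces"] = decimal_places
    return ui


integer_vc = numeric_vc_extras("xsd:integer", max_value=120)
real_vc = numeric_vc_extras(REAL_NUMBER_DATATYPES["decimal"], min_value=0.0)
real_ui = real_number_ui_extras(decimal_places=2)
```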
encode_real_number_datatype(K: RealNumberDatatypeKind) → String
Returns the XSD datatype IRI string corresponding to the CEDAR-native RealNumberDatatypeKind:
| RealNumberDatatypeKind | Returns |
|---|---|
| "decimal" | "xsd:decimal" |
| "float" | "xsd:float" |
| "double" | "xsd:double" |
encode_date_field_spec(FT: DateFieldSpec, E: EmbeddedField) → Object
Date fields encode values at year, year-month, or full-date precision. Both _valueConstraints.temporalType (the XSD datatype) and _ui.temporalGranularity are derived from the same DateValueType. An optional dateFormat hint controls the display ordering of day, month, and year components.
Value shape: STRING_VALUE_SHAPE | Required: ["@value"]
_valueConstraints extras: { "temporalType": encode_date_value_type(FT.date_value_type) }
_ui extras:
| Key | Value | Condition |
|---|---|---|
| "inputType" | "temporal" | Always present |
| "temporalGranularity" | encode_date_granularity(FT.date_value_type) | Always present |
| "dateFormat" | encode_date_format(FT.date_rendering_hint.date_format) | Omit if FT.date_rendering_hint absent or date_format absent |
Calls: encode_embedding_constraints, encode_embedding_ui, encode_date_value_type, encode_date_granularity, encode_date_format
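Because temporalType and temporalGranularity are both derived from the same DateValueType, a single lookup can yield both. The Python sketch below illustrates this under the assumption that kinds are passed as plain strings and absence as None; date_extras is a hypothetical helper that returns the two extras maps as a pair.

```python
from typing import Any, Optional

# DateValueType kind -> (temporalType, temporalGranularity)
DATE_VALUE_TYPES = {
    "year": ("xsd:gYear", "year"),
    "yearMonth": ("xsd:gYearMonth", "month"),
    "fullDate": ("xsd:date", "day"),
}

# DateComponentOrder kind -> dateFormat
DATE_FORMATS = {
    "dayMonthYear": "D/M/YYYY",
    "monthDayYear": "M/D/YYYY",
    "yearMonthDay": "YYYY/M/D",
}


def date_extras(
    date_value_type: str, date_format: Optional[str] = None
) -> tuple[dict[str, Any], dict[str, Any]]:
    """Return (_valueConstraints extras, _ui extras) for a date field."""
    temporal_type, granularity = DATE_VALUE_TYPES[date_value_type]
    ui: dict[str, Any] = {"inputType": "temporal", "temporalGranularity": granularity}
    if date_format is not None:
        ui["dateFormat"] = DATE_FORMATS[date_format]
    return {"temporalType": temporal_type}, ui


vc_extras, ui_extras = date_extras("yearMonth", "dayMonthYear")
```

An unknown kind raises KeyError in this sketch; how a real encoder reports such input is governed by the error model, not by this example.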
encode_date_value_type(DVT: DateValueType) → String
Returns the XSD datatype string for the DateValueType kind:
| DateValueType kind | Returns |
|---|---|
| "year" | "xsd:gYear" |
| "yearMonth" | "xsd:gYearMonth" |
| "fullDate" | "xsd:date" |
encode_date_granularity(DVT: DateValueType) → String
Returns the temporalGranularity string for the DateValueType kind:
| DateValueType kind | Returns |
|---|---|
| "year" | "year" |
| "yearMonth" | "month" |
| "fullDate" | "day" |
encode_date_format(DF: DateComponentOrder) → String
Returns the dateFormat string for the DateComponentOrder kind:
| DateComponentOrder kind | Returns |
|---|---|
| "dayMonthYear" | "D/M/YYYY" |
| "monthDayYear" | "M/D/YYYY" |
| "yearMonthDay" | "YYYY/M/D" |
encode_time_field_spec(FT: TimeFieldSpec, E: EmbeddedField) → Object
Time fields always use the xsd:time datatype. The temporalGranularity and optional timezone and format hints are placed in _ui. The timezoneEnabled key is only written when the timezone requirement is explicitly stated; it is omitted when unset.
Value shape: STRING_VALUE_SHAPE | Required: ["@value"]
_valueConstraints extras: { "temporalType": "xsd:time" }
_ui extras:
| Key | Value | Condition |
|---|---|---|
| "inputType" | "temporal" | Always present |
| "temporalGranularity" | encode_time_precision(FT.time_precision) | Always present |
| "timezoneEnabled" | true | Only when FT.timezone_requirement = "timezoneRequired" |
| "timezoneEnabled" | false | Only when FT.timezone_requirement = "timezoneNotRequired" |
| "inputTimeFormat" | "12h" | Only when FT.time_rendering_hint.time_format = "twelveHour" |
| "inputTimeFormat" | "24h" | Only when FT.time_rendering_hint.time_format = "twentyFourHour" |
Calls: encode_embedding_constraints, encode_embedding_ui, encode_time_precision
encode_time_precision(TP: TimePrecision or absent) → String
Returns the temporalGranularity string for the TimePrecision kind:
| TimePrecision kind | Returns |
|---|---|
| "hourMinute" | "minute" |
| "hourMinuteSecond" | "second" |
| "hourMinuteSecondFraction" | "decimalSecond" |
| absent | "decimalSecond" |
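The three-state behaviour of timezoneEnabled (true, false, or omitted) and the absent-precision fallback can be sketched in Python as follows. This is an illustration only: time_ui_extras is a hypothetical helper, kinds are passed as plain strings, and None models an absent value.

```python
from typing import Any, Optional

# TimePrecision kind -> temporalGranularity; None models an absent precision.
TIME_GRANULARITY = {
    "hourMinute": "minute",
    "hourMinuteSecond": "second",
    "hourMinuteSecondFraction": "decimalSecond",
    None: "decimalSecond",
}
TIMEZONE_FLAGS = {"timezoneRequired": True, "timezoneNotRequired": False}
TIME_FORMATS = {"twelveHour": "12h", "twentyFourHour": "24h"}


def time_ui_extras(
    time_precision: Optional[str] = None,
    timezone_requirement: Optional[str] = None,
    time_format: Optional[str] = None,
) -> dict[str, Any]:
    """Return the _ui extras for a time field."""
    ui: dict[str, Any] = {
        "inputType": "temporal",
        "temporalGranularity": TIME_GRANULARITY[time_precision],
    }
    # The key is written only for an explicit requirement; unset means omitted.
    if timezone_requirement in TIMEZONE_FLAGS:
        ui["timezoneEnabled"] = TIMEZONE_FLAGS[timezone_requirement]
    if time_format in TIME_FORMATS:
        ui["inputTimeFormat"] = TIME_FORMATS[time_format]
    return ui


minimal = time_ui_extras()
explicit = time_ui_extras("hourMinute", "timezoneNotRequired", "twelveHour")
```

Note that an explicit "timezoneEnabled": false and an omitted key are distinct outputs, so a decoder can recover whether the requirement was stated.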
encode_datetime_field_spec(FT: DateTimeFieldSpec, E: EmbeddedField) → Object
Date-time fields always use the xsd:dateTime datatype. They follow the same pattern as time fields for timezone and format hints, with granularity derived from DateTimeValueType rather than TimePrecision.
Value shape: STRING_VALUE_SHAPE | Required: ["@value"]
_valueConstraints extras: { "temporalType": "xsd:dateTime" }
_ui extras:
| Key | Value | Condition |
|---|---|---|
| "inputType" | "temporal" | Always present |
| "temporalGranularity" | encode_datetime_value_type(FT.datetime_value_type) | Always present |
| "timezoneEnabled" | true | Only when FT.timezone_requirement = "timezoneRequired" |
| "timezoneEnabled" | false | Only when FT.timezone_requirement = "timezoneNotRequired" |
| "inputTimeFormat" | "12h" | Only when FT.date_time_rendering_hint.time_format = "twelveHour" |
| "inputTimeFormat" | "24h" | Only when FT.date_time_rendering_hint.time_format = "twentyFourHour" |
Calls: encode_embedding_constraints, encode_embedding_ui, encode_datetime_value_type
encode_datetime_value_type(DVT: DateTimeValueType) → String
Returns the temporalGranularity string for the DateTimeValueType kind:
| DateTimeValueType kind | Returns |
|---|---|
| "dateHourMinute" | "minute" |
| "dateHourMinuteSecond" | "second" |
| "dateHourMinuteSecondFraction" | "decimalSecond" |
encode_controlled_term_field_spec(FT: ControlledTermFieldSpec, E: EmbeddedField) → Object
Controlled term fields constrain values to terms drawn from ontologies, branches of ontologies, named classes, or value sets. The four _valueConstraints list keys (ontologies, branches, classes, valueSets) are always present, each holding an array that is empty when no sources of that kind are configured. The multipleChoice: false flag distinguishes this from multi-valued enum fields.
Value shape: IRI_VALUE_SHAPE | Required: []
_valueConstraints extras:
| Key | Value | Condition |
|---|---|---|
| "multipleChoice" | false | Always present |
| "ontologies" | [ encode_ontology_source(S) for each OntologySource S in FT.controlled_term_sources ] | Always present |
| "branches" | [ encode_branch_source(S) for each BranchSource S in FT.controlled_term_sources ] | Always present |
| "classes" | [ encode_class_source_entry(C) for each ClassSource S in FT.controlled_term_sources, for each C in S.controlled_term_classes ] | Always present |
| "valueSets" | [ encode_value_set_source(S) for each ValueSetSource S in FT.controlled_term_sources ] | Always present |
_ui extras: { "inputType": "textfield" }
Calls: encode_embedding_constraints, encode_embedding_ui, encode_ontology_source, encode_branch_source, encode_class_source_entry, encode_value_set_source
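The partitioning of a mixed source list into the four always-present arrays can be sketched in Python as below. The representation is an assumption made for illustration: each source is a plain dictionary tagged with a hypothetical "kind" key and carrying an already-encoded "entry" (or "entries" for class sources), and the example IRIs are placeholders. The entries are abbreviated to a single key; real entries carry the keys listed in the per-source tables.

```python
from typing import Any

LIST_KEY_BY_KIND = {
    "ontology": "ontologies",
    "branch": "branches",
    "class": "classes",
    "valueSet": "valueSets",
}


def controlled_term_vc_extras(sources: list[dict[str, Any]]) -> dict[str, Any]:
    """Partition sources by kind; all four lists are present even when empty."""
    extras: dict[str, Any] = {
        "multipleChoice": False,
        "ontologies": [],
        "branches": [],
        "classes": [],
        "valueSets": [],
    }
    for source in sources:
        target = extras[LIST_KEY_BY_KIND[source["kind"]]]
        if source["kind"] == "class":
            # A class source contributes one entry per class it lists.
            target.extend(source["entries"])
        else:
            target.append(source["entry"])
    return extras


extras = controlled_term_vc_extras([
    {"kind": "ontology", "entry": {"uri": "https://example.org/onto"}},
    {"kind": "class", "entries": [
        {"uri": "https://example.org/onto/A"},
        {"uri": "https://example.org/onto/B"},
    ]},
])
```

Source order within each list follows the order of FT.controlled_term_sources in this sketch.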
encode_ontology_source(S: OntologySource) → Object
Returns a JSON object with the following keys:
| Key | Value | Condition |
|---|---|---|
| "uri" | iri(S.ontology_reference.ontology_iri.iri) | Always present |
| "acronym" | S.ontology_reference.ontology_display_hint.ontology_acronym.unicode_string | Omit if absent |
| "name" | S.ontology_reference.ontology_display_hint.ontology_name.unicode_string | Omit if absent |
encode_branch_source(S: BranchSource) → Object
Returns a JSON object with the following keys:
| Key | Value | Condition |
|---|---|---|
| "uri" | iri(S.ontology_reference.ontology_iri.iri) | Always present |
| "acronym" | S.ontology_reference.ontology_display_hint.ontology_acronym.unicode_string | Omit if absent |
| "rootTermUri" | iri(S.root_term_iri.iri) | Always present |
| "rootTermLabel" | S.root_term_label.unicode_string | Always present |
| "maxDepth" | S.max_traversal_depth.non_negative_integer (as integer) | Omit if absent |
encode_class_source_entry(C: ControlledTermClass) → Object
Returns a JSON object with the following keys:
| Key | Value | Condition |
|---|---|---|
| "uri" | iri(C.term_iri.iri) | Always present |
| "label" | C.label.unicode_string | Always present |
| "prefLabel" | C.label.unicode_string | Always present |
| "type" | "OntologyClass" | Always present |
| "source" | iri(C.ontology_reference.ontology_iri.iri) | Always present |
encode_value_set_source(S: ValueSetSource) → Object
Returns a JSON object with the following keys:
| Key | Value | Condition |
|---|---|---|
| "identifier" | S.value_set_identifier.unicode_string | Always present |
| "name" | S.value_set_name.unicode_string | Omit if absent |
| "uri" | iri(S.value_set_iri.iri) | Omit if absent |
encode_single_valued_enum_field_spec(FT: SingleValuedEnumFieldSpec, E: EmbeddedField) → Object
SingleValuedEnumFieldSpec declares a closed list of PermissibleValue entries. CTM 1.6.0 has no native equivalent for the Structural Model’s enum-with-meanings construct: this encoder maps the spec into the legacy "literals" list, using each permissible value’s canonical Token as the legacy literal label. Per-value Label, Description, and Meaning metadata is dropped (see Section 14, Known Gaps). The multipleChoice: false flag distinguishes this from the multi-valued enum case.
Value shape: STRING_VALUE_SHAPE | Required: []
_valueConstraints extras:
| Key | Value | Condition |
|---|---|---|
| "multipleChoice" | false | Always present |
| "literals" | [ encode_permissible_value(PV) for each PV in FT.permissible_values ] | Always present |
| "defaultValue" | FT.default_value.token.string | Omit if absent |
_ui extras: { "inputType": encode_single_valued_enum_rendering_hint(FT.rendering_hint) }
Calls: encode_embedding_constraints, encode_embedding_ui, encode_single_valued_enum_rendering_hint, encode_permissible_value
encode_single_valued_enum_rendering_hint(hint: SingleValuedEnumRenderingHint or absent) → String
Returns the inputType string for the hint value:
| SingleValuedEnumRenderingHint value | Returns |
|---|---|
| "radio" or absent | "radio" |
| "dropdown" | "list" |
encode_permissible_value(PV: PermissibleValue) → Object
Encodes a single PermissibleValue from a SingleValuedEnumFieldSpec or MultiValuedEnumFieldSpec as a CTM 1.6.0 literals-array entry. The legacy entry carries a single label string; the encoder uses the permissible value’s Token as that label. The Token is the canonical wire-form key in the Structural Model and remains the value submitted in instances.
PermissibleValue.label and PermissibleValue.description localizations are dropped — CTM 1.6.0 has no slot for them on a literals entry. PermissibleValue.meanings is also dropped: CTM 1.6.0 literal options carry no ontology binding. See Section 14, Known Gaps.
| Key | Value | Condition |
|---|---|---|
| "label" | PV.token.string | Always present |
selectedByDefault is no longer encoded per option. The Structural Model represents enum defaults at the spec level (SingleValuedEnumFieldSpec.defaultValue / MultiValuedEnumFieldSpec.defaultValues); these are emitted via defaultValue / defaultValues keys in _valueConstraints rather than as per-option flags. CTM 1.6.0 tooling support for those keys is not guaranteed (see Section 14).
encode_multi_valued_enum_field_spec(FT: MultiValuedEnumFieldSpec, E: EmbeddedField) → Object
Multi-valued enum fields allow instances to carry zero or more selected permissible values, so the value schema is wrapped in a JSON Schema array with minItems: 0. This field spec does not follow the standard skeleton. The multipleChoice: true flag distinguishes this from single-valued enum fields.
As with encode_single_valued_enum_field_spec, per-value Label, Description, and Meaning metadata is dropped at the legacy literals entries (see Section 14).
The value schema is wrapped in an array:
{
"type": "array",
"minItems": 0,
"items": {
"type": "object",
"properties": { "@value": { "type": ["string", "null"] } },
"required": [],
"additionalProperties": false
},
"_valueConstraints": merge(encode_embedding_constraints(E), <vc-extras>),
"_ui": merge(encode_embedding_ui(E), <ui-extras>)
}
_valueConstraints extras:
| Key | Value | Condition |
|---|---|---|
| "multipleChoice" | true | Always present |
| "literals" | [ encode_permissible_value(PV) for each PV in FT.permissible_values ] | Always present |
| "defaultValues" | [ T.string for each T in FT.default_values ] | Omit if absent or empty |
_ui extras: { "inputType": encode_multi_valued_enum_rendering_hint(FT.rendering_hint) }
Calls: encode_embedding_constraints, encode_embedding_ui, encode_multi_valued_enum_rendering_hint, encode_permissible_value
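To make the array wrapper concrete, the Python sketch below assembles the multi-valued enum encoding from a list of tokens. It is an illustration under stated assumptions: multi_valued_enum_schema is a hypothetical helper, tokens are plain strings, the two base dictionaries stand in for encode_embedding_constraints(E) and encode_embedding_ui(E), and None models an absent rendering hint.

```python
from typing import Any, Optional, Sequence

# Mirrors the encode_multi_valued_enum_rendering_hint mapping.
MULTI_ENUM_INPUT_TYPES = {None: "checkbox", "checkbox": "checkbox", "multiSelect": "list"}


def multi_valued_enum_schema(
    tokens: Sequence[str],
    base_constraints: dict[str, Any],
    base_ui: dict[str, Any],
    default_tokens: Sequence[str] = (),
    hint: Optional[str] = None,
) -> dict[str, Any]:
    constraints = {
        **base_constraints,
        "multipleChoice": True,
        # Each legacy literal carries only the permissible value's token.
        "literals": [{"label": token} for token in tokens],
    }
    if default_tokens:  # omitted when absent or empty
        constraints["defaultValues"] = list(default_tokens)
    return {
        "type": "array",
        "minItems": 0,
        "items": {
            "type": "object",
            "properties": {"@value": {"type": ["string", "null"]}},
            "required": [],
            "additionalProperties": False,
        },
        "_valueConstraints": constraints,
        "_ui": {**base_ui, "inputType": MULTI_ENUM_INPUT_TYPES[hint]},
    }


schema = multi_valued_enum_schema(["red", "green"], {"requiredValue": False}, {})
```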
encode_multi_valued_enum_rendering_hint(hint: MultiValuedEnumRenderingHint or absent) → String
Returns the inputType string for the hint value:
| MultiValuedEnumRenderingHint value | Returns |
|---|---|
| "checkbox" or absent | "checkbox" |
| "multiSelect" | "list" |
encode_link_field_spec(FT: LinkFieldSpec, E: EmbeddedField) → Object
Link fields hold a URI value with an optional human-readable label. They use IRI_VALUE_SHAPE and the link input type with no additional value constraints.
Value shape: IRI_VALUE_SHAPE | Required: [] | _valueConstraints extras: none | _ui extras: { "inputType": "link" }
Calls: encode_embedding_constraints, encode_embedding_ui
encode_email_field_spec(FT: EmailFieldSpec, E: EmbeddedField) → Object
Email fields hold a string value interpreted as an email address. They use STRING_VALUE_SHAPE and the email input type with no additional value constraints.
Value shape: STRING_VALUE_SHAPE | Required: [] | _valueConstraints extras: none | _ui extras: { "inputType": "email" }
Calls: encode_embedding_constraints, encode_embedding_ui
encode_phone_number_field_spec(FT: PhoneNumberFieldSpec, E: EmbeddedField) → Object
Phone number fields hold a string value interpreted as a phone number. They use STRING_VALUE_SHAPE and the phone-number input type with no additional value constraints.
Value shape: STRING_VALUE_SHAPE | Required: [] | _valueConstraints extras: none | _ui extras: { "inputType": "phone-number" }
Calls: encode_embedding_constraints, encode_embedding_ui
External Authority Field Specs
External authority fields identify entities from well-known registries such as ORCID, ROR, DOI, PubMed, RRID, and NIH Grant. All six types use IRI_VALUE_SHAPE and differ only in the inputType string written to _ui. They share the same skeleton entry:
Value shape: IRI_VALUE_SHAPE | Required: [] | _valueConstraints extras: none | _ui extras: { "inputType": encode_external_authority_input_type(FT) }
encode_external_authority_field_spec(FT: ExternalAuthorityFieldSpec, E: EmbeddedField) → Object
Applies the skeleton with the above parameters.
Calls: encode_embedding_constraints, encode_embedding_ui, encode_external_authority_input_type
encode_external_authority_input_type(FT: ExternalAuthorityFieldSpec) → String
Returns the inputType string for the field spec kind:
| ExternalAuthorityFieldSpec kind | Returns |
|---|---|
| OrcidFieldSpec | "orcid" |
| RorFieldSpec | "ror" |
| DoiFieldSpec | "doi" |
| PubMedIdFieldSpec | "pubmed" |
| RridFieldSpec | "rrid" |
| NihGrantIdFieldSpec | "nih-grant" |
Caution: The inputType string values for external authority fields are not standardised in the published CTM 1.6.0 specification. The values in the table above reflect common practice but MUST be confirmed against the deployed CTM 1.6.0 implementation before use. Encoding with incorrect inputType values may cause CEDAR tooling to misrender or reject these fields.
encode_attribute_value_field_spec(FT: AttributeValueFieldSpec, E: EmbeddedField) → Object
Attribute-value fields hold dynamic key-value pairs whose attribute names are not known at schema definition time. CTM 1.6.0 represents this with a top-level array type and defers the dynamic key handling to the instance level via additionalProperties. This field spec does not follow the standard skeleton.
The encoding uses a top-level array type:
{
"type": "array",
"items": { "type": "string" },
"minItems": 0,
"additionalProperties": false,
"_valueConstraints": merge(encode_embedding_constraints(E), { "requiredValue": false }),
"_ui": merge(encode_embedding_ui(E), { "inputType": "attribute-value" })
}
The instance representation of AttributeValue fields in CTM 1.6.0 uses additionalProperties at the instance level rather than a structured value schema. See Section 14, Known Gaps.
Calls: encode_embedding_constraints, encode_embedding_ui
10. Template Element Encoding
When a Template is referenced by an EmbeddedTemplate, it is encoded as a CTM 1.6.0 template element object.
encode_template_element(T: Template, E: EmbeddedTemplate) → Object
When a Template is used as a nested element, it is encoded identically to a top-level template except that @type becomes TemplateElement. All sub-functions (encode_template_context, encode_template_properties, encode_template_required, encode_template_ui) operate identically regardless of nesting depth.
merge(
{
"@id": iri(T.template_id),
"@type": "https://schema.metadatacenter.org/core/TemplateElement",
"@context": encode_template_context(T),
"$schema": "http://json-schema.org/draft-04/schema#",
"type": "object",
"title": T.schema_artifact_metadata.artifact_metadata.descriptive_metadata.name.unicode_string,
"description": T.schema_artifact_metadata.artifact_metadata.descriptive_metadata.description.unicode_string
if description is present, else "",
"properties": encode_template_properties(T),
"required": encode_template_required(T),
"additionalProperties": false,
"_ui": encode_template_ui(T)
},
encode_artifact_metadata(T)
)
encode_template_context, encode_template_properties, encode_template_required, and encode_template_ui are defined in Section 6.
Calls: encode_template_context, encode_template_properties, encode_template_required, encode_template_ui, encode_artifact_metadata
11. Value Encoding (Instance Level)
These functions encode Value constructs as they appear within a TemplateInstance.
encode_value(V: Value) → Object
All value types are encoded as JSON objects, though the specific keys differ by type. This function dispatches to the type-specific encoding function for the Value kind:
| Value kind | Encoding function |
|---|---|
| TextValue | encode_text_value(V) |
| IntegerNumberValue | encode_integer_number_value(V) |
| RealNumberValue | encode_real_number_value(V) |
| BooleanValue | encode_boolean_value(V) |
| DateValue | encode_date_value(V) |
| TimeValue | encode_time_value(V) |
| DateTimeValue | encode_datetime_value(V) |
| ControlledTermValue | encode_controlled_term_value(V) |
| EnumValue | encode_enum_value(V) |
| LinkValue | encode_link_value(V) |
| EmailValue | encode_email_value(V) |
| PhoneNumberValue | encode_phone_number_value(V) |
| ExternalAuthorityValue | encode_external_authority_value(V) |
| AttributeValue | encode_attribute_value(V) |
Calls: encode_text_value, encode_integer_number_value, encode_real_number_value, encode_boolean_value, encode_date_value, encode_time_value, encode_datetime_value, encode_controlled_term_value, encode_enum_value, encode_link_value, encode_email_value, encode_phone_number_value, encode_external_authority_value, encode_attribute_value
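One way to realise this dispatch in Python is a lookup table keyed by value class. The sketch below is illustrative only: the two dataclasses are hypothetical, flattened stand-ins for EmailValue and EnumValue, only those two kinds are registered, and raising TypeError for an unregistered kind is an assumption of the sketch rather than behaviour taken from the error model.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class EmailValue:  # hypothetical stand-in; the real construct is more deeply nested
    lexical_form: str


@dataclass
class EnumValue:  # hypothetical stand-in
    token: str


def encode_email_value(v: EmailValue) -> dict[str, Any]:
    return {"@value": v.lexical_form}


def encode_enum_value(v: EnumValue) -> dict[str, Any]:
    return {"@value": v.token}


# One entry per Value kind; the remaining kinds would be registered the same way.
ENCODERS: dict[type, Callable[[Any], dict[str, Any]]] = {
    EmailValue: encode_email_value,
    EnumValue: encode_enum_value,
}


def encode_value(v: Any) -> dict[str, Any]:
    """Dispatch on the concrete class of the value."""
    try:
        encoder = ENCODERS[type(v)]
    except KeyError:
        raise TypeError(f"no encoder registered for {type(v).__name__}") from None
    return encoder(v)


encoded = encode_value(EnumValue("red"))
```

A table keeps the dispatch exhaustive by inspection: each row of the dispatch table above corresponds to exactly one dictionary entry.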
encode_text_value(V: TextValue) → Object
Returns a JSON object whose keys depend on whether V carries a language tag:
| Condition | "@value" source | "@language" |
|---|---|---|
| V.lang absent | V.value.unicode_string | Omit |
| V.lang present | V.value.unicode_string | V.lang.bcp_47_tag |
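The two rows of the table reduce to a single conditional key, sketched in Python below. The function takes plain strings instead of the Structural Model constructs, and None models an absent language tag; both are assumptions of the sketch.

```python
from typing import Optional


def encode_text_value(value: str, lang: Optional[str] = None) -> dict[str, str]:
    """Write "@language" only when a language tag is present."""
    encoded = {"@value": value}
    if lang is not None:
        encoded["@language"] = lang
    return encoded


plain = encode_text_value("Liver biopsy")
tagged = encode_text_value("Biopsie hépatique", "fr")
```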
encode_integer_number_value(V: IntegerNumberValue) → Object
Integer-number instance values carry a base-10 integer lexical form. The XSD datatype IRI is fixed at "xsd:integer".
{
"@value": V.value.unicode_string,
"@type": "xsd:integer"
}
encode_real_number_value(V: RealNumberValue) → Object
Real-number instance values carry both a lexical form and an explicit RealNumberDatatypeKind. The kind is mapped to the corresponding XSD datatype IRI string by encode_real_number_datatype.
{
"@value": V.value.unicode_string,
"@type": encode_real_number_datatype(V.datatype)
}
encode_date_value(V: DateValue) → Object
Returns { "@value": <literal>, "@type": <xsd-type> } where the sources depend on the DateValue kind:
| DateValue kind | "@value" source | "@type" |
|---|---|---|
| YearValue | V.value | "xsd:gYear" |
| YearMonthValue | V.value | "xsd:gYearMonth" |
| FullDateValue | V.full_date_literal.lexical_form.string | "xsd:date" |
encode_time_value(V: TimeValue) → Object
Time instance values always use the xsd:time datatype. The lexical form is written directly from the time literal.
{ "@value": V.time_literal.lexical_form.unicode_string, "@type": "xsd:time" }
encode_datetime_value(V: DateTimeValue) → Object
Date-time instance values always use the xsd:dateTime datatype. The lexical form is written directly from the date-time literal.
{ "@value": V.date_time_literal.lexical_form.unicode_string, "@type": "xsd:dateTime" }
encode_controlled_term_value(V: ControlledTermValue) → Object
Returns a JSON object with the following keys:
| Key | Value | Condition |
|---|---|---|
| "@id" | iri(V.term_iri.iri) | Always present |
| "rdfs:label" | V.label.unicode_string | Omit if absent |
| "skos:notation" | V.notation.unicode_string | Omit if absent |
| "skos:prefLabel" | V.preferred_label.unicode_string | Omit if absent |
encode_enum_value(V: EnumValue) → Object
Encodes an EnumValue as a CTM 1.6.0 string-shaped JSON-LD value. The Token carried by the EnumValue is emitted under "@value". CTM 1.6.0 has no native concept of an enum value distinct from a string literal — the legacy form treats the submitted token as a plain string, with conformance to the spec’s permissible-value list enforced at the schema layer (the literals array under _valueConstraints).
{ "@value": V.token.string }
Per-value Meaning bindings carried by the source spec are not surfaced at the instance: the legacy wire form has no slot for them. Consumers that need ontology meanings MUST consult the source EnumFieldSpec.
Calls: none.
encode_link_value(V: LinkValue) → Object
Returns a JSON object with the following keys:
| Key | Value | Condition |
|---|---|---|
| "@id" | iri(V.iri) | Always present |
| "rdfs:label" | first localization of V.label (lexical form only) | Omit if V.label absent |
CTM 1.6.0’s rdfs:label slot accepts a single string only. When V.label is a multi-localization MultilingualString, the first entry is emitted; remaining localizations are dropped. See Section 14, Known Gaps.
encode_email_value(V: EmailValue) → Object
Email instance values are plain string objects with a single @value key. No type annotation is included.
{ "@value": V.simple_literal.lexical_form.unicode_string }
encode_phone_number_value(V: PhoneNumberValue) → Object
Phone number instance values are plain string objects with a single @value key. No type annotation is included.
{ "@value": V.simple_literal.lexical_form.unicode_string }
encode_external_authority_value(V: ExternalAuthorityValue) → Object
Each kind produces { "@id": <iri>, "rdfs:label": <label> } where "rdfs:label" is omitted when V.label is absent.
| ExternalAuthorityValue kind | "@id" source |
|---|---|
| OrcidValue | iri(V.orcid_iri.iri) |
| RorValue | iri(V.ror_iri.iri) |
| DoiValue | iri(V.doi_iri.iri) |
| PubMedIdValue | iri(V.pub_med_iri.iri) |
| RridValue | iri(V.rrid_iri.iri) |
| NihGrantIdValue | iri(V.nih_grant_iri.iri) |
encode_attribute_value(V: AttributeValue) → Object
{ V.attribute_name.unicode_string: encode_value(V.value) }
Nested AttributeValue constructs produce nested objects. Multiple AttributeValue entries for the same instance field are merged into a single flat or nested JSON object in the CTM 1.6.0 representation.
12. Instance Encoding
encode_template_instance(I: TemplateInstance, T: Template) → Object
A template instance is encoded by reusing the template’s @context, writing instance identity and provenance metadata, and then encoding each field value and nested template instance slot. The template T is required as a parameter because the context and embedded artifact structure are derived from it rather than from the instance itself.
let fvs = [ IV in I.instance_values | IV is FieldValue ]
let ntis = [ IV in I.instance_values | IV is NestedTemplateInstance ]
let emb_fields = [ E in T.embedded_artifacts | E is EmbeddedField ]
let emb_templates = [ E in T.embedded_artifacts | E is EmbeddedTemplate ]
merge(
{
"@context": encode_template_context(T),
"@id": iri(I.template_instance_id),
"schema:isBasedOn": iri(T.template_id)
},
encode_artifact_metadata(I.artifact_metadata),
{ for each EF in emb_fields:
EF.key: encode_field_value(fv(EF), EF) },
{ for each ET in emb_templates:
ET.key: encode_nested_template_instance_slot(ntis_for(ET), ET) }
)
where fv(EF) denotes the FieldValue in fvs whose key equals EF.key, and ntis_for(ET) denotes [ NTI in ntis | NTI.key = ET.key ].
Calls: encode_template_context, encode_artifact_metadata, encode_field_value, encode_nested_template_instance_slot
encode_field_value(FV: FieldValue, EF: EmbeddedField) → Object or Array
Encodes a single field’s data within an instance. When the field is multi-valued (per is_multi(EF)) the result is a JSON array of encoded values; when single-valued it is a single encoded value object.
Caution: Consumers of CTM 1.6.0 instances must handle both forms at any given field key: either a plain JSON object or a JSON array. A consumer that always expects an object will silently misread or discard data for multi-valued fields. The cardinality information needed to know which form to expect is carried in the template schema (the "type": "array" wrapper on the field entry in "properties"), not in the instance itself.
if is_multi(EF):
[ encode_value(V) for each V in FV.values ]
else:
encode_value(first(FV.values))
Calls: encode_value
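The object-or-array behaviour, together with a defensive reader for it, can be sketched in Python as follows. Both function names and signatures are hypothetical: encode_field_value here takes values that have already been passed through encode_value plus the result of is_multi(EF), and as_value_list is a consumer-side helper that is not part of this specification. In this sketch a single-valued field with no values raises IndexError; the specification does not state what first returns in that case.

```python
from typing import Any, Union

Encoded = dict[str, Any]


def encode_field_value(
    encoded_values: list[Encoded], is_multi: bool
) -> Union[Encoded, list[Encoded]]:
    """Multi-valued fields yield a list; single-valued fields yield the first value."""
    if is_multi:
        return list(encoded_values)
    return encoded_values[0]


def as_value_list(slot: Union[Encoded, list[Encoded]]) -> list[Encoded]:
    """Consumer-side normalisation: always hand back a list of value objects."""
    return slot if isinstance(slot, list) else [slot]


single = encode_field_value([{"@value": "a"}], is_multi=False)
multi = encode_field_value([{"@value": "a"}, {"@value": "b"}], is_multi=True)
```

Normalising with as_value_list lets a reader iterate over values uniformly, although the template schema remains the authoritative source for the field's cardinality.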
encode_nested_template_instance_slot(NTIs: NestedTemplateInstance+, ET: EmbeddedTemplate) → Object or Array
Encodes a nested template slot within a parent instance. Multi-valued embeddings (per is_multi(ET)) produce a JSON array of encoded child instances; single-valued embeddings produce a single child instance object. Encoding recurses through encode_template_instance.
Let RT = the referenced Template of ET.
if is_multi(ET):
[ encode_template_instance(NTI, RT) for each NTI in NTIs ]
else:
encode_template_instance(first(NTIs), RT)
Calls: encode_template_instance
13. Annotations
Annotation constructs on CatalogMetadata have no standardised CTM 1.6.0 equivalent. They are encoded as top-level properties on the artifact object using the annotation name IRI as the JSON key.
encode_annotation(A: Annotation) → { key: value }
key: iri(A.property)
value: if A.body is AnnotationStringValue:
{ "@value": A.body.value, "@language"?: A.body.lang }
(or the raw lexical form string if simpler form is preferred)
if A.body is AnnotationIriValue:
iri(A.body.iri)
Implementations SHOULD confirm that annotation IRI keys are valid within the CTM 1.6.0 @context before including them.
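A minimal Python sketch of the annotation encoding follows. It implements the object form for string bodies (not the raw-string alternative), and it assumes a hypothetical plain-dictionary representation of the annotation body with a "kind" discriminator; the property IRI in the example is a placeholder.

```python
from typing import Any


def encode_annotation(property_iri: str, body: dict[str, Any]) -> dict[str, Any]:
    """Return a one-entry mapping from the annotation property IRI to its value."""
    kind = body["kind"]
    if kind == "string":
        value: Any = {"@value": body["value"]}
        if body.get("lang") is not None:
            value["@language"] = body["lang"]
    elif kind == "iri":
        value = body["iri"]
    else:
        # Assumption of this sketch: unknown body kinds are rejected.
        raise ValueError(f"unsupported annotation body kind: {kind!r}")
    return {property_iri: value}


note = encode_annotation(
    "https://example.org/ann/reviewNote",
    {"kind": "string", "value": "Checked", "lang": "en"},
)
link = encode_annotation(
    "https://example.org/ann/seeAlso",
    {"kind": "iri", "iri": "https://example.org/doc/1"},
)
```

The resulting one-entry mappings can then be merged into the top-level artifact object alongside the other encoded properties.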
14. Known Gaps and Lossy Areas
- skos:prefLabel on StaticTemplateField: Real CTM 1.6.0 output includes a skos:prefLabel key at the top level of static field objects (presentation components). This is not currently produced by encode_presentation_component because encode_artifact_metadata maps preferred labels to rdfs:label. The relationship between the Structural Model’s preferred_label and CTM 1.6.0’s skos:prefLabel on static fields needs clarification.
- propertyDescriptions in _ui: Real CTM 1.6.0 templates include a "propertyDescriptions" map inside _ui, keyed by embedded artifact key, containing the description/help text for each field. This is not currently produced by encode_template_ui. The source of these descriptions (whether from the EmbeddedField or the referenced Field) needs to be confirmed and the function updated accordingly.
- AlternativeLabel* on DescriptiveMetadata: No CTM 1.6.0 equivalent; omitted.
- PermissibleValue metadata: CTM 1.6.0 literals-array entries carry only a single label string, with no slot for per-value Description, ontology Meaning bindings, or multilingual Label localizations. encode_permissible_value drops all of these and emits the value’s canonical Token as the legacy label. Spec-level enum defaults (SingleValuedEnumFieldSpec.defaultValue and MultiValuedEnumFieldSpec.defaultValues) are emitted as defaultValue / defaultValues keys under _valueConstraints; CTM 1.6.0 tooling support for those keys is not guaranteed. The legacy selectedByDefault per-option flag is no longer produced; the Structural Model now represents enum defaults exclusively at the spec level. Embedding-level defaults (EmbeddedSingleValuedEnumField.defaultValue / EmbeddedMultiValuedEnumField.defaultValue) have no CTM 1.6.0 equivalent and are dropped.
- Default values for link, email, phone number, and external authority field specs: CTM 1.6.0 _valueConstraints.defaultValue is primarily defined for text fields. Default value encoding for LinkDefaultValue, EmailDefaultValue, PhoneNumberDefaultValue, and external authority defaults is implementation-defined.
- AttributeValue instance representation: CTM 1.6.0 uses additionalProperties on the instance object for attribute-value fields. The instance-level encoding of AttributeValue injects key-value pairs directly into the parent instance object rather than nesting them under a field key.
- Recommended vs Optional: Both map to "requiredValue": false in _valueConstraints and neither contributes to the "required" array. The distinction is entirely lost in CTM 1.6.0 output.
-
Unitas IRI — CTM 1.6.0unitOfMeasureis a plain string. The IRI string value is used directly; any human-readable label associated withUnitis omitted. -
Language-tagged text values — CTM 1.6.0 does not model language-tagged strings explicitly. The
@languagekey is included in the encoded value object as a JSON-LD extension; support in CTM 1.6.0 tooling is not guaranteed. -
External authority
inputTypevalues — TheinputTypestring values for ORCID, ROR, DOI, PubMed, RRID, and NIH Grant fields are not standardised in the published CTM 1.6.0 specification and SHOULD be confirmed against the deployed implementation. -
ImageComponentandYoutubeVideoComponentaccessibility metadata — Thelabel(alt text / caption title) anddescription(longer accessibility text) slots onImageComponentandYoutubeVideoComponenthave no CTM 1.6.0 equivalent and are dropped. Conforming consumers that require accessibility metadata MUST work with the Structural Model wire form rather than the CTM 1.6.0 mapping.
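To make one of the lossy areas above concrete, the following Python sketch illustrates the Recommended vs Optional gap. It is a simplified illustration rather than the normative encoder: the function name encode_requirement, the sample field keys, and the flat output shape are hypothetical, and the treatment of a "Required" level (mapped to true and listed in "required") is an assumption not stated in this section.

```python
# Hypothetical requirement levels per field key.
requirement_levels = {
    "title": "Required",
    "abstract": "Recommended",
    "notes": "Optional",
}


def encode_requirement(levels: dict) -> tuple:
    """Return (per-field value constraints, entries of the "required" array)."""
    value_constraints = {
        # Recommended and Optional both yield false; "Required" -> true is assumed.
        key: {"requiredValue": level == "Required"}
        for key, level in levels.items()
    }
    required = [key for key, level in levels.items() if level == "Required"]
    return value_constraints, required


value_constraints, required = encode_requirement(requirement_levels)

# The Recommended and Optional fields encode identically, so a decoder
# cannot tell them apart from the CTM 1.6.0 output alone.
indistinguishable = value_constraints["abstract"] == value_constraints["notes"]
```

Because the two levels produce the same encoded constraint, a round trip through CTM 1.6.0 cannot restore the original level without consulting the Structural Model wire form.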