STXT Schemas (@stxt.schema)
1. Introduction
2. Terminology
3. Relationship between STXT and Schema
4. General structure of a Schema
5. One schema per namespace
6. Node Definitions (`Node:`)
7. Children (`Children:`) and cross namespaces
8. Cardinalities
9. Types
10. Normative Examples
11. Schema Errors
12. Conformance
13. Schema of the Schema (`@stxt.schema`)
14. End of Document
1. Introduction
This document defines the specification of the STXT Schema language, a mechanism to validate STXT documents through formal semantic rules.
A schema:
- Is an STXT document with namespace `@stxt.schema`.
- Defines the nodes, types, and cardinalities of the target namespace.
- Does not modify the base syntax of STXT; it operates on the already parsed structure.
2. Terminology
The keywords "MUST", "MUST NOT", "SHOULD", "SHOULD NOT", and "MAY" are to be interpreted as described in RFC 2119.
Terms such as node, indentation, namespace, inline, and block (`>>`) retain the meanings defined in STXT-SPEC.
3. Relationship between STXT and Schema
Schema validation happens after STXT parsing:
- Parsing into STXT hierarchical structure.
- Resolution of the logical namespace (inheritance).
- Application of the corresponding schema.
Optionally, an implementation MAY apply schema validation during parsing, as long as the validator remains loosely coupled to the parser. This allows errors to be detected earlier.
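The namespace resolution step can be illustrated with a small sketch. It assumes a parser has already produced a tree, and that "inheritance" means a node without an explicit namespace takes the effective namespace of its parent; that rule belongs to STXT-SPEC and is only assumed here. `ParsedNode` and `resolve_namespaces` are hypothetical names, not part of any STXT library.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ParsedNode:
    """Hypothetical in-memory node, as an STXT parser might produce it."""
    name: str
    namespace: Optional[str] = None  # explicit namespace, if the node declares one
    value: str = ""
    children: List["ParsedNode"] = field(default_factory=list)


def resolve_namespaces(node: ParsedNode, inherited: Optional[str] = None) -> None:
    """Assign an effective namespace to every node (assumed rule: inherit from parent)."""
    if node.namespace is None:
        node.namespace = inherited
    for child in node.children:
        resolve_namespaces(child, node.namespace)


# Tree for a document with one cross-namespace child and one inheriting child.
doc = ParsedNode("Document", "com.example.docs", children=[
    ParsedNode("Metadata", "com.google.html", value="info"),
    ParsedNode("Content", value="Line 1\nLine 2"),
])
resolve_namespaces(doc)
```

After this step each node carries the namespace whose schema should be applied to it, so the validator can look up the right `Node` definition.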
4. General structure of a Schema
A schema is a document whose root node is: Schema (@stxt.schema): <target_namespace>
Example:
Schema (@stxt.schema): com.example.docs
    Description: Schema for example documents
    Node: Document
        Type: GROUP
        Children:
            Child: Metadata (@com.google.html)
                Max: 1
            Child: Autor
            Child: Fecha
                Max: 1
            Child: Content
                Min: 1
                Max: 1
    Node: Autor
    Node: Fecha
        Type: DATE
    Node: Content
        Type: TEXT
5. One schema per namespace
For each logical namespace:
- There MUST NOT be more than one active schema at the same time.
- If several schemas exist for the same namespace, the parser SHOULD establish a clear criterion for choosing which one it applies, and MUST NOT apply several simultaneously.
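One way to enforce this rule is a registry keyed by namespace. The sketch below is illustrative only: `SchemaRegistry` is a hypothetical class, schemas are represented as plain dictionaries, and the selection criterion (first registration wins unless the caller explicitly replaces it) is an assumption, since this document leaves the criterion to the implementation.

```python
from typing import Dict


class SchemaRegistry:
    """Hypothetical registry holding at most one active schema per namespace."""

    def __init__(self) -> None:
        self._active: Dict[str, dict] = {}

    def register(self, namespace: str, schema: dict, replace: bool = False) -> None:
        # Assumed criterion: the first schema registered stays active
        # unless the caller opts in to replacing it.
        if namespace in self._active and not replace:
            raise ValueError(f"a schema is already active for namespace {namespace!r}")
        self._active[namespace] = schema

    def active(self, namespace: str) -> dict:
        """Return the single active schema for a namespace (KeyError if none)."""
        return self._active[namespace]


registry = SchemaRegistry()
registry.register("com.example.docs", {"version": 1})
```

A second `register` call for `com.example.docs` raises `ValueError` unless `replace=True` is passed, so two schemas can never be applied to the same namespace at once.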
6. Node Definitions (`Node:`)
6.1 Basic form
Node: Node Name
    Description: Node description
    Type: Type
    Children:
        Child: child_name. May include a namespace when it differs from the target namespace
            Min: optional; minimum number of times the child must appear
            Max: optional; maximum number of times the child may appear
Rules:
- `Node Name` MUST be unique within the schema.
- Each `Node` defines the semantics of the node in the target namespace.
- If `Type` is omitted, the default type is `INLINE`.
6.2 Values in ENUM types
Node: Node Name
    Description: Node description
    Type: ENUM
    Children:
        Child: child_name. May include a namespace when it differs from the target namespace
            Min: optional; minimum number of times the child must appear
            Max: optional; maximum number of times the child may appear
    Values:
        Value: value 1
        Value: value 2
        Value: value 3
The ENUM type (and only ENUM) can specify a Values node with the allowed values (Value nodes).
At least one Value must exist.
7. Children (`Children:`) and cross namespaces
A node can have a `Children` entry. If present, it must contain one or more `Child` nodes
describing the allowed children.
A Child can belong to another namespace, in which case it is indicated in the child name. Example:
Node: node name
    Children:
        Child: child name (child.namespace)
            Min: 0
            Max: 1
- If the namespace is omitted, the target namespace of the schema is assumed.
- If a namespace is given, the child belongs to that namespace, and its definition is made in the schema document for that namespace.
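The defaulting rule can be sketched as a small helper that splits a `Child` value into a name and an effective namespace. `resolve_child` is a hypothetical function; the regular expression is an assumption about the reference syntax, based only on the examples in this document, and it accepts the namespace with or without a leading `@` because both forms appear here.

```python
import re
from typing import Tuple

# Assumed syntax: "<name>" or "<name> (<namespace>)", with an optional "@" prefix.
_CHILD_REF = re.compile(
    r"^\s*(?P<name>[^()]+?)\s*(?:\(\s*@?(?P<ns>[^()\s]+)\s*\))?\s*$"
)


def resolve_child(ref: str, target_namespace: str) -> Tuple[str, str]:
    """Return (child name, effective namespace) for a Child value."""
    match = _CHILD_REF.match(ref)
    if match is None:
        raise ValueError(f"malformed child reference: {ref!r}")
    # No namespace in the reference: fall back to the schema's target namespace.
    return match.group("name"), match.group("ns") or target_namespace
```

For example, `resolve_child("Content", "com.example.docs")` yields the target namespace, while `resolve_child("Metadata (@com.google.html)", "com.example.docs")` yields `com.google.html`.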
7.1. Nodes must be explicitly shown in schemas.
Every node that appears in Children must have its own definition as Node: in its corresponding schema.
This way we avoid “ghost” children and guarantee that all nodes have defined semantics.
This implies:
- If `Metadata (@com.google.html)` appears, then **there must exist a schema for `com.google.html` and within it there must exist `Node: Metadata`**. Validation is not necessary at that moment, but it is when validating a document for that namespace.
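A check for "ghost" children could look like the sketch below. It assumes all relevant schemas have already been loaded into a simple dictionary model (namespace, then node name, then a list of resolved child references); that model and the name `find_ghost_children` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical model: namespace -> node name -> list of (child name, child namespace).
Schemas = Dict[str, Dict[str, List[Tuple[str, str]]]]


def find_ghost_children(schemas: Schemas) -> List[Tuple[str, str, str, str]]:
    """Return (namespace, parent, child, child namespace) for each undefined child."""
    missing = []
    for namespace, nodes in schemas.items():
        for parent, children in nodes.items():
            for child, child_ns in children:
                # The child needs its own Node definition in its namespace's schema.
                if child not in schemas.get(child_ns, {}):
                    missing.append((namespace, parent, child, child_ns))
    return missing


schemas: Schemas = {
    "com.example.docs": {
        "Document": [("Metadata", "com.google.html"), ("Content", "com.example.docs")],
        "Content": [],
    },
    "com.google.html": {"Metadata": []},
}
```

With both schemas loaded the function returns an empty list; removing the `com.google.html` entry makes it report the `Metadata` reference.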
8. Cardinalities
Cardinalities are expressed through the `Min` and `Max` nodes of a `Child`. Both are optional; when present, they are non-negative integers.
They indicate the minimum and maximum number of occurrences of the child.
Rules:
- Cardinalities apply per instance of the parent node.
- Only direct children are counted, matched by name and effective namespace.
- A conforming validator MUST check the cardinalities.
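The counting rule can be sketched as follows, assuming the direct children of one parent instance are available as `(name, effective namespace)` pairs and the `Min`/`Max` values have been read into a dictionary. `check_cardinalities` and the type aliases are hypothetical names; children that have no rule at all are not examined by this sketch.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

ChildKey = Tuple[str, str]                    # (name, effective namespace)
Limits = Tuple[Optional[int], Optional[int]]  # (Min, Max); None means "not set"


def check_cardinalities(children: List[ChildKey], rules: Dict[ChildKey, Limits]) -> List[str]:
    """Check the direct children of ONE parent instance against Min/Max rules."""
    counts = Counter(children)
    errors = []
    for key, (minimum, maximum) in rules.items():
        seen = counts[key]  # Counter returns 0 for children that never appear
        if minimum is not None and seen < minimum:
            errors.append(f"{key[0]} ({key[1]}): expected at least {minimum}, found {seen}")
        if maximum is not None and seen > maximum:
            errors.append(f"{key[0]} ({key[1]}): expected at most {maximum}, found {seen}")
    return errors


rules = {
    ("Metadata", "com.google.html"): (None, 1),
    ("Content", "com.example.docs"): (1, 1),
}
```

Because the function receives the children of a single parent, calling it once per parent instance gives the per-instance behaviour required above.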
9. Types
Types define:
- The form of the node value (inline, block `>>`, or none).
- Whether the node is compatible with children.
- Content validation.
They are defined in the Node, with a Type element. Example:
Node: node name
    Type: NODE_TYPE
    Children:
        Child: a child name
Other considerations:
- The type does NOT control requiredness, only the form and validity of the value. Whether a node must appear is controlled through cardinality.
- Types that admit block text (`TEXT`, `CODE`, `BASE64`, ...) are NOT compatible with children, and a children definition (`Children`) on such a node MUST give a parse error.
9.1. Basic structural types
A parser MUST allow these types and MUST validate the structure.
| Type | Text forms | Compatible children | Description / Validation |
|---|---|---|---|
| INLINE | INLINE | YES | Inline text :. Default type. Can have children. |
| BLOCK | BLOCK | NO | Only text block >>. |
| TEXT | INLINE/BLOCK | NO | Generic text. Inline : or block >>. Cannot have children. |
| GROUP | NONE | YES | Empty text. Only allowed children. Node container. |
9.2. Basic INLINE content types
A parser MUST allow these types and SHOULD validate the structure.
| Type | Text forms | Compatible children | Description / Validation |
|---|---|---|---|
| BOOLEAN | INLINE | YES | true / false. |
| NUMBER | INLINE | YES | JSON-format number. |
| DATE | INLINE | YES | YYYY-MM-DD. |
| ENUM | INLINE | YES | Only specified values (see 9.5). |
9.3. Extended INLINE content types
A parser SHOULD allow these types and SHOULD validate the structure.
| Type | Text forms | Compatible children | Description / Validation |
|---|---|---|---|
| INTEGER | INLINE | YES | Number without decimals (positive and negative). |
| NATURAL | INLINE | YES | Numbers greater than or equal to 0 without decimals. |
| TIME | INLINE | YES | ISO 8601, hh:mm:ss |
| TIMESTAMP | INLINE | YES | Full ISO 8601. |
| UUID | INLINE | YES | UUID |
| URL | INLINE | YES | URL/URI |
| INLINE | YES |
9.4 Binary content types INLINE/BLOCK
A parser SHOULD allow these types and MAY validate the structure.
| Type | Text forms | Compatible children | Description / Validation |
|---|---|---|---|
| HEXADECIMAL | INLINE / BLOCK | NO | [0-9A-Fa-f]+. Hexadecimal string |
| BINARY | INLINE / BLOCK | NO | [01]+ Binary string. |
| BASE64 | INLINE / BLOCK | NO | Base64 block. |
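Content validation for several of the types in the tables above could be sketched as below. `is_valid_value` is a hypothetical helper; the patterns are assumptions where the tables are not explicit (for instance, `INTEGER` is taken to allow only an optional leading minus sign, and `NUMBER` follows the JSON number grammar). Types not listed in the sketch, such as `TIME`, `TIMESTAMP`, `UUID`, `URL`, and `BASE64`, are left out.

```python
import re
from datetime import date

_PATTERNS = {
    # JSON number grammar: no leading zeros, optional fraction and exponent.
    "NUMBER": re.compile(r"-?(0|[1-9][0-9]*)(\.[0-9]+)?([eE][+-]?[0-9]+)?"),
    "INTEGER": re.compile(r"-?[0-9]+"),      # assumed: optional minus sign only
    "NATURAL": re.compile(r"[0-9]+"),
    "HEXADECIMAL": re.compile(r"[0-9A-Fa-f]+"),
    "BINARY": re.compile(r"[01]+"),
}
_DATE = re.compile(r"([0-9]{4})-([0-9]{2})-([0-9]{2})")


def is_valid_value(type_name: str, value: str) -> bool:
    """Return True if value has the form required by type_name."""
    if type_name == "BOOLEAN":
        return value in ("true", "false")
    if type_name == "DATE":
        match = _DATE.fullmatch(value)
        if match is None:
            return False
        try:
            date(*map(int, match.groups()))  # rejects impossible dates such as Feb 30
        except ValueError:
            return False
        return True
    pattern = _PATTERNS.get(type_name)
    if pattern is None:
        raise KeyError(f"type not covered by this sketch: {type_name}")
    return pattern.fullmatch(value) is not None
```

The `DATE` branch checks the shape first and the calendar second, so `2024-1-5` and `2024-02-30` are both rejected.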
9.5 ENUM Type
The ENUM type is special, since it allows enumerating the allowed values for that node. Characteristics:
- The check is CASE-SENSITIVE.
- Since they are inline values, leading and trailing whitespace is trimmed.
- The node must define `Values` with `Value` nodes, which represent the allowed values.
Example:
Node: Node Name
    Type: ENUM
    Values:
        Value: value 1
        Value: value 2
        Value: value 3
A schema parser MUST check the value of every ENUM node against its allowed values, and report an error when the value is not one of them.
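The comparison rules (trim, then exact case-sensitive match) fit in a few lines. `is_allowed_enum_value` is a hypothetical helper; trimming the allowed values as well is an assumption, made because `Value` nodes are themselves inline values.

```python
from typing import Iterable


def is_allowed_enum_value(raw: str, allowed: Iterable[str]) -> bool:
    """Trim both sides, then compare case-sensitively against the allowed values."""
    return raw.strip() in {value.strip() for value in allowed}


allowed_values = ["value 1", "value 2", "value 3"]
```

With these values, `"  value 2 "` is accepted after trimming, while `"Value 2"` is rejected because the comparison is case-sensitive.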
10. Normative Examples
10.1. Schema with cross-namespace references
Schema (@stxt.schema): com.example.docs
    Node: Document
        Type: GROUP
        Children:
            Child: Metadata (@com.google.html)
                Max: 1
            Child: Content
                Min: 1
                Max: 1
    Node: Content
        Type: BLOCK
And in com.google.html:
Schema (@stxt.schema): com.google.html
    Node: Metadata
        Type: INLINE
10.2. Valid document
Document (@com.example.docs):
    Metadata (@com.google.html): info
    Content>>
        Line 1
        Line 2
11. Schema Errors
A schema is invalid if:
- It defines two `Node` entries with the same name.
- It uses an unknown `Type`.
- It defines `Children` in a `Node` whose type does not allow children.
- The cardinality is invalid.
- A child appears in `Children` whose `Node` is not defined in its corresponding schema.
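The first four conditions can be checked on a single schema in isolation, as in the sketch below. It assumes node definitions have been read into plain dictionaries (`name`, optional `type`, optional `children` with `min`/`max`); `schema_errors` is a hypothetical name. "Invalid cardinality" is interpreted here as a negative value or `Min` greater than `Max`, which is an assumption, since this document does not spell it out. The last condition needs the schemas of other namespaces and is not covered.

```python
from typing import Dict, List

# Children compatibility per type, copied from the tables in section 9.
ALLOWS_CHILDREN: Dict[str, bool] = {
    "INLINE": True, "BLOCK": False, "TEXT": False, "GROUP": True,
    "BOOLEAN": True, "NUMBER": True, "DATE": True, "ENUM": True,
    "INTEGER": True, "NATURAL": True, "TIME": True, "TIMESTAMP": True,
    "UUID": True, "URL": True,
    "HEXADECIMAL": False, "BINARY": False, "BASE64": False,
}


def schema_errors(nodes: List[dict]) -> List[str]:
    """Collect structural errors in the Node definitions of one schema."""
    errors = []
    seen = set()
    for node in nodes:
        name = node["name"]
        type_name = node.get("type", "INLINE")  # INLINE is the default type
        children = node.get("children", [])
        if name in seen:
            errors.append(f"duplicate Node: {name}")
        seen.add(name)
        if type_name not in ALLOWS_CHILDREN:
            errors.append(f"{name}: unknown Type {type_name}")
        elif children and not ALLOWS_CHILDREN[type_name]:
            errors.append(f"{name}: Type {type_name} does not allow Children")
        for child in children:
            low, high = child.get("min"), child.get("max")
            negative = any(v is not None and v < 0 for v in (low, high))
            inverted = low is not None and high is not None and low > high
            if negative or inverted:  # assumed meaning of "invalid cardinality"
                errors.append(f"{name}: invalid cardinality for child {child['name']}")
    return errors
```

An empty result means none of these four conditions was detected; it does not by itself prove that every referenced child is defined.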
12. Conformance
An implementation is conforming if:
- It fully implements this document.
- It validates types, value forms, cardinalities, and allowed values (ENUM).
- It applies the strict rule of mandatory definition of all nodes referenced in `Children`.
- It rejects invalid documents and schemas.
13. Schema of the Schema (`@stxt.schema`)
This section defines the official schema of the schema system itself: the meta-schema that validates all documents in the @stxt.schema namespace.
13.1. Considerations
- Every schema document is: `Schema (@stxt.schema): <target-namespace>`
- A schema contains:
    - Optionally a `Description`.
    - One or more `Node` nodes.
- Each `Node`:
    - Has an inline value (the node name of the target namespace).
    - May optionally have:
        - `Description`
        - `Type`
        - `Children`, containing `Child` nodes
        - `Values`: only for ENUM, with `Value` nodes listing the allowed values.
- Each `Child` (element of `Children`) defines the name (and optionally a different namespace) and may have:
    - `Min`: minimum number of nodes that must appear. If it is absent, no minimum is established.
    - `Max`: maximum number of nodes that can appear. If it is absent, no maximum is established.
- The names (`Schema`, `Node`, `Type`, `Children`, `Child`, `Description`, `Min`, `Max`, `Values`, `Value`) belong to the `@stxt.schema` namespace.
13.2. Full Meta-Schema
Schema (@stxt.schema): @stxt.schema
    Node: Schema
        Children:
            Child: Description
                Max: 1
            Child: Node
                Min: 1
    Node: Node
        Children:
            Child: Type
                Max: 1
            Child: Children
                Max: 1
            Child: Description
                Max: 1
            Child: Values
                Max: 1
    Node: Children
        Type: GROUP
        Children:
            Child: Child
                Min: 1
    Node: Description
        Type: TEXT
    Node: Child
        Children:
            Child: Min
                Max: 1
            Child: Max
                Max: 1
    Node: Min
        Type: NATURAL
    Node: Max
        Type: NATURAL
    Node: Type
    Node: Values
        Children:
            Child: Value
                Min: 1
    Node: Value
13.3. Quick reading
- `Schema`: Inline value = target namespace (e.g. `com.example.docs`). Children: `Description` (?), `Node` (+).
- `Node`: Inline value = target node name (e.g. `Document`, `Autor`). Optional children:
    - `Type`: specific type (if missing ⇒ `INLINE`).
    - `Children`: node with the list of allowed `Child` nodes.
    - `Description`: explanatory text.
    - `Values`: allowed values (ENUM type only).
- `Type`: Inline (`INLINE`), with the type name (`GROUP`, `INLINE`, `NUMBER`, etc.).
- `Children`: `GROUP`; contains one or more `Child` nodes.
- `Description`: `TEXT`; can be inline or multiline.
13.4. Minimal valid example
Schema (@stxt.schema): com.example.docs
    Node: Document
13.5. Complete example
Schema (@stxt.schema): com.example.docs
    Description: Example schema
    Node: Document
        Type: GROUP
        Children:
            Child: Title
                Min: 1
                Max: 1
            Child: Author
            Child: Metadata (@com.google.html)
                Max: 1
    Node: Title
        Type: INLINE
    Node: Author
        Type: INLINE