
Extended Vocabularies

STEMMA uses a number of partially controlled vocabularies for its types, sub-types, and other taxonomies. These are collectively referred to here as “tag values” because they are specified within element content or attribute values. With the exception of source-type, they all belong to default implicit namespaces, which are implicitly rooted on the versioned default namespace specified in the standard xmlns attribute. The following namespaces are currently defined by STEMMA. The ones highlighted in blue constitute fully controlled vocabularies that cannot be extended.





Example values:

Name, Age, Role
Sibling, Parent
Deceased, Offspring
Deceased, Implied
Pet, Mascot
Cat, Dog
Footnote, Marginalia
RefNote, ShortRefNote
ImageCopy, Repository
Footnote, Inline
Text, Date, Integer
Short, Long
When, Where
Union, Travel, Birth
Nuclear, Blended
Military, Family
Union, Exclude
Trusted, Questionable
Primary, Secondary
Formal, SemiFormal
Link, Inline, Footnote
Alias, Nickname
Role, Age, Occupation
Wife, Brother, Mother
Head, Bride
Married, Widow
School, Cemetery, Ship
Title, Hierarchy
Street, District, Country
Title, Large
Artefact, Letter
Public, Family
Source, Inference
Original, Derivative
Caption, H1, Tablenote



Custom tag values may be defined by declaring a new namespace using an xmlns attribute in the <Dataset> header element. The prefix associated with that namespace can then be used to introduce custom tag values without any fear of clashing. For instance:


<Dataset Name='Example'
         xmlns:MyEvents='…'>
   …
   <Type> MyEvents:xyz </Type>


This mechanism uses the same XML namespace feature that prevents clashes between element names and attribute names from different origins. XML tag names (elements and attributes) are deemed to belong to a given namespace and must be qualified using a namespace prefix if this is not the default one, e.g. <xs:annotation>. The qualified form is referred to as a QName. When the mechanism is employed within attribute values or element content, it is more correctly called a CURIE (“Compact URI”), and the qualified value may not even be a valid identifier. In the STEMMA case, this occurs with Place coordinates.


Although the namespace prefix is bound to a namespace URI, the XML standard does not define how to map a QName to an equivalent URI specification. The XML Schema language (XSD) concatenates the local tag name and the namespace URI using a ‘#’ separator to create a fragment identifier, but it is not clear what happens if the namespace URI already ends with a ‘#’. The RDF model, on the other hand, simply concatenates the namespace URI and the local tag name with no separator (e.g. http://stemma.parallaxview.coDataset). Most RDF namespace URIs already end with a ‘#’ (or even a ‘/’), but not always. This is a well-known problem and a possible solution has been proposed at QNameQuagmire.
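The two conventions can be compared with a short Python sketch. The function names and the example.org namespace URIs are hypothetical and chosen only for illustration; the '#' join is deliberately naive so that the ambiguous case (a namespace URI that already ends with '#') is visible rather than resolved.

```python
def xsd_style_uri(namespace_uri: str, local_name: str) -> str:
    """Join namespace URI and local name with a '#' separator (naive)."""
    return f"{namespace_uri}#{local_name}"


def rdf_style_uri(namespace_uri: str, local_name: str) -> str:
    """Concatenate namespace URI and local name with no separator."""
    return namespace_uri + local_name


# Hypothetical namespace URIs, one without and one with a trailing '#'.
NS_PLAIN = "http://example.org/ns"
NS_HASH = "http://example.org/ns#"

print(xsd_style_uri(NS_PLAIN, "Dataset"))  # http://example.org/ns#Dataset
print(rdf_style_uri(NS_PLAIN, "Dataset"))  # http://example.org/nsDataset
print(rdf_style_uri(NS_HASH, "Dataset"))   # http://example.org/ns#Dataset
print(xsd_style_uri(NS_HASH, "Dataset"))   # http://example.org/ns##Dataset
```

The last line shows why the trailing-'#' case is unclear: a naive join yields a doubled separator, and any rule for avoiding it is a choice made by the implementation rather than something the XML standard defines.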


This does not directly impact the STEMMA use of namespaces, though. The above custom Event-type is defined by the pair (namespace URI bound to the MyEvents prefix, xyz), and the predefined Event-type ‘Union’ is defined by the pair (default Event-type namespace URI, Union). The main differences here are that these namespaces apply to tag values, and the non-default namespaces are local to the associated Dataset.
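As a minimal sketch of this pairing, the following Python function splits a tag value into a (namespace URI, local name) pair. The function name, the error handling, and both URIs are assumptions made for illustration; the real STEMMA namespace URIs are not reproduced here.

```python
from typing import Dict, Tuple

# Hypothetical URIs, used only for illustration.
DEFAULT_EVENT_TYPES = "http://example.org/stemma/event-type"
PREFIXES = {"MyEvents": "http://example.org/my-events"}


def resolve_tag_value(
    value: str, prefixes: Dict[str, str], default_ns: str
) -> Tuple[str, str]:
    """Return the (namespace URI, local name) pair for a tag value."""
    text = value.strip()  # element content may carry padding whitespace
    prefix, colon, local = text.partition(":")
    if not colon:
        # No prefix: the value belongs to the default implicit namespace.
        return default_ns, text
    if prefix not in prefixes:
        raise ValueError(f"undeclared namespace prefix: {prefix!r}")
    return prefixes[prefix], local


print(resolve_tag_value(" MyEvents:xyz ", PREFIXES, DEFAULT_EVENT_TYPES))
# ('http://example.org/my-events', 'xyz')
print(resolve_tag_value("Union", PREFIXES, DEFAULT_EVENT_TYPES))
# ('http://example.org/stemma/event-type', 'Union')
```

Comparing tag values as pairs, rather than as prefixed strings, means two Datasets that bind different prefixes to the same namespace URI still agree on the value.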


An XML parser normally discards any such prefixes once the XML has been loaded, since they usually just connect names to their respective namespace declarations. The exception to this is when they have been employed in attribute values or element content, as is the case in STEMMA and SOAP. The prefix-to-namespace mapping then has to be retained and made available to the program loading the XML. This is why the tag-value namespaces are expected to be associated with the enveloping Dataset element rather than any of the elements below it.
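One way a loading program might retain the mapping is sketched below, using the "start-ns" events of Python's xml.etree.ElementTree.iterparse. The sample document is a simplified, hypothetical Dataset (the Event wrapper and the namespace URI are assumptions), and the helper name load_with_prefixes is illustrative only.

```python
import io
import xml.etree.ElementTree as ET

# Simplified, hypothetical Dataset with one custom tag-value namespace.
SAMPLE = """<Dataset Name='Example' xmlns:MyEvents='http://example.org/my-events'>
  <Event><Type> MyEvents:xyz </Type></Event>
</Dataset>"""


def load_with_prefixes(xml_text):
    """Parse XML and return (root element, prefix-to-URI mapping)."""
    prefixes = {}
    root = None
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, payload in ET.iterparse(source, events=("start-ns", "start")):
        if event == "start-ns":
            prefix, uri = payload  # reported before the declaring element
            prefixes[prefix] = uri
        elif root is None:
            root = payload  # first "start" event is the document root
    return root, prefixes


root, prefixes = load_with_prefixes(SAMPLE)
for type_el in root.iter("Type"):
    prefix, _, local = type_el.text.strip().partition(":")
    print((prefixes[prefix], local))
# ('http://example.org/my-events', 'xyz')
```

Collecting every declaration into a single dictionary is only safe when prefixes are declared once, at the top of the document, which is consistent with declaring them on the enveloping Dataset element.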


Note that tag values should not be displayed directly in the UI of a genealogical product. See Locale-independence.


A namespace prefix DC, bound to the Dublin Core namespace URI, is required for the Dublin Core semantic types mentioned in this documentation. The use of a namespace prefix, rather than simply “DC.”, accommodates other semantic-type systems if necessary.


See Digital Freedom for a related discussion.


On a technical note, this usage of a URI is sometimes referred to as a URN although, strictly speaking, a URN is a particular form of URI that uses a "urn:" scheme prefix and is designed to support hierarchical naming of objects. A much-quoted example of it is ISBN book references. The syntax is therefore more rigid (e.g. urn:xx:yy:zz) and the allowable characters more restricted. The associated namespace also has to be officially registered and that administrative burden tends to lessen its usage.

What we’ve used here is still a URI but employed for naming purposes, and is hence not the same as a URL. W3C don't really have a separate category for this although the historical use of the term URN in this context is accepted. Some material describes it as a "namespace name" but that's not universal. The most familiar example is the "namespace URI" in XML bodies.

Essentially, this form of URI names an object, or a type, and is guaranteed to be unique by virtue of the ownership of the domain used. For instance, a URI based on the STEMMA domain is unique to STEMMA because the author owns that domain. It is also possible to derive URIs from a private email address, e.g. mailto:name@emaildomain?subject=types. Such namespace URIs are not designed to be dereferenceable, and so the scheme prefix does not imply any access protocol.


In summary, this style of URI can be created in a decentralised manner (unlike real URNs), it is extensible (supporting derivative names and types), and it can be versioned. This contrasts with the use of raw UUIDs, or even ones wrapped as URNs (e.g. urn:uuid:d8e6a531-5dee-47a1-a0e2-ca5dbffd87c0), since they are amorphous and isolated. See URNs, Namespaces, and Registries for more details, and Uniform Resource Identifier for a mention of URC.
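The contrast can be illustrated with a small Python sketch. The UUID is the one quoted above; the example.org names, and the way a derived and a versioned name are built from the base, are hypothetical and only one of many possible conventions.

```python
import uuid
from urllib.parse import urlsplit

# The UUID quoted in the text, wrapped as a URN: a valid name, but it
# carries no owner and no hierarchy that other names could extend.
opaque = uuid.UUID("d8e6a531-5dee-47a1-a0e2-ca5dbffd87c0").urn

# Hypothetical domain-based names: a base, a derivative, and a version.
base = "http://example.org/types"
derived = base + "/events"
versioned = derived + "/v2"

for name in (opaque, base, derived, versioned):
    parts = urlsplit(name)  # pure string parsing, no network access
    print(parts.scheme, parts.netloc or "-", parts.path)
```

Only the domain-based names expose an owning authority (the netloc) and a path that can be extended, and none of them needs to be dereferenced for the comparison to work.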


See Uniform Resource Identifier for a good summary.