Release 4.8.0 is codenamed ‘The Six Degrees Release’.
This release introduces new features and resolves a number of issues raised by the TEI community. The majority of these changes and corrections are a consequence of feature requests or bugs reported by the TEI community using the GitHub tracking system. A full list of the issues resolved in the course of this release cycle may be found under the 4.8.0 milestone. Very special thanks to Michael Beißwenger and Harald Lüngen for their essential contributions to the new computer-mediated communication chapter. Many thanks also to the community contributors who reported issues, sent bug fixes, and helped with the drafting that led to this release, including: John Bampton, Benjamin W. Bohl, Lou Burnard, Martin Holmes, Martin de la Iglesia, Jessica Lu, Dominique Meeùs, Kiyonori Nagasaki, Bastian Politycki, Klaus Rettinghaus, Daniel Schwarz, Peter Stadler, Christian Thomas, Conal Tuohy, and Nicolas Vaughan.
The following changes are particularly worth highlighting in this release:
New encoding features
- A new chapter on computer-mediated communication (CMC) provides guidelines for structuring texts and corpora in TEI that encode the data and metadata of interactive posts from various media platforms (#1955 and PR #2537). The chapter introduces a new post element to encode a contribution to a CMC interaction, and with it:
  - Two new attribute classes, att.cmc and att.indentation, of which post is a member.
  - The following new attributes are specific to post:
    - modality to document whether a post is written or spoken,
    - replyTo to indicate a previous post to which a post replies or refers,
    - generatedBy (a member of att.cmc) with suggested values "human", "template", "system", "bot", and "unspecified" to indicate how content in a post is generated.
  - The post element is also a member of att.global, att.ascribed, att.datable, att.timed, att.fragmentable, att.docStatus, att.typed, and att.canonical, giving it access to many attributes to identify agents responsible for posts, indicate their timing, and categorize them.
  - A new attribute, indentLevel, is not restricted to CMC but provided in the class att.indentation to describe indentation of text content in a source, for example to mark a post’s level of indentation in a discussion thread.
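As a rough illustration of how the new attributes fit together, the following Python sketch reads a small thread of post elements and collects modality, generatedBy, replyTo, and indentLevel for each. The sample fragment is hypothetical: its identifiers, participants, and text are invented, it has not been validated against the TEI schema, and the function name summarize_posts is an assumption made for this example.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical sample: ids, participants, and text are invented for illustration.
SAMPLE_THREAD = """
<div xmlns="http://www.tei-c.org/ns/1.0" type="thread">
  <post xml:id="p1" modality="written" generatedBy="human" who="#userA" indentLevel="0">
    <p>Has anyone tried the new release?</p>
  </post>
  <post xml:id="p2" modality="written" generatedBy="human" who="#userB"
        replyTo="#p1" indentLevel="1">
    <p>Yes, it validates fine here.</p>
  </post>
  <post xml:id="p3" modality="written" generatedBy="bot" replyTo="#p2" indentLevel="2">
    <p>This thread has been archived.</p>
  </post>
</div>
"""


def summarize_posts(xml_text):
    """Return one dict per post with the CMC-related attribute values."""
    root = ET.fromstring(xml_text)
    rows = []
    for post in root.iter(f"{{{TEI_NS}}}post"):
        reply_to = post.get("replyTo")
        rows.append({
            "id": post.get(XML_ID),
            "modality": post.get("modality"),
            "generatedBy": post.get("generatedBy"),
            # Strip the leading '#' of a local pointer; None if the post is not a reply.
            "replyTo": reply_to.lstrip("#") if reply_to else None,
            "indentLevel": int(post.get("indentLevel", "0")),
        })
    return rows


for row in summarize_posts(SAMPLE_THREAD):
    print(row)
```

The sketch treats replyTo as a pointer to a post in the same document; pointers to external resources would need different handling.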
Changes to content models
- The quote element is now a member of model.biblPart, permitting it to be used within bibl (#2544 and PR #2557).
- The Guidelines now deprecate the use of superEntry and re elements as superfluous since the entry element may now self-nest (#2488, #2487, PR #2532, and #2521).
- The event element is now more efficiently modeled using model.eventLike, with no changes to its content (#2524 and PR #2525).
- To improve gaiji descriptions, the scheme attribute was added to att.gaijiProp, and mapping, localProp, unicodeProp, and unihanProp were added to att.datable (#2132 and PR #2511).
- The datatype teidata.probability was previously defined too broadly as xsd:double, and has now been constrained to a value between 0 and 1 (#2518 and PR #2519).
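To make the tightened teidata.probability constraint concrete, here is a minimal Python sketch of an equivalent range check. It assumes the bounds 0 and 1 are themselves allowed, and the function name is_valid_probability is hypothetical. Python's float() is more lenient about lexical forms than xsd:double, so this approximates the schema constraint rather than replacing validation.

```python
import math


def is_valid_probability(value):
    """Return True if `value` parses as a finite number from 0 to 1 (bounds assumed inclusive)."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return False
    # NaN and infinities parse as floats but cannot be probabilities.
    return math.isfinite(number) and 0.0 <= number <= 1.0


for candidate in ["0", "0.25", "1", "1.5", "-0.1", "high"]:
    print(candidate, is_valid_probability(candidate))
```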
The following changes introduced with this release could invalidate ODD customizations in TEI projects. Those maintaining such customizations may need to adapt their ODD files accordingly.
ODD-breaking changes and deprecation
- Following a deprecation period that has now ended, the content element, which declares the content model of an element being specified in an ODD, now requires exactly one child element. If several RELAX NG elements are desired, they must now be wrapped in an rng:div (#2381 and PR #2409).
- In order to avoid ambiguity, Schematron constraints in ODDs must now include an sch:rule element with a context attribute, and the Guidelines have been updated to reflect this change (#2510 and PR #2513).
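Maintainers who want a quick look at whether an ODD is affected by the two changes above could start from something like the following Python sketch. It is not part of the TEI tooling: the function name check_odd and the sample ODD fragment are hypothetical, and the checks are deliberately simple (they count the children of each content element and look for an sch:rule with a context attribute inside each constraint, assuming every constraint holds Schematron).

```python
import xml.etree.ElementTree as ET

TEI = "http://www.tei-c.org/ns/1.0"
SCH = "http://purl.oclc.org/dsdl/schematron"

# Hypothetical ODD fragment showing both problems.
SAMPLE_ODD = """
<schemaSpec xmlns="http://www.tei-c.org/ns/1.0"
            xmlns:rng="http://relaxng.org/ns/structure/1.0"
            xmlns:sch="http://purl.oclc.org/dsdl/schematron"
            ident="demo">
  <elementSpec ident="a" mode="add">
    <content>
      <rng:text/>
      <rng:empty/>
    </content>
    <constraintSpec ident="demo-rule" scheme="schematron">
      <constraint>
        <sch:assert test="@n">An n attribute is expected.</sch:assert>
      </constraint>
    </constraintSpec>
  </elementSpec>
</schemaSpec>
"""


def check_odd(xml_text):
    """Return a list of human-readable problems found in an ODD fragment."""
    root = ET.fromstring(xml_text)
    problems = []
    # content must have exactly one child element.
    for content in root.iter(f"{{{TEI}}}content"):
        count = len(list(content))
        if count != 1:
            problems.append(f"content has {count} child elements, expected 1")
    # Each constraint is assumed to hold Schematron and needs sch:rule with @context.
    for constraint in root.iter(f"{{{TEI}}}constraint"):
        rules = list(constraint.iter(f"{{{SCH}}}rule"))
        if not rules:
            problems.append("constraint has no sch:rule")
        elif any(rule.get("context") is None for rule in rules):
            problems.append("sch:rule lacks a context attribute")
    return problems


for problem in check_odd(SAMPLE_ODD):
    print(problem)
```

A constraint written in a language other than Schematron would be reported as a false positive by this sketch, so its output should be read as a list of places to inspect rather than a validation result.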
Improvements of prose and examples
- The definition of surface has been updated to reflect the context of embedded transcription (#2476).
- The description of teiCorpus has been updated in the language corpora chapter (#2445 and PR #2503).
- Examples of geo elements no longer include a comma to separate geocoordinates to better align with the prose of the Guidelines (#2560).
- Schematron constraints were simplified to remove redundancies in the simplePrint ODD (PR #2540).
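Projects that followed the older geo examples may hold coordinate pairs separated by a comma. The Python sketch below shows one way to rewrite such values with whitespace only. The function name normalize_geo is hypothetical, and the sketch assumes decimal points (not decimal commas) inside each coordinate.

```python
import re


def normalize_geo(text):
    """Join coordinates with single spaces, dropping any comma separators."""
    # Split on runs of commas and/or whitespace; assumes '.' is the decimal mark.
    parts = [part for part in re.split(r"[,\s]+", text.strip()) if part]
    return " ".join(parts)


print(normalize_geo("41.687142, -74.870109"))
print(normalize_geo("41.687142 -74.870109"))
```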
Housekeeping
- The HTML Guidelines pages have been updated to output the current standard HTML 5 doctype (#2508).
- For Guidelines processing, we have removed unnecessary mode attributes with the value "add" from the element specifications, since all content is simply added to the Guidelines with no other mode possible (#2498 and #2520).
- The copyright notice in XML comments at the top of the Guidelines XML files has been simplified (#2514 and PR #2526).
- Superfluous namespace declarations have been removed from Guidelines datatype specification files (PR #2522).
In addition, many improvements have been made to the XSLT stylesheets (which provide processing of TEI ODD files for Roma and TEIGarage as well as other TEI conversions). The Stylesheets are maintained separately from the Guidelines at https://github.com/TEIC/Stylesheets. A full list of the issues resolved in the course of this release cycle may be found under the 7.57.0 milestone.
Highlights of this release follow.
- Updating the Stylesheets from XSLT 2.0 to XSLT 3.0 (Stylesheets #639, PR #649, and PR #663).
- Identifying and solving a problem in our testing of DOCX to TEI conversion when the Stylesheets version changes (Stylesheets #646 and PR #650).
- Correcting a bug in the common function.xsl that caused a "sup" value of rend to be treated the same as a "sub" value (Stylesheets #584 and PR #670).
- Identifying and solving several problems in ODD processing:
  - Repairs to our transform scripts (Stylesheets #652 and PR #653);
  - Ensuring that att.repeatable is properly processed on sequence, alternate, and anyElement (Stylesheets #627 and PR #633);
  - Correcting a bug that caused attributes to be copied from a constraint element to the corresponding generated sch:rule element (Stylesheets #659 and PR #660);
  - Repairing another bug that caused an output constraintSpec to appear in the wrong location in a constructed ODD (Stylesheets #319 and PR #675);
  - Solving a serious problem mentioned in several tickets (Stylesheets #645, #678, #680, and PR #681), in which multiple elementSpec elements sharing the same ident value led to a build error in odd2odd.xsl;
  - Solving a problem when replacing, changing, or deleting an attribute marked with the wrong class (e.g. in an outdated class after it has been relocated to a different class), in which the processed ODD produced duplicate attDef elements (Stylesheets #687 and PR #690).
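For the elementSpec problem listed above, it can help to see which ident values occur more than once in an ODD. The following Python sketch lists them. It is independent of the Stylesheets and does not reproduce how odd2odd.xsl handles such cases; the function name duplicate_element_specs and the sample fragment are hypothetical, and the check ignores the mode and ns attributes.

```python
import xml.etree.ElementTree as ET
from collections import Counter

TEI = "http://www.tei-c.org/ns/1.0"

# Hypothetical ODD fragment with two elementSpec elements sharing an ident.
SAMPLE_SPECS = """
<schemaSpec xmlns="http://www.tei-c.org/ns/1.0" ident="demo">
  <elementSpec ident="note" mode="change"/>
  <elementSpec ident="p" mode="change"/>
  <elementSpec ident="note" mode="change"/>
</schemaSpec>
"""


def duplicate_element_specs(xml_text):
    """Return the sorted ident values used by more than one elementSpec."""
    root = ET.fromstring(xml_text)
    counts = Counter(
        spec.get("ident") for spec in root.iter(f"{{{TEI}}}elementSpec")
    )
    return sorted(
        ident for ident, n in counts.items() if ident is not None and n > 1
    )


print(duplicate_element_specs(SAMPLE_SPECS))
```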