iv. About These Guidelines

These Guidelines have been developed and are maintained by the Text Encoding Initiative Consortium (TEI); see iv.2. Historical Background. They are addressed to anyone who works with any kind of textual resource in digital form.

They make recommendations about suitable ways of representing those features of textual resources which need to be identified explicitly in order to facilitate processing by computer programs. In particular, they specify a set of markers (or tags) which may be inserted in the electronic representation of the text, in order to mark the text structure and other features of interest. Many, or most, computer programs depend on the presence of such explicit markers for their functionality, since without them a digitized text appears to be nothing but a sequence of undifferentiated bits. The success of the World Wide Web, for example, is partly a consequence of its use of such markup to indicate such features as headings and lists on individual pages, and to indicate links between pages. The process of inserting such explicit markers for implicit textual features is often called ‘markup’, or equivalently within this work ‘encoding’; the term ‘tagging’ is also used informally. We use the term encoding scheme or markup language to denote the complete set of rules associated with the use of markup in a given context; we use the term markup vocabulary for the specific set of markers or named distinctions employed by a given encoding scheme. Thus, this work both describes the TEI encoding scheme, and documents the TEI markup vocabulary.
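As a minimal illustration of why explicit markers matter to a program, the following sketch parses a small marked-up fragment using only the Python standard library. The element names used (div, head, list, item) are TEI elements, but the fragment and its content are invented for this illustration, and the helper function is hypothetical rather than part of any TEI software.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the content is invented for illustration only.
FRAGMENT = """
<div>
  <head>Shopping</head>
  <list>
    <item>bread</item>
    <item>cheese</item>
  </list>
</div>
""".strip()


def headings_and_items(xml_text):
    """Return the headings and list items identified by explicit markup."""
    root = ET.fromstring(xml_text)
    headings = [h.text for h in root.iter("head")]
    items = [i.text for i in root.iter("item")]
    return headings, items


headings, items = headings_and_items(FRAGMENT)
print(headings)
print(items)
```

Without the tags, a program would have to guess which of the three words is a heading and which are list items; with them, the distinction is explicit.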

The TEI encoding scheme is of particular usefulness in facilitating the loss-free interchange of data amongst individuals and research groups using different programs, computer systems, or application software. Since they contain an inventory of the features most often deployed for computer-based text processing, these Guidelines are also useful as a starting point for those designing new systems and creating new materials, even where interchange of information is not a primary objective.

These Guidelines apply to texts in any natural language, of any date, in any literary genre or text type, without restriction on form or content. They treat both continuous materials (‘running text’) and discontinuous materials such as dictionaries and linguistic corpora. Though principally directed to the needs of the scholarly research community, these Guidelines are not restricted to esoteric academic applications. They are also useful for librarians maintaining and documenting electronic materials, and for publishers and others creating or distributing electronic texts. Although they focus on problems of representing in electronic form texts which already exist in traditional media, these Guidelines are also applicable to textual material which is ‘born digital’. We believe them to be adequate to the widest variety of currently existing practices in using digital textual data, but by no means limited to them.

The rules and recommendations made in these Guidelines are expressed in terms of what is currently the most widely-used markup language for digital resources of all kinds: the Extensible Markup Language (XML), as defined by the World Wide Web Consortium's XML Recommendation. However, the TEI encoding scheme itself does not depend on this language; it was originally formulated in terms of SGML (the ISO Standard Generalized Markup Language), a predecessor of XML, and may in future years be re-expressed in other ways as the field of markup develops and matures. For more information on markup languages see chapter v. A Gentle Introduction to XML; for more information on the associated character encoding issues see chapter vi. Languages and Character Sets.

This document provides the authoritative and complete statement of the requirements and usage of the TEI encoding scheme. As such, although it includes numerous small examples, it must be stressed that this work is intended to be a reference manual rather than a tutorial guide.

The remainder of this chapter comprises three sections. The first gives an overview of the structure and notational conventions used throughout these Guidelines. The second enumerates the design principles underlying the TEI scheme and the application environments in which it may be found useful. Finally, the third section gives a brief account of the origins and development of the Text Encoding Initiative itself.

iv.1. Structure and Notational Conventions of this Document

The remaining two sections of the front matter to these Guidelines provide background tutorial material for those unfamiliar with basic markup technologies. Following the present introductory section, we present a detailed introduction to XML itself, intended to cover in a relatively painless manner as much as the novice user of the TEI scheme needs to know about markup languages in general and XML in particular. This is followed by a discussion of the general principles underlying current practice in the representation of different languages and writing systems in digital form. This chapter is largely intended for the user unfamiliar with the Unicode encoding systems, though the expert may also find its historical overview of interest.

The body of this edition of these Guidelines proper contains 23 chapters arranged in increasing order of specialist interest. The first five chapters discuss in depth matters likely to be of importance to anyone intending to apply the TEI scheme to virtually any kind of text. The next seven focus on particular kinds of text: verse, drama, spoken text, dictionaries, and manuscript materials. The next nine chapters deal with a wide range of topics, one or more of which are likely to be of interest in specialist applications of various kinds. The last two chapters deal with the XML encoding used to represent the TEI scheme itself, and provide technical information about its implementation. The last chapter also defines the notion of TEI conformance and its implications for interchange of materials produced according to these Guidelines.

As noted above, this is a reference work, and is not intended to be read through from beginning to end. However, the reader wishing to understand the full potential of the TEI scheme will need a thorough grasp of the material covered by the first four chapters and the last two. Beyond that, readers are advised to select chapters according to their specific interests: one of the strengths of the TEI architecture is its modular nature.

As far as possible, extensive cross-references are provided wherever related topics are dealt with; these are particularly effective in the online version of these Guidelines. In addition, a series of technical appendixes provides detailed formal definitions for every element, every class, and every macro discussed in the body of the work; these are also cross-linked as appropriate. Finally, a detailed bibliography is provided, which identifies the source of many examples cited in the text as well as documenting works referred to, and listing other relevant publications.

As an aid to the reader, most chapters of these Guidelines follow the same basic organization. The chapter begins with an overview of the subjects treated within it, linked to the following subsections. Within each section where new elements are described, a summary table is first given, which provides their names and a brief description of their intended usage. This is then followed where appropriate by further discussion of each element, including wherever possible usage examples taken somewhat eclectically from a variety of real sources. These examples are not intended to be exhaustive, but rather to suggest typical ways in which the elements concerned may usefully be applied. Where appropriate, a link to a statement of the source for most examples is provided in the online version. Within the examples, use of whitespace such as newlines or indentation is simply intended to aid legibility, and is not prescriptive or normative.

Wherever TEI elements or classes are mentioned in the text, they are linked in the online version to the relevant reference specification for the element or class concerned. Element names are always given in the form <name>, where ‘name’ is the generic identifier of the element; empty elements such as <pb/> or <anchor/> include a closing slash to distinguish them wherever they are discussed. References to attributes take the form @attname, where ‘attname’ is the name of the attribute. References to classes are also presented as links, for example model.divLike for a model class, and att.global for an attribute class.

TEI Naming Conventions

These Guidelines use a more or less consistent set of conventions in the naming of XML elements and classes. This section summarizes those conventions.

Element and Attribute Names

An unadorned name such as ‘blort’ is the name of a TEI element or attribute.¹

The following conventions apply to the choice of names:

Elements are given generic identifiers as far as possible consisting of one or more tokens, by which we mean whole words or recognisable abbreviations of them, taken from the English language.
Where an element name contains more than one token, the first letter of the second token, and of any subsequent ones, is capitalized, as in for example biblStruct, listPerson. This ‘camelCasing’ is used also for attribute names and symbolic values.
Module names also use whole words, for the most part, but are always all lower case.
The specification for an element or attribute whose name contains abbreviations generally also includes a gloss element providing the expanded sense of the name.
An element specification may also contain approved translations for element or attribute names in one or more other languages using the altIdent element; this is not however generally done in TEI P5.
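The camelCasing convention described above can be sketched in a few lines of Python. The function names below are hypothetical and are offered only as an illustration of the convention; the sketch assumes names made of ASCII letters and digits, and does not handle names written entirely in capitals.

```python
import re


def split_tokens(name):
    """Split a camelCased name into its lower-cased tokens."""
    # A token is an optional capital letter followed by lower-case
    # letters or digits, e.g. 'bibl' and 'Struct' in 'biblStruct'.
    return [t.lower() for t in re.findall(r"[A-Z]?[a-z0-9]+", name)]


def join_tokens(tokens):
    """Build a camelCased name: first token lower case, the rest capitalized."""
    first, *rest = tokens
    return first.lower() + "".join(t[:1].upper() + t[1:].lower() for t in rest)


print(split_tokens("biblStruct"))
print(join_tokens(["list", "person"]))
```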

Whole words are generally preferred for clarity. The following abbreviations are however commonly used within generic identifiers:

att: attribute
bibl: bibliographic description or reference in a bibliography
cat: category, especially as used in text classification
char: character, typically a Unicode character
doc: document: this usually refers to the original source document which is being encoded
decl: declaration: has a specific sense in the TEI header, as discussed in 2.1.2 Types of Content in the TEI Header
desc: description: has a specific sense in the TEI header, as discussed in 2.1.2 Types of Content in the TEI Header
grp: group. In TEI usage, a group is distinguished from a list in that the former associates several objects which act as a single entity, while the latter does not. For example, a linkGrp combines several link elements which have certain properties in common, whereas a listBibl simply lists a number of otherwise unrelated bibl elements.
interp: interpretation or analysis
lang: (natural) language
ms: manuscript
org: organization, that is, a named group of people or legal entity
rdg: reading or version found in a specific witness
ref: reference or link
spec: technical specification or definition
stmt: statement: used in a specific sense in the TEI header, as discussed in 2.1.2 Types of Content in the TEI Header
struct: structured: that is, containing a specific set of named elements rather than ‘mixed content’
val: value, for example of a variable or an attribute
wit: witness: that is, a specific document which attests specific readings in a textual tradition or apparatus

Some abbreviations are used inconsistently: for example, add is an addition, and addSpan is a spanning addition, but addName is an additional name, not the name of an addition. Such inconsistencies are relatively few in number, and it is hoped to remove them in subsequent revisions of these Guidelines.

Some elements have very short abbreviated names: these are for the most part elements which are likely to be used very frequently in a marked up text, for example p (paragraph), s (segment), hi (highlighted phrase), ptr (pointer), div (division), etc. We do not specifically list such elements here: as noted above, an expansion of each such abbreviated name is provided within the documentation using the gloss element.

Class, Macro, and Datatype Names

All named objects other than elements and attributes have one of the following prefixes, which indicate whether the object is a module, an attribute class, a model class, a datatype, or a macro:

Component	Name	Example
Attribute Classes	att.*	att.global
Model Classes	model.*	model.biblPart
Macros	macro.*	macro.paraContent
Datatypes	teidata.*	teidata.pointer

The concepts of model class, attribute class, etc. are defined in 1 The TEI Infrastructure. Here we simply note some conventions about their naming.

The following rules apply to attribute class names:

Attribute class names take the form att.xxx, where xxx is typically an adjective, or a series of adjectives separated by dots, describing a property common to the attributes which make up the class.
Attributes with the same name are considered to have the same semantics, whether the attribute is inherited from a class, or locally defined.

The following rules apply to model class names:

Model classes have names beginning model. followed by a root name, and zero or more suffixes as described below.
A root name may be the name of an element, generally the prototypical parent or sibling for elements which are members of the class.
The first suffix should be Part, if the class members are all children of the element named rootname; or Like, if the class members are all siblings of the element named rootname.
The rootname global is used to indicate that class members are permitted anywhere in a TEI document.
Additional suffixes may be added, prefixed by a dot, to distinguish subclasses, semantic or structural.

For example, the class of elements which can form part of a div is called model.divPart. This class includes as a subclass the elements which can form part of a div in a spoken text, which is named model.divPart.spoken.
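The prefix and suffix rules above lend themselves to a small parsing sketch. The two functions below are hypothetical helpers written for this illustration; they rely only on the prefixes in the table and on the Part and Like suffixes described above, and they do not consult the actual TEI class definitions.

```python
# Prefixes taken from the table of component names above.
PREFIXES = {
    "att.": "attribute class",
    "model.": "model class",
    "macro.": "macro",
    "teidata.": "datatype",
}


def classify(name):
    """Return the kind of component implied by the prefix, or None."""
    for prefix, kind in PREFIXES.items():
        if name.startswith(prefix):
            return kind
    return None  # no recognised prefix, e.g. an element or attribute name


def parse_model_class(name):
    """Split a model class name into (root name, relation, subclass suffixes)."""
    if not name.startswith("model."):
        raise ValueError("not a model class name: " + name)
    first, *subclasses = name[len("model."):].split(".")
    for suffix, relation in (("Part", "children"), ("Like", "siblings")):
        if first.endswith(suffix) and len(first) > len(suffix):
            return first[: -len(suffix)], relation, subclasses
    return first, None, subclasses  # e.g. the root name 'global'


print(classify("att.global"))
print(parse_model_class("model.divPart.spoken"))
```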

Design Principles

Because of its roots in the humanities research community, the TEI scheme is driven by its original goal of serving the needs of research, and is therefore committed to providing a maximum of comprehensibility, flexibility, and extensibility. More specific design goals of the TEI have been that these Guidelines should:

provide a standard format for data interchange
provide guidance for the encoding of texts in this format
support the encoding of all kinds of features of all kinds of texts studied by researchers
be application independent

This has led to a number of important design decisions, such as:

the choice of XML and Unicode
the provision of a large predefined tag set
encodings for different views of text
alternative encodings for the same textual features
mechanisms for user-defined modification of the scheme

We discuss some of these goals in more detail below.

The goal of creating a common interchange format which is application independent requires the definition of a specific markup syntax as well as the definition of a large set of elements or concepts. The syntax of the recommendations made in this document conforms to the World Wide Web Consortium's XML Recommendation (Bray et al. (eds.) (2006)) but their definition is as far as possible independent of any particular schema language.

The goal of providing guidance for text encoding suggests that recommendations be made as to what textual features should be recorded in various situations. However, when selecting certain features for encoding in preference to others, these Guidelines have tended to prefer generic solutions to specific ones, and to avoid areas where no consensus exists, while attempting to accommodate as many diverse views as feasible. Consequently, the TEI Guidelines make (with relatively rare exceptions) no suggestions or restrictions as to the relative importance of textual features. The philosophy of these Guidelines is ‘if you want to encode this feature, do it this way’—but very few features are mandatory. In the same spirit, while these Guidelines very rarely require you to encode any particular feature, they do require you to be honest about which features you have encoded, that is, to respect the meanings and usage rules they recommend for specific elements and attributes proposed.

The requirement to support all kinds of materials likely to be of interest in research has largely conditioned the development of the TEI into a very flexible and modular system. The development of other XML vocabularies or standards is typically motivated by the desire to create a single fully specified encoding scheme for use in a well-defined application domain. By contrast, the TEI is intended for use in a large number of rather ill-defined and often overlapping domains. It achieves its generality by means of the modular architecture described in 1 The TEI Infrastructure which enables each user to create a schema appropriate to their needs without compromising the interoperability of their data.

The Guidelines have been written largely with a focus on text capture (i.e. the representation in electronic form of an already existing copy text in another medium) rather than text creation (where no such copy text exists). Hence the frequent use of terms like ‘transcription’, ‘original’, ‘copy text’, etc. However, these Guidelines are equally applicable to text creation, although certain elements, such as sourceDesc, and certain attributes, such as the rendition indicators, will not be relevant in this case.

Concerning text capture the TEI Guidelines do not specify a particular approach to the problem of fidelity to the source text and recoverability of the original; such a choice is the responsibility of the text encoder. The current version of these Guidelines, however, provides a more fully elaborated set of tags for markup of rhetorical, linguistic, and simple typographic characteristics of the text than for detailed markup of page layout or for fine distinctions among type fonts or manuscript hands. It should be noted also that, with the present version of these Guidelines, it is no longer necessarily the case that an unmediated version of the source text can be recovered from an encoded text simply by removing the markup.
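One situation in which simply removing the markup fails to recover the source can be sketched as follows. This is offered as an illustration, not as the Guidelines' own explanation: it assumes an encoder who has recorded both an original reading and a correction using the TEI elements choice, sic, and corr. The sample sentence and the helper functions are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical encoded sentence: the source reads 'recieved',
# and the encoder has supplied a correction alongside it.
ENCODED = "<p>I <choice><sic>recieved</sic><corr>received</corr></choice> it.</p>"


def strip_markup(xml_text):
    """Remove all tags, keeping every piece of character data."""
    return "".join(ET.fromstring(xml_text).itertext())


def source_reading(xml_text):
    """Drop every corr element first, then remove the remaining markup."""
    root = ET.fromstring(xml_text)
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == "corr":
                # Note: this also discards any text directly following
                # the corr element; there is none in this example.
                parent.remove(child)
    return "".join(root.itertext())


print(strip_markup(ENCODED))
print(source_reading(ENCODED))
```

Naive stripping runs the two readings together, so recovering the source requires a process that understands which elements carry editorial intervention.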

In these Guidelines, no hard and fast distinction is drawn between ‘objective’ and ‘subjective’ information or between ‘representation’ and ‘interpretation’. These distinctions, though widely made and often useful in narrow, well-defined contexts, are perhaps best interpreted as distinctions between issues on which there is a scholarly consensus and issues where no such consensus exists. Such consensus has been, and no doubt will be, subject to change. The TEI Guidelines do not make suggestions or restrictions as to which of these features should be encoded. The use of the terms descriptive and interpretive about different types of encoding in these Guidelines is not intended to support any particular view on these theoretical issues. Historically, it reflects a purely practical division of responsibility amongst the original working committees (see further iv.2. Historical Background).

In general, the accuracy and the reliability of the encoding and the appropriateness of the interpretation are for the individual user of the text to determine. The Guidelines provide a means of documenting the encoding in such a way that a user of the text can know the reasoning behind that encoding, and the general interpretive decisions on which it is based. The TEI header should be used to document and justify many such aspects of the encoding, but the choice of TEI elements for a particular feature is in itself a statement about the interpretation reached by the encoder.

In many situations more than one view of a text is needed since no absolute recommendation to embody one specific view of text can apply to all texts and all approaches to them. Within limits, the syntax of XML ensures that some encodings can be ignored for some purposes. To enable encoding multiple views, these Guidelines not only treat a variety of textual features, but sometimes provide several alternative encodings for what appear to be identical textual phenomena. These Guidelines offer the possibility of encoding many different views of the text, simultaneously if necessary. Where different views of the formal structure of a text are required, as opposed to different annotations on a single structural view, however, the formal syntax of XML (which requires a single hierarchical view of text structure) poses some problems; recommendations concerning ways of overcoming or circumventing that restriction are discussed in chapter 21 Non-hierarchical Structures.

In brief, the TEI Guidelines define a general-purpose encoding scheme which makes it possible to encode different views of text, possibly intended for different applications, serving the majority of scholarly purposes of text studies in the humanities. Because no predefined encoding scheme can possibly serve all research purposes, the TEI scheme is designed to facilitate both selection from a wide range of predefined markup choices, and the addition of new (non-TEI) markup options. By providing a formally verifiable means of extending the TEI recommendations, the TEI makes it simple for such user-identified modifications to be incorporated into future releases of these Guidelines as they evolve. The underlying mechanisms which support these aspects of the scheme are introduced in chapter 1 The TEI Infrastructure, and detailed discussions of their use provided in chapter 24 Using the TEI.

Intended Use

We envisage three primary functions for these Guidelines:

guidance for individual or local practice in text creation and data capture;
support of data interchange;
support of application-independent local processing.

These three functions are so thoroughly interwoven in practice that it is hardly possible to address any one without addressing the others. However, the distinction provides a useful framework for discussing the possible role of these Guidelines in work with electronic texts.

Use in Text Capture and Text Creation

The description of textual features found in the chapters which follow should provide a useful checklist from which scholars planning to create electronic texts should select the subset of features suitable for their project.

Problems specific to text creation or text ‘capture’ have not been considered explicitly in this document. These Guidelines are not concerned with the process by which a digital text comes into being: it can be typed by hand, scanned from a printed book or typescript, read from a typesetter's tape, or acquired from another researcher who may have used another markup scheme (or no explicit markup at all).

We include here only some general points which are often raised about markup and the process of data capture.

XML can appear distressingly verbose, particularly when (as in these Guidelines) the names of tags and attributes are chosen for clarity and not for brevity. Editor macros and keyboard shortcuts can allow a typist to enter frequently used tags with single keystrokes. It is often possible to transform word-processed or scanned text automatically. Markup-aware software can help with maintaining the hierarchical structure of the document, and display the document with visual formatting rather than raw tags.

The techniques described in chapter 24.3 Customization may be used to develop simpler data capture TEI-conformant schemas, for example with limited numbers of elements, or with shorter names for the tags being used most often. Documents created with such schemas may then be automatically converted to a more elaborated TEI form.
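The automatic conversion step mentioned above might look something like the following sketch. It does not use the customization mechanism itself; it only shows the renaming of short capture names to fuller ones. The short names pn and pl are hypothetical, the target names persName and placeName are TEI elements, and namespaces are ignored for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical short names used at data capture, mapped to TEI element names.
EXPANSIONS = {"pn": "persName", "pl": "placeName"}


def expand_names(xml_text, expansions=EXPANSIONS):
    """Rename short capture tags to their longer equivalents."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        # Names without an entry in the mapping are left unchanged.
        element.tag = expansions.get(element.tag, element.tag)
    return ET.tostring(root, encoding="unicode")


print(expand_names("<p><pn>Ada</pn> in <pl>London</pl></p>"))
```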

Use for Interchange

The TEI format may simply be used as an interchange format, permitting projects to share resources even when their local encoding schemes differ. If there are n different encoding formats, to provide mappings between each possible pair of formats requires n×(n-1) translations; with an interchange format, only 2×n such mappings are needed. However, for such translations to be carried out without loss of information, the interchange format chosen must be as expressive (in a formal sense) as any of the target formats; this is a further reason for the TEI's provision of both highly abstract or generic encodings and highly specific ones.
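The arithmetic behind this comparison is easily checked. The short sketch below, with hypothetical function names, computes both counts for a few values of n; it assumes, as the text does, that each direction of translation between two formats counts as a separate mapping.

```python
def pairwise_translations(n):
    """Mappings needed to translate directly between every ordered pair."""
    return n * (n - 1)


def interchange_mappings(n):
    """Mappings needed when every format maps to and from one interchange format."""
    return 2 * n


for n in (2, 3, 5, 10):
    print(n, pairwise_translations(n), interchange_mappings(n))
```

The two counts are equal at n = 3; for any larger number of formats the interchange approach requires fewer mappings, and the advantage grows quadratically.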

To translate between any pair of encoding schemes implies:

identifying the sets of textual features distinguished by the two schemes;
determining where the two sets of features correspond;
creating a suitable set of mappings.

For example, to translate from encoding scheme X into the TEI scheme:

Make a list of all the textual features distinguished in X.
Identify the corresponding feature in the TEI scheme. There are three possibilities for each feature:
1. the feature exists in both X and the TEI scheme;
2. X has a feature which is absent from the TEI scheme;
3. X has a feature which corresponds with more than one feature in the TEI scheme.
The first case is a trivial renaming. The second will require an extension to the TEI scheme, as described in chapter 24.3 Customization. The third is more problematic, but not impossible, provided that a consistent choice can be made (and documented) amongst the alternatives.

The ease with which this translation can be defined will of course depend on the clarity with which scheme X represents the features it encodes.

Translating from the TEI into scheme X follows the same pattern, except that if a TEI feature has no equivalent in X, and X cannot be extended, information must be lost in translation.
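The three cases distinguished in the procedure above can be sketched as a simple planning step. Everything in the example is hypothetical: scheme X, its feature names, and the candidate correspondences are invented, and the choice of TEI candidates for ‘italic’ is only one plausible reading of what italic type might signify.

```python
def plan_translation(x_features, candidates):
    """Sort each feature of scheme X into one of the three cases."""
    plan = {"rename": {}, "extend": [], "choose": {}}
    for feature in x_features:
        targets = candidates.get(feature, [])
        if len(targets) == 1:
            plan["rename"][feature] = targets[0]      # case 1: simple renaming
        elif not targets:
            plan["extend"].append(feature)            # case 2: extension needed
        else:
            plan["choose"][feature] = list(targets)   # case 3: documented choice
    return plan


# Hypothetical scheme X and its assumed correspondences.
X_FEATURES = ["para", "italic", "blort"]
CANDIDATES = {"para": ["p"], "italic": ["hi", "emph", "foreign"]}

plan = plan_translation(X_FEATURES, CANDIDATES)
print(plan)
```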

The rules defining conformance to these Guidelines are given in some detail in chapter 24.4 Conformance. The basic principles informing those rules may be summarized as follows:

The TEI abstract model (that is, the set of categorical distinctions which it defines in the prose of the Guidelines) must be respected. The correspondence between a tag X and the semantic function assigned to it by these Guidelines may not be changed; such changes are known as tag abuse and are strongly discouraged.
A TEI document must be expressed as a valid XML-conformant document which uses the TEI namespace appropriately. If, for example, the document encodes features not provided by these Guidelines, such extensions should not be associated with the TEI namespace.
It must be possible to validate a TEI document against a schema derived from these Guidelines, possibly with extensions provided in the recommended manner.
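The namespace principle can be illustrated with a short check that lists the elements of a document which lie outside the TEI namespace, http://www.tei-c.org/ns/1.0. This is a sketch only: it tests well-formedness and namespace membership, not validity against a schema, and the sample document, the extension namespace, and the function name are all invented for the example (the fragment omits, for instance, the header that a valid TEI document would need).

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical document with one extension element in its own namespace.
DOC = """
<TEI xmlns="http://www.tei-c.org/ns/1.0" xmlns:my="http://example.org/ns/my">
  <text><body><p>Some text <my:blort>extension</my:blort></p></body></text>
</TEI>
""".strip()


def non_tei_elements(xml_text):
    """Return the qualified names of elements outside the TEI namespace."""
    root = ET.fromstring(xml_text)
    prefix = "{" + TEI_NS + "}"
    return [el.tag for el in root.iter() if not el.tag.startswith(prefix)]


print(non_tei_elements(DOC))
```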

Use for Local Processing

Machine-readable text can be manipulated in many ways; some users:

edit texts (e.g. word processors, syntax-directed editors)
edit, display, and link texts in hypertext systems
format and print texts using desktop publishing systems, or batch-oriented formatting programs
load texts into free-text retrieval databases or conventional databases
unload texts from databases as search results or for export to other software
search texts for words or phrases
perform content analysis on texts
collate texts for critical editions
scan texts for automatic indexing or similar purposes
parse texts linguistically
analyze texts stylistically
scan verse texts metrically
link text and images

These applications cover a wide range of likely uses but are by no means exhaustive. The aim has been to make the TEI Guidelines useful for encoding the same texts for different purposes. We have avoided anything which would restrict the use of the text for other applications. We have also tried not to omit anything essential to any single application.

Because the TEI format is expressed using XML, almost any modern text processing system is able to process it, and new TEI-aware software systems are able to build on a solid base of existing software libraries.

iv.2. Historical Background

The Text Encoding Initiative grew out of a planning conference sponsored by the Association for Computers and the Humanities (ACH) and funded by the U.S. National Endowment for the Humanities (NEH), which was held at Vassar College in November 1987. At this conference some thirty representatives of text archives, scholarly societies, and research projects met to discuss the feasibility of a standard encoding scheme and to make recommendations for its scope, structure, content, and drafting. During the conference, the Association for Computational Linguistics and the Association for Literary and Linguistic Computing agreed to join ACH as sponsors of a project to develop these Guidelines. The outcome of the conference was a set of principles (the ‘Poughkeepsie Principles’, Burnard (1988)), which determined the further course of the project.

The Text Encoding Initiative project began in June 1988 with funding from the NEH, soon followed by further funding from the Commission of the European Communities, the Andrew W. Mellon Foundation, and the Social Science and Humanities Research Council of Canada. Four working committees, composed of distinguished scholars and researchers from both Europe and North America, were named to deal with problems of text documentation, text representation, text analysis and interpretation, and metalanguage and syntax issues. Each committee was charged with the task of identifying ‘significant particularities’ in a range of texts, and two editors appointed to harmonize the resulting recommendations.

A first draft version (P1, with the ‘P’ here and subsequently standing for ‘Proposal’) of the Guidelines was distributed in July 1990 under the title Guidelines for the Encoding and Interchange of Machine-Readable Texts. Extensive public comment and further work on areas not covered in this version resulted in the drafting of a revised version, TEI P2, distribution of which began in April 1992. This version included substantial amounts of new material, resulting from work carried out by several specialist working groups, set up in 1990 and 1991 to propose extensions and revisions to the text of P1. The overall organization, both of the draft itself and of the scheme it describes, was entirely revised and reorganized in response to public comment on the first draft.

In June 1993 an Advisory Board met to review the current state of the TEI Guidelines, and recommended the formal publication of the work done to that time. That version of the TEI Guidelines, TEI P3, consolidated the work published as parts of TEI P2, along with some additional new material, and was finally published in May of 1994 without the label draft, thus marking the conclusion of the initial development work.

In February of 1998 the World Wide Web Consortium issued a final Recommendation for the Extensible Markup Language, XML.² Following the rapid take-up of this new standard metalanguage, it became evident that the TEI Guidelines (which had been published originally as an SGML application) needed to be re-expressed in this new formalism if they were to survive. The TEI editors, with abundant assistance from others who had developed and used TEI, developed an update plan, and made tentative decisions on relevant syntactic issues.

In January of 1999, the University of Virginia and the University of Bergen formally proposed the creation of an international membership organization, to be known as the TEI Consortium, which would maintain, develop, and promote the TEI. Shortly thereafter, two further institutions with longstanding ties to the TEI (Brown University and Oxford University) joined them in formulating an Agreement to Establish a Consortium for the Maintenance of the Text Encoding Initiative (An Agreement to Establish a Consortium for the Maintenance of the Text Encoding Initiative (March 1999)), on which basis the TEI Consortium was eventually established and incorporated as a not-for-profit legal entity at the end of the year 2000. The first members of the new TEI Board took office during January of 2001.

The TEI Consortium was established in order to maintain a permanent home for the TEI as a democratically constituted, academically and economically independent, self-sustaining, non-profit organization. In addition, the TEI Consortium was intended to foster a broad-based user community with sustained involvement in the future development and widespread use of the TEI Guidelines (Burnard (2000)).

To oversee and manage the revision process in collaboration with the TEI Editors, the TEI Board formed a Technical Council, with a membership elected from the TEI user community. The Council met for the first time in January 2002 at King's College London. Its first task was to oversee production of an XML version of the TEI Guidelines, updating P3 to enable users to work with the emerging XML toolset. This, the P4 version of the Guidelines, was published in June 2002. It was essentially an XML version of P3, making no substantive changes to the constraints expressed in the schemas apart from those necessitated by the shift to XML, and changing only corrigible errors identified in the prose of the P3 Guidelines. However, given that P3 had by this time been in steady use since 1994, it was clear that a substantial revision of its content was necessary, and work began immediately on the P5 version of the Guidelines. This was planned as a thorough overhaul, involving a public call for features and new development in a number of important areas not previously addressed, including character encoding, graphics, manuscript description, biographical and geographical data, and the encoding language in which the TEI Guidelines themselves are written.

The members of the TEI Council and its associated workgroups are listed in iii. Preface and Acknowledgments. In preparing this edition, they have been attentive to the requirements and practice of the widest possible range of TEI users, who are now to be found in many different research communities across the world, and have been largely instrumental in transforming the TEI from a grant-supported international research project into a self-sustaining community-based effort. One effect of the incorporation of the TEI has been the legal requirement to hold an annual meeting of the Consortium members; these meetings have emerged as an invaluable opportunity to sustain and reinforce that sense of community.

The present work is therefore the result of a sustained period of consultation, drafting, and revision, with input from many different experts. Whatever merits it may have are to be attributed to them; the Editors accept responsibility only for the errors remaining.

iv.3. Future Developments and Version Numbers

The encoding recommended by this document may be used without fear that future versions of the TEI scheme will be inconsistent with it in fundamental ways. The TEI will be sensitive, in revising these Guidelines, to the possible problems which revision might pose for those who are already using this version of these Guidelines.

With TEI P5, a version numbering system is introduced following the pattern specified by the Unicode Consortium: the first number identifies the major version, the second the minor version, and the third the sub-minor version. The TEI undertakes that no change will be made to the formal expression of these Guidelines (that is, a TEI schema, as defined in 24.4 Conformance) such that documents conformant to a given major numbered release cease to be compatible with a subsequent release of the same major number. Moreover, as far as possible, new minor releases will be made only for the purpose of adding new compatible features, or of correcting errors in existing features.
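The numbering pattern described above can be illustrated with a minimal Python sketch. It is not part of the TEI toolset: the names `TeiVersion`, `parse_version`, and `covered_by_undertaking` are hypothetical, and the check only encodes the stated undertaking (same major number, later or equal release), not any actual schema validation.

```python
from typing import NamedTuple


class TeiVersion(NamedTuple):
    """Hypothetical container for a release number such as 4.11.0."""
    major: int
    minor: int
    sub_minor: int


def parse_version(text: str) -> TeiVersion:
    """Split a 'major.minor.sub-minor' string into its three numeric parts."""
    parts = text.strip().split(".")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"expected three dot-separated numbers, got {text!r}")
    return TeiVersion(*(int(p) for p in parts))


def covered_by_undertaking(document_release: str, later_release: str) -> bool:
    """Return True when the compatibility undertaking applies.

    Assumption: it applies when both releases share a major number and
    the second release is the same as, or newer than, the first.
    """
    doc = parse_version(document_release)
    later = parse_version(later_release)
    # Tuples compare part by part, so 4.11.0 correctly sorts after 4.2.1.
    return doc.major == later.major and later >= doc
```

Comparing the parts as integers rather than as strings matters here, because a minor number may run to more than one digit.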

The Guidelines are currently maintained as an open source project on the GitHub site https://github.com/TEIC/TEI, from which released and development versions may be freely downloaded. See Previous Releases of P5 for information on how to find specific versions of TEI releases (Guidelines, schemas etc.). Notice of errors detected and enhancements requested may be submitted at https://github.com/TEIC/TEI/issues.

Notes

During generation of TEI RELAX NG schema fragments, the patterns corresponding with these TEI names are given a prefix tei to allow them to co-exist with names from other XML namespaces. This prefix is not visible to the end user, and is not used in TEI documentation. When generating multi-namespace schemas, however, the user needs to be aware of it.

XML was originally developed as a way of publishing on the World Wide Web richly encoded documents such as those for which the TEI was designed. Several TEI participants contributed heavily to the development of XML, most notably XML's senior co-editor C. M. Sperberg-McQueen, who served as the North American editor for the TEI Guidelines from their inception until 1999.

TEI Guidelines P5 Version 4.11.0. Last updated on 18th February 2026, revision 358d2e48e. This page generated on 2026-02-18T11:20:08Z.

TEI: Guidelines for Electronic Text Encoding and Interchange