Digital Humanities Toolkit

Resources for digital humanities

Text Encoding

Text encoding is the mapping between the bytes a computer stores and the readable characters shown to users of a system or program. It is also what makes the various human languages interpretable to computers.
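As a rough illustration of the byte-to-character mapping described above, the following Python sketch encodes one short string in two encodings and then decodes the bytes again. The sample word and the choice of UTF-8 and Latin-1 are assumptions made for the example, not something this guide prescribes.

```python
# Illustrative sample text (an assumption for this sketch).
text = "Café"

# Encoding: characters -> bytes. The same text yields different bytes
# depending on the encoding chosen.
utf8_bytes = text.encode("utf-8")      # "é" takes two bytes in UTF-8
latin1_bytes = text.encode("latin-1")  # "é" takes one byte in Latin-1

# Decoding: bytes -> characters. Using the matching encoding
# recovers the original text.
decoded = utf8_bytes.decode("utf-8")

# Decoding with the wrong encoding produces garbled characters.
garbled = utf8_bytes.decode("latin-1")

print(len(utf8_bytes), len(latin1_bytes))
print(decoded)
print(garbled)
```

The garbled result shows why a file's encoding needs to be known, or declared, before its bytes can be read reliably as text.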


The Text Encoding Initiative (TEI) is a consortium that collectively develops and maintains a standard for the representation of texts in digital form.

  • TEI Roma is a tool for working with TEI customizations. A TEI customization is a document from which you can generate a schema defining which elements and attributes from the TEI system you want to use, along with customized HTML or PDF documentation of that schema.
  • OxGarage is a web service that manages the transformation of documents between a variety of formats.
  • XSL stylesheets convert TEI XML files to HTML, LaTeX, or XSL-FO documents.


Extensible Markup Language (XML) is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable.
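To show what "both human-readable and machine-readable" can look like in practice, here is a minimal Python sketch that parses a small XML fragment with the standard library's xml.etree.ElementTree module. The fragment is invented for illustration and uses the TEI namespace; it is deliberately simplified (it has no teiHeader, for instance) and is not a complete or valid TEI document.

```python
import xml.etree.ElementTree as ET

# Namespace used by TEI documents, mapped to a short prefix for queries.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical, simplified sample: readable by a person,
# and structured enough for a program to query.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <body>
      <p>Call me <persName>Ishmael</persName>.</p>
    </body>
  </text>
</TEI>"""

root = ET.fromstring(SAMPLE)

# Collect the text of every element tagged as a personal name.
names = [el.text for el in root.findall(".//tei:persName", TEI_NS)]

# Recover the plain text of the paragraph, ignoring the markup.
paragraph = "".join(root.find(".//tei:p", TEI_NS).itertext())

print(names)
print(paragraph)
```

The same markup serves both audiences: a reader can follow the sentence in the source, while a program can pull out just the tagged names or just the running text.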