The latest practical class for the Introduction to Digital Curation module saw us having a go at using XML to create an Encoded Archival Description (EAD) collection level description.
What is XML?
XML stands for eXtensible Markup Language. It’s a metalanguage for describing information: XML focusses on what the information is, but it doesn’t actually do anything with it. Rather, it was designed to store and transport data using self-descriptive tags, which are defined by the author. Let’s use the following example: a note to my Dad (Antony) from me (Sophie), sent on 8th February 2017, reminding him that my train arrives at Northampton at 18.15. Using XML:
<date>8th February 2017</date>
<body>Reminder that my train arrives at Northampton at 18.15.</body>
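Put together as a complete, well-formed document, the note might look something like the sketch below. The <note>, <to> and <from> element names are assumptions made for illustration (XML leaves the choice of names to the author), and a single root element wraps everything because XML requires exactly one.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical element names: XML lets the author define their own tags -->
<note>
  <to>Antony</to>
  <from>Sophie</from>
  <date>8th February 2017</date>
  <body>Reminder that my train arrives at Northampton at 18.15.</body>
</note>
```

Every opening tag has a matching closing tag, and the tags are nested rather than overlapping, which is all XML itself insists on.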
What is Encoded Archival Description?
Encoded Archival Description (EAD) is an international standard for encoding finding aids, allowing collection information to be standardised. EAD uses ‘mark-up’ or ‘tags’ to identify the different parts of a finding aid so that it can be interpreted and processed by different computer systems. Because EAD is built on other international standards (it is expressed in XML), EAD descriptions are well suited to digital preservation and to the exchange of data between different computer systems.
As an international standard, EAD has a tag library (we used the 2002 version). The tag library provides a list of the tags that can be used for EAD, and what information should correspond with each tag. The tags include the information that you would use to describe an archive collection, including:
Acquisition Information: <acqinfo>
Physical Location: <physloc>
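As a rough illustration of where these two tags sit, here is a small fragment. In the EAD 2002 tag library, <physloc> belongs inside the descriptive identification element <did>, while <acqinfo> sits alongside <did> within <archdesc>. The text in square brackets is placeholder wording, not real collection data.

```xml
<archdesc level="collection">
  <did>
    <unittitle>[collection title]</unittitle>
    <!-- Physical Location: where the material is stored -->
    <physloc>[where the material is shelved]</physloc>
  </did>
  <!-- Acquisition Information: the text must be wrapped in a paragraph -->
  <acqinfo>
    <p>[how and when the archive acquired the collection]</p>
  </acqinfo>
</archdesc>
```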
My Experience of using XML for EAD
In our practical class, we were asked to use an XML editor to create an EAD encoded collection level description. For the task, we used Oxygen XML Editor, version 8.2, and we were given a pre-determined collection level description to work with, including the collection title, date range, acquisition information, extent, and a description of the collection.
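For anyone wondering how those pieces of information map onto tags, the skeleton below is a minimal sketch of a collection level description using EAD 2002 element names. It is not the file we produced in class: the square-bracketed values are placeholders, and the DOCTYPE line assumes a copy of the EAD 2002 DTD saved as ead.dtd in the same folder.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Assumes a copy of the EAD 2002 DTD saved as ead.dtd next to this file -->
<!DOCTYPE ead SYSTEM "ead.dtd">
<ead>
  <!-- Information about the finding aid itself -->
  <eadheader>
    <eadid>[unique identifier for this finding aid]</eadid>
    <filedesc>
      <titlestmt>
        <titleproper>[title of the finding aid]</titleproper>
      </titlestmt>
    </filedesc>
  </eadheader>
  <!-- Information about the collection being described -->
  <archdesc level="collection">
    <did>
      <unittitle>[collection title]</unittitle>
      <unitdate>[date range]</unitdate>
      <physdesc>
        <extent>[extent]</extent>
      </physdesc>
    </did>
    <acqinfo>
      <p>[acquisition information]</p>
    </acqinfo>
    <scopecontent>
      <p>[description of the collection]</p>
    </scopecontent>
  </archdesc>
</ead>
```

The level="collection" attribute on <archdesc> is what marks this as a collection level description rather than, say, a series or item level one.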
Using the XML editor, we input the data from the collection level description using the tags from the EAD tag library. The XML editor was very helpful in that it immediately flagged up any areas where we had not obeyed the rules, such as leaving out or having an extra angle bracket, or leaving out the paragraph marker <p>. The XML editor also flagged up for us automatically which tags could be used in each section, which meant that some of that thinking was already done for us. After a while of keyboard bashing, some confusion, and our lecturer explaining things, it was fairly easy to get the hang of using the tagging system.
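The kinds of mistake the editor caught can be sketched as three separate fragments (they are shown side by side for comparison and are not meant to be one document). The placeholder text is invented, and the exact wording of the editor's error messages is not reproduced here.

```xml
<!-- Not valid against the EAD 2002 DTD: text sits directly inside <acqinfo> -->
<acqinfo>[acquisition information]</acqinfo>

<!-- Valid: the same text wrapped in the paragraph marker <p> -->
<acqinfo>
  <p>[acquisition information]</p>
</acqinfo>

<!-- Not well-formed: the closing tag is missing its final angle bracket -->
<extent>[extent]</extent
```

The first problem breaks the EAD rules while still being correct XML, whereas the last one is not correct XML at all, which is why an editor reports them as different sorts of error.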
Because XML only deals with what the information is, we needed a stylesheet in order to present the data. Our lecturer had provided us with the EAD Document Type Definition (DTD), which specifies the elements to be used to describe a collection as well as the arrangement of those elements. Strictly speaking, the DTD is a set of rules rather than a stylesheet: it is what the editor checks the tags against, while a stylesheet controls how the data is displayed. When the XML EAD that we had created was combined with a stylesheet, the data that had been encoded could be seen and understood as normal text (i.e. without the coding).
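To show what presenting the data can involve, below is a minimal sketch of an XSLT 1.0 stylesheet that turns a collection level description into a simple web page. It is not the stylesheet used in class; it assumes an EAD 2002 file without namespaces and the element paths shown in the select attributes.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" encoding="UTF-8"/>

  <!-- Start at the top of the document and build one HTML page -->
  <xsl:template match="/">
    <html>
      <body>
        <!-- Pull the text out of the tags, leaving the coding behind -->
        <h1><xsl:value-of select="ead/archdesc/did/unittitle"/></h1>
        <p>Dates: <xsl:value-of select="ead/archdesc/did/unitdate"/></p>
        <p>Extent: <xsl:value-of select="ead/archdesc/did/physdesc/extent"/></p>

        <h2>Acquisition information</h2>
        <xsl:for-each select="ead/archdesc/acqinfo/p">
          <p><xsl:value-of select="."/></p>
        </xsl:for-each>

        <h2>Description</h2>
        <xsl:for-each select="ead/archdesc/scopecontent/p">
          <p><xsl:value-of select="."/></p>
        </xsl:for-each>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

One way to link the two files is to add a processing instruction such as <?xml-stylesheet type="text/xsl" href="ead-to-html.xsl"?> near the top of the EAD file (the file name here is hypothetical); an XML editor can also run the transformation directly.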
Digital data is very complicated. It will take a while before interacting with data in this way becomes natural to me, but the future is digital.
Header image from Ruby Inside.
Archives Hub, ‘EAD’, JISC [n.d.] <https://archiveshub.jisc.ac.uk/ead/> [last accessed 08 February 2017].
Library of Congress, ‘About EAD’, Library of Congress (July 2012) <https://www.loc.gov/ead/eadabout.html> [last accessed 08 February 2017].
Library of Congress, ‘Development of the Encoded Archival Description DTD’, Library of Congress (December 2013) <https://www.loc.gov/ead/eaddev.html> [last accessed 08 February 2017].
Society of American Archivists, ‘Encoded Archival Description Tag Library’, Society of American Archivists (2002) <http://www2.archivists.org/sites/all/files/EAD2002TL_5-03-V2.pdf> [last accessed 08 February 2017].
w3schools.com, ‘Introduction to XML’, w3schools.com (2017) <http://www.w3schools.com/xml/xml_whatis.asp> [last accessed 08 February 2017].