Kate Barbera 2016-07-11 13:39:26
Linked Data may sound like a promising new way to organize records, but there is currently no clear path to implementation for our profession. For many archivists, Linked Data seems to be just out of reach, the next big digital trend. Between 2014 and 2016, my colleagues and I at Carnegie Museum of Art (CMOA) in Pittsburgh, Pennsylvania, tested Linked Data on a collection in the museum archives. While exploring the ups and downs of using this technology, we learned three hard and fast lessons that helped us make decisions, even in unfamiliar territory. The final result was a brand new website for CMOA, http://records.cmoa.org, which explores the collection via semantic queries and offers users a Wikipedia-like experience. Although the site is far from perfect, I hope this example encourages you to reimagine Linked Data as a more approachable means to achieve your goals and reach new audiences with your archives.

Define What Linked Data Means for You

The project began in September 2014 and focused on processing and providing digital access to CMOA’s Department of Film and Video Archive. The archive measures a thrilling 450 linear feet and contains materials accrued by the office within the museum responsible for the care, management, and exhibition of film and video artworks. The department opened in April of 1970 and was a vibrant part of CMOA’s regular programming until it was dissolved in the early 2000s. Artists such as Stan Brakhage, Hollis Frampton, Robert Breer, Albert Maysles, Joyce Wieland, Yvonne Rainer, and Buky Schwartz visited the museum to present their work and occasionally create new pieces. In the archives are hundreds of letters between museum curators and artists, dozens of letterpress posters advertising screening events and exhibitions, and more than 200 audio and video recordings of artist lectures and interviews.
While processing the archive, we quickly discovered that many of these items contained valuable historical context not only for the film and video pieces in the museum’s art collection but also for many of the artists, curators, and film scholars who visited Pittsburgh for screenings and exhibitions starting in the 1970s. Slowly over the following months, the scope of our project expanded to include more robust digital representations of the artwork, people, and events found in the archive. Our search for an appropriate technological solution led us to Linked Data.

In brief, Linked Data is a set of tools that allow us to semantically “link” objects, events, people, and other concepts digitally through an encoded structure. The structure organizes information so that computers can understand how these concepts are related. Using the Resource Description Framework (RDF), we can build data into “triples” that connect subjects with objects through predicates. These three elements are standardized with unique identifiers, which we can define using resources like ontologies and authority files.

For archivists, this means that we can digitally express how the objects, events, people, and organizations are related within our collections and even how they connect to concepts outside of our archives (as long as the unique identifiers are consistent). It also means that we can create relationships with archival materials on the web that even we do not know exist. The Semantic Web forms a giant constellation of data that has the potential to provide unprecedented context for archives in the digital world.

For my colleagues and me at CMOA, Linked Data meant we could represent digitally the intellectual relationships between items in the Department of Film and Video Archive and the people, events, and artwork involved in the history that the archive embodies.
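To make the triple structure described above concrete, here is a minimal sketch in Python. It is not the code behind the CMOA website; the namespaces, identifiers, and predicate names (`presentedAt`, `advertises`, `mentions`) are hypothetical placeholders invented for illustration, and a real project would draw its identifiers from published ontologies and authority files.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    """One RDF-style statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical namespaces; not identifiers used by any real institution.
BASE = "http://example.org/archive/"
VOCAB = "http://example.org/vocab/"

# Invented sample statements linking a person, an event, and two archival items.
triples = [
    Triple(BASE + "person/artist-1", VOCAB + "presentedAt", BASE + "event/screening-1"),
    Triple(BASE + "item/poster-1", VOCAB + "advertises", BASE + "event/screening-1"),
    Triple(BASE + "item/letter-1", VOCAB + "mentions", BASE + "person/artist-1"),
]


def related_to(uri: str, graph: List[Triple]) -> List[Triple]:
    """Return every triple in which the URI appears as subject or object."""
    return [t for t in graph if uri in (t.subject, t.obj)]


def to_ntriples(graph: List[Triple]) -> str:
    """Serialize the triples as N-Triples lines (every term here is a URI)."""
    return "\n".join(f"<{t.subject}> <{t.predicate}> <{t.obj}> ." for t in graph)


if __name__ == "__main__":
    # Show every statement that touches the hypothetical screening event.
    for t in related_to(BASE + "event/screening-1", triples):
        print(t)
    print(to_ntriples(triples))
```

Because the person and the event are each named by a single consistent identifier, the poster and the letter become connected through them without anyone recording that connection directly. This is the sense in which consistent identifiers let separate records, and even separate collections, link together.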
Collaborate Outside Your Comfort Zone and Use Existing Resources

By January 2015, we had begun building the infrastructure that would help us achieve the newly expanded scope of our project at CMOA. Two resources became absolutely crucial components of our project plan and workflow.

First, we were fortunate to partner with a developer who already had experience working with the encoding methods used in Linked Data. We began working together very early in the project and were able to plan based on our combined technical requirements. We also devised an efficient workflow. As I processed and digitized the archive, I uploaded images and metadata directly into the collections management system. At the same time, our developer worked on restructuring that metadata into Linked Data for the website. This partnership is ultimately what made the project possible.

Second, to create and store the metadata for the project, we used KE Software’s Electronic Museum system, or EMu, which CMOA was already using to catalog its art collection. Underneath the EMu software is a relational database that allows users to create various types of records, including catalog (object) records, event records, and party (people/organization) records. Users relate these to one another with specified fields. As the project archivist, I found that using this system made the process of creating links in the metadata relatively straightforward because the types of relationships were already built into the software.

We had a few additional factors working in our favor as well. We had no existing archives infrastructure to work around, and we had no data to migrate from legacy systems. We had the opportunity to build the entire infrastructure from the ground up. As we charged forward, our developer and our collections management system remained our greatest assets.

Think Big But Scale Expectations

Once spring arrived, the project was in full swing.
The workflow was set and the infrastructure was in place. Now we had to finish the work, checking off every item on our lengthy to-do list. There was no precedent for this type of project in the CMOA archives. All work prior to this point was focused on minimal processing and description, and was usually done by interns and volunteers.

One of the most challenging items on our list was simply metadata creation. We cataloged items in the archive using Archives Utility in EMu, which allows for compatibility with two archival standards: the General International Standard Archival Description, or ISAD(G), and Encoded Archival Description (EAD). We used these standards to generate the finding aid, but in order to create the necessary semantic links between the archive and people, events, and artwork, we had to devise local specifications for our item-level metadata. Furthermore, these specifications had to be extremely thorough, making them much more labor-intensive than is realistic for most archivists to apply on a broad scale. A single item could have upwards of 15 lines of metadata associated with it. In order to scale expectations for the amount of metadata we could produce with our limited staff and volunteers, we prioritized series within the archive and then selected specific items within those series.

A New Way to Experience Collections

In early 2016, we completed the project and launched the website. We discovered that Linked Data is labor-intensive to build, but offers users an exciting and unprecedented way to experience collections online.

Perhaps Linked Data still feels just over the horizon for many institutions. It may not be practical to implement using current technical resources and standards. However, archivists need to be thinking more broadly about what Linked Data means and could mean for the profession. It has enormous potential to help us express digitally the context of our collections, which, after all, is what makes archives so engaging.
If you are currently using or have plans to use Linked Data in your archives, I hope these lessons learned at Carnegie Museum of Art make your projects a little less daunting.
Published by Society of American Archivists.
This page can be found at http://www.bluetoad.com/article/Linked+Data+In+The+Archives/2530273/319673/article.html.