jump to navigation

NARA to Declassify 400 Million Pages of Documents in Three Years 2011/12/06

Posted by nydawg in Archives, Digital Archives, Digital Preservation, Electronic Records, Information Technology (IT), Media, Records Management.
Tags: , , , , , ,
add a comment

For a very long time, I have been trying to ask anyone who knows (from my colleagues to the AOTUS himself), why are we even attempting to preserve 250 million emails created during the Bush Administration.  As I’ve mentioned before, that works out to nearly one email every second for eight years!  (And remember, part of that time included Bush’s annual month-long vacations.)  So this story really seemed to give a bit of context in ways that the National Archives (NARA) deals with processing large collections of backlog materials.  “All of these pages had been piling up here, literally,” said Sheryl J. Shenberger, a former CIA official who is the head of the National Declassification Center (NDC) at the National Archives. “We had to develop a Costco attitude: We had 400 million pages . . . and we have three years to do them in.”

If you read Saturday’s article in the Washington Post, you’ll learn that “All of the backlogged documents date back 25 years or more, and most are Cold War-era files from the departments of Defense, State and Justice, among other agencies. The CIA manages the declassification of its own files.”  and that ““The current backlog is so huge that Americans are being denied the ability to hold government officials accountable for their actions,” [AOTUS David] Ferriero said. “By streamlining the declassification process, the NDC will usher in a new day in the world of access.”

If NARA is really trying to declassify, process, catalog, describe, preserve and make these pages available, I hope they’re planning on hiring some more archivists!  The problem is that when institutions are dealing with mass quantities of materials, the (quantitative) metrics we use, may actually hurt us in the future.  In the archival world, the prevailing wisdom seems to be MPLP (More Product, Less Process), but I would argue that archivists need to have qualitative metrics as well, if only to ensure that they are reducing redundancies and older, non-needed versions.  This gets to the crux of the distinction between best practices for records managers and best practices for digital asset managers (or digital archivists).  Ideally, a knowledgeable professional will collect and appraise these materials, and describe it in a way, so that a future plan can be created to ensure that these assets (or records) can be migrated forward into new formats accessible on emerging (or not-yet invented) media players and readers.

Ultimately, this leads to the most serious problem facing archivists: the metadata schemas that are most popular (DublinCore, IPTC, DACS, EAD, etc.) are not specific enough to help archivists plan for the future.  Until our metadata schemas can be updated to ensure that content, context, function, structure, brand, storage media and file formats can be specifically and granularly identified and notated, we will continue paddling frantically against the digital deluge with no workable strategy or plan, or awareness of potential problems (e.g. vendor lock-in, non-backwards compatible formats, etc.)  Sadly, in the face of huge quantities of materials (emails and pages), NARA will probably embrace MPLP, and ultimately hinder and hurt future access to the most important specific files, pages, emails, etc., because they will refuse to hire more professionals to do this work, and will (probably) rely on computer scientists and defense contractors to whitewash the problems and sell more software.