Comparing Documentation Strategy of Civil War and First Gulf War 2011/11/21

Posted by nydawg in Archives, Best Practices, Digital Archives, Digital Preservation, Media, Records Management.

I’ve said it before, and I’ll say it again (paraphrasing someone
else): “We are at risk of knowing less about the events leading up to
the First Gulf War than events leading up to the Civil War, because
all of the records and documents from the Civil War were conserved and
preserved, whereas all the records from the First Gulf War were
created on Wang word processors, never migrated forward, and are now lost.”

Case in point: Lincoln at Gettysburg; photo by Matthew Brady

Or 1991 Gulf War speech by Sec of Def Cheney:

or http://www.pbs.org/mediashift/2007/08/the-tangled-state-of-archived-n…

CLIR: Future Generations Will Know More About the Civil War than the Gulf War 2011/09/22

Posted by nydawg in Archives, Best Practices, Digital Archives, Education, Electronic Records, Information Technology (IT), Records Management.

When I was in Queens College Graduate Library School six years ago, I took Professor Santon’s excellent course in Records Management, which led me to understand that every institution has to manage its records, its assets and its Intellectual Property.  The vital role the archive and records center play in everyday use and long-term functions was made clear by the fact that records have a life cycle: basically creation, use, and destruction or disposition.  The course was excellent, despite the fact that the main textbooks we used were from the early 1990s (and included a 3 1/2″ floppy that ran on Windows 3.1).

While doing an assignment, I found a more recent article which led me to a revelation: electronic records will cause a lot of problems!  The part that stuck out most, and that I still remember to this day, was in a 2002 article, “Record-breaking Dilemma,” in Government Technology:  “The Council on Library and Information Resources, a nonprofit group that supports ways to keep information accessible, predicts that future generations will know more about the Civil War than the Gulf War. Why? Because the software that enables us to read the electronic records concerning the events of 1991 have already become obsolete. Just ask the folks who bought document-imaging systems from Wang the year that Saddam Hussein invaded Kuwait. Not only is Wang no longer in business, but locating a copy of the proprietary software, as well as any hardware, used to run the first generation of imaging systems is about as easy as finding a typewriter repairman.” (emphasis added)

Obviously that article greatly shaped my thinking about the Digital Dark Ages, and it got me wondering what the best practices will be for managing born-digital assets or electronic records for increasingly long periods of time on storage media that are guaranteed for decreasing periods of time.  Or, as the article quotes: “We’re constantly asking ourselves, ‘How do we retain and access electronic records that must be stored permanently?’” she said.  Well, this gets to the crux of the issue, especially when records managers and archivists aren’t invited into the conversations with IT.  And just as we are using more and more hard drives (or larger servers, even in the cloud), we read headlines like “Hard-drive Makers Weaken Warranties.”  In a nutshell: “Three of the major hard-drive makers will cut down the length of warranties on some of their drives, starting Oct. 1, to streamline costs in the low-margin desktop disk storage business.”

So if we’re storing more data on storage media that are not meant for long-term preservation, then records and archival management must be an ongoing relay race, with appropriate ongoing funding and support, as more and more materials are periodically copied or moved from one storage medium to another every 3-5 years (or maybe that will soon be 1-3 years?).  Benign neglect is no longer a sound records management strategy.
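To make the relay-race idea a bit more concrete, here is a minimal Python sketch of a single "handoff": copy a file to new storage, then confirm with a SHA-256 checksum that the copy is bit-for-bit identical before trusting it.  The function names (`sha256_of`, `migrate_with_fixity`) are hypothetical, invented for illustration, and a real migration program would also record each handoff somewhere durable.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate_with_fixity(source: Path, destination: Path) -> str:
    """Copy source to destination and verify the copy against the original.

    Returns the shared checksum on success; raises IOError on a mismatch.
    """
    before = sha256_of(source)
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, destination)  # copy2 also tries to keep timestamps
    after = sha256_of(destination)
    if before != after:
        raise IOError(f"Fixity check failed for {source} -> {destination}")
    return before
```

Storing the returned checksum alongside the record means the next migration, 3-5 years on, can be checked against the same value.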

That’s the technological challenge.  But there’s more!  I’ve gone on and on before about NARA’s ERA program and how one top priority is to ingest 250 million emails from the Bush Administration.  (I’ve done the math: it works out to nearly one email every second of the eight years.)  So we know that NARA is interested in preserving electronic records.  But a couple of years ago I read this scary Fred Kaplan piece, “PowerPoint to the People: The urgent need to fix federal archiving policies,” in which he reports: “Finally—and this is simply stunning—the National Archives’ technology branch is so antiquated that it cannot process some of the most common software programs. Specifically, the study states, the archives ‘is still unable to accept Microsoft Word documents and PowerPoint slides.’”
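For anyone who wants to check the "one email every second" math, here is a quick Python sketch.  The only assumption is how to count eight years; I use 2,922 days (eight 365-day years plus two leap days), and the function name is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_IN_TWO_TERMS = 8 * 365 + 2  # assumption: eight years containing two leap days


def emails_per_second(total_emails: int, days: int = DAYS_IN_TWO_TERMS) -> float:
    """Average number of emails per second over the given number of days."""
    return total_emails / (days * SECONDS_PER_DAY)


rate = emails_per_second(250_000_000)
print(f"{rate:.2f} emails per second")  # prints "0.99 emails per second"
```

Eight years comes to about 252 million seconds, so 250 million emails really is just under one per second, around the clock.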

Uhhhhh, wait!  Well, at least that was written in 2009, so we can hope they have gotten their act together, but if you think about it too much, you might wonder whether EVERYTHING THAT NEEDS TO BE ARCHIVED IS IN MICROSOFT’S PROPRIETARY FORMATS.  Or you might just be inspired to ask if anyone really uses PowerPoint in the military.  Well, as Kaplan points out, “This is a huge lapse. Nearly all internal briefings in the Pentagon these days are presented as PowerPoint slides. Officials told me three years ago that if an officer wanted to make a case for a war plan or a weapons program or just about anything, he or she had better make the case in PowerPoint—or forget about getting it approved.”  Or see this piece from the NYTimes, “We Have Met the Enemy and He Is PowerPoint,” in which “Commanders say that behind all the PowerPoint jokes are serious concerns that the program stifles discussion, critical thinking and thoughtful decision-making. Not least, it ties up junior officers — referred to as PowerPoint Rangers — in the daily preparation of slides, be it for a Joint Staff meeting in Washington or for a platoon leader’s pre-mission combat briefing in a remote pocket of Afghanistan.”


Keep Bit Rot at Bay: Change is Afoot as LoC’s DPOE Trains the Trainers 2011/09/20

Posted by nydawg in Archives, Best Practices, Digital Archives, Digital Archiving, Digital Preservation, Information Technology (IT), Media.

This was forwarded to me by a nydawg member who subscribes to the UK’s Digital Preservation listserv.  I don’t know if  it’s been posted publicly in the US, but I guess this first one is by invitation-only.  I would LOVE to hear what they are teaching and how they are doing it, so I hope someday to attend as well.

Library of Congress To Launch New Corps of Digital Preservation Trainers

The Digital Preservation Outreach and Education program at the Library of Congress will hold its first national train-the-trainer workshop on September 20-23, 2011, in Washington, DC.

The DPOE Baseline Workshop will produce a corps of trainers who are equipped to teach others, in their home regions across the U.S., the basic principles and practices of preserving digital materials.  Examples of such materials include websites; emails; digital photos, music, and videos; and official records.

The 24 students in the workshop (first in a projected series) are professionals from a variety of backgrounds who were selected from a nationwide applicant pool to  represent their home regions, and who have at least some familiarity with community-based training and with digital preservation. They will be instructed by the following subject matter experts:

*   Nancy McGovern, Inter-university Consortium for Political and Social  Research, University of Michigan
*   Robin Dale, LYRASIS
*   Mary Molinaro, University of Kentucky Libraries
*   Katherine Skinner, Educopia Institute and MetaArchive Cooperative
*   Michael Thuman,  Tessella
*   Helen Tibbo, School of Information and Library Science, University of  North Carolina at Chapel Hill, and Society of American Archivists.

The curriculum has been developed by the DPOE staff and expert volunteer advisors and informed by DPOE-conducted research–including a nationwide needs-assessment survey and a review of curricula in existing training programs. An outcome of the September workshop will be for each participant to, in turn, hold at least one basic-level digital-preservation workshop in his or her home U.S. region by mid-2012.

The intent of the workshop is to share high-quality training in digital preservation, based upon a standardized set of core
principles, across the nation.  In time, the goal is to make the training available and affordable to virtually any interested
organization or individual.

The Library’s September 2011 workshop is invitation-only, but informational and media inquiries are welcome to George Coulbourne, DPOE Program Director, at gcou@loc.gov.

The Library created DPOE in 2010.  Its mission is to foster national outreach and education to encourage individuals and organizations to actively preserve their digital content, building on a collaborative network of instructors, contributors and institutional partners.  The DPOE website is www.loc.gov/dpoe (http://digitalpreservation.gov/education/).  Check out the curriculum and course offerings here.



WikiLeaks’ Cablegate and Systemic Problems 2011/09/06

Posted by nydawg in Best Practices, Digital Archives, Electronic Records, Information Technology (IT), Media, Privacy & Security, Records Management, WikiLeaks.

WikiLeaks Cablegate

Since late November of last year, the whole world has been watching as WikiLeaks got its hands on and slowly released thousands of classified cables created and distributed by the US over the last four decades.  As you may recall, the suspected leaker was Army Private First Class (Pfc) Bradley Manning who, undetected, was able to locate all the cables, copy them to his local system, burn them to CD-R (while allegedly lip-syncing to Lady Gaga), and upload an encrypted file to WikiLeaks.  (I’ve written about this previously, so I won’t get too detailed here.)

But last week, the story changed dramatically when The Guardian revealed that “A security breach has led to the WikiLeaks archive of 251,000 secret US diplomatic cables being made available online, without redaction to protect sources.  WikiLeaks has been releasing the cables over nine months by partnering with mainstream media organisations.  Selected cables have been published without sensitive information that could lead to the identification of informants or other at-risk individuals.”  To further confuse matters related to the origin of this newest leak, “A Twitter user has now published a link to the full, unredacted database of embassy cables. The user is believed to have found the information after acting on hints published in several media outlets and on the WikiLeaks Twitter feed, all of which cited a member of rival whistleblowing website OpenLeaks as the original source of the tipoffs.”  The Cablegate story, with all its twists and turns over the months, has left a big impression on me and, as an archivist and records manager, I think it is important to strip this story of all its emotionality and look at it calmly and rationally so that we can get to the bottom of this madness.

The first problem I have with the story, or more specifically, with the records management practices of the Defense Department, is the scary fact that a low-level Private First Class (Pfc) would have full access to the Army’s database.  This became a bit scarier when we learned that Pfc Manning used SIPRNet (Secret Internet Protocol Router Network) to gain full access to JWICS (Joint Worldwide Intelligence Communications System) as well as the [civilian/non-military] diplomatic cables generated by the State Department.

So the first question I had to ask was: why does DoD have access to the State Department’s diplomatic cables?  Are they spying on the State Department?!  Well, maybe, but even if not, this staggering fact from a different Guardian article sent shivers down my spine:  “The US general accounting office identified 3,067,000 people cleared to “secret” and above in a 1993 study. Since then, the size of the security establishment has grown appreciably. Another GAO report in May 2009 said: “Following the terrorist attacks on September 11 2001 the nation’s defence and intelligence needs grew, prompting increased demand for personnel with security clearances.” A state department spokesman today refused to say exactly how many people had access to Siprnet.”

Other factors that scare the heck out of me related to “bad records management” and WikiLeaks Cablegate: there is a lack of CONTROL of these assets (they store everything online?!  Really?!); the DoD and State Department don’t use ENCRYPTION or cryptographic keys or protected distribution systems; the names of confidential sources were not REDACTED in the embassy before uploading and sharing the cables with the world; their RETENTION SCHEDULES do not allow for some cables to be declassified and/or destroyed (so they keep everything online for years and/or decades); and the majority of cables were UNCLASSIFIED, suggesting that so many cables are created that they don’t even have enough staff to describe and CLASSIFY them in a better way.  The DoD didn’t have a method for setting ACCESS PRIVILEGES, PERMISSIONS or AUTHORIZATION to ensure that a Pfc who is on probation would not be able to access (and copy and burn to portable media) all those cables undetected?!  There’s a question about password protection and authorization, but those problems could probably be covered with better ACCESS PRIVILEGES and PERMISSIONS.  Another thing that leaves archivists confused is that there seems to be limited version control.  In other words, it seems as if once a cable is completed, someone immediately uploads it, and then if the cable is updated and revised, a second cable is created and uploaded.  This doesn’t seem to be a very smart way of trying to control the information when multiple copies may suggest differing viewpoints.

But perhaps the scariest part of the whole WikiLeaks’ Cablegate madness is simply that there was no TRACKING or TRACING mechanism so that the DoD could, through LOGS, trace data flows to show that one person (or one machine or one room in one building or whatever) had just downloaded a whole collection of CLASSIFIED materials!  [From the IT perspective, large flows of data may actually impact data flow speeds for other soldiers on the same network!]  And the fact that Pfc Manning was able to burn the data to CD-R suggests that when IT deployed the systems they forgot or neglected to DISABLE the burn function on a classified network!  (Okay, they’ve made some recent changes, but is it too late?!)
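Just to show how little it takes, here is a minimal Python sketch of the kind of LOG-based check I mean: total up the bytes transferred per user and workstation, and flag anyone over a threshold.  Everything here is hypothetical (the `DownloadEvent` fields, the account names, the 1 GB threshold); I have no idea what the DoD’s logs actually look like, so treat it as an illustration of the idea and nothing more.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class DownloadEvent(NamedTuple):
    user: str          # hypothetical account identifier
    workstation: str   # hypothetical machine identifier
    bytes_sent: int    # size of one transfer


def flag_bulk_downloads(events: Iterable[DownloadEvent],
                        threshold_bytes: int) -> dict[tuple[str, str], int]:
    """Total the bytes per (user, workstation); return totals above the threshold."""
    totals: dict[tuple[str, str], int] = defaultdict(int)
    for event in events:
        totals[(event.user, event.workstation)] += event.bytes_sent
    return {key: total for key, total in totals.items() if total > threshold_bytes}


# Made-up sample log entries, for illustration only.
sample_log = [
    DownloadEvent("analyst_a", "ws-01", 2_000_000),
    DownloadEvent("analyst_b", "ws-07", 900_000_000),
    DownloadEvent("analyst_b", "ws-07", 800_000_000),
    DownloadEvent("analyst_a", "ws-01", 3_000_000),
]
print(flag_bulk_downloads(sample_log, threshold_bytes=1_000_000_000))
# prints {('analyst_b', 'ws-07'): 1700000000}
```

Of course, a check like this only helps if somebody keeps the logs, protects them from being erased, and actually looks at the results.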

Many assume that Digital Forensics will provide a new way to authenticate data.  Well, if so, then why can’t they run a program on the cables and find out which system was used to burn the data and then trace the information back to the person who was using the machine at that time, as opposed to putting a soldier in jail, in solitary confinement, awaiting trial, effectively convicted on nothing more than a hearsay online chat he had with a known hacker?!  One other important consideration that also scares me: the military uses Outlook for its email correspondence, and Outlook creates multiple PST files.  As the National Journal puts it: “So how did Manning allegedly manage to get access to the diplomatic cables? They’re transmitted via e-mail in PDF form on a State Department network called ClassNet, but they’re stored in PST form on servers and are searchable. If Manning’s unit needed to know whether Iranian proxies had acquired some new weapon, the information might be contained within a diplomatic cable. All any analyst has to do is to download a PST file with the cables, unpack them, SNAP them up or down to a computer that is capable of interacting with a thumb drive or a burnable CD, and then erase the server logs that would have provided investigators with a road map of the analyst’s activities.”

Obviously the system was broken, informants’ security was compromised, our secrets are exposed, and the cat is out of the bag!  Yet even now, many are unwilling to listen to or heed the lessons we need to learn from this debacle.  Back in January, I attended a WikiLeaks panel discussion hosted by the Archivists Round Table of Metropolitan New York and was surprised to hear that most of the issues raised above were ignored.  I tried to ask a question regarding the systemic problems (don’t blame Manning), but even that was mostly ignored (or misunderstood) and not answered by everyone on the panel.

In my opinion, we have very serious problems related to best practices for records management.  If you look closely at DoD 5015.2, you can see that the problems are embedded in the language of the software requirements, and nobody is looking at these problems in the ways that many archivists or records managers do (or should).  But honestly, the most insightful analysis and explanation came from Manning himself: “I would come in with music on a CD-RW labeled with something like ‘Lady Gaga,’ erase the music then write a compressed split file,” he was quoted in the logs as saying. “[I] listened and lip-synced to Lady Gaga’s ‘Telephone’ while exfiltrating possibly the largest data spillage in American history. Weak servers, weak logging, weak physical security, weak counter-intelligence, inattentive signal analysis … a perfect storm.”

So maybe it is time for the military, the US National Archives, and all computer scientists and IT professionals to stop relying on computer processing and automated machine actions and start thinking of better ways to actually protect and control their classified and secret data.  Perhaps a good first move would be to hire more archivists and try to minimize the backlog of Unclassified cables!  Or maybe it’s time to make sure that the embassies take responsibility for redacting the names of their sources before uploading the cables to a shared network?  And maybe it is time to consider a different model than the life cycle model, one which will account for the fact that these cables will often be used for different functions by different stakeholders through the course of their existence.

Arab Spring Diplomatics & Libyan Records Management 2011/09/05

Posted by nydawg in Archives, Best Practices, Digital Archives, Digital Preservation, Electronic Records, Information Technology (IT), Media, Records Management.

At the 75th Annual Meeting of the SAA (Society of American Archivists) last week, I had the good fortune to attend many very interesting panels, speeches and discussions on archives, archival education, standards, electronic records, digital forensics, photography archives, and digital media, and my mind is still reeling.  But when I heard this story on news radio, I needed to double-check.

As you all know, the Arab Spring is a revolutionary wave of demonstrations and protests in the Arab world. Since 18 December 2010 there have been revolutions in Tunisia and Egypt; a civil war in Libya resulting in the fall of the regime there; civil uprisings in Bahrain, Syria, and Yemen; major protests in Algeria, Iraq, Jordan, Morocco, and Oman; and minor protests in Kuwait, Lebanon, Mauritania, Saudi Arabia, Sudan, and Western Sahara! Egyptian President Hosni Mubarak resigned (or retired) and there’s a civil war going on in Libya.  Meanwhile, with poor records management, documents were found in Libya’s External Security agency headquarters showing that the US was firmly on their side in the War on Terror:

“CIA moved to establish “a permanent presence” in Libya in 2004, according to a note from Stephen Kappes, at the time the No. 2 in the CIA’s clandestine service, to Libya’s then-intelligence chief, Moussa Koussa.  Secret documents unearthed by human rights activists indicate the CIA and MI6 had very close relations with Libya’s 2004 Gadhafi regime.

The memo began “Dear Musa,” and was signed by hand, “Steve.” Mr. Kappes was a critical player in the secret negotiations that led to Libyan leader Col. Moammar Gadhafi’s 2003 decision to give up his nuclear program. Through a spokeswoman, Mr. Kappes, who has retired from the agency, declined to comment.  A U.S. official said Libya had showed progress at the time. “Let’s keep in mind the context here: By 2004, the U.S. had successfully convinced the Libyan government to renounce its nuclear-weapons program and to help stop terrorists who were actively targeting Americans in the U.S. and abroad,” the official said.”


So I guess that means that if all of those documents from the CIA are secret, there would be no metric for tracing a record (at least on the US side).   In other words, every time a record is sent, copied or moved, a new version is created, but where is the original?  Depending on the operating system, the metadata may have a new Date Created.  How will anybody be able to find an authentic electronic record when it’s still stored on one person’s local system which is probably upgraded every few years?
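Here is a small Python sketch of that problem.  It copies a file, shows that the filesystem dates on the copy no longer match the original, and shows that a checksum of the content still does.  The file names and the `fingerprint` helper are made up for illustration, and I force the copy’s dates to differ with `os.utime` so the effect shows up on any operating system.

```python
import hashlib
import os
import shutil
import tempfile
from pathlib import Path


def fingerprint(path: Path) -> str:
    """SHA-256 of the file's bytes; independent of name, location, or dates."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


with tempfile.TemporaryDirectory() as workdir:
    original = Path(workdir) / "record_original.txt"
    original.write_text("sample record text", encoding="utf-8")

    # Simulate a later copy whose filesystem dates differ from the original's.
    duplicate = Path(workdir) / "record_copy.txt"
    shutil.copy(original, duplicate)
    os.utime(duplicate, (0, 0))  # set access/modified times to the epoch

    same_content = fingerprint(original) == fingerprint(duplicate)
    same_mtime = original.stat().st_mtime == duplicate.stat().st_mtime
    print(f"same content: {same_content}, same modified time: {same_mtime}")
    # prints "same content: True, same modified time: False"
```

A checksum can tell you that two copies are identical, but it cannot tell you which one is the original or who held it first.  That still takes provenance and a documented chain of custody.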

There is a better way, a paradigm shift.  The Australian records continuum “certainly provides a better view of reality than an approach that separates space and time”, and by looking at it we can find a better way, so that not all the [useless] data created is aggregated.  With better and more appraisal, critical and analytical and technical and IP content, we can select and describe the born-digital assets more completely and separate the wheat from the chaff, the needles from the haystacks, the molehills from the mountains, and (wait for it)  . . . see the forest for the trees.  By storing fewer assets and electronic records more carefully, we can actually guarantee better results.  Otherwise, we are simply pawns in the games of risk played (quite successfully) by IT Departments ensuring (but not insuring) the higher-ups that “we are archiving: we backup every week.” [For those who are wondering: when institutions “backup” they back up the assets one week, move the tapes offsite and overwrite the assets the following week.  They don’t archive-to-tape for long-term preservation.]

Diplomatics may present a way for ethical archivists into the world of IT, especially when it comes down to Digital Forensics.  But the point I’m ultimately trying to make, I think, is that electronic (or born-digital) records management requires new skills, strategies, processes, standards, plans, goals and better practices than the status quo.  And this seems to be the big elephant in the room that nobody dares describe.

From Scroll to Screen and Back: Vendor Lock-In and eBooks 2011/09/04

Posted by nydawg in Best Practices, Digital Archives, Digital Humanities, Digital Preservation, Information Literacy, Information Technology (IT).

I saw an interesting article in the NYTimes titled “From Scroll to Screen” which looks at the transition in print media over the last two thousand years.  While the author is specifically looking at the transition from books to eBooks, he declares, “The last time a change of this magnitude occurred was circa 1450, when Johannes Gutenberg invented movable type. But if you go back further there’s a more helpful precedent for what’s going on. Starting in the first century A.D., Western readers discarded the scroll in favor of the codex — the bound book as we know it today.”

Like many archivists and librarians, I am also highly interested in how this transition will work in the future.  Last year I read William Powers‘ excellent Hamlet’s Blackberry which led me to new ways of thinking about media and different formats used to carry data and information between different stakeholders across time and space. . . .

So this article in the NYTimes by book critic Lev Grossman caught my interest when discussing how one format replaces the previous:  “In the classical world, the scroll was the book format of choice and the state of the art in information technology. Essentially it was a long, rolled-up piece of paper or parchment. To read a scroll you gradually unrolled it, exposing a bit of the text at a time; when you were done you had to roll it back up the right way, not unlike that other obsolete medium, the VHS tape.”

He goes on to explain how those scrolls were items of prestige, probably because of the “scarcity” of scroll-creators.  “Scrolls were the prestige format, used for important works only: sacred texts, legal documents, history, literature. To compile a shopping list or do their algebra, citizens of the ancient world wrote on wax-covered wooden tablets using the pointy end of a stick called a stylus. Tablets were for disposable text — the stylus also had a flat end, which you used to squash and scrape the wax flat when you were done. At some point someone had the very clever idea of stringing a few tablets together in a bundle. Eventually the bundled tablets were replaced with leaves of parchment and thus, probably, was born the codex. But nobody realized what a good idea it was until a very interesting group of people with some very radical ideas adopted it for their own purposes. Nowadays those people are known as Christians, and they used the codex as a way of distributing the Bible.”

And if you have ever tried to compare two or more passages in a book on an eReader, you may be interested to read: “The codex also came with a fringe benefit: It created a very different reading experience. With a codex, for the first time, you could jump to any point in a text instantly, nonlinearly.”  This doesn’t quite work as easily in the tablet or eReader age, but stay tuned, as I imagine at some point they will improve on the technology.  “If the fable of the scroll and codex has a moral, this is it. We usually associate digital technology with nonlinearity, the forking paths that Web surfers beat through the Internet’s underbrush as they click from link to link. But e-books and nonlinearity don’t turn out to be very compatible.”

So as we move from the tried and trusted durable medium of the codex and hard-cover book (even if printed on cheap paper) to the electronic tablet (early versions, soon-to-be-obsolete operating systems, playing outdated versions), our content management expertise and digital asset curation skills should become more valuable as new technologies evolve, media formats become obsolete and disposable, and our culture is put at risk.

But our hands are tied.   Even to address the pressing concerns of eBooks and eReaders and tablets, archivists are left out in the cold.  We dare not say anything about the vendor lock-in regarding Kindle’s proprietary formats, because they are Amazon’s Intellectual Property.  We cannot say anything about vendor lock-out in regards to Apple’s iPad tablet not playing (hot) Flash videos (see Steve Jobs “Thoughts on Flash”), and causing problems when accessing Fedora through its Flash application.  We cannot even mention the fact that iPads do not have any support for portable SD card and USB 2.0 external drives.  In other words, if you want to get information on (or off) your iPad, you probably have to email or upload it. . .  😦

So what can we do?  Or, more clearly, what should a digital archivist know to catalog and describe when working with born-digital materials?!  Well, of course there’s so much (not everything entirely relevant though!), but at the very least, better format (not just medium, but format, and maybe codec) descriptions can create better strategies leading to better plans, processes and best practices.  And keep your damn books!

I’ll admit that I finally broke down and bought an eReader.  Since I was going to be travelling to Chicago for the SAA, there were many articles I wanted to read and think about in advance, so for the last few months I was searching for the right one.  Of course, I was quite wary of Kindles because of the proprietary format and wasn’t sure how well one would read PDFs (or if it could), but friends suggested the Digital Reader or Sony eReader, and at least one friend suggested checking out the Nook from Barnes and Noble.

I wasn’t really sure what I wanted, and I didn’t really care that much.  Basically, anything that would let me read PDFs, most of which I would download from the internet or, specifically, from the Society of American Archivists’ (SAA) American Archivist journal.  Of course, I also found some excellent reads on Project Gutenberg and Internet Archive (where, btw, they’re preserving physical books!).  I’m really interested in the e-ink technology, so that was one factor, and the other factor was that I didn’t want to pay more than $130.  (Another factor, on which I thought I would have to compromise, was that I wanted an eReader with two screens that would open like a book.)

Well, as you might expect, my research was not leading me to any answers, and I had almost decided to just go with one or the other (whichever was cheapest), knowing it would be a temporary solution until I could afford to buy a nice tablet computer . . .  But then one day, I got an email offer from Woot for a one-day-only clearance sale of the two-screen dualbook (eReader & tablet) Entourage Pocket Edge!!  So I picked that up and I love it. . . .  (Yes, there are problems, but it works for reading PDFs, drawing in a journal, annotating PDFs, surfing the web on the tablet side, etc.)  Maybe I’ll write more on it later, but for now, I hope you’ll just give a long thought to what we’ll lose if we give up functional non-linearity in our published works!  (And I don’t mean Digital Humanities people with their datamining techniques.)

Paradigm Shift from Economics of Scarcity to Abundance & Scarcity of Common Sense 2011/08/24

Posted by nydawg in Archives, Best Practices, Digital Archives, Information Technology (IT), Intellectual Property.

One of the most exciting (and scary!) aspects of being a digital archivist these days  is that everyone is living through a transition from the Atomic Age (age of atoms) to the Information (Digital) Age (age of bits), but archivists are also living through a professional paradigm shift from the economics of scarcity to the economics of abundance.   There is so much born-digital information created every day, month, year, decade, etc., that it is overwhelming just to contemplate how much information is created (and stored), and, while it seems like archivists are doing more and more work, there is some question about the metrics used to show that we are preserving the most significant material.  (e.g. NARA is accessioning 200 million emails from the George W. Bush Administration which, as I’ve blogged previously, works out to nearly one email every second of the Administration).

For me, this is a fascinating time for archivists because few people seem to understand how significant this transition is and will be.  In fact, from my experiences in library schools, many older faculty members seem unwilling (or unable?) to articulate this transition and, by extension, cannot even teach younger students how these changes will impact their lives and professions.  So rather than try to address these issues head-on, some educators ignore them and assign student readings from books written in the early 1990s or before.  (I have nothing against the study of “history”, but practicality would be helpful for students trying to get jobs as Information or Knowledge Managers.)

Years ago, for example, when President John F. Kennedy wrote a memo or correspondence, his secretary would type it up in triplicate and send one copy to the intended recipient, file a second copy in the office, and send the third copy to the archives.  Decades later, if somebody wanted to find the original, the office copy or the archived version, it would most likely be filed away and accessible in its original paper format.  This system worked very well for hundreds and probably thousands of years!  In the Information Age, a similar memo for President Barack Obama might be created by the secretary as a born-digital Word format file, and copies of the file could (or should) be distributed in a similar manner (or perhaps converted to ISO 19005 PDF/A format for stable long-term preservation).  This may or may not be happening, but one big difference is that these electronic records (or born-digital files) are dependent on the software used to create them, and if the software is upgraded or replaced and newer versions are not backwards compatible, it may prove difficult to find, access and open those files.  (Also, it’s important to note that those files may have been stored on any variety of media formats which are no longer supported or accessible; e.g. remember Jaz drives or Zip disks or CD-ROMs or 5.25″ floppy disks?)

To prevent losing mass quantities of materials, many libraries subscribe to LOCKSS or Lots Of Copies Keep Stuff Safe.  This may work for electronic journals created in PDF/A format, but it doesn’t work so well if ALL those copies are in a format (or font) that is obsolete or not supported, and/or are stored on a medium (floppies) that is no longer accessible on newer technologies (e.g. iPads don’t have a DVD-ROM drive or a USB port)!

But this strategy may not work for Digital Archives because of the difference between accessibility and access or, as James Gleick, author of The Information: A History, A Theory, A Flood, puts it: “We’re in the habit of associating value with scarcity, but the digital world unlinks them. You can be the sole owner of a Jackson Pollock or a Blue Mauritius but not of a piece of information — not for long, anyway. Nor is obscurity a virtue. A hidden parchment page enters the light when it molts into a digital simulacrum. It was never the parchment that mattered.”

As Maria Popova puts it in her excellent essay “Accessibility vs. access: How the rhetoric of “rare” is changing in the age of information abundance“: “Because in a culture where abundance has replaced scarcity as our era’s greatest information problem, without these human sensemakers and curiosity sherpas, even the most abundant and accessible information can remain tragically “rare.””

Archivists and librarians have mastered the processes and practices from an earlier era of scarcity (e.g. item-level description) and seem unwilling (or unable) to consider a new and more efficient model.  I was trying to think of an analogy for this, and it hit me in Kennedy Airport where I am waiting for my flight to SAA’s annual meeting in Chicago: For hundreds and thousands of years, men and women have moved around while struggling to pack and carry their  luggage, but it wasn’t until 1970 that Bernard Sadow “invented” the suitcase with wheels.  What took so long?
It’s hard to say exactly what took so long, but it seems likely that travelers (especially macho travelers) had gotten so used to the inconvenience of lugging their heavy luggage through changing transportation systems that no one considered an easier, faster and better way.  But ultimately “common sense” won out, and now just about everyone (except me) has wheels on his/her luggage.  Why am I still holding out?!  I’m still waiting for suitcases that fly!

Whither Appraisal?: David Bearman’s “Archival Strategies” 2011/08/22

Posted by nydawg in Archives, Best Practices, Curating, Digital Archives, Digital Preservation, Education, Electronic Records, Information Technology (IT), Media, Records Management.

Back in Fall 1995, American Archivist published one of the most controversial and debate-inspiring essays written by archival bad-boy David Bearman of Archives & Museum Informatics from Pittsburgh (now living in Canada).  The essay, “Archival Strategies” pointed to several problems (challenges/obstacles) in archival methods and strategies which, at the time, threatened to make the profession obsolete.   The piece was a follow-up to his “Archival Methods” from 1989 and showed “time and again that archivists have themselves documented order of magnitude and greater discrepancies between our approaches and our aims, they call for a redefinition of the problems, the objectives, the methods or the technologies appropriate to the archival endeavor.”  As he points out in Archival Strategies, “In Archival Methods, I argued that “most potential users of archives don’t,” and that “those who do use archives are not the users we prefer.””

This disconnect between archives and their future users led Bearman to write: “I urged that we seek justification in use, and that we become indispensable to corporate functioning as the source of information pertaining to what the organization does, and as the locus of accountability.”  With pithy aphorisms like “most potential users of archives don’t,” and “those who do use archives are not the users we prefer,” he was able to point to the serious problem facing us today: past practices have led us to preserve the wrong stuff for our unpreferred users!  Of course Information Technology has led us down this road since computer storage is marketed as so cheap (and always getting cheaper), and it seems much easier to store everything than to let an archivist do his job starting with selection and appraisal, retention and preservation, arrangement and description, and access and use.

Ultimately, his essay is a clarion call for archivists to establish a clear goal for the profession, namely to accept their role in risk management and providing accountability for the greater societal goal.  The role of an archivist, in my opinion, is to serve as an institution’s conscience!  Perhaps that is the reason why library science and archival studies are considered sciences.  He suggests that strategic thinking is required: “Because strategic thinking focuses on end results, it demands “outcome” oriented, rather than “output” oriented, success measures. For example, instead of measuring the number of cubic feet of accessions (an output of the accessioning process), we might measure the percentage of requests for records satisfied (which comes closer to reflecting the purpose of accessioning).”

This seminal essay is a fascinating read and groundbreaking analysis of the sorry state of appraisal.  “What we have actually been doing is scheduling records to assure that nothing valuable is thrown away, but this is not at all equivalent to assuring that everything valuable is kept.  Instead, these methods reduce the overall quantity of documentation; presumably we have felt that if the chaff was separated from the wheat it would be easier to identify what was truly important.  The effect, however, is to direct most records management and archival energy into controlling the destruction of the 99 percent of records which are of only temporary value, rather than into identifying the 1 percent we want, and making efforts to secure them.”

Using incendiary language, Bearman goes on to state the obvious:  “Appraisal, which is the method we have employed to select or identify records, is bankrupt.  Not only is it hopeless to try to sort out the cascade of “values” that can be found in records and to develop a formula by which these are applied to records, it wastes resources and rarely even encounters the evidence of those business functions which we most want to document.”

2D lifecycle or 3D continuum

This is a revolutionary essay, and I strongly encourage every archivist to read it and think about it deeply.  The ideas have mostly languished and been ignored in this country as we continue to use the life cycle model, but Bearman’s ideas are written into the international standard for records management (ISO 15489) and widely embraced in Australia (and China) where, over the last two decades, they have conceptualized and implemented the “Australian records continuum” model to great effect and, in doing so, they are looking at born-digital assets and electronic records from the perspectives of all users, functions, and needs.  In my opinion, the continuum model is a 3D version of the lifecycle, which reminds me of this image from A Wrinkle in Time in which Mrs. Who and Mrs. Whatsit explain time travel to Meg and Charles Wallace by showing how an ant can quickly move across a string if the two ends are brought closer together.  In other words, if archivists look at the desired end result, they can appraise and process accordingly.


After reading the Bearman essay for the first time and seeing how it has caused such dramatic changes in archival conceptualizations, methods, strategies and processes elsewhere, but is still not taught in any depth in US library or archival studies schools, I spoke with other nydawg members, and we decided to use it as the text for our next discussion group on Tuesday August 23.  I hope to revisit this topic later.

One last point.  Because of the deluge of materials accessioned by archives, “uncataloged backlog among manuscripts collections was a mean of nearly one-third repository holdings”, leading the authors to claim “Cataloging is a function that is not working.”  With budgets cut and small staffs unable to make progress, Mark Greene and Dennis Meissner wrote another revolutionary piece titled “More Product, Less Process: Pragmatically Revamping Traditional Processing Approaches to Deal with Late 20th-Century Collections” [MPLP] which was a plea for minimal processing.

Unlike Bearman’s “Archival Strategies”, MPLP leads archivists to believe that we must remain passive filers or describers or catalogers or undertakers.  But without a better understanding of appraisal and how to do it, we are doomed with analog, paper, born-digital or electronic records!  The clearest example of this is the National Archives and Records Administration’s Electronic Records Archives (ERA), of which Archivist of the United States David Ferriero has said: “At the moment, most of the electronic records in ERA are Presidential records from the George W. Bush White House.  This important collection includes more than 200 million e-mail messages and more than 3 million digital photographs, as well as more than 30 million additional electronic records in other formats.”

A few weeks ago, I actually crunched the numbers and figured out that 200 million emails over the course of eight years works out to nearly one email a second!  (365 days a year x 8 years = 2,920 days, plus 2 leap-year days = 2,922 days; 2,922 x 24 hours a day = 70,128 hours; 70,128 x 60 minutes in an hour = 4,207,680 minutes; 4,207,680 x 60 seconds per minute = 252,460,800 seconds.  So 200 million emails in about 252 million seconds is roughly 0.8 emails per second.)
After doing the math, my first thought was, “if we’re trying to process and preserve every email sent every second by the George W. Bush Administration, we must be doing something wrong.”  And now, I think I understand the problem: we’re not doing effective appraisal.  Although we still have to wait for public access to the emails, I am fairly confident that researchers will find that nearly 90 percent of the collection is duplicates, or that they are keeping copies of the sent email, the different received emails, plus backups of all of them.  With better appraisal, this task should not be so difficult, and would leave more time for catalogers to do more detailed descriptions (which will be more important later, especially with different formats of “moving images” which are not compatible with newer versions of hardware, e.g. iPads don’t play Flash Video).
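
For anyone who wants to re-run the numbers, here is a minimal sketch of the same arithmetic in Python.  The start and end dates are my assumption (inauguration day 2001 to inauguration day 2009), and the names are made up for illustration; the 200 million figure is the one quoted above.

```python
from datetime import date

# Assumed boundaries: inauguration day 2001 to inauguration day 2009.
TERM_START = date(2001, 1, 20)
TERM_END = date(2009, 1, 20)
EMAIL_COUNT = 200_000_000  # figure quoted from the Archivist's statement


def seconds_in_span(start: date, end: date) -> int:
    """Return the number of seconds between two calendar dates."""
    return (end - start).days * 24 * 60 * 60


total_seconds = seconds_in_span(TERM_START, TERM_END)
emails_per_second = EMAIL_COUNT / total_seconds

print(f"{total_seconds:,} seconds in the span")
print(f"{emails_per_second:.2f} emails per second")
```

Letting the date arithmetic count the days means the two leap days (2004 and 2008) are picked up automatically instead of being added by hand.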


Disaster Plan: Mystery Surrounds Loss of Digital 9/11 Records, Docs & Art 2011/08/21

Posted by nydawg in Archives, Best Practices, Digital Preservation, Education, Electronic Records, Intellectual Property, Privacy & Security.

A few weeks ago, nydawg member NYU Professor Howard Besser shared this article from the AP.  As an archivist and records manager, I shudder to think that all copies of each lost asset were stored in only one place, and that no copies were stored offsite in at least two geographically different locations.

“Besides ending nearly 3,000 lives, destroying planes and reducing buildings to tons of rubble and ash, the Sept. 11, 2001, attacks destroyed tens of thousands of records, irreplaceable historical documents and art.  In some cases, the inventories were destroyed along with the records. And the loss of human life at the time overshadowed the search for lost paper. A decade later, agencies and archivists say they’re still not completely sure what they lost or found, leaving them without much of a guide to piece together missing history.

“You can’t get the picture back, because critical pieces are missing,” said Kathleen D. Roe, operations director at the New York State Archives and co-chairwoman of the World Trade Center Documentation Project. “And so you can’t know what the whole picture looks like.”  . . . “The trade center was home to more than 430 companies, including law firms, manufacturers and financial institutions. Twenty-one libraries were destroyed, including that of The Journal of Commerce. Dozens of federal, state and local government agencies were at the site, including the Equal Employment Opportunity Commission and the Securities and Exchange Commission.”

from Northeast Document Conservation Center

But the story goes on to point out that nobody notified NARA!  I would think that most of these federal agencies would have disaster plans and policies (check out the Library of Congress’s 404 page not found, or here, or NARA, and NARA from 1993), but maybe I’m wrong.  Fortunately, you can probably find assistance at NEDCC dPlan….

 . . . “Federal agencies are required by law to report the destruction of records to the U.S. National Archives and Records Administration — but none did. Federal archivists called the failure understandable, given the greater disaster.  After Sept. 11, “agencies did not do precisely what was required vis-à-vis records loss,” said David S. Ferriero, the Archivist of the United States, in an email to The Associated Press. “Appropriately, agencies were more concerned with loss of life and rebuilding operations — not managing or preserving records.”  He said off-site storage and redundant electronic systems backed up some records; but the attacks spurred the archives agency to emphasize the need for disaster planning to federal records managers.”

Said Steven Aftergood, the director of the project on government secrecy at the watchdog group the Federation of American Scientists: “Under extreme circumstances, like those of 9/11, ordinary record keeping procedures will fail. Routine archival practices were never intended to deal with the destruction of entire offices or buildings.”

Read “Mystery Surrounds Loss of Records, Art on 9/11”, and when you’re ready and think you can get some institutional support, you might want to check out some great resources including: the Society of American Archivists’ [SAA] annotated resources site for disaster plan templates, articles and other useful information; or a useful guide from NARA, Emergency Preparedness Bibliography (which is only 5 years old); or this from NARA, Disaster Preparation Primer from 1993, which doesn’t mention digital or electronic.