NARA to Declassify 400 Million Pages of Documents in Three Years 2011/12/06

Posted by nydawg in Archives, Digital Archives, Digital Preservation, Electronic Records, Information Technology (IT), Media, Records Management.
For a very long time, I have been asking anyone who might know (from my colleagues to the AOTUS himself) why we are even attempting to preserve 250 million emails created during the Bush Administration.  As I’ve mentioned before, that works out to nearly one email every second for eight years!  (And remember, part of that time included Bush’s annual month-long vacations.)  So this story really seemed to give a bit of context on the ways that the National Archives (NARA) deals with processing large backlogs of materials.  “All of these pages had been piling up here, literally,” said Sheryl J. Shenberger, a former CIA official who is the head of the National Declassification Center (NDC) at the National Archives. “We had to develop a Costco attitude: We had 400 million pages . . . and we have three years to do them in.”
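
To get a feel for what that "Costco attitude" means in practice, here is a rough back-of-the-envelope sketch. The 400 million pages and three years come from the quote; the number of working days per year is my own assumption, so treat the result as an order of magnitude and nothing more.

```python
# Rough throughput needed to clear the declassification backlog.
PAGES = 400_000_000        # backlog size quoted by the NDC
YEARS = 3                  # time allowed, per the quote
DAYS_PER_YEAR = 365        # calendar days
WORKDAYS_PER_YEAR = 250    # assumption: five-day weeks minus holidays


def pages_per_day(pages, years, days_per_year):
    """Average number of pages that must be processed each day."""
    return pages / (years * days_per_year)


calendar_rate = pages_per_day(PAGES, YEARS, DAYS_PER_YEAR)
workday_rate = pages_per_day(PAGES, YEARS, WORKDAYS_PER_YEAR)

print(f"{calendar_rate:,.0f} pages per calendar day")
print(f"{workday_rate:,.0f} pages per working day")
```

Either way it comes to hundreds of thousands of pages a day, which is why the staffing question matters so much.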

If you read Saturday’s article in the Washington Post, you’ll learn that “All of the backlogged documents date back 25 years or more, and most are Cold War-era files from the departments of Defense, State and Justice, among other agencies. The CIA manages the declassification of its own files,” and that “‘The current backlog is so huge that Americans are being denied the ability to hold government officials accountable for their actions,’ [AOTUS David] Ferriero said. ‘By streamlining the declassification process, the NDC will usher in a new day in the world of access.’”

If NARA is really trying to declassify, process, catalog, describe, preserve and make these pages available, I hope they’re planning on hiring some more archivists!  The problem is that when institutions are dealing with mass quantities of materials, the (quantitative) metrics we use may actually hurt us in the future.  In the archival world, the prevailing wisdom seems to be MPLP (More Product, Less Process), but I would argue that archivists need to have qualitative metrics as well, if only to ensure that they are weeding out redundancies and older, unneeded versions.  This gets to the crux of the distinction between best practices for records managers and best practices for digital asset managers (or digital archivists).  Ideally, a knowledgeable professional will collect and appraise these materials, and describe them in enough detail that a future plan can be created to ensure that these assets (or records) can be migrated forward into new formats accessible on emerging (or not-yet-invented) media players and readers.

Ultimately, this leads to the most serious problem facing archivists: the metadata schemas that are most popular (Dublin Core, IPTC, DACS, EAD, etc.) are not specific enough to help archivists plan for the future.  Until our metadata schemas can be updated to ensure that content, context, function, structure, brand, storage media and file formats can be specifically and granularly identified and notated, we will continue paddling frantically against the digital deluge with no workable strategy or plan, and no awareness of potential problems (e.g. vendor lock-in, non-backwards-compatible formats, etc.).  Sadly, in the face of huge quantities of materials (emails and pages), NARA will probably embrace MPLP, and ultimately hinder and hurt future access to the most important specific files, pages, emails, etc., because they will refuse to hire more professionals to do this work, and will (probably) rely on computer scientists and defense contractors to whitewash the problems and sell more software.

NARA’s Erratic ERA Offers No Content-Searching 2011/10/29

Posted by nydawg in Archives, Digital Archives, Electronic Records, Records Management.
Many of us have been watching the unruly boondoggle of NARA’s ERA over
the years, but this story seems a bit overdue. . . . In a nutshell,
“Searching text impossible on NARA’s e-Records Archive”.
I hope soon they’ll take on the task of separating the wheat from the
chaff of those 250 million Bush emails (nearly one [out-of-office?]
email every second for 8 years).
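
For anyone who wants to check the "one email every second" math, here is a quick sketch. The 250 million emails and eight years are the figures used in these posts; using 365.25 days per year to account for leap years is my own simplification.

```python
# Sanity check: 250 million emails spread evenly over an eight-year administration.
EMAILS = 250_000_000
YEARS = 8
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60   # assumption: average year length

total_seconds = YEARS * SECONDS_PER_YEAR
emails_per_second = EMAILS / total_seconds
seconds_per_email = total_seconds / EMAILS

print(f"{emails_per_second:.3f} emails per second")
print(f"one email every {seconds_per_email:.2f} seconds")
```

The rate comes out just under one per second, around the clock, nights, weekends and vacations included.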

“People trying to search the text of documents through the National
Archives and Records Administration’s $430 million Electronic Records
Archive are going to be disappointed, according to the agency’s
inspector general.  Under the currently deployed system, users can
search only by metadata. That typically includes tags for information
such as name of the original publication, date of publication, agency
that originated the document, and a small number of keywords. Users
who hope to locate a document by a word or phrase that isn’t part of
the metadata will be unable to. . . .

The public’s ability to use the ERA is likely to be hampered because
of the lack of a full text-based search capability, which would be
similar to what is available on Google.com or other commercial search
engines, NARA Inspector General Paul Brachfeld said in an interview on
Oct. 26.  Lack of full text search “is one of the profound problems
with the ERA at this point,” Brachfeld said. “Metadata alone does not
tell the story of what is in the documents.””
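
To illustrate the difference the inspector general is describing, here is a minimal sketch of metadata-only search versus full-text search. It says nothing about how the ERA is actually built; the `Record` class, its fields and the sample record are all hypothetical, made up purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical archival record: a few metadata tags plus the document text."""
    title: str
    agency: str
    keywords: tuple = ()
    body: str = ""


def metadata_search(records, term):
    """Match only against title, agency and keyword tags."""
    needle = term.lower()
    return [
        r for r in records
        if needle in r.title.lower()
        or needle in r.agency.lower()
        or any(needle in k.lower() for k in r.keywords)
    ]


def full_text_search(records, term):
    """Match against the metadata and the document body as well."""
    needle = term.lower()
    hits = metadata_search(records, term)
    return hits + [r for r in records if r not in hits and needle in r.body.lower()]


# Invented sample data.
records = [
    Record(
        title="Memorandum on embassy staffing",
        agency="Department of State",
        keywords=("staffing", "budget"),
        body="The memo also discusses grain shipments through the port.",
    ),
]

print(len(metadata_search(records, "grain shipments")))   # phrase is not in any tag
print(len(full_text_search(records, "grain shipments")))  # phrase is in the body
```

A researcher who only knows a phrase from inside the document gets nothing from the first function, which is exactly the complaint.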


WikiLeaks’ Cablegate Links State Dept. Bureau of Diplomatic Security to Madness 2011/09/28

Posted by nydawg in Archives, Digital Archives, Digital Preservation, Electronic Records, Information Technology (IT), Media, Privacy & Security, Records Management, WikiLeaks.
For the last year or so, I’ve been fascinated by the whole WikiLeaks Cablegate story.  As I posted previously, there are a number of factors that contribute to this story which make it particularly interesting for people concerned with records management and best practices for accessing and sharing information.  In my opinion, Private First Class Bradley Manning is a fall guy (lip-syncing to Lady Gaga), but the problems revealed point to serious systemic malfunctions.  So I was very interested to read this article by Andy Kroll: “The Only State Dept. Employee Who May Be Fired Over WikiLeaks”.

“Peter Van Buren is no insurgent. Quite the opposite: For 23 years he’s worked as a foreign service officer at the State Department, and a damn good one from the looks of it. He speaks Japanese, Mandarin Chinese, and Korean; served his country from Seoul to Sydney, Tokyo to Baghdad; and has won multiple awards for his disaster relief work. So why was Van Buren treated like a terror suspect by his own employer? For linking to a single leaked cable dumped online by WikiLeaks earlier this month.”

Well, this led me to read a TomDispatch.com posting by Van Buren himself which offers a clear-headed look at the madness!  For one thing, Van Buren got into a heap of trouble and was “under investigation for allegedly disclosing classified information” for LINKING to a WikiLeaks document which was already on the Web!  As he put it: “two DS agents stated that the inclusion of that link amounted to disclosing classified material. In other words, a link to a document posted by who-knows-who on a public website available at this moment to anyone in the world was the legal equivalent of me stealing a Top Secret report, hiding it under my coat, and passing it to a Chinese spy in a dark alley.”

Van Buren goes on to analyze the situation by stating: “Let’s think through this disclosure of classified info thing, even if State won’t. Every website on the Internet includes links to other websites. It’s how the web works. If you include a link to say, a CNN article about Libya, you are not “disclosing” that information — it’s already there. You’re just saying: “Have a look at this.”  It’s like pointing out a newspaper article of interest to a guy next to you on the bus.  (Careful, though, if it’s an article from the New York Times or the Washington Post.  It might quote stuff from Wikileaks and then you could be endangering national security.)”

And, for me, the cherry on the top, and something I’ve been trying to state for most of the last year (including at the Archivists Round Table of Metropolitan New York meeting in January 2011), is the fact that “No one will ever be fired at State because of WikiLeaks — except, at some point, possibly me. Instead, State joined in the Federal mugging of Army Private Bradley Manning, the person alleged to have copied the cables onto a Lady Gaga CD while sitting in the Iraqi desert. That all those cables were available electronically to everyone from the Secretary of State to a lowly Army private was the result of a clumsy post-9/11 decision at the highest levels of the State Department to quickly make up for information-sharing shortcomings. Trying to please an angry Bush White House, State went from sharing almost nothing to sharing almost everything overnight. They flung their whole library onto the government’s classified intranet, SIPRnet, making it available to hundreds of thousands of Federal employees worldwide. . . . . State did not restrict access. If you were in, you could see it all. There was no safeguard to ask why someone in the Army in Iraq in 2010 needed to see reporting from 1980s Iceland. . . . . Most for-pay porn sites limit the amount of data that can be downloaded. Not State. Once those cables were available on SIPRnet, no alarms or restrictions were implemented so that low-level users couldn’t just download terabytes of classified data. If any activity logs were kept, it does not look like anyone checked them.”

In other words, by pointing the finger of blame at a few (two) bad apples (Pfc Manning and Foreign Service Officer/author Van Buren), State “… gets rid of a “troublemaker,” and the Bureau of Diplomatic Security people can claim that they are “doing something” about the WikiLeaks drip that continues even while they fiddle.”  Yet, the State Department and the Department of Defense still refuse to acknowledge the systemic problems of trying to provide UNRESTRICTED and UNTRACEABLE ACCESS to ALL CABLES to all LEVELS of employees, from the highest administrative levels at State and Defense to the lowliest of the low (a Private first class on probation or a contractor, like Aaron Barr, working in White Hat or Black Hat Ops).  Okay, according to Homeland Security Today, there are 3 million people (not just Americans, btw) with “secret” clearance and “only” half a million with access to SIPRNet!

This still strikes me as an example of the US acting like an ostrich, burying its head in the sand so that it will not have to acknowledge the serious problems that are all around us.  Mark my words: the system is still broken, and even though certain changes have been instituted (thumb drive bans), we have a much more serious and systemic problem which few dare to acknowledge.  What’s the solution?  Better appraisal and better records management!

CLIR: Future Generations Will Know More About the Civil War than the Gulf War 2011/09/22

Posted by nydawg in Archives, Best Practices, Digital Archives, Education, Electronic Records, Information Technology (IT), Records Management.
When I was in Queens College Graduate Library School six years ago, I took Professor Santon’s excellent course in Records Management, which led me to understand that every institution has to manage its records, its assets and its intellectual property.  The vital role the archive and records center play in everyday use and long-term functions was made clear by the fact that records have a life cycle: basically creation – – use – – destruction or disposition.  The course was excellent, despite the fact that the main textbooks we used were from the early 1990s (and included a 3 1/4″ floppy that ran on Windows 3.1).

While doing an assignment, I found a more recent article which really led me to a revelation: electronic records will cause a lot of problems!  The one part that stuck out most and I still remember to this day was in a 2002 article “Record-breaking Dilemma” in Government Technology.  “The Council on Library and Information Resources, a nonprofit group that supports ways to keep information accessible, predicts that future generations will know more about the Civil War than the Gulf War. Why? Because the software that enables us to read the electronic records concerning the events of 1991 have already become obsolete. Just ask the folks who bought document-imaging systems from Wang the year that Saddam Hussein invaded Kuwait. Not only is Wang no longer in business, but locating a copy of the proprietary software, as well as any hardware, used to run the first generation of imaging systems is about as easy as finding a typewriter repairman. ” (emphasis added)

Obviously that article greatly affected my thinking about the Digital Dark Ages, and it got me wondering what best practices will be for managing born-digital assets or electronic records for increasingly long periods of time on storage media that are guaranteed for decreasing periods of time.  Or, as one person quoted in that article put it, “We’re constantly asking ourselves, ‘How do we retain and access electronic records that must be stored permanently?’”  Well, this gets to the crux of the issue, especially when records managers and archivists aren’t invited into the conversations with IT.  And just as we are using more and more hard drives (or larger servers, even in the cloud), we read that “Hard-drive Makers Weaken Warranties”.  In a nutshell: “Three of the major hard-drive makers will cut down the length of warranties on some of their drives, starting Oct. 1, to streamline costs in the low-margin desktop disk storage business.”

So if we’re storing more data on storage media that are not built for long-term preservation, then records and archival management must be an ongoing relay race, with appropriate ongoing funding and support, as more and more materials are copied or moved from one storage medium to another, periodically, every 3-5 years (or maybe that will soon be 1-3 years?).  Benign neglect is no longer a sound records management strategy.

That’s the technological challenge.  But there’s more!  I’ve gone on and on and on before about NARA’s ERA program and how one top priority is to ingest 250 million emails from the Bush Administration.  (I’ve done the math, it works out to nearly one email every second of the eight years.)  So we know that NARA is interested in preserving electronic records.  But a couple years ago I read this scary Fred Kaplan piece, “PowerPoint to the People: The urgent need to fix federal archiving policies”, in which he reported that “Finally—and this is simply stunning—the National Archives’ technology branch is so antiquated that it cannot process some of the most common software programs. Specifically, the study states, the archives “is still unable to accept Microsoft Word documents and PowerPoint slides.””

Uhhhhh, wait!  Well, at least that was written in 2009, so we can hope they have gotten their act together, but if you think about it too much, you might wonder if EVERYTHING WE NEED TO ARCHIVE IS IN MICROSOFT’S PROPRIETARY FORMATS?  Or you might just be inspired to ask if anyone really uses PowerPoint in the military.  Well, as Kaplan points out, “This is a huge lapse. Nearly all internal briefings in the Pentagon these days are presented as PowerPoint slides. Officials told me three years ago that if an officer wanted to make a case for a war plan or a weapons program or just about anything, he or she had better make the case in PowerPoint—or forget about getting it approved.”  Or see this piece from the NYTimes, “We Have Met the Enemy and He Is PowerPoint”, in which “Commanders say that behind all the PowerPoint jokes are serious concerns that the program stifles discussion, critical thinking and thoughtful decision-making. Not least, it ties up junior officers — referred to as PowerPoint Rangers — in the daily preparation of slides, be it for a Joint Staff meeting in Washington or for a platoon leader’s pre-mission combat briefing in a remote pocket of Afghanistan.”

U.S. Marshals’ Gross Mismanagement Undervalues Complex Assets 2011/09/20

Posted by nydawg in Archives, Electronic Records, Records Management.
People often wonder, “Why does anyone need a good, honest, ethical records manager anyway?”  or “Why don’t universities offer better programs in records management?” and “why do people responsible for hiring records managers not understand what records managers should do?” or maybe “what does the USMS Ethics Officer do?”

I wish I knew the answers, but here’s an interesting article revealing how shoddy records management can be described as Gross Mismanagement! “The unit of the United States Marshals Service that manages complex assets seized in criminal cases, including that of Bernard L. Madoff, kept such shoddy records that it could not say who bought assets or how much was paid for them in 8 of the 55 sales it handled from 2005 to 2010, according to an audit released on Tuesday. . . . . The report said that the auditors found significant problems in how the service managed complex assets and that those problems “increased the risk that the government could mismanage the administration and disposition of forfeited assets.” The team disposed of $136 million in seized and forfeited assets from January 2005 to August 2010. “This audit identified numerous deficiencies in the procedures the complex asset team implemented to track, safeguard, value and dispose of complicated and valuable assets,” the report said. The report added that the team’s overseer, the Asset Forfeiture Division, did not “vigorously oversee” the team.”  Read all about it in “Auditors Find Chaos in U.S. Marshal’s Asset Sale Record-Keeping”!

So where are all those jobs anyway? From the audit: “Additionally, we found that the limited staff and resources of the Complex Asset Team were disproportionate to its responsibilities.  From 2005 to 2009, the number of staff varied between two and four individuals. . . . While the 14 forfeiture financial specialist contractors had extensive experience relevant to forfeiture, their primary assignment during Briskman’s tenure was to assist USMS district offices and not the Complex Asset Team.” . . . “Between 2005 and 2010, the small staff of the Complex Asset Team disposed of over $136 million in assets, yet it operated in an environment lacking the procedures to guide its actions and decisions pertaining to seized and forfeited assets.” . . . Uh, hold on, so four employees were responsible for keeping records on $136 million in assets?! $34 million per employee?! I wonder what their daily salaries were!

“Further, we identified inaccuracies in some monthly reports Briskman compiled. The summaries Briskman provided for many assets stated, “No change from previous report.” Yet, when we compared these entries to previous entries for such assets, we noted that some of the entries contained unexplained changes from the previous monthly status report. For example, the listed taxable profit amount for one asset varied from $9.5 million to $12 million between different reports; however, the entries provided to summarize the changes for this particular asset in subsequent reports were “no change from previous report.””

And check out the recommendations, including: “Recommendation 20: Ensure that managers know that they must thoroughly review financial disclosure forms and disclose any potential conflicts of interest to the USMS ethics office.”  Read the DoJ “AUDIT OF THE UNITED STATES MARSHALS SERVICE


NARA, Why Is the Government Destroying Our History? 2011/09/07

Posted by nydawg in Archives, Electronic Records, Intellectual Property, Privacy & Security, Records Management.
A colleague posted this sad (but true) story about the National Archives asking “Why Is the Government Destroying Our History?” and I noticed this set-up, “The U.S. National Archives and Records Administration (NARA) said it will destroy millions of federal court records and bankruptcy files from 1970 through 1995 but will hold those records the government deems “historically valuable.” . . . Ok, for those of you who think archivists and information purists are dirty curmudgeons who toil away amid dust balls to avoid socializing (on say, Facebook!), consider what is actually lost when these records are destroyed:

. . . Incrimination.  You are about to hire an executive. You call us to do a background check.  We find out he was charged with running a prostitution ring in the ’80s. Or, you are about to hire a new CFO.  You call us and during our research we find he has filed for personal bankruptcy protection three times in the last 15 years. ”

So, this is very troubling.  Offhand I don’t know what the retention schedules for court records and bankruptcy files are, but now it seems like the historians at NARA are convinced that they can describe these files as having “historic” value, but they won’t go near the “evidential” or “transactional” value.  Professional records managers are not making these decisions at NARA, because they would recognize the legal value.  So NARA, in its “infinite wisdom,” will decide whether or not large parts of our shared legal history have “historical value,” at the same time that it believes that redundant digital junk (e.g. 250 million George W. Bush emails) merits long-term preservation, while court records related to criminal activity may not have value in the eyes of a historian.  They’re going to throw out the original, authentic records and create a black hole in our shared knowledge of our judicial system!

Anyone remember when George W. Bush signed Executive Order 13233 to limit access to President Reagan’s records?  Well, now imagine that NARA is doing the same thing with federal records.  So what does the Federal Records Act (FRA) have to say about court records, or how does NARA deal with court records and bankruptcy files?

Well, in Spring 2008, this power was held by the federal records centers (FRCs) of the National Archives and Records Administration (NARA).  In a promotional piece, “Ready Access NARA’s Federal Records Centers Offer Agencies Storage, Easy Use for 80 Billion Pages of Documents,” they described how they were providing ready access.  “However, the majority of federal records—approximately 95 percent—are considered “temporary records.” Every temporary record has an official records retention schedule—that is, the amount of time it must legally be preserved for use before it is destroyed (usually by recycling). Retention schedules for temporary federal records vary widely, ranging from a few months to more than a century. For example, most agency information request correspondence is kept for less than a year. Individual tax returns are preserved for seven years. Corporate tax returns, while not considered “permanent,” must be retained for 75 years. And certain aircraft certification engineering files must be kept for 100 years.”
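
For readers who have never worked with a retention schedule, here is a minimal sketch of how one can be applied. The retention periods are the ones given in the NARA piece quoted above; the record-type names, the function names and the idea of counting from a single "closed on" date are my own simplifications, not NARA's actual rules.

```python
from datetime import date

# Retention periods (in years) as quoted in the NARA piece; key names are hypothetical.
RETENTION_YEARS = {
    "individual_tax_return": 7,
    "corporate_tax_return": 75,
    "aircraft_certification_file": 100,
}


def disposal_eligible_on(record_type, closed_on):
    """Date on which a temporary record reaches the end of its retention period."""
    years = RETENTION_YEARS[record_type]
    try:
        return closed_on.replace(year=closed_on.year + years)
    except ValueError:
        # February 29 has no counterpart in a non-leap target year; use February 28.
        return closed_on.replace(year=closed_on.year + years, day=28)


def is_disposable(record_type, closed_on, today):
    """True once the retention period has fully elapsed."""
    return today >= disposal_eligible_on(record_type, closed_on)


today = date(2011, 9, 7)
print(is_disposable("individual_tax_return", date(2004, 4, 15), today))
print(is_disposable("corporate_tax_return", date(1970, 1, 1), today))
```

The point is that disposal is supposed to follow from a schedule tied to the type of record, which is why it matters who writes the schedule and what kinds of value they recognize.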

I’m not exactly sure what they are doing, but I assume it’s something like saying that since the papers were digitized (scanned), the originals are no longer needed.  But for public records, NARA is the steward.  “The public can also access federal court records held by FRCs. These records include files from U.S. bankruptcy courts, the U.S. court of appeals, and U.S. district court civil and criminal files. FRCs make court documents available for researchers such as reporters writing stories on high-profile cases, former bankruptcy court litigants applying for mortgages or other loans, companies conducting background checks on individuals, and legal professionals researching precedents.”

Okay, so it’s an interesting piece from NARA, but this part really stopped me in my tracks: “The federal records centers have ably served the federal government and the citizens of the United States for more than 50 years. As the needs of federal agencies change and grow, NARA’s FRCs are also changing and growing to ensure that they will continue to protect the information assets of the federal government.”

I hope I’m not the only person to cry foul on this!  It drives me crazy especially when you check the FRC website and see how heavily invested they are in having a social media presence (Twitter, Facebook).


WikiLeaks’ Cablegate and Systemic Problems 2011/09/06

Posted by nydawg in Best Practices, Digital Archives, Electronic Records, Information Technology (IT), Media, Privacy & Security, Records Management, WikiLeaks.
Since late November of last year, the whole world has been watching as WikiLeaks got its hands on and slowly released thousands of classified cables created and distributed by the US over the last four decades.  As you may recall, the suspected leaker was Army Private First Class (Pfc) Bradley Manning who, undetected, was able to locate all the cables, copy them to his local system, burn them to CD-R (while allegedly lip-syncing to Lady Gaga), and upload an encrypted file to WikiLeaks.  (I’ve written about this previously, so I won’t get too detailed here.)

But last week, the story changed dramatically when The Guardian revealed that “A security breach has led to the WikiLeaks archive of 251,000 secret US diplomatic cables being made available online, without redaction to protect sources.  WikiLeaks has been releasing the cables over nine months by partnering with mainstream media organisations.  Selected cables have been published without sensitive information that could lead to the identification of informants or other at-risk individuals.”  To further confuse matters related to the origin of this newest leak, “A Twitter user has now published a link to the full, unredacted database of embassy cables. The user is believed to have found the information after acting on hints published in several media outlets and on the WikiLeaks Twitter feed, all of which cited a member of rival whistleblowing website OpenLeaks as the original source of the tipoffs.”  The Cablegate story, with all its twists and turns over the months, has left a big impression on me and, as an archivist and records manager, I think it is important to strip this story of all its emotionality and look at it calmly and rationally so that we can get to the bottom of this madness.

The first problem I have with the story, or more specifically, with the records management practices of the Defense Department, is the scary fact that a low-level Private first class (Pfc) would have full access to the Army’s database.  This became a bit scarier when we learned that Pfc Manning used SIPRNet (Secret Internet Protocol Router Network) to gain full access to JWICS (Joint Worldwide Intelligence Communications System) as well as the [civilian/non-military] diplomatic cables generated by the State Department.

So the first question I had to ask was: why does DoD have access to the State Department’s diplomatic cables?  Are they spying on the State Department?!  Well, maybe, but even if not, this staggering fact from a different Guardian article sent shivers down my spine:  “The US general accounting office identified 3,067,000 people cleared to “secret” and above in a 1993 study. Since then, the size of the security establishment has grown appreciably. Another GAO report in May 2009 said: “Following the terrorist attacks on September 11 2001 the nation’s defence and intelligence needs grew, prompting increased demand for personnel with security clearances.” A state department spokesman today refused to say exactly how many people had access to Siprnet.”

Other factors that scare the heck out of me related to “bad records management” and WikiLeaks Cablegate are the fact that there is a lack of CONTROL of these assets (they store everything online?!  Really?!); the DoD and State Department don’t use ENCRYPTION or cryptographic keys or protected distribution systems; the names of confidential sources were  not REDACTED in the embassy before uploading and sharing the cables with the world; their RETENTION SCHEDULES do not allow for some cables to be declassified and/or destroyed (so they keep everything online for decades and/or years); the majority of cables were UNCLASSIFIED suggesting that so many cables are created that they don’t even have enough staff to describe and CLASSIFY them in a better way?  The DoD didn’t have a method for setting ACCESS PRIVILEGES, or PERMISSIONS or AUTHORIZATION to ensure that a Pfc who is on probation would not be able to access (and copy and burn to portable media) all those cables undetected?!  There’s a question about password protection and authorization, but those problems could probably be covered with better ACCESS PRIVILEGES and PERMISSIONS.  Another question that leaves archivists confused is the idea that there seems to be limited version control.  In other words, it seems as if once a cable is completed, someone immediately uploads it, and then if the cable is updated and revised, a second cable will be created and uploaded.  This doesn’t seem to be a very smart way of trying to control the information when multiple copies may suggest differing viewpoints.
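
To make the point about ACCESS PRIVILEGES and PERMISSIONS concrete, here is a minimal sketch of a check that combines clearance level with need-to-know. Nothing here describes how SIPRNet or any real DoD system works; the `User` and `Cable` classes, the topic labels and the sample data are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical ordering of classification levels, lowest to highest.
LEVELS = {"unclassified": 0, "confidential": 1, "secret": 2}


@dataclass(frozen=True)
class User:
    name: str
    clearance: str
    need_to_know: frozenset   # topics this user's assignment actually requires


@dataclass(frozen=True)
class Cable:
    cable_id: str
    classification: str
    topic: str


def may_read(user, cable):
    """Allow access only if the user is cleared AND the topic is within need-to-know."""
    cleared = LEVELS[user.clearance] >= LEVELS[cable.classification]
    return cleared and cable.topic in user.need_to_know


# Invented sample data.
analyst = User("analyst in Iraq", "secret", frozenset({"iraq"}))
iraq_cable = Cable("C-001", "secret", "iraq")
iceland_cable = Cable("C-002", "confidential", "iceland")

print(may_read(analyst, iraq_cable))
print(may_read(analyst, iceland_cable))
```

Clearance alone answers "is this person trusted at this level?"; the second condition answers the question nobody seemed to be asking, which is "why does this person need this cable?"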

But perhaps the scariest part of the whole WikiLeaks’ Cablegate madness is simply that there was no TRACKING or TRACING mechanism so that the DoD could, through LOGS, trace data flows to show that one person (or one machine or one room in one building or whatever) had just downloaded a whole collection of CLASSIFIED materials!  [From the IT perspective, large flows of data may actually impact data flow speeds for other soldiers on the same network!]  And the fact that Pfc Manning was able to burn the data to CD-R suggests that when IT deployed the systems they forgot or neglected to DISABLE the burn function on a classified network!  (Okay, they’ve made some recent changes, but is it too late?!)
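
Here is a minimal sketch of the kind of TRACKING through LOGS I have in mind: total up what each account downloaded and flag anyone over a threshold. The log format (user, bytes), the account names and the threshold are all assumptions made for illustration, and a real system would obviously need much more than this.

```python
from collections import defaultdict


def flag_bulk_downloads(log_entries, byte_limit):
    """Sum downloaded bytes per user and return the users who exceed the limit.

    log_entries: iterable of (user, bytes_downloaded) pairs (hypothetical format).
    """
    totals = defaultdict(int)
    for user, nbytes in log_entries:
        totals[user] += nbytes
    return {user: total for user, total in totals.items() if total > byte_limit}


# Invented sample log for one day.
entries = [
    ("analyst_a", 2_000_000),
    ("analyst_b", 500_000_000),
    ("analyst_b", 700_000_000),
    ("analyst_a", 1_000_000),
]

flagged = flag_bulk_downloads(entries, byte_limit=1_000_000_000)
print(flagged)
```

Of course, a report like this only helps if somebody actually reads it, and if the logs are stored where the person being logged cannot erase them.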

Many assume that Digital Forensics will provide a new way to authenticate data.  Well, if so, then why can’t they run a program on the cables and find out which system was used to burn the data and then trace the information back to the person who was using the machine at that time, as opposed to putting a soldier in jail, in solitary confinement, awaiting trial, convicted merely on a hearsay online chat he had with a known hacker?!  One other important consideration that also scares me: The military uses Outlook for their email correspondences, and Outlook creates multiple PST files.  As the National Journal puts it: “So how did Manning allegedly manage to get access to the diplomatic cables? They’re transmitted via e-mail in PDF form on a State Department network called ClassNet, but they’re stored in PST form on servers and are searchable. If Manning’s unit needed to know whether Iranian proxies had acquired some new weapon, the information might be contained within a diplomatic cable. All any analyst has to do is to download a PST file with the cables, unpack them, SNAP them up or down to a computer that is capable of interacting with a thumb drive or a burnable CD, and then erase the server logs that would have provided investigators with a road map of the analyst’s activities.”

Obviously the system was broken, informants’ security was compromised, our secrets are exposed, and the cat is out of the bag!  Yet even now, many are unwilling to listen to or heed the lessons we need to learn from this debacle.  Back in January, I attended a WikiLeaks panel discussion hosted by the Archivists Round Table of Metropolitan New York and was surprised to hear that most of the issues raised above were ignored.  I tried to ask a question regarding the systemic problems (don’t blame Manning), but even that was mostly ignored (or misunderstood) and not answered by anyone on the panel.

In my opinion, we have very serious problems related to best practices for records management.  If you look closely at DoD 5015.2, you can see that the problems are embedded in the language for software reqs, and nobody is looking at these problems in the ways that many archivists or records managers do (or should).  But honestly, the most insightful analysis and explanation came from Manning himself: “‘I would come in with music on a CD-RW labeled with something like ‘Lady Gaga,’ erase the music then write a compressed split file,’ he was quoted in the logs as saying. ‘[I] listened and lip-synced to Lady Gaga’s ‘Telephone’ while exfiltrating possibly the largest data spillage in American history. Weak servers, weak logging, weak physical security, weak counter-intelligence, inattentive signal analysis … a perfect storm.’”

So maybe it is time for the military, the US National Archives, and all computer scientists and IT professionals to stop relying on computer processing and automated machine actions and start thinking of better ways to actually protect and control their classified and secret data.   Perhaps a good first move would be to hire more archivists and try to minimize the backlog quantity of Unclassified cables!  Or maybe it’s time to make sure that the embassies take responsibility for redacting the names of their sources before uploading the cables to a shared network?  And maybe it is time to consider a different model than the life cycle model, one which will account for the fact that these cables will often be used for different functions by different stakeholders over the course of their existence.

Digital New York: Still a Few Bugs in the System 2011/09/05

Posted by nydawg in Curating, Digital Archiving, Education, Electronic Records, Information Technology (IT), Media.
Hurricane Irene (not to scale)

Many of you know that I missed all the excitement last week as Hurricane Irene bore down on the New York area.  I was in Chicago for the 75th Annual Meeting of the SAA (Society of American Archivists) and it got so bad that I received warning emails from my mother and my oldest brother.  [I assume they had received but not read my itinerary which clearly showed that I was heading to Minneapolis/St Paul after the meeting.]  So I figured I was in the clear until I realized sometime on Friday, “Whoops! I forgot to close my windows!”  So I guess I can say I was tangentially affected (by guilt caused) by Tropical Storm Irene. . . .

But as the story was developing, I was in touch with friends back East and learned that some who live in my neighborhood were advised to evacuate!  My ex-girlfriend evacuated our two (Brooklyn) cats to Manhattan, and sent me pictures!  Well, I live close enough to the East River to start to worry about my (second floor) apartment.  With a little research, I learned that I could find the evacuation areas on nyc.gov.  But on Saturday, I didn’t have any luck accessing the PDF or whatever it was.

So this morning, I stopped for a cup of coffee in Champion, and happened to read an article that “The New York Times reported that the city’s official website, www.nyc.gov, was down on the morning of Friday, Aug. 26.  The news outlet suggested that the site was overwhelmed by people looking for information about the hurricane. As of 1:30 p.m. Pacific time, however, the site was back online.   The timing couldn’t have been worse. In what New York City Mayor Michael R. Bloomberg called a “first time” for the city, he ordered a mandatory evacuation of various coastal areas of the city’s five boroughs, covering roughly 250,000 people.”  So this is dysfunctional modern-day disaster planning.

From the Times article “City Learns Lessons From the Storm, Many of Them the Hard Way” we learn that “For example, the mayor’s office had predicted a surge in Web traffic on nyc.gov when it issued the evacuation order. But nobody expected five times the normal volume of traffic. By Friday afternoon, computer servers had become severely overloaded. The Web site sputtered and crashed for hours, when New Yorkers needed it most.  In the future, the city will try to modify the Web site so that it can be quickly stripped down to a few essential features  —  like an evacuation map, searchable by ZIP code —  that are in highest demand during an emergency.”

Hurricane Irene: NYC Evacuation Zones

I’m curious what the “normal volume” of traffic on that webpage is.  But it seems to me that this is ultimately a problem with making information accessible without thinking it through: an end-user (who may have to evacuate his/her house!) has to first click on the PDF, then download it, wait for it to finish downloading, launch it, and then search for the data needed. . . .  The fact that this is not an integrated system where a person can easily plug his/her ZIP code into an online form to find out if his/her house is in an evacuation zone suggests that the system is not very functional, that best practices are not in use, and further, that perhaps the metrics used to show how vital Digital New York is are the wrong metrics to use.
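To make the point concrete, here is a minimal sketch of the kind of ZIP code lookup I have in mind.  It is only an illustration: the function names and the `SAMPLE_ZONES` table are my own inventions, the ZIP-to-zone pairs are placeholders rather than real NYC assignments, and I have no idea how the city actually stores this data.

```python
# Placeholder data only: these ZIP-to-zone pairs are made up for illustration
# and are NOT real NYC evacuation zone assignments.
SAMPLE_ZONES = {"00001": "A", "00002": "B", "00003": "C"}


def lookup_zone(zip_code, zones=SAMPLE_ZONES):
    """Return the zone letter for a five-digit ZIP code, or None if unknown."""
    cleaned = zip_code.strip()
    if len(cleaned) != 5 or not (cleaned.isascii() and cleaned.isdigit()):
        raise ValueError("expected a five-digit ZIP code, got %r" % zip_code)
    return zones.get(cleaned)


def evacuation_message(zip_code, ordered_zones=("A",), zones=SAMPLE_ZONES):
    """Turn a ZIP code into a one-line answer a worried resident can act on."""
    zone = lookup_zone(zip_code, zones)
    if zone is None:
        return "No evacuation zone on file for this ZIP code."
    if zone in ordered_zones:
        return "Zone %s: an evacuation order is in effect." % zone
    return "Zone %s: no evacuation order at this time." % zone


if __name__ == "__main__":
    print(evacuation_message("00001"))
    print(evacuation_message("00002"))
```

A lookup like this returns a few bytes of text instead of a multi-megabyte PDF, which is presumably part of why a stripped-down emergency page would hold up better under heavy traffic.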

Why wouldn’t the IT staff at DoITT consider creating mirror sites for downloading the PDFs?  So the first victim of Hurricane Irene was NYC.gov.  “In a tweet earlier this morning the city’s Chief Digital Officer apologized for the outage while giving specific links (which were also frequently down) to find the city’s hurricane evacuation map (we’ve included it below for your convenience). And the city’s main Twitter feed just put out a similar tweet. Which means, damn, a LOT of people must be trying to access the city’s website. We’ve e-mailed to find out just how many users it takes to take down nyc.gov but have yet to hear back.”

Well, fortunately, they’ve probably learned some lessons from this hysteria, and it seems like no one suffered much damage in this area and, ironically (or fortunately) September is a good time to Get Prepared: “National Preparedness Month . .  . a nationwide campaign to promote emergency preparedness and encourage volunteerism.”  To learn more about NYC’s Digital Strategy and the Chief Digital Officer check here for the Road Map.

Arab Spring Diplomatics & Libyan Records Management 2011/09/05

Posted by nydawg in Archives, Best Practices, Digital Archives, Digital Preservation, Electronic Records, Information Technology (IT), Media, Records Management.
At the 75th Annual Meeting of the SAA (Society of American Archivists) last week, I had the good fortune to attend many very interesting panels, speeches and discussions on archives, archival education, standards, electronic records, digital forensics, photography archives, and digital media, and my mind is still reeling.   But when I heard this story on the news radio frequency, I needed to double-check.

As you all know, the Arab Spring is a revolutionary wave of demonstrations and protests in the Arab world. Since 18 December 2010 there have been revolutions in Tunisia and Egypt; a civil war in Libya resulting in the fall of the regime there; civil uprisings in Bahrain, Syria, and Yemen; major protests in Algeria, Iraq, Jordan, Morocco, and Oman; and minor protests in Kuwait, Lebanon, Mauritania, Saudi Arabia, Sudan, and Western Sahara! Egyptian President Hosni Mubarak resigned (or retired) and there’s a Civil War going on in Libya.   Meanwhile, with poor records management, documents were found in Libya’s External Security agency headquarters showing that the US was firmly on their side in the War on Terror:

“CIA moved to establish “a permanent presence” in Libya in 2004, according to a note from Stephen Kappes, at the time the No. 2 in the CIA’s clandestine service, to Libya’s then-intelligence chief, Moussa Koussa.  Secret documents unearthed by human rights activists indicate the CIA and MI6 had very close relations with Libya’s 2004 Gadhafi regime.

The memo began “Dear Musa,” and was signed by hand, “Steve.” Mr. Kappes was a critical player in the secret negotiations that led to Libyan leader Col. Moammar Gadhafi’s 2003 decision to give up his nuclear program. Through a spokeswoman, Mr. Kappes, who has retired from the agency, declined to comment.  A U.S. official said Libya had showed progress at the time. “Let’s keep in mind the context here: By 2004, the U.S. had successfully convinced the Libyan government to renounce its nuclear-weapons program and to help stop terrorists who were actively targeting Americans in the U.S. and abroad,” the official said.”


So I guess that means that if all of those documents from the CIA are secret, there would be no metric for tracing a record (at least on the US side).   In other words, every time a record is sent, copied or moved, a new version is created, but where is the original?  Depending on the operating system, the metadata may have a new Date Created.  How will anybody be able to find an authentic electronic record when it’s still stored on one person’s local system which is probably upgraded every few years?
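One partial answer to the “where is the original?” problem is to compare copies by a checksum of their contents rather than by their dates.  The sketch below is my own illustration, not anything the agencies involved are known to use, and the function names are hypothetical.  It assumes SHA-256 from Python’s standard `hashlib` module; a matching digest only shows that two files hold the same bytes, not who created the record or which copy came first.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file's contents, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True when two files are byte-for-byte identical, whatever their dates."""
    return sha256_of_file(path_a) == sha256_of_file(path_b)
```

If a digest is recorded when a record is first captured, any later copy can be checked against it, even after the Date Created has changed along the way.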

There is a better way, a paradigm shift: the Australian records continuum, which “certainly provides a better view of reality than an approach that separates space and time”.  By looking at it, we can find a way to avoid aggregating all the [useless] data that gets created.   With better and more appraisal of critical, analytical, technical and IP content, we can select and describe the born digital assets more completely and separate the wheat from the chaff, the needles from the haystacks, the molehills from the mountains, and (wait for it)  . . . see the forest for the trees.  By storing fewer assets and electronic records more carefully, we can actually guarantee better results.  Otherwise, we are simply pawns in the games of risk played (quite successfully) by IT Departments assuring (but not insuring) the higher-ups that “we are archiving: we backup every week.” [For those who are wondering: when institutions “backup” they back up the assets one week, move the tapes offsite and overwrite the assets the following week.  They don’t archive-to-tape for long-term preservation.]

Diplomatics may present a way for ethical archivists into the world of IT, especially when it comes down to Digital Forensics.  But the point I’m ultimately trying to make, I think, is that electronic (or born digital) records management requires new skills, strategies, processes, standards, plans, goals and better practices than the status quo.  And this seems to be the big elephant in the room that nobody dares describe.

Back from the SAA Annual Meeting #saa11 2011/09/01

Posted by nydawg in Archives, Education, Electronic Records, Records Management.
I’m back from the annual meeting of the SAA, and I had a blast.  I had the opportunity to hear many archivists and historians opine on best practices, photographic collections, digital forensics, electronic records management and a whole lot of other interesting topics.  And I hope to write about it soon.

Before I do, though, I wanted to write briefly about something I noticed.  Last Wednesday, I had a 7:38 am flight and left Brooklyn at 5:15.  The night before, I had used the MTA Trip Planner and learned that there was a G train at 5:30 connecting to the E at 5:44 to arrive at AirTrain by 6:45.  Plenty of time to go through security and catch my 7:38 flight. . . .  But nooooooooo.

For some reason, the G train was late, and the next E train that came was running local (stopping at all stops), so the estimated 30 min. subway ride became a 60 min. trek, leading to Security Check in at JFK (Jet Blue).  So I get my ticket and arrive at security at 7:10 am, and there’s about 1000000 people in front of me.  (Okay, maybe 100).   I finally made it through at around 7:30 and made a mad dash for the gate, without even time to put on my belt!  Alas, I got there too late, and the door to the plane was closed.  The plane was still there (I was 8 mins early), but they had stopped boarding.  I asked the woman at the help desk and reserved a space on the next flight (8 hours later) and paid $40. . . . Anyway, I learned a lesson, and blogged a couple times while there.  (something about flying suitcases)

So what did I learn?  Well, when I returned from Minneapolis, we touched down at 6:01 pm and I was home, in my apartment, at 7:31.  The E train was running express!

Well, this all got me thinking.  Maybe the MTA Trip Planner doesn’t make the connection or association that some E trains don’t run express early in the morning?   If its algorithm is relying on outdated information, it could cause problems.  Or maybe the MTA Trip Planner should give multiple trip and time suggestions and offer a way to browse through the results, so people aren’t blue-skying their journeys to JFK airport. . .

Okay, so I learned that I need to leave that extra 15 minutes early even when doing due diligence.   It wasn’t that bad, but it is a bit ironic that I spent eight hours waiting to take a 60 minute flight to Chicago followed by a 30 minute taxi to the gate.

One other thing I noticed was soldiers walking through airports carrying huge backpacks filled with who-knows-what.  I wonder if the military have embraced wheels on suitcases yet as per a previous post.