The Online Photographer

Check out our new site at www.theonlinephotographer.com!

Thursday, June 15, 2006

Storage: An Editorial

In the past week, we've endorsed two storage products on our site: MAM-A gold disks and Glyph hard drives. A great many commenters have written to say that neither is perfect or even adequate. Well, yes. That's right.

Years ago, working in a commercial darkroom, I happened to move a big old bench that some of the enlargers sat on (I'm a slob in real life, so don't ask me why I end up cleaning every darkroom I work in to within an inch of its life). On the floor behind it, in all the dust and spiderwebs, I found a strip of three 120 negatives. The picture in the middle was of a nude woman in one of those 1940s-style pinup poses that hide as much as they reveal.

Naturally, I cleaned off the negative and made a print of it.

It wasn't a very good picture, and the negative had been underdeveloped. The point is that it was at least 50 years old at the time, and it had lasted all that time—not only without pampering, but in the absence of human care of any sort.

A little while ago a comment came in about our post concerning the hard drives. The point was good; I just thought it took the wrong tone. It sounded kinda nasty. If your data is so important—I'm paraphrasing—why don't you get a RAID like the NSA or the DoD?

Yeah—why don't ya, ya idiot?

Deconstructing that idea is very instructive, it seems to me. First of all, if your data is so important—the implication being that you're really just a pretentious snob if you think your pictures are worth preserving. Second, if you do happen to lose all your data, well, it's because you didn't have redundant storage like, oh, say, the Department of Defense. So, in other words, it's your own fault.

I used to have a very unscrupulous boss who always said the same thing to everyone who complained about the store's bad service. "In all our years in business," he'd claim, "I've only had two complaints. You are the third." I would say we got complaints of one sort or another from about 60% of our customers—hundreds and hundreds in just the short time I worked there. But none of them knew that. For all they knew, my boss was telling the truth.

A concept similar to this has been very important in the development of, say, the personal computer. When you have computer problems, it's not the product's fault. It's your fault. You don't know everything there is to know about computers, do you? If you did, everything would have worked smoothly. That's the implication.

This greatly increases people's tolerance for downtime. It's not an unreliable product; it's user error. Not their fault, your fault. So for twenty years or more, computers were awful and buggy and recalcitrant and infuriating. But it didn't hurt the product segment. Consumers were happy to blame themselves.

Actually, it is the product's fault. Most people have serious problems with computers at least intermittently. They should be more reliable and more bulletproof. They should be better.

I just wonder if we haven't entered a similar paradigm where digital image storage is concerned. Digital photography has really only been part of the hobbyist mainstream for roughly ten years. And, truthfully, many of the people I happen to know who've actually been creating work digitally since 1996 have already either lost some work or had some work rendered inaccessible by the loss of the software or hardware necessary to recover it.

The implied response by the photo industry might as well be the same as my surly commenter's: Who are you, Richard Avedon? What makes your pictures so important?

Or, well, too bad you lost eight years of shooting, but you know, you really should have been backing up every day to three expensive outboard hard drives, and been recopying all your storage CDs every two years, and...and....

Your fault. Ya idiot.

Except that it's not. I just wonder what's going to happen as digital moves outward in time from its "cradle days," and people start regularly losing more and more of their work. Because, y'see, actually, it's not unreasonable for people to want to keep their own work intact over periods of time measured in decades, and we shouldn't all need to gradually become full-time archivists in order to ensure that that happens.

There should be at least one more or less affordable storage medium that you can drop behind a bench for 50 years and have it still be usable when it comes back out. So that you can choose it if you want it or need it.

The digital imaging industry as a whole ignores this aspect of its technology at great peril not only to our own memories and our shared record of our culture, it seems to me, but to its own future.

—Mike Johnston
"The Online Photographer"
http://www.theonlinephotographer.blogspot.com


The above may be copied, republished, and re-posted as long as the author's name and website address are reproduced with it.

20 Comments:

Blogger fivetonsflax said...

"A concept similar to this has been very important in the development of, say, the personal computer. When you have computer problems, it's not the product's fault. It's your fault."

The fault is with marketing. Computers work according to their nature. We hide that nature under abstractions so we can forget that we're dealing with subtle, complex and fragile machines. Marketing sells the abstractions, not the realities; there's no place in their discourse for all the subtlety in a computing device.

When the mask slips, the consumer is disappointed. "I didn't sign up for this!" And he didn't. It's not his fault, and it's not the computer's fault. It's the fault of the people who lied by omission, the marketers.

I'd be very happy to see some kind of really permanent digital storage, but it's not as if some company could start churning 'em out next week. There are basic problems. How do you connect to such a device? USB, say? Ok ... so what happens when USB becomes obsolete and new computers don't have it any more?

Using open file formats is a good step, but the problem of physical format is a tough one.

4:41 PM  
Blogger Brandon J. Scott said...

This is the issue that is preventing me from fully embracing digital image capture. I know I am not well organized. I know that I tend to lose digital files. I know that drives fail on a fairly regular basis.

I've read articles and books on digital asset management, but every method takes a considerable amount of time and applied effort.

Some of my negatives and slides are hanging in a file cabinet and some are in archival boxes. They are not well organized, and it is not easy to find a single image, but they are there and unless there is a major catastrophe (which could happen) they will continue to be there. The file cabinet is not going to fail, and the boxes are not going to spontaneously combust. I don't have to worry about giving in to the temptation to delete old images to make room for new images.

I'm sure my photographs don't mean much to many people, but they mean something to me. I want to be able to review my work for the rest of my life, and I am not convinced that I have the discipline to maintain my digital files in such a way that they will still be available to me in 5, 10, 20, or 40 years' time.

5:17 PM  
Blogger Ken Tanaka said...

FWIW, Mike, I was actually very pleased that you posted the article on the gold archival DVDs. I had found the Belkin brand (presumably) of these discs. I suspected that they were OEMed from someone that offered them for le$$ but I could never find them. Bingo! My stack of 50 arrived today. I be very happy.

Regarding folks criticizing your "endorsements", well, it's to be expected particularly with regard to computer-related gear. Actually, constructive alternate suggestions are at the core of the value of presenting public suggestions at all.

Long-term storage of digital images is an enormous problem that stains deeper than personal collections. There's no immediately accessible solution to it. Creating more longer-lasting DVDs is possible but at much greater expense (many tens of thousands of dollars) than most people are willing to invest (thanks largely to some really nasty Hollywood-related royalties). Strange and sad as it may seem the print may well be the last man standing when all other media have expired.

5:34 PM  
Blogger Doug said...

What's "affordable"?

Using the price of that 120 negative film as a benchmark, it's affordable to simply keep our pictures on CF cards. A "normal speed" Kingston 2GB CF card is $63 (US) at Amazon right now and I can stuff about a thousand 8-Mpix JPEGs onto one. Or about 250 of my 8-Mpix Raw files.

Those darned CF cards seem to be virtually indestructible. The only real question is the future availability of CF card readers, but that's somewhat of a question for any kind of storage medium.

More expensive than hard drives or DVDs? Sure. But less expensive than film ever was. And getting even less expensive every day.

5:46 PM  
Blogger Ted Kostek said...

Mike,

This one is so easy, I'm surprised no one has called you on it yet:

"There should...be [a] storage medium that [will last] for 50 years... ."

There is: film! ;-)


On a different note, there was a list, supposedly by Bill Gates, saying that if cars had kept up with computers we'd all get 100 miles per gallon and drive 200 miles per hour.

A counter list appeared, supposedly by the CEO of GM, saying that if cars were as good as computers they would stop randomly while on the freeway, and we would consider this normal and acceptable.

In any event, digital reliability will continue to improve. In the meantime it's part of the price we pay to buy the flexibility of digital capture and manipulation.

6:00 PM  
Blogger John Bates said...

I've got a lot to say on this topic, and no time to say it.

1. "Data provenance" is an industry term, and an active area of investigation. I went to an interesting talk by an IT director at the National Archives (I think... can't remember the organization affiliation). It's a huge problem, with many subtleties, and it affects both digital and analog. (He gave a nice example: Rembrandt painted for people who would probably be viewing his paintings by candlelight. If they are displayed under fluorescent lighting, how much of the original intent is lost?) It's an active area of investigation, and it should be noted that open file formats only *start* to address the problem.

2. Storage should always be evaluated in terms of cost/benefit ratios. RAID, for example, is relatively costly when compared to a good backup process. And since we all have good backup processes (right?), it rarely makes sense for most users, UNLESS availability is also a requirement. I know that I talked about RAID as a reliability solution in my previous comment, but that's in the specific context of the reliability of a single storage system, not your overall storage workflow.

A standard practice in my world is storage tiering, in which people evaluate their data in terms of what it's worth to them. ("If your data is so important...") Part of my work is trying to get people to think in terms of these tiers, and to assign value to their data, so that they can sensibly develop storage plans. (Actually, I'm working hard on tech to develop plans for people: I want to get to a turn-key solution for you, Mike.)

3. A cheap fifty-year digital storage medium isn't gonna happen any time soon. Sorry, Mike. What might be possible would be a cheap fifty-year archiving solution, in which you pay somebody else to make sure that your data will live on. Iron Mountain offers such services to the corporate world now.

4. Users *are* conditioned to blame themselves when there's a problem. There's a really stupid mentality out there, that says people who understand computers are really smart, and if you can't figure something out, it's because you're just not smart enough. Well, I know plenty of really stupid people in the field, and they have the power to make users' lives miserable.

That having been said: do you blame your car's manufacturer when your seven-year-old muffler rusts out? Sure, they might have built a redundant muffler system, and coated it with a super-high-tech polymer, and sealed it in multiple layers of shielding. But parts wear out. It is your responsibility to plan on ways to cope. Exactly how much time, effort, and money you put into coping depends on:

a) How valuable the system and its contents are to you, and
b) The cost and convenience of the tools available to you. (That's the part I'm working on.)

6:03 PM  
Blogger J DM said...

I've managed to lose, misplace or otherwise forget thousands of negs over the years. Sorry, but I've been in a business that counts on impermanence (and repetition). Digital photography multiplies the problem. Lemme think, say a thousand or two photos a year that qualify as 'decent', and maybe 200 that are legitimate keepers. Wouldn't it be easier to publish a book every two years with the best photos? Use rag paper and good binding, and old technology offers an answer. Too low tech?

6:17 PM  
Blogger m. said...

A yearly book is not such a bad idea. I had a grad school friend ask my advice on backing up her thesis. I told her that, in addition to whatever else she did, she ought to print the thing out once a week. Worse comes to worse, she has to type it in again.

8:41 PM  
Blogger LostBryan said...

First, it's worth reading this paper and others like it. The bottom line is "just because you have backups doesn't mean the data is any good"
http://www.lockss.org/locksswiki/files/3/30/Eurosys2006.pdf

Second, film has similar problems (notably fading) but because it's analog it's more forgiving. And since we often lose it and then forget about it, it often appears more forgiving than it is. Then various reports focus on the images saved/found/etc and don't report (since they don't know) about the film simply lost, faded to utter uselessness, and so on. All of this makes film and paper look better than they are.

Third, computer characteristics are driven by technology/physics issues AND market issues. What sells is fast/big/cheap. Period.

Fourth, even if you could reliably keep every file you ever made (even if just photos) FINDING the best files for any purpose is still a very hard task.

Fifth, for everybody who isn't super famous, when you are dead, your work will be remembered by the books, prints, and perhaps web pages you create. So the best way to preserve the best images from any era is to print them the best way you know how and store them well. As a book is a good plan. A yearly "year's best" box is another decent plan. But prints must be your legacy, because essentially nobody is going to grovel over the hundreds of thousands to millions of digital files that people will make over a lifetime, unless you rise to the stature of Garry Winogrand or Ansel Adams.

10:41 PM  
Blogger Oly said...

RAID isn't really such a bad idea. For instance, say you had 3 300GB drives. Add a fourth one and you could get RAID protection for all that data.

That drive cost, say, $200. How much is 900 GB of DVDs? $75ish? How much is your time burning 200 DVDs worth?

Burning DVDs and erasing those images from the drive for space is certainly the most economical. But RAID isn't as bad as you'd think. It's really too bad they gave it such an alarming name. I bet if they'd called it NiftyDrive or FluffyBunnyDisk more people would try it and find it solved their problems at a fair price.
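
For readers wondering how one extra drive can cover three, here is a minimal sketch. It assumes a single-parity layout in the style of RAID 5, which the comment above does not actually specify, and it uses short byte strings as stand-ins for drives. The names `xor_blocks`, `data_drives`, and `parity_drive` are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three "data drives" holding equal-sized blocks (toy contents).
data_drives = [b"photo-A1", b"photo-B2", b"photo-C3"]

# The fourth "drive" stores only the parity of the other three.
parity_drive = xor_blocks(data_drives)

# Simulate losing the second drive, then rebuild it from the
# two surviving data drives plus the parity drive.
survivors = [data_drives[0], data_drives[2], parity_drive]
rebuilt = xor_blocks(survivors)

print(rebuilt == data_drives[1])
```

The same arithmetic shows the limit of the scheme: lose two drives at once and there is no longer enough information to rebuild either one.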

10:57 PM  
Blogger John Bates said...

RAID does not replace backups. It protects against one and only one risk: the failure of a single hard drive. (More complex generalizations of RAID algorithms do protect against multiple drive failures.)

It does not protect against: accidental deletions, operating system errors, corruption due to power failures, or site disasters like fires or floods.

There are two reasons that a company (or the NSA, or the DoD) uses RAID as part of its storage systems:

1) Availability: these folks measure the cost of downtime by the second. Some single transactions are worth millions of dollars. Losing access to their database, or worse, losing a transaction entirely, is a major, major deal. RAID offers them some protection during the hours between backups.

2) Performance: some workloads perform better under some RAID algorithms. Less of a concern for most companies, as there are usually easier ways to get performance.

Getting a reliable backup strategy in place should be much more of a priority than spending money on a RAID system, unless you have that kind of availability concern. True, disks are cheap, but there are other trade-offs in performance and complexity. (Software RAID is much, much slower than hardware RAID.)

11:58 PM  
Blogger John Bates said...

I just had an interesting thought, so forgive me, please, if I ramble on a bit more here. But it's working towards Mike's 50-year storage media, so here goes.

As a number of previous posters mentioned, the problem of storing data isn't new: it's just that we've only started to notice it more with digital data. We're accustomed to losing analog data: prints fade, negatives get scratched, etc. This data loss is acceptable, though often disappointing, because we can still perceive the original intent. With digital data, some bits matter much more than others, and the loss of a single bit can lead to the irretrievable loss of the entire dataset.

So one approach would be to retransfer the digital back to an analog format. Print it onto paper, or some transparency. Now you're back to solving the age-old problems of storing prints and "negatives".
That's a pretty good solution (for data that makes sense in an analog format).

But what if we could figure out a way to make the digital media act more like analog? (I'll have to look more into the way in which CD/DVD media degrades. If it's catastrophic, my idea isn't so hot.) What I mean is: how do we make digital media more tolerant towards bit rot?

The answer is a mantra of mine: make the software more reliable than the hardware. Accept that the media will gradually lose data, and build the coding into the software. It'll burn up a lot of capacity, of course.
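
One very simple way to picture that idea is a repetition code: keep several copies and let the software out-vote the damaged bytes. The sketch below assumes three copies and a byte-wise majority vote. It is only an illustration of the principle, not how any real disc format does its coding, and `majority_vote` is a hypothetical helper name.

```python
from collections import Counter


def majority_vote(copies):
    """Rebuild data by taking the most common byte at each position."""
    return bytes(Counter(column).most_common(1)[0][0] for column in zip(*copies))


original = b"family-portrait-1946"

# Store three copies, then let a different byte "rot" in two of them.
copy_a = bytearray(original)
copy_b = bytearray(original)
copy_c = bytearray(original)
copy_a[3] ^= 0xFF
copy_b[10] ^= 0xFF

recovered = majority_vote([copy_a, copy_b, copy_c])

print(recovered == original)
```

Tripling the storage to survive scattered damage is exactly the capacity cost mentioned above. Practical codes get similar protection with far less overhead, at the price of more involved math.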

Hmmph. Now I need to find out how DVD dyes fade.

12:36 AM  
Blogger Robert Roaldi said...

I worked as a software engineer for 25 years and have grown to hate home personal computers. Some aspects, like this forum and email, are nice, but the price we pay for this convenience is too high at the moment. They should be appliances, but they're not. They require far too much maintenance and attention. I think that the reason for this is the manufacturers discovered that they could save a lot of time and money in good design and product testing by marketing second rate product to tinkerers. The tinkerers live to sit around and find bugs. To them, that is the entire point. I want a computer to do things for me, not the other way round. Now we all have to become tinkerers, like it or not.

Long-term storage in the film age was imperfect but at least it was passive. Once you stored the negs/slides in the shoe box, you didn't have to do anything more. Sure, you could lose the box or accidentally set fire to it or who knows what else. With digital files, you have to actively maintain a level of computer knowledge throughout the rest of your life, upgrade equipment from time to time and copy and re-copy the data to new (and better?) media. If you don't, the stuff WILL disappear.

However, we're stuck with computers now and there's no going back. We used to have access to a film processing industry that did all the tedious stuff for us, at a reasonable cost, and we are giving it up in favour of doing it all ourselves, and we're not spending any less.

We need permanent storage but the high-tech industry does not want permanence. We want long-term stability, they hate it. In that sense, we are adversaries. Why aren't there professional backup services that will burn our files onto the same grade of CD as the music industry uses? My understanding is that those things are far more stable than the kind we burn at home (something to do with metal vs. vegetable dyes?). All my commercial CDs, now close to 20 years old, still work fine. That seems pretty close to archival to me.

I remember that in the early days of computers, one of the (few) reasons for having one at home was to store recipes. This was part of the justification that guys used to convince their wives to let them buy a computer. At our house, we print the recipes out on paper, put them in a binder to keep around the kitchen. The fact that the recipe is on the computer is beside the point.

7:25 AM  
Blogger Joe Decker said...

I too look forward to the day where digital storage is more inherently permanent, but it's just not the case.

Brandon: I do think it's possible to organize a backup scheme that gives a lot of protection and doesn't eat your life. I'm not the DOD, but I did ask for "RAID1" on my main desktop machine. I paid a couple hundred bucks for it. I spent another several hundred buying two external hard disks. On the first of each month, on my calendar, I take the one I have at home, plug it in, and press a button before I go to sleep. In the morning, it's made a copy of my desktop onto that external hard drive. I give that to my wife, who takes it to work, and then brings back the other one--note that there's always one copy (maybe a month old) not in my house. My time investment per month is probably about ten minutes, my total cost is probably around a thousand dollars, and my digital files are, on the whole, a lot better protected against disaster than my slides will ever be. And that work protects not only my digital image files, but other critical resources like my mailing list.

7:36 AM  
Blogger sbug said...

Wow. Can I get an AMEN? I couldn't agree more with your post.
I believe the underlying problem is that we are looking to store a very immature product (digital images) by using another fairly immature product (computers, hard drives, etc). Yes there is film and most any format can be 'read' still today but not particularly easily unless it is 35mm and maybe 120. If we look past the random failure of digital image storage medium and look to the accessibility of the image, changes in file formats and connectivity are really not all that dissimilar to figuring out what to do these days with old film negatives that are anything other than 35mm.
I know most are hoping that the breakneck pace of digital camera development will wane and become more evolutionary than revolutionary. We will need to have the same turning point come with storage and the unfortunate issue is that we are not nearly as far along on the development curve.

7:58 AM  
Blogger winkalman said...

It's old news, but considering how many people have mentioned archiving digital images by analog means I thought this might be of interest.

http://www.dcviews.com/press/Minox-Laboratories.htm

There you go Mike, there's your fifty years!

8:16 AM  
Blogger Ade said...

With my propeller head on, this seems a suitable moment to mention ZFS, a new filesystem architecture from Sun Microsystems. At the moment, it's being targeted at the enterprise but the code is out there and can be used anywhere - no reason it shouldn't make it down to home systems one day, or at least third party remote data storage silos.

Unlike existing filesystems, ZFS doesn't pretend that the underlying disk is reliable; in fact, it assumes it isn't. So it checksums all the data, has various reconstruction strategies in the event of errors and guarantees integrity. It can take snapshots of the data (point-in-time read-only copies) for backup purposes. It shouldn't matter whether your hardware is disk, flash or tomorrow's zippy new thing, ZFS will simply treat it as one big storage pool and carry on safeguarding your data.
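
The checksum-and-repair idea can be sketched in a few lines. This is loosely in the spirit of what is described above and is not ZFS's actual implementation. It assumes two mirrored copies of a block and a SHA-256 digest kept apart from the data, and `checksum` and `read_verified` are hypothetical names.

```python
import hashlib


def checksum(block):
    """Return a SHA-256 digest for a block of bytes."""
    return hashlib.sha256(block).digest()


def read_verified(copies, expected):
    """Return the first copy whose checksum matches; raise if none does."""
    for block in copies:
        if checksum(block) == expected:
            return block
    raise IOError("all copies failed checksum verification")


block = b"raw sensor data"
stored_sum = checksum(block)        # kept separately from the data itself

mirror_a = b"raw sensor dat\x00"    # silently corrupted copy
mirror_b = block                    # intact copy

good = read_verified([mirror_a, mirror_b], stored_sum)

print(good == block)
```

The point of the design is that the disk is never taken at its word: a copy is only returned once it has been checked against a digest stored elsewhere.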

I'll skip the rest of it. The basic features are described here: http://www.opensolaris.org/os/community/zfs/whatis/

I mentioned third parties above because this data resilience issue isn't going to be solved by anything that requires effort from the end user (burning DVDs, etc.). Far easier to entrust the task to someone reliable and trustworthy who makes this their job.

8:34 AM  
Blogger DonovanCO said...

Everybody needs a print organizer like my wife. We have about 70 albums of photos: many are of the family as a whole; each of our now adult four children has a series of albums. She recently edited many old albums, replacing binders and pages that were falling apart so as to ensure the integrity of the photographs. Prints from the last 4 years are mostly from digital cameras, which I have on CDs, but which she prefers viewing as prints.

The point is that it takes time and dedication, and always has, to preserve family memories and stories. I do like the suggestion of a service that could press commercial level CDs of digital images.

9:40 AM  
Blogger Dave New said...

Commercial mass-produced CDs are stamped from metal masters. So far, the cost involved in creating those masters exceeds what someone would be willing to pay to get only a handful of CDs stamped. Some local bands in my area had had CDs made in the past, but since then have found that for small runs (100 or so), it was easier to use a CD-R type duplicating machine, with printable label art.

On another note, I used to work for the world's largest minicartridge tape back up drive supplier (Irwin Magnetics) back in the 80's, as a software developer for our backup software. In those days, we were concerned with backing up those unreliable 10 MB hard drives, and never would have considered an (unpowered even) hard drive as a suitable archive media. How times have changed. For one thing, the size of typical hard drives has almost completely outstripped the capacity of reasonably priced tape drives, making that form of backup and archival impractical for small shops.

Anyway, my point was actually to talk about error-correcting code (ECC) in computer media. In the heyday of Irwin, we held the patent on using Reed-Solomon ECC on quarter-inch tape media (all such patents, if still in force, are now owned by Seagate, who eventually ended up with the spoils of a number of hostile buyouts and takeovers in the tape industry). One of the favorite things we used to do at Comdex was to use a hole punch to punch a hole in the middle of the tape, and then read the data back from the tape, as if the hole wasn't even there.

This was accomplished by recording two redundant blocks of data for every 16 recorded on the tape. With this amount of redundancy, you could lose an entire block's worth of data, and still reconstruct the missing block from the remaining 17 blocks. It was an impressive demonstration that helped land us some major accounts (Compaq, IBM, NCR, Zenith Data Systems, etc). Still, things like failures in the floppy controller DMA channel (one particular OEM's computer comes to mind, who shall remain nameless) could still produce completely correct (as far as the recorded CRCs and ECC were concerned) tapes of flawless garbage. In the worst case, a user might not discover that their backup wasn't functional until *after* they had lost an entire system's worth of data. To head off scenarios like this, Irwin added a readback byte-for-byte data compare, but since it always took longer to execute, impatient users would switch it off.
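
A readback compare of the kind described is easy to sketch. The version below assumes ordinary files rather than tape and has nothing to do with Irwin's actual software. `files_identical` and `backup_and_verify` are hypothetical names, and the temporary files exist only to make the example runnable.

```python
import os
import shutil
import tempfile


def files_identical(path_a, path_b, chunk_size=64 * 1024):
    """Compare two files byte for byte, reading them in chunks."""
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            chunk_a = fa.read(chunk_size)
            chunk_b = fb.read(chunk_size)
            if chunk_a != chunk_b:
                return False
            if not chunk_a:
                return True


def backup_and_verify(source, destination):
    """Copy source to destination, then confirm the copy by reading it back."""
    shutil.copyfile(source, destination)
    return files_identical(source, destination)


with tempfile.TemporaryDirectory() as folder:
    src = os.path.join(folder, "original.dat")
    dst = os.path.join(folder, "backup.dat")
    with open(src, "wb") as handle:
        handle.write(b"negatives" * 1000)
    verified = backup_and_verify(src, dst)

print(verified)
```

Comparing against the source matters because checksums computed over already-garbled data will happily agree with that garbled data. The cost is reading everything twice, which is why the real feature got switched off.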

Which brings me to my final point -- user education, or the lack thereof (or is it motivation? sometimes you can't tell the difference). There hasn't been a backup system invented yet that some durn fool can't find a way around, or manage to ignore, or in some other way abuse, to the detriment of his own, or his department's data. My prime example was the habit that Irwin customers had of re-using the single backup tape that was packaged with the drive over and over, overwriting each previous backup with today's (or this week's) possibly already trashed data. No amount of warning in the manual could seem to convince these folks that they needed at the least a three-tape rotation scheme. I suppose some just thought that we were trying to market tapes to them, like folks accused Bill Gates & Co. of being 'memory merchants' because their PC-DOS macro assembler needed more than 48K of memory.
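
A three-tape rotation can be as plain as round-robin. The sketch below assumes numbered backups and three tapes, which is the simplest reading of the scheme mentioned above rather than anything taken from the Irwin manual. `tape_for_backup` is a hypothetical name.

```python
def tape_for_backup(backup_number, tape_count=3):
    """Pick which tape to overwrite for the Nth backup (0-based)."""
    return backup_number % tape_count


# Tapes used for the first seven backups.
schedule = [tape_for_backup(n) for n in range(7)]

# While backup n is being written, backups n-1 and n-2 still sit
# untouched on the other two tapes.
print(schedule)
```

With a single tape, every backup destroys the only earlier copy, so backing up data that is already trashed leaves nothing to fall back on.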

You can lead a computer user to backup nirvana, but you can't make them rotate their media (or some such drivel).

12:28 PM  
Blogger Brad said...

There's a new site called SwissPictureBank.com. It has a money back guarantee and the redundant storage you're looking for and it's just 3 cents per picture. I save almost all my photos there and it's worked out great. I can't believe more people don't know about this place!

10:21 AM  
