Newfangled works with independent agencies to create lead development web platforms for their clients.


It's Time to Start a Digital Conservation Movement

at 8:00 am

Since the Amazon Kindle 2 was announced, I've continually been wondering whether a device like it is a good idea. At a price of $359.99 (no monthly fees, 60-second book delivery), I would have to buy a lot of digital books at a substantially lower cost than the printed versions before it would "pay for itself." So, my first question was whether buying Kindle version books was even a good deal at all. I decided to look at some of the books I've read in the past year, comparing the prices for a Kindle version, a printed version, and a used version.

| Title | Kindle Price | Print Price | Used Price |
| --- | --- | --- | --- |
| The Divine Comedy | $0.99 | $16.50 | $10.53 |
| Year Million: Science at the Far Edge of Knowledge | n/a | $10.88 | $5.00 |
| Mapping Time: The Calendar and its History | $19.25 | $31.68 | $3.83 |
| Infotopia: How Many Minds Produce Knowledge | n/a | $10.85 | $9.11 |
| The Future of the Internet -- And How to Stop It | $10.40 | $11.56 | $10.84 |
| Rollback | $6.29 | $6.99 | $0.98 |
| The Shock of the Old: Technology and Global History since 1900 | n/a | $17.16 | $11.72 |
| Exploring Reality: The Intertwining of Science and Religion | $9.99 | $10.20 | $8.80 |
| Do What You Are: Discover the Perfect Career for You | $7.99 | $12.34 | $9.89 |
| Strategic Thinking for the Next Economy | n/a | $22.45 | $0.94 |
| First Among Equals: How to Manage a Group of Professionals | n/a | $11.70 | $2.02 |
| The Way We'll Be | $9.99 | $17.16 | $8.32 |
| Blown to Bits: How the New Economics of Information Transforms Strategy | n/a | n/a | $5.95 |
| Group Genius: The Creative Power of Collaboration | $10.38 | $11.53 | $5.09 |
| Father Ernetti's Chronovisor | n/a | $11.53 | $0.16 |
| Contemporary Futurist Thought | n/a | $22.50 | $15.65 |
| The Numerati | $14.30 | $16.38 | $6.94 |
| What Are You Optimistic About? | $9.56 | $11.66 | $2.90 |
| The Canon: A Whirligig Tour of the Beautiful Basics of Science | $9.99 | $10.85 | $0.01 |
| Seen/Unseen: Art, Science and Intuition from Leonardo to the Hubble Telescope | n/a | $44.00 | $27.28 |


A few things are immediately clear from the table. First, not every book is available on the Kindle. Granted, some of these titles are a bit obscure, and the Kindle is a new format, so I'd expect this to change quickly. Second, in general, buying used is the cheaper way to go (Update (04/07/09): here's a recent post describing a Kindle book price boycott movement). Even factoring in shipping fees for ordering a used book through Amazon.com, going that route is still cheaper than the Kindle version for most texts, and it's probably a better environmental choice (no expensive plastic electronic device needed, and the book you're buying has already been printed and sold at least once). Of course, I could have added a fifth column indicating which of these titles I checked out of my local library (just about all of them). If you don't need to own the book, the library beats the Kindle, Amazon, or any used dealer for that matter, on price and "greenness" for sure. That said, I'm betting that electronic devices like the Kindle will become more and more common.
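For anyone who wants to put a rough number on the "pay for itself" question, here is a minimal sketch in Python using only the eleven titles from the table that have both a Kindle and a print price. The variable names are my own, prices are in cents to avoid rounding issues, and shipping is ignored, so treat the result as a back-of-the-envelope figure rather than a real cost model.

```python
import math

# Kindle 2 price from the post, in cents.
KINDLE_DEVICE_CENTS = 35999

# (title, kindle, print, used) in cents, copied from the table.
# Titles without a Kindle edition ("n/a") are left out.
BOOKS = [
    ("The Divine Comedy", 99, 1650, 1053),
    ("Mapping Time", 1925, 3168, 383),
    ("The Future of the Internet", 1040, 1156, 1084),
    ("Rollback", 629, 699, 98),
    ("Exploring Reality", 999, 1020, 880),
    ("Do What You Are", 799, 1234, 989),
    ("The Way We'll Be", 999, 1716, 832),
    ("Group Genius", 1038, 1153, 509),
    ("The Numerati", 1430, 1638, 694),
    ("What Are You Optimistic About?", 956, 1166, 290),
    ("The Canon", 999, 1085, 1),
]

# Total saved by buying the Kindle edition instead of the new print edition.
total_savings_cents = sum(
    print_price - kindle_price for _, kindle_price, print_price, _ in BOOKS
)
average_savings_cents = total_savings_cents / len(BOOKS)

# Books needed, at that average saving, to cover the device price.
break_even_books = math.ceil(KINDLE_DEVICE_CENTS / average_savings_cents)

# How many titles are cheaper used than on the Kindle (before shipping).
used_cheaper_count = sum(
    1 for _, kindle_price, _, used_price in BOOKS if used_price < kindle_price
)

print(f"Average saving per book: ${average_savings_cents / 100:.2f}")
print(f"Books needed to break even: {break_even_books}")
print(f"Used copy cheaper than Kindle: {used_cheaper_count} of {len(BOOKS)}")
```

With these prices the average saving over a new print copy is about $4.34, so it would take roughly 83 books to cover the cost of the device, and the used copy undercuts the Kindle edition for 8 of the 11 titles before shipping.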

But the more books are sold for devices like the Kindle, the more data centers will need to be constructed. As an example, Microsoft recently announced a new 75-acre data center opening in Washington state (another one just like it is being built in Texas for $550 million). This facility will consume 48 megawatts of power. To put that figure in perspective, 48 megawatts could power 40,000 homes! I had a hard time tracking down a total count for data centers maintained by Microsoft. However, Rob Bernard, Microsoft's Chief Environmental Strategist, has said that the entire data center industry is responsible for 880 million tons of CO2 emissions every year! Yikes! Google, on the other hand, currently has 36 data centers and more planned. Because energy costs are skyrocketing for the tech industry, Microsoft, Google and Yahoo have all entertained the possibility of moving their data centers to Iceland, where the Invest in Iceland campaign is planning for data centers supplied by low-cost geothermal energy.
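The 48 megawatt comparison is easy to unpack with a couple of lines of Python. This is only a sketch: the per-home figure simply divides the two numbers quoted above, and the annual energy figure assumes (my assumption, not Microsoft's) that the facility draws its full 48 megawatts around the clock.

```python
# Figures quoted in the post.
DATA_CENTER_WATTS = 48_000_000  # 48 megawatts
HOMES_POWERED = 40_000

# Average draw per home implied by the comparison.
watts_per_home = DATA_CENTER_WATTS // HOMES_POWERED

# Assumption: the facility runs at full power all year (24 h x 365 days).
HOURS_PER_YEAR = 24 * 365
annual_megawatt_hours = 48 * HOURS_PER_YEAR

print(f"Implied draw per home: {watts_per_home} W")
print(f"Energy per year at full load: {annual_megawatt_hours} MWh")
```

The comparison implies an average of 1,200 watts per home, and a facility running flat out all year would use 420,480 megawatt-hours.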

This made me wonder why our need for data centers is increasing so rapidly. Of course, it's pretty obvious: Google offers over 7GB of free email storage. Its other applications don't seem to have a published official storage limit, though one user on this forum estimates that your potential Google Docs limit would be 50GB. Facebook does not seem to have a stated total limit for storage of anything (text data, photos or videos). In any case, all of your email, calendar, document, photo, and video data across all of these services will quickly amount to a lot of storage! Now consider that amount (whatever it is) multiplied by the over 100 million Gmail users, or the over 200 million Facebook users!
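To get a feel for the scale, here is a small Python sketch of that multiplication for Gmail alone. The function name and the `fill_fraction` parameter are hypothetical, introduced just for illustration. The 7GB quota and 100 million users come from the figures above, and the result with every inbox full is an upper bound, not a measurement of what Google actually stores.

```python
def total_storage_pb(quota_gb: float, users: int, fill_fraction: float = 1.0) -> float:
    """Estimate total storage in petabytes (decimal: 1 PB = 1,000,000 GB).

    fill_fraction is the assumed share of the quota an average user fills.
    """
    total_gb = quota_gb * users * fill_fraction
    return total_gb / 1_000_000


# Upper bound: every one of 100 million Gmail users fills a 7 GB quota.
gmail_upper_bound_pb = total_storage_pb(quota_gb=7, users=100_000_000)

# Hypothetical case: the average user fills only a tenth of the quota.
gmail_tenth_full_pb = total_storage_pb(quota_gb=7, users=100_000_000, fill_fraction=0.1)

print(f"All inboxes full: {gmail_upper_bound_pb:.0f} PB")
print(f"Inboxes one-tenth full: {gmail_tenth_full_pb:.0f} PB")
```

Even if the average inbox were only a tenth full, that would still be on the order of 70 petabytes for email alone, before counting documents, photos, or video.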

Do we really need to save every email? Every digital photo? Every video clip? Perhaps the concept of conservation will have to adjust further to include the conservation of data, too. This might not be a bad thing; in fact, it may help us to appreciate things more. Back when our cameras were limited to the exposures on a roll of film, we considered each shot more carefully before we even clicked the shutter. Now, with digital cameras, there's no need for that kind of thinking. But we could probably stand to save a few fewer pictures here and there! I've seen Facebook accounts with over 800 pictures attached to them!

What do you think? Should we chill out with our excessive data retention?

Update (04/02/2009): Nicholas Carr just posted some more details about Google's data centers at his blog, Rough Type. Here's a clip:

I was particularly surprised to learn that Google rented all its data-center space until 2005, when it built its first center. That implies that The Dalles, Oregon, plant (shown in the photo above) was the company's first official data smelter. Each of Google's containers holds 1,160 servers, and the facility's original server building had 45 containers, which means that it probably was running a total of around 52,000 servers. Since The Dalles plant has three server buildings, that means - and here I'm drawing a speculative conclusion - that it might be running around 150,000 servers altogether.

Read the full post, which includes way more specifics, as well as images and video from inside the data centers.
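Carr's estimate is simple multiplication, so it is easy to check. The sketch below uses only the figures from the clip. The assumption that all three buildings hold the same number of containers as the original one is Carr's own speculation, not a confirmed fact.

```python
# Figures from the clip above.
SERVERS_PER_CONTAINER = 1_160
CONTAINERS_IN_ORIGINAL_BUILDING = 45
SERVER_BUILDINGS = 3

# Servers in the original building.
servers_per_building = SERVERS_PER_CONTAINER * CONTAINERS_IN_ORIGINAL_BUILDING

# Speculative: assumes every building matches the original one.
servers_at_the_dalles = servers_per_building * SERVER_BUILDINGS

print(f"Original building: {servers_per_building} servers")
print(f"Whole plant (speculative): {servers_at_the_dalles} servers")
```

That works out to 52,200 servers in the original building and 156,600 across three identical buildings, in line with Carr's rounded figures of around 52,000 and 150,000.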





Comments

Christopher Butler | March 30, 2009 8:17 AM
I just found an interesting article about how the Egyptians, by carving their data into rock, were far better off than we are at recording and saving information. Henry Newman muses that in order to preserve data for even close to as long as the Egyptians have, we need a framework that can transfer and maintain metadata between systems.
Andrew | March 30, 2009 8:38 PM
I had never thought of it that way, and I totally agree.
Brian | March 31, 2009 12:02 PM
This article came out yesterday and it seems as if you read this author's mind.

Maybe Facebook and its users alike should be conscious about how much data is stored on their servers.

Article about Facebook asking for $100 million to maintain their servers
Brian | March 31, 2009 1:23 PM
http://redherring.com/Home/25977
Alex | April 1, 2009 8:08 AM
It seems like we're moving toward more and more free storage and services, which is touted as a good thing, but it's only really because it's in the best interest of companies like Google to have your information. With Google, every email you save is a page to slap an ad on, right?
Christopher Butler | April 1, 2009 10:15 AM
Alex,

You're right- Google can afford to offer us all the free app candy they want because the more data we give them, the more advertisements they can sell. Advertising is how they can afford to subsidize a huge amount of R&D that goes toward the creation of great tools like Gmail, Google Analytics, etc. I mentioned this in my newsletter, To Buy or To Build. In a significant way, what Google is doing skews the entire development industry. Since the advertising subsidy is almost invisible to many users at this point, the perception of value has changed. People expect way more for way less now. It's a reality we grapple with very often, but that means that one of our jobs has to be to continually educate our clients and clients-to-be as to what the real value of working with us is. In addition to the development and the applications they get, they're also getting a dedicated relationship with human beings that are reachable by phone and email. Google does not offer that!

Thanks for reading and commenting,

Chris
Ted | April 3, 2009 10:21 AM
This is so typical of liberal environmentalists. If it's not one thing, it's another. If it's not save the owls and whales, it's recycling, etc. etc. We finally find a way to use less paper and you still have something to complain about!
Christopher Butler | April 3, 2009 10:33 AM
Ted,

Wow, you're definitely making your presence known with your comments! I think you might be getting the wrong idea about what I'm saying with this post. I'm not trying to be a complainer. On the contrary, I'm trying to talk more about the acceleration of storage and power use in this industry, and how consumers have been quickly acclimated to that increase. While it definitely has an environmental impact, it also has a financial one, too- both are concerning. As far as the environmental issues are concerned, I think it would be a mistake to tie our damage to the environment to any particular technology. Rather, it should be tied to our inclination toward excess, no?
John Carlton-Foss | April 4, 2009 10:29 PM
I think you make many excellent points, in support of the main point. It is not obvious what the answer is about how to proceed, because the business models for such companies as Google call for lots of storage and lots of CPU. Further, some people with whom I have talked seem to feel that the virtualization of information and communications will save us from climate change. Someone needs to do the numbers and see what makes sense. Maybe I will tackle that, but I also have many other things to do, and so perhaps someone else will do it first.

I do want to comment on your comments about digital pictures. This is a somewhat complicated area. Clearly 800 pictures attached to a Facebook site is excessive. As grist for the mill, I would suggest consideration of another scenario. Earlier this week I was on the roof of a building reviewing equipment with my business partner, a licensed professional engineer. We had limited time, and he went directly to key areas to document what we knew we had to have when we left the site. I clicked dozens of additional digital photographs of all the equipment. This made it feasible to retain information that otherwise we would not have retained. Later it turned out during the data analysis stage of the work that this extra information was essential. It was not that we were stupid about our narrow focus. It was that we focused where we had to focus and also captured as much of the periphery as we could. Why? Because, as anyone knows who has done video documentary, if you are filming in real time, and some key event happens, you often discover that part of that event occurs just outside the camera's frame. Thus, all those seemingly un-necessary photos turned out to be critically important. So the digital camera costs something in storage and battery, but it saves time and travel that would otherwise have been required to return to the site for more data. But then, taking it to the next level, the question is what do we do with all that rich extra data once we have finished using it. I again need to do the numbers, but I suspect that archiving it off onto a CD or DVD, labelling well, and storing the media efficiently goes a long way toward keeping it all as green as possible. Further, we fall even more deeply into agreement when I point out that such digital photographs should not be stored on a Google site simply because Google makes all those GBs of storage available for free.

One key lesson stands out for all of us in our time. We all need to restrain ourselves, even if it seems that we are partaking of unlimited resources. Those resources are not really unlimited. And they do have costs.
Christopher Butler | April 6, 2009 7:42 AM
John,

Thanks for your insight here. Do you monitor for keywords like "conservation" or "green" or "energy?" I'm wondering how you stumbled upon this post...

The example you cite, where capturing extra images ended up saving time and resources, is a good counterpoint that shows how the increase in storage capability for things like digital pictures can be a valuable tool applied in a variety of contexts. In this case, the conservation might come later, when deciding what files are saved and/or archived. I'd be interested in a more detailed comparison of archiving methods, though I suspect that using CD media would only be a good short term solution. The dyes that most CD-R's use are quite light sensitive and consequently seem to have a very short lifespan. I wouldn't want my only backup of important data to be on CD.

Your concluding comment, about the need for restraint, is really the core of what I was trying to get at with these posts about digital conservation. Thanks again for taking the time to comment,

Chris
Brian | May 4, 2009 3:04 PM
Chris,

I'm sure you are aware of this, but Google has started to literally scan books from multiple libraries for their online library. Here is a list of questions and answers pertaining to "the world's greatest library".

Who knows if one day, we'll walk into the library and just hook our usb/kindle device into a server and download away.
Liz | June 18, 2009 12:07 AM
Actually, this is an interesting post/thread from the perspective of a museum professional who has had some experience digitizing collections. Right now I'm looking for some insight into the world of "digital conservation" -- meaning the maintenance and conservation of digital files, including images, used in tandem with the objects in museum collections (for identification, research, exhibition use, and so on.)

Many museums have been working on digitizing their collections for years, and are now bumping into the problems of media fallibility, format fallibility, and storage fallibility. It's practically a full-time job to manage a digital collection, let alone an "analog" one.

For us it's a standards issue, which seems diametrically opposed to the constantly-developing and plastic world of digital representation. (Nobody wants to have to go back and re-photograph 3.5 million objects every 10 years). Any thoughts on the best way to wed these two paradigms?
Christopher Butler | June 19, 2009 5:02 PM
@Brian, yes, I did and you're probably on the right track in terms of where that's headed...

@Liz, this is a great comment. I ran into similar issues when trying to archive my work as a film student at RISD. Some was in 16mm film, which degrades easily, some sound on magnetic tape, which is still pretty robust, some on DV tape, which is very vulnerable to dust and heat, some on BETA, which is resilient but too obscure, and some on DVD, which succumbs to scratches and light degradation. My best bet would be to get it all on a hard drive, but that's easier said than done at this point. Getting a DV tape or DVD ripped is one thing, but the 16mm film or BETA tape? Needless to say, I still have VHS, BETA, DV, film cans, and DVDs lying around... as if these projects were really worth keeping around!

I've also been thinking along similar lines after hearing a podcast from The Spark radio about dead media and digital preservation, and how archivists have a general concern about how to best store information- both current and old. One shocker from that show was that a guest mentioned that the national archive still does not accept document formats like Microsoft Word documents. If they don't accept that, can you imagine all the information that is not being brought in??

As far as wedding the two paradigms is concerned, I'm really not sure. We're constantly moving forward in terms of better media (bigger/stronger/faster/cheaper), so at the risk of sounding too pessimistic, I imagine that this progress will always be a bane to archivists. As far as I'm concerned, the sooner we get to a format that requires the fewest moving parts, the better... Also, let's ditch the disc as soon as possible. Nothing is more frustrating than renting a DVD and getting the "skipping over damaged area" message and then the next thing you know you're 45 minutes later in the film. Not cool.

Thanks for reading and commenting,

Chris
