Running out of disk space in the age of the cheap terabyte - avoiding data clutter.

Running out of space
Running out of drive space these days is like New Zealand running out of sheep. It's hard to believe it's even possible, but today I got a "running low on space" warning on my laptop. It's an upgraded 180G 7200rpm drive. I only had about 500MB free. What on earth was going on?

The first thing I did was empty the trash. That helped a little (I had almost 5,000 items in there). But I was still very low on space.

I needed to get some idea as to what was going on. So I fired up Terminal, ran du (including the du -k * | sort -nr variant) and df, and was none the wiser. I needed something visual. So I downloaded JDiskReport (actually I couldn't remember the name, so I googled "visualize disk usage", which brought up a lifehacker.com article that reminded me of the project...). JDiskReport is a fantastic tool for seeing where your space has gone (and the guy who wrote it, Karsten Lentzsch, is a very talented Java Swing programmer as well).
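For anyone who wants the du-and-sort view without squinting at raw kilobyte counts, here is a minimal Python sketch of the same idea. The function names dir_size and largest_entries are made up for this example, and it adds up apparent file sizes rather than allocated disk blocks, so the totals will differ somewhat from what du reports.

```python
import os


def dir_size(path):
    """Sum the sizes (in bytes) of regular files under path, skipping symlinks."""
    total = 0
    # os.walk does not follow symlinked directories by default;
    # unreadable directories are silently skipped.
    for root, _dirs, files in os.walk(path, onerror=lambda err: None):
        for name in files:
            full = os.path.join(root, name)
            try:
                if not os.path.islink(full):
                    total += os.path.getsize(full)
            except OSError:
                # The file vanished or is unreadable; ignore it.
                pass
    return total


def largest_entries(parent, limit=10):
    """Return up to `limit` (size_in_bytes, name) pairs for the entries
    directly inside `parent`, largest first."""
    sizes = []
    with os.scandir(parent) as entries:
        for entry in entries:
            try:
                if entry.is_symlink():
                    continue
                if entry.is_dir(follow_symlinks=False):
                    size = dir_size(entry.path)
                else:
                    size = entry.stat(follow_symlinks=False).st_size
            except OSError:
                continue
            sizes.append((size, entry.name))
    sizes.sort(reverse=True)
    return sizes[:limit]
```

Calling largest_entries(os.path.expanduser("~")) and printing each size divided by 1e9 gives a rough "top ten" of the home folder in gigabytes. It is no substitute for a visual tool like JDiskReport, but it is enough to spot a runaway folder.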

Turns out that, while I do have 70G of movies and 14G of music (yes, a small amount, but I actually own all of my music), the first surprise was VMware Fusion: 16G of virtual machines! And really, I don't even use it all that much: I have an older XP image with IE 5.5 on it for testing, and an Ubuntu image for the same reason. Really, these should be 2G apiece, and one 10G image could be deleted. Easy fix.

The biggest surprise was my iMovie "events" folder: 30GB used. First, I have a lot of footage. Second, iMovie adds a lot of metadata (in one case I had 1G of thumbnail data alone, for a one-hour video). Third, raw video footage is very large. It is highly compressible, but compressing this stuff is not part of my workflow. I used iMovie to export a tiny version of a one-hour video, then deleted the source files. The original was 4.8G. The exported video was 34MB! (It took about 10 minutes to compress. I might also try using HandBrake to do the transcoding.) I can look forward to getting the whole folder down to around 200MB, I hope. But it will take time.
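As a quick sanity check on that 200MB hope, here is the arithmetic as a small Python sketch. It assumes decimal units (1G = 1000MB) and, more optimistically, that every clip in the folder compresses as well as the one video I tried.

```python
# Figures from the export experiment above, in megabytes (1G taken as 1000MB).
original_mb = 4800.0   # the 4.8G source video
exported_mb = 34.0     # the exported version
events_mb = 30000.0    # the whole 30GB iMovie "events" folder

# How many times smaller the export is than the original.
ratio = original_mb / exported_mb

# Projected folder size if everything shrinks by the same factor
# (an assumption; real footage will vary).
projected_mb = events_mb * exported_mb / original_mb

print(f"compression ratio: about {ratio:.0f}x")
print(f"projected folder size: about {projected_mb:.1f} MB")
```

That works out to a ratio of roughly 141x and a projected folder size of about 212MB, so the 200MB guess is in the right ballpark.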

And then there are about 12G of photos floating around in various places.

How did this happen?

Cleaning the mess up is great, but if I don't figure out why it happened it's just going to happen again. Despite the increasing amount of storage, the number of tools I have for generating new data is increasing even faster. I have about 5 devices capable of producing photographs and video: two cameras, a Flip Video, my cell phone and a webcam. All of this new data is dutifully sync'd to my laptop (a process I wrote about earlier in an article on iPhoto), but then (apparently) the data just sits there, and problems like these arise.

Of course, this data shouldn't just sit there. It should be doing something useful, or it should get deleted. (The utility of data generally goes down over time. But that's OK, because Flickr, YouTube, and Facebook don't ever delete your data. It's their business to keep your data as informational as possible, so that's to your benefit.)

Should you have more photos on your hard drive than on your favorite sharing service, or fewer? Most people would say more; I say fewer! Put the good photos on Flickr, and only keep the great ones locally. (Same with videos and YouTube.) If it's not good or great, it's deleted. Even if you decide to keep it, your work isn't done: for example, you need to compress the video. (And you may want to compress the local photos you keep if you shoot RAW.)

(Two unavoidable factors may keep more data on your PC than on the net in the short run. First, you may have a limited connection. This will make uploading even smaller files very slow. In the worst case, you're completely offline. Not much you can do there but wait, knowing that your data clutter is only temporary. Second, you may need to compose your story a bit, putting together the narrative, cleaning up the source data, and making decisions about what's good, what's great, and what's trash. That takes time! But being aware of all this work you're creating before hitting "Record" might make you more cautious. It might also inspire you to cull your work before uploading to your PC!)

How much to keep? I don't know, but I do know that about 95% of my photos are pretty bad. So I'd say 4 photos and one very short video per event day are good, and half of those (or fewer) are great. That's about 40MB uploaded, 20MB (max) kept on the hard disk. That's still quite a lot to upload over a bad connection, but doable with reasonable broadband.
