Pierre wrote:
So is there a program that can keep track of that? It wouldn't make sense to do it individually
There probably are several.
I wouldn't bother because md5sum makes it easy to generate hashes for a bunch of files and to compare hashes yourself with standard tools like the find program. Keep it simple! This way you can verify the hashes from any OS (except Windows installs which have not yet been upgraded with Cygwin/MSYS/whatever).
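For anyone who wants the same idea without relying on md5sum being installed, here is a minimal Python sketch. It is only an illustration: the function names (file_md5, build_manifest, verify_manifest) are made up for this post, and it keeps the manifest in memory rather than writing an md5sum-compatible file.

```python
import hashlib
import os


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its MD5 digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = file_md5(full)
    return manifest


def verify_manifest(root, manifest):
    """Return the sorted relative paths that are missing or have changed."""
    bad = []
    for rel, expected in manifest.items():
        full = os.path.join(root, rel)
        if not os.path.isfile(full) or file_md5(full) != expected:
            bad.append(rel)
    return sorted(bad)
```

You would call build_manifest once on the original directory, keep the result somewhere safe, and later pass it to verify_manifest against the copy you want to check. An empty list means every listed file is present and matches.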
If you really need such a program and can't find a good one, I've half-jokingly recommended using a bittorrent app. They allow you to make a single hash file for a whole directory, tell you exactly where in your big files you've got an error... and double nicely as a network/cloud backup application.
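The reason a bittorrent app can point at the damaged part of a big file is that it hashes the data piece by piece (SHA-1 per piece in the original protocol). Here is a rough Python sketch of that idea only. It does not read or write .torrent files, and the piece size and function names are assumptions chosen for illustration.

```python
import hashlib

# Assumed piece length; real torrents pick their own value.
PIECE_SIZE = 256 * 1024


def piece_hashes(path, piece_size=PIECE_SIZE):
    """Return the SHA-1 hex digest of each fixed-size piece of a file."""
    hashes = []
    with open(path, "rb") as handle:
        for piece in iter(lambda: handle.read(piece_size), b""):
            hashes.append(hashlib.sha1(piece).hexdigest())
    return hashes


def damaged_pieces(path, expected, piece_size=PIECE_SIZE):
    """Return indexes of pieces that differ from, or are absent in, expected."""
    actual = piece_hashes(path, piece_size)
    length = max(len(actual), len(expected))
    return [
        i
        for i in range(length)
        if i >= len(actual) or i >= len(expected) or actual[i] != expected[i]
    ]
```

Multiplying a returned index by the piece size gives the byte offset where the damage starts, so you only need to re-fetch that piece rather than the whole file.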
Alternatively, you could do your backups with a program such as duplicity that generates hashes automatically. If you're not doing backups, you have bigger worries than hash management!
Pierre wrote:
what sort of work could I be doing to need so many TerraBytes for my personal computer if it all was "my data"?
There are so many reasons to have loads of data...
The most obvious cause would be if you recorded video. Even if you released most of your work to acquaintances or even to the public, you'd probably have lots of footage that didn't make the edit but that you'd want to keep on hand.
Pierre wrote:
But there is also a large amount of archives collected over the years...e.g. just for the case of a particular set of events, the war in Libya the last two months, I have collected about 15GBs of selected videos (downloaded and captured/encoded from TV), images, articles, webpage screenshots etc...and this done extensively over a number of subjects...
But maybe this also does not fit the "my data criteria"...
I don't want to get into semantics and intellectual property issues. What matters is: can you easily download or rip the stuff again?
The 21st century version of press clipping you've done definitely ought to be backed up regardless of who legally owns the stuff. For documentaries you have on optical discs or which are easy to download, a backup is not absolutely necessary (I'd still do it but you might call that a luxury).
Assuming your work has some value, if you somehow are really lacking the resources to back up your clippings properly (that means having more than two copies, no matter how impractical you think that is!), you should get help.
You could edit/index/package your stuff in a way that makes it attractive and usable by other people and share it. This assumes you've got acquaintances who could help get the word out and the distribution going. Otherwise it's still a good long-term goal but you'd need another approach in the meantime.
You could get in touch with an organization which collects this kind of stuff and which can afford a few drives. From their point of view, you wouldn't be someone requesting a service but a potential volunteer contributor to their archives.
You could simply request donations of old hardware from acquaintances or even the public:
http://www.freecycle.org/group/GR/Greece
If you can spare some capacity but not dedicated backup drives, you could use a P2P cloud backup scheme such as Wuala (which I haven't personally tested).
And finally, maybe you should try to meet fellow data collectors in your area.
Back up your stuff, people! Drives fail. Hardware sucks. Software is buggy. People make mistakes. So don't make excuses!