ComputerGuru 18 hours ago [-]
Slightly related: I have a tool that writes random (incompressible) data to a disk and lets you verify it back without storing a copy (by regenerating it from a CSPRNG seed). It was initially developed for benchmarking SSDs that used to cheat to get better performance numbers, but it can also be used for this purpose, or to overwrite ("shred") a disk: https://github.com/mqudsi/hddrand
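The seed trick is simple to sketch: regenerate the same pseudo-random stream from the seed at verify time, so no copy of the data needs to be stored. A toy version using Python's built-in PRNG (the linked tool uses a proper CSPRNG and writes to the raw device; here we target an ordinary file):

```python
import random

CHUNK = 1 << 20  # work in 1 MiB chunks


def fill(path: str, seed: int, size: int) -> None:
    """Write `size` bytes of seeded pseudo-random data to `path`."""
    rng = random.Random(seed)
    with open(path, "wb") as f:
        remaining = size
        while remaining:
            n = min(CHUNK, remaining)
            f.write(rng.randbytes(n))
            remaining -= n


def verify(path: str, seed: int, size: int) -> bool:
    """Regenerate the same stream from the seed and compare, chunk by chunk."""
    rng = random.Random(seed)
    with open(path, "rb") as f:
        remaining = size
        while remaining:
            n = min(CHUNK, remaining)
            if f.read(n) != rng.randbytes(n):
                return False
            remaining -= n
    return True
```

Because the data is incompressible, a drive can't cheat by compressing or deduplicating it, which is what makes this useful for benchmarking as well as for catching silently dropped writes.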
fhdkweig 17 hours ago [-]
I haven't used badblocks https://en.wikipedia.org/wiki/Badblocks in about 10 years, but I was annoyed that this exact feature wasn't available for detecting accidental swapping of block locations: badblocks writes the same data to every block, so the blocks are all indistinguishable from one another.
Superfud 6 hours ago [-]
ddrescue can read from /dev/urandom and write to any device. This comes in handy when you have a device with bad blocks but would still like to "shred" all the still-writable parts of it.
NooneAtAll3 4 hours ago [-]
there's no point in writing actual random data - you won't be able to check correctness without another copy
champtar 14 hours ago [-]
TIL `badblocks -t random` repeats the same random block over and over :(
jmb99 13 hours ago [-]
You can however set the block size to something quite large, which means you write the same random pattern spread out over multiple blocks repeatedly. If you pick an "odd" block size (say, your native block size multiplied by 47), it's highly unlikely your disk under test will be swapping around "groups of 47 blocks." (I usually just use a nice multiple, like 4K x 16, but if you're super paranoid a weird multiple should be good enough.) You won't get a report of which exact blocks on the drive are failing, but these days that isn't really useful information - if any blocks are failing, warranty or ditch the drive.
I like the fact he's not just verifying all of them each year. AFAICR, reading the flash causes the row to be rewritten with the values just read.
I remember years ago working on the Wii, and there was a restriction on how often you could read the flash to avoid premature wearing. Not sure if that was just the specific type of storage, as googling suggests that NAND is subject to this and NOR isn't. I think pretty much all USB drives now use NOR flash, so maybe this isn't actually an issue any more.
wmf 17 hours ago [-]
> reading the flash causes the row to be rewritten with the values just read
DRAM works that way but flash doesn't. Read disturb is a different issue.
> pretty much all USB drives now use NOR flash
Nope, NOR flash is much more expensive than NAND so NOR is only used for firmware and everything else is NAND.
cyberax 16 hours ago [-]
But the firmware might have the logic to rewrite the block when it reads it in case it hasn't been written in a while.
wmf 14 hours ago [-]
SSDs should definitely rewrite static data if it has too many ECC errors. Unfortunately we don't know much about what's going on in SSDs. Some could have much better data integrity than others.
zozbot234 17 hours ago [-]
> reading the flash causes the row to be rewritten
This only happens very rarely, though more frequently as NAND flash goes QLC and beyond.
Besides, other experiments have shown that data remanence is way more of an issue with drives that are almost completely worn out (way beyond their specified TBW) and about to croak. Even then you only get rare bitrot that can be checked for and compensated quite cheaply in most cases.
If you take fresh media, write it just once or a few times at most, use substantial overprovisioning to keep the drive in its fast pseudo-SLC mode, and reread the media periodically, NAND can be a good enough storage system for most casual needs.
jofla_net 15 hours ago [-]
Buyer be warned: I think this is extremely brand-dependent.
While I've had generally solid experience with SanDisk for almost 20 years, and a few old drives (which I hear are SLC-based, so it's not surprising) held files for over 5 years with no issue, I recently almost lost over 4 years of photos.
About 2 years ago I purchased some Lexar drives from Costco since they were dual-interface (USB-A / USB-C), which was useful for just getting pictures off my phone. I usually don't rely on such a setup for the long term, but as with all things I was delayed tending to it. Since there were 2 per box, I copied the pictures to both drives and diffed them several times to make sure they were exact copies.
After 24 months, one of the drives had a 95% loss: almost every picture had lost its bottom half or so. The other drive surprisingly seemed fine. It had been plugged in every 6-9 months, as I recall, because I wanted to browse it a few times, and it seems that this saved the volume. Upon further inspection, though, the good drive had still lost 10 pictures out of about 5 thousand, so it wasn't perfect.
To pile on another anecdote: late in 2018 I put a well-used PC with an Intel SSDSCKKW240H6 (a 240GB SSD in M.2 form factor) in storage and picked it up 5 years later. The SSD was unreadable by then. The PC (with a different storage device) still runs (sans the fan control, which apparently took a beating on first boot after such a long dormancy, and the CMOS battery has meanwhile depleted).
Giefo6ah 14 hours ago [-]
> After 24 months, one of the drives had a 95% loss: almost every picture had lost its bottom half or so.
If these are JPEGs with a grey or green lower half, it's likely only a few 16x16 macroblocks are corrupted and you can recover the rest.
This cannot be done programmatically because you have to guess what the average colour of the block was, but it can be worth it for precious pictures.
nneonneo 4 hours ago [-]
This is one of the reasons I like the PNG format: it has checksums. You can fix a surprising number of broken files by brute-force testing plausible errors until the checksum passes.
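Each PNG chunk carries its own CRC-32 (computed over the 4-byte chunk type plus the chunk data), so you can walk a file and pinpoint which chunk is damaged. A minimal checker, as a sketch (it assumes a well-formed chunk layout and doesn't handle truncated files):

```python
import struct
import zlib

PNG_SIG = b"\x89PNG\r\n\x1a\n"


def check_png_chunks(data: bytes) -> list[tuple[str, bool]]:
    """Return (chunk_type, crc_ok) for every chunk in a PNG byte string."""
    assert data[:8] == PNG_SIG, "not a PNG file"
    pos, results = 8, []
    while pos < len(data):
        # Chunk layout: 4-byte length, 4-byte type, data, 4-byte CRC
        (length,) = struct.unpack(">I", data[pos:pos + 4])
        ctype = data[pos + 4:pos + 8]
        cdata = data[pos + 8:pos + 8 + length]
        (stored,) = struct.unpack(">I", data[pos + 8 + length:pos + 12 + length])
        # The CRC covers the type and data fields, not the length
        results.append((ctype.decode("ascii"), zlib.crc32(ctype + cdata) == stored))
        pos += 12 + length
    return results
```

A chunk that fails its CRC is where to focus the brute-force search: flip single bits (or try plausible byte values) until the CRC passes.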
With JPEG one of the big problems is that the data is Huffman-encoded without any inter-block markers (except maybe Restart Markers, if you're lucky). This means that a single bitflip can result in a code changing length, causing frameshifts in many subsequent blocks and rendering them all undecodable. If you have a large block of missing data (e.g. a 4k-byte sector of zeros), then you have to guess where in the image the bitstream resumes, in addition to guessing the value of the running DC components.
myself248 8 hours ago [-]
It's surprising that it can't be done programmatically, since "minimize the color difference above and below this super obvious line" seems like it should be a pretty straightforward success criterion.
ChrisMarshallNY 8 hours ago [-]
AI may be the “killer app,” for these kinds of “back up and squint” judgment calls.
jofla_net 12 hours ago [-]
Yes, they were grey.
capitainenemo 8 hours ago [-]
Could it possibly be that it wasn't the drive, but maybe the import application?
Well I just ordered a drive from sandisk.com, hopefully they’re not mailing out counterfeits.
digdugdirk 18 hours ago [-]
What is the best consumer friendly long-term storage medium? Are we still better off with high capacity dvd/Blu ray discs?
bityard 17 hours ago [-]
Recordable blu-ray discs have a reported lifespan of hundreds of years if left untouched, but the high-capacity ones (128GB) are not especially cheap right now and I assume the writing process is slow. The drives themselves may not be easy to come by in future decades. But they are your best bet for "I want my data to outlive my grandchildren."
For the rest of us, a USB spinning rust hard drive formatted as exFAT is going to be hard to beat. You'll be able to plug this into virtually any computer made in the next few decades (modulo a USB adapter or two) and just read it. They are cheap (even allowing for the rising cost of storage), fast, and most importantly, they are easy. The data is stored magnetically, so is not susceptible to degradation just from sitting like SSDs or flash drives are.
Of course, you should not store any important data on only ONE drive. The 3-2-1 backup rule applies to archives as well: 3 copies, 2 different media, 1 off-site.
Marsymars 12 hours ago [-]
I recently went through this exercise and settled on HFS+ over exFAT. Reliability seems a bit better with some edge cases, and I don’t expect I’ll be put into a situation where I’m not able to read HFS+ drives.
(Though probably not appropriate if you’re primarily not a mac user, or won’t be in the future.)
josh3736 3 hours ago [-]
This is an… interesting choice for archival purposes. What exactly do you think makes HFS+'s reliability better? The only thing I can think of is that HFS+ has journaling while FAT and derivatives do not, but that doesn't particularly matter after the data is on the disk and it's cleanly unmounted (which should be a safe assumption in most archival scenarios).
The Linux HFS+ driver is basically unmaintained, and cannot write to journaled disks. On Windows, the only choice is a paid driver. I guess it's fine if you're strictly a Mac user, but it's a real problem if you need to access the disk on another machine. Even if you don't, I still wouldn't trust Apple for long-term support of anything.
Meanwhile exFAT has native support on Windows, Mac, and Linux, and there are drivers for BSDs and others.
So 20 years down the line, you'll certainly have something that can read an exFAT drive without much if any pain, regardless of which platform you're using at the time. HFS+? Who knows.
That said, I'd consider ZFS or btrfs for HDD archival. Granted broad (Mac/Windows) support is weaker than FAT, but at least the filesystems are completely open source. But what really makes them interesting is their automatic data checksumming to detect (and possibly repair) bitrot, which is particularly useful for archival.
Krutonium 8 hours ago [-]
You're assuming Apple is going to continue even supporting HFS+ long term. They already convert volumes to APFS opportunistically.
Marsymars 8 hours ago [-]
APFS is generally not appropriate for HDDs, so yeah, I expect they'll keep supporting HFS+ for as long as they keep supporting non-flash storage.
In any case, if the situation changes, I expect there'll be enough lead time for me to adjust my strategy -- the failure scenario is completely different than rotting physical media.
fy20 12 hours ago [-]
I decided to go with NTFS for the filesystem as it has journaling. Works fine on Linux, and obviously Windows. For macOS there are various add-ons that support NTFS, but my use case there is read-only.
myself248 8 hours ago [-]
Paper.
Not even kidding.
With any other media, you have to hope that the drives are still available. Paper routinely lasts hundreds of years and we all have readers built right in.
vaylian 3 hours ago [-]
I like simple solutions. But paper has severe storage capacity limitations, which makes it impractical for storing large amounts of data.
orthogonal_cube 17 hours ago [-]
Probably depends on what “consumer-friendly” entails, how it’s stored, and the quantity of data.
If we’re talking the average tech-illiterate to literate-but-cost-and-space-constrained person, probably Blu-Ray. A burner+reader combo with a stack of dual-layer discs is probably cost-effective. High-capacity HDDs would probably be equally effective if you can guarantee that they’re stored away from accidents and mishandling, but if it requires a SATA-to-USB adapter with assembly then it might possibly be out of reach for some consumers, and any risk of damage from movement could rule it out entirely.
If we’re talking tech-savvy consumers who don’t have the IT budget of a corporation, maybe LTO-5 or LTO-6 tapes could work. Tapes themselves are very affordable and have a good shelf lifespan. Used libraries can be had for under $600. The primary issues would be finding one with an interface that works with your existing equipment and software to support tape read and write.
msy 13 hours ago [-]
Consumer? Apple or Google Photos or 'drive' functionality of either. The only real risk then is losing your account and Apple Photos has an option to keep them all locally on disk.
curt15 12 hours ago [-]
To be pedantic, the post you responded to asked about "storage medium", not storage services, which leads to the question of what storage medium they use and how long the services will be around.
1970-01-01 18 hours ago [-]
I've been a big fan of M-Disc BD-R.
josh3736 5 hours ago [-]
It does really depend on how much data you want to store, but if you've got a lot of it…
Tape.
Obviously extreme prosumer, but for long-term archival of lots of data, LTO tape wins in several ways:
- Discs just aren't actually that high capacity relative to modern HDD capacities. BD XL maxes out at 128 GB, while there are now 30 TB HDDs readily available. That's 240 discs per HDD. Modern LTO tapes store 12-18 TB, or 2-3 tapes per HDD.
- Anything flash-based is a bad choice for long-term storage. SSDs are very fast, but also (relatively) expensive at 15-20¢/GB. Reputable SD cards are in the same neighborhood. Despite the OP redditor's results here, flash is only expected to retain data for 5-10 years.
- Tape is the absolute lowest cost-per-GB you can find of any storage medium. At the moment, LTO 8/9 tape can be had on Amazon for ½¢/GB. Compare with BD-R at 2¢/GB, or BD-R XL M-Disc at 15¢/GB. HDDs (spinning rust) are 2-3¢/GB.
- Consider also write speed. LTO can write 300+ MB/s. BD 16x maxes out around 68 MB/s.
- Manufacturers rate tapes for 30 years sitting on a shelf, and it wouldn't be surprising if they still read after 50 years¹. Plain BD-R lasts 5-20 years. M-disc is the interesting outlier, rated 100-1000 years.
Of course, the biggest problem with tape is the drives. While the media is dirt cheap, the drives are crazy expensive. It looks like you can pick up a used LTO-6 drive (2.5 TB tapes) on ebay for around $500. A brand new LTO-9 drive (18 TB tapes) will be $4000-5000.
In terms of breakeven points, a used LTO-6 drive + tapes beats plain BD after about 25 TB. Because of the cost of M-discs, they stop making sense after 1-2 TB. Purely on cost, a brand new LTO-9 drive + tapes doesn't beat used LTO-6 + tapes until about 800 TB (LTO-9 tape is ½¢/GB while LTO-6 tape is 1¢/GB), but there's definitely a point in there where the larger capacity of LTO-9 makes dealing with the physical media a whole lot easier.
So if you're looking for long-term storage for your photo album, an M-Disc BD XL is probably a good choice. If you only have a few hundred GB of data, a couple of discs + a burner can be had for $300 or so, and you can be pretty sure your mom could manage to read the disc if necessary.
But if you're looking to back up your 100 TB homelab NAS, discs are not really feasible. You'd spend the next month swapping discs every 25 minutes², and then have to deal with your new thousand-disc collection. Here's where a used LTO-6 drive makes a lot of sense. This is a real sweet spot if you can find a decent drive; all-in you'd spend about $1500 to back up your 100 TB.
This is what I do to back up my NAS: I found an old LTO-6 drive and got a bunch of tapes. The drive plugs into a SAS port (you might need an HBA PCI card, $50), and that's pretty much it. Linux has the drivers built in; the drive shows up as /dev/st0 and you can just point tar³ at it.
Finally, just to compare with cloud options, storing that 100 TB in AWS Glacier Deep Archive would run you slightly over $100/mo, so you're ahead with your own tapes after a little over a year. Oh, and don't forget to set aside an extra $8000 for data transfer fees should you ever actually want to retrieve your data lol.
² Or get a disc-swapping robot, but those run $4000-5000, at which point… you're better off with a brand new tape drive.
³ Thus using the Tape ARchiver program for its original purpose. Use -M to span tapes, tar will prompt you to swap.
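The breakeven figures above come from intersecting two linear cost curves (fixed drive cost plus per-TB media cost). A quick sketch, using the comment's prices where given (the ~$60 BD burner figure is my own assumption, and the exact crossover obviously shifts with the prices you plug in):

```python
def breakeven_tb(fixed_a: float, per_tb_a: float,
                 fixed_b: float, per_tb_b: float) -> float:
    """Capacity (TB) where option A's total cost equals option B's.

    Total cost model: fixed + per_tb * capacity."""
    return (fixed_a - fixed_b) / (per_tb_b - per_tb_a)


# New LTO-9 ($4500 drive, $5/TB tape) vs. used LTO-6 ($500 drive, $10/TB tape):
print(breakeven_tb(4500, 5, 500, 10))  # 800.0 TB, matching the figure above

# Used LTO-6 vs. plain BD-R ($20/TB discs, assumed ~$60 burner):
print(breakeven_tb(500, 10, 60, 20))   # 44.0 TB with these inputs
```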
layer8 17 hours ago [-]
Honestly: multiple copies of encrypted cloud storage. (Encryption just for privacy.) You need decentralized backups anyway. Alternatively, two NAS systems with some RAID variation in different locations that back up each other can be more cost-effective for large capacities.
foxglacier 13 hours ago [-]
You're talking about backups which you wouldn't normally need to keep for decades and will be powered on regularly anyway. If it's archival, such as family photos for your kids when they grow up, cloud storage can lose them if you die or go to prison or for whatever reason don't keep paying the bill.
layer8 11 hours ago [-]
If you go to prison, you can lose whatever media you have as well. I wouldn't rely on a single cloud storage provider, but mirror across multiple providers, and mirror on one or more local devices as well, at least for the most important data. I wouldn't use physical media as primary backup copies today: long-term durability and the availability and support of matching peripherals are uncertain, and they don't make proper redundant backups, or verifying them, any easier.
For the kids, I'd rather make physical photo albums.
BoredPositron 17 hours ago [-]
What's long-term? I have some DVD-Rs pushing 20-25 years, and despite the plastic getting brittle they still work. I also have some IDE drives that still work without problems after 40 years. I would rather aim for 20 years and upgrade the storage device if I still need to retain the data.
vel0city 17 hours ago [-]
That's a thought I hadn't had: the plastic of the disc getting so brittle with age that it shatters in the drive. I wonder what the embrittlement profile of polycarbonate stored in reasonable conditions looks like.
mapontosevenths 10 hours ago [-]
Brittleness is not a concern; "disc rot" is. The dyes used to make writable DVDs were organic (usually AZO), and break down starting at around the 17-year mark (earlier, if they were poorly made). They have some measure of redundancy built in, so you may not notice right away. The discs begin to look a bit "cloudy" at first. Eventually they become unreadable.
Go with inorganic Blu-ray media if you want longevity. Most HTL Blu-rays made currently will last around 100 years if properly stored. If you need longer there are M-Discs, but they are expensive, and rumor has it that ALL Verbatim 100GB Blu-rays are essentially M-Discs with different labels these days.
For all practical purposes, any Blu-ray larger than 25GB is probably inorganic HTL, but if you worry a lot you can also buy more expensive "archival grade" discs from Japan that have been vetted and tested.
mrob 12 hours ago [-]
I've personally never noticed brittleness in old optical discs (unlike the polystyrene jewel cases, which often turn brittle). I don't think shattering is likely, but if it's a concern some optical drives allow limiting the maximum spin speed. If the drive supports it you can temporarily set it with the -x option of the "eject" command from util-linux.
ogurechny 8 hours ago [-]
I've been using an old Symbian phone with the same Class Not-That-Good SD card bought back then. In the early 2010s, I copied a lot of MP3 files and ebooks there, and used the camera to take photos occasionally. Then it was no longer used for music and other needs, and the files just rested there. After about 10 years, I've decided to play some music on the phone, and these tracks had a lot of skips and rattle. Images copied from the card showed a lot of damage, too. So when someone on the internet posts how SD cards are a cheap and compact long term storage, I am not impressed. You probably need to refresh all previously stored data with each monthly backup.
It should be mentioned that the phone board often gets warm during operation or battery charging, and temperature is cited as an important harmful factor in a different comment.
So if you have some old files on an old device, and assume that they are still there because their records in the file system still look fine, you might be surprised.
havaloc 13 hours ago [-]
I bought about 20 flash drives in 2019 at work to parcel out when needed, once or twice a quarter for users.
I needed one last week, and had to throw most of them away; they had all died, presumably from dormancy, even new in the package.
Rewriting the data each year hides the actual issue here. I've had plenty of "nice" flash drives rot to hell after 18+ months of dormancy.
benterris 18 hours ago [-]
Does rewriting data help prevent bit rot? Does it mean powered drives can take advantage of it by periodically rewriting the same data over?
monster_truck 17 hours ago [-]
It depends on the type of flash being used and the controller managing it. That he did not even identify the chips should inform you of the extent to which these results can be trusted.
All I can say for sure is that you should not trust any flash for long-term storage, thumb drive or otherwise. In serious enough high-usage, high-heat environments, where everything working without problems or delay is part of what they pay us to be responsible for, it is standard practice to clone fresh images to NVMe drives every time, with multiple spares that can be swapped out in minutes when they inevitably fail anyway.
vel0city 17 hours ago [-]
It depends on how the flash modules are maintained and on their quality, but yes, freshly written data will generally mean better data retention on flash media.
Flash cells rely on being recharged, which may or may not happen often enough.
angry_albatross 18 hours ago [-]
Did you miss that there are 10 different drives, and so there are 10 different years of tests, each testing a completely untouched drive?
monster_truck 17 hours ago [-]
I don't think you're reading the results properly.
thinkling 17 hours ago [-]
I think they are reading it correctly. Year 1, they touched one drive and left 9 untouched. Year 2, they read one additional drive and left 8 untouched. Etc.
Springtime 17 hours ago [-]
Yes, it's also confirmed on the OP's blog linked in the post.
monster_truck 17 hours ago [-]
Those drives aren't being read
angry_albatross 14 hours ago [-]
What do you think I am reading incorrectly? The post seems pretty clear:
"I filled 10 32-GB Kingston flash drives with pseudo-random data."
"The years where I'll first touch a new drive (assuming no errors) are: 1, 2, 3, 4, 6, 8, 11, 15, 20, 27"
And from the blog:
"Q: You know you powered the drive by reading it, right?
A: Yes, that’s why I wrote 10 drives to begin with. We want to see how something works if left unpowered for 1 year, 2 years, etc."
quilombodigital 7 hours ago [-]
I was trying to roast this guy because I figured a burn-in test should accomplish the same thing he’s doing. But no... I had to ask the great oracle, GPT (General Purpose Truth), and it told me the guy is actually doing something worthwhile. How are trolls supposed to survive in this inhospitable era?
Apparently, multi-year storage tests are still valuable for validating whether those estimates match reality. Who knew...
somat 17 hours ago [-]
On a related subject: physical media, like a song album. I started by wondering if there were ever any solid-state distribution options (one company tried SD cards) and then started digging into the underlying storage tech to see if I could find a write-once, long-term-stable process.
First, the elephant in the room: why solid state? Because the drives to read the media are often the weak link. When the drives are no longer being manufactured, how hard is it to make one? Reading solid-state storage is a relatively low-precision electrical process compared to the high-precision mechanical process needed for most media.
First on the chopping block was bulk storage: it tends to be delicate and hard to read, with short lifespans. But if I limited myself to small storage there are some interesting options. Fusible PROMs were promising, but top out at a few megabytes. Mask ROMs? Does anyone offer a mask ROM service anymore?
Put a mask ROM into an SD card... no, SD cards are too physically small. For a song album we want something bigger to put album art on. A thing the size of the original Game Boy cartridge, with a USB interface and a mask ROM?
My conclusion, for that specific goal of indefinite future storage of a song album: vinyl records. Low-tech enough that it is easy to make a player for them.
Cyphase 2 hours ago [-]
I saw "reddit.com" and my first thought was r/DataHoarder.
rambambram 16 hours ago [-]
I could google it, but I would rather ask HN: what are the best pens (or pen(cil)/paper combination) for keeping written text as long as possible? I had some Stabilo pen which was very nice ergonomically, but the blue ink faded within a couple of years (lying on my windowsill in the sun, but still).
My guess is: regular graphite pencil on porous paper is best. Any ideas about further things I have to take into account?
adrian_b 2 hours ago [-]
For keeping written text as long as possible, it matters which risks the text is exposed to, because no writing method is equally resistant to all of them.
There are at least 4 dangers for a written text: mechanical rubbing, fading due to light, water and organic solvents (e.g. alcohol).
There are many pigment-based inks that are specified to be lightfast and resistant to water and organic solvents, according to various archiving standards. Such inks are available for fountain pens or they are used in certain kinds of roller pens.
If you use such inks on paper that is somewhat porous, they will also be resistant to rubbing. There are certain kinds of "permanent pens" which have excellent resistance to rubbing even when you write on surfaces like plastic, glass or metal (not only on glossy paper), and which may also be lightfast and waterproof. However, text written with such permanent pens is easily washed away with alcohol or other organic solvents (as is text written with ball-point pens).
So the answer depends on your goal, but usually what you want is either a roller pen or ink for a fountain pen that are clearly specified as being pigment-based, lightfast and waterproof, together with paper on which you have checked that rubbing does not remove the written text. When using fountain pens, one must check that the archival pigment-based ink is known to be compatible with the model of fountain pen, otherwise clogging may occur. (For example, I use pigment-based ink cartridges from Sailor Japan, seiboku or souboku, with Sailor fountain pens, so compatibility is guaranteed.)
While graphite-based pencils produce writing that is lightfast and resistant to solvents, in my experience the inherent rubbing of the sheets of paper when you handle the notebook, or whatever you had used for writing, leads over the years to a fading of the text, so I do not like this method.
lich_king 16 hours ago [-]
I don't think there's a simple answer. For example, someone recommended black ink on white paper, but it really depends on the composition of that ink. Inorganic pigments last forever, but the ink used in black sharpies actually fades pretty quickly.
Pencil definitely lasts if the paper is undisturbed. I have some paperwork that's 100+ years old and with legible pencil text. On the flip side, if the paper is handled a lot, the writing will gradually fade because graphite particles just sit on the surface and can flake off.
On some level, the medium is your main problem. Low-grade paper, especially if stored in suboptimal conditions (hot attic, moist crawlspace, etc), may start falling apart in 20 years or less. Thick, acid-free stock stored under controlled conditions can survive hundreds of years.
rambambram 15 hours ago [-]
Thanks for the insight.
Acid-free paper sounds like the way to go. Do you have experience with this? Or is it common knowledge? Just curious!
I also read letters from my grandparents, stored by my parents in a simple shoe box. No special conditions, just light-free and inside the home for decades. They were still very much readable. I did not pay enough attention, but I guess it was blue ink from back in the day that they used.
lich_king 15 hours ago [-]
> Do you have experience with this? Or is it common knowledge? Just curious!
I collect vintage stuff that sometimes comes with paperwork, usually after spending a decade or two stashed away in the attic.
fhdkweig 16 hours ago [-]
I vote for graphite on paper. Ink will run if the paper gets wet. Of all the damage that has occurred to my papers, water is the most common. I keep a copy of important phone numbers written inside my wallet in case I ever lose my phone. Between an unexpected rainstorm, to an unchecked pocket before putting pants in a washing machine, to a spilled drink, I have gotten my wallet wet several times. Every time I used ink, I had to rewrite the list, but now with graphite, it isn't a problem.
rambambram 15 hours ago [-]
Thanks! I appreciate the input.
Do you just use regular graphite pencils, like with the HB scale or something?
fhdkweig 15 hours ago [-]
I just use the same BIC mechanical pencils with #2 lead that I picked up in college. No reason to get complicated.
Havoc 12 hours ago [-]
I'd think thin copper sheet on something soft would also work. That indentation will probably outlast any sort of ink
wrboyce 14 hours ago [-]
In the UK “registrar’s ink” is used for marriage certificates, I believe it is supposed to be good for many hundreds of years.
sandworm101 16 hours ago [-]
Black ink on white paper, stored in a cool dark place, will last many decades. It may fade but will remain readable. Want centuries? Use skin parchment. Millennia? An engraving pen on glass. Going for longer? Take a grinder to a block of granite, but the real problem there is the lack of geologically stable storage on this planet.
nine_k 15 hours ago [-]
Granite is heavy and brittle. Instead, take a plate made of platinum or iridium and engrave the information on it. It offers excellent mechanical, chemical, and thermal durability. It can sink into volcanic lava and then be hammered back out of the resulting rock, intact. (Expensive, though.)
ian-g 14 hours ago [-]
An engraving pen on glass? Why not get some sheet glass and a stick of color.
Write it directly onto the glass _in glass_
rambambram 15 hours ago [-]
A couple of millennia might suffice. ;) Thanks for the input.
The engraving pen on glass is a good one. Any experience with it?
nullorempty 18 hours ago [-]
What's the simplest way to rewrite the data without actually copying the data? Like in place rewrite - you write what you read.
fhdkweig 18 hours ago [-]
I've seen "dd if=/dev/removable of=/dev/removable" suggested. I don't know if it actually works or if the OS optimizes it to a no-op.
valleyer 15 hours ago [-]
Certainly the OS can't optimize it to a no-op, since `dd` makes separate read and write syscalls.
I suppose your `dd` implementation itself could do so, but I don't know why it would.
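If you'd rather avoid `dd` entirely, the in-place refresh is just a read / seek-back / write loop, which is easy to script. A sketch (try it on a scratch file first; pointed at a raw device, any bug is unrecoverable):

```python
import os


def rewrite_in_place(path: str, chunk: int = 1 << 20) -> int:
    """Read each chunk and write it back to the same offset.

    Opening unbuffered (buffering=0) avoids userspace caching games, and the
    final fsync pushes everything to the device. Returns bytes rewritten."""
    total = 0
    with open(path, "r+b", buffering=0) as f:
        while True:
            data = f.read(chunk)
            if not data:
                break
            f.seek(-len(data), os.SEEK_CUR)  # back to the start of this chunk
            f.write(data)
            total += len(data)
        os.fsync(f.fileno())
    return total
```

Whether this actually re-programs the NAND cells is still up to the drive's controller, the same caveat as with `dd`; the host can only guarantee that a write command was issued for every block.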
piyh 16 hours ago [-]
The risk of catastrophic data loss from misuse of `dd` makes my hackles rise just looking at this.
I will never forget when I mixed up `if` and `of` during a routine backup.
`cat /dev/sda > /mnt/myDisk2` is so much safer, more explicit, and more in line with Unix norms. It's also faster, because you don't have to tune block size parameters.
Plus you can do `pv /dev/sda > /mnt/myDisk2` to get transfer speed details.
Friends don't let friends use `dd` where `cat` can do the same job.
jmb99 13 hours ago [-]
I stopped getting scared of `if` and `of` about a decade ago when I started explicitly saying (in my head) "input file" and "output file" rather than "if" and "of." You still can mess up the order, but imo no more easily than you can swap `cat in > out` for `cat out > in`.
> Friends don't let friends use `dd` where `cat` can do the same job.
Technically yes... but I like being able to explicitly set block sizes and force sync writes.
ogurechny 8 hours ago [-]
I think you are both arguing about how to fight a bear with your bare hands. To win that fight, you simply need to not fight the bear.
Let's say someone made an expansion board with a cool feature: there are 5 documented I/O addresses, but accessing any other address fries the stored firmware. What would you do? No, not leaving a lot of comments in code in CAPS LOCK. No, not printing the correct hexadecimal values in red to put the message on the wall. You make a driver that only allows access to the correct addresses, and configure the rest of the system to make sure that it can only work through that driver.
Let's say there's a loading bay at the chemical plant with multiple flanges. If strong acid from the tanker is pumped into the main acid tank, everything is fine. If it is pumped into any other tank, the whole plant may explode and burn. What should be done? No, not promising that drivers will be fired, then shot by the firing squad if they make a mistake. Each connection is independently locked, and the driver only gets a single matching key.
You have wonderful programmable devices that allow you to solve non-standard problems with non-standard tools. What should be done is making a wrapper for dd that just does not allow you to do anything you don't want to happen. Even the most basic script with checks and confirmation is enough.
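A minimal version of such a wrapper could look like this (a hypothetical sketch in Python; the allowlisted device name and confirmation flow are illustrative, not any existing tool): refuse to invoke dd unless the target is explicitly allowed and the user retypes the device name.

```python
import subprocess

# Illustrative allowlist: only devices you actually intend to overwrite.
ALLOWED_TARGETS = {"/dev/sdz"}

def safe_dd(src: str, dst: str, confirm=input, run=subprocess.run) -> bool:
    """Run dd only if `dst` is allowlisted and the user confirms it.

    `confirm` and `run` are injectable for testing. Returns True if dd
    was invoked, False if the request was refused.
    """
    if dst not in ALLOWED_TARGETS:
        print(f"refusing to write to {dst}: not in allowlist")
        return False
    if confirm(f"really overwrite {dst}? retype the device name to confirm: ") != dst:
        print("confirmation failed, aborting")
        return False
    run(["dd", f"if={src}", f"of={dst}", "bs=1M", "status=progress"], check=True)
    return True
```

Even this basic "allowlist plus retype-to-confirm" check removes the classic if/of swap failure mode.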
hpb42 16 hours ago [-]
Wouldn't a ZFS Scrub get the job done?
toast0 11 hours ago [-]
zpool scrub is what I would do, but it only rewrites data that needs it. The OP wanted to rewrite everything.
jmakov 20 hours ago [-]
Powered all the time on or powered off?
alnwlsn 19 hours ago [-]
OP says powered off.
shiroiuma 9 hours ago [-]
We really need that holographic data storage that they've been working on for decades now.
foxglacier 13 hours ago [-]
Beware that flash data lifetime is sensitive to temperature within the normal range people store things at. Store the drives in the roof space of a house, where temperatures can exceed 40C each day, and they might not last one year.
Definitely not a medium to passively store anything long term without power! Use hard drives or Blu-ray instead.
jmb99 13 hours ago [-]
The linked post demonstrates that 6 years in reasonably decent conditions is perfectly fine.
> they might not last one year.
> Definitely not a medium to passively store anything long term without power!
Do you have any evidence to back up this claim? I'm much more interested in data than fear mongering.
foxglacier 13 hours ago [-]
He stored them on a shelf which is probably 25C max. and that has an order of magnitude longer life than at 40C. [1]
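The temperature dependence is usually modeled with a JEDEC-style Arrhenius acceleration factor. A rough sketch (the 1.1 eV activation energy is an assumption for illustration; real values vary by flash type and failure mechanism):

```python
import math

BOLTZMANN_EV = 8.617e-5  # Boltzmann constant in eV/K

def retention_acceleration(t_use_c: float, t_stress_c: float,
                           ea_ev: float = 1.1) -> float:
    """Arrhenius acceleration factor: how many times faster charge
    leaks at t_stress_c than at t_use_c, for activation energy ea_ev."""
    t_use = t_use_c + 273.15       # convert Celsius to Kelvin
    t_stress = t_stress_c + 273.15
    return math.exp((ea_ev / BOLTZMANN_EV) * (1 / t_use - 1 / t_stress))

# With these assumed numbers, storage at 40C ages roughly 8x faster
# than at 25C -- a 15C difference really does matter.
```

The exact multiplier swings a lot with the assumed activation energy, which is why published retention figures at different temperatures can differ by an order of magnitude.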
That's good. I want to keep some institutional knowledge and photos in "cold storage", and cloud subscriptions tied to a credit card and password are completely nonviable.
I'll probably get a spinner and a flash drive and hope one of them survives the years.
fhdkweig 17 hours ago [-]
If privacy is your primary problem with cloud storage, I would suggest veracrypt containers. And if you aren't storing too much data, I would also suggest DVD/BluRay optical media with DVDisaster and PAR2 archives. I keep a DVD spindle in a safe deposit box that gets updated each year.
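PAR2 adds actual Reed-Solomon recovery data; even without it, a simple checksum manifest lets you verify each yearly disc before trusting it. A minimal sketch (file layout illustrative):

```python
import hashlib
import os

def build_manifest(root: str) -> dict:
    """Map each file path under `root` (relative) to its SHA-256 digest."""
    manifest = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in sorted(files):
            path = os.path.join(dirpath, name)
            h = hashlib.sha256()
            with open(path, "rb") as f:
                # Hash in 1 MiB chunks so large files don't load into RAM.
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    h.update(chunk)
            manifest[os.path.relpath(path, root)] = h.hexdigest()
    return manifest

def verify_manifest(root: str, manifest: dict) -> list:
    """Return the relative paths whose current hash no longer matches."""
    current = build_manifest(root)
    return [p for p, digest in manifest.items() if current.get(p) != digest]
```

This only detects bitrot; it can't repair anything, which is exactly what the PAR2 recovery files are for.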
rpcope1 17 hours ago [-]
Unless the data is huge, you're probably going to be better off with M-Disc Blurays or DVDs, as they're explicitly designed for what you're trying to do.
https://fight-flash-fraud.readthedocs.io/en/stable/
I remember years ago working on the Wii, and there was a restriction on how often you could read the flash to avoid premature wearing. Not sure if that was just the specific type of storage, as googling suggests that NAND is subject to this and NOR isn't. I think pretty much all USB drives now use NOR flash, so maybe this isn't actually an issue any more.
DRAM works that way but flash doesn't. Read disturb is a different issue.
pretty much all USB drives now use NOR flash
Nope, NOR flash is much more expensive than NAND so NOR is only used for firmware and everything else is NAND.
This only happens very rarely, though more frequently as NAND flash goes QLC and beyond.
Besides, other experiments have shown that data remanence is way more of an issue with drives that are almost completely worn out (way beyond their specified TBW) and about to croak. Even then you only get rare bitrot that can be checked for and compensated quite cheaply in most cases.
If you take fresh media, write it just once or a few times at most, use substantial overprovisioning to keep the drive in its fast pseudo-SLC mode, and reread the media periodically, NAND can be a good enough storage system for most casual needs.
While I've had generally solid experience with SanDisk for almost 20 years, and had a few old drives (which I hear are SLC-based, so it's not surprising) hold files for over 5 years with no issue, I recently almost lost over 4 years of photos.
I had purchased some Lexar drives from Costco about 2 years ago since they were dual interface (USB-A / USB-C), and they were useful for just getting some pictures off my phone. I usually don't rely on such a setup for the long term, but as with all things I was delayed in tending to it. Since there were 2 per box, I just copied the photos to both, and diffed them several times to make sure they were exact copies.
After 24 months, one of the drives had a 95% loss: almost every picture was corrupted, with the bottom half or so cut off. The other drive surprisingly seemed fine; it had been plugged in every 6-9 months as I recall, since I wanted to browse it a few times, and it seems that this saved the volume. Upon further inspection the good drive had still lost 10 pictures out of about 5 thousand, so it wasn't perfect.
Lexar.
https://www.ebay.com/itm/176810492981?chn=ps&_trkparms=ispr%...
If these are JPEGs with a grey or green lower half, it's likely only a few 16x16 macroblocks are corrupted and you can recover the rest.
This cannot be done programmatically because you have to guess what the average colour of the block was, but it can be worth it for precious pictures.
With JPEG one of the big problems is that the data is Huffman-encoded without any inter-block markers (except maybe Restart Markers, if you're lucky). This means that a single bitflip can result in a code changing length, causing frameshifts in many subsequent blocks and rendering them all undecodable. If you have a large block of missing data (e.g. a 4k-byte sector of zeros), then you have to guess where in the image the bitstream resumes, in addition to guessing the value of the running DC components.
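Before attempting any recovery, you can at least triage which files are damaged. A JPEG starts with an SOI marker (FF D8) and ends with an EOI marker (FF D9), so a quick sketch to flag likely-truncated files (it catches missing tails, not mid-stream bitflips, and some valid files carry trailing metadata after EOI, so treat it as a heuristic):

```python
import os

def jpeg_looks_truncated(path: str) -> bool:
    """Return True if the file lacks the JPEG SOI header or the EOI
    trailer -- the typical signature of a cut-off bottom half."""
    if os.path.getsize(path) < 4:
        return True  # too short to even hold both markers
    with open(path, "rb") as f:
        head = f.read(2)
        f.seek(-2, os.SEEK_END)  # jump to the last two bytes
        tail = f.read(2)
    return head != b"\xff\xd8" or tail != b"\xff\xd9"
```

Running this over a recovered photo dump quickly separates files worth hand-repairing from ones that are intact.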
https://news.ycombinator.com/item?id=45274277 (Apple Photos corrupts images on import - images truncated)
https://www.tomshardware.com/pc-components/ssds/fake-samsung...
For the rest of us, a USB spinning rust hard drive formatted as exFAT is going to be hard to beat. You'll be able to plug this into virtually any computer made in the next few decades (modulo a USB adapter or two) and just read it. They are cheap (even allowing for the rising cost of storage), fast, and most importantly, they are easy. The data is stored magnetically, so is not susceptible to degradation just from sitting like SSDs or flash drives are.
Of course, you should not store any important data on only ONE drive. The 3-2-1 backup rule applies to archives as well: 3 copies, 2 different media, 1 off-site.
(Though probably not appropriate if you're primarily not a Mac user, or won't be in the future.)
The Linux HFS+ driver is basically unmaintained, and cannot write to journaled disks. On Windows, the only choice is a paid driver. I guess it's fine if you're strictly a Mac user, but it's a real problem if you need to access the disk on another machine. Even if you don't, I still wouldn't trust Apple for long-term support of anything.
Meanwhile exFAT has native support on Windows, Mac, and Linux, and there are drivers for BSDs and others.
So 20 years down the line, you'll certainly have something that can read an exFAT drive without much if any pain, regardless of which platform you're using at the time. HFS+? Who knows.
That said, I'd consider ZFS or btrfs for HDD archival. Granted broad (Mac/Windows) support is weaker than FAT, but at least the filesystems are completely open source. But what really makes them interesting is their automatic data checksumming to detect (and possibly repair) bitrot, which is particularly useful for archival.
In any case, if the situation changes, I expect there'll be enough lead time for me to adjust my strategy -- the failure scenario is completely different than rotting physical media.
Not even kidding.
With any other media, you have to hope that the drives are still available. Paper routinely lasts hundreds of years and we all have readers built right in.
If we’re talking the average tech-illiterate to literate-but-cost-and-space-constrained person, probably Blu-Ray. A burner+reader combo with a stack of dual-layer discs is probably cost-effective. High-capacity HDDs would probably be equally effective if you can guarantee that they’re stored away from accidents and mishandling, but if it requires a SATA-to-USB adapter with assembly then it might possibly be out of reach for some consumers, and any risk of damage from movement could rule it out entirely.
If we’re talking tech-savvy consumers who don’t have the IT budget of a corporation, maybe LTO-5 or LTO-6 tapes could work. Tapes themselves are very affordable and have a good shelf lifespan. Used libraries can be had for under $600. The primary issues would be finding one with an interface that works with your existing equipment and software to support tape read and write.
Tape.
Obviously extreme prosumer, but for long-term archival of lots of data, LTO tape wins in several ways:
- Discs just aren't actually that high capacity relative to modern HDD capacities. BD XL maxes out at 128 GB, while there are now 30 TB HDDs readily available. That's 240 discs per HDD. Modern LTO tapes store 12-18 TB, or 2-3 tapes per HDD.
- Anything flash-based is a bad choice for long-term storage. SSDs are very fast, but also (relatively) expensive at 15-20¢/GB. Reputable SD cards are in the same neighborhood. Despite the OP redditor's results here, flash is only expected to retain data for 5-10 years.
- Tape is the absolute lowest cost-per-GB you can find of any storage medium. At the moment, LTO 8/9 tape can be had on Amazon for ½¢/GB. Compare with BD-R at 2¢/GB, or BD-R XL M-disc at 15¢/GB. HDDs (spinning rust) are 2-3¢/GB.
- Consider also write speed. LTO can write 300+ MB/s. BD 16x maxes out around 68 MB/s.
- Manufacturers rate tapes for 30 years sitting on a shelf, and it wouldn't be surprising if they still read after 50 years¹. Plain BD-R lasts 5-20 years. M-disc is the interesting outlier, rated 100-1000 years.
Of course, the biggest problem with tape is the drives. While the media is dirt cheap, the drives are crazy expensive. It looks like you can pick up a used LTO-6 drive (2.5 TB tapes) on ebay for around $500. A brand new LTO-9 drive (18 TB tapes) will be $4000-5000.
In terms of breakeven points, a used LTO-6 drive + tapes beats plain BD after about 25 TB. Because of the cost of M-discs, they stop making sense after 1-2 TB. Purely on cost, a brand new LTO-9 drive + tapes doesn't beat used LTO-6 + tapes until about 800 TB (LTO-9 tape is ½¢/GB while LTO-6 tape is 1¢/GB), but there's definitely a point in there where the larger capacity of LTO-9 makes dealing with the physical media a whole lot easier.
So if you're looking for long-term storage for your photo album, a M-disc BD XL is probably a good choice. If you only have a few hundred GB of data, a couple discs + burner can be had for $300 or so, and you can be pretty sure your mom could manage to read the disc if necessary.
But if you're looking to back up your 100 TB homelab NAS, discs are not really feasible. You'll have to spend the next month swapping discs every 25 minutes², and then deal with your new thousand disc collection. Here's where a used LTO-6 drive makes a lot of sense. This is a real sweet spot if you can find a decent drive; all-in you'd spend about $1500 to back up your 100 TB.
This is what I do to backup my NAS — found an old LTO-6 drive and got a bunch of tapes. The drive plugs in to a SAS port (you might need a HBA PCI card, $50), and that's pretty much it. Linux has the drivers built in; it will show up as /dev/st0 and you can just point tar³ at it.
Finally, just to compare with cloud options, storing that 100 TB in AWS Glacier Deep Archive would run you slightly over $100/mo, so you're ahead with your own tapes after little over a year. Oh and don't forget to set aside an extra $8000 for data transfer fees should you ever actually want to retrieve your data lol.
---
¹ eg the Unix v4 tape that was recently found and successfully read after 52 years — https://news.ycombinator.com/item?id=45840321
² Or get a disc-swapping robot, but those run $4000-5000, at which point… you're better off with a brand new tape drive.
³ Thus using the Tape ARchiver program for its original purpose. Use -M to span tapes, tar will prompt you to swap.
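The LTO-6 vs LTO-9 comparison above is just a linear cost crossover, which is easy to sanity-check. A sketch using the rough prices quoted in this comment (assumptions, not current market data):

```python
def breakeven_tb(drive_a: float, per_tb_a: float,
                 drive_b: float, per_tb_b: float) -> float:
    """Capacity (TB) at which option B's total cost drops below A's.

    Total cost of an option = drive price + capacity * tape price/TB.
    Assumes A has the cheaper drive but pricier tape (per_tb_a > per_tb_b).
    """
    return (drive_b - drive_a) / (per_tb_a - per_tb_b)

# Used LTO-6: ~$500 drive, ~1c/GB = $10/TB tape.
# New LTO-9: ~$4500 drive, ~0.5c/GB = $5/TB tape.
crossover = breakeven_tb(500, 10, 4500, 5)  # -> 800.0 TB, matching the ~800 TB estimate above
```

Plug in your own local prices; the crossover moves around a lot with drive cost, which is the dominant term at small capacities.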
For the kids, I'd rather make physical photo albums.
Go with inorganic Blu-ray media if you want longevity. Most HTL Blu-rays made currently will last around 100 years if properly stored. If you need longer there are M-Discs, but they are expensive, and rumor has it that ALL Verbatim 100GB Blu-rays are essentially M-Discs with different labels these days.
For all practical purposes any Blu-ray larger than 25GB is probably inorganic HTL, but if you worry a lot you can buy more expensive "archival grade" discs from Japan that have been vetted and tested.
It should be mentioned that a phone's board often gets warm during operation or battery charging, and temperature is cited as an important harmful factor in a different comment.
So if you have some old files on an old device, and assume that they are still there because their records in the file system still look fine, you might be surprised.
I needed one last week, and had to throw most of them away; they had all died, presumably from dormancy, even new in the package.
All I can say for sure is that you should not trust any flash for long term storage, thumb drive or otherwise. In serious, high-usage, high-heat environments, where everything working without problems or delay is part of what they are paying us to be responsible for, it is standard practice to clone fresh images to NVMe drives every time, with multiple spares that can be swapped out in minutes when they inevitably fail anyway.
Flash media relies on recharging, which may or may not happen often enough.
"I filled 10 32-GB Kingston flash drives with pseudo-random data."
"The years where I'll first touch a new drive (assuming no errors) are: 1, 2, 3, 4, 6, 8, 11, 15, 20, 27"
And from the blog: "Q: You know you powered the drive by reading it, right? A: Yes, that’s why I wrote 10 drives to begin with. We want to see how something works if left unpowered for 1 year, 2 years, etc."
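The seeded-stream trick (the same idea as the hddrand tool mentioned at the top of the thread) means you never store a reference copy: regenerate the stream from the seed and compare. A sketch using Python's `random` as a stand-in for a real CSPRNG:

```python
import random

CHUNK = 1 << 20  # work in 1 MiB chunks

def fill(path: str, size: int, seed: int) -> None:
    """Write `size` bytes of seed-determined pseudo-random data to `path`."""
    rng = random.Random(seed)
    with open(path, "wb") as f:
        remaining = size
        while remaining:
            n = min(CHUNK, remaining)
            f.write(rng.randbytes(n))
            remaining -= n

def verify(path: str, size: int, seed: int) -> int:
    """Regenerate the stream from the seed and return the offset of the
    first mismatching byte, or -1 if the data reads back intact."""
    rng = random.Random(seed)
    with open(path, "rb") as f:
        offset = 0
        remaining = size
        while remaining:
            n = min(CHUNK, remaining)
            expected = rng.randbytes(n)
            actual = f.read(n)
            if actual != expected:
                for i, (a, b) in enumerate(zip(actual, expected)):
                    if a != b:
                        return offset + i
                return offset + len(actual)  # short read: file truncated
            offset += n
            remaining -= n
    return -1
```

All you have to remember across the years is the seed and the size; the drive under test holds the only copy of the data.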
Apparently, multi-year storage tests are still valuable for validating whether those estimates match reality. Who knew...
First, the elephant in the room: why solid state? Because the drives needed to read the media are often the weak link. When the drives are no longer being manufactured, how hard is it to make one? Reading solid state storage is a relatively low-precision electrical process, compared to the high-precision mechanical process needed for most media.
First on the chopping block was bulk storage: it tends to be delicate, hard to read, and short-lived. But if I limited myself to small storage there are some interesting options. Fusible PROMs were promising but top out at a few megabytes. Mask ROMs? Does anyone offer a mask ROM service anymore?
Put a mask ROM into an SD card... no, SD cards are too physically small. For a song album we want something bigger to put album art on. Something the size of an original Game Boy cartridge, with a USB interface and a mask ROM?
My conclusion, for that specific goal of indefinite future storage of a song album: vinyl records. Low-tech enough that it is easy to make a player for them.
My guess is: regular graphite pencil on porous paper is best. Any ideas about further things I have to take into account?
There are at least 4 dangers for a written text: mechanical rubbing, fading due to light, water and organic solvents (e.g. alcohol).
There are many pigment-based inks that are specified to be lightfast and resistant to water and organic solvents, according to various archiving standards. Such inks are available for fountain pens or they are used in certain kinds of roller pens.
If you use such inks on paper that is somewhat porous, they will also be resistant to rubbing. There are certain kinds of "permanent pens", which have excellent resistance to rubbing even when you write on surfaces like plastic, glass or metal, not only on glossy paper, and which may also be lightfast and waterproof, but the text written with such permanent pens is easily washed with alcohol or other organic solvents (like also for text written with ball-point pens).
So the answer depends on your goal, but usually what you want is either a roller pen or ink for a fountain pen that are clearly specified as being pigment-based, lightfast and waterproof, together with paper on which you have checked that rubbing does not remove the written text. When using fountain pens, one must check that the archival pigment-based ink is known to be compatible with the model of fountain pen, otherwise clogging may occur. (For example, I use pigment-based ink cartridges from Sailor Japan, seiboku or souboku, with Sailor fountain pens, so compatibility is guaranteed.)
While graphite-based pencils produce writing that is lightfast and resistant to solvents, in my experience the inherent rubbing of the sheets of paper when you handle the notebook, or whatever you had used for writing, leads over the years to a fading of the text, so I do not like this method.
Pencil definitely lasts if the paper is undisturbed. I have some paperwork that's 100+ years old and with legible pencil text. On the flip side, if the paper is handled a lot, the writing will gradually fade because graphite particles just sit on the surface and can flake off.
On some level, the medium is your main problem. Low-grade paper, especially if stored in suboptimal conditions (hot attic, moist crawlspace, etc), may start falling apart in 20 years or less. Thick, acid-free stock stored under controlled conditions can survive hundreds of years.
Acid-free paper sounds like the way to go. Do you have experience with this? Or is it common knowledge? Just curious!
I also read letters from my grandparents, stored by my parents in a simple shoe box. No special conditions, just light-free and inside the home for decades. They were still very much readable. I did not pay enough attention, but I guess it was blue ink from back in the day that they used.
I collect vintage stuff that sometimes comes with paperwork, usually after spending a decade or two stashed away in the attic.
Do you just use regular graphite pencils, like with the HB scale or something?
[1] https://www.ni.com/en/support/documentation/supplemental/12/...