Note: if you'd prefer to consume this essay as a presentation, you can watch it here.
"Lost coins only make everyone else's coins worth slightly more. Think of it as a donation to everyone." - Satoshi Nakamoto
Bitcoin is a bearer asset. Much like with cash or gold, if it's lost or destroyed, you're out of luck. Your loss becomes everyone else's collective gain. If you don't wish to make such a donation to the world then you want to have robust redundant backups of your keys!
There are tons of guides, products, and services that promise to help you in this endeavor. Since one of the principles of Bitcoin is to verify rather than trust, over the past 4 years I've stress tested and reviewed 70 different metal seed backup devices; you can read my results and ratings here:
I explain my testing methodology here; my goal has been to see how well a device fares against the most common forms of loss due to fire, flood, and deformation. However, there are plenty of other (qualitative) aspects of backup device design that I have not explicitly rated, mainly around user experience.
The following is an exhaustive list of every aspect of seed phrase backups that I have formed an opinion about after extensive use of a wide variety of backup designs.
The optimal backup device will have the following characteristics:
- Stainless steel or titanium
- Small but thick form factor
- A single solid piece of metal
- Center punched / electrolysis etched data
- Template for decoding data inscribed onto the device
Now let's dive deeper into why these attributes are desirable.
On Material Choice
In order to have a high melting point, high corrosion resistance, high tensile strength, and high hardness, there are only 2 common materials that fit the bill.
- stainless steel
- titanium
It's worth noting that not all stainless steel is the same; there are 4 principal types (austenitic, duplex, ferritic, martensitic) and within those types are many different grades. You'll want austenitic steel because it's the most corrosion-resistant. Within the austenitic steel types, there are two main grades – grade 304 and grade 316.
Both have a high melting point.
- Grade 304: 1400-1450°C (2552-2642°F)
- Grade 316: 1375-1400°C (2507-2552°F)
They also both have high tensile strength.
- Grade 304: roughly 621 MPa
- Grade 316: roughly 579 MPa
When comparing 304 vs. 316 stainless steel, one major difference is resistance to chlorides such as salt. Grade 316 is more resistant to the elements, making it a more desirable stainless steel for things like maritime applications. However, I recommend going one step further and using 316L stainless steel because it has a lower carbon content, which provides an even greater level of intergranular corrosion resistance. This becomes especially important following the stress of surface deformation caused by markings indented onto the surface of the metal.
Titanium, on the other hand, has far fewer grades to choose from. You'll probably want either grade 4, which has the highest tensile strength, or grade 7 which is the most corrosion resistant.
Don't cheap out and use aluminum unless you want a pile of slag.
On Size & Form Factor
Just as the KISS (Keep it Simple, Stupid) principle is a great rule to follow when designing robust devices, keeping devices small also seems to make them less susceptible to failure. An added bonus is that they're easier to conceal. The flip side is that smaller devices are easier to lose.
All other things being equal, my testing leads me to believe that small devices are less prone to deformation. To be specific, it's not about the total volume taken up by the device. It's really a question of how thin / spread out the device is. The longer / wider a device is, the more susceptible it will be to being flexed because it's likely to be thin relative to the other 2 dimensions. To make a more quantitative recommendation, I would recommend metal thickness to be a minimum of 5mm and preferably over 1 cm in order to minimize flexibility. In terms of robustness against deformation, the best devices I've tested are in the shape of cylinders / hexagonal rods that are thick and solid.
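As a rough illustration of why thickness matters so much, here is a minimal Python sketch. It assumes a simple flat rectangular plate bending elastically, where stiffness scales with the cube of the thickness for a fixed width, length, and material. That is a textbook approximation rather than something measured in my tests, and the function name is purely illustrative.

```python
def relative_bending_stiffness(thickness_mm: float, reference_mm: float = 5.0) -> float:
    """Stiffness of a flat plate relative to a reference thickness.

    Assumes the same width, length, and material, so only the
    thickness-cubed term of the rectangular cross-section changes.
    """
    return (thickness_mm / reference_mm) ** 3


for t in (1.5, 5.0, 10.0):
    ratio = relative_bending_stiffness(t)
    print(f"{t:>4} mm plate: {ratio:.3f}x the stiffness of a 5 mm plate")
```

Under that assumption, going from 5 mm to 1 cm buys far more than double the resistance to flexing, while a thin plate gives almost all of it away.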
Total volume, on the other hand, plays a role in thermal stress. A very large plate, especially multiple stacked plates, will retain a ton of heat when it's in a fire. This is not a problem in and of itself, but it does become a problem if the device is then exposed to a large volume of water (like a fire hose) and it rapidly cools. This cooling can result in deformation, and if the device has fasteners holding it together, they may fail as a result.
On Post-Deformation Recovery
When designing the shape of a device, imagine how that shape will change if it is crushed. Will the data still be legible? Will the device seize up and be difficult to open?
Seizing up is not a critical failure; it just means that your average toolbox won't be able to help you access the data. Of course, if we're talking about a non-trivial amount of money at stake then you'll certainly be happy to find a local machine shop with heavy duty equipment if necessary.
Why create a sandwich of plates?
- To lock the device against physical attackers.
- To provide extra protection against stressors / scratches.
Keeping a device safe from physical attackers is simply not practical to do with the design of the device itself; at best you can store your backup in a highly secure location like a safe / vault, or hidden somewhere that no one is likely to look. You should assume that anyone who can get their hands on the device can access the data within. For more guidance on physical protection, read the "implementation" guide linked at the very bottom of this article.
From extensive testing I've found that as long as you are choosing the best materials and data encoding, you shouldn't need an extra layer of metal for protection. The downside to sandwiching is that depending upon the design you may now be more susceptible to loss due to fastener failure and device separation. You're also more susceptible to other forms of fastener failure that degrade the user experience. I'll delve into fasteners next.
Fasteners are a potential point of failure. I've seen some melt / fuse to the device because they were made from cheap materials. I've seen others dissolve because they could not withstand acidity. Most commonly, I've seen fasteners completely seize up a device that has been deformed by crushing.
To keep it simple: if you must use a fastener, avoid using both a screw AND a nut - it's better to drill threaded holes directly into the device itself so that your fasteners screw into it.
There are also dilemmas when it comes to material choice for fasteners. For example, titanium galls with itself, and steel fasteners will galvanically corrode over time if mixed with titanium.
In general I think it's best to design devices WITHOUT the need for fasteners. However, if you are using a threaded fastener you should use one made of a hardened material and you want the head to be as large as possible. Why? Because I've seen many a dainty fastener end up getting stripped when trying to open it. This holds true for Phillips screws, hex screws, and star screws. If you must use a screw I'd use a flathead and make sure the groove is deep, not shallow.
On Total Number of Parts
KISS means fewer parts is better. But what about devices that use many parts for the encoded data? How bad would it be if the device is damaged and the parts get jumbled up / separated / lost? You have to think in terms of combinatorics.
For a 24 tile situation, if you lose the ordering then you may need to try up to 6.2 x 10^23 (i.e. 24!) combinations to find your seed phrase. If you know the ordering but have lost tile(s) then you'll have to try up to 2048^n combinations, where n is the number of lost tiles. The latter becomes computationally worse than the former once more than 7 tiles are lost.
If you're using one tile per letter (4 letters per word * 24 words = 96 tiles total) and they get jumbled up... that's 9.9 x 10^149 (i.e. 96!) possible combinations. If you lose tiles then it gets more complicated to calculate complexity because it depends on the distribution of lost tiles and how many are still in order.
Long story short, if you are using many metal tiles then it should be 1 tile per seed word, preferably with the word's order number on the tile itself.
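The worst-case figures above are easy to reproduce. This is a minimal Python sketch using only the standard library; the function names are illustrative, and the counts are upper bounds that ignore any help you'd get from the mnemonic's checksum.

```python
import math

WORDLIST_SIZE = 2048  # number of words in the BIP39 word list


def jumbled_orderings(tiles: int) -> int:
    """Worst-case orderings to try when every tile survives but the order is lost."""
    return math.factorial(tiles)


def lost_tile_guesses(lost: int) -> int:
    """Worst-case guesses when the order is known but `lost` word tiles are missing."""
    return WORDLIST_SIZE ** lost


print(f"24 word tiles jumbled:   {jumbled_orderings(24):.1e}")
print(f"96 letter tiles jumbled: {jumbled_orderings(96):.1e}")

# Smallest number of lost word tiles that is worse than a fully jumbled set of 24.
crossover = next(
    k for k in range(1, 25) if lost_tile_guesses(k) > jumbled_orderings(24)
)
print(f"lost tiles become the bigger problem at {crossover} missing tiles")
```

With 7 tiles missing you are still slightly better off than with a jumbled set of 24; at 8 missing tiles the search space is hundreds of times larger.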
In an effort to minimize the device size, some manufacturers cram a ton of data into very small spaces. This can make it hard to store the data if your hand-eye coordination is not very precise. It can also make it difficult to read the data if you don't have great eyesight.
The device pictured here performed superbly on all of my stress tests, but it may be challenging for some people to use.
On Data Storage
- Full laser engraving by manufacturer
- Pre-laser engraved tiles
- Pre-stamped tiles
- Center Punch
- Electric Engraving
- Freehand Engraving
As someone who has stamped thousands of letters over the past few years, I hate letter stamping with a passion. Why?
- It's easy to miss with the hammer and injure yourself.
- It's hard to get consistently centered strikes on a stamp that is held perfectly perpendicular to the surface. As such you'll end up with light strikes, double strikes, and plenty of stamp bits flying across the room.
- It's a lot more user friendly to use a jig, but reliable jigs are hard to come by. Most are plastic and tend to fail after a couple hundred strikes. Even the best metal jig and anvil set I've used ended up deforming after ~400 strikes.
- It's easy to screw up the orientation of the bit and end up with a letter rotated the wrong way. While this is not a critical issue, it's an aesthetic one.
The only 2 types of data storage I'd recommend are either electrolysis or center punch. Etching & stamping are arduous experiences in comparison to a simple center punch. It’s hard to screw up a center punch strike and even if you do get a light strike, it’s easy to perform a second strike in the same spot without creating a “ghosting” effect that actually makes it harder to read.
On Data Encoding
What are we encoding? Ultimately it's just a 32 byte (256 bit) number. There are numerous ways you can encode this data...
24 word mnemonic: lazy scale mix join hospital swamp furnace move spoil climb volcano around current obscure arrange ladder life first brush salt man exhaust gold autumn
Why is a mnemonic superior? It has a built-in checksum (that validates the integrity of the data) and it's more resilient against partial data loss. Think of it this way - if you stored a binary / decimal / hexadecimal encoding and one or two characters were destroyed or read incorrectly, you'd have no idea that it's wrong.
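To make the checksum concrete, here is a minimal Python sketch of how a 24 word BIP39 mnemonic carries 8 checksum bits alongside its 256 bits of entropy. It works with word list positions (0 to 2047) instead of the words themselves so that it doesn't need to ship the 2048 word list. The function names are illustrative and not taken from any wallet library.

```python
import hashlib


def entropy_to_indices(entropy: bytes) -> list[int]:
    """Split 256 bits of entropy plus an 8 bit checksum into 24 word indices."""
    if len(entropy) != 32:
        raise ValueError("this sketch only handles 32 byte (24 word) seeds")
    checksum = hashlib.sha256(entropy).digest()[0]  # first 8 bits of SHA-256
    bits = (int.from_bytes(entropy, "big") << 8) | checksum  # 264 bits total
    # 264 bits / 11 bits per word = 24 indices, most significant group first.
    return [(bits >> (11 * (23 - i))) & 0x7FF for i in range(24)]


def indices_valid(indices: list[int]) -> bool:
    """Recompute the checksum from the entropy bits and compare it to the stored one."""
    if len(indices) != 24 or any(not 0 <= i < 2048 for i in indices):
        return False
    bits = 0
    for index in indices:
        bits = (bits << 11) | index
    entropy = (bits >> 8).to_bytes(32, "big")
    return hashlib.sha256(entropy).digest()[0] == (bits & 0xFF)


indices = entropy_to_indices(bytes(32))  # all-zero entropy, for demonstration only
print(indices_valid(indices))  # True

damaged = indices[:-1] + [indices[-1] ^ 1]  # simulate misreading the final word
print(indices_valid(damaged))  # False
```

Keep in mind that 8 bits of checksum only catches a random misread about 255 times out of 256, so treat a passing checksum as a strong hint rather than a guarantee.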
When it comes to actually inscribing your data on a device, avoid novel (unique) schemes. If you don't know HOW to read / decode your data then it becomes irrelevant if the device survives a catastrophe. You should assume that it may be decades after creation of your device that you may find yourself needing to use it. As such, you don't want to use any proprietary encoding scheme that has fallen out of use.
One trend that has become apparent is more and more devices being designed with the intention of encoding the BIP39 line number of a seed phrase word rather than the word's actual letters. While this does not change the AMOUNT of data required to be stored on the device (4 letters / numbers either way) it does have some other effects.
- If the device is grid based you only need 10 possible chars rather than 26. This means you can design center punch style plates more compactly.
- The downside is that it's less resilient to partial data loss. Let's say you lose legibility of 1 character in a word. When storing the words there will be MAYBE 2 possible words it could be. When storing line numbers there will be up to 10 possible words. Let's say you lose legibility of 2 characters. When storing the words there will be ~10 to ~20 possible words it could be. When storing line numbers there will be up to 100 possible words.
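The line number side of that comparison can be checked without a word list. The sketch below assumes line numbers are stored as 4 digits with leading zeros, counted from 1 to 2048, with "?" standing in for an illegible digit; the function name and the pattern format are illustrative.

```python
from itertools import product


def candidate_line_numbers(pattern: str, first: int = 1, last: int = 2048) -> list[int]:
    """Return every valid line number matching a pattern like '0?12'."""
    # Each position is either the known digit or all ten possibilities.
    slots = ["0123456789" if ch == "?" else ch for ch in pattern]
    numbers = (int("".join(digits)) for digits in product(*slots))
    return [n for n in numbers if first <= n <= last]


print(len(candidate_line_numbers("0?12")))  # 10
print(len(candidate_line_numbers("0??2")))  # 100
print(len(candidate_line_numbers("?012")))  # 3
```

The last example shows why it's "up to" 10: losing the leading digit leaves fewer candidates because anything above 2048 is not a valid line number.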
One minor thing that irks me about the devices that use binary encodings of the line number of the word is that the "2048" column feels like a waste of space - it will never be used except for the word "zoo." This could be eliminated by starting the line number count at 0 instead of 1, thus the word "abandon" would be represented by an empty row.
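Here is a small Python sketch of that point, assuming one column per power of two with the highest value on the left. The function name and the "o" / "." rendering of punched and blank positions are illustrative, not any particular product's layout.

```python
def punch_row(line_number: int, columns: int) -> str:
    """Render one row: 'o' is a punched divot, '.' is blank, highest column first."""
    if not 0 <= line_number < 2 ** columns:
        raise ValueError("number does not fit in the given columns")
    return "".join(
        "o" if (line_number >> bit) & 1 else "." for bit in reversed(range(columns))
    )


# Counting from 1: "zoo" is 2048, which needs a 12th column all to itself.
print((2048).bit_length(), punch_row(2048, 12))

# Counting from 0: "zoo" is 2047 and fits in 11 columns; "abandon" is an empty row.
print((2047).bit_length(), punch_row(2047, 11))
print(punch_row(0, 11))
```

One trade-off of counting from 0 is that an empty row becomes meaningful data, so a row you forgot to punch would be indistinguishable from "abandon."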
On Templates and Obfuscation
For security against physical attackers, some folks have suggested encoding data with a center punch as divots onto blank metal without any other markings so that a casual observer would never guess that it's a representation of a valuable seed phrase.
The downside here is that you're basically turning your backup into a 2-of-2 scheme. If you don't also have a redundant backup of the template then your unique data backup is worthless. A less likely but still noteworthy issue is that sufficient deformation of the device may make it difficult to apply the template in order to decode the data.
I recommend that the template for your device's data encoding be permanently etched onto the device itself. That means the grid, seed word numbers, etc, all need to be etched as deeply as the data for the words themselves will be. The etching method for the template isn't particularly relevant, whether it's via laser etching / electrolysis / stamping - as long as it's deep. I've seen plenty of devices fail because the template was either printed on with ink or because the etching for the template was relatively light and ended up being erased when subjected to heat or corrosion.
There are dozens of companies at time of writing that do nothing but specialize in manufacturing metal seed phrase backup devices. If you order a device from a company, you should assume that your shipping information will get leaked. As such, you should never have these devices shipped to your place of residence.
Are there privacy preserving alternatives? Sure, any "do it yourself" setup that uses common materials that can be purchased at any hardware store. It may be a bit less convenient, but that's usually the case when protecting your privacy. Some of the DIY backup projects that are worth looking into:
Once you have selected / created your device then you need to decide how to deploy it into the wild! That's a whole other set of problems that I cover in depth here:
This post will be a living document that will be updated if I come to any new conclusions from further testing. If you have suggestions for other aspects to add, don't hesitate to contact me!