Testing full sync of seven Bitcoin node implementations.

As I’ve noted many times in the past, running a fully validating node gives you the strongest security model and privacy model that is available to Bitcoin users. But are all node implementations equal when it comes to performance? I had my doubts, so I ran a series of tests.

The computer I chose to use as a baseline is high-end but uses off-the-shelf hardware. I wanted to see what the latest generation SSDs were capable of doing, since syncing nodes tend to be very disk I/O intensive.

We’ve also run tests on less high-end hardware such as the Raspberry Pi, the board we use in our plug-and-play Casa Node for Lightning and Bitcoin. We learned that syncing Bitcoin Core from scratch on one can take weeks or months (which is why we ship Casa Nodes with the blockchain pre-synced).

I bought this PC at the beginning of 2018. It cost about $2,000.

I ran a few syncs back in February but nothing comprehensive across different implementations. When I ran another sync in October I was asked to test the performance when running the non-default full validation of all signatures.

Note that no Bitcoin implementation strictly fully validates the entire chain history by default. As a performance improvement, most of them don’t validate signatures before a certain point in time, usually a year or two ago. This is done under the assumption that if a year or more of proof of work has accumulated on top of those blocks, it’s highly unlikely that you are being fed fraudulent signatures that no one else has verified, and if you are then something is incredibly wrong and the security assumptions underlying the system are likely compromised.
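
The trade-off described above can be sketched in a few lines of Python. This is a simplified illustration, not how any particular implementation does it: the `Block` type, the `should_verify_signatures` function, and the height-based cutoff are all assumptions made for the example. Bitcoin Core, for instance, identifies its assumed-valid block by hash rather than by height.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Block:
    height: int
    block_hash: str


def should_verify_signatures(block: Block, assumed_valid_height: Optional[int]) -> bool:
    """Return True if the signatures in `block` should be checked.

    `assumed_valid_height` is a hypothetical setting: blocks at or below it
    are assumed to carry valid signatures. None disables the shortcut, which
    corresponds to the full validation mode tested in this article.
    """
    if assumed_valid_height is None:
        return True
    return block.height > assumed_valid_height


old_block = Block(height=100_000, block_hash="(placeholder)")
new_block = Block(height=550_000, block_hash="(placeholder)")

# Default-style behaviour: skip signature checks for deeply buried blocks.
default_mode = [should_verify_signatures(b, 500_000) for b in (old_block, new_block)]
# Full validation: check every signature back to genesis.
full_mode = [should_verify_signatures(b, None) for b in (old_block, new_block)]
```

In the default-style mode only the newer block has its signatures checked; with the shortcut disabled, both do.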

Test Results

Testing Bitcoin Core was straightforward: I just needed to set the assumevalid parameter to 0. My bitcoin.conf:

assumevalid=0
dbcache=24000
maxmempool=1000

Bcoin took a lot more work. I have regularly run Bcoin nodes, but I hadn’t spent much time with the configuration options, nor had I ever tried a full validation sync. I ended up filing several GitHub issues and had to try syncing about a dozen times before I found parameters that wouldn’t cause the Node.js process to run out of heap and crash. My bcoin.conf:

checkpoints=false
coin-cache=1000

Along with: export NODE_OPTIONS=--max_old_space_size=16000

Libbitcoin Node wasn’t too bad. The hardest part was removing the hardcoded block checkpoints. To do so I had to clone the git repository, check out the “version3” branch, delete all of the checkpoints other than genesis from bn.cfg, and run the install.sh script to compile the node and its dependencies.

I did have to try twice to sync it, because unlike every other implementation with a UTXO cache, Libbitcoin’s cache parameter is in units of UTXOs rather than megabytes of RAM. The first time I ran a sync I set it to 60,000,000 UTXOs in order to keep them all in RAM, under the assumption that my 32GB of RAM would be sufficient since Bitcoin Core uses less than 10GB to keep the entire UTXO set in memory. Unfortunately Libbitcoin Node ended up using all of my RAM, and once it started swapping my whole machine froze. When I chatted with developer Eric Voskuil, he noted that a UTXO cache greater than 10,000 would have negligible impact, so I ran the second sync with it set to 100,000 for good measure. My libbitcoin config:

[blockchain]
# I set this to the number of virtual cores since default is physical cores
cores = 12
[database]
cache_capacity = 100000
[node]
block_latency_seconds = 5
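
The reasoning behind that first attempt can be checked with some back-of-the-envelope arithmetic. The sketch below uses only the figures quoted above (60,000,000 UTXOs, 10GB, 32GB), treats 1GB as 10^9 bytes, and ignores everything else competing for RAM. The function name is made up for illustration.

```python
def bytes_per_entry(total_gb: float, entries: int) -> float:
    """Average bytes available per cache entry, treating 1 GB as 10**9 bytes."""
    return total_gb * 10**9 / entries


UTXO_COUNT = 60_000_000  # cache size used for the first Libbitcoin sync

# Rough per-entry footprint if a 60-million-entry cache occupied a full 10 GB.
core_estimate = bytes_per_entry(10, UTXO_COUNT)

# Per-entry budget available if the same cache had all 32 GB to itself.
ram_budget = bytes_per_entry(32, UTXO_COUNT)

print(f"Estimate from the 10 GB figure: about {core_estimate:.0f} bytes per UTXO")
print(f"Budget with 32 GB of RAM:       about {ram_budget:.0f} bytes per UTXO")
```

On these numbers a 60-million-entry cache looked safe, with roughly three times the headroom. The failed sync suggests that Libbitcoin Node's actual memory use per cached entry, or its memory use elsewhere, was far higher than this estimate.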

BTCD was fairly straightforward — it completed on the first sync. However, as far as I can tell it doesn’t have a configurable UTXO cache, if it has one at all. My btcd.conf:

nocheckpoints=1
sigcachemaxsize=1000000

The first Parity Bitcoin sync attempt didn’t go well. After Parity fixed the consensus bug I tried again.

Once it got into the “full block” territory of the blockchain (after block 400,000), it slowed to 2 blocks per second. CPU usage was only about 30%, though it was using a lot of the cache I made available: 17GB worth! It was still slower than I expected, and a look at the disk I/O activity showed that to be the culprit. For some reason it was constantly churning through 50 MB/s to 100 MB/s in disk writes even though it was only adding about 2 MB/s worth of data to the blockchain. It kept slowing down after SegWit activation, such that after block 500,000 it was only processing about 0.5 blocks per second. CPU, RAM, and bandwidth usage were about the same, so I checked disk I/O again. At this point there was very little write activity, but it was constantly doing about 30 MB/s worth of disk reads even though 4GB of cache was still unused. There are probably some inefficiencies in Parity’s internal database (RocksDB) that are creating this bottleneck. I’ve had issues with RocksDB in the past while running Parity’s Ethereum node and while running Ripple nodes. RocksDB was so bad that Ripple actually wrote their own DB called NuDB. My pbtc config:

--btc
--db-cache 24000
--verification-edge 00000000839a8e6886ab5951d76f411475428afc90947ee320161bbf18eb6048
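
To put the disk write figures from this sync in perspective, the ratio of bytes written to disk to bytes of new chain data can be computed directly. This is simple arithmetic on the numbers observed during the sync; `write_amplification` is just a helper name chosen for this sketch.

```python
def write_amplification(disk_write_mb_s: float, new_data_mb_s: float) -> float:
    """Ratio of bytes physically written to bytes of new data stored."""
    if new_data_mb_s <= 0:
        raise ValueError("new_data_mb_s must be positive")
    return disk_write_mb_s / new_data_mb_s


NEW_DATA_MB_S = 2.0  # approximate rate at which chain data was being added

low = write_amplification(50.0, NEW_DATA_MB_S)
high = write_amplification(100.0, NEW_DATA_MB_S)

print(f"Roughly {low:.0f}x to {high:.0f}x more written than stored")
```

A ratio of 25x to 50x would be consistent with heavy rewriting inside the database, though these tests did not isolate the cause.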

Bitcoin Knots is a fork of Bitcoin Core, so I didn’t expect the performance to be much different. I didn’t test it originally, but after I posted this article several folks requested a test, so I ran it with the same bitcoin.conf:

assumevalid=0
dbcache=24000
maxmempool=1000

After I initially published this post, someone recommended that I test Gocoin. I had heard of this client a few times before, but it rarely comes up in my circles. I did manage to find a critical bug while testing it, but the developer fixed it within a few hours of my reporting it. Gocoin touts itself as a high performance node.

It keeps the entire UTXO set in RAM, providing the best block processing performance on the market.

After reading through the documentation, I made the following changes to get maximum performance:

My gocoin.conf:

LastTrustedBlock: 00000000839a8e6886ab5951d76f411475428afc90947ee320161bbf18eb6048
AllBalances.AutoLoad:false
UTXOSave.SecondsToTake:0

Not only was Gocoin fairly fast at syncing, it is the only node I’ve come across that provides a built-in web dashboard!

Performance Rankings

  1. Bitcoin Core 0.17: 5 hours, 11 minutes
  2. Bitcoin Knots 0.16.3: 5 hours, 27 minutes
  3. Gocoin 1.9.5: 12 hours, 32 minutes
  4. Libbitcoin Node 3.2.0: 20 hours, 24 minutes
  5. Bcoin 1.0.2: 27 hours, 32 minutes
  6. Parity Bitcoin 0.? (no release): 38 hours, 17 minutes
  7. BTCD 0.12.0: 95 hours, 12 minutes
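
To make the gaps between implementations easier to compare, the sync times above can be converted to minutes and expressed relative to the fastest node. The sketch below only restates the ranking data; the dictionary and function names are arbitrary.

```python
import re

SYNC_TIMES = {
    "Bitcoin Core 0.17": "5 hours, 11 minutes",
    "Bitcoin Knots 0.16.3": "5 hours, 27 minutes",
    "Gocoin 1.9.5": "12 hours, 32 minutes",
    "Libbitcoin Node 3.2.0": "20 hours, 24 minutes",
    "Bcoin 1.0.2": "27 hours, 32 minutes",
    "Parity Bitcoin (no release)": "38 hours, 17 minutes",
    "BTCD 0.12.0": "95 hours, 12 minutes",
}


def to_minutes(duration: str) -> int:
    """Convert a string like '5 hours, 11 minutes' to total minutes."""
    match = re.fullmatch(r"(\d+) hours?, (\d+) minutes?", duration)
    if match is None:
        raise ValueError(f"unrecognised duration: {duration!r}")
    hours, minutes = map(int, match.groups())
    return hours * 60 + minutes


minutes = {name: to_minutes(text) for name, text in SYNC_TIMES.items()}
baseline = min(minutes.values())
slowdown = {name: total / baseline for name, total in minutes.items()}

# Print the nodes from fastest to slowest with their slowdown factor.
for name, factor in sorted(slowdown.items(), key=lambda item: item[1]):
    print(f"{name:<28} {minutes[name]:>5} min  {factor:5.1f}x")
```

By this measure the slowest sync (BTCD) took a little over 18 times as long as the fastest (Bitcoin Core).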

Exact Comparisons Are Difficult

While I ran each implementation on the same hardware to keep those variables static, there are other factors that come into play.

  1. There’s no guarantee that my ISP was performing exactly the same throughout the duration of all the syncs.
  2. Some implementations may have connected to peers with more upstream bandwidth than other implementations. This could be random or it could be due to some implementations having better network management logic.
  3. Not all implementations have caching; even when configurable cache options are available it’s not always the same type of caching.
  4. Not all nodes perform the same indexing functions. For example, Libbitcoin Node always indexes all txs by hash — it’s inherent to the database structure. Thus this full node sync is more properly comparable to Bitcoin Core with the tx indexing option enabled.
  5. Operating system and file system differences can come into play. One of my colleagues noted that when syncing on ZFS with spinning disks, Bitcoin Core performed better with a smaller dbcache. He reasoned that ZFS and Linux are very good at optimizing what data is cached or buffered, and that the bitcoind DB cache ended up wasting RAM on unnecessary data, which meant that the overall amount of I/O needed on the spinning disks was higher with a larger dbcache.

Room for Improvement

I’d like to see Libbitcoin Node and Parity Bitcoin make it easier to perform a full validation with a single parameter rather than having to dig through code and recompile without checkpoints or set a checkpoint at block 1 in order to override default checkpoints.

With Bcoin I had to modify code in order to raise the cache above 4 GB. I also had to set an environment variable to raise the Node.js heap limit, which defaults to 1.5GB in V8, though after a ton of experimentation it wasn’t clear to me why Bcoin kept running out of heap even when I had the heap set to 8GB and the UTXO cache set to 4, 3, or even 2 GB. Bcoin should implement sanity checks that ensure a node operator can’t set a config that will crash the node by exhausting the heap.

Another gripe I have that is nearly universal across different cryptocurrencies and implementations is that very few will throw an error if you pass an invalid configuration parameter. Most silently ignore them and you may not notice for hours or days. Sometimes this resulted in me having to re-sync a node multiple times before I was sure I had the configuration correct.
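
What I have in mind is strict parsing that refuses to start on an unrecognized option. The following is a minimal sketch of the idea, not the behaviour of any node discussed here: the `parse_config` function, the `UnknownOptionError` class, and the small `KNOWN_OPTIONS` set are all invented for illustration.

```python
KNOWN_OPTIONS = {"assumevalid", "dbcache", "maxmempool"}  # illustrative subset


class UnknownOptionError(ValueError):
    """Raised when a config file contains an option the node does not know."""


def parse_config(text: str, known_options: set[str]) -> dict[str, str]:
    """Parse key=value lines, rejecting any key not in `known_options`."""
    options: dict[str, str] = {}
    for line_number, raw_line in enumerate(text.splitlines(), start=1):
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        key, separator, value = line.partition("=")
        key = key.strip()
        if not separator:
            raise ValueError(f"line {line_number}: expected key=value, got {raw_line!r}")
        if key not in known_options:
            raise UnknownOptionError(f"line {line_number}: unknown option {key!r}")
        options[key] = value.strip()
    return options


good = parse_config("assumevalid=0\ndbcache=24000\nmaxmempool=1000\n", KNOWN_OPTIONS)

try:
    parse_config("dbcahce=24000\n", KNOWN_OPTIONS)  # typo in the option name
    typo_rejected = False
except UnknownOptionError as error:
    typo_rejected = True
    print(f"refusing to start: {error}")
```

Failing at startup turns a mistake that might otherwise cost hours of syncing into one that is caught in seconds.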

On a similar note, I think that every node implementation should generate its default config file on first startup, even if the file is empty. This way the node operator knows where the config options should go — more than once I’ve created a config file that was named incorrectly and thus was ignored.
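
A sketch of that startup behaviour, assuming a hypothetical node whose config lives at a single known path (the file name and placeholder text below are made up):

```python
import tempfile
from pathlib import Path

DEFAULT_CONFIG_TEXT = "# Node configuration. Add options below, one key=value per line.\n"


def ensure_config_file(config_path: Path) -> bool:
    """Create a placeholder config at `config_path` if none exists.

    Returns True if the file was created, False if it was already there.
    """
    if config_path.exists():
        return False
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(DEFAULT_CONFIG_TEXT, encoding="utf-8")
    return True


# Demonstrate against a throwaway directory rather than a real data directory.
with tempfile.TemporaryDirectory() as data_dir:
    path = Path(data_dir) / "examplenode" / "node.conf"  # hypothetical name
    first_start = ensure_config_file(path)
    second_start = ensure_config_file(path)
    print(f"created on first start: {first_start}, on second start: {second_start}")
```

Because an existing file is never overwritten, the operator's edits survive restarts, and the file's presence shows exactly where options belong.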

As a side note, I was disappointed with the performance of my Samsung 960 EVO, which is marketed with sequential read speeds of 3,200 MB/s and write speeds of 1,900 MB/s. If you read the fine print, those numbers are based on “Intelligent TurboWrite region” specs, after which the numbers drop drastically. In my real-world testing with these node implementations I rarely saw read or write speeds exceed 100 MB/s. The question of how much disk activity is sequential and how much is random also comes into play, though I’d expect most of the writes to be sequential.

In general my feeling is that not many implementations have put much effort into optimizing their code to take advantage of higher end machines by being greedier with resource usage.

Altcoin Tests

Just for fun, I tried a few other popular non-Bitcoin-derivative cryptocurrencies.

I’ve been tracking Parity’s full validation sync time for a while now, ever since I started running Parity nodes at BitGo. Unfortunately it seems that the data being added to the Ethereum blockchain is significantly outpacing the rate at which the implementation is being improved.

Geth seems to also have a consensus bug, which is quite surprising given this implementation’s popularity. I can only assume that very few folks try to run it in full validation mode. Nevertheless, it’s amazing that this critical consensus bug was reported 4 months ago but still hasn’t been fixed…

I’ll have to try Geth again after the bug is fixed since the developers appear to have made some significant disk I/O optimizations.

Several weeks later the code was released and my test showed a significant improvement.

I had to run monerod twice because I assumed that the config file was monero.conf whereas it should have been bitmonero.conf.

The really cool thing about monerod is that it has an interactive mode that you can use to change the configuration of the node while it is running. This was really helpful for performance testing, and I’m not aware of any other crypto node implementations that have this kind of interactive configuration.

I expected Zcash would be somewhat fast due to having a small blockchain but somewhat slow due to the computationally expensive zero knowledge proofs and due to being based off of Core 0.11. Looks like they balanced out.

Finally, I considered running a sync of Ripple just for fun but then I read the documentation and decided it didn’t sound fun after all.

At the time of writing (2018-10-29), a rippled server stores about 12GB of data per day and requires 8.4TB to store the full history of the XRP Ledger.

Conclusion

Given that the strongest security model a user can obtain in a public permissionless crypto asset network is to fully validate the entire history themselves, I think it’s important that we keep track of the resources required to do so.

We know that due to the nature of blockchains, the amount of data that needs to be validated for a new node that is syncing from scratch will relentlessly continue to increase over time.

The hard part is ensuring that these resource requirements do not outpace the advances in technology. If they do, then larger and larger swaths of the populace will be disenfranchised from the opportunity for self sovereignty in these systems.

Keeping track of these performance metrics and understanding the challenges faced by node operators due to the complexities of maintaining nodes helps us map out our plans for future versions of the Casa Node.