Backporting Block Pruning (PR 2923)

One of the benefits and drawbacks of blockchain technology is that the blockchain only ever grows, and it relies on lots of people keeping their own copies so that any one transaction can be verified against the blockchain as a whole.

There are benefits to that, but the cost is ever-increasing storage requirements.

That shouldn’t keep people from running their own nodes to help the network, though.

Keep Small Shibes in the Network

The other day, someone asked on the dogecoindev subreddit whether it’s possible to prune blockchain storage. This is a feature Bitcoin Core has: rather than keeping the entire blockchain stored locally, you can set a threshold for the amount of disk storage you’re willing to devote to the blockchain, and the system will automatically discard older blocks and keep the most recent ones so that it never exceeds that threshold.
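To make the idea concrete, here is a toy model of size-based pruning in plain shell. It is only a sketch: the 500 MiB budget and the file sizes are made-up numbers, and neither Bitcoin Core nor Dogecoin Core is claimed to work exactly this way. It shows the general shape: drop the oldest data first until what remains fits the budget.

```shell
#!/bin/sh
# Toy model of size-based pruning. All names and numbers are hypothetical.
budget_mib=500

# Sizes of stored block files in MiB, oldest first.
set -- 128 128 128 128 128 90

# Add up everything currently on disk.
total=0
for size in "$@"; do
  total=$((total + size))
done

# Drop the oldest file until the rest fits the budget,
# but always keep at least the newest file.
discarded=0
while [ "$total" -gt "$budget_mib" ] && [ "$#" -gt 1 ]; do
  total=$((total - $1))
  shift
  discarded=$((discarded + 1))
done
kept=$#

echo "discarded $discarded oldest files, keeping $kept files ($total MiB)"
# prints: discarded 2 oldest files, keeping 4 files (474 MiB)
```

The node in this model still holds the newest blocks, which is why it can keep verifying and relaying new transactions even though it can no longer serve the full history.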

Obviously the network only works completely if some/many nodes provide the entire chain, but nodes can relay and verify transactions even if they have a pruned blockchain. There’s value in that.

In my previous post about removing misleading sync screen data, I praised a bug report for making it easy to see the shape of the correct solution. This Reddit post is the same. Notice the side-by-side screenshots of how Dogecoin Core behaves and how Bitcoin Core behaves.

This feature request is effective because it has a clear goal (let more people participate in the network) and it points to an existing implementation.

Forking is an Act of Love

The Dogecoin Core codebase is a fork of the Bitcoin Core codebase, though not as directly as that phrasing makes it sound. (What’s a fork? It’s a branch that’s probably never merging into the main trunk. See how branches work for more details.)

Dogecoin and Bitcoin diverged in a couple of small ways at the start (the name of the project, some of the design, the network port, the start of the blockchain) and then began to diverge in other, more substantial ways (no fixed coin supply, shorter block times, merged mining with LTC).

This means two things. First, if you know your way around one codebase, you have an advantage in navigating the other. Second, if you want to move code between codebases, it’s either very easy or very difficult, with not much in between.

Difficult doesn’t mean impossible.

Perhaps more importantly, the copyright and license of the source code means that there are no legal barriers to code moving back and forth between these and other similarly licensed projects.

Git and GitHub also make this easy.

Cherry- and Nit-Picking

I said as much to alamshafil, who found the right code in Bitcoin Core and cherry-picked it to Dogecoin as PR 2923.

For experienced Git users, I use “cherry-pick” as a metaphor here. For everyone else, think of it this way: there’s a big lump of code in Bitcoin Core referred to as PR 13043 (linked earlier). Because Dogecoin Core and Bitcoin Core have the same lineage, the easy way is “grab that code as a whole, and see if the change applies cleanly to the Dogecoin Core source code”.

I don’t know what additional changes were necessary to get the change to apply cleanly, but given how quickly alamshafil made the change available, I assume it was easy, as easy as `git cherry-pick cbede7dbfde83d53ef38d257e9940af5f163b03c` and maybe editing a couple of files.
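If you have never moved a commit between two related projects, here is a small sandbox version you can run safely. It is a sketch under stated assumptions, not a record of what happened in PR 2923: the repositories named upstream and fork, the file names, and the commit messages are all invented for illustration, and everything lives in a temporary directory. The only things it relies on are ordinary Git commands.

```shell
#!/bin/sh
set -e

# Throwaway sandbox; every repository and file name below is hypothetical.
work=$(mktemp -d)
cd "$work"

# Give Git an identity for the demo commits without touching your real config.
export GIT_AUTHOR_NAME="Example Shibe" GIT_AUTHOR_EMAIL="shibe@example.com"
export GIT_COMMITTER_NAME="Example Shibe" GIT_COMMITTER_EMAIL="shibe@example.com"

# "upstream" stands in for the parent project.
git init -q upstream
cd upstream
printf 'port=8333\n' > node.conf
printf 'storage: keep everything\n' > options.txt
git add node.conf options.txt
git commit -q -m "Shared starting point"
cd ..

# "fork" starts from the same history, then goes its own way.
git clone -q upstream fork
cd fork
printf 'port=22556\n' > node.conf
git commit -q -am "Use our own network port"
cd ..

# Later, upstream gains a feature the fork does not have.
cd upstream
printf 'storage: keep everything\nprune: optional\n' > options.txt
git commit -q -am "Add optional pruning"
feature=$(git rev-parse HEAD)
cd ..

# In the fork: fetch upstream's new history, then apply just that one commit.
cd fork
git fetch -q origin
git cherry-pick "$feature"

# The fork keeps its own change and gains the upstream feature.
cat node.conf options.txt
```

The cherry-pick applies cleanly here because the two projects changed different files after they diverged. When both sides have touched the same lines, Git stops and asks you to resolve the conflict, which is the “maybe editing a couple of files” part.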

Now the hard work begins.

Code is code, but software and community are more than just bits on a disk.

If you look at the comments on the pull request, you’ll see both Patrick and I have some opinions. Imagine that.

Patrick’s comments are about “How will this work with the network as a whole? How does this change affect the integrity of the system?”

My comments are about “Is this code consistent with the rest of our code?”

Neither is a reason not to make this change. They’re both questions we should be asking of every proposal because the answers will help make the system better.

Changes like this aren’t free. This code has to be maintained. Any new GUI elements introduce new information users will have to understand. Any new configuration options in configuration files, the GUI, or the code itself need enough explanation that they’re not confusing.

Any changes made to this code will make it a little more difficult to borrow another good idea from Bitcoin Core or another fork, so we need to weigh those changes against the value we get from them.

What Should You Learn From This?

This change looks like it will be accepted soon, and it’s very likely that you’ll have the option to use less storage space when running a 1.14.6 node. It’s still important for as many nodes as possible to keep the full blockchain, but this feature gives more choices to more people.

It’s important to recognize that not all good ideas have to come from a single codebase, and multiple similar projects can both compete and collaborate.

It’s important to recognize that, as a user, a good idea you see somewhere else can become a reality in your favorite project if you ask for it well and if the stars align and the implementation is easy.

It’s important to realize that, if you have enough dev skills to run git cherry-pick and submit a clean pull request, some features are easy to add in the right places if you have a spare afternoon.

I’m very proud of the folks on Reddit and GitHub for working quickly to make this feature a reality. I’m even more impressed that they provided such a good example of how community collaboration can work. Keep those feature requests, bug reports, and code contributions coming.