One Transaction, Two Transaction, Three Ah-Ah-Ah (PR 2974)

Cryptocurrency nodes sometimes have to deal with a lot of data. As Mastering Bitcoin, 2e chapter 2 explains, multiple nodes have to have access to the entire blockchain to validate the provenance of every coin and every piece of every coin.

In other words, if you tweet “@sodogetip tip @chromatic_x 10 doge”, a lot of work goes on behind the scenes apart from @SoDogeTip managing all of the Twitter interactions.

SoDogeTip has a wallet for you. That wallet is a unique Dogecoin address. That address either has or hasn’t received funds: either someone has sent it coins or no one has. Those coins had to come from somewhere too. If you trace their lineage back far enough, they were part of a reward for miners in a coinbase transaction (see chapter 10 of the Bitcoin book).

Yes, but How Do You Know?

If at least one computer on the Internet somewhere has access to the entire blockchain, you can always verify the provenance of coins in a transaction.

Of course we want many computers to have copies so that we can all independently verify this so that we don’t have to trust a single entity for this truth, but that’s a different subject.

For the sake of argument, let’s assume only one computer verifies this though. How does the computer know that your SoDogeTip wallet has 10 Doge to send to my SoDogeTip wallet?

The naive approach is to work backwards from the most recent transaction, scanning every transaction for unspent outputs sent to your wallet address. That would definitely work, but as I write this, there are 4,253,217 blocks to scan for transactions. That number is higher as you read this. It went up to 4,253,218 in the time it took me to write these sentences.

The approach is sound, especially if we can trust that the most recent transactions validate and build on all previous transactions.

What if you had to scan a million blocks to find 10 Doge in unspent outputs though?

Worse, what if you wanted to know the entire unspent balance of that wallet? You’d have to scan every block on the blockchain to know for sure.
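To make the cost of that scan concrete, here is a minimal sketch of the naive approach in Python. Everything in it is made up for illustration: the Tx type, the three-block CHAIN, and the naive_balance function are toy stand-ins, not anything from the Dogecoin Core. It walks the chain from newest to oldest, remembers which outputs later transactions have already spent, and adds up whatever is left for one address.

```python
from typing import NamedTuple

class Tx(NamedTuple):
    txid: str
    inputs: list   # (txid, output index) pairs this transaction spends
    outputs: list  # (address, amount) pairs this transaction creates

# A made-up three-block chain, oldest block first.
CHAIN = [
    [Tx("cb0", [], [("miner", 100)])],  # coinbase reward
    [Tx("t1", [("cb0", 0)], [("alice", 30), ("miner", 70)])],
    [Tx("t2", [("t1", 0)], [("bob", 10), ("alice", 20)])],
]

def naive_balance(blocks, address):
    """Scan the whole chain, newest block first, and total what is unspent."""
    spent = set()
    total = 0
    for block in reversed(blocks):
        for tx in reversed(block):
            for index, (owner, amount) in enumerate(tx.outputs):
                if owner == address and (tx.txid, index) not in spent:
                    total += amount
            # Anything this transaction spends is gone for older outputs.
            spent.update(tx.inputs)
    return total

print(naive_balance(CHAIN, "alice"))  # 20
```

Note that the function has to visit every block before it can return, even though only two of the three blocks mention alice at all. With three blocks that is nothing; with four million it is a coffee break.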

Why Do Textbooks Have Indexes?

A few years back, I wrote a book about the Perl programming language. I spent a lot of time crafting the index by hand (okay, not by hand; I wrote a Perl program to do it, of course) because I wanted readers to be able to say “What in the world is a default scalar variable?” and be able to look it up without starting in the first chapter and skimming every line (or hitting Ctrl-F in a web browser). It’s not just me being kind to my readers; it’s me trying to save countless people an enormous amount of time they’d otherwise spend doing repetitive, useless work.

This is also why dictionaries are in alphabetical order. (It’s not only because of that song!)

Good crypto wallets make an index of transactions related to a specific address because that makes it super fast to ask questions like “How many coins are available in this address?” or “Which transactions do I need to reference to make a new transaction?”
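Here is a rough sketch of what such an index could look like, again in Python and again with invented names (CHAIN, build_wallet_index) and a toy data layout; real wallets store far more than this. The idea is to pay for one pass over the chain, keep only the unspent outputs for the addresses you care about, and answer later questions from that small table.

```python
# Each transaction is (txid, inputs, outputs), where inputs are
# (txid, output index) pairs and outputs are (address, amount) pairs.
CHAIN = [
    [("cb0", [], [("miner", 100)])],  # coinbase reward
    [("t1", [("cb0", 0)], [("alice", 30), ("miner", 70)])],
    [("t2", [("t1", 0)], [("bob", 10), ("alice", 20)])],
]

def build_wallet_index(blocks, watched):
    """One pass, oldest block first: map each watched address to its unspent outputs."""
    index = {address: {} for address in watched}
    owner = {}  # outpoint -> watched address that currently holds it
    for block in blocks:
        for txid, inputs, outputs in block:
            for outpoint in inputs:
                address = owner.pop(outpoint, None)
                if address is not None:
                    del index[address][outpoint]  # spent, so drop it
            for position, (address, amount) in enumerate(outputs):
                if address in index:
                    outpoint = (txid, position)
                    owner[outpoint] = address
                    index[address][outpoint] = amount
    return index

index = build_wallet_index(CHAIN, {"alice"})
print(sum(index["alice"].values()))  # 20
```

Once the index exists, “How many coins are available?” is a sum over a handful of entries, and “Which transactions do I reference?” is just the keys of index["alice"]. Neither question touches the chain again.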

Why Dictionaries Don’t Have Indexes

That’s why PR 2942 is interesting, but this isn’t about PR 2942. I mean, it is, but it isn’t.

PR 2942 says “If you’re adding a new private key to a wallet, you already have the option to ask whether to rescan the entire blockchain for transactions related to that wallet, but you should also have the option to decide which transactions to rescan.”

PR 2974 says “What if you forgot?”

These are two sides of the same friendly dog-related coin. (There’s at least one other side too, but that’s a topic for another time.)

You can always get the right answer if you always scan the entire blockchain, but you don’t want to do that because it’s slow and wasteful.

You can always get the right answer from the index if you index everything, but you don’t want to do that because that’s slow and wasteful.

If you forget to index anything, you’re kind of stuck.

That makes me believe that it’s more friendly and kind and delightful for you to be able to index the things that you believe matter. You know more about what matters to you than anyone else does.

If you dump a wallet on one machine and restore it on another, you can go to DogeChain (up to 4,253,229 blocks now; I could type faster) and search for your wallet address. There you can see the height of the block where the oldest transaction to your wallet appeared. Rescan from that block height, because that’s the minimum amount of work the Dogecoin Core has to do to get you the right answer.

The less work the Core has to do, the faster it’ll get you your answer and the fewer resources it’ll use to calculate it.
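A small Python sketch shows the difference a starting height makes. This is a toy model under my own assumptions, not the Core’s code or its RPC interface: the rescan function, its start_height parameter, and the made-up CHAIN (1,000 blocks of nothing but coinbase rewards, followed by two blocks that involve alice) are all hypothetical.

```python
# Each transaction is (txid, inputs, outputs), where inputs are
# (txid, output index) pairs and outputs are (address, amount) pairs.
CHAIN = [[(f"cb{height}", [], [("miner", 100)])] for height in range(1000)]
CHAIN.append([("t1", [("cb0", 0)], [("alice", 30), ("miner", 70)])])  # height 1000
CHAIN.append([("t2", [("t1", 0)], [("bob", 10), ("alice", 20)])])     # height 1001

def rescan(blocks, address, start_height=0):
    """Return (balance, blocks scanned) for one address, starting at start_height."""
    unspent = {}
    scanned = 0
    for block in blocks[start_height:]:
        scanned += 1
        for txid, inputs, outputs in block:
            for outpoint in inputs:
                unspent.pop(outpoint, None)
            for position, (owner, amount) in enumerate(outputs):
                if owner == address:
                    unspent[(txid, position)] = amount
    return sum(unspent.values()), scanned

print(rescan(CHAIN, "alice"))        # (20, 1002)
print(rescan(CHAIN, "alice", 1000))  # (20, 2)
```

Same answer, two blocks instead of 1,002. The catch is that the starting height has to be at or before the oldest transaction to the address; start later than that and the scan can miss coins without telling you.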

Maybe that’s a weird thing for a junior developer on a Proof-of-Work cryptocurrency to say, but your happiness and convenience are a big deal. That’s why we have a big friendly golden doggo smiling from the logo, after all.

I expect the result of all of this work will be a handful of new arguments to a handful of existing RPC commands and then some updates to the GUI to allow you to have the Core do less work for you but get the same results. That, to me, is satisfying. A little bit of work on the part of the developers could save countless users hours and hours of work.
