Post

Why The Double Dabble Algorithm Works

Why The Double Dabble Algorithm Works

When learning to program, students are often asked to implement a function that converts an integer to a string. Usually, what they come up with (and what they are expected to come up with) is to iteratively use division by 10 to get the digits of the number one by one. While this certainly works, division is considered an expensive operation, and so constrained systems (and perhaps standard implementations as well) use an algorithm called “Double Dabble” to get the decimal representation of a number.

To be more precise, Double Dabble is an algorithm for converting the binary representation into BCD representation - Binary Coded Decimal. BCD uses 4 bits per decimal digit, so for example, the number 13, which in binary is 1101, would be 0001_0011 in BCD - note how it encodes 1 in the first nibble and 3 in the second. You can think of BCD as the mathematical operation that turns 1234 into 0x1234.

Here is how it works, according to Wikipedia:

The algorithm then iterates n times. On each iteration, any BCD digit which is at least 5 (0101 in binary) is incremented by 3 (0011); then the entire scratch space is left-shifted one bit. The increment ensures that a value of 5, incremented and left-shifted, becomes 16 (10000), thus correctly “carrying” into the next BCD digit.

And here’s an example, also from Wikipedia, of finding the decimal representation of 243:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
0000 0000 0000   11110011   Initialization
0000 0000 0001   11100110   Shift
0000 0000 0011   11001100   Shift
0000 0000 0111   10011000   Shift
0000 0000 1010   10011000   Add 3 to ONES, since it was 7
0000 0001 0101   00110000   Shift
0000 0001 1000   00110000   Add 3 to ONES, since it was 5
0000 0011 0000   01100000   Shift
0000 0110 0000   11000000   Shift
0000 1001 0000   11000000   Add 3 to TENS, since it was 6
0001 0010 0001   10000000   Shift
0010 0100 0011   00000000   Shift
   2    4    3
       BCD

Puzzled? So was I. The algorithm is pretty straight forward to implement, but WHY does it work? It turns out that it’s actually really simple, but to see why we will have to reinvent it.

Represenations of Integers

Let’s begin by thinking about what a representation of an integer is. Assuming we know what an integer is in its purest mathematical form, a representation of that integer is a string of smaller integers that people would all interpret as the same number. So the representations 21, 0x15, 0b10101 would all be representations of the same number - twenty one.

Without getting too philosophical, the point is that a representation is not just an arbitrary string, it’s a “recipe” for “creating” that number. If you wanted to actually get a feeling for what 421 is, and wanted to place 421 coins on your desk, the representation gives you the recipe for doing that, just by knowing the integers up to 10.

First, you put 4 coins on your desk. You then put more coins until you have 10 times what you had. Then, you put 2 coins on the desk, and then add more coins until you have 10 times what you had. Finally, you add 1 last coin, and you will end up with 421 coins on the desk. In that sense, the number 421 is actually just the following operations:

1
2
3
4
5
6
x = 0
x += 4
x *= 10
x += 2
x *= 10
x += 1

The important thing here is that 421 is that sequence of operations regardless of the base of the representation. Let’s try it with base 16:

1
2
3
4
5
6
x = 0
x += 0x4      // x = 0x4
x *= 0xA      // x = 0x28
x += 0x2      // x = 0x2A
x *= 0xA      // x = 0x1A4
x += 0x1      // x = 0x1A5

And sure enough, 0x1A5 is 421.

Let’s Double Dabble

We are now ready to construct the Double Dabble algorithm. We will first do it “on paper”, and then see how it can be implemented. In this example, we will convert 0b10101 to its decimal representation 21.

First, we take 0b10101 and look at it from left to right, giving us the following operations:

1
2
3
4
5
6
7
8
9
10
x = 0
x += 1
x *= 2
x += 0
x *= 2
x += 1
x *= 2
x += 0
x *= 2
x += 1

Now let’s apply these operations to a decimal string (assume we can do math on decimal strings):

1
2
3
4
5
6
7
8
9
10
x = 0   // "0"
x += 1  // "1"
x *= 2  // "2"
x += 0  // "2"
x *= 2  // "4"
x += 1  // "5"
x *= 2  // "10"
x += 0  // "10"
x *= 2  // "20"
x += 1  // "21"

And so we have the decimal representation!

Now we just have a final piece missing, that “assume we can do math on decimal strings”. Once we do that, we’re done!

Looking at the operations, you’ll see that in binary we only need to be able to do three things: multiply by 2, add 1, and add 0. Let’s begin by trying to multiply a decimal string by two. Maybe if we multiply each digit by two we will get the correct result? Let’s see for "123": multiplying each digit we get "246", which is correct. How about "246"? 6*2 is 12, which doesn’t fit in a single digit. What we have to do is put the 2 in the ones’ place, and carry that 1 to the tens’ place. Thus we get:

1
2
3
4
5
6
2  4  6     original
4  8  12    times 2
   1        carry
4  8  2
--------
4  9  2

Which is the correct result. So we can multiply a decimal string by 2 (although it’s a bit clumsy). How about adding 1 and 0? Zero is obviously not a problem as we don’t need to do anything, but it turns out that adding one is also very easy. Because we always add a 1 after we multiply by 2, so the number is even and thus we know that the ones’ digit is not 9. Therefore, to add 1 we simply need to add 1 to the last digit, without any side effects.

So we have everything we need, now we just have to put it all together. This is where everything clicks together nicely, but also where the mental steps leading here are blended into an incomprehensible mess.

Say we have a decimal string. In BCD, each digit is encoded as 4 bits, and these groups of 4 bits are all in one long bit stream:

1
2
   2    4    6
|0010|0100|0110|

Now, say we want to multiply it by 2. As you may know, to multiply an integer by 2 we can just shift it to the left. Shifting everything 1 bit to the left we get:

1
2
3
4
5
   2    4    6
|0010|0100|0110|

   4    8   12
|0100|1000|1100|

Remember, that 12 not carrying over is a problem. What we want is 0x12, not 12, so that we can leave that 2 and carry that 1 over. How can we turn 12 into 0x12, 13 into 0x13, etc.? Well, we can just add 6. Where does this 6 come from? It’s just the difference between (base) 10 and (base) 16. After adding 6 to the ones’ place we get:

1
2
3
4
5
6
7
8
9
   2    4    6
|0010|0100|0110|

   4    8   12
|0100|1000|1100|

          1         <---- carry
   4    8    2 
|0100|1000|0010|

But, as it turns out, there is a much more elegant way to carry that 1 into the tens’ place. What if we added the six BEFORE doing the shift? This way, the most significant bit of the nibble would be 1, and be shifted right to where it needs to be! But wait, we are adding it before it is multiplied by 2, so to compensate we need to divide it by 2 and we get… 3! (that’s not factorial)

And so we arrive at the final form: to multiply by 2, add 3 to all digits greater or equal to 5 (as these are the digits that will have a carry when multiplied by 2) and shift one bit to the left. In our example:

1
2
3
4
5
6
7
8
9
10
11
12
   2    4    6
|0010|0100|0110|

add 3 to ones' digit:

   2    4    9
|0010|0100|1001|

shift:

   4    9    2 
|0100|1001|0010|

Have another look at the example from Wikipedia. Notice how the leftmost bit is shifted into the output buffer (equivalent to adding 0 or 1), how the output buffer is shifted by 1 (equivalent to multiplying by 2), and how 3 is added to digits greater or equal to 5.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
0000 0000 0000   11110011   Initialization
0000 0000 0001   11100110   Shift
0000 0000 0011   11001100   Shift
0000 0000 0111   10011000   Shift
0000 0000 1010   10011000   Add 3 to ONES, since it was 7
0000 0001 0101   00110000   Shift
0000 0001 1000   00110000   Add 3 to ONES, since it was 5
0000 0011 0000   01100000   Shift
0000 0110 0000   11000000   Shift
0000 1001 0000   11000000   Add 3 to TENS, since it was 6
0001 0010 0001   10000000   Shift
0010 0100 0011   00000000   Shift
   2    4    3
       BCD

It suddenly makes sense!

This post is licensed under CC BY 4.0 by the author.