Python's stretchy integers.

Tags: software-dev · Python · C++

Big Numbers

In Python, you can do maths with arbitrarily large numbers:

2 ** 100 # = 1267650600228229401496703205376

The above is perfectly vaild and will not overflow, despite the maximum value of a 32bit unsigned integer of 9223372036854775807. This is a really cool bit of functionality which I've been wanting to include in my lisp interpreter. Let's see how Python does it.

This functionality was introduced to Python in PEP 237: "Unifying Long Integers and Integers". In the PEP, the implementation is brushed upon: Python Ints are abstractions over internal long types and bignums; a bignum being an array/vector of short ints representing arbitrary length integers.

Cool! That sounds pretty simple actually. An array of integers: just need to write functions describing how to do maths with these stretchy ints, how to compare them, how to convert them to strings, and how to convert a string to a stretchy int. Hmmm not quite simple, but doable.

The hardest parts, I think, will be the interconversion between a bignumn and string representations. The maths and comparison should be relatively easy to do, probably just needing a brush up of binary maths algorithms.

Searching for the source

Looking at the cPython relevant source code (longobject.c longobject.h longintrepr.h), we can get some detail about the implementation. The last header is where the meat of the structure is:

struct _longobject {
    PyObject_VAR_HEAD
    digit ob_digit[1];
};

And that's it.

A digit is the smallest increment of memory the long uses, and can either be a "normal" 32-bit integer, or a short int (16-bit). There are limitations of the algorithms used to do maths with the resulting long object that prevent smaller digits, in fact the algorithms require either 30 or 15 bits in a digit.

The object is a "PyObject_VAR_HEAD", which means it has a variable size (as the digit array can be grown or shrunk). The macro "Py_SIZE" is used to return the size of a variable-sized Python object, which in this case would return the number of digits. However, this macro is also used to store the sign of a long object. It is common throughout the source to see things like if (Py_SIZE(z) <0) { ... which doesn't make sense upon first look, but using the size member to store the sign is a pretty clever wait to do it, using a separate variable would be a waste of memory.

Also defined in longintrepr.h are:

PyLong_SHIFT
the number of bits to shift by (30 or 15)
_PyLong_DECIMAL_SHIFT
shifting a decimal multiplies by 10what?
_PyLong_DECIMAL_BASE
10_PyLongDECIMALSHIFT
PyLong_BASE
as for decimal, but for binary: 2shift
PyLong_MASK
PyLong_BASE - 1

Now we can look into the implementation longobject.c file, and see how data is handled.

1 + 1 = ?

Let's start with addition:

static PyObject *
long_add(PyLongObject *a, PyLongObject *b)
{
    PyLongObject *z;

    CHECK_BINOP(a, b);

    if (Py_ABS(Py_SIZE(a)) <= 1 && Py_ABS(Py_SIZE(b)) <= 1) {
        return PyLong_FromLong(MEDIUM_VALUE(a) + MEDIUM_VALUE(b));
    }
    if (Py_SIZE(a) < 0) {
        if (Py_SIZE(b) < 0) {
            z = x_add(a, b);
            if (z != NULL) {
                /* x_add received at least one multiple-digit int,
                   and thus z must be a multiple-digit int.
                   That also means z is not an element of
                   small_ints, so negating it in-place is safe. */
                assert(Py_REFCNT(z) == 1);
                Py_SIZE(z) = -(Py_SIZE(z));
            }
        }
        else
            z = x_sub(b, a);
    }
    else {
        if (Py_SIZE(b) < 0)
            z = x_sub(a, b);
        else
            z = x_add(a, b);
    }
    return (PyObject *)z;
}

This function looks at the values of the operands, a and b. Why they couldn't have chosen semi-descriptive names, I don't know. It checks if a and b either span more than a digit, and if not it adds them together directly, returning a new PyLong. Otherwise, call the function x_add (or x_sub) on the operands. Let's follow the trail:

/* Add the absolute values of two integers. */

static PyLongObject *
x_add(PyLongObject *a, PyLongObject *b)
{
    Py_ssize_t size_a = Py_ABS(Py_SIZE(a)), size_b = Py_ABS(Py_SIZE(b));
    PyLongObject *z;
    Py_ssize_t i;
    digit carry = 0;

    /* Ensure a is the larger of the two: */
    if (size_a < size_b) {
        { PyLongObject *temp = a; a = b; b = temp; }
        { Py_ssize_t size_temp = size_a;
            size_a = size_b;
            size_b = size_temp; }
    }
    z = _PyLong_New(size_a+1);
    if (z == NULL)
        return NULL;
    for (i = 0; i < size_b; ++i) {
        carry += a->ob_digit[i] + b->ob_digit[i];
        z->ob_digit[i] = carry & PyLong_MASK;
        carry >>= PyLong_SHIFT;
    }
    for (; i < size_a; ++i) {
        carry += a->ob_digit[i];
        z->ob_digit[i] = carry & PyLong_MASK;
        carry >>= PyLong_SHIFT;
    }
    z->ob_digit[i] = carry;
    return long_normalize(z);
}

This function takes two unsigned numbers and adds them. The first wee part is getting ready to do the addition (ensuring a > b, allocating result z), the meat of the function starts with the first loop:

for (i = 0; i < size_b; ++i) {
    carry += a->ob_digit[i] + b->ob_digit[i];
    z->ob_digit[i] = carry & PyLong_MASK;
    carry >>= PyLong_SHIFT;
}

As a digit is int16_t or int32_t, but the shift is 15 or 30, there is guaranteed to be enough room in a digit to store the addition of two digits, so a digit is sufficient for the carry. Starting from the smallest digit, digits of the operands are added with the result in carry, this is masked and stored in the result and the carry is right-shifted so it contains the carry relevant for the next digits.

This is fairly straightforward, but here's a wee run-through. Lets make the digit size 4 (24 = 16) with a shift of 3 and let's add 40 and 20:

# 40 -> 0b10 1000
# 20 -> 0b010100
a = [8, 2]
b = [4, 1]
z = [0, 0, 0]
carry = 0
mask = 15 # = 0b1111
shift = 4

# first iteration
carry = carry + a[0] + b[0] # -> 0 + 8 + 4 = 12
z[0] = carry & mask # -> 0b1100 & 0b1111 -> 0b1100 -> 12
carry = carry >> shift # -> 0

# second iteration
carry = carry + a[1] + b[1] # -> 0 + 2 + 1 = 3
z[1] = carry & mask # -> 0b0011 & 0b1111 -> 0b0011 -> 3
carry = carry >> shift # -> 0

z[2] = carry

#z = [12, 3, 0] -> 12 + 3*(2**shift) + 0 -> 60

Cool! The two loops work doing the same thing. The first loop adds up pairs of digits from a and b, and as a is the larger array of digits, the second loop carrys the carry up for the rest of a. Finally, the carry at the end is put into the final digit of the result which is normalised (leading zeroes removed) and returned.

So this process adds two multi-byte unsigned integers. To add negative numbers and so on, the process is wrapped in another function to check the signs of the operands, if they are both positive, or both negative, the above process is valid. Otherwise, we need something else…

… subtraction.

3 - 1 = ?

Again, let's start with the raw subtraction function:

/* Subtract the absolute values of two integers. */

static PyLongObject *
x_sub(PyLongObject *a, PyLongObject *b)
{
    Py_ssize_t size_a = Py_ABS(Py_SIZE(a)), size_b = Py_ABS(Py_SIZE(b));
    PyLongObject *z;
    Py_ssize_t i;
    int sign = 1;
    digit borrow = 0;

    /* Ensure a is the larger of the two: */
    if (size_a < size_b) {
        sign = -1;
        { PyLongObject *temp = a; a = b; b = temp; }
        { Py_ssize_t size_temp = size_a;
            size_a = size_b;
            size_b = size_temp; }
    }
    else if (size_a == size_b) {
        /* Find highest digit where a and b differ: */
        i = size_a;
        while (--i >= 0 && a->ob_digit[i] == b->ob_digit[i])
            ;
        if (i < 0)
            return (PyLongObject *)PyLong_FromLong(0);
        if (a->ob_digit[i] < b->ob_digit[i]) {
            sign = -1;
            { PyLongObject *temp = a; a = b; b = temp; }
        }
        size_a = size_b = i+1;
    }
    z = _PyLong_New(size_a);
    if (z == NULL)
        return NULL;
    for (i = 0; i < size_b; ++i) {
        /* The following assumes unsigned arithmetic
           works module 2**N for some N>PyLong_SHIFT. */
        borrow = a->ob_digit[i] - b->ob_digit[i] - borrow;
        z->ob_digit[i] = borrow & PyLong_MASK;
        borrow >>= PyLong_SHIFT;
        borrow &= 1; /* Keep only one sign bit */
    }
    for (; i < size_a; ++i) {
        borrow = a->ob_digit[i] - borrow;
        z->ob_digit[i] = borrow & PyLong_MASK;
        borrow >>= PyLong_SHIFT;
        borrow &= 1; /* Keep only one sign bit */
    }
    assert(borrow == 0);
    if (sign < 0) {
        Py_SIZE(z) = -Py_SIZE(z);
    }
    return maybe_small_long(long_normalize(z));
}

There's not much more to this func as there was in the previous: the main part of it is, again, from the first for loop:

for (i = 0; i < size_b; ++i) {
    /* The following assumes unsigned arithmetic
       works module 2**N for some N>PyLong_SHIFT. */
    borrow = a->ob_digit[i] - b->ob_digit[i] - borrow;
    z->ob_digit[i] = borrow & PyLong_MASK;
    borrow >>= PyLong_SHIFT;
    borrow &= 1; /* Keep only one sign bit */
}

Borrow is a digit, which is big enough to hold the value of any operation between two other digits of size PyLong_SHIFT, initially zero. In that first line, borrow stores the result of the subtraction, which is then masked and stored in the result on the next line. borrow is then shifted. So far, this is all the same as in addition, except the operation has changed to '-' and 'carry' to 'borrow'. Hmm but what about that last line? And what does that comment about unsigned arithmetic mean?

By right-shifting borrow, all that's left is a digit representing whether…

Let's look at a pseudo-code run through of this process. We'll use the 4-bit digit as before, and let's subtract 20 from 30:

a = [14, 1] # -> 0001 1110 -> 30
b = [4, 1]  # -> 0001 0100 -> 20
shift = 4
mask = 0b111

result = [0, 0, 0]
borrow = 0

borrow = a[0] - b[0] - borrow # -> 14 - 4 - 0 -> 10
result[0] = borrow & mask # -> 0b1010 & 0b1111 -> 0b1010 -> 10
borrow = borrow >> shift # -> 0
borrow &= 1 # -> 0

borrow = a[1] - b[1] - borrow # -> 1 - 1 - 0 -> 0
result[1] = borrow & mask # -> 0b0000 & 0b1111 -> 0b0000 -> 0
borrow = borrow >> shift # -> 0
borrow &= 1 # -> 0

assert borrow == 0 # after all this, borrow should be zero

# result = 0b1010 -> 10

Okay cool, making sense so far. What if we introduce a situation in which we need to use the borrow? Let's subtract 30 from 36:

a = [4, 2] # -> 0010 1110 -> 36
b = [14, 1]  # -> 0001 1110 -> 30
shift = 4
mask = 0b111

result = [0, 0]
borrow = 0

borrow = a[0] - b[0] - borrow # -> 4 - 14 - 0 -> -10 -> 0b11110110 (!)
result[0] = borrow & mask # -> 0b11110110 & 0b1111 -> 0b0110 -> 6
borrow = borrow >> shift # -> 0b11110110 >> 4 -> 0b1111
borrow &= 1 # -> 1

borrow = a[0] - b[0] - borrow # -> 2 - 1 - 1 -> 0
result[1] = borrow & mask # -> 0b0000 & 0b1111 -> 0b0000 -> 0
borrow = borrow >> shift # -> 0b0000 >> 4 -> 0
borrow &= 1 # -> 0

# result = 0b0110 -> 6

Ahh! It's two's complement! A negative number represented in binary two's complement is essentially MAX-number, so an 8-bit number (0-255) is 255-number. The borrow &= 1 checks to see if there is a 1 in the position above the digit base, if there is then the result of the subtraction was negative and a unit needs to be borrowed from the next digit.

Two operations down, three to go!

1 * 2 = ?

All right. Let's step it up a notch: multiplication.

In the code, it can be seen there are two algorithms in use for multiplication of multi-digit numbers. There's the "grade school" multiplication method, and the Karatsuba algorithm.

Let's just tackle the grade school method for now, I may come back to the Karatsuba algorithm (when I get my head around it!) The gradeschool algorithm has a small efficiency boost in that equal values (squaring) halves the number of multiplications necessary. We'll ignore this speed up for now, and concentrate on multiplication alone:

/* Grade school multiplication, ignoring the signs.
 * Returns the absolute value of the product, or NULL if error.
 */
static PyLongObject *
x_mul(PyLongObject *a, PyLongObject *b)
{
    PyLongObject *z;
    Py_ssize_t size_a = Py_ABS(Py_SIZE(a));
    Py_ssize_t size_b = Py_ABS(Py_SIZE(b));
    Py_ssize_t i, iz, ib, bend;

    z = _PyLong_New(size_a + size_b);
    if (z == NULL)
        return NULL;

    memset(z->ob_digit, 0, Py_SIZE(z) * sizeof(digit));

    /* removed: same-value multiplication case */

    for (i = 0; i < size_a; ++i) {
        twodigits carry = 0;
        twodigits f = a->ob_digit[i];
        iz = i;  //digit *pz = z->ob_digit + i;
        ib = 0;  //digit *pb = b->ob_digit;
        //digit *pbend = b->ob_digit + size_b;

        SIGCHECK({
                Py_DECREF(z);
                return NULL;
            });

        //while (pb < pbend) {
        while (ib < bend) {
            carry += z->ob_digit[iz] + b->ob_digits[ib++] * f;
            z->ob_digit[iz++] = (digit)(carry & PyLong_MASK);
            carry >>= PyLong_SHIFT;
            assert(carry <= PyLong_MASK);
        }
        if (carry)
            z->ob_digit[iz] += (digit)(carry & PyLong_MASK);
        assert((carry >> PyLong_SHIFT) == 0);
    }
    return long_normalize(z);
}

Note: in the source, they use pointers to values in the digit array. I have converted this to indices and get[] operations (leaving the original in as comments).

So this algorithm, now we've stripped out all the tough bits, is fairly simple: each digit in the larger number (a) multiplies each digit in the smaller number (b). The carry is a bit larger than in addition and subtraction, but other than that its a similar process with the operations switched to *.

What's next?

4 / 2 = ?

Algorithm D (Division of nonnegative integers).

Ahh, division. Division is done via a "divide and modulo" function long_divmod, the output is given by out parameters passed to the function, so we can take two outputs from the func (either the quotient, or the modulo). long_divmod actually calls another function, long_divrem which calculates quotient and remainder, by calling another function x_divrem:

/* Unsigned int division with remainder -- the algorithm.  The arguments v1
   and w1 should satisfy 2 <= Py_ABS(Py_SIZE(w1)) <= Py_ABS(Py_SIZE(v1)). */

static PyLongObject *
x_divrem(PyLongObject *v1, PyLongObject *w1, PyLongObject **prem)
{
    PyLongObject *v, *w, *a;
    Py_ssize_t i, k, size_v, size_w;
    int d;
    digit wm1, wm2, carry, q, r, vtop, *v0, *vk, *w0, *ak;
    twodigits vv;
    sdigit zhi;
    stwodigits z;

    /* We follow Knuth [The Art of Computer Programming, Vol. 2 (3rd
       edn.), section 4.3.1, Algorithm D], except that we don't explicitly
       handle the special case when the initial estimate q for a quotient
       digit is >= PyLong_BASE: the max value for q is PyLong_BASE+1, and
       that won't overflow a digit. */

    /* allocate space; w will also be used to hold the final remainder */
    size_v = Py_ABS(Py_SIZE(v1));
    size_w = Py_ABS(Py_SIZE(w1));
    assert(size_v >= size_w && size_w >= 2); /* Assert checks by div() */
    v = _PyLong_New(size_v+1);
    if (v == NULL) {
        *prem = NULL;
        return NULL;
    }
    w = _PyLong_New(size_w);
    if (w == NULL) {
        Py_DECREF(v);
        *prem = NULL;
        return NULL;
    }

    /* normalize: shift w1 left so that its top digit is >= PyLong_BASE/2.
       shift v1 left by the same amount.  Results go into w and v. */
    d = PyLong_SHIFT - bits_in_digit(w1->ob_digit[size_w-1]);
    carry = v_lshift(w->ob_digit, w1->ob_digit, size_w, d);
    assert(carry == 0);
    carry = v_lshift(v->ob_digit, v1->ob_digit, size_v, d);
    if (carry != 0 || v->ob_digit[size_v-1] >= w->ob_digit[size_w-1]) {
        v->ob_digit[size_v] = carry;
        size_v++;
    }

    /* Now v->ob_digit[size_v-1] < w->ob_digit[size_w-1], so quotient has
       at most (and usually exactly) k = size_v - size_w digits. */
    k = size_v - size_w;
    assert(k >= 0);
    a = _PyLong_New(k);
    if (a == NULL) {
        Py_DECREF(w);
        Py_DECREF(v);
        *prem = NULL;
        return NULL;
    }
    v0 = v->ob_digit;
    w0 = w->ob_digit;
    wm1 = w0[size_w-1];
    wm2 = w0[size_w-2];
    for (vk = v0+k, ak = a->ob_digit + k; vk-- > v0;) {
        /* inner loop: divide vk[0:size_w+1] by w0[0:size_w], giving
           single-digit quotient q, remainder in vk[0:size_w]. */

        SIGCHECK({
                Py_DECREF(a);
                Py_DECREF(w);
                Py_DECREF(v);
                *prem = NULL;
                return NULL;
            });

        /* estimate quotient digit q; may overestimate by 1 (rare) */
        vtop = vk[size_w];
        assert(vtop <= wm1);
        vv = ((twodigits)vtop << PyLong_SHIFT) | vk[size_w-1];
        q = (digit)(vv / wm1);
        r = (digit)(vv - (twodigits)wm1 * q); /* r = vv % wm1 */
        while ((twodigits)wm2 * q > (((twodigits)r << PyLong_SHIFT)
                                     | vk[size_w-2])) {
            --q;
            r += wm1;
            if (r >= PyLong_BASE)
                break;
        }
        assert(q <= PyLong_BASE);

        /* subtract q*w0[0:size_w] from vk[0:size_w+1] */
        zhi = 0;
        for (i = 0; i < size_w; ++i) {
            /* invariants: -PyLong_BASE <= -q <= zhi <= 0;
               -PyLong_BASE * q <= z < PyLong_BASE */
            z = (sdigit)vk[i] + zhi -
                (stwodigits)q * (stwodigits)w0[i];
            vk[i] = (digit)z & PyLong_MASK;
            zhi = (sdigit)Py_ARITHMETIC_RIGHT_SHIFT(stwodigits,
                                                    z, PyLong_SHIFT);
        }

        /* add w back if q was too large (this branch taken rarely) */
        assert((sdigit)vtop + zhi == -1 || (sdigit)vtop + zhi == 0);
        if ((sdigit)vtop + zhi < 0) {
            carry = 0;
            for (i = 0; i < size_w; ++i) {
                carry += vk[i] + w0[i];
                vk[i] = carry & PyLong_MASK;
                carry >>= PyLong_SHIFT;
            }
            --q;
        }

        /* store quotient digit */
        assert(q < PyLong_BASE);
        *--ak = q;
    }

    /* unshift remainder; we reuse w to store the result */
    carry = v_rshift(w0, v0, size_w, d);
    assert(carry==0);
    Py_DECREF(v);

    *prem = long_normalize(w);
    return long_normalize(a);
}

You know any code which includes a citation to Knuth is gonna be something special. The citation is to /"The Art of Computer Programming, Volume 2: Seminumerical Algorithms, Chapter 4.3: Multiple-Precision Arithmetic, Algorithm D"/. I don't have access to this book (yet, it's on my wish list!), so after a bit of searching I found a website with the algorithm copied out. The author of that article adds a definition stage providing detail about the variables used in the algorithm:

  1. Define
    • Let U be the dividend (or numerator) of m+n "digits", stored in an array of m+n elements, one "digit" per element, with the most significant "digit" in element U[m+n−1] and the least significant "digit" in element U[0];
    • Let V be the divisor (or denominator) of n "digits", stored in a second array of n elements, with n greater than 1, a non-zero most significant "digit" in element V[n−1] and the least significant "digit" in element V[0];
    • Let B be the base (or radix) of the "digits" (also "limbs", "places" or "words"), typically (a power of) a power of 2, with B greater than 1.
    • The algorithm computes the quotient Q of m+1 "digits" as (U ⁄ V), and the remainder R of n "digits" as U % V, using the following "primitive" operations:
      1. addition or subtraction of two single-digit integers, giving a single-digit sum and a carry, or a single-digit difference and a borrow;
      2. "widening" multiplication of two single-digit integers, giving a double-digit product;
      3. "narrowing" division of a double-digit integer by a single-digit integer, giving a single-digit quotient and a single-digit remainder.
  2. Normalize
    • Set D to (B − 1) ÷ V[n−1];
    • Multiply all "digits" of U and V by D.
    • (On a binary computer, choose D to be a power of 2 instead of the value provided above; any value of D that results in V[n−1] not less than B ÷ 2 will suffice. If D is greater than 1, the eventual overflow "digit" of the dividend U goes into element U[m+n].)
  3. Initialize j
    • Set the loop counter j to m.
  4. Calculate Q′
    • Set Q′ to (U[n+j] * B + U[n−1+j]) ÷ V[n−1];
    • Set R′ to (U[n+j] * B + U[n−1+j]) % V[n−1];
    • Test if Q′ equals B or Q′ * V[n−2] is greater than R′ * B + U[n−2+j];
    • If yes, then decrease Q′ by 1, increase R′ by V[n−1], and repeat this test while R is less than B.
  5. Multiply and subtract
    • Replace (U[n+j]U[n−1+j]…U[j]) by (U[n+j]U[n−1+j]…U[j]) − Q′ * (V[n−1]…V[1]V[0]).
    • (The "digits" (U[n+j]…U[j]) should be kept positive; if the result of this step is actually negative, (U[n+j]…U[j]) should be left as the true value plus Bn+1, namely as the B’s complement of the true value, and a borrow to the left should be remembered.)
  6. Test remainder
    • Set Q[j] to Q′;
    • If the result of step D4 was negative, i.e. the subtraction needed a borrow, then proceed with step D6; otherwise proceed with step D7.
  7. Add back
    • Decrease Q[j] by 1 and add (0V[n−1]…V[1]V[0]) to (U[n+j]U[n−1+j]…U[1+j]U[j]).
    • (A carry will occur to the left of U[n+j], and it should be ignored since it cancels with the borrow that occurred in step D4.)
  8. Loop on j
    • Decrease j by 1;
    • Test if j is not less than 0;
    • If yes, go back to step D3.
  9. Unnormalize
    • Now (Q[m]…Q[1]Q[0]) is the desired quotient Q, and the desired remainder R may be obtained by dividing (U[n−1]…U[1]U[0]) by D.

Oooft.

Looking through that is pretty intimidating. I have a rudimentary idea of how to perform this long division in binary, but arbitrary length integer division written out so formally makes my eyes water. I don't want to copy code I don't understand, so I won't. I do intend to work towards this, but for now I will use my terrible terrible solution (sketched here in python):

def divrem(dividend, divisor):
    remainder = dividend
    quotient = 0

    while remainder > 0:
        remainder -= divisor
        quotient += 1

    if remainder < 0:
        remainder += divisor
        quotient -= 1

    return quotient, remainder

As multiplication is repeated addition: integer division is repeated subtraction. To implement this, we also need to be able to compare BigNum types, at least this is fairly trivial.

C++ implementation

Right, home stretch! Let's put what we've got together, and C++-ify it a bit.

Here's the header for the BigNum class:

#include <vector>
#include <sstream>

// Define some types. A 'digit' is the smallest unit of integer: the storage
// size of each element in the vector representing the big integer. 'twodigits'
// is as it sounds: a datatype large enough to hold something twice as big as a
// digit.
typedef unsigned short digit;
typedef unsigned int twodigits;

// DIGIT_SHIFT defines the bit-width of a number in a single digit. So moving
// from a digit in the LSB position to LSB+1 position left-shifts by DIGIT_SHIFT
// amount. In the Python implementation, they choose a digit shift of 15 or 30
// as this allows them to use more complicated algorithms giving performance
// boosts.
#define DIGIT_SHIFT 15

// DIGIT_BASE is the largest number representable by the digit, plus one. If
// DIGIT_SHIFT was 3, then our base would be 8, the maximum representable number
// 0b111 (decimal 7).
#define DIGIT_BASE (1<<DIGIT_SHIFT)

// DIGIT_MASK is the number where all the bits of the digit are one: the larges
// number representable by the digit.
#define DIGIT_MASK (DIGIT_BASE-1)


// Some of the algorithms used require the operands to be in order of number of
// digits. This macro will run function on pointer 'this' and pointer 'other' in
// varying orders depending on the result of the size comparison stored in
// variable 'thisbigger'.
#define SORTSIZE(FUNCTION)\
  if (thisbigger) {\
    FUNCTION(this, other);\
  }\
  else {\
    FUNCTION(other, this);\
  }


// BigNum();
// BigNum(long number);
// 
// A BigNum represents an Integer accross any number of bytes. Methods used here
// follow mostly along with that used by the Python developers, simplified as
// necessary.
class BigNum {

private:
  // The integer is describe entirely by this vector of digits, and a sign.
  std::vector<digit> digits;
  bool negative;

  // Class methods described here are defined in add.cpp, subtract.cpp,
  // multiply.cpp, and divide.cpp

  // https://en.wikipedia.org/wiki/Addition#Notation_and_terminology
  static BigNum add (const BigNum *large_addend, const BigNum *small_addend);

  // https://en.wikipedia.org/wiki/Subtraction#Notation_and_terminology
  static BigNum subtract(const BigNum *minuend, const BigNum *subtrahend);

  // https://en.wikipedia.org/wiki/Multiplication#Notation_and_terminology
  static BigNum multiply(const BigNum *large_factor, const BigNum *small_factor);

  // https://en.wikipedia.org/wiki/Division_(mathematics)#Notation
  static void divrem(const BigNum *dividend, const BigNum *divisor, BigNum *quotient, BigNum *remainder);


  // BigNum::normalise();
  // Removes leading zeros from the digit vector.
  void normalise()
  {
    while (this->digits.back() == 0)
      this->digits.pop_back();
  }

public:

  // Default constructor: creates a big number with value '0'.
  BigNum()
  {
    this->negative = false;
  }


  // Initialise constructor: creates a BigNum from the long value given.
  BigNum(long value)
  {
    this->negative = value < 0;
    if (this->negative) value = -value;

    long carry = value;
    while(carry) {
      this->digits.push_back(carry & DIGIT_MASK);
      carry = carry >> DIGIT_SHIFT;
    }
  }


  // Initialise constructor: create a BigNum from the string.
  BigNum(std::string value); // TODO


  // Addition. This code wraps the unsigned addition function defined in add.cpp
  // and unsigned subtraction in subtract.cpp.  Adding two positive, or two
  // negative numbers, is addition. Adding a positive to a negative, or a
  // negative to a positive, is subtraction.
  BigNum operator+(const BigNum &other_ref)
  {
    BigNum result;
    const BigNum *other = &other_ref;
    bool thisbigger = this->get_size_abs() > other->get_size_abs();
    if (this->negative) {
      if (other->negative) {
        SORTSIZE(result = BigNum::add);
        result.negative = true;
      }
      else {
        result = BigNum::subtract(other, this);
      }
    }
    else {
      if (other->negative) {
        result = BigNum::subtract(this, other);
      }
      else {
        SORTSIZE(result = BigNum::add);
      }
    }

    return result;
  }


  // Subtraction. Similar to addition above, but a negation operation is applied
  // to the right hand operand.
  BigNum operator-(const BigNum &other_ref)
  {
    BigNum result;
    const BigNum *other = &other_ref;
    bool thisbigger = this->get_size_abs() > other->get_size_abs();
    if (this->negative) {
      if (!other->negative) {
        SORTSIZE(result = BigNum::add);
        result.negative = true;
      }
      else {
        result = BigNum::subtract(this, other);
      }
    }
    else {
      if (!other->negative) {
        result = BigNum::subtract(other, this);
      }
      else {
        SORTSIZE(result = BigNum::add);
      }
    }

    result.negative = negative;
    return result;
  }


  // Multiplication. Need to sort the operands into bigger-first-order, then
  // call on the algorithm defined in multiply.cpp.
  BigNum operator*(const BigNum &other_ref)
  {
    const BigNum *other = &other_ref;
    bool thisbigger = this->get_size_abs() > other->get_size_abs();
    BigNum result;
    SORTSIZE(result = BigNum::multiply)
    result.negative = (other->negative != this->negative);
    return result;
  }


  // Division. No size-sorting this time, just call the divrem class method
  // (defined in divide.cpp) providing location reference for the quotient to be
  // returned.
  BigNum operator/(const BigNum &other)
  {
    BigNum quotient;
    BigNum::divrem(this, &other, &quotient, NULL);
    quotient.negative = (other.negative != this->negative);
    return quotient;
  }


  // Modulo. Same algorithm as division (divrem in divide.cpp), except we
  // provide the remainder address instead of the quotient.
  BigNum operator%(const BigNum &other)
  {
    BigNum remainder;
    BigNum::divrem(this, &other, NULL, &remainder);
    if (other.negative) {
      remainder = remainder - other;
    }
    return remainder;
  }


  // Greater than comparison.
  bool operator>(const BigNum &other)
  {
    if (other.negative != this->negative)
      return other.negative;

    size_t this_size = this->get_size_abs(), other_size = other.get_size_abs();

    if (this_size > other_size)
      return !this->negative; // this is bigger: if NOT negative then this is greater than.

    if (this_size < other_size)
      return this->negative; // this is smaller: if negative then this is greater than.

    if (this_size == 0)
      return false; // both the same: both zero

    int i = this->get_size_abs();
    while (--i >= 0 && this->digits[i] == other.digits[i]);

    if (i < 0)
      return false; // both the same

    return this->digits[i] > other.digits[i];
  }


  // Equality comparison.
  bool operator==(const BigNum &other)
  {
    if (this->get_size_abs() != other.get_size_abs()) {
      return false;
    }

    if (this->negative != other.negative) {
      return false;
    }

    for (size_t i = 0; i < this->get_size_abs(); i++) {
      if (this->digits[i] != other.digits[i])
        return false;
    }

    return true;
  }


  // not equal to
  bool operator!=(const BigNum &other)
  {
    return !((*this) == other);
  }


  // greater than or equal to
  bool operator>=(const BigNum &other)
  {
    return ((*this) > other) || ((*this) == other);
  }


  // less than
  bool operator<(const BigNum &other)
  {
    // less than is not greater or equal to
    return !((*this) >= other);
  }


  // less than or equal to
  bool operator<=(const BigNum &other)
  {
    // less than or equal is not greater
    return !((*this) > other);
  }


  // String representation. Show the value of the BigNum as a string.
  std::string repr() const {

    // TODO: this solution is temporary: it will over- and underflow if bigger
    // numbers than a long long is capable of storing is used.
    int i = this->get_size_abs();
    if (i == 0)
      return "0";

    long long res = 0;
    for (i--; i >= 0; i--) {
      res <<= DIGIT_SHIFT;
      res += this->digits[i];
    }

    std::stringstream ss;
    if (this->is_negative()) ss << "-";
    ss << res;
    return ss.str();
  }


  // Get the 'size' of the BigNum: this gives the number of digits used to store
  // the number. This is used in the maths operations to compare objects and to
  // decide result sizes.
  size_t get_size_abs() const
  {
    return this->digits.size();
  }


  // Get if the number is negative or not.
  bool is_negative() const
  {
    return this->negative;
  }

};

We have a new class object which can be initialised given a long value to hold. Operators + - * / % < <= == >= > are overloaded with the correct functions to perform multi-byte maths or comparison.

Addition

#include "bignum.hpp"

// unsigned addition of two BigNums. First operand is the larger of the two
// (spread across more digits).
BigNum BigNum::add(const BigNum *large_addend, const BigNum *small_addend)
{
  size_t
    size_large_addend = large_addend->get_size_abs(),
    size_small_addend = small_addend->get_size_abs(),
    i = 0;

  BigNum result;
  digit carry = 0;

  for (i = 0; i < size_small_addend; i++) {
    carry += large_addend->digits[i] + small_addend->digits[i];
    result.digits.push_back(carry & DIGIT_MASK);
    carry = carry >> DIGIT_SHIFT;
  }

  for (; i < size_large_addend; i++) {
    carry += large_addend->digits[i];
    result.digits.push_back(carry & DIGIT_MASK);
    carry = carry >> DIGIT_SHIFT;
  }

  result.digits.push_back(carry);

  result.normalise();
  return result;
}

Subtraction

#include <iostream>

#include "bignum.hpp"


// unsigned BigNum subtraction. If the number to be subtracted (the subtrahend)
// is larger than the number from which it is subtracted (the minuend) --- if
// the subtraction would cross zero --- then the values are swapped and the
// result is negated.
BigNum BigNum::subtract(const BigNum *minuend, const BigNum *subtrahend)
{
  size_t
    size_minuend = minuend->get_size_abs(),
    size_subtrahend = subtrahend->get_size_abs();
  const BigNum *temp;
  bool negative = false;

  if (size_minuend > size_subtrahend) {
    temp = minuend;
    minuend = subtrahend;
    subtrahend = temp;
    size_minuend = minuend->get_size_abs();
    size_subtrahend = subtrahend->get_size_abs();
    negative = true;
  }
  else if (size_minuend == size_subtrahend) {
    int i = size_minuend;
    while (--i >= 0 && minuend->digits[i] == subtrahend->digits[i]);

    if (i < 0)
      return BigNum(0);

    if (minuend->digits[i] < subtrahend->digits[i]) {
      negative = true;
      temp = minuend;
      minuend = subtrahend;
      subtrahend = temp;
    }

    size_minuend = size_subtrahend = i+1;
  }

  BigNum result;
  digit borrow = 0;
  size_t i;
  for (i = 0; i < size_subtrahend; i++) {
    borrow = minuend->digits[i] - subtrahend->digits[i] - borrow;
    result.digits.push_back(borrow & DIGIT_MASK);
    borrow = borrow >> DIGIT_SHIFT;
    borrow &= 1;
  }

  for (; i < size_minuend; i++) {
    borrow = minuend->digits[i] - borrow;
    result.digits.push_back(borrow & DIGIT_MASK);
    borrow = borrow >> DIGIT_SHIFT;
    borrow &= 1;
  }

  if (borrow != 0) {
    std::cerr << "borrow not zero; something's not right." << std::endl;
    exit(1);
  }

  result.negative = negative;
  result.normalise();
  return result;
}

Multiplication

#include "bignum.hpp"

// unisgned multiplication of BigNums.
BigNum BigNum::multiply(const BigNum *large_factor, const BigNum *small_factor)
{
  size_t
    size_large_factor = large_factor->get_size_abs(),
    size_small_factor = small_factor->get_size_abs();

  BigNum result;
  for (unsigned int i = 0; i < (size_large_factor + size_small_factor); i++)
    result.digits.push_back(0);

  unsigned int i_result, i_small_factor;
  for (unsigned int i_large_factor = 0; i_large_factor < size_large_factor; i_large_factor++) {
    twodigits carry = 0;
    twodigits f = large_factor->digits[i_large_factor];
    i_result = i_large_factor;
    i_small_factor = 0;

    for (i_small_factor = 0; i_small_factor < size_small_factor; i_small_factor++) {
      carry += result.digits[i_result] + small_factor->digits[i_small_factor] * f;
      result.digits[i_result++] = (digit)(carry & DIGIT_MASK);
      carry = carry >> DIGIT_SHIFT;
    }

    if (carry)
      result.digits[i_result] += (digit)(carry & DIGIT_MASK);

  }

  return result;
}

Division and Remainder

#include "bignum.hpp"

// This is a very, VERY, simple algorithm to calculate the quotient of a
// unsigned integer dividend divided by an unisgned integer divisor. The
// remainder is also output.
void BigNum::divrem(const BigNum *dividend_ptr, const BigNum *divisor_ptr,
                    BigNum *quotient, BigNum *remainder)
{
  BigNum _remainder = (*dividend_ptr);
  BigNum _quotient(0);
  const BigNum one(1);
  const BigNum divisor = (*divisor_ptr);
  while (_remainder > BigNum(0)) {
    _remainder = _remainder - divisor;
    _quotient = _quotient + one;
  }

  if (quotient != NULL)
    (*quotient) = _quotient;

  if (remainder != NULL)
    (*remainder) = _remainder;

}

Summary

In this post I looked at the big integer implementation in Python, and I created a rudimentary version in C++. While fairly simple, I (re)learned a lot about integer mathematics and the nuts and bolts involved in performing these everyday operations.

I also took my first dive into a large open source codebase to find answers! This was pretty exciting for me. It's incredible that some random person can have a question about exactly how a thing is implemented in the software they use, and then go look it up no questions. This is the amazing thing about open source software!

There are some bits and pieces in this implementation I've labeled TODO, like parsing a string to BigNum. I will probably not come back to this, but if I do I know where to start!

All of the C++ code here is on github.


Questions? Comments? Get in touch on Twitter!