Python's stretchy integers.
09 Jan 2020Big Numbers
In Python, you can do maths with arbitrarily large numbers:
2 ** 100 # = 1267650600228229401496703205376
The above is perfectly vaild and will not overflow, despite the maximum value
of a 32bit unsigned integer of 9223372036854775807
. This is a really cool
bit of functionality which I've been wanting to include in my lisp
interpreter. Let's see how Python does it.
This functionality was introduced to Python in PEP 237: "Unifying Long
Integers and Integers". In the PEP, the implementation is brushed upon: Python
Ints
are abstractions over internal long
types and bignums
; a bignum
being an array/vector of short ints representing arbitrary length
integers.
Cool! That sounds pretty simple actually. An array of integers: just need to write functions describing how to do maths with these stretchy ints, how to compare them, how to convert them to strings, and how to convert a string to a stretchy int. Hmmm not quite simple, but doable.
The hardest parts, I think, will be the interconversion between a bignumn
and string
representations. The maths and comparison should be relatively
easy to do, probably just needing a brush up of binary maths algorithms.
Searching for the source
Looking at the cPython relevant source code (longobject.c longobject.h longintrepr.h), we can get some detail about the implementation. The last header is where the meat of the structure is:
struct _longobject { PyObject_VAR_HEAD digit ob_digit[1]; };
And that's it.
A digit is the smallest increment of memory the long uses, and can either be a "normal" 32-bit integer, or a short int (16-bit). There are limitations of the algorithms used to do maths with the resulting long object that prevent smaller digits, in fact the algorithms require either 30 or 15 bits in a digit.
The object is a "PyObject_VAR_HEAD
", which means it has a variable size (as
the digit array can be grown or shrunk). The macro "Py_SIZE
" is used to
return the size of a variable-sized Python object, which in this case would
return the number of digits. However, this macro is also used to store the
sign of a long object. It is common throughout the source to see things like
if (Py_SIZE(z) <0) { ...
which doesn't make sense upon first look, but using
the size member to store the sign is a pretty clever wait to do it, using a
separate variable would be a waste of memory.
Also defined in longintrepr.h
are:
PyLong_SHIFT
- the number of bits to shift by (30 or 15)
_PyLong_DECIMAL_SHIFT
- shifting a decimal multiplies by 10what?
_PyLong_DECIMAL_BASE
- 10_PyLongDECIMALSHIFT
PyLong_BASE
- as for decimal, but for binary: 2shift
PyLong_MASK
PyLong_BASE
- 1
Now we can look into the implementation longobject.c
file, and see how data
is handled.
1 + 1 = ?
Let's start with addition:
static PyObject * long_add(PyLongObject *a, PyLongObject *b) { PyLongObject *z; CHECK_BINOP(a, b); if (Py_ABS(Py_SIZE(a)) <= 1 && Py_ABS(Py_SIZE(b)) <= 1) { return PyLong_FromLong(MEDIUM_VALUE(a) + MEDIUM_VALUE(b)); } if (Py_SIZE(a) < 0) { if (Py_SIZE(b) < 0) { z = x_add(a, b); if (z != NULL) { /* x_add received at least one multiple-digit int, and thus z must be a multiple-digit int. That also means z is not an element of small_ints, so negating it in-place is safe. */ assert(Py_REFCNT(z) == 1); Py_SIZE(z) = -(Py_SIZE(z)); } } else z = x_sub(b, a); } else { if (Py_SIZE(b) < 0) z = x_sub(a, b); else z = x_add(a, b); } return (PyObject *)z; }
This function looks at the values of the operands, a
and b
. Why they
couldn't have chosen semi-descriptive names, I don't know. It checks if a
and b
either span more than a digit, and if not it adds them together
directly, returning a new PyLong
. Otherwise, call the function x_add
(or
x_sub
) on the operands. Let's follow the trail:
/* Add the absolute values of two integers. */ static PyLongObject * x_add(PyLongObject *a, PyLongObject *b) { Py_ssize_t size_a = Py_ABS(Py_SIZE(a)), size_b = Py_ABS(Py_SIZE(b)); PyLongObject *z; Py_ssize_t i; digit carry = 0; /* Ensure a is the larger of the two: */ if (size_a < size_b) { { PyLongObject *temp = a; a = b; b = temp; } { Py_ssize_t size_temp = size_a; size_a = size_b; size_b = size_temp; } } z = _PyLong_New(size_a+1); if (z == NULL) return NULL; for (i = 0; i < size_b; ++i) { carry += a->ob_digit[i] + b->ob_digit[i]; z->ob_digit[i] = carry & PyLong_MASK; carry >>= PyLong_SHIFT; } for (; i < size_a; ++i) { carry += a->ob_digit[i]; z->ob_digit[i] = carry & PyLong_MASK; carry >>= PyLong_SHIFT; } z->ob_digit[i] = carry; return long_normalize(z); }
This function takes two unsigned numbers and adds them. The first wee part is
getting ready to do the addition (ensuring a > b
, allocating result z
),
the meat of the function starts with the first loop:
for (i = 0; i < size_b; ++i) {
carry += a->ob_digit[i] + b->ob_digit[i];
z->ob_digit[i] = carry & PyLong_MASK;
carry >>= PyLong_SHIFT;
}
As a digit is int16_t
or int32_t
, but the shift is 15
or 30
, there is
guaranteed to be enough room in a digit to store the addition of two digits,
so a digit is sufficient for the carry
. Starting from the smallest digit,
digits of the operands are added with the result in carry
, this is masked
and stored in the result and the carry is right-shifted so it contains the
carry relevant for the next digits.
This is fairly straightforward, but here's a wee run-through. Lets make the digit size 4 (24 = 16) with a shift of 3 and let's add 40 and 20:
# 40 -> 0b10 1000 # 20 -> 0b010100 a = [8, 2] b = [4, 1] z = [0, 0, 0] carry = 0 mask = 15 # = 0b1111 shift = 4 # first iteration carry = carry + a[0] + b[0] # -> 0 + 8 + 4 = 12 z[0] = carry & mask # -> 0b1100 & 0b1111 -> 0b1100 -> 12 carry = carry >> shift # -> 0 # second iteration carry = carry + a[1] + b[1] # -> 0 + 2 + 1 = 3 z[1] = carry & mask # -> 0b0011 & 0b1111 -> 0b0011 -> 3 carry = carry >> shift # -> 0 z[2] = carry #z = [12, 3, 0] -> 12 + 3*(2**shift) + 0 -> 60
Cool! The two loops work doing the same thing. The first loop adds up pairs of
digits from a
and b
, and as a
is the larger array of digits, the second
loop carrys the carry
up for the rest of a. Finally, the carry at the end is
put into the final digit of the result which is normalised (leading zeroes
removed) and returned.
So this process adds two multi-byte unsigned integers. To add negative numbers and so on, the process is wrapped in another function to check the signs of the operands, if they are both positive, or both negative, the above process is valid. Otherwise, we need something else…
… subtraction.
3 - 1 = ?
Again, let's start with the raw subtraction function:
/* Subtract the absolute values of two integers. */ static PyLongObject * x_sub(PyLongObject *a, PyLongObject *b) { Py_ssize_t size_a = Py_ABS(Py_SIZE(a)), size_b = Py_ABS(Py_SIZE(b)); PyLongObject *z; Py_ssize_t i; int sign = 1; digit borrow = 0; /* Ensure a is the larger of the two: */ if (size_a < size_b) { sign = -1; { PyLongObject *temp = a; a = b; b = temp; } { Py_ssize_t size_temp = size_a; size_a = size_b; size_b = size_temp; } } else if (size_a == size_b) { /* Find highest digit where a and b differ: */ i = size_a; while (--i >= 0 && a->ob_digit[i] == b->ob_digit[i]) ; if (i < 0) return (PyLongObject *)PyLong_FromLong(0); if (a->ob_digit[i] < b->ob_digit[i]) { sign = -1; { PyLongObject *temp = a; a = b; b = temp; } } size_a = size_b = i+1; } z = _PyLong_New(size_a); if (z == NULL) return NULL; for (i = 0; i < size_b; ++i) { /* The following assumes unsigned arithmetic works module 2**N for some N>PyLong_SHIFT. */ borrow = a->ob_digit[i] - b->ob_digit[i] - borrow; z->ob_digit[i] = borrow & PyLong_MASK; borrow >>= PyLong_SHIFT; borrow &= 1; /* Keep only one sign bit */ } for (; i < size_a; ++i) { borrow = a->ob_digit[i] - borrow; z->ob_digit[i] = borrow & PyLong_MASK; borrow >>= PyLong_SHIFT; borrow &= 1; /* Keep only one sign bit */ } assert(borrow == 0); if (sign < 0) { Py_SIZE(z) = -Py_SIZE(z); } return maybe_small_long(long_normalize(z)); }
There's not much more to this func as there was in the previous: the main part
of it is, again, from the first for
loop:
for (i = 0; i < size_b; ++i) { /* The following assumes unsigned arithmetic works module 2**N for some N>PyLong_SHIFT. */ borrow = a->ob_digit[i] - b->ob_digit[i] - borrow; z->ob_digit[i] = borrow & PyLong_MASK; borrow >>= PyLong_SHIFT; borrow &= 1; /* Keep only one sign bit */ }
Borrow is a digit, which is big enough to hold the value of any operation
between two other digits of size PyLong_SHIFT
, initially zero. In that first
line, borrow stores the result of the subtraction, which is then masked and
stored in the result on the next line. borrow
is then shifted. So far, this
is all the same as in addition, except the operation has changed to '-' and
'carry' to 'borrow'. Hmm but what about that last line? And what does that
comment about unsigned arithmetic mean?
By right-shifting borrow, all that's left is a digit representing whether…
Let's look at a pseudo-code run through of this process. We'll use the 4-bit digit as before, and let's subtract 20 from 30:
a = [14, 1] # -> 0001 1110 -> 30 b = [4, 1] # -> 0001 0100 -> 20 shift = 4 mask = 0b111 result = [0, 0, 0] borrow = 0 borrow = a[0] - b[0] - borrow # -> 14 - 4 - 0 -> 10 result[0] = borrow & mask # -> 0b1010 & 0b1111 -> 0b1010 -> 10 borrow = borrow >> shift # -> 0 borrow &= 1 # -> 0 borrow = a[1] - b[1] - borrow # -> 1 - 1 - 0 -> 0 result[1] = borrow & mask # -> 0b0000 & 0b1111 -> 0b0000 -> 0 borrow = borrow >> shift # -> 0 borrow &= 1 # -> 0 assert borrow == 0 # after all this, borrow should be zero # result = 0b1010 -> 10
Okay cool, making sense so far. What if we introduce a situation in which we need to use the borrow? Let's subtract 30 from 36:
a = [4, 2] # -> 0010 1110 -> 36 b = [14, 1] # -> 0001 1110 -> 30 shift = 4 mask = 0b111 result = [0, 0] borrow = 0 borrow = a[0] - b[0] - borrow # -> 4 - 14 - 0 -> -10 -> 0b11110110 (!) result[0] = borrow & mask # -> 0b11110110 & 0b1111 -> 0b0110 -> 6 borrow = borrow >> shift # -> 0b11110110 >> 4 -> 0b1111 borrow &= 1 # -> 1 borrow = a[0] - b[0] - borrow # -> 2 - 1 - 1 -> 0 result[1] = borrow & mask # -> 0b0000 & 0b1111 -> 0b0000 -> 0 borrow = borrow >> shift # -> 0b0000 >> 4 -> 0 borrow &= 1 # -> 0 # result = 0b0110 -> 6
Ahh! It's two's complement! A negative number represented in binary two's
complement is essentially MAX-number, so an 8-bit number (0-255) is
255-number. The borrow &= 1
checks to see if there is a 1
in the position
above the digit base, if there is then the result of the subtraction was
negative and a unit needs to be borrowed from the next digit.
Two operations down, three to go!
1 * 2 = ?
All right. Let's step it up a notch: multiplication.
In the code, it can be seen there are two algorithms in use for multiplication of multi-digit numbers. There's the "grade school" multiplication method, and the Karatsuba algorithm.
Let's just tackle the grade school method for now, I may come back to the Karatsuba algorithm (when I get my head around it!) The gradeschool algorithm has a small efficiency boost in that equal values (squaring) halves the number of multiplications necessary. We'll ignore this speed up for now, and concentrate on multiplication alone:
/* Grade school multiplication, ignoring the signs. * Returns the absolute value of the product, or NULL if error. */ static PyLongObject * x_mul(PyLongObject *a, PyLongObject *b) { PyLongObject *z; Py_ssize_t size_a = Py_ABS(Py_SIZE(a)); Py_ssize_t size_b = Py_ABS(Py_SIZE(b)); Py_ssize_t i, iz, ib, bend; z = _PyLong_New(size_a + size_b); if (z == NULL) return NULL; memset(z->ob_digit, 0, Py_SIZE(z) * sizeof(digit)); /* removed: same-value multiplication case */ for (i = 0; i < size_a; ++i) { twodigits carry = 0; twodigits f = a->ob_digit[i]; iz = i; //digit *pz = z->ob_digit + i; ib = 0; //digit *pb = b->ob_digit; //digit *pbend = b->ob_digit + size_b; SIGCHECK({ Py_DECREF(z); return NULL; }); //while (pb < pbend) { while (ib < bend) { carry += z->ob_digit[iz] + b->ob_digits[ib++] * f; z->ob_digit[iz++] = (digit)(carry & PyLong_MASK); carry >>= PyLong_SHIFT; assert(carry <= PyLong_MASK); } if (carry) z->ob_digit[iz] += (digit)(carry & PyLong_MASK); assert((carry >> PyLong_SHIFT) == 0); } return long_normalize(z); }
Note: in the source, they use pointers to values in the digit array. I have converted this to indices and get[] operations (leaving the original in as comments).
So this algorithm, now we've stripped out all the tough bits, is fairly
simple: each digit in the larger number (a
) multiplies each digit in the
smaller number (b
). The carry is a bit larger than in addition and
subtraction, but other than that its a similar process with the operations
switched to *
.
What's next?
4 / 2 = ?
Algorithm D (Division of nonnegative integers).
Ahh, division. Division is done via a "divide and modulo" function
long_divmod
, the output is given by out parameters passed to the function,
so we can take two outputs from the func (either the quotient, or the
modulo). long_divmod
actually calls another function, long_divrem
which
calculates quotient and remainder, by calling another function x_divrem
:
/* Unsigned int division with remainder -- the algorithm. The arguments v1 and w1 should satisfy 2 <= Py_ABS(Py_SIZE(w1)) <= Py_ABS(Py_SIZE(v1)). */ static PyLongObject * x_divrem(PyLongObject *v1, PyLongObject *w1, PyLongObject **prem) { PyLongObject *v, *w, *a; Py_ssize_t i, k, size_v, size_w; int d; digit wm1, wm2, carry, q, r, vtop, *v0, *vk, *w0, *ak; twodigits vv; sdigit zhi; stwodigits z; /* We follow Knuth [The Art of Computer Programming, Vol. 2 (3rd edn.), section 4.3.1, Algorithm D], except that we don't explicitly handle the special case when the initial estimate q for a quotient digit is >= PyLong_BASE: the max value for q is PyLong_BASE+1, and that won't overflow a digit. */ /* allocate space; w will also be used to hold the final remainder */ size_v = Py_ABS(Py_SIZE(v1)); size_w = Py_ABS(Py_SIZE(w1)); assert(size_v >= size_w && size_w >= 2); /* Assert checks by div() */ v = _PyLong_New(size_v+1); if (v == NULL) { *prem = NULL; return NULL; } w = _PyLong_New(size_w); if (w == NULL) { Py_DECREF(v); *prem = NULL; return NULL; } /* normalize: shift w1 left so that its top digit is >= PyLong_BASE/2. shift v1 left by the same amount. Results go into w and v. */ d = PyLong_SHIFT - bits_in_digit(w1->ob_digit[size_w-1]); carry = v_lshift(w->ob_digit, w1->ob_digit, size_w, d); assert(carry == 0); carry = v_lshift(v->ob_digit, v1->ob_digit, size_v, d); if (carry != 0 || v->ob_digit[size_v-1] >= w->ob_digit[size_w-1]) { v->ob_digit[size_v] = carry; size_v++; } /* Now v->ob_digit[size_v-1] < w->ob_digit[size_w-1], so quotient has at most (and usually exactly) k = size_v - size_w digits. */ k = size_v - size_w; assert(k >= 0); a = _PyLong_New(k); if (a == NULL) { Py_DECREF(w); Py_DECREF(v); *prem = NULL; return NULL; } v0 = v->ob_digit; w0 = w->ob_digit; wm1 = w0[size_w-1]; wm2 = w0[size_w-2]; for (vk = v0+k, ak = a->ob_digit + k; vk-- > v0;) { /* inner loop: divide vk[0:size_w+1] by w0[0:size_w], giving single-digit quotient q, remainder in vk[0:size_w]. */ SIGCHECK({ Py_DECREF(a); Py_DECREF(w); Py_DECREF(v); *prem = NULL; return NULL; }); /* estimate quotient digit q; may overestimate by 1 (rare) */ vtop = vk[size_w]; assert(vtop <= wm1); vv = ((twodigits)vtop << PyLong_SHIFT) | vk[size_w-1]; q = (digit)(vv / wm1); r = (digit)(vv - (twodigits)wm1 * q); /* r = vv % wm1 */ while ((twodigits)wm2 * q > (((twodigits)r << PyLong_SHIFT) | vk[size_w-2])) { --q; r += wm1; if (r >= PyLong_BASE) break; } assert(q <= PyLong_BASE); /* subtract q*w0[0:size_w] from vk[0:size_w+1] */ zhi = 0; for (i = 0; i < size_w; ++i) { /* invariants: -PyLong_BASE <= -q <= zhi <= 0; -PyLong_BASE * q <= z < PyLong_BASE */ z = (sdigit)vk[i] + zhi - (stwodigits)q * (stwodigits)w0[i]; vk[i] = (digit)z & PyLong_MASK; zhi = (sdigit)Py_ARITHMETIC_RIGHT_SHIFT(stwodigits, z, PyLong_SHIFT); } /* add w back if q was too large (this branch taken rarely) */ assert((sdigit)vtop + zhi == -1 || (sdigit)vtop + zhi == 0); if ((sdigit)vtop + zhi < 0) { carry = 0; for (i = 0; i < size_w; ++i) { carry += vk[i] + w0[i]; vk[i] = carry & PyLong_MASK; carry >>= PyLong_SHIFT; } --q; } /* store quotient digit */ assert(q < PyLong_BASE); *--ak = q; } /* unshift remainder; we reuse w to store the result */ carry = v_rshift(w0, v0, size_w, d); assert(carry==0); Py_DECREF(v); *prem = long_normalize(w); return long_normalize(a); }
You know any code which includes a citation to Knuth is gonna be something special. The citation is to /"The Art of Computer Programming, Volume 2: Seminumerical Algorithms, Chapter 4.3: Multiple-Precision Arithmetic, Algorithm D"/. I don't have access to this book (yet, it's on my wish list!), so after a bit of searching I found a website with the algorithm copied out. The author of that article adds a definition stage providing detail about the variables used in the algorithm:
- Define
- Let U be the dividend (or numerator) of m+n "digits", stored in an array of m+n elements, one "digit" per element, with the most significant "digit" in element U[m+n−1] and the least significant "digit" in element U[0];
- Let V be the divisor (or denominator) of n "digits", stored in a second array of n elements, with n greater than 1, a non-zero most significant "digit" in element V[n−1] and the least significant "digit" in element V[0];
- Let B be the base (or radix) of the "digits" (also "limbs", "places" or "words"), typically (a power of) a power of 2, with B greater than 1.
- The algorithm computes the quotient Q of m+1 "digits" as (U ⁄ V), and
the remainder R of n "digits" as U % V, using the following "primitive"
operations:
- addition or subtraction of two single-digit integers, giving a single-digit sum and a carry, or a single-digit difference and a borrow;
- "widening" multiplication of two single-digit integers, giving a double-digit product;
- "narrowing" division of a double-digit integer by a single-digit integer, giving a single-digit quotient and a single-digit remainder.
- Normalize
- Set D to (B − 1) ÷ V[n−1];
- Multiply all "digits" of U and V by D.
- (On a binary computer, choose D to be a power of 2 instead of the value provided above; any value of D that results in V[n−1] not less than B ÷ 2 will suffice. If D is greater than 1, the eventual overflow "digit" of the dividend U goes into element U[m+n].)
- Initialize j
- Set the loop counter j to m.
- Calculate Q′
- Set Q′ to (U[n+j] * B + U[n−1+j]) ÷ V[n−1];
- Set R′ to (U[n+j] * B + U[n−1+j]) % V[n−1];
- Test if Q′ equals B or Q′ * V[n−2] is greater than R′ * B + U[n−2+j];
- If yes, then decrease Q′ by 1, increase R′ by V[n−1], and repeat this test while R is less than B.
- Multiply and subtract
- Replace (U[n+j]U[n−1+j]…U[j]) by (U[n+j]U[n−1+j]…U[j]) − Q′ * (V[n−1]…V[1]V[0]).
- (The "digits" (U[n+j]…U[j]) should be kept positive; if the result of this step is actually negative, (U[n+j]…U[j]) should be left as the true value plus Bn+1, namely as the B’s complement of the true value, and a borrow to the left should be remembered.)
- Test remainder
- Set Q[j] to Q′;
- If the result of step D4 was negative, i.e. the subtraction needed a borrow, then proceed with step D6; otherwise proceed with step D7.
- Add back
- Decrease Q[j] by 1 and add (0V[n−1]…V[1]V[0]) to (U[n+j]U[n−1+j]…U[1+j]U[j]).
- (A carry will occur to the left of U[n+j], and it should be ignored since it cancels with the borrow that occurred in step D4.)
- Loop on j
- Decrease j by 1;
- Test if j is not less than 0;
- If yes, go back to step D3.
- Unnormalize
- Now (Q[m]…Q[1]Q[0]) is the desired quotient Q, and the desired remainder R may be obtained by dividing (U[n−1]…U[1]U[0]) by D.
Oooft.
Looking through that is pretty intimidating. I have a rudimentary idea of how
to perform this long division in binary, but arbitrary length integer
division written out so formally makes my eyes water. I don't want to copy
code I don't understand, so I won't. I do intend to work towards this, but
for now I will use my terrible terrible solution (sketched here in python
):
def divrem(dividend, divisor): remainder = dividend quotient = 0 while remainder > 0: remainder -= divisor quotient += 1 if remainder < 0: remainder += divisor quotient -= 1 return quotient, remainder
As multiplication is repeated addition: integer division is repeated
subtraction. To implement this, we also need to be able to compare BigNum
types, at least this is fairly trivial.
C++
implementation
Right, home stretch! Let's put what we've got together, and C++
-ify it a
bit.
Here's the header for the BigNum
class:
#include <vector> #include <sstream> // Define some types. A 'digit' is the smallest unit of integer: the storage // size of each element in the vector representing the big integer. 'twodigits' // is as it sounds: a datatype large enough to hold something twice as big as a // digit. typedef unsigned short digit; typedef unsigned int twodigits; // DIGIT_SHIFT defines the bit-width of a number in a single digit. So moving // from a digit in the LSB position to LSB+1 position left-shifts by DIGIT_SHIFT // amount. In the Python implementation, they choose a digit shift of 15 or 30 // as this allows them to use more complicated algorithms giving performance // boosts. #define DIGIT_SHIFT 15 // DIGIT_BASE is the largest number representable by the digit, plus one. If // DIGIT_SHIFT was 3, then our base would be 8, the maximum representable number // 0b111 (decimal 7). #define DIGIT_BASE (1<<DIGIT_SHIFT) // DIGIT_MASK is the number where all the bits of the digit are one: the larges // number representable by the digit. #define DIGIT_MASK (DIGIT_BASE-1) // Some of the algorithms used require the operands to be in order of number of // digits. This macro will run function on pointer 'this' and pointer 'other' in // varying orders depending on the result of the size comparison stored in // variable 'thisbigger'. #define SORTSIZE(FUNCTION)\ if (thisbigger) {\ FUNCTION(this, other);\ }\ else {\ FUNCTION(other, this);\ } // BigNum(); // BigNum(long number); // // A BigNum represents an Integer accross any number of bytes. Methods used here // follow mostly along with that used by the Python developers, simplified as // necessary. class BigNum { private: // The integer is describe entirely by this vector of digits, and a sign. std::vector<digit> digits; bool negative; // Class methods described here are defined in add.cpp, subtract.cpp, // multiply.cpp, and divide.cpp // https://en.wikipedia.org/wiki/Addition#Notation_and_terminology static BigNum add (const BigNum *large_addend, const BigNum *small_addend); // https://en.wikipedia.org/wiki/Subtraction#Notation_and_terminology static BigNum subtract(const BigNum *minuend, const BigNum *subtrahend); // https://en.wikipedia.org/wiki/Multiplication#Notation_and_terminology static BigNum multiply(const BigNum *large_factor, const BigNum *small_factor); // https://en.wikipedia.org/wiki/Division_(mathematics)#Notation static void divrem(const BigNum *dividend, const BigNum *divisor, BigNum *quotient, BigNum *remainder); // BigNum::normalise(); // Removes leading zeros from the digit vector. void normalise() { while (this->digits.back() == 0) this->digits.pop_back(); } public: // Default constructor: creates a big number with value '0'. BigNum() { this->negative = false; } // Initialise constructor: creates a BigNum from the long value given. BigNum(long value) { this->negative = value < 0; if (this->negative) value = -value; long carry = value; while(carry) { this->digits.push_back(carry & DIGIT_MASK); carry = carry >> DIGIT_SHIFT; } } // Initialise constructor: create a BigNum from the string. BigNum(std::string value); // TODO // Addition. This code wraps the unsigned addition function defined in add.cpp // and unsigned subtraction in subtract.cpp. Adding two positive, or two // negative numbers, is addition. Adding a positive to a negative, or a // negative to a positive, is subtraction. BigNum operator+(const BigNum &other_ref) { BigNum result; const BigNum *other = &other_ref; bool thisbigger = this->get_size_abs() > other->get_size_abs(); if (this->negative) { if (other->negative) { SORTSIZE(result = BigNum::add); result.negative = true; } else { result = BigNum::subtract(other, this); } } else { if (other->negative) { result = BigNum::subtract(this, other); } else { SORTSIZE(result = BigNum::add); } } return result; } // Subtraction. Similar to addition above, but a negation operation is applied // to the right hand operand. BigNum operator-(const BigNum &other_ref) { BigNum result; const BigNum *other = &other_ref; bool thisbigger = this->get_size_abs() > other->get_size_abs(); if (this->negative) { if (!other->negative) { SORTSIZE(result = BigNum::add); result.negative = true; } else { result = BigNum::subtract(this, other); } } else { if (!other->negative) { result = BigNum::subtract(other, this); } else { SORTSIZE(result = BigNum::add); } } result.negative = negative; return result; } // Multiplication. Need to sort the operands into bigger-first-order, then // call on the algorithm defined in multiply.cpp. BigNum operator*(const BigNum &other_ref) { const BigNum *other = &other_ref; bool thisbigger = this->get_size_abs() > other->get_size_abs(); BigNum result; SORTSIZE(result = BigNum::multiply) result.negative = (other->negative != this->negative); return result; } // Division. No size-sorting this time, just call the divrem class method // (defined in divide.cpp) providing location reference for the quotient to be // returned. BigNum operator/(const BigNum &other) { BigNum quotient; BigNum::divrem(this, &other, "ient, NULL); quotient.negative = (other.negative != this->negative); return quotient; } // Modulo. Same algorithm as division (divrem in divide.cpp), except we // provide the remainder address instead of the quotient. BigNum operator%(const BigNum &other) { BigNum remainder; BigNum::divrem(this, &other, NULL, &remainder); if (other.negative) { remainder = remainder - other; } return remainder; } // Greater than comparison. bool operator>(const BigNum &other) { if (other.negative != this->negative) return other.negative; size_t this_size = this->get_size_abs(), other_size = other.get_size_abs(); if (this_size > other_size) return !this->negative; // this is bigger: if NOT negative then this is greater than. if (this_size < other_size) return this->negative; // this is smaller: if negative then this is greater than. if (this_size == 0) return false; // both the same: both zero int i = this->get_size_abs(); while (--i >= 0 && this->digits[i] == other.digits[i]); if (i < 0) return false; // both the same return this->digits[i] > other.digits[i]; } // Equality comparison. bool operator==(const BigNum &other) { if (this->get_size_abs() != other.get_size_abs()) { return false; } if (this->negative != other.negative) { return false; } for (size_t i = 0; i < this->get_size_abs(); i++) { if (this->digits[i] != other.digits[i]) return false; } return true; } // not equal to bool operator!=(const BigNum &other) { return !((*this) == other); } // greater than or equal to bool operator>=(const BigNum &other) { return ((*this) > other) || ((*this) == other); } // less than bool operator<(const BigNum &other) { // less than is not greater or equal to return !((*this) >= other); } // less than or equal to bool operator<=(const BigNum &other) { // less than or equal is not greater return !((*this) > other); } // String representation. Show the value of the BigNum as a string. std::string repr() const { // TODO: this solution is temporary: it will over- and underflow if bigger // numbers than a long long is capable of storing is used. int i = this->get_size_abs(); if (i == 0) return "0"; long long res = 0; for (i--; i >= 0; i--) { res <<= DIGIT_SHIFT; res += this->digits[i]; } std::stringstream ss; if (this->is_negative()) ss << "-"; ss << res; return ss.str(); } // Get the 'size' of the BigNum: this gives the number of digits used to store // the number. This is used in the maths operations to compare objects and to // decide result sizes. size_t get_size_abs() const { return this->digits.size(); } // Get if the number is negative or not. bool is_negative() const { return this->negative; } };
We have a new class object which can be initialised given a long
value to
hold. Operators + - * / % < <= == >= >
are overloaded with the correct
functions to perform multi-byte maths or comparison.
Addition
#include "bignum.hpp" // unsigned addition of two BigNums. First operand is the larger of the two // (spread across more digits). BigNum BigNum::add(const BigNum *large_addend, const BigNum *small_addend) { size_t size_large_addend = large_addend->get_size_abs(), size_small_addend = small_addend->get_size_abs(), i = 0; BigNum result; digit carry = 0; for (i = 0; i < size_small_addend; i++) { carry += large_addend->digits[i] + small_addend->digits[i]; result.digits.push_back(carry & DIGIT_MASK); carry = carry >> DIGIT_SHIFT; } for (; i < size_large_addend; i++) { carry += large_addend->digits[i]; result.digits.push_back(carry & DIGIT_MASK); carry = carry >> DIGIT_SHIFT; } result.digits.push_back(carry); result.normalise(); return result; }
Subtraction
#include <iostream> #include "bignum.hpp" // unsigned BigNum subtraction. If the number to be subtracted (the subtrahend) // is larger than the number from which it is subtracted (the minuend) --- if // the subtraction would cross zero --- then the values are swapped and the // result is negated. BigNum BigNum::subtract(const BigNum *minuend, const BigNum *subtrahend) { size_t size_minuend = minuend->get_size_abs(), size_subtrahend = subtrahend->get_size_abs(); const BigNum *temp; bool negative = false; if (size_minuend > size_subtrahend) { temp = minuend; minuend = subtrahend; subtrahend = temp; size_minuend = minuend->get_size_abs(); size_subtrahend = subtrahend->get_size_abs(); negative = true; } else if (size_minuend == size_subtrahend) { int i = size_minuend; while (--i >= 0 && minuend->digits[i] == subtrahend->digits[i]); if (i < 0) return BigNum(0); if (minuend->digits[i] < subtrahend->digits[i]) { negative = true; temp = minuend; minuend = subtrahend; subtrahend = temp; } size_minuend = size_subtrahend = i+1; } BigNum result; digit borrow = 0; size_t i; for (i = 0; i < size_subtrahend; i++) { borrow = minuend->digits[i] - subtrahend->digits[i] - borrow; result.digits.push_back(borrow & DIGIT_MASK); borrow = borrow >> DIGIT_SHIFT; borrow &= 1; } for (; i < size_minuend; i++) { borrow = minuend->digits[i] - borrow; result.digits.push_back(borrow & DIGIT_MASK); borrow = borrow >> DIGIT_SHIFT; borrow &= 1; } if (borrow != 0) { std::cerr << "borrow not zero; something's not right." << std::endl; exit(1); } result.negative = negative; result.normalise(); return result; }
Multiplication
#include "bignum.hpp" // unisgned multiplication of BigNums. BigNum BigNum::multiply(const BigNum *large_factor, const BigNum *small_factor) { size_t size_large_factor = large_factor->get_size_abs(), size_small_factor = small_factor->get_size_abs(); BigNum result; for (unsigned int i = 0; i < (size_large_factor + size_small_factor); i++) result.digits.push_back(0); unsigned int i_result, i_small_factor; for (unsigned int i_large_factor = 0; i_large_factor < size_large_factor; i_large_factor++) { twodigits carry = 0; twodigits f = large_factor->digits[i_large_factor]; i_result = i_large_factor; i_small_factor = 0; for (i_small_factor = 0; i_small_factor < size_small_factor; i_small_factor++) { carry += result.digits[i_result] + small_factor->digits[i_small_factor] * f; result.digits[i_result++] = (digit)(carry & DIGIT_MASK); carry = carry >> DIGIT_SHIFT; } if (carry) result.digits[i_result] += (digit)(carry & DIGIT_MASK); } return result; }
Division and Remainder
#include "bignum.hpp" // This is a very, VERY, simple algorithm to calculate the quotient of a // unsigned integer dividend divided by an unisgned integer divisor. The // remainder is also output. void BigNum::divrem(const BigNum *dividend_ptr, const BigNum *divisor_ptr, BigNum *quotient, BigNum *remainder) { BigNum _remainder = (*dividend_ptr); BigNum _quotient(0); const BigNum one(1); const BigNum divisor = (*divisor_ptr); while (_remainder > BigNum(0)) { _remainder = _remainder - divisor; _quotient = _quotient + one; } if (quotient != NULL) (*quotient) = _quotient; if (remainder != NULL) (*remainder) = _remainder; }
Summary
In this post I looked at the big integer implementation in Python
, and I
created a rudimentary version in C++
. While fairly simple, I (re)learned a
lot about integer mathematics and the nuts and bolts involved in performing
these everyday operations.
I also took my first dive into a large open source codebase to find answers! This was pretty exciting for me. It's incredible that some random person can have a question about exactly how a thing is implemented in the software they use, and then go look it up no questions. This is the amazing thing about open source software!
There are some bits and pieces in this implementation I've labeled TODO
,
like parsing a string to BigNum. I will probably not come back to this, but if
I do I know where to start!
All of the C++
code here is on github.
Questions? Comments? Get in touch on Twitter!