Exponentiation by squaring

In mathematics and computer programming, exponentiating by squaring is a general method for fast computation of large positive integer powers of a number, or more generally of an element of a semigroup, like a polynomial or a square matrix. Some variants are commonly referred to as square-and-multiply algorithms or binary exponentiation. These can be of quite general use, for example in modular arithmetic or powering of matrices. For semigroups for which additive notation is commonly used, like elliptic curves used in cryptography, this method is also referred to as double-and-add.
Basic method

The method is based on the observation that, for a positive integer n, we have

 x^{n} = x (x^{2})^{(n−1)/2}   if n is odd,
 x^{n} = (x^{2})^{n/2}         if n is even.

This method uses the bits of the exponent to determine which powers are computed.
This example shows how to compute x^{13} using this method. The exponent, 13, is 1101 in binary. Here the bits are used in left-to-right order (most significant bit first); note that the recursive algorithm below instead consumes the bits from the least significant end. The exponent has 4 bits, so there are 4 iterations.
First, initialize the result to 1: r := 1.
 Step 1) r := r^{2} = 1; bit 1 = 1, so compute r := r · x = x;
 Step 2) r := r^{2} = x^{2}; bit 2 = 1, so compute r := r · x = x^{3};
 Step 3) r := r^{2} = x^{6}; bit 3 = 0, so we are done with this step (equivalently, this corresponds to r := r · x^{0});
 Step 4) r := r^{2} = x^{12}; bit 4 = 1, so compute r := r · x = x^{13}.
If we write the exponent n in binary as (b_{k} b_{k−1} ... b_{1})_{2} with b_{k} = 1, then this is equivalent to defining a sequence by letting r_{0} = 1 and then defining r_{i} = r_{i−1}^{2} · x^{b_{k−i+1}} for i = 1, ..., k, where r_{k} will equal x^{n}.
This may be implemented as the following recursive algorithm:
Function exp_by_squaring(x, n)
    if n < 0 then return exp_by_squaring(1 / x, −n);
    else if n = 0 then return 1;
    else if n = 1 then return x;
    else if n is even then return exp_by_squaring(x * x, n / 2);
    else if n is odd then return x * exp_by_squaring(x * x, (n − 1) / 2);
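As an illustration, the recursive pseudocode above translates almost line for line into Ruby. The function name follows the pseudocode; the use of Rational for the n < 0 case is our choice, so that integer bases yield exact inverses rather than floats.

```ruby
# Recursive exponentiation by squaring, following the pseudocode above.
# For n < 0 the base is inverted first; Rational keeps the result exact.
def exp_by_squaring(x, n)
  return exp_by_squaring(Rational(1, 1) / x, -n) if n < 0
  return 1 if n == 0
  return x if n == 1
  if n.even?
    exp_by_squaring(x * x, n / 2)
  else
    x * exp_by_squaring(x * x, (n - 1) / 2)
  end
end
```

For example, exp_by_squaring(3, 13) performs 5 multiplications in total (3 squarings and 2 multiplications by the base) instead of 12 naive ones.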
Although not tail-recursive, this algorithm may be rewritten into a tail-recursive algorithm by introducing an auxiliary function:
Function exp_by_squaring(x, n)
    return exp_by_squaring2(1, x, n)
Function exp_by_squaring2(y, x, n)
    if n < 0 then return exp_by_squaring2(y, 1 / x, −n);
    else if n = 0 then return y;
    else if n = 1 then return x * y;
    else if n is even then return exp_by_squaring2(y, x * x, n / 2);
    else if n is odd then return exp_by_squaring2(x * y, x * x, (n − 1) / 2).
The iterative version of the algorithm also uses a bounded auxiliary space, and is given by
Function exp_by_squaring_iterative(x, n)
    if n < 0 then
        x := 1 / x;
        n := −n;
    if n = 0 then return 1
    y := 1;
    while n > 1 do
        if n is even then
            x := x * x;
            n := n / 2;
        else
            y := x * y;
            x := x * x;
            n := (n − 1) / 2;
    return x * y
Computational complexity
A brief analysis shows that such an algorithm uses ⌊log_{2} n⌋ squarings and at most ⌊log_{2} n⌋ multiplications, where ⌊ ⌋ denotes the floor function. More precisely, the number of multiplications is one less than the number of ones present in the binary expansion of n. For n greater than about 4 this is computationally more efficient than naively multiplying the base with itself repeatedly.
Each squaring results in approximately double the number of digits of the previous, and so, if multiplication of two d-digit numbers is implemented in O(d^{k}) operations for some fixed k, then the complexity of computing x^{n} is given by

 ∑_{i=0}^{O(log n)} (2^{i} O(log x))^{k} = O((n log x)^{k}).
2^{k}-ary method
This algorithm calculates the value of x^{n} after expanding the exponent in base 2^{k}. It was first proposed by Brauer in 1939. In the algorithm below we make use of the following function f(0) = (k, 0) and f(m) = (s, u), where m = u·2^{s} with u odd.
Algorithm:
 Input
 An element x of G, a parameter k > 0, a non-negative integer n = (n_{l−1}, n_{l−2}, ..., n_{0})_{2^{k}} and the precomputed values x^{2}, x^{3}, ..., x^{2^{k}−1}.
 Output
 The element x^{n} in G
y := 1; i := l − 1
while i ≥ 0 do
    (s, u) := f(n_{i})
    for j := 1 to k − s do
        y := y^{2}
    y := y * x^{u}
    for j := 1 to s do
        y := y^{2}
    i := i − 1
return y
For optimal efficiency, k should be the smallest integer satisfying^{[1]}

 log_{2} n < k(k + 1) · 2^{2k} / (2^{k+1} − k − 2) + 1.
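As a sketch, the 2^{k}-ary method can be written in Ruby as follows. The function name and the default k = 3 are illustrative, and the f(m) = (s, u) refinement is omitted for simplicity, so every base-2^{k} digit costs k squarings plus one table lookup and multiplication.

```ruby
# A simplified 2^k-ary method for integers: expand n in base 2^k and
# handle each digit (most significant first) with k squarings followed
# by one multiplication with a precomputed power of x.
def pow_2k_ary(x, n, k = 3)
  base = 1 << k
  # Precompute x^0 .. x^(2^k - 1).
  table = [1]
  (1...base).each { table << table[-1] * x }
  # Digits of n in base 2^k, most significant digit first.
  digits = n.digits(base).reverse
  y = 1
  digits.each do |d|
    k.times { y *= y }
    y *= table[d]
  end
  y
end
```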
Sliding-window method
This method is an efficient variant of the 2^{k}-ary method. For example, to calculate the exponent 398, which has binary expansion (110 001 110)_{2}, we take a window of length 3 using the 2^{k}-ary method algorithm and calculate 1, x^{3}, x^{6}, x^{12}, x^{24}, x^{48}, x^{49}, x^{98}, x^{99}, x^{198}, x^{199}, x^{398}. But, we can also compute 1, x^{3}, x^{6}, x^{12}, x^{24}, x^{48}, x^{96}, x^{192}, x^{198}, x^{199}, x^{398}, which saves one multiplication and amounts to evaluating (110 001 110)_{2}.
Here is the general algorithm:
Algorithm:
 Input
 An element x of G, a non-negative integer n = (n_{l−1}, n_{l−2}, ..., n_{0})_{2}, a parameter k > 0 and the precomputed values x^{3}, x^{5}, ..., x^{2^{k}−1}.
 Output
 The element x^{n} ∈ G.
Algorithm:
y := 1; i := l − 1
while i > −1 do
    if n_{i} = 0 then
        y := y^{2}
        i := i − 1
    else
        s := max{i − k + 1, 0}
        while n_{s} = 0 do s := s + 1^{[notes 1]}
        for h := 1 to i − s + 1 do
            y := y^{2}
        u := (n_{i}, n_{i−1}, ..., n_{s})_{2}
        y := y * x^{u}
        i := s − 1
return y
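The sliding-window scan can be sketched in Ruby as follows. The function name and the default window size k = 3 are ours; only the odd powers x, x^{3}, ..., x^{2^{k}−1} are precomputed, and each window is chosen to end on a 1 bit so that its value u is odd.

```ruby
# Sliding-window exponentiation for positive integer exponents.
def pow_sliding_window(x, n, k = 3)
  # Precompute the odd powers x^1, x^3, ..., x^(2^k - 1).
  odd = { 1 => x }
  x2 = x * x
  (3..(1 << k) - 1).step(2) { |e| odd[e] = odd[e - 2] * x2 }
  bits = n.digits(2)            # bits[i] is n_i, least significant first
  y = 1
  i = bits.length - 1
  while i >= 0
    if bits[i] == 0
      y *= y
      i -= 1
    else
      s = [i - k + 1, 0].max
      s += 1 while bits[s] == 0 # slide so the window ends on a 1 bit
      (i - s + 1).times { y *= y }
      u = bits[s..i].reverse.inject(0) { |acc, b| 2 * acc + b }
      y *= odd[u]               # u is odd, so it is in the table
      i = s - 1
    end
  end
  y
end
```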
Montgomery's ladder technique
Many algorithms for exponentiation do not provide defence against side-channel attacks. Namely, an attacker observing the sequence of squarings and multiplications can (partially) recover the exponent involved in the computation. This is a problem if the exponent should remain secret, as with many public-key cryptosystems. A technique called "Montgomery's ladder"^{[2]} addresses this concern.
Given the binary expansion of a positive integer n = (n_{k−1}...n_{0})_{2} with n_{k−1} = 1, we can compute x^{n} as follows:
x_{1} := x; x_{2} := x^{2}
for i = k − 2 down to 0 do
    if n_{i} = 0 then
        x_{2} := x_{1} * x_{2}; x_{1} := x_{1}^{2}
    else
        x_{1} := x_{1} * x_{2}; x_{2} := x_{2}^{2}
return x_{1}
The algorithm performs a fixed sequence of operations (up to log n): a multiplication and squaring takes place for each bit in the exponent, regardless of the bit's specific value. A similar algorithm for multiplication by doubling exists.
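The ladder above can be sketched in Ruby as follows. The function name is ours, and this sketch only illustrates the fixed operation pattern; Ruby arbitrary-precision integers are not suitable for an actually side-channel-resistant implementation.

```ruby
# Montgomery ladder: one multiplication and one squaring per bit,
# regardless of the bit's value. Requires n > 0.
def montgomery_ladder(x, n)
  x1 = x
  x2 = x * x
  bits = n.digits(2).reverse   # most significant bit first
  bits.drop(1).each do |b|     # the leading 1 bit is the initialization
    if b == 0
      x2 = x1 * x2
      x1 = x1 * x1
    else
      x1 = x1 * x2
      x2 = x2 * x2
    end
  end
  x1
end
```

The invariant throughout is x2 = x1 · x, which is why the same pair of operations works for both bit values.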
This specific implementation of Montgomery's ladder is not yet protected against cache timing attacks: memory access latencies might still be observable to an attacker, as different variables are accessed depending on the value of bits of the secret exponent. Modern cryptographic implementations use a "scatter" technique to make sure the processor always misses the faster cache.^{[3]}
Fixed-base exponent
There are several methods which can be employed to calculate x^{n} when the base is fixed and the exponent varies. As one can see, precomputations play a key role in these algorithms.
Yao's method
Yao's method is orthogonal to the 2^{k}-ary method where the exponent is expanded in radix b = 2^{k} and the computation is as performed in the algorithm above. Let n, n_{i}, b, and b_{i} be integers.
Let the exponent n be written as

 n = ∑_{i=0}^{w−1} n_{i} b_{i},

where 0 ⩽ n_{i} < h for all i ∈ [0, w − 1].
Let x_{i} = x^{b_{i}}.
Then the algorithm uses the equality

 x^{n} = ∏_{i=0}^{w−1} x_{i}^{n_{i}} = ∏_{j=1}^{h−1} [ ∏_{n_{i}=j} x_{i} ]^{j}.
Given the element x of G, and the exponent n written in the above form, along with the precomputed values x^{b_{0}}, ..., x^{b_{w−1}}, the element x^{n} is calculated using the algorithm below:
y := 1; u := 1; j := h − 1
while j > 0 do
    for i = 0 to w − 1 do
        if n_{i} = j then u := u × x^{b_{i}}
    y := y × u
    j := j − 1
return y
If we set h = 2^{k} and b_{i} = h^{i}, then the n_{i} values are simply the digits of n in base h. Yao's method collects in u first those x_{i} that appear to the highest power h − 1; in the next round those with power h − 2 are collected in u as well, etc. The variable y is multiplied h − 1 times with the initial u, h − 2 times with the next highest powers, and so on. The algorithm uses w + h − 2 multiplications, and w + 1 elements must be stored to compute x^{n}.^{[1]}
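As a sketch, Yao's method with h = 2^{k} and b_{i} = h^{i} looks as follows in Ruby. The function name and the default h = 16 are ours; in a real fixed-base setting the table xs would be precomputed once for the base x and reused across exponents, rather than rebuilt per call as here.

```ruby
# Yao's method with b_i = h^i, so n_i are the base-h digits of n.
def yao_pow(x, n, h = 16)
  digits = n.digits(h)                           # digits[i] = n_i
  xs = digits.each_index.map { |i| x**(h**i) }   # x_i = x^(h^i), normally precomputed
  y = 1
  u = 1
  (h - 1).downto(1) do |j|
    # Collect into u every x_i whose digit equals the current power j.
    digits.each_index { |i| u *= xs[i] if digits[i] == j }
    y *= u
  end
  y
end
```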
Euclidean method
The Euclidean method was first introduced in Efficient Exponentiation Using Precomputation and Vector Addition Chains by P. de Rooij.
This method for computing x^{n} in group G, where n is a natural integer, whose algorithm is given below, uses the following equality recursively:

 x_{0}^{n_{0}} · x_{1}^{n_{1}} = (x_{0} · x_{1}^{q})^{n_{0}} · x_{1}^{n_{1} mod n_{0}},

where q = ⌊n_{1} / n_{0}⌋. In other words, a Euclidean division of the exponent n_{1} by n_{0} is used to return a quotient q and a remainder n_{1} mod n_{0}.
Given the base element x in group G, and the exponent n written as in Yao's method, the element x^{n} is calculated using the precomputed values x^{b_{0}}, ..., x^{b_{w−1}} and then the algorithm below.
Begin loop
    Find M such that n_{M} ≥ n_{i} for all i;
    Find N ≠ M such that n_{N} ≥ n_{i} for all i ≠ M;
    Break loop if n_{N} = 0;
    Let q := ⌊n_{M} / n_{N}⌋, and then let x_{N} := x_{M}^{q} · x_{N};
    Let n_{M} := n_{M} mod n_{N};
End loop
Return x_{M}^{n_{M}}
The algorithm first finds the largest value among the n_{i} and then the largest value within the set { n_{i} : i ≠ M }. Then it raises x_{M} to the power q, multiplies this value with x_{N}, and then assigns x_{N} the result of this computation and n_{M} the value n_{M} modulo n_{N}.
Further applications
The same idea allows fast computation of large exponents modulo a number. Especially in cryptography, it is useful to compute powers in a ring of integers modulo q. It can also be used to compute integer powers in a group, using the rule
 Power(x, −n) = (Power(x, n))^{−1}.
The method works in every semigroup and is often used to compute powers of matrices.
For example, the evaluation of
 13789^{722341} (mod 2345) = 2029
would take a very long time and lots of storage space if the naïve method were used: compute 13789^{722341}, then take the remainder when divided by 2345. Even using a more effective method will take a long time: square 13789, take the remainder when divided by 2345, multiply the result by 13789, and so on. This will take fewer than 2 log_{2}(722341) ≤ 40 modular multiplications.
Applying the above exp-by-squaring algorithm, with "*" interpreted as x * y = xy mod 2345 (that is, a multiplication followed by a division with remainder) leads to only 27 multiplications and divisions of integers, which may all be stored in a single machine word.
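The modular variant can be sketched in Ruby as follows (the function name is ours). Reducing after every multiplication keeps every intermediate value below 2345^{2}, so nothing ever grows to the size of 13789^{722341}.

```ruby
# Square-and-multiply with "*" replaced by multiplication modulo m.
# Intermediate values never exceed m^2.
def power_mod(x, n, m)
  result = 1
  x %= m
  while n > 0
    result = (result * x) % m if n.odd?
    x = (x * x) % m
    n /= 2
  end
  result
end
```

For the article's example, power_mod(13789, 722341, 2345) returns 2029.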
Example implementations
Computation by powers of 2
This is a non-recursive implementation of the above algorithm in Ruby.
The statement n -= 1 is redundant when n /= 2 implicitly rounds towards zero, as strongly typed languages with integer division would do. (n = n >> 1 has the same effect.) n[0] is the rightmost bit of the binary representation of n, so if it is 1, then the number is odd, and if it is zero, then the number is even. It is also n modulo 2. (n & 1 has the same effect.)
def power(x, n)
  result = 1
  while n.nonzero?
    if n[0].nonzero?
      result *= x
      n -= 1
    end
    x *= x
    n /= 2
  end
  return result
end
Runtime example: compute 3^{10}
parameter x = 3
parameter n = 10
result := 1

Iteration 1
  n = 10 → n is even
  x := x^{2} = 3^{2} = 9
  n := n / 2 = 5

Iteration 2
  n = 5 → n is odd
  → result := result * x = 1 * 3^{2} = 9
  n := n − 1 = 4
  x := x^{2} = 9^{2} = 3^{4} = 81
  n := n / 2 = 2

Iteration 3
  n = 2 → n is even
  x := x^{2} = 81^{2} = 3^{8} = 6561
  n := n / 2 = 1

Iteration 4
  n = 1 → n is odd
  → result := result * x = 3^{2} * 3^{8} = 3^{10} = 9 * 6561 = 59049
  n := n − 1 = 0

return result
Runtime example: compute 3^{10} with left-to-right bits
result := 3
bin := "1010"

Iteration for digit 2: result := result^{2} = 3^{2} = 9
  1010_{bin}, digit equals "0"

Iteration for digit 3: result := result^{2} = (3^{2})^{2} = 3^{4} = 81
  1010_{bin}, digit equals "1" → result := result * 3 = (3^{2})^{2} * 3 = 3^{5} = 243

Iteration for digit 4: result := result^{2} = ((3^{2})^{2} * 3)^{2} = 3^{10} = 59049
  1010_{bin}, digit equals "0"

return result
This example follows the algorithm above but scans the bits from left to right. Processing begins after the leading 1 bit, which initializes the result to the base: for each subsequent bit, square the result, and if the bit is 1, also multiply by the base.
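The left-to-right variant traced above can be written compactly in Ruby (the function name is ours). Starting the result at 1 and squaring through every bit, including the leading one, gives the same answer, since 1^{2} · x = x handles the first bit.

```ruby
# Left-to-right (most significant bit first) square-and-multiply:
# square for every bit, multiply by the base when the bit is 1.
def power_msb_first(x, n)
  result = 1
  n.digits(2).reverse.each do |bit|
    result *= result
    result *= x if bit == 1
  end
  result
end
```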
Calculation of products of powers
Exponentiation by squaring may also be used to calculate the product of 2 or more powers. If the underlying group or semigroup is commutative, then it is often possible to reduce the number of multiplications by computing the product simultaneously.
Example
The formula a^{7}×b^{5} may be calculated within 3 steps:
 ((a)^{2}×a)^{2}×a (4 multiplications for calculating a^{7}),
 ((b)^{2})^{2}×b (3 multiplications for calculating b^{5}),
 (a^{7})×(b^{5}) (1 multiplication to calculate the product of the two),
so one gets 8 multiplications in total.
A faster solution is to calculate both powers simultaneously:
 ((a×b)^{2}×a)^{2}×a×b,
which needs only 6 multiplications in total. Note that a×b is calculated twice; the result could be stored after the first calculation, which reduces the count of multiplications to 5.
Example with numbers:
 2^{7}×3^{5} = ((2×3)^{2}×2)^{2}×2×3 = (6^{2}×2)^{2}×6 = 72^{2}×6 = 31104.
Calculating the powers simultaneously instead of calculating them separately always reduces the count of multiplications if at least two of the exponents are greater than 1.
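The numeric example can be checked directly in Ruby. The snippet below evaluates ((a×b)^{2}×a)^{2}×(a×b) with the shared product a×b stored once, which is the 5-multiplication version described above.

```ruby
# 2^7 × 3^5 computed simultaneously, storing a×b after its first use.
a = 2
b = 3
ab = a * b           # multiplication 1: a×b = 6
t  = ab * ab * a     # multiplications 2 and 3: (a×b)^2 × a = 72
result = t * t * ab  # multiplications 4 and 5
# result == 2**7 * 3**5 == 31104
```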
Using transformation
The example above a^{7}×b^{5} may also be calculated with only 5 multiplications if the expression is transformed before calculation:

a^{7}×b^{5} = a^{2}×(ab)^{5}, with ab := a×b,
 ab := a×b (1 multiplication),
 a^{2}×(ab)^{5} = ((ab)^{2}×a)^{2}×ab (4 multiplications).
Generalization of transformation shows the following scheme:
For calculating a^{A}×b^{B}×...×m^{M}×n^{N}
 Define ab := a×b, abc = ab×c, ...
 Calculate the transformed expression a^{A−B}×ab^{B−C}×...×abc..m^{M−N}×abc..mn^{N}.
Transformation before calculation often reduces the count of multiplications, but in some cases it also increases the count (see the last one of the examples below), so it may be a good idea to check the count of multiplications before using the transformed expression for calculation.
Examples
For the following expressions the count of multiplications is shown for calculating each power separately, calculating them simultaneously without transformation, and calculating them simultaneously after transformation.
Example: a^{7}×b^{5}×c^{3}
 separate: [((a)^{2}×a)^{2}×a] × [((b)^{2})^{2}×b] × [(c)^{2}×c] (11 multiplications)
 simultaneous: ((a×b)^{2}×a×c)^{2}×a×b×c (8 multiplications)
 transformation: ab := a×b, abc := ab×c (2 multiplications)
 calculation after that: (a×ab×abc)^{2}×abc (4 multiplications, 6 in total)

Example: a^{5}×b^{5}×c^{3}
 separate: [((a)^{2})^{2}×a] × [((b)^{2})^{2}×b] × [(c)^{2}×c] (10 multiplications)
 simultaneous: ((a×b)^{2}×c)^{2}×a×b×c (7 multiplications)
 transformation: ab := a×b, abc := ab×c (2 multiplications)
 calculation after that: (ab×abc)^{2}×abc (3 multiplications, 5 in total)

Example: a^{7}×b^{4}×c^{1}
 separate: [((a)^{2}×a)^{2}×a] × [((b)^{2})^{2}] × [c] (8 multiplications)
 simultaneous: ((a×b)^{2}×a)^{2}×a×c (6 multiplications)
 transformation: ab := a×b, abc := ab×c (2 multiplications)
 calculation after that: (a×ab)^{2}×a×ab×abc (5 multiplications, 7 in total)
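The a^{5}×b^{5}×c^{3} column can be verified numerically in Ruby: the transformed evaluation (ab×abc)^{2}×abc reproduces the direct product with 5 multiplications in total (the variable names mirror the scheme above).

```ruby
# a^5 × b^5 × c^3 via the transformed expression a^0 × ab^2 × abc^3.
a = 2
b = 3
c = 5
ab  = a * b           # multiplication 1
abc = ab * c          # multiplication 2
t = ab * abc          # multiplication 3
result = t * t * abc  # multiplications 4 and 5
# result == 2**5 * 3**5 * 5**3 == 972000
```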
Signeddigit recoding
In certain computations it may be more efficient to allow negative coefficients and hence use the inverse of the base, provided inversion in G is "fast" or has been precomputed. For example, when computing x^{2^{k}−1}, the binary method requires k−1 multiplications and k−1 squarings. However, one could perform k squarings to get x^{2^{k}} and then multiply by x^{−1} to obtain x^{2^{k}−1}.
To this end we define the signed-digit representation of an integer n in radix b as

 n = ∑_{i=0}^{l−1} n_{i} b^{i}, with |n_{i}| < b.
Signed binary representation corresponds to the particular choice b = 2 and n_{i} ∈ {−1, 0, 1}. It is denoted by (n_{l−1} ... n_{0})_{s}. There are several methods for computing this representation. The representation is not unique. For example, take n = 478: two distinct signed-binary representations are given by (10 1̄ 1100 1̄ 10)_{s} and (100 1̄ 1000 1̄ 0)_{s}, where 1̄ is used to denote −1. Since the binary method computes a multiplication for every nonzero entry in the base-2 representation of n, we are interested in finding the signed-binary representation with the smallest number of nonzero entries, that is, the one with minimal Hamming weight. One method of doing this is to compute the representation in non-adjacent form, or NAF for short, which is one that satisfies n_{i} n_{i+1} = 0 for all i ⩾ 0 and is denoted by (n_{l−1} ... n_{0})_{NAF}. For example, the NAF representation of 478 is (1000 1̄ 000 1̄ 0)_{NAF}. This representation always has minimal Hamming weight. A simple algorithm to compute the NAF representation of a given integer n = (n_{l} n_{l−1} ... n_{0})_{2} with n_{l} = n_{l−1} = 0 is the following:
c_{0} := 0
for i = 0 to l − 1 do
    c_{i+1} := ⌊(c_{i} + n_{i} + n_{i+1}) / 2⌋
    n′_{i} := c_{i} + n_{i} − 2c_{i+1}
return (n′_{l−1} ... n′_{0})_{NAF}
Another algorithm by Koyama and Tsuruoka does not require the condition that n_{l} = n_{l−1} = 0; it still minimizes the Hamming weight.
Alternatives and generalizations
Exponentiation by squaring can be viewed as a suboptimal additionchain exponentiation algorithm: it computes the exponent by an addition chain consisting of repeated exponent doublings (squarings) and/or incrementing exponents by one (multiplying by x) only. More generally, if one allows any previously computed exponents to be summed (by multiplying those powers of x), one can sometimes perform the exponentiation using fewer multiplications (but typically using more memory). The smallest power where this occurs is for n = 15:
 x^{15} = x × (x × [x × x^{2}]^{2})^{2} (squaring, 6 multiplies),
 x^{15} = x^{3} × ([x^{3}]^{2})^{2} (optimal addition chain, 5 multiplies if x^{3} is reused).
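Both chains for x^{15} can be checked with a small multiplication counter in Ruby (all names here are illustrative):

```ruby
# Count multiplications for the two x^15 evaluation chains.
x = 2
count = 0
mul = ->(u, v) { count += 1; u * v }

# Binary method: x × (x × (x × x^2)^2)^2.
t = mul.(x, x)            # x^2
t = mul.(x, t)            # x^3
t = mul.(t, t)            # x^6
t = mul.(x, t)            # x^7
t = mul.(t, t)            # x^14
r1 = mul.(x, t)           # x^15
binary_count = count      # 6

# Addition chain reusing x^3: x^3 × ((x^3)^2)^2.
count = 0
t  = mul.(x, x)           # x^2
x3 = mul.(x, t)           # x^3
t  = mul.(x3, x3)         # x^6
t  = mul.(t, t)           # x^12
r2 = mul.(x3, t)          # x^15
chain_count = count       # 5
# r1 == r2 == 2**15 == 32768
```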
In general, finding the optimal addition chain for a given exponent is a hard problem, for which no efficient algorithms are known, so optimal chains are typically only used for small exponents (e.g. in compilers where the chains for small powers have been pretabulated). However, there are a number of heuristic algorithms that, while not being optimal, have fewer multiplications than exponentiation by squaring at the cost of additional bookkeeping work and memory usage. Regardless, the number of multiplications never grows more slowly than Θ(log n), so these algorithms only improve asymptotically upon exponentiation by squaring by a constant factor at best.
See also
 Modular exponentiation
 Vectorial addition chain
 Montgomery reduction
 Nonadjacent form
 Addition chain
Notes
 ^ In this line, the loop finds the longest string of length less than or equal to k ending in a nonzero value. Not all odd powers x, x^{3}, ..., x^{2^{k}−1} need be computed, and only those specifically involved in the computation need be considered.
References
 ^ ^{a} ^{b} Cohen, H.; Frey, G., eds. (2006). Handbook of Elliptic and Hyperelliptic Curve Cryptography. Discrete Mathematics and Its Applications. Chapman & Hall/CRC. ISBN 9781584885184.
 ^ Montgomery, Peter L. (1987). "Speeding the Pollard and Elliptic Curve Methods of Factorization" (PDF). Math. Comput. 48 (177): 243–264.
 ^ Gueron, Shay (5 April 2012). "Efficient software implementations of modular exponentiation" (PDF). Journal of Cryptographic Engineering. 2 (1): 31–43. doi:10.1007/s13389-012-0031-5.