ISOMORPHIC TRANSFORMATION AND ITS APPLICATION TO THE MODULO ($2^n + 1$) CHANNEL FOR RNS BASED FIR FILTER DESIGN

In this paper, the implementation of a Finite Impulse Response (FIR) filter in the Residue Number System (RNS) is presented, in which a modulo multiplier based on the isomorphism technique is used to perform multiplication in the ($2^n + 1$) channel. RNS modular multiplication in the Galois field GF($2^n + 1$) is presented in detail. The multiplication is based on the isomorphic mapping technique adapted to residue arithmetic. The isomorphic encoder and decoder look-up tables for GF($2^8 + 1$) are given. An architecture for FIR filter design based on distributed arithmetic for multiplication and accumulation in the ($2^n + 1$) channel is also presented. This architecture is discussed in detail and compared with the architecture based on the isomorphic technique.


INTRODUCTION
Modulo ($2^n + 1$) multipliers of various types have been considered in the literature: (a) both inputs in standard representation, (b) one input in standard form and the other in diminished-1 form, and (c) both inputs in diminished-1 representation.
This paper develops an enhanced algorithm for the modular ($2^n + 1$) multiplication problem in the Residue Number System. The proposed algorithm is based on Galois finite field theory (Pradhan, 1978). A Galois field GF(m) is a number system with a finite number of elements, m, and two main arithmetic operations, called addition and multiplication. Other operations, such as division, can be derived from these two (Chen et al., 2007). Some of the formal properties of a finite field are the following. It consists of a finite set of m elements, GF(m), and two operations, modular addition (+) and modular multiplication ( * ). The result of adding or multiplying two elements of the finite field is always an element of the field.
Mapping the arithmetic multiplication problem over the Galois field GF(m) eliminates many of the limitations of existing algorithms for modular ($2^n + 1$) multiplication. An advantage of the proposed algorithm is that it places no restriction on the multiplier and the multiplicand, and requires no diminished-1 multiplication and no base extension operation.

A prime Galois field as a multiplier
A prime Galois field GF(m) is a finite field of order m (m is the number of elements), where m is a prime positive integer (Kitsos et al., 2003; Chen et al., 2014). It has two operations, modular addition (denoted by +) and modular multiplication (denoted by * ). Both operations are commutative and associative, and they satisfy the usual arithmetic properties: (a) (GF(m), +) is an Abelian group with an additive neutral element denoted by 0, such that a + 0 = a for any element a ∈ GF(m). (b) (GF(m) \ {0}, * ), that is, the field without the zero element, is an Abelian group with a multiplicative neutral element denoted by 1, such that a * 1 = a for any element a ∈ GF(m). (c) For every element a ∈ GF(m), there is an additive inverse element −a, such that a + (−a) = 0. (d) For every nonzero element b ∈ GF(m) there is a multiplicative inverse element $b^{-1}$ such that $b * b^{-1} = 1$. (e) Multiplication distributes over addition: (a + b)c = ac + bc and c(a + b) = ca + cb for all a, b, c ∈ GF(m).
These properties can be satisfied if the field size is any prime number or any integer power of a prime.
The paper is organized as follows: Section 2 gives a brief overview of the index mapping method; Section 3 gives a short explanation of the design of the ($2^4 + 1$) channel for RNS based FIR filter design using the index mapping method given in Section 2; Section 4 deals with distributed arithmetic and its comparison with the isomorphic transformation; Section 5 concludes the work. Isomorphic encoding and decoding tables for the modulus $m = 2^8 + 1$ are given in the Appendix.

THE INDEX MAPPING OVERVIEW
In the residue number system (RNS), a method analogous to the use of logarithms for multiplication, called index calculus, can be used (Padmavathy & Bhagvati, 2012). Using index mapping over the Galois field GF(m), the multiplication operation can be implemented by addition. Since multiplication in RNS is a modular operation, it can be carried out as a modular addition of indices, which is easier to implement than modular multiplication (Qi et al., 2012).
The groups $(G_1, *)$ and $(G_2, \circ)$ are said to be isomorphic if there is a one-to-one correspondence (bijection) $f : G_1 \to G_2$ that preserves the group operation, in other words,
$$f(a * b) = f(a) \circ f(b) \quad \text{for all } a, b \in G_1.$$
The input and output index mapping of RNS numbers is based on the following definitions.
Definition 1. Euler's function ϕ(n), or the totient function, of a positive integer n is the number of integers in the range 1, 2, . . . , n − 1 that are relatively prime (co-prime) to n. If m is prime then ϕ(m) = m − 1.
For example, ϕ(5) = 4: the numbers 1, 2, 3, 4 are relatively prime to 5, but 5 is not. If $n = p_1^{k_1} p_2^{k_2} \cdots p_m^{k_m}$, where $p_1, p_2, \ldots, p_m$ are the distinct prime divisors of n and $k_i \ge 1$, then
$$\phi(n) = n \prod_{i=1}^{m} \left(1 - \frac{1}{p_i}\right).$$
To calculate Euler's function, the Matlab code given in Listing 1 can be used.
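Listing 1. Euler's ϕ(n) function
N = 48;               % for example
n = 1:N-1;            % candidates 1, 2, ..., N-1
ind = gcd(n,N) == 1;  % logical mask of the integers co-prime to N
tot = n(ind)          % the integers relatively prime to N
phi = numel(tot)      % value of Euler's function, here 16

Definition 2. Let $a \neq 0$ and $m > 0$ be relatively prime. Then x is called the order of a modulo m, denoted $x = \mathrm{ord}_m(a)$, if x is the smallest natural number such that $\langle a^x \rangle_m = 1$.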
Definition 3. Let m ∈ N and g ∈ Z be such that gcd(g, m) = 1. Then g is called a primitive root modulo m if $\mathrm{ord}_m(g) = \phi(m)$, i.e. if the order of g is equal to the maximal possible value. In other words, an integer g is a primitive root modulo m if the powers of g generate all residue classes coprime to m.
For example, let m = 17 be the second order Fermat number; then the number g = 3 is a primitive root of the prime m.
Listing 2. The number g = 3 is a primitive root modulo 17
m = 17;              % for example
k = 1:m-1;           % exponents 1, 2, ..., m-1
g = 3;
ind = mod(g.^k,m);   % powers of g reduced modulo m
ind = sort(ind)
ind = 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16

Definition 4. Let m be any prime number, and let g be any primitive root of m. Then to each integer a relatively prime to m there is a unique integer (index) i, $0 \le i \le m - 2$, denoted $i = \mathrm{ind}_g a$, such that
$$a = \langle g^{i} \rangle_m.$$
Indices over the Galois field GF(m) have the following important properties:
1. $\mathrm{ind}_g 1 = 0$,
2. $\mathrm{ind}_g (a \times b) = \langle \mathrm{ind}_g a + \mathrm{ind}_g b \rangle_{m-1}$,
3. $\mathrm{ind}_g a^n = \langle n \times \mathrm{ind}_g a \rangle_{m-1}$,
4. $\mathrm{ind}_g a = \langle \mathrm{ind}_g g' \times \mathrm{ind}_{g'} a \rangle_{m-1}$, where $g'$ is any other primitive root.
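The index properties of Definition 4 can be checked numerically. The following is a minimal Python sketch for m = 17 and g = 3. The helper names `index_table` and `check_index_properties` and the second primitive root 5 are chosen here for illustration only, and property 4 is tested in its change-of-base form, where the index with respect to g equals the index of the other root times the index with respect to that root, reduced modulo m − 1.

```python
def index_table(m, g):
    """Map each a in 1..m-1 to its index i with g**i = a (mod m).

    Assumes m is prime and g is a primitive root modulo m.
    """
    table = {}
    value = 1
    for i in range(m - 1):
        table[value] = i
        value = (value * g) % m
    return table


def check_index_properties(m, g, g_other):
    """Return True if the four index properties hold for all a, b in 1..m-1."""
    ind = index_table(m, g)
    ind_other = index_table(m, g_other)
    if ind[1] != 0:                                    # property 1
        return False
    for a in range(1, m):
        for b in range(1, m):                          # property 2
            if ind[(a * b) % m] != (ind[a] + ind[b]) % (m - 1):
                return False
        for n in range(1, 5):                          # property 3
            if ind[pow(a, n, m)] != (n * ind[a]) % (m - 1):
                return False
        # property 4: change of primitive root from g_other to g
        if ind[a] != (ind[g_other] * ind_other[a]) % (m - 1):
            return False
    return True


print(check_index_properties(17, 3, 5))
```

The check is exhaustive over all nonzero residues, which is feasible because the modulus is small.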
Definition 5. For any integer a and any positive integer m, $r = \langle a \rangle_m$ shall denote the unique integer remainder r, $0 \le r \le m - 1$, obtained upon dividing a by m; this operation is called reduction modulo m.
A special technique, based on isomorphic transformations (Jullien, 1980), can be used in RNS to transform modular multiplication into a simpler modular addition. It is based on the concept of indices, which are similar to logarithms, and primitive roots g, which are similar to logarithm bases. It can be shown that if the number m is prime there exist a number of primitive roots (their number can be computed by using Euler's function) that share the following property: every element of the field GF(m) = {0, 1, . . . , m − 1}, excluding the zero element, can be generated by using the equation
$$x = \langle g^{k} \rangle_m, \tag{1}$$
where k (the index) is an integer and g is a primitive root. In this way, an isomorphism exists between the multiplicative group {x} = {1, 2, . . . , m − 1} with multiplication modulo m and the additive group $\{k_x\}$ = {0, 1, . . . , m − 2} with addition modulo m − 1. Multiplication of two integers can now be performed by adding the corresponding indices modulo (m − 1) and then finding the inverse index value. Thus, the product of x and y is given by
$$\langle x \cdot y \rangle_m = \left\langle g^{\langle k_x + k_y \rangle_{m-1}} \right\rangle_m. \tag{2}$$
This approach is known as index calculus. By using isomorphisms, the product of two residue numbers is mapped into the sum of their indices, which are obtained by an isomorphic mapping. The scheme for an index calculus multiplier is shown in Figure 1. This multiplication needs three ROM look-up tables (two index ROMs and one inverse index ROM) and an addition modulo (m − 1). The modulo (m − 1) adder has two n-bit inputs and one n-bit output, where $n = \lceil \log_2 (m - 2) \rceil$.
The proposed applications can be computed using only the index ROM and inverse index ROM look-up tables and an addition modulo m − 1.
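The index calculus multiplier described above can be sketched in a few lines of Python. This is an illustrative model only: the names `build_roms` and `index_multiply` are hypothetical, the ROMs are modelled as Python lists, and zero operands are rejected because zero has no index.

```python
def build_roms(m, g):
    """Build the index ROM and inverse index ROM for prime m and primitive root g."""
    index_rom = [None] * m          # index_rom[x] = k_x for x = 1..m-1; entry 0 unused
    inverse_rom = [0] * (m - 1)     # inverse_rom[k] = <g^k>_m for k = 0..m-2
    value = 1
    for k in range(m - 1):
        index_rom[value] = k
        inverse_rom[k] = value
        value = (value * g) % m
    return index_rom, inverse_rom


def index_multiply(x, y, index_rom, inverse_rom):
    """Return <x*y>_m using two index look-ups, one modulo (m-1) add, one inverse look-up."""
    m = len(index_rom)
    if x % m == 0 or y % m == 0:
        raise ValueError("zero has no index; it must be handled separately")
    k = (index_rom[x % m] + index_rom[y % m]) % (m - 1)
    return inverse_rom[k]


index_rom, inverse_rom = build_roms(17, 3)
print(index_multiply(15, 16, index_rom, inverse_rom))
```

The same two functions work for any prime modulus once a primitive root is supplied, for example m = 257 with g = 3.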

APPLICATION OF THE ISOMORPHIC TRANSFORMATION TO THE MODULO ($2^4 + 1$) CHANNEL FIR FILTER
A modulo m channel with N taps of an RNS based FIR filter is described by the ordinary expression
$$y_n = \left\langle \sum_{k=0}^{N-1} A_k x_{n-k} \right\rangle_m, \tag{3}$$
where $x_n$ is the input to the filter, $A_k$ are the filter coefficients and $y_n$ is the output of the filter. This can be implemented using a single multiply-accumulate (MAC) engine, but it would require N MAC cycles before the next input sample can be processed. Since the channel operates on residues, modular arithmetic must be applied. Because of the complexity of modular multiplication, we used the isomorphism technique to implement the product of residues: two direct isomorphic transformations, $A_k \to k_{A_k}$ and $x_{n-k} \to k_{x_{n-k}}$, and one inverse isomorphic transformation of the sum $\langle k_{A_k} + k_{x_{n-k}} \rangle_{16}$, which returns the product $\langle A_k x_{n-k} \rangle_{17}$, are performed.
The prime number $m = 2^4 + 1 = 17$ is the second order Fermat number and it has ϕ(16) = 8 primitive roots. The complete list of primitive roots for GF(17) is {3, 5, 6, 7, 10, 11, 12, 14}. Table 1 gives the nonzero elements of the prime field GF(17), generated by using the mapping equation (1) for the primitive root g = 3.
Table 1. Isomorphism table of GF(17) for g = 3
GF(17)   1   3   9  10  13   5  15  11  16  14   8   7   4  12   2   6
k        0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
The modular product of two integer elements x and y belonging to the Galois field with m elements is implemented in the following way:
1. Forward mapping of x ∈ GF(m) and y ∈ GF(m) to the corresponding indices $k_x$ and $k_y$.
2. Addition modulo (m − 1) of the two indices.
3. Reverse mapping of the result of the addition to obtain the final result of the modular product.
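The primitive roots of GF(17) and the entries of Table 1 can be reproduced by brute force. The Python sketch below is illustrative only; the function names `multiplicative_order` and `primitive_roots` are chosen here and do not come from the paper. An element is accepted as a primitive root when its order equals m − 1, which is ϕ(m) for a prime m.

```python
def multiplicative_order(a, m):
    """Smallest x >= 1 with a**x = 1 (mod m); assumes gcd(a, m) = 1."""
    x, value = 1, a % m
    while value != 1:
        value = (value * a) % m
        x += 1
    return x


def primitive_roots(m):
    """All primitive roots of a prime m, found by testing every candidate."""
    return [g for g in range(2, m) if multiplicative_order(g, m) == m - 1]


roots = primitive_roots(17)
table1 = [pow(3, k, 17) for k in range(16)]   # GF(17) row of Table 1 for k = 0..15

print(roots)
print(table1)
```

Any of the roots returned can replace g = 3; only the contents of the look-up tables change, not the structure of the multiplier.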
The block diagram of a typical isomorphic implementation of the three-tap modulo 17 channel of the RNS FIR filter is shown in Fig. 2. The product $\langle A_k x_{n-k} \rangle_{17}$ is transferred into Register R. The modulo ($2^4 + 1$) adder in the next stage adds the present product to the previous sum fed back from Register ACC, which is initialized to zero, thus accumulating the sum of the products $A_k x_{n-k}$ over the N taps. The final sum is left in Register ACC.
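The multiply-accumulate loop of this channel can be modelled in software as follows. This is a behavioural Python sketch under stated assumptions, not a description of the hardware in Fig. 2: the function names are hypothetical, samples before the start of the input are taken as zero, and zero operands simply bypass the look-up tables.

```python
M = 17   # channel modulus 2^4 + 1
G = 3    # primitive root used for the isomorphic mapping

INDEX_ROM = {pow(G, k, M): k for k in range(M - 1)}     # x -> k_x
INVERSE_ROM = [pow(G, k, M) for k in range(M - 1)]      # k -> <G^k>_M


def product_mod(a, x):
    """<a*x>_M via index addition; zero operands bypass the ROMs (assumption)."""
    if a == 0 or x == 0:
        return 0
    return INVERSE_ROM[(INDEX_ROM[a] + INDEX_ROM[x]) % (M - 1)]


def fir_channel(coeffs, samples):
    """Modulo-M FIR output y_n = < sum_k A_k * x_(n-k) >_M for residues in 0..M-1."""
    outputs = []
    for n in range(len(samples)):
        acc = 0                                   # Register ACC, cleared per output
        for k, a in enumerate(coeffs):
            if n - k >= 0:                        # earlier samples are assumed zero
                r = product_mod(a, samples[n - k])    # Register R
                acc = (acc + r) % M               # modulo-M accumulation
        outputs.append(acc)
    return outputs


print(fir_channel([3, 5, 7], [1, 16, 0, 9, 12, 4]))
```

Each output sample costs N table-based products and N modular additions, matching the N MAC cycles mentioned above.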
The architecture shown in Fig. 2 is also suitable for the modulo $2^8 + 1$ channel of the RNS based FIR filter design. Isomorphic encoding and decoding tables for the modulus $m = 2^8 + 1$ are given in the Appendix.
For example, for the modular multiplication in GF(17) of the integers x = 15 and y = 16, the corresponding indices are $k_x = 6$ and $k_y = 8$ respectively, and these can be found in Table 1. The sum of the indices is $\langle 6 + 8 \rangle_{16} = 14$, and the reverse mapping of index 14 gives the product $\langle 15 \times 16 \rangle_{17} = 2$.
Although multiplication by zero cannot, in theory, be performed using the isomorphic technique, note that, since look-up tables are used, the problem can be solved by adding an additional code for zero to every ROM.
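One possible way to add a zero code to the ROMs is sketched below in Python. The encoding is an assumption made for illustration: the value m − 1 = 16, which lies outside the index range 0 to 15, is reserved as the code for zero, and the index adder forwards this code whenever either operand carries it. The paper does not specify the encoding actually used.

```python
M, G = 17, 3
ZERO_CODE = M - 1    # hypothetical extra code, outside the index range 0..M-2

index_rom = [ZERO_CODE] * M     # entry for x = 0 keeps ZERO_CODE
inverse_rom = [0] * M           # entry ZERO_CODE decodes to 0
for k in range(M - 1):
    x = pow(G, k, M)
    index_rom[x] = k
    inverse_rom[k] = x


def index_add(kx, ky):
    """Modulo (M-1) index adder that propagates the zero code."""
    if kx == ZERO_CODE or ky == ZERO_CODE:
        return ZERO_CODE
    return (kx + ky) % (M - 1)


def multiply(x, y):
    """<x*y>_M for any residues x, y in 0..M-1, including zero."""
    return inverse_rom[index_add(index_rom[x], index_rom[y])]


print(multiply(15, 16), multiply(0, 9))
```

With this choice the index words need one extra bit for m = 17, since the codes 0 to 16 no longer fit in four bits.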

DISTRIBUTED ARITHMETIC ARCHITECTURE
Distributed arithmetic (DA) (NagaJyothi & SriDevi, 2017) is a well-known method for calculating a sum of products, that is, for performing Multiplication and Accumulation (MAC). It is a very common method in many Digital Signal Processing (DSP) algorithms. It should be noted that the DA method is applicable only to cases where the MAC operation involves fixed coefficients.
Let the variable y hold the result of an inner product operation between an integer data vector $x_n$ and an integer coefficient vector $A_n$, n = 0, 1, 2, . . . , N − 1. The distributed arithmetic representation of the inner product operation is as follows:
$$y = \sum_{n=0}^{N-1} A_n x_n, \tag{4}$$
where $A_n$ are constant coefficient values (e.g. coefficients of an FIR filter) and $x_n = [b_{n,0}, b_{n,1}, \ldots, b_{n,K-1}]$ are the corresponding N input data words, each binary encoded with a bit length of K.
Using the standard multiply and accumulate approach, the calculation of this inner product takes N multiply and accumulate execution cycles, corresponding to the number of coefficients used in (4). Now consider expressing each input in the data vector, $x_n$, in unsigned binary form as
$$x_n = \sum_{k=0}^{K-1} b_{n,k} 2^{k}, \tag{5}$$
where K is the binary word length. The inner product y in (4) can then be written in a form that associates it directly with the bit values of the inputs in the data vector:
$$y = \sum_{k=0}^{K-1} 2^{k} f(A_n, b_{n,k}), \qquad f(A_n, b_{n,k}) = \sum_{n=0}^{N-1} A_n b_{n,k}. \tag{6}$$
The function f in (6) is the sum of the products of the coefficients $A_n$ with the individual bit values $b_{n,k}$ of the data words $x_n$. Since each bit $b_{n,k}$ is either 0 or 1, while the value of each $A_n$ is constant, there are $2^N$ possible values of $f(A_n, b_{n,k})$.
Applying RNS arithmetic with a moduli set, for example the RNS basis $B = \{m_1, m_2, \ldots, m_L\}$ in which one of the moduli is $m_i = 2^p + 1$, the inner product in (6) can be rewritten in terms of its residue modulo $m_i$, i.e.
$$\langle y \rangle_{m_i} = \left\langle \sum_{k=0}^{K-1} 2^{k} f(A_n, b_{n,k}) \right\rangle_{m_i}.$$
By applying the algebra of RNS we get the following:
$$\langle y \rangle_{m_i} = \left\langle \sum_{k=0}^{K-1} \langle 2^{k} \rangle_{m_i} \, \langle f(A_n, b_{n,k}) \rangle_{m_i} \right\rangle_{m_i}.$$
Hence the values $f_{m_i}(A_n, b_{n,k}) = \langle f(A_n, b_{n,k}) \rangle_{m_i}$ can be precomputed and stored in a look-up table (LUT), from which they are subsequently clocked out by the bit-serial stream of the input vector for the accumulation operation. However, each value first needs to be scaled by the factor $\langle 2^{k} \rangle_{m_i}$, which is difficult to implement in hardware because of the reduction modulo $m_i$.
A polynomial of degree N can be evaluated with only N multiplications and N additions when it is written in nested form. This is optimal, since there are polynomials of degree N that cannot be evaluated with fewer arithmetic operations.
Using Horner's method for evaluating a polynomial, we can rewrite the residue of the inner product as
$$\langle y \rangle_{m_i} = \left\langle f_{m_i}(A_n, b_{n,0}) + 2\left( f_{m_i}(A_n, b_{n,1}) + 2\left( \cdots + 2 f_{m_i}(A_n, b_{n,K-1}) \right) \cdots \right) \right\rangle_{m_i}. \tag{11}$$
The basic distributed arithmetic architecture of a three-tap (N = 3) FIR filter is shown in Fig. 3. The bank of shift registers in Fig. 3 stores four consecutive input samples. The concatenation of the rightmost bits of the shift registers becomes the address of the LUT. The shift registers are shifted right at every clock cycle. The corresponding LUT entries are also shifted and accumulated K consecutive times, where K is the precision of the input data.
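Example. Let N = 3 and K = 5 for $m_i = 17$. Equation (11) is reduced to
$$\langle y \rangle_{17} = \left\langle f_0 + 2\left( f_1 + 2\left( f_2 + 2\left( f_3 + 2 f_4 \right) \right) \right) \right\rangle_{17},$$
where $[b_{2,4}\, b_{1,4}\, b_{0,4}]$ forms a memory address which is loaded into the memory address register, and
$$f_k = \langle A_0 b_{0,k} + A_1 b_{1,k} + A_2 b_{2,k} \rangle_{17}$$
for k = 0, 1, 2, 3, 4. For k = 4 we have
$$f_4 = \langle A_0 b_{0,4} + A_1 b_{1,4} + A_2 b_{2,4} \rangle_{17}.$$
The DA implementation of the FIR filter consists of the LUT, shift registers and a scaling accumulator. The block diagram of a typical distributed arithmetic implementation of the three-tap RNS FIR filter for the modulo 17 channel is shown in Fig. 3. The data can be stored in a look-up table of $2^N$ words addressed by N bits. The multiplication by 2 can be implemented with a one-bit shift to the left.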
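The example can be traced with a short behavioural model. The following Python sketch is illustrative only and makes several assumptions: the function names are hypothetical, bit n of the LUT address holds the current bit of input word n (so the word with the highest tap number supplies the most significant address bit), and the bits are processed from the most significant to the least significant so that the accumulator follows the nested Horner form.

```python
M = 17   # channel modulus


def build_lut(coeffs):
    """LUT[address] = < sum of A_n over the taps whose address bit n is 1 >_M."""
    taps = len(coeffs)
    return [
        sum(a for n, a in enumerate(coeffs) if (address >> n) & 1) % M
        for address in range(1 << taps)
    ]


def da_inner_product(coeffs, words, bits):
    """< sum_n A_n * x_n >_M computed bit-serially from a 2^N-word LUT."""
    lut = build_lut(coeffs)
    acc = 0
    for k in range(bits - 1, -1, -1):             # most significant bit first
        address = 0
        for n, x in enumerate(words):
            address |= ((x >> k) & 1) << n        # bit k of word n -> address bit n
        acc = (2 * acc + lut[address]) % M        # shift left by one bit, add, reduce
    return acc


print(da_inner_product([3, 5, 7], [15, 16, 9], 5))
```

For three taps the LUT holds eight words, and five accumulation steps are needed for the 5-bit inputs, one per bit position.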
The look-up table size increases exponentially with the number of filter coefficients. A filter with a small number of coefficients can be realized very easily with a small LUT. A larger number of coefficients takes up a lot of storage space in the LUT and also reduces the calculation speed.

CONCLUSIONS
The proposed algorithm over the Galois field GF(m) provides an efficient solution to the modulo $2^8 + 1$ multiplication problem. Efficient procedures were proposed to convert the multiplication problem into an addition problem. The proposed algorithm and mapping procedure can be implemented using look-up tables, which means that multiplication in RNS can be computed very fast. The results of this research can be used to design a general purpose multimoduli ALU.
A modulo multiplier based on the isomorphism technique is compared with one realized using distributed arithmetic. The isomorphic technique has the following advantages: it does not require shift registers, and its memory size does not depend on the FIR filter degree.