# Design of a low power and high performance digital multiplier using a novel 8T adder

# N.Prathima\*, K.HariKishore \*\*

\*(Department of Electronics & Communication Engg, KL University, Guntur, India) \*\*(Department of Electronics & Communication Engg, KL University, Guntur, India)

## ABSTRACT

Low power VLSI circuits have become an important criterion in designing energy-efficient electronics for high-performance and portable devices. Multipliers are a key structure in an energy-efficient processor, since the multiplier design largely determines the efficiency of a digital signal processor. The multiplier is one of the most commonly used circuits in digital devices, and multiplication is one of the basic functions used in digital signal processing. The full adder is the main source of power dissipation in a multiplier, so reducing the power dissipation of the full adder ultimately reduces the power dissipation of the multiplier.

In this paper an XNOR gate using three transistors is presented. A single-bit full adder using eight transistors has been designed using the XNOR cell; it shows a power dissipation of $620.5\,\mu$W. A 4x4 Wallace tree multiplier has been implemented using the proposed 8T adder. Simulations have been carried out using the Cadence tool, based on gpdk180nm CMOS technology.

*Keywords*-CMOS, exclusive-OR (XOR), exclusive-NOR (XNOR), full adder, low power, multiplier.

## I. INTRODUCTION

The prolific growth of the semiconductor device industry has led to the development of high-performance portable systems with enhanced reliability in data transmission. In order to maintain the portability of high-performance fidelity applications, emphasis will be on the incorporation of low-power modules in future system design. The design of such modules will have to rely partially on reduced power consumption and/or dissipation in fundamental arithmetic computation units such as adders and multipliers. This underscores the need to design low-power multipliers for the development of power-efficient, high-performance systems.

The increasing demand for high-fidelity portable devices has placed emphasis on the development of low-power, high-performance systems. In next-generation processors, low-power design has to be incorporated into fundamental computation units such as multipliers. The characterization and optimization of such low-power multipliers will aid in the comparison and choice of multiplier modules in system design. With the exponential growth of portable electronic devices such as laptops, multimedia and cellular devices, research efforts in the field of low-power VLSI (very large-scale integration) systems have increased manyfold. With the rise in chip density, the power consumption of VLSI systems is also increasing, which adds to reliability and packaging problems. The packaging and cooling cost of VLSI systems also goes up with high power dissipation. Nowadays, low power consumption, along with minimum delay and area, is one of the most important design considerations for IC designers.

The selection of the most efficient architecture to implement multiplication has continually challenged DSP system designers. The options currently available offer a wide range of tradeoffs in terms of speed, complexity and power consumption. Input sequences to the multiplier can be fed in a parallel, serial or hybrid (parallel-serial) approach. To achieve higher processing speeds, parallel multipliers are usually adopted at the expense of high area complexity. Multiple parallel multiplication algorithms (architectures) [2][6] have been proposed to reduce the chip area and increase the speed of the multipliers. Various techniques have been developed to reduce the power dissipation of parallel multipliers. While several of these techniques reduce power dissipation by eliminating spurious transitions [7-8], others have focused on developing novel multiplier architectures and sign-extension techniques to reduce power dissipation and improve performance [3][9].

Yet another approach is to develop low-power 3–2 counters and 4–2 compressors, which are key components in parallel multipliers [10]. Although each of these techniques helps reduce power dissipation, further reductions will be needed for future digital signal processing systems. This research uses an approach that significantly reduces the power consumption and the chip area of parallel multipliers without sacrificing performance. The approach is based on using low-power, minimal-transistor-count adders, which are the determining blocks (second stage of the algorithm) in the performance of the multiplier.

The operation of a parallel multiplier can be divided into two parts: (a) formation of the partial products, and (b) summation of these products to form the final product of the multiplication. Binary addition is the most basic and most frequently used arithmetic operation in microprocessors, digital signal processors (DSP), application-specific integrated circuits (ASIC) and similar devices. Therefore, binary adders are crucial building blocks in VLSI circuits, and the efficient implementation of these adders affects the performance of the entire system [5].

#### **II. BACKGROUND**

2.1. MULTIPLIER ARCHITECTURE



Multipliers play a major role in arithmetic operations in digital signal processing applications. Present developments in processor design aim at using low-power multiplier architectures in processor circuits, so the need for low-power multipliers has increased and designers concentrate more on low-power, efficient circuit designs. Generally, the computational performance of a DSP processor is affected by the performance of its multipliers, and we take care to address this in our design. A processor's efficiency is usually determined by its multiplier speed and supply voltage. The Booth algorithm and Wallace tree counters or compressors are the basic techniques used to increase the speed of parallel multipliers [1]. Thus, to speed up the processor, a parallel multiplier is usually preferred over a serial multiplier. The drawback of the Booth algorithm style, however, is its high energy dissipation. Hence we use a Wallace tree multiplier, designed using the 8T adder.

If speed is not a concern in the multiplier design, the partial products can be summed serially to reduce design complexity, but speed and low power have become important criteria in designing today's energy-efficient processors. Hence the Wallace tree multiplier structure is used to add up all the partial products in parallel [4]. In this method the bits in each column are compressed into two bits. Here an adder itself works as a compressor: it compresses three input bits of weight n into two output bits, a sum bit of weight n and a carry bit of weight n+1. Hence it is called a 3:2 compressor or counter, and its multiplication style is shown in Fig. 1. In this paper the multiplier is designed using the 8T adder.



Figure 1: Wallace tree multiplier design flow

Multipliers can be implemented using different algorithms. Depending on the algorithm used, the performance characteristics of the multipliers vary. Binary adders are an essential component in the implementation of digital multipliers. With the emergence of power as a design consideration, speed is not the only criterion by which various implementations are judged. Designing multipliers with low-power, energy-efficient adders reduces the power consumption and improves the efficiency of the multipliers.

|    |      |      |      | m3   | m2   | m1   | m0   |
|----|------|------|------|------|------|------|------|
|    |      |      |      | n3   | n2   | n1   | n0   |
|    |      |      |      | n0m3 | n0m2 | n0m1 | n0m0 |
|    |      |      | n1m3 | n1m2 | n1m1 | n1m0 |      |
|    |      | n2m3 | n2m2 | n2m1 | n2m0 |      |      |
|    | n3m3 | n3m2 | n3m1 | n3m0 |      |      |      |
| p7 | p6   | p5   | p4   | p3   | p2   | p1   | p0   |

Figure 2: Multiplication of 4x4 bit numbers
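The partial-product array of Figure 2 can be checked numerically. The following Python sketch is illustrative only: the function names `partial_products` and `sum_partial_products` are assumptions introduced here and are not part of the original design. Each bit product n_i·m_j is formed with an AND operation and carries the weight 2^(i+j), so adding all bit products at their weights should reproduce the product.

```python
def partial_products(m, n, width=4):
    """Return rows of bit products; rows[i][j] = n_i AND m_j, with weight 2**(i + j)."""
    m_bits = [(m >> j) & 1 for j in range(width)]
    n_bits = [(n >> i) & 1 for i in range(width)]
    return [[n_bits[i] & m_bits[j] for j in range(width)] for i in range(width)]


def sum_partial_products(rows):
    """Add every bit product at its column weight, as in the array of Figure 2."""
    return sum(bit << (i + j)
               for i, row in enumerate(rows)
               for j, bit in enumerate(row))


# Exhaustive check over all 4-bit operand pairs.
for m in range(16):
    for n in range(16):
        assert sum_partial_products(partial_products(m, n)) == m * n
```

Row i of the returned list corresponds to the row of Figure 2 that is shifted left by i positions.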

Wallace trees were first introduced in 1964 in order to design multipliers whose completion time grows as the logarithm of the number of bits to be multiplied. The Wallace tree multiplier is based on a tree structure. A 4-bit Wallace tree multiplier is shown in Fig. 1. The Wallace method uses three steps to process the multiplication operation:

1) Formation of bit products

2) The bit product matrix is reduced to a 2-row matrix by using a carry-save adder (Wallace tree).

3) The remaining two rows are summed using a fast carry-propagate adder to produce the product.

The advantage of the Wallace tree is speed, because the addition of the partial products takes O(log N) time, where N is the number of summands.
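The three steps above can be sketched in Python as a behavioural model. This is a simplified, hedged illustration and not the authors' circuit: it uses only 3:2 compressors (full adders), passes leftover bits in a column through unchanged, and replaces the fast carry-propagate adder with ordinary integer addition. The names `full_adder`, `reduce_once` and `wallace_multiply` are assumptions introduced for this example.

```python
def full_adder(a, b, c):
    """3:2 compressor: three bits of weight n -> sum (weight n) and carry (weight n+1)."""
    s = a ^ b ^ c
    carry = (a & b) | (c & (a ^ b))
    return s, carry


def reduce_once(columns):
    """One carry-save stage: compress each column in groups of three bits."""
    reduced = [[] for _ in range(len(columns) + 1)]
    for w, col in enumerate(columns):
        groups = len(col) // 3
        for g in range(groups):
            s, carry = full_adder(*col[3 * g:3 * g + 3])
            reduced[w].append(s)          # sum bit keeps weight w
            reduced[w + 1].append(carry)  # carry bit moves to weight w + 1
        reduced[w].extend(col[3 * groups:])  # leftover bits pass through
    return reduced


def wallace_multiply(m, n, width=4):
    # Step 1: form the bit products and group them into columns by weight.
    columns = [[] for _ in range(2 * width)]
    for i in range(width):
        for j in range(width):
            columns[i + j].append(((n >> i) & 1) & ((m >> j) & 1))

    # Step 2: reduce the bit-product matrix until at most two rows remain.
    while any(len(col) > 2 for col in columns):
        columns = reduce_once(columns)

    # Step 3: add the remaining two rows (integer addition stands in for
    # the carry-propagate adder).
    row_a = sum(col[0] << w for w, col in enumerate(columns) if len(col) >= 1)
    row_b = sum(col[1] << w for w, col in enumerate(columns) if len(col) == 2)
    return row_a + row_b


# Exhaustive check for the 4x4 case.
assert all(wallace_multiply(m, n) == m * n for m in range(16) for n in range(16))
```

Each full adder removes one bit from the matrix (three in, two out) while preserving the weighted sum, which is why the reduction loop terminates and the final two-row addition yields the product.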

#### **III. CHARACTERISTICS OF MULTIPLIERS**

There are three main components of power consumption in digital CMOS VLSI circuits.

1) **Switching Power:** consumed in charging and discharging the circuit capacitances during transistor switching.

2) **Short-Circuit Power:** consumed due to the short-circuit current flowing from the power supply to ground during transistor switching. This component becomes more dominant in Deep Sub-Micron (DSM) technology.

3) **Static Power:** consumed due to static and leakage currents flowing while the circuit is in a stable state.

The first two components are referred to as dynamic power, since power is consumed dynamically while the circuit is changing states. Dynamic power accounts for the majority of the total power consumption in digital CMOS VLSI circuits at micron technology [11][12].

$$P_{avg} = P_{dynamic} + P_{static} = (P_{switching} + P_{short\text{-}circuit}) + P_{leakage}$$

$$P_{avg} = (\alpha_{0 \rightarrow 1} \times C_L \times V_{dd}^2 \times f_{clk}) + (I_{sc} \times V_{dd}) + (I_{leakage} \times V_{dd}) \qquad (1)$$

The first and second terms in (1) collectively represent the dynamic power. In 180 nm technology, the static power loss is far smaller than the dynamic power dissipation. Therefore, in most cases the total power loss is approximately equal to the dynamic power consumption, which depends on the internal node capacitance and the switching probability.
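As a worked illustration of equation (1), the short Python sketch below evaluates the three terms separately. The function name `average_power` and all numerical values in the usage example are hypothetical placeholders chosen only to show the arithmetic; they are not measurements or parameters from this paper.

```python
def average_power(alpha, c_load, v_dd, f_clk, i_sc, i_leakage):
    """Evaluate equation (1) in SI units and return each term in watts."""
    switching = alpha * c_load * v_dd ** 2 * f_clk  # alpha(0->1) x C_L x Vdd^2 x f_clk
    short_circuit = i_sc * v_dd                     # I_sc x Vdd
    leakage = i_leakage * v_dd                      # I_leakage x Vdd
    return {
        "switching": switching,
        "short_circuit": short_circuit,
        "dynamic": switching + short_circuit,
        "static": leakage,
        "total": switching + short_circuit + leakage,
    }


# Hypothetical operating point (placeholder values, not taken from the paper):
# activity 0.5, 10 fF load, 1.8 V supply, 100 MHz clock, 1 uA short-circuit
# current, 1 nA leakage current.
example = average_power(alpha=0.5, c_load=10e-15, v_dd=1.8,
                        f_clk=100e6, i_sc=1e-6, i_leakage=1e-9)
for name, watts in example.items():
    print(f"{name:>13}: {watts * 1e6:.4f} uW")
```

With placeholder values of this kind the static term is orders of magnitude smaller than the dynamic term, which mirrors the approximation made in the text.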

#### **IV. SYSTEM DESCRIPTION**

a) Logic Equations for the Proposed Full Adder

The proposed full adder circuit is implemented with two XNOR gates and one multiplexer block, as shown in the block diagram of Fig. 3(a). Sum is generated by the two XNOR gates and Cout is generated by the two-transistor multiplexer block. The single-bit full adder using the proposed XNOR gates, with eight transistors in total, is shown in Fig. 3(b). For the multiplexer section, typical widths (Wn and Wp) of 1.0 µm and 2 µm have been used for the NMOS and PMOS transistors respectively.







The full adder takes three single-bit inputs A, B and Cin and generates two single-bit outputs, Sum and Cout, according to the following equations:

$$Sum = (A \oplus B) \oplus C_{in}$$

$$C_{out} = A \cdot B + C_{in} \cdot (A \oplus B) \qquad (2)$$

From Fig. 3(a):

$$Sum = H \oplus C_{in} = H \cdot C_{in}' + H' \cdot C_{in}$$

$$C_{out} = A \cdot H' + C_{in} \cdot H \qquad (3)$$

where H is the half sum (A ⊕ B) and H' is the complement of H.
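The equivalence of equations (2) and (3), and the claim that two XNOR gates plus a multiplexer realise a full adder, can be checked with a small logic-level model. The Python sketch below is an assumption-laden illustration: the helper names `xnor`, `mux2` and `full_adder_xnor_mux` are introduced here, and using H' as the multiplexer select signal is an assumed wiring, since the text only states that Cout comes from a multiplexer block. It models logic values only, not transistor behaviour.

```python
from itertools import product


def xnor(a, b):
    """Two-input XNOR on single bits (0 or 1)."""
    return 1 - (a ^ b)


def mux2(select, when_0, when_1):
    """2:1 multiplexer: returns when_1 if select is 1, otherwise when_0."""
    return when_1 if select else when_0


def full_adder_xnor_mux(a, b, c_in):
    h_bar = xnor(a, b)            # H' = complement of the half sum H = A xor B
    total = xnor(h_bar, c_in)     # XNOR(H', Cin) equals H xor Cin, i.e. Sum
    c_out = mux2(h_bar, c_in, a)  # H' = 1 selects A, H' = 0 selects Cin, as in (3)
    return total, c_out


# Compare against equations (2) and against ordinary arithmetic for all inputs.
for a, b, c_in in product((0, 1), repeat=3):
    s, c_out = full_adder_xnor_mux(a, b, c_in)
    assert s == (a ^ b) ^ c_in
    assert c_out == (a & b) | (c_in & (a ^ b))
    assert 2 * c_out + s == a + b + c_in
```

The second XNOR recovers the true sum because complementing one input of an XNOR complements its output, so no separate inverter is needed in this logic-level view.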

b) Logic for the XNOR Gate

The exclusive-OR (XOR) and exclusive-NOR (XNOR) gates are the basic building blocks of a full adder circuit. The XOR/XNOR gates can be implemented using AND, OR, and NOT gates with high redundancy [1]. Optimized design of these gates enhances the performance of VLSI systems, as these gates are used as sub-blocks in larger circuits. XNOR/XOR designs with fewer transistors, lower power dissipation and lower delay are highly desirable for efficient implementation of large VLSI systems. The proposed design of the XNOR gate with three transistors is shown in Fig. 4. In XNOR circuit, Width (Wn) of NMOS transistors N1 and N2 has been taken and 1.0µm respectively. The width (Wp) of transistor P1 has been taken as 2.0 µm.



Figure 4: 3T XNOR gate

In Figure 4, when A=B=0 the output is high because transistor P1 is on and transistors N1 and N2 are off. With A=0 and B=1 the circuit shows a low output, as transistor P1 is off and the output node is discharged by transistor N2, which is on. When A=B=1, the output node shows high logic, as transistor N1 is on and passes the high logic level to the output. When A=1 and B=0, both P1 and N1 are on and the output node is discharged rapidly through N1 and N2: with A=1, transistor N1 turns on, which raises the gate voltage of N2 above its threshold voltage so that N2 also turns on, and N1 and N2 together provide a conducting path from the output node to ground. The switching speed of N1 is higher than that of N2 because delay is inversely proportional to channel width. In this condition the circuit behaves like an inverter with A=1 as input and gives a low output, while transistor P1 simply acts as a load resistance with a grounded gate input (B=0).

## **V. SIMULATION RESULTS**

a) Input and Output Waveforms

All results are obtained in gpdk180nm CMOS technology using the Cadence tool.

Figure 5: Schematic diagram of the 3T XNOR gate

Figure 6: Schematic diagram of the 8T adder

Figure 7: Wallace tree multiplier

Figure 8: Input and output waveforms of the 8T full adder

Figure 9: Input and output waveforms of the XNOR gate

## **VI. CONCLUSION**



An XNOR gate using three transistors has been presented, which shows a power dissipation of $261.2\,\mu$W in $0.50\,\mu$m technology. A single-bit full adder using eight transistors has been designed using the proposed XNOR cell, which shows a power dissipation of $620.5\,\mu$W. A 4x4 Wallace tree multiplier has been implemented using the proposed 8T adder, which shows a power dissipation of 18.23 mW. The proposed XNOR gate and full adder circuits show better performance in terms of power consumption and transistor count.

## **VII. FUTURE WORK**

As future work, the proposed full adder architecture based on the proposed XNOR gates can be applied to different digital multipliers. The power dissipation and delay of these multipliers can then be analyzed and compared, so that the best one can be chosen for the required application.

## REFERENCES

- [1] B. Millar, P. E. Madrid, and E. E. Swartzlander, Jr., "A fast hybrid multiplier combining Booth and Wallace/Dadda algorithms," Proceedings of the 35th Midwest Symposium on Circuits and Systems (1992), pp. 158–165.
- [2] C. Lemonds and S. S. Shetti, "A low power 16 by 16 multiplier using transition reduction circuitry," Proceedings of the International Workshop on Low Power Design (1994), pp. 139–144.
- [3] E. de Angel and E. E. Swartzlander, Jr., "Low power parallel multipliers," IX Proceedings of Workshop on VLSI Signal Processing (1997), pp. 199–208.
- [4] C. S. Wallace, "A suggestion for a fast multiplier," IEEE Trans. Electron. Comput., vol. EC-13, pp. 14–17, Feb. 1964.
- [5] A. M. Shams and M. Bayoumi, "A novel high-performance CMOS 1-bit full adder cell," IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, vol. 47, no. 5, pp. 478–481, May 2000.
- [6] Kei-Yong Khoo, Zhan Yu, and A. N. Willson, Jr., "Improved-Booth encoding for low-power multipliers," Proceedings of the 1999 IEEE International Symposium on Circuits and Systems (1999), pp. 62–65.
- [7] G. E. Sobelman and D. L. Raatz, "Low-power multiplier design using delayed evaluation," Proceedings of the International Symposium on Circuits and Systems (1995), pp. 1564–1567.
- [8] T. Sakuta, W. Lee, and P. T. Balsara, "Delay balanced multipliers for low power/low voltage DSP core," Proceedings of IEEE Symposium on Low Power Electronics (1995), pp. 36–37.
- [9] M.-C. Wen, S.-J. Wang, and Y.-N. Lin, "Low power parallel multiplier with column bypassing," Proceedings of IEEE International Symposium on Circuits and Systems (2005), pp. 1638–1641.
- [10] I. S. Abu-Khater, A. Bellaouar, and M. I. Elmasry, "Circuit techniques for CMOS low-power high-performance multipliers," IEEE Journal of Solid-State Circuits (1996), vol. 31, pp. 1535–1546.
- [11] S. Devadas and S. Malik, "A survey of optimization techniques targeting low power VLSI circuits," in Proc. 32nd ACM/IEEE Design Automation Conf., pp. 242–247, 1995.
- [12] A. P. Chandrakasan, S. Sheng, and R. W. Brodersen, "Low-power CMOS digital design," IEEE J. Solid-State Circuits, vol. 27, pp. 473–484, 1992.

