# Flip-Flop Circuit Families: Comparison of Layout and Topology for Low Power VLSI Circuits

# R.Uma

Electronics and Communication Engineering Rajiv Gandhi College of Engineering and Technology Puducherry, India

Abstract: The choice of flip-flop topology is of essential importance in the design of VLSI integrated circuits for high-speed, high-performance CMOS applications. Understanding the suitability of flip-flops and selecting the best topology for a given application is an important issue in meeting the low-power and high-performance requirements of a design. This paper presents a broad comparison of existing flip-flop classes and topologies in terms of area, transistor count, parasitic values and power dissipation. In particular, the comparison strategy covers circuit operation, simulation setup, parasitic estimation, area estimation and power dissipation estimation. An overview of the optimum design strategy is also presented. The operation of each flip-flop is explained, and each circuit is simulated using the Microwind and LT-SPICE simulators. The flip-flops are analyzed for area, power dissipation and propagation delay in a 0.12 µm, six-metal-layer technology.

Keywords: High speed, VLSI, flip-flops, topologies, power dissipation.

#### I. INTRODUCTION

Sequential logic circuits, such as registers, memory elements and counters, are heavily used in the implementation of Very Large Scale Integrated (VLSI) circuits [1]. As VLSI circuits continue to evolve and technology progresses, the level of integration increases and higher clock speeds are achieved. In such submicron CMOS technologies, area, topology selection, power dissipation and speed are very important aspects, especially when designing Clocked Storage Elements (CSEs) for high-speed, low-energy designs such as portable battery-powered devices and microprocessors. Various classes of flip-flops have been proposed [1]-[16] to achieve high-speed and low-energy operation. Selecting the appropriate flip-flop topology for a particular application is difficult, since there is a large number of existing topologies and the choice depends on area, power dissipation and transistor sizing. Specifically, an efficient topology and a good layout design are necessary to achieve reliable results that are usable in practical designs.

A flip-flop is a bistable circuit that stores a logic state of 0 or 1 in response to a clock pulse and one or more data inputs. A large proportion of digital circuit design is synchronous, operating from a clock signal to reduce the complexity of the design. In the design of sequential circuits, a major challenge is the design of an efficient D flip-flop (DFF). Several static and dynamic DFF architectures have been proposed in [1]-[16]. The topology comparison begins with the conventional single edge-triggered (SET) flip-flop [1], which latches data at either the positive or the negative edge of the clock. A SET FF can be configured as a master-slave latch pair by cascading the sequential structure, but it is inefficient because half of the clock edges are wasted while the full implementation cost of the complete clock is incurred. The next topology is the double edge-triggered (DET) flip-flop, which can be triggered at both the positive and the negative clock edges. Numerous DET topologies have been proposed in [2]-[4], but they are inefficient because of reduced driving capability and DC power consumption at the output node. The topology used in this work follows [2], which overcomes these drawbacks. The third topology is the semi-dynamic flip-flop, combining static and dynamic design, proposed in [5]. This flip-flop provides short latency and the ability to incorporate logic functions with minimum delay, and is used in high-performance microprocessors. The next topology is the Hybrid Latch Flip-Flop (HLFF) proposed in [16], which is aimed at a significant reduction in latch latency and clock load. Like a standard flip-flop, the HLFF samples the data on one edge of the clock and eliminates retardation of data flow on the opposite edge.

ISSN: 2248-9622 www.ijera.com

#### Vol. 1, Issue 4, pp.1971-1982

The next topology discussed is the Pulse-Triggered Flip-Flop (P-FF) proposed in [6], which consists of a pulse generator that produces strobe signals and a latch for data storage. This type of FF is used in high-speed applications and is less sensitive to clock jitter. This is followed by the Sense-Amplifier-Based Flip-Flop (SAFF), which accepts true and complementary inputs and produces true and complementary outputs. The main drawback of the sense-amplifier-based flip-flop proposed in [7][8] is that its slave element, a set/reset (SR) NAND latch, suffers from the possibility that one of the internal nodes will float low if the inputs switch while the clock is high. Moreover, this circuit uses a minimum number of transistors, which results in asymmetrical delays with a slow high-to-low clock-to-output delay. The SAFF proposed in [9] eliminates these limitations. Next, the Write-Port Master-Slave flip-flop (WPMS) proposed in [10] is explained; its pass gates contain no pMOS devices. The "inverter - pass transistor" combination used in the WPMS structure decouples the data path inverters from the clock, allowing a more efficient distribution of the gain and yielding lower energy for faster designs. This is followed by True Single-Phase Clocking (TSPC), proposed in [11], an efficient methodology for very high-speed VLSI design. TSPC is safer and takes less clock signal routing area. Finally, an ingenious positive edge-triggered register based on a master-slave concept that is insensitive to clock overlap, called the $C^2MOS$ (Clocked CMOS) register and proposed in [12][13], is discussed.

In this paper, an innovative analysis and comparison strategy is proposed that accounts for the estimation of area and power dissipation in order to achieve fair and meaningful results. The strategy is applied to compare a large number of FF topologies [1]-[16] in 0.12 µm CMOS technology. The paper is organized as follows. Section II describes the power consumption in storage elements. Section III explains the flip-flop topologies. Section IV presents the simulated results. Section V presents the summary of the comparison. Finally, the conclusion is presented in Section VI.

#### II. **POWER CONSUMPTION IN STORAGE ELEMENT**

Power consumption has recently become one of the most critical constraints in high-performance integrated circuit design. There are three main sources of power dissipation in the latch:

• Internal power dissipation of the latch, including the power dissipated in switching the output loads

• Local clock power dissipation, which represents the portion of power dissipated in the local clock buffer driving the clock input of the latch

• Local data power dissipation, which represents the portion of power dissipated in the logic stage driving the data input of the latch

The total power is the sum of these three measured components.

The clocking system of a digital VLSI circuit, as defined in [14], is shown in Fig (1). In a synchronous circuit, the power consumption can be represented as the sum of the power consumed by the clock generator $P_{cg}$, the global wiring capacitance $P_{gw}$, the local wiring capacitance $P_{lw}$, the total clocked gate capacitance $P_{g}$ and all the flip-flops $P_{ff}$. The total power consumed, $P_{cs}$, is given as:

$$P_{cs} = P_{cg} + P_{gw} + P_{lw} + P_g + P_{ff} = \left(1 + \beta^{-1} + \beta^{-2} + \beta^{-3} + \cdots + \beta^{-N}\right) V_s^2 \, f \, \left(C_j + C_{gw} + C_{lw} + C_g\right) + P_{ff}$$





Fig 1. The block diagram of a digital VLSI showing the clocking system


where

- $\beta$: tapering factor of the clock generator
- $N$: number of tapering stages in the clock buffer
- $V_s$: voltage swing of the clock distribution network
- $f$: clock frequency
- $C_j$: junction capacitance of the source-drain regions of the output node of the clock generator
- $C_{gw}$: total parasitic capacitance of global wires
- $C_{lw}$: total parasitic capacitance of local wires
- $C_g$: total gate capacitance of the clocking system
- $P_{ff}$: power dissipation in all the flip-flops (power dissipation due to switching output loads + local clock buffer + logic stage driving the data input)

Total power dissipation in a synchronous system can be reduced by choosing the proper values of  $\beta$  and reducing the values of  $C_{gw}, C_{lw}, C_g, V_s$ , f.
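As an illustration of how the terms in the expression for $P_{cs}$ interact, the following minimal Python sketch evaluates it directly. The function name `clock_system_power` and all numeric values are hypothetical and chosen only for illustration; they are not taken from the simulations in this paper.

```python
def clock_system_power(beta, n_stages, v_swing, freq, c_j, c_gw, c_lw, c_g, p_ff):
    """Evaluate P_cs = (1 + beta^-1 + ... + beta^-N) * Vs^2 * f * (Cj + Cgw + Clw + Cg) + P_ff."""
    # Geometric series contributed by the N tapered stages of the clock buffer.
    taper_sum = sum(beta ** (-k) for k in range(n_stages + 1))
    # Total capacitance switched by the clock network.
    switched_cap = c_j + c_gw + c_lw + c_g
    return taper_sum * v_swing ** 2 * freq * switched_cap + p_ff


# Hypothetical operating point (illustrative assumptions, not measured data).
params = dict(beta=3.0, n_stages=3, v_swing=1.2, freq=500e6,
              c_j=5e-15, c_gw=200e-15, c_lw=100e-15, c_g=150e-15, p_ff=400e-6)

nominal = clock_system_power(**params)
# Halving the clock swing scales the clock-network term by one quarter.
low_swing = clock_system_power(**{**params, "v_swing": 0.6})
print(f"nominal   P_cs = {nominal * 1e6:.1f} uW")
print(f"low swing P_cs = {low_swing * 1e6:.1f} uW")
```

Because $V_s$ enters quadratically while $f$ and the capacitances enter linearly, reducing the clock swing is the strongest single lever on the clock-network term in this model.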

### III. ELUCIDATION OF FLIP-FLOP TOPOLOGY

The topologies chosen for this work are commonly used high-performance flip-flops. They are:

- Single Edge-Triggered Flip-Flop (SET)
- Double Edge-Triggered Flip-Flop (DET)
- Semi-Dynamic Flip-Flop (SDFF)
- Hybrid Latch Flip-Flop (HLFF)
- Pulse-Triggered Flip-Flop (P-FF)
- Sense-Amplifier-Based Flip-Flop (SAFF)
- Write-Port Master Slave Flip-Flop (WPMS)
- True Single-Phase-Clock Flip-Flop (TSPC)
- Clocked CMOS Flip-Flop (C<sup>2</sup>CMOS)

#### A. Single Edge-Triggered Flip-Flop (SET):

This type of flip-flop samples the data at either the positive or the negative edge of the clock. The conventional flip-flop is constructed from master and slave latches. In this configuration, the output of the master latch is the input of the slave latch, and the output of the slave latch provides the output of the flip-flop. A conventional SET flip-flop proposed in [1] is shown in Fig. (2a). At the master stage, a transmission gate TG1, controlled by CLK and CLKB, receives the data input D prior to a rising edge of the clock. Similarly, on the slave side, TG2 is added; it is non-conductive while TG1 is active and vice versa. Therefore, while the master samples the D input, the slave holds its previous state and serves as a buffer. During the other half of the cycle, the slave latch is enabled and changes its state to that of the master latch.
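The master-slave behaviour described above can be sketched as a purely behavioural Python model. This is an idealized abstraction with no delays or transistor-level effects; the class name `MasterSlaveDFF` and the helper `simulate_set` are hypothetical, and the assignment of TG1 to the clock-low phase is an assumption made for a positive-edge-triggered device.

```python
class MasterSlaveDFF:
    """Behavioural positive-edge-triggered D flip-flop made of two level-sensitive latches."""

    def __init__(self):
        self.master = 0  # state held by the master latch
        self.slave = 0   # state held by the slave latch (drives Q)

    def step(self, clk, d):
        if clk == 0:
            # TG1 conducts: the master tracks D while the slave holds its state.
            self.master = d
        else:
            # TG2 conducts: the master holds and the slave copies it to Q.
            self.slave = self.master
        return self.slave


def simulate_set(samples):
    """Apply a list of (clk, d) samples and return the resulting Q values."""
    ff = MasterSlaveDFF()
    return [ff.step(clk, d) for clk, d in samples]


# D changes while the clock is high (third sample) and is ignored until the next rising edge.
print(simulate_set([(0, 1), (1, 1), (1, 0), (0, 0), (1, 0)]))
```

In this model, Q changes only on samples where the clock has just gone high, which is the single-edge behaviour that leaves the falling edge unused.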

#### B. Double Edge-Triggered Flip-Flop (DET)

A double edge-triggered flip-flop stores data on both the rising and the falling edge of the clock signal. Several DET flip-flops have been proposed [2]-[4]. These flip-flops occupy a large silicon area and produce unwanted internal switching. The topology presented in this paper, proposed by Yu Chien-Cheng et al [1], reduces the gate density as well as the clock skew. The circuit is illustrated in Fig. (2b). It consists of a parallel connection of two latches, forming an upper and a lower data path. Transistors MN1, MP2 and MP3 and inverter INV1 constitute the upper data path, while transistors MP1, MN2 and MN3 and inverter INV2 form the lower data path. One latch is transparent on the rising edge of the clock and the other is transparent on the falling edge. Transistor MP2 provides feedback to pull the storage node X up substantially to VDD when node XB is low. Similarly, transistor MN2 provides feedback to pull node Y down to GND when node YB is high. When the clock is high and the data is also high, the upper data path starts conducting and node X goes to logic high through transistor MN3. On the other hand, when the clock is low, the lower data path starts conducting and node Y goes to logic low through transistor MN2. Meanwhile, the upper data path quickly starts to send the previously held data to the output node Q through transistor MP3.
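The two-path operation can be illustrated with a behavioural Python sketch in which one latch is transparent while the clock is high, the other while it is low, and the output is always taken from the latch that is currently holding. This abstracts away the transistor-level feedback of Fig. (2b); the names `DoubleEdgeDFF` and `simulate_det` are hypothetical.

```python
class DoubleEdgeDFF:
    """Behavioural double-edge-triggered flip-flop built from two parallel latches."""

    def __init__(self):
        self.upper = 0  # latch that is transparent while the clock is high
        self.lower = 0  # latch that is transparent while the clock is low

    def step(self, clk, d):
        if clk:
            # Upper path tracks D; the lower latch holds the value captured at the rising edge.
            self.upper = d
            return self.lower
        # Lower path tracks D; the upper latch holds the value captured at the falling edge.
        self.lower = d
        return self.upper


def simulate_det(samples):
    """Apply a list of (clk, d) samples and return the resulting Q values."""
    ff = DoubleEdgeDFF()
    return [ff.step(clk, d) for clk, d in samples]


# Q updates after the rising edge (second sample) and again after the falling edge (fourth sample).
print(simulate_det([(0, 1), (1, 1), (1, 0), (0, 0), (0, 1), (1, 1)]))
```

Since new data reaches Q on every clock transition, the same throughput can in principle be obtained with a clock running at half the frequency of a single-edge design.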

### C. Semi-Dynamic Flip-Flop

Fig. (2c) shows the block diagram of the semi-dynamic flip-flop (SDFF) described in [5]. The operation of the circuit is divided into a precharge phase and an evaluation phase. The circuit combines a dynamic input stage with static operation, hence the designation semi-dynamic. On the falling edge of the clock, the flip-flop enters the precharge phase. Node X precharges high and cuts Q off from the input stage, so that Q retains its previous state. Node S remains high, holding transistor N1 on, because CKD is low during precharge. When the clock rises, the flip-flop enters the evaluation phase. If D is '0', X remains high, and node Q either remains low or discharges through transistors N4-N5, driving QB high. If D is '1' and X starts to fall low, the transistor remains on to finish the transition. The weak cross-coupled inverters staticize the flip-flop and the final inverter buffers the output node. Conditional shutoff addresses the following case: if input D were high prior to evaluation, node X would be discharged through transistors N1-N3, which may drive Q high and QB low and so affect the logical values of the flip-flop. To prevent the shutoff, transistor N1 is incorporated between node X and CKD.

### D. Hybrid Latch Flip-Flop (HLFF)

The HLFF samples the data on one edge of the clock and eliminates the obstruction (delay) of data flow on the opposite edge. The HLFF is aimed mainly at a substantial reduction in latch latency and clock load. Its basic operation is similar to that of a latch, because it presents a soft clock edge that allows slack passing and minimizes the effects of clock skew on cycle time [16]. This cycle time is determined by an integrated one-shot derived from the clock edge. Fig. (2d) shows the basic construction of the HLFF. Before the rising edge of the clock, node X is precharged to VDD and node Q holds the previous data, since transistors N1 and N4 are off while N3, N6 and P1 are on. At the rising edge of the clock, transistors N1 and N4 turn on while N3 and N6 stay on for three inverter delays. During this period the flip-flop is transparent and the data is sampled into the latch. When CKDB goes low, node X is decoupled from D; it either stays at its logic level or begins to precharge to VDD through P3. Node X is completely precharged and held at VDD at the negative edge of the clock. At an operating frequency of 500 MHz, the design, at half of the clock load capacitance, has a latency of two-thirds of the aggregate delay of the transparent-low (TLL) and transparent-high (THL) latches. As a result, cycle time is improved by at least 10% and the clock load is reduced to 30%.
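The transparency period after the rising clock edge can be illustrated with a small discrete-time Python sketch. It assumes, for illustration only, that each inverter delay equals one sample period and that the clock was low before the first sample; the function name `transparency_window` is hypothetical.

```python
def transparency_window(clk, inverter_delays=3):
    """Return, per sample, 1 where CLK and the delayed inverted clock CKDB are both high."""
    window = []
    for t, c in enumerate(clk):
        # Clock value seen at the input of the inverter chain `inverter_delays` samples ago.
        # Before that much history exists, the clock is assumed to have been low.
        delayed = clk[t - inverter_delays] if t >= inverter_delays else 0
        ckdb = 1 - delayed  # an odd number of inverters yields an inverted, delayed clock
        window.append(c & ckdb)
    return window


clk = [0, 0, 1, 1, 1, 1, 1, 0, 0, 0]
print(transparency_window(clk))
```

For this waveform the window is high for exactly three samples after the rising edge, mirroring the three-inverter-delay period in which the flip-flop is transparent. A wider window eases setup at the cost of a longer hold time.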

### E. Pulse-Triggered Flip-Flop (P-FF)

A popular alternative to the master-slave flip-flop for high-speed data storage is the pulse-triggered flip-flop. A P-FF consists of a pulse generator that produces strobe signals and a latch for data storage. P-FF designs can be broadly classified into implicit and explicit types. In the implicit type, the pulse generator is built into the logic of the latch; in the explicit type, the pulse generator and the latch are separate. The implicit type is often suitable for power-efficient design; however, the topology suffers from a lengthened discharging path, which leads to inferior timing characteristics. The explicit type overcomes the timing constraints, but faces problems in pulse width control and in driving a large capacitive load when one pulse generator is shared among several latches. In this paper, the low-power implicit-type P-FF proposed by Yin-Tsung Hwang et al [6] is explained. The circuit is shown in Fig (2e). The scheme reduces the number of stacked NMOS transistors along the discharge path and also supports conditional enhancement of the pull-down strength when the input data is '1'. During the negative edge of the clock, i.e. when both inputs D and CLK are zero, node Z floats temporarily. On the positive edge of the clock, transistors N2 and N3 are on, which drives node Z to a weak logic high; as a result, transistor N1 turns on after a delay due to inverter I1. The voltage level of node X rises and turns off transistor P3, and the data is sampled within the width of the generated discharging pulse. The switching power is considerably reduced because of the weakened voltage swing. Transistors N2 and N3 speed up pulse generation.

### F. Sense-Amplifier-Based Flip-Flop (SAFF)

The Sense-Amplifier-Based flip-flop (SAFF) accepts true and complementary inputs and produces true and complementary outputs. It is built around a clocked sense amplifier so that it can respond rapidly to small differential input voltages. The sense-amplifier-based flip-flop initially proposed in [7][8] uses a fast differential sense-amplifier stage followed by a slave latch. The main drawback of the SAFF proposed in [7][8] is that the slave element, a set/reset (SR) NAND latch, suffers from the possibility that one of the internal nodes will float low if the inputs switch while the clock is high. Moreover, this circuit uses a minimum number of transistors, which results in asymmetrical delays with a slow high-to-low clock-to-output delay. In this paper, the novel high-speed sense-amplifier-based flip-flop proposed by Antonio et al [9] is explained. The circuit is shown in Fig (2f). The scheme reduces the drawbacks of [7][8]. The circuit topology is divided into two parts: the sense-amplifier stage and the output latch stage. The sense-amplifier stage follows the conventional approach [7], whereas the output stage is a hybrid between the conventional NAND latch and N-$C^2MOS$. During the negative edge of the clock, nodes Sbar and Rbar are precharged. On the positive edge of the clock, if the data input D is high, the sense amplifier pulls node Sbar to zero while node Rbar remains high. This turns transistor N5 off and P1 on, driving the output node Q high. Meanwhile, the output node Qbar is pulled down through transistors N3, N4 and N6. During the precharge phase, i.e. after the falling edge of the clock, the output node Q is high, which turns transistor N8 on and keeps the output node Qbar at zero even if the value of D changes. Transistors P3, P4, N5, N6, N7 and N8 hold the previous values of Q and Qbar. Transistors N1-N4 speed up the high-to-low output transition.

#### G. Write-Port Master Slave Flip-Flop (WPMS)

The Write-Port Master Slave flip-flop (WPMS) works similarly to the conventional master-slave flip-flop structure. However, each latch is implemented as a standard 6T SRAM cell. The implementation, taken from [10], is shown in Fig (2g). A keeper circuit using a single NMOS transistor is added on each side, driven by the true and complementary clock. On the falling edge of the clock, both NMOS transistors are off; the keeper is push-pulled from each side to change its state. On the rising edge of the clock, the input D is sampled into the latch. The main benefit of this structure is that there is no PMOS transistor other than in the keeper circuits, so the clock delay and the parasitic capacitance on the data path can be considerably reduced.

#### H. True Single-Phase-Clock Flip-Flop (TSPC)

Conventional latches require both true and complementary clock signals. The True Single-Phase-Clock (TSPC) circuit technique uses only one clock signal, which is never inverted, and fits both static and dynamic CMOS circuits. The topology, taken from [11], is shown in Fig (2h). On the falling edge of the clock, the latch holds its previous state with the help of transistors P1 and N4. On the rising edge of the clock, the D input is sampled through transistors P1, N1, N3 and N4.

#### I. Clocked CMOS Flip-Flop ($C^2CMOS$): A Clock-Skew Insensitive Approach

An ingenious positive edge-triggered register based on a master-slave concept that is insensitive to clock overlap, proposed in [12][13], is shown in Fig (2i). This circuit is called the $C^2CMOS$ (Clocked CMOS) flip-flop and operates in two phases. When clk=0, the first driver is turned on and the master stage acts as an inverter, sampling the inverted version of D on the internal node X; the master stage is in evaluation mode. When clk=1, the master stage is in hold mode, while the second section evaluates. The previously stored value is propagated to the output node through the slave stage, which acts as an inverter.



# R.Uma / International Journal of Engineering Research and Applications (IJERA) ISSN: 2248-9622 <u>www.ijera.com</u> Vol. 1, Issue 4, pp.1971-1982












Fig 2. Flip-flop topologies: a. Single Edge-Triggered Flip-Flop (SET), b. Double Edge-Triggered Flip-Flop (DET), c. Semi-Dynamic Flip-Flop (SDFF), d. Hybrid Latch Flip-Flop (HLFF), e. Pulse-Triggered Flip-Flop (P-FF), f. Sense-Amplifier-Based Flip-Flop (SAFF), g. Write-Port Master Slave Flip-Flop (WPMS), h. True Single-Phase-Clock (TSPC), i. Clocked CMOS (C<sup>2</sup>CMOS)

#### **IV. SIMULATED RESULT**

To evaluate performance, the flip-flop structures discussed in this paper were designed in 0.12 µm CMOS technology. All simulations are carried out at nominal conditions: VDD = 1.2 V, I/O supply voltage = 2.5 V and room temperature = 27 °C. The simulated results and the maximum and average drain currents $I_{DDMAX}$ and $I_{DDAVG}$ are shown in Fig (3) and Fig (4). The device model used in the simulation is the empirical Level 3 model with Monte Carlo variation (normal distribution, 20%) and the following MOSFET model parameters:

\* n-MOS Model 3:

\* low leakage

.MODEL N1 NMOS LEVEL=3 VTO=0.40 UO=600.000 TOX=2.0E-9

+LD=0.000U THETA=0.500 GAMMA=0.400

+PHI=0.200 KAPPA=0.060 VMAX=120.00K

+CGSO=100.0p CGDO=100.0p

+CGBO=60.0p CJSW=240.0p

\* p-MOS Model 3:

\* low leakage

.MODEL P1 PMOS LEVEL=3 VTO=-0.45 UO=200.000 TOX=2.0E-9

+LD=0.000U THETA=0.300 GAMMA=0.400

+PHI=0.200 KAPPA=0.060 VMAX=110.00K

+CGSO=100.0p CGDO=100.0p

+CGBO=60.0p CJSW=240.0p








Fig 3. Simulated results of the flip-flop topologies: a. Single Edge-Triggered Flip-Flop (SET), b. Double Edge-Triggered Flip-Flop (DET), c. Semi-Dynamic Flip-Flop (SDFF), d. Hybrid Latch Flip-Flop (HLFF), e. Pulse-Triggered Flip-Flop (P-FF), f. Sense-Amplifier-Based Flip-Flop (SAFF), g. Write-Port Master Slave Flip-Flop (WPMS), h. True Single-Phase-Clock (TSPC), i. Clocked CMOS (C<sup>2</sup>CMOS)



Fig 4. I<sub>DDMAX</sub> and I<sub>DDAVG</sub> of the flip-flop topologies: a. Single Edge-Triggered Flip-Flop (SET), b. Double Edge-Triggered Flip-Flop (DET), c. Semi-Dynamic Flip-Flop (SDFF), d. Hybrid Latch Flip-Flop (HLFF), e. Pulse-Triggered Flip-Flop (P-FF), f. Sense-Amplifier-Based Flip-Flop (SAFF), g. Write-Port Master Slave Flip-Flop (WPMS), h. True Single-Phase-Clock (TSPC), i. Clocked CMOS (C<sup>2</sup>CMOS)

#### V. **SUMMARY**

In this work, the performance of nine clocked storage elements (CSEs) is evaluated in terms of area, delay and power dissipation. They were selected for this experiment because they are commonly used in industry. The SET FF operates only at the rising edge of the clock and not at the falling edge. Although the topology is simple, requiring 16 transistors, it is inefficient because half of the clock edges are wasted while the full implementation cost of the entire clock is incurred. The DET FF, on the other hand, stores data on both the rising and the falling edge of the clock signal, which allows the clock frequency to be reduced to half of its original value for the same data throughput. As a result, power consumption is reduced, making the DET FF desirable for low-power applications. However, the DET FF suffers performance degradation compared with its single-edge counterpart, since it increases the delay along the critical path. Tables 1 and 2 present the comparison based on parasitic extraction and delay.

| Flip-flop type      | Total capacitance (fF) | Total resistance (Ω) | Total inductance at output node (nH) | Transistor count | I<sub>DDMAX</sub> (mA) | I<sub>DDAVG</sub> (mA) | Crosstalk at output node (fF) |
|---------------------|------------------------|----------------------|--------------------------------------|------------------|------------------------|------------------------|-------------------------------|
| SET                 | 49.877                 | 2949                 | 0.07                                 | 16               | 0.55                   | 0.002                  | 0.37                          |
| DET                 | 109.32                 | 3081                 | 0.09                                 | 12               | 1.67                   | 0.393                  | 0.20                          |
| SDFF                | 48.69                  | 5079                 | 1.1                                  | 25               | 0.10                   | 0.009                  | 0.09                          |
| HLFF                | 66.88                  | 4816                 | 0.11                                 | 20               | 1.67                   | 0.240                  | 0.12                          |
| P-FF                | 64.453                 | 4773                 | 0.11                                 | 19               | 0.39                   | 0.001                  | 0.09                          |
| SAFF                | 118.54                 | 8713                 | 0.2                                  | 24               | 0.45                   | 0.011                  | 0.13                          |
| WPMS                | 86.597                 | 7555                 | 0.18                                 | 22               | 1.99                   | 0.249                  | 0.12                          |
| TSPC                | 18.23                  | 1663                 | 0.03                                 | 6                | 1.31                   | 0.004                  | 0.01                          |
| C<sup>2</sup>CMOS   | 36.82                  | 2297                 | 0.05                                 | 8                | 0.75                   | 0.007                  | 0.02                          |

Table 1 Comparison based on parasitic extraction

The SDFF provides both short latency and the capability of incorporating logic functions with minimum delay penalty, eliminating one gate delay from the critical path, and allows scan circuitry to be added to the basic design. The HLFF is aimed at a substantial reduction in latch latency and clock load. One of its benefits is that it reduces retardation in data flow. However, it suffers from sizing problems, since a transparency window that is too large increases the hold time and results in race conditions. The P-FF uses only one latch, allows time borrowing across clock cycle boundaries and features a zero or even negative setup time. The P-FF is also less sensitive to clock jitter. Despite these advantages, the pulse generation circuitry requires delicate pulse width control in the face of process variation and careful configuration of the pulse clock distribution. The SAFF is characterized by a non-zero setup time, a reduced hold time, a low clock load and true single-phase operation. One of the major limitations of this topology is that it exhibits asymmetrical delays with a slow high-to-low clock-to-output delay. The WPMS FF structure has no pMOS transistor other than in the keeper circuits, so the clock delay and the parasitic capacitance on the data path can be considerably reduced. In spite of this advantage, this latch shows worse performance and power consumption than the DET FF. The TSPC FF has several benefits, such as the elimination of skew due to different clock phases and the clock signal being generated off-chip, which implies significant savings in chip area and power consumption. Only a single global clock signal needs to be generated and distributed, which simplifies the design. Its drawbacks are the need for two extra transistors in each stage and a clock delay problem, which can be eliminated as long as the delay is less than the gate delay. In addition, the P-block becomes a speed bottleneck, precharged nodes consume power and clock skew might be encountered. The $C^2CMOS$ flip-flop reduces the clock delay of the clock inverter and increases the speed of the design logic.

| Flip-flop type      | Latency (ps) | Rise delay at output node (ns) | Fall delay at output node (ns) | Total delay at output node (ns) | Compilation memory usage (%) | Speed |
|---------------------|--------------|--------------------------------|--------------------------------|---------------------------------|------------------------------|-------|
| SET                 | 168          | 0.013                          | 0.002                          | 0.015                           | 4.0                          | 1.2   |
| DET                 | 152          | 0.011                          | 0.005                          | 0.016                           | 7.0                          | 1.3   |
| SDFF                | 188          | 0.009                          | 0.005                          | 0.014                           | 5.8                          | 1.6   |
| HLFF                | 270          | 0.016                          | 0.006                          | 0.022                           | 6.6                          | 1.1   |
| P-FF                | 162          | 0.009                          | 0.004                          | 0.013                           | 6.2                          | 1.3   |
| SAFF                | 220          | 0.02                           | 0.01                           | 0.03                            | 11.4                         | 1.12  |
| WPMS                | 178          | 0.027                          | 0.017                          | 0.044                           | 10.1                         | 1.4   |
| TSPC                | 101          | 0.002                          | 0.001                          | 0.003                           | 3.0                          | 1.2   |
| C<sup>2</sup>CMOS   | 98           | 0.007                          | 0.001                          | 0.008                           | 4.8                          | 1.8   |

 Table 2 Comparison based on delay
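As a quick consistency check on the delay columns of Table 2, the Python sketch below verifies that each total delay equals the sum of the rise and fall delays. The relationship itself is an assumption about how the total was derived, and the names `TABLE_2` and `inconsistent_rows` are hypothetical; the numbers are copied from the table.

```python
import math

# (rise delay, fall delay, total delay) at the output node, in ns, from Table 2.
TABLE_2 = {
    "SET": (0.013, 0.002, 0.015),
    "DET": (0.011, 0.005, 0.016),
    "SDFF": (0.009, 0.005, 0.014),
    "HLFF": (0.016, 0.006, 0.022),
    "P-FF": (0.009, 0.004, 0.013),
    "SAFF": (0.02, 0.01, 0.03),
    "WPMS": (0.027, 0.017, 0.044),
    "TSPC": (0.002, 0.001, 0.003),
    "C2CMOS": (0.007, 0.001, 0.008),
}


def inconsistent_rows(table, tol=1e-9):
    """Return the flip-flop names whose total delay differs from rise + fall."""
    return [
        name
        for name, (rise, fall, total) in table.items()
        if not math.isclose(rise + fall, total, abs_tol=tol)
    ]


print(inconsistent_rows(TABLE_2))  # an empty list means every row is self-consistent
```

A tolerance is used rather than exact equality because the decimal values in the table are not exactly representable in binary floating point.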

The graph in Fig (5) shows the distribution of parasitic values for the different FF topologies. Fig (5a) plots capacitance against transistor count. It is observed that the SDFF requires the largest number of transistors, 25, yet its capacitance is only 48.69 fF, whereas the DET FF has 12 transistors but a capacitance of 109.32 fF, which is high and will produce high power dissipation. Fig (5b) shows speed versus latency. From the graph it is observed that the $C^2CMOS$ has low latency and high speed, whereas the HLFF has high latency and low speed.
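The observation about capacitance and transistor count can be made concrete by dividing each total capacitance in Table 1 by the corresponding transistor count. The following Python sketch does this; the per-transistor figure is a derived metric introduced here only for illustration, and the names `TABLE_1` and `cap_per_transistor` are hypothetical.

```python
# (total capacitance in fF, transistor count) copied from Table 1.
TABLE_1 = {
    "SET": (49.877, 16),
    "DET": (109.32, 12),
    "SDFF": (48.69, 25),
    "HLFF": (66.88, 20),
    "P-FF": (64.453, 19),
    "SAFF": (118.54, 24),
    "WPMS": (86.597, 22),
    "TSPC": (18.23, 6),
    "C2CMOS": (36.82, 8),
}


def cap_per_transistor(table):
    """Return the average parasitic capacitance per transistor (fF) for each topology."""
    return {name: cap / count for name, (cap, count) in table.items()}


# Rank topologies from highest to lowest capacitance per transistor.
ranked = sorted(cap_per_transistor(TABLE_1).items(), key=lambda kv: kv[1], reverse=True)
for name, value in ranked:
    print(f"{name:7s} {value:6.2f} fF per transistor")
```

With the values of Table 1, the DET FF has the highest capacitance per transistor and the SDFF the lowest, which is consistent with the contrast drawn between these two topologies.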



Fig 5. Distribution of parasitic values. a. capacitances distribution over transistor count b. speed versus latency

The graph in Fig (6) shows the distribution of maximum and average drain current for the different FF topologies. Fig (6a) plots $I_{DDMAX}$ and $I_{DDAVG}$. It is observed that the WPMS FF has the maximum drain current leakage, while the C<sup>2</sup>CMOS has an allowable leakage current. Fig (6b) shows the rise, fall and total delay at the output node only. From the graph it is observed that the rise and fall delays are high for the HLFF topology, whereas the $C^2CMOS$ has the least rise and fall delay.



Fig 6. Current graph. a. I<sub>DDMAX</sub> vs I<sub>DDAVG</sub> b. Distribution of rise, fall and total delay at the output node

All data for area, delay and power dissipation were obtained with the Microwind tool, with simulations performed in 0.12 $\mu$m technology and power calculated using the Predictive Technology Model (PTM) [15]. The granularity of transistor size is set to a minimum width of 1.02 $\mu$m and a minimum length of 0.12 $\mu$m for NMOS, and a minimum width of 1.98 $\mu$m and a minimum length of 0.12 $\mu$m for PMOS. Table 3 compares the different FFs in terms of delay, power, transistor count, latency and area occupied. It is observed that SAFF occupies the largest area and C<sup>2</sup>CMOS the smallest. In terms of power dissipation, DET dissipates the most power (510.54 µW), while TSPC and C<sup>2</sup>CMOS dissipate the least (230.78 µW and 256.21 µW respectively). Fig (8) shows the distribution of area versus delay. SAFF has a large delay, whereas C<sup>2</sup>CMOS has the smallest delay of the topologies compared.

| Flip-Flop<br>type   | Transistor<br>Count | Delay<br>(ns) | Power<br>(µW) | Area<br>(μm²) | Latency<br>(ps) |
|---------------------|---------------------|-------------|---------------|---------------|---------------|
| SET                 | 16                  | 30.67       | 402.21        | 78.26         | 168           |
| DET                 | 12                  | 42.89       | 510.54        | 70.98         | 152           |
| SDFF                | 25                  | 78.90       | 450.12        | 100.76        | 188           |
| HLFF                | 20                  | 67.76       | 425.90        | 98.75         | 270           |
| P-FF                | 19                  | 60.02       | 390.45        | 93.12         | 162           |
| SAFF                | 24                  | 75.34       | 360.67        | 102.54        | 220           |
| WPMS                | 22                  | 56.76       | 354.89        | 87.56         | 178           |
| TSPC                | 6                   | 28.9        | 230.78        | 60.34         | 101           |
| C <sup>2</sup> CMOS | 8                   | 27.61       | 256.21        | 59.47         | 98            |
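The observations drawn from Table 3 can be cross-checked by searching each column for its smallest and largest entry. The sketch below simply transcribes the Table 3 values; the dictionary layout, the column names and the helper `extremes` are hypothetical names introduced for illustration and are not part of the original tool flow.

```python
# Table 3 values, transcribed as:
# (transistor count, delay in ns, power in uW, area in um^2, latency in ps)
TABLE3 = {
    "SET":    (16, 30.67, 402.21, 78.26, 168),
    "DET":    (12, 42.89, 510.54, 70.98, 152),
    "SDFF":   (25, 78.90, 450.12, 100.76, 188),
    "HLFF":   (20, 67.76, 425.90, 98.75, 270),
    "P-FF":   (19, 60.02, 390.45, 93.12, 162),
    "SAFF":   (24, 75.34, 360.67, 102.54, 220),
    "WPMS":   (22, 56.76, 354.89, 87.56, 178),
    "TSPC":   (6, 28.9, 230.78, 60.34, 101),
    "C2CMOS": (8, 27.61, 256.21, 59.47, 98),
}

# Hypothetical column labels, in the same order as the tuples above.
COLUMNS = ("transistors", "delay_ns", "power_uW", "area_um2", "latency_ps")


def extremes(column):
    """Return (flip-flop with lowest value, flip-flop with highest value)."""
    idx = COLUMNS.index(column)
    lowest = min(TABLE3, key=lambda ff: TABLE3[ff][idx])
    highest = max(TABLE3, key=lambda ff: TABLE3[ff][idx])
    return lowest, highest


summary = {col: extremes(col) for col in COLUMNS}
for col, (low, high) in summary.items():
    print(f"{col:12s} lowest: {low:7s} highest: {high}")
```

With the values as transcribed, C<sup>2</sup>CMOS has the lowest area, delay and latency, SAFF the highest area, and DET the highest power; the lowest power entry is TSPC and the highest delay entry is SDFF.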



Fig 8 Area Vs delay graph

## R.Uma / International Journal of Engineering Research and Applications (IJERA) ISSN: 2248-9622 <u>www.ijera.com</u> Vol. 1, Issue 4, pp.1971-1982

# VI. CONCLUSION

In this paper, an exhaustive analysis and design methodology for commonly used high-speed flip-flop topologies in 0.12 $\mu$m CMOS technology has been carried out. The comparison has been performed in terms of area, delay and power dissipation. The impact of layout parasitics has been included in the transistor-level design phase. The flip-flops were chosen for a thorough comparative analysis, whose results are reported in Sections V and VI. According to the presented results, the fastest topology is C<sup>2</sup>CMOS, whose delay, area and power dissipation are all low, and the best low-power flip-flops are SAFF, P-FF and TSPC. Moreover, the best topologies under clock skew and latency are SDFF and HLFF. As a common observation, simpler basic structures are rewarded in submicron technologies because of the strong impact of layout parasitics. In particular, pulsed and transmission-gate FF topologies have been recognized as the most efficient FF topologies and are widely used in high-speed microprocessors and battery-powered portable devices.

#### **References:**

[1] Yu Chien-Cheng, "Low-Power Double Edge-Triggered Flip-Flop Circuit Design", The 3<sup>rd</sup> International Conference on Innovative Computing Information and Control (ICICIC'08), IEEE 2008

[2] R. Hossain, L. D. Wronski, and A. Albicki, "Low power design using double edge triggered flip-flops," IEEE Trans. on VLSI Systems, vol. 2, no. 2, pp. 261-265, June 1994.

[3] M. Pedram, Q. Wu, and X. Wu, "A new design of double edge-triggered flip-flops," in Proc. Asia and South Pacific Design Automation Conference (ASP-DAC), 1998, pp. 417-421.

[4] Y. Moissiadis, I. Bouras, A. Arapoyanni, and L. Dermentzoglou, "A static differential double edge-triggered flip-flop based on clock racing," Microelectronics Journal, vol. 32, pp. 665-671, 2001.

[5] Fabian Klass, "Semi-Dynamic Flip-Flops with Embedded Logic" IEEE, 1998.

[6] Yin-Tsung Hwang, Jin-Fa Lin, and Ming-Hwa Sheu, "Low-Power Pulsed-Triggered Flip-Flop Design with Conditional Pulsed-Enhancement Scheme," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, IEEE 2011.

[7] M. Matsui, H. Hara, Y. Uetani, L. Kim, T. Nagamatsu, Y. Watanabe, A. Chiba, K. Matsuda, and T. Sakurai, "2D DCT macrocell using sense-amplifying pipeline flip-flop scheme," IEEE Journal of Solid-State Circuits, vol. 29, no. 5, pp. 1482-1490, Dec. 1994.

[8] B. Nikolic, V. G. Oklobdzija, V. Stojanovic, W. Jia, J. K. Chiu, and M. M. Leung, "Improved sense-amplifier based flip-flop," IEEE Journal of Solid-State Circuits, vol. 35, no. 6, pp. 876-883, Jun. 2000.

[9] Antonio G. M. Strollo, Davide De Caro, Ettore Napoli, and Nicola Petra, "A Novel High-Speed Sense-Amplifier-Based Flip-Flop," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 13, no. 11, Nov. 2005.

[10] D. Markovic and J. Tschanz, "Transmission-gate based flip-flop," U.S. Patent 6,642,765, November 2003.

[11] Jinn-Shyan Wang and Po-Hui Yang, "A Pulse-Triggered TSPC Flip-Flop for High-Speed Low-Power VLSI Design Applications," IEEE, 1998.

[12] J. Suzuki, K. Odagawa, and T. Abe, "Clocked CMOS calculator circuitry," IEEE Journal of Solid-State Circuits, vol. SC-8, Dec. 1973, pp. 462-469.

[13] N. Nedovic, V. G. Oklobdzija, W. W. Walker, "A Clock Skew Absorbing Flip-Flop", 2003 IEEE ISSCC, San Francisco, Feb. 2003.

[14] Jinn-Shyan Wang, "A new true-single-phase-clocked double-edge-triggered flip-flop for low-power VLSI design," in Proceedings of IEEE ISCAS 1997, pp. 1896-1899.

[15] W. Zhao and Y. Cao, "New generation of predictive technology model for sub-45nm design exploration," in IEEE Intl. Symp. on Quality Electronic Design, 2006.

[16] Hamid Partovi, Robert Burd, Udin Salim, Frederick Weber, Luigi DiGregorio, and Donald Draper, "Flow-through Latch and Edge-Triggered Flip-flop Hybrid Elements," IEEE International Solid-State Circuits Conference, 1996.