# LOW POWER AND AREA EFFICIENT ASYMMETRICAL DIFFERENTIAL AMPLIFIER BASED CONTENT ADDRESSABLE MEMORY

# Kuludip Kumar Gupta<sup>1</sup>, Dr. R.K Sarin<sup>2</sup>, Er.Nitin Kr.Tiwari<sup>3</sup>

Department of Electronics & Communication<sup>1,2,3</sup> Dr. B. R. Ambedkar National Institute of Technology Jalandhar

Abstract--This paper presents XOR- and XNOR-type CAM cells implemented with a loadless four-transistor (4T) SRAM cell as the store unit. The major portion of CAM power is consumed in matchline and searchline switching during CAM operation. The proposed XOR CAM with a 4T store unit is compared with the conventional XOR CAM [1] in a 0.18-µm, 1.8 V CMOS logic process, using an asymmetrical differential matchline sense amplifier [5]. Power dissipation and area have been measured through simulation and layout design, respectively. Results show that the area and the power-delay product are reduced compared with those of the conventional XOR CAM.

Keywords-CAM, Sense Amplifier, SRAM

#### I. Introduction

Content addressable memory (CAM) consists of two units: store and compare. The store unit is generally a traditional 6T SRAM cell, which contains cross-coupled inverters that store a single bit (0 or 1) and two access transistors. The compare unit is pass-transistor logic (PTL) that compares the stored data with the search data. The compare unit can be arranged as an XOR- or XNOR-type CAM, as shown in Fig. 2(a) and 2(b). Whenever search data arrives on the searchline, the compare unit compares it with the stored bit, and the matchline goes high or low accordingly. CAM is commonly used in applications such as pattern recognition, data compression, and address translation in network routers and processor caches. Today CAM is widely used in routers, where each incoming packet is compared with the stored data and sent to its final destination address. Initially the routing table was implemented as a software algorithm that forwarded the packet to its destination address; nowadays it is implemented in hardware as a CAM array, which makes the search fast because of its parallel nature.

The basic architecture of CAM [2] is shown in Fig. 1. It consists of three words of three bits each; the matchlines of all cells in a word are shorted together and fed to a matchline sense amplifier. Initially the matchline is high, before the search data is placed on the searchline register.

Whenever the search data matches a word, the matchline remains high; on a mismatch it discharges to ground. The major portion of power is consumed during searching, where the searchlines switch between the precharge and evaluation phases. During this process the matchline also consumes power, since it is charged and discharged in these phases. To reduce power in the search operation, the NAND-type CAM [4] is used, in which the CAM cell is XNOR type. For best search performance, the NOR-type CAM [4] is used, in which the cell is XOR type. The current challenge is to reduce the area and power consumption of CAM. One efficient way to reduce the match delay is to use a matchline sense amplifier. The traditional SRAM cell used to store a bit has six transistors. This paper adopts the loadless 4T SRAM cell [3], which is smaller than the standard 6T SRAM cell. In addition, we use an asymmetrical differential matchline sense amplifier [5], and finally compare the conventional XOR CAM with the modified XOR CAM.





The basic CAM cell is based on the static memory cell (see Fig. 2). Data is stored in two cross-coupled inverters. The two NMOS transistors controlled by the wordline allow the CAM to be written. Three additional transistors are used for matching. The bit-storage portion is a standard 6T static RAM (SRAM) cell; hence, this cell performs READ and WRITE operations similar to an SRAM cell. The third operation is the search operation.

**I. Write operation**-The WRITE operation is performed by placing the data on the bit lines (BLs) and enabling the wordline (WL). This turns on the access transistors (N6-N7), and the internal nodes of the cross-coupled inverters are written by the BL data. Fig. 3 shows the WRITE operation when '0' is being written to a cell which originally stored '1'. Initially Vx = '1' and Vy = '0', P1 and N9 are 'ON', and P2 and N8 are 'OFF'. When WL is enabled (WL = '1'), the access transistors (N6-N7) conduct, resulting in BL currents I0 and I1 (shown by thick arrows in Fig. 3). These transient currents form voltage dividers (P1-N6 and N7-N9). If these transient currents can pull one of the nodes (Vx and Vy) to the inverter threshold voltage, the other node will flip due to the feedback action of the cross-coupled inverters. If the inverter threshold voltage is Vdd/2, N7 needs to be much larger than N9 to pull Vy above this value, because it is difficult to pass logic '1' using an NMOS transistor. On the other hand, Vx can be pulled below this value by choosing P1 and N6 of the same size (shown by the encircled W in Fig. 3). Thus, the pull-up transistors (P1-P2) are sized 1.8 times wider than the access transistors N6-N9.



Fig. 3. Write Operation in XOR CAM Cell

**II Read operation**-The READ operation is performed by precharging the BLs to VDD and enabling the WL. Fig. 4 shows the READ operation when '0' is stored (i.e. Vx = '0', Vy = '1'). Since the BL drivers are turned off during the READ operation, the current Iread discharges BL (through N6 and N8). BLC remains at VDD because Vy = '1'. Therefore, a small differential voltage develops between BL and BLC, which is amplified to a rail-to-rail voltage by a BL sense amplifier (BLSA). Since the BLs are shared among all the cells in a column, they are highly capacitive. The small voltage swing on the BLs reduces the power consumption and the access time of the READ operation. As shown in Fig. 4, the current Iread raises the voltage Vx. Thus, the driver transistors (N8-N9) are sized such that Vx remains below the inverter threshold voltage, and hence the cell does not flip during the READ operation. Typically, the driver transistors (N8-N9) are sized 1.5 times wider than the access transistors (N6-N7).

Fig. 4. READ Operation of XOR CAM Cell

**III. Search operation**-The conventional search operation is performed in three steps: 1) the search lines or bit lines, i.e. SL (BL) and SLC (BLC), are reset to GND; 2) the matchline is precharged to VDD; and 3) finally, the search key bit and its complement are placed on BL and BLC respectively, with WL disabled. If the search key bit is identical to the stored value, the ML-to-GND pull-down paths remain 'OFF' and the ML remains at VDD, indicating a "match". Otherwise, if the search key bit is different from the stored value, one of the pull-down paths conducts and discharges the ML to GND, indicating a "mismatch". Resetting BL and BLC to GND before the ML precharge phase ensures that both pull-down paths are 'OFF', and hence do not conflict with the ML precharging. Fig. 5 shows the search operation when '0' is stored in the cell (Vx = '0' and Vy = '1'). For BL = '1' (BLC = '0'), ML is discharged to '0', detecting a "mismatch" as shown in Fig. 5(a). Similarly, for BL = '0', ML remains at '1', detecting a "match" as shown in Fig. 5(b).

Fig. 5. SEARCH operation in a XOR CAM cell for (a) "mismatch" (b) "match"
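The three search steps can be illustrated with a purely behavioural sketch. It models logic levels only, not the transistor-level circuit. The function names and the 3 x 3 example contents are hypothetical, chosen to mirror the three-word, three-bit architecture described in the Introduction, and the matchline of each word is assumed to be shared by all its XOR cells (NOR-type arrangement).

```python
def xor_cell_pulls_down(stored_bit: int, key_bit: int) -> bool:
    """Return True when the cell's ML-to-GND path conducts (bits differ)."""
    return (stored_bit ^ key_bit) == 1


def search_word(stored_word, key_word) -> int:
    """Return the matchline level of one word: 1 = match, 0 = mismatch."""
    if len(stored_word) != len(key_word):
        raise ValueError("search key and stored word must have equal width")
    # Steps 1 and 2: searchlines reset to GND, matchline precharged to VDD.
    matchline = 1
    # Step 3: the key is driven on BL/BLC; any conducting cell discharges ML.
    for stored_bit, key_bit in zip(stored_word, key_word):
        if xor_cell_pulls_down(stored_bit, key_bit):
            matchline = 0
    return matchline


def search_cam(array, key_word):
    """Evaluate every word (in hardware this happens in parallel)."""
    return [search_word(word, key_word) for word in array]


# Hypothetical array contents: three words of three bits each.
cam_array = [[1, 0, 1], [0, 1, 1], [1, 1, 0]]
matchlines = search_cam(cam_array, [0, 1, 1])
print(matchlines)  # [0, 1, 0]: only the second word keeps its matchline high
```

A single mismatching bit is enough to discharge the shared matchline, which is why the mismatch case dominates matchline power in this arrangement.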

**III Proposed CAM and Sense-Amplifier**– The store unit of the XOR and XNOR CAM in Fig. 6 and Fig. 7 is a loadless 4T SRAM cell [3]. This 4T SRAM differs from the 6T SRAM in the read operation: here, the bit lines are precharged to ground instead of VDD. The cell ratio, i.e. the ratio of the transconductance of the storage transistors to that of the access transistors, is taken as 3.

The sense amplifier shown in Fig. 8 is an asymmetrical differential amplifier with two differential inputs [5]. It operates in three phases: standby, excitation, and evaluation. During standby, ML and SML are discharged to ground; meanwhile the EN signal is asserted high. During excitation, ML is charged to a certain level by the pulse generator, as shown in Fig. 9 and Fig. 10. During evaluation, the search data on the SLs is compared with the stored data. In the match case MLSO is discharged to ground, and in the mismatch case MLSO



Fig. 6. XOR CAM



Fig. 7. XNOR CAM



Fig. 9. Waveform of XOR CAM [6T] in match case

**IV Power Consumption and Delay-**The matchline and searchlines are highly capacitive, so the major portion of the power consumed in a CAM is due to searchline switching: the searchlines are discharged to ground in the precharge phase, and during evaluation one line of each searchline pair goes high. During the search operation the matchline also consumes power, since it is charged high in the precharge phase and discharged to ground in the mismatch case.

In this paper we calculate the average power consumed during the search operation, from standby (when RST is high) to the time when the matchline sense amplifier output goes to zero. The matchline delay is measured from the rising edge of the pulse generator to the falling edge of the matchline sense amplifier output (MLSO) in the evaluation phase for the match case. The average power consumed by the XOR CAM is $P_{av} = \frac{1}{T}\int_{0}^{T} V \cdot I \, dt$.
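As an illustration of how the average-power integral can be evaluated from sampled simulator waveforms, the following is a minimal sketch using the trapezoidal rule. The function name and the three sample points are hypothetical and are not taken from the simulations reported here; a real waveform would contain many more points.

```python
def average_power(times, voltages, currents):
    """Trapezoidal estimate of (1/T) * integral of V*I dt over the samples.

    times in seconds, voltages in volts, currents in amperes; returns watts.
    """
    if not (len(times) == len(voltages) == len(currents)):
        raise ValueError("all waveforms must have the same number of samples")
    if len(times) < 2:
        raise ValueError("at least two samples are required")
    energy = 0.0  # accumulated energy in joules
    for k in range(1, len(times)):
        p_prev = voltages[k - 1] * currents[k - 1]
        p_curr = voltages[k] * currents[k]
        energy += 0.5 * (p_prev + p_curr) * (times[k] - times[k - 1])
    return energy / (times[-1] - times[0])


# Hypothetical samples: 1.8 V supply with a triangular 10 uA current pulse.
t = [0.0, 1e-9, 2e-9]
v = [1.8, 1.8, 1.8]
i = [0.0, 10e-6, 0.0]
p_av = average_power(t, v, i)  # about 9e-6 W for these samples
```

Here T is the span of the sampled window, which in the measurement described above would run from standby to the instant the sense amplifier output reaches zero.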



Fig. 8. Asymmetrical differential sense amplifier





Fig. 10. Waveform of XOR CAM [4T] in match case

| CAM          | Power (µW) | Delay (ns) | Power-delay product (J)    | Area (µm × µm) |
|--------------|------------|------------|----------------------------|----------------|
| XOR CAM (6T) | 10.582     | 1.67       | 1.76721 × 10<sup>-14</sup> | 9.76 × 9.450   |
| XOR CAM (4T) | 7.490      | 2.09       | 1.56541 × 10<sup>-14</sup> | 9.18 × 7.280   |
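The figures in the table can be cross-checked with a few lines of arithmetic. The sketch below uses only the power, delay, and layout dimensions listed above; the dictionary keys and helper names are illustrative. Multiplying a power in µW by a delay in ns gives an energy in fJ, and 1 fJ equals 10<sup>-15</sup> J.

```python
# Values copied from the comparison table.
cells = {
    "XOR CAM (6T)": {"power_uW": 10.582, "delay_ns": 1.67,
                     "width_um": 9.76, "height_um": 9.450},
    "XOR CAM (4T)": {"power_uW": 7.490, "delay_ns": 2.09,
                     "width_um": 9.18, "height_um": 7.280},
}


def power_delay_product_fJ(cell):
    """uW * ns = 1e-6 W * 1e-9 s = 1e-15 J, i.e. femtojoules."""
    return cell["power_uW"] * cell["delay_ns"]


def area_um2(cell):
    """Layout area in square micrometres."""
    return cell["width_um"] * cell["height_um"]


def reduction_percent(baseline, proposed):
    """Relative saving of the proposed value against the baseline."""
    return 100.0 * (baseline - proposed) / baseline


ref, new = cells["XOR CAM (6T)"], cells["XOR CAM (4T)"]
pdp_saving = reduction_percent(power_delay_product_fJ(ref),
                               power_delay_product_fJ(new))
area_saving = reduction_percent(area_um2(ref), area_um2(new))
print(f"PDP saving:  {pdp_saving:.1f} %")
print(f"Area saving: {area_saving:.1f} %")
```

With these inputs the power-delay product falls from about 17.67 fJ to about 15.65 fJ (roughly 11.4 %) and the area from about 92.2 µm² to about 66.8 µm² (roughly 27.5 %), even though the 4T cell has the longer delay.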





Fig. 12. Layout of XOR CAM [4T SRAM]

#### **V** Conclusion

A new XOR CAM [4T] with the asymmetrical differential matchline sense amplifier is presented in this paper. Both the conventional XOR CAM and the proposed XOR CAM are simulated in a 0.18 $\mu$m CMOS technology at a temperature of 27°C. The proposed cell layout (Fig. 11 and Fig. 12) is 27% smaller than that of the XOR CAM (6T). Furthermore, the energy (power-delay product) saved by this circuit compared with the conventional circuit is shown to be more than 11%. Therefore the proposed design can be used in NAND- and NOR-type arrays for high-capacity parallel CAM in CMOS technology.

# References

- [1] H. Miyatake, M. Tanaka, and Y. Mori, "A design for high-speed low-power CMOS fully parallel content-addressable memory macros," *IEEE J. Solid-State Circuits*, vol. 36, pp. 956–968, June 2001.
- [2] K. Pagiamtzis and A. Sheikholeslami, "Content-addressable memory (CAM) circuits and architectures: A tutorial and survey," *IEEE J. Solid-State Circuits*, vol. 41, no. 3, pp. 712–727, Mar. 2006.
- [3] Jinshen Yang and Li Chen, "A New Loadless 4-Transistor SRAM Cell with a 0.18 µm CMOS Technology," IEEE, 2007.
- [4] Yen-Jen Chang and Yuan-Hong Liao, "Hybrid-type CAM design for both power and performance efficiency," VLSI Systems, no. 8, Aug. 2008.
- [5] Xiao-Liang Tan, Anh-Tuan Do, Shou-Shun Chen, Kiat-Seng Yeo, and Zhi-Hui Kong, "A New Match Line Sensing Technique in Content Addressable Memory," IEEE COOL Chips XIV, 2011.
- [6] www.egr.msu.edu/classes/ece410/mason/files/guidepower.pdf
- [7] http://www.ece.ncsu.edu/asic/ece733/hw/energy.pdf
- [8] http://www.ece.cmu.edu/~ee52 power.html