# **RESEARCH ARTICLE**

**OPEN ACCESS** 

# Networks on Chips: Scalable Interconnects for Future Systems on Chips

Nalini Kumar Sethy, Sambid Kumar Das

Gandhi Institute of Excellent Technocrats, Bhubaneswar, India Shibani Institute of Technical Education, Bhubaneswar, Odisha, India

# ABSTRACT

According to the International Technology Roadmap for Semiconductors (ITRS), before the end of this decade we will be entering the era of a billion transistors on a single chip. It is being stated that soon we will have a chip of 50-100 nm comprising around 4 billion transistors operating at a frequency of 10 GHz. Such a development means that in the near future we will probably have devices with complex functions, ranging from mere mobile phones to mobile devices controlling satellite functions. But developing such chips is not an easy task: as the number of transistors on-chip increases, so does the complexity of integrating them. Today's SoCs use shared or dedicated buses to interconnect the communicating on-chip resources. However, these buses are not scalable beyond a certain limit. In this case, the current interconnect infrastructure will become a bottleneck for the development of billion-transistor chips. Hence, in this tutorial, we will try to highlight a new design paradigm that has been proposed to counter the inefficiency of buses in future SoCs. This new design paradigm has been given a variety of names, but the most common and agreed upon one is Networks on Chips (NoCs). We will show how this paradigm shift from ordinary buses to networks on chips can make the kind of SoCs mentioned above very much possible.

Keywords: SoC, Network on Chips, Design Challenges

## I. INTRODUCTION

Chip integration has reached a stage where a complete system can be placed in a single chip. When we say complete system, we mean all the required ingredients that make up a specialized kind of application on a single silicon substrate. This integration has been made possible because of the rapid developments in the field of VLSI designs; this is primarily used in embedded systems.

Thus, in simple terms an SoC can be defined as "an IC, designed by stitching together multiple stand-alone VLSI designs to provide full functionality for an application [1]." While designing an SoC, a vendor may use a library of cores designed by external designers in addition to using cores from in-house libraries. Cores are basically pre-designed models of complex functions termed Intellectual Property Blocks (IP Blocks), Virtual Components (VC) or simply macros. Since the design of an SoC comprises cores from different sources/vendors, we can say that an SoC is completely heterogeneous, and that is one of the key issues which complicates its design process.



Figure 1. SoC [1]

A generalized form of today's SoC architectures is depicted in Figure 1. This figure shows the common components used in current SoCs: SRAMs, DRAMs, Flash memory, ROM, DSPs, 2D/3D graphics, and interface cores such as PCI, USB and UART. It should be noted that all these components may belong to different libraries of cores and may belong to different vendors. Also, their organization on the chip depends upon the application they are designed for (Application Specific Integrated Circuit, ASIC). A few examples of today's core-based SoCs include GSM mobile phones, single-chip digital/video cams, GPS controllers, smart pager ASICs etc.

However, the present SoC architecture doesn't suffice for future needs, particularly in terms of interconnect design, due to its poor scalability and inefficiency in handling a large number of partners (we will elaborate on this in section 2). Hence, from here we move on toward our actual topic of discussion, that is, Networks on Chips, more commonly called NoCs.

A NoC is perceived as a collection of computational, storage and I/O resources on-chip that are connected with each other via a network of routers or switches instead of being connected with point-to-point wires. These resources communicate with each other using data packets that are routed through the network in the same manner as is done in traditional networks [2]. It is clear from the definition that we need to employ highly sophisticated and researched methodologies from traditional computer networks and implement them on chip. But why? In order to elaborate on this question, we have to explore the motivating factors that are compelling researchers and designers to move toward the adoption of NoC architectures for future SoCs.

Figure 2. Moore's Law

The area of NoC is still in its infancy, which is one of the reasons why there are various names for the same thing; some call it on-chip networks, some networks on silicon, but the majority agrees upon "Networks on Chips" (NoCs). However, we will be using these terms interchangeably throughout our tutorial.

#### Motivations

As projected by ITRS [3], around four billion transistors will be accommodated on a single chip by the end of this decade. Although it sounds incredible, a number of factors are hindering the achievement of a billion-transistor chip in the future. In the following we discuss some of the issues that need to be overcome before we can have a real chip with billions of transistors.

#### Poor scalability of standard buses

The primary interconnection mechanism behind today's SoCs is the shared bus, which helps to time-share wires among the communicating partners and leads to a reduction of I/O pins in cores, hence leading to a simplified wiring scheme. Previously, direct pin connections were used to connect various cores on a chip; this led to a large number of pins for each core. Moreover, as the number of cores on-chip increased, so did the pins, thus leading to large routing time and area, and unpredictable delays in signals and signal quality. To simplify the structure, buses were introduced, which proved to be a better solution than their predecessors in terms of reduced signal delays, improved signal quality, and controlled routing time. However, it has been observed that buses cannot be shared beyond 5-10 partners, hence making scalability of the communication paradigm in SoCs a major concern [4].
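As a rough illustration of why direct pin connections stop scaling, the following sketch counts the wires and ports needed when every pair of cores is wired directly. Full pairwise connectivity is an assumption made here for illustration only; the text states merely that pin counts grew with the number of cores. The function names are hypothetical.

```python
def point_to_point_links(n_cores: int) -> int:
    """Dedicated links needed if every pair of cores is wired directly."""
    return n_cores * (n_cores - 1) // 2


def ports_per_core(n_cores: int) -> int:
    """Each core needs one port for every other core in such a design."""
    return n_cores - 1
```

Under this assumption, 16 cores already need 120 dedicated links and 15 ports per core, which is the kind of growth that motivated shared buses in the first place.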

### Design productivity gap

It was in 1965 that Gordon Moore, co-founder of Intel, published his famous paper in which he predicted that the capacity of integrated circuits would double every 18-20 months (also called Moore's Law) [5]. It has been observed over the past years that current technology is not keeping pace with Moore's prediction, resulting in a "design productivity gap" which is increasing at a pace of approximately 20% every year. This effect is shown in Figure 2. This design productivity gap is not only because of more gates, functionality and testability of the chips, which were the only issues in the beginning, but also because of many other factors like wire delay, power management, embedded software, more design choices, and signal integrity, which are making the entire design process more time consuming and complex. In order to cope with the productivity gap, we need exponentially growing design teams and/or design time to design and implement systems which fit into a single IC; this is very unrealistic and rather impractical.

### Difficult to maintain global synchrony

One of the major problems of growing chips is the global clock. It is becoming increasingly difficult to synchronize the clock signal traveling across the entire chip. This, in turn, is not only increasing the clock skew problem but also affecting the power consumption, which is reaching unacceptable limits. One remedy for the problem is to adopt the "Globally Asynchronous and Locally Synchronous (GALS)" paradigm [6]. However, in such a case, there remains no coordination among the on-chip communicating partners, hence making the chip a collection of distributed systems.

#### Heterogeneity

A significant characteristic of SoCs is heterogeneity: components from different vendors lie on the same chip. Following the prediction of ITRS that the silicon substrate is becoming capable of absorbing more and more components, chips of the future are going to be more complex than they are today. Components having different functions and completely novel features will be integrated on the same silicon die, even though they are designed by different design teams on a variety of platforms. Finally, all these heterogeneous components of totally distinct characteristics (importantly, even analog devices can be included in addition to digital ones) have to be placed on a single chip, which makes the task quite complex.

# Networks on Chips (NoCs)

After realizing the inefficiency of traditional buses in SoCs, in conjunction with so many other factors, the designers of SoCs have come to a cross-road where they meet the computer architecture designers who are always interested in finding dynamic and scalable architectures for building microprocessors. The scalability and wide success of the Internet has attracted the attention of computer architecture as well as SoC designers and influenced them to borrow the idea of using packet-based switching networks for the design of future SoC communication infrastructure.

It is an understood fact that the actual reason behind the success of the Internet and its scalability lies in a well-defined protocol stack; the idea was to decouple communication from computation. Packet switched communication not only provides high scalability, but also facilitates reuse of the communication architecture. The two major problems faced by SoC designers, re-usability and scalability, can therefore well be addressed by the adoption of a packet switched communication infrastructure for SoC interconnects. Also, from a business point of view, it is important to reduce the design time by adopting reuse not only at the computational level but also at the level of the communication structure. This will in turn lower the time to market for new products. Keeping in view this idea of the Internet, many researchers have proposed communication architectures based upon packet switched on-chip networks for connecting components in future SoCs [7], [8], [9], [10].

Another important aspect of NoCs is that they decouple computation from communication, which is essential for chips that contain billions of transistors. Again, the idea comes from traditional networks such as the Internet, where the communication system is based upon a protocol stack irrespective of the number of communicating nodes. Likewise, the communication infrastructure in NoCs will be designed using a protocol stack which provides well-defined interfaces separating communication service usage from service implementation. This means that instead of connecting high level modules (like processors, DSPs, controllers etc.) by routing dedicated wires, they are connected to a network that routes packets between them, as captioned in [7]: "Route packets, not wires".



Figure 3. 2D mesh based NoC

#### NoC model

Now, since we already know that NoCs are the most appropriate design choice to develop future SoCs, the next step is to discuss the design of the NoC itself. Since the area of NoCs is really new, it provides us with an opportunity to create things on a clean slate in order to obtain an optimal design. The immense amount of research that has already been conducted on the Internet has been used considerably by researchers in defining the structure of NoCs. In the following passages we will discuss proposed topologies, protocols, switching and routing mechanisms for NoCs.


### Typical NoC topology

There are quite a few topologies proposed for NoCs, including fat tree, honeycomb, 2D mesh etc.; we will discuss the most common and agreed upon topology, the 2D mesh, in our tutorial because of its simplicity. Consider Figure 3, which shows a simple mesh topology where circles represent switches while squares are resources. A resource is a computational unit; it can be a processor, memory, DSP core etc., whereas switches route and buffer messages between resources. It can be seen from the figure that almost every switch is directly connected to four neighboring switches (except for the ones at the edges). The communication channel consists of two one-directional point-to-point buses between two neighboring switches or a switch and a resource. It is expected that, as the technology grows with time, the number and size of resources will also grow, resulting in growth of bandwidth of switch-to-switch or switch-to-resource links, but network wide communication will remain unaffected.
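The 2D mesh described above can be sketched as a small data structure. This is a minimal illustration, assuming switches are addressed by hypothetical (row, column) coordinates; the function names are not taken from any NoC tool or from the cited work.

```python
from typing import Dict, List, Tuple

Switch = Tuple[int, int]  # (row, column) position of a switch in the mesh


def mesh_neighbors(rows: int, cols: int) -> Dict[Switch, List[Switch]]:
    """Map each switch to the switches it is directly linked to."""
    neighbors: Dict[Switch, List[Switch]] = {}
    for r in range(rows):
        for c in range(cols):
            candidates = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            # Edge and corner switches simply have fewer neighbors.
            neighbors[(r, c)] = [
                (nr, nc) for nr, nc in candidates
                if 0 <= nr < rows and 0 <= nc < cols
            ]
    return neighbors


def unidirectional_switch_links(rows: int, cols: int) -> int:
    """Count one-directional switch-to-switch channels (two per neighbor pair)."""
    return sum(len(adjacent) for adjacent in mesh_neighbors(rows, cols).values())
```

In a 4x4 mesh built this way, an interior switch has four neighbors, an edge switch three and a corner switch two, matching the observation that only switches away from the edges connect to four neighbors.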



Figure 4. Protocol stack for NoCs [11]

# 2 Proposed protocol stack for NoCs

In order to achieve a scalable communication paradigm for NoCs, a protocol stack comparable to the OSI model has been proposed in [11]. This model is shown in Figure 4. It can be observed from the figure that the proposed protocol stack is mainly composed of three layers: Physical, Architecture and Control, and Software. The Physical layer deals with signal voltages, slopes and wire sizing in terms of SoCs. Wires are the physical realization of communication in the SoCs. Then comes the Architecture and Control layer, which is the most important layer in the SoC stack as it encompasses the Data Link, Network and Transport layers. In this part, the architecture defines the physical layout of the network resources, whereas the control protocols define the ways in which these network architectures can be used during system operations. Most of the research is happening at this level in SoCs. Finally, we have the Software layer, which takes care of the system and application software.

www.ijera.com

### Switching techniques / Flow control

There are different techniques that are used to switch packets between nodes in a network. The most popular ones include store-and-forward, virtual cut-through and wormhole switching. In what follows, we will discuss these techniques in brief and see which one is more appropriate for NoCs based on a mesh.

• Store-and-Forward switching: This is the most popular packet switching technique in computer networks. Here, parts of the packet are stored at the receiving router until the entire packet is received, after which it is forwarded to the next router in the path. In this case, enough buffer space is required at each router to accommodate the entire packet. Also, for large packets, this method introduces extra packet delays in the router. Since buffer resources on-chip are quite expensive, and since this technique also consumes more power, which is undesirable for NoCs, store-and-forward is infeasible for on-chip networks.

• Virtual Cut-through Switching: This technique was proposed to reduce the packet delay caused by store-and-forward switching. In this case, the packet is not stored in its entirety in the router, but can begin to be forwarded to the next hop as soon as its header is received by the current router. However, if the next hop router is not available, then the current router has to store the packet in complete form.

• Wormhole switching: This switching mechanism was basically developed for parallel processors. The advantage of wormhole flow control is that it achieves minimum network delay and needs less buffer space. In this technique, the packets are further split into small units called flits, which are immediately forwarded upon arrival. The flits of a packet do not need to be stored in a single router, hence reducing the need for large buffer space. It is for this reason termed the best candidate for on-chip interconnection networks.

Having inspected the three popular switching techniques, we can now easily say that due to the memory and buffer constraints on chip, wormhole switching seems to be the best option for NoCs. Here, it is important to clarify the difference between packet switching (forwarding) and routing (which is discussed in section 5.2); these two terms are sometimes intermixed, thus creating a confused picture. In traditional computer networks, packet switching/forwarding is mainly concerned with moving a packet from an input port of a router to the output. Routing deals with determining the entire path a packet may take from source to destination. Confusingly, the term "routing function" sometimes denotes the packet forwarding method in the context of NoCs.
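The trade-off between the three techniques can be made concrete with a first-order model. The sketch below uses the usual contention-free textbook approximations, which are an assumption of this example rather than something stated in the tutorial: one flit crosses one link per cycle, and router processing time is ignored. All names and parameters are hypothetical.

```python
def store_and_forward_cycles(hops: int, packet_flits: int) -> int:
    """The whole packet is received at every hop before it moves on."""
    return hops * packet_flits


def wormhole_cycles(hops: int, packet_flits: int) -> int:
    """The header advances one hop per cycle and the body pipelines behind it.

    Without contention, virtual cut-through has the same latency.
    """
    return hops + packet_flits - 1


def min_buffer_flits(technique: str, packet_flits: int) -> int:
    """Per-router buffering needed in the worst case, in flits."""
    if technique in ("store_and_forward", "virtual_cut_through"):
        return packet_flits  # must be able to hold a complete packet
    if technique == "wormhole":
        return 1  # a single flit buffer is enough to make progress
    raise ValueError(f"unknown technique: {technique}")
```

For a 16-flit packet crossing 4 links, this model gives 64 cycles for store-and-forward against 19 for wormhole switching, while the wormhole router needs a buffer of one flit instead of sixteen.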

## Routing in NoCs

Routing is the process of moving information from a source to a designated target. This term is very common in the Internet. Routing can be static or dynamic. Static routing is managed by an administrator manually, and is suitable for networks where network traffic is predictable and relatively simple, which is a rare case in the Internet. Dynamic routing, as the name suggests, is used to dynamically discover routes in case of path changes. Due to the regular structure and on-chip memory constraints, static routing is more feasible for NoCs. However, in case of path failures, adaptive routing can be introduced, but special care must be taken to avoid excessive use of buffers or logic on-chip.
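One common way to realize static routing on a regular mesh is dimension-order (XY) routing, where a packet first travels along the X dimension and then along Y. The tutorial does not name a specific static algorithm, so XY routing is used here purely as an illustrative assumption, with hypothetical (x, y) switch coordinates.

```python
from typing import List, Tuple

Switch = Tuple[int, int]  # (x, y) coordinates of a switch in the mesh


def xy_route(src: Switch, dst: Switch) -> List[Switch]:
    """Return the switches visited from src to dst, X dimension first."""
    x, y = src
    path = [(x, y)]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path
```

Because the path depends only on the source and destination coordinates, a switch needs no routing table, which fits the on-chip memory constraints mentioned above.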

A contention based hot potato routing method has been suggested for NoCs [12]. This technique can predict contention in the forthcoming stages by using direct connections with the adjacent node. Here packets are divided into small units called flits, so they can be easily handled using limited buffers. Routes that lead towards the destination are termed profitable routes. Alternatively, a route that leads a packet away from the destination is a misroute. Ideally, a packet should follow the profitable route to reach a destination. However, in cases when profitable routes are congested and/or their queues are too long, following a misroute might offer less delay in reaching the destination.
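The notions of profitable route and misroute can be sketched for a 2D mesh as follows. This is not the algorithm of [12]; it is a simplified illustration that assumes Manhattan distance as the measure of progress, a hypothetical queue-length threshold as the congestion signal, and a switch that is not at the mesh edge and not itself the destination.

```python
from typing import Dict, Tuple

Switch = Tuple[int, int]  # (x, y) coordinates of a switch

DIRECTIONS: Dict[str, Tuple[int, int]] = {
    "east": (1, 0), "west": (-1, 0), "north": (0, 1), "south": (0, -1),
}


def manhattan(a: Switch, b: Switch) -> int:
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def classify_ports(current: Switch, dst: Switch) -> Dict[str, str]:
    """Label each output port as 'profitable' or 'misroute' for this packet."""
    here = manhattan(current, dst)
    labels = {}
    for name, (dx, dy) in DIRECTIONS.items():
        nxt = (current[0] + dx, current[1] + dy)
        labels[name] = "profitable" if manhattan(nxt, dst) < here else "misroute"
    return labels


def choose_port(current: Switch, dst: Switch,
                queue_len: Dict[str, int], threshold: int) -> str:
    """Prefer an uncongested profitable port; otherwise take the shortest queue."""
    labels = classify_ports(current, dst)
    candidates = [p for p, kind in labels.items()
                  if kind == "profitable" and queue_len[p] <= threshold]
    if not candidates:
        # Every profitable port is congested: accept a possible misroute.
        candidates = list(labels)
    return min(candidates, key=lambda p: queue_len[p])
```

With a destination two hops to the east, only the east port is profitable; if its queue exceeds the threshold, the sketch sends the flit out of the least loaded port even though that is a misroute.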

# 3 Future of NoCs

We can see from our previous discussion that a lot of research has already been done in the Architecture and Control layer of NoCs. This provides an opportunity to extend the research to those areas which are not addressed very frequently but can prove to be vital for designing viable NoCs for future applications. In the following, we will see some important issues that will have a significant impact on the future of NoCs.

# Reliability

SoCs are mainly designed for consumer products; the main issue related to these products is reliability. As the number of transistors on a chip increases, so does the probability of faults, making reliability a major issue [13]. Failures can occur due to a variety of reasons; for example, crosstalk faults can lead to permanent or transient failures of the communication links [14]. In addition to this, implementing packet-based communication on chip brings new reliability-related challenges along with it. A transient fault may cause a bit-flip in the packet header, due to which the packet gets routed to a wrong destination. Similarly, in case of permanent faults, one or many links may go down, causing congestion in the alternate paths. Thus, it is extremely important to deploy mechanisms in NoCs that can handle both permanent and transient errors to ensure reliable packet delivery over shared communication channels. In [15], various reasons affecting the links and routers on chip are discussed and a model of dynamic routing for NoCs is proposed. We have provided some preliminary results to reroute packets on alternate paths in case of link failures in [16].
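To illustrate how a single bit-flip in a packet header could be detected before the packet is misrouted, the sketch below adds one even-parity bit to a destination field. Parity is chosen here only as the simplest possible example; the tutorial does not prescribe a particular error detection scheme, and the header layout is hypothetical.

```python
def parity(value: int) -> int:
    """Return 1 if value has an odd number of set bits, else 0."""
    return bin(value).count("1") % 2


def protect_header(dest: int) -> int:
    """Append an even-parity bit as the least significant bit of the header."""
    return (dest << 1) | parity(dest)


def header_is_valid(header: int) -> bool:
    """A correct header always has an even number of set bits."""
    return parity(header) == 0


def extract_dest(header: int) -> int:
    """Drop the parity bit to recover the destination field."""
    return header >> 1
```

A router that finds `header_is_valid` to be false can drop the packet or request retransmission instead of forwarding it. A single parity bit detects any odd number of flipped bits but cannot correct them, and it misses double flips.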

### Quality of Service

We know that on-chip networks are designed for a pre-known set of computing resources with pre-defined traffic patterns, as compared to traditional networks which are built for future growth and expansion. From an ordinary user's perspective, the behavior of any application must be predictable. Although the need to guarantee the highest level of predictability is minimal, some degree of fitness for purpose is always assumed. Also, in terms of NoCs, which are made for mainstream consumer products, such a degree of expectation becomes inevitable. For example, a mobile phone should provide better voice and video quality to its user than its contemporaries.

From the Internet, we learned that a service can theoretically be "guaranteed" if a commitment is made; otherwise it is termed "best effort". In the case of SoCs, both kinds of services are essential. An SoC can accommodate traffic ranging from real time data, which needs to be delivered in a stipulated amount of time with no or minimum distortion, to regular data streams which follow no such constraints. However, providing separate infrastructure for both services would mean wasting precious on-chip resources. Instead, a combined best effort and guaranteed services architecture has been proposed in [17] for NoCs. This seems to be quite a feasible solution considering the efficient utilization of resources on-chip.
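The idea of serving both traffic classes over the same link can be sketched with a strict-priority arbiter. This is only one simple way to combine the two classes and is an assumption of this example; it does not reproduce the router design of [17]. The class and method names are hypothetical.

```python
from collections import deque
from typing import Deque, Optional


class LinkArbiter:
    """Strict-priority arbiter for one output link (illustrative assumption)."""

    def __init__(self) -> None:
        self.guaranteed: Deque[str] = deque()
        self.best_effort: Deque[str] = deque()

    def enqueue(self, flit: str, guaranteed: bool) -> None:
        """Place a flit in the queue of its traffic class."""
        (self.guaranteed if guaranteed else self.best_effort).append(flit)

    def next_flit(self) -> Optional[str]:
        """Pick the flit to send this cycle, or None if both queues are empty."""
        if self.guaranteed:
            return self.guaranteed.popleft()
        if self.best_effort:
            return self.best_effort.popleft()
        return None
```

Best-effort flits use whatever link cycles the guaranteed traffic leaves free, so both services share one infrastructure. A real design would also have to bound the guaranteed load, since strict priority alone can starve best-effort traffic.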

#### Software model

As discussed in section 5, above the architecture and control layer we have the software layer, which encompasses the application and system software categories. A programming model gives an abstract view of the hardware to application developers. In parallel programming, there are two programming models: shared memory and message passing. In the shared memory model, communication is implicit through a shared address space, whereas in the message passing model, processors have private memories and communication occurs through explicit messages [18]. In the context of NoCs, where computational resources represent a variety of Intellectual Property (IP) blocks, message passing seems to be the programming model of choice for NoC application software [11]. This model, despite being harder to program, is more efficient in terms of scalability and performance in heterogeneous environments.
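The message passing model can be illustrated with two software threads standing in for on-chip resources. This is a host-level sketch using Python's standard `queue` and `threading` modules, not NoC application code; the resource name and the doubling operation are hypothetical.

```python
import queue
import threading

STOP = None  # sentinel message marking the end of the stream


def dsp_resource(inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Hypothetical resource: doubles each value it receives and sends it on."""
    while True:
        msg = inbox.get()
        if msg is STOP:
            outbox.put(STOP)
            return
        outbox.put(msg * 2)


def run(values):
    """Send values to the resource and collect its replies, in order."""
    to_dsp: queue.Queue = queue.Queue()
    from_dsp: queue.Queue = queue.Queue()
    worker = threading.Thread(target=dsp_resource, args=(to_dsp, from_dsp))
    worker.start()
    for value in values:
        to_dsp.put(value)
    to_dsp.put(STOP)
    results = []
    while True:
        msg = from_dsp.get()
        if msg is STOP:
            break
        results.append(msg)
    worker.join()
    return results
```

The two sides never touch each other's data directly; everything they exchange travels as an explicit message through a queue, which is the property that lets this model map onto resources with private memories.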

Similarly, in terms of system software, a standardized set of operating system services and interfaces needs to be developed. However, a few questions are still open, like having a purely distributed kind of OS versus a centralized one. In a distributed system, each resource has an independent OS running; such a system is just like a LAN on a chip, but it has very high overhead. On the other hand, in the case of a centralized OS, a special processor is meant to run the OS services, but here the problem of scalability comes into play: will the running processor be enough when the system grows, or will we need more processors for it?

## II. CONCLUSION

The NoC methodology will likely be the best solution to counter the increasing complexity of future SoCs. From the above discussion it can also be concluded that future SoCs will be platform-based because of short time-to-market constraints. Development of NoCs will be a huge effort as it involves reuse at all levels: reuse of architecture, hardware and software. Also, it includes reuse of different languages, methods, tools and practices during development.

Although the potential of NoCs is tremendous, it would be rather unlikely to fulfill all its promises before the development of some of its key components, like a reliable NoC architecture, assurance of quality of service, and a viable NoC software model. Clearly, it can be concluded that there is a lot of need and scope for research in this area.

#### REFERENCES

- [1] Rochit Rajsuman, "System-on-a-chip: Design and Test", Artech House Publishers, 2000.
- [2] Érika Cota, Luigi Carro, Flávio Wagner, Marcelo Lubaszewski, "Power Aware NoC reuse on the testing of core based systems", Proceedings, ITC, Sept 30 - Oct 02, 2003, Charlotte, NC, USA.
- [3] http://www.itrs.net/Common/2003ITRS/Home2003.htm
- [4] Axel Jantsch and Hannu Tenhunen, "Networks on Chip", Kluwer Academic Publications, 2003, Boston, USA.
- [5] Gordon E. Moore, "Cramming more components onto integrated circuits", Electronics, Volume 38, Number 8, April 19, 1965.
- [6] Daniel M. Chapiro, "Globally-Asynchronous Locally-Synchronous Systems", PhD Thesis, Stanford University, Oct. 1984.
- [7] William J. Dally and Brian Towles, "Route packets, not wires: on-chip interconnection networks", Proceedings, Design Automation Conference (DAC), pp. 684-689, Las Vegas, NV, June 2001.
- [8] Shashi Kumar, Axel Jantsch, Juha-Pekka Soininen, Martti Forsell, Mikael Millberg, Johny Oberg, Kari Tiensyrja and Ahmed Hemani, "A network on chip Architecture and Design Methodology", Proceedings, IEEE Computer Society Annual Symposium on VLSI, 2002.
- [9] Pierre Guerrier, Alain Greiner, "A generic architecture for on-chip packet switched interconnections", Proceedings, DATE, 2000, pp. 250-256.
- [10] Ahmed Hemani, Axel Jantsch, Shashi Kumar, Adam Postula, Johny Oberg, Mikael Millberg, Dan Lindqvist, "Networks on a chip: An Architecture for billion transistor era", Proceedings, IEEE NorChip Conference, November 2000.
- [11] L. Benini and G. De Micheli, "Networks on Chip: A new paradigm for component-based MPSoC design", in A. Jerraya and W. Wolf, Editors, Multiprocessors Systems on Chips, Morgan Kaufman, 2004, pp. 49-80.
- [12] Terry Tao Ye, Luca Benini, Giovanni De Micheli, "Packetization and Routing Analysis of On-chip Multiprocessor Networks", Journal of System Architecture, Vol 50, Feb 2004, pp. 81-104.
- [13] Partha Pratim Pande, Cristian Grecu, Andre Ivanov, Resve Saleh, and Giovanni De Micheli, "Design, Synthesis, and Test of Networks on Chips", IEEE Design and Test, September/October 2005 (Vol. 22, No. 5), pp. 404-413.
- [14] Ming Shae Wu and Chung Len Lee, "Using a Periodic Square Wave Test Signal to Detect Cross Talk Faults", Journal, IEEE Design & Test of Computers, Volume 22, Issue 2, March-April 2005, pp. 160-169.
- [15] Muhammad Ali, Michael Welzl, Martin Zwicknagl, Sybille Hellebrand, "Considerations for fault tolerant Network on chips", Proceedings, 17th IEEE ICM, Islamabad, Pakistan, 13-15 Dec. 2005.
- [16] Muhammad Ali, Michael Welzl, Sybille Hellebrand, "A dynamic routing mechanism for network on chip", Proceedings, IEEE NORCHIP, Oulu, Finland, 21-22 November 2005.
- [17] E. Rijpkema, K. Goossens, A. Radulescu, J. Dielissen, J. van Meerbergen, P. Wielage and E. Waterlander, "Trade Offs in the Design of a Router with Both Guaranteed and Best-Effort Services for Networks on Chip", Proceedings, IEEE: Computers and Digital Technique, September 2003.
- [18] Steven Brawer, "Introduction to Parallel Programming", Academic Press (July 1989).