Efficient Design and Clocking for a Network-on-Chip

Date

2013-05-02

Journal Title

Journal ISSN

Volume Title

Publisher

Abstract

As VLSI fabrication technology scales, an increasing number of processing elements (cores) on a chip makes on-chip communication a new performance bottleneck. The Network-on-Chip (NoC) paradigm has emerged as an efficient and scalable infrastructure to handle the communication needs for such multi-core systems. In most existing NoCs, design decisions are made assuming that the NoC operates at the same or lower clock speed as the cores, which slows down the communication system. A major challenge in designing a high speed NoC is the difficulty of distributing a high speed, low power clock across the chip.

In this dissertation, we first propose several techniques to address the issue of distributing a high-speed, low power, low jitter clock across the IC. We primarily focus our attention on resonant standing wave oscillators (SWOs), which have recently emerged as a promising technique for high-speed, low power clock generation. In addition, we also present a dynamic programming based approach to synthesize a low jitter, low power buffered H-tree for clock distribution. In the second part of this dissertation, we use these efficient clock distribution schemes to present a novel fast NoC design that relies on source synchronous data transfer over a ring. In our source-synchronous design, the clock and data NoC are routed in parallel yielding a fast, robust design.

Architectural simulations on synthetic and real traffic show that our source-synchronous NoC designs can provide significantly lower latency while achieving the same or better bandwidth compared to a state of the art mesh, while consuming lower area. The fact that the our ring-based NoC runs significantly faster than the mesh contributes to these improvements. Moreover, since our proposed NoC designs are fully synchronous, they are very amenable to testing as well.

In the last part of this dissertation, we explore an alternate scheme of achieving high-speed on-chip data transfer using sinusoidal signals of different frequencies. The key advantage of our method is the ability to superimpose such sinusoids and thereby effectively send multiple logic values along the same wire in a clock cycle. Experimental results show that for the same throughput as that of a traditional scheme, we require significantly fewer wires.

Description

Citation