Browsing by Subject "System-on-Chip"
Now showing 1 - 3 of 3
Results Per Page
Sort Options
Item A Benchmarking Platform For Network-On-Chip (NOC) Multiprocessor System-On- Chips(2012-02-14) Malave-Bonet, JavierNetwork-on-Chip (NOC) based designs have garnered significant attention from both researchers and industry over the past several years. The analysis of these designs has focused on broad topics such as NOC component micro-architecture, fault-tolerant communication, and system memory architecture. Nonetheless, the design of lowlatency, high-bandwidth, low-power and area-efficient NOC is extremely complex due to the conflicting nature of these design objectives. Benchmarks are an indispensable tool in the design process; providing thorough measurement and fair comparison between designs in order to achieve optimal results (i.e performance, cost, quality of service). This research proposes a benchmarking platform called NoCBench for evaluating the performance of Network-on-chip. Although previous research has proposed standard guidelines to develop benchmarks for Network-on-Chip, this work moves forward and proposes a System-C based simulation platform for system-level design exploration. It will provide an initial set of synthetic benchmarks for on-chip network interconnection validation along with an initial set of standardized processing cores, NOC components, and system-wide services. The benchmarks were constructed using synthetic applications described by Task Graphs For Free (TGFF) task graphs extracted from the E3S benchmark suite. Two benchmarks were used for characterization: Consumer and Networking. They are characterized based on throughput and latency. Case studies show how they can be used to evaluate metrics beyond throughput and latency (i.e. traffic distribution). The contribution of this work is two-fold: 1) This study provides a methodology for benchmark creation and characterization using NoCBench that evaluates important metrics in NOC design (i.e. end-to-end packet delay, throughput). 2) The developed full-system simulation platform provides a complete environment for further benchmark characterization on NOC based MpSoC as well as system-level design space exploration.Item Design, Implementation and Evaluation of a Configurable NoC for AcENoCs FPGA Accelerated Emulation Platform(2011-10-21) Lotlikar, Swapnil SubhashThe heterogenous nature and the demand for extensive parallel processing in modern applications have resulted in widespread use of Multicore System-on-Chip (SoC) architectures. The emerging Network-on-Chip (NoC) architecture provides an energy-efficient and scalable communication solution for Multicore SoCs, serving as a powerful replacement for traditional bus-based solutions. The key to successful realization of such architectures is a flexible, fast and robust emulation platform for fast design space exploration. In this research, we present the design and evaluation of a highly configurable NoC used in AcENoCs (Accelerated Emulation platform for NoCs), a flexible and cycle accurate field programmable gate array (FPGA) emulation platform for validating NoC architectures. Along with the implementation details, we also discuss the various design optimizations and tradeoffs, and assess the performance improvements of AcENoCs over existing simulators and emulators. We design a hardware library consisting of routers and links using verilog hardware description language (HDL). The router is parameterized and has a configurable number of physical ports, virtual channels (VCs) and pipeline depth. A packet switched NoC is constructed by connecting the routers in either 2D-Mesh or 2D-Torus topology. The NoC is integrated in the AcENoCs platform and prototyped on Xilinx Virtex-5 FPGA. The NoC was evaluated under various synthetic and realistic workloads generated by AcENoCs' traffic generators implemented on the Xilinx MicroBlaze embedded processor. In order to validate the NoC design, performance metrics like average latency and throughput were measured and compared against the results obtained using standard network simulators. FPGA implementation of the NoC using Xilinx tools indicated a 76% LUT utilization for a 5x5 2D-Mesh network. A VC allocator was found to be the single largest consumer of hardware resources within a router. The router design synthesized at a frequency of 135MHz, 124MHz and 109MHz for 3-port, 4-port and 5-port configurations, respectively. The operational frequency of the router in the AcENoCs environment was limited only by the software execution latency even though the hardware itself could be clocked at a much higher rate. An AcENoCs emulator showed speedup improvements of 10000-12000X over HDL simulators and 5-15X over software simulators, without sacrificing cycle accuracy.Item DESIGNING COST-EFFECTIVE COARSE-GRAINED RECONFIGURABLE ARCHITECTURE(2010-07-14) Kim, YoonjinApplication-specific optimization of embedded systems becomes inevitable to satisfy the market demand for designers to meet tighter constraints on cost, performance and power. On the other hand, the flexibility of a system is also important to accommodate the short time-to-market requirements for embedded systems. To compromise these incompatible demands, coarse-grained reconfigurable architecture (CGRA) has emerged as a suitable solution. A typical CGRA requires many processing elements (PEs) and a configuration cache for reconfiguration of its PE array. However, such a structure consumes significant area and power. Therefore, designing cost-effective CGRA has been a serious concern for reliability of CGRA-based embedded systems. As an effort to provide such cost-effective design, the first half of this work focuses on reducing power in the configuration cache. For power saving in the configuration cache, a low power reconfiguration technique is presented based on reusable context pipelining achieved by merging the concept of context reuse into context pipelining. In addition, we propose dynamic context compression capable of supporting only required bits of the context words set to enable and the redundant bits set to disable. Finally, we provide dynamic context management capable of reducing reduce power consumption in configuration cache by controlling a read/write operation of the redundant context words In the second part of this dissertation, we focus on designing a cost-effective PE array to reduce area and power. For area and power saving in a PE array, we devise a costeffective array fabric addresses novel rearrangement of processing elements and their interconnection designs to reduce area and power consumption. In addition, hierarchical reconfigurable computing arrays are proposed consisting of two reconfigurable computing blocks with two types of communication structure together. The two computing blocks have shared critical resources and such a sharing structure provides efficient communication interface between them with reducing overall area. Based on the proposed design approaches, a CGRA combining the multiple design schemes is shown to verify the synergy effect of the integrated approach. Experimental results show that the integrated approach reduces area by 23.07% and power by up to 72% when compared with the conventional CGRA.