Browsing by Subject "Low power"
Item Characterization and management of voltage noise in multi-core, multi-threaded processors (2013-05) Kim, Youngtaek; John, Lizy Kurian
Reliability is one of the important issues of recent microprocessor design. Processors must provide correct behavior as users expect, and must not fail at any time. However, unreliable operation can be caused by excessive supply voltage fluctuations due to an inductive part in a microprocessor power distribution network. This voltage fluctuation issue is referred to as inductive or di/dt noise, and requires thorough analysis and sophisticated design solutions. This dissertation proposes an automated stressmark generation framework to characterize di/dt noise effect, and suggests a practical solution for management of di/dt effects while achieving performance and energy goals. First, the di/dt noise issue is analyzed from theory to a practical view. Inductance is a parasitic part of a microprocessor's power distribution network, and its characteristics such as resonant frequencies are reviewed. Then, it is shown that supply voltage fluctuation from resonant behavior is much more harmful than single event voltage fluctuations. Voltage fluctuations caused by standard benchmarks such as SPEC CPU2006, PARSEC, Linpack, etc. are studied. Next, an AUtomated DI/dT stressmark generation framework, referred to as AUDIT, is proposed to identify maximum voltage droop in a microprocessor power distribution network. The di/dt stressmark generated by the AUDIT framework is an instruction sequence, which draws periodic high and low current pulses that maximize voltage fluctuations including voltage droops. AUDIT uses a Genetic Algorithm in scheduling and optimizing candidate instruction sequences to create a maximum voltage droop. In addition, AUDIT provides both simulation and hardware measurement methods for finding maximum voltage droops in different design and verification stages of a processor.
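The genetic search described above can be illustrated with a minimal sketch. It is not the AUDIT implementation: real candidates are instruction sequences scored by simulation or hardware measurement, whereas here each candidate is a bit pattern (1 = high-current cycle, 0 = low-current cycle) scored by a toy two-pole resonator standing in for the power distribution network. The pole radius `R`, the 8-cycle resonance `W`, and all population settings are assumptions chosen only for illustration.

```python
import math
import random

# Toy power-delivery model (assumed values): pole radius and an 8-cycle resonance.
R, W = 0.95, 2 * math.pi / 8


def max_droop(seq, reps=6):
    """Peak response of a two-pole resonator driven by the looped current pattern."""
    a1, a2 = 2 * R * math.cos(W), -R * R
    y1 = y2 = peak = 0.0
    for x in seq * reps:            # repeat the pattern so resonance can build up
        y = a1 * y1 + a2 * y2 + x   # y[n] = a1*y[n-1] + a2*y[n-2] + x[n]
        peak = max(peak, y)
        y1, y2 = y, y1
    return peak


def evolve(length=16, pop_size=40, generations=60, p_mut=0.05, seed=1):
    """Genetic search: elitist selection, one-point crossover, bit-flip mutation."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=max_droop, reverse=True)
        elite = pop[: pop_size // 4]            # keep the strongest quarter
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)         # pick two parents
            cut = rng.randrange(1, length)      # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            children.append(child)
        pop = elite + children
    best = max(pop, key=max_droop)
    return best, max_droop(best)


if __name__ == "__main__":
    pattern, droop = evolve()
    print("best pattern:", "".join(map(str, pattern)))
    print("toy droop:", round(droop, 2), "vs constant load:", round(max_droop([1] * 16), 2))
```

In this toy model the search tends toward patterns that alternate high and low current near the resonant period, which mirrors the abstract's point that resonant excitation is worse than a single current step.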
Failure points in hardware due to voltage droops are analyzed. Finally, a hardware technique, floating-point (FP) issue throttling, is examined, which provides a reduction in worst case voltage droop. This dissertation shows the impact of floating point throttling on voltage droop, and translates this reduction in voltage droop to an increase in operating frequency because additional guardband is no longer required to guard against droops resulting from heavy floating point usage. This dissertation presents two techniques to dynamically determine when to trade off FP throughput for reduced voltage margin and increased frequency. These techniques can work at the software level without any modification of existing hardware.

Item Correct low power design transformations for hardware systems (2013-08) Viswanath, Vinod; Abraham, Jacob A.
We present a generic proof methodology to automatically prove correctness of design transformations introduced at the Register-Transfer Level (RTL) to achieve lower power dissipation in hardware systems. We also introduce a new algorithm to reduce switching activity power dissipation in microprocessors. We further apply our technique in a completely different domain of dynamic power management of Systems-on-Chip (SoCs). We demonstrate our methodology on real-life circuits. In this thesis, we address the dual problem of transforming hardware systems at higher levels of abstraction to achieve lower power dissipation, and a reliable way to verify the correctness of the aforementioned transformations. The thesis is in three parts. The first part introduces Instruction-driven Slicing, a new algorithm to automatically introduce RTL/System level annotations in microprocessors to achieve lower switching power dissipation. The second part introduces Dedicated Rewriting, a rewriting based generic proof methodology to automatically prove correctness of such high-level transformations for lowering power dissipation.
The third part implements dedicated rewriting in the context of dynamically managing power dissipation of mobile and hand-held devices. We first present instruction-driven slicing, a new technique for annotating microprocessor descriptions at the Register Transfer Level in order to achieve lower power dissipation. Our technique automatically annotates existing RTL code to optimize the circuit for lowering power dissipated by switching activity. Our technique can be applied at the architectural level as well, achieving similar power gains. We first demonstrate our technique on architectural and RTL models of a 32-bit OpenRISC pipelined processor (OR1200), showing power gains for the SPEC2000 benchmarks. These annotations achieve reduction in power dissipation by changing the logic of the design. We further extend our technique to an out-of-order superscalar core and demonstrate power gains for the same SPEC2000 benchmarks on architectural and RTL models of PUMA, a fixed point out-of-order PowerPC microprocessor. We next present dedicated rewriting, a novel technique to automatically prove the correctness of low power transformations in hardware systems described at the Register Transfer Level. We guarantee the correctness of any low power transformation by providing a functional equivalence proof of the hardware design before and after the transformation. Dedicated rewriting is a highly automated deductive verification technique specially honed for proving correctness of low power transformations. We provide a notion of equivalence and establish the equivalence proof within our dedicated rewriting system. We demonstrate our technique on a non-trivial case study. We show equivalence of a Verilog RTL implementation of a Viterbi decoder, a component of the DRM System-On-Chip (SoC), before and after the application of multiple low power transformations. We next apply dedicated rewriting to a broader context of holistic power management of SoCs. 
This in turn creates a self-checking system and will automatically flag conflicting constraints or rules. Our system will manage power constraint rules using dedicated rewriting specially honed for dynamic power management of SoC designs. Together, this provides a common platform and representation to seamlessly cooperate between hardware and software constraints to achieve maximum platform power optimization dynamically during execution. We demonstrate our technique in multiple contexts on an SoC design of the state-of-the-art next generation Intel smartphone platform. Finally, we give a proof of instruction-driven slicing. We first prove that the annotations automatically introduced in the OR1200 processor preserve the original functionality of the machine using the ACL2 theorem prover. Then we establish the same proof within our dedicated rewriting system, and discuss the merits of such a technique and a framework. In the context of today's shrinking hardware and mobile internet devices, lowering power dissipation is a key problem. Verifying the correctness of transformations which achieve that is usually a time-consuming affair. Automatic and reliable methods of verification that are easy to use are extremely important. In this thesis we have presented one such transformation, and a generic framework to prove correctness of that and similar transformations. Our methodology is constructed in a manner that easily and seamlessly fits into the design cycle of creating complicated hardware systems. Our technique is also general enough to be applied in a completely different context of dynamic power management of mobile and hand-held devices.

Item Design and production of an energy harvesting wireless sensor (2013-05) Bar, Farris Ahmad; Abraham, Jacob A.
The widespread deployment of wireless sensors in our homes, offices, factories and infrastructure has opened the door for system designers to create novel approaches for powering wireless sensor nodes.
In recent years, energy harvesting has emerged as the power supply of choice for embedded system designers, enabling wireless sensors to be used in applications that previously were not feasible with conventional battery-powered designs. This report details the design and development of an energy harvesting wireless sensor from concept to production. Design constraints included the requirement to operate reliably in a wide variety of environments, the use of commercially available components, and a visually appealing form factor. The result is a very power-efficient, solar-powered wireless sensor that measures temperature, voltage, and illumination level at the solar cell and has an ultra slim form factor.

Item Design techniques for low-power SAR ADCs in nano-scale CMOS technologies (2016-05) Chen, Long; Sun, Nan; Viswanathan, T.R.; Pan, David Z.; Orshansky, Michael; Soenen, Eric
This thesis presents low power design techniques for successive approximation register (SAR) analog-to-digital converters (ADCs) in nano-scale CMOS technologies. Low power SAR ADCs face two major challenges especially at high resolutions: (1) increased comparator power to suppress the noise, and (2) increased DAC switching energy due to the large DAC size. To improve the comparator’s power efficiency, a statistical estimation based comparator noise reduction technique is presented. It allows a low power and noisy comparator to achieve high signal-to-noise ratio (SNR) by estimating the conversion residue. A first prototype ADC in 65 nm CMOS has been developed to validate the proposed noise reduction technique. It achieves 4.5 fJ/conv-step Walden figure of merit and 64.5 dB signal-to-noise and distortion ratio (SNDR). In addition, a bidirectional single-side switching technique is developed to reduce the DAC switching power. It can reduce the DAC switching power and the total number of unit capacitors by 86% and 75%, respectively.
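As a reading aid for the figures quoted for the first prototype, the sketch below applies the standard definitions ENOB = (SNDR - 1.76) / 6.02 and Walden FoM = P / (2^ENOB * fs). The abstract does not state this prototype's power or sampling rate, so the code only inverts the formula to obtain the implied energy per conversion (P / fs); the function names are illustrative and not taken from the thesis.

```python
def enob_from_sndr(sndr_db):
    """Effective number of bits implied by an SNDR in dB (full-scale sine input)."""
    return (sndr_db - 1.76) / 6.02


def walden_fom(power_w, fs_hz, sndr_db):
    """Walden figure of merit in joules per conversion step."""
    return power_w / (2 ** enob_from_sndr(sndr_db) * fs_hz)


def energy_per_conversion(fom_j, sndr_db):
    """Invert the FoM: energy per sample (P / fs) implied by a reported FoM and SNDR."""
    return fom_j * 2 ** enob_from_sndr(sndr_db)


if __name__ == "__main__":
    enob = enob_from_sndr(64.5)                      # first prototype SNDR
    e_conv = energy_per_conversion(4.5e-15, 64.5)    # first prototype FoM
    print(f"ENOB = {enob:.2f} bits")
    print(f"implied energy per conversion = {e_conv * 1e12:.2f} pJ")
```

With the reported 64.5 dB and 4.5 fJ/conv-step, this works out to roughly 10.4 effective bits and about 6 pJ per conversion.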
A second prototype ADC with the proposed switching technique is designed and fabricated in 180 nm CMOS technology. It achieves an SNDR of 63.4 dB and consumes only 24 µW at 1 MS/s, leading to a Walden figure of merit of 19.9 fJ/conv-step. This thesis also presents an improved loop-unrolled SAR ADC, which works at high frequency with reduced SAR logic power and delay. It employs the bidirectional single-side switching technique to reduce the comparator common-mode voltage variation. In addition, it uses a Vcm-adaptive offset calibration technique which can accurately calibrate the comparator’s offset at its operating Vcm. A prototype ADC designed in 40 nm CMOS achieves 35 dB at 700 MS/s sampling rate and consumes only 0.95 mW, leading to a Walden figure of merit of 30 fJ/conv-step.

Item Development of an implantable system to measure the pressure-volume relationship in ambulatory rodent hearts (2012-12) Loeffler, Kathryn Rose; Valvano, Jonathan W., 1953-; Pearce, John A., 1946-
The design, fabrication, and in-vivo testing of an implantable device to measure the pressure-volume (PV) relationship in the hearts of conscious, untethered rats is presented. Volume is measured using a tetrapolar catheter positioned in the left ventricle (LV), which emits a 20 kHz current field across the LV blood pool and parallel heart tissue and measures the resulting voltage. The admittance method is used to instantaneously remove the contribution of the parallel heart muscle, and Wei’s non-linear blood conductance-to-volume equation is used to calculate volume. Pressure is measured with a strain gauge sensor at the tip of the catheter. The implant was designed to be small, light, and low-power. An average implant occupies 5 cm³, weighs 8 g, and on a single charge collects data for 2 months taking 43 samples per day. Collected data is transmitted wirelessly via RF to a base station where it is recorded. The functionality of the implant and measurement system was verified in six rat experiments.
In all experiments, ambulatory PV loops were measured on implantation day. Viable pressure data was recorded for 11 days in one rat; in another rat viable admittance data was collected for 10 days. Changing catheter position and non-constant blood resistivity are considered as sources of error in the volume measurement. Pressure drift due to changing atmospheric pressure is considered as a source of error in the pressure measurement. Lastly, alternative uses for the implant and directions for future improvement are considered.

Item Dynamically controlling the clock frequency based on the variations in the voltage (2010-08) Chhatbar, Jigar Chandrakant; Abraham, Jacob A.; McDermot, Mark
A digital logic circuit tends to become slower if the voltage (VDD) level drops below the normal VDD level. Because of this, the required data will not have settled before the arrival of the clock edge. This results in an incorrect sampling of the data leading to a functional failure of the chip. This thesis proposes a clock controller circuit which solves this issue. It consists of a voltage monitoring circuit to track the variations in the VDD level, a frequency multiplier and divider, and a selector logic circuit that outputs a particular frequency depending upon the VDD range in which the chip is operating.

Item High performance continuous-time filters for information transfer systems (Texas A&M University, 2004-09-30) Mohieldin, Ahmed Nader
Vast attention has been paid to active continuous-time filters over the years. Thus as the cheap, readily available integrated circuit OpAmps replaced their discrete circuit versions, it became feasible to consider active-RC filter circuits using large numbers of OpAmps. Similarly the development of integrated operational transconductance amplifier (OTA) led to new filter configurations. This gave rise to OTA-C filters, using only active devices and capacitors, making it more suitable for integration.
The demands on filter circuits have become ever more stringent as the world of electronics and communications has advanced. In addition, the continuing increase in the operating frequencies of modern circuits and systems increases the need for active filters that can perform at these higher frequencies; an area where the LC active filter emerges. What mainly limits the performance of an analog circuit are the non-idealities of the building blocks used and the circuit architecture. This research concentrates on the design issues of high frequency continuous-time integrated filters. Several novel circuit building blocks are introduced. A novel pseudo-differential fully balanced fully symmetric CMOS OTA architecture with inherent common-mode detection is proposed. Through judicious arrangement, the common-mode feedback circuit can be economically implemented. On the level of system architectures, a novel low-voltage 4th-order RF bandpass filter structure based on emulation of two magnetically coupled resonators is presented. A unique feature of the proposed architecture is using electric coupling to emulate the effect of the coupled inductors, thus providing bandwidth tuning with small passband ripple. As part of a direct conversion dual-mode 802.11b/Bluetooth receiver, a BiCMOS 5th-order low-pass channel selection filter is designed. The filter operates from a single 2.5 V supply and achieves 76 dB of out-of-band SFDR. A digital automatic tuning system is also implemented to account for process and temperature variations. As part of a Bluetooth transmitter, a low-power quadrature direct digital frequency synthesizer (DDFS) is presented. Piecewise linear approximation is used to avoid using a ROM look-up table to store the sine values in a conventional DDFS.
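The idea of replacing a sine ROM with piecewise linear segments can be sketched in a few lines. This is an illustrative model only, not the circuit from the thesis: the 16-bit phase accumulator, the 8 segments per quarter wave, and the use of floating point instead of fixed-point shift-and-add hardware are all assumptions. Only the nine segment endpoints are stored; quarter-wave symmetry covers the rest of the cycle, and the cosine output is the sine taken a quarter cycle ahead.

```python
import math

PHASE_BITS = 16                      # accumulator width (assumed)
SEGMENTS = 8                         # linear segments per quarter wave (assumed)
FULL = 1 << PHASE_BITS
QUARTER = FULL >> 2

# Only SEGMENTS + 1 endpoint values are kept, instead of a full sine table.
ENDPOINTS = [math.sin(math.pi / 2 * k / SEGMENTS) for k in range(SEGMENTS + 1)]


def quarter_sine(frac):
    """Piecewise-linear approximation of sin(pi/2 * frac) for 0 <= frac <= 1."""
    pos = frac * SEGMENTS
    k = min(int(pos), SEGMENTS - 1)              # segment index
    return ENDPOINTS[k] + (ENDPOINTS[k + 1] - ENDPOINTS[k]) * (pos - k)


def sine(phase):
    """Approximate sin(2*pi*phase / FULL) using quarter-wave symmetry."""
    quadrant, offset = divmod(phase % FULL, QUARTER)
    frac = offset / QUARTER
    if quadrant & 1:                             # quadrants 1 and 3 run backwards
        frac = 1.0 - frac
    value = quarter_sine(frac)
    return -value if quadrant & 2 else value     # quadrants 2 and 3 are negative


def ddfs(tuning_word, n):
    """Generate n quadrature samples (cos, sin) from a phase accumulator."""
    phase, out = 0, []
    for _ in range(n):
        out.append((sine(phase + QUARTER), sine(phase)))
        phase = (phase + tuning_word) % FULL
    return out


if __name__ == "__main__":
    worst = max(abs(sine(p) - math.sin(2 * math.pi * p / FULL)) for p in range(FULL))
    print(f"worst-case sine error with {SEGMENTS} segments: {worst:.4f}")
```

The worst-case error of linear interpolation is bounded by h²/8 for segment width h, which for 8 segments per quarter wave is just under 0.005; more segments trade stored constants for accuracy.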
Significant saving in power consumption, due to the elimination of the ROM, renders the design more suitable for portable wireless communication applications.

Item Linearity and Noise Improvement Techniques Employing Low Power in Analog and RF Circuits and Systems (2012-12-07) Abdel Ghany, Ehab
The implementation of highly integrated multi-band and multi-standard reconfigurable radio transceivers is one of the great challenges in the area of integrated circuit technology today. In addition to the rapid market growth and high quality demands that require cheaper and smaller solutions, the technical requirements for the transceiver function of a typical wireless device are considerably multi-dimensional. The major key performance metrics facing RFIC designers are power dissipation, speed, noise, linearity, gain, and efficiency. Besides the difficulty of the circuit design due to the trade-offs and correlations that exist between these parameters, the situation becomes more and more challenging when dealing with multi-standard radio systems on a single chip and applications with different requirements on the radio software and hardware aiming at highly flexible dynamic spectrum access. In this dissertation, different solutions are proposed to improve the linearity and reduce the noise and power consumption in analog and RF circuits and systems. A system-level digital design approach is proposed to compensate for the harmonic distortion components produced by transmitter circuits' nonlinearities. The approach relies on a polyphase multipath scheme and uses digital baseband phase rotation pre-distortion, aiming at increased harmonic cancellation and reduced power consumption compared with other reported techniques. New low power design techniques to enhance the noise and linearity of the receiver front-end LNA are also presented. The two proposed LNAs are fully differential and have a common-gate capacitive cross-coupled topology.
The proposed LNAs avoid the use of bulky inductors, which leads to area and cost savings. Prototypes are implemented in IBM 90 nm CMOS technology for the two LNAs. The first LNA covers the frequency range of 100 MHz to 1.77 GHz consuming 2.8 mW from a 2 V supply. Measurements show a gain of 23 dB with a 3-dB bandwidth of 1.76 GHz. The minimum NF is 1.85 dB while the input return loss is greater than 10 dB across the entire band. The second LNA covers the frequency range of 100 MHz to 1.6 GHz. A 6 dBm third-order input intercept point, IIP3, is measured at the maximum gain frequency. The core consumes low power of 1.55 mW using a 1.8 V supply. The measured voltage gain is 15.5 dB with a 3-dB bandwidth of 1.6 GHz. The LNA has a minimum NF of 3 dB across the whole band while achieving an input return loss greater than 12 dB. Finally, a CMOS single supply operational transconductance amplifier (OTA) is reported. It has high power supply rejection capabilities over the entire gain bandwidth (GBW). The OTA is fabricated on the AMI 0.5 µm CMOS process. Measurements show a power supply rejection ratio (PSRR) of 120 dB up to 10 kHz. At 10 MHz, PSRR is 40 dB. The high performance PSRR is achieved using a high impedance current source and two noise reduction techniques. The OTA offers a very low current consumption of 25 µA from a 3.3 V supply.

Item Low power architecture and circuit techniques for high boost wideband Gm-C filters (Texas A&M University, 2007-09-17) Gambhir, Manisha
With the current trend towards integration and higher data rates, read channel design needs to incorporate significant boost for a wider signal bandwidth. This dissertation explores the analog design problems associated with design of such 'Equalizing Filter' (boost filter) for read channel applications. Specifically, a 330 MHz, 5th-order Gm-C continuous time lowpass filter with 24 dB boost is designed. Existing architectures are found to be unsuitable for low power, wideband and high boost operation.
The proposed solution realizes boosting zeros by efficiently combining available transfer functions associated with all nodes of cascaded biquad cells. Further, circuit techniques suitable for high frequency filter design are elaborated such as: application of the Gilbert cell as a variable transconductor and a new Common-Mode-Feedback (CMFB) error amplifier that improves common mode accuracy without compromising on bandwidth or circuit complexity. A prototype is fabricated in a standard 0.35 µm CMOS process. Experimental results show -41 dB of IM3 for 250 mV peak to peak swing with 8.6 mW/pole of power dissipation.

Item Low Power High Efficiency Integrated Class-D Amplifier Circuits for Mobile Devices (2015-01-12) Colli-Menchi, Adrian Israel
The consumer's demand for state-of-the-art multimedia devices such as smart phones and tablet computers has forced manufacturers to provide more system features to compete for a larger portion of the market share. The added features increase the power consumption and heat dissipation of integrated circuits, depleting the battery charge faster. Therefore, low-power high-efficiency circuits, such as the class-D audio amplifier, are needed to reduce heat dissipation and extend battery life in mobile devices. This dissertation focuses on new design techniques to create high performance class-D audio amplifiers that have low power consumption and occupy less space. The first part of this dissertation introduces the research motivation and fundamentals of audio amplification. The loudspeaker's operation and main audio performance metrics are examined to explain the limitations in the amplification process. Moreover, the operating principle and design procedure of the main class-D amplifier architectures are reviewed to provide the performance tradeoffs involved. The second part of this dissertation presents two new circuit designs to improve the audio performance, power consumption, and efficiency of standard class-D audio amplifiers.
The first work proposes a feed-forward power-supply noise cancellation technique for single-ended class-D amplifier architectures to improve the power-supply rejection ratio across the entire audio frequency range. The design methodology, implementation, and tradeoffs of the proposed technique are clearly delineated to demonstrate its simplicity and effectiveness. The second work introduces a new class-D output stage design for piezoelectric speakers. The proposed design uses stacked-cascode thick-oxide CMOS transistors at the output stage, which makes it possible to handle high voltages in a low voltage standard CMOS technology. The design tradeoffs in efficiency, linearity, and electromagnetic interference are discussed. Finally, the open problems in audio amplification for mobile devices are discussed to delineate the possible future work to improve the performance of class-D amplifiers. For all the presented works, proof-of-concept prototypes are fabricated, and the measured results are used to verify the correct operation of the proposed solutions.

Item Modeling and synthesis of approximate digital circuits (2014-12) Miao, Jin; Orshansky, Michael; Gerstlauer, Andreas, 1970-
Energy minimization has become an ever more important concern in the design of very large scale integrated circuits (VLSI). In recent years, approximate computing, which is based on the idea of trading off computational accuracy for improved energy efficiency, has attracted significant attention. Applications that are both compute-intensive and error-tolerant are most suitable to adopt approximation strategies. This includes digital signal processing, data mining, machine learning or search algorithms. Such approximations can be achieved at several design levels, ranging from software, algorithm and architecture, down to logic or transistor levels.
This dissertation investigates two research threads for the derivation of approximate digital circuits at the logic level: 1) modeling and synthesis of fundamental arithmetic building blocks; 2) automated techniques for synthesizing arbitrary approximate logic circuits under general error specifications. The first thread investigates elementary arithmetic blocks, such as adders and multipliers, which are at the core of all data processing and often consume most of the energy in a circuit. An optimal strategy is developed to reduce energy consumption in timing-starved adders under voltage over-scaling. This allows a formal demonstration that, under quadratic error measures prevalent in signal processing applications, an adder design strategy that separates the most significant bits (MSBs) from the least significant bits (LSBs) is optimal. An optimal conditional bounding (CB) logic is further proposed for the LSBs, which selectively compensates for the occurrence of errors in the MSB part. There is a rich design space of optimal adders defined by different CB solutions. The other thread considers the problem of approximate logic synthesis (ALS) in two-level form. ALS is concerned with formally synthesizing a minimum-cost approximate Boolean function, whose behavior deviates from a specified exact Boolean function in a well-constrained manner. It is established that the ALS problem un-constrained by the frequency of errors is isomorphic to a Boolean relation (BR) minimization problem, and hence can be efficiently solved by existing BR minimizers. An efficient heuristic is further developed which iteratively refines the magnitude-constrained solution to arrive at a two-level representation also satisfying error frequency constraints. To extend the two-level solution into an approach for multi-level approximate logic synthesis (MALS), Boolean network simplifications allowed by external don't cares (EXDCs) are used. 
The key contribution is in finding non-trivial EXDCs that can maximally approach the external BR and, when applied to the Boolean network, solve the MALS problem constrained by magnitude only. The algorithm then ensures compliance to error frequency constraints by recovering the correct outputs on the sought number of error-producing inputs while aiming to minimize the network cost increase. Experiments have demonstrated the effectiveness of the proposed techniques in deriving approximate circuits. The approximate adders can save up to 60% energy compared to exact adders for a reasonable accuracy. When used in larger systems implementing image-processing algorithms, energy savings of 40% are possible. The logic synthesis approaches generally can produce approximate Boolean functions or networks with complexity reductions ranging from 30% to 50% under small error constraints.

Item Modeling and synthesis of quality-energy optimal approximate adders (2012-12) Miao, Jin; Gerstlauer, Andreas, 1970-; Orshansky, Michael
Recent interest in approximate computation is driven by its potential to achieve large energy savings. We formally demonstrate an optimal way to reduce energy via voltage over-scaling at the cost of errors due to timing starvation in addition. A fundamental trade-off between error frequency and error magnitude in a timing-starved adder has been identified. We introduce a formal model to prove that for signal processing applications using a quadratic signal-to-noise ratio error measure, reducing bit-wise error frequency is sub-optimal. Instead, energy-optimal approximate addition requires limiting maximum error magnitude. Intriguingly, due to possible error patterns, this is achieved by reducing carry chains significantly below what is allowed by the timing budget for a large fraction of sum bits, using an aligned, fixed internal-carry structure for higher significance bits.
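The difference between error frequency and error magnitude can be made concrete with a small experiment. The sketch below is a generic carry-cut adder, not the design or the formal model from the thesis: it simply drops the carry that would cross a chosen bit position and then measures, exhaustively over all 8-bit operand pairs, how often the result is wrong, how large the worst error is, and the mean squared error. The 8-bit width and the cut positions are assumptions picked for illustration.

```python
def carry_cut_add(a, b, cut):
    """Add two unsigned ints, dropping the carry from bit cut-1 into bit cut."""
    mask_lo = (1 << cut) - 1
    lo = (a & mask_lo) + (b & mask_lo)       # low part; its carry-out is discarded
    hi = (a >> cut) + (b >> cut)             # high part added exactly, carry-in = 0
    return (hi << cut) | (lo & mask_lo)


def error_stats(cut, width=8):
    """Return (error rate, max |error|, mean squared error) over all operand pairs."""
    n = 1 << width
    errors = [(a + b) - carry_cut_add(a, b, cut) for a in range(n) for b in range(n)]
    rate = sum(e != 0 for e in errors) / len(errors)
    worst = max(abs(e) for e in errors)
    mse = sum(e * e for e in errors) / len(errors)
    return rate, worst, mse


if __name__ == "__main__":
    for cut in (2, 4, 6):
        rate, worst, mse = error_stats(cut)
        print(f"cut at bit {cut}: error rate {rate:.3f}, max error {worst}, MSE {mse:.1f}")
```

Moving the cut from bit 2 to bit 4 changes the error rate only slightly (0.375 to about 0.469) while the worst-case error grows from 4 to 16 and the mean squared error from 6 to 120, which is why a quadratic quality measure is dominated by error magnitude rather than error frequency.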
We further demonstrate that remaining approximation error is reduced by realization of conditional bounding (CB) logic for lower significance bits. A key contribution is the formalization of an approximate CB logic synthesis problem that produces a rich space of Pareto-optimal adders with a range of quality-energy trade-offs. We show how CB logic can be customized to result in over- and under-estimating approximate adders, and how a dithering adder that mixes them produces zero-centered error distributions, and, in accumulation, a reduced-variance error. This work demonstrates synthesized approximate adders with energy up to 60% smaller than that of a conventional timing-starved adder, where a 30% reduction is due to the superior synthesis of inexact CB logic. When used in a larger system implementing an image-processing algorithm, energy savings of 40% are possible.

Item Software optimization for power consumption in DSP embedded systems (2012-05) Temple, Andrew Richard; Julien, Christine
This paper is intended to be a resource for programmers needing to optimize a DSP’s power consumption strictly through software. The paper provides a basic introduction to power consumption background and measurement techniques, and then goes into the details of power optimization, focusing on three main areas: algorithmic optimization, taking advantage of hardware features (low power modes, clock control, and voltage control), and data flow optimization, with a discussion of the functionality and power considerations when using fast SRAM type memories (common for cache) and DDR SDRAM. This work includes examples and results as tested on Freescale’s current state-of-the-art Digital Signal Processors.