Area and energy efficient VLSI architectures for low-density parity-check decoders using an on-the-fly computation

Gunnam, Kiran Kumar

Area and energy efficient VLSI architectures for low-density parity-check decoders using an on-the-fly computation

Date

2009-05-15

Authors

Gunnam, Kiran Kumar

Abstract

The VLSI implementation complexity of a low density parity check (LDPC) decoder is largely influenced by the interconnect and the storage requirements. This dissertation presents the decoder architectures for regular and irregular LDPC codes that provide substantial gains over existing academic and commercial implementations. Several structured properties of LDPC codes and decoding algorithms are observed and are used to construct hardware implementation with reduced processing complexity. The proposed architectures utilize an on-the-fly computation paradigm which permits scheduling of the computations in a way that the memory requirements and re-computations are reduced. Using this paradigm, the run-time configurable and multi-rate VLSI architectures for the rate compatible array LDPC codes and irregular block LDPC codes are designed. Rate compatible array codes are considered for DSL applications. Irregular block LDPC codes are proposed for IEEE 802.16e, IEEE 802.11n, and IEEE 802.20. When compared with a recent implementation of an 802.11n LDPC decoder, the proposed decoder reduces the logic complexity by 6.45x and memory complexity by 2x for a given data throughput. When compared to the latest reported multi-rate decoders, this decoder design has an area efficiency of around 5.5x and energy efficiency of 2.6x for a given data throughput. The numbers are normalized for a 180nm CMOS process. Properly designed array codes have low error floors and meet the requirements of magnetic channel and other applications which need several Gbps of data throughput. A high throughput and fixed code architecture for array LDPC codes has been designed. No modification to the code is performed as this can result in high error floors. This parallel decoder architecture has no routing congestion and is scalable for longer block lengths. When compared to the latest fixed code parallel decoders in the literature, this design has an area efficiency of around 36x and an energy efficiency of 3x for a given data throughput. Again, the numbers are normalized for a 180nm CMOS process. In summary, the design and analysis details of the proposed architectures are described in this dissertation. The results from the extensive simulation and VHDL verification on FPGA and ASIC design platforms are also presented.