Performance Analysis of the Caching Effect on Real-Time Packet Processing in a Multi-Threaded Processor
Caching has long proven to be a very effective technique for improving memory access speed and average performance in general-purpose processors. Based on real-world trace simulations, earlier research showed that caching can improve route-lookup and packet-classification performance in a Network Processor (NP). However, the existing studies did not take packet delay/loss constraints into account. As a result, how effective the caching technique is in dealing with traffic under stringent delay/loss constraints (as is the case for a router interface using an NP for packet processing) remains an open issue. In this thesis, we aim at addressing this issue through simulation studies based on a well-designed, lightweight simulator. We first demonstrate how such a simulator can be developed to allow effective performance analysis of a multi-threaded, single-core processor. We then apply this simulator to study the effect of caching on packet throughput under various delay/loss constraints. Our simulation studies indicate that the effectiveness of caching is sensitive to the actual delay/loss constraints. When the delay/loss constraint is loose, using a larger number of threads can effectively hide the memory latency, making caching less effective. Moreover, caching becomes even less effective when the delay/loss constraint is tight and the number of threads is relatively small. Finally, our simulations show that when the cache miss distribution is uniform, caching is effective, improving throughput by 4.6% even when the miss ratio is as high as 27%.
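The latency-hiding argument above can be illustrated with a back-of-the-envelope model. The sketch below is not the thesis simulator; it is a minimal first-order approximation with hypothetical parameters (a fixed per-packet compute time, one memory access per packet, and a fixed miss penalty), showing why throughput on a fine-grained multi-threaded core becomes insensitive to the cache miss ratio once enough threads are available to cover memory stalls:

```python
def throughput(threads, miss_ratio, compute=10, miss_penalty=100):
    """Packets per cycle for a single-core, fine-grained multi-threaded
    processor, under a simplified first-order model (illustrative only).

    Each packet needs `compute` cycles of processing plus, on a cache
    miss (probability `miss_ratio`), a `miss_penalty`-cycle memory stall.
    While one thread stalls, the core runs other threads, so the core
    stays fully utilized as long as the compute work of all `threads`
    contexts covers the average per-packet latency.
    """
    avg_latency = compute + miss_ratio * miss_penalty
    utilization = min(1.0, threads * compute / avg_latency)
    return utilization / compute  # packets completed per cycle

# With many threads, latency is hidden and caching barely matters:
# throughput(8, 0.27) equals throughput(8, 0.0) in this model.
# With one thread, lowering the miss ratio (i.e., caching) pays off:
# throughput(1, 0.05) > throughput(1, 0.27).
```

This model ignores the delay/loss constraints that are central to the thesis: it only captures steady-state throughput, whereas a tight per-packet deadline can drop packets that a purely throughput-oriented model would count as served.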