Robust network compressive sensing

Chen, Yi-Chao, Ph. D.

Robust network compressive sensing

Date

2015-12

Authors

Chen, Yi-Chao, Ph. D.

Abstract

Networks are constantly generating an enormous amount of rich and diverse information. Such information creates exciting opportunities for network analytics and provides deep insights into the complex interactions among network entities. However, network analytics often faces the problem of (i) under-constraint, where there is too little data due to the feasibility/cost of collecting data; or (ii) over-constraint, where there is too much data so the analytics becomes unscalable. Compressive sensing is an effective technique to solve both problems. It leverages the underlying data structure for analysis. To address the under-constraint problem, we can apply compressive sensing to reconstruct missing elements or predict future data. To address the over-constraint problem, we can apply compressive sensing to identify important factors. Compressive sensing has many applications. In the thesis, we apply compressive sensing to missing data interpolation, anomaly detection, data segmentation, and activity recognition and show their benefit. To demonstrate the feasibility of compressive sensing in network analytics, we first apply it to detect anomalies in a customer care call dataset. Customer care call dataset is collected by a tier-1 ISP in US and includes the calls which are labeled as categories representing customers' problems. Customer care calls reveal the major events and problems observed by customers. We use a regression-based approach to find the relationship between calls and events. We show that compressive sensing is effective in identifying important factors and can leverage the low-rank structure and temporal stability of the data to improve the detection accuracy. While applying compressive sensing to the real-world data, we identify several challenges. One of the challenges is that real-world data are complicated and heterogeneous, and often violate the low-rank assumption required by existing compressive sensing techniques. Such violation significantly reduces the applicability and effectiveness of existing compressive sensing approaches. It is important to understand reasons behind the violation to design methods and mitigate the impact. Therefore, we analyze a wide range of real-world traces and our analysis reveals that there are different factors that contribute to the violation of low-rank property in real data. In particular, we find (i) noise, errors, anomalies, and (ii) the lack of synchronization in time and frequency-domain lead to network-induced blurring, and can easily cause a low-rank matrix to become a much higher rank. To address the problem of noise, errors, and anomalies, we present a robust compressive sensing technique. It explicitly account for anomalies by decomposing real-world data represented in the form of a matrix into a low-rank matrix, a sparse anomaly matrix, an error term, and a small noise matrix. To address the problem of the lack of synchronization, we present a data-driven synchronization algorithm. It removes misalignment while accounting for the time and frequency-domain heterogeneity in the real-world data. The data-driven synchronization can be applied to any compressive sensing technique and is general for any real-world trace. We show that the combination of two techniques can reduce the ranks of the real-world data, improve the effectiveness of compressive sensing, and have a wide range of applications.