Open source software maturity model based on linear regression and Bayesian analysis

Zhang, Dongmin

Open source software maturity model based on linear regression and Bayesian analysis

Date

2009-05-15

Authors

Zhang, Dongmin

Abstract

Open Source Software (OSS) is widely used and is becoming a significant and irreplaceable part of the software engineering community. Today a huge number of OSS exist. This becomes a problem if one needs to choose from such a large pool of OSS candidates in the same category. An OSS maturity model that facilitates the software assessment and helps users to make a decision is needed. A few maturity models have been proposed in the past. However, the parameters in the model are assigned not based on experimental data but on human experiences, feelings and judgments. These models are subjective and can provide only limited guidance for the users at the best. This dissertation has proposed a quantitative and objective model which is built from the statistical perspective. In this model, seven metrics are chosen as criteria for OSS evaluation. A linear multiple-regression model is created to assign a final score based on these seven metrics. This final score provides a convenient and objective way for the users to make a decision. The coefficients in the linear multiple-regression model are calculated from 43 OSS. From the statistical perspective, these coefficients are considered random variables. The joint distribution of the coefficients is discussed based on Bayesian statistics. More importantly, an updating rule is established through Bayesian analysis to improve the joint distribution, and thus the objectivity of the coefficients in the linear multiple-regression model, according to new incoming data. The updating rule provides the model the ability to learn and improve itself continually.