On two-sample data analysis by exponential model




Texas A&M University


We discuss two-sample problems and the implementation of a new two-sample data analysis procedure. The proposed procedure is based on the concepts of mid-distribution, design of score functions, components, comparison distribution, comparison density, and exponential model. Assume that we have a random sample X_1, ..., X_m from a continuous distribution F(y) = P(X_i ≤ y), i = 1, ..., m, and a random sample Y_1, ..., Y_n from a continuous distribution G(y) = P(Y_i ≤ y), i = 1, ..., n. Also assume that the two samples are independent. The two-sample problem tests the homogeneity of the two samples and can be stated formally as H0 : F = G. Statisticians have proposed a number of tests for the two-sample problem in various contexts. Two typical tests are the two-sample t-test and the Wilcoxon rank-sum test. However, because they test for differences in location, they extract less information from the data than a test of the homogeneity of the distribution functions does. Although the Kolmogorov-Smirnov and Anderson-Darling test statistics can be used to test H0 : F = G, they give no indication of the actual relation of F to G when H0 is rejected. Our goal is to learn why it was rejected. Our approach answers this question with graphical tools, which is its main feature. It is functional in the sense that the parameters to be estimated are probability density functions. Compared with other statistical tools for two-sample problems, such as the t-test or the Wilcoxon rank-sum test, density estimation gives a fuller understanding of the data, which is essential in data analysis. Our approach to density estimation also works with small sample sizes. Moreover, our methodology makes almost no assumptions about the two continuous distributions F and G; in that sense, it is nonparametric. It also supplies graphical elements for the two-sample problem, where few are typically available.
Furthermore, our procedure will help researchers conclude why two populations differ when H0 is rejected and describe the relation between F and G graphically.
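
To make the idea of a comparison density concrete, the following is a minimal sketch in Python and not the procedure proposed in the paper. It assumes one common formulation: each X observation is mapped to its mid-distribution value in the pooled sample, and a histogram of those values on [0, 1] serves as a crude comparison density estimate. Under H0 : F = G the histogram should be roughly flat at height 1, and departures from 1 indicate where the X sample is over- or under-represented. The function names, the bin count, the simulated normal data, and the use of a histogram in place of an exponential model are all illustrative assumptions.

```python
import random
from bisect import bisect_left, bisect_right


def pooled_mid_distribution(pooled_sorted, value):
    """Mid-distribution of the pooled sample: H(value) - 0.5 * P(pooled == value)."""
    n = len(pooled_sorted)
    below = bisect_left(pooled_sorted, value)        # observations strictly below value
    at_or_below = bisect_right(pooled_sorted, value)  # observations at or below value
    return (below + at_or_below) / (2 * n)


def comparison_density_histogram(x, y, bins=5):
    """Histogram estimate of the comparison density of the X sample vs. the pooled sample."""
    pooled = sorted(x + y)
    width = 1.0 / bins
    counts = [0] * bins
    for xi in x:
        u = pooled_mid_distribution(pooled, xi)  # u lies strictly between 0 and 1
        counts[min(int(u * bins), bins - 1)] += 1
    # Scale counts so that the histogram integrates to 1 over [0, 1].
    return [c / (len(x) * width) for c in counts]


# Hypothetical data: same location, different scale, so location tests see little.
random.seed(0)
x = [random.gauss(0.0, 1.0) for _ in range(500)]
y = [random.gauss(0.0, 3.0) for _ in range(500)]

d = comparison_density_histogram(x, y, bins=5)
for k, height in enumerate(d):
    print(f"u in [{k / 5:.1f}, {(k + 1) / 5:.1f}): d = {height:.2f}")
```

In this simulated case the heights should be well above 1 in the middle bin and well below 1 in the outer bins, which shows graphically that the X sample is more concentrated than the Y sample even though the two share a common center.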