Modeling correlation in binary count data with application to fragile site identification
Hintze, Christopher Jerry
Available fragile site identification software packages (FSM and FSM3) assume that all chromosomal breaks occur independently. However, under a Mendelian model of inheritance, homozygosity at fragile loci implies pairwise correlation between homologous sites. We construct correlation models for chromosomal breakage data in situations where either partitioned break count totals (per-site single-break and double-break totals) are known or only overall break count totals are known. We derive a likelihood ratio test and Neyman's C(α) test for correlation between homologs when partitioned break count totals are known and outline a likelihood ratio test for correlation using only break count totals. Our simulation studies indicate that the C(α) test using partitioned break count totals outperforms the other two tests for correlation in terms of both power and level. These studies further suggest that the power for detecting correlation is low when only break count totals are reported. Results of the C(α) test for correlation applied to chromosomal breakage data from 14 human subjects indicate that detection of correlation between homologous fragile sites is problematic due to sparseness of breakage data. Simulation studies of the FSM and FSM3 algorithms using parameter values typical for fragile site data demonstrate that neither algorithm is significantly affected by fragile site correlation. Comparison of simulated fragile site misclassification rates in the presence of zero-breakage data supports previous studies (Olmsted 1999) that suggested FSM has lower false-negative rates and FSM3 has lower false-positive rates.
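To illustrate the idea of testing for correlation from partitioned break counts, the following is a minimal sketch for a single homologous pair. It is not the model or the test derived in the thesis. It assumes that each homolog breaks with the same probability p, that the two homologs have correlation rho, and that the number of cells with zero, one, and two breaks at the pair (n0, n1, n2) is known. Under those assumptions the independence hypothesis rho = 0 can be compared with the unrestricted trinomial by a likelihood ratio statistic with one degree of freedom. The function and variable names are hypothetical.

```python
import math


def pair_probs(p, rho):
    """Probabilities of 0, 1, and 2 breaks at a homologous pair.

    Assumed model (illustrative only): each homolog breaks with
    probability p, and the two break indicators have correlation rho.
    """
    q = 1.0 - p
    return (q * q + rho * p * q, 2.0 * p * q * (1.0 - rho), p * p + rho * p * q)


def lrt_correlation(n0, n1, n2):
    """Likelihood ratio test of rho = 0 from partitioned counts.

    n0, n1, n2 are the numbers of cells showing no break, a single
    break, and a double break at the pair. Returns the statistic, an
    asymptotic chi-square (1 df) p-value, and a point estimate of rho.
    """
    n = n0 + n1 + n2
    # MLE of p under independence: total breaks over total homologs.
    p_hat = (n1 + 2 * n2) / (2 * n)
    null = pair_probs(p_hat, 0.0)

    # The (p, rho) model is saturated for three categories, so the
    # unrestricted fit is given by the observed proportions.
    stat = 0.0
    for count, pi0 in zip((n0, n1, n2), null):
        if count > 0:
            stat += 2.0 * count * math.log(count / (n * pi0))
    stat = max(stat, 0.0)  # guard against rounding below zero

    # Survival function of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(stat / 2.0))

    # Estimate of rho from the single-break proportion; undefined
    # when no breaks (or only breaks) are observed.
    denom = 2.0 * n * p_hat * (1.0 - p_hat)
    rho_hat = 1.0 - n1 / denom if denom > 0 else float("nan")
    return stat, p_value, rho_hat


if __name__ == "__main__":
    # Made-up counts, chosen only to exercise the function.
    print(lrt_correlation(150, 100, 150))
```

With sparse data, where nearly all cells show no break at a site, n1 and n2 are close to zero and the statistic carries little information, which is in line with the low power reported above. The chi-square reference distribution is an asymptotic approximation and may be poor for such counts.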