Division of Biostatistics and Bioinformatics
Research Achievements 2005
The research activities and results for 2004 and the first half of 2005 are described below:
(1) Statistical methodological research
(i) Genetic statistics
Some nonparametric methods in genetic statistics are proposed to obtain more robust results. Introducing appropriate weight functions can improve the efficiency of these statistical methods.
- (a) An asymptotic theory for the nonparametric maximum likelihood estimator in the Cox-gene model (accepted by Bernoulli)
- (b) Multipoint linkage analysis under linkage disequilibrium (in revision)
Abstract
This paper presents a new Monte Carlo approach to calculating the conditional probability of inheritance patterns given sibship genotype data, assuming that parental-level population haplotype frequencies are available. By limiting the study to sibships, we propose a linkage analysis method that allows linkage disequilibrium among the relevant genetic loci, can incorporate a general crossover process model, and is computationally feasible. The crux of this approach is a systematic way to introduce probability distributions on the space of legal ordered parental genotypes consistent with the sibship genotype and a given inheritance pattern, and on the space of legal inheritance patterns consistent with the sibship genotype, so that importance sampling techniques can be applied to calculate the relevant probabilities. Using simulated data, we examine the accuracy of our method in calculating the conditional probability of IBD sharing for sib-pairs given the sibship genotype data. Our method appears to perform very well and to outperform GENEHUNTER when there is linkage disequilibrium, the parental-level population haplotype frequencies are available, and the importance sampling sample sizes are large enough.
- (c) An alternative conditional approach aiming to enhance the chance of detecting linkage by incorporating linkage evidence from other regions (Genetic Epidemiology, 2004)
- (d) Heterogeneity in Human Crossover Interference (in revision)
Abstract
We examine heterogeneity in human crossover interference based on Icelandic family genotype data. Under minimal assumptions, we first reconstruct the crossover points on the meiotic products and then calculate coincidence coefficients from the reconstructed crossover points. Our results indicate that the level of crossover interference seems to differ between the sexes, with interference for male chromosomes appearing stronger. Within a given sex, the level of crossover interference seems to be similar across chromosomes, except for the shorter ones. For female chromosomes, interference in the centromeric region is weak and does not differ from the overall interference on the whole chromosome; for certain male chromosomes, interference in the centromeric region does seem to exist, although it is not as strong as the overall interference. We also found that intra-chromosome interference heterogeneity exists for every chromosome, except possibly the shorter ones, and that this heterogeneity is stronger for male chromosomes. Finally, when sex-specific genetic distance is used in the coincidence coefficient, the level of interference for female chromosomes does not seem to differ from that for male chromosomes.
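To make the coincidence coefficient in this abstract concrete, here is a minimal sketch. It uses the standard definition: the observed frequency of meiotic products with a crossover in both of two disjoint intervals, divided by the frequency expected if crossovers in the two intervals were independent. The function name, the interval encoding, and the toy data are assumptions for illustration only; they are not taken from the paper or from the Icelandic data.

```python
def coincidence_coefficient(meioses, interval_a, interval_b):
    """Coincidence coefficient for two disjoint chromosome intervals.

    meioses: list of lists of crossover positions, one list per meiotic product
    interval_a, interval_b: (start, end) half-open intervals in the same units
    """
    def has_crossover(positions, interval):
        start, end = interval
        return any(start <= x < end for x in positions)

    n = len(meioses)
    in_a = sum(has_crossover(m, interval_a) for m in meioses)
    in_b = sum(has_crossover(m, interval_b) for m in meioses)
    in_both = sum(
        has_crossover(m, interval_a) and has_crossover(m, interval_b)
        for m in meioses
    )
    if in_a == 0 or in_b == 0:
        raise ValueError("undefined when an interval contains no crossovers")
    # Observed double-crossover frequency over the product of the marginals
    return (in_both / n) / ((in_a / n) * (in_b / n))


# Hypothetical toy data: crossover positions (arbitrary units) per meiotic product
toy_meioses = [[10.0], [60.0], [10.0, 60.0], [], [12.0], [55.0], [], []]
c = coincidence_coefficient(toy_meioses, (0.0, 50.0), (50.0, 100.0))
```

A value below 1 means that double crossovers are rarer than independence would predict (positive interference); a value near 1 suggests little or no interference between the two intervals.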
- (e) A multipoint linkage disequilibrium mapping method based on the conventional case-control design (submitted)
Abstract
This method builds on a representation showing that the difference between cases and controls in the probability of carrying the target allele at a marker is proportional to the corresponding difference at the trait locus, and that the proportionality factor is simply a measure of LD between the marker and the trait locus. Our method has the following desirable properties: (i) there is no need to specify the phases of genotypic data with multiple markers; (ii) it provides an estimate of the disease locus, along with its sampling uncertainty, to help investigators narrow down chromosomal regions; and (iii) a single test statistic is provided to test for LD in the framed region, rather than testing the hypothesis one marker at a time.
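The proportionality described in this abstract can be checked numerically in a simplified setting. The sketch below is not the authors' derivation. It assumes a biallelic marker and trait locus, works at the haplotype level, and assumes that the marker allele is independent of disease status once the trait allele is given. All frequencies and names are hypothetical.

```python
def marker_difference(p_md, p_m, p_d, pd_case, pd_control):
    """Return (direct, factored) case-control differences in P(marker allele M).

    p_md: population frequency of haplotypes carrying both M and trait allele D
    p_m, p_d: population frequencies of M and D
    pd_case, pd_control: probability of carrying D among cases / controls
    """
    pm_given_d = p_md / p_d                     # P(M | D)
    pm_given_not_d = (p_m - p_md) / (1 - p_d)   # P(M | not D)

    # Direct computation: condition on the trait allele in each group
    pm_case = pm_given_d * pd_case + pm_given_not_d * (1 - pd_case)
    pm_control = pm_given_d * pd_control + pm_given_not_d * (1 - pd_control)
    direct = pm_case - pm_control

    # Factored form: an LD-based factor times the difference at the trait locus
    ld = p_md - p_m * p_d                       # classical LD coefficient
    factor = ld / (p_d * (1 - p_d))
    factored = factor * (pd_case - pd_control)
    return direct, factored


# Hypothetical frequencies, chosen only to illustrate the identity
direct, factored = marker_difference(
    p_md=0.15, p_m=0.30, p_d=0.20, pd_case=0.50, pd_control=0.15
)
```

Under these assumptions the two quantities agree, and the factor vanishes exactly when the LD coefficient is zero, so a marker in no LD with the trait locus shows no case-control difference.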
(2) Survival analysis, analysis for family data
- (a) Bayesian survival analysis using Bernstein polynomials (Scand. J. of Statistics, accepted)
- (b) A semiparametric mixture model of cure time and failure time, with application to SARS data (in revision)
Abstract
We propose a semiparametric mixture model that extends the semiparametric cure model by incorporating a proportional hazards model for time-to-cure in addition to the proportional hazards model for the failure time. We also propose a self-consistency based algorithm for computing the nonparametric maximum likelihood estimates in this model. The development of this statistical method is motivated by studies of SARS (severe acute respiratory syndrome) epidemiology. SARS patients are kept in isolation until recovery or death. Because there is no known treatment or preventive measure, it is important to know the case fatality rate and the distributions of the admission-to-death and admission-to-discharge times, both for the study of transmission dynamics and for better planning of patient care capacity. The performance of this method is demonstrated in a simulation study and in the analysis of Taiwan SARS data.
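The mixture structure described in this abstract can be illustrated with a deliberately simplified, fully parametric log-likelihood. This is a sketch only. It replaces the paper's proportional hazards components with exponential distributions, ignores covariates, and does not implement the self-consistency algorithm. The names `p_death`, `rate_death`, and `rate_cure` are hypothetical.

```python
from math import exp, log


def mixture_loglik(records, p_death, rate_death, rate_cure):
    """Log-likelihood of a two-component mixture of death and discharge times.

    records: list of (time_since_admission, status) pairs, where status is
             "death", "discharge", or "censored" (still in hospital)
    p_death: case fatality rate (mixing proportion)
    rate_death, rate_cure: exponential rates for the two time distributions
    """
    total = 0.0
    for t, status in records:
        if status == "death":
            # Patient belongs to the failure component and died at time t
            total += log(p_death * rate_death * exp(-rate_death * t))
        elif status == "discharge":
            # Patient belongs to the cure component and was discharged at t
            total += log((1 - p_death) * rate_cure * exp(-rate_cure * t))
        elif status == "censored":
            # Outcome unknown: either component, with no event before time t
            total += log(
                p_death * exp(-rate_death * t)
                + (1 - p_death) * exp(-rate_cure * t)
            )
        else:
            raise ValueError(f"unknown status: {status}")
    return total


# Hypothetical records: days since admission and outcome
example = [(12.0, "death"), (20.0, "discharge"), (5.0, "censored")]
ll = mixture_loglik(example, p_death=0.2, rate_death=0.05, rate_cure=0.04)
```

The censored term shows why both time distributions matter for estimating the case fatality rate. A patient still in hospital contributes information about both components at once.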
(3) Statistical methodology research in the area of evaluation of pharmaceutical products
- (a) A two-stage design for bridging studies (to appear in Journal of Biopharmaceutical Statistics)
One current issue in the evaluation of bridging studies is that it involves a cross-study comparison, so bias can occur when internal validity is not ensured. A two-stage approach is therefore proposed to overcome the issue of internal validity. In particular, we use the regions as the stages: patients from the original region are enrolled first, and patients from the new region are enrolled subsequently. In other words, the bridging study in the new region is a second-stage sub-study of the whole trial, and patients for the bridging sub-study are enrolled only after the data obtained in the original region demonstrate a statistically significant positive treatment effect. Methods for determining the sample size for each region and the critical values at each stage are also proposed.
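The enrollment logic of the two-stage design can be sketched as a simple decision function. This is an illustration of the flow only. The critical values of 1.96 are placeholders rather than the values the paper derives, and applying the stage 2 criterion to a new-region statistic alone is an assumption of this sketch.

```python
def two_stage_bridging_decision(z_original, z_new=None,
                                crit_stage1=1.96, crit_stage2=1.96):
    """Return the decision reached in a two-stage bridging design.

    z_original: test statistic from the original region (stage 1)
    z_new: test statistic from the new region (stage 2), or None if the
           new-region patients have not been enrolled yet
    crit_stage1, crit_stage2: placeholder critical values (assumptions)
    """
    # Stage 1: the bridging sub-study starts only after a positive result
    if z_original <= crit_stage1:
        return "stop: no significant positive effect in the original region"
    if z_new is None:
        return "continue: enroll patients from the new region"
    # Stage 2: evaluate the new region within the same trial
    if z_new > crit_stage2:
        return "bridging supported: positive effect also shown at stage 2"
    return "bridging not supported at stage 2"


print(two_stage_bridging_decision(2.4))        # stage 1 passed, stage 2 pending
print(two_stage_bridging_decision(2.4, 2.1))   # both stages passed
```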
- (b) Bayesian non-inferiority approach to evaluation of bridging studies (to appear in Journal of Biopharmaceutical Statistics)
Suppose the test product has already been approved in the original region on the basis of its proven efficacy against a placebo control. If the data collected in the bridging study show that the efficacy of the test product in the new region is no worse than its efficacy in the original region by more than some clinically acceptable limit, then the efficacy observed in the bridging study can be claimed to be similar to that in the original region. This concept is referred to as similarity between the treatment effects in the new and original regions.
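One way to picture the non-inferiority criterion is as a posterior probability that the new-region effect is no worse than the original-region effect by more than the acceptable limit. The sketch below assumes independent normal posteriors for the two treatment effects. That choice, along with all names and numbers, is an assumption for illustration and not the model used in the paper.

```python
from math import erf, sqrt


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def prob_non_inferior(mean_new, sd_new, mean_orig, sd_orig, margin):
    """Posterior P(effect_new - effect_orig > -margin).

    Assumes independent normal posteriors with the given means and standard
    deviations; margin is the clinically acceptable limit (margin >= 0).
    """
    diff_mean = mean_new - mean_orig
    diff_sd = sqrt(sd_new ** 2 + sd_orig ** 2)
    return normal_cdf((diff_mean + margin) / diff_sd)


# Hypothetical posteriors: same mean effect, acceptable limit of 0.5
p_similar = prob_non_inferior(1.0, 0.3, 1.0, 0.4, margin=0.5)
```

Similarity would then be claimed when this probability exceeds a prespecified threshold, which this sketch leaves unspecified.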
- (c) Group sequential approach to evaluation of bridging studies (Journal of Biopharmaceutical Statistics)
We propose a group sequential method that incorporates the information from the foreign clinical data into the evaluation of the positive treatment effect observed in the new region within the same study. Within this framework, the regions are treated as the groups of a group sequential design. In other words, patients from the original region are enrolled into the study first, and patients from the new region are enrolled subsequently.
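A minimal sketch of how data from the two regions might be pooled when regions play the role of sequential groups: the cumulative statistic at the second look weights each region's z-statistic by the square root of its sample size, as in standard group sequential theory. Equal per-patient variance and allocation across regions is assumed, and neither the weighting nor the names are taken from the paper.

```python
from math import sqrt


def cumulative_z(z_original, n_original, z_new, n_new):
    """Cumulative z-statistic after the new region is added.

    z_original, z_new: region-wise z-statistics for the treatment effect
    n_original, n_new: region-wise sample sizes (information assumed
                       proportional to sample size)
    """
    numerator = sqrt(n_original) * z_original + sqrt(n_new) * z_new
    return numerator / sqrt(n_original + n_new)


# Hypothetical example: a large original region and a smaller new region
z_all = cumulative_z(z_original=2.0, n_original=300, z_new=1.0, n_new=100)
```

In a group sequential analysis this cumulative statistic would be compared with a stage-specific boundary, which is not reproduced here.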
(5) Statistical methodology research in the area of microarray data analysis
- (a) A Bayes Regression Approach to array-CGH Data (in revision)
Abstract
This paper develops a Bayes regression approach for the analysis of array-CGH data that utilizes not only the underlying spatial structure of the genomic alterations but also the observation that the noise associated with the ratio of the fluorescence intensities is larger when the intensities are smaller. We show that this Bayes regression approach is particularly suitable for the analysis of cDNA microarray-CGH data, which are generally noisier than data obtained using genomic clones. A simulation study and a real data analysis are included to illustrate this approach.