Boosting Alternating Decision Trees Modeling of Disease Trait Information

DSpace/Manakin Repository

Boosting Alternating Decision Trees Modeling of Disease Trait Information

Citable link to this page


Title: Boosting Alternating Decision Trees Modeling of Disease Trait Information
Author: Wong, Stephen TC; Liu, Kuang-Yu; Lin, Jennifer Y.; Zhou, Xiaobo

Note: Order does not necessarily reflect citation order of authors.

Citation: Liu, Kuang-Yu, Jennifer Lin, Xiaobo Zhou, and Stephen T. C. Wong. 2005. Boosting alternating decision trees modeling of disease trait information. BMC Genetics 6(Suppl 1): S132.
Full Text & Related Files:
Abstract: We applied the alternating decision trees (ADTrees) method to the last 3 replicates from the Aipotu, Danacca, Karangar, and NYC populations in the Problem 2 simulated Genetic Analysis Workshop dataset. Using information from the 12 binary phenotypes and sex as input and Kofendrerd Personality Disorder disease status as the outcome of ADTrees-based classifiers, we obtained a new quantitative trait based on average prediction scores, which was then used for genome-wide quantitative trait linkage (QTL) analysis. ADTrees are machine learning methods that combine boosting and decision trees algorithms to generate smaller and easier-to-interpret classification rules. In this application, we compared four modeling strategies from the combinations of two boosting iterations (log or exponential loss functions) coupled with two choices of tree generation types (a full alternating decision tree or a classic boosting decision tree). These four different strategies were applied to the founders in each population to construct four classifiers, which were then applied to each study participant. To compute average prediction score for each subject with a specific trait profile, such a process was repeated with 10 runs of 10-fold cross validation, and standardized prediction scores obtained from the 10 runs were averaged and used in subsequent expectation-maximization Haseman-Elston QTL analyses (implemented in GENEHUNTER) with the approximate 900 SNPs in Hardy-Weinberg equilibrium provided for each population. Our QTL analyses on the basis of four models (a full alternating decision tree and a classic boosting decision tree paired with either log or exponential loss function) detected evidence for linkage (Z ≥ 1.96, p less than 0.01) on chromosomes 1, 3, 5, and 9. Moreover, using average iteration and abundance scores for the 12 phenotypes and sex as their relevancy measurements, we found all relevant phenotypes for all four populations except phenotype b for the Karangar population, with suggested subgroup structure consistent with latent traits used in the model. In conclusion, our findings suggest that the ADTrees method may offer a more accurate representation of the disease status that allows for better detection of linkage evidence.
Published Version: doi:10.1186/1471-2156-6-S1-S132
Other Sources:
Terms of Use: This article is made available under the terms and conditions applicable to Other Posted Material, as set forth at
Citable link to this page:
Downloads of this work:

Show full Dublin Core record

This item appears in the following Collection(s)


Search DASH

Advanced Search