Date of Completion
Honors College Thesis
Microbiology and Molecular Genetics
Indra Neil Sarkar
Translational Bioinformatics, Comparative Genomics, Computational Biology, Bioinformatics, Alzheimer Disease, Phylogenomics
The characterization of complex diseases remains a great challenge for biomedical researchers due to the myriad interactions of genetic and environmental factors. Adaptation of phylogenomic techniques to increasingly available genomic data provides an evolutionary perspective that may elucidate important unknown features of complex diseases. Here an automated method is presented that leverages publicly available genomic data and phylogenomic techniques. The approach is tested with nine genes implicated in the development of Alzheimer Disease, a complex neurodegenerative syndrome.
The technique updates a previously described Perl script called “ASAP” and is implemented as a suite of Ruby scripts entitled “ASAP2.” It first compiles a list of sequence-similarity-based orthologues using PSI-BLAST and a recursive NCBI BLAST+ search strategy, then constructs maximum parsimony phylogenetic trees for each set of nucleotide and protein sequences. Finally, it calculates phylogenetic metrics (partitioned Bremer support values, combined branch scores, and Robinson-Foulds distance) to provide an empirical assessment of evolutionary conservation within a given genetic network.
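As an illustration of one of the metrics named above, the following is a minimal sketch of the Robinson-Foulds distance between two trees over the same taxa. It is not the ASAP2 code: it is written in Python rather than Ruby, and the nested-tuple tree encoding and the function names (`leaf_sets`, `splits`, `robinson_foulds`) are assumptions made for this example. The distance is taken as the number of non-trivial bipartitions present in one tree but not the other, with each tree treated as unrooted.

```python
def leaf_sets(tree, out):
    """Return the frozenset of leaves under `tree`; append every clade to `out`."""
    if isinstance(tree, str):  # a leaf is a taxon name
        return frozenset([tree])
    leaves = frozenset().union(*(leaf_sets(child, out) for child in tree))
    out.append(leaves)
    return leaves


def splits(tree):
    """Return (non-trivial bipartitions, full leaf set) for a nested-tuple tree."""
    clades = []
    all_leaves = leaf_sets(tree, clades)
    anchor = min(all_leaves)  # fixed reference taxon used to normalize each split
    result = set()
    for clade in clades:
        # Represent each bipartition by the side that excludes the anchor.
        side = all_leaves - clade if anchor in clade else clade
        # Skip trivial splits (a single leaf on one side, or an empty side).
        if 1 < len(side) < len(all_leaves) - 1:
            result.add(side)
    return result, all_leaves


def robinson_foulds(tree_a, tree_b):
    """Count bipartitions found in exactly one of the two trees."""
    splits_a, leaves_a = splits(tree_a)
    splits_b, leaves_b = splits(tree_b)
    if leaves_a != leaves_b:
        raise ValueError("trees must share the same set of taxa")
    return len(splits_a ^ splits_b)


# Hypothetical five-taxon trees, for illustration only.
t1 = (("A", "B"), ("C", ("D", "E")))
t2 = (("A", "C"), ("B", ("D", "E")))
print(robinson_foulds(t1, t2))
```

Because each split is normalized against a fixed reference taxon, two different rootings of the same unrooted topology give a distance of zero.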
This study demonstrates the potential for using automated simultaneous phylogenetic analysis to uncover previously unknown relationships among disease-associated genes that may not have been apparent using traditional, single-gene methods. Furthermore, the results provide the first integrated evolutionary history of an Alzheimer Disease gene network and identify potentially important co-evolutionary clustering around components of oxidative stress pathways.
Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 License.
Romano, Joseph D.; Tharp, William G.; and Sarkar, Indra Neil, "Exploring Complex Disease Gene Relationships Using Simultaneous Analysis" (2014). UVM Honors College Senior Theses. 35.
figure1_asap2_workflow.eps (5822 kB)
Figure 1: ASAP2 Workflow
figure2_AD_Nucleotide_Trees.eps (10493 kB)
Figure 2: AD Nucleotide Trees
figure3_AD_Protein_Trees.eps (10335 kB)
Figure 3: AD Protein Trees
figure4_nuc_pbsup_annotated.eps (2803 kB)
Figure 4: Nucleotide PBS Tree
figure5_prot_pbsup_annotated.eps (2792 kB)
Figure 5: Protein PBS Tree
figure6a_rf_nuc_pairs.eps (2576 kB)
Figure 6a: AD RF Nucleotide Pairs
figure6b_rf_prot_pairs.eps (1843 kB)
Figure 6b: AD RF Protein Pairs
figure7a_mtdna_prot.eps (1836 kB)
Figure 7a: mtDNA RF Protein Pairs
figure7b_mtdna_nuc.eps (1972 kB)
Figure 7b: mtDNA RF Nucleotide Pairs
figure7c_both_prot.eps (2244 kB)
Figure 7c: AD+mtDNA RF Protein Pairs
figure7d_both_nuc.eps (2343 kB)
Figure 7d: AD+mtDNA RF Nucleotide Pairs