Studying Evolutionary Patterns in Cancer Risk Genes with Computational Tools to Create Sequence Alignments
Dillon-Martin, Maeve
Abstract
Background: An important aspect of cancer research is identifying inherited genetic variants that increase cancer susceptibility, then acting on that knowledge to improve cancer outcomes. The pathogenicity of genetic variants can be predicted computationally by aligning protein sequences from multiple species (protein multiple sequence alignments, PMSAs). Patterns of evolutionary conservation help indicate which variants are important for human disease.

Objective: Use computational tools to assess PMSA quality and investigate evolutionary patterns that help distinguish pathogenic from benign variants. We hypothesize that current alignment tools cannot be fully automated to create good alignments at a large scale.

Methods: We used several computational tools: the National Center for Biotechnology Information (NCBI) BLAST tool to gather sequences from multiple species, the NCBI ClinVar database to identify variants, Clustal Omega to create PMSAs, and PHYLIP ProtPars to determine evolutionary variation. We chose 32 genes associated with hereditary cancers, counted pathogenic variants, and measured substitutions per site (a measure of conservation), PMSA gaps, and insertions.

Results: 94% of genes had small gaps (5-100 amino acids), 100% had small insertions, and 87.5% had large gaps and/or insertions. Most alignments needed significant manual adjustment.

Conclusions: While existing automated programs are very helpful for building PMSAs, the process remains labor-intensive and cannot be fully automated.
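As an illustration of the gap measurement described in the Methods, the following is a minimal Python sketch of how gap runs in one aligned sequence might be counted and sorted by size. It is not the procedure used in the thesis. The function names `gap_runs` and `classify_gaps` are hypothetical, the 5-100 residue range for small gaps is taken from the Results, and treating any run longer than 100 residues as large is an assumption.

```python
import re


def gap_runs(aligned_seq):
    """Return the lengths of consecutive '-' runs in one aligned sequence row."""
    return [len(match.group()) for match in re.finditer(r"-+", aligned_seq)]


def classify_gaps(aligned_seq, small_min=5, small_max=100):
    """Count small and large gap runs in one aligned sequence row.

    small_min and small_max follow the 5-100 amino acid range quoted in the
    abstract; counting runs longer than small_max as "large" is an assumption.
    Runs shorter than small_min are ignored.
    """
    runs = gap_runs(aligned_seq)
    small = sum(1 for length in runs if small_min <= length <= small_max)
    large = sum(1 for length in runs if length > small_max)
    return {"small": small, "large": large}


if __name__ == "__main__":
    # Toy aligned row: one 6-residue gap, one 120-residue gap, one 2-residue gap.
    row = "MK" + "-" * 6 + "LV" + "-" * 120 + "A--C"
    print(classify_gaps(row))
```

Applied to each row of a Clustal Omega alignment, a count like this could flag genes whose alignments need manual inspection. How gaps are distinguished from insertions (for example, relative to the human sequence) is not specified in the abstract and is left out of the sketch.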
Description
Undergraduate
Date
2021-01-01
