Presentation Title

Studying Evolutionary Patterns in Cancer Risk Genes with Computational Tools to Create Sequence Alignments

Project Collaborators

Marc Greenblatt (Research Mentor), Alexander Karabachev (Medical Student Mentor)

Abstract

Background

An important aspect of cancer research is identifying inherited genetic variants that increase cancer susceptibility, so that action can be taken to improve cancer outcomes. The pathogenicity of genetic variants can be predicted computationally by aligning protein sequences from multiple species (protein multiple sequence alignments, PMSAs). Patterns of evolutionary conservation help indicate which variants are important for human disease.
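As a rough illustration of how conservation can be read from an alignment, the sketch below scores one alignment column by the fraction of species that share its most common residue. The function name, the scoring rule, and the toy sequences are illustrative assumptions, not the measure or data used in this project.

```python
from collections import Counter


def column_conservation(alignment, column):
    """Fraction of non-gap sequences sharing the most common residue in a column.

    `alignment` maps a species name to its aligned sequence; all sequences
    are assumed to have equal length, with '-' marking a gap.
    """
    residues = [seq[column] for seq in alignment.values() if seq[column] != "-"]
    if not residues:
        return 0.0  # column contains only gaps
    most_common_count = Counter(residues).most_common(1)[0][1]
    return most_common_count / len(residues)


# Toy alignment with made-up sequences (hypothetical, not real gene data).
toy_alignment = {
    "human": "MKRLA",
    "mouse": "MKRIA",
    "chicken": "MKR-A",
    "zebrafish": "MQRVA",
}

scores = [column_conservation(toy_alignment, i) for i in range(5)]
```

Under this simple scoring, a variant at a column scoring near 1.0 falls at a position that has changed little across species, which is one kind of evidence that the position may be functionally important.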

Objective

Use computational tools to assess PMSA quality and to investigate evolutionary patterns that help distinguish pathogenic from benign variants. We hypothesize that current alignment tools cannot be fully automated to produce high-quality alignments at large scale.

Methods

We used several computational tools: the National Center for Biotechnology Information (NCBI) BLAST tool to gather sequences from multiple species, the NCBI ClinVar database to identify variants, Clustal Omega to create PMSAs, and PHYLIP Protpars to determine evolutionary variation. We chose 32 genes associated with hereditary cancers, counted pathogenic variants, and measured substitutions per site (a measure of conservation), PMSA gaps, and insertions.
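The gap measurement described above could be sketched as follows: find runs of gap characters in one aligned sequence and bin them by length. The 5-100 amino acid range for small gaps comes from this abstract. Treating runs longer than 100 as large, ignoring runs shorter than 5, and the function names and toy sequence are all assumptions made for illustration.

```python
import re

# Small-gap range (in amino acids) as stated in the abstract.
SMALL_MIN, SMALL_MAX = 5, 100
# Assumption: runs longer than SMALL_MAX count as "large",
# and runs shorter than SMALL_MIN are not counted as gaps of interest.


def gap_runs(aligned_seq):
    """Return (start_index, length) for each run of '-' in an aligned sequence."""
    return [(m.start(), len(m.group())) for m in re.finditer(r"-+", aligned_seq)]


def classify_gaps(aligned_seq):
    """Count gap runs in one aligned sequence by size class."""
    counts = {"ignored": 0, "small": 0, "large": 0}
    for _, length in gap_runs(aligned_seq):
        if length < SMALL_MIN:
            counts["ignored"] += 1
        elif length <= SMALL_MAX:
            counts["small"] += 1
        else:
            counts["large"] += 1
    return counts


# Toy aligned row (hypothetical): gap runs of length 3, 12, and 150.
toy_row = "MK" + "-" * 3 + "RLA" + "-" * 12 + "GT" + "-" * 150 + "W"
runs = gap_runs(toy_row)
summary = classify_gaps(toy_row)
```

In a real PMSA, a gap run in a non-human row would correspond to a gap relative to the human sequence, while a gap run in the human row would correspond to an insertion in other species. A count like this only flags candidate regions; it does not replace the manual inspection the project found necessary.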

Results

Of the 32 genes, 94% had small gaps (5-100 amino acids), 100% had small insertions, and 87.5% had large gaps and/or insertions. Most alignments required significant manual adjustment.

Conclusions

While existing automated programs are very helpful for creating PMSAs, the process remains labor-intensive and cannot be fully automated.

Primary Faculty Mentor Name

Marc Greenblatt

Graduate Student Mentors

Alexander Karabachev

Status

Undergraduate

Student College

College of Arts and Sciences

Program/Major

Biological Science

Primary Research Category

Health Sciences

Second College (optional)

Honors College

Secondary Research Category

Biological Sciences
