A Bioinformatics Approach to Find Mouse DNA Repeats Significant in Aggressive Colon Cancer

Saturday, February 13, 2016
Nitya Bhaskaran, San Diego State University, San Diego, CA
Background: Colorectal cancer (CRC) is a global health issue with a significant racial disparity between African Americans (AA) and Caucasian Americans (CA). Elevated Microsatellite Alterations at Selected Tetranucleotide Repeats (EMAST) is a biomarker of aggressive CRC that is induced by inflammation and characterized by insertions/deletions of tetranucleotide repeats in noncoding DNA. EMAST has been found more frequently in CRC from AA patients. The lack of an animal model of EMAST has severely restricted the study of EMAST and of how it relates to aggressive CRC. A major difficulty in establishing a mouse model of EMAST is finding repeat sequences that are likely to be unstable. Thus, our goal was to use bioinformatic tools to find potential EMAST sequences in the mouse.
Methods: Published human EMAST sequences were analyzed, and a set of requirements, including repeat length and type, was gathered to design a Python program. Flanking nucleotides from human EMAST sequences were analyzed using the Multiple EM for Motif Elicitation (MEME) tool to discover motifs. These motifs were added to the program parameters. Mouse sequences found using our program were then checked for homology to human EMAST loci using ClustalW, and those that showed >90% homology were analyzed further using the ENSEMBL database. ENSEMBL allowed us to sort potential mouse EMAST sequences by previous evidence of instability, using the rationale that sequences that are unstable over time might be targets for EMAST. Primers were designed for sequences of interest, and PCR was performed on normal mouse tissue.
Results: Novel motifs were discovered by analyzing human EMAST loci. These motifs allowed us to produce a workable list of sequences, which was then refined to 15 by selecting those with the greatest homology to human EMAST sequences and with previous evidence of instability. These 15 sequences were amplified by PCR from normal mouse colon DNA. Sequencing of the PCR products has confirmed that we can amplify the correct sequences. PCR of tumor DNA from a colon cancer mouse model is currently in progress to determine whether EMAST can be detected.
Conclusions: Motifs that are significant in human EMAST loci have been identified for the first time. These motifs have been used in designing a unique Python program to discover potential species-specific EMAST sequences. Mouse sequences found using our program have been confirmed using PCR. Our results have identified mouse tetranucleotide repeat sequences that have the potential to display EMAST.
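The abstract does not reproduce the Python program itself, so the following is only a minimal sketch of the repeat-finding step described in the Methods, not the authors' code. The function name `find_tetranucleotide_repeats` and the default threshold `min_units=6` are hypothetical choices made for illustration; the actual repeat-length requirement and the flanking-motif filter derived from MEME are not specified here and are not modeled.

```python
import re


def find_tetranucleotide_repeats(sequence, min_units=6):
    """Return (start, motif, unit_count) for each tandem 4-nt repeat run.

    min_units is an assumed threshold, not a value taken from the abstract.
    """
    sequence = sequence.upper()
    # One 4-base unit followed by at least (min_units - 1) further copies.
    pattern = re.compile(r"([ACGT]{4})\1{%d,}" % (min_units - 1))
    hits = []
    for match in pattern.finditer(sequence):
        motif = match.group(1)
        # Skip units that are really mono- or dinucleotide repeats
        # (for example AAAA or ATAT), which are not tetranucleotide repeats.
        if motif[:2] == motif[2:]:
            continue
        unit_count = len(match.group(0)) // 4
        hits.append((match.start(), motif, unit_count))
    return hits


if __name__ == "__main__":
    # Toy input: seven GATA units flanked by unrelated bases.
    example = "CC" + "GATA" * 7 + "TTC"
    for start, motif, count in find_tetranucleotide_repeats(example):
        print(start, motif, count)
```

In a fuller pipeline, each hit would presumably be extended with its flanking nucleotides so that the candidates could be screened against the discovered motifs and then aligned to human EMAST loci; those steps depend on details the abstract does not give.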