Saturday, February 18, 2017
Exhibit Hall (Hynes Convention Center)
Zeinab Chahine, University of California, Irvine, South Gate, CA
Here I examine copy number variation (CNV) and assembly of the novel chimeric Sdic gene family across several genome assemblies of D. melanogaster. Sdic is a species-specific gene believed to have originated approximately 5.4 mya and to have become relevant in sperm competition. Notably, Sdic became duplicated in tandem, making it one of the most noticeable multigene family expansions in the genus Drosophila. However, analysis of the precise repeat number of this gene cluster has been hampered by the repetitive nature and high sequence similarity of the copies. Given the organizational features of the Sdic region, I hypothesize that unequal crossing-over has modified the number of Sdic copies between strains. Because some of the assemblies were obtained using PacBio sequencing technology, I also hypothesize that the number of Sdic copies in them might differ from that found in assemblies based on Sanger sequencing. In this study, I annotated and compared the Sdic gene cluster in 5 different assemblies of the Sdic region obtained from 2 different strains of D. melanogaster. The results show that the enhanced resolution of PacBio can uncover errors in current Sanger-based reference assemblies that lead to erroneous estimates of the number of tandem duplicates. In addition, when comparing high-quality assemblies obtained with PacBio, I find evidence of differences in the number of Sdic copies between strains, which can be explained by unequal crossing-over. Additional experiments using a completely different technology will help to validate these findings. The present work lays the foundations for the experimental and analytical procedures that should be used in the characterization of recently evolved gene clusters.