Autocorrelation Genetic Syntax of Eukaryotic Protein-Coding Sequences


Using the mathematical approach of autocorrelation on the protein-coding regions of numerous organisms, Arquès and Michel revealed theoretical sequence signals they call circular codes. Running autocorrelation on a pool of protein-coding sequences from multiple organisms allowed Arquès and Michel to generate universal circular codes for both eukaryotic and prokaryotic organisms. Later work using a simplified approach to autocorrelation has shown that circular codes can vary greatly from one organism to another as well as from the previously defined universal circular code for prokaryotes. Arquès and Michel have spent the last two decades trying to understand the significance of this sequence signal. This work offers a novel perspective on the sequence signal that autocorrelation reveals. Using autocorrelation on the protein-coding sequences from individual eukaryotic organisms, we analyze the sequence signal found within each organism. We find that in vivo these sequence signals vary from the theoretical eukaryotic circular code proposed by Arquès and Michel and lack the properties that are required to define them as circular codes. Our work provides a new approach to the analysis of these individual sequence signals. Using this approach, we are able to visualize how these sequence signals are unique to each organism and might evolve along with an organism's genomic sequence. As we look more closely at individual organisms, we find that these sequence signals may be driven by their correlation with the cellular tRNA levels of that organism. We also show that these sequence signals correlate with protein expression and may play a role in the recognition of translation initiation start sites.

    Item Description
    Name(s): Thesis advisor: Weir, Michael P.; Thesis advisor: Rice, Michael
    Date: April 01, 2013
    Extent: 163 pages
    Language: eng
    Physical Form: electronic
    Rights and Use: In Copyright – Non-Commercial Use Permitted
    PID: ir:2246