Fig 1.

Relative synonymous codon usage (RSCU) analysis revealed an overrepresentation of A/U-ending codons across most of the Saccharomycotina subphylum.

Columns correspond to the 59 nondegenerate, non-stop codons; A/U-ending codons are shown in purple font, and G/C-ending codons are shown in green font. Rows correspond to the 327 Saccharomycotina species colored by major clade, following the recent genome-scale phylogeny of the subphylum [50]. Blue cells indicate overrepresented codons (RSCU > 1) and red cells indicate underrepresented codons (RSCU < 1). Hierarchical clustering by RSCU value grouped the codons into three general groups (shown by horizontal bars of different colors): underrepresented A/U-ending codons (grey bar), underrepresented codons mostly ending in G/C (red bar), and overrepresented codons mostly ending in A/U (blue bar).
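As a rough illustration of the RSCU thresholds used in this figure, the sketch below computes RSCU as the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The partial codon table, the function name `rscu`, and the counts are hypothetical and chosen only for illustration; this is not the authors' pipeline.

```python
# Hypothetical, partial synonymous-codon table (illustration only).
SYNONYMS = {
    "Ala": ["GCU", "GCC", "GCA", "GCG"],
    "Lys": ["AAA", "AAG"],
}

def rscu(codon_counts):
    """Return RSCU per codon: observed count / mean count of its synonymous family."""
    values = {}
    for codons in SYNONYMS.values():
        total = sum(codon_counts.get(c, 0) for c in codons)
        if total == 0:
            continue  # amino acid absent: RSCU is undefined, so skip it
        expected = total / len(codons)  # count under equal synonymous usage
        for c in codons:
            values[c] = codon_counts.get(c, 0) / expected
    return values

# Made-up counts for one genome.
counts = {"GCU": 40, "GCC": 20, "GCA": 30, "GCG": 10, "AAA": 30, "AAG": 10}
rscu_values = rscu(counts)
overrepresented = sorted(c for c, v in rscu_values.items() if v > 1)
```

With these made-up counts, the A/U-ending codons GCU, GCA, and AAA have RSCU > 1 and would appear as blue cells in a heatmap like the one described.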

Fig 2.

Differences in relative synonymous codon usage values between species are largely driven by variation in the usage of G/C- and A/U-ending codons.

The plot shows each of the 327 budding yeast species examined in this study along the first two dimensions (the X and Y axes) of a correspondence analysis. Each axis is labeled with the percent variance explained by the corresponding dimension and the codons that are the major drivers of the observed variance. The first dimension, which explains nearly 67% of the variation between species, is driven by the differential usage of G/C- versus A/U-ending codons. The second dimension, which differentiates the CUG-Ser1 clade, the CUG-Ser2 clade, and one Alloascoideaceae species from the rest of the species in the subphylum, explains a much smaller fraction of the observed variation (about 7%) and is primarily driven by differential usage of the CUA, CUG, UUG, and UUA codons in the two groups.

Fig 3.

The high correlation between codon usage and GC composition of the third codon position suggests that codon usage bias at the level of individual codons is likely driven by genetic drift.

The graph illustrates a phylogenetic generalized least squares comparison between relative synonymous codon usage values and third codon position GC composition (GC3) for each codon across the 327 budding yeast species. Colors toward the red spectrum indicate a positive correlation between G/C-ending codons and increasing GC3. Blue colors indicate a negative correlation between A/U-ending codons and increasing GC3. Grey cells denote the non-degenerate codons (encoding methionine or tryptophan) and stop codons.
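For readers unfamiliar with GC3, the following minimal sketch computes the fraction of third codon positions occupied by G or C in a single coding sequence. It is a simplification: it assumes an in-frame sequence and counts every codon, whereas a silent-site GC3 would typically exclude methionine, tryptophan, and stop codons. The function name `gc3` and the example sequence are hypothetical.

```python
def gc3(cds):
    """Fraction of third codon positions that are G or C (simplified, all codons counted)."""
    cds = cds.upper().replace("T", "U")  # accept DNA or RNA spelling
    usable = len(cds) - len(cds) % 3     # ignore a trailing partial codon
    third_bases = [cds[i + 2] for i in range(0, usable, 3)]
    if not third_bases:
        raise ValueError("sequence is shorter than one codon")
    return sum(base in "GC" for base in third_bases) / len(third_bases)

# Made-up sequence: AUG GCC AAA GCG UUA -> third bases G, C, A, G, A
example_gc3 = gc3("AUGGCCAAAGCGUUA")
```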

Fig 4.

The complex relationship between relative frequency and genome-wide average base composition of the third codon position (GC3) suggests that individual codons vary in their fit to the neutral expectation (i.e., that codon usage is solely driven by GC mutational bias and genetic drift).

The neutral expectations for the different codons were obtained from the models developed by Palidwor et al. [38]. A) Observed relative frequency of the alanine codon GCC (shown on the Y axis) plotted against GC3 (shown on the X axis) for each of the 327 budding yeast species analyzed in this study. The codon GCC had a good fit to the neutral expectation (black line, R-squared value = 0.671). B) Observed relative frequency of the arginine codon CGU plotted against GC3 composition for each species. The codon CGU had a poor fit to the neutral expectation (black line, R-squared value = -0.165); the same trend was also observed in the other Group-2 arginine codons (CGA and AGG). C) R-squared values for each of the codons (first column) and the sum of all codons for an amino acid (second column) compared to their neutral expectations. Boxes colored towards the red spectrum indicate a better fit to the neutral model, while boxes colored towards the blue spectrum indicate a poorer fit (i.e., worse than the mean) to the neutral model. Grey-colored boxes in the first column indicate non-degenerate amino acids or stop codons; grey boxes in the second column indicate codons that either have their own models (e.g., ATC) or have values that stem from the same model (e.g., all amino acids encoded by two codons, such as tyrosine (Y), which is encoded by TAT and TAC). Asterisks indicate codons with a Blomberg’s K variance over 1 when comparing GC3 and relative frequency, suggesting that the GC3 and relative frequency values for these codons are correlated due to phylogeny (i.e., closely related species tend to have more similar GC3 and relative frequency values due to shared ancestry).
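A negative R-squared value, such as the one reported for CGU, can arise when observations are scored against a fixed, pre-specified curve rather than a curve fitted to the data. The sketch below assumes the common definition R² = 1 − SS_res / SS_tot; whether the authors used exactly this definition is an assumption, and all numbers are invented for illustration.

```python
def r_squared(observed, predicted):
    """Coefficient of determination of observations against fixed model predictions."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))  # error of the model
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)              # error of the mean
    return 1.0 - ss_res / ss_tot

# Made-up relative codon frequencies for four species.
observed = [0.10, 0.20, 0.30, 0.40]
good_model = [0.12, 0.18, 0.31, 0.39]  # hypothetical expectation that tracks the data
poor_model = [0.40, 0.30, 0.20, 0.10]  # hypothetical expectation with the opposite trend

r2_good = r_squared(observed, good_model)
r2_poor = r_squared(observed, poor_model)
```

Under this definition, a value below zero means the fixed model predicts the observations worse than simply using their mean.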

Fig 5.

Comparison of the GC composition of the silent third codon position (GC3) suggests that individual codons vary in their fit to the neutral expectation (i.e., that codon usage is solely driven by GC mutational bias and genetic drift).

The neutral expectations for the different codons were obtained from the models developed by Palidwor et al. [38]. A) Observed relative frequency of the alanine codon GCC (shown on the Y axis) plotted against GC3 (shown on the X axis) for each of the 327 budding yeast species analyzed in this study. The codon GCC had a good fit to the neutral expectation (black line, R-squared value = 0.671). B) Observed relative frequency of the arginine codon CGU plotted against GC3 composition for each species. The codon CGU had a poor fit to the neutral expectation (black line, R-squared value = -0.165); the same trend was also observed in the other Group-2 arginine codons (CGA and AGG). C) R-squared values for each of the codons (first column) and the sum of all codons for an amino acid (second column) compared to their neutral expectations. Boxes colored towards the red spectrum indicate a better fit to the neutral model, while boxes colored towards the blue spectrum indicate a poorer fit (i.e., worse than the mean) to the neutral model. Grey-colored boxes in the first column indicate non-degenerate amino acids or stop codons; grey boxes in the second column indicate codons that either have their own models (e.g., ATC) or have values that stem from the same model (e.g., all amino acids encoded by two codons, such as tyrosine (Y), which is encoded by TAT and TAC). Asterisks indicate codons with a Blomberg’s K variance over 1 when comparing GC3 and relative frequency, suggesting that the GC3 and relative frequency values for these codons are correlated due to phylogeny (i.e., closely related species tend to have more similar GC3 and relative frequency values due to shared ancestry).

Fig 6.

Most genomes in the budding yeast subphylum exhibit moderate to high levels of translational selection on codon bias.

Translational selection on codon bias was measured using the S-test, which examines the correlation between the stAI value and the selective pressure on all coding sequences in a genome. Selective pressure is estimated by f(GC3) - ENC, where f(GC3) is a modified function of Wright’s neutral relationship between the silent GC content of a gene and the effective number of codons (ENC). Each point in the comparison between stAI and selective pressure is a single coding sequence in one genome. Higher S-values indicate higher levels of translational selection on codon bias. A) Distribution of the significant S-values (p < 0.05 in permutation test; 293 of 327 species) and non-significant S-values (p > 0.05 in permutation test; 34 of 327 species). B) Pichia membranifaciens, an example of a species that exhibits low translational selection on codon bias (p < 0.05 in permutation test; n = 10,000). C) Saccharomyces cerevisiae, an example of a species that exhibits high translational selection on codon bias (p < 0.01 in permutation test; n = 10,000).
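The S-test logic can be sketched as follows. This sketch uses Wright's original neutral expectation, ENC = 2 + s + 29 / (s² + (1 − s)²) with s = GC3, as a stand-in for the modified f(GC3) mentioned in the caption, which is not reproduced here. The use of a Pearson correlation, the function names, and the per-gene values are likewise assumptions made for illustration only.

```python
import math

def wright_expected_enc(gc3):
    """Wright's neutral expectation of ENC given silent GC content (stand-in for f(GC3))."""
    return 2.0 + gc3 + 29.0 / (gc3 ** 2 + (1.0 - gc3) ** 2)

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical genes: (stAI, GC3, observed ENC). Values are invented.
genes = [(0.30, 0.40, 58.0), (0.40, 0.45, 52.0), (0.55, 0.50, 45.0), (0.70, 0.55, 35.0)]

stai_values = [stai for stai, _, _ in genes]
# Selective pressure: how far below the neutral expectation each gene's ENC falls.
pressures = [wright_expected_enc(gc3) - enc for _, gc3, enc in genes]
s_value = pearson(stai_values, pressures)
```

In this toy genome, genes with higher stAI sit further below the neutral curve, so the correlation is strongly positive. A permutation test, as described in the caption, would then assess significance by shuffling one of the two variables and recomputing the correlation many times.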

Fig 7.

Maximum translational selection occurs at an intermediate number of total tRNA genes in the genome.

This plot shows the relationship between the total number of tRNA genes in a genome (tRNAome size) and S-value for each of the 327 budding yeast species analyzed in this study. The best-fitting model (blue) was a Gaussian distribution with a maximum S-value at 336 tRNA genes. This suggests that species with either low or high numbers of total tRNA genes exhibit lower levels of translational selection.
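The shape of such a model can be sketched with a Gaussian curve S(n) = a · exp(−(n − μ)² / (2σ²)). Only the peak location, μ = 336 tRNA genes, comes from the caption; the amplitude `PEAK_S` and width `WIDTH` below are hypothetical placeholders, not fitted values from the study.

```python
import math

PEAK_TRNA_GENES = 336.0  # peak location reported in the caption
PEAK_S = 0.6             # hypothetical amplitude (assumption)
WIDTH = 150.0            # hypothetical standard deviation (assumption)

def gaussian_s_value(trna_genes):
    """Predicted S-value under a Gaussian model centered on the peak tRNAome size."""
    z = (trna_genes - PEAK_TRNA_GENES) / WIDTH
    return PEAK_S * math.exp(-0.5 * z * z)

# Predicted S-values for a small, an intermediate, and a large tRNAome.
predictions = {n: gaussian_s_value(n) for n in (100, 336, 600)}
```

Under any Gaussian of this form, predicted S-values fall off symmetrically on either side of the peak, which matches the interpretation that both small and large tRNAomes are associated with weaker translational selection.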
