ancestral eukaryotic genome: Topics by Science.gov

Sample records for ancestral eukaryotic genome

Comparative genomics and evolution of eukaryotic phospholipid biosynthesis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lykidis, Athanasios

2006-12-01

Phospholipid biosynthetic enzymes produce diverse molecular structures and are often present in multiple forms encoded by different genes. This work utilizes comparative genomics and phylogenetics for exploring the distribution, structure and evolution of phospholipid biosynthetic genes and pathways in 26 eukaryotic genomes. Although the basic structure of the pathways was formed early in eukaryotic evolution, the emerging picture indicates that individual enzyme families followed unique evolutionary courses. For example, choline and ethanolamine kinases and cytidylyltransferases emerged in ancestral eukaryotes, whereas multiple forms of the corresponding phosphatidyltransferases evolved mainly in a lineage-specific manner. Furthermore, several unicellular eukaryotes maintain bacterial-type enzymes and reactions for the synthesis of phosphatidylglycerol and cardiolipin. Also, base-exchange phosphatidylserine synthases are widespread and ancestral enzymes. The multiplicity of phospholipid biosynthetic enzymes has been largely generated by gene expansion in a lineage-specific manner. Thus, these observations suggest that phospholipid biosynthesis has been an actively evolving system. Finally, comparative genomic analysis indicates the existence of novel phosphatidyltransferases and provides a candidate for the uncharacterized eukaryotic phosphatidylglycerol phosphate phosphatase.
Evolution of domain promiscuity in eukaryotic genomes—a perspective from the inferred ancestral domain architectures

PubMed Central

Cohen-Gihon, Inbar; Fong, Jessica H.; Sharan, Roded; Nussinov, Ruth

2012-01-01

Most eukaryotic proteins are composed of two or more domains. These assemble in a modular manner to create new proteins usually by the acquisition of one or more domains to an existing protein. Promiscuous domains which are found embedded in a variety of proteins and co-exist with many other domains are of particular interest and were shown to have roles in signaling pathways and mediating network communication. The evolution of domain promiscuity is still an open problem, mostly due to the lack of sequenced ancestral genomes. Here we use inferred domain architectures of ancestral genomes to trace the evolution of domain promiscuity in eukaryotic genomes. We find an increase in average promiscuity along many branches of the eukaryotic tree. Moreover, domain promiscuity can proceed at almost a steady rate over long evolutionary time or exhibit lineage-specific acceleration. We also observe that many signaling and regulatory domains gained domain promiscuity around the Bilateria divergence. In addition we show that those domains that played a role in the creation of two body axes and existed before the divergence of the bilaterians from fungi/metazoan achieve a boost in their promiscuities during the bilaterian evolution. PMID:21127809
The Genome of Naegleria gruberi Illuminates Early Eukaryotic Versatility

DOE Office of Scientific and Technical Information (OSTI.GOV)

Fritz-Laylin, Lillian K.; Prochnik, Simon E.; Ginger, Michael L.

2010-03-01

Genome sequences of diverse free-living protists are essential for understanding eukaryotic evolution and molecular and cell biology. The free-living amoeboflagellate Naegleria gruberi belongs to a varied and ubiquitous protist clade (Heterolobosea) that diverged from other eukaryotic lineages over a billion years ago. Analysis of the 15,727 protein-coding genes encoded by Naegleria's 41 Mb nuclear genome indicates a capacity for both aerobic respiration and anaerobic metabolism with concomitant hydrogen production, with fundamental implications for the evolution of organelle metabolism. The Naegleria genome facilitates substantially broader phylogenomic comparisons of free-living eukaryotes than previously possible, allowing us to identify thousands of genes likely present in the pan-eukaryotic ancestor, with 40% likely eukaryotic inventions. Moreover, we construct a comprehensive catalog of amoeboid-motility genes. The Naegleria genome, analyzed in the context of other protists, reveals a remarkably complex ancestral eukaryote with a rich repertoire of cytoskeletal, sexual, signaling, and metabolic modules.
The Naegleria genome: a free-living microbial eukaryote lends unique insights into core eukaryotic cell biology

PubMed Central

Fritz-Laylin, Lillian K.; Ginger, Michael L.; Walsh, Charles; Dawson, Scott C.; Fulton, Chandler

2016-01-01

Naegleria gruberi, a free-living protist, has long been treasured as a model for basal body and flagellar assembly due to its ability to differentiate from crawling amoebae into swimming flagellates. The full genome sequence of Naegleria gruberi has recently been used to estimate gene families ancestral to all eukaryotes and to identify novel aspects of Naegleria biology, including likely facultative anaerobic metabolism, extensive signaling cascades, and evidence for sexuality. Distinctive features of the Naegleria genome and nuclear biology provide unique perspectives for comparative cell biology, including cell division, RNA processing and nucleolar assembly. We highlight here exciting new and novel aspects of Naegleria biology identified through genomic analysis. PMID:21392573
Evolution of bacterial-like phosphoprotein phosphatases in photosynthetic eukaryotes features ancestral mitochondrial or archaeal origin and possible lateral gene transfer.

PubMed

Uhrig, R Glen; Kerk, David; Moorhead, Greg B

2013-12-01

Protein phosphorylation is a reversible regulatory process catalyzed by the opposing reactions of protein kinases and phosphatases, which are central to the proper functioning of the cell. Dysfunction of members in either the protein kinase or phosphatase family can have wide-ranging deleterious effects in both metazoans and plants alike. Previously, three bacterial-like phosphoprotein phosphatase classes were uncovered in eukaryotes and named according to the bacterial sequences with which they have the greatest similarity: Shewanella-like (SLP), Rhizobiales-like (RLPH), and ApaH-like (ALPH) phosphatases. Utilizing the wealth of data resulting from recently sequenced complete eukaryotic genomes, we conducted database searching by hidden Markov models, multiple sequence alignment, and phylogenetic tree inference with Bayesian and maximum likelihood methods to elucidate the pattern of evolution of eukaryotic bacterial-like phosphoprotein phosphatase sequences, which are predominantly distributed in photosynthetic eukaryotes. We uncovered a pattern of ancestral mitochondrial (SLP and RLPH) or archaeal (ALPH) gene entry into eukaryotes, supplemented by possible instances of lateral gene transfer between bacteria and eukaryotes. In addition to the previously known green algal and plant SLP1 and SLP2 protein forms, a more ancestral third form (SLP3) was found in green algae. Data from in silico subcellular localization predictions revealed class-specific differences in plants likely to result in distinct functions, and for SLP sequences, distinctive and possibly functionally significant differences between plants and nonphotosynthetic eukaryotes. Conserved carboxyl-terminal sequence motifs with class-specific patterns of residue substitutions, most prominent in photosynthetic organisms, raise the possibility of complex interactions with regulatory proteins.
Reconstruction of the vertebrate ancestral genome reveals dynamic genome reorganization in early vertebrates.

PubMed

Nakatani, Yoichiro; Takeda, Hiroyuki; Kohara, Yuji; Morishita, Shinichi

2007-09-01

Although several vertebrate genomes have been sequenced, little is known about the genome evolution of early vertebrates and how large-scale genomic changes such as the two rounds of whole-genome duplications (2R WGD) affected evolutionary complexity and novelty in vertebrates. Reconstructing the ancestral vertebrate genome is highly nontrivial because of the difficulty in identifying traces originating from the 2R WGD. To resolve this problem, we developed a novel method capable of pinning down remains of the 2R WGD in the human and medaka fish genomes using invertebrate tunicate and sea urchin genes to define ohnologs, i.e., paralogs produced by the 2R WGD. We validated the reconstruction using the chicken genome, which was not considered in the reconstruction step, and observed that many ancestral proto-chromosomes were retained in the chicken genome and had one-to-one correspondence to chicken microchromosomes, thereby confirming the reconstructed ancestral genomes. Our reconstruction revealed a contrast between the slow karyotype evolution after the second WGD and the rapid, lineage-specific genome reorganizations that occurred in the ancestral lineages of major taxonomic groups such as teleost fishes, amphibians, reptiles, and marsupials.
Reconstruction of Ancestral Genomes in Presence of Gene Gain and Loss.

PubMed

Avdeyev, Pavel; Jiang, Shuai; Aganezov, Sergey; Hu, Fei; Alekseyev, Max A

2016-03-01

Since most dramatic genomic changes are caused by genome rearrangements as well as gene duplications and gain/loss events, it becomes crucial to understand their mechanisms and reconstruct ancestral genomes of the given genomes. This problem was shown to be NP-complete even in the "simplest" case of three genomes, thus calling for heuristic rather than exact algorithmic solutions. At the same time, a larger number of input genomes may actually simplify the problem in practice as it was earlier illustrated with MGRA, a state-of-the-art software tool for reconstruction of ancestral genomes of multiple genomes. One of the key obstacles for MGRA and other similar tools is presence of breakpoint reuses when the same breakpoint region is broken by several different genome rearrangements in the course of evolution. Furthermore, such tools are often limited to genomes composed of the same genes with each gene present in a single copy in every genome. This limitation makes these tools inapplicable for many biological datasets and degrades the resolution of ancestral reconstructions in diverse datasets. We address these deficiencies by extending the MGRA algorithm to genomes with unequal gene contents. The developed next-generation tool MGRA2 can handle gene gain/loss events and shares the ability of MGRA to reconstruct ancestral genomes uniquely in the case of limited breakpoint reuse. Furthermore, MGRA2 employs a number of novel heuristics to cope with higher breakpoint reuse and process datasets inaccessible for MGRA. In practical experiments, MGRA2 shows superior performance for simulated and real genomes as compared to other ancestral genome reconstruction tools.
Fast ancestral gene order reconstruction of genomes with unequal gene content.

PubMed

Feijão, Pedro; Araujo, Eloi

2016-11-11

During evolution, genomes are modified by large-scale structural events, such as rearrangements, deletions or insertions of large blocks of DNA. Of particular interest, in order to better understand how this type of genomic evolution happens, is the reconstruction of ancestral genomes, given a phylogenetic tree with extant genomes at its leaves. One way of solving this problem is to assume a rearrangement model, such as Double Cut and Join (DCJ), and find a set of ancestral genomes that minimizes the number of events on the input tree. Since this problem is NP-hard for most rearrangement models, exact solutions are practical only for small instances, and heuristics have to be used for larger datasets. This type of approach can be called event-based. Another common approach is based on finding conserved structures between the input genomes, such as adjacencies between genes, possibly also assigning weights that indicate a measure of confidence or probability that this particular structure is present in each ancestral genome, and then finding a set of non-conflicting adjacencies that optimizes some given function, usually trying to maximize total weight and minimize character changes in the tree. We call methods of this type homology-based. In previous work, we proposed an ancestral reconstruction method that combines homology- and event-based ideas, using the concept of intermediate genomes, which arise in DCJ rearrangement scenarios. This method showed a better rate of correctly reconstructed adjacencies than other methods, while also being faster, since the use of intermediate genomes greatly reduces the search space. Here, we generalize the intermediate genome concept to genomes with unequal gene content, extending our method to account for gene insertions and deletions of any length. In many of the simulated datasets, our proposed method had better results than MLGO and MGRA, two state-of-the-art algorithms for ancestral reconstruction with unequal gene content.
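The "conserved adjacencies with weights" idea mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' method: it only extracts gene adjacencies from signed gene orders and weights each adjacency by the fraction of input genomes that contain it. The function names, the signed-integer gene encoding, and the frequency-based weight are all assumptions made for illustration; a real homology-based method would use phylogeny-aware weights and then resolve conflicting adjacencies.

```python
from collections import Counter

def extremities(gene):
    """Return the (left, right) extremities of a signed gene.
    A gene g > 0 is read tail -> head; a gene g < 0 is read head -> tail."""
    name = abs(gene)
    if gene > 0:
        return (name, "t"), (name, "h")
    return (name, "h"), (name, "t")

def adjacencies(chromosome, circular=False):
    """Set of adjacencies (unordered pairs of gene extremities) of one chromosome."""
    pairs = list(zip(chromosome, chromosome[1:]))
    if circular and chromosome:
        pairs.append((chromosome[-1], chromosome[0]))
    result = set()
    for left_gene, right_gene in pairs:
        right_end_of_left = extremities(left_gene)[1]
        left_end_of_right = extremities(right_gene)[0]
        result.add(frozenset((right_end_of_left, left_end_of_right)))
    return result

def weighted_adjacencies(genomes):
    """Weight each adjacency by the fraction of genomes containing it.
    `genomes` is a list of genomes; each genome is a list of chromosomes.
    The plain frequency weight is an illustrative assumption."""
    counts = Counter()
    for genome in genomes:
        seen = set()
        for chromosome in genome:
            seen |= adjacencies(chromosome)
        counts.update(seen)
    return {adj: n / len(genomes) for adj, n in counts.items()}

# Three toy genomes over genes 1..3 (hypothetical data).
toy = [[[1, 2, 3]], [[1, 2, -3]], [[-3, -2, -1]]]
for adj, weight in sorted(weighted_adjacencies(toy).items(), key=lambda kv: -kv[1]):
    print(sorted(adj), round(weight, 2))
```

Reading a chromosome in the opposite direction (for example `[-3, -2, -1]` instead of `[1, 2, 3]`) yields the same adjacency set, which is why adjacencies are stored as unordered pairs of extremities.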
Compositional patterns in the genomes of unicellular eukaryotes

PubMed Central

2013-01-01

Background The genomes of multicellular eukaryotes are compartmentalized in mosaics of isochores, large and fairly homogeneous stretches of DNA that belong to a small number of families characterized by different average GC levels, by different gene concentration (that increase with GC), different chromatin structures, different replication timing in the cell cycle, and other different properties. A question raised by these basic results concerns how far back in evolution the compartmentalized organization of the eukaryotic genomes arose. Results In the present work we approached this problem by studying the compositional organization of the genomes from the unicellular eukaryotes for which full sequences are available, the sample used being representative. The average GC levels of the genomes from unicellular eukaryotes cover an extremely wide range (19%-60% GC) and the compositional patterns of individual genomes are extremely different but all genomes tested show a compositional compartmentalization. Conclusions The average GC range of the genomes of unicellular eukaryotes is very broad (as broad as that of prokaryotes) and individual compositional patterns cover a very broad range from very narrow to very complex. Both features are not surprising for organisms that are very far from each other both in terms of phylogenetic distances and of environmental life conditions. Most importantly, all genomes tested, a representative sample of all supergroups of unicellular eukaryotes, are compositionally compartmentalized, a major difference with prokaryotes. PMID:24188247
Compositional patterns in the genomes of unicellular eukaryotes.

PubMed

Costantini, Maria; Alvarez-Valin, Fernando; Costantini, Susan; Cammarano, Rosalia; Bernardi, Giorgio

2013-11-05

The genomes of multicellular eukaryotes are compartmentalized in mosaics of isochores, large and fairly homogeneous stretches of DNA that belong to a small number of families characterized by different average GC levels, by different gene concentration (that increase with GC), different chromatin structures, different replication timing in the cell cycle, and other different properties. A question raised by these basic results concerns how far back in evolution the compartmentalized organization of the eukaryotic genomes arose. In the present work we approached this problem by studying the compositional organization of the genomes from the unicellular eukaryotes for which full sequences are available, the sample used being representative. The average GC levels of the genomes from unicellular eukaryotes cover an extremely wide range (19%-60% GC) and the compositional patterns of individual genomes are extremely different but all genomes tested show a compositional compartmentalization. The average GC range of the genomes of unicellular eukaryotes is very broad (as broad as that of prokaryotes) and individual compositional patterns cover a very broad range from very narrow to very complex. Both features are not surprising for organisms that are very far from each other both in terms of phylogenetic distances and of environmental life conditions. Most importantly, all genomes tested, a representative sample of all supergroups of unicellular eukaryotes, are compositionally compartmentalized, a major difference with prokaryotes.
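The compositional analysis described in the two records above rests on GC levels measured along a genome. As a rough sketch of that kind of measurement (not the authors' pipeline), the following computes the GC fraction of non-overlapping windows. The function names and the window size are illustrative assumptions, and ambiguous bases such as N are simply excluded from the denominator.

```python
def gc_fraction(seq):
    """GC fraction of a sequence, counting only unambiguous A/C/G/T bases.
    Returns None when the sequence contains no such base."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return None
    return (seq.count("G") + seq.count("C")) / acgt

def window_gc(seq, window, step=None):
    """GC fraction of successive windows; windows do not overlap by default."""
    step = step or window
    return [
        gc_fraction(seq[start:start + window])
        for start in range(0, len(seq) - window + 1, step)
    ]

# Hypothetical toy sequence: a GC-rich stretch followed by an AT-rich stretch.
print(window_gc("GGGCGCCGATATTAAT", window=8))
```

A genome with compositional compartmentalization would show windows whose GC levels differ by more than expected for a homogeneous sequence; deciding what counts as "more than expected" needs a statistical test that this sketch does not attempt.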
GenColors-based comparative genome databases for small eukaryotic genomes.

PubMed

Felder, Marius; Romualdi, Alessandro; Petzold, Andreas; Platzer, Matthias; Sühnel, Jürgen; Glöckner, Gernot

2013-01-01

Many sequence data repositories can give a quick and easily accessible overview on genomes and their annotations. Less widespread is the possibility to compare related genomes with each other in a common database environment. We have previously described the GenColors database system (http://gencolors.fli-leibniz.de) and its applications to a number of bacterial genomes such as Borrelia, Legionella, Leptospira and Treponema. This system has an emphasis on genome comparison. It combines data from related genomes and provides the user with an extensive set of visualization and analysis tools. Eukaryote genomes are normally larger than prokaryote genomes and thus pose additional challenges for such a system. We have, therefore, adapted GenColors to also handle larger datasets of small eukaryotic genomes and to display eukaryotic gene structures. Further recent developments include whole genome views, genome list options and, for bacterial genome browsers, the display of horizontal gene transfer predictions. Two new GenColors-based databases for two fungal species (http://fgb.fli-leibniz.de) and for four social amoebas (http://sacgb.fli-leibniz.de) were set up. Both new resources open up a single entry point for related genomes for the amoebozoa and fungal research communities and other interested users. Comparative genomics approaches are greatly facilitated by these resources.
Deciphering the Diploid Ancestral Genome of the Mesohexaploid Brassica rapa

PubMed Central

Cheng, Feng; Mandáková, Terezie; Wu, Jian; Xie, Qi; Lysak, Martin A.; Wang, Xiaowu

2013-01-01

The genus Brassica includes several important agricultural and horticultural crops. Their current genome structures were shaped by whole-genome triplication followed by extensive diploidization. The availability of several crucifer genome sequences, especially that of Chinese cabbage (Brassica rapa), enables study of the evolution of the mesohexaploid Brassica genomes from their diploid progenitors. We reconstructed three ancestral subgenomes of B. rapa (n = 10) by comparing its whole-genome sequence to ancestral and extant Brassicaceae genomes. All three B. rapa paleogenomes apparently consisted of seven chromosomes, similar to the ancestral translocation Proto-Calepineae Karyotype (tPCK; n = 7), which is the evolutionarily younger variant of the Proto-Calepineae Karyotype (n = 7). Based on comparative analysis of genome sequences or linkage maps of Brassica oleracea, Brassica nigra, radish (Raphanus sativus), and other closely related species, we propose a two-step merging of three tPCK-like genomes to form the hexaploid ancestor of the tribe Brassiceae with 42 chromosomes. Subsequent diversification of the Brassiceae was marked by extensive genome reshuffling and chromosome number reduction mediated by translocation events and followed by loss and/or inactivation of centromeres. Furthermore, via interspecies genome comparison, we refined intervals for seven of the genomic blocks of the Ancestral Crucifer Karyotype (n = 8), thus revising the key reference genome for evolutionary genomics of crucifers. PMID:23653472
Comparative Genomics of Candidate Phylum TM6 Suggests That Parasitism Is Widespread and Ancestral in This Lineage

PubMed Central

Yeoh, Yun Kit; Sekiguchi, Yuji; Parks, Donovan H.; Hugenholtz, Philip

2016-01-01

Candidate phylum TM6 is a major bacterial lineage recognized through culture-independent rRNA surveys to be low abundance members in a wide range of habitats; however, they are poorly characterized due to a lack of pure culture representatives. Two recent genomic studies of TM6 bacteria revealed small genomes and limited gene repertoire, consistent with known or inferred dependence on eukaryotic hosts for their metabolic needs. Here, we obtained additional near-complete genomes of TM6 populations from agricultural soil and upflow anaerobic sludge blanket reactor metagenomes which, together with the two publicly available TM6 genomes, represent seven distinct family level lineages in the TM6 phylum. Genome-based phylogenetic analysis confirms that TM6 is an independent phylum level lineage in the bacterial domain, possibly affiliated with the Patescibacteria superphylum. All seven genomes are small (1.0–1.5 Mb) and lack complete biosynthetic pathways for various essential cellular building blocks including amino acids, lipids, and nucleotides. These and other features identified in the TM6 genomes such as a degenerated cell envelope, ATP/ADP translocases for parasitizing host ATP pools, and protein motifs to facilitate eukaryotic host interactions indicate that parasitism is widespread in this phylum. Phylogenetic analysis of ATP/ADP translocase genes suggests that the ancestral TM6 lineage was also parasitic. We propose the name Dependentiae (phyl. nov.) to reflect dependence of TM6 bacteria on host organisms. PMID:26615204
A Detailed History of Intron-rich Eukaryotic Ancestors Inferred from a Global Survey of 100 Complete Genomes

PubMed Central

Csuros, Miklos; Rogozin, Igor B.; Koonin, Eugene V.

2011-01-01

Protein-coding genes in eukaryotes are interrupted by introns, but intron densities widely differ between eukaryotic lineages. Vertebrates, some invertebrates and green plants have intron-rich genes, with 6–7 introns per kilobase of coding sequence, whereas most of the other eukaryotes have intron-poor genes. We reconstructed the history of intron gain and loss using a probabilistic Markov model (Markov Chain Monte Carlo, MCMC) on 245 orthologous genes from 99 genomes representing the three of the five supergroups of eukaryotes for which multiple genome sequences are available. Intron-rich ancestors are confidently reconstructed for each major group, with 53 to 74% of the human intron density inferred with 95% confidence for the Last Eukaryotic Common Ancestor (LECA). The results of the MCMC reconstruction are compared with the reconstructions obtained using Maximum Likelihood (ML) and Dollo parsimony methods. An excellent agreement between the MCMC and ML inferences is demonstrated whereas Dollo parsimony introduces a noticeable bias in the estimations, typically yielding lower ancestral intron densities than MCMC and ML. Evolution of eukaryotic genes was dominated by intron loss, with substantial gain only at the bases of several major branches including plants and animals. The highest intron density, 120 to 130% of the human value, is inferred for the last common ancestor of animals. The reconstruction shows that the entire line of descent from LECA to mammals was intron-rich, a state conducive to the evolution of alternative splicing. PMID:21935348
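For readers unfamiliar with Dollo parsimony, which the abstract above compares against MCMC and ML, here is a minimal sketch of its core rule for a single presence/absence character such as one intron position: the character is gained once and can only be lost afterwards. The tree encoding (nested tuples of leaf names), the function names, and the toy taxa are assumptions for illustration and are not taken from the paper.

```python
def leaves(tree):
    """Frozenset of leaf names under a tree given as nested tuples of strings."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))

def dollo_present_nodes(tree, present):
    """Internal nodes (identified by their leaf sets) at which a character is
    inferred present under Dollo parsimony: a single gain at the last common
    ancestor of all leaves that carry the character, followed only by losses."""
    present = frozenset(present) & leaves(tree)
    result = set()

    def walk(node, inside_gain_clade):
        if isinstance(node, str):
            return
        hits = [child for child in node if leaves(child) & present]
        # The gain clade starts at the first node with >= 2 children carrying it.
        if not inside_gain_clade and len(hits) >= 2:
            inside_gain_clade = True
        # Inside that clade, any node with a carrier below it must be a carrier.
        if inside_gain_clade and hits:
            result.add(leaves(node))
        for child in node:
            walk(child, inside_gain_clade)

    walk(tree, False)
    return result

# Hypothetical five-taxon tree and an intron observed in human and plant only.
tree = ((("human", "mouse"), "fly"), ("yeast", "plant"))
for node in dollo_present_nodes(tree, {"human", "plant"}):
    print(sorted(node))
```

Because the rule places the character only in ancestors that have observed carriers on at least two descendant lineages, a character that existed in an ancestor but was later lost along all but one lineage (or all lineages) is not counted there. This is one intuitive reason such a method can give lower ancestral densities than probabilistic approaches.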
The evolution of sex: A new hypothesis based on mitochondrial mutational erosion: Mitochondrial mutational erosion in ancestral eukaryotes would favor the evolution of sex, harnessing nuclear recombination to optimize compensatory nuclear coadaptation.

PubMed

Havird, Justin C; Hall, Matthew D; Dowling, Damian K

2015-09-01

The evolution of sex in eukaryotes represents a paradox, given the "twofold" fitness cost it incurs. We hypothesize that the mutational dynamics of the mitochondrial genome would have favored the evolution of sexual reproduction. Mitochondrial DNA (mtDNA) exhibits a high mutation rate across most eukaryote taxa, and several lines of evidence suggest that this high rate is an ancestral character. This seems inexplicable given that mtDNA-encoded genes underlie the expression of life's most salient functions, including energy conversion. We propose that negative metabolic effects linked to mitochondrial mutation accumulation would have invoked selection for sexual recombination between divergent host nuclear genomes in early eukaryote lineages. This would provide a mechanism by which recombinant host genotypes could be rapidly shuffled and screened for the presence of compensatory modifiers that offset mtDNA-induced harm. Under this hypothesis, recombination provides the genetic variation necessary for compensatory nuclear coadaptation to keep pace with mitochondrial mutation accumulation.
An Evolutionary Network of Genes Present in the Eukaryote Common Ancestor Polls Genomes on Eukaryotic and Mitochondrial Origin

PubMed Central

Thiergart, Thorsten; Landan, Giddy; Schenk, Marc; Dagan, Tal; Martin, William F.

2012-01-01

To test the predictions of competing and mutually exclusive hypotheses for the origin of eukaryotes, we identified from a sample of 27 sequenced eukaryotic and 994 sequenced prokaryotic genomes 571 genes that were present in the eukaryote common ancestor and that have homologues among eubacterial and archaebacterial genomes. Maximum-likelihood trees identified the prokaryotic genomes that most frequently contained genes branching as the sister to the eukaryotic nuclear homologues. Among the archaebacteria, euryarchaeote genomes most frequently harbored the sister to the eukaryotic nuclear gene, whereas among eubacteria, the α-proteobacteria were most frequently represented within the sister group. Only 3 genes out of 571 gave a 3-domain tree. Homologues from α-proteobacterial genomes that branched as the sister to nuclear genes were found more frequently in genomes of facultatively anaerobic members of the Rhizobiales and Rhodospirillales than in obligate intracellular rickettsial parasites. Following α-proteobacteria, the most frequent eubacterial sister lineages were γ-proteobacteria, δ-proteobacteria, and firmicutes, which were also the prokaryote genomes least frequently found as monophyletic groups in our trees. Although all 22 higher prokaryotic taxa sampled (crenarchaeotes, γ-proteobacteria, spirochaetes, chlamydias, etc.) harbor genes that branch as the sister to homologues present in the eukaryotic common ancestor, that is not evidence of 22 different prokaryotic cells participating at eukaryote origins because prokaryotic “lineages” have laterally acquired genes for more than 1.5 billion years since eukaryote origins. The data underscore the archaebacterial (host) nature of the eukaryotic informational genes and the eubacterial (mitochondrial) nature of eukaryotic energy metabolism. The network linking genes of the eukaryote ancestor to contemporary homologues distributed across prokaryotic genomes elucidates eukaryote gene origins in a
The others: our biased perspective of eukaryotic genomes

PubMed Central

del Campo, Javier; Sieracki, Michael E.; Molestina, Robert; Keeling, Patrick; Massana, Ramon; Ruiz-Trillo, Iñaki

2015-01-01

Understanding the origin and evolution of the eukaryotic cell and the full diversity of eukaryotes is relevant to many biological disciplines. However, our current understanding of eukaryotic genomes is extremely biased, leading to a skewed view of eukaryotic biology. We argue that a phylogeny-driven initiative to cover the full eukaryotic diversity is needed to overcome this bias. We encourage the community: (i) to sequence a representative of the neglected groups available at public culture collections, (ii) to increase our culturing efforts, and (iii) to embrace single cell genomics to access organisms refractory to propagation in culture. We hope that the community will welcome this proposal, explore the approaches suggested, and join efforts to sequence the full diversity of eukaryotes. PMID:24726347
Genome-reconstruction for eukaryotes from complex natural microbial communities.

PubMed

West, Patrick T; Probst, Alexander J; Grigoriev, Igor V; Thomas, Brian C; Banfield, Jillian F

2018-04-01

Microbial eukaryotes are integral components of natural microbial communities, and their inclusion is critical for many ecosystem studies, yet the majority of published metagenome analyses ignore eukaryotes. In order to include eukaryotes in environmental studies, we propose a method to recover eukaryotic genomes from complex metagenomic samples. A key step for genome recovery is separation of eukaryotic and prokaryotic fragments. We developed a k-mer-based strategy, EukRep, for eukaryotic sequence identification and applied it to environmental samples to show that it enables genome recovery, genome completeness evaluation, and prediction of metabolic potential. We used this approach to test the effect of addition of organic carbon on a geyser-associated microbial community and detected a substantial change of the community metabolism, with selection against almost all candidate phyla bacteria and archaea and for eukaryotes. Near complete genomes were reconstructed for three fungi placed within the Eurotiomycetes and an arthropod. While carbon fixation and sulfur oxidation were important functions in the geyser community prior to carbon addition, the organic carbon-impacted community showed enrichment for secreted proteases, secreted lipases, cellulose targeting CAZymes, and methanol oxidation. We demonstrate the broader utility of EukRep by reconstructing and evaluating relatively high-quality fungal, protist, and rotifer genomes from complex environmental samples. This approach opens the way for cultivation-independent analyses of whole microbial communities.
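A k-mer-based strategy of the kind mentioned in the abstract above starts by turning each assembled fragment into a vector of k-mer frequencies, which a classifier can then label as eukaryotic or prokaryotic. The sketch below shows only that feature-extraction step. It does not reproduce EukRep: the function name, the default k, and the handling of ambiguous bases are assumptions, and the classifier itself is left out.

```python
from collections import Counter
from itertools import product

def kmer_profile(seq, k=5):
    """Relative frequency of every possible DNA k-mer in `seq`.
    The result has 4**k entries in lexicographic k-mer order (AAAAA, AAAAC, ...).
    K-mers containing anything other than A/C/G/T are skipped."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    total = sum(counts.values())
    all_kmers = ("".join(letters) for letters in product("ACGT", repeat=k))
    return [counts[kmer] / total if total else 0.0 for kmer in all_kmers]

# Hypothetical contig; k=2 keeps the printed vector short.
profile = kmer_profile("ACGTACGTNNACGT", k=2)
print(len(profile), round(sum(profile), 6))
```

Profiles computed this way for fragments of known origin could serve as training data for any standard classifier; which classifier and which k work well in practice is an empirical question outside the scope of this sketch.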
Challenges in Whole-Genome Annotation of Pyrosequenced Eukaryotic Genomes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kuo, Alan; Grigoriev, Igor

2009-04-17

Pyrosequencing technologies such as 454/Roche and Solexa/Illumina vastly lower the cost of nucleotide sequencing compared to the traditional Sanger method, and thus promise to greatly expand the number of sequenced eukaryotic genomes. However, the new technologies also bring new challenges such as shorter reads and new kinds and higher rates of sequencing errors, which complicate genome assembly and gene prediction. At JGI we are deploying 454 technology for the sequencing and assembly of ever-larger eukaryotic genomes. Here we describe our first whole-genome annotation of a purely 454-sequenced fungal genome that is larger than a yeast (>30 Mbp). The pezizomycotine (filamentous ascomycote) Aspergillus carbonarius belongs to the Aspergillus section Nigri species complex, members of which are significant as platforms for bioenergy and bioindustrial technology, as members of soil microbial communities and players in the global carbon cycle, and as agricultural toxigens. Application of a modified version of the standard JGI Annotation Pipeline has so far predicted ~10k genes. ~12% of these preliminary annotations suffer a potential frameshift error, which is somewhat higher than the ~9% rate in the Sanger-sequenced and conventionally assembled and annotated genome of fellow Aspergillus section Nigri member A. niger. Also, >90% of A. niger genes have potential homologs in the A. carbonarius preliminary annotation. We conclude, and with further annotation and comparative analysis expect to confirm, that 454 sequencing strategies provide a promising substrate for annotation of modestly sized eukaryotic genomes. We will also present results of annotation of a number of other pyrosequenced fungal genomes of bioenergy interest.
Synteny conservation between the Prunus genome and both the present and ancestral Arabidopsis genomes

PubMed Central

Jung, Sook; Main, Dorrie; Staton, Margaret; Cho, Ilhyung; Zhebentyayeva, Tatyana; Arús, Pere; Abbott, Albert

2006-01-01

Background Due to the lack of availability of large genomic sequences for peach or other Prunus species, the degree of synteny conservation between the Prunus species and Arabidopsis has not been systematically assessed. Using the recently available peach EST sequences that are anchored to Prunus genetic maps and to the peach physical map, we analyzed the extent of conserved synteny between the Prunus and the Arabidopsis genomes. The reconstructed pseudo-ancestral Arabidopsis genome, which existed prior to the proposed recent polyploidy event, was also utilized in our analysis to further elucidate the evolutionary relationship. Results We analyzed the synteny conservation between the Prunus and the Arabidopsis genomes by comparing 475 peach ESTs that are anchored to Prunus genetic maps and their Arabidopsis homologs detected by sequence similarity. Microsyntenic regions were detected between all five Arabidopsis chromosomes and seven of the eight linkage groups of the Prunus reference map. An additional 1097 peach ESTs that are anchored to 431 BAC contigs of the peach physical map and their Arabidopsis homologs were also analyzed. Microsyntenic regions were detected in 77 BAC contigs. The syntenic regions from both data sets were short and contained only a couple of conserved gene pairs. The synteny between peach and Arabidopsis was fragmentary; all the Prunus linkage groups containing syntenic regions matched to more than two different Arabidopsis chromosomes, and most BAC contigs with multiple conserved syntenic regions corresponded to multiple Arabidopsis chromosomes. Using the same peach EST datasets and their Arabidopsis homologs, we also detected conserved syntenic regions in the pseudo-ancestral Arabidopsis genome. In many cases, the gene order and content of peach regions were more conserved in the ancestral genome than in the present Arabidopsis region. Statistical significance of each syntenic group was calculated using a simulated Arabidopsis genome. Conclusion We

Unitary circular code motifs in genomes of eukaryotes.

PubMed

El Soufi, Karim; Michel, Christian J

A set X of 20 trinucleotides was identified in genes of bacteria, eukaryotes, plasmids and viruses, which has on average the highest occurrence in reading frame compared to its two shifted frames (Michel, 2015; Arquès and Michel, 1996). This set X has an interesting mathematical property as X is a circular code (Arquès and Michel, 1996). Thus, the motifs from this circular code X, called X motifs, have the property to always retrieve, synchronize and maintain the reading frame in genes. The origin of this circular code X in genes has been an open problem since its discovery in 1996. Here, we first show that the unitary circular codes (UCC), i.e. sets of one word, can generate unitary circular code motifs (UCC motifs), i.e. a concatenation of the same motif (simple repeats) leading to low complexity DNA. Three classes of UCC motifs are studied here: repeated dinucleotides (D+ motifs), repeated trinucleotides (T+ motifs) and repeated tetranucleotides (T+ motifs). Thus, the D+, T+ and T+ motifs make it possible to retrieve, synchronize and maintain a frame modulo 2, modulo 3 and modulo 4, respectively, and their shifted frames (1 modulo 2; 1 and 2 modulo 3; 1, 2 and 3 modulo 4 according to the C2, C3 and C4 properties, respectively) in the DNA sequences. The statistical distribution of the D+, T+ and T+ motifs is analyzed in the genomes of eukaryotes. A UCC motif and its complementary UCC motif have the same distribution in the eukaryotic genomes. Furthermore, a UCC motif and its complementary UCC motif have increasing occurrences contrary to their number of hydrogen bonds, very significant with the T+ motifs. The longest D+, T+ and T+ motifs in the studied eukaryotic genomes are also given. Surprisingly, a scarcity of repeated trinucleotides (T+ motifs) in the large eukaryotic genomes is observed compared to the D+ and T+ motifs. This result has been investigated and may be explained by two outcomes. Repeated trinucleotides (T+ motifs) are identified
Ancestral Components of Admixed Genomes in a Mexican Cohort

PubMed Central

Johnson, Nicholas A.; Coram, Marc A.; Shriver, Mark D.; Romieu, Isabelle; Barsh, Gregory S.; London, Stephanie J.; Tang, Hua

2011-01-01

For most of the world, human genome structure at a population level is shaped by interplay between ancient geographic isolation and more recent demographic shifts, factors that are captured by the concepts of biogeographic ancestry and admixture, respectively. The ancestry of non-admixed individuals can often be traced to a specific population in a precise region, but current approaches for studying admixed individuals generally yield coarse information in which genome ancestry proportions are identified according to continent of origin. Here we introduce a new analytic strategy for this problem that allows fine-grained characterization of admixed individuals with respect to both geographic and genomic coordinates. Ancestry segments from different continents, identified with a probabilistic model, are used to construct and study “virtual genomes” of admixed individuals. We apply this approach to a cohort of 492 parent–offspring trios from Mexico City. The relative contributions from the three continental-level ancestral populations—Africa, Europe, and America—vary substantially between individuals, and the distribution of haplotype block length suggests an admixing time of 10–15 generations. The European and Indigenous American virtual genomes of each Mexican individual can be traced to precise regions within each continent, and they reveal a gradient of Amerindian ancestry between indigenous people of southwestern Mexico and Mayans of the Yucatan Peninsula. This contrasts sharply with the African roots of African Americans, which have been characterized by a uniform mixing of multiple West African populations. We also use the virtual European and Indigenous American genomes to search for the signatures of selection in the ancestral populations, and we identify previously known targets of selection in other populations, as well as new candidate loci. The ability to infer precise ancestral components of admixed genomes will facilitate studies of disease
Horizontal transfer of a eukaryotic plastid-targeted protein gene to cyanobacteria

PubMed Central

Rogers, Matthew B; Patron, Nicola J; Keeling, Patrick J

2007-01-01

Background Horizontal or lateral transfer of genetic material between distantly related prokaryotes has been shown to play a major role in the evolution of bacterial and archaeal genomes, but exchange of genes between prokaryotes and eukaryotes is not as well understood. In particular, gene flow from eukaryotes to prokaryotes is rarely documented with strong support, which is unusual since prokaryotic genomes appear to readily accept foreign genes. Results Here, we show that abundant marine cyanobacteria in the related genera Synechococcus and Prochlorococcus acquired a key Calvin cycle/glycolytic enzyme from a eukaryote. Two non-homologous forms of fructose bisphosphate aldolase (FBA) are characteristic of eukaryotes and prokaryotes respectively. However, a eukaryotic gene has been inserted immediately upstream of the ancestral prokaryotic gene in several strains (ecotypes) of Synechococcus and Prochlorococcus. In one lineage this new gene has replaced the ancestral gene altogether. The eukaryotic gene is most closely related to the plastid-targeted FBA from red algae. This eukaryotic-type FBA once replaced the plastid/cyanobacterial type in photosynthetic eukaryotes, hinting at a possible functional advantage in Calvin cycle reactions. The strains that now possess this eukaryotic FBA are scattered across the tree of Synechococcus and Prochlorococcus, perhaps because the gene has been transferred multiple times among cyanobacteria, or more likely because it has been selectively retained only in certain lineages. Conclusion A gene for plastid-targeted FBA has been transferred from red algae to cyanobacteria, where it has inserted itself beside its non-homologous, functional analogue. Its current distribution in Prochlorococcus and Synechococcus is punctate, suggesting a complex history since its introduction to this group. PMID:17584924
EUPAN enables pan-genome studies of a large number of eukaryotic genomes.

PubMed

Hu, Zhiqiang; Sun, Chen; Lu, Kuang-Chen; Chu, Xixia; Zhao, Yue; Lu, Jinyuan; Shi, Jianxin; Wei, Chaochun

2017-08-01

Pan-genome analyses are routinely carried out for bacteria to interpret the within-species gene presence/absence variations (PAVs). However, pan-genome analyses are rare for eukaryotes due to the large sizes and higher complexities of their genomes. Here we propose EUPAN, a eukaryotic pan-genome analysis toolkit, enabling automatic large-scale eukaryotic pan-genome analyses and detection of gene PAVs at a relatively low sequencing depth. In previous studies, we demonstrated the effectiveness and high accuracy of EUPAN in the pan-genome analysis of 453 rice genomes, in which we also revealed widespread gene PAVs among individual rice genomes. Moreover, EUPAN can be directly applied to the current re-sequencing projects primarily focusing on single nucleotide polymorphisms. EUPAN is implemented in Perl, R and C++. It is supported under Linux and preferred for a computer cluster with LSF and SLURM job scheduling systems. EUPAN together with its standard operating procedure (SOP) is freely available for non-commercial use (CC BY-NC 4.0) at http://cgm.sjtu.edu.cn/eupan/index.html. Contact: ccwei@sjtu.edu.cn or jianxin.shi@sjtu.edu.cn. Supplementary data are available at Bioinformatics online.
Evidence-based green algal genomics reveals marine diversity and ancestral characteristics of land plants.

PubMed

van Baren, Marijke J; Bachy, Charles; Reistetter, Emily Nahas; Purvine, Samuel O; Grimwood, Jane; Sudek, Sebastian; Yu, Hang; Poirier, Camille; Deerinck, Thomas J; Kuo, Alan; Grigoriev, Igor V; Wong, Chee-Hong; Smith, Richard D; Callister, Stephen J; Wei, Chia-Lin; Schmutz, Jeremy; Worden, Alexandra Z

2016-03-31

Prasinophytes are widespread marine green algae that are related to plants. Cellular abundance of the prasinophyte Micromonas has reportedly increased in the Arctic due to climate-induced changes. Thus, studies of these unicellular eukaryotes are important for marine ecology and for understanding Viridiplantae evolution and diversification. We generated evidence-based Micromonas gene models using proteomics and RNA-Seq to improve prasinophyte genomic resources. First, sequences of four chromosomes in the 22 Mb Micromonas pusilla (CCMP1545) genome were finished. Comparison with the finished 21 Mb genome of Micromonas commoda (RCC299; named herein) shows they share ≤8,141 of ~10,000 protein-encoding genes, depending on the analysis method. Unlike RCC299 and other sequenced eukaryotes, CCMP1545 has two abundant repetitive intron types and a high percent (26 %) GC splice donors. Micromonas has more genus-specific protein families (19 %) than other genome sequenced prasinophytes (11 %). Comparative analyses using predicted proteomes from other prasinophytes reveal proteins likely related to scale formation and ancestral photosynthesis. Our studies also indicate that peptidoglycan (PG) biosynthesis enzymes have been lost in multiple independent events in select prasinophytes and plants. However, CCMP1545, polar Micromonas CCMP2099 and prasinophytes from other classes retain the entire PG pathway, like moss and glaucophyte algae. Surprisingly, multiple vascular plants also have the PG pathway, except the Penicillin-Binding Protein, and share a unique bi-domain protein potentially associated with the pathway. Alongside Micromonas experiments using antibiotics that halt bacterial PG biosynthesis, the findings highlight unrecognized phylogenetic complexity in PG-pathway retention and implicate a role in chloroplast structure or division in several extant Viridiplantae lineages. Extensive differences in gene loss and architecture between related prasinophytes underscore
Reconstruction of the ancestral plastid genome in Geraniaceae reveals a correlation between genome rearrangements, repeats, and nucleotide substitution rates.

PubMed

Weng, Mao-Lun; Blazier, John C; Govindu, Madhumita; Jansen, Robert K

2014-03-01

Geraniaceae plastid genomes are highly rearranged, and each of the four genera already sequenced in the family has a distinct genome organization. This study reports plastid genome sequences of six additional species, Francoa sonchifolia, Melianthus villosus, and Viviania marifolia from Geraniales, and Pelargonium alternans, California macrophylla, and Hypseocharis bilobata from Geraniaceae. These genome sequences, combined with previously published species, provide sufficient taxon sampling to reconstruct the ancestral plastid genome organization of Geraniaceae and the rearrangements unique to each genus. The ancestral plastid genome of Geraniaceae has a 4 kb inversion and a reduced, Pelargonium-like small single copy region. Our ancestral genome reconstruction suggests that a few minor rearrangements occurred in the stem branch of Geraniaceae followed by independent rearrangements in each genus. The genomic comparison demonstrates that a series of inverted repeat boundary shifts and inversions played a major role in shaping genome organization in the family. The distribution of repeats is strongly associated with breakpoints in the rearranged genomes, and the proportion and the number of large repeats (>20 bp and >60 bp) are significantly correlated with the degree of genome rearrangements. Increases in the degree of plastid genome rearrangements are correlated with the acceleration in nonsynonymous substitution rates (dN) but not with synonymous substitution rates (dS). Possible mechanisms that might contribute to this correlation, including DNA repair system and selection, are discussed.
Bioinformatics and genomic analysis of transposable elements in eukaryotic genomes.

PubMed

Janicki, Mateusz; Rooke, Rebecca; Yang, Guojun

2011-08-01

A major portion of most eukaryotic genomes are transposable elements (TEs). During evolution, TEs have introduced profound changes to genome size, structure, and function. As integral parts of genomes, the dynamic presence of TEs will continue to be a major force in reshaping genomes. Early computational analyses of TEs in genome sequences focused on filtering out "junk" sequences to facilitate gene annotation. When the high abundance and diversity of TEs in eukaryotic genomes were recognized, these early efforts transformed into the systematic genome-wide categorization and classification of TEs. The availability of genomic sequence data reversed the classical genetic approaches to discovering new TE families and superfamilies. Curated TE databases and their accurate annotation of genome sequences in turn facilitated the studies on TEs in a number of frontiers including: (1) TE-mediated changes of genome size and structure, (2) the influence of TEs on genome and gene functions, (3) TE regulation by host, (4) the evolution of TEs and their population dynamics, and (5) genomic scale studies of TE activity. Bioinformatics and genomic approaches have become an integral part of large-scale studies on TEs to extract information with pure in silico analyses or to assist wet lab experimental studies. The current revolution in genome sequencing technology facilitates further progress in the existing frontiers of research and emergence of new initiatives. The rapid generation of large-sequence datasets at record low costs on a routine basis is challenging the computing industry on storage capacity and manipulation speed and the bioinformatics community for improvement in algorithms and their implementations.
High-density marker profiling confirms ancestral genomes of Avena species and identifies D-genome chromosomes of hexaploid oat.

PubMed

Yan, Honghai; Bekele, Wubishet A; Wight, Charlene P; Peng, Yuanying; Langdon, Tim; Latta, Robert G; Fu, Yong-Bi; Diederichsen, Axel; Howarth, Catherine J; Jellen, Eric N; Boyle, Brian; Wei, Yuming; Tinker, Nicholas A

2016-11-01

Genome analysis of 27 oat species identifies ancestral groups, delineates the D genome, and identifies ancestral origin of 21 mapped chromosomes in hexaploid oat. We investigated genomic relationships among 27 species of the genus Avena using high-density genetic markers revealed by genotyping-by-sequencing (GBS). Two methods of GBS analysis were used: one based on tag-level haplotypes that were previously mapped in cultivated hexaploid oat (A. sativa), and one intended to sample and enumerate tag-level haplotypes originating from all species under investigation. Qualitatively, both methods gave similar predictions regarding the clustering of species and shared ancestral genomes. Furthermore, results were consistent with previous phylogenies of the genus obtained with conventional approaches, supporting the robustness of whole genome GBS analysis. Evidence is presented to justify the final and definitive classification of the tetraploids A. insularis, A. maroccana (=A. magna), and A. murphyi as containing D-plus-C genomes, and not A-plus-C genomes, as is most often specified in past literature. Through electronic painting of the 21 chromosome representations in the hexaploid oat consensus map, we show how the relative frequency of matches between mapped hexaploid-derived haplotypes and AC (DC)-genome tetraploids vs. A- and C-genome diploids can accurately reveal the genome origin of all hexaploid chromosomes, including the approximate positions of inter-genome translocations. Evidence is provided that supports the continued classification of a diverged B genome in AB tetraploids, and it is confirmed that no extant A-genome diploids, including A. canariensis, are similar enough to the D genome of tetraploid and hexaploid oat to warrant consideration as a D-genome diploid.
DeCoSTAR: Reconstructing the Ancestral Organization of Genes or Genomes Using Reconciled Phylogenies

PubMed Central

Anselmetti, Yoann; Patterson, Murray; Ponty, Yann; Bérard, Sèverine; Chauve, Cedric; Scornavacca, Celine; Daubin, Vincent; Tannier, Eric

2017-01-01

DeCoSTAR is a software that aims at reconstructing the organization of ancestral genes or genomes in the form of sets of neighborhood relations (adjacencies) between pairs of ancestral genes or gene domains. It can also improve the assembly of fragmented genomes by proposing evolutionary-induced adjacencies between scaffolding fragments. Ancestral genes or domains are deduced from reconciled phylogenetic trees under an evolutionary model that considers gains, losses, speciations, duplications, and transfers as possible events for gene evolution. Reconciliations are either given as input or computed with the ecceTERA package, into which DeCoSTAR is integrated. DeCoSTAR computes adjacency evolutionary scenarios using a scoring scheme based on a weighted sum of adjacency gains and breakages. Solutions, both optimal and near-optimal, are sampled according to the Boltzmann–Gibbs distribution centered around parsimonious solutions, and statistical supports on ancestral and extant adjacencies are provided. DeCoSTAR supports the features of previously contributed tools that reconstruct ancestral adjacencies, namely DeCo, DeCoLT, ART-DeCo, and DeClone. In a few minutes, DeCoSTAR can reconstruct the evolutionary history of domains inside genes, of gene fusion and fission events, or of gene order along chromosomes, for large data sets including dozens of whole genomes from all kingdoms of life. We illustrate the potential of DeCoSTAR with several applications: ancestral reconstruction of gene orders for Anopheles mosquito genomes, multidomain proteins in Drosophila, and gene fusion and fission detection in Actinobacteria. Availability: http://pbil.univ-lyon1.fr/software/DeCoSTAR (Last accessed April 24, 2017). PMID:28402423
OrthoMCL: Identification of Ortholog Groups for Eukaryotic Genomes

PubMed Central

Li, Li; Stoeckert, Christian J.; Roos, David S.

2003-01-01

The identification of orthologous groups is useful for genome annotation, studies on gene/protein evolution, comparative genomics, and the identification of taxonomically restricted sequences. Methods successfully exploited for prokaryotic genome analysis have proved difficult to apply to eukaryotes, however, as larger genomes may contain multiple paralogous genes, and sequence information is often incomplete. OrthoMCL provides a scalable method for constructing orthologous groups across multiple eukaryotic taxa, using a Markov Cluster algorithm to group (putative) orthologs and paralogs. This method performs similarly to the INPARANOID algorithm when applied to two genomes, but can be extended to cluster orthologs from multiple species. OrthoMCL clusters are coherent with groups identified by EGO, but improved recognition of “recent” paralogs permits overlapping EGO groups representing the same gene to be merged. Comparison with previously assigned EC annotations suggests a high degree of reliability, implying utility for automated eukaryotic genome annotation. OrthoMCL has been applied to the proteome data set from seven publicly available genomes (human, fly, worm, yeast, Arabidopsis, the malaria parasite Plasmodium falciparum, and Escherichia coli). A Web interface allows queries based on individual genes or user-defined phylogenetic patterns (http://www.cbil.upenn.edu/gene-family). Analysis of clusters incorporating P. falciparum genes identifies numerous enzymes that were incompletely annotated in first-pass annotation of the parasite genome. PMID:12952885
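The Markov Cluster (MCL) step mentioned in the abstract above can be illustrated with a toy sketch. This is not OrthoMCL's code: the function names, the hand-made similarity matrix, and the parameter defaults are assumptions for illustration only, and a real run would start from sequence-similarity scores rather than a 0/1 matrix. The sketch alternates the two standard MCL operations, expansion (squaring the column-stochastic matrix) and inflation (raising entries to a power and renormalizing columns), then reads clusters off the rows that still carry weight.

```python
def normalize_columns(m):
    """Scale every column of a square matrix so that it sums to 1."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def mcl(similarity, inflation=2.0, iterations=50, tol=1e-9):
    """Toy Markov clustering of a symmetric similarity matrix.

    Returns a sorted list of clusters, each a sorted list of node indices.
    Parameter names and defaults are illustrative assumptions.
    """
    n = len(similarity)
    # Add self-loops so that every column has non-zero weight.
    m = [[float(similarity[i][j]) + (1.0 if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    m = normalize_columns(m)
    for _ in range(iterations):
        # Expansion: one step of random-walk spreading (matrix square).
        expanded = [[sum(m[i][k] * m[k][j] for k in range(n))
                     for j in range(n)] for i in range(n)]
        # Inflation: strengthen strong flows, weaken weak ones.
        inflated = normalize_columns(
            [[value ** inflation for value in row] for row in expanded])
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break
    # Each row that still carries weight groups the columns it attracts.
    clusters = []
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > 1e-6)
        if members and members not in clusters:
            clusters.append(members)
    return sorted(sorted(c) for c in clusters)


# Hypothetical example: two groups of three mutually similar proteins.
toy = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
groups = mcl(toy)
```

The inflation value controls cluster granularity in MCL generally; which setting suits a given proteome comparison is not stated in the abstract and would need to be checked against the tool's own documentation.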
Arabidopsis transcription factors: genome-wide comparative analysis among eukaryotes.

PubMed

Riechmann, J L; Heard, J; Martin, G; Reuber, L; Jiang, C; Keddie, J; Adam, L; Pineda, O; Ratcliffe, O J; Samaha, R R; Creelman, R; Pilgrim, M; Broun, P; Zhang, J Z; Ghandehari, D; Sherman, B K; Yu, G

2000-12-15

The completion of the Arabidopsis thaliana genome sequence allows a comparative analysis of transcriptional regulators across the three eukaryotic kingdoms. Arabidopsis dedicates over 5% of its genome to code for more than 1500 transcription factors, about 45% of which are from families specific to plants. Arabidopsis transcription factors that belong to families common to all eukaryotes do not share significant similarity with those of the other kingdoms beyond the conserved DNA binding domains, many of which have been arranged in combinations specific to each lineage. The genome-wide comparison reveals the evolutionary generation of diversity in the regulation of transcription.
The "fossilized" mitochondrial genome of Liriodendron tulipifera: ancestral gene content and order, ancestral editing sites, and extraordinarily low mutation rate.

PubMed

Richardson, Aaron O; Rice, Danny W; Young, Gregory J; Alverson, Andrew J; Palmer, Jeffrey D

2013-04-15

The mitochondrial genomes of flowering plants vary greatly in size, gene content, gene order, mutation rate and level of RNA editing. However, the narrow phylogenetic breadth of available genomic data has limited our ability to reconstruct these traits in the ancestral flowering plant and, therefore, to infer subsequent patterns of evolution across angiosperms. We sequenced the mitochondrial genome of Liriodendron tulipifera, the first from outside the monocots or eudicots. This 553,721 bp mitochondrial genome has evolved remarkably slowly in virtually all respects, with an extraordinarily low genome-wide silent substitution rate, retention of genes frequently lost in other angiosperm lineages, and conservation of ancestral gene clusters. The mitochondrial protein genes in Liriodendron are the most heavily edited of any angiosperm characterized to date. Most of these sites are also edited in various other lineages, which allowed us to polarize losses of editing sites in other parts of the angiosperm phylogeny. Finally, we added comprehensive gene sequence data for two other magnoliids, Magnolia stellata and the more distantly related Calycanthus floridus, to measure rates of sequence evolution in Liriodendron with greater accuracy. The Magnolia genome has evolved at an even lower rate, revealing a roughly 5,000-fold range of synonymous-site divergence among angiosperms whose mitochondrial gene space has been comprehensively sequenced. Using Liriodendron as a guide, we estimate that the ancestral flowering plant mitochondrial genome contained 41 protein genes, 14 tRNA genes of mitochondrial origin, as many as 7 tRNA genes of chloroplast origin, >700 sites of RNA editing, and some 14 colinear gene clusters. Many of these gene clusters, genes and RNA editing sites have been variously lost in different lineages over the course of the ensuing ∽200 million years of angiosperm evolution.
Function-selective domain architecture plasticity potentials in eukaryotic genome evolution

PubMed Central

Linkeviciute, Viktorija; Rackham, Owen J.L.; Gough, Julian; Oates, Matt E.; Fang, Hai

2015-01-01

To help evaluate how protein function impacts on genome evolution, we introduce a new concept of ‘architecture plasticity potential’ – the capacity to form distinct domain architectures – both for an individual domain, or more generally for a set of domains grouped by shared function. We devise a scoring metric to measure the plasticity potential for these domain sets, and evaluate how function has changed over time for different species. Applying this metric to a phylogenetic tree of eukaryotic genomes, we find that the involvement of each function is not random but highly selective. For certain lineages there is strong bias for evolution to involve domains related to certain functions. In general eukaryotic genomes, particularly animals, expand complex functional activities such as signalling and regulation, but at the cost of reducing metabolic processes. We also observe differential evolution of transcriptional regulation and a unique evolutionary role of channel regulators; crucially this is only observable in terms of the architecture plasticity potential. Our findings provide a new layer of information to understand the significance of function in eukaryotic genome evolution. A web search tool, available at http://supfam.org/Pevo, offers a wide spectrum of options for exploring functional importance in eukaryotic genome evolution. PMID:25980317
Comparative analysis of rosaceous genomes and the reconstruction of a putative ancestral genome for the family

PubMed Central

2011-01-01

Background Comparative genome mapping studies in Rosaceae have been conducted until now by aligning genetic maps within the same genus, or closely related genera and using a limited number of common markers. The growing body of genomics resources and sequence data for both Prunus and Fragaria permits detailed comparisons between these genera and the recently released Malus × domestica genome sequence. Results We generated a comparative analysis using 806 molecular markers that are anchored genetically to the Prunus and/or Fragaria reference maps, and physically to the Malus genome sequence. Markers in common for Malus and Prunus, and Malus and Fragaria, respectively were 784 and 148. The correspondence between marker positions was high and conserved syntenic blocks were identified among the three genera in the Rosaceae. We reconstructed a proposed ancestral genome for the Rosaceae. Conclusions A genome containing nine chromosomes is the most likely candidate for the ancestral Rosaceae progenitor. The number of chromosomal translocations observed between the three genera investigated was low. However, the number of inversions identified among Malus and Prunus was much higher than any reported genome comparisons in plants, suggesting that small inversions have played an important role in the evolution of these two genera or of the Rosaceae. PMID:21226921
The phagotrophic origin of eukaryotes and phylogenetic classification of Protozoa.

PubMed

Cavalier-Smith, T

2002-03-01

ancestrally biciliate clade, named 'bikonts'. The apparently conflicting rRNA and protein trees can be reconciled with each other and this ultrastructural interpretation if long-branch distortions, some mechanistically explicable, are allowed for. Bikonts comprise two groups: corticoflagellates, with a younger anterior cilium, no centrosomal cone and ancestrally a semi-rigid cell cortex with a microtubular band on either side of the posterior mature centriole; and Rhizaria [a new infrakingdom comprising Cercozoa (now including Ascetosporea classis nov.), Retaria phylum nov., Heliozoa and Apusozoa phylum nov.], having a centrosomal cone or radiating microtubules and two microtubular roots and a soft surface, frequently with reticulopodia. Corticoflagellates comprise photokaryotes (Plantae and chromalveolates, both ancestrally with cortical alveoli) and Excavata (a new protozoan infrakingdom comprising Loukozoa, Discicristata and Archezoa, ancestrally with three microtubular roots). All basal eukaryotic radiations were of mitochondrial aerobes; hydrogenosomes evolved polyphyletically from mitochondria long afterwards, the persistence of their double envelope long after their genomes disappeared being a striking instance of membrane heredity. I discuss the relationship between the 13 protozoan phyla recognized here and revise higher protozoan classification by updating as subkingdoms Lankester's 1878 division of Protozoa into Corticata (Excavata, Alveolata; with prominent cortical microtubules and ancestrally localized cytostome--the Parabasalia probably secondarily internalized the cytoskeleton) and Gymnomyxa [infrakingdoms Sarcomastigota (Choanozoa, Amoebozoa) and Rhizaria; both ancestrally with a non-cortical cytoskeleton of radiating singlet microtubules and a relatively soft cell surface with diffused feeding]. As the eukaryote root almost certainly lies within Gymnomyxa, probably among the Sarcomastigota, Corticata are derived. 
Following the single symbiogenetic origin of
3D genomics imposes evolution of the domain model of eukaryotic genome organization.

PubMed

Razin, Sergey V; Vassetzky, Yegor S

2017-02-01

The hypothesis that the genome is composed of a patchwork of structural and functional domains (units) that may be either active or repressed was proposed almost 30 years ago. Here, we examine the evolution of the domain model of eukaryotic genome organization in view of the expansion of genome-scale techniques in the twenty-first century that have provided us with a wealth of information on genome organization, folding, and functioning.
The ring of life provides evidence for a genome fusion origin of eukaryotes.

PubMed

Rivera, Maria C; Lake, James A

2004-09-09

Genomes hold within them the record of the evolution of life on Earth. But genome fusions and horizontal gene transfer seem to have obscured sufficiently the gene sequence record such that it is difficult to reconstruct the phylogenetic tree of life. Here we determine the general outline of the tree using complete genome data from representative prokaryotes and eukaryotes and a new genome analysis method that makes it possible to reconstruct ancient genome fusions and phylogenetic trees. Our analyses indicate that the eukaryotic genome resulted from a fusion of two diverse prokaryotic genomes, and therefore at the deepest levels linking prokaryotes and eukaryotes, the tree of life is actually a ring of life. One fusion partner branches from deep within an ancient photosynthetic clade, and the other is related to the archaeal prokaryotes. The eubacterial organism is either a proteobacterium, or a member of a larger photosynthetic clade that includes the Cyanobacteria and the Proteobacteria.
SigHunt: horizontal gene transfer finder optimized for eukaryotic genomes.

PubMed

Jaron, Kamil S; Moravec, Jiří C; Martínková, Natália

2014-04-15

Genomic islands (GIs) are DNA fragments incorporated into a genome through horizontal gene transfer (also called lateral gene transfer), often with functions novel for a given organism. While methods for their detection are well researched in prokaryotes, the complexity of eukaryotic genomes makes direct utilization of these methods unreliable, and so labour-intensive phylogenetic searches are used instead. We present a surrogate method that investigates nucleotide base composition of the DNA sequence in a eukaryotic genome and identifies putative GIs. We calculate a genomic signature as a vector of tetranucleotide (4-mer) frequencies using a sliding window approach. Extending the neighbourhood of the sliding window, we establish a local kernel density estimate of the 4-mer frequency. We score the number of 4-mer frequencies in the sliding window that deviate from the credibility interval of their local genomic density using a newly developed discrete interval accumulative score (DIAS). To further improve the effectiveness of DIAS, we select informative 4-mers in a range of organisms using the tetranucleotide quality score developed herein. We show that the SigHunt method is computationally efficient and able to detect GIs in eukaryotic genomes that represent non-ameliorated integration. Thus, it is suited to scanning for change in organisms with different DNA composition. Source code and scripts, implemented in C and R and platform-independent, are freely available for download at http://www.iba.muni.cz/index-en.php?pg=research-data-analysis-tools-sighunt. Contact: 376090@mail.muni.cz or martinkova@ivb.cz.
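The first step described in the abstract above, a genomic signature computed as a vector of 4-mer frequencies over a sliding window, can be sketched in a few lines of Python. This is only an illustration, not the SigHunt code (which is written in C and R): the function names, the window and step sizes, and the choice to skip 4-mers containing ambiguous bases are all assumptions, and the kernel density estimate and DIAS scoring are not reproduced.

```python
from collections import Counter
from itertools import product

# All 256 possible tetranucleotides in a fixed order (assumed ordering).
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetranucleotide_signature(seq):
    """Return the relative frequency of each of the 256 4-mers in seq.

    4-mers containing characters other than A, C, G, T are ignored
    (an assumption made for this sketch).
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts[k] for k in KMERS)
    if total == 0:
        return [0.0] * len(KMERS)
    return [counts[k] / total for k in KMERS]


def sliding_signatures(seq, window=5000, step=1000):
    """Yield (start, signature) for every full window along seq."""
    for start in range(0, len(seq) - window + 1, step):
        yield start, tetranucleotide_signature(seq[start:start + window])
```

Windows whose signatures differ strongly from those of their neighbours would be the candidates that a scoring scheme such as DIAS then evaluates.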
Introns Protect Eukaryotic Genomes from Transcription-Associated Genetic Instability.

PubMed

Bonnet, Amandine; Grosso, Ana R; Elkaoutari, Abdessamad; Coleno, Emeline; Presle, Adrien; Sridhara, Sreerama C; Janbon, Guilhem; Géli, Vincent; de Almeida, Sérgio F; Palancade, Benoit

2017-08-17

Transcription is a source of genetic instability that can notably result from the formation of genotoxic DNA:RNA hybrids, or R-loops, between the nascent mRNA and its template. Here we report an unexpected function for introns in counteracting R-loop accumulation in eukaryotic genomes. Deletion of endogenous introns increases R-loop formation, while insertion of an intron into an intronless gene suppresses R-loop accumulation and its deleterious impact on transcription and recombination in yeast. Recruitment of the spliceosome onto the mRNA, but not splicing per se, is shown to be critical to attenuate R-loop formation and transcription-associated genetic instability. Genome-wide analyses in a number of distant species differing in their intron content, including human, further revealed that intron-containing genes and the intron-richest genomes are best protected against R-loop accumulation and subsequent genetic instability. Our results thereby provide a possible rationale for the conservation of introns throughout the eukaryotic lineage.
The Oxytricha trifallax Macronuclear Genome: A Complex Eukaryotic Genome with 16,000 Tiny Chromosomes

PubMed Central

Swart, Estienne C.; Bracht, John R.; Magrini, Vincent; Minx, Patrick; Chen, Xiao; Zhou, Yi; Khurana, Jaspreet S.; Goldman, Aaron D.; Nowacki, Mariusz; Schotanus, Klaas; Jung, Seolkyoung; Fulton, Robert S.; Ly, Amy; McGrath, Sean; Haub, Kevin; Wiggins, Jessica L.; Storton, Donna; Matese, John C.; Parsons, Lance; Chang, Wei-Jen; Bowen, Michael S.; Stover, Nicholas A.; Jones, Thomas A.; Eddy, Sean R.; Herrick, Glenn A.; Doak, Thomas G.; Wilson, Richard K.; Mardis, Elaine R.; Landweber, Laura F.

2013-01-01

The macronuclear genome of the ciliate Oxytricha trifallax displays an extreme and unique eukaryotic genome architecture with extensive genomic variation. During sexual genome development, the expressed, somatic macronuclear genome is whittled down to the genic portion of a small fraction (∼5%) of its precursor “silent” germline micronuclear genome by a process of “unscrambling” and fragmentation. The tiny macronuclear “nanochromosomes” typically encode single, protein-coding genes (a small portion, 10%, encode 2–8 genes), have minimal noncoding regions, and are differentially amplified to an average of ∼2,000 copies. We report the high-quality genome assembly of ∼16,000 complete nanochromosomes (∼50 Mb haploid genome size) that vary from 469 bp to 66 kb long (mean ∼3.2 kb) and encode ∼18,500 genes. Alternative DNA fragmentation processes ∼10% of the nanochromosomes into multiple isoforms that usually encode complete genes. Nucleotide diversity in the macronucleus is very high (SNP heterozygosity is ∼4.0%), suggesting that Oxytricha trifallax may have one of the largest known effective population sizes of eukaryotes. Comparison to other ciliates with nonscrambled genomes and long macronuclear chromosomes (on the order of 100 kb) suggests several candidate proteins that could be involved in genome rearrangement, including domesticated MULE and IS1595-like DDE transposases. The assembly of the highly fragmented Oxytricha macronuclear genome is the first completed genome with such an unusual architecture. This genome sequence provides tantalizing glimpses into novel molecular biology and evolution. For example, Oxytricha maintains tens of millions of telomeres per cell and has also evolved an intriguing expansion of telomere end-binding proteins. In conjunction with the micronuclear genome in progress, the O. trifallax macronuclear genome will provide an invaluable resource for investigating programmed genome rearrangements, complementing

Origins and evolution of viruses of eukaryotes: The ultimate modularity

PubMed Central

Koonin, Eugene V.; Dolja, Valerian V.; Krupovic, Mart

2018-01-01

Viruses and other selfish genetic elements are dominant entities in the biosphere, with respect to both physical abundance and genetic diversity. Various selfish elements parasitize on all cellular life forms. The relative abundances of different classes of viruses are dramatically different between prokaryotes and eukaryotes. In prokaryotes, the great majority of viruses possess double-stranded (ds) DNA genomes, with a substantial minority of single-stranded (ss) DNA viruses and only limited presence of RNA viruses. In contrast, in eukaryotes, RNA viruses account for the majority of the virome diversity although ssDNA and dsDNA viruses are common as well. Phylogenomic analysis yields tangible clues for the origins of major classes of eukaryotic viruses and in particular their likely roots in prokaryotes. Specifically, the ancestral genome of positive-strand RNA viruses of eukaryotes might have been assembled de novo from genes derived from prokaryotic retroelements and bacteria although a primordial origin of this class of viruses cannot be ruled out. Different groups of double-stranded RNA viruses derive either from dsRNA bacteriophages or from positive-strand RNA viruses. The eukaryotic ssDNA viruses apparently evolved via a fusion of genes from prokaryotic rolling circle-replicating plasmids and positive-strand RNA viruses. Different families of eukaryotic dsDNA viruses appear to have originated from specific groups of bacteriophages on at least two independent occasions. Polintons, the largest known eukaryotic transposons, predicted to also form virus particles, most likely, were the evolutionary intermediates between bacterial tectiviruses and several groups of eukaryotic dsDNA viruses including the proposed order “Megavirales” that unites diverse families of large and giant viruses. Strikingly, evolution of all classes of eukaryotic viruses appears to have involved fusion between structural and replicative gene modules derived from different sources
Ancestral and derived protein import pathways in the mitochondrion of Reclinomonas americana.

PubMed

Tong, Janette; Dolezal, Pavel; Selkrig, Joel; Crawford, Simon; Simpson, Alastair G B; Noinaj, Nicholas; Buchanan, Susan K; Gabriel, Kipros; Lithgow, Trevor

2011-05-01

The evolution of mitochondria from ancestral bacteria required that new protein transport machinery be established. Recent controversy over the evolution of these new molecular machines hinges on the degree to which ancestral bacterial transporters contributed during the establishment of the new protein import pathway. Reclinomonas americana is a unicellular eukaryote with the most gene-rich mitochondrial genome known, and the large collection of membrane proteins encoded on the mitochondrial genome of R. americana includes a bacterial-type SecY protein transporter. Analysis of expressed sequence tags shows R. americana also has components of a mitochondrial protein translocase or "translocase in the inner mitochondrial membrane complex." Along with several other membrane proteins encoded on the mitochondrial genome, Cox11, an assembly factor for cytochrome c oxidase, retains sequence features suggesting that it is assembled by the SecY complex in R. americana. Despite this, protein import studies show that the RaCox11 protein is suited for import into mitochondria and functional complementation if the gene is transferred into the nucleus of yeast. Reclinomonas americana provides direct evidence that bacterial protein transport pathways were retained, alongside the evolving mitochondrial protein import machinery, shedding new light on the process of mitochondrial evolution.
Genomic impact of eukaryotic transposable elements

PubMed Central

2012-01-01

The third international conference on the genomic impact of eukaryotic transposable elements (TEs) was held 24 to 28 February 2012 at the Asilomar Conference Center, Pacific Grove, CA, USA. Sponsored in part by the National Institutes of Health grant 5 P41 LM006252, the goal of the conference was to bring together researchers from around the world who study the impact and mechanisms of TEs using multiple computational and experimental approaches. The meeting drew close to 170 attendees and included invited floor presentations on the biology of TEs and their genomic impact, as well as numerous talks contributed by young scientists. The workshop talks were devoted to computational analysis of TEs with additional time for discussion of unresolved issues. Also, there was ample opportunity for poster presentations and informal evening discussions. The success of the meeting reflects the important role of Repbase in comparative genomic studies, and emphasizes the need for close interactions between experimental and computational biologists in the years to come. PMID:23171443
Genomic impact of eukaryotic transposable elements.

PubMed

Arkhipova, Irina R; Batzer, Mark A; Brosius, Juergen; Feschotte, Cédric; Moran, John V; Schmitz, Jürgen; Jurka, Jerzy

2012-11-21

The third international conference on the genomic impact of eukaryotic transposable elements (TEs) was held 24 to 28 February 2012 at the Asilomar Conference Center, Pacific Grove, CA, USA. Sponsored in part by the National Institutes of Health grant 5 P41 LM006252, the goal of the conference was to bring together researchers from around the world who study the impact and mechanisms of TEs using multiple computational and experimental approaches. The meeting drew close to 170 attendees and included invited floor presentations on the biology of TEs and their genomic impact, as well as numerous talks contributed by young scientists. The workshop talks were devoted to computational analysis of TEs with additional time for discussion of unresolved issues. Also, there was ample opportunity for poster presentations and informal evening discussions. The success of the meeting reflects the important role of Repbase in comparative genomic studies, and emphasizes the need for close interactions between experimental and computational biologists in the years to come.
GWFASTA: server for FASTA search in eukaryotic and microbial genomes.

PubMed

Issac, Biju; Raghava, G P S

2002-09-01

Similarity searches are a powerful method for solving important biological problems such as database scanning, evolutionary studies, gene prediction, and protein structure prediction. FASTA is a widely used sequence comparison tool for rapid database scanning. Here we describe the GWFASTA server that was developed to assist the FASTA user in similarity searches against partially and/or completely sequenced genomes. GWFASTA consists of more than 60 microbial genomes, eight eukaryote genomes, and proteomes of annotated genomes. In fact, it provides the maximum number of databases for similarity searching from a single platform. GWFASTA allows the submission of more than one sequence as a single query for a FASTA search. It also provides integrated post-processing of FASTA output, including compositional analysis of proteins, multiple sequence alignment, and phylogenetic analysis. Furthermore, it summarizes the search results organism-wise for prokaryotes and chromosome-wise for eukaryotes. Thus, the integration of different tools for sequence analyses makes GWFASTA a powerful tool for biologists.
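Two of the steps mentioned in the abstract above, accepting several sequences in one query and computing a compositional analysis of proteins, can be illustrated with a short Python sketch. It does not reproduce GWFASTA or the FASTA search program itself; the function names and the fallback naming of records with empty headers are assumptions made for this example.

```python
from collections import Counter


def parse_fasta(text):
    """Split multi-sequence FASTA text into a {name: sequence} dict.

    The record name is the first whitespace-delimited token of the
    header line; records with an empty header get a generated name
    (an assumption for this sketch).
    """
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            tokens = line[1:].split()
            name = tokens[0] if tokens else "seq%d" % (len(records) + 1)
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def composition(seq):
    """Return the fraction of each residue in a protein sequence."""
    counts = Counter(seq.upper())
    total = sum(counts.values())
    return {residue: c / total for residue, c in counts.items()}
```

Each parsed record could then be submitted as a separate search and its composition reported alongside the hits.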
Ancestral Relationships Using Metafounders: Finite Ancestral Populations and Across Population Relationships.

PubMed

Legarra, Andres; Christensen, Ole F; Vitezica, Zulma G; Aguilar, Ignacio; Misztal, Ignacy

2015-06-01

Recent use of genomic (marker-based) relationships shows that relationships exist within and across base populations (breeds or lines). However, current treatment of pedigree relationships is unable to consider relationships within or across base populations, although such relationships must exist due to finite size of the ancestral population and connections between populations. This complicates the conciliation of both approaches and, in particular, combining pedigree with genomic relationships. We present a coherent theoretical framework to consider base populations in pedigree relationships. We suggest a conceptual framework that considers each ancestral population as a finite-sized pool of gametes. This generates across-individual relationships and contrasts with the classical view, in which each population is considered as an infinite, unrelated pool. Several ancestral populations may be connected and therefore related. Each ancestral population can be represented as a "metafounder," a pseudo-individual included as founder of the pedigree and similar to an "unknown parent group." Metafounders have self- and across relationships according to a set of parameters, which measure ancestral relationships, i.e., homozygosities within populations and relationships across populations. These parameters can be estimated from existing pedigree and marker genotypes using maximum likelihood or a method based on summary statistics, for arbitrarily complex pedigrees. Equivalences of genetic variance and variance components between the classical and this new parameterization are shown. Segregation variance on crosses of populations is modeled. Efficient algorithms for computation of relationship matrices, their inverses, and inbreeding coefficients are presented. Use of metafounders leads to compatibility of genomic and pedigree relationship matrices and to simple computing algorithms. Examples and code are given.
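To make the idea of a metafounder as a pseudo-individual at the top of the pedigree more concrete, here is a minimal Python sketch of the classical tabular method for a pedigree relationship matrix, extended with a single metafounder. It is not the authors' code. The update rules are assumptions for this sketch: the metafounder has self-relationship gamma, every unknown parent is replaced by the metafounder, and the usual recursions are then applied (offspring relationship is the mean of the parents' relationships, self-relationship is 1 plus half the relationship between the parents). The function name and the pedigree encoding are hypothetical. Setting gamma to 0 gives the classical matrix.

```python
def relationship_matrix(pedigree, gamma=0.0):
    """Tabular relationship matrix with one metafounder (sketch).

    pedigree: list of (sire, dam) tuples of 0-based indices, ordered so
    that parents appear before their offspring; None marks an unknown
    parent, which is treated as the single metafounder.
    gamma: assumed self-relationship of the metafounder.
    """
    n = len(pedigree)
    # Row/column 0 is the metafounder; individuals are shifted by one.
    A = [[0.0] * (n + 1) for _ in range(n + 1)]
    A[0][0] = gamma
    for i, (sire, dam) in enumerate(pedigree, start=1):
        s = 0 if sire is None else sire + 1
        d = 0 if dam is None else dam + 1
        for j in range(i):
            # Relationship with an earlier individual: mean over parents.
            A[i][j] = A[j][i] = 0.5 * (A[j][s] + A[j][d])
        # Self-relationship: 1 + inbreeding coefficient.
        A[i][i] = 1.0 + 0.5 * A[s][d]
    # Drop the metafounder row and column before returning.
    return [row[1:] for row in A[1:]]
```

For example, `relationship_matrix([(None, None), (None, None), (0, 1)], gamma=0.2)` treats the two founders as related draws from a finite gamete pool rather than as unrelated individuals.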
Ancestral Relationships Using Metafounders: Finite Ancestral Populations and Across Population Relationships

PubMed Central

Legarra, Andres; Christensen, Ole F.; Vitezica, Zulma G.; Aguilar, Ignacio; Misztal, Ignacy

2015-01-01

Recent use of genomic (marker-based) relationships shows that relationships exist within and across base populations (breeds or lines). However, current treatment of pedigree relationships is unable to consider relationships within or across base populations, although such relationships must exist due to finite size of the ancestral population and connections between populations. This complicates the conciliation of both approaches and, in particular, combining pedigree with genomic relationships. We present a coherent theoretical framework to consider base populations in pedigree relationships. We suggest a conceptual framework that considers each ancestral population as a finite-sized pool of gametes. This generates across-individual relationships and contrasts with the classical view, in which each population is considered as an infinite, unrelated pool. Several ancestral populations may be connected and therefore related. Each ancestral population can be represented as a “metafounder,” a pseudo-individual included as founder of the pedigree and similar to an “unknown parent group.” Metafounders have self- and across relationships according to a set of parameters, which measure ancestral relationships, i.e., homozygosities within populations and relationships across populations. These parameters can be estimated from existing pedigree and marker genotypes using maximum likelihood or a method based on summary statistics, for arbitrarily complex pedigrees. Equivalences of genetic variance and variance components between the classical and this new parameterization are shown. Segregation variance on crosses of populations is modeled. Efficient algorithms for computation of relationship matrices, their inverses, and inbreeding coefficients are presented. Use of metafounders leads to compatibility of genomic and pedigree relationship matrices and to simple computing algorithms. Examples and code are given. PMID:25873631
Evidence-based green algal genomics reveals marine diversity and ancestral characteristics of land plants

DOE PAGES

van Baren, Marijke J.; Bachy, Charles; Reistetter, Emily Nahas; ...

2016-03-31

Prasinophytes are widespread marine green algae that are related to plants. Abundance of the genus Micromonas has reportedly increased in the Arctic due to climate-induced changes. Thus, studies of these organisms are important for marine ecology and understanding Viridiplantae evolution and diversification. We generated evidence-based Micromonas gene models using proteomics and RNA-Seq to improve prasinophyte genomic resources. First, sequences of four chromosomes in the 22 Mb Micromonas pusilla (CCMP1545) genome were finished. Comparison with the finished 21 Mb Micromonas commoda (RCC299) shows they share ≤ 8,142 of ~10,000 protein-encoding genes, depending on the analysis method. Unlike RCC299 and other sequenced eukaryotes, CCMP1545 has two abundant repetitive intron types and a high percentage (26%) of GC splice donors. Micromonas has more genus-specific protein families (19%) than other genome-sequenced prasinophytes (11%). Comparative analyses using predicted proteomes from other prasinophytes reveal proteins likely related to scale formation and ancestral photosynthesis. Our studies also indicate that peptidoglycan (PG) biosynthesis enzymes have been lost in multiple independent events in select prasinophytes and most plants. However, CCMP1545, polar Micromonas CCMP2099 and prasinophytes from other classes retain the entire PG pathway, like moss and glaucophyte algae. Multiple vascular plants that share a unique bi-domain protein also have the pathway, except the Penicillin-Binding-Protein. Alongside Micromonas experiments using antibiotics that halt bacterial PG biosynthesis, the findings highlight unrecognized phylogenetic complexity in the PG-pathway retention and implicate a role in chloroplast structure or division in several extant Viridiplantae lineages. Extensive differences in gene loss and architecture between related prasinophytes underscore their extensive divergence. PG biosynthesis genes from the cyanobacterial endosymbiont that became the
Eukaryotic genes of archaebacterial origin are more important than the more numerous eubacterial genes, irrespective of function.

PubMed

Cotton, James A; McInerney, James O

2010-10-05

The traditional tree of life shows eukaryotes as a distinct lineage of living things, but many studies have suggested that the first eukaryotic cells were chimeric, descended from both Eubacteria (through the mitochondrion) and Archaebacteria. Eukaryote nuclei thus contain genes of both eubacterial and archaebacterial origins, and these genes have different functions within eukaryotic cells. Here we report that archaebacterium-derived genes are significantly more likely to be essential to yeast viability, are more highly expressed, and are significantly more highly connected and more central in the yeast protein interaction network. These findings hold irrespective of whether the genes have an informational or operational function, so that many features of eukaryotic genes with prokaryotic homologs can be explained by their origin, rather than their function. Taken together, our results show that genes of archaebacterial origin are in some senses more important to yeast metabolism than genes of eubacterial origin. This importance reflects these genes' origin as the ancestral nuclear component of the eukaryotic genome.
Optimizing eukaryotic cell hosts for protein production through systems biotechnology and genome-scale modeling.

PubMed

Gutierrez, Jahir M; Lewis, Nathan E

2015-07-01

Eukaryotic cell lines, including Chinese hamster ovary cells, yeast, and insect cells, are invaluable hosts for the production of many recombinant proteins. With the advent of genomic resources, one can now leverage genome-scale computational modeling of cellular pathways to rationally engineer eukaryotic host cells. Genome-scale models of metabolism include all known biochemical reactions occurring in a specific cell. By describing these mathematically and using tools such as flux balance analysis, the models can simulate cell physiology and provide targets for cell engineering that could lead to enhanced cell viability, titer, and productivity. Here we review examples in which metabolic models in eukaryotic cell cultures have been used to rationally select targets for genetic modification, improve cellular metabolic capabilities, design media supplementation, and interpret high-throughput omics data. As more comprehensive models of metabolism and other cellular processes are developed for eukaryotic cell culture, these will enable further exciting developments in cell line engineering, thus accelerating recombinant protein production and biotechnology in the years to come.
EuPathDB: the eukaryotic pathogen genomics database resource

PubMed Central

Aurrecoechea, Cristina; Barreto, Ana; Basenko, Evelina Y.; Brestelli, John; Brunk, Brian P.; Cade, Shon; Crouch, Kathryn; Doherty, Ryan; Falke, Dave; Fischer, Steve; Gajria, Bindu; Harb, Omar S.; Heiges, Mark; Hertz-Fowler, Christiane; Hu, Sufen; Iodice, John; Kissinger, Jessica C.; Lawrence, Cris; Li, Wei; Pinney, Deborah F.; Pulman, Jane A.; Roos, David S.; Shanmugasundram, Achchuthan; Silva-Franco, Fatima; Steinbiss, Sascha; Stoeckert, Christian J.; Spruill, Drew; Wang, Haiming; Warrenfeltz, Susanne; Zheng, Jie

2017-01-01

The Eukaryotic Pathogen Genomics Database Resource (EuPathDB, http://eupathdb.org) is a collection of databases covering 170+ eukaryotic pathogens (protists & fungi), along with relevant free-living and non-pathogenic species, and select pathogen hosts. To facilitate the discovery of meaningful biological relationships, the databases couple preconfigured searches with visualization and analysis tools for comprehensive data mining via intuitive graphical interfaces and APIs. All data are analyzed with the same workflows, including creation of gene orthology profiles, so data are easily compared across data sets, data types and organisms. EuPathDB is updated with numerous new analysis tools, features, data sets and data types. New tools include GO, metabolic pathway and word enrichment analyses plus an online workspace for analysis of personal, non-public, large-scale data. Expanded data content is mostly genomic and functional genomic data while new data types include protein microarray, metabolic pathways, compounds, quantitative proteomics, copy number variation, and polysomal transcriptomics. New features include consistent categorization of searches, data sets and genome browser tracks; redesigned gene pages; effective integration of alternative transcripts; and a EuPathDB Galaxy instance for private analyses of a user's data. Forthcoming upgrades include user workspaces for private integration of data with existing EuPathDB data and improved integration and presentation of host–pathogen interactions. PMID:27903906
Origin and evolution of SINEs in eukaryotic genomes.

PubMed

Kramerov, D A; Vassetzky, N S

2011-12-01

Short interspersed elements (SINEs) are one of the two most prolific mobile genomic elements in most of the higher eukaryotes. Although their biology is still not thoroughly understood, the unusual life cycle of these simple elements, amplified as genomic parasites, makes their evolution unique in many ways. In contrast to most genetic elements including other transposons, SINEs emerged de novo many times in evolution from available molecules (for example, tRNA). The involvement of reverse transcription in their amplification cycle, the huge number of genomic copies and their modular structure allow variation mechanisms in SINEs that are uncommon or rare in other genetic elements (module exchange between SINE families, dimerization, and so on). Overall, SINE evolution includes their emergence, progressive optimization and counteraction to the cell's defense against mobile genetic elements.
Genome-wide mapping reveals single-origin chromosome replication in Leishmania, a eukaryotic microbe.

PubMed

Marques, Catarina A; Dickens, Nicholas J; Paape, Daniel; Campbell, Samantha J; McCulloch, Richard

2015-10-19

DNA replication initiates on defined genome sites, termed origins. Origin usage appears to follow common rules in the eukaryotic organisms examined to date: all chromosomes are replicated from multiple origins, which display variations in firing efficiency and are selected from a larger pool of potential origins. To ask if these features of DNA replication are true of all eukaryotes, we describe genome-wide origin mapping in the parasite Leishmania. Origin mapping in Leishmania suggests a striking divergence in origin usage relative to characterized eukaryotes, since each chromosome appears to be replicated from a single origin. By comparing two species of Leishmania, we find evidence that such origin singularity is maintained in the face of chromosome fusion or fission events during evolution. Mapping Leishmania origins suggests that all origins fire with equal efficiency, and that the genomic sites occupied by origins differ from related non-origins sites. Finally, we provide evidence that origin location in Leishmania displays striking conservation with Trypanosoma brucei, despite the latter parasite replicating its chromosomes from multiple, variable strength origins. The demonstration of chromosome replication for a single origin in Leishmania, a microbial eukaryote, has implications for the evolution of origin multiplicity and associated controls, and may explain the pervasive aneuploidy that characterizes Leishmania chromosome architecture.
GFFview: A Web Server for Parsing and Visualizing Annotation Information of Eukaryotic Genome.

PubMed

Deng, Feilong; Chen, Shi-Yi; Wu, Zhou-Lin; Hu, Yongsong; Jia, Xianbo; Lai, Song-Jia

2017-10-01

Owing to the wide application of RNA sequencing (RNA-seq) technology, more and more eukaryotic genomes have been extensively annotated with features such as gene structure, alternative splicing, and noncoding loci. Genome annotation is commonly stored as plain text in General Feature Format (GFF), and files can be hundreds or thousands of Mb in size. Manipulating GFF files is therefore a challenge for biologists without bioinformatics skills. In this study, we provide a web server (GFFview) that parses the annotation information of a eukaryotic genome and generates a statistical description of six indices for visualization. GFFview is very useful for investigating the quality and differences of de novo assembled transcriptomes in RNA-seq studies.
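As an illustration of the kind of parsing such a server performs, the following Python sketch streams through GFF lines and reports the number of features and the mean feature length per feature type. It is not GFFview's code, and the abstract does not name its six indices, so the two statistics computed here are assumptions chosen for the example. The function name is hypothetical. The column layout (nine tab-separated fields, with type, start and end in columns 3 to 5) follows the GFF format.

```python
from collections import Counter, defaultdict


def summarize_gff(lines):
    """Return (feature counts, mean feature length) keyed by feature type.

    lines: any iterable of GFF text lines, e.g. an open file, so large
    files are processed one line at a time.
    """
    type_counts = Counter()
    lengths = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # skip anything that is not a feature record
        feature_type = cols[2]
        start, end = int(cols[3]), int(cols[4])
        type_counts[feature_type] += 1
        lengths[feature_type].append(end - start + 1)  # 1-based, inclusive
    mean_length = {t: sum(v) / len(v) for t, v in lengths.items()}
    return type_counts, mean_length
```

Passing an open file handle, for example `summarize_gff(open("annotation.gff"))` with a hypothetical file name, avoids loading a file of several hundred Mb into memory.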
Origins and evolution of viruses of eukaryotes: The ultimate modularity

DOE Office of Scientific and Technical Information (OSTI.GOV)

Koonin, Eugene V.; Dolja, Valerian V.; Krupovic, Mart

2015-05-15

Viruses and other selfish genetic elements are dominant entities in the biosphere, with respect to both physical abundance and genetic diversity. Various selfish elements parasitize on all cellular life forms. The relative abundances of different classes of viruses are dramatically different between prokaryotes and eukaryotes. In prokaryotes, the great majority of viruses possess double-stranded (ds) DNA genomes, with a substantial minority of single-stranded (ss) DNA viruses and only limited presence of RNA viruses. In contrast, in eukaryotes, RNA viruses account for the majority of the virome diversity although ssDNA and dsDNA viruses are common as well. Phylogenomic analysis yields tangible clues for the origins of major classes of eukaryotic viruses and in particular their likely roots in prokaryotes. Specifically, the ancestral genome of positive-strand RNA viruses of eukaryotes might have been assembled de novo from genes derived from prokaryotic retroelements and bacteria although a primordial origin of this class of viruses cannot be ruled out. Different groups of double-stranded RNA viruses derive either from dsRNA bacteriophages or from positive-strand RNA viruses. The eukaryotic ssDNA viruses apparently evolved via a fusion of genes from prokaryotic rolling circle-replicating plasmids and positive-strand RNA viruses. Different families of eukaryotic dsDNA viruses appear to have originated from specific groups of bacteriophages on at least two independent occasions. Polintons, the largest known eukaryotic transposons, predicted to also form virus particles, most likely, were the evolutionary intermediates between bacterial tectiviruses and several groups of eukaryotic dsDNA viruses including the proposed order “Megavirales” that unites diverse families of large and giant viruses. Strikingly, evolution of all classes of eukaryotic viruses appears to have involved fusion between structural and replicative gene modules derived from different
Origin and evolution of SINEs in eukaryotic genomes

PubMed Central

Kramerov, D A; Vassetzky, N S

2011-01-01

Short interspersed elements (SINEs) are one of the two most prolific mobile genomic elements in most of the higher eukaryotes. Although their biology is still not thoroughly understood, the unusual life cycle of these simple elements, amplified as genomic parasites, makes their evolution unique in many ways. In contrast to most genetic elements including other transposons, SINEs emerged de novo many times in evolution from available molecules (for example, tRNA). The involvement of reverse transcription in their amplification cycle, the huge number of genomic copies and the modular structure allow variation mechanisms in SINEs that are uncommon or rare in other genetic elements (module exchange between SINE families, dimerization, and so on). Overall, SINE evolution includes their emergence, progressive optimization and counteraction to the cell's defense against mobile genetic elements. PMID:21673742
The Chlamydomonas Genome Reveals the Evolution of Key Animal and Plant Functions

PubMed Central

Merchant, Sabeeha S.; Prochnik, Simon E.; Vallon, Olivier; Harris, Elizabeth H.; Karpowicz, Steven J.; Witman, George B.; Terry, Astrid; Salamov, Asaf; Fritz-Laylin, Lillian K.; Maréchal-Drouard, Laurence; Marshall, Wallace F.; Qu, Liang-Hu; Nelson, David R.; Sanderfoot, Anton A.; Spalding, Martin H.; Kapitonov, Vladimir V.; Ren, Qinghu; Ferris, Patrick; Lindquist, Erika; Shapiro, Harris; Lucas, Susan M.; Grimwood, Jane; Schmutz, Jeremy; Cardol, Pierre; Cerutti, Heriberto; Chanfreau, Guillaume; Chen, Chun-Long; Cognat, Valérie; Croft, Martin T.; Dent, Rachel; Dutcher, Susan; Fernández, Emilio; Ferris, Patrick; Fukuzawa, Hideya; González-Ballester, David; González-Halphen, Diego; Hallmann, Armin; Hanikenne, Marc; Hippler, Michael; Inwood, William; Jabbari, Kamel; Kalanon, Ming; Kuras, Richard; Lefebvre, Paul A.; Lemaire, Stéphane D.; Lobanov, Alexey V.; Lohr, Martin; Manuell, Andrea; Meier, Iris; Mets, Laurens; Mittag, Maria; Mittelmeier, Telsa; Moroney, James V.; Moseley, Jeffrey; Napoli, Carolyn; Nedelcu, Aurora M.; Niyogi, Krishna; Novoselov, Sergey V.; Paulsen, Ian T.; Pazour, Greg; Purton, Saul; Ral, Jean-Philippe; Riaño-Pachón, Diego Mauricio; Riekhof, Wayne; Rymarquis, Linda; Schroda, Michael; Stern, David; Umen, James; Willows, Robert; Wilson, Nedra; Zimmer, Sara Lana; Allmer, Jens; Balk, Janneke; Bisova, Katerina; Chen, Chong-Jian; Elias, Marek; Gendler, Karla; Hauser, Charles; Lamb, Mary Rose; Ledford, Heidi; Long, Joanne C.; Minagawa, Jun; Page, M. Dudley; Pan, Junmin; Pootakham, Wirulda; Roje, Sanja; Rose, Annkatrin; Stahlberg, Eric; Terauchi, Aimee M.; Yang, Pinfen; Ball, Steven; Bowler, Chris; Dieckmann, Carol L.; Gladyshev, Vadim N.; Green, Pamela; Jorgensen, Richard; Mayfield, Stephen; Mueller-Roeber, Bernd; Rajamani, Sathish; Sayre, Richard T.; Brokstein, Peter; Dubchak, Inna; Goodstein, David; Hornick, Leila; Huang, Y. Wayne; Jhaveri, Jinal; Luo, Yigong; Martínez, Diego; Ngau, Wing Chi Abby; Otillar, Bobby; Poliakov, Alexander; Porter, Aaron; Szajkowski, Lukasz; Werner, Gregory; Zhou, Kemin; Grigoriev, Igor V.; Rokhsar, Daniel S.; Grossman, Arthur R.

2010-01-01

Chlamydomonas reinhardtii is a unicellular green alga whose lineage diverged from land plants over 1 billion years ago. It is a model system for studying chloroplast-based photosynthesis, as well as the structure, assembly, and function of eukaryotic flagella (cilia), which were inherited from the common ancestor of plants and animals, but lost in land plants. We sequenced the ∼120-megabase nuclear genome of Chlamydomonas and performed comparative phylogenomic analyses, identifying genes encoding uncharacterized proteins that are likely associated with the function and biogenesis of chloroplasts or eukaryotic flagella. Analyses of the Chlamydomonas genome advance our understanding of the ancestral eukaryotic cell, reveal previously unknown genes associated with photosynthetic and flagellar functions, and establish links between ciliopathy and the composition and function of flagella. PMID:17932292
Elucidating the triplicated ancestral genome structure of radish based on chromosome-level comparison with the Brassica genomes.

PubMed

Jeong, Young-Min; Kim, Namshin; Ahn, Byung Ohg; Oh, Mijin; Chung, Won-Hyong; Chung, Hee; Jeong, Seongmun; Lim, Ki-Byung; Hwang, Yoon-Jung; Kim, Goon-Bo; Baek, Seunghoon; Choi, Sang-Bong; Hyung, Dae-Jin; Lee, Seung-Won; Sohn, Seong-Han; Kwon, Soo-Jin; Jin, Mina; Seol, Young-Joo; Chae, Won Byoung; Choi, Keun Jin; Park, Beom-Seok; Yu, Hee-Ju; Mun, Jeong-Hwan

2016-07-01

This study presents a chromosome-scale draft genome sequence of radish that is assembled into nine chromosomal pseudomolecules. A comprehensive comparative genome analysis with the Brassica genomes provides genomic evidence on the evolution of the mesohexaploid radish genome. Radish (Raphanus sativus L.) is an agronomically important root vegetable crop, and its origin and phylogenetic position in the tribe Brassiceae are controversial. Here we present a comprehensive analysis of the radish genome based on the chromosome sequences of R. sativus cv. WK10039. The radish genome was sequenced and assembled into 426.2 Mb spanning >98% of the gene space, of which 344.0 Mb were integrated into nine chromosome pseudomolecules. Approximately 36% of the genome was repetitive sequences and 46,514 protein-coding genes were predicted and annotated. Comparative mapping of the tPCK-like ancestral genome revealed that the radish genome has intermediate characteristics between the Brassica A/C and B genomes in the triplicated segments, suggesting an internal origin from the genus Brassica. The evolutionary characteristics shared between radish and other Brassica species provided genomic evidence that the current form of nine chromosomes in radish was rearranged from the chromosomes of a hexaploid progenitor. Overall, this study provides a chromosome-scale draft genome sequence of radish as well as novel insight into evolution of the mesohexaploid genomes in the tribe Brassiceae.
The COG database: an updated version includes eukaryotes

PubMed Central

Tatusov, Roman L; Fedorova, Natalie D; Jackson, John D; Jacobs, Aviva R; Kiryutin, Boris; Koonin, Eugene V; Krylov, Dmitri M; Mazumder, Raja; Mekhedov, Sergei L; Nikolskaya, Anastasia N; Rao, B Sridhar; Smirnov, Sergei; Sverdlov, Alexander V; Vasudevan, Sona; Wolf, Yuri I; Yin, Jodie J; Natale, Darren A

2003-01-01

Background: The availability of multiple, essentially complete genome sequences of prokaryotes and eukaryotes spurred both the demand and the opportunity for the construction of an evolutionary classification of genes from these genomes. Such a classification system based on orthologous relationships between genes appears to be a natural framework for comparative genomics and should facilitate both functional annotation of genomes and large-scale evolutionary studies. Results: We describe here a major update of the previously developed system for delineation of Clusters of Orthologous Groups of proteins (COGs) from the sequenced genomes of prokaryotes and unicellular eukaryotes and the construction of clusters of predicted orthologs for 7 eukaryotic genomes, which we named KOGs after eukaryotic orthologous groups. The COG collection currently consists of 138,458 proteins, which form 4873 COGs and comprise 75% of the 185,505 (predicted) proteins encoded in 66 genomes of unicellular organisms. The eukaryotic orthologous groups (KOGs) include proteins from 7 eukaryotic genomes: three animals (the nematode Caenorhabditis elegans, the fruit fly Drosophila melanogaster and Homo sapiens), one plant, Arabidopsis thaliana, two fungi (Saccharomyces cerevisiae and Schizosaccharomyces pombe), and the intracellular microsporidian parasite Encephalitozoon cuniculi. The current KOG set consists of 4852 clusters of orthologs, which include 59,838 proteins, or ~54% of the analyzed eukaryotic 110,655 gene products. Compared to the coverage of the prokaryotic genomes with COGs, a considerably smaller fraction of eukaryotic genes could be included in the KOGs; addition of new eukaryotic genomes is expected to result in a substantial increase in the coverage of eukaryotic genomes with KOGs. Examination of the phyletic patterns of KOGs reveals a conserved core represented in all analyzed species and consisting of ~20% of the KOG set. This conserved portion of the KOG set is much greater
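The coverage figures quoted in this abstract can be checked directly from the protein counts it reports. The following is a minimal sketch in Python; the function name `coverage_percent` is hypothetical and is not part of the COG/KOG resource, and the inputs are only the numbers stated in the abstract.

```python
def coverage_percent(included: int, total: int) -> float:
    """Percentage of `total` proteins that fall into orthologous clusters."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * included / total


# Counts as reported in the abstract (unicellular COG set and 7-genome KOG set).
cog_coverage = coverage_percent(138_458, 185_505)
kog_coverage = coverage_percent(59_838, 110_655)

print(f"COG coverage: {cog_coverage:.1f}%")
print(f"KOG coverage: {kog_coverage:.1f}%")
```

Rounded to whole percentages, the two values agree with the 75% and ~54% given in the abstract.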

The eukaryotic genome is structurally and functionally more like a social insect colony than a book.

PubMed

Qiu, Guo-Hua; Yang, Xiaoyan; Zheng, Xintian; Huang, Cuiqin

2017-11-01

Traditionally, the genome has been described as the 'book of life'. However, the metaphor of a book may not reflect the dynamic nature of the structure and function of the genome. In the eukaryotic genome, the number of centrally located protein-coding sequences is relatively constant across species, but the amount of noncoding DNA increases considerably with the increase of organismal evolutionary complexity. Therefore, it has been hypothesized that the abundant peripheral noncoding DNA protects the genome and the central protein-coding sequences in the eukaryotic genome. Upon comparison with the habitation, sociality and defense mechanisms of a social insect colony, it is found that the genome is similar to a social insect colony in various aspects. A social insect colony may thus be a better metaphor than a book to describe the spatial organization and physical functions of the genome. The potential implications of the metaphor are also discussed.
A linear mitochondrial genome of Cyclospora cayetanensis (Eimeriidae, Eucoccidiorida, Coccidiasina, Apicomplexa) suggests the ancestral start position within mitochondrial genomes of eimeriid coccidia.

PubMed

Ogedengbe, Mosun E; Qvarnstrom, Yvonne; da Silva, Alexandre J; Arrowood, Michael J; Barta, John R

2015-05-01

The near complete mitochondrial genome for Cyclospora cayetanensis is 6184 bp in length with three protein-coding genes (Cox1, Cox3, CytB) and numerous lsrDNA and ssrDNA fragments. Gene arrangements were conserved with other coccidia in the Eimeriidae, but the C. cayetanensis mitochondrial genome is not circular-mapping. Terminal transferase tailing and nested PCR completed the 5'-terminus of the genome starting with a 21 bp A/T-only region that forms a potential stem-loop. Regions homologous to the C. cayetanensis mitochondrial genome 5'-terminus are found in all eimeriid mitochondrial genomes available and suggest this may be the ancestral start of eimeriid mitochondrial genomes. Copyright © 2015 Australian Society for Parasitology Inc. All rights reserved.
From the Cover: Genome analysis of the smallest free-living eukaryote Ostreococcus tauri unveils many unique features

NASA Astrophysics Data System (ADS)

Derelle, Evelyne; Ferraz, Conchita; Rombauts, Stephane; Rouzé, Pierre; Worden, Alexandra Z.; Robbens, Steven; Partensky, Frédéric; Degroeve, Sven; Echeynié, Sophie; Cooke, Richard; Saeys, Yvan; Wuyts, Jan; Jabbari, Kamel; Bowler, Chris; Panaud, Olivier; Piégu, Benoît; Ball, Steven G.; Ral, Jean-Philippe; Bouget, François-Yves; Piganeau, Gwenael; de Baets, Bernard; Picard, André; Delseny, Michel; Demaille, Jacques; van de Peer, Yves; Moreau, Hervé

2006-08-01

The green lineage is reportedly 1,500 million years old, evolving shortly after the endosymbiosis event that gave rise to early photosynthetic eukaryotes. In this study, we unveil the complete genome sequence of an ancient member of this lineage, the unicellular green alga Ostreococcus tauri (Prasinophyceae). This cosmopolitan marine primary producer is the world's smallest free-living eukaryote known to date. Features likely reflecting optimization of environmentally relevant pathways, including resource acquisition, unusual photosynthesis apparatus, and genes potentially involved in C4 photosynthesis, were observed, as was downsizing of many gene families. Overall, the 12.56-Mb nuclear genome has an extremely high gene density, in part because of extensive reduction of intergenic regions and other forms of compaction such as gene fusion. However, the genome is structurally complex. It exhibits previously unobserved levels of heterogeneity for a eukaryote. Two chromosomes differ structurally from the other eighteen. Both have a significantly biased G+C content, and, remarkably, they contain the majority of transposable elements. Many chromosome 2 genes also have unique codon usage and splicing, but phylogenetic analysis and composition do not support alien gene origin. In contrast, most chromosome 19 genes show no similarity to green lineage genes and a large number of them are specialized in cell surface processes. Taken together, the complete genome sequence, unusual features, and downsized gene families, make O. tauri an ideal model system for research on eukaryotic genome evolution, including chromosome specialization and green lineage ancestry. Keywords: genome heterogeneity | genome sequence | green alga | Prasinophyceae | gene prediction
Insights into the red algae and eukaryotic evolution from the genome of Porphyra umbilicalis (Bangiophyceae, Rhodophyta)

PubMed Central

Brawley, Susan H.; Blouin, Nicolas A.; Ficko-Blean, Elizabeth; Wheeler, Glen L.; Lohr, Martin; Goodson, Holly V.; Jenkins, Jerry W.; Blaby-Haas, Crysten E.; Helliwell, Katherine E.; Chan, Cheong Xin; Marriage, Tara N.; Klein, Anita S.; Badis, Yacine; Brodie, Juliet; Cao, Yuanyu; Collén, Jonas; Dittami, Simon M.; Gachon, Claire M. M.; Green, Beverley R.; Karpowicz, Steven J.; Kim, Jay W.; Kudahl, Ulrich Johan; Lin, Senjie; Michel, Gurvan; Mittag, Maria; Olson, Bradley J. S. C.; Pangilinan, Jasmyn L.; Peng, Yi; Qiu, Huan; Shu, Shengqiang; Singer, John T.; Sprecher, Brittany N.; Wagner, Volker; Wang, Wenfei; Wang, Zhi-Yong; Yan, Juying; Yarish, Charles; Zäuner-Riek, Simone; Zhuang, Yunyun; Zou, Yong; Lindquist, Erika A.; Grimwood, Jane; Barry, Kerrie W.; Rokhsar, Daniel S.; Schmutz, Jeremy; Stiller, John W.; Grossman, Arthur R.; Prochnik, Simon E.

2017-01-01

Porphyra umbilicalis (laver) belongs to an ancient group of red algae (Bangiophyceae), is harvested for human food, and thrives in the harsh conditions of the upper intertidal zone. Here we present the 87.7-Mbp haploid Porphyra genome (65.8% G + C content, 13,125 gene loci) and elucidate traits that inform our understanding of the biology of red algae as one of the few multicellular eukaryotic lineages. Novel features of the Porphyra genome shared by other red algae relate to the cytoskeleton, calcium signaling, the cell cycle, and stress-tolerance mechanisms including photoprotection. Cytoskeletal motor proteins in Porphyra are restricted to a small set of kinesins that appear to be the only universal cytoskeletal motors within the red algae. Dynein motors are absent, and most red algae, including Porphyra, lack myosin. This surprisingly minimal cytoskeleton offers a potential explanation for why red algal cells and multicellular structures are more limited in size than in most multicellular lineages. Additional discoveries further relating to the stress tolerance of bangiophytes include ancestral enzymes for sulfation of the hydrophilic galactan-rich cell wall, evidence for mannan synthesis that originated before the divergence of green and red algae, and a high capacity for nutrient uptake. Our analyses provide a comprehensive understanding of the red algae, which are both commercially important and have played a major role in the evolution of other algal groups through secondary endosymbioses. PMID:28716924
Insights into the red algae and eukaryotic evolution from the genome of Porphyra umbilicalis (Bangiophyceae, Rhodophyta).

PubMed

Brawley, Susan H; Blouin, Nicolas A; Ficko-Blean, Elizabeth; Wheeler, Glen L; Lohr, Martin; Goodson, Holly V; Jenkins, Jerry W; Blaby-Haas, Crysten E; Helliwell, Katherine E; Chan, Cheong Xin; Marriage, Tara N; Bhattacharya, Debashish; Klein, Anita S; Badis, Yacine; Brodie, Juliet; Cao, Yuanyu; Collén, Jonas; Dittami, Simon M; Gachon, Claire M M; Green, Beverley R; Karpowicz, Steven J; Kim, Jay W; Kudahl, Ulrich Johan; Lin, Senjie; Michel, Gurvan; Mittag, Maria; Olson, Bradley J S C; Pangilinan, Jasmyn L; Peng, Yi; Qiu, Huan; Shu, Shengqiang; Singer, John T; Smith, Alison G; Sprecher, Brittany N; Wagner, Volker; Wang, Wenfei; Wang, Zhi-Yong; Yan, Juying; Yarish, Charles; Zäuner-Riek, Simone; Zhuang, Yunyun; Zou, Yong; Lindquist, Erika A; Grimwood, Jane; Barry, Kerrie W; Rokhsar, Daniel S; Schmutz, Jeremy; Stiller, John W; Grossman, Arthur R; Prochnik, Simon E

2017-08-01

Porphyra umbilicalis (laver) belongs to an ancient group of red algae (Bangiophyceae), is harvested for human food, and thrives in the harsh conditions of the upper intertidal zone. Here we present the 87.7-Mbp haploid Porphyra genome (65.8% G + C content, 13,125 gene loci) and elucidate traits that inform our understanding of the biology of red algae as one of the few multicellular eukaryotic lineages. Novel features of the Porphyra genome shared by other red algae relate to the cytoskeleton, calcium signaling, the cell cycle, and stress-tolerance mechanisms including photoprotection. Cytoskeletal motor proteins in Porphyra are restricted to a small set of kinesins that appear to be the only universal cytoskeletal motors within the red algae. Dynein motors are absent, and most red algae, including Porphyra, lack myosin. This surprisingly minimal cytoskeleton offers a potential explanation for why red algal cells and multicellular structures are more limited in size than in most multicellular lineages. Additional discoveries further relating to the stress tolerance of bangiophytes include ancestral enzymes for sulfation of the hydrophilic galactan-rich cell wall, evidence for mannan synthesis that originated before the divergence of green and red algae, and a high capacity for nutrient uptake. Our analyses provide a comprehensive understanding of the red algae, which are both commercially important and have played a major role in the evolution of other algal groups through secondary endosymbioses.
Insights into the red algae and eukaryotic evolution from the genome of Porphyra umbilicalis (Bangiophyceae, Rhodophyta)

DOE PAGES

Brawley, Susan H.; Blouin, Nicolas A.; Ficko-Blean, Elizabeth; ...

2017-07-17

Porphyra umbilicalis (laver) belongs to an ancient group of red algae (Bangiophyceae), is harvested for human food, and thrives in the harsh conditions of the upper intertidal zone. Here we present the 87.7-Mbp haploid Porphyra genome (65.8% G + C content, 13,125 gene loci) and elucidate traits that inform our understanding of the biology of red algae as one of the few multicellular eukaryotic lineages. Novel features of the Porphyra genome shared by other red algae relate to the cytoskeleton, calcium signaling, the cell cycle, and stress-tolerance mechanisms including photoprotection. Cytoskeletal motor proteins in Porphyra are restricted to a small set of kinesins that appear to be the only universal cytoskeletal motors within the red algae. Dynein motors are absent, and most red algae, including Porphyra, lack myosin. This surprisingly minimal cytoskeleton offers a potential explanation for why red algal cells and multicellular structures are more limited in size than in most multicellular lineages. Additional discoveries further relating to the stress tolerance of bangiophytes include ancestral enzymes for sulfation of the hydrophilic galactan-rich cell wall, evidence for mannan synthesis that originated before the divergence of green and red algae, and a high capacity for nutrient uptake. Our analyses provide a comprehensive understanding of the red algae, which are both commercially important and have played a major role in the evolution of other algal groups through secondary endosymbioses.
Insights into the red algae and eukaryotic evolution from the genome of Porphyra umbilicalis (Bangiophyceae, Rhodophyta)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Brawley, Susan H.; Blouin, Nicolas A.; Ficko-Blean, Elizabeth

Porphyra umbilicalis (laver) belongs to an ancient group of red algae (Bangiophyceae), is harvested for human food, and thrives in the harsh conditions of the upper intertidal zone. Here we present the 87.7-Mbp haploid Porphyra genome (65.8% G + C content, 13,125 gene loci) and elucidate traits that inform our understanding of the biology of red algae as one of the few multicellular eukaryotic lineages. Novel features of the Porphyra genome shared by other red algae relate to the cytoskeleton, calcium signaling, the cell cycle, and stress-tolerance mechanisms including photoprotection. Cytoskeletal motor proteins in Porphyra are restricted to a small set of kinesins that appear to be the only universal cytoskeletal motors within the red algae. Dynein motors are absent, and most red algae, including Porphyra, lack myosin. This surprisingly minimal cytoskeleton offers a potential explanation for why red algal cells and multicellular structures are more limited in size than in most multicellular lineages. Additional discoveries further relating to the stress tolerance of bangiophytes include ancestral enzymes for sulfation of the hydrophilic galactan-rich cell wall, evidence for mannan synthesis that originated before the divergence of green and red algae, and a high capacity for nutrient uptake. Our analyses provide a comprehensive understanding of the red algae, which are both commercially important and have played a major role in the evolution of other algal groups through secondary endosymbioses.
The Drosophila genome nexus: a population genomic resource of 623 Drosophila melanogaster genomes, including 197 from a single ancestral range population.

PubMed

Lack, Justin B; Cardeno, Charis M; Crepeau, Marc W; Taylor, William; Corbett-Detig, Russell B; Stevens, Kristian A; Langley, Charles H; Pool, John E

2015-04-01

Hundreds of wild-derived Drosophila melanogaster genomes have been published, but rigorous comparisons across data sets are precluded by differences in alignment methodology. The most common approach to reference-based genome assembly is a single round of alignment followed by quality filtering and variant detection. We evaluated variations and extensions of this approach and settled on an assembly strategy that utilizes two alignment programs and incorporates both substitutions and short indels to construct an updated reference for a second round of mapping prior to final variant detection. Utilizing this approach, we reassembled published D. melanogaster population genomic data sets and added unpublished genomes from several sub-Saharan populations. Most notably, we present aligned data from phase 3 of the Drosophila Population Genomics Project (DPGP3), which provides 197 genomes from a single ancestral range population of D. melanogaster (from Zambia). The large sample size, high genetic diversity, and potentially simpler demographic history of the DPGP3 sample will make this a highly valuable resource for fundamental population genetic research. The complete set of assemblies described here, termed the Drosophila Genome Nexus, presently comprises 623 consistently aligned genomes and is publicly available in multiple formats with supporting documentation and bioinformatic tools. This resource will greatly facilitate population genomic analysis in this model species by reducing the methodological differences between data sets. Copyright © 2015 by the Genetics Society of America.
Archaeal homologs of eukaryotic methylation guide small nucleolar RNAs: lessons from the Pyrococcus genomes.

PubMed

Gaspin, C; Cavaillé, J; Erauso, G; Bachellerie, J P

2000-04-07

Ribose methylation is a prevalent type of nucleotide modification in rRNA. Eukaryotic rRNAs display a complex pattern of ribose methylations, amounting to 55 in yeast Saccharomyces cerevisiae and about 100 in vertebrates. Ribose methylations of eukaryotic rRNAs are each guided by a cognate small RNA, belonging to the family of box C/D antisense snoRNAs, through transient formation of a specific base-pairing at the rRNA modification site. In prokaryotes, the pattern of rRNA ribose methylations has been fully characterized in a single species so far, Escherichia coli, which contains only four ribose methylated rRNA nucleotides. However, the hyperthermophile archaeon Sulfolobus solfataricus contains, like eukaryotes, a large number of (yet unmapped) rRNA ribose methylations and homologs of eukaryotic box C/D small nucleolar ribonuclear proteins have been identified in archaeal genomes. We have therefore searched archaeal genomes for potential homologs of eukaryotic methylation guide small nucleolar RNAs, by combining searches for structured motifs with homology searches. We have identified a family of 46 small RNAs, conserved in the genomes of three hyperthermophile Pyrococcus species, which we have experimentally characterized in Pyrococcus abyssi. The Pyrococcus small RNAs, the first reported homologs of methylation guide small nucleolar RNAs in organisms devoid of a nucleus, appear as a paradigm of minimalist box C/D antisense RNAs. They differ from their eukaryotic homologs by their outstanding structural homogeneity, extended consensus box motifs and the quasi-systematic presence of two (instead of one) rRNA antisense elements. Remarkably, for each small RNA the two antisense elements always match rRNA sequences close to each other in rRNA structure, suggesting an important role in rRNA folding. Only a few of the predicted P. abyssi rRNA ribose methylations have been detected so far. Further analysis of these archaeal small RNAs could provide new insights into
Ancestral chromosomal blocks are triplicated in Brassiceae species with varying chromosome number and genome size.

PubMed

Lysak, Martin A; Cheung, Kwok; Kitschke, Michaela; Bures, Petr

2007-10-01

The paleopolyploid character of genomes of the economically important genus Brassica and closely related species (tribe Brassiceae) is still fairly controversial. Here, we report on the comparative painting analysis of block F of the crucifer Ancestral Karyotype (AK; n = 8), consisting of 24 conserved genomic blocks, in 10 species traditionally treated as members of the tribe Brassiceae. Three homeologous copies of block F were identified per haploid chromosome complement in Brassiceae species with 2n = 14, 18, 20, 32, and 36. In high-polyploid (n ≥ 30) species Crambe maritima (2n = 60), Crambe cordifolia (2n = 120), and Vella pseudocytisus (2n = 68), six, 12, and six copies of the analyzed block have been revealed, respectively. Homeologous regions resembled the ancestral structure of block F within the AK or were altered by inversions and/or translocations. In two species of the subtribe Zillineae, two of the three homeologous regions were combined via a reciprocal translocation onto one chromosome. Altogether, these findings provide compelling evidence of an ancient hexaploidization event and corresponding whole-genome triplication shared by the tribe Brassiceae. No direct relationship between chromosome number and genome size variation (1.2-2.5 pg/2C) has been found in Brassiceae species with 2n = 14 to 36. Only two homeologous copies of block F suggest a whole-genome duplication but not the triplication event in Orychophragmus violaceus (2n = 24), and confirm a phylogenetic position of this species outside the tribe Brassiceae. Chromosome duplication detected in Orychophragmus as well as chromosome rearrangements shared by Zillineae species demonstrate the usefulness of comparative cytogenetics for elucidation of phylogenetic relationships.
Vertebrate codon bias indicates a highly GC-rich ancestral genome.

PubMed

Nabiyouni, Maryam; Prakash, Ashwin; Fedorov, Alexei

2013-04-25

Two factors are thought to have contributed to the origin of codon usage bias in eukaryotes: 1) genome-wide mutational forces that shape overall GC-content and create context-dependent nucleotide bias, and 2) positive selection for codons that maximize efficient and accurate translation. Particularly in vertebrates, these two explanations contradict each other and cloud the origin of codon bias in the taxon. On the one hand, mutational forces fail to explain GC-richness (~60%) of third codon positions, given the GC-poor overall genomic composition among vertebrates (~40%). On the other hand, positive selection cannot easily explain strict regularities in codon preferences. Large-scale bioinformatic assessment, of nucleotide composition of coding and non-coding sequences in vertebrates and other taxa, suggests a simple possible resolution for this contradiction. Specifically, we propose that the last common vertebrate ancestor had a GC-rich genome (~65% GC). The data suggest that whole-genome mutational bias is the major driving force for generating codon bias. As the bias becomes prominent, it begins to affect translation and can result in positive selection for optimal codons. The positive selection can, in turn, significantly modulate codon preferences. Copyright © 2013 Elsevier B.V. All rights reserved.
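The contrast this abstract draws between overall GC content and GC content at third codon positions (often called GC3) can be illustrated on a single coding sequence. The sketch below is a minimal Python illustration, not the authors' pipeline; the function names `gc_fraction` and `gc3_fraction` and the toy sequence are assumptions introduced here, and the sequence is assumed to start in frame at the first base.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of unambiguous bases (A, C, G, T) that are G or C."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    return sum(b in "GC" for b in bases) / len(bases)


def gc3_fraction(cds: str) -> float:
    """GC fraction at third codon positions of an in-frame coding sequence."""
    usable = len(cds) - len(cds) % 3   # ignore a trailing partial codon
    third_positions = cds[2:usable:3]  # 0-based indices 2, 5, 8, ...
    return gc_fraction(third_positions)


# Toy in-frame sequence (hypothetical): codons ATG GCC AAG TTC.
toy_cds = "ATGGCCAAGTTC"
overall_gc = gc_fraction(toy_cds)
third_gc = gc3_fraction(toy_cds)
print(f"overall GC = {overall_gc:.2f}, GC3 = {third_gc:.2f}")
```

In this toy sequence every third position is G or C while only half of all bases are, which is the kind of gap between GC3 and genome-wide composition the abstract describes; a real assessment would aggregate over many annotated coding sequences.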
Remarkable interkingdom conservation of intron positions and massive, lineage-specific intron loss and gain in eukaryotic evolution.

PubMed

Rogozin, Igor B; Wolf, Yuri I; Sorokin, Alexander V; Mirkin, Boris G; Koonin, Eugene V

2003-09-02

Sequencing of eukaryotic genomes allows one to address major evolutionary problems, such as the evolution of gene structure. We compared the intron positions in 684 orthologous gene sets from 8 complete genomes of animals, plants, fungi, and protists and constructed parsimonious scenarios of evolution of the exon-intron structure for the respective genes. Approximately one-third of the introns in the malaria parasite Plasmodium falciparum are shared with at least one crown group eukaryote; this number indicates that these introns have been conserved through >1.5 billion years of evolution that separate Plasmodium from the crown group. Paradoxically, humans share many more introns with the plant Arabidopsis thaliana than with the fly or nematode. The inferred evolutionary scenario holds that the common ancestor of Plasmodium and the crown group and, especially, the common ancestor of animals, plants, and fungi had numerous introns. Most of these ancestral introns, which are retained in the genomes of vertebrates and plants, have been lost in fungi, nematodes, arthropods, and probably Plasmodium. In addition, numerous introns have been inserted into vertebrate and plant genes, whereas, in other lineages, intron gain was much less prominent.
Evolution and Classification of Myosins, a Paneukaryotic Whole-Genome Approach

PubMed Central

Sebé-Pedrós, Arnau; Grau-Bové, Xavier; Richards, Thomas A.; Ruiz-Trillo, Iñaki

2014-01-01

Myosins are key components of the eukaryotic cytoskeleton, providing motility for a broad diversity of cargoes. Therefore, understanding the origin and evolutionary history of myosin classes is crucial to address the evolution of eukaryote cell biology. Here, we revise the classification of myosins using an updated taxon sampling that includes newly or recently sequenced genomes and transcriptomes from key taxa. We performed a survey of eukaryotic genomes and phylogenetic analyses of the myosin gene family, reconstructing the myosin toolkit at different key nodes in the eukaryotic tree of life. We also identified the phylogenetic distribution of myosin diversity in terms of number of genes, associated protein domains and number of classes in each taxon. Our analyses show that new classes (i.e., paralogs) and domain architectures were continuously generated throughout eukaryote evolution, with a significant expansion of myosin abundance and domain architectural diversity at the stem of Holozoa, predating the origin of animal multicellularity. Indeed, single-celled holozoans have the most complex myosin complement among eukaryotes, with paralogs of most myosins previously considered animal specific. We recover a dynamic evolutionary history, with several lineage-specific expansions (e.g., the myosin III-like gene family diversification in choanoflagellates), convergence in protein domain architectures (e.g., fungal and animal chitin synthase myosins), and important secondary losses. Overall, our evolutionary scheme demonstrates that the ancestral eukaryote likely had a complex myosin repertoire that included six genes with different protein domain architectures. Finally, we provide an integrative and robust classification, useful for future genomic and functional studies on this crucial eukaryotic gene family. PMID:24443438
Mitochondrial introgression suggests extensive ancestral hybridization events among Saccharomyces species.

PubMed

Peris, David; Arias, Armando; Orlić, Sandi; Belloch, Carmela; Pérez-Través, Laura; Querol, Amparo; Barrio, Eladio

2017-03-01

Horizontal gene transfer (HGT) in eukaryotic plastids and mitochondrial genomes is common, and plays an important role in organism evolution. In yeasts, recent mitochondrial HGT has been suggested between S. cerevisiae and S. paradoxus. However, few strains have been explored given the lack of accurate mitochondrial genome annotations. Mitochondrial genome sequences are important to understand how frequent these introgressions occur, and their role in cytonuclear incompatibilities and fitness. Indeed, most of the Bateson-Dobzhansky-Muller genetic incompatibilities described in yeasts are driven by cytonuclear incompatibilities. We herein explored the mitochondrial inheritance of several worldwide distributed wild Saccharomyces species and their hybrids isolated from different sources and geographic origins. We demonstrated the existence of several recombination points in mitochondrial region COX2-ORF1, likely mediated by either the activity of the protein encoded by the ORF1 (F-SceIII) gene, a free-standing homing endonuclease, or mostly facilitated by A+T tandem repeats and regions of integration of GC clusters. These introgressions were shown to occur among strains of the same species and among strains of different species, which suggests a complex model of Saccharomyces evolution that involves several ancestral hybridization events in wild environments. Copyright © 2017 Elsevier Inc. All rights reserved.
The relative ages of eukaryotes and akaryotes.

PubMed

Penny, David; Collins, Lesley J; Daly, Toni K; Cox, Simon J

2014-12-01

The Last Eukaryote Common Ancestor (LECA) appears to have the genetics required for meiosis, mitosis, nucleus and nuclear substructures, an exon/intron gene structure, spliceosomes, many centres of DNA replication, etc. (and including mitochondria). Most of these features are not generally explained by models for the origin of the Eukaryotic cell based on the fusion of an Archaeon and a Bacterium. We find that the term 'prokaryote' is ambiguous and the non-phylogenetic term akaryote should be used in its place because we do not yet know the direction of evolution between eukaryotes and akaryotes. We use the term 'protoeukaryote' for the hypothetical stem group ancestral eukaryote that took up a bacterium as an endosymbiont that formed the mitochondrion. It is easier to make detailed models with a eukaryote to an akaryote transition, rather than vice versa. So we really are at a phylogenetic impasse in not being confident about the direction of change between eukaryotes and akaryotes.
Energetics and genetics across the prokaryote-eukaryote divide

PubMed Central

2011-01-01

Background All complex life on Earth is eukaryotic. All eukaryotic cells share a common ancestor that arose just once in four billion years of evolution. Prokaryotes show no tendency to evolve greater morphological complexity, despite their metabolic virtuosity. Here I argue that the eukaryotic cell originated in a unique prokaryotic endosymbiosis, a singular event that transformed the selection pressures acting on both host and endosymbiont. Results The reductive evolution and specialisation of endosymbionts to mitochondria resulted in an extreme genomic asymmetry, in which the residual mitochondrial genomes enabled the expansion of bioenergetic membranes over several orders of magnitude, overcoming the energetic constraints on prokaryotic genome size, and permitting the host cell genome to expand (in principle) over 200,000-fold. This energetic transformation was permissive, not prescriptive; I suggest that the actual increase in early eukaryotic genome size was driven by a heavy early bombardment of genes and introns from the endosymbiont to the host cell, producing a high mutation rate. Unlike prokaryotes, with lower mutation rates and heavy selection pressure to lose genes, early eukaryotes without genome-size limitations could mask mutations by cell fusion and genome duplication, as in allopolyploidy, giving rise to a proto-sexual cell cycle. The side effect was that a large number of shared eukaryotic basal traits accumulated in the same population, a sexual eukaryotic common ancestor, radically different to any known prokaryote. Conclusions The combination of massive bioenergetic expansion, release from genome-size constraints, and high mutation rate favoured a protosexual cell cycle and the accumulation of eukaryotic traits. These factors explain the unique origin of eukaryotes, the absence of true evolutionary intermediates, and the evolution of sex in eukaryotes but not prokaryotes. Reviewers This article was reviewed by: Eugene Koonin, William Martin
Ancestral Chromosomal Blocks Are Triplicated in Brassiceae Species with Varying Chromosome Number and Genome Size

PubMed Central

Lysak, Martin A.; Cheung, Kwok; Kitschke, Michaela; Bureš, Petr

2007-01-01

The paleopolyploid character of genomes of the economically important genus Brassica and closely related species (tribe Brassiceae) is still fairly controversial. Here, we report on the comparative painting analysis of block F of the crucifer Ancestral Karyotype (AK; n = 8), consisting of 24 conserved genomic blocks, in 10 species traditionally treated as members of the tribe Brassiceae. Three homeologous copies of block F were identified per haploid chromosome complement in Brassiceae species with 2n = 14, 18, 20, 32, and 36. In high-polyploid (n ≥ 30) species Crambe maritima (2n = 60), Crambe cordifolia (2n = 120), and Vella pseudocytisus (2n = 68), six, 12, and six copies of the analyzed block have been revealed, respectively. Homeologous regions resembled the ancestral structure of block F within the AK or were altered by inversions and/or translocations. In two species of the subtribe Zillineae, two of the three homeologous regions were combined via a reciprocal translocation onto one chromosome. Altogether, these findings provide compelling evidence of an ancient hexaploidization event and corresponding whole-genome triplication shared by the tribe Brassiceae. No direct relationship between chromosome number and genome size variation (1.2–2.5 pg/2C) has been found in Brassiceae species with 2n = 14 to 36. Only two homeologous copies of block F suggest a whole-genome duplication but not the triplication event in Orychophragmus violaceus (2n = 24), and confirm a phylogenetic position of this species outside the tribe Brassiceae. Chromosome duplication detected in Orychophragmus as well as chromosome rearrangements shared by Zillineae species demonstrate the usefulness of comparative cytogenetics for elucidation of phylogenetic relationships. PMID:17720758
Visualization of genome signatures of eukaryote genomes by batch-learning self-organizing map with a special emphasis on Drosophila genomes.

PubMed

Abe, Takashi; Hamano, Yuta; Ikemura, Toshimichi

2014-01-01

A strategy of evolutionary studies that can compare vast numbers of genome sequences is becoming increasingly important with the remarkable progress of high-throughput DNA sequencing methods. We previously established a sequence alignment-free clustering method "BLSOM" for di-, tri-, and tetranucleotide compositions in genome sequences, which can characterize sequence characteristics (genome signatures) of a wide range of species. In the present study, we generated BLSOMs for tetra- and pentanucleotide compositions in approximately one million sequence fragments derived from 101 eukaryotes, for which almost complete genome sequences were available. BLSOM recognized phylotype-specific characteristics (e.g., key combinations of oligonucleotide frequencies) in the genome sequences, permitting phylotype-specific clustering of the sequences without any information regarding the species. In our detailed examination of 12 Drosophila species, the correlation between their phylogenetic classification and the classification on the BLSOMs was observed to visualize oligonucleotides diagnostic for species-specific clustering.
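The oligonucleotide compositions that this abstract feeds into BLSOM can be illustrated with a minimal Python sketch. It covers only the input side: turning one sequence fragment into a normalized tetranucleotide frequency vector. The function name `kmer_frequencies` and the choice to skip k-mers containing ambiguous bases are assumptions made for illustration; the self-organizing map itself is not implemented here, and nothing below is taken from the authors' software.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k=4):
    """Return normalized k-mer frequencies for one DNA fragment.

    Hypothetical helper: k-mers containing anything other than A, C, G, T
    are ignored, which is an assumption, not a documented BLSOM rule.
    """
    seq = seq.upper()
    # All 4**k possible k-mers, so every fragment yields a vector of equal length.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return {m: 0.0 for m in kmers}
    return {m: counts[m] / total for m in kmers}


if __name__ == "__main__":
    vector = kmer_frequencies("ACGTACGTAC")
    print(len(vector), vector["ACGT"])
```

Each fragment then becomes a fixed-length vector (256 values for k = 4), which is the kind of input an alignment-free clustering method can compare across species.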
Real-time imaging of specific genomic loci in eukaryotic cells using the ANCHOR DNA labelling system.

PubMed

Germier, Thomas; Audibert, Sylvain; Kocanova, Silvia; Lane, David; Bystricky, Kerstin

2018-06-01

Spatio-temporal organization of the cell nucleus adapts to and regulates genomic processes. Microscopy approaches that enable direct monitoring of specific chromatin sites in single cells and in real time are needed to better understand the dynamics involved. In this chapter, we describe the principle and development of ANCHOR, a novel tool for DNA labelling in eukaryotic cells. Protocols for use of ANCHOR to visualize a single genomic locus in eukaryotic cells are presented. We describe an approach for live cell imaging of a DNA locus during the entire cell cycle in human breast cancer cells. Copyright © 2018 Elsevier Inc. All rights reserved.
Horizontal transfer of a ß-1,6-glucanase gene from an ancestral species of fungal endophyte to a cool-season grass host.

PubMed

Shinozuka, Hiroshi; Hettiarachchige, Inoka K; Shinozuka, Maiko; Cogan, Noel O I; Spangenberg, German C; Cocks, Benjamin G; Forster, John W; Sawbridge, Timothy I

2017-08-22

Molecular characterisation has convincingly demonstrated some types of horizontal gene transfer in eukaryotes, but nuclear gene transfer between distantly related eukaryotic groups appears to have been rare. For angiosperms (flowering plants), nuclear gene transfer events identified to date have been confined to genes originating from prokaryotes or other plant species. In this report, evidence for ancient horizontal transfer of a fungal nuclear gene, encoding a ß-1,6-glucanase enzyme for fungal cell wall degradation, into an angiosperm lineage is presented for the first time. The gene was identified from de novo sequencing and assembly of the genome and transcriptome of perennial ryegrass, a cool-season grass species. Molecular analysis confirmed the presence of the complete gene in the genome of perennial ryegrass. No corresponding sequence was found in other plant species, apart from members of the Poeae sub-tribes Loliinae and Dactylidinae. Evidence suggests that a common ancestor of the two sub-tribes acquired the gene from a species ancestral to contemporary grass-associated fungal endophytes around 9-13 million years ago. This first report of horizontal transfer of a nuclear gene from a taxonomically distant eukaryote to modern flowering plants provides evidence for a novel adaptation mechanism in angiosperms.

A genomic survey of the fish parasite Spironucleus salmonicida indicates genomic plasticity among diplomonads and significant lateral gene transfer in eukaryote genome evolution

PubMed Central

Andersson, Jan O; Sjögren, Åsa M; Horner, David S; Murphy, Colleen A; Dyal, Patricia L; Svärd, Staffan G; Logsdon, John M; Ragan, Mark A; Hirt, Robert P; Roger, Andrew J

2007-01-01

Background Comparative genomic studies of the mitochondrion-lacking protist group Diplomonadida (diplomonads) have been lacking, although Giardia lamblia has been intensively studied. We have performed a sequence survey project resulting in 2341 expressed sequence tags (EST) corresponding to 853 unique clones, 5275 genome survey sequences (GSS), and eleven finished contigs from the diplomonad fish parasite Spironucleus salmonicida (previously described as S. barkhanus). Results The analyses revealed a compact genome with few, if any, introns and very short 3' untranslated regions. Strikingly different patterns of codon usage were observed in genes corresponding to frequently sampled ESTs versus genes poorly sampled, indicating that translational selection is influencing the codon usage of highly expressed genes. Rigorous phylogenomic analyses identified 84 genes – mostly encoding metabolic proteins – that have been acquired by diplomonads or their relatively close ancestors via lateral gene transfer (LGT). Although most acquisitions were from prokaryotes, more than a dozen represent likely transfers of genes between eukaryotic lineages. Many genes that provide novel insights into the genetic basis of the biology and pathogenicity of this parasitic protist were identified including 149 that putatively encode variant-surface cysteine-rich proteins which are candidate virulence factors. A number of genomic properties that distinguish S. salmonicida from its human parasitic relative G. lamblia were identified such as nineteen putative lineage-specific gene acquisitions, distinct mutational biases and codon usage and distinct polyadenylation signals. Conclusion Our results highlight the power of comparative genomic studies to yield insights into the biology of parasitic protists and the evolution of their genomes, and suggest that genetic exchange between distantly-related protist lineages may be occurring at an appreciable rate in eukaryote genome evolution. PMID
The Persistent Contributions of RNA to Eukaryotic Gen(om)e Architecture and Cellular Function

PubMed Central

Brosius, Jürgen

2014-01-01

Currently, the best scenario for earliest forms of life is based on RNA molecules as they have the proven ability to catalyze enzymatic reactions and harbor genetic information. Evolutionary principles valid today become apparent in such models already. Furthermore, many features of eukaryotic genome architecture might have their origins in an RNA or RNA/protein (RNP) world, including the onset of a further transition, when DNA replaced RNA as the genetic bookkeeper of the cell. Chromosome maintenance, splicing, and regulatory function via RNA may be deeply rooted in the RNA/RNP worlds. Mostly in eukaryotes, conversion from RNA to DNA is still ongoing, which greatly impacts the plasticity of extant genomes. Raw material for novel genes encoding protein or RNA, or parts of genes including regulatory elements that selection can act on, continues to enter the evolutionary lottery. PMID:25081515
Polyploidy: adaptation to the genomic environment.

PubMed

Hollister, Jesse D

2015-02-01

Genomic evidence of ancestral whole genome duplication (WGD) and polyploidy is widespread among eukaryotic species, and especially among plants. WGD is thought to provide the raw material for adaptation in the form of duplicated genes, and polyploids are thought to benefit from both physiological and genetic buffering. Comparatively little attention has focused on the genomic challenge of polyploidy, however, although much evidence exists that polyploidy severely perturbs important cellular functions. Here, I review recent progress in the study of the re-establishment of stable meiosis in recently evolved polyploids, focusing on four plant species. This work has yielded an insight into the mechanisms underlying stabilization of genome transmission in polyploids, and is revealing remarkable parallels among diverse taxa. Importantly, these studies also provide a road map for investigating how polyploids respond to the challenge of WGD.
EuGI: a novel resource for studying genomic islands to facilitate horizontal gene transfer detection in eukaryotes.

PubMed

Clasen, Frederick Johannes; Pierneef, Rian Ewald; Slippers, Bernard; Reva, Oleg

2018-05-03

Genomic islands (GIs) are inserts of foreign DNA that have potentially arisen through horizontal gene transfer (HGT). There is evidence that GIs can contribute significantly to the evolution of prokaryotes. The acquisition of GIs through HGT in eukaryotes has, however, been largely unexplored. In this study, the previously developed GI prediction tool, SeqWord Gene Island Sniffer (SWGIS), is modified to predict GIs in eukaryotic chromosomes. Artificial simulations are used to estimate the rates of false positive and false negative GI predictions by inserting GIs into different test chromosomes and running the SWGIS v2.0 algorithm. Using SWGIS v2.0, GIs are then identified in 36 fungal, 22 protozoan and 8 invertebrate genomes. SWGIS v2.0 predicts GIs in large eukaryotic chromosomes based on the atypical nucleotide composition of these regions. Average false negative and false positive rates were 20.1% and 11.01%, respectively. A total of 10,550 GIs were identified in 66 eukaryotic species, with 5299 of these GIs coding for at least one functional protein. The EuGI web-resource, freely accessible at http://eugi.bi.up.ac.za, was developed to allow browsing of the database of identified GIs and the genes within them through an interactive and visual interface. SWGIS v2.0, along with the EuGI database, which houses GIs identified in 66 different eukaryotic species, and the EuGI web-resource, provides the first comprehensive resource for studying HGT in eukaryotes.
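The idea of flagging regions by atypical nucleotide composition, and of checking a predictor by inserting an artificial island, can be sketched as follows. This is a deliberately simplified stand-in and not the SWGIS v2.0 algorithm: it compares each sliding window's dinucleotide profile with the whole-sequence profile using a Manhattan distance. The function names, window size, step, k-mer length, and threshold are all hypothetical values chosen for the toy example.

```python
from collections import Counter


def kmer_profile(seq, k):
    """Relative k-mer frequencies of a sequence (empty dict if too short)."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {m: c / total for m, c in counts.items()} if total else {}


def atypical_windows(seq, window=1000, step=500, k=2, threshold=0.5):
    """Return (start, end, distance) for windows whose composition deviates.

    Simplified illustration only: the distance measure and threshold are
    assumptions, not parameters of any published tool.
    """
    seq = seq.upper()
    background = kmer_profile(seq, k)
    hits = []
    for start in range(0, len(seq) - window + 1, step):
        local = kmer_profile(seq[start:start + window], k)
        keys = set(background) | set(local)
        distance = sum(abs(background.get(m, 0.0) - local.get(m, 0.0)) for m in keys)
        if distance > threshold:
            hits.append((start, start + window, distance))
    return hits


if __name__ == "__main__":
    # Toy "chromosome": AT-rich host with a GC-rich block inserted at 2000..3000.
    chromosome = "AT" * 1000 + "GC" * 500 + "AT" * 1000
    for start, end, distance in atypical_windows(chromosome):
        print(start, end, round(distance, 3))
```

In a simulation of the kind the abstract describes, windows flagged outside the inserted block would count as false positives and inserted blocks with no flagged window as false negatives.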
Phylogenetic analysis of the core histone doublet and DNA topo II genes of Marseilleviridae: evidence of proto-eukaryotic provenance.

PubMed

Erives, Albert J

2017-11-28

While the genomes of eukaryotes and Archaea both encode the histone-fold domain, only eukaryotes encode the core histone paralogs H2A, H2B, H3, and H4. With DNA, these core histones assemble into the nucleosomal octamer underlying eukaryotic chromatin. Importantly, core histones for H2A and H3 are maintained as neofunctionalized paralogs adapted for general bulk chromatin (canonical H2 and H3) or specialized chromatin (H2A.Z enriched at gene promoters and cenH3s enriched at centromeres). In this context, the identification of core histone-like "doublets" in the cytoplasmic replication factories of the Marseilleviridae (MV) is a novel finding with possible relevance to understanding the origin of eukaryotic chromatin. Here, we analyze and compare the core histone doublet genes from all known MV genomes as well as other MV genes relevant to the origin of the eukaryotic replisome. Using different phylogenetic approaches, we show that MV histone domains encode obligate H2B-H2A and H4-H3 dimers of possible proto-eukaryotic origin. MV core histone moieties form sister clades to each of the four eukaryotic clades of canonical and variant core histones. This suggests that MV core histone moieties diverged prior to eukaryotic neofunctionalizations associated with paired linear chromosomes and variant histone octamer assembly. We also show that MV genomes encode a proto-eukaryotic DNA topoisomerase II enzyme that forms a sister clade to eukaryotes. This is a relevant finding given that DNA topo II influences histone deposition and chromatin compaction and is the second most abundant nuclear protein after histones. The combined domain architecture and phylogenomic analyses presented here suggest that a primitive origin for MV histone genes is a more parsimonious explanation than horizontal gene transfers + gene fusions + sufficient divergence to eliminate relatedness to eukaryotic neofunctionalizations within the H2A and H3 clades without loss of relatedness to each of
Genomes as documents of evolutionary history: a probabilistic macrosynteny model for the reconstruction of ancestral genomes

PubMed Central

Nakatani, Yoichiro; McLysaght, Aoife

2017-01-01

Abstract Motivation: It has been argued that whole-genome duplication (WGD) exerted a profound influence on the course of evolution. For the purpose of fully understanding the impact of WGD, several formal algorithms have been developed for reconstructing pre-WGD gene order in yeast and plant. However, to the best of our knowledge, those algorithms have never been successfully applied to WGD events in teleost and vertebrate, impeded by extensive gene shuffling and gene losses. Results: Here, we present a probabilistic model of macrosynteny (i.e. conserved linkage or chromosome-scale distribution of orthologs), develop a variational Bayes algorithm for inferring the structure of pre-WGD genomes, and study estimation accuracy by simulation. Then, by applying the method to the teleost WGD, we demonstrate effectiveness of the algorithm in a situation where gene-order reconstruction algorithms perform relatively poorly due to a high rate of rearrangement and extensive gene losses. Our high-resolution reconstruction reveals previously overlooked small-scale rearrangements, necessitating a revision to previous views on genome structure evolution in teleost and vertebrate. Conclusions: We have reconstructed the structure of a pre-WGD genome by employing a variational Bayes approach that was originally developed for inferring topics from millions of text documents. Interestingly, comparison of the macrosynteny and topic model algorithms suggests that macrosynteny can be regarded as documents on ancestral genome structure. From this perspective, the present study would seem to provide a textbook example of the prevalent metaphor that genomes are documents of evolutionary history. Availability and implementation: The analysis data are available for download at http://www.gen.tcd.ie/molevol/supp_data/MacrosyntenyTGD.zip, and the software written in Java is available upon request. Contact: yoichiro.nakatani@tcd.ie or aoife.mclysaght@tcd.ie Supplementary information
Genomes as documents of evolutionary history: a probabilistic macrosynteny model for the reconstruction of ancestral genomes.

PubMed

Nakatani, Yoichiro; McLysaght, Aoife

2017-07-15

It has been argued that whole-genome duplication (WGD) exerted a profound influence on the course of evolution. For the purpose of fully understanding the impact of WGD, several formal algorithms have been developed for reconstructing pre-WGD gene order in yeast and plant. However, to the best of our knowledge, those algorithms have never been successfully applied to WGD events in teleost and vertebrate, impeded by extensive gene shuffling and gene losses. Here, we present a probabilistic model of macrosynteny (i.e. conserved linkage or chromosome-scale distribution of orthologs), develop a variational Bayes algorithm for inferring the structure of pre-WGD genomes, and study estimation accuracy by simulation. Then, by applying the method to the teleost WGD, we demonstrate effectiveness of the algorithm in a situation where gene-order reconstruction algorithms perform relatively poorly due to a high rate of rearrangement and extensive gene losses. Our high-resolution reconstruction reveals previously overlooked small-scale rearrangements, necessitating a revision to previous views on genome structure evolution in teleost and vertebrate. We have reconstructed the structure of a pre-WGD genome by employing a variational Bayes approach that was originally developed for inferring topics from millions of text documents. Interestingly, comparison of the macrosynteny and topic model algorithms suggests that macrosynteny can be regarded as documents on ancestral genome structure. From this perspective, the present study would seem to provide a textbook example of the prevalent metaphor that genomes are documents of evolutionary history. The analysis data are available for download at http://www.gen.tcd.ie/molevol/supp_data/MacrosyntenyTGD.zip , and the software written in Java is available upon request. yoichiro.nakatani@tcd.ie or aoife.mclysaght@tcd.ie. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All
Reconstructed ancestral enzymes suggest long-term cooling of Earth's photic zone since the Archean

NASA Astrophysics Data System (ADS)

Garcia, Amanda K.; Schopf, J. William; Yokobori, Shin-ichi; Akanuma, Satoshi; Yamagishi, Akihiko

2017-05-01

Paleotemperatures inferred from the isotopic compositions (δ18O and δ30Si) of marine cherts suggest that Earth’s oceans cooled from 70 ± 15 °C in the Archean to the present ~15 °C. This interpretation, however, has been subject to question due to uncertainties regarding oceanic isotopic compositions, diagenetic or metamorphic resetting of the isotopic record, and depositional environments. Analyses of the thermostability of reconstructed ancestral enzymes provide an independent method by which to assess the temperature history inferred from the isotopic evidence. Although previous studies have demonstrated extreme thermostability in reconstructed archaeal and bacterial proteins compatible with a hot early Earth, taxa investigated may have inhabited local thermal environments that differed significantly from average surface conditions. We here present thermostability measurements of reconstructed ancestral enzymatically active nucleoside diphosphate kinases (NDKs) derived from light-requiring prokaryotic and eukaryotic phototrophs having widely separated fossil-based divergence ages. The ancestral environmental temperatures thereby determined for these photic-zone organisms--shown in modern taxa to correlate strongly with NDK thermostability--are inferred to reflect ancient surface-environment paleotemperatures. Our results suggest that Earth's surface temperature decreased over geological time from ~65-80 °C in the Archean, a finding consistent both with previous isotope-based and protein reconstruction-based interpretations. Interdisciplinary studies such as those reported here integrating genomic, geologic, and paleontologic data hold promise for providing new insight into the coevolution of life and environment over Earth history.
Reconstructed ancestral enzymes suggest long-term cooling of Earth's photic zone since the Archean.

PubMed

Garcia, Amanda K; Schopf, J William; Yokobori, Shin-Ichi; Akanuma, Satoshi; Yamagishi, Akihiko

2017-05-02

Paleotemperatures inferred from the isotopic compositions (δ18O and δ30Si) of marine cherts suggest that Earth's oceans cooled from 70 ± 15 °C in the Archean to the present ∼15 °C. This interpretation, however, has been subject to question due to uncertainties regarding oceanic isotopic compositions, diagenetic or metamorphic resetting of the isotopic record, and depositional environments. Analyses of the thermostability of reconstructed ancestral enzymes provide an independent method by which to assess the temperature history inferred from the isotopic evidence. Although previous studies have demonstrated extreme thermostability in reconstructed archaeal and bacterial proteins compatible with a hot early Earth, taxa investigated may have inhabited local thermal environments that differed significantly from average surface conditions. We here present thermostability measurements of reconstructed ancestral enzymatically active nucleoside diphosphate kinases (NDKs) derived from light-requiring prokaryotic and eukaryotic phototrophs having widely separated fossil-based divergence ages. The ancestral environmental temperatures thereby determined for these photic-zone organisms--shown in modern taxa to correlate strongly with NDK thermostability--are inferred to reflect ancient surface-environment paleotemperatures. Our results suggest that Earth's surface temperature decreased over geological time from ∼65-80 °C in the Archean, a finding consistent both with previous isotope-based and protein reconstruction-based interpretations. Interdisciplinary studies such as those reported here integrating genomic, geologic, and paleontologic data hold promise for providing new insight into the coevolution of life and environment over Earth history.
The Genome of the Obligate Intracellular Parasite Trachipleistophora hominis: New Insights into Microsporidian Genome Dynamics and Reductive Evolution

PubMed Central

Heinz, Eva; Williams, Tom A.; Nakjang, Sirintra; Noël, Christophe J.; Swan, Daniel C.; Goldberg, Alina V.; Harris, Simon R.; Weinmaier, Thomas; Markert, Stephanie; Becher, Dörte; Bernhardt, Jörg; Dagan, Tal; Hacker, Christian; Lucocq, John M.; Schweder, Thomas; Rattei, Thomas; Hall, Neil; Hirt, Robert P.; Embley, T. Martin

2012-01-01

The dynamics of reductive genome evolution for eukaryotes living inside other eukaryotic cells are poorly understood compared to well-studied model systems involving obligate intracellular bacteria. Here we present 8.5 Mb of sequence from the genome of the microsporidian Trachipleistophora hominis, isolated from an HIV/AIDS patient, which is an outgroup to the smaller compacted-genome species that primarily inform ideas of evolutionary mode for these enormously successful obligate intracellular parasites. Our data provide detailed information on the gene content, genome architecture and intergenic regions of a larger microsporidian genome, while comparative analyses allowed us to infer genomic features and metabolism of the common ancestor of the species investigated. Gene length reduction and massive loss of metabolic capacity in the common ancestor was accompanied by the evolution of novel microsporidian-specific protein families, whose conservation among microsporidians, against a background of reductive evolution, suggests they may have important functions in their parasitic lifestyle. The ancestor had already lost many metabolic pathways but retained glycolysis and the pentose phosphate pathway to provide cytosolic ATP and reduced coenzymes, and it had a minimal mitochondrion (mitosome) making Fe-S clusters but not ATP. It possessed bacterial-like nucleotide transport proteins as a key innovation for stealing host-generated ATP, the machinery for RNAi, key elements of the early secretory pathway, canonical eukaryotic as well as microsporidian-specific regulatory elements, a diversity of repetitive and transposable elements, and relatively low average gene density. Microsporidian genome evolution thus appears to have proceeded in at least two major steps: an ancestral remodelling of the proteome upon transition to intracellular parasitism that involved reduction but also selective expansion, followed by a secondary compaction of genome architecture in some, but
The genome of the polar eukaryotic microalga Coccomyxa subellipsoidea reveals traits of cold adaptation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Blanc, Guillaume; Agarkova, Irina; Grimwood, Jane

2012-02-13

Background Little is known about the mechanisms of adaptation of life to the extreme environmental conditions encountered in polar regions. Here we present the genome sequence of a unicellular green alga from the division Chlorophyta, Coccomyxa subellipsoidea C-169, which we will hereafter refer to as C-169. This is the first eukaryotic microorganism from a polar environment to have its genome sequenced. Results The 48.8 Mb genome contained in 20 chromosomes exhibits significant synteny conservation with the chromosomes of its relatives Chlorella variabilis and Chlamydomonas reinhardtii. The order of the genes is highly reshuffled within synteny blocks, suggesting that intra-chromosomal rearrangements were more prevalent than inter-chromosomal rearrangements. Remarkably, Zepp retrotransposons occur in clusters of nested elements with strictly one cluster per chromosome probably residing at the centromere. Several protein families overrepresented in C. subellipsoidea include proteins involved in lipid metabolism, transporters, cellulose synthases and short alcohol dehydrogenases. Conversely, C-169 lacks proteins that exist in all other sequenced chlorophytes, including components of the glycosyl phosphatidyl inositol anchoring system, pyruvate phosphate dikinase and the photosystem 1 reaction center subunit N (PsaN). Conclusions We suggest that some of these gene losses and gains could have contributed to adaptation to low temperatures. Comparison of these genomic features with the adaptive strategies of psychrophilic microbes suggests that prokaryotes and eukaryotes followed comparable evolutionary routes to adapt to cold environments.
Major Chromosomal Rearrangements Distinguish Willow and Poplar After the Ancestral “Salicoid” Genome Duplication

PubMed Central

Hou, Jing; Ye, Ning; Dong, Zhongyuan; Lu, Mengzhu; Li, Laigeng; Yin, Tongming

2016-01-01

Populus (poplar) and Salix (willow) are sister genera in the Salicaceae family. In both lineages extant species are predominantly diploid. Genome analysis previously revealed that the two lineages originated from a common tetraploid ancestor. In this study, we conducted a syntenic comparison of the corresponding 19 chromosome members of the poplar and willow genomes. Our observations revealed that almost every chromosomal segment had a parallel paralogous segment elsewhere in the genomes, and the two lineages shared a similar syntenic pinwheel pattern for most of the chromosomes, which indicated that the two lineages diverged after the genome reorganization in the common progenitor. The pinwheel patterns showed distinct differences for two chromosome pairs in each lineage. Further analysis detected two major interchromosomal rearrangements that distinguished the karyotypes of willow and poplar. Chromosome I of willow was a conjunction of poplar chromosome XVI and the lower portion of poplar chromosome I, whereas willow chromosome XVI corresponded to the upper portion of poplar chromosome I. Scientists have suggested that Populus is evolutionarily more primitive than Salix. Therefore, we propose that, after the “salicoid” duplication event, fission and fusion of the ancestral chromosomes first gave rise to the diploid progenitor of extant Populus species. During the evolutionary process, fission and fusion of poplar chromosomes I and XVI subsequently gave rise to the progenitor of extant Salix species. This study contributes to an improved understanding of genome divergence after ancient genome duplication in closely related lineages of higher plants. PMID:27352946
Bonus Organisms in High-Throughput Eukaryotic Whole-Genome Shotgun Assembly

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pangilinan, Jasmyn; Shapiro, Harris; Tu, Hank

2006-02-06

The DOE Joint Genome Institute has sequenced over 50 eukaryotic genomes, ranging in size from 15 MB to 1.6 GB, over a wide range of organism types. In the course of doing so, it has become clear that a substantial fraction of these data sets contains bonus organisms, usually prokaryotes, in addition to the desired genome. While some of these additional organisms are extraneous contamination, they are sometimes symbionts, and so can be of biological interest. Therefore, it is desirable to assemble the bonus organisms along with the main genome. This transforms the problem into one of metagenomic assembly, which is considerably more challenging than traditional whole-genome shotgun (WGS) assembly. The different organisms will usually be present at different sequence depths, which is difficult to handle in most WGS assemblers. In addition, with multiple distinct genomes present, chimerism can produce cross-organism combinations. Finally, there is no guarantee that only a single bonus organism will be present. For example, one JGI project contained at least two different prokaryotic contaminants, plus a 145 KB plasmid of unknown origin. We have developed techniques to routinely identify and handle such bonus organisms in a high-throughput sequencing environment. Approaches include screening and partitioning the unassembled data, and iterative subassemblies. These methods are applicable not only to bonus organisms, but also to desired components such as organelles. These procedures have the additional benefit of identifying, and allowing for the removal of, cloning artifacts such as E. coli and spurious vector inclusions.
Origin of eukaryotes from within archaea, archaeal eukaryome and bursts of gene gain: eukaryogenesis just made easier?

PubMed Central

Koonin, Eugene V.

2015-01-01

The origin of eukaryotes is a fundamental, forbidding evolutionary puzzle. Comparative genomic analysis clearly shows that the last eukaryotic common ancestor (LECA) possessed most of the signature complex features of modern eukaryotic cells, in particular the mitochondria, the endomembrane system including the nucleus, an advanced cytoskeleton and the ubiquitin network. Numerous duplications of ancestral genes, e.g. DNA polymerases, RNA polymerases and proteasome subunits, also can be traced back to the LECA. Thus, the LECA was not a primitive organism and its emergence must have resulted from extensive evolution towards cellular complexity. However, the scenario of eukaryogenesis, and in particular the relationship between endosymbiosis and the origin of eukaryotes, is far from being clear. Four recent developments provide new clues to the likely routes of eukaryogenesis. First, evolutionary reconstructions suggest complex ancestors for most of the major groups of archaea, with the subsequent evolution dominated by gene loss. Second, homologues of signature eukaryotic proteins, such as actin and tubulin that form the core of the cytoskeleton or the ubiquitin system, have been detected in diverse archaea. The discovery of this ‘dispersed eukaryome’ implies that the archaeal ancestor of eukaryotes was a complex cell that might have been capable of a primitive form of phagocytosis and thus conducive to endosymbiont capture. Third, phylogenomic analyses converge on the origin of most eukaryotic genes of archaeal descent from within the archaeal evolutionary tree, specifically, the TACK superphylum. Fourth, evidence has been presented that the origin of the major archaeal phyla involved massive acquisition of bacterial genes. Taken together, these findings make the symbiogenetic scenario for the origin of eukaryotes considerably more plausible and the origin of the organizational complexity of eukaryotic cells more readily explainable than they appeared until
Complex archaea that bridge the gap between prokaryotes and eukaryotes

PubMed Central

Martijn, Joran; Lind, Anders E.; van Eijk, Roel; Schleper, Christa; Guy, Lionel; Ettema, Thijs J. G.

2015-01-01

The origin of the eukaryotic cell remains one of the most contentious puzzles in modern biology. Recent studies have provided support for the emergence of the eukaryotic host cell from within the archaeal domain of life, but the identity and nature of the putative archaeal ancestor remain a subject of debate. Here we describe the discovery of ‘Lokiarchaeota’, a novel candidate archaeal phylum, which forms a monophyletic group with eukaryotes in phylogenomic analyses, and whose genomes encode an expanded repertoire of eukaryotic signature proteins that are suggestive of sophisticated membrane remodelling capabilities. Our results provide strong support for hypotheses in which the eukaryotic host evolved from a bona fide archaeon, and demonstrate that many components that underpin eukaryote-specific features were already present in that ancestor. This provided the host with a rich genomic ‘starter-kit’ to support the increase in the cellular and genomic complexity that is characteristic of eukaryotes. PMID:25945739
The Evolutionary Landscape of Dbl-Like RhoGEF Families: Adapting Eukaryotic Cells to Environmental Signals

PubMed Central

Blangy, Anne

2017-01-01

Abstract The dynamics of cell morphology in eukaryotes is largely controlled by small GTPases of the Rho family. Rho GTPases are activated by guanine nucleotide exchange factors (RhoGEFs), of which diffuse B-cell lymphoma (Dbl)-like members form the largest family. Here, we surveyed Dbl-like sequences from 175 eukaryotic genomes and illuminate how the Dbl family evolved in all eukaryotic supergroups. By combining probabilistic phylogenetic approaches and functional domain analysis, we show that the human Dbl-like family is made of 71 members, structured into 20 subfamilies. The 71 members were already present in ancestral jawed vertebrates, but several members were subsequently lost in specific clades, up to 12% in birds. The jawed vertebrate repertoire was established from two rounds of duplications that occurred between tunicates, cyclostomes, and jawed vertebrates. Duplicated members showed distinct tissue distributions, conserved at least in Amniotes. All 20 subfamilies have members in Deuterostomes and Protostomes. Nineteen subfamilies are present in Porifera, the first phylum that diverged in Metazoa, 14 in Choanoflagellida and Filasterea, single-celled organisms closely related to Metazoa, and three in Fungi, the sister clade to Metazoa. Other eukaryotic supergroups show an extraordinary variability of Dbl-like repertoires as a result of repeated and independent gain and loss events. Last, we observed that in Metazoa, the number of Dbl-like RhoGEFs varies in proportion to cell signaling complexity. Overall, our analysis supports the conclusion that Dbl-like RhoGEFs were present at the origin of eukaryotes and evolved as highly adaptive cell signaling mediators. PMID:28541439
Comparative Genomics of a Bacterivorous Green Alga Reveals Evolutionary Causalities and Consequences of Phago-Mixotrophic Mode of Nutrition

PubMed Central

Burns, John A.; Paasch, Amber; Narechania, Apurva; Kim, Eunsoo

2015-01-01

Abstract Cymbomonas tetramitiformis—a marine prasinophyte—is one of only a few green algae that still retain an ancestral particulate-feeding mechanism while harvesting energy through photosynthesis. The genome of the alga is estimated to be 850 Mb–1.2 Gb in size—the bulk of which is filled with repetitive sequences—and is annotated with 37,366 protein-coding gene models. A number of unusual metabolic pathways (for the Chloroplastida) are predicted for C. tetramitiformis, including pathways for Lipid-A and peptidoglycan metabolism. Comparative analyses of the predicted peptides of C. tetramitiformis to sets of other eukaryotes revealed that nonphagocytes are depleted in a number of genes, a proportion of which have known function in feeding. In addition, our analysis suggests that obligatory phagotrophy is associated with the loss of genes that function in biosynthesis of small molecules (e.g., amino acids). Further, C. tetramitiformis and at least one other phago-mixotrophic alga are thus unique, compared with obligatory heterotrophs and nonphagocytes, in that both feeding and small molecule synthesis-related genes are retained in their genomes. These results suggest that early, ancestral host eukaryotes that gave rise to phototrophs had the capacity to assimilate building block molecules from inorganic substances (i.e., prototrophy). The loss of biosynthesis genes, thus, may at least partially explain the apparent lack of instances of permanent incorporation of photosynthetic endosymbionts in later-divergent, auxotrophic eukaryotic lineages, such as metazoans and ciliates. PMID:26224703
Dormant origins as a built-in safeguard in eukaryotic DNA replication against genome instability and disease development.

PubMed

Shima, Naoko; Pederson, Kayla D

2017-08-01

DNA replication is a prerequisite for cell proliferation, yet it can be increasingly challenging for a eukaryotic cell to faithfully duplicate its genome as its size and complexity expands. Dormant origins now emerge as a key component for cells to successfully accomplish such a demanding but essential task. In this perspective, we will first provide an overview of the fundamental processes eukaryotic cells have developed to regulate origin licensing and firing. With a special focus on mammalian systems, we will then highlight the role of dormant origins in preventing replication-associated genome instability and their functional interplay with proteins involved in the DNA damage repair response for tumor suppression. Lastly, deficiencies in the origin licensing machinery will be discussed in relation to their influence on stem cell maintenance and human diseases. Copyright © 2017 Elsevier B.V. All rights reserved.
Yeast 2.0-connecting the dots in the construction of the world's first functional synthetic eukaryotic genome.

PubMed

Pretorius, I S; Boeke, J D

2018-06-01

Historians of the future may well describe 2018 as the year that the world's first functional synthetic eukaryotic genome became a reality. Without the benefit of hindsight, it might be hard to completely grasp the long-term significance of a breakthrough moment in the history of science like this. The role of synthetic biology in the imminent birth of a budding Saccharomyces cerevisiae yeast cell carrying 16 man-made chromosomes causes the world of science to teeter on the threshold of a future-defining scientific frontier. The genome-engineering tools and technologies currently being developed to produce the ultimate yeast genome will irreversibly connect the dots between our improved understanding of the fundamentals of a complex cell containing its DNA in a specialised nucleus and the application of bioengineered eukaryotes designed for advanced biomanufacturing of beneficial products. By joining up the dots between the findings and learnings from the international Synthetic Yeast Genome project (known as the Yeast 2.0 or Sc2.0 project) and concurrent advancements in biodesign tools and smart data-intensive technologies, a future world powered by a thriving bioeconomy seems realistic. This global project demonstrates how a collaborative network of dot connectors-driven by a tinkerer's indomitable curiosity to understand how things work inside a eukaryotic cell-are using cutting-edge biodesign concepts and synthetic biology tools to advance science and to positively frame human futures (i.e. improved quality of life) in a planetary context (i.e. a sustainable environment). Explorations such as this have a rich history of resulting in unexpected discoveries and unanticipated applications for the benefit of people and planet. However, we must learn from past explorations into controversial futuristic sciences and ensure that researchers at the forefront of an emerging science such as synthetic biology remain connected to all stakeholders' concerns about the
Beyond Agrobacterium-Mediated Transformation: Horizontal Gene Transfer from Bacteria to Eukaryotes.

PubMed

Lacroix, Benoît; Citovsky, Vitaly

2018-03-03

Besides the massive gene transfer from organelles to the nuclear genomes, which occurred during the early evolution of eukaryote lineages, the importance of horizontal gene transfer (HGT) in eukaryotes remains controversial. Yet, increasing amounts of genomic data reveal many cases of bacterium-to-eukaryote HGT that likely represent a significant force in adaptive evolution of eukaryotic species. However, DNA transfer involved in genetic transformation of plants by Agrobacterium species has traditionally been considered as the unique example of natural DNA transfer and integration into eukaryotic genomes. Recent discoveries indicate that the repertoire of donor bacterial species and of recipient eukaryotic hosts potentially are much wider than previously thought, including donor bacterial species, such as plant symbiotic nitrogen-fixing bacteria (e.g., Rhizobium etli) and animal bacterial pathogens (e.g., Bartonella henselae, Helicobacter pylori), and recipient species from virtually all eukaryotic clades. Here, we review the molecular pathways and potential mechanisms of these trans-kingdom HGT events and discuss their utilization in biotechnology and research.

Streamlining and Large Ancestral Genomes in Archaea Inferred with a Phylogenetic Birth-and-Death Model

PubMed Central

Miklós, István

2009-01-01

Homologous genes originate from a common ancestor through vertical inheritance, duplication, or horizontal gene transfer. Entire homolog families spawned by a single ancestral gene can be identified across multiple genomes based on protein sequence similarity. The sequences, however, do not always reveal conclusively the history of large families. To study the evolution of complete gene repertoires, we propose here a mathematical framework that does not rely on resolved gene family histories. We show that so-called phylogenetic profiles, formed by family sizes across multiple genomes, are sufficient to infer principal evolutionary trends. The main novelty in our approach is an efficient algorithm to compute the likelihood of a phylogenetic profile in a model of birth-and-death processes acting on a phylogeny. We examine known gene families in 28 archaeal genomes using a probabilistic model that involves lineage- and family-specific components of gene acquisition, duplication, and loss. The model enables us to consider all possible histories when inferring statistics about archaeal evolution. According to our reconstruction, most lineages are characterized by a net loss of gene families. Major increases in gene repertoire have occurred only a few times. Our reconstruction underlines the importance of persistent streamlining processes in shaping genome composition in Archaea. It also suggests that early archaeal genomes were as complex as typical modern ones, and even show signs, in the case of the methanogenic ancestor, of an extremely large gene repertoire. PMID:19570746
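As a rough illustration of the kind of process this abstract describes, the sketch below simulates the size of one gene family along a single branch under gain, duplication, and loss, then collects the sizes across several genomes into a phylogenetic profile. It is not the paper's likelihood algorithm: the function names, the rate values, and the star-shaped phylogeny are all assumptions made for the example, and simulation stands in for the exact likelihood computation.

```python
import random


def simulate_family_size(n0, gain, duplication, loss, branch_length, rng):
    """Simulate one gene family's size along one branch (Gillespie-style).

    Hypothetical parameterisation: `gain` is a constant rate of acquiring a
    copy, while `duplication` and `loss` are rates per existing gene copy.
    """
    n, elapsed = n0, 0.0
    while True:
        total_rate = gain + n * (duplication + loss)
        if total_rate == 0.0:
            return n  # nothing can happen any more
        elapsed += rng.expovariate(total_rate)
        if elapsed > branch_length:
            return n  # next event falls beyond the end of the branch
        u = rng.random() * total_rate
        if u < gain + n * duplication:
            n += 1  # acquisition or duplication adds a copy
        else:
            n -= 1  # loss removes a copy (only reachable when n > 0)


def simulate_profile(n0, rates, branch_lengths, seed=0):
    """Family sizes across genomes that descend independently from one ancestor.

    A star phylogeny is assumed purely to keep the sketch short.
    """
    rng = random.Random(seed)
    gain, duplication, loss = rates
    return [
        simulate_family_size(n0, gain, duplication, loss, length, rng)
        for length in branch_lengths
    ]


if __name__ == "__main__":
    # Illustrative rates only: loss dominates, mimicking streamlining.
    print(simulate_profile(5, (0.05, 0.1, 0.4), [0.5, 1.0, 2.0, 4.0]))
```

With loss set higher than gain and duplication combined, longer branches tend to end with smaller families, which is the qualitative pattern the abstract calls streamlining.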
Yeast 2.0—connecting the dots in the construction of the world's first functional synthetic eukaryotic genome

PubMed Central

Boeke, J D

2018-01-01

Abstract Historians of the future may well describe 2018 as the year that the world's first functional synthetic eukaryotic genome became a reality. Without the benefit of hindsight, it might be hard to completely grasp the long-term significance of a breakthrough moment in the history of science like this. The role of synthetic biology in the imminent birth of a budding Saccharomyces cerevisiae yeast cell carrying 16 man-made chromosomes causes the world of science to teeter on the threshold of a future-defining scientific frontier. The genome-engineering tools and technologies currently being developed to produce the ultimate yeast genome will irreversibly connect the dots between our improved understanding of the fundamentals of a complex cell containing its DNA in a specialised nucleus and the application of bioengineered eukaryotes designed for advanced biomanufacturing of beneficial products. By joining up the dots between the findings and learnings from the international Synthetic Yeast Genome project (known as the Yeast 2.0 or Sc2.0 project) and concurrent advancements in biodesign tools and smart data-intensive technologies, a future world powered by a thriving bioeconomy seems realistic. This global project demonstrates how a collaborative network of dot connectors—driven by a tinkerer's indomitable curiosity to understand how things work inside a eukaryotic cell—are using cutting-edge biodesign concepts and synthetic biology tools to advance science and to positively frame human futures (i.e. improved quality of life) in a planetary context (i.e. a sustainable environment). Explorations such as this have a rich history of resulting in unexpected discoveries and unanticipated applications for the benefit of people and planet. However, we must learn from past explorations into controversial futuristic sciences and ensure that researchers at the forefront of an emerging science such as synthetic biology remain connected to all stakeholders’ concerns
Symbiosis and the origin of eukaryotic motility

NASA Technical Reports Server (NTRS)

Margulis, L.; Hinkle, G.

1991-01-01

Ongoing work to test the hypothesis of the origin of eukaryotic cell organelles by microbial symbioses is discussed. Because of the widespread acceptance of the serial endosymbiotic theory (SET) of the origin of plastids and mitochondria, the idea of a symbiotic origin of centrioles and axonemes from a spirochete motility symbiosis was tested. Intracellular microtubular systems are purported to derive from symbiotic associations between ancestral eukaryotic cells and motile bacteria. Four lines of approach to this problem are being pursued: (1) cloning the gene of a tubulin-like protein discovered in Spirochaeta bajacaliforniensis; (2) seeking axoneme proteins in spirochetes by antibody cross-reaction; (3) attempting to cultivate larger, free-living spirochetes; and (4) studying in detail spirochetes (e.g., Cristispira) symbiotic with marine animals. Other aspects of the investigation are presented.
FrameD: A flexible program for quality check and gene prediction in prokaryotic genomes and noisy matured eukaryotic sequences.

PubMed

Schiex, Thomas; Gouzy, Jérôme; Moisan, Annick; de Oliveira, Yannick

2003-07-01

We describe FrameD, a program that predicts coding regions in prokaryotic and matured eukaryotic sequences. Initially targeted at gene prediction in GC-rich bacterial genomes, the gene model used in FrameD also allows genes to be predicted in the presence of frameshifts and partially undetermined sequences, which makes it well suited to gene prediction and frameshift correction in unfinished sequences such as EST and EST cluster sequences. Like recent eukaryotic gene prediction programs, FrameD can also take protein similarity information into account, both in its prediction and in its graphical output. Its performance is evaluated on different bacterial genomes. The web site (http://genopole.toulouse.inra.fr/bioinfo/FrameD/FD) allows direct prediction, sequence correction and translation, and offers the ability to learn new models for new organisms.
Transfer of DNA from Bacteria to Eukaryotes

PubMed Central

2016-01-01

ABSTRACT Historically, the members of the Agrobacterium genus have been considered the only bacterial species naturally able to transfer and integrate DNA into the genomes of their eukaryotic hosts. Yet, increasing evidence suggests that this ability to genetically transform eukaryotic host cells might be more widespread in the bacterial world. Indeed, analyses of accumulating genomic data reveal cases of horizontal gene transfer from bacteria to eukaryotes and suggest that it represents a significant force in adaptive evolution of eukaryotic species. Specifically, recent reports indicate that bacteria other than Agrobacterium, such as Bartonella henselae (a zoonotic pathogen), Rhizobium etli (a plant-symbiotic bacterium related to Agrobacterium), or even Escherichia coli, have the ability to genetically transform their host cells under laboratory conditions. This DNA transfer relies on type IV secretion systems (T4SSs), the molecular machines that transport macromolecules during conjugative plasmid transfer and also during transport of proteins and/or DNA to the eukaryotic recipient cells. In this review article, we explore the extent of possible transfer of genetic information from bacteria to eukaryotic cells as well as the evolutionary implications and potential applications of this transfer. PMID:27406565
Transcription factor IID in the Archaea: sequences in the Thermococcus celer genome would encode a product closely related to the TATA-binding protein of eukaryotes

NASA Technical Reports Server (NTRS)

Marsh, T. L.; Reich, C. I.; Whitelock, R. B.; Olsen, G. J.; Woese, C. R. (Principal Investigator)

1994-01-01

The first step in transcription initiation in eukaryotes is mediated by the TATA-binding protein, a subunit of the transcription factor IID complex. We have cloned and sequenced the gene for a presumptive homolog of this eukaryotic protein from Thermococcus celer, a member of the Archaea (formerly archaebacteria). The protein encoded by the archaeal gene is a tandem repeat of a conserved domain, corresponding to the repeated domain in its eukaryotic counterparts. Molecular phylogenetic analyses of the two halves of the repeat are consistent with the duplication occurring before the divergence of the archaeal and eukaryotic domains. In conjunction with previous observations of similarity in RNA polymerase subunit composition and sequences and the finding of a transcription factor IIB-like sequence in Pyrococcus woesei (a relative of T. celer), it appears that major features of the eukaryotic transcription apparatus were well-established before the origin of eukaryotic cellular organization. The divergence between the two halves of the archaeal protein is less than that between the halves of the individual eukaryotic sequences, indicating that the average rate of sequence change in the archaeal protein has been less than in its eukaryotic counterparts. To the extent that this lower rate applies to the genome as a whole, a clearer picture of the early genes (and gene families) that gave rise to present-day genomes is more apt to emerge from the study of sequences from the Archaea than from the corresponding sequences from eukaryotes.
Origins of Eukaryotic Sexual Reproduction

PubMed Central

2014-01-01

Sexual reproduction is a nearly universal feature of eukaryotic organisms. Given its ubiquity and shared core features, sex is thought to have arisen once in the last common ancestor to all eukaryotes. Using the perspectives of molecular genetics and cell biology, we consider documented and hypothetical scenarios for the instantiation and evolution of meiosis, fertilization, sex determination, uniparental inheritance of organelle genomes, and speciation. PMID:24591519
The genomic underpinnings of eukaryotic virus taxonomy: creating a sequence-based framework for family-level virus classification.

PubMed

Aiewsakun, Pakorn; Simmonds, Peter

2018-02-20

The International Committee on Taxonomy of Viruses (ICTV) classifies viruses into families, genera and species and provides a regulated system for their nomenclature that is universally used in virus descriptions. Virus taxonomic assignments have traditionally been based upon virus phenotypic properties such as host range, virion morphology and replication mechanisms, particularly at family level. However, gene sequence comparisons provide a clearer guide to their evolutionary relationships and provide the only information that may guide the incorporation of viruses detected in environmental (metagenomic) studies that lack any phenotypic data. The current study sought to determine whether the existing virus taxonomy could be reproduced by examination of genetic relationships through the extraction of protein-coding gene signatures and genome organisational features. We found large-scale consistency between genetic relationships and taxonomic assignments for viruses of all genome configurations and genome sizes. The analysis pipeline that we have called 'Genome Relationships Applied to Virus Taxonomy' (GRAViTy) was highly effective at reproducing the current assignments of viruses at family level as well as inter-family groupings into orders. Its ability to correctly differentiate assigned viruses from unassigned viruses, and classify them into the correct taxonomic group, was evaluated by threefold cross-validation technique. This predicted family membership of eukaryotic viruses with close to 100% accuracy and specificity potentially enabling the algorithm to predict assignments for the vast corpus of metagenomic sequences consistently with ICTV taxonomy rules. In an evaluation run of GRAViTy, over one half (460/921) of (near)-complete genome sequences from several large published metagenomic eukaryotic virus datasets were assigned to 127 novel family-level groupings. If corroborated by other analysis methods, these would potentially more than double the number of
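The threefold cross-validation mentioned in this abstract can be sketched generically as below. This is not the GRAViTy pipeline: the function name, the seed, and the use of plain integer indices in place of virus genomes are assumptions for illustration. Each item lands in the held-out fold exactly once, so every genome is classified by a model that never saw it.

```python
import random


def k_fold_indices(n_items, k=3, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation.

    Indices are shuffled once, dealt into k folds, and each fold serves as
    the held-out test set exactly once.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, test


if __name__ == "__main__":
    # Ten hypothetical genomes split three ways.
    for train, test in k_fold_indices(10, k=3):
        print(sorted(train), sorted(test))
```

A real evaluation would usually also stratify the folds by family so that small families are represented in every training set; that refinement is omitted here.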
Deep phylogeny, ancestral groups and the four ages of life

PubMed Central

Cavalier-Smith, Thomas

2010-01-01

Organismal phylogeny depends on cell division, stasis, mutational divergence, cell mergers (by sex or symbiogenesis), lateral gene transfer and death. The tree of life is a useful metaphor for organismal genealogical history provided we recognize that branches sometimes fuse. Hennigian cladistics emphasizes only lineage splitting, ignoring most other major phylogenetic processes. Though methodologically useful it has been conceptually confusing and harmed taxonomy, especially in mistakenly opposing ancestral (paraphyletic) taxa. The history of life involved about 10 really major innovations in cell structure. In membrane topology, there were five successive kinds of cell: (i) negibacteria, with two bounding membranes, (ii) unibacteria, with one bounding and no internal membranes, (iii) eukaryotes with endomembranes and mitochondria, (iv) plants with chloroplasts and (v) finally, chromists with plastids inside the rough endoplasmic reticulum. Membrane chemistry divides negibacteria into the more advanced Glycobacteria (e.g. Cyanobacteria and Proteobacteria) with outer membrane lipopolysaccharide and primitive Eobacteria without lipopolysaccharide (deserving intenser study). It also divides unibacteria into posibacteria, ancestors of eukaryotes, and archaebacteria, the sisters (not ancestors) of eukaryotes and the youngest bacterial phylum. Anaerobic eobacteria, oxygenic cyanobacteria, desiccation-resistant posibacteria and finally neomura (eukaryotes plus archaebacteria) successively transformed Earth. Accidents and organizational constraints are as important as adaptiveness in body plan evolution. PMID:20008390
Intracellular metabolic pathway distribution in diatoms and tools for genome-enabled experimental diatom research.

PubMed

Gruber, Ansgar; Kroth, Peter G

2017-09-05

Diatoms are important primary producers in the oceans and can also dominate other aquatic habitats. One reason for the success of this phylogenetically relatively young group of unicellular organisms could be the impressive redundancy and diversity of metabolic isoenzymes in diatoms. This redundancy is a result of the evolutionary origin of diatom plastids by a eukaryote-eukaryote endosymbiosis, a process that implies temporary redundancy of functionally complete eukaryotic genomes. During the establishment of the plastids, this redundancy was partially reduced via gene losses, and was partially retained via gene transfer to the nucleus of the respective host cell. These gene transfers required re-assignment of intracellular targeting signals, a process that simultaneously altered the intracellular distribution of metabolic enzymes compared with the ancestral cells. Genome annotation, the correct assignment of the gene products and the prediction of putative function, strongly depends on the correct prediction of the intracellular targeting of a gene product. Here again diatoms are very peculiar, because the targeting systems for organelle import are partially different to those in land plants. In this review, we describe methods of predicting intracellular enzyme locations, highlight findings of metabolic peculiarities in diatoms and present genome-enabled approaches to study their metabolism. This article is part of the themed issue 'The peculiar carbon metabolism in diatoms'. © 2017 The Author(s).
A Helitron transposon reconstructed from bats reveals a novel mechanism of genome shuffling in eukaryotes.

PubMed

Grabundzija, Ivana; Messing, Simon A; Thomas, Jainy; Cosby, Rachel L; Bilic, Ilija; Miskey, Csaba; Gogol-Döring, Andreas; Kapitonov, Vladimir; Diem, Tanja; Dalda, Anna; Jurka, Jerzy; Pritham, Ellen J; Dyda, Fred; Izsvák, Zsuzsanna; Ivics, Zoltán

2016-03-02

Helitron transposons capture and mobilize gene fragments in eukaryotes, but experimental evidence for their transposition is lacking in the absence of an isolated active element. Here we reconstruct Helraiser, an ancient element from the bat genome, and use this transposon as an experimental tool to unravel the mechanism of Helitron transposition. A hairpin close to the 3'-end of the transposon functions as a transposition terminator. However, the 3'-end can be bypassed by the transposase, resulting in transduction of flanking sequences to new genomic locations. Helraiser transposition generates covalently closed circular intermediates, suggestive of a replicative transposition mechanism, which provides a powerful means to disseminate captured transcriptional regulatory signals across the genome. Indeed, we document the generation of novel transcripts by Helitron promoter capture both experimentally and by transcriptome analysis in bats. Our results provide mechanistic insight into Helitron transposition, and its impact on diversification of gene function by genome shuffling.
Mitochondria, the Cell Cycle, and the Origin of Sex via a Syncytial Eukaryote Common Ancestor

PubMed Central

Garg, Sriram G.; Martin, William F.

2016-01-01

Theories for the origin of sex traditionally start with an asexual mitosing cell and add recombination, thereby deriving meiosis from mitosis. Though sex was clearly present in the eukaryote common ancestor, the order of events linking the origin of sex and the origin of mitosis is unknown. Here, we present an evolutionary inference for the origin of sex starting with a bacterial ancestor of mitochondria in the cytosol of its archaeal host. We posit that symbiotic association led to the origin of mitochondria and gene transfer to host’s genome, generating a nucleus and a dedicated translational compartment, the eukaryotic cytosol, in which—by virtue of mitochondria—metabolic energy was not limiting. Spontaneous protein aggregation (monomer polymerization) and Adenosine Tri-phosphate (ATP)-dependent macromolecular movement in the cytosol thereby became selectable, giving rise to continuous microtubule-dependent chromosome separation (reduction division). We propose that eukaryotic chromosome division arose in a filamentous, syncytial, multinucleated ancestor, in which nuclei with insufficient chromosome numbers could complement each other through mRNA in the cytosol and generate new chromosome combinations through karyogamy. A syncytial (or coenocytic, a synonym) eukaryote ancestor, or Coeca, would account for the observation that the process of eukaryotic chromosome separation is more conserved than the process of eukaryotic cell division. The first progeny of such a syncytial ancestor were likely equivalent to meiospores, released into the environment by the host’s vesicle secretion machinery. The natural ability of archaea (the host) to fuse and recombine brought forth reciprocal recombination among fusing (syngamy and karyogamy) progeny—sex—in an ancestrally meiotic cell cycle, from which the simpler haploid and diploid mitotic cell cycles arose. The origin of eukaryotes was the origin of vertical lineage inheritance, and sex was required to keep
Well-characterized sequence features of eukaryote genomes and implications for ab initio gene prediction.

PubMed

Huang, Ying; Chen, Shi-Yi; Deng, Feilong

2016-01-01

In silico analysis of DNA sequences is an important area of computational biology in the post-genomic era. Over the past two decades, computational approaches for ab initio prediction of gene structure from genome sequence alone have greatly advanced our understanding of a variety of biological questions. Although computational prediction of protein-coding genes is well established, robustly finding non-coding RNA genes, such as miRNA and lncRNA, remains a challenge. Ab initio gene prediction has two main aspects: the computed values that describe sequence features, and the algorithm used to train the discriminant function; various bioinformatic tools employ different combinations of the two. Herein, we briefly review these well-characterized sequence features in eukaryote genomes and their applications to ab initio gene prediction. The main purpose of this article is to provide an overview for beginners who aim to develop related bioinformatic tools.
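To make "computed values for describing sequence features" concrete, here is a minimal Python sketch of two features commonly fed to gene finders: k-mer frequencies and the longest forward-strand open reading frame. The function names and the toy sequence are illustrative assumptions; the abstract does not specify which features or tools the review covers, and a real predictor would also scan the reverse strand and model splice sites.

```python
from collections import Counter

STOP_CODONS = {"TAA", "TAG", "TGA"}

def kmer_frequencies(seq, k=3):
    """Relative frequency of each overlapping k-mer in seq."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}

def longest_orf(seq):
    """Longest ATG...stop reading frame on the forward strand.

    Returns (start, end) as a half-open interval, or None if no
    complete ORF is found. Illustrative only: no introns, no
    reverse strand.
    """
    seq = seq.upper()
    best = None
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if best is None or (i + 3 - start) > (best[1] - best[0]):
                    best = (start, i + 3)
                start = None
    return best

# Hypothetical toy sequence, not taken from any genome.
toy = "CCATGGCTGCTTAAGG"
orf = longest_orf(toy)
freqs = kmer_frequencies(toy, k=3)
```

In this sketch the feature values would then be passed to whatever classifier plays the role of the discriminant function.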
An alternative method for cDNA cloning from surrogate eukaryotic cells transfected with the corresponding genomic DNA.

PubMed

Hu, Lin-Yong; Cui, Chen-Chen; Song, Yu-Jie; Wang, Xiang-Guo; Jin, Ya-Ping; Wang, Ai-Hua; Zhang, Yong

2012-07-01

cDNA is widely used in gene function elucidation and/or transgenics research but often suitable tissues or cells from which to isolate mRNA for reverse transcription are unavailable. Here, an alternative method for cDNA cloning is described and tested by cloning the cDNA of human LALBA (human alpha-lactalbumin) from genomic DNA. First, genomic DNA containing all of the coding exons was cloned from human peripheral blood and inserted into a eukaryotic expression vector. Next, by delivering the plasmids into either 293T or fibroblast cells, surrogate cells were constructed. Finally, the total RNA was extracted from the surrogate cells and cDNA was obtained by RT-PCR. The human LALBA cDNA that was obtained was compared with the corresponding mRNA published in GenBank. The comparison showed that the two sequences were identical. The novel method for cDNA cloning from surrogate eukaryotic cells described here uses well-established techniques that are feasible and simple to use. We anticipate that this alternative method will have widespread applications.
How Malleable is the Eukaryotic Genome? Extreme Rate of Chromosomal Rearrangement in the Genus Drosophila

PubMed Central

Ranz, José María; Casals, Ferran; Ruiz, Alfredo

2001-01-01

During the evolution of the genus Drosophila, the molecular organization of the major chromosomal elements has been repeatedly rearranged via the fixation of paracentric inversions. Little detailed information is available, however, on the extent and effect of these changes at the molecular level. In principle, a full description of the rate and pattern of change could reveal the limits, if any, to which the eukaryotic genome can accommodate reorganizations. We have constructed a high-density physical map of the largest chromosomal element in Drosophila repleta (chromosome 2) and compared the order and distances between the markers with those on the homologous chromosomal element (3R) in Drosophila melanogaster. The two species belong to different subgenera (Drosophila and Sophophora, respectively), which diverged 40–62 million years (Myr) ago and represent, thus, the farthest lineages within the Drosophila genus. The comparison reveals extensive reshuffling of gene order from centromere to telomere. Using a maximum likelihood method, we estimate that 114 ± 14 paracentric inversions have been fixed in this chromosomal element since the divergence of the two species, that is, 0.9–1.4 inversions fixed per Myr. Comparison with available rates of chromosomal evolution, taking into account genome size, indicates that the Drosophila genome shows the highest rate found so far in any eukaryote. Twenty-one small segments (23–599 kb) comprising at least two independent (nonoverlapping) markers appear to be conserved between D. melanogaster and D. repleta. These results are consistent with the random breakage model and do not provide significant evidence of functional constraint of any kind. They support the notion that the Drosophila genome is extraordinarily malleable and has a modular organization. The high rate of chromosomal change also suggests a very limited transferability of the positional information from the Drosophila genome to other insects. [The
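The quoted range of 0.9–1.4 inversions per Myr can be reproduced from the figures in the abstract with a short calculation. This sketch assumes the rate is expressed per lineage, so that the elapsed time is twice the divergence time; that reading matches the reported numbers, but the abstract does not state it explicitly.

```python
def inversions_per_myr(n_inversions, divergence_myr, lineages=2):
    """Fixed inversions per Myr, per lineage.

    Assumption: inversions accumulate independently along each of
    the `lineages` branches since divergence, so the total elapsed
    branch length is lineages * divergence_myr.
    """
    return n_inversions / (lineages * divergence_myr)

low = inversions_per_myr(114, 62)   # oldest divergence estimate
high = inversions_per_myr(114, 40)  # youngest divergence estimate
print(f"{low:.1f}-{high:.1f} inversions per Myr")
```

The uncertainty of ± 14 inversions is ignored here; only the point estimate of 114 is used.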
Enzymes involved in organellar DNA replication in photosynthetic eukaryotes.

PubMed

Moriyama, Takashi; Sato, Naoki

2014-01-01

Plastids and mitochondria possess their own genomes. Although the replication mechanisms of these organellar genomes remain unclear in photosynthetic eukaryotes, several organelle-localized enzymes related to genome replication, including DNA polymerase, DNA primase, DNA helicase, DNA topoisomerase, single-stranded DNA maintenance protein, DNA ligase, primer removal enzyme, and several DNA recombination-related enzymes, have been identified. In the reference Eudicot plant Arabidopsis thaliana, the replication-related enzymes of plastids and mitochondria are similar because many of them are dual targeted to both organelles, whereas in the red alga Cyanidioschyzon merolae, plastids and mitochondria contain different replication machinery components. The enzymes involved in organellar genome replication in green plants and red algae were derived from different origins, including proteobacterial, cyanobacterial, and eukaryotic lineages. In the present review, we summarize the available data for enzymes related to organellar genome replication in green plants and red algae. In addition, based on the type and distribution of replication enzymes in photosynthetic eukaryotes, we discuss the transitional history of replication enzymes in the organelles of plants.
Single-cell transcriptomics for microbial eukaryotes.

PubMed

Kolisko, Martin; Boscaro, Vittorio; Burki, Fabien; Lynn, Denis H; Keeling, Patrick J

2014-11-17

One of the greatest hindrances to a comprehensive understanding of microbial genomics, cell biology, ecology, and evolution is that most microbial life is not in culture. Solutions to this problem have mainly focused on whole-community surveys like metagenomics, but these analyses inevitably lose information and present particular challenges for eukaryotes, which are relatively rare and possess large, gene-sparse genomes. Single-cell analyses present an alternative solution that allows for specific species to be targeted, while retaining information on cellular identity, morphology, and partitioning of activities within microbial communities. Single-cell transcriptomics, pioneered in medical research, offers particular potential advantages for uncultivated eukaryotes, but the efficiency and biases have not been tested. Here we describe a simple and reproducible method for single-cell transcriptomics using manually isolated cells from five model ciliate species; we examine impacts of amplification bias and contamination, and compare the efficacy of gene discovery to traditional culture-based transcriptomics. Gene discovery using single-cell transcriptomes was found to be comparable to mass-culture methods, suggesting single-cell transcriptomics is an efficient entry point into genomic data from the vast majority of eukaryotic biodiversity. Copyright © 2014 Elsevier Ltd. All rights reserved.
Advances in computer simulation of genome evolution: toward more realistic evolutionary genomics analysis by approximate bayesian computation.

PubMed

Arenas, Miguel

2015-04-01

NGS technologies present a fast and cheap generation of genomic data. Nevertheless, ancestral genome inference is not so straightforward due to complex evolutionary processes acting on this material such as inversions, translocations, and other genome rearrangements that, in addition to their implicit complexity, can co-occur and confound ancestral inferences. Recently, models of genome evolution that accommodate such complex genomic events are emerging. This letter explores these novel evolutionary models and proposes their incorporation into robust statistical approaches based on computer simulations, such as approximate Bayesian computation, that may produce a more realistic evolutionary analysis of genomic data. Advantages and pitfalls in using these analytical methods are discussed. Potential applications of these ancestral genomic inferences are also pointed out.
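As a rough illustration of the simulation-based approach the abstract refers to, the following Python sketch implements plain rejection-sampling approximate Bayesian computation for a single rate parameter. Everything specific here is an assumption made for the example: the Poisson-process simulator, the uniform prior, the tolerance, and the toy observation. It does not model inversions, translocations, or any of the genome-evolution models discussed in the letter.

```python
import random

def simulate_events(rate, duration, rng):
    """Count events of a Poisson process with the given rate."""
    t, n = 0.0, 0
    while True:
        t += rng.expovariate(rate)  # waiting time to the next event
        if t > duration:
            return n
        n += 1

def abc_rejection(observed, duration, prior_low, prior_high,
                  tolerance, n_draws, seed=0):
    """Keep prior draws whose simulated count is close to observed."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        rate = rng.uniform(prior_low, prior_high)  # draw from the prior
        simulated = simulate_events(rate, duration, rng)
        if abs(simulated - observed) <= tolerance:
            accepted.append(rate)
    return accepted

# Hypothetical toy observation: 20 events over 10 time units.
posterior = abc_rejection(observed=20, duration=10.0,
                          prior_low=0.1, prior_high=3.0,
                          tolerance=1, n_draws=5000, seed=1)
posterior_mean = sum(posterior) / len(posterior)
print(f"accepted {len(posterior)} of 5000 draws")
print(f"posterior mean rate: {posterior_mean:.2f}")
```

The accepted draws approximate the posterior of the rate; tightening the tolerance trades acceptance rate for accuracy, which is one of the practical pitfalls of these methods.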
Endosymbiosis and Eukaryotic Cell Evolution.

PubMed

Archibald, John M

2015-10-05

Understanding the evolution of eukaryotic cellular complexity is one of the grand challenges of modern biology. It has now been firmly established that mitochondria and plastids, the classical membrane-bound organelles of eukaryotic cells, evolved from bacteria by endosymbiosis. In the case of mitochondria, evidence points very clearly to an endosymbiont of α-proteobacterial ancestry. The precise nature of the host cell that partnered with this endosymbiont is, however, very much an open question. And while the host for the cyanobacterial progenitor of the plastid was undoubtedly a fully-fledged eukaryote, how - and how often - plastids moved from one eukaryote to another during algal diversification is vigorously debated. In this article I frame modern views on endosymbiotic theory in a historical context, highlighting the transformative role DNA sequencing played in solving early problems in eukaryotic cell evolution, and posing key unanswered questions emerging from the age of comparative genomics. Copyright © 2015 Elsevier Ltd. All rights reserved.
Endosymbiotic gene transfer from prokaryotic pangenomes: Inherited chimerism in eukaryotes.

PubMed

Ku, Chuan; Nelson-Sathi, Shijulal; Roettger, Mayo; Garg, Sriram; Hazkani-Covo, Einat; Martin, William F

2015-08-18

Endosymbiotic theory in eukaryotic-cell evolution rests upon a foundation of three cornerstone partners--the plastid (a cyanobacterium), the mitochondrion (a proteobacterium), and its host (an archaeon)--and carries a corollary that, over time, the majority of genes once present in the organelle genomes were relinquished to the chromosomes of the host (endosymbiotic gene transfer). However, notwithstanding eukaryote-specific gene inventions, single-gene phylogenies have never traced eukaryotic genes to three single prokaryotic sources, an issue that hinges crucially upon factors influencing phylogenetic inference. In the age of genomes, single-gene trees, once used to test the predictions of endosymbiotic theory, now spawn new theories that stand to eventually replace endosymbiotic theory with descriptive, gene tree-based variants featuring supernumerary symbionts: prokaryotic partners distinct from the cornerstone trio and whose existence is inferred solely from single-gene trees. We reason that the endosymbiotic ancestors of mitochondria and chloroplasts brought into the eukaryotic--and plant and algal--lineage a genome-sized sample of genes from the proteobacterial and cyanobacterial pangenomes of their respective day and that, even if molecular phylogeny were artifact-free, sampling prokaryotic pangenomes through endosymbiotic gene transfer would lead to inherited chimerism. Recombination in prokaryotes (transduction, conjugation, transformation) differs from recombination in eukaryotes (sex). Prokaryotic recombination leads to pangenomes, and eukaryotic recombination leads to vertical inheritance. Viewed from the perspective of endosymbiotic theory, the critical transition at the eukaryote origin that allowed escape from Muller's ratchet--the origin of eukaryotic recombination, or sex--might have required surprisingly little evolutionary innovation.

Clusters of ancestrally related genes that show paralogy in whole or in part are a major feature of the genomes of humans and other species.

PubMed

Walker, Michael B; King, Benjamin L; Paigen, Kenneth

2012-01-01

Arrangements of genes along chromosomes are a product of evolutionary processes, and we can expect that preferable arrangements will prevail over the span of evolutionary time, often being reflected in the non-random clustering of structurally and/or functionally related genes. Such non-random arrangements can arise by two distinct evolutionary processes: duplications of DNA sequences that give rise to clusters of genes sharing both sequence similarity and common sequence features and the migration together of genes related by function, but not by common descent. To provide a background for distinguishing between the two, which is important for future efforts to unravel the evolutionary processes involved, we here provide a description of the extent to which ancestrally related genes are found in proximity. Towards this purpose, we combined information from five genomic datasets, InterPro, SCOP, PANTHER, Ensembl protein families, and Ensembl gene paralogs. The results are provided in publicly available datasets (http://cgd.jax.org/datasets/clustering/paraclustering.shtml) describing the extent to which ancestrally related genes are in proximity beyond what is expected by chance (i.e. form paraclusters) in the human and nine other vertebrate genomes, as well as the D. melanogaster, C. elegans, A. thaliana, and S. cerevisiae genomes. With the exception of Saccharomyces, paraclusters are a common feature of the genomes we examined. In the human genome they are estimated to include at least 22% of all protein coding genes. Paraclusters are far more prevalent among some gene families than others, are highly species or clade specific and can evolve rapidly, sometimes in response to environmental cues. Altogether, they account for a large portion of the functional clustering previously reported in several genomes.
Evolution of histone 2A for chromatin compaction in eukaryotes

PubMed Central

Macadangdang, Benjamin R; Oberai, Amit; Spektor, Tanya; Campos, Oscar A; Sheng, Fang; Carey, Michael F; Vogelauer, Maria; Kurdistani, Siavash K

2014-01-01

During eukaryotic evolution, genome size has increased disproportionately to nuclear volume, necessitating greater degrees of chromatin compaction in higher eukaryotes, which have evolved several mechanisms for genome compaction. However, it is unknown whether histones themselves have evolved to regulate chromatin compaction. Analysis of histone sequences from 160 eukaryotes revealed that the H2A N-terminus has systematically acquired arginines as genomes expanded. Insertion of arginines into their evolutionarily conserved position in H2A of a small-genome organism increased linear compaction by as much as 40%, while their absence markedly diminished compaction in cells with large genomes. This effect was recapitulated in vitro with nucleosomal arrays using unmodified histones, indicating that the H2A N-terminus directly modulates the chromatin fiber likely through intra- and inter-nucleosomal arginine–DNA contacts to enable tighter nucleosomal packing. Our findings reveal a novel evolutionary mechanism for regulation of chromatin compaction and may explain the frequent mutations of the H2A N-terminus in cancer. DOI: http://dx.doi.org/10.7554/eLife.02792.001 PMID:24939988
MADS goes genomic in conifers: towards determining the ancestral set of MADS-box genes in seed plants.

PubMed

Gramzow, Lydia; Weilandt, Lisa; Theißen, Günter

2014-11-01

MADS-box genes comprise a gene family coding for transcription factors. This gene family expanded greatly during land plant evolution such that the number of MADS-box genes ranges from one or two in green algae to around 100 in angiosperms. Given the crucial functions of MADS-box genes for nearly all aspects of plant development, the expansion of this gene family probably contributed to the increasing complexity of plants. However, the expansion of MADS-box genes during one important step of land plant evolution, namely the origin of seed plants, remains poorly understood due to the previous lack of whole-genome data for gymnosperms. The newly available genome sequences of Picea abies, Picea glauca and Pinus taeda were used to identify the complete set of MADS-box genes in these conifers. In addition, MADS-box genes were identified in the growing number of transcriptomes available for gymnosperms. With these datasets, phylogenies were constructed to determine the ancestral set of MADS-box genes of seed plants and to infer the ancestral functions of these genes. Type I MADS-box genes are under-represented in gymnosperms and only a minimum of two Type I MADS-box genes have been present in the most recent common ancestor (MRCA) of seed plants. In contrast, a large number of Type II MADS-box genes were found in gymnosperms. The MRCA of extant seed plants probably possessed at least 11-14 Type II MADS-box genes. In gymnosperms two duplications of Type II MADS-box genes were found, such that the MRCA of extant gymnosperms had at least 14-16 Type II MADS-box genes. The implied ancestral set of MADS-box genes for seed plants shows simplicity for Type I MADS-box genes and remarkable complexity for Type II MADS-box genes in terms of phylogeny and putative functions. The analysis of transcriptome data reveals that gymnosperm MADS-box genes are expressed in a great variety of tissues, indicating diverse roles of MADS-box genes for the development of gymnosperms. This study is
MADS goes genomic in conifers: towards determining the ancestral set of MADS-box genes in seed plants

PubMed Central

Gramzow, Lydia; Weilandt, Lisa; Theißen, Günter

2014-01-01

Background and Aims: MADS-box genes comprise a gene family coding for transcription factors. This gene family expanded greatly during land plant evolution such that the number of MADS-box genes ranges from one or two in green algae to around 100 in angiosperms. Given the crucial functions of MADS-box genes for nearly all aspects of plant development, the expansion of this gene family probably contributed to the increasing complexity of plants. However, the expansion of MADS-box genes during one important step of land plant evolution, namely the origin of seed plants, remains poorly understood due to the previous lack of whole-genome data for gymnosperms. Methods: The newly available genome sequences of Picea abies, Picea glauca and Pinus taeda were used to identify the complete set of MADS-box genes in these conifers. In addition, MADS-box genes were identified in the growing number of transcriptomes available for gymnosperms. With these datasets, phylogenies were constructed to determine the ancestral set of MADS-box genes of seed plants and to infer the ancestral functions of these genes. Key Results: Type I MADS-box genes are under-represented in gymnosperms and only a minimum of two Type I MADS-box genes have been present in the most recent common ancestor (MRCA) of seed plants. In contrast, a large number of Type II MADS-box genes were found in gymnosperms. The MRCA of extant seed plants probably possessed at least 11–14 Type II MADS-box genes. In gymnosperms two duplications of Type II MADS-box genes were found, such that the MRCA of extant gymnosperms had at least 14–16 Type II MADS-box genes. Conclusions: The implied ancestral set of MADS-box genes for seed plants shows simplicity for Type I MADS-box genes and remarkable complexity for Type II MADS-box genes in terms of phylogeny and putative functions. The analysis of transcriptome data reveals that gymnosperm MADS-box genes are expressed in a great variety of tissues, indicating diverse roles of MADS
How natural a kind is "eukaryote?".

PubMed

Doolittle, W Ford

2014-06-02

Systematics balances uneasily between realism and nominalism, uncommitted as to whether biological taxa are discoveries or inventions. If the former, they might be taken as natural kinds. I briefly review some philosophers' concepts of natural kinds and then argue that several of these apply well enough to "eukaryote." Although there are some sticky issues around genomic chimerism and when eukaryotes first appeared, if we allow for degrees in the naturalness of kinds, existing eukaryotes rank highly, higher than prokaryotes. Most biologists feel this intuitively: All I attempt to do here is provide some conceptual justification. Copyright © 2014 Cold Spring Harbor Laboratory Press; all rights reserved.
The logic of DNA replication in double-stranded DNA viruses: insights from global analysis of viral genomes

PubMed Central

Kazlauskas, Darius; Krupovic, Mart; Venclovas, Česlovas

2016-01-01

Genomic DNA replication is a complex process that involves multiple proteins. Cellular DNA replication systems are broadly classified into only two types, bacterial and archaeo-eukaryotic. In contrast, double-stranded (ds) DNA viruses feature a much broader diversity of DNA replication machineries. Viruses differ greatly in both completeness and composition of their sets of DNA replication proteins. In this study, we explored whether there are common patterns underlying this extreme diversity. We identified and analyzed all major functional groups of DNA replication proteins in all available proteomes of dsDNA viruses. Our results show that some proteins are common to viruses infecting all domains of life and likely represent components of the ancestral core set. These include B-family polymerases, SF3 helicases, archaeo-eukaryotic primases, clamps and clamp loaders of the archaeo-eukaryotic type, RNase H and ATP-dependent DNA ligases. We also discovered a clear correlation between genome size and self-sufficiency of viral DNA replication, the unanticipated dominance of replicative helicases and pervasive functional associations among certain groups of DNA replication proteins. Altogether, our results provide a comprehensive view on the diversity and evolution of replication systems in the DNA virome and uncover fundamental principles underlying the orchestration of viral DNA replication. PMID:27112572
Genome-wide computational identification of microRNAs and their targets in the deep-branching eukaryote Giardia lamblia.

PubMed

Zhang, Yan-Qiong; Chen, Dong-Liang; Tian, Hai-Feng; Zhang, Bao-Hong; Wen, Jian-Fan

2009-10-01

Using a combined computational program, we identified 50 potential microRNAs (miRNAs) in Giardia lamblia, one of the most primitive unicellular eukaryotes. These miRNAs are unique to G. lamblia and no homologues have been found in other organisms; miRNAs, currently known in other species, were not found in G. lamblia. This suggests that miRNA biogenesis and miRNA-mediated gene regulation pathway may evolve independently, especially in evolutionarily distant lineages. A majority (43) of the predicted miRNAs are located at one single locus; however, some miRNAs have two or more copies in the genome. Among the 58 miRNA genes, 28 are located in the intergenic regions whereas 30 are present in the anti-sense strands of the protein-coding sequences. Five predicted miRNAs are expressed in G. lamblia trophozoite cells evidenced by expressed sequence tags or RT-PCR. Thirty-seven identified miRNAs may target 50 protein-coding genes, including seven variant-specific surface proteins (VSPs). Our findings provide a clue that miRNA-mediated gene regulation may exist in the early stage of eukaryotic evolution, suggesting that it is an important regulation system ubiquitous in eukaryotes.
Ancestral effect on HOMA-IR levels quantitated in an American population of Mexican origin.

PubMed

Qu, Hui-Qi; Li, Quan; Lu, Yang; Hanis, Craig L; Fisher-Hoch, Susan P; McCormick, Joseph B

2012-12-01

An elevated insulin resistance index (homeostasis model assessment of insulin resistance [HOMA-IR]) is more commonly seen in the Mexican American population than in European populations. We report quantitative ancestral effects within a Mexican American population, and we correlate ancestral components with HOMA-IR. We performed ancestral analysis in 1,551 participants of the Cameron County Hispanic Cohort by genotyping 103 ancestry-informative markers (AIMs). These AIMs allow determination of the percentage (0-100%) ancestry from three major continental populations, i.e., European, African, and Amerindian. We observed that predominantly Amerindian ancestral components were associated with increased HOMA-IR (β = 0.124, P = 1.64 × 10⁻⁷). The correlation was more significant in males (Amerindian β = 0.165, P = 5.08 × 10⁻⁷) than in females (Amerindian β = 0.079, P = 0.019). This unique study design demonstrates how genomic markers for quantitative ancestral information can be used in admixed populations to predict phenotypic traits such as insulin resistance.
Anchoring genome sequence to chromosomes of the central bearded dragon (Pogona vitticeps) enables reconstruction of ancestral squamate macrochromosomes and identifies sequence content of the Z chromosome.

PubMed

Deakin, Janine E; Edwards, Melanie J; Patel, Hardip; O'Meally, Denis; Lian, Jinmin; Stenhouse, Rachael; Ryan, Sam; Livernois, Alexandra M; Azad, Bhumika; Holleley, Clare E; Li, Qiye; Georges, Arthur

2016-06-10

Squamates (lizards and snakes) are a speciose lineage of reptiles displaying considerable karyotypic diversity, particularly among lizards. Understanding the evolution of this diversity requires comparison of genome organisation between species. Although the genomes of several squamate species have now been sequenced, only the green anole lizard has any sequence anchored to chromosomes. There is only limited gene mapping data available for five other squamates. This makes it difficult to reconstruct the events that have led to extant squamate karyotypic diversity. The purpose of this study was to anchor the recently sequenced central bearded dragon (Pogona vitticeps) genome to chromosomes to trace the evolution of squamate chromosomes. Assigning sequence to sex chromosomes was of particular interest for identifying candidate sex determining genes. By using two different approaches to map conserved blocks of genes, we were able to anchor approximately 42 % of the dragon genome sequence to chromosomes. We constructed detailed comparative maps between dragon, anole and chicken genomes, and where possible, made broader comparisons across Squamata using cytogenetic mapping information for five other species. We show that squamate macrochromosomes are relatively well conserved between species, supporting findings from previous molecular cytogenetic studies. Macrochromosome diversity between members of the Toxicofera clade has been generated by intrachromosomal, and a small number of interchromosomal, rearrangements. We reconstructed the ancestral squamate macrochromosomes by drawing upon comparative cytogenetic mapping data from seven squamate species and propose the events leading to the arrangements observed in representative species. In addition, we assigned over 8 Mbp of sequence containing 219 genes to the Z chromosome, providing a list of genes to begin testing as candidate sex determining genes. Anchoring of the dragon genome has provided substantial insight into
Genome size differentiates co-occurring populations of the planktonic diatom Ditylum brightwellii (Bacillariophyta)

PubMed Central

2010-01-01

Background: Diatoms are one of the most species-rich groups of eukaryotic microbes known. Diatoms are also the only group of eukaryotic micro-algae with a diplontic life history, suggesting that the ancestral diatom switched to a life history dominated by a duplicated genome. A key mechanism of speciation among diatoms could be a propensity for additional stable genome duplications. Across eukaryotic taxa, genome size is directly correlated to cell size and inversely correlated to physiological rates. Differences in relative genome size, cell size, and acclimated growth rates were analyzed in isolates of the diatom Ditylum brightwellii. Ditylum brightwellii consists of two main populations with identical 18S rDNA sequences; one population is distributed globally at temperate latitudes and the second appears to be localized to the Pacific Northwest coast of the USA. These two populations co-occur within the Puget Sound estuary of WA, USA, although their peak abundances differ depending on local conditions. Results: All isolates from the more regionally-localized population (population 2) possessed 1.94 ± 0.74 times the amount of DNA, grew more slowly, and were generally larger than isolates from the more globally distributed population (population 1). The ITS1 sequences, cell sizes, and genome sizes of isolates from New Zealand were the same as population 1 isolates from Puget Sound, but their growth rates were within the range of the slower-growing population 2 isolates. Importantly, the observed genome size difference between isolates from the two populations was stable regardless of time in culture or the changes in cell size that accompany the diatom life history. Conclusions: The observed two-fold difference in genome size between the D. brightwellii populations suggests that whole genome duplication occurred within cells of population 1 ultimately giving rise to population 2 cells. The apparent regional localization of population 2 is consistent with a recent
Genomic reconstruction of the history of extant populations of India reveals five distinct ancestral components and a complex structure

PubMed Central

Basu, Analabha; Sarkar-Roy, Neeta; Majumder, Partha P.

2016-01-01

India, occupying the center stage of Paleolithic and Neolithic migrations, has been underrepresented in genome-wide studies of variation. Systematic analysis of genome-wide data, using multiple robust statistical methods, on (i) 367 unrelated individuals drawn from 18 mainland and 2 island (Andaman and Nicobar Islands) populations selected to represent geographic, linguistic, and ethnic diversities, and (ii) individuals from populations represented in the Human Genome Diversity Panel (HGDP), reveals four major ancestries in mainland India. This contrasts with an earlier inference of two ancestries based on limited population sampling. A distinct ancestry of the populations of the Andaman archipelago was identified and found to be coancestral to Oceanic populations. Analysis of ancestral haplotype blocks revealed that extant mainland populations (i) admixed widely irrespective of ancestry, although admixture between populations was not always symmetric, and (ii) this practice was rapidly replaced by endogamy about 70 generations ago, among upper castes and Indo-European speakers predominantly. This estimated time coincides with the historical period of formulation and adoption of sociocultural norms restricting intermarriage in large social strata. A similar replacement observed among tribal populations was temporally less uniform. PMID:26811443
The Candidate Phylum Poribacteria by Single-Cell Genomics: New Insights into Phylogeny, Cell-Compartmentation, Eukaryote-Like Repeat Proteins, and Other Genomic Features

PubMed Central

Kamke, Janine; Rinke, Christian; Schwientek, Patrick; Mavromatis, Kostas; Ivanova, Natalia; Sczyrba, Alexander; Woyke, Tanja; Hentschel, Ute

2014-01-01

The candidate phylum Poribacteria is one of the most dominant and widespread members of the microbial communities residing within marine sponges. Cell compartmentalization had been postulated along with their discovery about a decade ago and their phylogenetic association to the Planctomycetes, Verrucomicrobia, Chlamydiae superphylum was proposed soon thereafter. In the present study we revised these features based on genomic data obtained from six poribacterial single cells. We propose that Poribacteria form a distinct monophyletic phylum contiguous to the PVC superphylum together with other candidate phyla. Our genomic analyses supported the possibility of cell compartmentalization in the form of bacterial microcompartments. Further analyses of eukaryote-like protein domains stressed the importance of such proteins with features including tetratricopeptide repeats, leucine-rich repeats as well as low-density lipoprotein receptor repeats, the latter of which are reported here for the first time from a sponge symbiont. Finally, examining the most abundant protein domain family on poribacterial genomes revealed diverse phyH family proteins, some of which may be related to dissolved organic phosphorus uptake. PMID:24498082
SINEs, evolution and genome structure in the opossum.

PubMed

Gu, Wanjun; Ray, David A; Walker, Jerilyn A; Barnes, Erin W; Gentles, Andrew J; Samollow, Paul B; Jurka, Jerzy; Batzer, Mark A; Pollock, David D

2007-07-01

Short INterspersed Elements (SINEs) are non-autonomous retrotransposons, usually between 100 and 500 base pairs (bp) in length, which are ubiquitous components of eukaryotic genomes. Their activity, distribution, and evolution can be highly informative on genomic structure and evolutionary processes. To determine recent activity, we amplified more than one hundred SINE1 loci in a panel of 43 M. domestica individuals derived from five diverse geographic locations. The SINE1 family has expanded recently enough that many loci were polymorphic, and the SINE1 insertion-based genetic distances among populations reflected geographic distance. Genome-wide comparisons of SINE1 densities and GC content revealed that high SINE1 density is associated with high GC content in a few long and many short spans. Young SINE1s, whether fixed or polymorphic, showed an unbiased GC content preference for insertion, indicating that the GC preference accumulates over long time periods, possibly in periodic bursts. SINE1 evolution is thus broadly similar to human Alu evolution, although it has an independent origin. High GC content adjacent to SINE1s is strongly correlated with bias towards higher AT to GC substitutions and lower GC to AT substitutions. This is consistent with biased gene conversion, and also indicates that like chickens, but unlike eutherian mammals, GC content heterogeneity (isochore structure) is reinforced by substitution processes in the M. domestica genome. Nevertheless, both high and low GC content regions are apparently headed towards lower GC content equilibria, possibly due to a relative shift to lower recombination rates in the recent Monodelphis ancestral lineage. Like eutherians, metatherian (marsupial) mammals have evolved high CpG substitution rates, but this is apparently a convergence in process rather than a shared ancestral state.
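The comparison of SINE1 density with GC content described above rests on measuring GC content over spans of sequence. The following is a minimal sketch of such a windowed GC calculation, not the authors' pipeline; the function names, the window and step sizes, and the toy sequence are all hypothetical and chosen only for illustration.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    # Ambiguous bases such as N are ignored; an empty window scores 0.0.
    return gc / total if total else 0.0


def windowed_gc(seq: str, window: int, step: int):
    """Yield (start, gc_fraction) for every full window along the sequence."""
    for start in range(0, len(seq) - window + 1, step):
        yield start, gc_content(seq[start:start + window])


# Toy sequence (hypothetical): an AT-only span, a GC-only span, a mixed span.
toy = "ATATATATGCGCGCGCATGCATGC"
for start, gc in windowed_gc(toy, window=8, step=8):
    print(start, gc)
```

In a real analysis the per-window GC values would be paired with per-window SINE counts before testing for association; that step is omitted here.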
Multigene eukaryote phylogeny reveals the likely protozoan ancestors of opisthokonts (animals, fungi, choanozoans) and Amoebozoa.

PubMed

Cavalier-Smith, Thomas; Chao, Ema E; Snell, Elizabeth A; Berney, Cédric; Fiore-Donno, Anna Maria; Lewis, Rhodri

2014-12-01

Animals and fungi independently evolved from the protozoan phylum Choanozoa, these three groups constituting a major branch of the eukaryotic evolutionary tree known as opisthokonts. Opisthokonts and the protozoan phylum Amoebozoa (amoebae plus slime moulds) were previously argued to have evolved independently from the little-studied, largely flagellate, protozoan phylum, Sulcozoa. Sulcozoa are a likely evolutionary link between opisthokonts and the more primitive excavate flagellates that have ventral feeding grooves and the most primitive known mitochondria. To extend earlier sparse evidence for the ancestral (paraphyletic) nature of Sulcozoa, we sequenced transcriptomes from six gliding flagellates (two apusomonads; three planomonads; Mantamonas). Phylogenetic analyses of 173-192 genes and 73-122 eukaryote-wide taxa show Sulcozoa as deeply paraphyletic, confirming that opisthokonts and Amoebozoa independently evolved from sulcozoans by losing their ancestral ventral groove and dorsal pellicle: Apusozoa (apusomonads plus anaerobic breviate amoebae) are robustly sisters to opisthokonts and probably paraphyletic, breviates diverging before apusomonads; Varisulca (planomonads, Mantamonas, and non-gliding flagellate Collodictyon) are sisters to opisthokonts plus Apusozoa and Amoebozoa, and possibly holophyletic; Glissodiscea (planomonads, Mantamonas) may be holophyletic, but Mantamonas sometimes groups with Collodictyon instead. Taxon and gene sampling slightly affects tree topology; for the closest branches in Sulcozoa and opisthokonts, proportionally reducing missing data eliminates conflicts between homogeneous-model maximum-likelihood trees and evolutionarily more realistic site-heterogeneous trees. Sulcozoa, opisthokonts, and Amoebozoa constitute an often-pseudopodial 'podiate' clade, one of only three eukaryotic 'supergroups'. Our trees indicate that evolution of sulcozoan dorsal pellicle, ventral pseudopodia, and ciliary gliding (probably simultaneously
Genomicus update 2015: KaryoView and MatrixView provide a genome-wide perspective to multispecies comparative genomics

PubMed Central

Louis, Alexandra; Nguyen, Nga Thi Thuy; Muffato, Matthieu; Roest Crollius, Hugues

2015-01-01

The Genomicus web server (http://www.genomicus.biologie.ens.fr/genomicus) is a visualization tool allowing comparative genomics in four different phyla (Vertebrate, Fungi, Metazoan and Plants). It provides access to genomic information from extant species, as well as ancestral gene content and gene order for vertebrates and flowering plants. Here we present the new features available for vertebrate genome with a focus on new graphical tools. The interface to enter the database has been improved, two pairwise genome comparison tools are now available (KaryoView and MatrixView) and the multiple genome comparison tools (PhyloView and AlignView) propose three new kinds of representation and a more intuitive menu. These new developments have been implemented for Genomicus portal dedicated to vertebrates. This allows the analysis of 68 extant animal genomes, as well as 58 ancestral reconstructed genomes. The Genomicus server also provides access to ancestral gene orders, to facilitate evolutionary and comparative genomics studies, as well as computationally predicted regulatory interactions, thanks to the representation of conserved non-coding elements with their putative gene targets. PMID:25378326
EuCAP, a Eukaryotic Community Annotation Package, and its application to the rice genome

PubMed Central

Thibaud-Nissen, Françoise; Campbell, Matthew; Hamilton, John P; Zhu, Wei; Buell, C Robin

2007-01-01

Background Despite the improvements of tools for automated annotation of genome sequences, manual curation at the structural and functional level can provide an increased level of refinement to genome annotation. The Institute for Genomic Research Rice Genome Annotation (hereafter named the Osa1 Genome Annotation) is the product of an automated pipeline and, for this reason, will benefit from the input of biologists with expertise in rice and/or particular gene families. Leveraging knowledge from a dispersed community of scientists is a demonstrated way of improving a genome annotation. This requires tools that facilitate 1) the submission of gene annotation to an annotation project, 2) the review of the submitted models by project annotators, and 3) the incorporation of the submitted models in the ongoing annotation effort. Results We have developed the Eukaryotic Community Annotation Package (EuCAP), an annotation tool, and have applied it to the rice genome. The primary level of curation by community annotators (CA) has been the annotation of gene families. Annotation can be submitted by email or through the EuCAP Web Tool. The CA models are aligned to the rice pseudomolecules and the coordinates of these alignments, along with functional annotation, are stored in the MySQL EuCAP Gene Model database. Web pages displaying the alignments of the CA models to the Osa1 Genome models are automatically generated from the EuCAP Gene Model database. The alignments are reviewed by the project annotators (PAs) in the context of experimental evidence. Upon approval by the PAs, the CA models, along with the corresponding functional annotations, are integrated into the Osa1 Genome Annotation. The CA annotations, grouped by family, are displayed on the Community Annotation pages of the project website, as well as in the Community Annotation track of the Genome Browser. Conclusion We have applied EuCAP to rice. As of July 2007, the structural and/or functional annotation of 1
Trypanosome outer kinetochore proteins suggest conservation of chromosome segregation machinery across eukaryotes

PubMed Central

D’Archivio, Simon

2017-01-01

Kinetochores are multiprotein complexes that couple eukaryotic chromosomes to the mitotic spindle to ensure proper segregation. The model for kinetochore assembly is conserved between humans and yeast, and homologues of several components are widely distributed in eukaryotes, but key components are absent in some lineages. The recent discovery in a lineage of protozoa called kinetoplastids of unconventional kinetochores with no apparent homology to model organisms suggests that more than one system for eukaryotic chromosome segregation may exist. In this study, we report a new family of proteins distantly related to outer kinetochore proteins Ndc80 and Nuf2. The family member in kinetoplastids, KKT-interacting protein 1 (KKIP1), associates with the kinetochore, and its depletion causes severe defects in karyokinesis, loss of individual chromosomes, and gross defects in spindle assembly or stability. Immunopurification of KKIP1 from stabilized kinetochores identifies six further components, which form part of a trypanosome outer kinetochore complex. These findings suggest that kinetochores in organisms such as kinetoplastids are built from a divergent, but not ancestrally distinct, set of components and that Ndc80/Nuf2-like proteins are universal in eukaryotic division. PMID:28034897
A draft physical map of a D-genome cotton species (Gossypium raimondii)

PubMed Central

2010-01-01

Background Genetically anchored physical maps of large eukaryotic genomes have proven useful both for their intrinsic merit and as an adjunct to genome sequencing. Cultivated tetraploid cottons, Gossypium hirsutum and G. barbadense, share a common ancestor formed by a merger of the A and D genomes about 1-2 million years ago. Toward the long-term goal of characterizing the spectrum of diversity among cotton genomes, the worldwide cotton community has prioritized the D genome progenitor Gossypium raimondii for complete sequencing. Results A whole genome physical map of G. raimondii, the putative D genome ancestral species of tetraploid cottons was assembled, integrating genetically-anchored overgo hybridization probes, agarose based fingerprints and 'high information content fingerprinting' (HICF). A total of 13,662 BAC-end sequences and 2,828 DNA probes were used in genetically anchoring 1585 contigs to a cotton consensus genetic map, and 370 and 438 contigs, respectively to Arabidopsis thaliana (AT) and Vitis vinifera (VV) whole genome sequences. Conclusion Several lines of evidence suggest that the G. raimondii genome is comprised of two qualitatively different components. Much of the gene rich component is aligned to the Arabidopsis and Vitis vinifera genomes and shows promise for utilizing translational genomic approaches in understanding this important genome and its resident genes. The integrated genetic-physical map is of value both in assembling and validating a planned reference sequence. PMID:20569427
The logic of DNA replication in double-stranded DNA viruses: insights from global analysis of viral genomes.

PubMed

Kazlauskas, Darius; Krupovic, Mart; Venclovas, Česlovas

2016-06-02

Genomic DNA replication is a complex process that involves multiple proteins. Cellular DNA replication systems are broadly classified into only two types, bacterial and archaeo-eukaryotic. In contrast, double-stranded (ds) DNA viruses feature a much broader diversity of DNA replication machineries. Viruses differ greatly in both completeness and composition of their sets of DNA replication proteins. In this study, we explored whether there are common patterns underlying this extreme diversity. We identified and analyzed all major functional groups of DNA replication proteins in all available proteomes of dsDNA viruses. Our results show that some proteins are common to viruses infecting all domains of life and likely represent components of the ancestral core set. These include B-family polymerases, SF3 helicases, archaeo-eukaryotic primases, clamps and clamp loaders of the archaeo-eukaryotic type, RNase H and ATP-dependent DNA ligases. We also discovered a clear correlation between genome size and self-sufficiency of viral DNA replication, the unanticipated dominance of replicative helicases and pervasive functional associations among certain groups of DNA replication proteins. Altogether, our results provide a comprehensive view on the diversity and evolution of replication systems in the DNA virome and uncover fundamental principles underlying the orchestration of viral DNA replication. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Intermediary metabolism in protists: a sequence-based view of facultative anaerobic metabolism in evolutionarily diverse eukaryotes.

PubMed

Ginger, Michael L; Fritz-Laylin, Lillian K; Fulton, Chandler; Cande, W Zacheus; Dawson, Scott C

2010-12-01

Protists account for the bulk of eukaryotic diversity. Through studies of gene and especially genome sequences the molecular basis for this diversity can be determined. Evident from genome sequencing are examples of versatile metabolism that go far beyond the canonical pathways described for eukaryotes in textbooks. In the last 2-3 years, genome sequencing and transcript profiling have unveiled several examples of heterotrophic and phototrophic protists that are unexpectedly well-equipped for ATP production using a facultative anaerobic metabolism, including some protists that can (Chlamydomonas reinhardtii) or are predicted (Naegleria gruberi, Acanthamoeba castellanii, Amoebidium parasiticum) to produce H2 in their metabolism. It is possible that some enzymes of anaerobic metabolism were acquired and distributed among eukaryotes by lateral transfer, but it is also likely that the common ancestor of eukaryotes already had far more metabolic versatility than was widely thought a few years ago. The discussion of core energy metabolism in unicellular eukaryotes is the subject of this review. Since genomic sequencing has so far only touched the surface of protist diversity, it is anticipated that sequences of additional protists may reveal an even wider range of metabolic capabilities, while simultaneously enriching our understanding of the early evolution of eukaryotes. Copyright © 2010 Elsevier GmbH. All rights reserved.

Intermediary Metabolism in Protists: a Sequence-based View of Facultative Anaerobic Metabolism in Evolutionarily Diverse Eukaryotes

PubMed Central

Ginger, Michael L.; Fritz-Laylin, Lillian K.; Fulton, Chandler; Cande, W. Zacheus; Dawson, Scott C.

2011-01-01

Protists account for the bulk of eukaryotic diversity. Through studies of gene and especially genome sequences the molecular basis for this diversity can be determined. Evident from genome sequencing are examples of versatile metabolism that go far beyond the canonical pathways described for eukaryotes in textbooks. In the last 2–3 years, genome sequencing and transcript profiling have unveiled several examples of heterotrophic and phototrophic protists that are unexpectedly well-equipped for ATP production using a facultative anaerobic metabolism, including some protists that can (Chlamydomonas reinhardtii) or are predicted (Naegleria gruberi, Acanthamoeba castellanii, Amoebidium parasiticum) to produce H2 in their metabolism. It is possible that some enzymes of anaerobic metabolism were acquired and distributed among eukaryotes by lateral transfer, but it is also likely that the common ancestor of eukaryotes already had far more metabolic versatility than was widely thought a few years ago. The discussion of core energy metabolism in unicellular eukaryotes is the subject of this review. Since genomic sequencing has so far only touched the surface of protist diversity, it is anticipated that sequences of additional protists may reveal an even wider range of metabolic capabilities, while simultaneously enriching our understanding of the early evolution of eukaryotes. PMID:21036663
Ancient diversification of eukaryotic MCM DNA replication proteins

PubMed Central

Liu, Yuan; Richards, Thomas A; Aves, Stephen J

2009-01-01

Background Yeast and animal cells require six mini-chromosome maintenance proteins (Mcm2-7) for pre-replication complex formation, DNA replication initiation and DNA synthesis. These six individual MCM proteins form distinct heterogeneous subunits within a hexamer which is believed to form the replicative helicase and which associates with the essential but non-homologous Mcm10 protein during DNA replication. In contrast Archaea generally only possess one MCM homologue which forms a homohexameric MCM helicase. In some eukaryotes Mcm8 and Mcm9 paralogues also appear to be involved in DNA replication although their exact roles are unclear. Results We used comparative genomics and phylogenetics to reconstruct the diversification of the eukaryotic Mcm2-9 gene family, demonstrating that Mcm2-9 were formed by seven gene duplication events before the last common ancestor of the eukaryotes. Mcm2-7 protein paralogues were present in all eukaryote genomes studied suggesting that no gene loss or functional replacements have been tolerated during the evolutionary diversification of eukaryotes. Mcm8 and 9 are widely distributed in eukaryotes and group together on the MCM phylogenetic tree to the exclusion of all other MCM paralogues suggesting co-ancestry. Mcm8 and Mcm9 are absent in some taxa, including Trichomonas and Giardia, and appear to have been secondarily lost in some fungi and some animals. The presence and absence of Mcm8 and 9 is concordant in all taxa sampled with the exception of Drosophila species. Mcm10 is present in most eukaryotes sampled but shows no concordant pattern of presence or absence with Mcm8 or 9. Conclusion A multifaceted and heterogeneous Mcm2-7 hexamer evolved during the early evolution of the eukaryote cell in parallel with numerous other acquisitions in cell complexity and prior to the diversification of extant eukaryotes. The conservation of all six paralogues throughout the eukaryotes suggests that each Mcm2-7 hexamer component has an
Analysis of simple sequence repeat (SSR) structure and sequence within Epichloë endophyte genomes reveals impacts on gene structure and insights into ancestral hybridization events.

PubMed

Clayton, William; Eaton, Carla Jane; Dupont, Pierre-Yves; Gillanders, Tim; Cameron, Nick; Saikia, Sanjay; Scott, Barry

2017-01-01

Epichloë grass endophytes comprise a group of filamentous fungi of both sexual and asexual species. Known for the beneficial characteristics they endow upon their grass hosts, the identification of these endophyte species has been of great interest agronomically and scientifically. The use of simple sequence repeat loci and the variation in repeat elements has been used to rapidly identify endophyte species and strains, however, little is known of how the structure of repeat elements changes between species and strains, and where these repeat elements are located in the fungal genome. We report on an in-depth analysis of the structure and genomic location of the simple sequence repeat locus B10, commonly used for Epichloë endophyte species identification. The B10 repeat was found to be located within an exon of a putative bZIP transcription factor, suggesting possible impacts on polypeptide sequence and thus protein function. Analysis of this repeat in the asexual endophyte hybrid Epichloë uncinata revealed that the structure of B10 alleles reflects the ancestral species that hybridized to give rise to this species. Understanding the structure and sequence of these simple sequence repeats provides a useful set of tools for readily distinguishing strains and for gaining insights into the ancestral species that have undergone hybridization events.
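Simple sequence repeats such as the B10 locus discussed above are tandem repetitions of a short unit. The sketch below shows one common way to locate perfect tandem repeats in a DNA string with a regular expression. It is an illustration only and does not reproduce the method used in the study; the function name, the default thresholds, and the toy sequence are hypothetical.

```python
import re


def find_ssrs(seq: str, min_unit: int = 1, max_unit: int = 6, min_repeats: int = 3):
    """Return (start, unit, repeat_count) for perfect tandem repeats.

    The repeat unit is matched lazily, so at each start position the
    shortest unit that satisfies min_repeats is reported.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits


# Toy sequence (hypothetical): four copies of ACG flanked by TT on each side.
print(find_ssrs("TTACGACGACGACGTT"))
```

Because the locus described above lies within an exon, a change in the number of copies of a three-base unit would add or remove whole codons, whereas units of other lengths could shift the reading frame; checking `len(unit) % 3` on each hit is one simple way to flag this.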
Eukaryotic acquisition of a bacterial operon

USDA-ARS's Scientific Manuscript database

The yeast Saccharomyces cerevisiae is one of the champions of basic biomedical research due to its compact eukaryotic genome and ease of experimental manipulation. Despite these immense strengths, its impact on understanding the genetic basis of natural phenotypic variation has been limited by strai...
Gene flow and biological conflict systems in the origin and evolution of eukaryotes

PubMed Central

Aravind, L.; Anantharaman, Vivek; Zhang, Dapeng; de Souza, Robson F.; Iyer, Lakshminarayan M.

2012-01-01

The endosymbiotic origin of eukaryotes brought together two disparate genomes in the cell. Additionally, eukaryotic natural history has included other endosymbiotic events, phagotrophic consumption of organisms, and intimate interactions with viruses and endoparasites. These phenomena facilitated large-scale lateral gene transfer and biological conflicts. We synthesize information from nearly two decades of genomics to illustrate how the interplay between lateral gene transfer and biological conflicts has impacted the emergence of new adaptations in eukaryotes. Using apicomplexans as example, we illustrate how lateral transfer from animals has contributed to unique parasite-host interfaces comprised of adhesion- and O-linked glycosylation-related domains. Adaptations, emerging due to intense selection for diversity in the molecular participants in organismal and genomic conflicts, being dispersed by lateral transfer, were subsequently exapted for eukaryote-specific innovations. We illustrate this using examples relating to eukaryotic chromatin, RNAi and RNA-processing systems, signaling pathways, apoptosis and immunity. We highlight the major contributions from catalytic domains of bacterial toxin systems to the origin of signaling enzymes (e.g., ADP-ribosylation and small molecule messenger synthesis), mutagenic enzymes for immune receptor diversification and RNA-processing. Similarly, we discuss contributions of bacterial antibiotic/siderophore synthesis systems and intra-genomic and intra-cellular selfish elements (e.g., restriction-modification, mobile elements and lysogenic phages) in the emergence of chromatin remodeling/modifying enzymes and RNA-based regulation. We develop the concept that biological conflict systems served as evolutionary “nurseries” for innovations in the protein world, which were delivered to eukaryotes via lateral gene flow to spur key evolutionary innovations all the way from nucleogenesis to lineage-specific adaptations. PMID
Protein phylogenies and signature sequences: A reappraisal of evolutionary relationships among archaebacteria, eubacteria, and eukaryotes.

PubMed

Gupta, R S

1998-12-01

The presence of shared conserved insertions or deletions (indels) in protein sequences is a special type of signature sequence that shows considerable promise for phylogenetic inference. An alternative model of microbial evolution based on the use of indels of conserved proteins and the morphological features of prokaryotic organisms is proposed. In this model, extant archaebacteria and gram-positive bacteria, which have a simple, single-layered cell wall structure, are termed monoderm prokaryotes. They are believed to be descended from the most primitive organisms. Evidence from indels supports the view that the archaebacteria probably evolved from gram-positive bacteria, and I suggest that this evolution occurred in response to antibiotic selection pressures. Evidence is presented that diderm prokaryotes (i.e., gram-negative bacteria), which have a bilayered cell wall, are derived from monoderm prokaryotes. Signature sequences in different proteins provide a means to define a number of different taxa within prokaryotes (namely, low G+C and high G+C gram-positive, Deinococcus-Thermus, cyanobacteria, chlamydia-cytophaga related, and two different groups of Proteobacteria) and to indicate how they evolved from a common ancestor. Based on phylogenetic information from indels in different protein sequences, it is hypothesized that all eukaryotes, including amitochondriate and aplastidic organisms, received major gene contributions from both an archaebacterium and a gram-negative eubacterium. In this model, the ancestral eukaryotic cell is a chimera that resulted from a unique fusion event between the two separate groups of prokaryotes followed by integration of their genomes.
Evolution and Diversity of the Ras Superfamily of Small GTPases in Prokaryotes

PubMed Central

Wuichet, Kristin; Søgaard-Andersen, Lotte

2015-01-01

The Ras superfamily of small GTPases are single domain nucleotide-dependent molecular switches that act as highly tuned regulators of complex signal transduction pathways. Originally identified in eukaryotes for their roles in fundamental cellular processes including proliferation, motility, polarity, nuclear transport, and vesicle transport, recent studies have revealed that single domain GTPases also control complex functions such as cell polarity, motility, predation, development and antibiotic resistance in bacteria. Here, we used a computational genomics approach to understand the abundance, diversity, and evolution of small GTPases in prokaryotes. We collected 520 small GTPase sequences present in 17% of 1,611 prokaryotic genomes analyzed that cover diverse lineages. We identified two discrete families of small GTPases in prokaryotes that show evidence of three distinct catalytic mechanisms. The MglA family includes MglA homologs, which are typically associated with the MglB GTPase activating protein, whereas members of the Rup (Ras superfamily GTPase of unknown function in prokaryotes) family are not predicted to interact with MglB homologs. System classification and genome context analyses support the involvement of small GTPases in diverse prokaryotic signal transduction pathways including two component systems, laying the foundation for future experimental characterization of these proteins. Phylogenetic analysis of prokaryotic and eukaryotic GTPases supports that the last universal common ancestor contained ancestral MglA and Rup family members. We propose that the MglA family was lost from the ancestral eukaryote and that the Ras superfamily members in extant eukaryotes are the result of vertical and horizontal gene transfer events of ancestral Rup GTPases. PMID:25480683
Anaerobic energy metabolism in unicellular photosynthetic eukaryotes.

PubMed

Atteia, Ariane; van Lis, Robert; Tielens, Aloysius G M; Martin, William F

2013-02-01

Anaerobic metabolic pathways allow unicellular organisms to tolerate or colonize anoxic environments. Over the past ten years, genome sequencing projects have brought a new light on the extent of anaerobic metabolism in eukaryotes. A surprising development has been that free-living unicellular algae capable of photoautotrophic lifestyle are, in terms of their enzymatic repertoire, among the best equipped eukaryotes known when it comes to anaerobic energy metabolism. Some of these algae are marine organisms, common in the oceans, others are more typically soil inhabitants. All these species are important from the ecological (O2/CO2 budget), biotechnological, and evolutionary perspectives. In the unicellular algae surveyed here, mixed-acid type fermentations are widespread while anaerobic respiration, which is more typical of eukaryotic heterotrophs, appears to be rare. The presence of a core anaerobic metabolism among the algae provides insights into its evolutionary origin, which traces to the eukaryote common ancestor. The predicted fermentative enzymes often exhibit an amino acid extension at the N-terminus, suggesting that these proteins might be compartmentalized in the cell, likely in the chloroplast or the mitochondrion. The green algae Chlamydomonas reinhardtii and Chlorella NC64 have the most extended set of fermentative enzymes reported so far. Among the eukaryotes with secondary plastids, the diatom Thalassiosira pseudonana has the most pronounced anaerobic capabilities as yet. From the standpoints of genomic, transcriptomic, and biochemical studies, anaerobic energy metabolism in C. reinhardtii remains the best characterized among photosynthetic protists. This article is part of a Special Issue entitled: The evolutionary aspects of bioenergetic systems. Copyright © 2012 Elsevier B.V. All rights reserved.
Genomicus update 2015: KaryoView and MatrixView provide a genome-wide perspective to multispecies comparative genomics.

PubMed

Louis, Alexandra; Nguyen, Nga Thi Thuy; Muffato, Matthieu; Roest Crollius, Hugues

2015-01-01

The Genomicus web server (http://www.genomicus.biologie.ens.fr/genomicus) is a visualization tool allowing comparative genomics in four different phyla (Vertebrate, Fungi, Metazoan and Plants). It provides access to genomic information from extant species, as well as ancestral gene content and gene order for vertebrates and flowering plants. Here we present the new features available for vertebrate genome with a focus on new graphical tools. The interface to enter the database has been improved, two pairwise genome comparison tools are now available (KaryoView and MatrixView) and the multiple genome comparison tools (PhyloView and AlignView) propose three new kinds of representation and a more intuitive menu. These new developments have been implemented for Genomicus portal dedicated to vertebrates. This allows the analysis of 68 extant animal genomes, as well as 58 ancestral reconstructed genomes. The Genomicus server also provides access to ancestral gene orders, to facilitate evolutionary and comparative genomics studies, as well as computationally predicted regulatory interactions, thanks to the representation of conserved non-coding elements with their putative gene targets. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
What was the ancestral sex-determining mechanism in amniote vertebrates?

PubMed

Johnson Pokorná, Martina; Kratochvíl, Lukáš

2016-02-01

Amniote vertebrates, the group consisting of mammals and reptiles including birds, possess various mechanisms of sex determination. Under environmental sex determination (ESD), the sex of individuals depends on the environmental conditions occurring during their development and therefore there are no sexual differences present in their genotypes. Alternatively, through the mode of genotypic sex determination (GSD), sex is determined by a sex-specific genotype, i.e. by the combination of sex chromosomes at various stages of differentiation at conception. As well as influencing sex determination, sex-specific parts of genomes may, and often do, develop specific reproductive or ecological roles in their bearers. Accordingly, an individual with a mismatch between phenotypic (gonadal) and genotypic sex, for example an individual sex-reversed by environmental effects, should have a lower fitness due to the lack of specialized, sex-specific parts of their genome. In this case, evolutionary transitions from GSD to ESD should be less likely than transitions in the opposite direction. This prediction contrasts with the view that GSD was the ancestral sex-determining mechanism for amniote vertebrates. Ancestral GSD would require several transitions from GSD to ESD associated with an independent dedifferentiation of sex chromosomes, at least in the ancestors of crocodiles, turtles, and lepidosaurs (tuataras and squamate reptiles). In this review, we argue that the alternative theory postulating ESD as ancestral in amniotes is more parsimonious and is largely concordant with the theoretical expectations and current knowledge of the phylogenetic distribution and homology of sex-determining mechanisms. © 2014 Cambridge Philosophical Society.
Eukaryotic systematics: a user's guide for cell biologists and parasitologists.

PubMed

Walker, Giselle; Dorrell, Richard G; Schlacht, Alexander; Dacks, Joel B

2011-11-01

Single-celled parasites like Entamoeba, Trypanosoma, Phytophthora and Plasmodium wreak untold havoc on human habitat and health. Understanding the position of the various protistan pathogens in the larger context of eukaryotic diversity informs our study of how these parasites operate on a cellular level, as well as how they have evolved. Here, we review the literature that has brought our understanding of eukaryotic relationships from an idea of parasites as primitive cells to a crystallized view of diversity that encompasses 6 major divisions, or supergroups, of eukaryotes. We provide an updated taxonomic scheme (for 2011), based on extensive genomic, ultrastructural and phylogenetic evidence, with three differing levels of taxonomic detail for ease of referencing and accessibility (see supplementary material at Cambridge Journals On-line). Two of the most pressing issues in cellular evolution, the root of the eukaryotic tree and the evolution of photosynthesis in complex algae, are also discussed along with ideas about what the new generation of genome sequencing technologies may contribute to the field of eukaryotic systematics. We hope that, armed with this user's guide, cell biologists and parasitologists will be encouraged about taking an increasingly evolutionary point of view in the battle against parasites representing real dangers to our livelihoods and lives.
Exogean: a framework for annotating protein-coding genes in eukaryotic genomic DNA

PubMed Central

Djebali, Sarah; Delaplace, Franck; Crollius, Hugues Roest

2006-01-01

Background: Accurate and automatic gene identification in eukaryotic genomic DNA is more than ever of crucial importance to efficiently exploit the large volume of assembled genome sequences available to the community. Automatic methods have always been considered less reliable than human expertise. This is illustrated in the EGASP project, where reference annotations against which all automatic methods are measured are generated by human annotators and experimentally verified. We hypothesized that replicating the accuracy of human annotators in an automatic method could be achieved by formalizing the rules and decisions that they use, in a mathematical formalism. Results: We have developed Exogean, a flexible framework based on directed acyclic colored multigraphs (DACMs) that can represent biological objects (for example, mRNA, ESTs, protein alignments, exons) and relationships between them. Graphs are analyzed to process the information according to rules that replicate those used by human annotators. Simple individual starting objects given as input to Exogean are thus combined and synthesized into complex objects such as protein coding transcripts. Conclusion: We show here, in the context of the EGASP project, that Exogean is currently the method that best reproduces protein coding gene annotations from human experts, in terms of identifying at least one exact coding sequence per gene. We discuss current limitations of the method and several avenues for improvement. PMID:16925841
Comprehensive comparative analysis of kinesins in photosynthetic eukaryotes

PubMed Central

Richardson, Dale N; Simmons, Mark P; Reddy, Anireddy SN

2006-01-01

Background: Kinesins, a superfamily of molecular motors, use microtubules as tracks and transport diverse cellular cargoes. All kinesins contain a highly conserved ~350 amino acid motor domain. Previous analysis of the completed genome sequence of one flowering plant (Arabidopsis) has resulted in identification of 61 kinesins. The recent completion of genome sequencing of several photosynthetic and non-photosynthetic eukaryotes that belong to divergent lineages offers a unique opportunity to conduct a comprehensive comparative analysis of kinesins in plant and non-plant systems and infer their evolutionary relationships. Results: We used the kinesin motor domain to identify kinesins in the completed genome sequences of 19 species, including 13 newly sequenced genomes. Among the newly analyzed genomes, six represent photosynthetic eukaryotes. A total of 529 kinesins was used to perform comprehensive analysis of kinesins and to construct gene trees using the Bayesian and parsimony approaches. The previously recognized 14 families of kinesins are resolved as distinct lineages in our inferred gene tree. At least three of the 14 kinesin families are not represented in flowering plants. Chlamydomonas, a green alga that is part of the lineage that includes land plants, has at least nine of the 14 known kinesin families. Seven of ten families present in flowering plants are represented in Chlamydomonas, indicating that these families were retained in both the flowering-plant and green algae lineages. Conclusion: The increase in the number of kinesins in flowering plants is due to vast expansion of the Kinesin-14 and Kinesin-7 families. The Kinesin-14 family, which typically contains a C-terminal motor, has many plant kinesins that have the motor domain at the N terminus, in the middle, or the C terminus. Several domains in kinesins are present exclusively either in plant or animal lineages. Addition of novel domains to kinesins in lineage-specific groups contributed to the
Origin of amphibian and avian chromosomes by fission, fusion, and retention of ancestral chromosomes

PubMed Central

Voss, Stephen R.; Kump, D. Kevin; Putta, Srikrishna; Pauly, Nathan; Reynolds, Anna; Henry, Rema J.; Basa, Saritha; Walker, John A.; Smith, Jeramiah J.

2011-01-01

Amphibian genomes differ greatly in DNA content and chromosome size, morphology, and number. Investigations of this diversity are needed to identify mechanisms that have shaped the evolution of vertebrate genomes. We used comparative mapping to investigate the organization of genes in the Mexican axolotl (Ambystoma mexicanum), a species that presents relatively few chromosomes (n = 14) and a gigantic genome (>20 pg/N). We show extensive conservation of synteny between Ambystoma, chicken, and human, and a positive correlation between the length of conserved segments and genome size. Ambystoma segments are estimated to be four to 51 times longer than homologous human and chicken segments. Strikingly, genes demarking the structures of 28 chicken chromosomes are ordered among linkage groups defining the Ambystoma genome, and we show that these same chromosomal segments are also conserved in a distantly related anuran amphibian (Xenopus tropicalis). Using linkage relationships from the amphibian maps, we predict that three chicken chromosomes originated by fusion, nine to 14 originated by fission, and 12–17 evolved directly from ancestral tetrapod chromosomes. We further show that some ancestral segments were fused prior to the divergence of salamanders and anurans, while others fused independently and randomly as chromosome numbers were reduced in lineages leading to Ambystoma and Xenopus. The maintenance of gene order relationships between chromosomal segments that have greatly expanded and contracted in salamander and chicken genomes, respectively, suggests selection to maintain synteny relationships and/or extremely low rates of chromosomal rearrangement. Overall, the results demonstrate the value of data from diverse, amphibian genomes in studies of vertebrate genome evolution. PMID:21482624
Polintons: a hotbed of eukaryotic virus, transposon and plasmid evolution

PubMed Central

Krupovic, Mart; Koonin, Eugene V.

2018-01-01

Polintons (also known as Mavericks) are large DNA transposons that are widespread in the genomes of eukaryotes. We have recently shown that Polintons encode virus capsid proteins, which suggests that these transposons might form virions, at least under some conditions. In this Opinion article, we delineate the evolutionary relationships among bacterial tectiviruses, Polintons, adenoviruses, virophages, large and giant DNA viruses of eukaryotes of the proposed order ‘Megavirales’, and linear mitochondrial and cytoplasmic plasmids. We hypothesize that Polintons were the first group of eukaryotic double-stranded DNA viruses to evolve from bacteriophages and that they gave rise to most large DNA viruses of eukaryotes and various other selfish genetic elements. PMID:25534808
INVESTIGATIONS INTO MOLECULAR PATHWAYS IN THE POST GENOME ERA: CROSS SPECIES COMPARATIVE GENOMICS APPROACH

EPA Science Inventory

Genome sequencing efforts in the past decade were aimed at generating draft sequences of many prokaryotic and eukaryotic model organisms. Successful completion of unicellular eukaryotes, worm, fly and human genome have opened up the new field of molecular biology and function...
Emerging players in the initiation of eukaryotic DNA replication

PubMed Central

2012-01-01

Faithful duplication of the genome in eukaryotes requires ordered assembly of a multi-protein complex called the pre-replicative complex (pre-RC) prior to S phase; transition to the pre-initiation complex (pre-IC) at the beginning of DNA replication; coordinated progression of the replisome during S phase; and well-controlled regulation of replication licensing to prevent re-replication. These events are achieved by the formation of distinct protein complexes that form in a cell cycle-dependent manner. Several components of the pre-RC and pre-IC are highly conserved across all examined eukaryotic species. Many of these proteins, in addition to their bona fide roles in DNA replication are also required for other cell cycle events including heterochromatin organization, chromosome segregation and centrosome biology. As the complexity of the genome increases dramatically from yeast to human, additional proteins have been identified in higher eukaryotes that dictate replication initiation, progression and licensing. In this review, we discuss the newly discovered components and their roles in cell cycle progression. PMID:23075259
The Korarchaeota: Archaeal orphans representing an ancestral lineage of life

DOE Office of Scientific and Technical Information (OSTI.GOV)

Elkins, James G.; Kunin, Victor; Anderson, Iain

Based on conserved cellular properties, all life on Earth can be grouped into different phyla which belong to the primary domains Bacteria, Archaea, and Eukarya. However, tracing back their evolutionary relationships has been impeded by horizontal gene transfer and gene loss. Within the Archaea, the kingdoms Crenarchaeota and Euryarchaeota exhibit a profound divergence. In order to elucidate the evolution of these two major kingdoms, representatives of more deeply diverged lineages would be required. Based on their environmental small subunit ribosomal (ss RNA) sequences, the Korarchaeota had been originally suggested to have an ancestral relationship to all known Archaea although this assessment has been refuted. Here we describe the cultivation and initial characterization of the first member of the Korarchaeota, highly unusual, ultrathin filamentous cells about 0.16 µm in diameter. A complete genome sequence obtained from enrichment cultures revealed an unprecedented combination of signature genes which were thought to be characteristic of either the Crenarchaeota, Euryarchaeota, or Eukarya. Cell division appears to be mediated through a FtsZ-dependent mechanism which is highly conserved throughout the Bacteria and Euryarchaeota. An rpb8 subunit of the DNA-dependent RNA polymerase was identified which is absent from other Archaea and has been described as a eukaryotic signature gene. In addition, the representative organism possesses a ribosome structure typical for members of the Crenarchaeota. Based on its gene complement, this lineage likely diverged near the separation of the two major kingdoms of Archaea. Further investigations of these unique organisms may shed additional light onto the evolution of extant life.
The Eukaryotic Pathogen Databases: a functional genomic resource integrating data from human and veterinary parasites.

PubMed

Harb, Omar S; Roos, David S

2015-01-01

Over the past 20 years, advances in high-throughput biological techniques and the availability of computational resources including fast Internet access have resulted in an explosion of large genome-scale data sets ("big data"). While such data are readily available for download and personal use and analysis from a variety of repositories, often such analysis requires access to seldom-available computational skills. As a result, a number of databases have emerged to provide scientists with online tools enabling the interrogation of data without the need for sophisticated computational skills beyond basic knowledge of Internet browser utility. This chapter focuses on the Eukaryotic Pathogen Databases (EuPathDB: http://eupathdb.org) Bioinformatic Resource Center (BRC) and illustrates some of the available tools and methods.
Protein Phylogenies and Signature Sequences: A Reappraisal of Evolutionary Relationships among Archaebacteria, Eubacteria, and Eukaryotes

PubMed Central

Gupta, Radhey S.

1998-01-01

The presence of shared conserved insertions or deletions (indels) in protein sequences is a special type of signature sequence that shows considerable promise for phylogenetic inference. An alternative model of microbial evolution based on the use of indels of conserved proteins and the morphological features of prokaryotic organisms is proposed. In this model, extant archaebacteria and gram-positive bacteria, which have a simple, single-layered cell wall structure, are termed monoderm prokaryotes. They are believed to be descended from the most primitive organisms. Evidence from indels supports the view that the archaebacteria probably evolved from gram-positive bacteria, and I suggest that this evolution occurred in response to antibiotic selection pressures. Evidence is presented that diderm prokaryotes (i.e., gram-negative bacteria), which have a bilayered cell wall, are derived from monoderm prokaryotes. Signature sequences in different proteins provide a means to define a number of different taxa within prokaryotes (namely, low G+C and high G+C gram-positive, Deinococcus-Thermus, cyanobacteria, chlamydia-cytophaga related, and two different groups of Proteobacteria) and to indicate how they evolved from a common ancestor. Based on phylogenetic information from indels in different protein sequences, it is hypothesized that all eukaryotes, including amitochondriate and aplastidic organisms, received major gene contributions from both an archaebacterium and a gram-negative eubacterium. In this model, the ancestral eukaryotic cell is a chimera that resulted from a unique fusion event between the two separate groups of prokaryotes followed by integration of their genomes. PMID:9841678

Subcomplexes of Ancestral Respiratory Complex I Subunits Rapidly Turn Over in Vivo as Productive Assembly Intermediates in Arabidopsis

PubMed Central

Li, Lei; Nelson, Clark J.; Carrie, Chris; Gawryluk, Ryan M. R.; Solheim, Cory; Gray, Michael W.; Whelan, James; Millar, A. Harvey

2013-01-01

Subcomplexes of mitochondrial respiratory complex I (CI; EC 1.6.5.3) are shown to turn over in vivo, and we propose a role in an ancestral assembly pathway. By progressively labeling Arabidopsis cell cultures with 15N and isolating mitochondria, we have identified CI subcomplexes through differences in 15N incorporation into their protein subunits. The 200-kDa subcomplex, containing the ancestral γ-carbonic anhydrase (γ-CA), γ-carbonic anhydrase-like, and 20.9-kDa subunits, had a significantly higher turnover rate than intact CI or CI+CIII2. In vitro import of precursors for these CI subunits demonstrated rapid generation of subcomplexes and revealed that their specific abundance varied when different ancestral subunits were imported. Time course studies of precursor import showed the further assembly of these subcomplexes into CI and CI+CIII2, indicating that the subcomplexes are productive intermediates of assembly. The strong transient incorporation of new subunits into the 200-kDa subcomplex in a γ-CA mutant is consistent with this subcomplex being a key initiator of CI assembly in plants. This evidence alongside the pattern of coincident occurrence of genes encoding these particular proteins broadly in eukaryotes, except for opisthokonts, provides a framework for the evolutionary conservation of these accessory subunits and evidence of their function in ancestral CI assembly. PMID:23271729
Post-genomics of microsporidia, with emphasis on a model of minimal eukaryotic proteome: a review.

PubMed

Texier, Catherine; Brosson, Damien; El Alaoui, Hicham; Méténier, Guy; Vivarès, Christian P

2005-05-01

The genome sequence of the microsporidian parasite Encephalitozoon cuniculi Levaditi, Nicolau et Schoen, 1923 contains about 2,000 genes that are representative of a non-redundant potential proteome composed of 1,909 protein chains. The purpose of this review is to relate some advances in the characterisation of this proteome through bioinformatics and experimental approaches. The reduced diversity of the set of E. cuniculi proteins is perceptible in all the compilations of predicted domains, orthologs, families and superfamilies, available in several public databases. The phyletic patterns of orthologs for seven eukaryotic organisms support an extensive gene loss in the fungal clade, with additional deletions in E. cuniculi. Most microsporidial orthologs are the smallest ones among eukaryotes, justifying an interest in the use of these compacted proteins to better discriminate between essential and non-essential regions. The three components of the E. cuniculi mRNA capping apparatus have been especially well characterized and the three-dimensional structure of the cap methyltransferase has been elucidated following the crystallisation of the microsporidial enzyme Ecm1. So far, our mass spectrometry-based analyses of the E. cuniculi spore proteome have led to the identification of about 170 proteins, one-quarter of these having no clearly predicted function. Immunocytochemical studies are in progress to determine the subcellular localisation of microsporidia-specific proteins. Post-translational modifications such as phosphorylation and glycosylation are expected to be explored soon.
The Mitochondrial Genome of the Guanaco Louse, Microthoracius praelongiceps: Insights into the Ancestral Mitochondrial Karyotype of Sucking Lice (Anoplura, Insecta)

PubMed Central

Li, Hu; Barker, Stephen C.

2017-01-01

Fragmented mitochondrial (mt) genomes have been reported in 11 species of sucking lice (suborder Anoplura) that infest humans, chimpanzees, pigs, horses, and rodents. There is substantial variation among these lice in mt karyotype: the number of minichromosomes of a species ranges from 9 to 20; the number of genes in a minichromosome ranges from 1 to 8; gene arrangement in a minichromosome differs between species, even in the same genus. We sequenced the mt genome of the guanaco louse, Microthoracius praelongiceps, to help establish the ancestral mt karyotype for sucking lice and understand how fragmented mt genomes evolved. The guanaco louse has 12 mt minichromosomes; each minichromosome has 2–5 genes and a non-coding region. The guanaco louse shares many features with rodent lice in mt karyotype, more than with other sucking lice. The guanaco louse, however, is more closely related phylogenetically to human lice, chimpanzee lice, pig lice, and horse lice than to rodent lice. By parsimony analysis of shared features in mt karyotype, we infer that the most recent common ancestor of sucking lice, which lived ∼75 Ma, had 11 minichromosomes; each minichromosome had 1–6 genes and a non-coding region. As sucking lice diverged, split of mt minichromosomes occurred many times in the lineages leading to the lice of humans, chimpanzees, and rodents whereas merger of minichromosomes occurred in the lineage leading to the lice of pigs and horses. Together, splits and mergers of minichromosomes created a very complex and dynamic mt genome organization in the sucking lice. PMID:28164215
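As a rough illustration of how parsimony can assign a state to a common ancestor, the sketch below applies Fitch's small-parsimony algorithm to a single two-state character on a toy tree. It is not the authors' analysis: the tree topology, the taxon labels and the character states "A" and "B" are all invented for the example, and Fitch's algorithm is used here only as a standard stand-in for parsimony inference.

```python
def fitch(node, leaf_states):
    """Bottom-up pass of Fitch's small-parsimony algorithm.

    node is either a leaf name (str) or a pair (left, right) of subtrees.
    Returns (candidate_states, minimum_changes) for the subtree at node.
    """
    if isinstance(node, str):
        return {leaf_states[node]}, 0
    left, right = node
    left_set, left_cost = fitch(left, leaf_states)
    right_set, right_cost = fitch(right, leaf_states)
    shared = left_set & right_set
    if shared:
        # Children agree on at least one state: no extra change needed.
        return shared, left_cost + right_cost
    # Children disagree: keep both options and count one change.
    return left_set | right_set, left_cost + right_cost + 1


# Hypothetical topology and hypothetical states, for illustration only.
toy_tree = (((("human", "chimp"), ("pig", "horse")), "guanaco"), "rodent")
toy_states = {
    "human": "B", "chimp": "B", "pig": "B", "horse": "B",
    "guanaco": "A", "rodent": "A",
}

root_states, changes = fitch(toy_tree, toy_states)
print(root_states, changes)
```

With these made-up inputs the root is assigned state "A" at a cost of one change, because the two earliest-branching taxa share that state. A real analysis would use many characters and a tree supported by sequence data.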
Atypical mitochondrial inheritance patterns in eukaryotes.

PubMed

Breton, Sophie; Stewart, Donald T

2015-10-01

Mitochondrial DNA (mtDNA) is predominantly maternally inherited in eukaryotes. Diverse molecular mechanisms underlying the phenomenon of strict maternal inheritance (SMI) of mtDNA have been described, but the evolutionary forces responsible for its predominance in eukaryotes remain to be elucidated. Exceptions to SMI have been reported in diverse eukaryotic taxa, leading to the prediction that several distinct molecular mechanisms controlling mtDNA transmission are present among the eukaryotes. We propose that these mechanisms will be better understood by studying the deviations from the predominating pattern of SMI. This minireview summarizes studies on eukaryote species with unusual or rare mitochondrial inheritance patterns, i.e., other than the predominant SMI pattern, such as maternal inheritance of stable heteroplasmy, paternal leakage of mtDNA, biparental and strictly paternal inheritance, and doubly uniparental inheritance of mtDNA. The potential genes and mechanisms involved in controlling mitochondrial inheritance in these organisms are discussed. The linkage between mitochondrial inheritance and sex determination is also discussed, given that the atypical systems of mtDNA inheritance examined in this minireview are frequently found in organisms with uncommon sexual systems such as gynodioecy, monoecy, or andromonoecy. The potential of deviations from SMI for facilitating a better understanding of a number of fundamental questions in biology, such as the evolution of mtDNA inheritance, the coevolution of nuclear and mitochondrial genomes, and, perhaps, the role of mitochondria in sex determination, is considerable.
Inference of Ancestral Recombination Graphs through Topological Data Analysis

PubMed Central

Cámara, Pablo G.; Levine, Arnold J.; Rabadán, Raúl

2016-01-01

The recent explosion of genomic data has underscored the need for interpretable and comprehensive analyses that can capture complex phylogenetic relationships within and across species. Recombination, reassortment and horizontal gene transfer constitute examples of pervasive biological phenomena that cannot be captured by tree-like representations. Starting from hundreds of genomes, we are interested in the reconstruction of potential evolutionary histories leading to the observed data. Ancestral recombination graphs represent potential histories that explicitly accommodate recombination and mutation events across orthologous genomes. However, they are computationally costly to reconstruct, usually being infeasible for more than a few tens of genomes. Recently, Topological Data Analysis (TDA) methods have been proposed as robust and scalable methods that can capture the genetic scale and frequency of recombination. We build upon previous TDA developments for detecting and quantifying recombination, and present a novel framework that can be applied to hundreds of genomes and can be interpreted in terms of minimal histories of mutation and recombination events, quantifying the scales and identifying the genomic locations of recombinations. We implement this framework in a software package, called TARGet, and apply it to several examples, including small migration between different populations, human recombination, and horizontal evolution in finches inhabiting the Galápagos Islands. PMID:27532298
Evolutionary genomics: is Buchnera a bacterium or an organelle?

PubMed

Andersson, J O

2000-11-30

The first genome sequence of an intracellular bacterial symbiont of a eukaryotic cell has been determined. The Buchnera genome shares features with the genomes of both intracellular pathogenic bacteria and eukaryotic organelles, and it may represent an intermediate between the two.
Genomic evolution in domestic cattle: ancestral haplotypes and healthy beef.

PubMed

Williamson, Joseph F; Steele, Edward J; Lester, Susan; Kalai, Oscar; Millman, John A; Wolrige, Lindsay; Bayard, Dominic; McLure, Craig; Dawkins, Roger L

2011-05-01

We have identified numerous Ancestral Haplotypes encoding a 14-Mb region of Bota C19. Three are frequent in Simmental, Angus and Wagyu and have been conserved since common progenitor populations. Others are more relevant to the differences between these 3 breeds including fat content and distribution in muscle. SREBF1 and Growth Hormone, which have been implicated in the production of healthy beef, are included within these haplotypes. However, we conclude that alleles at these 2 loci are less important than other sequences within the haplotypes. Identification of breeds and hybrids is improved by using haplotypes rather than individual alleles. Copyright © 2010 Elsevier Inc. All rights reserved.
The major architects of chromatin: architectural proteins in bacteria, archaea and eukaryotes.

PubMed

Luijsterburg, Martijn S; White, Malcolm F; van Driel, Roel; Dame, Remus Th

2008-01-01

The genomic DNA of all organisms across the three kingdoms of life needs to be compacted and functionally organized. Key players in these processes are DNA supercoiling, macromolecular crowding and architectural proteins that shape DNA by binding to it. The architectural proteins in bacteria, archaea and eukaryotes generally do not exhibit sequence or structural conservation especially across kingdoms. Instead, we propose that they are functionally conserved. Most of these proteins can be classified according to their architectural mode of action: bending, wrapping or bridging DNA. In order for DNA transactions to occur within a compact chromatin context, genome organization cannot be static. Indeed chromosomes are subject to a whole range of remodeling mechanisms. In this review, we discuss the role of (i) DNA supercoiling, (ii) macromolecular crowding and (iii) architectural proteins in genome organization, as well as (iv) mechanisms used to remodel chromosome structure and to modulate genomic activity. We conclude that the underlying mechanisms that shape and remodel genomes are remarkably similar among bacteria, archaea and eukaryotes.
Evidence of pervasive biologically functional secondary structures within the genomes of eukaryotic single-stranded DNA viruses.

PubMed

Muhire, Brejnev Muhizi; Golden, Michael; Murrell, Ben; Lefeuvre, Pierre; Lett, Jean-Michel; Gray, Alistair; Poon, Art Y F; Ngandu, Nobubelo Kwanele; Semegni, Yves; Tanov, Emil Pavlov; Monjane, Adérito Luis; Harkins, Gordon William; Varsani, Arvind; Shepherd, Dionne Natalie; Martin, Darren Patrick

2014-02-01

Single-stranded DNA (ssDNA) viruses have genomes that are potentially capable of forming complex secondary structures through Watson-Crick base pairing between their constituent nucleotides. A few of the structural elements formed by such base pairings are, in fact, known to have important functions during the replication of many ssDNA viruses. Unknown, however, are (i) whether numerous additional ssDNA virus genomic structural elements predicted to exist by computational DNA folding methods actually exist and (ii) whether those structures that do exist have any biological relevance. We therefore computationally inferred lists of the most evolutionarily conserved structures within a diverse selection of animal- and plant-infecting ssDNA viruses drawn from the families Circoviridae, Anelloviridae, Parvoviridae, Nanoviridae, and Geminiviridae and analyzed these for evidence of natural selection favoring the maintenance of these structures. While we find evidence that is consistent with purifying selection being stronger at nucleotide sites that are predicted to be base paired than at sites predicted to be unpaired, we also find strong associations between sites that are predicted to pair with one another and site pairs that are apparently coevolving in a complementary fashion. Collectively, these results indicate that natural selection actively preserves much of the pervasive secondary structure that is evident within eukaryote-infecting ssDNA virus genomes and, therefore, that much of this structure is biologically functional. Lastly, we provide examples of various highly conserved but completely uncharacterized structural elements that likely have important functions within some of the ssDNA virus genomes analyzed here.
Evidence of Pervasive Biologically Functional Secondary Structures within the Genomes of Eukaryotic Single-Stranded DNA Viruses

PubMed Central

Muhire, Brejnev Muhizi; Golden, Michael; Murrell, Ben; Lefeuvre, Pierre; Lett, Jean-Michel; Gray, Alistair; Poon, Art Y. F.; Ngandu, Nobubelo Kwanele; Semegni, Yves; Tanov, Emil Pavlov; Monjane, Adérito Luis; Harkins, Gordon William; Varsani, Arvind; Shepherd, Dionne Natalie

2014-01-01

Single-stranded DNA (ssDNA) viruses have genomes that are potentially capable of forming complex secondary structures through Watson-Crick base pairing between their constituent nucleotides. A few of the structural elements formed by such base pairings are, in fact, known to have important functions during the replication of many ssDNA viruses. Unknown, however, are (i) whether numerous additional ssDNA virus genomic structural elements predicted to exist by computational DNA folding methods actually exist and (ii) whether those structures that do exist have any biological relevance. We therefore computationally inferred lists of the most evolutionarily conserved structures within a diverse selection of animal- and plant-infecting ssDNA viruses drawn from the families Circoviridae, Anelloviridae, Parvoviridae, Nanoviridae, and Geminiviridae and analyzed these for evidence of natural selection favoring the maintenance of these structures. While we find evidence that is consistent with purifying selection being stronger at nucleotide sites that are predicted to be base paired than at sites predicted to be unpaired, we also find strong associations between sites that are predicted to pair with one another and site pairs that are apparently coevolving in a complementary fashion. Collectively, these results indicate that natural selection actively preserves much of the pervasive secondary structure that is evident within eukaryote-infecting ssDNA virus genomes and, therefore, that much of this structure is biologically functional. Lastly, we provide examples of various highly conserved but completely uncharacterized structural elements that likely have important functions within some of the ssDNA virus genomes analyzed here. PMID:24284329
Approaches to Fungal Genome Annotation

PubMed Central

Haas, Brian J.; Zeng, Qiandong; Pearson, Matthew D.; Cuomo, Christina A.; Wortman, Jennifer R.

2011-01-01

Fungal genome annotation is the starting point for analysis of genome content. This generally involves the application of diverse methods to identify features on a genome assembly such as protein-coding and non-coding genes, repeats and transposable elements, and pseudogenes. Here we describe tools and methods leveraged for eukaryotic genome annotation with a focus on the annotation of fungal nuclear and mitochondrial genomes. We highlight the application of the latest technologies and tools to improve the quality of predicted gene sets. The Broad Institute eukaryotic genome annotation pipeline is described as one example of how such methods and tools are integrated into a sequencing center’s production genome annotation environment. PMID:22059117
Biochemistry and Evolution of Anaerobic Energy Metabolism in Eukaryotes

PubMed Central

Müller, Miklós; Mentel, Marek; van Hellemond, Jaap J.; Henze, Katrin; Woehle, Christian; Gould, Sven B.; Yu, Re-Young; van der Giezen, Mark

2012-01-01

Summary: Major insights into the phylogenetic distribution, biochemistry, and evolutionary significance of organelles involved in ATP synthesis (energy metabolism) in eukaryotes that thrive in anaerobic environments for all or part of their life cycles have accrued in recent years. All known eukaryotic groups possess an organelle of mitochondrial origin, mapping the origin of mitochondria to the eukaryotic common ancestor, and genome sequence data are rapidly accumulating for eukaryotes that possess anaerobic mitochondria, hydrogenosomes, or mitosomes. Here we review the available biochemical data on the enzymes and pathways that eukaryotes use in anaerobic energy metabolism and summarize the metabolic end products that they generate in their anaerobic habitats, focusing on the biochemical roles that their mitochondria play in anaerobic ATP synthesis. We present metabolic maps of compartmentalized energy metabolism for 16 well-studied species. There are currently no enzymes of core anaerobic energy metabolism that are specific to any of the six eukaryotic supergroup lineages; genes present in one supergroup are also found in at least one other supergroup. The gene distribution across lineages thus reflects the presence of anaerobic energy metabolism in the eukaryote common ancestor and differential loss during the specialization of some lineages to oxic niches, just as oxphos capabilities have been differentially lost in specialization to anoxic niches and the parasitic life-style. Some facultative anaerobes have retained both aerobic and anaerobic pathways. Diversified eukaryotic lineages have retained the same enzymes of anaerobic ATP synthesis, in line with geochemical data indicating low environmental oxygen levels while eukaryotes arose and diversified. PMID:22688819
Uniting sex and eukaryote origins in an emerging oxygenic world.

PubMed

Gross, Jeferson; Bhattacharya, Debashish

2010-08-23

Theories about eukaryote origins (eukaryogenesis) need to provide unified explanations for the emergence of diverse complex features that define this lineage. Models that propose a prokaryote-to-eukaryote transition are gridlocked between the opposing "phagocytosis first" and "mitochondria as seed" paradigms, neither of which fully explains the origins of eukaryote cell complexity. Sex (outcrossing with meiosis) is an example of an elaborate trait not yet satisfactorily addressed in theories about eukaryogenesis. The ancestral nature of meiosis and its dependence on eukaryote cell biology suggest that the emergence of sex and eukaryogenesis were simultaneous and synergic and may be explained by a common selective pressure. We propose that a local rise in oxygen levels, due to cyanobacterial photosynthesis in ancient Archean microenvironments, was highly toxic to the surrounding biota. This selective pressure drove the transformation of an archaeal (archaebacterial) lineage into the first eukaryotes. Key is that oxygen might have acted in synergy with environmental stresses such as ultraviolet (UV) radiation and/or desiccation that resulted in the accumulation of reactive oxygen species (ROS). The emergence of eukaryote features such as the endomembrane system and acquisition of the mitochondrion are posited as strategies to cope with a metabolic crisis in the cell plasma membrane and the accumulation of ROS, respectively. Selective pressure for efficient repair of ROS/UV-damaged DNA drove the evolution of sex, which required cell-cell fusions, cytoskeleton-mediated chromosome movement, and emergence of the nuclear envelope. Our model implies that evolution of sex and eukaryogenesis were inseparable processes. Several types of data can be used to test our hypothesis. These include paleontological predictions, simulation of ancient oxygenic microenvironments, and cell biological experiments with Archaea exposed to ROS and UV stresses. Studies of archaeal conjugation
Diversity of Eukaryotic Translational Initiation Factor eIF4E in Protists.

PubMed

Jagus, Rosemary; Bachvaroff, Tsvetan R; Joshi, Bhavesh; Place, Allen R

2012-01-01

The greatest diversity of eukaryotic species is within the microbial eukaryotes, the protists, with plants and fungi/metazoa representing just two of the estimated seventy five lineages of eukaryotes. Protists are a diverse group characterized by unusual genome features and a wide range of genome sizes from 8.2 Mb in the apicomplexan parasite Babesia bovis to 112,000-220,050 Mb in the dinoflagellate Prorocentrum micans. Protists possess numerous cellular, molecular and biochemical traits not observed in "text-book" model organisms. These features challenge some of the concepts and assumptions about the regulation of gene expression in eukaryotes. Like multicellular eukaryotes, many protists encode multiple eIF4Es, but few functional studies have been undertaken except in parasitic species. An earlier phylogenetic analysis of protist eIF4Es indicated that they cannot be grouped within the three classes that describe eIF4E family members from multicellular organisms. Many more protist sequences are now available from which three clades can be recognized that are distinct from the plant/fungi/metazoan classes. Understanding of the protist eIF4Es will be facilitated as more sequences become available particularly for the under-represented opisthokonts and amoebozoa. Similarly, a better understanding of eIF4Es within each clade will develop as more functional studies of protist eIF4Es are completed.
Diversity of Eukaryotic Translational Initiation Factor eIF4E in Protists

PubMed Central

Jagus, Rosemary; Bachvaroff, Tsvetan R.; Joshi, Bhavesh; Place, Allen R.

2012-01-01

The greatest diversity of eukaryotic species is within the microbial eukaryotes, the protists, with plants and fungi/metazoa representing just two of the estimated seventy five lineages of eukaryotes. Protists are a diverse group characterized by unusual genome features and a wide range of genome sizes from 8.2 Mb in the apicomplexan parasite Babesia bovis to 112,000-220,050 Mb in the dinoflagellate Prorocentrum micans. Protists possess numerous cellular, molecular and biochemical traits not observed in “text-book” model organisms. These features challenge some of the concepts and assumptions about the regulation of gene expression in eukaryotes. Like multicellular eukaryotes, many protists encode multiple eIF4Es, but few functional studies have been undertaken except in parasitic species. An earlier phylogenetic analysis of protist eIF4Es indicated that they cannot be grouped within the three classes that describe eIF4E family members from multicellular organisms. Many more protist sequences are now available from which three clades can be recognized that are distinct from the plant/fungi/metazoan classes. Understanding of the protist eIF4Es will be facilitated as more sequences become available particularly for the under-represented opisthokonts and amoebozoa. Similarly, a better understanding of eIF4Es within each clade will develop as more functional studies of protist eIF4Es are completed. PMID:22778692
Comparative Genomics of Facultative Bacterial Symbionts Isolated from European Orius Species Reveals an Ancestral Symbiotic Association

PubMed Central

Chen, Xiaorui; Hitchings, Matthew D.; Mendoza, José E.; Balanza, Virginia; Facey, Paul D.; Dyson, Paul J.; Bielza, Pablo; Del Sol, Ricardo

2017-01-01

Pest control in agriculture employs diverse strategies, among which the use of predatory insects has steadily increased. The use of several species within the genus Orius in pest control is widely spread, particularly in Mediterranean Europe. Commercial mass rearing of predatory insects is costly, and research efforts have concentrated on diet manipulation and selective breeding to reduce costs and improve efficacy. The characterisation and contribution of microbial symbionts to Orius sp. fitness, behaviour, and potential impact on human health has been neglected. This paper provides the first genome sequence level description of the predominant culturable facultative bacterial symbionts associated with five Orius species (O. laevigatus, O. niger, O. pallidicornis, O. majusculus, and O. albidipennis) from several geographical locations. Two types of symbionts were broadly classified as members of the genera Serratia and Leucobacter, while a third constitutes a new genus within the Erwiniaceae. These symbionts were found to colonise all the insect specimens tested, which evidenced an ancestral symbiotic association between these bacteria and the genus Orius. Pangenome analyses of the Serratia sp. isolates offered clues linking Type VI secretion system effector–immunity proteins from the Tai4 sub-family to the symbiotic lifestyle. PMID:29067021
Origin and Evolution of the Self-Organizing Cytoskeleton in the Network of Eukaryotic Organelles

PubMed Central

Jékely, Gáspár

2014-01-01

The eukaryotic cytoskeleton evolved from prokaryotic cytomotive filaments. Prokaryotic filament systems show bewildering structural and dynamic complexity and, in many aspects, prefigure the self-organizing properties of the eukaryotic cytoskeleton. Here, the dynamic properties of the prokaryotic and eukaryotic cytoskeleton are compared, and how these relate to function and evolution of organellar networks is discussed. The evolution of new aspects of filament dynamics in eukaryotes, including severing and branching, and the advent of molecular motors converted the eukaryotic cytoskeleton into a self-organizing “active gel,” the dynamics of which can only be described with computational models. Advances in modeling and comparative genomics hold promise of a better understanding of the evolution of the self-organizing cytoskeleton in early eukaryotes, and its role in the evolution of novel eukaryotic functions, such as amoeboid motility, mitosis, and ciliary swimming. PMID:25183829
Whole genome comparisons of Fragaria, Prunus and Malus reveal different modes of evolution between Rosaceous subfamilies

PubMed Central

2012-01-01

Background: Rosaceae include numerous economically important and morphologically diverse species. Comparative mapping between the member species in Rosaceae has indicated some level of synteny. Recently the whole genomes of three crop species, peach, apple and strawberry, which belong to different genera of the Rosaceae family, have been sequenced, allowing in-depth comparison of these genomes. Results: Our analysis using the whole genome sequences of peach, apple and strawberry identified 1399 orthologous regions between the three genomes, with a mean length of around 100 kb. Each peach chromosome showed major orthology mostly to one strawberry chromosome, but to more than two apple chromosomes, suggesting that the apple genome went through more chromosomal fissions in addition to the whole genome duplication after the divergence of the three genera. However, the distribution of contiguous ancestral regions, identified using the multiple genome rearrangements and ancestors (MGRA) algorithm, suggested that the Fragaria genome went through a greater number of small scale rearrangements compared to the other genomes since they diverged from a common ancestor. Using the contiguous ancestral regions, we reconstructed a hypothetical ancestral genome for the Rosaceae 7 composed of nine chromosomes and propose the evolutionary steps from the ancestral genome to the extant Fragaria, Prunus and Malus genomes. Conclusion: Our analysis shows that different modes of evolution may have played major roles in different subfamilies of Rosaceae. The hypothetical ancestral genome of Rosaceae and the evolutionary steps that led to three different lineages of Rosaceae will facilitate our understanding of plant genome evolution as well as have a practical impact on knowledge transfer among member species of Rosaceae. PMID:22475018
Evolutionary history of versatile-lipases from Agaricales through reconstruction of ancestral structures.

PubMed

Barriuso, Jorge; Martínez, María Jesús

2017-01-03

Fungal "Versatile carboxylic ester hydrolases" are enzymes of great biotechnological interest. Here we carried out a bioinformatic screening to find these proteins in genomes from Agaricales, by means of searching for conserved motifs, sequence and phylogenetic analysis, and three-dimensional modeling. Moreover, we reconstructed the molecular evolution of these enzymes over time by inferring and analyzing the sequences of ancestral intermediate forms. The properties of the ancestral candidates are discussed on the basis of their three-dimensional structural models, the hydrophobicity of the lid, and the substrate binding intramolecular tunnel, revealing that all of them featured properties characteristic of these enzymes. The evolutionary history of the putative lipases revealed an increase in the length and hydrophobicity of the lid region, as well as in the size of the substrate binding pocket, over evolutionary time. These facts suggest the enzymes' specialization towards certain substrates and their subsequent loss of promiscuity. These results bring to light the presence of different pools of lipases in fungi with different habitats and life styles. Despite the consistency of the data gathered from reconstruction of ancestral sequences, the heterologous expression of some of these candidates would be essential to corroborate the enzymes' activities.
Pervasive, Genome-Wide Transcription in the Organelle Genomes of Diverse Plastid-Bearing Protists.

PubMed

Sanitá Lima, Matheus; Smith, David Roy

2017-11-06

Organelle genomes are among the most sequenced kinds of chromosome. This is largely because they are small and widely used in molecular studies, but also because next-generation sequencing technologies made sequencing easier, faster, and cheaper. However, studies of organelle RNA have not kept pace with those of DNA, despite huge amounts of freely available eukaryotic RNA-sequencing (RNA-seq) data. Little is known about organelle transcription in nonmodel species, and most of the available eukaryotic RNA-seq data have not been mined for organelle transcripts. Here, we use publicly available RNA-seq experiments to investigate organelle transcription in 30 diverse plastid-bearing protists with varying organelle genomic architectures. Mapping RNA-seq data to organelle genomes revealed pervasive, genome-wide transcription, regardless of the taxonomic grouping, gene organization, or noncoding content. For every species analyzed, transcripts covered ≥85% of the mitochondrial and/or plastid genomes (all of which were ≤105 kb), indicating that most of the organelle DNA (coding and noncoding) is transcriptionally active. These results follow earlier studies of model species showing that organellar transcription is coupled and ubiquitous across the genome, requiring significant downstream processing of polycistronic transcripts. Our findings suggest that noncoding organelle DNA can be transcriptionally active, raising questions about the underlying function of these transcripts and underscoring the utility of publicly available RNA-seq data for recovering complete genome sequences. If pervasive transcription is also found in bigger organelle genomes (>105 kb) and across a broader range of eukaryotes, this could indicate that noncoding organelle RNAs are regulating fundamental processes within eukaryotic cells. Copyright © 2017 Sanitá Lima and Smith.
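The "transcripts covered ≥85% of the genome" figure in the record above is a breadth-of-coverage statistic. As a rough illustration only (not the authors' pipeline), the sketch below merges overlapping alignment intervals and reports the fraction of genome positions covered at least once. The function name `covered_fraction`, the 0-based end-exclusive interval convention, and the toy alignments are all assumptions made for this example.

```python
def covered_fraction(intervals, genome_length):
    """Fraction of genome positions covered by at least one interval.

    intervals: iterable of (start, end) pairs, 0-based and end-exclusive
    (an assumed convention for this sketch).
    """
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    # Clip each interval to the genome bounds and drop empty ones.
    clipped = sorted(
        (max(0, start), min(genome_length, end))
        for start, end in intervals
        if min(genome_length, end) > max(0, start)
    )
    covered = 0
    cur_start = cur_end = None
    for start, end in clipped:
        if cur_end is None or start > cur_end:
            # A gap: close the previous merged block and open a new one.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent: extend the current merged block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start
    return covered / genome_length


# Hypothetical alignments on a hypothetical 100 kb organelle genome.
toy_alignments = [(0, 40_000), (35_000, 60_000), (70_000, 100_000)]
fraction = covered_fraction(toy_alignments, 100_000)
print(f"{fraction:.0%} covered; meets 85% threshold: {fraction >= 0.85}")
```

In practice the intervals would come from read alignments (for example, parsed from a BAM or BED file), and a minimum read depth per position might be required before a position counts as transcribed; both choices are outside what the abstract specifies.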

Exploring microbial dark matter to resolve the deep archaeal ancestry of eukaryotes

DOE PAGES

Saw, Jimmy H.; Spang, Anja; Zaremba-Niedzwiedzka, Katarzyna; ...

2015-08-31

The origin of eukaryotes represents an enigmatic puzzle, which is still lacking a number of essential pieces. Whereas it is currently accepted that the process of eukaryogenesis involved an interplay between a host cell and an alphaproteobacterial endosymbiont, we currently lack detailed information regarding the identity and nature of these players. A number of studies have provided increasing support for the emergence of the eukaryotic host cell from within the archaeal domain of life, displaying a specific affiliation with the archaeal TACK superphylum. Recent studies have shown that genomic exploration of yet-uncultivated archaea, the so-called archaeal 'dark matter', is able to provide unprecedented insights into the process of eukaryogenesis. Here, we provide an overview of state-of-the-art cultivation-independent approaches, and demonstrate how these methods were used to obtain draft genome sequences of several novel members of the TACK superphylum, including Lokiarchaeum, two representatives of the Miscellaneous Crenarchaeotal Group (Bathyarchaeota), and a Korarchaeum-related lineage. In conclusion, the maturation of cultivation-independent genomics approaches, as well as future developments in next-generation sequencing technologies, will revolutionize our current view of microbial evolution and diversity, and provide profound new insights into the early evolution of life, including the enigmatic origin of the eukaryotic cell.
In Silico Ionomics Segregates Parasitic from Free-Living Eukaryotes

PubMed Central

Greganova, Eva; Steinmann, Michael; Mäser, Pascal; Fankhauser, Niklaus

2013-01-01

Ion transporters are fundamental to life. Due to their ancient origin and conservation in sequence, ion transporters are also particularly well suited for comparative genomics of distantly related species. Here, we perform genome-wide ion transporter profiling as a basis for comparative genomics of eukaryotes. From a given predicted proteome, we identify all bona fide ion channels, ion porters, and ion pumps. Concentrating on unicellular eukaryotes (n = 37), we demonstrate that clustering of species according to their repertoire of ion transporters segregates obligate endoparasites (n = 23) on the one hand, from free-living species and facultative parasites (n = 14) on the other hand. This surprising finding indicates strong convergent evolution of the parasites regarding the acquisition and homeostasis of inorganic ions. Random forest classification identifies transporters of ammonia, plus transporters of iron and other transition metals, as the most informative for distinguishing the obligate parasites. Thus, in silico ionomics further underscores the importance of iron in infection biology and suggests access to host sources of nitrogen and transition metals to be selective forces in the evolution of parasitism. This finding is in agreement with the phenomenon of iron withholding as a primordial antimicrobial strategy of infected mammals. PMID:24048281
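The clustering step described in the record above can be pictured with a small distance calculation. The following is a minimal sketch under loud assumptions: it treats each species' repertoire as a plain presence/absence set and uses Jaccard distance, which is a stand-in chosen for illustration and not necessarily the measure or clustering method used in the paper. The species labels and transporter family names (`T1` to `T5`) are invented placeholders, not data from the study.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b| for two sets; 0.0 when both are empty."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


# Hypothetical repertoires: placeholder species and transporter families.
repertoires = {
    "species_A": {"T1", "T2"},
    "species_B": {"T1", "T2", "T3"},
    "species_C": {"T1", "T2", "T3", "T4", "T5"},
    "species_D": {"T1", "T3", "T4", "T5"},
}

# Pairwise distances keyed by the unordered pair of species names.
distances = {
    frozenset(pair): jaccard_distance(repertoires[pair[0]], repertoires[pair[1]])
    for pair in combinations(repertoires, 2)
}


def nearest_neighbour(name):
    """Return the species whose repertoire is closest to that of `name`."""
    others = (other for other in repertoires if other != name)
    return min(others, key=lambda other: distances[frozenset((name, other))])


for name in repertoires:
    print(name, "->", nearest_neighbour(name))
```

A pairwise distance table like this is the usual input to hierarchical clustering; a classifier such as the random forest mentioned in the abstract would instead take the per-species presence (or count) of each transporter family as its feature vector.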
Genome Improvement at JGI-HAGSC

DOE Office of Scientific and Technical Information (OSTI.GOV)

Grimwood, Jane; Schmutz, Jeremy J.; Myers, Richard M.

Since the completion of the sequencing of the human genome, the Joint Genome Institute (JGI) has rapidly expanded its scientific goals in several DOE mission-relevant areas. At the JGI-HAGSC, we have kept pace with this rapid expansion of projects with our focus on assessing, assembling, improving and finishing eukaryotic whole genome shotgun (WGS) projects for which the shotgun sequence is generated at the Production Genomic Facility (JGI-PGF). We follow this by combining the draft WGS with genomic resources generated at JGI-HAGSC or in collaborator laboratories (including BAC end sequences, genetic maps and FLcDNA sequences) to produce an improved draft sequence. For eukaryotic genomes important to the DOE mission, we then add further information from directed experiments to produce reference genomic sequences that are publicly available for any scientific researcher. Also, we have continued our program for producing BAC-based finished sequence, both for adding information to JGI genome projects and for small BAC-based sequencing projects proposed through any of the JGI sequencing programs. We have now built our computational expertise in WGS assembly and analysis and have moved eukaryotic genome assembly from the JGI-PGF to JGI-HAGSC. We have concentrated our assembly development work on large plant genomes and complex fungal and algal genomes.
Origin and evolution of the self-organizing cytoskeleton in the network of eukaryotic organelles.

PubMed

Jékely, Gáspár

2014-09-02

The eukaryotic cytoskeleton evolved from prokaryotic cytomotive filaments. Prokaryotic filament systems show bewildering structural and dynamic complexity and, in many aspects, prefigure the self-organizing properties of the eukaryotic cytoskeleton. Here, the dynamic properties of the prokaryotic and eukaryotic cytoskeleton are compared, and how these relate to function and evolution of organellar networks is discussed. The evolution of new aspects of filament dynamics in eukaryotes, including severing and branching, and the advent of molecular motors converted the eukaryotic cytoskeleton into a self-organizing "active gel," the dynamics of which can only be described with computational models. Advances in modeling and comparative genomics hold promise of a better understanding of the evolution of the self-organizing cytoskeleton in early eukaryotes, and its role in the evolution of novel eukaryotic functions, such as amoeboid motility, mitosis, and ciliary swimming. Copyright © 2014 Cold Spring Harbor Laboratory Press; all rights reserved.
Joint assembly and genetic mapping of the Atlantic horseshoe crab genome reveals ancient whole genome duplication

PubMed Central

2014-01-01

Background: Horseshoe crabs are marine arthropods with a fossil record extending back approximately 450 million years. They exhibit remarkable morphological stability over their long evolutionary history, retaining a number of ancestral arthropod traits, and are often cited as examples of “living fossils.” As arthropods, they belong to the Ecdysozoa, an ancient super-phylum whose sequenced genomes (including insects and nematodes) have thus far shown more divergence from the ancestral pattern of eumetazoan genome organization than cnidarians, deuterostomes and lophotrochozoans. However, much of ecdysozoan diversity remains unrepresented in comparative genomic analyses. Results: Here we apply a new strategy of combined de novo assembly and genetic mapping to examine the chromosome-scale genome organization of the Atlantic horseshoe crab, Limulus polyphemus. We constructed a genetic linkage map of this 2.7 Gbp genome by sequencing the nuclear DNA of 34 wild-collected, full-sibling embryos and their parents at a mean redundancy of 1.1x per sample. The map includes 84,307 sequence markers grouped into 1,876 distinct genetic intervals and 5,775 candidate conserved protein coding genes. Conclusions: Comparison with other metazoan genomes shows that the L. polyphemus genome preserves ancestral bilaterian linkage groups, and that a common ancestor of modern horseshoe crabs underwent one or more ancient whole genome duplications 300 million years ago, followed by extensive chromosome fusion. These results provide a counter-example to the often noted correlation between whole genome duplication and evolutionary radiations. The new, low-cost genetic mapping method for obtaining a chromosome-scale view of non-model organism genomes that we demonstrate here does not require laboratory culture, and is potentially applicable to a broad range of other species. PMID:24987520
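To give a feel for what "a mean redundancy of 1.1x per sample" on a 2.7 Gbp genome implies, here is a back-of-the-envelope sketch. It relies on the idealized Poisson (Lander-Waterman style) model, in which reads land uniformly at random, so the expected fraction of sites covered at least once at mean depth c is 1 - e^(-c). That model, the function name, and the sample count of 36 (34 embryos plus an assumed two parents) are assumptions of this example and are not stated in the abstract.

```python
import math


def expected_covered_fraction(depth):
    """Expected fraction of sites with >= 1 read under a Poisson depth model."""
    if depth < 0:
        raise ValueError("depth must be non-negative")
    return 1.0 - math.exp(-depth)


GENOME_SIZE_BP = 2.7e9   # genome size quoted in the abstract
MEAN_DEPTH = 1.1         # mean redundancy per sample quoted in the abstract
N_SAMPLES = 34 + 2       # assumption: 34 embryos plus two parents

per_sample_bp = GENOME_SIZE_BP * MEAN_DEPTH
total_bp = per_sample_bp * N_SAMPLES

print(f"Sequence per sample: {per_sample_bp / 1e9:.2f} Gbp")
print(f"Sequence across all samples: {total_bp / 1e9:.1f} Gbp")
print(f"Expected fraction of sites seen per sample: "
      f"{expected_covered_fraction(MEAN_DEPTH):.3f}")
```

Under this simplified model each sample covers roughly two thirds of the genome at least once, which suggests why a low-coverage design would need to pool information across many siblings rather than call genotypes from any single sample.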
Characterisation of monotreme caseins reveals lineage-specific expansion of an ancestral casein locus in mammals.

PubMed

Lefèvre, Christophe M; Sharp, Julie A; Nicholas, Kevin R

2009-01-01

Using a milk-cell cDNA sequencing approach we characterised milk-protein sequences from two monotreme species, platypus (Ornithorhynchus anatinus) and echidna (Tachyglossus aculeatus) and found a full set of caseins and casein variants. The genomic organisation of the platypus casein locus is compared with other mammalian genomes, including the marsupial opossum and several eutherians. Physical linkage of casein genes has been seen in the casein loci of all mammalian genomes examined and we confirm that this is also observed in platypus. However, we show that a recent duplication of beta-casein occurred in the monotreme lineage, as opposed to more ancient duplications of alpha-casein in the eutherian lineage, while marsupials possess only single copies of alpha- and beta-caseins. Despite this variability, the close proximity of the main alpha- and beta-casein genes in an inverted tail-tail orientation and the relative orientation of the more distant kappa-casein genes are similar in all mammalian genome sequences so far available. Overall, the conservation of the genomic organisation of the caseins indicates the early, pre-monotreme development of the fundamental role of caseins during lactation. In contrast, the lineage-specific gene duplications that have occurred within the casein locus of monotremes and eutherians but not marsupials, which may have lost part of the ancestral casein locus, emphasise the independent selection on milk provision strategies to the young, most likely linked to different developmental strategies. The monotremes therefore provide insight into the ancestral drivers for lactation and how these have adapted in different lineages.
Analyses of charophyte chloroplast genomes help characterize the ancestral chloroplast genome of land plants.

PubMed

Civaň, Peter; Foster, Peter G; Embley, Martin T; Séneca, Ana; Cox, Cymon J

2014-04-01

Despite the significance of the relationships between embryophytes and their charophyte algal ancestors in deciphering the origin and evolutionary success of land plants, few chloroplast genomes of the charophyte algae have been reconstructed to date. Here, we present new data for three chloroplast genomes of the freshwater charophytes Klebsormidium flaccidum (Klebsormidiophyceae), Mesotaenium endlicherianum (Zygnematophyceae), and Roya anglica (Zygnematophyceae). The chloroplast genome of Klebsormidium has a quadripartite organization with exceptionally large inverted repeat (IR) regions and, uniquely among streptophytes, has lost the rrn5 and rrn4.5 genes from the ribosomal RNA (rRNA) gene cluster operon. The chloroplast genome of Roya differs from other zygnematophycean chloroplasts, including the newly sequenced Mesotaenium, by having a quadripartite structure that is typical of other streptophytes. On the basis of the improbability of the novel gain of IR regions, we infer that the quadripartite structure has likely been lost independently in at least three zygnematophycean lineages, although the absence of the usual rRNA operonic synteny in the IR regions of Roya may indicate their de novo origin. Significantly, all zygnematophycean chloroplast genomes have undergone substantial genomic rearrangement, which may be the result of ancient retroelement activity evidenced by the presence of integrase-like and reverse transcriptase-like elements in the Roya chloroplast genome. Our results corroborate the close phylogenetic relationship between Zygnematophyceae and land plants and identify 89 protein-coding genes and 22 introns present in the chloroplast genome at the time of the evolutionary transition of plants to land, all of which can be found in the chloroplast genomes of extant charophytes.
Analyses of Charophyte Chloroplast Genomes Help Characterize the Ancestral Chloroplast Genome of Land Plants

PubMed Central

Civáň, Peter; Foster, Peter G.; Embley, Martin T.; Séneca, Ana; Cox, Cymon J.

2014-01-01

Despite the significance of the relationships between embryophytes and their charophyte algal ancestors in deciphering the origin and evolutionary success of land plants, few chloroplast genomes of the charophyte algae have been reconstructed to date. Here, we present new data for three chloroplast genomes of the freshwater charophytes Klebsormidium flaccidum (Klebsormidiophyceae), Mesotaenium endlicherianum (Zygnematophyceae), and Roya anglica (Zygnematophyceae). The chloroplast genome of Klebsormidium has a quadripartite organization with exceptionally large inverted repeat (IR) regions and, uniquely among streptophytes, has lost the rrn5 and rrn4.5 genes from the ribosomal RNA (rRNA) gene cluster operon. The chloroplast genome of Roya differs from other zygnematophycean chloroplasts, including the newly sequenced Mesotaenium, by having a quadripartite structure that is typical of other streptophytes. On the basis of the improbability of the novel gain of IR regions, we infer that the quadripartite structure has likely been lost independently in at least three zygnematophycean lineages, although the absence of the usual rRNA operonic synteny in the IR regions of Roya may indicate their de novo origin. Significantly, all zygnematophycean chloroplast genomes have undergone substantial genomic rearrangement, which may be the result of ancient retroelement activity evidenced by the presence of integrase-like and reverse transcriptase-like elements in the Roya chloroplast genome. Our results corroborate the close phylogenetic relationship between Zygnematophyceae and land plants and identify 89 protein-coding genes and 22 introns present in the chloroplast genome at the time of the evolutionary transition of plants to land, all of which can be found in the chloroplast genomes of extant charophytes. PMID:24682153
A Taste of Algal Genomes from the Joint Genome Institute

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kuo, Alan; Grigoriev, Igor

Algae play profound roles in aquatic food chains and the carbon cycle, can impose health and economic costs through toxic blooms, provide models for the study of symbiosis, photosynthesis, and eukaryotic evolution, and are candidate sources for bio-fuels; all of these research areas are part of the mission of DOE's Joint Genome Institute (JGI). To date JGI has sequenced, assembled, annotated, and released to the public the genomes of 18 species and strains of algae, sampling almost all of the major clades of photosynthetic eukaryotes. With more algal genomes currently undergoing analysis, JGI continues its commitment to driving forward basic and applied algal science. Among these ongoing projects are the pan-genome of the dominant coccolithophore Emiliania huxleyi, the interrelationships between the 4 genomes in the nucleomorph-containing Bigelowiella natans and Guillardia theta, and the search for symbiosis genes of lichens.
Molecular paleontology and complexity in the last eukaryotic common ancestor

PubMed Central

Koumandou, V. Lila; Wickstead, Bill; Ginger, Michael L.; van der Giezen, Mark; Dacks, Joel B.

2013-01-01

Eukaryogenesis, the origin of the eukaryotic cell, represents one of the fundamental evolutionary transitions in the history of life on earth. This event, which is estimated to have occurred over one billion years ago, remains rather poorly understood. While some well-validated examples of fossil microbial eukaryotes for this time frame have been described, these can provide only basic morphology and the molecular machinery present in these organisms has remained unknown. Complete and partial genomic information has begun to fill this gap, and is being used to trace proteins and cellular traits to their roots and to provide unprecedented levels of resolution of structures, metabolic pathways and capabilities of organisms at these earliest points within the eukaryotic lineage. This is essentially allowing a molecular paleontology. What has emerged from these studies is spectacular cellular complexity prior to expansion of the eukaryotic lineages. Multiple reconstructed cellular systems indicate a very sophisticated biology, which by implication arose following the initial eukaryogenesis event but prior to eukaryotic radiation and provides a challenge in terms of explaining how these early eukaryotes arose and in understanding how they lived. Here, we provide brief overviews of several cellular systems and the major emerging conclusions, together with predictions for subsequent directions in evolution leading to extant taxa. We also consider what these reconstructions suggest about the life styles and capabilities of these earliest eukaryotes and the period of evolution between the radiation of eukaryotes and the eukaryogenesis event itself. PMID:23895660
Evolutionary Inference across Eukaryotes Identifies Specific Pressures Favoring Mitochondrial Gene Retention.

PubMed

Johnston, Iain G; Williams, Ben P

2016-02-24

Since their endosymbiotic origin, mitochondria have lost most of their genes. Although many selective mechanisms underlying the evolution of mitochondrial genomes have been proposed, a data-driven exploration of these hypotheses is lacking, and a quantitatively supported consensus remains absent. We developed HyperTraPS, a methodology coupling stochastic modeling with Bayesian inference, to identify the ordering of evolutionary events and suggest their causes. Using 2015 complete mitochondrial genomes, we inferred evolutionary trajectories of mtDNA gene loss across the eukaryotic tree of life. We find that proteins comprising the structural cores of the electron transport chain are preferentially encoded within mitochondrial genomes across eukaryotes. A combination of high GC content and high protein hydrophobicity is required to explain patterns of mtDNA gene retention; a model that accounts for these selective pressures can also predict the success of artificial gene transfer experiments in vivo. This work provides a general method for data-driven inference of the ordering of evolutionary and progressive events, here identifying the distinct features shaping mitochondrial genomes of present-day species. Copyright © 2016 Elsevier Inc. All rights reserved.
Archaeal "dark matter" and the origin of eukaryotes.

PubMed

Williams, Tom A; Embley, T Martin

2014-03-01

Current hypotheses about the history of cellular life are mainly based on analyses of cultivated organisms, but these represent only a small fraction of extant biodiversity. The sequencing of new environmental lineages therefore provides an opportunity to test, revise, or reject existing ideas about the tree of life and the origin of eukaryotes. According to the textbook three domains hypothesis, the eukaryotes emerge as the sister group to a monophyletic Archaea. However, recent analyses incorporating better phylogenetic models and an improved sampling of the archaeal domain have generally supported the competing eocyte hypothesis, in which core genes of eukaryotic cells originated from within the Archaea, with important implications for eukaryogenesis. Given this trend, it was surprising that a recent analysis incorporating new genomes from uncultivated Archaea recovered a strongly supported three domains tree. Here, we show that this result was due in part to the use of a poorly fitting phylogenetic model and also to the inclusion by an automated pipeline of genes of putative bacterial origin rather than nucleocytosolic versions for some of the eukaryotes analyzed. When these issues were resolved, analyses including the new archaeal lineages placed core eukaryotic genes within the Archaea. These results are consistent with a number of recent studies in which improved archaeal sampling and better phylogenetic models agree in supporting the eocyte tree over the three domains hypothesis.
Independent evolution of genomic characters during major metazoan transitions.

PubMed

Simakov, Oleg; Kawashima, Takeshi

2017-07-15

Metazoan evolution encompasses a vast evolutionary time scale spanning over 600 million years. Our ability to infer ancestral metazoan characters, both morphological and functional, is limited by our understanding of the nature and evolutionary dynamics of the underlying regulatory networks. Increasing coverage of metazoan genomes enables us to identify the evolutionary changes of the relevant genomic characters such as the loss or gain of coding sequences, gene duplications, micro- and macro-synteny, and non-coding element evolution in different lineages. In this review we describe recent advances in our understanding of ancestral metazoan coding and non-coding features, as deduced from genomic comparisons. Some genomic changes such as innovations in gene and linkage content occur at different rates across metazoan clades, suggesting some level of independence among genomic characters. While their contribution to biological innovation remains largely unclear, we review recent literature about certain genomic changes that do correlate with changes to specific developmental pathways and metazoan innovations. In particular, we discuss the origins of the recently described pharyngeal cluster which is conserved across deuterostome genomes, and highlight different genomic features that have contributed to the evolution of this group. We also assess our current capacity to infer ancestral metazoan states from gene models and comparative genomics tools and elaborate on the future directions of metazoan comparative genomics relevant to evo-devo studies. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
A Surrogate Approach to Study the Evolution of Noncoding DNA Elements That Organize Eukaryotic Genomes

PubMed Central

Vermaak, Danielle; Bayes, Joshua J.

2009-01-01

Comparative genomics provides a facile way to address issues of evolutionary constraint acting on different elements of the genome. However, several important DNA elements have not reaped the benefits of this new approach. Some have proved intractable to current-day sequencing technology. These include centromeric and heterochromatic DNA, which are essential for chromosome segregation as well as gene regulation, but the highly repetitive nature of the DNA sequences in these regions makes them difficult to assemble into longer contigs. Other sequences, like dosage compensation X chromosomal sites, origins of DNA replication, or heterochromatic sequences that encode piwi-associated RNAs, have proved difficult to study because they do not have recognizable DNA features that allow them to be described functionally or computationally. We have employed an alternate approach to the direct study of these DNA elements. By using proteins that specifically bind these noncoding DNAs as surrogates, we can indirectly assay the evolutionary constraints acting on these important DNA elements. We review the impact that such “surrogate strategies” have had on our understanding of the evolutionary constraints shaping centromeres, origins of DNA replication, and dosage compensation X chromosomal sites. These have begun to reveal that in contrast to the view that such structural DNA elements are either highly constrained (under purifying selection) or free to drift (under neutral evolution), some of them may instead be shaped by adaptive evolution and genetic conflicts (these are not mutually exclusive). These insights also help to explain why the same elements (e.g., centromeres and replication origins), which are so complex in some eukaryotic genomes, can be simple and well defined in others where similar conflicts do not exist. PMID:19635763
The mitochondrial genome of the lycophyte Huperzia squarrosa: the most archaic form in vascular plants.

PubMed

Liu, Yang; Wang, Bin; Cui, Peng; Li, Libo; Xue, Jia-Yu; Yu, Jun; Qiu, Yin-Long

2012-01-01

Mitochondrial genomes have maintained some bacterial features despite their residence within eukaryotic cells for approximately two billion years. One of these features is the frequent presence of polycistronic operons. In land plants, however, it has been shown that all sequenced vascular plant chondromes lack large polycistronic operons while bryophyte chondromes have many of them. In this study, we provide the completely sequenced mitochondrial genome of a lycophyte, from Huperzia squarrosa, which is a member of the sister group to all other vascular plants. The genome, at a size of 413,530 base pairs, contains 66 genes and 32 group II introns. In addition, it has 69 pseudogene fragments for 24 of the 40 protein- and rRNA-coding genes. It represents the most archaic form of mitochondrial genomes of all vascular plants. In particular, it has one large conserved gene cluster containing up to 10 ribosomal protein genes, which likely represents a polycistronic operon but has been disrupted and greatly reduced in the chondromes of other vascular plants. It also has the least rearranged gene order in comparison to the chondromes of other vascular plants. The genome is ancestral in vascular plants in several other aspects: the gene content resembling those of charophytes and most bryophytes, all introns being cis-spliced, a low level of RNA editing, and lack of foreign DNA of chloroplast or nuclear origin.
Two Rounds of Whole Genome Duplication in the Ancestral Vertebrate

PubMed Central

Dehal, Paramvir; Boore, Jeffrey L

2005-01-01

The hypothesis that the relatively large and complex vertebrate genome was created by two ancient, whole genome duplications has been hotly debated, but remains unresolved. We reconstructed the evolutionary relationships of all gene families from the complete gene sets of a tunicate, fish, mouse, and human, and then determined when each gene duplicated relative to the evolutionary tree of the organisms. We confirmed the results of earlier studies that there remains little signal of these events in numbers of duplicated genes, gene tree topology, or the number of genes per multigene family. However, when we plotted the genomic map positions of only the subset of paralogous genes that were duplicated prior to the fish–tetrapod split, their global physical organization provides unmistakable evidence of two distinct genome duplication events early in vertebrate evolution indicated by clear patterns of four-way paralogous regions covering a large part of the human genome. Our results highlight the potential for these large-scale genomic events to have driven the evolutionary success of the vertebrate lineage. PMID:16128622
Three crocodilian genomes reveal ancestral patterns of evolution among archosaurs

PubMed Central

Green, Richard E; Braun, Edward L; Armstrong, Joel; Earl, Dent; Nguyen, Ngan; Hickey, Glenn; Vandewege, Michael W; St John, John A; Capella-Gutiérrez, Salvador; Castoe, Todd A; Kern, Colin; Fujita, Matthew K; Opazo, Juan C; Jurka, Jerzy; Kojima, Kenji K; Caballero, Juan; Hubley, Robert M; Smit, Arian F; Platt, Roy N; Lavoie, Christine A; Ramakodi, Meganathan P; Finger, John W; Suh, Alexander; Isberg, Sally R; Miles, Lee; Chong, Amanda Y; Jaratlerdsiri, Weerachai; Gongora, Jaime; Moran, Christopher; Iriarte, Andrés; McCormack, John; Burgess, Shane C; Edwards, Scott V; Lyons, Eric; Williams, Christina; Breen, Matthew; Howard, Jason T; Gresham, Cathy R; Peterson, Daniel G; Schmitz, Jürgen; Pollock, David D; Haussler, David; Triplett, Eric W; Zhang, Guojie; Irie, Naoki; Jarvis, Erich D; Brochu, Christopher A; Schmidt, Carl J; McCarthy, Fiona M; Faircloth, Brant C; Hoffmann, Federico G; Glenn, Travis C; Gabaldón, Toni; Paten, Benedict; Ray, David A

2015-01-01

To provide context for the diversifications of archosaurs, the group that includes crocodilians, dinosaurs and birds, we generated draft genomes of three crocodilians, Alligator mississippiensis (the American alligator), Crocodylus porosus (the saltwater crocodile), and Gavialis gangeticus (the Indian gharial). We observed an exceptionally slow rate of genome evolution within crocodilians at all levels, including nucleotide substitutions, indels, transposable element content and movement, gene family evolution, and chromosomal synteny. When placed within the context of related taxa including birds and turtles, this suggests that the common ancestor of all of these taxa also exhibited slow genome evolution and that the relatively rapid evolution of bird genomes represents an autapomorphy within that clade. The data also provided the opportunity to analyze heterozygosity in crocodilians, which indicates a likely reduction in population size for all three taxa through the Pleistocene. Finally, these new data combined with newly published bird genomes allowed us to reconstruct the partial genome of the common ancestor of archosaurs providing a tool to investigate the genetic starting material of crocodilians, birds, and dinosaurs. PMID:25504731
Mobile Bacterial Group II Introns at the Crux of Eukaryotic Evolution

PubMed Central

Lambowitz, Alan M.; Belfort, Marlene

2015-01-01

SUMMARY This review focuses on recent developments in our understanding of group II intron function, the relationships of these introns to retrotransposons and spliceosomes, and how their common features have informed thinking about bacterial group II introns as key elements in eukaryotic evolution. Reverse transcriptase-mediated and host factor-aided intron retrohoming pathways are considered along with retrotransposition mechanisms to novel sites in bacteria, where group II introns are thought to have originated. DNA target recognition and movement by target-primed reverse transcription infer an evolutionary relationship among group II introns, non-LTR retrotransposons, such as LINE elements, and telomerase. Additionally, group II introns are almost certainly the progenitors of spliceosomal introns. Their profound similarities include splicing chemistry extending to RNA catalysis, reaction stereochemistry, and the position of two divalent metals that perform catalysis at the RNA active site. There are also sequence and structural similarities between group II introns and the spliceosome’s small nuclear RNAs (snRNAs) and between a highly conserved core spliceosomal protein Prp8 and a group II intron-like reverse transcriptase. It has been proposed that group II introns entered eukaryotes during bacterial endosymbiosis or bacterial-archaeal fusion, proliferated within the nuclear genome, necessitating evolution of the nuclear envelope, and fragmented giving rise to spliceosomal introns. Thus, these bacterial self-splicing mobile elements have fundamentally impacted the composition of extant eukaryotic genomes, including the human genome, most of which is derived from close relatives of mobile group II introns. PMID:25878921

Structural studies demonstrating a bacteriophage-like replication cycle of the eukaryote-infecting Paramecium bursaria chlorella virus-1

PubMed Central

Shimoni, Eyal; Dadosh, Tali; Rechav, Katya; Unger, Tamar

2017-01-01

A fundamental stage in viral infection is the internalization of viral genomes in host cells. Although extensively studied, the mechanisms and factors responsible for the genome internalization process remain poorly understood. Here we report our observations, derived from diverse imaging methods, on genome internalization of the large dsDNA Paramecium bursaria chlorella virus-1 (PBCV-1). Our studies reveal that early infection stages of this eukaryotic-infecting virus occur by a bacteriophage-like pathway, whereby PBCV-1 generates a hole in the host cell wall and ejects its dsDNA genome in a linear, base-pair-by-base-pair process, through a membrane tunnel generated by the fusion of the virus internal membrane with the host membrane. Furthermore, our results imply that PBCV-1 DNA condensation that occurs shortly after infection probably plays a role in genome internalization, as hypothesized for the infection of some bacteriophages. The subsequent perforation of the host photosynthetic membranes presumably enables trafficking of viral genomes towards host nuclei. Previous studies established that at late infection stages PBCV-1 generates cytoplasmic organelles, termed viral factories, where viral assembly takes place, a feature characteristic of many large dsDNA viruses that infect eukaryotic organisms. PBCV-1 thus appears to combine a bacteriophage-like mechanism during early infection stages with a eukaryotic-like infection pathway in its late replication cycle. PMID:28850602
Allo-allo-triploid Sphagnum × falcatulum: single individuals contain most of the Holantarctic diversity for ancestrally indicative markers.

PubMed

Karlin, Eric F; Smouse, Peter E

2017-08-01

Allopolyploids exhibit both different levels and different patterns of genetic variation than are typical of diploids. However, scant attention has been given to the partitioning of allelic information and diversity in allopolyploids, particularly that among homeologous monoploid components of the hologenome. Sphagnum × falcatulum is a double allopolyploid peat moss that spans a considerable portion of the Holantarctic. With monoploid genomes from three ancestral species, this organism exhibits a complex evolutionary history involving serial inter-subgeneric allopolyploidizations. Studying populations from three disjunct regions [South Island (New Zealand); Tierra del Fuego archipelago (Chile, Argentina); Tasmania (Australia)], allelic information for five highly stable microsatellite markers that differed among the three (ancestral) monoploid genomes was examined. Using Shannon information and diversity measures, the holoploid information, as well as the information within and among the three component monoploid genomes, was partitioned into separate components for individuals within and among populations and regions, and those information components were then converted into corresponding diversity measures. The majority (76 %) of alleles detected across these five markers are most likely to have been captured by hybridization, but the information within each of the three monoploid genomes varied, suggesting a history of recurrent allopolyploidization between ancestral species containing different levels of genetic diversity. Information within individuals, equivalent to the information among monoploid genomes (for this dataset), was relatively stable, and represented 83 % of the grand total information across the Holantarctic, with both inter-regional and inter-population diversification each accounting for about 5 % of the total information. Sphagnum × falcatulum probably inherited the great majority of its genetic diversity at these markers by reticulation
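As a rough illustration of the Shannon information and diversity measures mentioned in the abstract above, the following minimal Python sketch computes the Shannon information of a set of allele calls and converts it to a diversity value. The conversion exp(H) (the effective number of alleles) is a standard choice assumed here, not one confirmed by the abstract, and the function names and allele labels are hypothetical.

```python
import math
from collections import Counter


def shannon_information(alleles):
    """Shannon information H = -sum(p * ln p) over observed allele frequencies."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def effective_number(alleles):
    """Diversity expressed as the effective number of alleles, exp(H) (assumed conversion)."""
    return math.exp(shannon_information(alleles))


# Hypothetical allele calls at a single marker, for illustration only.
sample = ["a", "a", "b", "c"]
print(shannon_information(sample), effective_number(sample))
```

When all alleles are equally frequent, exp(H) equals the number of distinct alleles, which is why it is often read as an "effective" count.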
Neandertal admixture in Eurasia confirmed by maximum-likelihood analysis of three genomes.

PubMed

Lohse, Konrad; Frantz, Laurent A F

2014-04-01

Although there has been much interest in estimating histories of divergence and admixture from genomic data, it has proved difficult to distinguish recent admixture from long-term structure in the ancestral population. Thus, recent genome-wide analyses based on summary statistics have sparked controversy about the possibility of interbreeding between Neandertals and modern humans in Eurasia. Here we derive the probability of full mutational configurations in nonrecombining sequence blocks under both admixture and ancestral structure scenarios. Dividing the genome into short blocks gives an efficient way to compute maximum-likelihood estimates of parameters. We apply this likelihood scheme to triplets of human and Neandertal genomes and compare the relative support for a model of admixture from Neandertals into Eurasian populations after their expansion out of Africa against a history of persistent structure in their common ancestral population in Africa. Our analysis allows us to conclusively reject a model of ancestral structure in Africa and instead reveals strong support for Neandertal admixture in Eurasia at a higher rate (3.4-7.3%) than suggested previously. Using analysis and simulations we show that our inference is more powerful than previous summary statistics and robust to realistic levels of recombination.
A widely employed germ cell marker is an ancient disordered protein with reproductive functions in diverse eukaryotes

PubMed Central

Carmell, Michelle A; Dokshin, Gregoriy A; Skaletsky, Helen; Hu, Yueh-Chiang; van Wolfswinkel, Josien C; Igarashi, Kyomi J; Bellott, Daniel W; Nefedov, Michael; Reddien, Peter W; Enders, George C; Uversky, Vladimir N; Mello, Craig C; Page, David C

2016-01-01

The advent of sexual reproduction and the evolution of a dedicated germline in multicellular organisms are critical landmarks in eukaryotic evolution. We report an ancient family of GCNA (germ cell nuclear antigen) proteins that arose in the earliest eukaryotes, and feature a rapidly evolving intrinsically disordered region (IDR). Phylogenetic analysis reveals that GCNA proteins emerged before the major eukaryotic lineages diverged; GCNA predates the origin of a dedicated germline by a billion years. Gcna gene expression is enriched in reproductive cells across eukarya – either just prior to or during meiosis in single-celled eukaryotes, and in stem cells and germ cells of diverse multicellular animals. Studies of Gcna-mutant C. elegans and mice indicate that GCNA has functioned in reproduction for at least 600 million years. Homology to IDR-containing proteins implicated in DNA damage repair suggests that GCNA proteins may protect the genomic integrity of cells carrying a heritable genome. DOI: http://dx.doi.org/10.7554/eLife.19993.001 PMID:27718356
Intra-plastid protein trafficking: how plant cells adapted prokaryotic mechanisms to the eukaryotic condition.

PubMed

Celedon, Jose M; Cline, Kenneth

2013-02-01

Protein trafficking and localization in plastids involve a complex interplay between ancient (prokaryotic) and novel (eukaryotic) translocases and targeting machineries. During evolution, ancient systems acquired new functions and novel translocation machineries were developed to facilitate the correct localization of nuclear encoded proteins targeted to the chloroplast. Because of its post-translational nature, targeting and integration of membrane proteins posed the biggest challenge to the organelle to avoid aggregation in the aqueous compartments. Soluble proteins faced a different kind of problem since some had to be transported across three membranes to reach their destination. Early studies suggested that chloroplasts addressed these issues by adapting ancient-prokaryotic machineries and integrating them with novel-eukaryotic systems, a process called 'conservative sorting'. In the last decade, detailed biochemical, genetic, and structural studies have unraveled the mechanisms of protein targeting and localization in chloroplasts, suggesting a highly integrated scheme where ancient and novel systems collaborate at different stages of the process. In this review we focus on the differences and similarities between chloroplast ancestral translocases and their prokaryotic relatives to highlight known modifications that adapted them to the eukaryotic situation. This article is part of a Special Issue entitled: Protein Import and Quality Control in Mitochondria and Plastids. Copyright © 2012 Elsevier B.V. All rights reserved.
GASP: Gapped Ancestral Sequence Prediction for proteins

PubMed Central

Edwards, Richard J; Shields, Denis C

2004-01-01

Background: The prediction of ancestral protein sequences from multiple sequence alignments is useful for many bioinformatics analyses. Predicting ancestral sequences is not a simple procedure and relies on accurate alignments and phylogenies. Several algorithms exist based on Maximum Parsimony or Maximum Likelihood methods but many current implementations are unable to process residues with gaps, which may represent insertion/deletion (indel) events or sequence fragments. Results: Here we present a new algorithm, GASP (Gapped Ancestral Sequence Prediction), for predicting ancestral sequences from phylogenetic trees and the corresponding multiple sequence alignments. Alignments may be of any size and contain gaps. GASP first assigns the positions of gaps in the phylogeny before using a likelihood-based approach centred on amino acid substitution matrices to assign ancestral amino acids. Important outgroup information is used by first working down from the tips of the tree to the root, using descendant data only to assign probabilities, and then working back up from the root to the tips using descendant and outgroup data to make predictions. GASP was tested on a number of simulated datasets based on real phylogenies. Prediction accuracy for ungapped data was similar to three alternative algorithms tested, with GASP performing better in some cases and worse in others. Adding simple insertions and deletions to the simulated data did not have a detrimental effect on GASP accuracy. Conclusions: GASP (Gapped Ancestral Sequence Prediction) will predict ancestral sequences from multiple protein alignments of any size. Although not as accurate in all cases as some of the more sophisticated maximum likelihood approaches, it can process a wide range of input phylogenies and will predict ancestral sequences for gapped and ungapped residues alike. PMID:15350199
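The two-pass idea described in the abstract above (tips to root, then root back to tips) can be illustrated with a much simpler stand-in. The sketch below is not GASP: it uses Fitch-style maximum parsimony on a single alignment column and treats the gap character as just another state, whereas GASP assigns gaps first and then uses a likelihood-based approach. The tree encoding (nested tuples), the node labels, and the tie-breaking rule are all assumptions made for illustration.

```python
def down_pass(node, tip_states):
    """Tips-to-root pass: candidate states for each node from descendants only.

    A leaf is a string name; an internal node is a (left, right) tuple.
    Returns (candidate_set, annotated_children_or_None).
    """
    if isinstance(node, str):
        return ({tip_states[node]}, None)
    left = down_pass(node[0], tip_states)
    right = down_pass(node[1], tip_states)
    common = left[0] & right[0]
    # Fitch rule: intersection if non-empty, otherwise union.
    return (common if common else left[0] | right[0], (left, right))


def up_pass(annotated, parent_state=None, out=None, path="root"):
    """Root-to-tips pass: fix one state per internal node, preferring the parent's state."""
    if out is None:
        out = {}
    candidates, children = annotated
    if children is None:
        return out
    if parent_state in candidates:
        state = parent_state
    else:
        state = sorted(candidates)[0]  # arbitrary but deterministic tie-break
    out[path] = state
    up_pass(children[0], state, out, path + ".L")
    up_pass(children[1], state, out, path + ".R")
    return out


# Hypothetical four-taxon tree and one alignment column ("-" marks a gap).
tree = (("A", "B"), ("C", "D"))
column = {"A": "K", "B": "K", "C": "K", "D": "-"}
ancestors = up_pass(down_pass(tree, column))
print(ancestors)
```

In this toy column the ancestor of C and D is ambiguous after the first pass (K or gap), and the second pass resolves it using the state chosen at the root, which is the role the abstract assigns to outgroup information.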
Bacteriophage T5 encodes a homolog of the eukaryotic transcription coactivator PC4 implicated in recombination-dependent DNA replication.

PubMed

Steigemann, Birthe; Schulz, Annina; Werten, Sebastiaan

2013-11-15

The RNA polymerase II cofactor PC4 globally regulates transcription of protein-encoding genes through interactions with unwinding DNA, the basal transcription machinery and transcription activators. Here, we report the surprising identification of PC4 homologs in all sequenced representatives of the T5 family of bacteriophages, as well as in an archaeon and seven phyla of eubacteria. We have solved the crystal structure of the full-length T5 protein at 1.9Å, revealing a striking resemblance to the characteristic single-stranded DNA (ssDNA)-binding core domain of PC4. Intriguing novel structural features include a potential regulatory region at the N-terminus and a C-terminal extension of the homodimerisation interface. The genome organisation of T5-related bacteriophages points at involvement of the PC4 homolog in recombination-dependent DNA replication, strongly suggesting that the protein corresponds to the hitherto elusive replicative ssDNA-binding protein of the T5 family. Our findings imply that PC4-like factors intervene in multiple unwinding-related processes by acting as versatile modifiers of nucleic acid conformation and raise the possibility that the eukaryotic transcription coactivator derives from ancestral DNA replication, recombination and repair factors. © 2013.
Compositional pressure and translational selection determine codon usage in the extremely GC-poor unicellular eukaryote Entamoeba histolytica.

PubMed

Romero, H; Zavala, A; Musto, H

2000-01-25

It is widely accepted that the compositional pressure is the only factor shaping codon usage in unicellular species displaying extremely biased genomic compositions. This seems to be the case in the prokaryotes Mycoplasma capricolum, Rickettsia prowazekii and Borrelia burgdorferi (GC-poor), and in Micrococcus luteus (GC-rich). However, in the GC-poor unicellular eukaryotes Dictyostelium discoideum and Plasmodium falciparum, there is evidence that selection, acting at the level of translation, influences codon choices. This is a twofold intriguing finding, since (1) the genomic GC levels of the above mentioned eukaryotes are lower than the GC% of any studied bacteria, and (2) bacteria usually have larger effective population sizes than eukaryotes, and hence natural selection is expected to overcome more efficiently the randomizing effects of genetic drift among prokaryotes than among eukaryotes. In order to gain a new insight about this problem, we analysed the patterns of codon preferences of the nuclear genes of Entamoeba histolytica, a unicellular eukaryote characterised by an extremely AT-rich genome (GC = 25%). The overall codon usage is strongly biased towards A and T in the third codon positions, and among the presumed highly expressed sequences, there is an increased relative usage of a subset of codons, many of which are C-ending. Since an increase in C in third codon positions is 'against' the compositional bias, we conclude that codon usage in E. histolytica, as happens in D. discoideum and P. falciparum, is the result of an equilibrium between compositional pressure and selection. These findings raise the question of why strongly compositionally biased eukaryotic cells may be more sensitive to the (presumed) slight differences among synonymous codons than compositionally biased bacteria.
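The abstract above reasons about base composition at third codon positions. A minimal Python sketch of that quantity (often called GC3) is shown below; the function name and the example sequence are hypothetical, and the input is assumed to be an in-frame coding sequence.

```python
def gc3(cds):
    """Fraction of G or C at third codon positions of an in-frame coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3   # ignore a trailing partial codon
    thirds = cds[2:usable:3]           # bases at positions 3, 6, 9, ...
    if not thirds:
        raise ValueError("sequence is shorter than one codon")
    return sum(base in "GC" for base in thirds) / len(thirds)


# Hypothetical toy sequence: codons ATG, AAA, TTC.
print(gc3("ATGAAATTC"))
```

Comparing this value between presumed highly expressed genes and the rest of the genome is one simple way to look for the kind of C-ending codon enrichment the abstract describes.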
Comparative Bacterial Proteomics: Analysis of the Core Genome Concept

PubMed Central

Callister, Stephen J.; McCue, Lee Ann; Turse, Joshua E.; Monroe, Matthew E.; Auberry, Kenneth J.; Smith, Richard D.; Adkins, Joshua N.; Lipton, Mary S.

2008-01-01

While comparative bacterial genomic studies commonly predict a set of genes indicative of common ancestry, experimental validation of the existence of this core genome requires extensive measurement and is typically not undertaken. Enabled by an extensive proteome database developed over six years, we have experimentally verified the expression of proteins predicted from genomic ortholog comparisons among 17 environmental and pathogenic bacteria. More exclusive relationships were observed among the expressed protein content of phenotypically related bacteria, which is indicative of the specific lifestyles associated with these organisms. Although genomic studies can establish relative orthologous relationships among a set of bacteria and propose a set of ancestral genes, our proteomics study establishes expressed lifestyle differences among conserved genes and proposes a set of expressed ancestral traits. PMID:18253490
Asymptotic Distributions of Coalescence Times and Ancestral Lineage Numbers for Populations with Temporally Varying Size

PubMed Central

Chen, Hua; Chen, Kun

2013-01-01

The distributions of coalescence times and ancestral lineage numbers play an essential role in coalescent modeling and ancestral inference. Both exact distributions of coalescence times and ancestral lineage numbers are expressed as the sum of alternating series, and the terms in the series become numerically intractable for large samples. More computationally attractive are their asymptotic distributions, which were derived in Griffiths (1984) for populations with constant size. In this article, we derive the asymptotic distributions of coalescence times and ancestral lineage numbers for populations with temporally varying size. For a sample of size n, denote by Tm the mth coalescent time, when m + 1 lineages coalesce into m lineages, and An(t) the number of ancestral lineages at time t back from the current generation. Similar to the results in Griffiths (1984), the number of ancestral lineages, An(t), and the coalescence times, Tm, are asymptotically normal, with the mean and variance of these distributions depending on the population size function, N(t). At the very early stage of the coalescent, when t → 0, the number of coalesced lineages n − An(t) follows a Poisson distribution, and as m → n, n(n−1)Tm/2N(0) follows a gamma distribution. We demonstrate the accuracy of the asymptotic approximations by comparing to both exact distributions and coalescent simulations. Several applications of the theoretical results are also shown: deriving statistics related to the properties of gene genealogies, such as the time to the most recent common ancestor (TMRCA) and the total branch length (TBL) of the genealogy, and deriving the allele frequency spectrum for large genealogies. With the advent of genomic-level sequencing data for large samples, the asymptotic distributions are expected to have wide applications in theoretical and methodological development for population genetic inference. PMID:23666939
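For orientation, the constant-size case that the abstract above takes as its starting point has simple closed-form expectations for the TMRCA and the total branch length. The sketch below assumes the standard Kingman coalescent with time measured in coalescent units, so that the waiting time while k lineages remain is exponential with rate k(k-1)/2. It does not implement the paper's asymptotic results for varying population size, and the function names are hypothetical.

```python
def expected_tmrca(n):
    """Expected time to the most recent common ancestor for a sample of size n."""
    # Sum of expected waiting times 2 / (k(k-1)) while k = n, ..., 2 lineages remain.
    return sum(2.0 / (k * (k - 1)) for k in range(2, n + 1))


def expected_tbl(n):
    """Expected total branch length of the genealogy for a sample of size n."""
    # Each interval with k lineages contributes k times its expected duration.
    return sum(k * 2.0 / (k * (k - 1)) for k in range(2, n + 1))


for n in (2, 10, 100):
    print(n, expected_tmrca(n), expected_tbl(n))
```

The first sum telescopes to 2(1 - 1/n) and the second equals twice the harmonic sum up to n - 1, so the TMRCA saturates quickly while the total branch length keeps growing logarithmically with sample size.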
Asymptotic distributions of coalescence times and ancestral lineage numbers for populations with temporally varying size.

PubMed

Chen, Hua; Chen, Kun

2013-07-01

The distributions of coalescence times and ancestral lineage numbers play an essential role in coalescent modeling and ancestral inference. Both exact distributions of coalescence times and ancestral lineage numbers are expressed as the sum of alternating series, and the terms in the series become numerically intractable for large samples. More computationally attractive are their asymptotic distributions, which were derived in Griffiths (1984) for populations with constant size. In this article, we derive the asymptotic distributions of coalescence times and ancestral lineage numbers for populations with temporally varying size. For a sample of size n, denote by Tm the mth coalescent time, when m + 1 lineages coalesce into m lineages, and An(t) the number of ancestral lineages at time t back from the current generation. Similar to the results in Griffiths (1984), the number of ancestral lineages, An(t), and the coalescence times, Tm, are asymptotically normal, with the mean and variance of these distributions depending on the population size function, N(t). At the very early stage of the coalescent, when t → 0, the number of coalesced lineages n - An(t) follows a Poisson distribution, and as m → n, n(n-1)Tm/2N(0) follows a gamma distribution. We demonstrate the accuracy of the asymptotic approximations by comparing to both exact distributions and coalescent simulations. Several applications of the theoretical results are also shown: deriving statistics related to the properties of gene genealogies, such as the time to the most recent common ancestor (TMRCA) and the total branch length (TBL) of the genealogy, and deriving the allele frequency spectrum for large genealogies. With the advent of genomic-level sequencing data for large samples, the asymptotic distributions are expected to have wide applications in theoretical and methodological development for population genetic inference.
RNase MRP and the RNA processing cascade in the eukaryotic ancestor.

PubMed

Woodhams, Michael D; Stadler, Peter F; Penny, David; Collins, Lesley J

2007-02-08

Within eukaryotes there is a complex cascade of RNA-based macromolecules that process other RNA molecules, especially mRNA, tRNA and rRNA. An example is RNase MRP processing ribosomal RNA (rRNA) in ribosome biogenesis. One hypothesis is that this complexity was present early in eukaryotic evolution; an alternative is that an initial simpler network later gained complexity by gene duplication in lineages that led to animals, fungi and plants. Recently there has been a rapid increase in support for the complexity-early theory because the vast majority of these RNA-processing reactions are found throughout eukaryotes, and thus were likely to be present in the last common ancestor of living eukaryotes, herein called the Eukaryotic Ancestor. We present an overview of the RNA processing cascade in the Eukaryotic Ancestor and investigate, in particular, RNase MRP, which was previously thought to have evolved later in eukaryotes due to its apparently limited distribution in fungi, animals and plants. Recent publications, as well as our own genomic searches, find previously unknown RNase MRP RNAs, indicating that RNase MRP has a wide distribution in eukaryotes. Combining secondary structure and promoter region analysis of RNAs for RNase MRP, along with analysis of the target substrate (rRNA), allows us to discuss this distribution in the light of eukaryotic evolution. We conclude that RNase MRP can now be placed in the RNA-processing cascade of the Eukaryotic Ancestor, highlighting the complexity of RNA-processing in early eukaryotes. Promoter analyses of MRP-RNA suggest that regulation of the critical processes of rRNA cleavage can vary, showing that even these key cellular processes (for which we expect high conservation) show some species-specific variability. We present our consensus MRP-RNA secondary structure as a useful model for further searches.
Recreating a functional ancestral archosaur visual pigment.

PubMed

Chang, Belinda S W; Jönsson, Karolina; Kazmi, Manija A; Donoghue, Michael J; Sakmar, Thomas P

2002-09-01

The ancestors of the archosaurs, a major branch of the diapsid reptiles, originated more than 240 MYA near the dawn of the Triassic Period. We used maximum likelihood phylogenetic ancestral reconstruction methods and explored different models of evolution for inferring the amino acid sequence of a putative ancestral archosaur visual pigment. Three different types of maximum likelihood models were used: nucleotide-based, amino acid-based, and codon-based models. Where possible, within each type of model, likelihood ratio tests were used to determine which model best fit the data. Ancestral reconstructions of the ancestral archosaur node using the best-fitting models of each type were found to be in agreement, except for three amino acid residues at which one reconstruction differed from the other two. To determine if these ancestral pigments would be functionally active, the corresponding genes were chemically synthesized and then expressed in a mammalian cell line in tissue culture. The expressed artificial genes were all found to bind to 11-cis-retinal to yield stable photoactive pigments with lambda(max) values of about 508 nm, which is slightly redshifted relative to that of extant vertebrate pigments. The ancestral archosaur pigments also activated the retinal G protein transducin, as measured in a fluorescence assay. Our results show that ancestral genes from ancient organisms can be reconstructed de novo and tested for function using a combination of phylogenetic and biochemical methods.
Archaeal “Dark Matter” and the Origin of Eukaryotes

PubMed Central

Williams, Tom A.; Embley, T. Martin

2014-01-01

Current hypotheses about the history of cellular life are mainly based on analyses of cultivated organisms, but these represent only a small fraction of extant biodiversity. The sequencing of new environmental lineages therefore provides an opportunity to test, revise, or reject existing ideas about the tree of life and the origin of eukaryotes. According to the textbook three domains hypothesis, the eukaryotes emerge as the sister group to a monophyletic Archaea. However, recent analyses incorporating better phylogenetic models and an improved sampling of the archaeal domain have generally supported the competing eocyte hypothesis, in which core genes of eukaryotic cells originated from within the Archaea, with important implications for eukaryogenesis. Given this trend, it was surprising that a recent analysis incorporating new genomes from uncultivated Archaea recovered a strongly supported three domains tree. Here, we show that this result was due in part to the use of a poorly fitting phylogenetic model and also to the inclusion by an automated pipeline of genes of putative bacterial origin rather than nucleocytosolic versions for some of the eukaryotes analyzed. When these issues were resolved, analyses including the new archaeal lineages placed core eukaryotic genes within the Archaea. These results are consistent with a number of recent studies in which improved archaeal sampling and better phylogenetic models agree in supporting the eocyte tree over the three domains hypothesis. PMID:24532674
Rab protein evolution and the history of the eukaryotic endomembrane system

PubMed Central

Brighouse, Andrew; Dacks, Joel B.

2010-01-01

Spectacular increases in the quantity of genome sequence data have facilitated major advances in eukaryotic comparative genomics. By exploiting homology with classical model organisms, this makes possible predictions of pathways and cellular functions currently impossible to address in intractable organisms. Echoing the realization that core metabolic processes were established very early following the evolution of life on earth, it is now emerging that many eukaryotic cellular features, including the endomembrane system, are ancient and organized around near-universal principles. Rab proteins are key mediators of vesicle transport and specificity, and via the presence of multiple paralogues, alterations in interaction specificity and modification of pathways, contribute greatly to the evolution of complexity of membrane transport. Understanding system-level contributions of Rab proteins to evolutionary history provides insight into the multiple processes sculpting cellular transport pathways and the exciting challenges that we face in delving further into the origins of membrane trafficking specificity. PMID:20582450
pico-PLAZA, a genome database of microbial photosynthetic eukaryotes.

PubMed

Vandepoele, Klaas; Van Bel, Michiel; Richard, Guilhem; Van Landeghem, Sofie; Verhelst, Bram; Moreau, Hervé; Van de Peer, Yves; Grimsley, Nigel; Piganeau, Gwenael

2013-08-01

With the advent of next generation genome sequencing, the number of sequenced algal genomes and transcriptomes is rapidly growing. Although a few genome portals exist to browse individual genome sequences, exploring complete genome information from multiple species for the analysis of user-defined sequences or gene lists remains a major challenge. pico-PLAZA is a web-based resource (http://bioinformatics.psb.ugent.be/pico-plaza/) for algal genomics that combines different data types with intuitive tools to explore genomic diversity, perform integrative evolutionary sequence analysis and study gene functions. Apart from homologous gene families, multiple sequence alignments, phylogenetic trees, Gene Ontology, InterPro and text-mining functional annotations, different interactive viewers are available to study genome organization using gene collinearity and synteny information. Different search functions, documentation pages, export functions and an extensive glossary are available to guide non-expert scientists. To illustrate the versatility of the platform, different case studies are presented demonstrating how pico-PLAZA can be used to functionally characterize large-scale EST/RNA-Seq data sets and to perform environmental genomics. Functional enrichments analysis of 16 Phaeodactylum tricornutum transcriptome libraries offers a molecular view on diatom adaptation to different environments of ecological relevance. Furthermore, we show how complementary genomic data sources can easily be combined to identify marker genes to study the diversity and distribution of algal species, for example in metagenomes, or to quantify intraspecific diversity from environmental strains. © 2013 John Wiley & Sons Ltd and Society for Applied Microbiology.
Three crocodilian genomes reveal ancestral patterns of evolution among archosaurs.

PubMed

Green, Richard E; Braun, Edward L; Armstrong, Joel; Earl, Dent; Nguyen, Ngan; Hickey, Glenn; Vandewege, Michael W; St John, John A; Capella-Gutiérrez, Salvador; Castoe, Todd A; Kern, Colin; Fujita, Matthew K; Opazo, Juan C; Jurka, Jerzy; Kojima, Kenji K; Caballero, Juan; Hubley, Robert M; Smit, Arian F; Platt, Roy N; Lavoie, Christine A; Ramakodi, Meganathan P; Finger, John W; Suh, Alexander; Isberg, Sally R; Miles, Lee; Chong, Amanda Y; Jaratlerdsiri, Weerachai; Gongora, Jaime; Moran, Christopher; Iriarte, Andrés; McCormack, John; Burgess, Shane C; Edwards, Scott V; Lyons, Eric; Williams, Christina; Breen, Matthew; Howard, Jason T; Gresham, Cathy R; Peterson, Daniel G; Schmitz, Jürgen; Pollock, David D; Haussler, David; Triplett, Eric W; Zhang, Guojie; Irie, Naoki; Jarvis, Erich D; Brochu, Christopher A; Schmidt, Carl J; McCarthy, Fiona M; Faircloth, Brant C; Hoffmann, Federico G; Glenn, Travis C; Gabaldón, Toni; Paten, Benedict; Ray, David A

2014-12-12

To provide context for the diversification of archosaurs--the group that includes crocodilians, dinosaurs, and birds--we generated draft genomes of three crocodilians: Alligator mississippiensis (the American alligator), Crocodylus porosus (the saltwater crocodile), and Gavialis gangeticus (the Indian gharial). We observed an exceptionally slow rate of genome evolution within crocodilians at all levels, including nucleotide substitutions, indels, transposable element content and movement, gene family evolution, and chromosomal synteny. When placed within the context of related taxa including birds and turtles, this suggests that the common ancestor of all of these taxa also exhibited slow genome evolution and that the comparatively rapid evolution is derived in birds. The data also provided the opportunity to analyze heterozygosity in crocodilians, which indicates a likely reduction in population size for all three taxa through the Pleistocene. Finally, these data combined with newly published bird genomes allowed us to reconstruct the partial genome of the common ancestor of archosaurs, thereby providing a tool to investigate the genetic starting material of crocodilians, birds, and dinosaurs. Copyright © 2014, American Association for the Advancement of Science.
Comparative Genomic Analysis Reveals a Diverse Repertoire of Genes Involved in Prokaryote-Eukaryote Interactions within the Pseudovibrio Genus

PubMed Central

Romano, Stefano; Fernàndez-Guerra, Antonio; Reen, F. Jerry; Glöckner, Frank O.; Crowley, Susan P.; O'Sullivan, Orla; Cotter, Paul D.; Adams, Claire; Dobson, Alan D. W.; O'Gara, Fergal

2016-01-01

Strains of the Pseudovibrio genus have been detected worldwide, mainly as part of bacterial communities associated with marine invertebrates, particularly sponges. This recurrent association has been considered as an indication of a symbiotic relationship between these microbes and their host. Until recently, the availability of only two genomes, belonging to closely related strains, has limited the knowledge on the genomic and physiological features of the genus to a single phylogenetic lineage. Here we present 10 newly sequenced genomes of Pseudovibrio strains isolated from marine sponges from the west coast of Ireland, and including the other two publicly available genomes we performed an extensive comparative genomic analysis. Homogeneity was apparent in terms of both the orthologous genes and the metabolic features shared amongst the 12 strains. At the genomic level, a key physiological difference observed amongst the isolates was the presence only in strain P. axinellae AD2 of genes encoding proteins involved in assimilatory nitrate reduction, which was then proved experimentally. We then focused on studying those systems known to be involved in the interactions with eukaryotic and prokaryotic cells. This analysis revealed that the genus harbors a large diversity of toxin-like proteins, secretion systems and their potential effectors. Their distribution in the genus was not always consistent with the phylogenetic relationship of the strains. Finally, our analyses identified new genomic islands encoding potential toxin-immunity systems, previously unknown in the genus. Our analyses shed new light on the Pseudovibrio genus, indicating a large diversity of both metabolic features and systems for interacting with the host. The diversity in both distribution and abundance of these systems amongst the strains underlines how metabolically and phylogenetically similar bacteria may use different strategies to interact with the host and find a niche within its
Conserved Gene Order and Expanded Inverted Repeats Characterize Plastid Genomes of Thalassiosirales

PubMed Central

Ashworth, Matt P.; Baeshen, Nabih A.; Baeshen, Mohammad N.; Bahieldin, Ahmed; Theriot, Edward C.; Jansen, Robert K.

2014-01-01

Diatoms are mostly photosynthetic eukaryotes within the heterokont lineage. Variable plastid genome sizes and extensive genome rearrangements have been observed across the diatom phylogeny, but little is known about plastid genome evolution within order- or family-level clades. The Thalassiosirales is one of the more comprehensively studied orders in terms of both genetics and morphology. Seven complete diatom plastid genomes are reported here including four Thalassiosirales: Thalassiosira weissflogii, Roundia cardiophora, Cyclotella sp. WC03_2, Cyclotella sp. L04_2, and three additional non-Thalassiosirales species Chaetoceros simplex, Cerataulina daemon, and Rhizosolenia imbricata. The sizes of the seven genomes vary from 116,459 to 129,498 bp, and their genomes are compact and lack introns. The larger size of the plastid genomes of Thalassiosirales compared to other diatoms is due primarily to expansion of the inverted repeat. Gene content within Thalassiosirales is more conserved compared to other diatom lineages. Gene order within Thalassiosirales is highly conserved except for the extensive genome rearrangement in Thalassiosira oceanica. Cyclotella nana, Thalassiosira weissflogii and Roundia cardiophora share an identical gene order, which is inferred to be the ancestral order for the Thalassiosirales, differing from that of the other two Cyclotella species by a single inversion. The genes ilvB and ilvH are missing in all six diatom plastid genomes except for Cerataulina daemon, suggesting an independent gain of these genes in this species. The acpP1 gene is missing in all Thalassiosirales, suggesting that its loss may be a synapomorphy for the order and this gene may have been functionally transferred to the nucleus. Three genes involved in photosynthesis, psaE, psaI, psaM, are missing in Rhizosolenia imbricata, which represents the first documented instance of the loss of photosynthetic genes in diatom plastid genomes. PMID:25233465
Genome defense against exogenous nucleic acids in eukaryotes by non-coding DNA occurs through CRISPR-like mechanisms in the cytosol and the bodyguard protection in the nucleus.

PubMed

Qiu, Guo-Hua

2016-01-01

In this review, the protective function of the abundant non-coding DNA in the eukaryotic genome is discussed from the perspective of genome defense against exogenous nucleic acids. Peripheral non-coding DNA has been proposed to act as a bodyguard that protects the genome and the central protein-coding sequences from ionizing radiation-induced DNA damage. In the proposed mechanism of protection, the radicals generated by water radiolysis in the cytosol and IR energy are absorbed, blocked and/or reduced by peripheral heterochromatin; then, the DNA damage sites in the heterochromatin are removed and expelled from the nucleus to the cytoplasm through nuclear pore complexes, most likely through the formation of extrachromosomal circular DNA. To strengthen this hypothesis, this review summarizes the experimental evidence supporting the protective function of non-coding DNA against exogenous nucleic acids. Based on these data, I hypothesize herein about the presence of an additional line of defense formed by small RNAs in the cytosol in addition to their bodyguard protection mechanism in the nucleus. Therefore, exogenous nucleic acids may be initially inactivated in the cytosol by small RNAs generated from non-coding DNA via mechanisms similar to the prokaryotic CRISPR-Cas system. Exogenous nucleic acids may enter the nucleus, where some are absorbed and/or blocked by heterochromatin and others integrate into chromosomes. The integrated fragments and the sites of DNA damage are removed by repetitive non-coding DNA elements in the heterochromatin and excluded from the nucleus. Therefore, the normal eukaryotic genome and the central protein-coding sequences are triply protected by non-coding DNA against invasion by exogenous nucleic acids. This review provides evidence supporting the protective role of non-coding DNA in genome defense. Copyright © 2016 Elsevier B.V. All rights reserved.

Structural Genomics: Correlation Blocks, Population Structure, and Genome Architecture

PubMed Central

Hu, Xin-Sheng; Yeh, Francis C.; Wang, Zhiquan

2011-01-01

An integration of the pattern of genome-wide inter-site associations with evolutionary forces is important for gaining insights into the genomic evolution in natural or artificial populations. Here, we assess the inter-site correlation blocks and their distributions along chromosomes. A correlation block is broadly defined as a DNA segment within which strong correlations exist between genetic diversities at any two sites. We bring together the population genetic structure and the genomic diversity structure that have been independently built on different scales and synthesize the existing theories and methods for characterizing genomic structure at the population level. We discuss how population structure could shape correlation blocks and their patterns within and between populations. Effects of evolutionary forces (selection, migration, genetic drift, and mutation) on the pattern of genome-wide correlation blocks are discussed. We briefly discuss the associations between the pattern of correlation blocks and genome assembly features in eukaryotes, including the impacts of multigene families, the perturbation of transposable elements, and the repetitive nongenic sequences and GC-rich isochores. Our review suggests that the observable pattern of correlation blocks can refine our understanding of the ecological and evolutionary processes underlying the genomic evolution at the population level. PMID:21886455
Global transcriptome analysis of eukaryotic genes affected by gromwell extract.

PubMed

Bang, Soohyun; Lee, Dohyun; Kim, Hanhe; Park, Jiyong; Bahn, Yong-Sun

2014-02-01

Gromwell is known to have diverse pharmacological, cosmetic and nutritional benefits for humans. Nevertheless, the biological influence of gromwell extract (GE) on the general physiology of eukaryotic cells remains unknown. In this study a global transcriptome analysis was performed to identify genes affected by the addition of GE with Cryptococcus neoformans as the model system. In response to GE treatment, genes involved in signal transduction were immediately regulated, and the evolutionarily conserved sets of genes involved in the core cellular functions, including DNA replication, RNA transcription/processing and protein translation/processing, were generally up-regulated. In contrast, a number of genes involved in carbohydrate metabolism and transport, inorganic ion transport and metabolism, post-translational modification/protein turnover/chaperone functions and signal transduction were down-regulated. Among the GE-responsive genes that are also evolutionarily conserved in the human genome, the expression patterns of YSA1, TPO2, CFO1 and PZF1 were confirmed by northern blot analysis. Based on the functional characterization of some GE-responsive genes, it was found that GE treatment may promote cellular tolerance against a variety of environmental stresses in eukaryotes. GE treatment affects the expression levels of a significant portion of the Cryptococcus genome, implying that GE significantly affects the general physiology of eukaryotic cells. © 2013 Society of Chemical Industry.
Spy: a new group of eukaryotic DNA transposons without target site duplications.

PubMed

Han, Min-Jin; Xu, Hong-En; Zhang, Hua-Hao; Feschotte, Cédric; Zhang, Ze

2014-06-24

Class 2 or DNA transposons populate the genomes of most eukaryotes and like other mobile genetic elements have a profound impact on genome evolution. Most DNA transposons belong to the cut-and-paste types, which are relatively simple elements characterized by terminal-inverted repeats (TIRs) flanking a single gene encoding a transposase. All eukaryotic cut-and-paste transposons so far described are also characterized by target site duplications (TSDs) of host DNA generated upon chromosomal insertion. Here, we report a new group of evolutionarily related DNA transposons called Spy, which also include TIRs and DDE motif-containing transposase but surprisingly do not create TSDs upon insertion. Instead, Spy transposons appear to transpose precisely between 5'-AAA and TTT-3' host nucleotides, without duplication or modification of the AAATTT target sites. Spy transposons were identified in the genomes of diverse invertebrate species based on transposase homology searches and structure-based approaches. Phylogenetic analyses indicate that Spy transposases are distantly related to IS5, ISL2EU, and PIF/Harbinger transposases. However, Spy transposons are distinct from these and other DNA transposon superfamilies by their lack of TSD and their target site preference. Our findings expand the known diversity of DNA transposons and reveal a new group of eukaryotic DDE transposases with unusual catalytic properties. © The Author(s) 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
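The target-site behaviour described in this abstract (insertion between 5'-AAA and TTT-3' with no target site duplication) can be made concrete with a short sketch. This is an illustration only and does not reproduce the authors' homology-based or structure-based searches; the function names and the example sequences are hypothetical.

```python
def candidate_spy_sites(seq):
    """Return 0-based insertion points lying between AAA and TTT.

    Every occurrence of AAATTT is reported, including overlapping ones;
    the insertion point is the index of the first T.
    """
    seq = seq.upper()
    sites = []
    start = seq.find("AAATTT")
    while start != -1:
        sites.append(start + 3)
        start = seq.find("AAATTT", start + 1)
    return sites

def insert_without_tsd(seq, pos, element):
    """Model an insertion that neither duplicates nor modifies the host site."""
    return seq[:pos] + element + seq[pos:]

if __name__ == "__main__":
    host = "GGCAAATTTCCG"          # made-up host sequence
    element = "[TIR-transposase-TIR]"  # placeholder, not a real element
    for pos in candidate_spy_sites(host):
        print(insert_without_tsd(host, pos, element))
```

After the modelled insertion the element is flanked by AAA on the left and TTT on the right, and removing it restores the original AAATTT site, which is what distinguishes this pattern from insertions that create target site duplications.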
Neandertal Admixture in Eurasia Confirmed by Maximum-Likelihood Analysis of Three Genomes

PubMed Central

Lohse, Konrad; Frantz, Laurent A. F.

2014-01-01

Although there has been much interest in estimating histories of divergence and admixture from genomic data, it has proved difficult to distinguish recent admixture from long-term structure in the ancestral population. Thus, recent genome-wide analyses based on summary statistics have sparked controversy about the possibility of interbreeding between Neandertals and modern humans in Eurasia. Here we derive the probability of full mutational configurations in nonrecombining sequence blocks under both admixture and ancestral structure scenarios. Dividing the genome into short blocks gives an efficient way to compute maximum-likelihood estimates of parameters. We apply this likelihood scheme to triplets of human and Neandertal genomes and compare the relative support for a model of admixture from Neandertals into Eurasian populations after their expansion out of Africa against a history of persistent structure in their common ancestral population in Africa. Our analysis allows us to conclusively reject a model of ancestral structure in Africa and instead reveals strong support for Neandertal admixture in Eurasia at a higher rate (3.4−7.3%) than suggested previously. Using analysis and simulations we show that our inference is more powerful than previous summary statistics and robust to realistic levels of recombination. PMID:24532731
Macroevolutionary trends of atomic composition and related functional group proportion in eukaryotic and prokaryotic proteins.

PubMed

Zhang, Yu-Juan; Yang, Chun-Lin; Hao, You-Jin; Li, Ying; Chen, Bin; Wen, Jian-Fan

2014-01-25

To fully explore the trends of atomic composition during the macroevolution from prokaryote to eukaryote, five atoms (oxygen, sulfur, nitrogen, carbon, hydrogen) and related functional groups in prokaryotic and eukaryotic proteins were surveyed and compared. Genome-wide analysis showed that eukaryotic proteins have more oxygen, sulfur and nitrogen atoms than prokaryotic proteins do. Clusters of Orthologous Groups (COG) analysis revealed that oxygen, sulfur, carbon and hydrogen frequencies are higher in eukaryotic proteins than in their prokaryotic orthologs. Furthermore, functional group analysis demonstrated that eukaryotic proteins tend to have higher proportions of sulfhydryl, hydroxyl and acylamino groups, but lower proportions of sulfide and carboxyl groups. Taken together, an apparent trend of increase was observed for oxygen and sulfur atoms in the macroevolution; the variation of oxygen and sulfur compositions and their related functional groups in macroevolution made eukaryotic proteins carry more useful functional groups. These results will be helpful for better understanding the functional significance of atomic composition evolution. Copyright © 2013 Elsevier B.V. All rights reserved.
Shared Subgenome Dominance Following Polyploidization Explains Grass Genome Evolutionary Plasticity from a Seven Protochromosome Ancestor with 16K Protogenes

PubMed Central

Murat, Florent; Zhang, Rongzhi; Guizard, Sébastien; Flores, Raphael; Armero, Alix; Pont, Caroline; Steinbach, Delphine; Quesneville, Hadi; Cooke, Richard; Salse, Jerome

2013-01-01

Modern plant genomes are diploidized paleopolyploids. We revisited grass genome paleohistory in response to the diploidization process through a detailed investigation of the evolutionary fate of duplicated blocks. Ancestrally duplicated genes can be conserved, deleted, and shuffled, defining dominant (bias toward duplicate retention) and sensitive (bias toward duplicate erosion) chromosomal fragments. We propose a new grass genome paleohistory deriving from an ancestral karyotype structured in seven protochromosomes containing 16,464 protogenes and following evolutionary rules where 1) ancestral shared polyploidizations shaped conserved dominant (D) and sensitive (S) subgenomes, 2) subgenome dominance is revealed by both gene deletion and shuffling from the S blocks, 3) duplicate deletion/movement may have been mediated by single-/double-stranded illegitimate recombination mechanisms, 4) modern genomes arose through centromeric fusion of protochromosomes, leading to functional monocentric neochromosomes, 5) the fusion of two dominant blocks leads to supradominant neochromosomes (D + D = D) with higher ancestral gene retention compared with D + S = D (i.e., fusion of blocks with opposite sensitivity) or even S + S = S (i.e., fusion of two sensitive ancestral blocks). A new user-friendly online tool named “PlantSyntenyViewer,” available at http://urgi.versailles.inra.fr/synteny-cereal, presents the refined comparative genomics data. PMID:24317974
De novo Assembly of a 40 Mb Eukaryotic Genome from Short Sequence Reads: Sordaria macrospora, a Model Organism for Fungal Morphogenesis

PubMed Central

Nowrousian, Minou; Stajich, Jason E.; Chu, Meiling; Engh, Ines; Espagne, Eric; Halliday, Karen; Kamerewerd, Jens; Kempken, Frank; Knab, Birgit; Kuo, Hsiao-Che; Osiewacz, Heinz D.; Pöggeler, Stefanie; Read, Nick D.; Seiler, Stephan; Smith, Kristina M.; Zickler, Denise; Kück, Ulrich; Freitag, Michael

2010-01-01

Filamentous fungi are of great importance in ecology, agriculture, medicine, and biotechnology. Thus, it is not surprising that genomes for more than 100 filamentous fungi have been sequenced, most of them by Sanger sequencing. While next-generation sequencing techniques have revolutionized genome resequencing, e.g. for strain comparisons, genetic mapping, or transcriptome and ChIP analyses, de novo assembly of eukaryotic genomes still presents significant hurdles, because of their large size and stretches of repetitive sequences. Filamentous fungi contain few repetitive regions in their 30–90 Mb genomes and thus are suitable candidates to test de novo genome assembly from short sequence reads. Here, we present a high-quality draft sequence of the Sordaria macrospora genome that was obtained by a combination of Illumina/Solexa and Roche/454 sequencing. Paired-end Solexa sequencing of genomic DNA to 85-fold coverage and an additional 10-fold coverage by single-end 454 sequencing resulted in ∼4 Gb of DNA sequence. Reads were assembled to a 40 Mb draft version (N50 of 117 kb) with the Velvet assembler. Comparative analysis with Neurospora genomes increased the N50 to 498 kb. The S. macrospora genome contains even fewer repeat regions than its closest sequenced relative, Neurospora crassa. Comparison with genomes of other fungi showed that S. macrospora, a model organism for morphogenesis and meiosis, harbors duplications of several genes involved in self/nonself-recognition. Furthermore, S. macrospora contains more polyketide biosynthesis genes than N. crassa. Phylogenetic analyses suggest that some of these genes may have been acquired by horizontal gene transfer from a distantly related ascomycete group. Our study shows that, for typical filamentous fungi, de novo assembly of genomes from short sequence reads alone is feasible, that a mixture of Solexa and 454 sequencing substantially improves the assembly, and that the resulting data can be used for comparative
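The N50 values quoted in this abstract (117 kb, rising to 498 kb) summarize assembly contiguity. As a minimal sketch, assuming the common definition of N50 as the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length, it could be computed as follows. The function name and the contig lengths are made up for illustration and are not taken from the S. macrospora assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("n50 requires at least one contig length")

if __name__ == "__main__":
    contigs = [2, 2, 2, 3, 3, 4, 8, 8]  # hypothetical lengths, arbitrary units
    print(n50(contigs))
```

Because N50 depends only on the length distribution, joining contigs into longer scaffolds raises it without adding any sequence.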
De novo assembly of a 40 Mb eukaryotic genome from short sequence reads: Sordaria macrospora, a model organism for fungal morphogenesis.

PubMed

Nowrousian, Minou; Stajich, Jason E; Chu, Meiling; Engh, Ines; Espagne, Eric; Halliday, Karen; Kamerewerd, Jens; Kempken, Frank; Knab, Birgit; Kuo, Hsiao-Che; Osiewacz, Heinz D; Pöggeler, Stefanie; Read, Nick D; Seiler, Stephan; Smith, Kristina M; Zickler, Denise; Kück, Ulrich; Freitag, Michael

2010-04-08

Filamentous fungi are of great importance in ecology, agriculture, medicine, and biotechnology. Thus, it is not surprising that genomes for more than 100 filamentous fungi have been sequenced, most of them by Sanger sequencing. While next-generation sequencing techniques have revolutionized genome resequencing, e.g. for strain comparisons, genetic mapping, or transcriptome and ChIP analyses, de novo assembly of eukaryotic genomes still presents significant hurdles, because of their large size and stretches of repetitive sequences. Filamentous fungi contain few repetitive regions in their 30-90 Mb genomes and thus are suitable candidates to test de novo genome assembly from short sequence reads. Here, we present a high-quality draft sequence of the Sordaria macrospora genome that was obtained by a combination of Illumina/Solexa and Roche/454 sequencing. Paired-end Solexa sequencing of genomic DNA to 85-fold coverage and an additional 10-fold coverage by single-end 454 sequencing resulted in approximately 4 Gb of DNA sequence. Reads were assembled to a 40 Mb draft version (N50 of 117 kb) with the Velvet assembler. Comparative analysis with Neurospora genomes increased the N50 to 498 kb. The S. macrospora genome contains even fewer repeat regions than its closest sequenced relative, Neurospora crassa. Comparison with genomes of other fungi showed that S. macrospora, a model organism for morphogenesis and meiosis, harbors duplications of several genes involved in self/nonself-recognition. Furthermore, S. macrospora contains more polyketide biosynthesis genes than N. crassa. Phylogenetic analyses suggest that some of these genes may have been acquired by horizontal gene transfer from a distantly related ascomycete group. Our study shows that, for typical filamentous fungi, de novo assembly of genomes from short sequence reads alone is feasible, that a mixture of Solexa and 454 sequencing substantially improves the assembly, and that the resulting data can be used for
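The abstract above reports assembly quality as N50 (117 kb, rising to 498 kb after comparative analysis). As a reading aid, here is a minimal sketch of how N50 is commonly computed from a list of contig lengths. The function name `n50` and the toy lengths are illustrative assumptions and are not taken from the paper or from the Velvet assembler.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp).

    N50 is the length L such that contigs of length >= L together
    contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)  # longest contigs first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example (hypothetical contig lengths, not data from the study).
contigs = [100, 80, 60, 40, 20]
print(n50(contigs))  # total is 300; 100 + 80 = 180 >= 150, so N50 is 80
```

As a rough consistency check on the reported figures, 85-fold plus 10-fold coverage of a 40 Mb genome corresponds to about 3.8 Gb of sequence, in line with the approximately 4 Gb stated.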
Reconstructed Ancestral Myo-Inositol-3-Phosphate Synthases Indicate That Ancestors of the Thermococcales and Thermotoga Species Were More Thermophilic than Their Descendants

PubMed Central

Butzin, Nicholas C.; Lapierre, Pascal; Green, Anna G.; Swithers, Kristen S.; Gogarten, J. Peter; Noll, Kenneth M.

2013-01-01

The bacterial genomes of Thermotoga species show evidence of significant interdomain horizontal gene transfer from the Archaea. Members of this genus acquired many genes from the Thermococcales, which grow at higher temperatures than Thermotoga species. In order to study the functional history of an interdomain horizontally acquired gene we used ancestral sequence reconstruction to examine the thermal characteristics of reconstructed ancestral proteins of the Thermotoga lineage and its archaeal donors. Several ancestral sequence reconstruction methods were used to determine the possible sequences of the ancestral Thermotoga and Archaea myo-inositol-3-phosphate synthase (MIPS). These sequences were predicted to be more thermostable than the extant proteins using an established sequence composition method. We verified these computational predictions by measuring the activities and thermostabilities of purified proteins from the Thermotoga and the Thermococcales species, and eight ancestral reconstructed proteins. We found that the ancestral proteins from both the archaeal donor and the Thermotoga most recent common ancestor recipient were more thermostable than their descendants. We show that there is a correlation between the thermostability of the MIPS protein and the optimal growth temperature (OGT) of its host, which suggests that the ancestors of these species of Archaea and the Thermotoga grew at higher OGTs than their descendants. PMID:24391933
Switch on the engine: how the eukaryotic replicative helicase MCM2-7 becomes activated.

PubMed

Tognetti, Silvia; Riera, Alberto; Speck, Christian

2015-03-01

A crucial step during eukaryotic initiation of DNA replication is the correct loading and activation of the replicative DNA helicase, which ensures that each replication origin fires only once. Unregulated DNA helicase loading and activation, as it occurs in cancer, can cause severe DNA damage and genomic instability. The essential mini-chromosome maintenance proteins 2-7 (MCM2-7) represent the core of the eukaryotic replicative helicase that is loaded at DNA replication origins during G1-phase of the cell cycle. The MCM2-7 helicase activity, however, is only triggered during S-phase once the holo-helicase Cdc45-MCM2-7-GINS (CMG) has been formed. A large number of factors and several kinases interact and contribute to CMG formation and helicase activation, though the exact mechanisms remain unclear. Crucially, upon DNA damage, this reaction is temporarily halted to ensure genome integrity. Here, we review the current understanding of helicase activation; we focus on protein interactions during CMG formation, discuss structural changes during helicase activation, and outline similarities and differences of the prokaryotic and eukaryotic helicase activation process.
Gene order in rosid phylogeny, inferred from pairwise syntenies among extant genomes

PubMed Central

2012-01-01

Background: Ancestral gene order reconstruction for flowering plants has lagged behind developments in yeasts, insects and higher animals, because of the recency of widespread plant genome sequencing, sequencers' embargoes on public data use, paralogies due to whole genome duplication (WGD) and fractionation of undeleted duplicates, extensive paralogy from other sources, and the computational cost of existing methods. Results: We address these problems, using the gene order of four core eudicot genomes (cacao, castor bean, papaya and grapevine) that have escaped any recent WGD events, and two others (poplar and cucumber) that descend from independent WGDs, in inferring the ancestral gene order of the rosid clade and those of its main subgroups, the fabids and malvids. We improve and adapt techniques including the OMG method for extracting large, paralogy-free, multiple orthologies from conflated pairwise synteny data among the six genomes and the PATHGROUPS approach for ancestral gene order reconstruction in a given phylogeny, where some genomes may be descendants of WGD events. We use the gene order evidence to evaluate the hypothesis that the order Malpighiales belongs to the malvids rather than as traditionally assigned to the fabids. Conclusions: Gene orders of ancestral eudicot species, involving 10,000 or more genes can be reconstructed in an efficient, parsimonious and consistent way, despite paralogies due to WGD and other processes. Pairwise genomic syntenies provide appropriate input to a parameter-free procedure of multiple ortholog identification followed by gene-order reconstruction in solving instances of the "small phylogeny" problem. PMID:22759433
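The abstract above uses gene order as evidence for ancestral reconstruction. The sketch below shows only the simplest underlying idea, counting gene adjacencies shared between two gene orders. It is not the OMG or PATHGROUPS method; the function names and gene identifiers are hypothetical, and gene orientation and paralogy are ignored for brevity.

```python
def adjacencies(order):
    """Return the set of unordered neighbouring gene pairs in a gene order."""
    return {frozenset(pair) for pair in zip(order, order[1:])}


def shared_adjacencies(order_a, order_b):
    """Return the adjacencies conserved between two gene orders."""
    return adjacencies(order_a) & adjacencies(order_b)


# Hypothetical gene orders for one chromosome segment in two genomes.
genome_a = ["g1", "g2", "g3", "g4", "g5"]
genome_b = ["g1", "g2", "g4", "g3", "g5"]  # g3 and g4 are swapped

conserved = shared_adjacencies(genome_a, genome_b)
print(len(conserved))  # g1-g2 and g3-g4 are still neighbours, so 2
```

Adjacencies conserved across many extant genomes are natural candidates for the ancestral order, which is the intuition that parsimony-based reconstruction methods build on.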
Massive expansion of the calpain gene family in unicellular eukaryotes.

PubMed

Zhao, Sen; Liang, Zhe; Demko, Viktor; Wilson, Robert; Johansen, Wenche; Olsen, Odd-Arne; Shalchian-Tabrizi, Kamran

2012-09-29

Calpains are Ca2+-dependent cysteine proteases that participate in a range of crucial cellular processes. Dysfunction of these enzymes may cause, for instance, life-threatening diseases in humans, the loss of sex determination in nematodes and embryo lethality in plants. Although the calpain family is well characterized in animal and plant model organisms, there is a great lack of knowledge about these genes in unicellular eukaryote species (i.e. protists). Here, we study the distribution and evolution of calpain genes in a wide range of eukaryote genomes from major branches in the tree of life. Our investigations reveal 24 types of protein domains that are combined with the calpain-specific catalytic domain CysPc. In total we identify 41 different calpain domain architectures, 28 of which have not been previously described. Based on our phylogenetic inferences, we propose that at least four calpain variants were established in the early evolution of eukaryotes, most likely before the radiation of all the major supergroups of eukaryotes. Many domains associated with eukaryotic calpain genes can be found among eubacteria or archaebacteria but never in combination with the CysPc domain. The analyses presented here show that ancient modules present in prokaryotes, and a few de novo eukaryote domains, have been assembled into many novel domain combinations along the evolutionary history of eukaryotes. Some of the new calpain genes show a narrow distribution in a few branches in the tree of life, likely representing lineage-specific innovations. Hence, the functionally important classical calpain genes found among humans and vertebrates make up only a tiny fraction of the calpain family. In fact, a massive expansion of the calpain family occurred by domain shuffling among unicellular eukaryotes and contributed to a wealth of functionally different genes.
Massive expansion of the calpain gene family in unicellular eukaryotes

PubMed Central

2012-01-01

Background: Calpains are Ca2+-dependent cysteine proteases that participate in a range of crucial cellular processes. Dysfunction of these enzymes may cause, for instance, life-threatening diseases in humans, the loss of sex determination in nematodes and embryo lethality in plants. Although the calpain family is well characterized in animal and plant model organisms, there is a great lack of knowledge about these genes in unicellular eukaryote species (i.e. protists). Here, we study the distribution and evolution of calpain genes in a wide range of eukaryote genomes from major branches in the tree of life. Results: Our investigations reveal 24 types of protein domains that are combined with the calpain-specific catalytic domain CysPc. In total we identify 41 different calpain domain architectures, 28 of which have not been previously described. Based on our phylogenetic inferences, we propose that at least four calpain variants were established in the early evolution of eukaryotes, most likely before the radiation of all the major supergroups of eukaryotes. Many domains associated with eukaryotic calpain genes can be found among eubacteria or archaebacteria but never in combination with the CysPc domain. Conclusions: The analyses presented here show that ancient modules present in prokaryotes, and a few de novo eukaryote domains, have been assembled into many novel domain combinations along the evolutionary history of eukaryotes. Some of the new calpain genes show a narrow distribution in a few branches in the tree of life, likely representing lineage-specific innovations. Hence, the functionally important classical calpain genes found among humans and vertebrates make up only a tiny fraction of the calpain family. In fact, a massive expansion of the calpain family occurred by domain shuffling among unicellular eukaryotes and contributed to a wealth of functionally different genes. PMID:23020305
Ancestral whole-genome duplication in the marine chelicerate horseshoe crabs

PubMed Central

Kenny, N J; Chan, K W; Nong, W; Qu, Z; Maeso, I; Yip, H Y; Chan, T F; Kwan, H S; Holland, P W H; Chu, K H; Hui, J H L

2016-01-01

Whole-genome duplication (WGD) results in new genomic resources that can be exploited by evolution for rewiring genetic regulatory networks in organisms. In metazoans, WGD occurred before the last common ancestor of vertebrates, and has been postulated as a major evolutionary force that contributed to their speciation and diversification of morphological structures. Here, we have sequenced genomes from three of the four extant species of horseshoe crabs: Carcinoscorpius rotundicauda, Limulus polyphemus and Tachypleus tridentatus. Phylogenetic and sequence analyses of their Hox and other homeobox genes, which encode crucial transcription factors and have been used as indicators of WGD in animals, strongly suggest that WGD happened before the last common ancestor of these marine chelicerates >135 million years ago. Signatures of subfunctionalisation of paralogues of Hox genes are revealed in the appendages of two species of horseshoe crabs. Further, residual homeobox pseudogenes are observed in the three lineages. The existence of WGD in the horseshoe crabs, noted for relative morphological stasis over geological time, suggests that genomic diversity need not always be reflected phenotypically, in contrast to the suggested situation in vertebrates. This study provides evidence of ancient WGD in the ecdysozoan lineage, and reveals new opportunities for studying genomic and regulatory evolution after WGD in the Metazoa. PMID:26419336
Ancestral haplotype-based association mapping with generalized linear mixed models accounting for stratification.

PubMed

Zhang, Z; Guillaume, F; Sartelet, A; Charlier, C; Georges, M; Farnir, F; Druet, T

2012-10-01

In many situations, genome-wide association studies are performed in populations presenting stratification. Mixed models including a kinship matrix accounting for genetic relatedness among individuals have been shown to correct for population and/or family structure. Here we extend this methodology to generalized linear mixed models which properly model data under various distributions. In addition we perform association with ancestral haplotypes inferred using a hidden Markov model. The method was shown to properly account for stratification under various simulated scenarios presenting population and/or family structure. Use of ancestral haplotypes resulted in higher power than SNPs on simulated datasets. Application to real data demonstrates the usefulness of the developed model. Full analysis of a dataset with 4600 individuals and 500 000 SNPs was performed in 2 h 36 min and required 2.28 Gb of RAM. The software GLASCOW can be freely downloaded from www.giga.ulg.ac.be/jcms/prod_381171/software. Contact: francois.guillaume@jouy.inra.fr. Supplementary data are available at Bioinformatics online.
Ensembl Genomes 2013: scaling up access to genome-wide data.

PubMed

Kersey, Paul Julian; Allen, James E; Christensen, Mikkel; Davis, Paul; Falin, Lee J; Grabmueller, Christoph; Hughes, Daniel Seth Toney; Humphrey, Jay; Kerhornou, Arnaud; Khobova, Julia; Langridge, Nicholas; McDowall, Mark D; Maheswari, Uma; Maslen, Gareth; Nuhn, Michael; Ong, Chuang Kee; Paulini, Michael; Pedro, Helder; Toneva, Iliana; Tuli, Mary Ann; Walts, Brandon; Williams, Gareth; Wilson, Derek; Youens-Clark, Ken; Monaco, Marcela K; Stein, Joshua; Wei, Xuehong; Ware, Doreen; Bolser, Daniel M; Howe, Kevin Lee; Kulesha, Eugene; Lawson, Daniel; Staines, Daniel Michael

2014-01-01

Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species. The project exploits and extends technologies for genome annotation, analysis and dissemination, developed in the context of the vertebrate-focused Ensembl project, and provides a complementary set of resources for non-vertebrate species through a consistent set of programmatic and interactive interfaces. These provide access to data including reference sequence, gene models, transcriptional data, polymorphisms and comparative analysis. This article provides an update to the previous publications about the resource, with a focus on recent developments. These include the addition of important new genomes (and related data sets) including crop plants, vectors of human disease and eukaryotic pathogens. In addition, the resource has scaled up its representation of bacterial genomes, and now includes the genomes of over 9000 bacteria. Specific extensions to the web and programmatic interfaces have been developed to support users in navigating these large data sets. Looking forward, analytic tools to allow targeted selection of data for visualization and download are likely to become increasingly important in future as the number of available genomes increases within all domains of life, and some of the challenges faced in representing bacterial data are likely to become commonplace for eukaryotes in future.
EuPaGDT: a web tool tailored to design CRISPR guide RNAs for eukaryotic pathogens.

PubMed

Peng, Duo; Tarleton, Rick

2015-10-01

Recent development of CRISPR-Cas9 genome editing has enabled highly efficient and versatile manipulation of a variety of organisms and adaptation of the CRISPR-Cas9 system to eukaryotic pathogens has opened new avenues for studying these otherwise hard to manipulate organisms. Here we describe a webtool, Eukaryotic Pathogen gRNA Design Tool (EuPaGDT; available at http://grna.ctegd.uga.edu), which identifies guide RNA (gRNA) in input gene(s) to guide users in arriving at well-informed and appropriate gRNA design for many eukaryotic pathogens. Flexibility in gRNA design, accommodating unique eukaryotic pathogen (gene and genome) attributes and high-throughput gRNA design are the main features that distinguish EuPaGDT from other gRNA design tools. In addition to employing an array of known principles to score and rank gRNAs, EuPaGDT implements an effective on-target search algorithm to identify gRNA targeting multi-gene families, which are highly represented in these pathogens and play important roles in host-pathogen interactions. EuPaGDT also identifies and scores microhomology sequences flanking each gRNA targeted cut-site; these sites are often essential for the microhomology-mediated end joining process used for double-stranded break repair in these organisms. EuPaGDT also assists users in designing single-stranded oligonucleotides for homology directed repair. In batch processing mode, EuPaGDT is able to process genome-scale sequences, enabling preparation of gRNA libraries for large-scale screening projects.
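To make the notion of identifying gRNA candidates in an input gene concrete, the sketch below scans the forward strand of a DNA sequence for 20-nt protospacers followed by an NGG PAM, the motif recognised by the commonly used Streptococcus pyogenes Cas9. This is a simplified illustration under stated assumptions and not EuPaGDT's algorithm: the function name is hypothetical, the reverse strand is ignored, and no scoring, off-target search or microhomology analysis is performed.

```python
import re


def find_protospacers(seq, spacer_len=20):
    """Yield (start, spacer, pam) for each NGG PAM site on the forward strand.

    `start` is the 0-based position of the first spacer base.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % spacer_len)
    for match in pattern.finditer(seq):
        yield match.start(), match.group(1), match.group(2)


# Toy sequence (hypothetical): 20 A's followed by a TGG PAM and two extra bases.
toy = "A" * 20 + "TGGCC"
for start, spacer, pam in find_protospacers(toy):
    print(start, spacer, pam)  # one site: position 0, spacer of 20 A's, PAM TGG
```

A practical tool would also scan the reverse complement and rank candidates, as the abstract describes; the sketch covers only the first enumeration step.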
UTRdb and UTRsite: a collection of sequences and regulatory motifs of the untranslated regions of eukaryotic mRNAs

PubMed Central

Mignone, Flavio; Grillo, Giorgio; Licciulli, Flavio; Iacono, Michele; Liuni, Sabino; Kersey, Paul J.; Duarte, Jorge; Saccone, Cecilia; Pesole, Graziano

2005-01-01

The 5′ and 3′ untranslated regions of eukaryotic mRNAs play crucial roles in the post-transcriptional regulation of gene expression through the modulation of nucleo-cytoplasmic mRNA transport, translation efficiency, subcellular localization and message stability. UTRdb is a curated database of 5′ and 3′ untranslated sequences of eukaryotic mRNAs, derived from several sources of primary data. Experimentally validated functional motifs are annotated (and also collated as the UTRsite database) and cross-links to genomic and protein data are provided. The integration of UTRdb with genomic and protein data has allowed the implementation of a powerful retrieval resource for the selection and extraction of UTR subsets based on their genomic coordinates and/or features of the protein encoded by the relevant mRNA (e.g. GO term, PFAM domain, etc.). All internet resources implemented for retrieval and functional analysis of 5′ and 3′ untranslated regions of eukaryotic mRNAs are accessible at http://www.ba.itb.cnr.it/UTR/. PMID:15608165
Insights into Land Plant Evolution Garnered from the Marchantia polymorpha Genome.

PubMed

Bowman, John L; Kohchi, Takayuki; Yamato, Katsuyuki T; Jenkins, Jerry; Shu, Shengqiang; Ishizaki, Kimitsune; Yamaoka, Shohei; Nishihama, Ryuichi; Nakamura, Yasukazu; Berger, Frédéric; Adam, Catherine; Aki, Shiori Sugamata; Althoff, Felix; Araki, Takashi; Arteaga-Vazquez, Mario A; Balasubrmanian, Sureshkumar; Barry, Kerrie; Bauer, Diane; Boehm, Christian R; Briginshaw, Liam; Caballero-Perez, Juan; Catarino, Bruno; Chen, Feng; Chiyoda, Shota; Chovatia, Mansi; Davies, Kevin M; Delmans, Mihails; Demura, Taku; Dierschke, Tom; Dolan, Liam; Dorantes-Acosta, Ana E; Eklund, D Magnus; Florent, Stevie N; Flores-Sandoval, Eduardo; Fujiyama, Asao; Fukuzawa, Hideya; Galik, Bence; Grimanelli, Daniel; Grimwood, Jane; Grossniklaus, Ueli; Hamada, Takahiro; Haseloff, Jim; Hetherington, Alexander J; Higo, Asuka; Hirakawa, Yuki; Hundley, Hope N; Ikeda, Yoko; Inoue, Keisuke; Inoue, Shin-Ichiro; Ishida, Sakiko; Jia, Qidong; Kakita, Mitsuru; Kanazawa, Takehiko; Kawai, Yosuke; Kawashima, Tomokazu; Kennedy, Megan; Kinose, Keita; Kinoshita, Toshinori; Kohara, Yuji; Koide, Eri; Komatsu, Kenji; Kopischke, Sarah; Kubo, Minoru; Kyozuka, Junko; Lagercrantz, Ulf; Lin, Shih-Shun; Lindquist, Erika; Lipzen, Anna M; Lu, Chia-Wei; De Luna, Efraín; Martienssen, Robert A; Minamino, Naoki; Mizutani, Masaharu; Mizutani, Miya; Mochizuki, Nobuyoshi; Monte, Isabel; Mosher, Rebecca; Nagasaki, Hideki; Nakagami, Hirofumi; Naramoto, Satoshi; Nishitani, Kazuhiko; Ohtani, Misato; Okamoto, Takashi; Okumura, Masaki; Phillips, Jeremy; Pollak, Bernardo; Reinders, Anke; Rövekamp, Moritz; Sano, Ryosuke; Sawa, Shinichiro; Schmid, Marc W; Shirakawa, Makoto; Solano, Roberto; Spunde, Alexander; Suetsugu, Noriyuki; Sugano, Sumio; Sugiyama, Akifumi; Sun, Rui; Suzuki, Yutaka; Takenaka, Mizuki; Takezawa, Daisuke; Tomogane, Hirokazu; Tsuzuki, Masayuki; Ueda, Takashi; Umeda, Masaaki; Ward, John M; Watanabe, Yuichiro; Yazaki, Kazufumi; Yokoyama, Ryusuke; Yoshitake, Yoshihiro; Yotsui, Izumi; Zachgo, Sabine; Schmutz, Jeremy

2017-10-05

The evolution of land flora transformed the terrestrial environment. Land plants evolved from an ancestral charophycean alga from which they inherited developmental, biochemical, and cell biological attributes. Additional biochemical and physiological adaptations to land, and a life cycle with an alternation between multicellular haploid and diploid generations that facilitated efficient dispersal of desiccation tolerant spores, evolved in the ancestral land plant. We analyzed the genome of the liverwort Marchantia polymorpha, a member of a basal land plant lineage. Relative to charophycean algae, land plant genomes are characterized by genes encoding novel biochemical pathways, new phytohormone signaling pathways (notably auxin), expanded repertoires of signaling pathways, and increased diversity in some transcription factor families. Compared with other sequenced land plants, M. polymorpha exhibits low genetic redundancy in most regulatory pathways, with this portion of its genome resembling that predicted for the ancestral land plant.
Growth control of the eukaryote cell: a systems biology study in yeast.

PubMed

Castrillo, Juan I; Zeef, Leo A; Hoyle, David C; Zhang, Nianshu; Hayes, Andrew; Gardner, David Cj; Cornell, Michael J; Petty, June; Hakes, Luke; Wardleworth, Leanne; Rash, Bharat; Brown, Marie; Dunn, Warwick B; Broadhurst, David; O'Donoghue, Kerry; Hester, Svenja S; Dunkley, Tom Pj; Hart, Sarah R; Swainston, Neil; Li, Peter; Gaskell, Simon J; Paton, Norman W; Lilley, Kathryn S; Kell, Douglas B; Oliver, Stephen G

2007-01-01

Cell growth underlies many key cellular and developmental processes, yet a limited number of studies have been carried out on cell-growth regulation. Comprehensive studies at the transcriptional, proteomic and metabolic levels under defined controlled conditions are currently lacking. Metabolic control analysis is being exploited in a systems biology study of the eukaryotic cell. Using chemostat culture, we have measured the impact of changes in flux (growth rate) on the transcriptome, proteome, endometabolome and exometabolome of the yeast Saccharomyces cerevisiae. Each functional genomic level shows clear growth-rate-associated trends and discriminates between carbon-sufficient and carbon-limited conditions. Genes consistently and significantly upregulated with increasing growth rate are frequently essential and encode evolutionarily conserved proteins of known function that participate in many protein-protein interactions. In contrast, more unknown, and fewer essential, genes are downregulated with increasing growth rate; their protein products rarely interact with one another. A large proportion of yeast genes under positive growth-rate control share orthologs with other eukaryotes, including humans. Significantly, transcription of genes encoding components of the TOR complex (a major controller of eukaryotic cell growth) is not subject to growth-rate regulation. Moreover, integrative studies reveal the extent and importance of post-transcriptional control, patterns of control of metabolic fluxes at the level of enzyme synthesis, and the relevance of specific enzymatic reactions in the control of metabolic fluxes during cell growth. This work constitutes a first comprehensive systems biology study on growth-rate control in the eukaryotic cell. The results have direct implications for advanced studies on cell growth, in vivo regulation of metabolic fluxes for comprehensive metabolic engineering, and for the design of genome-scale systems biology models of the

Growth control of the eukaryote cell: a systems biology study in yeast

PubMed Central

Castrillo, Juan I; Zeef, Leo A; Hoyle, David C; Zhang, Nianshu; Hayes, Andrew; Gardner, David CJ; Cornell, Michael J; Petty, June; Hakes, Luke; Wardleworth, Leanne; Rash, Bharat; Brown, Marie; Dunn, Warwick B; Broadhurst, David; O'Donoghue, Kerry; Hester, Svenja S; Dunkley, Tom PJ; Hart, Sarah R; Swainston, Neil; Li, Peter; Gaskell, Simon J; Paton, Norman W; Lilley, Kathryn S; Kell, Douglas B; Oliver, Stephen G

2007-01-01

Background: Cell growth underlies many key cellular and developmental processes, yet a limited number of studies have been carried out on cell-growth regulation. Comprehensive studies at the transcriptional, proteomic and metabolic levels under defined controlled conditions are currently lacking. Results: Metabolic control analysis is being exploited in a systems biology study of the eukaryotic cell. Using chemostat culture, we have measured the impact of changes in flux (growth rate) on the transcriptome, proteome, endometabolome and exometabolome of the yeast Saccharomyces cerevisiae. Each functional genomic level shows clear growth-rate-associated trends and discriminates between carbon-sufficient and carbon-limited conditions. Genes consistently and significantly upregulated with increasing growth rate are frequently essential and encode evolutionarily conserved proteins of known function that participate in many protein-protein interactions. In contrast, more unknown, and fewer essential, genes are downregulated with increasing growth rate; their protein products rarely interact with one another. A large proportion of yeast genes under positive growth-rate control share orthologs with other eukaryotes, including humans. Significantly, transcription of genes encoding components of the TOR complex (a major controller of eukaryotic cell growth) is not subject to growth-rate regulation. Moreover, integrative studies reveal the extent and importance of post-transcriptional control, patterns of control of metabolic fluxes at the level of enzyme synthesis, and the relevance of specific enzymatic reactions in the control of metabolic fluxes during cell growth. Conclusion: This work constitutes a first comprehensive systems biology study on growth-rate control in the eukaryotic cell. The results have direct implications for advanced studies on cell growth, in vivo regulation of metabolic fluxes for comprehensive metabolic engineering, and for the design of genome
Clusters of orthologous genes for 41 archaeal genomes and implications for evolutionary genomics of archaea.

PubMed

Makarova, Kira S; Sorokin, Alexander V; Novichkov, Pavel S; Wolf, Yuri I; Koonin, Eugene V

2007-11-27

An evolutionary classification of genes from sequenced genomes that distinguishes between orthologs and paralogs is indispensable for genome annotation and evolutionary reconstruction. Shortly after multiple genome sequences of bacteria, archaea, and unicellular eukaryotes became available, an attempt at such a classification was implemented in Clusters of Orthologous Groups of proteins (COGs). Rapid accumulation of genome sequences creates opportunities for refining COGs but also represents a challenge because of error amplification. One of the practical strategies involves construction of refined COGs for phylogenetically compact subsets of genomes. New Archaeal Clusters of Orthologous Genes (arCOGs) were constructed for 41 archaeal genomes (13 Crenarchaeota, 27 Euryarchaeota and one Nanoarchaeon) using an improved procedure that employs a similarity tree between smaller, group-specific clusters, semi-automatically partitions orthology domains in multidomain proteins, and uses profile searches for identification of remote orthologs. The annotation of arCOGs is a consensus between three assignments based on the COGs, the CDD database, and the annotations of homologs in the NR database. The 7538 arCOGs, on average, cover approximately 88% of the genes in a genome compared to approximately 76% coverage in COGs. The finer granularity of ortholog identification in the arCOGs is apparent from the fact that 4538 arCOGs correspond to 2362 COGs; approximately 40% of the arCOGs are new. The archaeal gene core (protein-coding genes found in all 41 genomes) consists of 166 arCOGs. The arCOGs were used to reconstruct gene loss and gene gain events during archaeal evolution and gene sets of ancestral forms. The Last Archaeal Common Ancestor (LACA) is conservatively estimated to possess 996 genes compared to 1245 and 1335 genes for the last common ancestors of Crenarchaeota and Euryarchaeota, respectively. It is inferred that LACA was a chemoautotrophic hyperthermophile that
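The abstract above defines the archaeal gene core as the clusters found in all 41 genomes. A minimal sketch of that definition, treated as a set intersection over per-genome cluster assignments, follows. The genome names and cluster identifiers are hypothetical placeholders, and the sketch assumes orthology assignment has already been done; it does not reproduce the arCOG construction procedure.

```python
def core_clusters(genome_to_clusters):
    """Return the cluster IDs that are present in every genome."""
    cluster_sets = [set(clusters) for clusters in genome_to_clusters.values()]
    if not cluster_sets:
        return set()
    return set.intersection(*cluster_sets)


# Hypothetical assignments of orthologous clusters to three genomes.
assignments = {
    "genome_1": ["cluster_A", "cluster_B", "cluster_C"],
    "genome_2": ["cluster_A", "cluster_C"],
    "genome_3": ["cluster_A", "cluster_B"],
}

print(sorted(core_clusters(assignments)))  # only cluster_A occurs in all three
```

Because a single missing or misannotated gene removes a cluster from the intersection, a core computed this way shrinks as genomes are added, which is consistent with the small core (166 arCOGs) reported for 41 genomes.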
Genetic exchange in eukaryotes through horizontal transfer: connected by the mobilome.

PubMed

Wallau, Gabriel Luz; Vieira, Cristina; Loreto, Élgion Lúcio Silva

2018-01-01

All living species contain genetic information that was once shared by their common ancestor. DNA is inherited through generations by vertical transmission (VT) from parents to offspring and from ancestor to descendant species. This process was long considered the sole pathway by which biological entities exchange inheritable information. However, Horizontal Transfer (HT), the exchange of genetic information by means other than from parents to offspring, was discovered in prokaryotes along with strong evidence showing that it is a very important process by which prokaryotes acquire new genes. For some time, the scientific consensus was that HT events were rare and irrelevant to the evolution of eukaryotic species, but there is growing evidence that HT is an important and frequent phenomenon in eukaryotes as well. Here, we will discuss the latest findings regarding HT among eukaryotes, mainly HT of transposons (HTT), establishing HTT once and for all as an important phenomenon that should be taken into consideration to fully understand eukaryotic genome evolution. In addition, we will discuss the latest developments in methods to detect such events on a broader scale and highlight the new approaches which should be pursued by researchers to fill the knowledge gaps regarding HTT among eukaryotes.
DNA methylation in amphioxus: from ancestral functions to new roles in vertebrates.

PubMed

Albalat, Ricard; Martí-Solans, Josep; Cañestro, Cristian

2012-03-01

In vertebrates, DNA methylation is an epigenetic mechanism that modulates gene transcription, and plays crucial roles during development, cell fate maintenance, germ cell pluripotency and inheritable genome imprinting. DNA methylation might also play a role as a genome defense mechanism against the mutational activity derived from transposon mobility. In contrast to the heavily methylated genomes in vertebrates, most genomes in invertebrates are poorly or just moderately methylated, and the function of DNA methylation remains unclear. Here, we review the DNA methylation system in the cephalochordate amphioxus, which belongs to the most basally divergent group of our own phylum, the chordates. First, surveys of the amphioxus genome database reveal the presence of the DNA methylation machinery, DNA methyltransferases and methyl-CpG-binding domain proteins. Second, comparative genomics and analyses of conserved synteny between amphioxus and vertebrates provide robust evidence that the DNA methylation machinery of amphioxus represents the ancestral toolkit of chordates, and that its expansion in vertebrates originated from the two rounds of whole-genome duplication that occurred in stem vertebrates. Third, in silico analysis of CpG observed/expected (CpGo/e) ratios throughout the amphioxus genome suggests a bimodal distribution of DNA methylation, consistent with a mosaic pattern comprising domains of methylated DNA interspersed with domains of unmethylated DNA, similar to the situation described in ascidians, but radically different from the globally methylated vertebrate genomes. Finally, we discuss potential roles of the DNA methylation system in amphioxus in the context of chordate genome evolution and the origin of vertebrates.
Evolution of double-stranded DNA viruses of eukaryotes: from bacteriophages to transposons to giant viruses

PubMed Central

Koonin, Eugene V; Krupovic, Mart; Yutin, Natalya

2015-01-01

Diverse eukaryotes including animals and protists are hosts to a broad variety of viruses with double-stranded (ds) DNA genomes, from the largest known viruses, such as pandoraviruses and mimiviruses, to tiny polyomaviruses. Recent comparative genomic analyses have revealed many evolutionary connections between dsDNA viruses of eukaryotes, bacteriophages, transposable elements, and linear DNA plasmids. These findings provide an evolutionary scenario that derives several major groups of eukaryotic dsDNA viruses, including the proposed order “Megavirales,” adenoviruses, and virophages from a group of large virus-like transposons known as Polintons (Mavericks). The Polintons have been recently shown to encode two capsid proteins, suggesting that these elements lead a dual lifestyle with both a transposon and a viral phase and should perhaps more appropriately be named polintoviruses. Here, we describe the recently identified evolutionary relationships between bacteriophages of the family Tectiviridae, polintoviruses, adenoviruses, virophages, large and giant DNA viruses of eukaryotes of the proposed order “Megavirales,” and linear mitochondrial and cytoplasmic plasmids. We outline an evolutionary scenario under which the polintoviruses were the first group of eukaryotic dsDNA viruses that evolved from bacteriophages and became the ancestors of most large DNA viruses of eukaryotes and a variety of other selfish elements. Distinct lines of origin are detectable only for herpesviruses (from a different bacteriophage root) and polyoma/papillomaviruses (from single-stranded DNA viruses and ultimately from plasmids). Phylogenomic analysis of giant viruses provides compelling evidence of their independent origins from smaller members of the putative order “Megavirales,” refuting the speculations on the evolution of these viruses from an extinct fourth domain of cellular life. PMID:25727355
A consensus map in cultivated hexaploid oat reveals conserved grass synteny with substantial sub-genome rearrangement

USDA-ARS's Scientific Manuscript database

Hexaploid oat (Avena sativa, 2n = 6x = 42) is a member of the Poaceae family with a very large genome (~13 Gb) containing 21 chromosome pairs: seven from each of two similar ancestral diploids (A and D) and seven from a more diverged ancestral diploid (C). Physical rearrangements among ancestral oat...
Human Genetic Ancestral Composition Correlates with the Origin of Mycobacterium leprae Strains in a Leprosy Endemic Population.

PubMed

Cardona-Castro, Nora; Cortés, Edwin; Beltrán, Camilo; Romero, Marcela; Badel-Mogollón, Jaime E; Bedoya, Gabriel

2015-01-01

Recent reports have suggested that leprosy originated in Africa, extended to Asia and Europe, and arrived in the Americas during European colonization and the African slave trade. Due to colonization, the contemporary Colombian population is an admixture of Native-American, European and African ancestries. Because microorganisms are known to accompany humans during migrations, patterns of human migration can be traced by examining genomic changes in associated microbes. The current study analyzed 118 leprosy cases and 116 unrelated controls from two Colombian regions endemic for leprosy (Atlantic and Andean) in order to determine possible associations of leprosy with patient ancestral background (determined using 36 ancestry informative markers), Mycobacterium leprae genotype and/or patient geographical origin. We found significant differences between ancestral genetic composition. European components were predominant in Andean populations. In contrast, African components were higher in the Atlantic region. M. leprae genotypes were then analyzed for cluster associations and compared with the ancestral composition of leprosy patients. Two M. leprae principal clusters were found: haplotypes C54 and T45. Haplotype C54 associated with African origin and was more frequent in patients from the Atlantic region with a high African component. In contrast, haplotype T45 associated with European origin and was more frequent in Andean patients with a higher European component. These results suggest that the human and M. leprae genomes have co-existed since the African and European origins of the disease, with leprosy ultimately arriving in Colombia during colonization. Distinct M. leprae strains followed European and African settlement in the country and can be detected in contemporary Colombian populations.
Evolution of intrinsic disorder in eukaryotic proteins.

PubMed

Ahrens, Joseph B; Nunez-Castilla, Janelle; Siltberg-Liberles, Jessica

2017-09-01

Conformational flexibility conferred through regions of intrinsic structural disorder allows proteins to behave as dynamic molecules. While it is well-known that intrinsically disordered regions can undergo disorder-to-order transitions in real-time as part of their function, we also are beginning to learn more about the dynamics of disorder-to-order transitions along evolutionary time-scales. Intrinsically disordered regions endow proteins with functional promiscuity, which is further enhanced by the ability of some of these regions to undergo real-time disorder-to-order transitions. Disorder content affects gene retention after whole genome duplication, but it is not necessarily conserved. Altered patterns of disorder resulting from evolutionary disorder-to-order transitions indicate that disorder evolves to modify function through refining stability, regulation, and interactions. Here, we review the evolution of intrinsically disordered regions in eukaryotic proteins. We discuss the interplay between secondary structure and disorder on evolutionary time-scales, the importance of disorder for eukaryotic proteome expansion and functional divergence, and the evolutionary dynamics of disorder.
Eukaryotic algal phytochromes span the visible spectrum

PubMed Central

Rockwell, Nathan C.; Duanmu, Deqiang; Martin, Shelley S.; Bachy, Charles; Price, Dana C.; Bhattacharya, Debashish; Worden, Alexandra Z.; Lagarias, J. Clark

2014-01-01

Plant phytochromes are photoswitchable red/far-red photoreceptors that allow competition with neighboring plants for photosynthetically active red light. In aquatic environments, red and far-red light are rapidly attenuated with depth; therefore, photosynthetic species must use shorter wavelengths of light. Nevertheless, phytochrome-related proteins are found in recently sequenced genomes of many eukaryotic algae from aquatic environments. We examined the photosensory properties of seven phytochromes from diverse algae: four prasinophyte (green algal) species, the heterokont (brown algal) Ectocarpus siliculosus, and two glaucophyte species. We demonstrate that algal phytochromes are not limited to red and far-red responses. Instead, different algal phytochromes can sense orange, green, and even blue light. Characterization of these previously undescribed photosensors using CD spectroscopy supports a structurally heterogeneous chromophore in the far-red–absorbing photostate. Our study thus demonstrates that extensive spectral tuning of phytochromes has evolved in phylogenetically distinct lineages of aquatic photosynthetic eukaryotes. PMID:24567382
Population Stratification and Underrepresentation of Indian Subcontinent Genetic Diversity in the 1000 Genomes Project Dataset

PubMed Central

Sengupta, Dhriti; Choudhury, Ananyo; Basu, Analabha; Ramsay, Michèle

2016-01-01

Genomic variation in Indian populations is of great interest due to the diversity of ancestral components, social stratification, endogamy and complex admixture patterns. With an expanding population of 1.2 billion, India is also a treasure trove to catalogue innocuous as well as clinically relevant rare mutations. Recent studies have revealed four dominant ancestries in populations from mainland India: Ancestral North-Indian (ANI), Ancestral South-Indian (ASI), Ancestral Tibeto–Burman (ATB) and Ancestral Austro-Asiatic (AAA). The 1000 Genomes Project (KGP) Phase-3 data include about 500 genomes from five linguistically defined Indian-Subcontinent (IS) populations (Punjabi, Gujrati, Bengali, Telugu and Tamil) some of whom are recent migrants to USA or UK. Comparative analyses show that despite the distinct geographic origins of the KGP-IS populations, the ANI component is predominantly represented in this dataset. Previous studies demonstrated population substructure in the HapMap Gujrati population, and we found evidence for additional substructure in the Punjabi and Telugu populations. These substructured populations have characteristic/significant differences in heterozygosity and inbreeding coefficients. Moreover, we demonstrate that the substructure is better explained by factors like differences in proportion of ancestral components, and endogamy driven social structure rather than invoking a novel ancestral component to explain it. Therefore, using language and/or geography as a proxy for an ethnic unit is inadequate for many of the IS populations. This highlights the necessity for more nuanced sampling strategies or corrective statistical approaches, particularly for biomedical and population genetics research in India. PMID:27797945
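The heterozygosity and inbreeding coefficients compared in this abstract can be illustrated with a small Python sketch. It assumes the textbook single-locus definition F = 1 - H_obs / H_exp with H_exp = 2pq for a biallelic site; the abstract does not state which estimator the study used. The genotype coding (0, 1 or 2 copies of the alternate allele) and the function name are hypothetical.

```python
def inbreeding_coefficient(genotypes):
    """F = 1 - H_obs / H_exp for one biallelic locus.

    genotypes: list of alternate-allele counts per individual (0, 1 or 2).
    """
    n = len(genotypes)
    p = sum(genotypes) / (2 * n)                   # alternate-allele frequency
    h_obs = sum(1 for g in genotypes if g == 1) / n  # observed heterozygosity
    h_exp = 2 * p * (1 - p)                        # Hardy-Weinberg expectation
    if h_exp == 0:
        raise ValueError("locus is monomorphic; F is undefined")
    return 1 - h_obs / h_exp
```

A value near 0 matches Hardy-Weinberg expectations, while positive values indicate a heterozygote deficit of the kind expected under endogamy or hidden substructure. Real analyses would average over many loci.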
Localization of a Bacterial Group II Intron-Encoded Protein in Eukaryotic Nuclear Splicing-Related Cell Compartments

PubMed Central

Nisa-Martínez, Rafael; Laporte, Philippe; Jiménez-Zurdo, José Ignacio; Frugier, Florian; Crespi, Martin; Toro, Nicolás

2013-01-01

Some bacterial group II introns are widely used for genetic engineering in bacteria, because they can be reprogrammed to insert into the desired DNA target sites. There is considerable interest in developing this group II intron gene targeting technology for use in eukaryotes, but nuclear genomes present several obstacles to the use of this approach. The nuclear genomes of eukaryotes do not contain group II introns, but these introns are thought to have been the progenitors of nuclear spliceosomal introns. We investigated the expression and subcellular localization of the bacterial RmInt1 group II intron-encoded protein (IEP) in Arabidopsis thaliana protoplasts. Following the expression of translational fusions of the wild-type protein and several mutant variants with EGFP, the full-length IEP was found exclusively in the nucleolus, whereas the maturase domain alone targeted EGFP to nuclear speckles. The distribution of the bacterial RmInt1 IEP in plant cell protoplasts suggests that the compartmentalization of eukaryotic cells into nucleus and cytoplasm does not prevent group II introns from invading the host genome. Furthermore, the trafficking of the IEP between the nucleolus and the speckles upon maturase inactivation is consistent with the hypothesis that the spliceosomal machinery evolved from group II introns. PMID:24391881
Genome-Wide Analyses and Functional Classification of Proline Repeat-Rich Proteins: Potential Role of eIF5A in Eukaryotic Evolution

PubMed Central

Mandal, Ajeet; Mandal, Swati; Park, Myung Hee

2014-01-01

The eukaryotic translation factor, eIF5A has been recently reported as a sequence-specific elongation factor that facilitates peptide bond formation at consecutive prolines in Saccharomyces cerevisiae, as its ortholog elongation factor P (EF-P) does in bacteria. We have searched the genome databases of 35 representative organisms from six kingdoms of life for PPP (Pro-Pro-Pro) and/or PPG (Pro-Pro-Gly)-encoding genes whose expression is expected to depend on eIF5A. We have made detailed analyses of proteome data of 5 selected species, Escherichia coli, Saccharomyces cerevisiae, Drosophila melanogaster, Mus musculus and Homo sapiens. The PPP and PPG motifs are low in the prokaryotic proteomes. However, their frequencies markedly increase with the biological complexity of eukaryotic organisms, and are higher in newly derived proteins than in those orthologous proteins commonly shared in all species. Ontology classifications of S. cerevisiae and human genes encoding the highest level of polyprolines reveal their strong association with several specific biological processes, including actin/cytoskeletal associated functions, RNA splicing/turnover, DNA binding/transcription and cell signaling. Previously reported phenotypic defects in actin polarity and mRNA decay of eIF5A mutant strains are consistent with the proposed role for eIF5A in the translation of the polyproline-containing proteins. Of all the amino acid tandem repeats (≥3 amino acids), only the proline repeat frequency correlates with functional complexity of the five organisms examined. Taken together, these findings suggest the importance of proline repeat-rich proteins and a potential role for eIF5A and its hypusine modification pathway in the course of eukaryotic evolution. PMID:25364902
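The motif survey described in this abstract can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' pipeline: the proteome is represented as a hypothetical dictionary of protein identifier to amino acid sequence, and the function names are made up for this example.

```python
import re


def count_motif(protein, motif):
    """Count occurrences of a motif, including overlapping ones."""
    # A zero-width lookahead lets "PPPP" report two PPP occurrences.
    return len(re.findall("(?=" + re.escape(motif) + ")", protein))


def motif_frequency(proteome, motifs=("PPP", "PPG")):
    """Fraction of proteins that contain each motif at least once."""
    total = len(proteome)
    return {
        m: sum(1 for seq in proteome.values() if m in seq) / total
        for m in motifs
    }
```

Comparing `motif_frequency` across species is the kind of tabulation that would show whether the motifs become more frequent in more complex eukaryotes, as the abstract reports.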
Pan-arthropod analysis reveals somatic piRNAs as an ancestral defence against transposable elements.

PubMed

Lewis, Samuel H; Quarles, Kaycee A; Yang, Yujing; Tanguy, Melanie; Frézal, Lise; Smith, Stephen A; Sharma, Prashant P; Cordaux, Richard; Gilbert, Clément; Giraud, Isabelle; Collins, David H; Zamore, Phillip D; Miska, Eric A; Sarkies, Peter; Jiggins, Francis M

2018-01-01

In animals, small RNA molecules termed PIWI-interacting RNAs (piRNAs) silence transposable elements (TEs), protecting the germline from genomic instability and mutation. piRNAs have been detected in the soma in a few animals, but these are believed to be specific adaptations of individual species. Here, we report that somatic piRNAs were probably present in the ancestral arthropod more than 500 million years ago. Analysis of 20 species across the arthropod phylum suggests that somatic piRNAs targeting TEs and messenger RNAs are common among arthropods. The presence of an RNA-dependent RNA polymerase in chelicerates (horseshoe crabs, spiders and scorpions) suggests that arthropods originally used a plant-like RNA interference mechanism to silence TEs. Our results call into question the view that the ancestral role of the piRNA pathway was to protect the germline and demonstrate that small RNA silencing pathways have been repurposed for both somatic and germline functions throughout arthropod evolution.
Pathgroups, a dynamic data structure for genome reconstruction problems.

PubMed

Zheng, Chunfang

2010-07-01

Ancestral gene order reconstruction problems, including the median problem, quartet construction, small phylogeny, guided genome halving and genome aliquoting, are NP-hard. Available heuristics dedicated to each of these problems are computationally costly for even small instances. We present a data structure enabling rapid heuristic solution to all these ancestral genome reconstruction problems. A generic greedy algorithm with look-ahead based on an automatically generated priority system suffices for all the problems using this data structure. The efficiency of the algorithm is due to fast updating of the structure during run time and to the simplicity of the priority scheme. We illustrate with the first rapid algorithm for quartet construction and apply this to a set of yeast genomes to corroborate a recent gene sequence-based phylogeny. Availability: http://albuquerque.bioinformatics.uottawa.ca/pathgroup/Quartet.html Contact: chunfang313@gmail.com Supplementary information: Supplementary data are available at Bioinformatics online.
Distant Mimivirus relative with a larger genome highlights the fundamental features of Megaviridae

PubMed Central

Arslan, Defne; Legendre, Matthieu; Seltzer, Virginie; Abergel, Chantal; Claverie, Jean-Michel

2011-01-01

Mimivirus, a DNA virus infecting acanthamoeba, was for a long time the largest known virus both in terms of particle size and gene content. Its genome encodes 979 proteins, including the first four aminoacyl tRNA synthetases (ArgRS, CysRS, MetRS, and TyrRS) ever found outside of cellular organisms. The discovery that Mimivirus encoded trademark cellular functions prompted a wealth of theoretical studies revisiting the concept of virus and associating large DNA viruses with the emergence of early eukaryotes. However, the evolutionary significance of these unique features remained impossible to assess in the absence of a Mimivirus relative exhibiting a suitable evolutionary divergence. Here, we present Megavirus chilensis, a giant virus isolated off the coast of Chile, but capable of replicating in fresh water acanthamoeba. Its 1,259,197-bp genome is the largest viral genome fully sequenced so far. It encodes 1,120 putative proteins, of which 258 (23%) have no Mimivirus homologs. The 594 Megavirus/Mimivirus orthologs share an average of 50% identical residues. Despite this divergence, Megavirus retained all of the genomic features characteristic of Mimivirus, including its cellular-like genes. Moreover, Megavirus exhibits three additional aminoacyl-tRNA synthetase genes (IleRS, TrpRS, and AsnRS), adding strong support to the previous suggestion that the Mimivirus/Megavirus lineage evolved from an ancestral cellular genome by reductive evolution. The main differences in gene content between Mimivirus and Megavirus genomes are due to (i) lineage-specific gains or losses of genes, (ii) lineage-specific gene family expansion or deletion, and (iii) the insertion/migration of mobile elements (intron, intein). PMID:21987820
Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments

PubMed Central

Haas, Brian J; Salzberg, Steven L; Zhu, Wei; Pertea, Mihaela; Allen, Jonathan E; Orvis, Joshua; White, Owen; Buell, C Robin; Wortman, Jennifer R

2008-01-01

EVidenceModeler (EVM) is presented as an automated eukaryotic gene structure annotation tool that reports eukaryotic gene structures as a weighted consensus of all available evidence. EVM, when combined with the Program to Assemble Spliced Alignments (PASA), yields a comprehensive, configurable annotation system that predicts protein-coding genes and alternatively spliced isoforms. Our experiments on both rice and human genome sequences demonstrate that EVM produces automated gene structure annotation approaching the quality of manual curation. PMID:18190707
Chætognath transcriptome reveals ancestral and unique features among bilaterians

PubMed Central

Marlétaz, Ferdinand; Gilles, André; Caubit, Xavier; Perez, Yvan; Dossat, Carole; Samain, Sylvie; Gyapay, Gabor; Wincker, Patrick; Le Parco, Yannick

2008-01-01

Background The chætognaths (arrow worms) have puzzled zoologists for years because of their astonishing morphological and developmental characteristics. Despite their deuterostome-like development, phylogenomic studies recently positioned the chætognath phylum in protostomes, most likely in an early branching. This key phylogenetic position and the peculiar characteristics of chætognaths prompted further investigation of their genomic features. Results Transcriptomic and genomic data were collected from the chætognath Spadella cephaloptera through the sequencing of expressed sequence tags and genomic bacterial artificial chromosome clones. Transcript comparisons at various taxonomic scales emphasized the conservation of a core gene set and phylogenomic analysis confirmed the basal position of chætognaths among protostomes. A detailed survey of transcript diversity and individual genotyping revealed a past genome duplication event in the chætognath lineage, which was, surprisingly, followed by a high retention rate of duplicated genes. Moreover, striking genetic heterogeneity was detected within the sampled population at the nuclear and mitochondrial levels but cannot be explained by cryptic speciation. Finally, we found evidence for trans-splicing maturation of transcripts through splice-leader addition in the chætognath phylum and we further report that this processing is associated with operonic transcription. Conclusion These findings reveal both shared ancestral and unique derived characteristics of the chætognath genome, which suggests that this genome is likely the product of a very original evolutionary history. These features promote chætognaths as a pivotal model for comparative genomics, which could provide new clues for the investigation of the evolution of animal genomes. PMID:18533022
An experimental phylogeny to benchmark ancestral sequence reconstruction

PubMed Central

Randall, Ryan N.; Radford, Caelan E.; Roof, Kelsey A.; Natarajan, Divya K.; Gaucher, Eric A.

2016-01-01

Ancestral sequence reconstruction (ASR) is a still-burgeoning method that has revealed many key mechanisms of molecular evolution. One criticism of the approach is an inability to validate its algorithms within a biological context as opposed to a computer simulation. Here we build an experimental phylogeny using the gene of a single red fluorescent protein to address this criticism. The evolved phylogeny consists of 19 operational taxonomic units (leaves) and 17 ancestral bifurcations (nodes) that display a wide variety of fluorescent phenotypes. The 19 leaves then serve as ‘modern' sequences that we subject to ASR analyses using various algorithms and to benchmark against the known ancestral genotypes and ancestral phenotypes. We confirm computer simulations that show all algorithms infer ancient sequences with high accuracy, yet we also reveal wide variation in the phenotypes encoded by incorrectly inferred sequences. Specifically, Bayesian methods incorporating rate variation significantly outperform the maximum parsimony criterion in phenotypic accuracy. Subsampling of extant sequences had minor effect on the inference of ancestral sequences. PMID:27628687
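For readers unfamiliar with the maximum parsimony criterion mentioned in this abstract, the following Python sketch shows the bottom-up pass of Fitch's small-parsimony algorithm for a single alignment site. It is a generic textbook illustration, not the software benchmarked in the study. The tree encoding (a leaf is a one-letter string, an internal node is a pair of subtrees) is an assumption made for brevity.

```python
def fitch(tree):
    """Bottom-up Fitch pass for one site on a rooted binary tree.

    tree: a leaf is a state string such as "A"; an internal node is a
    (left, right) tuple. Returns (candidate_states, minimum_changes).
    """
    if isinstance(tree, str):
        return {tree}, 0
    left_states, left_cost = fitch(tree[0])
    right_states, right_cost = fitch(tree[1])
    shared = left_states & right_states
    if shared:
        # Children agree on at least one state: no extra change needed.
        return shared, left_cost + right_cost
    # Children disagree: keep the union and count one substitution.
    return left_states | right_states, left_cost + right_cost + 1
```

Calling `fitch((("A", "A"), ("C", "A")))` returns the candidate root states and the minimum number of substitutions for that site. A full reconstruction would add a top-down pass to pick one state per internal node, and the Bayesian methods favored in the abstract weigh substitution probabilities and rate variation rather than counting changes.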
A tree of life based on ninety-eight expressed genes conserved across diverse eukaryotic species

PubMed Central

Jayaswal, Pawan Kumar; Dogra, Vivek; Shanker, Asheesh; Sharma, Tilak Raj

2017-01-01

Rapid advances in DNA sequencing technologies have resulted in the accumulation of large data sets in the public domain, facilitating comparative studies to provide novel insights into the evolution of life. Phylogenetic studies across the eukaryotic taxa have been reported but on the basis of a limited number of genes. Here we present a genome-wide analysis across different plant, fungal, protist, and animal species, with reference to the 36,002 expressed genes of the rice genome. Our analysis revealed 9831 genes unique to rice and 98 genes conserved across all 49 eukaryotic species analysed. The 98 genes conserved across diverse eukaryotes mostly exhibited binding and catalytic activities and shared common sequence motifs, and hence appeared to have a common origin. The 98 conserved genes belonged to 22 functional gene families including 26S protease, actin, ADP–ribosylation factor, ATP synthase, casein kinase, DEAD-box protein, DnaK, elongation factor 2, glyceraldehyde 3-phosphate, phosphatase 2A, ras-related protein, Ser/Thr protein phosphatase family protein, tubulin, ubiquitin and others. The consensus Bayesian eukaryotic tree of life developed in this study demonstrated widely separated clades of plants, fungi, and animals. Musa acuminata provided an evolutionary link between monocotyledons and dicotyledons, and Salpingoeca rosetta provided an evolutionary link between fungi and animals, indicating that protozoan species are close relatives of fungi and animals. The divergence times for 1176 species pairs were estimated accurately by integrating fossil information with synonymous substitution rates in the comprehensive set of 98 genes. The present study provides valuable insight into the evolution of eukaryotes. PMID:28922368
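As a rough illustration of how synonymous substitution rates relate to divergence times, the Python sketch below applies the textbook strict-clock relation T = Ks / (2r). This is an assumption made for illustration: the abstract says fossil information was integrated but does not give the exact procedure. The function name and the rate value in the usage note are hypothetical.

```python
def divergence_time(ks, rate_per_site_per_year):
    """Strict molecular clock estimate: T = Ks / (2 * r).

    ks: synonymous substitutions per synonymous site between two species.
    rate_per_site_per_year: assumed substitution rate r along one lineage.
    The factor 2 accounts for changes accumulating on both lineages.
    """
    if rate_per_site_per_year <= 0:
        raise ValueError("rate must be positive")
    return ks / (2 * rate_per_site_per_year)
```

For example, with an illustrative rate of 6.5e-9 substitutions per site per year, `divergence_time(0.13, 6.5e-9)` gives roughly ten million years. In practice the rate itself is what fossil calibration points are used to estimate.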

A Nitrile Hydratase in the Eukaryote Monosiga brevicollis

PubMed Central

Foerstner, Konrad U.; Doerks, Tobias; Muller, Jean; Raes, Jeroen; Bork, Peer

2008-01-01

Bacterial nitrile hydratases (NHases) are important industrial catalysts and waste water remediation tools. In a global computational screening of conventional and metagenomic sequence data for NHases, we detected the two usually separated NHase subunits fused in one protein of the choanoflagellate Monosiga brevicollis, a recently sequenced unicellular model organism from the closest sister group of Metazoa. This is the first time that an NHase is found in eukaryotes and the first time it is observed as a fusion protein. The presence of an intron, subunit fusion and expressed sequence tags covering parts of the gene exclude contamination and suggest a functional gene. Phylogenetic analyses and genomic context imply a probable ancient horizontal gene transfer (HGT) from proteobacteria. The newly discovered NHase might open biotechnological routes due to its unconventional structure, its new type of host and its apparent integration into eukaryotic protein networks. PMID:19096720
Argonaute Proteins and Mechanisms of RNA Interference in Eukaryotes and Prokaryotes.

PubMed

Olina, A V; Kulbachinskiy, A V; Aravin, A A; Esyunina, D M

2018-05-01

Noncoding RNAs play essential roles in genetic regulation in all organisms. In eukaryotic cells, many small noncoding RNAs act in complex with Argonaute proteins and regulate gene expression by recognizing complementary RNA targets. The complexes of Argonaute proteins with small RNAs also play a key role in silencing of mobile genetic elements and, in some cases, viruses. These processes are collectively called RNA interference. RNA interference is a powerful tool for specific gene silencing in both basic research and therapeutic applications. Argonaute proteins are also found in prokaryotic organisms. Recent studies have shown that prokaryotic Argonautes can also cleave their target nucleic acids, in particular DNA. This activity of prokaryotic Argonautes might potentially be used to edit eukaryotic genomes. However, the molecular mechanisms of small nucleic acid biogenesis and the functions of Argonaute proteins, in particular in bacteria and archaea, remain largely unknown. Here we briefly review available data on the RNA interference processes and Argonaute proteins in eukaryotes and prokaryotes.
Comparative genomic de-convolution of the cotton genome revealed a decaploid ancestor and widespread chromosomal fractionation.

PubMed

Wang, Xiyin; Guo, Hui; Wang, Jinpeng; Lei, Tianyu; Liu, Tao; Wang, Zhenyi; Li, Yuxian; Lee, Tae-Ho; Li, Jingping; Tang, Haibao; Jin, Dianchuan; Paterson, Andrew H

2016-02-01

The 'apparently' simple genomes of many angiosperms mask complex evolutionary histories. The reference genome sequence for cotton (Gossypium spp.) revealed a ploidy change of a complexity unprecedented to date, indeed that could not be distinguished as to its exact dosage. Herein, by developing several comparative, computational and statistical approaches, we revealed a 5× multiplication in the cotton lineage of an ancestral genome common to cotton and cacao, and proposed evolutionary models to show how such a decaploid ancestor formed. The c. 70% gene loss necessary to bring the ancestral decaploid to its current gene count appears to fit an approximate geometrical model; that is, although many genes may be lost by single-gene deletion events, some may be lost in groups of consecutive genes. Gene loss following cotton decaploidy has largely just reduced gene copy numbers of some homologous groups. We designed a novel approach to deconvolute layers of chromosome homology, providing definitive information on gene orthology and paralogy across broad evolutionary distances, both of fundamental value and serving as an important platform to support further studies in and beyond cotton and genomics communities.
The presence of the ancestral insect telomeric motif in kissing bugs (Triatominae) rules out the hypothesis of its loss in evolutionarily advanced Heteroptera (Cimicomorpha)

PubMed Central

Pita, Sebastián; Panzera, Francisco; Mora, Pablo; Vela, Jesús; Palomeque, Teresa; Lorite, Pedro

2016-01-01

Next-generation sequencing data analysis on Triatoma infestans Klug, 1834 (Heteroptera, Cimicomorpha, Reduviidae) revealed the presence of the ancestral insect (TTAGG)n telomeric motif in its genome. Fluorescence in situ hybridization confirms that chromosomes bear this telomeric sequence at their chromosomal ends. Furthermore, the motif was estimated to make up about 0.03% of the total genome, so the average telomere length at each chromosomal end is almost 18 kb. We also detected the presence of the (TTAGG)n telomeric repeat in mitotic and meiotic chromosomes in three other species of Triatominae: Triatoma dimidiata Latreille, 1811, Dipetalogaster maxima Uhler, 1894, and Rhodnius prolixus Ståhl, 1859. This is the first report of the (TTAGG)n telomeric repeat in the infraorder Cimicomorpha, contradicting the currently accepted hypothesis that evolutionarily recent heteropterans lack this ancestral insect telomeric sequence. PMID:27830050
The Maximal C³ Self-Complementary Trinucleotide Circular Code X in Genes of Bacteria, Archaea, Eukaryotes, Plasmids and Viruses.

PubMed

Michel, Christian J

2017-04-18

In 1996, a set X of 20 trinucleotides was identified in genes of both prokaryotes and eukaryotes which has on average the highest occurrence in reading frame compared to its two shifted frames. Furthermore, this set X has an interesting mathematical property as X is a maximal C³ self-complementary trinucleotide circular code. In 2015, by quantifying the inspection approach used in 1996, the circular code X was confirmed in the genes of bacteria and eukaryotes and was also identified in the genes of plasmids and viruses. The method was based on the preferential occurrence of trinucleotides among the three frames at the gene population level. We extend here this definition at the gene level. This new statistical approach considers all the genes, i.e., of large and small lengths, with the same weight for searching the circular code X. As a consequence, the concept of circular code, in particular the reading frame retrieval, is directly associated to each gene. At the gene level, the circular code X is strengthened in the genes of bacteria, eukaryotes, plasmids, and viruses, and is now also identified in the genes of archaea. The genes of mitochondria and chloroplasts contain a subset of the circular code X. Finally, by studying viral genes, the circular code X was found in DNA genomes, RNA genomes, double-stranded genomes, and single-stranded genomes.
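The two properties discussed in this abstract, preferential occurrence in the reading frame and self-complementarity, can be illustrated with a short Python sketch. The functions take any set of trinucleotides; the two-element set in the usage note is a toy stand-in rather than the published 20-trinucleotide code X, and all names are hypothetical. The gene-level statistic used by the author is not reproduced here.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(trinucleotide):
    """Reverse complement of a DNA word, e.g. AAC -> GTT."""
    return trinucleotide.translate(_COMPLEMENT)[::-1]


def is_self_complementary(code):
    """True if the reverse complement of every word is also in the set."""
    return all(reverse_complement(t) in code for t in code)


def frame_counts(gene, code):
    """Count words of `code` read in frame 0 and in the two shifted frames."""
    gene = gene.upper()
    return [
        sum(1 for i in range(frame, len(gene) - 2, 3) if gene[i:i + 3] in code)
        for frame in range(3)
    ]
```

With the toy set `{"AAC", "GTT"}`, `frame_counts("AACGTTAAC", {"AAC", "GTT"})` counts three words in frame 0 and none in the shifted frames. An excess in frame 0 averaged over many genes is the kind of signal the abstract describes for X.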
Increased genetic diversity of ADME genes in African Americans compared with their putative ancestral source populations and implications for Pharmacogenomics

PubMed Central

2014-01-01

Background African Americans have been treated as a representative population for African ancestry for many purposes, including pharmacogenomic studies. However, the contribution of European ancestry is expected to result in considerable differences in the genetic architecture of African American individuals compared with an African genome. In particular, the genetic admixture influences the genomic diversity of drug metabolism-related genes, and may cause high heterogeneity of drug responses in admixed populations such as African Americans. Results The genomic ancestry information of African-American (ASW) samples was obtained from data of the 1000 Genomes Project, and local ancestral components were also extracted for 32 core genes and 252 extended genes, which are associated with drug absorption, distribution, metabolism, and excretion (ADME). As expected, the global genetic diversity pattern in ASW was determined by the contributions of its putative ancestral source populations, and the whole profiles of ADME genes in ASW are much closer to those in YRI than in CEU. However, we observed much higher diversity in some functionally important ADME genes in ASW than in either CEU or YRI, which could be a result of either genetic drift or natural selection, and we identified some signatures of the latter. We analyzed the clinically relevant polymorphic alleles and haplotypes, and found that 28 functional mutations (including 3 missense, 3 splice, and 22 regulator sites) exhibited significantly higher differentiation between the three populations. Conclusions Analysis of the genetic diversity of ADME genes showed differentiation between the admixed population and its ancestral source populations. In particular, the difference in genetic diversity between ASW and YRI indicates that ethnic differences relevant to pharmacogenomic studies exist broadly, even though African ancestry is dominant in African Americans. This study should advance our understanding of the genetic
An allele of an ancestral transcription factor dependent on a horizontally acquired gene product.

PubMed

Chen, H Deborah; Jewett, Mollie W; Groisman, Eduardo A

2012-01-01

Changes in gene regulatory circuits often give rise to phenotypic differences among closely related organisms. In bacteria, these changes can result from alterations in the ancestral genome and/or be brought about by genes acquired by horizontal transfer. Here, we identify an allele of the ancestral transcription factor PmrA that requires the horizontally acquired pmrD gene product to promote gene expression. We determined that a single amino acid difference between the PmrA proteins from the human adapted Salmonella enterica serovar Paratyphi B and the broad host range S. enterica serovar Typhimurium rendered transcription of PmrA-activated genes dependent on the PmrD protein in the former but not the latter serovar. Bacteria harboring the serovar Typhimurium allele exhibited polymyxin B resistance under PmrA- or under PmrA- and PmrD-inducing conditions. By contrast, isogenic strains with the serovar Paratyphi B allele displayed PmrA-regulated polymyxin B resistance only when experiencing activating conditions for both PmrA and PmrD. We establish that the two PmrA orthologs display quantitative differences in several biochemical properties. Strains harboring the serovar Paratyphi B allele showed enhanced biofilm formation, a property that might promote serovar Paratyphi B's chronic infection of the gallbladder. Our findings illustrate how subtle differences in ancestral genes can impact the ability of horizontally acquired genes to confer new properties.
Population Stratification and Underrepresentation of Indian Subcontinent Genetic Diversity in the 1000 Genomes Project Dataset.

PubMed

Sengupta, Dhriti; Choudhury, Ananyo; Basu, Analabha; Ramsay, Michèle

2016-12-31

Genomic variation in Indian populations is of great interest due to the diversity of ancestral components, social stratification, endogamy and complex admixture patterns. With an expanding population of 1.2 billion, India is also a treasure trove to catalogue innocuous as well as clinically relevant rare mutations. Recent studies have revealed four dominant ancestries in populations from mainland India: Ancestral North-Indian (ANI), Ancestral South-Indian (ASI), Ancestral Tibeto-Burman (ATB) and Ancestral Austro-Asiatic (AAA). The 1000 Genomes Project (KGP) Phase-3 data include about 500 genomes from five linguistically defined Indian-Subcontinent (IS) populations (Punjabi, Gujrati, Bengali, Telugu and Tamil) some of whom are recent migrants to USA or UK. Comparative analyses show that despite the distinct geographic origins of the KGP-IS populations, the ANI component is predominantly represented in this dataset. Previous studies demonstrated population substructure in the HapMap Gujrati population, and we found evidence for additional substructure in the Punjabi and Telugu populations. These substructured populations have characteristic/significant differences in heterozygosity and inbreeding coefficients. Moreover, we demonstrate that the substructure is better explained by factors like differences in proportion of ancestral components, and endogamy driven social structure rather than invoking a novel ancestral component to explain it. Therefore, using language and/or geography as a proxy for an ethnic unit is inadequate for many of the IS populations. This highlights the necessity for more nuanced sampling strategies or corrective statistical approaches, particularly for biomedical and population genetics research in India. © The Author(s) 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
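The heterozygosity and inbreeding coefficient comparisons mentioned above can be illustrated with a small calculation. This is a minimal sketch assuming biallelic loci with genotypes coded as 0, 1 or 2 copies of the alternate allele, and a simple population-level estimator F = 1 - H_obs / H_exp; the abstract does not say which estimator the authors used, and the function name is hypothetical.

```python
def inbreeding_coefficient(genotypes):
    """Population-level F = 1 - H_obs / H_exp over biallelic loci.

    genotypes: list of individuals, each a list of 0/1/2 alternate-allele
    counts, one entry per locus (assumed coding, no missing data).
    """
    n_ind = len(genotypes)
    n_loci = len(genotypes[0])
    h_obs = 0.0
    h_exp = 0.0
    for j in range(n_loci):
        column = [ind[j] for ind in genotypes]
        p = sum(column) / (2 * n_ind)  # alternate-allele frequency
        h_obs += sum(1 for g in column if g == 1) / n_ind
        h_exp += 2 * p * (1 - p)  # Hardy-Weinberg expectation
    if h_exp == 0:
        return 0.0  # no variable loci, F is undefined; report 0 by convention
    return 1 - h_obs / h_exp
```

A sample in Hardy-Weinberg proportions gives F close to 0, while a deficit of heterozygotes, as expected under endogamy or unrecognised substructure, pushes F towards 1.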
Sequencing of Australian wild rice genomes reveals ancestral relationships with domesticated rice.

PubMed

Brozynska, Marta; Copetti, Dario; Furtado, Agnelo; Wing, Rod A; Crayn, Darren; Fox, Glen; Ishikawa, Ryuji; Henry, Robert J

2017-06-01

The related A genome species of the Oryza genus are the effective gene pool for rice. Here, we report draft genomes for two Australian wild A genome taxa: O. rufipogon-like population, referred to as Taxon A, and O. meridionalis-like population, referred to as Taxon B. These two taxa were sequenced and assembled by integration of short- and long-read next-generation sequencing (NGS) data to create a genomic platform for a wider rice gene pool. Here, we report that, despite the distinct chloroplast genome, the nuclear genome of the Australian Taxon A has a sequence that is much closer to that of domesticated rice (O. sativa) than to the other Australian wild populations. Analysis of 4643 genes in the A genome clade showed that the Australian annual, O. meridionalis, and related perennial taxa have the most divergent (around 3 million years) genome sequences relative to domesticated rice. A test for admixture showed possible introgression into the Australian Taxon A (diverged around 1.6 million years ago) especially from the wild indica/O. nivara clade in Asia. These results demonstrate that northern Australia may be the centre of diversity of the A genome Oryza and suggest the possibility that this might also be the centre of origin of this group and represent an important resource for rice improvement. © 2016 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.
Diversity and evolution of centromere repeats in the maize genome.

PubMed

Bilinski, Paul; Distor, Kevin; Gutierrez-Lopez, Jose; Mendoza, Gabriela Mendoza; Shi, Jinghua; Dawe, R Kelly; Ross-Ibarra, Jeffrey

2015-03-01

Centromere repeats are found in most eukaryotes and play a critical role in kinetochore formation. Though centromere repeats exhibit considerable diversity both within and among species, little is understood about the mechanisms that drive centromere repeat evolution. Here, we use maize as a model to investigate how a complex history involving polyploidy, fractionation, and recent domestication has impacted the diversity of the maize centromeric repeat CentC. We first validate the existence of long tandem arrays of repeats in maize and other taxa in the genus Zea. Although we find considerable sequence diversity among CentC copies genome-wide, genetic similarity among repeats is highest within these arrays, suggesting that tandem duplications are the primary mechanism for the generation of new copies. Nonetheless, clustering analyses identify similar sequences among distant repeats, and simulations suggest that this pattern may be due to homoplasious mutation. Although the two ancestral subgenomes of maize have contributed nearly equal numbers of centromeres, our analysis shows that the majority of all CentC repeats derive from one of the parental genomes, with an even stronger bias when examining the largest assembled contiguous clusters. Finally, by comparing maize with its wild progenitor teosinte, we find that the abundance of CentC likely decreased after domestication, while the pericentromeric repeat Cent4 has drastically increased.
Human Genetic Ancestral Composition Correlates with the Origin of Mycobacterium leprae Strains in a Leprosy Endemic Population

PubMed Central

Cardona-Castro, Nora; Cortés, Edwin; Beltrán, Camilo; Romero, Marcela; Badel-Mogollón, Jaime E.; Bedoya, Gabriel

2015-01-01

Recent reports have suggested that leprosy originated in Africa, extended to Asia and Europe, and arrived in the Americas during European colonization and the African slave trade. Due to colonization, the contemporary Colombian population is an admixture of Native-American, European and African ancestries. Because microorganisms are known to accompany humans during migrations, patterns of human migration can be traced by examining genomic changes in associated microbes. The current study analyzed 118 leprosy cases and 116 unrelated controls from two Colombian regions endemic for leprosy (Atlantic and Andean) in order to determine possible associations of leprosy with patient ancestral background (determined using 36 ancestry informative markers), Mycobacterium leprae genotype and/or patient geographical origin. We found significant differences between ancestral genetic composition. European components were predominant in Andean populations. In contrast, African components were higher in the Atlantic region. M. leprae genotypes were then analyzed for cluster associations and compared with the ancestral composition of leprosy patients. Two M. leprae principal clusters were found: haplotypes C54 and T45. Haplotype C54 associated with African origin and was more frequent in patients from the Atlantic region with a high African component. In contrast, haplotype T45 associated with European origin and was more frequent in Andean patients with a higher European component. These results suggest that the human and M. leprae genomes have co-existed since the African and European origins of the disease, with leprosy ultimately arriving in Colombia during colonization. Distinct M. leprae strains followed European and African settlement in the country and can be detected in contemporary Colombian populations. PMID:26360617
Genome-wide association study and ancestral origins of the slick-hair coat in tropically adapted cattle

PubMed Central

Huson, Heather J.; Kim, Eui-Soo; Godfrey, Robert W.; Olson, Timothy A.; McClure, Matthew C.; Chase, Chad C.; Rizzi, Rita; O'Brien, Ana M. P.; Van Tassell, Curt P.; Garcia, José F.; Sonstegard, Tad S.

2014-01-01

The slick hair coat (SLICK) is a dominantly inherited trait typically associated with tropically adapted cattle that are of Criollo descent through Spanish colonization of cattle into the New World. The trait is of interest relative to climate change, due to its association with improved thermo-tolerance and subsequent increased productivity. Previous studies localized the SLICK locus to a 4 cM region on chromosome (BTA) 20 and identified signatures of selection in this region derived from Senepol cattle. The current study compares three slick-haired Criollo-derived breeds including Senepol, Carora, and Romosinuano and three additional slick-haired cross-bred lineages to non-slick ancestral breeds. Genome-wide association (GWA), haplotype analysis, signatures of selection, runs of homozygosity (ROH), and identity by state (IBS) calculations were used to identify a 0.8 Mb (37.7–38.5 Mb) consensus region for the SLICK locus on BTA20, which contains SKP2 and SPEF2 as possible candidate genes. Three specific haplotype patterns are identified in slick individuals, all with zero frequency in non-slick individuals. Admixture analysis identified common genetic patterns between the three slick breeds at the SLICK locus. Principal component analysis (PCA) and admixture results show Senepol and Romosinuano sharing a higher degree of genetic similarity to one another with a much lesser degree of similarity to Carora. Variation in GWA, haplotype analysis, and IBS calculations with accompanying population structure information supports potentially two mutations, one common to Senepol and Romosinuano and another in Carora, affecting genes contained within our refined location for the SLICK locus. PMID:24808908
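Two of the quantities named above, identity by state (IBS) and runs of homozygosity (ROH), can be sketched on toy genotype vectors. This assumes biallelic SNPs coded as 0, 1 or 2 copies of one allele and ordered along the chromosome; the function names are hypothetical, and the study's own software and thresholds are not described in the abstract.

```python
def ibs_similarity(a, b):
    """Mean proportion of alleles shared identical by state.

    a, b: equal-length lists of 0/1/2 genotypes for two individuals.
    Each SNP contributes 2 - |a_i - b_i| shared alleles out of 2.
    """
    if len(a) != len(b) or not a:
        raise ValueError("genotype vectors must be non-empty and equal length")
    shared = sum(2 - abs(x - y) for x, y in zip(a, b))
    return shared / (2 * len(a))


def longest_homozygous_run(genotypes):
    """Length, in SNPs, of the longest stretch without a heterozygote."""
    best = current = 0
    for g in genotypes:
        current = current + 1 if g != 1 else 0
        best = max(best, current)
    return best
```

In practice ROH calls are made in physical distance with allowances for genotyping error, so the SNP-count version here only shows the principle.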
Genome-wide association study and ancestral origins of the slick-hair coat in tropically adapted cattle.

PubMed

Huson, Heather J; Kim, Eui-Soo; Godfrey, Robert W; Olson, Timothy A; McClure, Matthew C; Chase, Chad C; Rizzi, Rita; O'Brien, Ana M P; Van Tassell, Curt P; Garcia, José F; Sonstegard, Tad S

2014-01-01

The slick hair coat (SLICK) is a dominantly inherited trait typically associated with tropically adapted cattle that are of Criollo descent through Spanish colonization of cattle into the New World. The trait is of interest relative to climate change, due to its association with improved thermo-tolerance and subsequent increased productivity. Previous studies localized the SLICK locus to a 4 cM region on chromosome (BTA) 20 and identified signatures of selection in this region derived from Senepol cattle. The current study compares three slick-haired Criollo-derived breeds including Senepol, Carora, and Romosinuano and three additional slick-haired cross-bred lineages to non-slick ancestral breeds. Genome-wide association (GWA), haplotype analysis, signatures of selection, runs of homozygosity (ROH), and identity by state (IBS) calculations were used to identify a 0.8 Mb (37.7-38.5 Mb) consensus region for the SLICK locus on BTA20, which contains SKP2 and SPEF2 as possible candidate genes. Three specific haplotype patterns are identified in slick individuals, all with zero frequency in non-slick individuals. Admixture analysis identified common genetic patterns between the three slick breeds at the SLICK locus. Principal component analysis (PCA) and admixture results show Senepol and Romosinuano sharing a higher degree of genetic similarity to one another with a much lesser degree of similarity to Carora. Variation in GWA, haplotype analysis, and IBS calculations with accompanying population structure information supports potentially two mutations, one common to Senepol and Romosinuano and another in Carora, affecting genes contained within our refined location for the SLICK locus.
The complete mitochondrial genome of a stonefly species, Togoperla sp. (Plecoptera: Perlidae).

PubMed

Wang, Kai; Wang, Yuyu; Yang, Ding

2016-05-01

The complete mitochondrial (mt) genome of a stonefly species, Togoperla sp. (Plecoptera: Perlidae), was sequenced. The 15,723 bp long genome has the standard metazoan complement of 37 genes and an A+T-rich region, which is the same as the insect ancestral genome arrangement.
Distribution and Phylogeny of EFL and EF-1α in Euglenozoa Suggest Ancestral Co-Occurrence Followed by Differential Loss

PubMed Central

Gile, Gillian H.; Faktorová, Drahomíra; Castlejohn, Christina A.; Burger, Gertraud; Lang, B. Franz; Farmer, Mark A.; Lukeš, Julius; Keeling, Patrick J.

2009-01-01

Background The eukaryotic elongation factor EF-1α (also known as EF1A) catalyzes aminoacyl-tRNA binding by the ribosome during translation. Homologs of this essential protein occur in all domains of life, and it was previously thought to be ubiquitous in eukaryotes. Recently, however, a number of eukaryotes were found to lack EF-1α and instead encode a related protein called EFL (for EF-Like). EFL-encoding organisms are scattered widely across the tree of eukaryotes, and all have close relatives that encode EF-1α. This intriguingly complex distribution has been attributed to multiple lateral transfers because EFL's near mutual exclusivity with EF-1α makes an extended period of co-occurrence seem unlikely. However, differential loss may play a role in EFL evolution, and this possibility has been less widely discussed. Methodology/Principal Findings We have undertaken an EST- and PCR-based survey to determine the distribution of these two proteins in a previously under-sampled group, the Euglenozoa. EF-1α was found to be widespread and monophyletic, suggesting it is ancestral in this group. EFL was found in some species belonging to each of the three euglenozoan lineages, diplonemids, kinetoplastids, and euglenids. Conclusions/Significance Interestingly, the kinetoplastid EFL sequences are specifically related despite the fact that the lineages in which they are found are not sisters to one another, suggesting that EFL and EF-1α co-occurred in an early ancestor of kinetoplastids. This represents the strongest phylogenetic evidence to date that differential loss has contributed to the complex distribution of EFL and EF-1α. PMID:19357788
Distinguishing friends, foes, and freeloaders in giant genomes.

PubMed

Bennetzen, Jeffrey L; Park, Minkyu

2018-04-01

Most annotations of large eukaryotic genomes initially find transposable elements (TEs) and other repeats, then mask them so that subsequent efforts can be concentrated on the annotation and study of non-TE genes. However, TEs often contribute to host biology, and their community biologies are of intrinsic interest. This review discusses the challenges, rationale and technologies for comprehensive TE annotation in the commonly giant genomes of animals and plants. Complete discovery of the TEs in a fully sequenced genome is laborious, but feasible, with current strategies in the hands of a careful researcher. These deep TE studies have begun to provide important perspectives on how genomes evolve and the degree to which genome changes do and do not affect eukaryotic biology. Copyright © 2018 The Authors. Published by Elsevier Ltd.. All rights reserved.
Few mitochondrial DNA sequences are inserted into the turkey (Meleagris gallopavo) nuclear genome: evolutionary analyses and informativity in the domestic lineage.

PubMed

Schiavo, G; Strillacci, M G; Ribani, A; Bovo, S; Roman-Ponce, S I; Cerolini, S; Bertolini, F; Bagnato, A; Fontanesi, L

2018-06-01

Mitochondrial DNA (mtDNA) insertions have been detected in the nuclear genome of many eukaryotes. These sequences are pseudogenes originating from horizontal transfer of mtDNA fragments into the nuclear genome, producing nuclear DNA sequences of mitochondrial origin (numt). In this study we determined the frequency and distribution of mtDNA-originated pseudogenes in the turkey (Meleagris gallopavo) nuclear genome. The turkey reference genome (Turkey_2.01) was aligned with the reference linearized mtDNA sequence using LAST. A total of 32 numt sequences (corresponding to 18 numt regions derived by unique insertional events) were identified in the turkey nuclear genome (size ranging from 66 to 1415 bp; identity against the modern turkey mtDNA corresponding region ranging from 62% to 100%). Numts were distributed in nine chromosomes and in one scaffold. They derived from parts of 10 mtDNA protein-coding genes, ribosomal genes, the control region and 10 tRNA genes. Seven numt regions reported in the turkey genome were identified in orthologous positions in the Gallus gallus genome and therefore were present in the ancestral genome that, in the Cretaceous, gave rise to the lineages of the modern crown Galliformes. Five recently integrated turkey numts were validated by PCR in 168 turkeys of six different domestic populations. None of the analysed numts were polymorphic (i.e. absence of the inserted sequence, as reported in numts of recent integration in other species), suggesting that the reticulate speciation model is not useful for explaining the origin of the domesticated turkey lineage. © 2018 Stichting International Foundation for Animal Genetics.
Ancestral and more recently acquired syntenic relationships of MADS-box genes uncovered by the Physcomitrella patens pseudochromosomal genome assembly.

PubMed

Barker, Elizabeth I; Ashton, Neil W

2016-03-01

The Physcomitrella pseudochromosomal genome assembly revealed previously invisible synteny enabling realisation of the full potential of shared synteny as a tool for probing evolution of this plant's MADS-box gene family. Assembly of the sequenced genome of Physcomitrella patens into 27 mega-scaffolds (pseudochromosomes) has confirmed the major predictions of our earlier model of expansion of the MADS-box gene family in the Physcomitrella lineage. Additionally, microsynteny has been conserved in the immediate vicinity of some recent duplicates of MADS-box genes. However, comparison of non-syntenic MIKC MADS-box genes and neighbouring genes indicates that chromosomal rearrangements and/or sequence degeneration have destroyed shared synteny over longer distances (macrosynteny) around MADS-box genes despite subsets comprising two or three MIKC genes having remained syntenic. In contrast, half of the type I MADS-box genes have been transposed creating new syntenic relations with MIKC genes. This implies that conservation of ancient ancestral synteny of MIKC genes and of more recently acquired synteny of type I and MIKC genes may be selectively advantageous. Our revised model predicts the birth rate of MIKC genes in Physcomitrella is higher than that of type I genes. However, this difference is attributable to an early tandem duplication and an early segmental duplication of MIKC genes prior to the two polyploidisations that account for most of the expansion of the MADS-box gene family in Physcomitrella. Furthermore, this early segmental duplication spawned two chromosomal lineages: one with a MIKC (C) gene, belonging to the PPM2 clade, in close proximity to one or a pair of MIKC* genes and another with a MIKC (C) gene, belonging to the PpMADS-S clade, characterised by greater separation from syntenic MIKC* genes. Our model has evolutionary implications for the Physcomitrella karyotype.
Phylogenomics of nonavian reptiles and the structure of the ancestral amniote genome

PubMed Central

Shedlock, Andrew M.; Botka, Christopher W.; Zhao, Shaying; Shetty, Jyoti; Zhang, Tingting; Liu, Jun S.; Deschavanne, Patrick J.; Edwards, Scott V.

2007-01-01

We report results of a megabase-scale phylogenomic analysis of the Reptilia, the sister group of mammals. Large-scale end-sequence scanning of genomic clones of a turtle, alligator, and lizard reveals diverse, mammal-like landscapes of retroelements and simple sequence repeats (SSRs) not found in the chicken. Several global genomic traits, including distinctive phylogenetic lineages of CR1-like long interspersed elements (LINEs) and a paucity of A-T rich SSRs, characterize turtles and archosaur genomes, whereas higher frequencies of tandem repeats and a lower global GC content reveal mammal-like features in Anolis. Nonavian reptile genomes also possess a high frequency of diverse and novel 50-bp unit tandem duplications not found in chicken or mammals. The frequency distributions of ≈65,000 8-mer oligonucleotides suggest that rates of DNA-word frequency change are an order of magnitude slower in reptiles than in mammals. These results suggest a diverse array of interspersed and SSRs in the common ancestor of amniotes and a genomic conservatism and gradual loss of retroelements in reptiles that culminated in the minimalist chicken genome. PMID:17307883
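The "≈65,000 8-mer oligonucleotides" correspond to the 4^8 = 65,536 possible DNA words of length 8. A minimal sketch of how a word-frequency distribution might be tabulated from a sequence is given below; the function name is hypothetical, and the abstract does not describe the authors' own pipeline or how they handled strand symmetry and ambiguous bases.

```python
from collections import Counter

DNA = set("ACGT")


def kmer_frequencies(seq, k=8):
    """Relative frequency of each k-mer in seq, using overlapping windows.

    Windows containing characters other than A, C, G, T (for example N)
    are skipped.
    """
    seq = seq.upper()
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= DNA
    )
    total = sum(counts.values())
    return {kmer: c / total for kmer, c in counts.items()} if total else {}


# Number of distinct 8-mers over a four-letter alphabet.
n_possible_8mers = 4 ** 8
```

Comparing such frequency vectors between genomes, for instance with a correlation or distance measure, is one way to quantify how fast DNA-word usage changes along a lineage.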
The Maximal C3 Self-Complementary Trinucleotide Circular Code X in Genes of Bacteria, Archaea, Eukaryotes, Plasmids and Viruses

PubMed Central

Michel, Christian J.

2017-01-01

In 1996, a set X of 20 trinucleotides was identified in genes of both prokaryotes and eukaryotes which has on average the highest occurrence in reading frame compared to its two shifted frames. Furthermore, this set X has an interesting mathematical property as X is a maximal C3 self-complementary trinucleotide circular code. In 2015, by quantifying the inspection approach used in 1996, the circular code X was confirmed in the genes of bacteria and eukaryotes and was also identified in the genes of plasmids and viruses. The method was based on the preferential occurrence of trinucleotides among the three frames at the gene population level. We extend here this definition at the gene level. This new statistical approach considers all the genes, i.e., of large and small lengths, with the same weight for searching the circular code X. As a consequence, the concept of circular code, in particular the reading frame retrieval, is directly associated to each gene. At the gene level, the circular code X is strengthened in the genes of bacteria, eukaryotes, plasmids, and viruses, and is now also identified in the genes of archaea. The genes of mitochondria and chloroplasts contain a subset of the circular code X. Finally, by studying viral genes, the circular code X was found in DNA genomes, RNA genomes, double-stranded genomes, and single-stranded genomes. PMID:28420220

Fueling Future with Algal Genomics

DOE Office of Scientific and Technical Information (OSTI.GOV)

Grigoriev, Igor

Algae constitute a major component of fundamental eukaryotic diversity, play profound roles in the carbon cycle, and are prominent candidates for biofuel production. The US Department of Energy Joint Genome Institute (JGI) is leading the world in algal genome sequencing (http://jgi.doe.gov/Algae) and contributes of the algal genome projects worldwide (GOLD database, 2012). The sequenced algal genomes offer catalogs of genes, networks, and pathways. The sequenced first-of-their-kind genomes of a haptophyte E. huxleyi, chlorarachniophyte B. natans, and cryptophyte G. theta fill the gaps in the eukaryotic tree of life and carry unique genes and pathways as well as molecular fossils of secondary endosymbiosis. Natural adaptation to conditions critical for industrial production is encoded in algal genomes, for example, growth of A. anophagefferens at very high cell densities during harmful algal blooms or a global distribution across diverse environments of E. huxleyi, able to live on sparse nutrients due to its expanded pan-genome. Communications and signaling pathways can be derived from simple symbiotic systems like lichens or complex marine algae metagenomes. Collectively these datasets derived from algal genomics contribute to building a comprehensive parts list essential for algal biofuel development.
Ancestral genomic duplication of the insulin gene in tilapia: An analysis of possible implications for clinical islet xenotransplantation using donor islets from transgenic tilapia expressing a humanized insulin gene.

PubMed

Hrytsenko, Olga; Pohajdak, Bill; Wright, James R

2016-07-03

Tilapia, a teleost fish, have multiple large anatomically discrete islets which are easy to harvest, and when transplanted into diabetic murine recipients, provide normoglycemia and mammalian-like glucose tolerance profiles. Tilapia insulin differs structurally from human insulin which could preclude their use as islet donors for xenotransplantation. Therefore, we produced transgenic tilapia with islets expressing a humanized insulin gene. It is now known that fish genomes may possess an ancestral duplication and so tilapia may have a second insulin gene. Therefore, we cloned, sequenced, and characterized the tilapia insulin 2 transcript and found that its expression is negligible in islets, is not islet-specific, and would not likely need to be silenced in our transgenic fish.
Ancestral genomic duplication of the insulin gene in tilapia: An analysis of possible implications for clinical islet xenotransplantation using donor islets from transgenic tilapia expressing a humanized insulin gene

PubMed Central

Hrytsenko, Olga; Pohajdak, Bill; Wright, James R.

2016-01-01

ABSTRACT Tilapia, a teleost fish, have multiple large anatomically discrete islets which are easy to harvest, and when transplanted into diabetic murine recipients, provide normoglycemia and mammalian-like glucose tolerance profiles. Tilapia insulin differs structurally from human insulin which could preclude their use as islet donors for xenotransplantation. Therefore, we produced transgenic tilapia with islets expressing a humanized insulin gene. It is now known that fish genomes may possess an ancestral duplication and so tilapia may have a second insulin gene. Therefore, we cloned, sequenced, and characterized the tilapia insulin 2 transcript and found that its expression is negligible in islets, is not islet-specific, and would not likely need to be silenced in our transgenic fish. PMID:27222321
Mitogenomics and phylogenomics reveal priapulid worms as extant models of the ancestral Ecdysozoan.

PubMed

Webster, Bonnie L; Copley, Richard R; Jenner, Ronald A; Mackenzie-Dodds, Jacqueline A; Bourlat, Sarah J; Rota-Stabelli, Omar; Littlewood, D T J; Telford, Maximilian J

2006-01-01

Research into arthropod evolution is hampered by the derived nature and rapid evolution of the best-studied out-group: the nematodes. We consider priapulids as an alternative out-group. Priapulids are a small phylum of bottom-dwelling marine worms; their tubular body with spiny proboscis or introvert has changed little over 520 million years and recognizable priapulids are common among exceptionally preserved Cambrian fossils. Using the complete mitochondrial genome and 42 nuclear genes from Priapulus caudatus, we show that priapulids are slowly evolving ecdysozoans; almost all these priapulid genes have evolved more slowly than nematode orthologs and the priapulid mitochondrial gene order may be unchanged since the Cambrian. Considering their primitive bodyplan and embryology and the great conservation of both nuclear and mitochondrial genomes, priapulids may deserve the popular epithet of "living fossil." Their study is likely to yield significant new insights into the early evolution of the Ecdysozoa and the origins of the arthropods and their kin as well as aiding inference of the morphology of ancestral Ecdysozoa and Bilateria and their genomes.
Genome of Phaeocystis globosa virus PgV-16T highlights the common ancestry of the largest known DNA viruses infecting eukaryotes

PubMed Central

Santini, Sebastien; Jeudy, Sandra; Bartoli, Julia; Poirot, Olivier; Lescot, Magali; Abergel, Chantal; Barbe, Valérie; Wommack, K. Eric; Noordeloos, Anna A. M.; Brussaard, Corina P. D.; Claverie, Jean-Michel

2013-01-01

Large dsDNA viruses are involved in the population control of many globally distributed species of eukaryotic phytoplankton and have a prominent role in bloom termination. The genus Phaeocystis (Haptophyta, Prymnesiophyceae) includes several high-biomass-forming phytoplankton species, such as Phaeocystis globosa, the blooms of which occur mostly in the coastal zone of the North Atlantic and the North Sea. Here, we report the 459,984-bp-long genome sequence of P. globosa virus strain PgV-16T, encoding 434 proteins and eight tRNAs and, thus, the largest fully sequenced genome to date among viruses infecting algae. Surprisingly, PgV-16T exhibits no phylogenetic affinity with other viruses infecting microalgae (e.g., phycodnaviruses), including those infecting Emiliania huxleyi, another ubiquitous bloom-forming haptophyte. Rather, PgV-16T belongs to an emerging clade (the Megaviridae) clustering the viruses endowed with the largest known genomes, including Megavirus, Mimivirus (both infecting acanthamoeba), and a virus infecting the marine microflagellate grazer Cafeteria roenbergensis. Seventy-five percent of the best matches of PgV-16T–predicted proteins correspond to two viruses [Organic Lake phycodnavirus (OLPV)1 and OLPV2] from a hypersaline lake in Antarctica (Organic Lake), the hosts of which are unknown. As for OLPVs and other Megaviridae, the PgV-16T sequence data revealed the presence of a virophage-like genome. However, no virophage particle was detected in infected P. globosa cultures. The presence of many genes found only in Megaviridae in its genome and the presence of an associated virophage strongly suggest that PgV-16T shares a common ancestry with the largest known dsDNA viruses, the host range of which already encompasses the earliest diverging branches of domain Eukarya. PMID:23754393
Ancestral hierarchy and conflict.

PubMed

Boehm, Christopher

2012-05-18

Ancestral Pan, the shared predecessor of humans, bonobos, and chimpanzees, lived in social dominance hierarchies that created conflict through individual and coalitional competition. This ancestor had male and female mediators, but individuals often reconciled independently. An evolutionary trajectory is traced from this ancestor to extant hunter-gatherers, whose coalitional behavior results in suppressed dominance and competition, except in mate competition. A territorial ancestral Pan would not have engaged in intensive warfare if we consider bonobo behavior, but modern human foragers have the potential for full-scale war. Although hunter-gatherers are able to resolve conflicts preemptively, they also use mechanisms, such as truces and peace pacts, to mitigate conflict when the costs become too high. Today, humans retain the genetic underpinnings of both conflict and conflict management; thus, we retain the potential for both war and peace.
Comparative genomics meets topology: a novel view on genome median and halving problems.

PubMed

Alexeev, Nikita; Avdeyev, Pavel; Alekseyev, Max A

2016-11-11

Genome median and genome halving are combinatorial optimization problems that aim at reconstruction of ancestral genomes by minimizing the number of evolutionary events between them and genomes of the extant species. While these problems have been widely studied in past decades, their solutions are often either not efficient or not biologically adequate. These shortcomings have been recently addressed by restricting the problems' solution space. We show that the restricted variants of genome median and halving problems are, in fact, closely related. We demonstrate that these problems have a neat topological interpretation in terms of embedded graphs and polygon gluings. We illustrate how such an interpretation can lead to solutions to these problems in particular cases. This study provides an unexpected link between comparative genomics and topology, and demonstrates advantages of solving genome median and halving problems within the topological framework.
Endosymbiotic theories for eukaryote origin

PubMed Central

Martin, William F.; Garg, Sriram; Zimorski, Verena

2015-01-01

For over 100 years, endosymbiotic theories have figured in thoughts about the differences between prokaryotic and eukaryotic cells. More than 20 different versions of endosymbiotic theory have been presented in the literature to explain the origin of eukaryotes and their mitochondria. Very few of those models account for eukaryotic anaerobes. The role of energy and the energetic constraints that prokaryotic cell organization placed on evolutionary innovation in cell history has recently come to bear on endosymbiotic theory. Only cells that possessed mitochondria had the bioenergetic means to attain eukaryotic cell complexity, which is why there are no true intermediates in the prokaryote-to-eukaryote transition. Current versions of endosymbiotic theory have it that the host was an archaeon (an archaebacterium), not a eukaryote. Hence the evolutionary history and biology of archaea increasingly comes to bear on eukaryotic origins, more than ever before. Here, we have compiled a survey of endosymbiotic theories for the origin of eukaryotes and mitochondria, and for the origin of the eukaryotic nucleus, summarizing the essentials of each and contrasting some of their predictions to the observations. A new aspect of endosymbiosis in eukaryote evolution comes into focus from these considerations: the host for the origin of plastids was a facultative anaerobe. PMID:26323761
The Reference Genome Sequence of Saccharomyces cerevisiae: Then and Now

PubMed Central

Engel, Stacia R.; Dietrich, Fred S.; Fisk, Dianna G.; Binkley, Gail; Balakrishnan, Rama; Costanzo, Maria C.; Dwight, Selina S.; Hitz, Benjamin C.; Karra, Kalpana; Nash, Robert S.; Weng, Shuai; Wong, Edith D.; Lloyd, Paul; Skrzypek, Marek S.; Miyasato, Stuart R.; Simison, Matt; Cherry, J. Michael

2014-01-01

The genome of the budding yeast Saccharomyces cerevisiae was the first completely sequenced from a eukaryote. It was released in 1996 as the work of a worldwide effort of hundreds of researchers. In the time since, the yeast genome has been intensively studied by geneticists, molecular biologists, and computational scientists all over the world. Maintenance and annotation of the genome sequence have long been provided by the Saccharomyces Genome Database, one of the original model organism databases. To deepen our understanding of the eukaryotic genome, the S. cerevisiae strain S288C reference genome sequence was updated recently in its first major update since 1996. The new version, called “S288C 2010,” was determined from a single yeast colony using modern sequencing technologies and serves as the anchor for further innovations in yeast genomic science. PMID:24374639
Derived Immune and Ancestral Pigmentation Alleles in a 7,000-Year-old Mesolithic European

PubMed Central

Olalde, Iñigo; Allentoft, Morten E.; Sánchez-Quinto, Federico; Santpere, Gabriel; Chiang, Charleston W. K.; DeGiorgio, Michael; Prado-Martínez, Javier; Rodríguez, Juan Antonio; Rasmussen, Simon; Quilez, Javier; Ramírez, Oscar; Marigorta, Urko M.; Fernández-Callejo, Marcos; Prada, María Encina; Encinas, Julio Manuel Vidal; Nielsen, Rasmus; Netea, Mihai G.; Novembre, John; Sturm, Richard A.; Sabeti, Pardis; Marquès-Bonet, Tomàs; Navarro, Arcadi; Willerslev, Eske; Lalueza-Fox, Carles

2014-01-01

Ancient genomic sequences have started revealing the origin and the demographic impact of Neolithic farmers spreading into Europe. The adoption of farming, stock breeding and sedentary societies during the Neolithic may have resulted in adaptive changes in genes associated with immunity and diet. However, the limited data available from earlier hunter-gatherers preclude an understanding of the selective processes associated with this crucial transition to agriculture in recent human evolution. By sequencing a ~7,000-year-old Mesolithic skeleton discovered at the La Braña-Arintero site in León (Spain), we retrieved the first complete pre-agricultural European human genome. Analysis of this genome in the context of other ancient samples suggests the existence of a common ancient genomic signature across Western and Central Eurasia from the Upper Paleolithic to the Mesolithic. The La Braña individual carries ancestral alleles in several skin pigmentation genes, suggesting that the light skin of modern Europeans was not yet ubiquitous in Mesolithic times. Moreover, we provide evidence that a significant number of derived, putatively adaptive variants associated with pathogen resistance in modern Europeans were already present in this hunter-gatherer. Hence, these genomic variants cannot represent novel mutations that occurred during the adaptation to the farming lifestyle. PMID:24463515
Why did eukaryotes evolve only once? Genetic and energetic aspects of conflict and conflict mediation

PubMed Central

Blackstone, Neil W.

2013-01-01

According to multi-level theory, evolutionary transitions require mediating conflicts between lower-level units in favour of the higher-level unit. By this view, the origin of eukaryotes and the origin of multicellularity would seem largely equivalent. Yet, eukaryotes evolved only once in the history of life, whereas multicellular eukaryotes have evolved many times. Examining conflicts between evolutionary units and mechanisms that mediate these conflicts can illuminate these differences. Energy-converting endosymbionts that allow eukaryotes to transcend surface-to-volume constraints also can allocate energy into their own selfish replication. This principal conflict in the origin of eukaryotes can be mediated by genetic or energetic mechanisms. Genome transfer diminishes the heritable variation of the symbiont, but requires the de novo evolution of the protein-import apparatus and was opposed by selection for selfish symbionts. By contrast, metabolic signalling is a shared primitive feature of all cells. Redox state of the cytosol is an emergent feature that cannot be subverted by an individual symbiont. Hypothetical scenarios illustrate how metabolic regulation may have mediated the conflicts inherent at different stages in the origin of eukaryotes. Aspects of metabolic regulation may have subsequently been coopted from within-cell to between-cell pathways, allowing multicellularity to emerge repeatedly. PMID:23754817
Why did eukaryotes evolve only once? Genetic and energetic aspects of conflict and conflict mediation.

PubMed

Blackstone, Neil W

2013-07-19

According to multi-level theory, evolutionary transitions require mediating conflicts between lower-level units in favour of the higher-level unit. By this view, the origin of eukaryotes and the origin of multicellularity would seem largely equivalent. Yet, eukaryotes evolved only once in the history of life, whereas multicellular eukaryotes have evolved many times. Examining conflicts between evolutionary units and mechanisms that mediate these conflicts can illuminate these differences. Energy-converting endosymbionts that allow eukaryotes to transcend surface-to-volume constraints also can allocate energy into their own selfish replication. This principal conflict in the origin of eukaryotes can be mediated by genetic or energetic mechanisms. Genome transfer diminishes the heritable variation of the symbiont, but requires the de novo evolution of the protein-import apparatus and was opposed by selection for selfish symbionts. By contrast, metabolic signalling is a shared primitive feature of all cells. Redox state of the cytosol is an emergent feature that cannot be subverted by an individual symbiont. Hypothetical scenarios illustrate how metabolic regulation may have mediated the conflicts inherent at different stages in the origin of eukaryotes. Aspects of metabolic regulation may have subsequently been coopted from within-cell to between-cell pathways, allowing multicellularity to emerge repeatedly.
Delineation of Steroid-Degrading Microorganisms through Comparative Genomic Analysis

PubMed Central

Bergstrand, Lee H.; Cardenas, Erick; Holert, Johannes; Van Hamme, Jonathan D.

2016-01-01

Steroids are ubiquitous in natural environments and are a significant growth substrate for microorganisms. Microbial steroid metabolism is also important for some pathogens and for biotechnical applications. This study delineated the distribution of aerobic steroid catabolism pathways among over 8,000 microorganisms whose genomes are available in the NCBI RefSeq database. Combined analysis of bacterial, archaeal, and fungal genomes with both hidden Markov models and reciprocal BLAST identified 265 putative steroid degraders within only Actinobacteria and Proteobacteria, which mainly originated from soil, eukaryotic host, and aquatic environments. These bacteria include members of 17 genera not previously known to contain steroid degraders. A pathway for cholesterol degradation was conserved in many actinobacterial genera, particularly in members of the Corynebacterineae, and a pathway for cholate degradation was conserved in members of the genus Rhodococcus. A pathway for testosterone and, sometimes, cholate degradation had a patchy distribution among Proteobacteria. The steroid degradation genes tended to occur within large gene clusters. Growth experiments confirmed bioinformatic predictions of steroid metabolism capacity in nine bacterial strains. The results indicate there was a single ancestral 9,10-seco-steroid degradation pathway. Gene duplication, likely in a progenitor of Rhodococcus, later gave rise to a cholate degradation pathway. Proteobacteria and additional Actinobacteria subsequently obtained a cholate degradation pathway via horizontal gene transfer, in some cases facilitated by plasmids. Catabolism of steroids appears to be an important component of the ecological niches of broad groups of Actinobacteria and individual species of Proteobacteria. PMID:26956583
Derived immune and ancestral pigmentation alleles in a 7,000-year-old Mesolithic European.

PubMed

Olalde, Iñigo; Allentoft, Morten E; Sánchez-Quinto, Federico; Santpere, Gabriel; Chiang, Charleston W K; DeGiorgio, Michael; Prado-Martinez, Javier; Rodríguez, Juan Antonio; Rasmussen, Simon; Quilez, Javier; Ramírez, Oscar; Marigorta, Urko M; Fernández-Callejo, Marcos; Prada, María Encina; Encinas, Julio Manuel Vidal; Nielsen, Rasmus; Netea, Mihai G; Novembre, John; Sturm, Richard A; Sabeti, Pardis; Marquès-Bonet, Tomàs; Navarro, Arcadi; Willerslev, Eske; Lalueza-Fox, Carles

2014-03-13

Ancient genomic sequences have started to reveal the origin and the demographic impact of farmers from the Neolithic period spreading into Europe. The adoption of farming, stock breeding and sedentary societies during the Neolithic may have resulted in adaptive changes in genes associated with immunity and diet. However, the limited data available from earlier hunter-gatherers preclude an understanding of the selective processes associated with this crucial transition to agriculture in recent human evolution. Here we sequence an approximately 7,000-year-old Mesolithic skeleton discovered at the La Braña-Arintero site in León, Spain, to retrieve a complete pre-agricultural European human genome. Analysis of this genome in the context of other ancient samples suggests the existence of a common ancient genomic signature across western and central Eurasia from the Upper Paleolithic to the Mesolithic. The La Braña individual carries ancestral alleles in several skin pigmentation genes, suggesting that the light skin of modern Europeans was not yet ubiquitous in Mesolithic times. Moreover, we provide evidence that a significant number of derived, putatively adaptive variants associated with pathogen resistance in modern Europeans were already present in this hunter-gatherer.
Multiple roles of genome-attached bacteriophage terminal proteins

DOE Office of Scientific and Technical Information (OSTI.GOV)

Redrejo-Rodríguez, Modesto; Salas, Margarita

2014-11-15

Protein-primed replication constitutes a generalized mechanism to initiate DNA or RNA synthesis in linear genomes, including viruses, gram-positive bacteria, linear plasmids and mobile elements. By this mechanism a specific amino acid primes replication and becomes covalently linked to the genome ends. Although terminal proteins (TPs) lack sequence homology, they share a similar structural arrangement, with the priming residue in the C-terminal half of the protein and an accumulation of positively charged residues at the N-terminal end. In addition, various bacteriophage TPs have been shown to have DNA-binding capacity that targets TPs and their attached genomes to the host nucleoid. Furthermore, a number of bacteriophage TPs from different viral families and with diverse hosts also contain putative nuclear localization signals and localize in the eukaryotic nucleus, which could lead to the transport of the attached DNA. This suggests a possible role of bacteriophage TPs in prokaryote-to-eukaryote horizontal gene transfer. Highlights: • Protein-primed genome replication constitutes a strategy to initiate DNA or RNA synthesis in linear genomes. • Bacteriophage terminal proteins (TPs) are covalently attached to viral genomes through their primary function of priming DNA replication. • TPs are also DNA-binding proteins and target phage genomes to the host nucleoid. • TPs can also localize in the eukaryotic nucleus and may have a role in phage-mediated interkingdom gene transfer.
Horizontal gene transfer is a significant driver of gene innovation in dinoflagellates.

PubMed

Wisecaver, Jennifer H; Brosnahan, Michael L; Hackett, Jeremiah D

2013-01-01

The dinoflagellates are an evolutionarily and ecologically important group of microbial eukaryotes. Previous work suggests that horizontal gene transfer (HGT) is an important source of gene innovation in these organisms. However, dinoflagellate genomes are notoriously large and complex, making genomic investigation of this phenomenon impractical with currently available sequencing technology. Fortunately, de novo transcriptome sequencing and assembly provides an alternative approach for investigating HGT. We sequenced the transcriptome of the dinoflagellate Alexandrium tamarense Group IV to investigate how HGT has contributed to gene innovation in this group. Our comprehensive A. tamarense Group IV gene set was compared with those of 16 other eukaryotic genomes. Ancestral gene content reconstruction of ortholog groups shows that A. tamarense Group IV has the largest number of gene families gained (314-1,563 depending on inference method) relative to all other organisms in the analysis (0-782). Phylogenomic analysis indicates that genes horizontally acquired from bacteria are a significant proportion of this gene influx, as are genes transferred from other eukaryotes either through HGT or endosymbiosis. The dinoflagellates also display curious cases of gene loss associated with mitochondrial metabolism including the entire Complex I of oxidative phosphorylation. Some of these missing genes have been functionally replaced by bacterial and eukaryotic xenologs. The transcriptome of A. tamarense Group IV lends strong support to a growing body of evidence that dinoflagellate genomes are extraordinarily impacted by HGT.
Unicellular eukaryotes as models in cell and molecular biology: critical appraisal of their past and future value.

PubMed

Simon, Martin; Plattner, Helmut

2014-01-01

Unicellular eukaryotes have been appreciated as model systems for the analysis of crucial questions in cell and molecular biology. This includes Dictyostelium (chemotaxis, amoeboid movement, phagocytosis), Tetrahymena (telomere structure, telomerase function), Paramecium (variant surface antigens, exocytosis, phagocytosis cycle) or both ciliates (ciliary beat regulation, surface pattern formation), Chlamydomonas (flagellar biogenesis and beat), and yeast (S. cerevisiae) for innumerable aspects. Nowadays many problems may be tackled with "higher" eukaryotic/metazoan cells for which full genomic information as well as domain databases, etc., were available long before protozoa. Established molecular tools, commercial antibodies, and established pharmacology are additional advantages available for higher eukaryotic cells. Moreover, an increasing number of inherited genetic disturbances in humans have become elucidated and can serve as new models. Among lower eukaryotes, yeast will remain a standard model because of its peculiarities, including its reduced genome and availability in the haploid form. But do protists still have a future as models? This touches not only the basic understanding of biology but also practical aspects of research, such as fund raising. As we try to scrutinize, due to specific advantages some protozoa should and will remain favorable models for analyzing novel genes or specific aspects of cell structure and function. Outstanding examples are epigenetic phenomena, a field of rising interest. © 2014 Elsevier Inc. All rights reserved.
A global perspective on Campanulaceae: Biogeographic, genomic, and floral evolution.

PubMed

Crowl, Andrew A; Miles, Nicholas W; Visger, Clayton J; Hansen, Kimberly; Ayers, Tina; Haberle, Rosemarie; Cellinese, Nico

2016-02-01

The Campanulaceae are a diverse clade of flowering plants encompassing more than 2300 species in myriad habitats from tropical rainforests to arctic tundra. A robust, multigene phylogeny, including all major lineages, is presented to provide a broad, evolutionary perspective of this cosmopolitan clade. We used a phylogenetic framework, in combination with divergence dating, ancestral range estimation, chromosome modeling, and morphological character reconstruction analyses to infer phylogenetic placement and timing of major biogeographic, genomic, and morphological changes in the history of the group and provide insights into the diversification of this clade across six continents. Ancestral range estimation supports an out-of-Africa diversification following the Cretaceous-Tertiary extinction event. Chromosomal modeling, with corroboration from the distribution of synonymous substitutions among gene duplicates, provides evidence for as many as 20 genome-wide duplication events before large radiations. Morphological reconstructions support the hypothesis that switches in floral symmetry and anther dehiscence were important in the evolution of secondary pollen presentation mechanisms. This study provides a broad, phylogenetic perspective on the evolution of the Campanulaceae clade. The remarkable habitat diversity and cosmopolitan distribution of this lineage appear to be the result of a complex history of genome duplications and numerous long-distance dispersal events. We failed to find evidence for an ancestral polyploidy event for this clade, and our analyses indicate an ancestral base number of nine for the group. This study will serve as a framework for future studies in diverse areas of research in Campanulaceae. © 2016 Botanical Society of America.
Coiled-Coil Proteins Facilitated the Functional Expansion of the Centrosome

PubMed Central

Kuhn, Michael; Hyman, Anthony A.; Beyer, Andreas

2014-01-01

Repurposing existing proteins for new cellular functions is recognized as a main mechanism of evolutionary innovation, but its role in organelle evolution is unclear. Here, we explore the mechanisms that led to the evolution of the centrosome, an ancestral eukaryotic organelle that expanded its functional repertoire through the course of evolution. We developed a refined sequence alignment technique that is more sensitive to coiled coil proteins, which are abundant in the centrosome. For proteins with high coiled-coil content, our algorithm identified 17% more reciprocal best hits than BLAST. Analyzing 108 eukaryotic genomes, we traced the evolutionary history of centrosome proteins. In order to assess how these proteins formed the centrosome and adopted new functions, we computationally emulated evolution by iteratively removing the most recently evolved proteins from the centrosomal protein interaction network. Coiled-coil proteins that first appeared in the animal–fungi ancestor act as scaffolds and recruit ancestral eukaryotic proteins such as kinases and phosphatases to the centrosome. This process created a signaling hub that is crucial for multicellular development. Our results demonstrate how ancient proteins can be co-opted to different cellular localizations, thereby becoming involved in novel functions. PMID:24901223
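The reciprocal-best-hit criterion mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' coiled-coil-aware alignment algorithm; it only shows the generic pairing rule, and the protein identifiers, scores, and function names are hypothetical.

```python
from typing import Dict, List, Tuple

# (query, subject) -> alignment score; higher means a better hit.
Hits = Dict[Tuple[str, str], float]


def best_hits(scores: Hits) -> Dict[str, str]:
    """Map each query to its single highest-scoring subject."""
    best: Dict[str, Tuple[str, float]] = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b: Hits, b_vs_a: Hits) -> List[Tuple[str, str]]:
    """Return pairs (a, b) where b is a's best hit and a is b's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Hypothetical scores for two small proteomes, A and B.
a_vs_b = {("a1", "b1"): 250.0, ("a1", "b2"): 90.0,
          ("a2", "b2"): 180.0, ("a3", "b2"): 120.0}
b_vs_a = {("b1", "a1"): 245.0, ("b2", "a2"): 175.0,
          ("b2", "a3"): 110.0, ("b3", "a3"): 60.0}

pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(pairs)
```

In this toy data, a3 selects b2 as its best hit, but b2 prefers a2, so a3 is left unpaired. A more sensitive scoring scheme, such as the one the abstract describes, would change the scores fed into this rule rather than the rule itself.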
Genome-wide association study identifies HLA 8.1 ancestral haplotype alleles as major genetic risk factors for myositis phenotypes.

PubMed

Miller, F W; Chen, W; O'Hanlon, T P; Cooper, R G; Vencovsky, J; Rider, L G; Danko, K; Wedderburn, L R; Lundberg, I E; Pachman, L M; Reed, A M; Ytterberg, S R; Padyukov, L; Selva-O'Callaghan, A; Radstake, T R; Isenberg, D A; Chinoy, H; Ollier, W E R; Scheet, P; Peng, B; Lee, A; Byun, J; Lamb, J A; Gregersen, P K; Amos, C I

2015-10-01

Autoimmune muscle diseases (myositis) comprise a group of complex phenotypes influenced by genetic and environmental factors. To identify genetic risk factors in patients of European ancestry, we conducted a genome-wide association study (GWAS) of the major myositis phenotypes in a total of 1710 cases, which included 705 adult dermatomyositis, 473 juvenile dermatomyositis, 532 polymyositis and 202 adult dermatomyositis, juvenile dermatomyositis or polymyositis patients with anti-histidyl-tRNA synthetase (anti-Jo-1) autoantibodies, and compared them with 4724 controls. Single-nucleotide polymorphisms showing strong associations (P < 5 × 10⁻⁸) in GWAS were identified in the major histocompatibility complex (MHC) region for all myositis phenotypes together, as well as for the four clinical and autoantibody phenotypes studied separately. Imputation and regression analyses found that alleles comprising the human leukocyte antigen (HLA) 8.1 ancestral haplotype (AH8.1) defined essentially all the genetic risk in the phenotypes studied. Although the HLA DRB1*03:01 allele showed slightly stronger associations with adult and juvenile dermatomyositis, and HLA B*08:01 with polymyositis and anti-Jo-1 autoantibody-positive myositis, multiple alleles of AH8.1 were required for the full risk effects. Our findings establish that alleles of the AH8.1 comprise the primary genetic risk factors associated with the major myositis phenotypes in geographically diverse Caucasian populations.

Genome-wide Association Study Identifies HLA 8.1 Ancestral Haplotype Alleles as Major Genetic Risk Factors for Myositis Phenotypes

PubMed Central

Miller, Frederick W.; Chen, Wei; O’Hanlon, Terrance P.; Cooper, Robert G.; Vencovsky, Jiri; Rider, Lisa G.; Danko, Katalin; Wedderburn, Lucy R.; Lundberg, Ingrid E.; Pachman, Lauren M.; Reed, Ann M.; Ytterberg, Steven R.; Padyukov, Leonid; Selva-O’Callaghan, Albert; Radstake, Timothy R.; Isenberg, David A.; Chinoy, Hector; Ollier, William E.R.; Scheet, Paul; Peng, Bo; Lee, Annette; Byun, Jinyoung; Lamb, Janine A.; Gregersen, Peter K.; Amos, Christopher I.

2016-01-01

Autoimmune muscle diseases (myositis) comprise a group of complex phenotypes influenced by genetic and environmental factors. To identify genetic risk factors in patients of European ancestry, we conducted a genome-wide association study (GWAS) of the major myositis phenotypes in a total of 1710 cases, which included 705 adult dermatomyositis; 473 juvenile dermatomyositis; 532 polymyositis; and 202 adult dermatomyositis, juvenile dermatomyositis or polymyositis patients with anti-histidyl tRNA synthetase (anti-Jo-1) autoantibodies, and compared them with 4724 controls. Single-nucleotide polymorphisms showing strong associations (P < 5 × 10⁻⁸) in GWAS were identified in the major histocompatibility complex (MHC) region for all myositis phenotypes together, as well as for the four clinical and autoantibody phenotypes studied separately. Imputation and regression analyses found that alleles comprising the human leukocyte antigen (HLA) 8.1 ancestral haplotype (AH8.1) defined essentially all the genetic risk in the phenotypes studied. Although the HLA DRB1*03:01 allele showed slightly stronger associations with adult and juvenile dermatomyositis, and HLA B*08:01 with polymyositis and anti-Jo-1 autoantibody-positive myositis, multiple alleles of AH8.1 were required for the full risk effects. Our findings establish that alleles of the AH8.1 haplotype comprise the primary genetic risk factors associated with the major myositis phenotypes in geographically diverse Caucasian populations. PMID:26291516
Empirical estimation of genome-wide significance thresholds based on the 1000 Genomes Project data set.

PubMed

Kanai, Masahiro; Tanaka, Toshihiro; Okada, Yukinori

2016-10-01

To assess the statistical significance of associations between variants and traits, genome-wide association studies (GWAS) should employ an appropriate threshold that accounts for the massive burden of multiple testing in the study. Although most studies in the current literature commonly set a genome-wide significance threshold at the level of P = 5.0 × 10⁻⁸, the adequacy of this value for respective populations has not been fully investigated. To empirically estimate thresholds for different ancestral populations, we conducted GWAS simulations using the 1000 Genomes Phase 3 data set for Africans (AFR), Europeans (EUR), Admixed Americans (AMR), East Asians (EAS) and South Asians (SAS). The estimated empirical genome-wide significance thresholds were P_sig = 3.24 × 10⁻⁸ (AFR), 9.26 × 10⁻⁸ (EUR), 1.83 × 10⁻⁷ (AMR), 1.61 × 10⁻⁷ (EAS) and 9.46 × 10⁻⁸ (SAS). We additionally conducted trans-ethnic meta-analyses across all populations (ALL) and all populations except for AFR (ΔAFR), which yielded P_sig = 3.25 × 10⁻⁸ (ALL) and 4.20 × 10⁻⁸ (ΔAFR). Our results indicate that the current threshold (P = 5.0 × 10⁻⁸) is overly stringent for all ancestral populations except for Africans; however, we should employ a more stringent threshold when conducting a meta-analysis, regardless of the presence of African samples.
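As a rough reading aid for the thresholds reported above, the sketch below converts each one into the number of independent tests it would imply under a Bonferroni-style correction, and checks it against the conventional 5.0 × 10⁻⁸ cutoff. The family-wise error rate of 0.05 is an assumption (the abstract does not state it), and this arithmetic is not the simulation procedure the authors used.

```python
ALPHA = 0.05           # assumed family-wise error rate; not stated in the abstract
CONVENTIONAL = 5.0e-8  # conventional genome-wide significance threshold

# Empirical thresholds as reported in the abstract.
empirical = {
    "AFR": 3.24e-8,
    "EUR": 9.26e-8,
    "AMR": 1.83e-7,
    "EAS": 1.61e-7,
    "SAS": 9.46e-8,
}


def implied_independent_tests(threshold: float, alpha: float = ALPHA) -> float:
    """Number of independent tests M such that alpha / M equals the threshold."""
    return alpha / threshold


def conventional_is_overly_stringent(threshold: float) -> bool:
    """True when the conventional cutoff is smaller (stricter) than the empirical one."""
    return CONVENTIONAL < threshold


for population, threshold in empirical.items():
    tests = implied_independent_tests(threshold)
    verdict = conventional_is_overly_stringent(threshold)
    print(f"{population}: ~{tests:,.0f} implied tests, "
          f"conventional overly stringent: {verdict}")
```

Under the assumed error rate, the conventional cutoff corresponds to about one million independent tests. Only the AFR threshold implies more than that, which matches the abstract's conclusion that the conventional value is overly stringent for every population except Africans.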
Regulated Eukaryotic DNA Replication Origin Firing with Purified Proteins

PubMed Central

Yeeles, Joseph T.P.; Deegan, Tom D.; Janska, Agnieszka; Early, Anne; Diffley, John F. X.

2016-01-01

Eukaryotic cells initiate DNA replication from multiple origins, which must be tightly regulated to promote precise genome duplication in every cell cycle. To accomplish this, initiation is partitioned into two temporally discrete steps: a double hexameric MCM complex is first loaded at replication origins during G1 phase, and then converted to the active CMG (Cdc45, MCM, GINS) helicase during S phase. Here we describe the reconstitution of budding yeast DNA replication initiation with 16 purified replication factors, made from 42 polypeptides. Origin-dependent initiation recapitulates regulation seen in vivo. Cyclin dependent kinase (CDK) inhibits MCM loading by phosphorylating the origin recognition complex (ORC) and promotes CMG formation by phosphorylating Sld2 and Sld3. Dbf4 dependent kinase (DDK) promotes replication by phosphorylating MCM, and can act either before or after CDK. These experiments define the minimum complement of proteins, protein kinase substrates and co-factors required for regulated eukaryotic DNA replication. PMID:25739503
Regulated eukaryotic DNA replication origin firing with purified proteins.

PubMed

Yeeles, Joseph T P; Deegan, Tom D; Janska, Agnieszka; Early, Anne; Diffley, John F X

2015-03-26

Eukaryotic cells initiate DNA replication from multiple origins, which must be tightly regulated to promote precise genome duplication in every cell cycle. To accomplish this, initiation is partitioned into two temporally discrete steps: a double hexameric minichromosome maintenance (MCM) complex is first loaded at replication origins during G1 phase, and then converted to the active CMG (Cdc45-MCM-GINS) helicase during S phase. Here we describe the reconstitution of budding yeast DNA replication initiation with 16 purified replication factors, made from 42 polypeptides. Origin-dependent initiation recapitulates regulation seen in vivo. Cyclin-dependent kinase (CDK) inhibits MCM loading by phosphorylating the origin recognition complex (ORC) and promotes CMG formation by phosphorylating Sld2 and Sld3. Dbf4-dependent kinase (DDK) promotes replication by phosphorylating MCM, and can act either before or after CDK. These experiments define the minimum complement of proteins, protein kinase substrates and co-factors required for regulated eukaryotic DNA replication.
Genome Sequence of Azospirillum brasilense CBG497 and Comparative Analyses of Azospirillum Core and Accessory Genomes provide Insight into Niche Adaptation

PubMed Central

Wisniewski-Dyé, Florence; Lozano, Luis; Acosta-Cruz, Erika; Borland, Stéphanie; Drogue, Benoît; Prigent-Combaret, Claire; Rouy, Zoé; Barbe, Valérie; Mendoza Herrera, Alberto; González, Victor; Mavingui, Patrick

2012-01-01

Bacteria of the genus Azospirillum colonize roots of important cereals and grasses, and promote plant growth by several mechanisms, notably phytohormone synthesis. The genomes of several Azospirillum strains belonging to different species, isolated from various host plants and locations, were recently sequenced and published. In this study, an additional genome of an A. brasilense strain, isolated from maize grown on an alkaline soil in the northeast of Mexico, strain CBG497, was obtained. Comparative genomic analyses were performed on this new genome and three other genomes (A. brasilense Sp245, A. lipoferum 4B and Azospirillum sp. B510). The Azospirillum core genome was established and consists of 2,328 proteins, representing between 30% and 38% of the total encoded proteins within a genome. It is mainly chromosomally encoded and contains 74% of genes of ancestral origin shared with some aquatic relatives. The non-ancestral part of the core genome is enriched in genes involved in signal transduction, in transport and metabolism of carbohydrates and amino acids, and in surface properties linked to adaptation in fluctuating environments, such as soil and rhizosphere. Many genes involved in colonization of plant roots, plant-growth promotion (such as those involved in phytohormone biosynthesis), and properties involved in rhizosphere adaptation (such as catabolism of phenolic compounds, uptake of iron) are restricted to a particular strain and/or species, strongly suggesting niche-specific adaptation. PMID:24705077
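The notion of a core genome used above can be sketched as the set of protein families present in every compared genome. The strain names and family identifiers below are hypothetical, and the study's actual ortholog-clustering procedure is not described in the abstract; only the final arithmetic uses figures taken from it.

```python
from typing import Dict, Set


def core_genome(genomes: Dict[str, Set[str]]) -> Set[str]:
    """Protein families shared by all genomes (plain set intersection)."""
    family_sets = iter(genomes.values())
    core = set(next(family_sets))
    for families in family_sets:
        core &= families
    return core


# Hypothetical protein-family content of three strains.
genomes = {
    "strain_A": {"f1", "f2", "f3", "f4"},
    "strain_B": {"f1", "f2", "f4", "f5"},
    "strain_C": {"f1", "f2", "f6"},
}

core = core_genome(genomes)
core_fraction = {name: len(core) / len(families)
                 for name, families in genomes.items()}
print(sorted(core), core_fraction)

# Arithmetic on the abstract's figures: a core of 2,328 proteins making up
# 30% to 38% of a genome implies roughly these total protein counts.
largest_genome = round(2328 / 0.30)
smallest_genome = round(2328 / 0.38)
print(smallest_genome, largest_genome)
```

The fraction of a genome covered by the core varies between strains because the denominator (total encoded proteins) differs while the core itself is fixed, which is why the abstract reports a range rather than a single percentage.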
Evolution: Tracing the origins of centrioles, cilia, and flagella.

PubMed

Carvalho-Santos, Zita; Azimzadeh, Juliette; Pereira-Leal, José B; Bettencourt-Dias, Mónica

2011-07-25

Centrioles/basal bodies (CBBs) are microtubule-based cylindrical organelles that nucleate the formation of centrosomes, cilia, and flagella. CBBs, cilia, and flagella are ancestral structures; they are present in all major eukaryotic groups. Despite the conservation of their core structure, there is variability in their architecture, function, and biogenesis. Recent genomic and functional studies have provided insight into the evolution of the structure and function of these organelles.
The origin of introns and their role in eukaryogenesis: a compromise solution to the introns-early versus introns-late debate?

PubMed Central

Koonin, Eugene V

2006-01-01

Background Ever since the discovery of 'genes in pieces' and mRNA splicing in eukaryotes, origin and evolution of spliceosomal introns have been considered within the conceptual framework of the 'introns early' versus 'introns late' debate. The 'introns early' hypothesis, which is closely linked to the so-called exon theory of gene evolution, posits that protein-coding genes were interrupted by numerous introns even at the earliest stages of life's evolution and that introns played a major role in the origin of proteins by facilitating recombination of sequences coding for small protein/peptide modules. Under this scenario, the absence of spliceosomal introns in prokaryotes is considered to be a result of "genome streamlining". The 'introns late' hypothesis counters that spliceosomal introns emerged only in eukaryotes, and moreover, have been inserted into protein-coding genes continuously throughout the evolution of eukaryotes. Beyond the formal dilemma, the more substantial side of this debate has to do with possible roles of introns in the evolution of eukaryotes. Results I argue that several lines of evidence now suggest a coherent solution to the introns-early versus introns-late debate, and the emerging picture of intron evolution integrates aspects of both views although, formally, there seems to be no support for the original version of introns-early. Firstly, there is growing evidence that spliceosomal introns evolved from group II self-splicing introns which are present, usually, in small numbers, in many bacteria, and probably, moved into the evolving eukaryotic genome from the α-proteobacterial progenitor of the mitochondria. Secondly, the concept of a primordial pool of 'virus-like' genetic elements implies that self-splicing introns are among the most ancient genetic entities. Thirdly, reconstructions of the ancestral state of eukaryotic genes suggest that the last common ancestor of extant eukaryotes had an intron-rich genome. Thus, it appears that
GenomicusPlants: a web resource to study genome evolution in flowering plants.

PubMed

Louis, Alexandra; Murat, Florent; Salse, Jérôme; Crollius, Hugues Roest

2015-01-01

Comparative genomics combined with phylogenetic reconstructions are powerful approaches to study the evolution of genes and genomes. However, the current rapid expansion of the volume of genomic information makes it increasingly difficult to interrogate, integrate and synthesize comparative genome data while taking into account the maximum breadth of information available. GenomicusPlants (http://www.genomicus.biologie.ens.fr/genomicus-plants) is an extension of the Genomicus webserver that addresses this issue by allowing users to explore flowering plant genomes in an intuitive way, across the broadest evolutionary scales. Extant genomes of 26 flowering plants can be analyzed, as well as 23 ancestral reconstructed genomes. Ancestral gene order provides a long-term chronological view of gene order evolution, greatly facilitating comparative genomics and evolutionary studies. Four main interfaces ('views') are available where: (i) PhyloView combines phylogenetic trees with comparisons of genomic loci across any number of genomes; (ii) AlignView projects loci of interest against all other genomes to visualize its topological conservation; (iii) MatrixView compares two genomes in a classical dotplot representation; and (iv) Karyoview visualizes chromosome karyotypes 'painted' with colours of another genome of interest. All four views are interconnected and benefit from many customizable features. © The Author 2014. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists.
Extensive Chromosomal Reorganization in the Evolution of New World Muroid Rodents (Cricetidae, Sigmodontinae): Searching for Ancestral Phylogenetic Traits

PubMed Central

Pereira, Adenilson Leão; Malcher, Stella Miranda; Nagamachi, Cleusa Yoshiko; O’Brien, Patricia Caroline Mary; Ferguson-Smith, Malcolm Andrew; Mendes-Oliveira, Ana Cristina; Pieczarka, Julio Cesar

2016-01-01

Sigmodontinae rodents show great diversity and complexity in morphology and ecology. This diversity is accompanied by extensive chromosome variation, challenging attempts to reconstruct their ancestral genome. The species Hylaeamys megacephalus (HME; Oryzomyini, 2n = 54), Necromys lasiurus (NLA; Akodontini, 2n = 34) and Akodon sp. (ASP; Akodontini, 2n = 10) have extreme diploid numbers that make it difficult to understand the rearrangements that are responsible for such differences. In this study we analyzed these changes using whole chromosome probes of HME in cross-species painting of NLA and ASP to construct chromosome homology maps that reveal the rearrangements between species. We include data from the literature for other Sigmodontinae previously studied with HME and Mus musculus (MMU) probes. We also use the HME probes on MMU chromosomes for the comparative analysis of NLA with other species already mapped by MMU probes. Our results show that NLA and ASP have highly rearranged karyotypes when compared to HME. Eleven HME syntenic blocks are shared among the species studied here. Four syntenies may be ancestral to Akodontini (HME2/18, 3/25, 18/25 and 4/11/16) and eight to Sigmodontinae (HME26, 1/12, 6/21, 7/9, 5/17, 11/16, 20/13 and 19/14/19). Using MMU data we identified six associations shared among rodents from seven subfamilies, where MMU3/18 and MMU8/13 are phylogenetic signatures of Sigmodontinae. We suggest that the associations MMU2entire, MMU6proximal/12entire, MMU3/18, MMU8/13, MMU1/17, MMU10/17, MMU12/17, MMU5/16, MMU5/6 and MMU7/19 are part of the ancestral Sigmodontinae genome. PMID:26800516
Mitochondrial Genome Sequences of Nematocera (Lower Diptera): Evidence of Rearrangement following a Complete Genome Duplication in a Winter Crane Fly

PubMed Central

Beckenbach, Andrew T.

2012-01-01

The complete mitochondrial DNA sequences of eight representatives of lower Diptera, suborder Nematocera, along with nearly complete sequences from two other species, are presented. These taxa represent eight families not previously represented by complete mitochondrial DNA sequences. Most of the sequences retain the ancestral dipteran mitochondrial gene arrangement, while one sequence, that of the midge Arachnocampa flava (family Keroplatidae), has an inversion of the trnE gene. The most unusual result is the extensive rearrangement of the mitochondrial genome of a winter crane fly, Paracladura trichoptera (family Trichocera). The pattern of rearrangement indicates that the mechanism of rearrangement involved a tandem duplication of the entire mitochondrial genome, followed by random and nonrandom loss of one copy of each gene. Another winter crane fly retains the ancestral dipteran gene arrangement. A preliminary mitochondrial phylogeny of the Diptera is also presented. PMID:22155689
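The rearrangement mechanism described in this record (duplication of the whole genome, then loss of one copy of each gene) can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the gene order is treated as linear rather than circular, strand is ignored, the function name and the `keep_first` parameter are hypothetical, and the example gene list is toy data rather than the Paracladura arrangement.

```python
import random


def tandem_duplication_random_loss(order, keep_first=None, rng=None):
    """Apply one whole-genome tandem duplication followed by loss of one copy per gene.

    order: list of unique gene names in their current order.
    keep_first: optional set of genes whose FIRST copy survives; for every
        other gene the SECOND copy survives. If None, the surviving copy of
        each gene is chosen at random (the "random loss" case).
    Returns the new gene order.
    """
    rng = rng or random.Random()
    if keep_first is None:
        keep_first = {g for g in order if rng.random() < 0.5}
    # After duplication the genome reads order + order. Survivors from the
    # first copy keep their relative order and precede all survivors from
    # the second copy, which also keep their relative order.
    first_copy = [g for g in order if g in keep_first]
    second_copy = [g for g in order if g not in keep_first]
    return first_copy + second_copy


# Toy example with a handful of mitochondrial gene names.
ancestral = ["cox1", "cox2", "atp8", "atp6", "cox3"]
rearranged = tandem_duplication_random_loss(ancestral, keep_first={"cox1", "atp8"})
```

A single step of this kind can only interleave two order-preserving subsequences, which is the signature that allows a duplication-loss history to be recognized in a rearranged genome.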
FveGD: an online resource for diploid strawberry (Fragaria vesca) genomics data

USDA-ARS's Scientific Manuscript database

Fragaria vesca, a diploid strawberry species commonly known as the alpine or woodland strawberry, is a versatile experimental plant system that is an emerging model for the Rosaceae family. An ancestral F. vesca genome contributed to the genome of the octoploid dessert strawberry (F. ×ananassa) and...
The role of the DNA sliding clamp in Okazaki fragment maturation in archaea and eukaryotes.

PubMed

Beattie, Thomas R; Bell, Stephen D

2011-01-01

Efficient processing of Okazaki fragments generated during discontinuous lagging-strand DNA replication is critical for the maintenance of genome integrity. In eukaryotes, a number of enzymes co-ordinate to ensure the removal of initiating primers from the 5'-end of each fragment and the generation of a covalently linked daughter strand. Studies in eukaryotic systems have revealed that the co-ordination of DNA polymerase δ and FEN-1 (Flap Endonuclease 1) is sufficient to remove the majority of primers. Other pathways such as that involving Dna2 also operate under certain conditions, although, notably, Dna2 is not universally conserved between eukaryotes and archaea, unlike the other core factors. In addition to the catalytic components, the DNA sliding clamp, PCNA (proliferating-cell nuclear antigen), plays a pivotal role in binding and co-ordinating these enzymes at sites of lagging-strand replication. Structural studies in eukaryotic and archaeal systems have revealed that PCNA-binding proteins can adopt different conformations when binding PCNA. This conformational malleability may be key to the co-ordination of these enzymes' activities.
Phylogenetic Diversity of NTT Nucleotide Transport Proteins in Free-Living and Parasitic Bacteria and Eukaryotes

PubMed Central

Major, Peter; Embley, T. Martin

2017-01-01

Plasma membrane-located nucleotide transport proteins (NTTs) underpin the lifestyle of important obligate intracellular bacterial and eukaryotic pathogens by importing energy and nucleotides from infected host cells that the pathogens can no longer make for themselves. As such their presence is often seen as a hallmark of an intracellular lifestyle associated with reductive genome evolution and loss of primary biosynthetic pathways. Here, we investigate the phylogenetic distribution of NTT sequences across the domains of cellular life. Our analysis reveals an unexpectedly broad distribution of NTT genes in both host-associated and free-living prokaryotes and eukaryotes. We also identify cases of within-bacteria and bacteria-to-eukaryote horizontal NTT transfer, including into the base of the oomycetes, a major clade of parasitic eukaryotes. In addition to identifying sequences that retain the canonical NTT structure, we detected NTT gene fusions with HEAT-repeat and cyclic nucleotide binding domains in Cyanobacteria, pathogenic Chlamydiae and Oomycetes. Our results suggest that NTTs are versatile functional modules with a much wider distribution and a broader range of potential roles than has previously been appreciated. PMID:28164241
CRISPR-based technologies for the manipulation of eukaryotic genomes

PubMed Central

Komor, Alexis C.; Badran, Ahmed H.; Liu, David R.

2016-01-01

The CRISPR-Cas9 RNA-guided DNA endonuclease has contributed to an explosion of advances in the life sciences that have grown from the ability to edit genomes within living cells. In this review we summarize CRISPR-based technologies that enable mammalian genome editing and their various applications. We describe recent developments that extend the generality, DNA specificity, product selectivity, and fundamental capabilities of natural CRISPR systems, and some of the remarkable advancements in basic research, biotechnology, and therapeutics development that these developments have facilitated. PMID:27866654
Cryptosporidium as a testbed for single cell genome characterization of unicellular eukaryotes.

PubMed

Troell, Karin; Hallström, Björn; Divne, Anna-Maria; Alsmark, Cecilia; Arrighi, Romanico; Huss, Mikael; Beser, Jessica; Bertilsson, Stefan

2016-06-23

Infectious disease involving multiple genetically distinct populations of pathogens is frequently concurrent, but difficult to detect or describe with current routine methodology. Cryptosporidium sp. is a widespread gastrointestinal protozoan of global significance in both animals and humans. It cannot be easily maintained in culture and infections of multiple strains have been reported. To explore the potential use of single cell genomics methodology for revealing genome-level variation in clinical samples from Cryptosporidium-infected hosts, we sorted individual oocysts for subsequent genome amplification and full-genome sequencing. Cells were identified with fluorescent antibodies with an 80 % success rate for the entire single cell genomics workflow, demonstrating that the methodology can be applied directly to purified fecal samples. Ten amplified genomes from sorted single cells were selected for genome sequencing and compared both to the original population and a reference genome in order to evaluate the accuracy and performance of the method. Single cell genome coverage was on average 81 % even with a moderate sequencing effort and by combining the 10 single cell genomes, the full genome was accounted for. By a comparison to the original sample, biological variation could be distinguished and separated from noise introduced in the amplification. As a proof of principle, we have demonstrated the power of applying single cell genomics to dissect infectious disease caused by closely related parasite species or subtypes. The workflow can easily be expanded and adapted to target other protozoans, and potential applications include mapping genome-encoded traits, virulence, pathogenicity, host specificity and resistance at the level of cells as truly meaningful biological units.
Universal Temporal Profile of Replication Origin Activation in Eukaryotes

NASA Astrophysics Data System (ADS)

Goldar, Arach

2011-03-01

The complete and faithful transmission of the eukaryotic genome to daughter cells involves the timely duplication of the mother cell's DNA. DNA replication starts at multiple chromosomal positions called replication origins. From each activated replication origin, two replication forks progress in opposite directions and duplicate the mother cell's DNA. While it is widely accepted that in eukaryotic organisms replication origins are activated in a stochastic manner, little is known about the sources of the observed stochasticity. It is often attributed to population variability in entering S phase. We extract from a growing Saccharomyces cerevisiae population the average rate of origin activation in a single cell by combining single-molecule measurements and a numerical deconvolution technique. We show that the temporal profile of the rate of origin activation in a single cell is similar to the one extracted from a replicating cell population. Taking this observation into account, we exclude population variability as the origin of the observed stochasticity in origin activation. We confirm that the rate of origin activation increases in the early stage of S phase and decreases at the later stage. The population-average activation rate extracted from single-molecule analysis is in perfect accordance with the activation rate extracted from published microarray data, thereby confirming the homogeneity and genome-scale invariance of the dynamics of the replication process. All these observations point toward a possible role of the replication fork in controlling the rate of origin activation.
Mutational Dynamics of Aroid Chloroplast Genomes

PubMed Central

Ahmed, Ibrar; Biggs, Patrick J.; Matthews, Peter J.; Collins, Lesley J.; Hendy, Michael D.; Lockhart, Peter J.

2012-01-01

A characteristic feature of eukaryote and prokaryote genomes is the co-occurrence of nucleotide substitution and insertion/deletion (indel) mutations. Although similar observations have also been made for chloroplast DNA, genome-wide associations have not been reported. We determined the chloroplast genome sequences for two morphotypes of taro (Colocasia esculenta; family Araceae) and compared these with four publicly available aroid chloroplast genomes. Here, we report the extent of genome-wide association between direct and inverted repeats, indels, and substitutions in these aroid chloroplast genomes. We suggest that alternative but not mutually exclusive hypotheses explain the mutational dynamics of chloroplast genome evolution. PMID:23204304
Symbiosis in eukaryotic evolution.

PubMed

López-García, Purificación; Eme, Laura; Moreira, David

2017-12-07

Fifty years ago, Lynn Margulis, inspired by early twentieth-century ideas that put forward a symbiotic origin for some eukaryotic organelles, proposed a unified theory for the origin of the eukaryotic cell based on symbiosis as an evolutionary mechanism. Margulis was profoundly aware of the importance of symbiosis in the natural microbial world and anticipated the evolutionary significance that integrated cooperative interactions might have as a mechanism to increase cellular complexity. Today, we have started to fully appreciate the vast extent of microbial diversity and the importance of syntrophic metabolic cooperation in natural ecosystems, especially in sediments and microbial mats. Moreover, not only has the symbiogenetic origin of mitochondria and chloroplasts been clearly demonstrated, but improvements in phylogenomic methods combined with recent discoveries of archaeal lineages more closely related to eukaryotes further support the symbiogenetic origin of the eukaryotic cell. Margulis left us as a legacy the idea of 'eukaryogenesis by symbiogenesis'. Although this has been largely verified, when, where, and specifically how eukaryotic cells evolved remain unclear. Here, we briefly review current knowledge about symbiotic interactions in the microbial world and their evolutionary impact, the status of eukaryogenetic models and the current challenges and perspectives ahead to reconstruct the evolutionary path to eukaryotes. Copyright © 2017 Elsevier Ltd. All rights reserved.
CRISPR-Based Technologies for the Manipulation of Eukaryotic Genomes.

PubMed

Komor, Alexis C; Badran, Ahmed H; Liu, David R

2017-01-12

The CRISPR-Cas9 RNA-guided DNA endonuclease has contributed to an explosion of advances in the life sciences that have grown from the ability to edit genomes within living cells. In this Review, we summarize CRISPR-based technologies that enable mammalian genome editing and their various applications. We describe recent developments that extend the generality, DNA specificity, product selectivity, and fundamental capabilities of natural CRISPR systems, and we highlight some of the remarkable advancements in basic research, biotechnology, and therapeutics science that these developments have facilitated. Copyright © 2017 Elsevier Inc. All rights reserved.
Ancestral European roots of Helicobacter pylori in India

PubMed Central

Devi, S Manjulata; Ahmed, Irshad; Francalacci, Paolo; Hussain, M Abid; Akhter, Yusuf; Alvi, Ayesha; Sechi, Leonardo A; Mégraud, Francis; Ahmed, Niyaz

2007-01-01

Background The human gastric pathogen Helicobacter pylori has co-evolved with its host and therefore, origins and expansion of multiple populations and sub-populations of H. pylori mirror ancient human migrations. Ancestral origins of H. pylori in the vast Indian subcontinent are debatable. It is not clear how different waves of human migrations in South Asia shaped the population structure of H. pylori. We tried to address these issues through mapping genetic origins of present-day H. pylori in India and their genomic comparison with hundreds of isolates from different geographic regions. Results We attempted to dissect genetic identity of strains by multilocus sequence typing (MLST) of the 7 housekeeping genes (atpA, efp, ureI, ppa, mutY, trpC, yphC) and phylogeographic analysis of haplotypes using MEGA and NETWORK software while incorporating DNA sequences and genotyping data of whole cag pathogenicity-islands (cagPAI). The distribution of cagPAI genes within these strains was analyzed by using PCR and the geographic type of cagA phosphorylation motif EPIYA was determined by gene sequencing. All the isolates analyzed revealed European ancestry and belonged to H. pylori sub-population, hpEurope. The cagPAI harbored by Indian strains revealed European features upon PCR-based analysis and whole PAI sequencing. Conclusion These observations suggest that H. pylori strains in India share ancestral origins with their European counterparts. Further, non-existence of other sub-populations such as hpAfrica and hpEastAsia, at least in our collection of isolates, suggests that the hpEurope strains enjoyed a special fitness advantage in Indian stomachs to out-compete any endogenous strains. These results also might support hypotheses related to gene flow in India through Indo-Aryans and arrival of Neolithic practices and languages from the Fertile Crescent. PMID:17584914

Evolutionary versatility of eukaryotic protein domains revealed by their bigram networks

PubMed Central

2011-01-01

Background Protein domains are globular structures of independently folded polypeptides that exert catalytic or binding activities. Their sequences are recognized as evolutionary units that, through genome recombination, constitute protein repertoires of linkage patterns. Via mutations, domains acquire modified functions that contribute to the fitness of cells and organisms. Recent studies have addressed the evolutionary selection that may have shaped the functions of individual domains and the emergence of particular domain combinations, which led to new cellular functions in multi-cellular animals. This study focuses on modeling domain linkage globally and investigates evolutionary implications that may be revealed by novel computational analysis. Results A survey of 77 completely sequenced eukaryotic genomes implies a potential hierarchical and modular organization of biological functions in most living organisms. Domains in a genome or multiple genomes are modeled as a network of hetero-duplex covalent linkages, termed bigrams. A novel computational technique is introduced to decompose such networks, whereby the notion of domain "networking versatility" is derived and measured. The most and least "versatile" domains (termed "core domains" and "peripheral domains" respectively) are examined both computationally via sequence conservation measures and experimentally using selected domains. Our study suggests that such a versatility measure extracted from the bigram networks correlates with the adaptivity of domains during evolution, where the network core domains are highly adaptive, significantly contrasting the network peripheral domains. Conclusions Domain recombination has played a major part in the evolution of eukaryotes attributing to genome complexity. From a system point of view, as the results of selection and constant refinement, networks of domain linkage are structured in a hierarchical modular fashion. Domains with high degree of networking
Evolutionary versatility of eukaryotic protein domains revealed by their bigram networks.

PubMed

Xie, Xueying; Jin, Jing; Mao, Yongyi

2011-08-18

Protein domains are globular structures of independently folded polypeptides that exert catalytic or binding activities. Their sequences are recognized as evolutionary units that, through genome recombination, constitute protein repertoires of linkage patterns. Via mutations, domains acquire modified functions that contribute to the fitness of cells and organisms. Recent studies have addressed the evolutionary selection that may have shaped the functions of individual domains and the emergence of particular domain combinations, which led to new cellular functions in multi-cellular animals. This study focuses on modeling domain linkage globally and investigates evolutionary implications that may be revealed by novel computational analysis. A survey of 77 completely sequenced eukaryotic genomes implies a potential hierarchical and modular organization of biological functions in most living organisms. Domains in a genome or multiple genomes are modeled as a network of hetero-duplex covalent linkages, termed bigrams. A novel computational technique is introduced to decompose such networks, whereby the notion of domain "networking versatility" is derived and measured. The most and least "versatile" domains (termed "core domains" and "peripheral domains" respectively) are examined both computationally via sequence conservation measures and experimentally using selected domains. Our study suggests that such a versatility measure extracted from the bigram networks correlates with the adaptivity of domains during evolution, where the network core domains are highly adaptive, significantly contrasting the network peripheral domains. Domain recombination has played a major part in the evolution of eukaryotes attributing to genome complexity. From a system point of view, as the results of selection and constant refinement, networks of domain linkage are structured in a hierarchical modular fashion. Domains with high degree of networking versatility appear to be evolutionary
Robustness of Reconstructed Ancestral Protein Functions to Statistical Uncertainty.

PubMed

Eick, Geeta N; Bridgham, Jamie T; Anderson, Douglas P; Harms, Michael J; Thornton, Joseph W

2017-02-01

Hypotheses about the functions of ancient proteins and the effects of historical mutations on them are often tested using ancestral protein reconstruction (APR), the phylogenetic inference of ancestral sequences followed by synthesis and experimental characterization. Usually, some sequence sites are ambiguously reconstructed, with two or more statistically plausible states. The extent to which the inferred functions and mutational effects are robust to uncertainty about the ancestral sequence has not been studied systematically. To address this issue, we reconstructed ancestral proteins in three domain families that have different functions, architectures, and degrees of uncertainty; we then experimentally characterized the functional robustness of these proteins when uncertainty was incorporated using several approaches, including sampling amino acid states from the posterior distribution at each site and incorporating the alternative amino acid state at every ambiguous site in the sequence into a single "worst plausible case" protein. In every case, qualitative conclusions about the ancestral proteins' functions and the effects of key historical mutations were robust to sequence uncertainty, with similar functions observed even when scores of alternate amino acids were incorporated. There was some variation in quantitative descriptors of function among plausible sequences, suggesting that experimentally characterizing robustness is particularly important when quantitative estimates of ancient biochemical parameters are desired. The worst plausible case method appears to provide an efficient strategy for characterizing the functional robustness of ancestral proteins to large amounts of sequence uncertainty. Sampling from the posterior distribution sometimes produced artifactually nonfunctional proteins for sequences reconstructed with substantial ambiguity. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and
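The two ways of incorporating uncertainty named in this record, sampling from the per-site posterior and building a single "worst plausible case" sequence, can be sketched in Python as follows. The data layout (one dict of state-to-posterior per site), the function names, and in particular the 0.2 cutoff for calling a site ambiguous are assumptions made for illustration; the abstract does not state the threshold the authors used.

```python
import random


def ml_sequence(posteriors):
    """Maximum-posterior sequence: the most probable state at every site."""
    return "".join(max(site, key=site.get) for site in posteriors)


def worst_plausible_case(posteriors, threshold=0.2):
    """Take the second-best state at every ambiguous site, the best state elsewhere.

    A site counts as ambiguous when its second-best state has posterior
    probability >= threshold (an assumed cutoff, not taken from the source).
    """
    seq = []
    for site in posteriors:
        ranked = sorted(site, key=site.get, reverse=True)
        if len(ranked) > 1 and site[ranked[1]] >= threshold:
            seq.append(ranked[1])
        else:
            seq.append(ranked[0])
    return "".join(seq)


def sample_sequence(posteriors, rng=None):
    """Draw one sequence by sampling each site from its posterior distribution."""
    rng = rng or random.Random()
    return "".join(
        rng.choices(list(site), weights=list(site.values()), k=1)[0]
        for site in posteriors
    )


# Toy posterior table for a four-residue ancestor (values are invented).
toy = [
    {"A": 0.95, "S": 0.05},
    {"L": 0.60, "I": 0.40},
    {"K": 0.70, "R": 0.25, "Q": 0.05},
    {"G": 1.00},
]
best = ml_sequence(toy)
alt_all = worst_plausible_case(toy)
draw = sample_sequence(toy, rng=random.Random(1))
```

The worst-plausible-case sequence changes every ambiguous site at once, so one construct probes the combined effect of all plausible alternatives, whereas sampling can occasionally draw low-probability states at sites the cutoff would treat as unambiguous.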
The draft genome of Ciona intestinalis: insights into chordate and vertebrate origins.

PubMed

Dehal, Paramvir; Satou, Yutaka; Campbell, Robert K; Chapman, Jarrod; Degnan, Bernard; De Tomaso, Anthony; Davidson, Brad; Di Gregorio, Anna; Gelpke, Maarten; Goodstein, David M; Harafuji, Naoe; Hastings, Kenneth E M; Ho, Isaac; Hotta, Kohji; Huang, Wayne; Kawashima, Takeshi; Lemaire, Patrick; Martinez, Diego; Meinertzhagen, Ian A; Necula, Simona; Nonaka, Masaru; Putnam, Nik; Rash, Sam; Saiga, Hidetoshi; Satake, Masanobu; Terry, Astrid; Yamada, Lixy; Wang, Hong-Gang; Awazu, Satoko; Azumi, Kaoru; Boore, Jeffrey; Branno, Margherita; Chin-Bow, Stephen; DeSantis, Rosaria; Doyle, Sharon; Francino, Pilar; Keys, David N; Haga, Shinobu; Hayashi, Hiroko; Hino, Kyosuke; Imai, Kaoru S; Inaba, Kazuo; Kano, Shungo; Kobayashi, Kenji; Kobayashi, Mari; Lee, Byung-In; Makabe, Kazuhiro W; Manohar, Chitra; Matassi, Giorgio; Medina, Monica; Mochizuki, Yasuaki; Mount, Steve; Morishita, Tomomi; Miura, Sachiko; Nakayama, Akie; Nishizaka, Satoko; Nomoto, Hisayo; Ohta, Fumiko; Oishi, Kazuko; Rigoutsos, Isidore; Sano, Masako; Sasaki, Akane; Sasakura, Yasunori; Shoguchi, Eiichi; Shin-i, Tadasu; Spagnuolo, Antoinetta; Stainier, Didier; Suzuki, Miho M; Tassy, Olivier; Takatori, Naohito; Tokuoka, Miki; Yagi, Kasumi; Yoshizaki, Fumiko; Wada, Shuichi; Zhang, Cindy; Hyatt, P Douglas; Larimer, Frank; Detter, Chris; Doggett, Norman; Glavina, Tijana; Hawkins, Trevor; Richardson, Paul; Lucas, Susan; Kohara, Yuji; Levine, Michael; Satoh, Nori; Rokhsar, Daniel S

2002-12-13

The first chordates appear in the fossil record at the time of the Cambrian explosion, nearly 550 million years ago. The modern ascidian tadpole represents a plausible approximation to these ancestral chordates. To illuminate the origins of chordates and vertebrates, we generated a draft of the protein-coding portion of the genome of the most studied ascidian, Ciona intestinalis. The Ciona genome contains approximately 16,000 protein-coding genes, similar to the number in other invertebrates, but only half that found in vertebrates. Vertebrate gene families are typically found in simplified form in Ciona, suggesting that ascidians contain the basic ancestral complement of genes involved in cell signaling and development. The ascidian genome has also acquired a number of lineage-specific innovations, including a group of genes engaged in cellulose metabolism that are related to those in bacteria and fungi.
Transcription factor evolution in eukaryotes and the assembly of the regulatory toolkit in multicellular lineages

PubMed Central

de Mendoza, Alex; Sebé-Pedrós, Arnau; Šestak, Martin Sebastijan; Matejčić, Marija; Torruella, Guifré; Domazet-Lošo, Tomislav; Ruiz-Trillo, Iñaki

2013-01-01

Transcription factors (TFs) are the main players in transcriptional regulation in eukaryotes. However, it remains unclear what role TFs played in the origin of all of the different eukaryotic multicellular lineages. In this paper, we explore how the origin of TF repertoires shaped eukaryotic evolution and, in particular, their role in the emergence of multicellular lineages. We traced the origin and expansion of all known TFs through the eukaryotic tree of life, using the broadest possible taxon sampling and an updated phylogenetic background. Our results show that the most complex multicellular lineages (i.e., those with embryonic development, Metazoa and Embryophyta) have the most complex TF repertoires, and that these repertoires were assembled in a stepwise manner. We also show that a significant part of the metazoan and embryophyte TF toolkits evolved earlier, in their respective unicellular ancestors. To gain insights into the role of TFs in the development of both embryophytes and metazoans, we analyzed TF expression patterns throughout their ontogeny. The expression patterns observed in both groups recapitulate those of the whole transcriptome, but reveal some important differences. Our comparative genomics and expression data reshape our view on how TFs contributed to eukaryotic evolution and reveal the importance of TFs to the origins of multicellularity and embryonic development. PMID:24277850
Evolution and function of eukaryotic-like proteins from sponge symbionts.

PubMed

Reynolds, David; Thomas, Torsten

2016-10-01

Sponges (Porifera) are ancient metazoans that harbour diverse microorganisms, whose symbiotic interactions are essential for the host's health and function. Although symbioses between bacteria and sponges are ubiquitous, the molecular mechanisms that control these associations are largely unknown. Recent (meta-) genomic analyses discovered an abundance of genes encoding eukaryotic-like proteins (ELPs) in bacterial symbionts from different sponge species. ELPs belonging to the ankyrin repeat (AR) class from a bacterial symbiont of the sponge Cymbastela concentrica were subsequently found to modulate amoebal phagocytosis. This might be a molecular mechanism by which symbionts can control their interaction with the sponge. In this study, we investigated the evolution and function of ELPs from other classes and from symbionts found in other sponges to better understand the importance of ELPs for bacteria-eukaryote interactions. Phylogenetic analyses showed that all of the nine ELPs investigated were most closely related to proteins found either in eukaryotes or in bacteria that can live in association with eukaryotes. ELPs were then recombinantly expressed in Escherichia coli and exposed to the amoeba Acanthamoeba castellanii, which is functionally analogous to phagocytic cells in sponges. Phagocytosis assays with E. coli containing three ELP classes (AR, TPR-SEL1 and NHL) showed a significantly higher percentage of amoebae containing bacteria and a higher average number of intracellular bacteria per amoeba when compared to negative controls. The result that various classes of ELPs found in symbionts of different sponges can modulate phagocytosis indicates that they have a broader function in mediating bacteria-sponge interactions. © 2016 John Wiley & Sons Ltd.
Transposons to toxins: the provenance, architecture and diversification of a widespread class of eukaryotic effectors

PubMed Central

Zhang, Dapeng; Burroughs, A. Maxwell; Vidal, Newton D.; Iyer, Lakshminarayan M.; Aravind, L.

2016-01-01

Enzymatic effectors targeting nucleic acids, proteins and other cellular components are the mainstay of conflicts across life forms. Using comparative genomics we identify a large class of eukaryotic proteins, which include effectors from oomycetes, fungi and other parasites. The majority of these proteins have a characteristic domain architecture with one of several N-terminal ‘Header’ domains, which are predicted to play a role in trafficking of these effectors, including a novel version of the Ubiquitin fold. The Headers are followed by one or more diverse C-terminal domains, such as restriction endonuclease (REase), protein kinase, HNH endonuclease, LK-nuclease (an RNase) and multiple distinct peptidase domains, which are predicted to carry their toxicity determinants. The most common types of these proteins appear to have originated from prokaryotic transposases (e.g. TN7 and Mu) and combine a CDC6/ORC1-STAND clade NTPase domain with a C-terminal REase domain. Other than the so-called Crinkler effectors of oomycetes and fungi, these effectors are encoded by other eukaryotic parasites such as trypanosomatids (the RHS proteins) and the rhizarian Plasmodiophora, and symbionts like Capsaspora. Remarkably, we also find these proteins in free-living eukaryotes, including several viridiplantae, fungi, amoebozoans and animals. These versions might either still be transposons or function in other poorly understood eukaryote-specific inter-organismal and inter-genomic conflicts. These include the Medea1 selfish element of Tribolium that spreads via post-zygotic killing. We present a unified mechanism for the recombination-dependent diversification and action of this widespread class of molecular weaponry deployed across diverse conflicts ranging from parasitic to free-living forms. PMID:27060143
Communities of microbial eukaryotes in the mammalian gut within the context of environmental eukaryotic diversity

DOE Office of Scientific and Technical Information (OSTI.GOV)

Parfrey, Laura Wegener; Walters, William A.; Lauber, Christian L.

2014-06-19

Eukaryotic microbes (protists) residing in the vertebrate gut influence host health and disease, but their diversity and distribution in healthy hosts is poorly understood. Protists found in the gut are typically considered parasites, but many are commensal and some are beneficial. Further, the hygiene hypothesis predicts that association with our co-evolved microbial symbionts may be important to overall health. It is therefore imperative that we understand the normal diversity of our eukaryotic gut microbiota to test for such effects and avoid eliminating commensal organisms. We assembled a dataset of healthy individuals from two populations, one with traditional, agrarian lifestyles and a second with modern, westernized lifestyles, and characterized the human eukaryotic microbiota via high-throughput sequencing. To place the human gut microbiota within a broader context our dataset also includes gut samples from diverse mammals and samples from other aquatic and terrestrial environments. We curated the SILVA ribosomal database to reflect current knowledge of eukaryotic taxonomy and employ it as a phylogenetic framework to compare eukaryotic diversity across environment. We show that adults from the non-western population harbor a diverse community of protists, and diversity in the human gut is comparable to that in other mammals. However, the eukaryotic microbiota of the western population appears depauperate. The distribution of symbionts found in mammals reflects both host phylogeny and diet. Eukaryotic microbiota in the gut are less diverse and more patchily distributed than bacteria. More broadly, we show that eukaryotic communities in the gut are less diverse than in aquatic and terrestrial habitats, and few taxa are shared across habitat types, and diversity patterns of eukaryotes are correlated with those observed for bacteria. These results outline the distribution and diversity of microbial eukaryotic communities in the mammalian gut and across
Protein and DNA Modifications: Evolutionary Imprints of Bacterial Biochemical Diversification and Geochemistry on the Provenance of Eukaryotic Epigenetics

PubMed Central

Aravind, L.; Burroughs, A. Maxwell; Zhang, Dapeng; Iyer, Lakshminarayan M.

2014-01-01

Epigenetic information, which plays a major role in eukaryotic biology, is transmitted by covalent modifications of nuclear proteins (e.g., histones) and DNA, along with poorly understood processes involving cytoplasmic/secreted proteins and RNAs. The origin of eukaryotes was accompanied by emergence of a highly developed biochemical apparatus for encoding, resetting, and reading covalent epigenetic marks in proteins such as histones and tubulins. The provenance of this apparatus remained unclear until recently. Developments in comparative genomics show that key components of eukaryotic epigenetics emerged as part of the extensive biochemical innovation of secondary metabolism and intergenomic/interorganismal conflict systems in prokaryotes, particularly bacteria. These supplied not only enzymatic components for encoding and removing epigenetic modifications, but also readers of some of these marks. Diversification of these prokaryotic systems and subsequently eukaryotic epigenetics appear to have been considerably influenced by the great oxygenation event in the Earth’s history. PMID:24984775
Ancestrality and evolution of trait syndromes in finches (Fringillidae).

PubMed

Ponge, Jean-François; Zuccon, Dario; Elias, Marianne; Pavoine, Sandrine; Henry, Pierre-Yves; Théry, Marc; Guilbert, Éric

2017-12-01

Species traits have been hypothesized by one of us (Ponge, 2013) to evolve in a correlated manner as species colonize stable, undisturbed habitats, shifting from "ancestral" to "derived" strategies. We predicted that generalism, r-selection, sexual monomorphism, and migration/gregariousness are the ancestral states (collectively called strategy A) and evolved correlatively toward specialism, K-selection, sexual dimorphism, and residence/territoriality as habitat stabilized (collectively called B strategy). We analyzed the correlated evolution of four syndromes, summarizing the covariation between 53 traits, respectively, involved in ecological specialization, r-K gradient, sexual selection, and dispersal/social behaviors in 81 species representative of Fringillidae, a bird family with available natural history information and that shows variability for all these traits. The ancestrality of strategy A was supported for three of the four syndromes, the ancestrality of generalism having a weaker support, except for the core group Carduelinae (69 species). It appeared that two different B-strategies evolved from the ancestral state A, both associated with highly predictable environments: one in poorly seasonal environments, called B1, with species living permanently in lowland tropics, with "slow pace of life" and weak sexual dimorphism, and one in highly seasonal environments, called B2, with species breeding out-of-the-tropics, migratory, with a "fast pace of life" and high sexual dimorphism.
Sexually Dimorphic Effects of Ancestral Exposure to Vinclozolin on Stress Reactivity in Rats

PubMed Central

Gillette, Ross; Miller-Crews, Isaac; Nilsson, Eric E.; Skinner, Michael K.; Gore, Andrea C.

2014-01-01

How an individual responds to the environment depends upon both personal life history as well as inherited genetic and epigenetic factors from ancestors. Using a 2-hit, 3 generations apart model, we tested how F3 descendants of rats given in utero exposure to the environmental endocrine-disrupting chemical (EDC) vinclozolin reacted to stress during adolescence in their own lives, focusing on sexually dimorphic phenotypic outcomes. In adulthood, male and female F3 vinclozolin- or vehicle-lineage rats, stressed or nonstressed, were behaviorally characterized on a battery of tests and then euthanized. Serum was used for hormone assays, and brains were used for quantitative PCR and transcriptome analyses. Results showed that the effects of ancestral exposure to vinclozolin converged with stress experienced during adolescence in a sexually dimorphic manner. Debilitating effects were seen at all levels of the phenotype, including physiology, behavior, brain metabolism, gene expression, and genome-wide transcriptome modifications in specific brain nuclei. Additionally, females were significantly more vulnerable than males to transgenerational effects of vinclozolin on anxiety but not sociality tests. This fundamental transformation occurs in a manner not predicted by the ancestral exposure or the proximate effects of stress during adolescence, an interaction we refer to as synchronicity. PMID:25051444
Sexually dimorphic effects of ancestral exposure to vinclozolin on stress reactivity in rats.

PubMed

Gillette, Ross; Miller-Crews, Isaac; Nilsson, Eric E; Skinner, Michael K; Gore, Andrea C; Crews, David

2014-10-01

How an individual responds to the environment depends upon both personal life history as well as inherited genetic and epigenetic factors from ancestors. Using a 2-hit, 3 generations apart model, we tested how F3 descendants of rats given in utero exposure to the environmental endocrine-disrupting chemical (EDC) vinclozolin reacted to stress during adolescence in their own lives, focusing on sexually dimorphic phenotypic outcomes. In adulthood, male and female F3 vinclozolin- or vehicle-lineage rats, stressed or nonstressed, were behaviorally characterized on a battery of tests and then euthanized. Serum was used for hormone assays, and brains were used for quantitative PCR and transcriptome analyses. Results showed that the effects of ancestral exposure to vinclozolin converged with stress experienced during adolescence in a sexually dimorphic manner. Debilitating effects were seen at all levels of the phenotype, including physiology, behavior, brain metabolism, gene expression, and genome-wide transcriptome modifications in specific brain nuclei. Additionally, females were significantly more vulnerable than males to transgenerational effects of vinclozolin on anxiety but not sociality tests. This fundamental transformation occurs in a manner not predicted by the ancestral exposure or the proximate effects of stress during adolescence, an interaction we refer to as synchronicity.
Ancestral gene reconstruction and synthesis of ancient rhodopsins in the laboratory.

PubMed

Chang, Belinda S W

2003-08-01

Laboratory synthesis of ancestral proteins offers an intriguing opportunity to study the past directly. The development of Bayesian methods to infer ancestral sequences, combined with advances in models of molecular evolution, and synthetic gene technology make this an increasingly promising approach in evolutionary studies of molecular function. Visual pigments form the first step in the biochemical cascade of events in the retina in all animals known to possess visual capabilities. In vertebrates, the necessity of spanning a dynamic range of light intensities of many orders of magnitude has given rise to two different types of photoreceptors, rods specialized for dim-light conditions, and cones for daylight and color vision. These photoreceptors contain different types of visual pigment genes. Reviewed here are methods of inferring ancestral sequences, chemical synthesis of artificial ancestral genes in the laboratory, and applications to the evolution of vertebrate visual systems and the experimental recreation of an archosaur rod visual pigment. The ancestral archosaurs gave rise to several notable lineages of diapsid reptiles, including the birds and the dinosaurs, and would have existed over 200 MYA. What little is known of their physiology comes from fossil remains, and inference based on the biology of their living descendants. Despite its age, an ancestral archosaur pigment was successfully recreated in the lab, and showed interesting properties of its wavelength sensitivity that may have implications for the visual capabilities of the ancestral archosaurs in dim light.
Spontaneous Mutation Rate in the Smallest Photosynthetic Eukaryotes

PubMed Central

Krasovec, Marc; Eyre-Walker, Adam; Sanchez-Ferandin, Sophie

2017-01-01

Mutation is the ultimate source of genetic variation, and knowledge of mutation rates is fundamental for our understanding of all evolutionary processes. High throughput sequencing of mutation accumulation lines has provided genome wide spontaneous mutation rates in a dozen model species, but estimates from nonmodel organisms from much of the diversity of life are very limited. Here, we report mutation rates in four haploid marine bacterial-sized photosynthetic eukaryotic algae: Bathycoccus prasinos, Ostreococcus tauri, Ostreococcus mediterraneus, and Micromonas pusilla. The spontaneous mutation rate between species varies from μ = 4.4 × 10⁻¹⁰ to 9.8 × 10⁻¹⁰ mutations per nucleotide per generation. Within genomes, there is a two-fold increase of the mutation rate in intergenic regions, consistent with an optimization of mismatch and transcription-coupled DNA repair in coding sequences. Additionally, we show that deviation from the equilibrium GC content increases the mutation rate by ∼2% to ∼12% because of a GC bias in coding sequences. More generally, the difference between the observed and equilibrium GC content of genomes explains some of the inter-specific variation in mutation rates. PMID:28379581
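As a rough illustration of how a per-nucleotide, per-generation rate of this kind is commonly computed from mutation accumulation lines, the sketch below divides the number of observed mutations by the total number of site-generations surveyed. The function name and every input number are hypothetical and are not taken from the study; only the reported range (4.4 × 10⁻¹⁰ to 9.8 × 10⁻¹⁰) comes from the abstract.

```python
def mutation_rate(lines):
    """Estimate mutations per nucleotide per generation.

    `lines` is a list of (mutations, callable_sites, generations)
    tuples, one per mutation accumulation line (hypothetical inputs).
    """
    total_mutations = sum(m for m, _, _ in lines)
    # Each line contributes (sites x generations) opportunities to mutate.
    total_site_generations = sum(s * g for _, s, g in lines)
    if total_site_generations == 0:
        raise ValueError("no callable site-generations")
    return total_mutations / total_site_generations


# Hypothetical example: three lines, 12 Mb callable, 500 generations each.
example_lines = [
    (3, 12_000_000, 500),
    (4, 12_000_000, 500),
    (5, 12_000_000, 500),
]
mu = mutation_rate(example_lines)

# Fold difference between the two extremes reported in the abstract.
fold_range = 9.8e-10 / 4.4e-10
```

Pooling mutations and site-generations across lines, rather than averaging per-line rates, keeps lines with few or zero mutations from distorting the estimate.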
An ancestral host defence peptide within human β-defensin 3 recapitulates the antibacterial and antiviral activity of the full-length molecule

PubMed Central

Nigro, Ersilia; Colavita, Irene; Sarnataro, Daniela; Scudiero, Olga; Zambrano, Gerardo; Granata, Vincenzo; Daniele, Aurora; Carotenuto, Alfonso; Galdiero, Stefania; Folliero, Veronica; Galdiero, Massimiliano; Urbanowicz, Richard A.; Ball, Jonathan K.; Salvatore, Francesco; Pessi, Antonello

2015-01-01

Host defence peptides (HDPs) are critical components of innate immunity. Despite their diversity, they share common features including a structural signature, designated “γ-core motif”. We reasoned that for each HDP that evolved from an ancestral γ-core, the latter should be the evolutionary starting point of the molecule, i.e. it should represent a structural scaffold for the modular construction of the full-length molecule, and possess biological properties. We explored the γ-core of human β-defensin 3 (HBD3) and found that it: (a) is the folding nucleus of HBD3; (b) folds rapidly and is stable in human serum; (c) displays antibacterial activity; (d) binds to CD98, which mediates HBD3 internalization in eukaryotic cells; (e) exerts antiviral activity against human immunodeficiency virus and herpes simplex virus; and (f) is not toxic to human cells. These results demonstrate that the γ-core within HBD3 is the ancestral core of the full-length molecule and is a viable HDP per se, since it is endowed with the most important biological features of HBD3. Notably, the small, stable scaffold of the HBD3 γ-core can be exploited to design disease-specific antimicrobial agents. PMID:26688341
The Big Bang of picorna-like virus evolution antedates the radiation of eukaryotic supergroups.

PubMed

Koonin, Eugene V; Wolf, Yuri I; Nagasaki, Keizo; Dolja, Valerian V

2008-12-01

The recent discovery of RNA viruses in diverse unicellular eukaryotes and developments in evolutionary genomics have provided the means for addressing the origin of eukaryotic RNA viruses. The phylogenetic analyses of RNA polymerases and helicases presented in this Analysis article reveal close evolutionary relationships between RNA viruses infecting hosts from the Chromalveolate and Excavate supergroups and distinct families of picorna-like viruses of plants and animals. Thus, diversification of picorna-like viruses probably occurred in a 'Big Bang' concomitant with key events of eukaryogenesis. The origins of the conserved genes of picorna-like viruses are traced to likely ancestors including bacterial group II retroelements, the family of HtrA proteases and DNA bacteriophages.
Hidden genetic variation in the germline genome of Tetrahymena thermophila.

PubMed

Dimond, K L; Zufall, R A

2016-06-01

Genome architecture varies greatly among eukaryotes. This diversity may profoundly affect the origin and maintenance of genetic variation within a population. Ciliates are microbial eukaryotes with unusual genome features, such as the separation of germline and somatic genomes within a single cell and amitotic division. These features have previously been proposed to increase the rate of molecular evolution in these species. Here, we assessed the fitness effects of genetic variation in the two genomes of natural isolates of the ciliate Tetrahymena thermophila. We find more extensive genetic variation in fitness in the transcriptionally silent germline genome than in the expressed somatic genome. Surprisingly, this variation is not primarily deleterious, but has both beneficial and deleterious effects. We conclude that Tetrahymena genome architecture allows for the maintenance of genetic variation that would otherwise be eliminated by selection. We consider the effect of selection on the two genomes and the impacts of reproductive strategies and the mechanism of sex determination on the structure of this variation. Journal of Evolutionary Biology © 2016 European Society For Evolutionary Biology.
Horizontal Gene Transfer is a Significant Driver of Gene Innovation in Dinoflagellates

PubMed Central

Wisecaver, Jennifer H.; Brosnahan, Michael L.; Hackett, Jeremiah D.

2013-01-01

The dinoflagellates are an evolutionarily and ecologically important group of microbial eukaryotes. Previous work suggests that horizontal gene transfer (HGT) is an important source of gene innovation in these organisms. However, dinoflagellate genomes are notoriously large and complex, making genomic investigation of this phenomenon impractical with currently available sequencing technology. Fortunately, de novo transcriptome sequencing and assembly provides an alternative approach for investigating HGT. We sequenced the transcriptome of the dinoflagellate Alexandrium tamarense Group IV to investigate how HGT has contributed to gene innovation in this group. Our comprehensive A. tamarense Group IV gene set was compared with those of 16 other eukaryotic genomes. Ancestral gene content reconstruction of ortholog groups shows that A. tamarense Group IV has the largest number of gene families gained (314–1,563 depending on inference method) relative to all other organisms in the analysis (0–782). Phylogenomic analysis indicates that genes horizontally acquired from bacteria are a significant proportion of this gene influx, as are genes transferred from other eukaryotes either through HGT or endosymbiosis. The dinoflagellates also display curious cases of gene loss associated with mitochondrial metabolism including the entire Complex I of oxidative phosphorylation. Some of these missing genes have been functionally replaced by bacterial and eukaryotic xenologs. The transcriptome of A. tamarense Group IV lends strong support to a growing body of evidence that dinoflagellate genomes are extraordinarily impacted by HGT. PMID:24259313
Genomic minimalism in the early diverging intestinal parasite Giardia lamblia.

PubMed

Morrison, Hilary G; McArthur, Andrew G; Gillin, Frances D; Aley, Stephen B; Adam, Rodney D; Olsen, Gary J; Best, Aaron A; Cande, W Zacheus; Chen, Feng; Cipriano, Michael J; Davids, Barbara J; Dawson, Scott C; Elmendorf, Heidi G; Hehl, Adrian B; Holder, Michael E; Huse, Susan M; Kim, Ulandt U; Lasek-Nesselquist, Erica; Manning, Gerard; Nigam, Anuranjini; Nixon, Julie E J; Palm, Daniel; Passamaneck, Nora E; Prabhu, Anjali; Reich, Claudia I; Reiner, David S; Samuelson, John; Svard, Staffan G; Sogin, Mitchell L

2007-09-28

The genome of the eukaryotic protist Giardia lamblia, an important human intestinal parasite, is compact in structure and content, contains few introns or mitochondrial relics, and has simplified machinery for DNA replication, transcription, RNA processing, and most metabolic pathways. Protein kinases comprise the single largest protein class and reflect Giardia's requirement for a complex signal transduction network for coordinating differentiation. Lateral gene transfer from bacterial and archaeal donors has shaped Giardia's genome, and previously unknown gene families, for example, cysteine-rich structural proteins, have been discovered. Unexpectedly, the genome shows little evidence of heterozygosity, supporting recent speculations that this organism is sexual. This genome sequence will not only be valuable for investigating the evolution of eukaryotes, but will also be applied to the search for new therapeutics for this parasite.
A phylogenetic Kalman filter for ancestral trait reconstruction using molecular data.

PubMed

Lartillot, Nicolas

2014-02-15

Correlation between life history or ecological traits and genomic features such as nucleotide or amino acid composition can be used for reconstructing the evolutionary history of the traits of interest along phylogenies. Thus far, however, such ancestral reconstructions have been done using simple linear regression approaches that do not account for phylogenetic inertia. These reconstructions could instead be seen as a genuine comparative regression problem, such as formalized by classical generalized least-square comparative methods, in which the trait of interest and the molecular predictor are represented as correlated Brownian characters coevolving along the phylogeny. Here, a Bayesian sampler is introduced, representing an alternative and more efficient algorithmic solution to this comparative regression problem, compared with currently existing generalized least-square approaches. Technically, ancestral trait reconstruction based on a molecular predictor is shown to be formally equivalent to a phylogenetic Kalman filter problem, for which backward and forward recursions are developed and implemented in the context of a Markov chain Monte Carlo sampler. The comparative regression method results in more accurate reconstructions and a more faithful representation of uncertainty, compared with simple linear regression. Application to the reconstruction of the evolution of optimal growth temperature in Archaea, using GC composition in ribosomal RNA stems and amino acid composition of a sample of protein-coding genes, confirms previous findings, in particular, pointing to a hyperthermophilic ancestor for the kingdom. The program is freely available at www.phylobayes.org.
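To make the Brownian-motion framing concrete, here is a minimal sketch of the simplest related calculation: the upward (tips-to-root) recursion that estimates the root value of a single continuous trait evolving by Brownian motion on a known tree. This is the classic pruning recursion associated with independent contrasts, which yields the generalized least-squares estimate at the root. It is not the Bayesian sampler or the two-character Kalman filter described in the abstract, and the tree, trait values, and function name are hypothetical.

```python
def root_estimate(node):
    """Return (estimate, extra_variance) for a subtree under Brownian motion.

    A node is either a tip, given as a float trait value, or an internal
    node, given as a list of two (child, branch_length) pairs.
    `extra_variance` is the length to add to the branch above this node
    to account for uncertainty in its estimated value.
    """
    if isinstance(node, (int, float)):
        return float(node), 0.0  # tips are observed without error
    (left, bl_left), (right, bl_right) = node
    x1, e1 = root_estimate(left)
    x2, e2 = root_estimate(right)
    v1, v2 = bl_left + e1, bl_right + e2  # effective branch lengths
    # Precision-weighted average: shorter branches get more weight.
    estimate = (x1 / v1 + x2 / v2) / (1.0 / v1 + 1.0 / v2)
    return estimate, v1 * v2 / (v1 + v2)


# Hypothetical tree ((A:1, B:1):1, C:2) with trait values A=10, B=14, C=20.
tree = [([(10.0, 1.0), (14.0, 1.0)], 1.0), (20.0, 2.0)]
estimate, variance = root_estimate(tree)
```

At the root, `variance` is the sampling variance of the estimate in units of the Brownian rate, so multiplying it by an estimated rate parameter would give an absolute uncertainty.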

Y chromosome of D. pseudoobscura is not homologous to the ancestral Drosophila Y.

PubMed

Carvalho, Antonio Bernardo; Clark, Andrew G

2005-01-07

We report a genome-wide search of Y-linked genes in Drosophila pseudoobscura. All six identifiable orthologs of the D. melanogaster Y-linked genes have autosomal inheritance in D. pseudoobscura. Four orthologs were investigated in detail and proved to be Y-linked in D. guanche and D. bifasciata, which shows that less than 18 million years ago the ancestral Drosophila Y chromosome was translocated to an autosome in the D. pseudoobscura lineage. We found 15 genes and pseudogenes in the current Y of D. pseudoobscura, and none are shared with the D. melanogaster Y. Hence, the Y chromosome in the D. pseudoobscura lineage appears to have arisen de novo and is not homologous to the D. melanogaster Y.
RNA Export through the NPC in Eukaryotes.

PubMed

Okamura, Masumi; Inose, Haruko; Masuda, Seiji

2015-03-20

In eukaryotic cells, RNAs are transcribed in the nucleus and exported to the cytoplasm through the nuclear pore complex (NPC). The RNA molecules that are exported from the nucleus into the cytoplasm include messenger RNAs (mRNAs), ribosomal RNAs (rRNAs), transfer RNAs (tRNAs), small nuclear RNAs (snRNAs), micro RNAs (miRNAs), and viral mRNAs. Each RNA is transported by a specific nuclear export receptor. It is believed that most of the mRNAs are exported by Nxf1 (Mex67 in yeast), whereas rRNAs, snRNAs, and a certain subset of mRNAs are exported in a Crm1/Xpo1-dependent manner. tRNAs and miRNAs are exported by Xpot and Xpo5. However, multiple export receptors are involved in the export of some RNAs, such as the 60S ribosomal subunit. In addition to these export receptors, some adapter proteins are required to export RNAs. The RNA export system of eukaryotic cells is also used by several types of RNA virus that depend on the machineries of the host cell in the nucleus for replication of their genome; therefore, this review describes the RNA export system of two representative viruses. We also discuss the NPC anchoring-dependent mRNA export factors that directly recruit specific genes to the NPC.
Insights into the Initiation of Eukaryotic DNA Replication.

PubMed

Bruck, Irina; Perez-Arnaiz, Patricia; Colbert, Max K; Kaplan, Daniel L

2015-01-01

The initiation of DNA replication is a highly regulated event in eukaryotic cells to ensure that the entire genome is copied once and only once during S phase. The primary target of cellular regulation of eukaryotic DNA replication initiation is the assembly and activation of the replication fork helicase, the 11-subunit assembly that unwinds DNA at a replication fork. The replication fork helicase, called CMG (for Cdc45, Mcm2-7, and GINS), assembles in S phase from the constituent Cdc45, Mcm2-7, and GINS proteins. The assembly and activation of the CMG replication fork helicase during S phase is governed by 2 S-phase specific kinases, CDK and DDK. CDK stimulates the interaction between Sld2, Sld3, and Dpb11, 3 initiation factors that are each required for the initiation of DNA replication. DDK, on the other hand, phosphorylates the Mcm2, Mcm4, and Mcm6 subunits of the Mcm2-7 complex. Sld3 recruits Cdc45 to Mcm2-7 in a manner that depends on DDK, and recent work suggests that Sld3 binds directly to Mcm2-7 and also to single-stranded DNA. Furthermore, recent work demonstrates that Sld3 and its human homolog Treslin substantially stimulate DDK phosphorylation of Mcm2. These data suggest that the initiation factor Sld3/Treslin coordinates the assembly and activation of the eukaryotic replication fork helicase by recruiting Cdc45 to Mcm2-7, stimulating DDK phosphorylation of Mcm2, and binding directly to single-stranded DNA as the origin is melted.
Complete sequence and analysis of the mitochondrial genome of Hemiselmis andersenii CCMP644 (Cryptophyceae).

PubMed

Kim, Eunsoo; Lane, Christopher E; Curtis, Bruce A; Kozera, Catherine; Bowman, Sharen; Archibald, John M

2008-05-12

Cryptophytes are an enigmatic group of unicellular eukaryotes with plastids derived by secondary (i.e., eukaryote-eukaryote) endosymbiosis. Cryptophytes are unusual in that they possess four genomes: a host cell-derived nuclear and mitochondrial genome and an endosymbiont-derived plastid and 'nucleomorph' genome. The evolutionary origins of the host and endosymbiont components of cryptophyte algae are at present poorly understood. Thus far, a single complete mitochondrial genome sequence has been determined for the cryptophyte Rhodomonas salina. Here, the second complete mitochondrial genome of the cryptophyte alga Hemiselmis andersenii CCMP644 is presented. The H. andersenii mtDNA is 60,553 bp in size and encodes 30 structural RNAs and 36 protein-coding genes, all located on the same strand. A prominent feature of the genome is the presence of an approximately 20 kbp long intergenic region comprising numerous tandem and dispersed repeat units of 22 to 336 bp. Adjacent to these repeats are 27 copies of palindromic sequences predicted to form stable DNA stem-loop structures. One such stem-loop is located near a GC-rich and GC-poor region and may have a regulatory function in replication or transcription. The H. andersenii mtDNA shares a number of features in common with the genome of the cryptophyte Rhodomonas salina, including general architecture, gene content, and the presence of a large repeat region. However, the H. andersenii mtDNA is devoid of inverted repeats and introns, which are present in R. salina. Comparative analyses of the suite of tRNAs encoded in the two genomes reveal that the H. andersenii mtDNA has lost or converted its original trnK(uuu) gene and possesses a trnS-derived 'trnK(uuu)', which appears unable to produce a functional tRNA. Mitochondrial protein coding gene phylogenies strongly support a variety of previously established eukaryotic groups, but fail to resolve the relationships among higher-order eukaryotic lineages. Comparison of
Rethinking the evolution of eukaryotic metabolism: novel cellular partitioning of enzymes in stramenopiles links serine biosynthesis to glycolysis in mitochondria.

PubMed

Abrahamian, Melania; Kagda, Meenakshi; Ah-Fong, Audrey M V; Judelson, Howard S

2017-12-04

An important feature of eukaryotic evolution is metabolic compartmentalization, in which certain pathways are restricted to the cytosol or specific organelles. Glycolysis in eukaryotes is described as a cytosolic process. The universality of this canon has been challenged by recent genome data that suggest that some glycolytic enzymes made by stramenopiles bear mitochondrial targeting peptides. Mining of oomycete, diatom, and brown algal genomes indicates that stramenopiles encode two forms of enzymes for the second half of glycolysis, one with and the other without mitochondrial targeting peptides. The predicted mitochondrial targeting was confirmed by using fluorescent tags to localize phosphoglycerate kinase, phosphoglycerate mutase, and pyruvate kinase in Phytophthora infestans, the oomycete that causes potato blight. A genome-wide search for other enzymes with atypical mitochondrial locations identified phosphoglycerate dehydrogenase, phosphoserine aminotransferase, and phosphoserine phosphatase, which form a pathway for generating serine from the glycolytic intermediate 3-phosphoglycerate. Fluorescent tags confirmed the delivery of these serine biosynthetic enzymes to P. infestans mitochondria. A cytosolic form of this serine biosynthetic pathway, which occurs in most eukaryotes, is missing from oomycetes and most other stramenopiles. The glycolysis and serine metabolism pathways of oomycetes appear to be mosaics of enzymes with different ancestries. While some of the noncanonical oomycete mitochondrial enzymes have the closest affinity in phylogenetic analyses with proteins from other stramenopiles, others cluster with bacterial, plant, or animal proteins. The genes encoding the mitochondrial phosphoglycerate kinase and serine-forming enzymes are physically linked on oomycete chromosomes, which suggests a shared origin. Stramenopile metabolism appears to have been shaped through the acquisition of genes by descent and lateral or endosymbiotic gene transfer
Protein and DNA modifications: evolutionary imprints of bacterial biochemical diversification and geochemistry on the provenance of eukaryotic epigenetics.

PubMed

Aravind, L; Burroughs, A Maxwell; Zhang, Dapeng; Iyer, Lakshminarayan M

2014-07-01

Epigenetic information, which plays a major role in eukaryotic biology, is transmitted by covalent modifications of nuclear proteins (e.g., histones) and DNA, along with poorly understood processes involving cytoplasmic/secreted proteins and RNAs. The origin of eukaryotes was accompanied by emergence of a highly developed biochemical apparatus for encoding, resetting, and reading covalent epigenetic marks in proteins such as histones and tubulins. The provenance of this apparatus remained unclear until recently. Developments in comparative genomics show that key components of eukaryotic epigenetics emerged as part of the extensive biochemical innovation of secondary metabolism and intergenomic/interorganismal conflict systems in prokaryotes, particularly bacteria. These supplied not only enzymatic components for encoding and removing epigenetic modifications, but also readers of some of these marks. Diversification of these prokaryotic systems and subsequently eukaryotic epigenetics appear to have been considerably influenced by the great oxygenation event in the Earth's history. Copyright © 2014 Cold Spring Harbor Laboratory Press; all rights reserved.
Use of mariner transposases for one-step delivery and integration of DNA in prokaryotes and eukaryotes by transfection

PubMed Central

Michlewski, Gracjan; Finnegan, David J.; Elfick, Alistair; Rosser, Susan J.

2017-01-01

Delivery of DNA to cells and its subsequent integration into the host genome is a fundamental task in molecular biology, biotechnology and gene therapy. Here we describe an IP-free one-step method that enables stable genome integration into either prokaryotic or eukaryotic cells. A synthetic mariner transposon is generated by flanking a DNA sequence with short inverted repeats. When purified recombinant Mos1 or Mboumar-9 transposase is co-transfected with transposon-containing plasmid DNA, it penetrates prokaryotic or eukaryotic cells and integrates the target DNA into the genome. In vivo integrations by purified transposase can be achieved by electroporation, chemical transfection or lipofection of the transposase:DNA mixture, in contrast to other published transposon-based protocols which require electroporation or microinjection. As in other transposome systems, no helper plasmids are required since transposases are not expressed inside the host cells, thus leading to generation of stable cell lines. Since it does not require electroporation or microinjection, this tool has the potential to be applied for automated high-throughput creation of libraries of random integrants for purposes including gene knock-out libraries, screening for optimal integration positions or safe genome locations in different organisms, selection of the highest production of valuable compounds for biotechnology, and sequencing. PMID:28204586
Macronuclear Genome Sequence of the Ciliate Tetrahymena thermophila, a Model Eukaryote

PubMed Central

Eisen, Jonathan A; Coyne, Robert S; Wu, Martin; Wu, Dongying; Thiagarajan, Mathangi; Wortman, Jennifer R; Badger, Jonathan H; Ren, Qinghu; Amedeo, Paolo; Jones, Kristie M; Tallon, Luke J; Delcher, Arthur L; Salzberg, Steven L; Silva, Joana C; Haas, Brian J; Majoros, William H; Farzad, Maryam; Carlton, Jane M; Smith, Roger K; Garg, Jyoti; Pearlman, Ronald E; Karrer, Kathleen M; Sun, Lei; Manning, Gerard; Elde, Nels C; Turkewitz, Aaron P; Asai, David J; Wilkes, David E; Wang, Yufeng; Cai, Hong; Collins, Kathleen; Stewart, B. Andrew; Lee, Suzanne R; Wilamowska, Katarzyna; Weinberg, Zasha; Ruzzo, Walter L; Wloga, Dorota; Gaertig, Jacek; Frankel, Joseph; Tsao, Che-Chia; Gorovsky, Martin A; Keeling, Patrick J; Waller, Ross F; Patron, Nicola J; Cherry, J. Michael; Stover, Nicholas A; Krieger, Cynthia J; del Toro, Christina; Ryder, Hilary F; Williamson, Sondra C; Barbeau, Rebecca A; Hamilton, Eileen P; Orias, Eduardo

2006-01-01

The ciliate Tetrahymena thermophila is a model organism for molecular and cellular biology. Like other ciliates, this species has separate germline and soma functions that are embodied by distinct nuclei within a single cell. The germline-like micronucleus (MIC) has its genome held in reserve for sexual reproduction. The soma-like macronucleus (MAC), which possesses a genome processed from that of the MIC, is the center of gene expression and does not directly contribute DNA to sexual progeny. We report here the shotgun sequencing, assembly, and analysis of the MAC genome of T. thermophila, which is approximately 104 Mb in length and composed of approximately 225 chromosomes. Overall, the gene set is robust, with more than 27,000 predicted protein-coding genes, 15,000 of which have strong matches to genes in other organisms. The functional diversity encoded by these genes is substantial and reflects the complexity of processes required for a free-living, predatory, single-celled organism. This is highlighted by the abundance of lineage-specific duplications of genes with predicted roles in sensing and responding to environmental conditions (e.g., kinases), using diverse resources (e.g., proteases and transporters), and generating structural complexity (e.g., kinesins and dyneins). In contrast to the other lineages of alveolates (apicomplexans and dinoflagellates), no compelling evidence could be found for plastid-derived genes in the genome. UGA, the only T. thermophila stop codon, is used in some genes to encode selenocysteine, thus making this organism the first known with the potential to translate all 64 codons in nuclear genes into amino acids. We present genomic evidence supporting the hypothesis that the excision of DNA from the MIC to generate the MAC specifically targets foreign DNA as a form of genome self-defense. The combination of the genome sequence, the functional diversity encoded therein, and the presence of some pathways missing from other model
A universe of dwarfs and giants: genome size and chromosome evolution in the monocot family Melanthiaceae.

PubMed

Pellicer, Jaume; Kelly, Laura J; Leitch, Ilia J; Zomlefer, Wendy B; Fay, Michael F

2014-03-01

• Since the occurrence of giant genomes in angiosperms is restricted to just a few lineages, identifying where shifts towards genome obesity have occurred is essential for understanding the evolutionary mechanisms triggering this process. • Genome sizes were assessed using flow cytometry in 79 species and new chromosome numbers were obtained. Phylogenetically based statistical methods were applied to infer ancestral character reconstructions of chromosome numbers and nuclear DNA contents. • Melanthiaceae are the most diverse family in terms of genome size, with C-values ranging more than 230-fold. Our data confirmed that giant genomes are restricted to tribe Parideae, with most extant species in the family characterized by small genomes. Ancestral genome size reconstruction revealed that the most recent common ancestor (MRCA) for the family had a relatively small genome (1C = 5.37 pg). Chromosome losses and polyploidy are recovered as the main evolutionary mechanisms generating chromosome number change. • Genome evolution in Melanthiaceae has been characterized by a trend towards genome size reduction, with just one episode of dramatic DNA accumulation in Parideae. Such extreme contrasting profiles of genome size evolution illustrate the key role of transposable elements and chromosome rearrangements in driving the evolution of plant genomes. © 2013 The Authors. New Phytologist © 2013 New Phytologist Trust.
Enzyme functional evolution through improved catalysis of ancestrally nonpreferred substrates

PubMed Central

Huang, Ruiqi; Hippauf, Frank; Rohrbeck, Diana; Haustein, Maria; Wenke, Katrin; Feike, Janie; Sorrelle, Noah; Piechulla, Birgit; Barkman, Todd J.

2012-01-01

In this study, we investigated the role for ancestral functional variation that may be selected upon to generate protein functional shifts using ancestral protein resurrection, statistical tests for positive selection, forward and reverse evolutionary genetics, and enzyme functional assays. Data are presented for three instances of protein functional change in the salicylic acid/benzoic acid/theobromine (SABATH) lineage of plant secondary metabolite-producing enzymes. In each case, we demonstrate that ancestral nonpreferred activities were improved upon in a daughter enzyme after gene duplication, and that these functional shifts were likely coincident with positive selection. Both forward and reverse mutagenesis studies validate the impact of one or a few sites toward increasing activity with ancestrally nonpreferred substrates. In one case, we document the occurrence of an evolutionary reversal of an active site residue that reversed enzyme properties. Furthermore, these studies show that functionally important amino acid replacements result in substrate discrimination as reflected in evolutionary changes in the specificity constant (kcat/KM) for competing substrates, even though adaptive substitutions may affect KM and kcat separately. In total, these results indicate that nonpreferred, or even latent, ancestral protein activities may be coopted at later times to become the primary or preferred protein activities. PMID:22315396
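As a rough illustration of the specificity constant mentioned in the abstract above: under standard Michaelis-Menten assumptions, the ratio of kcat/KM values for two competing substrates gives the relative rate at which an enzyme converts them when both are present at equal concentration. The sketch below is not the authors' analysis; the function names and all numeric values are hypothetical and chosen only to show the arithmetic.

```python
def specificity_constant(kcat_per_s: float, km_molar: float) -> float:
    """Return kcat/KM in M^-1 s^-1 for one enzyme-substrate pair."""
    if km_molar <= 0:
        raise ValueError("KM must be positive")
    return kcat_per_s / km_molar


def discrimination_ratio(kcat_a: float, km_a: float,
                         kcat_b: float, km_b: float) -> float:
    """Ratio of specificity constants for substrate A over substrate B.

    A value above 1 means substrate A is preferred at equal concentrations.
    """
    return specificity_constant(kcat_a, km_a) / specificity_constant(kcat_b, km_b)


if __name__ == "__main__":
    # Hypothetical kinetic parameters, not taken from the study.
    ratio = discrimination_ratio(kcat_a=10.0, km_a=1e-4, kcat_b=1.0, km_b=1e-3)
    print(f"substrate A is preferred about {ratio:.0f}-fold over substrate B")
```

Because the ratio depends on both parameters, a substitution that changes only KM or only kcat can still shift substrate discrimination, which is consistent with the abstract's remark that adaptive substitutions may affect the two separately.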
Enzyme functional evolution through improved catalysis of ancestrally nonpreferred substrates.

PubMed

Huang, Ruiqi; Hippauf, Frank; Rohrbeck, Diana; Haustein, Maria; Wenke, Katrin; Feike, Janie; Sorrelle, Noah; Piechulla, Birgit; Barkman, Todd J

2012-02-21

In this study, we investigated the role for ancestral functional variation that may be selected upon to generate protein functional shifts using ancestral protein resurrection, statistical tests for positive selection, forward and reverse evolutionary genetics, and enzyme functional assays. Data are presented for three instances of protein functional change in the salicylic acid/benzoic acid/theobromine (SABATH) lineage of plant secondary metabolite-producing enzymes. In each case, we demonstrate that ancestral nonpreferred activities were improved upon in a daughter enzyme after gene duplication, and that these functional shifts were likely coincident with positive selection. Both forward and reverse mutagenesis studies validate the impact of one or a few sites toward increasing activity with ancestrally nonpreferred substrates. In one case, we document the occurrence of an evolutionary reversal of an active site residue that reversed enzyme properties. Furthermore, these studies show that functionally important amino acid replacements result in substrate discrimination as reflected in evolutionary changes in the specificity constant (k(cat)/K(M)) for competing substrates, even though adaptive substitutions may affect K(M) and k(cat) separately. In total, these results indicate that nonpreferred, or even latent, ancestral protein activities may be coopted at later times to become the primary or preferred protein activities.
Variation in recombination frequency and distribution across eukaryotes: patterns and processes

PubMed Central

Feulner, Philine G. D.; Johnston, Susan E.; Santure, Anna W.; Smadja, Carole M.

2017-01-01

Recombination, the exchange of DNA between maternal and paternal chromosomes during meiosis, is an essential feature of sexual reproduction in nearly all multicellular organisms. While the role of recombination in the evolution of sex has received theoretical and empirical attention, less is known about how recombination rate itself evolves and what influence this has on evolutionary processes within sexually reproducing organisms. Here, we explore the patterns of, and processes governing recombination in eukaryotes. We summarize patterns of variation, integrating current knowledge with an analysis of linkage map data in 353 organisms. We then discuss proximate and ultimate processes governing recombination rate variation and consider how these influence evolutionary processes. Genome-wide recombination rates (cM/Mb) can vary more than tenfold across eukaryotes, and there is large variation in the distribution of recombination events across closely related taxa, populations and individuals. We discuss how variation in rate and distribution relates to genome architecture, genetic and epigenetic mechanisms, sex, environmental perturbations and variable selective pressures. There has been great progress in determining the molecular mechanisms governing recombination, and with the continued development of new modelling and empirical approaches, there is now also great opportunity to further our understanding of how and why recombination rate varies. This article is part of the themed issue ‘Evolutionary causes and consequences of recombination rate variation in sexual organisms’. PMID:29109219
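The genome-wide recombination rate quoted above in cM/Mb is commonly obtained by dividing the total genetic map length by the physical genome size. A minimal sketch of that calculation follows; the function names and the example map lengths and genome sizes are hypothetical and are not drawn from the 353-organism dataset.

```python
def recombination_rate_cm_per_mb(map_length_cm: float, genome_size_mb: float) -> float:
    """Genome-wide recombination rate: genetic map length (cM) per physical size (Mb)."""
    if genome_size_mb <= 0:
        raise ValueError("genome size must be positive")
    return map_length_cm / genome_size_mb


def fold_difference(rate_a: float, rate_b: float) -> float:
    """How many times larger the higher of two positive rates is than the lower."""
    if rate_a <= 0 or rate_b <= 0:
        raise ValueError("rates must be positive")
    high, low = max(rate_a, rate_b), min(rate_a, rate_b)
    return high / low


if __name__ == "__main__":
    # Hypothetical organisms: (map length in cM, genome size in Mb).
    organisms = {"organism_a": (3000.0, 3000.0), "organism_b": (400.0, 20.0)}
    rates = {name: recombination_rate_cm_per_mb(cm, mb)
             for name, (cm, mb) in organisms.items()}
    for name, rate in rates.items():
        print(f"{name}: {rate:.2f} cM/Mb")
    print(f"fold difference: {fold_difference(*rates.values()):.1f}")
```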
The Gam protein of bacteriophage Mu is an orthologue of eukaryotic Ku

PubMed Central

di Fagagna, Fabrizio d'Adda; Weller, Geoffrey R.; Doherty, Aidan J.; Jackson, Stephen P.

2003-01-01

Mu bacteriophage inserts its DNA into the genome of host bacteria and is used as a model for DNA transposition events in other systems. The eukaryotic Ku protein has key roles in DNA repair and in certain transposition events. Here we show that the Gam protein of phage Mu is conserved in bacteria, has sequence homology with both subunits of Ku, and has the potential to adopt a similar architecture to the core DNA-binding region of Ku. Through biochemical studies, we demonstrate that Gam and the related protein of Haemophilus influenzae display DNA binding characteristics remarkably similar to those of human Ku. In addition, we show that Gam can interfere with Ty1 retrotransposition in Saccharomyces cerevisiae. These data reveal structural and functional parallels between bacteriophage Gam and eukaryotic Ku and suggest that their functions have been evolutionarily conserved. PMID:12524520
Elucidating the composition and conservation of the autophagy pathway in photosynthetic eukaryotes

PubMed Central

Shemi, Adva; Ben-Dor, Shifra; Vardi, Assaf

2015-01-01

Aquatic photosynthetic eukaryotes represent highly diverse groups (green, red, and chromalveolate algae) derived from multiple endosymbiosis events, covering a wide spectrum of the tree of life. They are responsible for about 50% of the global photosynthesis and serve as the foundation for oceanic and fresh water food webs. Although the ecophysiology and molecular ecology of some algal species are extensively studied, some basic aspects of algal cell biology are still underexplored. The recent wealth of genomic resources from algae has opened new frontiers to decipher the role of cell signaling pathways and their function in an ecological and biotechnological context. Here, we took a bioinformatic approach to explore the distribution and conservation of TOR and autophagy-related (ATG) proteins (Atg in yeast) in diverse algal groups. Our genomic analysis demonstrates conservation of TOR and ATG proteins in green algae. In contrast, in all 5 available red algal genomes, we could not detect sequences that encode any of the 17 core ATG proteins examined, although TOR and its interacting proteins are conserved. These intriguing data suggest that the autophagy pathway is not conserved in red algae as it is in the rest of the eukaryote domain. In contrast, chromalveolates, despite being derived from the red-plastid lineage, retain and express ATG genes, which raises a fundamental question regarding the acquisition of ATG genes during algal evolution. Among chromalveolates, Emiliania huxleyi (Haptophyta), a bloom-forming coccolithophore, possesses the most complete set of ATG genes, and may serve as a model organism to study autophagy in marine protists with great ecological significance. PMID:25915714
PGDD: a database of gene and genome duplication in plants

PubMed Central

Lee, Tae-Ho; Tang, Haibao; Wang, Xiyin; Paterson, Andrew H.

2013-01-01

Genome duplication (GD) has permanently shaped the architecture and function of many higher eukaryotic genomes. The angiosperms (flowering plants) are outstanding models in which to elucidate consequences of GD for higher eukaryotes, owing to their propensity for chromosomal duplication or even triplication in a few cases. Duplicated genome structures often require both intra- and inter-genome alignments to unravel their evolutionary history, also providing the means to deduce both obvious and otherwise-cryptic orthology, paralogy and other relationships among genes. The burgeoning sets of angiosperm genome sequences provide the foundation for a host of investigations into the functional and evolutionary consequences of gene and GD. To provide genome alignments from a single resource based on uniform standards that have been validated by empirical studies, we built the Plant Genome Duplication Database (PGDD; freely available at http://chibba.agtec.uga.edu/duplication/), a web service providing synteny information in terms of colinearity between chromosomes. At present, PGDD contains data for 26 plants including bryophytes and chlorophyta, as well as angiosperms with draft genome sequences. In addition to the inclusion of new genomes as they become available, we are preparing new functions to enhance PGDD. PMID:23180799
Prokaryotic and eukaryotic DNA helicases. Essential molecular motor proteins for cellular machinery.

PubMed

Tuteja, Narendra; Tuteja, Renu

2004-05-01

DNA helicases are ubiquitous molecular motor proteins which harness the chemical free energy of ATP hydrolysis to catalyze the unwinding of energetically stable duplex DNA, and thus play important roles in nearly all aspects of nucleic acid metabolism, including replication, repair, recombination, and transcription. They break the hydrogen bonds between the duplex helix and move unidirectionally along the bound strand. All helicases are also translocases and DNA-dependent ATPases. Most contain conserved helicase motifs that act as an engine to power DNA unwinding. All DNA helicases share some common properties, including nucleic acid binding, NTP binding and hydrolysis, and unwinding of duplex DNA in the 3' to 5' or 5' to 3' direction. The minichromosome maintenance (Mcm) protein complex (Mcm4/6/7) provides a DNA-unwinding function at the origin of replication in all eukaryotes and may act as a licensing factor for DNA replication. The RecQ family of helicases is highly conserved from bacteria to humans and is required for the maintenance of genome integrity. They have also been implicated in a variety of human genetic disorders. Since the discovery of the first DNA helicase in Escherichia coli in 1976, and the first eukaryotic one in the lily in 1978, a large number of these enzymes have been isolated from both prokaryotic and eukaryotic systems, and the number is still growing. In this review we cover the historical background of DNA helicases, helicase assays, biochemical properties, prokaryotic and eukaryotic DNA helicases including Mcm proteins and the RecQ family of helicases. The properties of most of the known DNA helicases from prokaryotic and eukaryotic systems, including viruses and bacteriophages, are summarized in tables.
Molecular evolution of the betagamma lens crystallin superfamily: evidence for a retained ancestral function in gamma N crystallins?

PubMed

Weadick, Cameron J; Chang, Belinda S W

2009-05-01

Within the vertebrate eye, betagamma crystallins are extremely stable lens proteins that are uniquely adapted to increase refractory power while maintaining transparency. Unlike alpha crystallins, which are well-characterized, multifunctional proteins that have important functions both in and out of the lens, betagamma lens crystallins are a diverse group of proteins with no clear ancestral or contemporary nonlens role. We carried out phylogenetic and molecular evolutionary analyses of the betagamma-crystallin superfamily in order to study the evolutionary history of the gamma N crystallins, a recently discovered, biochemically atypical family suggested to possess a divergent or ancestral function. By including nonlens, betagamma-motif-containing sequences in our analysis as outgroups, we confirmed the phylogenetic position of the gamma N family as sister to other gamma crystallins. Using maximum likelihood codon models to estimate lineage-specific nonsynonymous-to-synonymous rate ratios revealed strong positive selection in all of the early lineages within the betagamma family, with the striking exception of the lineage leading to the gamma N crystallins which was characterized by strong purifying selection. Branch-site analysis, used to identify candidate sites involved in functional divergence between gamma N crystallins and its sister clade containing all other gamma crystallins, identified several positively selected changes at sites of known functional importance in the betagamma crystallin protein structure. Further analyses of a fish-specific gamma N crystallin gene duplication revealed a more recent episode of positive selection in only one of the two descendant lineages (gamma N2). Finally, from the guppy, Poecilia reticulata, we isolated complete gamma N1 and gamma N2 coding sequence data from cDNA and partial coding sequence data from genomic DNA in order to confirm the presence of a novel gamma N2 intron, discovered through data mining of two
Isolation of active regulatory elements from eukaryotic chromatin using FAIRE (Formaldehyde Assisted Isolation of Regulatory Elements)

PubMed Central

Giresi, Paul G.; Lieb, Jason D.

2009-01-01

The binding of sequence-specific regulatory factors and the recruitment of chromatin remodeling activities cause nucleosomes to be evicted from chromatin in eukaryotic cells. Traditionally, these active sites have been identified experimentally through their sensitivity to nucleases. Here we describe the details of a simple procedure for the genome-wide isolation of nucleosome-depleted DNA from human chromatin, termed FAIRE (Formaldehyde Assisted Isolation of Regulatory Elements). We also provide protocols for different methods of detecting FAIRE-enriched DNA, including use of PCR, DNA microarrays, and next-generation sequencing. FAIRE works on all eukaryotic chromatin tested to date. To perform FAIRE, chromatin is crosslinked with formaldehyde, sheared by sonication, and phenol-chloroform extracted. Most genomic DNA is crosslinked to nucleosomes and is sequestered to the interphase, whereas DNA recovered in the aqueous phase corresponds to nucleosome-depleted regions of the genome. The isolated regions are largely coincident with the location of DNaseI hypersensitive sites, transcriptional start sites, enhancers, insulators, and active promoters. Given its speed and simplicity, FAIRE has utility in establishing chromatin profiles of diverse cell types in health and disease, isolating DNA regulatory elements en masse for further characterization, and as a screening assay for the effects of small molecules on chromatin organization. PMID:19303047
Karyotype and gene order evolution from reconstructed extinct ancestors highlight contrasts in genome plasticity of modern rosid crops.

PubMed

Murat, Florent; Zhang, Rongzhi; Guizard, Sébastien; Gavranović, Haris; Flores, Raphael; Steinbach, Delphine; Quesneville, Hadi; Tannier, Eric; Salse, Jérôme

2015-01-29

We used nine complete genome sequences, from grape, poplar, Arabidopsis, soybean, lotus, apple, strawberry, cacao, and papaya, to investigate the paleohistory of rosid crops. We characterized an ancestral rosid karyotype, structured into 7/21 protochromosomes, with a minimal set of 6,250 ordered protogenes and a minimum physical coding gene space of 50 megabases. We also proposed ancestral karyotypes for the Caricaceae, Brassicaceae, Malvaceae, Fabaceae, Rosaceae, Salicaceae, and Vitaceae families with 9, 8, 10, 6, 12, 9, 12, and 19 protochromosomes, respectively. On the basis of these ancestral karyotypes and present-day species comparisons, we proposed a two-step evolutionary scenario based on allohexaploidization involving the newly characterized A, B, and C diploid progenitors leading to dominant (stable) and sensitive (plastic) genomic compartments in any modern rosid crops. Finally, a new user-friendly online tool, "DicotSyntenyViewer" (available from http://urgi.versailles.inra.fr/synteny-dicot), has been made available for accurate translational genomics in rosids. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Genomic Data Quality Impacts Automated Detection of Lateral Gene Transfer in Fungi

PubMed Central

Dupont, Pierre-Yves; Cox, Murray P.

2017-01-01

Lateral gene transfer (LGT, also known as horizontal gene transfer), an atypical mechanism of transferring genes between species, has almost become the default explanation for genes that display an unexpected composition or phylogeny. Numerous methods of detecting LGT events all rely on two fundamental strategies: primary structure composition or gene tree/species tree comparisons. Discouragingly, the results of these different approaches rarely coincide. With the wealth of genome data now available, detection of laterally transferred genes is increasingly being attempted in large uncurated eukaryotic datasets. However, detection methods depend greatly on the quality of the underlying genomic data, which are typically complex for eukaryotes. Furthermore, given the automated nature of genomic data collection, it is typically impractical to manually verify all protein or gene models, orthology predictions, and multiple sequence alignments, requiring researchers to accept a substantial margin of error in their datasets. Using a test case comprising plant-associated genomes across the fungal kingdom, this study reveals that composition- and phylogeny-based methods have little statistical power to detect laterally transferred genes. In particular, phylogenetic methods reveal extreme levels of topological variation in fungal gene trees, the vast majority of which show departures from the canonical species tree. Therefore, it is inherently challenging to detect LGT events in typical eukaryotic genomes. This finding is in striking contrast to the large number of claims for laterally transferred genes in eukaryotic species that routinely appear in the literature, and questions how many of these proposed examples are statistically well supported. PMID:28235827
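To make the composition-based strategy described above concrete, the sketch below flags genes whose GC content deviates strongly from the genome-wide mean. It is a deliberately simplified stand-in, not the method used in the study: the z-score cutoff, the function names, and the toy sequences are all assumptions, and, as the abstract stresses, screens of this kind have little statistical power on real eukaryotic data.

```python
from statistics import mean, pstdev


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def flag_atypical_genes(genes: dict[str, str], z_cutoff: float = 2.0) -> list[str]:
    """Return names of genes whose GC content lies more than z_cutoff
    population standard deviations from the mean across all genes."""
    values = {name: gc_content(seq) for name, seq in genes.items()}
    mu = mean(values.values())
    sd = pstdev(values.values())
    if sd == 0:
        return []  # no variation, so nothing stands out
    return sorted(name for name, v in values.items() if abs(v - mu) / sd > z_cutoff)


if __name__ == "__main__":
    # Toy data: nine genes at 50% GC and one at 100% GC (hypothetical).
    toy_genes = {f"gene_{i}": "ATGC" * 25 for i in range(9)}
    toy_genes["candidate"] = "GC" * 50
    print(flag_atypical_genes(toy_genes))
```

A flagged gene is only a candidate: atypical composition can also arise from assembly errors, contamination, or lineage-specific mutation bias, which is why the abstract cautions against treating such hits as well-supported transfers.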

Animal regeneration: ancestral character or evolutionary novelty?

PubMed

Slack, Jonathan MW

2017-09-01

An old question about regeneration is whether it is an ancestral character which is a general property of living matter, or whether it represents a set of specific adaptations to the different circumstances faced by different types of animal. In this review, some recent results on regeneration are assessed to see if they can throw any new light on this question. Evidence in favour of an ancestral character comes from the role of Wnt and bone morphogenetic protein signalling in controlling the pattern of whole-body regeneration in acoels, which are a basal group of bilaterian animals. On the other hand, there is some evidence for adaptive acquisition or maintenance of the regeneration of appendages based on the occurrence of severe non-lethal predation, the existence of some novel genes in regenerating organisms, and differences at the molecular level between apparently similar forms of regeneration. It is tentatively concluded that whole-body regeneration is an ancestral character, although it has been lost from most animal lineages. Appendage regeneration is more likely to represent a derived character resulting from many specific adaptations. © 2017 The Author.
The archaebacterial origin of eukaryotes.

PubMed

Cox, Cymon J; Foster, Peter G; Hirt, Robert P; Harris, Simon R; Embley, T Martin

2008-12-23

The origin of the eukaryotic genetic apparatus is thought to be central to understanding the evolution of the eukaryotic cell. Disagreement about the source of the relevant genes has spawned competing hypotheses for the origins of the eukaryote nuclear lineage. The iconic rooted 3-domains tree of life shows eukaryotes and archaebacteria as separate groups that share a common ancestor to the exclusion of eubacteria. By contrast, the eocyte hypothesis has eukaryotes originating within the archaebacteria and sharing a common ancestor with a particular group called the Crenarchaeota or eocytes. Here, we have investigated the relative support for each hypothesis from analysis of 53 genes spanning the 3 domains, including essential components of the eukaryotic nucleic acid replication, transcription, and translation apparatus. As an important component of our analysis, we investigated the fit between model and data with respect to composition. Compositional heterogeneity is a pervasive problem for reconstruction of ancient relationships, which, if ignored, can produce an incorrect tree with strong support. To mitigate its effects, we used phylogenetic models that allow for changing nucleotide or amino acid compositions over the tree and data. Our analyses favor a topology that supports the eocyte hypothesis rather than archaebacterial monophyly and the 3-domains tree of life.
Decelerated genome evolution in modern vertebrates revealed by analysis of multiple lancelet genomes.

PubMed

Huang, Shengfeng; Chen, Zelin; Yan, Xinyu; Yu, Ting; Huang, Guangrui; Yan, Qingyu; Pontarotti, Pierre Antoine; Zhao, Hongchen; Li, Jie; Yang, Ping; Wang, Ruihua; Li, Rui; Tao, Xin; Deng, Ting; Wang, Yiquan; Li, Guang; Zhang, Qiujin; Zhou, Sisi; You, Leiming; Yuan, Shaochun; Fu, Yonggui; Wu, Fenfang; Dong, Meiling; Chen, Shangwu; Xu, Anlong

2014-12-19

Vertebrates diverged from other chordates ~500 Myr ago and experienced successful innovations and adaptations, but the genomic basis underlying vertebrate origins is not fully understood. Here we suggest, through comparison with multiple lancelet (amphioxus) genomes, that ancient vertebrates experienced high rates of protein evolution, genome rearrangement and domain shuffling and that these rates greatly slowed down after the divergence of jawed and jawless vertebrates. Compared with lancelets, modern vertebrates retain, at least relatively, less protein diversity, fewer nucleotide polymorphisms, domain combinations and conserved non-coding elements (CNE). Modern vertebrates also lost substantial transposable element (TE) diversity, whereas lancelets preserve high TE diversity that includes even the long-sought RAG transposon. Lancelets also exhibit rapid gene turnover, pervasive transcription, the fastest exon shuffling in metazoans and substantial TE methylation not observed in other invertebrates. These new lancelet genome sequences provide new insights into the chordate ancestral state and vertebrate evolution.
GE-17 ALTERATION OF THE p53 PATHWAY AND ANCESTRAL PROGENITORS ARE ASSOCIATED WITH TUMOR RECURRENCE IN GLIOBLASTOMA

PubMed Central

Kim, Hoon; Zheng, Siyuan; Amini, Seyed; Virk, Selene; Mikkelsen, Tom; Brat, Daniel; Sougnez, Carrie; Muller, Florian; Hu, Jian; Sloan, Andrew; Cohen, Mark; Van Meir, Erwin; Scarpace, Lisa; Lander, Eric; Gabriel, Stacey; Getz, Gad; Meyerson, Matthew; Chin, Lynda; Barnholtz-Sloan, Jill; Verhaak, Roel

2014-01-01

To evaluate evolutionary patterns of GBM recurrence, we analyzed whole genome sequencing (WGS) and multi-sector exome sequencing data from pairs of primary and posttreatment GBM. WGS on ten primary-recurrent pairs detected a median number of 12,214 mutations which we utilized to uncover clonal structures, by analyzing the distribution of mutation cellular frequencies (the fraction of tumor cells harboring a mutation). On average, 41 % of the mutations were shared by primary and recurrence. The majority of shared mutations were clonal in both primary and recurrence, but we also observed many clonal mutations that were uniquely detected in either the primary or the recurrence. This raises the intriguing possibility that major tumor clones in the primary tumor and disease relapse both evolved from a shared ancestral tumor cell population. At least one subclone was identified in the majority of WGS samples, and we observed groups of mutations that were at low cancer cell fractions in both primary and recurrence, suggesting that both subclones evolved from the same ancestral tumor cells separate from the major clone ancestral cells. To address the possibility that the lack of overlap between subsequent tumors was due to intratumoral heterogeneity, we analyzed exome sequencing from a second tumor sector of seven primary and six recurrent tumors. We found that the majority of "second biopsy" mutations were not conserved between time points, suggesting that intratumoral heterogeneity did not explain the large number of mutations uniquely detected in primary and recurrence. The limited overlap of mutations in primary and recurrence provides evidence for ancestral tumor cell populations that could not be eradicated by therapy, while offspring cell populations contained unique mutations, were selectively killed by treatment and could therefore no longer be detected after disease relapse. This study has provided new insights into patterns and dynamics of tumor evolution.
New Universal Rules of Eukaryotic Translation Initiation Fidelity

PubMed Central

Zur, Hadas; Tuller, Tamir

2013-01-01

The accepted model of eukaryotic translation initiation begins with the scanning of the transcript by the pre-initiation complex from the 5′ end until an ATG codon with a specific nucleotide (nt) context surrounding it is recognized (Kozak rule). According to this model, ATG codons upstream of the beginning of the ORF should affect translation. We perform, for the first time, a genome-wide statistical analysis, uncovering a new, more comprehensive and quantitative set of initiation rules for improving the cost of translation and its efficiency. Analyzing dozens of eukaryotic genomes, we find that in all frames there is a universal trend of selection for low numbers of ATG codons; specifically, 16–27 codons upstream, but also 5–11 codons downstream of the START ATG, include fewer ATG codons than expected. We further suggest that there is selection for anti-optimal ATG contexts in the vicinity of the START ATG. Thus, the efficiency and fidelity of translation initiation are encoded in the 5′UTR as required by the scanning model, but also at the beginning of the ORF. The observed nt patterns suggest that in all the analyzed organisms the pre-initiation complex often misses the START ATG of the ORF, and may start translation from an alternative initiation start-site. Thus, to prevent the translation of undesired proteins, there is selection for nucleotide sequences with low affinity to the pre-initiation complex near the beginning of the ORF. With the new suggested rules we were able to obtain a correlation with ribosomal density and protein levels twice as high as with the Kozak rule alone (e.g. for protein levels r = 0.7 vs. r = 0.31; p < 10^−12). PMID:23874179
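Counting ATG codons in a window around the START ATG, per reading frame, can be sketched as follows. This is only an illustration under stated assumptions, not the authors' pipeline: the function name `count_atg`, its parameters, and the toy sequence are hypothetical, and codon offsets are measured relative to the START codon (negative is upstream, positive is downstream).

```python
def count_atg(seq, start, first_codon, last_codon, frame=0):
    """Count ATG codons at codon offsets first_codon..last_codon from START.

    seq   -- transcript sequence (DNA alphabet, upper case)
    start -- 0-based index of the 'A' of the START ATG
    frame -- nucleotide shift (0, 1 or 2) relative to the START frame
    Offsets falling outside the sequence are skipped.
    """
    count = 0
    for k in range(first_codon, last_codon + 1):
        pos = start + 3 * k + frame
        if pos >= 0 and pos + 3 <= len(seq) and seq[pos:pos + 3] == "ATG":
            count += 1
    return count


# Toy transcript: two upstream codons, the START ATG, three downstream codons.
seq = "ATGCCC" + "ATG" + "AAAATGTTT"
start = 6

upstream_in_frame = count_atg(seq, start, -2, -1)            # codons ATG, CCC
downstream_in_frame = count_atg(seq, start, 1, 3)            # codons AAA, ATG, TTT
upstream_shifted = count_atg(seq, start, -2, -1, frame=1)    # codons TGC, CCA
print(upstream_in_frame, downstream_in_frame, upstream_shifted)
```

To probe the windows named in the abstract, one could call the function with offsets such as -27..-16 and 5..11 across many transcripts and compare the counts with those from shuffled sequences. That comparison step is not shown here.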
Developing improved durum wheat germplasm by altering the cytoplasmic genome

USDA-ARS's Scientific Manuscript database

In eukaryotic organisms, nuclear and cytoplasmic genomes interact to drive cellular functions. These genomes have co-evolved to form specific nuclear-cytoplasmic interactions that are essential to the origin, success, and evolution of diploid and polyploid species. Hundreds of genetic diseases in h...
Genome skimming: A rapid approach to gaining diverse biological insights into multicellular pathogens

USDA-ARS's Scientific Manuscript database

New genome sequence information can now be generated very quickly and cheaply for virtually any organism. The dive into genomics is increasingly tempting to scientists studying plant pathogens and other eukaryotic species without reference genomes. The ease of data collection, however, is tempered ...
HAL: a hierarchical format for storing and analyzing multiple genome alignments.

PubMed

Hickey, Glenn; Paten, Benedict; Earl, Dent; Zerbino, Daniel; Haussler, David

2013-05-15

Large multiple genome alignments and inferred ancestral genomes are ideal resources for comparative studies of molecular evolution, and advances in sequencing and computing technology are making them increasingly obtainable. These structures can provide a rich understanding of the genetic relationships between all subsets of species they contain. Current formats for storing genomic alignments, such as XMFA and MAF, are all indexed or ordered using a single reference genome, however, which limits the information that can be queried with respect to other species and clades. This loss of information grows with the number of species under comparison, as well as their phylogenetic distance. We present HAL, a compressed, graph-based hierarchical alignment format for storing multiple genome alignments and ancestral reconstructions. HAL graphs are indexed on all genomes they contain. Furthermore, they are organized phylogenetically, which allows for modular and parallel access to arbitrary subclades without fragmentation because of rearrangements that have occurred in other lineages. HAL graphs can be created or read with a comprehensive C++ API. A set of tools is also provided to perform basic operations, such as importing and exporting data, identifying mutations and coordinate mapping (liftover). All documentation and source code for the HAL API and tools are freely available at http://github.com/glennhickey/hal. Contact: hickey@soe.ucsc.edu or haussler@soe.ucsc.edu. Supplementary data are available at Bioinformatics online.
On the need for widespread horizontal gene transfers under genome size constraint.

PubMed

Isambert, Hervé; Stein, Richard R

2009-08-25

While eukaryotes primarily evolve by duplication-divergence expansion (and reduction) of their own gene repertoire with only rare horizontal gene transfers, prokaryotes appear to evolve under both gene duplications and widespread horizontal gene transfers over long evolutionary time scales. But, the evolutionary origin of this striking difference in the importance of horizontal gene transfers remains by and large a mystery. We propose that the abundance of horizontal gene transfers in free-living prokaryotes is a simple but necessary consequence of two opposite effects: i) their apparent genome size constraint compared to typical eukaryote genomes and ii) their underlying genome expansion dynamics through gene duplication-divergence evolution, as demonstrated by the presence of many tandem and block repeated genes. In principle, this combination of genome size constraint and underlying duplication expansion should lead to a coalescent-like process with extensive turnover of functional genes. This would, however, imply the unlikely, systematic reinvention of functions from discarded genes within independent phylogenetic lineages. Instead, we propose that the long-term evolutionary adaptation of free-living prokaryotes must have resulted in the emergence of efficient non-phylogenetic pathways to circumvent gene loss. This need for widespread horizontal gene transfers due to genome size constraint implies, in particular, that prokaryotes must remain under strong selection pressure in order to maintain the long-term evolutionary adaptation of their "mutualized" gene pool, beyond the inevitable turnover of individual prokaryote species. By contrast, the absence of genome size constraint for typical eukaryotes has presumably relaxed their need for widespread horizontal gene transfers and strong selection pressure. Yet, the resulting loss of genetic functions, due to weak selection pressure and inefficient gene recovery mechanisms, must have ultimately favored the
The DNA-encoded nucleosome organization of a eukaryotic genome.

PubMed

Kaplan, Noam; Moore, Irene K; Fondufe-Mittendorf, Yvonne; Gossett, Andrea J; Tillo, Desiree; Field, Yair; LeProust, Emily M; Hughes, Timothy R; Lieb, Jason D; Widom, Jonathan; Segal, Eran

2009-03-19

Nucleosome organization is critical for gene regulation. In living cells this organization is determined by multiple factors, including the action of chromatin remodellers, competition with site-specific DNA-binding proteins, and the DNA sequence preferences of the nucleosomes themselves. However, it has been difficult to estimate the relative importance of each of these mechanisms in vivo, because in vivo nucleosome maps reflect the combined action of all influencing factors. Here we determine the importance of nucleosome DNA sequence preferences experimentally by measuring the genome-wide occupancy of nucleosomes assembled on purified yeast genomic DNA. The resulting map, in which nucleosome occupancy is governed only by the intrinsic sequence preferences of nucleosomes, is similar to in vivo nucleosome maps generated in three different growth conditions. In vitro, nucleosome depletion is evident at many transcription factor binding sites and around gene start and end sites, indicating that nucleosome depletion at these sites in vivo is partly encoded in the genome. We confirm these results with a micrococcal nuclease-independent experiment that measures the relative affinity of nucleosomes for approximately 40,000 double-stranded 150-base-pair oligonucleotides. Using our in vitro data, we devise a computational model of nucleosome sequence preferences that is significantly correlated with in vivo nucleosome occupancy in Caenorhabditis elegans. Our results indicate that the intrinsic DNA sequence preferences of nucleosomes have a central role in determining the organization of nucleosomes in vivo.
RNA Export through the NPC in Eukaryotes

PubMed Central

Okamura, Masumi; Inose, Haruko; Masuda, Seiji

2015-01-01

In eukaryotic cells, RNAs are transcribed in the nucleus and exported to the cytoplasm through the nuclear pore complex. The RNA molecules that are exported from the nucleus into the cytoplasm include messenger RNAs (mRNAs), ribosomal RNAs (rRNAs), transfer RNAs (tRNAs), small nuclear RNAs (snRNAs), micro RNAs (miRNAs), and viral mRNAs. Each RNA is transported by a specific nuclear export receptor. It is believed that most of the mRNAs are exported by Nxf1 (Mex67 in yeast), whereas rRNAs, snRNAs, and a certain subset of mRNAs are exported in a Crm1/Xpo1-dependent manner. tRNAs and miRNAs are exported by Xpot and Xpo5. However, multiple export receptors are involved in the export of some RNAs, such as the 60S ribosomal subunit. In addition to these export receptors, some adapter proteins are required to export RNAs. The RNA export system of eukaryotic cells is also used by several types of RNA virus that depend on the machineries of the host cell in the nucleus for replication of their genome; this review therefore describes the RNA export system of two representative viruses. We also discuss the NPC anchoring-dependent mRNA export factors that directly recruit specific genes to the NPC. PMID:25802992
Viruses and viruslike particles of eukaryotic algae.

PubMed Central

Van Etten, J L; Lane, L C; Meints, R H

1991-01-01

Until recently there was little interest in, or information on, viruses and viruslike particles of eukaryotic algae. However, this situation is changing. In the past decade many large double-stranded DNA-containing viruses that infect two culturable, unicellular, eukaryotic green algae have been discovered. These viruses can be produced in large quantities, assayed by plaque formation, and analyzed by standard bacteriophage techniques. The viruses are structurally similar to animal iridoviruses, their genomes are similar to, but larger (greater than 300 kbp) than, those of poxviruses, and their infection process resembles that of bacteriophages. Some of the viruses have DNAs with low levels of methylated bases, whereas others have DNAs with high concentrations of 5-methylcytosine and N6-methyladenine. Virus-encoded DNA methyltransferases are associated with the methylation and are accompanied by virus-encoded DNA site-specific (restriction) endonucleases. Some of these enzymes have sequence specificities identical to those of known bacterial enzymes, and others have previously unrecognized specificities. A separate rod-shaped RNA-containing algal virus has structural and nucleotide sequence affinities to higher plant viruses. Quite recently, viruses have been associated with rapid changes in marine algal populations. In the next decade we envision the discovery of new algal viruses, clarification of their role in various ecosystems, discovery of commercially useful genes in these viruses, and exploitation of algal virus genetic elements in plant and algal biotechnology. PMID:1779928
Ancient human genomes suggest three ancestral populations for present-day Europeans

PubMed Central

Lazaridis, Iosif; Patterson, Nick; Mittnik, Alissa; Renaud, Gabriel; Mallick, Swapan; Kirsanow, Karola; Sudmant, Peter H.; Schraiber, Joshua G.; Castellano, Sergi; Lipson, Mark; Berger, Bonnie; Economou, Christos; Bollongino, Ruth; Fu, Qiaomei; Bos, Kirsten I.; Nordenfelt, Susanne; Li, Heng; de Filippo, Cesare; Prüfer, Kay; Sawyer, Susanna; Posth, Cosimo; Haak, Wolfgang; Hallgren, Fredrik; Fornander, Elin; Rohland, Nadin; Delsate, Dominique; Francken, Michael; Guinet, Jean-Michel; Wahl, Joachim; Ayodo, George; Babiker, Hamza A.; Bailliet, Graciela; Balanovska, Elena; Balanovsky, Oleg; Barrantes, Ramiro; Bedoya, Gabriel; Ben-Ami, Haim; Bene, Judit; Berrada, Fouad; Bravi, Claudio M.; Brisighelli, Francesca; Busby, George B. J.; Cali, Francesco; Churnosov, Mikhail; Cole, David E. C.; Corach, Daniel; Damba, Larissa; van Driem, George; Dryomov, Stanislav; Dugoujon, Jean-Michel; Fedorova, Sardana A.; Romero, Irene Gallego; Gubina, Marina; Hammer, Michael; Henn, Brenna M.; Hervig, Tor; Hodoglugil, Ugur; Jha, Aashish R.; Karachanak-Yankova, Sena; Khusainova, Rita; Khusnutdinova, Elza; Kittles, Rick; Kivisild, Toomas; Klitz, William; Kučinskas, Vaidutis; Kushniarevich, Alena; Laredj, Leila; Litvinov, Sergey; Loukidis, Theologos; Mahley, Robert W.; Melegh, Béla; Metspalu, Ene; Molina, Julio; Mountain, Joanna; Näkkäläjärvi, Klemetti; Nesheva, Desislava; Nyambo, Thomas; Osipova, Ludmila; Parik, Jüri; Platonov, Fedor; Posukh, Olga; Romano, Valentino; Rothhammer, Francisco; Rudan, Igor; Ruizbakiev, Ruslan; Sahakyan, Hovhannes; Sajantila, Antti; Salas, Antonio; Starikovskaya, Elena B.; Tarekegn, Ayele; Toncheva, Draga; Turdikulova, Shahlo; Uktveryte, Ingrida; Utevska, Olga; Vasquez, René; Villena, Mercedes; Voevoda, Mikhail; Winkler, Cheryl; Yepiskoposyan, Levon; Zalloua, Pierre; Zemunik, Tatijana; Cooper, Alan; Capelli, Cristian; Thomas, Mark G.; Ruiz-Linares, Andres; Tishkoff, Sarah A.; Singh, Lalji; Thangaraj, Kumarasamy; Villems, Richard; Comas, David; Sukernik, Rem; 
Metspalu, Mait; Meyer, Matthias; Eichler, Evan E.; Burger, Joachim; Slatkin, Montgomery; Pääbo, Svante; Kelso, Janet; Reich, David; Krause, Johannes

2014-01-01

We sequenced the genomes of a ~7,000 year old farmer from Germany and eight ~8,000 year old hunter-gatherers from Luxembourg and Sweden. We analyzed these and other ancient genomes [1–4] with 2,345 contemporary humans to show that most present Europeans derive from at least three highly differentiated populations: West European Hunter-Gatherers (WHG), who contributed ancestry to all Europeans but not to Near Easterners; Ancient North Eurasians (ANE) related to Upper Paleolithic Siberians [3], who contributed to both Europeans and Near Easterners; and Early European Farmers (EEF), who were mainly of Near Eastern origin but also harbored WHG-related ancestry. We model these populations’ deep relationships and show that EEF had ~44% ancestry from a “Basal Eurasian” population that split prior to the diversification of other non-African lineages. PMID:25230663
Ancient human genomes suggest three ancestral populations for present-day Europeans.

PubMed

Lazaridis, Iosif; Patterson, Nick; Mittnik, Alissa; Renaud, Gabriel; Mallick, Swapan; Kirsanow, Karola; Sudmant, Peter H; Schraiber, Joshua G; Castellano, Sergi; Lipson, Mark; Berger, Bonnie; Economou, Christos; Bollongino, Ruth; Fu, Qiaomei; Bos, Kirsten I; Nordenfelt, Susanne; Li, Heng; de Filippo, Cesare; Prüfer, Kay; Sawyer, Susanna; Posth, Cosimo; Haak, Wolfgang; Hallgren, Fredrik; Fornander, Elin; Rohland, Nadin; Delsate, Dominique; Francken, Michael; Guinet, Jean-Michel; Wahl, Joachim; Ayodo, George; Babiker, Hamza A; Bailliet, Graciela; Balanovska, Elena; Balanovsky, Oleg; Barrantes, Ramiro; Bedoya, Gabriel; Ben-Ami, Haim; Bene, Judit; Berrada, Fouad; Bravi, Claudio M; Brisighelli, Francesca; Busby, George B J; Cali, Francesco; Churnosov, Mikhail; Cole, David E C; Corach, Daniel; Damba, Larissa; van Driem, George; Dryomov, Stanislav; Dugoujon, Jean-Michel; Fedorova, Sardana A; Gallego Romero, Irene; Gubina, Marina; Hammer, Michael; Henn, Brenna M; Hervig, Tor; Hodoglugil, Ugur; Jha, Aashish R; Karachanak-Yankova, Sena; Khusainova, Rita; Khusnutdinova, Elza; Kittles, Rick; Kivisild, Toomas; Klitz, William; Kučinskas, Vaidutis; Kushniarevich, Alena; Laredj, Leila; Litvinov, Sergey; Loukidis, Theologos; Mahley, Robert W; Melegh, Béla; Metspalu, Ene; Molina, Julio; Mountain, Joanna; Näkkäläjärvi, Klemetti; Nesheva, Desislava; Nyambo, Thomas; Osipova, Ludmila; Parik, Jüri; Platonov, Fedor; Posukh, Olga; Romano, Valentino; Rothhammer, Francisco; Rudan, Igor; Ruizbakiev, Ruslan; Sahakyan, Hovhannes; Sajantila, Antti; Salas, Antonio; Starikovskaya, Elena B; Tarekegn, Ayele; Toncheva, Draga; Turdikulova, Shahlo; Uktveryte, Ingrida; Utevska, Olga; Vasquez, René; Villena, Mercedes; Voevoda, Mikhail; Winkler, Cheryl A; Yepiskoposyan, Levon; Zalloua, Pierre; Zemunik, Tatijana; Cooper, Alan; Capelli, Cristian; Thomas, Mark G; Ruiz-Linares, Andres; Tishkoff, Sarah A; Singh, Lalji; Thangaraj, Kumarasamy; Villems, Richard; Comas, David; Sukernik, Rem; Metspalu, Mait; Meyer, 
Matthias; Eichler, Evan E; Burger, Joachim; Slatkin, Montgomery; Pääbo, Svante; Kelso, Janet; Reich, David; Krause, Johannes

2014-09-18

We sequenced the genomes of a ∼7,000-year-old farmer from Germany and eight ∼8,000-year-old hunter-gatherers from Luxembourg and Sweden. We analysed these and other ancient genomes with 2,345 contemporary humans to show that most present-day Europeans derive from at least three highly differentiated populations: west European hunter-gatherers, who contributed ancestry to all Europeans but not to Near Easterners; ancient north Eurasians related to Upper Palaeolithic Siberians, who contributed to both Europeans and Near Easterners; and early European farmers, who were mainly of Near Eastern origin but also harboured west European hunter-gatherer related ancestry. We model these populations' deep relationships and show that early European farmers had ∼44% ancestry from a 'basal Eurasian' population that split before the diversification of other non-African lineages.
An orthology-based analysis of pathogenic protozoa impacting global health: an improved comparative genomics approach with prokaryotes and model eukaryote orthologs.

PubMed

Cuadrat, Rafael R C; da Serra Cruz, Sérgio Manuel; Tschoeke, Diogo Antônio; Silva, Edno; Tosta, Frederico; Jucá, Henrique; Jardim, Rodrigo; Campos, Maria Luiza M; Mattoso, Marta; Dávila, Alberto M R

2014-08-01

A key focus in 21st century integrative biology and drug discovery for neglected tropical and other diseases has been the use of BLAST-based computational methods for identification of orthologous groups in pathogenic organisms to discern orthologs, with a view to evaluate similarities and differences among species, and thus allow the transfer of annotation from known/curated proteins to new/non-annotated ones. We used here a profile-based sensitive methodology to identify distant homologs, coupled to the NCBI's COG (Unicellular orthologs) and KOG (Eukaryote orthologs), permitting us to perform comparative genomics analyses on five protozoan genomes. OrthoSearch was used in five protozoan proteomes showing that 3901 and 7473 orthologs can be identified by comparison with COG and KOG proteomes, respectively. The core protozoa proteome inferred was 418 Protozoa-COG orthologous groups and 704 Protozoa-KOG orthologous groups: (i) 31.58% (132/418) belong to category J (translation, ribosomal structure, and biogenesis), and 9.81% (41/418) to category O (post-translational modification, protein turnover, chaperones) using COG; (ii) 21.45% (151/704) belong to category J, and 13.92% (98/704) to category O using KOG. The phylogenomic analysis showed four well-supported clades for Eukarya, discriminating Multicellular [(i) human, fly, plant and worm] and Unicellular [(ii) yeast, (iii) fungi, and (iv) protozoa] species. These encouraging results attest to the usefulness of the profile-based methodology for comparative genomics to accelerate semi-automatic re-annotation, especially of the protozoan proteomes. This approach may also lend itself to applications in global health, for example, in the case of novel drug target discovery against pathogenic organisms previously considered difficult to research with traditional drug discovery tools.
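The category shares quoted in the abstract above follow directly from the reported counts, as a short check shows. The counts are taken from the abstract; the dictionary layout and the helper name `category_share` are illustrative assumptions.

```python
# Core protozoan orthologous groups reported in the abstract, by database.
core_groups = {
    "COG": {"total": 418, "J": 132, "O": 41},
    "KOG": {"total": 704, "J": 151, "O": 98},
}


def category_share(count, total):
    """Percentage of core groups in one functional category, to two decimals."""
    return round(100 * count / total, 2)


shares = {
    db: {cat: category_share(n, groups["total"])
         for cat, n in groups.items() if cat != "total"}
    for db, groups in core_groups.items()
}
print(shares)
```

Category J (translation, ribosomal structure, and biogenesis) is the larger of the two categories in both databases.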
An Orthology-Based Analysis of Pathogenic Protozoa Impacting Global Health: An Improved Comparative Genomics Approach with Prokaryotes and Model Eukaryote Orthologs

PubMed Central

Cuadrat, Rafael R. C.; da Serra Cruz, Sérgio Manuel; Tschoeke, Diogo Antônio; Silva, Edno; Tosta, Frederico; Jucá, Henrique; Jardim, Rodrigo; Campos, Maria Luiza M.; Mattoso, Marta

2014-01-01

A key focus in 21st century integrative biology and drug discovery for neglected tropical and other diseases has been the use of BLAST-based computational methods for identification of orthologous groups in pathogenic organisms to discern orthologs, with a view to evaluate similarities and differences among species, and thus allow the transfer of annotation from known/curated proteins to new/non-annotated ones. We used here a profile-based sensitive methodology to identify distant homologs, coupled to the NCBI's COG (Unicellular orthologs) and KOG (Eukaryote orthologs), permitting us to perform comparative genomics analyses on five protozoan genomes. OrthoSearch was used in five protozoan proteomes showing that 3901 and 7473 orthologs can be identified by comparison with COG and KOG proteomes, respectively. The core protozoa proteome inferred was 418 Protozoa-COG orthologous groups and 704 Protozoa-KOG orthologous groups: (i) 31.58% (132/418) belong to category J (translation, ribosomal structure, and biogenesis), and 9.81% (41/418) to category O (post-translational modification, protein turnover, chaperones) using COG; (ii) 21.45% (151/704) belong to category J, and 13.92% (98/704) to category O using KOG. The phylogenomic analysis showed four well-supported clades for Eukarya, discriminating Multicellular [(i) human, fly, plant and worm] and Unicellular [(ii) yeast, (iii) fungi, and (iv) protozoa] species. These encouraging results attest to the usefulness of the profile-based methodology for comparative genomics to accelerate semi-automatic re-annotation, especially of the protozoan proteomes. This approach may also lend itself to applications in global health, for example, in the case of novel drug target discovery against pathogenic organisms previously considered difficult to research with traditional drug discovery tools. PMID:24960463
Evolution of patchily distributed proteins shared between eukaryotes and prokaryotes: Dictyostelium as a case study.

PubMed

Andersson, Jan O

2011-04-01

Protein families are often patchily distributed in the tree of life; they are present in distantly related organisms, but absent in more closely related lineages. This could either be the result of lateral gene transfer between ancestors of organisms that encode them, or losses in the lineages that lack them. Here a novel approach is developed to study the evolution of patchily distributed proteins shared between prokaryotes and eukaryotes. Proteins encoded in the genome of cellular slime mold Dictyostelium discoideum and a restricted number of other lineages, including at least one prokaryote, were identified. Analyses of the phylogenetic distribution of 49 such patchily distributed protein families showed conflicts with organismal phylogenies; 25 are shared with the distantly related amoeboflagellate Naegleria (Excavata), whereas only two are present in the more closely related Entamoeba. Most protein families show unexpected topologies in phylogenetic analyses; eukaryotes are polyphyletic in 85% of the trees. These observations suggest that gene transfers have been an important mechanism for the distribution of patchily distributed proteins across all domains of life. Further studies of this exchangeable gene fraction are needed for a better understanding of the origin and evolution of eukaryotic genes and the diversification process of eukaryotes.
Biosynthesis of Selenocysteine on Its tRNA in Eukaryotes

PubMed Central

Mix, Heiko; Zhang, Yan; Saira, Kazima; Glass, Richard S; Berry, Marla J; Gladyshev, Vadim N; Hatfield, Dolph L

2007-01-01

Selenocysteine (Sec) is cotranslationally inserted into protein in response to UGA codons and is the 21st amino acid in the genetic code. However, the means by which Sec is synthesized in eukaryotes is not known. Herein, comparative genomics and experimental analyses revealed that the mammalian Sec synthase (SecS) is the previously identified pyridoxal phosphate-containing protein known as the soluble liver antigen. SecS required selenophosphate and O-phosphoseryl-tRNA[Ser]Sec as substrates to generate selenocysteyl-tRNA[Ser]Sec. Moreover, it was found that Sec was synthesized on the tRNA scaffold from selenide, ATP, and serine using tRNA[Ser]Sec, seryl-tRNA synthetase, O-phosphoseryl-tRNA[Ser]Sec kinase, selenophosphate synthetase, and SecS. By identifying the pathway of Sec biosynthesis in mammals, this study not only functionally characterized SecS but also assigned the function of the O-phosphoseryl-tRNA[Ser]Sec kinase. In addition, we found that selenophosphate synthetase 2 could synthesize monoselenophosphate in vitro but selenophosphate synthetase 1 could not. Conservation of the overall pathway of Sec biosynthesis suggests that this pathway is also active in other eukaryotes and archaea that synthesize selenoproteins. PMID:17194211
Microeconomic principles explain an optimal genome size in bacteria.

PubMed

Ranea, Juan A G; Grant, Alastair; Thornton, Janet M; Orengo, Christine A

2005-01-01

Bacteria can clearly enhance their survival by expanding their genetic repertoire. However, the tight packing of the bacterial genome and the fact that the most evolved species do not necessarily have the biggest genomes suggest there are other evolutionary factors limiting their genome expansion. To clarify these restrictions on size, we studied those protein families contributing most significantly to bacterial-genome complexity. We found that all bacteria apply the same basic and ancestral 'molecular technology' to optimize their reproductive efficiency. The same microeconomics principles that define the optimum size in a factory can also explain the existence of a statistical optimum in bacterial genome size. This optimum is reached when the bacterial genome obtains the maximum metabolic complexity (revenue) for minimal regulatory genes (logistic cost).
Gene Space Dynamics during the Evolution of Aegilops tauschii, Brachypodium distachyon, Oryza sativa, and Sorghum bicolor Genomes

USDA-ARS's Scientific Manuscript database

Nine different regions totaling 9.7 Mb of the 4.02 Gb Aegilops tauschii genome were sequenced using the Sanger sequencing technology and compared with orthologous Brachypodium distachyon, Oryza sativa (rice) and Sorghum bicolor (sorghum) genomic sequences. The ancestral gene content in these regio...

Eukaryotic DING Proteins Are Endogenous: An Immunohistological Study in Mouse Tissues

PubMed Central

Collombet, Jean-Marc; Elias, Mikael; Gotthard, Guillaume; Four, Elise; Renault, Frédérique; Joffre, Aurélie; Baubichon, Dominique; Rochu, Daniel; Chabrière, Eric

2010-01-01

Background: DING proteins encompass an intriguing protein family first characterized by their conserved N-terminal sequences. Some of these proteins seem to have key roles in various human diseases, e.g., rheumatoid arthritis, atherosclerosis, HIV suppression. Although this protein family seems to be ubiquitous in eukaryotes, their genes are consistently lacking from genomic databases. This lack has considerably hampered functional studies and has therefore fostered the hypothesis that DING proteins isolated from eukaryotes were in fact prokaryotic contaminants. Principal Findings: In the framework of our study, we have performed a comprehensive immunological detection of DING proteins in mice. We demonstrate that DING proteins are present in all tissues tested as isoforms of various molecular weights (MWs). Their intracellular localization is tissue-dependent, being exclusively nuclear in neurons, but cytoplasmic and nuclear in other tissues. We also provide evidence that germ-free mouse plasma contains as much DING protein as wild-type. Significance: Hence, the data herein provide a valuable basis for future investigations aimed at eukaryotic DING proteins, revealing that these proteins seem ubiquitous in mouse tissue. Our results strongly suggest that mouse DING proteins are endogenous. Moreover, the determination in this study of the precise cellular localization of DING proteins constitutes valuable evidence for understanding their molecular involvement in their related human diseases. PMID:20161715
Characterization of Reconstructed Ancestral Proteins Suggests a Change in Temperature of the Ancient Biosphere.

PubMed

Akanuma, Satoshi

2017-08-06

Understanding the evolution of ancestral life, and especially the ability of some organisms to flourish in the variable environments experienced in Earth's early biosphere, requires knowledge of the characteristics and the environment of these ancestral organisms. Information about early life and environmental conditions has been obtained from fossil records and geological surveys. Recent advances in phylogenetic analysis, and an increasing number of protein sequences available in public databases, have made it possible to infer ancestral protein sequences possessed by ancient organisms. However, the in silico studies that assess the ancestral base content of ribosomal RNAs, the frequency of each amino acid in ancestral proteins, and estimate the environmental temperatures of ancient organisms, show conflicting results. The characterization of ancestral proteins reconstructed in vitro suggests that ancient organisms had very thermally stable proteins, and therefore were thermophilic or hyperthermophilic. Experimental data supports the idea that only thermophilic ancestors survived the catastrophic increase in temperature of the biosphere that was likely associated with meteorite impacts during the early history of Earth. In addition, by expanding the timescale and including more ancestral proteins for reconstruction, it appears as though the Earth's surface temperature gradually decreased over time, from Archean to present.
Genome evolution in Reptilia, the sister group of mammals.

PubMed

Janes, Daniel E; Organ, Christopher L; Fujita, Matthew K; Shedlock, Andrew M; Edwards, Scott V

2010-01-01

The genomes of birds and nonavian reptiles (Reptilia) are critical for understanding genome evolution in mammals and amniotes generally. Despite decades of study at the chromosomal and single-gene levels, and the evidence for great diversity in genome size, karyotype, and sex chromosome diversity, reptile genomes are virtually unknown in the comparative genomics era. The recent sequencing of the chicken and zebra finch genomes, in conjunction with genome scans and the online publication of the Anolis lizard genome, has begun to clarify the events leading from an ancestral amniote genome--predicted to be large and to possess a diverse repeat landscape on par with mammals and a birdlike sex chromosome system--to the small and highly streamlined genomes of birds. Reptilia exhibit a wide range of evolutionary rates of different subgenomes and, from isochores to mitochondrial DNA, provide a critical contrast to the genomic paradigms established in mammals.
Genome Sequences of Marine Shrimp Exopalaemon carinicauda Holthuis Provide Insights into Genome Size Evolution of Caridea.

PubMed

Yuan, Jianbo; Gao, Yi; Zhang, Xiaojun; Wei, Jiankai; Liu, Chengzhang; Li, Fuhua; Xiang, Jianhai

2017-07-05

Crustacea, particularly Decapoda, contains many economically important species, such as shrimps and crabs. Crustaceans exhibit enormous (nearly 500-fold) variability in genome size. However, limited genome resources are available for investigating these species. Exopalaemon carinicauda Holthuis, an economically important caridean shrimp, is potentially an ideal experimental animal for research on crustaceans. In this study, we performed low-coverage sequencing and de novo assembly of the E. carinicauda genome. The assembly covers more than 95% of coding regions. E. carinicauda possesses a large, complex genome (5.73 Gb), about twice the size of those of many decapod shrimps. Comparative genomic analyses were therefore applied to investigate factors affecting genome size evolution of decapods. However, clues associated with genome duplication were not identified, and few horizontally transferred sequences were detected. Ultimately, the burst of transposable elements, especially retrotransposons, was determined as the major factor influencing genome expansion. A total of 2 Gb repeats were identified, and RTE-BovB, Jockey, Gypsy, and DIRS were the four major retrotransposons that significantly expanded. Retrotransposons of both recent (Jockey and Gypsy) and ancestral (DIRS) origin were responsible for the genome evolution. The E. carinicauda genome also exhibited potential for the genomic and experimental research of shrimps.
Musculature in sipunculan worms: ontogeny and ancestral states.

PubMed

Schulze, Anja; Rice, Mary E

2009-01-01

Molecular phylogenetics suggests that the Sipuncula fall into the Annelida, although they are morphologically very distinct and lack segmentation. To understand the evolutionary transformations from the annelid to the sipunculan body plan, it is important to reconstruct the ancestral states within the respective clades at all life history stages. Here we reconstruct the ancestral states for the head/introvert retractor muscles and the body wall musculature in the Sipuncula using Bayesian statistics. In addition, we describe the ontogenetic transformations of the two muscle systems in four sipunculan species with different developmental modes, using F-actin staining with fluorescent-labeled phalloidin in conjunction with confocal laser scanning microscopy. All four species, which have smooth body wall musculature and less than the full set of four introvert retractor muscles as adults, go through developmental stages with four retractor muscles that are eventually reduced to a lower number in the adult. The circular and sometimes the longitudinal body wall musculature are split into bands that later transform into a smooth sheath. Our ancestral state reconstructions suggest with nearly 100% probability that the ancestral sipunculan had four introvert retractor muscles, longitudinal body wall musculature in bands and circular body wall musculature arranged as a smooth sheath. Species with crawling larvae have more strongly developed body wall musculature than those with swimming larvae. To interpret our findings in the context of annelid evolution, a more solid phylogenetic framework is needed for the entire group and more data on ontogenetic transformations of annelid musculature are desirable.
Extensive Horizontal Transfer and Homologous Recombination Generate Highly Chimeric Mitochondrial Genomes in Yeast.

PubMed

Wu, Baojun; Buljic, Adnan; Hao, Weilong

2015-10-01

The frequency of horizontal gene transfer (HGT) in mitochondrial DNA varies substantially. In plants, HGT is relatively common, whereas in animals it appears to be quite rare. It is of considerable importance to understand mitochondrial HGT across the major groups of eukaryotes at a genome-wide level, but so far this has been well studied only in plants. In this study, we generated ten new mitochondrial genome sequences and analyzed 40 mitochondrial genomes from the Saccharomycetaceae to assess the magnitude and nature of mitochondrial HGT in yeasts. We provide evidence for extensive, homologous-recombination-mediated, mitochondrial-to-mitochondrial HGT occurring throughout yeast mitochondrial genomes, leading to genomes that are highly chimeric evolutionarily. This HGT has led to substantial intraspecific polymorphism in both sequence content and sequence divergence, which to our knowledge has not been previously documented in any mitochondrial genome. The unexpectedly high frequency of mitochondrial HGT in yeast may be driven by frequent mitochondrial fusion, relatively low mitochondrial substitution rates and pseudohyphal fusion to produce heterokaryons. These findings suggest that mitochondrial HGT may play an important role in genome evolution of a much broader spectrum of eukaryotes than previously appreciated and that there is a critical need to systematically study the frequency, extent, and importance of mitochondrial HGT across eukaryotes. © The Author 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Adaptive Memory: Ancestral Priorities and the Mnemonic Value of Survival Processing

ERIC Educational Resources Information Center

Nairne, James S.; Pandeirada, Josefa N. S.

2010-01-01

Evolutionary psychologists often propose that humans carry around "stone-age" brains, along with a toolkit of cognitive adaptations designed originally to solve hunter-gatherer problems. This perspective predicts that optimal cognitive performance might sometimes be induced by ancestrally-based problems, those present in ancestral environments,…
Decelerated genome evolution in modern vertebrates revealed by analysis of multiple lancelet genomes

PubMed Central

Huang, Shengfeng; Chen, Zelin; Yan, Xinyu; Yu, Ting; Huang, Guangrui; Yan, Qingyu; Pontarotti, Pierre Antoine; Zhao, Hongchen; Li, Jie; Yang, Ping; Wang, Ruihua; Li, Rui; Tao, Xin; Deng, Ting; Wang, Yiquan; Li, Guang; Zhang, Qiujin; Zhou, Sisi; You, Leiming; Yuan, Shaochun; Fu, Yonggui; Wu, Fenfang; Dong, Meiling; Chen, Shangwu; Xu, Anlong

2014-01-01

Vertebrates diverged from other chordates ~500 Myr ago and experienced successful innovations and adaptations, but the genomic basis underlying vertebrate origins is not fully understood. Here we suggest, through comparison with multiple lancelet (amphioxus) genomes, that ancient vertebrates experienced high rates of protein evolution, genome rearrangement and domain shuffling and that these rates greatly slowed down after the divergence of jawed and jawless vertebrates. Compared with lancelets, modern vertebrates retain, at least relatively, less protein diversity and fewer nucleotide polymorphisms, domain combinations and conserved non-coding elements (CNEs). Modern vertebrates also lost substantial transposable element (TE) diversity, whereas lancelets preserve high TE diversity that includes even the long-sought RAG transposon. Lancelets also exhibit rapid gene turnover, pervasive transcription, the fastest exon shuffling among metazoans and substantial TE methylation not observed in other invertebrates. These new lancelet genome sequences provide new insights into the chordate ancestral state and vertebrate evolution. PMID:25523484
Reconstruction of chromosome rearrangements between the two most ancestral duckweed species Spirodela polyrhiza and S. intermedia.

PubMed

Hoang, Phuong T N; Schubert, Ingo

2017-12-01

The monophyletic duckweeds comprising five genera within the monocot order Alismatales are neotenic, free-floating, aquatic organisms with fast vegetative propagation. Some species are considered for efficient biomass production, for livestock feeding, and for (simultaneous) wastewater phytoremediation. The ancestral genus Spirodela consists of only two species, Spirodela polyrhiza and Spirodela intermedia, both with a similar small genome (~160 Mbp/1C). Reference genome drafts and a physical map of 96 BACs on the 20 chromosome pairs of S. polyrhiza strain 7498 are available and provide useful tools for further evolutionary studies within and between duckweed genera. Here we applied sequential comparative multicolor fluorescence in situ hybridization (mcFISH) to address homeologous chromosomes in S. intermedia (2n = 36), to detect chromosome rearrangements between both species and to elucidate the mechanisms which may have led to the chromosome number alteration after their evolutionary separation. Ten chromosome pairs proved to be conserved between S. polyrhiza and S. intermedia; the remaining ones experienced, depending on the assumed direction of evolution, translocations, inversion, and fissions, respectively. These results represent a first step to unravel karyotype evolution among duckweeds and are anchor points for future genome assembly of S. intermedia.
Complete Sequence and Analysis of the Mitochondrial Genome of Hemiselmis andersenii CCMP644 (Cryptophyceae)

PubMed Central

Kim, Eunsoo; Lane, Christopher E; Curtis, Bruce A; Kozera, Catherine; Bowman, Sharen; Archibald, John M

2008-01-01

Background: Cryptophytes are an enigmatic group of unicellular eukaryotes with plastids derived by secondary (i.e., eukaryote-eukaryote) endosymbiosis. Cryptophytes are unusual in that they possess four genomes–a host cell-derived nuclear and mitochondrial genome and an endosymbiont-derived plastid and 'nucleomorph' genome. The evolutionary origins of the host and endosymbiont components of cryptophyte algae are at present poorly understood. Thus far, a single complete mitochondrial genome sequence has been determined for the cryptophyte Rhodomonas salina. Here, the second complete mitochondrial genome of the cryptophyte alga Hemiselmis andersenii CCMP644 is presented. Results: The H. andersenii mtDNA is 60,553 bp in size and encodes 30 structural RNAs and 36 protein-coding genes, all located on the same strand. A prominent feature of the genome is the presence of a ~20 Kbp long intergenic region comprised of numerous tandem and dispersed repeat units of between 22–336 bp. Adjacent to these repeats are 27 copies of palindromic sequences predicted to form stable DNA stem-loop structures. One such stem-loop is located near a GC-rich and GC-poor region and may have a regulatory function in replication or transcription. The H. andersenii mtDNA shares a number of features in common with the genome of the cryptophyte Rhodomonas salina, including general architecture, gene content, and the presence of a large repeat region. However, the H. andersenii mtDNA is devoid of inverted repeats and introns, which are present in R. salina. Comparative analyses of the suite of tRNAs encoded in the two genomes reveal that the H. andersenii mtDNA has lost or converted its original trnK(uuu) gene and possesses a trnS-derived 'trnK(uuu)', which appears unable to produce a functional tRNA. Mitochondrial protein coding gene phylogenies strongly support a variety of previously established eukaryotic groups, but fail to resolve the relationships among higher-order eukaryotic lineages
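The palindromic sequences mentioned in this record can be illustrated with a minimal Python sketch. It is not the authors' method: it only finds perfect reverse-complement palindromes (a window equal to its own reverse complement) and does not predict stem-loop stability or allow a loop between the two arms. The function names, length limits, and example sequence are assumptions made for illustration.

```python
# Illustrative sketch only: scan a DNA string for perfect reverse-complement
# palindromes. Names, length limits and the example sequence are hypothetical.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_palindromes(seq: str, min_len: int = 6, max_len: int = 12):
    """Return (start, window) pairs where the window equals its reverse complement.

    Only even window lengths are scanned, because the middle base of an
    odd-length window can never be its own complement.
    """
    seq = seq.upper()
    first_even = min_len + (min_len % 2)
    hits = []
    for length in range(first_even, max_len + 1, 2):
        for start in range(len(seq) - length + 1):
            window = seq[start:start + length]
            # Skip windows containing ambiguous bases such as N.
            if set(window) <= set("ACGT") and window == reverse_complement(window):
                hits.append((start, window))
    return hits


if __name__ == "__main__":
    # GAATTC (the EcoRI site) is a well-known 6-bp palindrome.
    print(find_palindromes("TTGAATTCAA", min_len=6, max_len=6))
```

A real analysis would typically use a dedicated secondary-structure or inverted-repeat tool; the exhaustive scan above is quadratic in the worst case and is meant only to make the definition concrete.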
Genome-wide analysis of tandem repeats in plants and green algae

Treesearch

Zhixin Zhao; Cheng Guo; Sreeskandarajan Sutharzan; Pei Li; Craig Echt; Jie Zhang; Chun Liang

2014-01-01

Tandem repeats (TRs) extensively exist in the genomes of prokaryotes and eukaryotes. Based on the sequenced genomes and gene annotations of 31 plant and algal species in Phytozome version 8.0 (http://www.phytozome.net/), we examined TRs in a genome-wide scale, characterized their distributions and motif features, and explored their putative biological functions. Among...
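As a rough illustration of what a tandem repeat is, the following Python sketch locates perfect tandem repeats with a regular expression. It is not the pipeline used in the study, which the truncated abstract does not describe; the function name, the unit-length range, and the minimum copy number are assumptions, and imperfect repeats are not detected.

```python
# Illustrative sketch only: find perfect tandem repeats with a back-reference.
# Thresholds and names are hypothetical, not taken from the study.
import re


def find_tandem_repeats(seq: str, min_unit: int = 1, max_unit: int = 6,
                        min_copies: int = 3):
    """Return (start, unit, copies) for each perfect tandem repeat found.

    The lazy quantifier makes the regex prefer the shortest repeating unit,
    and the back-reference \\1 requires at least min_copies - 1 further copies.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_copies - 1)
    )
    repeats = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        repeats.append((match.start(), unit, copies))
    return repeats


if __name__ == "__main__":
    # A dinucleotide repeat (AC)4 embedded in flanking sequence.
    print(find_tandem_repeats("GGACACACACTT"))
```

Because `finditer` returns non-overlapping matches, nested or overlapping repeats are reported only once.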
Did meiosis evolve before sex and the evolution of eukaryotic life cycles?

PubMed

Niklas, Karl J; Cobb, Edward D; Kutschera, Ulrich

2014-11-01

Biologists have long theorized about the evolution of life cycles, meiosis, and sexual reproduction. We revisit these topics and propose that the fundamental difference between life cycles is where and when multicellularity is expressed. We develop a scenario to explain the evolutionary transition from the life cycle of a unicellular organism to one in which multicellularity is expressed in either the haploid or diploid phase, or both. We propose further that meiosis might have evolved as a mechanism to correct for spontaneous whole-genome duplication (auto-polyploidy) and thus before the evolution of sexual reproduction sensu stricto (i.e. the formation of a diploid zygote via the fusion of haploid gametes) in the major eukaryotic clades. In addition, we propose, as others have, that sexual reproduction, which predominates in all eukaryotic clades, has many different advantages among which is that it produces variability among offspring and thus reduces sibling competition. © 2014 WILEY Periodicals, Inc.
Fluorescence in situ hybridization and optical mapping to correct scaffold arrangement in the tomato genome

USDA-ARS's Scientific Manuscript database

Modern biological analyses are often assisted by recent technologies making the sequencing of complex genomes both technically possible and feasible. We recently sequenced the tomato genome that, like many eukaryotic genomes, is large and complex. Current sequencing technologies allow the developmen...
Comparative genome and transcriptome analyses of the social amoeba Acytostelium subglobosum that accomplishes multicellular development without germ-soma differentiation.

PubMed

Urushihara, Hideko; Kuwayama, Hidekazu; Fukuhara, Kensuke; Itoh, Takehiko; Kagoshima, Hiroshi; Shin-I, Tadasu; Toyoda, Atsushi; Ohishi, Kazuyo; Taniguchi, Tateaki; Noguchi, Hideki; Kuroki, Yoko; Hata, Takashi; Uchi, Kyoko; Mohri, Kurato; King, Jason S; Insall, Robert H; Kohara, Yuji; Fujiyama, Asao

2015-02-14

Social amoebae are lower eukaryotes that inhabit the soil. They are characterized by the construction of a starvation-induced multicellular fruiting body with a spore ball and supportive stalk. In most species, the stalk is filled with motile stalk cells, as represented by the model organism Dictyostelium discoideum, whose developmental mechanisms have been well characterized. However, in the genus Acytostelium, the stalk is acellular and all aggregated cells become spores. Phylogenetic analyses have shown that it is not an ancestral genus but has lost the ability to undergo cell differentiation. We performed genome and transcriptome analyses of Acytostelium subglobosum and compared our findings to other available dictyostelid genome data. Although A. subglobosum adopts a qualitatively different developmental program from other dictyostelids, its gene repertoire was largely conserved. Yet, families of polyketide synthase and extracellular matrix proteins have not expanded and a serine protease and ABC transporter B family gene, tagA, and a few other developmental genes are missing in the A. subglobosum lineage. Temporal gene expression patterns are astonishingly dissimilar from those of D. discoideum, and only a limited fraction of the ortholog pairs shared the same expression patterns, so that some signaling cascades for development seem to be disabled in A. subglobosum. The absence of the ability to undergo cell differentiation in Acytostelium is accompanied by a small change in coding potential and extensive alterations in gene expression patterns.
The dynamics of genome replication using deep sequencing

PubMed Central

Müller, Carolin A.; Hawkins, Michelle; Retkute, Renata; Malla, Sunir; Wilson, Ray; Blythe, Martin J.; Nakato, Ryuichiro; Komata, Makiko; Shirahige, Katsuhiko; de Moura, Alessandro P.S.; Nieduszynski, Conrad A.

2014-01-01

Eukaryotic genomes are replicated from multiple DNA replication origins. We present complementary deep sequencing approaches to measure origin location and activity in Saccharomyces cerevisiae. Measuring the increase in DNA copy number during a synchronous S-phase allowed the precise determination of genome replication. To map origin locations, replication forks were stalled close to their initiation sites; therefore, copy number enrichment was limited to origins. Replication timing profiles were generated from asynchronous cultures using fluorescence-activated cell sorting. Applying this technique we show that the replication profiles of haploid and diploid cells are indistinguishable, indicating that both cell types use the same cohort of origins with the same activities. Finally, increasing sequencing depth allowed the direct measure of replication dynamics from an exponentially growing culture. This is the first time this approach, called marker frequency analysis, has been successfully applied to a eukaryote. These data provide a high-resolution resource and methodological framework for studying genome biology. PMID:24089142
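The idea behind marker frequency analysis can be sketched in a few lines of Python: regions that replicate early are over-represented in a replicating sample relative to a non-replicating one, so a normalised read-count ratio per genomic bin peaks near active origins. This is a simplified sketch under stated assumptions, not the authors' pipeline; the bin counts, function names, and the naive peak finder are hypothetical, and no smoothing or mappability correction is applied.

```python
# Illustrative sketch only: relative copy number per genomic bin from two
# read-count vectors. All data and names here are hypothetical.


def relative_copy_number(replicating, non_replicating):
    """Return per-bin ratios of depth-normalised read counts.

    Each sample is divided by its total read count so that differences in
    sequencing depth cancel out. Bins with no reads in the non-replicating
    sample are reported as None.
    """
    total_rep = sum(replicating)
    total_non = sum(non_replicating)
    ratios = []
    for rep, non in zip(replicating, non_replicating):
        if non == 0:
            ratios.append(None)
        else:
            ratios.append((rep / total_rep) / (non / total_non))
    return ratios


def local_maxima(values):
    """Return indices of bins strictly higher than both neighbours."""
    peaks = []
    for i in range(1, len(values) - 1):
        left, mid, right = values[i - 1], values[i], values[i + 1]
        if None in (left, mid, right):
            continue
        if mid > left and mid > right:
            peaks.append(i)
    return peaks


if __name__ == "__main__":
    s_phase = [100, 150, 200, 150, 100]   # hypothetical counts, replicating cells
    g1_phase = [100, 100, 100, 100, 100]  # hypothetical counts, non-replicating cells
    ratios = relative_copy_number(s_phase, g1_phase)
    print(ratios, local_maxima(ratios))
```

In this toy input the middle bin has the highest ratio and would be the candidate origin; real profiles are noisy and need smoothing before peak calling.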
Evolution of RNA- and DNA-guided antivirus defense systems in prokaryotes and eukaryotes: common ancestry vs convergence.

PubMed

Koonin, Eugene V

2017-02-10

Complementarity between nucleic acid molecules is central to biological information transfer processes. Apart from the basal processes of replication, transcription and translation, complementarity is also employed by multiple defense and regulatory systems. All cellular life forms possess defense systems against viruses and mobile genetic elements, and in most of them some of the defense mechanisms involve small guide RNAs or DNAs that recognize parasite genomes and trigger their inactivation. The nucleic acid-guided defense systems include prokaryotic Argonaute (pAgo)-centered innate immunity and CRISPR-Cas adaptive immunity as well as diverse branches of RNA interference (RNAi) in eukaryotes. The archaeal pAgo machinery is the direct ancestor of eukaryotic RNAi that, however, acquired additional components, such as Dicer, and enormously diversified through multiple duplications. In contrast, eukaryotes lack any heritage of the CRISPR-Cas systems, conceivably, due to the cellular toxicity of some Cas proteins that would get activated as a result of operon disruption in eukaryotes. The adaptive immunity function in eukaryotes is taken over partly by the PIWI RNA branch of RNAi and partly by protein-based immunity. In this review, I briefly discuss the interplay between homology and analogy in the evolution of RNA- and DNA-guided immunity, and attempt to formulate some general evolutionary principles for this ancient class of defense systems. This article was reviewed by Mikhail Gelfand and Bojan Zagrovic.
Eukaryotic gene regulation by targeted chromatin re-modeling at dispersed, middle-repetitive sequence elements.

PubMed

Hodgetts, Ross

2004-12-01

RNA interference might have evolved to minimize the deleterious impact of transposable elements and viruses on eukaryotic genomes, because mutations in genes within the RNAi pathway cause mobilization of transposons in nematodes and flies. Although the first examples of RNAi involved post-transcriptional gene silencing, recently the pathway has been shown to act at the transcriptional level. It does so by establishing a chromatin configuration on the target DNA that has many of the hallmarks of heterochromatin, thus preventing its transcription. Members of dispersed, repeated sequence families appear to have been utilized by the RNAi machinery to regulate nearby genes in yeast. The unusual genomic distribution of three repeated element families in the chicken, fruit-fly and nematode genomes prompts speculation that some of these repeats have been co-opted to control gene expression, either locally or over extended chromosomal domains.
Mitigating Mitochondrial Genome Erosion Without Recombination.

PubMed

Radzvilavicius, Arunas L; Kokko, Hanna; Christie, Joshua R

2017-11-01

Mitochondria are ATP-producing organelles of bacterial ancestry that played a key role in the origin and early evolution of complex eukaryotic cells. Most modern eukaryotes transmit mitochondrial genes uniparentally, often without recombination among genetically divergent organelles. While this asymmetric inheritance maintains the efficacy of purifying selection at the level of the cell, the absence of recombination could also make the genome susceptible to Muller's ratchet. How mitochondria escape this irreversible defect accumulation is a fundamental unsolved question. Occasional paternal leakage could in principle promote recombination, but it would also compromise the purifying selection benefits of uniparental inheritance. We assess this tradeoff using a stochastic population-genetic model. In the absence of recombination, uniparental inheritance of freely-segregating genomes mitigates mutational erosion, while paternal leakage exacerbates the ratchet effect. Mitochondrial fusion-fission cycles ensure independent genome segregation, improving purifying selection. Paternal leakage provides opportunity for recombination to slow down the mutation accumulation, but always at a cost of increased steady-state mutation load. Our findings indicate that random segregation of mitochondrial genomes under uniparental inheritance can effectively combat the mutational meltdown, and that homologous recombination under paternal leakage might not be needed. Copyright © 2017 by the Genetics Society of America.
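Muller's ratchet, the process this record is concerned with, can be demonstrated with a very small simulation. The Python sketch below is not the authors' stochastic model: it omits cells, mitochondrial segregation, paternal leakage, and recombination, and simply follows an asexual Wright-Fisher population in which each offspring may gain one deleterious mutation. The parameter values and names are assumptions chosen for illustration.

```python
# Illustrative sketch only: a bare-bones Muller's ratchet in an asexual
# population. Parameters are hypothetical and the model is far simpler than
# the one described in the abstract.
import random


def simulate_ratchet(pop_size=200, generations=300, mut_prob=0.1,
                     sel_coeff=0.02, seed=1):
    """Return the minimum mutation load in the population at each generation.

    Individuals are represented by their mutation count k and have relative
    fitness (1 - sel_coeff) ** k. Without recombination or back mutation, an
    offspring can never carry fewer mutations than its parent, so the minimum
    load can only stay the same or increase (each increase is a "click").
    """
    rng = random.Random(seed)
    loads = [0] * pop_size
    min_load_history = []
    for _ in range(generations):
        weights = [(1.0 - sel_coeff) ** k for k in loads]
        # Sample parents with probability proportional to fitness.
        parents = rng.choices(loads, weights=weights, k=pop_size)
        # Each offspring gains at most one new mutation (a simplification).
        loads = [k + (1 if rng.random() < mut_prob else 0) for k in parents]
        min_load_history.append(min(loads))
    return min_load_history


if __name__ == "__main__":
    history = simulate_ratchet()
    print("clicks of the ratchet:", history[-1])
```

Adding recombination or the within-cell segregation described in the abstract would require tracking genomes inside individuals, which this sketch deliberately leaves out.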
Development of a panel of genome-wide ancestry informative markers to study admixture throughout the Americas.

PubMed

Galanter, Joshua Mark; Fernandez-Lopez, Juan Carlos; Gignoux, Christopher R; Barnholtz-Sloan, Jill; Fernandez-Rozadilla, Ceres; Via, Marc; Hidalgo-Miranda, Alfredo; Contreras, Alejandra V; Figueroa, Laura Uribe; Raska, Paola; Jimenez-Sanchez, Gerardo; Zolezzi, Irma Silva; Torres, Maria; Ponte, Clara Ruiz; Ruiz, Yarimar; Salas, Antonio; Nguyen, Elizabeth; Eng, Celeste; Borjas, Lisbeth; Zabala, William; Barreto, Guillermo; González, Fernando Rondón; Ibarra, Adriana; Taboada, Patricia; Porras, Liliana; Moreno, Fabián; Bigham, Abigail; Gutierrez, Gerardo; Brutsaert, Tom; León-Velarde, Fabiola; Moore, Lorna G; Vargas, Enrique; Cruz, Miguel; Escobedo, Jorge; Rodriguez-Santana, José; Rodriguez-Cintrón, William; Chapela, Rocio; Ford, Jean G; Bustamante, Carlos; Seminara, Daniela; Shriver, Mark; Ziv, Elad; Burchard, Esteban Gonzalez; Haile, Robert; Parra, Esteban; Carracedo, Angel

2012-01-01

Most individuals throughout the Americas are admixed descendants of Native American, European, and African ancestors. Complex historical factors have resulted in varying proportions of ancestral contributions between individuals within and among ethnic groups. We developed a panel of 446 ancestry informative markers (AIMs) optimized to estimate ancestral proportions in individuals and populations throughout Latin America. We used genome-wide data from 953 individuals from diverse African, European, and Native American populations to select AIMs optimized for each of the three main continental populations that form the basis of modern Latin American populations. We selected markers on the basis of locus-specific branch length to be informative, well distributed throughout the genome, capable of being genotyped on widely available commercial platforms, and applicable throughout the Americas by minimizing within-continent heterogeneity. We then validated the panel in samples from four admixed populations by comparing ancestry estimates based on the AIMs panel to estimates based on genome-wide association study (GWAS) data. The panel provided balanced discriminatory power among the three ancestral populations and accurate estimates of individual ancestry proportions (R² > 0.9 for ancestral components with significant between-subject variance). Finally, we genotyped samples from 18 populations from Latin America using the AIMs panel and estimated variability in ancestry within and between these populations. This panel and its reference genotype information will be useful resources to explore population history of admixture in Latin America and to correct for the potential effects of population stratification in admixed samples in the region.
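The validation step in this record compares ancestry estimates from the AIMs panel with estimates from genome-wide data and summarises the agreement as R². A minimal Python sketch of that comparison follows, assuming R² here means the squared Pearson correlation between the two sets of estimates; the abstract does not state how it was computed, and the numbers in the example are invented.

```python
# Illustrative sketch only: squared Pearson correlation between two sets of
# per-individual ancestry estimates. Example values are hypothetical.


def r_squared(x, y):
    """Return the squared Pearson correlation between sequences x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # With no between-subject variance the correlation is undefined.
        raise ValueError("one of the inputs has zero variance")
    return (sxy * sxy) / (sxx * syy)


if __name__ == "__main__":
    aims_estimates = [0.10, 0.35, 0.50, 0.80]  # hypothetical panel-based estimates
    gwas_estimates = [0.12, 0.33, 0.55, 0.78]  # hypothetical genome-wide estimates
    print(round(r_squared(aims_estimates, gwas_estimates), 3))
```

The zero-variance check mirrors the abstract's caveat that the reported R² applies to ancestral components with significant between-subject variance.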
Development of a Panel of Genome-Wide Ancestry Informative Markers to Study Admixture Throughout the Americas

PubMed Central

Galanter, Joshua Mark; Fernandez-Lopez, Juan Carlos; Gignoux, Christopher R.; Barnholtz-Sloan, Jill; Fernandez-Rozadilla, Ceres; Via, Marc; Hidalgo-Miranda, Alfredo; Contreras, Alejandra V.; Figueroa, Laura Uribe; Raska, Paola; Jimenez-Sanchez, Gerardo; Silva Zolezzi, Irma; Torres, Maria; Ponte, Clara Ruiz; Ruiz, Yarimar; Salas, Antonio; Nguyen, Elizabeth; Eng, Celeste; Borjas, Lisbeth; Zabala, William; Barreto, Guillermo; Rondón González, Fernando; Ibarra, Adriana; Taboada, Patricia; Porras, Liliana; Moreno, Fabián; Bigham, Abigail; Gutierrez, Gerardo; Brutsaert, Tom; León-Velarde, Fabiola; Moore, Lorna G.; Vargas, Enrique; Cruz, Miguel; Escobedo, Jorge; Rodriguez-Santana, José; Rodriguez-Cintrón, William; Chapela, Rocio; Ford, Jean G.; Bustamante, Carlos; Seminara, Daniela; Shriver, Mark; Ziv, Elad; Gonzalez Burchard, Esteban; Haile, Robert

2012-01-01

Most individuals throughout the Americas are admixed descendants of Native American, European, and African ancestors. Complex historical factors have resulted in varying proportions of ancestral contributions between individuals within and among ethnic groups. We developed a panel of 446 ancestry informative markers (AIMs) optimized to estimate ancestral proportions in individuals and populations throughout Latin America. We used genome-wide data from 953 individuals from diverse African, European, and Native American populations to select AIMs optimized for each of the three main continental populations that form the basis of modern Latin American populations. We selected markers on the basis of locus-specific branch length to be informative, well distributed throughout the genome, capable of being genotyped on widely available commercial platforms, and applicable throughout the Americas by minimizing within-continent heterogeneity. We then validated the panel in samples from four admixed populations by comparing ancestry estimates based on the AIMs panel to estimates based on genome-wide association study (GWAS) data. The panel provided balanced discriminatory power among the three ancestral populations and accurate estimates of individual ancestry proportions (R² > 0.9 for ancestral components with significant between-subject variance). Finally, we genotyped samples from 18 populations from Latin America using the AIMs panel and estimated variability in ancestry within and between these populations. This panel and its reference genotype information will be useful resources to explore population history of admixture in Latin America and to correct for the potential effects of population stratification in admixed samples in the region. PMID:22412386

The ancestral flower of angiosperms and its early diversification

PubMed Central

Sauquet, Hervé; von Balthazar, Maria; Magallón, Susana; Doyle, James A.; Endress, Peter K.; Bailes, Emily J.; Barroso de Morais, Erica; Bull-Hereñu, Kester; Carrive, Laetitia; Chartier, Marion; Chomicki, Guillaume; Coiro, Mario; Cornette, Raphaël; El Ottra, Juliana H. L.; Epicoco, Cyril; Foster, Charles S. P.; Jabbour, Florian; Haevermans, Agathe; Haevermans, Thomas; Hernández, Rebeca; Little, Stefan A.; Löfstrand, Stefan; Luna, Javier A.; Massoni, Julien; Nadot, Sophie; Pamperl, Susanne; Prieu, Charlotte; Reyes, Elisabeth; dos Santos, Patrícia; Schoonderwoerd, Kristel M.; Sontag, Susanne; Soulebeau, Anaëlle; Staedler, Yannick; Tschan, Georg F.; Wing-Sze Leung, Amy; Schönenberger, Jürg

2017-01-01

Recent advances in molecular phylogenetics and a series of important palaeobotanical discoveries have revolutionized our understanding of angiosperm diversification. Yet, the origin and early evolution of their most characteristic feature, the flower, remains poorly understood. In particular, the structure of the ancestral flower of all living angiosperms is still uncertain. Here we report model-based reconstructions for ancestral flowers at the deepest nodes in the phylogeny of angiosperms, using the largest data set of floral traits ever assembled. We reconstruct the ancestral angiosperm flower as bisexual and radially symmetric, with more than two whorls of three separate perianth organs each (undifferentiated tepals), more than two whorls of three separate stamens each, and more than five spirally arranged separate carpels. Although uncertainty remains for some of the characters, our reconstruction allows us to propose a new plausible scenario for the early diversification of flowers, leading to new testable hypotheses for future research on angiosperms. PMID:28763051
Insights into structural variations and genome rearrangements in prokaryotic genomes.

PubMed

Periwal, Vinita; Scaria, Vinod

2015-01-01

Structural variations (SVs) are genomic rearrangements that affect fairly large fragments of DNA. Most SVs, such as inversions, deletions and translocations, have been studied largely in the context of genetic diseases in eukaryotes. However, recent studies demonstrate that genome rearrangements can also have a profound impact on prokaryotic genomes, leading to altered cell phenotypes. In contrast to single-nucleotide variations, SVs provide a much deeper insight into the organization of bacterial genomes at a much better resolution. SVs can confer changes in gene copy number, creation of new genes, altered gene expression and many other functional consequences. High-throughput technologies have now made it possible to explore SVs at a much finer resolution in bacterial genomes. Through this review, we aim to highlight the importance of the less explored field of SVs in prokaryotic genomes and their impact. We also discuss their potential applicability in the emerging fields of synthetic biology and genome engineering, where targeted SVs could serve to create sophisticated and accurate genome editing. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Hierarchically Aligning 10 Legume Genomes Establishes a Family-Level Genomics Platform.

PubMed

Wang, Jinpeng; Sun, Pengchuan; Li, Yuxian; Liu, Yinzhe; Yu, Jigao; Ma, Xuelian; Sun, Sangrong; Yang, Nanshan; Xia, Ruiyan; Lei, Tianyu; Liu, Xiaojian; Jiao, Beibei; Xing, Yue; Ge, Weina; Wang, Li; Wang, Zhenyi; Song, Xiaoming; Yuan, Min; Guo, Di; Zhang, Lan; Zhang, Jiaqi; Jin, Dianchuan; Chen, Wei; Pan, Yuxin; Liu, Tao; Jin, Ling; Sun, Jinshuai; Yu, Jiaxiang; Cheng, Rui; Duan, Xueqian; Shen, Shaoqi; Qin, Jun; Zhang, Meng-Chen; Paterson, Andrew H; Wang, Xiyin

2017-05-01

Mainly due to their economic importance, genomes of 10 legumes, including soybean (Glycine max), wild peanut (Arachis duranensis and Arachis ipaensis), and barrel medic (Medicago truncatula), have been sequenced. However, a family-level comparative genomics analysis has been unavailable. With grape (Vitis vinifera) and selected legume genomes as outgroups, we managed to perform a hierarchical and event-related alignment of these genomes and deconvoluted layers of homologous regions produced by ancestral polyploidizations or speciations. Consequently, we illustrated genomic fractionation characterized by widespread gene losses after the polyploidizations. Notably, high similarity in gene retention between recently duplicated chromosomes in soybean supported the likely autopolyploid nature of its tetraploid ancestor. Moreover, although most gene losses were nearly random, largely but not fully described by a geometric distribution, we showed that polyploidization contributed divergently to the copy number variation of important gene families. In addition, we showed significantly divergent evolutionary rates among legumes and, by applying a correction to synonymous nucleotide substitutions at synonymous sites, redated major evolutionary events during their expansion. This effort laid a solid foundation for further genomics exploration in the legume research community and beyond. We describe only a tiny fraction of the legume comparative genomics analysis that we performed; more information is stored in the newly constructed Legume Comparative Genomics Research Platform (www.legumegrp.org). © 2017 American Society of Plant Biologists. All Rights Reserved.
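The statement that gene losses are largely described by a geometric distribution can be made concrete with a small Python sketch. It assumes, purely for illustration, that the quantity being modelled is the length of runs of consecutively lost genes (the abstract does not say which quantity was fitted), and it uses the standard maximum-likelihood estimate for a geometric distribution on 1, 2, 3, ... The run lengths in the example are invented.

```python
# Illustrative sketch only: fit a geometric distribution to run lengths of
# consecutively lost genes. The data and the choice of variable are
# hypothetical assumptions, not taken from the study.


def fit_geometric(run_lengths):
    """Return the maximum-likelihood estimate of p for P(K = k) = p(1-p)^(k-1).

    For support k = 1, 2, 3, ... the estimate is 1 / mean(run_lengths).
    """
    if not run_lengths or min(run_lengths) < 1:
        raise ValueError("run lengths must be a non-empty list of integers >= 1")
    return len(run_lengths) / sum(run_lengths)


def geometric_pmf(k, p):
    """Return the probability that a run has exactly k lost genes."""
    return p * (1.0 - p) ** (k - 1)


if __name__ == "__main__":
    observed_runs = [1, 1, 2, 1, 3, 1, 1, 2]  # hypothetical run lengths
    p_hat = fit_geometric(observed_runs)
    for k in (1, 2, 3):
        expected = geometric_pmf(k, p_hat) * len(observed_runs)
        print(k, observed_runs.count(k), round(expected, 2))
```

Comparing observed and expected counts per run length, as the loop does, is one simple way to see where a geometric model fits and where it falls short.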
Phylogenetic implications of the 38 putative ancestral chromosome segments for four canid species.

PubMed

Graphodatsky, A S; Yang, F; O'Brien, P C; Perelman, P; Milne, B S; Serdukova, N; Kawada, S I; Ferguson-Smith, M A

2001-01-01

Chromosome homologies between the Japanese raccoon dog (Nyctereutes procyonoides viverrinus, 2n = 39 + 2-4 B chromosomes) and domestic dog (Canis familiaris, 2n = 78) have been established by hybridizing a complete set of canine paint probes onto high-resolution G-banded chromosomes of the raccoon dog. Dog chromosomes 1, 13, and 19 each correspond to two raccoon dog chromosome segments, while the remaining 35 dog autosomes each correspond to a single segment. In total, 38 dog autosome paints revealed 41 conserved segments in the raccoon dog. The use of dog painting probes has enabled integration of the raccoon dog chromosomes into the previously established comparative map for the domestic dog, Arctic fox (Alopex lagopus), and red fox (Vulpes vulpes). Extensive chromosome arm homologies were found among chromosomes of the red fox, Arctic fox, and raccoon dog. Contradicting previous findings, our results show that the raccoon dog does not share a single biarmed autosome in common with the Arctic fox, red fox, or domestic cat. Comparative analysis of the distribution patterns of conserved chromosome segments revealed by dog paints in the genomes of the canids, cats, and human reveals 38 ancestral autosome segments. These segments could represent the ancestral chromosome arms in the karyotype of the most recent ancestor of the Canidae family, which we suggest could have had a low diploid number, based on comparisons with outgroup species. Copyright 2001 S. Karger AG, Basel.
Virtual Genomes in Flux: An Interplay of Neutrality and Adaptability Explains Genome Expansion and Streamlining

PubMed Central

Cuypers, Thomas D.; Hogeweg, Paulien

2012-01-01

The picture that emerges from phylogenetic gene content reconstructions is that genomes evolve in a dynamic pattern of rapid expansion and gradual streamlining. Ancestral organisms have been estimated to possess remarkably rich gene complements, although gene loss is a driving force in subsequent lineage adaptation and diversification. Here, we study genome dynamics in a model of virtual cells evolving to maintain homeostasis. We observe a pattern of an initial rapid expansion of the genome and a prolonged phase of mutational load reduction. Generally, load reduction is achieved by the deletion of redundant genes, generating a streamlining pattern. Load reduction can also occur as a result of the generation of highly neutral genomic regions. These regions can expand and contract in a neutral fashion. Our study suggests that genome expansion and streamlining are generic patterns of evolving systems. We propose that the complex genotype to phenotype mapping in virtual cells as well as in their biological counterparts drives genome size dynamics, due to an emerging interplay between adaptation, neutrality, and evolvability. PMID:22234601
Genomic identification of regulatory elements by evolutionary sequence comparison and functional analysis.

PubMed

Loots, Gabriela G

2008-01-01

Despite remarkable recent advances in genomics that have enabled us to identify most of the genes in the human genome, comparable efforts to define transcriptional cis-regulatory elements that control gene expression are lagging behind. The difficulty of this task stems from two equally important problems: our knowledge of how regulatory elements are encoded in genomes remains elementary, and there is a vast genomic search space for regulatory elements, since most of mammalian genomes are noncoding. Comparative genomic approaches are having a remarkable impact on the study of transcriptional regulation in eukaryotes and currently represent the most efficient and reliable methods of predicting noncoding sequences likely to control the patterns of gene expression. By subjecting eukaryotic genomic sequences to computational comparisons and subsequent experimentation, we are inching our way toward a more comprehensive catalog of common regulatory motifs that lie behind fundamental biological processes. We are still far from comprehending how the transcriptional regulatory code is encrypted in the human genome and providing an initial global view of regulatory gene networks, but collectively, the continued development of comparative and experimental approaches will rapidly expand our knowledge of the transcriptional regulome.
Origin and evolution of spliceosomal introns

PubMed Central

2012-01-01

Evolution of exon-intron structure of eukaryotic genes has been a matter of long-standing, intensive debate. The introns-early concept, later rebranded ‘introns first’ held that protein-coding genes were interrupted by numerous introns even at the earliest stages of life's evolution and that introns played a major role in the origin of proteins by facilitating recombination of sequences coding for small protein/peptide modules. The introns-late concept held that introns emerged only in eukaryotes and new introns have been accumulating continuously throughout eukaryotic evolution. Analysis of orthologous genes from completely sequenced eukaryotic genomes revealed numerous shared intron positions in orthologous genes from animals and plants and even between animals, plants and protists, suggesting that many ancestral introns have persisted since the last eukaryotic common ancestor (LECA). Reconstructions of intron gain and loss using the growing collection of genomes of diverse eukaryotes and increasingly advanced probabilistic models convincingly show that the LECA and the ancestors of each eukaryotic supergroup had intron-rich genes, with intron densities comparable to those in the most intron-rich modern genomes such as those of vertebrates. The subsequent evolution in most lineages of eukaryotes involved primarily loss of introns, with only a few episodes of substantial intron gain that might have accompanied major evolutionary innovations such as the origin of metazoa. The original invasion of self-splicing Group II introns, presumably originating from the mitochondrial endosymbiont, into the genome of the emerging eukaryote might have been a key factor of eukaryogenesis that in particular triggered the origin of endomembranes and the nucleus. Conversely, splicing errors gave rise to alternative splicing, a major contribution to the biological complexity of multicellular eukaryotes. There is no indication that any prokaryote has ever possessed a spliceosome
Hierarchically Aligning 10 Legume Genomes Establishes a Family-Level Genomics Platform

PubMed Central

Sun, Pengchuan; Li, Yuxian; Liu, Yinzhe; Yu, Jigao; Ma, Xuelian; Sun, Sangrong; Yang, Nanshan; Xia, Ruiyan; Lei, Tianyu; Liu, Xiaojian; Jiao, Beibei; Xing, Yue; Ge, Weina; Wang, Li; Song, Xiaoming; Yuan, Min; Guo, Di; Zhang, Lan; Zhang, Jiaqi; Chen, Wei; Pan, Yuxin; Liu, Tao; Jin, Ling; Sun, Jinshuai; Yu, Jiaxiang; Duan, Xueqian; Shen, Shaoqi; Qin, Jun; Zhang, Meng-chen; Paterson, Andrew H.

2017-01-01

Mainly due to their economic importance, genomes of 10 legumes, including soybean (Glycine max), wild peanut (Arachis duranensis and Arachis ipaensis), and barrel medic (Medicago truncatula), have been sequenced. However, a family-level comparative genomics analysis has been unavailable. With grape (Vitis vinifera) and selected legume genomes as outgroups, we managed to perform a hierarchical and event-related alignment of these genomes and deconvoluted layers of homologous regions produced by ancestral polyploidizations or speciations. Consequently, we illustrated genomic fractionation characterized by widespread gene losses after the polyploidizations. Notably, high similarity in gene retention between recently duplicated chromosomes in soybean supported the likely autopolyploidy nature of its tetraploid ancestor. Moreover, although most gene losses were nearly random, largely but not fully described by geometric distribution, we showed that polyploidization contributed divergently to the copy number variation of important gene families. Besides, we showed significantly divergent evolutionary levels among legumes and, by performing synonymous nucleotide substitutions at synonymous sites correction, redated major evolutionary events during their expansion. This effort laid a solid foundation for further genomics exploration in the legume research community and beyond. We describe only a tiny fraction of legume comparative genomics analysis that we performed; more information was stored in the newly constructed Legume Comparative Genomics Research Platform (www.legumegrp.org). PMID:28325848
The reconstructed ancestral subunit a functions as both V-ATPase isoforms Vph1p and Stv1p in Saccharomyces cerevisiae

PubMed Central

Finnigan, Gregory C.; Hanson-Smith, Victor; Houser, Benjamin D.; Park, Hae J.; Stevens, Tom H.

2011-01-01

The vacuolar-type, proton-translocating ATPase (V-ATPase) is a multisubunit enzyme responsible for organelle acidification in eukaryotic cells. Many organisms have evolved V-ATPase subunit isoforms that allow for increased specialization of this critical enzyme. Differential targeting of the V-ATPase to specific subcellular organelles occurs in eukaryotes from humans to budding yeast. In Saccharomyces cerevisiae, the two subunit a isoforms are the only difference between the two V-ATPase populations. Incorporation of Vph1p or Stv1p into the V-ATPase dictates the localization of the V-ATPase to the vacuole or late Golgi/endosome, respectively. A duplication event within fungi gave rise to two subunit a genes. We used ancestral gene reconstruction to generate the most recent common ancestor of Vph1p and Stv1p (Anc.a) and tested its function in yeast. Anc.a localized to both the Golgi/endosomal network and vacuolar membrane and acidified these compartments as part of a hybrid V-ATPase complex. Trafficking of Anc.a did not require retrograde transport from the late endosome to the Golgi that has evolved for retrieval of the Stv1p isoform. Rather, Anc.a localized to both structures through slowed anterograde transport en route to the vacuole. Our results suggest an evolutionary model that describes the differential localization of the two yeast V-ATPase isoforms. PMID:21737673
Adaptive genomic evolution of opsins reveals that early mammals flourished in nocturnal environments.

PubMed

Borges, Rui; Johnson, Warren E; O'Brien, Stephen J; Gomes, Cidália; Heesy, Christopher P; Antunes, Agostinho

2018-02-05

Based on evolutionary patterns of the vertebrate eye, Walls (1942) hypothesized that early placental mammals evolved primarily in nocturnal habitats. However, not only Eutheria, but all mammals show photic characteristics (i.e. dichromatic vision, rod-dominated retina) suggestive of a scotopic eye design. Here, we used integrative comparative genomic and phylogenetic methodologies employing the photoreceptive opsin gene family in 154 mammals to test the likelihood of a nocturnal period in the emergence of all mammals. We showed that mammals possess genomic patterns concordant with a nocturnal ancestry. The loss of the RH2, VA, PARA, PARIE and OPN4x opsins in all mammals led us to advance a probable and most-parsimonious hypothesis of a global nocturnal bottleneck that explains the loss of these genes in the emerging lineage (>215.5 million years ago). In addition, ancestral character reconstruction analyses provided strong evidence that ancestral mammals possessed a nocturnal lifestyle, ultra-violet-sensitive vision, low visual acuity and low orbit convergence (i.e. panoramic vision). Overall, this study provides insight into the evolutionary history of the mammalian eye while discussing important ecological aspects of the photic paleo-environments ancestral mammals have occupied.
Systems Biology Approaches for Understanding Genome Architecture.

PubMed

Sewitz, Sven; Lipkow, Karen

2016-01-01

The linear and three-dimensional arrangement and composition of chromatin in eukaryotic genomes underlies the mechanisms directing gene regulation. Understanding this organization requires the integration of many data types and experimental results. Here we describe the approach of integrating genome-wide protein-DNA binding data to determine chromatin states. To investigate spatial aspects of genome organization, we present a detailed description of how to run stochastic simulations of protein movements within a simulated nucleus in 3D. This systems level approach enables the development of novel questions aimed at understanding the basic mechanisms that regulate genome dynamics.
The intrinsic combinatorial organization and information theoretic content of a sequence are correlated to the DNA encoded nucleosome organization of eukaryotic genomes.

PubMed

Utro, Filippo; Di Benedetto, Valeria; Corona, Davide F V; Giancarlo, Raffaele

2016-03-15

Thanks to research spanning nearly 30 years, two major models have emerged that account for nucleosome organization in chromatin: statistical and sequence specific. The first is based on elegant, easy to compute, closed-form mathematical formulas that make no assumptions of the physical and chemical properties of the underlying DNA sequence. Moreover, they need no training on the data for their computation. The latter is based on some sequence regularities but, as opposed to the statistical model, it lacks the same type of closed-form formulas that, in this case, should be based on the DNA sequence only. We contribute to close this important methodological gap between the two models by providing three very simple formulas for the sequence specific one. They are all based on well-known formulas in Computer Science and Bioinformatics, and they give different quantifications of how complex a sequence is. In view of how remarkably well they perform, it is very surprising that measures of sequence complexity have not even been considered as candidates to close the mentioned gap. We provide experimental evidence that the intrinsic level of combinatorial organization and information-theoretic content of subsequences within a genome are strongly correlated to the level of DNA encoded nucleosome organization discovered by Kaplan et al. Our results establish an important connection between the intrinsic complexity of subsequences in a genome and the intrinsic, i.e. DNA encoded, nucleosome organization of eukaryotic genomes. It is a first step towards a mathematical characterization of this latter 'encoding'. Supplementary data are available at Bioinformatics online. Contact: futro@us.ibm.com. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
The genome of the domesticated apple (Malus × domestica Borkh.).

PubMed

Velasco, Riccardo; Zharkikh, Andrey; Affourtit, Jason; Dhingra, Amit; Cestaro, Alessandro; Kalyanaraman, Ananth; Fontana, Paolo; Bhatnagar, Satish K; Troggio, Michela; Pruss, Dmitry; Salvi, Silvio; Pindo, Massimo; Baldi, Paolo; Castelletti, Sara; Cavaiuolo, Marina; Coppola, Giuseppina; Costa, Fabrizio; Cova, Valentina; Dal Ri, Antonio; Goremykin, Vadim; Komjanc, Matteo; Longhi, Sara; Magnago, Pierluigi; Malacarne, Giulia; Malnoy, Mickael; Micheletti, Diego; Moretto, Marco; Perazzolli, Michele; Si-Ammour, Azeddine; Vezzulli, Silvia; Zini, Elena; Eldredge, Glenn; Fitzgerald, Lisa M; Gutin, Natalia; Lanchbury, Jerry; Macalma, Teresita; Mitchell, Jeff T; Reid, Julia; Wardell, Bryan; Kodira, Chinnappa; Chen, Zhoutao; Desany, Brian; Niazi, Faheem; Palmer, Melinda; Koepke, Tyson; Jiwan, Derick; Schaeffer, Scott; Krishnan, Vandhana; Wu, Changjun; Chu, Vu T; King, Stephen T; Vick, Jessica; Tao, Quanzhou; Mraz, Amy; Stormo, Aimee; Stormo, Keith; Bogden, Robert; Ederle, Davide; Stella, Alessandra; Vecchietti, Alberto; Kater, Martin M; Masiero, Simona; Lasserre, Pauline; Lespinasse, Yves; Allan, Andrew C; Bus, Vincent; Chagné, David; Crowhurst, Ross N; Gleave, Andrew P; Lavezzo, Enrico; Fawcett, Jeffrey A; Proost, Sebastian; Rouzé, Pierre; Sterck, Lieven; Toppo, Stefano; Lazzari, Barbara; Hellens, Roger P; Durel, Charles-Eric; Gutin, Alexander; Bumgarner, Roger E; Gardiner, Susan E; Skolnick, Mark; Egholm, Michael; Van de Peer, Yves; Salamini, Francesco; Viola, Roberto

2010-10-01

We report a high-quality draft genome sequence of the domesticated apple (Malus × domestica). We show that a relatively recent (>50 million years ago) genome-wide duplication (GWD) has resulted in the transition from nine ancestral chromosomes to 17 chromosomes in the Pyreae. Traces of older GWDs partly support the monophyly of the ancestral paleohexaploidy of eudicots. Phylogenetic reconstruction of Pyreae and the genus Malus, relative to major Rosaceae taxa, identified the progenitor of the cultivated apple as M. sieversii. Expansion of gene families reported to be involved in fruit development may explain formation of the pome, a Pyreae-specific false fruit that develops by proliferation of the basal part of the sepals, the receptacle. In apple, a subclade of MADS-box genes, normally involved in flower and fruit development, is expanded to include 15 members, as are other gene families involved in Rosaceae-specific metabolism, such as transport and assimilation of sorbitol.
Eukaryotic organisms in Proterozoic oceans

PubMed Central

Knoll, A.H; Javaux, E.J; Hewitt, D; Cohen, P

2006-01-01

The geological record of protists begins well before the Ediacaran and Cambrian diversification of animals, but the antiquity of that history, its reliability as a chronicle of evolution and the causal inferences that can be drawn from it remain subjects of debate. Well-preserved protists are known from a relatively small number of Proterozoic formations, but taphonomic considerations suggest that they capture at least broad aspects of early eukaryotic evolution. A modest diversity of problematic, possibly stem group protists occurs in ca 1800–1300 Myr old rocks. 1300–720 Myr fossils document the divergence of major eukaryotic clades, but only with the Ediacaran–Cambrian radiation of animals did diversity increase within most clades with fossilizable members. While taxonomic placement of many Proterozoic eukaryotes may be arguable, the presence of characters used for that placement is not. Focus on character evolution permits inferences about the innovations in cell biology and development that underpin the taxonomic and morphological diversification of eukaryotic organisms. PMID:16754612
Navigating yeast genome maintenance with functional genomics.

PubMed

Measday, Vivien; Stirling, Peter C

2016-03-01

Maintenance of genome integrity is a fundamental requirement of all organisms. To address this, organisms have evolved extremely faithful modes of replication, DNA repair and chromosome segregation to combat the deleterious effects of an unstable genome. Nonetheless, a small amount of genome instability is the driver of evolutionary change and adaptation, and thus a low level of instability is permitted in populations. While defects in genome maintenance almost invariably reduce fitness in the short term, they can create an environment where beneficial mutations are more likely to occur. The importance of this fact is clearest in the development of human cancer, where genome instability is a well-established enabling characteristic of carcinogenesis. This raises the crucial question: what are the cellular pathways that promote genome maintenance and what are their mechanisms? Work in model organisms, in particular the yeast Saccharomyces cerevisiae, has provided the global foundations of genome maintenance mechanisms in eukaryotes. The development of pioneering genomic tools in S. cerevisiae, such as the systematic creation of mutants in all nonessential and essential genes, has enabled whole-genome approaches to identifying genes with roles in genome maintenance. Here, we review the extensive whole-genome approaches taken in yeast, with an emphasis on functional genomic screens, to understand the genetic basis of genome instability, highlighting a range of genetic and cytological screening modalities. By revealing the biological pathways and processes regulating genome integrity, these analyses contribute to the systems-level map of the yeast cell and inform studies of human disease, especially cancer. © The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please email: journals.permissions@oup.com.
Invasion of Ancestral Mammals into Dim-light Environments Inferred from Adaptive Evolution of the Phototransduction Genes.

PubMed

Wu, Yonghua; Wang, Haifeng; Hadly, Elizabeth A

2017-04-20

Nocturnality is a key evolutionary innovation of mammals that enables mammals to occupy relatively empty nocturnal niches. Invasion of ancestral mammals into nocturnality has long been inferred from the phylogenetic relationships of crown Mammalia, which is primarily nocturnal, and crown Reptilia, which is primarily diurnal, although molecular evidence for this is lacking. Here we used phylogenetic analyses of the vision genes involved in the phototransduction pathway to predict the diel activity patterns of ancestral mammals and reptiles. Our results demonstrated that the common ancestor of the extant Mammalia was dominated by positive selection for dim-light vision, supporting the predominate nocturnality of the ancestral mammals. Further analyses showed that the nocturnality of the ancestral mammals was probably derived from the predominate diurnality of the ancestral amniotes, which featured strong positive selection for bright-light vision. Like the ancestral amniotes, the common ancestor of the extant reptiles and various taxa in Squamata, one of the main competitors of the temporal niches of the ancestral mammals, were found to be predominate diurnality as well. Despite this relatively apparent temporal niche partitioning between ancestral mammals and the relevant reptiles, our results suggested partial overlap of their temporal niches during crepuscular periods.
Extensive gene content variation in the Brachypodium distachyon pan-genome correlates with population structure.

PubMed

Gordon, Sean P; Contreras-Moreira, Bruno; Woods, Daniel P; Des Marais, David L; Burgess, Diane; Shu, Shengqiang; Stritt, Christoph; Roulin, Anne C; Schackwitz, Wendy; Tyler, Ludmila; Martin, Joel; Lipzen, Anna; Dochy, Niklas; Phillips, Jeremy; Barry, Kerrie; Geuten, Koen; Budak, Hikmet; Juenger, Thomas E; Amasino, Richard; Caicedo, Ana L; Goodstein, David; Davidson, Patrick; Mur, Luis A J; Figueroa, Melania; Freeling, Michael; Catalan, Pilar; Vogel, John P

2017-12-19

While prokaryotic pan-genomes have been shown to contain many more genes than any individual organism, the prevalence and functional significance of differentially present genes in eukaryotes remains poorly understood. Whole-genome de novo assembly and annotation of 54 lines of the grass Brachypodium distachyon yield a pan-genome containing nearly twice the number of genes found in any individual genome. Genes present in all lines are enriched for essential biological functions, while genes present in only some lines are enriched for conditionally beneficial functions (e.g., defense and development), display faster evolutionary rates, lie closer to transposable elements and are less likely to be syntenic with orthologous genes in other grasses. Our data suggest that differentially present genes contribute substantially to phenotypic variation within a eukaryote species, these genes have a major influence in population genetics, and transposable elements play a key role in pan-genome evolution.
Eukaryotic tRNAs fingerprint invertebrates vis-à-vis vertebrates.

PubMed

Mitra, Sanga; Das, Pijush; Samadder, Arpa; Das, Smarajit; Betai, Rupal; Chakrabarti, Jayprokas

2015-01-01

During translation, aminoacyl-tRNA synthetases recognize the identities of the tRNAs to charge them with their respective amino acids. The conserved identities of 58,244 eukaryotic tRNAs of 24 invertebrates and 45 vertebrates in genomic tRNA database were analyzed and their novel features extracted. The internal promoter sequences, namely, A-Box and B-Box, were investigated and evidence gathered that the intervention of optional nucleotides at 17a and 17b correlated with the optimal length of the A-Box. The presence of canonical transcription terminator sequences at the immediate vicinity of tRNA genes was ventured. Even though non-canonical introns had been reported in red alga, green alga, and nucleomorph so far, fairly motivating evidence of their existence emerged in tRNA genes of other eukaryotes. Non-canonical introns were seen to interfere with the internal promoters in two cases, questioning their transcription fidelity. In a first of its kind, phylogenetic constructs based on tRNA molecules delineated and built the trees of the vast and diverse invertebrates and vertebrates. Finally, two tRNA models representing the invertebrates and the vertebrates were drawn, by isolating the dominant consensus in the positional fluctuations of nucleotide compositions.
The high-quality draft genome of peach (Prunus persica) identifies unique patterns of genetic diversity, domestication and genome evolution.

PubMed

Verde, Ignazio; Abbott, Albert G; Scalabrin, Simone; Jung, Sook; Shu, Shengqiang; Marroni, Fabio; Zhebentyayeva, Tatyana; Dettori, Maria Teresa; Grimwood, Jane; Cattonaro, Federica; Zuccolo, Andrea; Rossini, Laura; Jenkins, Jerry; Vendramin, Elisa; Meisel, Lee A; Decroocq, Veronique; Sosinski, Bryon; Prochnik, Simon; Mitros, Therese; Policriti, Alberto; Cipriani, Guido; Dondini, Luca; Ficklin, Stephen; Goodstein, David M; Xuan, Pengfei; Del Fabbro, Cristian; Aramini, Valeria; Copetti, Dario; Gonzalez, Susana; Horner, David S; Falchi, Rachele; Lucas, Susan; Mica, Erica; Maldonado, Jonathan; Lazzari, Barbara; Bielenberg, Douglas; Pirona, Raul; Miculan, Mara; Barakat, Abdelali; Testolin, Raffaele; Stella, Alessandra; Tartarini, Stefano; Tonutti, Pietro; Arús, Pere; Orellana, Ariel; Wells, Christina; Main, Dorrie; Vizzotto, Giannina; Silva, Herman; Salamini, Francesco; Schmutz, Jeremy; Morgante, Michele; Rokhsar, Daniel S

2013-05-01

Rosaceae is the most important fruit-producing clade, and its key commercially relevant genera (Fragaria, Rosa, Rubus and Prunus) show broadly diverse growth habits, fruit types and compact diploid genomes. Peach, a diploid Prunus species, is one of the best genetically characterized deciduous trees. Here we describe the high-quality genome sequence of peach obtained from a completely homozygous genotype. We obtained a complete chromosome-scale assembly using Sanger whole-genome shotgun methods. We predicted 27,852 protein-coding genes, as well as noncoding RNAs. We investigated the path of peach domestication through whole-genome resequencing of 14 Prunus accessions. The analyses suggest major genetic bottlenecks that have substantially shaped peach genome diversity. Furthermore, comparative analyses showed that peach has not undergone recent whole-genome duplication, and even though the ancestral triplicated blocks in peach are fragmentary compared to those in grape, all seven paleosets of paralogs from the putative paleoancestor are detectable.
A post-assembly genome-improvement toolkit (PAGIT) to obtain annotated genomes from contigs.

PubMed

Swain, Martin T; Tsai, Isheng J; Assefa, Samual A; Newbold, Chris; Berriman, Matthew; Otto, Thomas D

2012-06-07

Genome projects now produce draft assemblies within weeks owing to advanced high-throughput sequencing technologies. For milestone projects such as Escherichia coli or Homo sapiens, teams of scientists were employed to manually curate and finish these genomes to a high standard. Nowadays, this is not feasible for most projects, and the quality of genomes is generally of a much lower standard. This protocol describes software (PAGIT) that is used to improve the quality of draft genomes. It offers flexible functionality to close gaps in scaffolds, correct base errors in the consensus sequence and exploit reference genomes (if available) in order to improve scaffolding and generating annotations. The protocol is most accessible for bacterial and small eukaryotic genomes (up to 300 Mb), such as pathogenic bacteria, malaria and parasitic worms. Applying PAGIT to an E. coli assembly takes ∼24 h: it doubles the average contig size and annotates over 4,300 gene models.

The genome sequence of the Irish potato famine pathogen Phytophthora infestans

USDA-ARS's Scientific Manuscript database

Phytophthora infestans is the most destructive pathogen of potato and a model organism for the oomycetes, a distinct lineage of fungus-like eukaryotes that are related to photosynthetic organisms such as brown algae and diatoms. Here, we report the genome sequence of P. infestans. The ~240 Mb genome...
Phylogenetic evidence for a fusion of archaeal and bacterial SemiSWEETs to form eukaryotic SWEETs and identification of SWEET hexose transporters in the amphibian chytrid pathogen Batrachochytrium dendrobatidis.

PubMed

Hu, Yi-Bing; Sosso, Davide; Qu, Xiao-Qing; Chen, Li-Qing; Ma, Lai; Chermak, Diane; Zhang, De-Chun; Frommer, Wolf B

2016-10-01

SWEETs represent a new class of sugar transporters first described in plants, animals, and humans and later in prokaryotes. Plant SWEETs play key roles in phloem loading, seed filling, and nectar secretion, whereas the role of archaeal, bacterial, and animal transporters remains elusive. Structural analyses show that eukaryotic SWEETs are composed of 2 triple-helix bundles (THBs) fused via an inversion linker helix, whereas prokaryotic SemiSWEETs contain only a single THB and require homodimerization to form transport pores. This study indicates that SWEETs retained sugar transport activity in all kingdoms of life, and that SemiSWEETs are likely their ancestral units. Fusion of oligomeric subunits into single polypeptides during evolution of eukaryotes is commonly found for transporters. Phylogenetic analyses indicate that THBs of eukaryotic SWEETs may not have evolved by tandem duplication of an open reading frame, but rather originated by fusion between an archaeal and a bacterial SemiSWEET, which potentially explains the asymmetry of eukaryotic SWEETs. Moreover, despite the ancient ancestry, SWEETs had not been identified in fungi or oomycetes. Here, we report the identification of SWEETs in oomycetes as well as SWEETs and a potential SemiSWEET in primitive fungi. BdSWEET1 and BdSWEET2 from Batrachochytrium dendrobatidis, a nonhyphal zoosporic fungus that causes global decline in amphibians, showed glucose and fructose transport activities. © FASEB.
Precambrian Skeletonized Microbial Eukaryotes

NASA Astrophysics Data System (ADS)

Lipps, Jere H.

2017-04-01

Skeletal heterotrophic eukaryotes are mostly absent from the Precambrian, although algal eukaryotes appear about 2.2 billion years ago. Tintinnids, radiolaria and foraminifera have molecular origins well back into the Precambrian yet no representatives of these groups are known with certainty in that time. These data infer times of the last common ancestors, not the appearance of true representatives of these groups which may well have diversified or not been preserved since those splits. Previous reports of these groups in the Precambrian are misinterpretations of other objects in the fossil record. Reported tintinnids at 1600 mya from China are metamorphic shards or mineral artifacts, the many specimens from 635-715 mya in Mongolia may be eukaryotes but they are not tintinnids, and the putative tintinnids at 580 mya in the Doushantuo Formation of China are diagenetic alterations of well-known acritarchs. The oldest supposed foraminiferan, Titanotheca, from 550 to 565 mya rocks in South America and Africa, is based on the occurrence of rutile in the tests and in a few modern agglutinated foraminifera, as well as the agglutinated tests. Neither of these nor the morphology are characteristic of foraminifera; hence these fossils remain as indeterminate microfossils. Platysolenites, an agglutinated tube identical to the modern foraminiferan Bathysiphon, occurs in the latest Neoproterozoic in Russia, Canada, and the USA (California). Some of the larger fossils occurring in typical Ediacaran (late Neoproterozoic) assemblages may be xenophyophorids (very large foraminifera), but the comparison is disputed and flawed. Radiolaria, on occasion, have been reported in the Precambrian, but the earliest known clearly identifiable ones are in the Cambrian. The only certain Precambrian heterotrophic skeletal eukaryotes (thecamoebians) occur in fresh-water rocks at about 750 mya. Skeletonized radiolaria and foraminifera appear sparsely in the Cambrian and radiate in the Ordovician
Invasion of Ancestral Mammals into Dim-light Environments Inferred from Adaptive Evolution of the Phototransduction Genes

PubMed Central

Wu, Yonghua; Wang, Haifeng; Hadly, Elizabeth A.

2017-01-01

Nocturnality is a key evolutionary innovation of mammals that enables mammals to occupy relatively empty nocturnal niches. Invasion of ancestral mammals into nocturnality has long been inferred from the phylogenetic relationships of crown Mammalia, which is primarily nocturnal, and crown Reptilia, which is primarily diurnal, although molecular evidence for this is lacking. Here we used phylogenetic analyses of the vision genes involved in the phototransduction pathway to predict the diel activity patterns of ancestral mammals and reptiles. Our results demonstrated that the common ancestor of the extant Mammalia was dominated by positive selection for dim-light vision, supporting the predominate nocturnality of the ancestral mammals. Further analyses showed that the nocturnality of the ancestral mammals was probably derived from the predominate diurnality of the ancestral amniotes, which featured strong positive selection for bright-light vision. Like the ancestral amniotes, the common ancestor of the extant reptiles and various taxa in Squamata, one of the main competitors of the temporal niches of the ancestral mammals, were found to be predominate diurnality as well. Despite this relatively apparent temporal niche partitioning between ancestral mammals and the relevant reptiles, our results suggested partial overlap of their temporal niches during crepuscular periods. PMID:28425474
Mavericks, a novel class of giant transposable elements widespread in eukaryotes and related to DNA viruses.

PubMed

Pritham, Ellen J; Putliwala, Tasneem; Feschotte, Cédric

2007-04-01

We previously identified a group of atypical mobile elements designated Mavericks from the nematodes Caenorhabditis elegans and C. briggsae and the zebrafish Danio rerio. Here we present the results of comprehensive database searches of the genome sequences available, which reveal that Mavericks are widespread in invertebrates and non-mammalian vertebrates but show a patchy distribution in non-animal species, being present in the fungi Glomus intraradices and Phakopsora pachyrhizi and in several single-celled eukaryotes such as the ciliate Tetrahymena thermophila, the stramenopile Phytophthora infestans and the trichomonad Trichomonas vaginalis, but not detectable in plants. This distribution, together with comparative and phylogenetic analyses of Maverick-encoded proteins, is suggestive of an ancient origin of these elements in eukaryotes followed by lineage-specific losses and/or recurrent episodes of horizontal transmission. In addition, we report that Maverick elements have amplified recently to high copy numbers in T. vaginalis where they now occupy as much as 30% of the genome. Sequence analysis confirms that most Mavericks encode a retroviral-like integrase, but lack other open reading frames typically found in retroelements. Nevertheless, the length and conservation of the target site duplication created upon Maverick insertion (5- or 6-bp) is consistent with a role of the integrase-like protein in the integration of a double-stranded DNA transposition intermediate. Mavericks also display long terminal-inverted repeats but do not contain ORFs similar to proteins encoded by DNA transposons. Instead, Mavericks encode a conserved set of 5 to 9 genes (in addition to the integrase) that are predicted to encode proteins with homology to replication and packaging proteins of some bacteriophages and diverse eukaryotic double-stranded DNA viruses, including a DNA polymerase B homolog and putative capsid proteins. Based on these and other structural similarities, we
Inverse PCR-based method for isolating novel SINEs from genome.

PubMed

Han, Yawei; Chen, Liping; Guan, Lihong; He, Shunping

2014-04-01

Short interspersed elements (SINEs) are moderately repetitive DNA sequences in eukaryotic genomes. Although eukaryotic genomes contain numerous SINE copies, it is very difficult and laborious to isolate and identify them by the reported methods. In this study, inverse PCR was successfully applied to isolate SINEs from the genome of Opsariichthys bidens, an Eastern Asian cyprinid. A group of SINEs derived from tRNA(Ala) molecules was identified and named Opsar after Opsariichthys. Opsar exhibited the characteristics of SINEs: a tRNA(Ala)-derived region at the 5' end, a tRNA-unrelated region, and an AT-rich region at the 3' end. The tRNA-derived region of Opsar shared 76% sequence similarity with the tRNA(Ala) gene. This result indicated that Opsar could derive from an inactive copy or pseudogene of tRNA(Ala). The reliability of the method was tested by obtaining C-SINE, Ct-SINE, and M-SINEs from the Ctenopharyngodon idellus, Megalobrama amblycephala, and Cyprinus carpio genomes. This method is simpler than those previously reported, omitting many steps such as preparation of probes, construction of genomic libraries, and hybridization.
Mutant power: using mutant allele collections for yeast functional genomics.

PubMed

Norman, Kaitlyn L; Kumar, Anuj

2016-03-01

The budding yeast has long served as a model eukaryote for the functional genomic analysis of highly conserved signaling pathways, cellular processes and mechanisms underlying human disease. The collection of reagents available for genomics in yeast is extensive, encompassing a growing diversity of mutant collections beyond gene deletion sets in the standard wild-type S288C genetic background. We review here three main types of mutant allele collections: transposon mutagen collections, essential gene collections and overexpression libraries. Each collection provides unique and identifiable alleles that can be utilized in genome-wide, high-throughput studies. These genomic reagents are particularly informative in identifying synthetic phenotypes and functions associated with essential genes, including those modeled most effectively in complex genetic backgrounds. Several examples of genomic studies in filamentous/pseudohyphal backgrounds are provided here to illustrate this point. Additionally, the limitations of each approach are examined. Collectively, these mutant allele collections in Saccharomyces cerevisiae and the related pathogenic yeast Candida albicans promise insights toward an advanced understanding of eukaryotic molecular and cellular biology.
On the Number of Non-equivalent Ancestral Configurations for Matching Gene Trees and Species Trees.

PubMed

Disanto, Filippo; Rosenberg, Noah A

2017-09-14

An ancestral configuration is one of the combinatorially distinct sets of gene lineages that, for a given gene tree, can reach a given node of a specified species tree. Ancestral configurations have appeared in recursive algebraic computations of the conditional probability that a gene tree topology is produced under the multispecies coalescent model for a given species tree. For matching gene trees and species trees, we study the number of ancestral configurations, considered up to an equivalence relation introduced by Wu (Evolution 66:763-775, 2012) to reduce the complexity of the recursive probability computation. We examine the largest number of non-equivalent ancestral configurations possible for a given tree size n. Whereas the smallest number of non-equivalent ancestral configurations increases polynomially with n, we show that the largest number increases with [Formula: see text], where k is a constant that satisfies [Formula: see text]. Under a uniform distribution on the set of binary labeled trees with a given size n, the mean number of non-equivalent ancestral configurations grows exponentially with n. The results refine an earlier analysis of the number of ancestral configurations considered without applying the equivalence relation, showing that use of the equivalence relation does not alter the exponential nature of the increase with tree size.
The eukaryotic fossil record in deep time

NASA Astrophysics Data System (ADS)

Butterfield, N.

2011-12-01

Eukaryotic organisms are defining constituents of the Phanerozoic biosphere, but they also extend well back into the Proterozoic record, primarily in the form of microscopic body fossils. Criteria for identifying pre-Ediacaran eukaryotes include large cell size, morphologically complex cell walls and/or the recognition of diagnostically eukaryotic cell division patterns. The oldest unambiguous eukaryote currently on record is an acanthomorphic acritarch (Tappania) from the Palaeoproterozoic Semri Group of central India. Older candidate eukaryotes are difficult to distinguish from giant bacteria, prokaryotic colonies or diagenetic artefacts. In younger Meso- and Neoproterozoic strata, the challenge is to recognize particular grades and clades of eukaryotes, and to document their macro-evolutionary expression. Distinctive unicellular forms include mid-Neoproterozoic testate amoebae and phosphate biomineralizing 'scale-microfossils' comparable to an extant green alga. There is also a significant record of seaweeds, possible fungi and problematica from this interval, documenting multiple independent experiments in eukaryotic multicellularity. Taxonomically resolved forms include a bangiacean red alga and probable vaucheriacean chromalveolate algae from the late Mesoproterozoic, and populations of hydrodictyacean and siphonocladalean green algae of mid Neoproterozoic age. Despite this phylogenetic breadth, however, or arguments from molecular clocks, there is no convincing evidence for pre-Ediacaran metazoans or metaphytes. The conspicuously incomplete nature of the Proterozoic record makes it difficult to resolve larger-scale ecological and evolutionary patterns. Even so, both body fossils and biomarker data point to a pre-Ediacaran biosphere dominated overwhelmingly by prokaryotes. Contemporaneous eukaryotes appear to be limited to conspicuously shallow water environments, and exhibit fundamentally lower levels of morphological diversity and evolutionary turnover than
The Giardia lamblia genome.

PubMed

Adam, R D

2000-04-10

Giardia lamblia is a protozoan parasite of humans and other mammals that is thought to be one of the most primitive extant eukaryotic organisms. Although distinctly eukaryotic, it is notable for its lack of mitochondria, nucleoli, and peroxisomes. It has been suggested that Giardia spp. are pre-mitochondriate organisms, but the identification of genes in G. lamblia thought to be of mitochondrial origin has generated controversy regarding that designation. Giardia lamblia trophozoites have two nuclei that are identical in all ways that have been studied. They are polyploid with at least four, and perhaps eight or more, copies of each of five chromosomes per organism and have an estimated genome complexity of 1.2 × 10^7 bp of DNA and a GC content of 46%. There is evidence for recombination at the telomeres of some of the chromosomes, and multiple size variants of single chromosomes have been identified within cloned isolates. However, the internal regions of the chromosomes demonstrate no evidence of recombination. For example, there is no evidence for control of vsp gene expression by DNA recombination, and no evidence for rapid mutation in the vsp genes. Single pass sequences of approximately 9% of the G. lamblia genome have already been obtained. An ongoing genome project plans to obtain approximately 95% of the genome by a random approach, as well as a complete physical map using a bacterial artificial chromosome library. The results will facilitate a better understanding of the biology of Giardia spp. as well as their phylogenetic relationship to other primitive organisms.
The structured ancestral selection graph and the many-demes limit.

PubMed

Slade, Paul F; Wakeley, John

2005-02-01

We show that the unstructured ancestral selection graph applies to part of the history of a sample from a population structured by restricted migration among subpopulations, or demes. The result holds in the limit as the number of demes tends to infinity with proportionately weak selection, and we have also made the assumptions of island-type migration and that demes are equivalent in size. After an instantaneous sample-size adjustment, this structured ancestral selection graph converges to an unstructured ancestral selection graph with a mutation parameter that depends inversely on the migration rate. In contrast, the selection parameter for the population is independent of the migration rate and is identical to the selection parameter in an unstructured population. We show analytically that estimators of the migration rate, based on pairwise sequence differences, derived under the assumption of neutrality should perform equally well in the presence of weak selection. We also modify an algorithm for simulating genealogies conditional on the frequencies of two selected alleles in a sample. This permits efficient simulation of stronger selection than was previously possible. Using this new algorithm, we simulate gene genealogies under the many-demes ancestral selection graph and identify some situations in which migration has a strong effect on the time to the most recent common ancestor of the sample. We find that a similar effect also increases the sensitivity of the genealogy to selection.
The Impact of Chromatin Dynamics on Cas9-Mediated Genome Editing in Human Cells.

PubMed

Daer, René M; Cutts, Josh P; Brafman, David A; Haynes, Karmella A

2017-03-17

In order to efficiently edit eukaryotic genomes, it is critical to test the impact of chromatin dynamics on CRISPR/Cas9 function and develop strategies to adapt the system to eukaryotic contexts. So far, research has extensively characterized the relationship between the CRISPR endonuclease Cas9 and the composition of the RNA-DNA duplex that mediates the system's precision. Evidence suggests that chromatin modifications and DNA packaging can block eukaryotic genome editing by custom-built DNA endonucleases like Cas9; however, the underlying mechanism of Cas9 inhibition is unclear. Here, we demonstrate that closed, gene-silencing-associated chromatin is a mechanism for the interference of Cas9-mediated DNA editing. Our assays use a transgenic cell line with a drug-inducible switch to control chromatin states (open and closed) at a single genomic locus. We show that closed chromatin inhibits binding and editing at specific target sites and that artificial reversal of the silenced state restores editing efficiency. These results provide new insights to improve Cas9-mediated editing in human and other mammalian cells.
A Draft Sequence of the Neandertal Genome

PubMed Central

Green, Richard E.; Li, Heng; Zhai, Weiwei; Fritz, Markus Hsi-Yang; Hansen, Nancy F.; Durand, Eric Y.; Malaspinas, Anna-Sapfo; Jensen, Jeffrey D.; Marques-Bonet, Tomas; Alkan, Can; Prüfer, Kay; Meyer, Matthias; Burbano, Hernán A.; Good, Jeffrey M.; Schultz, Rigo; Aximu-Petri, Ayinuer; Butthof, Anne; Höber, Barbara; Höffner, Barbara; Siegemund, Madlen; Weihmann, Antje; Nusbaum, Chad; Lander, Eric S.; Russ, Carsten; Novod, Nathaniel; Affourtit, Jason; Egholm, Michael; Verna, Christine; Rudan, Pavao; Brajkovic, Dejana; Kucan, Željko; Gušic, Ivan; Doronichev, Vladimir B.; Golovanova, Liubov V.; Lalueza-Fox, Carles; de la Rasilla, Marco; Fortea, Javier; Rosas, Antonio; Schmitz, Ralf W.; Johnson, Philip L. F.; Eichler, Evan E.; Falush, Daniel; Birney, Ewan; Mullikin, James C.; Slatkin, Montgomery; Nielsen, Rasmus; Kelso, Janet; Lachmann, Michael; Reich, David; Pääbo, Svante

2016-01-01

Neandertals, the closest evolutionary relatives of present-day humans, lived in large parts of Europe and western Asia before disappearing 30,000 years ago. We present a draft sequence of the Neandertal genome composed of more than 4 billion nucleotides from three individuals. Comparisons of the Neandertal genome to the genomes of five present-day humans from different parts of the world identify a number of genomic regions that may have been affected by positive selection in ancestral modern humans, including genes involved in metabolism and in cognitive and skeletal development. We show that Neandertals shared more genetic variants with present-day humans in Eurasia than with present-day humans in sub-Saharan Africa, suggesting that gene flow from Neandertals into the ancestors of non-Africans occurred before the divergence of Eurasian groups from each other. PMID:20448178
Retroelements and their impact on genome evolution and functioning.

PubMed

Gogvadze, Elena; Buzdin, Anton

2009-12-01

Retroelements comprise a considerable fraction of eukaryotic genomes. Since their initial discovery by Barbara McClintock in maize DNA, retroelements have been found in the genomes of almost all organisms. First regarded as "junk DNA" or genomic parasites, they were later shown to influence genome functioning and to promote genetic innovations. For this reason, they have been suggested to be an important creative force in genome evolution and in the adaptation of an organism to altered environmental conditions. In this review, we summarize up-to-date knowledge of the different ways in which retroelements are involved in the structural and functional evolution of genes and genomes, as well as the mechanisms generated by cells to control their retrotransposition.
RPAN: rice pan-genome browser for ∼3000 rice genomes.

PubMed

Sun, Chen; Hu, Zhiqiang; Zheng, Tianqing; Lu, Kuangchen; Zhao, Yue; Wang, Wensheng; Shi, Jianxin; Wang, Chunchao; Lu, Jinyuan; Zhang, Dabing; Li, Zhikang; Wei, Chaochun

2017-01-25

A pan-genome is the union of the gene sets of all the individuals of a clade or a species and it provides a new dimension of genome complexity with the presence/absence variations (PAVs) of genes among these genomes. With the progress of sequencing technologies, pan-genome study is becoming affordable for eukaryotes with large-sized genomes. The Asian cultivated rice, Oryza sativa L., is one of the major food sources for the world and a model organism in plant biology. Recently, the 3000 Rice Genome Project (3K RGP) sequenced more than 3000 rice genomes with a mean sequencing depth of 14.3×, which provided a tremendous resource for rice research. In this paper, we present a genome browser, Rice Pan-genome Browser (RPAN), as a tool to search and visualize the rice pan-genome derived from 3K RGP. RPAN contains a database of the basic information of 3010 rice accessions, including genomic sequences, gene annotations, PAV information and gene expression data of the rice pan-genome. At least 12 000 novel genes absent in the reference genome were included. RPAN also provides multiple search and visualization functions. RPAN can be a rich resource for rice biology and rice breeding. It is available at http://cgm.sjtu.edu.cn/3kricedb/ or http://www.rmbreeding.cn/pan3k.
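The pan-genome definition in the record above (the union of per-individual gene sets, with presence/absence variation across genomes) can be made concrete with a short sketch. The Python below is a toy illustration under stated assumptions, not RPAN's implementation: the accession names, gene identifiers, and the function name `pan_genome_summary` are all hypothetical, and real PAV calling from sequencing data involves far more than set operations.

```python
def pan_genome_summary(gene_sets):
    """Summarize a toy pan-genome.

    gene_sets: dict mapping an accession name to the set of gene IDs
    detected in that accession (hypothetical input format).
    Returns (pan, core, pav), where pan is the union of all gene sets,
    core is the set of genes present in every accession, and pav maps each
    gene to a 0/1 presence vector ordered by sorted accession name.
    """
    accessions = sorted(gene_sets)
    pan = set().union(*gene_sets.values())
    core = set.intersection(*(set(genes) for genes in gene_sets.values()))
    pav = {
        gene: [int(gene in gene_sets[acc]) for acc in accessions]
        for gene in sorted(pan)
    }
    return pan, core, pav


if __name__ == "__main__":
    # Made-up example data: three accessions, four genes.
    toy = {
        "acc1": {"g1", "g2", "g3"},
        "acc2": {"g1", "g3", "g4"},
        "acc3": {"g1", "g2"},
    }
    pan, core, pav = pan_genome_summary(toy)
    print(sorted(pan), sorted(core))
    for gene, row in pav.items():
        print(gene, row)
```

Genes in `pan` but not in `core` are the ones showing presence/absence variation; in this toy input only `g1` is shared by every accession.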
Evolution of four gene families with patchy phylogenetic distributions: influx of genes into protist genomes

PubMed Central

Andersson, Jan O; Hirt, Robert P; Foster, Peter G; Roger, Andrew J

2006-01-01

Background: Lateral gene transfer (LGT) in eukaryotes from non-organellar sources is a controversial subject in need of further study. Here we present gene distribution and phylogenetic analyses of the genes encoding the hybrid-cluster protein, A-type flavoprotein, glucosamine-6-phosphate isomerase, and alcohol dehydrogenase E. These four genes have a limited distribution among sequenced prokaryotic and eukaryotic genomes and were previously implicated in gene transfer events affecting eukaryotes. If our previous contention that these genes were introduced by LGT independently into the diplomonad and Entamoeba lineages were true, we expect that the number of putative transfers and the phylogenetic signal supporting LGT should be stable or increase, rather than decrease, when novel eukaryotic and prokaryotic homologs are added to the analyses. Results: The addition of homologs from phagotrophic protists, including several Entamoeba species, the pelobiont Mastigamoeba balamuthi, and the parabasalid Trichomonas vaginalis, and a large quantity of sequences from genome projects resulted in an apparent increase in the number of putative transfer events affecting all three domains of life. Some of the eukaryotic transfers affect a wide range of protists, such as three divergent lineages of Amoebozoa, represented by Entamoeba, Mastigamoeba, and Dictyostelium, while other transfers only affect a limited diversity, for example only the Entamoeba lineage. These observations are consistent with a model where these genes have been introduced into protist genomes independently from various sources over a long evolutionary time. Conclusion: Phylogenetic analyses of the updated datasets using more sophisticated phylogenetic methods, in combination with the gene distribution analyses, strengthened, rather than weakened, the support for LGT as an important mechanism affecting the evolution of these gene families. Thus, gene transfer seems to be an on-going evolutionary mechanism by
Expanding the eukaryotic genetic code

DOEpatents

Chin, Jason W.; Cropp, T. Ashton; Anderson, J. Christopher; Schultz, Peter G.

2013-01-22

This invention provides compositions and methods for producing translational components that expand the number of genetically encoded amino acids in eukaryotic cells. The components include orthogonal tRNAs, orthogonal aminoacyl-tRNA synthetases, orthogonal pairs of tRNAs/synthetases and unnatural amino acids. Proteins and methods of producing proteins with unnatural amino acids in eukaryotic cells are also provided.
Expanding the eukaryotic genetic code

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chin, Jason W.; Cropp, T. Ashton; Anderson, J. Christopher

This invention provides compositions and methods for producing translational components that expand the number of genetically encoded amino acids in eukaryotic cells. The components include orthogonal tRNAs, orthogonal aminoacyl-tRNA synthetases, orthogonal pairs of tRNAs/synthetases and unnatural amino acids. Proteins and methods of producing proteins with unnatural amino acids in eukaryotic cells are also provided.
Expanding the eukaryotic genetic code

DOEpatents

Chin, Jason W [Cambridge, GB]; Cropp, T Ashton [Bethesda, MD]; Anderson, J Christopher [San Francisco, CA]; Schultz, Peter G [La Jolla, CA]

2009-10-27

This invention provides compositions and methods for producing translational components that expand the number of genetically encoded amino acids in eukaryotic cells. The components include orthogonal tRNAs, orthogonal aminoacyl-tRNA synthetases, orthogonal pairs of tRNAs/synthetases and unnatural amino acids. Proteins and methods of producing proteins with unnatural amino acids in eukaryotic cells are also provided.
Expanding the eukaryotic genetic code

DOEpatents

Chin, Jason W; Cropp, T. Ashton; Anderson, J. Christopher; Schultz, Peter G

2015-02-03

This invention provides compositions and methods for producing translational components that expand the number of genetically encoded amino acids in eukaryotic cells. The components include orthogonal tRNAs, orthogonal aminoacyl-tRNA synthetases, orthogonal pairs of tRNAs/synthetases and unnatural amino acids. Proteins and methods of producing proteins with unnatural amino acids in eukaryotic cells are also provided.

Expanding the eukaryotic genetic code

DOEpatents

Chin, Jason W [Cambridge, GB]; Cropp, T Ashton [Bethesda, MD]; Anderson, J Christopher [San Francisco, CA]; Schultz, Peter G [La Jolla, CA]

2009-12-01

This invention provides compositions and methods for producing translational components that expand the number of genetically encoded amino acids in eukaryotic cells. The components include orthogonal tRNAs, orthogonal aminoacyl-tRNA synthetases, orthogonal pairs of tRNAs/synthetases and unnatural amino acids. Proteins and methods of producing proteins with unnatural amino acids in eukaryotic cells are also provided.
Expanding the eukaryotic genetic code

DOEpatents

Chin, Jason W [Cambridge, GB]; Cropp, T Ashton [Bethesda, MD]; Anderson, J Christopher [San Francisco, CA]; Schultz, Peter G [La Jolla, CA]

2012-02-14

This invention provides compositions and methods for producing translational components that expand the number of genetically encoded amino acids in eukaryotic cells. The components include orthogonal tRNAs, orthogonal aminoacyl-tRNA synthetases, orthogonal pairs of tRNAs/synthetases and unnatural amino acids. Proteins and methods of producing proteins with unnatural amino acids in eukaryotic cells are also provided.
Expanding the eukaryotic genetic code

DOEpatents

Chin, Jason W [Cambridge, GB]; Cropp, T Ashton [Bethesda, MD]; Anderson, J Christopher [San Francisco, CA]; Schultz, Peter G [La Jolla, CA]

2009-11-17

This invention provides compositions and methods for producing translational components that expand the number of genetically encoded amino acids in eukaryotic cells. The components include orthogonal tRNAs, orthogonal aminoacyl-tRNA synthetases, orthogonal pairs of tRNAs/synthetases and unnatural amino acids. Proteins and methods of producing proteins with unnatural amino acids in eukaryotic cells are also provided.
Expanding the eukaryotic genetic code

DOEpatents

Chin, Jason W.; Cropp, T. Ashton; Anderson, J. Christopher; Schultz, Peter G.

2010-09-14

This invention provides compositions and methods for producing translational components that expand the number of genetically encoded amino acids in eukaryotic cells. The components include orthogonal tRNAs, orthogonal aminoacyl-tRNA synthetases, orthogonal pairs of tRNAs/synthetases and unnatural amino acids. Proteins and methods of producing proteins with unnatural amino acids in eukaryotic cells are also provided.
Expanding the eukaryotic genetic code

DOEpatents

Chin, Jason W [Cambridge, GB]; Cropp, T Ashton [Bethesda, MD]; Anderson, J Christopher [San Francisco, CA]; Schultz, Peter G [La Jolla, CA]

2012-05-08

This invention provides compositions and methods for producing translational components that expand the number of genetically encoded amino acids in eukaryotic cells. The components include orthogonal tRNAs, orthogonal aminoacyl-tRNA synthetases, orthogonal pairs of tRNAs/synthetases and unnatural amino acids. Proteins and methods of producing proteins with unnatural amino acids in eukaryotic cells are also provided.
Extensive gene content variation in the Brachypodium distachyon pan-genome correlates with population structure

DOE PAGES

Gordon, Sean P.; Contreras-Moreira, Bruno; Woods, Daniel P.; ...

2017-12-19

While prokaryotic pan-genomes have been shown to contain many more genes than any individual organism, the prevalence and functional significance of differentially present genes in eukaryotes remains poorly understood. Whole-genome de novo assembly and annotation of 54 lines of the grass Brachypodium distachyon yield a pan-genome containing nearly twice the number of genes found in any individual genome. Genes present in all lines are enriched for essential biological functions, while genes present in only some lines are enriched for conditionally beneficial functions (e.g., defense and development), display faster evolutionary rates, lie closer to transposable elements and are less likely to be syntenic with orthologous genes in other grasses. Our data suggest that differentially present genes contribute substantially to phenotypic variation within a eukaryote species, that these genes have a major influence in population genetics, and that transposable elements play a key role in pan-genome evolution.
Extensive gene content variation in the Brachypodium distachyon pan-genome correlates with population structure

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gordon, Sean P.; Contreras-Moreira, Bruno; Woods, Daniel P.

While prokaryotic pan-genomes have been shown to contain many more genes than any individual organism, the prevalence and functional significance of differentially present genes in eukaryotes remains poorly understood. Whole-genome de novo assembly and annotation of 54 lines of the grass Brachypodium distachyon yield a pan-genome containing nearly twice the number of genes found in any individual genome. Genes present in all lines are enriched for essential biological functions, while genes present in only some lines are enriched for conditionally beneficial functions (e.g., defense and development), display faster evolutionary rates, lie closer to transposable elements and are less likely to be syntenic with orthologous genes in other grasses. Our data suggest that differentially present genes contribute substantially to phenotypic variation within a eukaryote species, that these genes have a major influence in population genetics, and that transposable elements play a key role in pan-genome evolution.
Multicellularity arose several times in the evolution of eukaryotes (response to DOI 10.1002/bies.201100187).

PubMed

Parfrey, Laura Wegener; Lahr, Daniel J G

2013-04-01

The cellular slime mold Dictyostelium has cell-cell connections similar in structure, function, and underlying molecular mechanisms to animal epithelial cells. These similarities form the basis for the proposal that multicellularity is ancestral to the clade containing animals, fungi, and Amoebozoa (including Dictyostelium): Amorphea (formerly "unikonts"). This hypothesis is intriguing and if true could precipitate a paradigm shift. However, phylogenetic analyses of two key genes reveal patterns inconsistent with a single origin of multicellularity. A single origin in Amorphea would also require loss of multicellularity in each of the many unicellular lineages within this clade. Further, there are numerous other origins of multicellularity within eukaryotes, including three within Amorphea, that are not characterized by these structural and mechanistic similarities. Instead, convergent evolution resulting from similar selective pressures for forming multicellular structures with motile and differentiated cells is the most likely explanation for the observed similarities between animal and dictyostelid cell-cell connections.
Genome-wide association study of red blood cell traits in Hispanics/Latinos: The Hispanic Community Health Study/Study of Latinos

PubMed Central

Morrison, Jean V.; Brown, Lisa; Schurmann, Claudia; Chen, Diane D.; Liu, Yong Mei; Auer, Paul L.; Taylor, Kent D.; Papanicolaou, George; Kurita, Ryo; Nakamura, Yukio; Loos, Ruth J. F.; North, Kari E.; Thornton, Timothy A.; Pankratz, Nathan; Bauer, Daniel E.

2017-01-01

Prior GWAS have identified loci associated with red blood cell (RBC) traits in populations of European, African, and Asian ancestry. These studies have not included individuals with an Amerindian ancestral background, such as Hispanics/Latinos, nor evaluated the full spectrum of genomic variation beyond single nucleotide variants. Using a custom genotyping array enriched for Amerindian ancestral content and 1000 Genomes imputation, we performed GWAS in 12,502 participants of the Hispanic Community Health Study/Study of Latinos (HCHS/SOL) for hematocrit, hemoglobin, RBC count, RBC distribution width (RDW), and RBC indices. Approximately 60% of previously reported RBC trait loci generalized to HCHS/SOL Hispanics/Latinos, including African ancestral alpha- and beta-globin gene variants. In addition to the known 3.8 kb alpha-globin copy number variant, we identified an Amerindian ancestral association in an alpha-globin regulatory region on chromosome 16p13.3 for mean corpuscular volume and mean corpuscular hemoglobin. We also discovered and replicated three genome-wide significant variants in previously unreported loci for RDW (SLC12A2 rs17764730, PSMB5 rs941718) and hematocrit (PROX1 rs3754140). Among the proxy variants at the SLC12A2 locus we identified rs3812049, located in a bi-directional promoter between SLC12A2 (which encodes a red cell membrane ion-transport protein) and an upstream anti-sense long-noncoding RNA, LINC01184, as the likely causal variant. We further demonstrate that disruption of the regulatory element harboring rs3812049 affects transcription of SLC12A2 and LINC01184 in human erythroid progenitor cells. Together, these results reinforce the importance of genetic study of diverse ancestral populations, in particular Hispanics/Latinos. PMID:28453575
Living Organisms Author Their Read-Write Genomes in Evolution

PubMed Central

2017-01-01

Evolutionary variations generating phenotypic adaptations and novel taxa resulted from complex cellular activities altering genome content and expression: (i) Symbiogenetic cell mergers producing the mitochondrion-bearing ancestor of eukaryotes and chloroplast-bearing ancestors of photosynthetic eukaryotes; (ii) interspecific hybridizations and genome doublings generating new species and adaptive radiations of higher plants and animals; and, (iii) interspecific horizontal DNA transfer encoding virtually all of the cellular functions between organisms and their viruses in all domains of life. Consequently, assuming that evolutionary processes occur in isolated genomes of individual species has become an unrealistic abstraction. Adaptive variations also involved natural genetic engineering of mobile DNA elements to rewire regulatory networks. In the most highly evolved organisms, biological complexity scales with “non-coding” DNA content more closely than with protein-coding capacity. Coincidentally, we have learned how so-called “non-coding” RNAs that are rich in repetitive mobile DNA sequences are key regulators of complex phenotypes. Both biotic and abiotic ecological challenges serve as triggers for episodes of elevated genome change. The intersections of cell activities, biosphere interactions, horizontal DNA transfers, and non-random Read-Write genome modifications by natural genetic engineering provide a rich molecular and biological foundation for understanding how ecological disruptions can stimulate productive, often abrupt, evolutionary transformations. PMID:29211049
Living Organisms Author Their Read-Write Genomes in Evolution.

PubMed

Shapiro, James A

2017-12-06

Evolutionary variations generating phenotypic adaptations and novel taxa resulted from complex cellular activities altering genome content and expression: (i) Symbiogenetic cell mergers producing the mitochondrion-bearing ancestor of eukaryotes and chloroplast-bearing ancestors of photosynthetic eukaryotes; (ii) interspecific hybridizations and genome doublings generating new species and adaptive radiations of higher plants and animals; and, (iii) interspecific horizontal DNA transfer encoding virtually all of the cellular functions between organisms and their viruses in all domains of life. Consequently, assuming that evolutionary processes occur in isolated genomes of individual species has become an unrealistic abstraction. Adaptive variations also involved natural genetic engineering of mobile DNA elements to rewire regulatory networks. In the most highly evolved organisms, biological complexity scales with "non-coding" DNA content more closely than with protein-coding capacity. Coincidentally, we have learned how so-called "non-coding" RNAs that are rich in repetitive mobile DNA sequences are key regulators of complex phenotypes. Both biotic and abiotic ecological challenges serve as triggers for episodes of elevated genome change. The intersections of cell activities, biosphere interactions, horizontal DNA transfers, and non-random Read-Write genome modifications by natural genetic engineering provide a rich molecular and biological foundation for understanding how ecological disruptions can stimulate productive, often abrupt, evolutionary transformations.
The chloroplast genome sequence of the green alga Leptosira terrestris: multiple losses of the inverted repeat and extensive genome rearrangements within the Trebouxiophyceae

PubMed Central

de Cambiaire, Jean-Charles; Otis, Christian; Turmel, Monique; Lemieux, Claude

2007-01-01

Background: In the Chlorophyta – the green algal phylum comprising the classes Prasinophyceae, Ulvophyceae, Trebouxiophyceae and Chlorophyceae – the chloroplast genome displays a highly variable architecture. While chlorophycean chloroplast DNAs (cpDNAs) deviate considerably from the ancestral pattern described for the prasinophyte Nephroselmis olivacea, the degree of remodelling sustained by the two ulvophyte cpDNAs completely sequenced to date is intermediate relative to those observed for chlorophycean and trebouxiophyte cpDNAs. Chlorella vulgaris (Chlorellales) is currently the only photosynthetic trebouxiophyte whose complete cpDNA sequence has been reported. To gain insights into the evolutionary trends of the chloroplast genome in the Trebouxiophyceae, we sequenced cpDNA from the filamentous alga Leptosira terrestris (Ctenocladales). Results: The 195,081-bp Leptosira chloroplast genome resembles the 150,613-bp Chlorella genome in lacking a large inverted repeat (IR) but differs greatly in gene order. Six of the conserved genes present in Chlorella cpDNA are missing from the Leptosira gene repertoire. The 106 conserved genes, four introns and 11 free standing open reading frames (ORFs) account for 48.3% of the genome sequence. This is the lowest gene density yet observed among chlorophyte cpDNAs. Contrary to the situation in Chlorella but similar to that in the chlorophycean Scenedesmus obliquus, the gene distribution is highly biased over the two DNA strands in Leptosira. Nine genes, compared to only three in Chlorella, have significantly expanded coding regions relative to their homologues in ancestral-type green algal cpDNAs. As observed in chlorophycean genomes, the rpoB gene is fragmented into two ORFs. Short repeats account for 5.1% of the Leptosira genome sequence and are present mainly in intergenic regions. Conclusion: Our results highlight the great plasticity of the chloroplast genome in the Trebouxiophyceae and indicate that the IR was lost on at
Six Subgroups and Extensive Recent Duplications Characterize the Evolution of the Eukaryotic Tubulin Protein Family

PubMed Central

Findeisen, Peggy; Mühlhausen, Stefanie; Dempewolf, Silke; Hertzog, Jonny; Zietlow, Alexander; Carlomagno, Teresa; Kollmar, Martin

2014-01-01

Tubulins belong to the most abundant proteins in eukaryotes providing the backbone for many cellular substructures like the mitotic and meiotic spindles, the intracellular cytoskeletal network, and the axonemes of cilia and flagella. Homologs have even been reported for archaea and bacteria. However, a taxonomically broad and whole-genome-based analysis of the tubulin protein family has never been performed, and thus, the number of subfamilies, their taxonomic distribution, and the exact grouping of the supposed archaeal and bacterial homologs are unknown. Here, we present the analysis of 3,524 tubulins from 504 species. The tubulins formed six major subfamilies, α to ζ. Species of all major kingdoms of the eukaryotes encode members of these subfamilies implying that they must have already been present in the last common eukaryotic ancestor. The proposed archaeal homologs grouped together with the bacterial TubZ proteins as sister clade to the FtsZ proteins indicating that tubulins are unique to eukaryotes. Most species contained α- and/or β-tubulin gene duplicates resulting from recent branch- and species-specific duplication events. This shows that tubulins cannot be used for constructing species phylogenies without resolving their ortholog–paralog relationships. The many gene duplicates and also the independent loss of the δ-, ε-, or ζ-tubulins, which have been shown to be part of the triplet microtubules in basal bodies, suggest that tubulins can functionally substitute each other. PMID:25169981
Paleobiological perspectives on early eukaryotic evolution.

PubMed

Knoll, Andrew H

2014-01-01

Eukaryotic organisms radiated in Proterozoic oceans with oxygenated surface waters, but, commonly, anoxia at depth. Exceptionally preserved fossils of red algae favor crown group emergence more than 1200 million years ago, but older (up to 1600-1800 million years) microfossils could record stem group eukaryotes. Major eukaryotic diversification ~800 million years ago is documented by the increase in the taxonomic richness of complex, organic-walled microfossils, including simple coenocytic and multicellular forms, as well as widespread tests comparable to those of extant testate amoebae and simple foraminiferans and diverse scales comparable to organic and siliceous scales formed today by protists in several clades. Mid-Neoproterozoic establishment or expansion of eukaryophagy provides a possible mechanism for accelerating eukaryotic diversification long after the origin of the domain. Protists continued to diversify along with animals in the more pervasively oxygenated oceans of the Phanerozoic Eon.
Positional orthology: putting genomic evolutionary relationships into context.

PubMed

Dewey, Colin N

2011-09-01

Orthology is a powerful refinement of homology that allows us to describe more precisely the evolution of genomes and understand the function of the genes they contain. However, because orthology is not concerned with genomic position, it is limited in its ability to describe genes that are likely to have equivalent roles in different genomes. Because of this limitation, the concept of 'positional orthology' has emerged, which describes the relation between orthologous genes that retain their ancestral genomic positions. In this review, we formally define this concept, for which we introduce the shorter term 'toporthology', with respect to the evolutionary events experienced by a gene's ancestors. Through a discussion of recent studies on the role of genomic context in gene evolution, we show that the distinction between orthology and toporthology is biologically significant. We then review a number of orthology prediction methods that take genomic context into account and thus that may be used to infer the important relation of toporthology.
Positional orthology: putting genomic evolutionary relationships into context

PubMed Central

2011-01-01

Orthology is a powerful refinement of homology that allows us to describe more precisely the evolution of genomes and understand the function of the genes they contain. However, because orthology is not concerned with genomic position, it is limited in its ability to describe genes that are likely to have equivalent roles in different genomes. Because of this limitation, the concept of ‘positional orthology’ has emerged, which describes the relation between orthologous genes that retain their ancestral genomic positions. In this review, we formally define this concept, for which we introduce the shorter term ‘toporthology’, with respect to the evolutionary events experienced by a gene’s ancestors. Through a discussion of recent studies on the role of genomic context in gene evolution, we show that the distinction between orthology and toporthology is biologically significant. We then review a number of orthology prediction methods that take genomic context into account and thus that may be used to infer the important relation of toporthology. PMID:21705766
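The idea of taking genomic context into account can be illustrated with a toy example. The sketch below is not a method from the review and does not implement its formal definition of toporthology; it is a simple neighborhood heuristic, assuming that gene order in each genome is available as a list and that a one-to-one ortholog mapping has already been inferred by some other means. The function name `conserved_neighbors`, the `window` parameter and the gene labels are all hypothetical.

```python
def conserved_neighbors(order_a, order_b, orthologs, window=1):
    """Return ortholog pairs whose flanking genes are also orthologous.

    order_a, order_b: lists of gene identifiers in chromosomal order.
    orthologs: dict mapping genes in genome A to their orthologs in genome B.
    window: number of flanking genes inspected on each side (an assumption).
    """
    position_b = {gene: index for index, gene in enumerate(order_b)}
    supported = []
    for i, gene_a in enumerate(order_a):
        gene_b = orthologs.get(gene_a)
        if gene_b is None or gene_b not in position_b:
            continue
        j = position_b[gene_b]
        # Flanking genes on both sides, excluding the gene itself.
        flank_a = order_a[max(0, i - window):i] + order_a[i + 1:i + 1 + window]
        flank_b = set(order_b[max(0, j - window):j] + order_b[j + 1:j + 1 + window])
        # Keep the pair if at least one neighbor in A has its ortholog next to gene_b.
        if any(orthologs.get(neighbor) in flank_b for neighbor in flank_a):
            supported.append((gene_a, gene_b))
    return supported


# Hypothetical gene orders: b3 has moved away from its ancestral neighbors.
genome_a = ["a1", "a2", "a3", "a4"]
genome_b = ["b3", "y", "b1", "b2", "b4"]
mapping = {"a1": "b1", "a2": "b2", "a3": "b3", "a4": "b4"}
print(conserved_neighbors(genome_a, genome_b, mapping))
```

Pairs that keep at least one orthologous neighbor are retained as candidates for positional conservation; pairs without such support are only flagged as lacking it here, which is weaker than showing that the gene actually moved.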
Ancestral sequence reconstruction in primate mitochondrial DNA: compositional bias and effect on functional inference.

PubMed

Krishnan, Neeraja M; Seligmann, Hervé; Stewart, Caro-Beth; De Koning, A P Jason; Pollock, David D

2004-10-01

Reconstruction of ancestral DNA and amino acid sequences is an important means of inferring information about past evolutionary events. Such reconstructions suggest changes in molecular function and evolutionary processes over the course of evolution and are used to infer adaptation and convergence. Maximum likelihood (ML) is generally thought to provide relatively accurate reconstructed sequences compared to parsimony, but both methods lead to the inference of multiple directional changes in nucleotide frequencies in primate mitochondrial DNA (mtDNA). To better understand this surprising result, as well as to better understand how parsimony and ML differ, we constructed a series of computationally simple "conditional pathway" methods that differed in the number of substitutions allowed per site along each branch, and we also evaluated the entire Bayesian posterior frequency distribution of reconstructed ancestral states. We analyzed primate mitochondrial cytochrome b (Cyt-b) and cytochrome oxidase subunit I (COI) genes and found that ML reconstructs ancestral frequencies that are often more different from tip sequences than are parsimony reconstructions. In contrast, frequency reconstructions based on the posterior ensemble more closely resemble extant nucleotide frequencies. Simulations indicate that these differences in ancestral sequence inference are probably due to deterministic bias caused by high uncertainty in the optimization-based ancestral reconstruction methods (parsimony, ML, Bayesian maximum a posteriori). In contrast, ancestral nucleotide frequencies based on an average of the Bayesian set of credible ancestral sequences are much less biased. The methods involving simpler conditional pathway calculations have slightly reduced likelihood values compared to full likelihood calculations, but they can provide fairly unbiased nucleotide reconstructions and may be useful in more complex phylogenetic analyses than considered here due to their speed and
Conservation of coevolving protein interfaces bridges prokaryote–eukaryote homologies in the twilight zone

PubMed Central

Rodriguez-Rivas, Juan; Marsili, Simone; Juan, David; Valencia, Alfonso

2016-01-01

Protein–protein interactions are fundamental for the proper functioning of the cell. As a result, protein interaction surfaces are subject to strong evolutionary constraints. Recent developments have shown that residue coevolution provides accurate predictions of heterodimeric protein interfaces from sequence information. So far these approaches have been limited to the analysis of families of prokaryotic complexes for which large multiple sequence alignments of homologous sequences can be compiled. We explore the hypothesis that coevolution points to structurally conserved contacts at protein–protein interfaces, which can be reliably projected to homologous complexes with distantly related sequences. We introduce a domain-centered protocol to study the interplay between residue coevolution and structural conservation of protein–protein interfaces. We show that sequence-based coevolutionary analysis systematically identifies residue contacts at prokaryotic interfaces that are structurally conserved at the interface of their eukaryotic counterparts. In turn, this allows the prediction of conserved contacts at eukaryotic protein–protein interfaces with high confidence using solely mutational patterns extracted from prokaryotic genomes. Even in the context of high divergence in sequence (the twilight zone), where standard homology modeling of protein complexes is unreliable, our approach provides sequence-based accurate information about specific details of protein interactions at the residue level. Selected examples of the application of prokaryotic coevolutionary analysis to the prediction of eukaryotic interfaces further illustrate the potential of this approach. PMID:27965389
Conservation of coevolving protein interfaces bridges prokaryote-eukaryote homologies in the twilight zone.

PubMed

Rodriguez-Rivas, Juan; Marsili, Simone; Juan, David; Valencia, Alfonso

2016-12-27

Protein-protein interactions are fundamental for the proper functioning of the cell. As a result, protein interaction surfaces are subject to strong evolutionary constraints. Recent developments have shown that residue coevolution provides accurate predictions of heterodimeric protein interfaces from sequence information. So far these approaches have been limited to the analysis of families of prokaryotic complexes for which large multiple sequence alignments of homologous sequences can be compiled. We explore the hypothesis that coevolution points to structurally conserved contacts at protein-protein interfaces, which can be reliably projected to homologous complexes with distantly related sequences. We introduce a domain-centered protocol to study the interplay between residue coevolution and structural conservation of protein-protein interfaces. We show that sequence-based coevolutionary analysis systematically identifies residue contacts at prokaryotic interfaces that are structurally conserved at the interface of their eukaryotic counterparts. In turn, this allows the prediction of conserved contacts at eukaryotic protein-protein interfaces with high confidence using solely mutational patterns extracted from prokaryotic genomes. Even in the context of high divergence in sequence (the twilight zone), where standard homology modeling of protein complexes is unreliable, our approach provides sequence-based accurate information about specific details of protein interactions at the residue level. Selected examples of the application of prokaryotic coevolutionary analysis to the prediction of eukaryotic interfaces further illustrate the potential of this approach.
Ancestry, admixture and fitness in Colombian genomes

PubMed Central

Rishishwar, Lavanya; Conley, Andrew B.; Wigington, Charles H.; Wang, Lu; Valderrama-Aguirre, Augusto; King Jordan, I.

2015-01-01

The human dimension of the Columbian Exchange entailed substantial genetic admixture between ancestral source populations from Africa, the Americas and Europe, which had evolved separately for many thousands of years. We sought to address the implications of the creation of admixed American genomes, containing novel allelic combinations, for human health and fitness via analysis of an admixed Colombian population from Medellin. Colombian genomes from Medellin show a wide range of three-way admixture contributions from ancestral source populations. The primary ancestry component for the population is European (average = 74.6%, range = 45.0%–96.7%), followed by Native American (average = 18.1%, range = 2.1%–33.3%) and African (average = 7.3%, range = 0.2%–38.6%). Locus-specific patterns of ancestry were evaluated to search for genomic regions that are enriched across the population for particular ancestry contributions. Adaptive and innate immune system related genes and pathways are particularly over-represented among ancestry-enriched segments, including genes (HLA-B and MAPK10) that are involved in defense against endemic pathogens such as malaria. Genes that encode functions related to skin pigmentation (SCL4A5) and cutaneous glands (EDAR) are also found in regions with anomalous ancestry patterns. These results suggest the possibility that ancestry-specific loci were differentially retained in the modern admixed Colombian population based on their utility in the New World environment. PMID:26197429

Comparative genomic survey of microbial arylamine N-acetyltransferases

USDA-ARS's Scientific Manuscript database

Introduction: Microorganisms are constantly exposed to exogenous chemical influences. Our previous genomic surveys have identified putative NAT genes across a phylogenetic spectrum of prokaryotic and eukaryotic microorganisms. We are currently pursuing two lines of investigation: The first looks int...
Classification and Lineage Tracing of SH2 Domains Throughout Eukaryotes.

PubMed

Liu, Bernard A

2017-01-01

Today there exists a rapidly expanding number of sequenced genomes. Cataloging protein interaction domains such as the Src Homology 2 (SH2) domain across these various genomes can be accomplished with ease due to existing algorithms and prediction models. An evolutionary analysis of SH2 domains provides a step towards understanding how SH2 proteins integrated with existing signaling networks to position phosphotyrosine signaling as a crucial driver of robust cellular communication networks in metazoans. However, organizing and tracing SH2 domains across organisms and understanding their evolutionary trajectory remain a challenge. This chapter describes several methodologies for analyzing the evolutionary trajectory of SH2 domains, including a global SH2 domain classification system, which facilitates annotation of new SH2 sequences essential for tracing the lineage of SH2 domains throughout eukaryote evolution. This classification utilizes a combination of sequence homology, protein domain architecture and the boundary positions between introns and exons within the SH2 domain or genes encoding these domains. Discrete SH2 families can then be traced across various genomes to provide insight into their origins. Furthermore, additional methods for examining potential mechanisms for divergence of SH2 domains, from structural changes to alterations in the protein domain content and genome duplication, will be discussed. A better understanding of SH2 domain evolution may therefore enhance our insight into the emergence of phosphotyrosine signaling and the expansion of protein interaction domains.
Prokaryotic ancestry of eukaryotic protein networks mediating innate immunity and apoptosis.

PubMed

Dunin-Horkawicz, Stanislaw; Kopec, Klaus O; Lupas, Andrei N

2014-04-03

Protein domains characteristic of eukaryotic innate immunity and apoptosis have many prokaryotic counterparts of unknown function. By reconstructing interactomes computationally, we found that bacterial proteins containing these domains are part of a network that also includes other domains not hitherto associated with immunity. This network is connected to the network of prokaryotic signal transduction proteins, such as histidine kinases and chemoreceptors. The network varies considerably in domain composition and degree of paralogy, even between strains of the same species, and its repetitive domains are often amplified recently, with individual repeats sharing up to 100% sequence identity. Both phenomena are evidence of considerable evolutionary pressure and thus compatible with a role in the "arms race" between host and pathogen. In order to investigate the relationship of this network to its eukaryotic counterparts, we performed a cluster analysis of organisms based on a census of its constituent domains across all fully sequenced genomes. We obtained a large central cluster of mainly unicellular organisms, from which multicellular organisms radiate out in two main directions. One is taken by multicellular bacteria, primarily cyanobacteria and actinomycetes, and plants form an extension of this direction, connected via the basal, unicellular cyanobacteria. The second main direction is taken by animals and fungi, which form separate branches with a common root in the α-proteobacteria of the central cluster. This analysis supports the notion that the innate immunity networks of eukaryotes originated from their endosymbionts and that increases in the complexity of these networks accompanied the emergence of multicellularity. Copyright © 2014 The Authors. Published by Elsevier Ltd. All rights reserved.
How MCM loading and spreading specify eukaryotic DNA replication initiation sites.

PubMed

Hyrien, Olivier

2016-01-01

DNA replication origins strikingly differ between eukaryotic species and cell types. Origins are localized and can be highly efficient in budding yeast, are randomly located in early fly and frog embryos, which do not transcribe their genomes, and are clustered in broad (10-100 kb) non-transcribed zones, frequently abutting transcribed genes, in mammalian cells. Nonetheless, in all cases, origins are established during the G1-phase of the cell cycle by the loading of double hexamers of the Mcm 2-7 proteins (MCM DHs), the core of the replicative helicase. MCM DH activation in S-phase leads to origin unwinding, polymerase recruitment, and initiation of bidirectional DNA synthesis. Although MCM DHs are initially loaded at sites defined by the binding of the origin recognition complex (ORC), they ultimately bind chromatin in much greater numbers than ORC and only a fraction are activated in any one S-phase. Data suggest that the multiplicity and functional redundancy of MCM DHs provide robustness to the replication process and affect replication time and that MCM DHs can slide along the DNA and spread over large distances around the ORC. Recent studies further show that MCM DHs are displaced along the DNA by collision with transcription complexes but remain functional for initiation after displacement. Therefore, eukaryotic DNA replication relies on intrinsically mobile and flexible origins, a strategy fundamentally different from bacteria but conserved from yeast to human. These properties of MCM DHs likely contribute to the establishment of broad, intergenic replication initiation zones in higher eukaryotes.
How MCM loading and spreading specify eukaryotic DNA replication initiation sites

PubMed Central

Hyrien, Olivier

2016-01-01

DNA replication origins strikingly differ between eukaryotic species and cell types. Origins are localized and can be highly efficient in budding yeast, are randomly located in early fly and frog embryos, which do not transcribe their genomes, and are clustered in broad (10-100 kb) non-transcribed zones, frequently abutting transcribed genes, in mammalian cells. Nonetheless, in all cases, origins are established during the G1-phase of the cell cycle by the loading of double hexamers of the Mcm 2-7 proteins (MCM DHs), the core of the replicative helicase. MCM DH activation in S-phase leads to origin unwinding, polymerase recruitment, and initiation of bidirectional DNA synthesis. Although MCM DHs are initially loaded at sites defined by the binding of the origin recognition complex (ORC), they ultimately bind chromatin in much greater numbers than ORC and only a fraction are activated in any one S-phase. Data suggest that the multiplicity and functional redundancy of MCM DHs provide robustness to the replication process and affect replication time and that MCM DHs can slide along the DNA and spread over large distances around the ORC. Recent studies further show that MCM DHs are displaced along the DNA by collision with transcription complexes but remain functional for initiation after displacement. Therefore, eukaryotic DNA replication relies on intrinsically mobile and flexible origins, a strategy fundamentally different from bacteria but conserved from yeast to human. These properties of MCM DHs likely contribute to the establishment of broad, intergenic replication initiation zones in higher eukaryotes. PMID:27635237
Snapshot of the Eukaryotic Gene Expression in Muskoxen Rumen—A Metatranscriptomic Approach

PubMed Central

O'Toole, Nicholas; Barboza, Perry S.; Ungerfeld, Emilio; Leigh, Mary Beth; Selinger, L. Brent; Butler, Greg; Tsang, Adrian; McAllister, Tim A.; Forster, Robert J.

2011-01-01

Background: Herbivores rely on digestive tract lignocellulolytic microorganisms, including bacteria, fungi and protozoa, to derive energy and carbon from plant cell wall polysaccharides. Culture independent metagenomic studies have been used to reveal the genetic content of the bacterial species within gut microbiomes. However, the nature of the genes encoded by eukaryotic protozoa and fungi within these environments has not been explored using metagenomic or metatranscriptomic approaches. Methodology/Principal Findings: In this study, a metatranscriptomic approach was used to investigate the functional diversity of the eukaryotic microorganisms within the rumen of muskoxen (Ovibos moschatus), with a focus on plant cell wall degrading enzymes. Polyadenylated RNA (mRNA) was sequenced on the Illumina Genome Analyzer II system and 2.8 gigabases of sequences were obtained and 59129 contigs assembled. Plant cell wall degrading enzyme modules including glycoside hydrolases, carbohydrate esterases and polysaccharide lyases were identified from over 2500 contigs. These included a number of glycoside hydrolase family 6 (GH6), GH48 and swollenin modules, which have rarely been described in previous gut metagenomic studies. Conclusions/Significance: The muskoxen rumen metatranscriptome demonstrates a much higher percentage of cellulase enzyme discovery and an 8.7x higher rate of total carbohydrate active enzyme discovery per gigabase of sequence than previous rumen metagenomes. This study provides a snapshot of eukaryotic gene expression in the muskoxen rumen, and identifies a number of candidate genes coding for potentially valuable lignocellulolytic enzymes. PMID:21655220
Effects of number of training generations on genomic prediction for various traits in a layer chicken population.

PubMed

Weng, Ziqing; Wolc, Anna; Shen, Xia; Fernando, Rohan L; Dekkers, Jack C M; Arango, Jesus; Settar, Petek; Fulton, Janet E; O'Sullivan, Neil P; Garrick, Dorian J

2016-03-19

Genomic estimated breeding values (GEBV) based on single nucleotide polymorphism (SNP) genotypes are widely used in animal improvement programs. It is typically assumed that the larger the number of animals is in the training set, the higher is the prediction accuracy of GEBV. The aim of this study was to quantify genomic prediction accuracy depending on the number of ancestral generations included in the training set, and to determine the optimal number of training generations for different traits in an elite layer breeding line. Phenotypic records for 16 traits on 17,793 birds were used. All parents and some selection candidates from nine non-overlapping generations were genotyped for 23,098 segregating SNPs. An animal model with pedigree relationships (PBLUP) and the BayesB genomic prediction model were applied to predict EBV or GEBV at each validation generation (progeny of the most recent training generation) based on varying numbers of immediately preceding ancestral generations. Prediction accuracy of EBV or GEBV was assessed as the correlation between EBV and phenotypes adjusted for fixed effects, divided by the square root of trait heritability. The optimal number of training generations that resulted in the greatest prediction accuracy of GEBV was determined for each trait. The relationship between optimal number of training generations and heritability was investigated. On average, accuracies were higher with the BayesB model than with PBLUP. Prediction accuracies of GEBV increased as the number of closely-related ancestral generations included in the training set increased, but reached an asymptote or slightly decreased when distant ancestral generations were used in the training set. The optimal number of training generations was 4 or more for high heritability traits but less than that for low heritability traits. For less heritable traits, limiting the training datasets to individuals closely related to the validation population resulted in the best
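The accuracy measure described in this abstract (the correlation between estimated breeding values and fixed-effect-adjusted phenotypes, divided by the square root of trait heritability) can be written out as a short calculation. The sketch below is only an illustration of that formula, not the authors' code; the function names and the example numbers are hypothetical, and the heritability is assumed to be known in advance.

```python
import math


def pearson_correlation(x, y):
    """Plain Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / math.sqrt(var_x * var_y)


def prediction_accuracy(breeding_values, adjusted_phenotypes, heritability):
    """Correlation of (G)EBV with adjusted phenotypes, scaled by sqrt(h^2)."""
    if not 0 < heritability <= 1:
        raise ValueError("heritability must lie in (0, 1]")
    correlation = pearson_correlation(breeding_values, adjusted_phenotypes)
    return correlation / math.sqrt(heritability)


# Made-up values for five validation birds and an assumed heritability of 0.36.
gebv = [0.2, -0.1, 0.4, 0.0, -0.3]
phenotypes = [1.1, 0.3, 0.9, 0.6, -0.2]
print(round(prediction_accuracy(gebv, phenotypes, 0.36), 3))
```

Dividing by the square root of heritability rescales the correlation with phenotypes toward a correlation with the unobserved true breeding value, which is why the same raw correlation gives a higher accuracy for a trait of lower heritability.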
Reference genome sequence of the model plant Setaria

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bennetzen, Jeffrey L; Schmutz, Jeremy; Wang, Hao

We generated a high-quality reference genome sequence for foxtail millet (Setaria italica). The ~400-Mb assembly covers ~80% of the genome and >95% of the gene space. The assembly was anchored to a 992-locus genetic map and was annotated by comparison with >1.3 million expressed sequence tag reads. We produced more than 580 million RNA-Seq reads to facilitate expression analyses. We also sequenced Setaria viridis, the ancestral wild relative of S. italica, and identified regions of differential single-nucleotide polymorphism density, distribution of transposable elements, small RNA content, chromosomal rearrangement and segregation distortion. The genus Setaria includes natural and cultivated species that demonstrate a wide capacity for adaptation. The genetic basis of this adaptation was investigated by comparing five sequenced grass genomes. We also used the diploid Setaria genome to evaluate the ongoing genome assembly of a related polyploid, switchgrass (Panicum virgatum).
Reference genome sequence of the model plant Setaria.

PubMed

Bennetzen, Jeffrey L; Schmutz, Jeremy; Wang, Hao; Percifield, Ryan; Hawkins, Jennifer; Pontaroli, Ana C; Estep, Matt; Feng, Liang; Vaughn, Justin N; Grimwood, Jane; Jenkins, Jerry; Barry, Kerrie; Lindquist, Erika; Hellsten, Uffe; Deshpande, Shweta; Wang, Xuewen; Wu, Xiaomei; Mitros, Therese; Triplett, Jimmy; Yang, Xiaohan; Ye, Chu-Yu; Mauro-Herrera, Margarita; Wang, Lin; Li, Pinghua; Sharma, Manoj; Sharma, Rita; Ronald, Pamela C; Panaud, Olivier; Kellogg, Elizabeth A; Brutnell, Thomas P; Doust, Andrew N; Tuskan, Gerald A; Rokhsar, Daniel; Devos, Katrien M

2012-05-13

We generated a high-quality reference genome sequence for foxtail millet (Setaria italica). The ∼400-Mb assembly covers ∼80% of the genome and >95% of the gene space. The assembly was anchored to a 992-locus genetic map and was annotated by comparison with >1.3 million expressed sequence tag reads. We produced more than 580 million RNA-Seq reads to facilitate expression analyses. We also sequenced Setaria viridis, the ancestral wild relative of S. italica, and identified regions of differential single-nucleotide polymorphism density, distribution of transposable elements, small RNA content, chromosomal rearrangement and segregation distortion. The genus Setaria includes natural and cultivated species that demonstrate a wide capacity for adaptation. The genetic basis of this adaptation was investigated by comparing five sequenced grass genomes. We also used the diploid Setaria genome to evaluate the ongoing genome assembly of a related polyploid, switchgrass (Panicum virgatum).
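The two approximate figures in this abstract (an assembly of roughly 400 Mb covering roughly 80% of the genome) imply a total genome size of about 500 Mb. The sketch below spells out that back-of-the-envelope arithmetic; the function name is hypothetical, and because both inputs are rounded in the abstract, the result should be read as a rough estimate rather than a reported measurement.

```python
def implied_genome_size_mb(assembly_mb, fraction_covered):
    """Estimate total genome size from assembly length and coverage fraction."""
    if not 0 < fraction_covered <= 1:
        raise ValueError("fraction_covered must lie in (0, 1]")
    return assembly_mb / fraction_covered


# Approximate values taken from the abstract: ~400 Mb assembly, ~80% coverage.
estimate = implied_genome_size_mb(400, 0.80)
print(f"Implied genome size: ~{estimate:.0f} Mb")
```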
Methods and Applications of CRISPR-Mediated Base Editing in Eukaryotic Genomes.

PubMed

Hess, Gaelen T; Tycko, Josh; Yao, David; Bassik, Michael C

2017-10-05

The past several years have seen an explosion in development of applications for the CRISPR-Cas9 system, from efficient genome editing, to high-throughput screening, to recruitment of a range of DNA and chromatin-modifying enzymes. While homology-directed repair (HDR) coupled with Cas9 nuclease cleavage has been used with great success to repair and re-write genomes, recently developed base-editing systems present a useful orthogonal strategy to engineer nucleotide substitutions. Base editing relies on recruitment of cytidine deaminases to introduce changes (rather than double-stranded breaks and donor templates) and offers potential improvements in efficiency while limiting damage and simplifying the delivery of editing machinery. At the same time, these systems enable novel mutagenesis strategies to introduce sequence diversity for engineering and discovery. Here, we review the different base-editing platforms, including their deaminase recruitment strategies and editing outcomes, and compare them to other CRISPR genome-editing technologies. Additionally, we discuss how these systems have been applied in therapeutic, engineering, and research settings. Lastly, we explore future directions of this emerging technology. Copyright © 2017 Elsevier Inc. All rights reserved.
Solution Hybrid Selection Capture for the Recovery of Functional Full-Length Eukaryotic cDNAs From Complex Environmental Samples

PubMed Central

Bragalini, Claudia; Ribière, Céline; Parisot, Nicolas; Vallon, Laurent; Prudent, Elsa; Peyretaillade, Eric; Girlanda, Mariangela; Peyret, Pierre; Marmeisse, Roland; Luis, Patricia

2014-01-01

Eukaryotic microbial communities play key functional roles in soil biology and potentially represent a rich source of natural products including biocatalysts. Culture-independent molecular methods are powerful tools to isolate functional genes from uncultured microorganisms. However, none of the methods used in environmental genomics allows for a rapid isolation of numerous functional genes from eukaryotic microbial communities. We developed an original adaptation of the solution hybrid selection (SHS) for an efficient recovery of functional complementary DNAs (cDNAs) synthesized from soil-extracted polyadenylated mRNAs. This protocol was tested on the Glycoside Hydrolase 11 gene family encoding endo-xylanases for which we designed 35 explorative 31-mer capture probes. SHS was implemented on four soil eukaryotic cDNA pools. After two successive rounds of capture, >90% of the resulting cDNAs were GH11 sequences, of which 70% (38 among 53 sequenced genes) were full length. Between 1.5 and 25% of the cloned captured sequences were expressed in Saccharomyces cerevisiae. Sequencing of polymerase chain reaction-amplified GH11 gene fragments from the captured sequences highlighted hundreds of phylogenetically diverse sequences that were not yet described in public databases. This protocol offers the possibility of performing exhaustive exploration of eukaryotic gene families within microbial communities thriving in any type of environment. PMID:25281543
Paleobiological Perspectives on Early Eukaryotic Evolution

PubMed Central

Knoll, Andrew H.

2014-01-01

Eukaryotic organisms radiated in Proterozoic oceans with oxygenated surface waters, but, commonly, anoxia at depth. Exceptionally preserved fossils of red algae favor crown group emergence more than 1200 million years ago, but older (up to 1600–1800 million years) microfossils could record stem group eukaryotes. Major eukaryotic diversification ∼800 million years ago is documented by the increase in the taxonomic richness of complex, organic-walled microfossils, including simple coenocytic and multicellular forms, as well as widespread tests comparable to those of extant testate amoebae and simple foraminiferans and diverse scales comparable to organic and siliceous scales formed today by protists in several clades. Mid-Neoproterozoic establishment or expansion of eukaryophagy provides a possible mechanism for accelerating eukaryotic diversification long after the origin of the domain. Protists continued to diversify along with animals in the more pervasively oxygenated oceans of the Phanerozoic Eon. PMID:24384569
MCM Paradox: Abundance of Eukaryotic Replicative Helicases and Genomic Integrity.

PubMed

Das, Mitali; Singh, Sunita; Pradhan, Satyajit; Narayan, Gopeshwar

2014-01-01

As a crucial component of the DNA replication licensing system, the minichromosome maintenance (MCM) 2-7 complex acts as the eukaryotic DNA replicative helicase. The six related MCM proteins form a heterohexamer and bind with ORC, CDC6, and Cdt1 to form the prereplication complex. Although the MCMs are well known as replicative helicases, their overabundance and distribution patterns on chromatin present a paradox called the "MCM paradox." Several approaches have been taken to solve the MCM paradox and describe the purpose of excess MCMs distributed beyond the replication origins. Alternative, non-helicase functions of these MCMs have also been proposed. This review focuses on several models and concepts generated to solve the MCM paradox coinciding with their helicase function and provides insight into the concept that excess MCMs are meant for licensing dormant origins as a backup during replication stress. Finally, we extend our view towards the effect of altered MCM levels. Though an excess of MCM constituents is needed for normal cells to withstand stress, there must be a delineation of the threshold level in normal and malignant cells. This review also outlines future prospects for better understanding MCM biology.
Therapeutic Genome Editing: Prospects and Challenges

PubMed Central

Cox, David Benjamin Turitz; Platt, Randall Jeffrey; Zhang, Feng

2015-01-01

Recent advances in the development of genome editing technologies based on programmable nucleases have significantly improved our ability to make precise changes in the genomes of eukaryotic cells. Genome editing is already broadening our ability to elucidate the contribution of genetics to disease by facilitating the creation of more accurate cellular and animal models of pathological processes. A particularly tantalizing application of programmable nucleases is the potential to directly correct genetic mutations in affected tissues and cells to treat diseases that are refractory to traditional therapies. Here we discuss current progress towards developing programmable nuclease-based therapies as well as future prospects and challenges. PMID:25654603
The protective function of noncoding DNA in genome defense of eukaryotic male germ cells.

PubMed

Qiu, Guo-Hua; Huang, Cuiqin; Zheng, Xintian; Yang, Xiaoyan

2018-04-01

Peripheral and abundant noncoding DNA has been hypothesized to protect the genome and the central protein-coding sequences against DNA damage in the somatic genome. In the cytosol, invading exogenous nucleic acids may first be deactivated by small RNAs encoded by noncoding DNA via mechanisms similar to the prokaryotic CRISPR-Cas system. In the nucleus, the radicals generated by radiation in the cytosol, radiation energy and invading exogenous nucleic acids are absorbed, blocked and/or reduced by peripheral heterochromatin, and damaged DNA in heterochromatin is removed and excluded from the nucleus to the cytoplasm through nuclear pore complexes. To further strengthen the hypothesis, this review summarizes the experimental evidence supporting the protective function of noncoding DNA in the genome of male germ cells. Based on these data, this review provides evidence supporting the protective role of noncoding DNA in the genome defense of the sperm genome through mechanisms similar to those of the somatic genome.
Genome expansion and gene loss in powdery mildew fungi reveal functional tradeoffs in extreme parasitism

USDA-ARS's Scientific Manuscript database

Eukaryotic genomes vary in size over five orders of magnitude ranging from microsporidia (~2.9Mb) to the lung-fish (~1.2Tb). This extraordinary variation is largely a result of the proliferation of mobile DNA elements also referred to as “genomic parasites.” The constraints on genome size may be imp...
Hypothesis: Gene-rich plastid genomes in red algae may be an outcome of nuclear genome reduction.

PubMed

Qiu, Huan; Lee, Jun Mo; Yoon, Hwan Su; Bhattacharya, Debashish

2017-06-01

Red algae (Rhodophyta) putatively diverged from the eukaryote tree of life >1.2 billion years ago and are the source of plastids in the ecologically important diatoms, haptophytes, and dinoflagellates. In general, red algae contain the largest plastid gene inventory among all such organelles derived from primary, secondary, or additional rounds of endosymbiosis. In contrast, their nuclear gene inventory is reduced when compared to their putative sister lineage, the Viridiplantae, and other photosynthetic lineages. The latter is thought to have resulted from a phase of genome reduction that occurred in the stem lineage of Rhodophyta. A recent comparative analysis of a taxonomically broad collection of red algal and Viridiplantae plastid genomes demonstrates that the red algal ancestor encoded ~1.5× more plastid genes than Viridiplantae. This difference is primarily explained by more extensive endosymbiotic gene transfer (EGT) in the stem lineage of Viridiplantae, when compared to red algae. We postulate that limited EGT in Rhodophytes resulted from the countervailing force of ancient, and likely recurrent, nuclear genome reduction. In other words, the propensity for nuclear gene loss led to the retention of red algal plastid genes that would otherwise have undergone intracellular gene transfer to the nucleus. This hypothesis recognizes the primacy of nuclear genome evolution over that of plastids, which have no inherent control of their gene inventory and can change dramatically (e.g., secondarily non-photosynthetic eukaryotes, dinoflagellates) in response to selection acting on the host lineage. © 2017 Phycological Society of America.
Evolution of Prdm Genes in Animals: Insights from Comparative Genomics

PubMed Central

Vervoort, Michel; Meulemeester, David; Béhague, Julien; Kerner, Pierre

2016-01-01

Prdm genes encode transcription factors with a subtype of SET domain known as the PRDF1-RIZ (PR) homology domain and a variable number of zinc finger motifs. These genes are involved in a wide variety of functions during animal development. As most Prdm genes have been studied in vertebrates, especially in mice, little is known about the evolution of this gene family. We searched for Prdm genes in the fully sequenced genomes of 93 different species representative of all the main metazoan lineages. A total of 976 Prdm genes were identified in these species. The number of Prdm genes per species ranges from 2 to 19. To better understand how the Prdm gene family has evolved in metazoans, we performed phylogenetic analyses using this large set of identified Prdm genes. These analyses allowed us to define 14 different subfamilies of Prdm genes and to establish, through ancestral state reconstruction, that 11 of them are ancestral to bilaterian animals. Three additional subfamilies were acquired during early vertebrate evolution (Prdm5, Prdm11, and Prdm17). Several gene duplication and gene loss events were identified and mapped onto the metazoan phylogenetic tree. By studying a large number of nonmetazoan genomes, we confirmed that Prdm genes likely constitute a metazoan-specific gene family. Our data also suggest that Prdm genes originated before the diversification of animals through the association of a single ancestral SET domain encoding gene with one or several zinc finger encoding genes. PMID:26560352
The mitochondrial genome of Frankliniella intonsa: insights into the evolution of mitochondrial genomes at lower taxonomic levels in Thysanoptera.

PubMed

Yan, Dankan; Tang, Yunxia; Hu, Min; Liu, Fengquan; Zhang, Dongfang; Fan, Jiaqin

2014-10-01

Thrips are an ideal group for studying the evolution of mitochondrial (mt) genomes at the genus and family levels because of independent rearrangements within this order. The complete mitochondrial DNA (mtDNA) sequence of the flower thrips Frankliniella intonsa was determined and annotated in this study. The circular genome is 15,215 bp in length with an A+T content of 75.9%, contains the typical 37 genes, and has triplicate putative control regions (CRs). Nucleotide composition is A+T biased, and the majority of the protein-coding genes present opposite CG skew, which is reflected in the nucleotide composition and in codon and amino acid usage. Although the known thrips have massive gene rearrangements, this genome showed no reversal of strand asymmetry. Gene rearrangements have been found at the lower taxonomic levels of thrips. Three tRNA genes were translocated in the genus Frankliniella and eight tRNA genes in the family Thripidae. Although the gene arrangements of the mt genomes of all three thrips species differ massively from that of the ancestral insect, they are all very similar to each other, indicating that there was a large rearrangement somewhere before the most recent common ancestor of these three species and very little genomic evolution or rearrangement since then. The extremely similar sequences among the CRs suggest that they are undergoing concerted evolution. Analyses of the sequences upstream and downstream of the CRs reveal that CR2 is actually the ancestral CR. The three CRs occupy the same positions in each of the three thrips mt genomes, which have identical inverted genes. These characteristics may have been inherited from the most recent common ancestor of these three thrips. The above observations suggest that the mt genomes of the three thrips retain a single massive rearrangement from the common ancestor and have low evolutionary rates among them. Copyright © 2014 Elsevier Inc. All rights reserved.
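The composition statistics quoted in this abstract (A+T content, strand skew) are simple ratios of base counts. The following is a minimal Python sketch, assuming the commonly used definitions AT skew = (A − T)/(A + T) and GC skew = (G − C)/(G + C); the sign convention, the function name `composition_stats`, and the toy sequence are illustrative assumptions and are not taken from the study.

```python
from collections import Counter

def composition_stats(seq):
    """Return (A+T fraction, AT skew, GC skew) for a DNA string.

    Ambiguous bases such as N are ignored. Skews are reported as 0.0
    when the relevant base pair is absent from the sequence.
    """
    counts = Counter(seq.upper())
    a, c, g, t = (counts[base] for base in "ACGT")
    total = a + c + g + t
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    at_content = (a + t) / total
    at_skew = (a - t) / (a + t) if (a + t) else 0.0
    gc_skew = (g - c) / (g + c) if (g + c) else 0.0
    return at_content, at_skew, gc_skew

# Hypothetical toy sequence, not F. intonsa data.
toy = "ATTAAATCGGATTTAATA"
at, at_skew, gc_skew = composition_stats(toy)
print(f"A+T = {at:.1%}, AT skew = {at_skew:+.2f}, GC skew = {gc_skew:+.2f}")
```

Applied gene by gene on the coding strand, the same function would show whether protein-coding genes on opposite strands carry skews of opposite sign.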
The ancestral selection graph under strong directional selection.

PubMed

Pokalyuk, Cornelia; Pfaffelhuber, Peter

2013-08-01

The ancestral selection graph (ASG) was introduced by Neuhauser and Krone (1997) in order to study populations of constant size which evolve under selection. Coalescence events, which occur at rate 1 for every pair of lines, lead to joint ancestry. In addition, splitting events in the ASG at rate α, the scaled selection coefficient, produce possible ancestors, such that the real ancestor depends on the ancestral alleles. Here, we use the ASG in the case without mutation in order to study fixation of a beneficial mutant. Using our main tool, a reversibility property of the ASG, we provide a new proof of the fact that a beneficial allele fixes roughly in time (2 log α)/α if α is large. Copyright © 2012 Elsevier Inc. All rights reserved.
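To get a feel for the scaling stated in this abstract, the approximate fixation time (2 log α)/α can be tabulated for a few values of α. This is a minimal Python sketch, assuming the natural logarithm and time measured in the same scaled units in which each pair of lines coalesces at rate 1; the function name `approx_fixation_time` is illustrative only.

```python
import math

def approx_fixation_time(alpha):
    """Leading-order fixation time 2*log(alpha)/alpha, meant for large alpha."""
    if alpha <= 1:
        raise ValueError("the approximation is meant for large alpha")
    return 2.0 * math.log(alpha) / alpha

# Tabulate the approximation over several orders of magnitude of alpha.
for alpha in (10, 100, 1000, 10000):
    print(f"alpha = {alpha:>6}: T ~ {approx_fixation_time(alpha):.4f}")
```

The table makes the qualitative point visible: stronger selection shortens the fixation time, but only by a factor that grows like α/log α rather than α.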

Domain selection combined with improved cloning strategy for high throughput expression of higher eukaryotic proteins

PubMed Central

Chen, Yunjia; Qiu, Shihong; Luan, Chi-Hao; Luo, Ming

2007-01-01

Background: Expression of higher eukaryotic genes as soluble, stable recombinant proteins is still a bottleneck step in biochemical and structural studies of novel proteins today. Correct identification of stable domains/fragments within the open reading frame (ORF), combined with proper cloning strategies, can greatly enhance the success rate when higher eukaryotic proteins are expressed as these domains/fragments. Furthermore, an HTP cloning pipeline incorporating bioinformatics domain/fragment selection methods will be beneficial to studies of structural and functional genomics/proteomics. Results: With bioinformatics tools, we developed a domain/domain boundary prediction (DDBP) method, which was trained on available experimental data. Combined with an improved cloning strategy, DDBP was applied to 57 proteins from C. elegans. Expression and purification results showed a 10-fold increase in obtaining purified proteins. Based on the DDBP method, the improved GATEWAY cloning strategy and a robotic platform, we constructed a high throughput (HTP) cloning pipeline, including PCR primer design, PCR, BP reaction, transformation, plating, colony picking and entry clone extraction, which has been successfully applied to 90 C. elegans genes, 88 Brucella genes, and 188 human genes. More than 97% of the targeted genes were obtained as entry clones. This pipeline has a modular design and can adopt different operations for a variety of cloning/expression strategies. Conclusion: The DDBP method and improved cloning strategy were satisfactory. The cloning pipeline, combined with our recombinant protein HTP expression pipeline and the crystal screening robots, constitutes a complete platform for structural genomics/proteomics. This platform will increase the success rate of purification and crystallization dramatically and promote the further advancement of structural genomics/proteomics. PMID:17663785
Petrologic, tectonic, and metallogenic evolution of the Ancestral Cascades magmatic arc, Washington, Oregon, and northern California

USGS Publications Warehouse

du Bray, Edward A.; John, David A.

2011-01-01

Present-day High Cascades arc magmatism was preceded by ~40 m.y. of nearly cospatial magmatism represented by the ancestral Cascades arc in Washington, Oregon, and northernmost California (United States). Time-space-composition relations for the ancestral Cascades arc have been synthesized from a recent compilation of more than 4000 geochemical analyses and associated age data. Neither the composition nor distribution of ancestral Cascades magmatism was uniform along the length of the ancestral arc through time. Initial (>40 to 36 Ma) ancestral Cascades magmatism (mostly basalt and basaltic andesite) was focused at the north end of the arc between the present-day locations of Mount Rainier and the Columbia River. From 35 to 18 Ma, initial basaltic andesite and andesite magmatism evolved to include dacite and rhyolite; magmatic activity became more voluminous and extended along most of the arc. Between 17 and 8 Ma, magmatism was focused along the part of the arc coincident with the northern two-thirds of Oregon and returned to more mafic compositions. Subsequent ancestral Cascades magmatism was dominated by basaltic andesite to basalt prior to the post–4 Ma onset of High Cascades magmatism. Transitional tholeiitic to calc-alkaline compositions dominated early (before 40 to ca. 25 Ma) ancestral Cascades eruptive products, whereas the majority of the younger arc rocks have a calc-alkaline affinity. Tholeiitic compositions characteristic of the oldest ancestral arc magmas suggest development associated with thin, immature crust and slab window processes, whereas the younger, calc-alkaline magmas suggest interaction with thicker, more evolved crust and more conventional subduction-related magmatic processes. Presumed changes in subducted slab dip through time also correlate with fundamental magma composition variation. The predominance of mafic compositions during latest ancestral arc magmatism and throughout the history of modern High Cascades magmatism probably
Heterologous Expression of Toxins from Bacterial Toxin-Antitoxin Systems in Eukaryotic Cells: Strategies and Applications

PubMed Central

Yeo, Chew Chieng; Abu Bakar, Fauziah; Chan, Wai Ting; Espinosa, Manuel; Harikrishna, Jennifer Ann

2016-01-01

Toxin-antitoxin (TA) systems are found in nearly all prokaryotic genomes and usually consist of a pair of co-transcribed genes, one of which encodes a stable toxin and the other, its cognate labile antitoxin. Certain environmental and physiological cues trigger the degradation of the antitoxin, causing activation of the toxin, leading either to the death or stasis of the host cell. TA systems have a variety of functions in the bacterial cell, including acting as mediators of programmed cell death, the induction of a dormant state known as persistence and the stable maintenance of plasmids and other mobile genetic elements. Some bacterial TA systems are functional when expressed in eukaryotic cells and this has led to several innovative applications, which are the subject of this review. Here, we look at how bacterial TA systems have been utilized for the genetic manipulation of yeasts and other eukaryotes, for the containment of genetically modified organisms, and for the engineering of high expression eukaryotic cell lines. We also examine how TA systems have been adopted as an important tool in developmental biology research for the ablation of specific cells and the potential for utility of TA systems in antiviral and anticancer gene therapies. PMID:26907343
An Ancestral Recombination Graph for Diploid Populations with Skewed Offspring Distribution

PubMed Central

Birkner, Matthias; Blath, Jochen; Eldon, Bjarki

2013-01-01

A large offspring-number diploid biparental multilocus population model of Moran type is our object of study. At each time step, a pair of diploid individuals drawn uniformly at random contributes offspring to the population. The number of offspring can be large relative to the total population size. Similar “heavily skewed” reproduction mechanisms have been recently considered by various authors (cf. e.g., Eldon and Wakeley 2006, 2008) and reviewed by Hedgecock and Pudovkin (2011). Each diploid parental individual contributes exactly one chromosome to each diploid offspring, and hence ancestral lineages can coalesce only when in distinct individuals. A separation-of-timescales phenomenon is thus observed. A result of Möhle (1998) is extended to obtain convergence of the ancestral process to an ancestral recombination graph necessarily admitting simultaneous multiple mergers of ancestral lineages. The usual ancestral recombination graph is obtained as a special case of our model when the parents contribute only one offspring to the population each time. Due to diploidy and large offspring numbers, novel effects appear. For example, the marginal genealogy at each locus admits simultaneous multiple mergers in up to four groups, and different loci remain substantially correlated even as the recombination rate grows large. Thus, genealogies for loci far apart on the same chromosome remain correlated. Correlation in coalescence times for two loci is derived and shown to be a function of the coalescence parameters of our model. Extending the observations by Eldon and Wakeley (2008), predictions of linkage disequilibrium are shown to be functions of the reproduction parameters of our model, in addition to the recombination rate. Correlations in ratios of coalescence times between loci can be high, even when the recombination rate is high and sample size is large, in large offspring-number populations, as suggested by simulations, hinting at how to distinguish between
Structural-Functional Organization of the Eukaryotic Cell Nucleus and Transcription Regulation: Introduction to This Special Issue of Biochemistry (Moscow).

PubMed

Razin, S V

2018-04-01

This issue of Biochemistry (Moscow) is devoted to the cell nucleus and mechanisms of transcription regulation. Over the years, biochemical processes in the cell nucleus have been studied in isolation, outside the context of their spatial organization. Now it is clear that segregation of functional processes within a compartmentalized cell nucleus is very important for the implementation of basic genetic processes. The functional compartmentalization of the cell nucleus is closely related to the spatial organization of the genome, which in turn plays a key role in the operation of epigenetic mechanisms. In this issue of Biochemistry (Moscow), we present a selection of review articles covering the functional architecture of the eukaryotic cell nucleus, the mechanisms of genome folding, the role of stochastic processes in establishing 3D architecture of the genome, and the impact of genome spatial organization on transcription regulation.
Are maternal mitochondria the selfish entities that are masters of the cells of eukaryotic multicellular organisms?

PubMed Central

Barlow, Peter W; Baldelli, E; Baluška, Frantisek

2009-01-01

The Energide concept, as well as the endosymbiotic theory of eukaryotic cell organization and evolution, proposes that present-day cells of eukaryotic organisms are mosaics of specialized and cooperating units, or organelles. Some of these units were originally free-living prokaryotes, which were engulfed during evolutionary time. Mitochondria represent one of these types of previously independent organisms; the Energide is another type. This new perspective on the organization of the cell has been further expanded to reveal the concept of a public milieu, the cytosol, in which Energides and mitochondria live, each with their own private internal milieu. The present paper discusses how the endosymbiotic theory implicates a new hypothesis about the hierarchical and communicational organization of the integrated prokaryotic components of the eukaryotic cell and provides a new angle from which to consider the theory of evolution and its bearing upon cellular complexity. Thus, it is proposed that the “selfish gene” hypothesis of Dawkins1 is not the only possible perspective for comprehending genomic and cellular evolution. Our proposal is that maternal mitochondria are the selfish “master” entities of the eukaryotic cell with respect not only to their propagation from cell-to-cell and from generation-to-generation but also to their regulation of all other cellular functions. However, it should be recognized that the concept of “master” and “servant” cell components is a metaphor; in present-day living organisms their organellar components are considered to be interdependent and inseparable. PMID:19513277
Multiple recent horizontal transfers of a large genomic region in cheese making fungi

PubMed Central

Cheeseman, Kevin; Ropars, Jeanne; Renault, Pierre; Dupont, Joëlle; Gouzy, Jérôme; Branca, Antoine; Abraham, Anne-Laure; Ceppi, Maurizio; Conseiller, Emmanuel; Debuchy, Robert; Malagnac, Fabienne; Goarin, Anne; Silar, Philippe; Lacoste, Sandrine; Sallet, Erika; Bensimon, Aaron; Giraud, Tatiana; Brygoo, Yves

2014-01-01

While the extent and impact of horizontal transfers in prokaryotes are widely acknowledged, their importance to the eukaryotic kingdom is unclear and thought by many to be anecdotal. Here we report multiple recent transfers of a huge genomic island between Penicillium spp. found in the food environment. Sequencing of the two leading filamentous fungi used in cheese making, P. roqueforti and P. camemberti, and comparison with the penicillin producer P. rubens reveals a 575 kb long genomic island in P. roqueforti—called Wallaby—present as identical fragments at non-homologous loci in P. camemberti and P. rubens. Wallaby is detected in Penicillium collections exclusively in strains from food environments. Wallaby encompasses about 250 predicted genes, some of which are probably involved in competition with microorganisms. The occurrence of multiple recent eukaryotic transfers in the food environment provides strong evidence for the importance of this understudied and probably underestimated phenomenon in eukaryotes. PMID:24407037
Octocoral Mitochondrial Genomes Provide Insights into the Phylogenetic History of Gene Order Rearrangements, Order Reversals, and Cnidarian Phylogenetics

PubMed Central

Figueroa, Diego F.; Baco, Amy R.

2015-01-01

We use full mitochondrial genomes to test the robustness of the phylogeny of the Octocorallia, to determine the evolutionary pathway for the five known mitochondrial gene rearrangements in octocorals, and to test the suitability of using mitochondrial genomes for higher taxonomic-level phylogenetic reconstructions. Our phylogeny supports three major divisions within the Octocorallia and shows that Paragorgiidae is paraphyletic, with Sibogagorgia forming a sister branch to the Coralliidae. Furthermore, Sibogagorgia cauliflora has what is presumed to be the ancestral gene order in octocorals, but the presence of a pair of inverted repeat sequences suggests that this gene order was not conserved but rather evolved back to this apparent ancestral state. Based on this we recommend the resurrection of the family Sibogagorgiidae to fix the paraphyly of the Paragorgiidae. This is the first study to show that in the Octocorallia, mitochondrial gene orders have evolved back to an ancestral state after going through a gene rearrangement, with at least one of the gene orders evolving independently in different lineages. A number of studies have used gene boundaries to determine the type of mitochondrial gene arrangement present. However, our findings suggest that this method, known as gene junction screening, may miss evolutionary reversals. Additionally, substitution saturation analysis demonstrates that while whole mitochondrial genomes can be used effectively for phylogenetic analyses within Octocorallia, their utility at higher taxonomic levels within Cnidaria is inadequate. Therefore, for phylogenetic reconstruction at taxonomic levels higher than subclass within the Cnidaria, nuclear genes will be required, even when whole mitochondrial genomes are available. PMID:25539723
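The limitation of gene junction screening noted in this abstract can be illustrated with a toy model. The sketch below is not the authors' method: it assumes a circular genome, ignores strand orientation, and uses hypothetical gene labels A to F. It reduces each gene order to its set of adjacent gene pairs, which is the only information a junction screen sees.

```python
def junctions(gene_order, circular=True):
    """Return the set of adjacent gene pairs (unordered) in a gene order."""
    n = len(gene_order)
    last = n if circular else n - 1
    return {
        frozenset((gene_order[i], gene_order[(i + 1) % n]))
        for i in range(last)
    }

# Hypothetical gene orders; labels are placeholders, not octocoral genes.
ancestral = ["A", "B", "C", "D", "E", "F"]
rearranged = ["A", "B", "E", "D", "C", "F"]  # block C..E inverted
reverted = ["A", "B", "C", "D", "E", "F"]    # second inversion restores the order

# A rearranged genome differs from the ancestral one at its junctions...
print(junctions(ancestral) == junctions(rearranged))
# ...but a genome that evolved back has junctions identical to the ancestor.
print(junctions(ancestral) == junctions(reverted))
```

Because the reverted order is indistinguishable from the ancestral one at the level of junctions, additional evidence (such as the inverted repeat sequences mentioned in the abstract) is needed to recognize a reversal.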
Enhancer Sharing Promotes Neighborhoods of Transcriptional Regulation Across Eukaryotes

PubMed Central

Quintero-Cadena, Porfirio; Sternberg, Paul W.

2016-01-01

Enhancers physically interact with transcriptional promoters, looping over distances that can span multiple regulatory elements. Given that enhancer–promoter (EP) interactions generally occur via common protein complexes, it is unclear whether EP pairing is predominantly deterministic or proximity guided. Here, we present cross-organismic evidence suggesting that most EP pairs are compatible, largely determined by physical proximity rather than specific interactions. By reanalyzing transcriptome datasets, we find that the transcription of gene neighbors is correlated over distances that scale with genome size. We experimentally show that nonspecific EP interactions can explain such correlation, and that EP distance acts as a scaling factor for the transcriptional influence of an enhancer. We propose that enhancer sharing is commonplace among eukaryotes, and that EP distance is an important layer of information in gene regulation. PMID:27799341
Six subgroups and extensive recent duplications characterize the evolution of the eukaryotic tubulin protein family.

PubMed

Findeisen, Peggy; Mühlhausen, Stefanie; Dempewolf, Silke; Hertzog, Jonny; Zietlow, Alexander; Carlomagno, Teresa; Kollmar, Martin

2014-08-27

Tubulins are among the most abundant proteins in eukaryotes, providing the backbone for many cellular substructures like the mitotic and meiotic spindles, the intracellular cytoskeletal network, and the axonemes of cilia and flagella. Homologs have even been reported for archaea and bacteria. However, a taxonomically broad and whole-genome-based analysis of the tubulin protein family has never been performed, and thus, the number of subfamilies, their taxonomic distribution, and the exact grouping of the supposed archaeal and bacterial homologs are unknown. Here, we present the analysis of 3,524 tubulins from 504 species. The tubulins formed six major subfamilies, α to ζ. Species of all major kingdoms of the eukaryotes encode members of these subfamilies, implying that they must have already been present in the last common eukaryotic ancestor. The proposed archaeal homologs grouped together with the bacterial TubZ proteins as sister clade to the FtsZ proteins, indicating that tubulins are unique to eukaryotes. Most species contained α- and/or β-tubulin gene duplicates resulting from recent branch- and species-specific duplication events. This shows that tubulins cannot be used for constructing species phylogenies without resolving their ortholog-paralog relationships. The many gene duplicates and also the independent loss of the δ-, ε-, or ζ-tubulins, which have been shown to be part of the triplet microtubules in basal bodies, suggest that tubulins can functionally substitute each other. © The Author(s) 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Infant and juvenile growth in ancestral Pueblo Indians.

PubMed

Schillaci, Michael A; Nikitovic, Dejana; Akins, Nancy J; Tripp, Lianne; Palkovich, Ann M

2011-06-01

The present study examines patterns of infant and juvenile growth in a diachronic sample of ancestral Pueblo Indians (AD 1300-1680) from the American Southwest. An assessment of growth patterns is accompanied by an evaluation of pathological conditions often considered to be indicators of nutritional deficiencies and/or gastrointestinal infections. Growth patterns and the distribution of pathological conditions are interpreted relative to culturally relevant age categories defined by Puebloan rites of passage described in the ethnographic literature. A visual comparison of growth distance curves revealed that, relative to a modern comparative group, our sample of ancestral Pueblo infants and juveniles exhibited faltering growth from soon after birth to about 5 years of age. A comparison of curves describing growth relative to adult femoral length, however, indicated reduced growth occurring later, by around 2 years of age. Similar to previous studies, we observed a high proportion of nonsurvivors exhibiting porotic cranial lesions during the first 2 years of life. Contrary to expectations, infants and juveniles without evidence of porotic cranial lesions exhibited a higher degree of stunting. Our study is generally consistent with previous research reporting poor health and high mortality for ancestral Pueblo Indian infants and juveniles. Through use of a culturally relevant context defining childhood, we argue that the observed poor health and high mortality in our sample occur before the important transition from young to older child and the concomitant initial incorporation into tribal ritual organization. Copyright © 2011 Wiley-Liss, Inc.
Death of a dogma: eukaryotic mRNAs can code for more than one protein

PubMed Central

Mouilleron, Hélène; Delcourt, Vivian; Roucou, Xavier

2016-01-01

mRNAs carry the genetic information that is translated by ribosomes. The traditional view of a mature eukaryotic mRNA is a molecule with three main regions, the 5′ UTR, the protein coding open reading frame (ORF) or coding sequence (CDS), and the 3′ UTR. This concept assumes that ribosomes translate one ORF only, generally the longest one, and produce one protein. As a result, in the early days of genomics and bioinformatics, one CDS was associated with each protein-coding gene. This fundamental concept of a single CDS is being challenged by increasing experimental evidence indicating that annotated proteins are not the only proteins translated from mRNAs. In particular, mass spectrometry (MS)-based proteomics and ribosome profiling have detected productive translation of alternative open reading frames. In several cases, the alternative and annotated proteins interact. Thus, the expression of two or more proteins translated from the same mRNA may offer a mechanism to ensure the co-expression of proteins which have functional interactions. Translational mechanisms already described in eukaryotic cells indicate that the cellular machinery is able to translate different CDSs from a single viral or cellular mRNA. In addition to summarizing data showing that the protein coding potential of eukaryotic mRNAs has been underestimated, this review aims to challenge the single translated CDS dogma. PMID:26578573
Dynamics of genomic innovation in the unicellular ancestry of animals

PubMed Central

Grau-Bové, Xavier; Torruella, Guifré; Donachie, Stuart; Suga, Hiroshi; Leonard, Guy; Richards, Thomas A; Ruiz-Trillo, Iñaki

2017-01-01

Which genomic innovations underpinned the origin of multicellular animals is still an open debate. Here, we investigate this question by reconstructing the genome architecture and gene family diversity of ancestral premetazoans, aiming to date the emergence of animal-like traits. Our comparative analysis involves genomes from animals and their closest unicellular relatives (the Holozoa), including four new genomes: three Ichthyosporea and Corallochytrium limacisporum. Here, we show that the earliest animals were shaped by dynamic changes in genome architecture before the emergence of multicellularity: an early burst of gene diversity in the ancestor of Holozoa, enriched in transcription factors and cell adhesion machinery, was followed by multiple and differently-timed episodes of synteny disruption, intron gain and genome expansions. Thus, the foundations of animal genome architecture were laid before the origin of complex multicellularity – highlighting the necessity of a unicellular perspective to understand early animal evolution. DOI: http://dx.doi.org/10.7554/eLife.26036.001 PMID:28726632
Automated multiplex genome-scale engineering in yeast

PubMed Central

Si, Tong; Chao, Ran; Min, Yuhao; Wu, Yuying; Ren, Wen; Zhao, Huimin

2017-01-01

Genome-scale engineering is indispensable in understanding and engineering microorganisms, but the current tools are mainly limited to bacterial systems. Here we report an automated platform for multiplex genome-scale engineering in Saccharomyces cerevisiae, an important eukaryotic model and widely used microbial cell factory. Standardized genetic parts encoding overexpression and knockdown mutations of >90% of yeast genes are created in a single step from a full-length cDNA library. With the aid of CRISPR-Cas, these genetic parts are iteratively integrated into the repetitive genomic sequences in a modular manner using robotic automation. This system allows functional mapping and multiplex optimization on a genome scale for diverse phenotypes including cellulase expression, isobutanol production, glycerol utilization and acetic acid tolerance, and may greatly accelerate future genome-scale engineering endeavours in yeast. PMID:28469255
Genome Content and Phylogenomics Reveal both Ancestral and Lateral Evolutionary Pathways in Plant-Pathogenic Streptomyces Species

PubMed Central

Huguet-Tapia, Jose C.; Lefebure, Tristan; Badger, Jonathan H.; Guan, Dongli; Stanhope, Michael J.

2016-01-01

Streptomyces spp. are highly differentiated actinomycetes with large, linear chromosomes that encode an arsenal of biologically active molecules and catabolic enzymes. Members of this genus are well equipped for life in nutrient-limited environments and are common soil saprophytes. Out of the hundreds of species in the genus Streptomyces, a small group has evolved the ability to infect plants. The recent availability of Streptomyces genome sequences, including four genomes of pathogenic species, provided an opportunity to characterize the gene content specific to these pathogens and to study phylogenetic relationships among them. Genome sequencing, comparative genomics, and phylogenetic analysis enabled us to discriminate pathogenic from saprophytic Streptomyces strains; moreover, we calculated that the pathogen-specific genome contains 4,662 orthologs. Phylogenetic reconstruction suggested that Streptomyces scabies and S. ipomoeae share an ancestor but that their biosynthetic clusters encoding the required virulence factor thaxtomin have diverged. In contrast, S. turgidiscabies and S. acidiscabies, two relatively unrelated pathogens, possess highly similar thaxtomin biosynthesis clusters, which suggests that the acquisition of these genes was through lateral gene transfer. PMID:26826232
MCM Paradox: Abundance of Eukaryotic Replicative Helicases and Genomic Integrity

PubMed Central

Das, Mitali; Singh, Sunita; Pradhan, Satyajit

2014-01-01

As a crucial component of the DNA replication licensing system, the minichromosome maintenance (MCM) 2–7 complex acts as the eukaryotic DNA replicative helicase. The six related MCM proteins form a heterohexamer and bind with ORC, CDC6, and Cdt1 to form the prereplication complex. Although the MCMs are well known as replicative helicases, their overabundance and distribution patterns on chromatin present a paradox called the “MCM paradox.” Several approaches have been taken to solve the MCM paradox and describe the purpose of excess MCMs distributed beyond the replication origins. Alternative functions of these MCMs, other than as a helicase, have also been proposed. This review focuses on several models and concepts generated to solve the MCM paradox coinciding with their helicase function and provides insight into the concept that excess MCMs are meant for licensing dormant origins as a backup during replication stress. Finally, we extend our view towards the effect of altering MCM levels. Though an excess of MCMs is needed for normal cells to withstand stress, there must be a delineation of the threshold level in normal and malignant cells. This review also outlines future prospects for better understanding MCM biology. PMID:25386362
Metaxa: a software tool for automated detection and discrimination among ribosomal small subunit (12S/16S/18S) sequences of archaea, bacteria, eukaryotes, mitochondria, and chloroplasts in metagenomes and environmental sequencing datasets.

PubMed

Bengtsson, Johan; Eriksson, K Martin; Hartmann, Martin; Wang, Zheng; Shenoy, Belle Damodara; Grelet, Gwen-Aëlle; Abarenkov, Kessy; Petri, Anna; Rosenblad, Magnus Alm; Nilsson, R Henrik

2011-10-01

The ribosomal small subunit (SSU) rRNA gene has emerged as an important genetic marker for taxonomic identification in environmental sequencing datasets. In addition to being present in the nucleus of eukaryotes and the core genome of prokaryotes, the gene is also found in the mitochondria of eukaryotes and in the chloroplasts of photosynthetic eukaryotes. These three sets of genes are conceptually paralogous and should in most situations not be aligned and analyzed jointly. Identifying the origin of SSU sequences in complex sequence datasets has hitherto been a time-consuming and largely manual undertaking. However, the present study introduces Metaxa (http://microbiology.se/software/metaxa/), an automated software tool to extract full-length and partial SSU sequences from larger sequence datasets and assign them to an archaeal, bacterial, nuclear eukaryote, mitochondrial, or chloroplast origin. Using data from reference databases and from full-length organelle and organism genomes, we show that Metaxa detects and scores SSU sequences for origin with very low proportions of false positives and negatives. We believe that this tool will be useful in microbial and evolutionary ecology as well as in metagenomics.
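The idea of scoring a sequence against several candidate origins and assigning it to one of them can be sketched as follows. This is a generic best-score rule written for illustration and is not Metaxa's actual algorithm or interface: the function name, the score dictionary, and both thresholds are hypothetical, and the scores are assumed to come from some upstream profile search.

```python
# The five origins named in the abstract; the labels themselves are our own.
ORIGINS = ("archaea", "bacteria", "eukaryote_nuclear", "mitochondria", "chloroplast")


def assign_origin(scores, min_score=20.0, min_margin=5.0):
    """Pick the best-scoring origin for one SSU sequence.

    scores: {origin: score}, higher meaning a better match (assumed input).
    min_score and min_margin are illustrative thresholds, not published values.
    Returns an origin label, "uncertain" when the top two origins score too
    closely, or "unclassified" when no origin scores well enough.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    if not ranked or ranked[0][1] < min_score:
        return "unclassified"
    best_origin, best_score = ranked[0]
    if len(ranked) > 1 and best_score - ranked[1][1] < min_margin:
        return "uncertain"
    return best_origin


# Invented scores for three hypothetical reads.
clear_hit = assign_origin({"bacteria": 310.0, "chloroplast": 120.5, "archaea": 40.0})
close_call = assign_origin({"bacteria": 102.0, "mitochondria": 100.0})
weak_hit = assign_origin({"archaea": 8.0})
```

Keeping an explicit "uncertain" outcome reflects the abstract's point that nuclear, mitochondrial, and chloroplast SSU genes are paralogous and should not be pooled when the evidence does not separate them.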
The Comprehensive Phytopathogen Genomics Resource: a web-based resource for data-mining plant pathogen genomes.

PubMed

Hamilton, John P; Neeno-Eckwall, Eric C; Adhikari, Bishwo N; Perna, Nicole T; Tisserat, Ned; Leach, Jan E; Lévesque, C André; Buell, C Robin

2011-01-01

The Comprehensive Phytopathogen Genomics Resource (CPGR) provides a web-based portal for plant pathologists and diagnosticians to view the genome and transcriptome sequence status of 806 bacterial, fungal, oomycete, nematode, viral and viroid plant pathogens. Tools are available to search and analyze annotated genome sequences of 74 bacterial, fungal and oomycete pathogens. Oomycete and fungal genomes are obtained directly from GenBank, whereas bacterial genome sequences are downloaded from the A Systematic Annotation Package (ASAP) database that provides curation of genomes using comparative approaches. Curated lists of bacterial genes relevant to pathogenicity and avirulence are also provided. The Plant Pathogen Transcript Assemblies Database provides annotated assemblies of the transcribed regions of 82 eukaryotic genomes from publicly available single pass Expressed Sequence Tags. Data-mining tools are provided along with tools to create candidate diagnostic markers, an emerging use for genomic sequence data in plant pathology. The Plant Pathogen Ribosomal DNA (rDNA) database is a resource for pathogens that lack genome or transcriptome data sets and contains 131 755 rDNA sequences from GenBank for 17 613 species identified as plant pathogens and related genera. Database URL: http://cpgr.plantbiology.msu.edu.
A Three-Dimensional Model of the Yeast Genome

NASA Astrophysics Data System (ADS)

Noble, William; Duan, Zhi-Jun; Andronescu, Mirela; Schutz, Kevin; McIlwain, Sean; Kim, Yoo Jung; Lee, Choli; Shendure, Jay; Fields, Stanley; Blau, C. Anthony

Layered on top of information conveyed by DNA sequence and chromatin are higher order structures that encompass portions of chromosomes, entire chromosomes, and even whole genomes. Interphase chromosomes are not positioned randomly within the nucleus, but instead adopt preferred conformations. Disparate DNA elements co-localize into functionally defined aggregates or factories for transcription and DNA replication. In budding yeast, Drosophila and many other eukaryotes, chromosomes adopt a Rabl configuration, with arms extending from centromeres adjacent to the spindle pole body to telomeres that abut the nuclear envelope. Nonetheless, the topologies and spatial relationships of chromosomes remain poorly understood. Here we developed a method to globally capture intra- and inter-chromosomal interactions, and applied it to generate a map at kilobase resolution of the haploid genome of Saccharomyces cerevisiae. The map recapitulates known features of genome organization, thereby validating the method, and identifies new features. Extensive regional and higher order folding of individual chromosomes is observed. Chromosome XII exhibits a striking conformation that implicates the nucleolus as a formidable barrier to interaction between DNA sequences at either end. Inter-chromosomal contacts are anchored by centromeres and include interactions among transfer RNA genes, among origins of early DNA replication and among sites where chromosomal breakpoints occur. Finally, we constructed a three-dimensional model of the yeast genome. Our findings provide a glimpse of the interface between the form and function of a eukaryotic genome.
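A map of intra- and inter-chromosomal interactions at kilobase resolution can be pictured as counts of observed contacts between fixed-size genomic bins. The sketch below shows only that binning step, under stated assumptions: the input format (pairs of `(chromosome, position)` loci), the function names, and the 1 kb default are ours, and the experimental capture method and three-dimensional modelling from the abstract are not reproduced.

```python
from collections import Counter


def bin_contacts(pairs, bin_size=1000):
    """Count contacts between genomic bins.

    pairs: iterable of ((chrom_a, pos_a), (chrom_b, pos_b)) interacting loci
           (hypothetical input format).
    Returns a Counter keyed by an ordered pair of (chromosome, bin_index),
    so a contact and its mirror image are counted in the same cell.
    """
    counts = Counter()
    for (chrom_a, pos_a), (chrom_b, pos_b) in pairs:
        a = (chrom_a, pos_a // bin_size)
        b = (chrom_b, pos_b // bin_size)
        key = (a, b) if a <= b else (b, a)
        counts[key] += 1
    return counts


def split_intra_inter(counts):
    """Return (intra-chromosomal, inter-chromosomal) contact totals."""
    intra = sum(n for (a, b), n in counts.items() if a[0] == b[0])
    inter = sum(n for (a, b), n in counts.items() if a[0] != b[0])
    return intra, inter


# Invented contacts for illustration only.
toy_pairs = [
    (("chrXII", 1_500), ("chrXII", 450_200)),
    (("chrXII", 450_900), ("chrXII", 1_100)),
    (("chrIV", 200), ("chrXII", 1_999)),
]
contact_map = bin_contacts(toy_pairs)
intra_total, inter_total = split_intra_inter(contact_map)
```

Storing each cell under an ordered key keeps the map symmetric without allocating a dense matrix, which matters at kilobase resolution where most bin pairs have no observed contact.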
