Current Biology

Comparative transcriptomics reveals the molecular toolkit used by an algivorous protist for cell wall perforation

  • Jennifer V. Gerbracht (4)
    Institute for Zoology, University of Cologne, Zülpicher Str. 47b, 50674 Cologne, Germany
  • Tommy Harding (4, 5)
    Department of Biochemistry and Molecular Biology, Dalhousie University, 5850 College Street, Halifax, NS B3H 4R2, Canada
  • Alastair G.B. Simpson
    Department of Biology, Dalhousie University, 1355 Oxford Street, Halifax, NS B3H 4R2, Canada
  • Andrew J. Roger
    Department of Biochemistry and Molecular Biology, Dalhousie University, 5850 College Street, Halifax, NS B3H 4R2, Canada
  • Sebastian Hess (6; corresponding author)
    Institute for Zoology, University of Cologne, Zülpicher Str. 47b, 50674 Cologne, Germany
    Department of Biochemistry and Molecular Biology, Dalhousie University, 5850 College Street, Halifax, NS B3H 4R2, Canada
    Department of Biology, Dalhousie University, 1355 Oxford Street, Halifax, NS B3H 4R2, Canada
  • Author Footnotes
    4 These authors contributed equally
    5 Present address: Laboratoire de Sciences Judiciaires et de Médecine Légale, 1701 rue Parthenais, Montréal, QC H2K 3S7, Canada
    6 Lead contact
Open Access. Published: June 13, 2022. DOI: https://doi.org/10.1016/j.cub.2022.05.049

      Highlights

      • Orciraptor shows distinct transcriptional changes in different life history stages
      • A putative GH5_5 cellulase is highly expressed and upregulated during feeding
      • Orciraptor contains an unexpected suite of chitin-related factors
      • Potential LPMOs point to enzymatic novelties in protists

      Summary

      Microbial eukaryotes display a stunning diversity of feeding strategies, ranging from generalist predators to highly specialized parasites. The unicellular “protoplast feeders” represent a fascinating mechanistic intermediate, as they penetrate other eukaryotic cells (algae and fungi) like some parasites but then devour their cell contents by phagocytosis.
      • Hess S.
      • Melkonian M.
      The mystery of clade X: Orciraptor gen. nov. and Viridiraptor gen. nov. are highly specialised, algivorous amoeboflagellates (Glissomonadida, Cercozoa).
      Besides prey recognition and attachment, this complex behavior involves the local, pre-phagocytotic dissolution of the prey cell wall, which results in well-defined perforations of species-specific size and structure.
      • Busch A.
      • Hess S.
      The cytoskeleton architecture of algivorous protoplast feeders (Viridiraptoridae, Rhizaria) indicates actin-guided perforation of prey cell walls.
      Yet the molecular processes that enable protoplast feeders to overcome cell walls of diverse biochemical composition remain unknown. We used the flagellate Orciraptor agilis (Viridiraptoridae, Rhizaria) as a model protoplast feeder and applied differential gene expression analysis to examine its penetration of green algal cell walls. Besides distinct expression changes that reflect major cellular processes (e.g., locomotion and cell division), we found lytic carbohydrate-active enzymes that are highly expressed and upregulated during the attack on the alga. A putative endocellulase (family GH5_5) with a secretion signal is most prominent, and a potential key factor for cell wall dissolution. Other candidate enzymes (e.g., lytic polysaccharide monooxygenases) belong to families that are largely uncharacterized, emphasizing the potential of non-fungal microeukaryotes for enzyme exploration. Unexpectedly, we discovered various chitin-related factors that point to an unknown chitin metabolism in Orciraptor agilis, potentially also involved in the feeding process. Our findings provide first molecular insights into an important microbial feeding behavior and new directions for cell biology research on non-model eukaryotes.

      Results and discussion

      Food acquisition in Orciraptor agilis captured by comparative transcriptomics

      Orciraptor agilis is a flagellate of the family Viridiraptoridae (Rhizaria) that can feed on the cell contents of dead algal cells (necrophagy).
      • Hess S.
      • Melkonian M.
      The mystery of clade X: Orciraptor gen. nov. and Viridiraptor gen. nov. are highly specialised, algivorous amoeboflagellates (Glissomonadida, Cercozoa).
      After attachment to its prey, filaments of Mougeotia sp. (Zygnematophyceae, Streptophyta), it perforates the algal cell wall and phagocytoses the nutrient-rich chloroplast (Figure 1A). The cell wall dissolution is confined to a narrow elliptical zone, which results in removal of a lid-like cell wall disk (Figure 1B). This perforation pattern appears to be defined by a transient, F-actin-rich cellular domain, the lysopodium
      • Busch A.
      • Hess S.
      The cytoskeleton architecture of algivorous protoplast feeders (Viridiraptoridae, Rhizaria) indicates actin-guided perforation of prey cell walls.
      (Figure 1C). As revealed by scanning electron microscopy, Orciraptor agilis degrades both main structural components of the plant-like algal cell wall, (1) crystalline cellulose microfibrils and (2) gel-like pectic substances (Figure 1D). No mechanical systems for cell wall perforation have been observed at an ultrastructural level. Instead, the close contact of Orciraptor’s plasma membrane to the zone of cell wall erosion indicates contact digestion
      • Busch A.
      • Hess S.
      The cytoskeleton architecture of algivorous protoplast feeders (Viridiraptoridae, Rhizaria) indicates actin-guided perforation of prey cell walls.
      (Figure 1C). Contact digestion is based on enzymes that are tethered to or adsorbed on a surface.
      • Ugolev A.M.
      Parietal (contact) digestion.
      For example, this process is known from malignant melanoma cells, which degrade the extracellular matrix by membrane-bound proteases during tissue invasion.
      • Nakahara H.
      • Howard L.
      • Thompson E.W.
      • Sato H.
      • Seiki M.
      • Yeh Y.
      • Chen W.T.
      Transmembrane/cytoplasmic domain-mediated membrane type 1-matrix metalloprotease docking to invadopodia is required for cell invasion.
      Figure 1. Feeding, life history, and de novo transcriptome assembly of Orciraptor agilis
      (A) Orciraptor agilis extracting the chloroplast of Mougeotia sp. after perforating the algal cell wall (differential interference contrast). Scale bar, 5 μm.
      (B) Annular dissolution of the algal cell wall resulting from an attempted attack (phase contrast). Scale bar, 5 μm.
      (C) Distribution of F-actin (green: fluorescent phalloidin) reveals the lysopodium in Orciraptor agilis formed during attack on Mougeotia sp. (overlay of differential interference contrast and fluorescence channels). The increased blue fluorescence (Calcofluor white) at the contact sites indicates lysis of the algal cell wall. Scale bar, 5 μm.
      (D) Scanning electron micrograph of a perforation by Orciraptor agilis reveals the degradation of both main structural components of Mougeotia’s cell wall, gel-like biopolymers (potentially pectins; indicated by “gel”) and cellulose microfibrils (indicated by “fib”). Scale bars, 2 μm and 200 nm (inset).
      (E) Life history stages of Orciraptor agilis from which the samples were generated.
      (F) Benchmarked universal single-copy orthologs (BUSCOs) assessment of the assembled transcriptome. The analysis was performed with the “Eukaryota” dataset and the “Alveolata” dataset (sister group of Rhizaria).
      (G) Upset plot showing the number and overlap of ORFs annotated by the indicated annotation tools and databases. Only intersection sizes > 100 are shown.
      (H) Principal component analysis (PCA) based on the expression level of all transcripts for each replicate included in the experiment.
      To gain insight into the molecular mechanisms underlying this feeding process, we compared the transcriptomes of Orciraptor cultures in three well-defined life history stages (=conditions; Figure 1E): (1) motile, gliding flagellates searching for algal cells (“gliding”); (2) cells during cell wall perforation about 45 min after contact with algal cells (“attacking”); and (3) a culture with excess algal material, which was enriched in digesting and dividing cells (“digesting-dividing”). Orciraptor agilis is an excellent laboratory model, as it can be synchronized by starvation and attacks within a few minutes after addition of algal cells (Figure S1A; Video S1). Both its ability to grow under bacteria-free conditions and its preference for dead algae let us observe Orciraptor’s gene expression changes very clearly (no bacterial transcripts, no adaption by the algal food). Since there is no high-throughput genomic or transcriptomic data available for the Viridiraptoridae, we generated a transcriptome assembly de novo using the data from all conditions (nine samples, plus a sample of Mougeotia sp. to identify algal reads). This assembly captures the most complete picture of the transcriptomic landscape and was later used for read mapping and quantification for differential expression analyses. The transcriptome was determined to be 64.3% and 80.5% complete as assessed with benchmarked universal single-copy orthologs (BUSCOs) using Eukaryota and Alveolata datasets, respectively (Figure 1F). These values likely underestimate the true completeness substantially, given the relatively isolated phylogenetic position of Orciraptor agilis. Using four different tools for functional annotation, 90.7% of the 49,848 predicted open reading frames (ORFs) received homology-based annotations and/or annotations of protein domains (Figures 1G and S1B). A principal component analysis of all replicates showed tight and highly distinct clusters for each condition (Figure 1H).
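The principal component analysis mentioned above can be illustrated with a minimal Python sketch. It assumes a samples × transcripts matrix of already transformed expression values and uses a plain SVD; the synthetic data, the function name `pca_scores`, and the noise levels are hypothetical and are not the authors' data or pipeline.

```python
import numpy as np

def pca_scores(expr, n_components=2):
    """Project samples (rows) onto the leading principal components.

    expr: samples x transcripts matrix of transformed expression values.
    Returns the sample scores and the fraction of variance explained per component.
    """
    centered = expr - expr.mean(axis=0, keepdims=True)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:n_components]

# Hypothetical data: three conditions with three replicates each, 200 transcripts.
rng = np.random.default_rng(0)
conditions = ["gliding"] * 3 + ["attacking"] * 3 + ["digesting-dividing"] * 3
profiles = {name: rng.normal(0.0, 2.0, size=200) for name in sorted(set(conditions))}
expr = np.array([profiles[c] + rng.normal(0.0, 0.2, size=200) for c in conditions])

scores, explained = pca_scores(expr)
for condition, (pc1, pc2) in zip(conditions, scores):
    print(f"{condition:20s} PC1={pc1:8.2f} PC2={pc2:8.2f}")
```

In such a plot, replicates of one condition that lie close together and far from the other conditions correspond to the "tight and highly distinct clusters" described in the text.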

      Global expression changes reflect Orciraptor’s life history

      To explore cellular processes affected by transcriptional changes between the life history stages of Orciraptor agilis, we performed a differential expression analysis for each pair of conditions (Figure S1C). Transcripts that were differentially expressed (|log2 fold change| ≥ 1, adjusted p value < 0.001) in at least one of these comparisons were hierarchically clustered based on their relative expression changes in all conditions (Figure 2A). Cutting the clustering dendrogram at 80% of its maximum height yielded five clusters containing transcripts with similar expression patterns (Figure 2B). For each cluster, significantly enriched gene ontology (GO) terms were determined (Figure S2), thereby identifying cellular processes associated with marked expression changes during Orciraptor’s life history. In addition, we specifically investigated expression changes in transcripts for cytoskeletal and flagellum-specific proteins, as viridiraptorids shift from a rigid flagellate to an amoeboid stage during feeding.
      • Busch A.
      • Hess S.
      The cytoskeleton architecture of algivorous protoplast feeders (Viridiraptoridae, Rhizaria) indicates actin-guided perforation of prey cell walls.
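The selection rule stated above (|log2 fold change| ≥ 1 and adjusted p value < 0.001 in a pairwise comparison) can be sketched in a few lines of Python. The record type `DEResult`, the comparison names, and all numbers are hypothetical; the sketch only illustrates the filtering logic and the union over comparisons, not the authors' actual pipeline.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set

@dataclass
class DEResult:
    transcript: str
    log2_fold_change: float
    padj: Optional[float]  # adjusted p value; None if the transcript was not tested

def passes_cutoffs(result: DEResult, lfc_cutoff: float = 1.0,
                   padj_cutoff: float = 0.001) -> bool:
    """True if the transcript meets both the effect-size and significance cutoff."""
    if result.padj is None:
        return False
    return abs(result.log2_fold_change) >= lfc_cutoff and result.padj < padj_cutoff

def differentially_expressed(comparisons: Dict[str, List[DEResult]]) -> Set[str]:
    """Union of transcripts passing the cutoffs in at least one comparison."""
    selected: Set[str] = set()
    for results in comparisons.values():
        selected.update(r.transcript for r in results if passes_cutoffs(r))
    return selected

# Hypothetical results for two of the pairwise comparisons.
comparisons = {
    "attacking_vs_gliding": [
        DEResult("t1", 2.2, 1e-50),
        DEResult("t2", 0.4, 1e-10),   # significant, but the change is too small
        DEResult("t3", -1.5, 0.02),   # large change, but not significant enough
    ],
    "digesting_vs_gliding": [
        DEResult("t2", -1.0, 1e-4),
        DEResult("t3", -3.0, None),   # not tested
    ],
}
selected = differentially_expressed(comparisons)
print(sorted(selected))
```

Note that a log2 fold change cutoff of 1 corresponds to a twofold change in either direction.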
      Figure 2. Clustering of differentially expressed transcripts and expression changes throughout Orciraptor’s life history
      (A) Hierarchical clustering of transcripts that were differentially expressed (|log2 fold change| ≥ 1, adjusted p value < 0.001) in at least one pairwise comparison. Variance stabilizing transformed counts were used to perform hierarchical cluster analysis using Pearson correlation as distance method and complete linkage. The resulting dendrogram was cut at 80% of the maximum height to define clusters.
      (B) Five clusters of transcripts with similar expression patterns resulting from the cut dendrogram shown in (A).
      (C) Heatmap of differentially expressed transcripts belonging to the GO terms that were significantly enriched in any of the five clusters. G, gliding; A, attacking; D, digesting. Each square represents one biological replicate.
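The clustering procedure described in the caption of Figure 2A (Pearson correlation as distance, complete linkage, dendrogram cut at 80% of its maximum height) can be sketched with SciPy as follows. This is an illustration under stated assumptions: the synthetic matrix stands in for variance-stabilized counts, the function name `cluster_transcripts` is hypothetical, and the software the authors actually used is not implied.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

def cluster_transcripts(expr, height_fraction=0.8):
    """Cluster transcripts (rows) by their expression pattern across samples.

    Uses correlation distance (1 - Pearson r) with complete linkage and cuts
    the dendrogram at a fraction of its maximum merge height.
    Rows with zero variance have an undefined correlation and should be removed first.
    """
    tree = linkage(expr, method="complete", metric="correlation")
    cut_height = height_fraction * tree[:, 2].max()
    labels = fcluster(tree, t=cut_height, criterion="distance")
    return labels, cut_height

# Hypothetical data: 9 samples (3 conditions x 3 replicates), 30 transcripts,
# each following one of three idealized patterns plus a little noise.
rng = np.random.default_rng(1)
patterns = np.array([
    [1, 1, 1, 0, 0, 0, 0, 0, 0],   # high in the first condition
    [0, 0, 0, 1, 1, 1, 0, 0, 0],   # high in the second condition
    [0, 0, 0, 0, 0, 0, 1, 1, 1],   # high in the third condition
], dtype=float)
expr = np.repeat(patterns, 10, axis=0) + rng.normal(0.0, 0.05, size=(30, 9))

labels, cut_height = cluster_transcripts(expr)
print("cut height:", round(float(cut_height), 3))
print("number of clusters:", len(set(labels)))
```

Because the cut is defined relative to the tallest merge, the number of clusters is an outcome of the data rather than a preset parameter.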
      Gliding cells show relatively low expression levels in most of the listed terms and some categories appear to be specifically downregulated relative to both attacking and digesting-dividing cells (Figure 2C). This includes terms related to protein production such as ribosome biogenesis, translation, and RNA-related processes (splicing, binding), as well as some important metabolic processes (tricarboxylic acid [TCA] cycle, fatty acid biosynthesis, and sterol biosynthesis). This aligns well with the fact that gliding cells move around but do not eat or divide. We hypothesize that the overall energy consumption in gliding cells is reduced and that the available resources are used for cellular processes that maximize the chance of encountering food (especially movement). Interestingly, we found four kinesin homologs that are specifically upregulated in the “gliding” condition, including one of the flagellum-associated kinesins (KIF17/OSM-3) from the kinesin-2 family (Figure S3A, asterisk). This family comprises plus-end-directed microtubule-based motor proteins and is well studied in connection with intraflagellar transport (IFT).
      • Marande W.
      • Kohl L.
      Flagellar kinesins in protists.
      Orciraptor agilis performs a gliding motility that apparently relies on a traction system located in the adhering posterior flagellum.
      • Hess S.
      • Melkonian M.
      The mystery of clade X: Orciraptor gen. nov. and Viridiraptor gen. nov. are highly specialised, algivorous amoeboflagellates (Glissomonadida, Cercozoa).
      This form of motility is widespread among heterotrophic flagellates of various phylogenetic affinities
      • Cavalier-Smith T.
      • Chao E.E.
      • Lewis R.
      Multigene phylogeny and cell evolution of chromist infrakingdom Rhizaria: contrasting cell organisation of sister phyla Cercozoa and Retaria.
      ,
      • Patterson D.J.
      • Simpson A.G.B.
      Heterotrophic flagellates from coastal marine and hypersaline sediments in Western Australia.
      but poorly understood. The locomotion in Orciraptor agilis might be driven by an anterograde membrane motion along this flagellum, which in IFT is based on kinesins.
      In the “attacking” condition, gene categories associated with ribosomal RNA and ribosome biosynthesis, protein production, and some energy- and lipid-related metabolic processes show enhanced expression when compared with both gliding and digesting-dividing cells (Figure 2C). This indicates a marked switch in cellular activity upon contact with the algal cells. The pronounced expression of genes linked to ribosomal RNA processing and ribosome assembly during attack suggests that the protein production machinery becomes restored to full capacity in preparation for the upcoming cellular processes such as phagocytosis, digestion, synthesis of biomass, and multiplication. Furthermore, the “attacking” condition is characterized by a high and specific expression of 14 distinct transcripts identified as coding for myosins (Figure S3A), some of which might be involved in the formation and maintenance of pseudopodial structures. Upon contact with algal cells, Orciraptor agilis switches from a motile microtubule-dominated flagellate to an F-actin-dominated amoeboid cell. In this amoeboid stage, Orciraptor agilis develops the lysopodium as a cytoskeletal template for cell wall perforation
      • Busch A.
      • Hess S.
      The cytoskeleton architecture of algivorous protoplast feeders (Viridiraptoridae, Rhizaria) indicates actin-guided perforation of prey cell walls.
      and later uses pseudopodia to extract algal cell contents.
      • Hess S.
      • Melkonian M.
      The mystery of clade X: Orciraptor gen. nov. and Viridiraptor gen. nov. are highly specialised, algivorous amoeboflagellates (Glissomonadida, Cercozoa).
      Despite these cellular changes, viridiraptorids retain their flagella over the entire life history (in contrast to certain amoeboflagellates from the Amoebozoa and Heterolobosea).
      • Busch A.
      • Hess S.
      The cytoskeleton architecture of algivorous protoplast feeders (Viridiraptoridae, Rhizaria) indicates actin-guided perforation of prey cell walls.
      This is reflected by the absence of significant regulation of 41 out of 45 flagellum-associated genes found in Orciraptor agilis (except the kinesin-2 homolog and centriole-associated genes; Figure S3B). Furthermore, the “attacking” condition shows upregulated transcripts that are related to a potential “chitin-binding” function (Figure 2C), although the green algal food is unlikely to contain chitinous substances (details below).
      In the “digesting-dividing” condition, transcripts associated with translation and protein production remain highly expressed—similar to the “attacking” condition, yet with a slightly different pattern (Figure 2C). There were also profound expression changes in the “digesting-dividing” condition. Transcripts related to signaling show the lowest expression levels of all studied life history stages, while energy conversion, lipid biosynthesis, and glutathione-related processes were at maximum expression (Figure 2C). This may reflect the conversion of algal chloroplast material into viridiraptorid biomass that happens during the digestive phase. Glutathione-related processes, in particular, might be involved in the detoxification of ingested algal food, as chlorophylls and their breakdown products are known to produce reactive oxygen species when exposed to light.
      • Kashiyama Y.
      • Tamiaki H.
      Risk management by organisms of the phototoxicity of chlorophylls.
      A very pronounced upregulation in our global analysis is observed in transcripts related to DNA dynamics and cell division. Detailed examination of the expression of cytoskeletal components revealed a large fraction (18/28; two-thirds) of kinesins that are specifically upregulated during the “digesting-dividing” stage, together with regulators of chromosome condensation (RCC1), mitotic spindle-associated factors (ASPM), centrin, and two proteins (PLK4 and POC1) associated with centriole assembly and duplication (Figures S3A and S3B). This matches the observation that viridiraptorids only undergo mitosis and divide after food uptake,
      • Hess S.
      • Melkonian M.
      The mystery of clade X: Orciraptor gen. nov. and Viridiraptor gen. nov. are highly specialised, algivorous amoeboflagellates (Glissomonadida, Cercozoa).
      while “gliding cells” seem to be arrested in a pre-division stage, as evidenced by the presence of procentrioles for flagellar duplication.
      • Hess S.
      • Melkonian M.
      Ultrastructure of the algivorous amoeboflagellate Viridiraptor invadens (Glissomonadida, Cercozoa).
      In viridiraptorids, both mitosis and cytokinesis rely heavily on microtubular structures, the spindle apparatus and a cortical system of overlapping cytoplasmic microtubules,
      • Busch A.
      • Hess S.
      The cytoskeleton architecture of algivorous protoplast feeders (Viridiraptoridae, Rhizaria) indicates actin-guided perforation of prey cell walls.
      which may explain the marked expression changes of microtubule-related factors. All in all, our transcriptomic data clearly reflect the main cellular processes observed during the three studied life history stages of Orciraptor agilis.

      Lytic CAZymes are highly upregulated during attack

      The alga Mougeotia sp. possesses a plant-like cell wall with crystalline cellulose and gel-like pectins as structural components.
      • Hotchkiss A.T.
      • Gretz M.R.
      • Hicks K.B.
      • Malcolm Brown R.
      The composition and phylogenetic significance of the Mougeotia (Charophyceae) cell wall.
      ,
      • Permann C.
      • Herburger K.
      • Niedermeier M.
      • Felhofer M.
      • Gierlinger N.
      • Holzinger A.
      Cell wall characteristics during sexual reproduction of Mougeotia sp. (Zygnematophyceae) revealed by electron microscopy, glycan microarrays and RAMAN spectroscopy.
      We suspected that Orciraptor agilis utilizes carbohydrate-active enzymes (CAZymes) to degrade these polymers and analyzed the expression of annotated lytic CAZymes such as glycoside hydrolases (GH) and polysaccharide lyases (PL) (Figure S3C). Orciraptor agilis expressed a great diversity of GHs and some PLs, listed in Figure 3A according to their expression level in the “attacking” condition. Indeed, there are several GHs with putative cellulase activity (GH5_5, GH5, GH6, and GH44), and GHs and PLs that might degrade pectin or pectate (GH28 contains polygalacturonases;
      • Markovic O.
      • Janecek S.
      Pectin degrading glycoside hydrolases of family 28: sequence-structural features, specificities and evolution.
      PL9 members cleave homogalacturonan by a β-elimination mechanism
      • Jenkins J.
      • Shevchik V.E.
      • Hugouvieux-Cotte-Pattat N.
      • Pickersgill R.W.
      The crystal structure of pectate lyase Pel9A from Erwinia chrysanthemi.
      ). Some of these candidates were clearly upregulated in the “attacking” condition, especially the members of families GH5_5 and PL9_2 (Figure 3A).
      Figure 3. Glycoside hydrolases (GH) and polysaccharide lyases of Orciraptor agilis, with details on a highly expressed putative endocellulase
      (A) Top 50 most highly expressed CAZymes of the glycoside hydrolase (GH) and polysaccharide lyase (PL) families in the “attacking” condition. Expression levels are shown as transcripts per million (TPMs). A red bar indicates upregulation (log2 fold change ≥ 1, adjusted p value < 0.001) in the “attacking” versus “gliding” condition. The main substrates of the respective CAZyme families are listed. The colored boxes indicate whether the contig is complete (“Comp.,” gray) and has a signal peptide (“SP,” blue) or transmembrane domains (“TMD,” purple). Contigs annotated as CAZymes with putative endoglucanase function (EC 3.2.1.4) are marked with a light blue dot. Contigs annotated as CAZymes that target chitin are marked with a yellow dot.
      (B) Expression levels as normalized counts of the most highly expressed GH5_5 contig (GH5_5A). Each dot represents one biological replicate.
      (C) Schematic depiction of the GH5_5A functional domains.
      (D) In silico structure prediction of the GH5_5 domain from the Orciraptor GH5_5A shown next to an endoglucanase from Thermoascus aurantiacus.
      (E) Radial phylogenetic tree of GH5_5 family proteins from bacteria and eukaryotes. Highlighted are the three GH5_5 sequences from Orciraptor agilis. Ultrafast bootstrap values are shown as branch support.
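Expression levels in Figure 3A are given as transcripts per million (TPM). As a reminder of what this unit means, the following Python sketch computes TPM from read counts and transcript lengths. The counts and lengths are made up, and quantification tools typically use effective rather than nominal lengths, so this is a simplified illustration rather than the quantification used in the study.

```python
def tpm(counts, lengths_bp):
    """Convert read counts to transcripts per million.

    Each count is first divided by the transcript length (reads per base),
    then the rates are rescaled so that they sum to one million.
    """
    if len(counts) != len(lengths_bp):
        raise ValueError("counts and lengths_bp must have the same length")
    rates = [count / length for count, length in zip(counts, lengths_bp)]
    total = sum(rates)
    if total == 0:
        raise ValueError("at least one transcript needs a non-zero count")
    return [rate / total * 1e6 for rate in rates]

# Hypothetical example with three transcripts.
counts = [100, 200, 300]
lengths_bp = [1000, 2000, 1000]
values = tpm(counts, lengths_bp)
print([round(v) for v in values])
```

The first two transcripts receive the same TPM although their raw counts differ, because the longer transcript collects proportionally more reads.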
      The most highly expressed CAZyme by far, here termed GH5_5A, showed a marked and specific upregulation in the “attacking” condition, with a log2 fold change of 2.2 (adjusted p = 3.1 × 10^−136) compared with the “gliding” condition (Figure 3B). The contig was split in the original assembly, likely due to intronic sequences present (Figure S4A), but completed with an alternative assembly strategy (STAR Methods).
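For readers less used to the log2 scale, a log2 fold change can be converted back to a linear fold change with a one-line calculation. The helper name below is hypothetical; the input value 2.2 is the one reported for GH5_5A.

```python
def fold_change(log2_fc: float) -> float:
    """Convert a log2 fold change into a linear fold change."""
    return 2.0 ** log2_fc

gh5_5a_fold = fold_change(2.2)   # reported log2 fold change, attacking vs. gliding
cutoff_fold = fold_change(1.0)   # the cutoff used to call upregulation
print(f"GH5_5A: about {gh5_5a_fold:.1f}-fold; cutoff: {cutoff_fold:.0f}-fold")
```

A log2 fold change of 2.2 thus corresponds to roughly 4.6 times more transcript in attacking cells than in gliding cells.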
      Characterized family GH5_5 members from other organisms are typical endocellulases, i.e., they perform internal cleavage of β-1,4-glucosidic linkages.
      • Aspeborg H.
      • Coutinho P.M.
      • Wang Y.
      • Brumer 3rd, H.
      • Henrissat B.
      Evolution, substrate specificity and subfamily classification of glycoside hydrolase family 5 (GH5).
      This activity is also required for the degradation of cellulose microfibrils in plant-like cell walls, making the highly expressed and regulated GH5_5A of Orciraptor agilis a strong candidate for an important wall-degrading role. The complete ORF of GH5_5A is 2,249 amino acids long, which corresponds to a large protein of approximately 226 kDa with an N-terminal signal peptide and a C-terminal transmembrane domain (TMD; Figure 3C). The protein might be secreted and remain tethered to a membrane. The GH5_5 domain is followed by a series of seven related sequence motifs, some of which are weakly assigned to the cellulose-binding domain CBM2 (non-significant E value). Future wet-lab studies have to elucidate whether these repeats can bind cellulose. Based on these features, it is possible that the enzyme is anchored on the external side of the plasma membrane and aids in contact digestion when Orciraptor agilis is attached to its prey.
      To gain more insight into the catalytic function of GH5_5A, we predicted in silico the structure of the GH5_5 module with iterative threading assembly refinement (I-TASSER).
      • Yang J.
      • Yan R.
      • Roy A.
      • Xu D.
      • Poisson J.
      • Zhang Y.
      The I-TASSER Suite: protein structure and function prediction.
      • Roy A.
      • Kucukural A.
      • Zhang Y.
      I-TASSER: a unified platform for automated protein structure and function prediction.
      • Zhang Y.
      I-TASSER server for protein 3D structure prediction.
      The predicted tertiary structure showed a high similarity (TM score 0.898) to the experimentally determined atomic resolution structure of the “major endoglucanase” from the fungus Thermoascus aurantiacus (Figure 3D).
      • Van Petegem F.
      • Vandenberghe I.
      • Bhat M.K.
      • Van Beeumen J.
      Atomic resolution structure of the major endoglucanase from Thermoascus aurantiacus.
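The TM score quoted above is a length-normalized measure of structural similarity that ranges from 0 to 1. The following Python sketch evaluates the standard TM-score formula for a given set of distances between aligned residue pairs. It assumes the superposition and alignment are already fixed, whereas real tools search for the superposition that maximizes the score; the function names and example distances are hypothetical.

```python
def tm_d0(target_length: int) -> float:
    """Length-dependent distance scale d0 (in angstroms) of the TM-score."""
    d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8
    if d0 <= 0:
        raise ValueError("target_length is too short for this formula")
    return d0

def tm_score(distances, target_length: int) -> float:
    """TM-score for a fixed superposition.

    distances: distances (angstroms) between aligned residue pairs.
    target_length: number of residues in the target structure; residues
    without an aligned partner contribute nothing to the sum.
    """
    if target_length <= 15:
        raise ValueError("target_length must exceed 15 residues")
    d0 = tm_d0(target_length)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length

# Hypothetical 300-residue domain.
perfect = tm_score([0.0] * 300, 300)          # identical structures
at_d0 = tm_score([tm_d0(300)] * 300, 300)     # every pair deviates by exactly d0
half_aligned = tm_score([0.0] * 150, 300)     # only half of the residues aligned
print(perfect, at_d0, half_aligned)
```

Scores above about 0.5 are commonly taken to indicate the same overall fold, so a value near 0.9 reflects a very close structural match.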
      Furthermore, a multiple sequence alignment of the GH5_5 modules from Orciraptor agilis revealed conservation of residues that are part of the active site of a functionally and structurally characterized GH5_5 endoglucanase (Figure S4B).
      • Delsaute M.
      • Berlemont R.
      • Dehareng D.
      • Van Elder D.
      • Galleni M.
      • Bauvois C.
      Three-dimensional structure of RBcel1, a metagenome-derived psychrotolerant family GH5 endoglucanase.
      ,
      • Berlemont R.
      • Delsaute M.
      • Pipers D.
      • D'Amico S.
      • Feller G.
      • Galleni M.
      • Power P.
      Insights into bacterial cellulose biosynthesis by functional metagenomics on Antarctic soil samples.
      We found two other transcripts that encode GH5_5 modules in Orciraptor agilis, GH5_5B and GH5_5C. Both had much lower expression levels, and only GH5_5C was upregulated in the “attacking” condition. To elucidate the relationships of the three endocellulases of Orciraptor agilis, we performed phylogenetic analyses with prokaryotic and eukaryotic homologs. The maximum likelihood tree in Figure 3E shows that GH5_5 sequences from diverse eukaryotic supergroups do not cluster together but are intermingled with prokaryotic sequences. As expected for a dataset of relatively short protein sequences (∼300 amino acids), the phylogenetic analysis cannot resolve many of these deeper branches. Fungal and dinoflagellate (Alveolata) cellulases, however, form two distinct clades indicating significant in-group diversification (Figure 3E), which might relate to their cell wall biology (e.g., cellulosic thecal plates in dinoflagellates
      • Chan W.S.
      • Kwok A.C.M.
      • Wong J.T.Y.
      Knockdown of dinoflagellate cellulose synthase CesA1 resulted in malformed intracellular cellulosic thecal plates and severely impeded cyst-to-swarmer transition.
      ). The three GH5_5 domains from Orciraptor agilis clustered together as well, with full support (UFBoot = 100), suggesting that they are paralogs stemming from a common ancestral gene. Their closest relatives in the tree were sequences from other protists (e.g., choanoflagellates, amoebozoans), which, however, changed with taxon selection (see supplemental information). Due to lacking statistical support and poor taxon sampling of protists in general,
      • Burki F.
      • Roger A.J.
      • Brown M.W.
      • Simpson A.G.B.
      The new tree of eukaryotes.
      ,
      • Sibbald S.J.
      • Archibald J.M.
      More protist genomes needed.
      the origin of Orciraptor’s GH5_5 cellulases remains unresolved. Future efforts in the genomic exploration of microbial eukaryotes promise to fill these gaps and to provide further evolutionary insights.

      Are chitin-related factors involved in contacting algal surfaces?

      The analysis of Orciraptor’s glycoside hydrolases also revealed four putative chitinases from the GH18 family. These GH18 candidates were among the 50 most highly expressed CAZymes, and two of them were clearly upregulated in the “attacking” condition (Figure 3A). This was surprising, as zygnematophytes such as Mougeotia sp. do not produce detectable chitin, nor do they possess chitin synthases in their genomes.
      • Jiao C.
      • Sørensen I.
      • Sun X.
      • Sun H.
      • Behar H.
      • Alseekh S.
      • Philippe G.
      • Palacio Lopez K.
      • Sun L.
      • Reed R.
      • et al.
      The Penium margaritaceum genome: hallmarks of the origins of land plants.
      Another possibility is that chitin plays a role in the physiology of Orciraptor agilis, especially during feeding. Noting the marked expression changes of “chitin-binding” factors in our global analysis (Figure 2C), we examined carbohydrate-binding modules (CBMs) and their expressional changes. Orciraptor agilis expressed proteins with various CBMs, as listed in Figure 4A according to their expression level in the “attacking” condition. The most highly expressed CBMs belong to family CBM13, which contains members with diverse binding functions (e.g., galactose, mannose, GalNAc, and xylan), so that substrate specificity cannot be predicted.
      • Fujimoto Z.
      Structure and function of carbohydrate-binding module families 13 and 42 of glycoside hydrolases, comprising a beta-trefoil fold.
      Interestingly, there were several transcripts with CBM50 (LysM) and/or CBM18 modules, both of which are known to bind chitin (and peptidoglycan).
      • de Jonge R.
      • van Esse H.P.
      • Kombrink A.
      • Shinya T.
      • Desaki Y.
      • Bours R.
      • van der Krol S.
      • Shibuya N.
      • Joosten M.H.
      • Thomma B.P.
      Conserved fungal LysM effector Ecp6 prevents chitin-triggered immunity in plants.
      ,
      • Abramyan J.
      • Stajich J.E.
      Species-specific chitin-binding module 18 expansion in the amphibian pathogen Batrachochytrium dendrobatidis.
      Several of these factors were upregulated during attack. The most highly expressed and upregulated chitin-binding transcript encodes five LysMs and one CBM18 module (Figure 4B). It also has a signal peptide and a C-terminal TMD and hence could be tethered to a membrane, similar to known LysM-containing chitin receptors in plants.
      • Kombrink A.
      • Sánchez-Vallet A.
      • Thomma B.P.
      The role of chitin detection in plant-pathogen interactions.
      These findings are a good starting point for future research, as LysMs are important factors in plant-pathogen interactions and rhizobial symbioses
      • Hu S.P.
      • Li J.J.
      • Dhar N.
      • Li J.P.
      • Chen J.Y.
      • Jian W.
      • Dai X.F.
      • Yang X.Y.
      Lysin motif (LysM) proteins: interlinking manipulation of plant immunity and fungi.
      but largely unexplored in free-living protists. In addition to chitinases and putative chitin binders, we found other chitin-related factors that were relatively highly expressed, such as potential lytic polysaccharide monooxygenases (LPMOs) of family AA11 and a chitin synthase (Figure 4C). The Orciraptor agilis transcriptome encoded the entire biosynthetic pathway from fructose/glucose to chitin, and the array of chitin-synthesizing, -binding, and -degrading factors points to a significant role for chitin (or related substances). The occurrence, properties, and roles of such biopolymers in protists are, however, largely unknown and deserve future study.
      Figure 4. Carbohydrate-binding modules (CBMs), lytic polysaccharide monooxygenases (LPMOs), and other chitin-related factors expressed in Orciraptor agilis
      (A) Top 50 most highly expressed contigs annotated with carbohydrate-binding modules (CBMs) in the “attacking” condition. Expression levels are shown as transcripts per million (TPMs). A red bar indicates upregulation (log2 fold change ≥ 1, adjusted p value < 0.001) in the “attacking” versus “gliding” condition. The targeted carbohydrates of the respective CBMs are listed. The colored boxes indicate whether the contig is complete (“Comp.,” gray) and has a signal peptide (“SP,” blue) or transmembrane domains (“TMD,” purple).
      (B) Schematic depiction of the most highly expressed chitin-related contig.
      (C) MA plot (“attacking” versus “gliding” condition) depicting the expression levels of contigs annotated with a chitin-related function.
      (D) Expression levels of contigs annotated as AA11 shown as transcripts per million (TPM). A red bar indicates upregulation (log2 fold change ≥ 1, adjusted p value < 0.001) in the “attacking” versus “gliding” condition. The colored boxes indicate whether the contig is complete (“Comp.,” gray) and has a signal peptide (“SP,” blue) or transmembrane domains (“TMD,” purple).
      (E) Schematic depiction of the most highly expressed AA11 from Orciraptor agilis.
      (F) In silico structure prediction of the functional domain of the AA11-type LPMO from Orciraptor agilis next to the AA11 LPMO from Aspergillus oryzae (residues 1–151). The copper cofactor (orange) and three binding residues (red) in the Orciraptor structure were predicted by COACH and COFACTOR.
      (G) Details of the ligand-binding sites of the structures shown in (F).
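
The upregulation criterion used for the red bars in panels (A) and (D) (log2 fold change ≥ 1, adjusted p value < 0.001) can be expressed as a simple filter. The following is a minimal sketch only; the record type, field names, and example values are hypothetical and are not taken from the authors' analysis code.

```python
from dataclasses import dataclass


@dataclass
class ContigResult:
    """One differential-expression result ("attacking" versus "gliding")."""
    contig_id: str            # hypothetical identifier
    log2_fold_change: float
    adjusted_p: float


def is_upregulated(result: ContigResult,
                   min_log2fc: float = 1.0,
                   max_padj: float = 0.001) -> bool:
    """Apply the thresholds stated in the figure legend."""
    return (result.log2_fold_change >= min_log2fc
            and result.adjusted_p < max_padj)


# Made-up example values, for illustration only.
results = [
    ContigResult("contig_a", 2.3, 1e-6),   # passes both thresholds
    ContigResult("contig_b", 0.8, 1e-6),   # fold change too small
    ContigResult("contig_c", 1.5, 0.01),   # adjusted p value too large
]
upregulated = [r.contig_id for r in results if is_upregulated(r)]
```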
      Three contigs encode putative LPMOs of family AA11 (Figure 4D). LPMOs are promising enzymes for biotechnology as many of them act on recalcitrant substrates such as crystalline cellulose or chitin and greatly enhance biomass degradation in synergy with other CAZymes.
      • Harris P.V.
      • Welner D.
      • McFarland K.C.
      • Re E.
      • Navarro Poulsen J.C.
      • Brown K.
      • Salbo R.
      • Ding H.
      • Vlasenko E.
      • Merino S.
      • et al.
      Stimulation of lignocellulosic biomass hydrolysis by proteins of glycoside hydrolase family 61: structure and function of a large, enigmatic family.
      ,
      • Vaaje-Kolstad G.
      • Westereng B.
      • Horn S.J.
      • Liu Z.
      • Zhai H.
      • Sørlie M.
      • Eijsink V.G.
      An oxidative enzyme boosting the enzymatic conversion of recalcitrant polysaccharides.
      However, only a single characterized enzyme is known for family AA11, the copper-dependent LPMO from Aspergillus oryzae that degrades chitin.
      • Hemsworth G.R.
      • Henrissat B.
      • Davies G.J.
      • Walton P.H.
      Discovery and characterization of a new family of lytic polysaccharide monooxygenases.
      The most highly expressed putative LPMO from Orciraptor agilis represents a 375-amino-acid-long protein with an N-terminal signal peptide and a C-terminal TMD (Figure 4E). The sequence similarity between Orciraptor’s LPMO module and the characterized AA11 protein is relatively low (25.4% identity), but their relationship is supported by in silico structure prediction. The AA11 LPMO from Aspergillus oryzae was the closest hit as determined with I-TASSER and clearly resembles the predicted structure from Orciraptor agilis (TM-score 0.683; Figure 4F). Furthermore, protein-ligand-binding site predictions with COACH
      • Yang J.
      • Roy A.
      • Zhang Y.
      Protein-ligand binding site recognition using complementary binding-specific substructure comparison and sequence profile alignment.
      and COFACTOR
      • Roy A.
      • Yang J.
      • Zhang Y.
      COFACTOR: an accurate comparative algorithm for structure-based protein function annotation.
      • Roy A.
      • Zhang Y.
      Recognizing protein-ligand binding sites by global structural alignment and local geometry refinement.
      • Zhang C.
      • Freddolino P.L.
      • Zhang Y.
      COFACTOR: improved protein function prediction by combining structure, sequence and protein-protein interaction information.
      predicted a divalent copper ion as ligand, which is typical for known LPMOs.
      • Aachmann F.L.
      • Sørlie M.
      • Skjåk-Bræk G.
      • Eijsink V.G.
      • Vaaje-Kolstad G.
      NMR structure of a lytic polysaccharide monooxygenase provides insight into copper binding, protein dynamics, and substrate interactions.
      The ligand was bound by a trio of amino acid residues (His1, His63, and Tyr135) that is very similar to that known to bind copper in the AA11 of Aspergillus (Figure 4G). It is still difficult to predict the activity of new LPMO homologs with low sequence identity to characterized enzymes, as there is an expanding number of LPMO families whose members vary markedly in substrate specificity. Besides LPMOs that degrade cellulose and chitin (e.g., families AA9, AA10, and AA11), there are also enzymes that act on semi-crystalline or amorphous substrates such as starch (AA13), xylan (AA14), and pectin (AA17).
      • Forsberg Z.
      • Sørlie M.
      • Petrović D.
      • Courtade G.
      • Aachmann F.L.
      • Vaaje-Kolstad G.
      • Bissaro B.
      • Røhr Å.K.
      • Eijsink V.G.
      Polysaccharide degradation by lytic polysaccharide monooxygenases.
      ,
      • Sabbadin F.
      • Urresti S.
      • Henrissat B.
      • Avrova A.O.
      • Welsh L.R.J.
      • Lindley P.J.
      • Csukai M.
      • Squires J.N.
      • Walton P.H.
      • Davies G.J.
      • et al.
      Secreted pectin monooxygenases drive plant infection by pathogenic oomycetes.
      Furthermore, there exist structurally similar “LPMO-like proteins,” for which no lytic activity could be demonstrated.
      • Labourel A.
      • Frandsen K.E.H.
      • Zhang F.
      • Brouilly N.
      • Grisel S.
      • Haon M.
      • Ciano L.
      • Ropartz D.
      • Fanuel M.
      • Martin F.
      • et al.
      A fungal family of lytic polysaccharide monooxygenase-like copper proteins.
      Overall, detailed knowledge about the biological functions of LPMOs and LPMO-like proteins in their natural hosts is scarce, and additional organismal model systems for in situ studies are needed. We do not yet know the activity of the putative LPMO from Orciraptor agilis, nor its role in Orciraptor’s biology, but our finding opens new avenues to explore LPMO functions in non-fungal microeukaryotes. Furthermore, we demonstrate that these organisms represent an almost untapped resource for new enzymes with potential biotechnological relevance. The degradation of biomass containing recalcitrant polysaccharides is important in biofuel processing, for example, and the discovery of efficient CAZymes using “omics”-based functional prediction in non-model organisms could help to optimize this process in the future.
      • Tingley J.P.
      • Low K.E.
      • Xing X.
      • Abbott D.W.
      Combined whole cell wall analysis and streamlined in silico carbohydrate-active enzyme discovery to improve biocatalytic conversion of agricultural crop residues.

      Conclusions

      Comparative transcriptomics applied to synchronized cultures of the protoplast feeder Orciraptor agilis have provided the first insights into the molecular factors underpinning the perforation of algal cell walls. The pronounced upregulation of GH5_5A during attack identifies this highly expressed glycoside hydrolase as a potential key factor for the pre-phagocytotic dissolution of the cell wall. The molecular features of this putative endocellulase (signal peptide and TMD) suggest that the protein is secreted but remains tethered to a membrane, e.g., the extracellular side of the plasma membrane. Interestingly, several other candidate proteins that were upregulated during attack (e.g., chitin binders and LPMOs) display similar features, pointing to a membrane-tethered toolkit of CAZymes, similar to that known from bacterial cellulosomes.
      • Artzi L.
      • Bayer E.A.
      • Moraïs S.
      Cellulosomes: bacterial nanomachines for dismantling plant polysaccharides.
      Our findings support the hypothesis that protoplast feeders overcome prey cell walls by contact digestion or similar well-concerted processes with membrane-bound factors and that the mechanistic aspects of their intricate feeding strategy differ fundamentally from those of “typical” predators and fungal parasites. Other experimental approaches are now required to reveal more of the molecular secrets of protoplast feeders for which our sequence data will be an important resource. In a broader context, this study contributes in-depth transcriptomic information for another representative of the vastly undersampled eukaryotic supergroup Rhizaria and will help us to understand the cell biological specialties found in this ecologically diverse group of protists.

      STAR★Methods

      Key resources table

      REAGENT or RESOURCE | SOURCE | IDENTIFIER
      Chemicals, peptides, and recombinant proteins
      TRIzol Reagent | Thermo Fisher Scientific | N/A
      Deposited data
      RNA-seq data from Orciraptor and Mougeotia | This paper | Array Express: E-MTAB-11291
      Transcriptome assembly Mougeotia | This paper | ENA: PRJEB49552, HBWW01000001-HBWW01035866
      Transcriptome assembly Orciraptor (Trinity) | This paper | ENA: PRJEB49552, HBWT01000001-HBWT01049848
      Transcriptome assembly Orciraptor (rnaSPAdes) | This paper | ENA: PRJEB49552, HBWU01000001-HBWU01052058
      Functional annotation, gene expression tables, peptide sequences | This paper | Zenodo: https://doi.org/10.5281/zenodo.5776597
      Code for analysis of Trinity assembly | This paper | Zenodo/GitHub: https://doi.org/10.5281/zenodo.5814798
      Code for analysis of rnaSPAdes assembly | This paper | Zenodo/GitHub: https://doi.org/10.5281/zenodo.5814879
      Experimental models: Organisms/strains
      Orciraptor agilis: Strain OrcA03 | Laboratory of Sebastian Hess | N/A
      Mougeotia sp.: Strain CCAC 3626 | Central Collection of Algal Cultures, CCAC | N/A
      Software and algorithms
      Rcorrector (version 1.0.4) | Song and Florea
      • Song L.
      • Florea L.
      Rcorrector: efficient and accurate error correction for Illumina RNA-seq reads.
      https://github.com/mourisl/Rcorrector
      Trim Galore (version 0.6.6) | N/A | https://www.bioinformatics.babraham.ac.uk/projects/trim_galore/
      rnaSPAdes (version 3.14.1 and 3.15.0) | Bushmanova et al.
      • Bushmanova E.
      • Antipov D.
      • Lapidus A.
      • Prjibelski A.D.
      rnaSPAdes: a de novo transcriptome assembler and its application to RNA-seq data.
      https://cab.spbu.ru/software/rnaspades/
      Bowtie2 (version 2.4.2) | Langmead and Salzberg
      • Langmead B.
      • Salzberg S.L.
      Fast gapped-read alignment with Bowtie 2.
      http://bowtie-bio.sourceforge.net/bowtie2/index.shtml
      Trinity (version 2.0.6) | Grabherr et al.
      • Grabherr M.G.
      • Haas B.J.
      • Yassour M.
      • Levin J.Z.
      • Thompson D.A.
      • Amit I.
      • Adiconis X.
      • Fan L.
      • Raychowdhury R.
      • Zeng Q.
      • et al.
      Full-length transcriptome assembly from RNA-seq data without a reference genome.
      https://github.com/trinityrnaseq/trinityrnaseq
      BLAST+ (version 2.10.1) | Camacho et al.
      • Camacho C.
      • Coulouris G.
      • Avagyan V.
      • Ma N.
      • Papadopoulos J.
      • Bealer K.
      • et al.
      BLAST+: architecture and applications.
      https://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/
      TransDecoder (version 2.1.0) | N/A | https://github.com/TransDecoder/TransDecoder
      DIAMOND (version 2.0.11) | Buchfink et al.
      • Buchfink B.
      • Reuter K.
      • Drost H.G.
      Sensitive protein alignments at tree-of-life scale using DIAMOND.
      https://github.com/bbuchfink/diamond
      BUSCO (version 4.0.6) | Seppey et al.
      • Seppey M.
      • Manni M.
      • Zdobnov E.M.
      BUSCO: assessing genome assembly and annotation completeness.
      https://busco.ezlab.org/
      InterProScan (version 5.52-86.0) | Blum et al.
      • Blum M.
      • Chang H.Y.
      • Chuguransky S.
      • Grego T.
      • Kandasaamy S.
      • Mitchell A.
      • Nuka G.
      • Paysan-Lafosse T.
      • Qureshi M.
      • Raj S.
      • et al.
      The InterPro protein families and domains database: 20 years on.
      https://www.ebi.ac.uk/interpro/download/
      eggNOG-mapper (version 2.0.5) | Cantalapiedra et al.
      • Cantalapiedra C.P.
      • Hernández-Plaza A.
      • Letunic I.
      • Bork P.
      • Huerta-Cepas J.
      eggNOG-mapper v2: functional annotation, orthology assignments, and domain prediction at the metagenomic scale.
      https://github.com/eggnogdb/eggnog-mapper
      dbcan2 (version 3.0) | Zhang et al.
      • Zhang H.
      • Yohe T.
      • Huang L.
      • Entwistle S.
      • Wu P.
      • Yang Z.
      • Busk P.K.
      • Xu Y.
      • Yin Y.
      dbCAN2: a meta server for automated carbohydrate-active enzyme annotation.
      https://bcb.unl.edu/dbCAN2/
      Phobius webserver | Käll et al.
      • Käll L.
      • Krogh A.
      • Sonnhammer E.L.
      Advantages of combined transmembrane topology and signal peptide prediction—the Phobius web server.
      https://phobius.sbc.su.se/
      Salmon (version 1.4.0) | Patro et al.
      • Patro R.
      • Duggal G.
      • Love M.I.
      • Irizarry R.A.
      • Kingsford C.
      Salmon provides fast and bias-aware quantification of transcript expression.
      https://github.com/COMBINE-lab/salmon
      Tximport (version 1.18.0) | Soneson et al.
      • Soneson C.
      • Love M.I.
      • Robinson M.D.
      Differential analyses for RNA-seq: transcript-level estimates improve gene-level inferences.
      https://bioconductor.org/packages/release/bioc/html/tximport.html
      DESeq2 (version 1.30.0) | Love et al.
      • Love M.I.
      • Huber W.
      • Anders S.
      Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2.
      https://bioconductor.org/packages/release/bioc/html/DESeq2.html
      Blast2GO (version 5.2.5) | Biobam | https://www.blast2go.com/
      GOseq (version 1.42.0) | Young et al.
      • Young M.D.
      • Wakefield M.J.
      • Smyth G.K.
      • Oshlack A.
      Gene ontology analysis for RNA-seq: accounting for selection bias.
      https://bioconductor.org/packages/release/bioc/html/goseq.html
      I-TASSER webserver | Yang and Zhang
      • Yang J.
      • Zhang Y.
      I-TASSER server: new development for protein structure and function predictions.
      https://zhanggroup.org/I-TASSER/
      PyMOL (version 1.8.x) | N/A | https://github.com/jvsguerra/pymol-1.8.x-windows
      Lace (version 1.14.1) | Davidson et al.
      • Davidson N.M.
      • Hawkins A.D.K.
      • Oshlack A.
      SuperTranscripts: a data driven reference for analysis and visualisation of transcriptomes.
      https://github.com/Oshlack/Lace
      STAR (version 2.7.8a) | Dobin et al.
      • Dobin A.
      • Davis C.A.
      • Schlesinger F.
      • Drenkow J.
      • Zaleski C.
      • Jha S.
      • Batut P.
      • Chaisson M.
      • Gingeras T.R.
      STAR: ultrafast universal RNA-seq aligner.
      https://github.com/alexdobin/STAR
      StringTie (version 2.1.5) | Pertea et al.
      • Pertea M.
      • Pertea G.M.
      • Antonescu C.M.
      • Chang T.C.
      • Mendell J.T.
      • Salzberg S.L.
      StringTie enables improved reconstruction of a transcriptome from RNA-seq reads.
      https://ccb.jhu.edu/software/stringtie/
      Gffread (version 0.12.2) | Pertea and Pertea
      • Pertea G.
      • Pertea M.
      GFF Utilities: GffRead and GffCompare.
      http://ccb.jhu.edu/software/stringtie/gff.shtml#gffread
      MAFFT (version 7.487) | Katoh and Standley
      • Katoh K.
      • Standley D.M.
      MAFFT multiple sequence alignment software version 7: improvements in performance and usability.
      https://mafft.cbrc.jp/alignment/software/
      TrimAl (version 1.4.rev15) | Capella-Gutiérrez et al.
      • Capella-Gutiérrez S.
      • Silla-Martínez J.M.
      • Gabaldón T.
      trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses.
      http://trimal.cgenomics.org/
      Jalview (version 2.11.1.4) | Waterhouse et al.
      • Waterhouse A.M.
      • Procter J.B.
      • Martin D.M.
      • Clamp M.
      • Barton G.J.
      Jalview version 2—a multiple sequence alignment editor and analysis workbench.
      https://www.jalview.org/
      IQ-TREE (version 2.1.4-beta) | Minh et al.
      • Minh B.Q.
      • Schmidt H.A.
      • Chernomor O.
      • Schrempf D.
      • Woodhams M.D.
      • von Haeseler A.
      • Lanfear R.
      IQ-TREE 2: new models and efficient methods for phylogenetic inference in the genomic era.
      http://www.iqtree.org

      Resource availability

      Lead contact

      Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact, Sebastian Hess (sebastian.hess@uni-koeln.de).

      Materials availability

      This study did not generate new unique reagents.

      Experimental model and subject details

      Mougeotia sp.

      The filamentous green alga Mougeotia sp. (strain CCAC 3626) was grown in vented polystyrene cell culture flasks (Falcon T25; Corning, New York, USA) with the culture medium Waris-H containing 1 % (v/v) bacterial standard medium (0.8 % peptone, 0.1 % glucose, 0.1 % meat extract, 0.1 % yeast extract in distilled water (w/v); for references see Hess and Melkonian
      • Hess S.
      • Melkonian M.
      The mystery of clade X: Orciraptor gen. nov. and Viridiraptor gen. nov. are highly specialised, algivorous amoeboflagellates (Glissomonadida, Cercozoa).
      ) and artificial light (white LEDs, photon fluence rate 10–30 μmol m⁻² s⁻¹, 14:10 h light-dark cycle) at 16 °C. The strain CCAC 3626 was deposited in and is available from the Central Collection of Algal Cultures (CCAC) at the University of Duisburg-Essen (https://www.uni-due.de/biology/ccac/).

      Orciraptor agilis

      Orciraptor agilis (strain OrcA03) was cultivated in a diluted suspension of freeze-killed filaments of Mougeotia sp. (strain CCAC 3626) at 4-21 °C as described previously.
      • Hess S.
      • Melkonian M.
      The mystery of clade X: Orciraptor gen. nov. and Viridiraptor gen. nov. are highly specialised, algivorous amoeboflagellates (Glissomonadida, Cercozoa).
      In short, about 25 ml of an algal culture (details below) was mixed with about 475 ml sterile, distilled water, distributed to polystyrene cell culture flasks (e.g. 25 ml in Falcon T25; Corning, New York, USA), frozen and stored at -20 °C. These flasks were thawed and then inoculated with about 2 ml of a running Orciraptor culture. The strain OrcA03 is available from the laboratory of the corresponding author.

      Method details

      Microscopy

      Light microscopy was done with a ZEISS IM35 inverted microscope equipped with differential interference contrast and phase contrast optics, and an electronic flash (for details, see Hess and Melkonian
      • Hess S.
      • Melkonian M.
      The mystery of clade X: Orciraptor gen. nov. and Viridiraptor gen. nov. are highly specialised, algivorous amoeboflagellates (Glissomonadida, Cercozoa).
      ). For the localization of F-actin and algal cell walls, attacking Orciraptor cells were aldehyde-fixed, washed and stained with a fluorescent phalloidin conjugate and Calcofluor White as described in Busch and Hess.
      • Busch A.
      • Hess S.
      The cytoskeleton architecture of algivorous protoplast feeders (Viridiraptoridae, Rhizaria) indicates actin-guided perforation of prey cell walls.
      For scanning electron microscopy of cell wall perforations, filaments of Mougeotia sp. emptied by Orciraptor agilis were collected from old cultures by sedimentation and placed on poly-L-lysine coated cover slips. After about one hour of sedimentation, the cover slips were passed through a graded ethanol series (30%-50%-96%-100%; 5 min each step), transferred to 100% hexamethyldisilazane (HMDS) and incubated for 15 min. After a final exchange of HMDS, the fluid was aspirated and the samples were air-dried in a fume hood. The dry samples were sputter coated with gold and imaged with a ZEISS Neon 40 scanning electron microscope (secondary electron detector, 2.5 kV acceleration voltage; ZEISS, Oberkochen, Germany).

      RNA isolation of Mougeotia sp.

      Algal filaments were collected with a 40 μm strainer (Falcon 40 μm Cell Strainer; Corning, New York, USA) from a well-grown culture, added to liquid nitrogen in a ceramic mortar, and ground to powder. Several milliliters of TRIzol Reagent (Thermo Fisher Scientific, Waltham, Massachusetts, USA) were added and mixed with the ground algal material during thawing. The resulting TRIzol extract was transferred to a test tube, mixed for several minutes at room temperature, and subjected to RNA isolation according to the manufacturer’s instructions for TRIzol Reagent. The isolated RNA was checked for integrity by agarose gel electrophoresis, quantified, and stored frozen in nuclease-free water.

      Synchronization of Orciraptor cultures and RNA isolation

      Nine large cultures of Orciraptor agilis were set up in vented T-175 cell culture flasks (Sarstedt, Nümbrecht, Germany) by adding about 30 ml of an Orciraptor suspension (gliding, aggressive cells from regular cultures) to about 250 ml of freeze-killed algal material diluted in distilled water (details on dilution above), and incubated at 21 °C in the dark. One day after inoculation, three cultures with digesting and dividing Orciraptor cells (“digesting-dividing” condition) were processed for extraction of total RNA (details below). Two days after inoculation, the remaining cultures contained gliding, aggressive flagellates. Three of these cultures were directly processed for RNA extraction (“gliding” condition), while the other three cultures were spiked with 250 μl of concentrated, freeze-killed Mougeotia filaments. These algae had previously been ultrasonicated to fragment the filaments into shorter pieces (for faster sedimentation) and washed twice with distilled water (by centrifugation at 1000 g for 10 min and resuspension). After about 45 min, when almost all Orciraptor cells had started to attack the Mougeotia filaments (“attacking” condition), total RNA was extracted.
      For extraction of total RNA, cultures of all conditions were processed in the same way: after careful aspiration of most of the culture supernatant, the cells were quickly agitated and filtered onto a 3 μm polycarbonate membrane disc filter (Sterlitech, Auburn, Washington, USA) with a vacuum filtration device. The cell-bearing filter was then placed upside down in a 60 mm Petri dish with 3 ml TRIzol Reagent (Thermo Fisher Scientific, Waltham, Massachusetts, USA), and the dish was placed on a rocking table. After several minutes of mixing, the filter was removed from the Petri dish and the TRIzol extract was stored frozen at -80 °C until further processing. From these samples, RNA was isolated according to the manufacturer’s instructions for TRIzol Reagent, then treated with TURBO DNase (Thermo Fisher Scientific, Waltham, Massachusetts, USA), checked for integrity by agarose gel electrophoresis, quantified, and stored frozen in nuclease-free water.

      Quantification and statistical analysis

      RNA-seq and de novo transcriptome assemblies

      The RNA samples of Mougeotia sp. and Orciraptor agilis were submitted to Génome Québec (Montréal, Québec, Canada) for strand-specific library preparation (including poly-A-enrichment) and RNA sequencing on a HiSeq2500 platform (PE125) and HiSeq4000 (PE100), respectively. For Mougeotia sp., about 45 million read pairs were obtained. K-mer based error correction was performed with Rcorrector
      • Song L.
      • Florea L.
      Rcorrector: efficient and accurate error correction for Illumina RNA-seq reads.
      (version 1.0.4). Quality and adapter trimming was performed with Trim Galore
      • Martin M.
      Cutadapt removes adapter sequences from high-throughput sequencing reads.
      (version 0.6.6). The processed reads were assembled using rnaSPAdes
      • Bushmanova E.
      • Antipov D.
      • Lapidus A.
      • Prjibelski A.D.
      rnaSPAdes: a de novo transcriptome assembler and its application to RNA-seq data.
      (version 3.15.0) using a strand-specific option (--ss rf).
      For Orciraptor agilis, about 392 million read pairs (30-50 million read pairs per sample) were generated. Rcorrector
      • Song L.
      • Florea L.
      Rcorrector: efficient and accurate error correction for Illumina RNA-seq reads.
      (version 1.0.4) was used to perform k-mer based error correction. Adapter and low-quality bases were trimmed with Trim Galore
      • Martin M.
      Cutadapt removes adapter sequences from high-throughput sequencing reads.
      (version 0.6.6). The processed reads were mapped to ribosomal sequences of the SILVA SSU r138.1 database for the groups “Orciraptor” and “Mougeotia”. This step was performed using bowtie2
      • Langmead B.
      • Salzberg S.L.
      Fast gapped-read alignment with Bowtie 2.
      (version 2.4.2) with the parameters --very-sensitive and --score-min C,0,0 and only unmapped reads were kept. These reads were then mapped to the Mougeotia transcriptome with bowtie2
      • Langmead B.
      • Salzberg S.L.
      Fast gapped-read alignment with Bowtie 2.
      (version 2.4.2) using default parameters. Reads that did not map to the algal transcriptome were used for the de novo assembly.
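
The two-step read filtering described above can be sketched as a pair of bowtie2 invocations. This is an illustrative reconstruction, not the authors' script: the file names, index names, thread count, and the use of `--un-conc-gz` to collect read pairs that did not align concordantly are assumptions, since the text only states that unmapped reads were kept. Only `--very-sensitive` and `--score-min C,0,0` are taken from the description.

```python
from typing import List


def bowtie2_filter_command(index: str, reads_1: str, reads_2: str,
                           unmapped_out: str, strict: bool = False,
                           threads: int = 4) -> List[str]:
    """Build a bowtie2 command that writes non-aligning read pairs to a file.

    strict=True adds the settings reported for the rRNA screen; strict=False
    leaves bowtie2 at its defaults, as reported for the algal screen.
    """
    cmd = [
        "bowtie2", "-p", str(threads),
        "-x", index,
        "-1", reads_1, "-2", reads_2,
        "--un-conc-gz", unmapped_out,  # assumption: how "unmapped" reads were kept
        "-S", "/dev/null",             # alignments themselves are not needed
    ]
    if strict:
        cmd += ["--very-sensitive", "--score-min", "C,0,0"]
    return cmd


# Step 1: screen against rRNA sequences (hypothetical index and file names).
rrna_cmd = bowtie2_filter_command(
    "silva_ssu_index", "sample_1.fq.gz", "sample_2.fq.gz",
    "no_rrna_%.fq.gz", strict=True)

# Step 2: screen the remaining reads against the algal transcriptome.
algae_cmd = bowtie2_filter_command(
    "mougeotia_index", "no_rrna_1.fq.gz", "no_rrna_2.fq.gz",
    "orciraptor_only_%.fq.gz")
```

Each list could be passed to `subprocess.run(cmd, check=True)` on a system where bowtie2 is installed; the sketch only builds the commands and does not execute them.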
      Since different de novo transcriptome assembly software and parameter settings can produce different results,
      • Hölzer M.
      • Marz M.
      De novo transcriptome assembly: a comprehensive cross-species comparison of short-read RNA-seq assemblers.
      the assembly of the Orciraptor agilis transcriptome was performed with two assembly tools: Trinity
      • Grabherr M.G.
      • Haas B.J.
      • Yassour M.
      • Levin J.Z.
      • Thompson D.A.
      • Amit I.
      • Adiconis X.
      • Fan L.
      • Raychowdhury R.
      • Zeng Q.
      • et al.
      Full-length transcriptome assembly from RNA-seq data without a reference genome.
      and rnaSPAdes.
      • Bushmanova E.
      • Antipov D.
      • Lapidus A.
      • Prjibelski A.D.
      rnaSPAdes: a de novo transcriptome assembler and its application to RNA-seq data.
      Both methods performed well according to the statistics shown in Figure S1B. The rnaSPAdes assembly yielded slightly longer ORFs, while the Trinity assembly resulted in a higher number of ORFs. As we found several artificially fused contigs (e.g. CAZymes fused with parts of ribosomal proteins) in the rnaSPAdes assembly, we performed all downstream analyses with the Trinity assembly. The one exception was that the rnaSPAdes assembly was used to extend an incomplete contig (GH5_5A, described in detail below).
      Assembly with Trinity: The filtered reads from all three life history conditions of Orciraptor agilis were pooled for a strand-specific (--SS_lib_type RF) de novo assembly using Trinity
      • Grabherr M.G.
      • Haas B.J.
      • Yassour M.
      • Levin J.Z.
      • Thompson D.A.
      • Amit I.
      • Adiconis X.
      • Fan L.
      • Raychowdhury R.
      • Zeng Q.
      • et al.
      Full-length transcriptome assembly from RNA-seq data without a reference genome.
      (version 2.0.6). The contigs were searched against the nt database to detect potential contaminants (task: megablast, version 2.10.1). Sequences resulting in hits with > 95% identity over a length of at least 100 nt that matched ribosomal, algal, bacterial, or viral sequences were removed from the assembly (in total 393 contigs). ORFs were predicted with TransDecoder (version 2.1.0). To remove redundancy in the assembly, all ORFs, as protein sequences, were first compared to each other using DIAMOND
      • Buchfink B.
      • Reuter K.
      • Drost H.G.
      Sensitive protein alignments at tree-of-life scale using DIAMOND.
      (version 2.0.11). Then, for each pair sharing >95% identity along >90% of the shorter ORF in the pair, the longer ORF was kept for further analyses.
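
The redundancy rule can be illustrated with a short sketch. It assumes that the all-versus-all comparison has already been reduced to tuples of (query, subject, percent identity, alignment length in residues) and that ORF lengths are known; the function name, the tuple layout, and the tie-breaking rule for ORFs of equal length are assumptions made for illustration only.

```python
from typing import Dict, Iterable, List, Tuple

# (query ORF, subject ORF, % identity, alignment length in residues)
Hit = Tuple[str, str, float, int]


def remove_redundant_orfs(lengths: Dict[str, int], hits: Iterable[Hit],
                          min_identity: float = 95.0,
                          min_coverage: float = 0.90) -> List[str]:
    """Drop the shorter ORF of every pair that exceeds both thresholds."""
    dropped = set()
    for query, subject, identity, aln_len in hits:
        if query == subject:
            continue  # ignore self-hits
        # Order the pair by length; equal lengths are ordered by identifier
        # (an arbitrary but deterministic choice made for this sketch).
        shorter, _longer = sorted((query, subject),
                                  key=lambda orf: (lengths[orf], orf))
        covers_shorter = aln_len > min_coverage * lengths[shorter]
        if identity > min_identity and covers_shorter:
            dropped.add(shorter)
    return sorted(orf for orf in lengths if orf not in dropped)


# Made-up example: orf2 is nearly identical to orf1 and is removed;
# orf3 aligns to orf1 over too little of its length and is kept.
lengths = {"orf1": 300, "orf2": 280, "orf3": 150}
hits = [
    ("orf1", "orf2", 98.0, 270),
    ("orf1", "orf3", 99.0, 100),
    ("orf2", "orf3", 90.0, 150),
]
kept = remove_redundant_orfs(lengths, hits)
```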
      Assembly with rnaSPAdes: The assembly was also performed with rnaSPAdes
      • Bushmanova E.
      • Antipov D.
      • Lapidus A.
      • Prjibelski A.D.
      rnaSPAdes: a de novo transcriptome assembler and its application to RNA-seq data.
      (version 3.14.1) using the strand-specific option (--ss rf). This transcriptome was filtered for a minimum contig size of 200 nt. To identify potential contaminants, the remaining contigs were compared to the nt database using blastn (task: megablast, version 2.10.1). Contigs resulting in hits with > 95% identity over a length of at least 100 nt that corresponded to ribosomal, algal, bacterial, or viral sequences were removed (in total 269 contigs).
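
The contaminant filter applied to both assemblies can be sketched as follows. The sketch assumes that each megablast hit has already been labeled with a source category; how hits were actually classified as ribosomal, algal, bacterial, or viral is not described here, so the category labels and the hit tuple layout are hypothetical.

```python
from typing import Iterable, List, Set, Tuple

# (contig id, % identity, alignment length in nt, category of the subject)
BlastHit = Tuple[str, float, int, str]

# Hypothetical labels for the four categories named in the text.
CONTAMINANT_CATEGORIES = {"ribosomal", "algal", "bacterial", "viral"}


def contaminant_contigs(hits: Iterable[BlastHit],
                        min_identity: float = 95.0,
                        min_length: int = 100) -> Set[str]:
    """Return contigs with at least one hit meeting all three criteria."""
    flagged = set()
    for contig, identity, aln_len, category in hits:
        if (identity > min_identity
                and aln_len >= min_length
                and category in CONTAMINANT_CATEGORIES):
            flagged.add(contig)
    return flagged


def filter_assembly(contigs: Iterable[str],
                    hits: Iterable[BlastHit]) -> List[str]:
    """Keep only contigs that were not flagged as contaminants."""
    flagged = contaminant_contigs(hits)
    return [c for c in contigs if c not in flagged]


# Made-up example hits.
example_hits = [
    ("c1", 99.2, 450, "bacterial"),   # flagged
    ("c2", 97.0, 80, "algal"),        # alignment too short
    ("c3", 93.5, 300, "viral"),       # identity too low
    ("c4", 98.0, 200, "other"),       # not a contaminant category
]
clean = filter_assembly(["c1", "c2", "c3", "c4", "c5"], example_hits)
```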

      Assembly statistics

      Transcriptome assembly statistics were obtained with the scripts “TrinityStats.pl” and “contig_ExN50_statistic.pl” from the Trinity
      • Grabherr M.G.
      • Haas B.J.
      • Yassour M.
      • Levin J.Z.
      • Thompson D.A.
      • Amit I.
      • Adiconis X.
      • Fan L.
      • Raychowdhury R.
      • Zeng Q.
      • et al.
      Full-length transcriptome assembly from RNA-seq data without a reference genome.
      toolkit utilities. The presence of single-copy orthologs was determined with BUSCO
      • Seppey M.
      • Manni M.
      • Zdobnov E.M.
      BUSCO: assessing genome assembly and annotation completeness.
      (version 4.0.6) for the lineage datasets “eukaryota_odb10” and “alveolata_odb10”. ORF statistics were obtained using the custom bash script transdecoder_count.sh.
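
For readers unfamiliar with contig length statistics, the plain N50 value can be computed as in the sketch below. This is a generic illustration and not a reimplementation of the Trinity scripts; in particular, the expression-weighted ExN50 statistic additionally requires expression estimates and is not covered here.

```python
from typing import Sequence


def n50(lengths: Sequence[int]) -> int:
    """Return the N50: the contig length L such that contigs of length >= L
    together contain at least half of all assembled bases."""
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


# Made-up contig lengths for illustration.
example_n50 = n50([2, 2, 2, 3, 3, 4, 8, 8])
```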

      Functional annotation

      The predicted ORF sequences of the Trinity assembly were compared to the UniProtKB/Swiss-Prot database (Release 2021_01) using DIAMOND (Buchfink et al., 2021; version 2.0.6). Furthermore, an InterProScan analysis (Blum et al., 2021; version 5.52-86.0) with lookup of corresponding pathway and Gene Ontology annotation was conducted (databases: CDD-3.18, Coils-2.2.1, Gene3D-4.3.0, Hamap-2020_05, MobiDBLite-2.0, PANTHER-15.0, Pfam-33.1, PIRSF-3.10, PIRSR-2021_02, PRINTS-42.0, ProSitePatterns-2021_01, ProSiteProfiles-2021_01, SFLD-4, SMART-7.1, SUPERFAMILY-1.75, TIGRFAM-15.0). EggNOG-mapper (Cantalapiedra et al., 2021; Huerta-Cepas et al., 2019; version 2.0.5) was used in DIAMOND and HMM mode for a functional annotation based on precomputed orthologous groups and phylogenies. The results from the DIAMOND-based annotation were kept and HMM hits were added for sequences that were not annotated using DIAMOND. Carbohydrate-active enzymes were annotated with dbCAN2 (Zhang et al., 2018; stand-alone version 3.0) in HMM mode using the dbCAN-HMMdb-V9 database and an E-value cut-off of 1 × 10⁻⁵. Transmembrane domains and signal peptides were predicted for selected protein sequences with the Phobius webserver (Käll et al., 2007; performed on the 15th of October 2021).

      Homology searches of flagellum-associated proteins

      A dataset of 60 flagellar toolkit proteins (Galindo et al., 2021; Torruella et al., 2015) was used to search for homologues in Orciraptor agilis. The proteins were identified with the best reverse BLAST hit method, applying an E-value cut-off of 1 × 10⁻⁵.
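
      The logic of a reverse (reciprocal) best hit search can be sketched as follows. The sketch assumes that "best" means the lowest E-value per query and that a match is accepted only if the forward and the reverse search point at each other; the text does not spell out these details. All identifiers and E-values in the example are hypothetical.

```python
def best_hits(hits, max_evalue=1e-5):
    """Map each query to its lowest-E-value subject.

    hits: iterable of (query, subject, evalue) tuples.
    Hits above the E-value cut-off are ignored.
    """
    best = {}
    for query, subject, evalue in hits:
        if evalue > max_evalue:
            continue
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return {query: subject for query, (subject, _) in best.items()}

def reciprocal_best_hits(forward, reverse, max_evalue=1e-5):
    """Keep query/subject pairs that are each other's best hit."""
    fwd = best_hits(forward, max_evalue)
    rev = best_hits(reverse, max_evalue)
    return {q: s for q, s in fwd.items() if rev.get(s) == q}

# Hypothetical hits: reference proteins vs. transcriptome, and back
forward = [
    ("IFT88", "orc_1", 1e-50),
    ("IFT88", "orc_2", 1e-10),
    ("BBS1", "orc_3", 1e-3),   # above the cut-off, ignored
    ("IFT52", "orc_4", 1e-20),
]
reverse = [
    ("orc_1", "IFT88", 1e-48),
    ("orc_4", "IFT172", 1e-30),  # orc_4 prefers a different protein
    ("orc_4", "IFT52", 1e-18),
]
pairs = reciprocal_best_hits(forward, reverse)
```

      Only the IFT88/orc_1 pair survives in this example, because orc_4 has a better reverse hit to a different reference protein.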

      Differential expression analysis

      The processed and filtered reads were mapped to the coding sequences obtained from the Trinity assembly with bowtie2 (Langmead and Salzberg, 2012; version 2.4.2). Transcript abundance was quantified with salmon (Patro et al., 2017; version 1.4.0) in alignment-based mode. The read counts were parsed with tximport (Soneson et al., 2015; version 1.18.0) to generate matrices containing counts and abundances (TPM). A pre-filtering step was applied that kept only contigs with CPM > 1 in two or more samples. Differential expression analysis was performed with DESeq2 (Love et al., 2014; version 1.30.0).
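
      The CPM-based pre-filter can be sketched as follows. The sketch assumes that CPM is computed against the total count of each sample (library size) and that all library sizes are non-zero; the function name and the counts are hypothetical and only illustrate the threshold logic.

```python
def cpm_filter(counts, min_cpm=1.0, min_samples=2):
    """Return contigs with CPM > min_cpm in at least min_samples samples.

    counts: dict mapping contig ID to a list of read counts, one per sample.
    """
    n_samples = len(next(iter(counts.values())))
    # Library size = total counts per sample (assumed non-zero)
    library_sizes = [
        sum(row[i] for row in counts.values()) for i in range(n_samples)
    ]
    kept = []
    for contig, row in counts.items():
        cpms = [1e6 * x / size for x, size in zip(row, library_sizes)]
        if sum(cpm > min_cpm for cpm in cpms) >= min_samples:
            kept.append(contig)
    return kept

# Hypothetical counts for three samples with 1,000,000 reads each
counts = {
    "contig_a": [999_990, 999_990, 999_990],
    "contig_b": [9, 9, 0],    # CPM 9, 9, 0 -> kept
    "contig_c": [1, 1, 10],   # CPM 1, 1, 10 -> only one sample above 1
}
kept = cpm_filter(counts)
```

      Note that the comparison is strict: a CPM of exactly 1 does not count towards the two required samples.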

      Expression profiles and gene ontology enrichment analysis

      Transcripts that were differentially expressed (|log2 fold change| ≥ 1, adjusted p-value < 0.001) in at least one pairwise comparison were clustered based on variance stabilising transformed counts. Hierarchical clustering was performed using Pearson correlation as the distance measure and complete linkage. The resulting dendrogram was cut at 80% of the maximum height, yielding five clusters of transcripts with similar expression patterns. Gene Ontology (GO) terms were retrieved by mapping the DIAMOND/Swiss-Prot annotation and merging them with the ones retrieved in the InterProScan analysis in Blast2GO (Götz et al., 2008; version 5.2.5). GO term enrichment analysis was performed with GOseq (Young et al., 2010; version 1.42.0). The sequence lengths required for the analysis were computed with the script “fasta_seq_length.pl” from the Trinity (Grabherr et al., 2011) toolkit utilities.
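
      Two building blocks of this step can be sketched in a few lines: the selection of differentially expressed transcripts and a correlation-based distance. The sketch assumes that the distance is taken as 1 minus the Pearson correlation, which the text does not specify, and that missing adjusted p-values are treated as not significant. The clustering and dendrogram cut themselves are not reproduced, and all values are hypothetical.

```python
import math

def is_differentially_expressed(results, min_abs_lfc=1.0, max_padj=0.001):
    """True if any pairwise comparison passes both thresholds.

    results: list of (log2_fold_change, adjusted_p) tuples, one per
    comparison; adjusted_p may be None when no value was reported.
    """
    return any(
        padj is not None and abs(lfc) >= min_abs_lfc and padj < max_padj
        for lfc, padj in results
    )

def pearson_distance(x, y):
    """Distance between two expression profiles as 1 - Pearson r (assumed form)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return 1.0 - cov / (sd_x * sd_y)

# Hypothetical results for one transcript across two comparisons
selected = is_differentially_expressed([(0.5, 1e-10), (-2.3, 1e-5)])
# Profiles with the same shape are close, opposite shapes are far apart
d_same = pearson_distance([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
d_opposite = pearson_distance([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

      Under this distance, transcripts with the same expression pattern but different absolute levels fall close together, which is the property needed to group transcripts by pattern.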

      Protein structure prediction and structure-based functional annotation

      The structural modelling of the respective protein domains was performed using the I-TASSER web server (Yang and Zhang, 2015). PyMOL (version 1.8.x) was used for the visualisation of protein structures.

      Extension of the GH5_5A contig

      In the Trinity assembly, the predicted ORF of the most highly expressed GH5_5A was incomplete and lacked a stop codon. Another ORF with 100% identity in the GH5_5 module was present, which might have been separated during assembly because of intronic sequences (see Figure S3A for predicted splicing sites). In the rnaSPAdes assembly, seven isoforms of the GH5_5A cellulase belonging to one gene cluster were identified. To represent all transcripts per gene cluster, superTranscripts were constructed using Lace (Davidson et al., 2017; version 1.14.1), reads were aligned to the superTranscriptome with STAR (Dobin et al., 2013; version 2.7.8a), and sequences were visualised in IGV (Robinson et al., 2011; version 2.9.2). The alignment was used in StringTie (Pertea et al., 2015; version 2.1.5) to assemble transcripts. Next, the sequences were extracted using gffread (Pertea and Pertea, 2020; version 0.12.2) and ORF prediction was performed with TransDecoder. Using this approach, a GH5_5A transcript was extracted that encoded a complete ORF with a length of 2249 amino acids. Read mapping and differential gene expression analysis were repeated with the Trinity assembly in which the GH5_5A contigs were replaced by the extended sequence. This quantification was used for Figures 3A and 3B.

      Phylogenetic analysis of GH5_5 domains

      The GH5_5 domain sequence of Orciraptor GH5_5A was used as a query to search for homologues in the non-redundant NCBI database, as well as the EukProt database (Richter et al., 2020). A multiple sequence alignment was created with MAFFT (Katoh and Standley, 2013; version 7.487) applying the L-INS-i method. The alignment was trimmed with trimAl (Capella-Gutiérrez et al., 2009; version 1.4.rev15) using the -automated1 setting. Identical sequences from the trimmed alignment were removed in Jalview (Waterhouse et al., 2009; version 2.11.1.4) using the “Remove Redundancy” function with a threshold of 100. The best-fit substitution model was determined to be Q.pfam+R5 by the ModelFinder (Kalyaanamoorthy et al., 2017) function of IQ-TREE (Minh et al., 2020; version 2.1.4-beta). A maximum likelihood tree was computed with IQ-TREE using this model and branch support values were calculated with UFBoot (Hoang et al., 2018) with 1000 bootstrap replicates.
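
      The removal of identical sequences from a trimmed alignment can be approximated with a short sketch. It treats two aligned sequences as redundant only if their strings are exactly identical (case-insensitive), which is a simplification; Jalview's redundancy calculation is based on pairwise percentage identity and may treat gaps differently. Sequence names and residues are hypothetical.

```python
def remove_identical(alignment):
    """Keep the first occurrence of each distinct aligned sequence.

    alignment: list of (name, aligned_sequence) tuples.
    """
    seen = set()
    unique = []
    for name, seq in alignment:
        key = seq.upper()
        if key in seen:
            continue
        seen.add(key)
        unique.append((name, seq))
    return unique

# Hypothetical trimmed alignment (not taken from the study)
alignment = [
    ("seq_A", "MKT-LLA"),
    ("seq_B", "MKT-LLA"),   # identical to seq_A, dropped
    ("seq_C", "MKTALLA"),
    ("seq_D", "mkt-lla"),   # identical apart from case, dropped
]
unique = remove_identical(alignment)
```

      Keeping the first occurrence makes the result depend on the input order, so a fixed sequence order is advisable if the step needs to be reproducible.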

      Data and code availability

      • RNA-seq data have been deposited at ArrayExpress and are publicly available as of the date of publication. Accession numbers are listed in the key resources table.
      • Transcriptome assemblies have been deposited at ENA and are publicly available as of the date of publication. Accession numbers are listed in the key resources table.
      • Peptide sequences, gene expression tables, functional annotation data, sequence alignments and phylogenetic trees (GH5_5) have been deposited at Zenodo and are publicly available as of the date of publication. DOIs are listed in the key resources table.
      • All original code has been deposited at GitHub and is publicly available as of the date of publication. DOIs are listed in the key resources table.

      Acknowledgments

      This work was funded by the German Research Foundation grants 283693520 and 417585753 (to S.H.). Funding for the work carried out in A.J.R.’s laboratory was provided by Discovery grant 2017-06792 from the Natural Sciences and Engineering Research Council (NSERC) of Canada. Work in A.G.B.S.’s laboratory was supported by NSERC Discovery grant 298366-2014. We thank Ruth Bruker (University of Cologne) for assistance with scanning electron microscopy.

      Author contributions

      Conceptualization, S.H.; investigation, J.V.G., T.H., and S.H.; writing – original draft, J.V.G., T.H., and S.H.; writing – review & editing, all authors; visualization, J.V.G. and S.H.; funding acquisition, A.G.B.S., A.J.R., and S.H.

      Declaration of interests

      The authors declare no competing interests.

      References

        • Hess S.
        • Melkonian M.
        The mystery of clade X: Orciraptor gen. nov. and Viridiraptor gen. nov. are highly specialised, algivorous amoeboflagellates (Glissomonadida, Cercozoa).
        Protist. 2013; 164: 706-747. https://doi.org/10.1016/j.protis.2013.07.003
        • Busch A.
        • Hess S.
        The cytoskeleton architecture of algivorous protoplast feeders (Viridiraptoridae, Rhizaria) indicates actin-guided perforation of prey cell walls.
        Protist. 2017; 168: 12-31. https://doi.org/10.1016/j.protis.2016.10.004
        • Ugolev A.M.
        Parietal (contact) digestion.
        Bull. Exp. Biol. Med. 1960; 49: 10-13
        • Nakahara H.
        • Howard L.
        • Thompson E.W.
        • Sato H.
        • Seiki M.
        • Yeh Y.
        • Chen W.T.
        Transmembrane/cytoplasmic domain-mediated membrane type 1-matrix metalloprotease docking to invadopodia is required for cell invasion.
        Proc. Natl. Acad. Sci. USA. 1997; 94: 7959-7964. https://doi.org/10.1073/pnas.94.15.7959
        • Marande W.
        • Kohl L.
        Flagellar kinesins in protists.
        Future Microbiol. 2011; 6: 231-246. https://doi.org/10.2217/fmb.10.167
        • Cavalier-Smith T.
        • Chao E.E.
        • Lewis R.
        Multigene phylogeny and cell evolution of chromist infrakingdom Rhizaria: contrasting cell organisation of sister phyla Cercozoa and Retaria.
        Protoplasma. 2018; 255: 1517-1574. https://doi.org/10.1007/s00709-018-1241-1
        • Patterson D.J.
        • Simpson A.G.B.
        Heterotrophic flagellates from coastal marine and hypersaline sediments in Western Australia.
        Eur. J. Protistol. 1996; 32: 423-448
        • Kashiyama Y.
        • Tamiaki H.
        Risk management by organisms of the phototoxicity of chlorophylls.
        Chem. Lett. 2014; 43: 148-156. https://doi.org/10.1246/cl.131005
        • Hess S.
        • Melkonian M.
        Ultrastructure of the algivorous amoeboflagellate Viridiraptor invadens (Glissomonadida, Cercozoa).
        Protist. 2014; 165: 605-635. https://doi.org/10.1016/j.protis.2014.07.004
        • Hotchkiss A.T.
        • Gretz M.R.
        • Hicks K.B.
        • Malcolm Brown R.
        The composition and phylogenetic significance of the Mougeotia (Charophyceae) cell wall.
        J. Phycol. 1989; 25: 646-654. https://doi.org/10.1111/j.0022-3646.1989.00646.x
        • Permann C.
        • Herburger K.
        • Niedermeier M.
        • Felhofer M.
        • Gierlinger N.
        • Holzinger A.
        Cell wall characteristics during sexual reproduction of Mougeotia sp. (Zygnematophyceae) revealed by electron microscopy, glycan microarrays and RAMAN spectroscopy.
        Protoplasma. 2021; 258: 1261-1275. https://doi.org/10.1007/s00709-021-01659-5
        • Markovic O.
        • Janecek S.
        Pectin degrading glycoside hydrolases of family 28: sequence-structural features, specificities and evolution.
        Protein Eng. 2001; 14: 615-631. https://doi.org/10.1093/protein/14.9.615
        • Jenkins J.
        • Shevchik V.E.
        • Hugouvieux-Cotte-Pattat N.
        • Pickersgill R.W.
        The crystal structure of pectate lyase Pel9A from Erwinia chrysanthemi.
        J. Biol. Chem. 2004; 279: 9139-9145. https://doi.org/10.1074/jbc.M311390200
        • Aspeborg H.
        • Coutinho P.M.
        • Wang Y.
        • Brumer 3rd, H.
        • Henrissat B.
        Evolution, substrate specificity and subfamily classification of glycoside hydrolase family 5 (GH5).
        BMC Evol. Biol. 2012; 12: 186. https://doi.org/10.1186/1471-2148-12-186
        • Yang J.
        • Yan R.
        • Roy A.
        • Xu D.
        • Poisson J.
        • Zhang Y.
        The I-TASSER Suite: protein structure and function prediction.
        Nat. Methods. 2015; 12: 7-8. https://doi.org/10.1038/nmeth.3213
        • Roy A.
        • Kucukural A.
        • Zhang Y.
        I-TASSER: a unified platform for automated protein structure and function prediction.
        Nat. Protoc. 2010; 5: 725-738. https://doi.org/10.1038/nprot.2010.5
        • Zhang Y.
        I-TASSER server for protein 3D structure prediction.
        BMC Bioinformatics. 2008; 9: 40. https://doi.org/10.1186/1471-2105-9-40
        • Van Petegem F.
        • Vandenberghe I.
        • Bhat M.K.
        • Van Beeumen J.
        Atomic resolution structure of the major endoglucanase from Thermoascus aurantiacus.
        Biochem. Biophys. Res. Commun. 2002; 296: 161-166. https://doi.org/10.1016/s0006-291x(02)00775-1
        • Delsaute M.
        • Berlemont R.
        • Dehareng D.
        • Van Elder D.
        • Galleni M.
        • Bauvois C.
        Three-dimensional structure of RBcel1, a metagenome-derived psychrotolerant family GH5 endoglucanase.
        Acta Crystallogr. Sect. F Struct. Biol. Cryst. Commun. 2013; 69: 828-833. https://doi.org/10.1107/S1744309113014565
        • Berlemont R.
        • Delsaute M.
        • Pipers D.
        • D'Amico S.
        • Feller G.
        • Galleni M.
        • Power P.
        Insights into bacterial cellulose biosynthesis by functional metagenomics on Antarctic soil samples.
        ISME J. 2009; 3: 1070-1081. https://doi.org/10.1038/ismej.2009.48
        • Chan W.S.
        • Kwok A.C.M.
        • Wong J.T.Y.
        Knockdown of dinoflagellate cellulose synthase CesA1 resulted in malformed intracellular cellulosic thecal plates and severely impeded cyst-to-swarmer transition.
        Front. Microbiol. 2019; 10: 546. https://doi.org/10.3389/fmicb.2019.00546
        • Burki F.
        • Roger A.J.
        • Brown M.W.
        • Simpson A.G.B.
        The new tree of eukaryotes.
        Trends Ecol. Evol. 2020; 35: 43-55. https://doi.org/10.1016/j.tree.2019.08.008
        • Sibbald S.J.
        • Archibald J.M.
        More protist genomes needed.
        Nat. Ecol. Evol. 2017; 1: 145. https://doi.org/10.1038/s41559-017-0145
        • Jiao C.
        • Sørensen I.
        • Sun X.
        • Sun H.
        • Behar H.
        • Alseekh S.
        • Philippe G.
        • Palacio Lopez K.
        • Sun L.
        • Reed R.
        • et al.
        The Penium margaritaceum genome: hallmarks of the origins of land plants.
        Cell. 2020; 181: 1097-1111.e12. https://doi.org/10.1016/j.cell.2020.04.019
        • Fujimoto Z.
        Structure and function of carbohydrate-binding module families 13 and 42 of glycoside hydrolases, comprising a beta-trefoil fold.
        Biosci. Biotechnol. Biochem. 2013; 77: 1363-1371. https://doi.org/10.1271/bbb.130183
        • de Jonge R.
        • van Esse H.P.
        • Kombrink A.
        • Shinya T.
        • Desaki Y.
        • Bours R.
        • van der Krol S.
        • Shibuya N.
        • Joosten M.H.
        • Thomma B.P.
        Conserved fungal LysM effector Ecp6 prevents chitin-triggered immunity in plants.
        Science. 2010; 329: 953-955. https://doi.org/10.1126/science.1190859
        • Abramyan J.
        • Stajich J.E.
        Species-specific chitin-binding module 18 expansion in the amphibian pathogen Batrachochytrium dendrobatidis.
        mBio. 2012; 3: e00150-12. https://doi.org/10.1128/mBio.00150-12
        • Kombrink A.
        • Sánchez-Vallet A.
        • Thomma B.P.
        The role of chitin detection in plant-pathogen interactions.
        Microbes Infect. 2011; 13: 1168-1176. https://doi.org/10.1016/j.micinf.2011.07.010
        • Hu S.P.
        • Li J.J.
        • Dhar N.
        • Li J.P.
        • Chen J.Y.
        • Jian W.
        • Dai X.F.
        • Yang X.Y.
        Lysin motif (LysM) proteins: interlinking manipulation of plant immunity and fungi.
        Int. J. Mol. Sci. 2021; 22. https://doi.org/10.3390/ijms22063114
        • Harris P.V.
        • Welner D.
        • McFarland K.C.
        • Re E.
        • Navarro Poulsen J.C.
        • Brown K.
        • Salbo R.
        • Ding H.
        • Vlasenko E.
        • Merino S.
        • et al.
        Stimulation of lignocellulosic biomass hydrolysis by proteins of glycoside hydrolase family 61: structure and function of a large, enigmatic family.
        Biochemistry. 2010; 49: 3305-3316. https://doi.org/10.1021/bi100009p
        • Vaaje-Kolstad G.
        • Westereng B.
        • Horn S.J.
        • Liu Z.
        • Zhai H.
        • Sørlie M.
        • Eijsink V.G.
        An oxidative enzyme boosting the enzymatic conversion of recalcitrant polysaccharides.
        Science. 2010; 330: 219-222. https://doi.org/10.1126/science.1192231
        • Hemsworth G.R.
        • Henrissat B.
        • Davies G.J.
        • Walton P.H.
        Discovery and characterization of a new family of lytic polysaccharide monooxygenases.
        Nat. Chem. Biol. 2014; 10: 122-126. https://doi.org/10.1038/nchembio.1417
        • Yang J.
        • Roy A.
        • Zhang Y.
        Protein-ligand binding site recognition using complementary binding-specific substructure comparison and sequence profile alignment.
        Bioinformatics. 2013; 29: 2588-2595. https://doi.org/10.1093/bioinformatics/btt447
        • Roy A.
        • Yang J.
        • Zhang Y.
        COFACTOR: an accurate comparative algorithm for structure-based protein function annotation.
        Nucleic Acids Res. 2012; 40: W471-W477. https://doi.org/10.1093/nar/gks372
        • Roy A.
        • Zhang Y.
        Recognizing protein-ligand binding sites by global structural alignment and local geometry refinement.
        Structure. 2012; 20: 987-997. https://doi.org/10.1016/j.str.2012.03.009
        • Zhang C.
        • Freddolino P.L.
        • Zhang Y.
        COFACTOR: improved protein function prediction by combining structure, sequence and protein-protein interaction information.
        Nucleic Acids Res. 2017; 45: W291-W299. https://doi.org/10.1093/nar/gkx366
        • Aachmann F.L.
        • Sørlie M.
        • Skjåk-Bræk G.
        • Eijsink V.G.
        • Vaaje-Kolstad G.
        NMR structure of a lytic polysaccharide monooxygenase provides insight into copper binding, protein dynamics, and substrate interactions.
        Proc. Natl. Acad. Sci. USA. 2012; 109: 18779-18784. https://doi.org/10.1073/pnas.1208822109
        • Forsberg Z.
        • Sørlie M.
        • Petrović D.
        • Courtade G.
        • Aachmann F.L.
        • Vaaje-Kolstad G.
        • Bissaro B.
        • Røhr Å.K.
        • Eijsink V.G.
        Polysaccharide degradation by lytic polysaccharide monooxygenases.
        Curr. Opin. Struct. Biol. 2019; 59: 54-64. https://doi.org/10.1016/j.sbi.2019.02.015
        • Sabbadin F.
        • Urresti S.
        • Henrissat B.
        • Avrova A.O.
        • Welsh L.R.J.
        • Lindley P.J.
        • Csukai M.
        • Squires J.N.
        • Walton P.H.
        • Davies G.J.
        • et al.
        Secreted pectin monooxygenases drive plant infection by pathogenic oomycetes.
        Science. 2021; 373: 774-779. https://doi.org/10.1126/science.abj1342
        • Labourel A.
        • Frandsen K.E.H.
        • Zhang F.
        • Brouilly N.
        • Grisel S.
        • Haon M.
        • Ciano L.
        • Ropartz D.
        • Fanuel M.
        • Martin F.
        • et al.
        A fungal family of lytic polysaccharide monooxygenase-like copper proteins.
        Nat. Chem. Biol. 2020; 16: 345-350. https://doi.org/10.1038/s41589-019-0438-8
        • Tingley J.P.
        • Low K.E.
        • Xing X.
        • Abbott D.W.
        Combined whole cell wall analysis and streamlined in silico carbohydrate-active enzyme discovery to improve biocatalytic conversion of agricultural crop residues.
        Biotechnol. Biofuels. 2021; 14: 16. https://doi.org/10.1186/s13068-020-01869-8
        • Artzi L.
        • Bayer E.A.
        • Moraïs S.
        Cellulosomes: bacterial nanomachines for dismantling plant polysaccharides.
        Nat. Rev. Microbiol. 2017; 15: 83-95. https://doi.org/10.1038/nrmicro.2016.164
        • Song L.
        • Florea L.
        Rcorrector: efficient and accurate error correction for Illumina RNA-seq reads.
        GigaScience. 2015; 4: 48. https://doi.org/10.1186/s13742-015-0089-y
        • Bushmanova E.
        • Antipov D.
        • Lapidus A.
        • Prjibelski A.D.
        rnaSPAdes: a de novo transcriptome assembler and its application to RNA-seq data.
        GigaScience. 2019; 8. https://doi.org/10.1093/gigascience/giz100
        • Langmead B.
        • Salzberg S.L.
        Fast gapped-read alignment with Bowtie 2.
        Nat. Methods. 2012; 9: 357-359. https://doi.org/10.1038/nmeth.1923
        • Grabherr M.G.
        • Haas B.J.
        • Yassour M.
        • Levin J.Z.
        • Thompson D.A.
        • Amit I.
        • Adiconis X.
        • Fan L.
        • Raychowdhury R.
        • Zeng Q.
        • et al.
        Full-length transcriptome assembly from RNA-seq data without a reference genome.
        Nat. Biotechnol. 2011; 29: 644-652. https://doi.org/10.1038/nbt.1883
        • Camacho C.
        • Coulouris G.
        • Avagyan V.
        • Ma N.
        • Papadopoulos J.
        • Bealer K.
        • et al.
        BLAST+: architecture and applications.
        BMC Bioinformatics. 2009; 10: 421. https://doi.org/10.1186/1471-2105-10-421
        • Buchfink B.
        • Reuter K.
        • Drost H.G.
        Sensitive protein alignments at tree-of-life scale using DIAMOND.
        Nat. Methods. 2021; 18: 366-368. https://doi.org/10.1038/s41592-021-01101-x
        • Seppey M.
        • Manni M.
        • Zdobnov E.M.
        BUSCO: assessing genome assembly and annotation completeness.
        Methods Mol. Biol. 2019; 1962: 227-245. https://doi.org/10.1007/978-1-4939-9173-0_14
        • Blum M.
        • Chang H.Y.
        • Chuguransky S.
        • Grego T.
        • Kandasaamy S.
        • Mitchell A.
        • Nuka G.
        • Paysan-Lafosse T.
        • Qureshi M.
        • Raj S.
        • et al.
        The InterPro protein families and domains database: 20 years on.
        Nucleic Acids Res. 2021; 49: D344-D354. https://doi.org/10.1093/nar/gkaa977
        • Cantalapiedra C.P.
        • Hernández-Plaza A.
        • Letunic I.
        • Bork P.
        • Huerta-Cepas J.
        eggNOG-mapper v2: functional annotation, orthology assignments, and domain prediction at the metagenomic scale.
        Mol. Biol. Evol. 2021; 38: 5825-5829
        • Zhang H.
        • Yohe T.
        • Huang L.
        • Entwistle S.
        • Wu P.
        • Yang Z.
        • Busk P.K.
        • Xu Y.
        • Yin Y.
        dbCAN2: a meta server for automated carbohydrate-active enzyme annotation.
        Nucleic Acids Res. 2018; 46: W95-W101. https://doi.org/10.1093/nar/gky418
        • Käll L.
        • Krogh A.
        • Sonnhammer E.L.
        Advantages of combined transmembrane topology and signal peptide prediction—the Phobius web server.
        Nucleic Acids Res. 2007; 35: W429-W432. https://doi.org/10.1093/nar/gkm256
        • Patro R.
        • Duggal G.
        • Love M.I.
        • Irizarry R.A.
        • Kingsford C.
        Salmon provides fast and bias-aware quantification of transcript expression.
        Nat. Methods. 2017; 14: 417-419. https://doi.org/10.1038/nmeth.4197
        • Soneson C.
        • Love M.I.
        • Robinson M.D.
        Differential analyses for RNA-seq: transcript-level estimates improve gene-level inferences.
        F1000Res. 2015; 4: 1521. https://doi.org/10.12688/f1000research.7563.2
        • Love M.I.
        • Huber W.
        • Anders S.
        Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2.
        Genome Biol. 2014; 15: 550. https://doi.org/10.1186/s13059-014-0550-8
        • Young M.D.
        • Wakefield M.J.
        • Smyth G.K.
        • Oshlack A.
        Gene ontology analysis for RNA-seq: accounting for selection bias.
        Genome Biol. 2010; 11: R14. https://doi.org/10.1186/gb-2010-11-2-r14
        • Yang J.
        • Zhang Y.
        I-TASSER server: new development for protein structure and function predictions.
        Nucleic Acids Res. 2015; 43: W174-W181. https://doi.org/10.1093/nar/gkv342
        • Davidson N.M.
        • Hawkins A.D.K.
        • Oshlack A.
        SuperTranscripts: a data driven reference for analysis and visualisation of transcriptomes.
        Genome Biol. 2017; 18: 148. https://doi.org/10.1186/s13059-017-1284-1
        • Dobin A.
        • Davis C.A.
        • Schlesinger F.
        • Drenkow J.
        • Zaleski C.
        • Jha S.
        • Batut P.
        • Chaisson M.
        • Gingeras T.R.
        STAR: ultrafast universal RNA-seq aligner.
        Bioinformatics. 2013; 29: 15-21. https://doi.org/10.1093/bioinformatics/bts635
        • Pertea M.
        • Pertea G.M.
        • Antonescu C.M.
        • Chang T.C.
        • Mendell J.T.
        • Salzberg S.L.
        StringTie enables improved reconstruction of a transcriptome from RNA-seq reads.
        Nat. Biotechnol. 2015; 33: 290-295. https://doi.org/10.1038/nbt.3122
        • Pertea G.
        • Pertea M.
        GFF Utilities: GffRead and GffCompare.
        F1000Res. 2020; 9: 9. https://doi.org/10.12688/f1000research.23297.2
        • Katoh K.
        • Standley D.M.
        MAFFT multiple sequence alignment software version 7: improvements in performance and usability.
        Mol. Biol. Evol. 2013; 30: 772-780. https://doi.org/10.1093/molbev/mst010
        • Capella-Gutiérrez S.
        • Silla-Martínez J.M.
        • Gabaldón T.
        trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses.
        Bioinformatics. 2009; 25: 1972-1973. https://doi.org/10.1093/bioinformatics/btp348
        • Waterhouse A.M.
        • Procter J.B.
        • Martin D.M.
        • Clamp M.
        • Barton G.J.
        Jalview version 2—a multiple sequence alignment editor and analysis workbench.
        Bioinformatics. 2009; 25: 1189-1191. https://doi.org/10.1093/bioinformatics/btp033
        • Minh B.Q.
        • Schmidt H.A.
        • Chernomor O.
        • Schrempf D.
        • Woodhams M.D.
        • von Haeseler A.
        • Lanfear R.
        IQ-TREE 2: new models and efficient methods for phylogenetic inference in the genomic era.
        Mol. Biol. Evol. 2020; 37: 1530-1534. https://doi.org/10.1093/molbev/msaa015
        • Martin M.
        Cutadapt removes adapter sequences from high-throughput sequencing reads.
        EMBnet J. 2011; 17: 10-12. https://doi.org/10.14806/ej.17.1.200
        • Krueger F.
        Trim Galore.
        2022
        • Hölzer M.
        • Marz M.
        De novo transcriptome assembly: a comprehensive cross-species comparison of short-read RNA-seq assemblers.
        GigaScience. 2019; 8: giz039. https://doi.org/10.1093/gigascience/giz039
        • Haas B.
        TransDecoder.
        2022
        • Huerta-Cepas J.
        • Szklarczyk D.
        • Heller D.
        • Hernández-Plaza A.
        • Forslund S.K.
        • Cook H.
        • Mende D.R.
        • Letunic I.
        • Rattei T.
        • Jensen L.J.
        • et al.
        eggNOG 5.0: a hierarchical, functionally and phylogenetically annotated orthology resource based on 5090 organisms and 2502 viruses.
        Nucleic Acids Res. 2019; 47: D309-D314. https://doi.org/10.1093/nar/gky1085
        • Galindo L.J.
        • López-García P.
        • Torruella G.
        • Karpov S.
        • Moreira D.
        Phylogenomics of a new fungal phylum reveals multiple waves of reductive evolution across Holomycota.
        Nat. Commun. 2021; 12: 4973. https://doi.org/10.1038/s41467-021-25308-w
        • Torruella G.
        • de Mendoza A.
        • Grau-Bové X.
        • Antó M.
        • Chaplin M.A.
        • del Campo J.
        • Eme L.
        • Pérez-Cordón G.
        • Whipps C.M.
        • Nichols K.M.
        • et al.
        Phylogenomics reveals convergent evolution of lifestyles in close relatives of animals and fungi.
        Curr. Biol. 2015; 25: 2404-2410. https://doi.org/10.1016/j.cub.2015.07.053
        • Götz S.
        • García-Gómez J.M.
        • Terol J.
        • Williams T.D.
        • Nagaraj S.H.
        • Nueda M.J.
        • Robles M.
        • Talón M.
        • Dopazo J.
        • Conesa A.
        High-throughput functional annotation and data mining with the Blast2GO suite.
        Nucleic Acids Res. 2008; 36: 3420-3435. https://doi.org/10.1093/nar/gkn176
        • Robinson J.T.
        • Thorvaldsdóttir H.
        • Winckler W.
        • Guttman M.
        • Lander E.S.
        • Getz G.
        • Mesirov J.P.
        Integrative genomics viewer.
        Nat. Biotechnol. 2011; 29: 24-26. https://doi.org/10.1038/nbt.1754
        • Richter D.J.
        • Berney C.
        • Strassert J.F.H.
        • Burki F.
        • de Vargas C.
        EukProt: a database of genome-scale predicted proteins across the diversity of eukaryotic life.
        Preprint at bioRxiv. 2020; https://doi.org/10.1101/2020.06.30.180687
        • Kalyaanamoorthy S.
        • Minh B.Q.
        • Wong T.K.F.
        • von Haeseler A.
        • Jermiin L.S.
        ModelFinder: fast model selection for accurate phylogenetic estimates.
        Nat. Methods. 2017; 14: 587-589. https://doi.org/10.1038/nmeth.4285
        • Hoang D.T.
        • Chernomor O.
        • von Haeseler A.
        • Minh B.Q.
        • Vinh L.S.
        UFBoot2: improving the ultrafast bootstrap approximation.
        Mol. Biol. Evol. 2018; 35: 518-522. https://doi.org/10.1093/molbev/msx281