
Putative horizontal gene transfer events point to candidate genes involved in cross-species neuromodulation

We are interested in neuroactive metabolites that influence animal behavior. Some fungi have horizontally transferred neuroactive metabolite pathways between species. We used a horizontal gene transfer detection pipeline to screen for novel fungal genes tied to neuroactivity.
Published on Jul 20, 2023

Purpose

Naturally produced neuroactive metabolites play interesting roles in the interactions of bacteria, plants, and fungi with other organisms in their environments. We set out to identify novel fungal genes involved in production of neuroactive metabolites that influence other organisms. Fungal neuroactive metabolites have diverse ecological functions that may confer a survival advantage. There are previous examples of fungal horizontal gene transfer (HGT) of genes involved in neuroactive metabolite production, and HGT is known to impact the evolution of gene clusters responsible for production of toxins and other bioactive molecules in fungi [1][2][3][4][5][6].

We were curious if detecting putative HGT events within fungi that are known to produce neuroactive metabolites would point us to novel biosynthetic gene clusters. To do this, we used an in-house HGT detection pipeline to computationally search a small set of fungal genomes for HGT events [7]. We identified three putative HGT events between bacteria and insect-infecting fungi that have potential ties to neuroactive metabolite production. We also identified 160 other putative inter-kingdom HGT events across 11 fungal genera.

Though we don’t plan to go deeper at the moment, we think that this data may be interesting for researchers investigating inter-kingdom HGT or fungal ecology and metabolite production, especially for fungi involved in behavioral manipulation. We have only scratched the surface, so we hope others can use this as a jumping-off point!

  • All associated code and data are available in this GitHub repository, including the fungal genomes we analyzed and a results table of candidate HGT events.

Share your thoughts!

Watch a video tutorial on making a PubPub account and commenting. Please feel free to add line-by-line comments anywhere within this text, provide overall feedback by commenting in the box at the bottom of the page, or use the URL for this page in a tweet about this work. Please make all feedback public so other readers can benefit from the discussion.

We’ve put this effort on ice! 🧊

#StrategicMisalignment #TranslationalMismatch

We want to prioritize outcomes that are translationally actionable in the near term. While this type of analysis can generate predictions about pathways related to neuromodulation, the next steps of predicting the products and their neurological effects are extremely challenging. While we could invest in hiring and infrastructure to push this forward, we ultimately decided it wouldn’t be the best use of our resources at this time.

Learn more about the Icebox and the different reasons we ice projects.

Background and goals

A wide range of organisms, including plants, fungi, and bacteria, produce neuroactive molecules that can influence the behavior of other species. Many of these neuroactive molecules are structurally similar to neurotransmitters like serotonin [8]. Binding of these molecules to neurotransmitter receptors in animal brains can lead to changes in the activity of specific neural circuits and can result in altered behaviors, perceptions, emotions, and thoughts. While we do not fully understand the ecological roles of many of these neuromodulators, they are thought to have diverse functions that benefit the organism that produces them, such as inhibiting the growth of competing species, deterring herbivory, or influencing behavior to facilitate spore dispersal [9][10][11]. For example, psilocybin (the hallucinogen in “magic mushrooms”) is a well-known psychoactive molecule that may influence insect behavior [6].

Discovery of natural neuroactive molecules often relies on exploring specific systems that include both the source of the molecules and the organism whose behavior changes (e.g. plants used to influence human behavior in traditional ceremonies). We aimed to identify a greater diversity of neuromodulators without depending on known molecules or biosynthetic genes as a starting point. Our goal was to devise a strategy to screen for genes involved in biosynthesis of novel neuroactive metabolites.

After considering several strategies, we chose to use horizontal gene transfer (HGT) to point us to potential neuroactive metabolite gene clusters, starting from a small number of fungal genomes. The rationale for this approach includes the high estimate of HGT in fungal genomes and prior examples of the influence of horizontal transfer on the evolution of gene clusters responsible for toxin and neuroactive molecule production in fungi [1][2][3][4]. It is estimated that HGT has impacted 0.1–2.8 percent of genes in a typical fungal genome [1][12][13]. In one example, phylogenetic analysis suggests that the genes underlying the first biosynthesis steps of neuroactive ergot alkaloid production were horizontally transferred to ergot fungi from another fungal class [5]. Previous studies have also shown that the biosynthetic gene cluster for psilocybin has been horizontally transferred among fungal species [6]. We therefore wanted to see if we could use HGT to more broadly screen for these types of genes.

SHOW ME THE DATA: View our list of fungal proteins potentially acquired through HGT.

The approach

We used our preHGT pipeline [7] to search for horizontally-transferred genes in fungi that might be related to neuroactive molecule production. HGT can enable the rapid transfer of beneficial traits between organisms. While screening for HGT is not expected to be specific to, or globally enriched for, neuromodulation-related genes, we selected this approach because (1) it was starting from a point that was agnostic to known neuroactive molecules and therefore might point us to more “novel” genes, (2) it was a computational experiment that we could perform very quickly for a small test data set based on available in-house tools, (3) there are some known examples of genes related to neuromodulation being horizontally transferred [5][6], and (4) based on the known roles of neuroactive metabolites in chemical defense and behavioral manipulation, it is reasonable to hypothesize that such genes would provide a fitness advantage to the recipient and thus be selected for and maintained [2].

Genome contamination is a major source of false positives when using a BLAST-based HGT screen. As described in [7], we’ve tried to minimize false positives due to contamination by flagging events that are on very short contigs, that have very high sequence identity, or that are only present in a single genome within the pangenome. We also combat other false positives by filtering out events with a query coverage of less than 70% when the match’s bit score is low. Using these methods, we searched for inter-kingdom HGT using publicly available genomes for 12 fungal genera, and then looked for evidence of potential ties to production of neuroactive molecules among the putatively transferred genes.
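As a rough illustration of the flagging and filtering logic described above, here is a minimal Python sketch. It is not taken from preHGT: the `Candidate` fields, the function names, and every cutoff except the 70% query coverage threshold are hypothetical placeholders chosen for the example.

```python
from dataclasses import dataclass

# All cutoffs below except MIN_QUERY_COVERAGE are illustrative assumptions,
# not values taken from preHGT.
MIN_CONTIG_LENGTH = 5_000     # bp; assumed meaning of "very short contig"
MAX_PERCENT_IDENTITY = 98.0   # assumed meaning of "very high sequence identity"
LOW_BITSCORE = 100.0          # assumed meaning of a "low" bit score
MIN_QUERY_COVERAGE = 70.0     # percent; the only cutoff stated in the text


@dataclass
class Candidate:
    contig_length: int        # length of the acceptor contig, in bp
    percent_identity: float   # identity to the best donor hit
    genomes_with_gene: int    # genomes in the pangenome that carry the gene
    query_coverage: float     # percent of the query covered by the hit
    bitscore: float           # bit score of the best donor hit


def contamination_flags(c: Candidate) -> list:
    """Return the reasons a candidate looks like possible contamination."""
    flags = []
    if c.contig_length < MIN_CONTIG_LENGTH:
        flags.append("short_contig")
    if c.percent_identity >= MAX_PERCENT_IDENTITY:
        flags.append("high_identity")
    if c.genomes_with_gene <= 1:
        flags.append("single_genome")
    return flags


def passes_coverage_filter(c: Candidate) -> bool:
    """Reject hits that have both a low bit score and under 70% query coverage."""
    return not (c.bitscore < LOW_BITSCORE and c.query_coverage < MIN_QUERY_COVERAGE)
```

In this sketch the contamination checks only flag a candidate for manual review, while the coverage check drops it outright. That split mirrors the wording above, but it is an assumption about how the two steps are combined.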

Selecting genomes

We selected 12 fungal genera (Figure 1) to include in this initial experiment based on either (1) known production of neuroactive molecules in the genus (Psilocybe, Amanita, Claviceps, Epichloë), (2) ease of obtaining physical cultures for potential downstream experimental work (Ganoderma, Hericium, Fomitopsis, Pleurotus, Agrocybe), or (3) involvement in a close relationship with arthropods (Cordyceps, Ophiocordyceps, Termitomyces). Cordyceps/Ophiocordyceps genera contain fungi that are known to manipulate arthropod behavior, potentially through secreted fungal factors that can act on arthropod neural circuits [11]. While Termitomyces fungi are not known to manipulate behavior, we thought that the existence of an extended symbiotic relationship with social arthropods (termites) may be favorable to the evolution of neuroactive molecule production.

Figure 1

Fungal genera that we selected for HGT analysis.

Applying the HGT detection pipeline, “preHGT”

We used an in-house pipeline, preHGT (described in detail in [7]), to scan genus-level pangenomes for recent inter-kingdom HGT events using a BLAST-based taxonomic approach. Briefly, we collected all genomes with gene models that were publicly available on NCBI for these 12 fungal genera — this consisted of 103 fungal genomes [14][15]. You can find the genomes we used as inputs here. We created nucleotide pangenomes at the genus level by clustering genes at 90% length and sequence identity to determine the unique set of genes for the genus. We then chose a representative sequence for each cluster by selecting the sequence with the most alignments. We used BLASTp to compare representative proteins to a clustered non-redundant (nr) database, inspired by NCBI’s experimental ClusteredNR [16]. The clustered database [17] let us capture more taxonomically diverse hits. We then used an “alien index” to calculate the difference between the e-value of the best non-fungal hit and the best fungal hit [18]. If the best non-fungal hit e-value is closer to zero than the best fungal hit, the alien index will be positive. We used the following previously published thresholds for determining HGT events: an alien index >0 as possible HGT, >15 as likely HGT, and >45 as highly likely [18]. We performed ortholog annotation (KEGG, PFAM, and viral and biosynthetic gene clusters) for fungal proteins that may have been horizontally transferred. The pipeline produced a table of possible HGT events, including the predicted donor and acceptor taxa, alien index and BLASTp values, ortholog annotations, and genomic location information for this protein in the fungal ‘acceptor’ genome.
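To make the alien index scoring concrete, here is a minimal Python sketch of one common formulation: the natural log of the best in-group e-value minus the natural log of the best out-group e-value, with a small constant added so that an e-value of zero can be log-transformed. The constant `1e-200` and the function names are assumptions for illustration, and preHGT's exact implementation may differ. The thresholds are the ones listed above.

```python
import math

# Small constant added to each e-value so that an e-value of 0 can be
# log-transformed. The value 1e-200 is an assumption based on the commonly
# used formulation, not a value confirmed from preHGT.
EPSILON = 1e-200


def alien_index(best_fungal_evalue: float, best_nonfungal_evalue: float) -> float:
    """Positive when the best non-fungal hit has the smaller (better) e-value."""
    return math.log(best_fungal_evalue + EPSILON) - math.log(best_nonfungal_evalue + EPSILON)


def classify(ai: float) -> str:
    """Apply the thresholds from the text: >0 possible, >15 likely, >45 highly likely."""
    if ai > 45:
        return "highly likely HGT"
    if ai > 15:
        return "likely HGT"
    if ai > 0:
        return "possible HGT"
    return "no HGT signal"


# The three alien index values reported in Table 1.
for ai in (4.3, 6.8, 2.5):
    print(ai, classify(ai))
```

Under these thresholds, all three alien index values reported in Table 1 fall in the "possible HGT" range.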

All code and associated data are available in this GitHub repository (DOI: 10.5281/zenodo.8148467), including the fungal genomes we analyzed and a results table of candidate HGT events.

Additional methods

We used Notion AI to suggest wording ideas and streamline/clarify content, and then edited the AI-generated text. We used Geneious Prime software (version 2023.1.1) to look at the genomic context of putatively transferred genes. We pulled the Cordyceps sp. RAO-2017 and Ophiocordyceps kimflemingiae contigs described in detail below from GenBank accessions NJEV00000000 and LAZP00000000. We used the UniProt Knowledgebase to look at protein annotations [19]. We used antiSMASH v7.0 to investigate the Cordyceps polyketide region [20].

The results


The BLAST-based approach to looking for HGT in the 11 fungal pangenomes (excluding Pleurotus) yielded 163 genes that were possible, likely, or highly likely HGT based on our scoring metrics (Figure 2). From Pleurotus, we predicted an additional 289 genes. Of these, 156 were predicted to be transfers between Pleurotus and Salix suchowensis (Suchow willow). We were suspicious that these events might be a product of genome contamination. Preliminary investigation showed that many of the transferred genes were present in multiple Pleurotus genomes, that these genes were embedded within larger genomic contigs, and that there were many genomically co-localized clusters of transferred genes. These observations suggest that these putative events are not purely contamination artifacts. However, we decided not to further investigate these events for the time being.

Figure 2

Number of candidate HGT events per fungal genus, colored by putative donor taxonomy.

For the 163 genes that we predicted outside of Pleurotus, we manually scanned the functional annotations to identify functions that might be of interest for production of secondary metabolites or specifically neuroactive molecules. Many of the proteins do not have clear functional annotations, making interpretation challenging. However, three proteins with annotations stood out initially: an aminoglycoside 3-N-acetyltransferase [AAC(3)], a phospholipase D, and a polyketide synthase. These three HGT events occurred between actinomycete bacteria and either Cordyceps (two events) or Ophiocordyceps (one event) fungi. Actinomycetes are gram-positive bacteria that are well-known for their production of diverse bioactive secondary metabolites, including antibiotics [21]. As mentioned previously, we included Cordyceps and Ophiocordyceps because they are known to participate in neuromodulation of arthropods.

Below, we discuss why we found these three hits most intriguing and talk through other interesting genes that we saw in their genomic neighborhoods.

Candidate 1: An AAC(3) in Ophiocordyceps kimflemingiae

Our pipeline predicts that the AAC(3) hit was transferred from the actinomycete Saccharothrix sp. CB00851 to the zombie-ant fungus O. kimflemingiae. Aminoglycoside 3-N-acetyltransferases from bacteria are often found on mobile genetic elements and are associated with antibiotic resistance, as they can transfer an acetyl group to aminoglycoside antibiotics and render them ineffective [22].

In the O. kimflemingiae genome, the gene encoding this protein is on a 16 kb contig, with four upstream genes and two downstream genes (Figure 3, A). Most genes on this contig contain introns and are annotated as hypothetical proteins, although some do have functional predictions in UniProt. Three of these genes (Ophio1_1|g5854–g5856), including the AAC(3) of interest, were shown to be upregulated in O. kimflemingiae during behavioral manipulation of ants [23]. The other two proteins are related to an indoleamine 2,3-dioxygenase and a kynurenine 3-monooxygenase. Both of these enzymes are tied to the kynurenine pathway of tryptophan degradation and the formation of the neurotoxic metabolite 3-hydroxykynurenine. Dysregulation of the kynurenine pathway is associated with nervous system disorders [24]. Another enzyme in this pathway, kynurenine formamidase, was also previously observed to be upregulated in O. kimflemingiae during behavioral manipulation, suggesting that this fungal gene region may be involved in fungal-host interactions [23].

Figure 3

Genomic contexts of hits that may be associated with neuroactive metabolite production.

(A) AAC(3) in Ophiocordyceps kimflemingiae.

(B) Polyketide synthase and phospholipase in Cordyceps.

GFF-formatted annotated sequence files of the contigs diagrammed in this figure are available here.

| Acceptor genus | HGT candidate | Donor hit | % identity to best donor match | Donor lineage | Gene annotation | Total length of contig with gene (bp) | Alien index |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Cordyceps | PHH88552 | GJJ33502.1 | 45.9 | Corynebacterium species | Phospholipase D | 20085 | 4.3 |
| Cordyceps | PHH87744 | WP_181580790.1 | 53.5 | Nocardia huaxiensis | Polyketide synthase | 16490 | 6.8 |
| Ophiocordyceps | PFH56969 | OKI29105.1 | 62.2 | Saccharothrix sp. CB00851 | Aminoglycoside 3-N-acetyltransferase | 16152 | 2.5 |

Table 1. Candidate neuromodulatory HGT events.
Acceptor genus: Fungal genus of the query sequence
HGT candidate: Query sequence ID
Donor hit: Protein accession for the best match in the donor group (in this case, bacteria)
% identity to best donor match: the percent identity for the best match within the donor group
Donor lineage: Species with the best match in the donor group
Gene annotation: Putative function of the gene
Total length of contig with gene: Total length of the genomic contig in the acceptor genome assembly that contains the HGT candidate
Alien index: HGT score based on the e-values of the best non-fungal and best fungal hits; a value above 0 and up to 15 indicates possible HGT

Candidates 2–3: A polyketide synthase and a phospholipase in Cordyceps sp. RAO-2017

The actinomycete–Cordyceps polyketide synthase gene stood out to us because of the role of polyketide synthases in the production of bioactive fungal secondary metabolites [25]. Polyketide synthase genes are upregulated in Ophiocordyceps fungi during arthropod behavioral manipulation [23]. This gene is on a 16 kb contig in the Cordyceps sp. RAO-2017 genome; this region seems to be related to a type I polyketide, with four separate polyketide synthase genes (Hirsu2|8479–8482) containing iterative ketosynthase/malonyl-CoA:ACP transacylase, dehydratase, enoyl reductase, and acyl carrier domains (Figure 3, B). Next to this polyketide region, there is a gene encoding a CFEM domain-containing membrane protein; CFEM proteins are associated with fungal pathogenicity [26].

The second candidate actinomycete–Cordyceps transfer was a phospholipase D gene. Phospholipases can disrupt host membranes, and pathogenic fungi often secrete them as virulence factors [27][28]. Phospholipase D has homology to enzymes in arachnid venoms that have sphingomyelinase activity and to bacterial toxins that can cause hemolysis and vascular permeabilization [29]. It has also been suggested that phospholipase D may play a role in insect pathogenesis specifically, based on high homology among these enzymes from unrelated insect pathogens [30]. Consistent with what we see here, it has been suggested that sphingomyelinase D-like genes were recently horizontally transferred among pathogenic fungi and actinobacteria [29][31]. We hypothesized that this gene may therefore play a role in pathogenic interactions between Cordyceps and their insect hosts. On the same 20 kb contig in the Cordyceps sp. RAO-2017 genome, we find a class E group IV cytochrome P450 oxygenase. P450 enzymes play important roles in the biosynthesis of a number of mycotoxins, alkaloids, and other secondary metabolites [32][33]. We also find an aspartic protease, an oxidative N-demethylase, and a transcription factor. In the O. kimflemingiae transcriptomic data referred to in the previous section, multiple aspartic protease and P450 genes are upregulated during behavioral manipulation, suggesting that these enzyme classes may be relevant to behavioral manipulation in closely related fungi [23].

Key takeaways

HGT analysis pointed us to some interesting gene clusters in the genomes of fungi that parasitize arthropods. These regions may be involved in production of neuroactive molecules, although these results are very preliminary and a lot of experimental follow-up work would be needed to confirm or refute this. Based on the events we’ve studied, we think this fast approach to screening for HGT had a good signal-to-noise ratio. A phylogenetic-tree-based method is the gold standard for HGT detection and would let us better understand the evolutionary history of these events.

Next steps

There is a ton to explore in these HGT predictions. As many fungal genes in this set of genomes are not functionally annotated, it takes a little digging to get any clues surrounding the biology of these putative transfer events, so we were only able to look at a handful of our potential hits in any detail.

We’d hoped to identify translationally actionable molecules, but reaching this outcome presents many additional challenges. For example, it is very difficult to predict the products of biosynthetic gene clusters, what the targets of the compounds are, and what their downstream neurological effects may be. Moving forward in this area would require significant additional investment in experimental tools and assays that we have decided not to pursue right now.

So while we are not currently planning to continue this effort or follow up with this data set, we expect that it could point to some interesting biology and encourage others to dive into the data!




Acknowledgements

We’d like to thank Drs. Martin Steinegger and Milot Mirdita for providing a detailed explanation of representative sequence selection in MMseqs2.


Comments
Molly Brothers:
  1. Interestingly, the clearest hits you dig into were both found in species that have a known effect on arthropod behavior. Why do you think that is? Have these pathways been more heavily studied and characterized than the ones that have reported effects on human biology? Are HGT events more likely to happen in fungal lineages that are associated with arthropods for some reason? Do you think it is possible to use a different or larger set of fungal genomes to enrich for HGT events that might display neuroactivity in humans?

  2. What would you think about applying this type of analysis to the organisms in the human gut microbiome? There is quite a bit of research on the gut/nervous system connection — it would be super interesting to see HGT events that might be involved in neuromodulation jumping around in our gut organisms that might be having an impact on our neurobiology / neurological health.

Taylor Reiter:

1. We intentionally included fungal species with known impacts on arthropod behavior, so the fact that we found those hits is likely a bias in design influenced by the fact that these systems have been well studied. As far as pathways that have reported effects on human biology, I'm not an expert in that space so it is difficult to compare. I'm also not sure if HGT events are more likely to happen in fungal lineages that are associated with arthropods. It is definitely possible to use a larger set of fungal genomes, but when we thought of this project in retrospect, we decided an HGT-independent approach might be better instead; we would likely scan fungal genomes or transcriptomes for biosynthetic gene clusters or alkaloid-synthesizing proteins. We think that HGT might not be a super strong signal that allows us to home in specifically on genes that are neuromodulators for humans.

2. I think it might be difficult to apply preHGT directly to human gut organisms, but I might be wrong. I think this could be challenging because for preHGT to work well, it needs to operate on genomes. It would be best if these genomes were generated via isolation and sequencing, and not all gut microbes have been sequenced in this way. Using short-read metagenomes would likely generate too many false positives for preHGT, as it wasn't designed for this use case.

Adair L. Borges:

Willows make lots of interesting chemicals, I wonder if this is connected.

Adair L. Borges:

Contigs that have very high sequence ID to what specifically?

Taylor Reiter:

Great question. Contigs that have very high sequence identity to their query. If a query (e.g. a fungal gene) has 99% identity to a bacterial gene in our BLAST database (in this case, a clustered version of NCBI’s nr database), then it is highly likely that the fungal gene is actually a bacterial contaminant.

Jonathan A. Eisen:

I am curious if you can explain more about how the pipeline identifies the putative donor and acceptor taxa. More specifically I am interested in how the method distinguishes transfers between whole lineages (e.g., the branch leading up to a genus) vs transfers to / from specific species. Can you narrow down the donor / recipient events to particular parts of a tree branch?

Taylor Reiter:

I think we do a better job of distinguishing “transfers between a whole lineage vs. to/from specific species” for the acceptor genome than for the donor genome.

Currently, the pipeline reports the following metrics for the predicted donor:
* blast_donor_lineage_at_hgt_taxonomy_level: the lineage of the predicted donor group at the hgt_taxonomy_level (for kingdom level algorithms, this will be the kingdom (ex. Bacteria)).
* blast_donor_best_match_full_lineage: the full taxonomic lineage of the best match at hgt_taxonomy_level (this is the match with the best bitscore).
* blast_donor_num_matches_at_lineage: number of BLAST hits at the donor_lineage_at_hgt_taxonomy_level (so following the above example, the number of hits to Bacteria).

and the following metrics for the predicted acceptor:
* blast_acceptor_lineage_at_hgt_taxonomy_level: the taxonomic lineage of the acceptor genome up to the HGT level (ex. Fungi)
* blast_acceptor_lca_level: within the acceptor group, what level of taxonomy does the lowest common ancestor occur among all matches? If it’s at the phylum level, the HGT event is probably older than if it’s at the genus level. Or, if the HGT is only observed in two phyla, perhaps the HGT happened twice.
* blast_acceptor_num_matches_at_lineage: number of BLAST hits at the acceptor_lineage_at_hgt_taxonomy_level

We could add an equivalent blast_donor_lca_level to try and get at this.
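The lowest-common-ancestor idea behind blast_acceptor_lca_level can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: each match is represented as a list of taxon names ordered from kingdom to genus, and the `RANKS` list and the `lca_level` function are hypothetical names, not part of preHGT.

```python
# Ranks in order from broadest to most specific. Real NCBI lineages can have
# missing or extra ranks; this fixed list is a simplifying assumption.
RANKS = ["kingdom", "phylum", "class", "order", "family", "genus"]


def lca_level(lineages):
    """Return the most specific rank shared by all lineages, or None."""
    level = None
    for depth, rank in enumerate(RANKS):
        # Stop if any lineage is not resolved down to this rank.
        if any(len(lineage) <= depth for lineage in lineages):
            break
        # Stop at the first rank where the matches disagree (or there are none).
        if len({lineage[depth] for lineage in lineages}) != 1:
            break
        level = rank
    return level


matches = [
    ["Fungi", "Ascomycota", "Sordariomycetes", "Hypocreales", "Cordycipitaceae", "Cordyceps"],
    ["Fungi", "Ascomycota", "Sordariomycetes", "Hypocreales", "Ophiocordycipitaceae", "Ophiocordyceps"],
]
print(lca_level(matches))  # order
```

The same function could be applied to the lineages of donor-group matches to produce the proposed blast_donor_lca_level.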

At the moment, the pipeline outputs a massive TSV file with these metrics as well as others, and then it’s “left as an exercise to the reader” to read all of the pipeline’s documentation and integrate and interpret all of the results. This is suboptimal; we would like to create a visualization dashboard that co-situates relevant information with visualizations and interpretation hints/documentation, but we haven’t had the time to do this yet!

Jonathan A. Eisen:

Can you provide information about where the tree came from in Figure 1?

Emily C.P. Weiss:

Hi Jonathan! The tree in figure one is just a cladogram inspired by the representation shown in the JGI Mycocosm portal. It is meant to give some context for how the species we investigated in this study relate to other fungi.