Researchers studying any organism with genomic data can follow this simple walkthrough to create sets of barcoded probes for the multiplexed FISH technique called MERFISH. We’re sharing interactive code notebooks that can be adapted to design barcoded FISH probes for any species.
Quantifying movement is a powerful window into cellular functions. However, cells can generate movement through a variety of complex mechanisms. Here, we generate a flexible framework for comparing an especially variable type of motility: cellular crawling.
The process of deciding whether a candidate actin homolog represents a “true” actin is tricky. We propose clear and data-driven criteria to define actin that highlight the functional importance of this protein while accounting for phylogenetic diversity.
Adair L. Borges, Atanas Radkov, and Peter S. Thuy-Boun
Published: Dec 19, 2022
This pub details a process for phage amplification and concentration, DNA extraction, and HPLC and MS analysis of phage nucleosides. We optimized the approach with model phages known to use non-canonical nucleosides in their DNA, but plan to apply it for other phages.
seqqc is a Nextflow pipeline for quality control of short- or long-read sequencing data. It quickly assesses the quality of sequencing data so that it can be posted to a public repository before analysis for biological insights. Faster open data, faster knowledge for everyone.
Feridun Mert Celebi, Elizabeth A. McDaniel, and Taylor Reiter
Published: Mar 07, 2023
A workflow orchestration framework can streamline repeatable tasks and make workflows broadly usable. From several options, we chose Nextflow due to the ease of deploying across platforms, vibrant nf-core community, and ability to manage and monitor workflows with Nextflow Tower.
Even with many tools available, categorizing species is tough. We used data from Raman spectroscopy, a form of label-free imaging, to infer phylogenetic patterns among several dozen diverse microbial taxa, offering a non-destructive and rapid way to dissect species relationships.
Prachee Avasthi, Tara Essock-Burns, Galo Garcia III, Jase Gehring, David Q. Matus, David G. Mets, and Ryan York
Published: May 03, 2023
Constraining motile microorganisms for live imaging often requires costly microfluidics or optical traps to keep them in view. We used patterned stamps and agar to make versatile, inexpensive “microchambers” and offer a way to predict the right chamber size for a given organism.
We want to seamlessly process and summarize metagenomics data from Illumina or Nanopore technologies. We built a Nextflow workflow that handles common metagenomics tasks and produces useful outputs and intuitive visualizations.
The increasingly large number of sequences available in public databases makes searches slower and slower. We clustered the NCBI non-redundant protein database and calculated taxonomic info for each cluster. This collapses similar sequences and reduces the database by over half.
Horizontal gene transfer (HGT) is the exchange of DNA between species. It can lead to the acquisition of new gene functions, so finding HGT events can reveal genome novelty. preHGT is a pipeline that uses multiple existing methods to quickly screen for transferred genes.
Prachee Avasthi, Ben Braverman, Tara Essock-Burns, Galo Garcia III, Cameron Dale MacQuarrie, David Q. Matus, David G. Mets, and Ryan York
Published: Jun 23, 2023
We’re crossing C. reinhardtii and C. smithii algae for high-throughput genotype-phenotype mapping. In preparation, we’re comparing the parents to uncover unique species-specific phenotypes.
Feridun Mert Celebi, Seemay Chou, Erin McGeever, Austin H. Patton, and Ryan York
Published: Sep 29, 2023
We want to find and use evolutionary innovations to solve present-day problems. We developed NovelTree, an efficient phylogenomic workflow that will empower us to decode the evolutionary traces of these innovations across the tree of life.
We want to swiftly generate genome assemblies and produce quality control statistics to gauge the need for more curation. We built a Nextflow pipeline that assembles Illumina, Nanopore, or PacBio sequencing reads for a single organism and runs QC checks on the resulting assembly.
Adair L. Borges, Feridun Mert Celebi, Kira E. Poskanzer, and Taylor Reiter
Published: Aug 25, 2023
We implemented a lightweight method to identify viruses in 342 human brain bulk and single-cell sequencing data sets, and identified two glioblastoma cells from a single patient that contained deltapolyomavirus sequences.
Prachee Avasthi, Feridun Mert Celebi, Elizabeth A. McDaniel, Kira E. Poskanzer, Michael E. Reitman, and Emily C.P. Weiss
Published: Dec 20, 2023
Some human proteins are encoded by genes with repetitive sequences, which, if they expand, damage the nervous system and cause disorders like Huntington’s disease. We found animals with similar proteins that have more repeats than we’ve ever seen in healthy people.
It is commonly assumed that phenotypes arise from the cumulative effects of many independent genes. However, we show that by accounting for dependent and nonlinear biological relationships, we can generate models that predict phenotypes with great accuracy.
Genetic models of complex traits often rely on incorrect assumptions that drivers of trait variation are additive and independent. An information theoretic framework for analyzing trait variation can better capture phenomena like allelic dominance and gene-gene interaction.
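As a toy illustration of why an information-theoretic view can capture gene-gene interaction, the sketch below computes mutual information for a made-up two-locus trait that depends on the XOR of the two alleles. The data, the function name `mutual_information`, and the XOR trait are all assumptions chosen for illustration; this is not the framework described in the pub.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # Sum over observed (x, y) pairs of p(x, y) * log2(p(x, y) / (p(x) * p(y))).
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


# Hypothetical data: two biallelic loci and a trait equal to their XOR,
# i.e. a purely interactive (non-additive) effect.
locus_a = [0, 0, 1, 1]
locus_b = [0, 1, 0, 1]
trait = [a ^ b for a, b in zip(locus_a, locus_b)]

mi_a = mutual_information(locus_a, trait)                          # single locus
mi_b = mutual_information(locus_b, trait)                          # single locus
mi_joint = mutual_information(list(zip(locus_a, locus_b)), trait)  # both loci

print(mi_a, mi_b, mi_joint)
```

In this contrived case each locus alone carries no information about the trait, while the pair of loci determines it completely, which is the kind of dependence that an additive, independent-effects model would miss.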
Prachee Avasthi, Brae M. Bigge, Feridun Mert Celebi, Keith Cheveralls, Jase Gehring, Erin McGeever, Gilad Mishne, Atanas Radkov, and 1 more
Published: Sep 29, 2023
The ProteinCartography pipeline identifies proteins related to a query protein using sequence- and structure-based searches, compares all protein structures, and creates a navigable map that can be used to look at protein relationships and make hypotheses about function.
Prachee Avasthi, Feridun Mert Celebi, and Elizabeth A. McDaniel
Published: Oct 06, 2023
Only some bacteria accumulate substantial amounts of polyphosphate (polyP). We thought that despite sequence divergence, polyP synthesis enzymes in these bacteria might have similar structures. We found this is sometimes true but doesn’t fully explain the phenomenon.
Feridun Mert Celebi, Keith Cheveralls, Seemay Chou, Tara Essock-Burns, and Galo Garcia III
Published: Nov 17, 2023
We distilled label-free microscopy data by comparing and implementing feature-detection algorithms. Sobel and Laplacian methods outperformed pixel intensity variance in accuracy.
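For readers unfamiliar with the three measures named above, here is a minimal pure-Python sketch of each, applied to two tiny synthetic images (a sharp step edge and a smooth ramp). The kernels are the standard 3×3 Sobel and Laplacian kernels; the scoring functions, their names, and the test images are assumptions for illustration and are not taken from the pub's code.

```python
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]
LAPLACIAN = [[0, 1, 0], [1, -4, 1], [0, 1, 0]]


def filter3x3(img, kernel):
    """Apply a 3x3 kernel to the interior pixels of a 2D list; returns a flat list."""
    h, w = len(img), len(img[0])
    return [
        sum(kernel[i][j] * img[r + i - 1][c + j - 1]
            for i in range(3) for j in range(3))
        for r in range(1, h - 1)
        for c in range(1, w - 1)
    ]


def variance(values):
    """Population variance of a flat list of numbers."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def sobel_score(img):
    """Mean squared gradient magnitude from the Sobel x and y responses."""
    gx, gy = filter3x3(img, SOBEL_X), filter3x3(img, SOBEL_Y)
    return sum(x * x + y * y for x, y in zip(gx, gy)) / len(gx)


def laplacian_score(img):
    """Variance of the Laplacian response."""
    return variance(filter3x3(img, LAPLACIAN))


def intensity_variance(img):
    """Variance of the raw pixel intensities."""
    return variance([p for row in img for p in row])


# Synthetic 6x6 test images (hypothetical): a sharp edge and a blurred ramp.
sharp = [[0, 0, 0, 10, 10, 10] for _ in range(6)]
blurred = [[0, 2, 4, 6, 8, 10] for _ in range(6)]

for name, score in [("sobel", sobel_score),
                    ("laplacian", laplacian_score),
                    ("variance", intensity_variance)]:
    print(name, score(sharp), score(blurred))
```

The edge-based scores respond to local intensity changes, whereas intensity variance only summarizes the overall spread of pixel values; how the three compare on real label-free microscopy data is what the pub itself evaluates.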
Prachee Avasthi, Feridun Mert Celebi, Keith Cheveralls, Seemay Chou, Ilya Kolb, and David Q. Matus
Published: Dec 02, 2023
Machine learning is a powerful tool for classifying images in a time series, such as the developmental stages of embryos. We built a classifier using only bright-field microscopy images to infer nematode embryonic stages at high throughput.
Feridun Mert Celebi, Megan L. Hochstrasser, Elizabeth A. McDaniel, and Jasmine Neal
Published: Dec 20, 2023
Since releasing our pub on polyphosphate-forming proteins in bacteria, we’ve noticed the community has similar problems studying this process in diverse organisms. We’re actively seeking feedback with a focus on advancing basic discoveries and useful tools in this space!
Feridun Mert Celebi, Seemay Chou, Elizabeth A. McDaniel, Taylor Reiter, and Emily C.P. Weiss
Published: Feb 24, 2024
We previously released a draft genome assembly for the lone star tick, A. americanum. We’ve now predicted genes from this assembly to use for downstream functional characterization and comparative genomics efforts.
Prachee Avasthi, Brae M. Bigge, Dennis A. Sun, and Ryan York
Published: Feb 14, 2024
We’ve applied ProteinCartography, a tool for protein family exploration, to the well-studied actin family. We’re able to categorize actins and related proteins into distinguishable functional buckets, and we uncovered some surprising hypotheses that could prompt further study.
Adair L. Borges, Feridun Mert Celebi, Keith Cheveralls, and Taylor Reiter
Published: Aug 08, 2024
We explored the use of embeddings from protein language models to distinguish between genuine and putative coding open reading frames (ORFs). We found that an embeddings-based approach (shared as a small Python package called plm-utils) improves identification of short ORFs.
Adair L. Borges, Feridun Mert Celebi, Keith Cheveralls, Seemay Chou, Taylor Reiter, and Emily C.P. Weiss
Published: Aug 08, 2024
Peptigate predicts bioactive peptides from transcriptomes. It integrates existing tools to predict sORF-encoded peptides, cleavage peptides, and RiPPs, then annotates them for bioactivity and other properties. We welcome feedback on expanding its capabilities.
Adair L. Borges, Feridun Mert Celebi, Reilly O. Cooper, and Elizabeth A. McDaniel
Published: Aug 22, 2024
This workflow lets you find potential circular DNA in your organism of interest using short-read, whole-genome sequencing data and a reference genome. We applied it to parasitoid wasps and some other parasites and found putative circular DNA.