A workflow to isolate phage DNA and identify nucleosides by HPLC and mass spectrometry
This pub details a process for phage amplification and concentration, DNA extraction, and HPLC and MS analysis of phage nucleosides. We optimized the approach with model phages known to use non-canonical nucleosides in their DNA, but plan to apply it to other phages.
DNA extraction, high performance liquid chromatography (HPLC) analysis and mass spectrometry (MS) are bread-and-butter techniques for the chemical analysis of nucleic acids. We optimized this set of protocols to enable such analysis for phage genomes with modified nucleosides, and ultimately hope to use it to discover new DNA modifications from bacteriophages that we isolate from microbial communities.
We’re sharing our detailed protocols to help others tackling similar problems. This pub may be useful to anyone studying phage nucleic acids or searching for novel DNA chemistries.
Watch a video tutorial on making a PubPub account and commenting. Please feel free to add line-by-line comments anywhere within this text, provide overall feedback by commenting in the box at the bottom of the page, or use the URL for this page in a tweet about this work. Please make all feedback public so other readers can benefit from the discussion.
We’ve put this effort on ice! 🧊
#HardToScale #TechnicalGap
HPLC worked well to study genome chemistries of cultured phages. For us to succeed in discovering an array of potentially translatable DNA chemistries, however, we’d need higher-throughput methods to survey microbial communities for unusual nucleoside content. We experimented with using LC-MS/MS and Nanopore sequencing to detect modifications in an isolation-independent way in microbial communities, but neither worked well "out of the box" for this purpose.
Learn more about the Icebox and the different reasons we ice projects.
Background and goals
Bacteriophages (or phages) are the viruses that infect bacteria. Some phages use DNA modifications to protect their genome from degradation by bacterial immune systems [1][2][3][4][5][6]. At Arcadia, we are broadly exploring the distribution and diversity of phage nucleic acid chemistries. One way to do this is to isolate phages from microbial communities and screen them for non-standard DNA chemistries. For that screen, we needed a set of protocols that would allow us to quickly determine whether a phage we’ve isolated uses a non-standard nucleoside.
In this pub, we share techniques for chemical analysis of modified phage DNA. We optimized these protocols using two phages with well-studied DNA modifications: phage T4, which has modified cytosines with glucosyl-methyl moieties [7][8], and phage SPO1, which has replaced thymine with hydroxy-methyl uracil [9][10]. In future experiments, we will use these protocols to characterize nucleic acids from new phages that we isolate.
The strategy
The phage community developed and routinely uses the approaches that we describe here [11]. We’re sharing our implementation of these existing methods as part of a straightforward workflow, optimized around detecting modified phage nucleosides. We will apply this approach to perform chemical analysis of uncharacterized phage genomes in future work.
We are sharing a collection of five protocols, which you can view together on protocols.io.
These methods should be applicable to any laboratory-cultivated phage that can be grown to sufficiently high concentration to enable successful nucleic acid extraction.
The method
The following is a high-level overview of our approach, also visually summarized in Figure 1. You can view detailed, step-by-step protocols in this collection on protocols.io.
Starting with a pure culture of phage, these protocols detail phage amplification, concentration, DNA extraction, nucleoside digestion, and chemical analysis of phage nucleosides with HPLC and LC–MS/MS. We optimized these steps using model dsDNA phages with known genome modifications. Phage T4 infects Escherichia coli and has modified cytosines with glucosyl-methyl moieties [7][8]. Phage SPO1 infects Bacillus subtilis and has replaced thymine with hydroxy-methyl uracil [9][10]. These phages and their hosts are easy to work with, and have well-characterized nucleic acid chemistries. This makes them an ideal starting point for researchers looking to establish methods to study phage nucleic acid chemistry.
Below, we detail our protocols and results from analyzing phage T4 and SPO1 genomes. While we developed the protocols using these model lytic dsDNA phages, we anticipate that they can be tweaked to enable chemical analysis of phages that have different growth conditions or ssDNA or RNA genomes.
Step 1: Phage amplification and concentration
This approach to phage genome analysis begins with amplifying the phage to a high titer. Both T4 and SPO1 are lytic phages that grow well in liquid culture, so we chose to amplify the phage in 30 mL of broth media. We supplemented the media with 1 mM MgSO4 and 1 mM CaCl2 to enhance phage adsorption. This worked well for our model phages: in 30 mL, we obtained a concentration of 10¹⁰ PFU/mL for T4 and 10⁹ PFU/mL for SPO1.
We anticipate that in the future, some of our newly isolated phages may need to be propagated using slightly different techniques. Temperate phages should be amplified using the double-agar overlay method [12], and some large diffusion-limited phages may benefit from using in-gel techniques [13]. Also, the identities and levels of cations may need to be adjusted depending on the individual biology of the phage.
After amplification, we concentrated the 30 mL of phage lysate down to 300 µL for DNA extraction. To concentrate the phage, we found that both PEG precipitation and filtration-based concentration worked well. PEG precipitation requires less hands-on time, but is overall longer as it requires an overnight incubation step. We also suspect that individual phages will be differentially sensitive to these concentration methods, so one should select a concentration protocol that works best for their phage of interest.
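The arithmetic behind this step can be sketched in a few lines: going from 30 mL to 300 µL is a 100-fold volume reduction, so the titer rises by at most 100-fold. The function name and the `recovery` parameter below are hypothetical; `recovery` is an assumed fraction of infectious particles retained, not a value measured in this work.

```python
def concentrated_titer(titer_pfu_per_ml, start_volume_ml, final_volume_ml, recovery=1.0):
    """Estimate the titer (PFU/mL) after concentrating a phage lysate.

    recovery is an ASSUMED fraction (0-1) of infectious particles that
    survive the concentration step; it is illustrative, not measured.
    """
    if start_volume_ml <= 0 or final_volume_ml <= 0:
        raise ValueError("volumes must be positive")
    if not 0.0 <= recovery <= 1.0:
        raise ValueError("recovery must be between 0 and 1")
    total_pfu = titer_pfu_per_ml * start_volume_ml  # total particles in the lysate
    return total_pfu * recovery / final_volume_ml


# 30 mL -> 0.3 mL, starting from a titer on the order reported for T4 above.
best_case = concentrated_titer(1e10, 30, 0.3)                 # full recovery
half_lost = concentrated_titer(1e10, 30, 0.3, recovery=0.5)   # assumed 50% loss
```

Comparing a measured post-concentration titer against the best-case value gives a quick read on how much phage a given concentration method loses.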
Step 2: Phage DNA extraction and digestion to single nucleosides
After amplification and concentration, the phages are ready for DNA extraction. Initially, we chose to use the NEB Monarch kit to extract high-molecular-weight (HMW) DNA. While any approach that can harvest high-purity phage DNA would be appropriate here, we chose a method that would generate HMW DNA compatible with long-read Nanopore sequencing. We started with the Monarch kit because it can be performed on a benchtop.
Using the Monarch kit, we obtained high concentrations of high-purity T4 and SPO1 DNA. We used a Nanodrop spectrophotometer to quickly check the concentration and purity, and downstream chemical analyses (HPLC and LC–MS/MS) also confirmed the purity of the DNA (Table 1). Note that SPO1 has a high 260/280 ratio: this is because it contains uracil, and thus has an “RNA-like” 260/280 value.
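A minimal sketch of the spectrophotometer check, assuming the common convention that an A260 of 1.0 (1 cm path length) corresponds to roughly 50 ng/µL of dsDNA. The ratio cutoffs in `purity_note` are rule-of-thumb assumptions, not thresholds from this work, and the function names are hypothetical.

```python
def dsdna_concentration_ng_per_ul(a260, dilution_factor=1.0):
    """Approximate dsDNA concentration from A260 (assumes 50 ng/uL per A260 unit)."""
    return a260 * 50.0 * dilution_factor


def purity_note(a260, a280):
    """Return the 260/280 ratio and a rough interpretation.

    The cutoffs (1.7 and 1.9) are illustrative assumptions only.
    """
    if a280 <= 0:
        raise ValueError("a280 must be positive")
    ratio = a260 / a280
    if ratio < 1.7:
        note = "low: possible protein or phenol carryover"
    elif ratio <= 1.9:
        note = "typical for DNA"
    else:
        note = "high: RNA-like (RNA carryover, or uracil-containing DNA such as SPO1)"
    return ratio, note
```

A high ratio alone cannot distinguish RNA contamination from a uracil-containing genome, which is one reason the downstream HPLC and LC–MS/MS checks are useful.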
In further iterations of this experiment, we switched to using phenol-chloroform extraction to harvest HMW phage DNA. Phenol-chloroform extraction cannot be performed on a benchtop, and generates substantial chemical waste. However, we found that for some phages, phenol-chloroform succeeded when the Monarch kit prep failed to yield DNA. When harvesting DNA for new phages, we now routinely use phenol-chloroform as it appears to be a more robust method.
After DNA isolation, we digested 1 µg of DNA from each phage sample down to single nucleosides using the NEB Nucleoside Digestion Mix. We chose this kit because it is directly compatible with HPLC and LC–MS/MS.
Step 3: Phage nucleoside analysis with high-performance liquid chromatography (HPLC)
Once the DNA is broken down into single nucleosides, those nucleosides can be analyzed using HPLC. We developed a 30-minute binary gradient on a reverse-phase column, which resolved the nucleoside peaks well (Figure 2). In addition, we developed a short 10-minute isocratic method that we may use for higher-throughput analysis of nucleosides.
To analyze phage nucleosides, we first ran a set of standard deoxynucleosides (dA, dT, dG, dC, dU — each at 1 mg per mL) to obtain retention times for unmodified nucleosides (Figure 2, A). These standards should be included in each HPLC run. To analyze the samples for modified nucleosides, we injected 100 ng into the HPLC and compared the retention times of the sample nucleosides to the standards. We also plotted the A260 values to see the full sample content.
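The comparison of sample peaks to standards can be sketched as a nearest-match lookup with a retention-time tolerance. Everything here is illustrative: the function names are hypothetical, the tolerance is an assumed value, and any retention times passed in are placeholders rather than measurements from Figure 2.

```python
def match_peaks(sample_rts, standard_rts, tolerance_min=0.1):
    """Assign each sample peak to the nearest standard within a tolerance.

    sample_rts: list of retention times (minutes) for sample peaks.
    standard_rts: dict mapping standard name -> retention time (minutes).
    Peaks with no standard within tolerance_min are returned with None,
    which flags them as candidate modified nucleosides.
    """
    assignments = []
    for rt in sample_rts:
        name, ref_rt = min(standard_rts.items(), key=lambda kv: abs(kv[1] - rt))
        assignments.append((rt, name if abs(ref_rt - rt) <= tolerance_min else None))
    return assignments


def missing_standards(assignments, standard_rts):
    """List standards that no sample peak was assigned to."""
    found = {name for _, name in assignments if name is not None}
    return sorted(set(standard_rts) - found)


# Placeholder retention times (minutes); NOT values measured in this work.
standards = {"dC": 5.0, "dU": 7.0, "dG": 10.0, "dT": 11.5, "dA": 14.0}
peaks = match_peaks([5.02, 7.6, 9.98, 14.05], standards)
absent = missing_standards(peaks, standards)
```

The tolerance matters: set too wide, a modified nucleoside that elutes close to a standard would be silently assigned to that standard instead of being flagged.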
Some nucleoside modifications are easy to spot visually by looking at A260 absorbance plotted over time. T4 phage has two small peaks that correspond to alpha and beta glucosylmethyl deoxycytidine, and is missing a canonical deoxycytidine peak (Figure 2, B). Similarly, SPO1 is obviously missing a thymidine peak, and instead has a new peak that corresponds to hydroxymethyl deoxyuridine (Figure 2, C). However, the difference in retention time between the deoxyuridine standard and the hydroxymethyl deoxyuridine peak in SPO1 is very small, and easily missed. We interpret this to mean that HPLC analysis is good for quickly flagging large-scale changes to nucleic acid composition, but less sensitive to other changes.
Step 4: Nucleoside analysis with liquid chromatography–tandem mass spectrometry (LC–MS/MS)
LC–MS/MS is our most sensitive tool for analyzing nucleosides. We analyzed nucleosides derived from 500 ng of DNA, digested with the NEB Nucleoside Digestion Mix. This kit is directly compatible with LC–MS/MS. In our LC–MS/MS run, we first separated nucleosides using a binary solvent gradient on a C18 column. This gradient is not optimized, but generated usable data and works as a starting point for further optimization. We acquired data in positive mode with an MS1 scan targeting ions in the 200–800 m/z range, and followed each MS1 scan with seven data-dependent MS2 scans. In this experiment, we used a Thermo LTQ Orbitrap XL at the QB3/Chemistry Mass Spectrometry Facility at UC Berkeley.
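As a simplified illustration of data-dependent acquisition, the sketch below picks up to seven precursors from one MS1 scan within the 200–800 m/z window. It assumes precursors are ranked purely by intensity, which is a common default but is not stated in the text; real instrument methods typically add rules such as dynamic exclusion and charge-state filters. The function name is hypothetical.

```python
def select_precursors(ms1_peaks, n=7, mz_min=200.0, mz_max=800.0):
    """Pick up to n precursor m/z values for MS2 from one MS1 scan.

    ms1_peaks: iterable of (mz, intensity) tuples.
    ASSUMPTION: selection is by descending intensity only.
    """
    in_window = [(mz, inten) for mz, inten in ms1_peaks if mz_min <= mz <= mz_max]
    in_window.sort(key=lambda peak: peak[1], reverse=True)  # most intense first
    return [mz for mz, _ in in_window[:n]]
```

One consequence of intensity-based selection is that low-abundance ions may never be fragmented when many stronger ions are present in the same scan.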
We manually inspected mass spectrometry data and noticed a consistent pattern of −116 m/z differences between probable nucleoside precursor ions and their most prominent fragmentation product ions, suggesting a pattern of deoxyribose neutral mass loss during fragmentation (Figure 3). Based on this pattern, we wrote Python scripts in Jupyter notebooks to automate nucleoside identification within our accurate mass high-resolution dataset.
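The neutral-loss pattern can be expressed as a simple check, sketched here rather than taken from the notebooks in the repository. It assumes singly charged ions and uses the monoisotopic mass of the lost deoxyribose moiety (C5H8O3, about 116.0473 Da); the function name and the 0.01 Da tolerance are illustrative assumptions.

```python
DEOXYRIBOSE_LOSS_DA = 116.0473  # monoisotopic mass of C5H8O3


def deoxyribose_loss_fragments(precursor_mz, fragment_mzs, tol_da=0.01):
    """Return fragments that sit one deoxyribose neutral loss below the precursor.

    ASSUMPTIONS: ions are singly charged; tol_da is an illustrative tolerance.
    """
    target = precursor_mz - DEOXYRIBOSE_LOSS_DA
    return [mz for mz in fragment_mzs if abs(mz - target) <= tol_da]


# Protonated deoxyadenosine (m/z ~252.1091) fragmenting to protonated
# adenine (m/z ~136.0618); the other fragment is an arbitrary decoy.
hits = deoxyribose_loss_fragments(252.1091, [136.0618, 119.0350])
```

A precursor with at least one such fragment is a candidate nucleoside, and the fragment m/z then gives the mass of the (possibly modified) base.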
Taking advantage of this consistent fragmentation pattern for nucleosides, we identified ions that corresponded to the nucleosides known to be in phage T4 and SPO1 (Figure 4). We also identified an ion in the T4 sample that corresponds to methylated deoxyadenosine, which the HPLC analysis missed, highlighting the increased sensitivity of LC–MS/MS (Figure 4). This methylation mark was likely added by the E. coli strain B Dam methylase [14] or the T4 Dam methylase [15], which methylate adenine at GATC motifs [16].
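Matching ions to candidate nucleosides by accurate mass can be sketched as follows. This is an illustration, not the repository code: it computes protonated monoisotopic m/z values from elemental formulas and reports candidates within a ppm tolerance. The short candidate list, the names, and the 10 ppm default are assumptions made for the example.

```python
# Monoisotopic masses (Da) of the elements used, plus the proton mass.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
PROTON = 1.00727646

# Elemental formulas of a few neutral deoxynucleosides (illustrative subset).
NUCLEOSIDES = {
    "dA": {"C": 10, "H": 13, "N": 5, "O": 3},
    "dC": {"C": 9, "H": 13, "N": 3, "O": 4},
    "dG": {"C": 10, "H": 13, "N": 5, "O": 4},
    "dT": {"C": 10, "H": 14, "N": 2, "O": 5},
    "methyl-dA": {"C": 11, "H": 15, "N": 5, "O": 3},
    "hm-dU": {"C": 10, "H": 14, "N": 2, "O": 6},
}


def protonated_mz(formula):
    """m/z of the singly protonated ion [M+H]+ for an elemental formula."""
    return sum(MONO[el] * count for el, count in formula.items()) + PROTON


def match_mz(observed_mz, tol_ppm=10.0):
    """Return names of candidates whose [M+H]+ lies within tol_ppm of observed_mz."""
    matches = []
    for name, formula in NUCLEOSIDES.items():
        expected = protonated_mz(formula)
        ppm_error = (observed_mz - expected) / expected * 1e6
        if abs(ppm_error) <= tol_ppm:
            matches.append(name)
    return matches
```

Accurate mass alone cannot separate isomers (for example, different methylation positions on the same base), so a mass match is best treated as a candidate assignment to be confirmed with fragmentation data or standards.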
All code generated and used for the pub is available in this GitHub repository (DOI: 10.5281/zenodo.7447542), including a Jupyter notebook to find nucleosides in mass spec data; mass lists for nucleosides, charged adducts, and neutral adducts; and outputs.
SHOW ME THE DATA: Access our raw and processed mass spec data on Zenodo (DOI: 10.5281/zenodo.7319990).
Challenges identifying nucleosides in complex community samples
We developed this set of protocols using phages with known genome modifications, ultimately aiming to apply them to uncultured phages with potentially novel modifications in microbial community samples. We’ve chosen to shift away from these scientific directions, but we’re sharing our data sets and the issues we encountered to help others working on similar questions.
LC–MS/MS
We tried applying the LC–MS/MS assay to analyze DNA extracted from microbial communities and viromes to see if we could detect nucleoside modification without first individually isolating bacteriophages, but were largely unsuccessful.
We worked with the CRO Arome to use LC–MS/MS to profile the nucleoside content of cheese microbial communities. We chose this CRO because they have a highly sensitive Orbitrap Exploris 480 machine that can take high-resolution measurements, which we thought would be necessary for analyzing potentially complex nucleoside samples from natural communities. We used phenol-chloroform extraction to harvest DNA from cheese microbial communities and their paired viromes (see this protocol collection for methods details) and analyzed the digested nucleosides via LC–MS/MS with a HILIC column in positive ion mode under neutral pH.
Unfortunately, we didn’t achieve the sensitivity that we would need to detect rare, non-standard nucleosides using this approach. For example, we did not see any signal for the nucleoside thymidine (dT) in the MS1 scans, meaning our approach was not even sensitive enough to detect one of the four most abundant nucleosides in the community. To follow up on this, we would need to put substantially more work into methods development to increase the sensitivity and dynamic range of the assay.
Another issue was a high level of background from RNA nucleosides in our samples, even though the DNA samples had gone through an RNase treatment. We hypothesize that trace RNA nucleosides persisted after the digestion and ionized more readily than the DNA nucleosides, leading to their enhanced detection in LC–MS/MS. If we were to do this again, we would run the samples through a DNA cleanup column to remove small RNA oligos and/or lingering nucleosides. If anyone wants to explore the raw data, we’ve shared it on Zenodo.
SHOW ME THE DATA: Our raw LC–MS/MS data from cheese communities and paired viromes, first-pass analysis, and methods details are available on Zenodo (DOI: 10.5281/zenodo.7996414).
Nanopore sequencing
We also hoped to complement these chemical methods with Nanopore-based modification discovery to directly link phage genome sequences to their chemical composition [17]. Briefly, we generated paired WGA:native R10 chemistry data sets of cheese microbial communities using Nanopore sequencing (read more about this in [18]). Unfortunately, we found that the de novo modification prediction tools only worked well with R9 chemistries. We have shared the FAST5 files through the European Nucleotide Archive (ENA) and encourage others to reuse them, for example in tool development.
Acknowledgements
Thank you to the QB3/Chemistry Mass Spectrometry Facility at UC Berkeley (NIH grant number 1S10OD020062-01) for mass spectrometry of isolated phage nucleosides. Thank you to Arome for mass spectrometry of cheese community nucleosides. Thanks also to the Guertin lab for sharing DNA helix paint brushes for Adobe Illustrator.
Is it possible that some of the RNA “background” you have here is actually RNA stretches in the DNA backbone? There are reports of some organisms transiently incorporating RNA into repair patches in genomes (e.g., see https://www.sciencedirect.com/science/article/pii/S1568786420301762)
Jonathan A. Eisen:
See also https://www.pnas.org/doi/10.1073/pnas.1309119110
Oh that’s really neat, thanks for pointing it out! I had defaulted to assuming it was from the host. The paper you sent is really interesting - Specifically I was interested in the finding purified T4 Dam methylase binds in a surprisingly non-specific mode to the DNA substrate they provide. It appears that they use “conventional” DNA for their DNA binding assays with unmodified cytosines. I wonder if the T4 Dam methylase is specific for DNA with modified cytosines?