Series GSE9384
Status Public on Oct 31, 2007
Title Design and testing of genome-proxy microarrays to profile marine microbial communities
Platform organisms uncultured crenarchaeote 4B7; Prochlorococcus marinus subsp. pastoris str. CCMP1986; uncultured marine alpha proteobacterium; uncultured marine gamma proteobacterium EBAC31A08; uncultured marine group II euryarchaeote 37F11; uncultured Pseudomonadota bacterium; uncultured crenarchaeote 74A4; uncultured marine bacterium 440; uncultured marine bacterium 577; uncultured marine bacterium 583; uncultured gamma proteobacterium eBACHOT4E07; uncultured marine bacterium EB0_41B09; uncultured proteobacterium 60D04; uncultured proteobacterium 65D09; uncultured marine group II euryarchaeote EF100_57A08; uncultured marine bacterium EB000_55B11
Sample organisms Prochlorococcus marinus subsp. pastoris str. CCMP1986; Prochlorococcus marinus str. MIT 9312; Prochlorococcus marinus str. MIT 9313; Prochlorococcus marinus str. MIT 9515; marine metagenome
Experiment type Other
Summary Microarrays are useful tools for detecting and quantifying specific functional and phylogenetic genes in natural microbial communities. In order to track uncultivated microbial genotypes and their close relatives in an environmental context, we designed and implemented a “genome proxy” microarray that targets microbial genome fragments recovered directly from the environment. Fragments consisted of sequenced clones from large-insert genomic libraries from microbial communities in Monterey Bay, the Hawaii Ocean Time-series station ALOHA, and Antarctic coastal waters. In a prototype array, we designed probe sets to thirteen of the sequenced genome fragments and to genomic regions of the cultivated cyanobacterium Prochlorococcus MED4. Each probe set consisted of multiple 70-mers, each targeting an individual ORF, and distributed along each ~40-160 kbp contiguous genomic region. The targeted organisms or clones, and close relatives, were hybridized to the array both as pure DNA mixtures and as additions of cells to a background of coastal seawater. This prototype array correctly identified the presence or absence of the target organisms and their relatives in laboratory mixes, with negligible cross-hybridization to organisms having ≤~75% genomic identity. In addition, the array correctly identified target cells added to a background of environmental DNA, with a limit of detection of ~0.1% of the community, corresponding to ~10^3 cells/ml in these samples. Signal correlated to cell concentration with an R² of 1.0 across six orders of magnitude. In addition, the array could track a related strain (at 86% genomic identity to that targeted) with a linearity of R² = 0.9999 and a limit of detection of ~1% of the community. Closely related genotypes were distinguishable by differing hybridization patterns across each probe set.
This array’s multiple-probe, “genome-proxy” approach and consequent ability to track both target genotypes and their close relatives is important for the array’s environmental application given the recent discoveries of considerable intra-population diversity within marine microbial communities.
Keywords: target addition experiment, proof-of-concept for GPL6012
 
Overall design ***Overall Array design***

The prototype microarray targeted thirteen BAC or fosmid genome fragments (20-160kb) from both bacteria and archaea, recovered from a variety of marine habitats, as well as the cyanobacterium Prochlorococcus MED4. These clones were originally sequenced because of the presence of taxonomic marker or specific functional genes. This array consisted of sets of 70-bp oligonucleotides targeting each genome or genome fragment (Fig. 1), dispersed along the target sequences with no more than one probe per gene, and excluding rRNA genes as targets. The probes were selected solely based on theoretical thermodynamic properties and GC content (~40%); that is, probe selection did not focus on specific genes or regions, but simply produced the “optimal” probes for each genome proxy based on the probes’ predicted hybridization properties. rRNA genes were excluded, because this probe design approach, which avoids sequence alignments and considerations of RNA secondary structure, would be unlikely to result in useful rRNA probes. Furthermore, rRNA probes of traditional design could not be included on the array because their appropriate hybridization conditions would be very different from those of this array’s probes.

***Microarray probe design***

Microarray 70-mer probes were designed using the program ArrayOligoSelector (Zhu et al., 2003) with the following settings: target %GC = 40%, 1 probe/gene, with the ORFs for each genome fragment as both the input and the database file. The output candidate 70-mers were then sorted by %GC, and those closest to 40% were chosen. Where more than the target number of candidates had 40% GC, the subset with the lowest free energy of hybridization was selected. Generally, 20 probes were selected per organism. Prochlorococcus MED4 was represented by 60 probes total, 20 each for three different 80 kb “genome-proxy” regions: 0-80 kb, 1.29-1.37 Mbp, and 1.58-1.66 Mbp.
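
The ranking rule described above (closest to 40% GC first, lowest free energy of hybridization as the tie-breaker) can be sketched in a few lines of Python. This is only an illustration, not the ArrayOligoSelector implementation: the `Candidate` record, its `free_energy` field, and the function names are hypothetical, and the candidates are assumed to be already limited to one 70-mer per gene.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    gene: str           # ORF the candidate oligo was drawn from
    sequence: str       # candidate oligo sequence
    free_energy: float  # predicted free energy of hybridization (hypothetical field)


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a sequence."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def select_probes(candidates, n_probes=20, target_gc=40.0):
    """Rank by distance from the target %GC, break ties by lowest free energy."""
    ranked = sorted(
        candidates,
        key=lambda c: (abs(gc_percent(c.sequence) - target_gc), c.free_energy),
    )
    return ranked[:n_probes]
```

Sorting on a `(GC distance, free energy)` tuple applies the free-energy tie-break at every GC distance, which is a slight generalization of the text, where it is only mentioned for candidates at exactly 40% GC.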

Using the same method, a set (n=20) of positive control probes was designed to the genome of the halophilic archaeon Halobacterium salinarum NRC-1. Negative control probes (n=28) were designed to a set of 49 random 1000-base sequences (Stothard, 2000).
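
For illustration only, a set of random template sequences like those used for the negative controls could be generated as follows. The record does not state the base composition or the tool settings used, so the uniform A/C/G/T frequencies, the seed, and the function name here are assumptions.

```python
import random


def random_sequences(n_sequences=49, length=1000, seed=0):
    """Generate random DNA sequences with uniform base frequencies (an assumption)."""
    rng = random.Random(seed)  # fixed seed so the set is reproducible
    return ["".join(rng.choices("ACGT", k=length)) for _ in range(n_sequences)]
```

Candidate negative control probes would then be designed against these sequences in the same way as for the real targets.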

***Microarray construction and hybridization***

Oligonucleotides were synthesized (Illumina, San Diego, California), suspended in 3× SSC to a concentration of 40 pmol/μl, and spotted on homemade poly-L-lysine-coated glass slides using a QArray 2 microarraying robot (Genetix, Hampshire, England). Six replicates of each probe were spotted.

***Microarray data analysis***

Hybridized arrays were scanned using an Axon Instruments 4000B scanner (Foster City, CA), and the data were normalized and filtered using Perl scripts written for the purpose, in the following steps:
(1) Signal intensities for each spot were calculated by subtracting the local background (mean F532 – median B532, as calculated by GenePix Pro 5.1 software, Axon Instruments).
(2) The median value across replicates was calculated for each probe.
(3) For each probe set, the number of probes greater than twice the mean negative control signal was calculated, before further processing.
(4) Filter I: Arrays with fewer than half of their positive control probes exceeding twice the mean negative control signal were considered poor-quality, low-dynamic-range arrays and were excluded from further analysis.
(5) Each probe signal was corrected for non-specific binding by subtracting the mean negative control spot signal.
(6) The data were then normalized for array-to-array variations in brightness by dividing each probe signal by the mean positive control signal. This positive control signal was the mean signal across the Halobacterium salinarum probes in each hybridization, with identical amounts of H. salinarum DNA having been added to each reaction prior to amplification and labeling.
(7) Filter II: In order for a genotype to be considered “present”, at least 45% of its probes had to exceed twice the mean negative control signal.
(8) Finally, each genotype signal was calculated as either the MEAN or TUKEY BIWEIGHT across its probe set.
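
The eight steps above can be sketched in Python as follows. This is not the original Perl code: the input layout, function names, and the Tukey biweight tuning constants (`c`, `eps`) are assumptions, and the sketch assumes that the positive control mean in step 6 is taken after the negative control subtraction of step 5.

```python
from statistics import mean, median


def tukey_biweight(values, c=5.0, eps=1e-4):
    """One-step Tukey biweight location estimate (c and eps are assumed constants)."""
    m = median(values)
    mad = median(abs(v - m) for v in values)
    weights = []
    for v in values:
        u = (v - m) / (c * mad + eps)
        weights.append((1 - u * u) ** 2 if abs(u) < 1 else 0.0)
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def process_array(spots, neg_ids, pos_ids, probe_sets, summarize=mean):
    """spots maps probe id -> list of (mean F532, median B532), one pair per replicate.

    Returns None if the array fails Filter I, else a dict per genotype.
    """
    # Steps 1-2: background-subtract each spot, then take the replicate median.
    signal = {p: median(f - b for f, b in reps) for p, reps in spots.items()}

    # Threshold shared by steps 3, 4 and 7: twice the mean negative control signal.
    neg_mean = mean(signal[p] for p in neg_ids)
    threshold = 2 * neg_mean

    # Step 4 (Filter I): reject arrays where fewer than half of the
    # positive control probes exceed the threshold.
    if sum(signal[p] > threshold for p in pos_ids) < len(pos_ids) / 2:
        return None

    # Steps 5-6: positive control mean, after negative control subtraction (assumption).
    pos_mean = mean(signal[p] - neg_mean for p in pos_ids)

    results = {}
    for genotype, probes in probe_sets.items():
        # Step 3: count probes above threshold before any correction.
        fraction = sum(signal[p] > threshold for p in probes) / len(probes)
        # Steps 5-6: subtract non-specific signal, scale by positive control mean.
        normalized = [(signal[p] - neg_mean) / pos_mean for p in probes]
        results[genotype] = {
            "present": fraction >= 0.45,      # Step 7 (Filter II)
            "signal": summarize(normalized),  # Step 8: mean or tukey_biweight
        }
    return results
```

Passing `summarize=tukey_biweight` instead of the default `mean` gives the second summary statistic named in step 8, which down-weights probes that sit far from the median of their set.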

***Experimental Design***

The array was hybridized to laboratory mixtures of cloned environmental genomic DNA targeted by the array, in varying ratios. The use of multiple probes to target many genes from each organism helped to normalize probe-to-probe heterogeneity, by averaging across all probes in a set (as described above). The evenness of probe response across each genotype’s set was also used to evaluate the relatedness of hybridizing DNA.

To more precisely define the array’s phylogenetic range and specificity, it was tested against DNA from Prochlorococcus MED4 and related strains, spanning the known range of Prochlorococcus phylogenetic diversity.

To test the effects of hybridization stringency on the specificity and signal of the MED4 probes, Prochlorococcus strains were hybridized at a range of conditions.

To test whether the specificity results for Prochlorococcus were comparable for other targeted clades, two genome fragments recovered from closely related phylotypes within the SAR86 clade of the gamma-proteobacteria were represented on the array, and were tested for specificity.

To understand the equivalence of probe sets targeting different regions of the same organism’s genome, we targeted three 80kb “genome proxy” regions of the Prochlorococcus MED4 genome. One of the regions fell in a genomic “island” where inter-strain variability is concentrated (“ISL5” in Coleman et al., 2006).

To test the array in a complex environmental context, we collected coastal seawater (lacking detectable Prochlorococcus cells by flow cytometry) and added Prochlorococcus cells from strains MED4, MIT9515, MIT9312 and MIT9313 over a range of concentrations from ~10^1 to 10^6 cells/ml (Fig. 3). The seawater was then filtered and the DNA extracted, amplified, labeled, and hybridized to the array.
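
One way to assess the linearity of such a target addition series is to compute R² for a straight-line fit of genotype signal against added cell concentration. The sketch below is an assumption about how that could be done, not the authors' calculation: the record does not say whether the fit was made on raw or log-transformed values, so the log10-log10 variant is offered only as one plausible choice for data spanning several orders of magnitude. Function names are hypothetical.

```python
import math


def r_squared(x, y):
    """Coefficient of determination for a simple least-squares line through (x, y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # For one predictor, R^2 equals the squared Pearson correlation.
    return (sxy * sxy) / (sxx * syy)


def loglog_r_squared(cells_per_ml, signals):
    """R^2 after log10-transforming both axes (requires strictly positive values)."""
    return r_squared(
        [math.log10(c) for c in cells_per_ml],
        [math.log10(s) for s in signals],
    )
```

Here `cells_per_ml` would hold the added concentrations and `signals` the normalized genotype signal from each hybridization in the series.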
 
Contributor(s) Rich VI, Konstantinidis K, DeLong EF
Citation(s) 18028413
Submission date Oct 19, 2007
Last update date Mar 17, 2012
Contact name Virginia Isabel Rich
E-mail(s) [email protected]
Organization name University of Arizona
Street address 1041 E. Lowell St.
City Tucson
State/province AZ
ZIP/Postal code 85721
Country USA
 
Platforms (1)
GPL6012 A genome proxy array for profiling marine microbial communities
Samples (28)
GSM237693 Environmental DNA + Prochloro 9312, 10^1 cells/ml
GSM237694 Environmental DNA + Prochloro 9312, 10^2 cells/ml
GSM237695 Environmental DNA + Prochloro 9312, 10^3 cells/ml
Relations
BioProject PRJNA103099

Download family Format
SOFT formatted family file(s) SOFT
MINiML formatted family file(s) MINiML
Series Matrix File(s) TXT

Supplementary file Size Download File type/resource
GSE9384_RAW.tar 5.7 Mb (http)(custom) TAR (of GPR)
Processed data included within Sample table
