GEO Accession viewer

Sample GSM2259971

Status

Public on Mar 09, 2018

Title

Adelman_Dmel_S2_Start-seq_5pr_and_3pr_control_rep1

Sample type

SRA

Source name

Drosophila S2 cells

Organism

Drosophila melanogaster

Characteristics

cell line: S2
treatment: control
chip antibody: N/A

Treatment protocol

LacZ RNAi was performed for 48 h and cells were harvested at a consistent cell density of 4–6 × 10^6 cells/ml, using the same method as previously described (PMID: 24184211).

Growth protocol

Drosophila S2 cells from the DGRC were grown in M3 media supplemented with bactopeptone, yeast extract and 10% FBS.

Extracted molecule

total RNA

Extraction protocol

Total RNA was extracted from nuclei using Trizol reagent (Invitrogen).
Start-RNA libraries were prepared as described (Nechaev et al., Science 2010), except that reads were size-selected in the 20-80 nt range to exclude full-length snRNA species.

Library strategy

RNA-Seq

Library source

transcriptomic

Library selection

cDNA

Instrument model

Illumina NextSeq 500

Description

short capped RNA isolated from Drosophila S2 nuclei
processed data file: Adelman_Dmel_S2_Start-seq_5prRNA_control_rep1and2_forward_normalized.bedGraph.gz
processed data file: Adelman_Dmel_S2_Start-seq_5prRNA_control_rep1and2_reverse_normalized.bedGraph.gz
processed data file: Adelman_Dmel_S2_Start-seq_3prRNA_control_rep1and2_forward_normalized.bedGraph.gz
processed data file: Adelman_Dmel_S2_Start-seq_3prRNA_control_rep1and2_reverse_normalized.bedGraph.gz
Start-seq (Nechaev, Science 2010) isolates the transcription start site (TSS) associated RNAs. Paired-end sequencing is performed on the isolated short RNAs, so that the first read gives the precise 5' end of the transcript, corresponding to the TSS. The second read gives the transcript 3' end, which corresponds to the position where the polymerase pauses. The two reads in the FASTQ files are therefore parsed separately, because they carry different information.
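As an illustration of why the two reads are handled separately, the following is a minimal sketch, not the submitters' code, of how a transcript 5' end and 3' end could be derived from one aligned read pair. The `Alignment` type and `rna_ends` function are hypothetical names. The sketch assumes 0-based, half-open coordinates and that read 1 maps to the same strand as the RNA; both assumptions depend on the aligner output and library chemistry.

```python
from typing import NamedTuple, Tuple


class Alignment(NamedTuple):
    """One aligned read (hypothetical container for this sketch)."""
    start: int   # 0-based, inclusive
    end: int     # 0-based, exclusive
    strand: str  # "+" or "-"


def rna_ends(read1: Alignment, read2: Alignment) -> Tuple[int, int, str]:
    """Return (five_prime, three_prime, strand) for one read pair.

    Assumption: read 1 lies on the RNA strand, so its 5'-most base is
    the TSS, and the outermost base of read 2 is the RNA 3' end.
    """
    if read1.strand == "+":
        five_prime = read1.start      # leftmost base of read 1
        three_prime = read2.end - 1   # rightmost base of read 2
    else:
        five_prime = read1.end - 1    # rightmost base of read 1
        three_prime = read2.start     # leftmost base of read 2
    return five_prime, three_prime, read1.strand


plus_pair = rna_ends(Alignment(100, 126, "+"), Alignment(115, 141, "-"))
minus_pair = rna_ends(Alignment(200, 230, "-"), Alignment(180, 210, "+"))
```

Under these assumptions, the 5' positions would feed the 5prRNA tracks and the 3' positions the 3prRNA tracks listed above.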

Data processing

Base calling and generation of FASTQ files performed using standard CASAVA pipeline for HiSeq runs, bcl2fastq2 for NextSeq
RNA-seq and 4sU read pairs containing one or more members with mean quality score <20 filtered; start-seq trimmed for adapter using cutadapt 1.9.1, pairs containing reads trimmed shorter than 20 nt filtered; iCLIP sequencing reads were collapsed for PCR duplicates, trimmed of 5' and 3' adaptors
RNA-seq and 4sU mapped to reference using tophat 2.0.4, in a strand-specific manner, with bowtie1 as the underlying aligner, also to index of ERCC spikes using bowtie 0.12.8; start-seq mapped to index composed of FlyBase annotated rRNA/tRNA and spike-in sequences, successfully aligned pairs filtered, remaining mapped to reference using bowtie 0.12.8 retaining uniquely mappable pairs only, allowing 2 mismatches; PolII ChIP-seq mapped against both mm9 and dm3 reference genomes, retaining uniquely aligned reads only, allowing two mismatches using bowtie 0.12.8; MNase-seq and other ChIP-seq mapped using bowtie 0.12.8 retaining uniquely mapped pairs only, allowing 2 mismatches; iCLIP-seq mapped to the mouse (mm9) genome using STAR
RNA-seq and 4sU strand-specific coverage tracks were generated using genomeCoverageBed, normalized with custom scripts using factors determined by DESeq 1.12.2 based on ERCC spike counts and combined with unionBedGraphs and custom scripts (total count per nt for RNA-seq, mean count for 4sU); start-seq strand-specific bedGraphs generated using custom scripts based on 5' mapping location of end 1 reads only, normalized by linear regression slope of spike-in counts per sample versus mean of control samples, total normalized counts per condition determined using custom scripts; MNase-seq alignments were concatenated and fragments of length <120 and >180 filtered and converted to BED format using custom script, coverage track generated using genomeCoverageBed; ChIP-seq bedGraphs were generated by removing duplicate fragments, filtering very short or long fragments (NELF-B: <60, >410; PolII and Spt5: <60, >380; H3K4me1: <70, >500; H3K4me3: <70, >525; H3K27ac: <50, >450), and determining counts of fragment centers in 25 nt bins tiling the genome, using custom scripts, PolII samples were normalized such that the counts aligning to mm9 were equal, and replicates combined using custom scripts; iCLIP-seq RT stop identification, gene assignment, and peak calling were performed as in FAST-iCLIP (PMID: 25411354)
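The start-seq normalization "by linear regression slope of spike-in counts per sample versus mean of control samples" can be sketched as follows. The custom scripts are not part of this record, so this is an assumption-laden illustration only: it assumes ordinary least squares with an intercept, regresses the sample's spike-in counts on the control means, and divides sample counts by the resulting slope. The function names `ols_slope` and `normalization_factor` are hypothetical.

```python
from typing import Sequence


def ols_slope(x: Sequence[float], y: Sequence[float]) -> float:
    """Slope of the ordinary least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def normalization_factor(sample_spikes: Sequence[float],
                         control_mean_spikes: Sequence[float]) -> float:
    """Multiplicative factor that rescales a sample to the control level.

    Assumption: a slope above 1 means the sample recovered more
    spike-in signal than the controls, so its counts are scaled down.
    """
    return 1.0 / ols_slope(control_mean_spikes, sample_spikes)


# Toy spike-in counts (made-up numbers, one value per spike-in species).
control_mean = [10.0, 20.0, 30.0]
sample = [20.0, 40.0, 60.0]
factor = normalization_factor(sample, control_mean)
```

In this toy case the sample has twice the spike-in signal of the controls, so each of its bedGraph counts would be multiplied by `factor` before replicates are combined.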
Genome_build: dm3 (ChIP-seq, RNA-seq, 4sU, start-seq), mm9 (MNase, iCLIP)
Supplementary_files_format_and_content: RNA-seq and 4sU: bigWig containing combined normalized coverage of all replicates; start-seq: bedGraph containing combined normalized count of end 1 5' mapping locations for all replicates; MNase-seq: bedGraph containing combined coverage of 120-180 nt fragments for all replicates; ChIP-seq: bedGraph containing combined count of fragment centers for all replicates, in 25 nt bins
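The ChIP-seq bedGraph content described above (duplicate removal, fragment-length filtering, and counts of fragment centers in 25 nt bins) could be computed along these lines. This is a hedged sketch rather than the submitters' custom script: `bin_fragment_centers` is a hypothetical name, fragments are assumed to be 0-based half-open `(start, end)` tuples on a single chromosome, and the center is taken as the floored midpoint.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def bin_fragment_centers(fragments: Iterable[Tuple[int, int]],
                         min_len: int,
                         max_len: int,
                         bin_size: int = 25) -> Dict[int, int]:
    """Count fragment centers per bin, keyed by bin start coordinate.

    Fragments shorter than min_len or longer than max_len are dropped,
    matching thresholds of the form "<60, >410" in the record.
    """
    counts: Counter = Counter()
    for start, end in set(fragments):        # set() removes duplicate fragments
        length = end - start
        if length < min_len or length > max_len:
            continue
        center = (start + end) // 2          # assumed definition of the center
        counts[(center // bin_size) * bin_size] += 1
    return dict(sorted(counts.items()))


# Toy fragments (made-up coordinates) with the NELF-B thresholds.
toy = [(0, 100), (0, 100), (10, 110), (0, 30), (1000, 1200)]
binned = bin_fragment_centers(toy, min_len=60, max_len=410)
```

Each key/value pair corresponds to one bedGraph line spanning `bin_start` to `bin_start + 25` on the fragment's chromosome.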

Submission date

Aug 04, 2016

Last update date

May 15, 2019

Contact name

Karen Adelman

E-mail(s)

[email protected]

Organization name

Harvard Medical School

Department

Biological Chemistry and Molecular Pharmacology