Sample GSM4970221
Status Public on Jan 31, 2021
Title ChIP-seq_Pla-B1h_inputPolII_rep2
Sample type SRA
 
Source name K562 erythroleukemia cells
Organism Homo sapiens
Characteristics method protocol: ChIP-seq (Morey et al., 2012; Stock et al., 2007)
chip antibody: None
Treatment protocol Cells at a density of 300,000 cells/mL were treated with DMSO (1:20,000 dilution, Sigma, D2438) as solvent control or with 1 μM Pladienolide-B from a DMSO-resuspended 20 mM stock (1:20,000 dilution, Santa Cruz, sc-391691).
Growth protocol Human K562 cells were obtained from DSMZ (DSMZ no.: ACC-10) and cultured in antibiotic-free RPMI 1640 medium (Thermo Fisher Scientific, 31870–074) supplemented with 10% heat-inactivated fetal bovine serum (Thermo Fisher Scientific, 10500–064) and 2 mM GlutaMAX (Thermo Fisher Scientific, 35050087) at 37 °C and 5% CO2. Cells were verified to be free of mycoplasma contamination using Plasmo Test Mycoplasma Detection Kit (InvivoGen, rep-pt1). K562 cells were authenticated at the DSMZ Identification Service according to standards for STR profiling (ASN-0002). Biological replicates were grown independently.
Extracted molecule genomic DNA
Extraction protocol NEBNext® Ultra™ II DNA Library Prep Kit
 
Library strategy ChIP-Seq
Library source genomic
Library selection ChIP
Instrument model NextSeq 550
 
Data processing TT-seq and RNA-seq: Paired-end 75 and 150 bp reads with additional 6 bp of barcodes were obtained for each group of samples. Two replicates were treated with Pladienolide B (Pla-B) and two replicates were treated with a control solvent (DMSO). Reads were aligned to the hg20/hg38 (GRCh38) genome assembly (Human Genome Reference Consortium) using STAR 2.6.0 (Dobin et al., 2013), with the following specifications: outFilterMismatchNmax 2, outFilterMultimapScoreRange 0 and alignIntronMax 500000. Bam files were filtered with Samtools (Li et al., 2009) to remove alignments with MAPQ smaller than 7 (-q 7) and only proper pairs (-f2) were selected. Read counts for different features were calculated with HTSeq (Anders et al., 2015). Further data processing was carried out using the R/Bioconductor environment. Expressed transcripts were defined as possessing more than 50 read counts per kilobase (RPK) in two summarized replicates of TT-seq solvent control (DMSO). Prior to quantification, data was normalized by using added RNA spike-in as described previously (Schwalb et al., 2016).
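The alignment filter and the expression threshold described above can be illustrated with a minimal Python sketch. It is not the submitters' pipeline (which used Samtools, HTSeq and R/Bioconductor); the function names are hypothetical, and the read count passed to `is_expressed` is assumed to be already summarized over the two replicates, since the record does not say how the replicates were combined.

```python
PROPER_PAIR = 0x2  # SAM flag bit: read mapped in a proper pair


def keep_alignment(flag: int, mapq: int, min_mapq: int = 7) -> bool:
    """Mirror the logic of `samtools view -q 7 -f 2`.

    Alignments with MAPQ smaller than min_mapq are dropped, and only
    alignments whose flag has the proper-pair bit set are kept.
    """
    return mapq >= min_mapq and (flag & PROPER_PAIR) != 0


def rpk(read_count: float, length_bp: int) -> float:
    """Read counts per kilobase of feature length."""
    return read_count / (length_bp / 1000.0)


def is_expressed(read_count: float, length_bp: int, threshold: float = 50.0) -> bool:
    """Expressed means strictly more than `threshold` RPK.

    Assumption: read_count is the count already summarized over replicates.
    """
    return rpk(read_count, length_bp) > threshold
```

For example, a read with flag 99 (paired, proper pair, mate reverse, first in pair) and MAPQ 7 passes the filter, while the same read with MAPQ 6 does not.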
mNET-seq: Paired-end 42 bp reads with additional 6 bp of barcodes were obtained for each of the samples. Reads were trimmed for adapter content with Cutadapt (1.18, RRID:SCR_011841) (Martin, 2011) with -O 12 -m 25 -a TGGAATTCTCGG -A GATCGTCGGACT. mNET-seq data was normalized using S. cerevisiae RNA spike-ins. To this end, a combined genome was generated using the Ensembl genome assembly for both human hg38 (GRCh38) and S. cerevisiae (R64-1-1), against which the reads were mapped using STAR (2.6.0, RRID:SCR_015899) (Dobin et al., 2013). Around 80% and 20% of reads mapped to the human and S. cerevisiae genomes, respectively. Bam files were filtered with Samtools (1.3.1, RRID:SCR_002105) (Li et al., 2009) to remove alignments with MAPQ smaller than 7 (-q 7) and only proper pairs (-f2) were selected. Read counts for different features were calculated with HTSeq (0.6.1.p1, RRID:SCR_005514) (Anders et al., 2015). Further data processing was carried out using the R/Bioconductor environment. Antisense bias ratio was determined using positions in regions without antisense annotation with a coverage of at least 100 according to the defined major isoforms. Data was normalized using added S. cerevisiae RNA spike-ins. mNET-seq coverage was normalized with a median of ratios method (Love et al., 2014) using the antisense-corrected counts for S. cerevisiae transcripts with an RPK of 100 or higher in two summarized replicates of mNET-seq solvent control (DMSO).
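The median of ratios normalization mentioned above can be sketched in plain Python as follows. This is a generic illustration of the method and not the submitters' R/Bioconductor code; the function name and the input layout (one list of spike-in transcript counts per sample, in the same transcript order) are assumptions, and the antisense correction and the RPK >= 100 preselection are taken to have been applied beforehand.

```python
import math
from statistics import median


def median_of_ratios_size_factors(counts):
    """Estimate one size factor per sample with a median of ratios approach.

    counts: dict mapping sample name -> list of spike-in transcript counts,
            with all lists in the same transcript order (assumed layout).
    Transcripts with a zero count in any sample are skipped, because their
    geometric mean across samples is zero.
    """
    samples = list(counts)
    n_transcripts = len(counts[samples[0]])

    # Log geometric mean across samples for every usable transcript.
    log_geo_means = []
    for i in range(n_transcripts):
        values = [counts[s][i] for s in samples]
        if all(v > 0 for v in values):
            log_geo_means.append(sum(math.log(v) for v in values) / len(values))
        else:
            log_geo_means.append(None)

    # Size factor = median over transcripts of count / geometric mean.
    factors = {}
    for s in samples:
        log_ratios = [
            math.log(counts[s][i]) - g
            for i, g in enumerate(log_geo_means)
            if g is not None
        ]
        factors[s] = math.exp(median(log_ratios))
    return factors
```

Dividing each sample's coverage by its size factor then places the samples on a common scale defined by the spike-in transcripts.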
ChIP-seq: Paired-end 42 or 75 bp reads with additional 6 bp of barcodes were obtained for each of the samples. Reads were aligned using Bowtie 2 (2.3.5, RRID:SCR_005476) (Langmead and Salzberg, 2012) to both human hg38 (GRCh38) and D. melanogaster (BDGP6.28). Bam files were filtered with Samtools (1.3.1, RRID:SCR_002105) (Li et al., 2009) to remove alignments with MAPQ smaller than 7 (-q 7) and only proper pairs (-f2) were selected. Further data processing was carried out using the R/Bioconductor environment. ChIP-seq coverages were obtained from piled-up counts for every genomic position, using physical coverage, that is, counting both sequenced bases covered by reads and unsequenced bases spanned between proper mate-pair reads. Data was normalized using added D. melanogaster RNA spike-ins. Normalization factors were obtained by dividing the total D. melanogaster read counts for each sample by the total read counts of the sample with the lowest read counts. ChIP-seq coverages were divided by the respective normalization factors.
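A minimal Python sketch of the ChIP-seq physical coverage and spike-in normalization described above is given here. It is an illustration only, not the submitters' R/Bioconductor code. All function names are hypothetical, fragments are assumed to be given as 0-based half-open spans from the leftmost mate start to the rightmost mate end, and the reference value is assumed to be the D. melanogaster read count of the sample with the lowest such count, which the record does not state explicitly.

```python
def spike_in_normalization_factors(spike_in_counts):
    """Divide each sample's spike-in read count by the lowest count.

    spike_in_counts: dict sample name -> total D. melanogaster read count.
    Assumption: the reference is the lowest spike-in count among samples.
    """
    lowest = min(spike_in_counts.values())
    return {sample: count / lowest for sample, count in spike_in_counts.items()}


def physical_coverage(fragments, region_length):
    """Per-position pile-up over whole fragments of proper pairs.

    Counting the full span covers both the sequenced bases and the
    unsequenced bases between the two mates.
    """
    diff = [0] * (region_length + 1)
    for start, end in fragments:
        start = max(start, 0)
        end = min(end, region_length)
        if start < end:
            diff[start] += 1  # coverage rises at the fragment start
            diff[end] -= 1    # and falls just past the fragment end
    coverage = []
    running = 0
    for delta in diff[:region_length]:
        running += delta
        coverage.append(running)
    return coverage


def normalize_coverage(coverage, factor):
    """Divide a coverage track by its sample's normalization factor."""
    return [value / factor for value in coverage]
```

With hypothetical spike-in counts of 2000 and 1000 reads, the factors are 2.0 and 1.0, so the coverage of the first sample is halved and the second is left unchanged.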
Major isoform annotation: Salmon (0.13.1, RRID:SCR_017036) (Patro et al., 2017) was used in order to select the major isoforms present in our dataset. RNA-seq samples for 1 h DMSO or 1 μM Pla-B treatments were mapped against curated RefSeq annotated protein-coding isoforms (UCSC RefSeq GRCh38, downloaded in April 2019). For each gene, the major isoform was determined as the one with maximum mean Transcripts Per Million (TPM) value across all RNA-seq samples. Major isoforms were excluded from further analysis if they represented less than 70% of the gene isoforms based on the calculated mean TPM value. Additionally, major isoforms associated with overlapping genes as well as isoforms located on chromosomes X, Y and M were discarded from further analysis. The final major isoform annotation includes 6,694 isoforms containing 65,976 exons and 59,282 introns. A total of 5,535 major transcript isoforms of protein-coding genes with RPK >= 50 of TT-seq solvent control (DMSO) were included in the analysis.
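The major isoform selection described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the submitters' code: the record layout, the function name and the chromosome labels (`chrX`, `chrY`, `chrM`) are hypothetical, the 70% criterion is interpreted as the major isoform's share of the gene's summed mean TPM, and the exclusion of overlapping genes is omitted.

```python
from statistics import mean

# Assumed UCSC-style chromosome names for the excluded chromosomes.
EXCLUDED_CHROMS = {"chrX", "chrY", "chrM"}


def select_major_isoforms(isoforms, min_fraction=0.70):
    """Pick one major isoform per gene from per-sample TPM values.

    isoforms: list of dicts with keys "gene", "isoform", "chrom" and
              "tpm" (list of TPM values, one per RNA-seq sample).
    Returns a dict gene -> major isoform identifier.
    """
    by_gene = {}
    for iso in isoforms:
        by_gene.setdefault(iso["gene"], []).append(iso)

    selected = {}
    for gene, group in by_gene.items():
        means = [(mean(iso["tpm"]), iso) for iso in group]
        total = sum(m for m, _ in means)
        best_mean, best = max(means, key=lambda pair: pair[0])
        if total == 0:
            continue  # gene not detected in any sample
        if best_mean / total < min_fraction:
            continue  # major isoform is below 70% of the gene's mean TPM
        if best["chrom"] in EXCLUDED_CHROMS:
            continue  # isoforms on chromosomes X, Y and M are discarded
        selected[gene] = best["isoform"]
    return selected
```

A gene with two isoforms at mean TPM 85 and 15 keeps the first isoform (share 0.85), whereas a 60/40 split leaves the gene without a major isoform.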
Intronless genes annotation: Intronless genes were defined as RefSeq annotated genes comprising one single isoform with one single exon. To avoid effects from neighboring intron-containing genes, only intronless genes at least 1 kb distant from the neighboring intron-containing annotated transcripts (strand independent) were included in the analysis. Moreover, because a long 3′ UTR has been recently reported to be alternatively spliced in an intronless gene (François et al., 2018), only intronless genes with UTRs <= 100 bp were included. A total of 51 expressed protein-coding intronless genes were included in the analysis.
Genome_build: hg20/hg38 (GRCh38) genome assembly (Human Genome Reference Consortium http://hgdownload.soe.ucsc.edu/downloads.html#human)
Supplementary_files_format_and_content: bigwig coverage files
 
Submission date Dec 10, 2020
Last update date Jan 31, 2021
Contact name Sara Patricia Monteiro Martins
E-mail(s) [email protected]
Organization name MPI for biophysical chemistry
Department Molecular Biology
Lab Cramer
Street address Fassberg 11
City Göttingen
ZIP/Postal code 37077
Country Germany
 
Platform ID GPL21697
Series (1)
GSE148433 Efficient RNA polymerase II pause release requires U2 snRNP function
Relations
BioSample SAMN17053593
SRA SRX9671280

Supplementary file Size Download File type/resource
GSM4970221_ChIP-seq_Pla-B1h_inputPolII_rep2.bw 898.9 Mb (ftp)(http) BW
Raw data are available in SRA
Processed data provided as supplementary file
Processed data are available on Series record
