Status |
Public on Jan 31, 2021 |
Title |
ChIP-seq_Pla-B1h_inputPolII_rep2 |
Sample type |
SRA |
|
|
Source name |
K562 erythroleukemia cells
|
Organism |
Homo sapiens |
Characteristics |
method protocol: ChIP-seq (Morey et al., 2012; Stock et al., 2007)
chip antibody: None
|
Treatment protocol |
Cells at a density of 300,000 cells/mL were treated with DMSO (1:20,000 dilution, Sigma, D2438) as solvent control or with 1 μM Pladienolide-B from a DMSO-resuspended 20 mM stock (1:20,000 dilution, Santa Cruz, sc-391691).
|
Growth protocol |
Human K562 cells were obtained from DSMZ (DSMZ no.: ACC-10) and cultured in antibiotic-free RPMI 1640 medium (Thermo Fisher Scientific, 31870–074) supplemented with 10% heat-inactivated fetal bovine serum (Thermo Fisher Scientific, 10500–064) and 2 mM GlutaMAX (Thermo Fisher Scientific, 35050087) at 37 °C and 5% CO2. Cells were verified to be free of mycoplasma contamination using Plasmo Test Mycoplasma Detection Kit (InvivoGen, rep-pt1). K562 cells were authenticated at the DSMZ Identification Service according to standards for STR profiling (ASN-0002). Biological replicates were grown independently.
|
Extracted molecule |
genomic DNA |
Extraction protocol |
NEBNext® Ultra™ II DNA Library Prep Kit
|
|
|
Library strategy |
ChIP-Seq |
Library source |
genomic |
Library selection |
ChIP |
Instrument model |
NextSeq 550 |
|
|
Data processing |
TT-seq and RNA-seq: Paired-end 75 and 150 bp reads with additional 6 bp of barcodes were obtained for each group of samples. Two replicates were treated with Pladienolide B (Pla-B) and two replicates were treated with a control solvent (DMSO). Reads were aligned to the hg20/hg38 (GRCh38) genome assembly (Human Genome Reference Consortium) using STAR 2.6.0 (Dobin et al., 2013), with the following specifications: outFilterMismatchNmax 2, outFilterMultimapScoreRange 0 and alignIntronMax 500000. Bam files were filtered with Samtools (Li et al., 2009) to remove alignments with MAPQ smaller than 7 (-q 7) and to retain only proper pairs (-f 2). Read counts for different features were calculated with HTSeq (Anders et al., 2015). Further data processing was carried out using the R/Bioconductor environment. Expressed transcripts were defined as those with more than 50 read counts per kilobase (RPK) in two summarized replicates of TT-seq solvent control (DMSO). Prior to quantification, data was normalized using added RNA spike-ins as described previously (Schwalb et al., 2016).
mNET-seq: Paired-end 42 bp reads with additional 6 bp of barcodes were obtained for each of the samples. Reads were trimmed for adapter content with Cutadapt (1.18, RRID:SCR_011841) (Martin, 2011) using the options -O 12 -m 25 -a TGGAATTCTCGG -A GATCGTCGGACT. mNET-seq data was normalized using S. cerevisiae RNA spike-ins. To this end, a combined genome was generated from the Ensembl genome assemblies for human hg38 (GRCh38) and S. cerevisiae (R64-1-1), against which the reads were mapped using STAR (2.6.0, RRID:SCR_015899) (Dobin et al., 2013). Around 80% and 20% of reads mapped to the human and S. cerevisiae genomes, respectively. Bam files were filtered with Samtools (1.3.1, RRID:SCR_002105) (Li et al., 2009) to remove alignments with MAPQ smaller than 7 (-q 7) and to retain only proper pairs (-f 2). Read counts for different features were calculated with HTSeq (0.6.1.p1, RRID:SCR_005514) (Anders et al., 2015). Further data processing was carried out using the R/Bioconductor environment. The antisense bias ratio was determined from positions with a coverage of at least 100 in regions without antisense annotation, according to the defined major isoforms. Data was normalized using added S. cerevisiae RNA spike-ins. mNET-seq coverage was normalized with a median of ratios method (Love et al., 2014) using the antisense-corrected counts for S. cerevisiae transcripts with an RPK of 100 or higher in two summarized replicates of mNET-seq solvent control (DMSO).
ChIP-seq: Paired-end 42 or 75 bp reads with additional 6 bp of barcodes were obtained for each of the samples. Reads were aligned using Bowtie 2 (2.3.5, RRID:SCR_005476) (Langmead and Salzberg, 2012) to both human hg38 (GRCh38) and D. melanogaster (BDGP6.28). Bam files were filtered with Samtools (1.3.1, RRID:SCR_002105) (Li et al., 2009) to remove alignments with MAPQ smaller than 7 (-q 7) and to retain only proper pairs (-f 2). Further data processing was carried out using the R/Bioconductor environment. ChIP-seq coverages were obtained from piled-up counts for every genomic position, using physical coverage, that is, counting both sequenced bases covered by reads and unsequenced bases spanned between proper mate-pair reads. Data was normalized using added D. melanogaster RNA spike-ins. Normalization factors were obtained by dividing the total D. melanogaster read counts for each sample by the total read counts of the sample with the lowest read counts. ChIP-seq coverages were divided by the respective normalization factors.
Major isoform annotation: Salmon (0.13.1, RRID:SCR_017036) (Patro et al., 2017) was used to select the major isoforms present in our dataset. RNA-seq samples for 1 h DMSO or 1 μM Pla-B treatments were mapped against curated RefSeq annotated protein-coding isoforms (UCSC RefSeq GRCh38, downloaded in April 2019). For each gene, the major isoform was determined as the one with the maximum mean Transcripts Per Million (TPM) value across all RNA-seq samples. Major isoforms were excluded from further analysis if they represented less than 70% of the gene isoforms based on the calculated mean TPM value. Additionally, major isoforms associated with overlapping genes as well as isoforms located on chromosomes X, Y and M were discarded from further analysis. The final major isoform annotation includes 6,694 isoforms containing 65,976 exons and 59,282 introns. A total of 5,535 major transcript isoforms of protein-coding genes with RPK >= 50 in the TT-seq solvent control (DMSO) were included in the analysis.
Intronless gene annotation: Intronless genes were defined as RefSeq annotated genes comprising one single isoform with one single exon. To avoid effects from neighboring intron-containing genes, only intronless genes at least 1 kb distant from the neighboring intron-containing annotated transcripts (strand independent) were included in the analysis. Moreover, because a long 3′ UTR has recently been reported to be alternatively spliced in an intronless gene (François et al., 2018), only intronless genes with UTRs <= 100 bp were included. A total of 51 expressed protein-coding intronless genes were included in the analysis.
Genome_build: hg20/hg38 (GRCh38) genome assembly (Human Genome Reference Consortium http://hgdownload.soe.ucsc.edu/downloads.html#human)
Supplementary_files_format_and_content: bigwig coverage files
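The ChIP-seq spike-in normalization described under Data processing can be illustrated with a minimal Python sketch. It is not the submitters' R/Bioconductor code: the function names, sample labels and read totals are hypothetical, and it assumes the total D. melanogaster read count per sample is already known.

```python
def spike_in_normalization_factors(spike_in_counts):
    """Divide each sample's spike-in read total by the lowest total."""
    lowest = min(spike_in_counts.values())
    if lowest <= 0:
        raise ValueError("spike-in read counts must be positive")
    return {sample: count / lowest for sample, count in spike_in_counts.items()}


def normalize_coverage(coverage, factor):
    """Divide a per-position coverage vector by the sample's factor."""
    return [value / factor for value in coverage]


# Hypothetical D. melanogaster read totals (illustrative values only).
spike_in_counts = {
    "DMSO_rep1": 2_000_000,
    "DMSO_rep2": 1_000_000,
    "PlaB_rep1": 1_500_000,
}
factors = spike_in_normalization_factors(spike_in_counts)

# Hypothetical physical coverage at three positions of DMSO_rep1.
scaled = normalize_coverage([4.0, 8.0, 2.0], factors["DMSO_rep1"])
```

With this definition the sample with the fewest spike-in reads gets a factor of 1, and every other sample's coverage is scaled down in proportion to its spike-in read total.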
|
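The major isoform selection described under Data processing can be sketched in Python as follows. This is an illustration under assumptions, not the submitters' pipeline. It reads the 70% cutoff as the major isoform's share of the summed mean TPM of all isoforms of the gene. The function name, isoform IDs and TPM values are hypothetical, and the filters for overlapping genes and chromosomes X, Y and M are left out.

```python
from collections import defaultdict
from statistics import mean


def select_major_isoforms(tpm_by_isoform, gene_of, min_fraction=0.70):
    """Return {gene: isoform} for genes whose top isoform passes the cutoff.

    tpm_by_isoform: {isoform_id: [TPM in each RNA-seq sample]}
    gene_of:        {isoform_id: gene_id}
    """
    mean_tpm = {iso: mean(values) for iso, values in tpm_by_isoform.items()}

    # Group (mean TPM, isoform) pairs by gene.
    by_gene = defaultdict(list)
    for iso, value in mean_tpm.items():
        by_gene[gene_of[iso]].append((value, iso))

    selected = {}
    for gene, isoforms in by_gene.items():
        total = sum(value for value, _ in isoforms)
        if total == 0:
            continue  # gene not detected in any sample
        top_value, top_iso = max(isoforms)
        # Assumed reading of the cutoff: share of the gene's summed mean TPM.
        if top_value / total >= min_fraction:
            selected[gene] = top_iso
    return selected


# Hypothetical isoforms and TPM values across two RNA-seq samples.
tpm = {
    "NM_A1": [80.0, 100.0],
    "NM_A2": [10.0, 10.0],
    "NM_B1": [30.0, 30.0],
    "NM_B2": [20.0, 20.0],
}
gene_of = {"NM_A1": "GENE_A", "NM_A2": "GENE_A", "NM_B1": "GENE_B", "NM_B2": "GENE_B"}
major_isoforms = select_major_isoforms(tpm, gene_of)
```

In this toy input GENE_A keeps NM_A1, which carries 90% of the gene's mean TPM, while GENE_B is dropped because its top isoform reaches only 60%.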
|
|
Submission date |
Dec 10, 2020 |
Last update date |
Jan 31, 2021 |
Contact name |
Sara Patricia Monteiro Martins |
E-mail(s) |
[email protected]
|
Organization name |
MPI for biophysical chemistry
|
Department |
Molecular Biology
|
Lab |
Cramer
|
Street address |
Fassberg 11
|
City |
Göttingen |
ZIP/Postal code |
37077 |
Country |
Germany |
|
|
Platform ID |
GPL21697 |
Series (1) |
GSE148433 |
Efficient RNA polymerase II pause release requires U2 snRNP function |
|
Relations |
BioSample |
SAMN17053593 |
SRA |
SRX9671280 |
Supplementary file | Size | Download | File type/resource |
GSM4970221_ChIP-seq_Pla-B1h_inputPolII_rep2.bw | 898.9 Mb | (ftp)(http) | BW |
Raw data are available in SRA |
Processed data provided as supplementary file |
Processed data are available on Series record |