Status |
Public on Feb 21, 2014 |
Title |
40repbio1 |
Sample type |
SRA |
|
|
Source name |
Entire fruit (pericarp, placenta and seed)
|
Organism |
Capsicum annuum |
Characteristics |
cultivar: 'Serrano Tampiqueño 74'
developmental stage: 40 days after anthesis (DAA)
tissue: Entire fruit (including seeds)
|
Treatment protocol |
Fruits were collected at random from different plants at 10, 20, 40 and 60 days after anthesis (DAA). After harvest, the fruits were cleaned with ethanol, immediately frozen in liquid nitrogen and stored at -80 °C until use.
|
Growth protocol |
Capsicum annuum cultivar 'Serrano Tampiqueño 74' was germinated and cultivated under optimal conditions in a completely randomized experimental design in the greenhouse facilities. The plants grew during the Summer-Spring period and the flowers were tagged immediately after anthesis.
|
Extracted molecule |
total RNA |
Extraction protocol |
Total RNA was extracted with the NucleoSpin RNA Plant kit (Macherey-Nagel). Contaminating genomic DNA was removed by DNase I (Macherey-Nagel) treatment during RNA isolation, in accordance with the manufacturer's protocol. The eight samples (two biological replicates for each developmental stage: 10, 20, 40 and 60 DAA) were prepared for RNA-seq following the Illumina TruSeq RNA Sample Preparation v2 Guide according to the manufacturer's instructions.
|
|
|
Library strategy |
RNA-Seq |
Library source |
transcriptomic |
Library selection |
cDNA |
Instrument model |
Illumina MiSeq |
|
|
Data processing |
The eight cDNA libraries were sequenced from both the 5' and 3' ends in a flow cell on the Illumina MiSeq System according to the manufacturer's instructions. Three sequencing runs (technical replicates) were performed to increase sequencing depth. Fluorescent image processing, base calling and quality-value calculation for the three runs were performed by the Illumina MiSeq Control Software, yielding 150 bp paired-end reads.

Before assembly, the raw reads were filtered with PRINSEQ 0.20.3 to obtain high-quality clean reads by removing duplicated sequences, reads containing more than 2% N (the "N" character represents an ambiguous base), low-complexity reads with entropy below 70, and low-quality reads with a mean quality score (Q-value) ≤ 25. The Q-value is the quality score assigned to each base by the base caller of the Illumina MiSeq Control Software, similar to the Phred score of the base call. The command-line parameters for PRINSEQ 0.20.3 were:
-fastq 1.fq -fastq2 2.fq -out_format 3 -min_qual_mean 25 -ns_max_p 2 -noniupac -derep 1 -lc_method entropy -lc_threshold 70

De novo assembly of the clean reads was performed with Trinity (release 20121005) on the DIAG (Data Intensive Academic Grid) facilities, with 48 GB RAM per node and 32 GB for the Jellyfish step. The k-mer size in Trinity is 25 by default, and all other assembly parameters were left at their defaults. The command-line parameters used in the assembly were:
--seqType fq --left 1.fq --right 2.fq --CPU 8 --JM 32G --no_cleanup

RSEM version 1.2.0 was used to remap the reads to the assembled contigs and to quantify the transcripts of the 45,505 genes and 99,487 isoforms assembled with Trinity. RSEM estimates expression levels while taking read-mapping uncertainty into account, using an Expectation-Maximization algorithm. Briefly, the process consisted of two steps. First, a set of reference transcript sequences was generated and pre-processed by the script rsem-prepare-reference, using bowtie version 0.12.7 to construct the indexes. The following default parameters were used:
rsem-prepare-reference name_assembled_file.fasta custom_reference_name

Genes that, according to the RSEM model, had a sum of 0 tags across all libraries were removed using R.15.3 commands; this reduced the number of genes to 42,401. The expression data for the 42,401 contigs in the eight sequenced libraries were summarized by adding the counts of contigs that shared the same identifier; i.e., contigs with the same identifier were considered to represent the same gene. This resulted in a data matrix with 34,066 rows (chili pepper genes) and eight columns (libraries).

For normalization we employed a novel method, based on the work of Good, that efficiently removes the bias in the fold change caused by this factor. To evaluate differential gene expression (DGE) between neighboring intervals, namely 10 to 20, 20 to 40 and 40 to 60 DAA, we used the R package edgeR. Briefly, for each contrast (neighboring interval) we entered the data using the DGEList function, estimated common and tagwise dispersions, entered the corresponding normalization factors and performed the exact test via the exactTest function. P values resulting from the exact test were then fed into the qvalue function [17] with default parameters, except that fdr.level = 0.01 was set to obtain a false discovery rate of 1%.

A de novo transcriptome reference assembly built with Trinity (release 20121005) was used.
Genome_build: n/a
Supplementary_files_format_and_content: tab-delimited Excel files that include TPM values for each sample
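The read-filtering criteria described above can be illustrated with a minimal Python sketch. It is not PRINSEQ and does not reproduce its implementation: it covers only the mean-quality, N-content and exact-duplicate criteria (the entropy-based complexity filter is omitted), and it treats each read on its own rather than as part of a pair. The function names, the Phred+33 quality encoding, and the choice to keep reads whose mean quality is at least 25 (following the -min_qual_mean option) are assumptions made for the example.

```python
from typing import Iterable, Iterator, Tuple

# (read name, nucleotide sequence, quality string); a hypothetical in-memory layout
Read = Tuple[str, str, str]


def mean_phred(quality: str, offset: int = 33) -> float:
    """Mean per-base quality of one read, assuming Phred+33 encoding."""
    if not quality:
        return 0.0
    return sum(ord(ch) - offset for ch in quality) / len(quality)


def n_percent(sequence: str) -> float:
    """Percentage of ambiguous 'N' bases in one read."""
    if not sequence:
        return 0.0
    return 100.0 * sequence.upper().count("N") / len(sequence)


def filter_reads(
    reads: Iterable[Read],
    min_mean_q: float = 25.0,
    max_n_pct: float = 2.0,
) -> Iterator[Read]:
    """Yield reads that pass the quality, N-content and duplicate checks."""
    seen = set()  # sequences already emitted, used to drop exact duplicates
    for name, seq, qual in reads:
        if mean_phred(qual) < min_mean_q:
            continue  # mean quality below the threshold
        if n_percent(seq) > max_n_pct:
            continue  # too many ambiguous bases
        if seq in seen:
            continue  # exact duplicate of an earlier read
        seen.add(seq)
        yield name, seq, qual
```

For example, a read whose quality string consists only of the character "I" has a mean quality of 40 under Phred+33 and passes the quality check, while a 100 bp read containing three N bases (3%) is discarded.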
|
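The two count-matrix steps in the data processing description (removing genes with zero tags in all libraries, then adding the counts of contigs that share an identifier) can be sketched as follows. The submitters did this in R; the Python version below is only an illustration of the arithmetic. The function names, the toy contig names and the contig-to-identifier mapping are hypothetical, since the record does not state how identifiers were assigned.

```python
from typing import Dict, List

Counts = Dict[str, List[float]]  # contig name -> one count per library


def drop_all_zero(counts: Counts) -> Counts:
    """Keep only contigs with at least one tag in some library."""
    return {contig: row for contig, row in counts.items() if sum(row) > 0}


def sum_by_identifier(counts: Counts, identifier_of: Dict[str, str]) -> Counts:
    """Add up, library by library, the counts of contigs sharing an identifier."""
    totals: Counts = {}
    for contig, row in counts.items():
        gene = identifier_of[contig]  # hypothetical mapping supplied by the caller
        if gene not in totals:
            totals[gene] = [0.0] * len(row)
        totals[gene] = [a + b for a, b in zip(totals[gene], row)]
    return totals


# Toy data with three libraries instead of eight
counts = {
    "contig_a": [5.0, 0.0, 2.0],
    "contig_b": [1.0, 3.0, 0.0],
    "contig_c": [0.0, 0.0, 0.0],  # removed: zero tags in every library
    "contig_d": [4.0, 4.0, 4.0],
}
identifier_of = {
    "contig_a": "gene_1",
    "contig_b": "gene_1",
    "contig_c": "gene_2",
    "contig_d": "gene_3",
}

gene_matrix = sum_by_identifier(drop_all_zero(counts), identifier_of)
```

In this toy case, contig_a and contig_b collapse into a single gene_1 row, which is the same kind of reduction that takes the 42,401 filtered contigs down to 34,066 gene rows in the record.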
|
|
Submission date |
Jan 15, 2014 |
Last update date |
May 15, 2019 |
Contact name |
Octavio Martinez de la Vega |
E-mail(s) |
[email protected]
|
Organization name |
Cinvestav
|
Department |
Unidad de Genómica Avanzada (Langebio)
|
Lab |
Computational Biology
|
Street address |
km 9.6 Libramiento Norte
|
City |
Irapuato |
State/province |
Guanajuato |
ZIP/Postal code |
14631 |
Country |
Mexico |
|
|
Platform ID |
GPL18177 |
Series (1) |
GSE54123 |
Dynamics of the chili pepper transcriptome during fruit development |
|
Relations |
BioSample |
SAMN02585083 |
SRA |
SRX433773 |
Supplementary data files not provided |
Raw data are available in SRA |