GEO Accession viewer

Sample GSM1308336

Status

Public on Feb 21, 2014

Title

40repbio1

Sample type

SRA

Source name

Entire fruit (pericarp, placenta and seed)

Organism

Capsicum annuum

Characteristics

cultivar: Serrano Tampiqueño 74
developmental stage: 40 (days after anthesis)
tissue: Entire fruit (including seeds)

Treatment protocol

The fruits were randomly collected from different plants at stages 10, 20, 40 and 60 days after anthesis (DAA). After harvest, the fruits were cleaned with ethanol, immediately frozen in liquid nitrogen and stored at -80 °C until use.

Growth protocol

Capsicum annuum cultivar ‘Serrano Tampiqueño 74’ was germinated and cultivated under optimal conditions in a completely randomized experimental design in the greenhouse facilities. The plants grew during the spring-summer period, and the flowers were tagged immediately after anthesis.

Extracted molecule

total RNA

Extraction protocol

The NucleoSpin RNA Plant kit (Macherey-Nagel) was used for total RNA extraction, and contaminating genomic DNA was removed by DNase I (Macherey-Nagel) treatment during the RNA isolation procedure, in accordance with the manufacturer’s protocol.
The eight samples (two biological replicates for each developmental stage: 10, 20, 40 and 60 DAA) were prepared for RNA-seq following the Illumina TruSeq RNA Sample Preparation v2 Guide, according to the manufacturer’s instructions.

Library strategy

RNA-Seq

Library source

transcriptomic

Library selection

cDNA

Instrument model

Illumina MiSeq

Data processing

The 8 cDNA libraries were sequenced from both the 5’ and 3’ ends in a flow cell on the Illumina MiSeq System, according to the manufacturer’s instructions. We performed three sequencing runs (technical replicates) to increase the sequencing depth.
Fluorescent image processing, base-calling and quality value calculation for the three runs were performed by the Illumina MiSeq Control Software, yielding 150 bp paired-end reads.
Before assembly, the raw reads were filtered with PRINSEQ 0.20.3 to obtain high-quality clean reads by removing duplicated sequences, reads with an N rate above 2% (the “N” character represents ambiguous bases), low-complexity reads with entropy below 70, and low-quality reads with a mean quality score (Q-value) ≤ 25. The Q-value is the quality score assigned to each base by the base-caller of the Illumina MiSeq Control Software, similar to the Phred score of the base call. The command-line parameters for PRINSEQ 0.20.3 were: -fastq1.fq -fastq2.fq -out_format 3 -min_qual_mean 25 -ns_max_noniupac -derep 1 -lc_method entropy -lc_threshold 70
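The filtering criteria above can be illustrated with a minimal Python sketch. It is not PRINSEQ and does not reproduce its behaviour exactly: the function names are hypothetical, Phred+33 quality encoding is assumed, the entropy-based low-complexity filter is omitted, and only exact duplicates are removed. A read is kept when its mean quality is at least the threshold, which is our reading of the -min_qual_mean option.

```python
from statistics import mean


def mean_quality(qual, offset=33):
    """Mean Phred score of a FASTQ quality string (Phred+33 assumed)."""
    return mean(ord(ch) - offset for ch in qual)


def n_percent(seq):
    """Percentage of ambiguous 'N' bases in a read."""
    return 100.0 * seq.upper().count("N") / len(seq)


def filter_reads(reads, min_mean_q=25, max_n_pct=2.0):
    """Yield (read_id, seq, qual) records that pass the illustrative filters.

    Drops empty reads, exact duplicate sequences, reads whose N rate
    exceeds max_n_pct, and reads whose mean quality is below min_mean_q.
    """
    seen = set()
    for read_id, seq, qual in reads:
        if not seq or seq in seen:
            continue
        if n_percent(seq) > max_n_pct:
            continue
        if mean_quality(qual) < min_mean_q:
            continue
        seen.add(seq)
        yield read_id, seq, qual


# Toy input: r2 duplicates r1, r3 is 50% N, r4 has very low quality.
toy_reads = [
    ("r1", "ACGTACGT", "IIIIIIII"),
    ("r2", "ACGTACGT", "IIIIIIII"),
    ("r3", "NNNNACGT", "IIIIIIII"),
    ("r4", "ACGTTTGA", "########"),
]
kept_ids = [read_id for read_id, _, _ in filter_reads(toy_reads)]
```

With the toy input, only r1 survives. A paired-end workflow would also need to keep mates in sync, which this sketch does not attempt.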
De novo assembly of the clean reads was performed with Trinity (release 20121005) on the Data Intensive Academic Grid (DIAG) facilities, with 48 GB of RAM per node and 32 GB for the Jellyfish step. The k-mer size in Trinity is 25 by default, and the remaining assembly parameters were left at their defaults. The command-line parameters used in the assembly were: --seqType fq --left 1.fq --right 2.fq --CPU 8 --JM 32G --no_cleanup
RSEM version 1.2.0 was used to remap the reads to the assembled contigs and to quantify the 45505 genes and 99487 isoforms assembled with Trinity. This software estimates expression levels while taking read mapping uncertainty into account, using an Expectation-Maximization algorithm. Briefly, the process consisted of two steps. First, a set of reference transcript sequences was generated and pre-processed by the script rsem-prepare-reference, using bowtie version 0.12.7 to construct the indexes. The following default parameters were used: rsem-prepare-reference name_assembled_file.fasta custom_reference_name
We filtered out the genes that, according to the RSEM model, had a sum of 0 tags across all the libraries, using R.15.3 commands. This procedure reduced the number of genes to 42,401.
The expression data for the 42,401 contigs in the eight sequenced libraries was summarized by adding the counts of the contigs that shared the same identifier; i.e., we considered contigs with the same identifier to be representing the same gene. This resulted in a data matrix with 34,066 rows (chili pepper genes) and eight columns (libraries).
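The two table operations described above (dropping entries with zero counts in every library, then summing contigs that share an identifier) can be sketched in Python as follows. The original work used R; the function names, contig names and the contig-to-gene mapping here are hypothetical, and the toy table has three libraries instead of eight.

```python
def drop_all_zero(counts):
    """Keep only rows whose counts sum to more than zero across libraries."""
    return {name: row for name, row in counts.items() if sum(row) > 0}


def sum_by_identifier(counts, identifier_of):
    """Add up, library by library, the rows that share an identifier."""
    totals = {}
    for name, row in counts.items():
        key = identifier_of[name]
        if key not in totals:
            totals[key] = list(row)
        else:
            totals[key] = [a + b for a, b in zip(totals[key], row)]
    return totals


# Toy table: one row per contig, one column per library (made-up values).
toy_counts = {
    "contig_1": [5, 0, 2],
    "contig_2": [0, 0, 0],
    "contig_3": [1, 4, 0],
}
# Hypothetical mapping from contig to shared identifier.
toy_identifier = {"contig_1": "geneA", "contig_2": "geneB", "contig_3": "geneA"}

nonzero = drop_all_zero(toy_counts)
gene_matrix = sum_by_identifier(nonzero, toy_identifier)
```

Here contig_2 is removed because all of its counts are zero, and contig_1 and contig_3 collapse into a single geneA row.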
For normalization we employed a novel method, based on the work of Good, that efficiently removes the bias in the fold change caused by this factor.
To evaluate differential gene expression (DGE) between neighboring intervals, namely 10 to 20, 20 to 40 and 40 to 60 DAA, we used the R package edgeR. Briefly, for each contrast (neighboring interval) we entered the data using the DGEList function, estimated common and tagwise dispersions, entered the corresponding normalization factors and performed the exact test via the exactTest function.
P values resulting from the exact test were then fed into the qvalue function [17] with default parameters, except that we set fdr.level = 0.01 to obtain a false discovery rate of 1%.
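To illustrate what controlling the false discovery rate at 1% means in practice, here is a minimal Python sketch of the Benjamini-Hochberg adjustment. This is a swapped-in, simpler technique and not the qvalue function used in this record, which additionally estimates the proportion of true null hypotheses; the function name and the P values are made up for illustration.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in input order."""
    m = len(pvalues)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up P values standing in for exact-test results.
toy_pvalues = [0.001, 0.008, 0.039, 0.041, 0.9]
toy_adjusted = bh_adjust(toy_pvalues)
# Tests called significant at a 1% false discovery rate.
significant = [adj <= 0.01 for adj in toy_adjusted]
```

In this toy example only the first test remains significant at the 1% level.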
We used a de novo transcriptome reference assembly generated with Trinity (release 20121005).
Genome_build: n/a
Supplementary_files_format_and_content: tab-delimited Excel files that include TPM values for each sample
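For readers unfamiliar with the TPM (transcripts per million) unit in the supplementary files, the following Python sketch shows the standard definition: each count is divided by the transcript length, and the resulting rates are rescaled to sum to one million. It is an illustration only, with made-up counts and lengths and a hypothetical function name; RSEM derives its TPM values from its own model estimates and effective lengths.

```python
def tpm(counts, lengths):
    """Transcripts per million from per-transcript counts and lengths."""
    # Length-normalized rate for each transcript.
    rates = [c / l for c, l in zip(counts, lengths)]
    total = sum(rates)
    if total == 0:
        return [0.0 for _ in rates]
    return [r / total * 1e6 for r in rates]


# Made-up values: the second transcript is twice as long and has twice
# the reads, so both transcripts receive the same TPM.
toy_tpm = tpm(counts=[10, 20], lengths=[1000, 2000])
```

By construction, the TPM values of a sample always add up to one million, which makes them comparable across libraries of different depth.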

Submission date

Jan 15, 2014

Last update date

May 15, 2019

Contact name

Octavio Martinez de la Vega

Organization name

Cinvestav

Department

Unidad de Genómica Avanzada (Langebio)