Series GSE160944
Status Public on Apr 26, 2021
Title Leveraging histone modifications to improve genome annotation
Organisms Glycine max; Zea mays
Experiment type Genome binding/occupancy profiling by high throughput sequencing
Summary With the creation of accurate, chromosome-scale genomes, the next challenge facing the genomics community is the accurate identification of transcriptional units, distinguishing them from aberrant transcriptional noise. This has proven difficult because annotation by traditional means, such as short-read RNA-seq followed by transcriptome assembly, is prone to generating in silico artifacts. To address this issue, we took advantage of epigenomic data in the form of ChIP-seq to annotate plant genomes in an unbiased manner, identify potential annotation issues, and identify novel genes. Histone modifications appear in the genome in a reproducible and predictable manner, making them an ideal resource for annotation. Trimethylation of histone 3 lysine 4 (H3K4me3) and acetylation of histone 3 lysine 56 (H3K56ac) are well documented to coincide with initiation of transcription by polymerase II (Pol II) at promoter sequences. These initiation marks, paired with marks deposited across the gene body during transcriptional elongation, such as histone 3 lysine 36 tri-methylation (H3K36me3) and histone 3 lysine 4 mono-methylation (H3K4me1), offer a framework to begin identifying complete transcriptional units. We leveraged these data on a genome-wide scale, allowing identification of annotations discordant with empirical data. In total, 13,159 potential annotation issues were found in Zea mays across three different tissues, which were corroborated using complementary RNA-based approaches. Upon correction and validation, genes were extended by an average of 2,128 base pairs, and the length of discovered novel genes was 1,962 base pairs. Application of this method to five additional plant genomes revealed a variety of novel gene annotations, including 13,836 in Asparagus officinalis, 2,724 in Setaria viridis, 2,446 in Sorghum bicolor, 8,631 in Glycine max, and 2,585 in Phaseolus vulgaris.
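The idea of flagging annotations that disagree with elongation-mark data can be illustrated with a minimal sketch. This is not the submitters' pipeline: the `Interval` type, the `flag_extensions` function, the `min_overhang` threshold of 500 bp, and the toy coordinates are all hypothetical and chosen only for illustration. It assumes gene models and elongation-mark domains (for example, H3K36me3) are already available as BED-style intervals, and reports genes whose overlapping domain extends past the annotated boundaries.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive (BED convention)
    end: int    # exclusive


def overlaps(a: Interval, b: Interval) -> bool:
    """True when two intervals share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def flag_extensions(genes, elongation_domains, min_overhang=500):
    """Return (gene_name, overhang_bp) for genes whose overlapping
    elongation-mark domain extends beyond the annotated boundaries.

    min_overhang is a hypothetical cutoff, not a value from the study.
    """
    flagged = []
    for name, gene in genes.items():
        for dom in elongation_domains:
            if not overlaps(gene, dom):
                continue
            # Bases of the domain lying outside the gene on either side.
            overhang = max(gene.start - dom.start, 0) + max(dom.end - gene.end, 0)
            if overhang >= min_overhang:
                flagged.append((name, overhang))
                break  # one discordant domain is enough to flag the gene
    return flagged


# Toy data (hypothetical coordinates).
genes = {
    "geneA": Interval("chr1", 1000, 3000),
    "geneB": Interval("chr1", 10000, 12000),
}
domains = [
    Interval("chr1", 900, 5200),      # runs well past the end of geneA
    Interval("chr1", 10100, 11900),   # contained within geneB
]

result = flag_extensions(genes, domains)
print(result)
```

In this toy input only `geneA` is flagged, because its overlapping domain extends beyond the annotated gene on both sides. A real analysis would also need peak calling, strand handling, and the RNA-based corroboration described in the summary.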
 
Overall design ChIP-seq of the histone modifications H3K36me3, H3K4me1, and H3K56ac for Glycine max. ChIP-seq of H3K4me1 for Zea mays ear.
 
Contributor(s) Medieta P, Schmitz R, Zhang X
Citation(s) 34568920
Submission date Nov 05, 2020
Last update date Oct 06, 2021
Contact name Robert J Schmitz
E-mail(s) [email protected]
Organization name University of Georgia
Department Genetics
Street address B416 Davison Life Sciences
City Athens
State/province GA
ZIP/Postal code 30602
Country USA
 
Platforms (4)
GPL20156 Illumina NextSeq 500 (Zea mays)
GPL22410 Illumina NextSeq 500 (Glycine max)
GPL25410 Illumina NovaSeq 6000 (Zea mays)
Samples (6)
GSM4886809 Soybean_10days_tis_leaf_H3K36me3
GSM4886810 Soybean_10days_tis_leaf_H3K4me3
GSM4886811 Soybean_10days_tis_leaf_H3K56ac
Relations
BioProject PRJNA674873
SRA SRP291339

Download family Format
SOFT formatted family file(s) SOFT
MINiML formatted family file(s) MINiML
Series Matrix File(s) TXT

Supplementary file Size Download File type/resource
GSE160944_RAW.tar 1.2 Mb (http)(custom) TAR (of BED)
SRA Run Selector
Raw data are available in SRA
Processed data provided as supplementary file
