Sample GSM5012390
Status Public on Jan 31, 2021
Title E3
Sample type SRA
 
Source name Myeloma cell lines (500 cells)
Organism Homo sapiens
Characteristics cell type: Myeloma cell lines
sample: sample2
cell_number: 500
Extracted molecule polyA RNA
Extraction protocol Samples were processed using the Drop-seq DolomiteBio Nadia encapsulator system.
For nanopore sequencing, cDNA was amplified with 25 SMART PCR reactions and sequencing libraries were prepared using the Oxford Nanopore LSK-109 library preparation kit.
 
Library strategy RNA-Seq
Library source transcriptomic
Library selection cDNA
Instrument model MinION
 
Description Myeloma cells
Data processing We performed basecalling on the raw fast5 data using Guppy (v) (guppy_basecaller –compress-fastq -c dna_r9.4.1_450bps_hac.cfg -x “cuda:1”) in GPU mode from Oxford Nanopore Technologies running on a GTX 1080 Ti graphics card. For each read we identify the barcode and UMI sequence by searching for the polyA region and flanking regions before and after the barcode/UMI. Accurately sequenced barcodes were identified based on their dual nucleotide complementarity. Unambiguous barcodes were then used as a guide to error correct the ambiguous barcodes in a second pass correction analysis approach. We performed fuzzy searching using a Levenshtein distance of 4 (unless otherwise stated in the figure legend) and replaced the original ambiguous barcode with the unambiguous sequence. A whitelist of barcodes was then generated using UMI-tools whitelist (umi_tools whitelist --bc-pattern=CCCCCCCCCCCCCCCCCCCCCCCCNNNNNNNNNNNNNNNN --set-cell-number=1000) [3]. This whitelist was used to assess the quality of our cells to read count ratio and used as an input for UMI-tools extract. Next the barcode and UMI sequence of each read was extracted and placed within the read2 header file using UMI-tools extract (umi_tools extract --bc-pattern=CCCCCCCCCCCCCCCCCCCCCCCCNNNNNNNNNNNNNNNN --whitelist=whitelist.txt). Reads were then aligned to the transcriptome using minimap2 [10] (-ax splice -uf --MD --sam-hit-only --junc-bed) using the reference transcriptome for human hg38 and mouse mm10. The resulting sam file was converted to a bam file and then sorted and indexed using samtools [11]. The transcript name was then added as a XT tag within the bam file using pysam. Finally, UMI-tools count (umi_tools count –per-gene –gene-tag=XT –per-cell –double-barcode) was used to count features to cells before being converted to a market matrix format. We modified UMI-tools count to handle the double nucleotide UMIs as defined below. 
This counts matrix was then used as an input into the standard Seurat pipeline.
Genome_build: hg38/mm10
Supplementary_files_format_and_content: mtx
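
The two-pass barcode correction described under Data processing can be illustrated with a minimal Python sketch. It is not the submitters' code. It assumes that "dual nucleotide complementarity" means every consecutive pair of bases in the barcode is an identical dimer (for example AATTCC), and the function names, the tie-handling rule, and the example barcodes are hypothetical.

```python
def is_unambiguous(barcode):
    """Return True if every consecutive pair of bases is an identical dimer.

    Assumption: this is what the record calls dual nucleotide complementarity.
    """
    if len(barcode) % 2 != 0:
        return False
    return all(barcode[i] == barcode[i + 1] for i in range(0, len(barcode), 2))


def levenshtein(a, b):
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + (char_a != char_b)  # substitution or match
            ))
        previous = current
    return previous[-1]


def correct_barcode(barcode, trusted, max_distance=4):
    """Map an ambiguous barcode to the closest trusted barcode.

    Returns None when no trusted barcode lies within max_distance, or when
    two trusted barcodes are equally close (hypothetical tie rule).
    """
    if barcode in trusted:
        return barcode
    best, best_distance, tie = None, max_distance + 1, False
    for candidate in trusted:
        distance = levenshtein(barcode, candidate)
        if distance < best_distance:
            best, best_distance, tie = candidate, distance, False
        elif distance == best_distance and best is not None:
            tie = True
    return None if tie else best


# First pass: keep barcodes that pass the dimer check.
# Second pass: use them as a guide to correct the remaining barcodes.
observed = ["AATTCCGG", "GGCCTTAA", "AATTCAGG"]  # made-up example barcodes
trusted = [b for b in observed if is_unambiguous(b)]
corrected = [correct_barcode(b, trusted) for b in observed]
```

The default max_distance of 4 follows the Levenshtein distance stated in the record. Scanning every trusted barcode is adequate for a sketch but would be slow at real read counts.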
 
Submission date Jan 10, 2021
Last update date Feb 01, 2021
Contact name Adam Cribbs
E-mail(s) [email protected]
Organization name University of Oxford
Department NDORMS
Street address Windmill Road
City Oxford
ZIP/Postal code OX37LD
Country United Kingdom
 
Platform ID GPL24106
Series (1)
GSE162053 High throughput error correction using dual nucleotide dimer blocks allows direct single-cell nanopore transcriptome sequencing
Relations
BioSample SAMN17274451
SRA SRX9816546

Supplementary file Size Download File type/resource
GSM5012390_E3-genes.barcodes.txt.gz 8.4 Kb (ftp)(http) TXT
GSM5012390_E3-genes.genes.txt.gz 5.4 Kb (ftp)(http) TXT
GSM5012390_E3-genes.mtx.gz 554.7 Kb (ftp)(http) MTX
Raw data are available in SRA
Processed data provided as supplementary file
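
The processed supplementary files listed above can be read with a short standard-library Python sketch. It assumes the .mtx.gz file is in Matrix Market coordinate format and that the barcodes and genes files hold one label per line. The function names are hypothetical, and the record does not state which matrix axis holds genes and which holds cells, so that should be checked against the label counts.

```python
import gzip


def _open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    return gzip.open(path, "rt") if str(path).endswith(".gz") else open(path, "rt")


def read_matrix_market(path):
    """Parse a coordinate-format Matrix Market file.

    Returns (n_rows, n_cols, entries), where entries maps zero-based
    (row, col) index pairs to float values.
    """
    with _open_text(path) as handle:
        header = handle.readline()
        if not header.startswith("%%MatrixMarket matrix coordinate"):
            raise ValueError("expected a coordinate-format Matrix Market file")
        # Skip comment lines and blank lines before the size line.
        line = handle.readline()
        while line and (line.startswith("%") or not line.strip()):
            line = handle.readline()
        n_rows, n_cols, _n_entries = (int(field) for field in line.split())
        entries = {}
        for line in handle:
            if not line.strip():
                continue
            row, col, value = line.split()
            # Matrix Market indices are one-based; store them zero-based.
            entries[(int(row) - 1, int(col) - 1)] = float(value)
    return n_rows, n_cols, entries


def read_labels(path):
    """Read one label per line, such as cell barcodes or gene names."""
    with _open_text(path) as handle:
        return [line.rstrip("\n") for line in handle if line.strip()]
```

After downloading the files, read_matrix_market("GSM5012390_E3-genes.mtx.gz") returns the matrix dimensions and non-zero counts, and read_labels can be applied to the barcodes and genes files. Comparing the two label counts with n_rows and n_cols shows which axis is which.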
