Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng...
-
Upload
hillary-crawford -
Category
Documents
-
view
216 -
download
0
Transcript of Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng...
![Page 1: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/1.jpg)
Comprehensive analysis of RNA-Seq data reveals extensive RNA editing
in a human transcriptome
Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122
Presented by: GUAN Peiyong23rd Feb 2012
![Page 2: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/2.jpg)
Overview
Definition Mechanisms Functions
Data Pipeline Results
RNA Editing Concepts BGI’s Methodology
![Page 3: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/3.jpg)
RNA Editing Concepts
Definition Mechanisms Functions
![Page 4: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/4.jpg)
Gott & Emeson. Annu Rev Genet. 2000
![Page 5: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/5.jpg)
RNA Editing | Definition
RNA editing can be broadly defined as any site-specific alteration in an RNA sequence that could have been copied from the template, excluding changes due to processes such as RNA splicing and polyadenylation.
RNA editing is a process that changes the identity of an RNA base after it has been transcribed from a DNA sequence.
Gott & Emeson. Annu Rev Genet. 2000.E. C. Hayden, Nature 473, 432 (2011).
![Page 6: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/6.jpg)
RNA Editing | Mechanisms
Insertion / deletion RNA editing Posttranscriptional Nucleotide Insertion/Deletion Nucleotide Deletion-Insertion Nucleotide Insertion During Transcription Mixed Nucleotide Insertion
Conversion / substitution editing Adenosine-to-Inosine Editing (A-I or A-G, most prevalent in human)
Enzyme ADAR (adenosine deaminases that act on RNA). Cytidine-to-Uridine Editing (C-U)
Enzyme APOBECs (apolipoprotein B mRNA editing enzymes, catalytic polypeptide-like).
E.g., CAA UAA (STOP) Uridine-to-Cytidine Editing (U-C)
Li et al. Science doi:10.1126/science.1207018 ; 2011Gott & Emeson. Annu Rev Genet. 2000
![Page 7: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/7.jpg)
RNA Editing | Functions
E. C. Hayden, Nature 473, 432 (2011).
![Page 8: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/8.jpg)
BGI Methodology
Data Pipeline Results
![Page 9: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/9.jpg)
Data | Preparation
75bp and 100bp
90bp, strand-specific
LymphoblastoidCell Line
767.58 million reads (73.84%) uniquely aligned
767.58 million reads (73.84%) uniquely aligned
Illumina Genome Sequence Analyzer
![Page 10: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/10.jpg)
Data | Preparation
RNA-Seq of Lymphoblastoid cell line of a male Han Chinese individual (YH) Genome sequence was reported previously.
Nature 456, 60-65, (2008) 767 million sequence reads
RNA-Seq 75bp and 100bp Poly (A)+
90bp Poly (A) - - strand-specific sequencing Small RNA-Seq
![Page 11: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/11.jpg)
Data | Sequencing Coverage
![Page 12: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/12.jpg)
Data | Sequencing Depth
![Page 13: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/13.jpg)
Data | Simulated Data
Paired-end reads with fixed length of 75 bp, simulated randomly from chromosome 1 of the human RefSeq. Use chromosome 1 of the NCBI human RefSeq as a
reference: Two sets of simulated data were created:
Set #1: Random SNV by MAQ with default options (5-, 10-, 20-, and 50-fold coverage).
Set #2: A→G substitution at positions referenced in the DARNED database (50-fold coverage).
![Page 14: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/14.jpg)
Pipeline | Overview
![Page 15: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/15.jpg)
Pipeline | Illumina reads alignment (SOAP2)
Due to the potential uncertainty in read alignment across splice junctions, SOAP2 was used in this regard rather than tools that utilize gapped alignment across exon boundary, such as SOAPsplice.
Reference genome (NCBI Build 36.1, hg18). Two paired-reads – aligned together with both in the correct orientation. Aligning the cDNA reads to the reference genome:
≤ 3 mismatches for the 75-bp reads. ≤ 4 mismatches for the 90-bp and 100-bp reads.
Best Hits – alignments with the least number of mismatches: Uniquely placed – 1 best hit (kept). Repeatedly placed – multiple equal best hits (discarded). Potential PCR duplicates (discarded).
Reads with unique ungapped genome alignment.
![Page 16: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/16.jpg)
Pipeline | RNA editing sites/RNA-centric SNVs detection
Multiple filters with stringent thresholds to facilitate unbiased detection of bona fide editing or base substitution events in the RNA-Seq reads.
RNA-centric SNVs were first identified from aligned cDNA reads using SOAPsnp, which uses a method based on Bayes’ theorem (the reverse probability model) to call consensus genotype by carefully considering the data quality, alignment and recurring experimental errors, with parameters e = 0.0001 and r = 0.00005.
We further lifted a default filter in the basic filter step of the program that was designed to discard sequence reads with more than one variant within a 5-bp span (for clustered AG editing?).
SNVs 10 filtering steps
![Page 17: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/17.jpg)
Pipeline | 1. Basic filter
Retain SNVs that meet the following criteria: Quality score of consensus genotype ≥ 20. Covered depth ≥ 5. Repeats (estimated copy number of the flanked
sequence in genome) ≤ 1.
![Page 18: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/18.jpg)
Pipeline | 2. Read parameter filter
Optimize parameters using simulated data set: m, the minimal distance of a SNV site to its supporting
reads’ ends q, minimal sequencing quality score of SNV-corresponding
nucleotide (m, q)=(15,20)
n, minimal number of supporting reads that meet the above two cutoff parameters n = 2
![Page 19: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/19.jpg)
Pipeline | 2. Read parameter filter
Two sets of data: Set #1: random substitution Set #2: A G substitution in DARNED database
![Page 20: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/20.jpg)
Pipeline | 3. RNA-DNA variants filter.
Focus on RNA-DNA variants only: Sites of which DNA genotypes are the same as RNA
genotypes were removed.
![Page 21: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/21.jpg)
Pipeline | 4. YH genome variants filter.
Distinguish RNA editing from allele-specific expression and duplication polymorphisms: Keep SNVs remaining from step 3 only if their corresponding
DNA genotypes are homozygous and diploid in copy number. Parameters of YH genome sequence reads corresponding to a
candidate site: Depth is ≥5; Consensus quality is ≥20; Average quality of the first best allele ≥ 20; Depth of the second best allele, if present, is <5% of the total number of
reads; The second best allele should not be the variant allele in the RNA data; And average sequencing quality of the second best allele is <10.
Exclude genomic duplication polymorphisms: CNVnator tool with bin set to 50, and removed sites that were nondiploid
in nature.
![Page 22: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/22.jpg)
Pipeline | 5. MES filter.
Remove misaligned reads that arise from mapping error inherent to the mapping algorithm (MES): Simulate read sequences based on all human genes (hg18
transcriptome) using MAQ without mutation (-r parameter).
Align simulated reads using SOAP2 & call SNVs using SOAPsnp.
Filter the identified SNVs using filters #3 and #4 MES. SNVs matched the MES were removed.
![Page 23: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/23.jpg)
Pipeline | 6. Strand filter.
Remove potential strand-specific errors in sequences generated by the Illumina platform: Evaluate the counts of the reads mapped to the +/- strands
using Fisher’s exact test. Discard the site if:
Reads exhibited strand bias in distribution (P < 0.01) & Number of supporting reads mapped to either strand is <2.
![Page 24: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/24.jpg)
Pipeline | 7. BLAT filter.
Address the potential pitfall of paralogous sequences in site calling: Use BLAT to search for SNVs’ supporting reads in the
reference genome. Same mismatch tolerance used in SOAP2 alignment.
Discard all supporting reads with >1 hit. Filter SNVs that have <2 qualified supporting reads.
![Page 25: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/25.jpg)
Pipeline | 8. Known SNPs filter.
Eliminate germline variants: Cross-reference the remaining SNVs against known
SNP databases: 1000 Genomes Project. Genomes of Yoruba, Watson, Korean. dbSNP (version 129).
![Page 26: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/26.jpg)
Pipeline | 9. Multiple type of mismatches filter.
Discard SNV candidate sites with >1 nonreference type: For example:
Reference allele – A Nonreference alleles – G and T
![Page 27: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/27.jpg)
Pipeline | 10. Editing degree filter.
Exclude polymorphic sites with extreme degree of variation (100%):
sitethe covering reads of #
allele variant the g supportinreads of#Editing of Degree
Remaining Sites Further Analysis
![Page 28: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/28.jpg)
Pipeline | Analysis of the sequence and structural features of RNA editing.
To identify sites dsRNA structure, or sites in 3 -UTR ′that are likely microRNA seed matches: Li, J.B. et al. Genome-wide identification of human
RNA editing sites by parallel DNA capturing and sequencing. Science 324, 1210–1213 (2009).
![Page 29: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/29.jpg)
Pipeline | Analysis of the sequence and structural features of RNA editing. Editing sites clustering:
Defined as occurrence of ≥3 variants per 100bp. Conserved region:
Annotated as ‘most conserved’ by the UCSC genome browser.
Coding sequence: Defined by the RefSeq annotation.
Highly edited genes: ≥10 variant sites per gene
Gene enrichment: DAVID pathway-classification tool.
![Page 30: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/30.jpg)
Pipeline | Identification of miRNA and editing (1).
Filtering of small RNA reads: Filter out low-quality reads; Trim 3 adaptor sequence by a dynamic programming ′
algorithm; Remove adaptor contaminations formed by adaptor
ligation; Retain only short trimmed reads of sizes from 18 to 30
nt.
![Page 31: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/31.jpg)
Pipeline | Identification of miRNA and editing (2).
Annotate and categorize small RNAs: Filter out small RNA reads possibly from known
noncoding RNAs: rRNA, tRNA, snRNA and snoRNA deposited in the Rfam
database and the NCBI Genbank. Discard small RNA reads assigned to exonic regions. Subject the remaining small RNA to MIREAP, which
identifies miRNA candidates according to the canonical hairpin structure and sequencing data.
![Page 32: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/32.jpg)
Pipeline | Identification of miRNA and editing (3). Align identified miRNA reads to miRNA reference
sequences: BLAST, ≤1 mismatch. Reads that were uniquely aligned and overlapped with
known miRNAs were used to identify miRNA editing sites. First, identify reads with mismatch to hg18 genome.
Reads with mismatch within 1 nt at 5 end or 2 nt at 3 end were ′ ′discarded (?).
Then, identify miRNA edits by the following criteria: Sequencing depth of editing sites ≥ 5; Frequency of SNV occurrence ≥5% & ≤95%; Variants that were not found in previous SNP annotations
YH, 1000 genomes project, Yoruba, Watson, Korean and dbSNP version 129.
![Page 33: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/33.jpg)
Results| Editing Events Identified
22,688 RNA editing sites Poly (A)+
To ascertain the editing type for these sites, cross-reference against RefSeq.
~30% of the identified sites: Unannotated in the database (5,381). Corresponded to overlapping transcript units on both strands
(57). 11,467 sites were unambiguously mapped to known gene models.
Poly (A)-
To identify editing sites in the intergenic regions of the transcriptome
11,221 RNA editing sites identified.
![Page 34: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/34.jpg)
Results| Editing Sites Distribution
50% leads to changes in coded
amino acids.
50% leads to changes in coded
amino acids.
![Page 35: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/35.jpg)
Results|
Editing sites Characterization
Poly (A)+ Poly (A)-
Poly (A)+
Poly (A)+,CDS
Poly (A)-
![Page 36: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/36.jpg)
Results| Novel Editing Sites
![Page 37: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/37.jpg)
Results| Frequency of nucleotides in the flanking sequences
Poly (A)+ Poly (A)-
![Page 38: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/38.jpg)
Results| % of Edits in Conserved Regions
Poly (A)+ Poly (A)-
![Page 39: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/39.jpg)
Results| Experimental Validation
Two replicates of PCR amplification and Sanger sequencing of both DNA and RNA from the same batch of cells from the YH cell line.
![Page 40: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/40.jpg)
Results | Comparison with Other Datasets
![Page 41: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/41.jpg)
Results | Genes with multiple editing sites.
![Page 42: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/42.jpg)
Results | RNA editing and miRNA-mediated regulation
2,474 editing sites in 3 -UTRs′ Extract 6 + 1 + 6 bp sequence & search in miRBase.
![Page 43: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/43.jpg)
Summary
Pipeline for identifying RNA editing events by screening RNA-DNA differences in the same individual. 10 filters to handle various aspects of false positives.
Experimentally validated novel RNA editing sites. Evidence of extensive RNA editing in a human cell line.
Question: since the model parameter were optimized using random data from DARNED, why there is no significant overlaps between DARNED database and BGI’s discovered editing sites?
![Page 44: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/44.jpg)
![Page 45: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/45.jpg)
![Page 46: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/46.jpg)
RNA Editing
![Page 47: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/47.jpg)
Overview
Literature Review RNA Editing Concepts
DefinitionMechanismsFunctions
RNA Editing Site PredictionPrediction Methods
Machine Learning Based Methods Mapping Based Methods
Database
![Page 48: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/48.jpg)
Literature Review RNA Editing Concepts RNA Editing Site Prediction
![Page 49: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/49.jpg)
RNA Editing Concepts
Definition Mechanisms Functions
![Page 50: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/50.jpg)
Gott & Emeson. Annu Rev Genet. 2000
![Page 51: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/51.jpg)
RNA Editing | Definition
RNA editing can be broadly defined as any site-specific alteration in an RNA sequence that could have been copied from the template, excluding changes due to processes such as RNA splicing and polyadenylation.
“RNA editing” is a process that changes the identity of an RNA base after it has been transcribed from a DNA sequence.
Gott & Emeson. Annu Rev Genet. 2000.E. C. Hayden, Nature 473, 432 (2011).
![Page 52: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/52.jpg)
RNA Editing | Mechanisms
Insertion / deletion RNA editing Posttranscriptional Nucleotide Insertion/Deletion Nucleotide Deletion-Insertion Nucleotide Insertion During Transcription Mixed Nucleotide Insertion
Conversion / substitution editing Adenosine-to-Inosine Editing (A-I or A-G, most prevalent in human)
Enzyme ADAR (adenosine deaminases that act on RNA). Cytidine-to-Uridine Editing (C-U)
Enzyme APOBECs (apolipoprotein B mRNA editing enzymes, catalytic polypeptide-like).
E.g., CAA UAA (STOP) Uridine-to-Cytidine Editing (U-C)
Li et al. Science doi:10.1126/science.1207018 ; 2011Gott & Emeson. Annu Rev Genet. 2000
![Page 53: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/53.jpg)
RNA Editing | Functions
E. C. Hayden, Nature 473, 432 (2011).
![Page 54: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/54.jpg)
RNA Editing | Functions (Cont’d…)
mRNA
Gott & Emeson. Annu Rev Genet. 2000
![Page 55: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/55.jpg)
RNA Editing | Functions (Cont’d…)
Maas et al. 2003
![Page 56: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/56.jpg)
RNA Editing Site Prediction
Prediction Methods Machine Learning Based Methods Mapping Based Methods
Database
![Page 57: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/57.jpg)
Machine Learning | Bundschuh 2004
Bundschuh 2004
![Page 58: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/58.jpg)
Machine Learning | Bundschuh 2004
Bundschuh 2004
![Page 59: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/59.jpg)
Machine Learning | Bundschuh 2004
Bundschuh 2004
![Page 60: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/60.jpg)
Machine Learning | Bundschuh 2004
Bundschuh 2004
![Page 61: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/61.jpg)
Machine Learning | Bundschuh 2004
Results Predictive performance of over 90% on the amino acid
level and of over 70% on the editing site level. Limitations
Specific for Physarum polycephalum Insertion of C is most common.
Requires training data Uses information on homologs of the gene in other organisms
and statistical information on editing sites specific for Physarum. Very limited training / testing data
Only 6 genes with known RNA editing sites in the mitochondrion of Physarum.
Tested using leave-one-out approach.
Bundschuh 2004
![Page 62: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/62.jpg)
Machine Learning | Clutterbuck et al. 2005
Clutterbuck et al. 2005
![Page 63: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/63.jpg)
Machine Learning | Clutterbuck et al. 2005
Focused on recoding A–I mRNA editing sites A recoding site is a site where editing alters the amino acid
sequence. Used a combination of 7 predictive features to screen
a large set of expressed versus genomic sequence mismatches. For many of the known sites, editing can be observed in
multiple species and often occurs in well-conserved sequences.
In addition, they often occur within imperfect inverted repeats and in clusters, etc.
Clutterbuck et al. 2005
![Page 64: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/64.jpg)
Machine Learning | Clutterbuck et al. 2005
Clutterbuck et al. 2005
![Page 65: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/65.jpg)
Machine Learning | Clutterbuck et al. 2005
7features Number of putatively edited mouse cDNAs or ESTs with the same mismatch as the same
position (Allowed values: 1, 2, >2); Number of non-edited mouse cDNAs or ESTs combined with the number of publicly
available genomic sequences for each given mismatch (Allowed values: 1, 2, >2); Where possible the human homologues were aligned using Lagan; We calculated the effect of the edit on the amino acid sequence by BLAST searching the
Ensembl nucleotide sequence against the equivalent protein sequence, then mapping the putative editing site onto the alignment.
Sequence conservation was analyzed using the same Lagan mouse/human alignments, from which the best conserved 120 bp window overlapping each putative editing site was selected (Continuous variable);
Putative mouse ECSs were identified by scanning for inverted repeats using a Smith–Waterman alignment algorithm from EMBOSS
Clusters of sites were defined by the observation of more than one putative editing site within an exon (Continuous variable).
Clutterbuck et al. 2005
![Page 66: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/66.jpg)
Machine Learning | Clutterbuck et al. 2005
Clutterbuck et al. 2005
![Page 67: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/67.jpg)
Machine Learning | Clutterbuck et al. 2005
Limitations Only identifies recoding sites. Only for A-I editing. Highly depends on known biology knowledge (7
seemingly ad hoc features). Model over-fitting? (so many features that should be
inter-dependent).
Clutterbuck et al. 2005
![Page 68: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/68.jpg)
Mapping | Kim et al. 2004
Kim et al. 2004
![Page 69: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/69.jpg)
Mapping | Kim et al. 2004
Kim et al. 2004
![Page 70: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/70.jpg)
Mapping | Kim et al. 2004
Method Mapping human and mouse cDNA from UCSC to the reference
genome. Filtering (95% sequence identity + alignment score). Using a scan statistic method to look for clusters of A-to-G
substitutions in each transcript. Results
An excess of A-G substitutions in human full-length cDNAs. Correlation between A-G substituted bases and Alu sequences. Etc.
Limitations Relying on biology knowledge, i.e., the A-G substitution sites
tends to cluster together.
Kim et al. 2004
![Page 71: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/71.jpg)
Mapping | Levanon et al. 2004
Levanon et al. 2004
![Page 72: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/72.jpg)
Mapping | Levanon et al. 2004
Levanon et al. 2004
![Page 73: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/73.jpg)
Mapping | Levanon et al. 2004
Method Algorithm to align the expressed part of the gene with the corresponding
genomic region, looking for reverse complement alignments longer than 32 bp with identity levels higher than 85%.
Cleaned the sequences supporting the stem region. Because sequencing errors tend to cluster in certain regions, especially in low
complexity areas and towards the ends of sequences, we discarded all single-letter repeats longer than 4 bp, as well as 150 bp at both ends of each sequence.
In addition, all 50 nucleotide-wide windows in which the total number of mismatches was five or more were considered as having low sequencing quality and were discarded.
However, four or more identical sequential mismatches were masked in the count for mismatches in a given window. This exception is intended to retain sequences with many sequential editing sites.
Mismatches supported by <5% of available sequences were also discarded, and, finally, known SNPs of genomic origin were removed.
Levanon et al. 2004
![Page 74: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/74.jpg)
Mapping | Levanon et al. 2004
Results Mapped 12,723 A-to-I editing sites in 1,637 different
genes, with an estimated accuracy of 95%, raising the number of known editing sites by two orders of magnitude.
Limitations Only focused on A-I editing.
Levanon et al. 2004
![Page 75: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/75.jpg)
Mapping | Li et al. 2011
Li et al. 2011
![Page 76: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/76.jpg)
Flowchart of Analysis
Li et al. 2011
![Page 77: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/77.jpg)
Mapping | Li et al. 2011
Method Comparing RNA and sequences from Human B cells of 27
individuals, who were sequenced in the 1000 Genomes Project and the International HapMap Project.
Map RNA-seq to the hg18 mRNA using Bowtie. Results
More than 10, 000 exonic sites with RNA and DNA differences (RDD).
RRD not limited to A-G and C-U, but all 12 possible categories. Problem
Too many false-positives caused by paralogs in the genome. Rigorous filtering should have been performed.
Li et al. 2011
![Page 78: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/78.jpg)
Mapping | Schrider et al. 2011
Schrider et al. 2011
![Page 79: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/79.jpg)
Mapping | Schrider et al. 2011
Similar methods to Li et al. Science 2011 paper but: Additional filtering steps. Schrider et al. used BWA instead of Bowtie for mapping because it is
more accurate and allows for indels. This paper criticize the Li et al. paper.
Pointing out that many of the 10, 000 exonic RDD sites discovered by Li et al. are actually from paralogs.
But Levanon et al. 2004 even mapped 12,723 A-to-I editing sites in 1637 different genes!
This raised the questions: How precise is the mapping? How abundant is the RNA editing events in human?
Schrider et al. 2011
![Page 80: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/80.jpg)
Mapping | Bahn et al. 2011
Bahn et al. 2011
![Page 81: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/81.jpg)
Mapping | Bahn et al. 2011
2 main challenges for genome-wide identification of RNA editing: Separating true editing sites from false discoveries
and, Accurate estimation of editing levels.
Bahn et al. 2011
![Page 82: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/82.jpg)
Mapping | Bahn et al. 2011
Bahn et al. 2011
![Page 83: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/83.jpg)
Mapping | Bahn et al. 2011
Bahn et al. 2011
![Page 84: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/84.jpg)
Mapping | Bahn et al. 2011
Determine whether the DNA–RNA differences are likely authentic events or sequencing errors.
Bahn et al. 2011
![Page 85: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/85.jpg)
Mapping | Bahn et al. 2011
Data RNA-seq data of a human glioblastoma cell line, U87MG. Whole genome sequencing data.
Method Combines multiple mapping tools (Bowtie, BLAT and TopHat) Double-filtering of mismatches in the mapped reads:
Mapped uniquely with ≤ n1 (5) mismatches and, Did not map to other genomic loci with ≤ n2 (12) mismatches are retained (n2 > n1).
Results Around 10,000 DNA–RNA differences were identified, the majority being
putative A-to-I editing sites. (FDR ~ 5%). The estimated editing levels from RNA-seq correlated well with those based on
traditional clonal sequencing. Simulated data for FDR calculating.
Bahn et al. 2011
![Page 86: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/86.jpg)
Mapping | Carmi et al. 2011
Carmi et al. 2011
![Page 87: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/87.jpg)
Mapping | Carmi et al. 2011
Carmi et al. 2011
![Page 88: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/88.jpg)
Mapping | Carmi et al. 2011
Method MegaBLAST Very important step to replace As with Gs and re-map. Filtering
RNAs that appeared ultra-edited in more than one transformation/strand combination.
Etc.
Carmi et al. 2011
![Page 89: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/89.jpg)
Mapping |Picardi et al. 2011
Picardi et al. 2011
![Page 90: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/90.jpg)
Mapping |Picardi et al. 2011
Method Aligns RNA-seq reads to hg18 (Bowtie). Pileup (SAMTool). Explore position by position and record all substitutions. Build table with probability of observing the change (Fisher-
test). Filtering
Known genomic polymorphisms in dbSNP (v130) are excluded. Substitutions compatible with RefSeq annotations are filtered in. Sites with a coverage lower than 10 reads are removed. Sites with multiple observed substitutions are also excluded. Sites with a background higher than 0.1 are not considered.
URL: http://t.caspur.it/ExpEdit/
Picardi et al. 2011
![Page 91: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/91.jpg)
Database | Kiran et al. 2010
Data from previously published papers.
Kiran et al. 2010
![Page 92: Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome Peng et al. Nature Biotechnology (2012) doi:10.1038/nbt.2122.](https://reader036.fdocuments.in/reader036/viewer/2022062515/56649d155503460f949ea055/html5/thumbnails/92.jpg)
Summary | Literature Review
Machine Learning Based Methods Highly depends on the biology knowledge Highly depends on experimentally validated data for model training
Mapping Based Methods Sensitive to systematic sequencing error. Different mapping tools:
Different ways of dealing with gaps, mismatches and splice junctions. Mapability
Ultra-edited RNA may not be mapped & discarded. Intensive filtering is required due to:
Paralogs in the genome Imperfect mapping
Data DARNED could serve as a validation / comparison resource.
E. C. Hayden, Nature 473, 432 (2011).