psi+gag RRE cPPT hU6 EGFP sgRNA PGK WPRE 2A …...Unmodified A 37 5 A 375-Cas 9 HT 29-Cas 9 293...

ified A

A375-C

HT29-C

MOLM13

BV2-Cas

Cas9 Activity Assay

hU6 EGFP sgRNA PGK WPRE

�·/75 �·/75

psi+gag RRE cPPT

Puromycin 2A EGFP

pXPR_011

GFP - A

A375 A375-Cas9

0.52% 82.1%

Supplementary Figure 1. Cas9 activity assays for cell lines used in this study. (a) Schematic of pXPR_011 (Addgene #59702). Cells that do not express Cas9 will express EGFP and puromycin resistance, while cells with sufficient levels of active Cas9 will utilize the sgRNA targeting EGFP. Placement of EGFP downstream of a 2A site allows for continued puromy-cin resistance after indel formation. (b) Quantitation of the fraction of cells with Cas9 activity. Samples were analyzed ten to fourteen days post-infection.(c) Example flow cytometey plots.

Nature Biotechnology: doi:10.1038/nbt.3437

0.0 0.2 0.4 0.6 0.8 1.0

Avana Pool 1

Avana Pool 2

Avana Pool 3

Avana Pool 4

Avana Pool 5

Avana Pool 6

GeCKOv2

lative

Supplementary Figure 2 Distribution of sgRNA abundance in the plasmid

DNA pool of each Avana subpool in lentiGuide (solid lines) and GeCKOv2

library in lentiGuide (dashed line).

Fraction of ranked sgRNAs

Expand Cas9 expressing cell line

ůŝďƌĂƌǇ�Ăƚ�ĚĞƐŝƌĞĚ�ƌĞƉƌĞƐĞŶƚĂƟŽŶ

^Ɖůŝƚ�ƚŽ�ƐŵĂůů�ŵŽůĞĐƵůĞ

2 weeks passaging

�ǆƚƌĂĐƚ�Ő�E��ĨƌŽŵ�ƐĐƌĞĞŶ�ĞŶĚ�ĐĞůůƉĞůůĞƚƐ�ĂŶĚ�ƉĞƌĨŽƌŵ�ďĂƌĐŽĚŝŶŐ�W�Z

^Ƶďŵŝƚ�ƐĂŵƉůĞƐ�ĂŶĚ�ĚĞĐŽŶǀŽůƵƚĞ�ŽƵƚƉƵƚ

/ůůƵŵŝŶĂ�ƐĞƋƵĞŶĐŝŶŐ

Supplementary Figure 3 Pooled Screen workflow. Cells are infected with a Cas9-expression plasmid and are subjected to one week or more of blasticidin selection. Cells are infected with library and selected with puromycin for one week before being split into small molecule condi-tions or continued passaging for positive and negative selection screens, respectively. At then end point, cell pellets are collected and gDNA extracted. Samples are then barcoded and sequenced by Illumina. For analysis, the log2-fold-change of each sgRNA is determined rela-tive to the starting plasmid DNA (pDNA) pool.

ϭн�ǁĞĞŬ�ďůĂƐƟĐŝĚŝŶ�ƐĞůĞĐƟŽŶ

0 50 100

Gene Ranks

ueAvana lentiCRISPRv2

Avana lentiGuide

GeCKOv1 lentiCRISPRv1

GeCKOv1 lentiGuide

GeCKOv2 lentiGuide

Supplementary Figure 4 Comparison of the p-values for the top 100 genes

determined by weighted sum analysis in RIGER. y-axis is plotted in log10-

scale.

100Day 010 days, no drug10 days, vemurafnib

1 2 3 4 5 6 7 8 9 10 A AB B C A B C D A B C D ENegative Controls CUL3 NF1 NF2 TP53

Supplementary Figure 5 Validation of EGFP competition assay to confirm individual sgRNA activity. A375-Cas9 cells were infected with individual sgRNAs, selected with puromycin for one week, and then mixed with A375-Cas9 cells also expressing EGFP and puromycin resis-tance. The ratio in each population was determined by flow cytometry. Populations were then maintained with vemurafenib or with no drug for 10 days, and the relative fraction of sgRNA-containing, EGFP-negative cells was determined by flow cytometry. For the 10 negative controls, all cells died after 10 days in the presence of vemurafenib and thus could not be assessed by flow cytometry.

2 3 4 5 6Number of subpools

Supplementary Figure 6 Subsampling analysis of Avana library in selumetinib resistance screening data. STARS was run on all six subpools to determine the number of genes that passed at FDR thresholds of 10% and 25% (first number in legend). Subpools were then iteratively removed and STARS was re-run to determine the average number of genes retained by using fewer subpools at FDR thresholds of 10%, 25% and 75% (second number in legend). Error bars represent one standard deviation when subsampling from different combina-tions of subpools.

LG 10,10

LG 10,75

LG 25,25LC 10,10

LC 10,75

LC 25,25

ï8 ï6 ï4 ï2 0 2 4 6 80.0

Negative controls

Targeting

Log2-fold changeLog2-fold change

Negative controls

Avana lentiGuide

Log2-fold change

n Avana, 6 subpools

A375 cells

Supplementary Figure 7. Change in abundance of Avana library in negative selection

screens. (a) Cumulative distribution of 996 negative controls and 108,467 gene target-

ing sgRNAs in A375 cells for the Avana library in lentiGuide. Data are from two biologi-

cal replicates. (b) Cumulative distribution of 1000 negative controls and 73,687 gene

targeting sgRNAs in HT29 cells for the Avana library screened with 4 subpools in

lentiGuide. Data are from three biological replicates.

ï5 ï4 ï3 ï2 ï1 0 1 2 3

Log2-fold change

Negative controls

Targeting

BAvana, 4 subpools

HT29 cells

0.0001

Top 100 Core Essential Genes

GeCKOv2

lentiGuide

Supplementary Figure 8. Depletion of core essential genes with the

Avana and GeCKOv2 libraries as assessed by STARS analysis. Data

from two biological replicates were merged and STARS analysis

performed for all genes.

KEGG Gene Set

reAvanaGeCKOv2

Supplementary Figure 9. Gene Set Enrichment Analysis (GSEA) of negative selection screens performed in A375 cells with the indicated libraries in lentiGuide. All KEGG gene sets with normalized enrichment scores of 2.0 or greater are shown for Avana, with the corresponding score shown for GeCKOv2. The input ranked gene list was determined by RIGER analysis using the weighted sum option.

0 20 40 60 80 1000.0

Avana, 4 subpools (0.84)GeCKOv2 (0.71)

Percent-Rank of Core Essential sgRNAs

Supplementary Figure 10 ROC-AUC analysis of core essential genes in HT29 cells for Avana screened with 4 subpools and the GeCKOv2 library in lentiGuide. AUCs for each library are specified in brackets. Data are from three biological replicates.

tion D

oublings

6-thioguanineno drug

Popula

tion D

oublings

Popula

tion D

oublings

Control

sgRNA HPRT NUDT5

sgRNA 4 sgRNA 4sgRNA 6 sgRNA 6

Supplementary Figure 11 Validation of 6-thioguanine resistance hits. Individual sgRNAs

cloned into lentiGuide were infected into 3 cell lines, selected with puromycin, and after 1

week of selection were split into 6-thioguanine or continued to be passaged in the absence

of drug (day 0). An sgRNA targeting EGFP was used as a control. The populations were

counted at each passage, and the cumulative number of population doublings was deter-

mined relative to day 0. For A375 cells, the columns represent cell counts taken on days 2,

5, 7, 9, 11, 13, and 15; for HT29 cells, days 5, 7, 10, 15; for 293T cells, days 2, 5, 7, 9, 11,

13, and 15.

CCDC101

CUL3 NF1

MED12 TADA2B

Percent Protein Length

Supplementary Figure 12 Activity of sgRNAs across the length of the targeted protein for the RES dataset.

SVM + LogReg (Doench, RS1)L1 class., 1 & 2 ntsL1 class., 1 nt only (Xu)SVM, 1 & 2 ntsSVM, 1 nts only (Wang)

Train:Test:

RESRES

FC+RESFC+RES

Supplementary Figure 13 Performance of various classification models assessed by area under the curve (AUC). Error bars represent one standard deviation for performance across genes using a leave-one-gene-out method for training and testing.

train: FCtest: FC

train: RES test: RES

train: FC+RES test: FC+RES

L1 RegressionL1 Classifier

train: FCtest: FC

train: RES test: RES

train: FC+RES test: FC+RES

Supplementary Figure 14 Comparison of two comparable predictive models, one of which uses only a 0/1 binarization of the sgRNA scores (L1 Classifier), and the other that uses the sgRNA scores directly (L1 Regression). (a) uses area under the curve (AUC) as the performance metric, while (b) uses Spearman correlation as the metric. For all three data sets, and across both metrics, the regression approach outperforms the classification approach. Each bar shows the mean value across all genes, with the error bars denoting one standard deviation across the genes. Statistical significance of the improvement in Spearman correlation when using regression over classification is, for FC, RES and FC+RES data, respectively, p < 1×10-16, p = 5.4×10-13, p < 1×10-16

Supplementary Figure 15. Top features weights comprising Rule Set 2. The Gini impor-tances sum to 1.0. The two most important features, postition dependent order 1 and 2, refer to the identity of a specific single and dinucleotides at a particular position, e.g. is there an A at position 3, is there an AG at positions 9 and 10. Position independent order 1 and 2 refer to the total number of particular nucleotides, e.g. how many As are there in the sequence, how many CGs. The 3 melting temperature features (Tm) for the sgRNA were discretized by the UHJLRQ�RI�WKH�VJ51$��ZLWK�SRVLWLRQ��DV�WKH��·�HQG�RI�WKH�VJ51$�DQG�SRVLWLRQ��DV�WKH�3$0�proximal nucleotide. The NGGN interaction consisted of the two nucleotides that occur in the 3$0�RQ�HDFK�VLGH�RI�WKH�LQYDULDQW�**�

Supplementary Figure 16 Performance of the final model for Rule Set 2, gradient boosted regression trees, when trained and tested on all datasets.

FC RES FC+RES0.40

FCRESFC+RES

Test Data

Trained Data

Set 1 S

Eff Ineff Eff IneffEff Ineff

MouseHumanHuman

Activity:

Species:

731 438 671 237 830 234

All essentialNon-RiboRibosomalDataset:

Supplementary Figure 17 Performance of Rule Set 1 on negative selection

data, using negative selection data as curated by Xu et al. From Wang et al.,

sgRNAs were classified as either effective or ineffective for dropout screens

performed in HL-60 and KBM-7 human cell lines, and split into two classes,

one for ribosomal genes and the other for essential, non-ribosomal genes. The

data curated from Koike-Yusa et al. represent a dropout screen performed in

mouse ES cells, and all essential genes were kept as one class in the curation

performed by Xu et al. Rule Set 1 effectively distinguishes effective from

ineffective sgRNAs in all cases, with p-values of 1.4x10-32, 1.8x10-16, and

1.1x10-11 from left to right (two-sample Kolmogorov-Smirnov test).

********

Supplementary Figure 18 CD33 flow cytometry outline. Gates were set on the BD

FACSDiVa software. The sample was first gated to exclude debris; two additional gates

were applied to select singlets and a fourth to select viable cells via DAPI. Finally, the

population was gated, using a CD-33 DAPI positive control and a DAPI only negative

control, in order to sort for PE negative cells (green bracket).

SSC-W FSC-W

0 100K 200K 0 100K 200K

0 100K 200K 0 104

Supplementary Figure 19 Protein-activity map of CD33. Log2-fold change of sgRNAs utilizing the canonical NGG PAM across the length of the target protein are shown. The sgRNAs in the region of highest activity (red) were included for further analysis, and sgRNAs targeted to regions of low activity or exhibiting low activity (blue) were excluded from subsequent analysis.

IncludedExcluded

Log 2-f

Percent Protein40 60 80 100

Mismatch Insertion

Deletion

Supplementary Figure 20 Distributions of the number of unique position/nucleotide instances in the off-target data set, for the three different classes of imperfect interac-tions.

Number of instances7

9 10 11 12 1314 15 16 1718 19 20 21 22 23 25 26 29Number of instances

60 62 63 64 65

Number of instances7 10 11 12 13 14 15 16 17 18 19 20 21 22 23 25 26 28

Supplementary Figure 21 Comparison of the average delta-log2-fold-change and percent-active calculations for variant sgRNAs for all mismatch interac-tions; Pearson correlation coefficient = -0.97.

Average delta log2-fold-change0 1 2 3 4

Supplementary Figure 22. Optimized bowtie2 misses potential off-target sites. For three

sgRNAs that had previously been assessed by Guide-Seq all possible off-target sites

were identified by two search methods, optimized bowtie2 (see Methods) and Cas-

OFFinder. The potential sites were then assessed by the CFD score, and the total number

of off-target sites in the human genome receiving a CFD score > 0.2 are plotted.

GAGTCCGAGCAGAAGAAGA

GGAATCCCTTCTGCAGCACC

GACCCCCTCCACCCCGCCTC0

sgRNA sequence

f off-

Cas-OFFinder

Optimized bowtie2

psi+gag RRE cPPT hU6 EGFP sgRNA PGK WPRE 2A …...Unmodified A 37 5 A 375-Cas 9 HT 29-Cas 9 293...

Documents

Transcript of psi+gag RRE cPPT hU6 EGFP sgRNA PGK WPRE 2A …...Unmodified A 37 5 A 375-Cas 9 HT 29-Cas 9 293...

Generation of Targeted Knockout Mutants in Arabidopsis ... · Keywords: CRISPR/Cas9, Genome editing, Arabidopsis thaliana, Plants, Knockout [Background] The CRISPR/Cas9 system (Cas9)

Cas9 for developing models in drug discoverycsmres.co.uk/cs.public.upd/article-downloads/Use of... · 2018-03-29 · CRISPR/Cas9-mediated genome editing The CRISPR/Cas9 system is

CRISPR-Cas9--Genome Editing (.pptx)

Editing plant genomes with CRISPR/Cas9 · Editing plant genomes with CRISPR/Cas9 Khaoula Belhaj1, Angela Chaparro-Garcia1, Sophien Kamoun, Nicola J Patron and Vladimir Nekrasov CRISPR/Cas9

Invitrogen TrueCut Cas9 Protein v2 - Thermo Fisher Scientific€¦ · Cas9 protein forms a very stable ribonucleoprotein (RNP) complex with the guide RNA (gRNA) component of the CRISPR-Cas9

genetic engineering · 2017-01-04 · genetic engineering new horizons in medicine CRISPR-Cas9 CRISPR-Cas9 technology Pharming CRISPR-Cas9 is a genome editing tool which is changing

Optogenetic interrogation of neural circuits: technology for … · 2010. 2. 18. · WPRE ITR Opsin XFP polyA ITR Lentivirus vector WPRE cPPT RRE Opsin Promoter (up to 5 kb) Promoter

Plant genome editing made easy: targeted mutagenesis in model … · 2013-10-22 · applications of the CRISPR/Cas technology in plants. Keywords: CRISPR, Cas9, Plant, Genome editing,

Editing DNA with CRISPR-Cas9

Cas9 system - bch.cbd.int

What is CRISPR-Cas9? · to improve gene targeting methods, including CRISPR-Cas systems, transcription activator-like effector nucleases (TALENs) and zinc-finger nucleases (ZFNs).

NCER Position on Crispr-Cas9

Cas9 GENOME EDITING: CRISPR/CAS9 DELIVERY METHODS … · What is CRISPR/Cas9 Genome Editing? CRISPR Facilitates Multiple Types of Genome Modification The CRISPR/Cas9 system is a powerful

Application of the CRISPR–Cas System for Efficient …...Letter to the Editor 2009Figure 1. The CRISPR–Cas9 System Induces Efficient Targeted Gene Editing and Correction in Plants.

DNA cleaving and gene repression using CRISPR/Cas · In this Praktikum, our learning goal includes the mechanism of CRISPR-Cas system, ... PAM sequence is also bound by Cas9, which

A CRISPR-Cas9 gene engineering work- flow: generating ... · CRISPR-Cas9 Gene Engineering platform to knock out PPIB in HEK293T cells. First, co-transfection of the mKate2-Cas9 expression

IDT CRISPR-Cas9 nickase mutants and template optimization ... · CRISPR-Cas9 genome editing, the RNP is composed of hybridized crRNA, tracrRNA, and the Cas9 endonuclease. Delivery

Review Strategies in the delivery of Cas9 ribonucleoprotein for CRISPR/Cas9 … · 2020. 10. 20. · CRISPR/Cas9 genome editing has gained rapidly increasing attentions in recent

Genome Editing CRISPR-Cas9

Crystal structure of Cas9 in complex with guide …crystal structure of the Cas9–sgRNA–cDNA ternary complex. S. pyogenes Cas9 was expressed in Escherichia coli, and then purified