Last updated: April 21, 2026

How to Design and QC a CRISPR sgRNA Library

Use this page when you need to design a CRISPR sgRNA library and make sure it is ready for cloning, packaging, and screening. It walks through choosing the right library format, ranking guides, sizing the pool, defining the oligo, and checking both pre-order and post-synthesis QC. If you need a faster answer on one step, jump to the shorter CRISPR library workflow, Coverage Calculator, Batch Sequence QC, or Oligo Pool QC Metrics.

Planning a CRISPR sgRNA library from guide selection to QC

Choose guides, size the library, and set QC thresholds before cloning and screening.

Key Takeaways

  • Use 4-10 sgRNAs per gene for knockout screens — more guides increase statistical power and reduce false negatives.
  • Select sgRNAs with GC content 40-70%, avoid TTTT (polymerase III terminator), and target constitutive exons in the first 50% of the coding sequence.
  • Activity scoring (Rule Set 2/Azimuth) and off-target analysis (CFD score >0.9, MIT score >80) are both essential for guide selection.
  • Pool synthesis oligos are typically 76-84 nt for SpCas9: 20 nt spacer + two BsmBI cloning sites (10 nt each) + two amplification primers (18-22 nt each).
  • Verify library representation by NGS at 500-1000x coverage: >90% of guides detected, Gini <0.25, <10% dropout.
  • Include 500-1000 non-targeting control guides plus 50-100 essential gene positive controls in every library.

1. Which CRISPR Library Format Fits Your Screen?

Start by matching the library format to the screen objective. Knockout, CRISPRa, CRISPRi, base editing, and tiling libraries need different target regions, guide counts, and downstream QC expectations.

Library Type | Cas Protein | Target Region | Guides/Gene | Typical Library Size
CRISPRko (Knockout) | SpCas9 (active) | Constitutive coding exons | 4-6 | ~80K-120K guides
CRISPRa (Activation) | dCas9-VP64/p65/Rta | Promoter (-200 to +1 of TSS) | 5-10 | ~100K-200K guides
CRISPRi (Interference) | dCas9-KRAB | TSS region (-50 to +300) | 5-10 | ~100K-200K guides
Base Editing | CBE4 or ABE8e | Coding exons (edit window pos 4-8) | 4-8 | ~80K-160K guides
Tiling | SpCas9 or dCas9 | Every PAM across target region | All possible | 50-500 per region

Which Cas system fits your delivery and PAM constraints?

System | PAM | Spacer Length | Cut Type | Advantages
SpCas9 | NGG | 20 nt | Blunt (3 bp upstream of PAM) | Most validated, best scoring tools
SaCas9 | NNGRRT | 21-23 nt | Blunt | Smaller size for AAV delivery
AsCas12a | TTTV | 23 nt | Staggered (5' overhang) | AT-rich targets, multiplex from single transcript
LbCas12a | TTTV | 23 nt | Staggered (5' overhang) | High activity at 37°C
enAsCas12a | TTYN, VTTV, TRTV | 23 nt | Staggered | Expanded PAM flexibility

2. How Should You Filter and Rank sgRNAs?

Treat sgRNA selection as a ranking workflow, not a single score cutoff. Start with hard filters, then compare on-target and off-target scores so you know why each guide made the final library.

Criterion | Optimal | Filter (Hard) | Rationale
GC Content | 40-70% | Exclude <30% or >80% | Binding stability and activity correlation
Poly-T | No TTTT | Exclude any TTTT | Pol III termination signal
Homopolymer | ≤3 consecutive | Exclude ≥5 consecutive | Synthesis error and misalignment
Target Position | First 50% of CDS | Exclude last 10% of CDS | Earlier frameshifts = stronger KO
Exon Targeting | Constitutive exons | Avoid alt-spliced exons | Ensure disruption in all isoforms
Activity Score | Rule Set 2 >0.6 | Exclude <0.2 | Predicts cutting efficiency
Specificity (CFD) | CFD >0.9 | Exclude <0.5 | Off-target risk assessment
Specificity (MIT) | MIT >80 | Exclude <50 | Alternative specificity metric

Which scoring models should you trust?

Rule Set 2 / Azimuth (Doench et al., 2016): Gradient-boosted regression tree model trained on >4,000 sgRNAs targeting 17 genes across multiple cell lines. Azimuth is the software implementation of Rule Set 2. Input features include position-specific base preferences, dinucleotide composition, GC content, and the position of the cut site within the protein. Scores range from 0 to 1, with >0.5 indicating high predicted activity. It outperforms Rule Set 1 (Doench et al., 2014, which used logistic regression on ~1,800 sgRNAs). Available through the Broad GPP portal.

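CFD Score (Doench et al., 2016): Cutting Frequency Determination score for off-target prediction, published in the same study as Rule Set 2. It evaluates each mismatch position and type independently. For an individual off-target site, a higher CFD means a higher predicted chance of cutting there. The guide-level CFD thresholds in the table above (>0.9 optimal, <0.5 excluded) are aggregate specificity values, where higher means lower overall off-target risk.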

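DeepCas9 / CHOPCHOP: DeepCas9 is a deep learning activity model trained on larger datasets; CHOPCHOP is a design tool that lets you choose among several scoring models. Consider these for non-standard applications or when designing guides for organisms that the Rule Set 2 training data cover poorly.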

Screen your designed spacer sequences with our GC Content Analyzer (batch mode) and Batch Sequence QC to identify sequences with extreme GC, TTTT motifs, or problematic homopolymers before synthesis.
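The hard filters in the table above can be sketched in a few lines of Python. This is a minimal illustration and not the logic behind Batch Sequence QC: the function names are hypothetical, and the thresholds are simply the hard-filter values from this section (GC below 30% or above 80%, any TTTT, homopolymer runs of 5 or more).

```python
import re


def gc_fraction(spacer: str) -> float:
    """Fraction of G and C bases in the spacer."""
    s = spacer.upper()
    return (s.count("G") + s.count("C")) / len(s)


def longest_homopolymer(spacer: str) -> int:
    """Length of the longest run of a single repeated base."""
    runs = re.finditer(r"A+|C+|G+|T+", spacer.upper())
    return max((len(m.group(0)) for m in runs), default=0)


def hard_filter(spacer: str) -> list:
    """Return the reasons a spacer fails the hard filters (empty list = pass)."""
    s = spacer.upper()
    reasons = []
    gc = gc_fraction(s)
    if gc < 0.30 or gc > 0.80:
        reasons.append("GC outside 30-80%")
    if "TTTT" in s:
        reasons.append("contains TTTT")
    if longest_homopolymer(s) >= 5:
        reasons.append("homopolymer >= 5")
    return reasons


if __name__ == "__main__":
    # Example spacers are made up for illustration.
    for spacer in ("GACCTGAAGCTGTACGATCA", "ATTTTGCAGCTGTACGATCA"):
        print(spacer, hard_filter(spacer) or "pass")
```

Guides that pass these checks still need activity and off-target scoring; the hard filters only remove sequences that should never reach ranking.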

3. How Large Will the Library and Screen Be?

Before you order the pool, confirm that library size, cell coverage, and sequencing depth all still fit the experiment. Underpowered CRISPR screens hide real hits and inflate guide-level noise.

Screen Type | Guides/Gene | Total Library Size | Cell Coverage | Cells Needed
Genome-wide KO | 4 | ~80K | 500x per guide | 40M cells
Genome-wide CRISPRa/i | 5-10 | ~100-200K | 500x per guide | 50-100M cells
Focused sublibrary | 6-10 | 1K-10K | 1000x per guide | 1-10M cells
Tiling screen | All available | 5K-50K | 500x per guide | 2.5-25M cells

Use our Coverage Calculator to determine the minimum cell number, sequencing depth, and replicate count for your screen design. The tool accounts for library complexity, infection efficiency (MOI), and desired statistical power.
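The "Cells Needed" column above is library size multiplied by per-guide coverage. The short sketch below reproduces that arithmetic so you can rerun it for your own numbers; it is not the Coverage Calculator itself, and the function and dictionary names are hypothetical.

```python
def cells_needed(n_guides: int, coverage: int) -> int:
    """Cells that must carry a guide to reach the target per-guide coverage."""
    return n_guides * coverage


# (smallest library, largest library, coverage per guide) from the table above
SCREENS = {
    "Genome-wide KO": (80_000, 80_000, 500),
    "Genome-wide CRISPRa/i": (100_000, 200_000, 500),
    "Focused sublibrary": (1_000, 10_000, 1000),
    "Tiling screen": (5_000, 50_000, 500),
}

if __name__ == "__main__":
    for name, (low, high, coverage) in SCREENS.items():
        lo_cells = cells_needed(low, coverage) / 1e6
        hi_cells = cells_needed(high, coverage) / 1e6
        print(f"{name}: {lo_cells:g}-{hi_cells:g}M cells at {coverage}x")
```

These figures count cells that carry a guide after selection. The number of cells you need at the infection step is higher and depends on MOI (see Section 11).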

4. What Should Each CRISPR Library Oligo Contain?

Every CRISPR library oligo needs to carry the spacer plus the constant regions required for amplification, cloning, and expression. Lock this architecture before quote review so vendor length limits and downstream cloning stay aligned.

Typical SpCas9 Library Oligo (lentiGuide-Puro)

5'-[FWD_PRIMER]-[BsmBI_site]-[20nt_SPACER]-[BsmBI_site]-[REV_PRIMER]-3'

Element | Length | Function
Forward primer | 18-22 nt | PCR amplification handle
BsmBI site + overhang | 10 nt | Golden Gate cloning into vector
Spacer sequence | 20 nt | sgRNA targeting sequence (variable)
BsmBI site + overhang | 10 nt | Golden Gate cloning into vector
Reverse primer | 18-22 nt | PCR amplification handle
Total oligo | 76-84 nt | Well within array synthesis limits

Verify your full oligo sequences (spacer + constant regions) through our Batch Sequence QC tool before ordering synthesis. This checks for unintended restriction sites that would interfere with cloning, as well as secondary structures that could impair PCR amplification of the pool.
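One check you can script yourself is the unintended-restriction-site test: after assembly, the oligo should contain exactly two BsmBI sites (CGTCTC on one side, its reverse complement GAGACG on the other), and none inside the spacer. The sketch below assumes the five-part architecture shown above. All constant sequences are hypothetical placeholders, with N standing in for vector-specific overhang bases, so replace them with the validated sequences for your vector before use.

```python
BSMBI_SITE = "CGTCTC"     # BsmBI recognition sequence
BSMBI_SITE_RC = "GAGACG"  # reverse complement of the recognition sequence

# Hypothetical placeholders, NOT validated cloning sequences.
FWD_PRIMER = "ACGTTGCATCAGGACTAGCA"
LEFT_FLANK = "CGTCTCNNNN"   # BsmBI site + placeholder overhang bases
RIGHT_FLANK = "NNNNGAGACG"  # placeholder overhang bases + BsmBI site
REV_PRIMER = "TGCATCGGATTCAGCTAGGA"


def count_occurrences(seq: str, motif: str) -> int:
    """Count motif occurrences, including overlapping ones."""
    count, start = 0, seq.find(motif)
    while start != -1:
        count += 1
        start = seq.find(motif, start + 1)
    return count


def build_oligo(spacer: str) -> str:
    """Assemble primer + site + spacer + site + primer."""
    return FWD_PRIMER + LEFT_FLANK + spacer.upper() + RIGHT_FLANK + REV_PRIMER


def check_oligo(spacer: str) -> dict:
    """Report oligo length and whether exactly two BsmBI sites are present."""
    oligo = build_oligo(spacer)
    sites = count_occurrences(oligo, BSMBI_SITE) + count_occurrences(oligo, BSMBI_SITE_RC)
    return {"oligo": oligo, "length": len(oligo), "bsmbi_sites": sites, "ok": sites == 2}


if __name__ == "__main__":
    for spacer in ("GACCTGAAGCTGTACGATCA", "GACCGTCTCCTGTACGATCA"):
        result = check_oligo(spacer)
        print(spacer, result["length"], result["bsmbi_sites"], result["ok"])
```

Counting sites on the full oligo rather than the spacer alone also catches sites created at the junctions between the spacer and the constant regions.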

5. Which Sequence Checks Should You Run Before Ordering?

Run full-length sequence checks before you approve synthesis. This is where you catch TTTT terminators, GC extremes, structures, and Tm outliers before they become dropout or amplification bias.

TTTT Motif Check

The most critical filter: any spacer containing TTTT will produce a truncated, non-functional sgRNA. This is a hard filter — no exceptions.

Use Batch QC

GC Content Analysis

Spacer GC 40-70%. Extreme GC causes synthesis bias (dropout) and reduced Cas9 activity. Use batch mode for genome-wide libraries.

Use GC Analyzer

Secondary Structures

Strong hairpins in the spacer (ΔG < -3 kcal/mol) can block Cas9 loading. Check the full oligo for structures that impair PCR amplification.

Use Structure Predictor

Tm Uniformity

For PCR amplification handles, ensure consistent Tm (58-62°C). The spacer region contributes to overall oligo Tm — check for outliers.

Use Tm Calculator
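For a quick sanity check on Tm uniformity before using a full calculator, the sketch below applies the basic GC-content formula for oligos longer than 13 nt, Tm = 64.9 + 41 × (G + C − 16.4) / N. This is a rough approximation and an assumption of this example: nearest-neighbor calculators give different absolute values, so compare Tm values only within one method. The primer names, the sequences, and the 2°C-from-median outlier rule are all hypothetical.

```python
import statistics


def basic_tm(seq: str) -> float:
    """Approximate Tm (°C) from GC count; intended for oligos longer than 13 nt."""
    s = seq.upper()
    gc = s.count("G") + s.count("C")
    return 64.9 + 41.0 * (gc - 16.4) / len(s)


def tm_outliers(primers: dict, max_offset: float = 2.0) -> dict:
    """Return primers whose Tm is more than max_offset °C from the median Tm."""
    tms = {name: basic_tm(seq) for name, seq in primers.items()}
    median = statistics.median(list(tms.values()))
    return {name: tm for name, tm in tms.items() if abs(tm - median) > max_offset}


if __name__ == "__main__":
    # Made-up handle sequences for illustration only.
    handles = {
        "fwd": "ACGTTGCATCAGGACTAGCA",
        "rev": "TGCATCGGATTCAGCTAGGA",
        "alt": "GCGCGGCCGCATGCGGCCAT",
    }
    for name, seq in handles.items():
        print(name, round(basic_tm(seq), 2))
    print("outliers:", sorted(tm_outliers(handles)))
```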

6. How Do You QC the Library After Synthesis and Cloning?

After synthesis and cloning, review the library as a release decision. You need to know whether representation, uniformity, titer, MOI, and replicate behavior are strong enough to screen or whether you should re-clone, scale up, or re-order.

QC Stage | Metric | Target | If Failed
Plasmid Library | Representation (NGS) | >90% guides detected | Sub-clone at higher ratio
Plasmid Library | Uniformity (Gini) | <0.25 | Re-amplify with fewer cycles
Plasmid Library | Dropout rate | <10% | Redesign/resynthesize dropped guides
Viral Production | Titer (TU/mL) | >10^7 for lenti | Concentrate by ultracentrifugation
Transduction | MOI | 0.3-0.5 (30-50% infection) | Titrate virus on target cells
Screen Coverage | Cells per guide | ≥500x | Scale up cell culture
Screen Results | Replicate correlation | Pearson r >0.7 | Add technical replicates

Use our Uniformity Estimator to predict expected representation from your sequencing depth, and our Coverage Calculator to determine minimum sequencing depth for adequate power.
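The three plasmid-library metrics in the table can be computed directly from per-guide read counts. The sketch below is one way to do it, assuming a "detected" guide is one with at least one read and using the standard Gini formula on sorted counts. The function names are hypothetical and this is not the Uniformity Estimator's implementation.

```python
def gini(counts) -> float:
    """Gini coefficient of read counts (0 = perfectly even)."""
    xs = sorted(counts)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one guide with reads")
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def library_qc(counts, min_reads: int = 1) -> dict:
    """Compare representation, dropout, and Gini against the table's targets."""
    n = len(counts)
    detected = sum(1 for c in counts if c >= min_reads) / n
    dropout = 1 - detected
    g = gini(counts)
    return {
        "detected": detected,
        "dropout": dropout,
        "gini": g,
        "pass": detected > 0.90 and dropout < 0.10 and g < 0.25,
    }


if __name__ == "__main__":
    # Toy counts: 95 guides with 100 reads each, 5 guides missing.
    print(library_qc([100] * 95 + [0] * 5))
```

With real data, compute these on raw counts from the plasmid library sequencing run, before any normalization.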

7. Worked Example: Planning a 500-Gene Kinase Screen

This example shows how a focused CRISPR knockout project turns guide selection, controls, oligo design, and screen coverage into an actual order plan for a first pooled screen.

Step 1: Define Your Gene List

Start with the KinBase curated human kinome (518 protein kinases). For a drug resistance screen, also add 20-30 known resistance genes from literature (drug efflux pumps, target pathway genes). Final list: ~545 genes.

Step 2: Design sgRNAs with CRISPick

Upload your gene list to CRISPick (Broad GPP). Settings: SpCas9, Human GRCh38, CRISPRko mode, 6 guides per gene. CRISPick applies Rule Set 2 scoring and CFD off-target analysis automatically.

Total sgRNAs: 3,270 + Controls: 730 = Total Library: 4,000 guides

Step 3: Add Constant Regions

Append amplification primers and BsmBI cloning sites. Each oligo core: 5'-CACCG(20nt spacer)GTTTTAGAGCTAGAAATAGC-3' = 5 + 20 + 20 = 45 nt. With two ~20 nt PCR handles: ~85 nt total. Run all 4,000 through Batch QC and expect a ~98% pass rate.

Step 4: Order & Screen

Order from Twist (4K pool at ~85 nt = ~$320). Amplify with 8 PCR cycles, clone into lentiGuide-Puro. Cells needed: 4,000 guides × 500x coverage = 2M transduced cells per replicate (very manageable); at MOI 0.3 that means about 6.7M cells at the infection step (see Section 11). Total timeline: 8-10 weeks from gene list to screen results.

💡 Pro Tip: A focused 4K library is an ideal first CRISPR screen. It requires only 2M transduced cells per replicate (one T75 flask), costs roughly $400-900 for synthesis + cloning (see Section 9), and can be sequenced on a single MiSeq run. Start here before attempting genome-wide screens that need 100M+ cells.

⚠️ Pitfall: Don't skip pre-synthesis QC even for small libraries. In our kinase example, ~80 of 3,270 sgRNAs (2.4%) contain poly-G runs or extreme GC content. These will drop out of the pool, creating blind spots in exactly the genes you care about. Replace them with alternative sgRNAs from CRISPick.
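The numbers in this worked example all follow from a handful of inputs. The sketch below recomputes them so you can swap in your own gene count or guide depth. The function name and the 20 nt handle length are assumptions for illustration; the other defaults come from the steps above.

```python
def plan_focused_library(
    genes: int = 545,
    guides_per_gene: int = 6,
    controls: int = 730,
    coverage: int = 500,
    handle_len: int = 20,  # assumed length of each PCR handle
) -> dict:
    """Recompute library size, oligo length, and cell number for the example."""
    core_len = len("CACCG") + 20 + len("GTTTTAGAGCTAGAAATAGC")  # 5 + 20 + 20
    targeting = genes * guides_per_gene
    total = targeting + controls
    return {
        "targeting_guides": targeting,
        "total_guides": total,
        "core_oligo_nt": core_len,
        "full_oligo_nt": core_len + 2 * handle_len,
        "cells_per_replicate": total * coverage,
    }


if __name__ == "__main__":
    plan = plan_focused_library()
    for key, value in plan.items():
        print(f"{key}: {value:,}")
    # Share of targeting guides flagged in the pitfall above (~80 of 3,270).
    print(f"flagged: {80 / plan['targeting_guides']:.1%}")
```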

8. Which sgRNA Design Tool Fits Your Project?

Several free tools can design sgRNA libraries, but they are not interchangeable. Choose based on library scale, organism, editing mode, and whether you need pooled-screen throughput or gene-by-gene flexibility.

Tool | Scoring | Off-Target | Batch Size | Best For | Limitation
CRISPick (Broad) | Rule Set 2 / Azimuth | CFD + MIT | Genome-wide | Large-scale KO/CRISPRa/i screens | Human/mouse only
CHOPCHOP | Multiple (configurable) | Bowtie alignment | 1-50 genes | Multi-organism, individual genes | Not suited for genome-wide
CRISPRscan | Moreno-Mateos 2015 | Basic alignment | 1-10 genes | Zebrafish, in vivo injection | Trained on zebrafish data
BE-Designer | BE-Hive / ABE prediction | Cas-OFFinder | 1-100 genes | Base editing screens | BE-specific, not for KO
GUIDES (Zhang lab) | Combined activity score | Comprehensive | 1-1000 genes | Custom focused libraries | Slower for large batches

💡 Our Recommendation: Use CRISPick for any library with >50 genes — it has the best-validated scoring algorithm and handles genome-wide designs. For smaller projects or non-model organisms, use CHOPCHOP for its flexibility. For base editing libraries, BE-Designer is essential as it predicts the editing window and outcome. Regardless of tool, always post-process with our Batch QC to catch synthesis-problematic sequences.

9. How Long Will the Project Take and What Will It Cost?

Budget the full workflow, not just the oligo pool. Cloning, cell culture, virus production, sequencing, and analysis often cost more than synthesis itself.

Phase | Duration | Cost (Focused 4K) | Cost (Genome-wide 80K) | Key Deliverable
1. sgRNA Design | 2-3 days | $0 (free tools) | $0 (free tools) | Ranked sgRNA list + QC report
2. Oligo Synthesis | 2-3 weeks | $200-400 | $3,000-6,000 | Lyophilized oligo pool
3. Cloning + QC | 1-2 weeks | $200-500 | $500-1,500 | Plasmid library (NGS-verified)
4. Virus Production | 1 week | $100-200 | $300-600 | Lentiviral stock (titered)
5. Screen Execution | 2-4 weeks | $500-1,500 | $2,000-5,000 | Selected cell populations
6. NGS + Analysis | 1 week | $300-500 | $800-1,500 | Hit gene list
Total | 8-12 weeks | $1,300-3,100 | $6,600-14,600 | Validated hit list

💡 Pro Tip: The biggest variable cost is cell culture, not synthesis. For adherent cells, genome-wide screens require 50-100 T175 flasks per replicate. Use suspension cells if possible (e.g., K562, Jurkat) — they scale easily in spinner flasks and need less hands-on time.

⚠️ Pitfall: Don't underestimate the cloning step. Golden Gate cloning of oligo pools has a typical 40-60% correct insertion rate. You need enough colonies/transformations to achieve >90% library representation. Plan for at least 10-20× the library size in colonies (e.g., 40K-80K colonies for a 4K library). Electro-competent cells with >10⁹ CFU/μg efficiency are essential.

📋 Protocol: Golden Gate Cloning of Oligo Pool into lentiGuide-Puro

Step 1: Golden Gate Reaction (20 μL)
──────────────────────────────────────────────────────────────────────
Component                                        Volume    Final Conc
──────────────────────────────────────────────────────────────────────
lentiGuide-Puro (BsmBI digested, gel-purified)   2 μL      50 ng
Oligo pool (annealed)                            1 μL      ~1:3 molar
10× T4 Ligase Buffer                             2 μL      1×
BsmBI-v2 (10 U/μL)                               1 μL      10 U
T4 DNA Ligase (400 U)                            1 μL      400 U
Nuclease-free H₂O                                13 μL
──────────────────────────────────────────────────────────────────────
Cycling: [37°C 5 min → 16°C 5 min] × 25 → 55°C 5 min → 4°C hold

Step 2: Transformation
──────────────────────────────────────────────────────────────────────
Add 2 μL reaction to 25 μL Endura cells
Electroporate: 1.8 kV, 200 Ω, 25 μF
Add 975 μL SOC, recover 37°C 1 h
Plate serial dilutions (1:10, 1:100) on LB+Amp to count colonies
Plate remainder on 15 cm LB+Amp plates

Step 3: Colony counting & QC
──────────────────────────────────────────────────────────────────────
Target: ≥10× library size in colonies (e.g., 40K colonies for 4K library)
Scrape all colonies → maxiprep
NGS verify at 500× coverage per guide

Use electro-competent cells with >10⁹ CFU/μg efficiency (Endura or MegaX DH10B). Chemical transformation is insufficient for large libraries. Expect 40-60% correct insertions; the remainder are empty vector or recombinants. Source: Addgene CRISPR Pooled Library Cloning Protocol; Joung et al., Nature Protocols 2017.
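To see how the colony target and the correct-insertion rate interact, the sketch below computes the colony target, the expected number of correct-insert colonies, and an idealized estimate of how many guides end up represented. The representation estimate assumes a perfectly uniform pool and Poisson sampling, which is an assumption of this example and is optimistic for real pools. It also assumes the colony target counts all colonies, not only correct ones. The function name is hypothetical.

```python
import math


def colony_plan(library_size: int, fold: int = 10, correct_fraction: float = 0.5) -> dict:
    """Colony target and idealized guide representation for a pooled cloning."""
    colonies = library_size * fold
    correct = colonies * correct_fraction
    correct_fold = correct / library_size
    # Poisson estimate: chance a given guide appears in at least one correct colony.
    represented = 1 - math.exp(-correct_fold)
    return {
        "colonies_target": colonies,
        "expected_correct": correct,
        "correct_fold": correct_fold,
        "expected_represented": represented,
    }


if __name__ == "__main__":
    for fold in (10, 20):
        for fraction in (0.4, 0.6):
            plan = colony_plan(4_000, fold=fold, correct_fraction=fraction)
            print(
                f"{fold}x, {fraction:.0%} correct: "
                f"{plan['colonies_target']:,} colonies, "
                f"{plan['expected_represented']:.1%} of guides expected"
            )
```

Because real pools are skewed, treat the representation figure as an upper bound and confirm it by NGS as described in Step 3.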

10. Which Controls Should You Include in the Library?

Controls determine whether the final screen can be trusted. Include enough negative, positive, and cut-site controls to separate biology from technical noise.

Control Type | Count | Purpose | Source | Priority
Non-targeting | 500-1000 | Null distribution for statistics | Random 20mers, no genomic match | 🔴 Essential
Safe-harbor targeting | 100-200 | Cut-site effect control | AAVS1, ROSA26 intergenic guides | 🟡 Recommended
Essential gene (+ctrl) | 50-100 | Validate KO efficiency | CEG2 set (Hart et al., 2017) | 🔴 Essential
Known drug targets | 20-50 | Validate screen phenotype | Literature-curated for your drug | 🟡 Recommended
GFP/RFP targeting | 10-20 | Measure infection & KO efficiency | Target reporter in Cas9 cells | 🟢 Optional

💡 Pro Tip: Use your non-targeting controls to calculate representation uniformity. Since they have no biological effect, their read count distribution should reflect pure technical noise. If non-targeting controls show >5-fold variation, your screen has a systematic bias (infection MOI, cell growth artifacts) and results should be interpreted cautiously. This is the single best internal quality metric for any CRISPR screen.
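The fold-variation check on non-targeting controls can be scripted as below. The tip does not say how to measure "variation", so this sketch assumes one reasonable definition: the ratio of the 90th to the 10th percentile read count, with a pseudocount of 1 to avoid division by zero. The function names, percentiles, and pseudocount are all assumptions for illustration.

```python
def nearest_rank(sorted_vals, pct: int):
    """Nearest-rank percentile of an already sorted list (pct as an integer 1-100)."""
    n = len(sorted_vals)
    rank = max(1, (pct * n + 99) // 100)  # integer ceiling of pct * n / 100
    return sorted_vals[rank - 1]


def control_spread(counts, low_pct: int = 10, high_pct: int = 90,
                   pseudocount: int = 1, max_fold: float = 5.0) -> dict:
    """Fold spread of non-targeting control counts and whether it exceeds max_fold."""
    if not counts:
        raise ValueError("no control counts supplied")
    xs = sorted(counts)
    low = nearest_rank(xs, low_pct) + pseudocount
    high = nearest_rank(xs, high_pct) + pseudocount
    fold = high / low
    return {"fold": fold, "flag": fold > max_fold}


if __name__ == "__main__":
    tight = [480, 510, 495, 530, 505, 470, 520, 500, 490, 515]
    wide = list(range(1, 101))
    print("tight:", control_spread(tight))
    print("wide:", control_spread(wide))
```

Using percentiles rather than max/min keeps one or two extreme guides from triggering the flag on their own.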

11. How Should You Package Virus and Set MOI?

Lentiviral packaging is where many otherwise solid libraries fall apart. Set MOI and cell coverage early so packaging, infection, and selection stay compatible with the screen design.

Parameter | ✅ Optimal | ⚠️ Acceptable | 🔴 Problematic
MOI (pooled KO screen) | 0.3–0.5 | 0.5–0.7 | >0.7 (multi-infection)
MOI (CRISPRi/a screen) | 0.3–0.5 | 0.5–1.0 | >1.0
Transduction efficiency | 30–50% | 20–30% | <20% (insufficient)
Viral titer (TU/mL) | 10⁷–10⁸ | 10⁶–10⁷ | <10⁶ (re-package)
Cell coverage per guide | ≥500 cells | 200–500 cells | <200 cells
Selection antibiotic | Puromycin 2 μg/mL, 48h | Puro 1–3 μg/mL, 48–72h | No selection (noise)

💡 Pro Tip: To calculate the number of cells needed for infection: Cells = (Library size × coverage) / MOI. For a 4K library with 500× coverage at MOI 0.3: 4,000 × 500 / 0.3 = 6.7 million cells needed. For a genome-wide 80K library: 80,000 × 500 / 0.3 = 133 million cells — plan for T175 flasks or spinners!

⚠️ Pitfall: Never skip the MOI titration. Infect small aliquots at serial dilutions (1:10, 1:30, 1:100, 1:300) and measure survival after puromycin selection. Aim for ~30% survival = MOI 0.3. If you inherit virus from another lab, always re-titer — freeze-thaw cycles degrade lentivirus by 10–50% per cycle.
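The cell-number formula above treats MOI as the fraction of cells infected, which is a close approximation at low MOI. The sketch below reproduces that formula and, as an added assumption of this example, uses a Poisson model of integration events to show the infected fraction and the share of infected cells carrying more than one guide. The function names are hypothetical.

```python
import math


def cells_to_infect(library_size: int, coverage: int, moi: float) -> float:
    """Cells needed at infection, using the page's formula (size x coverage / MOI)."""
    return library_size * coverage / moi


def fraction_infected(moi: float) -> float:
    """Poisson estimate of the fraction of cells with at least one integration."""
    return 1 - math.exp(-moi)


def fraction_multi_among_infected(moi: float) -> float:
    """Poisson estimate of infected cells that carry two or more integrations."""
    p0 = math.exp(-moi)
    p1 = moi * math.exp(-moi)
    return (1 - p0 - p1) / (1 - p0)


if __name__ == "__main__":
    for size in (4_000, 80_000):
        print(f"{size:,} guides: {cells_to_infect(size, 500, 0.3) / 1e6:.1f}M cells")
    for moi in (0.3, 0.5, 0.7):
        print(
            f"MOI {moi}: {fraction_infected(moi):.1%} infected, "
            f"{fraction_multi_among_infected(moi):.1%} of those with >1 guide"
        )
```

Under this model the share of multiply infected cells rises quickly with MOI, which is why the table above flags MOI above 0.7 as problematic for pooled knockout screens.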

12. Frequently Asked Questions

How many sgRNAs per gene do I need for a CRISPR screen?
For genome-wide knockout screens: use 4-6 sgRNAs per gene (minimum). For focused/sublibrary screens: 6-10 per gene provides better statistical power. Top-performing libraries (Brunello, TKOv3) use 4 guides per gene for ~19,000 genes. More guides per gene reduces false negatives but increases library size and sequencing cost. For CRISPRa/CRISPRi screens, use 5-10 guides per TSS as activity is more position-dependent.
What is the difference between CRISPRko, CRISPRa, and CRISPRi libraries?
CRISPRko (knockout) uses active Cas9 to create indels that disrupt gene function — guides target coding exons. CRISPRa (activation) uses dCas9 fused to transcriptional activators (VP64, p65, Rta) — guides target promoter regions within -200 to +1 bp of the TSS. CRISPRi (interference) uses dCas9 fused to KRAB repressor — guides target -50 to +300 bp of the TSS. Each requires different guide design rules and scoring algorithms.
Why should I avoid TTTT sequences in sgRNAs?
The TTTT motif (four consecutive thymidines in the DNA template) acts as a transcription termination signal for RNA Polymerase III, which drives sgRNA expression from U6 or H1 promoters. This causes premature termination of the sgRNA transcript, producing truncated, non-functional guides. Always filter out spacer sequences containing TTTT during library design.
What Cas systems can I use for CRISPR screens?
SpCas9 (PAM: NGG, spacer: 20 nt) is the most widely used with the most validated libraries. SaCas9 (PAM: NNGRRT, spacer: 21-23 nt) is smaller, useful for AAV delivery. Cas12a/Cpf1 (PAM: TTTV, spacer: 23 nt) generates staggered cuts and enables multiplex guide expression from a single transcript. AsCas12a and LbCas12a are the most active Cas12a variants. Choose based on your delivery method and PAM availability in target regions.
How do I analyze CRISPR screen results?
Common analysis pipelines: (1) MAGeCK — tests for gene-level enrichment/depletion using negative binomial models; (2) BAGEL2 — Bayesian classifier using pre-defined reference gene sets for essentiality scoring; (3) JACKS — accounts for guide-level efficacy variation. Start with MAGeCK for most applications. Quality metrics: library coverage (≥500x per guide), guide-level correlation between replicates (Pearson r >0.7), and clear separation of positive/negative control genes.
What controls should I include in my CRISPR library?
Controls to include: (1) Non-targeting guides: 500-1000 random sequences with no genomic match, for the null distribution; (2) Safe-harbor targeting guides: 100-200 guides targeting safe-harbor loci (AAVS1, ROSA26), as a cut-site control; (3) Positive controls: 50-100 guides targeting known essential genes (CEG2 set) or drug targets, for screen validation. Controls should comprise 5-10% of the total library.
