Last updated: March 4, 2026

CRISPR Library Design Guide: sgRNA Selection, Pool Synthesis & QC

How to design a CRISPR screen library: Select 4-10 high-activity sgRNAs per gene targeting constitutive exons, score with Azimuth/Rule Set 2 for on-target activity and CFD for off-target specificity, then synthesize as an oligo pool with amplification handles and cloning adapters. This guide covers library types (CRISPRko/a/i), Cas system selection, oligo architecture, pool synthesis QC, and screen analysis. Use our Coverage Calculator, GC Analyzer, and Batch QC Tool for library design and validation.

Key Takeaways

  • Use 4-10 sgRNAs per gene for knockout screens — more guides increase statistical power and reduce false negatives.
  • Select sgRNAs with GC content 40-70%, avoid TTTT (polymerase III terminator), and target constitutive exons in the first 50% of the coding sequence.
  • Activity scoring (Rule Set 2/Azimuth) and off-target analysis (CFD score >0.9, MIT score >80) are both essential for guide selection.
  • Pool synthesis oligos are typically 76-84 nt for SpCas9: 20 nt spacer + two BsmBI cloning sites (10 nt each) + two amplification primer handles (18-22 nt each).
  • Verify library representation by NGS at 500-1000x coverage: >90% of guides detected, Gini <0.25, <10% dropout.
  • Include 500-1000 non-targeting control guides plus 50-100 essential gene positive controls in every library.

1. CRISPR Library Types

CRISPR libraries are classified by their mechanism of action and by their scope (genome-wide vs targeted). Each type requires different guide design rules, scoring algorithms, and oligo architectures.

Library Type | Cas Protein | Target Region | Guides/Gene | Typical Library Size
CRISPRko (Knockout) | SpCas9 (active) | Constitutive coding exons | 4-6 | ~80K-120K guides
CRISPRa (Activation) | dCas9-VP64/p65/Rta | Promoter (-200 to +1 of TSS) | 5-10 | ~100K-200K guides
CRISPRi (Interference) | dCas9-KRAB | TSS region (-50 to +300) | 5-10 | ~100K-200K guides
Base Editing | CBE4 or ABE8e | Coding exons (edit window pos 4-8) | 4-8 | ~80K-160K guides
Tiling | SpCas9 or dCas9 | Every PAM across target region | All possible | 50-500 per region
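
The CRISPRa and CRISPRi target windows in the table can be applied as a simple positional filter. The sketch below is illustrative only: the function names are hypothetical, and it assumes each guide is summarized by a single genomic coordinate on the same chromosome as the TSS.

```python
# Windows (bp relative to the TSS) taken from the library type table.
TSS_WINDOWS = {
    "CRISPRa": (-200, 1),
    "CRISPRi": (-50, 300),
}


def tss_offset(guide_pos: int, tss_pos: int, gene_strand: str) -> int:
    """Signed distance from the TSS; positive means downstream in the gene's orientation."""
    if gene_strand not in ("+", "-"):
        raise ValueError("gene_strand must be '+' or '-'")
    offset = guide_pos - tss_pos
    # On the minus strand, larger genomic coordinates lie upstream of the TSS.
    return offset if gene_strand == "+" else -offset


def in_target_window(mode: str, guide_pos: int, tss_pos: int, gene_strand: str) -> bool:
    """True if the guide falls inside the window for the given library type."""
    low, high = TSS_WINDOWS[mode]
    return low <= tss_offset(guide_pos, tss_pos, gene_strand) <= high
```

A guide 100 bp downstream of the TSS passes the CRISPRi window but not the CRISPRa window; the same genomic coordinate on a minus-strand gene is 100 bp upstream and passes CRISPRa.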

Cas System Comparison

System | PAM | Spacer Length | Cut Type | Advantages
SpCas9 | NGG | 20 nt | Blunt (3 bp upstream PAM) | Most validated, best scoring tools
SaCas9 | NNGRRT | 21-23 nt | Blunt | Smaller size for AAV delivery
AsCas12a | TTTV | 23 nt | Staggered (5' overhang) | AT-rich targets, multiplex from single transcript
LbCas12a | TTTV | 23 nt | Staggered (5' overhang) | High activity at 37°C
enAsCas12a | TTYN, VTTV, TRTV | 23 nt | Staggered | Expanded PAM flexibility
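
To make the SpCas9 row concrete, candidate sites can be enumerated by scanning both strands for a 20 nt protospacer followed by an NGG PAM. This is a minimal sketch with hypothetical function names; it handles only unambiguous A/C/G/T input and reports minus-strand positions in reverse-complement coordinates.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def find_spcas9_sites(seq: str, spacer_len: int = 20):
    """Return (strand, start, spacer, pam) for every spacer followed by NGG.

    `start` is the 0-based index of the spacer on the scanned strand, so
    minus-strand hits are indexed on the reverse complement of `seq`.
    """
    seq = seq.upper()
    # A lookahead makes overlapping candidate sites all get reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % spacer_len)
    sites = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            sites.append((strand, match.start(), match.group(1), match.group(2)))
    return sites
```

Cas12a systems place the PAM 5' of the spacer, so the same approach would need the pattern order reversed.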

2. sgRNA Design Criteria

Effective sgRNA selection combines sequence composition rules, on-target activity prediction, and off-target specificity analysis. Apply these criteria sequentially to filter and rank candidate guides.

Criterion | Optimal | Filter (Hard) | Rationale
GC Content | 40-70% | Exclude <30% or >80% | Binding stability and activity correlation
Poly-T | No TTTT | Exclude any TTTT | Pol III termination signal
Homopolymer | ≤3 consecutive | Exclude ≥5 consecutive | Synthesis error and misalignment
Target Position | First 50% of CDS | Exclude last 10% of CDS | Earlier frameshifts = stronger KO
Exon Targeting | Constitutive exons | Avoid alt-spliced exons | Ensure disruption in all isoforms
Activity Score | Rule Set 2 >0.6 | Exclude <0.2 | Predicts cutting efficiency
Specificity (CFD) | CFD >0.9 | Exclude <0.5 | Off-target risk assessment
Specificity (MIT) | MIT >80 | Exclude <50 | Alternative specificity metric
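
The sequence-composition hard filters in the table (GC content, poly-T, homopolymers) are straightforward to apply in code. The following is a minimal sketch with hypothetical function names; it covers only those three rows and does not address position, exon, or score-based criteria.

```python
import re


def gc_fraction(spacer: str) -> float:
    """Fraction of G and C bases in the spacer."""
    spacer = spacer.upper()
    if not spacer:
        raise ValueError("spacer is empty")
    return (spacer.count("G") + spacer.count("C")) / len(spacer)


def longest_homopolymer(spacer: str) -> int:
    """Length of the longest run of a single repeated base."""
    return max(len(m.group(0)) for m in re.finditer(r"(.)\1*", spacer.upper()))


def passes_hard_filters(spacer: str) -> bool:
    """Apply the hard sequence filters: GC 30-80%, no TTTT, no run of 5 or more."""
    spacer = spacer.upper()
    gc = gc_fraction(spacer)
    if gc < 0.30 or gc > 0.80:
        return False
    if "TTTT" in spacer:
        return False
    if longest_homopolymer(spacer) >= 5:
        return False
    return True
```

Guides that pass can then be ranked against the "Optimal" column (for example GC 40-70%) rather than rejected outright.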

Activity Scoring Methods

Rule Set 2 (Doench et al., 2016): Gradient-boosted regression tree model trained on 2,549 sgRNAs from 8 cell lines. Input features include dinucleotide composition, GC content, and position-specific base preferences. Scores range from 0 to 1, with >0.5 indicating high predicted activity.

Azimuth (Doench et al., 2016): The software implementation of the Rule Set 2 model. It generally outperforms Rule Set 1 and gives better discrimination among the top-scoring guides. Available through the Broad GPP portal.

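DeepCas9 / CHOPCHOP: DeepCas9 is a deep learning model trained on larger datasets; CHOPCHOP is a web-based design tool that reports several scoring methods rather than a model in its own right. Consider these for non-standard applications or when designing guides for organisms with limited training data coverage.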

Screen your designed spacer sequences with our GC Content Analyzer (batch mode) and Batch Sequence QC to identify sequences with extreme GC, TTTT motifs, or problematic homopolymers before synthesis.
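
Once activity and specificity scores are available, guide selection reduces to filtering and ranking per gene. The sketch below assumes scores have already been computed elsewhere (it does not implement Rule Set 2 or CFD); the dictionary keys `gene`, `spacer`, `activity`, and `cfd` are hypothetical, and the thresholds are the hard-filter values from the design criteria table.

```python
from collections import defaultdict


def select_guides(candidates, per_gene=4, min_activity=0.2, min_cfd=0.5):
    """Keep guides passing both score filters, then take the top N per gene.

    Guides are ranked by activity score, with CFD specificity as a tie-breaker.
    """
    by_gene = defaultdict(list)
    for guide in candidates:
        if guide["activity"] >= min_activity and guide["cfd"] >= min_cfd:
            by_gene[guide["gene"]].append(guide)

    selected = {}
    for gene, guides in by_gene.items():
        guides.sort(key=lambda g: (g["activity"], g["cfd"]), reverse=True)
        selected[gene] = guides[:per_gene]
    return selected
```

Genes that end up with fewer than `per_gene` guides are worth flagging for manual review rather than relaxing the filters silently.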

3. Library Sizing & Coverage

Library size and screening coverage determine the statistical power of your CRISPR screen. Insufficient coverage leads to high guide-level noise and missed hits.

Screen Type | Guides/Gene | Total Library Size | Cell Coverage | Cells Needed
Genome-wide KO | 4 | ~80K | 500x per guide | 40M cells
Genome-wide CRISPRa/i | 5-10 | ~100-200K | 500x per guide | 50-100M cells
Focused sublibrary | 6-10 | 1K-10K | 1000x per guide | 1-10M cells
Tiling screen | All available | 5K-50K | 500x per guide | 2.5-25M cells

Use our Coverage Calculator to determine the minimum cell number, sequencing depth, and replicate count for your screen design. The tool accounts for library complexity, infection efficiency (MOI), and desired statistical power.
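
The "Cells Needed" values in the table are library size multiplied by coverage (for example 80,000 guides × 500 = 40M transduced cells). The number of cells to plate at transduction is larger, because only a fraction are infected at low MOI. The sketch below is a back-of-the-envelope estimate, not the Coverage Calculator's method; it assumes integrations follow a Poisson distribution and uses hypothetical function names.

```python
import math


def cells_at_coverage(n_guides: int, coverage: int) -> int:
    """Transduced cells required to maintain the given per-guide coverage."""
    return n_guides * coverage


def fraction_infected(moi: float) -> float:
    """Poisson estimate of the fraction of cells with at least one integration."""
    return 1.0 - math.exp(-moi)


def fraction_single_among_infected(moi: float) -> float:
    """Poisson estimate of infected cells carrying exactly one integration."""
    return moi * math.exp(-moi) / fraction_infected(moi)


def cells_to_transduce(n_guides: int, coverage: int, moi: float) -> int:
    """Cells to plate so that enough infected cells remain after selection."""
    return math.ceil(cells_at_coverage(n_guides, coverage) / fraction_infected(moi))
```

Under this model an MOI of 0.3 infects roughly 26% of cells, so an 80K-guide library at 500x needs on the order of 150M cells at transduction.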

4. Oligo Architecture & Cloning

Each oligo in a CRISPR library encodes the sgRNA spacer sequence flanked by elements required for amplification, cloning, and expression. The exact architecture depends on your vector system.

Typical SpCas9 Library Oligo (lentiGuide-Puro)

5'-[FWD_PRIMER]-[BsmBI_site]-[20nt_SPACER]-[BsmBI_site]-[REV_PRIMER]-3'

Element | Length | Function
Forward primer | 18-22 nt | PCR amplification handle
BsmBI site + overhang | 10 nt | Golden Gate cloning into vector
Spacer sequence | 20 nt | sgRNA targeting sequence (variable)
BsmBI site + overhang | 10 nt | Golden Gate cloning into vector
Reverse primer | 18-22 nt | PCR amplification handle
Total oligo | 76-84 nt | Well within array synthesis limits

Verify your full oligo sequences (spacer + constant regions) through our Batch Sequence QC tool before ordering synthesis. This checks for unintended restriction sites that would interfere with cloning, as well as secondary structures that could impair PCR amplification of the pool.
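
One check worth automating is whether a spacer introduces an extra BsmBI site (recognition sequence CGTCTC, or GAGACG on the opposite strand), either internally or at a junction with the constant regions. The sketch below is illustrative: the primer and flank sequences are made-up placeholders, not the validated sequences for lentiGuide-Puro or any other vector, and should be replaced with those from your cloning protocol.

```python
import re

# Zero-width lookahead so overlapping sites on either strand are all counted.
_BSMBI = re.compile(r"(?=CGTCTC|GAGACG)")

# Placeholder constant regions (hypothetical, for illustration only).
FWD_PRIMER = "ACTGACTGACTGACTGACTG"
LEFT_SITE = "CGTCTCACACC"   # one BsmBI site plus placeholder overhang
RIGHT_SITE = "GTTTCGAGACG"  # placeholder overhang plus one reverse-orientation site
REV_PRIMER = "TCAGTCAGTCAGTCAGTCAG"


def count_bsmbi_sites(seq: str) -> int:
    return len(_BSMBI.findall(seq.upper()))


def build_oligo(spacer: str) -> str:
    """Assemble the full pool oligo around one spacer."""
    return FWD_PRIMER + LEFT_SITE + spacer.upper() + RIGHT_SITE + REV_PRIMER


def has_extra_bsmbi_site(spacer: str) -> bool:
    """True if the assembled oligo has more than the two intended sites."""
    return count_bsmbi_sites(build_oligo(spacer)) != 2
```

Counting sites on the assembled oligo rather than on the spacer alone catches sites created at the flank junctions, such as a spacer beginning with GTCTC directly after a flank ending in C.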

5. Sequence QC for CRISPR Libraries

Pre-synthesis sequence screening prevents costly design errors. Screen all oligo sequences (full-length, including constant regions) for the following issues:

TTTT Motif Check

The most critical filter: a spacer containing TTTT risks premature Pol III termination, yielding a truncated, non-functional sgRNA. Treat this as a hard filter with no exceptions.

GC Content Analysis

Spacer GC 40-70%. Extreme GC causes synthesis bias (dropout) and reduced Cas9 activity. Use batch mode for genome-wide libraries.

Secondary Structures

Strong hairpins in the spacer (ΔG < -3 kcal/mol) can block Cas9 loading. Check the full oligo for structures that impair PCR amplification.

Tm Uniformity

For PCR amplification handles, ensure consistent Tm (58-62°C). The spacer region contributes to overall oligo Tm — check for outliers.

6. Post-Synthesis QC & Screen Analysis

QC Stage | Metric | Target | If Failed
Plasmid Library | Representation (NGS) | >90% guides detected | Sub-clone at higher ratio
Plasmid Library | Uniformity (Gini) | <0.25 | Re-amplify with fewer cycles
Plasmid Library | Dropout rate | <10% | Redesign/resynthesize dropped guides
Viral Production | Titer (TU/mL) | >10^7 for lenti | Concentrate by ultracentrifugation
Transduction | MOI | 0.3-0.5 (30-50% infection) | Titrate virus on target cells
Screen Coverage | Cells per guide | ≥500x | Scale up cell culture
Screen Results | Replicate correlation | Pearson r >0.7 | Add technical replicates

Use our Uniformity Estimator to predict expected representation from your sequencing depth, and our Coverage Calculator to determine minimum sequencing depth for adequate power.
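
The plasmid library metrics in the table can be computed directly from per-guide read counts. This is a minimal sketch with hypothetical function names. It computes the Gini coefficient on raw counts, which is an assumption: some pipelines compute it on log-transformed counts, so values may not be directly comparable with the <0.25 target.

```python
def gini(counts) -> float:
    """Gini coefficient of read counts (0 = perfectly even, near 1 = highly skewed)."""
    xs = sorted(counts)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one guide with a non-zero count")
    # Standard rank-weighted formula on the sorted counts.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


def fraction_detected(counts, min_reads: int = 1) -> float:
    """Fraction of guides with at least `min_reads` reads."""
    counts = list(counts)
    return sum(c >= min_reads for c in counts) / len(counts)


def dropout_rate(counts, min_reads: int = 1) -> float:
    """Fraction of guides below the detection threshold."""
    return 1.0 - fraction_detected(counts, min_reads)
```

A perfectly uniform library gives a Gini of 0, while a library in which one of four guides takes every read gives 0.75.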

Frequently Asked Questions

How many sgRNAs per gene do I need for a CRISPR screen?
For genome-wide knockout screens, use at least 4-6 sgRNAs per gene. For focused/sublibrary screens, 6-10 per gene provides better statistical power. Top-performing libraries (Brunello, TKOv3) use 4 guides per gene for ~19,000 genes. More guides per gene reduce false negatives but increase library size and sequencing cost. For CRISPRa/CRISPRi screens, use 5-10 guides per TSS, as activity is more position-dependent.
What is the difference between CRISPRko, CRISPRa, and CRISPRi libraries?
CRISPRko (knockout) uses active Cas9 to create indels that disrupt gene function — guides target coding exons. CRISPRa (activation) uses dCas9 fused to transcriptional activators (VP64, p65, Rta) — guides target promoter regions within -200 to +1 bp of the TSS. CRISPRi (interference) uses dCas9 fused to KRAB repressor — guides target -50 to +300 bp of the TSS. Each requires different guide design rules and scoring algorithms.
Why should I avoid TTTT sequences in sgRNAs?
The TTTT motif (four consecutive thymidines in the DNA template) acts as a transcription termination signal for RNA Polymerase III, which drives sgRNA expression from U6 or H1 promoters. This causes premature termination of the sgRNA transcript, producing truncated, non-functional guides. Always filter out spacer sequences containing TTTT during library design.
What Cas systems can I use for CRISPR screens?
SpCas9 (PAM: NGG, spacer: 20 nt) is the most widely used, with the most validated libraries. SaCas9 (PAM: NNGRRT, spacer: 21-23 nt) is smaller and useful for AAV delivery. Cas12a/Cpf1 (PAM: TTTV, spacer: 23 nt) generates staggered cuts and enables multiplex guide expression from a single transcript. AsCas12a and LbCas12a are the most commonly used Cas12a orthologs, and enAsCas12a extends the range of usable PAMs. Choose based on your delivery method and PAM availability in target regions.
How do I analyze CRISPR screen results?
Common analysis pipelines: (1) MAGeCK — tests for gene-level enrichment/depletion using negative binomial models; (2) BAGEL2 — Bayesian classifier using pre-defined reference gene sets for essentiality scoring; (3) JACKS — accounts for guide-level efficacy variation. Start with MAGeCK for most applications. Quality metrics: library coverage (≥500x per guide), guide-level correlation between replicates (Pearson r >0.7), and clear separation of positive/negative control genes.
What controls should I include in my CRISPR library?
Essential controls: (1) Non-targeting guides: 500-1000 random sequences with no genomic match, for the null distribution; (2) Safe-harbor targeting guides: 100-200 guides targeting safe-harbor loci (AAVS1, ROSA26) or gene deserts, for cut-site control; (3) Positive controls: 50-100 guides targeting known essential genes (CEG2 set) or drug targets, for screen validation. Controls typically comprise 5-10% of a focused library; at the counts above they amount to roughly 1-2% of an ~80K genome-wide library.
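
For the analysis question above, it can help to see the guide-level quantity that the listed pipelines build on. The sketch below computes depth-normalized log2 fold changes and a per-gene median. It is a simplified illustration, not MAGeCK, BAGEL2, or JACKS, and performs no statistical testing; the function names and the pseudocount of 1 read per million are assumptions.

```python
import math
from collections import defaultdict
from statistics import median


def to_rpm(counts: dict) -> dict:
    """Normalize raw read counts to reads per million."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {guide: c * 1e6 / total for guide, c in counts.items()}


def guide_log2fc(before: dict, after: dict, pseudocount: float = 1.0) -> dict:
    """log2(after / before) per guide, on depth-normalized counts."""
    b, a = to_rpm(before), to_rpm(after)
    return {
        guide: math.log2((a.get(guide, 0.0) + pseudocount) / (b[guide] + pseudocount))
        for guide in b
    }


def gene_median_lfc(lfc: dict, guide_to_gene: dict) -> dict:
    """Summarize each gene by the median log2 fold change of its guides."""
    per_gene = defaultdict(list)
    for guide, value in lfc.items():
        per_gene[guide_to_gene[guide]].append(value)
    return {gene: median(values) for gene, values in per_gene.items()}
```

Non-targeting control guides should centre near zero in this summary, and essential-gene positive controls should be clearly depleted in a dropout screen.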
