API Reference#

Complete API documentation for PhaseLab.

Core Module#

PhaseLab Coherence: IR coherence metrics for simulation reliability.

The core metrics from Informational Relativity:
  • R̄ (R-bar): Order parameter / coherence score
  • V_φ (V-phi): Phase variance
  • Relationship: R̄ = exp(-V_φ/2)

These metrics assess whether a quantum or dynamical simulation is reliable.

phaselab.core.coherence.phase_variance(phases)[source]#

Compute circular phase variance V_φ from an array of phases.

V_φ = -2 * ln(R̄) where R̄ is the Kuramoto order parameter.

Args:

phases: Array of phase angles in radians.

Returns:

Phase variance V_φ (non-negative).
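
The relationship between R̄ and V_φ can be sketched in plain Python. This is a minimal illustration, not PhaseLab's implementation: the helper names `kuramoto_order_parameter` and `phase_variance_sketch` are hypothetical, and returning infinity when R̄ is exactly zero is an assumption made here to avoid taking the logarithm of zero.

```python
import cmath
import math


def kuramoto_order_parameter(phases):
    """Magnitude of the mean unit phasor, R_bar = |mean(exp(i * theta))|."""
    if not phases:
        raise ValueError("phases must be non-empty")
    mean_phasor = sum(cmath.exp(1j * theta) for theta in phases) / len(phases)
    # Clamp to 1.0 so floating-point round-off cannot produce a negative variance.
    return min(abs(mean_phasor), 1.0)


def phase_variance_sketch(phases):
    """V_phi = -2 * ln(R_bar); infinite if the phasors cancel exactly (assumption)."""
    r_bar = kuramoto_order_parameter(phases)
    if r_bar == 0.0:
        return math.inf
    return -2.0 * math.log(r_bar)
```

Identical phases give R̄ = 1 and V_φ = 0, while phases spread evenly around the circle drive R̄ toward 0 and V_φ toward large values.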

phaselab.core.coherence.coherence_score(data, mode='auto')[source]#

Compute the coherence score R̄ from various input types.

Args:
data: Can be:
  • Array of phases → compute Kuramoto R̄

  • Array of expectation values → compute consistency R̄

  • Single V_φ value → compute R̄ = exp(-V_φ/2)

  • Statevector → compute from amplitudes

mode: “phases”, “expectations”, “variance”, or “auto”

Returns:

Coherence score R̄ in [0, 1].

phaselab.core.coherence.go_no_go(R_bar, threshold=np.float64(0.1353352832366127))[source]#

Determine GO/NO-GO classification based on coherence.

The e^-2 threshold is the fundamental boundary from IR theory. Below this, simulations are considered unreliable.

Args:

R_bar: Coherence score.
threshold: GO/NO-GO boundary (default: e^-2 ≈ 0.135).

Returns:

“GO” if R̄ > threshold, else “NO-GO”.
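
The threshold rule is simple enough to sketch directly. The code below is an illustration under stated assumptions, not the library source: `go_no_go_sketch` and `variance_at_threshold` are hypothetical names, and the only facts taken from the text are the default value e^-2 and the strict `>` comparison.

```python
import math

E_MINUS_2 = math.exp(-2)  # approximately 0.1353352832366127, the documented default


def go_no_go_sketch(r_bar, threshold=E_MINUS_2):
    """Return "GO" when r_bar is strictly above the threshold, else "NO-GO"."""
    return "GO" if r_bar > threshold else "NO-GO"


def variance_at_threshold(threshold=E_MINUS_2):
    """Invert R_bar = exp(-V_phi / 2) to get the phase variance at the boundary."""
    return -2.0 * math.log(threshold)
```

Because R̄ = exp(-V_φ/2), the default boundary R̄ = e^-2 corresponds to a phase variance of V_φ = 4.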

phaselab.core.coherence.classify_coherence(R_bar)[source]#

Classify coherence level into human-readable categories.

Args:

R_bar: Coherence score.

Returns:

Classification string.

phaselab.core.coherence.compare_sim_hardware(sim_R, hw_R, tolerance=0.05)[source]#

Compare simulator and hardware coherence values.

Args:

sim_R: Simulator coherence.
hw_R: Hardware coherence.
tolerance: Acceptable difference for “EXCELLENT” agreement.

Returns:

(difference, agreement_level)

phaselab.core.coherence.ensemble_coherence(coherence_values, weights=None)[source]#

Compute ensemble coherence from multiple measurements.

Args:

coherence_values: List of R̄ values.
weights: Optional weights for weighted average.

Returns:

Ensemble coherence score.
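
A weighted average of this kind could look like the sketch below. It assumes the ensemble score is a plain weighted arithmetic mean of the R̄ values, which the parameter description suggests but does not confirm; `ensemble_coherence_sketch` is a hypothetical name.

```python
def ensemble_coherence_sketch(coherence_values, weights=None):
    """Weighted arithmetic mean of R_bar values (assumed aggregation rule)."""
    values = list(coherence_values)
    if not values:
        raise ValueError("coherence_values must be non-empty")
    # Default to equal weights when none are supplied.
    weights = [1.0] * len(values) if weights is None else list(weights)
    if len(weights) != len(values):
        raise ValueError("weights must match coherence_values in length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(w * v for w, v in zip(weights, values)) / total
```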

PhaseLab Constants: Universal constants from the Informational Relativity framework.

CRISPR Module#

PhaseLab CRISPR: Comprehensive guide RNA design pipeline with IR coherence validation.

Provides:
  • CRISPRa guide design for transcriptional activation
  • CRISPRi guide design for transcriptional interference/repression
  • CRISPR knockout guide design for gene disruption
  • Prime editing pegRNA design for precise edits
  • Base editing guide design (ABE/CBE) for single-nucleotide changes
  • PAM site scanning (NGG, NNGRRT, etc.)
  • Off-target scoring (MIT, CFD algorithms)
  • Thermodynamic binding energy (SantaLucia)
  • Chromatin accessibility modeling
  • IR coherence-based reliability scoring

NEW in v0.9.3 - CRISPRa Binding Register Model:
  • NucleaseRole: Explicit BINDING vs CUTTING mode
  • Relaxed PAM patterns for dCas9 binding (e.g., SaCas9 NNGRRN)
  • Sliding binding register (±2bp) for GC-dense promoters
  • Configurable guide length for literature reproduction
  • Validated against Chang et al. 2022 sg2 winner

NEW in v0.9.2 - Guide Enumeration & Policy System:
  • Region declaration with multi-TSS support
  • PAM scanning for SpCas9, SaCas9, Cas12a
  • Policy-based dominance ranking
  • Benchmark mode for validation against published guides
  • Reproducibility manifests

NEW in v0.7.0 - Enhanced Pipeline:
  • Full Virtual Assay Stack integration
  • Biological context from ENCODE (ATAC-seq, methylation, histones)
  • ML efficiency predictions (DeepCRISPR, DeepSpCas9 adapters)
  • Evidence fusion with uncertainty quantification
  • Context-aware guide ranking

phaselab.crispr.design_guides(sequence, tss_index, config=None, dnase_peaks=None, verbose=False)[source]#

Design and rank guide RNAs for CRISPRa/CRISPRi.

This is the main entry point for the CRISPR pipeline. It:
  1. Scans for PAM sites
  2. Filters candidates by window and quality
  3. Computes multi-layer scores
  4. Optionally runs IR coherence simulation
  5. Returns ranked candidates

Args:

sequence: Promoter DNA sequence (5’->3’).
tss_index: Position of TSS in sequence (0-based).
config: GuideDesignConfig with parameters.
dnase_peaks: Optional list of (start, end) DNase HS sites.
verbose: Print progress messages.

Returns:

DataFrame with ranked guide candidates and scores.

Example:
>>> from phaselab.crispr import design_guides
>>> guides = design_guides(
...     sequence=rai1_promoter,
...     tss_index=500,
... )
>>> print(guides[['sequence', 'position', 'combined_score']].head())
class phaselab.crispr.GuideDesignConfig(pam='NGG', guide_length=20, crispr_window=(-400, -50), min_gc=0.4, max_gc=0.7, max_homopolymer=4, min_complexity=0.5, filter_poly_t=True, filter_repeats=True, poly_t_threshold=4, weight_mit=1.0, weight_cfd=1.0, weight_gc=0.5, weight_chromatin=0.8, weight_delta_g=0.3, compute_guide_coherence=False, weight_coherence=0.0, compute_coherence=False, coherence_shots=2000, hardware_backend=None, top_n=10)[source]#

Bases: object

Configuration for guide RNA design pipeline.

Attributes:
coherence_shots: int = 2000#
compute_coherence: bool = False#
compute_guide_coherence: bool = False#
crispr_window: Tuple[int, int] = (-400, -50)#
filter_poly_t: bool = True#
filter_repeats: bool = True#
guide_length: int = 20#
hardware_backend: str | None = None#
max_gc: float = 0.7#
max_homopolymer: int = 4#
min_complexity: float = 0.5#
min_gc: float = 0.4#
pam: str = 'NGG'#
poly_t_threshold: int = 4#
top_n: int = 10#
weight_cfd: float = 1.0#
weight_chromatin: float = 0.8#
weight_coherence: float = 0.0#
weight_delta_g: float = 0.3#
weight_gc: float = 0.5#
weight_mit: float = 1.0#
phaselab.crispr.validate_guide(guide_seq, compute_deprecated_coherence=False)[source]#

Quick validation of a single guide sequence.

Includes U6/Pol III compatibility and repeat region checks (v0.9.1+).

NOTE (v1.0.0): Guide-sequence coherence is now DEPRECATED. Use phaselab.spatial for region-based spatial coherence instead. The coherence_R field is only computed if compute_deprecated_coherence=True.

Args:

guide_seq: Guide sequence to validate.
compute_deprecated_coherence: If True, compute guide-sequence coherence (DEPRECATED - does not predict outcomes).

Returns:

Dictionary with validation results.

phaselab.crispr.design_crispri_guides(sequence, tss_index, config=None, dnase_peaks=None, verbose=False)[source]#

Design guide RNAs for CRISPRi transcriptional repression.

Optimized for dCas9-KRAB mediated gene silencing with guides targeting the TSS-proximal region.

Args:

sequence: Promoter/gene DNA sequence.
tss_index: Position of TSS in sequence (0-based).
config: CRISPRiConfig with parameters.
dnase_peaks: Optional DNase hypersensitive sites.
verbose: Print progress messages.

Returns:

DataFrame with ranked CRISPRi guide candidates.

Example:
>>> from phaselab.crispr import design_crispri_guides
>>> guides = design_crispri_guides(
...     sequence=promoter_seq,
...     tss_index=500,
... )
>>> print(guides[['sequence', 'position', 'repression_efficiency']].head())
class phaselab.crispr.CRISPRiConfig(pam='NGG', guide_length=20, crispri_window=(-50, 300), min_gc=0.35, max_gc=0.75, max_homopolymer=4, min_complexity=0.5, min_repression_efficiency=0.3, weight_mit=1.0, weight_cfd=1.0, weight_repression=1.5, weight_position=1.2, weight_chromatin=0.8, weight_delta_g=0.3, weight_coherence=1.0, compute_coherence=True, coherence_shots=2000, repressor='KRAB', top_n=10)[source]#

Bases: object

Configuration for CRISPRi guide design.

coherence_shots: int = 2000#
compute_coherence: bool = True#
crispri_window: Tuple[int, int] = (-50, 300)#
guide_length: int = 20#
max_gc: float = 0.75#
max_homopolymer: int = 4#
min_complexity: float = 0.5#
min_gc: float = 0.35#
min_repression_efficiency: float = 0.3#
pam: str = 'NGG'#
repressor: str = 'KRAB'#
top_n: int = 10#
weight_cfd: float = 1.0#
weight_chromatin: float = 0.8#
weight_coherence: float = 1.0#
weight_delta_g: float = 0.3#
weight_mit: float = 1.0#
weight_position: float = 1.2#
weight_repression: float = 1.5#
phaselab.crispr.validate_crispri_guide(guide_seq, position, strand='+', repressor='KRAB')[source]#

Validate a single guide for CRISPRi application.

Args:

guide_seq: Guide sequence to validate.
position: Position relative to TSS.
strand: Strand orientation.
repressor: Repressor domain type.

Returns:

Validation results dictionary.

phaselab.crispr.repression_efficiency_score(guide_seq, position, strand, repressor='KRAB')[source]#

Predict repression efficiency for CRISPRi.

Based on position relative to TSS and strand orientation.

Args:

guide_seq: Guide sequence.
position: Position relative to TSS.
strand: “+” (template) or “-” (non-template).
repressor: Repressor domain type.

Returns:

Repression efficiency score (0.0 to 1.0).

phaselab.crispr.steric_hindrance_score(position, strand)[source]#

Calculate steric hindrance potential.

Guides that block RNA polymerase progression are most effective.

Args:

position: Position relative to TSS.
strand: Strand orientation.

Returns:

Steric hindrance score (0.0 to 1.0).

phaselab.crispr.design_knockout_guides(sequence, cds_start, config=None, exon_boundaries=None, verbose=False)[source]#

Design guide RNAs for CRISPR knockout.

This pipeline is optimized for gene disruption through frameshift mutations caused by NHEJ repair of Cas9-induced DSBs.

Args:

sequence: Gene/exon DNA sequence.
cds_start: Position of CDS start (ATG) in sequence.
config: KnockoutConfig with parameters.
exon_boundaries: Optional list of (start, end) for exon positions.
verbose: Print progress messages.

Returns:

DataFrame with ranked knockout guide candidates.

Example:
>>> from phaselab.crispr import design_knockout_guides
>>> guides = design_knockout_guides(
...     sequence=gene_sequence,
...     cds_start=200,
... )
>>> print(guides[['sequence', 'cut_efficiency', 'frameshift_prob']].head())
class phaselab.crispr.KnockoutConfig(pam='NGG', guide_length=20, target_window=(0, 500), min_gc=0.35, max_gc=0.75, max_homopolymer=4, min_complexity=0.5, min_cut_efficiency=0.3, weight_mit=1.0, weight_cfd=1.0, weight_cut_efficiency=1.5, weight_delta_g=0.3, weight_coherence=1.0, compute_coherence=True, coherence_shots=2000, top_n=10)[source]#

Bases: object

Configuration for CRISPR knockout guide design.

coherence_shots: int = 2000#
compute_coherence: bool = True#
guide_length: int = 20#
max_gc: float = 0.75#
max_homopolymer: int = 4#
min_complexity: float = 0.5#
min_cut_efficiency: float = 0.3#
min_gc: float = 0.35#
pam: str = 'NGG'#
target_window: Tuple[int, int] = (0, 500)#
top_n: int = 10#
weight_cfd: float = 1.0#
weight_coherence: float = 1.0#
weight_cut_efficiency: float = 1.5#
weight_delta_g: float = 0.3#
weight_mit: float = 1.0#
phaselab.crispr.validate_knockout_guide(guide_seq)[source]#

Validate a single guide for knockout application.

Args:

guide_seq: Guide sequence to validate.

Returns:

Validation results dictionary.

phaselab.crispr.cut_efficiency_score(guide_seq)[source]#

Predict cutting efficiency using Rule Set 2-like model.

Based on Doench et al. 2016 on-target scoring. Higher score = more efficient cutting.

Args:

guide_seq: 20bp guide sequence.

Returns:

Cut efficiency score (0.0 to 1.0).

phaselab.crispr.frameshift_probability(guide_position, exon_length, cds_position)[source]#

Estimate probability of frameshift from indel at cut site.

Cuts earlier in CDS have higher chance of causing functional knockouts.

Args:

guide_position: Position of guide in genomic coordinates.
exon_length: Length of the exon.
cds_position: Position within the coding sequence (0-based).

Returns:

Frameshift probability estimate (0.0 to 1.0).

phaselab.crispr.repair_pathway_prediction(guide_seq, local_sequence=None)[source]#

Predict repair pathway preference (NHEJ vs HDR).

For knockout, NHEJ is preferred (causes indels).

Args:

guide_seq: Guide RNA sequence.
local_sequence: Sequence context around cut site.

Returns:

Dict with pathway probabilities.

phaselab.crispr.design_prime_edit(sequence, edit_position, edit_from, edit_to, config=None, verbose=False)[source]#

Design pegRNAs for prime editing.

This designs complete prime editing guide RNAs with optimized PBS and RT template components.

Args:

sequence: DNA sequence containing the edit site.
edit_position: Position of the edit in sequence (0-based).
edit_from: Original sequence to replace.
edit_to: New sequence (the edit).
config: PrimeEditConfig with parameters.
verbose: Print progress messages.

Returns:

DataFrame with ranked pegRNA candidates.

Example:
>>> from phaselab.crispr import design_prime_edit
>>> pegrnas = design_prime_edit(
...     sequence=gene_region,
...     edit_position=150,
...     edit_from="A",
...     edit_to="G",  # A-to-G substitution
... )
>>> print(pegrnas[['spacer', 'pbs_length', 'rt_length', 'score']].head())
class phaselab.crispr.PrimeEditConfig(pam='NGG', guide_length=20, pbs_length_min=8, pbs_length_max=17, pbs_optimal_length=(13, 15), pbs_gc_min=0.35, pbs_gc_max=0.65, rt_length_min=7, rt_length_max=34, rt_optimal_length=(10, 16), edit_type='substitution', min_gc=0.35, max_gc=0.75, max_homopolymer=4, check_secondary_structure=True, max_secondary_structure_dg=-5.0, weight_mit=1.0, weight_cfd=1.0, weight_pbs_score=1.2, weight_rt_score=1.2, weight_nick_distance=0.8, weight_coherence=1.0, compute_coherence=True, coherence_shots=2000, top_n=10)[source]#

Bases: object

Configuration for prime editing guide design.

check_secondary_structure: bool = True#
coherence_shots: int = 2000#
compute_coherence: bool = True#
edit_type: str = 'substitution'#
guide_length: int = 20#
max_gc: float = 0.75#
max_homopolymer: int = 4#
max_secondary_structure_dg: float = -5.0#
min_gc: float = 0.35#
pam: str = 'NGG'#
pbs_gc_max: float = 0.65#
pbs_gc_min: float = 0.35#
pbs_length_max: int = 17#
pbs_length_min: int = 8#
pbs_optimal_length: Tuple[int, int] = (13, 15)#
rt_length_max: int = 34#
rt_length_min: int = 7#
rt_optimal_length: Tuple[int, int] = (10, 16)#
top_n: int = 10#
weight_cfd: float = 1.0#
weight_coherence: float = 1.0#
weight_mit: float = 1.0#
weight_nick_distance: float = 0.8#
weight_pbs_score: float = 1.2#
weight_rt_score: float = 1.2#
phaselab.crispr.validate_prime_edit(spacer, pbs, rt_template, edit_type='substitution')[source]#

Validate a pegRNA design.

Args:

spacer: 20bp spacer sequence.
pbs: PBS sequence.
rt_template: RT template sequence.
edit_type: Type of edit.

Returns:

Validation results dictionary.

phaselab.crispr.design_pbs(target_sequence, nick_position, pbs_lengths=None)[source]#

Design Primer Binding Sites for pegRNA.

PBS binds to the nicked DNA strand to initiate RT-mediated synthesis.

Args:

target_sequence: DNA sequence around target site.
nick_position: Position where Cas9 nicks (3bp upstream of PAM).
pbs_lengths: List of PBS lengths to try.

Returns:

List of PBS designs with scores.

phaselab.crispr.design_rt_template(target_sequence, nick_position, edit_position, edit_from, edit_to, rt_lengths=None)[source]#

Design RT templates for pegRNA.

RT template encodes the desired edit and homology arms.

Args:

target_sequence: DNA sequence around target site.
nick_position: Position where Cas9 nicks.
edit_position: Position of the desired edit.
edit_from: Original sequence at edit site.
edit_to: Desired new sequence.
rt_lengths: List of RT template lengths to try.

Returns:

List of RT template designs with scores.

phaselab.crispr.pbs_score(pbs_seq)[source]#

Score a PBS sequence.

Args:

pbs_seq: PBS sequence.

Returns:

Score (0 to ~1.5).

phaselab.crispr.rt_template_score(rt_seq)[source]#

Score an RT template sequence.

Args:

rt_seq: RT template sequence.

Returns:

Score (0 to ~1.5).

phaselab.crispr.reverse_complement(seq)[source]#

Return reverse complement of DNA sequence.
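
For reference, a reverse complement can be computed with a translation table. This is a generic sketch, not the library's code; `reverse_complement_sketch` is a hypothetical name, and passing non-ACGTN characters through unchanged is an assumption.

```python
# Map each base to its complement; N stays N. Lowercase is handled too.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement_sketch(seq):
    """Complement every base, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]
```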

phaselab.crispr.estimate_hairpin_dg(seq)[source]#

Estimate secondary structure ΔG (simplified).

More negative = stronger secondary structure = worse.

Returns:

Estimated ΔG in kcal/mol.

phaselab.crispr.design_base_edit_guides(sequence, target_position, target_base=None, config=None, verbose=False)[source]#

Design guides for base editing at a specific position.

Finds guides that place the target base within the editor’s activity window and ranks by predicted efficiency.

Args:

sequence: DNA sequence containing target site.
target_position: Position of base to edit (0-based in sequence).
target_base: Target base (“A” for ABE, “C” for CBE). Auto-detected from editor if not specified.
config: BaseEditConfig with parameters.
verbose: Print progress messages.

Returns:

DataFrame with ranked base editing guide candidates.

Example:
>>> from phaselab.crispr import BaseEditConfig, design_base_edit_guides
>>> guides = design_base_edit_guides(
...     sequence=gene_region,
...     target_position=100,  # Position of A to edit
...     config=BaseEditConfig(editor="ABE8e"),
... )
>>> print(guides[['sequence', 'target_in_window_pos', 'efficiency']].head())
class phaselab.crispr.BaseEditConfig(editor='ABE8e', pam='NGG', guide_length=20, activity_window=None, target_base='A', min_gc=0.35, max_gc=0.7, max_homopolymer=4, check_bystanders=True, max_bystanders_in_window=2, weight_mit=1.0, weight_cfd=1.0, weight_position=1.5, weight_context=1.0, weight_bystander=-0.5, weight_coherence=1.0, compute_coherence=True, coherence_shots=2000, top_n=10)[source]#

Bases: object

Configuration for base editing guide design.

Attributes:
activity_window: Tuple[int, int] | None = None#
check_bystanders: bool = True#
coherence_shots: int = 2000#
compute_coherence: bool = True#
editor: str = 'ABE8e'#
guide_length: int = 20#
max_bystanders_in_window: int = 2#
max_gc: float = 0.7#
max_homopolymer: int = 4#
min_gc: float = 0.35#
pam: str = 'NGG'#
target_base: str = 'A'#
top_n: int = 10#
weight_bystander: float = -0.5#
weight_cfd: float = 1.0#
weight_coherence: float = 1.0#
weight_context: float = 1.0#
weight_mit: float = 1.0#
weight_position: float = 1.5#
phaselab.crispr.validate_base_edit(guide_seq, target_position, editor='ABE8e')[source]#

Validate a guide for base editing.

Args:

guide_seq: Guide sequence.
target_position: Position of target base in guide (1-indexed).
editor: Base editor type.

Returns:

Validation results dictionary.

phaselab.crispr.design_abe_guides(sequence, target_position, **kwargs)[source]#

Design ABE (A→G) guides. Shortcut for design_base_edit_guides with ABE8e.

phaselab.crispr.design_cbe_guides(sequence, target_position, **kwargs)[source]#

Design CBE (C→T) guides. Shortcut for design_base_edit_guides with BE4.

phaselab.crispr.editing_efficiency_at_position(position, editor='ABE8e')[source]#

Get relative editing efficiency at a position.

Args:

position: Position in guide (1-indexed from PAM-distal).
editor: Base editor type.

Returns:

Relative efficiency (0.0 to 1.0).

phaselab.crispr.find_bystanders(guide_seq, target_position, editor='ABE8e')[source]#

Find bystander editable bases in the activity window.

Args:

guide_seq: Guide sequence.
target_position: Position of intended edit.
editor: Base editor type.

Returns:

List of bystander positions and predicted efficiencies.

phaselab.crispr.get_activity_window(editor)[source]#

Get activity window for a base editor.

phaselab.crispr.find_pam_sites(sequence, pam='NGG', guide_length=20, both_strands=True)[source]#

Find all PAM sites in a sequence and extract guide sequences.

For SpCas9 (NGG): Guide is 20bp upstream of PAM. For Cas12a (TTTV): Guide is downstream of PAM.

Args:

sequence: DNA sequence to scan (5’->3’).
pam: PAM pattern name (e.g., “NGG”, “NNGRRT”) or regex.
guide_length: Length of guide/protospacer (default 20).
both_strands: Scan both forward and reverse strands.

Returns:

List of PAMHit objects.
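
The forward-strand case for a 3'-PAM nuclease such as SpCas9 can be sketched with a regular expression. The code below is an illustration under several assumptions: it returns plain tuples rather than PAMHit objects, it scans only the forward strand, it does not handle 5'-PAM nucleases such as Cas12a, and `find_pam_sites_sketch` and the partial IUPAC table are hypothetical.

```python
import re

# Partial IUPAC table (assumption: enough for NGG, NNGRRT and similar patterns).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "[ACGT]", "R": "[AG]", "Y": "[CT]", "V": "[ACG]",
}


def pam_to_regex(pam):
    """Translate an IUPAC PAM such as "NGG" into a regex fragment."""
    return "".join(IUPAC[base] for base in pam.upper())


def find_pam_sites_sketch(sequence, pam="NGG", guide_length=20):
    """Return (guide_start, guide, pam) tuples for forward-strand 3'-PAM hits."""
    seq = sequence.upper()
    # A lookahead lets overlapping PAMs (e.g. in "GGGG") all be reported.
    pattern = re.compile("(?=(" + pam_to_regex(pam) + "))")
    hits = []
    for match in pattern.finditer(seq):
        pam_start = match.start()
        guide_start = pam_start - guide_length
        if guide_start < 0:
            continue  # not enough upstream sequence for a full guide
        hits.append((guide_start, seq[guide_start:pam_start], match.group(1)))
    return hits
```

Scanning the reverse strand would amount to running the same function on the reverse complement and mapping coordinates back.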

phaselab.crispr.gc_content(sequence)[source]#

Calculate GC content of a sequence.

Args:

sequence: DNA/RNA sequence.

Returns:

GC fraction (0.0 to 1.0).
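
A GC fraction of this kind reduces to a character count. The sketch below is generic rather than the library's code; `gc_content_sketch` is a hypothetical name, and returning 0.0 for an empty sequence is an assumption.

```python
def gc_content_sketch(sequence):
    """Fraction of G and C bases in the sequence, case-insensitive."""
    seq = sequence.upper()
    if not seq:
        return 0.0  # assumed convention for empty input
    return (seq.count("G") + seq.count("C")) / len(seq)
```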

phaselab.crispr.delta_g_santalucia(sequence, temperature=37.0, na_conc=0.1)[source]#

Calculate ΔG of hybridization using SantaLucia nearest-neighbor model.

Args:

sequence: DNA/RNA sequence (assumes binding to perfect complement).
temperature: Temperature in Celsius.
na_conc: Na+ concentration in M.

Returns:

ΔG in kcal/mol (negative = favorable binding).

phaselab.crispr.mit_specificity_score(guide_seq, off_target_count=0, avg_mismatches=4.0)[source]#

Calculate MIT specificity score (simplified).

The full MIT algorithm requires genome-wide alignment. This provides an estimate based on guide sequence properties.

Higher score = more specific (fewer predicted off-targets).

Args:

guide_seq: 20bp guide sequence.
off_target_count: Number of known off-targets (if available).
avg_mismatches: Average mismatches to off-targets.

Returns:

MIT specificity score (0-100).

phaselab.crispr.cfd_score(guide_seq, target_seq=None)[source]#

Calculate CFD (Cutting Frequency Determination) score.

CFD predicts how likely an off-target site will be cut. For on-target (no mismatches), returns 100.

Args:

guide_seq: 20bp guide sequence.
target_seq: Target sequence (if different from perfect match).

Returns:

CFD score (0-100, higher = more cutting).

phaselab.crispr.max_homopolymer_run(sequence)[source]#

Find the longest homopolymer run in a sequence.

Args:

sequence: DNA/RNA sequence.

Returns:

Length of longest single-nucleotide repeat.
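
One common way to find the longest run is to group consecutive identical characters. This is a generic sketch, not the library's implementation; `max_homopolymer_run_sketch` is a hypothetical name, and returning 0 for an empty sequence is an assumption.

```python
from itertools import groupby


def max_homopolymer_run_sketch(sequence):
    """Length of the longest stretch of one repeated base (0 for empty input)."""
    # groupby yields one group per run of identical consecutive characters.
    run_lengths = (sum(1 for _ in run) for _, run in groupby(sequence.upper()))
    return max(run_lengths, default=0)
```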

phaselab.crispr.chromatin_accessibility_score(position, tss_position, dnase_peaks=None)[source]#

Estimate chromatin accessibility at a genomic position.

Without experimental data, uses heuristic based on TSS proximity. Near TSS = more likely to be open chromatin.

Args:

position: Genomic position.
tss_position: Transcription start site position.
dnase_peaks: Optional list of (start, end) DNase HS peaks.

Returns:

(state, accessibility_score)
  • state: “OPEN”, “MODERATE”, or “CLOSED”
  • accessibility_score: 0.0 to 1.0

phaselab.crispr.poly_t_penalty(sequence, threshold=4)[source]#

Check for poly-T runs that cause U6/U3 Pol III termination.

Pol III promoters (U6, U3) terminate at poly-T runs, making guides starting with TTTT or containing long T-runs incompatible with standard expression systems.

Args:

sequence: Guide sequence.
threshold: Minimum T-run length to flag (default: 4 = TTTT).

Returns:

(is_problematic, reason)
  • is_problematic: True if guide has poly-T issue
  • reason: Description of the issue (empty if none)

Example:
>>> poly_t_penalty("TTTTAATGGCCGGCGATGCC")
(True, "TTTT at 5' end - incompatible with U6/U3 promoters")
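
The underlying check, locating a run of at least `threshold` T bases, can be sketched with a regular expression. This illustrates only the detection step under stated assumptions: it returns a position rather than the documented `(is_problematic, reason)` tuple, it treats U like T, and `find_poly_t_run` is a hypothetical name.

```python
import re


def find_poly_t_run(sequence, threshold=4):
    """Return the 0-based start of the first run of >= threshold T/U bases, or -1."""
    if threshold < 1:
        raise ValueError("threshold must be at least 1")
    match = re.search("[TU]{%d,}" % threshold, sequence.upper())
    return match.start() if match else -1
```

A start position of 0 corresponds to the 5'-end case in the example above; any other non-negative value indicates an internal run.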
phaselab.crispr.is_repeat_region(sequence, min_repeat_length=3)[source]#

Detect if guide is in a repetitive/low-complexity region.

Guides in repeat regions often have:
  • Multiple identical off-targets
  • Mapping ambiguity
  • Reduced specificity

Args:

sequence: Guide sequence.
min_repeat_length: Minimum unit length for repeat detection.

Returns:

(is_repeat, reason)
  • is_repeat: True if sequence is repetitive
  • reason: Description of repeat type

Example:
>>> is_repeat_region("CAGCAGCAGCAGCAGCAGCA")
(True, "Tandem repeat: CAG repeated 6+ times")
phaselab.crispr.u6_compatibility_check(sequence)[source]#

Comprehensive U6/Pol III compatibility check.

Checks for all known issues with Pol III-driven expression:
  • Poly-T termination signals
  • G at position 1 preferred for U6
  • Internal TTTT runs

Args:

sequence: Guide sequence.

Returns:

(is_compatible, warnings)
  • is_compatible: True if guide can be expressed from U6
  • warnings: List of compatibility warnings

Example:
>>> u6_compatibility_check("GAAGTGACGGCTAGGGCTCC")
(True, [])
>>> u6_compatibility_check("TTTTAATGGCCGGCGATGCC")
(False, ["TTTT at 5' - will cause Pol III termination"])
class phaselab.crispr.CRISPORMetrics(mit_score, cfd_score, off_targets=<factory>, u6_compatible=True, is_repeat=False)[source]#

Bases: object

Container for CRISPOR-style guide metrics.

Attributes:

mit_score: MIT specificity score (0-100)
cfd_score: CFD cutting frequency score (0-100)
off_targets: Dict mapping mismatch count to number of off-targets
u6_compatible: Whether compatible with U6/Pol III
is_repeat: Whether in genomic repeat region

is_repeat: bool = False#
u6_compatible: bool = True#
mit_score: float#
cfd_score: float#
off_targets: Dict[int, int]#
phaselab.crispr.crispor_composite_score(mit_score, cfd_score, off_targets=None, u6_compatible=True, is_repeat=False, weights=None)[source]#

Calculate CRISPOR-style composite score with mismatch distance weighting.

Formula:

COMPOSITE = (MIT + CFD) - Σ(weight[mm] × count[mm]) - U6_penalty - repeat_penalty

Where:
  • weight[0-1mm] = 50-100 (critical)

  • weight[2mm] = 25 (important)

  • weight[3mm] = 5 (minor)

  • weight[4mm] = 1 (minimal)

  • U6_penalty = 100 if incompatible

  • repeat_penalty = 1000 if in repeat region

This correctly handles the “MIT 98 / CFD 98 trap” where high raw scores are misleading due to many off-targets or U6 incompatibility.

Args:

mit_score: MIT specificity score (0-100).
cfd_score: CFD cutting frequency score (0-100).
off_targets: Dict mapping mismatch count to off-target count, e.g., {0: 0, 1: 0, 2: 1, 3: 5, 4: 23}.
u6_compatible: True if compatible with U6/Pol III promoters.
is_repeat: True if guide is in genomic repeat region.
weights: Custom mismatch weights (default: OFFTARGET_MISMATCH_WEIGHTS).

Returns:

(composite_score, breakdown)
  • composite_score: Final score (higher is better, can be negative)
  • breakdown: Dict with component contributions

Example:
>>> # Guide #1: MIT=93, CFD=95, 0 dangerous OTs
>>> score, _ = crispor_composite_score(93, 95, {0:0, 1:0, 2:0, 3:5, 4:15})
>>> score
148.0  # (93+95) - (5*5 + 15*1) = 188 - 25 - 15 = 148
>>> # Guide #7: MIT=98, CFD=98, TTTT start
>>> score, _ = crispor_composite_score(98, 98, {0:0, 1:0, 2:1, 3:6, 4:22},
...                                     u6_compatible=False)
>>> score
19.0  # (98+98) - (25 + 30 + 22) - 100 = 196 - 77 - 100 = 19
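
The documented formula can be checked by hand with a small standalone sketch. The code is not the library's implementation: `composite_score_sketch` is a hypothetical name, it returns only the score (no breakdown dict), and the 0mm and 1mm weights of 100 and 50 are illustrative picks from the stated 50-100 range rather than confirmed defaults.

```python
# Weights for 2-4 mismatches follow the formula above; 0mm/1mm are assumed.
MISMATCH_WEIGHTS = {0: 100.0, 1: 50.0, 2: 25.0, 3: 5.0, 4: 1.0}
U6_PENALTY = 100.0
REPEAT_PENALTY = 1000.0


def composite_score_sketch(mit_score, cfd_score, off_targets=None,
                           u6_compatible=True, is_repeat=False):
    """(MIT + CFD) minus weighted off-target counts and fixed penalties."""
    off_targets = off_targets or {}
    off_target_penalty = sum(
        MISMATCH_WEIGHTS.get(mismatches, 0.0) * count
        for mismatches, count in off_targets.items()
    )
    score = (mit_score + cfd_score) - off_target_penalty
    if not u6_compatible:
        score -= U6_PENALTY
    if is_repeat:
        score -= REPEAT_PENALTY
    return score
```

Under these weights, the first guide in the example evaluates to 188 - 40 = 148.0 and the second to 196 - 77 - 100 = 19.0, matching the arithmetic in the inline comments.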
phaselab.crispr.rank_guides_crispor_style(guides, require_u6_compatible=True, exclude_repeats=True)[source]#

Rank guides using CRISPOR-style composite scoring.

This function takes guide dictionaries with CRISPOR metrics and returns them sorted by composite score (highest first).

Args:
guides: List of guide dicts, each containing:
  • sequence: Guide sequence

  • mit_score: MIT specificity (0-100)

  • cfd_score: CFD score (0-100)

  • off_targets: Dict {mismatch_count: num_off_targets}

  • u6_compatible: bool (optional, default True)

  • is_repeat: bool (optional, default False)

require_u6_compatible: Exclude U6-incompatible guides entirely.
exclude_repeats: Exclude guides in repeat regions entirely.

Returns:

Sorted list of guides with ‘crispor_composite’ and ‘crispor_rank’ added.

Example:
>>> guides = [
...     {"sequence": "TTCGATGAATGGTTGCTACC", "mit_score": 93, "cfd_score": 95,
...      "off_targets": {0:0, 1:0, 2:0, 3:5, 4:15}},
...     {"sequence": "TTTTAATGGCCGGCGATGCC", "mit_score": 98, "cfd_score": 98,
...      "off_targets": {0:0, 1:0, 2:1, 3:6, 4:22}, "u6_compatible": False},
... ]
>>> ranked = rank_guides_crispor_style(guides)
>>> ranked[0]["sequence"]
'TTCGATGAATGGTTGCTACC'  # Guide #1 wins despite lower MIT/CFD
phaselab.crispr.validate_and_rerank_with_crispor(phaselab_guides, crispor_data, verbose=True)[source]#

Validate PhaseLab guides against CRISPOR data and re-rank using v0.9.1 composite scoring.

This function bridges PhaseLab’s design_guides() output with CRISPOR validation, applying proper off-target mismatch distance weighting per ChatGPT’s recommendations.

Args:
phaselab_guides: Output from design_guides() - list of guide dicts with:
  • sequence: 20bp guide

  • position: relative to TSS

  • gc: GC content

  • coherence_R: IR coherence score

  • go_no_go: “GO” or “NO-GO”

  • mit_score: PhaseLab’s MIT estimate

  • cfd_score: PhaseLab’s CFD estimate

  • combined_score: PhaseLab’s original ranking score

crispor_data: CRISPOR output - list of dicts with:
  • sequence: Guide sequence (matching key)

  • mit_specificity: Real MIT score from CRISPOR

  • cfd_specificity: Real CFD score from CRISPOR

  • ot_0mm, ot_1mm, ot_2mm, ot_3mm, ot_4mm: Off-target counts

verbose: Print comparison table.

Returns:

Merged and re-ranked guides with CRISPOR validation, including:
  • crispor_mit: Validated MIT score
  • crispor_cfd: Validated CFD score
  • off_targets: Dict of {mm: count}
  • crispor_composite: CRISPOR composite score
  • crispor_rank: New rank based on composite score
  • phaselab_rank: Original PhaseLab rank
  • rank_delta: Change in rank (positive = improved)

Example:
>>> from phaselab.crispr import design_guides
>>> guides_df = design_guides(sequence, tss_index=500)
>>> phaselab_guides = guides_df.to_dict('records')
>>>
>>> # After running CRISPOR (web or local)
>>> crispor_data = parse_crispor_results(crispor_output)
>>>
>>> validated = validate_and_rerank_with_crispor(phaselab_guides, crispor_data)
>>> print(f"Top guide changed: {validated[0]['sequence']}")
class phaselab.crispr.RankingPolicy(value)[source]#

Bases: Enum

Named ranking policies for different use cases.

Each policy defines:
  • Hard gates (what disqualifies a guide entirely)
  • Dominance order for safety-critical mismatches
  • Tie-breaker weights for lower-priority factors

CUTTING_STRICT = 'cutting_strict'#
BINDING_STRICT = 'binding_strict'#
EXPLORATORY = 'exploratory'#
class phaselab.crispr.PolicyConfig(name, description, gate_unscorable, gate_0mm, gate_1mm, gate_u6_incompatible, gate_repeats, max_0mm, max_1mm, max_2mm, weight_3mm, weight_4mm, weight_mit, weight_cfd)[source]#

Bases: NamedTuple

Configuration for a ranking policy.

Methods

count(value, /)

Return number of occurrences of value.

index(value[, start, stop])

Return first index of value.

description: str#

Alias for field number 1

gate_0mm: bool#

Alias for field number 3

gate_1mm: bool#

Alias for field number 4

gate_repeats: bool#

Alias for field number 6

gate_u6_incompatible: bool#

Alias for field number 5

gate_unscorable: bool#

Alias for field number 2

max_0mm: int#

Alias for field number 7

max_1mm: int#

Alias for field number 8

max_2mm: int#

Alias for field number 9

name: str#

Alias for field number 0

weight_3mm: float#

Alias for field number 10

weight_4mm: float#

Alias for field number 11

weight_cfd: float#

Alias for field number 13

weight_mit: float#

Alias for field number 12

class phaselab.crispr.GateResult(passed, excluded_by=None, reason=None)[source]#

Bases: object

Result of applying hard gates to a guide.

Attributes:
excluded_by: str | None = None#
reason: str | None = None#
passed: bool#
class phaselab.crispr.GuideTier(value)[source]#

Bases: Enum

Tiers for guide categorization.

A = 'A'#
B = 'B'#
C = 'C'#
X = 'X'#
phaselab.crispr.apply_hard_gates(guide, policy=RankingPolicy.CUTTING_STRICT)[source]#

Apply hard gates to a guide based on the ranking policy.

Hard gates are EXCLUSION criteria - guides failing any gate are removed from consideration entirely, not just penalized.

Args:

guide: Guide dict with CRISPOR metrics.
policy: Ranking policy to use.

Returns:

GateResult with passed=True if guide passes all gates.

Example:
>>> result = apply_hard_gates(guide, RankingPolicy.CUTTING_STRICT)
>>> if not result.passed:
...     print(f"Excluded: {result.reason}")
phaselab.crispr.rank_guides(guides, policy=RankingPolicy.CUTTING_STRICT, return_excluded=False)[source]#

Rank guides using dominance-based lexicographic sorting.

This is the v0.9.2+ replacement for composite scoring. Key features:
  • Hard gates exclude guides entirely (not just penalized)
  • Lexicographic sort on (0mm, 1mm, 2mm) - safety-critical
  • Tie-breaker uses MIT/CFD and 3-4mm counts
  • Output includes tier assignments for wet lab convenience

Args:

guides: List of guide dicts with CRISPOR metrics.
policy: Ranking policy to use.
return_excluded: Include excluded guides in output.

Returns:

Dict with:

  • ranked: List of ranked guides (passing hard gates)

  • excluded: List of excluded guides (if return_excluded=True)

  • tiers: Dict mapping tier to list of guides

  • policy: Policy used

  • manifest: Run manifest for reproducibility

Example:
>>> result = rank_guides(guides, RankingPolicy.CUTTING_STRICT)
>>> print(f"Top guide: {result['ranked'][0]['sequence']}")
>>> print(f"Tier A guides: {len(result['tiers']['A'])}")
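The dominance-based ordering can be sketched with a tuple sort key. This is an assumption-laden illustration rather than the library's code: the key names (`ot_0mm` through `ot_4mm`, `mit_specificity`, `cfd_specificity`) are borrowed from the CRISPOR validation fields documented elsewhere in this reference, and the exact order of the tie-breakers is a guess.

```python
def sketch_rank(guides):
    """Sort guides lexicographically on close off-target counts, then tie-break."""
    def sort_key(g):
        return (
            g["ot_0mm"], g["ot_1mm"], g["ot_2mm"],   # fewer close off-targets always wins
            -g.get("mit_specificity", 0.0),          # tie-breaker: higher MIT first
            -g.get("cfd_specificity", 0.0),          # tie-breaker: higher CFD first
            g.get("ot_3mm", 0) + g.get("ot_4mm", 0), # tie-breaker: fewer distant hits
        )
    return sorted(guides, key=sort_key)


example_guides = [
    {"sequence": "A", "ot_0mm": 0, "ot_1mm": 0, "ot_2mm": 1, "mit_specificity": 60.0},
    {"sequence": "B", "ot_0mm": 0, "ot_1mm": 0, "ot_2mm": 0, "mit_specificity": 40.0},
    {"sequence": "C", "ot_0mm": 0, "ot_1mm": 0, "ot_2mm": 1, "mit_specificity": 80.0},
]
ranked_order = [g["sequence"] for g in sketch_rank(example_guides)]
```

Guide B ranks first despite its lower MIT score because the 2mm count is compared before any specificity score. That is the difference from a weighted composite, where a high MIT score could outweigh an extra close off-target.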
phaselab.crispr.emit_manifest(guides, policy, sequence_name='unknown', genome='hg38', crispor_version=None, tss_index=None, window=None)[source]#

Generate a reproducibility manifest for a ranking run.

Every ranking run should emit this manifest to enable:

  • Reproducibility of results

  • Audit trail for validation

  • Debugging of “why did the winner change” issues

Args:

guides: The ranked guides.
policy: Ranking policy used.
sequence_name: Name of the target sequence/gene.
genome: Genome build (e.g., “hg38”).
crispor_version: CRISPOR version/commit if known.
tss_index: TSS position in the input sequence.
window: CRISPRa/i window used (e.g., (-400, -50)).

Returns:

Manifest dict suitable for JSON serialization.

Example:
>>> import json
>>> result = rank_guides(guides, RankingPolicy.CUTTING_STRICT)
>>> manifest = emit_manifest(
...     result['ranked'],
...     RankingPolicy.CUTTING_STRICT,
...     sequence_name="RAI1",
...     genome="hg38",
... )
>>> with open('rai1_manifest.json', 'w') as f:
...     json.dump(manifest, f, indent=2)
phaselab.crispr.print_ranking_report(result, max_guides=15)[source]#

Print a formatted ranking report to stdout.

Args:

result: Output from rank_guides().
max_guides: Maximum guides to show.

class phaselab.crispr.CoherenceMode(value)[source]#

Bases: Enum

Coherence computation mode.

HEURISTIC: Fast proxy using Hamiltonian coefficient variance
  • Good for filtering/tie-breaking

  • NOT actual quantum coherence

QUANTUM: Actual VQE simulation (slow but accurate)
  • Matches hardware validation results

  • Use for research-grade analysis

HEURISTIC = 'heuristic'#
QUANTUM = 'quantum'#
phaselab.crispr.compute_guide_coherence(guide_seq, mode='heuristic', use_atlas_q=True, return_full_result=False, n_shots=1000)[source]#

Compute IR coherence for a guide sequence.

TWO MODES AVAILABLE:

  1. mode="heuristic" (default, fast):

     • Uses Hamiltonian coefficient variance as proxy

     • R̄ ≈ 0.68-0.69 typically

     • Does NOT benefit from ATLAS-Q

     • Use for filtering/tie-breaking, NOT primary ranking

  2. mode="quantum" (slow, research-grade):

     • Runs actual VQE simulation on gRNA Hamiltonian

     • R̄ ≈ 0.84-0.97 (matches IBM hardware validation)

     • ATLAS-Q provides significant speedup

     • Use for final candidate validation

Args:

guide_seq: 20bp guide sequence (DNA, uppercase preferred)
mode: “heuristic” (fast proxy) or “quantum” (VQE simulation)
use_atlas_q: Use ATLAS-Q backend if available (only affects quantum mode)
return_full_result: Return full CoherenceResult instead of just R̄
n_shots: Number of measurement shots for quantum mode (default 1000)

Returns:

R̄ value [0, 1] or full CoherenceResult if return_full_result=True

Example:
>>> # Fast heuristic (default)
>>> r_bar = compute_guide_coherence("ATCGATCGATCGATCGATCG")
>>> print(f"Heuristic R̄ = {r_bar:.4f}")
>>> # Research-grade quantum
>>> r_bar = compute_guide_coherence("ATCGATCGATCGATCGATCG", mode="quantum")
>>> print(f"Quantum R̄ = {r_bar:.4f}")
phaselab.crispr.compute_guide_coherence_with_details(guide_seq, mode='heuristic', use_atlas_q=True)[source]#

Compute coherence with full details.

Args:

guide_seq: 20bp guide sequence
mode: “heuristic” (fast proxy) or “quantum” (VQE simulation)
use_atlas_q: Use ATLAS-Q backend if available (only affects quantum mode)

Returns:

Tuple of (R_bar, V_phi, is_go, method)

phaselab.crispr.compute_coherence_batch(guide_sequences, mode='heuristic', use_atlas_q=True)[source]#

Compute coherence for multiple guides efficiently.

Args:

guide_sequences: List of guide sequences
mode: “heuristic” (fast proxy) or “quantum” (VQE simulation)
use_atlas_q: Use ATLAS-Q backend if available (only affects quantum mode)

Returns:

List of R̄ values

Note:
  • mode="heuristic": ~0.1ms per guide (recommended for screening)

  • mode="quantum": ~100-500ms per guide (recommended for final validation)

phaselab.crispr.compute_coherence_with_zscore(guide_sequences, mode='heuristic', use_atlas_q=True)[source]#

Compute coherence for multiple guides with domain-calibrated z-scores.

Z-score provides relative ranking within a locus, which is more discriminative than absolute R̄ when most guides have similar coherence (e.g., in GC-rich regions).

Args:

guide_sequences: List of guide sequences from the same locus
mode: “heuristic” (fast proxy) or “quantum” (VQE simulation)
use_atlas_q: Use ATLAS-Q backend if available (only affects quantum mode)

Returns:

List of (R_bar, z_score) tuples

  • R̄: Absolute coherence

  • z_score: (R̄ - mean(R̄)) / std(R̄) within the locus

Example:
>>> seqs = ["ATCGATCGATCGATCGATCG", "GCTAGCTAGCTAGCTAGCTA", ...]
>>> results = compute_coherence_with_zscore(seqs)
>>> for r_bar, zscore in results:
...     print(f"R̄={r_bar:.4f}, z={zscore:+.2f}")
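The z-score formula in the Returns section can be worked through with the standard library. This is a sketch, not the library's code: whether PhaseLab uses the population or the sample standard deviation is not stated here, so the population form (`statistics.pstdev`) is an assumption, as is returning 0.0 for every guide when all R̄ values are identical.

```python
from statistics import mean, pstdev


def sketch_zscores(r_bars):
    """Pair each R-bar with its z-score relative to the other guides in the locus."""
    mu = mean(r_bars)
    sigma = pstdev(r_bars)  # assumption: population standard deviation
    if sigma == 0:
        # All guides have the same coherence; there is nothing to discriminate.
        return [(r, 0.0) for r in r_bars]
    return [(r, (r - mu) / sigma) for r in r_bars]


# Heuristic-mode values cluster tightly, so absolute R-bar barely separates them.
example_pairs = sketch_zscores([0.68, 0.69, 0.70])
```

Values that differ by only 0.01 in absolute R̄ end up more than one standard deviation apart, which is why the relative score discriminates better within a locus.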
phaselab.crispr.is_guide_coherent(guide_seq, threshold=np.float64(0.1353352832366127), mode='heuristic', use_atlas_q=True)[source]#

Check if guide meets coherence threshold.

Args:

guide_seq: 20bp guide sequence
threshold: R̄ threshold (default: e^-2 ≈ 0.135)
mode: “heuristic” (fast proxy) or “quantum” (VQE simulation)
use_atlas_q: Use ATLAS-Q backend if available (only affects quantum mode)

Returns:

True if R̄ > threshold (GO status)
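The default threshold ties back to the relationship R̄ = exp(-V_φ/2) stated in the Core Module section: R̄ = e^-2 corresponds to a phase variance of V_φ = 4. A minimal sketch of the comparison follows. The helper names are hypothetical, and the strict inequality follows the "R̄ > threshold" wording above.

```python
import math

E_MINUS_2 = math.exp(-2)  # documented default threshold, about 0.1353


def r_bar_from_variance(v_phi):
    """Convert phase variance to coherence using R-bar = exp(-V_phi / 2)."""
    return math.exp(-v_phi / 2)


def sketch_is_go(r_bar, threshold=E_MINUS_2):
    """GO only when coherence is strictly above the threshold."""
    return r_bar > threshold
```

Under this reading, a guide sitting exactly on the boundary (V_φ = 4) is classified NO-GO, while typical heuristic values around 0.68 clear the threshold comfortably.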

phaselab.crispr.get_coherence_method()[source]#

Get the coherence computation method that will be used.

Returns:

“atlas_q” if ATLAS-Q available, “native” otherwise

phaselab.crispr.get_coherence_eligibility_info(mode='heuristic')[source]#

Get detailed info about coherence computation eligibility.

This helps users understand what method will be used and why.

Args:

mode: “heuristic” (default) or “quantum”

Returns:

Dict with:

  • mode: “heuristic” or “quantum”

  • method: Expected method string in CoherenceResult

  • atlas_q_available: bool

  • acceleration_active: bool (True if ATLAS-Q acceleration will be used)

  • expected_r_bar_range: Typical R̄ range for this mode

  • expected_time_per_guide: Approximate time in ms

  • reason: Human-readable explanation

class phaselab.crispr.GuideCandidate(sequence, position, strand, individual_score, coherence_R=0.0, name='', efficiency=0.0, specificity=0.0)[source]#

Bases: object

A guide candidate for multi-guide analysis.

coherence_R: float = 0.0#
efficiency: float = 0.0#
name: str = ''#
specificity: float = 0.0#
sequence: str#
position: int#
strand: str#
individual_score: float#
class phaselab.crispr.GuidePair(guide1, guide2, spacing, has_steric_clash, in_optimal_range, synergy_score, combined_efficiency, interaction_type)[source]#

Bases: object

Analysis of a pair of guides.

guide1: GuideCandidate#
guide2: GuideCandidate#
spacing: int#
has_steric_clash: bool#
in_optimal_range: bool#
synergy_score: float#
combined_efficiency: float#
interaction_type: str#
class phaselab.crispr.MultiGuideSet(guides, pairs, combined_score, synergy_factor, ensemble_coherence, has_clashes, min_spacing, max_spacing, recommendation, go_no_go)[source]#

Bases: object

A set of guides designed to work together.

guides: List[GuideCandidate]#
pairs: List[GuidePair]#
combined_score: float#
synergy_factor: float#
ensemble_coherence: float#
has_clashes: bool#
min_spacing: int#
max_spacing: int#
recommendation: str#
go_no_go: str#
class phaselab.crispr.MultiGuideConfig(min_spacing=30, optimal_spacing_min=50, optimal_spacing_max=200, max_interaction_distance=500, clash_radius=25, enable_synergy=True, synergy_model='empirical', max_guides=4, min_guides=2, weight_individual=0.4, weight_synergy=0.3, weight_coherence=0.3, compute_coherence=True)[source]#

Bases: object

Configuration for multi-guide analysis.

clash_radius: int = 25#
compute_coherence: bool = True#
enable_synergy: bool = True#
max_guides: int = 4#
max_interaction_distance: int = 500#
min_guides: int = 2#
min_spacing: int = 30#
optimal_spacing_max: int = 200#
optimal_spacing_min: int = 50#
synergy_model: str = 'empirical'#
weight_coherence: float = 0.3#
weight_individual: float = 0.4#
weight_synergy: float = 0.3#
phaselab.crispr.design_multiguide_set(candidates, n_guides=2, tss_position=0, config=None)[source]#

Design optimal multi-guide sets from candidates.

Evaluates all combinations and ranks by combined effectiveness.

Args:

candidates: List of guide candidates
n_guides: Number of guides in each set
tss_position: TSS position
config: MultiGuideConfig

Returns:

List of MultiGuideSet objects, sorted by score

Example:
>>> candidates = [GuideCandidate(...) for guide in top_guides]
>>> sets = design_multiguide_set(candidates, n_guides=2)
>>> print(f"Best set synergy: {sets[0].synergy_factor:.2f}")
phaselab.crispr.analyze_guide_pair(guide1, guide2, tss_position=0, config=None)[source]#

Analyze interaction between two guides.

Args:

guide1: First guide
guide2: Second guide
tss_position: TSS position
config: MultiGuideConfig

Returns:

GuidePair analysis object
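How the spacing-related GuidePair fields might relate to the MultiGuideConfig defaults can be sketched as below. The comparison rules are assumptions made for illustration (the reference does not state them); only the default numbers (clash_radius=25, optimal_spacing_min=50, optimal_spacing_max=200) come from the configuration listed above.

```python
def sketch_pair_spacing(pos1, pos2, clash_radius=25,
                        optimal_min=50, optimal_max=200):
    """Classify the distance between two guide positions (hypothetical rules)."""
    spacing = abs(pos1 - pos2)
    return {
        "spacing": spacing,
        # Assumed rule: guides closer than the clash radius interfere sterically.
        "has_steric_clash": spacing < clash_radius,
        # Assumed rule: the optimal range is inclusive at both ends.
        "in_optimal_range": optimal_min <= spacing <= optimal_max,
    }
```

For example, positions 100 and 110 give a spacing of 10 and a clash, while positions 100 and 220 give a spacing of 120, inside the optimal range.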

phaselab.crispr.predict_pairwise_synergy(guide1, guide2, tss_position=0, config=None)[source]#

Predict synergy between two guides.

Synergy > 1.0 means combined effect is greater than sum of parts. Synergy < 1.0 means interference or redundancy.

Args:

guide1: First guide
guide2: Second guide
tss_position: TSS position for relative calculations
config: MultiGuideConfig

Returns:

Synergy coefficient (0-2, with 1.0 = additive)

phaselab.crispr.validate_guide_set(guides, config=None)[source]#

Validate a proposed guide set for compatibility.

Checks:

  • Steric clashes

  • Spacing appropriateness

  • Coherence quality

  • Overall synergy potential

Args:

guides: List of guides to validate
config: MultiGuideConfig

Returns:

Validation results dictionary

phaselab.crispr.optimize_guide_spacing(anchor_guide, candidates, tss_position=0, config=None)[source]#

Find optimal second guide given an anchor guide.

Useful when one guide is already selected and you need to find the best partner.

Args:

anchor_guide: The fixed guide
candidates: Potential partner guides
tss_position: TSS position
config: MultiGuideConfig

Returns:

List of (guide, pair_analysis) tuples, sorted by synergy

class phaselab.crispr.Enhancer(chrom, start, end, name='', element_type='dELS', activity_score=0.0, target_gene=None, target_gene_tss=None, distance_to_tss=0, abc_score=0.0, eqtl_support=False, hi_c_contact=0.0, tissue='generic', tissue_specific=False)[source]#

Bases: object

Enhancer element annotation.

Attributes:
center
is_proximal
length
target_gene
target_gene_tss
abc_score: float = 0.0#
activity_score: float = 0.0#
property center: int#
distance_to_tss: int = 0#
element_type: str = 'dELS'#
eqtl_support: bool = False#
hi_c_contact: float = 0.0#
property is_proximal: bool#
property length: int#
name: str = ''#
target_gene: str | None = None#
target_gene_tss: int | None = None#
tissue: str = 'generic'#
tissue_specific: bool = False#
chrom: str#
start: int#
end: int#
class phaselab.crispr.EnhancerConfig(min_enhancer_length=200, max_enhancer_distance=1000000, min_activity_score=0.3, min_abc_score=0.02, guides_per_enhancer=3, prefer_enhancer_center=True, require_tissue_match=False, target_tissue='generic')[source]#

Bases: object

Configuration for enhancer targeting.

guides_per_enhancer: int = 3#
max_enhancer_distance: int = 1000000#
min_abc_score: float = 0.02#
min_activity_score: float = 0.3#
min_enhancer_length: int = 200#
prefer_enhancer_center: bool = True#
require_tissue_match: bool = False#
target_tissue: str = 'generic'#
class phaselab.crispr.EnhancerGuideResult(guide_sequence, guide_position, enhancer, activation_potential, guide_quality, combined_score, coherence_R, go_no_go, recommendation)[source]#

Bases: object

Result from enhancer-targeting guide design.

guide_sequence: str#
guide_position: int#
enhancer: Enhancer#
activation_potential: float#
guide_quality: float#
combined_score: float#
coherence_R: float#
go_no_go: str#
recommendation: str#
phaselab.crispr.design_enhancer_guides(enhancer, sequence, enhancer_start_in_seq, config=None, guide_length=20, pam='NGG')[source]#

Design CRISPRa guides targeting an enhancer.

For enhancer activation, guides should:

  • Target sites near the enhancer center (peak of activity)

  • Target accessible regions within the enhancer

  • Avoid repressive elements

Args:

enhancer: Target enhancer
sequence: DNA sequence containing enhancer
enhancer_start_in_seq: Position of enhancer start in sequence
config: EnhancerConfig
guide_length: Guide length (default 20)
pam: PAM sequence

Returns:

List of EnhancerGuideResult objects

phaselab.crispr.identify_target_enhancers(gene_symbol, gene_tss, gene_chrom, candidate_enhancers, config=None)[source]#

Identify enhancers likely to regulate a target gene.

Args:

gene_symbol: Target gene symbol
gene_tss: Gene TSS position
gene_chrom: Gene chromosome
candidate_enhancers: List of candidate enhancers
config: EnhancerConfig

Returns:

List of (enhancer, link_score) tuples, sorted by score

phaselab.crispr.score_enhancer_for_activation(enhancer, config=None)[source]#

Score an enhancer’s potential for CRISPRa activation.

Considers:

  • Enhancer activity (already-active enhancers are harder to boost)

  • Distance to target gene

  • ABC score (link confidence)

  • Tissue specificity

Args:

enhancer: Enhancer to score
config: EnhancerConfig

Returns:

Scoring dictionary

phaselab.crispr.predict_enhancer_activation_effect(enhancer, guide_quality, baseline_expression=1.0)[source]#

Predict effect of enhancer CRISPRa on target gene expression.

Args:

enhancer: Target enhancer
guide_quality: Quality score of guide used
baseline_expression: Baseline gene expression level

Returns:

Prediction dictionary with fold-change estimates

phaselab.crispr.compare_promoter_vs_enhancer(promoter_guides, enhancer_guides, target_expression, baseline_expression=0.5)[source]#

Compare promoter CRISPRa vs enhancer CRISPRa strategies.

Args:

promoter_guides: Promoter-targeting guide results
enhancer_guides: Enhancer-targeting guide results
target_expression: Desired expression level
baseline_expression: Current baseline expression

Returns:

Comparison and recommendation

phaselab.crispr.get_known_enhancers(gene)[source]#

Get known enhancers for a gene from built-in database.

Args:

gene: Gene symbol

Returns:

List of Enhancer objects

class phaselab.crispr.Modality(value)[source]#

Bases: Enum

CRISPR modality.

CRISPRA = 'CRISPRa'#
CRISPRI = 'CRISPRi'#
KNOCKOUT = 'Knockout'#
BASE_EDITING = 'BaseEditing'#
PRIME_EDITING = 'PrimeEditing'#
class phaselab.crispr.EnhancedGuideConfig(pam='NGG', guide_length=20, modality=Modality.CRISPRA, crispra_window=(-400, -50), crispri_window=(-50, 300), knockout_window=(0, 500), min_gc=0.35, max_gc=0.75, max_homopolymer=4, min_complexity=0.4, max_offtargets=10, coherence_mode=CoherenceMode.HEURISTIC, go_threshold=np.float64(0.1353352832366127), use_biological_context=True, cell_type=CellType.K562, use_ml_predictors=True, ml_min_confidence=0.3, fusion_config=None, top_n=20, include_nogo=False)[source]#

Bases: object

Configuration for enhanced guide RNA design pipeline.

Extends basic config with v0.7.0 Virtual Assay Stack options.

Attributes:
fusion_config
window

Get window for current modality.

cell_type: CellType | str = 'K562'#
coherence_mode: CoherenceMode = 'heuristic'#
crispra_window: Tuple[int, int] = (-400, -50)#
crispri_window: Tuple[int, int] = (-50, 300)#
fusion_config: FusionConfig | None = None#
go_threshold: float = np.float64(0.1353352832366127)#
guide_length: int = 20#
include_nogo: bool = False#
knockout_window: Tuple[int, int] = (0, 500)#
max_gc: float = 0.75#
max_homopolymer: int = 4#
max_offtargets: int = 10#
min_complexity: float = 0.4#
min_gc: float = 0.35#
ml_min_confidence: float = 0.3#
modality: Modality = 'CRISPRa'#
pam: str = 'NGG'#
top_n: int = 20#
use_biological_context: bool = True#
use_ml_predictors: bool = True#
property window: Tuple[int, int]#

Get window for current modality.

class phaselab.crispr.EnhancedGuide(sequence, pam, position, strand, chrom=None, abs_start=None, abs_end=None, gc=0.0, delta_g=0.0, complexity=0.0, homopolymer=0, mit_score=0.0, cfd_score=0.0, offtarget_count=0, accessibility=0.5, methylation=0.5, histone_activity=0.5, chromatin_state='Unknown', context_confidence=0.0, ml_efficiency=0.5, ml_confidence=0.0, ml_predictors_used=0, coherence=0.0, coherence_mode='heuristic', fused_score=0.0, fused_confidence=0.0, go_status='NO-GO', go_reason='', passes_gates=False, failed_gates=<factory>, evidence_level='C', claim_level='unknown', claim_description='', is_unknown=False, layer_disagreements=<factory>, has_critical_disagreement=False)[source]#

Bases: object

Enhanced guide result with full evidence breakdown.

Attributes:
abs_end
abs_start
chrom
claim_level_enum

Return ClaimLevel enum.

is_go
is_viable

Methods

to_dict()

Convert to dictionary.

abs_end: int | None = None#
abs_start: int | None = None#
accessibility: float = 0.5#
cfd_score: float = 0.0#
chrom: str | None = None#
chromatin_state: str = 'Unknown'#
claim_description: str = ''#
claim_level: str = 'unknown'#
property claim_level_enum: ClaimLevel#

Return ClaimLevel enum.

coherence: float = 0.0#
coherence_mode: str = 'heuristic'#
complexity: float = 0.0#
context_confidence: float = 0.0#
delta_g: float = 0.0#
evidence_level: str = 'C'#
fused_confidence: float = 0.0#
fused_score: float = 0.0#
gc: float = 0.0#
go_reason: str = ''#
go_status: str = 'NO-GO'#
has_critical_disagreement: bool = False#
histone_activity: float = 0.5#
homopolymer: int = 0#
property is_go: bool#
is_unknown: bool = False#
property is_viable: bool#
methylation: float = 0.5#
mit_score: float = 0.0#
ml_confidence: float = 0.0#
ml_efficiency: float = 0.5#
ml_predictors_used: int = 0#
offtarget_count: int = 0#
passes_gates: bool = False#
to_dict()[source]#

Convert to dictionary.

sequence: str#
pam: str#
position: int#
strand: str#
failed_gates: List[str]#
layer_disagreements: List[Dict[str, Any]]#
class phaselab.crispr.EnhancedDesignResult(guides, config, target_gene=None, total_candidates=0, filtered_by_gates=0, filtered_by_coherence=0)[source]#

Bases: object

Result container for enhanced guide design.

Attributes:
go_guides

Return only GO guides.

target_gene
viable_guides

Return only viable guides (pass gates + GO).

Methods

summary()

Get design summary.

to_dataframe()

Convert to pandas DataFrame.

filtered_by_coherence: int = 0#
filtered_by_gates: int = 0#
property go_guides: List[EnhancedGuide]#

Return only GO guides.

summary()[source]#

Get design summary.

target_gene: str | None = None#
to_dataframe()[source]#

Convert to pandas DataFrame.

total_candidates: int = 0#
property viable_guides: List[EnhancedGuide]#

Return only viable guides (pass gates + GO).

guides: List[EnhancedGuide]#
config: EnhancedGuideConfig#
phaselab.crispr.design_enhanced_guides(sequence, tss_index, config=None, target_gene=None, chrom=None, chrom_offset=0, verbose=False)[source]#

Design guide RNAs using the full Virtual Assay Stack.

This enhanced pipeline integrates:

  1. Sequence-based scoring (GC, thermodynamics, specificity)

  2. Biological context (chromatin state, methylation, histones)

  3. ML efficiency predictions (DeepCRISPR, DeepSpCas9)

  4. IR coherence (heuristic or quantum)

  5. Evidence fusion with uncertainty quantification

Parameters:
  • sequence (str) – Promoter/target DNA sequence (5’->3’)

  • tss_index (int) – Position of TSS in sequence (0-based)

  • config (EnhancedGuideConfig, optional) – Pipeline configuration

  • target_gene (str, optional) – Gene symbol for ML predictors

  • chrom (str, optional) – Chromosome for context lookup (e.g., “chr4”)

  • chrom_offset (int) – Genomic offset of sequence start

  • verbose (bool) – Print progress messages

Returns:

Ranked guides with full evidence breakdown

Return type:

EnhancedDesignResult

Example

>>> from phaselab.crispr import design_enhanced_guides, EnhancedGuideConfig, Modality, CoherenceMode
>>> from phaselab.context import CellType
>>>
>>> config = EnhancedGuideConfig(
...     modality=Modality.CRISPRA,
...     cell_type=CellType.K562,
...     coherence_mode=CoherenceMode.HEURISTIC,
... )
>>>
>>> result = design_enhanced_guides(
...     sequence=scn2a_promoter,
...     tss_index=500,
...     config=config,
...     target_gene="SCN2A",
...     chrom="chr2",
...     chrom_offset=165000000,
... )
>>>
>>> print(f"Found {len(result.viable_guides)} viable guides")
>>> for g in result.viable_guides[:5]:
...     print(f"  {g.sequence} Score={g.fused_score:.3f} {g.go_status}")
phaselab.crispr.compare_guides_with_without_context(sequence, tss_index, chrom, chrom_offset, cell_type=CellType.K562, target_gene=None)[source]#

Compare guide rankings with and without biological context.

Useful for demonstrating the value of the Virtual Assay Stack.

Return type:

Dict with ‘basic’, ‘enhanced’, and ‘rank_changes’ DataFrames

class phaselab.crispr.CrisporConfig(crispor_path, python_executable=None, genome='hg38', timeout=300, compute_efficiency=False)[source]#

Bases: object

Configuration for CRISPOR integration.

Attributes:
python_executable
compute_efficiency: bool = False#
genome: str = 'hg38'#
python_executable: str | None = None#
timeout: int = 300#
crispor_path: str#
class phaselab.crispr.CrisporValidator(crispor_path, genome='hg38', python_executable=None, compute_efficiency=False)[source]#

Bases: object

Validates guide RNAs using local CRISPOR installation.

Provides genome-wide off-target search and specificity scoring.

Example:
>>> validator = CrisporValidator("/path/to/crispor", genome="hg38")
>>> results = validator.validate_guides(guides, sequence)
>>> for guide in results:
...     print(f"{guide['sequence']}: MIT={guide['mit_specificity']}")

Methods

validate_guides(guides, sequence[, ...])

Validate a list of guide dictionaries against CRISPOR.

validate_sequence(sequence[, sequence_name])

Run CRISPOR on a sequence and return all guides with off-target data.

__init__(crispor_path, genome='hg38', python_executable=None, compute_efficiency=False)[source]#

Initialize CRISPOR validator.

Args:

crispor_path: Path to CRISPOR installation directory.
genome: Genome identifier (e.g., “hg38”, “mm10”).
python_executable: Python to use for running CRISPOR.
compute_efficiency: Whether to compute efficiency scores.

validate_guides(guides, sequence, sequence_name='input')[source]#

Validate a list of guide dictionaries against CRISPOR.

IMPORTANT: Only re-ranks within the PhaseLab candidate set. CRISPOR validation constrains but doesn’t override PhaseLab’s biological window logic.

Args:

guides: List of guide dicts (from design_guides()).
sequence: The full sequence used for guide design.
sequence_name: Name for the sequence.

Returns:

Guides with CRISPOR validation data merged in. Fields added:

  • mit_specificity: CRISPOR MIT score (None if unscorable)

  • cfd_specificity: CRISPOR CFD score (None if unscorable)

  • ot_0mm, ot_1mm, etc.: Off-target counts by mismatch

  • crispor_validated: True if CRISPOR could score this guide

  • is_unscorable: True if MIT=0, CFD=0, OTs=0 (invalid data)
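The is_unscorable condition can be read as "every specificity signal is zero at once". A hedged sketch of that check follows. It assumes off-target counts are stored under keys `ot_0mm` through `ot_4mm` and that a missing or None score counts as zero. Neither detail is confirmed by this reference, and `sketch_is_unscorable` is a hypothetical helper.

```python
def sketch_is_unscorable(guide, max_mismatches=4):
    """Flag guides whose MIT, CFD, and total off-target count are all zero."""
    total_ot = sum(guide.get(f"ot_{n}mm", 0) for n in range(max_mismatches + 1))
    mit = guide.get("mit_specificity") or 0  # None is treated as 0 (assumption)
    cfd = guide.get("cfd_specificity") or 0
    # All-zero output usually means CRISPOR produced no usable data,
    # not that the guide is perfectly specific.
    return mit == 0 and cfd == 0 and total_ot == 0
```

Treating the all-zero case as invalid data rather than as a perfect result keeps such guides from looking like ideal candidates in a sort on off-target counts.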

validate_sequence(sequence, sequence_name='input')[source]#

Run CRISPOR on a sequence and return all guides with off-target data.

Args:

sequence: Input DNA sequence (promoter region).
sequence_name: Name for the sequence (used in output).

Returns:

(guides, off_targets)

  • guides: List of guide dicts with MIT/CFD scores

  • off_targets: List of off-target hit dicts

phaselab.crispr.setup_crispor(install_path='~/.phaselab/crispor', genome='hg38')[source]#

Download and set up CRISPOR for local use.

This is a convenience function that:

  1. Clones CRISPOR from GitHub

  2. Downloads the specified genome index

Note: Genome downloads are large (~6GB for hg38).

Args:

install_path: Where to install CRISPOR.
genome: Genome to download (default: hg38).

Returns:

Path to CRISPOR installation.

Example:
>>> crispor_path = setup_crispor()
>>> validator = CrisporValidator(crispor_path)
phaselab.crispr.validate_with_crispor(guides, sequence, crispor_path, genome='hg38')[source]#

One-shot guide validation with CRISPOR.

Args:

guides: List of guide dicts from design_guides().
sequence: The sequence used for guide design.
crispor_path: Path to CRISPOR installation.
genome: Genome identifier.

Returns:

Guides with CRISPOR validation data.

class phaselab.crispr.GenomeBuild(value)[source]#

Bases: Enum

Supported genome builds.

HG38 = 'hg38'#
HG19 = 'hg19'#
MM10 = 'mm10'#
MM39 = 'mm39'#
class phaselab.crispr.TSSSource(value)[source]#

Bases: Enum

TSS annotation sources.

GENCODE = 'gencode'#
REFSEQ = 'refseq'#
ENSEMBL = 'ensembl'#
CUSTOM = 'custom'#
class phaselab.crispr.TSSAnnotation(gene_id, gene_symbol, transcript_id, chromosome, position, strand, source, source_version=None, support_level=None, is_canonical=False)[source]#

Bases: object

Transcription Start Site annotation.

Represents a single TSS from an annotation source.

Attributes:
source_version
support_level
is_canonical: bool = False#
source_version: str | None = None#
support_level: int | None = None#
gene_id: str#
gene_symbol: str#
transcript_id: str#
chromosome: str#
position: int#
strand: str#
source: TSSSource#
class phaselab.crispr.Window(upstream=400, downstream=50, name=None)[source]#

Bases: object

Relative window around a TSS.

Coordinates are relative to TSS (negative = upstream). For minus-strand genes, upstream/downstream are automatically flipped.

Attributes:
name

Methods

to_genomic(tss_position, strand)

Convert relative window to absolute genomic coordinates.

downstream: int = 50#
name: str | None = None#
to_genomic(tss_position, strand)[source]#

Convert relative window to absolute genomic coordinates.

Args:

tss_position: Absolute TSS position.
strand: ‘+’ or ‘-’.

Returns:

(start, end) as 0-based half-open interval.
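The strand flip can be illustrated with a small stand-alone function. This is a sketch under assumptions, not the library's conversion: it ignores any one-base adjustment the real method may apply when mirroring a half-open interval, and clamping the start at 0 is an added safeguard that the reference does not mention.

```python
def sketch_to_genomic(tss_position, strand, upstream=400, downstream=50):
    """Map a TSS-relative window to (start, end) genomic coordinates."""
    if strand == "+":
        # Upstream lies at lower coordinates on the plus strand.
        start, end = tss_position - upstream, tss_position + downstream
    elif strand == "-":
        # On the minus strand, upstream lies at higher coordinates.
        start, end = tss_position - downstream, tss_position + upstream
    else:
        raise ValueError("strand must be '+' or '-'")
    return max(start, 0), end  # clamp at chromosome start (assumption)
```

With the default window and a TSS at position 1000, the plus strand yields (600, 1050) and the minus strand yields (950, 1400): the same 450 bp window, mirrored around the TSS.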

upstream: int = 400#
class phaselab.crispr.Region(chromosome, start, end, strand, gene_id, gene_symbol, tss_id, tss_position, window, modality, genome_build=GenomeBuild.HG38)[source]#

Bases: object

A genomic region for guide enumeration.

Represents a specific interval to scan for candidate guides, with full provenance tracking.

Attributes:
length
region_id

Unique identifier for this region.

genome_build: GenomeBuild = 'hg38'#
property length: int#
property region_id: str#

Unique identifier for this region.

chromosome: str#
start: int#
end: int#
strand: str#
gene_id: str#
gene_symbol: str#
tss_id: str#
tss_position: int#
window: Window#
modality: Modality#
class phaselab.crispr.RegionSet(gene_id, gene_symbol, genome_build, modality, regions=<factory>, tss_annotations=<factory>, tss_source=TSSSource.GENCODE, tss_source_version=None)[source]#

Bases: object

Collection of regions for a single target.

Supports multiple TSS hypotheses and multiple windows per TSS.

Attributes:
total_length

Total bp covered (may overlap).

tss_source_version

Methods

add_region(region)

Add a region to the set.

to_dict()

Serialize for manifest.

add_region(region)[source]#

Add a region to the set.

to_dict()[source]#

Serialize for manifest.

property total_length: int#

Total bp covered (may overlap).

tss_source: TSSSource = 'gencode'#
tss_source_version: str | None = None#
gene_id: str#
gene_symbol: str#
genome_build: GenomeBuild#
modality: Modality#
regions: List[Region]#
tss_annotations: List[TSSAnnotation]#
class phaselab.crispr.RegionBuilder(genome_build=GenomeBuild.HG38, modality=Modality.CRISPRA)[source]#

Bases: object

Build regions from TSS annotations and window specifications.

This is the main interface for region declaration.

Example:
>>> builder = RegionBuilder(GenomeBuild.HG38, Modality.CRISPRA)
>>> builder.add_tss(TSSAnnotation(...))
>>> builder.add_window(Window(upstream=400, downstream=50))
>>> region_set = builder.build("RAI1")

Methods

add_tss(tss)

Add a TSS annotation.

add_tss_manual(gene_symbol, chromosome, ...)

Add a TSS with manual coordinates.

add_window(window)

Add a window specification.

build([gene_symbol])

Build the region set.

use_default_window()

Use the default window for the current modality.

add_tss(tss)[source]#

Add a TSS annotation.

add_tss_manual(gene_symbol, chromosome, position, strand, transcript_id='manual', gene_id=None)[source]#

Add a TSS with manual coordinates.

add_window(window)[source]#

Add a window specification.

build(gene_symbol=None)[source]#

Build the region set.

Generates one region per (TSS, window) combination.

Args:

gene_symbol: Override gene symbol (uses first TSS if not provided).

Returns:

RegionSet with all declared regions.
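The "one region per (TSS, window) combination" rule amounts to a Cartesian product. The sketch below uses plain integers and tuples in place of the TSSAnnotation and Window classes, and the dict layout is an assumption made only for illustration.

```python
from itertools import product


def sketch_build_regions(tss_positions, windows):
    """Pair every TSS position with every (upstream, downstream) window."""
    return [
        {"tss_position": tss, "upstream": up, "downstream": down}
        for tss, (up, down) in product(tss_positions, windows)
    ]


# Two TSS hypotheses and two windows give four regions.
example_regions = sketch_build_regions([100, 200], [(400, 50), (50, 300)])
```

Because the regions come from a product, overlapping intervals are expected when TSS hypotheses are close together, consistent with the "may overlap" note on RegionSet.total_length.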

use_default_window()[source]#

Use the default window for the current modality.

phaselab.crispr.build_regions_for_gene(gene_symbol, genome_build=GenomeBuild.HG38, modality=Modality.CRISPRA, windows=None, use_canonical_only=True)[source]#

Convenience function to build regions for a gene using built-in TSS data.

Args:

gene_symbol: Gene name.
genome_build: Genome build.
modality: CRISPR modality.
windows: Custom windows (uses default if None).
use_canonical_only: Only use canonical/MANE transcripts.

Returns:

RegionSet for the gene.

Example:
>>> regions = build_regions_for_gene("RAI1", GenomeBuild.HG38, Modality.CRISPRA)
>>> print(regions.regions[0])
phaselab.crispr.get_tss_for_gene(gene_symbol, genome_build, source=TSSSource.GENCODE)[source]#

Look up TSS annotations for a gene.

Currently uses built-in data; a production deployment would query GENCODE/RefSeq annotation files instead.

Args:

gene_symbol: Gene name (e.g., “RAI1”).
genome_build: Genome build.
source: Annotation source to use.

Returns:

List of TSSAnnotation objects.

class phaselab.crispr.Nuclease(value)[source]#

Bases: Enum

Supported CRISPR nucleases.

SPCAS9 = 'SpCas9'#
SPCAS9_NG = 'SpCas9-NG'#
SACAS9 = 'SaCas9'#
ASCAS12A = 'AsCas12a'#
LBCAS12A = 'LbCas12a'#
CAS9_HF1 = 'Cas9-HF1'#
ECAS9 = 'eCas9'#
class phaselab.crispr.NucleaseConfig(name, pam_pattern, pam_side, guide_length, guide_length_range, pam_display, pam_pattern_binding=None, pam_display_binding=None)[source]#

Bases: object

Configuration for a nuclease.

Attributes:
pam_display_binding
pam_pattern_binding

Methods

get_pam_display([role])

Get PAM display string for the specified role.

get_pam_pattern([role])

Get PAM pattern for the specified role.

get_pam_display(role=NucleaseRole.CUTTING)[source]#

Get PAM display string for the specified role.

get_pam_pattern(role=NucleaseRole.CUTTING)[source]#

Get PAM pattern for the specified role.

pam_display_binding: str | None = None#
pam_pattern_binding: str | None = None#
name: str#
pam_pattern: str#
pam_side: str#
guide_length: int#
guide_length_range: Tuple[int, int]#
pam_display: str#
class phaselab.crispr.CandidateGuide(sequence, pam, chromosome, start, end, strand, region_id, nuclease, tss_relative_position=None)[source]#

Bases: object

A candidate protospacer from enumeration.

This is the raw output of PAM scanning - not yet evaluated.

Attributes:
guide_id
sequence_with_pam
tss_relative_position

Methods

to_dict()

Convert to dictionary for downstream processing.

property guide_id: str#
property sequence_with_pam: str#
to_dict()[source]#

Convert to dictionary for downstream processing.

tss_relative_position: int | None = None#
sequence: str#
pam: str#
chromosome: str#
start: int#
end: int#
strand: str#
region_id: str#
nuclease: Nuclease#
class phaselab.crispr.PAMScanner(nuclease=Nuclease.SPCAS9, role=NucleaseRole.CUTTING, guide_length=None)[source]#

Bases: object

Scan sequences for PAM sites and extract protospacers.

This is pure string scanning - no biological decisions.

v0.9.3: Added NucleaseRole parameter to select PAM stringency.

BINDING mode uses relaxed PAMs suitable for CRISPRa/CRISPRi. Added guide_length override for non-standard designs.

Example:
>>> # Cutting mode (knockout) - strict PAM
>>> scanner = PAMScanner(Nuclease.SPCAS9, role=NucleaseRole.CUTTING)
>>>
>>> # Binding mode (CRISPRa) - relaxed PAM
>>> scanner = PAMScanner(Nuclease.SACAS9, role=NucleaseRole.BINDING)
>>>
>>> # Custom guide length (some papers use 20bp with SaCas9)
>>> scanner = PAMScanner(Nuclease.SACAS9, guide_length=20)
>>> candidates = scanner.scan_sequence(sequence, region)

Methods

scan_sequence(sequence, region[, ...])

Scan a sequence for PAM sites and extract protospacers.

scan_sequence(sequence, region, sequence_offset=0)[source]#

Scan a sequence for PAM sites and extract protospacers.

v0.9.3: BINDING mode uses sliding register enumeration. CRISPRa binding is tolerant of 1-2bp shifts in guide-PAM registration, especially in GC-dense promoters where PAM-like motifs overlap. This reflects biological reality - the functional binding register can shift while maintaining activity.

Args:

sequence: DNA sequence to scan.
region: Region metadata for provenance.
sequence_offset: Offset of sequence start in genomic coords.

Returns:

List of CandidateGuide objects.
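
A minimal, self-contained sketch of forward-strand PAM scanning follows. It is not PhaseLab's implementation: it assumes a 3' NGG PAM (SpCas9-style) and 20 bp guides, and the helper name `scan_ngg_forward` and its tuple layout are hypothetical. The `sequence_offset` argument plays the role described above, shifting local indices into genomic coordinates.

```python
import re
from typing import List, Tuple


def scan_ngg_forward(sequence: str, guide_length: int = 20,
                     sequence_offset: int = 0) -> List[Tuple[str, str, int, int]]:
    """Return (protospacer, pam, start, end) for each forward-strand NGG site.

    Hypothetical helper for illustration; coordinates are 0-based, end-exclusive.
    """
    sequence = sequence.upper()
    hits = []
    # Lookahead so overlapping PAMs (e.g. "AGGG") are all reported.
    for match in re.finditer(r"(?=([ACGT]GG))", sequence):
        pam_start = match.start()
        guide_start = pam_start - guide_length
        if guide_start < 0:
            continue  # not enough upstream sequence for a full protospacer
        protospacer = sequence[guide_start:pam_start]
        hits.append((protospacer, match.group(1),
                     guide_start + sequence_offset,
                     pam_start + sequence_offset))
    return hits


demo = "ACGTACGTACGTACGTACGT" + "TGG" + "A"
hits = scan_ngg_forward(demo, sequence_offset=1000)
```

The sketch covers one strand only; a full scanner would also scan the reverse complement and record the strand.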

class phaselab.crispr.EnumerationResult(candidates, region_set, nuclease)[source]#

Bases: object

Result of guide enumeration across all regions.

Attributes:
candidates_by_region
candidates_by_strand
total_candidates

Methods

summary()

Generate summary string.

to_dict_list()

Convert candidates to list of dicts for downstream processing.

property candidates_by_region: Dict[str, int]#
property candidates_by_strand: Dict[str, int]#
summary()[source]#

Generate summary string.

to_dict_list()[source]#

Convert candidates to list of dicts for downstream processing.

property total_candidates: int#
candidates: List[CandidateGuide]#
region_set: RegionSet#
nuclease: Nuclease#
phaselab.crispr.enumerate_guides(region_set, sequences, nuclease=Nuclease.SPCAS9, role=NucleaseRole.CUTTING, guide_length=None)[source]#

Enumerate all candidate guides in a region set.

Args:

region_set: Regions to scan.
sequences: Dict mapping region_id to DNA sequence.
nuclease: Nuclease to use for PAM scanning.
role: NucleaseRole.CUTTING (strict PAM) or NucleaseRole.BINDING (relaxed).
guide_length: Override default guide length (e.g., 20bp with SaCas9).

Returns:

EnumerationResult with all candidates.

Example:
>>> # Knockout (cutting) - strict PAM
>>> result = enumerate_guides(regions, sequences, Nuclease.SPCAS9)
>>>
>>> # CRISPRa (binding) - relaxed PAM, 20bp guides
>>> result = enumerate_guides(
...     regions, sequences, Nuclease.SACAS9,
...     role=NucleaseRole.BINDING,
...     guide_length=20,  # Chang et al. used 20bp with SaCas9
... )
phaselab.crispr.enumerate_from_sequence(sequence, gene_symbol='input', tss_position=0, chromosome='chr1', strand='+', nuclease=Nuclease.SPCAS9, role=NucleaseRole.CUTTING, guide_length=None)[source]#

Enumerate guides directly from a sequence string.

Convenience function for testing or when working with user-provided promoter sequences.

Args:

sequence: DNA sequence to scan.
gene_symbol: Gene name for labeling.
tss_position: Position of TSS within sequence.
chromosome: Chromosome for coordinates.
strand: Strand of the gene.
nuclease: Nuclease to use.
role: NucleaseRole.CUTTING (strict) or NucleaseRole.BINDING (relaxed).
guide_length: Override default guide length (e.g., 20bp with SaCas9).

Returns:

EnumerationResult with candidates.

Example:
>>> # CRISPRa with relaxed PAM and 20bp guides
>>> result = enumerate_from_sequence(
...     promoter_seq,
...     gene_symbol="Rai1",
...     tss_position=600,
...     nuclease=Nuclease.SACAS9,
...     role=NucleaseRole.BINDING,
...     guide_length=20,  # Chang et al. used 20bp
... )
class phaselab.crispr.NucleaseRole(value)[source]#

Bases: Enum

Role determines PAM strictness.

CUTTING: Nuclease must cleave DNA. Requires canonical PAM.

Used for: knockout, HDR, base editing at cut site

BINDING: Nuclease only needs to bind (dCas9). Tolerates non-canonical PAMs.

Used for: CRISPRa, CRISPRi, epigenome editing

This distinction is critical because CRISPRa papers routinely use guides that would fail canonical PAM validation but work experimentally, since dCas9 binding tolerates far more PAM variation than Cas9 cutting does.

CUTTING = 'cutting'#
BINDING = 'binding'#
class phaselab.crispr.DesignResult(ranked_guides, tiers, gene_symbol, genome_build, modality, nuclease, policy, region_set=None, enumeration=None, total_enumerated=0, total_after_gates=0, manifest=<factory>)[source]#

Bases: object

Complete result of guide design pipeline.

Contains all intermediate outputs for transparency.

Attributes:
enumeration
region_set
tier_a_guides

Get Tier A guides (0/0/0 off-targets).

tier_b_guides

Get Tier B guides.

top_guide

Get the top-ranked guide.

Methods

summary()

Generate human-readable summary.

to_dataframe()

Convert ranked guides to pandas DataFrame (if pandas available).

enumeration: EnumerationResult | None = None#
region_set: RegionSet | None = None#
summary()[source]#

Generate human-readable summary.

property tier_a_guides: List[Dict[str, Any]]#

Get Tier A guides (0/0/0 off-targets).

property tier_b_guides: List[Dict[str, Any]]#

Get Tier B guides.

to_dataframe()[source]#

Convert ranked guides to pandas DataFrame (if pandas available).

property top_guide: Dict[str, Any] | None#

Get the top-ranked guide.

total_after_gates: int = 0#
total_enumerated: int = 0#
ranked_guides: List[Dict[str, Any]]#
tiers: Dict[str, List[Dict[str, Any]]]#
gene_symbol: str#
genome_build: GenomeBuild#
modality: Modality#
nuclease: Nuclease#
policy: RankingPolicy#
manifest: Dict[str, Any]#
class phaselab.crispr.BenchmarkResult(gene_symbol, published_guides, phaselab_rankings, passed, pass_rate, details)[source]#

Bases: object

Result of benchmarking PhaseLab against published guides.

Methods

summary()

summary()[source]#
gene_symbol: str#
published_guides: List[Dict[str, Any]]#
phaselab_rankings: List[Dict[str, Any]]#
passed: bool#
pass_rate: float#
details: List[str]#
phaselab.crispr.design_crispra_guides(gene_symbol, promoter_sequence=None, tss_position=None, genome_build=GenomeBuild.HG38, nuclease=Nuclease.SPCAS9, policy=RankingPolicy.BINDING_STRICT, window=None, check_u6=True, relaxed_pam=True, guide_length=None)[source]#

Design CRISPRa guides for a target gene.

This is the main entry point for CRISPRa guide design.

v0.9.3: CRISPRa uses BINDING mode by default (relaxed PAM constraints).

This is experimentally validated - dCas9 binding tolerates non-canonical PAMs that would never support cutting. Guide length can be overridden (some papers use 20bp with SaCas9).

Args:

gene_symbol: Target gene name (e.g., “RAI1”).
promoter_sequence: Promoter DNA sequence (required).
tss_position: Position of TSS within promoter_sequence.
genome_build: Genome build for annotation lookup.
nuclease: CRISPR nuclease to use.
policy: Ranking policy (default: BINDING_STRICT for CRISPRa).
window: Custom window (uses default CRISPRa window if None).
check_u6: Check U6/Pol III compatibility.
relaxed_pam: Use binding-mode PAM (default True for CRISPRa).
guide_length: Override default guide length (e.g., 20bp with SaCas9).

Returns:

DesignResult with ranked guides.

Example:
>>> # Match Chang et al. 2022 parameters
>>> result = design_crispra_guides(
...     gene_symbol="Rai1",
...     promoter_sequence=rai1_promoter,
...     tss_position=600,
...     nuclease=Nuclease.SACAS9,
...     guide_length=20,  # They used 20bp guides
... )
>>> for guide in result.tier_a_guides[:5]:
...     print(f"{guide['sequence']} (Tier A)")
phaselab.crispr.design_crispra_guides_v2(gene_symbol, promoter_sequence=None, tss_position=None, genome_build=GenomeBuild.HG38, nuclease=Nuclease.SPCAS9, policy=RankingPolicy.BINDING_STRICT, window=None, check_u6=True, relaxed_pam=True, guide_length=None)#

Design CRISPRa guides for a target gene.

This is the main entry point for CRISPRa guide design.

v0.9.3: CRISPRa uses BINDING mode by default (relaxed PAM constraints).

This is experimentally validated - dCas9 binding tolerates non-canonical PAMs that would never support cutting. Guide length can be overridden (some papers use 20bp with SaCas9).

Args:

gene_symbol: Target gene name (e.g., “RAI1”).
promoter_sequence: Promoter DNA sequence (required).
tss_position: Position of TSS within promoter_sequence.
genome_build: Genome build for annotation lookup.
nuclease: CRISPR nuclease to use.
policy: Ranking policy (default: BINDING_STRICT for CRISPRa).
window: Custom window (uses default CRISPRa window if None).
check_u6: Check U6/Pol III compatibility.
relaxed_pam: Use binding-mode PAM (default True for CRISPRa).
guide_length: Override default guide length (e.g., 20bp with SaCas9).

Returns:

DesignResult with ranked guides.

Example:
>>> # Match Chang et al. 2022 parameters
>>> result = design_crispra_guides(
...     gene_symbol="Rai1",
...     promoter_sequence=rai1_promoter,
...     tss_position=600,
...     nuclease=Nuclease.SACAS9,
...     guide_length=20,  # They used 20bp guides
... )
>>> for guide in result.tier_a_guides[:5]:
...     print(f"{guide['sequence']} (Tier A)")
phaselab.crispr.design_knockout_guides_v2(gene_symbol, exon_sequence, nuclease=Nuclease.SPCAS9, policy=RankingPolicy.CUTTING_STRICT, check_u6=True)#

Design knockout guides for a target gene.

For knockout, we typically target early exons.

Args:

gene_symbol: Target gene name.
exon_sequence: Exon DNA sequence to target.
nuclease: CRISPR nuclease.
policy: Ranking policy (default: CUTTING_STRICT for safety).
check_u6: Check U6/Pol III compatibility.

Returns:

DesignResult with ranked guides.

phaselab.crispr.evaluate_candidate(candidate, check_u6=True)[source]#

Evaluate a candidate guide for sequence quality.

This applies local heuristics. Full off-target evaluation requires CRISPOR integration.

Args:

candidate: CandidateGuide from enumeration.
check_u6: Check U6/Pol III compatibility.

Returns:

Dict with evaluation metrics.

phaselab.crispr.evaluate_candidates(candidates, check_u6=True)[source]#

Evaluate a list of candidates.

phaselab.crispr.benchmark_against_published(published_guides, design_result, require_tier_a=False, require_top_n=10)[source]#

Benchmark PhaseLab rankings against published experimental results.

Args:

published_guides: List of dicts with ‘sequence’ and ‘experimental_winner’.
design_result: DesignResult from design_crispra_guides or similar.
require_tier_a: Require winners to be Tier A.
require_top_n: Require winners to be in top N.

Returns:

BenchmarkResult with pass/fail analysis.

Example:
>>> published = [
...     {'sequence': 'CCTGGCACCCGAGGCCACGA', 'experimental_winner': True},
...     {'sequence': 'GTCTAAGTCCAAAATCCTCA', 'experimental_winner': False},
... ]
>>> result = benchmark_against_published(published, design_result)
>>> print(result.summary())
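
The top-N requirement can be pictured with the sketch below. It is an illustration under stated assumptions, not the library's scoring: the helper `winners_in_top_n` is hypothetical, and the pass rate is assumed to be the fraction of experimental winners found among the first N ranked guides.

```python
from typing import Any, Dict, List, Tuple


def winners_in_top_n(published: List[Dict[str, Any]],
                     ranked: List[Dict[str, Any]],
                     top_n: int = 10) -> Tuple[float, List[str]]:
    """Return (pass_rate, missing_winners) for experimental winners.

    Assumption: pass_rate = winners found in the top N / total winners.
    """
    top_sequences = {g["sequence"].upper() for g in ranked[:top_n]}
    winners = [p["sequence"].upper() for p in published
               if p.get("experimental_winner")]
    if not winners:
        return 0.0, []
    missing = [w for w in winners if w not in top_sequences]
    return (len(winners) - len(missing)) / len(winners), missing


published = [
    {"sequence": "CCTGGCACCCGAGGCCACGA", "experimental_winner": True},
    {"sequence": "GTCTAAGTCCAAAATCCTCA", "experimental_winner": False},
]
ranked = [{"sequence": "CCTGGCACCCGAGGCCACGA"},
          {"sequence": "GTCTAAGTCCAAAATCCTCA"}]
pass_rate, missing = winners_in_top_n(published, ranked, top_n=1)
```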
class phaselab.crispr.BindingEnergyResult(delta_E, absolute_E, coherence, n_qubits, n_paulis, evidence, method, details)[source]#

Bases: object

Result from quantum binding energy calculation.

Attributes:

delta_E: Relative binding energy (negative = more favorable)
absolute_E: Absolute ground state energy (from VQE)
coherence: IR coherence R̄ from VQE
n_qubits: Number of qubits used
n_paulis: Number of Pauli terms in Hamiltonian
evidence: Evidence level (“QUANTUM”, “CLASSICAL”, “HEURISTIC”)
method: Method used for calculation
details: Additional calculation details

Methods

is_go()

Check if coherence passes GO threshold (e^-2 ≈ 0.135).

is_go()[source]#

Check if coherence passes GO threshold (e^-2 ≈ 0.135).

delta_E: float#
absolute_E: float#
coherence: float#
n_qubits: int#
n_paulis: int#
evidence: str#
method: str#
details: Dict[str, Any]#
phaselab.crispr.compute_binding_energy(guide_seq, reference_seq=None, use_quantum=True, use_pyscf=False, flanking_bp=10)[source]#

Compute relative binding energy for a guide sequence.

This is the main entry point for Path A: Binding Energy Landscape.

Args:

guide_seq: 20bp guide RNA sequence
reference_seq: Reference guide for ΔE calculation (if None, uses internal ref)
use_quantum: Use quantum VQE (ATLAS-Q) if available
use_pyscf: Try PySCF quantum chemistry for Hamiltonian
flanking_bp: Flanking bases for context

Returns:

BindingEnergyResult with ΔE and coherence

Example:
>>> result = compute_binding_energy("ATCGATCGATCGATCGATCG")
>>> print(f"ΔE = {result.delta_E:.4f}, R̄ = {result.coherence:.3f}")
>>> if result.is_go():
...     print("Coherent binding - GO")
phaselab.crispr.compute_binding_landscape(guides, reference_guide=None, use_quantum=True, use_pyscf=False)[source]#

Compute binding energy landscape for multiple guides.

This ranks guides by their relative binding energetics.

Args:

guides: List of guide sequences
reference_guide: Reference for ΔE calculation (if None, uses first guide)
use_quantum: Use quantum VQE if available
use_pyscf: Use PySCF quantum chemistry (slower but more accurate)

Returns:

List of BindingEnergyResult sorted by ΔE (most favorable first)

Example:
>>> guides = ["ATCGATCGATCGATCGATCG", "GCGCGCGCGCGCGCGCGCGC"]
>>> results = compute_binding_landscape(guides, use_pyscf=False)  # Fast
>>> for r in results:
...     print(f"{r.details['guide_fragment']}: ΔE={r.delta_E:.4f}")
phaselab.crispr.rank_guides_by_binding(guides, use_quantum=True)[source]#

Add binding energy ranking to existing guide data.

This can be used as a post-processing step after standard guide design.

Args:

guides: List of guide dicts (must have ‘sequence’ key)
use_quantum: Use quantum VQE if available

Returns:

Guides with added ‘binding_energy’ field

Example:
>>> from phaselab.crispr import design_crispra_guides
>>> result = design_crispra_guides(...)
>>> ranked = rank_guides_by_binding(result.ranked_guides)
class phaselab.crispr.PhaseAlignmentResult(phase_coherence, enhancement_factor, critical_window, phase_velocity, evidence, details)[source]#

Bases: object

Result from transcriptional phase alignment analysis.

Attributes:

phase_coherence: R̄ at TSS after perturbation (higher = better)
enhancement_factor: Predicted fold-change in transcription
critical_window: Whether guide is in optimal TSS window (-400 to -50)
phase_velocity: Rate of phase propagation to TSS
evidence: Evidence level
details: Additional calculation details

Methods

is_go()

Check if phase coherence passes GO threshold (e^-2 ≈ 0.135).

is_go()[source]#

Check if phase coherence passes GO threshold (e^-2 ≈ 0.135).

phase_coherence: float#
enhancement_factor: float#
critical_window: bool#
phase_velocity: float#
evidence: str#
details: Dict[str, Any]#
phaselab.crispr.compute_phase_alignment(guide_sequence, promoter_sequence, tss_position, guide_position, perturbation_strength=1.0)[source]#

Compute transcriptional phase alignment score for a guide.

This is the main entry point for Path B: Transcriptional Phase Alignment.

Args:

guide_sequence: Guide RNA sequence (for verification)
promoter_sequence: Full promoter DNA sequence
tss_position: Position of TSS in promoter_sequence
guide_position: Start position of guide binding

Returns:

PhaseAlignmentResult with coherence and enhancement prediction

Example:
>>> result = compute_phase_alignment(
...     guide_sequence="ATCGATCGATCGATCGATCG",
...     promoter_sequence=promoter_seq,
...     tss_position=500,
...     guide_position=200,
... )
>>> print(f"Phase coherence: {result.phase_coherence:.3f}")
>>> print(f"Enhancement: {result.enhancement_factor:.2f}×")
phaselab.crispr.compute_phase_landscape(guides, promoter_sequence, tss_position)[source]#

Compute phase alignment for multiple guides.

Args:

guides: List of guide dicts with ‘sequence’ and ‘position’ keys
promoter_sequence: Full promoter DNA sequence
tss_position: Position of TSS

Returns:

Guides with added ‘phase_alignment’ field

Example:
>>> guides = [{'sequence': 'ATG...', 'position': 200}, ...]
>>> results = compute_phase_landscape(guides, promoter, tss_position=500)
phaselab.crispr.rank_guides_by_phase(guides, promoter_sequence, tss_position)[source]#

Rank guides by transcriptional phase alignment.

This can be used as a post-processing step after standard guide design.

Args:

guides: List of guide dicts
promoter_sequence: Full promoter DNA
tss_position: TSS position

Returns:

Guides sorted by phase coherence (highest first)

phaselab.crispr.optimal_guide_position(promoter_sequence, tss_position, search_window=(-400, -50), step_size=20)[source]#

Find optimal guide position by scanning phase landscape.

Args:

promoter_sequence: Full promoter sequence
tss_position: TSS position
search_window: Window relative to TSS to search
step_size: Position step size (bp)

Returns:

Dict with optimal position and predicted enhancement

Example:
>>> result = optimal_guide_position(promoter, tss_position=500)
>>> print(f"Best position: {result['position']} ({result['enhancement']:.2f}×)")
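
How a TSS-relative window and step size translate into scan positions can be sketched as follows. The helper `candidate_positions` is hypothetical, and the sketch assumes inclusive window bounds, offsets added to tss_position, and that starts falling outside the promoter are dropped; the library may handle these details differently.

```python
from typing import List, Tuple


def candidate_positions(sequence_length: int, tss_position: int,
                        search_window: Tuple[int, int] = (-400, -50),
                        step_size: int = 20,
                        guide_length: int = 20) -> List[int]:
    """Absolute guide start positions covered by a TSS-relative window."""
    lo, hi = search_window
    positions = []
    for offset in range(lo, hi + 1, step_size):
        start = tss_position + offset
        # Keep only guides that fit entirely inside the promoter sequence.
        if 0 <= start and start + guide_length <= sequence_length:
            positions.append(start)
    return positions


positions = candidate_positions(sequence_length=1000, tss_position=500)
```

Each position would then be scored (for example by phase alignment) and the best one reported.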
class phaselab.crispr.OffTargetGeometryResult(R_bar_on, R_bar_off_mean, R_bar_off_std, delta_R, specificity_score, n_offtargets, evidence, details)[source]#

Bases: object

Result from off-target geometry analysis.

Attributes:

R_bar_on: Coherence for on-target binding
R_bar_off_mean: Mean coherence for off-target ensemble
R_bar_off_std: Std of off-target coherence
delta_R: Coherence contrast (R̄_on - R̄_off_mean)
specificity_score: Combined specificity metric (0-100)
n_offtargets: Number of off-targets analyzed
evidence: Evidence level
details: Additional calculation details

Methods

is_go()

Check if on-target coherence passes GO threshold.

is_specific([threshold])

Check if coherence contrast indicates good specificity.

is_go()[source]#

Check if on-target coherence passes GO threshold.

is_specific(threshold=0.3)[source]#

Check if coherence contrast indicates good specificity.

R_bar_on: float#
R_bar_off_mean: float#
R_bar_off_std: float#
delta_R: float#
specificity_score: float#
n_offtargets: int#
evidence: str#
details: Dict[str, Any]#
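
The coherence contrast delta_R = R̄_on - R̄_off_mean can be computed as in the sketch below. The helper `coherence_contrast` is hypothetical; treating "specific" as delta_R greater than the threshold and using the population standard deviation are both assumptions, since neither is spelled out in this reference.

```python
from statistics import mean, pstdev
from typing import List, Tuple


def coherence_contrast(r_bar_on: float, r_bar_off: List[float],
                       threshold: float = 0.3) -> Tuple[float, float, float, bool]:
    """Return (off_mean, off_std, delta_R, is_specific)."""
    off_mean = mean(r_bar_off)
    off_std = pstdev(r_bar_off)  # population std; the library's choice is not documented
    delta_r = r_bar_on - off_mean
    # Assumption: "specific" means the contrast exceeds the threshold.
    return off_mean, off_std, delta_r, delta_r > threshold


off_mean, off_std, delta_r, specific = coherence_contrast(0.9, [0.4, 0.5, 0.6])
```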
phaselab.crispr.compute_offtarget_geometry(guide_seq, offtargets=None, use_quantum=True)[source]#

Compute off-target landscape geometry for a guide.

This is the main entry point for Path C: Off-Target Landscape Geometry.

Args:

guide_seq: Guide RNA sequence (20bp)
offtargets: List of off-target dicts with ‘sequence’ key. If None, generates synthetic off-targets.
use_quantum: Use ATLAS-Q for coherence if available

Returns:

OffTargetGeometryResult with coherence contrast

Example:
>>> # With real CRISPOR off-targets
>>> offtargets = [{'sequence': 'ATCGATCGATCGATCGATCC', 'mismatches': 1}, ...]
>>> result = compute_offtarget_geometry(guide, offtargets)
>>> print(f"Coherence contrast: {result.delta_R:.3f}")
>>>
>>> # With synthetic off-targets (for testing)
>>> result = compute_offtarget_geometry(guide, offtargets=None)
phaselab.crispr.compute_geometry_landscape(guides, offtargets_per_guide=None, use_quantum=True)[source]#

Compute off-target geometry for multiple guides.

Args:

guides: List of guide dicts with ‘sequence’ key
offtargets_per_guide: Dict mapping guide sequence to off-targets
use_quantum: Use ATLAS-Q if available

Returns:

Guides with added ‘offtarget_geometry’ field

phaselab.crispr.rank_guides_by_geometry(guides, offtargets_per_guide=None, use_quantum=True)[source]#

Rank guides by off-target geometry specificity.

Args:

guides: List of guide dicts
offtargets_per_guide: Real off-targets from CRISPOR
use_quantum: Use ATLAS-Q if available

Returns:

Guides sorted by specificity score (highest first)

phaselab.crispr.integrate_crispor_offtargets(crispor_results)[source]#

Convert CRISPOR results to off-target format for geometry analysis.

Args:

crispor_results: Output from CrisporValidator or parse_crispor_results

Returns:

Dict mapping guide sequence to list of off-target dicts

class phaselab.crispr.MultiEvidenceResult(combined_score, binding_result, phase_result, geometry_result, individual_scores, evidence_levels, is_go, recommendation, details=<factory>)[source]#

Bases: object

Result from multi-evidence scoring.

Attributes:

combined_score: Fused score from all paths (0-1)
binding_result: Result from Path A
phase_result: Result from Path B
geometry_result: Result from Path C
individual_scores: Normalized scores per path
evidence_levels: Evidence level per path
is_go: Whether combined analysis passes GO threshold
recommendation: Human-readable recommendation

combined_score: float#
binding_result: BindingEnergyResult | None#
phase_result: PhaseAlignmentResult | None#
geometry_result: OffTargetGeometryResult | None#
individual_scores: Dict[str, float]#
evidence_levels: Dict[str, str]#
is_go: bool#
recommendation: str#
details: Dict[str, Any]#
phaselab.crispr.compute_multi_evidence_score(guide_sequence, promoter_sequence=None, tss_position=None, guide_position=None, offtargets=None, weights=None, run_binding=True, run_phase=True, run_geometry=True, use_quantum=True)[source]#

Compute multi-evidence score for a CRISPRa guide.

This is the main entry point for unified guide evaluation.

Args:

guide_sequence: 20bp guide RNA sequence
promoter_sequence: Full promoter DNA (required for Path B)
tss_position: TSS position (required for Path B)
guide_position: Guide binding position (required for Path B)
offtargets: Off-target list for Path C (optional)
weights: Custom weights for evidence fusion
run_binding: Run Path A (quantum binding)
run_phase: Run Path B (phase alignment)
run_geometry: Run Path C (geometry)
use_quantum: Use ATLAS-Q where available

Returns:

MultiEvidenceResult with fused score and individual results

Example:
>>> result = compute_multi_evidence_score(
...     guide_sequence="ATCGATCGATCGATCGATCG",
...     promoter_sequence=promoter,
...     tss_position=500,
...     guide_position=200,
... )
>>> print(f"Combined: {result.combined_score:.3f}")
>>> if result.is_go:
...     print(f"RECOMMENDATION: {result.recommendation}")
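
The fusion formula is not given in this reference. One plausible reading, sketched below, is a weighted mean over the paths that actually ran, with weights renormalized when a path is skipped. The helper `fuse_scores`, the path names, and the equal default weights are all assumptions for illustration.

```python
from typing import Dict, Optional


def fuse_scores(scores: Dict[str, Optional[float]],
                weights: Optional[Dict[str, float]] = None) -> float:
    """Weighted mean of the per-path scores that are present (not None)."""
    if weights is None:
        weights = {name: 1.0 for name in scores}  # equal weights: an assumption
    available = {k: v for k, v in scores.items() if v is not None}
    total_weight = sum(weights.get(k, 0.0) for k in available)
    if total_weight <= 0:
        raise ValueError("no scored path with positive weight")
    return sum(weights.get(k, 0.0) * v for k, v in available.items()) / total_weight


combined = fuse_scores(
    {"binding": 0.8, "phase": 0.6, "geometry": None},  # Path C skipped
    weights={"binding": 2.0, "phase": 1.0, "geometry": 1.0},
)
```

Renormalizing keeps the combined score in the 0-1 range when individual scores are in that range, even if a path is disabled via run_binding, run_phase, or run_geometry.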
phaselab.crispr.rank_guides_multi_evidence(guides, promoter_sequence=None, tss_position=None, offtargets_per_guide=None, weights=None, use_quantum=True)[source]#

Rank multiple guides using multi-evidence scoring.

Args:

guides: List of guide dicts with ‘sequence’ and optionally ‘position’
promoter_sequence: Full promoter DNA
tss_position: TSS position
offtargets_per_guide: Dict mapping guide sequence to off-targets
weights: Custom weights
use_quantum: Use ATLAS-Q

Returns:

Guides sorted by combined score (highest first)

Example:
>>> from phaselab.crispr import design_crispra_guides
>>> result = design_crispra_guides(...)
>>> ranked = rank_guides_multi_evidence(
...     result.ranked_guides,
...     promoter_sequence=promoter,
...     tss_position=500,
... )
>>> for g in ranked[:5]:
...     print(f"{g['sequence']}: {g['multi_evidence']['combined_score']:.3f}")
class phaselab.crispr.DiscriminatorStatus(value)[source]#

Bases: Enum

Status of quantum discriminator execution.

NOT_RUN = 'not_run'#
INSUFFICIENT_GUIDES = 'insufficient_guides'#
NO_DEGENERACY = 'no_degeneracy'#
QUANTUM_SUCCESS = 'quantum_success'#
QUANTUM_FAILED = 'quantum_failed'#
class phaselab.crispr.QuantumGuideResult(guide_sequence, binding_energy, energy_uncertainty, coherence, is_go, n_qubits, n_paulis, vqe_iterations, execution_time_s, details=<factory>)[source]#

Bases: object

Result from quantum discriminator for a single guide.

Attributes:

guide_sequence: The guide RNA sequence
binding_energy: Ground-state binding energy (Hartree)
energy_uncertainty: Measurement uncertainty
coherence: IR execution coherence R̄
is_go: Whether result passes GO/NO-GO threshold
n_qubits: Number of qubits used
n_paulis: Number of Pauli terms
vqe_iterations: VQE iterations to convergence
execution_time_s: Wall-clock time

guide_sequence: str#
binding_energy: float#
energy_uncertainty: float#
coherence: float#
is_go: bool#
n_qubits: int#
n_paulis: int#
vqe_iterations: int#
execution_time_s: float#
details: Dict[str, Any]#
class phaselab.crispr.DiscriminatorResult(status, ranked_guides, energy_separations, significant_separations, classical_scores, quantum_advantage, manifest)[source]#

Bases: object

Complete result from quantum discriminator.

Attributes:

status: Overall status
ranked_guides: Guides ordered by quantum binding energy
energy_separations: Pairwise energy differences
significant_separations: Which pairs are statistically significant
classical_scores: Original classical scores for comparison
quantum_advantage: Whether quantum resolved classical degeneracy
manifest: Full execution manifest for reproducibility

Methods

summary()

Generate human-readable summary.

summary()[source]#

Generate human-readable summary.

status: DiscriminatorStatus#
ranked_guides: List[QuantumGuideResult]#
energy_separations: Dict[str, float]#
significant_separations: Dict[str, bool]#
classical_scores: Dict[str, float]#
quantum_advantage: bool#
manifest: Dict[str, Any]#
phaselab.crispr.run_quantum_discriminator(guides, dna_context, backend_name='ibm_torino', use_hardware=False, shots=1000, max_iterations=30, degeneracy_threshold=0.05, output_dir=None)[source]#

Run the quantum discriminator on a set of candidate guides.

This is the main entry point for quantum discrimination.

Only guides that pass all classical gates AND are within degeneracy threshold will be evaluated on quantum hardware.

Args:

guides: List of guide dicts with ‘sequence’ and scoring info
dna_context: DNA target context for binding
backend_name: IBM Quantum backend
use_hardware: Use real IBM Quantum (False = simulation)
shots: Shots per measurement
max_iterations: Max VQE iterations
degeneracy_threshold: Score difference for degeneracy
output_dir: Directory to save results

Returns:

DiscriminatorResult with quantum-ordered guides

Example:
>>> guides = [
...     {'sequence': 'GCGCGCGCGC...', 'combined_score': 0.95},
...     {'sequence': 'ATCGATCGAT...', 'combined_score': 0.94},
... ]
>>> result = run_quantum_discriminator(guides, dna_context)
>>> print(result.summary())
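
The degeneracy gate can be pictured with the sketch below. It assumes that guides are "degenerate" when their classical `combined_score` lies within degeneracy_threshold of the top score; the helper `degenerate_top_guides` and this exact rule are hypothetical, not taken from the library. The two empty-list cases loosely correspond to the INSUFFICIENT_GUIDES and NO_DEGENERACY statuses.

```python
from typing import Any, Dict, List


def degenerate_top_guides(guides: List[Dict[str, Any]],
                          degeneracy_threshold: float = 0.05,
                          max_guides: int = 3) -> List[Dict[str, Any]]:
    """Return the top-scoring guides that classical scores cannot separate."""
    ranked = sorted(guides, key=lambda g: g["combined_score"], reverse=True)
    if len(ranked) < 2:
        return []  # nothing to discriminate
    best = ranked[0]["combined_score"]
    tied = [g for g in ranked
            if best - g["combined_score"] <= degeneracy_threshold]
    # A single guide in the band means the classical ranking is already decisive.
    return tied[:max_guides] if len(tied) >= 2 else []


guides = [
    {"sequence": "GCGCGCGCGCGCGCGCGCGC", "combined_score": 0.95},
    {"sequence": "ATCGATCGATCGATCGATCG", "combined_score": 0.94},
    {"sequence": "GCTAGCTAGCTAGCTAGCTA", "combined_score": 0.70},
]
tied = degenerate_top_guides(guides)
```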
phaselab.crispr.design_guides_with_quantum_discriminator(gene, guides, dna_context, quantum_stage='late', quantum_backend='ibm_torino', use_hardware=False, max_quantum_guides=3)[source]#

PhaseLab API: Design guides with optional quantum discrimination.

This is the high-level API for quantum-enhanced guide design.

Quantum is:
  • Optional (only if degeneracy detected)
  • Late-stage (after classical filtering)
  • Quality-gated (GO/NO-GO enforced)

Args:

gene: Target gene symbol
guides: Pre-filtered guides from classical pipeline
dna_context: DNA target context
quantum_stage: When to use quantum (“late” or “always”)
quantum_backend: IBM Quantum backend
use_hardware: Use real hardware
max_quantum_guides: Max guides to evaluate on quantum

Returns:

Dict with final ranked guides and quantum results

Example:
>>> result = design_guides_with_quantum_discriminator(
...     gene="RAI1",
...     guides=classical_top_guides,
...     dna_context=rai1_promoter,
...     quantum_stage="late",
...     use_hardware=True,
... )

Coherence Utilities#

Shared coherence utilities for CRISPR modules.

Provides unified coherence computation for all CRISPR modalities with two modes:

  1. HEURISTIC (default): Fast, uses Hamiltonian coefficient variance as proxy
     • R̄ ≈ 0.68-0.69 typically
     • Use as tie-breaker, not primary ranking signal
     • Does NOT benefit from ATLAS-Q acceleration

  2. QUANTUM (optional): Slow, runs actual VQE simulation
     • R̄ ≈ 0.84-0.97 (matches hardware validation)
     • Use for research-grade analysis
     • ATLAS-Q provides significant speedup here

v0.6.0: Centralized coherence using quantum/coherence.py with ATLAS-Q backend.

v0.6.1: Added coherence_mode parameter for honest heuristic vs quantum distinction.

class phaselab.crispr.coherence_utils.CoherenceMode(value)[source]#

Bases: Enum

Coherence computation mode.

HEURISTIC: Fast proxy using Hamiltonian coefficient variance
  • Good for filtering/tie-breaking

  • NOT actual quantum coherence

QUANTUM: Actual VQE simulation (slow but accurate)
  • Matches hardware validation results

  • Use for research-grade analysis

HEURISTIC = 'heuristic'#
QUANTUM = 'quantum'#
phaselab.crispr.coherence_utils.compute_guide_coherence(guide_seq, mode='heuristic', use_atlas_q=True, return_full_result=False, n_shots=1000)[source]#

Compute IR coherence for a guide sequence.

TWO MODES AVAILABLE:

  1. mode="heuristic" (default, fast):
     • Uses Hamiltonian coefficient variance as proxy
     • R̄ ≈ 0.68-0.69 typically
     • Does NOT benefit from ATLAS-Q
     • Use for filtering/tie-breaking, NOT primary ranking

  2. mode="quantum" (slow, research-grade):
     • Runs actual VQE simulation on gRNA Hamiltonian
     • R̄ ≈ 0.84-0.97 (matches IBM hardware validation)
     • ATLAS-Q provides significant speedup
     • Use for final candidate validation

Args:

guide_seq: 20bp guide sequence (DNA, uppercase preferred)
mode: “heuristic” (fast proxy) or “quantum” (VQE simulation)
use_atlas_q: Use ATLAS-Q backend if available (only affects quantum mode)
return_full_result: Return full CoherenceResult instead of just R̄
n_shots: Number of measurement shots for quantum mode (default 1000)

Returns:

R̄ value [0, 1] or full CoherenceResult if return_full_result=True

Example:
>>> # Fast heuristic (default)
>>> r_bar = compute_guide_coherence("ATCGATCGATCGATCGATCG")
>>> print(f"Heuristic R̄ = {r_bar:.4f}")
>>> # Research-grade quantum
>>> r_bar = compute_guide_coherence("ATCGATCGATCGATCGATCG", mode="quantum")
>>> print(f"Quantum R̄ = {r_bar:.4f}")
phaselab.crispr.coherence_utils.compute_guide_coherence_with_details(guide_seq, mode='heuristic', use_atlas_q=True)[source]#

Compute coherence with full details.

Args:

guide_seq: 20bp guide sequence
mode: “heuristic” (fast proxy) or “quantum” (VQE simulation)
use_atlas_q: Use ATLAS-Q backend if available (only affects quantum mode)

Returns:

Tuple of (R_bar, V_phi, is_go, method)
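
The elements of the returned tuple are linked by the relations stated in the Core Module: R̄ = exp(-V_φ/2), and GO when R̄ exceeds e^-2. The sketch below derives V_φ and the GO flag from a given R̄; the helper `coherence_details` is hypothetical and does not compute R̄ from a sequence.

```python
import math
from typing import Tuple

GO_THRESHOLD = math.exp(-2)  # e^-2 ≈ 0.135


def coherence_details(r_bar: float,
                      method: str = "heuristic") -> Tuple[float, float, bool, str]:
    """Return (R_bar, V_phi, is_go, method) for a coherence score in (0, 1]."""
    if not 0.0 < r_bar <= 1.0:
        raise ValueError("R_bar must lie in (0, 1]")
    v_phi = -2.0 * math.log(r_bar)  # inverse of R_bar = exp(-V_phi / 2)
    return r_bar, v_phi, r_bar > GO_THRESHOLD, method


r_bar, v_phi, is_go, method = coherence_details(0.68)
```

Equivalently, the GO boundary R̄ = e^-2 corresponds to a phase variance of V_φ = 4.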

phaselab.crispr.coherence_utils.compute_coherence_batch(guide_sequences, mode='heuristic', use_atlas_q=True)[source]#

Compute coherence for multiple guides efficiently.

Args:

guide_sequences: List of guide sequences
mode: “heuristic” (fast proxy) or “quantum” (VQE simulation)
use_atlas_q: Use ATLAS-Q backend if available (only affects quantum mode)

Returns:

List of R̄ values

Note:
  • mode=”heuristic”: ~0.1ms per guide (recommended for screening)

  • mode=”quantum”: ~100-500ms per guide (recommended for final validation)

phaselab.crispr.coherence_utils.is_guide_coherent(guide_seq, threshold=np.float64(0.1353352832366127), mode='heuristic', use_atlas_q=True)[source]#

Check if guide meets coherence threshold.

Args:

guide_seq: 20bp guide sequence
threshold: R̄ threshold (default: e^-2 ≈ 0.135)
mode: “heuristic” (fast proxy) or “quantum” (VQE simulation)
use_atlas_q: Use ATLAS-Q backend if available (only affects quantum mode)

Returns:

True if R̄ > threshold (GO status)

phaselab.crispr.coherence_utils.get_coherence_method()[source]#

Get the coherence computation method that will be used.

Returns:

“atlas_q” if ATLAS-Q available, “native” otherwise

phaselab.crispr.coherence_utils.compute_coherence_with_zscore(guide_sequences, mode='heuristic', use_atlas_q=True)[source]#

Compute coherence for multiple guides with domain-calibrated z-scores.

Z-score provides relative ranking within a locus, which is more discriminative than absolute R̄ when most guides have similar coherence (e.g., in GC-rich regions).

Args:

guide_sequences: List of guide sequences from the same locus
mode: “heuristic” (fast proxy) or “quantum” (VQE simulation)
use_atlas_q: Use ATLAS-Q backend if available (only affects quantum mode)

Returns:

List of (R_bar, z_score) tuples:
  • R̄: Absolute coherence
  • z_score: (R̄ - mean(R̄)) / std(R̄) within the locus

Example:
>>> seqs = ["ATCGATCGATCGATCGATCG", "GCTAGCTAGCTAGCTAGCTA", ...]
>>> results = compute_coherence_with_zscore(seqs)
>>> for r_bar, zscore in results:
...     print(f"R̄={r_bar:.4f}, z={zscore:+.2f}")
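
The z-score formula above can be reproduced on precomputed R̄ values as in this sketch. The helper `zscore_within_locus` is hypothetical; using the population standard deviation and returning 0.0 when all scores are identical are assumptions, not documented behavior.

```python
from statistics import mean, pstdev
from typing import List, Tuple


def zscore_within_locus(r_bars: List[float]) -> List[Tuple[float, float]]:
    """Pair each R_bar with its z-score relative to the other guides in the locus."""
    mu = mean(r_bars)
    sigma = pstdev(r_bars)  # population std; whether the library uses ddof=0 is an assumption
    if sigma == 0:
        return [(r, 0.0) for r in r_bars]  # identical scores carry no ranking signal
    return [(r, (r - mu) / sigma) for r in r_bars]


results = zscore_within_locus([0.680, 0.685, 0.690])
```

Even when heuristic R̄ values span only 0.68-0.69, the z-scores spread the guides out enough to act as a tie-breaker.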
phaselab.crispr.coherence_utils.get_coherence_eligibility_info(mode='heuristic')[source]#

Get detailed info about coherence computation eligibility.

This helps users understand what method will be used and why.

Args:

mode: “heuristic” (default) or “quantum”

Returns:

Dict with:
  • mode: “heuristic” or “quantum”

  • method: Expected method string in CoherenceResult

  • atlas_q_available: bool

  • acceleration_active: bool (True if ATLAS-Q acceleration will be used)

  • expected_r_bar_range: Typical R̄ range for this mode

  • expected_time_per_guide: Approximate time in ms

  • reason: Human-readable explanation

Quantum Module#

PhaseLab Quantum: ATLAS-Q integration and quantum computation modes.

This module provides:
  • Quantum mode configuration (off/audit/required)

  • Integration with ATLAS-Q for IR measurement grouping

  • VQE optimization for coherence validation

  • IBM Quantum hardware support

  • GPU acceleration (when available)

Quantum Modes (v1.0.0):
  • OFF: Classical-only (default, fastest)

  • AUDIT: Classical + quantum validation on subset

  • REQUIRED: Quantum mandatory for all coherence

All features are optional and gracefully degrade if atlas-quantum is not installed.

phaselab.quantum.is_atlas_q_available()[source]#

Check if atlas-quantum is installed.

phaselab.quantum.get_atlas_q_version()[source]#

Get atlas-quantum version if installed.

class phaselab.quantum.QuantumMode(value)[source]#

Bases: Enum

Quantum computation mode.

OFF: Classical-only (default, fastest)
AUDIT: Classical + quantum validation on subset
REQUIRED: Quantum mandatory for all coherence

OFF = 'off'#
AUDIT = 'audit'#
REQUIRED = 'required'#
class phaselab.quantum.QuantumConfig(mode=QuantumMode.OFF, backend='simulator', shots=4096, audit_fraction=0.1, error_on_unavailable=True, ibm_token=None)[source]#

Bases: object

Configuration for quantum computation.

Attributes:

mode: Quantum computation mode (off/audit/required)
backend: Quantum backend to use (simulator/ibm_torino/etc.)
shots: Number of measurement shots
audit_fraction: Fraction of results to audit (for AUDIT mode)
error_on_unavailable: Raise error if required quantum unavailable
ibm_token: IBM Quantum API token (optional)

Attributes:
ibm_token

Methods

get_backend_info()

Get information about configured backend.

is_quantum_available()

Check if quantum computation is available.

should_use_quantum()

Determine if quantum should be used for this config.

__post_init__()[source]#

Validate configuration.

audit_fraction: float = 0.1#
backend: str = 'simulator'#
error_on_unavailable: bool = True#
get_backend_info()[source]#

Get information about configured backend.

ibm_token: str | None = None#
is_quantum_available()[source]#

Check if quantum computation is available.

mode: QuantumMode = 'off'#
shots: int = 4096#
should_use_quantum()[source]#

Determine if quantum should be used for this config.

phaselab.quantum.get_quantum_config()[source]#

Get the global quantum configuration.

phaselab.quantum.set_quantum_config(config)[source]#

Set the global quantum configuration.

phaselab.quantum.set_quantum_mode(mode)[source]#

Set the global quantum mode.

Args:

mode: QuantumMode enum or string (“off”, “audit”, or “required”)

Example:
>>> from phaselab.quantum import set_quantum_mode, QuantumMode
>>> set_quantum_mode("audit")  # String form
>>> set_quantum_mode(QuantumMode.AUDIT)  # Enum form
phaselab.quantum.get_quantum_mode()[source]#

Get the current quantum mode as QuantumMode enum.

phaselab.quantum.configure_quantum(mode='off', backend='simulator', shots=4096, ibm_token=None)[source]#

Configure quantum computation settings.

Args:

mode: “off”, “audit”, or “required”
backend: Backend name (simulator, ibm_torino, etc.)
shots: Number of measurement shots
ibm_token: IBM Quantum API token

Returns:

QuantumConfig object

Example:
>>> config = configure_quantum(
...     mode="required",
...     backend="ibm_torino",
...     shots=8192,
... )
phaselab.quantum.quantum_status()[source]#

Get comprehensive quantum status.

Returns:

Dictionary with quantum configuration and availability.

phaselab.quantum.__getattr__(name)[source]#

Lazy import submodules.

PhaseLab Quantum Coherence: Real circular statistics via ATLAS-Q.

Replaces the heuristic coherence calculation with proper circular statistics from Pauli expectation values:

  1. Map ⟨P⟩ ∈ [-1, 1] → φ = arccos(⟨P⟩) ∈ [0, π]

  2. Compute mean phasor: ⟨e^(iφ)⟩

  3. R̄ = |⟨e^(iφ)⟩| (mean resultant length)

  4. V_φ = -2 ln(R̄) (circular variance)

This is the validated IR coherence metric from ATLAS-Q hardware runs.
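
The four steps listed above can be sketched in plain Python. The function name `coherence_from_expectations` and the clamping of inputs to [-1, 1] are illustrative assumptions; this is not the ATLAS-Q or PhaseLab implementation.

```python
import cmath
import math


def coherence_from_expectations(expectations):
    """Return (R_bar, V_phi) by following steps 1-4 of the module description."""
    # Step 1: map <P> to a phase via arccos (clamped first; clamping is an assumption).
    phases = [math.acos(max(-1.0, min(1.0, p))) for p in expectations]
    # Step 2: mean phasor <e^(i*phi)>.
    mean_phasor = sum(cmath.exp(1j * phi) for phi in phases) / len(phases)
    # Step 3: mean resultant length R_bar.
    r_bar = abs(mean_phasor)
    # Step 4: V_phi = -2 ln(R_bar), infinite when R_bar is exactly 0.
    v_phi = -2.0 * math.log(r_bar) if r_bar > 0.0 else math.inf
    return r_bar, v_phi


r_bar, v_phi = coherence_from_expectations([0.9, 0.85, 0.88, 0.92])
print(f"R_bar={r_bar:.4f}, V_phi={v_phi:.4f}, GO={r_bar > math.exp(-2)}")
```

Tightly clustered expectation values map to nearly equal phases, so the phasors add almost in phase and R̄ stays close to 1; values spread across [-1, 1] cancel and push R̄ toward 0.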

class phaselab.quantum.coherence.CoherenceResult(R_bar, V_phi, is_go, n_measurements, method)[source]#

Bases: object

Result from coherence calculation.

Attributes:

R_bar: Mean resultant length [0, 1]
V_phi: Circular variance [0, ∞)
is_go: Whether R̄ > e^-2 threshold
n_measurements: Number of measurements used
method: Calculation method used

R_bar: float#
V_phi: float#
is_go: bool#
n_measurements: int#
method: str#
phaselab.quantum.coherence.compute_coherence_from_expectations(expectation_values, e2_threshold=np.float64(0.1353352832366127), use_atlas_q=True)[source]#

Compute coherence from Pauli expectation values using circular statistics.

This is the proper IR coherence calculation:

  1. Convert ⟨P⟩ → phases via arccos

  2. Compute mean resultant length R̄

  3. Derive circular variance V_φ = -2 ln(R̄)

Args:

expectation_values: Array of Pauli expectations ⟨P⟩ ∈ [-1, 1]
e2_threshold: GO/NO-GO threshold (default: e^-2 ≈ 0.135)
use_atlas_q: Use ATLAS-Q backend if available

Returns:

CoherenceResult with R̄, V_φ, and classification

Example:
>>> expectations = np.array([0.9, 0.85, 0.88, 0.92])
>>> result = compute_coherence_from_expectations(expectations)
>>> print(f"R̄ = {result.R_bar:.4f}, Status: {'GO' if result.is_go else 'NO-GO'}")
phaselab.quantum.coherence.compute_coherence_from_phases(phases, e2_threshold=np.float64(0.1353352832366127))[source]#

Compute coherence directly from phase angles.

Uses the Kuramoto order parameter: R̄ = |⟨e^(iφ)⟩|

Args:

phases: Array of phase angles in radians
e2_threshold: GO/NO-GO threshold

Returns:

CoherenceResult with R̄, V_φ, and classification

phaselab.quantum.coherence.compute_coherence_from_hamiltonian(coefficients, use_atlas_q=True)[source]#

Estimate coherence from Hamiltonian structure (heuristic).

This is a simplified method that estimates coherence from Hamiltonian coefficient variance without actual measurement. Use compute_coherence_from_expectations() for real coherence.

Args:

coefficients: Hamiltonian term coefficients
use_atlas_q: Use ATLAS-Q for improved estimation

Returns:

CoherenceResult (heuristic estimate)

phaselab.quantum.coherence.compare_coherence(sim_result, hw_result, tolerance=0.05)[source]#

Compare simulator and hardware coherence values.

Args:

sim_result: Simulator coherence result
hw_result: Hardware coherence result
tolerance: Acceptable difference for “EXCELLENT”

Returns:

(absolute_difference, agreement_level)

Circadian Module#

PhaseLab Circadian: Clock gene network models with IR coherence.

Provides:
  • Kuramoto-based circadian oscillator models

  • SMS-specific RAI1 dosage models

  • PER gene delay dynamics

  • REV-ERBα / RORα modulation

  • Therapeutic window analysis

  • Multi-tissue circadian models (v0.3.0)

  • Jet lag and shift work simulations (v0.3.0)

phaselab.circadian.simulate_sms_clock(rai1_level=0.5, t_end=240.0, dt=0.1, params=None, y0=None, random_seed=None)[source]#

Simulate the SMS circadian clock under a given RAI1 level.

Args:
rai1_level: RAI1 expression level (0.0-1.5+).
  • 0.5 = typical SMS (haploinsufficient)

  • 1.0 = normal

  • 0.6-0.8 = therapeutic target

t_end: Simulation duration in hours (default 240 = 10 days).
dt: Output time step.
params: SMSClockParams (uses defaults if None).
y0: Initial state [theta_C, theta_B, P, R, V].
random_seed: For reproducible initial conditions.

Returns:
Dictionary with:
  • t: time array

  • theta_C: CLOCK phase

  • theta_B: BMAL1 phase

  • P: PER feedback level

  • R: RORα level

  • V: REV-ERBα level

  • order_param: R̄(t) synchronization

  • final_R_bar: final synchronization score

  • classification: synchronization quality

  • go_no_go: GO/NO-GO status

class phaselab.circadian.SMSClockParams(omega_C=0.2617993877991494, omega_B=0.2670353755551324, K_base=0.6, tau_P=4.0, alpha_P=2.0, P_init=0.0, tau_R=12.0, tau_V=12.0, R_target=0.5, V_target=0.5, R_init=0.5, V_init=0.5, R_mid=0.5, R_k=0.15, V_mid=0.5, V_k=0.15, beta_R=0.5, beta_V=0.5, K0=0.6, rai1_scale_factor=1.0, include_per_cry=False, noise_strength=0.0)[source]#

Bases: object

Parameters for the Smith-Magenis Syndrome circadian model.

All frequencies are in rad/hour, time is in hours.

K0: float = 0.6#
K_base: float = 0.6#
P_init: float = 0.0#
R_init: float = 0.5#
R_k: float = 0.15#
R_mid: float = 0.5#
R_target: float = 0.5#
V_init: float = 0.5#
V_k: float = 0.15#
V_mid: float = 0.5#
V_target: float = 0.5#
alpha_P: float = 2.0#
beta_R: float = 0.5#
beta_V: float = 0.5#
include_per_cry: bool = False#
noise_strength: float = 0.0#
omega_B: float = 0.2670353755551324#
omega_C: float = 0.2617993877991494#
rai1_scale_factor: float = 1.0#
tau_P: float = 4.0#
tau_R: float = 12.0#
tau_V: float = 12.0#
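
Since all frequencies are in rad/hour, the default `omega_C` and `omega_B` values are easier to read as periods. The short sketch below converts them with T = 2π/ω; the helper `period_hours` is hypothetical, and reading `omega_C` and `omega_B` as the CLOCK and BMAL1 natural frequencies is an assumption based on the `theta_C`/`theta_B` naming in `simulate_sms_clock`.

```python
import math


def period_hours(omega_rad_per_hour: float) -> float:
    """Convert an angular frequency in rad/hour to a period in hours."""
    return 2.0 * math.pi / omega_rad_per_hour


omega_C = 0.2617993877991494  # default from the SMSClockParams signature
omega_B = 0.2670353755551324  # default from the SMSClockParams signature

print(f"omega_C period: {period_hours(omega_C):.2f} h")  # prints 24.00 h
print(f"omega_B period: {period_hours(omega_B):.2f} h")  # prints 23.53 h
```

The two defaults differ by 2%, so the uncoupled oscillators drift apart slowly and the coupling terms are what keeps them synchronized.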
phaselab.circadian.therapeutic_scan(rai1_levels=None, params=None, t_end=240.0, n_trials=3)[source]#

Scan RAI1 levels to find therapeutic window.

Simulates clock at multiple RAI1 levels to identify the boost needed to restore synchronization.

Args:

rai1_levels: List of RAI1 levels to test.
params: SMSClockParams.
t_end: Simulation duration.
n_trials: Number of trials per level (for averaging).

Returns:
Dictionary with:
  • levels: RAI1 levels tested

  • R_bars: mean synchronization at each level

  • R_bar_std: standard deviation

  • classifications: synchronization class

  • therapeutic_window: (min, max) for SYNCHRONIZED

  • optimal_level: RAI1 level with best sync

phaselab.circadian.classify_synchronization(R_bar)[source]#

Classify circadian synchronization quality.

Args:

R_bar: Order parameter (0-1).

Returns:

Classification string.

phaselab.circadian.kuramoto_order_parameter(phases)[source]#

Compute Kuramoto order parameter from phase array.

Args:

phases: Array of oscillator phases (radians).

Returns:

(R, psi) where R is magnitude and psi is mean phase.
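
The returned pair corresponds to the polar form of the mean phasor, R·e^(iψ) = (1/N) Σ_j e^(iθ_j). A minimal standard-library sketch of that definition (the name `order_parameter` is illustrative, not the PhaseLab function):

```python
import cmath
import math


def order_parameter(phases):
    """Return (R, psi) such that R * e^(i*psi) is the mean of e^(i*theta_j)."""
    z = sum(cmath.exp(1j * theta) for theta in phases) / len(phases)
    return abs(z), cmath.phase(z)


R, psi = order_parameter([0.0, math.pi / 2])
print(f"R={R:.4f}, psi={psi:.4f}")  # prints R=0.7071, psi=0.7854
```

Two oscillators a quarter cycle apart give R = 1/√2 and a mean phase halfway between them.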

phaselab.circadian.kuramoto_ode(t, phases, omegas, K)[source]#

Kuramoto ODE right-hand side.

dθ_i/dt = ω_i + Σ_j K_ij sin(θ_j - θ_i)

Args:

t: Time (not used, for ODE solver interface).
phases: Current phases of all oscillators.
omegas: Natural frequencies of oscillators.
K: Coupling matrix (N x N).

Returns:

Phase derivatives dθ/dt.
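
The right-hand side above can be written directly from the equation. The sketch below uses plain lists and a forward-Euler loop as a stand-in for a real ODE solver; the function name `kuramoto_rhs`, the parameter values, and the integration scheme are all illustrative assumptions.

```python
import math


def kuramoto_rhs(t, phases, omegas, K):
    """dtheta_i/dt = omega_i + sum_j K[i][j] * sin(theta_j - theta_i)."""
    n = len(phases)
    return [
        omegas[i] + sum(K[i][j] * math.sin(phases[j] - phases[i]) for j in range(n))
        for i in range(n)
    ]


# Two identical oscillators with symmetric coupling, started 1 rad apart.
phases = [0.0, 1.0]
omegas = [0.26, 0.26]
K = [[0.0, 0.5], [0.5, 0.0]]
dt = 0.1  # hours

for step in range(200):  # 20 simulated hours
    rates = kuramoto_rhs(step * dt, phases, omegas, K)
    phases = [p + dt * r for p, r in zip(phases, rates)]

print(f"phase gap after 20 h: {abs(phases[1] - phases[0]):.2e}")
```

With equal natural frequencies the phase gap d obeys d' = -sin(d) for this coupling matrix, so it decays toward zero and the pair locks.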

class phaselab.circadian.MultiTissueParams(tissues=<factory>, K_global=0.5, K_peripheral=0.1, light_amplitude=0.3, light_phase=0.0, feeding_amplitude=0.2, feeding_phase=1.5707963267948966, disease_tissue=None, disease_severity=0.0, noise_strength=0.01)[source]#

Bases: object

Parameters for multi-tissue circadian model.

Attributes:
disease_tissue
K_global: float = 0.5#
K_peripheral: float = 0.1#
disease_severity: float = 0.0#
disease_tissue: str | None = None#
feeding_amplitude: float = 0.2#
feeding_phase: float = 1.5707963267948966#
light_amplitude: float = 0.3#
light_phase: float = 0.0#
noise_strength: float = 0.01#
tissues: List[str]#
class phaselab.circadian.MultiTissueResult(t, phases, amplitudes, global_R_bar, tissue_R_bars, scn_peripheral_coherence, phase_delays, go_no_go, classification)[source]#

Bases: object

Results from multi-tissue simulation.

Methods

to_dict

to_dict()[source]#
t: ndarray#
phases: Dict[str, ndarray]#
amplitudes: Dict[str, ndarray]#
global_R_bar: float#
tissue_R_bars: Dict[str, float]#
scn_peripheral_coherence: float#
phase_delays: Dict[str, float]#
go_no_go: str#
classification: str#
phaselab.circadian.simulate_multi_tissue(params=None, t_end=240.0, dt=0.5, random_seed=None)[source]#

Simulate multi-tissue circadian clock.

Args:

params: MultiTissueParams (uses defaults if None)
t_end: Simulation duration in hours
dt: Output time step
random_seed: For reproducible results

Returns:

MultiTissueResult with trajectories and coherence metrics

phaselab.circadian.jet_lag_simulation(time_shift=8.0, direction='east', tissues=None, t_end=336.0)[source]#

Simulate jet lag recovery across tissues.

Args:

time_shift: Number of hours shifted
direction: “east” (phase advance) or “west” (phase delay)
tissues: Tissues to simulate
t_end: Simulation duration

Returns:

Dictionary with recovery trajectories and times

phaselab.circadian.shift_work_simulation(shift_schedule='rotating', shift_duration_days=7, tissues=None)[source]#

Simulate circadian disruption from shift work.

Args:

shift_schedule: Type of shift work pattern
shift_duration_days: Days per shift rotation
tissues: Tissues to simulate

Returns:

Dictionary with chronic disruption metrics

Integrations#

CRISPOR Integration#

PhaseLab CRISPOR Integration.

Provides integration with CRISPOR for comprehensive guide RNA design:
  • Off-target scoring (MIT, CFD)

  • On-target activity prediction (Doench 2016)

  • Genome-wide off-target enumeration

  • Combined IR coherence + specificity scoring

  • IR-enhanced off-target analysis (v0.6.0+):

      • Off-target entropy (risk distribution)

      • Coherence contrast (ΔR̄)

      • Energy spectrum analysis

      • Exonic risk flagging

Usage:

from phaselab.integrations.crispor import CrisporClient, CrisporConfig, design_guides_with_crispor

# Initialize client (CrisporClient takes a CrisporConfig, per its signature)
client = CrisporClient(CrisporConfig(crispor_path="/path/to/crispor"))

# Design guides with combined scoring
results = design_guides_with_crispor(
    sequence="ATCG...",
    crispor_client=client,
)

# For IR-enhanced off-target analysis
from phaselab.integrations.crispor import analyze_offtarget_landscape
ir_analysis = analyze_offtarget_landscape(guide_seq, r_bar, offtargets)

class phaselab.integrations.crispor.CrisporClient(config)[source]#

Bases: object

Client for running CRISPOR analyses.

CRISPOR provides:
  • On-target activity scoring (Doench 2016, Moreno-Mateos)

  • Off-target enumeration and scoring (MIT, CFD)

  • Genome-wide specificity analysis

Example:

config = CrisporConfig(crispor_path="/opt/crispor")
client = CrisporClient(config)

result = client.score_sequence(
    sequence="ATCGATCG...",
    name="my_target",
)

Attributes:
crispor_script

Methods

is_available(crispor_path)

Check if CRISPOR is available at the given path.

score_guides(guides[, include_offtargets, ...])

Score specific guide sequences.

score_sequence(sequence[, name, ...])

Run CRISPOR on a DNA sequence.

property crispor_script: Path#
static is_available(crispor_path)[source]#

Check if CRISPOR is available at the given path.

score_guides(guides, include_offtargets=True, output_dir=None)[source]#

Score specific guide sequences.

Args:

guides: List of 20bp guide sequences (without PAM)
include_offtargets: Whether to compute off-targets
output_dir: Where to save results

Returns:

CrisporOutput with paths to result files

score_sequence(sequence, name='target', include_offtargets=True, output_dir=None)[source]#

Run CRISPOR on a DNA sequence.

Args:

sequence: DNA sequence to analyze (will find all PAM sites)
name: Name for the sequence in output
include_offtargets: Whether to compute off-targets (slower but important)
output_dir: Where to save results (temp dir if None)

Returns:

CrisporOutput with paths to result files

class phaselab.integrations.crispor.CrisporConfig(crispor_path, genome='hg38', genome_dir=None, pam='NGG', max_mismatches=4, python_bin='python3', timeout=300)[source]#

Bases: object

Configuration for CRISPOR client.

Attributes:
genome_dir
genome: str = 'hg38'#
genome_dir: Path | None = None#
max_mismatches: int = 4#
pam: str = 'NGG'#
python_bin: str = 'python3'#
timeout: int = 300#
crispor_path: Path#
class phaselab.integrations.crispor.CrisporOutput(guides_tsv, offtargets_tsv, stdout, stderr, success, output_dir)[source]#

Bases: object

Output from a CRISPOR run.

guides_tsv: Path#
offtargets_tsv: Path | None#
stdout: str#
stderr: str#
success: bool#
output_dir: Path#
class phaselab.integrations.crispor.CrisporGuideRow(guide_id, sequence, pam, strand, chrom=None, start=None, end=None, doench_2016=None, moreno_mateos=None, out_of_frame=None, mit_specificity=None, cfd_specificity=None, ot_0mm=0, ot_1mm=0, ot_2mm=0, ot_3mm=0, ot_4mm=0, ot_exonic_0mm=0, ot_exonic_1mm=0, ot_exonic_2mm=0, ot_exonic_3mm=0, total_offtargets=0, raw=<factory>)[source]#

Bases: object

Parsed guide from CRISPOR output.

Attributes:
cfd_specificity
chrom
doench_2016
end
has_exonic_close_offtargets

Whether there are exonic off-targets with 0-2 mismatches.

mit_specificity
moreno_mateos
out_of_frame
start
total_close_offtargets

Off-targets with 0-2 mismatches (most dangerous).

cfd_specificity: float | None = None#
chrom: str | None = None#
doench_2016: float | None = None#
end: int | None = None#
property has_exonic_close_offtargets: bool#

Whether there are exonic off-targets with 0-2 mismatches.

mit_specificity: float | None = None#
moreno_mateos: float | None = None#
ot_0mm: int = 0#
ot_1mm: int = 0#
ot_2mm: int = 0#
ot_3mm: int = 0#
ot_4mm: int = 0#
ot_exonic_0mm: int = 0#
ot_exonic_1mm: int = 0#
ot_exonic_2mm: int = 0#
ot_exonic_3mm: int = 0#
out_of_frame: float | None = None#
start: int | None = None#
property total_close_offtargets: int#

Off-targets with 0-2 mismatches (most dangerous).

total_offtargets: int = 0#
guide_id: str#
sequence: str#
pam: str#
strand: str#
raw: Dict[str, Any]#
class phaselab.integrations.crispor.CrisporOffTarget(guide_id, chrom, start, end, strand, sequence, mismatches, mismatch_positions, cfd_score, gene=None, gene_region=None)[source]#

Bases: object

A single off-target site from CRISPOR.

Attributes:
gene
gene_region
gene: str | None = None#
gene_region: str | None = None#
guide_id: str#
chrom: str#
start: int#
end: int#
strand: str#
sequence: str#
mismatches: int#
mismatch_positions: List[int]#
cfd_score: float#
phaselab.integrations.crispor.parse_guides_tsv(path)[source]#

Parse CRISPOR guides.tsv output file.

CRISPOR column names vary between versions, so we try multiple names.

phaselab.integrations.crispor.parse_offtargets_tsv(path)[source]#

Parse CRISPOR offtargets.tsv output file.

Returns detailed information about each off-target site.

phaselab.integrations.crispor.index_guides_by_sequence(guides)[source]#

Create index of guides by sequence for fast lookup.

class phaselab.integrations.crispor.EvidenceLevel(value)[source]#

Bases: Enum

Evidence level for guide validation.

A = CRISPOR + experimental off-target + wet-lab outcome (gold standard)
B = CRISPOR metrics + IR add-ons (entropy, ΔR̄) (validated)
C = Missing CRISPOR, or relying only on heuristic coherence (suggestions only)

A = 'A'#
B = 'B'#
C = 'C'#
class phaselab.integrations.crispor.CoherenceSource(value)[source]#

Bases: Enum

Source of coherence calculation.

Makes it clear where R̄ came from and how much to trust it.

property is_quantum_measured: bool#

True if coherence came from actual quantum measurement/simulation.

HEURISTIC = 'heuristic'#
ATLAS_Q_EXPECTATIONS = 'atlas_q_expectations'#
ATLAS_Q_VQE = 'atlas_q_vqe'#
HARDWARE = 'hardware'#
UNKNOWN = 'unknown'#
class phaselab.integrations.crispor.GuideCandidate(sequence, pam, position, strand, r_bar=0.0, r_bar_std=0.0, r_bar_zscore=0.0, phase_mean=0.0, phase_concentration=0.0, coherence_source=CoherenceSource.UNKNOWN, mit_specificity=None, cfd_specificity=None, doench_2016=None, moreno_mateos=None, out_of_frame=None, ot_0mm=0, ot_1mm=0, ot_2mm=0, ot_3mm=0, ot_4mm=0, ot_entropy=0.0, ot_entropy_normalized=0.0, delta_r_bar=0.0, ot_r_bar_max=0.0, ot_energy_tail=0.0, risk_concentrated=False, ot_competitive=False, has_exonic_close_ot=False, risk_mass_close=0.0, risk_mass_distant=0.0, risk_mass_exonic=0.0, tail_risk_score=0.0, gini_coefficient=0.0, herfindahl_index=0.0, final_score=0.0, stage1_pass=True, rank=0, evidence_level=EvidenceLevel.C, crispor_validated=False, ir_analysis_done=False, has_experimental_data=False, warnings=<factory>, score_breakdown=<factory>)[source]#

Bases: object

Combined guide candidate with both PhaseLab IR and CRISPOR metrics.

This unifies quantum phase coherence with classical off-target analysis. Now includes IR-enhanced off-target metrics (v0.6.0+).

IMPORTANT: Check evidence_level before trusting scores:
  • Level A: Fully validated (can trust score)

  • Level B: CRISPOR validated + IR (can compare within category)

  • Level C: Unvalidated suggestions only (do NOT compare to A/B)

Attributes:
cfd_specificity
doench_2016
full_sequence

Guide + PAM.

mit_specificity
moreno_mateos
out_of_frame
total_close_offtargets

Off-targets with 0-2 mismatches (most dangerous).

total_offtargets

All off-targets.

Methods

compute_evidence_level()

Compute and return the evidence level based on available data.

to_dict()

Convert to dictionary for JSON serialization.

cfd_specificity: float | None = None#
coherence_source: CoherenceSource = 'unknown'#
compute_evidence_level()[source]#

Compute and return the evidence level based on available data.

crispor_validated: bool = False#
delta_r_bar: float = 0.0#
doench_2016: float | None = None#
evidence_level: EvidenceLevel = 'C'#
final_score: float = 0.0#
property full_sequence: str#

Guide + PAM.

gini_coefficient: float = 0.0#
has_exonic_close_ot: bool = False#
has_experimental_data: bool = False#
herfindahl_index: float = 0.0#
ir_analysis_done: bool = False#
mit_specificity: float | None = None#
moreno_mateos: float | None = None#
ot_0mm: int = 0#
ot_1mm: int = 0#
ot_2mm: int = 0#
ot_3mm: int = 0#
ot_4mm: int = 0#
ot_competitive: bool = False#
ot_energy_tail: float = 0.0#
ot_entropy: float = 0.0#
ot_entropy_normalized: float = 0.0#
ot_r_bar_max: float = 0.0#
out_of_frame: float | None = None#
phase_concentration: float = 0.0#
phase_mean: float = 0.0#
r_bar: float = 0.0#
r_bar_std: float = 0.0#
r_bar_zscore: float = 0.0#
rank: int = 0#
risk_concentrated: bool = False#
risk_mass_close: float = 0.0#
risk_mass_distant: float = 0.0#
risk_mass_exonic: float = 0.0#
stage1_pass: bool = True#
tail_risk_score: float = 0.0#
to_dict()[source]#

Convert to dictionary for JSON serialization.

property total_close_offtargets: int#

Off-targets with 0-2 mismatches (most dangerous).

property total_offtargets: int#

All off-targets.

sequence: str#
pam: str#
position: int#
strand: str#
warnings: List[str]#
score_breakdown: Dict[str, float]#
class phaselab.integrations.crispor.ScoringWeights(w_coherence=0.3, w_coherence_heuristic=0.05, w_mit=0.25, w_cfd=0.1, w_doench=0.2, penalty_per_close_ot=0.02, penalty_per_distant_ot=0.005, penalty_perfect_match_ot=0.5, bonus_diffuse_risk=0.05, penalty_concentrated_risk=0.1, penalty_competitive_ot=0.08, penalty_exonic_close_ot=0.15, penalty_high_affinity_ot=0.05, penalty_per_risk_mass_close=0.01, penalty_per_risk_mass_exonic=0.02, penalty_tail_risk=0.1, min_mit_score=30.0, max_perfect_offtargets=1, max_exonic_close_ot=5, max_tail_risk=0.7, min_coherence=0.8, max_close_offtargets=50, entropy_low_threshold=0.3, entropy_high_threshold=0.7, delta_r_bar_threshold=0.05, max_score_unvalidated=0.2)[source]#

Bases: object

Weights for multi-objective guide scoring.

Default weights balance safety (specificity) with efficacy (activity, coherence). Now includes IR-enhanced off-target metrics (v0.6.0+).

TWO-STAGE SCORING MODEL:
  • Stage 1: Safety gate (hard filters), rejects unsafe guides

  • Stage 2: Ranking (soft scores), ranks safe guides

This prevents unsafe guides from ever appearing in top rankings.

COHERENCE MODE (v0.6.1):
  • Quantum mode (VQE): w_coherence = 0.30 (full weight)

  • Heuristic mode: w_coherence = 0.05 (tie-breaker only)

Heuristic coherence is demoted because it is a structural proxy, not true quantum phase coherence. Use it to break ties among guides with similar CRISPOR scores, not as a primary ranking signal.

bonus_diffuse_risk: float = 0.05#
delta_r_bar_threshold: float = 0.05#
entropy_high_threshold: float = 0.7#
entropy_low_threshold: float = 0.3#
max_close_offtargets: int = 50#
max_exonic_close_ot: int = 5#
max_perfect_offtargets: int = 1#
max_score_unvalidated: float = 0.2#
max_tail_risk: float = 0.7#
min_coherence: float = 0.8#
min_mit_score: float = 30.0#
penalty_competitive_ot: float = 0.08#
penalty_concentrated_risk: float = 0.1#
penalty_exonic_close_ot: float = 0.15#
penalty_high_affinity_ot: float = 0.05#
penalty_per_close_ot: float = 0.02#
penalty_per_distant_ot: float = 0.005#
penalty_per_risk_mass_close: float = 0.01#
penalty_per_risk_mass_exonic: float = 0.02#
penalty_perfect_match_ot: float = 0.5#
penalty_tail_risk: float = 0.1#
w_cfd: float = 0.1#
w_coherence: float = 0.3#
w_coherence_heuristic: float = 0.05#
w_doench: float = 0.2#
w_mit: float = 0.25#
phaselab.integrations.crispor.check_safety_gate(guide, weights)[source]#

Stage 1: Safety gate - hard filters that reject unsafe guides.

Returns:

(passed, rejection_reasons)

phaselab.integrations.crispor.design_guides_with_crispor(sequence, crispor_client, name='target', tss_position=500, window_upstream=400, window_downstream=100, weights=None, ir_scores=None)[source]#

Full pipeline: find guides, run CRISPOR, score and rank.

Args:

sequence: Target DNA sequence
crispor_client: Configured CRISPOR client
name: Target name
tss_position: Position of TSS in the sequence (for relative coords)
window_upstream: bp upstream of TSS to search
window_downstream: bp downstream of TSS to search
weights: Scoring weights
ir_scores: Dict mapping guide sequence -> R̄ value (from PhaseLab)

Returns:

Ranked list of GuideCandidate objects

phaselab.integrations.crispor.compute_final_score(guide, weights=None, coherence_contrast=0.0)[source]#

Two-stage scoring for guide candidates.

STAGE 1: Safety gate (hard filters)
  • If failed, guide.stage1_pass = False and score is capped at 0.0

STAGE 2: Ranking (soft scores)
  • Combines IR coherence, CRISPOR metrics, off-target penalties

  • Only applied to guides that pass Stage 1

Returns:

Tuple of (score, warnings)

phaselab.integrations.crispor.merge_crispor_data(guides, crispor_output)[source]#

Merge CRISPOR results into guide candidates.

Matches guides by sequence and adds CRISPOR metrics.

phaselab.integrations.crispor.rank_guides(guides, weights=None)[source]#

Score and rank all guide candidates.

Returns guides sorted by final_score descending.

phaselab.integrations.crispor.generate_report(candidates, output_path, top_n=20)[source]#

Generate a detailed report of top guide candidates.

phaselab.integrations.crispor.save_results_json(candidates, output_path, metadata=None)[source]#

Save results to JSON for programmatic access.

class phaselab.integrations.crispor.OffTargetSite(sequence, mismatches, cfd_score, gene=None, region=None, r_bar=0.0, binding_energy=0.0, phase_mean=0.0)[source]#

Bases: object

Represents a single off-target site with IR metrics.

Attributes:
gene
is_close

0-2 mismatches are considered ‘close’ and dangerous.

is_exonic
region
binding_energy: float = 0.0#
gene: str | None = None#
property is_close: bool#

0-2 mismatches are considered ‘close’ and dangerous.

property is_exonic: bool#
phase_mean: float = 0.0#
r_bar: float = 0.0#
region: str | None = None#
sequence: str#
mismatches: int#
cfd_score: float#
class phaselab.integrations.crispor.OffTargetIRAnalysis(guide_sequence, r_bar_on_target=0.0, r_bar_off_max=0.0, delta_r_bar=0.0, entropy=0.0, entropy_normalized=0.0, energy_on_target=0.0, energy_off_mean=0.0, energy_off_max=0.0, energy_tail_risk=0.0, n_clusters=0, largest_cluster_size=0, cluster_concentration=0.0, risk_concentrated=False, off_target_competitive=False, has_exonic_risk=False, has_exonic_close_ot=False, ir_enhanced_score=0.0, n_offtargets_analyzed=0, close_offtargets=<factory>)[source]#

Bases: object

Complete IR analysis of a guide’s off-target landscape.

This provides structure-aware metrics that CRISPOR alone doesn’t compute.

Methods

to_dict

cluster_concentration: float = 0.0#
delta_r_bar: float = 0.0#
energy_off_max: float = 0.0#
energy_off_mean: float = 0.0#
energy_on_target: float = 0.0#
energy_tail_risk: float = 0.0#
entropy: float = 0.0#
entropy_normalized: float = 0.0#
has_exonic_close_ot: bool = False#
has_exonic_risk: bool = False#
ir_enhanced_score: float = 0.0#
largest_cluster_size: int = 0#
n_clusters: int = 0#
n_offtargets_analyzed: int = 0#
off_target_competitive: bool = False#
r_bar_off_max: float = 0.0#
r_bar_on_target: float = 0.0#
risk_concentrated: bool = False#
to_dict()[source]#
guide_sequence: str#
close_offtargets: List[OffTargetSite]#
phaselab.integrations.crispor.analyze_offtarget_landscape(guide_sequence, on_target_r_bar, offtargets, on_target_energy=None, delta_r_bar_threshold=0.05)[source]#

Comprehensive IR analysis of a guide’s off-target landscape.

This is the main entry point that computes all IR-enhanced metrics.

Args:

guide_sequence: 20bp guide sequence
on_target_r_bar: Pre-computed on-target coherence
offtargets: List of OffTargetSite objects from CRISPOR
on_target_energy: Pre-computed on-target binding energy (optional)
delta_r_bar_threshold: Threshold below which off-targets are “competitive”

Returns:

OffTargetIRAnalysis with all computed metrics

phaselab.integrations.crispor.compute_offtarget_entropy(offtargets, use_cfd_weights=True)[source]#

Compute Shannon entropy of off-target risk distribution.

Low entropy = risk concentrated in few sites (dangerous)
High entropy = risk diffuse across many weak sites (safer)

Args:

offtargets: List of off-target sites
use_cfd_weights: If True, weight by CFD score; otherwise by mismatches

Returns:

(entropy, normalized_entropy) where normalized is 0-1 scale
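
A minimal sketch of the entropy calculation on a list of risk weights such as CFD scores. The helper `risk_entropy`, the natural-log base, and the normalization by ln(n) are assumptions; the source only states that the normalized value lies on a 0-1 scale.

```python
import math


def risk_entropy(weights):
    """Return (H, H_norm) for non-negative risk weights such as CFD scores."""
    total = sum(weights)
    if total <= 0.0 or len(weights) < 2:
        return 0.0, 0.0
    probs = [w / total for w in weights if w > 0.0]
    entropy = -sum(p * math.log(p) for p in probs)
    # Assumption: normalize by the maximum possible entropy, ln(n).
    return entropy, entropy / math.log(len(weights))


concentrated = risk_entropy([0.9, 0.05, 0.05])  # one dominant site
diffuse = risk_entropy([0.3, 0.3, 0.3])         # risk spread evenly
print(f"concentrated: H_norm={concentrated[1]:.3f}")
print(f"diffuse:      H_norm={diffuse[1]:.3f}")
```

The dominant-site case scores well below the evenly spread case, matching the low-entropy-is-dangerous reading above.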

phaselab.integrations.crispor.compute_coherence_contrast(guide_sequence, on_target_r_bar, offtargets, top_k=20)[source]#

Compute coherence contrast: ΔR̄ = R̄_on - max(R̄_off).

This measures whether off-targets are “too good” physically.

Args:

guide_sequence: The guide RNA sequence
on_target_r_bar: Pre-computed on-target coherence
offtargets: List of off-target sites
top_k: Only analyze top K most dangerous off-targets

Returns:

(r_bar_off_max, delta_r_bar)
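
The contrast itself is a one-line subtraction once off-target R̄ values are available. The sketch below assumes those values have already been computed and adds a `competitive` flag based on the 0.05 `delta_r_bar_threshold` default from `analyze_offtarget_landscape`; the helper name and the extra return value are illustrative, not the library's API.

```python
def coherence_contrast(on_target_r_bar, off_target_r_bars, threshold=0.05):
    """Return (r_bar_off_max, delta_r_bar, competitive)."""
    # Assumption: with no off-targets, the off-target maximum is taken as 0.0.
    r_bar_off_max = max(off_target_r_bars, default=0.0)
    delta_r_bar = on_target_r_bar - r_bar_off_max
    return r_bar_off_max, delta_r_bar, delta_r_bar < threshold


off_max, delta, competitive = coherence_contrast(0.92, [0.70, 0.89, 0.60])
print(f"max off-target R_bar={off_max:.2f}, delta={delta:.2f}, competitive={competitive}")
```

Here the best off-target sits only 0.03 below the on-target coherence, so it falls under the threshold and would be flagged.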

phaselab.integrations.crispor.compute_energy_spectrum(guide_sequence, offtargets)[source]#

Compute energy spectrum of off-targets.

Returns:

(mean_energy, max_energy, tail_risk_95)

phaselab.integrations.crispor.compute_ir_enhanced_score(base_score, ir_analysis, weights=None)[source]#

Adjust a guide’s score based on IR analysis of off-targets.

This adds/subtracts from the base score based on:
  • Entropy bonus/penalty

  • Coherence contrast

  • Exonic risk

  • Energy tail risk

Args:

base_score: Initial score from standard pipeline
ir_analysis: Results from analyze_offtarget_landscape
weights: Optional custom weights

Returns:

(adjusted_score, list of adjustment reasons)

phaselab.integrations.crispor.compute_offtarget_clustering(offtargets, similarity_threshold=0.8)[source]#

Cluster off-targets by sequence similarity to detect dangerous “families”.

Uses a simple hierarchical approach based on sequence identity. Off-targets that are similar to each other may indicate a systematic vulnerability (e.g., pseudogenes, gene families).

Args:

offtargets: List of off-target sites
similarity_threshold: Minimum similarity to be in same cluster (0-1)

Returns:

(n_clusters, largest_cluster_size, cluster_concentration)
  • cluster_concentration: fraction of off-targets in largest cluster
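
One way to realize identity-based clustering is single-linkage grouping with a union-find structure, sketched below. This is a stand-in under stated assumptions (equal-length sequences, position-wise identity, single linkage); the clustering method actually used by `compute_offtarget_clustering` is not specified beyond "simple hierarchical".

```python
def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length sequences."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cluster_offtargets(seqs, similarity_threshold=0.8):
    """Single-linkage clustering; returns (n_clusters, largest, concentration)."""
    parent = list(range(len(seqs)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Merge every pair whose identity reaches the threshold.
    for i in range(len(seqs)):
        for j in range(i + 1, len(seqs)):
            if identity(seqs[i], seqs[j]) >= similarity_threshold:
                parent[find(i)] = find(j)

    sizes = {}
    for i in range(len(seqs)):
        root = find(i)
        sizes[root] = sizes.get(root, 0) + 1
    if not sizes:
        return 0, 0, 0.0
    largest = max(sizes.values())
    return len(sizes), largest, largest / len(seqs)


sites = ["AAAAAAAAAA", "AAAAAAAAAT", "AAAAAAAATT", "GGGGGGGGGG"]
print(cluster_offtargets(sites))  # prints (2, 3, 0.75)
```

The first three sequences form one family and the fourth stands alone, so three of four off-targets sit in the largest cluster.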

phaselab.integrations.crispor.compute_region_difficulty(sequence, k=3)[source]#

Compute region difficulty metrics (“soup index”).

GC-rich and repetitive regions are inherently harder to design specific guides for. This normalizes expectations.

Args:

sequence: DNA sequence to analyze
k: k-mer size for entropy calculation

Returns:

Dict with difficulty metrics:
  • gc_content: GC fraction

  • kmer_entropy: k-mer Shannon entropy (higher = more complex)

  • repeat_fraction: fraction of sequence in homopolymers

  • difficulty_index: composite score (higher = harder)
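
The three component metrics can be sketched independently of PhaseLab. The helper names, the log base (bits), the minimum homopolymer run length of 4, and the assumption of a non-empty uppercase sequence are all illustrative choices. The composite `difficulty_index` is left out because its formula is not given in this reference.

```python
import math
from collections import Counter


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a non-empty uppercase sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def kmer_entropy(seq: str, k: int = 3) -> float:
    """Shannon entropy (in bits) of the overlapping k-mer distribution."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    total = len(kmers)
    counts = Counter(kmers)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def homopolymer_fraction(seq: str, min_run: int = 4) -> float:
    """Fraction of bases inside single-base runs of at least min_run (assumed cutoff)."""
    covered, run = 0, 1
    for i in range(1, len(seq) + 1):
        if i < len(seq) and seq[i] == seq[i - 1]:
            run += 1
        else:
            if run >= min_run:
                covered += run
            run = 1
    return covered / len(seq)


region = "AAAACGTT"
print(gc_content(region), kmer_entropy(region), homopolymer_fraction(region))
```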

phaselab.integrations.crispor.compute_risk_mass(offtargets)[source]#

Compute risk mass metrics: sum(CFD) by mismatch bucket and annotation.

Risk mass is more informative than off-target counts because it weights by actual cutting probability (CFD score).

Args:

offtargets: List of off-target sites

Returns:

Dict with:
  • risk_mass_close: sum(CFD) for 0-2mm off-targets

  • risk_mass_distant: sum(CFD) for 3-4mm off-targets

  • risk_mass_exonic: sum(CFD) for exonic off-targets

  • risk_mass_by_mm: {0: sum, 1: sum, 2: sum, 3: sum, 4: sum}
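
The bucketed sums can be sketched on plain tuples. The input shape `(mismatches, cfd_score, is_exonic)` and the function name `risk_mass` are assumptions for illustration; the real function takes off-target site objects.

```python
def risk_mass(offtargets):
    """Sum CFD by mismatch bucket; offtargets holds (mismatches, cfd, is_exonic)."""
    by_mm = {mm: 0.0 for mm in range(5)}
    exonic = 0.0
    for mismatches, cfd, is_exonic in offtargets:
        by_mm[mismatches] = by_mm.get(mismatches, 0.0) + cfd
        if is_exonic:
            exonic += cfd
    return {
        "risk_mass_close": sum(by_mm[mm] for mm in (0, 1, 2)),
        "risk_mass_distant": sum(by_mm[mm] for mm in (3, 4)),
        "risk_mass_exonic": exonic,
        "risk_mass_by_mm": by_mm,
    }


sites = [(2, 0.40, True), (3, 0.10, False), (4, 0.05, False), (1, 0.20, False)]
masses = risk_mass(sites)
print({k: round(v, 2) for k, v in masses.items() if k != "risk_mass_by_mm"})
```

Two close sites carry four times the risk mass of the two distant ones here, even though the raw counts are equal.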

phaselab.integrations.crispor.compute_tail_risk(offtargets, focus_exonic=True)[source]#

Compute tail-risk score: the worst single off-target.

Tail risk dominates safety perception more than total OT count. A single high-CFD exonic off-target is more dangerous than 100 low-CFD intergenic off-targets.

Args:

offtargets: List of off-target sites
focus_exonic: If True, only consider exonic off-targets for max

Returns:

(max_cfd_score, worst_offtarget)
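
A minimal sketch of the worst-single-site selection, using dicts with hypothetical `cfd_score` and `is_exonic` keys in place of off-target objects. Returning `(0.0, None)` when no site qualifies is an assumption.

```python
def tail_risk(offtargets, focus_exonic=True):
    """Return (max_cfd_score, worst_offtarget) over the selected pool."""
    if focus_exonic:
        pool = [ot for ot in offtargets if ot["is_exonic"]]
    else:
        pool = list(offtargets)
    if not pool:
        return 0.0, None  # assumption: no qualifying site means zero tail risk
    worst = max(pool, key=lambda ot: ot["cfd_score"])
    return worst["cfd_score"], worst


sites = [
    {"id": "ot1", "cfd_score": 0.80, "is_exonic": False},
    {"id": "ot2", "cfd_score": 0.35, "is_exonic": True},
    {"id": "ot3", "cfd_score": 0.10, "is_exonic": True},
]
print(tail_risk(sites)[0], tail_risk(sites, focus_exonic=False)[0])
```

With `focus_exonic=True` the intergenic site with the highest CFD is ignored and the worst exonic site determines the score.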

phaselab.integrations.crispor.compute_concentration_measures(offtargets)[source]#

Compute concentration measures for off-target risk.

These complement entropy by capturing different aspects of risk distribution:
  • Gini coefficient: 0 = all equal, 1 = all in one site

  • Herfindahl-Hirschman Index (HHI): 0-1, higher = more concentrated

Args:

offtargets: List of off-target sites

Returns:

Dict with gini_coefficient, herfindahl_index
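
Both measures can be sketched from their textbook definitions on a list of risk weights such as CFD scores. The helper names and the zero-total fallbacks are assumptions, and PhaseLab may normalize differently.

```python
def gini(weights):
    """Gini coefficient of non-negative weights (0 = all shares equal)."""
    values = sorted(weights)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted formula over values sorted in ascending order.
    ranked = sum((i + 1) * v for i, v in enumerate(values))
    return 2.0 * ranked / (n * total) - (n + 1) / n


def hhi(weights):
    """Herfindahl-Hirschman Index: sum of squared shares."""
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum((w / total) ** 2 for w in weights)


equal = [1.0, 1.0, 1.0, 1.0]
single = [0.0, 0.0, 0.0, 1.0]
print(gini(equal), hhi(equal))    # prints 0.0 0.25
print(gini(single), hhi(single))  # prints 0.75 1.0
```

With a finite number of sites n, this form of the Gini coefficient tops out at (n - 1)/n rather than exactly 1, and the HHI bottoms out at 1/n rather than 0.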