Two-Stage Framework#
PhaseLab uses a two-stage approach to reliability assessment: feasibility inference (pre-tiling) and landscape resolution (with tiling).
Overview#
Stage |
Data Required |
Output |
Claim Level |
|---|---|---|---|
Stage I |
Structural priors only |
GO/NO-GO feasibility |
EXPLORATORY |
Stage II |
16-20 perturbations |
Resolved landscape |
CONTEXT_DEPENDENT |
Stage I: Feasibility (Pre-Tiling)#
Purpose: Should we invest in a tiling screen?
Inputs:
Gene identity and promoter region
ENCODE/Roadmap chromatin accessibility
Known structural features
Method:
from phaselab.spatial import estimate_feasibility
feasibility = estimate_feasibility(
gene_symbol='RAI1',
promoter_coords=('chr17', 17555000, 17558000),
cell_type='K562',
)
print(f"Predicted stable fraction: {feasibility.stable_fraction:.0%}")
print(f"Recommendation: {feasibility.recommendation}")
Outputs:
Predicted stable regions (candidate positions)
Expected signal-to-noise (will tiling be informative?)
GO/NO-GO for tiling investment
What Stage I CANNOT Do:
Definitively identify stable regions
Guarantee guide selection success
Replace empirical validation
Stage II: Resolution (With Tiling)#
Purpose: Resolve the actual response landscape
Inputs:
Tiling screen data (16-20+ perturbations minimum)
Position and response values
Method:
from phaselab.spatial import load_tiling_screen, analyze_tiling_coherence
# Load tiling data
landscape = load_tiling_screen(
'tiling_results.tsv',
position_col='position',
response_col='log2fc',
)
# Analyze
result = analyze_tiling_coherence(landscape)
# Check validation
if result.is_validated:
print(f"Validated correlation: r={result.profile.correlation:.3f}")
print(f"Stable regions: {len(result.stable_regions)}")
else:
print("Insufficient data for validation")
Outputs:
Validated coherence profile (empirically tested)
Definitive stable/amplifying regions
Quantified variance reduction
Why Two Stages?#
Cost-benefit optimization:
Cost Type |
Stage I |
Stage II |
|---|---|---|
Computational |
Minutes |
Minutes |
Experimental |
$0 |
~$2,000 (tiling) |
Time |
Hours |
Weeks |
Risk reduction:
Stage I filters out cases where tiling is unlikely to help:
Uniformly accessible promoters (no landscape structure)
Uniformly repressed regions (no signal)
Insufficient prior knowledge
Minimum Viable Tiling#
Stage II requires a minimum dataset:
Metric |
Minimum |
Recommended |
|---|---|---|
Perturbations |
16 |
20-30 |
Spacing |
Even across region |
25-50bp intervals |
Replicates |
2 |
3+ |
Coverage |
Full window |
±500bp of target |
Why 16-20? This provides sufficient power to:
Detect coherence-outcome correlation (r > 0.2)
Identify at least 2 stable regions
Estimate variance reduction
Transition Criteria#
Stage I → Stage II when:
# Automatic recommendation
if feasibility.recommendation == "PROCEED_TO_TILING":
# Run tiling screen
pass
# Manual override (with justification)
if feasibility.stable_fraction < 0.3 and must_proceed:
print("Warning: Low expected yield from tiling")
Stage II validation requires:
Correlation p-value < 0.05
At least one stable region identified
Variance reduction > 20%
Practical Workflow#
"""Complete two-stage workflow"""
from phaselab.spatial import (
estimate_feasibility,
load_tiling_screen,
analyze_tiling_coherence,
)
# === STAGE I ===
print("Stage I: Feasibility Assessment")
print("=" * 40)
feasibility = estimate_feasibility(
gene_symbol='MYC',
promoter_coords=('chr8', 128747680, 128750680),
cell_type='K562',
)
print(f"Predicted stable fraction: {feasibility.stable_fraction:.0%}")
print(f"Recommendation: {feasibility.recommendation}")
if feasibility.recommendation != "PROCEED_TO_TILING":
print("Stage I indicates tiling unlikely to help")
print("Consider alternative approaches")
exit()
# === RUN TILING SCREEN (wet lab) ===
# ... weeks pass ...
# === STAGE II ===
print("\nStage II: Landscape Resolution")
print("=" * 40)
landscape = load_tiling_screen('myc_tiling.tsv', 'position', 'log2fc')
result = analyze_tiling_coherence(landscape)
print(f"Validated: {result.is_validated}")
print(f"Correlation: r={result.profile.correlation:.3f}")
print(f"Stable regions: {len(result.stable_regions)}")
print(f"Claim level: {result.claim_level}")
Comparison Table#
Aspect |
Stage I |
Stage II |
|---|---|---|
Data source |
ENCODE, structure |
Your tiling data |
Confidence |
EXPLORATORY |
CONTEXT_DEPENDENT |
Stable regions |
Predicted |
Validated |
Can select guides? |
Not reliably |
Yes |
When to use |
Before experiment |
After tiling |
Summary#
Stage I tells you IF tiling will help (cheap, fast)
Stage II tells you WHERE to target (requires data)
Never skip Stage I - it saves wasted experiments
Never skip Stage II - Stage I predictions need validation
See Also#
Spatial Coherence Analysis (v1.0.0) - Core methodology
Claim Levels - Understanding confidence levels