atlas_q.linalg_robust#
Robust Linear Algebra Operations with Fallback Cascade
Provides GPU-first SVD with automatic fallback to CPU and jitter-based recovery for numerically unstable matrices.
Author: ATLAS-Q Contributors
Date: October 2025
License: MIT
Overview#
The linalg_robust module provides GPU-accelerated linear algebra operations with automatic fallback mechanisms for numerical stability. This is critical for reliable tensor network simulations, where ill-conditioned matrices can arise from accumulated roundoff errors or singular configurations.
Numerical Challenges in MPS
Matrix Product State operations require frequent Singular Value Decompositions (SVD) and QR factorizations. Challenges include:
Ill-conditioned matrices: Condition numbers κ > 10⁶ cause numerical instabilities
Near-zero singular values: singular values below ~10⁻¹⁴ (near the double-precision roundoff floor) can lead to convergence failures
GPU limitations: cuSOLVER occasionally fails on borderline cases that CPU LAPACK handles
Accumulated roundoff: Deep circuits accumulate floating-point errors
Fallback Cascade Strategy
The robust operations implement a three-tier strategy:
1. Direct decomposition on the GPU (cuSOLVER)
2. Retry on the GPU with a small diagonal jitter
3. Fall back to CPU (LAPACK)
The jitter regularization adds a tiny diagonal perturbation \(\epsilon I\) to improve conditioning without significantly altering the decomposition.
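This claim can be made precise with a standard perturbation bound (Weyl's inequality for singular values, not specific to ATLAS-Q): \(|\sigma_k(X + \epsilon I) - \sigma_k(X)| \le \|\epsilon I\|_2 = \epsilon\) for every \(k\), so a jitter of \(\epsilon \approx 10^{-12}\) shifts each singular value by at most \(10^{-12}\), far below typical truncation thresholds.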
Key Guarantees
Always succeeds: Falls back to CPU if GPU fails
Minimal overhead: ~99% of operations succeed on GPU
Diagnostic tracking: Returns which driver succeeded
Automatic integration: Used transparently by AdaptiveMPS
Functions#
| Function | Description |
|---|---|
| robust_svd(X) | Robust SVD with fallback cascade: CUDA → jitter → CPU |
| robust_qr(X) | Robust QR decomposition with fallback |
| condition_number(S) | Compute condition number from singular values |
robust_svd#
- atlas_q.linalg_robust.robust_svd(X)[source]#
Robust SVD with fallback cascade: CUDA → jitter → CPU
- Args:
X: Input tensor to decompose
- Returns:
U, S, Vh, driver_used
Strategy:
1. Try torch.linalg.svd on the GPU (cuSOLVER backend)
2. If it fails, add a small jitter and retry on the GPU
3. If it still fails, fall back to CPU SVD
4. Return which driver succeeded, for diagnostics
Performs singular value decomposition with a three-stage fallback cascade:
1. Direct CUDA SVD using the cuSOLVER backend
2. GPU SVD with a small jitter added for numerical stability
3. CPU fallback (always succeeds, slower)
Returns the decomposition along with which driver succeeded for diagnostic purposes.
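The cascade can be pictured with the following minimal sketch (illustrative only, not the module's actual implementation; in particular the jitter scaling and full_matrices=False are assumptions here):

import torch

def svd_with_fallback(X, eps=1e-12):
    # 1. Direct GPU SVD (cuSOLVER when X lives on a CUDA device)
    try:
        U, S, Vh = torch.linalg.svd(X, full_matrices=False)
        return U, S, Vh, 'torch_cuda'
    except RuntimeError:
        pass
    # 2. Retry on the GPU with a tiny diagonal jitter
    try:
        Xj = X.clone()
        Xj.diagonal()[:] += eps * torch.linalg.matrix_norm(X)
        U, S, Vh = torch.linalg.svd(Xj, full_matrices=False)
        return U, S, Vh, 'torch_cuda_jitter'
    except RuntimeError:
        pass
    # 3. CPU LAPACK fallback; move the factors back to the original device
    U, S, Vh = torch.linalg.svd(X.cpu(), full_matrices=False)
    return U.to(X.device), S.to(X.device), Vh.to(X.device), 'torch_cpu'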
robust_qr#
- atlas_q.linalg_robust.robust_qr(X)[source]#
Robust QR decomposition with fallback
- Args:
X: Input tensor to decompose
- Returns:
Q, R, driver_used
Performs QR decomposition with fallback:
1. Direct CUDA QR
2. CPU fallback if the GPU fails
Returns Q and R matrices along with driver information.
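A compact sketch of the same fallback pattern applied to QR (illustrative, not the module's exact code):

import torch

def qr_with_fallback(X):
    try:
        Q, R = torch.linalg.qr(X)        # GPU path when X lives on a CUDA device
        return Q, R, 'torch_cuda'
    except RuntimeError:
        Q, R = torch.linalg.qr(X.cpu())  # CPU fallback
        return Q.to(X.device), R.to(X.device), 'torch_cpu'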
condition_number#
- atlas_q.linalg_robust.condition_number(S)[source]#
Compute condition number from singular values
- Args:
S: Singular values (sorted descending)
- Returns:
Condition number (σ_max / σ_min)
Computes the condition number \(\kappa = \sigma_{\text{max}} / \sigma_{\text{min}}\) from singular values. Large condition numbers (> 10⁶) indicate ill-conditioned matrices that may benefit from higher precision or regularization.
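A minimal equivalent computation, assuming S is the 1-D tensor of singular values returned by robust_svd (sorted in descending order); the guard against an exactly zero smallest value is illustrative:

def kappa(S):
    # kappa = sigma_max / sigma_min for singular values sorted in descending order
    if S[-1] == 0:
        return float('inf')
    return (S[0] / S[-1]).item()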
Examples#
Basic SVD with automatic fallback:
import torch
from atlas_q.linalg_robust import robust_svd
X = torch.randn(100, 50, dtype=torch.complex64, device='cuda')
U, S, Vh, driver = robust_svd(X)
print(f"SVD succeeded using: {driver}")
print(f"Singular values: {S[:5]}")
Checking condition number:
import torch
from atlas_q.linalg_robust import robust_svd, condition_number
X = torch.randn(50, 50, dtype=torch.complex64, device='cuda')
U, S, Vh, driver = robust_svd(X)
cond = condition_number(S)
print(f"Condition number: {cond:.2e}")
if cond > 1e6:
    print("Warning: Matrix is ill-conditioned")
QR decomposition:
import torch
from atlas_q.linalg_robust import robust_qr
X = torch.randn(100, 50, dtype=torch.complex64, device='cuda')
Q, R, driver = robust_qr(X)
print(f"QR succeeded using: {driver}")
# Verify orthogonality
I = torch.matmul(Q.conj().T, Q)
error = torch.norm(I - torch.eye(50, device='cuda'))
print(f"Orthogonality error: {error:.2e}")
Integration with AdaptiveMPS:
from atlas_q.adaptive_mps import AdaptiveMPS
import torch
# AdaptiveMPS automatically uses robust_svd internally
mps = AdaptiveMPS(num_qubits=20, bond_dim=16, device='cuda')
# Apply gates - robust SVD handles any numerical issues
H = torch.tensor([[1, 1], [1, -1]], dtype=torch.complex64) / torch.sqrt(torch.tensor(2.0))
H = H.to('cuda')
for q in range(20):
    mps.apply_single_qubit_gate(q, H)
# Check statistics to see fallback usage
stats = mps.stats_summary()
print(f"GPU SVD usage: {stats['cuda_svd_pct']:.1f}%")
print(f"CPU fallback usage: {stats['cpu_fallback_pct']:.1f}%")
Error Handling#
The robust linear algebra routines never raise exceptions for numerical failures. Instead, they automatically fall back through strategies until one succeeds. This ensures simulations continue even with challenging numerical conditions.
Driver Return Values:
'torch_cuda' - Successful GPU SVD/QR
'torch_cuda_jitter' - GPU SVD with jitter regularization
'torch_cpu' - CPU fallback was required
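For example, a long-running job can log whenever a slow path is taken; the thresholds and messages below are purely illustrative:

import torch
from atlas_q.linalg_robust import robust_svd

X = torch.randn(200, 200, dtype=torch.complex64, device='cuda')
U, S, Vh, driver = robust_svd(X)
if driver == 'torch_cpu':
    print("Warning: SVD fell back to CPU LAPACK; check matrix conditioning")
elif driver == 'torch_cuda_jitter':
    print("Note: jitter regularization was needed for this SVD")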
Handling ill-conditioned matrices:
import torch
from atlas_q.linalg_robust import robust_svd, condition_number
# Create an ill-conditioned matrix
A = torch.randn(50, 50, dtype=torch.complex64, device='cuda')
A[:, 0] = A[:, 1] * 1e-10  # make the first column a tiny multiple of the second (near rank deficiency)
U, S, Vh, driver = robust_svd(A)
cond = condition_number(S)
print(f"Condition number: {cond:.2e}")
print(f"Driver used: {driver}")
# Even for a severely ill-conditioned matrix, robust_svd succeeds,
# typically via the jitter or CPU path
Performance Considerations#
Fallback Frequency
In typical MPS simulations:
torch_cuda: 98-99% of SVDs (fast GPU path)
torch_cuda_jitter: 0.5-1% (negligible overhead)
torch_cpu: 0.1-0.5% (2-10× slower but rare)
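To measure these fractions on your own workload, a simple tally over the returned driver strings works; the snippet below is illustrative (random, well-conditioned matrices will almost always take the direct GPU path):

import torch
from collections import Counter
from atlas_q.linalg_robust import robust_svd

counts = Counter()
for _ in range(1000):
    X = torch.randn(50, 50, dtype=torch.complex64, device='cuda')
    *_, driver = robust_svd(X)   # keep only the driver string
    counts[driver] += 1
for driver, n in counts.items():
    print(f"{driver}: {100 * n / 1000:.1f}%")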
Timing Comparison (50×50 complex64 matrix)

| Method | Time (µs) | Relative | Success Rate |
|---|---|---|---|
| cuSOLVER (direct) | 45 | 1.0× | 98% |
| cuSOLVER + jitter | 48 | 1.07× | 99.9% |
| CPU LAPACK | 320 | 7.1× | 100% |
The overhead of fallback logic is negligible (< 1µs).
Memory Usage
GPU path: Workspace O(mn) on GPU
CPU fallback: Temporary copy to CPU (8mn bytes for complex64)
Jitter: No additional memory (in-place modification)
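As a concrete example of the fallback copy cost: a 4096 × 4096 complex64 matrix occupies \(8 \times 4096^2 \approx 1.3 \times 10^8\) bytes (about 134 MB), all of which is transferred to host memory when the CPU path triggers.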
When Fallbacks Occur
Common scenarios triggering fallbacks:
Deep circuits (> 100 layers): Accumulated roundoff errors
Small singular values: σ_min < 10⁻¹²
Highly entangled states: κ > 10⁸
Malformed gates: Non-unitary or near-singular
Optimization Strategies
Use complex128 for deep circuits: Reduces roundoff accumulation
mps = AdaptiveMPS(num_qubits=50, bond_dim=32, device='cuda', dtype=torch.complex128)
Monitor condition numbers: Alert on κ > 10⁶
stats = mps.stats_summary()
if stats['max_condition_number'] > 1e6:
    print("Warning: Ill-conditioned matrices detected")
Periodic recanonicalization: Reset numerical errors
if step % 1000 == 0:
    mps.canonicalize(chi_max=64)  # full SVD sweep
Increase truncation threshold: eps_bond=1e-8 → 1e-6 for stability
Best Practices#
Development/Debugging
Enable verbose logging to track fallback frequency
Plot condition numbers vs. circuit depth
Test with both float32 and float64 to identify numerical issues
Production
Monitor CPU fallback rate (should be < 1%)
Alert if rate > 5% (indicates systematic numerical problems)
Use mixed precision: complex64 for most operations, complex128 for final result
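A minimal production check, assuming an AdaptiveMPS instance mps as in the earlier example and the 5% alert threshold from the guideline above:

stats = mps.stats_summary()
cpu_pct = stats['cpu_fallback_pct']
if cpu_pct > 5.0:
    # A sustained high fallback rate usually points at ill-conditioned tensors
    # or an unsuitable precision/truncation setting, not a transient glitch.
    print(f"Warning: CPU fallback rate {cpu_pct:.1f}% exceeds 5% threshold")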
Benchmarking
Compare GPU-only vs. robust implementations:
import time
import torch
from atlas_q.linalg_robust import robust_svd

A = torch.randn(100, 100, dtype=torch.complex64, device='cuda')

# GPU-only (may fail); synchronize so time.time() measures the actual kernel
torch.cuda.synchronize()
start = time.time()
U, S, Vh = torch.linalg.svd(A)
torch.cuda.synchronize()
gpu_time = time.time() - start

# Robust (always succeeds)
torch.cuda.synchronize()
start = time.time()
U, S, Vh, driver = robust_svd(A)
torch.cuda.synchronize()
robust_time = time.time() - start
print(f"GPU: {gpu_time*1e6:.1f} µs")
print(f"Robust: {robust_time*1e6:.1f} µs ({robust_time/gpu_time:.2f}× overhead)")
print(f"Driver: {driver}")
Typical overhead: < 5% for normal matrices.
Limitations#
Complex128 Performance
~2× slower than complex64 on GPU
Worth it for circuits > 50 layers or κ > 10⁸
CPU Fallback Latency
Can cause 10× slowdown if frequent (> 10%)
Consider cuQuantum backend for better GPU stability
Jitter Side Effects
Jitter ε = 10⁻¹² alters singular values by O(ε)
Negligible for typical thresholds (> 10⁻¹⁰)
May affect very high-precision requirements
Use Cases#
Critical for:
Deep quantum circuits (> 50 layers)
High-entanglement states (χ > 64)
Long-running simulations (hours)
Production systems requiring reliability
Optional for:
Shallow circuits (< 20 layers)
Low entanglement (χ < 32)
Exploratory development
See Also#
atlas_q.adaptive_mps - MPS using robust linear algebra internally
atlas_q.diagnostics - Condition number monitoring and statistics
atlas_q.truncation - Truncation strategies for numerical stability
atlas_q.cuquantum_backend - Alternative GPU backend with different stability profile
Numerical Stability - Detailed numerical stability discussion