Optimizing Bundle Adjustment with Python

Bundle adjustment is the non-linear least-squares core of photogrammetric reconstruction: it simultaneously refines camera extrinsics, intrinsics, and sparse 3D point coordinates so that every tie point reprojects as close as possible to where it was observed. For UAV operators, surveying technicians, and Python GIS developers, the transition from raw flight imagery to a georeferenced sparse cloud hinges on precise parameter control, strict resource enforcement, and stage-specific automation. This guide shows how to script a transparent, reproducible solver pipeline that prioritizes convergence stability, memory boundaries, and survey-grade output — the global-optimization stage of the broader automated image alignment and feature matching workflows.

This is an implementation-focused engineering scenario. You should already have an overlapping image block with EXIF intrinsics, a working Python 3.10+ environment, and enough RAM to hold a single survey tile’s tie-point graph (typically 8–16 GB for blocks under ~5,000 frames). Everything below runs on CPU with optional Ceres bindings for production throughput; no proprietary reconstruction engine is required.

Prerequisites

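Install the following before running any step in this guide. Pin versions in a lockfile so field and office workstations run the same solver code. Identical versions are a precondition for reproducible outputs, although floating-point results can still differ slightly across CPUs and BLAS builds.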

Library	Version	Install command	Role in the solver pipeline
Python	3.10+	system / pyenv	interpreter; `match` statements, `X | Y` type unions
numpy	1.24+	`pip install "numpy>=1.24"`	residual vectors, Jacobian assembly
scipy	1.10+	`pip install "scipy>=1.10"`	`least_squares` trust-region solver
opencv-python	4.8+	`pip install opencv-python`	SIFT extraction, RANSAC verification
pyproj	3.5+	`pip install pyproj`	CRS validation, datum transforms
pandas	2.0+	`pip install pandas`	sparse-cloud export, QC tables

Hardware floor: 4 physical cores and 16 GB RAM for tiles up to ~5,000 images. Beyond that, partition the block (Step 4) rather than scaling vertically.
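
The version floors in the table can be checked at start-up rather than trusted. The sketch below is one way to do that with the standard library only. `MINIMUMS`, `parse_version`, and `check_environment` are illustrative names, not part of any library, and the version parser is deliberately crude: it reads only the leading digits of each dotted component.

```python
from importlib import metadata

# Minimum versions copied from the prerequisites table above.
MINIMUMS = {
    "numpy": (1, 24),
    "scipy": (1, 10),
    "opencv-python": (4, 8),
    "pyproj": (3, 5),
    "pandas": (2, 0),
}


def parse_version(text):
    """Turn '1.24.3' or '2.0.0rc1' into a tuple of leading integers."""
    parts = []
    for piece in text.split("."):
        lead = ""
        for ch in piece:
            if not ch.isdigit():
                break
            lead += ch
        if not lead:
            break
        parts.append(int(lead))
    return tuple(parts)


def check_environment(minimums=MINIMUMS, get_version=metadata.version):
    """Return {package: problem}; an empty dict means every floor is met."""
    problems = {}
    for name, minimum in minimums.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            problems[name] = "not installed"
            continue
        if parse_version(installed) < minimum:
            wanted = ".".join(str(part) for part in minimum)
            problems[name] = f"{installed} is older than {wanted}"
    return problems
```

An empty result means every listed package meets its floor. A non-empty one is a reason to stop before any solver stage runs.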

How Bundle Adjustment Fits the Reconstruction Graph

Bundle adjustment never runs in isolation. It consumes conditioned tie points produced upstream and emits a refined sparse cloud consumed downstream by densification and meshing. Treating it as a discrete, validated stage — rather than a black box inside a commercial preset — is what makes the result reproducible and debuggable. The solver’s behavior is almost entirely determined by the quality of its inputs: garbage tie points and inconsistent coordinate frames cause divergence long before any tuning parameter matters.

The five steps that follow map one-to-one onto that flow. Read them in order the first time; in production they become independently testable functions wired into a single command-line entry point.
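
One possible shape for that entry point is sketched below with `argparse`. The flag names and the `build_parser` and `main` functions are assumptions chosen to mirror the parameters used later in this guide. The stage calls themselves are left out, so this shows only how settings would be collected and validated once.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Collect the pipeline settings in one place (hypothetical flag names)."""
    parser = argparse.ArgumentParser(description="Bundle adjustment pipeline (sketch)")
    parser.add_argument("image_dir", help="directory holding the overlapping image block")
    parser.add_argument("--target-epsg", type=int, default=32618)
    parser.add_argument("--chunk-size", type=int, default=500)
    parser.add_argument("--overlap", type=int, default=50)
    parser.add_argument("--max-nfev", type=int, default=500)
    parser.add_argument("--output", default="sparse_cloud.csv")
    return parser


def main(argv=None) -> dict:
    """Parse and validate settings; the five stage functions would be called from here."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if not 0 <= args.overlap < args.chunk_size:
        parser.error("--overlap must be >= 0 and smaller than --chunk-size")
    return vars(args)
```

Calling `main(["flight_01", "--chunk-size", "800"])` returns a plain dictionary, which keeps each stage testable without touching the command line.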

Step 1 — Validate the Coordinate Reference System and Sanitize Inputs

Before the solver initializes, coordinate reference system validation is non-negotiable. Inconsistent EXIF geotags, mixed EPSG codes, or uncorrected GPS drift propagate systematic errors through the Jacobian matrix, causing solver divergence or localized warping. A robust preprocessing routine parses image headers, enforces a unified projection, and strips conflicting metadata. The same datum discipline covered in managing coordinate reference systems in GDAL applies here: every coordinate that enters the solver must share one explicit, projected CRS. Flag imagery with horizontal accuracy worse than 2 m or altitude variance beyond 5% of the planned flight envelope, routing those frames to manual review before batch submission.

import logging
from pathlib import Path
from typing import List, Dict, Optional
import pyproj
from pyproj.exceptions import CRSError

logging.basicConfig(level=logging.INFO, format="%(levelname)s: %(message)s")

def validate_and_sanitize_crs(
    image_metadata: List[Dict[str, float]],
    target_epsg: int = 32618,
    h_accuracy_threshold: float = 2.0,
    alt_variance_pct: float = 0.05
) -> List[Dict[str, float]]:
    """
    Validates GPS coordinates against a target CRS and filters out 
    imagery exceeding accuracy thresholds.
    """
    try:
        target_crs = pyproj.CRS.from_epsg(target_epsg)
        transformer = pyproj.Transformer.from_crs(
            "EPSG:4326", target_crs, always_xy=True
        )
    except CRSError as e:
        logging.error(f"Invalid target CRS: {e}")
        return []

    sanitized = []
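    # Average only frames that report an altitude; a 0.0 placeholder would skew the mean
    altitudes = [m["altitude"] for m in image_metadata if m.get("altitude") is not None]
    if not altitudes:
        return []

    mean_alt = sum(altitudes) / len(altitudes)
    max_dev = abs(mean_alt) * alt_variance_pct  # abs() keeps the gate valid for negative heights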

    for idx, meta in enumerate(image_metadata):
        try:
            lat, lon = meta.get("lat"), meta.get("lon")
            if lat is None or lon is None:
                logging.warning(f"Image {idx}: Missing coordinates. Skipping.")
                continue

            # Transform to target CRS
            easting, northing = transformer.transform(lon, lat)
            h_acc = meta.get("horizontal_accuracy", 999.0)
            alt = meta.get("altitude", 0.0)

            if h_acc > h_accuracy_threshold:
                logging.warning(f"Image {idx}: Horizontal accuracy {h_acc:.2f}m exceeds threshold.")
                continue
            if abs(alt - mean_alt) > max_dev:
                logging.warning(f"Image {idx}: Altitude deviation {abs(alt - mean_alt):.2f}m exceeds limit.")
                continue

            sanitized.append({
                "id": idx,
                "easting": easting,
                "northing": northing,
                "altitude": alt,
                "crs": target_epsg
            })
        except Exception as e:
            logging.error(f"Image {idx}: CRS transformation failed: {e}")
            continue

    logging.info(f"CRS validation complete. Retained {len(sanitized)}/{len(image_metadata)} images.")
    return sanitized

Passing always_xy=True forces longitude/latitude ordering and prevents the silent axis swaps that place a block far from its true location. The function returns only frames that survive both the accuracy and altitude gates, so the solver never sees coordinates it cannot trust.
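
The function above drops rejected frames, whereas the text calls for routing them to manual review. A minimal sketch of that routing, using only the standard library and leaving the CRS transform out, could look like this. `route_frames` and the reason strings are hypothetical, and the dictionary keys are assumed to match those used in Step 1.

```python
from statistics import mean


def route_frames(frames, h_accuracy_threshold=2.0, alt_variance_pct=0.05):
    """Split frames into (accepted, review) instead of silently dropping rejects."""
    altitudes = [f["altitude"] for f in frames if f.get("altitude") is not None]
    if not altitudes:
        return [], [(f, "missing altitude") for f in frames]

    mean_alt = mean(altitudes)
    max_dev = abs(mean_alt) * alt_variance_pct

    accepted, review = [], []
    for frame in frames:
        h_acc = frame.get("horizontal_accuracy")
        alt = frame.get("altitude")
        if h_acc is None or alt is None:
            review.append((frame, "missing accuracy or altitude"))
        elif h_acc > h_accuracy_threshold:
            review.append((frame, f"horizontal accuracy {h_acc:.2f} m"))
        elif abs(alt - mean_alt) > max_dev:
            review.append((frame, f"altitude deviation {abs(alt - mean_alt):.2f} m"))
        else:
            accepted.append(frame)
    return accepted, review
```

Keeping the reason next to each rejected frame gives the reviewer something to act on, and the accepted list can be passed on unchanged.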

Step 2 — Extract and Condition Tie Points

The quality of the adjustment is fundamentally constrained by the initial feature extraction phase. SIFT, ORB, or deep-learning descriptors must be calibrated for UAV imaging conditions: motion blur, varying ground sample distance, and repetitive textures in agricultural or industrial corridors. When selecting and tuning detectors, follow the trade-offs documented in feature detection algorithms for drone imagery: prefer scale-invariant descriptors, enforce a strict Lowe’s ratio (typically 0.7–0.8), and apply geometric verification with RANSAC to eliminate false matches. A well-conditioned match set keeps the solver out of degenerate, under-constrained tie-point configurations that stall convergence in the first few iterations.

import logging
import cv2
import numpy as np
from typing import Tuple, List

def extract_and_condition_features(
    img1: np.ndarray, img2: np.ndarray,
    ratio_thresh: float = 0.75,
    ransac_reproj_thresh: float = 3.0
) -> Tuple[List[cv2.DMatch], np.ndarray]:
    """
    Extracts SIFT features, applies Lowe's ratio test, and conditions 
    matches via RANSAC homography estimation.
    """
    try:
        sift = cv2.SIFT_create()
        kp1, desc1 = sift.detectAndCompute(img1, None)
        kp2, desc2 = sift.detectAndCompute(img2, None)

        if desc1 is None or desc2 is None or len(desc1) < 4 or len(desc2) < 4:
            return [], np.array([])

        bf = cv2.BFMatcher(cv2.NORM_L2)
        knn_matches = bf.knnMatch(desc1, desc2, k=2)

        # Lowe's ratio test
        good_matches = [m for m, n in knn_matches if m.distance < ratio_thresh * n.distance]

        if len(good_matches) < 4:
            return [], np.array([])

        # RANSAC geometric verification
        pts1 = np.float32([kp1[m.queryIdx].pt for m in good_matches])
        pts2 = np.float32([kp2[m.trainIdx].pt for m in good_matches])

        _, mask = cv2.findHomography(pts1, pts2, cv2.RANSAC, ransac_reproj_thresh)
        if mask is None:
            # No consistent homography was found; treat the pair as non-overlapping
            return [], np.array([])
        # findHomography returns an (N, 1) mask; flatten before per-match indexing
        mask_flat = mask.ravel()
        inliers = [m for i, m in enumerate(good_matches) if mask_flat[i] == 1]

        logging.info(f"Conditioned matches: {len(inliers)} inliers from {len(good_matches)} candidates.")
        return inliers, mask

    except cv2.error as e:
        logging.error(f"OpenCV feature extraction failed: {e}")
        return [], np.array([])
    except Exception as e:
        logging.error(f"Unexpected error during feature conditioning: {e}")
        return [], np.array([])

The inlier list returned here is the raw material for the solver’s residual function: each surviving match becomes one observation linking a 3D point to a 2D pixel in a specific camera.
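
To make "one observation per match" concrete, the sketch below packs observations into the three index arrays a residual function typically consumes. The `Observation` class and both helpers are hypothetical. The pairing helper also assumes each match creates a brand-new 3D point, which ignores merging tracks seen in more than two images.

```python
from dataclasses import dataclass

import numpy as np


@dataclass(frozen=True)
class Observation:
    camera_idx: int  # which camera saw the point
    point_idx: int   # which 3D tie point was seen
    x: float         # observed pixel column
    y: float         # observed pixel row


def observations_from_pair(cam_a, cam_b, pts_a, pts_b, first_point_idx=0):
    """Each matched pixel pair yields one new point index and two observations."""
    observations = []
    for offset, ((xa, ya), (xb, yb)) in enumerate(zip(pts_a, pts_b)):
        pid = first_point_idx + offset
        observations.append(Observation(cam_a, pid, float(xa), float(ya)))
        observations.append(Observation(cam_b, pid, float(xb), float(yb)))
    return observations


def pack_observations(observations):
    """Convert a list of observations into flat arrays for vectorized residuals."""
    camera_indices = np.array([o.camera_idx for o in observations], dtype=int)
    point_indices = np.array([o.point_idx for o in observations], dtype=int)
    points_2d = np.array([(o.x, o.y) for o in observations], dtype=float).reshape(-1, 2)
    return camera_indices, point_indices, points_2d
```

In a real block the pixel coordinates would come from the keypoints referenced by each inlier match, via `queryIdx` and `trainIdx` as in the function above.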

Step 3 — Configure the Solver and Control Convergence

The core routine relies on robust non-linear optimization. Python developers typically interface with scipy.optimize.least_squares for prototyping, or compile bindings to Ceres Solver for production throughput. Convergence stability depends on the damping strategy, the loss function, and the iteration cap. Apply a Huber or Cauchy loss to suppress outlier influence and avoid unbounded parameter updates. When you scale the solve across many cores using parallel processing strategies for alignment, decouple Jacobian assembly from the solver step so matrix operations stay thread-safe and residual computation never races.

import logging
import numpy as np
from scipy.optimize import least_squares
from typing import Callable, Tuple

def run_bundle_solver(
    residuals_func: Callable,
    initial_params: np.ndarray,
    bounds: Tuple[np.ndarray, np.ndarray],
    max_nfev: int = 500,
    ftol: float = 1e-8,
    xtol: float = 1e-8
) -> np.ndarray:
    """
    Executes a robust non-linear least squares solver with explicit 
    bounds, convergence tolerances, and outlier-resistant loss.
    """
    try:
        result = least_squares(
            residuals_func,
            initial_params,
            bounds=bounds,
            method='trf',
            loss='huber',
            f_scale=1.0,
            max_nfev=max_nfev,
            ftol=ftol,
            xtol=xtol,
            verbose=2
        )

        if not result.success:
            logging.warning(f"Solver did not converge. Reason: {result.message}")
        # nfev counts residual evaluations, not solver iterations; log it on every run
        logging.info(f"Final cost: {result.cost:.4f}, Function evaluations: {result.nfev}")
        
        return result.x

    except ValueError as e:
        logging.error(f"Invalid solver input: {e}")
        raise
    except Exception as e:
        logging.error(f"Bundle solver execution failed: {e}")
        raise

The trf (Trust Region Reflective) method honors the parameter bounds, which is what keeps focal length and distortion coefficients inside physically plausible ranges. The f_scale argument sets the soft threshold at which the Huber loss transitions from quadratic to linear — tune it to the median reprojection residual of a converged reference solve.
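
The solver wrapper takes `residuals_func` as a given. As a hedged illustration of what such a function might contain, the sketch below assumes a simple pinhole model with six parameters per camera (a rotation vector and a translation), a single known focal length, and a principal point at the image origin. That layout is an assumption of this example, not something the pipeline prescribes. It also builds the Jacobian sparsity pattern, since each residual depends on only one camera and one point.

```python
import numpy as np
from scipy.sparse import lil_matrix


def rotate(points, rot_vecs):
    """Rotate each point by its own rotation vector (Rodrigues formula)."""
    theta = np.linalg.norm(rot_vecs, axis=1)[:, np.newaxis]
    with np.errstate(invalid="ignore"):
        axis = np.nan_to_num(rot_vecs / theta)  # zero rotation gives a zero axis
    dot = np.sum(points * axis, axis=1)[:, np.newaxis]
    cos_t, sin_t = np.cos(theta), np.sin(theta)
    return cos_t * points + sin_t * np.cross(axis, points) + dot * (1 - cos_t) * axis


def reprojection_residuals(params, n_cameras, n_points, camera_indices,
                           point_indices, points_2d, focal):
    """Flattened (predicted - observed) pixel differences, two per observation."""
    cams = params[: n_cameras * 6].reshape(n_cameras, 6)
    pts = params[n_cameras * 6:].reshape(n_points, 3)
    # Bring every observed point into its camera frame: X_cam = R X + t
    in_cam = rotate(pts[point_indices], cams[camera_indices, :3]) + cams[camera_indices, 3:]
    projected = focal * in_cam[:, :2] / in_cam[:, 2:3]
    return (projected - points_2d).ravel()


def ba_sparsity(n_cameras, n_points, camera_indices, point_indices):
    """Mark which parameters each residual row can depend on."""
    m = camera_indices.size * 2
    n = n_cameras * 6 + n_points * 3
    pattern = lil_matrix((m, n), dtype=int)
    rows = np.arange(camera_indices.size)
    for s in range(6):  # camera block
        pattern[2 * rows, camera_indices * 6 + s] = 1
        pattern[2 * rows + 1, camera_indices * 6 + s] = 1
    for s in range(3):  # point block
        pattern[2 * rows, n_cameras * 6 + point_indices * 3 + s] = 1
        pattern[2 * rows + 1, n_cameras * 6 + point_indices * 3 + s] = 1
    return pattern
```

The pattern can be handed to `scipy.optimize.least_squares` through its `jac_sparsity` argument, with the index arrays passed via `args`. SciPy then estimates the Jacobian by sparse finite differences, which is usually what makes blocks of more than a few dozen cameras tractable. The wrapper shown above does not forward that argument, so it would need extending or bypassing.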

Step 4 — Respect Memory Boundaries with Dataset Partitioning

Large UAV surveys routinely exceed available RAM when loading full tie-point graphs into memory. Pipelines must implement chunked processing, out-of-core graph construction, and explicit garbage collection. Partition imagery by flight line or spatial tile, solve local adjustments independently, then merge with a global constraint pass. The deterministic chunker in the dataset-splitting routine generates overlapping partitions while preserving tie-point continuity across boundaries; pair it with the eviction tactics in memory management for large point clouds to keep peak resident memory bounded.

import gc
import logging
from typing import Generator, List, Any

def chunked_dataset_iterator(
    dataset: List[Any],
    chunk_size: int = 500,
    overlap: int = 50
) -> Generator[List[Any], None, None]:
    """
    Yields memory-safe chunks of a dataset with configurable overlap 
    to preserve tie-point continuity across partitions.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if overlap < 0 or overlap >= chunk_size:
        raise ValueError("overlap must be >= 0 and < chunk_size")

    total = len(dataset)
    start = 0
    chunk_idx = 0

    try:
        while start < total:
            end = min(start + chunk_size, total)
            chunk = dataset[start:end]
            logging.debug(f"Yielding chunk {chunk_idx}: indices {start}-{end-1}")
            yield chunk

            if end == total:
                break  # Last frame reached; another step would only repeat the tail
            start += chunk_size - overlap
            chunk_idx += 1
            gc.collect()  # Explicit memory cleanup between chunks
    except Exception as e:
        logging.error(f"Dataset chunking failed: {e}")
        raise

The overlap parameter is what stitches independently solved tiles back together: shared frames in the buffer zone become the cross-tile observations the global constraint pass uses to remove inter-tile seams.
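
To see which frames end up shared, the index arithmetic can be isolated from the data. The sketch below is a simplified, index-only version of the stepping rule that stops once a chunk reaches the end of the dataset. `chunk_bounds` and `shared_frames` are illustrative names.

```python
def chunk_bounds(total, chunk_size=500, overlap=50):
    """Return half-open (start, end) index pairs for each chunk."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and < chunk_size")
    bounds, start = [], 0
    while start < total:
        end = min(start + chunk_size, total)
        bounds.append((start, end))
        if end == total:
            break
        start += chunk_size - overlap
    return bounds


def shared_frames(bounds):
    """Index ranges present in both members of each consecutive chunk pair."""
    return [
        range(next_start, prev_end)
        for (_, prev_end), (next_start, _) in zip(bounds, bounds[1:])
    ]
```

For a 1,200-frame block with the defaults, the chunks are `(0, 500)`, `(450, 950)`, and `(900, 1200)`, so frames 450–499 and 900–949 are solved twice and act as the cross-tile ties.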

Step 5 — Validate Residuals and Export the Sparse Cloud

Once the solver converges, validate before exporting. Compute reprojection errors per camera and per tie point, then reject statistical outliers using the median absolute deviation (MAD). For an observed pixel $\mathbf{x}_i$ and its predicted projection through the camera matrix $K[R\,|\,t]$, the residual is

$$e_i = \left\lVert \mathbf{x}_i - \pi\big(K[R\,|\,t]\,\mathbf{X}_i\big) \right\rVert_2$$

where $\mathbf{X}_i$ is the homogeneous 3D point and $\pi$ is the perspective division. A point is flagged when it exceeds the robust threshold $\tau = \tilde{e} + 2.5 \cdot \mathrm{MAD}$, where $\tilde{e}$ is the median residual. Enforce CRS-safe export that preserves the original survey datum and avoids implicit transformations during serialization. Integrating ground control at this stage allows a final affine or Helmert fit to the survey grid; the residual-balancing logic in distributing GCP errors across orthomosaics governs how that correction is spread across the block.
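
To make the threshold concrete, consider five residuals in pixels. The values are invented purely for illustration:

$$
\begin{aligned}
e &= (0.4,\ 0.5,\ 0.6,\ 0.5,\ 3.0) \\
\tilde{e} &= 0.5 \\
|e_i - \tilde{e}| &= (0.1,\ 0,\ 0.1,\ 0,\ 2.5) \;\Rightarrow\; \mathrm{MAD} = 0.1 \\
\tau &= 0.5 + 2.5 \cdot 0.1 = 0.75
\end{aligned}
$$

Only the 3.0 px observation exceeds $\tau$ and is flagged. A mean-and-standard-deviation rule would have been pulled upward by that same outlier, which is the reason for using the median and MAD.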

import numpy as np
import pandas as pd
from typing import Dict, List
import logging

def compute_reprojection_errors(
    points_3d: np.ndarray,
    camera_poses: np.ndarray,
    intrinsics: np.ndarray,
    observed_2d: np.ndarray
) -> Dict[str, np.ndarray]:
    """
    Calculates per-point reprojection errors for a single camera and
    returns MAD-based outlier statistics for quality control.
    """
    try:
        # points_3d is (N, 3); camera_poses is a single 3x4 [R|t] extrinsic matrix.
        # Homogenize the points, project with P = K[R|t], then perspective-divide.
        n = points_3d.shape[0]
        points_h = np.vstack([points_3d.T, np.ones((1, n))])  # (4, N)
        projection = intrinsics @ camera_poses                # (3, 4)
        projected = projection @ points_h                     # (3, N)
        projected /= projected[2, :]
        errors = np.linalg.norm(projected[:2, :] - observed_2d.T, axis=0)

        median_err = np.median(errors)
        mad = np.median(np.abs(errors - median_err))
        threshold = median_err + 2.5 * mad

        inlier_mask = errors <= threshold
        outlier_count = np.sum(~inlier_mask)

        logging.info(f"Reprojection stats: Median={median_err:.3f}px, "
                     f"MAD={mad:.3f}px, Outliers={outlier_count}")

        return {
            "errors": errors,
            "inliers": inlier_mask,
            "median_error": median_err,
            "outlier_count": outlier_count
        }
    except Exception as e:
        logging.error(f"Reprojection error calculation failed: {e}")
        raise

def export_sparse_cloud(
    points_3d: np.ndarray,
    point_ids: List[int],
    epsg: int,
    output_path: str
) -> None:
    """
    Exports validated 3D points to CSV with explicit CRS metadata.
    """
    try:
        # points_3d is (N, 3): one row per point, matching compute_reprojection_errors
        df = pd.DataFrame(points_3d, columns=["X", "Y", "Z"])
        df["PointID"] = point_ids
        df["EPSG"] = epsg  # Persist the CRS with every row so the file is self-describing
        df.to_csv(output_path, index=False)
        logging.info(f"Exported {len(df)} points to {output_path} (EPSG:{epsg})")
    except Exception as e:
        logging.error(f"Export failed: {e}")
        raise

Persisting the EPSG code alongside the coordinates is what keeps the exported cloud unambiguous when it is reloaded by a downstream tool that performs no implicit reprojection of its own.
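
Step 5 also calls for per-camera error statistics, which a single-camera error function does not aggregate on its own. One hedged way to get them is to concatenate the per-observation errors from every camera and group them by camera index with `np.bincount`. `per_camera_rms` is a hypothetical helper, and it assumes camera indices are integers in `[0, n_cameras)`.

```python
import numpy as np


def per_camera_rms(errors, camera_indices, n_cameras):
    """RMS reprojection error per camera; NaN where a camera has no observations."""
    errors = np.asarray(errors, dtype=float)
    camera_indices = np.asarray(camera_indices, dtype=int)
    counts = np.bincount(camera_indices, minlength=n_cameras)
    sum_sq = np.bincount(camera_indices, weights=errors ** 2, minlength=n_cameras)
    rms = np.full(n_cameras, np.nan)
    seen = counts > 0
    rms[seen] = np.sqrt(sum_sq[seen] / counts[seen])
    return rms
```

A camera whose RMS sits well above the block median is a candidate for a bad pose or a mis-calibrated lens, and is worth inspecting before export.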

Parameter Deep-Dive

Every value below changes either output accuracy or runtime cost. Treat the defaults as a starting point and re-derive them against a small, fully ground-truthed reference block before committing to a production run.

Parameter	Type	Default	Valid range	Effect on quality vs. performance
`target_epsg`	int	32618	any projected EPSG	Wrong zone warps the whole block; must match the survey area’s UTM/State Plane zone
`h_accuracy_threshold`	float (m)	2.0	0.05–5.0	Tighter values drop weak-GPS frames, raising accuracy but thinning coverage
`ratio_thresh`	float	0.75	0.6–0.85	Lower = fewer, cleaner matches (slower convergence risk); higher = more false positives
`ransac_reproj_thresh`	float (px)	3.0	1.0–5.0	Lower rejects more matches as outliers; higher admits geometric noise
`loss`	str	`huber`	`linear`/`huber`/`cauchy`/`soft_l1`	Robust losses suppress outliers at a small per-iteration cost
`f_scale`	float	1.0	0.5–5.0	Soft outlier threshold; set near the median converged residual
`max_nfev`	int	500	100–5000	Higher allows deeper convergence but lengthens runtime
`ftol` / `xtol`	float	1e-8	1e-12–1e-6	Tighter tolerances chase residual reduction at more iterations
`chunk_size`	int	500	100–5000	Larger tiles improve global consistency but raise peak RAM
`overlap`	int	50	10–30% of `chunk_size`	More overlap improves cross-tile stitching at higher compute cost
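
The interaction between `loss` and `f_scale` is easier to judge with numbers. The sketch below evaluates the per-residual cost using the Huber definition documented for `scipy.optimize.least_squares`, namely $\rho(z) = z$ for $z \le 1$ and $2\sqrt{z} - 1$ otherwise, applied to $z = (r / C)^2$ and scaled by $C^2$, where $C$ is `f_scale`. `huber_cost_terms` is an illustrative helper, not a SciPy function.

```python
import numpy as np


def huber_cost_terms(residuals, f_scale=1.0):
    """Per-residual cost 0.5 * C^2 * rho((r / C)^2) with the Huber rho."""
    r = np.asarray(residuals, dtype=float)
    z = (r / f_scale) ** 2
    rho = np.where(z <= 1.0, z, 2.0 * np.sqrt(z) - 1.0)
    return 0.5 * f_scale ** 2 * rho
```

With `f_scale=1.0`, a 0.5 px residual costs 0.125, the same as plain least squares, while a 3 px residual costs 2.5 instead of 4.5. Raising `f_scale` to 2.0 lifts that 3 px cost to 4.0, so a larger value trusts more of the data and a smaller one discounts it sooner.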

Verification and Output Inspection

After export, assert correctness before the cloud advances downstream. Confirm the file exists, the median reprojection error is within budget, the CRS column survived serialization, and no coordinate collapsed to NaN.

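import logging
import numpy as np
import pandas as pd
from pathlib import Path

def verify_sparse_cloud(output_path: str, stats: dict, expected_epsg: int,
                        max_median_px: float = 1.0) -> bool:
    assert Path(output_path).is_file(), f"Export missing: {output_path}"

    df = pd.read_csv(output_path)
    assert {"X", "Y", "Z", "PointID"}.issubset(df.columns), "Export schema incomplete"
    assert not df[["X", "Y", "Z"]].isnull().any().any(), "NaN coordinates in export"

    # Check the CRS column when the export wrote one; warn if it is absent
    if "EPSG" in df.columns:
        assert (df["EPSG"] == expected_epsg).all(), "EPSG column does not match expected CRS"
    else:
        logging.warning("No EPSG column in export; CRS cannot be verified from the file")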

    median_px = float(stats["median_error"])
    assert median_px <= max_median_px, (
        f"Median reprojection {median_px:.3f}px exceeds {max_median_px}px budget"
    )

    inlier_rate = stats["inliers"].mean()
    logging.info(f"Verified {len(df)} pts | median={median_px:.3f}px | "
                 f"inliers={inlier_rate:.1%} | EPSG:{expected_epsg}")
    return True

A median reprojection error at or below roughly 1 px on the inlier set, with an inlier rate above ~90%, is the signal that the solve is survey-grade. Anything worse points back to one of the failure modes below rather than to the export step.
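
Those two thresholds can be checked together. The sketch below computes the median on the inlier set only and compares both figures against the rough limits quoted above. `is_survey_grade` and its default values are assumptions for illustration, not an established acceptance standard.

```python
import numpy as np


def is_survey_grade(errors, inlier_mask, max_median_px=1.0, min_inlier_rate=0.90):
    """Return (passed, details) for the inlier-set median and the inlier rate."""
    errors = np.asarray(errors, dtype=float)
    inlier_mask = np.asarray(inlier_mask, dtype=bool)
    if errors.size == 0 or not inlier_mask.any():
        return False, {"median_inlier_px": float("nan"), "inlier_rate": 0.0}

    median_inlier = float(np.median(errors[inlier_mask]))
    inlier_rate = float(inlier_mask.mean())
    passed = median_inlier <= max_median_px and inlier_rate >= min_inlier_rate
    return passed, {"median_inlier_px": median_inlier, "inlier_rate": inlier_rate}
```

The `errors` and `inliers` arrays returned by `compute_reprojection_errors` can be passed in directly.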

Troubleshooting

The solver reports did not converge and stops at max_nfev. Convergence is being starved, not capped. The usual cause is degenerate geometry: too few inlier tie points, or matches concentrated in a planar region. Loosen ratio_thresh slightly, confirm Step 2 returns hundreds of inliers per pair, and only then raise max_nfev. A solver that plateaus at high cost has bad inputs, not too few iterations.

Reprojection error is low but the cloud is geographically shifted by hundreds of meters. This is a georeferencing bug, not a solver bug. Verify always_xy=True in every pyproj.Transformer to rule out an axis-order swap, which usually produces far larger offsets, and confirm target_epsg matches the survey’s actual UTM or State Plane zone and datum. The solver optimized a self-consistent but mis-georeferenced block.

MemoryError or thrashing during Jacobian assembly on large blocks. The full tie-point graph does not fit in RAM. Drop chunk_size, solve per tile via the Step 4 iterator, and merge with a global pass. Confirm gc.collect() is firing between chunks and that intermediate arrays are not being retained by a closure.

cv2.findHomography returns None, raising AttributeError on mask.ravel(). Fewer than four geometrically consistent matches were found between the pair. Guard the call by checking len(good_matches) >= 4 (already enforced above) and skip pairs that still fail — they are non-overlapping frames that should never have been queued together.

Distortion coefficients drift to implausible values. Unbounded intrinsics are absorbing model error. Tighten the bounds tuple passed to run_bundle_solver so focal length and radial-distortion terms stay physical, and switch loss to cauchy if a handful of gross outliers are dominating the cost.
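
One hedged way to assemble such a bounds tuple is sketched below. The parameter layout (which index holds the focal length, which hold distortion terms) and the 5% and ±0.5 limits are assumptions for illustration only. Real limits should come from the camera's calibration report.

```python
import numpy as np


def build_bounds(n_params, focal_idx, focal_px, distortion_idx,
                 focal_tol=0.05, distortion_limit=0.5):
    """Leave poses and points free; box in the focal length and distortion terms."""
    lower = np.full(n_params, -np.inf)
    upper = np.full(n_params, np.inf)
    lower[focal_idx] = focal_px * (1.0 - focal_tol)
    upper[focal_idx] = focal_px * (1.0 + focal_tol)
    lower[distortion_idx] = -distortion_limit
    upper[distortion_idx] = distortion_limit
    return lower, upper


def clip_to_bounds(x0, bounds):
    """least_squares rejects an infeasible start, so pull x0 inside the box."""
    lower, upper = bounds
    return np.clip(x0, lower, upper)
```

The returned tuple has the `(lower, upper)` shape that `run_bundle_solver` forwards to `least_squares`.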

Median reprojection error is good but RMSE against GCPs is poor. The internal solve is consistent but the absolute fit to control is weak. This is a ground-control problem, not a solver problem — rebalance residuals across the block using the GCP distribution workflow before re-exporting.
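
Before rebalancing, it helps to measure the absolute fit directly. The sketch below estimates a seven-parameter similarity (Helmert) transform from solved GCP positions to surveyed coordinates using the closed-form SVD solution commonly attributed to Umeyama, then reports the 3D RMSE. `fit_similarity` and `gcp_rmse` are illustrative names, and the sketch assumes at least three non-collinear control points with no gross blunders.

```python
import numpy as np


def fit_similarity(source, target):
    """Least-squares s, R, t such that target is approximately s * R @ source + t."""
    src = np.asarray(source, dtype=float)
    dst = np.asarray(target, dtype=float)
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_s, dst - mu_d

    cov = dst_c.T @ src_c / len(src)
    U, S, Vt = np.linalg.svd(cov)
    D = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        D[2, 2] = -1.0  # flip the last axis to avoid returning a reflection

    R = U @ D @ Vt
    var_s = (src_c ** 2).sum() / len(src)
    s = np.trace(np.diag(S) @ D) / var_s
    t = mu_d - s * R @ mu_s
    return s, R, t


def gcp_rmse(source, target, s, R, t):
    """3D root-mean-square distance between transformed source and target."""
    predicted = s * (np.asarray(source, dtype=float) @ R.T) + t
    diff = predicted - np.asarray(target, dtype=float)
    return float(np.sqrt(np.mean(np.sum(diff ** 2, axis=1))))
```

If the RMSE stays high after this fit, the disagreement is not a rigid offset and points to deformation inside the block or to a mis-measured control point.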
