Feature Detection Algorithms for Drone Imagery

Feature detection is the first compute-heavy stage of any photogrammetric reconstruction, and the detector you choose decides how the rest of the pipeline behaves. This page walks through selecting, configuring, and batch-running keypoint detectors on UAV-captured imagery in Python — the exact scenario a surveying technician or GIS developer faces when a multi-hundred-image survey block has to be turned into a dense, well-distributed tie-point network without exhausting workstation RAM. Everything here feeds the broader automated image alignment and feature matching workflows, where these descriptors are matched, filtered with RANSAC, and handed to bundle adjustment.

Audience prerequisites: Python 3.10+, working knowledge of numpy array dtypes, and a machine with at least 16 GB RAM (32 GB recommended for SIFT on full-resolution frames). You should already have your imagery organised on disk; if not, set up the batch processing directory structure first, because the extractor below assumes one flight strip per directory.

Prerequisites

Library	Version	Install command	Role in this workflow
`opencv-contrib-python`	≥ 4.8	`pip install opencv-contrib-python`	SIFT, ORB, AKAZE detectors (SIFT has been in the main module since OpenCV 4.4; `contrib` adds the extra modules)
`numpy`	≥ 1.24	`pip install numpy`	Keypoint/descriptor arrays, dtype normalisation
`rasterio`	≥ 1.3	`pip install rasterio`	GeoTIFF I/O, affine transform, embedded CRS
`pyproj`	≥ 3.6	`pip install pyproj`	CRS parsing and equality checks (PROJ engine)

Pin these in a requirements.txt and install into a fresh virtual environment. Do not install opencv-contrib-python and the base opencv-python package side by side: both provide the cv2 namespace, so whichever is installed last overwrites files from the other and detectors can go missing at import time.

Conceptual Architecture

Detection sits between image ingestion and pairwise matching. Each frame is read as a single-band 2D array, normalised to 8-bit, and passed to a detector that returns keypoints (pixel coordinates, scale, orientation) plus a descriptor matrix. Crucially, the GeoTIFF’s affine transform and CRS travel with the descriptors so that later stages can map pixel coordinates back to ground coordinates without re-opening the rasters. Descriptors are persisted out-of-core immediately, so a survey block never has to sit in memory all at once — the same determinism principle the parent pipeline depends on.

UAV imagery stresses detectors in specific ways: high forward/side overlap produces near-duplicate frames, repetitive textures (agricultural fields, industrial rooftops, water) generate ambiguous matches, and illumination drifts across long flight lines. The detector and its parameters must be chosen against those conditions rather than left at library defaults.

1. Choosing a Detector

Three detectors cover almost every UAV mapping case:

SIFT (Scale-Invariant Feature Transform) — the accuracy baseline. Strong scale and rotation invariance, dense well-distributed keypoints, but 128-dimension float32 descriptors are memory-hungry and frequently trigger out-of-memory (OOM) failures on full-resolution blocks.
ORB (Oriented FAST and Rotated BRIEF) — fast, with compact 32-byte binary descriptors that scale to consumer hardware. Weaker under large viewpoint change and low-contrast surfaces.
AKAZE — a middle ground that preserves edge fidelity at reduced descriptor dimensionality, useful for infrastructure inspection where structural edges dominate.

The choice is not fixed at design time. Drive it from measured inlier ratios, flight ground sample distance (GSD), and available RAM. A head-to-head benchmark with threshold calibration lives in fixing SIFT vs ORB performance in UAV photos.

import cv2

def build_detector(detector_type: str = "SIFT", max_features: int = 8000):
    """Return a configured OpenCV detector. Any detector_type other than
    SIFT or AKAZE falls back to ORB."""
    detector_type = detector_type.upper()
    if detector_type == "SIFT":
        # contrastThreshold/edgeThreshold start at the OpenCV defaults;
        # see the Parameter Deep-Dive for ranges suited to aerial textures.
        return cv2.SIFT_create(
            nfeatures=max_features,
            contrastThreshold=0.04,
            edgeThreshold=10,
        )
    if detector_type == "AKAZE":
        return cv2.AKAZE_create(threshold=0.001)
    return cv2.ORB_create(nfeatures=max_features, scaleFactor=1.2, nlevels=8)
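
To put the memory trade-off between SIFT and ORB into rough numbers, the sketch below estimates raw descriptor storage from the descriptor sizes quoted in the detector summaries (128 float32 values for SIFT, 32 bytes for ORB). The helper name and the 500-frame block are illustrative assumptions, and the figure ignores keypoint arrays, image buffers, and Python overhead, so treat it as a lower bound.

```python
# Bytes per descriptor, taken from the detector summaries in this section.
DESCRIPTOR_BYTES = {
    "SIFT": 128 * 4,  # 128 float32 values
    "ORB": 32,        # 32-byte binary string
}


def descriptor_footprint_mib(detector: str, n_frames: int, features_per_frame: int) -> float:
    """Lower-bound estimate of raw descriptor storage in MiB (hypothetical helper)."""
    total_bytes = DESCRIPTOR_BYTES[detector.upper()] * n_frames * features_per_frame
    return total_bytes / (1024 ** 2)


if __name__ == "__main__":
    # Assumed survey block: 500 frames, each capped at 8000 features.
    for name in ("SIFT", "ORB"):
        mib = descriptor_footprint_mib(name, n_frames=500, features_per_frame=8000)
        print(f"{name}: {mib:,.0f} MiB")
```

Under these assumptions SIFT needs 16 times the descriptor storage of ORB (about 1,953 MiB against about 122 MiB), which is why the RAM budget belongs in the detector decision.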

2. Extracting Features From a Single Frame

Read the first band as a 2D array — src.read() with no index returns a 3D (bands, H, W) array that cv2 rejects — normalise non-8-bit data, then carry the CRS and affine transform alongside the descriptors.

import os
import numpy as np
import rasterio

def extract_one(img_path: str, detector) -> dict | None:
    """Extract keypoints + descriptors from one GeoTIFF, preserving CRS/transform."""
    with rasterio.open(img_path) as src:
        crs = src.crs
        transform = src.transform
        img_data = src.read(1)  # band 1 only -> 2D array for OpenCV

    # Detectors require 8-bit single-channel input.
    if img_data.dtype != np.uint8:
        img_data = cv2.normalize(img_data, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)

    kp, desc = detector.detectAndCompute(img_data, None)
    if not kp or desc is None:
        return None

    return {
        "image": os.path.basename(img_path),
        # Persist pt, scale, and orientation so matches can be re-derived later.
        "keypoints": np.array([(p.pt[0], p.pt[1], p.size, p.angle) for p in kp]),
        "descriptors": desc,
        "crs": crs.to_string() if crs else None,
        "transform": transform.to_gdal(),
    }
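
Because the transform is stored in GDAL order, a later stage can map keypoint pixel coordinates to ground coordinates with plain arithmetic. The sketch below assumes the six-element tuple produced by transform.to_gdal(). It also adds a half-pixel offset, on the assumption that OpenCV keypoint coordinates refer to pixel centres while the geotransform refers to the top-left pixel corner. The function name and the sample transform are hypothetical.

```python
def pixel_to_ground(col: float, row: float, gt: tuple) -> tuple[float, float]:
    """Map a pixel coordinate to ground coordinates with a GDAL geotransform.

    gt is (x_origin, pixel_width, row_rotation, y_origin, col_rotation, pixel_height),
    the order returned by Affine.to_gdal().
    """
    # Assumption: shift by 0.5 px so a keypoint at a pixel centre lands on
    # the centre of that pixel's ground footprint rather than its corner.
    c, r = col + 0.5, row + 0.5
    x = gt[0] + c * gt[1] + r * gt[2]
    y = gt[3] + c * gt[4] + r * gt[5]
    return x, y


# Hypothetical north-up transform: 5 cm GSD, origin at (500000, 4650000).
gt = (500000.0, 0.05, 0.0, 4650000.0, 0.0, -0.05)
x, y = pixel_to_ground(100.0, 200.0, gt)
print(f"{x:.3f}, {y:.3f}")
```

For a whole frame, the same two expressions apply to columns 0 and 1 of the stored keypoints array.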

3. Batch Processing a Flight Strip

Production runs must be reproducible and memory-bounded. The driver below processes imagery in configurable chunks across a ProcessPoolExecutor, serialises each frame’s results to a compressed .npz immediately, and forces garbage collection between chunks so RAM never climbs monotonically. This mirrors the parallel processing strategies for alignment used elsewhere in the pipeline; keep chunk size and worker count conservative to leave headroom for the descriptor matrices themselves.

import concurrent.futures
import gc
import logging
import os
from pathlib import Path

import numpy as np

logging.basicConfig(level=logging.INFO, format="%(asctime)s | %(levelname)s | %(message)s")

def extract_features_chunk(image_paths, detector_type="SIFT", max_features=8000):
    """Worker entry point: build a detector once, process a list of frames."""
    detector = build_detector(detector_type, max_features)
    results = []
    for img_path in image_paths:
        try:
            res = extract_one(img_path, detector)
            if res is not None:
                results.append(res)
        except Exception as e:  # corrupted frame / missing EXIF must not kill the batch
            logging.warning(f"Failed to process {img_path}: {e}")
    return results

def process_flight_strip(image_dir, chunk_size=20, detector="SIFT", output_dir="./features"):
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    images = sorted(str(p) for p in Path(image_dir).glob("*.tif"))
    if not images:
        logging.error("No TIFF images found in directory.")
        return []

    chunks = [images[i:i + chunk_size] for i in range(0, len(images), chunk_size)]
    all_crs = []

    # os.cpu_count() can return None, so fall back to a single worker.
    with concurrent.futures.ProcessPoolExecutor(max_workers=min(os.cpu_count() or 1, 4)) as executor:
        futures = [executor.submit(extract_features_chunk, c, detector) for c in chunks]
        for i, future in enumerate(concurrent.futures.as_completed(futures)):
            try:
                for res in future.result():
                    if res["crs"]:
                        all_crs.append(res["crs"])
                    np.savez_compressed(
                        Path(output_dir) / f"{res['image']}.npz",
                        keypoints=res["keypoints"],
                        descriptors=res["descriptors"],
                        crs=res["crs"],
                        transform=res["transform"],
                    )
                logging.info(f"Processed chunk {i + 1}/{len(chunks)}")
            except Exception as e:
                logging.error(f"Chunk processing failed: {e}")
            finally:
                gc.collect()  # release descriptor arrays before the next chunk

    logging.info("Feature extraction complete. Ready for pairwise matching.")
    return all_crs

4. Enforcing CRS Consistency

Coordinate reference system integrity has to be guarded before descriptors leave this stage. UAV imagery ships with WGS84 (EPSG:4326) or a local projected CRS, and a single mismatched strip propagates into warped orthomosaics, inaccurate DEMs, and failed ground control point registration downstream. Validate that every frame shares one projected CRS, the same discipline applied in managing coordinate reference systems in GDAL.

import pyproj

def validate_crs_consistency(crs_list, project_crs="EPSG:32633") -> list[str]:
    """Confirm every extracted frame shares the target projected CRS."""
    target = pyproj.CRS.from_string(project_crs)
    valid = []
    for crs_str in crs_list:
        if crs_str is None:
            continue
        try:
            source = pyproj.CRS.from_string(crs_str)
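            # Guard the EPSG comparison: to_epsg() returns None for CRSs without
            # an EPSG code, and None == None would otherwise count as a match.
            source_epsg = source.to_epsg()
            same_epsg = source_epsg is not None and source_epsg == target.to_epsg()
            if source.equals(target) or same_epsg: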
                valid.append(crs_str)
            else:
                logging.warning(f"CRS mismatch: {crs_str} vs target {project_crs}")
        except Exception:
            logging.warning(f"Unparseable CRS string: {crs_str}")
    return valid
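
The validator above compares each frame against the target but does not check that the target itself is projected. A minimal sketch of that extra gate follows, assuming pyproj's CRS.is_projected property is an acceptable test for this pipeline. The helper name is hypothetical.

```python
import pyproj


def is_projected_crs(crs_str: str) -> bool:
    """Return True when the CRS string parses to a projected CRS (hypothetical helper)."""
    crs = pyproj.CRS.from_string(crs_str)
    return bool(crs.is_projected)


if __name__ == "__main__":
    # A UTM zone is projected; plain WGS84 latitude/longitude is geographic.
    for candidate in ("EPSG:32633", "EPSG:4326"):
        print(candidate, "projected" if is_projected_crs(candidate) else "not projected")
```

Running this check on project_crs before extraction would catch a geographic target such as EPSG:4326 early, before any descriptors are written.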

Parameter Deep-Dive

Parameter	Detector	Type	Default	Valid range	Effect
`nfeatures` / `max_features`	SIFT, ORB	int	0 for SIFT (no cap), 500 for ORB; 8000 in `build_detector`	2000–20000	Caps retained keypoints. Higher = denser graph, more RAM and matching time; for >80% overlap, reduce by 30–40% to avoid descriptor saturation
`contrastThreshold`	SIFT	float	0.04	0.02–0.08	Lower keeps low-contrast keypoints (more, noisier on bland fields); raise to suppress false positives
`edgeThreshold`	SIFT	float	10	5–20	Higher retains more edge-like features; lower rejects them (helps on repetitive linear textures)
`threshold`	AKAZE	float	0.001	0.0005–0.005	Detector response floor; lower yields more keypoints at higher compute cost
`scaleFactor`	ORB	float	1.2	1.1–1.5	Pyramid decimation between levels; smaller = finer scale coverage, slower
`nlevels`	ORB	int	8	4–12	Pyramid depth; more levels improve scale invariance, increase memory
`chunk_size`	driver	int	20	5–50	Frames per worker task; reduce to ≤15 on <32 GB RAM machines
`max_workers`	driver	int	min(cpu, 4)	1–cpu_count	Parallel workers; each holds a full chunk of descriptors, so cap to avoid OOM

Verification and Output Inspection

Never hand descriptors to matching without asserting the outputs are well-formed. Check that every frame produced a file, descriptors have the dtype the detector promised (float32 for SIFT, uint8 for ORB/AKAZE), keypoints are non-empty, and the CRS is uniform.

import numpy as np
from pathlib import Path

def inspect_features(output_dir="./features", expected_crs="EPSG:32633"):
    files = sorted(Path(output_dir).glob("*.npz"))
    assert files, "No feature files were written — extraction failed silently."

    crs_seen = set()
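    for f in files:
        # allow_pickle is needed because a missing CRS is stored as None (object array).
        with np.load(f, allow_pickle=True) as data:
            kp, desc = data["keypoints"], data["descriptors"]
            crs_seen.add(str(data["crs"]))
        assert kp.shape[0] > 0, f"{f.name}: zero keypoints"
        assert desc.shape[0] == kp.shape[0], f"{f.name}: keypoint/descriptor count mismatch"
        # float32 for SIFT, uint8 for ORB/AKAZE.
        assert desc.dtype in (np.float32, np.uint8), f"{f.name}: unexpected dtype {desc.dtype}"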

    assert crs_seen == {expected_crs}, f"Mixed or unexpected CRS: {crs_seen}"
    print(f"OK: {len(files)} frames, uniform CRS {expected_crs}")

# A median of 2,000–8,000 keypoints per frame is healthy for survey-grade overlap.

If median keypoint counts collapse below a few hundred per frame, the imagery is likely over-smoothed or under-exposed — apply CLAHE (Contrast Limited Adaptive Histogram Equalization) before detection rather than loosening thresholds, which only invites false matches.

Operational Best Practices

Overlap-aware thresholding: for datasets with >80% overlap, cut nfeatures by 30–40%. Redundant tie points inflate matching time without improving geometric stability.
Illumination normalisation: apply CLAHE to frames captured during sunrise/sunset transitions to stabilise gradient-based detectors.
Descriptor compression: store binary descriptors (ORB, BRISK) in compressed .npz or Parquet; cast SIFT descriptors from float32 to float16 to roughly halve I/O.
Hardware scaling: on <32 GB RAM, enforce chunk_size ≤ 15 and use numpy.memmap for descriptor caching.
Pre-flight metadata gates: flag mixed-CRS datasets or missing focal-length metadata before extraction begins. Validating EXIF GPS data before processing catches most of these early.
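
As a sketch of the numpy.memmap suggestion in the hardware-scaling item, the block below writes descriptor rows into a disk-backed array so that only the slice being written has to be resident in memory. The file name, array shape, and random stand-in data are assumptions for illustration.

```python
import tempfile
from pathlib import Path

import numpy as np


def create_descriptor_cache(path: Path, n_rows: int, dim: int = 128) -> np.memmap:
    """Create a disk-backed float32 array sized for n_rows descriptors."""
    return np.memmap(path, dtype=np.float32, mode="w+", shape=(n_rows, dim))


with tempfile.TemporaryDirectory() as tmp:
    cache_path = Path(tmp) / "descriptors.dat"
    cache = create_descriptor_cache(cache_path, n_rows=1000)

    # Stand-in for one frame's SIFT descriptors (random values, not real features).
    chunk = np.random.default_rng(0).random((250, 128), dtype=np.float32)
    cache[0:250] = chunk
    cache.flush()  # push the written rows to disk

    # Reopen read-only, as a later matching stage might.
    reopened = np.memmap(cache_path, dtype=np.float32, mode="r", shape=(1000, 128))
    roundtrip_ok = bool(np.array_equal(reopened[0:250], chunk))

    # Drop the memmap references before the temporary directory is removed.
    del cache, reopened

print("round trip ok:", roundtrip_ok)
```

A raw memmap file carries no shape or dtype metadata, so the row count and descriptor width have to be recorded separately.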

Troubleshooting

AttributeError: module 'cv2' has no attribute 'SIFT_create'
cv2.SIFT_create has shipped in the main OpenCV module since 4.4, so this error usually means an older build or a broken install with both opencv-python and opencv-contrib-python present. Uninstall both, then install only opencv-contrib-python at the pinned version.

detectAndCompute returns desc is None for some frames
The frame had no detectable features (uniform water, blown-out sky, or a fully shadowed strip). The extractor already skips these, but a high skip rate points to an exposure or thresholding problem. Apply CLAHE or lower contrastThreshold slightly rather than ignoring the frames.

cv2.error: (-5:Bad argument) image is empty or has incorrect depth
You passed a 3D array. rasterio’s src.read() returns (bands, H, W); use src.read(1) for a 2D band, and normalise non-8-bit data with cv2.normalize(...).astype(np.uint8).

Workers die with MemoryError / the OS OOM-killer terminates the process
Each worker holds a full chunk of float32 SIFT descriptors. Reduce max_workers, drop chunk_size to ≤ 15, and lower nfeatures. Confirm headroom before scaling back up.

Matching later produces almost no inliers despite thousands of keypoints
Usually a CRS or scale mismatch, not a detection failure. Run validate_crs_consistency and verify that GSD is consistent across strips; mismatched projections survive detection but collapse during bundle adjustment optimisation.

Descriptor .npz files load with pickle errors
You saved object arrays (e.g. crs=None) and loaded them without allow_pickle=True. Add the flag, or always store the CRS as a plain string (an empty string when it is missing) to keep the archive pickle-free.
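
A minimal sketch of the pickle-free option: a missing CRS is written as an empty string so every array in the archive has a plain dtype and loads with allow_pickle=False. The save_features helper, the file name, and the zero-filled arrays are illustrative assumptions rather than part of the extractor above.

```python
import tempfile
from pathlib import Path

import numpy as np


def save_features(path, keypoints, descriptors, crs, transform):
    """Save one frame's features without object arrays (hypothetical helper)."""
    np.savez_compressed(
        path,
        keypoints=keypoints,
        descriptors=descriptors,
        crs=np.array(crs or ""),  # unicode scalar; None becomes an empty string
        transform=np.asarray(transform, dtype=np.float64),
    )


with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "frame_0001.npz"
    save_features(
        out,
        keypoints=np.zeros((3, 4), dtype=np.float32),
        descriptors=np.zeros((3, 128), dtype=np.float32),
        crs=None,  # simulate a frame with no embedded CRS
        transform=(500000.0, 0.05, 0.0, 4650000.0, 0.0, -0.05),
    )
    # Loading succeeds with pickling disabled because no object arrays were stored.
    with np.load(out, allow_pickle=False) as data:
        crs_value = data["crs"].item()
        transform_shape = data["transform"].shape

print(repr(crs_value), transform_shape)
```

Downstream code then treats an empty CRS string as missing, instead of testing for None.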
