An advanced framework for polarization image fusion using analytical attention heads to achieve high-quality fusion of RGB and polarization images with noise suppression and feature enhancement.

Paper share link (available before 2026-3-10): Elsevier - Optics and Lasers in Engineering

This repository implements AGPFusion, a novel attention-guided polarization image fusion method designed for industrial inspection applications. The framework intelligently combines RGB and polarization images through multiple analytical attention mechanisms including gradient, texture, semantic, entropy, and noise-aware heads, producing enhanced fused images with improved contrast and detail preservation.
- Multi-Head Attention Architecture: Combines 5 specialized attention heads (GAH, TAH, EAH, SAH, NAH)
- Intelligent Noise Suppression: Adaptive local variance-based noise attenuation with configurable soft/hard masking
- Multi-Scale Processing: Automatic block size adaptation for different image resolutions
- Comprehensive Evaluation: 14 quantitative metrics (CC, MI, SSIM, PSNR, NIQE, NIMA, Qabf, etc.)
- Batch Processing: Efficient processing of entire datasets with detailed logging
- Flexible Configuration: Extensive parameter tuning for domain-specific optimization

```
DeepFusion/
├── fusion_agp.py          # Main fusion script
├── metric.py              # Evaluation metrics tool
├── core/
│   ├── AGPFusion.py       # Core fusion implementation
│   ├── metric.py          # Metric calculations
│   ├── tensor.py          # Tensor utilities
│   └── common.py          # Common functions
├── images/                # Documentation figures
└── README.md              # This file
```

```bash
pip install torch torchvision
pip install opencv-python scikit-image matplotlib tqdm
pip install kornia pytorch-grad-cam natsort
```

```bash
# Single image fusion
python fusion_agp.py \
    --rgb_input "path/to/rgb/image.png" \
    --pol_input "path/to/polarization/image.png" \
    --outputdir "path/to/output" \
    --single_img_test
```

```bash
# Batch fusion
python fusion_agp.py \
    --rgb_input "datasets/F12CCP/rgb/*.png" \
    --pol_input "datasets/F12CCP/pol/*.png" \
    --outputdir "results/fusion_batch" \
    --img_size 1024,768
```

```bash
# Single folder evaluation
python metric.py \
    --vis_src "datasets/F12CCP/rgb" \
    --pol_src "datasets/F12CCP/pol" \
    --fusion_results "results/fusion_batch"

# Batch evaluation (all subfolders)
python metric.py \
    --vis_src "datasets/F12CCP/rgb" \
    --pol_src "datasets/F12CCP/pol" \
    --fusion_root "results/parameter_sweep"
```

| Parameter | Default | Description |
|---|---|---|
| `use_gah` | `True` | Enable Gradient Attention Head (Sobel & Laplacian) |
| `use_tah` | `True` | Enable Texture Attention Head (LBP & Canny) |
| `use_eah` | `True` | Enable Entropy Attention Head (Information entropy) |
| `use_sah` | `True` | Enable Semantic Attention Head (EfficientNet GradCAM++) |
| `use_nah` | `True` | Enable Noise Attention Head (Adaptive local variance) |
| `use_ms_enh` | `True` | Enable Multi-Scale Enhancement (Otsu mask) |
| `alpha` | `5.0` | Noise attenuation slope parameter |
| `beta` | `5.0` | Entropy weight exponent |
| `lambda_b` | `0.3` | Otsu mask offset ratio for main source enhancement |
| `sigma` | `2.0` | Feature enhancement Gaussian sigma |
| `noise_method` | `'soft_mask'` | Noise suppression method: `'soft_mask'` or `'hard_mask'` |

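
To make the difference between the two `noise_method` options concrete, here is a minimal NumPy sketch. It is an illustration under assumptions, not the code in `core/AGPFusion.py`: the helper names `local_variance` and `noise_mask`, the `threshold` argument, the window size `k`, and the logistic form of the soft mask are all hypothetical. The only things taken from the table are that the noise head is based on local variance, that `alpha` acts as a slope, and that a soft and a hard variant exist.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def local_variance(img, k=5):
    """Variance of each k x k neighbourhood (k odd, reflect padding).

    Returns an array with the same shape as the 2-D input.
    """
    pad = k // 2
    padded = np.pad(img.astype(np.float64), pad, mode="reflect")
    windows = sliding_window_view(padded, (k, k))
    return windows.var(axis=(-2, -1))


def noise_mask(score, threshold, alpha=5.0, method="soft_mask"):
    """Map a noise score to a weight in [0, 1] (1 = keep, 0 = suppress).

    Assumed forms, for illustration only:
      hard_mask: step function at `threshold`
      soft_mask: logistic roll-off whose steepness is set by `alpha`
    """
    if method == "hard_mask":
        return (score <= threshold).astype(np.float64)
    if method == "soft_mask":
        return 1.0 / (1.0 + np.exp(alpha * (score - threshold)))
    raise ValueError(f"unknown noise_method: {method!r}")


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((32, 32))
    variance = local_variance(image, k=5)
    score = variance / max(variance.max(), 1e-12)  # normalise to [0, 1]
    print(noise_mask(score, threshold=0.5, method="soft_mask").mean())
```

In this sketch a larger `alpha` makes the soft mask approach the hard one, which is one way a slope parameter could trade smooth transitions against sharper suppression.
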

| Parameter | Default | Description |
|---|---|---|
| `rgb_input` | - | Path pattern for RGB images (supports glob) |
| `pol_input` | - | Path pattern for polarization images |
| `outputdir` | - | Output directory |
| `img_size` | `(1024, 768)` | Target size as `width,height` |
| `single_img_test` | `True` | Single image mode (set to `False` for batch) |

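
The sketch below shows one way arguments of this shape could be parsed: a `width,height` string turned into a tuple, and two glob patterns expanded and paired. It is not the parser in `fusion_agp.py`; `parse_size`, `build_parser`, and `pair_inputs` are hypothetical names, and pairing by plain sorted order is an assumption (the repository installs `natsort`, which would order numbered files naturally instead).

```python
import argparse
import glob


def parse_size(text):
    """Parse 'width,height' (e.g. '1024,768') into a (width, height) tuple."""
    parts = text.split(",")
    if len(parts) != 2:
        raise argparse.ArgumentTypeError("expected width,height")
    try:
        width, height = (int(p) for p in parts)
    except ValueError:
        raise argparse.ArgumentTypeError("width and height must be integers") from None
    return width, height


def build_parser():
    """Illustrative parser covering the input/output arguments in the table."""
    parser = argparse.ArgumentParser(description="Illustrative AGPFusion-style CLI")
    parser.add_argument("--rgb_input", required=True, help="glob pattern for RGB images")
    parser.add_argument("--pol_input", required=True, help="glob pattern for polarization images")
    parser.add_argument("--outputdir", required=True, help="output directory")
    parser.add_argument("--img_size", type=parse_size, default=(1024, 768),
                        help="target size as width,height")
    return parser


def pair_inputs(rgb_pattern, pol_pattern):
    """Expand both patterns and pair files by sorted order (an assumption)."""
    rgb_files = sorted(glob.glob(rgb_pattern))
    pol_files = sorted(glob.glob(pol_pattern))
    if len(rgb_files) != len(pol_files):
        raise ValueError("RGB and polarization file counts differ")
    return list(zip(rgb_files, pol_files))


if __name__ == "__main__":
    args = build_parser().parse_args()
    for rgb_path, pol_path in pair_inputs(args.rgb_input, args.pol_input):
        print(rgb_path, pol_path, args.img_size)
```

Quoting the patterns on the command line, as in the usage examples, keeps the shell from expanding them so that the script can do the globbing itself.
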

The framework computes 14 metrics:

| Category | Metrics |
|---|---|
| Correlation | CC (Correlation Coefficient), MI (Mutual Information) |
| Structural | SSIM, PSNR, SCD (Sum of Correlations of Differences) |
| Quality | AG (Average Gradient), EN (Entropy), SF (Spatial Frequency) |
| Perceptual | NIQE, VIFF, NIMA (Neural Image Assessment) |
| Fusion-specific | Qabf (Edge-based), Nabf (Noise), Labf (Local) |

Results are saved to `metric.csv` with per-image values and summary statistics (mean and variance).
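
For readers unfamiliar with the no-reference metrics in the Quality row, the following sketch gives common textbook definitions of EN, SF, and AG for a single-channel 8-bit image, followed by the kind of mean and variance summary described above. The function names are hypothetical and the normalization conventions are assumptions; `core/metric.py` may define them slightly differently.

```python
import numpy as np


def entropy(img, bins=256):
    """EN: Shannon entropy in bits of the grey-level histogram (8-bit range)."""
    hist, _ = np.histogram(img, bins=bins, range=(0, 256))
    p = hist[hist > 0] / hist.sum()
    return float(-np.sum(p * np.log2(p)))


def spatial_frequency(img):
    """SF: sqrt(RF^2 + CF^2) from horizontal and vertical first differences."""
    img = img.astype(np.float64)
    rf = np.sqrt(np.mean(np.diff(img, axis=1) ** 2))  # row frequency
    cf = np.sqrt(np.mean(np.diff(img, axis=0) ** 2))  # column frequency
    return float(np.sqrt(rf ** 2 + cf ** 2))


def average_gradient(img):
    """AG: mean of sqrt((gx^2 + gy^2) / 2) over forward differences."""
    img = img.astype(np.float64)
    gx = img[:-1, 1:] - img[:-1, :-1]
    gy = img[1:, :-1] - img[:-1, :-1]
    return float(np.mean(np.sqrt((gx ** 2 + gy ** 2) / 2.0)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    images = [rng.integers(0, 256, size=(64, 64), dtype=np.uint8) for _ in range(3)]
    en_values = np.array([entropy(im) for im in images])
    # Per-dataset summary in the spirit of the mean/variance rows of metric.csv.
    print("EN mean:", en_values.mean(), "EN variance:", en_values.var())
```

Higher values of all three indicate more grey-level diversity or sharper local variation, which is why they are usually read together with noise-sensitive metrics such as Nabf.
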
We provide the Industry-Polarization-Dataset containing 416 sets of polarization images for industrial product inspection:
- DoFP Camera: Daheng MER2-503-36U3M POL (Sony IMX264 MZR, 2448×2048)
- Time-Sequential: Custom rotating polarization imaging system
- Polarization Types: DoFP, linear polarization sequences, full Stokes images
- Challenges: Specular reflection, low contrast, no registration required
Download Link: https://pan.quark.cn/s/9c8fe6e6db79
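
Since the dataset includes raw DoFP captures, a short sketch of how linear Stokes parameters, DoLP, and AoP are commonly derived from a DoFP mosaic may be useful. The Stokes formulas are the standard ones for four analyzer angles. The 2x2 super-pixel layout used as the default below is an assumption that should be checked against the camera documentation, and the function names are hypothetical rather than part of this repository.

```python
import numpy as np


def split_dofp(raw, layout=((90, 45), (135, 0))):
    """Split a DoFP mosaic into per-angle images.

    `layout` gives the analyzer angle at each position of the 2x2
    super-pixel; the default here is an assumption, not a verified spec.
    """
    channels = {}
    for r in range(2):
        for c in range(2):
            channels[layout[r][c]] = raw[r::2, c::2].astype(np.float64)
    return channels


def stokes_linear(i0, i45, i90, i135):
    """Linear Stokes parameters from intensities at 0, 45, 90 and 135 degrees."""
    s0 = 0.5 * (i0 + i45 + i90 + i135)
    s1 = i0 - i90
    s2 = i45 - i135
    return s0, s1, s2


def dolp_aop(s0, s1, s2, eps=1e-12):
    """Degree and angle (radians) of linear polarization."""
    dolp = np.sqrt(s1 ** 2 + s2 ** 2) / np.maximum(s0, eps)
    aop = 0.5 * np.arctan2(s2, s1)
    return dolp, aop


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    raw = rng.integers(0, 256, size=(8, 8)).astype(np.float64)
    ch = split_dofp(raw)
    s0, s1, s2 = stokes_linear(ch[0], ch[45], ch[90], ch[135])
    dolp, aop = dolp_aop(s0, s1, s2)
    print(dolp.shape, aop.shape)
```

Each output image has half the mosaic resolution in both dimensions, because one value per angle is taken from every 2x2 super-pixel.
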
Benchmark results on the F12CCP dataset demonstrate superior performance compared to state-of-the-art methods.

Five attention heads work collaboratively:

1. GAH: Sobel + Laplacian filters for gradient features
2. TAH: LBP + Canny edge detection for texture patterns
3. EAH: Block-wise entropy for information-rich regions
4. SAH: EfficientNet GradCAM++ for semantic saliency
5. NAH: Adaptive local variance with entropy guidance

The fusion pipeline consists of:

- Attention Map Computation: Multi-head feature extraction
- Weight Map Generation: Cross-source attention comparison
- Image Decomposition: Base/detail layer separation (average pooling)
- Guided Filtering: Edge-preserving smoothing of weight maps
- Layer Fusion: Weighted combination of base/detail components
- Post-processing: CLAHE contrast enhancement
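
The stages listed above can be sketched in a few lines of NumPy for two registered single-channel images scaled to [0, 1]. This is a heavily simplified illustration, not the implementation in `core/AGPFusion.py`: a single gradient-magnitude map stands in for the five heads, a box blur stands in for the guided filter, CLAHE is omitted, and the function names and kernel sizes are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def box_blur(img, k):
    """Mean over each k x k neighbourhood (k odd, reflect padding)."""
    pad = k // 2
    padded = np.pad(img, pad, mode="reflect")
    return sliding_window_view(padded, (k, k)).mean(axis=(-2, -1))


def gradient_attention(img):
    """Gradient magnitude, used here as a stand-in for the attention heads."""
    gy, gx = np.gradient(img)
    return np.hypot(gx, gy)


def fuse(rgb_gray, pol, k_base=7, k_detail=3):
    """Fuse two float images in [0, 1] with base/detail layer weighting."""
    # Attention maps and cross-source comparison -> binary weight map.
    weight = (gradient_attention(rgb_gray) >= gradient_attention(pol)).astype(np.float64)

    # Base/detail decomposition by average pooling (box blur).
    base_rgb, base_pol = box_blur(rgb_gray, k_base), box_blur(pol, k_base)
    detail_rgb, detail_pol = rgb_gray - base_rgb, pol - base_pol

    # Weight smoothing: a box blur replaces the guided filter in this sketch.
    w_base = box_blur(weight, k_base)
    w_detail = box_blur(weight, k_detail)

    # Weighted combination of the base and detail layers.
    fused = (w_base * base_rgb + (1.0 - w_base) * base_pol
             + w_detail * detail_rgb + (1.0 - w_detail) * detail_pol)
    return np.clip(fused, 0.0, 1.0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    rgb_gray, pol = rng.random((32, 32)), rng.random((32, 32))
    print(fuse(rgb_gray, pol).shape)
```

Smoothing the base weights with a wider kernel than the detail weights keeps large-scale brightness transitions gradual while letting fine detail follow the sharper attention map, which is the usual motivation for fusing the two layers separately.
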
F12CCP Dataset results demonstrate:
- Superior Edge Preservation: High Qabf scores
- Noise Robustness: Low Nabf values
- Contrast Enhancement: Improved AG, EN, and SF metrics
- Visual Quality: Competitive NIQE and NIMA scores
If you use this code or dataset in your research, please cite:

```bibtex
@article{ZHOU2026109628,
  title = {Polarization image fusion via analytical attention heads: A multi-scale feature integration framework},
  journal = {Optics and Lasers in Engineering},
  volume = {201},
  pages = {109628},
  year = {2026},
  issn = {0143-8166},
  doi = {https://doi.org/10.1016/j.optlaseng.2026.109628},
  url = {https://www.sciencedirect.com/science/article/pii/S014381662600028X},
  author = {Junzhuo Zhou and Jun Zou and Ye Qiu and Zhihe Liu and Jia Hao and Wenli Li and Yiting Yu},
  keywords = {Polarization imaging, Multimodal image fusion, Computer vision, Industrial application, Surface inspection},
}
```

Contributions are welcome! Please feel free to submit pull requests or open issues for bugs and feature requests.

This project is licensed under the MIT License.