Polarization imaging captures the polarization state of light, revealing information invisible to the human eye yet valuable in domains such as biomedical diagnostics, autonomous driving, and remote sensing. However, conventional polarization cameras are often expensive, bulky, or both, limiting their practical use. Lensless imaging offers a compact, low-cost alternative by replacing the lens with a simple optical element like a diffuser and performing computational reconstruction, but existing lensless polarization systems suffer from limited reconstruction quality.
To overcome these limitations, we introduce an RGB-guided lensless polarization imaging system that combines a compact polarization–RGB sensor with an auxiliary, widely available conventional RGB camera providing structural guidance. We reconstruct multi-angle polarization images for each RGB color channel through a two-stage pipeline: a physics-based inversion (FISTA or ADMM) recovers an initial polarization image, followed by a Transformer-based fusion network (adapted from SwinFuSR) that refines this reconstruction using the RGB guidance image from the conventional camera.
Our two-stage method significantly improves reconstruction quality and fidelity over lensless-only baselines, generalizes across datasets and imaging conditions, and achieves high-quality real-world results on our physical prototype lensless camera without any fine-tuning.
RGB-guided lensless polarization imaging system: (a) optical setup; (b) custom polarization mask; (c) captured lensless image under front illumination with two orthogonally polarized projectors; and (d) reconstructed grayscale polarization result, visualized by mapping the 0°, 45°, and 90° outputs to the R, G, and B channels.
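The angle-to-channel visualization used throughout the figures can be sketched in a few lines of NumPy. This is an illustrative helper (the function name and normalization assumption are ours, not from the paper's code):

```python
import numpy as np

def polarization_to_rgb(i0, i45, i90):
    """Visualize a grayscale polarization triplet as one RGB composite,
    mapping the 0°, 45°, and 90° images to the R, G, and B channels
    respectively. Assumes intensities are normalized to [0, 1]."""
    rgb = np.stack([i0, i45, i90], axis=-1)  # (H, W) x3 -> (H, W, 3)
    return np.clip(rgb, 0.0, 1.0)
```

Regions where the three channels differ indicate polarization-dependent intensity; unpolarized regions appear gray.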
The pipeline has two stages. Stage I (physics-based reconstruction): polarization intensity images (color or grayscale) are recovered from lensless measurements using iterative optimization (FISTA or ADMM) with a 3D total-variation prior. Stage II (RGB-guided deep refinement): a Transformer-based fusion network refines the Stage-I reconstruction using a coarsely aligned RGB guidance image. The refinement network is adapted from SwinFuSR (Arnold et al., 2024).
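To make Stage I concrete, here is a minimal FISTA sketch for lensless deconvolution. It is a simplified stand-in, not the paper's implementation: it assumes a single-channel measurement, a shift-invariant forward model given by the PSF, and substitutes an ℓ1 prior (whose proximal step is a closed-form soft threshold) for the paper's 3D total-variation prior. All names are illustrative.

```python
import numpy as np
from numpy.fft import fft2, ifft2

def soft_threshold(v, tau):
    # Proximal operator of tau * ||.||_1
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def fista_lensless(meas, psf, n_iter=100, lam=1e-4):
    """Sketch of FISTA for min_x 0.5 * ||A x - meas||^2 + lam * ||x||_1,
    where A is circular convolution with the PSF (applied in Fourier space)."""
    H = fft2(psf, s=meas.shape)                      # forward model in Fourier domain
    step = 1.0 / max(np.max(np.abs(H)) ** 2, 1e-12)  # 1 / Lipschitz constant of the data term
    x = np.zeros_like(meas, dtype=float)
    y, t = x.copy(), 1.0
    for _ in range(n_iter):
        # Gradient of the data term at the extrapolated point y
        residual = np.real(ifft2(H * fft2(y))) - meas
        grad = np.real(ifft2(np.conj(H) * fft2(residual)))
        # Proximal gradient step
        x_new = soft_threshold(y - step * grad, step * lam)
        # Nesterov momentum update
        t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        y = x_new + ((t - 1.0) / t_new) * (x_new - x)
        x, t = x_new, t_new
    return x
```

In the full system this reconstruction runs once per polarization angle (and per color channel in the color configuration) before being passed to the Stage-II refinement network.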
Each polarization grayscale triplet (0°, 45°, 90°) is visualized as an RGB composite.
Qualitative results on real lensless polarization data (3-angle grayscale). Each reconstructed polarization triplet (0°, 45°, 90°) is visualized as an RGB composite. Note the significant improvement in the structural details achieved by RGB guidance.
| RGB (guide) | FISTA recon. | FISTA + Transformer | Ours | Reference |
|---|---|---|---|---|
| ![]() | ![]() | ![]() | ![]() | ![]() |
| ![]() | ![]() | ![]() | ![]() | ![]() |
| ![]() | ![]() | ![]() | ![]() | ![]() |
| ![]() | ![]() | ![]() | ![]() | ![]() |
Three-angle grayscale configuration (PSNR ↑ / SSIM ↑ / LPIPS ↓). We compare against physics-based baselines (FISTA, ADMM), a learning-based baseline derived from our architecture without RGB guidance (FISTA + Transformer), and two additional learning-based methods: FlatNet (Khan et al., 2020) and PolarAnything (Zhang et al., 2025). PIP is the training domain; UPLight and ZJU-RGB-P are unseen evaluation sets.
| Method | PIP | UPLight | ZJU-RGB-P |
|---|---|---|---|
| FISTA | 13.87 / 0.45 / 0.45 | 16.72 / 0.26 / 0.53 | 14.50 / 0.46 / 0.44 |
| FlatNet | 21.57 / 0.68 / 0.45 | 10.78 / 0.27 / 0.98 | 16.73 / 0.54 / 0.57 |
| PolarAnything (RGB) | 22.02 / 0.66 / 0.29 | 11.98 / 0.40 / 0.93 | 19.96 / 0.62 / 0.38 |
| PolarAnything (FISTA) | 21.51 / 0.64 / 0.31 | 11.84 / 0.36 / 0.98 | 19.05 / 0.58 / 0.42 |
| FISTA + Transformer | 28.85 / 0.88 / 0.12 | 17.93 / 0.44 / 0.53 | 27.20 / 0.89 / 0.19 |
| Ours (FISTA input) | 35.13 / 0.97 / 0.03 | 20.49 / 0.52 / 0.32 | 31.19 / 0.97 / 0.07 |
FlatNet and PolarAnything struggle under the partial polarization sampling of lensless measurements and show limited generalization to unseen datasets. Our method consistently outperforms all baselines, recovering richer high-frequency details and improving structural fidelity. See the paper for full results including the four-angle RGB configuration, ADMM variants, PSF robustness, fine-tuning, and ablation studies.
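For reference, the PSNR values reported above are the standard peak signal-to-noise ratio. A minimal NumPy sketch, assuming images normalized to [0, 1] (the function name and `data_range` convention are ours):

```python
import numpy as np

def psnr(x, ref, data_range=1.0):
    """Peak signal-to-noise ratio in dB (higher is better)."""
    mse = np.mean((np.asarray(x, float) - np.asarray(ref, float)) ** 2)
    return 10.0 * np.log10(data_range ** 2 / mse)
```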
Qualitative reconstruction results on UPLight and ZJU-RGB-P. Columns: RGB guidance, FISTA reconstruction, FISTA + Transformer w/o RGB, our full RGB-guided model, its fine-tuned version, and the ground-truth polarization image. Each polarization grayscale triplet (0°, 45°, 90°) is visualized as an RGB composite. Note how the RGB guidance improves the high-frequency recovery.
| | RGB (guide) | FISTA pred | FISTA + Transformer | Ours | Ours (fine-tuned) | GT |
|---|---|---|---|---|---|---|
| UPLight | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() |
| UPLight | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() |
| ZJU-RGB-P | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() |
| ZJU-RGB-P | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() |
Stage II checkpoints trained on FISTA Stage-I reconstructions from PIP. Download from Hugging Face. Usage details are in the repository README.md.
| Checkpoint file | `net_type` | `use_guide` | Description |
|---|---|---|---|
| `grayscale_with_guide.pth` | `swinfusionSR_GRAYSCALE` | `true` | Grayscale polarimetric, RGB-guided |
| `grayscale_no_guide.pth` | `swinfusionSR_GRAYSCALE` | `false` | Grayscale polarimetric, no guide |
| `color_with_guide.pth` | `swinfusionSRcolor` | `true` | Color polarimetric, RGB-guided |
| `color_no_guide.pth` | `swinfusionSRcolor` | `false` | Color polarimetric, no guide |
Set `path/pretrained_netG` in your option JSON to the path of a downloaded `.pth` checkpoint, or fetch the file with `wget` as shown in the README.
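A minimal option-file fragment might look as follows. The `net_type` and `use_guide` values come from the checkpoint table above; the surrounding key layout and the `model_zoo/` path are illustrative, so check the option files shipped in the repository for the exact structure:

```json
{
  "netG": {
    "net_type": "swinfusionSR_GRAYSCALE",
    "use_guide": true
  },
  "path": {
    "pretrained_netG": "model_zoo/grayscale_with_guide.pth"
  }
}
```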
@misc{kraicer2026guidedlenslesspolarizationimaging,
title = {Guided Lensless Polarization Imaging},
author = {Noa Kraicer and Erez Yosef and Raja Giryes},
year = {2026},
eprint = {2603.27357},
archivePrefix = {arXiv},
primaryClass = {eess.IV},
url = {https://arxiv.org/abs/2603.27357}
}
We thank Tomer Pee’r and Michael Baltaxe (General Motors) for providing a suitable version of the PIP dataset, and Shay Elmalem for fruitful discussions. This work was partially supported by the Center for AI and Data Science at Tel Aviv University (TAD) and by ERC Grant No. 10111339.
The physics-based reconstruction (Stage I) was inspired by Spectral DiffuserCam (Monakhova et al., 2020). The refinement network (Stage II) builds on SwinFuSR (Arnold et al., 2024).