Polarization imaging captures the polarization state of light, revealing information invisible to the human eye yet valuable in domains such as biomedical diagnostics, autonomous driving, and remote sensing. However, conventional polarization cameras are often expensive, bulky, or both, limiting their practical use. Lensless imaging offers a compact, low-cost alternative by replacing the lens with a simple optical element like a diffuser and performing computational reconstruction, but existing lensless polarization systems suffer from limited reconstruction quality.
To overcome these limitations, we introduce an RGB-guided lensless polarization imaging system that combines a compact polarization–RGB sensor with an auxiliary, widely available conventional RGB camera providing structural guidance. We reconstruct multi-angle polarization images for each RGB color channel through a two-stage pipeline: a physics-based inversion (FISTA or ADMM) recovers an initial polarization image, followed by a Transformer-based fusion network (adapted from SwinFuSR) that refines this reconstruction using the RGB guidance image from the conventional camera.
Our two-stage method significantly improves reconstruction quality and fidelity over lensless-only baselines, generalizes across datasets and imaging conditions, and achieves high-quality real-world results on our physical prototype lensless camera without any fine-tuning.
RGB-guided lensless polarization imaging system: (a) optical setup; (b) custom polarization mask; (c) captured lensless image under front illumination with two orthogonally polarized projectors; and (d) reconstructed grayscale polarization result, visualized by mapping the 0°, 45°, and 90° outputs to the R, G, and B channels.
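The angle-to-channel visualization used throughout the figures can be sketched in a few lines of NumPy. This is an illustrative helper (the function name and normalization assumption are ours, not from the paper's code):

```python
import numpy as np

def polarization_to_rgb(i0, i45, i90):
    """Visualize a grayscale polarization triplet as one RGB composite,
    mapping the 0°, 45°, and 90° images to the R, G, and B channels
    respectively. Assumes intensities are normalized to [0, 1]."""
    rgb = np.stack([i0, i45, i90], axis=-1)  # (H, W) x3 -> (H, W, 3)
    return np.clip(rgb, 0.0, 1.0)
```

Regions where the three channels differ indicate polarization-dependent intensity; unpolarized regions appear gray.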
The pipeline has two stages. Stage I (physics-based reconstruction): polarization intensity images (color or grayscale) are recovered from lensless measurements using iterative optimization (FISTA or ADMM) with a 3D total-variation prior. Stage II (RGB-guided deep refinement): a Transformer-based fusion network refines the Stage-I reconstruction using a coarsely aligned RGB guidance image. The refinement network is adapted from SwinFuSR (Arnold et al., 2024).
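To make Stage I concrete, here is a minimal FISTA sketch for lensless deconvolution. It is a simplified stand-in, not the paper's implementation: it assumes a single-channel measurement, a shift-invariant forward model given by the PSF, and substitutes an ℓ1 prior (whose proximal step is a closed-form soft threshold) for the paper's 3D total-variation prior. All names are illustrative.

```python
import numpy as np
from numpy.fft import fft2, ifft2

def soft_threshold(v, tau):
    # Proximal operator of tau * ||.||_1
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def fista_lensless(meas, psf, n_iter=100, lam=1e-4):
    """Sketch of FISTA for min_x 0.5 * ||A x - meas||^2 + lam * ||x||_1,
    where A is circular convolution with the PSF (applied in Fourier space)."""
    H = fft2(psf, s=meas.shape)                      # forward model in Fourier domain
    step = 1.0 / max(np.max(np.abs(H)) ** 2, 1e-12)  # 1 / Lipschitz constant of the data term
    x = np.zeros_like(meas, dtype=float)
    y, t = x.copy(), 1.0
    for _ in range(n_iter):
        # Gradient of the data term at the extrapolated point y
        residual = np.real(ifft2(H * fft2(y))) - meas
        grad = np.real(ifft2(np.conj(H) * fft2(residual)))
        # Proximal gradient step
        x_new = soft_threshold(y - step * grad, step * lam)
        # Nesterov momentum update
        t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        y = x_new + ((t - 1.0) / t_new) * (x_new - x)
        x, t = x_new, t_new
    return x
```

In the full system this reconstruction runs once per polarization angle (and per color channel in the color configuration) before being passed to the Stage-II refinement network.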
Each polarization grayscale triplet (0°, 45°, 90°) is visualized as an RGB composite.
Qualitative results on real lensless polarization data (3-angle grayscale). Each reconstructed polarization triplet (0°, 45°, 90°) is visualized as an RGB composite. Note the significant improvement in the structural details achieved by RGB guidance.
| RGB (guide) | FISTA recon. | FISTA + Transformer | Ours | Reference |
|---|---|---|---|---|
| ![]() | ![]() | ![]() | ![]() | ![]() |
| ![]() | ![]() | ![]() | ![]() | ![]() |
| ![]() | ![]() | ![]() | ![]() | ![]() |
| ![]() | ![]() | ![]() | ![]() | ![]() |
Three-angle grayscale configuration (PSNR ↑ / SSIM ↑ / LPIPS ↓). We compare against physics-based baselines (FISTA, ADMM), a learning-based baseline derived from our architecture without RGB guidance (FISTA + Transformer), and two additional learning-based methods: FlatNet (Khan et al., 2020) and PolarAnything (Zhang et al., 2025). PIP is the training domain; UPLight and ZJU-RGB-P are unseen evaluation sets.
| Method | PIP | UPLight | ZJU-RGB-P |
|---|---|---|---|
| FISTA | 13.87 / 0.45 / 0.45 | 16.72 / 0.26 / 0.53 | 14.50 / 0.46 / 0.44 |
| FlatNet | 21.57 / 0.68 / 0.45 | 10.78 / 0.27 / 0.98 | 16.73 / 0.54 / 0.57 |
| PolarAnything (RGB) | 22.02 / 0.66 / 0.29 | 11.98 / 0.40 / 0.93 | 19.96 / 0.62 / 0.38 |
| PolarAnything (FISTA) | 21.51 / 0.64 / 0.31 | 11.84 / 0.36 / 0.98 | 19.05 / 0.58 / 0.42 |
| FISTA + Transformer | 28.85 / 0.88 / 0.12 | 17.93 / 0.44 / 0.53 | 27.20 / 0.89 / 0.19 |
| Ours (FISTA input) | 35.13 / 0.97 / 0.03 | 20.49 / 0.52 / 0.32 | 31.19 / 0.97 / 0.07 |
FlatNet and PolarAnything struggle under the partial polarization sampling of lensless measurements and show limited generalization to unseen datasets. Our method consistently outperforms all baselines, recovering richer high-frequency details and improving structural fidelity. See the paper for full results including the four-angle RGB configuration, ADMM variants, PSF robustness, fine-tuning, and ablation studies.
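For reference, the PSNR values reported above are the standard peak signal-to-noise ratio. A minimal NumPy sketch, assuming images normalized to [0, 1] (the function name and `data_range` convention are ours):

```python
import numpy as np

def psnr(x, ref, data_range=1.0):
    """Peak signal-to-noise ratio in dB (higher is better)."""
    mse = np.mean((np.asarray(x, float) - np.asarray(ref, float)) ** 2)
    return 10.0 * np.log10(data_range ** 2 / mse)
```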
Qualitative reconstruction results on UPLight and ZJU-RGB-P. Columns: RGB guidance, FISTA reconstruction, FISTA + Transformer w/o RGB, our full RGB-guided model, its fine-tuned version, and the ground-truth polarization image. Each polarization grayscale triplet (0°, 45°, 90°) is visualized as an RGB composite. Note how the RGB guidance improves the high-frequency recovery.
| | RGB (guide) | FISTA pred | FISTA + Transformer | Ours | Ours (fine-tuned) | GT |
|---|---|---|---|---|---|---|
| UPLight | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() |
| UPLight | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() |
| ZJU-RGB-P | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() |
| ZJU-RGB-P | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() |
Stage II checkpoints trained on FISTA Stage-I reconstructions from PIP. Download from Hugging Face. Usage details are in the repository README.md.
| Checkpoint file | `net_type` | `use_guide` | Description |
|---|---|---|---|
| `grayscale_with_guide.pth` | `swinfusionSR_GRAYSCALE` | `true` | Grayscale polarimetric, RGB-guided |
| `grayscale_no_guide.pth` | `swinfusionSR_GRAYSCALE` | `false` | Grayscale polarimetric, no guide |
| `color_with_guide.pth` | `swinfusionSRcolor` | `true` | Color polarimetric, RGB-guided |
| `color_no_guide.pth` | `swinfusionSRcolor` | `false` | Color polarimetric, no guide |
Set `path/pretrained_netG` in your option JSON to the path of a downloaded `.pth` checkpoint, or fetch the file with `wget` as shown in the README.
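A minimal option-file fragment might look as follows. The `net_type` and `use_guide` values come from the checkpoint table above; the surrounding key layout and the `model_zoo/` path are illustrative, so check the option files shipped in the repository for the exact structure:

```json
{
  "netG": {
    "net_type": "swinfusionSR_GRAYSCALE",
    "use_guide": true
  },
  "path": {
    "pretrained_netG": "model_zoo/grayscale_with_guide.pth"
  }
}
```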
@misc{kraicer2026guidedlenslesspolarizationimaging,
title = {Guided Lensless Polarization Imaging},
author = {Noa Kraicer and Erez Yosef and Raja Giryes},
year = {2026},
eprint = {2603.27357},
archivePrefix = {arXiv},
primaryClass = {eess.IV},
url = {https://arxiv.org/abs/2603.27357}
}
We thank Tomer Pee’r and Michael Baltaxe (General Motors) for providing a suitable version of the PIP dataset, and Shay Elmalem for fruitful discussions. This work was partially supported by the Center for AI and Data Science at Tel Aviv University (TAD) and by ERC Grant No. 10111339.
The physics-based reconstruction (Stage I) was inspired by Spectral DiffuserCam (Monakhova et al., 2020). The refinement network (Stage II) builds on SwinFuSR (Arnold et al., 2024).