NPU Model Conversion

This section covers converting your own Stable Diffusion checkpoints into NPU-compatible assets that Local Dream can load on supported Snapdragon devices.

When You Need This

  • ✅ You want to run a custom SD1.5 or SDXL checkpoint on the NPU path.
  • ❌ You want to run a custom SD1.5 checkpoint on the CPU/GPU path — this is supported directly in the app, no host-side conversion required.

Available Workflows

| Workflow | Status | Guide |
| --- | --- | --- |
| SD1.5 → NPU | Stable | SD1.5 Conversion Guide |
| SDXL → NPU | Experimental | SDXL Conversion Guide |

What to Expect

  • Conversion is host-side, not on-device. You will need a Linux or WSL machine.
  • The pipeline produces W8A16-quantized QNN binaries packaged into a zip that the app imports.
  • Each chip family (_min, _8gen1, _8gen2, _8gen3) is converted separately.
  • A single SD1.5 conversion run takes several hours of CPU time. SDXL takes substantially longer.
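Because each chip family is a separate conversion run, batch jobs typically wrap the pipeline in an outer loop. A minimal sketch of that loop — the `convert_sd15.sh` script name and its flags are hypothetical placeholders, not from the guides; the real entry point is documented in the SD1.5 Conversion Guide:

```shell
# One conversion run per supported chip family.
for chip in min 8gen1 8gen2 8gen3; do
  echo "converting QNN binaries for target _${chip} ..."
  # Hypothetical invocation -- substitute the guide's actual command:
  # ./convert_sd15.sh --checkpoint my_model.safetensors --target "_${chip}"
done
```

Running the families sequentially keeps peak RAM at the single-run figure from the hardware table; running them in parallel multiplies the memory requirement accordingly.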

Hardware Requirements

| Workflow | RAM + swap | Disk | GPU |
| --- | --- | --- | --- |
| SD1.5 @ 512×512 | ~20 GB | ~30 GB | optional |
| SD1.5 @ higher resolutions | 64 GB+ | 60 GB+ | optional |
| SDXL @ 1024×1024 | 64 GB+ | 60 GB+ | optional |

A CUDA-enabled GPU is optional — it only speeds up the data preparation phase. The actual quantization runs on CPU.
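Since a failed run can waste hours of CPU time, it is worth checking the host against the table before starting. A minimal preflight sketch for the SD1.5 @ 512×512 row, assuming a Linux or WSL host (the thresholds come from the table above; the script itself is illustrative, not part of the conversion tooling):

```shell
# Preflight for SD1.5 @ 512x512: ~20 GB RAM+swap, ~30 GB free disk.
REQ_MEM_GB=20
REQ_DISK_GB=30

# RAM + swap in GB, from /proc/meminfo (values there are in kB).
mem_kb=$(awk '/^MemTotal:|^SwapTotal:/ {sum += $2} END {print sum}' /proc/meminfo)
mem_gb=$((mem_kb / 1024 / 1024))

# Free space on the current filesystem, in GB.
disk_gb=$(df -BG --output=avail . | tail -1 | tr -dc '0-9')

echo "mem+swap: ${mem_gb} GB (need ~${REQ_MEM_GB})"
echo "free disk: ${disk_gb} GB (need ~${REQ_DISK_GB})"
```

For the higher-resolution SD1.5 and SDXL rows, raise the thresholds to 64 GB and 60 GB per the table.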

Skip the Conversion?

If you just want a model that works without the conversion overhead, check the pre-converted community collections first. Many popular SD1.5 and SDXL checkpoints are already available there.