# NPU Model Conversion
This section covers converting your own Stable Diffusion checkpoints into NPU-compatible assets that Local Dream can load on supported Snapdragon devices.
## When You Need This
- ✅ You want to run a custom SD1.5 or SDXL checkpoint on the NPU path.
- ❌ You want to run a custom SD1.5 checkpoint on the CPU/GPU path: the app supports this directly, with no host-side conversion required.
## Available Workflows
| Workflow | Status | Guide |
|---|---|---|
| SD1.5 → NPU | Stable | SD1.5 Conversion Guide |
| SDXL → NPU | Experimental | SDXL Conversion Guide |
## What to Expect
- Conversion is host-side, not on-device. You will need a Linux or WSL machine.
- The pipeline produces W8A16-quantized QNN binaries packaged into a zip that the app imports.
- Each chip family (`_min`, `_8gen1`, `_8gen2`, `_8gen3`) is converted separately.
- A single SD1.5 conversion run takes several hours of CPU time. SDXL takes substantially longer.
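Since each chip family is a separate run, a small driver loop is a natural way to organize the work. The sketch below only builds the command lines; the converter CLI it assumes (`convert.py` with `--checkpoint` and `--target` flags) is a hypothetical placeholder, not the actual tool shipped with either conversion guide:

```python
# Chip-family suffixes from the list above; one full conversion run per target.
CHIP_TARGETS = ["_min", "_8gen1", "_8gen2", "_8gen3"]

def conversion_commands(checkpoint: str, converter: str = "convert.py") -> list[list[str]]:
    """Build one converter invocation per chip family.

    NOTE: ``convert.py`` and its flags are placeholders for illustration.
    Run the commands sequentially, not in parallel: each run takes hours
    and needs the full RAM budget from the requirements table below.
    """
    return [
        ["python", converter, "--checkpoint", checkpoint, "--target", target]
        for target in CHIP_TARGETS
    ]
```

Running the four targets sequentially keeps peak memory at a single run's footprint, which matters given the swap-heavy requirements below.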
## Hardware Requirements
| Workflow | RAM + swap | Disk | GPU |
|---|---|---|---|
| SD1.5 @ 512×512 | ~20 GB | ~30 GB | optional |
| SD1.5 @ higher resolutions | 64 GB+ | 60 GB+ | optional |
| SDXL @ 1024×1024 | 64 GB+ | 60 GB+ | optional |
A CUDA-enabled GPU is optional: it only speeds up the data preparation phase. The quantization itself runs on the CPU.
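Because a run that dies hours in from memory or disk exhaustion is expensive, a preflight check against the table above is worth a few lines. A minimal sketch, using the SD1.5 @ 512×512 thresholds; reading total RAM+swap is left to the caller (e.g. from `free -g` or `/proc/meminfo`), and this helper is not part of any official tooling:

```python
import shutil

# Minimums for the SD1.5 @ 512x512 row of the requirements table.
MIN_MEM_GB = 20.0
MIN_DISK_GB = 30.0

def check_resources(total_mem_gb: float, workdir: str = ".") -> list[str]:
    """Return a list of resource shortfalls; an empty list means good to go.

    total_mem_gb should be physical RAM plus swap, measured by the caller.
    """
    problems = []
    if total_mem_gb < MIN_MEM_GB:
        problems.append(f"RAM+swap {total_mem_gb:.0f} GB is below {MIN_MEM_GB:.0f} GB")
    # Free space on the volume holding the working directory.
    free_gb = shutil.disk_usage(workdir).free / 1024**3
    if free_gb < MIN_DISK_GB:
        problems.append(f"free disk {free_gb:.0f} GB is below {MIN_DISK_GB:.0f} GB")
    return problems
```

For higher-resolution SD1.5 or SDXL runs, raise the thresholds to the 64 GB / 60 GB row before trusting the check.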
## Skip the Conversion?
If you just want a model that works without the conversion overhead, check the pre-converted community collections first. Many popular SD1.5 and SDXL checkpoints are already available there.
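Whether you download a pre-converted zip or produce one yourself, listing its contents before importing is a cheap sanity check against a truncated download or a run that packaged incomplete output. A minimal sketch (the `model.zip` filename is a placeholder; the app does not ship this helper):

```python
import zipfile

def list_converted_assets(zip_path: str) -> list[str]:
    """Return the zip's member names with sizes, largest first."""
    with zipfile.ZipFile(zip_path) as zf:
        infos = sorted(zf.infolist(), key=lambda i: i.file_size, reverse=True)
        return [f"{i.filename} ({i.file_size / 1e6:.1f} MB)" for i in infos]

# Example (hypothetical path):
#   for line in list_converted_assets("model.zip"):
#       print(line)
```

A healthy SD archive is dominated by the UNet weights, so a listing where the largest member is only a few megabytes is an immediate red flag.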