NPU Model Conversion

This section covers converting your own Stable Diffusion checkpoints into NPU-compatible assets that Local Dream can load on supported Snapdragon devices.

When You Need This

  • ✅ You want to run a custom SD1.5 or SDXL checkpoint on the NPU path.
  • ❌ You want to run a custom SD1.5 checkpoint on the CPU/GPU path — this is supported directly in the app, no host-side conversion required.

Available Workflows

| Workflow | Status | Guide |
| --- | --- | --- |
| SD1.5 → NPU | Stable | SD1.5 Conversion Guide |
| SDXL → NPU | Experimental | SDXL Conversion Guide |

What to Expect

  • Conversion is host-side, not on-device. You will need a Linux or WSL machine.
  • The pipeline produces W8A16-quantized QNN binaries packaged into a zip that the app imports.
  • Each chip family (_min, _8gen1, _8gen2, _8gen3) is converted separately.
  • A single SD1.5 conversion run takes several hours of CPU time. SDXL takes substantially longer.
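Because each chip family is a separate conversion run, batch jobs typically wrap the pipeline in an outer loop. A minimal sketch of that loop — the `convert_sd15.sh` script name and its flags are hypothetical placeholders, not from the guides; the real entry point is documented in the SD1.5 Conversion Guide:

```shell
# One conversion run per supported chip family.
for chip in min 8gen1 8gen2 8gen3; do
  echo "converting QNN binaries for target _${chip} ..."
  # Hypothetical invocation -- substitute the guide's actual command:
  # ./convert_sd15.sh --checkpoint my_model.safetensors --target "_${chip}"
done
```

Running the families sequentially keeps peak RAM at the single-run figure from the hardware table; running them in parallel multiplies the memory requirement accordingly.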

Hardware Requirements

| Workflow | RAM + swap | Disk | GPU |
| --- | --- | --- | --- |
| SD1.5 @ 512×512 | ~20 GB | ~30 GB | optional |
| SD1.5 @ higher resolutions | 64 GB+ | 60 GB+ | optional |
| SDXL @ 1024×1024 | 64 GB+ | 60 GB+ | optional |

A CUDA-enabled GPU is optional — it only speeds up the data preparation phase. The actual quantization runs on CPU.
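Since a failed run can waste hours of CPU time, it is worth checking the host against the table before starting. A minimal preflight sketch for the SD1.5 @ 512×512 row, assuming a Linux or WSL host (the thresholds come from the table above; the script itself is illustrative, not part of the conversion tooling):

```shell
# Preflight for SD1.5 @ 512x512: ~20 GB RAM+swap, ~30 GB free disk.
REQ_MEM_GB=20
REQ_DISK_GB=30

# RAM + swap in GB, from /proc/meminfo (values there are in kB).
mem_kb=$(awk '/^MemTotal:|^SwapTotal:/ {sum += $2} END {print sum}' /proc/meminfo)
mem_gb=$((mem_kb / 1024 / 1024))

# Free space on the current filesystem, in GB.
disk_gb=$(df -BG --output=avail . | tail -1 | tr -dc '0-9')

echo "mem+swap: ${mem_gb} GB (need ~${REQ_MEM_GB})"
echo "free disk: ${disk_gb} GB (need ~${REQ_DISK_GB})"
```

For the higher-resolution SD1.5 and SDXL rows, raise the thresholds to 64 GB and 60 GB per the table.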

Skip the Conversion?

If you just want a model that works without the conversion overhead, check the pre-converted community collections first. Many popular SD1.5 and SDXL checkpoints are already available there.