How to resolve inference failures due to unsupported operations on NPU?

Hi! Hope the following reply answers your questions. :)

Extending NPU Support for Custom Ops

MediaTek currently does not provide a workflow or tool for adding custom operator (op) support to the NPU backend.
Custom or unsupported ops must be adapted to use supported operators, or models must be redesigned for compatibility.

Recommended Workflow for Supported NPU Deployment
  1. Prepare your model in TFLite format.
  2. Avoid dynamic shapes; use fixed-shape tensors to ensure stable conversion and execution.
  3. Use quantized models for optimal performance on the NPU.
    • Quantization helps ensure support for hardware-accelerated inference and lower latency.
  4. Verify supported operators and avoid custom/uncommon ops.
    • Use the official operator list for your platform (see Converter Supported Operators).
    • For NP8: follow the NP8 documentation. For NP6/NP7, consult the respective hardware documentation.
  5. Use MediaTek Converter tools for model conversion and validation.
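To illustrate why step 3 matters, here is a minimal sketch of the affine (scale/zero-point) quantization scheme that int8 TFLite models use; the scale and zero-point values are illustrative placeholders, not output from any MediaTek tool.

```python
# Minimal sketch of affine int8 quantization as used by quantized TFLite
# models: real_value = scale * (q - zero_point).
# scale and zero_point below are hypothetical, for illustration only.

def quantize(x, scale, zero_point):
    """Map a float value to int8, clamping to the int8 range."""
    q = round(x / scale) + zero_point
    return max(-128, min(127, q))

def dequantize(q, scale, zero_point):
    """Recover an approximate float value from its int8 code."""
    return scale * (q - zero_point)

scale, zero_point = 0.05, 3        # hypothetical quantization parameters
q = quantize(1.0, scale, zero_point)
print(q, dequantize(q, scale, zero_point))  # -> 23 1.0
```

Because every tensor value is reduced to an int8 code plus two shared parameters, the NPU can run the whole layer in low-precision integer arithmetic, which is where the latency win comes from.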

Tip:
Before retraining, run your model through the converter tools against the TFLite specification to confirm it is compatible and supported. This reduces the chance of hitting unsupported-operator failures at final deployment.
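The pre-deployment check described in the tip above amounts to comparing the operators a model uses against the platform's supported-operator list. A minimal sketch, assuming a hypothetical supported-op set (consult the official Converter Supported Operators list for the real one):

```python
# Sketch of a pre-conversion operator check: compare the ops a model uses
# against the platform's supported-operator list before retraining.
# SUPPORTED_OPS is a hypothetical placeholder, NOT the official list.

SUPPORTED_OPS = {"CONV_2D", "DEPTHWISE_CONV_2D", "ADD", "RELU", "SOFTMAX"}

def unsupported_ops(model_ops):
    """Return the ops that would block NPU conversion, preserving order."""
    seen = set()
    out = []
    for op in model_ops:
        if op not in SUPPORTED_OPS and op not in seen:
            seen.add(op)
            out.append(op)
    return out

model_ops = ["CONV_2D", "RELU", "CUSTOM_NMS", "SOFTMAX"]
print(unsupported_ops(model_ops))  # -> ['CUSTOM_NMS']
```

Any op this check flags must be redesigned or reimplemented with supported TFLite ops before the model can target the NPU.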

Fallback Mechanism: Offline vs Online Inference

Summary Table

| Approach | Custom Op Support | Fallback to CPU/GPU | Recommended Use Case |
| --- | --- | --- | --- |
| Offline Inference | No | No | Model fully supported by NPU |
| Online Inference | No | Yes (CPU only, for supported TFLite ops) | Mixed operator support or rapid prototyping |
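Conceptually, online inference partitions the graph by operator support: ops the NPU handles run there, and the remaining supported TFLite ops fall back to the CPU. A minimal sketch of that partitioning (the op names and NPU_OPS set are illustrative, not the real platform list):

```python
# Conceptual sketch of online-inference fallback: ops in the NPU's
# supported set run on the NPU; the rest fall back to the CPU (provided
# they are still valid TFLite ops). NPU_OPS is a hypothetical placeholder.

NPU_OPS = {"CONV_2D", "ADD", "RELU"}

def partition(model_ops):
    """Split an op list into (npu_ops, cpu_fallback_ops)."""
    npu = [op for op in model_ops if op in NPU_OPS]
    cpu = [op for op in model_ops if op not in NPU_OPS]
    return npu, cpu

npu, cpu = partition(["CONV_2D", "RESIZE_BILINEAR", "RELU"])
print(npu, cpu)  # -> ['CONV_2D', 'RELU'] ['RESIZE_BILINEAR']
```

Each fallback segment adds a data transfer between the NPU and CPU, which is why offline inference on a fully supported model is preferred for production latency.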

MediaTek Recommendation

  • Redesign or reimplement custom operators using supported TFLite ops for NPU deployment.
  • Prefer offline inference for production, provided your model is fully supported.
  • Use online inference for prototyping or mixed workloads needing fallback.
  • Always consult the latest supported operator list and platform documentation.