We’re currently working on converting the Whisper-Small PyTorch model to DLA using the NeuroPilot 8 SDK with MDLA 5.5.
So far, we have successfully converted the PyTorch model to TFLite using the MTK PyTorch converter and obtained a valid TFLite file. However, the TFLite → DLA conversion fails due to unsupported operations.
We’ve attached the list of unsupported ops below for reference. Any guidance or recommended workarounds would be greatly appreciated.
./../../neuron_sdk/host/bin/ncc-tflite --arch mdla5.5 whisper-small_mdla_fp32.tflite -o whisper.dla --runtime-dynamic-shape
OP[3]: RESHAPE
├ MDLA: Cannot support Int32 input
├ MDLA: Cannot support Int32 output
├ EDPA: unsupported operation
OP[4]: CAST
├ MDLA: Cannot support Int32 output
├ EDPA: CheckConversion(srcType, dstType) Unsupported data type conversion: Float16 → Int32.
OP[329]: GATHER
├ MDLA: unsupported operation
├ EDPA: unsupported operation
ERROR: Cannot find an execution plan because of unsupported operations
ERROR: Fail to compile whisper-small_mdla_fp32.tflite
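For a model this size the log only shows three offending ops, but for triaging larger logs, here is a quick sketch (plain Python; the parsing assumptions are ours, based on the log layout above) that tallies which backends reject which ops:

```python
import re
from collections import defaultdict

def parse_unsupported_ops(log_text):
    """Group ncc-tflite per-backend rejection messages by op.

    Assumes the 'OP[n]: NAME' header lines and '├ BACKEND: message'
    detail lines shown in the log above; returns {op: [messages]}.
    """
    ops = defaultdict(list)
    current = None
    for line in log_text.splitlines():
        header = re.match(r"OP\[(\d+)\]:\s*(\S+)", line.strip())
        if header:
            current = f"{header.group(2)}[{header.group(1)}]"
            continue
        detail = re.match(r"[├└]\s*(\w+):\s*(.+)", line.strip())
        if detail and current:
            ops[current].append(f"{detail.group(1)}: {detail.group(2)}")
    return dict(ops)
```

Running it over the log above groups the RESHAPE/CAST/GATHER failures by op index, which makes it easier to map each one back to the original PyTorch layer.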
There also seems to be a contradiction here: MDLA reports during DLA compilation that INT32 is not supported, yet in the preceding PyTorch → TFLite step the MTK_Pytorch_converter force-casts our INT64 tensors to INT32 and generates the TFLite model anyway. In other words, the converter produces exactly the dtype that the compiler later rejects.
These are the warnings we see during conversion:
Importing the model ...
[python/converters/base_converter.py:1457] RuntimeWarning: Forcible import the aten::gelu operator by using tanh approximation.
done
Converting the model ...
[python/converters/base_converter.py:1503] RuntimeWarning: Forcibly convert the unsupported type of tensor 'input.7' from DT_INT64 to DT_INT32. Expect quality drop in some cases.
[python/converters/base_converter.py:1503] RuntimeWarning: Forcibly convert the unsupported type of tensor 'input_ids.1' from DT_INT64 to DT_INT32. Expect quality drop in some cases.
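One workaround we are considering (a sketch under our own assumptions, not something verified against the NeuroPilot docs) is to keep integer tensors out of the compiled graph entirely: export a decoder variant that accepts precomputed float embeddings, and perform the token-id GATHER on the CPU host side. The DLA graph then sees only float tensors, which should sidestep both the Int32 RESHAPE/CAST failures and the unsupported GATHER. Toy illustration with hypothetical module names (not the actual Whisper/HF module layout):

```python
import torch

class EmbedsOnlyDecoder(torch.nn.Module):
    """Hypothetical wrapper: the exported graph takes precomputed float
    embeddings, so the int32 GATHER/CAST never reaches the DLA compiler."""
    def __init__(self, decoder_body):
        super().__init__()
        self.body = decoder_body  # float-only part of the decoder

    def forward(self, inputs_embeds):
        return self.body(inputs_embeds)

# Toy stand-ins for the real decoder pieces (assumptions for illustration):
embed = torch.nn.Embedding(51865, 768)        # token lookup, stays on CPU
body = torch.nn.Linear(768, 768)              # placeholder for decoder layers
decoder_for_export = EmbedsOnlyDecoder(body)  # only this gets converted/compiled

# Host side at inference time: do the integer gather on CPU, then feed
# float32 embeddings to the compiled model.
input_ids = torch.tensor([[50258, 50259]], dtype=torch.int32)
inputs_embeds = embed(input_ids)              # float32, shape [1, 2, 768]
out = decoder_for_export(inputs_embeds)
```

Whether the MTK converter and ncc-tflite accept this split cleanly is exactly what we'd like guidance on; if there is a supported flag or recommended pattern for integer-input models on MDLA 5.5, please point us to it.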