I’m using GAI-Deployment-Toolkit-v2.0.8_qwen2.5-0.5b-1.5b-7b-v0.1 on Genio 520.
When compiling the generative part with MDLA_NUM=1 (correct for 1x MDLA5.3), the script runs quickly, maps tensors (curr_keys_0 to … , MTKEXT_FULLY_CONNECTED, SILU, etc.), then finishes without generating any .dla file for the generative part.
- With MDLA_NUM=4: compilation passes, but inference outputs a repetitive “!!!”.
How can I fix this to make generative compilation work with --num-mdla=1 on Genio 520?
Thanks a lot!
As a test, I ran the ncc-tflite command directly with some of the flags from the original script removed, and got the error shown in the attached log.
While waiting for support from the community, I tried switching to LLaMA 3.2-1B and was able to successfully generate the DLA file. However, when deploying it on the device, I encountered the error shown below.
I would sincerely appreciate it if you could help me identify the cause of this issue and suggest how to resolve it. After fixing this problem, I plan to try the same workflow with Qwen 2.5 and will report the results back to you.
Thank you very much for your time and support. I really appreciate your help.
I would like to first align on the process you followed. During the DLA conversion stage, did you use the script provided in the toolkit compile directory, or did you manually convert the model using the ncc-tflite tool?
To avoid potential inference issues caused by missing compilation parameters, we strongly recommend using the official script provided in the toolkit for model conversion.
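When comparing a run of the toolkit script against a manual ncc-tflite run, it may help to confirm which .dla files each run actually produced, since a compile step can exit without writing one. Below is a minimal sketch in Python, not part of the toolkit. The function names `find_dla_files` and `report` are hypothetical helpers, and the output directory you pass in is an assumption that depends on where your compile script writes its results.

```python
from pathlib import Path


def find_dla_files(output_dir):
    """Return a sorted list of (path, size_in_bytes) for every .dla file under output_dir."""
    root = Path(output_dir)
    return sorted((p, p.stat().st_size) for p in root.rglob("*.dla") if p.is_file())


def report(output_dir):
    """Print a short summary; return True if at least one non-empty .dla file exists."""
    found = find_dla_files(output_dir)
    if not found:
        print(f"No .dla files under {output_dir}")
        return False
    for path, size in found:
        # A zero-byte file usually means the compile step did not finish writing it.
        note = "" if size > 0 else "  (empty file)"
        print(f"{path}  {size} bytes{note}")
    return any(size > 0 for _, size in found)
```

For example, calling `report("path/to/compile/output")` (a placeholder path) after each compile run shows at a glance whether the generative part produced a DLA file with MDLA_NUM=1.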