Environment Information
- Platform: Genio-720
- Toolkit Version: GAI-Deployment-Toolkit-v2.0.8_qwen2.5-0.5b-1.5b-7b-v0.1.tar.gz
Issue Description
When running inference, the model may produce hallucinated or inconsistent output.
Example output:
Root Cause
This issue is usually caused by compiling the model for the wrong MDLA target (for example, compiling for mdla5.5 instead of the mdla5.3 target listed under Solution).
Solution
Check both compile scripts, compile_generative.sh and compile_prompt.sh, and make sure the following variables are set to exactly these values:
MDLA_VER="mdla5.3,edma3.6"
MDLA_NUM="1"
TCM_SIZE="256"
After updating these values:
- Recompile the model into DLA format.
- Deploy the newly compiled model to your board.
- Run inference again to verify that the hallucination issue is resolved.
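
Before recompiling, it can help to confirm that both scripts really contain the values listed above. The following is a minimal sketch, not part of the toolkit: the function name check_mdla_settings is hypothetical, and it assumes each variable is assigned on its own line in the form NAME="value" with no trailing comment.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the toolkit): checks that a compile
# script assigns the three variables exactly as listed in this note.
# Assumption: each assignment sits on its own line as NAME="value".
check_mdla_settings() {
    script="$1"   # path to compile_generative.sh or compile_prompt.sh
    status=0
    for expected in 'MDLA_VER="mdla5.3,edma3.6"' 'MDLA_NUM="1"' 'TCM_SIZE="256"'; do
        name="${expected%%=*}"
        # Take the last uncommented assignment of this variable and strip
        # leading whitespace so indented lines still compare equal.
        actual="$(grep -E "^[[:space:]]*${name}=" "$script" | tail -n 1 | sed 's/^[[:space:]]*//')"
        if [ "$actual" = "$expected" ]; then
            echo "OK       ${script}: ${expected}"
        else
            echo "MISMATCH ${script}: expected ${expected}, found ${actual:-nothing}"
            status=1
        fi
    done
    return "$status"
}
```

After sourcing or pasting the function, run check_mdla_settings compile_generative.sh and check_mdla_settings compile_prompt.sh from the directory that holds the scripts. A non-zero return status means at least one variable differs from the values above and the script should be corrected before recompiling.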
