Environment Information
- Platform: Genio-720
- Toolkit Version: GAI-Deployment-Toolkit-v2.0.8_qwen2.5-0.5b-1.5b-7b-v0.1.tar.gz
Issue Description:
Qwen2.5-7B inference often produces repeated responses.
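To judge whether the issue is still present after a change, it can help to put a number on "repeated". The sketch below is only an illustration, not part of the toolkit: it assumes that "repeated responses" means the same phrases recurring within one generated output, and the function name `repeated_ngram_ratio` and the 4-word window are hypothetical choices.

```python
from collections import Counter


def repeated_ngram_ratio(text: str, n: int = 4) -> float:
    """Return the fraction of word n-grams that duplicate an earlier n-gram.

    0.0 means no n-gram occurs twice; values near 1.0 mean the text is
    mostly the same phrase repeated. (Hypothetical helper, not a toolkit API.)
    """
    words = text.split()
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        # Text shorter than n words cannot contain a repeated n-gram.
        return 0.0
    counts = Counter(ngrams)
    # Every occurrence beyond the first counts as a repeat.
    repeated = sum(count - 1 for count in counts.values())
    return repeated / len(ngrams)
```

Comparing this ratio on the same prompts before and after each change gives a rough, repeatable signal; any threshold for "too repetitive" would have to be chosen empirically.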
Solution:
- Upgrade to NP8 and check whether the issue persists.
- Confirm that the model is still correct after PTQ (post-training quantization).
- Adjust the configuration as per the NP8 documentation. In config.json, set:
"mask_value": -10000
- For PTQ Qwen2.5 models, enable rotation:
"rotate": true,
"rotate_mode": "ortho"
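The settings above can also be applied with a small script rather than by hand. This is a minimal sketch under stated assumptions: it treats `mask_value`, `rotate`, and `rotate_mode` as top-level keys of a single config.json, which the note does not confirm (the toolkit may nest them or keep the rotation options in a separate PTQ config), and the function name `patch_config` is hypothetical.

```python
import json
from pathlib import Path

# Values quoted in the solution steps above.
NP8_OVERRIDES = {"mask_value": -10000}
PTQ_ROTATION = {"rotate": True, "rotate_mode": "ortho"}


def patch_config(path, ptq_qwen25=True):
    """Merge the recommended settings into a JSON config file.

    ASSUMPTION: the keys live at the top level of the file. Adjust the
    merge target if the real config nests them elsewhere.
    """
    config_path = Path(path)
    config = json.loads(config_path.read_text(encoding="utf-8"))
    config.update(NP8_OVERRIDES)
    if ptq_qwen25:
        # Rotation is only recommended for PTQ Qwen2.5 models.
        config.update(PTQ_ROTATION)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

For example, `patch_config("config.json")` rewrites the file in place and returns the merged settings; existing keys not listed above are left unchanged. Back up the original file first.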