[LLM + Tree Speculative Decoding] [Inference Setup] Unable to Locate modelOutputQuantScale

jing_liao · September 24, 2025, 8:41am

Environment Information

Platform: Genio-720, D9300
Toolkit Version: GAI-Deployment-Toolkit-v2.0.9_vicuna1.5-7b-tree-speculative-decoding-plus-v0.1.tar.gz

Issue Description:
When setting up PTQ and Tree Speculative Decoding, it is unclear where to find modelOutputQuantScale (the value visible when the model is opened in Netron) or how to set it.

Solution:
The toolkit automatically provides the correct modelOutputQuantScale in the config_<model_name>.yaml file, so there is no need to extract the value from the model in Netron or adjust it by hand.
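
To confirm which value the toolkit provided, the generated config can be inspected with a small script. The following is only a minimal sketch: it assumes the key appears on its own line as `modelOutputQuantScale: <number>`, and the section name in the sample fragment is hypothetical, since the actual layout of config_<model_name>.yaml is not shown in this thread. It uses only the Python standard library.

```python
import re
from typing import Optional

# Assumed layout: a line such as "  modelOutputQuantScale: 0.0039" somewhere
# in the YAML text, optionally followed by a trailing "# comment".
_SCALE_RE = re.compile(
    r"^\s*modelOutputQuantScale\s*:\s*"
    r"([-+]?\d*\.?\d+(?:[eE][-+]?\d+)?)\s*(?:#.*)?$",
    re.MULTILINE,
)


def find_output_quant_scale(yaml_text: str) -> Optional[float]:
    """Return the first modelOutputQuantScale value in the text, or None."""
    match = _SCALE_RE.search(yaml_text)
    return float(match.group(1)) if match else None


# Hypothetical fragment for illustration; the real file may be structured differently.
sample = """\
example_section:
  modelOutputQuantScale: 0.00390625
"""

print(find_output_quant_scale(sample))
```

For a real file, read its contents first (for example with `pathlib.Path(path).read_text()`) and pass the text to `find_output_quant_scale`. A return value of `None` means the key was not found in the assumed form.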

joying.kuo · September 25, 2025, 8:54am

The solution was tested and verified on July 12th, 2025.
